[
  {
    "path": ".cargo/config.toml",
    "content": "[build]\nrustflags = [\"--cfg\", \"tokio_unstable\"]\nrustdocflags = [\"--cfg\", \"tokio_unstable\"]\n\n[target.x86_64-unknown-linux-gnu]\n# Targeting x86-64-v2 gives a ~2% performance boost while only\n# disallowing Intel CPUs older than 2008 and AMD CPUs older than 2011.\n# None of those very old CPUs are used in GCP\n# (https://cloud.google.com/compute/docs/cpu-platforms). Unfortunately,\n# AWS does not seem to disclose the exact CPUs they use.\nrustflags = [\"-C\", \"target-cpu=x86-64-v2\", \"--cfg\", \"tokio_unstable\"]\n\n"
  },
  {
    "path": ".claude/skills/bump-tantivy/SKILL.md",
    "content": "---\nname: bump-tantivy\ndescription: Bump tantivy to the latest commit on main branch, fix compilation issues, and open a PR\ndisable-model-invocation: true\n---\n\n# Bump Tantivy\n\nFollow these steps to bump tantivy to its latest version:\n\n## Step 1: Check that we are on the main branch\n\nRun: `git branch --show-current`\n\nIf the current branch is not `main`, abort and ask the user to switch to the main branch first.\n\n## Step 2: Ensure main is up to date\n\nRun: `git pull origin main`\n\nThis ensures we're working from the latest code.\n\n## Step 3: Get the latest tantivy SHA\n\nRun: `gh api repos/quickwit-oss/tantivy/commits/main --jq '.sha'`\n\nExtract the first 7 characters as the short SHA.\n\n## Step 4: Update Cargo.toml\n\nEdit `quickwit/Cargo.toml` and update the `rev` field in the tantivy dependency to the new short SHA.\n\nThe line looks like:\n```toml\ntantivy = { git = \"https://github.com/quickwit-oss/tantivy/\", rev = \"XXXXXXX\", ... }\n```\n\n## Step 5: Run cargo check and fix compilation errors\n\nRun `cargo check` in the `quickwit` directory to verify compilation.\n\nIf there are compilation errors:\n- If the fix is straightforward (simple API changes, renames, etc.), fix them without asking\n- If the fix is complex or unclear, ask the user before proceeding\n\nRepeat until cargo check passes.\n\n## Step 6: Format code\n\nRun `make fmt` from the `quickwit/` directory to format the code.\n\n## Step 7: Update licenses\n\nRun `make update-licenses` from the `quickwit/` directory, then move the generated file:\n```\nmv quickwit/LICENSE-3rdparty.csv ./LICENSE-3rdparty.csv\n```\n\n## Step 8: Create a new branch\n\nGet the git username: `git config user.name | tr ' ' '-' | tr '[:upper:]' '[:lower:]'`\n\nGet today's date: `date +%Y-%m-%d`\n\nCreate and checkout a new branch named: `{username}/bump-tantivy-{date}`\n\nExample: `paul/bump-tantivy-2024-03-15`\n\n## Step 9: Commit changes\n\nStage all modified files and create a commit 
with message:\n```\nBump tantivy to {short-sha}\n```\n\n## Step 10: Push and open a PR\n\nPush the branch and open a PR using:\n```\ngh pr create --title \"Bump tantivy to {short-sha}\" --body \"Updates tantivy dependency to the latest commit on main.\"\n```\n\nReport the PR URL to the user when complete.\n"
  },
  {
    "path": ".claude/skills/fix-clippy/SKILL.md",
    "content": "---\nname: fix-clippy\ndescription: Fix all clippy lint warnings in the project\n---\n\n# Fix Clippy\n\nClippy issues are **warnings**, not errors. Never grep for `error` when looking for clippy issues.\n\n## Step 1: Auto-fix\n\nRun `make fix` to automatically fix clippy warnings:\n\n```\nmake fix\n```\n\n## Step 2: Fix remaining warnings manually\n\nCheck for remaining warnings that couldn't be auto-fixed:\n\n```\ncargo clippy --tests 2>&1 | grep \"^warning:\" | sort -u\n```\n\nFor each remaining warning, find the exact location and fix it manually.\n"
  },
  {
    "path": ".claude/skills/fmt/SKILL.md",
    "content": "---\nname: fmt\ndescription: Run `make fmt` to check the code format.\n---\n\n# Format Check\n\nRun `make fmt` from the `quickwit/` subdirectory of the repository to check code formatting:\n\n```\ncd quickwit && make fmt\n```\n\nThis command checks:\n1. Rust code formatting\n2. License headers\n3. Log format policy (no trailing punctuation, no uppercase first character)\n\nIf there are log format issues, fix them by:\n- Making the first character lowercase\n- Removing trailing punctuation (periods, exclamation marks, etc.)\n\nFix any issues found and re-run until clean.\n"
  },
  {
    "path": ".claude/skills/rationalize-deps/SKILL.md",
    "content": "---\nname: rationalize-deps\ndescription: Analyze Cargo.toml dependencies and attempt to remove unused features to reduce compile times and binary size\n---\n\n# Rationalize Dependencies\n\nThis skill analyzes Cargo.toml dependencies to identify and remove unused features.\n\n## Overview\n\nMany crates enable features by default that may not be needed. This skill:\n1. Identifies dependencies with default features enabled\n2. Tests if `default-features = false` works\n3. Identifies which specific features are actually needed\n4. Verifies compilation after changes\n\n## Step 1: Identify the target\n\nAsk the user which crate(s) to analyze:\n- A specific crate name (e.g., \"tokio\", \"serde\")\n- A specific workspace member (e.g., \"quickwit-search\")\n- \"all\" to scan the entire workspace\n\n## Step 2: Analyze current dependencies\n\nFor the workspace Cargo.toml (`quickwit/Cargo.toml`), list dependencies that:\n- Do NOT have `default-features = false`\n- Have default features that might be unnecessary\n\nRun: `cargo tree -p <crate> -f \"{p} {f}\" --edges features` to see what features are actually used.\n\n## Step 3: For each candidate dependency\n\n### 3a: Check the crate's default features\n\nLook up the crate on crates.io or check its Cargo.toml to understand:\n- What features are enabled by default\n- What each feature provides\n\nUse: `cargo metadata --format-version=1 | jq '.packages[] | select(.name == \"<crate>\") | .features'`\n\n### 3b: Try disabling default features\n\nModify the dependency in `quickwit/Cargo.toml`:\n\nFrom:\n```toml\nsome-crate = { version = \"1.0\" }\n```\n\nTo:\n```toml\nsome-crate = { version = \"1.0\", default-features = false }\n```\n\n### 3c: Run cargo check\n\nRun: `cargo check --workspace` (or target specific packages for faster feedback)\n\nIf compilation fails:\n1. Read the error messages to identify which features are needed\n2. 
Add only the required features explicitly:\n   ```toml\n   some-crate = { version = \"1.0\", default-features = false, features = [\"needed-feature\"] }\n   ```\n3. Re-run cargo check\n\n### 3d: Binary search for minimal features\n\nIf there are many default features, use binary search:\n1. Start with no features\n2. If it fails, add half the default features\n3. Continue until you find the minimal set\n\n## Step 4: Document findings\n\nFor each dependency analyzed, report:\n- Original configuration\n- New configuration (if changed)\n- Features that were removed\n- Any features that are required\n\n## Step 5: Verify full build\n\nAfter all changes, run:\n```bash\ncargo check --workspace --all-targets\ncargo test --workspace --no-run\n```\n\n## Common Patterns\n\n### Serde\nOften only needs `derive`:\n```toml\nserde = { version = \"1.0\", default-features = false, features = [\"derive\", \"std\"] }\n```\n\n### Tokio\nIdentify which runtime features are actually used:\n```toml\ntokio = { version = \"1.0\", default-features = false, features = [\"rt-multi-thread\", \"macros\", \"sync\"] }\n```\n\n### Reqwest\nOften doesn't need all TLS backends:\n```toml\nreqwest = { version = \"0.11\", default-features = false, features = [\"rustls-tls\", \"json\"] }\n```\n\n## Rollback\n\nIf changes cause issues:\n```bash\ngit checkout quickwit/Cargo.toml\ncargo check --workspace\n```\n\n## Tips\n\n- Start with large crates that have many default features (tokio, reqwest, hyper)\n- Use `cargo bloat --crates` to identify large dependencies\n- Check `cargo tree -d` for duplicate dependencies that might indicate feature conflicts\n- Some features are needed only for tests - consider using `[dev-dependencies]` features\n"
  },
  {
    "path": ".claude/skills/simple-pr/SKILL.md",
    "content": "---\nname: simple-pr\ndescription: Create a simple PR from staged changes with an auto-generated commit message\ndisable-model-invocation: true\n---\n\n# Simple PR\n\nFollow these steps to create a simple PR from staged changes:\n\n## Step 1: Check workspace state\n\nRun: `git status`\n\nVerify that all changes have been staged (no unstaged changes). If there are unstaged changes, abort and ask the user to stage their changes first with `git add`.\n\nAlso verify that we are on the `main` branch. If not, abort and ask the user to switch to main first.\n\n## Step 2: Ensure main is up to date\n\nRun: `git pull origin main`\n\nThis ensures we're working from the latest code.\n\n## Step 3: Review staged changes\n\nRun: `git diff --cached`\n\nReview the staged changes to understand what the PR will contain.\n\n## Step 4: Generate commit message\n\nBased on the staged changes, generate a concise commit message (1-2 sentences) that describes the \"why\" rather than the \"what\".\n\nDisplay the proposed commit message to the user and ask for confirmation before proceeding.\n\n## Step 5: Create a new branch\n\nGet the git username: `git config user.name | tr ' ' '-' | tr '[:upper:]' '[:lower:]'`\n\nCreate a short, descriptive branch name based on the changes (e.g., `fix-typo-in-readme`, `add-retry-logic`, `update-deps`).\n\nCreate and check out the branch: `git checkout -b {username}/{short-descriptive-name}`\n\n## Step 6: Commit changes\n\nCommit with the message confirmed in Step 4:\n```\ngit commit -m \"{commit-message}\"\n```\n\n## Step 7: Push and open a PR\n\nPush the branch and open a PR:\n```\ngit push -u origin {branch-name}\ngh pr create --title \"{commit-message-title}\" --body \"{longer-description-if-needed}\"\n```\n\nReport the PR URL to the user when complete.\n"
  },
  {
    "path": ".devcontainer/devcontainer.json",
    "content": "{\n    \"name\": \"Quickwit\",\n    \"image\": \"mcr.microsoft.com/devcontainers/rust:bookworm\",\n    \"customizations\": {\n        \"codespaces\": {\n            \"openFiles\": [\n                \"CONTRIBUTING.md\"\n            ]\n        },\n        \"vscode\": {\n            \"extensions\": [\n                \"rust-lang.rust-analyzer\"\n            ]\n        }\n    },\n    \"hostRequirements\": {\n        \"cpus\": 4,\n        \"memory\": \"16gb\"\n    },\n    \"runArgs\": [\n        \"--init\"\n    ],\n    \"mounts\": [\n        {\n            \"source\": \"/var/run/docker.sock\",\n            \"target\": \"/var/run/docker.sock\",\n            \"type\": \"bind\"\n        }\n    ],\n    \"features\": {\n        \"docker-from-docker\": {\n            \"version\": \"latest\",\n            \"moby\": true\n        },\n        \"ghcr.io/devcontainers/features/node:1\": {\n            \"version\": \"24\"\n        },\n        \"ghcr.io/devcontainers/features/aws-cli:1\": {},\n        \"ghcr.io/devcontainers-contrib/features/protoc:1\": {}\n    },\n    \"postCreateCommand\": \".devcontainer/post-create.sh\"\n}\n"
  },
  {
    "path": ".devcontainer/post-create.sh",
    "content": "#!/bin/bash\n\n# Define success and error color codes\nSUCCESS_COLOR=\"\\e[32m\"\nERROR_COLOR=\"\\e[31m\"\nRESET_COLOR=\"\\e[0m\"\n\n# Define success tracking variables\nrustupToolchainNightlyInstalled=false\ncmakeInstalled=false\n\n\n# Define installation functions\n\n#Installing manually for now until we figure out why \"ghcr.io/devcontainers-community/features/cmake\": {} is not working\ninstall_cmake() {\n    echo -e \"Installing CMake...\"\n    sudo apt-get update\n    sudo apt-get install -y cmake > /dev/null 2>&1\n    if [[ \"$(cmake --version)\" =~ \"cmake version\" ]]; then\n        echo -e \"${SUCCESS_COLOR}CMake installed successfully.${RESET_COLOR}\"\n        cmakeInstalled=true\n    else\n        echo -e \"${ERROR_COLOR}CMake installation failed. Please install it manually.${RESET_COLOR}\"\n    fi\n}\n\ninstall_rustup_toolchain_nightly() {\n    echo -e \"Installing Rustup nightly toolchain...\"\n    rustup toolchain install nightly > /dev/null 2>&1\n    rustup component add rustfmt --toolchain nightly > /dev/null 2>&1\n    if [[ \"$(rustup toolchain list)\" =~ \"nightly\" && \"$(rustup component list --toolchain nightly | grep rustfmt)\" =~ \"installed\" ]]; then\n        echo -e \"${SUCCESS_COLOR}Rustup nightly toolchain and rustfmt installed successfully.${RESET_COLOR}\"\n        rustupToolchainNightlyInstalled=true\n    else\n        echo -e \"${ERROR_COLOR}Rustup nightly toolchain and/or rustfmt installation failed. 
Please install them manually.${RESET_COLOR}\"\n    fi\n}\n\n# Install tools\ninstall_cmake\ninstall_rustup_toolchain_nightly\n\n# Copy our custom welcome message to replace the default github welcome message\nsudo cp .devcontainer/welcome.txt /usr/local/etc/vscode-dev-containers/first-run-notice.txt\n\n\n# Check the success tracking variables\nif $rustupToolchainNightlyInstalled && $cmakeInstalled; then\n    echo -e \"${SUCCESS_COLOR}All tools installed successfully.${RESET_COLOR}\"\nelse\n    echo -e \"${ERROR_COLOR}One or more tools failed to install. Please check the output for errors and install the failed tools manually.${RESET_COLOR}\"\nfi\n"
  },
  {
    "path": ".devcontainer/welcome.txt",
    "content": "👋 Welcome to the project!\nAll the necessary tools have already been installed for you 🎉.\nYou can go ahead and start hacking! Happy coding 💻.\n\nHere are some useful commands you can run:\n\n🔧 `make test-all` - starts the necessary Docker services and runs all tests.\n🔧 `make -k test-all docker-compose-down` - same as above, but tears down the Docker services after running all the tests.\n🔧 `make fmt` - runs the formatter. This command requires the nightly toolchain, which can be installed by running `rustup toolchain install nightly`.\n🔧 `make fix` - runs formatter and clippy checks.\n🔧 `make typos` - runs the spellcheck tool over the codebase (install it by running `cargo install typos-cli`).\n🔧 `make build-docs` - builds docs.\n🔧 `make docker-compose-up` - starts Docker services.\n🔧 `make docker-compose-down` - stops Docker services.\n🔧 `make docker-compose-logs` - shows Docker logs.\n\n"
  },
  {
    "path": ".dockerignore",
    "content": "**/*.md\n**/*.txt\n**/.*\n**/build\n**/Dockerfile\n**/node_modules\n**/qwdata\n**/target\ndocs\nexamples\n!.git/\n!quickwit-ui/build/.gitignore\n!quickwit-ui/.gitignore_for_build_directory\n"
  },
  {
    "path": ".gitattributes",
    "content": "**/codegen/** linguist-generated\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: \"\"\nlabels: bug\nassignees: \"\"\n---\n\n**Describe the bug**\nA clear and concise description of what the bug is.\n\n**Steps to reproduce (if applicable)**\nSteps to reproduce the behavior:\n\n1.\n2.\n\n**Expected behavior**\nA clear and concise description of what you expected to happen.\n\n**Configuration:**\nPlease provide:\n\n1. Output of `quickwit --version`\n2. The index_config.yaml\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/documentation_request.md",
    "content": "---\nname: Documentation request\nabout: Suggest a documentation enhancement\ntitle: \"[Documentation topic]\"\nlabels: documentation\nassignees: \"\"\n---\n\n<!--\n\nHi 👋, thank you for submitting a documentation enhancement to Quickwit!\n\nDon't forget to replace the title of this issue with a short\nsentence that describes the topic of your enhancement!\n\n-->\n\n## My documentation idea\n\nUse this section to give a description of what your enhancement is about.\n\nExamples:\n\n> I would like to add how to configure MinIO storage for Quickwit:\n>\n\n**What do you all think?**\n👍 I would love to see it!\n🚀 I would love to help!\n\nThank you for your request!\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: \"\"\nlabels: enhancement\nassignees: \"\"\n---\n\n**Is your feature request related to a problem? Please describe.**\nA clear and concise description of what the problem is. Ex. I'm always frustrated when [...]\n\n**Describe the solution you'd like**\nA clear and concise description of what you want to happen.\n\n**Describe alternatives you've considered**\nA clear and concise description of any alternative solutions or features you've considered.\n\n**Additional context**\nAdd any other context or information about the feature request here.\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/tutorial_request.md",
    "content": "---\nname: Tutorial request\nabout: Suggest a Quickwit tutorial\ntitle: \"[Tutorial topic]\"\nlabels: tutorial\nassignees: \"\"\n---\n\n<!--\n\nHi 👋, thank you for submitting a tutorial to Quickwit!\n\nDon't forget to replace the title of this issue with a short\nsentence that describes the topic of your tutorial!\n\n-->\n\n## My tutorial idea\n\nUse this section to give a description of what your tutorial is about.\n\nExamples:\n\n> I would like to write a tutorial that shows how to use Quickwit:\n>\n> - \"for storing traces...\"\n> - \"with Grafana/Jaeger/MinIO...\"\n> - \"for ingesting terabytes per day with Kafka...\"\n\nAre there any particular tools, concepts, languages or platforms that readers\nwill learn about?\n\n**What do you all think?**\n👍 I would love to see it!\n🚀 I would love to help!\n\nThank you for your request!\n"
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE.md",
    "content": "### Description\n\nDescribe the proposed changes made in this PR.\n\n### How was this PR tested?\n\nDescribe how you tested this PR.\n"
  },
  {
    "path": ".github/actions/cargo-build-macos-binary/action.yml",
    "content": "name: \"Build Quickwit binary for macOS\"\ndescription: \"Build React app and Rust binary for macOS with cargo build.\"\ninputs:\n  target:\n    description: \"Target\"\n    required: true\n  version:\n    description: \"Binary version\"\n    required: true\n  token:\n    description: \"GitHub access token\"\n    required: true\nruns:\n  using: \"composite\"\n  steps:\n    - run: echo \"ASSET_FULL_NAME=quickwit-${{ inputs.version }}-${{ inputs.target }}\" >> $GITHUB_ENV\n      shell: bash\n    - uses: actions/setup-node@v3\n      with:\n        node-version: 24\n        cache: \"yarn\"\n        cache-dependency-path: quickwit/quickwit-ui/yarn.lock\n    - run: yarn global add node-gyp\n      shell: bash\n    - run: make build-ui\n      shell: bash\n    - name: Install protoc\n      run: brew install protobuf\n      shell: bash\n    - name: Install rustup\n      shell: bash\n      run: curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain none -y\n    - name: Add target ${{ inputs.target }}\n      run: rustup target add ${{ inputs.target }}\n      shell: bash\n      working-directory: ./quickwit\n    - name: Retrieve and export commit date, hash, and tags\n      run: |\n        echo \"QW_COMMIT_DATE=$(TZ=UTC0 git log -1 --format=%cd --date=format-local:%Y-%m-%dT%H:%M:%SZ)\" >> $GITHUB_ENV\n        echo \"QW_COMMIT_HASH=$(git rev-parse HEAD)\" >> $GITHUB_ENV\n        echo \"QW_COMMIT_TAGS=$(git tag --points-at HEAD | tr '\\n' ',')\" >> $GITHUB_ENV\n      shell: bash\n    - name: Build binary\n      run: cargo build --release --features release-macos-feature-vendored-set --target ${{ inputs.target }} --bin quickwit\n      shell: bash\n      working-directory: ./quickwit\n      env:\n        QW_COMMIT_DATE: ${{ env.QW_COMMIT_DATE }}\n        QW_COMMIT_HASH: ${{ env.QW_COMMIT_HASH }}\n        QW_COMMIT_TAGS: ${{ env.QW_COMMIT_TAGS }}\n    - name: Bundle archive\n      run: |\n        make archive BINARY_FILE=quickwit/target/${{ inputs.target 
}}/release/quickwit \\\n          BINARY_VERSION=${{ inputs.version }} ARCHIVE_NAME=${{ env.ASSET_FULL_NAME }}\n      shell: bash\n    - name: Save binary archive for three days\n      uses: actions/upload-artifact@v4.4.0\n      with:\n        name: ${{ env.ASSET_FULL_NAME }}.tar.gz\n        path: ./${{ env.ASSET_FULL_NAME }}.tar.gz\n        retention-days: 3\n    - name: Deploy archive to GitHub release\n      uses: quickwit-inc/upload-to-github-release@9b2c40fba23bf8dea05b7d2eece24cbc95d4a190\n      env:\n        GITHUB_TOKEN: ${{ inputs.token }}\n      with:\n        file: ${{ env.ASSET_FULL_NAME }}.tar.gz\n        overwrite: true\n        draft: ${{ inputs.version != 'nightly' }}\n        tag_name: ${{ inputs.version }}\n"
  },
  {
    "path": ".github/actions/cross-build-binary/action.yml",
    "content": "name: \"Build Quickwit binary with cargo cross\"\ndescription: \"Build React app and Rust binary with cargo cross.\"\ninputs:\n  target:\n    description: \"Target\"\n    required: true\n  version:\n    description: \"Binary version\"\n    required: true\n  token:\n    description: \"GitHub access token\"\n    required: true\nruns:\n  using: \"composite\"\n  steps:\n    - run: echo \"ASSET_FULL_NAME=quickwit-${{ inputs.version }}-${{ inputs.target }}\" >> $GITHUB_ENV\n      shell: bash\n    - uses: actions/setup-node@v3\n      with:\n        node-version: 24\n        cache: \"yarn\"\n        cache-dependency-path: quickwit/quickwit-ui/yarn.lock\n    - run: yarn global add node-gyp\n      shell: bash\n    - run: make build-ui\n      shell: bash\n    - name: Install rustup\n      shell: bash\n      run: curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain none -y\n    - name: Install cross\n      run: cargo install cross\n      shell: bash\n    - name: Retrieve and export commit date, hash, and tags\n      run: |\n        echo \"QW_COMMIT_DATE=$(TZ=UTC0 git log -1 --format=%cd --date=format-local:%Y-%m-%dT%H:%M:%SZ)\" >> $GITHUB_ENV\n        echo \"QW_COMMIT_HASH=$(git rev-parse HEAD)\" >> $GITHUB_ENV\n        echo \"QW_COMMIT_TAGS=$(git tag --points-at HEAD | tr '\\n' ',')\" >> $GITHUB_ENV\n      shell: bash\n    - name: Build Quickwit\n      run: cross build --release --features release-feature-vendored-set --target ${{ inputs.target }} --bin quickwit\n      shell: bash\n      env:\n        QW_COMMIT_DATE: ${{ env.QW_COMMIT_DATE }}\n        QW_COMMIT_HASH: ${{ env.QW_COMMIT_HASH }}\n        QW_COMMIT_TAGS: ${{ env.QW_COMMIT_TAGS }}\n      working-directory: ./quickwit\n    - name: Bundle archive\n      run: |\n        make archive BINARY_FILE=quickwit/target/${{ inputs.target }}/release/quickwit \\\n          BINARY_VERSION=${{ inputs.version }} ARCHIVE_NAME=${{ env.ASSET_FULL_NAME }}\n      shell: bash\n    - name: Save binary archive for 
three days\n      uses: actions/upload-artifact@v4.4.0\n      with:\n        name: ${{ env.ASSET_FULL_NAME }}.tar.gz\n        path: ./${{ env.ASSET_FULL_NAME }}.tar.gz\n        retention-days: 3\n    - name: Upload archive\n      uses: quickwit-inc/upload-to-github-release@9b2c40fba23bf8dea05b7d2eece24cbc95d4a190\n      env:\n        GITHUB_TOKEN: ${{ inputs.token }}\n      with:\n        file: ${{ env.ASSET_FULL_NAME }}.tar.gz\n        overwrite: true\n        draft: ${{ inputs.version != 'nightly' }}\n        tag_name: ${{ inputs.version }}\n"
  },
  {
    "path": ".github/dependabot.yml",
    "content": "version: 2\nupdates:\n  # Rust dependencies\n  - package-ecosystem: cargo\n    directory: \"/quickwit\"\n    schedule:\n      interval: \"monthly\"\n    groups:\n      rust-dependencies:\n        patterns:\n          - \"*\"\n    open-pull-requests-limit: 10\n    ignore:\n      - dependency-name: \"*\"\n        update-types: [\"version-update:semver-patch\"]\n\n  # Docker dependencies\n  - package-ecosystem: docker\n    directory: \"/\"\n    schedule:\n      interval: \"monthly\"\n    open-pull-requests-limit: 10\n\n  # GitHub Actions\n  - package-ecosystem: github-actions\n    directory: \"/\"\n    schedule:\n      interval: \"monthly\"\n    groups:\n      github-actions:\n        patterns:\n          - \"*\"\n    open-pull-requests-limit: 10\n\n  # NPM dependencies\n  - package-ecosystem: npm\n    directory: \"/\"\n    schedule:\n      interval: \"monthly\"\n    groups:\n      npm-dependencies:\n        patterns:\n          - \"*\"\n    open-pull-requests-limit: 10\n"
  },
  {
    "path": ".github/workflows/ci.yml",
    "content": "name: CI\n\non:\n  workflow_dispatch:\n  pull_request:\n  push:\n    branches:\n      - main\n      - trigger-ci-workflow\n    paths:\n      - \"quickwit/**\"\n      - \"!quickwit/quickwit-ui/**\"\n\npermissions:\n  contents: read\n\nenv:\n  CARGO_INCREMENTAL: 0\n  QW_DISABLE_TELEMETRY: 1\n  QW_TEST_DATABASE_URL: postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev\n  RUST_BACKTRACE: 1\n  RUSTDOCFLAGS: -Dwarnings -Arustdoc::private_intra_doc_links\n  RUSTFLAGS: -Dwarnings --cfg tokio_unstable\n\n# Ensures that we cancel running jobs for the same PR / same workflow.\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}\n  cancel-in-progress: true\n\njobs:\n  tests:\n    name: Unit tests\n    runs-on: \"ubuntu-latest\"\n    timeout-minutes: 60\n    permissions:\n      contents: read\n      actions: write\n    services:\n      # PostgreSQL service container\n      postgres:\n        image: postgres:latest\n        ports:\n          - 5432:5432\n        env:\n          POSTGRES_USER: quickwit-dev\n          POSTGRES_PASSWORD: quickwit-dev\n          POSTGRES_DB: quickwit-metastore-dev\n        # Set health checks to wait until postgres has started\n        options: >-\n          --health-cmd pg_isready\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n    steps:\n      - name: Cleanup Disk Space\n        run: |\n          df -h\n\n          if [ \"$(df -BG / | awk 'NR==2 {gsub(\"G\",\"\",$4); print $4}')\" -lt 30 ]; then\n            echo \"Less than 30GiB available. Running cleanup...\"\n            sudo rm -rf /usr/share/dotnet\n            sudo rm -rf /usr/local/lib/android\n            sudo rm -rf /usr/share/swift\n            sudo rm -rf /usr/local/.ghcup\n            sudo rm -rf /opt/hostedtoolcache/CodeQL\n            df -h\n          else\n            echo \"30GiB or more available. 
Skipping cleanup.\"\n          fi\n\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - name: Install Ubuntu packages\n        run: |\n          sudo apt-get update\n          sudo apt-get -y install protobuf-compiler\n      - uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548 # v.6.1.0\n        with:\n          python-version: '3.11'\n      - uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2\n        id: modified\n        with:\n          filters: |\n            rust_src:\n              - quickwit/**/*.rs\n              - quickwit/**/*.toml\n              - quickwit/**/*.proto\n              - quickwit/rest-api-tests/**\n              - .github/workflows/ci.yml\n      - name: Setup stable Rust Toolchain\n        if: steps.modified.outputs.rust_src == 'true'\n        uses: dtolnay/rust-toolchain@f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561 # master\n        with:\n          toolchain: stable\n      - name: Setup cache\n        uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2.8.2\n        if: steps.modified.outputs.rust_src == 'true'\n        with:\n          workspaces: \"./quickwit -> target\"\n          shared-key: \"quickwit-cargo\"\n      - name: Install nextest\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        uses: taiki-e/install-action@aba36d755ec7ca22d38b12111787c26115943952\n        with:\n          tool: cargo-nextest\n      - name: cargo build\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo build --features=postgres --tests --bin quickwit\n        working-directory: ./quickwit\n      - name: cargo nextest\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo nextest run --features=postgres --retries 1\n        working-directory: ./quickwit\n      - name: Install python packages\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: 
|\n          pip install --user --require-hashes -r ${{ github.workspace }}/.github/workflows/requirements.txt\n          pipenv install --deploy --ignore-pipfile\n        working-directory: ./quickwit/rest-api-tests\n      - name: Run REST API tests\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: pipenv run python3 ./run_tests.py --binary ../target/debug/quickwit\n        working-directory: ./quickwit/rest-api-tests\n\n  lints:\n    name: Lints\n    runs-on: \"ubuntu-latest\"\n    timeout-minutes: 60\n    permissions:\n      contents: read\n      actions: write\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2\n        id: modified\n        with:\n          filters: |\n            rust_src:\n              - quickwit/**/*.rs\n              - quickwit/**/*.toml\n              - quickwit/**/*.proto\n              - .github/workflows/ci.yml\n      - name: Install Ubuntu packages\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: |\n          sudo apt-get update\n          sudo apt-get -y install protobuf-compiler\n      - name: Setup nightly Rust Toolchain (for rustfmt)\n        if: steps.modified.outputs.rust_src == 'true'\n        uses: dtolnay/rust-toolchain@f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561 # master\n        with:\n          toolchain: nightly\n          components: rustfmt\n      - name: Setup stable Rust Toolchain\n        if: steps.modified.outputs.rust_src == 'true'\n        uses: dtolnay/rust-toolchain@f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561 # master\n        with:\n          toolchain: stable\n      - name: Setup cache\n        if: steps.modified.outputs.rust_src == 'true'\n        uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2.8.2\n        with:\n          workspaces: \"./quickwit -> target\"\n          shared-key: \"quickwit-cargo\"\n   
   - name: Install cargo deny\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        uses: taiki-e/cache-cargo-install-action@34ce5120836e5f9f1508d8713d7fdea0e8facd6f # v3.0.1\n        with:\n          # 0.18 requires rustc 1.85\n          tool: cargo-deny@0.17.0\n      - name: Install cargo machete\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        uses: taiki-e/cache-cargo-install-action@34ce5120836e5f9f1508d8713d7fdea0e8facd6f # v3.0.1\n        with:\n          tool: cargo-machete\n      - name: cargo clippy\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo clippy --workspace --tests --all-features\n        working-directory: ./quickwit\n      - name: cargo deny\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo deny check licenses\n        working-directory: ./quickwit\n      - name: cargo machete\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo machete\n        working-directory: ./quickwit\n      - name: cargo doc\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo doc --no-deps\n        working-directory: ./quickwit\n      - name: License headers check\n        if: always()\n        run: bash scripts/check_license_headers.sh\n        working-directory: ./quickwit\n      - name: rustfmt\n        if: always() && steps.modified.outputs.rust_src == 'true'\n        run: cargo +nightly fmt --all -- --check\n        working-directory: ./quickwit\n\n  thirdparty-license:\n    name: Check Datadog third-party license file\n    runs-on: ubuntu-latest\n    permissions:\n      contents: read\n      actions: write\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - name: Install Rust toolchain\n        uses: dtolnay/rust-toolchain@f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561 # master\n        with:\n          toolchain: stable\n\n      - 
name: Cache cargo tools\n        uses: actions/cache@9255dc7a253b0ccc959486e2bca901246202afeb # v5.0.1\n        with:\n          path: ~/.cargo/bin\n          key: ${{ runner.os }}-cargo-tools-${{ hashFiles('**/Cargo.lock') }}\n\n      - name: Install dd-rust-license-tool\n        run: dd-rust-license-tool --help || cargo install --git https://github.com/DataDog/rust-license-tool.git --force\n\n      - name: Check Datadog third-party license file\n        run: dd-rust-license-tool --config quickwit/license-tool.toml --manifest-path quickwit/Cargo.toml check\n"
  },
  {
    "path": ".github/workflows/coverage.yml",
    "content": "name: Code coverage\n\non:\n  workflow_dispatch:\n  push:\n    branches:\n      - main\n      - trigger-coverage-workflow\n    paths:\n      - quickwit/Cargo.toml\n      - quickwit/Cargo.lock\n      - quickwit/quickwit-*/**\n\npermissions:\n  contents: read\n\nenv:\n  AWS_REGION: us-east-1\n  AWS_ACCESS_KEY_ID: \"placeholder\"\n  AWS_SECRET_ACCESS_KEY: \"placeholder\"\n  CARGO_INCREMENTAL: 0\n  PUBSUB_EMULATOR_HOST: \"localhost:8681\"\n  QW_DISABLE_TELEMETRY: 1\n  QW_S3_ENDPOINT: \"http://localhost:4566\" # Services are exposed as localhost because we are not running coverage in a container.\n  QW_S3_FORCE_PATH_STYLE_ACCESS: 1\n  QW_TEST_DATABASE_URL: postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev\n  RUSTFLAGS: -Dwarnings --cfg tokio_unstable\n\njobs:\n  test:\n    name: Coverage\n    runs-on: gh-ubuntu-arm64\n    timeout-minutes: 40\n    permissions:\n      contents: read\n      actions: write\n    # Setting a containing will require to fix the QW_S3_ENDPOINT to http://localstack:4566\n    services:\n      localstack:\n        image: localstack/localstack:latest\n        ports:\n          - \"4566:4566\"\n          - \"4571:4571\"\n          - \"8080:8080\"\n        env:\n          SERVICES: kinesis,s3,sqs\n        options: >-\n          --health-cmd \"curl -k https://localhost:4566\"\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n\n      postgres:\n        image: postgres:latest\n        ports:\n          - \"5432:5432\"\n        env:\n          POSTGRES_USER: quickwit-dev\n          POSTGRES_PASSWORD: quickwit-dev\n          POSTGRES_DB: quickwit-metastore-dev\n        options: >-\n          --health-cmd pg_isready\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n\n      kafka-broker:\n        image: confluentinc/confluent-local:7.4.11\n        ports:\n          - \"9092:9092\"\n          - \"9101:9101\"\n        env:\n        
  # KRaft mode (single node)\n          KAFKA_NODE_ID: 1\n          KAFKA_PROCESS_ROLES: 'broker,controller'\n          KAFKA_CONTROLLER_QUORUM_VOTERS: '1@localhost:9093'\n          KAFKA_LOG4J_LOGGERS: \"org.apache.kafka.image.loader.MetadataLoader=WARN\"\n\n          # Listeners\n          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT'\n          KAFKA_LISTENERS: 'EXTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093'\n          KAFKA_ADVERTISED_LISTENERS: 'EXTERNAL://localhost:9092'\n          KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'\n          KAFKA_INTER_BROKER_LISTENER_NAME: 'EXTERNAL'\n\n          # Simplified configuration\n          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1\n          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0\n          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1\n          KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1\n\n          # Cluster ID (required for KRaft)\n          CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qk'\n\n          KAFKA_HEAP_OPTS: -Xms256M -Xmx256M\n\n        options: >-\n          --health-cmd \"ub kafka-ready -b localhost:9092 1 5\"\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n\n      gcp-pubsub-emulator:\n        image: thekevjames/gcloud-pubsub-emulator:550.0.0\n        ports:\n          - \"8681:8681\"\n        env:\n          PUBSUB_PROJECT1: \"quickwit-emulator,emulator_topic:emulator_subscription\"\n\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n\n      - name: Install lib libsasl2\n        run: |\n          sudo apt update\n          sudo apt install libsasl2-dev\n          sudo apt install libsasl2-2\n\n      - uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548 # v6.1.0\n        with:\n          python-version: '3.11'\n\n      - uses: actions/cache@9255dc7a253b0ccc959486e2bca901246202afeb # v5.0.1\n        with:\n          path: |\n            ~/.cargo/git\n         
   ~/.cargo/registry\n          key: ${{ runner.os }}-cargo-test-${{ hashFiles('Cargo.lock') }}\n          restore-keys: |\n            ${{ runner.os }}-cargo-test-${{ hashFiles('Cargo.lock') }}\n            ${{ runner.os }}-cargo-test\n\n      - name: Install python packages\n        run: |\n          pip install --user --require-hashes -r ${{ github.workspace }}/.github/workflows/requirements.txt\n          pipenv install --deploy --ignore-pipfile\n        working-directory: ./quickwit/quickwit-cli/tests\n\n      - name: Prepare LocalStack S3\n        run: pipenv run ./prepare_tests.sh\n        working-directory: ./quickwit/quickwit-cli/tests\n\n      # GitHub Actions does not allow services to be started with a custom command,\n      # so we are running Azurite as a container manually.\n      - name: Run Azurite service\n        run: DOCKER_SERVICES=azurite make docker-compose-up\n\n      # GitHub Actions does not allow services to be started with a custom command,\n      # so we are running fake gcs server as a container manually.\n      - name: Run Fake GCS Server service\n        run: DOCKER_SERVICES=fake-gcs-server make docker-compose-up\n\n      - name: Run Pulsar service\n        run: DOCKER_SERVICES=pulsar make docker-compose-up\n\n      - name: Install Rust\n        run: rustup update stable\n\n      - name: Install cargo-llvm-cov, cargo-nextest, and protoc\n        uses: taiki-e/install-action@90558ad1e179036f31467972b00dec6cb80701fa # v2.66.3\n        with:\n          tool: cargo-llvm-cov,nextest,protoc\n\n      # We limit the number of jobs to 4 to avoid OOM errors when linking the binary.\n      - name: Generate code coverage\n        run: |\n          cargo llvm-cov clean --workspace\n          cargo llvm-cov nextest --no-report --test failpoints --features fail/failpoints --retries 4\n          # increase stack size for test_all_with_s3_localstack_cli, see quickwit#4963\n          RUST_MIN_STACK=67108864 CARGO_BUILD_JOBS=4 cargo llvm-cov nextest 
--no-report --all-features --retries 4\n          cargo llvm-cov report --lcov --output-path lcov.info\n        working-directory: ./quickwit\n\n      - name: Upload coverage to Codecov\n        uses: codecov/codecov-action@671740ac38dd9b0130fbe1cec585b89eea48d3de # v5.5.2\n        with:\n          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos\n          files: ./quickwit/lcov.info\n\n  on-failure:\n    if: ${{ github.repository_owner == 'quickwit-oss' && failure() }}\n    name: On Failure\n    needs: [test]\n    runs-on: ubuntu-latest\n    steps:\n      - name: Send Message\n        uses: sarisia/actions-status-discord@eb045afee445dc055c18d3d90bd0f244fd062708 # v1.16.0\n        with:\n          webhook: ${{ secrets.DISCORD_WEBHOOK }}\n          nodetail: true\n          color: \"#FF0000\"\n          title: \"\"\n          description: |\n            ### ❌ [${{ github.event.pull_request.title }}](${{ github.event.pull_request.html_url }})\n\n            @${{ github.actor }} quickwit coverage CI failed on your PR.\n\n            Coverage CI contains tests that are not running in the regular CI because they are too lengthy.\n            For this reason it is possible for it to break even if the tests were passing on your PR.\n            This is not a catastrophe, but you are responsible for fixing it!\n\n            You can run the full test suite locally with `make test-all`.\n\n            Please report in this channel that you are working on it/fixed it/or if it is a flaky test/\n            or if you need help.\n\n            **[View logs](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})**\n"
  },
  {
    "path": ".github/workflows/dependency.yml",
    "content": "name: \"Dependency Review\"\non: [pull_request]\n\npermissions:\n  contents: read\n\n# Ensures that we cancel running jobs for the same PR / same workflow.\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}\n  cancel-in-progress: true\n\njobs:\n  dependency-review:\n    runs-on: ubuntu-latest\n    steps:\n      - name: \"Checkout Repository\"\n        uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - name: \"Dependency Review\"\n        uses: actions/dependency-review-action@98884d411b0f1c583e5ee579e7e897d4623019c2 # v4.8.1\n        with:\n          # This is an minor vuln on the rsa crate, used for\n          # google storage.\n          allow-ghsas: GHSA-c38w-74pg-36hr,GHSA-4grx-2x9w-596c\n"
  },
  {
    "path": ".github/workflows/publish_cross_images.yml",
    "content": "name: Publish custom cross images\n\non:\n  workflow_dispatch:\n  push:\n    branches:\n      - main\n    paths:\n      - \"build/cross-images/**\"\n\npermissions:\n  contents: read\n\njobs:\n  build-cross-images:\n    name: Publish cross images\n    runs-on: ubuntu-latest\n    environment:\n        name: production\n    steps:\n      - name: Check out the repo\n        uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - name: Log in to Docker Hub\n        uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0\n        with:\n          username: ${{ secrets.DOCKERHUB_USERNAME }}\n          password: ${{ secrets.DOCKERHUB_ACCESS_TOKEN }}\n      - name: Build and push cross images\n        run: make cross-images\n"
  },
  {
    "path": ".github/workflows/publish_docker_images.yml",
    "content": "name: Build and publish Docker images\n\non:\n  workflow_dispatch:\n  push:\n    branches:\n      - main\n      - release-0.9\n    paths:\n      - \"quickwit/**\"\n    tags:\n      - airmail\n      - happy-plazza\n      - qw*\n      - v*\n\npermissions:\n  contents: read\n\nenv:\n  REGISTRY_IMAGE: quickwit/quickwit\n\njobs:\n  docker:\n    strategy:\n      matrix:\n        include:\n          - os: ubuntu-latest\n            platform: linux/amd64\n            platform_suffix: amd64\n          - os: gh-ubuntu-arm64\n            platform: linux/arm64\n            platform_suffix: arm64\n    runs-on: ${{ matrix.os }}\n    permissions:\n      contents: read\n      actions: write\n    environment:\n      name: production\n    steps:\n      - name: Cleanup Disk Space\n        run: |\n          df -h\n          sudo rm -rf /opt/hostedtoolcache/CodeQL\n          sudo rm -rf /usr/local/.ghcup\n          sudo rm -rf /usr/local/lib/android\n          sudo rm -rf /usr/share/dotnet\n          sudo rm -rf /usr/share/swift\n          df -h\n\n      - name: Checkout\n        uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n\n      - name: Login to Docker Hub\n        uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0\n        with:\n          username: ${{ secrets.DOCKERHUB_USERNAME }}\n          password: ${{ secrets.DOCKERHUB_ACCESS_TOKEN }}\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@c7c53464625b32c7a7e944ae62b3e17d2b600130 # v3.7.0\n\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0\n\n      - name: Docker meta\n        id: meta\n        uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0\n        with:\n          images: |\n            ${{ env.REGISTRY_IMAGE }}\n          labels: |\n            org.opencontainers.image.title=Quickwit\n            maintainer=Quickwit, Inc. 
<hello@quickwit.io>\n            org.opencontainers.image.vendor=Quickwit, Inc.\n            org.opencontainers.image.licenses=Apache-2.0\n\n      - name: Retrieve commit date, hash, and tags\n        run: |\n          echo \"QW_COMMIT_DATE=$(TZ=UTC0 git log -1 --format=%cd --date=format-local:%Y-%m-%dT%H:%M:%SZ)\" >> $GITHUB_ENV\n          echo \"QW_COMMIT_HASH=$(git rev-parse HEAD)\" >> $GITHUB_ENV\n          echo \"QW_COMMIT_TAGS=$(git tag --points-at HEAD | tr '\\n' ',')\" >> $GITHUB_ENV\n          if [[ \"${{ github.event_name }}\" == \"push\" && \"${{ github.ref_type }}\" == \"tag\" && \"${GITHUB_REF#refs/tags/}\" == *\"jemprof\"* ]]; then\n            echo \"CARGO_FEATURES=release-jemalloc-profiled\" >> $GITHUB_ENV\n          else\n            echo \"CARGO_FEATURES=release-feature-set\" >> $GITHUB_ENV\n          fi\n\n      - name: Build and push image\n        uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0\n        id: build\n        with:\n          context: .\n          platforms: ${{ matrix.platform }}\n          build-args: |\n            QW_COMMIT_DATE=${{ env.QW_COMMIT_DATE }}\n            QW_COMMIT_HASH=${{ env.QW_COMMIT_HASH }}\n            QW_COMMIT_TAGS=${{ env.QW_COMMIT_TAGS }}\n            CARGO_FEATURES=${{ env.CARGO_FEATURES }}\n          labels: ${{ steps.meta.outputs.labels }}\n          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true\n\n      - name: Export digest\n        run: |\n          mkdir -p /tmp/digests\n          digest=\"${{ steps.build.outputs.digest }}\"\n          touch \"/tmp/digests/${digest#sha256:}\"\n\n      - name: Upload digest\n        uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0\n        with:\n          name: digest-${{ matrix.platform_suffix }}\n          path: /tmp/digests/*\n          if-no-files-found: error\n          retention-days: 1\n\n  merge:\n    runs-on: ubuntu-latest\n    needs: 
[docker]\n    permissions:\n      contents: read\n      actions: read\n    environment: production\n    steps:\n      - name: Download digests\n        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0\n        with:\n          pattern: digest-*\n          path: /tmp/digests\n          merge-multiple: true\n\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0\n\n      - name: Docker meta\n        id: meta\n        uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0\n        with:\n          images: ${{ env.REGISTRY_IMAGE }}\n          flavor: |\n            latest=false\n          tags: |\n            type=edge,branch=main\n            type=edge,branch=main,suffix=-slim-bookworm\n            type=semver,pattern={{version}}\n            type=semver,pattern={{version}},value=latest\n            type=semver,pattern={{version}},suffix=-slim-bookworm\n            type=ref,event=tag\n            type=raw,value=v0.9.0-rc,enable=${{ github.ref == 'refs/heads/release-0.9' }}\n      - name: Login to Docker Hub\n        uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0\n        with:\n          username: ${{ secrets.DOCKERHUB_USERNAME }}\n          password: ${{ secrets.DOCKERHUB_ACCESS_TOKEN }}\n      - name: Create manifest list and push tags\n        working-directory: /tmp/digests\n        run: |\n          docker buildx imagetools create $(jq -cr '.tags | map(\"-t \" + .) | join(\" \")' <<< \"$DOCKER_METADATA_OUTPUT_JSON\") \\\n            $(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)\n      - name: Inspect image\n        run: |\n          docker buildx imagetools inspect ${{ env.REGISTRY_IMAGE }}:${{ steps.meta.outputs.version }}\n"
  },
  {
    "path": ".github/workflows/publish_lambda.yaml",
    "content": "# This workflow creates a new release for a quickwit search aws lambda.\n# The artifact is a zip file containing a binary for ARM 64,\n# ready to be deployed by the deployer.\n#\n# See quickwit-lambda-client/README.md\nname: Release Lambda binary\n\non:\n  push:\n    tags:\n      - 'lambda-*'\n  workflow_dispatch:\n    inputs:\n      version:\n        description: 'Version tag (e.g., v0.8.0)'\n        required: false\n        default: 'dev'\n\npermissions:\n  contents: read\n\njobs:\n  build-lambda:\n    name: Build Lambda ARM64\n    runs-on: ubuntu-latest\n    permissions:\n      contents: write\n      actions: write\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n\n      - name: Set version\n        run: |\n          if [ \"${{ github.ref_type }}\" = \"tag\" ]; then\n            # Extract version from tag (e.g., lambda-v0.8.0 -> v0.8.0)\n            echo \"ASSET_VERSION=${GITHUB_REF_NAME#lambda-}\" >> $GITHUB_ENV\n          elif [ -n \"${{ github.event.inputs.version }}\" ] && [ \"${{ github.event.inputs.version }}\" != \"dev\" ]; then\n            echo \"ASSET_VERSION=${{ github.event.inputs.version }}\" >> $GITHUB_ENV\n          else\n            echo \"ASSET_VERSION=dev-$(git rev-parse --short HEAD)\" >> $GITHUB_ENV\n          fi\n\n      - name: Install rustup\n        run: curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain none -y\n\n      - name: Install cross\n        run: cargo install cross\n\n      - name: Retrieve and export commit date, hash, and tags\n        run: |\n          echo \"QW_COMMIT_DATE=$(TZ=UTC0 git log -1 --format=%cd --date=format-local:%Y-%m-%dT%H:%M:%SZ)\" >> $GITHUB_ENV\n          echo \"QW_COMMIT_HASH=$(git rev-parse HEAD)\" >> $GITHUB_ENV\n          echo \"QW_COMMIT_TAGS=$(git tag --points-at HEAD | tr '\\n' ',')\" >> $GITHUB_ENV\n\n      - name: Build Lambda binary\n        run: cross build --release --features lambda-release --target 
aarch64-unknown-linux-gnu -p quickwit-lambda-server --bin quickwit-aws-lambda-leaf-search\n        env:\n          QW_COMMIT_DATE: ${{ env.QW_COMMIT_DATE }}\n          QW_COMMIT_HASH: ${{ env.QW_COMMIT_HASH }}\n          QW_COMMIT_TAGS: ${{ env.QW_COMMIT_TAGS }}\n        working-directory: ./quickwit\n\n      - name: Create Lambda zip\n        run: |\n          cd quickwit/target/aarch64-unknown-linux-gnu/release\n          cp quickwit-aws-lambda-leaf-search bootstrap\n          zip quickwit-aws-lambda-${{ env.ASSET_VERSION }}-aarch64.zip bootstrap\n          mv quickwit-aws-lambda-${{ env.ASSET_VERSION }}-aarch64.zip ../../../../\n\n      - name: Upload to GitHub release\n        uses: quickwit-inc/upload-to-github-release@9b2c40fba23bf8dea05b7d2eece24cbc95d4a190\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n        with:\n          file: quickwit-aws-lambda-${{ env.ASSET_VERSION }}-aarch64.zip\n          overwrite: true\n          draft: true\n          tag_name: ${{ env.ASSET_VERSION }}\n"
  },
  {
    "path": ".github/workflows/publish_nightly_packages.yml",
    "content": "name: Build and publish nightly packages\n\non:\n  workflow_dispatch:\n  schedule:\n    - cron: \"0 5 * * *\"\n\npermissions:\n  contents: read\n\njobs:\n  build-macos-binaries:\n    name: Build ${{ matrix.target }}\n    runs-on: macos-latest\n    permissions:\n      contents: write\n      actions: write\n    strategy:\n      fail-fast: false\n      matrix:\n        target: [x86_64-apple-darwin, aarch64-apple-darwin]\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - uses: ./.github/actions/cargo-build-macos-binary\n        with:\n          target: ${{ matrix.target }}\n          version: nightly\n          token: ${{ secrets.GITHUB_TOKEN }}\n  build-linux-binaries:\n    strategy:\n      fail-fast: false\n      matrix:\n        target: [x86_64-unknown-linux-gnu, aarch64-unknown-linux-gnu]\n    name: Build ${{ matrix.target }}\n    runs-on: ubuntu-latest\n    permissions:\n      contents: write\n      actions: write\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - uses: ./.github/actions/cross-build-binary\n        with:\n          target: ${{ matrix.target }}\n          version: nightly\n          token: ${{ secrets.GITHUB_TOKEN }}\n"
  },
  {
    "path": ".github/workflows/publish_release_packages.yml",
    "content": "name: Build and publish release packages\n\non:\n  push:\n    tags:\n      - \"v*\"\n\npermissions:\n  contents: read\n\njobs:\n  build-macos-binaries:\n    name: Build ${{ matrix.target }}\n    runs-on: macos-latest\n    permissions:\n      contents: write\n      actions: write\n    strategy:\n      matrix:\n        target: [x86_64-apple-darwin, aarch64-apple-darwin]\n\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - name: Extract asset version\n        run: echo \"ASSET_VERSION=${GITHUB_REF/refs\\/tags\\//}\" >> $GITHUB_ENV\n      - uses: ./.github/actions/cargo-build-macos-binary\n        with:\n          target: ${{ matrix.target }}\n          version: ${{ env.ASSET_VERSION }}\n          token: ${{ secrets.GITHUB_TOKEN }}\n\n  build-linux-binaries:\n    strategy:\n      matrix:\n        target: [x86_64-unknown-linux-gnu, aarch64-unknown-linux-gnu]\n    name: Build ${{ matrix.target }}\n    runs-on: ubuntu-latest\n    permissions:\n      contents: write\n      actions: write\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - name: Extract asset version\n        run: echo \"ASSET_VERSION=${GITHUB_REF/refs\\/tags\\//}\" >> $GITHUB_ENV\n      - uses: ./.github/actions/cross-build-binary\n        with:\n          target: ${{ matrix.target }}\n          version: ${{ env.ASSET_VERSION }}\n          token: ${{ secrets.GITHUB_TOKEN }}\n"
  },
  {
    "path": ".github/workflows/requirements.txt",
    "content": "# contains pinned dependencies for installing pipenv to ensure repeatable builds in CI/CD workflows\ncertifi==2025.10.5 \\\n    --hash=sha256:0f212c2744a9bb6de0c56639a6f68afe01ecd92d91f14ae897c4fe7bbeeef0de \\\n    --hash=sha256:47c09d31ccf2acf0be3f701ea53595ee7e0b8fa08801c6624be771df09ae7b43\ndistlib==0.4.0 \\\n    --hash=sha256:9659f7d87e46584a30b5780e43ac7a2143098441670ff0a49d5f9034c54a6c16 \\\n    --hash=sha256:feec40075be03a04501a973d81f633735b4b69f98b05450592310c0f401a4e0d\nfilelock==3.20.3 \\\n    --hash=sha256:18c57ee915c7ec61cff0ecf7f0f869936c7c30191bb0cf406f1341778d0834e1 \\\n    --hash=sha256:4b0dda527ee31078689fc205ec4f1c1bf7d56cf88b6dc9426c4f230e46c2dce1\npackaging==25.0 \\\n    --hash=sha256:29572ef2b1f17581046b3a2227d5c611fb25ec70ca1ba8554b24b0e69331a484 \\\n    --hash=sha256:d443872c98d677bf60f6a1f2f8c1cb748e8fe762d2bf9d3148b5599295b0fc4f\npipenv==2025.0.4 \\\n    --hash=sha256:36fc2a7841ccdb2f58a9f787b296c2e15dea3b5b79b84d4071812f28b7e8d7a2 \\\n    --hash=sha256:e1fbe4cfd25ab179f123d1fbb1fa1cdc0b3ffcdb1f21c775dcaa12ccc356f2bb\nplatformdirs==4.5.0 \\\n    --hash=sha256:70ddccdd7c99fc5942e9fc25636a8b34d04c24b335100223152c2803e4063312 \\\n    --hash=sha256:e578a81bb873cbb89a41fcc904c7ef523cc18284b7e3b3ccf06aca1403b7ebd3\nvirtualenv==20.36.1 \\\n    --hash=sha256:575a8d6b124ef88f6f51d56d656132389f961062a9177016a50e4f507bbcc19f \\\n    --hash=sha256:8befb5c81842c641f8ee658481e42641c68b5eab3521d8e092d18320902466ba\n"
  },
  {
    "path": ".github/workflows/scorecard.yml",
    "content": "name: OpenSSF Scorecard\non:\n  schedule:\n    - cron: '0 0 * * 0'\n  push:\n    branches:\n      - main\n\npermissions:\n  contents: read\n\njobs:\n  analysis:\n    name: Scorecards analysis\n    runs-on: ubuntu-latest\n    permissions:\n      # Needed to upload the results to code-scanning dashboard.\n      security-events: write\n      # Needed to publish results\n      id-token: write\n      actions: read\n      contents: read\n\n    steps:\n      - name: 'Checkout code'\n        uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n        with:\n          persist-credentials: false\n\n      - name: 'Run analysis'\n        uses: ossf/scorecard-action@4eaacf0543bb3f2c246792bd56e8cdeffafb205a # v2.4.3\n        with:\n          results_file: results.sarif\n          results_format: sarif\n          repo_token: ${{ secrets.GITHUB_TOKEN }}\n          publish_results: true\n\n      # Upload the results as artifacts.\n      - name: 'Upload artifact'\n        uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0\n        with:\n          name: SARIF file\n          path: results.sarif\n          retention-days: 5\n\n      # Upload the results to GitHub's code scanning dashboard.\n      - name: 'Upload to code-scanning'\n        uses: github/codeql-action/upload-sarif@cdefb33c0f6224e58673d9004f47f7cb3e328b89 # v4.31.10\n        with:\n          sarif_file: results.sarif\n"
  },
  {
    "path": ".github/workflows/ui-ci.yml",
    "content": "name: UI CI\n\non:\n  workflow_dispatch:\n  pull_request:\n    paths:\n      - \"quickwit/quickwit-ui/**\"\n      - \".github/workflows/ui-ci.yml\"\n  push:\n    branches:\n      - main\n      - trigger-ci-workflow\n    paths:\n      - \"quickwit/quickwit-ui/**\"\n      - \".github/workflows/ui-ci.yml\"\n\npermissions:\n  contents: read\n\njobs:\n  checks:\n    name: Lint, type check & unit tests\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # v6.1.0\n        with:\n          node-version: 24\n          cache: \"yarn\"\n          cache-dependency-path: quickwit/quickwit-ui/yarn.lock\n      - name: Install JS dependencies\n        run: yarn --cwd quickwit-ui install\n        working-directory: ./quickwit\n      - name: Lint\n        run: yarn --cwd quickwit-ui lint\n        working-directory: ./quickwit\n      - name: Type check\n        run: yarn --cwd quickwit-ui type\n        working-directory: ./quickwit\n      - name: Unit tests\n        run: yarn --cwd quickwit-ui test\n        working-directory: ./quickwit\n\n  e2e:\n    name: Playwright e2e\n    runs-on: ubuntu-latest\n    permissions:\n      contents: read\n      actions: write\n    services:\n      postgres:\n        image: postgres:latest\n        ports:\n          - 5432:5432\n        env:\n          POSTGRES_USER: quickwit-dev\n          POSTGRES_PASSWORD: quickwit-dev\n          POSTGRES_DB: quickwit-metastore-dev\n        options: >-\n          --health-cmd pg_isready\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n    env:\n      CARGO_INCREMENTAL: 0\n      RUST_BACKTRACE: 1\n      RUSTFLAGS: -Dwarnings --cfg tokio_unstable\n      RUSTDOCFLAGS: -Dwarnings -Arustdoc::private_intra_doc_links\n      QW_TEST_DATABASE_URL: 
postgres://quickwit-dev:quickwit-dev@postgres:5432/quickwit-metastore-dev\n    steps:\n      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1\n      - uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # v6.1.0\n        with:\n          node-version: 24\n          cache: \"yarn\"\n          cache-dependency-path: quickwit/quickwit-ui/yarn.lock\n      - name: Setup stable Rust Toolchain\n        uses: dtolnay/rust-toolchain@f7ccc83f9ed1e5b9c81d8a67d7ad1a747e22a561 # master\n        with:\n          toolchain: stable\n      - name: Setup Rust cache\n        uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2.8.2\n        with:\n          workspaces: \"./quickwit -> target\"\n          shared-key: \"quickwit-cargo\"\n      - name: Install JS dependencies\n        run: yarn --cwd quickwit-ui install\n        working-directory: ./quickwit\n      - name: Install Playwright browsers\n        run: npx playwright install chromium --with-deps --only-shell\n        working-directory: ./quickwit/quickwit-ui\n      - name: Build UI\n        run: CI=false yarn --cwd quickwit-ui build\n        working-directory: ./quickwit\n      - name: Build Quickwit\n        run: |\n          sudo apt-get update && sudo apt-get -y install protobuf-compiler\n          cargo build --features=postgres\n        working-directory: ./quickwit\n      - name: Run e2e tests\n        run: |\n          mkdir -p qwdata\n          cargo run --features=postgres -- run --service searcher --service metastore --config ../config/quickwit.yaml &\n          yarn --cwd quickwit-ui e2e-test\n        working-directory: ./quickwit\n"
  },
  {
    "path": ".gitignore",
    "content": "# Generated by Cargo\n# will have compiled files and executables\n**/target/**\n**/proptest-regressions\n**/perf.data*\n**/flamegraph.svg\nlocal/**\nquickwit/quickwit-ui/package-lock.json\n**/.DS_Store\n\nTODO.md\nQUESTIONS.txt\n\n\n# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries\n# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html\n#Cargo.lock\n\n# These are backup files generated by rustfmt\n**/*.rs.bk\n\n.env\n.idea\n.vscode\n.vscode-license\ndeps\nelastic-search-artifacts\nqwdata\n\n# Generated by prost/tonic build\n*_descriptor.bin\n"
  },
  {
    "path": ".localstack/init.sh",
    "content": "#!/usr/bin/env bash\n\nset -eu\n\nawslocal s3 mb s3://quickwit-dev\nawslocal s3 mb s3://quickwit-integration-tests && awslocal s3 rm --recursive s3://quickwit-integration-tests\n\nif ! awslocal kinesis list-streams | grep -q quickwit-dev-stream ; then\n    awslocal kinesis create-stream --stream-name quickwit-dev-stream --shard-count 3\nfi\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "<!--\n# Changelog\nAll notable changes to this project will be documented in this file.\n\nThe format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),\nand this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).\n\n## [Unreleased]\n\n### Added\n\n### Fixed\n- (Jaeger) Query resource attributes when Jaeger request carries tags\n\n### Changed\n\n### Deprecated\n\n### Removed\n\n### Security\n\n--->\n\n# [0.9.0]\n\n### Added\n- Add Ingest V2 (#5600, #5566, #5463, #5375, #5350, #5252 #5202)\n- Add SQS source (#5374, #5335, #5148)\n- Disable control plane check for searcher (#5599, #5360)\n- Partially implement `_elastic/_cluster/health` (#5595)\n- Make Jaeger span attribute-to-tag conversion exhaustive (#5574)\n- Use `content_length_limit` for ES bulk limit (#5573)\n- Limit and monitor warmup memory usage (#5568)\n- Add eviction metrics to caches (#5523)\n- Record object storage request latencies (#5521)\n- Add some kind of throttling on the janitor to prevent it from overloading (#5510)\n- Prevent single split searches from different `leaf_search` from interleaving (#5509)\n- Retry on S3 internal error (#5504)\n- Allow specifying OTEL index ID in header (#5503)\n- Add a metric to count storage errors and their error code (#5497)\n- Add support for concatenated fields (#4773, #5369, #5331)\n- Add number of splits per root/leaf search histograms (#5472)\n- Introduce a searcher config option to timeout get requests (#5467)\n- Add fingerprint to task in cluster state (#5464)\n- Enrich root/leaf search spans with number of docs and splits (#5450)\n- Add some additional search metrics (#5447)\n- Improve GC resilience and add metrics (#5420)\n- Enable force shutdown with 2nd Ctrl+C (#5414)\n- Add request_timeout_secs config to searcher config (#5402)\n- Memoize S3 client (#5377)\n- Add more env var config for Postgres (#5365)\n- Enable str fast field range queries (#5324)\n- Allow querying non-existing fields 
(#5308)\n- Support updating doc mapper through api (#5253)\n- Add optional special handling for hex in code tokenizer (#5200)\n- Added a circuit breaker layer (#5134)\n- Various performance optimizations in Tantivy (https://github.com/quickwit-oss/tantivy/blob/main/CHANGELOG.md)\n\n### Changed\n- Parse datetimes and timestamps with leading and/or trailing whitespace (#5544)\n- Restrict maturity period to retention (#5543)\n- Wait for merge at end of local ingest (#5542)\n- Log PostgreSQL metastore error (#5530)\n- Update azure multipart policy (#5553)\n- Stop relying on our own version of pulsar-rs (#5487)\n- Handle nested OTLP values in attributes and log bodies (#5485)\n- Improve merge pipeline finalization (#5475)\n- Allow failed splits in root search (#5440)\n- Batch delete from GC (#5404, #5380)\n- Make some S3 errors retryable (#5384)\n- Change default timestamps in OTEL logs (#5366)\n- Only return root spans for Jaeger HTTP API (#5358)\n- Share aggregation limit on node (#5357)\n\n### Fixed\n- Fix existence queries for nested fields (#5581)\n- Fix lenient option with wildcard queries (#5575)\n- Fix incompatible ES Java date format (#5462)\n- Fix bulk api response order (#5434)\n- Fix pulsar finalize (#5471)\n- Fix pulsar URI scheme (#5470)\n- Fix grafana searchers dashboard (#5455)\n- Fix jaeger http endpoint (#5378)\n- Fix file re-ingestion after EOF (#5330)\n- Fix configuration interpolation (#5403)\n- Fix jaeger duration parse error (#5518)\n- Fix unit conversion in jaeger http search endpoint (#5519)\n\n### Removed\n- Remove support for 2-digit years in java datetime parser (#5596)\n- Remove DocMapper trait (#5508)\n- Remove support for AWS Lambda (#5884)\n- Remove search stream endpoint (#5886)\n\n# [0.8.1]\n\n### Fixed\n\n- Bug in the chitchat digest message serialization (chitchat#144)\n\n## [0.8.0]\n\n### Added\n\n- Remove some noisy logs (#4447)\n- Add `/{index}/_stats` and `/_stats` ES API (#4442)\n- Use `search_after` in ES scroll API (#4280)\n- 
Add support for wildcard exclusion in index patterns (#4458)\n- Add `.` support in DSL identifiers (#3989)\n- Add cat indices ES API (#4465)\n- Limit concurrent merges (#4473)\n- Add Index Template API and auto create index (#4456) (only available with ingest V2)\n- Add support for compressed ES `_bulk` requests (#4506)\n- Add support for slash `/` character in field names (#4510)\n- Handle SIGTERM shutdown signal (#4539)\n- Add `start_timestamp` and `end_timestamp` filter to ES `_field_caps` API (#4547)\n- Limit the number of merge pipelines that can be spawned concurrently (#4574)\n- Add support for `_source_excludes` and `_source_includes` query parameters in ES API (#4572)\n- Add gRPC metrics layer to clients and servers (#4591)\n- Add additional cluster metrics (#4597)\n- Add index patterns query param on GET `/indexes` endpoint (#4600)\n- Add support for GCS file backed metastore (#4604)\n- Add default search fields for OTEL traces index (#4602)\n- Add support for delete index in ES API (#4606)\n- Add a handler to dynamically change the log level (#4662)\n- Add REST endpoint to parse a query into a query AST (#4652)\n- Add postgresql index and use `IN` instead of many `OR` (#4670)\n- Add support for `_source_excludes`, `_source_includes`, `extra_filters` in `_msearch` ES API (#4696)\n- Handle `track_total_size` on request ES body (#4710)\n- Add a metric for the number of indexes (#4711)\n- Add various performance optimizations in Quickwit and Tantivy\n\nMore details in tantivy's [changelog](https://github.com/quickwit-oss/tantivy/blob/main/CHANGELOG.md).\n\n### Fixed\n\n- Fix aggregation result on empty index (#4449)\n- Fix Gzip file source (#4457)\n- Rate limit noisy logs (#4483)\n- Prevent the exponential backoff from overflowing after 64 attempts (#4501)\n- Remove field presence in ES `_field_caps` API (#4492)\n- Remove `source` in ES parameter, remove unsupported field `fields` in response (#4590)\n- Fix aggregation `split_size` parameter, add 
docs and test (#4627)\n- Various fixes in chitchat (gossip): more details in [chitchat commit history](https://github.com/quickwit-oss/chitchat/commits/main/?since=2024-01-08&until=2024-03-13)\n- Various fixes in mrecordlog (WAL): more details in [mrecordlog commit history](https://github.com/quickwit-oss/mrecordlog/commits/main/?since=2024-01-08&until=2024-03-13)\n\n### Changed\n\n- (Breaking) [Add ZSTD compression to chitchat's Deltas](https://github.com/quickwit-oss/chitchat/pull/112)\n\n### Removed\n\n### Migration from 0.7.x to 0.8.0\n\nTo deploy Quickwit 0.8.0, you must either:\n- **shut down** your cluster **entirely** before deploying, or\n- **restart all** the nodes of your cluster after deploying.\n\nBecause we made some breaking changes in the gossip protocol (chitchat), nodes running different versions of Quickwit cannot communicate with each other and crash upon receiving messages that do not match their release version. The new protocol is now versioned, and future updates of the gossip protocol will be backward compatible.\n\n\n## [0.7.1]\n\n### Added\n\n- Add ES `_count` API (#4410)\n- Add `_elastic/_field_caps` API (#4350)\n- Make gRPC message size configurable (#4388)\n- Add API endpoint to get some control-plane internal info (#4339)\n- Add Google Cloud Storage Implementation available for storage paths starting with `gs://` (#4344)\n\n### Changed\n\n- Return 404 on index not found in ES Bulk API (#4425)\n- Allow $ and @ characters in field names (#4413)\n\n### Fixed\n- Assign all sources/shards, even if this requires exceeding the indexer (#4363)\n- Fix traces doc mapping (service name set as fast) and update default otel logs index ID to `otel-logs-v0_7` (#4401)\n- Fix parsing multi-line queries (#4409)\n- Fix range query for optional fast field panics with Index out of bounds (#4362)\n\n### Migration from 0.7.0 to 0.7.1\n\nQuickwit 0.7.1 will create the new index `otel-logs-v0_7` which is now used by default when ingesting data with the OTEL 
gRPC and HTTP API.\n\nIn the traces index `otel-traces-v0_7`, the `service_name` field is now fast. No migration is done if `otel-traces-v0_7` already exists. If you want `service_name` field to be fast, you have to delete first the existing `otel-traces-v0_7` index or create your own index.\n\n## [0.7.0]\n\n### Added\n\n- Elasticsearch-compatible API\n  - Added scroll and search_after APIs and support for multi-index search queries\n  - Added exists, multi-match, match phrase prefix, match bool prefix, bool queries\n  - Added `_field_caps` API\n- Added support for OTLP over HTTP API (Protobuf only) (#4335)\n- Added Jaeger REST endpoints for Grafana tracing support (#4197)\n- Added support for injecting custom HTTP headers and moved REST config parameters into REST config section (#4198)\n- Added support for OTLP trace data in arbitrary sources\n- Commit Kafka offsets on suggest truncate (#3638)\n- Honor `auto.offset.reset` parameter in Kafka source (#4095)\n- Added exact count optimization (#4019)\n- Added stream splits gRPC (#4109)\n- Adding a split cache in Searchers (#3857)\n- Added `coerce` and `output_format` options for numeric fields (#3704)\n- Added `PhraseMatchQuery` and `MultiMatchQuery` (#3727)\n- Added Elasticsearch's `TermsQuery` (#3747)\n- Added GCP PubSub source (#3720)\n- Parse timestamp strings (#3639)\n- Added Digital Ocean storage flavor (#3632)\n- Added new tokenizers: `source_code_default`, `source_code`, `multilang` (#3647, #3655, #3608)\n\n\n### Fixed\n\n- Fixed dates in UI (#4277)\n- Fixed duplicate splits planned on pipeline crash-respawn (#3854)\n- Fixed sorting (#3799)\n\nMore details in tantivy's [changelog](https://github.com/quickwit-oss/tantivy/blob/main/CHANGELOG.md).\n\n### Changed\n\n- Improve OTEL traces index config (#4311)\n  - OTEL endpoints are now using by default indexes `otel-logs-v0_7` and `otel-traces-v0_7` instead of `otel-logs-v0_6` and `otel-traces-v0_6`\n  - OTEL indexes have more fields stored as \"fast\" and have 
Trace and Span ID bytes field in hex format\n\n- Increased the gRPC payload limits from 10MiB to 20MiB (#4227)\n- Reject malformed Elasticsearch API requests (#4175)\n- Better logging when doc processing fails (#4323)\n- Search performance improvements\n- Indexing performance improvements\n\n### Removed\n\n### Migration from 0.6.x to 0.7\n\nThe format of the index and internal objects stored in the metastore of 0.7 is backward compatible with 0.6.\n\nIf you are using the OTEL indexes and ingesting data into indexes the `otel-logs-v0_6` and `otel-traces-v0_6`, you must stop indexing before upgrading.\nIndeed, the first time you start Quickwit 0.7, it will update the doc mapping fields of Trace ID and Span ID of those two indexes by changing their input/output formats from base64 to hex. This is automatic: you don't have to perform any manual operation.\nQuickwit 0.7 will create new indexes `otel-logs-v0_7` and `otel-traces-v0_7`, which are now used by default when ingesting data with the OTEL gRPC and HTTP API. The Jaeger gRPC and HTTP APIs will query both `otel-traces-v0_6` and `otel-traces-v0_7` by default.\nIt's possible to define the index ID you want to use for OTEL gRPC endpoints and Jaeger gRPC API by setting the request header `qw-otel-logs-index` or `qw-otel-traces-index` to the index ID you want to target.\n\n\n## [0.6.1]\n\n### Added\n- Support of phrase prefix queries in the query language.\n\n### Fixed\n- Fix timestamp field which was not allowed when defined in an object mapping.\n- Fix querying of integer on a JSON field (no document were returned).\n\n\n## [0.6.0] - 2023-06-03\n\n### Added\n- Elasticsearch/Opensearch compatible API.\n- New columnar format:\n    - Fast fields can now have any cardinality (Optional, Multivalued, restricted). 
In fact cardinality is now only used to format the output.\n    - Dynamic Fields are now fast fields.\n- String fast fields now can be normalized.\n- Various parameters of object storages can now be configured.\n- The ingest API makes it possible to force a commit, or wait for a scheduled commit to occur.\n- Ability to parse non-JSON data using VRL to extract some structure from documents.\n- Object storage can now use the `virtual-hosted–style`.\n- `date_histogram` aggregation.\n- `percentiles` aggregation.\n- Added support for Prefix Phrase query.\n- Added support for range queries.\n- The query language now supports different date formats.\n- Added support for base16 input/output configuration for bytes field. You can search for bytes fields using base16 encoded values.\n- Autotagging: fields used in the partition key are automatically added to tags.\n- Added arm64 docker image.\n- Added CORS configuration for the REST API.\n\n\n### Fixed\n- Major bug fix that required to restart quickwit when deleting and recreating an index with the same name.\n- The number of concurrent GET requests to object stores is now limited. 
This fixes a bug observed when requesting a lot of documents from MinIO.\n- Quickwit now searches into resource attributes when receiving a Jaeger request carrying tags\n- Object storage can be configured to:\n    - avoid Bulk delete API (workaround for Google Cloud Storage).\n    - Use virtual-host style addresses (workaround for Alibaba Object Storage Service).\n- Fix aggregation min doc_count empty merge bug.\n- Fix: Sort order for term aggregations.\n- Switch to ms in histogram for date type (aligning with ES).\n\n### Improvements\n\n- Search performance improvement.\n- Aggregation performance improvement.\n- Aggregation memory improvement.\n\nMore details in tantivy's [changelog](https://github.com/quickwit-oss/tantivy/blob/main/CHANGELOG.md).\n\n### Changed\n- Datetimes now have up to nanosecond precision.\n- By default, Quickwit now uses the node's hostname as the default node ID.\n- By default, Quickwit is in dynamic mode and all dynamic fields are marked as fast fields.\n- JSON field uses by default the raw tokenizer and is set to fast field.\n- Various performance/compression improvements.\n- OTEL indexes Trace ID and Span ID are now bytes fields.\n- OTEL indexes store timestamps with nanosecond precision.\n- Span status is now indexed in the OTEL trace index.\n- Default and raw tokenizers filter tokens longer than 255 bytes instead of 40 bytes.\n\n\n## [0.5.0] - 2023-03-16\n\n### Added\n- gRPC OpenTelemetry Protocol support for traces\n- gRPC OpenTelemetry Protocol support for logs\n- Control plane (indexing tasks scheduling)\n- Ingest API rate limiter\n- Pulsar source\n- VRL transform for data sources\n- REST API enhanced to fully manage indexes, sources, and splits\n- OpenAPI specification and swagger UI for all REST available endpoints\n- Large responses from REST API can be compressed\n- Add bulk stage splits method to metastore\n- MacOS M1 binary\n- Doc mapping field names starting with `_` are now valid\n\n### Fixed\n- Fix UI index completion on 
search page\n- Fix CLI index describe command to show stats on published splits\n- Fix REST API to always return on error a body formatted as `{\"message\": \"error message\"}`\n- Fixed REST status code when deleting unexisting index, source and when fetching splits on unexisting index\n\n### Changed\n- Source config schema (breaking or not? use serde rename to be not breaking?)\n- RocksDB replaced by [mrecordlog](https://github.com/quickwit-oss/mrecordlog) to store ingest API queues records\n- (Breaking) Indexing partition key new DSL\n- (Breaking) Helm chart updated with the new CLI\n- (Breaking) CLI indexes, sources, and splits commands use the REST API\n- (Breaking) Index new format: you need to reindex all your data\n\n## [0.4.0] - 2022-12-03\n\n### Added\n- Boolean, datetime, and IP address fields\n- Chinese tokenizer\n- Distributed indexing (Kafka only)\n- gRPC metastore server\n- Index partitioning\n- Kubernetes\n- Node config templating\n- Prometheus metrics\n- Retention policies\n- REST API for CRUD operations on indexes/sources\n- Support for Azure Blob Storage\n- Support for BM25 document scoring\n- Support for deletions\n- Support for slop in phrase queries\n- Support for snippeting\n\n### Fixed\n- Fixed cache misses during search fetch docs phase\n- Fixed credentials leak in metastore URI\n- Fixed GC scalability issues\n- Fixed support for multi-source\n\n### Changed\n- Changed default docstore block size to 1 MiB and compression algorithm to ZSTD\n\n- Quickwit now relies on sqlx rather than Diesel for PostgreSQL interactions.\nMigrating from 0.3 should work as expected. 
Migrating from earlier version however is\nnot supported.\n\n### Removed\n- Removed support for i64 as timestamp field\n- Removed support for sorting index by field\n\n### Security\n- Forbid access to paths with `..` at storage level\n\n## [0.3.1] - 2022-06-22\n\n### Added\n- Add support for Google Cloud Storage\n- Sort hits by timestamp desc by default in search UI\n- Add `description` attribute to field mappings\n- Display split state in output of `quickwit split list` command\n\n### Fixed\n- Clean up local split cache after index deletion\n- Fix API URLs displayed for copy and paste in UI\n- Fix custom S3 endpoint with trailing `/`\n- Fix `quickwit index create` command with `--overwrite` option\n\n## [0.3.0] - 2022-05-31\n\n### Added\n- Embedded UI for displaying search hits and cluster state\n- Schemaless indexing with JSON field\n- Ingest API (Elasticsearch-compatible)\n- Aggregation queries\n- Support for Amazon Kinesis\n\n### Fixed\n- Switched cluster membership algorithm from S.W.I.M. to Chitchat\n\n### Removed\n- u64 as date field\n\n## [0.2.1] - 2022-02-28\n\n### Added\n- Query validation against index schema before dispatch to leaf nodes (#1109, @linxGnu)\n- Support for custom S3 endpoint (#1108)\n- Warm up terms and fastfields concurrently (#1147)\n\n### Fixed\n- Minor bug in leaf search stream (#1110)\n- Default index root URI and metastore URI correctly default to data dir (#1140, @ddelemeny)\n\n### Removed\n- QW_ENV environment variable\n\n### Security\n- Compiled binaries with Rust 1.58.1, which fixes CVE-2022-21658\n\n## [0.2.0] - 2022-01-12\n\n## [0.1.0] - 2021-07-13\n"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "content": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nWe as members, contributors, and leaders pledge to make participation in our\ncommunity a harassment-free experience for everyone, regardless of age, body\nsize, visible or invisible disability, ethnicity, sex characteristics, gender\nidentity and expression, level of experience, education, socio-economic status,\nnationality, personal appearance, race, caste, color, religion, or sexual identity\nand orientation.\n\nWe pledge to act and interact in ways that contribute to an open, welcoming,\ndiverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our\ncommunity include:\n\n* Demonstrating empathy and kindness toward other people\n* Being respectful of differing opinions, viewpoints, and experiences\n* Giving and gracefully accepting constructive feedback\n* Accepting responsibility and apologizing to those affected by our mistakes,\n  and learning from the experience\n* Focusing on what is best not just for us as individuals, but for the\n  overall community\n\nExamples of unacceptable behavior include:\n\n* The use of sexualized language or imagery, and sexual attention or\n  advances of any kind\n* Trolling, insulting or derogatory comments, and personal or political attacks\n* Public or private harassment\n* Publishing others' private information, such as a physical or email\n  address, without their explicit permission\n* Other conduct which could reasonably be considered inappropriate in a\n  professional setting\n\n## Enforcement Responsibilities\n\nCommunity leaders are responsible for clarifying and enforcing our standards of\nacceptable behavior and will take appropriate and fair corrective action in\nresponse to any behavior that they deem inappropriate, threatening, offensive,\nor harmful.\n\nCommunity leaders have the right and responsibility to remove, edit, or reject\ncomments, commits, code, wiki 
edits, issues, and other contributions that are\nnot aligned to this Code of Conduct, and will communicate reasons for moderation\ndecisions when appropriate.\n\n## Scope\n\nThis Code of Conduct applies within all community spaces, and also applies when\nan individual is officially representing the community in public spaces.\nExamples of representing our community include using an official e-mail address,\nposting via an official social media account, or acting as an appointed\nrepresentative at an online or offline event.\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported to the community leaders responsible for enforcement at adrien+cc at quickwit dot io.\nAll complaints will be reviewed and investigated promptly and fairly.\n\nAll community leaders are obligated to respect the privacy and security of the\nreporter of any incident.\n\n## Enforcement Guidelines\n\nCommunity leaders will follow these Community Impact Guidelines in determining\nthe consequences for any action they deem in violation of this Code of Conduct:\n\n### 1. Correction\n\n**Community Impact**: Use of inappropriate language or other behavior deemed\nunprofessional or unwelcome in the community.\n\n**Consequence**: A private, written warning from community leaders, providing\nclarity around the nature of the violation and an explanation of why the\nbehavior was inappropriate. A public apology may be requested.\n\n### 2. Warning\n\n**Community Impact**: A violation through a single incident or series\nof actions.\n\n**Consequence**: A warning with consequences for continued behavior. No\ninteraction with the people involved, including unsolicited interaction with\nthose enforcing the Code of Conduct, for a specified period of time. This\nincludes avoiding interactions in community spaces as well as external channels\nlike social media. Violating these terms may lead to a temporary or\npermanent ban.\n\n### 3. 
Temporary Ban\n\n**Community Impact**: A serious violation of community standards, including\nsustained inappropriate behavior.\n\n**Consequence**: A temporary ban from any sort of interaction or public\ncommunication with the community for a specified period of time. No public or\nprivate interaction with the people involved, including unsolicited interaction\nwith those enforcing the Code of Conduct, is allowed during this period.\nViolating these terms may lead to a permanent ban.\n\n### 4. Permanent Ban\n\n**Community Impact**: Demonstrating a pattern of violation of community\nstandards, including sustained inappropriate behavior,  harassment of an\nindividual, or aggression toward or disparagement of classes of individuals.\n\n**Consequence**: A permanent ban from any sort of public interaction within\nthe community.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage],\nversion 2.0, available at\n[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].\n\nCommunity Impact Guidelines were inspired by\n[Mozilla's code of conduct enforcement ladder][Mozilla CoC].\n\nFor answers to common questions about this code of conduct, see the FAQ at\n[https://www.contributor-covenant.org/faq][FAQ]. Translations are available\nat [https://www.contributor-covenant.org/translations][translations].\n\n[homepage]: https://www.contributor-covenant.org\n[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html\n[Mozilla CoC]: https://github.com/mozilla/diversity\n[FAQ]: https://www.contributor-covenant.org/faq\n[translations]: https://www.contributor-covenant.org/translations\n"
  },
  {
    "path": "CODE_STYLE.md",
    "content": "# Quickwit Coding Style\n\nThis document summarizes a couple of points we try to embrace in our coding style. Some of these points take an opinionated side on a trade-off.\nThe description will try to make that clear.\n\nThe driving motivation of this code style is to make your code more readable.\n\nReadable is one word that hides several dimensions:\n- the reader understands the intent very rapidly\n- the reader can proofread. They can become confident that the code is correct very easily.\n\nNoticing how the two are different should not require too much squinting.\nShoot for *proofreadability*.\n\n## Code reviews\n\nDo a pass on your own code before sending it for review to avoid wasting the reviewer's time.\nAlso, trivial code style issues can get in the way and prevent the reviewer from spotting\ndeeper issues with the code.\n\nAs a reviewer, your first mission is proofreading. If you find a logical bug, feel good. You did an awesome job today.\n\nYour second goal is to make sure the code quality stays high.\n\nYou can express \"nitpicks\": suggestions about some local aspect of the code that do not matter too much. Just prepend \"nitpick:\" to your comment.\nYou can also express an opinion/advice that you know is not universal.\nMake sure you make it clear to the reviewee that it is fine to ignore the comment.\n\nDo not use rhetorical questions... If you are 95% sure of something, there is no need to express it as a question.\nPrefer `I believe this should be n+1` to `Shouldn't this be n+1?`.\n\nThe issue with rhetorical questions is that when you have a genuine\nquestion, reviewees may misinterpret it as an affirmation.\n\nAs a reviewee, if you are not used to CRs, it can feel like an adversarial process. Relax. 
It is normal to end up with a lot of comments on your first few CRs.\n\nYou might feel like the comments are unjustified. Try as much as possible not to feel frustrated.\nIf you want to discuss it, the best place is the chat, or maybe send a PR to modify this document.\n\nBut remember to pick your battles... If you think it does not matter much but it takes 2 secs to fix, just consider doing what is suggested by the reviewer or this style guide.\n\n## Rust gives us a lot of tools... this does not mean we need to abuse them.\n\nRust is an amazing language. It offers all kinds of tools to allow for zero-cost code reuse. Among these tools, however, generics and macros tend to hurt readability (and compile time). Let's ONLY use them where necessary.\n\nThe same goes for the chaining iterator style.\nWhen coupled with error handling, Rust's chaining iterator style can\nhurt readability.\nUsing a good old procedural for-loop is fine and recommended in that case.\n\n**example needed**\n\n\n## Naming\n\nFunction and variable names are key for readability.\n\nA good function name is often sufficient for the reader to build reasonable expectations of what it does.\n\nIf this implies long names, let's have very long names.\n\nTrying to fit this rule has an interesting side effect.\nNobody likes to type long function names. 
It just feels ugly.\nBut long names are frequently symptoms of badly organized code, and they can\nhelp spot refactoring opportunities.\n\n**example needed**\n\n## Explanatory variables\n\nOne incredibly powerful and simple tool to help make your code\nmore readable is to introduce explanatory variables.\n\nExplanatory variables are intermediary variables that were not really\nnecessary, but make it possible -through their names- to convey their\nsemantics to the reader.\n\n**example needed**\n\n## Shadowing\n\nAs much as possible, do not reuse the same variable name in a function.\nIt is never necessary, very rarely helpful and can hurt.\n\n## Types\n\nRust handles type inference. That's great.\nChances are, your editor even automatically hints the type of\nyour variables.\n\nSometimes, however, it can be helpful for the reviewer to have the type of some very strategic variables.\n\n**example needed**\n\n## Early returns\n\nWe prefer early returns.\nRather than chaining `else` statements, we prefer to isolate\ncorner cases in short `if` statements to prevent nesting.\n\n**example needed**\n\n## Invariants\n\nA good idea to help reviewers proofread your code is to\nidentify invariants and express them as `debug_assert!`.\n\nThese asserts will not be part of the release binary and won't hurt the execution time.\n\n**example needed**\n\n## Errors and log messages\n\nError and log messages follow the same format. They should be concise, lowercase (except proper names), and without trailing punctuation.\n\nAs a loose rule, where it does not hurt readability, log messages should rely on `tracing`\nstructured logging instead of templating. 
\n\nIn other words, prefer:\n`warn!(remaining=remaining_attempts, \"trubulizor rpc plane retry failed\")`\nto\n`warn!(\"trubulizor rpc plane retry failed ({remaining_attempts} attempts remaining)\")`\n\n### Error examples\n- \"failed to start actor runtimes\"\n- \"cannot join PostgreSQL URI {} with path {:?}\"\n- \"could not find split metadata in Metastore {}\"\n- \"unknown output format {:?}\"\n\n### Log examples\n\n\n## Comments\n\nWe follow the same code style as [rustc's doc comments](https://doc.rust-lang.org/1.0.0/style/style/comments.html).\nIn particular, the summary line should be written in third-person singular present indicative form.\n\nMissing rustdoc in Quickwit or in a private API is ok.\nMissing rustdoc on the Tantivy public API is not ok.\n\nWe usually do not expect comments to contain any implementation details.\nTo some extent, it is normal for the user to have to look at the code.\n\nWhen it is not clear, comments should convey:\n- intent\n- context (links to a Wikipedia page or a paper, link to the original issue can be helpful too)\n- hidden contracts... 
but really you should avoid those.\n\nInline comments in the code can be very useful to help the reader understand\nthe justification of a thorny piece of code.\n\n**example needed**\n\n## Hidden contracts\n\nWe call a hidden contract a precondition on the arguments that is not enforced by their types.\n\nSometimes, hidden contracts are unavoidable.\n\nFor instance, a binary search requires the array to be sorted.\n\nWhenever possible, you should avoid having hidden contracts.\n\nTo avoid hidden contracts, you should consider:\n- changing your argument types to have the type system enforce the contract\n- internalizing the contract enforcement.\n\nFor instance, the following function is not good because it hides a contract on values not being empty:\n\n```\nfn min(&self, values: &[usize]) -> usize {\n\tlet mut min_val = usize::MAX;\n\tfor val in values {\n\t\tmin_val = min_val.min(*val);\n\t}\n\tmin_val\n}\n```\nIt can be fixed by changing the prototype to return a `Result` or an `Option`.\n\nIn addition, while the author might have thought that the `usize::MAX` trick was a nice touch, it can easily backfire. Panicking is often better than returning a wrong result.\n\nThe better approach here is of course an `Option<usize>` like `Iterator::min` does.\n\nAnother way to internalize the contract enforcement is to move some logic from the caller to within the function.\n\nFor instance:\n```\n// The algorithm requires splits to be sorted by `end_time`\nfn merge_candidates(splits: &mut Vec<SplitMetadata>) -> Vec<SplitMetadata>\n```\n\nIt is tempting to rely on the fact that the splits `Vec` is always sorted on the caller side and put this as a hidden contract.\nIf it is not too much work, just redoing the sorting within `merge_candidates`\nis a good idea. 
For the above function, that extra work is tiny.\n\nBy the way, did you know Rust's std sort is inspired by timsort?\nIt will perform in linear time if the array is already sorted...\n\nWhen implementing a function with a hidden contract, as long as it does not hurt the overall performance, add an assert statement to your code to check the contract. (For instance, check that the array is sorted).\n\n**example needed**\n\n## Tests\n\nTests do not need to match the same quality as the original code.\n\nWhen a bug is encountered, it is ok to introduce a test that seems weirdly\noverfitted to the specific issue. A comment should then add a link to the issue.\n\nUnit tests should run fast, and if possible they should not do any IO.\nCode should be structured to make unit testing possible.\n\nSome of our unit tests would not be considered good unit tests in some companies, and that's ok.\n\nHere are the controversial bits:\n\n### Not just for spotting regression\n\nOur unit tests are not here just to spot regressions.\nThey are also here to check the correctness of our code.\n\n### Not just testing public API\n\nUnit tests do not only test the public API.\nComplex code often calls half a dozen smaller functions.\n\nThe number of corner cases in the complex code\ncan make it difficult to test all of them.\n\nOn the other hand, the smaller functions could be tested\nexhaustively.\n\nFor this reason, testing internal private functions is actually encouraged.\n\n### Not always \"unit\" tests\n\nIdeally, unit tests should be testing one thing and one thing only, but if they don't and it helps cover more ground, this is ok.\n\n### Not necessarily deterministic.\n\nFinally, unit tests are not necessarily deterministic. 
We really like proptests.\nWhen proptesting, reduce the space of exploration as much as possible to get the most out of it.\n\n## async vs sync\n\nYour async code should block for at most 500 microseconds.\nIf you are unsure whether your code blocks for longer than 500 microseconds, or if it is a non-trivial question, it should run via `tokio::task::spawn_blocking`.\n"
  },
  {
    "path": "CONTRIBUTING.md",
    "content": "# Contributing to Quickwit\nThere are many ways to contribute to Quickwit.\nCode contributions are welcome of course, but\nbug reports, feature requests, and evangelizing are just as valuable.\n\n# Submitting a PR\nCheck if your issue is already listed on [GitHub](https://github.com/quickwit-oss/quickwit/issues).\nIf it is not, create your own issue.\n\nPlease add the following phrase at the end of your commit message: `Closes #<Issue Number>`.\nIt will automatically link your PR in the issue page. Also, once your PR is merged, it will\nclose the issue. If your PR only partially addresses the issue and you would like to\nkeep it open, just write `See #<Issue Number>`.\n\nFeel free to send your contribution in an unfinished state to get early feedback.\nIn that case, simply mark the PR with the tag [WIP] (standing for work in progress).\n\n## PR verification checks\nWhen you submit a pull request to the project, the CI system runs several verification checks. After your PR is merged, a more exhaustive list of tests will be run.\n\nYou will be notified by email from the CI system if any issues are discovered, but if you want to run these checks locally before submitting a PR or in order to verify changes, you can use the following commands in the root directory:\n1. To verify that all tests are passing, run `make test-all`. Alternatively, run `make -k test-all docker-compose-down` to tear down the Docker services after running all the tests.\n2. To fix code style and format as well as catch common mistakes, run `make fix`.\n3. To build docs, run `make build-rustdoc`.\n\n# Development\n\n## Setup & run tests\n\n### Local Development\n\n1. Install Rust, CMake, Docker (https://docs.docker.com/engine/install/) and Docker Compose (https://docs.docker.com/compose/install/)\n2. Install node@24 and `npm install -g yarn`\n3. Install awslocal https://github.com/localstack/awscli-local\n4. 
Install protoc https://grpc.io/docs/protoc-installation/ (you may need to install the latest binaries rather than your distro's flavor)\n5. Install nextest https://nexte.st/docs/installation/pre-built-binaries/\n\n### GitHub Codespaces\n\n[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/quickwit-oss/quickwit?devcontainer_path=.devcontainer/devcontainer.json)\n\nGitHub Codespaces provides a fully configured development environment in the cloud, making it easy to get started with Quickwit development. By clicking the badge above, you can create a codespace with all the necessary tools installed and configured.\n\n### Running tests\nRun `make test-all` to run all tests.\n\n## Useful commands\n* `make test-all` - starts necessary Docker services and runs all tests.\n* `make -k test-all docker-compose-down` - the same as above, but tears down the Docker services after running all the tests.\n* `make fmt` - runs formatter, this command requires the nightly toolchain to be installed by running `rustup toolchain install nightly`.\n* `make fix` - runs formatter and clippy checks as well as removing unused dependencies (requires `cargo install cargo-machete`).\n* `make typos` - runs the spellcheck tool over the codebase. (Install by running `cargo install typos-cli`)\n* `make doc` - builds docs.\n* `make docker-compose-up` - starts Docker services.\n* `make docker-compose-down` - stops Docker services.\n* `make docker-compose-logs` - shows Docker logs.\n\n## Start the UI\n1. Switch to the `quickwit` subdirectory of the project and create a data directory `qwdata` there if it doesn't exist\n2. Start a server `cargo r run --config ../config/quickwit.yaml`\n3. `yarn --cwd quickwit-ui install` and `yarn --cwd quickwit-ui start`\n4. Open your browser at `http://localhost:3000/ui` if it doesn't open automatically\n\n## Running UI Tests\n1. 
Run `yarn --cwd quickwit-ui install` and `yarn --cwd quickwit-ui test` in the `quickwit` directory\n\n## Running UI e2e tests\n1. Make sure a searcher is running: `cargo r run --service searcher --config ../config/quickwit.yaml`\n2. Run `yarn --cwd quickwit-ui e2e-test`\n\n## Running services such as Amazon Kinesis or S3, Kafka, or PostgreSQL locally.\n1. Ensure Docker and Docker Compose are correctly installed on your machine (see above)\n2. Run `make docker-compose-up` to launch all the services or `make docker-compose-up DOCKER_SERVICES=kafka,postgres` to launch a subset of services.\n\n## Tracing with Jaeger\n1. Ensure Docker and Docker Compose are correctly installed on your machine (see above)\n2. Start the Jaeger services (UI, collector, agent, ...) by running the command `make docker-compose-up DOCKER_SERVICES=jaeger`\n3. Start Quickwit with the following environment variables:\n\n```\nOTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317\nQW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER=true\n```\n\n4. Open your browser and visit [localhost:16686](http://localhost:16686/)\n\n## Using tokio console\n1. Install tokio-console by running `cargo install tokio-console`.\n2. From the quickwit-cli folder, install the quickwit binary: `RUSTFLAGS=\"--cfg tokio_unstable\" cargo install --path . --features tokio-console`\n3. Launch a long-running command such as index with tokio console enabled: `QW_ENABLE_TOKIO_CONSOLE=1 quickwit index ...`\n4. Run `tokio-console`.\n\n## Building binaries\n\nCurrently, we use [cross](https://github.com/rust-embedded/cross) to build Quickwit binaries for different architectures.\nFor this to work, we've had to customize the Docker images cross uses. These customizations can be found in Dockerfiles located in the `./cross-images` folder. For cross to take into account any change to those\nDockerfiles, you need to build and push the images to Docker Hub by running `make cross-images`.\nWe also have nightly builds that are pushed to Docker Hub. 
This continuously verifies that our binaries still build when external dependencies are updated. Artifacts from successful builds remain accessible for the next three days. Release builds always have their artifacts attached to the release.\n\n## Docker images\n\nEach merge on the `main` branch triggers the build of a new Docker image available on Docker Hub at `quickwit/quickwit:edge`. Tagging a commit also creates a new image `quickwit/quickwit:<tag name>` if the tag name matches `v*` or `qw*`. The Docker images are based on Debian.\n\n### Notes on the embedded UI\nAs the React UI is embedded in the Rust binary, we need to build the React app before building the binary. Hence `make cross-image` depends on the command `build-ui`.\n\n## Testing release (alpha, beta, rc)\n\nThe Quickwit installation command `curl -L https://install.quickwit.io | sh` always installs the latest stable version of Quickwit. To make it easier to install and test new (alpha, beta, rc) releases, you can manually download the script and run it as `./install.sh --allow-any-latest-version`. This forces the script to install the latest available release package, stable or not.\n\n## Tracking licenses\n\nWe track the licenses of the open source crates this project depends on using\n[`rust-license-tool`](https://github.com/DataDog/rust-license-tool). The listing is checked every\ntime CI is run. To update the listing, install the tool with `cargo install --git\nhttps://github.com/DataDog/rust-license-tool` and then run `dd-rust-license-tool write`. 
If there are\nany errors, you may need to update the listing of exceptions in `license-tool.toml`.\n\n# Documentation\n\nQuickwit documentation is located in the `docs` directory.\n\n## Generating the CLI docs.\n\nThe [CLI doc page](docs/reference/cli.md) is partly generated by a script.\nTo update it, first run the script:\n\n```bash\ncargo run --bin generate_markdown > ../docs/reference/cli_insert.md\n```\n\nThen manually edit the [doc page](docs/reference/cli.md): paste in the generated content, then delete the generated file.\nTwo comments in the doc page mark where the generated docs begin and end:\n\n```markdown\n[comment]: <> (Insert auto generated CLI docs from here.)\n\n...docs to insert...\n\n[comment]: <> (End of auto generated CLI docs.)\n```\n"
  },
  {
    "path": "Dockerfile",
    "content": "FROM node:24@sha256:b2b2184ba9b78c022e1d6a7924ec6fba577adf28f15c9d9c457730cc4ad3807a AS ui-builder\n\nCOPY quickwit/quickwit-ui /quickwit/quickwit-ui\n\nWORKDIR /quickwit/quickwit-ui\n\nRUN touch .gitignore_for_build_directory \\\n    && NODE_ENV=production make install build\n\n\nFROM rust:bookworm@sha256:b5efaabfd787a695d2e46b37d3d9c54040e11f4c10bc2e714bbadbfcc0cd6c39 AS bin-builder\n\nARG CARGO_FEATURES=release-feature-set\nARG CARGO_PROFILE=release\nARG QW_COMMIT_DATE\nARG QW_COMMIT_HASH\nARG QW_COMMIT_TAGS\n\nENV QW_COMMIT_DATE=$QW_COMMIT_DATE\nENV QW_COMMIT_HASH=$QW_COMMIT_HASH\nENV QW_COMMIT_TAGS=$QW_COMMIT_TAGS\n\nRUN apt-get -y update \\\n    && apt-get -y install ca-certificates \\\n    clang \\\n    cmake \\\n    libssl-dev \\\n    llvm \\\n    protobuf-compiler \\\n    && rm -rf /var/lib/apt/lists/*\n\nCOPY quickwit /quickwit\nCOPY config/quickwit.yaml /quickwit/config/quickwit.yaml\nCOPY --from=ui-builder /quickwit/quickwit-ui/build /quickwit/quickwit-ui/build\n\nWORKDIR /quickwit\n\nRUN rustup toolchain install\n\nRUN echo \"Building workspace with feature(s) '$CARGO_FEATURES' and profile '$CARGO_PROFILE'\" \\\n    && RUSTFLAGS=\"--cfg tokio_unstable\" \\\n    cargo build \\\n    -p quickwit-cli \\\n    --features $CARGO_FEATURES \\\n    --bin quickwit \\\n    $(test \"$CARGO_PROFILE\" = \"release\" && echo \"--release\") \\\n    && echo \"Copying binaries to /quickwit/bin\" \\\n    && mkdir -p /quickwit/bin \\\n    && find target/$CARGO_PROFILE -maxdepth 1 -perm /a+x -type f -exec mv {} /quickwit/bin \\;\n\n\nFROM debian:bookworm-slim@sha256:e899040a73d36e2b36fa33216943539d9957cba8172b858097c2cabcdb20a3e2 AS quickwit\n\nLABEL org.opencontainers.image.title=\"Quickwit\"\nLABEL maintainer=\"Quickwit, Inc. 
<hello@quickwit.io>\"\nLABEL org.opencontainers.image.vendor=\"Quickwit, Inc.\"\nLABEL org.opencontainers.image.licenses=\"Apache-2.0\"\n\nRUN apt-get -y update \\\n    && apt-get -y install ca-certificates \\\n    libssl3 \\\n    && rm -rf /var/lib/apt/lists/*\n\nWORKDIR /quickwit\nRUN mkdir config qwdata\nCOPY --from=bin-builder /quickwit/bin/quickwit /usr/local/bin/quickwit\nCOPY --from=bin-builder /quickwit/config/quickwit.yaml /quickwit/config/quickwit.yaml\n\nENV QW_CONFIG=/quickwit/config/quickwit.yaml\nENV QW_DATA_DIR=/quickwit/qwdata\nENV QW_LISTEN_ADDRESS=0.0.0.0\n\nRUN quickwit --version\n\nENTRYPOINT [\"quickwit\"]\n"
  },
  {
    "path": "LICENSE",
    "content": "                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2021-Present Datadog, Inc.\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n"
  },
  {
    "path": "LICENSE-3rdparty.csv",
    "content": "Component,Origin,License,Copyright\nadler2,https://github.com/oyvindln/adler2,0BSD OR MIT OR Apache-2.0,\"Jonas Schievink <jonasschievink@gmail.com>, oyvindln <oyvindln@users.noreply.github.com>\"\nadvapi32-sys,https://github.com/retep998/winapi-rs,MIT,Peter Atashian <retep998@gmail.com>\nahash,https://github.com/tkaitchuck/ahash,MIT OR Apache-2.0,Tom Kaitchuck <Tom.Kaitchuck@gmail.com>\naho-corasick,https://github.com/BurntSushi/aho-corasick,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\naliasable,https://github.com/avitex/rust-aliasable,MIT,avitex <avitex@wfxlabs.com>\nalloca,https://github.com/playXE/alloca-rs,MIT,\"Adel Prokurov <adel.prokurov@gmail.com>, StackOverflowExcept1on\"\nallocator-api2,https://github.com/zakarumych/allocator-api2,MIT OR Apache-2.0,Zakarum <zaq.dev@icloud.com>\nandroid_system_properties,https://github.com/nical/android_system_properties,MIT OR Apache-2.0,Nicolas Silva <nical@fastmail.com>\nanes,https://github.com/zrzka/anes-rs,MIT OR Apache-2.0,Robert Vojta <rvojta@me.com>\nansi-str,https://github.com/zhiburt/ansi-str,MIT,Maxim Zhiburt <zhiburt@gmail.com>\nansitok,https://gitlab.com/zhiburt/ansitok,MIT,Maxim Zhiburt <zhiburt@gmail.com>\nanstream,https://github.com/rust-cli/anstyle,MIT OR Apache-2.0,The anstream Authors\nanstyle,https://github.com/rust-cli/anstyle,MIT OR Apache-2.0,The anstyle Authors\nanstyle-parse,https://github.com/rust-cli/anstyle,MIT OR Apache-2.0,The anstyle-parse Authors\nanstyle-query,https://github.com/rust-cli/anstyle,MIT OR Apache-2.0,The anstyle-query Authors\nanstyle-wincon,https://github.com/rust-cli/anstyle,MIT OR Apache-2.0,The anstyle-wincon Authors\nanyhow,https://github.com/dtolnay/anyhow,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\narc-swap,https://github.com/vorner/arc-swap,MIT OR Apache-2.0,Michal 'vorner' Vaner <vorner@vorner.cz>\narrayvec,https://github.com/bluss/arrayvec,MIT OR 
Apache-2.0,bluss\nassert-json-diff,https://github.com/davidpdrsn/assert-json-diff,MIT,David Pedersen <david.pdrsn@gmail.com>\nasync-compression,https://github.com/Nullus157/async-compression,MIT OR Apache-2.0,\"Wim Looman <wim@nemo157.com>, Allen Bui <fairingrey@gmail.com>\"\nasync-speed-limit,https://github.com/tikv/async-speed-limit,MIT OR Apache-2.0,The TiKV Project Developers\nasync-stream,https://github.com/tokio-rs/async-stream,MIT,Carl Lerche <me@carllerche.com>\nasync-stream-impl,https://github.com/tokio-rs/async-stream,MIT,Carl Lerche <me@carllerche.com>\nasync-trait,https://github.com/dtolnay/async-trait,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\natomic-waker,https://github.com/smol-rs/atomic-waker,Apache-2.0 OR MIT,\"Stjepan Glavina <stjepang@gmail.com>, Contributors to futures-rs\"\naws-config,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-credential-types,https://github.com/smithy-lang/smithy-rs,Apache-2.0,AWS Rust SDK Team <aws-sdk-rust@amazon.com>\naws-lc-rs,https://github.com/aws/aws-lc-rs,ISC AND (Apache-2.0 OR ISC),AWS-LibCrypto\naws-lc-sys,https://github.com/aws/aws-lc-rs,ISC AND (Apache-2.0 OR ISC) AND OpenSSL,AWS-LC\naws-runtime,https://github.com/smithy-lang/smithy-rs,Apache-2.0,AWS Rust SDK Team <aws-sdk-rust@amazon.com>\naws-sdk-lambda,https://github.com/awslabs/aws-sdk-rust,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-sdk-s3,https://github.com/awslabs/aws-sdk-rust,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-sdk-sso,https://github.com/awslabs/aws-sdk-rust,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-sdk-ssooidc,https://github.com/awslabs/aws-sdk-rust,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen 
<rcoh@amazon.com>\"\naws-sdk-sts,https://github.com/awslabs/aws-sdk-rust,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-sigv4,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, David Barsky <me@davidbarsky.com>\"\naws-smithy-async,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, John DiSanti <jdisanti@amazon.com>\"\naws-smithy-checksums,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Zelda Hessler <zhessler@amazon.com>\"\naws-smithy-eventstream,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, John DiSanti <jdisanti@amazon.com>\"\naws-smithy-http,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-smithy-http-client,https://github.com/smithy-lang/smithy-rs,Apache-2.0,AWS Rust SDK Team <aws-sdk-rust@amazon.com>\naws-smithy-json,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, John DiSanti <jdisanti@amazon.com>\"\naws-smithy-observability,https://github.com/awslabs/smithy-rs,Apache-2.0,AWS Rust SDK Team <aws-sdk-rust@amazon.com>\naws-smithy-protocol-test,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-smithy-query,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, John DiSanti <jdisanti@amazon.com>\"\naws-smithy-runtime,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Zelda Hessler <zhessler@amazon.com>\"\naws-smithy-runtime-api,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Zelda Hessler 
<zhessler@amazon.com>\"\naws-smithy-types,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-smithy-xml,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naws-types,https://github.com/smithy-lang/smithy-rs,Apache-2.0,\"AWS Rust SDK Team <aws-sdk-rust@amazon.com>, Russell Cohen <rcoh@amazon.com>\"\naxum,https://github.com/tokio-rs/axum,MIT,The axum Authors\naxum-core,https://github.com/tokio-rs/axum,MIT,The axum-core Authors\nbase16ct,https://github.com/RustCrypto/formats/tree/master/base16ct,Apache-2.0 OR MIT,RustCrypto Developers\nbase64,https://github.com/marshallpierce/rust-base64,MIT OR Apache-2.0,Marshall Pierce <marshall@mpierce.org>\nbase64-simd,https://github.com/Nugine/simd,MIT,The base64-simd Authors\nbase64ct,https://github.com/RustCrypto/formats,Apache-2.0 OR MIT,RustCrypto Developers\nbit-set,https://github.com/contain-rs/bit-set,Apache-2.0 OR MIT,Alexis Beingessner <a.beingessner@gmail.com>\nbit-vec,https://github.com/contain-rs/bit-vec,Apache-2.0 OR MIT,Alexis Beingessner <a.beingessner@gmail.com>\nbitflags,https://github.com/bitflags/bitflags,MIT OR Apache-2.0,The Rust Project Developers\nbitpacking,https://github.com/quickwit-oss/bitpacking,MIT,Paul Masurel <paul.masurel@gmail.com>\nblock-buffer,https://github.com/RustCrypto/utils,MIT OR Apache-2.0,RustCrypto Developers\nbon,https://github.com/elastio/bon,MIT OR Apache-2.0,The bon Authors\nbon-macros,https://github.com/elastio/bon,MIT OR Apache-2.0,The bon-macros Authors\nbpu_trasher,https://github.com/pseitz/bpu_trasher,MIT,Pascal Seitz <pascal.seitz@gmail.com>\nbs58,https://github.com/Nullus157/bs58-rs,MIT OR Apache-2.0,The bs58 Authors\nbumpalo,https://github.com/fitzgen/bumpalo,MIT OR Apache-2.0,Nick Fitzgerald <fitzgen@gmail.com>\nbytecount,https://github.com/llogiq/bytecount,Apache-2.0 OR MIT,\"Andre Bogus 
<bogusandre@gmail.de>, Joshua Landau <joshua@landau.ws>\"\nbyteorder,https://github.com/BurntSushi/byteorder,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\nbytes,https://github.com/tokio-rs/bytes,MIT,\"Carl Lerche <me@carllerche.com>, Sean McArthur <sean@seanmonstar.com>\"\nbytes-utils,https://github.com/vorner/bytes-utils,Apache-2.0 OR MIT,Michal 'vorner' Vaner <vorner@vorner.cz>\nbytesize,https://github.com/bytesize-rs/bytesize,Apache-2.0,\"Hyunsik Choi <hyunsik.choi@gmail.com>, MrCroxx <mrcroxx@outlook.com>, Rob Ede <robjtede@icloud.com>\"\nbytestring,https://github.com/actix/actix-net,MIT OR Apache-2.0,\"Nikolay Kim <fafhrd91@gmail.com>, Rob Ede <robjtede@icloud.com>\"\ncamino,https://github.com/camino-rs/camino,MIT OR Apache-2.0,\"Without Boats <saoirse@without.boats>, Ashley Williams <ashley666ashley@gmail.com>, Steve Klabnik <steve@steveklabnik.com>, Rain <rain@sunshowers.io>\"\ncargo-platform,https://github.com/rust-lang/cargo,MIT OR Apache-2.0,The cargo-platform Authors\ncargo_metadata,https://github.com/oli-obk/cargo_metadata,MIT,Oliver Schneider <git-spam-no-reply9815368754983@oli-obk.de>\ncast,https://github.com/japaric/cast.rs,MIT OR Apache-2.0,Jorge Aparicio <jorge@japaric.io>\ncbor-diag,https://github.com/Nullus157/cbor-diag-rs,MIT OR Apache-2.0,The cbor-diag Authors\ncc,https://github.com/rust-lang/cc-rs,MIT OR Apache-2.0,Alex Crichton <alex@alexcrichton.com>\ncensus,https://github.com/quickwit-inc/census,MIT,Paul Masurel <paul.masurel@gmail.com>\ncfg-if,https://github.com/rust-lang/cfg-if,MIT OR Apache-2.0,Alex Crichton <alex@alexcrichton.com>\nchitchat,https://github.com/quickwit-oss/chitchat,MIT,\"Quickwit, Inc. 
<hello@quickwit.io>\"\nchrono,https://github.com/chronotope/chrono,MIT OR Apache-2.0,The chrono Authors\nciborium,https://github.com/enarx/ciborium,Apache-2.0,Nathaniel McCallum <npmccallum@profian.com>\nciborium-io,https://github.com/enarx/ciborium,Apache-2.0,Nathaniel McCallum <npmccallum@profian.com>\nciborium-ll,https://github.com/enarx/ciborium,Apache-2.0,Nathaniel McCallum <npmccallum@profian.com>\nclap,https://github.com/clap-rs/clap,MIT OR Apache-2.0,The clap Authors\nclap_builder,https://github.com/clap-rs/clap,MIT OR Apache-2.0,The clap_builder Authors\nclap_lex,https://github.com/clap-rs/clap,MIT OR Apache-2.0,The clap_lex Authors\ncoarsetime,https://github.com/jedisct1/rust-coarsetime,ISC,Frank Denis <github@pureftpd.org>\ncobs,https://github.com/jamesmunns/cobs.rs,MIT OR Apache-2.0,\"Allen Welkie <>, James Munns <james@onevariable.com>\"\ncolorchoice,https://github.com/rust-cli/anstyle,MIT OR Apache-2.0,The colorchoice Authors\ncolored,https://github.com/mackwic/colored,MPL-2.0,Thomas Wickham <mackwic@gmail.com>\ncompression-codecs,https://github.com/Nullus157/async-compression,MIT OR Apache-2.0,\"Wim Looman <wim@nemo157.com>, Allen Bui <fairingrey@gmail.com>\"\ncompression-core,https://github.com/Nullus157/async-compression,MIT OR Apache-2.0,\"Wim Looman <wim@nemo157.com>, Allen Bui <fairingrey@gmail.com>\"\nconsole,https://github.com/console-rs/console,MIT,The console Authors\nconst-oid,https://github.com/RustCrypto/formats/tree/master/const-oid,Apache-2.0 OR MIT,RustCrypto Developers\ncore-foundation,https://github.com/servo/core-foundation-rs,MIT OR Apache-2.0,The Servo Project Developers\ncore-foundation-sys,https://github.com/servo/core-foundation-rs,MIT OR Apache-2.0,The Servo Project Developers\ncpufeatures,https://github.com/RustCrypto/utils,MIT OR Apache-2.0,RustCrypto Developers\ncrc32c,https://github.com/zowens/crc32c,Apache-2.0 OR MIT,Zack Owens\ncrc32fast,https://github.com/srijs/rust-crc32fast,MIT OR Apache-2.0,\"Sam Rijs 
<srijs@airpost.net>, Alex Crichton <alex@alexcrichton.com>\"\ncriterion-plot,https://github.com/criterion-rs/criterion.rs,Apache-2.0 OR MIT,\"Jorge Aparicio <japaricious@gmail.com>, Brook Heisler <brookheisler@gmail.com>\"\ncron,https://github.com/zslayton/cron,MIT OR Apache-2.0,Zack Slayton <zack.slayton@gmail.com>\ncrossbeam-channel,https://github.com/crossbeam-rs/crossbeam,MIT OR Apache-2.0,The crossbeam-channel Authors\ncrossbeam-deque,https://github.com/crossbeam-rs/crossbeam,MIT OR Apache-2.0,The crossbeam-deque Authors\ncrossbeam-epoch,https://github.com/crossbeam-rs/crossbeam,MIT OR Apache-2.0,The crossbeam-epoch Authors\ncrossbeam-utils,https://github.com/crossbeam-rs/crossbeam,MIT OR Apache-2.0,The crossbeam-utils Authors\ncrunchy,https://github.com/eira-fransham/crunchy,MIT,Eira Fransham <jackefransham@gmail.com>\ncrypto-bigint,https://github.com/RustCrypto/crypto-bigint,Apache-2.0 OR MIT,RustCrypto Developers\ncrypto-common,https://github.com/RustCrypto/traits,MIT OR Apache-2.0,RustCrypto Developers\ndarling,https://github.com/TedDriggs/darling,MIT,Ted Driggs <ted.driggs@outlook.com>\ndarling_core,https://github.com/TedDriggs/darling,MIT,Ted Driggs <ted.driggs@outlook.com>\ndarling_macro,https://github.com/TedDriggs/darling,MIT,Ted Driggs <ted.driggs@outlook.com>\ndashmap,https://github.com/xacrimon/dashmap,MIT,Acrimon <joel.wejdenstal@gmail.com>\ndata-encoding,https://github.com/ia0/data-encoding,MIT,Julien Cretin <git@ia0.eu>\ndeadpool,https://github.com/bikeshedder/deadpool,MIT OR Apache-2.0,Michael P. Jung <michael.jung@terreon.de>\ndeadpool-runtime,https://github.com/bikeshedder/deadpool,MIT OR Apache-2.0,Michael P. 
Jung <michael.jung@terreon.de>\nder,https://github.com/RustCrypto/formats/tree/master/der,Apache-2.0 OR MIT,RustCrypto Developers\nderanged,https://github.com/jhpratt/deranged,MIT OR Apache-2.0,Jacob Pratt <jacob@jhpratt.dev>\ndialoguer,https://github.com/console-rs/dialoguer,MIT,The dialoguer Authors\ndiff,https://github.com/utkarshkukreti/diff.rs,MIT OR Apache-2.0,Utkarsh Kukreti <utkarshkukreti@gmail.com>\ndifflib,https://github.com/DimaKudosh/difflib,MIT,Dima Kudosh <dimakudosh@gmail.com>\ndigest,https://github.com/RustCrypto/traits,MIT OR Apache-2.0,RustCrypto Developers\ndisplaydoc,https://github.com/yaahc/displaydoc,MIT OR Apache-2.0,Jane Lusby <jlusby@yaah.dev>\ndowncast,https://github.com/fkoep/downcast-rs,MIT,Felix Köpge <fkoep@mailbox.org>\ndowncast-rs,https://github.com/marcianx/downcast-rs,MIT OR Apache-2.0,The downcast-rs Authors\ndtoa,https://github.com/dtolnay/dtoa,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\ndyn-clone,https://github.com/dtolnay/dyn-clone,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\necdsa,https://github.com/RustCrypto/signatures/tree/master/ecdsa,Apache-2.0 OR MIT,RustCrypto Developers\neither,https://github.com/rayon-rs/either,MIT OR Apache-2.0,bluss\nelasticsearch-dsl,https://github.com/vinted/elasticsearch-dsl-rs,MIT OR Apache-2.0,\"Evaldas Buinauskas <evaldas.buinauskas@vinted.com>, Search Platform <search-platform@vinted.com>\"\nelliptic-curve,https://github.com/RustCrypto/traits/tree/master/elliptic-curve,Apache-2.0 OR MIT,RustCrypto Developers\nembedded-io,https://github.com/embassy-rs/embedded-io,MIT OR Apache-2.0,The embedded-io Authors\nembedded-io,https://github.com/rust-embedded/embedded-hal,MIT OR Apache-2.0,The embedded-io Authors\nencode_unicode,https://github.com/tormol/encode_unicode,Apache-2.0 OR MIT,Torbjørn Birch Moltu <t.b.moltu@lyse.net>\nencoding_rs,https://github.com/hsivonen/encoding_rs,(Apache-2.0 OR MIT) AND BSD-3-Clause,Henri Sivonen 
<hsivonen@hsivonen.fi>\nenum-iterator,https://github.com/stephaneyfx/enum-iterator,0BSD,Stephane Raux <stephaneyfx@gmail.com>\nenum-iterator-derive,https://github.com/stephaneyfx/enum-iterator,0BSD,Stephane Raux <stephaneyfx@gmail.com>\nenv_filter,https://github.com/rust-cli/env_logger,MIT OR Apache-2.0,The env_filter Authors\nenv_logger,https://github.com/rust-cli/env_logger,MIT OR Apache-2.0,The env_logger Authors\nequivalent,https://github.com/indexmap-rs/equivalent,Apache-2.0 OR MIT,The equivalent Authors\nerased-serde,https://github.com/dtolnay/erased-serde,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nerrno,https://github.com/lambda-fairy/rust-errno,MIT OR Apache-2.0,\"Chris Wong <lambda.fairy@gmail.com>, Dan Gohman <dev@sunfishcode.online>\"\nerror-chain,https://github.com/rust-lang-nursery/error-chain,MIT OR Apache-2.0,\"Brian Anderson <banderson@mozilla.com>, Paul Colomiets <paul@colomiets.name>, Colin Kiegel <kiegel@gmx.de>, Yamakaky <yamakaky@yamaworld.fr>, Andrew Gauger <andygauge@gmail.com>\"\nfail,https://github.com/tikv/fail-rs,Apache-2.0,The TiKV Project Developers\nfastdivide,https://github.com/fulmicoton/fastdivide,zlib-acknowledgement OR MIT,Paul Masurel <paul.masurel@gmail.com>\nfastrand,https://github.com/smol-rs/fastrand,Apache-2.0 OR MIT,Stjepan Glavina <stjepang@gmail.com>\nff,https://github.com/zkcrypto/ff,MIT OR Apache-2.0,\"Sean Bowe <ewillbefull@gmail.com>, Jack Grigg <thestr4d@gmail.com>\"\nfind-msvc-tools,https://github.com/rust-lang/cc-rs,MIT OR Apache-2.0,The find-msvc-tools Authors\nfixedbitset,https://github.com/petgraph/fixedbitset,MIT OR Apache-2.0,bluss\nflate2,https://github.com/rust-lang/flate2-rs,MIT OR Apache-2.0,\"Alex Crichton <alex@alexcrichton.com>, Josh Triplett <josh@joshtriplett.org>\"\nfloat-cmp,https://github.com/mikedilger/float-cmp,MIT,Mike Dilger <mike@mikedilger.com>\nflume,https://github.com/zesterer/flume,Apache-2.0 OR MIT,Joshua Barretto 
<joshua.s.barretto@gmail.com>\nfnv,https://github.com/servo/rust-fnv,Apache-2.0  OR  MIT,Alex Crichton <alex@alexcrichton.com>\nfoldhash,https://github.com/orlp/foldhash,Zlib,Orson Peters <orsonpeters@gmail.com>\nform_urlencoded,https://github.com/servo/rust-url,MIT OR Apache-2.0,The rust-url developers\nfragile,https://github.com/mitsuhiko/fragile,Apache-2.0,Armin Ronacher <armin.ronacher@active-4.com>\nfs4,https://github.com/al8n/fs4-rs,MIT OR Apache-2.0,\"Dan Burkert <dan@danburkert.com>, Al Liu <scygliu1@gmail.com>\"\nfslock,https://github.com/brunoczim/fslock,MIT,The fslock Authors\nfutures,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures Authors\nfutures-channel,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-channel Authors\nfutures-core,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-core Authors\nfutures-executor,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-executor Authors\nfutures-io,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-io Authors\nfutures-macro,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-macro Authors\nfutures-sink,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-sink Authors\nfutures-task,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-task Authors\nfutures-timer,https://github.com/async-rs/futures-timer,MIT OR Apache-2.0,Alex Crichton <alex@alexcrichton.com>\nfutures-util,https://github.com/rust-lang/futures-rs,MIT OR Apache-2.0,The futures-util Authors\ngeneric-array,https://github.com/fizyk20/generic-array,MIT,\"Bartłomiej Kamiński <fizyk20@gmail.com>, Aaron Trent <novacrazy@gmail.com>\"\ngetrandom,https://github.com/rust-random/getrandom,MIT OR Apache-2.0,The Rand Project Developers\nglob,https://github.com/rust-lang/glob,MIT OR Apache-2.0,The Rust Project Developers\ngovernor,https://github.com/boinkor-net/governor,MIT,Andreas Fuchs 
<asf@boinkor.net>\ngroup,https://github.com/zkcrypto/group,MIT OR Apache-2.0,\"Sean Bowe <ewillbefull@gmail.com>, Jack Grigg <jack@z.cash>\"\nh2,https://github.com/hyperium/h2,MIT,\"Carl Lerche <me@carllerche.com>, Sean McArthur <sean@seanmonstar.com>\"\nhalf,https://github.com/VoidStarKat/half-rs,MIT OR Apache-2.0,Kathryn Long <squeeself@gmail.com>\nhashbrown,https://github.com/rust-lang/hashbrown,MIT OR Apache-2.0,Amanieu d'Antras <amanieu@gmail.com>\nheaders,https://github.com/hyperium/headers,MIT,Sean McArthur <sean@seanmonstar.com>\nheaders-core,https://github.com/hyperium/headers,MIT,Sean McArthur <sean@seanmonstar.com>\nheck,https://github.com/withoutboats/heck,MIT OR Apache-2.0,The heck Authors\nheck,https://github.com/withoutboats/heck,MIT OR Apache-2.0,Without Boats <woboats@gmail.com>\nhermit-abi,https://github.com/hermit-os/hermit-rs,MIT OR Apache-2.0,Stefan Lankes\nhex,https://github.com/KokaKiwi/rust-hex,MIT OR Apache-2.0,KokaKiwi <kokakiwi@kokakiwi.net>\nhmac,https://github.com/RustCrypto/MACs,MIT OR Apache-2.0,RustCrypto Developers\nhome,https://github.com/rust-lang/cargo,MIT OR Apache-2.0,Brian Anderson <andersrb@gmail.com>\nhostname,https://github.com/djc/hostname,MIT,The hostname Authors\nhtmlescape,https://github.com/veddan/rust-htmlescape,Apache-2.0  OR  MIT  OR  MPL-2.0,Viktor Dahl <pazaconyoman@gmail.com>\nhttp,https://github.com/hyperium/http,MIT OR Apache-2.0,\"Alex Crichton <alex@alexcrichton.com>, Carl Lerche <me@carllerche.com>, Sean McArthur <sean@seanmonstar.com>\"\nhttp-body,https://github.com/hyperium/http-body,MIT,\"Carl Lerche <me@carllerche.com>, Lucio Franco <luciofranco14@gmail.com>, Sean McArthur <sean@seanmonstar.com>\"\nhttp-body-util,https://github.com/hyperium/http-body,MIT,\"Carl Lerche <me@carllerche.com>, Lucio Franco <luciofranco14@gmail.com>, Sean McArthur <sean@seanmonstar.com>\"\nhttp-serde,https://gitlab.com/kornelski/http-serde,Apache-2.0 OR MIT,Kornel 
<kornel@geekhood.net>\nhttparse,https://github.com/seanmonstar/httparse,MIT OR Apache-2.0,Sean McArthur <sean@seanmonstar.com>\nhttpdate,https://github.com/pyfisch/httpdate,MIT OR Apache-2.0,Pyfisch <pyfisch@posteo.org>\nhumantime,https://github.com/chronotope/humantime,MIT OR Apache-2.0,The humantime Authors\nhyper,https://github.com/hyperium/hyper,MIT,Sean McArthur <sean@seanmonstar.com>\nhyper-rustls,https://github.com/rustls/hyper-rustls,Apache-2.0 OR ISC OR MIT,The hyper-rustls Authors\nhyper-timeout,https://github.com/hjr3/hyper-timeout,MIT OR Apache-2.0,Herman J. Radtke III <herman@hermanradtke.com>\nhyper-util,https://github.com/hyperium/hyper-util,MIT,Sean McArthur <sean@seanmonstar.com>\nhyperloglogplus,https://github.com/tabac/hyperloglog.rs,MIT,Tasos Bakogiannis <t.bakogiannis@gmail.com>\niana-time-zone,https://github.com/strawlab/iana-time-zone,MIT OR Apache-2.0,\"Andrew Straw <strawman@astraw.com>, René Kijewski <rene.kijewski@fu-berlin.de>, Ryan Lopopolo <rjl@hyperbo.la>\"\niana-time-zone-haiku,https://github.com/strawlab/iana-time-zone,MIT OR Apache-2.0,René Kijewski <crates.io@k6i.de>\nicu_collections,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nicu_locale_core,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nicu_normalizer,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nicu_normalizer_data,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nicu_properties,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nicu_properties_data,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nicu_provider,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nident_case,https://github.com/TedDriggs/ident_case,MIT OR Apache-2.0,Ted Driggs <ted.driggs@outlook.com>\nidna,https://github.com/servo/rust-url,MIT OR Apache-2.0,The rust-url 
developers\nidna_adapter,https://github.com/hsivonen/idna_adapter,Apache-2.0 OR MIT,The rust-url developers\nindexmap,https://github.com/bluss/indexmap,Apache-2.0 OR MIT,The indexmap Authors\nindexmap,https://github.com/indexmap-rs/indexmap,Apache-2.0 OR MIT,The indexmap Authors\nindicatif,https://github.com/console-rs/indicatif,MIT,The indicatif Authors\ninventory,https://github.com/dtolnay/inventory,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nipnet,https://github.com/krisprice/ipnet,MIT OR Apache-2.0,Kris Price <kris@krisprice.nz>\nipnetwork,https://github.com/achanda/ipnetwork,MIT OR Apache-2.0,\"Abhishek Chanda <abhishek.becs@gmail.com>, Linus Färnstrand <faern@faern.net>\"\niri-string,https://github.com/lo48576/iri-string,MIT OR Apache-2.0,YOSHIOKA Takuma <nop_thread@nops.red>\nis-terminal,https://github.com/sunfishcode/is-terminal,MIT,\"softprops <d.tangren@gmail.com>, Dan Gohman <dev@sunfishcode.online>\"\nis_terminal_polyfill,https://github.com/polyfill-rs/is_terminal_polyfill,MIT OR Apache-2.0,The is_terminal_polyfill Authors\nitertools,https://github.com/rust-itertools/itertools,MIT OR Apache-2.0,bluss\nitoa,https://github.com/dtolnay/itoa,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\njobserver,https://github.com/rust-lang/jobserver-rs,MIT OR Apache-2.0,Alex Crichton <alex@alexcrichton.com>\njs-sys,https://github.com/wasm-bindgen/wasm-bindgen/tree/master/crates/js-sys,MIT OR Apache-2.0,The wasm-bindgen Developers\njson_comments,https://github.com/tmccombs/json-comments-rs,Apache-2.0,Thayne McCombs <astrothayne@gmail.com>\nlambda_runtime,https://github.com/awslabs/aws-lambda-rust-runtime,Apache-2.0,\"David Calavera <dcalaver@amazon.com>, Harold Sun <sunhua@amazon.com>\"\nlambda_runtime_api_client,https://github.com/awslabs/aws-lambda-rust-runtime,Apache-2.0,\"David Calavera <dcalaver@amazon.com>, Harold Sun <sunhua@amazon.com>\"\nlazy_static,https://github.com/rust-lang-nursery/lazy-static.rs,MIT OR Apache-2.0,Marvin Löbel 
<loebel.marvin@gmail.com>\nlevenshtein_automata,https://github.com/tantivy-search/levenshtein-automata,MIT,Paul Masurel <paul.masurel@gmail.com>\nlibc,https://github.com/rust-lang/libc,MIT OR Apache-2.0,The Rust Project Developers\nlibm,https://github.com/rust-lang/compiler-builtins,MIT,Jorge Aparicio <jorge@japaric.io>\nlinked-hash-map,https://github.com/contain-rs/linked-hash-map,MIT OR Apache-2.0,\"Stepan Koltsov <stepan.koltsov@gmail.com>, Andrew Paseltiner <apaseltiner@gmail.com>\"\nlinux-raw-sys,https://github.com/sunfishcode/linux-raw-sys,Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT,Dan Gohman <dev@sunfishcode.online>\nlitemap,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nlock_api,https://github.com/Amanieu/parking_lot,MIT OR Apache-2.0,Amanieu d'Antras <amanieu@gmail.com>\nlog,https://github.com/rust-lang/log,MIT OR Apache-2.0,The Rust Project Developers\nlru,https://github.com/jeromefroe/lru-rs,MIT,Jerome Froelich <jeromefroelic@hotmail.com>\nlru-slab,https://github.com/Ralith/lru-slab,MIT OR Apache-2.0 OR Zlib,Benjamin Saunders <ben.e.saunders@gmail.com>\nlz4_flex,https://github.com/pseitz/lz4_flex,MIT,\"Pascal Seitz <pascal.seitz@gmail.com>, Arthur Silva <arthurprs@gmail.com>, ticki <Ticki@users.noreply.github.com>\"\nmatchers,https://github.com/hawkw/matchers,MIT,Eliza Weisman <eliza@buoyant.io>\nmatchit,https://github.com/ibraheemdev/matchit,MIT AND BSD-3-Clause,Ibraheem Ahmed <ibraheem@ibraheem.ca>\nmd-5,https://github.com/RustCrypto/hashes,MIT OR Apache-2.0,RustCrypto Developers\nmd5,https://github.com/stainless-steel/md5,Apache-2.0 OR MIT,\"Ivan Ukhov <ivan.ukhov@gmail.com>, Kamal Ahmad <shibe@openmailbox.org>, Konstantin Stepanov <milezv@gmail.com>, Lukas Kalbertodt <lukas.kalbertodt@gmail.com>, Nathan Musoke <nathan.musoke@gmail.com>, Scott Mabin <scott@mabez.dev>, Tony Arcieri <bascule@gmail.com>, Wim de With <register@dewith.io>, Yosef Dinerstein 
<yosefdi@gmail.com>\"\nmeasure_time,https://github.com/PSeitz/rust_measure_time,MIT,Pascal Seitz <pascal.seitz@gmail.com>\nmemchr,https://github.com/BurntSushi/memchr,Unlicense OR MIT,\"Andrew Gallant <jamslam@gmail.com>, bluss\"\nmemmap2,https://github.com/RazrFalcon/memmap2-rs,MIT OR Apache-2.0,\"Dan Burkert <dan@danburkert.com>, Yevhenii Reizner <razrfalcon@gmail.com>\"\nmime,https://github.com/hyperium/mime,MIT OR Apache-2.0,Sean McArthur <sean@seanmonstar.com>\nmime_guess,https://github.com/abonander/mime_guess,MIT,Austin Bonander <austin.bonander@gmail.com>\nmini-internal,https://github.com/dtolnay/miniserde,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nmini-moka,https://github.com/moka-rs/mini-moka,MIT OR Apache-2.0,The mini-moka Authors\nminimal-lexical,https://github.com/Alexhuszagh/minimal-lexical,MIT OR Apache-2.0,Alex Huszagh <ahuszagh@gmail.com>\nminiserde,https://github.com/dtolnay/miniserde,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nminiz_oxide,https://github.com/Frommi/miniz_oxide/tree/master/miniz_oxide,MIT OR Zlib OR Apache-2.0,\"Frommi <daniil.liferenko@gmail.com>, oyvindln <oyvindln@users.noreply.github.com>, Rich Geldreich richgel99@gmail.com\"\nmio,https://github.com/tokio-rs/mio,MIT,\"Carl Lerche <me@carllerche.com>, Thomas de Zeeuw <thomasdezeeuw@gmail.com>, Tokio Contributors <team@tokio.rs>\"\nmockall,https://github.com/asomers/mockall,MIT OR Apache-2.0,Alan Somers <asomers@gmail.com>\nmockall_derive,https://github.com/asomers/mockall,MIT OR Apache-2.0,Alan Somers <asomers@gmail.com>\nmrecordlog,https://github.com/quickwit-oss/mrecordlog,MIT,The mrecordlog Authors\nmultimap,https://github.com/havarnov/multimap,MIT OR Apache-2.0,Håvar Nøvik <havar.novik@gmail.com>\nmurmurhash32,https://github.com/quickwit-inc/murmurhash32,MIT,Paul Masurel <paul.masurel@gmail.com>\nnew_string_template,https://github.com/hasezoey/new_string_template,MIT,hasezoey <hasezoey@gmail.com>\nno-std-net,https://github.com/dunmatt/no-std-net,MIT,M@ 
Dunlap <mattdunlap@gmail.com>\nnom,https://github.com/Geal/nom,MIT,contact@geoffroycouprie.com\nnom,https://github.com/rust-bakery/nom,MIT,contact@geoffroycouprie.com\nnonzero_ext,https://github.com/antifuchs/nonzero_ext,Apache-2.0,Andreas Fuchs <asf@boinkor.net>\nnormalize-line-endings,https://github.com/derekdreery/normalize-line-endings,Apache-2.0,Richard Dodd <richdodj@gmail.com>\nnu-ansi-term,https://github.com/nushell/nu-ansi-term,MIT,\"ogham@bsago.me, Ryan Scheel (Havvy) <ryan.havvy@gmail.com>, Josh Triplett <josh@joshtriplett.org>, The Nushell Project Developers\"\nnum-bigint,https://github.com/rust-num/num-bigint,MIT OR Apache-2.0,The Rust Project Developers\nnum-conv,https://github.com/jhpratt/num-conv,MIT OR Apache-2.0,Jacob Pratt <jacob@jhpratt.dev>\nnum-integer,https://github.com/rust-num/num-integer,MIT OR Apache-2.0,The Rust Project Developers\nnum-rational,https://github.com/rust-num/num-rational,MIT OR Apache-2.0,The Rust Project Developers\nnum-traits,https://github.com/rust-num/num-traits,MIT OR Apache-2.0,The Rust Project Developers\nnum_cpus,https://github.com/seanmonstar/num_cpus,MIT OR Apache-2.0,Sean McArthur <sean@seanmonstar.com>\nnumfmt,https://github.com/kurtlawrence/numfmt,MIT,Kurt Lawrence <kurtlawrence.info>\nobjc2-core-foundation,https://github.com/madsmtm/objc2,Zlib OR Apache-2.0 OR MIT,The objc2-core-foundation Authors\nobjc2-io-kit,https://github.com/madsmtm/objc2,Zlib OR Apache-2.0 OR MIT,The objc2-io-kit Authors\nonce_cell,https://github.com/matklad/once_cell,MIT OR Apache-2.0,Aleksey Kladov <aleksey.kladov@gmail.com>\nonce_cell_polyfill,https://github.com/polyfill-rs/once_cell_polyfill,MIT OR Apache-2.0,The once_cell_polyfill Authors\noneshot,https://github.com/faern/oneshot,MIT OR Apache-2.0,Linus Färnstrand <faern@faern.net>\noorandom,https://hg.sr.ht/~icefox/oorandom,MIT,Simon Heath <icefox@dreamquest.io>\nopenssl-probe,https://github.com/alexcrichton/openssl-probe,MIT OR Apache-2.0,Alex Crichton 
<alex@alexcrichton.com>\nopentelemetry,https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry,Apache-2.0,The opentelemetry Authors\nopentelemetry-appender-tracing,https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-appender-tracing,Apache-2.0,The opentelemetry-appender-tracing Authors\nopentelemetry-http,https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-http,Apache-2.0,The opentelemetry-http Authors\nopentelemetry-otlp,https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-otlp,Apache-2.0,The opentelemetry-otlp Authors\nopentelemetry-proto,https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-proto,Apache-2.0,The opentelemetry-proto Authors\nopentelemetry_sdk,https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-sdk,Apache-2.0,The opentelemetry_sdk Authors\nordered-float,https://github.com/reem/rust-ordered-float,MIT,\"Jonathan Reem <jonathan.reem@gmail.com>, Matt Brubeck <mbrubeck@limpet.net>\"\nouroboros,https://github.com/someguynamedjosh/ouroboros,MIT OR Apache-2.0,Josh <someguynamedjosh@github.com>\nouroboros_macro,https://github.com/someguynamedjosh/ouroboros,MIT OR Apache-2.0,Josh <someguynamedjosh@github.com>\noutref,https://github.com/Nugine/outref,MIT,The outref Authors\nownedbytes,https://github.com/quickwit-oss/tantivy,MIT,\"Paul Masurel <paul@quickwit.io>, Pascal Seitz <pascal@quickwit.io>\"\np256,https://github.com/RustCrypto/elliptic-curves/tree/master/p256,Apache-2.0 OR MIT,RustCrypto Developers\npage_size,https://github.com/Elzair/page_size_rs,MIT OR Apache-2.0,Philip Woods <elzairthesorcerer@gmail.com>\npapergrid,https://github.com/zhiburt/tabled,MIT,Maxim Zhiburt <zhiburt@gmail.com>\nparking_lot,https://github.com/Amanieu/parking_lot,MIT OR Apache-2.0,Amanieu d'Antras <amanieu@gmail.com>\nparking_lot_core,https://github.com/Amanieu/parking_lot,MIT OR Apache-2.0,Amanieu d'Antras 
<amanieu@gmail.com>\npeakmem-alloc,https://github.com/PSeitz/peakmem-alloc,MIT,Pascal Seitz <pascal.seitz@gmail.com>\npercent-encoding,https://github.com/servo/rust-url,MIT OR Apache-2.0,The rust-url developers\nperf-event,https://github.com/jimblandy/perf-event,MIT OR Apache-2.0,Jim Blandy <jimb@red-bean.com>\nperf-event-open-sys,https://github.com/jimblandy/perf-event-open-sys,MIT OR Apache-2.0,Jim Blandy <jimb@red-bean.com>\npetgraph,https://github.com/petgraph/petgraph,MIT OR Apache-2.0,\"bluss, mitchmindtree\"\npin-project,https://github.com/taiki-e/pin-project,Apache-2.0 OR MIT,The pin-project Authors\npin-project-internal,https://github.com/taiki-e/pin-project,Apache-2.0 OR MIT,The pin-project-internal Authors\npin-project-lite,https://github.com/taiki-e/pin-project-lite,Apache-2.0 OR MIT,The pin-project-lite Authors\npin-utils,https://github.com/rust-lang-nursery/pin-utils,MIT OR Apache-2.0,Josef Brandl <mail@josefbrandl.de>\npkcs8,https://github.com/RustCrypto/formats/tree/master/pkcs8,Apache-2.0 OR MIT,RustCrypto Developers\nplotters,https://github.com/plotters-rs/plotters,MIT,Hao Hou <haohou302@gmail.com>\nplotters-backend,https://github.com/plotters-rs/plotters,MIT,Hao Hou <haohou302@gmail.com>\nplotters-svg,https://github.com/plotters-rs/plotters,MIT,Hao Hou <haohou302@gmail.com>\npnet,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,Robert Clipsham <robert@octarineparrot.com>\npnet_base,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,\"Robert Clipsham <robert@octarineparrot.com>, Linus Färnstrand <faern@faern.net>\"\npnet_datalink,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,\"Robert Clipsham <robert@octarineparrot.com>, Linus Färnstrand <faern@faern.net>\"\npnet_macros,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,\"Robert Clipsham <robert@octarineparrot.com>, Pierre Chifflier <chifflier@wzdftpd.net>\"\npnet_macros_support,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,Robert Clipsham 
<robert@octarineparrot.com>\npnet_packet,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,Robert Clipsham <robert@octarineparrot.com>\npnet_sys,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,\"Robert Clipsham <robert@octarineparrot.com>, Linus Färnstrand <faern@faern.net>\"\npnet_transport,https://github.com/libpnet/libpnet,MIT OR Apache-2.0,Robert Clipsham <robert@octarineparrot.com>\nportable-atomic,https://github.com/taiki-e/portable-atomic,Apache-2.0 OR MIT,The portable-atomic Authors\npostcard,https://github.com/jamesmunns/postcard,MIT OR Apache-2.0,James Munns <james@onevariable.com>\npotential_utf,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\npowerfmt,https://github.com/jhpratt/powerfmt,MIT OR Apache-2.0,Jacob Pratt <jacob@jhpratt.dev>\nppv-lite86,https://github.com/cryptocorrosion/cryptocorrosion,MIT OR Apache-2.0,The CryptoCorrosion Contributors\npredicates,https://github.com/assert-rs/predicates-rs,MIT OR Apache-2.0,Nick Stevens <nick@bitcurry.com>\npredicates-core,https://github.com/assert-rs/predicates-rs/tree/master/crates/core,MIT OR Apache-2.0,Nick Stevens <nick@bitcurry.com>\npredicates-tree,https://github.com/assert-rs/predicates-rs/tree/master/crates/tree,MIT OR Apache-2.0,Nick Stevens <nick@bitcurry.com>\npretty_assertions,https://github.com/rust-pretty-assertions/rust-pretty-assertions,MIT OR Apache-2.0,\"Colin Kiegel <kiegel@gmx.de>, Florent Fayolle <florent.fayolle69@gmail.com>, Tom Milligan <code@tommilligan.net>\"\nprettyplease,https://github.com/dtolnay/prettyplease,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nproc-macro-error,https://gitlab.com/CreepySkeleton/proc-macro-error,MIT OR Apache-2.0,CreepySkeleton <creepy-skeleton@yandex.ru>\nproc-macro-error-attr,https://gitlab.com/CreepySkeleton/proc-macro-error,MIT OR Apache-2.0,CreepySkeleton <creepy-skeleton@yandex.ru>\nproc-macro-error-attr2,https://github.com/GnomedDev/proc-macro-error-2,MIT OR Apache-2.0,\"CreepySkeleton 
<creepy-skeleton@yandex.ru>, GnomedDev <david2005thomas@gmail.com>\"\nproc-macro-error2,https://github.com/GnomedDev/proc-macro-error-2,MIT OR Apache-2.0,\"CreepySkeleton <creepy-skeleton@yandex.ru>, GnomedDev <david2005thomas@gmail.com>\"\nproc-macro2,https://github.com/dtolnay/proc-macro2,MIT OR Apache-2.0,\"David Tolnay <dtolnay@gmail.com>, Alex Crichton <alex@alexcrichton.com>\"\nproc-macro2-diagnostics,https://github.com/SergioBenitez/proc-macro2-diagnostics,MIT OR Apache-2.0,Sergio Benitez <sb@sergio.bz>\nprocfs,https://github.com/eminence/procfs,MIT OR Apache-2.0,Andrew Chin <achin@eminence32.net>\nprocfs-core,https://github.com/eminence/procfs,MIT OR Apache-2.0,Andrew Chin <achin@eminence32.net>\nprometheus,https://github.com/tikv/rust-prometheus,Apache-2.0,\"overvenus@gmail.com, siddontang@gmail.com, vistaswx@gmail.com\"\nprost,https://github.com/tokio-rs/prost,Apache-2.0,\"Dan Burkert <dan@danburkert.com>, Lucio Franco <luciofranco14@gmail.com>, Casper Meijn <casper@meijn.net>, Tokio Contributors <team@tokio.rs>\"\nprost-build,https://github.com/tokio-rs/prost,Apache-2.0,\"Dan Burkert <dan@danburkert.com>, Lucio Franco <luciofranco14@gmail.com>, Casper Meijn <casper@meijn.net>, Tokio Contributors <team@tokio.rs>\"\nprost-derive,https://github.com/tokio-rs/prost,Apache-2.0,\"Dan Burkert <dan@danburkert.com>, Lucio Franco <luciofranco14@gmail.com>, Casper Meijn <casper@meijn.net>, Tokio Contributors <team@tokio.rs>\"\nprost-types,https://github.com/tokio-rs/prost,Apache-2.0,\"Dan Burkert <dan@danburkert.com>, Lucio Franco <luciofranco14@gmail.com>, Casper Meijn <casper@meijn.net>, Tokio Contributors <team@tokio.rs>\"\npulldown-cmark,https://github.com/raphlinus/pulldown-cmark,MIT,\"Raph Levien <raph.levien@gmail.com>, Marcus Klaas de Vries <mail@marcusklaas.nl>\"\npulldown-cmark-to-cmark,https://github.com/Byron/pulldown-cmark-to-cmark,Apache-2.0,\"Sebastian Thiel <byronimo@gmail.com>, Dylan Owen <dyltotheo@gmail.com>, Alessandro Ogier 
<alessandro.ogier@gmail.com>, Zixian Cai <2891235+caizixian@users.noreply.github.com>, Andrew Lyjak <andrew.lyjak@gmail.com>\"\nquanta,https://github.com/metrics-rs/quanta,MIT,Toby Lawrence <toby@nuclearfurnace.com>\nquick-error,http://github.com/tailhook/quick-error,MIT OR Apache-2.0,\"Paul Colomiets <paul@colomiets.name>, Colin Kiegel <kiegel@gmx.de>\"\nquick_cache,https://github.com/arthurprs/quick-cache,MIT,Arthur Silva <arthurprs@gmail.com>\nquinn,https://github.com/quinn-rs/quinn,MIT OR Apache-2.0,The quinn Authors\nquinn-proto,https://github.com/quinn-rs/quinn,MIT OR Apache-2.0,The quinn-proto Authors\nquinn-udp,https://github.com/quinn-rs/quinn,MIT OR Apache-2.0,The quinn-udp Authors\nquote,https://github.com/dtolnay/quote,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nr-efi,https://github.com/r-efi/r-efi,MIT OR Apache-2.0 OR LGPL-2.1-or-later,The r-efi Authors\nrand,https://github.com/rust-random/rand,MIT OR Apache-2.0,\"The Rand Project Developers, The Rust Project Developers\"\nrand_chacha,https://github.com/rust-random/rand,MIT OR Apache-2.0,\"The Rand Project Developers, The Rust Project Developers, The CryptoCorrosion Contributors\"\nrand_core,https://github.com/rust-random/rand,MIT OR Apache-2.0,\"The Rand Project Developers, The Rust Project Developers\"\nrand_xorshift,https://github.com/rust-random/rngs,MIT OR Apache-2.0,\"The Rand Project Developers, The Rust Project Developers\"\nraw-cpuid,https://github.com/gz/rust-cpuid,MIT,Gerd Zellweger <mail@gerdzellweger.com>\nrayon,https://github.com/rayon-rs/rayon,MIT OR Apache-2.0,The rayon Authors\nrayon-core,https://github.com/rayon-rs/rayon,MIT OR Apache-2.0,The rayon-core Authors\nredox_syscall,https://gitlab.redox-os.org/redox-os/syscall,MIT,Jeremy Soller <jackpot51@gmail.com>\nref-cast,https://github.com/dtolnay/ref-cast,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nref-cast-impl,https://github.com/dtolnay/ref-cast,MIT OR Apache-2.0,David Tolnay 
<dtolnay@gmail.com>\nregex,https://github.com/rust-lang/regex,MIT OR Apache-2.0,\"The Rust Project Developers, Andrew Gallant <jamslam@gmail.com>\"\nregex-automata,https://github.com/rust-lang/regex,MIT OR Apache-2.0,\"The Rust Project Developers, Andrew Gallant <jamslam@gmail.com>\"\nregex-lite,https://github.com/rust-lang/regex,MIT OR Apache-2.0,\"The Rust Project Developers, Andrew Gallant <jamslam@gmail.com>\"\nregex-syntax,https://github.com/rust-lang/regex,MIT OR Apache-2.0,\"The Rust Project Developers, Andrew Gallant <jamslam@gmail.com>\"\nreqwest,https://github.com/seanmonstar/reqwest,MIT OR Apache-2.0,Sean McArthur <sean@seanmonstar.com>\nreqwest-middleware,https://github.com/TrueLayer/reqwest-middleware,MIT OR Apache-2.0,Rodrigo Gryzinski <rodrigo.gryzinski@truelayer.com>\nreqwest-retry,https://github.com/TrueLayer/reqwest-middleware,MIT OR Apache-2.0,Rodrigo Gryzinski <rodrigo.gryzinski@truelayer.com>\nretry-policies,https://github.com/TrueLayer/retry-policies,MIT OR Apache-2.0,Luca Palmieri <lpalmieri@truelayer.com>\nrfc6979,https://github.com/RustCrypto/signatures/tree/master/rfc6979,Apache-2.0 OR MIT,RustCrypto Developers\nring,https://github.com/briansmith/ring,Apache-2.0 AND ISC,The ring Authors\nroxmltree,https://github.com/RazrFalcon/roxmltree,MIT OR Apache-2.0,Evgeniy Reizner <razrfalcon@gmail.com>\nrust-embed,https://pyrossh.dev/repos/rust-embed,MIT,pyrossh\nrust-embed-impl,https://pyrossh.dev/repos/rust-embed,MIT,pyrossh\nrust-embed-utils,https://pyrossh.dev/repos/rust-embed,MIT,pyrossh\nrustc-hash,https://github.com/rust-lang/rustc-hash,Apache-2.0 OR MIT,The Rust Project Developers\nrustix,https://github.com/bytecodealliance/rustix,Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT,\"Dan Gohman <dev@sunfishcode.online>, Jakub Konka <kubkon@jakubkonka.com>\"\nrustls,https://github.com/rustls/rustls,Apache-2.0 OR ISC OR MIT,The rustls Authors\nrustls-native-certs,https://github.com/rustls/rustls-native-certs,Apache-2.0 OR ISC OR MIT,The 
rustls-native-certs Authors\nrustls-pemfile,https://github.com/rustls/pemfile,Apache-2.0 OR ISC OR MIT,The rustls-pemfile Authors\nrustls-pki-types,https://github.com/rustls/pki-types,MIT OR Apache-2.0,The rustls-pki-types Authors\nrustls-webpki,https://github.com/rustls/webpki,ISC,The rustls-webpki Authors\nrustop,https://chiselapp.com/user/fifr/repository/rustop,MIT,Frank Fischer <frank-fischer@shadow-soft.de>\nrustversion,https://github.com/dtolnay/rustversion,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nrusty-fork,https://github.com/altsysrq/rusty-fork,MIT OR Apache-2.0,Jason Lingle\nryu,https://github.com/dtolnay/ryu,Apache-2.0 OR BSL-1.0,David Tolnay <dtolnay@gmail.com>\nsame-file,https://github.com/BurntSushi/same-file,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\nscc,https://github.com/wvwwvwwv/scalable-concurrent-containers,Apache-2.0,wvwwvwwv <wvwwvwwv@me.com>\nschannel,https://github.com/steffengy/schannel-rs,MIT,\"Steven Fackler <sfackler@gmail.com>, Steffen Butzer <steffen.butzer@outlook.com>\"\nschemars,https://github.com/GREsau/schemars,MIT,Graham Esau <gesau@hotmail.co.uk>\nscoped-tls,https://github.com/alexcrichton/scoped-tls,MIT OR Apache-2.0,Alex Crichton <alex@alexcrichton.com>\nscopeguard,https://github.com/bluss/scopeguard,MIT OR Apache-2.0,bluss\nsct,https://github.com/rustls/sct.rs,Apache-2.0 OR ISC OR MIT,Joseph Birr-Pixton <jpixton@gmail.com>\nsdd,https://github.com/wvwwvwwv/scalable-delayed-dealloc,Apache-2.0,wvwwvwwv <wvwwvwwv@me.com>\nsec1,https://github.com/RustCrypto/formats/tree/master/sec1,Apache-2.0 OR MIT,RustCrypto Developers\nsecurity-framework,https://github.com/kornelski/rust-security-framework,MIT OR Apache-2.0,\"Steven Fackler <sfackler@gmail.com>, Kornel <kornel@geekhood.net>\"\nsecurity-framework-sys,https://github.com/kornelski/rust-security-framework,MIT OR Apache-2.0,\"Steven Fackler <sfackler@gmail.com>, Kornel <kornel@geekhood.net>\"\nsemver,https://github.com/dtolnay/semver,MIT OR Apache-2.0,David 
Tolnay <dtolnay@gmail.com>\nseparator,https://github.com/saghm/rust-separator,MIT,Saghm Rossi <saghmrossi@gmail.com>\nserde,https://github.com/serde-rs/serde,MIT OR Apache-2.0,\"Erick Tryzelaar <erick.tryzelaar@gmail.com>, David Tolnay <dtolnay@gmail.com>\"\nserde_core,https://github.com/serde-rs/serde,MIT OR Apache-2.0,\"Erick Tryzelaar <erick.tryzelaar@gmail.com>, David Tolnay <dtolnay@gmail.com>\"\nserde_derive,https://github.com/serde-rs/serde,MIT OR Apache-2.0,\"Erick Tryzelaar <erick.tryzelaar@gmail.com>, David Tolnay <dtolnay@gmail.com>\"\nserde_json,https://github.com/serde-rs/json,MIT OR Apache-2.0,\"Erick Tryzelaar <erick.tryzelaar@gmail.com>, David Tolnay <dtolnay@gmail.com>\"\nserde_json_borrow,https://github.com/PSeitz/serde_json_borrow,MIT,Pascal Seitz <pascal.seitz@gmail.com>\nserde_path_to_error,https://github.com/dtolnay/path-to-error,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nserde_qs,https://github.com/samscott89/serde_qs,MIT OR Apache-2.0,Sam Scott <sam@osohq.com>\nserde_spanned,https://github.com/toml-rs/toml,MIT OR Apache-2.0,The serde_spanned Authors\nserde_urlencoded,https://github.com/nox/serde_urlencoded,MIT OR Apache-2.0,Anthony Ramine <n.oxyde@gmail.com>\nserde_with,https://github.com/jonasbb/serde_with,MIT OR Apache-2.0,\"Jonas Bushart, Marcin Kaźmierczak\"\nserde_with_macros,https://github.com/jonasbb/serde_with,MIT OR Apache-2.0,Jonas Bushart\nserde_yaml,https://github.com/dtolnay/serde-yaml,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nserial_test_derive,https://github.com/palfrey/serial_test,MIT,Tom Parker-Shemilt <palfrey@tevp.net>\nsha1,https://github.com/RustCrypto/hashes,MIT OR Apache-2.0,RustCrypto Developers\nsha2,https://github.com/RustCrypto/hashes,MIT OR Apache-2.0,RustCrypto Developers\nsharded-slab,https://github.com/hawkw/sharded-slab,MIT,Eliza Weisman <eliza@buoyant.io>\nshell-words,https://github.com/tmiasko/shell-words,MIT OR Apache-2.0,Tomasz Miąsko 
<tomasz.miasko@gmail.com>\nshlex,https://github.com/comex/rust-shlex,MIT OR Apache-2.0,\"comex <comexk@gmail.com>, Fenhl <fenhl@fenhl.net>, Adrian Taylor <adetaylor@chromium.org>, Alex Touchet <alextouchet@outlook.com>, Daniel Parks <dp+git@oxidized.org>, Garrett Berg <googberg@gmail.com>\"\nsignal-hook-registry,https://github.com/vorner/signal-hook,MIT OR Apache-2.0,\"Michal 'vorner' Vaner <vorner@vorner.cz>, Masaki Hara <ackie.h.gmai@gmail.com>\"\nsignature,https://github.com/RustCrypto/traits/tree/master/signature,Apache-2.0 OR MIT,RustCrypto Developers\nsimd-adler32,https://github.com/mcountryman/simd-adler32,MIT,Marvin Countryman <me@maar.vin>\nsiphasher,https://github.com/jedisct1/rust-siphash,MIT OR Apache-2.0,Frank Denis <github@pureftpd.org>\nsketches-ddsketch,https://github.com/mheffner/rust-sketches-ddsketch,Apache-2.0,Mike Heffner <mikeh@fesnel.com>\nslab,https://github.com/tokio-rs/slab,MIT,Carl Lerche <me@carllerche.com>\nsmallvec,https://github.com/servo/rust-smallvec,MIT OR Apache-2.0,The Servo Project Developers\nsocket2,https://github.com/rust-lang/socket2,MIT OR Apache-2.0,\"Alex Crichton <alex@alexcrichton.com>, Thomas de Zeeuw <thomasdezeeuw@gmail.com>\"\nspin,https://github.com/mvdnes/spin-rs,MIT,\"Mathijs van de Nes <git@mathijs.vd-nes.nl>, John Ericson <git@JohnEricson.me>, Joshua Barretto <joshua.s.barretto@gmail.com>\"\nspinning_top,https://github.com/rust-osdev/spinning_top,MIT OR Apache-2.0,Philipp Oppermann <dev@phil-opp.com>\nspki,https://github.com/RustCrypto/formats/tree/master/spki,Apache-2.0 OR MIT,RustCrypto Developers\nstable_deref_trait,https://github.com/storyyeller/stable_deref_trait,MIT OR Apache-2.0,Robert Grosse <n210241048576@gmail.com>\nstatic_assertions,https://github.com/nvzqz/static-assertions-rs,MIT OR Apache-2.0,Nikolai Vazquez\nstrsim,https://github.com/rapidfuzz/strsim-rs,MIT,\"Danny Guo <danny@dannyguo.com>, maxbachmann <oss@maxbachmann.de>\"\nsubtle,https://github.com/dalek-cryptography/subtle,BSD-3-Clause,\"Isis 
Lovecruft <isis@patternsinthevoid.net>, Henry de Valence <hdevalence@hdevalence.ca>\"\nsyn,https://github.com/dtolnay/syn,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nsync_wrapper,https://github.com/Actyx/sync_wrapper,Apache-2.0,Actyx AG <developer@actyx.io>\nsynstructure,https://github.com/mystor/synstructure,MIT,Nika Layzell <nika@thelayzells.com>\nsysinfo,https://github.com/GuillaumeGomez/sysinfo,MIT,Guillaume Gomez <guillaume1.gomez@gmail.com>\ntabled,https://github.com/zhiburt/tabled,MIT,Maxim Zhiburt <zhiburt@gmail.com>\ntabled_derive,https://github.com/zhiburt/tabled,MIT,Maxim Zhiburt <zhiburt@gmail.com>\ntagptr,https://github.com/oliver-giersch/tagptr,MIT OR Apache-2.0,Oliver Giersch\ntantivy,https://github.com/quickwit-oss/tantivy,MIT,Paul Masurel <paul.masurel@gmail.com>\ntantivy-bitpacker,https://github.com/quickwit-oss/tantivy,MIT,Paul Masurel <paul.masurel@gmail.com>\ntantivy-columnar,https://github.com/quickwit-oss/tantivy,MIT,The tantivy-columnar Authors\ntantivy-common,https://github.com/quickwit-oss/tantivy,MIT,\"Paul Masurel <paul@quickwit.io>, Pascal Seitz <pascal@quickwit.io>\"\ntantivy-fst,https://github.com/quickwit-inc/fst,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\ntantivy-query-grammar,https://github.com/quickwit-oss/tantivy,MIT,Paul Masurel <paul.masurel@gmail.com>\ntantivy-sstable,https://github.com/quickwit-oss/tantivy,MIT,The tantivy-sstable Authors\ntantivy-stacker,https://github.com/quickwit-oss/tantivy,MIT,The tantivy-stacker Authors\ntantivy-tokenizer-api,https://github.com/quickwit-oss/tantivy,MIT,The tantivy-tokenizer-api Authors\ntempfile,https://github.com/Stebalien/tempfile,MIT OR Apache-2.0,\"Steven Allen <steven@stebalien.com>, The Rust Project Developers, Ashley Mannix <ashleymannix@live.com.au>, Jason White <me@jasonwhite.io>\"\ntermtree,https://github.com/rust-cli/termtree,MIT,The termtree Authors\ntesting_table,https://github.com/zhiburt/tabled,MIT,Maxim Zhiburt 
<zhiburt@gmail.com>\nthiserror,https://github.com/dtolnay/thiserror,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nthiserror-impl,https://github.com/dtolnay/thiserror,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nthousands,https://github.com/tov/thousands-rs,MIT OR Apache-2.0,Jesse A. Tov <jesse.tov@gmail.com>\nthread_local,https://github.com/Amanieu/thread_local-rs,MIT OR Apache-2.0,Amanieu d'Antras <amanieu@gmail.com>\ntime,https://github.com/time-rs/time,MIT OR Apache-2.0,\"Jacob Pratt <open-source@jhpratt.dev>, Time contributors\"\ntime-core,https://github.com/time-rs/time,MIT OR Apache-2.0,\"Jacob Pratt <open-source@jhpratt.dev>, Time contributors\"\ntime-fmt,https://github.com/MiSawa/time-fmt,MIT OR Apache-2.0,mi_sawa <mi.sawa.1216+git@gmail.com>\ntime-macros,https://github.com/time-rs/time,MIT OR Apache-2.0,\"Jacob Pratt <open-source@jhpratt.dev>, Time contributors\"\ntinystr,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\ntinytemplate,https://github.com/bheisler/TinyTemplate,Apache-2.0 OR MIT,Brook Heisler <brookheisler@gmail.com>\ntinyvec,https://github.com/Lokathor/tinyvec,Zlib OR Apache-2.0 OR MIT,Lokathor <zefria@gmail.com>\ntinyvec_macros,https://github.com/Soveu/tinyvec_macros,MIT OR Apache-2.0 OR Zlib,Soveu <marx.tomasz@gmail.com>\ntokio,https://github.com/tokio-rs/tokio,MIT,Tokio Contributors <team@tokio.rs>\ntokio-macros,https://github.com/tokio-rs/tokio,MIT,Tokio Contributors <team@tokio.rs>\ntokio-metrics,https://github.com/tokio-rs/tokio-metrics,MIT,Tokio Contributors <team@tokio.rs>\ntokio-rustls,https://github.com/rustls/tokio-rustls,MIT OR Apache-2.0,The tokio-rustls Authors\ntokio-stream,https://github.com/tokio-rs/tokio,MIT,Tokio Contributors <team@tokio.rs>\ntokio-util,https://github.com/tokio-rs/tokio,MIT,Tokio Contributors <team@tokio.rs>\ntoml,https://github.com/toml-rs/toml,MIT OR Apache-2.0,The toml Authors\ntoml_datetime,https://github.com/toml-rs/toml,MIT OR Apache-2.0,The toml_datetime 
Authors\ntoml_parser,https://github.com/toml-rs/toml,MIT OR Apache-2.0,The toml_parser Authors\ntoml_writer,https://github.com/toml-rs/toml,MIT OR Apache-2.0,The toml_writer Authors\ntonic,https://github.com/hyperium/tonic,MIT,Lucio Franco <luciofranco14@gmail.com>\ntonic-build,https://github.com/hyperium/tonic,MIT,Lucio Franco <luciofranco14@gmail.com>\ntonic-health,https://github.com/hyperium/tonic,MIT,James Nugent <james@jen20.com>\ntonic-prost,https://github.com/hyperium/tonic,MIT,Lucio Franco <luciofranco14@gmail.com>\ntonic-prost-build,https://github.com/hyperium/tonic,MIT,Lucio Franco <luciofranco14@gmail.com>\ntonic-reflection,https://github.com/hyperium/tonic,MIT,\"James Nugent <james@jen20.com>, Samani G. Gikandi <samani@gojulas.com>\"\ntower,https://github.com/tower-rs/tower,MIT,Tower Maintainers <team@tower-rs.com>\ntower-http,https://github.com/tower-rs/tower-http,MIT,Tower Maintainers <team@tower-rs.com>\ntower-layer,https://github.com/tower-rs/tower,MIT,Tower Maintainers <team@tower-rs.com>\ntower-service,https://github.com/tower-rs/tower,MIT,Tower Maintainers <team@tower-rs.com>\ntracing,https://github.com/tokio-rs/tracing,MIT,\"Eliza Weisman <eliza@buoyant.io>, Tokio Contributors <team@tokio.rs>\"\ntracing-attributes,https://github.com/tokio-rs/tracing,MIT,\"Tokio Contributors <team@tokio.rs>, Eliza Weisman <eliza@buoyant.io>, David Barsky <dbarsky@amazon.com>\"\ntracing-core,https://github.com/tokio-rs/tracing,MIT,Tokio Contributors <team@tokio.rs>\ntracing-log,https://github.com/tokio-rs/tracing,MIT,Tokio Contributors <team@tokio.rs>\ntracing-opentelemetry,https://github.com/tokio-rs/tracing-opentelemetry,MIT,The tracing-opentelemetry Authors\ntracing-serde,https://github.com/tokio-rs/tracing,MIT,Tokio Contributors <team@tokio.rs>\ntracing-subscriber,https://github.com/tokio-rs/tracing,MIT,\"Eliza Weisman <eliza@buoyant.io>, David Barsky <me@davidbarsky.com>, Tokio Contributors 
<team@tokio.rs>\"\ntriomphe,https://github.com/Manishearth/triomphe,MIT OR Apache-2.0,\"Manish Goregaokar <manishsmail@gmail.com>, The Servo Project Developers\"\ntry-lock,https://github.com/seanmonstar/try-lock,MIT,Sean McArthur <sean@seanmonstar.com>\nttl_cache,https://github.com/stusmall/ttl_cache,MIT OR Apache-2.0,Stu Small <stuart.alan.small@gmail.com>\ntypeid,https://github.com/dtolnay/typeid,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\ntypenum,https://github.com/paholg/typenum,MIT OR Apache-2.0,\"Paho Lurie-Gregg <paho@paholg.com>, Andre Bogus <bogusandre@gmail.com>\"\ntypetag,https://github.com/dtolnay/typetag,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\ntypetag-impl,https://github.com/dtolnay/typetag,MIT OR Apache-2.0,David Tolnay <dtolnay@gmail.com>\nulid,https://github.com/dylanhart/ulid-rs,MIT,dylanhart <dylan96hart@gmail.com>\nunarray,https://github.com/cameron1024/unarray,MIT OR Apache-2.0,The unarray Authors\nunicase,https://github.com/seanmonstar/unicase,MIT OR Apache-2.0,Sean McArthur <sean@seanmonstar.com>\nunicode-ident,https://github.com/dtolnay/unicode-ident,(MIT OR Apache-2.0) AND Unicode-3.0,David Tolnay <dtolnay@gmail.com>\nunicode-width,https://github.com/unicode-rs/unicode-width,MIT OR Apache-2.0,\"kwantam <kwantam@gmail.com>, Manish Goregaokar <manishsmail@gmail.com>\"\nunit-prefix,https://codeberg.org/commons-rs/unit-prefix,MIT,\"Fabio Valentini <decathorpe@gmail.com>, Benjamin Sago <ogham@bsago.me>\"\nunsafe-libyaml,https://github.com/dtolnay/unsafe-libyaml,MIT,David Tolnay <dtolnay@gmail.com>\nuntrusted,https://github.com/briansmith/untrusted,ISC,Brian Smith <brian@briansmith.org>\nureq-proto,https://github.com/algesten/ureq-proto,MIT OR Apache-2.0,Martin Algesten <martin@algesten.se>\nurl,https://github.com/servo/rust-url,MIT OR Apache-2.0,The rust-url developers\nurlencoding,https://github.com/kornelski/rust_urlencoding,MIT,\"Kornel <kornel@geekhood.net>, Bertram Truong 
<b@bertramtruong.com>\"\nusername,https://pijul.org/darcs/user,MIT OR Apache-2.0,Pierre-Étienne Meunier <pierre-etienne.meunier@aalto.fi>\nutf-8,https://github.com/SimonSapin/rust-utf8,MIT OR Apache-2.0,Simon Sapin <simon.sapin@exyr.org>\nutf8-ranges,https://github.com/BurntSushi/utf8-ranges,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\nutf8_iter,https://github.com/hsivonen/utf8_iter,Apache-2.0 OR MIT,Henri Sivonen <hsivonen@hsivonen.fi>\nutf8parse,https://github.com/alacritty/vte,Apache-2.0 OR MIT,\"Joe Wilm <joe@jwilm.com>, Christian Duerr <contact@christianduerr.com>\"\nutoipa,https://github.com/juhaku/utoipa,MIT OR Apache-2.0,Juha Kukkonen <juha7kukkonen@gmail.com>\nutoipa-gen,https://github.com/juhaku/utoipa,MIT OR Apache-2.0,Juha Kukkonen <juha7kukkonen@gmail.com>\nuuid,https://github.com/uuid-rs/uuid,Apache-2.0 OR MIT,\"Ashley Mannix<ashleymannix@live.com.au>, Dylan DPC<dylan.dpc@gmail.com>, Hunar Roop Kahlon<hunar.roop@gmail.com>\"\nvaluable,https://github.com/tokio-rs/valuable,MIT,The valuable Authors\nvsimd,https://github.com/Nugine/simd,MIT,The vsimd Authors\nvte,https://github.com/alacritty/vte,Apache-2.0 OR MIT,\"Joe Wilm <joe@jwilm.com>, Christian Duerr <contact@christianduerr.com>\"\nwait-timeout,https://github.com/alexcrichton/wait-timeout,MIT OR Apache-2.0,Alex Crichton <alex@alexcrichton.com>\nwalkdir,https://github.com/BurntSushi/walkdir,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\nwant,https://github.com/seanmonstar/want,MIT,Sean McArthur <sean@seanmonstar.com>\nwarp,https://github.com/seanmonstar/warp,MIT,Sean McArthur <sean@seanmonstar.com>\nwasi,https://github.com/bytecodealliance/wasi,Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT,The Cranelift Project Developers\nwasip2,https://github.com/bytecodealliance/wasi-rs,Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT,The wasip2 Authors\nwasix,https://github.com/wasix-org/wasix-abi-rust,Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT,\"The Cranelift Project Developers, 
john-sharratt\"\nwasm-bindgen,https://github.com/wasm-bindgen/wasm-bindgen,MIT OR Apache-2.0,The wasm-bindgen Developers\nwasm-bindgen-futures,https://github.com/wasm-bindgen/wasm-bindgen/tree/master/crates/futures,MIT OR Apache-2.0,The wasm-bindgen Developers\nwasm-bindgen-macro,https://github.com/wasm-bindgen/wasm-bindgen/tree/master/crates/macro,MIT OR Apache-2.0,The wasm-bindgen Developers\nwasm-bindgen-macro-support,https://github.com/wasm-bindgen/wasm-bindgen/tree/master/crates/macro-support,MIT OR Apache-2.0,The wasm-bindgen Developers\nwasm-bindgen-shared,https://github.com/wasm-bindgen/wasm-bindgen/tree/master/crates/shared,MIT OR Apache-2.0,The wasm-bindgen Developers\nwasmtimer,https://github.com/whizsid/wasmtimer-rs,MIT,\"WhizSid <whizsid@aol.com>, Pierre Krieger <pierre.krieger1708@gmail.com>\"\nweb-sys,https://github.com/wasm-bindgen/wasm-bindgen/tree/master/crates/web-sys,MIT OR Apache-2.0,The wasm-bindgen Developers\nweb-time,https://github.com/daxpedda/web-time,MIT OR Apache-2.0,The web-time Authors\nwebpki-roots,https://github.com/rustls/webpki-roots,CDLA-Permissive-2.0,The webpki-roots Authors\nwinapi,https://github.com/retep998/winapi-rs,MIT,Peter Atashian <retep998@gmail.com>\nwinapi,https://github.com/retep998/winapi-rs,MIT OR Apache-2.0,Peter Atashian <retep998@gmail.com>\nwinapi-i686-pc-windows-gnu,https://github.com/retep998/winapi-rs,MIT OR Apache-2.0,Peter Atashian <retep998@gmail.com>\nwinapi-util,https://github.com/BurntSushi/winapi-util,Unlicense OR MIT,Andrew Gallant <jamslam@gmail.com>\nwinapi-x86_64-pc-windows-gnu,https://github.com/retep998/winapi-rs,MIT OR Apache-2.0,Peter Atashian <retep998@gmail.com>\nwindows,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-collections,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-collections Authors\nwindows-core,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-core,https://github.com/microsoft/windows-rs,MIT OR 
Apache-2.0,The windows-core Authors\nwindows-future,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-future Authors\nwindows-implement,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-implement Authors\nwindows-interface,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-interface Authors\nwindows-link,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-link,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-link Authors\nwindows-numerics,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-numerics Authors\nwindows-result,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-result,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-result Authors\nwindows-strings,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-strings,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-strings Authors\nwindows-sys,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-sys,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-sys Authors\nwindows-targets,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows-targets,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows-targets Authors\nwindows-threading,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_aarch64_gnullvm,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_aarch64_gnullvm,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_aarch64_gnullvm Authors\nwindows_aarch64_msvc,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_aarch64_msvc,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_aarch64_msvc Authors\nwindows_i686_gnu,https://github.com/microsoft/windows-rs,MIT OR 
Apache-2.0,Microsoft\nwindows_i686_gnu,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_i686_gnu Authors\nwindows_i686_gnullvm,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_i686_gnullvm,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_i686_gnullvm Authors\nwindows_i686_msvc,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_i686_msvc,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_i686_msvc Authors\nwindows_x86_64_gnu,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_x86_64_gnu,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_x86_64_gnu Authors\nwindows_x86_64_gnullvm,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_x86_64_gnullvm,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_x86_64_gnullvm Authors\nwindows_x86_64_msvc,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,Microsoft\nwindows_x86_64_msvc,https://github.com/microsoft/windows-rs,MIT OR Apache-2.0,The windows_x86_64_msvc Authors\nwinnow,https://github.com/winnow-rs/winnow,MIT,The winnow Authors\nwit-bindgen,https://github.com/bytecodealliance/wit-bindgen,Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT,Alex Crichton <alex@alexcrichton.com>\nwriteable,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nxmlparser,https://github.com/RazrFalcon/xmlparser,MIT OR Apache-2.0,Yevhenii Reizner <razrfalcon@gmail.com>\nyansi,https://github.com/SergioBenitez/yansi,MIT OR Apache-2.0,Sergio Benitez <sb@sergio.bz>\nyoke,https://github.com/unicode-org/icu4x,Unicode-3.0,Manish Goregaokar <manishsmail@gmail.com>\nyoke-derive,https://github.com/unicode-org/icu4x,Unicode-3.0,Manish Goregaokar <manishsmail@gmail.com>\nzerocopy,https://github.com/google/zerocopy,BSD-2-Clause OR Apache-2.0 OR MIT,\"Joshua Liebow-Feeser <joshlf@google.com>, Jack Wrenn 
<jswrenn@amazon.com>\"\nzerocopy-derive,https://github.com/google/zerocopy,BSD-2-Clause OR Apache-2.0 OR MIT,\"Joshua Liebow-Feeser <joshlf@google.com>, Jack Wrenn <jswrenn@amazon.com>\"\nzerofrom,https://github.com/unicode-org/icu4x,Unicode-3.0,Manish Goregaokar <manishsmail@gmail.com>\nzerofrom-derive,https://github.com/unicode-org/icu4x,Unicode-3.0,Manish Goregaokar <manishsmail@gmail.com>\nzeroize,https://github.com/RustCrypto/utils,Apache-2.0 OR MIT,The RustCrypto Project Developers\nzerotrie,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nzerovec,https://github.com/unicode-org/icu4x,Unicode-3.0,The ICU4X Project Developers\nzerovec-derive,https://github.com/unicode-org/icu4x,Unicode-3.0,Manish Goregaokar <manishsmail@gmail.com>\nzmij,https://github.com/dtolnay/zmij,MIT,David Tolnay <dtolnay@gmail.com>\nzstd,https://github.com/gyscos/zstd-rs,MIT,Alexandre Bury <alexandre.bury@gmail.com>\nzstd-safe,https://github.com/gyscos/zstd-rs,MIT OR Apache-2.0,Alexandre Bury <alexandre.bury@gmail.com>\nzstd-sys,https://github.com/gyscos/zstd-rs,MIT OR Apache-2.0,Alexandre Bury <alexandre.bury@gmail.com>\n"
  },
  {
    "path": "Makefile",
    "content": "DOCKER_SERVICES ?= all\n\nQUICKWIT_SRC = quickwit\n\nhelp:\n\t@grep '^[^\\.#[:space:]].*:' Makefile\n\n\nIMAGE_TAG := $(shell git branch --show-current | tr '\\#/' '-')\n\nQW_COMMIT_DATE := $(shell TZ=UTC0 git log -1 --format=%cd --date=format-local:'%Y-%m-%dT%H:%M:%SZ')\nQW_COMMIT_HASH := $(shell git rev-parse HEAD)\nQW_COMMIT_TAGS := $(shell git tag --points-at HEAD | tr '\\n' ',')\n\ndocker-build:\n\t@docker build \\\n\t\t--build-arg QW_COMMIT_DATE=$(QW_COMMIT_DATE) \\\n\t\t--build-arg QW_COMMIT_HASH=$(QW_COMMIT_HASH) \\\n\t\t--build-arg QW_COMMIT_TAGS=$(QW_COMMIT_TAGS) \\\n\t\t-t quickwit/quickwit:$(IMAGE_TAG) .\n\n# Usage:\n# `make docker-compose-up` starts all the services.\n# `make docker-compose-up DOCKER_SERVICES='jaeger,localstack'` starts the subset of services matching the profiles.\ndocker-compose-up:\n\t@echo \"Launching ${DOCKER_SERVICES} Docker service(s)\"\n\tCOMPOSE_PROFILES=$(DOCKER_SERVICES) docker compose -f docker-compose.yml up -d --remove-orphans --wait\n\ndocker-compose-down:\n\tdocker compose -p quickwit down --remove-orphans\n\ndocker-compose-logs:\n\tdocker compose logs -f docker-compose.yml -t\n\ndocker-compose-monitoring:\n\tCOMPOSE_PROFILES=monitoring docker compose -f docker-compose.yml up -d --remove-orphans\n\ndocker-rm-postgres-volume:\n\tdocker volume rm quickwit_postgres_data\n\ndocker-rm-volumes:\n\tdocker volume rm quickwit_azurite_data quickwit_fake_gcs_server_data quickwit_grafana_conf quickwit_grafana_data quickwit_localstack_data quickwit_postgres_data\n\ndoc:\n\t@$(MAKE) -C $(QUICKWIT_SRC) doc\n\nfmt:\n\t@$(MAKE) -C $(QUICKWIT_SRC) fmt\n\nfix:\n\t@$(MAKE) -C $(QUICKWIT_SRC) fix\n\ntypos:\n\ttypos\n\n# Usage:\n# `make test-all` starts the Docker services and runs all the tests.\n# `make -k test-all docker-compose-down`, tears down the Docker services after running all the tests.\ntest-all: docker-compose-up\n\t@$(MAKE) -C $(QUICKWIT_SRC) test-all\n\ntest-failpoints:\n\t@$(MAKE) -C $(QUICKWIT_SRC) 
test-failpoints\n\n# This will build and push all custom cross images for cross-compilation.\n# You will need to log in to Docker Hub with the `quickwit` account.\nIMAGE_TAGS = x86_64-unknown-linux-gnu aarch64-unknown-linux-gnu x86_64-unknown-linux-musl aarch64-unknown-linux-musl\n\n.PHONY: cross-images\ncross-images:\n\t@for tag in ${IMAGE_TAGS}; do \\\n\t\tdocker build --tag quickwit/cross:$$tag --file ./build/cross-images/$$tag.dockerfile ./build/cross-images; \\\n\t\tdocker push quickwit/cross:$$tag; \\\n\tdone\n\n# TODO: to be replaced by https://github.com/quickwit-oss/quickwit/issues/237\n.PHONY: build\nbuild: build-ui\n\t$(MAKE) -C $(QUICKWIT_SRC) build\n\n# Usage:\n# `BINARY_FILE=path/to/quickwit/binary BINARY_VERSION=0.1.0 ARCHIVE_NAME=quickwit make archive`\n# - BINARY_FILE: Path of the quickwit binary file.\n# - BINARY_VERSION: Version of the quickwit binary.\n# - ARCHIVE_NAME: Name of the resulting archive file (without extension).\n.PHONY: archive\narchive:\n\t@echo \"Archiving release binary & assets\"\n\t@mkdir -p \"./quickwit-${BINARY_VERSION}/config\"\n\t@mkdir -p \"./quickwit-${BINARY_VERSION}/qwdata\"\n\t@cp ./config/quickwit.yaml \"./quickwit-${BINARY_VERSION}/config\"\n\t@cp ./LICENSE \"./quickwit-${BINARY_VERSION}\"\n\t@cp \"${BINARY_FILE}\" \"./quickwit-${BINARY_VERSION}\"\n\t@tar -czf \"${ARCHIVE_NAME}.tar.gz\" \"./quickwit-${BINARY_VERSION}\"\n\t@rm -rf \"./quickwit-${BINARY_VERSION}\"\n\nworkspace-deps-tree:\n\t$(MAKE) -C $(QUICKWIT_SRC) workspace-deps-tree\n\n.PHONY: build-rustdoc\nbuild-rustdoc:\n\t$(MAKE) -C $(QUICKWIT_SRC) build-rustdoc\n\n.PHONY: build-ui\nbuild-ui:\n\t$(MAKE) -C $(QUICKWIT_SRC) build-ui\n"
  },
  {
    "path": "README.md",
    "content": "[![CI](https://github.com/quickwit-oss/quickwit/actions/workflows/ci.yml/badge.svg)](https://github.com/quickwit-oss/quickwit/actions?query=workflow%3ACI+branch%3Amain)\n[![codecov](https://codecov.io/gh/quickwit-oss/quickwit/branch/main/graph/badge.svg?token=06SRGAV5SS)](https://codecov.io/gh/quickwit-oss/quickwit)\n[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/quickwit-oss/quickwit/badge)](https://scorecard.dev/viewer/?uri=github.com/quickwit-oss/quickwit)\n[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](CODE_OF_CONDUCT.md)\n[![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue?style=flat-square)](LICENSE)\n[![Twitter Follow](https://img.shields.io/twitter/follow/Quickwit_Inc?color=%231DA1F2&logo=Twitter&style=plastic)](https://twitter.com/Quickwit_Inc)\n[![Discord](https://img.shields.io/discord/908281611840282624?logo=Discord&logoColor=%23FFFFFF&style=plastic)](https://discord.quickwit.io)\n<br/>\n\n<br/>\n<br/>\n<p align=\"center\">\n  <img src=\"docs/assets/images/logo_horizontal.svg#gh-light-mode-only\" alt=\"Quickwit Cloud-Native Search Engine\" height=\"40\">\n  <img src=\"docs/assets/images/quickwit-dark-theme-logo.png#gh-dark-mode-only\" alt=\"Quickwit Cloud-Native Search Engine\" height=\"40\">\n</p>\n\n<h2 align=\"center\">\nCloud-native search engine for observability (logs, traces, and soon metrics!). An open-source alternative to Datadog, Elasticsearch,  Loki, and Tempo.\n</h2>\n\n<h4 align=\"center\">\n  <a href=\"https://quickwit.io/docs/get-started/quickstart\">Quickstart</a> |\n  <a href=\"https://quickwit.io/docs/\">Docs</a> |\n  <a href=\"https://quickwit.io/tutorials\">Tutorials</a> |\n  <a href=\"https://discord.quickwit.io\">Chat</a> |\n  <a href=\"https://quickwit.io/docs/get-started/installation\">Download</a>\n</h4>\n<br/>\n\n<b>We just released Quickwit 0.8! 
Read the [blog post](https://quickwit.io/blog/quickwit-0.8) to learn about the latest powerful features!</b>\n\n### **Quickwit is the fastest search engine on cloud storage. It's the perfect fit for observability use cases**\n\n- [Log management](https://quickwit.io/docs/log-management/overview)\n- [Distributed tracing](https://quickwit.io/docs/distributed-tracing/overview)\n- Metrics support is on the roadmap\n\n### 🚀 Quickstart\n\n- [Search and analytics on Stack Overflow dataset](https://quickwit.io/docs/get-started/quickstart)\n- [Trace analytics with Grafana](https://quickwit.io/docs/get-started/tutorials/trace-analytics-with-grafana)\n- [Distributed tracing with Jaeger](https://quickwit.io/docs/get-started/tutorials/tutorial-jaeger)\n\n<br/>\n\n<video src=\"https://github.com/quickwit-oss/quickwit/assets/653704/020b94b9-deeb-4376-9a3a-b82e1168094c\" controls=\"controls\" style=\"max-width: 1200px;\">\n</video>\n\n<br/>\n\n# 💡 Features\n\n- Full-text search and aggregation queries\n- Elasticsearch-compatible API, use Quickwit with any Elasticsearch or OpenSearch client\n- [Jaeger-native](https://quickwit.io/docs/distributed-tracing/plug-quickwit-to-jaeger)\n- OTEL-native for [logs](https://quickwit.io/docs/log-management/overview) and [traces](https://quickwit.io/docs/distributed-tracing/overview)\n- [Schemaless](https://quickwit.io/docs/guides/schemaless) or strict schema indexing\n- Schemaless analytics\n- Sub-second search on cloud storage (Amazon S3, Azure Blob Storage, Google Cloud Storage, …)\n- Decoupled compute and storage, stateless indexers & searchers\n- [Grafana data source](https://github.com/quickwit-oss/quickwit-datasource)\n- Kubernetes ready - See our [helm-chart](https://quickwit.io/docs/deployment/kubernetes/helm)\n- RESTful API\n\n## Enterprise ready\n\n- Multiple [data sources](https://quickwit.io/docs/ingest-data/) Kafka / Kinesis / Pulsar native\n- Multi-tenancy: indexing with many indexes and partitioning\n- Retention policies\n- Delete 
tasks (for GDPR use cases)\n- Distributed and highly available* engine that scales out in seconds (*HA indexing only with Kafka)\n\n# 📑 Architecture overview\n\n![Quickwit Distributed Tracing](./docs/assets/images/quickwit-overview-light.svg#gh-light-mode-only)![Quickwit Distributed Tracing](./docs/assets/images/quickwit-overview-dark.svg#gh-dark-mode-only)\n\n- [Architecture overview](https://quickwit.io/docs/overview/architecture)\n- [Log management](https://quickwit.io/docs/log-management/overview)\n- [Distributed traces](https://quickwit.io/docs/distributed-tracing/overview)\n\n\n# 📕 Documentation\n\n- [Installation](https://quickwit.io/docs/get-started/installation)\n- [Log management with Quickwit](https://quickwit.io/docs/log-management/overview)\n- [Distributed Tracing with Quickwit](https://quickwit.io/docs/distributed-tracing/overview)\n- [Ingest data](https://quickwit.io/docs/ingest-data/)\n- [REST API](https://quickwit.io/docs/reference/rest-api)\n\n# 📚 Resources\n\n- [Blog posts](https://quickwit.io/blog/)\n- [YouTube channel](https://www.youtube.com/@quickwit8103)\n- [Discord](https://discord.quickwit.io)\n\n# 🔮 Roadmap\n\n- Quickwit 0.9 (July 2024)\n  - Indexing and search performance improvements\n  - Index configuration updates (retention policy, indexing and search settings)\n  - Concatenated field\n\n- Quickwit 0.10 (October 2024)\n  - Schema (doc mapping) updates\n  - Native distributed ingestion\n  - Index templates\n\n# 🙋 FAQ\n\n### How can I switch from Elasticsearch or OpenSearch to Quickwit?\n\nQuickwit supports a large subset of the Elasticsearch/OpenSearch API.\n\nFor instance, it has an ES-compatible ingest API to make it easier to migrate your log shippers (Vector, Fluent Bit, Syslog, ...) 
to Quickwit.\n\nOn the search side, the most popular Elasticsearch endpoints, query DSL, and even aggregations are supported.\n\nThe list of available endpoints and queries is available [here](https://quickwit.io/docs/reference/es_compatible_api), while the list of supported aggregations is available [here](https://quickwit.io/docs/reference/aggregation).\n\nLet us know if part of the API you are using is missing!\n\nIf the client you are using refuses to connect to Quickwit due to missing headers, you can use the `extra_headers` option in the [node configuration](https://quickwit.io/docs/configuration/node-config#rest-configuration) to impersonate any compatible version of Elasticsearch or OpenSearch.\n\n### How is Quickwit different from traditional search engines like Elasticsearch or Solr?\n\nThe core difference and advantage of Quickwit is its architecture, built from the ground up to search on cloud storage. We optimized IO paths, revamped the index data structures, and made search stateless and sub-second on cloud storage.\n\n### How does Quickwit compare to Elastic in terms of cost?\n\nWe estimate that Quickwit can be up to 10x cheaper on average than Elastic. To understand how, check out our [blog post](https://quickwit.io/blog/commoncrawl/) about searching the web on AWS S3.\n\n### What license does Quickwit use?\n\nQuickwit is open-source under the Apache License, Version 2.0 - Apache-2.0.\n\n### Is it possible to set up Quickwit for High Availability (HA)?\n\nHA is available for search; for indexing, it is available only with a Kafka source.\n\n# 🤝 Contribute and spread the word\n\nWe are always thrilled to receive contributions: code, documentation, issues, or feedback. Here's how you can help us build the future of log management:\n\n- Start by checking out the [GitHub issues labeled \"Good first issue\"](https://github.com/quickwit-oss/quickwit/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22). 
These are a great place for newcomers to contribute.\n- Read our [Contributor Covenant Code of Conduct](./CODE_OF_CONDUCT.md) to understand our community standards.\n- [Create a fork of Quickwit](https://github.com/quickwit-oss/quickwit/fork) to have your own copy of the repository where you can make changes.\n- To understand how to contribute, read our [contributing guide](./CONTRIBUTING.md).\n- Set up your development environment following our [development setup guide](./CONTRIBUTING.md#development).\n- Once you've made your changes and tested them, you can contribute by [submitting a pull request](./CONTRIBUTING.md#submitting-a-pr).\n\n✨ After your contributions are accepted, don't forget to claim your swag by emailing us at hello@quickwit.io. Thank you for contributing!\n\n# 💬 Join Our Community\n\nWe welcome everyone to our community! Whether you're contributing code or just saying hello, we'd love to hear from you. Here's how you can connect with us:\n\n- Join the conversation on [Discord](https://discord.quickwit.io).\n- Follow us on [Twitter](https://twitter.com/Quickwit_Inc).\n- Check out our [website](https://quickwit.io/) and [blog](https://quickwit.io/blog) for the latest updates.\n- Watch our [YouTube](https://www.youtube.com/channel/UCvZVuRm2FiDq1_ul0mY85wA) channel for video content.\n"
  },
  {
    "path": "SECURITY.md",
    "content": "# Security Policy\n\n## Supported Versions\n\n| Version | Supported          |\n| ------- | ------------------ |\n| 0.3.1   | :white_check_mark: |\n| < 0.3.1   | :x:                |\n\n## Reporting a Vulnerability\n\nTo disclose a vulnerability in our code, please notify us by email at security@quickwit.io or private message _@fulmicoton_ or _@guilload_ on our Discord \nserver ([discord.quickwit.io](https://discord.quickwit.io)). We will open a draft security advisory on our repository and grant you access so you can\nshare with us more details about the vulnerability. After releasing a fix, we will publish the security advisory to publicly disclose the security vulnerability\nto the project's community.\n"
  },
  {
    "path": "_typos.toml",
    "content": "[files]\nextend-exclude = [\"**/*.json\"]\n\n[default.extend-words]\n# Don't correct the surname \"Teh\"\nstrat = \"strat\"\n"
  },
  {
    "path": "build/cross-images/aarch64-unknown-linux-gnu.dockerfile",
    "content": "FROM ghcr.io/cross-rs/aarch64-unknown-linux-gnu:0.2.4@sha256:3356619b020614effd22e83cec41236e69f17ce581ffe35e252898b0c693b4e2\n\nARG PBC_URL=\"https://github.com/protocolbuffers/protobuf/releases/download/v21.5/protoc-21.5-linux-x86_64.zip\"\n\n#TODO: \n# We can switch to static linking (remove `libsasl2-dev:arm64`) using \n# `rdkafka/gssapi-vendored` feature when there is a release including: \n# https://github.com/MaterializeInc/rust-sasl/pull/48\n\nRUN dpkg --add-architecture arm64 && \\\n    apt-get update && \\\n    apt-get install -y clang-3.9 \\\n        libclang-3.9-dev \\\n        binutils-aarch64-linux-gnu \\\n        libsasl2-dev:arm64 \\\n        unzip && \\\n    rm -rf /var/lib/apt/lists/*\n\nRUN curl -fLO $PBC_URL && \\\n    unzip protoc-21.5-linux-x86_64.zip -d ./protobuf && \\\n    mv ./protobuf/bin/protoc /usr/bin/ && \\\n    rm -rf ./protobuf protoc-21.5-linux-x86_64.zip\n\nENV LIBZ_SYS_STATIC=1 \\\n    PKG_CONFIG_ALLOW_CROSS=true \\\n    PKG_CONFIG_ALL_STATIC=true \\\n    X86_64_UNKNOWN_LINUX_MUSL_OPENSSL_STATIC=1 \\\n    X86_64_UNKNOWN_LINUX_MUSL_OPENSSL_DIR=/usr/local/musl/\n"
  },
  {
    "path": "build/cross-images/aarch64-unknown-linux-musl.dockerfile",
    "content": "FROM rustembedded/cross:aarch64-unknown-linux-musl@sha256:22627e0ba533781062127b13601c37216fdca27123390b07dfabd3f31f3c84a0\n\n\n# The Rust toolchain to use when building our image.  Set by `hooks/build`.\n# ARG TOOLCHAIN=stable\n\n# The OpenSSL version to use. Here is the place to check for new releases:\n#\n# - https://www.openssl.org/source/\n#\n# ALSO UPDATE hooks/build!\nARG OPENSSL_VERSION=1.1.1i\nARG ZLIB_VERSION=1.2.11\n\nRUN echo \"Building OpenSSL\" && \\\n    cd /tmp && \\\n    short_version=\"$(echo \"$OPENSSL_VERSION\" | sed s'/[a-z]$//' )\" && \\\n    curl -fLO \"https://www.openssl.org/source/openssl-$OPENSSL_VERSION.tar.gz\" || \\\n        curl -fLO \"https://www.openssl.org/source/old/$short_version/openssl-$OPENSSL_VERSION.tar.gz\" && \\\n    tar xvzf \"openssl-$OPENSSL_VERSION.tar.gz\" && cd \"openssl-$OPENSSL_VERSION\" && \\\n    AR=aarch64-linux-musl-ar CC=aarch64-linux-musl-gcc ./Configure no-zlib -fPIC --prefix=/usr/local/aarch64-linux-musl -DOPENSSL_NO_SECURE_MEMORY linux-aarch64 && \\\n    env C_INCLUDE_PATH=/usr/local/aarch64-linux-musl/include/ make depend && \\\n    env C_INCLUDE_PATH=/usr/local/aarch64-linux-musl/include/ make && \\\n    make install && \\\n    rm -r /tmp/*\n\nRUN echo \"Building zlib\" && \\\n    cd /tmp && \\\n    curl -fLO \"https://zlib.net/fossils/zlib-$ZLIB_VERSION.tar.gz\" && \\\n    tar xzf \"zlib-$ZLIB_VERSION.tar.gz\" && cd \"zlib-$ZLIB_VERSION\" && \\\n    AR=aarch64-linux-musl-ar CC=aarch64-linux-musl-gcc ./configure --static --prefix=/usr/local/aarch64-linux-musl && \\\n    make && make install && \\\n    rm -r /tmp/*\n\nENV AARCH64_UNKNOWN_LINUX_MUSL_OPENSSL_STATIC=1 \\\n    CC=aarch64-linux-musl-gcc \\\n    CFLAGS=-I/usr/local/aarch64-linux-musl/include \\\n    LIBZ_SYS_STATIC=1 \\\n    LIB_LDFLAGS=-L/usr/local/aarch64-linux-musl/lib \\\n    OPENSSL_INCLUDE_DIR=/usr/local/aarch64-linux-musl/include/openssl \\\n    OPENSSL_LIB_DIR=/usr/local/aarch64-linux-musl/lib \\\n    
PKG_CONFIG_ALLOW_CROSS=true \\\n    PKG_CONFIG_ALL_STATIC=true \\\n    TARGET=aarch64-unknown-linux-musl \\\n    AARCH64_UNKNOWN_LINUX_MUSL_OPENSSL_DIR=/usr/local/aarch64-linux-musl \\\n    OPENSSL_ROOT_DIR=/usr/local/aarch64-linux-musl\n"
  },
  {
    "path": "build/cross-images/x86_64-unknown-linux-gnu.dockerfile",
    "content": "FROM ghcr.io/cross-rs/x86_64-unknown-linux-gnu:0.2.4@sha256:7c9067212c2283be2a1d5585af5ecebd4c4a2e18091e2a6aafd23f9b4b81d496\n\nARG PBC_URL=\"https://github.com/protocolbuffers/protobuf/releases/download/v21.5/protoc-21.5-linux-x86_64.zip\"\n\nRUN apt-get update && \\\n    apt-get install -y clang-3.9 \\\n        libclang-3.9-dev \\\n        libsasl2-dev \\\n        unzip && \\\n    rm -rf /var/lib/apt/lists/*\n\nRUN curl -fLO $PBC_URL && \\\n    unzip protoc-21.5-linux-x86_64.zip -d ./protobuf && \\\n    mv ./protobuf/bin/protoc /usr/bin/ && \\\n    rm -rf ./protobuf protoc-21.5-linux-x86_64.zip\n"
  },
  {
    "path": "build/cross-images/x86_64-unknown-linux-musl.dockerfile",
    "content": "FROM quickwit/cross-base:x86_64-unknown-linux-musl@sha256:5bcc7843aab64f89bf85c464fa2c5a00ecc634a8b1ac88c84a864f60054450cb\n# See https://github.com/quickwit-inc/rust-musl-builder\n\nRUN echo \"Upgrading CMake\" && \\\n    sudo apt-get remove cmake -y && \\\n    curl -fLO https://www.cmake.org/files/v3.12/cmake-3.12.1.tar.gz && \\\n    tar -xvzf cmake-3.12.1.tar.gz && \\\n    cd cmake-3.12.1/ && ./configure && \\\n    sudo make install\n    \nENV CC=musl-gcc \\\n    CFLAGS=-I/usr/local/musl/include \\\n    LIB_LDFLAGS=-L/usr/lib/x86_64-linux-gnu\n"
  },
  {
    "path": "config/quickwit.yaml",
    "content": "# ============================ Node Configuration ==============================\n#\n# Website: https://quickwit.io\n# Docs: https://quickwit.io/docs/configuration/node-config\n#\n# Configure AWS credentials: https://quickwit.io/docs/guides/aws-setup#aws-credentials\n#\n# -------------------------------- General settings --------------------------------\n#\n# Config file format version.\n#\nversion: 0.8\n#\n# Node ID. Must be unique within a cluster. If not set, a random node ID is generated on each startup.\n#\n# node_id: node-1\n#\n# Quickwit opens three sockets.\n# - for its HTTP server, hosting the UI and the REST API (TCP)\n# - for its gRPC service (TCP)\n# - for its Gossip cluster membership service (UDP)\n#\n# All three services are bound to the same host and a different port. The host can be an IP address or a hostname.\n#\n# Default HTTP server host is `127.0.0.1` and default HTTP port is 7280.\n# The default host value was chosen to avoid exposing the node to the open-world without users' explicit consent.\n# This allows for testing Quickwit in single-node mode or with multiple nodes running on the same host and listening\n# on different ports. However, in cluster mode, using this value is never appropriate because it causes the node to\n# ignore incoming traffic.\n# There are two options to set up a node in cluster mode:\n#   1. specify the node's hostname or IP\n#   2. pass `0.0.0.0` and let Quickwit do its best to discover the node's IP (see `advertise_address`)\n#\n# listen_address: 127.0.0.1\n#\n# rest:\n#   listen_port: 7280\n#   cors_allow_origins:\n#     - \"http://localhost:3000\"\n#   extra_headers:\n#     x-header-1: header-value-1\n#     x-header-2: header-value-2\n#\n# grpc:\n#   max_message_size: 10 MiB\n#\n# IP address advertised by the node, i.e. 
the IP address that peer nodes should use to connect to the node for RPCs.\n# The environment variable `QW_ADVERTISE_ADDRESS` can also be used to override this value.\n# The default advertise address is `listen_address`. If `listen_address` is unspecified (`0.0.0.0`),\n# Quickwit attempts to sniff the node's IP by scanning the available network interfaces.\n# advertise_address: 192.168.0.42\n#\n# In order to join a cluster, one needs to specify a list of\n# seeds to connect to. If no port is specified, Quickwit will assume\n# the seeds are using the same port as the current node gossip port.\n# By default, the peer seed list is empty.\n#\n# peer_seeds:\n#   - quickwit-searcher-0.local\n#   - quickwit-searcher-1.local:10000\n#\n# Path to directory where temporary data (caches, intermediate indexing data structures)\n# is stored. Defaults to `./qwdata`.\n#\n# data_dir: /path/to/data/dir\n#\n# Metastore URI. Defaults to `data_dir/indexes#polling_interval=30s`,\n# which is a file-backed metastore and mostly convenient for testing. A cluster would\n# require a metastore backed by Amazon S3 or PostgreSQL.\n#\n# metastore_uri: s3://your-bucket/indexes\n# metastore_uri: postgres://username:password@host:port/db\n#\n# When using a file-backed metastore, the state of the metastore will be cached forever.\n# If you are indexing and searching from different processes, it is possible to periodically\n# refresh the state of the metastore on the searcher using the `polling_interval` hashtag.\n#\n# metastore_uri: s3://your-bucket/indexes#polling_interval=30s\n#\n# Default index root URI, which defines where index data (splits) is stored,\n# following the scheme `{default_index_root_uri}/{index-id}`. 
Defaults to `{data_dir}/indexes`.\n#\n# default_index_root_uri: s3://your-bucket/indexes\n#\n# -------------------------------- Storage settings --------------------------------\n# https://quickwit.io/docs/configuration/node-config#storage-configuration\n#\n# Hardcoding credentials into configuration files is not secure and strongly\n# discouraged. Prefer the alternative authentication methods that your storage\n# backend may provide.\n#\n# storage:\n#   azure:\n#     account: ${QW_AZURE_STORAGE_ACCOUNT}\n#     access_key: ${QW_AZURE_STORAGE_ACCESS_KEY}\n#\n#   s3:\n#     access_key_id: ${AWS_ACCESS_KEY_ID}\n#     secret_access_key: ${AWS_SECRET_ACCESS_KEY}\n#     region: ${AWS_REGION}\n#     endpoint: ${QW_S3_ENDPOINT}\n#     force_path_style_access: ${QW_S3_FORCE_PATH_STYLE_ACCESS:-false}\n#     disable_multi_object_delete: false\n#     disable_multipart_upload: false\n#\n# -------------------------------- Metastore settings --------------------------------\n# https://quickwit.io/docs/configuration/node-config#metastore-configuration\n#\n# metastore:\n#   postgres:\n#     min_connections: 0\n#     max_connections: 10\n#     acquire_connection_timeout: 10s\n#     idle_connection_timeout: 10min\n#     max_connection_lifetime: 30min\n#\n# -------------------------------- Indexer settings --------------------------------\n# https://quickwit.io/docs/configuration/node-config#indexer-configuration\n\nindexer:\n  enable_otlp_endpoint: ${QW_ENABLE_OTLP_ENDPOINT:-true}\n#   split_store_max_num_bytes: 100G\n#   split_store_max_num_splits: 1000\n#   max_concurrent_split_uploads: 12\n#\n#\n# -------------------------------- Ingest API settings ------------------------------\n# https://quickwit.io/docs/configuration/node-config#ingest-api-configuration\n#\n# ingest_api:\n#   max_queue_memory_usage: 2GiB\n#   max_queue_disk_usage: 4GiB\n#   content_length_limit: 10MiB\n#\n# -------------------------------- Searcher settings --------------------------------\n# 
https://quickwit.io/docs/configuration/node-config#searcher-configuration\n#\n# searcher:\n#   fast_field_cache_capacity: 1G\n#   split_footer_cache_capacity: 500M\n#   partial_request_cache_capacity: 64M\n#   max_num_concurrent_split_streams: 100\n#   max_num_concurrent_split_searches: 100\n#   aggregation_memory_limit: 500M\n#   aggregation_bucket_limit: 65000\n#   split_cache:\n#      max_num_bytes: 1G\n#      max_num_splits: 10000\n#      num_concurrent_downloads: 1\n# -------------------------------- Jaeger settings --------------------------------\n\njaeger:\n  enable_endpoint: ${QW_ENABLE_JAEGER_ENDPOINT:-true}\n"
  },
  {
    "path": "config/templates/gh-archive.yaml",
    "content": "version: 0.8\n\ntemplate_id: gh-archive\n\nindex_id_patterns:\n  - gh-archive*\n\ndescription: Index config template for the GH Archive dataset (gharchive.org)\n\npriority: 0\n\ndoc_mapping:\n  field_mappings:\n    - name: id\n      type: text\n      tokenizer: raw\n    - name: type\n      type: text\n      fast: true\n      tokenizer: raw\n    - name: public\n      type: bool\n      fast: true\n    - name: payload\n      type: json\n      tokenizer: default\n    - name: org\n      type: json\n      tokenizer: default\n    - name: repo\n      type: json\n      tokenizer: default\n    - name: actor\n      type: json\n      tokenizer: default\n    - name: other\n      type: json\n      tokenizer: default\n    - name: created_at\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: created_at\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "config/templates/stackoverflow.yaml",
    "content": "version: 0.8\n\ntemplate_id: stackoverflow\n\nindex_id_patterns:\n  - stackoverflow*\n\ndescription: Index config template for the Stackoverflow tutorial (quickwit.io/docs/get-started/quickstart)\n\npriority: 0\n\ndoc_mapping:\n  field_mappings:\n    - name: title\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: creationDate\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: creationDate\n\nsearch_settings:\n  default_search_fields: [title, body]\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "config/tutorials/fluentbit-logs/index-config.yaml",
    "content": "version: 0.8\n\nindex_id: fluentbit-logs\n\ndoc_mapping:\n  mode: dynamic\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast: true\n  timestamp_field: timestamp\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "config/tutorials/gh-archive/index-config-for-clickhouse.yaml",
    "content": "#\n# Index config file for gh-archive dataset.\n#\nversion: 0.8\n\nindex_id: gh-archive\n\ndoc_mapping:\n  store_source: false\n  field_mappings:\n    - name: id\n      type: u64\n      fast: true\n    - name: created_at\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: event_type\n      type: text\n      tokenizer: raw\n    - name: title\n      type: text\n      tokenizer: default\n      record: position\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n  timestamp_field: created_at\n\nsearch_settings:\n  default_search_fields: [title, body]\n"
  },
  {
    "path": "config/tutorials/gh-archive/index-config.yaml",
    "content": "#\n# Index config file for gh-archive dataset.\n#\nversion: 0.8\n\nindex_id: gh-archive\n\ndoc_mapping:\n  field_mappings:\n    - name: id\n      type: text\n      tokenizer: raw\n    - name: type\n      type: text\n      fast: true\n      tokenizer: raw\n    - name: public\n      type: bool\n      fast: true\n    - name: payload\n      type: json\n      tokenizer: default\n    - name: org\n      type: json\n      tokenizer: default\n    - name: repo\n      type: json\n      tokenizer: default\n    - name: actor\n      type: json\n      tokenizer: default\n    - name: other\n      type: json\n      tokenizer: default\n    - name: created_at\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: created_at\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "config/tutorials/gh-archive/kafka-source.yaml",
    "content": "version: 0.8\nsource_id: kafka-source\nsource_type: kafka\nnum_pipelines: 2\nparams:\n  topic: gh-archive\n  client_params:\n    bootstrap.servers: localhost:9092\n"
  },
  {
    "path": "config/tutorials/gh-archive/kinesis-source.yaml",
    "content": "version: 0.8\nsource_id: kinesis-source\nsource_type: kinesis\nparams:\n  stream_name: gh-archive\n"
  },
  {
    "path": "config/tutorials/grafana/docker-compose.yml",
    "content": "version: \"3.9\"\n\nnetworks:\n  default:\n    name: quickwit-grafana\n    # ipam:\n    #   config:\n    #   - subnet: 172.16.7.0/24\n    #     gateway: 172.16.7.1\n\nservices:\n  quickwit:\n    image: quickwit/quickwit:${QUICKWIT_VERSION:-0.7.1}\n  grafana:\n    image: grafana/grafana-oss:${GRAFANA_VERSION:-9.4.7}\n    container_name: grafana\n    ports:\n      - \"${MAP_HOST_GRAFANA:-127.0.0.1}:3000:3000\"\n    environment:\n      GF_AUTH_DISABLE_LOGIN_FORM: \"true\"\n      GF_AUTH_ANONYMOUS_ENABLED: \"true\"\n      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin\n    volumes:\n      - ./monitoring/grafana/dashboards:/var/lib/grafana/dashboards\n      - ./monitoring/grafana/provisioning:/etc/grafana/provisioning\n\n  jaeger:\n    image: jaegertracing/all-in-one:${JAEGER_VERSION:-1.48.0}\n    container_name: jaeger\n    ports:\n      - \"${MAP_HOST_JAEGER:-127.0.0.1}:16686:16686\" # Frontend\n    profiles:\n      - jaeger\n      - monitoring\n\n  otel-collector:\n    image: otel/opentelemetry-collector:${OTEL_VERSION:-0.84.0}\n    container_name: otel-collector\n    ports:\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:1888:1888\"   # pprof extension\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:8888:8888\"   # Prometheus metrics exposed by the collector\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:8889:8889\"   # Prometheus exporter metrics\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:13133:13133\" # health_check extension\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:4317:4317\"   # OTLP gRPC receiver\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:4318:4318\"   # OTLP http receiver\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:55679:55679\" # zpages extension\n    profiles:\n      - otel\n      - monitoring\n    volumes:\n      - ./monitoring/otel-collector-config.yaml:/etc/otel-collector-config.yaml\n    command: [\"--config=/etc/otel-collector-config.yaml\"]\n\n  prometheus:\n    image: prom/prometheus:${PROMETHEUS_VERSION:-v2.43.0}\n    container_name: prometheus\n    ports:\n      - 
\"${MAP_HOST_PROMETHEUS:-127.0.0.1}:9090:9090\"\n    profiles:\n      - prometheus\n      - monitoring\n    volumes:\n      - ./monitoring/prometheus.yaml:/etc/prometheus/prometheus.yml\n    extra_hosts:\n      - \"host.docker.internal:host-gateway\"\n\n  gcp-pubsub-emulator:\n    # It is not an official docker image\n    # if we prefer we can build a docker from the official docker image (gcloud cli)\n    # and install the pubsub emulator https://cloud.google.com/pubsub/docs/emulator\n    image: thekevjames/gcloud-pubsub-emulator:${GCLOUD_EMULATOR:-455.0.0}\n    container_name: gcp-pubsub-emulator\n    ports:\n      - \"${MAP_HOST_GCLOUD_EMULATOR:-127.0.0.1}:8681:8681\"\n    environment:\n      # create a fake gcp project and a topic / subscription\n      - PUBSUB_PROJECT1=quickwit-emulator,emulator_topic:emulator_subscription\n    profiles:\n      - all\n      - gcp-pubsub\n\nvolumes:\n  localstack_data:\n  postgres_data:\n  azurite_data:\n"
  },
  {
    "path": "config/tutorials/hdfs-logs/index-config-partitioned.yaml",
    "content": "#\n# Index config file for hdfs-logs dataset with partitioning configured.\n#\n\nversion: 0.8\n\nindex_id: hdfs-logs-partitioned\n\ndoc_mapping:\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: tenant_id\n      type: u64\n    - name: severity_text\n      type: text\n      tokenizer: raw\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: json\n      tokenizer: raw\n  tag_fields: [tenant_id]\n  partition_key: tenant_id\n  max_num_partitions: 1000\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n\nindexing_settings:\n  commit_timeout_secs: 30\n  split_num_docs_target: 10000000\n  merge_policy:\n    type: \"limit_merge\"\n    merge_factor: 10\n    max_merge_ops: 3\n    maturation_period: 48 hours\n"
  },
  {
    "path": "config/tutorials/hdfs-logs/index-config-retention-policy.yaml",
    "content": "#\n# Index config file for hdfs-logs dataset with a retention policy configured.\n#\n\nversion: 0.8\n\nindex_id: hdfs-logs-retention-policy\n\ndoc_mapping:\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: tenant_id\n      type: u64\n    - name: severity_text\n      type: text\n      tokenizer: raw\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: json\n      tokenizer: raw\n  tag_fields: [tenant_id]\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n\nretention:\n  period: 90 days\n  schedule: daily\n\nindexing_settings:\n  commit_timeout_secs: 10\n  split_num_docs_target: 10000000\n"
  },
  {
    "path": "config/tutorials/hdfs-logs/index-config.yaml",
    "content": "#\n# Index config file for hdfs-logs dataset.\n#\n\nversion: 0.8\n\nindex_id: hdfs-logs\n\ndoc_mapping:\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: tenant_id\n      type: u64\n    - name: severity_text\n      type: text\n      tokenizer: raw\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: json\n      tokenizer: raw\n  tag_fields: [tenant_id]\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n"
  },
  {
    "path": "config/tutorials/hdfs-logs/searcher-1.yaml",
    "content": "version: 0.8\nnode_id: searcher-1\nlisten_address: 127.0.0.1\nrest:\n  listen_port: 7280\ningest_api:\n  max_queue_memory_usage: 4GiB\n  max_queue_disk_usage: 8GiB\npeer_seeds:\n  - 127.0.0.1:7290 # searcher-2\n  - 127.0.0.1:7300 # searcher-3\n"
  },
  {
    "path": "config/tutorials/hdfs-logs/searcher-2.yaml",
    "content": "version: 0.8\nnode_id: searcher-2\nlisten_address: 127.0.0.1\nrest:\n  listen_port: 7290\npeer_seeds:\n  - 127.0.0.1:7280 # searcher-1\n  - 127.0.0.1:7300 # searcher-3\n"
  },
  {
    "path": "config/tutorials/hdfs-logs/searcher-3.yaml",
    "content": "version: 0.8\nnode_id: searcher-3\nlisten_address: 127.0.0.1\nrest:\n  listen_port: 7300\npeer_seeds:\n  - 127.0.0.1:7280 # searcher-1\n  - 127.0.0.1:7290 # searcher-2\n\n"
  },
  {
    "path": "config/tutorials/otel-logs/index-config.yaml",
    "content": "#\n# Index config file for receiving logs in OpenTelemetry format.\n# Link: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md\n#\n\nversion: 0.8\n\nindex_id: otel-log-v0\n\ndoc_mapping:\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast: true\n    - name: severity\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: attributes\n      type: json\n    - name: resource\n      type: json\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity, body]\n\n"
  },
  {
    "path": "config/tutorials/otel-logs/kafka-source.yaml",
    "content": "version: 0.8\nsource_id: kafka-source\nsource_type: kafka\ninput_format: otlp_logs_proto\nparams:\n  topic: otlp_logs\n  client_params:\n    bootstrap.servers: localhost:9092\n"
  },
  {
    "path": "config/tutorials/otel-logs/otel-values.yaml",
    "content": "mode: \"daemonset\"\npresets:\n  logsCollection:\n    enabled: true\n  kubernetesAttributes:\n    enabled: true\nconfig:\n  exporters:\n    otlp:\n      endpoint: quickwit-indexer.qw-tutorial.svc.cluster.local:7281\n      tls:\n        insecure: true\n  service:\n    pipelines:\n      logs:\n        exporters:\n          - otlp\n"
  },
  {
    "path": "config/tutorials/otel-traces/index-config.yaml",
    "content": "#\n# Index config file for receiving logs in OpenTelemetry format.\n# Link: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md\n#\n\nversion: 0.8\n\nindex_id: otel-trace-v0\n\ndoc_mapping:\n  mode: lenient\n  field_mappings:\n    - name: trace_id\n      type: bytes\n    - name: trace_state\n      type: text\n      indexed: false\n    - name: resource_attributes\n      type: json\n      tokenizer: raw\n    - name: resource_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: service_name\n      type: text\n      tokenizer: raw\n    - name: span_id\n      type: bytes\n    - name: span_kind\n      type: u64\n    - name: span_name\n      type: text\n      tokenizer: raw\n    - name: span_start_timestamp_secs\n      type: datetime\n      indexed: true\n      fast_precision: seconds\n      fast: true\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_secs\n    - name: span_start_timestamp_nanos\n      type: i64\n      indexed: false\n    - name: span_end_timestamp_nanos\n      type: i64\n      indexed: false\n    - name: span_duration_secs\n      type: i64\n      indexed: false\n    - name: span_attributes\n      type: json\n      tokenizer: raw\n    - name: span_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: span_dropped_events_count\n      type: u64\n      indexed: false\n    - name: span_dropped_links_count\n      type: u64\n      indexed: false\n    - name: span_status\n      type: json\n      indexed: false\n    - name: parent_span_id\n      type: bytes\n    - name: events\n      type: array<json>\n      tokenizer: raw\n    - name: links\n      type: array<json>\n      tokenizer: raw\n\n  timestamp_field: span_start_timestamp_secs\n\n  partition_key: service_name\n  max_num_partitions: 100\n\nindexing_settings:\n  commit_timeout_secs: 30\n\nsearch_settings:\n  default_search_fields: []\n"
  },
  {
    "path": "config/tutorials/otel-traces/kafka-source.yaml",
    "content": "version: 0.8\nsource_id: kafka-source\nsource_type: kafka\ninput_format: otlp_traces_proto\nparams:\n  topic: otlp_spans\n  client_params:\n    bootstrap.servers: localhost:9092\n"
  },
  {
    "path": "config/tutorials/stackoverflow/index-config.yaml",
    "content": "#\n# Index config file for stackoverflow dataset.\n#\nversion: 0.8\n\nindex_id: stackoverflow\n\ndoc_mapping:\n  field_mappings:\n    - name: title\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: creationDate\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: creationDate\n\nsearch_settings:\n  default_search_fields: [title, body]\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "config/tutorials/stackoverflow/pulsar-source.yaml",
    "content": "version: 0.8\nsource_id: pulsar-source\nsource_type: pulsar\nparams:\n  topics:\n    - quickwit/pulsar/stackoverflow\n  address: pulsar://localhost:6650\n\n"
  },
  {
    "path": "config/tutorials/stackoverflow/send_messages_to_pulsar.py",
    "content": "import json\nimport pulsar\n\nclient = pulsar.Client('pulsar://localhost:6650')\nproducer = client.create_producer('stackoverflow')\n\nwith open('stackoverflow.posts.transformed-10000.json', encoding='utf8') as file:\n   for i, line in enumerate(file):\n       producer.send(line.encode('utf-8'))\n       if i % 100 == 0:\n           print(f\"{i}/10000 messages sent.\", i)\n\nclient.close()\n"
  },
  {
    "path": "config/tutorials/vector-otel-logs/vector.toml",
    "content": "[sources.generate_syslog]\ntype = \"demo_logs\"\nformat = \"syslog\"\ncount = 100000\ninterval = 0.001\n\n[transforms.remap_syslog]\ninputs = [ \"generate_syslog\"]\ntype = \"remap\"\nsource = '''\n  structured = parse_syslog!(.message)\n  .timestamp_nanos, err = to_unix_timestamp(structured.timestamp, unit: \"nanoseconds\")\n  .body = structured\n  .service_name = structured.appname\n  .resource_attributes.source_type = .source_type\n  .resource_attributes.host.hostname = structured.hostname\n  .resource_attributes.service.name = structured.appname\n  .attributes.syslog.procid = structured.procid\n  .attributes.syslog.facility = structured.facility\n  .attributes.syslog.version = structured.version\n  .severity_text = if includes([\"emerg\", \"err\", \"crit\", \"alert\"], structured.severity) {\n    \"ERROR\"\n  } else if structured.severity == \"warning\" {\n    \"WARN\"\n  } else if structured.severity == \"debug\" {\n    \"DEBUG\"\n  } else if includes([\"info\", \"notice\"], structured.severity) {\n    \"INFO\"\n  } else {\n   structured.severity\n  }\n  .scope_name = structured.msgid\n  del(.message)\n  del(.timestamp)\n  del(.source_type)\n'''\n\n[sinks.emit_syslog]\ninputs = [\"remap_syslog\"]\ntype = \"console\"\nencoding.codec = \"json\"\n\n[sinks.quickwit_logs]\ntype = \"http\"\nmethod = \"post\"\ninputs = [\"remap_syslog\"]\nencoding.codec = \"json\"\nframing.method = \"newline_delimited\"\nuri = \"http://127.0.0.1:7280/api/v1/otel-logs-v0_7/ingest\"\n"
  },
  {
    "path": "config/tutorials/wikipedia/index-config.yaml",
    "content": "#\n# Index config file for wikipedia dataset.\n#\n\nversion: 0.8\n\nindex_id: wikipedia\n\ndoc_mapping:\n  field_mappings:\n    - name: title\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n      fieldnorms: true\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n      fieldnorms: true\n    - name: url\n      type: text\n      tokenizer: raw\n\nsearch_settings:\n  default_search_fields: [title, body]\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "config/tutorials/wikipedia/multilang-index-config.yaml",
    "content": "#\n# Index config file for multilang wikipedia datasets.\n#\n\nversion: 0.8\n\nindex_id: multilang-wikipedia\n\ndoc_mapping:\n  tokenizers:\n    - name: multilang\n      type: multilang\n  field_mappings:\n    - name: title\n      type: text\n      tokenizer: multilang\n      record: position\n      stored: true\n      fieldnorms: true\n    - name: body\n      type: text\n      tokenizer: multilang\n      record: position\n      stored: true\n      fieldnorms: true\n    - name: url\n      type: text\n      tokenizer: raw\n\nsearch_settings:\n  default_search_fields: [title, body]\n\nindexing_settings:\n  commit_timeout_secs: 10\n"
  },
  {
    "path": "distribution/docker/ubuntu/Dockerfile",
    "content": "FROM ubuntu:noble@sha256:66460d557b25769b102175144d538d88219c077c678a49af4afca6fbfc1b5252 AS builder\n\nRUN apt-get update && apt-get install -y curl\nRUN curl -L https://install.quickwit.io | sh\n\n\nFROM ubuntu:noble@sha256:66460d557b25769b102175144d538d88219c077c678a49af4afca6fbfc1b5252 AS quickwit\n\nLABEL org.opencontainers.image.title=\"Quickwit\"\nLABEL maintainer=\"Quickwit, Inc. <hello@quickwit.io>\"\nLABEL org.opencontainers.image.vendor=\"Quickwit, Inc.\"\nLABEL org.opencontainers.image.licenses=\"Apache-2.0\"\n\nRUN apt-get -y update \\\n    && apt-get -y install ca-certificates \\\n    libssl3 \\\n    && rm -rf /var/lib/apt/lists/*\n\nWORKDIR /quickwit\nRUN mkdir config qwdata\nCOPY --from=builder /quickwit-v*/quickwit /usr/local/bin/quickwit\nCOPY --from=builder /quickwit-v*/config/quickwit.yaml /quickwit/config/quickwit.yaml\n\nENV QW_CONFIG=/quickwit/config/quickwit.yaml\nENV QW_DATA_DIR=/quickwit/qwdata\nENV QW_LISTEN_ADDRESS=0.0.0.0\n\nRUN quickwit --version\n\nENTRYPOINT [\"quickwit\"]\n"
  },
  {
    "path": "distribution/ecs/.gitignore",
    "content": ".terraform\nterraform.tfstate*\n.terraform.tfstate*\nterraform.tfvars\n"
  },
  {
    "path": "distribution/ecs/README.md",
    "content": "# ECS deployment for quickwit\n\n## Run Quickwit in your infrastructure\n\nCreate a Quickwit module using:\n\n```terraform\nmodule \"quickwit\" {\n  source = \"github.com/quickwit-oss/quickwit/distribution/ecs/quickwit\"\n\n  vpc_id                       =       # VPC in which all resources will be created\n  subnet_ids                   = [...] # At least 2 private subnets must be specified\n  quickwit_ingress_cidr_blocks = [...] # List of CIDR blocks allowed to access to the Quickwit API\n}\n```\n\nThe Quickwit cluster is running on a private subnet. For ECS to pull the image:\n- if using the default Docker Hub image `quickwit/quickwit`, the subnets\nspecified must be configured with a NAT Gateway (no public IPs are attached to\nthe tasks)\n- if using an image hosted on ECR, a VPC endpoint for ECR can be used instead of\na NAT Gateway\n\n\n## Module configurations\n\nTo get the list of available configurations, check the `./quickwit/variables.tf`\nfile.\n\n### Tips\n\nMetastore database backups are disabled as restoring one would lead to\ninconsistencies with the index store on S3. To ensure high availability, you\nshould enable `rds_config.multi_az` instead. To use your own Postgres database\ninstead of creating a new RDS instance, configure the\n`external_postgres_uri_secret_arn` variable (e.g ARN of an SSM parameter with\nthe value `postgres://user:password@domain:port/db`).\n\nUsing NAT Gateways for the image registry is quite costly (approx. $0.05/hour/AZ). If\nyou are not already using NAT Gateways in the AZs where Quickwit will be\ndeployed, you should probably push the Quickwit image to ECR and use ECR\ninterface VPC endpoints instead (approx. ~$0.01/hour/AZ).\n\nWhen using the default image, you will quickly run into the Docker Hub rate\nlimiting. We recommend pushing the Quickwit image to ECR and configure that as\n`quickwit_image`. 
Note that the architecture of the image that you push to ECR\nmust match the `quickwit_cpu_architecture` variable (`ARM64` by default).\n\nSidecar container and custom logging configurations can be configured using the\nvariables `sidecar_container_definitions`, `sidecar_container_dependencies`,\n`log_configuration`, `enable_cloudwatch_logging`. See [custom log\nrouting](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_firelens.html).\n\nYou can use sidecars to inject additional secrets as files. This can be\nuseful for configuring sources such as Kafka. See `./example/kafka.tf` for an\nexample.\n\nTo access external AWS services like the Kinesis source, use the\n`quickwit_indexer.extra_task_policy_arns` variable to attach the necessary\nIAM policies to indexers.\n\n## Running the example stack\n\nWe provide an example of a self-contained deployment with an ad-hoc VPC.\n\n> [!IMPORTANT]\n> This stack costs ~$200/month to run (Fargate tasks, NAT Gateways\n> and RDS)\n\n### Deploy the Quickwit module and connect through a bastion\n\nTo make it easy to access your Quickwit cluster, the example stack includes\na bastion instance. 
Access is secured using an SSH key pair that you need to\nprovide (e.g. generated with `ssh-keygen -t ed25519`).\n\nIn the `./example` directory, create a `terraform.tfvars` file with the public\nkey of your SSH key pair:\n\n```terraform\nbastion_public_key = \"ssh-ed25519 ...\"\n```\n\n> [!NOTE]\n> You can skip the creation of the bastion by not specifying the\n> `bastion_public_key` variable, but that would make it hard to access and\n> experiment with the created Quickwit cluster.\n\nIn the same directory (`./example`) run:\n\n```bash\nterraform init\nterraform apply\n```\n\nA successful `apply` command should output the IP of the bastion EC2 instance.\nYou can port forward Quickwit's search UI using:\n\n```bash\nssh -N -L 7280:searcher.quickwit:7280 -i {your-private-key-file} ubuntu@{bastion_ip}\n```\n\nTo ingest an example dataset, log into the bastion:\n\n```bash\nssh -i {your-private-key-file} ubuntu@{bastion_ip}\n\n# create the log index\nwget https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/hdfs-logs/index-config.yaml\ncurl -X POST \\\n  -H \"content-type: application/yaml\" \\\n  --data-binary @index-config.yaml \\\n  http://indexer.quickwit:7280/api/v1/indexes\n\n# import some data\nwget https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json\ncurl -X POST \\\n  -H \"content-type: application/json\" \\\n  --data-binary @hdfs-logs-multitenants-10000.json \\\n  http://indexer.quickwit:7280/api/v1/hdfs-logs/ingest?commit=force\n```\n\nIf your SSH tunnel to the searcher is still running, you should be able to see\nthe ingested data in the UI.\n\n### Set up an ECR repository to avoid throttling from Docker Hub\n\nBy default, the example stack uses Docker Hub to pull the Quickwit image. This\nis convenient but it quickly runs into rate limiting. 
To avoid this, in the\n`terraform.tfvars` file, set the `dockerhub_pull_through_creds_secret_arn`\nvariable to the ARN of an AWS Secrets Manager secret with the following content:\n\n```json\n{\"username\":\"...\",\"accessToken\":\"...\"}\n```\n\nThis will:\n- provision an ECR repository and a pull through cache rule\n- configure the Quickwit module to use that repository\n"
  },
  {
    "path": "distribution/ecs/example/.terraform.lock.hcl",
    "content": "# This file is maintained automatically by \"terraform init\".\n# Manual edits may be lost in future updates.\n\nprovider \"registry.terraform.io/hashicorp/aws\" {\n  version     = \"5.39.1\"\n  constraints = \">= 4.66.1, >= 5.36.0, ~> 5.39.1\"\n  hashes = [\n    \"h1:hQLlAd6O1LdQHy1GdWtgT5fcOlc3TWW+SaaFkpe+e8E=\",\n    \"zh:05c50a5d8edb3ba4ebc4eb6e0d0b5e319142f5983b27821710ed7d475d335bdc\",\n    \"zh:082986a5784dd21957e632371b289e549f051a4ea21d5c78c6d744c3537f03c5\",\n    \"zh:192ae622ba562eacc4921ed549a794506179233d724fdd15a4f147f3400724a0\",\n    \"zh:19a1d4637a62de90b0da174c0bf01000cd900488f7e8f709d8a37f082c59756b\",\n    \"zh:1d7689a8583515f1705972d7ce57ccfab96215b19905530d2c78c02dcfaff583\",\n    \"zh:22c446a21209a52ab74b4ba1ede0b220531e97ce479430047e493a2c45e1d8cb\",\n    \"zh:4154de82290ab4e9f81bac1ea62342de8b3b7a608f99258c190d4dd1c6663e47\",\n    \"zh:6bc4859ccdc54f28af9286b2fa090a31dcb345138d68c471510b737f6a052011\",\n    \"zh:73c69e000e0b321e78a4a12fef60d37285f2afec0ea7be9e06163d985101cb59\",\n    \"zh:890a3422f5e445b49bae30facf448d0ec9cd647e9155d0b685b5b39e9d331a94\",\n    \"zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425\",\n    \"zh:9cd88bec0f5205df9032e3126d4e57edd1c5cc8d45cda25626882dafc485a3b0\",\n    \"zh:a3a8e3276d0fbf051bbafa192a2998b05745f2cf285ac8c36a9ad167a75c037f\",\n    \"zh:d47e4dcf4c0ad71b9a7c720be4f3a89f6786a82e77bbe8d950794562792a1da5\",\n    \"zh:f74e5b2af508c7de80a6ae5198df54a795eeba5058a0cd247828943f0c54f6e0\",\n  ]\n}\n\nprovider \"registry.terraform.io/hashicorp/random\" {\n  version     = \"3.6.0\"\n  constraints = \">= 3.1.0\"\n  hashes = [\n    \"h1:R5Ucn26riKIEijcsiOMBR3uOAjuOMfI1x7XvH4P6B1w=\",\n    \"zh:03360ed3ecd31e8c5dac9c95fe0858be50f3e9a0d0c654b5e504109c2159287d\",\n    \"zh:1c67ac51254ba2a2bb53a25e8ae7e4d076103483f55f39b426ec55e47d1fe211\",\n    \"zh:24a17bba7f6d679538ff51b3a2f378cedadede97af8a1db7dad4fd8d6d50f829\",\n    
\"zh:30ffb297ffd1633175d6545d37c2217e2cef9545a6e03946e514c59c0859b77d\",\n    \"zh:454ce4b3dbc73e6775f2f6605d45cee6e16c3872a2e66a2c97993d6e5cbd7055\",\n    \"zh:78d5eefdd9e494defcb3c68d282b8f96630502cac21d1ea161f53cfe9bb483b3\",\n    \"zh:91df0a9fab329aff2ff4cf26797592eb7a3a90b4a0c04d64ce186654e0cc6e17\",\n    \"zh:aa57384b85622a9f7bfb5d4512ca88e61f22a9cea9f30febaa4c98c68ff0dc21\",\n    \"zh:c4a3e329ba786ffb6f2b694e1fd41d413a7010f3a53c20b432325a94fa71e839\",\n    \"zh:e2699bc9116447f96c53d55f2a00570f982e6f9935038c3810603572693712d0\",\n    \"zh:e747c0fd5d7684e5bfad8aa0ca441903f15ae7a98a737ff6aca24ba223207e2c\",\n    \"zh:f1ca75f417ce490368f047b63ec09fd003711ae48487fba90b4aba2ccf71920e\",\n  ]\n}\n"
  },
  {
    "path": "distribution/ecs/example/bastion.tf",
    "content": "variable \"bastion_public_key\" {\n  description = \"The public key used to connect to the bastion host. If empty, no bastion is created.\"\n  default     = \"\"\n}\n\noutput \"bastion_ip\" {\n  value = var.bastion_public_key != \"\" ? aws_instance.bastion[0].public_ip : null\n}\n\ndata \"aws_ami\" \"ubuntu\" {\n  most_recent = true\n\n  filter {\n    name   = \"name\"\n    values = [\"ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*\"]\n  }\n\n  filter {\n    name   = \"virtualization-type\"\n    values = [\"hvm\"]\n  }\n\n  owners = [\"099720109477\"] # Canonical\n}\n\nresource \"aws_security_group\" \"allow_ssh\" {\n  count       = var.bastion_public_key != \"\" ? 1 : 0\n  name        = \"qw_ecs_bastion_allow_ssh\"\n  description = \"Allow SSH inbound traffic from everywhere\"\n  vpc_id      = module.vpc.vpc_id\n\n  ingress {\n    from_port   = 22\n    to_port     = 22\n    protocol    = \"tcp\"\n    cidr_blocks = [\"0.0.0.0/0\"]\n  }\n\n  egress {\n    from_port   = 0\n    to_port     = 0\n    protocol    = \"-1\"\n    cidr_blocks = [\"0.0.0.0/0\"]\n  }\n}\n\nresource \"aws_instance\" \"bastion\" {\n  count                       = var.bastion_public_key != \"\" ? 1 : 0\n  ami                         = data.aws_ami.ubuntu.id\n  instance_type               = \"t3.nano\"\n  key_name                    = aws_key_pair.bastion_key[0].key_name\n  subnet_id                   = module.vpc.public_subnets[0]\n  associate_public_ip_address = true\n  vpc_security_group_ids      = [aws_security_group.allow_ssh[0].id]\n\n  tags = {\n    Name = \"quickwit-ecs-bastion\"\n  }\n}\n\nresource \"aws_key_pair\" \"bastion_key\" {\n  count      = var.bastion_public_key != \"\" ? 1 : 0\n  key_name   = \"quickwit-ecs-bastion-key\"\n  public_key = var.bastion_public_key\n}\n"
  },
  {
    "path": "distribution/ecs/example/image.tf",
    "content": "variable \"dockerhub_pull_through_creds_secret_arn\" {\n  description = \"If left empty, image is pulled directly from Docker Hub, which might be throttled.\"\n  default     = \"\"\n}\n\nlocals {\n  ecr_repository_prefix = \"quickwit-ecs-example\"\n}\n\n# This repo is populated by the pull through cache below\nresource \"aws_ecr_repository\" \"quickwit\" {\n  count                = var.dockerhub_pull_through_creds_secret_arn == \"\" ? 0 : 1\n  name                 = \"${local.ecr_repository_prefix}/quickwit/quickwit\"\n  image_tag_mutability = \"MUTABLE\"\n  force_delete         = true\n  image_scanning_configuration {\n    scan_on_push = false\n  }\n}\n\nresource \"aws_ecr_pull_through_cache_rule\" \"docker_hub\" {\n  count                 = var.dockerhub_pull_through_creds_secret_arn == \"\" ? 0 : 1\n  ecr_repository_prefix = local.ecr_repository_prefix\n  upstream_registry_url = \"registry-1.docker.io\"\n  credential_arn        = var.dockerhub_pull_through_creds_secret_arn\n}\n\n\nlocals {\n  ecr_domain     = \"${data.aws_caller_identity.current.account_id}.dkr.ecr.${data.aws_region.current.name}.amazonaws.com\"\n  image_prefix   = var.dockerhub_pull_through_creds_secret_arn == \"\" ? \"\" : \"${local.ecr_domain}/${local.ecr_repository_prefix}/\"\n  quickwit_image = \"${local.image_prefix}quickwit/quickwit\"\n}\n"
  },
  {
    "path": "distribution/ecs/example/kafka.tf",
    "content": "# Example configuration for injecting SSL keys for securing a Kafka connection\n# You can then create a secured Kafka source along these lines:\n# \n# version: 0.8\n# source_id: kafka-source\n# source_type: kafka\n# num_pipelines: 2\n# params:\n#   topic: your-topic\n#   client_params:\n#     bootstrap.servers: \"your-kafka-broker.com\"\n#     security.protocol: \"SSL\"\n#     ssl.ca.location: \"/quickwit/keys/ca.pem\"\n#     ssl.certificate.location: \"/quickwit/keys/service.cert\"\n#     ssl.key.location: \"/quickwit/keys/service.key\"\n\n\nlocals {\n  ca_pem       = \"echo \\\"$CA_PEM\\\" > /quickwit/cfg/ca.pem\"\n  service_cert = \"echo \\\"$SERVICE_CERT\\\" > /quickwit/cfg/service.cert\"\n  service_key  = \"echo \\\"$SERVICE_KEY\\\" > /quickwit/cfg/service.key\"\n  example_kafka_sidecar_container_definitions = {\n    kafka_key_init = {\n      name                      = \"kafka_key_init\"\n      essential                 = false\n      image                     = \"busybox\"\n      command                   = [\"sh\", \"-c\", \"${local.ca_pem} && ${local.service_cert} && ${local.service_key}\"]\n      enable_cloudwatch_logging = true\n      mount_points = [\n        {\n          sourceVolume  = \"quickwit-keys\"\n          containerPath = \"/quickwit/keys\"\n        }\n      ]\n      secrets = [\n        {\n          name      = \"CA_PEM\"\n          valueFrom = \"arn:aws:secretsmanager:eu-west-1:123456789:secret:your_kafka_ca_pem\"\n        },\n        {\n          name      = \"SERVICE_CERT\"\n          valueFrom = \"arn:aws:secretsmanager:eu-west-1:123456789:secret:your_kafka_service_cert\"\n        },\n        {\n          name      = \"SERVICE_KEY\"\n          valueFrom = \"arn:aws:secretsmanager:eu-west-1:123456789:secret:your_kafka_service_key\"\n        }\n      ]\n    }\n  }\n\n  example_kafka_sidecar_container_dependencies = [\n    {\n      condition     = \"SUCCESS\"\n      containerName = \"kafka_key_init\"\n    }\n  ]\n}\n"
  },
  {
    "path": "distribution/ecs/example/terraform.tf",
    "content": "terraform {\n  backend \"local\" {}\n  required_providers {\n    aws = {\n      source  = \"hashicorp/aws\"\n      version = \"~> 5.39.1\"\n    }\n  }\n}\n\nprovider \"aws\" {\n  region = \"eu-west-1\"\n  default_tags {\n    tags = {\n      provisioner = \"terraform\"\n    }\n  }\n}\n\ndata \"aws_region\" \"current\" {}\n\ndata \"aws_caller_identity\" \"current\" {}\n\nmodule \"quickwit\" {\n  source                       = \"../quickwit\"\n  vpc_id                       = module.vpc.vpc_id\n  subnet_ids                   = module.vpc.private_subnets\n  quickwit_ingress_cidr_blocks = [module.vpc.vpc_cidr_block]\n\n  ## Optional configurations:\n\n  # - ECR if you provide the `dockerhub_pull_through_creds_secret_arn` variable\n  # - Docker Hub otherwise (subject to throttling)\n  quickwit_image = \"${local.quickwit_image}:latest\"\n\n  # quickwit_index_s3_prefix  = \"my-bucket/my-prefix\"\n  # quickwit_domain           = \"quickwit\"\n  # quickwit_cpu_architecture = \"ARM64\"\n\n  # quickwit_indexer = {\n  #   desired_count         = 3\n  #   memory                = 8192\n  #   cpu                   = 4096\n  #   ephemeral_storage_gib = 50\n  #   extra_task_policy_arns = [\"arn:aws:iam::aws:policy/AmazonKinesisFullAccess\"]\n  # }\n\n  # quickwit_metastore = {\n  #   desired_count = 1\n  #   memory        = 512\n  #   cpu           = 256\n  # }\n\n  # quickwit_searcher = {\n  #   desired_count         = 1\n  #   memory                = 2048\n  #   cpu                   = 1024\n  # }\n\n  # quickwit_control_plane = {\n  #   memory = 512\n  #   cpu    = 256\n  # }\n\n  # quickwit_janitor = {\n  #   memory = 512\n  #   cpu    = 256\n  # }\n\n  # rds_config = {\n  #   instance_class = \"db.t4g.micro\"\n  #   multi_az       = false\n  # }\n\n  # external_postgres_uri_secret_arn = aws_ssm_parameter.postgres_uri.arn\n\n  ## Example logging configuration \n  # sidecar_container_definitions  = {\n  #   my_sidecar_container = see 
http://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ContainerDefinition.html\n  # }\n  # sidecar_container_dependencies = [{condition = \"START\", containerName = \"my_sidecar_container\"}]\n  # log_configuration              = see https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ecs_service#log_configuration\n  # enable_cloudwatch_logging      = false\n\n  ## Example Kafka key injection (see kafka.tf)\n  # sidecar_container_definitions  = local.example_kafka_sidecar_container_definitions\n  # sidecar_container_dependencies = local.example_kafka_sidecar_container_dependencies\n}\n\n\noutput \"indexer_service_name\" {\n  value = module.quickwit.indexer_service_name\n}\n\noutput \"searcher_service_name\" {\n  value = module.quickwit.searcher_service_name\n}\n"
  },
  {
    "path": "distribution/ecs/example/vpc.tf",
    "content": "module \"vpc\" {\n  source  = \"terraform-aws-modules/vpc/aws\"\n  version = \"5.5.3\"\n\n  name = \"quickwit-ecs\"\n  cidr = \"10.0.0.0/16\"\n\n  azs             = [\"${data.aws_region.current.name}a\", \"${data.aws_region.current.name}b\"]\n  private_subnets = [\"10.0.1.0/24\", \"10.0.2.0/24\"]\n  public_subnets  = [\"10.0.101.0/24\", \"10.0.102.0/24\"]\n\n  enable_nat_gateway = true\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/cluster.tf",
    "content": "module \"ecs_cluster\" {\n  source  = \"terraform-aws-modules/ecs/aws//modules/cluster\"\n  version = \"5.9.3\"\n\n  cluster_name = \"quickwit-${local.module_id}\"\n}\n\nresource \"aws_service_discovery_private_dns_namespace\" \"quickwit_internal\" {\n  name        = var.quickwit_domain\n  description = \"Internal quickwit domain\"\n  vpc         = var.vpc_id\n}\n\nresource \"aws_security_group\" \"quickwit_cluster_member_sg\" {\n  name        = \"quickwit-cluster-member-${local.module_id}\"\n  description = \"Security group for members of the Quickwit cluster\"\n  vpc_id      = var.vpc_id\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/configs.tf",
    "content": "locals {\n  quickwit_peer_list = [\n    \"${aws_service_discovery_service.metastore.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\",\n    \"${aws_service_discovery_service.control_plane.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\",\n    \"${aws_service_discovery_service.janitor.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\",\n    \"${aws_service_discovery_service.indexer.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\",\n    \"${aws_service_discovery_service.searcher.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\",\n  ]\n\n  # id to avoid conflicts when deploying this module multiple times (random by default)\n  module_id = var.module_id == \"\" ? random_id.module.hex : var.module_id\n  s3_id     = var.module_id == \"\" ? random_id.module.hex : \"${var.module_id}-${random_id.module.hex}\"\n\n  quickwit_index_s3_prefix = var.quickwit_index_s3_prefix == \"\" ? aws_s3_bucket.index[0].id : var.quickwit_index_s3_prefix\n\n  use_external_rds        = var.external_postgres_uri_secret_arn != \"\"\n  postgres_uri_secret_arn = var.external_postgres_uri_secret_arn != \"\" ? var.external_postgres_uri_secret_arn : aws_ssm_parameter.postgres_credential[0].arn\n}\n\nresource \"random_id\" \"module\" {\n  byte_length = 3\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/iam.tf",
    "content": "data \"aws_iam_policy_document\" \"quickwit_task_permission\" {\n  # Reference: https://quickwit.io/docs/guides/aws-setup#amazon-s3\n  statement {\n    actions = [\n      \"s3:ListBucket\",\n      \"s3:GetObject\",\n      \"s3:PutObject\",\n      \"s3:DeleteObject\"\n    ]\n\n    resources = [\n      \"arn:aws:s3:::${local.quickwit_index_s3_prefix}*\",\n    ]\n  }\n}\n\nresource \"aws_iam_policy\" \"quickwit_task_permission\" {\n  name = \"quickwit-task-policy-${local.module_id}\"\n  path = \"/\"\n\n  policy = data.aws_iam_policy_document.quickwit_task_permission.json\n}\n\ndata \"aws_iam_policy_document\" \"quickwit_task_execution_permission\" {\n  statement {\n    actions = [\n      \"logs:PutLogEvents\",\n      \"logs:CreateLogStream\"\n    ]\n\n    resources = [\"*\"]\n  }\n  statement {\n    actions = [\n      \"ecr:GetDownloadUrlForLayer\",\n      \"ecr:GetAuthorizationToken\",\n      \"ecr:BatchGetImage\",\n      \"ecr:BatchCheckLayerAvailability\",\n      \"ecr:CreateRepository\",\n      \"ecr:BatchImportUpstreamImage\"\n    ]\n\n    resources = [\"*\"]\n  }\n\n  statement {\n    actions = [\"ssm:GetParameters\"]\n\n    resources = [local.postgres_uri_secret_arn]\n  }\n\n  statement {\n    actions = [\"secretsmanager:GetSecretValue\"]\n\n    resources = [\"arn:aws:secretsmanager:*:*:secret:*\"]\n  }\n\n}\n\nresource \"aws_iam_policy\" \"quickwit_task_execution_permission\" {\n  name = \"quickwit-task-execution-policy-${local.module_id}\"\n  path = \"/\"\n\n  policy = data.aws_iam_policy_document.quickwit_task_execution_permission.json\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/outputs.tf",
    "content": "output \"indexer_service_name\" {\n  value = \"${aws_service_discovery_service.indexer.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\"\n}\n\noutput \"searcher_service_name\" {\n  value = \"${aws_service_discovery_service.searcher.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\"\n}\n\noutput \"janitor_service_name\" {\n  value = \"${aws_service_discovery_service.janitor.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\"\n}\n\noutput \"control_plane_service_name\" {\n  value = \"${aws_service_discovery_service.control_plane.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\"\n}\n\noutput \"metastore_service_name\" {\n  value = \"${aws_service_discovery_service.metastore.name}.${aws_service_discovery_private_dns_namespace.quickwit_internal.name}\"\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/quickwit-control-plane.tf",
    "content": "module \"quickwit_control_plane\" {\n  source                         = \"./service\"\n  service_name                   = \"control_plane\"\n  service_discovery_registry_arn = aws_service_discovery_service.control_plane.arn\n  cluster_arn                    = module.ecs_cluster.arn\n  postgres_uri_secret_arn        = local.postgres_uri_secret_arn\n  quickwit_peer_list             = local.quickwit_peer_list\n  s3_access_policy_arn           = aws_iam_policy.quickwit_task_permission.arn\n  task_execution_policy_arn      = aws_iam_policy.quickwit_task_execution_permission.arn\n  module_id                      = local.module_id\n  quickwit_cluster_member_sg_id  = aws_security_group.quickwit_cluster_member_sg.id\n\n  subnet_ids                     = var.subnet_ids\n  ingress_cidr_blocks            = var.quickwit_ingress_cidr_blocks\n  quickwit_image                 = var.quickwit_image\n  quickwit_cpu_architecture      = var.quickwit_cpu_architecture\n  sidecar_container_definitions  = var.sidecar_container_definitions\n  sidecar_container_dependencies = var.sidecar_container_dependencies\n  log_configuration              = var.log_configuration\n  enable_cloudwatch_logging      = var.enable_cloudwatch_logging\n  service_config                 = var.quickwit_control_plane\n  quickwit_index_s3_prefix       = local.quickwit_index_s3_prefix\n}\n\nresource \"aws_service_discovery_service\" \"control_plane\" {\n  name = \"control-plane\"\n\n  dns_config {\n    namespace_id = aws_service_discovery_private_dns_namespace.quickwit_internal.id\n\n    dns_records {\n      ttl  = 10\n      type = \"A\"\n    }\n\n    routing_policy = \"MULTIVALUE\"\n  }\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/quickwit-indexer.tf",
    "content": "module \"quickwit_indexer\" {\n  source                         = \"./service\"\n  service_name                   = \"indexer\"\n  service_discovery_registry_arn = aws_service_discovery_service.indexer.arn\n  cluster_arn                    = module.ecs_cluster.arn\n  postgres_uri_secret_arn        = local.postgres_uri_secret_arn\n  quickwit_peer_list             = local.quickwit_peer_list\n  s3_access_policy_arn           = aws_iam_policy.quickwit_task_permission.arn\n  task_execution_policy_arn      = aws_iam_policy.quickwit_task_execution_permission.arn\n  module_id                      = local.module_id\n  quickwit_cluster_member_sg_id  = aws_security_group.quickwit_cluster_member_sg.id\n\n  subnet_ids                     = var.subnet_ids\n  ingress_cidr_blocks            = var.quickwit_ingress_cidr_blocks\n  quickwit_image                 = var.quickwit_image\n  quickwit_cpu_architecture      = var.quickwit_cpu_architecture\n  sidecar_container_definitions  = var.sidecar_container_definitions\n  sidecar_container_dependencies = var.sidecar_container_dependencies\n  log_configuration              = var.log_configuration\n  enable_cloudwatch_logging      = var.enable_cloudwatch_logging\n  service_config                 = var.quickwit_indexer\n  quickwit_index_s3_prefix       = local.quickwit_index_s3_prefix\n  # Longer termination grace period for indexers because we are waiting for the\n  # data persisted in the ingesters to be indexed and committed. Should be\n  # larger than the largest commit timeout.\n  stop_timeout = 120\n}\n\nresource \"aws_service_discovery_service\" \"indexer\" {\n  name = \"indexer\"\n\n  dns_config {\n    namespace_id = aws_service_discovery_private_dns_namespace.quickwit_internal.id\n\n    dns_records {\n      ttl  = 10\n      type = \"A\"\n    }\n\n    routing_policy = \"MULTIVALUE\"\n  }\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/quickwit-janitor.tf",
    "content": "module \"quickwit_janitor\" {\n  source                         = \"./service\"\n  service_name                   = \"janitor\"\n  service_discovery_registry_arn = aws_service_discovery_service.janitor.arn\n  cluster_arn                    = module.ecs_cluster.arn\n  postgres_uri_secret_arn        = local.postgres_uri_secret_arn\n  quickwit_peer_list             = local.quickwit_peer_list\n  s3_access_policy_arn           = aws_iam_policy.quickwit_task_permission.arn\n  task_execution_policy_arn      = aws_iam_policy.quickwit_task_execution_permission.arn\n  module_id                      = local.module_id\n  quickwit_cluster_member_sg_id  = aws_security_group.quickwit_cluster_member_sg.id\n\n  subnet_ids                     = var.subnet_ids\n  ingress_cidr_blocks            = var.quickwit_ingress_cidr_blocks\n  quickwit_image                 = var.quickwit_image\n  quickwit_cpu_architecture      = var.quickwit_cpu_architecture\n  sidecar_container_definitions  = var.sidecar_container_definitions\n  sidecar_container_dependencies = var.sidecar_container_dependencies\n  log_configuration              = var.log_configuration\n  enable_cloudwatch_logging      = var.enable_cloudwatch_logging\n  service_config                 = var.quickwit_janitor\n  quickwit_index_s3_prefix       = local.quickwit_index_s3_prefix\n}\n\nresource \"aws_service_discovery_service\" \"janitor\" {\n  name = \"janitor\"\n\n  dns_config {\n    namespace_id = aws_service_discovery_private_dns_namespace.quickwit_internal.id\n\n    dns_records {\n      ttl  = 10\n      type = \"A\"\n    }\n\n    routing_policy = \"MULTIVALUE\"\n  }\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/quickwit-metastore.tf",
    "content": "module \"quickwit_metastore\" {\n  source                         = \"./service\"\n  service_name                   = \"metastore\"\n  service_discovery_registry_arn = aws_service_discovery_service.metastore.arn\n  cluster_arn                    = module.ecs_cluster.arn\n  postgres_uri_secret_arn        = local.postgres_uri_secret_arn\n  quickwit_peer_list             = local.quickwit_peer_list\n  s3_access_policy_arn           = aws_iam_policy.quickwit_task_permission.arn\n  task_execution_policy_arn      = aws_iam_policy.quickwit_task_execution_permission.arn\n  module_id                      = local.module_id\n  quickwit_cluster_member_sg_id  = aws_security_group.quickwit_cluster_member_sg.id\n\n  subnet_ids                     = var.subnet_ids\n  ingress_cidr_blocks            = var.quickwit_ingress_cidr_blocks\n  quickwit_image                 = var.quickwit_image\n  quickwit_cpu_architecture      = var.quickwit_cpu_architecture\n  sidecar_container_definitions  = var.sidecar_container_definitions\n  sidecar_container_dependencies = var.sidecar_container_dependencies\n  log_configuration              = var.log_configuration\n  enable_cloudwatch_logging      = var.enable_cloudwatch_logging\n  service_config                 = var.quickwit_metastore\n  quickwit_index_s3_prefix       = local.quickwit_index_s3_prefix\n}\n\nresource \"aws_service_discovery_service\" \"metastore\" {\n  name = \"metastore\"\n\n  dns_config {\n    namespace_id = aws_service_discovery_private_dns_namespace.quickwit_internal.id\n\n    dns_records {\n      ttl  = 10\n      type = \"A\"\n    }\n\n    routing_policy = \"MULTIVALUE\"\n  }\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/quickwit-searcher.tf",
    "content": "module \"quickwit_searcher\" {\n  source                         = \"./service\"\n  service_name                   = \"searcher\"\n  service_discovery_registry_arn = aws_service_discovery_service.searcher.arn\n  cluster_arn                    = module.ecs_cluster.arn\n  postgres_uri_secret_arn        = local.postgres_uri_secret_arn\n  quickwit_peer_list             = local.quickwit_peer_list\n  s3_access_policy_arn           = aws_iam_policy.quickwit_task_permission.arn\n  task_execution_policy_arn      = aws_iam_policy.quickwit_task_execution_permission.arn\n  module_id                      = local.module_id\n  quickwit_cluster_member_sg_id  = aws_security_group.quickwit_cluster_member_sg.id\n\n  subnet_ids                     = var.subnet_ids\n  ingress_cidr_blocks            = var.quickwit_ingress_cidr_blocks\n  quickwit_image                 = var.quickwit_image\n  quickwit_cpu_architecture      = var.quickwit_cpu_architecture\n  sidecar_container_definitions  = var.sidecar_container_definitions\n  sidecar_container_dependencies = var.sidecar_container_dependencies\n  log_configuration              = var.log_configuration\n  enable_cloudwatch_logging      = var.enable_cloudwatch_logging\n  service_config                 = var.quickwit_searcher\n  quickwit_index_s3_prefix       = local.quickwit_index_s3_prefix\n}\n\nresource \"aws_service_discovery_service\" \"searcher\" {\n  name = \"searcher\"\n\n  dns_config {\n    namespace_id = aws_service_discovery_private_dns_namespace.quickwit_internal.id\n\n    dns_records {\n      ttl  = 10\n      type = \"A\"\n    }\n\n    routing_policy = \"MULTIVALUE\"\n  }\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/rds.tf",
    "content": "resource \"random_password\" \"quickwit_db\" {\n  count   = local.use_external_rds ? 0 : 1\n  length  = 64\n  special = false\n}\n\nmodule \"quickwit_db\" {\n  count   = local.use_external_rds ? 0 : 1\n  source  = \"terraform-aws-modules/rds/aws\"\n  version = \"6.5.2\"\n\n  identifier = \"quickwit-metastore-${local.module_id}\"\n\n  engine               = \"postgres\"\n  engine_version       = \"16\"\n  family               = \"postgres16\" # DB parameter group\n  major_engine_version = \"16\"         # DB option group\n\n  instance_class    = var.rds_config.instance_class\n  multi_az          = var.rds_config.multi_az\n  allocated_storage = 5\n\n  db_name  = \"quickwit\"\n  username = \"quickwit\"\n  password = random_password.quickwit_db[0].result\n\n  port                                = \"5432\"\n  publicly_accessible                 = false\n  manage_master_user_password         = false\n  iam_database_authentication_enabled = true\n  vpc_security_group_ids              = [aws_security_group.quickwit_db[0].id]\n  db_subnet_group_name                = aws_db_subnet_group.quickwit[0].name\n\n  maintenance_window = \"Mon:00:00-Mon:03:00\"\n\n  create_monitoring_role = true\n  monitoring_interval    = \"30\"\n  monitoring_role_name   = \"RDSQuickwitMonitoringRole-${local.module_id}\"\n\n  deletion_protection = false\n  skip_final_snapshot = true\n}\n\nresource \"aws_security_group\" \"quickwit_db\" {\n  count       = local.use_external_rds ? 0 : 1\n  name        = \"quickwit-db-${local.module_id}\"\n  description = \"Security group for the Quickwit Metastore DB\"\n  vpc_id      = var.vpc_id\n\n  ingress {\n    description     = \"Connection from explicitly allowed resources\"\n    from_port       = 5432\n    to_port         = 5432\n    protocol        = \"tcp\"\n    security_groups = [aws_security_group.quickwit_cluster_member_sg.id]\n  }\n}\n\nresource \"aws_db_subnet_group\" \"quickwit\" {\n  count       = local.use_external_rds ? 
0 : 1\n  name        = \"quickwit-${local.module_id}\"\n  description = \"Quickwit metastore\"\n  subnet_ids  = var.subnet_ids\n}\n\nresource \"aws_ssm_parameter\" \"postgres_credential\" {\n  count = local.use_external_rds ? 0 : 1\n  name  = \"/quickwit/${local.module_id}/postgres\"\n  type  = \"SecureString\"\n  value = \"postgres://${module.quickwit_db[0].db_instance_username}:${random_password.quickwit_db[0].result}@${module.quickwit_db[0].db_instance_address}:${module.quickwit_db[0].db_instance_port}/${module.quickwit_db[0].db_instance_name}\"\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/s3.tf",
    "content": "data \"aws_caller_identity\" \"current\" {}\n\nresource \"aws_s3_bucket\" \"index\" {\n  count         = var.quickwit_index_s3_prefix == \"\" ? 1 : 0\n  bucket        = \"quickwit-ecs-index-${data.aws_caller_identity.current.account_id}-${local.s3_id}\"\n  force_destroy = true\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/service/config.tf",
    "content": "locals {\n  quickwit_data_dir = \"/quickwit/qwdata\"\n\n  quickwit_common_environment = [\n    {\n      name  = \"QW_ENABLED_SERVICES\"\n      value = var.service_name\n    },\n    {\n      name  = \"QW_PEER_SEEDS\"\n      value = join(\",\", var.quickwit_peer_list)\n    },\n    {\n      name  = \"NO_COLOR\"\n      value = \"true\"\n    },\n    {\n      name  = \"QW_CLUSTER_ID\"\n      value = \"ecs-${var.module_id}\"\n    },\n    {\n      name  = \"QW_LISTEN_ADDRESS\"\n      value = \"0.0.0.0\"\n    },\n    {\n      name  = \"QW_DATA_DIR\"\n      value = local.quickwit_data_dir\n    },\n    {\n      name  = \"QW_DEFAULT_INDEX_ROOT_URI\"\n      value = \"s3://${var.quickwit_index_s3_prefix}\"\n    },\n  ]\n\n  nb_extra_policies             = length(var.service_config.extra_task_policy_arns)\n  extra_tasks_iam_role_policies = { for i in range(local.nb_extra_policies) : \"extra_policy_${i}\" => var.service_config.extra_task_policy_arns[i] }\n  tasks_iam_role_policies       = merge({ s3_access = var.s3_access_policy_arn }, local.extra_tasks_iam_role_policies)\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/service/ecs.tf",
    "content": "module \"quickwit_service\" {\n  source  = \"terraform-aws-modules/ecs/aws//modules/service\"\n  version = \"5.9.3\"\n\n  name        = \"quickwit-${var.service_name}-${var.module_id}\"\n  cluster_arn = var.cluster_arn\n\n  cpu    = var.service_config.cpu\n  memory = var.service_config.memory\n  ephemeral_storage = {\n    size_in_gib = var.service_config.ephemeral_storage_gib\n  }\n\n  container_definitions = merge(var.sidecar_container_definitions, {\n    quickwit = {\n      cpu    = var.service_config.cpu\n      memory = var.service_config.memory\n\n      essential                 = true\n      image                     = var.quickwit_image\n      enable_cloudwatch_logging = var.enable_cloudwatch_logging\n\n      command = [\"run\"]\n\n      environment = local.quickwit_common_environment\n\n      secrets = [\n        {\n          name      = \"QW_METASTORE_URI\"\n          valueFrom = var.postgres_uri_secret_arn\n        }\n      ]\n\n      port_mappings = [\n        {\n          name          = \"rest\"\n          containerPort = 7280\n          protocol      = \"tcp\"\n        },\n        {\n          name          = \"grpc\"\n          containerPort = 7281\n          protocol      = \"tcp\"\n        },\n        {\n          name          = \"gossip\"\n          containerPort = 7280\n          protocol      = \"udp\"\n        }\n      ]\n\n      log_configuration = var.log_configuration\n\n      mount_points = [\n        {\n          sourceVolume  = \"quickwit-data-vol\"\n          containerPath = local.quickwit_data_dir\n        },\n        # A volume that can be used to inject secrets as files.\n        {\n          sourceVolume  = \"quickwit-keys\"\n          containerPath = \"/quickwit/keys\"\n        }\n      ]\n\n      stopTimeout = var.stop_timeout\n\n      dependencies = var.sidecar_container_dependencies\n    }\n  })\n\n  requires_compatibilities = [\"FARGATE\"]\n  runtime_platform = {\n    operating_system_family = \"LINUX\"\n    
cpu_architecture        = var.quickwit_cpu_architecture\n  }\n\n  service_registries = {\n    registry_arn   = var.service_discovery_registry_arn\n    container_name = \"quickwit\"\n  }\n\n  subnet_ids = var.subnet_ids\n  security_group_rules = {\n    ingress_internal = {\n      type      = \"ingress\"\n      from_port = 7280\n      to_port   = 7281\n      protocol  = \"-1\"\n\n      source_security_group_id = var.quickwit_cluster_member_sg_id\n    }\n    ingress_external = {\n      type      = \"ingress\"\n      from_port = 7280\n      to_port   = 7281\n      protocol  = \"-1\"\n\n      cidr_blocks = var.ingress_cidr_blocks\n    }\n    egress_all = {\n      type      = \"egress\"\n      from_port = 0\n      to_port   = 0\n      protocol  = \"-1\"\n\n      cidr_blocks = [\"0.0.0.0/0\"]\n    }\n  }\n  security_group_ids = [var.quickwit_cluster_member_sg_id]\n\n  enable_autoscaling = false\n  desired_count      = var.service_config.desired_count\n\n  volume = [\n    {\n      name = \"quickwit-data-vol\"\n    },\n    {\n      name = \"quickwit-keys\"\n    }\n  ]\n\n  tasks_iam_role_policies = local.tasks_iam_role_policies\n\n  task_exec_iam_role_policies = {\n    policy = var.task_execution_policy_arn\n  }\n\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/service/variables.tf",
    "content": "variable \"service_name\" {\n  description = \"One of indexer, metastore, searcher, control_plane, janitor\"\n}\n\nvariable \"service_discovery_registry_arn\" {}\n\nvariable \"sidecar_container_definitions\" {}\n\nvariable \"sidecar_container_dependencies\" {\n  type = list(object({\n    containerName = string\n    condition     = string\n  }))\n  default = []\n}\n\nvariable \"log_configuration\" {}\n\nvariable \"enable_cloudwatch_logging\" {\n  type = bool\n}\n\nvariable \"cluster_arn\" {}\n\nvariable \"ingress_cidr_blocks\" {\n  type = list(string)\n}\n\nvariable \"quickwit_cluster_member_sg_id\" {}\n\nvariable \"subnet_ids\" {\n  type = list(string)\n}\n\nvariable \"postgres_uri_secret_arn\" {\n  description = \"ARN of the SSM parameter or Secret Manager secret containing the URI of a Postgres instance\"\n}\n\nvariable \"quickwit_image\" {}\n\nvariable \"service_config\" {\n  type = object({\n    desired_count          = optional(number, 1)\n    memory                 = number\n    cpu                    = number\n    ephemeral_storage_gib  = optional(number, 21)\n    extra_task_policy_arns = optional(list(string), [])\n  })\n}\n\nvariable \"quickwit_index_s3_prefix\" {}\n\nvariable \"quickwit_peer_list\" {\n  type = list(string)\n}\n\nvariable \"s3_access_policy_arn\" {}\n\nvariable \"task_execution_policy_arn\" {}\n\nvariable \"quickwit_cpu_architecture\" {}\n\nvariable \"module_id\" {}\n\nvariable \"stop_timeout\" {\n  # between 1s and 120s on Fargate, 30s is the ECS default\n  default = 30\n}\n"
  },
  {
    "path": "distribution/ecs/quickwit/variables.tf",
    "content": "## REQUIRED VARIABLES\n\nvariable \"vpc_id\" {\n  description = \"VPC ID of the cluster\"\n}\n\nvariable \"subnet_ids\" {\n  description = \"Subnet(s) where quickwit will be deployed\"\n  type        = list(string)\n}\n\n\n\n## OPTIONAL VARIABLES\n\nvariable \"module_id\" {\n  description = \"Identifier for the module, e.g the stage. If not specified, a random string is generated.\"\n  default     = \"\"\n}\n\nvariable \"quickwit_ingress_cidr_blocks\" {\n  description = \"CIDR blocks (private) that should have access to the Quickwit cluster\"\n  type        = list(string)\n  default     = []\n}\n\n\nvariable \"quickwit_index_s3_prefix\" {\n  description = \"S3 bucket name and prefix for the Quickwit data, e.g. my-bucket-name/my-prefix. Quickwit will only have access to this S3 location. Leave empty to create a new bucket.\"\n  default     = \"\"\n}\n\nvariable \"quickwit_domain\" {\n  description = \"Local domain for quickwit service discovery\"\n  default     = \"quickwit\"\n}\n\nvariable \"quickwit_image\" {\n  description = \"Quickwit docker image\"\n  default     = \"quickwit/quickwit:latest\"\n}\n\nvariable \"quickwit_cpu_architecture\" {\n  description = \"One of X86_64 / ARM64. Must match the arch of the provided image (var.quickwit_image).\"\n  default     = \"ARM64\"\n}\n\nvariable \"sidecar_container_definitions\" {\n  description = \"Sidecar containers to be attached to Quickwit tasks\"\n  default     = {}\n}\n\nvariable \"sidecar_container_dependencies\" {\n  description = \"Specify the Quickwit container's dependencies on sidecars\"\n  type = list(object({\n    containerName = string\n    condition     = string\n  }))\n  default = []\n}\n\nvariable \"enable_cloudwatch_logging\" {\n  description = \"Cloudwatch logging for Quickwit tasks. 
Usually disabled when using a custom log configuration.\"\n  default     = true\n}\n\nvariable \"log_configuration\" {\n  description = \"Custom log configuration for Quickwit tasks\"\n  default     = {}\n}\n\nvariable \"quickwit_indexer\" {\n  description = \"Indexer service sizing configurations\"\n  type = object({\n    desired_count          = optional(number, 1)\n    memory                 = optional(number, 8192)\n    cpu                    = optional(number, 2048)\n    ephemeral_storage_gib  = optional(number, 21)\n    extra_task_policy_arns = optional(list(string), [])\n  })\n  default = {}\n}\n\nvariable \"quickwit_metastore\" {\n  description = \"Metastore service sizing configurations\"\n  type = object({\n    desired_count = optional(number, 1)\n    memory        = optional(number, 512)\n    cpu           = optional(number, 256)\n  })\n  default = {}\n}\n\nvariable \"quickwit_searcher\" {\n  description = \"Searcher service sizing configurations\"\n  type = object({\n    desired_count         = optional(number, 1)\n    memory                = optional(number, 4096)\n    cpu                   = optional(number, 1024)\n    ephemeral_storage_gib = optional(number, 21)\n  })\n  default = {}\n}\n\nvariable \"quickwit_control_plane\" {\n  description = \"Control plane service sizing configurations\"\n  type = object({\n    # only 1 task is necessary\n    memory = optional(number, 512)\n    cpu    = optional(number, 256)\n  })\n  default = {}\n}\n\nvariable \"quickwit_janitor\" {\n  description = \"Janitor service sizing configurations\"\n  type = object({\n    # only 1 task is necessary\n    memory = optional(number, 512)\n    cpu    = optional(number, 256)\n  })\n  default = {}\n}\n\nvariable \"rds_config\" {\n  description = \"Configurations of the metastore RDS database. 
Enable multi_az to ensure high availability.\"\n  type = object({\n    instance_class = optional(string, \"db.t4g.micro\")\n    multi_az       = optional(bool, false)\n  })\n  default = {}\n}\n\nvariable \"external_postgres_uri_secret_arn\" {\n  description = \"ARN of the SSM parameter or Secret Manager secret containing the URI of a Postgres instance (postgres://{user}:{password}@{address}:{port}/{db_instance_name}). The Postgres instance should allow inbound connections from the subnets specified in `var.subnet_ids`. If provided, the internal RDS will not be created and `var.rds_config` is ignored.\"\n  default     = \"\"\n}\n"
  },
  {
    "path": "distribution/kubernetes/README.md",
    "content": "# Quickwit on Kubernetes\n\nTo deploy Quickwit on Kubernetes, use the official Quickwit Helm chart available at [helm.quickwit.io](https://helm.quickwit.io/) and refer to our [documentation](https://quickwit.io/docs/deployment/kubernetes/helm) for more information.\n"
  },
  {
    "path": "docker-compose.yml",
    "content": "# By default, this docker compose script maps all services to localhost only.\n# If you need to make services available outside of your machine, add\n# appropriate service mappings to the .env file. See .env.example file for\n# configuration example.\n#\n# Notes on image versions:\n#  - For the key services such as postgres and pulsar we are trying to run\n#    against the oldest supported version\n#  - For kafka we use the oldest version that supports KRaft.\n#  - For everything else we are trying to run against the latest version.\n#\n# To run against the latest image versions update .env file. See .env.example\n# file for configuration examples. You might need to remove the old images\n# first if they are already tagged latest and volumes if their content is\n# incompatible with the latest version, as in case of postgres.\n\nname: quickwit\n\nnetworks:\n  default:\n    name: quickwit-network\n    ipam:\n      config:\n      - subnet: 172.16.7.0/24\n        gateway: 172.16.7.1\n\nservices:\n  localstack:\n    image: localstack/localstack:${LOCALSTACK_VERSION:-3.5.0}\n    container_name: localstack\n    ports:\n      - \"${MAP_HOST_LOCALSTACK:-127.0.0.1}:4566:4566\"\n      - \"${MAP_HOST_LOCALSTACK:-127.0.0.1}:4571:4571\"\n      - \"${MAP_HOST_LOCALSTACK:-127.0.0.1}:8080:8080\"\n    profiles:\n      - all\n      - localstack\n    environment:\n      SERVICES: kinesis,s3,sqs\n      PERSISTENCE: 1\n    volumes:\n      - .localstack:/etc/localstack/init/ready.d\n      - localstack_data:/var/lib/localstack\n    healthcheck:\n      test: [\"CMD\", \"curl\", \"-k\", \"-f\", \"https://localhost:4566/quickwit-integration-tests\"]\n      interval: 1s\n      timeout: 5s\n      retries: 100\n\n  postgres:\n    # The oldest supported version. 
EOL November 14, 2024\n    image: postgres:${POSTGRES_VERSION:-12.17-alpine}\n    container_name: postgres\n    ports:\n      - \"${MAP_HOST_POSTGRES:-127.0.0.1}:5432:5432\"\n    profiles:\n      - all\n      - postgres\n    environment:\n      PGDATA: /var/lib/postgresql/data/pgdata\n      POSTGRES_USER: ${POSTGRES_USER:-quickwit-dev}\n      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-quickwit-dev}\n      POSTGRES_DB: ${POSTGRES_DB:-quickwit-metastore-dev}\n    volumes:\n      - postgres_data:/var/lib/postgresql/data\n    healthcheck:\n      test: [\"CMD\", \"pg_isready\"]\n      interval: 1s\n      timeout: 5s\n      retries: 100\n\n  pulsar-broker:\n    # The oldest version with arm64 docker images. EOL May 2 2025\n    image: apachepulsar/pulsar:${PULSAR_VERSION:-3.0.0}\n    container_name: pulsar-broker\n    command: bin/pulsar standalone --no-functions-worker\n    ports:\n      - \"${MAP_HOST_PULSAR:-127.0.0.1}:6650:6650\"\n      - \"${MAP_HOST_PULSAR:-127.0.0.1}:8081:8080\"\n    environment:\n      PULSAR_MEM: \"-Xms384M -Xmx384M\"\n      # Disable functions worker to save memory/time\n      PULSAR_PREFIX_functionsWorkerEnabled: \"false\"\n    profiles:\n      - all\n      - pulsar\n\n  kafka-broker:\n    image: confluentinc/confluent-local:${CP_VERSION:-7.4.11}\n    container_name: kafka-broker\n    ports:\n      - \"${MAP_HOST_KAFKA:-127.0.0.1}:9092:9092\"\n      - \"${MAP_HOST_KAFKA:-127.0.0.1}:9101:9101\"\n    profiles:\n      - all\n      - kafka\n\n    environment:\n          # Mode KRaft (Single Node)\n          KAFKA_NODE_ID: 1\n          KAFKA_PROCESS_ROLES: 'broker,controller'\n          KAFKA_CONTROLLER_QUORUM_VOTERS: '1@localhost:9093'\n          KAFKA_LOG4J_LOGGERS: \"org.apache.kafka.image.loader.MetadataLoader=WARN\"\n\n          # Listeners\n          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT'\n          KAFKA_LISTENERS: 'EXTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093'\n          KAFKA_ADVERTISED_LISTENERS: 
'EXTERNAL://localhost:9092'\n          KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'\n          KAFKA_INTER_BROKER_LISTENER_NAME: 'EXTERNAL'\n\n          # Simplified configuration\n          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1\n          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0\n          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1\n          KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1\n\n          # Cluster ID (required for KRaft)\n          CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qk'\n\n          KAFKA_HEAP_OPTS: -Xms256M -Xmx256M\n    healthcheck:\n        # test: [\"CMD-SHELL\", \"nc -z localhost 9092 || exit 1\"]\n        test: [\"CMD\", \"ub\", \"kafka-ready\", \"-b\", \"localhost:9092\", \"1\", \"5\"]\n        start_period: 5s\n        interval: 5s\n        timeout: 10s\n        retries: 100\n\n  azurite:\n    image: mcr.microsoft.com/azure-storage/azurite:${AZURITE_VERSION:-3.24.0}\n    container_name: azurite\n    ports:\n        - \"${MAP_HOST_AZURITE:-127.0.0.1}:10000:10000\" # Blob store port\n    profiles:\n      - all\n      - azurite\n    volumes:\n        - azurite_data:/data\n    command: azurite --blobHost 0.0.0.0 --loose\n\n  fake-gcs-server:\n    image: fsouza/fake-gcs-server:${FAKE_GCS_SERVER_VERSION:-1.47.7}\n    container_name: fake-gcs-server\n    ports:\n      - \"${MAP_HOST_FAKE_GCS_SERVER:-127.0.0.1}:4443:4443\" # Blob store port\n    profiles:\n      - all\n      - fake-gcs-server\n    volumes:\n      - fake_gcs_server_data:/data\n    command: -scheme http\n\n  grafana:\n    image: grafana/grafana-oss:${GRAFANA_VERSION:-10.4.1}\n    container_name: grafana\n    ports:\n      - \"${MAP_HOST_GRAFANA:-127.0.0.1}:3000:3000\"\n    profiles:\n      - grafana\n      - monitoring\n    environment:\n      GF_AUTH_DISABLE_LOGIN_FORM: \"true\"\n      GF_AUTH_ANONYMOUS_ENABLED: \"true\"\n      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin\n    volumes:\n      - grafana_conf:/etc/grafana\n      - grafana_data:/var/lib/grafana\n      - 
./monitoring/grafana/dashboards:/var/lib/grafana/dashboards\n      - ./monitoring/grafana/provisioning:/etc/grafana/provisioning\n\n  jaeger:\n    image: jaegertracing/all-in-one:${JAEGER_VERSION:-1.48.0}\n    container_name: jaeger\n    ports:\n      - \"${MAP_HOST_JAEGER:-127.0.0.1}:16686:16686\" # Frontend\n    profiles:\n      - jaeger\n      - monitoring\n\n  otel-collector:\n    image: otel/opentelemetry-collector:${OTEL_VERSION:-0.84.0}\n    container_name: otel-collector\n    ports:\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:1888:1888\"   # pprof extension\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:8888:8888\"   # Prometheus metrics exposed by the collector\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:8889:8889\"   # Prometheus exporter metrics\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:13133:13133\" # health_check extension\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:4317:4317\"   # OTLP gRPC receiver\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:4318:4318\"   # OTLP http receiver\n      - \"${MAP_HOST_OTEL:-127.0.0.1}:55679:55679\" # zpages extension\n    profiles:\n      - otel\n      - monitoring\n    volumes:\n      - ./monitoring/otel-collector-config.yaml:/etc/otel-collector-config.yaml\n    command: [\"--config=/etc/otel-collector-config.yaml\"]\n\n  prometheus:\n    image: prom/prometheus:${PROMETHEUS_VERSION:-v2.43.0}\n    container_name: prometheus\n    ports:\n      - \"${MAP_HOST_PROMETHEUS:-127.0.0.1}:9090:9090\"\n    profiles:\n      - prometheus\n      - monitoring\n    volumes:\n      - ./monitoring/prometheus.yaml:/etc/prometheus/prometheus.yml\n    extra_hosts:\n      - \"host.docker.internal:host-gateway\"\n\n  gcp-pubsub-emulator:\n    # It is not an official docker image\n    # if we prefer we can build a docker from the official docker image (gcloud cli)\n    # and install the pubsub emulator https://cloud.google.com/pubsub/docs/emulator\n    image: thekevjames/gcloud-pubsub-emulator:${GCLOUD_EMULATOR:-550.0.0}\n    container_name: gcp-pubsub-emulator\n    
ports:\n      - \"${MAP_HOST_GCLOUD_EMULATOR:-127.0.0.1}:8681:8681\"\n    environment:\n      # create a fake gcp project and a topic / subscription\n      - PUBSUB_PROJECT1=quickwit-emulator,emulator_topic:emulator_subscription\n    profiles:\n      - all\n      - gcp-pubsub\n\nvolumes:\n  azurite_data:\n  fake_gcs_server_data:\n  grafana_conf:\n  grafana_data:\n  localstack_data:\n  postgres_data:\n"
  },
  {
    "path": "docs/assets/sqs-file-source.tf",
    "content": "terraform {\n  required_version = \"1.7.5\"\n  required_providers {\n    aws = {\n      source  = \"hashicorp/aws\"\n      version = \"~> 5.39.1\"\n    }\n  }\n}\n\nprovider \"aws\" {\n  region = \"us-east-1\"\n  default_tags {\n    tags = {\n      provisioner = \"terraform\"\n      author      = \"Quickwit\"\n    }\n  }\n}\n\nlocals {\n  sqs_notification_queue_name = \"qw-tuto-s3-event-notifications\"\n  source_bucket_name          = \"qw-tuto-source-bucket\"\n}\n\nresource \"aws_s3_bucket\" \"file_source\" {\n  bucket_prefix = local.source_bucket_name\n  force_destroy = true\n}\n\ndata \"aws_iam_policy_document\" \"sqs_notification\" {\n  statement {\n    effect = \"Allow\"\n\n    principals {\n      type        = \"*\"\n      identifiers = [\"*\"]\n    }\n\n    actions   = [\"sqs:SendMessage\"]\n    resources = [\"arn:aws:sqs:*:*:${local.sqs_notification_queue_name}\"]\n\n    condition {\n      test     = \"ArnEquals\"\n      variable = \"aws:SourceArn\"\n      values   = [aws_s3_bucket.file_source.arn]\n    }\n  }\n}\n\n\nresource \"aws_sqs_queue\" \"s3_events\" {\n  name   = local.sqs_notification_queue_name\n  policy = data.aws_iam_policy_document.sqs_notification.json\n\n  redrive_policy = jsonencode({\n    deadLetterTargetArn = aws_sqs_queue.s3_events_deadletter.arn\n    maxReceiveCount     = 5\n  })\n}\n\nresource \"aws_sqs_queue\" \"s3_events_deadletter\" {\n  name = \"${locals.sqs_notification_queue_name}-deadletter\"\n}\n\nresource \"aws_sqs_queue_redrive_allow_policy\" \"s3_events_deadletter\" {\n  queue_url = aws_sqs_queue.s3_events_deadletter.id\n\n  redrive_allow_policy = jsonencode({\n    redrivePermission = \"byQueue\",\n    sourceQueueArns   = [aws_sqs_queue.s3_events.arn]\n  })\n}\n\nresource \"aws_s3_bucket_notification\" \"bucket_notification\" {\n  bucket = aws_s3_bucket.file_source.id\n\n  queue {\n    queue_arn = aws_sqs_queue.s3_events.arn\n    events    = [\"s3:ObjectCreated:*\"]\n  }\n}\n\ndata \"aws_iam_policy_document\" 
\"quickwit_node\" {\n  statement {\n    effect = \"Allow\"\n    actions = [\n      \"sqs:ReceiveMessage\",\n      \"sqs:DeleteMessage\",\n      \"sqs:ChangeMessageVisibility\",\n      \"sqs:GetQueueAttributes\",\n    ]\n    resources = [aws_sqs_queue.s3_events.arn]\n  }\n  statement {\n    effect    = \"Allow\"\n    actions   = [\"s3:GetObject\"]\n    resources = [\"${aws_s3_bucket.file_source.arn}/*\"]\n  }\n}\n\nresource \"aws_iam_user\" \"quickwit_node\" {\n  name = \"quickwit-filesource-tutorial\"\n  path = \"/system/\"\n}\n\nresource \"aws_iam_user_policy\" \"quickwit_node\" {\n  name   = \"quickwit-filesource-tutorial\"\n  user   = aws_iam_user.quickwit_node.name\n  policy = data.aws_iam_policy_document.quickwit_node.json\n}\n\nresource \"aws_iam_access_key\" \"quickwit_node\" {\n  user = aws_iam_user.quickwit_node.name\n}\n\noutput \"source_bucket_name\" {\n  value = aws_s3_bucket.file_source.bucket\n\n}\n\noutput \"notification_queue_url\" {\n  value = aws_sqs_queue.s3_events.id\n}\n\noutput \"quickwit_node_access_key_id\" {\n  value     = aws_iam_access_key.quickwit_node.id\n  sensitive = true\n}\n\noutput \"quickwit_node_secret_access_key\" {\n  value     = aws_iam_access_key.quickwit_node.secret\n  sensitive = true\n}\n"
  },
  {
    "path": "docs/configuration/_category_.yaml",
    "content": "label: 'Configuration'\nposition: 4\ncollapsed: true\n"
  },
  {
    "path": "docs/configuration/index-config.md",
    "content": "---\ntitle: Index configuration\nsidebar_position: 3\ntoc_max_heading_level: 4\n---\n\nThis page describes how to configure an index.\n\nIn addition to the `index_id`, the index configuration lets you define five items:\n\n- The **index-uri**: it defines where the index files should be stored.\n- The **doc mapping**: it defines how a document and the fields it contains are stored and indexed for a given index.\n- The **indexing settings**: it defines the timestamp field used for sharding, and some more advanced parameters like the merge policy.\n- The **search settings**: it defines the default search fields `default_search_fields`, a list of fields that Quickwit will search into if the user query does not explicitly target a field.\n- The **retention policy**: it defines how long Quickwit should keep the indexed data. If not specified, the data is stored forever.\n\nConfiguration is set at index creation and can be changed using the [update endpoint](../reference/rest-api.md) or the [CLI](../reference/cli.md).\n\n## Config file format\n\nThe index configuration format is YAML. 
When a key is absent from the configuration file, the default value is used.\nHere is a complete example suited for the HDFS logs dataset:\n\n```yaml\nversion: 0.7 # File format version.\n\nindex_id: \"hdfs\"\n\nindex_uri: \"s3://my-bucket/hdfs\"\n\ndoc_mapping:\n  mode: lenient\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: severity_text\n      type: text\n      tokenizer: raw\n      fast:\n        - tokenizer: lowercase\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: object\n      field_mappings:\n        - name: service\n          type: text\n          tokenizer: raw\n  tag_fields: [\"resource.service\"]\n  timestamp_field: timestamp\n  index_field_presence: true\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n\nretention:\n  period: 90 days\n  schedule: daily\n```\n\n## Index ID\n\nThe index ID is a string that uniquely identifies the index within the metastore. It may only contain uppercase or lowercase ASCII letters, digits, hyphens (`-`), and underscores (`_`). Finally, it must start with a letter and contain at least 3 characters but no more than 255.\n\n## Index uri\n\nThe index-uri defines where the index files (also called splits) should be stored.\nThis parameter expects a [storage uri](storage-config#storage-uris).\n\nThe `index-uri` parameter is optional.\nBy default, the `index-uri` will be computed by concatenating the `index-id` with the\n`default_index_root_uri` defined in the [Quickwit's config](node-config).\n\n:::caution\nThe file storage will not work when running quickwit in distributed mode. 
Instead, AWS S3, Azure Blob Storage, Google Cloud Storage (in S3 interoperability mode) or other S3-compatible storage systems including Scaleway Object Storage and Garage should be used as storage when running several searcher nodes.\n:::\n\n## Doc mapping\n\nThe doc mapping defines how a document and the fields it contains are stored and indexed for a given index. A document is a collection of named fields, each having its own data type (text, bytes, datetime, bool, i64, u64, f64, ip, json).\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `field_mappings` | Collection of field mappings, each having its own data type (text, bytes, datetime, bool, i64, u64, f64, ip, json).   | `[]` |\n| `mode`        | Defines how quickwit should handle document fields that are not present in the `field_mappings`. In particular, the \"dynamic\" mode makes it possible to use quickwit in a schemaless manner. (See [mode](#mode)) | `dynamic` |\n| `dynamic_mapping` | This parameter is only allowed when `mode` is set to `dynamic`. It then defines whether dynamically mapped fields should be indexed, stored, etc.  | (See [mode](#mode)) |\n| `tag_fields` | Collection of fields* explicitly defined in `field_mappings` whose values will be stored as part of the `tags` metadata. Allowed types are: `text` (with raw tokenizer), `i64` and `u64`. [Learn more about tags](../overview/concepts/querying.md#tag-pruning). | `[]` |\n| `store_source` | Whether the original JSON document is stored in the index.   | `false` |\n| `timestamp_field`      | Timestamp field* used for sharding documents in splits. The field has to be of type `datetime`. [Learn more about time sharding](./../overview/architecture.md).  | `None` |\n| `partition_key`   |  If set, quickwit will route documents into different splits depending on the field name declared as the `partition_key`. 
| `null` |\n| `max_num_partitions`  | Limits the number of splits created through partitioning. (See [Partitioning](../overview/concepts/querying.md#partitioning))  |    `200` |\n| `index_field_presence` | `exists` queries are enabled automatically for fast fields. To enable them for all other fields, set this parameter to `true`. Enabling it can have a significant CPU cost on indexing.  |  `false` |\n\n*: tag fields and the timestamp field are expressed as a path from the root of the JSON object to the given field. If a field name contains a `.` character, it needs to be escaped with a `\\` character.\n\n### Field types\n\nEach field[^1] has a type that indicates the kind of data it contains, such as a 64-bit integer or text.\nQuickwit supports the following raw types: [`text`](#text-type), [`i64`](#numeric-types-i64-u64-and-f64-type), [`u64`](#numeric-types-i64-u64-and-f64-type), [`f64`](#numeric-types-i64-u64-and-f64-type), [`datetime`](#datetime-type), [`bool`](#bool-type), [`ip`](#ip-type), [`bytes`](#bytes-type), and [`json`](#json-type). It also supports composite types such as array and object. Behind the scenes, Quickwit uses tantivy field types; see the [tantivy documentation](https://github.com/tantivy-search/tantivy) for details.\n\n### Raw types\n\n#### Text type\n\nThis field is a text field that will be analyzed and split into tokens before indexing.\nThis kind of field is tailored for full-text search.\n\nExample of a mapping for a text field:\n\n```yaml\nname: body\ndescription: Body of the document\ntype: text\ntokenizer: default\nrecord: position\nfieldnorms: true\nfast:\n  normalizer: lowercase\n```\n\n**Parameters for text field**\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `description` | Optional description for the field. 
| `None` |\n| `stored`    | Whether value is stored in the document store | `true` |\n| `indexed`   | Whether value should be indexed so it can be searched | `true` |\n| `tokenizer` | Name of the tokenizer. See [tokenizers](#description-of-available-tokenizers) for a list of available tokenizers.  | `default` |\n| `record`    | Describes the amount of information indexed, choices between `basic`, `freq` and `position` | `basic` |\n| `fieldnorms` | Whether to store fieldnorms for the field. Fieldnorms are required to calculate the BM25 Score of the document. | `false` |\n| `fast`     | Whether value is stored in a fast field. The fast field will contain the term ids and the dictionary. The default behaviour for `true` is to store the original text unchanged. The normalizer on the fast field is configured separately, via `normalizer: lowercase`. See [normalizers](#description-of-available-normalizers) for a list of available normalizers. | `false` |\n\n##### Description of available tokenizers\n\n| Tokenizer     | Description   |\n| ------------- | ------------- |\n| `raw`         | Does not process nor tokenize text. Filters out tokens larger than 255 bytes. This is similar to the `keyword` type in Elasticsearch. |\n| `raw_lowercase` | Does not tokenize text, but lowercases it. Filters out tokens larger than 255 bytes.  |\n| `default`     | Chops the text according to whitespace and punctuation, removes tokens that are too long, and converts to lowercase. Filters out tokens larger than 255 bytes. |\n| `en_stem`     | Like `default`, but also applies stemming on the resulting tokens. Filters out tokens larger than 255 bytes.  |\n| `whitespace`  | Chops the text according to whitespace only. Doesn't remove long tokens or convert to lowercase. |\n| `chinese_compatible` |  Chops between each CJK character in addition to what `default` does. 
Should be used with `record: position` to be able to search properly |\n| `lowercase`   | Applies a lowercase transformation on the text. It does not tokenize the text. |\n\n##### Description of available normalizers\n\n| Normalizer     | Description   |\n| ------------- | ------------- |\n| `raw`         | Does not process nor tokenize text. Filters out tokens larger than 255 bytes.  |\n| `lowercase` |  Applies a lowercase transformation on the text. Filters out tokens larger than 255 bytes. |\n\n**Description of record options**\n\n| Record option | Description   |\n| ------------- | ------------- |\n| `basic`       |  Records only the `DocId`s |\n| `freq`        |  Records the document ids as well as the term frequency  |\n| `position`    |  Records the document id, the term frequency and the positions of occurrences.  |\n\nIndexing with position is required to run phrase queries.\n\n#### Numeric types: `i64`, `u64` and `f64` type\n\nQuickwit handles three numeric types: `i64`, `u64`, and `f64`.\n\nNumeric values can be stored in a fast field (the equivalent of Lucene's `DocValues`), which is a column-oriented storage used for range queries and aggregations.\n\nWhen querying negative numbers without specifying a field (using `default_search_fields`), you should single-quote the number (for instance '-5'), otherwise it will be interpreted as wanting to match anything but that number.\n\nExample of a mapping for a u64 field:\n\n```yaml\nname: rating\ndescription: Score between 0 and 5\ntype: u64\nstored: true\nindexed: true\nfast: true\n```\n\n**Parameters for i64, u64 and f64 field**\n\n| Variable        | Description   | Default value |\n| --------------- | ------------- | ------------- |\n| `description`   | Optional description for the field. | `None` |\n| `stored`        | Whether the field values are stored in the document store. | `true` |\n| `indexed`       | Whether the field values are indexed. 
| `true` |\n| `fast`          | Whether the field values are stored in a fast field. | `false` |\n| `coerce`        | Whether to convert numbers passed as strings to integers or floats. | `true` |\n| `output_format` | JSON type used to return numbers in search results. Possible values are `number` or `string`. | `number` |\n\n#### `datetime` type\n\nThe `datetime` type handles dates and datetimes. Since JSON doesn’t have a date type, the `datetime` field supports multiple input types and formats. The supported input types are:\n- floating-point or integer numbers representing a Unix timestamp\n- strings containing a formatted date, datetime, or Unix timestamp\n\nThe `input_formats` field parameter specifies the accepted date formats. The following input formats are natively supported:\n- `iso8601`\n- `rfc2822`\n- `rfc3339`\n- `strptime`\n- `unix_timestamp`\n\n**Input formats**\n\nWhen specifying multiple input formats, the corresponding parsers are attempted in the order they are declared. The following formats are natively supported:\n- `iso8601`, `rfc2822`, `rfc3339`: parse dates using standard ISO and RFC formats.\n- `strptime`: parse dates using the Unix [strptime](https://man7.org/linux/man-pages/man3/strptime.3.html) format with some variations:\n  - `strptime` format specifiers: `%C`, `%d`, `%D`, `%e`, `%F`, `%g`, `%G`, `%h`, `%H`, `%I`, `%j`, `%k`, `%l`, `%m`, `%M`, `%n`, `%R`, `%S`, `%t`, `%T`, `%u`, `%U`, `%V`, `%w`, `%W`, `%y`, `%Y`, `%%`.\n  - `%f` for milliseconds precision support.\n  - `%z` timezone offsets can be specified as `(+|-)hhmm` or `(+|-)hh:mm`.\n\n:::warning\nThe timezone name format specifier (`%Z`) is not currently supported.\n:::\n\n- `unix_timestamp`: parse float and integer numbers to Unix timestamps. Floating-point values are converted to timestamps expressed in seconds. 
Integer values are converted to Unix timestamps whose precision, determined in `seconds`, `milliseconds`, `microseconds`, or `nanoseconds`, is inferred from the number of input digits. Internally, datetimes are converted to UTC (if the time zone is specified) and stored as *i64* integers. As a result, Quickwit only supports timestamp values ranging from `Apr 13, 1972 23:59:55` to `Mar 16, 2242 12:56:31`.\n\n:::warning\nConverting timestamps from float to integer values may occur with a loss of precision.\n:::\n\nWhen a `datetime` field is stored as a fast field, the `fast_precision` parameter indicates the precision used to truncate the values before encoding, which improves compression (truncation here means zeroing). The `fast_precision` parameter can take the following values: `seconds`, `milliseconds`, `microseconds`, or `nanoseconds`. It only affects what is stored in fast fields when a `datetime` field is marked as \"fast\". Finally, operations on `datetime` fast fields, e.g. via aggregations, need to be done at the nanosecond level.\n\n:::info\nInternally `datetime` is stored in `nanoseconds` in fast fields and in the docstore, and in `seconds` in the term dictionary.\n:::\n\nIn addition, Quickwit supports the `output_format` field parameter to specify with which precision datetimes are deserialized. 
This parameter supports the same value as input formats except for `unix_timestamp` which is replaced by the following formats:\n- `unix_timestamp_secs`: displays timestamps in seconds.\n- `unix_timestamp_millis`: displays timestamps in milliseconds.\n- `unix_timestamp_micros`: displays timestamps in microseconds.\n- `unix_timestamp_nanos`: displays timestamps in nanoseconds.\n\nExample of a mapping for a datetime field:\n\n```yaml\nname: timestamp\ntype: datetime\ndescription: Time at which the event was emitted\ninput_formats:\n  - rfc3339\n  - unix_timestamp\n  - \"%Y %m %d %H:%M:%S.%f %z\"\noutput_format: unix_timestamp_secs\nstored: true\nindexed: true\nfast: true\nfast_precision: milliseconds\n```\n\n**Parameters for datetime field**\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `input_formats` | Formats used to parse input dates | [`rfc3339`, `unix_timestamp`] |\n| `output_format` | Format used to display dates in search results | `rfc3339` |\n| `stored`        | Whether the field values are stored in the document store | `true` |\n| `indexed`       | Whether the field values are indexed | `true` |\n| `fast`          | Whether the field values are stored in a fast field | `false` |\n| `fast_precision`     | The precision (`seconds`, `milliseconds`, `microseconds`, or `nanoseconds`) used to store the fast values. | `seconds` |\n\n#### `bool` type\n\nThe `bool` type accepts boolean values.\n\nExample of a mapping for a boolean field:\n\n```yaml\nname: is_active\ndescription: Activation status\ntype: bool\nstored: true\nindexed: true\nfast: true\n```\n\n**Parameters for bool field**\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `description` | Optional description for the field. 
| `None` |\n| `stored`    | Whether value is stored in the document store | `true` |\n| `indexed`   | Whether value is indexed | `true` |\n| `fast`      | Whether value is stored in a fast field | `false` |\n\n#### `ip` type\n\nThe `ip` type accepts IP address values. Both IPv4 and IPv6 are supported. Internally, IPv4 addresses are converted to IPv6.\n\nExample of a mapping for an IP field:\n\n```yaml\nname: host_ip\ndescription: Host IP address\ntype: ip\nfast: true\n```\n\n**Parameters for IP field**\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `description` | Optional description for the field. | `None` |\n| `stored`    | Whether value is stored in the document store | `true` |\n| `indexed`   | Whether value is indexed | `true` |\n| `fast`      | Whether value is stored in a fast field | `false` |\n\n\n#### `bytes` type\nThe `bytes` type accepts a binary value as a `base64` or `hex` encoded string, depending on `input_format`.\n\nExample of a mapping for a bytes field:\n\n```yaml\nname: binary\ntype: bytes\nstored: true\nindexed: true\nfast: true\ninput_format: hex\noutput_format: hex\n```\n\n**Parameters for bytes field**\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `description` | Optional description for the field. | `None` |\n| `stored`    | Whether value is stored in the document store | `true` |\n| `indexed`   | Whether value is indexed | `true` |\n| `fast`     | Whether value is stored in a fast field. 
Only on 1:1 cardinality, not supported on `array<bytes>` fields | `false` |\n| `input_format`   | Encoding used to represent input bytes, either `hex` or `base64` | `base64` |\n| `output_format`   |  Encoding used to represent bytes in search results, either `hex` or `base64` | `base64` |\n\n#### `json` type\n\nThe `json` type accepts a JSON object.\n\nExample of a mapping for a JSON field:\n\n```yaml\nname: parameters\ntype: json\nstored: true\nindexed: true\ntokenizer: raw\nexpand_dots: false\nfast:\n  normalizer: lowercase\n```\n\nStored primitive types are inferred from the JSON value types using the following rules:\n- a boolean value `true` or `false` is stored as `bool`\n- numeric values are cast to the first compatible format between `i64`, `u64` or\n  `f64` (in this order)\n- for string values (surrounded with quotes), Tantivy attempts to parse a date\n  in `rfc3339` format. If the parsing fails, the value is stored as `text` using\n  the configured tokenization rules\n\n**Parameters for JSON field**\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `description` | Optional description for the field. | `None` |\n| `stored`    | Whether value is stored in the document store | `true` |\n| `indexed`   | Whether value is indexed | `true` |\n| `fast`     | Whether value is stored in a fast field. The default behaviour for text in the JSON is to store the text unchanged. A normalizer can be configured via `normalizer: lowercase`. ([See normalizers](#description-of-available-normalizers)) for a list of available normalizers. | `false` |\n| `tokenizer` | **Only affects strings in the json object**. Name of the `Tokenizer`, choices between `raw`, `default`, `en_stem` and `chinese_compatible` | `raw` |\n| `record`    | **Only affects strings in the json object**. 
Describes the amount of information indexed, choices between `basic`, `freq` and `position` | `basic` |\n| `expand_dots`    | If true, json keys containing a `.` are expanded. For instance, if `expand_dots` is set to true, `{\"k8s.node.id\": \"node2\"}` will be indexed as if it was `{\"k8s\": {\"node\": {\"id\": \"node2\"}}}`. The benefit is that escaping the `.` will not be required at query time. In other words, `k8s.node.id:node2` will match the document. This does not impact the way the document is stored.  | `true` |\n\nNote that the `tokenizer` and the `record` have the same definition and the same effect as for the text field.\n\nTo search inside a json object, extend the field name with the path that leads to the target value.\n\nFor instance, when indexing the following object:\n```json\n{\n    \"product_name\": \"droopy t-shirt\",\n    \"attributes\": {\n        \"color\": [\"red\", \"green\", \"white\"],\n        \"size\": \"L\"\n    }\n}\n```\n\nAssuming `attributes` has been defined as a field mapping as follows:\n```yaml\n- type: json\n  name: attributes\n```\n\n`attributes.color:red` is then a valid query.\n\nIf, in addition, `attributes` is set as a default search field, then `color:red` is a valid query.\n\n### Composite types\n\n#### array\n\nQuickwit supports arrays for all raw types except for `object` types.\n\nTo declare an array type of `i64` in the index config, you just have to set the type to `array<i64>`.\n\n#### object\n\nQuickwit supports nested objects as long as they do not contain arrays of objects.\n\n```yaml\nname: resource\ntype: object\nfield_mappings:\n  - name: service\n    type: text\n```\n\n#### concatenate\n\nQuickwit supports mapping the content of multiple fields to a single one. This can be more efficient at query time than\nsearching through dozens of `default_search_fields`. 
It also allows querying inside a json field without knowing the path\nto the field being searched.\n\n```yaml\nname: my_default_field\ntype: concatenate\nconcatenate_fields:\n  - text # things inside text, tokenized with the `default` tokenizer\n  - resource.author # all fields in resource.author, assuming resource is an `object` field.\ninclude_dynamic_fields: true\ntokenizer: default\nrecord: basic\n```\n\nConcatenate fields don't support fast fields, and are never stored. They use their own tokenizer, independently of the\ntokenizer configured on the individual fields.\nAt query time, concatenate fields don't support range queries.\nOnly the following types are supported inside a concatenate field: text, bool,\ni64, u64, f64, json. Other types are rejected at index creation, or silently\ndiscarded during indexing if they are found inside a json field. Unlike\nregular JSON fields, JSON fields in a concatenate field don't store RFC3339\ndates as Tantivy dates. This means you can still perform prefix queries,\ne.g., `my_default_field:\"2025-12-12\"*` to work around the lack of support for range\nqueries.\nAdding an object field to a concatenate field doesn't automatically add its subfields (yet).\n<!-- typing is made so it wouldn't be too hard to add, as well as things like params_* matching all fields which starts name with params_ , but the feature isn't implemented yet -->\nIt isn't possible to add subfields from a json field to a concatenate field. For instance if `attributes` is a json field, it's not possible to add only `attributes.color` to a concatenate field.\n\nFor json fields and dynamic fields, the path is not indexed, only values are. 
For instance, given the following document:\n```json\n{\n  \"421312\": {\n    \"my-key\": \"my-value\"\n  }\n}\n```\nIt is possible to search for `my-value` despite not knowing the full path, but it isn't possible to search for all documents containing a key `my-key`.\n\n<!--\nwhen the features are supported, add these:\n  - params_* # shortcut for all fields starting with `params_`\n  - resource.author # all fields in resource.author, assuming resource is either of type `object` or `json`\n---\nOnly the following types are supported inside a concatenate field: text, datetime, bool, i64, u64, ip, json. Other types are rejected\n---\nDatetime can only be queried in their RFC-3339 form, possibly omitting later components. # todo! will have to confirm this is achievable\n---\nplan:\n- implement text/bool/i64/u64 (nothing to do on search side for it to work). all gets converted to strings\n- add json\n- add object\n- add dynamic\n-- you are here\n- add wildcard\n- add json sub-fields?\n- add datetime (at index time, generate multiple tokens for yyyy, yyyy-MM... to yyyy-MM-ddThh:mm:ss; at search time, emit both tokenized and \"raw\" version of what may look like a datetime)\n- check negative i64 works as intended for non-raw tokenizer, and leverage datetime code if it doesn't\n- add ip (at index time, convert to single token; at search time, emit both tokenized and \"raw\" version of the ip)\n- allow optionally indexing json path (how do we tokenize it? 
split at each dot, or not?)\n-->\n\n### Mode\n\nThe `mode` describes how Quickwit should behave when it receives a field that is not defined in the field mapping.\n\nQuickwit offers you three different modes:\n- `dynamic` (default value): unmapped fields are gathered by Quickwit and handled as defined in the `dynamic_mapping` parameter.\n- `lenient`: unmapped fields are dismissed by Quickwit.\n- `strict`: if a document contains a field that is not mapped, Quickwit dismisses the document and counts it as an error.\n\n#### Dynamic Mapping\n\n`dynamic` mode makes it possible to operate Quickwit in a schemaless manner, or with a partial schema.\nThe configuration of `dynamic` mode can be set via the `dynamic_mapping` parameter.\n`dynamic_mapping` offers the same configuration options as when configuring a `json` field. It defaults to:\n\n```yaml\nversion: 0.7\nindex_id: my-dynamic-index\ndoc_mapping:\n  mode: dynamic\n  dynamic_mapping:\n    indexed: true\n    stored: true\n    tokenizer: raw\n    record: basic\n    expand_dots: true\n    fast: true\n```\n\nWhen the `dynamic_mapping` is set as indexed (default), fields mapped through\ndynamic mode can be searched by targeting the path needed to access them from\nthe root of the JSON object.\n\nFor instance, in an entirely schemaless setting, a minimal index configuration could be:\n\n```yaml\nversion: 0.7\nindex_id: my-dynamic-index\ndoc_mapping:\n    # If you have a timestamp field, it is important to tell quickwit about it.\n    timestamp_field: unix_timestamp\n    # mode: dynamic #< Commented out, as dynamic is the default mode.\n```\n\nWith such a simple configuration, we can index a complex document like the following:\n\n```json\n{\n  \"endpoint\": \"/admin\",\n  \"query_params\": {\n    \"ctk\": \"e42bb897d\",\n    \"page\": \"eeb\"\n  },\n  \"src\": {\n    \"ip\": \"8.8.8.8\",\n    \"port\": 53,\n  },\n  //...\n}\n```\n\nThe following queries are then valid, and match the document above.\n\n```bash\n// Fields can be 
searched simply.\nendpoint:/admin\n\n// Nested object can be queried by specifying a `.` separated\n// path from the root of the json object to the given field.\nquery_params.ctk:e42bb897d\n\n// numbers are searchable too\nsrc.port:53\n\n// and of course we can combine them with boolean operators.\nsrc.port:53 AND query_params.ctk:e42bb897d\n```\n\nThe stored primitive type inference is the [same as for JSON fields](#json-type).\n\n### Field name validation rules\n\nCurrently Quickwit only accepts field names that match the following regular expression:\n`^[@$_\\-a-zA-Z][@$_/\\.\\-a-zA-Z0-9]{0,254}$`\n\nIn plain language:\n- it needs to have at least one character.\n- it can only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, `.`, hyphens `-`, underscores `_`, slash `/`, at `@` and dollar `$` signs.\n- it must not start with a dot or a digit.\n- it must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`, `_field_presence`.\n\n:::caution\nFor field names containing the `.` character, you will need to escape it when referencing them. Otherwise the `.` character will be interpreted as a JSON object property access. Because of this, it is recommended to avoid using field names containing the `.` character.\n:::\n\n### Behavior with null values or missing fields\n\nFields that are `null` or missing in your JSON document are silently ignored when indexing.\n\n## Indexing settings\n\nThis section describes indexing settings for a given index.\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `commit_timeout_secs`      | Maximum number of seconds before committing a split since its creation.   | `60` |\n| `split_num_docs_target` | Target number of docs per split.   | `10000000` |\n| `merge_policy` | Describes the strategy used to trigger split merge operations (see [Merge policies](#merge-policies) section below). 
|\n| `resources.heap_size`      | Indexer heap size per source per index.   | `2000000000` |\n| `docstore_compression_level` | Level of compression used by zstd for the docstore. Lower values may increase ingest speed, at the cost of index size | `8` |\n| `docstore_blocksize` | Size of blocks in the docstore, in bytes. Lower values may improve doc retrieval speed, at the cost of index size | `1000000` |\n\n:::note\n\nChoosing an appropriate commit timeout is critical. With a shorter commit timeout, ingested data becomes queryable sooner, but the published splits will be smaller, increasing the overhead associated with [merges](#merge-policies). \n\nWhen permanently decommissioning an indexer node that received data through the ingest API (including the [Elastic bulk API](/docs/reference/es_compatible_api) and the OTEL [log](/docs/log-management/otel-service.md) and [trace](/docs/distributed-tracing/otel-service.md) services), we need to make sure that all the data that was persisted locally (Write Ahead Log) is indexed and committed. After receiving the termination signal, the Quickwit process waits for the indexing pipelines to finish processing this local data. This can take as long as the longest commit timeout of all indexes. 
Make sure that the termination grace period of the infrastructure supporting the Quickwit indexer nodes is long enough (e.g [`terminationGracePeriodSeconds`](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/) in Kubernetes or [`stopTimeout`](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html) on AWS ECS).\n\n:::\n\n### Merge policies\n\nQuickwit makes it possible to define the strategy used to decide which splits should be merged together and when.\n\nQuickwit offers three different merge policies, each with their\nown set of parameters.\n\n#### \"Stable log\" merge policy\n\nThe stable log merge policy attempts to minimize write amplification AND keep time-pruning power as high as possible, by merging splits with a similar size, and with a close time span.\n\nQuickwit's default merge policy is the `stable_log` merge policy\nwith the following parameters:\n\n```yaml\nversion: 0.7\nindex_id: \"hdfs\"\n# ...\nindexing_settings:\n  merge_policy:\n    type: \"stable_log\"\n    min_level_num_docs: 100000\n    merge_factor: 10\n    max_merge_factor: 12\n    maturation_period: 48h\n```\n\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `merge_factor`      | *(advanced)* Number of splits to merge together in a single merge operation.   | `10` |\n| `max_merge_factor` | *(advanced)* Maximum number of splits that can be merged together in a single merge operation.  | `12` |\n| `min_level_num_docs` |  *(advanced)* Number of docs below which all splits are considered as belonging to the same level.   | `100000` |\n| `maturation_period` | Duration after which a split is considered mature, and won't be considered for merges anymore. May impact the completion time of pending delete tasks. 
| `48h` |\n\n#### \"Limit Merge\" merge policy\n\n*The limit merge policy is considered advanced*.\n\nThe limit merge policy simply limits write amplification by setting an upper bound\non the number of merge operations a split should undergo.\n\n\n```yaml\nversion: 0.7\nindex_id: \"hdfs\"\n# ...\nindexing_settings:\n  merge_policy:\n    type: \"limit_merge\"\n    max_merge_ops: 5\n    merge_factor: 10\n    max_merge_factor: 12\n    maturation_period: 48h\n```\n\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `max_merge_ops`   |  Maximum number of merges that a given split should undergo. | `4` |\n| `merge_factor`      | *(advanced)* Number of splits to merge together in a single merge operation.   | `10` |\n| `max_merge_factor` | *(advanced)* Maximum number of splits that can be merged together in a single merge operation.  | `12` |\n| `maturation_period` | Duration after which a split is considered mature, and won't be considered for merges anymore. May impact the completion time of pending delete tasks. | `48h` |\n\n#### No merge\n\nThe `no_merge` merge policy entirely disables merging.\n\n:::caution\nThis setting is not recommended. Merges are necessary to reduce the number of splits, and hence improve search performance.\n:::\n\n```yaml\nversion: 0.7\nindex_id: \"hdfs\"\nindexing_settings:\n    merge_policy:\n        type: \"no_merge\"\n```\n\n\n\n### Indexer memory usage\n\nThe indexer works with a default heap of 2 GB of memory (see `resources.heap_size`). This does not directly reflect the overall memory usage, but doubling this value should give a fair approximation.\n\n\n## Search settings\n\nThis section describes search settings for a given index.\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `default_search_fields` | Default list of fields that will be used for search. 
The field names in this list may be declared explicitly in the schema, or may refer to a field captured by the dynamic mode. | `None` |\n\n## Retention policy\n\nThis section describes how Quickwit manages data retention. In Quickwit, the retention policy manager drops data on a split basis as opposed to individually dropping documents. Splits are evaluated based on their `time_range`, which is derived from the index timestamp field specified in the `doc_mapping.timestamp_field` setting. Using this setting, the retention policy will delete a split when `now() - split.time_range.end >= retention_policy.period`.\n\n```yaml\nversion: 0.7\nindex_id: hdfs\n# ...\nretention:\n  period: 90 days\n  schedule: daily\n```\n\n| Variable      | Description   | Default value |\n| ------------- | ------------- | ------------- |\n| `period`      | Duration after which splits are dropped, expressed in a human-readable way (`1 day`, `2 hours`, `a week`, ...). | required |\n| `schedule`    | Frequency at which the retention policy is evaluated and applied, expressed as a cron expression (`0 0 * * * *`) or human-readable form (`hourly`, `daily`, `weekly`, `monthly`, `yearly`). | `hourly` |\n\n\n`period` is specified as a set of time spans. Each time span is an integer followed by a unit suffix, for example: `2 days 3h 24min`. The supported units are:\n  - `nsec`, `ns` -- nanoseconds\n  - `usec`, `us` -- microseconds\n  - `msec`, `ms` -- milliseconds\n  - `seconds`, `second`, `sec`, `s`\n  - `minutes`, `minute`, `min`, `m`\n  - `hours`, `hour`, `hr`, `h`\n  - `days`, `day`, `d`\n  - `weeks`, `week`, `w`\n  - `months`, `month`, `M` -- a month is defined as `30.44 days`\n  - `years`, `year`, `y` -- a year is defined as `365.25 days`\n"
  },
  {
    "path": "docs/configuration/index.md",
    "content": "---\ntitle: Configuration Reference\n---\n\nimport DocCardList from '@theme/DocCardList';\n\n<DocCardList />\n\n\n"
  },
  {
    "path": "docs/configuration/lambda-config.md",
    "content": "---\ntitle: Lambda configuration\nsidebar_position: 6\n---\n\nQuickwit supports offloading leaf search operations to AWS Lambda for horizontal scaling. When the local search queue becomes saturated, overflow splits are automatically sent to Lambda functions for processing.\n\n:::note\nLambda offloading is currently only supported on AWS.\n:::\n\n## How it works\n\nLambda offloading is **only active when a `lambda` configuration section is present** under `searcher` in your node configuration. When configured:\n\n1. Quickwit monitors the local search queue depth\n2. When pending searches exceed the `offload_threshold`, new splits are sent to Lambda instead of being queued locally\n3. Lambda returns per-split search results that are cached and merged with local results\n\nThis allows Quickwit to handle traffic spikes without provisioning additional searcher nodes.\n\n## Startup validation\n\nWhen a `lambda` configuration is defined, Quickwit performs a **dry run invocation** at startup to verify that:\n- The Lambda function exists\n- The function version matches the embedded binary\n- The invoker has permission to call the function\n\nIf this validation fails, **Quickwit will fail to start**. This ensures that Lambda offloading works correctly before the node begins serving traffic.\n\n## Configuration\n\nAdd a `lambda` section under `searcher` in your node configuration:\n\n```yaml\nsearcher:\n  lambda:\n    offload_threshold: 100\n    auto_deploy:\n      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role\n      memory_size: 5 GiB\n      invocation_timeout_secs: 15\n```\n\n### Lambda configuration options\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `function_name` | Name of the AWS Lambda function to invoke. | `quickwit-lambda-search` |\n| `max_splits_per_invocation` | Maximum number of splits to send in a single Lambda invocation. Must be at least 1. 
| `10` |\n| `offload_threshold` | Number of pending local searches before offloading to Lambda. A value of `0` offloads everything to Lambda. | `100` |\n| `auto_deploy` | Auto-deployment configuration. If set, Quickwit automatically deploys or updates the Lambda function at startup. | (none) |\n\n### Auto-deploy configuration options\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `execution_role_arn` | **Required.** IAM role ARN for the Lambda function's execution role. | |\n| `memory_size` | Memory allocated to the Lambda function. More memory provides more CPU. | `5 GiB` |\n| `invocation_timeout_secs` | Timeout for Lambda invocations in seconds. | `15` |\n\n## Deployment options\n\n### Automatic deployment (recommended)\n\nWith `auto_deploy` configured, Quickwit automatically:\n1. Creates the Lambda function if it doesn't exist\n2. Updates the function code if the embedded binary has changed\n3. Publishes a new version with a unique identifier\n4. Garbage collects old versions (keeps current + 5 most recent)\n\nThis is the recommended approach as it ensures the Lambda function always matches the Quickwit binary version.\n\n### Manual deployment\n\nYou can deploy the Lambda function manually without `auto_deploy`:\n1. Download the Lambda zip from [GitHub releases](https://github.com/quickwit-oss/quickwit/releases)\n2. Create or update the Lambda function using AWS CLI, Terraform, or the AWS Console\n3. 
Publish a version with description format `quickwit_{version}_{sha256}_{timeout}_{deploy_config}` (e.g., `quickwit_0_8_0_fa940f44_5120_60s_6c3b2`)\n\nThe description must match the format Quickwit expects, or it won't find the function version.\n\n## IAM permissions\n\n### Permissions for the Quickwit node\n\nThe IAM role or user running Quickwit needs the following permissions to invoke Lambda:\n\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": [\n        \"lambda:InvokeFunction\"\n      ],\n      \"Resource\": \"arn:aws:lambda:*:*:function:quickwit-lambda-search:*\"\n    }\n  ]\n}\n```\n\nIf using `auto_deploy`, additional permissions are required for deployment:\n\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": [\n        \"lambda:CreateFunction\",\n        \"lambda:GetFunction\",\n        \"lambda:UpdateFunctionCode\",\n        \"lambda:PublishVersion\",\n        \"lambda:ListVersionsByFunction\",\n        \"lambda:DeleteFunction\"\n      ],\n      \"Resource\": \"arn:aws:lambda:*:*:function:quickwit-lambda-search\"\n    },\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": \"iam:PassRole\",\n      \"Resource\": \"arn:aws:iam::*:role/quickwit-lambda-role\",\n      \"Condition\": {\n        \"StringEquals\": {\n          \"iam:PassedToService\": \"lambda.amazonaws.com\"\n        }\n      }\n    }\n  ]\n}\n```\n\n### Lambda execution role\n\nThe Lambda function requires an execution role with S3 read access to your index data.\n\nExample policy:\n\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": \"s3:GetObject\",\n      \"Resource\": \"arn:aws:s3:::your-index-bucket/*\"\n    }\n  ]\n}\n```\n\nThe execution role must also have a trust policy allowing Lambda to assume it:\n\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n    
  \"Effect\": \"Allow\",\n      \"Principal\": {\n        \"Service\": \"lambda.amazonaws.com\"\n      },\n      \"Action\": \"sts:AssumeRole\"\n    }\n  ]\n}\n```\n\n## CloudWatch logging\n\nThe Lambda function emits structured logs (JSON) to stdout. To have these logs captured by CloudWatch, add the following iam permissions to the Lambda execution role:\n\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": [\n        \"logs:CreateLogGroup\",\n        \"logs:CreateLogStream\",\n        \"logs:PutLogEvents\"\n      ],\n      \"Resource\": \"arn:aws:logs:*:*:*\"\n    }\n  ]\n}\n```\n\nNo additional configuration is needed on the Quickwit side.\n\n## Versioning\n\nQuickwit uses content-based versioning for Lambda:\n- A SHA256 hash of the Lambda binary is computed at build time\n- This hash is embedded in the Lambda function description as `quickwit:{version}-{sha256_short}`\n- When Quickwit starts, it searches for a version matching this description\n- Different Quickwit builds with the same Lambda binary share the same Lambda version\n- Updating the Lambda binary automatically triggers a new deployment\n\n## Example configuration\n\n\nMinimal configuration (with auto-deployment):\n\n```yaml\nsearcher:\n  lambda:\n    auto_deploy:\n      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role\n```\n\n\nFull configuration (auto-deployment):\n\n```yaml\nsearcher:\n  lambda:\n    function_name: quickwit-lambda-search\n    max_splits_per_invocation: 10\n    offload_threshold: 10\n    auto_deploy:\n      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role\n      memory_size: 5 GiB\n      invocation_timeout_secs: 15\n```\n\nAggressive offloading (send everything to Lambda):\n\n```yaml\nsearcher:\n  lambda:\n    function_name: quickwit-lambda-search\n    offload_threshold: 0\n    auto_deploy:\n      execution_role_arn: 
arn:aws:iam::123456789012:role/quickwit-lambda-role\n```\n"
  },
  {
    "path": "docs/configuration/metastore-config.md",
    "content": "---\ntitle: Metastore configuration\nsidebar_position: 4\n---\n\nQuickwit needs a place to store meta-information about its indexes.\n\nFor instance:\n\n- The index configuration.\n- Meta-information about its splits. For instance, their IDs, the number of documents they contain, their sizes, their min/max timestamp, and the set of tags present in the split.\n- The different sources checkpoints.\n- Some extra information such as the index creation time.\n\nThe metastore is entirely defined by a single URI. One can set it by editing the `metastore_uri` parameter of the [node configuration file](./node-config.md) (often named `quickwit.yaml`).\n\nCurrently, Quickwit offers two implementations:\n\n- **PostgreSQL**: recommended for distributed usage.\n- **File-backed implementation**.\n\n# PostgreSQL Metastore\n\nWe recommend the PostgreSQL metastore for any distributed usage.\n\nThe PostgreSQL metastore can be configured by setting a PostgreSQL URI in the `metastore_uri` parameter of the Quickwit configuration file. The URI takes the following format:\n\n```\npostgres://[user]:[password]@[host]:[port]/[dbname]\n```\n\nSome of those parameters can be omitted. The following PostgreSQL URIs are for instance valid:\n\n```\npostgres://localhost/mydb\npostgres://user@localhost\npostgres://user:secret@localhost\n```\n\nThe database has to be created in advance.\n\nOn its first execution, Quickwit will transparently create the necessary tables.\n\nLikewise, if you upgrade Quickwit to a version that includes some changes in the PostgreSQL schema, Quickwit will transparently operate the migration startup.\n\n# File-backed metastore\n\nFor convenience, Quickwit also makes it possible to store its metadata in files using a file-backed metastore. 
In that case, Quickwit will write one file per index.\n\nThe metastore is then configured by passing a [storage URI](storage-config#storage-uris) that will serve as the root of the metastore storage.\n\nThe metadata file associated with a given index will then be stored under\n\n  `[storage_uri]/[index_id]/metastore.json`\n\nFor the moment, Quickwit supports two storage types:\n\n- a local file system URI (e.g., `file:///opt/toto`). It is also valid to pass a file path directly, without the `file://` prefix (e.g., `/var/quickwit`). Relative paths will be resolved with respect to the current working directory.\n- S3-compatible storage URI (e.g., `s3://my-bucket/some-path`). See the [storage config](storage-config) documentation to configure S3 or S3-compatible storage providers.\n\n### Polling configuration\n\nBy default, the File-Backed Metastore is only read once when you start a Quickwit process (searcher, indexer, ...).\n\nYou can also configure it to poll the File-Backed Metastore periodically to keep a fresh view of it. This is useful for a Searcher instance that needs to be aware of new splits published by an Indexer running in parallel.\n\nTo configure the polling interval (in seconds), add a URI fragment to the storage URI as follows: `s3://quickwit/my-indexes#polling_interval=30s`\n\n:::note\nThe polling interval can be configured in seconds only; other units, such as minutes or hours, are not supported.\n:::\n\n:::tip\nAmazon S3 charges $0.0004 per 1000 GET requests. 
Polling a metastore every 30 seconds costs $0.04 per month and index.\n:::\n\n### Examples\n\nThe following file-backed metastore URIs for instance are valid:\n\n```markdown\ns3://my-indexes\ns3://quickwit/my-indexes\ns3://quickwit/my-indexes#polling_interval=30s\nfile:///local/indices\nfile:///local/indices#polling_interval=30s\n/local/indices\n./quickwit-metastores\n```\n\n:::caution\nThe file-backed metastore does not support multiple instances running at the same time because it does not implement any locking mechanism to prevent concurrent writes from overwriting each other. Ensure that only one file-backed metastore instance is running at all times.\n:::\n"
  },
  {
    "path": "docs/configuration/node-config.md",
    "content": "---\ntitle: Node configuration\nsidebar_position: 1\n---\n\nThe node configuration allows you to customize and optimize the settings for individual nodes in your cluster. It is divided into several sections:\n\n- Common configuration settings: shared top-level properties\n- Storage settings: defined in the [storage](#storage-configuration) section\n- Metastore settings: defined in the [metastore](#metastore-configuration) section\n- Ingest settings: defined in the [ingest_api](#ingest-api-configuration) section\n- Indexer settings: defined in the [indexer](#indexer-configuration) section\n- Searcher settings: defined in the [searcher](#searcher-configuration) section\n- Jaeger settings: defined in the [jaeger](#jaeger-configuration) section\n\nA commented example is available here: [quickwit.yaml](https://github.com/quickwit-oss/quickwit/blob/main/config/quickwit.yaml).\n\n## Common configuration\n\n| Property | Description | Env variable | Default value |\n| --- | --- | --- | --- |\n| `version` | Config file version. `0.7` is the only available value with a retro compatibility on `0.5` and `0.4`. | | |\n| `cluster_id` | Unique identifier of the cluster the node will be joining. Clusters sharing the same network should use distinct cluster IDs.| `QW_CLUSTER_ID` | `quickwit-default-cluster` |\n| `node_id` | Unique identifier of the node. It must be distinct from the node IDs of its cluster peers. Defaults to the instance's short hostname if not set. | `QW_NODE_ID` | short hostname |\n| `enabled_services` | Enabled services (control_plane, indexer, janitor, metastore, searcher) | `QW_ENABLED_SERVICES` | all services |\n| `listen_address` | The IP address or hostname that Quickwit service binds to for starting REST and GRPC server and connecting this node to other nodes. By default, Quickwit binds itself to 127.0.0.1 (localhost). This default is not valid when trying to form a cluster. 
| `QW_LISTEN_ADDRESS` | `127.0.0.1` |\n| `advertise_address` | IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs. | `QW_ADVERTISE_ADDRESS` | `listen_address` |\n| `gossip_listen_port` | The port on which the gossip cluster membership service listens (UDP). | `QW_GOSSIP_LISTEN_PORT` | `rest.listen_port` |\n| `grpc_listen_port` | The port on which gRPC services listen for traffic. | `QW_GRPC_LISTEN_PORT` | `rest.listen_port + 1` |\n| `peer_seeds` | List of IP addresses or hostnames used to bootstrap the cluster and discover the complete set of nodes. This list may contain the current node address and does not need to be exhaustive. If the list of peer seeds contains a hostname, Quickwit will resolve it by querying the DNS every minute. On Kubernetes, for instance, it is good practice to set it to a [headless service](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services). | `QW_PEER_SEEDS` | |\n| `data_dir` | Path to the directory where data (temporary data, splits kept for caching purposes) is persisted. This is mostly used in indexing. | `QW_DATA_DIR` | `./qwdata` |\n| `metastore_uri` | Metastore URI. Can be a local directory or `s3://my-bucket/indexes` or `postgres://username:password@localhost:5432/metastore`. [Learn more about the metastore configuration](metastore-config.md). | `QW_METASTORE_URI` | `{data_dir}/indexes` |\n| `default_index_root_uri` | Default index root URI that defines the location where index data (splits) is stored. The index URI is built following the scheme: `{default_index_root_uri}/{index-id}` | `QW_DEFAULT_INDEX_ROOT_URI` | `{data_dir}/indexes` |\n| environment variable only | Log level of Quickwit. 
Can be a direct log level, or a comma-separated list of `module_name=level` | `RUST_LOG` | `info` |\n\n## REST configuration\n\nThis section contains the REST API configuration options.\n\n| Property | Description | Env variable | Default value |\n| --- | --- | --- | --- |\n| `listen_port` | The port on which the REST API listens for HTTP traffic. | `QW_REST_LISTEN_PORT` | `7280` |\n| `cors_allow_origins` | CORS origins that are allowed to access the API. [Read more](#configuring-cors-cross-origin-resource-sharing) | | |\n| `extra_headers` | List of header names and values | | |\n\n### Configuring CORS (Cross-origin resource sharing)\n\nCORS (Cross-origin resource sharing) describes which addresses or origins can access the REST API from the browser.\nBy default, sharing resources cross-origin is not allowed.\n\nA wildcard, single origin, or multiple origins can be specified as part of the `cors_allow_origins` parameter.\n\nExample of a REST configuration:\n\n```yaml\nrest:\n  listen_port: 1789\n  extra_headers:\n    x-header-1: header-value-1\n    x-header-2: header-value-2\n  cors_allow_origins: '*'\n\n#   cors_allow_origins: https://my-hdfs-logs.domain.com   # Optionally we can specify one domain\n#   cors_allow_origins:                                   # Or allow multiple origins\n#     - https://my-hdfs-logs.domain.com\n#     - https://my-hdfs.other-domain.com\n```\n\n## gRPC configuration\n\nThis section contains the configuration options for gRPC services and clients used for internal communication between nodes.\n\n| Property | Description | Env variable | Default value |\n| --- | --- | --- | --- |\n| `max_message_size` | The maximum size (in bytes) of messages exchanged by internal gRPC clients and services. 
| | `20 MiB` |\n\nExample of a gRPC configuration:\n\n```yaml\ngrpc:\n  max_message_size: 30 MiB\n```\n\n:::warning\nWe advise changing the default value of 20 MiB only if you encounter the following error:\n`Error, message length too large: found 24732228 bytes, the limit is: 20971520 bytes.` In that case, increase `max_message_size` by increments of 10 MiB until the issue disappears. This is a temporary fix: the next version of Quickwit will rely exclusively on gRPC streaming endpoints and handle messages of any length.\n:::\n\n## Storage configuration\n\nPlease refer to the dedicated [storage configuration](storage-config) page to learn more about configuring Quickwit for various storage providers.\n\nHere are also some minimal examples of how to configure Quickwit with Amazon S3 or Alibaba OSS:\n\n```bash\nAWS_ACCESS_KEY_ID=<your access key ID>\nAWS_SECRET_ACCESS_KEY=<your secret access key>\n```\n\n*Amazon S3*\n\n```yaml\nstorage:\n  s3:\n    region: us-east-1\n```\n\n*Alibaba*\n\n```yaml\nstorage:\n  s3:\n    region: us-east-1\n    endpoint: https://oss-us-east-1.aliyuncs.com\n```\n\n## Metastore configuration\n\nThis section may contain one configuration subsection per available metastore implementation. The specific configuration parameters for each implementation may vary. Currently, the available metastore implementations are:\n- File-backed\n- PostgreSQL\n\n### File-backed metastore configuration\n\nFile-backed metastore doesn't have any node level configuration. You can configure the poll interval [at the index level](./metastore-config.md#polling-configuration).\n\n### PostgreSQL metastore configuration\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `min_connections` | Minimum number of connections to maintain in the pool at all times. | `0` |\n| `max_connections` | Maximum number of connections to maintain in the pool. 
| `10` |\n| `acquire_connection_timeout` | Maximum amount of time to spend waiting for an available connection before aborting a query. | `10s` |\n| `idle_connection_timeout` | Maximum idle duration before closing individual connections. | `10min` |\n| `max_connection_lifetime` | Maximum lifetime of individual connections. | `30min` |\n\nExample of a metastore configuration for PostgreSQL in YAML format:\n\n```yaml\nmetastore:\n  postgres:\n    min_connections: 10\n    max_connections: 50\n    acquire_connection_timeout: 30s\n    idle_connection_timeout: 1h\n    max_connection_lifetime: 1d\n```\n\n## Indexer configuration\n\nThis section contains the configuration options for an indexer. The split store is documented in the [indexing document](../overview/concepts/indexing.md#split-store).\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `split_store_max_num_bytes` | Maximum size in bytes allowed in the split store. | `100G` |\n| `split_store_max_num_splits` | Maximum number of files allowed in the split store. | `1000` |\n| `max_concurrent_split_uploads` | Maximum number of concurrent split uploads allowed on the node. | `12` |\n| `merge_concurrency` | Maximum number of merge operations that can be executed on the node at one point in time. | `(2 x num threads available) / 3` |\n| `enable_otlp_endpoint` | If true, enables the OpenTelemetry exporter endpoint to ingest logs and traces via the OpenTelemetry Protocol (OTLP). | `false` |\n| `cpu_capacity` | Advisory parameter used by the control plane. The value can be expressed in threads (e.g. `2`) or in terms of millicpus (e.g. `2000m`). The control plane will attempt to schedule indexing pipelines on the different nodes proportionally to the CPU capacity advertised by the indexer. It is NOT used as a limit. All pipelines will be scheduled regardless of whether the cluster has sufficient capacity or not. 
The control plane does not attempt to spread the work equally when the load is well below the `cpu_capacity`. Users who need a balanced load on all of their indexer nodes can set the `cpu_capacity` to an arbitrarily low value as long as they keep it proportional to the number of threads available. | `num threads available` |\n| `enable_cooperative_indexing` | Enables more efficient resource sharing when the number of indexes actively written to is significantly higher than the number of cores, but might decrease the overall indexing throughput. | `false` |\n\nExample:\n\n```yaml\nindexer:\n  split_store_max_num_bytes: 100G\n  split_store_max_num_splits: 1000\n  max_concurrent_split_uploads: 12\n  enable_otlp_endpoint: true\n```\n\n## Ingest API configuration\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `max_queue_memory_usage` | Maximum size in bytes of the in-memory Ingest queue. | `2GiB` |\n| `max_queue_disk_usage` | Maximum disk space in bytes taken by the Ingest queue. It must be at least `256M` and at least `max_queue_memory_usage`. | `4GiB` |\n| `content_length_limit` | Maximum uncompressed payload size. Increasing this is discouraged; use a [file source](../ingest-data/sqs-files.md) instead. | `10MiB` |\n| `grpc_compression_algorithm` | Compression algorithm (`gzip` or `zstd`) to use for gRPC traffic between nodes for the ingest service | `None` |\n\nExample:\n\n```yaml\ningest_api:\n  max_queue_memory_usage: 2GiB\n  max_queue_disk_usage: 4GiB\n  content_length_limit: 10MiB\n  grpc_compression_algorithm: zstd\n```\n\n## Searcher configuration\n\nThis section contains the configuration options for a Searcher.\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `aggregation_memory_limit` | Controls the maximum amount of memory that can be used for aggregations before aborting. This limit is per searcher node. A node may run concurrent queries, which share the limit. 
The first query that hits the limit is aborted and its memory is freed. It is used to prevent excessive memory usage during the aggregation phase, which can lead to performance degradation or crashes. | `500M`|\n| `aggregation_bucket_limit` | Determines the maximum number of buckets returned to the client. | `65000` |\n| `fast_field_cache_capacity` | Fast field in-memory cache capacity on a Searcher. If you filter by dates, run aggregations or range queries, or use tracing, it might be worth increasing this parameter. The [metrics](../reference/metrics.md) starting with `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. | `1G` |\n| `split_footer_cache_capacity` | Split footer in-memory cache (it is essentially the hotcache) capacity on a Searcher.| `500M` |\n| `partial_request_cache_capacity` | Partial request in-memory cache capacity on a Searcher. Caches intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`. | `64M` |\n| `max_num_concurrent_split_searches` | Maximum number of concurrent split search requests running on a Searcher. | `100` |\n| `split_cache` | Searcher split cache configuration options defined in the section below. Cache disabled if unspecified. | |\n| `request_timeout_secs` | The time before a search request is cancelled. This should match the timeout of the stack calling into Quickwit if there is one set. | `30` |\n\n### Searcher split cache configuration\n\nThis section contains the configuration options for the on-disk searcher split cache. Files are stored in the data directory under `searcher-split-cache/`.\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `max_num_bytes` | Maximum disk size in bytes allowed in the split cache. Can be exceeded by the size of one split. | |\n| `max_num_splits` | Maximum number of splits allowed in the split cache.   
| `10000` |\n| `num_concurrent_downloads` | Maximum number of concurrent split downloads. | `1` |\n\n\nExample:\n\n```yaml\nsearcher:\n  fast_field_cache_capacity: 1G\n  split_footer_cache_capacity: 500M\n  partial_request_cache_capacity: 64M\n  split_cache:\n    max_num_bytes: 1G\n    max_num_splits: 10000\n    num_concurrent_downloads: 1\n```\n\n## Jaeger configuration\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `enable_endpoint` | If true, enables the gRPC endpoint that allows the Jaeger Query Service to connect and retrieve traces. | `false` |\n\nExample:\n\n```yaml\njaeger:\n  enable_endpoint: true\n```\n\n\n## Using environment variables in the configuration\n\nYou can use environment variable references in the config file to set values that need to be configurable during deployment. To do this, use:\n\n`${VAR_NAME}`\n\nwhere `VAR_NAME` is the name of the environment variable.\n\nEach variable reference is replaced at startup by the value of the environment variable. The replacement is case-sensitive and occurs before the configuration file is parsed. Referencing undefined variables throws an error unless you specify a default value or custom error text.\n\nTo specify a default value, use:\n\n`${VAR_NAME:-default_value}`\n\nwhere `default_value` is the value to use if the environment variable is unset.\n\n```\n<config_field>: ${VAR_NAME}\nor\n<config_field>: ${VAR_NAME:-default value}\n```\n\nFor example:\n\n```bash\nexport QW_LISTEN_ADDRESS=0.0.0.0\n```\n\n```yaml\n# config.yaml\nversion: 0.7\ncluster_id: quickwit-cluster\nnode_id: my-unique-node-id\nlisten_address: ${QW_LISTEN_ADDRESS}\nrest:\n  listen_port: ${QW_LISTEN_PORT:-1111}\n```\n\nThis will be interpreted by Quickwit as:\n\n```yaml\nversion: 0.7\ncluster_id: quickwit-cluster\nnode_id: my-unique-node-id\nlisten_address: 0.0.0.0\nrest:\n  listen_port: 1111\n```\n"
  },
  {
    "path": "docs/configuration/ports-config.md",
    "content": "---\ntitle: Ports configuration\nsidebar_position: 6\n---\n\nWhen starting a Quickwit search server, one important parameter that can be configured is\nthe `rest.listen_port` (defaults to :7280).\n\nInternally, Quickwit will, in fact, use three sockets. The ports of these three sockets\ncannot be configured independently at the moment.\nThe ports used are computed relative to the `rest.listen_port` port, as follows.\n\n\n| Service                       | Port used                 | Protocol |  Default  |\n|-------------------------------|---------------------------|----------|-----------|\n| HTTP server with the REST API | `${rest.listen_port}`     |   TCP    | 7280      |\n| Cluster membership            | `${rest.listen_port}`     |   UDP    | 7280      |\n| gRPC service                  | `${rest.listen_port} + 1` |   TCP    | 7281      |\n\nIt is not possible for the moment to configure these ports independently.\n\n\nIn order to form a cluster, you will also need to define a `peer_seeds` parameter.\nThe following addresses are valid peer seed addresses:\n\n| Type | Example without port | Example with port         |\n|--------------|--------------|---------------------------|\n| IPv4         | 172.1.0.12   | 172.1.0.12:7180           |\n| IPv6         | 2001:0db8:85a3:0000:0000:8a2e:0370:7334  | [2001:0db8:85a3:0000:0000:8a2e:0370:7334]:7180 |\n| hostname     | node3        | node3:7180                |\n\nIf no port is specified in a peer node address, a Quickwit node will assume the peer is using the same port as itself.\n"
  },
  {
    "path": "docs/configuration/source-config.md",
    "content": "---\ntitle: Source configuration\nsidebar_position: 5\n---\n\nQuickwit can insert data into an index from one or multiple sources.\nA source can be added after index creation using the [CLI command](../reference/cli.md#source) `quickwit source create`.\nIt can also be enabled or disabled with the `quickwit source enable/disable` subcommands.\n\nA source is declared using an object called source config, which defines the source's settings. It consists of multiple parameters:\n\n- source ID\n- source type\n- source parameters\n- input_format\n- maximum number of pipelines per indexer (optional)\n- desired number of pipelines (optional)\n- transform parameters (optional)\n\n## Source ID\n\nThe source ID is a string that uniquely identifies the source within an index. It may only contain uppercase or lowercase ASCII letters, digits, hyphens (`-`), and underscores (`_`). Finally, it must start with a letter and contain at least 3 characters but no more than 255.\n\n## Source type\n\nThe source type designates the kind of source being configured. As of version 0.5, available source types are `ingest-api`, `kafka`, `kinesis`, and `pulsar`. The `file` type is also supported but only for local ingestion from [the CLI](/docs/reference/cli.md#tool-local-ingest).\n\n## Source parameters\n\nThe source parameters indicate how to connect to a data store and are specific to the source type.\n\n### File source\n\nA file source reads data from files containing JSON objects separated by newlines (NDJSON). Gzip compression is supported provided that the file name ends with the `.gz` suffix.\n\n#### Ingest a single file (CLI only)\n\nTo ingest a specific file, run the indexing directly in an adhoc CLI process with:\n\n```bash\n./quickwit tool local-ingest --index <index> --input-path <input-path>\n```\n\nBoth local and object files are supported, provided that the environment is configured with the appropriate permissions. 
A tutorial is available [here](/docs/ingest-data/ingest-local-file.md).\n\n#### Notification based file ingestion (beta)\n\nQuickwit can automatically ingest all new files that are uploaded to an S3 bucket. This requires creating and configuring an [SQS notification queue](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ways-to-add-notification-config-to-bucket.html). A complete example can be found [in this tutorial](/docs/ingest-data/sqs-files.md).\n\n\nThe `notifications` parameter takes an array of notification settings. Currently one notifier can be configured per source and only the SQS notification `type` is supported.\n\nRequired fields for the SQS `notifications` parameter items:\n- `type`: `sqs`\n- `queue_url`: complete URL of the SQS queue (e.g `https://sqs.us-east-1.amazonaws.com/123456789012/queue-name`)\n- `message_type`: format of the message payload, either\n  - `s3_notification`: an [S3 event notification](https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html)\n  - `raw_uri`: a message containing just the file object URI (e.g. 
`s3://mybucket/mykey`)\n- `deduplication_window_duration_sec`: maximum duration for which ingested files checkpoints are kept (default 3600)\n- `deduplication_window_max_messages`: maximum number of ingested file checkpoints kept (default 100k)\n- `deduplication_cleanup_interval_secs`: frequency at which outdated file checkpoints are cleaned up\n\n*Adding a file source with SQS notifications to an index with the [CLI](../reference/cli.md#source)*\n\n```bash\ncat << EOF > source-config.yaml\nversion: 0.8\nsource_id: my-sqs-file-source\nsource_type: file\nnum_pipelines: 2\nparams:\n  notifications:\n    - type: sqs\n      queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/queue-name\n      message_type: s3_notification\nEOF\n./quickwit source create --index my-index --source-config source-config.yaml\n```\n\n:::note\n\n- Quickwit does not automatically delete the source files after a successful ingestion. You can use [S3 object expiration](https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html) to configure how long they should be retained in the bucket.\n- Configure the notification to only forward events of type `s3:ObjectCreated:*`. Other events are acknowledged by the source without further processing and a warning is logged.\n- We strongly recommend using a [dead letter queue](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html) to receive all messages that couldn't be processed by the file source. A `maxReceiveCount` of 5 is a good default value. Here are some common situations where the notification message ends up in the dead letter queue:\n  - the notification message could not be parsed (e.g. it is not a valid S3 notification)\n  - the file was not found\n  - the file is corrupted (e.g. unexpected compression)\n- AWS S3 notifications and AWS SQS provide \"at least once\" delivery guarantees. 
To avoid duplicates, the file source includes a mechanism that prevents the same file from being ingested twice. It works by storing checkpoints in the metastore that track the indexing progress for each file. You can decrease `deduplication_window_*` or increase `deduplication_cleanup_interval_secs` to reduce the load on the metastore.\n\n:::\n\n### Ingest API source\n\nAn ingest API source reads data from the [Ingest API](/docs/reference/rest-api.md#ingest-data-into-an-index). This source is automatically created at index creation and can be neither deleted nor disabled.\n\n### Kafka source\n\nA Kafka source reads data from a Kafka stream. Each message in the stream must hold a JSON object.\n\nA tutorial is available [here](/docs/ingest-data/kafka.md).\n\n#### Kafka source parameters\n\nThe Kafka source consumes a `topic` using the client library [librdkafka](https://github.com/edenhill/librdkafka) and forwards the key-value pairs carried by the parameter `client_params` to the underlying librdkafka consumer. Common `client_params` options include the bootstrap servers (`bootstrap.servers`) and the security protocol (`security.protocol`). Please refer to the [Kafka](https://kafka.apache.org/documentation/#consumerconfigs) and [librdkafka](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) documentation pages for more advanced options.\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `topic` | Name of the topic to consume. | required |\n| `client_log_level` | librdkafka client log level. Possible values are: debug, info, warn, error. | `info` |\n| `client_params` | librdkafka client configuration parameters. | `{}` |\n| `enable_backfill_mode` | Backfill mode stops the source after reaching the end of the topic. 
| `false` |\n\n**Kafka client parameters**\n\n- `bootstrap.servers`\nComma-separated list of host and port pairs that are the addresses of a subset of the Kafka brokers in the Kafka cluster.\n\n- `auto.offset.reset`\nDefines the behavior of the source when consuming a partition for which there is no initial offset saved in the checkpoint. `earliest` consumes from the beginning of the partition, whereas `latest` (default) consumes from the end.\n\n- `enable.auto.commit`\nThis setting is ignored because the Kafka source manages commit offsets internally using the [checkpoint API](../overview/concepts/indexing.md#checkpoint) and forces auto-commits to be disabled.\n\n- `group.id`\nKafka-based distributed indexing relies on consumer groups. Unless overridden in the client parameters, the default group ID assigned to each consumer managed by the source is `quickwit-{index_uid}-{source_id}`.\n\n- `max.poll.interval.ms`\nShort max poll interval durations may cause a source to crash when back pressure from the indexer occurs. Therefore, Quickwit recommends using the default value of `300000` (5 minutes).\n\n*Adding a Kafka source to an index with the [CLI](../reference/cli.md#source)*\n\n```bash\ncat << EOF > source-config.yaml\nversion: 0.8\nsource_id: my-kafka-source\nsource_type: kafka\nnum_pipelines: 2\nparams:\n  topic: my-topic\n  client_params:\n    bootstrap.servers: localhost:9092\n    security.protocol: SSL\nEOF\n./quickwit source create --index my-index --source-config source-config.yaml\n```\n\n### Kinesis source\n\nA Kinesis source reads data from an [Amazon Kinesis](https://aws.amazon.com/kinesis/) stream. Each message in the stream must hold a JSON object.\n\nA tutorial is available [here](/docs/ingest-data/kinesis.md).\n\n**Kinesis source parameters**\n\nThe Kinesis source consumes a stream identified by a `stream_name` and a `region`.\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `stream_name` | Name of the stream to consume. 
| required |\n| `region` | The AWS region of the stream. Mutually exclusive with `endpoint`. | `us-east-1` |\n| `endpoint` | Custom endpoint for use with AWS-compatible Kinesis service. Mutually exclusive with `region`. | optional |\n\nIf no region is specified, Quickwit will attempt to find one in multiple other locations, in the following order of precedence:\n\n1. Environment variables (`AWS_REGION` then `AWS_DEFAULT_REGION`)\n\n2. Config file, typically located at `~/.aws/config` or otherwise specified by the `AWS_CONFIG_FILE` environment variable if set and not empty.\n\n3. Amazon EC2 instance metadata service determining the region of the currently running Amazon EC2 instance.\n\n4. Default value: `us-east-1`\n\n*Adding a Kinesis source to an index with the [CLI](../reference/cli.md#source)*\n\n```bash\ncat << EOF > source-config.yaml\nversion: 0.7\nsource_id: my-kinesis-source\nsource_type: kinesis\nparams:\n  stream_name: my-stream\nEOF\nquickwit source create --index my-index --source-config source-config.yaml\n```\n\n### Pulsar source\n\nA Pulsar source reads data from one or several Pulsar topics. Each message in the topic(s) must hold a JSON object.\n\nA tutorial is available [here](/docs/ingest-data/pulsar.md).\n\n**Pulsar source parameters**\n\nThe Pulsar source consumes `topics` using the client library [pulsar-rs](https://github.com/streamnative/pulsar-rs).\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `topics` | List of topics to consume. | required |\n| `address` | Pulsar URL (pulsar:// and pulsar+ssl://). | required |\n| `consumer_name` | The consumer name to register with the Pulsar source. 
| `quickwit` |\n\n*Adding a Pulsar source to an index with the [CLI](../reference/cli.md#source)*\n\n```bash\ncat << EOF > source-config.yaml\nversion: 0.7\nsource_id: my-pulsar-source\nsource_type: pulsar\nparams:\n  topics:\n    - my-topic\n  address: pulsar://localhost:6650\nEOF\n./quickwit source create --index my-index --source-config source-config.yaml\n```\n\n## Number of pipelines\n\nThe `num_pipelines` parameter is only available for distributed sources like Kafka, GCP PubSub, and Pulsar.\n\nIt defines the number of pipelines to run on a cluster for the source. The actual placement of these pipelines on the different indexers\nis decided by the control plane.\n\n:::info\n\nNote that distributing the indexing load of partitioned sources like Kafka is done by assigning the different partitions to different pipelines. As a result, it is important to ensure that the number of partitions is a multiple of `num_pipelines`.\n\nAlso, assuming you are only indexing a single Kafka source in your Quickwit cluster, you should set the number of pipelines to a multiple of the number of indexers. Finally, if your indexing throughput is high, you should provision between 2 and 4 vCPUs per pipeline.\n\nFor instance, assume you want to index a 60-partition topic, with each partition receiving a throughput of 10 MB/s. If you measured that Quickwit can index your data at a pace of 40 MB/s per pipeline, a possible setting could be:\n- 5 indexers with 8 vCPUs each\n- 15 pipelines\n\nEach indexer will then be in charge of 3 pipelines, and each pipeline will cover 4 partitions.\n:::\n\n\n## Transform parameters\n\nFor all source types but the `ingest-api`, ingested documents can be transformed before being indexed using [Vector Remap Language (VRL)](https://vector.dev/docs/reference/vrl/) scripts.\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `script` | Source code of the VRL program executed to transform documents. 
| required |\n| `timezone` | Timezone used in the VRL program for date and time manipulations. It must be a valid name in the [TZ database](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). | `UTC` |\n\n```yaml\n# Your source config here\n# ...\ntransform:\n  script: |\n    .message = downcase(string!(.message))\n    .timestamp = now()\n    del(.username)\n  timezone: local\n```\n\n## Input format\n\nThe `input_format` parameter specifies the expected data format of the source. The formats currently supported are:\n- `json` (default)\n- `otlp_logs_json`\n- `otlp_logs_proto`\n- `otlp_traces_json`\n- `otlp_traces_proto`\n- `plain_text`\n\n*OTLP formats*\n\nWhen ingesting OTLP data into an OTLP logs or traces index with a source other than the native OTEL endpoints, use this parameter to specify whether the exported logs or traces will be serialized in JSON or Protobuf. When possible, prefer the latter, which is a more compact encoding.\n\n*Plain text format*\n\nUse this parameter for unstructured text data. Internally, Quickwit can only index JSON data. To allow the ingestion of plain text documents, Quickwit transforms them on the fly into JSON objects of the following form: `{\"plain_text\": \"<original plain text document>\"}`. Then, they can be optionally transformed into more complex documents using a VRL script 
(see [transform feature](#transform-parameters)).\n\nThe following is an example of how one could parse and transform a CSV dataset containing a list of users described by 3 attributes: first name, last name, and age.\n\n```yaml\n# Your source config here\n# ...\ninput_format: plain_text\ntransform:\n  script: |\n    user = parse_csv!(.plain_text)\n    .first_name = user[0]\n    .last_name = user[1]\n    .age = to_int!(user[2])\n    del(.plain_text)\n```\n\n## Enabling/disabling a source from an index\n\nA source can be enabled or disabled from an index using the [CLI command](../reference/cli.md) `quickwit source enable` or `quickwit source disable`:\n\n```bash\nquickwit source disable --index my-index --source my-source\n```\n\nA source is enabled by default. When disabling a source, the related indexing pipelines will be shut down on each relevant indexer and indexing for this source will be paused.\n\n## Deleting a source from an index\n\nA source can be removed from an index using the [CLI command](../reference/cli.md) `quickwit source delete`:\n\n```bash\nquickwit source delete --index my-index --source my-source\n```\n\nWhen deleting a source, the checkpoint associated with the source is also removed.\n"
  },
  {
    "path": "docs/configuration/storage-config.md",
    "content": "---\ntitle: Storage configuration\nsidebar_position: 2\n---\n\n## Supported Storage Providers\n\nQuickwit currently supports four types of storage providers:\n- Amazon S3 and S3-compatible (Garage, MinIO, ...)\n- Azure Blob Storage\n- Local file storage\n- Google Cloud Storage (native API)\n\n## Storage URIs\n\nStorage URIs refer to different storage providers identified by a URI \"protocol\" or \"scheme\". Quickwit supports the following storage URI protocols:\n- `s3://` for Amazon S3 and S3-compatible\n- `azure://` for Azure Blob Storage\n- `file://` for local file systems\n- `gs://` for Google Cloud Storage\n\nIn general, you can use a storage URI or a file path anywhere you would intuitively expect a file path. For instance:\n- when setting the `index_uri` of an index to specify the storage provider and location;\n- when setting the `metastore_uri` in a node config to set up a file-backed metastore;\n- when passing a file path as a command line argument.\n\n### Local file storage URIs\n\nQuickwit interprets regular file paths as local file system URIs. Relative file paths are allowed and are resolved relative to the current working directory (CWD). `~` can be used as a shortcut to refer to the user’s home directory. The following are valid local file system URIs:\n\n```markdown\n- /var/quickwit\n- file:///var/quickwit\n- /home/quickwit/data\n- ~/data\n- ./quickwit\n```\n\n:::caution\nWhen using the `file://` protocol, a third `/` is necessary to express an absolute path. For instance, the following URI `file://home/quickwit/` is interpreted as `./home/quickwit`.\n:::\n\n## Storage configuration\n\nThis section contains one configuration subsection per storage provider. 
If a storage configuration parameter is not explicitly set, Quickwit relies on the default values provided by the storage provider SDKs ([Azure SDK for Rust](https://github.com/Azure/azure-sdk-for-rust), [AWS SDK for Rust](https://github.com/awslabs/aws-sdk-rust)).\n\n### S3 storage configuration\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `flavor` |  The optional storage flavor to use. Available flavors are `digital_ocean`, `garage`, `gcs`, and `minio`. | |\n| `access_key_id` | The AWS access key ID. | |\n| `secret_access_key` | The AWS secret access key. | |\n| `region` | The AWS region to send requests to. | `us-east-1` (SDK default) |\n| `endpoint` | Custom endpoint for use with S3-compatible providers. | SDK default |\n| `force_path_style_access` | Disables [virtual-hosted–style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html) requests. Required by some S3-compatible providers (Ceph, MinIO). | `false` |\n| `disable_multi_object_delete` | Disables [Multi-Object Delete](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) requests. Required by some S3-compatible providers (GCS). | `false` |\n| `disable_multipart_upload` | Disables [multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html) of objects. Required by some S3-compatible providers (GCS). | `false` |\n\n:::warning\nHardcoding credentials into configuration files is not secure and strongly discouraged. Prefer the alternative authentication methods that your storage backend may provide.\n:::\n\n#### Environment variables\n\n| Env variable | Description |\n| --- | --- |\n| `QW_S3_ENDPOINT` | Custom S3 endpoint. |\n| `QW_S3_MAX_CONCURRENCY` | Limit the number of concurrent requests to S3 |\n\n#### Storage flavors\n\nStorage flavors ensure that Quickwit works correctly with storage providers that deviate from the S3 API by automatically configuring the appropriate settings. 
The available flavors are:\n- `digital_ocean`\n- `garage`\n- `gcs`\n- `minio`\n\n*Digital Ocean*\n\nThe Digital Ocean flavor (`digital_ocean`) forces path-style access and turns off multi-object delete requests.\n\n*Garage flavor*\n\nThe Garage flavor (`garage`) overrides the `region` parameter to `garage` and forces path-style access.\n\n*Google Cloud Storage*\n\nThe Google Cloud Storage flavor (`gcs`) turns off multi-object delete requests and multipart uploads.\n\n*MinIO flavor*\n\nThe MinIO flavor (`minio`) overrides the `region` parameter to `minio` and forces path-style access.\n\nExample of a storage configuration for Google Cloud Storage in YAML format:\n\n```yaml\nstorage:\n  s3:\n    flavor: gcs\n    region: us-east1\n    endpoint: https://storage.googleapis.com\n```\n\n### Azure storage configuration\n\n| Property | Description | Default value |\n| --- | --- | --- |\n| `account` | The Azure storage account name. | |\n| `access_key` | The Azure storage account access key. | |\n\n#### Environment variables\n\n| Env variable | Description |\n| --- | --- |\n| `QW_AZURE_STORAGE_ACCOUNT` | Azure Blob Storage account name. |\n| `QW_AZURE_STORAGE_ACCESS_KEY` | Azure Blob Storage account access key. 
|\n\nExample of a storage configuration for Azure in YAML format:\n\n```yaml\nstorage:\n  azure:\n    account: your-azure-account-name\n    access_key: your-azure-access-key\n```\n\n## Storage configuration examples for various object storage providers\n\n### Garage\n\n[Garage](https://garagehq.deuxfleurs.fr/) is an open-source distributed object storage service tailored for self-hosting.\n\n```yaml\nstorage:\n  s3:\n    flavor: garage\n    endpoint: http://127.0.0.1:3900\n```\n\n### MinIO\n\n[MinIO](https://min.io/) is a high-performance object storage service.\n\n```yaml\nstorage:\n  s3:\n    flavor: minio\n    endpoint: http://127.0.0.1:9000\n```\n\nNote: `default_index_root_uri` and index URIs do not include the endpoint. Set them as typical S3 paths such as `s3://indexes`.\n"
  },
  {
    "path": "docs/configuration/template-config.md",
    "content": "---\ntitle: Index template configuration\nsidebar_position: 7\ntoc_max_heading_level: 4\n---\n\nThis page describes how to configure an index template.\n\nIndex templates let you dynamically create indexes according to predefined rules. Templates are used automatically when documents are received on the ingest API for an index that doesn't exist.\n\nThe index template configuration lets you define the following parameters:\n- `template_id` (required)\n- `description`\n- `index_id_patterns` (required)\n- `index_root_uri`\n- `priority`\n\nBesides, the following parameters can also be configured and are the same as those found in the [index configuration](../configuration/index-config.md):\n- doc mapping (required)\n- indexing settings\n- search settings\n- retention policy\n\nYou can manage templates using the [index template API](../reference/rest-api.md#index-template-api).\n\n## Config file format\n\nThe index configuration format is YAML or JSON. When a key is absent from the configuration file, the default value is used.\nHere is a complete example:\n\n```yaml\nversion: 0.9 # File format version.\n\ntemplate_id: \"hdfs-dev\"\n\nindex_root_uri: \"s3://my-bucket/hdfs-dev/\"\n\ndescription: \"HDFS log management dev\"\n\nindex_id_patterns:\n    - hdfs-dev-*\n    - hdfs-staging-*\n\npriority: 100\n\ndoc_mapping:\n  mode: lenient\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: severity_text\n      type: text\n      tokenizer: raw\n      fast:\n        - tokenizer: lowercase\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: object\n      field_mappings:\n        - name: service\n          type: text\n          tokenizer: raw\n  tag_fields: [\"resource.service\"]\n  timestamp_field: timestamp\n  index_field_presence: 
true\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n\nretention:\n  period: 90 days\n  schedule: daily\n```\n\n## Template ID\n\nThe `template_id` is a string that uniquely identifies the index template within the metastore. It may only contain uppercase or lowercase ASCII letters, digits, hyphens (`-`), and underscores (`_`). It must start with a letter and contain at least 3 characters but no more than 255.\n\n## Description\n\nAn optional string that describes what the index template is used for.\n\n## Index root URI\n\nThe `index_root_uri` defines where the index files (also called splits) should be stored.\nThis parameter expects a [storage URI](storage-config#storage-uris).\n\nThe actual URI of the index is the path concatenation of the `index_root_uri` with the index ID.\n\nIf `index_root_uri` is not defined, the `default_index_root_uri` from [Quickwit's node config](node-config) will be used.\n\n## Index ID patterns\n\n`index_id_patterns` is a list of strings that define which indices should be created according to this template. Use [glob-like](https://en.wikipedia.org/wiki/Glob_(programming)) wildcard ( \\* ) expressions to target indices that match a pattern: test\\* or \\*test or te\\*t or \\*test\\*. You can also use negative patterns by prepending the hyphen `-` character.\n\nEach pattern must obey the following rules:\n- It must follow the regex `^-?[a-zA-Z\\*][a-zA-Z0-9-_\\.\\*]{0,254}$`.\n- It cannot contain consecutive asterisks (`*`).\n- If it does not contain an asterisk (`*`), it must be at least 3 characters long.\n\n## Priority\n\nWhen multiple templates match a new index ID, the template with the highest `priority` is used to configure the index.\n"
  },
  {
    "path": "docs/deployment/_category_.yaml",
    "content": "label: 'Deployment'\nposition: 7\ncollapsed: true\n"
  },
  {
    "path": "docs/deployment/cluster-sizing.md",
    "content": "---\ntitle: Cluster sizing\nsidebar_position: 3\n---\n\nIn this guide, we discuss how to size your Quickwit cluster and nodes. As shown\nin the [architecture section](../overview/architecture.md), a Quickwit cluster\nhas 5 main components: the Indexers, Searchers, Control Plane,\nMetastore and, Janitor. Each component has different resource requirements\nand can be scaled independently. We will also discuss how to size the metastore\nPostgreSQL database.\n\n:::note\n\nThis guide provides general guidelines. The actual resource requirements depend\nstrongly on your workload. We recommend monitoring the resource usage and\nadjusting the cluster size accordingly.\n\n:::\n\n## Quickwit services\n\n### Indexers\n\nHere are some high-level guidelines to size your Indexer nodes:\n- Quickwit can index at around **7.5MB per second per core**\n- For the general use case, configure 4GB of RAM per core\n  - Workloads with a large number of indexes or data sources consume more RAM\n    <!-- TODO: revisit this when cooperative indexing becomes the default -->\n  - Don't use instances with less than 8GB of RAM\n    <!-- Note: 2GB for the heap size (per pipeline) and 2GB for ingest queues -->\n- Mount the data directory to a volume of at least 120GB to store:\n  - the [split cache](../configuration/node-config.md#Indexer-configuration) (default 100GB)\n  - the [ingest queue](../configuration/node-config.md#ingest-api-configuration) (default 4GiB)\n  - a little extra for the indexes that are being built (first generation and merges)\n- Local SSDs are preferred for deploying Indexers since they generally provide the best performance per dollar and save some network bandwidth. However, remote disks can also if they provide roughly 20 MB/s of write throughput per core when using the ingest API or 10 MB/s when relying on other sources. 
For Amazon EBS volumes, this is equivalent to 320 or 160 IOPS per core (assuming 64 KB IOPS).\n\n:::note\n\nTo utilize all CPUs on Indexer nodes that have more than 4 cores, your indexing\nworkload needs to be broken down into multiple indexing pipelines. This can be\nachieved by creating multiple indexes or by using a [partitioned data\nsource](../configuration/source-config.md#number-of-pipelines) such as\n[Kafka](../configuration/source-config.md#kafka-source) or the [ingest API\n(v2)](../ingest-data/ingest-api.md#ingest-api-versions).\n\n:::\n\n\n### Searchers\n\nSearch performance is highly dependent on the workload. For example, term queries\nare usually cheaper than aggregations. A good starting point for dimensioning\nSearcher nodes:\n- Configure 8GB of RAM per core when using a high latency / low bandwidth object\n  store like AWS S3\n- Decrease the RAM / CPU ratio (e.g., 4GB/core) when using a faster object store\n- Provision more RAM if you expect many concurrent aggregation requests. By\n  default, each request can use up to 500MB of RAM on each node.\n- Avoid instances with less than 4GB of RAM\n<!-- 1GB fast_field_cache_capacity + 0.5GB split_footer_cache_capacity + 0.5GB/req aggregation_memory_limit -->\n- Searcher nodes don't use disk unless the [split\n  cache](../configuration/node-config.md#Searcher-split-cache-configuration) is\n  explicitly enabled\n\nOne strength of Quickwit is that its Searchers are stateless, which makes it\neasy to scale them up and down based on the workload. Scale the number of\nSearcher nodes based on:\n- the number of concurrent requests expected\n- aggregations that run on large amounts of data (without\n  [time](../overview/concepts/querying.md#time-sharding) or\n  [tag](../overview/concepts/querying.md#tag-pruning) pruning)\n\n### Other services\n\nThe Control Plane, Metastore, and Janitor are lightweight components.\n\n- **Control Plane**: A cluster must have only one Control Plane. 
It needs a\n  single core and 2GB of RAM. It doesn't require any disk.\n\n- **Metastore**: A cluster must have exactly one Metastore when using the\n  [file-backed metastore](../configuration/metastore-config.md#file-backed-metastore).\n  When using the [PostgreSQL metastore](#postgres-metastore-backend), you can\n  run one or several Metastore pods for high availability (HA). The Metastore\n  requires a single core and 2GB of RAM. For clusters handling hundreds of\n  indexes, you may increase the size to 2 cores and 4GB of RAM. It doesn't\n  write to disk (when using PostgreSQL, the database handles persistence).\n\n- **Janitor**: A cluster must have only one Janitor. In general, it requires 1\n  core and 2GB of RAM and doesn't use the disk. If you use the [delete\n  API](https://quickwit.io/docs/overview/concepts/deletes), the Janitor should\n  be dimensioned like an indexer.\n\n### Single node deployments\n\nFor experimentation and small-scale POCs, it is possible to deploy all the\nservices on a single node (see the\n[tutorial](../get-started/tutorials/tutorial-hdfs-logs.md)). We recommend at\nleast 2 cores and 8GB of RAM.\n\n## Postgres Metastore backend\n\nFor most use cases, a PostgreSQL instance with 4GB of RAM and 1 core is\nsufficient:\n- with the AWS RDS managed service, use the t4g.medium instance type. Enable\n  multi-AZ with one standby for high availability.\n"
  },
  {
    "path": "docs/deployment/deployment-modes.md",
    "content": "---\ntitle: Deployment modes\nsidebar_position: 1\n---\n\nAs an application, Quickwit is built out of multiple services and is designed to run as a horizontally-scalable distributed system. Currently, Quickwit supports four core services (indexer, searcher, metastore, control plane) and one maintenance service (janitor):\n\n- Indexers ingest documents from data sources and build indexes.\n- Searchers execute search queries submitted via the REST API.\n- The Metastore stores index metadata in a PostgreSQL-compatible database or cloud-hosted file.\n- The Control Plane distributes and coordinates indexing workloads on indexers.\n- The Janitor performs periodic maintenance tasks.\n\nQuickwit is distributed as a single binary or Docker image. The behavior of that executable file or image is controlled with the `--service` option of the `quickwit run` command and defines which services run on a node. You may start one service, multiple, or all of them. Nodes always serve the REST API and the search and admin UI. In addition, they will redirect requests that they cannot satisfy to the appropriate nodes in the cluster. Finally, each service can run on one or several nodes depending on the expected load on the system.\n\n## Standalone mode (single node)\n\nThis deployment mode is the simplest way to get started with Quickwit. Launch all the services with the `quickwit run` [command](../reference/cli.md), and you are now ready to ingest data and search your indexes.\n\n## Cluster mode (multi-node)\n\nYou can deploy Quickwit on multiple nodes. We provide a [Helm chart](./kubernetes/helm.md) to help you deploy Quickwit on Kubernetes. In cluster mode, you must store your index data on a shared storage backend such as Amazon S3 or MinIO.\n\n## One indexer, multiple searchers\n\nOne indexer running on a small instance (4 vCPUs) can ingest documents at a throughput of 20-40MB/s (1-3+ TB/day). A deployment with one indexer is thus an excellent place to start. 
However, you may need several searchers to handle large datasets or serve many resource-intensive requests such as aggregation queries.\n\n## Multiple indexers, multiple searchers\n\nIndexing a single [data source](../configuration/source-config.md) on several indexers is only possible with a [Kafka source](../configuration/source-config.md#kafka-source).\nSupport for native distributed indexing was added with Quickwit 0.9.\n\n## File-backed metastore limitations\n\nThe file-backed metastore is a good fit for standalone and small deployments. However, it does not support multiple instances running at the same time. As long as you can guarantee that no more than one metastore is running at any given time, the file-backed metastore is safe to use. For heavy workloads, we recommend using a PostgreSQL metastore.\n"
  },
  {
    "path": "docs/deployment/kubernetes/_category_.yaml",
    "content": "label: 'Kubernetes'\nposition: 2\ncollapsed: true\n"
  },
  {
    "path": "docs/deployment/kubernetes/gke.md",
    "content": "---\ntitle: Install Quickwit on Google GKE\nsidebar_label: Google GKE\nsidebar_position: 2\n---\n\nThis guide will help you set up a Quickwit cluster with the correct GCS permissions.\n\n\n## Set up\n\nBefore installing Quickwit with Helm, let's create a namespace for our playground.\n\n```\nexport NS=quickwit-tutorial\nkubectl create ns ${NS}\n```\n\nQuickwit stores its index on an object storage. We will use GCS, which is natively supported since the 0.7 version (for versions < 0.7, you should use an S3 interoperability key).\n\nThe following steps create a GCP and a GKE service account and bind them together.\nWe are going to create them, set the right permissions and bind them.\n\n```bash\nexport PROJECT_ID={your-project-id}\nexport GCP_SERVICE_ACCOUNT=quickwit-tutorial\nexport GKE_SERVICE_ACCOUNT=quickwit-sa\nexport BUCKET=your-bucket\n\nkubectl create serviceaccount ${GKE_SERVICE_ACCOUNT} -n ${NS}\n\ngcloud iam service-accounts create ${GCP_SERVICE_ACCOUNT} --project=${PROJECT_ID}\n\ngcloud storage buckets add-iam-policy-binding gs://${BUCKET} \\\n--member \"serviceAccount:${GCP_SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com\" \\\n--role \"roles/storage.objectAdmin\"\n\n# Notice that the member is related to a namespace.\ngcloud iam service-accounts add-iam-policy-binding ${GCP_SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \\\n--role roles/iam.workloadIdentityUser \\\n--member \"serviceAccount:${PROJECT_ID}.svc.id.goog[${NS}/${GKE_SERVICE_ACCOUNT}]\"\n\n# Now we can annotate our service account!\nkubectl annotate serviceaccount ${GKE_SERVICE_ACCOUNT} \\\niam.gke.io/gcp-service-account=${GCP_SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \\\n-n ${NS}\n```\n\n## Install Quickwit using Helm\n\nWe are now ready to install Quickwit on GKE. 
If you'd like to know more about Helm, consult our [comprehensive guide](./helm.md) for installing Quickwit on Kubernetes.\n\n```bash\nhelm repo add quickwit https://helm.quickwit.io\nhelm repo update quickwit\n```\n\nLet's create the Quickwit `values.yaml` file:\n\n```yaml\n# We use the edge version here as we recently fixed\n# a bug that prevented the metastore from running on GCS.\nimage:\n    repository: quickwit/quickwit\n    pullPolicy: Always\n    tag: edge\n\nserviceAccount:\n  create: false\n  name: quickwit-sa\n\nconfig:\n  default_index_root_uri: gs://{BUCKET}/qw-indexes\n  metastore_uri: gs://{BUCKET}/qw-indexes\n\n```\n\nWe're ready to deploy:\n\n```bash\nhelm install <deployment name> quickwit/quickwit -f values.yaml\n```\n\n## Check that Quickwit is running\n\nIt should take a few seconds for the cluster to start. During the startup process, individual pods might restart themselves several times.\n\nTo access the UI, you can run the following command and then open your browser at [http://localhost:7280](http://localhost:7280):\n\n```\nkubectl port-forward svc/release-name-quickwit-searcher 7280:7280\n```\n\n\n## Uninstall the deployment\n\nRun the following Helm command to uninstall the deployment:\n\n```bash\nhelm uninstall <deployment name>\n```\n\nDon't forget to clean up your bucket: Quickwit should have stored 3 files in `gs://{BUCKET}/qw-indexes`.\n"
  },
  {
    "path": "docs/deployment/kubernetes/glasskube.md",
    "content": "---\ntitle: Install Quickwit with Glasskube\nsidebar_label: Glasskube\nsidebar_position: 3\n---\n\n[Glasskube](https://glasskube.dev) is a package manager for Kubernetes that empowers you to effortlessly install, upgrade, configure, and manage your Kubernetes cluster packages, all while streamlining repetitive and cumbersome maintenance tasks.\n\n## Requirements\n\nTo deploy Quickwit on Kubernetes, you will need:\n\n- kubectl, compatible with your cluster (+/- 1 minor release from your cluster) (`kubectl version`)\n- A Kubernetes cluster\n\n1. Install `kubectl` and `glasskube` cli.\n\nTo install `kubectl` locally, you can refer to [this documentation](https://kubernetes.io/docs/tasks/tools/#install-kubectl).\n\nTo install `glasskube` cli locally, you can refer to [this documentation](https://glasskube.dev/docs/getting-started/install) and choose the right installation options according to your operating system.\n\nFor example, let's assume that you're on MacOS using homebrew and kind, this is what you'll have to do:\n\n```shell\nbrew install glasskube/tap/glasskube # install the glasskube cli\nkind create cluster # create a kind Kubernetes cluster\n```\n\n2. Install glasskube in your Kubernetes cluster:\n\n```shell\nglasskube bootstrap\n```\n\n3. Start and access to the Glasskube's GUI:\n\n```shell\nglasskube serve\n```\n\nYou'll be able to access to the GUI of Glasskube here: http://localhost:8580\n\n## Install Quickwit using Glasskube\n\n`glasskube` will install Quickwit in the `quickwit` namespace. 
You can perform the Quickwit installation directly with the GUI:\n\n![screenshot-glasskube-ui.png](../../assets/images/screenshot-glasskube-ui.png)\n\nOr use the CLI instead:\n\n```shell\nglasskube install quickwit\n```\n\nIn both cases, you'll have to set the values of the following parameters:\n\n* `defaultIndexRootUri`: the default index URI is an S3-compliant bucket which usually looks like this: `s3://<bucket-name>/<optional-base-path>`\n* `metastoreUri`: if you're using object storage rather than PostgreSQL for the metastore, you can pick the same bucket and value you used for the `defaultIndexRootUri` parameter\n* `s3Endpoint`: the http(s) URL of your object storage service, which should look like `https://s3.{region}.{your object storage domain}`\n* `s3Flavor`: one of the following: `do`, `garage`, `gcp`, `minio`. You can leave it empty if your object storage is compliant with AWS S3\n* `s3Region`\n* `s3AccessKeyId`\n* `s3SecretAccessKey`\n\n## Uninstall Quickwit\n\n```shell\nglasskube uninstall quickwit\n```\n"
  },
  {
    "path": "docs/deployment/kubernetes/helm.md",
    "content": "---\ntitle: Install Quickwit with Helm\nsidebar_label: Helm\nsidebar_position: 1\n---\n\n[Helm](https://helm.sh) is a package manager for Kubernetes that allows you to configure, install, and upgrade containerized applications in a Kubernetes cluster in a version-controlled and reproducible way.\n\nYou can install Quickwit on Kubernetes with the official Quickwit Helm chart. If you encounter any problem with the chart, please, open an issue in our [GitHub repository](https://github.com/quickwit-oss/helm-charts).\n\n## Requirements\n\nTo deploy Quickwit on Kubernetes, you will need:\n\n- kubectl, compatible with your cluster (+/- 1 minor release from your cluster) (`kubectl version`)\n- Helm v3 (`helm version`)\n- A Kubernetes cluster\n\n1. Install `kubectl` and `helm`\n\nTo install `kubectl` and `helm` locally, follow the [Kubernetes](https://kubernetes.io/docs/tasks/tools/#install-kubectl) and [Helm](https://helm.sh/docs/intro/install/) documentation pages.\n\n2. Add the Quickwit Helm chart repository to Helm\n\n```bash\nhelm repo add quickwit https://helm.quickwit.io\n```\n\n3. Update the repository\n\n```bash\nhelm repo update quickwit\n```\n\n4. Create and customize your configuration file `values.yaml`\n\nYou can inspect the default configuration values of the chart using the following command:\n\n```bash\nhelm show values quickwit/quickwit\n```\n\nHere is an example of a minimal configuration with a file-backed metastore:\n\n```yaml\nenvironment:\n  QW_METASTORE_URI: s3://<my-bucket>/quickwit-indexes\n\nconfig:\n  default_index_root_uri: s3://<my-bucket>/quickwit-indexes\n  storage:\n    s3:\n      region: eu-east-1\n      # We recommend using IAM roles and permissions to access Amazon S3 resources,\n      # but you can specify a pair of access and secret keys if necessary.\n      access_key_id: <my access key>\n      secret_access_key: <my secret key>\n```\n\n5. 
Deploy Quickwit\n\n```bash\nhelm install <deployment name> quickwit/quickwit -f values.yaml\n```\n\n6. Check that Quickwit is running\n\nIt might take some time for the cluster to start. During the startup process, individual pods might restart themselves several times. The command in the previous step prints instructions on how to connect to the cluster. This endpoint can be used to access the Quickwit search UI and to execute standard API commands.\n\n## Using PostgreSQL as a metadata store\n\nThe file-backed metastore is mainly useful for testing purposes. Though a file-backed metastore might be easier to set up, we strongly encourage you to use a PostgreSQL metastore in production. For the Quickwit installation to work with a PostgreSQL metastore, you need to provide the PostgreSQL connection information instead of a metastore URI:\n\n```yaml\nconfig:\n  default_index_root_uri: s3://<my-bucket>/quickwit-indexes\n\n  postgres:\n    host: <postgres_host>\n    port: 5432\n    database: quickwit-metastore\n    username: quickwit\n    password: <my strong password> # This password will be stored as a Kubernetes Secret\n\n  storage:\n    s3:\n      region: eu-east-1\n      # We recommend using IAM roles and permissions to access Amazon S3 resources,\n      # but you can specify a pair of access and secret keys if necessary.\n      access_key_id: <my access key>\n      secret_access_key: <my secret key>\n```\n\n## Uninstall the deployment\n\nRun the following Helm command to uninstall the deployment:\n\n```bash\nhelm uninstall <deployment name>\n```\n"
  },
  {
    "path": "docs/distributed-tracing/_category_.yaml",
    "content": "label: 'Distributed tracing'\nposition: 6\ncollapsed: true\n"
  },
  {
    "path": "docs/distributed-tracing/otel-service.md",
    "content": "---\ntitle: OTEL service\nsidebar_position: 5\n---\n\nQuickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive spans from an OpenTelemetry collector, or from your application directly, via an exporter. This endpoint is enabled by default.\n\nWhen enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed in the `otel-trace-v0_7` index by default, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model).\n\nIf for any reason, you want to disable this endpoint, you can:\n- Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit.\n- Or [configure the node config](/docs/configuration/node-config.md) by setting the indexer setting `enable_otlp_endpoint` to `false`.\n\n```yaml title=node-config.yaml\n# ... Indexer configuration ...\nindexer:\n    enable_otlp_endpoint: false\n```\n\n## Sending spans in your own index\n\nYou can send spans in the index of your choice by setting the header `qw-otel-traces-index` of your gRPC request to the targeted index ID.\n\n\n## Trace and span data model\n\nA trace is a collection of spans that represents a single request. A span represents a single operation within a trace. 
OpenTelemetry collectors send spans, Quickwit then indexes them in the `otel-trace-v0_7` index by default that maps OpenTelemetry span model to an indexed document in Quickwit.\n\nThe span model is derived from the [OpenTelemetry specification](https://opentelemetry.io/docs/reference/specification/trace/api/).\n\nBelow is the doc mapping of the `otel-trace-v0_7` index:\n\n```yaml\n\nversion: 0.7\n\nindex_id: otel-trace-v0_7\n\ndoc_mapping:\n  mode: strict\n  field_mappings:\n    - name: trace_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n      fast: true\n    - name: trace_state\n      type: text\n      indexed: false\n    - name: service_name\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: resource_attributes\n      type: json\n      tokenizer: raw\n    - name: resource_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: scope_name\n      type: text\n      indexed: false\n    - name: scope_version\n      type: text\n      indexed: false\n    - name: scope_attributes\n      type: json\n      indexed: false\n    - name: scope_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: span_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n    - name: span_kind\n      type: u64\n    - name: span_name\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: span_fingerprint\n      type: text\n      tokenizer: raw\n    - name: span_start_timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n      indexed: false\n      fast: true\n      fast_precision: milliseconds\n    - name: span_end_timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n      indexed: false\n      fast: false\n    - name: span_duration_millis\n      type: u64\n      indexed: false\n      fast: true\n    - name: span_attributes\n      type: 
json\n      tokenizer: raw\n      fast: true\n    - name: span_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: span_dropped_events_count\n      type: u64\n      indexed: false\n    - name: span_dropped_links_count\n      type: u64\n      indexed: false\n    - name: span_status\n      type: json\n      indexed: true\n    - name: parent_span_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n      indexed: false\n    - name: events\n      type: array<json>\n      tokenizer: raw\n      fast: true\n    - name: event_names\n      type: array<text>\n      tokenizer: default\n      record: position\n      stored: false\n    - name: links\n      type: array<json>\n      tokenizer: raw\n\n  timestamp_field: span_start_timestamp_nanos\n\nindexing_settings:\n  commit_timeout_secs: 10\n\nsearch_settings:\n  default_search_fields: []\n```\n\n## Known limitations\n\nThere are a few limitations on the current distributed tracing setup in Quickwit 0.9:\n- The OTLP gRPC service does not provide High-Durability. This will be fixed in 0.10.\n- OTLP HTTP is only available with the Binary Protobuf Encoding. OTLP HTTP with JSON encoding is not planned yet, but this can be easily fixed in the next version. Please open an issue if you need this feature.\n\nIf you are interested in new features or discovered other limitations, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit).\n"
  },
  {
    "path": "docs/distributed-tracing/overview.md",
    "content": "---\ntitle: Distributed Tracing with Quickwit\nsidebar_label: Overview\nsidebar_position: 1\n---\n\nDistributed Tracing is a process that tracks your application requests flowing through your different services: frontend, backend, databases and more. It's a powerful tool to understand how your application works and to debug performance issues.\n\nQuickwit is a cloud-native engine to index and search unstructured data which makes it a perfect fit for a traces backend.\n\nMoreover, Quickwit supports natively the [OpenTelemetry gRPC and HTTP (protobuf only) protocol](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and the [Jaeger gRPC API (SpanReader only)](https://www.jaegertracing.io/). **This means that you can use Quickwit to store your traces and to query them with Jaeger UI**.\n\n![Quickwit Distributed Tracing](../assets/images/distributed-tracing-overview-light.png#gh-light-mode-only)![Quickwit Distributed Tracing](../assets/images/distributed-tracing-overview-dark.png#gh-dark-mode-only)\n\n## Plug Quickwit to Jaeger\n\nQuickwit implements a gRPC service compatible with Jaeger UI. All you need is to configure Jaeger with a (span) storage type `grpc`[^1] and you will be able to visualize your traces in Jaeger that are stored in any Quickwit's indexes matching the pattern `otel-traces-v0_*`.\n\nWe made a tutorial on [how to plug Quickwit to Jaeger UI](plug-quickwit-to-jaeger.md) that will guide you through the process.\n\n[^1]: It was `grpc-plugin` until the version 1.58 of Jaeger.\n\n## Send traces to Quickwit\n\n- [Using OTEL collector](send-traces/using-otel-collector.md)\n- [Using python OTEL SDK](send-traces/using-otel-sdk-python.md)\n\n"
  },
  {
    "path": "docs/distributed-tracing/plug-quickwit-to-jaeger.md",
    "content": "---\ntitle: Plug Quickwit to Jaeger\ndescription: A simple tutorial to use Jaeger with Quickwit backend.\nicon_url: /img/tutorials/quickwit-logo.png\ntags: [traces, ingestion]\nsidebar_position: 2\n---\n\nIn this tutorial, we will show you how Quickwit can eat its own dog food: we will send Quickwit traces into Jaeger and analyze them, which will generate new traces to analyze :)\n\n## Start Quickwit\n\nFirst, start a [Quickwit instance](../get-started/installation.md) with the OTLP service enabled:\n\n```bash\nQW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER=true \\\nOTEL_EXPORTER_OTLP_ENDPOINT=http://127.0.0.1:7281 \\\n./quickwit run\n```\n\nWe also set `QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER` and `OTEL_EXPORTER_OTLP_ENDPOINT` environment variables so that Quickwit will send its own traces to itself.\n\n## Start Jaeger UI\n\nLet's start a Jaeger UI instance with docker. Here we need to inform jaeger that it should use quickwit as its backend.\n\nDue to some idiosyncrasy associated with networking with containers, we will have to use a different approach on MacOS & Windows on one side, and Linux on the other side.\n\n### MacOS & Windows\n\nWe can rely on `host.docker.internal` to get the docker bridge ip address, pointing to our quickwit server.\n\n```bash\ndocker run --rm --name jaeger-qw \\\n    -e SPAN_STORAGE_TYPE=grpc \\\n    -e GRPC_STORAGE_SERVER=host.docker.internal:7281 \\\n    -p 16686:16686 \\\n    jaegertracing/jaeger-query:1.60\n```\n\n### Linux\n\nBy default, Quickwit is listening to `127.0.0.1`, and will not respond to request directed\nto the docker bridge (`172.17.0.1`). 
There are different ways to solve this problem.\nThe easiest is probably to use host network mode.\n\n```bash\ndocker run --rm --name jaeger-qw --network=host \\\n    -e SPAN_STORAGE_TYPE=grpc \\\n    -e GRPC_STORAGE_SERVER=127.0.0.1:7281 \\\n    -p 16686:16686 \\\n    jaegertracing/jaeger-query:1.60\n```\n\n## Search traces in Jaeger UI\n\nAs Quickwit is indexing its own traces, you should be able to see them in Jaeger UI after 5 seconds (the time it takes for Quickwit to do its first commit).\n\nOpen the Jaeger UI at [http://localhost:16686](http://localhost:16686) and search for traces! By executing search queries, you will then see Quickwit's own traces:\n\n- `find_traces` is the endpoint called when you search for traces in Jaeger UI. It then calls `find_trace_ids`.\n- `find_trace_ids` runs an aggregation query on spans to get unique trace IDs.\n- `root_search` is Quickwit's search entry point. It calls search on each split (piece of index) in parallel, in a distributed manner, or just locally if there is only one node.\n- `leaf_search` is the search entry point on each node. It calls `leaf_search_single_split` on each split.\n- `leaf_search_single_split` is the search entry point on a split. It calls `warmup` and then `tantivy_search`.\n- `warmup` is the warmup phase of the search. It prefetches data needed to execute the search query.\n- `tantivy_search` is the search phase of the search. It executes the search query at horse speeds with [Tantivy](https://github.com/quickwit-oss/tantivy).\n\n![Quickwit trace in Jaeger UI](../assets/images/jaeger-ui-quickwit-trace-analysis.png)\n\n## Next steps\n\nYou are now ready for the next step: instrumenting your application and sending its traces to Quickwit. You can do it:\n- In [Python](send-traces/using-otel-sdk-python.md).\n- And in any other language that OpenTelemetry supports.\n"
  },
  {
    "path": "docs/distributed-tracing/send-traces/_category_.yaml",
    "content": "label: 'Sending traces'\nposition: 3\ncollapsed: false\n"
  },
  {
    "path": "docs/distributed-tracing/send-traces/using-otel-collector.md",
    "content": "---\ntitle: Using OTEL Collector\ndescription: Using OTEL Collector\ntags: [otel, collector, traces]\nsidebar_position: 1\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIf you already have your own OpenTelemetry Collector and want to export your traces to Quickwit, you need to add an OTLP gRPC exporter to your collector configuration:\n\n<Tabs>\n\n<TabItem value=\"macOS_windows\" label=\"macOS/Windows\">\n\n```yaml title=\"otel-collector-config.yaml\"\nreceivers:\n  otlp:\n    protocols:\n      grpc:\n      http:\n\nprocessors:\n  batch:\n\nexporters:\n  otlp/quickwit:\n    endpoint: host.docker.internal:7281\n    tls:\n      insecure: true\n    # By default, traces are sent to the otel-traces-v0_7 index.\n    # You can customize the index ID by setting this header.\n    # headers:\n    #   qw-otel-traces-index: otel-traces-v0_7\n\nservice:\n  pipelines:\n    traces:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [otlp/quickwit]\n```\n\n</TabItem>\n\n<TabItem value=\"linux\" label=\"Linux\">\n\n```yaml title=\"otel-collector-config.yaml\"\nreceivers:\n  otlp:\n    protocols:\n      grpc:\n      http:\n\nprocessors:\n  batch:\n\nexporters:\n  otlp/quickwit:\n    endpoint: 127.0.0.1:7281\n    tls:\n      insecure: true\n\nservice:\n  pipelines:\n    traces:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [otlp/quickwit]\n```\n\n</TabItem>\n\n</Tabs>\n\n\n## Test your OTEL configuration\n\n1. [Install](../../get-started/installation.md) and start a Quickwit server:\n\n```bash\n./quickwit run\n```\n\n2. 
Start a collector with the previous config:\n\n<Tabs>\n\n<TabItem value=\"macOS_windows\" label=\"macOS/Windows\">\n\n```bash\ndocker run -v ${PWD}/otel-collector-config.yaml:/etc/otelcol/config.yaml -p 4317:4317 -p 4318:4318 -p 7281:7281 otel/opentelemetry-collector\n```\n\n</TabItem>\n\n<TabItem value=\"linux\" label=\"Linux\">\n\n```bash\ndocker run -v ${PWD}/otel-collector-config.yaml:/etc/otelcol/config.yaml --network=host -p 4317:4317 -p 4318:4318 -p 7281:7281 otel/opentelemetry-collector\n```\n\n</TabItem>\n\n</Tabs>\n\n3. Send a trace to your collector with cURL:\n\n```bash\ncurl -XPOST \"http://localhost:4318/v1/traces\" -H \"Content-Type: application/json\" \\\n--data-binary @- << EOF\n{\n \"resource_spans\": [\n   {\n     \"resource\": {\n       \"attributes\": [\n         {\n           \"key\": \"service.name\",\n           \"value\": {\n             \"string_value\": \"test-with-curl\"\n           }\n         }\n       ]\n     },\n     \"scope_spans\": [\n       {\n         \"scope\": {\n           \"name\": \"manual-test\"\n         },\n         \"spans\": [\n           {\n             \"time_unix_nano\": \"1678974011000000000\",\n             \"observed_time_unix_nano\": \"1678974011000000000\",\n             \"start_time_unix_nano\": \"1678974011000000000\",\n             \"end_time_unix_nano\": \"1678974021000000000\",\n             \"trace_id\": \"3c191d03fa8be0653c191d03fa8be065\",\n             \"span_id\": \"3c191d03fa8be065\",\n             \"kind\": 2,\n             \"events\": [],\n             \"status\": {\n               \"code\": 1\n             }\n           }\n         ]\n       }\n     ]\n   }\n ]\n}\nEOF\n```\n\nYou should see a log on the Quickwit server similar to the following:\n\n```bash\n2023-03-16T13:44:09.369Z  INFO quickwit_indexing::actors::indexer: new-split split_id=\"01GVNAKT5TQW0T2QGA245XCMTJ\" partition_id=6444214793425557444\n```\n\nThis means that Quickwit has received the trace and created a new split. 
Wait for the split to be published before searching for traces.\n\n## Next step\n\nFollow our tutorial on [how to send traces from your python app](using-otel-sdk-python.md).\n"
  },
  {
    "path": "docs/distributed-tracing/send-traces/using-otel-sdk-python.md",
    "content": "---\ntitle: Using OTEL SDK - Python\ndescription: A simple tutorial to send traces to Quickwit from a Python Flask app.\nicon_url: /img/tutorials/python-logo.png\ntags: [python, traces, ingestion]\nsidebar_position: 2\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIn this tutorial, we will show you how to instrument a Python [Flask](https://flask.palletsprojects.com/en/2.2.x/) app with OpenTelemetry and send traces to Quickwit. This tutorial was inspired by the [Python OpenTelemetry](https://opentelemetry.io/docs/instrumentation/python/getting-started/) documentation, huge thanks to the OpenTelemetry team!\n\n## Prerequisites\n\n- Python3 installed\n- Docker installed\n\n## Start a Quickwit instance\n\n[Install Quickwit](/docs/get-started/installation.md) and start a Quickwit instance:\n\n```bash\n./quickwit run\n```\n\n## Start Jaeger UI\n\nLet's start a Jaeger UI instance with docker. Here we need to inform jaeger that it should use quickwit as its backend.\n\nDue to some idiosyncrasy associated with networking with containers, we will have to use a different approach on MacOS & Windows on one side, and Linux on the other side.\n\n### MacOS & Windows\n\nWe can rely on `host.docker.internal` to get the docker bridge ip address, pointing to our quickwit server.\n\n```bash\ndocker run --rm --name jaeger-qw \\\n    -e SPAN_STORAGE_TYPE=grpc \\\n    -e GRPC_STORAGE_SERVER=host.docker.internal:7281 \\\n    -p 16686:16686 \\\n    jaegertracing/jaeger-query:1.60\n```\n\n### Linux\n\nBy default, quickwit is listening to `127.0.0.1`, and will not respond to request directed\nto the docker bridge (`172.17.0.1`). 
There are different ways to solve this problem.\nThe easiest is probably to use host network mode.\n\n```bash\ndocker run --rm --name jaeger-qw --network=host \\\n    -e SPAN_STORAGE_TYPE=grpc \\\n    -e GRPC_STORAGE_SERVER=127.0.0.1:7281 \\\n    -p 16686:16686 \\\n    jaegertracing/jaeger-query:1.60\n```\n\n## Run a simple Flask app\n\nWe will start a flask application that is doing three things on each HTTP call `http://localhost:5000/process-ip`:\n\n- Fetching an IP address from [https://httpbin.org/ip](https://httpbin.org/ip).\n- Parsing it and fake processing it with a random sleep.\n- Displaying it with a random sleep.\n\n\nLet's first install the dependencies:\n\n```bash\npip install flask\npip install opentelemetry-distro\npip install opentelemetry-exporter-otlp\n```\n\nThe opentelemetry-distro package installs the API, SDK, and the opentelemetry-bootstrap and opentelemetry-instrument tools that you’ll use.\n\nHere is the code of our app:\n\n```python title=my_app.py\nimport random\nimport time\nimport requests\n\nfrom flask import Flask\n\napp = Flask(__name__)\n\n@app.route(\"/process-ip\")\ndef process_ip():\n    body = fetch()\n    ip = parse(body)\n    display(ip)\n    return ip\n\ndef fetch():\n    resp = requests.get('https://httpbin.org/ip')\n    body = resp.json()\n    return body\n\ndef parse(body):\n    # Sleep for a random amount of time to make the span more visible.\n    secs = random.randint(1, 100) / 1000\n    time.sleep(secs)\n\n    return body[\"origin\"]\n\ndef display(ip):\n    # Sleep for a random amount of time to make the span more visible.\n    secs = random.randint(1, 100) / 1000\n    time.sleep(secs)\n\n    message = f\"Your IP address is `{ip}`.\"\n    print(message)\n\nif __name__ == \"__main__\":\n    app.run(port=5000)\n```\n\n## Auto-instrumentation\n\nOpenTelemetry provides a tool called `opentelemetry-bootstrap` that automatically instruments your Python application.\n\n```bash\nopentelemetry-bootstrap -a install\n```\n\nAnd 
that's it, we are now ready to run the app:\n\n```bash\n# We don't need metrics.\nOTEL_METRICS_EXPORTER=none \\\nOTEL_TRACES_EXPORTER=console \\\nOTEL_SERVICE_NAME=my_app \\\nopentelemetry-instrument python my_app.py\n```\n\nBy hitting [http://localhost:5000/process-ip](http://localhost:5000/process-ip) you should see the corresponding trace in the console.\n\nThis is nice, but it would be even better if we could see the time spent in each step, the status code of the HTTP request, and the content type of the response. Let's do that by manually instrumenting our app!\n\n## Manual instrumentation\n\n```python title=my_instrumented_app.py\nimport random\nimport time\nimport requests\n\nfrom flask import Flask\n\nfrom opentelemetry import trace\n\n# Creates a tracer from the global tracer provider\ntracer = trace.get_tracer(__name__)\n\napp = Flask(__name__)\n\n@app.route(\"/process-ip\")\n@tracer.start_as_current_span(\"process_ip\")\ndef process_ip():\n    body = fetch()\n    ip = parse(body)\n    display(ip)\n    return ip\n\n@tracer.start_as_current_span(\"fetch\")\ndef fetch():\n    resp = requests.get('https://httpbin.org/ip')\n    body = resp.json()\n\n    headers = resp.headers\n    current_span = trace.get_current_span()\n    current_span.set_attribute(\"status_code\", resp.status_code)\n    current_span.set_attribute(\"content_type\", headers[\"Content-Type\"])\n    current_span.set_attribute(\"content_length\", headers[\"Content-Length\"])\n\n    return body\n\n@tracer.start_as_current_span(\"parse\")\ndef parse(body):\n    # Sleep for a random amount of time to make the span more visible.\n    secs = random.randint(1, 100) / 1000\n    time.sleep(secs)\n\n    return body[\"origin\"]\n\n@tracer.start_as_current_span(\"display\")\ndef display(ip):\n    # Sleep for a random amount of time to make the span more visible.\n    secs = random.randint(1, 100) / 1000\n    time.sleep(secs)\n\n    message = f\"Your IP address is `{ip}`.\"\n    print(message)\n\n    current_span = 
trace.get_current_span()\n    current_span.add_event(message)\n\nif __name__ == \"__main__\":\n    app.run(port=5000)\n```\n\nWe can now start the new instrumented app:\n\n```bash\nOTEL_METRICS_EXPORTER=none \\\nOTEL_TRACES_EXPORTER=console \\\nOTEL_SERVICE_NAME=my_app \\\nopentelemetry-instrument python my_instrumented_app.py\n```\n\nIf you hit [http://localhost:5000/process-ip](http://localhost:5000/process-ip) again, you should see new spans named `fetch`, `parse`, and `display`, with the corresponding custom attributes!\n\n\n## Sending traces to Quickwit\n\nTo send traces to Quickwit, we need to use the OTLP exporter. It is as simple as this:\n\n```bash\n# We don't need metrics.\nOTEL_METRICS_EXPORTER=none \\\nOTEL_SERVICE_NAME=my_app \\\nOTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:7281 \\\nopentelemetry-instrument python my_instrumented_app.py\n```\n\nNow, if you hit [http://localhost:5000/process-ip](http://localhost:5000/process-ip), traces will be sent to Quickwit. You just need to wait around 30 seconds before they are indexed. 
It's time for a coffee break!\n\n30 seconds have passed, let's query the traces from our service:\n\n```bash\ncurl -XPOST http://localhost:7280/api/v1/otel-traces-v0_7/search -H 'Content-Type: application/json' -d '{\n    \"query\": \"resource_attributes.service.name:my_app\"\n}'\n```\n\nThen open the Jaeger UI at [localhost:16686](http://localhost:16686/) and play with it. You now have a Jaeger UI powered by a Quickwit storage backend!\n\n![Flask trace analysis in Jaeger UI](../../assets/images/jaeger-ui-python-app-trace-analysis.png)\n\n![Flask traces in Jaeger UI](../../assets/images/jaeger-ui-python-app-traces.png)\n\n## Sending traces to your OpenTelemetry collector\n\nStart a collector as described in the [OpenTelemetry collector tutorial](using-otel-collector.md) and execute the following command:\n\n```bash\n# We don't need metrics.\nOTEL_METRICS_EXPORTER=none \\\nOTEL_SERVICE_NAME=my_app \\\nopentelemetry-instrument python my_instrumented_app.py\n```\n\nTraces will be sent to your collector, and then to Quickwit.\n\n\n## Wrap up\n\nIn this tutorial, we have seen how to instrument a Python application with OpenTelemetry and send traces to Quickwit. We have also seen how to use the Jaeger UI to analyze traces.\n\nAll the code snippets are available in our [tutorial repository](https://github.com/quickwit-oss/tutorials).\n\nPlease let us know what you think about this tutorial, and if you have any questions, feel free to reach out to us on [Discord](https://discord.gg/7eNYX4d) or [Twitter](https://twitter.com/quickwit_inc).\n"
  },
  {
    "path": "docs/get-started/_category_.yaml",
    "content": "label: 'Get started'\nposition: 2\ncollapsed: false\n"
  },
  {
    "path": "docs/get-started/installation.md",
    "content": "---\ntitle: Installation\nsidebar_position: 2\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\nimport { useDocsVersion } from '@docusaurus/theme-common/internal';\n\nexport const RenderIf = ({children, condition}) => (\n    <>\n        {condition && children}\n    </>\n);\n\nQuickwit compiles to a single binary and we provide different methods to install it:\n\n- Linux/MacOS binaries that you can [download manually](#download) or with the [install script](#install-script)\n- [Docker image](#use-the-docker-image)\n- [Helm chart](../deployment/kubernetes/helm.md)\n- [Glasskube](../deployment/kubernetes/glasskube.md)\n\n## Prerequisites\n\nQuickwit is officially only supported for Linux. Freebsd and MacOS are not officially supported, but should work as well.\n\nQuickwit supplies binaries for x86-64 and aarch64. No special instruction set is required, but on x86-64 SSE3 is recommended.\nSupport of aarch64 is currently experimental.\n\n## Download\n\n<RenderIf condition={useDocsVersion().version == 'current'}>\n\nVersion: nightly - \nLicense: [Apache 2.0](https://github.com/quickwit-oss/quickwit/blob/main/LICENSE) -\nDownloads `.tar.gz`:\n- [Linux ARM64](https://github.com/quickwit-oss/quickwit/releases/download/nightly/quickwit-nightly-aarch64-unknown-linux-gnu.tar.gz)\n- [Linux x86_64](https://github.com/quickwit-oss/quickwit/releases/download/nightly/quickwit-nightly-x86_64-unknown-linux-gnu.tar.gz)\n- [macOS aarch64](https://github.com/quickwit-oss/quickwit/releases/download/nightly/quickwit-nightly-aarch64-apple-darwin.tar.gz)\n- [macOS x86_64](https://github.com/quickwit-oss/quickwit/releases/download/nightly/quickwit-nightly-x86_64-apple-darwin.tar.gz)\n\n</RenderIf>\n\n<!-- Bellow is the set of links to edit when a new version is released -->\n<RenderIf condition={useDocsVersion().version != 'current'}>\n\nversion: 0.8.1 - [Release notes](https://github.com/quickwit-oss/quickwit/releases/tag/v0.8.1) - 
[Changelog](https://github.com/quickwit-oss/quickwit/blob/main/CHANGELOG.md)\nLicense: [Apache 2.0](https://github.com/quickwit-oss/quickwit/blob/main/LICENSE)\nDownloads `.tar.gz`:\n- [Linux ARM64](https://github.com/quickwit-oss/quickwit/releases/download/v0.8.1/quickwit-v0.8.1-aarch64-unknown-linux-gnu.tar.gz)\n- [Linux x86_64](https://github.com/quickwit-oss/quickwit/releases/download/v0.8.1/quickwit-v0.8.1-x86_64-unknown-linux-gnu.tar.gz)\n- [macOS aarch64](https://github.com/quickwit-oss/quickwit/releases/download/v0.8.1/quickwit-v0.8.1-aarch64-apple-darwin.tar.gz)\n- [macOS x86_64](https://github.com/quickwit-oss/quickwit/releases/download/v0.8.1/quickwit-v0.8.1-x86_64-apple-darwin.tar.gz)\n\n</RenderIf>\n\nCheck out the available builds in greater detail on [GitHub](https://github.com/quickwit-oss/quickwit/releases)\n\n### Note on external dependencies\n\nQuickwit depends on the following external libraries to work correctly:\n- `libssl`: the industry defacto cryptography library.\nThese libraries can be installed on your system using the native package manager.\nYou can install these dependencies using the following command:\n\n<Tabs>\n\n<TabItem value=\"ubuntu\" label=\"Ubuntu\">\n\n```bash\napt-get -y update && apt-get -y install libssl\n```\n\n</TabItem>\n\n<TabItem value=\"aws-linux\" label=\"AWS Linux\">\n\n```bash\nyum -y update && yum -y install openssl\n```\n\n</TabItem>\n\n<TabItem value=\"arch-linux\" label=\"Arch Linux\">\n\n```bash\npacman -S openssl\n```\n\n</TabItem>\n\n</Tabs>\n\nAdditionally it requires a few more dependencies to compile it. 
These dependencies are not required on production system:\n- `clang`: used to compile some dependencies.\n- `protobuf-compiler`: used to compile protobuf definitions.\n- `libssl-dev`: headers for libssl.\n- `pkg-config`: used to locate libssl.\n- `cmake`: used to build librdkafka, for kafka support.\nThese dependencies can be installed on your system using the native package manager.\nYou can install these dependencies using the following command:\n\n<Tabs>\n\n<TabItem value=\"ubuntu\" label=\"Ubuntu\">\n\n```bash\napt install -y clang protobuf-compiler libssl-dev pkg-config cmake\n```\n\n</TabItem>\n\n<TabItem value=\"aws-linux\" label=\"AWS Linux\">\n\n```bash\nyum -y update && yum -y install clang openssl-devel pkgconfig cmake3\n# amazonlinux only has protobuf-compiler 2.5, we need something much more up to date.\nwget https://github.com/protocolbuffers/protobuf/releases/download/v21.9/protoc-21.9-linux-x86_64.zip\nsudo unzip protoc-21.9-linux-x86_64.zip -d /usr/local\n# amazonlinux use cmake2 as cmake, we need cmake3\nln -s /usr/bin/cmake3 /usr/bin/cmake\n```\n\n</TabItem>\n\n<TabItem value=\"arch-linux\" label=\"Arch Linux\">\n\n```bash\npacman -S clang protobuf openssl pkg-config cmake make\n```\n\n</TabItem>\n\n</Tabs>\n\n## Install script\n\nTo easily install Quickwit on your machine, just run the command below from your preferred shell.\nThe script detects the architecture and then downloads the correct binary archive for the machine.\n\n```bash\ncurl -L https://install.quickwit.io | sh\n```\n\nAll this script does is download the correct binary archive for your machine and extracts it in the current working directory. 
This means you can download any desired archive from [GitHub](https://github.com/quickwit-oss/quickwit/releases) that matches your OS and architecture and manually extract it anywhere.\n\nOnce installed or extracted, all of Quickwit's installation files can be found in a directory named `quickwit-{version}` where `version` is the corresponding version of Quickwit. This directory has the following layout:\n\n```bash\nquickwit-{version}\n    ├── config\n    │   └── quickwit.yaml\n    ├── LICENSE\n    ├── quickwit\n    └── qwdata\n```\n\n- `config/quickwit.yaml`: the default configuration file.\n- `LICENSE`: the license file.\n- `quickwit`: the Quickwit executable binary.\n- `qwdata/`: the default data directory.\n\n\n## Use the Docker image\n\nIf you use Docker, this might be one of the quickest ways to get going.\nThe following command will pull the image from [Docker Hub](https://hub.docker.com/r/quickwit/quickwit)\nand start a container ready to execute Quickwit commands.\n\n```bash\ndocker run --rm quickwit/quickwit --version\n\n# If you are using an Apple silicon based macOS system, you might need to specify the platform.\n# You can also safely ignore jemalloc warnings.\ndocker run --rm --platform linux/amd64 quickwit/quickwit --version\n```\n\nTo get the full gist of this, follow the [Quickstart guide](./quickstart.md).\n"
  },
  {
    "path": "docs/get-started/query-language-intro.md",
    "content": "---\ntitle: Introduction to Quickwit's query language\nsidebar_position: 3\n---\n\nQuickwit allows you to search on your indexed documents using a simple query language. Here's a quick overview.\n\n## Clauses\n\nThe main concept of this language is a clause, which represents a simple condition that can be tested against documents.\n\n### Querying fields\n\nA clause operates on fields of your document. It has the following syntax:\n```\nfield:condition\n```\n\nFor example, when searching documents where the field `app_name` contains the token `tantivy`, you would write the following clause:\n```\napp_name:tantivy\n```\n\nIn many cases the field name can be omitted; Quickwit will then use the `default_search_fields` configured for the index.\n\n### Clauses Cheat Sheet\n\nQuickwit supports various types of clauses to express different kinds of conditions. Here's a quick overview of them:\n\n| type | syntax | examples | description| `default_search_fields`|\n|-------------|--------|----------|------------|-----------------------|\n| term | `field:token` | `app_name:tantivy` <br/> `process_id:1234` <br/> `word` | A term clause tests the existence of a value in the field's tokens | yes |\n| term prefix | `field:prefix*` | `app_name:tant*` <br/> `quick*` | A term prefix clause tests the existence of a token starting with the provided value | yes |\n| term set | `field:IN [token token ..]` |`severity:IN [error warn]` | A term set clause tests the existence of any of the provided values in the field's tokens| yes |\n| phrase | `field:\"sequence of tokens\"` | `full_name:\"john doe\"` | A phrase clause tests the existence of the provided sequence of tokens | yes |\n| phrase prefix | `field:\"sequence of tokens\"*` | `title:\"how to m\"*` | A phrase prefix clause tests the existence of a sequence of tokens, the last one used like in a prefix clause | yes |\n| all | `*` | `*` | A match-all clause will match every document | no |\n| exist | `field:*` | `error:*` | An 
exist clause tests the existence of any value for the field; it will match only if the field exists | no |\n| range | `field:bounds` |`duration:[0 TO 1000}` <br/> `last_name:[banner TO miller]` | A range clause tests the existence of a token between the provided bounds | no |\n\n## Queries\n\n### Combining queries\n\nClauses can be combined using the boolean operators `AND` and `OR` to create more complex search expressions.\nAn `AND` query will match only if the conditions on both sides of the operator are met.\n```\ntype:rose AND color:red\n```\n\nAn `OR` query will match if either or both conditions on each side of the operator are met.\n```\nweekday:6 OR weekday:7\n```\n\nIf no operator is provided, `AND` is implicitly assumed.\n\n```\ntype:violet color:blue\n```\n\n### Grouping queries\nYou can build complex expressions by grouping clauses using parentheses.\n```\n(type:rose AND color:red) OR (type:violet AND color:blue)\n```\n\nWhen no parentheses are used, `AND` takes precedence over `OR`, meaning that the following query is equivalent to the one above.\n\n```\ntype:rose AND color:red OR type:violet AND color:blue\n```\n\n### Negating queries\n\nAn expression can be negated either with the operator `NOT` or by prefixing the query with a dash `-`.\n\n`NOT` and `-` take precedence over everything, such that `-a AND b` means `(-a) AND b`, not `-(a AND b)`.\n\n```\nNOT severity:debug\n```\n\nor\n\n```\ntype:proposal -(status:rejected OR status:pending)\n```\n\n\n## Dive deeper\n\nIf you want to know more about the query language, head to the [Query Language Reference](../reference/query-language.md).\n"
  },
  {
    "path": "docs/get-started/quickstart.md",
    "content": "---\ntitle: Quickstart\nsidebar_position: 1\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIn this quick start guide, we will install Quickwit, create an index, add documents and finally execute search queries. All the Quickwit commands used in this guide are documented [in the CLI reference documentation](/docs/reference/cli.md).\n\n## Install Quickwit using Quickwit installer\n\nThe Quickwit installer automatically picks the correct binary archive for your environment and then downloads and unpacks it in your working directory.\nThis method works only for [some OS/architectures](installation.md#download), and you will also need to install some [external dependencies](installation.md#note-on-external-dependencies).\n\n```bash\ncurl -L https://install.quickwit.io | sh\n```\n\n```bash\ncd ./quickwit-v*/\n./quickwit --version\n```\n\nYou can now move this executable directory wherever sensible for your environment and possibly add it to your `PATH` environment.\n\n## Use Quickwit's Docker image\n\nYou can also pull and run the Quickwit binary in an isolated Docker container.\n\n```bash\n# Create first the data directory.\nmkdir qwdata\ndocker run --rm quickwit/quickwit --version\n```\n\nIf you are using Apple silicon based macOS system you might need to specify the platform. 
You can also safely ignore jemalloc warnings.\n\n```bash\ndocker run --rm --platform linux/amd64 quickwit/quickwit --version\n```\n\n## Start Quickwit server\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit run\n```\n\n</TabItem>\n\n<TabItem value=\"docker\" label=\"Docker\">\n\n```bash\ndocker run --rm -v $(pwd)/qwdata:/quickwit/qwdata -p 127.0.0.1:7280:7280 quickwit/quickwit run\n```\n\n</TabItem>\n\n</Tabs>\n\nTips: you can use the environment variable `RUST_LOG` to control quickwit verbosity.\n\nCheck it's working by browsing the [UI at http://localhost:7280](http://localhost:7280) or do a simple GET with cURL:\n\n```bash\ncurl http://localhost:7280/api/v1/version\n```\n\n## Create your first index\n\nBefore adding documents to Quickwit, you need to create an index configured with a YAML config file. This config file notably lets you define how to map your input documents to your index fields and whether these fields should be stored and indexed. See the [index config documentation](/docs/configuration/index-config.md).\n\nLet's create an index configured to receive Stackoverflow posts (questions and answers).\n\n```bash\n# First, download the stackoverflow dataset config from Quickwit repository.\ncurl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml\n```\n\nThe index config defines three fields: `title`, `body` and `creationDate`. `title` and `body` are [indexed and tokenized](../configuration/index-config.md#text-type), and they are also used as default search fields, which means they will be used for search if you do not target a specific field in your query. `creationDate` serves as the timestamp for each record. 
No other explicit field definitions are needed because we can use the default dynamic [mode](/docs/configuration/index-config.md#mode): the undeclared fields will still be indexed. By default, fast fields are enabled to allow aggregation queries, and the `raw` tokenizer is used for text.\n\nAnd here is the complete config:\n\n```yaml title=\"stackoverflow-index-config.yaml\"\n#\n# Index config file for stackoverflow dataset.\n#\nversion: 0.7\n\nindex_id: stackoverflow\n\ndoc_mapping:\n  field_mappings:\n    - name: title\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: creationDate\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: creationDate\n\nsearch_settings:\n  default_search_fields: [title, body]\n\nindexing_settings:\n  commit_timeout_secs: 30\n```\n\nNow we can create the index with the command:\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index create --index-config ./stackoverflow-index-config.yaml\n```\n\n</TabItem>\n\n<TabItem value=\"curl\" label=\"CURL\">\n\n```bash\ncurl -XPOST http://127.0.0.1:7280/api/v1/indexes --header \"content-type: application/yaml\" --data-binary @./stackoverflow-index-config.yaml\n```\n\n</TabItem>\n\n</Tabs>\n\nCheck that a directory `./qwdata/indexes/stackoverflow` has been created. Quickwit will write index files there, along with a `metastore.json` file that contains the [index metadata](../overview/architecture.md#index).\nYou're now ready to fill the index.\n\n\n## Let's add some documents\n\nQuickwit can index data from many [sources](/docs/configuration/source-config.md). 
We will use a newline-delimited JSON ([ndjson](http://ndjson.org/)) dataset as our data source.\nLet's download [a bunch of Stackoverflow posts (10 000)](https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json) in [ndjson](http://ndjson.org/) format and index them.\n\n```bash\n# Download the first 10_000 Stackoverflow posts.\ncurl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json\n```\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n# Index our 10k documents.\n./quickwit index ingest --index stackoverflow --input-path stackoverflow.posts.transformed-10000.json --force\n```\n\n</TabItem>\n\n<TabItem value=\"curl\" label=\"CURL\">\n\n```bash\n# Index our 10k documents.\ncurl -XPOST \"http://127.0.0.1:7280/api/v1/stackoverflow/ingest?commit=force\" --data-binary @stackoverflow.posts.transformed-10000.json\n```\n\n</TabItem>\n\n</Tabs>\n\nAs soon as the ingest command finishes, you can start querying data by using the following `search` command:\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index search --index stackoverflow --query \"search AND engine\"\n```\n\n</TabItem>\n\n<TabItem value=\"curl\" label=\"CURL\">\n\n```bash\ncurl \"http://127.0.0.1:7280/api/v1/stackoverflow/search?query=search+AND+engine\"\n```\n\n</TabItem>\n\n</Tabs>\n\nIt should return 10 hits. Now you're ready to play with the search API.\n\n\n## Execute search queries\n\n\nLet's start with a query on the field `title`: `title:search AND engine`:\n```bash\ncurl \"http://127.0.0.1:7280/api/v1/stackoverflow/search?query=title:search+AND+engine\"\n```\n\nThe same request can be expressed as a JSON query:\n```bash\ncurl -XPOST \"http://localhost:7280/api/v1/stackoverflow/search\" -H 'Content-Type: application/json' -d '{\n    \"query\": \"title:search AND engine\"\n}'\n```\n\nThis format is more verbose, but it allows you to use more advanced features such as aggregations. 
The following query finds most popular tags used on the questions in this dataset:\n```bash\ncurl -XPOST \"http://localhost:7280/api/v1/stackoverflow/search\" -H 'Content-Type: application/json' -d '{\n    \"query\": \"type:question\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"foo\": {\n            \"terms\":{\n                \"field\":\"tags\",\n                \"size\": 10\n            }\n        }\n    }\n}'\n```\n\nAs you are experimenting with different queries check out the server logs to see what's happening.\n\n:::note\n\nDon't forget to encode correctly the query params to avoid bad request (status 400).\n\n:::\n\n\n\n## Clean\n\nLet's do some cleanup by deleting the index:\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index delete --index stackoverflow\n```\n\n</TabItem>\n\n<TabItem value=\"rest\" label=\"REST\">\n\n```bash\ncurl -XDELETE http://127.0.0.1:7280/api/v1/indexes/stackoverflow\n```\n\n</TabItem>\n\n</Tabs>\n\nCongrats! You can level up with the following tutorials to discover all Quickwit features.\n\n\n## TLDR\n\nRun the following command from within Quickwit's installation directory.\n\n```bash\ncurl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml\n./quickwit index create --index-config ./stackoverflow-index-config.yaml\ncurl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json\n./quickwit index ingest --index stackoverflow --input-path ./stackoverflow.posts.transformed-10000.json --force\n./quickwit index search --index stackoverflow --query \"search AND engine\"\n./quickwit index delete --index stackoverflow\n```\n\n\n## Next tutorials\n\n- [Search on logs with timestamp pruning](/docs/get-started/tutorials/tutorial-hdfs-logs)\n- [Setup a distributed search on AWS S3](/docs/get-started/tutorials/tutorial-hdfs-logs-distributed-search-aws-s3)\n"
  },
  {
    "path": "docs/get-started/tutorials/_category_.yaml",
    "content": "label: 'Tutorials'\nposition: 2\ncollapsed: false\n"
  },
  {
    "path": "docs/get-started/tutorials/prometheus-metrics.md",
    "content": "---\ntitle: Metrics with Grafana and Prometheus\ndescription: A simple tutorial to display Quickwit metrics with Grafana.\nicon_url: /img/tutorials/quickwit-logo.png\ntags: [grafana, prometheus, integration]\nsidebar_position: 2\n---\n\nIn this tutorial, you will learn how to set up Grafana to display Quickwit metrics using Prometheus. Grafana will visualize the metrics collected from Quickwit, allowing you to monitor its performance effectively.\n\n## Step 1: Create a Docker Compose File\n\nFirst, create a `docker-compose.yml` file in your project directory. This file will configure and run Quickwit, Prometheus, and Grafana as Docker services.\n\nHere’s the complete Docker Compose configuration:\n\n```yaml\nservices:\n  quickwit:\n    image: quickwit/quickwit\n    environment:\n      QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER: \"true\"\n      OTEL_EXPORTER_OTLP_ENDPOINT: \"http://localhost:7281\"\n    ports:\n      - 7280:7280\n    command: [\"run\"]\n\n  grafana:\n    image: grafana/grafana-oss\n    container_name: grafana\n    ports:\n      - \"${MAP_HOST_GRAFANA:-127.0.0.1}:3000:3000\"\n    environment:\n      GF_INSTALL_PLUGINS: https://github.com/quickwit-oss/quickwit-datasource/releases/download/v0.4.6/quickwit-quickwit-datasource-0.4.6.zip;quickwit-quickwit-datasource\n      GF_AUTH_DISABLE_LOGIN_FORM: \"true\"\n      GF_AUTH_ANONYMOUS_ENABLED: \"true\"\n      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin\n\n  prometheus:\n    image: prom/prometheus:latest\n    container_name: prometheus\n    volumes:\n      - ./prometheus.yml:/etc/prometheus/prometheus.yml  # Ensure prometheus.yml exists in the same directory\n    ports:\n      - 9090:9090\n```\n\n### Explanation of Services\n\n- **Quickwit**: Runs the Quickwit service on port `7280`.\n- **Grafana**: Queries and displays data from Prometheus.\n- **Prometheus**: Collects metrics from Quickwit using the `/metrics` endpoint.\n\n## Step 2: Configure Prometheus\n\nPrometheus needs a configuration file to define 
how it scrapes metrics from Quickwit. Create a file named `prometheus.yml` in the same directory as your Docker Compose file with the following content:\n\n```yaml\nglobal:\n  scrape_interval: 1s\n  scrape_timeout: 1s\n\nscrape_configs:\n  - job_name: quickwit\n    metrics_path: /metrics\n    static_configs:\n      - targets:\n          - quickwit:7280\n```\n\n## Step 3: Start the Services\n\nRun the following command in your terminal to start all services defined in the Docker Compose file:\n\n```bash\ndocker compose up\n```\n\nThis will launch Quickwit, Prometheus, and Grafana services.\n\n## Step 4: Configure Grafana to Use Prometheus\n\n1. Open Grafana in your browser at `http://localhost:3000`.\n2. Navigate to **Configuration** > **Data Sources**.\n3. Click **Add Data Source**, select **Prometheus**, and set the URL to `http://prometheus:9090`.\n4. Click **Save & Test** to verify the connection.\n\n## Step 5: Create or Use Pre-Configured Dashboards\n\nNow that Grafana is set up with Prometheus as a data source, you can create custom dashboards or use Quickwit's pre-configured dashboards:\n\n1. Go to the **Dashboards** section in Grafana.\n2. Import or create a new dashboard to visualize metrics.\n3. Alternatively, use one of Quickwit’s [pre-configured dashboards](../../operating/monitoring).\n\n"
  },
  {
    "path": "docs/get-started/tutorials/trace-analytics-with-grafana.md",
    "content": "---\ntitle: Logs and Traces with Grafana\ndescription: A simple tutorial to use Grafana with Quickwit's datasource plugin.\nicon_url: /img/tutorials/quickwit-logo.png\ntags: [grafana, integration]\nsidebar_position: 2\n---\n\nIn this tutorial, we will set up a Grafana Dashboard showing Quickwit traces using Docker Compose.\n\nYou only need a few minutes to get Grafana working with Quickwit and build meaningful dashboards.\n\n## Create a Docker Compose recipe\n\nFirst, create a `docker-compose.yml` file. This file will define the services needed to run Quickwit with OpenTelemetry and Grafana with the Quickwit Datasource plugin.\n\nBelow is the complete Docker Compose configuration:\n\n```yaml\nversion: '3.0'\nservices:\n  quickwit:\n    image: quickwit/quickwit\n    environment:\n      QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER: \"true\"\n      OTEL_EXPORTER_OTLP_ENDPOINT: \"http://localhost:7281\"\n    ports:\n      - 7280:7280\n    command: [\"run\"]\n\n  grafana:\n    image: grafana/grafana-oss\n    container_name: grafana\n    ports:\n      - \"${MAP_HOST_GRAFANA:-127.0.0.1}:3000:3000\"\n    environment:\n      GF_INSTALL_PLUGINS: https://github.com/quickwit-oss/quickwit-datasource/releases/download/v0.4.6/quickwit-quickwit-datasource-0.4.6.zip;quickwit-quickwit-datasource\n      GF_AUTH_DISABLE_LOGIN_FORM: \"true\"\n      GF_AUTH_ANONYMOUS_ENABLED: \"true\"\n      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin\n```\n\nThe default Grafana port is 3000. If this port is already taken, you can modify the port mapping, for example, changing 3000:3000 to 3100:3000 or any other available port.\n\nSave and run the recipe:\n\n```bash\n$ docker compose up\n```\n\nYou should be able to access Quickwit's UI on `http://localhost:7280/` and Grafana's UI on `http://localhost:3000/`.\n\n## Setting up the datasource\n\nIn Grafana, head to [Data Sources](http://localhost:3000/connections/datasources). 
If the plugin is installed correctly, you should be able to find Quickwit in the list.\n\nWe're going to set up a new Quickwit data source looking at Quickwit's own OpenTelemetry traces. Let's configure the datasource with the following parameters:\n\n- URL : `http://quickwit:7280/api/v1` _This uses the docker service name as the host_\n- Index ID : `otel-traces-v0_7`\n\nSave and test. You should obtain a confirmation that the datasource is correctly set up.\n\n\n![Quickwit Plugin configuration success](../../assets/images/grafana-ui-quickwit-datasource-plugin-success.png)\n\n\nYou can also set up a new Quickwit data source looking at Quickwit's own OpenTelemetry logs (or your own logs index). Let's configure the datasource with the following parameters:\n\n- URL : `http://quickwit:7280/api/v1` _This uses the docker service name as the host_\n- Index ID : `otel-logs-v0_7`\n\n\n## Creating a dashboard\n\nYou can then [create a new dashboard](http://localhost:3000/dashboard/new) and add a visualization. You should be able to choose the Quickwit traces datasource here.\n\nQuickwit sends itself its own traces, so you should already have data to display. 
Let's configure some panels!\n\n- a Table counting span_names\n  - **Panel type**: Table\n  - **Query**: _empty_\n  - **Metric**: Count\n  - **Group by**: Terms : `span_name` : order by Count\n- a Bar Chart showing the number of tantivy searches per hour:\n  - **Panel type**: Time Series\n  - **Query**: \"span_name:tantivy_search\"\n  - **Metric**: Count\n  - **Group by**: Date Histogram : `span_start_timestamp_nanos` : Interval 1h\n- a Bar Chart showing the number of ERROR logs per hour for the last 6 hours:\n  - **Panel type**: Bar Chart\n  - **Query**: \"service_name:quickwit AND events.event_attributes.level:ERROR\"\n  - **Metric**: Count\n  - **Group by**: Terms : `span_start_timestamp_nanos` : Interval 1h\n- another query on the same Bar Chart for WARN logs\n\n## The result\n\nHere's what your first dashboard can look like:\n\n![Quickwit Panel in Grafana Dashboard](../../assets/images/screenshot-grafana-tutorial-dashboard.png)\n\n\n"
  },
  {
    "path": "docs/get-started/tutorials/tutorial-hdfs-logs-distributed-search-aws-s3.md",
    "content": "---\ntitle: Distributed search on AWS S3\ndescription: Index log entries on AWS S3 using an EC2 instance and launch a distributed cluster.\ntags: [aws, integration]\nicon_url: /img/tutorials/aws-logo.png\nsidebar_position: 6\n---\n\nIn this guide, we will index about 40 million log entries (13 GB decompressed) on AWS S3 using an EC2 instance and launch a three-node distributed search cluster.\n\nExample of a log entry:\n```json\n{\n  \"timestamp\": 1460530013,\n  \"severity_text\": \"INFO\",\n  \"body\": \"PacketResponder: BP-108841162-10.10.34.11-1440074360971:blk_1074072698_331874, type=HAS_DOWNSTREAM_IN_PIPELINE terminating\",\n  \"resource\": {\n    \"service\": \"datanode/01\"\n  },\n  \"attributes\": {\n    \"class\": \"org.apache.hadoop.hdfs.server.datanode.DataNode\"\n  }\n}\n```\n\n:::caution\n\nBefore using Quickwit with an object storage, check out our [advice](../../operating/aws-costs) for deploying on AWS S3 to avoid some bad surprises at the end of the month.\n\n:::\n\nFirst of all, let's create an EC2 instance, install a Quickwit binary, and [configure it](../../guides/aws-setup) to let Quickwit access your S3 buckets. This instance will be used for indexing our dataset (note that you can also index your dataset from your local machine if it has the rights to read/write on AWS S3).\n\n## Install\n\n```bash\ncurl -L https://install.quickwit.io | sh\ncd quickwit-v*/\n```\n\n## Configure Quickwit with S3\n\nLet's define the S3 path where we want to store our indexes.\n\n```bash\nexport S3_PATH=s3://{path/to/bucket}/indexes\n```\n\n:::note\nYou'll want to include the necessary authorization for the given bucket, this can be done by setting the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`\nenvironment variables, or via the AWS credentials file. 
This file is usually located at `~/.aws/credentials`.\n\nFor more info, check out [our AWS setup guide](https://quickwit.io/docs/guides/aws-setup).\n:::\n\nNow we can create a Quickwit config file.\n\n```bash\n# Create Quickwit config file.\necho \"version: 0.7\nnode_id: searcher-1\nlisten_address: 0.0.0.0\nmetastore_uri: ${S3_PATH}\ndefault_index_root_uri: ${S3_PATH}\n\" > config.yaml\n```\n\n> You can also reference environment variables directly in the config file:\n> ```yaml\n> # config.yaml\n> node_id: searcher-1\n> listen_address: 0.0.0.0\n> version: 0.7\n> metastore_uri: ${S3_PATH}\n> default_index_root_uri: ${S3_PATH}\n>```\n\nWe are now ready to start Quickwit.\n\n```bash\n./quickwit run --config config.yaml\n```\n\n## Create your index\n\n```bash\n# First, download the hdfs logs config from the Quickwit repository.\ncurl -o hdfs_logs_index_config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/hdfs-logs/index-config.yaml\n```\n\nThe index config defines five fields: `timestamp`, `tenant_id`, `severity_text`, `body`, and one JSON field\nfor the nested values `resource.service`. We could use an object field here and maintain a fixed schema, but for convenience we're going to use a JSON field.\nIt also sets the `default_search_fields`, the `tag_fields`, and the `timestamp_field`. The `timestamp_field` and `tag_fields` are\nused by Quickwit for [splits pruning](../../overview/architecture) at query time to boost search speed. 
\nCheck out the [index config docs](../../configuration/index-config) for more details.\n\n```yaml title=\"hdfs_logs_index_config.yaml\"\nversion: 0.7\n\nindex_id: hdfs-logs\n\ndoc_mapping:\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: tenant_id\n      type: u64\n    - name: severity_text\n      type: text\n      tokenizer: raw\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: json\n      tokenizer: raw\n  tag_fields: [tenant_id]\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n```\n\nWe can now create the index with the `create` subcommand.\n\n```bash\n./quickwit index create --index-config hdfs_logs_index_config.yaml\n```\n\n:::note\n\nThis step can also be executed on your local machine. The `create` command creates the index locally and then uploads a \njson file `metastore.json` to your bucket at `s3://path-to-your-bucket/hdfs-logs/metastore.json`.\n\n:::\n\n## Index logs\nThe dataset is a compressed [NDJSON file](https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants.json.gz). \nInstead of downloading and indexing the data in separate steps, we will use pipes to send a decompressed stream to Quickwit directly.\n\n```bash\nwget https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants.json.gz\ngunzip -c hdfs-logs-multitenants.json.gz | ./quickwit index ingest --index hdfs-logs\n```\n\n:::note\n\n8GB of RAM is enough to index this dataset; an instance like `t4g.large` with 8GB and 2 vCPU indexed this dataset in less than 10 minutes \n(provided that you have some CPU credits).\n\nThis step can also be done on your local machine. 
\nThe `ingest` subcommand generates [splits](../../overview/architecture) of 10 million documents locally and uploads \nthem to your bucket. Concretely, each split is a bundle of index files and metadata files.\n\n:::\n\n\nYou can check that it works by using the `search` subcommand to look for `ERROR` in the `severity_text` field:\n```bash\n./quickwit index search --index hdfs-logs --query \"severity_text:ERROR\"\n```\n\nThis returns the following JSON:\n\n```json\n{\n  \"num_hits\": 345,\n  \"hits\": [\n    {\n      \"attributes\": {\n        \"class\": \"org.apache.hadoop.hdfs.server.datanode.DataNode\"\n      },\n      \"body\": \"RECEIVED SIGNAL 15: SIGTERM\",\n      \"resource\": {\n        \"service\": \"datanode/16\"\n      },\n      \"severity_text\": \"ERROR\",\n      \"tenant_id\": 51,\n      \"timestamp\": 1469687755\n    },\n    ...\n  ],\n  \"elapsed_time_micros\": 522542\n}\n```\n\nYou can see that this query has 345 hits. In this case, for the first run, the server responded in 523 milliseconds.\nSubsequent runs use the cached metastore and can be resolved in under 100 milliseconds.\n\nNow that we have indexed the logs and can search from one instance, it's time to configure and start two other instances to form a cluster.\n\n## Start two more instances\n\nQuickwit needs a port `rest.listen_port` for serving the HTTP REST API via TCP as well as maintaining the cluster formation via UDP. \nIt also needs `{rest.listen_port} + 1` for gRPC communication between instances.\n\nIn AWS, you can create a security group to group these inbound rules. Check out the [network section](../../guides/aws-setup) of our AWS setup guide.\n\nTo make things easier, let's create a security group that opens the TCP/UDP port range [7200-7300]. \nNext, create three EC2 instances using the previously created security group. 
Take note of each instance's public IP address.\n\nNow ssh into the first EC2 instance, install Quickwit, and [configure the environment](../../guides/aws-setup) to let Quickwit access the index S3 buckets.\n\nLet's install Quickwit on the second and third EC2 instances.\n\n```bash\ncurl -L https://install.quickwit.io | sh\ncd quickwit-v*/\n```\n\nAnd configure the environment so instances can form a cluster:\n\n```bash\nexport S3_PATH=s3://{path/to/bucket}/indexes\nexport IP_NODE_1={first-ec2-instance-public-ip}\n```\n\n```bash\n# configuration for our second node\necho \"version: 0.7\nnode_id: searcher-2\nmetastore_uri: ${S3_PATH}\ndefault_index_root_uri: ${S3_PATH}\nlisten_address: 0.0.0.0\npeer_seeds:\n  - ${IP_NODE_1} # searcher-1\n\" > config.yaml\n\n# Start a Quickwit searcher.\n./quickwit run --service searcher --config config.yaml\n```\n\n```bash\n# configuration for our third node\necho \"version: 0.7\nnode_id: searcher-3\nlisten_address: 0.0.0.0\npeer_seeds:\n  - ${IP_NODE_1} # searcher-1\nmetastore_uri: ${S3_PATH}\ndefault_index_root_uri: ${S3_PATH}\n\" > config.yaml\n\n# Start a Quickwit searcher.\n./quickwit run --service searcher --config config.yaml\n```\n\n\nYou will see in the terminal the confirmation that the instance has joined the existing cluster. Example of such a log:\n\n```\n2023-03-19T16:44:56.918Z  INFO quickwit_cluster::cluster: Joining cluster. 
cluster_id=quickwit-default-cluster node_id=searcher-2 enabled_services={Searcher} gossip_listen_addr=0.0.0.0:7280 gossip_advertise_addr=172.31.30.168:7280 grpc_advertise_addr=172.31.30.168:7281 peer_seed_addrs=172.31.91.203:7280\n```\n\nNow we can query one of our instances directly by issuing HTTP requests to that node's REST API endpoint.\n\n```\ncurl -v \"http://0.0.0.0:7280/api/v1/hdfs-logs/search?query=severity_text:ERROR\"\n```\n\nCheck out the logs of all instances and you will see that all nodes are working.\n\n## Load balancing incoming requests\n\nNow that you have a search cluster, ideally, you will want to load balance external requests. \nThis can quickly be done by adding an AWS load balancer to listen to incoming HTTP or HTTPS traffic and forward it to a target group.\nYou can now play with your cluster, kill processes randomly, add/remove new instances, and keep calm.\n\n## Clean\n\nLet's do some cleanup by deleting the index:\n\n```bash\n./quickwit index delete --index hdfs-logs\n```\n\nAlso remember to remove the security group to protect your EC2 instances. You can simply remove the instances if you no longer need them.\n\nCongrats! You have finished this tutorial!\n\nTo continue your Quickwit journey, check out the [search REST API reference](/docs/reference/rest-api) or the [query language reference](/docs/reference/query-language).\n"
  },
  {
    "path": "docs/get-started/tutorials/tutorial-hdfs-logs.md",
    "content": "---\ntitle: Index a logging dataset locally\ndescription: Index log entries on a local machine.\ntags: [self-hosted, setup]\nicon_url: /img/quickwit-icon.svg\nsidebar_position: 3\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIn this guide, we will index about 20 million log entries (7 GB decompressed) on a local machine. If you want to start a server with indexes on AWS S3 with several search nodes, check out the [tutorial for distributed search](tutorial-hdfs-logs-distributed-search-aws-s3.md).\n\nHere is an example of a log entry:\n```json\n{\n  \"timestamp\": 1460530013,\n  \"severity_text\": \"INFO\",\n  \"body\": \"PacketResponder: BP-108841162-10.10.34.11-1440074360971:blk_1074072698_331874, type=HAS_DOWNSTREAM_IN_PIPELINE terminating\",\n  \"resource\": {\n    \"service\": \"datanode/01\"\n  },\n  \"attributes\": {\n    \"class\": \"org.apache.hadoop.hdfs.server.datanode.DataNode\"\n  },\n  \"tenant_id\": 58\n}\n```\n\n\n## Install\n\nLet's download and install Quickwit.\n\n```bash\ncurl -L https://install.quickwit.io | sh\ncd quickwit-v*/\n```\n\nOr pull and run the Quickwit binary in an isolated Docker container.\n\n```bash\ndocker run quickwit/quickwit --version\n```\n\n## Start a Quickwit server\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit run\n```\n\n</TabItem>\n\n<TabItem value=\"docker\" label=\"Docker\">\n\n```bash\ndocker run --rm -v $(pwd)/qwdata:/quickwit/qwdata -p 127.0.0.1:7280:7280 quickwit/quickwit run\n```\n\nYou may need to specify the platform if you are using Apple silicon based macOS system with the `--platform linux/amd64` flag. 
You can also safely ignore jemalloc warnings.\n\n</TabItem>\n\n</Tabs>\n\n\n## Create your index\n\nLet's create an index configured to receive these logs.\n\n```bash\n# First, download the hdfs logs config from Quickwit repository.\ncurl -o hdfs_logs_index_config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/hdfs-logs/index-config.yaml\n```\n\nThe index config defines five fields: `timestamp`, `tenant_id`, `severity_text`, `body`, and one JSON field\nfor the nested values `resource.service`, we could use an object field here and maintain a fixed schema, but for convenience we're going to use a JSON field.\nIt also sets the `default_search_fields`, the `tag_fields`, and the `timestamp_field`.\nThe `timestamp_field` and `tag_fields` are used by Quickwit for [splits pruning](../../overview/concepts/querying.md#time-sharding) at query time to boost search speed. \nCheck out the [index config docs](../../configuration/index-config) for more details.\n\n```yaml title=\"hdfs-logs-index.yaml\"\nversion: 0.7\n\nindex_id: hdfs-logs\n\ndoc_mapping:\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: tenant_id\n      type: u64\n    - name: severity_text\n      type: text\n      tokenizer: raw\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: json\n      tokenizer: raw\n  tag_fields: [tenant_id]\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n```\n\nNow let's create the index with the `create` subcommand (assuming you are inside Quickwit install directory):\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index create --index-config hdfs_logs_index_config.yaml\n```\n\n</TabItem>\n\n<TabItem value=\"curl\" 
label=\"cURL\">\n\n```bash\ncurl -XPOST http://localhost:7280/api/v1/indexes -H \"content-type: application/yaml\" --data-binary @hdfs_logs_index_config.yaml\n```\n\n</TabItem>\n\n</Tabs>\n\n\nYou're now ready to fill the index.\n\n## Index logs\nThe dataset is a compressed [NDJSON file](https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants.json.gz).\nInstead of downloading it and then indexing the data, we will use pipes to directly send a decompressed stream to Quickwit.\nThis can take up to 10 minutes on a modern machine, the perfect time for a coffee break.\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\ncurl https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants.json.gz | gunzip | ./quickwit index ingest --index hdfs-logs\n```\n\n</TabItem>\n\n<TabItem value=\"docker\" label=\"Docker\">\n\n```bash\ncurl https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants.json.gz | gunzip | docker run -v $(pwd)/qwdata:/quickwit/qwdata -i quickwit/quickwit index ingest --index hdfs-logs\n```\n\n</TabItem>\n\n</Tabs>\n\n\n\nIf you are in a hurry, use the sample dataset that contains 10 000 documents, we will use this dataset for the example queries:\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\ncurl https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json | ./quickwit index ingest --index hdfs-logs\n```\n\n</TabItem>\n\n<TabItem value=\"docker\" label=\"Docker\">\n\nOn macOS or Windows:\n\n```bash\ncurl https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json | docker run -v $(pwd)/qwdata:/quickwit/qwdata -i quickwit/quickwit index ingest --index hdfs-logs --endpoint http://host.docker.internal:7280\n```\n\nOn linux:\n\n```bash\ncurl https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json | docker run --network=host -v $(pwd)/qwdata:/quickwit/qwdata -i quickwit/quickwit index ingest --index hdfs-logs --endpoint 
http://127.0.0.1:7280\n```\n\n</TabItem>\n\n<TabItem value=\"curl\" label=\"cURL\">\n\n```bash\nwget https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json\ncurl -XPOST http://localhost:7280/api/v1/hdfs-logs/ingest -H \"content-type: application/json\" --data-binary @hdfs-logs-multitenants-10000.json\n```\n\n</TabItem>\n\n</Tabs>\n\nYou can check that it works by searching for `INFO` in the `severity_text` field:\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index search --index hdfs-logs  --query \"severity_text:INFO\"\n```\n\n</TabItem>\n\n<TabItem value=\"docker\" label=\"Docker\">\n\nOn macOS or Windows:\n\n```bash\ndocker run -v $(pwd)/qwdata:/quickwit/qwdata quickwit/quickwit index search --index hdfs-logs  --query \"severity_text:INFO\" --endpoint http://host.docker.internal:7280\n```\n\nOn Linux:\n\n```bash\ndocker run --network=host -v $(pwd)/qwdata:/quickwit/qwdata quickwit/quickwit index search --index hdfs-logs  --query \"severity_text:INFO\" --endpoint http://127.0.0.1:7280\n```\n\n</TabItem>\n\n</Tabs>\n\n:::note\n\nThe `ingest` subcommand generates [splits](../../overview/architecture) of 5 million documents. Each split is a small piece of the index, represented by a file in which index files and metadata files are saved.\n\n:::\n\n\nThe query returns the following JSON:\n\n```json\n{\n  \"num_hits\": 10000,\n  \"hits\": [\n    {\n      \"body\": \"Receiving BP-108841162-10.10.34.11-1440074360971:blk_1073836032_95208 src: /10.10.34.20:60300 dest: /10.10.34.13:50010\",\n      \"resource\": {\n        \"service\": \"datanode/03\"\n      },\n      \"severity_text\": \"INFO\",\n      \"tenant_id\": 58,\n      \"timestamp\": 1440670490\n    }\n    ...\n  ],\n  \"elapsed_time_micros\": 2490\n}\n```\n\nThe index config shows that we can use the timestamp field parameters `start_timestamp` and `end_timestamp` and benefit from time pruning. 
\nBehind the scenes, Quickwit will only query [splits](../../overview/architecture) that have logs in this time range.\n\nLet's use these parameters with the following query:\n\n```bash\ncurl 'http://127.0.0.1:7280/api/v1/hdfs-logs/search?query=severity_text:INFO&start_timestamp=1440670490&end_timestamp=1450670490'\n```\n\n## Clean\n\nLet's do some cleanup by deleting the index:\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index delete --index hdfs-logs\n```\n\n</TabItem>\n\n<TabItem value=\"curl\" label=\"cURL\">\n\n```bash\ncurl -XDELETE http://127.0.0.1:7280/api/v1/indexes/hdfs-logs\n```\n\n</TabItem>\n\n</Tabs>\n\nCongrats! You have finished this tutorial!\n\n\nTo continue your Quickwit journey, check out the [tutorial for distributed search](tutorial-hdfs-logs-distributed-search-aws-s3.md) or dig into the [search REST API](/docs/reference/rest-api) or [query language](/docs/reference/query-language).\n"
  },
  {
    "path": "docs/get-started/tutorials/tutorial-jaeger.md",
    "content": "---\ntitle: Traces with Jaeger\nsidebar_position: 2\n---\n\nIn this quick start guide, we will set up a Quickwit instance and analyze its own traces with Jaeger using Docker Compose.\n\nYou only need a minute to get Jaeger working with Quickwit storage backend.\n\n## Start Quickwit and Jaeger\n\nLet's use `docker compose` with the following configuration:\n\n```yaml title=\"docker-compose.yaml\"\nversion: \"3\"\n\nservices:\n  quickwit:\n    image: quickwit/quickwit:${QW_VERSION:-0.8.1}\n    volumes:\n      - ./qwdata:/quickwit/qwdata\n    ports:\n      - 7280:7280\n    environment:\n      - QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER=true\n      - OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:7281\n    command: [\"run\"]\n\n  jaeger-query:\n    image: jaegertracing/jaeger-query:1.60\n    ports:\n      - 16686:16686\n    environment:\n      - SPAN_STORAGE_TYPE=grpc\n      - GRPC_STORAGE_SERVER=quickwit:7281\n      - GRPC_STORAGE_TLS=false\n```\n\nAs you can see in the docker compose file, Quickwit is configured to send its own traces `OTEL_EXPORTER_OTLP_ENDPOINT` to itself `http://localhost:7281`.\nOn the other side, Jaeger is configured to use a gRPC storage server `quickwit:7281`.\n\nSave and run the recipe:\n\n```bash\n$ docker compose up\n```\n\nYou should be able to access Quickwit's UI on `http://localhost:7280/` and Jager's UI on `http://localhost:16686/`.\n\n\n## Searching and view traces in Jaeger\n\nQuickwit generates many traces, let's take a look at some of them:\n- `find_traces`: generated by the \"Find traces\" Jaeger button.\n- `get_operations`: generated by Jaeger when it is fetching the list of operations.\n- `get_services`: generated by Jaeger when it is fetching the list of services.\n- `ingest-spans`: generated when Quickwit receives spans on the gRPC OTLP API.\n- ...\n\nHere are the screenshots of the search and trace view:\n\n![Jaeger search view](../../assets/images/jaeger-ui-quickwit-search-traces.png)\n![Jaeger trace 
view](../../assets/images/jaeger-ui-quickwit-trace-view.png)\n\n## Searching traces with Quickwit UI\n\nYou can also use the Quickwit UI at [http://localhost:7280](http://localhost:7280) to search traces.\n\nHere are a couple of query examples:\n- `service_name:quickwit AND events.event_attributes.level:INFO`\n- `span_duration_millis:>100`\n- `resource_attributes.service.version:v0.8.1`\n- `service_name:quickwit`\n\nThat's it! You can level up with the following tutorials to discover all Quickwit features.\n\n## Next tutorials\n\n- [Send traces using an OTEL collector](/docs/distributed-tracing/send-traces/using-otel-collector.md)\n- [Send traces from a python web server](/docs/distributed-tracing/send-traces/using-otel-sdk-python.md)\n"
  },
  {
    "path": "docs/guides/_category_.yaml",
    "content": "label: 'Guides'\nposition: 8\ncollapsed: true\n"
  },
  {
    "path": "docs/guides/aws-setup.md",
    "content": "---\ntitle: AWS cluster setup\nsidebar_position: 3\n---\n\nSetting up a Quickwit cluster on AWS requires the configuration of three elements:\n- AWS credentials\n- AWS region\n- Network configuration\n\n## AWS credentials\n\nWhen starting a node, Quickwit attempts to find AWS credentials using the credential provider chain implemented by [rusoto_core::ChainProvider](https://docs.rs/rusoto_credential/latest/rusoto_credential/struct.ChainProvider.html) and looks for credentials in this order:\n\n1. Environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or `AWS_SESSION_TOKEN` (optional).\n\n2. Credential profiles file, typically located at `~/.aws/credentials` or otherwise specified by the `AWS_SHARED_CREDENTIALS_FILE` and `AWS_PROFILE` environment variables if set and not empty.\n\n3. Amazon ECS container credentials, loaded from the Amazon ECS container if the environment variable `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` is set.\n\n4. Instance profile credentials, used on Amazon EC2 instances, and delivered through the Amazon EC2 metadata service.\n\nAn error is returned if no credentials are found in the chain.\n\n## AWS region\n\nQuickwit attempts to find an AWS region in multiple locations and with the following order of precedence:\n\n1. Environment variables (`AWS_REGION` then `AWS_DEFAULT_REGION`)\n\n2. Config file, typically located at `~/.aws/config` or otherwise specified by the `AWS_CONFIG_FILE` environment variable if set and not empty.\n\n3. Amazon EC2 instance metadata service indicating the region of the currently running Amazon EC2 instance.\n\n4. 
Default value: `us-east-1`\n\n:::note\n\nAWS credentials or region resolution may take a few seconds, especially if the Amazon EC2 instance metadata service is slow or unavailable.\n\n:::\n\n## IAM permissions\n\n### Amazon S3\n\nRequired authorized actions:\n- `ListBucket` (on the bucket directly)\n- `GetObject`\n- `PutObject`\n- `DeleteObject`\n- `ListMultipartUploadParts`\n- `AbortMultipartUpload`\n\nHere is an example of a bucket policy:\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": [\n        \"s3:ListBucket\"\n      ],\n      \"Resource\": [\n        \"arn:aws:s3:::my-bucket\"\n      ]\n    },\n    {\n      \"Effect\": \"Allow\",\n      \"Action\": [\n        \"s3:GetObject\",\n        \"s3:PutObject\",\n        \"s3:DeleteObject\",\n        \"s3:ListMultipartUploadParts\",\n        \"s3:AbortMultipartUpload\"\n      ],\n      \"Resource\": [\n        \"arn:aws:s3:::my-bucket/*\"\n      ]\n    }\n  ]\n}\n```\n\nYou can run the following commands to verify that AWS credentials, region, and IAM permissions are properly configured for Amazon S3:\n\n```bash\nMY_BUCKET=<bucket name>\naws s3 ls $MY_BUCKET\necho \"Hello, World!\" | aws s3 cp - $MY_BUCKET/hello\naws s3 ls $MY_BUCKET/hello\naws s3 cp $MY_BUCKET/hello -\naws s3 rm $MY_BUCKET/hello\n```\n\n### Amazon Kinesis\n\n- `GetRecords`\n- `GetShardIterator`\n- `ListShards`\n\nYou can run the following commands to verify that AWS credentials, region, and IAM permissions are properly configured for Amazon Kinesis:\n\n```bash\nMY_STREAM=<my stream name>\n\n# List the shards in the stream and select the first one.\nSHARD_ID=$(\n    aws kinesis list-shards --stream-name $MY_STREAM \\\n    | jq -r .Shards[0].ShardId\n)\n\n# Get a shard iterator for the selected shard.\nSHARD_ITERATOR=$(\n    aws kinesis get-shard-iterator --stream-name $MY_STREAM \\\n                                   --shard-id $SHARD_ID \\\n                                   
--shard-iterator-type TRIM_HORIZON \\\n    | jq -r .ShardIterator\n)\n\n# Fetch some records from the shard and display the first one.\naws kinesis get-records --shard-iterator $SHARD_ITERATOR | jq -r .Records[0]\n```\n\n## Network configuration\n\n### Security groups\n\nTo communicate with each other, nodes must reside in security groups that allow inbound and outbound traffic on one UDP port and two TCP ports. Please refer to the [ports configuration](/configuration/ports-config.md) page for more details.\n\n## Common errors\n\nIf you set the wrong credentials, you will see this error message with `Unauthorized` in your terminal:\n\n```bash\nCommand failed: Another error occurred. `Metastore error`. Cause: `StorageError(kind=Unauthorized, source=failed to fetch object: s3://quickwit-dev/my-hdfs/metastore.json)`\n```\n\nIf you set the wrong region, you will see this error message with `Internal` instead:\n\n```bash\nCommand failed: Another error occurred. `Metastore error`. Cause: `StorageError(kind=Internal, source=failed to fetch object: s3://your-bucket/your-index/metastore.json)`.\n```\n"
  },
  {
    "path": "docs/guides/schemaless.md",
    "content": "---\ntitle: Schemaless\nsidebar_position: 1\n---\n\n# Strict schema or schemaless?\n\nQuickwit lets you place the cursor on how strict you would like your schema to be. In other words, it is possible to operate Quickwit with a very strict mapping, in an entirely schemaless manner, and anywhere in between. Let's see how this works!\n\n:::note\n\nTo execute the CLI commands throughout this guide, [install](/docs/get-started/installation.md) Quickwit and start a server in a terminal with the following command:\n\n```bash\n./quickwit run\n```\n\n:::\n\n## A strict mapping\n\nThat's the most straightforward approach.\nAs a user, you need to precisely define the list of fields to be ingested by Quickwit.\n\nFor instance, a reasonable mapping for an application log could be:\n\n```yaml title=my_strict_index.yaml\nversion: 0.7\n\nindex_id: my_strict_index\n\ndoc_mapping:\n  mode: strict # <--- The mode attribute\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: server\n      type: text\n      tokenizer: raw\n    - name: message\n      type: text\n      record: position\n    - name: severity\n      tokenizer: raw\n  timestamp_field: timestamp\n\nsearch_settings:\n  default_search_fields: [severity, message]\n\nindexing_settings:\n  commit_timeout_secs: 30\n```\n\nThe `mode` attribute controls what should be done if an ingested document\ncontains a field that is not defined in the document mapping. By default, your index is in the `dynamic` mode. 
In `dynamic` mode, the fields that do not appear in the document mapping will be indexed in a schemaless fashion.\nSee details in the [dynamic mode section](#dynamic-mode).\n\n\nIf `mode` is set to `strict` on the other hand, documents containing fields\nthat are not defined in the mapping will be entirely discarded.\n\nFinally the last possible value for `mode` is `lenient`. In lenient mode, fields that are not present in the field mapping will simply be ignored.\n\n## The dynamic mode: schemaless with a partial schema {#dynamic-mode}\n\n`mode` can take the value: `dynamic`.\nWhen set to dynamic, all extra fields will actually be mapped using a catch-all configuration.\n\nBy default, this catch-all configuration indexes and stores all of these fields, but this can be configured by setting the [`dynamic_mapping` attribute](../configuration/index-config#mode).\nA minimalist, yet perfectly valid and useful index configuration is then:\n\n```yaml title=my_dynamic_index.yaml\nversion: 0.7\nindex_id: my_dynamic_index\ndoc_mapping:\n  mode: dynamic\n```\n\nThis configuration makes it possible to ingest any JSON object and search them.\n\nHowever, the dynamic mode can also be used in conjunction with field mappings.\nThis combination is especially powerful for event logs which cannot be mapped to a single schema.\n\nFor instance, let's consider the following user event log:\n\n```json file title=my_logs.json\n{\n    \"timestamp\": 1653021741,\n    \"user_id\": \"8705a7fak\",\n    \"event_type\": \"login\",\n    \"ab_groups\": [\"phoenix-red-ux\"]\n}\n{\n    \"timestamp\": 1653021746,\n    \"user_id\": \"7618fe06\",\n    \"event_type\": \"order\",\n    \"ab_groups\": [\"phoenix-red-ux\", \"new-ranker\"],\n    \"cart\": [\n        {\n            \"product_id\": 120391,\n            \"product_description\": \"Cherry Pi: A single-board computer that is compatible...\"\n        }\n    ]\n}\n{\n    \"timestamp\": 1653021748,\n    \"user_id\": \"8705a7fak\",\n    \"event_type\": 
\"login\",\n    \"ab_groups\": [\"phoenix-red-ux\"]\n}\n```\n\nEach event type comes with its own set of attributes. Declaring our mapping as the union of all of these event-specific mappings would be a tedious exercise.\n\nInstead, we can cherry-pick the fields that are common to all of the logs, and rely on dynamic mode to handle the rest.\n\n```yaml title=my_dynamic_index.yaml\nversion: 0.7\nindex_id: my_dynamic_index\ndoc_mapping:\n  mode: dynamic\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: user_id\n      type: text\n      tokenizer: raw\n    - name: event_type\n      type: text\n      tokenizer: raw\n  timestamp_field: timestamp\n\nindexing_settings:\n  commit_timeout_secs: 30  # <--- Your document will be searchable ~30 seconds after you ingest them.\n```\n\nOur index is now ready to handle queries like this:\n\n```\nevent_type:order AND cart.product_id:120391\n```\n\nExecute the following commands to create the index, ingest a few documents and search through them:\n\n```bash\ncat << EOF > my_dynamic_index.yaml\nversion: 0.7\nindex_id: my_dynamic_index\ndoc_mapping:\n  mode: dynamic\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: user_id\n      type: text\n      tokenizer: raw\n    - name: event_type\n      type: text\n      tokenizer: raw\n  timestamp_field: timestamp\n\nindexing_settings:\n  commit_timeout_secs: 30\nEOF\n\n# Create index.\n./quickwit index create --index-config ./my_dynamic_index.yaml --overwrite --yes\n\ncat << EOF > 
my_logs.json\n{\"timestamp\":1653021741,\"user_id\":\"8705a7fak\",\"event_type\":\"login\",\"ab_groups\":[\"phoenix-red-ux\"]}\n{\"timestamp\":1653021746,\"user_id\":\"7618fe06\",\"event_type\":\"order\",\"ab_groups\":[\"phoenix-red-ux\",\"new-ranker\"],\"cart\":[{\"product_id\":120391,\"product_description\":\"Cherry Pi: A single-board computer that is compatible...\"}]}\n{\"timestamp\":1653021748,\"user_id\":\"8705a7fak\",\"event_type\":\"login\",\"ab_groups\":[\"phoenix-red-ux\"]}\nEOF\n\n# Ingest documents.\n./quickwit index ingest --index my_dynamic_index --input-path my_logs.json --force\n\n# Execute search query.\n./quickwit index search --index my_dynamic_index --query \"event_type:order AND cart.product_id:120391\"\n\n```\n\n## A schema with schemaless pockets\n\nSome logs isolate these event-specific attributes in a\nsub-field. For instance, let's have a look at an OpenTelemetry JSON log.\n\n```json title=otel_logs.json\n{\n  \"Timestamp\": 1653028151,\n  \"Attributes\": {\n    \"split_id\": \"28f897f2-0419-4d88-8abc-ada72b4b5256\"\n  },\n  \"Resource\": {\n    \"service\": \"donut_shop\",\n    \"k8s_pod_uid\": \"27413708-876b-4652-8ca4-50e8b4a5caa2\"\n  },\n  \"TraceId\": \"f4dbb3edd765f620\",\n  \"SpanId\": \"43222c2d51a7abe3\",\n  \"SeverityText\": \"INFO\",\n  \"SeverityNumber\": 9,\n  \"Body\": \"merge ended\"\n}\n```\n\nIn this log, the `Attributes` and the `Resource` fields contain arbitrary key-values.\n\nQuickwit 0.3 introduced a JSON field type to handle this use case.\nA good index configuration here could be:\n\n```yaml title=otel_logs.yaml\nversion: 0.7\nindex_id: otel_logs\ndoc_mapping:\n  mode: dynamic\n  field_mappings:\n    - name: Timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast_precision: seconds\n      fast: true\n    - name: Attributes\n      type: json\n      tokenizer: raw\n    - name: Resource\n      type: json\n      
tokenizer: raw\n    - name: TraceId\n      type: text\n      tokenizer: raw\n    - name: SpanId\n      type: text\n      tokenizer: raw\n    - name: SeverityText\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: Body\n      type: text\n  timestamp_field: Timestamp\n\nsearch_settings:\n  default_search_fields: [SeverityText, Body, Attributes, Resource]\n\nindexing_settings:\n  commit_timeout_secs: 10\n```\n\nWe can now naturally search our logs with the following query:\n\n```\nmerge AND service:donut_shop\n```\n\nLet's execute the following commands to create the index, ingest a document and execute a search query:\n\n```bash\n# Create index.\n./quickwit index create --index-config ./otel_logs.yaml --overwrite --yes\n\ncat << EOF > otel_logs.json\n{\"Timestamp\":1653028151,\"Attributes\":{\"split_id\":\"28f897f2-0419-4d88-8abc-ada72b4b5256\"},\"Resource\":{\"service\":\"donut_shop\",\"k8s_pod_uid\":\"27413708-876b-4652-8ca4-50e8b4a5caa2\"},\"TraceId\":\"f4dbb3edd765f620\",\"SpanId\":\"43222c2d51a7abe3\",\"SeverityText\":\"INFO\",\"SeverityNumber\":9,\"Body\":\"merge ended\"}\nEOF\n\n# Ingest documents.\n./quickwit index ingest --index otel_logs --input-path otel_logs.json --force\n\n# Execute search query.\n./quickwit index search --index otel_logs --query \"merge AND service:donut_shop\"\n\n```\n"
  },
  {
    "path": "docs/guides/storage-setup/_category_.yaml",
    "content": "label: 'Storage Setup'\nposition: 2\ncollapsed: true\n"
  },
  {
    "path": "docs/guides/storage-setup/aws-s3.md",
    "content": "---\ntitle: AWS S3\nsidebar_position: 1\n---\n\nIn this guide, you will learn how to configure a Quickwit [storage](../../configuration/storage-config) for Amazon S3.\n\n## Set your AWS credentials\n\nA simple way to do it is to declare the environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`. For more details, read our guide on [AWS setup](../aws-setup).\n\n## Set the Metastore URI and default index URI\n\nHere is an example of how to set up your [node config file](../../configuration/node-config) with S3:\n\n```yaml\nmetastore_uri: s3://{my-bucket}/indexes\ndefault_index_uri: s3://{my-bucket}/indexes\n```\n\n## Set the Index URI\n\nHere is an example of how to set up your index URI in the [index config file](../../configuration/index-config):\n```yaml\nindex_uri: s3://{my-bucket}/indexes/{my-index-id}\n```\n"
  },
  {
    "path": "docs/ingest-data/_category_.yaml",
    "content": "label: 'Ingest data'\nposition: 4\ncollapsed: true\n"
  },
  {
    "path": "docs/ingest-data/index.md",
    "content": "---\ntitle: Ingest data from multiple sources\n---\n\nimport DocCardList from '@theme/DocCardList';\n\n<DocCardList />\n\nIt is possible to ingest data with log shippers like [OpenTelemetry](../log-management/overview.md#opentelemetry-agent), [Fluentbit](../log-management/send-logs/using-fluentbit.md), or [Vector](../log-management/send-logs/using-vector.md). It's also possible to send traces from your apps to the [OpenTelemetry Collector](../log-management/send-logs/using-otel-collector-with-helm.md) and then to Quickwit.\n\n\n"
  },
  {
    "path": "docs/ingest-data/ingest-api.md",
    "content": "---\ntitle: Ingest API\ndescription: A short tutorial describing how to send data in Quickwit using the ingest API\ntags: [ingest-api, integration]\nicon_url: /img/tutorials/quickwit-logo.svg\nsidebar_position: 1\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIn this tutorial, we will describe how to send data to Quickwit using the ingest API.\n\nYou will need a [local Quickwit instance](../get-started/installation) up and running to follow this tutorial.\n\nTo start it, run `./quickwit run` in a terminal.\n\n## Create an index\n\nFirst, let's create a schemaless index.\n\n```bash\n# Create the index config file.\ncat << EOF > stackoverflow-schemaless-config.yaml\nversion: 0.7\nindex_id: stackoverflow-schemaless\ndoc_mapping:\n  mode: dynamic\n  dynamic_mapping:\n    tokenizer: default\nindexing_settings:\n  commit_timeout_secs: 30\nEOF\n# Use the CLI to create the index...\n./quickwit index create --index-config stackoverflow-schemaless-config.yaml\n# Or with cURL.\ncurl -XPOST -H 'Content-Type: application/yaml' 'http://localhost:7280/api/v1/indexes' --data-binary @stackoverflow-schemaless-config.yaml\n```\n\nNote that for this example, we configure the dynamic mapping to use the [default tokenizer](../configuration/index-config.md#description-of-available-tokenizers). This is necessary to enable full-text search on all text fields.\n\n## Ingest data\n\nLet's first download a sample of the [StackOverflow dataset](https://www.kaggle.com/stackoverflow/stacksample).\n\n```bash\n# Download the first 10_000 Stackoverflow posts articles.\ncurl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json\n```\n\nYou can ingest data either with the CLI or with cURL. The CLI is more convenient for ingesting several GB as Quickwit may return `429` responses if the ingest queue is full. 
Quickwit CLI will automatically retry ingestion in this case.\n\n```bash\n# Ingest the first 10_000 Stack Overflow posts with the CLI...\n./quickwit index ingest --index stackoverflow-schemaless --input-path stackoverflow.posts.transformed-10000.json --force\n\n# OR with cURL.\ncurl -XPOST -H 'Content-Type: application/json' 'http://localhost:7280/api/v1/stackoverflow-schemaless/ingest?commit=force' --data-binary @stackoverflow.posts.transformed-10000.json\n```\n\n## Execute search queries\n\nYou can now search the index.\n\n```bash\ncurl 'http://localhost:7280/api/v1/stackoverflow-schemaless/search?query=body:python'\n```\n\n## Tear down resources (optional)\n\n```bash\ncurl -XDELETE 'http://localhost:7280/api/v1/indexes/stackoverflow-schemaless'\n```\n\nThis concludes the tutorial. You can now move on to the [next tutorial](/docs/ingest-data/kafka.md) to learn how to ingest data from Kafka.\n\n## Ingest API versions\n\nIn 0.9, Quickwit introduced a new version of the ingest API that enables distributing the indexing in the cluster regardless of the node that received the ingest request. This new ingestion service is often referred to as \"Ingest V2\", in contrast to the legacy ingestion (V1). In upcoming versions, the new ingest API will also be capable of replicating the write-ahead log in order to achieve higher durability.\n\nBy default, both ingestion services are enabled and ingest V2 is used. You can toggle this behavior with the following environment variables:\n\n| Variable              | Description   | Default value |\n| --------------------- | --------------|-------------- |\n| `QW_ENABLE_INGEST_V2` | Start the V2 ingest service and use it by default. | true |\n| `QW_DISABLE_INGEST_V1`| V1 ingest will be used by the APIs only if V2 is disabled. Running V1 alongside V2 is necessary to migrate to V2 without losing existing unindexed V1 logs. 
| false |\n\n:::note\n\nThese configurations drive the ingest service used both by the `api/v1/<index-id>/ingest` endpoint and the [bulk API](../reference/es_compatible_api.md#_bulk--batch-ingestion-endpoint).\n\n:::\n"
  },
  {
    "path": "docs/ingest-data/ingest-local-file.md",
    "content": "---\ntitle: Local file\ndescription: A short tutorial describing how to index a local file with the Quickiwt CLI \ntags: [local-ingest, integration]\nicon_url: /img/tutorials/file-ndjson.svg\nsidebar_position: 2\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIn this tutorial, we will describe how to index a local file with the Quickwit CLI.\n\nYou will need the [Quickwit binary](/docs/get-started/installation.md) to follow this tutorial.\n\n## Create an index\n\nFirst, let's create a schemaless index. We need to start a Quickwit server only for the creation so we will start it and shut it down afterwards.\n\nStart the Quickwit server.\n\n```bash\n./quickwit run\n```\n\nAnd create the index in a separate terminal.\n\n```bash\n# Create the index config file.\ncat << EOF > stackoverflow-schemaless-config.yaml\nversion: 0.7\nindex_id: stackoverflow-schemaless\ndoc_mapping:\n  mode: dynamic\nindexing_settings:\n  commit_timeout_secs: 30\nEOF\n\n./quickwit index create --index-config stackoverflow-schemaless-config.yaml\n```\n\nYou can now shutdown the server by pressing `Ctrl+C` in the first terminal.\n\n## Ingest the file\n\nTo ingest a file, you just need to execute the following command:\n\n```bash\n./quickwit tool local-ingest --index stackoverflow-schemaless --input-path stackoverflow.posts.transformed-10000.json\n```\n\nAfter a few seconds you should see the following output:\n\n```bash\n❯ Ingesting documents locally...\n\n---------------------------------------------------\n Connectivity checklist\n ✔ metastore\n ✔ storage\n ✔ _ingest-cli-source\n\n Num docs   10000 Parse errs     0 PublSplits   1 Input size     6MB Thrghput  3.34MB/s Time 00:00:02\n Num docs   10000 Parse errs     0 PublSplits   1 Input size     6MB Thrghput  2.23MB/s Time 00:00:03\n Num docs   10000 Parse errs     0 PublSplits   1 Input size     6MB Thrghput  1.67MB/s Time 00:00:04\n\nIndexed 10,000 documents in 4s.\nNow, you can query the 
index with the following command:\nquickwit index search --index stackoverflow-schemaless --config ./config/quickwit.yaml --query \"my query\"\nClearing local cache directory...\n✔ Local cache directory cleared.\n✔ Documents successfully indexed.\n```\n\n:::tip\n\nObject store URIs like `s3://mybucket/mykey.json` are also supported as `--input-path`, provided that your environment is configured with the appropriate permissions.\n\n:::\n\n## Tear down resources (optional)\n\nThat's it! You can now tear down the resources you created. You can do so by running the following command:\n\n```bash\n./quickwit run\n```\n\nAnd in a separate terminal:\n\n```bash\n./quickwit index delete --index-id stackoverflow-schemaless\n```\n\nThis concludes the tutorial. You can now move on to the next tutorial.\n"
  },
  {
    "path": "docs/ingest-data/kafka.md",
    "content": "---\ntitle: Kafka\ndescription: A short tutorial describing how to set up Quickwit to ingest data from Kafka in a few minutes\ntags: [kafka, integration]\nicon_url: /img/tutorials/kafka.svg\nsidebar_position: 2\n---\n\nIn this tutorial, we will describe how to set up Quickwit to ingest data from Kafka in a few minutes. First, we will create an index and configure a Kafka source. Then, we will create a Kafka topic and load some events from the [GH Archive](https://www.gharchive.org/) into it. Finally, we will execute some search and aggregation queries to explore the freshly ingested data.\n\n## Prerequisites\n\nYou will need the following to complete this tutorial:\n- A running Kafka cluster (see Kafka [quickstart](https://kafka.apache.org/quickstart))\n- A local Quickwit [installation](/docs/get-started/installation.md)\n\n## Create index\n\nFirst, let's create a new index. Here is the index config and doc mapping corresponding to the schema of the GH Archive events:\n\n```yaml title=\"index-config.yaml\"\n#\n# Index config file for gh-archive dataset.\n#\nversion: 0.7\n\nindex_id: gh-archive\n\ndoc_mapping:\n  field_mappings:\n    - name: id\n      type: text\n      tokenizer: raw\n    - name: type\n      type: text\n      fast: true\n      tokenizer: raw\n    - name: public\n      type: bool\n      fast: true\n    - name: payload\n      type: json\n      tokenizer: default\n    - name: org\n      type: json\n      tokenizer: default\n    - name: repo\n      type: json\n      tokenizer: default\n    - name: actor\n      type: json\n      tokenizer: default\n    - name: other\n      type: json\n      tokenizer: default\n    - name: created_at\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: created_at\n\nindexing_settings:\n  commit_timeout_secs: 10\n```\n\nExecute these Bash commands to download the index config and create the `gh-archive` index:\n\n```bash\n# 
Download GH Archive index config.\nwget -O gh-archive.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/gh-archive/index-config.yaml\n\n# Create index.\n./quickwit index create --index-config gh-archive.yaml\n```\n\n## Create and populate Kafka topic\n\nNow, let's create a Kafka topic and load some events into it.\n\n```bash\n# Create a topic named `gh-archive` with 3 partitions.\nbin/kafka-topics.sh --create --topic gh-archive --partitions 3 --bootstrap-server localhost:9092\n\n# Download a few GH Archive files.\nwget https://data.gharchive.org/2022-05-12-{10..15}.json.gz\n\n# Load the events into Kafka topic.\ngunzip -c 2022-05-12*.json.gz | \\\nbin/kafka-console-producer.sh --topic gh-archive --bootstrap-server localhost:9092\n```\n\n## Create Kafka source\n\n:::note\nThis tutorial assumes that the Kafka cluster is available locally on the default port (9092). If it's not the case, please, update the `bootstrap.servers` parameter accordingly.\n:::\n\n```yaml title=\"kafka-source.yaml\"\n#\n# Kafka source config file.\n#\nversion: 0.8\nsource_id: kafka-source\nsource_type: kafka\nnum_pipelines: 2\nparams:\n  topic: gh-archive\n  client_params:\n    bootstrap.servers: localhost:9092\n```\n\nRun these commands to download the source config file and create the source.\n\n```bash\n# Download Kafka source config.\nwget https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/gh-archive/kafka-source.yaml\n\n# Create source.\n./quickwit source create --index gh-archive --source-config kafka-source.yaml\n```\n:::note\n\nIf you get the following error:\n\n``` Command failed: Topic `gh-archive` has no partitions.```\n\nIt means the Kafka topic `gh-archive` was not properly created in the previous step.\n\n:::\n\n\n\n## Launch indexing and search services\n\nFinally, execute this command to start Quickwit in server mode.\n\n```bash\n# Launch Quickwit services.\n./quickwit run\n```\n\nUnder the hood, this command spawns 
an indexer and a searcher. On startup, the indexer will connect to the Kafka topic specified by the source and start streaming and indexing events from the partitions composing the topic. With the default commit timeout value (see [indexing settings](../configuration/index-config#indexing-settings)), the indexer should publish the first split after approximately 60 seconds.\n\nYou can run this command (in another shell) to inspect the properties of the index and check the current number of published splits:\n\n```bash\n# Display some general information about the index.\n./quickwit index describe --index gh-archive\n```\n\nOnce the first split is published, you can start running search queries. For instance, we can find all the events for the Kubernetes [repository](https://github.com/kubernetes/kubernetes):\n\n```bash\ncurl 'http://localhost:7280/api/v1/gh-archive/search?query=org.login:kubernetes%20AND%20repo.name:kubernetes'\n```\n\nIt is also possible to access these results through the [Quickwit UI](http://localhost:7280/ui/search?query=org.login%3Akubernetes+AND+repo.name%3Akubernetes&index_id=gh-archive&max_hits=10).\n\n\nWe can also group these events by type and count them:\n\n```\ncurl -XPOST -H 'Content-Type: application/json' 'http://localhost:7280/api/v1/gh-archive/search' -d '\n{\n  \"query\":\"org.login:kubernetes AND repo.name:kubernetes\",\n  \"max_hits\":0,\n  \"aggs\":{\n    \"count_by_event_type\":{\n      \"terms\":{\n        \"field\":\"type\"\n      }\n    }\n  }\n}'\n```\n\n\n## Secured Kafka connection (optional)\n\nThe Quickwit Kafka source supports SSL and SASL authentication. 
This is\nparticularly useful when consuming data from an external Kafka service.\n\n:::tip\n\nThe certificate and key files must be present on all Quickwit nodes for the\nKafka source to be created and for the indexing pipelines to run successfully.\n\n:::\n\n### SSL configuration\n\n```yaml\nversion: 0.8\nsource_id: kafka-source-ssl\nsource_type: kafka\nnum_pipelines: 2\nparams:\n  topic: gh-archive\n  client_params:\n    bootstrap.servers: your-kafka-broker.com\n    security.protocol: SSL\n    ssl.ca.location: /path/to/ca.pem\n    ssl.certificate.location: /path/to/service.cert\n    ssl.key.location: /path/to/service.key\n```\n\n### SASL configuration\n\n```yaml\nversion: 0.8\nsource_id: kafka-source-sasl\nsource_type: kafka\nnum_pipelines: 2\nparams:\n  topic: gh-archive\n  client_params:\n    bootstrap.servers: your-kafka-broker.com\n    ssl.ca.location: /path/to/ca.pem\n    security.protocol: SASL_SSL\n    sasl.mechanisms: SCRAM-SHA-256\n    sasl.username: your_sasl_username\n    sasl.password: your_sasl_password\n```\n\n:::note\n\nIf you get the following error:\n\n```Client creation error: ssl.ca.location failed: error:05880002:x509 certificate routines::system lib```\n\nIt usually means the path to the CA certificate is incorrect. Update the\n`ssl.ca.location` parameter accordingly.\n\n:::\n\n## Tear down resources (optional)\n\nLet's delete the files and resources created for the purpose of this tutorial.\n\n```bash\n# Delete Kafka topic.\nbin/kafka-topics.sh --delete --topic gh-archive --bootstrap-server localhost:9092\n\n# Delete index.\n./quickwit index delete --index gh-archive\n\n# Delete source config.\nrm kafka-source.yaml\n```\n\nThis concludes the tutorial. 
If you have any questions regarding Quickwit or encounter any issues, don't hesitate to ask a [question](https://github.com/quickwit-oss/quickwit/discussions) or open an [issue](https://github.com/quickwit-oss/quickwit/issues) on [GitHub](https://github.com/quickwit-oss/quickwit) or contact us directly on [Discord](https://discord.com/invite/MT27AG5EVE).\n"
  },
  {
    "path": "docs/ingest-data/kinesis.md",
    "content": "---\ntitle: Kinesis\ndescription: A short tutorial describing how to set up Quickwit to ingest data from Kinesis in a few minutes\ntags: [aws, integration]\nicon_url: /img/tutorials/aws-kinesis.svg\nsidebar_position: 4\n---\n\nIn this tutorial, we will describe how to set up Quickwit to ingest data from Kinesis in a few minutes. First, we will create an index and configure a Kinesis source. Then, we will create a Kinesis stream and load some events from the [GH Archive](https://www.gharchive.org/) into it. Finally, we will execute some search and aggregation queries to explore the freshly ingested data.\n\n:::caution\nYou will incur some charges for using the Amazon Kinesis service during this tutorial.\n:::\n\n## Prerequisites\n\nYou will need the following to complete this tutorial:\n- The AWS CLI version 2 (see [Getting started with the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-prereqs.html) for prerequisites and installation)\n- A local Quickwit [installation](/docs/get-started/installation.md)\n- [jq](https://stedolan.github.io/jq/download/)\n- [GNU parallel](https://www.gnu.org/software/parallel/)\n\n:::note\n`jq` is required to reshape the events into records ingestable by the Amazon Kinesis API.\n:::\n\n### Create index\n\nFirst, let's create a new index. 
Here is the index config and doc mapping corresponding to the schema of the GH Archive events:\n\n```yaml title=\"index-config.yaml\"\n#\n# Index config file for gh-archive dataset.\n#\nversion: 0.7\n\nindex_id: gh-archive\n\ndoc_mapping:\n  field_mappings:\n    - name: id\n      type: text\n      tokenizer: raw\n    - name: type\n      type: text\n      fast: true\n      tokenizer: raw\n    - name: public\n      type: bool\n      fast: true\n    - name: payload\n      type: json\n      tokenizer: default\n    - name: org\n      type: json\n      tokenizer: default\n    - name: repo\n      type: json\n      tokenizer: default\n    - name: actor\n      type: json\n      tokenizer: default\n    - name: other\n      type: json\n      tokenizer: default\n    - name: created_at\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: created_at\n\nindexing_settings:\n  commit_timeout_secs: 10\n\n```\n\nExecute these Bash commands to download the index config and create the `gh-archive` index.\n\n```bash\n# Download GH Archive index config.\nwget -O gh-archive.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/gh-archive/index-config.yaml\n\n# Create index.\n./quickwit index create --index-config gh-archive.yaml\n```\n\n\n## Create and populate Kinesis stream\n\nNow, let's create a Kinesis stream and load some events into it.\n\n:::tip\nThis step may be fairly slow depending on how much bandwidth is available. The current command limits the volume of data to ingest by taking the first 10 000 lines of every single file downloaded from the GH Archive. If you have enough bandwidth, you can remove it to ingest the whole set of files. 
You can also speed things up by increasing the number of shards and/or the number of jobs launched by `parallel` (`-j` option).\n:::\n\n```bash\n# Create a stream named `gh-archive` with 8 shards.\naws kinesis create-stream --stream-name gh-archive --shard-count 8\n\n# Download a few GH Archive files.\nwget https://data.gharchive.org/2022-05-12-{10..12}.json.gz\n\n# Load the events into the Kinesis stream.\ngunzip -c 2022-05-12*.json.gz | \\\nhead -n 10000 | \\\nparallel --gnu -j8 -N 500 --pipe \\\n'jq --slurp -c \"{\\\"Records\\\": [.[] | {\\\"Data\\\": (. | tostring), \\\"PartitionKey\\\": .id }], \\\"StreamName\\\": \\\"gh-archive\\\"}\" > records-{%}.json && \\\naws kinesis put-records --cli-input-json file://records-{%}.json --cli-binary-format raw-in-base64-out >> out.log'\n```\n\n## Create Kinesis source\n\n```yaml title=\"kinesis-source.yaml\"\n#\n# Kinesis source config file.\n#\nversion: 0.7\nsource_id: kinesis-source\nsource_type: kinesis\nparams:\n  stream_name: gh-archive\n```\n\nRun these commands to download the source config file and create the source.\n\n```bash\n# Download Kinesis source config.\nwget https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/gh-archive/kinesis-source.yaml\n\n# Create source.\n./quickwit source create --index gh-archive --source-config kinesis-source.yaml\n```\n\n:::note\n\nIf this command fails with the following error message:\n```\nCommand failed: Stream gh-archive under account XXXXXXXXX not found.\n\nCaused by:\n    0: Stream gh-archive under account XXXXXXXX not found.\n    1: Stream gh-archive under account XXXXXXXX not found.\n```\n\nit means the Kinesis stream was not properly created in the previous step.\n:::\n\n## Launch indexing and search services\n\nFinally, execute this command to start Quickwit in server mode.\n\n```bash\n# Launch Quickwit services.\n./quickwit run\n```\n\nUnder the hood, this command spawns an indexer and a searcher. 
On startup, the indexer will connect to the Kinesis stream specified by the source and start streaming and indexing events from the shards composing the stream. With the default commit timeout value (see [indexing settings](../configuration/index-config#indexing-settings)), the indexer should publish the first split after approximately 60 seconds.\n\nYou can run this command (in another shell) to inspect the properties of the index and check the current number of published splits:\n\n```bash\n# Display some general information about the index.\n./quickwit index describe --index gh-archive\n```\n\nIt is also possible to get index information through the [Quickwit UI](http://localhost:7280/ui/indexes/gh-archive).\n\nOnce the first split is published, you can start running search queries. For instance, we can find all the events for the Kubernetes [repository](https://github.com/kubernetes/kubernetes):\n\n```bash\ncurl 'http://localhost:7280/api/v1/gh-archive/search?query=org.login:kubernetes%20AND%20repo.name:kubernetes'\n```\n\nIt is also possible to access these results through the [UI](http://localhost:7280/ui/search?query=org.login%3Akubernetes+AND+repo.name%3Akubernetes&index_id=gh-archive&max_hits=10).\n\nWe can also group these events by type and count them:\n\n```\ncurl -XPOST -H 'Content-Type: application/json' 'http://localhost:7280/api/v1/gh-archive/search' -d '\n{\n  \"query\":\"org.login:kubernetes AND repo.name:kubernetes\",\n  \"max_hits\":0,\n  \"aggs\":{\n    \"count_by_event_type\":{\n      \"terms\":{\n        \"field\":\"type\"\n      }\n    }\n  }\n}'\n```\n\n## Tear down resources (optional)\n\nLet's delete the files and resources created for the purpose of this tutorial.\n\n```bash\n# Delete Kinesis stream.\naws kinesis delete-stream --stream-name gh-archive\n\n# Delete index.\n./quickwit index delete --index gh-archive\n\n# Delete source config.\nrm kinesis-source.yaml\n```\n\nThis concludes the tutorial. 
If you have any questions regarding Quickwit or encounter any issues, don't hesitate to ask a [question](https://github.com/quickwit-oss/quickwit/discussions) or open an [issue](https://github.com/quickwit-oss/quickwit/issues) on [GitHub](https://github.com/quickwit-oss/quickwit) or contact us directly on [Discord](https://discord.com/invite/MT27AG5EVE).\n"
  },
  {
    "path": "docs/ingest-data/pulsar.md",
    "content": "---\ntitle: Pulsar\ndescription: A short tutorial describing how to set up Quickwit to ingest data from Pulsar in a few minutes\ntags: [pulsar, integration]\nicon_url: /img/tutorials/pulsar.svg\nsidebar_position: 3\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIn this tutorial, we will describe how to set up Quickwit to ingest data from Pulsar in a few minutes. First, we will create an index and configure a Pulsar source. Then, we will create a Pulsar topic and load some events from the [Stack Overflow dataset](https://www.kaggle.com/stackoverflow/stacksample) into it. Finally, we will execute some searches.\n\n## Prerequisites\n\nYou will need the following to complete this tutorial:\n- A local running [Quickwit instance](/docs/get-started/installation.md)\n- A local running [Pulsar instance](https://pulsar.apache.org/docs/next/getting-started-standalone/)\n\n### Quickwit setup\n\n[Download](/docs/get-started/installation.md) Quickwit and start a server. Then open a new terminal to execute CLI commands with the same binary. \n\n```bash\n./quickwit run\n```\n\nTest that the cluster is running:\n\n```bash\n./quickwit index list\n```\n\n### Pulsar setup\n\n<Tabs>\n\n<TabItem value=\"Local\" label=\"Local\">\n\n```bash\nwget https://archive.apache.org/dist/pulsar/pulsar-2.11.0/apache-pulsar-2.11.0-bin.tar.gz\ntar xvfz apache-pulsar-2.11.0-bin.tar.gz\ncd apache-pulsar-2.11.0\nbin/pulsar standalone\n```\n\n</TabItem>\n\n<TabItem value=\"Docker\" label=\"Docker\">\n\n```bash\ndocker run -it -p 6650:6650 -p 8080:8080 apachepulsar/pulsar:2.11.0 bin/pulsar standalone\n```\n\nSee the details on the [official documentation](https://pulsar.apache.org/docs/next/getting-started-docker/).\n\n</TabItem>\n\n</Tabs>\n\n## Prepare Quickwit\n\nFirst, let's create a new index. 
Here is the index config and doc mapping corresponding to the schema of Stack Overflow posts:\n\n```yaml title=\"index-config.yaml\"\n#\n# Index config file for Stack Overflow dataset.\n#\nversion: 0.7\n\nindex_id: stackoverflow\n\ndoc_mapping:\n  field_mappings:\n    - name: user\n      type: text\n      fast: true\n      tokenizer: raw\n    - name: tags\n      type: array<text>\n      fast: true\n      tokenizer: raw\n    - name: type\n      type: text\n      fast: true\n      tokenizer: raw\n    - name: title\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n      stored: true\n    - name: questionId\n      type: u64\n    - name: answerId\n      type: u64\n    - name: acceptedAnswerId\n      type: u64\n    - name: creationDate\n      type: datetime\n      fast: true\n      input_formats:\n        - rfc3339\n      fast_precision: seconds\n  timestamp_field: creationDate\n\nsearch_settings:\n  default_search_fields: [title, body]\n\nindexing_settings:\n  commit_timeout_secs: 10\n```\n\nExecute these Bash commands to download the index config and create the `stackoverflow` index.\n\n```bash\n# Download stackoverflow index config.\nwget -O stackoverflow.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml\n\n# Create index.\n./quickwit index create --index-config stackoverflow.yaml\n```\n\n## Create the Pulsar source\n\nA Pulsar source just needs to define the list of topics and the instance address.\n\n```yaml title=\"pulsar-source.yaml\"\n#\n# Pulsar source config file.\n#\nversion: 0.7\nsource_id: pulsar-source\nsource_type: pulsar\nparams:\n  topics:\n    - stackoverflow\n  address: pulsar://localhost:6650\n```\n\nRun these commands to download the source config file and create the source.\n\n```bash\n# Download Pulsar source config.\nwget -O stackoverflow-pulsar-source.yaml 
https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/pulsar-source.yaml\n\n# Create source.\n./quickwit source create --index stackoverflow --source-config stackoverflow-pulsar-source.yaml\n```\n\nAs soon as the Pulsar source is created, the Quickwit control plane will ask an indexer to start a new indexing pipeline. You should see logs like the following on the indexer:\n\n```bash\nINFO spawn_pipeline{index=stackoverflow gen=0}:pulsar-consumer{subscription_name=\"quickwit-stackoverflow-pulsar-source\" params=PulsarSourceParams { topics: [\"stackoverflow\"], address: \"pulsar://localhost:6650\", consumer_name: \"quickwit\", authentication: None } current_positions={}}: quickwit_indexing::source::pulsar_source: Seeking to last checkpoint positions. positions={}\n```\n\n## Create and populate a Pulsar topic\n\nWe will use Pulsar's default tenant/namespace `public/default`. To populate the topic, we will use a Python script:\n\n```python title=send_messages_to_pulsar.py\nimport json\nimport pulsar\n\nclient = pulsar.Client('pulsar://localhost:6650')\nproducer = client.create_producer('public/default/stackoverflow')\n\nwith open('stackoverflow.posts.transformed-10000.json', encoding='utf8') as file:\n   for i, line in enumerate(file):\n       producer.send(line.encode('utf-8'))\n       if i % 100 == 0:\n           print(f\"{i}/10000 messages sent.\")\n\nclient.close()\n```\n\nDownload the dataset, install the Python client locally (more details on the [documentation page](https://pulsar.apache.org/docs/2.11.x/client-libraries-python/)), and run the script:\n\n```bash\n# Download the first 10,000 Stack Overflow posts.\ncurl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json\n\n# Install pulsar python client.\n# Requires a python version < 3.11\npip3 install 'pulsar-client==2.10.1'\nwget https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/send_messages_to_pulsar.py\npython3 
send_messages_to_pulsar.py\n```\n\n## Time to search!\n\nYou can run this command to inspect the properties of the index and check the current number of published splits and documents:\n\n```bash\n# Display some general information about the index.\n./quickwit index describe --index stackoverflow\n```\n\nYou will notably see the number of published documents.\n\nYou are now ready to execute some queries.\n\n```bash\ncurl 'http://localhost:7280/api/v1/stackoverflow/search?query=search+AND+engine'\n```\n\nIf your Quickwit server is local, you can access the results through the Quickwit UI on [localhost:7280](http://localhost:7280/ui/search?query=&index_id=stackoverflow&max_hits=10).\n\n\n## Tear down resources (optional)\n\nLet's delete the files and resources created for the purpose of this tutorial.\n\n```bash\n# Delete Quickwit index.\n./quickwit index delete --index stackoverflow --yes\n# Delete Pulsar topic.\nbin/pulsar-admin topics delete stackoverflow\n```\n\nThis concludes the tutorial. If you have any questions regarding Quickwit or encounter any issues, don't hesitate to ask a [question](https://github.com/quickwit-oss/quickwit/discussions) or open an [issue](https://github.com/quickwit-oss/quickwit/issues) on [GitHub](https://github.com/quickwit-oss/quickwit) or contact us directly on [Discord](https://discord.com/invite/MT27AG5EVE).\n"
  },
  {
    "path": "docs/ingest-data/sqs-files.md",
    "content": "---\ntitle: S3 with SQS notifications\ndescription: A short tutorial describing how to set up Quickwit to ingest data from S3 files using an SQS notifier\ntags: [s3, sqs, integration]\nicon_url: /img/tutorials/file-ndjson.svg\nsidebar_position: 5\n---\n\nIn this tutorial, we describe how to set up Quickwit to ingest data from S3\nwith bucket notification events flowing through SQS. We will first create the\nAWS resources (S3 bucket, SQS queue, notifications) using terraform. We will\nthen configure the Quickwit index and file source. Finally we will send some\ndata to the source bucket and verify that it gets indexed.\n\n## AWS resources\n\nThe complete terraform script can be downloaded [here](../assets/sqs-file-source.tf).\n\nFirst, create the bucket that will receive the source data files (NDJSON format):\n\n```\nresource \"aws_s3_bucket\" \"file_source\" {\n  bucket_prefix = \"qw-tuto-source-bucket\"\n}\n```\n\nThen setup the SQS queue that will carry the notifications when files are added\nto the bucket. The queue is configured with a policy that allows the source\nbucket to write the S3 notification messages to it. Also create a dead letter\nqueue (DLQ) to receive the messages that couldn't be processed by the file\nsource (e.g corrupted files). Messages are moved to the DLQ after 5 indexing\nattempts. 
\n\n```\nlocals {\n  sqs_notification_queue_name = \"qw-tuto-s3-event-notifications\"\n}\n\ndata \"aws_iam_policy_document\" \"sqs_notification\" {\n  statement {\n    effect = \"Allow\"\n\n    principals {\n      type        = \"*\"\n      identifiers = [\"*\"]\n    }\n\n    actions   = [\"sqs:SendMessage\"]\n    resources = [\"arn:aws:sqs:*:*:${local.sqs_notification_queue_name}\"]\n\n    condition {\n      test     = \"ArnEquals\"\n      variable = \"aws:SourceArn\"\n      values   = [aws_s3_bucket.file_source.arn]\n    }\n  }\n}\n\nresource \"aws_sqs_queue\" \"s3_events_deadletter\" {\n  name = \"${local.sqs_notification_queue_name}-deadletter\"\n}\n\nresource \"aws_sqs_queue\" \"s3_events\" {\n  name   = local.sqs_notification_queue_name\n  policy = data.aws_iam_policy_document.sqs_notification.json\n\n  redrive_policy = jsonencode({\n    deadLetterTargetArn = aws_sqs_queue.s3_events_deadletter.arn\n    maxReceiveCount     = 5\n  })\n}\n\nresource \"aws_sqs_queue_redrive_allow_policy\" \"s3_events_deadletter\" {\n  queue_url = aws_sqs_queue.s3_events_deadletter.id\n\n  redrive_allow_policy = jsonencode({\n    redrivePermission = \"byQueue\",\n    sourceQueueArns   = [aws_sqs_queue.s3_events.arn]\n  })\n}\n```\n\nConfigure the bucket notification that writes messages to SQS each time a new\nfile is created in the source bucket:\n\n```\nresource \"aws_s3_bucket_notification\" \"bucket_notification\" {\n  bucket = aws_s3_bucket.file_source.id\n\n  queue {\n    queue_arn = aws_sqs_queue.s3_events.arn\n    events    = [\"s3:ObjectCreated:*\"]\n  }\n}\n```\n\n:::note\n\nOnly events of type `s3:ObjectCreated:*` are supported. Other types (e.g.\n`ObjectRemoved`) are acknowledged and a warning is logged.\n\n:::\n\nThe source needs to have access to both the notification queue and the source\nbucket. 
The following policy document contains the minimum permissions required\nby the source:\n\n```\ndata \"aws_iam_policy_document\" \"quickwit_node\" {\n  statement {\n    effect = \"Allow\"\n    actions = [\n      \"sqs:ReceiveMessage\",\n      \"sqs:DeleteMessage\",\n      \"sqs:ChangeMessageVisibility\",\n      \"sqs:GetQueueAttributes\",\n    ]\n    resources = [aws_sqs_queue.s3_events.arn]\n  }\n  statement {\n    effect    = \"Allow\"\n    actions   = [\"s3:GetObject\"]\n    resources = [\"${aws_s3_bucket.file_source.arn}/*\"]\n  }\n}\n```\n\nCreate the IAM user and credentials that will be used to\nassociate this policy with your local Quickwit instance:\n\n```\nresource \"aws_iam_user\" \"quickwit_node\" {\n  name = \"quickwit-filesource-tutorial\"\n  path = \"/system/\"\n}\n\nresource \"aws_iam_user_policy\" \"quickwit_node\" {\n  name   = \"quickwit-filesource-tutorial\"\n  user   = aws_iam_user.quickwit_node.name\n  policy = data.aws_iam_policy_document.quickwit_node.json\n}\n\nresource \"aws_iam_access_key\" \"quickwit_node\" {\n  user = aws_iam_user.quickwit_node.name\n}\n```\n\n\n:::warning\n\nWe don't recommend using IAM user credentials for running Quickwit nodes in\nproduction. This is just a simplified setup for the sake of the tutorial. When\nrunning on EC2/ECS, attach the policy document to an IAM role instead.\n\n:::\n\nDownload the [complete terraform script](../assets/sqs-file-source.tf) and\ndeploy it using `terraform init` and `terraform apply`. After a successful\nexecution, the outputs required to configure Quickwit will be listed. 
You can\ndisplay the values of the sensitive outputs (key id and secret key) with:\n\n\n```bash\nterraform output quickwit_node_access_key_id\nterraform output quickwit_node_secret_access_key\n```\n\n## Run Quickwit\n\n[Install Quickwit locally](/docs/get-started/installation), then in your install\ndirectory, run Quickwit with the necessary access rights by replacing the\n`<quickwit_node_access_key_id>` and `<quickwit_node_secret_access_key>` with the\nmatching Terraform output values:\n\n```bash\nAWS_ACCESS_KEY_ID=<quickwit_node_access_key_id> \\\nAWS_SECRET_ACCESS_KEY=<quickwit_node_secret_access_key> \\\nAWS_REGION=us-east-1 \\\n./quickwit run\n```\n\n## Configure the index and the source\n\nIn another terminal, in the Quickwit install directory, create an index:\n\n```bash\ncat << EOF > tutorial-sqs-file-index.yaml\nversion: 0.7\nindex_id: tutorial-sqs-file\ndoc_mapping:\n  mode: dynamic\nindexing_settings:\n  commit_timeout_secs: 30\nEOF\n\n./quickwit index create --index-config tutorial-sqs-file-index.yaml\n```\n\nReplacing `<notification_queue_url>` with the corresponding Terraform output\nvalue, create a file source for that index:\n\n```bash\ncat << EOF > tutorial-sqs-file-source.yaml\nversion: 0.8\nsource_id: sqs-filesource\nsource_type: file\nnum_pipelines: 2\nparams:\n  notifications:\n    - type: sqs\n      queue_url: <notification_queue_url>\n      message_type: s3_notification\nEOF\n\n./quickwit source create --index tutorial-sqs-file --source-config tutorial-sqs-file-source.yaml\n```\n\n:::tip\n\nThe `num_pipelines` configuration controls how many consumers will poll from the queue in parallel. Choose the number according to the indexer compute resources you want to dedicate to this source. As a rule of thumb, configure 1 pipeline for every 2 cores.\n\n:::\n\n## Ingest data\n\nWe can now ingest data into Quickwit by uploading files to S3. 
If you have the\nAWS CLI installed, run the following command, replacing `<source_bucket_name>`\nwith the associated Terraform output:\n\n```bash\ncurl https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json | \\\n    aws s3 cp - s3://<source_bucket_name>/hdfs-logs-multitenants-10000.json\n```\n\nIf you prefer not to use the AWS CLI, you can also download the file and upload\nit manually to the source bucket using the AWS console.\n\nWait approximately 1 minute and the data should appear in the index:\n\n```bash\n./quickwit index describe --index tutorial-sqs-file\n```\n\n## Tear down the resources\n\nThe AWS resources instantiated in this tutorial don't incur any fixed costs, but\nwe still recommend deleting them when you are done. In the directory with the\nTerraform script, run `terraform destroy`.\n"
  },
  {
    "path": "docs/internals/backward-compatibility.md",
    "content": "# Backward compatibility in Quickwit.\n\nIf you are reading this, chances are you want to make a change to one of the resource\nof Quickwit's meta/config.\n\nThere are basically 3 types of configuration:\n\nEdited by the user and read back from file on startup:\n- QuickwitConfig\n\nEdited by the user then stored in the metastore:\n- IndexConfig\n- SourceConfig\n- VersionedIndexTemplate\n\nAssembled by Quickwit then stored in the metastore:\n- IndexMetadata\n- SplitMetadata\n- FileBackedIndex (file backed metastore only)\n- Manifest (file backed metastore only)\n\nQuickwit currently manages the backward compatibility of all of these resources except the `QuickwitConfig`.\n\nThis document describes how to handle a change, and how to make test such a change, and spot eventual regression.\n\n## How do I update `{IndexMetadata, SplitMetadata, FileBackedIndex, SourceConfig, IndexConfig, Manifest}`?\n\nThere are two types of upgrades:\n- naturally backward compatible change\n- change requiring a new version\n\n### Naturally backward compatible change\n\nSerde offers some attributes to make backward compatible changes to our model.\nFor instance, it is possible to add a new field to a struct and slap\na `serde(default)` attribute to it in order to handle older serialized version of the\nstruct.\n\nIf you want to avoid to generate any diff on the non-regression json files,\nyou can also avoid use `#[serde(skip_serializing_if)]`, although by default,\nit is recommended to not use it.\n\nIt is also possible to rename a field in a backward compatible manner\nby using the `#[serde(alias)]`.\n\nFor this type of change it is not required to update the serialization version.\n\nNevertheless, the regression tests will spot these changes. 
When that happens:\n- modify your model with the help of the attributes above.\n- modify the example for the model by editing its `TestableForRegression` trait implementation.\n- run the backward compatibility tests (see below)\n- check the diff between the `xxx.modified.json` files created and the matching `xxx.json` files. \nIf the changes are acceptable, replace the content of the `xxx.json` files and commit them.\n\nBe particularly careful with changes to files corresponding to the most recent version. If the \nchanges are not compatible, create a new configuration version.\n\n### Change requiring a new version\n\nFor changes requiring a new version, you will have to increment the configuration\nversion. You need to make sure that all of these resources share the same version number.\n\n- update the resource struct you want to change.\n- create a new item in the `VersionedXXXX` struct. It is usually located in a serialize.rs file\n- `Serialize` is not needed for the previous serialized version. We just need `Deserialize`. We can \nremove the `Serialize` impl from the derive statement, and mark it as `skip_serializing` as follows.\n\ne.g.\n```\n#[serde(tag = \"version\")]\npub(crate) enum VersionedXXXXXX {\n    #[serde(rename = \"0\")]\n    V0(#[serde(skip_serializing)] XXXX_V0),\n    #[serde(rename = \"1\")]\n    V1(XXXX_V1),\n}\n```\n- complete the conversion `From<VersionedXXXX> for XXXX` and `From<XXXX> for VersionedXXXX`\n- run the backward compatibility tests (see below)\n- for older versions, check the diff between the `xxx.expected.modified.json` files created and the matching `xxx.expected.json` files. \nIf the changes are acceptable, replace the content of the `xxx.expected.json` files and commit them.\n- check the `yyyy.json` that was created for the new version and commit it along with the `yyyy.expected.json` file (identical).\n- possibly update the generation of the default XXXX instance used for regression. 
It is in the function `TestableForRegression::sample_for_regression`.\n\n\n## Backward compatibility tests\n\nThese tests are used to ensure the backward compatibility of Quickwit.\nRight now, `SplitMetadata`, `IndexMetadata`, `Manifest` and `FileBackedIndex` are tested.\n\nWe want to be able to read all past versions of these files, but only write the most recent format.\n\nThe tests consist of pairs of JSON files, `XXXX.json` and `XXXX.expected.json`:\n- `XXXX.json` is the first serialized value of a new version.\n- `XXXX.expected.json` is the result of `serialize_new_version(deserialize(XXXX.json))`.\n\nFormat changes are automatically detected. There are two possible situations when a format changes.\n\n#### Updating expected.json\n\nWe need to keep `*.expected.json` files up-to-date with the format changes.\n\nThis is done in a semi-automatic fashion.\n\nChecks are performed in two steps:\n- first pass, `deserialize(original_json) == deserialize(expectation_json)`\n- second pass, `expectation_json = serialize(deserialize(expectation_json))`\n\nWhen changing the json format, it is expected to see this test fail.\nThe unit test then updates automatically the `expected.json`. The developer just has to\ncheck the diff of the result (in particular no information should be lost) and commit the \nupdated expected.json files.\n\nAdding this update operation within the unit test is a tad unexpected, but it has the merit of\nintegrating well with CI. If a developer forgets to update the expected.json file,\nthe CI will catch it.\n\n#### Adding a new test case.\n\nIf the serialization format changes, a new version should be created and the unit test will\nautomatically add a new unit test generated from the sample tested objects.\nConcretely, it will just write two files `XXXX.json` and `XXXX.expected.json` for each model.\n\nThe two files will be identical. This is expected as this is a unit test for the most recent \nversion. 
The unit test will start making sense in future updates thanks to the update phase\ndescribed in the previous section.\n"
  },
  {
    "path": "docs/internals/date-time.md",
    "content": "# Datetime format\n\nQuickwit's DateTime is a wrapper around Tantivy's provided DateTime type which is internally represented as an `i64` microseconds value. For optimization reasons, Tantivy stores the value differently at the following locations:\n- DocStore: Dates are stored as they are received from the input document.\n- TermDict: Dates are stored with `seconds` precision.\n- FastField: Dates are stored using the DateTime type configured precision that can take of the following values: `seconds`, `milliseconds`, `microseconds`.\n"
  },
  {
    "path": "docs/internals/ingest-v2.md",
    "content": "# Ingest V2\n\nIngest V2 is the latest ingestion API that is designed to be more efficient and scalable for thousands of indexes than the previous version. It is the default since 0.9.\n\n## Architecture\n\nJust like ingest V1, the new ingest uses [`mrecordlog`](https://github.com/quickwit-oss/mrecordlog) to persist ingested documents that are waiting to be indexed. But unlike V1, which always persists the documents locally on the node that receives them, ingest V2 can dynamically distribute them into WAL units called _shards_. The assigned shard can be local or on another indexer. The control plane is in charge of distributing the shards to balance the indexing work as well as possible across all indexer nodes. The progress within each shard is not tracked as an index metadata checkpoint anymore but in a dedicated metastore `shards` table.\n\nIn the future, the shard based ingest will also be capable of writing a replica for each shard, thus ensuring a high durability of the documents that are waiting to be indexed (durability of the indexed documents is guarantied by the object store).\n\n## Toggling between ingest V1 and V2\n\nVariables driving the ingest configuration are documented [here](../ingest-data/ingest-api.md#ingest-api-versions).\n\nWith ingest V2, you can also activate the `enable_cooperative_indexing` option in the indexer configuration. This setting is useful for deployments with very large numbers (dozens) of actively written indexers, to limit the indexing workbench memory consumption. 
The indexer configuration is in the node configuration:\n\n```yaml\nversion: 0.8\n# [...]\nindexer:\n  enable_cooperative_indexing: true\n```\n\nSee [full configuration example](https://github.com/quickwit-oss/quickwit/blob/main/config/quickwit.yaml).\n\n## Differences between ingest V1 and V2\n\n- V1 uses the `queues/` directory whereas V2 uses the `wal/` directory\n- both V1 and V2 are configured with:\n  - `ingest_api.max_queue_memory_usage` \n  - `ingest_api.max_queue_disk_usage` \n- but ingest V2 can also be configured with:\n  - `ingest_api.replication_factor`, not working yet\n- ingest V1 always writes to the WAL of the node receiving the request, V2 potentially forwards it to another node, dynamically assigned by the control plane to distribute the indexing work more evenly.\n- ingest V2 parses and validates input documents synchronously. Schema and JSON formatting errors are returned in the ingest response (for ingest V1 those errors were available in the server logs only).\n"
  },
  {
    "path": "docs/internals/scroll.md",
    "content": "# Scroll API\n\nThe scroll API has been implemented to offer compatibility with ElasticSearch.\nThe API and the implementation are quirky and are detailed in this document.\n\n## API description\n\nYou can find information about the scroll API here.\nhttps://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#scroll-search-results\nhttps://www.elastic.co/guide/en/elasticsearch/reference/current/scroll-api.html\n\nThe user runs a regular search request with a `scroll` param.\nThe search result then contains the normal response, but a `_scroll` property is added to the search body.\n\nThat id is then meant to be sent to a scroll rest API.\nThis API successive calls will then return pages incrementally.\n\n## Quirk and difficulty.\n\nThe scrolled results should be consistent with the state of the original index.\nFor this reason we need to capture the state of the index at the point of the original request.\n\nIf a network error happens between the client and the server at page N, there is no way for the client to ask the reemission of page N.\nPage N+1 will be returned on the next call.\n\n## Implementation\n\nServer side, we store a replicated scroll context.\n\nIt contains:\n- the detail about the original query (we need to be able to reemit paginated queries)\n- the \"point-in-time\" list of split metadatas used for the query\n- a cached list of partial docs (= not the doc content, just its address and its score) to avoid\nperforming search over and over.\n- the total number of results, in order to append that information to our response.\nsearching at every single scroll requests.\n\nWe use a simple leaderless KV store to keep the state required to run the scroll API.\nWe generate a scroll ULID and use it to get a list of the servers with the best affinity according\nto rendez vous hashing. We then go through them in order and attempt to put that key on up to 2 servers. 
Failures for these PUTs are silent.\n\nFor each call to scroll, one of two things can happen:\n- the partial docs for the requested page are in the partial doc cache. We just run the fetch_docs phase, and update the context with the `start_offset`.\n- the partial docs for the requested page are not in the partial doc cache. We then run a new search query.\n\nWe attempt to fetch `SCROLL_BATCH_LEN` partial docs in order to fill the partial doc address cache for subsequent calls.\n\n# A strange `scroll_id`.\n\nThe Elasticsearch API is needlessly broken as it returns the same scroll_id most of the time.\nThe \"page-change\" mutation is something that happens on the server side.\n\nIn Quickwit on the other hand, the scroll id is the concatenation of:\n- the ULID: used as the address for the search context.\n- the start_offset.\n- the number of hits per page\n- a search_after key\n\nWe only mutate the state server side to update the cache whenever needed.\n\nThe idea here is that if the put request failed, we can still return the right results even if we have an obsolete version of the `ScrollContext`.\n\n# Quickwit implementation (improvement, quirks and shortcuts)\n\nWe do not explicitly protect the splits referenced by our stored Point-In-Time information\nfrom deletion. Instead we simply rely on the existing grace period mechanism (a split\nis only effectively garbage collected 32 minutes after it is marked as deleted).\n\nFor this reason we limit the scroll period to 30 minutes and subsequent scroll calls do not\nextend the scroll period.\n\nAlso thanks to this period, we do not add any extra replication repair mechanism.\nSome scroll calls will end up being broken if we were to remove 2 servers within 30 minutes.\n\nQuickwit caches partial hits in batches of 1000 results.\nQuerying page N leverages `search_after`, so that accessing further pages isn't more\ncostly than accessing the first ones.\n"
  },
  {
    "path": "docs/internals/searcher-split-cache.md",
    "content": "\n# Searcher split cache\n\nQuickwit includes a split cache. It can be useful for specific workloads:\n- to improve performance\n- to reduce the cost associated with GET requests.\n\nThe split cache stores entire split files on disk.\nIt works under the following configurable constraints:\n- number of concurrent downloads\n- amount of disk space\n- number of on-disk files.\n\nSearcher get tipped by indexers about the existence of splits (for which they have the best affinity).\nThey also might learn about split existence, upon read requests.\n\nThe searcher is then in charge of maintaining an in-memory data structure with a bounded list of splits it knows about and their score.\nThe current strategy for admission/evicton is a simple LRU logic.\n\nIf the most recently accessed split not already in cache has been accessed, we consider downloading it.\nIf the limits have been reached, we only proceed to eviction if one of the split currently\nin cache has been less recently accessed.\n\n\n"
  },
  {
    "path": "docs/internals/sorting.md",
    "content": "# Sorting\n\nQuickwit can sort results based on fastfield values or score. This document discuss where and how\n it happens.\nIt also tries to describe optimizations that may be enabled (but are not necessarily implemented)\nby this behavior.\n\n## Behavior\n\nSorting is controlled by the `sort_by` query parameter. It accepts a comma separated list of fields\nto use for sorting. Sorting is Descending by default. The sorting order can be reversed by prefixing\na field name with a hyphen `-`.\nThe special value `_score` means sorting by score, it is also Descending by default.\n\nIn case of equality between two documents, the GlobalDocId, composed of (SplitId, SegmentId, DocId)\nis used as a tie breaker. It is used to sort in the same order as the first field being sorted by.\nThis means it is in Descending order by default.\n\nIf a document doesn't have a value for a sorting field, that document is considered to go after any\ndocument which has a value, independently of sort order. That is, when sorting the value 1,2 and\nNone, ascending sort would give `[1, 2, None]`, and descending sort would give `[2, 1, None]`.\n\nIf a client does not request sorting, documents are sorted using (SplitId, SegmentId, DocId), on\nDescending order. In other words, everything happens as if documents were sorted by a constant\nvalue.\n\n<!--\nTODO we could also say \"it's not sorted\" and add a special `_doc_id` for that. See optimizations\n-->\n\n# Code\n\nA new structure TopK is introduced which is used both for in-split sorting and for merging of\nresults. 
It reduces the risks of inconsistencies between in-split and between-split behavior.\n`SortOrder` gets new `compare` and `compare_opt` methods, which can be used to compare two values with\nrespect to the particular sort order required, and with proper handling of the `None` special case.\n\n# Optimization permitted\n\nBoth orders allow an optimization when sorting by date (either direction), by leveraging splits\nmeta-data to know in advance if a split can, or not, contain better results. Changing the sorting\norder for \"not sorted\" queries makes it possible to leverage SplitId as a way to know whether a split can\ncontain or not better results (if its SplitId is more/less than the current worst best-hit, the\nsplit does not need to be searched).\n\n<!--\nIf we allow unsorted requests, we can go further and stop searching as soon as we have k hits\n(even going as far as stopping mid collection), without even looking at other splits metadata.\nArgument can be made in favor of this because GlobalDocId is not stable, and can change during\na merge, so order is not guaranteed anyway, at least not until Quickwit has support for a Point\nIn Time mechanism.\n-->\n\nThese optimizations have limited to no impact if we give an exact count of matching documents.\nAn option to request only a lower bound would be required for these optimizations to make sense.\n"
  },
  {
    "path": "docs/internals/split-format.md",
    "content": "# Split format\n\nQuickwit indexes are divided into small, independent, immutable pieces of index called splits.\n\nFor convenience, a split consists of a single file, with the extension `.split`.\n\nIn reality, this file hides an internal mini static filesystem,\nwith:\n- the Tantivy index files (`.idx`, `.pos`, `.term`...)\n- a Quickwit-specific file with the list of fields, including those indexed as part of a JSON type.\nIt contains the field name, type and capabilities.\n\nThe split file data layout looks like this:\n- the concatenation of all the files in the split\n- a footer\n\nThe footer has the following format:\n\n- a JSON object called `BundleStorageFileOffsets` containing the `[start, end)` byte-offsets\nof all files.\n- the length of this JSON (8 bytes little endian)\n- a hotcache, a small static cache that contains some important file sections.\n- the length of this hotcache (8 bytes little endian)\n\nThis footer plays a key role in Quickwit.\nIt packs in one read all of the information required to open a split.\n\nQuickwit's metastore stores the byte offsets of this footer, which makes this single read possible when opening a split from remote storage.\n\nIf this footer offset information is not available, for instance if the split is just a file on the filesystem, it is still possible to open it by reading the last 8 bytes of the split (encoding the length of the hotcache), deducing the position of the meta information and unpacking this in turn.\n"
  },
  {
    "path": "docs/internals/template-index.md",
    "content": "# Index template API\n\nIndex templates are a way to create indexes automatically with some given configuration when Quickwit receives documents for an index that doesn't exist yet.\n\nExample templates: [https://github.com/quickwit-oss/quickwit/tree/main/config/templates](https://github.com/quickwit-oss/quickwit/tree/main/config/templates).\n\n# Managing the Stackoverflow template with curl and the REST API\n\n```bash\n# Create the Stackoverflow template.\ncurl -XPOST -H 'Content-Type: application/yaml' 'http://localhost:7280/api/v1/templates' --data-binary @config/templates/stackoverflow.yaml\n\n# List templates.\ncurl 'http://localhost:7280/api/v1/templates'\n\n# Update the Stackoverflow template.\ncurl -XPUT -H 'Content-Type: application/yaml' 'http://localhost:7280/api/v1/templates/stackoverflow' --data-binary @config/templates/stackoverflow.yaml\n\n# Download dataset.\ncurl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json\n\n# Ingest 10k docs into `stackoverflow-foo` index.\ncurl -XPOST \"http://127.0.0.1:7280/api/v1/stackoverflow-foo/ingest\" --data-binary @stackoverflow.posts.transformed-10000.json\n\n# Ingest 10k docs into `stackoverflow-bar` index.\ncurl -XPOST \"http://127.0.0.1:7280/api/v1/stackoverflow-bar/ingest\" --data-binary @stackoverflow.posts.transformed-10000.json\n\n# Delete the Stackoverflow template.\ncurl -XDELETE 'http://localhost:7280/api/v1/templates/stackoverflow'\n```\n"
  },
  {
    "path": "docs/log-management/_category_.yaml",
    "content": "label: 'Log management'\nposition: 5\ncollapsed: true\n"
  },
  {
    "path": "docs/log-management/otel-service.md",
    "content": "---\ntitle: OTEL service\nsidebar_position: 4\n---\n\nQuickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive logs from an OpenTelemetry collector. This endpoint is enabled by default.\n\nWhen enabled, Quickwit will start the gRPC service ready to receive logs from an OpenTelemetry collector. The logs are indexed in the `otel-logs-v0_7` index by default, and this index will be automatically created if not present. The index doc mapping is described in the [OpenTelemetry logs data model](#opentelemetry-logs-data-model) section.\n\nIf, for any reason, you want to disable this endpoint, you can:\n- Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit.\n- Or [configure the node config](/docs/configuration/node-config.md) by setting the indexer setting `enable_otlp_endpoint` to `false`.\n\n```yaml title=node-config.yaml\n# ... Indexer configuration ...\nindexer:\n    enable_otlp_endpoint: false\n```\n\n## Sending logs in your own index\n\nYou can send logs to the index of your choice by setting the header `qw-otel-logs-index` of your gRPC request to the targeted index ID.\n\n\n## OpenTelemetry logs data model\n\nQuickwit sends OpenTelemetry logs into the `otel-logs-v0_7` index by default, which is automatically created if you enable the OpenTelemetry service.\nThe doc mapping of this index, described below, is derived from the [OpenTelemetry logs data model](https://opentelemetry.io/docs/reference/specification/logs/data-model/).\n\n```yaml\n\nversion: 0.7\n\nindex_id: otel-logs-v0_7\n\ndoc_mapping:\n  mode: strict\n  field_mappings:\n    - name: timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n      indexed: false\n      fast: true\n      fast_precision: milliseconds\n    - name: observed_timestamp_nanos\n      type: datetime\n      input_formats: 
[unix_timestamp]\n      output_format: unix_timestamp_nanos\n    - name: service_name\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: severity_text\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: severity_number\n      type: u64\n      fast: true\n    - name: body\n      type: json\n      tokenizer: default\n    - name: attributes\n      type: json\n      tokenizer: raw\n      fast: true\n    - name: dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: trace_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n    - name: span_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n    - name: trace_flags\n      type: u64\n      indexed: false\n    - name: resource_attributes\n      type: json\n      tokenizer: raw\n      fast: true\n    - name: resource_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: scope_name\n      type: text\n      indexed: false\n    - name: scope_version\n      type: text\n      indexed: false\n    - name: scope_attributes\n      type: json\n      indexed: false\n    - name: scope_dropped_attributes_count\n      type: u64\n      indexed: false\n\n  timestamp_field: timestamp_nanos\n\nindexing_settings:\n  commit_timeout_secs: 10\n\nsearch_settings:\n  default_search_fields: [body.message]\n```\n\n## UI Integration\n\nCurrently, Quickwit provides a simplistic UI to get basic information from the cluster, indexes and search documents.\nIf a simple UI is not sufficient for you and you need additional features, Grafana and Elasticsearch query API support are planned for Q2 2023, stay tuned!\n\nYou can also send traces to Quickwit that you can visualize in Jaeger UI, as explained in the following [tutorial](../distributed-tracing/send-traces/using-otel-sdk-python.md).\n\n\n## Known limitations\n\nThere are a few limitations on the log management setup in Quickwit 0.9:\n- The ingest API does not provide 
high durability. This will be fixed in 0.10.\n- OTLP HTTP is only available with the binary Protobuf encoding. OTLP HTTP with JSON encoding is not planned yet, but it could easily be added in the next version. Please open an issue if you need this feature.\n\nIf you are interested in new features or discover other limitations, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit).\n"
  },
  {
    "path": "docs/log-management/overview.md",
    "content": "---\ntitle: Log management with Quickwit\nsidebar_label: Overview\nsidebar_position: 1\n---\n\nQuickwit is built from the ground up to [efficiently index unstructured data](../guides/schemaless.md), and search it effortlessly on cloud storage.\nMoreover, Quickwit supports the OpenTelemetry gRPC and HTTP (protobuf only) protocols out of the box and provides a REST API ready to ingest any JSON-formatted logs.\n**This makes Quickwit a perfect fit for logs!**\n\n![Quickwit Log Management](../assets/images/log-management-overview-light.svg#gh-light-mode-only)![Quickwit Log Management](../assets/images/log-management-overview-dark.svg#gh-dark-mode-only)\n\n## Sending logs to Quickwit\n\n- [Using OTEL collector](send-logs/using-otel-collector.md)\n- [Using OTEL collector with Helm](send-logs/using-otel-collector-with-helm.md)\n- [Using Fluentbit](send-logs/using-fluentbit.md)\n- [Using Vector](send-logs/using-vector.md)\n\n"
  },
  {
    "path": "docs/log-management/send-logs/_category_.yaml",
    "content": "label: 'Sending logs'\nposition: 2\ncollapsed: false\n"
  },
  {
    "path": "docs/log-management/send-logs/send-docker-logs.md",
    "content": "---\ntitle: Send docker logs into Quickwit\nsidebar_label: Docker logs into Quickwit\ndescription: Send docker logs into Quickwit\ntags: [otel, docker, collector, log]\nsidebar_position: 5\n---\n\nTo send docker container logs into Quickwit, you just need to setup an OpenTelemetry Collector with the file logs receiver. In this tutorial, we will use `docker compose` to start the collector and Quickwit.\n\nYou only need a minute to get your Quickwit log UI!\n\n![Quickwit UI Logs](../../assets/images/screenshot-quickwit-ui-docker-compose-logs.png)\n\n## OTEL collector configuration\n\nThe following collector configuration will collect docker logs in `/var/lib/docker/containers/*/*-json.log` (depending on your system, log files can be at a different location), add a few attributes and send them to Quickwit through gRPC at `http://quickwit:7281`.\n\n\n```yaml title=\"otel-collector-config.yaml\"\nreceivers:\n  filelog:\n    include:\n      - /var/lib/docker/containers/*/*-json.log\n    operators:\n     - id: parser-docker\n       timestamp:\n         layout: '%Y-%m-%dT%H:%M:%S.%LZ'\n         parse_from: attributes.time\n       type: json_parser\n     - field: attributes.time\n       type: remove\n     - id: extract_metadata_from_docker_tag\n       parse_from: attributes.attrs.tag\n       regex: ^(?P<name>[^\\|]+)\\|(?P<image_name>[^\\|]+)\\|(?P<id>[^$]+)$\n       type: regex_parser\n       if: 'attributes?.attrs?.tag != nil'\n     - from: attributes.name\n       to: resource[\"docker.container.name\"]\n       type: move\n       if: 'attributes?.name != nil'\n     - from: attributes.image_name\n       to: resource[\"docker.image.name\"]\n       type: move\n       if: 'attributes?.image_name != nil'\n     - from: attributes.id\n       to: resource[\"docker.container.id\"]\n       type: move\n       if: 'attributes?.id != nil'\n     - from: attributes.log\n       to: body\n       type: move\n\nprocessors:\n  batch:\n    timeout: 5s\n\nexporters:\n  
otlp/qw:\n    endpoint: quickwit:7281\n    tls:\n      insecure: true\n\nservice:\n  pipelines:\n    logs:\n      receivers: [filelog]\n      processors: [batch]\n      exporters: [otlp/qw]\n```\n\n## Start the OTEL collector and a Quickwit instance\n\nLet's use `docker compose` with the following configuration:\n\n```yaml title=\"docker-compose.yaml\"\nversion: \"3\"\n\nx-default-logging: &logging\n driver: \"json-file\"\n options:\n   max-size: \"5m\"\n   max-file: \"2\"\n   tag: \"{{.Name}}|{{.ImageName}}|{{.ID}}\"\n\nservices:\n  quickwit:\n    image: quickwit/quickwit:${QW_VERSION:-0.8.1}\n    volumes:\n      - ./qwdata:/quickwit/qwdata\n    ports:\n      - 7280:7280\n    environment:\n      - NO_COLOR=true\n    command: [\"run\"]\n    logging: *logging\n\n  otel-collector:\n    user: \"0\" # Needed to access the directory /var/lib/docker/containers/\n    image: otel/opentelemetry-collector-contrib:${OTEL_VERSION:-0.87.0}\n    volumes:\n      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml\n      - /var/lib/docker/containers:/var/lib/docker/containers:ro\n    command: [\"--config=/etc/otel-collector-config.yaml\"] \n    logging: *logging\n```\n\n\nYou will notice the custom `logging`, the OTEL collector will use that additional information to enrich the logs.\n\n## Run it and search\n\nDownload the configuration files and start the containers:\n   \n```bash\n\nmkdir qwdata\ndocker compose up\n```\n\nAfter a few seconds, you will see the logs in the Quickwit UI [http://localhost:7280](http://localhost:7280).\n\n\nHere is what it should look like:\n\n```json\n{\n  \"attributes\": {\n    \"log.file.name\": \"34ad1a84c71de1d29ad75f99b56d01205e2976440f2398734037151ba2bcde1a-json.log\",\n    \"stream\": \"stdout\"\n  },\n  \"body\": {\n    \"message\": \"2023-10-23T16:39:57.892  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS s3.ListObjects => 200\\n\"\n  },\n  \"observed_timestamp_nanos\": 1698079197979435000,\n  \"service_name\": 
\"unknown_service\",\n  \"severity_number\": 0,\n  \"timestamp_nanos\": 1698079197892726000,\n  \"trace_flags\": 0\n}\n```\n\n\n## Troubleshooting\n\nIt's possible that you get no logs in the UI. In this case, check the `docker compose` logs. The problem can typically come from a wrong configuration of the OTEL collector.\n"
  },
  {
    "path": "docs/log-management/send-logs/using-fluentbit.md",
    "content": "---\ntitle: Send logs using Fluentbit\nsidebar_label: Using Fluentbit\ndescription: A simple tutorial to send logs from Fluentbit to Quickwit in a few minutes.\nicon_url: /img/tutorials/fluentbit-logo.png\ntags: [logs, ingestion]\nsidebar_position: 4\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\n[Fluent Bit](https://fluentbit.io/) is an open-source logging and metrics processor and forwarder to multiple destinations.\n\nIn this guide, we will show you how to connect it to Quickwit.\n\n## Prerequisites\n\n- [Install Quickwit](/docs/get-started/installation.md)\n- Start a Quickwit instance with `./quickwit run`\n- [Install Fluentbit](https://docs.fluentbit.io/manual/installation/getting-started-with-fluent-bit)\n\n\n## Create a simple index for Fluentbit logs\n\nLet's create a schemaless index with only one field `timestamp`. The mode `dynamic` indicates that Quickwit will index all fields even if they are not defined in the doc mapping.\n\n```yaml title=\"index-config.yaml\"\nversion: 0.7\n\nindex_id: fluentbit-logs\n\ndoc_mapping:\n  mode: dynamic\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast: true\n  timestamp_field: timestamp\n\nindexing_settings:\n  commit_timeout_secs: 10\n```\n\n```bash\ncurl -o fluentbit-logs.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/fluentbit-logs/index-config.yaml\n```\n\nAnd then create the index with `cURL` or the `CLI`:\n\n<Tabs>\n\n<TabItem value=\"curl\" label=\"cURL\">\n\n```bash\ncurl -XPOST http://localhost:7280/api/v1/indexes -H \"content-type: application/yaml\" --data-binary @fluentbit-logs.yaml\n```\n\n</TabItem>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n./quickwit index create --index-config fluentbit-logs.yaml\n```\n\n</TabItem>\n\n</Tabs>\n\n\n## Setup Fluentbit\n\nFluentbit configuration file is made of 
inputs and outputs. For this tutorial, we will use a dummy configuration:\n\n``` title=fluent-bit.conf\n[INPUT]\n  Name   dummy\n  Tag    dummy.log\n\n[OUTPUT]\n  Name http\n  Match *\n  URI   /api/v1/fluentbit-logs/ingest\n  Host  localhost\n  Port  7280\n  tls   Off\n  Format json_lines\n  Json_date_key    timestamp\n  Json_date_format epoch\n```\n\nFluentbit will send `dummy` logs to Quickwit endpoint `/api/v1/fluentbit-logs/ingest`.\n\nLet's start Fluentbit.\n\n```bash\nfluent-bit -c fluent-bit.conf\n```\n\n## Search logs\n\nQuickwit is now ingesting logs coming from Fluentbit and you can search them either with `cURL` or by using the UI:\n- `curl \"http://127.0.0.1:7280/api/v1/fluentbit-logs/search?query=severity:DEBUG\"`\n- Open your browser at `http://127.0.0.1:7280/ui/search?query=severity:DEBUG&index_id=fluentbit-logs&max_hits=10`.\n\n\n## Further improvements\n\nYou will soon be able to do aggregations on dynamic fields (planned for 0.7).\n"
  },
  {
    "path": "docs/log-management/send-logs/using-otel-collector-with-helm.md",
    "content": "---\ntitle: Send K8s logs using OTEL collector\nsidebar_label: Using OTEL with Helm\ndescription: Send K8s logs with OTEL collectors and Helm to Quickwit in a few minutes.\ntags: [k8s, helm]\nicon_url: /img/tutorials/helm-otel-k8s-tutorial-illustation.jpg\nsidebar_position: 2\n---\n\nThis guide will help you unlock log search on your k8s cluster. We will first deploy Quickwit and OTEL collectors with [Helm](https://helm.sh/) and then see how to index and search the cluster logs.\n\n## Prerequisites\n\nYou will need the following to complete this tutorial:\n- A Kubernetes cluster.\n- The command line tool [kubectl](https://kubernetes.io/docs/reference/kubectl/).\n- The command line tool [Helm](https://helm.sh/).\n- Access to an object storage service like AWS S3, GCS, Azure blob storage, or Scaleway to store index data.\n\n\n## Install with Helm\n\nLet's first create a namespace to isolate our experiment and set it as the default namespace.\n\n```bash\nkubectl create namespace qw-tutorial\nkubectl config set-context --current --namespace=qw-tutorial\n```\n\n\nThen let's add the [Quickwit](https://github.com/quickwit-oss/helm-charts) and [Otel](https://github.com/open-telemetry/opentelemetry-helm-charts) helm repositories:\n\n```bash\nhelm repo add quickwit https://helm.quickwit.io\nhelm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts\n```\n\nYou should now see the two repos in helm:\n\n```bash\nhelm repo list\nNAME                \tURL\nquickwit            \thttps://helm.quickwit.io\nopen-telemetry      \thttps://open-telemetry.github.io/opentelemetry-helm-charts\n```\n\n\n### Deploy Quickwit\n\nLet's create a basic chart configuration:\n\n```bash\nexport AWS_REGION=us-east-1\nexport AWS_ACCESS_KEY_ID=XXXX\nexport AWS_SECRET_ACCESS_KEY=XXXX\nexport DEFAULT_INDEX_ROOT_URI=s3://your-bucket/indexes\n```\n\n```bash\n# Create Quickwit config file.\necho \"\nsearcher:\n  replicaCount: 1\nindexer:\n  replicaCount: 1\nmetastore:\n  
replicaCount: 1\njanitor:\n  enabled: true\ncontrol_plane:\n  enabled: true\n\nenvironment:\n  # Remove ANSI colors.\n  NO_COLOR: 1\n\n# Quickwit configuration\nconfig:\n  storage:\n    s3:\n      region: ${AWS_REGION}\n      access_key_id: ${AWS_ACCESS_KEY_ID}\n      secret_access_key: ${AWS_SECRET_ACCESS_KEY}\n      # If you are not on AWS S3, you can define a flavor (gcs, minio, garage...)\n      # and additional variables for your object storage.\n      # flavor: gcs\n      # endpoint: https://storage.googleapis.com\n\n  # Metastore on S3.\n  metastore_uri: ${DEFAULT_INDEX_ROOT_URI}\n\n  default_index_root_uri: ${DEFAULT_INDEX_ROOT_URI}\n\n  # Indexer settings\n  indexer:\n    # By activating the OTEL service, Quickwit will be able\n    # to receive gRPC requests from OTEL collectors.\n    enable_otlp_endpoint: true\n\" > qw-tutorial-values.yaml\n```\n\nBefore installing Quickwit chart, make sure you have access to S3 and that you did not make a typo in the `default_index_root_uri`. 
This can be easily done with `aws` CLI with a simple `ls`:\n\n```bash\naws s3 ls ${DEFAULT_INDEX_ROOT_URI}\n```\n\nIf the CLI did not return an error, you are ready to install the chart:\n\n```bash\nhelm install quickwit quickwit/quickwit -f qw-tutorial-values.yaml\n```\n\nIn a few moments, you will see the pods running Quickwit services:\n\n```bash\nkubectl get pods\nNAME                                      READY   STATUS    RESTARTS      AGE\nquickwit-control-plane-7fc495f4c4-slqv4   1/1     Running   2 (84s ago)   87s\nquickwit-indexer-0                        1/1     Running   2 (84s ago)   87s\nquickwit-janitor-7f75f4bc8-jrfv6          1/1     Running   2 (84s ago)   87s\nquickwit-metastore-6989978fc-9s82j        1/1     Running   2 (85s ago)   87s\nquickwit-searcher-0                       1/1     Running   2 (84s ago)   87s\n```\n\nLet's check Quickwit is working:\n\n```bash\nkubectl port-forward svc/quickwit-searcher 7280\n```\n\nThen open your browser `http://localhost:7280/ui/indexes`. You should see the list of indexes. 
If everything is fine, keep the kubectl command running and open a new terminal.\n\n### Deploy OTEL collectors\n\nWe need to configure the collectors a bit in order to:\n- collect logs from k8s\n- enrich the logs with k8s attributes\n- export the logs to the Quickwit indexer.\n\n```bash\necho \"\nmode: daemonset\npresets:\n  logsCollection:\n    enabled: true\n  kubernetesAttributes:\n    enabled: true\nconfig:\n  exporters:\n    otlp:\n      endpoint: quickwit-indexer.qw-tutorial.svc.cluster.local:7281\n      tls:\n        insecure: true\n      # By default, logs are sent to the otel-logs-v0_7 index.\n      # You can customize the index ID by setting this header.\n      # headers:\n      #   qw-otel-logs-index: otel-logs-v0_7\n  service:\n    pipelines:\n      logs:\n        exporters:\n          - otlp\n\" > otel-values.yaml\n```\n\n```bash\nhelm install otel-collector open-telemetry/opentelemetry-collector -f otel-values.yaml\n```\n\nAfter a few seconds, you should see logs on your indexer that show indexing has started. They look like this:\n\n```\n2022-11-30T18:27:37.628Z  INFO spawn_merge_pipeline{index=otel-log-v0 gen=0}: quickwit_indexing::actors::merge_pipeline: Spawning merge pipeline. index_id=otel-log-v0 source_id=_ingest-api-source pipeline_ord=0 root_dir=/quickwit/qwdata/indexing/otel-log-v0/_ingest-api-source merge_policy=StableLogMergePolicy { config: StableLogMergePolicyConfig { min_level_num_docs: 100000, merge_factor: 10, max_merge_factor: 12, maturation_period: 172800s }, split_num_docs_target: 10000000 }\n2022-11-30T18:27:37.628Z  INFO quickwit_serve::grpc: Starting gRPC server. enabled_grpc_services={\"otlp-log\", \"otlp-trace\"} grpc_listen_addr=0.0.0.0:7281\n2022-11-30T18:27:37.628Z  INFO quickwit_serve::rest: Starting REST server. 
rest_listen_addr=0.0.0.0:7280\n2022-11-30T18:27:37.628Z  INFO quickwit_serve::rest: Searcher ready to accept requests at http://0.0.0.0:7280/\n2022-11-30T18:27:42.654Z  INFO quickwit_indexing::actors::indexer: new-split split_id=\"01GK4WPTXK8GH3AGTRNBN9A8YG\" partition_id=0\n2022-11-30T18:27:52.643Z  INFO quickwit_indexing::actors::indexer: send-to-index-serializer commit_trigger=Timeout split_ids=01GK4WPTXK8GH3AGTRNBN9A8YG num_docs=22\n2022-11-30T18:27:52.652Z  INFO index_batch{index_id=otel-log-v0 source_id=_ingest-api-source pipeline_ord=0}:packager: quickwit_indexing::actors::packager: start-packaging-splits split_ids=[\"01GK4WPTXK8GH3AGTRNBN9A8YG\"]\n2022-11-30T18:27:52.652Z  INFO index_batch{index_id=otel-log-v0 source_id=_ingest-api-source pipeline_ord=0}:packager: quickwit_indexing::actors::packager: create-packaged-split split_id=\"01GK4WPTXK8GH3AGTRNBN9A8YG\"\n2022-11-30T18:27:52.653Z  INFO index_batch{index_id=otel-log-v0 source_id=_ingest-api-source pipeline_ord=0}:uploader: quickwit_indexing::actors::uploader: start-stage-and-store-splits split_ids=[\"01GK4WPTXK8GH3AGTRNBN9A8YG\"]\n2022-11-30T18:27:52.733Z  INFO index_batch{index_id=otel-log-v0 source_id=_ingest-api-source pipeline_ord=0}:uploader:stage_and_upload{split=01GK4WPTXK8GH3AGTRNBN9A8YG}:store_split: quickwit_indexing::split_store::indexing_split_store: store-split-remote-success split_size_in_megabytes=0.018351 num_docs=22 elapsed_secs=0.07654519 throughput_mb_s=0.23974074 is_mature=false\n```\n\nIf you see some errors there, it's probably coming from a misconfiguration of your object storage. 
If you need some help, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit) or join us on our [Discord server](https://discord.gg/MT27AG5EVE).\n\n\n### Ready to search logs\n\nYou are now ready to search. Wait about 30 seconds and you will see the first indexed logs: just [open the UI](http://localhost:7280/ui/search?query=*&index_id=otel-logs-v0_7&max_hits=10&sort_by=-timestamp_secs) and play with it. Fun fact: you will see Quickwit's own logs in it :).\n\nExample queries:\n\n- [body.message:quickwit](http://localhost:7280/ui/search?query=body.message:quickwit&index_id=otel-logs-v0_7&max_hits=10&sort_by=-timestamp_secs)\n- [resource_attributes.k8s.container.name:quickwit](http://localhost:7280/ui/search?query=resource_attributes.k8s.container.name%3Aquickwit&index_id=otel-logs-v0_7&max_hits=10&sort_by=-timestamp_secs)\n- [resource_attributes.k8s.container.restart_count:1](http://localhost:7280/ui/search?query=resource_attributes.k8s.container.restart_count%3A1&index_id=otel-logs-v0_7&max_hits=10&sort_by=-timestamp_secs)\n\n\n![UI screenshot](../../assets/screenshot-ui-otel-logs.png)\n\nThat's all, folks!\n\n### Clean up\n\nLet's first delete the index and then uninstall the charts.\n\n```bash\n# Delete the index. 
The command will return the list of deleted split files.\ncurl -XDELETE http://127.0.0.1:7280/api/v1/indexes/otel-logs-v0_7\n\n# Uninstall charts\nhelm uninstall otel-collector\nhelm uninstall quickwit\n\n# Delete namespace\nkubectl delete namespace qw-tutorial\n```\n\nFinally, you need to delete three JSON files created by Quickwit on your object storage:\n\n```bash\n# if your version <= 0.7.1\naws s3 rm ${DEFAULT_INDEX_ROOT_URI}/indexes_states.json\n# if your version > 0.7.1\naws s3 rm ${DEFAULT_INDEX_ROOT_URI}/manifest.json\n# the metastore file of the logs index\naws s3 rm ${DEFAULT_INDEX_ROOT_URI}/otel-logs-v0_7/metastore.json\n# the metastore file of the traces index\naws s3 rm ${DEFAULT_INDEX_ROOT_URI}/otel-traces-v0_7/metastore.json\n```\n\n## Next step\n\nFollow our [tutorial](../../get-started/tutorials/trace-analytics-with-grafana.md) to install the Quickwit Grafana plugin and explore your logs, create dashboards and alerts.\n"
  },
  {
    "path": "docs/log-management/send-logs/using-otel-collector.md",
    "content": "---\ntitle: Send logs from OTEL Collector\nsidebar_label: Using OTEL collector\ndescription: Using OTEL Collector\ntags: [otel, collector, log]\nsidebar_position: 1\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\nIf you already have your own OpenTelemetry Collector and want to export your logs to Quickwit, you need a new OTLP gRPC exporter in your config.yaml:\n\n<Tabs>\n\n<TabItem value=\"macOS_windows\" label=\"macOS/Windows\">\n\n```yaml title=\"otel-collector-config.yaml\"\nreceivers:\n  otlp:\n    protocols:\n      grpc:\n      http:\n\nprocessors:\n  batch:\n\nexporters:\n  otlp/quickwit:\n    endpoint: host.docker.internal:7281\n    tls:\n      insecure: true\n    # By default, logs are sent to the otel-logs-v0_7 index.\n    # You can customize the index ID by setting this header.\n    # headers:\n    #   qw-otel-logs-index: otel-logs-v0_7\nservice:\n  pipelines:\n    logs:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [otlp/quickwit]\n```\n\n</TabItem>\n\n<TabItem value=\"linux\" label=\"Linux\">\n\n```yaml title=\"otel-collector-config.yaml\"\nreceivers:\n  otlp:\n    protocols:\n      grpc:\n      http:\n\nprocessors:\n  batch:\n\nexporters:\n  otlp/quickwit:\n    endpoint: 127.0.0.1:7281\n    tls:\n      insecure: true\n    # By default, logs are sent to the otel-logs-v0_7 index.\n    # You can customize the index ID by setting this header.\n    # headers:\n    #   qw-otel-logs-index: otel-logs-v0_7\n\nservice:\n  pipelines:\n    logs:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [otlp/quickwit]\n```\n\n</TabItem>\n\n</Tabs>\n\n\n## Test your OTEL configuration\n\n1. [Install](../../get-started/installation.md) and start a Quickwit server:\n   \n```bash\n./quickwit run\n```\n\n2. 
Start a collector with the previous config:\n\n<Tabs>\n\n<TabItem value=\"macOS_windows\" label=\"macOS/Windows\">\n\n```bash\ndocker run -v ${PWD}/otel-collector-config.yaml:/etc/otelcol/config.yaml -p 4317:4317 -p 4318:4318 -p 7281:7281 otel/opentelemetry-collector\n```\n\n</TabItem>\n\n<TabItem value=\"linux\" label=\"Linux\">\n\n```bash\ndocker run -v ${PWD}/otel-collector-config.yaml:/etc/otelcol/config.yaml --network=host -p 4317:4317 -p 4318:4318 -p 7281:7281 otel/opentelemetry-collector\n```\n\n</TabItem>\n\n</Tabs>\n\n3. Send a log to your collector with cURL:\n\n```bash\ncurl -XPOST \"http://localhost:4318/v1/logs\" -H \"Content-Type: application/json\" \\\n--data-binary @- << EOF\n{\n \"resource_logs\": [\n   {\n     \"resource\": {\n       \"attributes\": [\n         {\n           \"key\": \"service.name\",\n           \"value\": {\n             \"stringValue\": \"test-with-curl\"\n           }\n         }\n       ]\n     },\n     \"scope_logs\": [\n       {\n         \"scope\": {\n           \"name\": \"manual-test\"\n         },\n         \"log_records\": [\n           {\n             \"time_unix_nano\": \"1678974011000000000\",\n             \"observed_time_unix_nano\": \"1678974011000000000\",\n             \"name\": \"test\",\n             \"severity_text\": \"INFO\"\n           }\n         ]\n       }\n     ]\n   }\n ]\n}\nEOF\n```\n\nYou should see a log on the Quickwit server similar to the following:\n\n```bash\n2023-03-16T13:44:09.369Z  INFO quickwit_indexing::actors::indexer: new-split split_id=\"01GVNAKT5TQW0T2QGA245XCMTJ\" partition_id=6444214793425557444\n```\n\nThis means that Quickwit has received the log and created a new split. Wait for the split to be published before searching for logs.\n"
  },
  {
    "path": "docs/log-management/send-logs/using-vector.md",
    "content": "---\ntitle: Send logs from Vector\nsidebar_label: Using Vector\ndescription: A simple tutorial to send logs from Vector to Quickwit in a few minutes.\nicon_url: /img/tutorials/vector-logo.png\ntags: [logs, ingestion]\nsidebar_position: 3\n---\n\nimport Tabs from '@theme/Tabs';\nimport TabItem from '@theme/TabItem';\n\n[Vector](https://vector.dev/) is an amazing piece of software (in Rust, obviously) that brings a fresh wind to the observability space.\nIt is well known for collecting logs from every part of your infrastructure, transforming and aggregating them, and finally forwarding them to a sink.\n\nIn this guide, we will show you how to connect it to Quickwit.\n\n## Start Quickwit server\n\n<Tabs>\n\n<TabItem value=\"cli\" label=\"CLI\">\n\n```bash\n# Create Quickwit data dir.\nmkdir qwdata\n./quickwit run\n```\n\n</TabItem>\n\n<TabItem value=\"docker\" label=\"Docker\">\n\n```bash\n# Create Quickwit data dir.\nmkdir qwdata\ndocker run --rm -v $(pwd)/qwdata:/quickwit/qwdata -p 7280:7280 quickwit/quickwit run\n```\n\n</TabItem>\n\n</Tabs>\n\n## Taking advantage of Quickwit's native support for logs\n\nLet's embrace the OpenTelemetry standard and take advantage of Quickwit features. With its native support for OpenTelemetry standards, Quickwit already comes with an index called `otel-logs-v0_7` that is compatible with the OpenTelemetry [logs data model](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md). 
This means we can start pushing log data without any prior usual index setup.\n\nThe OpenTelemetry index configuration can be found in the [quickwit-opentelemetry/src/otlp/logs.rs](https://github.com/quickwit-oss/quickwit/blob/main/quickwit/quickwit-opentelemetry/src/otlp/logs.rs) source file.\n\n## Setup Vector\n\nOur sink here will be Quickwit ingest API `http://127.0.0.1:7280/api/v1/otel-logs-v0_7/ingest`.\nTo keep it simple in this tutorial, we will use a log source called `demo_logs` that generates logs in a given format. Let's choose the common `syslog` format\n(Vector does not generate logs in the OpenTelemetry format directly!) and use the transform feature to map the `syslog` format into the OpenTelemetry format.\n\n\n```toml title=vector.toml\n[sources.generate_syslog]\ntype = \"demo_logs\"\nformat = \"syslog\"\ncount = 100000\ninterval = 0.001\n\n[transforms.remap_syslog]\ninputs = [ \"generate_syslog\"]\ntype = \"remap\"\nsource = '''\n  structured = parse_syslog!(.message)\n  .timestamp_nanos = to_unix_timestamp!(structured.timestamp, unit: \"nanoseconds\")\n  .body = structured\n  .service_name = structured.appname\n  .resource_attributes.source_type = .source_type\n  .resource_attributes.host.hostname = structured.hostname\n  .resource_attributes.service.name = structured.appname\n  .attributes.syslog.procid = structured.procid\n  .attributes.syslog.facility = structured.facility\n  .attributes.syslog.version = structured.version\n  .severity_text = if includes([\"emerg\", \"err\", \"crit\", \"alert\"], structured.severity) {\n    \"ERROR\"\n  } else if structured.severity == \"warning\" {\n    \"WARN\"\n  } else if structured.severity == \"debug\" {\n    \"DEBUG\"\n  } else if includes([\"info\", \"notice\"], structured.severity) {\n    \"INFO\"\n  } else {\n   structured.severity\n  }\n  .scope_name = structured.msgid\n  del(.message)\n  del(.timestamp)\n  del(.service)\n  del(.source_type)\n'''\n\n# useful to see the logs in the terminal\n# 
[sinks.emit_syslog]\n# inputs = [\"remap_syslog\"]\n# type = \"console\"\n# encoding.codec = \"json\"\n\n[sinks.quickwit_logs]\ntype = \"http\"\nmethod = \"post\"\ninputs = [\"remap_syslog\"]\nencoding.codec = \"json\"\nframing.method = \"newline_delimited\"\nuri = \"http://127.0.0.1:7280/api/v1/otel-logs-v0_7/ingest\"\n```\nDownload the above Vector config file.\n\n```bash\ncurl -o vector.toml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/vector-otel-logs/vector.toml\n```\n\nNow let's start Vector so that we can start sending logs to Quickwit.\n\n```bash\ndocker run -v $(pwd)/vector.toml:/etc/vector/vector.toml:ro -p 8383:8383 --net=host timberio/vector:0.25.0-distroless-libc\n```\n\n## Search logs\n\nQuickwit is now ingesting logs coming from Vector and you can search them either with `curl` or by using the UI:\n- `curl -XGET http://127.0.0.1:7280/api/v1/otel-logs-v0_7/search?query=severity_text:ERROR`\n- Open your browser at `http://127.0.0.1:7280/ui/search?query=severity_text:ERROR&index_id=otel-logs-v0_7&max_hits=10` and play with it!\n\n## Compute aggregation on severity_text\n\nThe Quickwit UI does not support aggregations yet, but we can use cURL.\n\nLet's craft an aggregation query that counts how many `INFO`, `DEBUG`, `WARN`, and `ERROR` logs we have per minute (all datetimes are stored in microseconds, hence the interval of 60_000_000 microseconds):\n\n```json title=aggregation-query.json\n{\n  \"query\": \"*\",\n  \"max_hits\": 0,\n  \"aggs\": {\n    \"count_per_minute\": {\n      \"histogram\": {\n          \"field\": \"timestamp_nanos\",\n          \"interval\": 60000000\n      },\n      \"aggs\": {\n        \"severity_text_count\": {\n          \"terms\": {\n            \"field\": \"severity_text\"\n          }\n        }\n      }\n    }\n  }\n}\n```\n\n```bash\ncurl -XPOST -H \"Content-Type: application/json\" http://127.0.0.1:7280/api/v1/otel-logs-v0_7/search --data @aggregation-query.json\n```\n\n## Going further\n\nNow you 
can also deploy Grafana and connect it to Quickwit as a data source for queries, dashboards, alerts, and more!\n"
  },
  {
    "path": "docs/log-management/supported-agents.md",
    "content": "---\ntitle: Supported agents\nsidebar_position: 3\n---\n\nQuickwit is compatible with the following agents:\n\n## OpenTelemetry agent\n\nBefore using an [OpenTelemetry collector](https://opentelemetry.io/docs/collector/), check that [Quickwit OpenTelemetry service](otel-service.md) is enabled.\nOnce started, Quickwit is then ready to receive and ingest OpenTelemetry gRPC requests.\n\nHere is a configuration example of an OpenTelemetry agent that sends logs into Quickwit:\n\n```yaml\nmode: daemonset\npresets:\n  logsCollection:\n    enabled: true\n  kubernetesAttributes:\n    enabled: true\nconfig:\n  exporters:\n    otlp:\n      # Replace quickwit-host with the hostname of your Quickwit node/service.\n      # On k8s, it should be of the form `{quickwit-indexer-service-name}.{namespace}.svc.cluster.local:7281\n      endpoint: quickwit-host:7281\n      tls:\n        insecure: true\n  service:\n    pipelines:\n      logs:\n        exporters:\n          - otlp\n```\n\nFind more configuration details on the [OpenTelemetry documentation](https://opentelemetry.io/docs/collector/configuration/). You can also check out our [tutorial to send logs with OTEL collector](send-logs/using-otel-collector.md) to Quickwit.\n\n## HTTP-based agents\n\nIt's also possible to use other agents that send HTTP requests to Quickwit Ingest API. Quickwit also partially supports Elasticseardch `_bulk` API. Thus, there is a good chance that your agent is already compatible with Quickwit.\nCurrently, we have tested the following HTTP-based agents:\n\n- [Vector](send-logs/using-vector.md)\n- [Fluentbit](send-logs/using-fluentbit.md)\n- FluentD (tutorial coming soon)\n- Logstash: Quickwit does not support the Elasticsearch output. 
However, it's possible to send logs with the HTTP output, but only in `json` [format](https://www.elastic.co/guide/en/logstash/current/plugins-outputs-http.html).\n\nQuickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive logs from an OpenTelemetry collector by default.\n\nThe logs received by this endpoint are indexed in the `otel-logs-v0_7` index. This index will be automatically created if not present. The index doc mapping is described in this [section](#opentelemetry-logs-data-model).\n\nYou can also send your logs directly to this index by using the [ingest API](/docs/reference/rest-api.md#ingest-data-into-an-index).\n\n## OpenTelemetry service\n\nQuickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive spans from an OpenTelemetry collector. This endpoint is enabled by default.\n\nWhen enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed in the `otel-traces-v0_7` index by default, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model).\n\nIf, for any reason, you want to disable this endpoint, you can:\n- Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit.\n- Or [configure the node config](/docs/configuration/node-config.md) by setting the indexer setting `enable_otlp_endpoint` to `false`.\n\n```yaml title=node-config.yaml\n# ... Indexer configuration ...\nindexer:\n    enable_otlp_endpoint: false\n```\n"
  },
  {
    "path": "docs/operating/_category_.yaml",
    "content": "label: 'Operating Quickwit'\nposition: 7\ncollapsed: true\n"
  },
  {
    "path": "docs/operating/aws-costs.md",
    "content": "---\ntitle: AWS Cost Optimization\nsidebar_position: 3\n---\n\nQuickwit has been tested on Amazon S3. This page sums up what we have learned from that experience.\n\n## Real World Example\nIn this [blog post](https://quickwit.io/blog/benchmarking-quickwit-engine-on-an-adversarial-dataset#indexing-costs), we indexed 23 TB of data and evaluated performance and costs.\nYou may be able to deduce the costs of indexing and querying on your dataset.\n\n## Data transfers costs and latency\n\nCloud providers charge for data transfers in and out of their networks. In addition, querying an index from a remote machine adds some extra latency.\nFor those reasons, we recommend that you test and use the Quickwit from an instance located within your cloud provider's network.\n\n## Optimizing bandwidth with wisely chosen instances\n\nWe recommend picking instances with high network performance to allow faster downloads from Amazon S3. In our experience, `c5n.2xlarge` instances offer the best bang for your buck.\n\n## Requests cost\n\nA final note on object storage requests costs. These are [quite low](https://aws.amazon.com/s3/pricing/) actually, $0,0004 / 1000 requests for GET and $0.005 / 1000 requests for PUT on AWS S3.\n\n### PUT requests\n\nDuring indexing, Quickwit uploads new splits on Amazon S3 and progressively merges them until they reach 10 million documents that we call “mature splits”. 
Such splits have a typical size between 1GB and 10GB and will usually require 2 PUT requests to be uploaded (1 PUT request / 5GB).\n\nWith default indexing parameters `commit_timeout_secs` of 60 seconds and `merge_policy.merge_factor` of 10 and assuming you want to ingest 1 million documents every minute, this will cost you less than $1 / month.\n\n### GET requests\n\nWhen querying, Quickwit needs to make multiple GET requests:\n\n```jsx\n#num requests = #num splits * ((#num search fields * #num terms * 3) + (#num search fields with fieldnorms enabled) + 1 (timestamp fast field if present)) + #num docs returned\n```\n\nThe above formula assumes that the hotcache is already cached; it is loaded by the first query on each split.\n`#num splits` can be reduced with [pruning](../overview/concepts/querying.md).\n\nWhen positions are not enabled, only 2 GET requests will be executed per term.\n\nThese request costs can add up quickly if you have a high number of splits or QPS > 10.\nDon't hesitate to [contact us](mailto:hello@quickwit.io) if this is the case :).\n"
  },
  {
    "path": "docs/operating/data-directory.md",
    "content": "---\ntitle: Data directory\nsidebar_position: 1\n---\n\nQuickwit operates on a local directory to store temporary files. By default, this working directory is named `qwdata` and placed right next to the Quickwit binary.\n\nLet's have a look at how Quickwit organizes the data directory.\n\n## Data directory layout\n\nWhen operating Quickwit, you will end up with the following tree:\n\n```bash\nqwdata\n├── cache\n│   └── splits\n|       ├── 03BSWV41QNGY5MZV9JYAE1DCGA.split\n│       └── 01GSWV41QNGY5MZV9JYAE1DCB7.split\n├── delete_task_service\n│   └── wikipedia%01H13SVKDS03P%TpCfrA\n├── indexing\n│   ├── wikipedia%01H13SVKDS03P%_ingest-api-source%RbaOAI\n│   └── wikipedia%01H13SVKDS03P%kafka-source%cNqQtI\n├── wal\n│   ├── wal-00000000000000000056\n│   └── wal-00000000000000000057\n└── queues\n    ├── partition_id\n    ├── wal-00000000000000000028\n    └── wal-00000000000000000029\n```\n\n### `/queues` and `/wal` directories\n \nThese directories are created only if the ingest API service is running on your node. They contain write ahead log files of the ingest API to guard against data loss. The `/queues` directory is used by the legacy version of the ingest (sometimes referred to as ingest V1). It is meant to be phased out in upcoming versions of Quickwit. Learn more about ingest API versions [here](../ingest-data/ingest-api.md#ingest-api-versions).\n\nThe log file is truncated when Quickwit commits a split (piece of index), which means that the split is stored on the storage and its metadata are in the metastore.\n\nYou can configure `max_queue_memory_usage` and `max_queue_disk_usage` in the [node config file](../configuration/node-config.md#ingest-api-configuration) to limit the max disk usage.\n\n### `/indexing` directory\n\nThis directory holds the local indexing directory of each indexing source of each index managed by Quickwit. 
In the above tree, you can see two directories corresponding to the `wikipedia` index, which means that the index is currently handling two sources.\n\n\n### `/delete_task_service` directory\n\nThis directory is used by the Janitor service to apply deletes on indexes. During this process, splits are downloaded, a new split is created while applying deletes and uploaded to the target storage. This directory gets created only on nodes running the Janitor service.\n\n### `/cache` directory\n\nThis directory is used for caching splits that will undergo a merge operation to save disk IOPS. Splits are evicted if they are older than two days. If cache limits are reached, the oldest splits are evicted first.\n\nYou can [configure](../configuration/node-config#indexer-configuration) the number of splits the cache can hold with `split_store_max_num_splits` and limit the overall size in bytes of splits with `split_store_max_num_bytes`.\n\n### `/searcher-split-cache` directory\n\nThis directory is used by searcher nodes to cache entire splits and reduce calls to the object store. It won't be created unless you set the `split_cache` fields in the [searcher configuration](../configuration/node-config.md#searcher-configuration).\n\n\n## Setting the right splits cache limits\n\nCaching splits saves disk IOPS when Quickwit needs to merge splits.\n\nSetting the right limits will depend on your [merge policy](../configuration/index-config.md#merge-policies) and the number of partitions you are using. The default splits cache limits should fit most use cases.\n\n### Splits cache with the default configuration\n\nFor a given index, Quickwit commits one split every minute and uses the \"Stable log\" [merge policy](../configuration/index-config.md#merge-policies). By default, this merge policy merges splits in groups of 10, 11, or 12 until splits have more than 10 million documents. 
A split typically undergoes 3 or 4 merges, after which it is considered mature and evicted from the cache.\n\nThe following table shows how many splits will be created after a given amount of time assuming a 20MB/s ingestion rate with a compression ratio of 0.5:\n\n| Time (minutes) | Number of splits                       | Splits size |\n| -------------- | -------------------------------------- | ----------- |\n| 1              | 1                                      | 0.6 GB      |\n| 2              | 2                                      | 1.2 GB      |\n| 10             | 10                                     | 6 GB        |\n| 11             | 1 + 1 (merged once)                    | 6.6 GB      |\n| 21             | 1 + 2 (merged once)                    | 12.6 GB     |\n| 91             | 1 + 9 (merged once)                    | 54.6 GB     |\n| 101            | 1 + 1 (merged twice)                   | 60.6 GB     |\n| 111            | 2 + 1 (merged once) + 1 (merged twice) | 66.6 GB     |\n| 201            | 1 + 0 (merged once) + 2 (merged twice) | 120.6 GB    |\n| ..             | ...                                    |             |\n\nIn this case, the default cache limits of 1000 splits and 100GB are good enough to avoid downloading splits from the storage for the first two merges. This is perfectly fine for a production use case. You may want to increase the splits cache size to avoid any split download.\n\nYou can monitor the download rate with our [indexers dashboard](monitoring.md).\n\n### Splits cache with partitioning\n\nWhen using [partitions](../overview/concepts/querying.md#partitioning), Quickwit will create one split per partition and the number of splits can add up very quickly.\n\nLet's take a concrete example with the following assumptions:\n- A [commit timeout](../configuration/index-config.md#indexing-settings) of 1 minute.\n- A partitioning that has 100 partitions. 
Quickwit will create 100 splits per minute assuming a document of each partition is ingested in one minute.\n- A merge policy that merges splits of the same partition as soon as there are 10 splits.\n\nThe following table shows how many splits will be created after a given amount of time:\n\n| Time (minutes) | Number of splits |\n| ------------ | ---------------- |\n| 1            | 100              |\n| 2            | 200              |\n| 10           | 1000             |\n| 11           | 100 + 100 (merged once) |\n| 21           | 100 + 200 (merged once) |\n| 91           | 100 + 900 (merged once) |\n| 100          | 1000 + 900 (merged once) |\n| 101          | 100 + 0 (merged once) + 100 (merged twice) |\n| 200          | 1000 + 900 (merged once) + 100 (merged twice) |\n| 201          | 100 + 0 (merged once) + 200 (merged twice) |\n\nWith these assumptions, you have to set `split_store_max_num_splits` to at least 1000 to avoid downloading splits from the storage for the first merge operation. And as merging can take a bit of time, you should set `split_store_max_num_splits` to a value that can hold all the splits that are not yet merged plus the incoming splits. A value of 1100 splits should be enough. If you want to keep splits until the second merge, a limit of 2500 splits should be good enough.\n\n## Troubleshooting with a huge number of local splits\n\nWhen starting, Quickwit scans all the splits in the cache directory to know which splits are present locally. This can take a few minutes if you have tens of thousands of splits. On Kubernetes, as your pod can be restarted if it takes too long to start, you may want to clean up the data directory or increase the liveness probe timeout.\nAlso, please report such behavior on [GitHub](https://github.com/quickwit-oss/quickwit) as we can certainly optimize this start phase.\n"
  },
  {
    "path": "docs/operating/monitoring.md",
    "content": "---\ntitle: Monitoring with Grafana\nsidebar_position: 2\n---\n\nYou can monitor your Quickwit cluster with Grafana.\nFollow the tutorial at [Quickwit Monitoring with Grafana](../get-started/tutorials/prometheus-metrics) on how to set it up.\n\nWe provide three Grafana dashboards to help you monitor:\n- [indexers performance](https://github.com/quickwit-oss/quickwit/blob/main/monitoring/grafana/dashboards/indexers.json)\n- [searchers performance](https://github.com/quickwit-oss/quickwit/blob/main/monitoring/grafana/dashboards/searchers.json)\n- [metastore queries](https://github.com/quickwit-oss/quickwit/blob/main/monitoring/grafana/dashboards/metastore.json)\n\nDashboards rely on a prometheus datasource fed with [Quickwit metrics](../reference/metrics.md).\n\n## Screenshots\n\n![Indexers Grafana Dashboard](../assets/images/screenshot-indexers-grafana-dashboard.png)\n\n![Searchers Grafana Dashboard](../assets/images/screenshot-searchers-grafana-dashboard.png)\n\n![Metastore Grafana Dashboard](../assets/images/screenshot-metastore-grafana-dashboard.png)\n"
  },
  {
    "path": "docs/operating/upgrades.md",
    "content": "---\ntitle: Version upgrade\nsidebar_position: 4\n---\n\n## Migration from 0.6.x to 0.7.0\n\nThe format of the index and internal objects stored in the metastore of 0.7 is backward compatible with 0.6.\n\nIf you are using the OTEL indexes and ingesting data into indexes the `otel-logs-v0_6` and `otel-traces-v0_6`, you must stop indexing before upgrading. Indeed, the first time you start Quickwit 0.7, it will update the doc mapping fields of Trace ID and Span ID of those two indexes by changing their input/output formats from `base64` to `hex`. This is automatic: you don't have to perform any manual operation.\n\nQuickwit 0.7 will also create the new index `otel-traces-v0_7`, which is now used by default when ingesting data with the OTEL gRPC and HTTP API. The Jaeger gRPC and HTTP APIs will query both `otel-traces-v0_6` and `otel-traces-v0_7` by default. It's possible to define the index ID you want to use for OTEL gRPC endpoints and Jaeger gRPC API by setting the request header `qw-otel-logs-index` or `qw-otel-traces-index` to the index ID you want to target.\n\n\n## Migration from 0.7.0 to 0.7.1\n\nQuickwit 0.7.1 will create the new index `otel-logs-v0_7` which is now used by default when ingesting data with the OTEL gRPC and HTTP API.\n\nIn the traces index `otel-traces-v0_7`, the `service_name` field is now `fast`. \nNo migration is done if `otel-traces-v0_7` already exists. If you want `service_name` field to be `fast`, you have to delete first the existing `otel-traces-v0_7` index or you need to create your own index.\n\n## Migration from 0.8 to 0.9\n\nQuickwit 0.9 introduces a new ingestion service to to power the ingest and bulk APIs (v2). The new ingest is enabled and used by default, even though the legacy one (v1) remains enabled to finish indexing residual data in the legacy write ahead logs. 
Note that `ingest_api.max_queue_disk_usage` is enforced on both ingest versions separately, which means that the cumulative disk usage might be up to twice this limit.\n\nWhen upgrading to 0.9, we recommend performing a full cluster restart.\n\n<!--\nReasons:\n- Ingested data into previously existing indexes on upgraded indexer nodes will not be picked by the indexing pipelines until the control plane is upgraded.\n- The indexing plan is computed differently in 0.9, all pipelines will be restarted when upgrading the control plane.\n- If you intend to enable compression for the ingest service (`ingest_api.grpc_compression_algorithm`), you must do so in two steps: first, upgrade the indexer nodes with compression disabled, then update the node configuration to enable compression, and finally restart the indexer nodes.\n- Obscure bug raised in https://github.com/quickwit-oss/quickwit/issues/5787#issuecomment-2979470315\n-->\n\nShutdown order:\n1) indexers, searchers and janitor\n2) control plane\n3) metastores\n\nStart up order:\n1) metastores\n2) control plane\n3) indexers, searchers and janitor\n"
  },
  {
    "path": "docs/overview/_category_.yaml",
    "content": "label: 'Introduction'\nposition: 1\ncollapsed: true\n"
  },
  {
    "path": "docs/overview/architecture.md",
    "content": "---\ntitle: Architecture\nsidebar_position: 2\n---\n\nQuickwit distributed search engine relies on 4 major services and one maintenance service:\n\n- The Searchers for executing search queries from the REST API.\n- The Indexers that index data from data sources.\n- The Metastore that stores the index metadata in a PostgreSQL-like database or in a cloud storage file.\n- The Control plane that schedules indexing tasks to the indexers.\n- The Janitor that executes periodic maintenance tasks.\n\nMoreover, Quickwit leverages existing infrastructure by relying on battled-tested technologies for index storage, metadata storage, and ingestion:\n\n- Cloud storage like AWS S3, Google Cloud Storage, Azure Blob Storage or other S3 compatible storage for index storage.\n- Postgresql for metadata storage.\n- Distributed queues like Kafka and Pulsar for ingestion.\n\n## Architecture diagram\n\nThe following diagram shows a Quickwit cluster with its four major components and the janitor whose role is to execute periodic maintenance tasks, see the [Janitor section](#janitor) for more details.\n\n![Quickwit Architecture](../assets/images/quickwit-architecture-light.svg#gh-light-mode-only)![Quickwit Log Management](../assets/images/quickwit-architecture-dark.svg#gh-dark-mode-only)\n\n## Index & splits\n\nA Quickwit index stores documents and makes it possible to query them efficiently. The index organizes documents into a collection of smaller independent indexes called **splits**.\n\nA document is a collection of fields. Fields can be stored in different data structures:\n\n- an inverted index, which enables fast full-text search.\n- a columnar storage called `fast field`. It is the equivalent of doc values in [Lucene](https://lucene.apache.org/). Fast fields are required to compute aggregates over the documents matching a query. They can also allow some advanced types of filtering.\n- a row-storage called the doc store. 
It makes it possible to get the content of the matching documents.\n\nYou can configure your index to control how to map your JSON object to a Quickwit document and, for each field, define whether it should be stored, indexed, or be a fast field. [Learn how to configure your index](../configuration/index-config.md)\n\n### Splits\n\nA split is a small piece of an index identified by a UUID. For each split, Quickwit adds a `hotcache` file alongside the index files. This **hotcache** is what makes it possible for Searchers to open a split in less than 60ms, even on high latency storage.\n\nThe Quickwit index is aware of its splits by keeping splits metadata, notably:\n\n- the split state which indicates if the split is ready for search\n- the min/max time range computed on the timestamp field if present.\n\nThis timestamp metadata can be handy at query time. If the user adds a time range filter to their query, Quickwit will use it to **prune irrelevant splits**.\n\nIndex metadata needs to be accessible by every instance of the cluster. This is made possible thanks to the `metastore`.\n\n### Index storage\n\nQuickwit stores the index data (split files) on cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage or other S3 compatible storage) and also on local disk for single-server deployment.\n\n## Metastore\n\nQuickwit gathers index metadata into a metastore to make it available across the cluster.\n\nOn the write path, indexers push index data on the index storage and publish metadata to the metastore.\n\nOn the read path, for a given query on a given index, a search node will ask the metastore for the index metadata and then use it to do the query planning and finally execute the plan.\n\nIn a clustered deployment, the metastore is typically a traditional RDBMS. PostgreSQL is the only one we support today. 
In a single-server deployment, it’s also possible to rely on a local file or on Amazon S3.\n\n## Quickwit cluster and services\n\n### Cluster formation\n\nQuickwit uses [chitchat](https://github.com/quickwit-oss/chitchat), a cluster membership protocol with failure detection implemented by Quickwit. The protocol is inspired by Scuttlebutt reconciliation and phi-accrual detection, ideas borrowed from Cassandra and DynamoDB.\n\n[Learn more on chitchat](https://github.com/quickwit-oss/chitchat).\n\n### Indexers\n\nSee [dedicated indexing doc page](./concepts/indexing.md).\n\n### Searchers\n\nQuickwit's search cluster has the following characteristics:\n\n- It is composed of stateless nodes: any node can answer any query about any splits.\n- A node can distribute search workload to other nodes.\n- Load-balancing is made with rendezvous hashing to allow for efficient caching.\n\nThis design provides high availability while keeping the architecture simple.\n\n**Workload distribution: root and leaf nodes**\n\nAny search node can handle any search request. A node that receives a query will act as the root node for the span of the request. It will then process it in 3 steps:\n\n- Get the index metadata from the metastore and identify the splits relevant to the query.\n- Distribute the split workload among the nodes of the cluster. These nodes assume the role of leaf nodes.\n- Wait for results from leaf nodes, merge them, and return the aggregated results.\n\n**Stateless nodes**\n\nA Quickwit cluster distributes search workloads while keeping nodes stateless.\n\nThanks to the hotcache, opening a split on Amazon S3 only takes 60ms. It makes it possible to remain totally stateless: a node does not need to know anything about the indexes. 
Adding or removing nodes takes seconds and does not require moving data around.\n\n**Rendezvous hashing**\n\nThe root node uses [Rendezvous hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing) to distribute the workload among leaf nodes. Rendezvous hashing makes it possible to define a node/split affinity function with excellent stability properties when a node joins or leaves the cluster. This trick unlocks efficient caching.\n\nLearn more about query internals on the [querying doc page](./concepts/querying.md).\n\n\n### Control plane\n\nThe control plane service schedules indexing tasks to indexers. The scheduling is executed when the scheduler receives external or internal events and under certain conditions:\n\n- The scheduler listens to metastore events: source create, delete, toggle, or index delete. On each of these events, it will schedule a new plan, named the `desired plan`, and send indexing tasks to the indexers.\n- On every `HEARTBEAT` (3 seconds), the scheduler checks whether the `desired plan` and the indexing tasks running on indexers are in sync. If not, it will reapply the desired plan to indexers.\n- Every minute, the scheduler rebuilds a plan with the latest metastore state, and if it differs from the last applied plan, it will apply the new one. This is necessary as the scheduler may not have received all metastore events due to network issues.\n\n### Janitor\n\nThe Janitor service runs maintenance tasks on indexes: garbage collection, delete query tasks, and retention policy tasks.\n\n## Data sources\n\nQuickwit supports [multiple sources](../ingest-data/) to ingest data from.\n\nA file is ideal for a one-time ingestion like an initial load, while the ingest API or a message queue is ideal to continuously feed data into the system.\n\nQuickwit indexers connect directly to external message queues like Kafka, Pulsar or Kinesis and guarantee exactly-once semantics. 
If you need support for other distributed queues, please vote for yours [here](https://github.com/quickwit-oss/quickwit/issues/1000).\n"
  },
  {
    "path": "docs/overview/concepts/_category_.yaml",
    "content": "label: 'Advanced concepts'\nposition: 3\ncollapsed: true\n"
  },
  {
    "path": "docs/overview/concepts/deletes.md",
    "content": "---\ntitle: Deletes\nsidebar_position: 3\n---\n\nQuickwit supports deletes thanks to the [delete API](../../reference/rest-api.md#delete-api). It's important to note that this feature is mainly intended to comply with GDPR (General Data Protection Regulation) and should be used parsimoniously as deletes are expensive: typically a few queries per hour or day is recommended.\n\n## Delete tasks\n\nA delete task on a given index is executed on all splits created before the delete task creation. This can be a long-running task that could last several hours if the delete query is matching documents present in many splits.\n\nTo track the progress of the execution, each delete task is given a unique and incremental identifier called \"operation stamp\" or `opstamp`. All existing splits will undergo a delete operation and, after its success, each split metadata will be updated with the corresponding operation stamp.\n\nAll splits created after the creation of a delete tasks will have a `opstamp` greater or equal to the `opstamp` of the delete task (greater if other delete tasks have been created at the same moment).\n\nQuickwit batches delete operations on a given split: for example, if a split has it delete `opstamp = n` and the last created delete task has a `opstamp = n + 10`, ten (10) delete queries will be executed at once on the split.\n\n## Delete API\n\nDelete tasks are created through the [Delete REST API](../../reference/rest-api.md#delete-api).\n\n## Pitfalls\n\n### Immature splits\n\nDelete operations are applied only to “mature” splits, that is splits that will no longer undergo merges. Whether a split is mature depends on the [merge policy](../../configuration/index-config.md#merge-policies). It is possible to define `maturation_period` after which a split will be mature. 
Thus, a delete request created at `t0` will first apply deletes to mature splits and, in the worst case, will wait until `t0 + maturation_period` for immature splits to become mature.\n\n\n### Monitoring and dev XP\n\nIt's currently not possible to monitor delete operations. An [issue](https://github.com/quickwit-oss/quickwit/issues/2494) is open to improve the developer experience. Don't hesitate to add your comments to it and follow its progress.\n"
  },
  {
    "path": "docs/overview/concepts/indexing.md",
    "content": "---\ntitle: Indexing\nsidebar_position: 1\n---\n\n## Supported data formats\n\nQuickwit ingests JSON records and refers to them as \"documents\" or \"docs\". Each document must be a JSON object. When ingesting files, documents must be separated by a newline.\n\nQuickwit does not yet support file formats such as `Avro` or `CSV`. Compression formats such as `bzip2` or `gzip` are also not supported yet.\n\n## Data model\n\nQuickwit supports both schemaless indexes and fixed schemas. The \"document mapping\" of an index, also commonly called \"doc mapping\", is a list of field names and types that declares the schema of an index. For a schemaless or mixed fixed schema and schemaless indexing, follow our [guide on schemaless indexing](../../guides/schemaless.md). Additionally, a doc mapping specifies how documents are indexed (tokenizers) and stored (column-oriented vs. row-oriented).\n\n\n## Merge process and merge policy\n\nAn index is broken into immutable splits. The size of a split is defined by the number of documents it carries. A split is considered \"mature\" when its size reaches a threshold defined in the index config as `split_num_docs_target`.\n\nAn indexer buffers incoming documents and produces a new split when the size of the buffer reaches `split_num_docs_target` or `commit_timeout_secs` seconds have passed since the first document has been enqueued, depending on which event occurs first. In the latter case, the indexer generates immature splits. The merge process designates the iterative procedure that groups and merges immature splits together to produce mature splits.\n\nThe merge policy controls the merge algorithm, which is mainly driven by the two parameters `split_num_docs_target` and `merge_factor`. Each time a new split is published, the merge policy examines the list of immature splits and attempts to merge `merge_factor` splits together in order to produce larger splits. 
The merge policy may also decide to merge fewer or more splits together if deemed necessary. Finally, the merge algorithm never merges more than `max_merge_factor` splits together.\n\n### Split store\n\nThe split store is a cache that keeps recently published and immature splits on disk to speed up the merge process. After a successful merge phase, the split store evicts dangling splits.\n\nThe disk space allocated to the split store is controlled by the config parameters `split_store_max_num_splits` and `split_store_max_num_bytes`.\n\n## Data sources\n\nA data source designates the location and set of parameters that allow Quickwit to connect to and ingest data from an external data store, which can be a file, a stream, or a database. Often, Quickwit simply refers to data sources as \"sources\". The indexing engine supports local ad hoc file ingests using [the CLI](/docs/reference/cli#tool-local-ingest) and streaming sources (e.g. the Kafka source). Quickwit can insert data into an index from one or multiple sources. More details can be found [in the source configuration page](https://quickwit.io/docs/configuration/source-config).\n\n## Checkpoint\n\nQuickwit achieves exactly-once processing using checkpoints. For each source, a \"source checkpoint\" records up to which point documents have been processed in the target file or stream. Checkpoints are stored in the metastore and updated atomically each time a new split is published. When an indexing error occurs, the indexing process is resumed right after the last successfully published checkpoint. Internally, a source checkpoint is represented as an object mapping from absolute paths or partition IDs to offsets or sequence numbers.\n"
  },
  {
    "path": "docs/overview/concepts/querying.md",
    "content": "---\ntitle: Querying\nsidebar_position: 2\n---\n\nA search query received by a searcher will be executed using a map-reduce approach following these steps:\n\n1. The Searcher identifies relevant splits based on the request’s [timestamp interval](#time-sharding) and [tags](#tag-pruning).\n2. It distributes the splits workload among other searchers available in the cluster using *[rendez-vous hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing)* to optimize caching and load.\n3. It finally waits for all results, merges them, and returns them to the client.\n\nA search stream query follows the same execution path as for a search query except for the last step: instead of waiting for each Searcher's result, the searcher streams the results as soon as it starts receiving some from a searcher.\n\n### **Time sharding**\n\nOn datasets with a time component, Quickwit will shard data into timestamp-aware splits. With this feature, Quickwit is capable of filtering out most splits before they can make it to the query processing stage, thus reducing drastically the amount of data needed to process the query.\n\nThe following query parameters are available to apply timestamped pruning to your query:\n\n- `startTimestamp`: restricts search to documents with a `timestamp >= start_timestamp`\n- `endTimestamp`: restricts search to documents with a `timestamp < end_timestamp`\n\n### Tag pruning\n\nQuickwit also provides pruning on a second dimension called `tags`. By [setting a field as tagged](../../configuration/index-config.md) Quickwit will generate split metadata at indexing in order to filter splits that match requested tags at query time. 
Note that this metadata is only generated when the cardinality of the field is less than 1,000.\n\nTag pruning is notably useful on multi-tenant datasets.\n\n### Partitioning\n\nQuickwit makes it possible to route documents into different splits based on a partitioning key.\n\nThis feature is especially useful in a context where documents with different\ntags are all mixed together in the same source (usually a Kafka topic).\n\nIn that case, simply marking the field as a tag will have no positive effect on search, as all produced splits will contain almost all tags.\n\nThe `partition_key` attribute (defined in the doc mapping) lets you configure the logic used by Quickwit to route documents into isolated splits.\nQuickwit will also enforce this isolation during merges. This functionality is, in a sense, similar to sharding.\n\nQuickwit supports a simple DSL for partitioning described in the next section.\n\nPartitions and tags are often used to:\n\n- separate `tenants` in a multi-tenant application\n- separate `team` or `application` in an observation logging case.\n\nEmitting many splits can heavily stress an `indexer`. For this reason,\nanother parameter of the doc mapping called `max_num_partitions` acts as a safety valve. If the number of partitions is\nabout to exceed `max_num_partitions`, a single extra partition is created\nand all extra partitions will be grouped together into this special partition.\n\nIf you are expecting 20 partitions, we strongly recommend that you do not set\n`max_num_partitions` to 20, but instead use a larger value (200 for instance).\nQuickwit should handle that number of partitions smoothly, and it will prevent documents belonging to different partitions from being grouped together due to\na few faulty documents.\n\n### Partition key DSL\n\nQuickwit allows you to configure how documents are routed with a simple DSL. 
Here are some sample expressions with a short description of their results:\n\n- `tenant_id`: create one partition per tenant\\_id\n- `tenant_id,app_id`: create one partition per unique combination of tenant\\_id and app\\_id\n- `tenant_id,hash_mod(app_id, 8)`: for each tenant, create up to 8 partitions, each containing data related to some applications\n- `hash_mod((tenant_id,app_id), 50)`: create 50 partitions in total, each containing some combinations of tenants and apps.\n\n\nThe partition key DSL is generated by this grammar:\n```\nRoutingExpr := RoutingSubExpr [ , RoutingExpr ]\nRoutingSubExpr := Identifier [ \\( Arguments \\) ]\nIdentifier := FieldChar [ Identifier ]\nFieldChar := { a..z | A..Z | 0..9 | _ }\nArguments := Argument [ , Arguments ]\nArgument := { \\( RoutingExpr \\) | RoutingSubExpr | DirectValue }\n# We may want other DirectValue in the future\nDirectValue := Number\nNumber := { 0..9 } [ Number ]\n```\nSupported functions are currently:\n- `hash_mod(RoutingExpr, Number)`: hash `RoutingExpr` and divide the result by `Number`, keeping only the remainder.\n\nWhen using `hash_mod` with a tuple of keys, as in `hash_mod((tenant_id,app_id), 50)`, beware that it might route unrelated documents together, which would make tags less effective.\nFor instance, if tenant\\_1,app\\_1 and tenant\\_2,app\\_2 are both sent to partition one, but tenant\\_1,app\\_2 is sent to partition two, a query for tenant\\_1,app\\_2 will\nstill search inside the 1st partition as it will be tagged with tenant\\_1,tenant\\_2,app\\_1 and app\\_2. You should therefore prefer a partition key such as\n`hash_mod(tenant_id, 10),hash_mod(app_id, 5)` which will generate as many splits, but with better tags.\n\n### Caching\n\nQuickwit does caching in many places to deliver a high-performance query engine.\n\nIn memory:\n\n- Hotcache caching: A static cache that holds information about a split file's internal representation. It helps speed up the opening of a split file. 
Its size can be defined via the `split_footer_cache_capacity` configuration parameter.\n- Fast field caching: Fast fields tend to be accessed very frequently by users, especially for stream requests. They are cached in memory, in a cache whose size can be limited by the `fast_field_cache_capacity` configuration value.\n- Partial request caching: In some cases, like when using dashboards, some very similar requests might be issued, with only timestamp bounds changing. Some partial results can be cached to make these requests faster and issue fewer requests to the storage. They are cached in memory, in a cache whose size can be limited by the `partial_request_cache_capacity` configuration value.\n\nOn disk:\n\n- The split cache stores entire splits on disk. It can be enabled by setting the `split_cache` configuration fields. This cache can help reduce object store costs and load. Searchers populate this cache when splits are created or queried and evict them with a simple LRU strategy.\n\nLearn more about cache parameters in the [searcher configuration docs](../../configuration/node-config.md#searcher-configuration).\n\n### Scoring\n\nQuickwit supports sorting docs by their BM25 scores. In order to query by score, [fieldnorms](../../configuration/index-config.md#text-type) must be enabled for the field. By default, BM25 scoring is disabled to improve query latencies, but it can be enabled by setting the `sort_by` option to `_score` in queries.\n\n### Document ID\n\nEach document in Quickwit is assigned a unique document ID, which is a combination of the split ID and the Tantivy DocId within the split. This implies that you cannot assign a custom ID and that the ID changes when splits undergo merges. This ID is used for every search query as sort order (after the explicitly specified sort values) to make the results deterministic.\n"
  },
  {
    "path": "docs/overview/index.md",
    "content": "---\ntitle: Quickwit documentation\nslug: /\nsidebar_position: 1\n---\n\nimport CallToAction from '@theme/CallToAction';\n\nQuickwit is the first engine to execute complex search and analytics queries directly on cloud storage with sub-second latency. Powered by Rust and its decoupled compute and storage architecture, it is designed to be resource-efficient, easy to operate, and scale to petabytes of data.\n\nQuickwit is a great fit for log management, distributed tracing, and generally immutable data such as conversational data (emails, texts, messaging platforms) and event-based analytics.\n\n<CallToAction\nheading='Get started with Quickwit'\ndescription='Get up and running in minutes and start harnessing the power of Quickwit today!'\nbuttontext='GET STARTED'\nto='/docs/main-branch/get-started/quickstart'>\n</CallToAction>\n\n## Use cases\n\n- [Log management](../log-management/overview.md)\n- [Distributed Tracing](../distributed-tracing/overview.md)\n\n## Key concepts\n\n- [Architecture](architecture.md)\n- [Indexing](concepts/indexing.md)\n- [Querying](concepts/querying.md)\n\n## Reference\n\n- [Configuration](../configuration/index.md)\n- [REST API](../reference/rest-api.md)\n- [CLI](../reference/cli.md)\n"
  },
  {
    "path": "docs/overview/introduction.md",
    "content": "---\ntitle: What is Quickwit?\nsidebar_position: 1\n---\n\nQuickwit is the first engine to execute complex search and analytics queries directly on cloud storage with sub-second latency. Powered by Rust and its decoupled compute and storage architecture, it is designed to be resource-efficient, easy to operate, and scale to petabytes of data.\n\nQuickwit is a great fit for log management, distributed tracing, and generally immutable data such as conversational data (emails, texts, messaging platforms) and event-based analytics.\n\n\n## Why Quickwit is different from other search engines?\n\nQuickwit is designed for sub-second search straight from object storage allowing true decoupled compute and storage. And it means a lot for your infrastructure:\n\n- You store once for all your data on cheap, safe and unlimited storage.\n- You scale out your cluster in seconds, no need to move data around.\n- Indexing and search workloads are decoupled, you can scale them independently.\n- Your tenants are easily isolated and you can charge them for their usage.\n\nQuickwit is also designed to index and search semi-structured data. Its schemaless indexing allows you to index JSON document with an arbitrary amount of field without heavily impacting your performance. 
Aggregations are not yet supported, but we are working on it. Stay tuned!\n\n## When to use Quickwit\n\nQuickwit is a great fit for log management, distributed tracing, and generally immutable data such as conversational data (emails, texts, messaging platforms), event-based analytics, audit logs, security logs, and more.\n\nCheck out our guides to see how you can use Quickwit:\n\n- [Log management](../log-management/overview.md)\n- [Distributed Tracing](../distributed-tracing/overview.md)\n\n\n## Key features\n\n- Full-text search and aggregation queries\n- Elasticsearch query language support\n- Sub-second search on cloud storage (Amazon S3, Azure Blob Storage, …)\n- Decoupled compute and storage, stateless indexers & searchers\n- [Schemaless](https://quickwit.io/docs/guides/schemaless) or strict schema indexing\n- Schemaless analytics\n- [Grafana data source](https://github.com/quickwit-oss/quickwit-datasource)\n- [Jaeger-native](https://quickwit.io/docs/distributed-tracing/plug-quickwit-to-jaeger)\n- OTEL-native for [logs](https://quickwit.io/docs/log-management/overview) and [traces](https://quickwit.io/docs/distributed-tracing/overview)\n- Kubernetes ready - See our [helm-chart](https://quickwit.io/docs/deployment/kubernetes)\n- RESTful API\n\n### Enterprise-grade features\n\n- Multiple [data sources](../ingest-data/index.md): Kafka / Kinesis / Pulsar native\n- Multi-tenancy: indexing with many indexes and partitioning\n- Retention policies\n- Delete tasks (for GDPR use cases)\n- Distributed and highly available* engine that scales out in seconds (HA indexing only with Kafka)\n\n## When not to use Quickwit\n\nUse cases where you would likely *not* want to use Quickwit include:\n\n- You need a low-latency search for e-commerce websites.\n- Your data is mutable.\n\n## Time to discover Quickwit\n\n- [Quickstart](../get-started/quickstart.md)\n- [Concepts](architecture.md)\n- [Last release blog post](https://quickwit.io/blog/quickwit-0.7)\n"
  },
  {
    "path": "docs/reference/_category_.yaml",
    "content": "label: 'Reference'\nposition: 11\ncollapsed: true\n"
  },
  {
    "path": "docs/reference/aggregation.md",
    "content": "---\ntitle: Aggregations API\nsidebar_position: 30\n---\n\nAn aggregation summarizes your data as statistics on buckets or metrics.\n\nAggregations can provide answers to questions like:\n\n- What is the average price of all sold articles?\n- How many errors with status code 500 do we have per day?\n- What is the average listing price of cars grouped by color?\n\nThere are two categories: [Metrics](#metric-aggregations) and [Buckets](#bucket-aggregations).\n\n#### Prerequisite\n\nTo be able to use aggregations on a field, the field needs to have a fast field index created. A fast field index is a columnar storage,\nwhere documents values are extracted and stored.\n\nExample to create a fast field on text for term aggregations.\n```yaml\nname: category\ntype: text\ntokenizer: raw\nrecord: basic\nfast: true\n```\n\nSee the [index configuration](../configuration/index-config.md) page for more details and examples.\n\n#### API Endpoint\n\nThe endpoints for aggregations are the search endpoints:\n- Quickwit API: `api/v1/<index id>/search`\n- Elasticsearch API: `api/v1/_elastic/<index_id>/_search`.\n\n#### Format\n\nThe aggregation request and result de/serialize into elasticsearch compatible JSON.\nIf not documented otherwise you should be able to drop in your elasticsearch aggregation queries.\n\nIn some examples below is not the full request shown, but only the payload for `aggregations`.\n\n#### Example\n\nRequest\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"sites_and_aqi\": {\n            \"terms\": {\n                \"field\": \"County\",\n                \"size\": 2,\n                \"order\": { \"average_aqi\": \"asc\" }\n            },\n            \"aggs\": {\n                \"average_aqi\": {\n                    \"avg\": { \"field\": \"AQI\" }\n                }\n            }\n        }\n    }\n}\n```\n\n\nResponse\n```json\n...\n\"aggregations\": {\n    \"sites_and_aqi\": {\n      \"buckets\": 
[\n        {\n          \"average_aqi\": {\n            \"value\": 32.62267569707098\n          },\n          \"doc_count\": 56845,\n          \"key\": \"臺東縣\"\n        },\n        {\n          \"average_aqi\": {\n            \"value\": 35.97893635571055\n          },\n          \"doc_count\": 28675,\n          \"key\": \"花蓮縣\"\n        }\n      ],\n      \"sum_other_doc_count\": 1872055\n    }\n}\n```\n\n### Supported Aggregations\n\n - Bucket\n    - [Histogram](#histogram)\n    - [DateHistogram](#date-histogram)\n    - [Range](#range)\n    - [Terms](#terms)\n- Metric\n    - [Average](#average)\n    - [Count](#count)\n    - [Max](#max)\n    - [Min](#min)\n    - [Stats](#stats)\n    - [Sum](#sum)\n    - [Percentiles](#percentiles)\n    - [Cardinality](#cardinality)\n\n\n## Bucket Aggregations\n\nBucketAggregations create buckets of documents. Each bucket is associated with a rule which determines whether or not a document falls into it.\nIn other words, the buckets effectively define document sets. Buckets are not necessarily disjunct, therefore a document can fall into multiple buckets.\nIn addition to the buckets themselves, the bucket aggregations also compute and return the number of documents for each bucket.\nBucket aggregations, as opposed to metric aggregations, can hold sub-aggregations.\nThese sub-aggregations will be aggregated for the buckets created by their “parent” bucket aggregation.\nThere are different bucket aggregators, each with a different “bucketing” strategy.\nSome define a single bucket, some define a fixed number of multiple buckets, and others dynamically create the buckets during the aggregation process.\n\n\n### Histogram\n\nA histogram is a type of bucket aggregation where documents are grouped into buckets based on a fixed interval. Each document's value is \"rounded down\" to the nearest bucket boundary.\n\nE.g. if we have a price 18 and an interval of 5, the document will fall into the bucket with the key 15. 
The formula used for this is: `((val - offset) / interval).floor() * interval + offset`.\n\n#### Histogram on datetime fields\n\nSee [`DateHistogram`](#date-histogram) for more convenient API for `datetime` fields.\n\nFields of type `datetime` are handled the same way as any numeric field. However, all values in the requests such as intervals, offsets, bounds, and range boundaries need to be expressed in milliseconds.\n\nHistogram with one bucket per day on a `datetime` field. `interval` needs to be provided in milliseconds.\nIn the following example, we grouped documents per day (`1 day = 86400000 milliseconds`).\nThe returned format is currently fixed at `RFC3339`.\n\n##### Request\n```json skip\n{\n  \"query\": \"*\",\n  \"max_hits\": 0,\n  \"aggs\": {\n    \"count_per_day\":{\n      \"histogram\":{\n        \"field\": \"datetime\",\n        \"interval\": 86400000\n      }\n    }\n  }\n}\n```\n##### Response\n\n```json skip\n{\n  ...\n  \"aggregations\": {\n    \"count_per_day\": {\n      \"buckets\": [\n        {\n          \"doc_count\": 1,\n          \"key\": 1546300800000000.0,\n          \"key_as_string\": \"2019-01-01T00:00:00Z\"\n        },\n        {\n          \"doc_count\": 2,\n          \"key\": 1546560000000000.0,\n          \"key_as_string\": \"2019-01-04T00:00:00Z\"\n        }\n      ]\n    }\n  }\n}\n```\n\n\n#### Returned Buckets\n\nBy default buckets are returned between the min and max value of the documents, including empty buckets. 
Setting `min_doc_count > 0` will filter out empty buckets.\n\nThe value range of the buckets can be extended via [`extended_bounds`](#extended_bounds) or limited via [`hard_bounds`](#hard_bounds).\n\n#### Example\n\n```json\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"prices\": {\n            \"histogram\": {\n                \"field\": \"price\",\n                \"interval\": 10\n            }\n        }\n    }\n}\n```\n\n#### Parameters\n\n###### **field**\n\nThe field to aggregate on.\n\nCurrently this aggregation only works on fast fields of type `u64`, `f64`, `i64`, and `datetime`.\n\n###### **keyed**\n\nChanges the response format from an array to a hashmap; the `key` in the bucket will be the `key` in the hashmap.\n\n###### **interval**\n\nThe interval to chunk your data range. Each bucket spans a value range of [0..interval). Must be larger than 0.\n\n###### **offset**\n\nIntervals implicitly define an absolute grid of buckets `[interval * k, interval * (k + 1))`.\nOffset makes it possible to shift this grid into `[offset + interval * k, offset + interval * (k + 1))`. Offset has to be in the range [0, interval).\n\nAs an example, if there are two documents with values 8 and 12 and an interval of 10.0, they would fall into the buckets with the keys 0 and 10. With offset 5 and interval 10, they would both fall into the bucket with the key 5 and the range [5..15).\n\n```json\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"prices\": {\n            \"histogram\": {\n                \"field\": \"price\",\n                \"interval\": 10,\n                \"offset\": 2.5\n            }\n        }\n    }\n}\n```\n\n\n###### **min_doc_count**\n\nThe minimum number of documents in a bucket to be returned. 
Defaults to 0.\n\n###### **hard_bounds**\n\nLimits the data range to [min, max] closed interval.\nThis can be used to filter values if they are not in the data range.\nhard_bounds only limits the buckets, to force a range set both `extended_bounds` and `hard_bounds` to the same range.\n\n```json\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"prices\": {\n            \"histogram\": {\n                \"field\": \"price\",\n                \"interval\": 10,\n                \"hard_bounds\": {\n                    \"min\": 0,\n                    \"max\": 100\n                }\n            }\n        }\n    }\n}\n```\n\n###### **extended_bounds**\n\nCan be set to extend your bounds. The range of the buckets is by default defined by the data range of the values of the documents. As the name suggests, this can only be used to extend the value range. If the bounds for min or max are not extending the range, the value has no effect on the returned buckets.\nCannot be set in conjunction with `min_doc_count` > 0, since the empty buckets from extended bounds would not be returned.\n\n```json\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"prices\": {\n            \"histogram\": {\n                \"field\": \"price\",\n                \"interval\": 10,\n                \"extended_bounds\": {\n                    \"min\": 0,\n                    \"max\": 100\n                }\n            }\n        }\n    }\n}\n```\n\n### Date Histogram\n\n`DateHistogram` is similar to `Histogram`, but it can only be used with [datetime type](../configuration/index-config#datetime-type) and provides a more convenient API to define intervals.\n\nLike the histogram, values are rounded down to the closest bucket.\n\nThe returned format is currently fixed at `Rfc3339`.\n\n##### Limitations\nOnly fixed time intervals via the `fixed_interval` parameter are supported.\nThe parameters `interval` and `calendar_interval` are 
unsupported.\n\n##### Request\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"sales_over_time\": {\n            \"date_histogram\": {\n                \"field\": \"sold_at\",\n                \"fixed_interval\": \"30d\",\n                \"offset\": \"-4d\"\n            }\n        }\n    }\n}\n```\n##### Response\n\n```json skip\n{\n    ...\n    \"aggregations\": {\n        \"sales_over_time\" : {\n            \"buckets\" : [{\n                \"key_as_string\" : \"2015-01-01T00:00:00Z\",\n                \"key\" : 1420070400000,\n                \"doc_count\" : 4\n            }]\n        }\n    }\n}\n```\n\n\n#### Parameters\n\n###### **field**\n\nThe field to aggregate on.\n\nCurrently this aggregation only works on fast fields of type `datetime`.\n\n###### **keyed**\n\nChange response format from an array to a hashmap, `key` in the bucket will be the `key` in the hashmap.\n\n###### **interval**\n\nThe interval to chunk your data range. Each bucket spans a value range of [0..interval). Must be larger than 0.\n\nFixed intervals are configured with the `fixed_interval` parameter.\nFixed intervals are a fixed number of SI units and\nnever deviate, regardless of where they fall on the calendar. One second is always\ncomposed of 1000ms. This allows fixed intervals to be specified in any multiple of the\nsupported units. However, it means fixed intervals cannot express other units such as\nmonths, since the duration of a month is not a fixed quantity. Attempting to specify a\ncalendar interval like month or quarter will return an error.\n\nThe accepted units for fixed intervals are:\n* `ms`: milliseconds\n* `s`: seconds. Defined as 1000 milliseconds each.\n* `m`: minutes. Defined as 60 seconds each (60_000 milliseconds).\n* `h`: hours. Defined as 60 minutes each (3_600_000 milliseconds).\n* `d`: days. 
Defined as 24 hours (86_400_000 milliseconds).\n\nFractional time values are not supported, but this can be addressed by shifting to another\ntime unit (e.g., `1.5h` could instead be specified as `90m`).\n\n###### **offset**\n\nIntervals implicitly define an absolute grid of buckets `[interval * k, interval * (k + 1))`.\nOffset makes it possible to shift this grid into `[offset + interval * k, offset + interval * (k + 1))`. Offset has to be in the range [0, interval).\n\nThis is especially useful when using `fixed_interval`, to shift the first bucket, e.g., to the start of the year.\n\nThe `offset` parameter has the same syntax as the `fixed_interval` parameter, but also allows for negative values.\n\n###### **min_doc_count**\n\nThe minimum number of documents in a bucket to be returned. Defaults to 0.\n\n###### **hard_bounds**\nSame as in [`Histogram`](#hard_bounds) but `min` and `max` parameters need to be set as timestamps with millisecond precision.\n\n###### **extended_bounds**\nSame as in [`Histogram`](#extended_bounds) but `min` and `max` parameters need to be set as timestamps with millisecond precision.\n\n### Range\n\nProvide user-defined buckets to aggregate on. Two special buckets will automatically be created to cover the whole range of values.\nThe provided buckets have to be contiguous. 
During the aggregation, the values extracted from the fast_field field will be checked against each bucket range.\nNote that this aggregation includes the from value and excludes the to value for each range.\n\n#### Limitations/Compatibility\n\nOverlapping ranges are not yet supported.\n\n##### Request\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"my_scores\": {\n            \"range\": {\n                \"field\": \"score\",\n                \"ranges\": [\n                    { \"to\": 3.0, \"key\": \"low\" },\n                    { \"from\": 3.0, \"to\": 7.0, \"key\": \"medium-low\" },\n                    { \"from\": 7.0, \"to\": 20.0, \"key\": \"medium-high\" },\n                    { \"from\": 20.0, \"key\": \"high\" }\n                ]\n            }\n        }\n    }\n}\n```\n\n##### Response\n\n```json skip\n{\n    ...\n    \"aggregations\": {\n        \"my_scores\" : {\n            \"buckets\": [\n                {\"key\": \"low\", \"doc_count\": 0, \"to\": 3.0},\n                {\"key\": \"medium-low\", \"doc_count\": 10, \"from\": 3.0, \"to\": 7.0},\n                {\"key\": \"medium-high\", \"doc_count\": 10, \"from\": 7.0, \"to\": 20.0},\n                {\"key\": \"high\", \"doc_count\": 80, \"from\": 20.0}\n            ]\n        }\n    }\n}\n```\n\n#### Parameters\n\n###### **keyed**\n\nChange response format from an array to a hashmap, the serialized range will be the `key` in the hashmap.\nIf a custom `key` is provided, it will be used instead.\n\n###### **field**\n\nThe field to aggregate on.\n\nCurrently this aggregation only works on fast fields of type `u64`, `f64`, `i64`, and `datetime`.\n\n###### **ranges**\n\nThe list of buckets, with `from` and `to` values.\nThe `from` value is inclusive in the range.\nThe `to` value is not inclusive in the range.\n`key` is optional, and will be used as the bucket key in the response.\n\nThe first bucket can omit the `from` value, and the last bucket the `to` 
value.\nNote that this aggregation includes the `from` value and excludes the `to` value for each range. Extra buckets will be created up to the first `to` and from the last `from`, if necessary.\n\n### Terms\n\nCreates a bucket for every unique term and counts the number of occurrences.\n\nRequest\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"genres\": {\n            \"terms\": { \"field\": \"genre\" }\n        }\n    }\n}\n```\n\nResponse\n```json\n...\n\"aggregations\": {\n    \"genres\": {\n        \"doc_count_error_upper_bound\": 0,\n        \"sum_other_doc_count\": 0,\n        \"buckets\": [\n            { \"key\": \"drumnbass\", \"doc_count\": 6 },\n            { \"key\": \"reggae\", \"doc_count\": 4 },\n            { \"key\": \"jazz\", \"doc_count\": 2 }\n        ]\n    }\n}\n```\n\n\n#### Document count error\nIn Quickwit, we have one segment per split. Therefore, the results returned from a split are equivalent to the results returned from a segment.\nTo improve performance, results from one split are cut off at `shard_size`.\nWhen combining results of multiple splits, terms that\ndon't make it in the top n of a result from a split increase the theoretical upper bound error by the lowest\nterm count.\n\nEven with a larger `shard_size` value, doc_count values for a terms aggregation may be\napproximate. As a result, any sub-aggregations on the terms aggregation may also be approximate.\n`sum_other_doc_count` is the number of documents that didn’t make it into the top size\nterms. If this is greater than 0, the terms agg had to throw away some\nbuckets, either because they didn’t fit into `size` on the root node or they didn’t fit into\n`shard_size` on the leaf node.\n\n#### Per bucket document count error\nIf you set the `show_term_doc_count_error` parameter to true, the terms aggregation will include\ndoc_count_error_upper_bound, which is an upper bound to the error on the doc_count returned by\neach split. 
It’s the sum of the size of the largest bucket on each split that didn’t fit\ninto `shard_size`.\n\n#### Parameters\n\n###### **field**\n\nThe field to aggregate on.\n\nCurrently term aggregation only works on fast fields of type `text`, `f64`, `i64` and `u64`.\n\n###### **size**\n\nBy default, the top 10 terms with the most documents are returned. Larger values for size are more expensive.\n\n###### **shard_size**\n\nTo obtain more accurate results, we fetch more than the `size` from each segment/split.\n\nIncreasing this value will enhance accuracy but will also increase CPU/memory usage. \nRefer to the [`document count error`](#document-count-error) section for more information on how `shard_size` impacts accuracy.\n\n`shard_size` represents the number of terms that are returned from one split. \nFor example, if there are 100 splits and `shard_size` is set to 1000, the root node may receive up to 100_000 terms to merge. \nAssuming an average cost of 50 bytes per term, this would require up to 5MB of memory. \nThe actual number of terms sent to the root depends on the number of splits handled by one node and how the intermediate results can be merged (e.g., the cardinality of the terms).\n\nNote on differences between Quickwit and Elasticsearch:\n* Unlike Elasticsearch, Quickwit does not use global ordinals, so serialized terms need to be sent to the root node.\n* The concept of shards in Elasticsearch differs from splits in Quickwit. In Elasticsearch, a shard contains up to 200M documents and is a collection of segments. In contrast, a Quickwit split comprises a single segment, typically with 5M documents. 
Therefore, `shard_size` in Elasticsearch applies to a group of segments, whereas in Quickwit, it applies to a single segment.\n\nDefaults to `size * 10`.\n\n###### **show_term_doc_count_error**\n\nIf you set the show_term_doc_count_error parameter to true, the terms aggregation will include doc_count_error_upper_bound, which is an upper bound to the error on the doc_count returned by each split.\nIt’s the sum of the size of the largest bucket on each split that didn’t fit into `shard_size`.\n\nDefaults to true when ordering by count desc.\n\n\n###### **min_doc_count**\n\nFilter all terms that are lower than `min_doc_count`. Defaults to 1.\n\n_Expensive_ : When set to 0, this will return all terms in the field.\n\n###### **missing**\n\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"genre\", \"missing\": \"NO_DATA\" }\n```\n\n###### **order**\n\nSet the order. String is here a target, which is either “_count”, “_key”, or the name of a metric sub_aggregation.\nSingle value metrics like average can be addressed by its name. Multi value metrics like stats are required to address their field by name e.g. “stats.avg”.\n_Limitation_ : Ordering is only supported by one property currently. 
Passing an array for `order` is _not_ supported `\"order\": [{ \"average_price\": \"asc\" }, { \"_key\": \"asc\" }]`.\n\nOrder alphabetically\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"genres\": {\n            \"terms\": {\n                \"field\": \"genre\",\n                \"order\": { \"_key\": \"asc\" }\n            }\n        }\n    }\n}\n```\n\n\nOrder by sub_aggregation\n\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"articles_by_price\": {\n            \"terms\": {\n                \"field\": \"article_name\",\n                \"order\": { \"average_price\": \"asc\" }\n            },\n            \"aggs\": {\n                \"average_price\": {\n                    \"avg\": { \"field\": \"price\" }\n                }\n            }\n        }\n    }\n}\n```\n\n\n\n## Metric Aggregations\n\nThe aggregations in this family compute metrics based on values extracted from the documents that are being aggregated.\nValues are extracted from the fast field of the document. Some aggregations output a single numeric metric (e.g. Average)\nand are called single-value numeric metrics aggregation, others generate multiple metrics (e.g. 
Stats) and are called multi-value numeric metrics aggregation.\n\nIn contrast to bucket aggregations, metrics don't allow sub-aggregations, since there is no document set to aggregate on.\n\n### Average\n\nA single-value metric aggregation that computes the average of numeric values that are extracted from the aggregated documents.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"average_price\": {\n            \"avg\": { \"field\": \"price\" }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 101942,\n    \"errors\": [],\n    \"aggregations\": {\n        \"average_price\": {\n            \"value\": 133.7\n        }\n    }\n}\n```\n\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n### Count\n\nA single-value metric aggregation that counts the number of values that are extracted from the aggregated documents.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"price_count\": {\n            \"value_count\": { \"field\": \"price\" }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 102956,\n    \"errors\": [],\n    \"aggregations\": {\n        \"price_count\": {\n            \"value\": 9582098\n        }\n    }\n}\n```\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a 
value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n### Max\n\nA single-value metric aggregation that computes the maximum of numeric values that are extracted from the aggregated documents.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"max_price\": {\n            \"max\": { \"field\": \"price\" }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 101543,\n    \"errors\": [],\n    \"aggregations\": {\n        \"max_price\": {\n            \"value\": 1353.23\n        }\n    }\n}\n```\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n### Min\n\nA single-value metric aggregation that computes the minimum of numeric values that are extracted from the aggregated documents.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"min_price\": {\n            \"min\": { \"field\": \"price\" }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 102342,\n    \"errors\": [],\n    \"aggregations\": {\n        \"min_price\": {\n            \"value\": 0.01\n        }\n    }\n}\n```\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n### Stats\n\nA multi-value metric 
aggregation that computes stats (average, count, min, max, standard deviation, and sum) of numeric values that are extracted from the aggregated documents.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"timestamp_stats\": {\n            \"stats\": { \"field\": \"timestamp\" }\n        }\n    }\n}\n```\n\n\n\n**Response**\n```json\n{\n    \"num_hits\": 10000783,\n    \"hits\": [],\n    \"elapsed_time_micros\": 65297,\n    \"errors\": [],\n    \"aggregations\": {\n        \"timestamp_stats\": {\n            \"avg\": 1462320207.9803998,\n            \"count\": 10000783,\n            \"max\": 1475669670.0,\n            \"min\": 1440670432.0,\n            \"standard_deviation\": 11867304.28681695,\n            \"sum\": 1.4624347076526848e16\n        }\n    }\n}\n```\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n### Extended Stats\n\nExtended stats is the same as `stats`, but with following additional metrics: `sum_of_squares`, `variance`, `std_deviation`, and `std_deviation_bounds`.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"response_extended_stats\": {\n            \"extended_stats\": { \"field\": \"response\" }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    ..\n    \"aggregations\": {\n        \"response_extended_stats\": {\n            \"avg\": 65.55555555555556,\n            \"count\": 9,\n            \"max\": 130.0,\n            \"min\": 20.0,\n            \"std_deviation\": 42.97573245736381,\n            \"std_deviation_bounds\": {\n                \"lower\": 
-20.395909359172062,\n                \"lower_population\": -20.395909359172062,\n                \"lower_sampling\": -25.60973998562673,\n                \"upper\": 151.50702047028318,\n                \"upper_population\": 151.50702047028318,\n                \"upper_sampling\": 156.72085109673785\n            },\n            \"std_deviation_population\": 42.97573245736381,\n            \"std_deviation_sampling\": 45.582647770591144,\n            \"sum\": 590.0,\n            \"sum_of_squares\": 55300.0,\n            \"variance\": 1846.9135802469136,\n            \"variance_population\": 1846.9135802469136,\n            \"variance_sampling\": 2077.777777777778\n        }\n    }\n}\n```\n\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n###### **sigma**\n\nThe sigma parameter controls how many standard deviations +/- from the mean should be displayed.\nThe default value is 2.\n```json skip\n{ \"field\": \"price\", \"sigma\": \"3.0\" }\n```\n\n### Sum\n\nA single-value metric aggregation that sums up numeric values that are extracted from the aggregated documents.\nSupported field types are `u64`, `f64`, `i64`, and `datetime`.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"total_price\": {\n            \"sum\": { \"field\": \"price\" }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 101142,\n    \"errors\": [],\n    \"aggregations\": {\n        \"total_price\": {\n            \"value\": 12966782476.54\n        }\n    }\n}\n```\n\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they 
will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n\n\n### Percentiles\nThe percentiles aggregation is a useful tool for understanding the distribution of a data set.\nIt calculates the values below which a given percentage of the data falls.\nFor instance, the 95th percentile indicates the value below which 95% of the data points can be found.\n\nThis aggregation can be particularly interesting for analyzing website or service response times.\nFor example, if the 95th percentile website load time is significantly higher than the median, this indicates\nthat a small percentage of users are experiencing much slower load times than the majority.\n\nTo use the percentiles aggregation, you'll need to provide a field to aggregate on.\nIn the case of website load times, this would typically be a field containing the duration of time it takes for the site to load.\n\n**Request**\n```json skip\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"loading_times\": {\n            \"percentiles\": {\n                \"field\": \"load_time\",\n                \"percents\": [90, 95, 99]\n            }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 101142,\n    \"errors\": [],\n    \"aggregations\": {\n        \"loading_times\": {\n            \"values\": {\n                \"90.0\": 33.4,\n                \"95.0\": 83.4,\n                \"99.0\": 230.3\n            }\n        }\n    }\n}\n```\n\n`percents` may be omitted, in which case it defaults to `[1, 5, 25, 50, 75, 95, 99]` (50 being the median).\n\n#### Estimating Percentiles\n\nWhile percentiles provide valuable insights into the distribution of data, it's important to understand that they are often estimates.\nThis is because calculating exact percentiles for large data sets can be computationally expensive and time-consuming.\n\n#### 
Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n\n### Cardinality\nThe cardinality aggregation is used to approximate the count of distinct values in a field. \nCardinality aggregations are essential when working with large datasets where computing the exact count of distinct values would be computationally expensive. \n\nThe cardinality aggregation can be useful, for example, to count the number of unique users visiting a website or to determine the number of unique IP addresses that have logged into a server over a certain period.\n\nThe algorithm behind the cardinality aggregation is based on HyperLogLog++, which provides an approximate count over the hashed values.\n\nTo use the cardinality aggregation, you need to specify the field on which to perform the aggregation.\n\n**Request**\n```json\n{\n    \"query\": \"*\",\n    \"max_hits\": 0,\n    \"aggs\": {\n        \"unique_users\": {\n            \"cardinality\": {\n                \"field\": \"user_id\"\n            }\n        }\n    }\n}\n```\n\n**Response**\n```json\n{\n    \"num_hits\": 9582098,\n    \"hits\": [],\n    \"elapsed_time_micros\": 101142,\n    \"errors\": [],\n    \"aggregations\": {\n        \"unique_users\": {\n            \"value\": 345672\n        }\n    }\n}\n```\n\n\n#### Parameters\n\n###### **missing**\nThe `missing` parameter defines how documents that are missing a value should be treated.\nBy default they will be ignored but it is also possible to treat them as if they had a value.\n```json skip\n{ \"field\": \"price\", \"missing\": \"10.0\" }\n```\n\n#### Performance\n\nThe cardinality aggregation on text fields is computationally expensive for datasets with a large number of unique values. 
\nThis is because the aggregation computes the hash for each unique term in the field. \nTo do this, Quickwit first collects the term IDs for each split and then fetches the compressed terms for those term IDs from the dictionary.\nDecompressing the terms is comparatively expensive, and keeping the term IDs increases memory usage.\n\nFor numeric fields, the cardinality aggregation is much more efficient as it directly computes the hash of the numeric values and adds them to HLL++.\n\n##### Limitations\nThe `precision_threshold` parameter is currently ignored. Normally, it sets the threshold up to which the count is exact.\n\n\n"
  },
  {
    "path": "docs/reference/cli.md",
    "content": "---\ntitle: Command-line options\nsidebar_position: 50\n---\n\nQuickwit command line tool lets you start a Quickwit server and manage indexes (create, delete, ingest), splits and sources (create, delete, toggle). To start a server, `quickwit` needs a [node config file path](../configuration/node-config.md) that you can specify with `QW_CONFIG` environment variable: `export QW_CONFIG=./config/quickwit.yaml`.\n\nThis page documents all the available commands, related options, and environment variables.\n\n### Common Options\n\nTo manage indexes, splits and sources on a remote cluster you might need to specify the connection to a Quickwit node. The following options are supported:\n\n| Option              | Description                 | Default                 |\n|---------------------|-----------------------------|------------------------:|\n| `--endpoint`        | The url of a Quickwit node. | `http://127.0.0.1:7280` |\n| `--timeout`         | Command timeout.            | *See below*             |\n| `--connect-timeout` | Connect timeout.            | `5s`                    |\n\nThe default timeouts are command specific:\n- **search** - 1 minute\n- **ingest** (without force or wait) - 1 minute\n- **ingest** (with force or wait) - 30 minute\n- all other operations - 10 seconds\n\nThe timeout can be expressed as in seconds, minutes, hours or days. 
For example:\n\n- `10s` - 10 seconds timeout\n- `1m` - 1 minute timeout\n- `2h` - 2 hours timeout\n- `1d` - 1 day timeout\n- `none` - no timeout is applied.\n\n:::caution\n\nBefore using Quickwit with object storage, consult our [guidelines](../operating/aws-costs.md) for deploying on AWS S3 to avoid surprises on your next bill.\n\n:::\n\n\n## Commands\n\n[Command-line synopsis syntax](https://developers.google.com/style/code-syntax)\n\n### Help\n\n`quickwit` or `quickwit --help` displays the list of available commands.\n\n`quickwit <command name> --help` displays the documentation for the command and a usage example.\n\n### Version\n\n`quickwit --version` displays the version. It is helpful for reporting bugs.\n\n\n### Syntax\n\nThe CLI is structured into high-level commands with subcommands.\n`quickwit [command] [subcommand] [args]`.\n\n* `command`: `run`, `index`, `split`, `source` and `tool`.\n\n\n<!--\n    Insert auto-generated CLI docs here...\n-->\n## run\nStarts a Quickwit node with all services enabled by default: `indexer`, `searcher`, `metastore`, `control-plane`, and `janitor`.\n\n\n### Indexer service\n\nThe indexer service runs indexing pipelines assigned by the control plane.\n\n### Searcher service \nStarts a web server at `rest_listing_address:rest_list_port` that exposes the [Quickwit REST API](rest-api.md)\nwhere `rest_listing_address` and `rest_list_port` are defined in Quickwit config file (quickwit.yaml).\nThe node can optionally join a cluster using the `peer_seeds` parameter.\nThis list of node addresses is used to discover the remaining peer nodes in the cluster through a gossip protocol, see [chitchat](https://github.com/quickwit-oss/chitchat).\n\n### Metastore service\n\nThe metastore service exposes Quickwit metastore over the network. This is a core internal service that is needed to operate Quickwit. 
As such, at least one running instance of this service is required for other services to work.\n\n### Control plane service\n\nThe control plane service schedules indexing tasks to indexers. It listens to metastore events such as\na source creation, deletion, or toggle, or an index deletion, and reacts accordingly to update the indexing plan.\n\n### Janitor service\n\nThe Janitor service runs maintenance tasks on indexes: garbage collection, document deletion, and retention policy tasks.\n\n:::note\nQuickwit needs to open the following ports for cluster formation and workload distribution:\n\n    TCP port (default is 7280) for REST API\n    TCP and UDP port (default is 7280) for cluster membership protocol\n    TCP port + 1 (default is 7281) for gRPC address for the distributed search\n\nIf ports are already taken, the `run` command will fail.\n:::\n  \n`quickwit run [args]`\n\n*Synopsis*\n\n```bash\nquickwit run\n    [--config <config>]\n    [--service <service>]\n```\n\n*Options*\n\n| Option | Description | Default |\n|-----------------|-------------|--------:|\n| `--config` | Config file location | `config/quickwit.yaml` |\n| `--service` | Services (`indexer`, `searcher`, `metastore`, `control-plane`, or `janitor`) to run. If unspecified, all the supported services are started. 
|  |\n\n*Examples*\n\n*Start an indexer and a metastore service*\n```bash\nquickwit run --service indexer --service metastore --endpoint=http://127.0.0.1:7280\n```\n\n*Start control plane, metastore, and janitor services*\n```bash\nquickwit run --service control-plane --service metastore --service janitor --config=./config/quickwit.yaml\n```\n\n*Make a search request on a wikipedia index*\n```bash\n# To create wikipedia index and ingest data, go to our tutorials https://quickwit.io/docs/get-started/.\n# Start a searcher.\nquickwit run --service searcher --service metastore --config=./config/quickwit.yaml\n# Make a request.\ncurl \"http://127.0.0.1:7280/api/v1/wikipedia/search?query=barack+obama\"\n\n```\n\n## index\nManages indexes: creates, updates, deletes, ingests, searches, describes...\n\n### index create\n\nCreates an index of ID `index` at `index-uri` configured by a [YAML config file](../configuration/index-config.md) located at `index-config`.\nThe index config lets you define the mapping of your document on the index and how each field is stored and indexed.\nIf `index-uri` is omitted, `index-uri` will be set to `{default_index_root_uri}/{index}`, more info on [Quickwit config docs](../configuration/node-config.md).\nThe command fails if an index already exists unless `overwrite` is passed.\nWhen `overwrite` is enabled, the command deletes all the files stored at `index-uri` before creating a new index.\n  \n`quickwit index create [args]`\n\n*Synopsis*\n\n```bash\nquickwit index create\n    --index-config <index-config>\n    [--overwrite]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index-config` | Location of the index config file. |\n| `--overwrite` | Overwrites pre-existing index. This will delete all existing data stored at `index-uri` before creating a new index. 
|\n\n*Examples*\n\n*Create a new index.*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncurl -o wikipedia_index_config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/wikipedia/index-config.yaml\nquickwit index create --endpoint=http://127.0.0.1:7280 --index-config wikipedia_index_config.yaml\n\n```\n\n### index update\n\nUpdates an index using an index config file.  \n`quickwit index update [args]`\n\n*Synopsis*\n\n```bash\nquickwit index update\n    --index <index>\n    --index-config <index-config>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--index-config` | Location of the index config file. |\n| `--create` | Create the index if it doesn't exist. |\n### index clear\n\nClears an index: deletes all splits and resets checkpoint.  \n`quickwit index clear [args]`\n`quickwit index clr [args]`\n\n*Synopsis*\n\n```bash\nquickwit index clear\n    --index <index>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | Index ID |\n### index delete\n\nDeletes an index.  \n`quickwit index delete [args]`\n`quickwit index del [args]`\n\n*Synopsis*\n\n```bash\nquickwit index delete\n    --index <index>\n    [--dry-run]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--dry-run` | Executes the command in dry run mode and only displays the list of splits candidates for deletion. |\n\n*Examples*\n\n*Delete your index*\n```bash\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index delete --index wikipedia --endpoint=http://127.0.0.1:7280\n\n```\n\n### index describe\n\nDisplays descriptive statistics of an index.  
\n`quickwit index describe [args]`\n\n*Synopsis*\n\n```bash\nquickwit index describe\n    --index <index>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n\n*Examples*\n\n*Displays descriptive statistics of your index*\n```bash\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index describe --endpoint=http://127.0.0.1:7280 --index wikipedia\n\n1. General infos\n===============================================================================\nIndex id:                           wikipedia\nIndex uri:                          file:///home/quickwit-indices/qwdata/indexes/wikipedia\nNumber of published splits:         1\nNumber of published documents:      300000\nSize of published splits:           448 MB\n\n2. Statistics on splits\n===============================================================================\nDocument count stats:\nMean ± σ in [min … max]:            300000 ± 0 in [300000 … 300000]\nQuantiles [1%, 25%, 50%, 75%, 99%]: [300000, 300000, 300000, 300000, 300000]\n\nSize in MB stats:\nMean ± σ in [min … max]:            448 ± 0 in [448 … 448]\nQuantiles [1%, 25%, 50%, 75%, 99%]: [448, 448, 448, 448, 448]\n\n```\n\n### index list\n\nList indexes.  
\n`quickwit index list [args]`\n`quickwit index ls [args]`\n\n*Examples*\n\n*List indexes*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index list --endpoint=http://127.0.0.1:7280\n# Or with alias.\nquickwit index ls --endpoint=http://127.0.0.1:7280\n\n                                    Indexes                                     \n+-----------+--------------------------------------------------------+\n| Index ID  |                       Index URI                        |\n+-----------+--------------------------------------------------------+\n| hdfs-logs | file:///home/quickwit-indices/qwdata/indexes/hdfs-logs |\n+-----------+--------------------------------------------------------+\n| wikipedia | file:///home/quickwit-indices/qwdata/indexes/wikipedia |\n+-----------+--------------------------------------------------------+\n\n\n```\n\n### index ingest\n\nIndexes a dataset consisting of newline-delimited JSON objects located at `input-path` or read from *stdin*.\nThe data is appended to the target index of ID `index` unless `overwrite` is passed. `input-path` can be a file or another command output piped into stdin.\nCurrently, only local datasets are supported.\nBy default, Quickwit's indexer will work with a heap of 2 GiB of memory. Learn how to change `heap-size` in the [index config doc page](../configuration/index-config.md).\n  \n`quickwit index ingest [args]`\n\n*Synopsis*\n\n```bash\nquickwit index ingest\n    --index <index>\n    [--input-path <input-path>]\n    [--batch-size-limit <batch-size-limit>]\n    [--wait]\n    [--detailed-response]\n    [--force]\n    [--commit-timeout <commit-timeout>]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--input-path` | Location of the input file. |\n| `--batch-size-limit` | Size limit of each submitted document batch. 
|\n| `--wait` | Wait for all documents to be committed and available for search before exiting. Applies only to the last batch, see [#5417](https://github.com/quickwit-oss/quickwit/issues/5417). |\n| `--detailed-response` | Print detailed errors. Enabling might impact performance negatively. |\n| `--force` | Force a commit after the last document is sent, and wait for all documents to be committed and available for search before exiting. Applies only to the last batch, see [#5417](https://github.com/quickwit-oss/quickwit/issues/5417). |\n| `--commit-timeout` | Timeout for ingest operations that require waiting for the final commit (`--wait` or `--force`). This is different from the `commit_timeout_secs` indexing setting, which sets the maximum time before committing splits after their creation. |\n\n*Examples*\n\n*Indexing a dataset from a file*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncurl -o wiki-articles-10000.json https://quickwit-datasets-public.s3.amazonaws.com/wiki-articles-10000.json\nquickwit index ingest --endpoint=http://127.0.0.1:7280 --index wikipedia --input-path wiki-articles-10000.json\n\n```\n\n*Indexing a dataset from stdin*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncat wiki-articles-10000.json | quickwit index ingest --endpoint=http://127.0.0.1:7280 --index wikipedia\n\n```\n\n### index search\n\nSearches an index with ID `--index` and returns the documents matching the query specified with `--query`.\nMore details on the [query language page](query-language.md).\nThe offset of the first hit returned and the number of hits returned can be set with the `start-offset` and `max-hits` options.\nIt's possible to override the default search fields with the `search-fields` option, which defines the list of fields that Quickwit will search into if \nthe user query does not explicitly target a field in the query. 
Quickwit will return snippets of the matching content when requested via the `snippet-fields` options.\nSearch can also be limited to a time range using the `start-timestamp` and `end-timestamp` options.\nThese timestamp options are useful for boosting query performance when using a time series dataset.\n\n:::warning\nThe `start_timestamp` and `end_timestamp` should be specified in seconds regardless of the timestamp field precision. The timestamp field precision only affects the way it's stored as fast-fields, whereas the document filtering is always performed in seconds.\n:::\n  \n`quickwit index search [args]`\n\n*Synopsis*\n\n```bash\nquickwit index search\n    --index <index>\n    --query <query>\n    [--aggregation <aggregation>]\n    [--max-hits <max-hits>]\n    [--start-offset <start-offset>]\n    [--search-fields <search-fields>]\n    [--snippet-fields <snippet-fields>]\n    [--start-timestamp <start-timestamp>]\n    [--end-timestamp <end-timestamp>]\n    [--sort-by-score]\n```\n\n*Options*\n\n| Option | Description | Default |\n|-----------------|-------------|--------:|\n| `--index` | ID of the target index |  |\n| `--query` | Query expressed in natural query language ((barack AND obama) OR \"president of united states\"). Learn more on https://quickwit.io/docs/reference/search-language. |  |\n| `--aggregation` | JSON serialized aggregation request in tantivy/elasticsearch format. |  |\n| `--max-hits` | Maximum number of hits returned. | `20` |\n| `--start-offset` | Offset in the global result set of the first hit returned. | `0` |\n| `--search-fields` | List of fields that Quickwit will search into if the user query does not explicitly target a field in the query. It overrides the default search fields defined in the index config. Space-separated list, e.g. \"field1 field2\".  |  |\n| `--snippet-fields` | List of fields that Quickwit will return snippet highlight on. Space-separated list, e.g. \"field1 field2\".  
|  |\n| `--start-timestamp` | Filters out documents before that timestamp (time-series indexes only). |  |\n| `--end-timestamp` | Filters out documents after that timestamp (time-series indexes only). |  |\n| `--sort-by-score` | Sorts documents by their BM25 score. |  |\n\n*Examples*\n\n*Searching an index*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\"\n# If you have jq installed.\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\" | jq '.hits[].title'\n\n```\n\n*Sorting documents by their BM25 score*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"obama\" --sort-by-score\n\n```\n\n*Limiting the result set to 50 hits*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\" --max-hits 50\n# If you have jq installed.\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\" --max-hits 50 | jq '.num_hits'\n\n```\n\n*Looking for matches in the title only*\n```bash\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"obama\" --search-fields title\n# If you have jq installed.\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"obama\" --search-fields title | jq '.hits[].title'\n\n```\n\n## source\nManages sources: creates, updates, deletes sources...\n\n### source create\n\nAdds a new source to an index.  
\n`quickwit source create [args]`\n\n*Synopsis*\n\n```bash\nquickwit source create\n    --index <index>\n    --source-config <source-config>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--source-config` | Path to source config file. Please, refer to the documentation for more details. |\n### source update\n\nUpdate an existing source.  \n`quickwit source update [args]`\n\n*Synopsis*\n\n```bash\nquickwit source update\n    --index <index>\n    --source <source>\n    --source-config <source-config>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--source` | ID of the source |\n| `--source-config` | Path to source config file. Please, refer to the documentation for more details. |\n| `--create` | Create the source if it doesn't exist. |\n### source enable\n\nEnables a source for an index.  \n`quickwit source enable [args]`\n\n*Synopsis*\n\n```bash\nquickwit source enable\n    --index <index>\n    --source <source>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--source` | ID of the source. |\n### source disable\n\nDisables a source for an index.  \n`quickwit source disable [args]`\n\n*Synopsis*\n\n```bash\nquickwit source disable\n    --index <index>\n    --source <source>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--source` | ID of the source. |\n### source ingest-api\n\nEnables/disables the ingest API of an index.  \n`quickwit source ingest-api [args]`\n\n*Synopsis*\n\n```bash\nquickwit source ingest-api\n    --index <index>\n    [--enable]\n    [--disable]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--enable` | Enables the ingest API. 
|\n| `--disable` | Disables the ingest API. |\n### source delete\n\nDeletes a source from an index.  \n`quickwit source delete [args]`\n`quickwit source del [args]`\n\n*Synopsis*\n\n```bash\nquickwit source delete\n    --index <index>\n    --source <source>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--source` | ID of the source. |\n\n*Examples*\n\n*Delete a `wikipedia-source` source*\n```bash\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit source delete --endpoint=http://127.0.0.1:7280 --index wikipedia --source wikipedia-source\n\n```\n\n### source describe\n\nDescribes a source.  \n`quickwit source describe [args]`\n`quickwit source desc [args]`\n\n*Synopsis*\n\n```bash\nquickwit source describe\n    --index <index>\n    --source <source>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--source` | ID of the source. |\n### source list\n\nLists the sources of an index.  \n`quickwit source list [args]`\n`quickwit source ls [args]`\n\n*Synopsis*\n\n```bash\nquickwit source list\n    --index <index>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n\n*Examples*\n\n*List `wikipedia` index sources*\n```bash\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit source list --endpoint=http://127.0.0.1:7280 --index wikipedia\n\n```\n\n### source reset-checkpoint\n\nResets a source checkpoint.  
\n`quickwit source reset-checkpoint [args]`\n`quickwit source reset [args]`\n\n*Synopsis*\n\n```bash\nquickwit source reset-checkpoint\n    --index <index>\n    --source <source>\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | Index ID |\n| `--source` | Source ID |\n## split\nManages splits: lists, describes, marks for deletion...\n\n### split list\n\nLists the splits of an index.  \n`quickwit split list [args]`\n`quickwit split ls [args]`\n\n*Synopsis*\n\n```bash\nquickwit split list\n    --index <index>\n    [--offset <offset>]\n    [--limit <limit>]\n    [--states <states>]\n    [--create-date <create-date>]\n    [--start-date <start-date>]\n    [--end-date <end-date>]\n    [--output-format <output-format>]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | Target index ID |\n| `--offset` | Number of splits to skip. |\n| `--limit` | Maximum number of splits to retrieve. |\n| `--states` | Selects the splits whose states are included in this comma-separated list of states. Possible values are `staged`, `published`, and `marked`. |\n| `--create-date` | Selects the splits whose creation dates are before this date. |\n| `--start-date` | Selects the splits that contain documents after this date (time-series indexes only). |\n| `--end-date` | Selects the splits that contain documents before this date (time-series indexes only). |\n| `--output-format` | Output format. Possible values are `table`, `json`, and `pretty-json`. |\n### split describe\n\nDisplays metadata about a split.  \n`quickwit split describe [args]`\n`quickwit split desc [args]`\n\n*Synopsis*\n\n```bash\nquickwit split describe\n    --index <index>\n    --split <split>\n    [--verbose]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--split` | ID of the target split |\n| `--verbose` | Displays additional metadata about the hotcache. 
|\n### split mark-for-deletion\n\nMarks one or multiple splits of an index for deletion.  \n`quickwit split mark-for-deletion [args]`\n`quickwit split mark [args]`\n\n*Synopsis*\n\n```bash\nquickwit split mark-for-deletion\n    --index <index>\n    --splits <splits>\n    [--yes]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | Target index ID |\n| `--splits` | Comma-separated list of split IDs |\n| `--yes` | Assume \"yes\" as an answer to all prompts and run non-interactively. |\n## tool\nPerforms utility operations. Requires a node config.\n\n### tool local-ingest\n\nIndexes NDJSON documents locally.  \n`quickwit tool local-ingest [args]`\n\n*Synopsis*\n\n```bash\nquickwit tool local-ingest\n    --index <index>\n    [--input-path <input-path>]\n    [--input-format <input-format>]\n    [--overwrite]\n    [--transform-script <transform-script>]\n    [--keep-cache]\n```\n\n*Options*\n\n| Option | Description | Default |\n|-----------------|-------------|--------:|\n| `--index` | ID of the target index |  |\n| `--input-path` | Location of the input file. |  |\n| `--input-format` | Format of the input data. | `json` |\n| `--overwrite` | Overwrites pre-existing index. |  |\n| `--transform-script` | VRL program to transform docs before ingesting. |  |\n| `--keep-cache` | Does not clear local cache directory upon completion. |  |\n### tool extract-split\n\nDownloads and extracts a split to a directory.  \n`quickwit tool extract-split [args]`\n\n*Synopsis*\n\n```bash\nquickwit tool extract-split\n    --index <index>\n    --split <split>\n    [--target-dir <target-dir>]\n```\n\n*Options*\n\n| Option | Description |\n|-----------------|-------------|\n| `--index` | ID of the target index |\n| `--split` | ID of the target split |\n| `--target-dir` | Directory to extract the split to. |\n### tool gc\n\nGarbage collects stale staged splits and splits marked for deletion.  
\n:::note\nIntermediate files are created while executing Quickwit commands.\nThese intermediate files are always cleaned at the end of each successfully executed command.\nHowever, failed or interrupted commands can leave behind intermediate files that need to be removed.\nAlso, note that using a very short grace period (like seconds) can cause the removal of intermediate files being operated on, especially when using Quickwit concurrently on the same index.\nIn practice, you can settle with the default value (1 hour) and only specify a lower value if you really know what you are doing.\n\n:::\n`quickwit tool gc [args]`\n\n*Synopsis*\n\n```bash\nquickwit tool gc\n    --index <index>\n    [--grace-period <grace-period>]\n    [--dry-run]\n```\n\n*Options*\n\n| Option | Description | Default |\n|-----------------|-------------|--------:|\n| `--index` | ID of the target index |  |\n| `--grace-period` | Threshold period after which stale staged splits are garbage collected. | `1h` |\n| `--dry-run` | Executes the command in dry run mode and only displays the list of splits candidates for garbage collection. |  |\n\n<!--\n    End of auto-generated CLI docs\n-->\n\n## Environment Variables\n\n### QW_CLUSTER_ENDPOINT\n\nSpecifies the address of the cluster to connect to. Management commands `index`, `split` and `source` require the `cluster_endpoint`, which you can set once and for all with the `QW_CLUSTER_ENDPOINT` environment variable.\n\n### QW_CONFIG\n\nSpecifies the path to the [quickwit config](../configuration/node-config.md). 
Commands `run` and `tool` require the `config`, which you can set once and for all with the `QW_CONFIG` environment variable.\n\n*Example*\n\n`export QW_CONFIG=config/quickwit.yaml`\n\n### QW_DISABLE_TELEMETRY\n\nDisables [telemetry](../telemetry.md) when set to any non-empty value.\n\n*Example*\n\n`QW_DISABLE_TELEMETRY=1 quickwit help`\n\n### QW_POSTGRES_SKIP_MIGRATIONS\n\nSkips database migrations, but still verifies that earlier migrations ran successfully and that no unknown migration was run.\n\n### QW_POSTGRES_SKIP_MIGRATION_LOCKING\n\nDoes not lock the database during migrations. This may increase compatibility with alternative databases using the PostgreSQL wire protocol. However, it\nis dangerous to use this if you can't guarantee that only one node will run the migrations.\n\n### RUST_LOG\n\nConfigures the Quickwit log level.\n\n*Examples*\n\n```\n# run with higher verbosity\nRUST_LOG=debug quickwit run\n# run with log level info, except for indexing related logs\nRUST_LOG=info,quickwit_indexing=debug quickwit run\n```\n"
  },
  {
    "path": "docs/reference/es_compatible_api.md",
    "content": "---\ntitle: Elasticsearch compatible API\nsidebar_position: 20\n---\n\n\nIn order to facilitate migrations and integrations with existing tools,\nQuickwit offers an Elasticsearch/Opensearch compatible API.\nThis API is incomplete. This page lists the available features and endpoints.\n\n## Supported endpoints\n\nAll the API endpoints start with the `api/v1/_elastic/` prefix.\n\n### `_bulk` &nbsp; Batch ingestion endpoint\n\n```\nPOST api/v1/_elastic/_bulk\n```\n```\nPOST api/v1/_elastic/<index>/_bulk\n```\n\nThe _bulk ingestion API makes it possible to index a batch of documents, possibly targeting several indices in the same request.\n\n#### Request Body example\n\n```json\n{ \"create\" : { \"_index\" : \"wikipedia\", \"_id\" : \"1\" } }\n{\"url\":\"https://en.wikipedia.org/wiki?id=1\",\"title\":\"foo\",\"body\":\"foo\"}\n{ \"create\" : { \"_index\" : \"wikipedia\", \"_id\" : \"2\" } }\n{\"url\":\"https://en.wikipedia.org/wiki?id=2\",\"title\":\"bar\",\"body\":\"bar\"}\n{ \"create\" : { \"_index\" : \"wikipedia\", \"_id\" : \"3\" } }\n{\"url\":\"https://en.wikipedia.org/wiki?id=3\",\"title\":\"baz\",\"body\":\"baz\"}'\n```\n\nIngest a batch of documents to make them searchable using the [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html) bulk API. This endpoint provides compatibility with tools or systems that already send data to Elasticsearch for indexing. 
Currently, only the `create` action of the bulk API is supported; all other actions such as `delete` or `update` are ignored.\n\nIf an index is specified via the URL path, it will act as a default value\nfor the `_index` properties.\n\nThe [`refresh`](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-refresh.html) parameter is supported.\n\n:::caution\nThe Quickwit API will not report errors; you need to check the server logs.\n\nIn Elasticsearch, the `create` action has a specific behavior when the ingested documents contain an identifier (the `_id` field). It only inserts such a document if it was not inserted before. This is extremely handy to achieve At-Most-Once indexing.\nQuickwit does not have any notion of document id and does not support this feature.\n:::\n\n:::info\nThe payload size is limited to 10MB as this endpoint is intended to receive documents in batch.\n:::\n\n#### Query parameter\n\n| Variable  | Type     | Description                                                      | Default value |\n| --------- | -------- | ---------------------------------------------------------------- | ------------- |\n| `refresh` | `String` | The commit behavior: blank string, `true`, `wait_for` or `false` | `false`       |\n\n#### Response\n\nThe response is a JSON object, and the content type is `application/json; charset=UTF-8`.\n\n| Field                     | Description                                                                                                                                                              |   Type   |\n| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | :------: |\n| `num_docs_for_processing` | Total number of documents ingested for processing. The documents may not have been processed. The API will not return indexing errors; check the server logs for errors. 
| `number` |\n\n\n\n\n### `_search` &nbsp; Index search endpoint\n\n```\nPOST api/v1/_elastic/<index_id>/_search\n```\n```\nGET api/v1/_elastic/<index_id>/_search\n```\n\n#### Request Body example\n\n```json\n{\n  \"size\": 10,\n  \"query\": {\n    \"bool\": {\n      \"must\": [\n        {\n          \"query_string\": {\n            \"query\": \"bitpacking\"\n          }\n        },\n        {\n          \"term\": {\n            \"actor.login\": {\n              \"value\": \"fulmicoton\"\n            }\n          }\n        }\n      ]\n    }\n  },\n  \"sort\": [\n    {\n      \"actor.id\": {\n        \"order\": null\n      }\n    }\n  ],\n  \"aggs\": {\n    \"event_types\": {\n      \"terms\": {\n        \"field\": \"type\",\n        \"size\": 5\n      }\n    }\n  }\n}\n```\n\nSearches a specific index using the [Elasticsearch search API](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/search-search.html).\n\nSome of the parameters can be passed as query string parameters, and some via the JSON payload.\nIf a parameter appears both as a query string parameter and in the JSON payload, the query string parameter value will take priority.\n\n#### Supported Query string parameters\n\n\n| Variable           | Type          | Description                                                                      | Default value |\n| ------------------ | ------------- | -------------------------------------------------------------------------------- | ------------- |\n| `default_operator` | `AND` or `OR` | The default operator used to combine search terms. It should be `AND` or `OR`.   | `OR`          |\n| `from`             | `Integer`     | The rank of the first hit to return. This is useful for pagination.              | 0             |\n| `q`                | `String`      | The search query.                                                                | (Optional)    |\n| `size`             | `Integer`     | Number of hits to return.                               
                         | 10            |\n| `sort`             | `String`      | Describes how documents should be ranked. See [Sort order](#sort-order)          | (Optional)    |\n| `scroll`           | `Duration`    | Creates a scroll context for \"time to live\". See [Scroll](#_searchscroll--scroll-api). | (Optional)    |\n| `allow_partial_search_results` | `Boolean` | Returns a partial response if some (but not all) of the split searches were unsuccessful. | `true` |\n\n#### Supported Request Body parameters\n\n| Variable           | Type              | Description                                                                    | Default value |\n| ------------------ | ----------------- | ------------------------------------------------------------------------------ | ------------- |\n| `default_operator` | `\"AND\"` or `\"OR\"` | The default operator used to combine search terms. It should be `AND` or `OR`. | `OR`          |\n| `from`             | `Integer`         | The rank of the first hit to return. This is useful for pagination.            | 0             |\n| `query`            | `Json object`     | Describe the search query. See [Query DSL](#query-dsl)                         | (Optional)    |\n| `size`             | `Integer`         | Number of hits to return.                                                      | 10            |\n| `sort`             | `JsonObject[]`    | Describes how documents should be ranked. See [Sort order](#sort-order)        | `[]`          |\n| `search_after`     | `Any[]`           | Ignore documents with a SortingValue preceding or equal to the parameter       | (Optional)    |\n| `aggs`             | `Json object`     | Aggregation definition. See [Aggregations](aggregation.md).                    
| `{}`          |\n\n\n#### Sort order\n\nYou can define up to two criteria on which to apply sort.\nThe second criterion will only be used in presence of a tie for the first criterion.\n\nA given criterion can either be\n- the name of a fast field (explicitly defined in the schema or captured by the dynamic mode)\n- `_score` to sort by BM25.\n\nBy default, the sort order is `ascending` for fast fields and descending for `_score`.\n\nWhen sorting by a fast field and this field contains several values in a single document, only the first value is used for sorting.\n\nThe sort order can be set as descending/ascending using the\nfollowing syntax.\n\n```json\n{\n  // ...\n  \"sort\" : [\n    { \"timestamp\" : {\"order\" : \"asc\"}},\n    { \"serial_number\" : \"desc\" }\n  ]\n  // ...\n}\n\n```\n\nIt is also possible to not supply an order and rely on the default order using the following syntax.\n\n```json\n{ //...\n  \"sort\" : [\"_score\", \"timestamp\"]\n  // ...\n}\n```\n\nIf no format is provided for timestamps, timestamps are returned with milliseconds precision.\n\nIf you need nanosecond precision, you can use the `epoch_nanos_int` format. 
Beware: this means the resulting\nJSON may contain large numbers for which there is loss of precision when using languages where all numbers are\nfloats, such as JavaScript.\n\n```json\n{\n  // ...\n  \"sort\" : [\n    { \"timestamp\" : {\"format\": \"epoch_nanos_int\",\"order\" : \"asc\"}},\n    { \"serial_number\" : \"desc\" }\n  ]\n  // ...\n}\n```\n\n#### Search after\n\nWhen sorting results, the response looks like the following:\n\n```json\n{\n  // ...\n  \"hits\": {\n    // ...\n    \"hits\": [\n      // ...\n      {\n        // ...\n        \"sort\": [\n          1701962929199\n        ]\n      }\n    ]\n  }\n}\n```\n\nYou can pass the `sort` value of the last hit in a subsequent request where other fields are kept unchanged:\n```json\n{\n  // keep all fields from the original request\n  \"search_after\": [\n    1701962929199\n  ]\n}\n```\n\nThis allows you to paginate your results.\n\n### `_msearch` &nbsp; Multi search API\n\n```\nPOST api/v1/_elastic/_msearch\n```\n\n#### Request Body example\n\n```json\n{\"index\": \"gharchive\" }\n{\"query\" : {\"match\" : { \"author.login\": \"fulmicoton\"}}}\n{\"index\": \"gharchive\"}\n{\"query\" : {\"match_all\" : {}}}\n```\n\n[Multi search endpoint ES API reference](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/search-multi-search.html)\n\nRuns several search requests at once.\n\nThe payload is expected to alternate:\n- a `header` json object, containing the targeted index id.\n- a `search request body` as defined in the `_search` endpoint section.\n\n\n### `_search/scroll` &nbsp; Scroll API\n\n```\nGET api/v1/_elastic/_search/scroll\n```\n\n#### Supported Request Body parameters\n\n| Variable    | Type                                        | Description | Default value |\n| ----------- | ------------------------------------------- | ----------- | ------------- |\n| `scroll_id` | Scroll id (obtained from a search response) | Required    |               |\n\n\nThe `_search/scroll` endpoint, in combination 
with the `_search` API makes it possible to request successive pages of search results.\nFirst, the client needs to call the `_search` API with a `scroll` query parameter, and then pass the `scroll_id` returned in the response payload to the `_search/scroll` endpoint.\n\nEach subsequent call to the `_search/scroll` endpoint will return a new `scroll_id` pointing to the next page.\n\n:::tip\n\nUsing `_search` and then `_search/scroll` is somewhat similar to using `_search` with the `search_after` parameter, except that it creates a lightweight snapshot view of the dataset during the initial call to `_search`. Further calls to `_search/scroll` only return results from that view, thus ensuring more consistent results.\n\n:::\n\n### `_cat` &nbsp; Cat API\n\n```\nGET api/v1/_elastic/_cat/indices/<index>\n```\n```\nGET api/v1/_elastic/_cat/indices\n```\n\n#### Supported Query string parameters\n\n| Variable | Type       | Description                                                                                            | Default value |\n|----------|------------|--------------------------------------------------------------------------------------------------------|---------------|\n| `format` | `String`   | Format for response. Only JSON supported for now.                                                      |               |\n| `h`      | `String[]` | Comma-separated list of column names to display.                                                       | (Optional)    |\n| `health` | `String`   | Filter for health: `green`, `yellow`, or `red`.                                                        | (Optional)    |\n| `bytes`  | `String`   | Unit used to display byte values. Unsupported for now.                                                 | (Optional)    |\n| `s`      | `String`   | Comma-separated list of column names or column aliases used to sort the response. Unsupported for now. 
| (Optional)    |\n| `v`      | `Boolean`  | If true, the response includes column headings. Unsupported for now.                                   | (Optional)    |\n\nUse the [cat indices API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-indices.html) to get the following information for each index in a cluster:\n* Shard count\n* Document count\n* Deleted document count\n* Primary store size\n* Total store size\n\n#### Response\n\nThe response is a JSON array with one object per index, and the content type is `application/json; charset=UTF-8`.\n\n| Field            | Description                                      |   Type   |\n|------------------|--------------------------------------------------|:--------:|\n| `uuid`           | Index uuid                                       | `String` |\n| `index`          | Index name                                       | `String` |\n| `health`         | Health of the index `green`, `yellow`, or `red`. | `String` |\n| `status`         | Status of the index `open`.                      | `String` |\n| `rep`            | Replication factor.                              | `Number` |\n| `pri`            | Number of primary shards                         | `Number` |\n| `pri.store.size` | Stored size of primary shard.                    | `String` |\n| `store.size`     | Stored size of index.                            | `String` |\n| `dataset.size`   | Indexed data size.                               | `String` |\n| `docs.count`     | Number of records in index.                      | `Number` |\n| `docs.deleted`   | Number of deleted records in index.              
| `Number` |\n\nExample response:\n\n```json\n[\n  {\n    \"dataset.size\": \"0b\",\n    \"docs.count\": \"0\",\n    \"docs.deleted\": \"0\",\n    \"health\": \"green\",\n    \"index\": \"otel-traces-v0_7\",\n    \"pri\": \"1\",\n    \"pri.store.size\": \"0b\",\n    \"rep\": \"1\",\n    \"status\": \"open\",\n    \"store.size\": \"0b\",\n    \"uuid\": \"otel-traces-v0_7:01HTJC6TQDGM07KBDQZ2KDHW53\"\n  },\n  {\n    \"dataset.size\": \"387.5gb\",\n    \"docs.count\": \"224453081\",\n    \"docs.deleted\": \"0\",\n    \"health\": \"green\",\n    \"index\": \"otel-logs-v0_7\",\n    \"pri\": \"1\",\n    \"pri.store.size\": \"37.5gb\",\n    \"rep\": \"1\",\n    \"status\": \"open\",\n    \"store.size\": \"37.5gb\",\n    \"uuid\": \"otel-logs-v0_7:01HTJC6TME1JGXBFERHZ0FJ860\"\n  }\n]\n```\n\n[HTTP accept header]: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html\n\n\n### `_field_caps` &nbsp; Field capabilities API\n\n```\nGET api/v1/_elastic/<index>/_field_caps\n```\n```\nPOST api/v1/_elastic/<index>/_field_caps\n```\n```\nGET api/v1/_elastic/_field_caps\n```\n```\nPOST api/v1/_elastic/_field_caps\n```\n\nThe [field capabilities API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-field-caps.html) returns information about the capabilities of fields among multiple indices.\n\n#### Supported Query string parameters\n\n| Variable              | Type       | Description                                                                    | Default value |\n| --------------------- | ---------- | ------------------------------------------------------------------------------ | ------------- |\n| `fields`              | `String`   | Comma-separated list of fields to retrieve capabilities for. Supports wildcards (`*`). | (Optional) |\n| `allow_no_indices`    | `Boolean`  | If `true`, missing or closed indices are not an error.                          
| (Optional)    |\n| `expand_wildcards`    | `String`   | Controls what kind of indices that wildcard patterns can match.                 | (Optional)    |\n| `ignore_unavailable`  | `Boolean`  | If `true`, unavailable indices are ignored.                                    | (Optional)    |\n| `start_timestamp`     | `Integer`  | *(Quickwit-specific)* If set, restricts splits to documents with a timestamp range start >= `start_timestamp` (seconds since epoch). | (Optional) |\n| `end_timestamp`       | `Integer`  | *(Quickwit-specific)* If set, restricts splits to documents with a timestamp range end < `end_timestamp` (seconds since epoch). | (Optional) |\n\n#### Supported Request Body parameters\n\n| Variable           | Type          | Description                                                                 | Default value |\n| ------------------ | ------------- | --------------------------------------------------------------------------- | ------------- |\n| `index_filter`     | `Json object` | A query to filter indices. If provided, only fields from indices that can potentially match the filter are returned. See [index_filter](#index_filter). | (Optional) |\n| `runtime_mappings`  | `Json object` | Accepted but not supported.                                                 | (Optional)    |\n\n#### `index_filter`\n\nThe `index_filter` parameter allows you to filter which indices contribute to the field capabilities response. When provided, Quickwit uses the filter query to prune indices (splits) that cannot match the filter, and only returns field capabilities for the remaining ones.\n\nLike Elasticsearch, this is a **best-effort** approach: Quickwit may return field capabilities from indices that do not actually contain any matching documents. 
In Quickwit, the filtering is limited to the existing split-pruning based on metadata:\n\n- **Time pruning**: Range queries on the timestamp field can eliminate splits whose time range does not overlap with the filter.\n- **Tag pruning**: Term queries on [tag fields](../configuration/index-config.md#tag-fields) can eliminate splits that do not contain the requested tag value.\n\nOther filter types (e.g. full-text queries or term queries on non-tag fields) are accepted but will not prune any splits — all indices will be returned as if no filter was specified. In particular, Quickwit does not check whether terms are present in the term dictionary.\n\n#### Request Body example\n\n```json\n{\n  \"index_filter\": {\n    \"range\": {\n      \"timestamp\": {\n        \"gte\": \"2024-01-01T00:00:00Z\",\n        \"lt\": \"2024-02-01T00:00:00Z\"\n      }\n    }\n  }\n}\n```\n\n```json\n{\n  \"index_filter\": {\n    \"term\": {\n      \"status\": \"active\"\n    }\n  }\n}\n```\n\n\n## Query DSL\n\n[Elasticsearch Query DSL reference](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl.html).\n\nThe following query types are supported.\n\n### `query_string`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-query-string-query.html)\n\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"query_string\": {\n      \"query\": \"bitpacking AND author.login:fulmicoton\",\n      \"fields\": [\n        \"payload.description\"\n      ]\n    }\n  }\n}\n```\n\n#### Supported parameters\n\n| Variable           | Type                  | Description                                                                                                                 | Default value |\n| ------------------ | --------------------- | --------------------------------------------------------------------------------------------------------------------------- | ------------- |\n| `query`            | `String`              | 
Query meant to be parsed.                                                                                                   | -             |\n| `fields`           | `String[]` (Optional) | Default search target fields.                                                                                               | -             |\n| `default_operator` | `\"AND\"` or `\"OR\"`     | In the absence of boolean operator defines whether terms should be combined as a conjunction (`AND`) or disjunction (`OR`). | `OR`          |\n| `boost`            | `Number`              | Multiplier boost for score computation.                                                                                     | 1.0           |\n| `lenient`          | `Boolean`             | [See note](#about-the-lenient-argument).                                                                                    | false         |\n\n\n### `bool`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-bool-query.html)\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"bool\": {\n      \"must\": [\n        {\n          \"query_string\": {\n            \"query\": \"bitpacking\"\n          }\n        }\n      ],\n      \"must_not\": {\n        \"term\": {\n          \"type\": {\n            \"value\": \"CommitEvent\"\n          }\n        }\n      }\n    }\n  }\n}\n```\n\n#### Supported parameters\n\n| Variable   | Type                      | Description                                                       | Default value |\n| ---------- | ------------------------- | ----------------------------------------------------------------- | ------------- |\n| `must`     | `JsonObject[]` (Optional) | Sub-queries required to match the document.                       | []            |\n| `must_not` | `JsonObject[]` (Optional) | Sub-queries required to not match the document.                   
| []            |\n| `should`   | `JsonObject[]` (Optional) | Sub-queries that should match the documents.                      | []            |\n| `filter`   | `JsonObject[]`            | Like must queries, but the match does not influence the `_score`. | []            |\n| `boost`    | `Number`                  | Multiplier boost for score computation.                           | 1.0           |\n| `minimum_should_match`    | `Number` or `Str` | If present, quickwit will only match documents for which at least `minimum_should_match` should clauses are matching. `2`, `-1`, `\"10%\"` and `\"-10%\"` are supported. |  |\n\n### `range`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-range-query.html)\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"range\": {\n      \"my_date_field\": {\n        \"lt\": \"2015-02-01T00:00:13Z\",\n        \"gte\": \"2015-02-01T00:00:10Z\"\n      }\n    }\n  }\n}\n\n```\n\n#### Supported parameters\n\n| Variable | Type                            | Description                            | Default value |\n| -------- | ------------------------------- | -------------------------------------- | ------------- |\n| `gt`     | bool, string, Number (Optional) | Greater than                           | None          |\n| `gte`    | bool, string, Number (Optional) | Greater than or equal                  | None          |\n| `lt`     | bool, string, Number (Optional) | Less than                              | None          |\n| `lte`    | bool, string, Number (Optional) | Less than or equal                     | None          |\n| `boost`  | `Number`                        | Multiplier boost for score computation | 1.0           |\n\n\n### `match`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-match-query.html)\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"match\": {\n        \"type\": {\n            
\"query\": \"CommitEvent\",\n            \"zero_terms_query\": \"all\"\n        }\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable           | Type              | Description                                                                                                                    | Default |\n| ------------------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------ | ------- |\n| `query`            | String            | Full-text search query.                                                                                                        | -       |\n| `operator`         | `\"AND\"` or `\"OR\"` | Defines whether all terms should be present (`AND`) or if at least one term is sufficient to match (`OR`).                     | OR      |\n| `zero_terms_query` | `all` or `none`   | Defines if all (`all`) or no documents (`none`) should be returned if the query does not contain any terms after tokenization. | `none`  |\n| `boost`            | `Number`          | Multiplier boost for score computation                                                                                         | 1.0     |\n| `lenient`          | `Boolean`         | [See note](#about-the-lenient-argument).                                                                                       
| false   |\n\n\n\n### `match_phrase`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-match-query-phrase.html)\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"match_phrase\": {\n      \"title\": \"search keywords\",\n      \"analyzer\": \"default\"\n    }\n  }\n}\n```\n\n\n\n### `match_phrase_prefix`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase-prefix.html)\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"match_phrase_prefix\": {\n      \"payload.commits.message\": {\n        \"query\": \"automated comm\" // This will match \"automated commit\" for instance.\n      }\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable           | Type            | Description                                                                                                                    | Default                     |\n| ------------------ | --------------- | ------------------------------------------------------------------------------------------------------------------------------ | --------------------------- |\n| `query`            | String          | Full-text search query. The last token will be prefix-matched.                                                                 | -                           |\n| `zero_terms_query` | `all` or `none` | Defines if all (`all`) or no documents (`none`) should be returned if the query does not contain any terms after tokenization. | `none`                      |\n| `max_expansions`   | `Integer`       | Number of terms to be matched by the prefix matching.                                                                          | 50                          |\n| `slop`             | `Integer`       | Allows extra tokens between the query tokens.                                                                                  
| 0                           |\n| `analyzer`         | String          | Analyzer meant to cut the query into terms. It is recommended to NOT use this parameter.                                       | The actual field tokenizer. |\n\n\n### `match_bool_prefix`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase-prefix.html)\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"match_bool_prefix\": {\n      \"payload.commits.message\": {\n        \"query\": \"automated comm\" // This will match \"automated commit\" for instance.\n      }\n    }\n  }\n}\n```\n\nContrary to ES/Opensearch, in Quickwit, at most 50 terms will be considered when searching the last term of the query as a prefix `match_bool_prefix`.\n\n#### Supported Parameters\n\n| Variable           | Type              | Description                                                                                                                    | Default |\n| ------------------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------ | ------- |\n| `query`            | String            | Full-text search query. The last token will be prefix-matched                                                                  | -       |\n| `operator`         | `\"AND\"` or `\"OR\"` | Defines whether all terms should be present (`AND`) or if at least one term is sufficient to match (`OR`).                     | OR      |\n| `zero_terms_query` | `all` or `none`   | Defines if all (`all`) or no documents (`none`) should be returned if the query does not contain any terms after tokenization. 
| `none`  |\n\n\n\n### `Multi-match`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-multi-match-query.html)\n\n#### Example\n```json\n{\n  \"query\": {\n    \"multi_match\": {\n      \"query\": \"search keywords\",\n      \"fields\": [\n        \"title\",\n        \"body\"\n      ]\n    }\n  }\n}\n```\n\n```json\n{\n  \"query\": {\n    \"multi_match\": {\n      \"query\": \"search keywords\",\n      \"type\": \"most_fields\",\n      \"fields\": [\n        \"title\",\n        \"body\"\n      ]\n    }\n  }\n}\n```\n\n```json\n{\n  \"query\": {\n    \"multi_match\": {\n      \"query\": \"search keywords\",\n      \"type\": \"phrase\",\n      \"fields\": [\n        \"title\",\n        \"body\"\n      ]\n    }\n  }\n}\n```\n\n```json\n{\n  \"query\": {\n    \"multi_match\" : {\n      \"query\":      \"search key\",\n      \"type\":       \"phrase_prefix\",\n      \"fields\":     [ \"title\", \"body\" ]\n    }\n  }\n}\n```\n\n#### Supported parameters\n\n| Variable           | Type                  | Description                                  | Default value |\n| ------------------ | --------------------- | ---------------------------------------------| ------------- |\n| `type`             | `String`              | See supported types below                    | `most_fields` |\n| `fields`           | `String[]` (Optional) | Default search target fields.                | -             |\n| `lenient`          | `Boolean`             | [See note](#about-the-lenient-argument).     | false         |\n\nSupported types:\n\n| `type` value    | Description                                                                                 |\n| --------------- | ------------------------------------------------------------------------------------------- |\n| `most_fields`   | Finds documents matching any field and combines the `_score` from each field (default).  
|\n| `phrase`        | Runs a `match_phrase` query on each field.       |\n| `phrase_prefix` | Runs a `match_phrase_prefix` query on each field. |\n| `bool_prefix`   | Runs a `match_bool_prefix` query on each field. |\n\n:::warning\n\nIn `phrase`, `phrase_prefix` and `bool_prefix` modes, Quickwit sums the score of the different fields instead of returning their max.\n\nMoreover, while Quickwit does not support `best_fields` or `cross_fields`, it will not return an error when presented with a `best_fields` or `cross_fields` type. For compatibility reasons, Quickwit silently accepts these parameters and interprets them as a `most_fields` type.\n\n:::\n\n### `term`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-term-query.html)\n\n:::note\n\nWhen working on text, it is recommended to only use `term` queries on fields configured with `tokenizer: raw`. This is the Quickwit equivalent of the Elasticsearch `keyword` type.\n\n:::\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"term\": {\n      \"payload.commits.message\": {\n        \"value\": \"automated\",\n        \"boost\": 2.0\n      }\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable           | Type    | Description                                                                  | Default |\n| ------------------ | ------- | ---------------------------------------------------------------------------- | ------- |\n| `value`            | String  | Term value. This is the string representation of a token after tokenization. | -       |\n| `boost`            | Number  | Multiplier boost for score computation                                       | 1.0     |\n| `case_insensitive` | Boolean | Allows ASCII case insensitive matching of the value.                         
| false   |\n\n\n\n\n### `match_all` / `match_none`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-all-query.html)\n\n#### Example\n\n```json\n{\"match_all\": {}}\n```\n```json\n{\"match_none\": {}}\n```\n\n\n### `exists`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-exists-query.html)\n\nQuery matching only documents containing a non-null value for a given field.\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"exists\": {\n      \"field\": \"author.login\"\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable | Type   | Description                                             | Default |\n| -------- | ------ | ------------------------------------------------------- | ------- |\n| `field`  | String | Only documents with a value for field will be returned. | -       |\n\n### `prefix`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-prefix-query.html)\n\nReturns documents that contain a specific prefix in a provided field.\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"prefix\": {\n      \"author.login\": {\n        \"value\": \"adm\"\n      }\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable           | Type    | Description                                          | Default |\n| ------------------ | ------- | ---------------------------------------------------- | ------- |\n| `value`            | String  | Beginning characters of terms you wish to find.      | -       |\n| `case_insensitive` | Boolean | Allows ASCII case insensitive matching of the value. 
| false   |\n\n### `wildcard`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-wildcard-query.html)\n\nReturns documents that contain terms matching a wildcard pattern:\n* `?` replaces one and only one term character\n* `*` replaces any number of term characters or an empty string\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"wildcard\": {\n      \"author.login\": {\n        \"value\": \"adm?n*\"\n      }\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable           | Type    | Description                                          | Default |\n| ------------------ | ------- | ---------------------------------------------------- | ------- |\n| `value`            | String  | Wildcard pattern for terms you wish to find.         | -       |\n| `boost`            | Number  | Multiplier boost for score computation.              | 1.0     |\n| `case_insensitive` | Boolean | Allows ASCII case insensitive matching of the value. | false   |\n\n\n### `regexp`\n\n[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-regexp-query.html)\n\nReturns documents that contain terms matching a regular expression.\n\n#### Example\n\n```json\n{\n  \"query\": {\n    \"regexp\": {\n      \"author.login\": {\n        \"value\": \"adm.*n\"\n      }\n    }\n  }\n}\n```\n\n#### Supported Parameters\n\n| Variable           | Type    | Description                                          | Default |\n| ------------------ | ------- | ---------------------------------------------------- | ------- |\n| `value`            | String  | Regular expression for terms you wish to find.       | -       |\n| `case_insensitive` | Boolean | Allows ASCII case insensitive matching of the value. 
| false   |\n\n\n### About the `lenient` argument\n\nQuickwit and Elasticsearch have different interpretations of the `lenient` setting:\n- In Quickwit, lenient mode allows ignoring parts of the query that reference non-existing columns. This is a behavior that Elasticsearch supports by default.\n- In Elasticsearch, lenient mode primarily addresses type errors (such as searching for text in an integer field). Quickwit always supports this behavior, regardless of the `lenient` setting.\n\n## Search multiple indices\n\nSearch APIs that accept an `<index_id>` request path parameter also support multi-target syntax.\n\n### Multi-target syntax\n\nIn multi-target syntax, you can use a list separated by commas (or their URL-encoded version '%2C') to run a request on multiple indices: test1,test2,test3. You can also use [glob-like](https://en.wikipedia.org/wiki/Glob_(programming)) wildcard ( \\* ) expressions to target indices that match a pattern: test\\* or \\*test or te\\*t or \\*test\\*.\n\nThe multi-target expression has the following constraints:\n\n    - It must follow the regex `^[a-zA-Z\\*][a-zA-Z0-9-_\\.\\*]{0,254}$`.\n    - It cannot contain consecutive asterisks (`*`).\n    - If it does not contain an asterisk (`*`), the length must be greater than or equal to 3 characters.\n\n### Examples\n```\nGET api/v1/_elastic/stackoverflow-000001,stackoverflow-000002/_search\n{\n  \"query\": {\n    \"query_string\": {\n      \"query\": \"search AND engine\",\n      \"fields\": [\n        \"title\",\n        \"body\"\n      ]\n    }\n  }\n}\n```\n\n```\nGET api/v1/_elastic/stackoverflow*/_search\n{\n  \"query\": {\n    \"query_string\": {\n      \"query\": \"search AND engine\",\n      \"fields\": [\n        \"title\",\n        \"body\"\n      ]\n    }\n  }\n}\n```\n"
  },
  {
    "path": "docs/reference/metrics.md",
    "content": "---\ntitle: Metrics\nsidebar_position: 70\n---\n\nQuickwit exposes key metrics in the [Prometheus](https://prometheus.io/) format on the `/metrics` endpoint. You can use any front-end that supports Prometheus to examine the behavior of Quickwit visually.\n\n## Cache Metrics\n\nCurrently Quickwit exposes metrics for three caches: `fastfields`, `shortlived`, `splitfooter`. These metrics share the same structure.\n\n| Namespace | Metric Name | Description | Type |\n| --------- | ----------- | ----------- | ---- |\n| `quickwit_cache_{cache_name}` | `in_cache_count` | Count of {cache_name} in cache | `gauge` |\n| `quickwit_cache_{cache_name}` | `in_cache_num_bytes` | Number of {cache_name} bytes in cache | `gauge` |\n| `quickwit_cache_{cache_name}` | `cache_hit_total` | Number of {cache_name} cache hits | `counter` |\n| `quickwit_cache_{cache_name}` | `cache_hits_bytes` | Number of {cache_name} cache hits in bytes | `counter` |\n| `quickwit_cache_{cache_name}` | `cache_miss_total` | Number of {cache_name} cache misses | `counter` |\n\n## CLI Metrics\n\n| Namespace | Metric Name | Description | Type |\n| --------- | ----------- | ----------- | ---- |\n| `quickwit` | `allocated_num_bytes` | Number of bytes of allocated memory, as reported by jemalloc. 
| `gauge` |\n\n## Common Metrics\n\n| Namespace | Metric Name | Description | Labels | Type |\n| --------- | ----------- | ----------- | ------ | ---- |\n| `quickwit` | `write_bytes`| Number of bytes written by a given component in [`indexer`, `merger`, `deleter`, `split_downloader_{merge,delete}`] | [`index`, `component`] | `counter` |\n\n## Indexing Metrics\n\n| Namespace | Metric Name | Description | Labels | Type |\n| --------- | ----------- | ----------- | ------ | ---- |\n| `quickwit_indexing` | `processed_docs_total`| Number of processed docs by index, source and processed status in [`valid`, `schema_error`, `parse_error`, `transform_error`] | [`index`, `source`, `docs_processed_status`] | `counter` |\n| `quickwit_indexing` | `processed_bytes`| Number of processed bytes by index, source and processed status in [`valid`, `schema_error`, `parse_error`, `transform_error`] | [`index`, `source`, `docs_processed_status`] | `counter` |\n| `quickwit_indexing` | `available_concurrent_upload_permits`| Number of available concurrent upload permits by component in [`merger`, `indexer`] | [`component`] | `gauge` |\n| `quickwit_indexing` | `ongoing_merge_operations`| Number of ongoing merge operations by index and source. 
| [`index`, `source`] | `gauge` |\n\n## Ingest Metrics\n\n| Namespace | Metric Name | Description | Type |\n| --------- | ----------- | ----------- | ---- |\n| `quickwit_ingest` | `docs_bytes_total` | Total size of the docs ingested, measured in ingester's leader, after validation and before persistence/replication | `counter` |\n| `quickwit_ingest` | `docs_total` | Total number of the docs ingested, measured in ingester's leader, after validation and before persistence/replication | `counter` |\n| `quickwit_ingest` | `queue_count` | Number of queues currently active | `counter` |\n\n## Metastore Metrics\n\nAll metastore methods are monitored by the 3 metrics:\n\n| Namespace | Metric Name | Description | Labels | Type |\n| --------- | ----------- | ----------- | ------ | ---- |\n| `quickwit_metastore` | `requests_total` | Number of requests | [`operation`, `index`] | `counter` |\n| `quickwit_metastore` | `request_errors_total` | Number of failed requests | [`operation`, `index`] | `counter` |\n| `quickwit_metastore` | `request_duration_seconds` | Duration of requests | [`operation`, `index`, `error`] | `histogram` |\n\nExamples of operation names: `create_index`, `index_metadata`, `delete_index`, `stage_splits`, `publish_splits`, `list_splits`, `add_source`, ...\n\n## Rest API Metrics\n\n| Namespace | Metric Name | Description | Type |\n| --------- | ----------- | ----------- | ---- |\n| `quickwit` | `http_requests_total` | Total number of HTTP requests received | `counter` |\n\n## Search Metrics\n\n| Namespace | Metric Name | Description | Type |\n| --------- | ----------- | ----------- | ---- |\n| `quickwit_search` | `leaf_searches_splits_total` | Number of leaf searches (count of splits) started | `counter` |\n| `quickwit_search` | `leaf_search_split_duration_secs` | Number of seconds required to run a leaf search over a single split. 
The timer starts after the semaphore is obtained | `histogram` |\n| `quickwit_search` | `active_search_threads_count` | Number of threads in use in the CPU thread pool | `gauge` |\n\n## Storage Metrics\n\n| Namespace | Metric Name | Description | Type |\n| --------- | ----------- | ----------- | ---- |\n| `quickwit_storage` | `object_storage_gets_total` | Number of objects fetched | `counter` |\n| `quickwit_storage` | `object_storage_puts_total` | Number of objects uploaded. May differ from `object_storage_puts_parts` due to multipart upload | `counter` |\n| `quickwit_storage` | `object_storage_puts_parts` | Number of object parts uploaded | `counter` |\n| `quickwit_storage` | `object_storage_download_num_bytes` | Amount of data downloaded from an object storage | `counter` |\n"
  },
  {
    "path": "docs/reference/query-language.md",
    "content": "---\ntitle: Query Language Reference\nsidebar_position: 40\n---\n\n## Pseudo-grammar\n\n```\nquery = '(' query ')'\n      | query operator query\n      | unary_operator query\n      | query query\n      | clause\n\noperator = 'AND' | 'OR'\n\nunary_operator = 'NOT' | '-'\n\nclause = field_name ':' field_clause\n       | defaultable_clause\n       | '*'\n\nfield_clause = term | term_prefix | term_set | phrase | phrase_prefix | range | '*'\ndefaultable_clause = term | term_prefix | term_set | phrase | phrase_prefix\n```\n---\n## Writing Queries\n### Escaping Special Characters\n\nSome characters must be escaped in unquoted terms because they are otherwise syntactically significant. The reserved characters are: `+`, `^`, `` ` ``, `:`, `{`, `}`, `\"`, `[`, `]`, `(`, `)`, `~`, `!`, `\\\\`, `*`, `SPACE`. If such characters appear in query terms, escape them by prefixing them with a backslash `\\`.\n\nIn quoted terms, the quote character in use (`'` or `\"`) needs to be escaped.\n\n###### Allowed characters in field names\n\nSee the [Field name validation rules](https://quickwit.io/docs/configuration/index-config#field-name-validation-rules) in the index config documentation.\n\n### Addressing nested structures\n\nData stored deep inside nested data structures like `object` or `json` fields can be addressed using dots as separators in the field name.\nFor instance, the document `{\"product\": {\"attributes\": {\"color\": \"red\"}}}` is matched by\n```\nproduct.attributes.color:red\n```\n\nIf the keys of your object contain dots, the above syntax has some ambiguity: by default `{\"k8s.component.name\": \"quickwit\"}` will be matched by \n```k8s.component.name:quickwit```\n\nIt is possible to remove the ambiguity by setting `expand_dots` in the json field configuration. 
\nIn that case, it will be necessary to escape the `.` in the query to match this document like this :\n```\nk8s\\.component\\.name:quickwit\n```\n\n---\n\n## Structured data\n### Datetime\nDatetime values must be provided in rfc3339 format, such as `1970-01-01T00:00:00Z`\n\n### IP addresses\nIP addresses can be provided as IPv4 or IPv6. It is recommended to search with the format used when indexing documents.\nThere is no support for searching for a range of IP using CIDR notation, but you can use normal range queries.\n\n---\n\n## Types of clauses\n\n### Term `field:term`\n```\nterm = term_char+\n```\n\nMatches documents if the targeted field contains a token equal to the provided term. \n\n`field:value` will match any document where the field 'field' has a token 'value'.\n\n### Wildcard `field:wil?car*d`\n```\nwildcard = [term_char\\*\\?]+\n```\n\nMatches documents if the targeted field contains a token that matches the wildcard:\n- `?` replaces one and only one term character\n- `*` replaces any number of term characters or an empty string\n\nExamples:\n- `field:quick*` will match any document where the field 'field' has a token like `quickwit` or `quickstart`, but not `qui` or `abcd`.\n- `field:h?llo` will match any document where the field 'field' has a token like `hello` or `hallo`, but not `heillo` or `hllo`.\n\nQueries with prefixes (`field:qui*`) are much more efficient than queries starting with a wildcard (`field:*wit`)\n\n\n### Term set `field:IN [a b c]`\n```\nterm_set = 'IN' '[' term_list ']'\nterm_list = term_list term\n          | term\n```\nMatches if the document contains any of the tokens provided. \n\n###### Examples\n`field:IN [ab cd]` will match 'ab' or 'cd', but nothing else.\n\n###### Performance Note\nThis is a lot like writing `field:ab OR field:cd`. 
When there are only a handful of terms to search for, using ORs is usually faster.\nWhen there are many values to match, a term set query can become more efficient.\n\n<!-- previously a field was required. It looks like it may no longer be the case -->\n\n### Phrase `field:\"sequence of words\"`\n```\nphrase = phrase_string\n       | phrase_string slop\nphrase_string = '\"' phrase_char '\"'\nslop = '~' [0-9]+\n\n```\n\nMatches if the field contains the sequence of tokens provided:\n- `field:\"looks good to me\"` will match any document containing that sequence of tokens.\n- `field:\"look* good to me\"` with the default tokenizer is equivalent to `field:\"look good to me\"`, i.e. the '*' character is pruned by the tokenizer and not interpreted as a wildcard.\n\n:::info\n\nThe field must have been configured with `record: position` when indexing.\n\n:::\n\n###### Slop operator\nIt is also possible to add a slop, which allows matching a sequence with some distance between its tokens. For instance `\"looks to me\"~1` will match \"looks good to me\", but not \"looks very good to me\".\nTransposition costs 2, e.g. `\"A B\"~1` will not match `\"B A\"` but it would with `\"A B\"~2`.\nTransposition is not a special case: in the example above, A is moved 1 position and B is moved 1 position, so the slop is 2.\n\n### Phrase Prefix `field:\"finish this phr\"*`\n```\nphrase_prefix = phrase '*'\n```\n\nMatches if the field contains the sequence of tokens provided, where the last token in the query may be only a prefix of the token in the document.\n\nThe field must have been configured with `record: position` when indexing.\n\nThere is no slop for phrase prefix queries.\n\n###### Examples\n `field:\"thanks for your contrib\"*` will match 'thanks for your contribution'.\n\n###### Limitation\n\nQuickwit may trim some results matched by this clause in some cases.  
If you search for `\"thanks for your co\"*`, it will enumerate the first 50 tokens which start with \"co\" (in their storage order), and search for any documents where \"thanks for your\" is followed by any of these tokens.\n\nIf there are many tokens starting with \"co\", \"contribution\" might not be one of the 50 selected tokens, and the query won't match a document containing \"thanks for your contribution\". Normal prefix queries don't suffer from this issue.\n\n### Range `field:[low_bound TO high_bound}`\n```\nrange = explicit_range | comparison_range\n\nexplicit_range = left_bound_char bounds right_bound_char\nleft_bound_char = '[' | '{' \nright_bound_char = '}' | ']'\nbounds = term TO term\n       | term TO '*'\n       | '*' TO term\n\ncomparison_range = comparison_operator term\ncomparison_operator = '<' | '>' | '<=' | '>='\n```\n\nMatches if the document contains a token between the provided bounds for that field.\nFor range queries, you must provide a field. Quickwit won't use `default_search_fields` automatically.\n\n###### Order\nFor text fields, the ranges are defined by lexicographic order on UTF-8 encoded byte arrays. This means that for a text field, 100 is between 1 and 2.\n<!-- TODO: Build a more comprehensive example set to showcase how characters are sorted -->\n\nWhen using ranges on integers, it behaves naturally.\n\n###### Inclusive and exclusive bounds\nInclusive bounds are represented by square brackets `[]`. They will match tokens equal to the bound term.\nExclusive bounds are represented by curly brackets `{}`. They will not match tokens equal to the bound term.\n\n###### Half-Open bounds\nYou can make a half-open range by using `*` as one of the bounds. 
`field:[b TO *]` will match 'bb' and 'zz', but not 'ab'.\nYou can also use a comparison based syntax:`field:<b`, `field:>b`, `field:<=b` or `field:>=b`.\n\n<!-- NOTE : empty values likely not indexed -->\n\n###### Examples\n- Inclusive Range: `ip:[127.0.0.1 TO 127.0.0.50]`\n- Exclusive Range: `ip:{127.0.0.1 TO 127.0.0.50}`\n- Unbounded Inclusive Range: `ip:[127.0.0.1 TO *] or ip:>=127.0.0.1`\n- Unbounded Exclusive Range: `ip:{127.0.0.1 TO *] or ip:>127.0.0.1`\n\n\n### Exists `field:*`\n\nMatches documents where the field is set. You have to specify a field for this query, Quickwit won't use `default_search_fields` automatically.\n\n### Match All `*`\n\nMatches every document. You can't put a field in front. It is simply written as `*`.\n\n---\n\n## Building Queries\nMost queries are composed of more than one clause. When doing so, you may add operators between clauses.\n\nImplicitly if no operator is provided, 'AND' is assumed.\n\n### Conjunction `AND`\nAn `AND` query will match only if both sides match.\n\n<!-- TODO: Formal example ?*-->\n\n### Disjunction `OR`\nAn `OR` query will match if either (or both) sides match.\n\n<!-- TODO: Formal example ?*-->\n\n### Negation `NOT` or `-`\nA `NOT` query will match if the clause it is applied to does not match.\nThe `-` prefix is equivalent to the `NOT` operator.\n\n### Grouping `()`\nParentheses are used to force the order of evaluation of operators.\nFor instance, if a query should match if 'field1' is 'one' or 'two', and 'field2' is 'three', you can use `(field1:one OR field1:two) AND field2:three`.\n\n### Operator Precedence\nWithout parentheses, `AND` takes precedence over `OR`. 
That is, `a AND b OR c` is interpreted as `(a AND b) OR c`.\n\n`NOT` and `-` take precedence over everything, such that `-a AND b` means `(-a) AND b`, not `-(a AND b)`.\n\n\n---\n\n## Other considerations \n\n### Default Search Fields\nIn many cases it is possible to omit the field you search if it was configured in the `default_search_fields` array of the index configuration. If more than one field is configured as default, the resulting implicit clauses are combined using a disjunction ('OR').\n\n### Tokenization\nNote that the result of a query can depend on the tokenizer used for the field getting searched. Hence this document always speaks of tokens, which may be the exact value the document contains (in case of the raw tokenizer), or a subset of it (for instance any tokenizer cutting on spaces).\n\n<!-- NOTE : should dig deeper ? -->\n"
  },
  {
    "path": "docs/reference/rest-api.md",
    "content": "---\ntitle: REST API\nsidebar_position: 10\n---\n\n## API version\n\nAll the API endpoints start with the `api/v1/` prefix. `v1` indicates that we are currently using version 1 of the API.\n\n\n## OpenAPI specification\n\nThe OpenAPI specification of the REST API is available at `/openapi.json` and a Swagger UI version is available at `/ui/api-playground`.\n\n## Parameters\n\nParameters passed in the URL must be properly URL-encoded, using the UTF-8 encoding for non-ASCII characters.\n\n```\nGET [..]/search?query=barack%20obama\n```\n\n## Error handling\n\nSuccessful requests return a 2xx HTTP status code.\n\nFailed requests return a 4xx HTTP status code. The response body of failed requests holds a JSON object containing a `message` field that describes the error.\n\n```json\n{\n \"message\": \"Failed to parse query\"\n}\n```\n\n## Search API\n\n### Search in an index\n\nSearch for documents matching a query in the given index `api/v1/<index id>/search`. This endpoint is available as long as you have at least one node running a searcher service in the cluster.\nThe search endpoint accepts `GET` and `POST` requests. The [parameters](#get-parameters) are URL parameters for `GET` requests or JSON key-value pairs for `POST` requests.\n\n```\nGET api/v1/<index id>/search?query=searchterm\n```\n\n```\nPOST api/v1/<index id>/search\n{\n  \"query\": \"searchterm\"\n}\n```\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `index id`  | The index id  |\n\n#### Parameters\n\n| Variable            | Type       | Description     | Default value   |\n|---------------------|------------|-----------------|-----------------|\n| `query`           | `String`   | Query text. See the [query language doc](query-language.md) | _required_ |\n| `start_timestamp` | `i64`      | If set, restrict search to documents with a `timestamp >= start_timestamp`, taking advantage of potential time pruning opportunities. 
The value must be in seconds. | |\n| `end_timestamp`   | `i64`      | If set, restrict search to documents with a `timestamp < end_timestamp`, taking advantage of potential time pruning opportunities. The value must be in seconds.    | |\n| `start_offset`    | `Integer`  | Number of documents to skip | `0` |\n| `max_hits`        | `Integer`  | Maximum number of hits to return (by default 20) | `20` |\n| `search_field`    | `[String]` | Fields to search on if no field name is specified in the query. Comma-separated list, e.g. \"field1,field2\"  | index_config.search_settings.default_search_fields |\n| `snippet_fields`  | `[String]` | Fields to extract snippet on. Comma-separated list, e.g. \"field1,field2\"  | |\n| `sort_by`         | `[String]` | Fields to sort the query results on. You can sort by one or two fast fields or by BM25 `_score` (requires fieldnorms). By default, hits are sorted in reverse order of their [document ID](/docs/overview/concepts/querying.md#document-id) (to show recent events first). | |\n| `format`          | `Enum`     | The output format. Allowed values are \"json\" or \"pretty_json\" | `pretty_json` |\n| `aggs`            | `JSON`     | The aggregations request. See the [aggregations doc](aggregation.md) for supported aggregations. 
| |\n\n:::info\nThe `start_timestamp` and `end_timestamp` should be specified in seconds regardless of the timestamp field precision.\n:::\n\n#### Response\n\nThe response is a JSON object, and the content type is `application/json; charset=UTF-8`.\n\n| Field                   | Description                    | Type       |\n| --------------------    | ------------------------------ | :--------: |\n| `hits`                | Results of the query           | `[hit]`    |\n| `num_hits`            | Total number of matches        | `number`   |\n| `elapsed_time_micros` | Processing time of the query   | `number`   |\n\n### Search multiple indices\nSearch APIs that accept an `index id` request path parameter also support multi-target syntax.\n\n#### Multi-target syntax\n\nIn multi-target syntax, you can use a list separated by commas (or their URL-encoded version '%2C') to run a request on multiple indices: test1,test2,test3. You can also use [glob-like](https://en.wikipedia.org/wiki/Glob_(programming)) wildcard ( \\* ) expressions to target indices that match a pattern: test\\* or \\*test or te\\*t or \\*test\\*.\n\nThe multi-target expression has the following constraints:\n\n    - It must follow the regex `^[a-zA-Z\\*][a-zA-Z0-9-_\\.\\*]{0,254}$`.\n    - It cannot contain consecutive asterisks (`*`).\n    - If it does not contain an asterisk (`*`), the length must be greater than or equal to 3 characters.\n\n#### Examples\n```\nGET api/v1/stackoverflow-000001,stackoverflow-000002/search\n{\n    \"query\": \"search AND engine\"\n}\n```\n\n```\nGET api/v1/stackoverflow*/search\n{\n    \"query\": \"search AND engine\"\n}\n```\n\n## Ingest API\n\n### Ingest data into an index\n\n```\nPOST api/v1/<index id>/ingest -d 
\\\n'{\"url\":\"https://en.wikipedia.org/wiki?id=1\",\"title\":\"foo\",\"body\":\"foo\"}\n{\"url\":\"https://en.wikipedia.org/wiki?id=2\",\"title\":\"bar\",\"body\":\"bar\"}\n{\"url\":\"https://en.wikipedia.org/wiki?id=3\",\"title\":\"baz\",\"body\":\"baz\"}'\n```\n\nIngest a batch of documents to make them searchable in a given `<index id>`. Currently, NDJSON is the only accepted payload format. This endpoint is only available on a node that is running an indexer service.\n\n#### Controlling when the indexed documents will be available for search\n\nNewly added documents will not appear in the search results until they are added to a split and that split is committed. This process is automatic and is controlled by `split_num_docs_target` and `commit_timeout_secs` parameters. By default, the ingest command exits as soon as the records are added to the indexing queue, which means that the new documents will not appear in the search results at this moment. This behavior can be changed by adding `commit=wait_for` or `commit=force` parameters to the query. The `wait_for` parameter will cause the command to wait for the documents to be committed according to the standard time or number of documents rules. The `force` parameter will trigger a commit after all documents in the request are processed. It will also wait for this commit to finish before returning. 
Please note that the `force` option may have a significant performance cost, especially if it is used on small batches.\n\n```\nPOST api/v1/<index id>/ingest?commit=wait_for -d \\\n'{\"url\":\"https://en.wikipedia.org/wiki?id=1\",\"title\":\"foo\",\"body\":\"foo\"}\n{\"url\":\"https://en.wikipedia.org/wiki?id=2\",\"title\":\"bar\",\"body\":\"bar\"}\n{\"url\":\"https://en.wikipedia.org/wiki?id=3\",\"title\":\"baz\",\"body\":\"baz\"}'\n```\n\n:::info\n\nThe payload size is limited to 10MB [by default](../configuration/node-config.md#ingest-api-configuration) since this endpoint is intended to receive documents in batches.\n\n:::\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `index id`  | The index id  |\n\n#### Query parameters\n\n| Variable            | Type       | Description                                        | Default value |\n|---------------------|------------|----------------------------------------------------|---------------|\n| `commit`            | `String`   | The commit behavior: `auto`, `wait_for` or `force` | `auto`        |\n| `detailed_response` | `bool`     | Enable `parse_failures` in the response. Setting it to `true` might negatively impact performance. | `false`        |\n\n#### Response\n\nThe response is a JSON object, and the content type is `application/json; charset=UTF-8`.\n\n| Field                       | Description                                                                                      |   Type   |\n|-----------------------------|--------------------------------------------------------------------------------------------------|:--------:|\n| `num_docs_for_processing` | Total number of documents submitted for processing. The documents may not have been processed. 
| `number` |\n| `num_ingested_docs`       | Number of documents successfully persisted in the write ahead log | `number` |\n| `num_rejected_docs`       | Number of documents that couldn't be parsed (invalid json, bad schema...) | `number` |\n| `parse_failures`          | List detailing parsing failures. Only available if `detailed_response` is set to `true`. | `list(object)` |\n\nThe parse failure objects contain the following fields:\n- `message`: a detailed message explaining the error\n- `reason`: one of `invalid_json`, `invalid_schema` or `unspecified`\n- `document`: the utf-8 decoded string of the document byte chunk that generated the error\n\n\n## Index API\n\n### Create an index\n\n```\nPOST api/v1/indexes\n```\n\nCreate an index by posting an `IndexConfig` payload. The API accepts JSON with `content-type: application/json` and YAML with `content-type: application/yaml`.\n\n#### POST payload\n\n| Variable            | Type               | Description                                                                                                           | Default value                         |\n|---------------------|--------------------|-----------------------------------------------------------------------------------------------------------------------|---------------------------------------|\n| `version`           | `String`           | Config format version, use the same as your Quickwit version.                                                         | _required_                            |\n| `index_id`          | `String`           | Index ID, see its [validation rules](../configuration/index-config.md#index-id) on identifiers.                       | _required_                            |\n| `index_uri`         | `String`           | Defines where the index files are stored. This parameter expects a [storage URI](../configuration/storage-config.md#storage-uris).           
| `{default_index_root_uri}/{index_id}` |\n| `doc_mapping`       | `DocMapping`       | Doc mapping object as specified in the [index config docs](../configuration/index-config.md#doc-mapping).             | _required_                            |\n| `indexing_settings` | `IndexingSettings` | Indexing settings object as specified in the [index config docs](../configuration/index-config.md#indexing-settings). |                                       |\n| `search_settings`   | `SearchSettings`   | Search settings object as specified in the [index config docs](../configuration/index-config.md#search-settings).     |                                       |\n| `retention`         | `Retention`        | Retention policy object as specified in the [index config docs](../configuration/index-config.md#retention-policy).   |                                       |\n\n\n**Payload Example**\n\ncurl -XPOST http://localhost:7280/api/v1/indexes --data @index_config.json -H \"Content-Type: application/json\"\n\n```json title=\"index_config.json\n{\n    \"version\": \"0.8\",\n    \"index_id\": \"hdfs-logs\",\n    \"doc_mapping\": {\n        \"field_mappings\": [\n            {\n                \"name\": \"tenant_id\",\n                \"type\": \"u64\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"app_id\",\n                \"type\": \"u64\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"timestamp\",\n                \"type\": \"datetime\",\n                \"input_formats\": [\"unix_timestamp\"],\n                \"fast_precision\": \"seconds\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"body\",\n                \"type\": \"text\",\n                \"record\": \"position\"\n            }\n        ],\n        \"partition_key\": \"tenant_id\",\n        \"max_num_partitions\": 200,\n        \"tag_fields\": [\"tenant_id\"],\n        
\"timestamp_field\": \"timestamp\"\n    },\n    \"search_settings\": {\n        \"default_search_fields\": [\"body\"]\n    },\n    \"indexing_settings\": {\n        \"merge_policy\": {\n            \"type\": \"limit_merge\",\n            \"max_merge_ops\": 3,\n            \"merge_factor\": 10,\n            \"max_merge_factor\": 12\n        }\n    },\n    \"retention\": {\n        \"period\": \"7 days\",\n        \"schedule\": \"@daily\"\n    }\n}\n```\n\n#### Response\n\nThe response is the index metadata of the created index, and the content type is `application/json; charset=UTF-8.`\n\n| Field                | Description                                   |         Type          |\n|----------------------|-----------------------------------------------|:---------------------:|\n| `version`          | The current index configuration format version. |       `string`        |\n| `index_uid`        | The server-generated index UID.                 |       `string`        |\n| `index_config`     | The posted index config.                        |     `IndexConfig`     |\n| `checkpoint`       | Map of checkpoints by source.                   |   `IndexCheckpoint`   |\n| `create_timestamp` | Index creation timestamp                        |       `number`        |\n| `sources`          | List of the index sources configurations.       | `Array<SourceConfig>` |\n\n\n### Update an index\n\n```\nPUT api/v1/indexes/<index id>\n```\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `index id`    | The index id  |\n\n#### Query parameters\n\n| Variable  | Type   | Description                                   | Default value |\n|-----------|--------|-----------------------------------------------|---------------|\n| `create`  | `bool` | Create the index if it doesn't already exists | `false`       |\n\nUpdate the configurations of an index. 
This endpoint follows PUT semantics, which means that all the fields of the current configuration are replaced by the values specified in this request or the associated defaults. In particular, if a field is optional (e.g. `retention`), omitting it will delete the associated configuration. If the new configuration file contains updates that cannot be applied, the request fails, and none of the updates are applied. The API accepts JSON with `content-type: application/json` and YAML with `content-type: application/yaml`.\n\n- The retention policy update is automatically picked up by the janitor service on its next state refresh.\n- The search settings update is automatically picked up by searcher nodes when the next query is executed.\n- The indexing settings update is automatically picked up by the indexer nodes once the control plane emits a new indexing plan.\n- The doc mapping update is automatically picked up by the indexer nodes once the control plane emits a new indexing plan.\n\n:::warning\n\nIf you use the ingest or ES bulk API (V2), the old doc mapping will still be used to validate new documents that end up being persisted on existing shards (see [#5738](https://github.com/quickwit-oss/quickwit/issues/5738)).\n\n:::\n\nUpdating the doc mapping doesn't reindex existing data. Queries and results are mapped on a best-effort basis when querying older splits. 
For more details, check [the reference](updating-mapper.md) out.\n\n#### PUT payload\n\n| Variable            | Type               | Description                                                                                                           | Default value                         |\n|---------------------|--------------------|-----------------------------------------------------------------------------------------------------------------------|---------------------------------------|\n| `version`           | `String`           | Config format version, use the same as your Quickwit version.                                                         | _required_                            |\n| `index_id`          | `String`           | Index ID, must be the same index as in the request URI.                                                               | _required_                            |\n| `index_uri`         | `String`           | Defines where the index files are stored. Cannot be updated.                                                          | `{default_index_root_uri}/{index_id}`                 |\n| `doc_mapping`       | `DocMapping`       | Doc mapping object as specified in the [index config docs](../configuration/index-config.md#doc-mapping).             | _required_                            |\n| `indexing_settings` | `IndexingSettings` | Indexing settings object as specified in the [index config docs](../configuration/index-config.md#indexing-settings). |                                       |\n| `search_settings`   | `SearchSettings`   | Search settings object as specified in the [index config docs](../configuration/index-config.md#search-settings).     |                                       |\n| `retention`         | `Retention`        | Retention policy object as specified in the [index config docs](../configuration/index-config.md#retention-policy).   
|                                       |\n\n\n**Payload Example**\n\ncurl -XPUT http://localhost:7280/api/v1/indexes/hdfs-logs --data @updated_index_update.json -H \"Content-Type: application/json\"\n\n```json title=\"updated_index_update.json\n{\n    \"version\": \"0.8\",\n    \"index_id\": \"hdfs-logs\",\n    \"doc_mapping\": {\n        \"field_mappings\": [\n            {\n                \"name\": \"tenant_id\",\n                \"type\": \"u64\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"app_id\",\n                \"type\": \"u64\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"timestamp\",\n                \"type\": \"datetime\",\n                \"input_formats\": [\"unix_timestamp\"],\n                \"fast_precision\": \"seconds\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"body\",\n                \"type\": \"text\",\n                \"record\": \"position\"\n            }\n        ],\n        \"partition_key\": \"tenant_id\",\n        \"max_num_partitions\": 200,\n        \"tag_fields\": [\"tenant_id\"],\n        \"timestamp_field\": \"timestamp\"\n    },\n    \"search_settings\": {\n        \"default_search_fields\": [\"body\"]\n    },\n    \"indexing_settings\": {\n        \"merge_policy\": {\n            \"type\": \"limit_merge\",\n            \"max_merge_ops\": 3,\n            \"merge_factor\": 10,\n            \"max_merge_factor\": 12\n        }\n    },\n    \"retention\": {\n        \"period\": \"30 days\",\n        \"schedule\": \"@daily\"\n    }\n}\n```\n\n#### Response\n\nThe response is the index metadata of the updated index, and the content type is `application/json; charset=UTF-8.`\n\n| Field                | Description                             |         Type          |\n|----------------------|-----------------------------------------|:---------------------:|\n| `version`          | The 
current server configuration version. |       `string`        |\n| `index_uid`        | The server-generated index UID.            |       `string`        |\n| `index_config`     | The posted index config.                  |     `IndexConfig`     |\n| `checkpoint`       | Map of checkpoints by source.             |   `IndexCheckpoint`   |\n| `create_timestamp` | Index creation timestamp                  |       `number`        |\n| `sources`          | List of the index sources configurations. | `Array<SourceConfig>` |\n\n\n### Get an index metadata\n\n```\nGET api/v1/indexes/<index id>\n```\n\nGet the index metadata of ID `index id`.\n\n#### Response\n\nThe response is the index metadata of the requested index, and the content type is `application/json; charset=UTF-8.`\n\n| Field                | Description                               |         Type          |\n|----------------------|-------------------------------------------|:---------------------:|\n| `version`          | The current server configuration version. |       `string`        |\n| `index_uid`        | The server-generated index UID.            |       `string`        |\n| `index_config`     | The posted index config.                  |     `IndexConfig`     |\n| `checkpoint`       | Map of checkpoints by source.             |   `IndexCheckpoint`   |\n| `create_timestamp` | Index creation timestamp.                 |       `number`        |\n| `sources`          | List of the index sources configurations. 
| `Array<SourceConfig>` |\n\n\n### Describe an index\n\n```\nGET api/v1/indexes/<index id>/describe\n```\nDescribes an index of ID `index id`.\n\n#### Response\n\nThe response is the stats about the requested index, and the content type is `application/json; charset=UTF-8.`\n\n| Field                               | Description                                              |         Type          |\n|-------------------------------------|----------------------------------------------------------|:---------------------:|\n| `index_id`                          | Index ID of index.                                       |       `String`        |\n| `index_uri`                         | Uri of index                                             |       `String`        |\n| `num_published_splits`              | Number of published splits.                              |       `number`        |\n| `size_published_splits`             | Size of published splits.                                |       `number`        |\n| `num_published_docs`                | Number of published documents.                           |       `number`        |\n| `size_published_docs_uncompressed`  | Size of the published documents in bytes (uncompressed). |       `number`        |\n| `timestamp_field_name`              | Name of timestamp field.                                       |       `String`        |\n| `min_timestamp`                     | Starting time of timestamp.                              |       `number`        |\n| `max_timestamp`                     | Ending time of timestamp.                                
|       `number`        |\n\n\n### Get splits\n\n```\nGET api/v1/indexes/<index id>/splits\n```\nGet the splits belonging to the index of ID `index id`.\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `index id`  | The index id  |\n\n#### Get parameters\n\n| Variable               | Type     | Description                                                                 |\n|------------------------|----------|-----------------------------------------------------------------------------|\n| `offset`               | `number` | If set, the number of splits to skip                                        |\n| `limit`                | `number` | If set, the maximum number of splits to retrieve                            |\n| `split_states`         | `usize`  | If set, specific split state(s) to filter by                                |\n| `start_timestamp`      | `number` | If set, restrict splits to documents with a `timestamp >= start_timestamp` |\n| `end_timestamp`        | `number` | If set, restrict splits to documents with a `timestamp < end_timestamp`    |\n| `end_create_timestamp` | `number` | If set, restrict splits to those created before this date                   |\n\n\n#### Response\n\nThe response is the list of splits of the requested index, and the content type is `application/json; charset=UTF-8`.\n\n| Field                               | Description                                              |         Type          |\n|-------------------------------------|----------------------------------------------------------|:---------------------:|\n| `offset`                          | Number of splits skipped.                                |       `number`        |\n| `size`                         | Number of splits returned.                               |       `number`        |\n| `splits`              | List of split metadata.                                  
|       `List`        |\n\n#### Examples\n```\nGET /api/v1/indexes/stackoverflow/splits?offset=0&limit=10\n```\n```json\n{\n  \"offset\": 0,\n  \"size\": 1,\n  \"splits\": [\n    {\n      \"split_state\": \"Published\",\n      \"update_timestamp\": 1695642901,\n      \"publish_timestamp\": 1695642901,\n      \"version\": \"0.7\",\n      \"split_id\": \"01HB632HD8W6WHNM7CZFH3KG1X\",\n      \"index_uid\": \"stackoverflow:01HB6321TDT3SP58D4EZP14KSX\",\n      \"partition_id\": 0,\n      \"source_id\": \"_ingest-api-source\",\n      \"node_id\": \"jerry\",\n      \"num_docs\": 10000,\n      \"uncompressed_docs_size_in_bytes\": 6674940,\n      \"time_range\": {\n        \"start\": 1217540572,\n        \"end\": 1219335682\n      },\n      \"create_timestamp\": 1695642900,\n      \"maturity\": {\n        \"type\": \"immature\",\n        \"maturation_period_millis\": 172800000\n      },\n      \"tags\": [],\n      \"footer_offsets\": {\n        \"start\": 4714989,\n        \"end\": 4719999\n      },\n      \"delete_opstamp\": 0,\n      \"num_merge_ops\": 0\n    }\n  ]\n}\n```\n\n\n### Clears an index\n\n```\nPUT api/v1/indexes/<index id>/clear\n```\n\nClears index of ID `index id`: all splits will be deleted (metastore + storage) and all source checkpoints will be reset.\n\nIt returns an empty body.\n\n\n### Delete an index\n\n```\nDELETE api/v1/indexes/<index id>\n```\n\nDelete index of ID `index id`.\n\n#### Response\n\nThe response is the list of deleted split files; the content type is `application/json; charset=UTF-8.`\n\n```json\n[\n    {\n        \"split_id\": \"01GK1XNAECH7P14850S9VV6P94\",\n        \"num_docs\": 1337,\n        \"uncompressed_docs_size_bytes\": 23933408,\n        \"file_name\": \"01GK1XNAECH7P14850S9VV6P94.split\",\n        \"file_size_bytes\": 2991676\n    }\n]\n```\n\n### Get all indexes metadata\n\n```\nGET api/v1/indexes\n```\n\nRetrieve the metadata of all indexes present in the metastore.\n\n#### Response\n\nThe response is an array of 
`IndexMetadata`, and the content type is `application/json; charset=UTF-8`.\n\n\n### Create a source\n\n```\nPOST api/v1/indexes/<index id>/sources\n```\n\nCreate a source by posting a source config JSON payload.\n\n#### POST payload\n\n| Variable          | Type     | Description                                                                            | Default value |\n|-------------------|----------|----------------------------------------------------------------------------------------|---------------|\n| `version`       | `String` | Config format version, put your current Quickwit version.                               | _required_    |\n| `source_id`     | `String` | Source ID. See ID [validation rules](../configuration/source-config.md).                 | _required_    |\n| `source_type`   | `String` | Source type: `kafka`, `kinesis` or `pulsar`.                                             | _required_    |\n| `num_pipelines` | `usize`  | Number of running indexing pipelines per node for this source.                           | `1`           |\n| `transform`     | `object` | A [VRL](https://vector.dev/docs/reference/vrl/) transformation applied to incoming documents, as defined in [source config docs](../configuration/source-config.md#transform-parameters).                          | `null`         |\n| `params`        | `object` | Source parameters as defined in [source config docs](../configuration/source-config.md). 
| _required_    |\n\n\n**Payload Example**\n\ncurl -XPOST http://localhost:7280/api/v1/indexes/my-index/sources --data @source_config.json -H \"Content-Type: application/json\"\n\n```json title=\"source_config.json\"\n{\n    \"version\": \"0.8\",\n    \"source_id\": \"kafka-source\",\n    \"source_type\": \"kafka\",\n    \"params\": {\n        \"topic\": \"quickwit-fts-staging\",\n        \"client_params\": {\n            \"bootstrap.servers\": \"kafka-quickwit-server:9092\"\n        }\n    }\n}\n```\n\n#### Response\n\nThe response is the created source config, and the content type is `application/json; charset=UTF-8`.\n\n### Update a source\n\n```\nPUT api/v1/indexes/<index id>/sources/<source id>\n```\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `index id`    | The index id  |\n| `source id`   | The source id  |\n\n#### Query parameters\n\n| Variable  | Type   | Description                                   | Default value |\n|-----------|--------|-----------------------------------------------|---------------|\n| `create`  | `bool` | Create the source if it doesn't already exist | `false`       |\n\nUpdate a source by sending a source config JSON payload.\n\n#### PUT payload\n\n| Variable          | Type     | Description                                                                            | Default value |\n|-------------------|----------|----------------------------------------------------------------------------------------|---------------|\n| `version`       | `String` | Config format version, put your current Quickwit version.                               | _required_    |\n| `source_id`     | `String` | Source ID, must be the same source as in the request URL.                                | _required_    |\n| `source_type`   | `String` | Source type: `kafka`, `kinesis` or `pulsar`. Cannot be updated.                          
| _required_    |\n| `num_pipelines` | `usize`  | Number of running indexing pipelines per node for this source.                           | `1`           |\n| `transform`     | `object` | A [VRL](https://vector.dev/docs/reference/vrl/) transformation applied to incoming documents, as defined in [source config docs](../configuration/source-config.md#transform-parameters).                          | `null`         |\n| `params`        | `object` | Source parameters as defined in [source config docs](../configuration/source-config.md). | _required_    |\n\n:::warning\n\nWhile updating `num_pipelines` and `transform` is generally safe and reversible, updating `params` has consequences specific to the source type and might have side effects such as losing the source's checkpoints. Perform such updates with great care.\n\n:::\n\n**Payload Example**\n\ncurl -XPUT http://localhost:7280/api/v1/indexes/my-index/sources/kafka-source --data @source_config.json -H \"Content-Type: application/json\"\n\n```json title=\"source_config.json\"\n{\n    \"version\": \"0.8\",\n    \"source_id\": \"kafka-source\",\n    \"source_type\": \"kafka\",\n    \"params\": {\n        \"topic\": \"quickwit-fts-staging\",\n        \"client_params\": {\n            \"bootstrap.servers\": \"kafka-quickwit-server:9092\"\n        }\n    }\n}\n```\n\n#### Response\n\nThe response is the updated source config, and the content type is `application/json; charset=UTF-8`.\n\n### Toggle source\n\n```\nPUT api/v1/indexes/<index id>/sources/<source id>/toggle\n```\n\nToggle (enable/disable) source `source id` of index ID `index id`.\n\nIt returns an empty body.\n\n#### PUT payload\n\n| Variable          | Type     | Description                                                                                          |\n|-------------------|----------|------------------------------------------------------------------------------------------------------|\n| `enable`       | `bool` | If `true` enable the source, else disable 
it.                                |\n\n### Reset source checkpoint\n\n```\nPUT api/v1/indexes/<index id>/sources/<source id>/reset-checkpoint\n```\n\nResets checkpoints of source `source id` of index ID `index id`.\n\nIt returns an empty body.\n\n### Delete a source\n\n```\nDELETE api/v1/indexes/<index id>/sources/<source id>\n```\n\nDelete source of ID `<source id>`.\n\n\n## Cluster API\n\nThis endpoint lets you check the state of the cluster from the point of view of the node handling the request.\n\n```\nGET api/v1/cluster?format=pretty_json\n```\n\n#### Parameters\n\nName | Type | Description | Default value\n--- | --- | --- | ---\n`format` | `String` | The output format requested for the response: `json` or `pretty_json` | `pretty_json`\n\n\n## Delete API\n\nThe delete API lets you delete documents matching a query.\n\n### Create a delete task\n\n```\nPOST api/v1/<index id>/delete-tasks\n```\n\nCreate a delete task that will delete all documents matching the provided query in the given index `<index id>`.\nThe endpoint simply appends your delete task to the delete task queue in the metastore. The deletion will eventually be executed.\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `index id`  | The index id  |\n\n\n#### POST payload `DeleteQuery`\n\n\n| Variable            | Type       | Description                                                                                             | Default value                                      |\n|---------------------|------------|---------------------------------------------------------------------------------------------------------|----------------------------------------------------|\n| `query`           | `String`   | Query text. See the [query language doc](query-language.md)                                               | _required_                                         |\n| `search_field`    | `[String]` | Fields to search on. 
Comma-separated list, e.g. \"field1,field2\"                                           | index_config.search_settings.default_search_fields |\n| `start_timestamp` | `i64`      | If set, restrict search to documents with a `timestamp >= start_timestamp`. The value must be in seconds. |                                                    |\n| `end_timestamp`   | `i64`      | If set, restrict search to documents with a `timestamp < end_timestamp`. The value must be in seconds.    |                                                    |\n\n\n**Example**\n\n```json\n{\n    \"query\": \"body:trash\",\n    \"start_timestamp\": 1669738645,\n    \"end_timestamp\": 1669825046\n}\n```\n\n#### Response\n\nThe response is the created delete task, a `DeleteTask` represented in JSON, and the content type is `application/json; charset=UTF-8`.\n\n| Field                | Description                                            |     Type      |\n|----------------------|--------------------------------------------------------|:-------------:|\n| `create_timestamp` | Create timestamp of the delete query in seconds        |     `i64`     |\n| `opstamp`          | Unique operation stamp associated with the delete task |     `u64`     |\n| `delete_query`     | The posted delete query                                | `DeleteQuery` |\n\n\n### List delete queries\n\n```\nGET api/v1/<index id>/delete-tasks\n```\n\nGet the list of delete tasks for a given `index_id`.\n\n\n#### Response\n\nThe response is an array of `DeleteTask`.\n\n\n## Index template API\n\nThis API manages index template resources. Templates are higher level configuration objects used to automatically create indexes according to predefined rules. See [index template configuration](../configuration/template-config.md).\n\n### Create a template\n\n```\nPOST api/v1/templates\n```\n\n#### POST payload\n\nCreate an index template by posting a [template configuration](../configuration/template-config.md) payload. 
The API accepts JSON with the header `content-type: application/json` and YAML with `content-type: application/yaml`.\n\n**Example**\n\n```yaml\nversion: 0.9 # File format version.\n\ntemplate_id: \"all-logs\"\n\nindex_root_uri: \"s3://my-bucket/logs/\"\n\ndescription: \"All my logs\"\n\nindex_id_patterns:\n    - logs-*\n\npriority: 100\n\ndoc_mapping:\n  mode: dynamic\n  field_mappings:\n    - name: timestamp\n      type: datetime\n      input_formats:\n        - unix_timestamp\n      output_format: unix_timestamp_secs\n      fast: true\n  timestamp_field: timestamp\n```\n\n#### Response\n\nThe created index template configuration as JSON.\n\n\n### Update a template\n\n```\nPUT api/v1/templates/<template id>\n```\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `template id` | The template id  |\n\n\n#### PUT payload\n\nUpdate an index template by sending a [template configuration](../configuration/template-config.md) payload. The API accepts JSON with the header `content-type: application/json` and YAML with `content-type: application/yaml`.\n\n**Example**\n\nSee [create endpoint](#create-a-template).\n\n#### Response\n\nThe updated template configuration as JSON.\n\n### List the templates\n\n```\nGET api/v1/templates\n```\n\n#### Response\n\nAn array with all the existing index template configurations as JSON.\n\n### Get a template\n\n```\nGET api/v1/templates/<template id>\n```\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `template id` | The template id  |\n\n#### Response\n\nThe requested index template configuration as JSON.\n\n### Delete a template\n\n```\nDELETE api/v1/templates/<template id>\n```\n\n#### Path variable\n\n| Variable      | Description   |\n| ------------- | ------------- |\n| `template id` | The template id  |\n\n#### Response\n\nEmpty response.\n"
  },
  {
    "path": "docs/reference/updating-mapper.md",
    "content": "# Updating the doc mapping of an index\n\nQuickwit allows updating the mapping it uses to add more fields to an existing index or change how they are indexed. In doing so, it does not reindex existing data but still lets you search through older documents where possible.\n\n## Indexing\n\nWhen you update a doc mapping for an index, Quickwit will restart indexing pipelines to take the changes into account. As both this operation and the document ingestion are asynchronous, there is no strict happens-before relationship between ingestion and update. This means a document ingested just before the update may be indexed according to the newer doc mapper, and document ingested just after the update may be indexed with the older doc mapper.\n\n:::warning\n\nIf you use the ingest or ES bulk API (V2), the old doc mapping will still be used to validate new documents that end up being persisted on existing shards (see [#5738](https://github.com/quickwit-oss/quickwit/issues/5738)).\n\n:::\n\n## Querying\n\nQuickwit always validate queries against the most recent mapping.\nIf a query was valid under a previous mapping but is not compatible with the newer mapping, that query will be rejected.\nFor instance if a field which was indexed no longer is, any query that uses it will become invalid.\nOn the other hand, if a query was not valid for a previous doc mapping, but is valid under the new doc mapping, Quickwit will process the query.\nWhen querying newer splits, it will behave normally, when querying older splits, it will try to execute the query as correctly as possible.\nIf you find yourself in a situation where older splits causes a valid request to return an error, please open a bug report.\nSee examples 1 and 2 below for clarification.\n\nChange in tokenizer affect only newer splits, older splits keep using the tokenizers they were created with.\n\nDocument retrieved are mapped from Quickwit internal format to JSON based on the latest doc mapping. 
This means if fields are deleted,\nthey will stop appearing (see also Reversibility below) unless the mapper mode is dynamic. If the type of some field changed, it will be converted on a best-effort basis:\nintegers will get turned into text, and text will get turned into an integer when possible; otherwise, the field is omitted.\nSee example 3 for clarification.\n\n## Reversibility\n\nQuickwit does not modify existing data when receiving a new doc mapping. If you realize that you updated the mapping in a wrong way, you can re-update your index using the previous mapping. Documents indexed while the mapping was wrong will be impacted, but any document that was committed before the change will be queryable as if nothing happened.\n\n## Type update reference\n\nConversion from a type to itself is omitted. Conversions that never succeed and always omit the field are omitted, too.\n\n<!-- this is extracted from `quickwit_doc_mapper::default_doc_mapper::value_to_json()` -->\n| type before | type after | behavior |\n|-------------|------------|----------|\n| u64/i64/f64 | text | convert to decimal string |\n| datetime | text | convert to RFC 3339 textual representation |\n| ip | text | convert to IPv6 representation. 
For IPv4, convert to IPv4-mapped IPv6 address (`::ffff:1.2.3.4`) |\n| bool | text | convert to \"true\" or \"false\" |\n| u64/i64/f64 | bool | convert 0/0.0 to false and 1/1.0 to true, otherwise omit |\n| text | bool | convert if \"true\" or \"false\" (lowercase), otherwise omit |\n| text | ip | convert if valid IPv4 or IPv6, otherwise omit |\n| text | f64 | convert if valid floating point number, otherwise omit |\n| u64/i64 | f64 | convert, possibly with loss of precision |\n| bool | f64 | convert to 0.0 for false, and 1.0 for true |\n| text | u64 | convert if valid integer in range 0..2\\*\\*64, otherwise omit |\n| i64 | u64 | convert if in range 0..2\\*\\*63, otherwise omit |\n| f64 | u64 | convert if in range 0..2\\*\\*64, possibly with loss of precision, otherwise omit |\n| text | i64 | convert if valid integer in range -2\\*\\*63..2\\*\\*63, otherwise omit |\n| u64 | i64 | convert if in range 0..2\\*\\*63, otherwise omit |\n| f64 | i64 | convert if in range -2\\*\\*63..2\\*\\*63, possibly with loss of precision, otherwise omit |\n| bool | i64 | convert to 0 for false, and 1 for true |\n| text | datetime | parse according to current input\\_format, otherwise omit |\n| u64 | datetime | parse according to current input\\_format, otherwise omit |\n| i64 | datetime | parse according to current input\\_format, otherwise omit |\n| f64 | datetime | parse according to current input\\_format, otherwise omit |\n| array\\<T\\> | array\\<U\\> | convert individual elements, skipping over those which can't be converted |\n| T | array\\<U\\> | convert element, emitting an array of a single element, or an empty array if it can't be converted |\n| array\\<T\\> | U | convert individual elements, keeping the first which can be converted |\n| json | object | try to convert individual elements if they exist inside the object, omitting those which can't be converted |\n| object | json | convert individual elements. 
Previous lists of one element are converted to a single element not in an array. |\n\n## Examples\n\nIn the examples below, fields that are not relevant are removed for conciseness; you will not be able to use these index configs as is.\n\n### Example 1\n\nbefore:\n```yaml\ndoc_mapping:\n  field_mappings:\n    - name: field1\n      type: text\n      tokenizer: raw\n```\n\nafter:\n```yaml\ndoc_mapping:\n  field_mappings:\n    - name: field1\n      type: text\n      indexed: false\n```\n\nA field changed from being indexed to not being indexed.\nA query such as `field1:my_value` was valid, but is now rejected.\n\n### Example 2\n\nbefore:\n```yaml\ndoc_mapping:\n  field_mappings:\n    - name: field1\n      type: text\n      indexed: false\n    - name: field2\n      type: text\n      tokenizer: raw\n\n```\n\nafter:\n```yaml\ndoc_mapping:\n  field_mappings:\n    - name: field1\n      type: text\n      tokenizer: raw\n    - name: field2\n      type: text\n      tokenizer: raw\n```\n\nA field changed from being not indexed to being indexed.\nA query such as `field1:my_value` was invalid before, and is now valid. When querying older splits, it won't return a match, but won't return an error either.\nA query such as `field1:my_value OR field2:my_value` is now valid too. For old splits, it will return the same results as `field2:my_value` because `field1` wasn't indexed before. 
For newer splits, it will return the expected results.\nA query such as `NOT field1:my_value` would return all documents for old splits, and only documents where `field1` is not `my_value` for newer splits.\n\n\n### Example 3\n\nThis example shows casts (trivial, valid, and invalid) and the conversion of an array to a single value.\n\nbefore:\n```yaml\ndoc_mapping:\n  field_mappings:\n    - name: field1\n      type: text\n    - name: field2\n      type: u64\n    - name: field3\n      type: array<text>\n```\ndocuments present before the update:\n```json\n{\n  \"field1\": \"123\",\n  \"field2\": 456,\n  \"field3\": [\"abc\", \"def\"]\n}\n{\n  \"field1\": \"message\",\n  \"field2\": 987,\n  \"field3\": [\"ghi\"]\n}\n```\n\nafter:\n```yaml\ndoc_mapping:\n  field_mappings:\n    - name: field1\n      type: u64\n    - name: field2\n      type: text\n    - name: field3\n      type: text\n```\n\nWhen querying this index, the documents returned would become:\n```json\n{\n  \"field1\": 123,\n  \"field2\": \"456\",\n  \"field3\": \"abc\"\n}\n{\n  // field1 is missing because \"message\" can't be converted to int\n  \"field2\": \"987\",\n  \"field3\": \"ghi\"\n}\n```\n"
  },
  {
    "path": "docs/telemetry.md",
    "content": "---\ntitle: Telemetry\nsidebar_position: 12\n---\n\nQuickwit, Inc. collects anonymous data regarding general usage to help us drive our development. Privacy and transparency are at the heart of Quickwit values and we only collect the minimal useful data and don't use any third party tool for the collection.\n\n## Disabling data collection\n\nData collection are opt-out. To disable them, just set the environment variable `QW_DISABLE_TELEMETRY` to whatever value.\n\n```bash\nexport QW_DISABLE_TELEMETRY=1\n```\n\nLook at `--help` command output to check whether telemetry is enabled or not:\n```bash\nquickwit --help\nQuickwit 0.7\nSub-second search & analytics engine on cloud storage.\n  Find more information at https://quickwit.io/docs\n\nTelemetry enabled\n```\n\nThe line `Telemetry enabled` disappears when you disable it.\n\n## Which data are collected?\n\nWe collect the minimum amount of information to respect privacy. Here are the data collected:\n- type of events among create, index, delete and serve events\n- client information:\n  - session uuid: uuid generated on the fly\n  - quickwit version\n  - os (linux, macos, freebsd, android...)\n  - architecture of the CPU\n  - md5 hash of host and username\n  - a boolean to know if `KUBERNETES_SERVICE_HOST` is set.\n\nAll data are sent to `telemetry.quickwit.io`.\n\n## No third party\n\nWe did not want to add any untrusted third party tool in the collection so we decided to implement and host our own metric collection server.\n"
  },
  {
    "path": "install.sh",
    "content": "#!/bin/bash\n\n# installer.sh\n#\n# This is just a little script that can be downloaded from the internet to\n# install Quickwit.\n# It just does platform detection, fetches the latest appropriate release version from github\n# and execute the appropriate commands to download the binary.\n#\n# Heavily inspired by the Vector & Meilisearch installation scripts\n\nset -u\n\n# If PACKAGE_ROOT is unset or empty, default it.\nPACKAGE_ROOT=\"${PACKAGE_ROOT:-\"https://github.com/quickwit-oss/quickwit/releases/download\"}\"\nPACKAGE_RELEASE_API=\"${PACKAGE_RELEASE_API:-\"https://api.github.com/repos/quickwit-oss/quickwit/releases\"}\"\nPACKAGE_NAME=\"quickwit\"\n_divider=\"--------------------------------------------------------------------------------\"\n_prompt=\">>>\"\n_indent=\"   \"\n\nheader() {\n    cat 1>&2 <<EOF\n\n                                   Q U I C K W I T\n                                      Installer\n\n$_divider\nWebsite: https://quickwit.io/\nDocs: https://quickwit.io/docs/\n$_divider\n\nEOF\n}\n\nusage() {\n    cat 1>&2 <<EOF\nquickwit-install\nThe installer for Quickwit (https://quickwit.io/)\n\nUSAGE:\n    quickwit-install [FLAGS] [OPTIONS]\n\nFLAGS:\n    -h, --help              Prints help information\nEOF\n}\n\nmain() {\n    downloader --check\n    header\n    install_from_archive \"${1:-\"\"}\"\n}\n\ninstall_from_archive() {\n    need_cmd cp\n    need_cmd mv\n    need_cmd rm\n    need_cmd tar\n    need_cmd gzip\n    need_cmd chmod\n    need_cmd grep\n    need_cmd head\n    need_cmd sed\n    need_cmd curl\n\n    get_architecture || return 1\n    local _arch=\"$RETVAL\"\n    assert_nz \"$_arch\" \"arch\"\n\n    local _binary_arch=\"\"\n    case \"$_arch\" in\n        aarch64-apple-darwin)\n            _binary_arch=$_arch\n            ;;\n        x86_64-apple-darwin)\n            _binary_arch=$_arch\n            ;;\n        x86_64-*linux*-gnu)\n            _binary_arch=\"x86_64-unknown-linux-gnu\"\n            ;;\n        
aarch64-*linux*-gnu)\n            _binary_arch=\"aarch64-unknown-linux-gnu\"\n            ;;\n        *)\n            printf \"%s A pre-built package is not available for your OS architecture: %s\" \"$_prompt\" \"$_arch\"\n            printf \"\\n\"\n            err \"You can easily build it from source following the docs: https://quickwit.io/docs\"\n            ;;\n    esac\n\n    local _version=$(get_latest_version \"$1\")\n    local _archive_content_file=\"quickwit-${_version}-${_binary_arch}\"\n    local _file=\"${_archive_content_file}.tar.gz\"\n    local _url=\"${PACKAGE_ROOT}/${_version}/${_file}\"\n\n    printf \"%s Downloading Quickwit via %s\" \"$_prompt\" \"$_url\"\n    ensure downloader \"$_url\" \"$_file\"\n    printf \"\\n\"\n\n    printf \"%s Unpacking archive ...\" \"$_prompt\"\n    ensure tar -xzf \"$_file\"\n    chmod 744 \"./quickwit-${_version}/quickwit\"\n    ensure rm \"$_file\"\n    printf \"\\n\"\n\n    printf \"\\n\"\n    printf \"%s Install succeeded!\\n\" \"$_prompt\"\n    printf \"%s To start using Quickwit:\\n\" \"$_prompt\"\n    printf \"\\n\"\n    printf \"%s ./quickwit-${_version}/quickwit --version \\n\" \"$_indent\"\n    printf \"\\n\"\n    printf \"%s More information at https://quickwit.io/docs/\\n\" \"$_prompt\"\n\n    local _retval=$?\n\n    return \"$_retval\"\n}\n\n# ------------------------------------------------------------------------------\n# semverParseInto and semverLT from https://github.com/cloudflare/semver_bash/blob/master/semver.sh\n#\n# usage: semverParseInto version major minor patch special\n# version: the string version\n# major, minor, patch, special: will be assigned by the function\n# ------------------------------------------------------------------------------\n\nsemverParseInto() {\n    local RE='[^0-9]*\\([0-9]*\\)[.]\\([0-9]*\\)[.]\\([0-9]*\\)\\([0-9A-Za-z-]*\\)'\n    #MAJOR\n    eval $2=`echo $1 | sed -e \"s#$RE#\\1#\"`\n    #MINOR\n    eval $3=`echo $1 | sed -e \"s#$RE#\\2#\"`\n    #PATCH\n    eval 
$4=`echo $1 | sed -e \"s#$RE#\\3#\"`\n    #SPECIAL\n    eval $5=`echo $1 | sed -e \"s#$RE#\\4#\"`\n}\n\n# usage: semverLT version1 version2\nsemverLT() {\n    local MAJOR_A=0\n    local MINOR_A=0\n    local PATCH_A=0\n    local SPECIAL_A=0\n\n    local MAJOR_B=0\n    local MINOR_B=0\n    local PATCH_B=0\n    local SPECIAL_B=0\n\n    semverParseInto $1 MAJOR_A MINOR_A PATCH_A SPECIAL_A\n    semverParseInto $2 MAJOR_B MINOR_B PATCH_B SPECIAL_B\n\n    if [ $MAJOR_A -lt $MAJOR_B ]; then\n        return 0\n    fi\n    if [ $MAJOR_A -le $MAJOR_B ] && [ $MINOR_A -lt $MINOR_B ]; then\n        return 0\n    fi\n    if [ $MAJOR_A -le $MAJOR_B ] && [ $MINOR_A -le $MINOR_B ] && [ $PATCH_A -lt $PATCH_B ]; then\n        return 0\n    fi\n    if [ \"_$SPECIAL_A\"  == \"_\" ] && [ \"_$SPECIAL_B\"  == \"_\" ] ; then\n        return 1\n    fi\n    if [ \"_$SPECIAL_A\"  == \"_\" ] && [ \"_$SPECIAL_B\"  != \"_\" ] ; then\n        return 1\n    fi\n    if [ \"_$SPECIAL_A\"  != \"_\" ] && [ \"_$SPECIAL_B\"  == \"_\" ] ; then\n        return 0\n    fi\n    # Escape '<' so that it is a string comparison and not an input redirection.\n    if [ \"_$SPECIAL_A\" \\< \"_$SPECIAL_B\" ]; then\n        return 0\n    fi\n\n    return 1\n}\n\n# Returns the tag of the latest stable release (in terms of semver and not of release date)\nget_latest_version() {\n    GREP_SEMVER_REGEXP='v\\([0-9]*\\)[.]\\([0-9]*\\)[.]\\([0-9]*\\)$' # i.e. 
v[number].[number].[number]\n    temp_file='temp_file' # temp_file needed because the grep would start before the download is over\n    curl -s \"${PACKAGE_RELEASE_API}\" > \"$temp_file\" || return 1\n    releases=$(cat \"$temp_file\" | \\\n        grep -E \"tag_name|draft|prerelease\" \\\n        | tr -d ',\"' | cut -d ':' -f2 | tr -d ' ')\n        # Returns a list of [tag_name draft_boolean prerelease_boolean ...]\n        # Ex: v0.10.1 false false v0.9.1-rc.1 false true v0.9.0 false false...\n\n    # clean up early\n    rm -f \"$temp_file\"\n\n    if [ \"$1\" = \"--allow-any-latest-version\" ]; then\n        local first_release=$(echo $releases | { read first rest; echo $first; })\n        echo $first_release\n        return\n    fi\n\n    i=0\n    latest=\"\"\n    current_tag=\"\"\n    for release_info in $releases; do\n        if [ $i -eq 0 ]; then # Checking tag_name\n            if echo \"$release_info\" | grep -q \"$GREP_SEMVER_REGEXP\"; then # If it's not an alpha or beta release\n                current_tag=$release_info\n            else\n                current_tag=\"\"\n            fi\n            i=1\n        elif [ $i -eq 1 ]; then # Checking draft boolean\n            if [ \"$release_info\" = \"true\" ]; then\n                current_tag=\"\"\n            fi\n            i=2\n        elif [ $i -eq 2 ]; then # Checking prerelease boolean\n            if [ \"$release_info\" = \"true\" ]; then\n                current_tag=\"\"\n            fi\n            i=0\n            if [ \"$current_tag\" != \"\" ]; then # If the current_tag is valid\n                if [ \"$latest\" = \"\" ]; then # If there is no latest yet\n                    latest=\"$current_tag\"\n                else\n                    semverLT $current_tag $latest # Comparing latest and the current tag\n                    if [ $? 
-eq 1 ]; then\n                        latest=\"$current_tag\"\n                    fi\n                fi\n            fi\n        fi\n    done\n\n    echo $latest\n}\n\n# ------------------------------------------------------------------------------\n# All code below here was copied from https://sh.rustup.rs and can safely\n# be updated if necessary.\n# ------------------------------------------------------------------------------\n\nget_gnu_musl_glibc() {\n  need_cmd ldd\n  need_cmd bc\n  need_cmd awk\n  # Detect both gnu and musl\n  local _ldd_version\n  local _glibc_version\n  _ldd_version=$(ldd --version)\n  if ldd --version 2>&1 | grep -Eq 'GNU'; then\n    _glibc_version=$(echo \"$_ldd_version\" | awk '/ldd/{print $NF}')\n    if [ 1 -eq \"$(echo \"${_glibc_version} < 2.18\" | bc)\" ]; then\n      echo \"musl\"\n    else\n      echo \"gnu\"\n    fi\n  elif ldd --version 2>&1 | grep -Eq \"musl\"; then\n    echo \"musl\"\n  else\n    err \"Warning: Unable to detect architecture from ldd (using gnu-unknown)\"\n  fi\n}\n\nget_bitness() {\n    need_cmd head\n    # Architecture detection without dependencies beyond coreutils.\n    # ELF files start out \"\\x7fELF\", and the following byte is\n    #   0x01 for 32-bit and\n    #   0x02 for 64-bit.\n    # The printf builtin on some shells like dash only supports octal\n    # escape sequences, so we use those.\n    local _current_exe_head\n    _current_exe_head=$(head -c 5 /proc/self/exe )\n    if [ \"$_current_exe_head\" = \"$(printf '\\177ELF\\001')\" ]; then\n        echo 32\n    elif [ \"$_current_exe_head\" = \"$(printf '\\177ELF\\002')\" ]; then\n        echo 64\n    else\n        err \"unknown platform bitness\"\n    fi\n}\n\nget_endianness() {\n    local cputype=$1\n    local suffix_eb=$2\n    local suffix_el=$3\n\n    # detect endianness without od/hexdump, like get_bitness() does.\n    need_cmd head\n    need_cmd tail\n\n    local _current_exe_endianness\n    _current_exe_endianness=\"$(head -c 6 
/proc/self/exe | tail -c 1)\"\n    if [ \"$_current_exe_endianness\" = \"$(printf '\\001')\" ]; then\n        echo \"${cputype}${suffix_el}\"\n    elif [ \"$_current_exe_endianness\" = \"$(printf '\\002')\" ]; then\n        echo \"${cputype}${suffix_eb}\"\n    else\n        err \"unknown platform endianness\"\n    fi\n}\n\nget_architecture() {\n    local _ostype _cputype _bitness _arch\n    _ostype=\"$(uname -s)\"\n    _cputype=\"$(uname -m)\"\n\n    if [ \"$_ostype\" = Linux ]; then\n        if [ \"$(uname -o)\" = Android ]; then\n            _ostype=Android\n        fi\n    fi\n\n    if [ \"$_ostype\" = Darwin ] && [ \"$_cputype\" = i386 ]; then\n        # Darwin `uname -m` lies\n        if sysctl hw.optional.x86_64 | grep -q ': 1'; then\n            _cputype=x86_64\n        fi\n    fi\n\n    case \"$_ostype\" in\n\n        Android)\n            _ostype=linux-android\n            ;;\n\n        Linux)\n            case $(get_gnu_musl_glibc) in\n              \"musl\")\n                _ostype=unknown-linux-musl\n                ;;\n              \"gnu\")\n                _ostype=unknown-linux-gnu\n                ;;\n              # Fallback\n              *)\n                _ostype=unknown-linux-gnu\n                ;;\n            esac\n            _bitness=$(get_bitness)\n            ;;\n\n        FreeBSD)\n            _ostype=unknown-freebsd\n            ;;\n\n        NetBSD)\n            _ostype=unknown-netbsd\n            ;;\n\n        DragonFly)\n            _ostype=unknown-dragonfly\n            ;;\n\n        Darwin)\n            _ostype=apple-darwin\n            ;;\n\n        MINGW* | MSYS* | CYGWIN*)\n            _ostype=pc-windows-gnu\n            ;;\n\n        *)\n            err \"unrecognized OS type: $_ostype\"\n            ;;\n\n    esac\n\n    case \"$_cputype\" in\n\n        i386 | i486 | i686 | i786 | x86)\n            _cputype=i686\n            ;;\n\n        xscale | arm)\n            _cputype=arm\n            if [ \"$_ostype\" = 
\"linux-android\" ]; then\n                _ostype=linux-androideabi\n            fi\n            ;;\n\n        armv6l)\n            _cputype=arm\n            if [ \"$_ostype\" = \"linux-android\" ]; then\n                _ostype=linux-androideabi\n            else\n                _ostype=\"${_ostype}eabihf\"\n            fi\n            ;;\n\n        armv7l | armv8l)\n            _cputype=armv7\n            if [ \"$_ostype\" = \"linux-android\" ]; then\n                _ostype=linux-androideabi\n            else\n                _ostype=\"${_ostype}eabihf\"\n            fi\n            ;;\n\n        aarch64 | arm64)\n            _cputype=aarch64\n            ;;\n\n        x86_64 | x86-64 | x64 | amd64)\n            _cputype=x86_64\n            ;;\n\n        mips)\n            _cputype=$(get_endianness mips '' el)\n            ;;\n\n        mips64)\n            if [ \"$_bitness\" -eq 64 ]; then\n                # only n64 ABI is supported for now\n                _ostype=\"${_ostype}abi64\"\n                _cputype=$(get_endianness mips64 '' el)\n            fi\n            ;;\n\n        ppc)\n            _cputype=powerpc\n            ;;\n\n        ppc64)\n            _cputype=powerpc64\n            ;;\n\n        ppc64le)\n            _cputype=powerpc64le\n            ;;\n\n        s390x)\n            _cputype=s390x\n            ;;\n\n        *)\n            err \"unknown CPU type: $_cputype\"\n\n    esac\n\n    # Detect 64-bit linux with 32-bit userland\n    if [ \"${_ostype}\" = unknown-linux-gnu ] && [ \"${_bitness}\" -eq 32 ]; then\n        case $_cputype in\n            x86_64)\n                _cputype=i686\n                ;;\n            mips64)\n                _cputype=$(get_endianness mips '' el)\n                ;;\n            powerpc64)\n                _cputype=powerpc\n                ;;\n            aarch64)\n                _cputype=armv7\n                if [ \"$_ostype\" = \"linux-android\" ]; then\n                    
_ostype=linux-androideabi\n                else\n                    _ostype=\"${_ostype}eabihf\"\n                fi\n                ;;\n        esac\n    fi\n\n    # Detect armv7 but without the CPU features Rust needs in that build,\n    # and fall back to arm.\n    # See https://github.com/rust-lang/rustup.rs/issues/587.\n    if [ \"$_ostype\" = \"unknown-linux-gnueabihf\" ] && [ \"$_cputype\" = armv7 ]; then\n        if ensure grep '^Features' /proc/cpuinfo | grep -q -v neon; then\n            # At least one processor does not have NEON.\n            _cputype=arm\n        fi\n    fi\n\n    _arch=\"${_cputype}-${_ostype}\"\n\n    RETVAL=\"$_arch\"\n}\n\nerr() {\n    echo \"$_prompt $1\" >&2\n    exit 1\n}\n\nneed_cmd() {\n    if ! check_cmd \"$1\"; then\n        err \"Error: the install script failed because the command '$1' was not found\"\n    fi\n}\n\ncheck_cmd() {\n    command -v \"$1\" > /dev/null 2>&1\n}\n\nassert_nz() {\n    if [ -z \"$1\" ]; then err \"assert_nz $2\"; fi\n}\n\n# Run a command that should never fail. If the command fails execution\n# will immediately terminate with an error showing the failing\n# command.\nensure() {\n    local output\n    output=\"$(\"$@\" 2>&1 > /dev/null)\"\n\n    if [ \"$output\" ]; then\n        echo \"\"\n        echo \"$_prompt command failed: $*\"\n        echo \"\"\n        echo \"$_divider\"\n        echo \"\"\n        echo \"$output\" >&2\n        exit 1\n    fi\n}\n\n# This is just for indicating that commands' results are being\n# intentionally ignored. Usually, because it's being executed\n# as part of error handling.\nignore() {\n    \"$@\"\n}\n\n# This wraps curl or wget. Try curl first, if not installed,\n# use wget instead.\ndownloader() {\n    if [ \"$1\" = --check ]; then\n        need_cmd curl\n    else\n        if ! 
check_help_for curl --proto --tlsv1.2; then\n            echo \"Warning: Not forcing TLS v1.2, this is potentially less secure\"\n            curl --silent --show-error --fail --location \"$1\" --output \"$2\"\n        else\n            curl --proto '=https' --tlsv1.2 --silent --show-error --fail --location \"$1\" --output \"$2\"\n        fi\n    fi\n}\n\ncheck_help_for() {\n    local _cmd\n    local _arg\n    local _ok\n    _cmd=\"$1\"\n    _ok=\"y\"\n    shift\n\n    # If we're running on OS-X, older than 10.13, then we always\n    # fail to find these options to force fallback\n    if check_cmd sw_vers; then\n        if [ \"$(sw_vers -productVersion | cut -d. -f2)\" -lt 13 ]; then\n            # Older than 10.13\n            echo \"Warning: Detected OS X platform older than 10.13\"\n            _ok=\"n\"\n        fi\n    fi\n\n    for _arg in \"$@\"; do\n        if ! \"$_cmd\" --help | grep -q -- \"$_arg\"; then\n            _ok=\"n\"\n        fi\n    done\n\n    test \"$_ok\" = \"y\"\n}\n\nmain \"$@\" || exit 1\n"
  },
  {
    "path": "monitoring/grafana/README.md",
    "content": "# Grafana dashboards for monitoring Quickwit\n\nThe list of featured dashboards:\n- [x] Metastore\n- [x] Indexers\n- [x] Searchers\n- [ ] Janitor\n\n"
  },
  {
    "path": "monitoring/grafana/dashboards/indexers.json",
    "content": "{\n  \"annotations\": {\n    \"list\": [\n      {\n        \"builtIn\": 1,\n        \"datasource\": {\n          \"type\": \"grafana\",\n          \"uid\": \"-- Grafana --\"\n        },\n        \"enable\": true,\n        \"hide\": true,\n        \"iconColor\": \"rgba(0, 211, 255, 1)\",\n        \"name\": \"Annotations & Alerts\",\n        \"target\": {\n          \"limit\": 100,\n          \"matchAny\": false,\n          \"tags\": [],\n          \"type\": \"dashboard\"\n        },\n        \"type\": \"dashboard\"\n      }\n    ]\n  },\n  \"description\": \"\",\n  \"editable\": true,\n  \"fiscalYearStartMonth\": 0,\n  \"graphTooltip\": 0,\n  \"links\": [],\n  \"liveNow\": false,\n  \"panels\": [\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"Bps\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 6,\n        \"x\": 0,\n        \"y\": 0\n      },\n      \"id\": 10,\n      \"options\": {\n        \"colorMode\": \"value\",\n        \"graphMode\": \"none\",\n        \"justifyMode\": \"auto\",\n        \"orientation\": \"auto\",\n        \"reduceOptions\": {\n          \"calcs\": [\n            \"lastNotNull\"\n          ],\n          \"fields\": \"\",\n          \"values\": false\n        },\n        \"showPercentChange\": false,\n        \"textMode\": \"auto\",\n        \"wideLayout\": true\n      },\n      \"pluginVersion\": 
\"10.4.1\",\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(docs_processed_status) (rate(quickwit_indexing_processed_bytes{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"legendFormat\": \"{{docs_processed_status}}\",\n          \"range\": true,\n          \"refId\": \"A\"\n        }\n      ],\n      \"title\": \"Indexing throughput\",\n      \"type\": \"stat\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"suffix: docs/s\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 6,\n        \"x\": 6,\n        \"y\": 0\n      },\n      \"id\": 11,\n      \"options\": {\n        \"colorMode\": \"value\",\n        \"graphMode\": \"none\",\n        \"justifyMode\": \"auto\",\n        \"orientation\": \"auto\",\n        \"reduceOptions\": {\n          \"calcs\": [\n            \"lastNotNull\"\n          ],\n          \"fields\": \"\",\n          \"values\": false\n        },\n        \"showPercentChange\": false,\n        \"textMode\": \"auto\",\n        \"wideLayout\": true\n      },\n      \"pluginVersion\": \"10.4.1\",\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": 
\"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(docs_processed_status) (rate(quickwit_indexing_processed_docs_total{instance=~\\\"$pod|$instance\\\"}[$__rate_interval]))\",\n          \"legendFormat\": \"{{docs_processed_status}}\",\n          \"range\": true,\n          \"refId\": \"A\"\n        }\n      ],\n      \"title\": \"Documents throughput\",\n      \"type\": \"stat\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"left\",\n            \"axisSoftMax\": -4,\n            \"axisSoftMin\": 8,\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 8,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": 
null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"Bps\"\n        },\n        \"overrides\": [\n          {\n            \"matcher\": {\n              \"id\": \"byName\",\n              \"options\": \"valid\"\n            },\n            \"properties\": [\n              {\n                \"id\": \"color\",\n                \"value\": {\n                  \"mode\": \"continuous-GrYlRd\",\n                  \"seriesBy\": \"last\"\n                }\n              }\n            ]\n          },\n          {\n            \"matcher\": {\n              \"id\": \"byName\",\n              \"options\": \"parsing_error\"\n            },\n            \"properties\": [\n              {\n                \"id\": \"color\",\n                \"value\": {\n                  \"mode\": \"continuous-RdYlGr\"\n                }\n              }\n            ]\n          }\n        ]\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 0\n      },\n      \"id\": 2,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [\n            \"min\",\n            \"max\",\n            \"mean\"\n          ],\n          \"displayMode\": \"table\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"exemplar\": false,\n          \"expr\": \"sum by(docs_processed_status, index) (rate(quickwit_indexing_processed_bytes{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"format\": \"time_series\",\n          \"instant\": false,\n          \"interval\": \"\",\n 
         \"legendFormat\": \"{{docs_processed_status}}-{{index}}\",\n          \"range\": true,\n          \"refId\": \"Indexing bytes rate\"\n        }\n      ],\n      \"title\": \"Indexing throughput\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"description\": \"\",\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"decimals\": 1,\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"Bps\"\n        },\n        
\"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 8\n      },\n      \"id\": 6,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"exemplar\": false,\n          \"expr\": \"rate(quickwit_write_bytes{instance=~\\\"$instance\\\"}[$__rate_interval])\",\n          \"hide\": false,\n          \"instant\": false,\n          \"legendFormat\": \"{{instance}} {{component}}\",\n          \"range\": true,\n          \"refId\": \"Writes\"\n        }\n      ],\n      \"title\": \"Writes rate\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 12,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            
\"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"decbytes\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 8\n      },\n      \"id\": 8,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [\n            \"min\",\n            \"max\",\n            \"mean\"\n          ],\n          \"displayMode\": \"table\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"pluginVersion\": \"9.2.1\",\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_memory_allocated_bytes{instance=~\\\"$instance\\\"}\",\n          \"legendFormat\": \"{{instance}} allocated\",\n          \"range\": true,\n          \"refId\": \"A\"\n        },\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": 
\"quickwit_memory_resident_bytes{instance=~\\\"$instance\\\"}\",\n          \"hide\": false,\n          \"legendFormat\": \"{{instance}} RSS\",\n          \"range\": true,\n          \"refId\": \"C\"\n        }\n      ],\n      \"title\": \"Memory usage (allocated and RSS)\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 14,\n            \"gradientMode\": \"hue\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineStyle\": {\n              \"fill\": \"solid\"\n            },\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"decimals\": 1,\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n          
      \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"none\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 16\n      },\n      \"id\": 13,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [\n            \"min\",\n            \"max\",\n            \"mean\"\n          ],\n          \"displayMode\": \"table\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_indexing_ongoing_merge_operations{instance=~\\\"$instance\\\"}\",\n          \"hide\": false,\n          \"legendFormat\": \"{{instance}} ongoing\",\n          \"range\": true,\n          \"refId\": \"Processed docs\"\n        },\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_indexing_pending_merge_operations{instance=~\\\"$instance\\\"}\",\n          \"hide\": false,\n          \"legendFormat\": \"{{instance}} pending\",\n          \"range\": true,\n          \"refId\": \"A\"\n        }\n      ],\n      \"title\": \"Merge operations (ongoing and pending)\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n      
      \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 7,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineStyle\": {\n              \"fill\": \"solid\"\n            },\n            \"lineWidth\": 1,\n            \"pointSize\": 4,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"decimals\": 1,\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"Bps\"\n        },\n        \"overrides\": [\n          {\n            \"matcher\": {\n              \"id\": \"byName\",\n              \"options\": \"Upload bytes / sec\"\n            },\n            \"properties\": [\n              {\n                \"id\": \"custom.transform\",\n                \"value\": \"negative-Y\"\n              }\n            ]\n          }\n        ]\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 16\n      },\n      \"id\": 4,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": 
[\n            \"min\",\n            \"max\",\n            \"mean\"\n          ],\n          \"displayMode\": \"table\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(pod) (rate(quickwit_storage_object_storage_download_num_bytes{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"legendFormat\": \"Download bytes / sec - {{pod}}\",\n          \"range\": true,\n          \"refId\": \"Download\"\n        },\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(pod) (rate(quickwit_storage_object_storage_upload_num_bytes{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"hide\": false,\n          \"legendFormat\": \"Upload bytes / sec - {{pod}}\",\n          \"range\": true,\n          \"refId\": \"Upload\"\n        }\n      ],\n      \"title\": \"Object storage transfer rate\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 14,\n    
        \"gradientMode\": \"hue\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineStyle\": {\n              \"fill\": \"solid\"\n            },\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"decimals\": 1,\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"none\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 24\n      },\n      \"id\": 5,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [\n            \"min\",\n            \"max\",\n            \"mean\"\n          ],\n          \"displayMode\": \"table\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(docs_processed_status, 
index) (rate(quickwit_indexing_processed_docs_total{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"hide\": false,\n          \"legendFormat\": \"{{docs_processed_status}}-{{index}}\",\n          \"range\": true,\n          \"refId\": \"Processed docs\"\n        }\n      ],\n      \"title\": \"Indexed documents rate\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 7,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineStyle\": {\n              \"fill\": \"solid\"\n            },\n            \"lineWidth\": 1,\n            \"pointSize\": 4,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"decimals\": 1,\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n   
           {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"none\"\n        },\n        \"overrides\": [\n          {\n            \"matcher\": {\n              \"id\": \"byName\",\n              \"options\": \"Upload bytes / sec\"\n            },\n            \"properties\": [\n              {\n                \"id\": \"custom.transform\",\n                \"value\": \"negative-Y\"\n              }\n            ]\n          }\n        ]\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 24\n      },\n      \"id\": 14,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [\n            \"min\",\n            \"max\",\n            \"mean\"\n          ],\n          \"displayMode\": \"table\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum(rate(quickwit_storage_object_storage_gets_total{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"legendFormat\": \"GET req/sec\",\n          \"range\": true,\n          \"refId\": \"Download\"\n        },\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum(rate(quickwit_storage_object_storage_puts_total{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"hide\": false,\n          \"legendFormat\": \"PUT req/sec\",\n          \"range\": true,\n          \"refId\": \"Upload\"\n        }\n      ],\n  
    \"title\": \"Requests on object storage\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"µs\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 32\n      },\n      \"id\": 15,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": 
[],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"histogram_quantile(0.75, rate(quickwit_cli_thread_unpark_duration_microseconds_bucket{instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"instant\": false,\n          \"legendFormat\": \"{{pod}}\",\n          \"range\": true,\n          \"refId\": \"A\"\n        }\n      ],\n      \"title\": \"Thread unpark duration\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 12,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n       
     },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"µs\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 32\n      },\n      \"id\": 12,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"pluginVersion\": \"9.2.1\",\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"editorMode\": \"builder\",\n          \"expr\": \"rate(quickwit_indexing_backpressure_micros{instance=~\\\"$instance\\\"}[$__rate_interval])\",\n          \"legendFormat\": \"{{actor_name}}-{{pod}}\",\n          \"range\": true,\n          \"refId\": \"A\"\n        }\n      ],\n      \"title\": \"Backpressure\",\n      \"type\": \"timeseries\"\n    }\n  ],\n  \"refresh\": \"30s\",\n  \"revision\": 1,\n  \"schemaVersion\": 39,\n  \"tags\": [\n    \"quickwit\",\n    \"indexer\"\n  ],\n  \"templating\": {\n    \"list\": [\n      {\n        \"current\": {\n          \"selected\": true,\n          \"text\": \"Prometheus\",\n          \"value\": \"PBFA97CFB590B2093\"\n        },\n        \"hide\": 0,\n        \"includeAll\": false,\n        \"label\": \"Datasource\",\n        \"multi\": false,\n        \"name\": 
\"datasource\",\n        \"options\": [],\n        \"query\": \"prometheus\",\n        \"queryValue\": \"\",\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"type\": \"datasource\"\n      },\n      {\n        \"current\": {\n          \"selected\": false,\n          \"text\": \"All\",\n          \"value\": \"$__all\"\n        },\n        \"datasource\": {\n          \"type\": \"prometheus\",\n          \"uid\": \"${datasource}\"\n        },\n        \"definition\": \"label_values(quickwit_memory_in_flight_data_bytes, instance)\",\n        \"hide\": 0,\n        \"includeAll\": true,\n        \"label\": \"Instance\",\n        \"multi\": false,\n        \"name\": \"instance\",\n        \"options\": [],\n        \"query\": {\n          \"query\": \"label_values(quickwit_memory_in_flight_data_bytes, instance)\",\n          \"refId\": \"StandardVariableQuery\"\n        },\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"sort\": 0,\n        \"type\": \"query\"\n      }\n    ]\n  },\n  \"time\": {\n    \"from\": \"now-1h\",\n    \"to\": \"now\"\n  },\n  \"timepicker\": {},\n  \"timezone\": \"\",\n  \"title\": \"Quickwit Indexers\",\n  \"uid\": \"quickwit-indexers\",\n  \"version\": 2,\n  \"weekStart\": \"\"\n}\n"
  },
  {
    "path": "monitoring/grafana/dashboards/ingesters.json",
    "content": "{\n  \"annotations\": {\n    \"list\": [\n      {\n        \"builtIn\": 1,\n        \"datasource\": {\n          \"type\": \"grafana\",\n          \"uid\": \"-- Grafana --\"\n        },\n        \"enable\": true,\n        \"hide\": true,\n        \"iconColor\": \"rgba(0, 211, 255, 1)\",\n        \"name\": \"Annotations & Alerts\",\n        \"target\": {\n          \"limit\": 100,\n          \"matchAny\": false,\n          \"tags\": [],\n          \"type\": \"dashboard\"\n        },\n        \"type\": \"dashboard\"\n      }\n    ]\n  },\n  \"editable\": true,\n  \"fiscalYearStartMonth\": 0,\n  \"graphTooltip\": 0,\n  \"links\": [],\n  \"liveNow\": false,\n  \"panels\": [\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          
\"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": [\n          {\n            \"__systemRef\": \"hideSeriesFrom\",\n            \"matcher\": {\n              \"id\": \"byNames\",\n              \"options\": {\n                \"mode\": \"exclude\",\n                \"names\": [\n                  \"{component=\\\"ingester\\\", instance=\\\"host.docker.internal:7280\\\", job=\\\"quickwit\\\", kind=\\\"server\\\", operation=\\\"truncate_shards\\\", status=\\\"success\\\"}\"\n                ],\n                \"prefix\": \"All except:\",\n                \"readOnly\": true\n              }\n            },\n            \"properties\": [\n              {\n                \"id\": \"custom.hideFrom\",\n                \"value\": {\n                  \"legend\": false,\n                  \"tooltip\": false,\n                  \"viz\": true\n                }\n              }\n            ]\n          }\n        ]\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 0\n      },\n      \"id\": 2,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"code\",\n          \"expr\": 
\"rate(quickwit_ingest_grpc_requests_total{kind=\\\"server\\\", instance=~\\\"$instance\\\"}[$__rate_interval])\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"__auto\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"gRPC server request rate\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n 
               \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 0\n      },\n      \"id\": 10,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"code\",\n          \"expr\": \"histogram_quantile(0.95, sum by(le, rpc) (rate(quickwit_ingest_grpc_request_duration_seconds_bucket{kind=\\\"server\\\", instance=~\\\"$instance\\\"}[$__rate_interval])))\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"__auto\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"gRPC server request latencies\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              
\"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 8\n      },\n      \"id\": 9,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_ingest_grpc_requests_in_flight{kind=\\\"server\\\", instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"__auto\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      
],\n      \"title\": \"gRPC server in-flight requests\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 8\n      },\n      \"id\": 8,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          
\"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum(quickwit_ingest_shards{state=\\\"open\\\", instance=~\\\"$instance\\\"})\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"Open shards\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        },\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum(quickwit_ingest_shards{state=\\\"closed\\\", instance=~\\\"$instance\\\"})\",\n          \"fullMetaSearch\": false,\n          \"hide\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"Closed shards\",\n          \"range\": true,\n          \"refId\": \"B\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Shard status\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            
\"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"bytes\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 16\n      },\n      \"id\": 4,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_ingest_wal_disk_used_bytes{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n   
       \"legendFormat\": \"__auto\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        },\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_ingest_wal_memory_used_bytes{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"hide\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"__auto\",\n          \"range\": true,\n          \"refId\": \"B\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"WAL usage\",\n      \"type\": \"timeseries\"\n    }\n  ],\n  \"refresh\": \"30s\",\n  \"revision\": 1,\n  \"schemaVersion\": 39,\n  \"tags\": [\n    \"quickwit\"\n  ],\n  \"templating\": {\n    \"list\": [\n      {\n        \"current\": {\n          \"selected\": false,\n          \"text\": \"Prometheus\",\n          \"value\": \"PBFA97CFB590B2093\"\n        },\n        \"hide\": 0,\n        \"includeAll\": false,\n        \"label\": \"Datasource\",\n        \"multi\": false,\n        \"name\": \"datasource\",\n        \"options\": [],\n        \"query\": \"prometheus\",\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"type\": \"datasource\"\n      },\n      {\n        \"current\": {\n          \"selected\": true,\n          \"text\": \"All\",\n          \"value\": \"$__all\"\n        },\n        \"datasource\": {\n          \"type\": \"prometheus\",\n          \"uid\": \"PBFA97CFB590B2093\"\n        },\n        \"definition\": \"label_values(quickwit_ingest_shards,instance)\",\n        \"hide\": 0,\n        \"includeAll\": true,\n        \"label\": \"Instance\",\n        \"multi\": false,\n        \"name\": \"instance\",\n        \"options\": [],\n        \"query\": {\n          \"qryType\": 1,\n          
\"query\": \"label_values(quickwit_ingest_shards,instance)\",\n          \"refId\": \"PrometheusVariableQueryEditor-VariableQuery\"\n        },\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"sort\": 0,\n        \"type\": \"query\"\n      }\n    ]\n  },\n  \"time\": {\n    \"from\": \"now-1h\",\n    \"to\": \"now\"\n  },\n  \"timepicker\": {},\n  \"timezone\": \"\",\n  \"title\": \"Quickwit Ingesters\",\n  \"uid\": \"DjSPsTvSz\",\n  \"version\": 1,\n  \"weekStart\": \"\"\n}\n"
  },
  {
    "path": "monitoring/grafana/dashboards/metastore.json",
    "content": "{\n  \"annotations\": {\n    \"list\": [\n      {\n        \"builtIn\": 1,\n        \"datasource\": {\n          \"type\": \"grafana\",\n          \"uid\": \"-- Grafana --\"\n        },\n        \"enable\": true,\n        \"hide\": true,\n        \"iconColor\": \"rgba(0, 211, 255, 1)\",\n        \"name\": \"Annotations & Alerts\",\n        \"target\": {\n          \"limit\": 100,\n          \"matchAny\": false,\n          \"tags\": [],\n          \"type\": \"dashboard\"\n        },\n        \"type\": \"dashboard\"\n      }\n    ]\n  },\n  \"editable\": true,\n  \"fiscalYearStartMonth\": 0,\n  \"graphTooltip\": 0,\n  \"links\": [],\n  \"liveNow\": false,\n  \"panels\": [\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"description\": \"\",\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n       
   },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 13,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 0\n      },\n      \"id\": 2,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(rpc) (rate(quickwit_metastore_grpc_requests_total{kind=\\\"server\\\", instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"{{rpc}}\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Metastore requests rate\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"description\": \"Duration in seconds\",\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n    
        \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 13,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 0\n      },\n      \"id\": 3,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"histogram_quantile(0.95, 
sum by(le, rpc) (rate(quickwit_metastore_grpc_request_duration_seconds_bucket{kind=\\\"server\\\", instance=~\\\"$instance\\\"}[$__rate_interval])))\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"{{rpc}}\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Metastore requests duration (p95)\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"description\": \"\",\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": 
null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 13,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 13\n      },\n      \"id\": 4,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"sum by(rpc) (rate(quickwit_metastore_grpc_requests_total{kind=\\\"server\\\", status=\\\"error\\\", instance=~\\\"$instance\\\"}[$__rate_interval]))\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"legendFormat\": \"{{rpc}}\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Metastore requests error rate\",\n      \"type\": \"timeseries\"\n    }\n  ],\n  \"refresh\": \"30s\",\n  \"schemaVersion\": 39,\n  \"tags\": [\n    \"quickwit\",\n    \"metastore\"\n  ],\n  \"templating\": {\n    \"list\": [\n      {\n        \"current\": {\n          \"selected\": true,\n          \"text\": \"Prometheus\",\n          \"value\": \"PBFA97CFB590B2093\"\n        },\n        \"hide\": 0,\n        \"includeAll\": false,\n        \"label\": \"Datasource\",\n        \"multi\": false,\n        \"name\": \"datasource\",\n        \"options\": [],\n        \"query\": \"prometheus\",\n        \"queryValue\": \"\",\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": 
false,\n        \"type\": \"datasource\"\n      },\n      {\n        \"allValue\": \"\",\n        \"current\": {\n          \"selected\": true,\n          \"text\": \"All\",\n          \"value\": \"$__all\"\n        },\n        \"definition\": \"label_values(quickwit_metastore_grpc_requests_total{kind=\\\"server\\\"},instance)\",\n        \"hide\": 0,\n        \"includeAll\": true,\n        \"label\": \"Instance\",\n        \"multi\": false,\n        \"name\": \"instance\",\n        \"options\": [],\n        \"query\": {\n          \"qryType\": 1,\n          \"query\": \"label_values(quickwit_metastore_grpc_requests_total{kind=\\\"server\\\"},instance)\",\n          \"refId\": \"PrometheusVariableQueryEditor-VariableQuery\"\n        },\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"sort\": 0,\n        \"type\": \"query\"\n      }\n    ]\n  },\n  \"time\": {\n    \"from\": \"now-1h\",\n    \"to\": \"now\"\n  },\n  \"timepicker\": {},\n  \"timezone\": \"\",\n  \"title\": \"Quickwit Metastore\",\n  \"uid\": \"quickwit-metastore\",\n  \"version\": 1,\n  \"weekStart\": \"\"\n}\n"
  },
  {
    "path": "monitoring/grafana/dashboards/searchers.json",
    "content": "{\n  \"annotations\": {\n    \"list\": [\n      {\n        \"builtIn\": 1,\n        \"datasource\": {\n          \"type\": \"grafana\",\n          \"uid\": \"-- Grafana --\"\n        },\n        \"enable\": true,\n        \"hide\": true,\n        \"iconColor\": \"rgba(0, 211, 255, 1)\",\n        \"name\": \"Annotations & Alerts\",\n        \"type\": \"dashboard\"\n      }\n    ]\n  },\n  \"editable\": true,\n  \"fiscalYearStartMonth\": 0,\n  \"graphTooltip\": 0,\n  \"id\": 2,\n  \"links\": [],\n  \"liveNow\": false,\n  \"panels\": [\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                
\"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 0\n      },\n      \"id\": 1,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_search_leaf_searches_splits_total{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          \"legendFormat\": \"{{instance}}\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Leaf search splits\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            
\"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"decbytes\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 0\n      },\n      \"id\": 4,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_memory_resident_bytes{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"hide\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          
\"legendFormat\": \"{{instance}}\",\n          \"range\": true,\n          \"refId\": \"B\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Memory usage (bytes)\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 
0,\n        \"y\": 8\n      },\n      \"id\": 2,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"rate(quickwit_storage_object_storage_gets_total{instance=~\\\"$instance\\\"}[$__rate_interval])\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": false,\n          \"instant\": false,\n          \"legendFormat\": \"Total\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Number of GET requests\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n 
             \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"decbytes\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 8\n      },\n      \"id\": 22,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"quickwit_storage_object_storage_download_num_bytes{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          \"legendFormat\": \"Downloaded bytes\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Size of GET requests (bytes)\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": 
\"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 16\n      },\n      \"id\": 9,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": 
\"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"rate(quickwit_cache_cache_hits_total{instance=~\\\"$instance\\\"}[$__rate_interval])\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          \"legendFormat\": \"{{component_name}}\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Cache hits\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n        
  },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          }\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 16\n      },\n      \"id\": 23,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": \"rate(quickwit_cache_cache_misses_total{instance=~\\\"$instance\\\"}[$__rate_interval])\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          \"legendFormat\": \"{{component_name}}\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Cache misses\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": 
\"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n                \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"none\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 0,\n        \"y\": 24\n      },\n      \"id\": 24,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"builder\",\n          \"expr\": 
\"quickwit_cache_in_cache_count{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          \"legendFormat\": \"Split footer\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Number of cached objects\",\n      \"type\": \"timeseries\"\n    },\n    {\n      \"datasource\": {\n        \"type\": \"prometheus\",\n        \"uid\": \"${datasource}\"\n      },\n      \"fieldConfig\": {\n        \"defaults\": {\n          \"color\": {\n            \"mode\": \"palette-classic\"\n          },\n          \"custom\": {\n            \"axisBorderShow\": false,\n            \"axisCenteredZero\": false,\n            \"axisColorMode\": \"text\",\n            \"axisLabel\": \"\",\n            \"axisPlacement\": \"auto\",\n            \"barAlignment\": 0,\n            \"drawStyle\": \"line\",\n            \"fillOpacity\": 0,\n            \"gradientMode\": \"none\",\n            \"hideFrom\": {\n              \"legend\": false,\n              \"tooltip\": false,\n              \"viz\": false\n            },\n            \"insertNulls\": false,\n            \"lineInterpolation\": \"linear\",\n            \"lineWidth\": 1,\n            \"pointSize\": 5,\n            \"scaleDistribution\": {\n              \"type\": \"linear\"\n            },\n            \"showPoints\": \"auto\",\n            \"spanNulls\": false,\n            \"stacking\": {\n              \"group\": \"A\",\n              \"mode\": \"none\"\n            },\n            \"thresholdsStyle\": {\n              \"mode\": \"off\"\n            }\n          },\n          \"mappings\": [],\n          \"thresholds\": {\n            \"mode\": \"absolute\",\n            \"steps\": [\n              {\n                \"color\": \"green\",\n                \"value\": null\n              },\n              {\n                \"color\": \"red\",\n               
 \"value\": 80\n              }\n            ]\n          },\n          \"unit\": \"decbytes\"\n        },\n        \"overrides\": []\n      },\n      \"gridPos\": {\n        \"h\": 8,\n        \"w\": 12,\n        \"x\": 12,\n        \"y\": 24\n      },\n      \"id\": 11,\n      \"options\": {\n        \"legend\": {\n          \"calcs\": [],\n          \"displayMode\": \"list\",\n          \"placement\": \"bottom\",\n          \"showLegend\": true\n        },\n        \"tooltip\": {\n          \"mode\": \"single\",\n          \"sort\": \"none\"\n        }\n      },\n      \"targets\": [\n        {\n          \"datasource\": {\n            \"type\": \"prometheus\",\n            \"uid\": \"${datasource}\"\n          },\n          \"disableTextWrap\": false,\n          \"editorMode\": \"code\",\n          \"expr\": \"quickwit_cache_in_cache_num_bytes{instance=~\\\"$instance\\\"}\",\n          \"fullMetaSearch\": false,\n          \"includeNullMetadata\": true,\n          \"instant\": false,\n          \"legendFormat\": \"Split footer\",\n          \"range\": true,\n          \"refId\": \"A\",\n          \"useBackend\": false\n        }\n      ],\n      \"title\": \"Size of cached objects (bytes)\",\n      \"type\": \"timeseries\"\n    }\n  ],\n  \"refresh\": \"30s\",\n  \"schemaVersion\": 39,\n  \"tags\": [\n    \"quickwit\",\n    \"searcher\"\n  ],\n  \"templating\": {\n    \"list\": [\n      {\n        \"current\": {\n          \"selected\": false,\n          \"text\": \"Prometheus\",\n          \"value\": \"PBFA97CFB590B2093\"\n        },\n        \"hide\": 0,\n        \"includeAll\": false,\n        \"label\": \"Datasource\",\n        \"multi\": false,\n        \"name\": \"datasource\",\n        \"options\": [],\n        \"query\": \"prometheus\",\n        \"queryValue\": \"\",\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"type\": \"datasource\"\n      },\n      {\n        \"current\": {\n          \"selected\": 
false,\n          \"text\": \"All\",\n          \"value\": \"$__all\"\n        },\n        \"datasource\": {\n          \"type\": \"prometheus\",\n          \"uid\": \"${datasource}\"\n        },\n        \"definition\": \"label_values(quickwit_search_leaf_searches_splits_total,instance)\",\n        \"hide\": 0,\n        \"includeAll\": true,\n        \"label\": \"Instance\",\n        \"multi\": false,\n        \"name\": \"instance\",\n        \"options\": [],\n        \"query\": {\n          \"query\": \"label_values(quickwit_search_leaf_searches_splits_total,instance)\",\n          \"refId\": \"StandardVariableQuery\"\n        },\n        \"refresh\": 1,\n        \"regex\": \"\",\n        \"skipUrlSync\": false,\n        \"sort\": 0,\n        \"type\": \"query\"\n      }\n\n    ]\n  },\n  \"time\": {\n    \"from\": \"now-1h\",\n    \"to\": \"now\"\n  },\n  \"timepicker\": {},\n  \"timezone\": \"\",\n  \"title\": \"Quickwit Searchers\",\n  \"uid\": \"quickwit-searchers\",\n  \"version\": 1,\n  \"weekStart\": \"\"\n}\n"
  },
  {
    "path": "monitoring/grafana/provisioning/dashboards/default.yaml",
    "content": "apiVersion: 1\n\nproviders:\n  - name: Default\n    folder: Quickwit\n    allowUiUpdates: true\n    options:\n      path: /var/lib/grafana/dashboards\n      foldersFromFilesStructure: true\n"
  },
  {
    "path": "monitoring/grafana/provisioning/datasources/default.yaml",
    "content": "apiVersion: 1\n\ndatasources:\n  - id: 1\n    name: Prometheus\n    type: prometheus\n    typeName: Prometheus\n    access: proxy\n    url: http://prometheus:9090\n    isDefault: true\n    jsonData:\n      httpMethod: POST\n      timeInterval: 5s\n    readOnly: false\n\n  - id: 2\n    name: Jaeger\n    type: jaeger\n    typeName: Jaeger\n    access: proxy\n    url: http://jaeger:16686\n    isDefault: false\n    jsonData:\n      httpMethod: POST\n    readOnly: false\n"
  },
  {
    "path": "monitoring/otel-collector-config.yaml",
    "content": "receivers:\n  jaeger:\n    protocols:\n      grpc:\n      thrift_binary:\n      thrift_compact:\n      thrift_http:\n\n  otlp:\n    protocols:\n      grpc:\n      http:\n\nprocessors:\n  batch:\n\nexporters:\n  jaeger:\n    endpoint: jaeger:14250\n    tls:\n      insecure: true\n\n  kafka:\n    brokers:\n      - kafka-broker:29092\n\n  otlp/qw:\n    endpoint: host.docker.internal:7281\n    tls:\n      insecure: true\n\nextensions:\n  health_check:\n  pprof:\n  zpages:\n\nservice:\n  extensions: [health_check, pprof, zpages]\n  pipelines:\n    traces:\n      receivers: [jaeger, otlp]\n      processors: [batch]\n      exporters: [jaeger, kafka, otlp/qw]\n    # metrics:\n    #   receivers: [otlp]\n    #   processors: [batch]\n    #   exporters: [otlp]\n    logs:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [kafka, otlp/qw]\n"
  },
  {
    "path": "monitoring/prometheus.yaml",
    "content": "global:\n  scrape_interval: 1s\n  scrape_timeout: 1s\n\nscrape_configs:\n  - job_name: quickwit\n    metrics_path: /metrics\n    static_configs:\n      - targets:\n          - host.docker.internal:7280\n"
  },
  {
    "path": "quickwit/.cargo/config.toml",
    "content": "[build]\nrustflags = [\"--cfg\", \"tokio_unstable\"]\n"
  },
  {
    "path": "quickwit/.cargo-dev/config.toml",
    "content": "# This configuration makes it possible to use mold\n# as the linker for rustc.\n#\n# I recommend it for development as it significantly reduces link times.\n#\n# To enable:\n# - install clang\n# - install mold https://github.com/rui314/mold into /usr/local/bin/mold\n# - add a symbolic link from .cargo -> .cargo-dev\n# via `ln -s .cargo-dev .cargo`.\n#\n# If there is an issue, reverting is as simple as deleting .cargo.\n\n[target.x86_64-unknown-linux-gnu]\nlinker = \"/usr/bin/clang\"\nrustflags = [\"-C\", \"link-arg=-fuse-ld=/usr/local/bin/mold\"]\n"
  },
  {
    "path": "quickwit/.config/nextest.toml",
    "content": "[profile.default]\nslow-timeout = \"10s\"\n\n[profile.ci]\n# Print out output for failing tests as soon as they fail, and also at the end\n# of the run (for easy scrollability).\nfailure-output = \"immediate-final\"\n# Do not cancel the test run on the first failure.\nfail-fast = false"
  },
  {
    "path": "quickwit/.license_header.txt",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n"
  },
  {
    "path": "quickwit/CLAUDE.md",
    "content": "## Build & Test Commands\n\n### Formatting & Linting\n- **`make fmt`** — Format and validate code (requires nightly toolchain: `rustup toolchain install nightly`):\n  1. Runs `cargo +nightly fmt`\n  2. Checks license headers on `.rs`, `.ts`, `.proto` files\n  3. Enforces log format policy: no trailing punctuation, no uppercase first character in log and error messages\n- **`make fix`** — Runs clippy with `--fix`, then `make fmt`, then `make unused-deps`\n- **`make unused-deps`** — Detects unused dependencies via `cargo-machete`\n\nLog messages (`info!`, `warn!`, `error!`, `debug!`) must:\n- Start with a **lowercase** letter\n- Have **no trailing punctuation**\n\n### Testing\n- **Single crate test**: `cargo nextest run -p quickwit-search my_test_name`\n- **Single test**: `cargo test -p quickwit-common my_test_name`\n- **`make test-all`** — Starts Docker services (LocalStack S3, PostgreSQL, Pub/Sub emulator) and runs the full test suite with `cargo nextest run --all-features --retries 5`\n- **`make test-failpoints`** — Runs failpoint tests only: `cargo nextest run --test failpoints --features fail/failpoints`\n- Docker services: `make docker-compose-up` / `make docker-compose-down` (subset: `DOCKER_SERVICES=kafka,postgres`)\n\n### Building\n- **`make doc`** — Generates docs with `cargo doc --all-features` (warnings as errors)\n- Rust toolchain: **1.93**\n\n## Code Conventions\n\n### Clippy Disallowed Methods\nThese methods are banned (see `clippy.toml`):\n- `Path::exists` — (use try_exists)\n- `Option::is_some_and`, `Option::is_none_or`, `Option::xor`\n- `Option::map_or`, `Option::map_or_else` — use `.map(..).unwrap_or(..)` or `let Some(..) else {..}` instead\n\n### Formatting Shortcut\nUse `/fmt` to automatically run format checks.\n\n## Architecture Overview\n\nQuickwit is a cloud-native distributed search engine for observability data (logs, traces). 
It's organized as a ~38-crate Rust workspace.\n\n### Key Layers\n\n**Protocol & Types** — `quickwit-proto` defines all gRPC service contracts and message types via protobuf. Service traits are auto-generated.\n\n**Actor System** — `quickwit-actors` is a custom lightweight actor framework. The indexing pipeline is fully actor-based:\n```\nSource → DocProcessor → Indexer → IndexSerializer → Packager → Uploader → Sequencer → Publisher\n```\nA parallel merge pipeline runs alongside.\n\n**Search** — `quickwit-search` implements a root-leaf pattern: root servers parse queries and coordinate, leaf servers search their assigned splits in parallel, leaf results are merged at root.\n\n**Storage** — `quickwit-storage` abstracts cloud storage (S3, Azure, GCS, local file, RAM) behind a `Storage` trait.\n\n**Metastore** — `quickwit-metastore` manages index metadata with file-backed (dev) and PostgreSQL (production) backends.\n\n**Cluster** — `quickwit-cluster` uses Chitchat gossip protocol for membership. 
`quickwit-control-plane` handles indexing task scheduling and placement.\n\n**API Surface** — `quickwit-serve` hosts both REST and gRPC endpoints over the same service traits, plus serves the embedded React UI.\n\n### Core Crates\n| Crate | Purpose |\n|-------|---------|\n| `quickwit-cli` | CLI entry point and binary |\n| `quickwit-serve` | REST/gRPC server |\n| `quickwit-search` | Distributed search orchestration |\n| `quickwit-indexing` | Actor-based indexing pipeline |\n| `quickwit-ingest` | Distributed ingestion with replication |\n| `quickwit-metastore` | Index metadata storage |\n| `quickwit-storage` | Multi-cloud storage abstraction |\n| `quickwit-config` | Configuration parsing/validation |\n| `quickwit-doc-mapper` | Index schema and document mapping |\n| `quickwit-query` | Query DSL parsing (ES-compatible) |\n| `quickwit-cluster` | Cluster membership (Chitchat) |\n| `quickwit-control-plane` | Indexing task scheduling |\n| `quickwit-actors` | Actor framework |\n| `quickwit-proto` | Protobuf definitions and gRPC traits |\n| `quickwit-common` | Shared utilities and metrics |\n| `quickwit-lambda-server` | AWS Lambda leaf search handler |\n| `quickwit-lambda-client` | Lambda invocation with auto-deployment |\n\n`quickwit-common` contains shared utilities for metrics, rate-limited logging, reading environment variables, etc.\nIt also contains the `run_cpu_intensive` helper, which should be used to run CPU-intensive work from Tokio tasks.\n\nWhen the client is unlikely to match on an error, you can rely on the crate-level `Error` or `anyhow::Error`. If you need to introduce a new error type, use `thiserror`.\n\n### Design Patterns\n- **Trait-based services**: `SearchService`, `MetastoreService`, etc. 
— enables mocking and multiple implementations\n- **Feature gates**: Cloud backends (`azure`, `gcs`), message sources (`kafka`, `kinesis`, `pulsar`, `sqs`, `gcp-pubsub`), `postgres` metastore, `multilang` tokenizers\n- **Metrics**: `once_cell::sync::Lazy` statics with `quickwit_common::metrics::*` factories\n\n### Key Dependencies\n- **Tantivy**: Search engine library (custom fork)\n- **Tonic/Prost**: gRPC framework and protobuf\n- **Tokio**: Async runtime\n- **SQLx**: PostgreSQL metastore\n\n# Quickwit Claude Guidelines\n\nWhen adding a new dependency, update the license file by running `make update-licenses`.\nPrefer referring to the crate through the workspace (`workspace = true`).\nMake sure to keep features minimal.\n\nIn other words, prefer\n`zip = { workspace = true, default-features = false, features = [\"deflate\"] }`\nto\n`zip = \"2\"`\n\n## Code Formatting\n### Quick Fix\n\nUse `/fmt` to automatically run format checks and see issues.\n\n## Coding Style\n- Avoid single-letter variable names except for indices (i, j, k)\n- Document all \"hidden contracts\" (implicit assumptions, invariants, preconditions)\n- Try to avoid deep nesting. In particular, prefer early return style\n- Avoid abusing iterator chaining with complex constructs like `.transpose()`\n- Write type names explicitly when it aids readability\n- Use `with_capacity` to hint container capacity when size is known\n"
  },
  {
    "path": "quickwit/Cargo.toml",
    "content": "[workspace]\nresolver = \"2\"\nmembers = [\n  \"quickwit-actors\",\n  \"quickwit-aws\",\n  \"quickwit-cli\",\n  \"quickwit-cluster\",\n  \"quickwit-codegen\",\n  \"quickwit-codegen/example\",\n  \"quickwit-common\",\n  \"quickwit-config\",\n  \"quickwit-control-plane\",\n  \"quickwit-datetime\",\n  \"quickwit-directories\",\n  \"quickwit-doc-mapper\",\n  \"quickwit-index-management\",\n  \"quickwit-indexing\",\n  \"quickwit-ingest\",\n  \"quickwit-integration-tests\",\n  \"quickwit-jaeger\",\n  \"quickwit-janitor\",\n  \"quickwit-lambda-client\",\n  \"quickwit-lambda-server\",\n  \"quickwit-macros\",\n  \"quickwit-metastore\",\n\n  # Disabling metastore-utils from the quickwit projects to ease build/deps.\n  # We can reenable it when we need it.\n  # \"quickwit-metastore-utils\",\n  \"quickwit-opentelemetry\",\n  \"quickwit-proto\",\n  \"quickwit-query\",\n  \"quickwit-rest-client\",\n  \"quickwit-search\",\n  \"quickwit-serve\",\n  \"quickwit-storage\",\n  \"quickwit-telemetry\",\n]\n\n# The following list excludes `quickwit-metastore-utils`\n# from the default member to ease build/deps.\ndefault-members = [\n  \"quickwit-actors\",\n  \"quickwit-aws\",\n  \"quickwit-cli\",\n  \"quickwit-cluster\",\n  \"quickwit-codegen\",\n  \"quickwit-codegen/example\",\n  \"quickwit-common\",\n  \"quickwit-config\",\n  \"quickwit-control-plane\",\n  \"quickwit-datetime\",\n  \"quickwit-directories\",\n  \"quickwit-doc-mapper\",\n  \"quickwit-index-management\",\n  \"quickwit-indexing\",\n  \"quickwit-ingest\",\n  \"quickwit-integration-tests\",\n  \"quickwit-jaeger\",\n  \"quickwit-janitor\",\n  \"quickwit-lambda-client\",\n  \"quickwit-lambda-server\",\n  \"quickwit-macros\",\n  \"quickwit-metastore\",\n  \"quickwit-opentelemetry\",\n  \"quickwit-proto\",\n  \"quickwit-query\",\n  \"quickwit-rest-client\",\n  \"quickwit-search\",\n  \"quickwit-serve\",\n  \"quickwit-storage\",\n  \"quickwit-telemetry\",\n]\n\n[workspace.package]\nversion = \"0.8.0\"\nedition = 
\"2024\"\nhomepage = \"https://quickwit.io/\"\ndocumentation = \"https://quickwit.io/docs/\"\nrepository = \"https://github.com/quickwit-oss/quickwit\"\nauthors = [\"Quickwit, Inc. <hello@quickwit.io>\"]\nlicense = \"Apache-2.0\"\n\n[workspace.dependencies]\nanyhow = \"1\"\narc-swap = \"1.8\"\nassert-json-diff = \"2\"\nasync-compression = { version = \"0.4\", features = [\"tokio\", \"gzip\"] }\nasync-speed-limit = \"0.4\"\nasync-trait = \"0.1\"\nbacktrace = \"0.3\"\nbase64 = \"0.22\"\nbinggan = { version = \"0.15\" }\nbitpacking = \"0.9.3\"\nbytes = { version = \"1\", features = [\"serde\"] }\nbytesize = { version = \"2.3.1\", features = [\"serde\"] }\nbytestring = \"1.5\"\nchitchat = \"0.10.0\"\nchrono = { version = \"0.4\", default-features = false, features = [\n  \"clock\",\n  \"std\",\n] }\nclap = { version = \"4.5\", features = [\"env\", \"string\"] }\ncoarsetime = \"0.1\"\ncolored = \"3.0\"\nconsole-subscriber = \"0.5\"\ncriterion = { version = \"0.8\", features = [\"async_tokio\"] }\ncron = \"0.15\"\ndialoguer = { version = \"0.12\", default-features = false }\ndotenvy = \"0.15\"\ndyn-clone = \"1.0\"\nenum-iterator = \"2.3\"\nenv_logger = { version = \"0.11\", default-features = false, features = [\"auto-color\"] }\nfail = \"0.5\"\nflate2 = \"1.1\"\nflume = \"0.12\"\nfnv = \"1\"\nfutures = \"0.3\"\nfutures-util = { version = \"0.3\", default-features = false }\nglob = \"0.3\"\n# We can't directly update google-cloud-auth to 1.3 and google-cloud-gax to 1.4, because the latest version\n# of google-cloud-pubsub is \"0.30\" which explicitly depends on: google-cloud-auth ^0.17 and google-cloud-gax ^0.19.\ngoogle-cloud-auth = \"0.17.2\"\ngoogle-cloud-gax = \"0.19.2\"\ngoogle-cloud-googleapis = { version = \"0.16\", features = [\"pubsub\"] }\ngoogle-cloud-pubsub = \"0.30\"\ngovernor = \"0.10.4\"\nheck = \"0.5\"\nhex = \"0.4\"\nhome = \"0.5\"\nhostname = \"0.4\"\nhttp = \"1.4\"\nhttp-body = \"1.0\"\nhttp-body-util = \"0.1\"\nhttp-serde = \"2.1\"\nhumantime = 
\"2.3\"\nhyper = { version = \"1.8\", features = [\"client\", \"http1\", \"http2\", \"server\"] }\nhyper-rustls = \"0.27\"\nhyper-util = { version = \"0.1\", default-features = false, features = [\n  \"client-legacy\",\n  \"server-auto\",\n  \"server-graceful\",\n  \"service\",\n  \"tokio\",\n] }\nindexmap = { version = \"2.12\", features = [\"serde\"] }\nindicatif = \"0.18\"\nitertools = \"0.14\"\nlambda_runtime = \"0.13\"\njson_comments = \"0.2\"\nlibz-sys = \"1.1\"\nlru = \"0.16\"\nmatches = \"0.1\"\nmd5 = \"0.8\"\nmime_guess = \"2.0\"\nmini-moka = \"0.10.3\"\nmockall = \"0.14\"\nmrecordlog = { git = \"https://github.com/quickwit-oss/mrecordlog\", rev = \"306c0a7\" }\nnew_string_template = \"1.5\"\nnom = \"8.0\"\nnumfmt = \"1.2\"\nonce_cell = \"1\"\noneshot = \"0.1\"\nopenssl = { version = \"0.10\", default-features = false }\nopenssl-probe = \"0.1\"\nopentelemetry = \"0.31\"\nopentelemetry-appender-tracing = \"0.31\"\nopentelemetry_sdk = { version = \"0.31\", features = [\"rt-tokio\"] }\nopentelemetry-otlp = { version = \"0.31\", features = [\"grpc-tonic\"] }\nouroboros = \"0.18\"\npercent-encoding = \"2.3\"\npin-project = \"1.1\"\npnet = { version = \"0.35\", features = [\"std\"] }\npostcard = { version = \"1.1\", features = [\n  \"use-std\",\n], default-features = false }\npprof = { version = \"0.15\", features = [\"flamegraph\"] }\npredicates = \"3\"\nprettyplease = \"0.2\"\nproc-macro2 = \"1.0\"\nprometheus = { version = \"0.14\", default-features = false, features = [\"process\"] }\nproptest = \"1\"\nprost = { version = \"0.14\", default-features = false, features = [\n  \"derive\",\n] }\nprost-build = \"0.14\"\nprost-types = \"0.14\"\npulsar = { version = \"6.6\", default-features = false, features = [\n  \"auth-oauth2\",\n  \"compression\",\n  \"tokio-runtime\",\n] }\nquick_cache = \"0.6.18\"\nquote = \"1.0\"\nrand = \"0.9\"\nrand_distr = \"0.5\"\nrayon = \"1.11\"\nrdkafka = { version = \"0.38\", default-features = false, features = [\n  
\"cmake-build\",\n  \"libz\",\n  \"ssl\",\n  \"tokio\",\n  \"zstd\",\n] }\nregex = \"1.12\"\nregex-syntax = \"0.8\"\nreqwest = { version = \"0.12\", default-features = false, features = [\n  \"json\",\n  \"rustls-tls\",\n] }\nreqwest-middleware = \"0.4\"\nreqwest-retry = \"0.8\"\nrust-embed = \"8.9\"\nrustc-hash = \"2.1\"\nrustls = \"0.23\"\nrustls-pemfile = \"2.2\"\nsea-query = { version = \"0.32\" }\nsea-query-binder = { version = \"0.7\", features = [\n  \"runtime-tokio-rustls\",\n  \"sqlx-postgres\",\n] }\n# ^1.0.184 due to serde-rs/serde#2538\nserde = { version = \"1.0.228\", features = [\"derive\", \"rc\"] }\nserde_json = \"1.0\"\nserde_json_borrow = \"0.9\"\nserde_qs = { version = \"0.15\" }\nserde_with = \"3.16\"\nserde_yaml = \"0.9\"\nserial_test = { version = \"3.2\", features = [\"file_locks\"] }\nsha2 = \"0.10\"\nsiphasher = \"1.0\"\nsmallvec = \"1\"\nsqlx = { version = \"0.8\", features = [\n  \"migrate\",\n  \"postgres\",\n  \"runtime-tokio-rustls\",\n  \"time\",\n] }\nsyn = { version = \"2.0\", features = [\"extra-traits\", \"full\", \"parsing\"] }\nsync_wrapper = \"1\"\nsysinfo = { version = \"0.37\", default-features = false, features = [\"disk\"] }\ntabled = { version = \"0.20\", features = [\"ansi\"] }\ntempfile = \"3\"\nthiserror = \"2\"\nthousands = \"0.2\"\ntikv-jemalloc-ctl = { version = \"0.6\", features = [\"stats\"] }\ntikv-jemallocator = \"0.6\"\ntime = { version = \"0.3\", features = [\"std\", \"formatting\", \"macros\"] }\ntokio = { version = \"1.48\", features = [\"full\"] }\ntokio-metrics = { version = \"0.4\", features = [\"rt\"] }\ntokio-rustls = { version = \"0.26\", default-features = false }\ntokio-stream = { version = \"0.1\", features = [\"sync\"] }\ntokio-util = { version = \"0.7\", default-features = false, features = [\n  \"compat\",\n  \"io-util\",\n] }\ntoml = \"0.9\"\ntonic = { version = \"0.14\", features = [\n  \"_tls-any\",\n  \"gzip\",\n  \"tls-native-roots\",\n  \"zstd\",\n] }\ntonic-build = \"0.14\"\ntonic-health = 
\"0.14\"\ntonic-prost = \"0.14\"\ntonic-prost-build = \"0.14\"\ntonic-reflection = \"0.14\"\ntower = { version = \"0.5\", features = [\n  \"balance\",\n  \"buffer\",\n  \"load\",\n  \"retry\",\n  \"util\",\n] }\n# legacy version because of warp\ntower-http = { version = \"0.6\", features = [\n  \"compression-gzip\",\n  \"compression-zstd\",\n  \"cors\",\n] }\ntracing = \"0.1\"\ntracing-opentelemetry = \"0.32\"\ntracing-subscriber = { version = \"0.3\", features = [\n  \"env-filter\",\n  \"json\",\n  \"std\",\n  \"time\",\n] }\nttl_cache = \"0.5\"\ntypetag = \"0.2\"\nulid = \"1.2\"\nureq = \"3\"\nusername = \"0.2\"\n# We cannot upgrade to utoipa 5.0+ due to significant breaking changes:\n# 1. The `OpenApi` struct structure changed (fields are private), breaking our manual merging logic in openapi.rs\n# in `quickwit-serve`. This code is fundamentally incompatible with version 5.x.\nutoipa = { version = \"4.2\", features = [\"time\", \"ulid\"] }\nuuid = { version = \"1.19\", features = [\"v4\", \"serde\"] }\nvrl = { version = \"0.29\", default-features = false, features = [\n  \"compiler\",\n  \"diagnostic\",\n  \"stdlib\",\n  \"value\",\n] }\nwarp = { version = \"0.4\", features = [\"server\", \"test\"] }\nwiremock = \"0.6\"\nzstd = { version = \"0.13\", default-features = false }\n\naws-config = \"1.8\"\naws-credential-types = { version = \"1.2\", features = [\"hardcoded-credentials\"] }\naws-runtime = \"1.5\"\naws-sdk-kinesis = \"1.97\"\naws-sdk-s3 = \"=1.62\"\naws-sdk-lambda = \"1\"\naws-sdk-sqs = \"1.91\"\naws-smithy-async = \"1.2\"\naws-smithy-mocks = \"0.2\"\naws-smithy-http-client = { version = \"1.1\", features = [\"default-client\"] }\naws-smithy-runtime = \"1.9\"\naws-smithy-types = { version = \"1.3\", features = [\n  \"byte-stream-poll-next\",\n  \"http-body-1-x\",\n] }\naws-types = \"1.3\"\n\nazure_core = { version = \"0.21\", features = [\"hmac_rust\", \"enable_reqwest_rustls\"] }\nazure_identity = { version = \"0.21\" }\nazure_storage = { version = 
\"0.21\", default-features = false, features = [\n  \"enable_reqwest_rustls\",\n] }\nazure_storage_blobs = { version = \"0.21\", default-features = false, features = [\n  \"enable_reqwest_rustls\",\n] }\n\nopendal = { version = \"0.55\", default-features = false }\nreqsign = { version = \"0.18\", default-features = false, features = [\"google\", \"default-context\"] }\n\nquickwit-actors = { path = \"quickwit-actors\" }\nquickwit-aws = { path = \"quickwit-aws\" }\nquickwit-cli = { path = \"quickwit-cli\" }\nquickwit-cluster = { path = \"quickwit-cluster\" }\nquickwit-codegen = { path = \"quickwit-codegen\" }\nquickwit-codegen-example = { path = \"quickwit-codegen/example\" }\nquickwit-common = { path = \"quickwit-common\" }\nquickwit-config = { path = \"quickwit-config\" }\nquickwit-control-plane = { path = \"quickwit-control-plane\" }\nquickwit-datetime = { path = \"quickwit-datetime\" }\nquickwit-directories = { path = \"quickwit-directories\" }\nquickwit-doc-mapper = { path = \"quickwit-doc-mapper\" }\nquickwit-index-management = { path = \"quickwit-index-management\" }\nquickwit-indexing = { path = \"quickwit-indexing\" }\nquickwit-ingest = { path = \"quickwit-ingest\" }\nquickwit-integration-tests = { path = \"quickwit-integration-tests\" }\nquickwit-jaeger = { path = \"quickwit-jaeger\" }\nquickwit-janitor = { path = \"quickwit-janitor\" }\nquickwit-lambda-client = { path = \"quickwit-lambda-client\" }\nquickwit-lambda-server = { path = \"quickwit-lambda-server\" }\nquickwit-macros = { path = \"quickwit-macros\" }\nquickwit-metastore = { path = \"quickwit-metastore\" }\nquickwit-opentelemetry = { path = \"quickwit-opentelemetry\" }\nquickwit-proto = { path = \"quickwit-proto\" }\nquickwit-query = { path = \"quickwit-query\" }\nquickwit-rest-client = { path = \"quickwit-rest-client\" }\nquickwit-search = { path = \"quickwit-search\" }\nquickwit-serve = { path = \"quickwit-serve\" }\nquickwit-storage = { path = \"quickwit-storage\" }\nquickwit-telemetry = { path 
= \"quickwit-telemetry\" }\n\ntantivy = { git = \"https://github.com/quickwit-oss/tantivy/\", rev = \"98ebbf9\", default-features = false, features = [\n  \"lz4-compression\",\n  \"mmap\",\n  \"quickwit\",\n  \"zstd-compression\",\n  \"columnar-zstd-compression\",\n] }\ntantivy-fst = \"0.5\"\n\n# This is actually not used directly the goal is to fix the version\n# used by reqwest.\nencoding_rs = \"=0.8.35\"\n\n[patch.crates-io]\nsasl2-sys = { git = \"https://github.com/quickwit-oss/rust-sasl/\", rev = \"085a4c7\" }\n\n## this patched version of tracing helps better understand what happens inside futures (when are\n## they polled, how long does poll take...)\n#tracing = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n#tracing-attributes = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n#tracing-core = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n#tracing-futures = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n#tracing-log = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n#tracing-opentelemetry = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n#tracing-subscriber = { git = \"https://github.com/trinity-1686a/tracing.git\", rev = \"6806cac3\" }\n\n[profile.dev]\ndebug = false\n\n[profile.release]\nlto = \"thin\"\n"
  },
  {
    "path": "quickwit/Cross.toml",
    "content": "[build.env]\npassthrough = [\n    \"QW_COMMIT_DATE\",\n    \"QW_COMMIT_HASH\",\n    \"QW_COMMIT_TAGS\",\n]\n\n[target.x86_64-unknown-linux-gnu]\nimage = \"quickwit/cross:x86_64-unknown-linux-gnu\"\n\n[target.x86_64-unknown-linux-musl]\nimage = \"quickwit/cross:x86_64-unknown-linux-musl\"\n\n[target.aarch64-unknown-linux-gnu]\nimage = \"quickwit/cross:aarch64-unknown-linux-gnu\"\n\n[target.aarch64-unknown-linux-gnu.env]\n# Fix build for transitive dependency rdkafka -> rdkafka-sys -> sasl2-sys -> krb5-src\n# Introduced by https://github.com/MaterializeInc/rust-krb5-src/pull/27\npassthrough = [\n    \"krb5_cv_attr_constructor_destructor=yes\",\n    \"ac_cv_func_regcomp=yes\",\n    \"ac_cv_printf_positional=yes\",\n]\n\n[target.aarch64-unknown-linux-musl]\nimage = \"quickwit/cross:aarch64-unknown-linux-musl\"\n"
  },
  {
    "path": "quickwit/Makefile",
    "content": "help:\n\t@grep '^[^\\.#[:space:]].*:' Makefile\n\ndoc:\n\t@echo \"Running cargo doc\"\n\t@RUSTDOCFLAGS='-Dwarnings -Arustdoc::private_intra_doc_links' cargo doc --all-features\n\nfmt:\n\t@echo \"Formatting Rust files\"\n\t@(rustup toolchain list | ( ! grep -q nightly && echo \"Toolchain 'nightly' is not installed. Please install using 'rustup toolchain install nightly'.\") ) || cargo +nightly fmt\n\t@echo \"Checking license headers\"\n\t@bash scripts/check_license_headers.sh\n\t@echo \"Checking log format\"\n\t@bash scripts/check_log_format.sh\n\ndependency-licenses.html: Cargo.lock scripts/about.hbs scripts/about.toml\n\t@echo \"Checking dependency licenses\"\n\t@cargo about generate -c scripts/about.toml scripts/about.hbs -o dependency-licenses.html --workspace\n\nfix:\n\t@echo \"Running cargo clippy --fix\"\n\t@cargo clippy --workspace --all-features --tests --fix --allow-dirty --allow-staged\n\t@$(MAKE) fmt\n\t@$(MAKE) unused-deps\n\nunused-deps:\n\t@echo \"Checking for unused dependencies\"\n\t@(command -v cargo-machete >/dev/null || cargo --list | grep -q machete || (echo \"cargo-machete is not installed. 
Please install using 'cargo install cargo-machete'.\" && exit 1))\n\t@cargo machete\n\n# Usage:\n# `make test-all` starts the Docker services and runs all the tests.\n# `make -k test-all docker-compose-down`, tears down the Docker services after running all the tests.\ntest-all:\n\tAWS_ACCESS_KEY_ID=ignored \\\n\tAWS_SECRET_ACCESS_KEY=ignored \\\n\tAWS_REGION=us-east-1 \\\n\tPUBSUB_EMULATOR_HOST=localhost:8681 \\\n\tQW_S3_ENDPOINT=http://localhost:4566 \\\n\tQW_S3_FORCE_PATH_STYLE_ACCESS=1 \\\n\tQW_TEST_DATABASE_URL=postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev \\\n\tRUST_MIN_STACK=67108864 \\\n\tcargo nextest run --all-features --retries 5\n\tcargo nextest run --test failpoints --features fail/failpoints\n\ntest-failpoints:\n\tcargo nextest run --test failpoints --features fail/failpoints\n\n# TODO: to be replaced by https://github.com/quickwit-oss/quickwit/issues/237\nTARGET ?= x86_64-unknown-linux-gnu\n.PHONY: build\nbuild: build-ui\n\t@echo \"Building binary for target=${TARGET}\"\n\t@which cross > /dev/null 2>&1 || (echo \"Cross is not installed. 
Please install using 'cargo install cross'.\" && exit 1)\n\t@case \"${TARGET}\" in \\\n\t\t*musl ) \\\n\t\t\tcross build --release --features release-feature-set --target ${TARGET}; \\\n\t\t;; \\\n\t\t* ) \\\n\t\t\tcross build --release --features release-feature-vendored-set --target ${TARGET}; \\\n\t\t;; \\\n\tesac\n\nworkspace-deps-tree:\n\tcargo tree --all-features --workspace -f \"{p}\" --prefix depth | cut -f 1 -d ' ' | python3 scripts/dep-tree.py\n\n.PHONY: build-rustdoc\nbuild-rustdoc:\n\tRUSTDOCFLAGS=\"-Dwarnings -Arustdoc::private_intra_doc_links\" cargo doc --no-deps --all-features --document-private-items\n\n.PHONY: build-ui\nbuild-ui:\n\tNODE_ENV=production cd quickwit-ui && $(MAKE) install build\n\nrm-postgres:\n\trm -fr /tmp/quickwit/services/postgres\n\nupdate-licenses:\n\t dd-rust-license-tool --config license-tool.toml write\n\t mv LICENSE-3rdparty.csv ../LICENSE-3rdparty.csv\n\n"
  },
  {
    "path": "quickwit/NOTICE",
    "content": "Datadog Quickwit\nCopyright 2021-Present Datadog, Inc.\nThis product includes software developed at Datadog (<https://www.datadoghq.com/).>\n"
  },
  {
    "path": "quickwit/clippy.toml",
    "content": "disallowed-methods = [\n    # This function is not sound because it does not return a Result\n    \"std::path::Path::exists\",\n    # These functions hurt readability (according to Paul)\n    \"std::option::Option::is_some_and\",\n    \"std::option::Option::is_none_or\",\n    \"std::option::Option::xor\",\n    # \"std::option::Option::and_then\",\n    # .map(..).unwrap_or(..) or let Some(..) else {..}\n    \"std::option::Option::map_or\",\n    # .map(..).unwrap_or_else(..) or let Some(..) else {..}\n    \"std::option::Option::map_or_else\",\n]\n\nignore-interior-mutability = [\n    \"bytes::Bytes\",\n    \"bytestring::ByteString\",\n    \"quickwit_ingest::ShardInfo\",\n    \"quickwit_ingest::ShardInfos\",\n    \"quickwit_proto::types::ShardId\",\n]\n"
  },
  {
    "path": "quickwit/deny.toml",
    "content": "# This template contains all of the possible sections and their default values\n\n# Note that all fields that take a lint level have these possible values:\n# * deny - An error will be produced and the check will fail\n# * warn - A warning will be produced, but the check will not fail\n# * allow - No warning or error will be produced, though in some cases a note\n# will be\n\n# The values provided in this template are the default values that will be used\n# when any section or field is not specified in your own configuration\n\n[graph]\n# If 1 or more target triples (and optionally, target_features) are specified,\n# only the specified targets will be checked when running `cargo deny check`.\n# This means, if a particular package is only ever used as a target specific\n# dependency, such as, for example, the `nix` crate only being used via the\n# `target_family = \"unix\"` configuration, that only having windows targets in\n# this list would mean the nix crate, as well as any of its exclusive\n# dependencies not shared by any other crates, would be ignored, as the target\n# list here is effectively saying which targets you are building for.\ntargets = [\n    # The triple can be any string, but only the target triples built in to\n    # rustc (as of 1.40) can be checked against actual config expressions\n    #{ triple = \"x86_64-unknown-linux-musl\" },\n    # You can also specify which target_features you promise are enabled for a\n    # particular target. 
target_features are currently not validated against\n    # the actual valid features supported by the target architecture.\n    #{ triple = \"wasm32-unknown-unknown\", features = [\"atomics\"] },\n]\n\n# This section is considered when running `cargo deny check advisories`\n# More documentation for the advisories section can be found here:\n# https://embarkstudios.github.io/cargo-deny/checks/advisories/cfg.html\n[advisories]\nversion = 2\n# The path where the advisory database is cloned/fetched into\ndb-path = \"~/.cargo/advisory-db\"\n# The url(s) of the advisory databases to use\ndb-urls = [\"https://github.com/rustsec/advisory-db\"]\n# A list of advisory IDs to ignore. Note that ignored advisories will still\n# output a note when they are encountered.\nignore = [\n    \"RUSTSEC-2021-0153\", # `encoding` is unmaintained, it's used in lindera\n]\n\n# This section is considered when running `cargo deny check licenses`\n# More documentation for the licenses section can be found here:\n# https://embarkstudios.github.io/cargo-deny/checks/licenses/cfg.html\n[licenses]\nversion = 2\n# List of explicitly allowed licenses\n# See https://spdx.org/licenses/ for list of possible licenses\n# [possible values: any SPDX 3.11 short identifier (+ optional exception)].\nallow = [\n    # \"Apache-2.0 WITH LLVM-exception\",\n    \"0BSD\",\n    \"Apache-2.0\",\n    \"BSD-2-Clause\",\n    \"BSD-3-Clause\",\n    \"CC0-1.0\",\n    \"CDLA-Permissive-2.0\",\n    \"ISC\",\n    \"MIT\",\n    \"MPL-2.0\",\n    \"OpenSSL\",\n    \"Unicode-3.0\",\n    \"Unlicense\",\n    \"Zlib\",\n    \"zlib-acknowledgement\",\n]\n# The confidence threshold for detecting a license from license text.\n# The higher the value, the more closely the license text must be to the\n# canonical license text of a valid SPDX license file.\n# [possible values: any between 0.0 and 1.0].\nconfidence-threshold = 0.8\n# Allow 1 or more licenses on a per-crate basis, so that particular licenses\n# aren't accepted for every 
possible crate as with the normal allow list\nexceptions = []\n\n# Some crates don't have (easily) machine readable licensing information,\n# adding a clarification entry for it allows you to manually specify the\n# licensing information\n[[licenses.clarify]]\n# The name of the crate the clarification applies to\nname = \"ring\"\n# The optional version constraint for the crate\nversion = \"*\"\n# The SPDX expression for the license requirements of the crate\nexpression = \"OpenSSL\"\n# One or more files in the crate's source used as the \"source of truth\" for\n# the license expression. If the contents match, the clarification will be used\n# when running the license check, otherwise the clarification will be ignored\n# and the crate will be checked normally, which may produce warnings or errors\n# depending on the rest of your configuration\nlicense-files = [\n    # Each entry is a crate relative path, and the (opaque) hash of its contents\n    { path = \"LICENSE\", hash = 0xbd0eed23 }\n]\n\n[licenses.private]\n# If true, ignores workspace crates that aren't published, or are only\n# published to private registries\nignore = false\n# One or more private registries that you might publish crates to, if a crate\n# is only published to private registries, and ignore is true, the crate will\n# not have its license(s) checked\nregistries = [\n    #\"https://sekretz.com/registry\n]\n\n# This section is considered when running `cargo deny check bans`.\n# More documentation about the 'bans' section can be found here:\n# https://embarkstudios.github.io/cargo-deny/checks/bans/cfg.html\n[bans]\n# Lint level for when multiple versions of the same crate are detected\nmultiple-versions = \"warn\"\n# Lint level for when a crate version requirement is `*`\nwildcards = \"allow\"\n# The graph highlighting used when creating dotgraphs for crates\n# with multiple versions\n# * lowest-version - The path to the lowest versioned duplicate is highlighted\n# * simplest-path - The path to 
the version with the fewest edges is highlighted\n# * all - Both lowest-version and simplest-path are used\nhighlight = \"all\"\n# List of crates that are allowed. Use with care!\nallow = [\n    #{ name = \"ansi_term\", version = \"=0.11.0\" },\n]\n# List of crates to deny\ndeny = [\n    # Each entry the name of a crate and a version range. If version is\n    # not specified, all versions will be matched.\n    #{ name = \"ansi_term\", version = \"=0.11.0\" },\n    #\n    # Wrapper crates can optionally be specified to allow the crate when it\n    # is a direct dependency of the otherwise banned crate\n    #{ name = \"ansi_term\", version = \"=0.11.0\", wrappers = [] },\n]\n# Certain crates/versions that will be skipped when doing duplicate detection.\nskip = [\n    #{ name = \"ansi_term\", version = \"=0.11.0\" },\n]\n# Similarly to `skip` allows you to skip certain crates during duplicate\n# detection. Unlike skip, it also includes the entire tree of transitive\n# dependencies starting at the specified crate, up to a certain depth, which is\n# by default infinite\nskip-tree = [\n    #{ name = \"ansi_term\", version = \"=0.11.0\", depth = 20 },\n]\n\n# This section is considered when running `cargo deny check sources`.\n# More documentation about the 'sources' section can be found here:\n# https://embarkstudios.github.io/cargo-deny/checks/sources/cfg.html\n[sources]\n# Lint level for what to happen when a crate from a crate registry that is not\n# in the allow list is encountered\nunknown-registry = \"warn\"\n# Lint level for what to happen when a crate from a git repository that is not\n# in the allow list is encountered\nunknown-git = \"warn\"\n# List of URLs for allowed crate registries. Defaults to the crates.io index\n# if not specified. 
If it is specified but empty, no registries are allowed.\nallow-registry = [\"https://github.com/rust-lang/crates.io-index\"]\n# List of URLs for allowed Git repositories\nallow-git = []\n\n[sources.allow-org]\n# 1 or more github.com organizations to allow git sources for\ngithub = [\"quickwit-oss\"]\n# 1 or more gitlab.com organizations to allow git sources for\ngitlab = []\n# 1 or more bitbucket.org organizations to allow git sources for\nbitbucket = []\n"
  },
  {
    "path": "quickwit/dependency-licenses.html",
    "content": "<html>\n\n<head>\n    <style>\n        @media (prefers-color-scheme: dark) {\n            body {\n                background: #333;\n                color: white;\n            }\n            a {\n                color: skyblue;\n            }\n        }\n        .container {\n            font-family: sans-serif;\n            max-width: 800px;\n            margin: 0 auto;\n        }\n        .intro {\n            text-align: center;\n        }\n        .licenses-list {\n            list-style-type: none;\n            margin: 0;\n            padding: 0;\n        }\n        .license-used-by {\n            margin-top: -10px;\n        }\n        .license-text {\n            max-height: 200px;\n            overflow-y: scroll;\n            white-space: pre-wrap;\n        }\n    </style>\n</head>\n\n<body>\n    <main class=\"container\">\n        <div class=\"intro\">\n            <h1>Third Party Licenses</h1>\n            <p>This page lists the licenses of the projects used in cargo-about.</p>\n        </div>\n    \n        <h2>Overview of licenses:</h2>\n        <ul class=\"licenses-overview\">\n            <li><a href=\"#Apache-2.0\">Apache License 2.0</a> (424)</li>\n            <li><a href=\"#MIT\">MIT License</a> (168)</li>\n            <li><a href=\"#AGPL-3.0\">GNU Affero General Public License v3.0</a> (29)</li>\n            <li><a href=\"#CC0-1.0\">Creative Commons Zero v1.0 Universal</a> (7)</li>\n            <li><a href=\"#BSD-3-Clause\">BSD 3-Clause &quot;New&quot; or &quot;Revised&quot; License</a> (5)</li>\n            <li><a href=\"#ISC\">ISC License</a> (5)</li>\n            <li><a href=\"#0BSD\">BSD Zero Clause License</a> (2)</li>\n            <li><a href=\"#MPL-2.0\">Mozilla Public License 2.0</a> (2)</li>\n            <li><a href=\"#Zlib\">zlib License</a> (2)</li>\n            <li><a href=\"#OpenSSL\">OpenSSL License</a> (1)</li>\n            <li><a href=\"#Unicode-DFS-2016\">Unicode License Agreement - Data Files and Software 
(2016)</a> (1)</li>\n            <li><a href=\"#zlib-acknowledgement\">zlib/libpng License with Acknowledgement</a> (1)</li>\n        </ul>\n\n        <h2>All license text:</h2>\n        <ul class=\"licenses-list\">\n            <li class=\"license\">\n                <h3 id=\"0BSD\">BSD Zero Clause License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/stephaneyfx/enum-iterator.git \">enum-iterator-derive 1.4.0</a></li>\n                </ul>\n                <pre class=\"license-text\">BSD Zero Clause License\n\nCopyright (c) 2018 Stephane Raux\n\nPermission to use, copy, modify, and/or distribute this software for any\npurpose with or without fee is hereby granted.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH\nREGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY\nAND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,\nINDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM\nLOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR\nOTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR\nPERFORMANCE OF THIS SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"0BSD\">BSD Zero Clause License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/oyvindln/adler2 \">adler2 2.0.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (C) Jonas Schievink &lt;jonasschievink@gmail.com&gt;\n\nPermission to use, copy, modify, and/or distribute this software for\nany purpose with or without fee is hereby granted.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND THE AUTHOR DISCLAIMS ALL WARRANTIES\nWITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES 
OF\nMERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR\nANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES\nWHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN\nAN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT\nOF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"AGPL-3.0\">GNU Affero General Public License v3.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-actors 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-aws 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-cli 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-cluster 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-codegen 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-codegen-example 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-common 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-config 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-control-plane 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-datetime 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-directories 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-doc-mapper 0.8.0</a></li>\n                    <li><a href=\" 
https://github.com/quickwit-oss/quickwit \">quickwit-index-management 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-indexing 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-ingest 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-integration-tests 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-jaeger 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-janitor 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-lambda 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-macros 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-metastore 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-opentelemetry 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-proto 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-query 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-rest-client 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-search 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-serve 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-storage 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/quickwit \">quickwit-telemetry 0.8.0</a></li>\n                </ul>\n                <pre class=\"license-text\">GNU AFFERO GENERAL PUBLIC 
LICENSE\nVersion 3, 19 November 2007\n\nCopyright (C) 2007 Free Software Foundation, Inc. &lt;http://fsf.org/&gt;\n\nEveryone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.\n\n                            Preamble\n\nThe GNU Affero General Public License is a free, copyleft license for software and other kinds of works, specifically designed to ensure cooperation with the community in the case of network server software.\n\nThe licenses for most software and other practical works are designed to take away your freedom to share and change the works.  By contrast, our General Public Licenses are intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users.\n\nWhen we speak of free software, we are referring to freedom, not price.  Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.\n\nDevelopers that use our General Public Licenses protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License which gives you legal permission to copy, distribute and/or modify the software.\n\nA secondary benefit of defending all users&#x27; freedom is that improvements made in alternate versions of the program, if they receive widespread use, become available for other developers to incorporate.  Many developers of free software are heartened and encouraged by the resulting cooperation.  However, in the case of software used on network servers, this result may fail to come about. 
The GNU General Public License permits making a modified version and letting the public access it on a server without ever releasing its source code to the public.\n\nThe GNU Affero General Public License is designed specifically to ensure that, in such cases, the modified source code becomes available to the community.  It requires the operator of a network server to provide the source code of the modified version running there to the users of that server.  Therefore, public use of a modified version, on a publicly accessible server, gives the public access to the source code of the modified version.\n\nAn older license, called the Affero General Public License and published by Affero, was designed to accomplish similar goals.  This is a different license, not a version of the Affero GPL, but Affero has released a new version of the Affero GPL which permits relicensing under this license.\n\nThe precise terms and conditions for copying, distribution and modification follow.\n\n                       TERMS AND CONDITIONS\n\n0. Definitions.\n\n&quot;This License&quot; refers to version 3 of the GNU Affero General Public License.\n\n&quot;Copyright&quot; also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.\n\n&quot;The Program&quot; refers to any copyrightable work licensed under this License.  Each licensee is addressed as &quot;you&quot;.  &quot;Licensees&quot; and &quot;recipients&quot; may be individuals or organizations.\n\nTo &quot;modify&quot; a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy.  
The resulting work is called a &quot;modified version&quot; of the earlier work or a work &quot;based on&quot; the earlier work.\n\nA &quot;covered work&quot; means either the unmodified Program or a work based on the Program.\n\nTo &quot;propagate&quot; a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy.  Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.\n\nTo &quot;convey&quot; a work means any kind of propagation that enables other parties to make or receive copies.  Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.\n\nAn interactive user interface displays &quot;Appropriate Legal Notices&quot; to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License.  If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.\n\n1. Source Code.\nThe &quot;source code&quot; for a work means the preferred form of the work for making modifications to it.  
&quot;Object code&quot; means any non-source form of a work.\n\nA &quot;Standard Interface&quot; means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.\n\nThe &quot;System Libraries&quot; of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form.  A &quot;Major Component&quot;, in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.\n\nThe &quot;Corresponding Source&quot; for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities.  However, it does not include the work&#x27;s System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work.  
For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those\nsubprograms and other parts of the work.\n\nThe Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.\n\nThe Corresponding Source for a work in source code form is that same work.\n\n2. Basic Permissions.\nAll rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met.  This License explicitly affirms your unlimited permission to run the unmodified Program.  The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work.  This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.\n\nYou may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force.  You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright.  Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.\n\nConveying under any other circumstances is permitted solely under the conditions stated below.  Sublicensing is not allowed; section 10 makes it unnecessary.\n\n3. 
Protecting Users&#x27; Legal Rights From Anti-Circumvention Law.\nNo covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.\n\nWhen you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work&#x27;s users, your or third parties&#x27; legal rights to forbid circumvention of technological measures.\n\n4. Conveying Verbatim Copies.\nYou may convey verbatim copies of the Program&#x27;s source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.\n\nYou may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.\n\n5. Conveying Modified Source Versions.\nYou may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:\n\n    a) The work must carry prominent notices stating that you modified it, and giving a relevant date.\n\n    b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7.  
This requirement modifies the requirement in section 4 to &quot;keep intact all notices&quot;.\n\n    c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy.  This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged.  This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.\n\n    d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.\n\nA compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an &quot;aggregate&quot; if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation&#x27;s users beyond what the individual works permit.  Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.\n\n6. 
Conveying Non-Source Forms.\nYou may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:\n\n    a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.\n\n    b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.\n\n    c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source.  This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.\n\n    d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge.  You need not require recipients to copy the Corresponding Source along with the object code.  
If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source.  Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.\n\n    e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.\n\nA separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.\n\nA &quot;User Product&quot; is either (1) a &quot;consumer product&quot;, which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling.  In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage.  For a particular product received by a particular user, &quot;normally used&quot; refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product.  
A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.\n\n&quot;Installation Information&quot; for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source.  The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.\n\nIf you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information.  But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).\n\nThe requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed.  
Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.\n\nCorresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.\n\n7. Additional Terms.\n&quot;Additional permissions&quot; are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law.  If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.\n\nWhen you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it.  (Additional permissions may be written to require their own removal in certain cases when you modify the work.)  
You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.\n\nNotwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:\n\n    a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or\n\n    b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or\n\n    c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or\n\n    d) Limiting the use for publicity purposes of names of licensors or authors of the material; or\n\n    e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or\n\n    f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.\n\nAll other non-permissive additional terms are considered &quot;further restrictions&quot; within the meaning of section 10.  If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term.  
If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.\n\nIf you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.\n\nAdditional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.\n\n8. Termination.\n\nYou may not propagate or modify a covered work except as expressly provided under this License.  Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).\n\nHowever, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.\n\nMoreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.\n\nTermination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License.  
If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.\n\n9. Acceptance Not Required for Having Copies.\n\nYou are not required to accept this License in order to receive or run a copy of the Program.  Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance.  However, nothing other than this License grants you permission to propagate or modify any covered work.  These actions infringe copyright if you do not accept this License.  Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.\n\n10. Automatic Licensing of Downstream Recipients.\n\nEach time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License.  You are not responsible for enforcing compliance by third parties with this License.\n\nAn &quot;entity transaction&quot; is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations.  If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party&#x27;s predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.\n\nYou may not impose any further restrictions on the exercise of the rights granted or affirmed under this License.  
For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.\n\n11. Patents.\n\nA &quot;contributor&quot; is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based.  The work thus licensed is called the contributor&#x27;s &quot;contributor version&quot;.\n\nA contributor&#x27;s &quot;essential patent claims&quot; are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version.  For purposes of this definition, &quot;control&quot; includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.\n\nEach contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor&#x27;s essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.\n\nIn the following three paragraphs, a &quot;patent license&quot; is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement).  
To &quot;grant&quot; such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.\n\nIf you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent\nlicense to downstream recipients.  &quot;Knowingly relying&quot; means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient&#x27;s use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.\n\nIf, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.\n\nA patent license is &quot;discriminatory&quot; if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License.  
You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.\n\nNothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.\n\n12. No Surrender of Others&#x27; Freedom.\n\nIf conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License.  If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may\nnot convey it at all.  For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.\n\n13. 
Remote Network Interaction; Use with the GNU General Public License.\n\nNotwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software.  This Corresponding Source shall include the Corresponding Source for any work covered by version 3 of the GNU General Public License that is incorporated pursuant to the following paragraph.\n\nNotwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU General Public License into a single combined work, and to convey the resulting work.  The terms of this License will continue to apply to the part which is the covered work, but the work with which it is combined will remain governed by version 3 of the GNU General Public License.\n\n14. Revised Versions of this License.\n\nThe Free Software Foundation may publish revised and/or new versions of the GNU Affero General Public License from time to time.  Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.\n\nEach version is given a distinguishing version number.  If the Program specifies that a certain numbered version of the GNU Affero General Public License &quot;or any later version&quot; applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation.  
If the Program does not specify a version number of the GNU Affero General Public License, you may choose any version ever published by the Free Software Foundation.\n\nIf the Program specifies that a proxy can decide which future versions of the GNU Affero General Public License can be used, that proxy&#x27;s public statement of acceptance of a version permanently authorizes you to choose that version for the Program.\n\nLater license versions may give you additional or different permissions.  However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.\n\n15. Disclaimer of Warranty.\n\nTHERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM &quot;AS IS&quot; WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n\n16. Limitation of Liability.\n\nIN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.\n\n17. 
Interpretation of Sections 15 and 16.\n\nIf the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.\n\nEND OF TERMS AND CONDITIONS\n\n            How to Apply These Terms to Your New Programs\n\nIf you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.\n\nTo do so, attach the following notices to the program.  It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the &quot;copyright&quot; line and a pointer to where the full notice is found.\n\n     &lt;one line to give the program&#x27;s name and a brief idea of what it does.&gt;\n     Copyright (C) &lt;year&gt;  &lt;name of author&gt;\n\n     This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.\n\n     This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Affero General Public License for more details.\n\n     You should have received a copy of the GNU Affero General Public License along with this program.  
If not, see &lt;http://www.gnu.org/licenses/&gt;.\n\nAlso add information on how to contact you by electronic and paper mail.\n\nIf your software can interact with users remotely through a computer network, you should also make sure that it provides a way for users to get its source.  For example, if your program is a web application, its interface could display a &quot;Source&quot; link that leads users to an archive of the code.  There are many ways you could offer source, and different solutions will be better for different programs; see section 13 for the specific requirements.\n\nYou should also get your employer (if you work as a programmer) or school, if any, to sign a &quot;copyright disclaimer&quot; for the program, if necessary. For more information on this, and how to apply and follow the GNU AGPL, see &lt;http://www.gnu.org/licenses/&gt;.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-config 1.5.8</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-credential-types 1.2.1</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-runtime 1.4.3</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-async 1.2.1</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-checksums 0.60.12</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-eventstream 0.60.5</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-http 0.60.11</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-json 0.60.7</a></li>\n                    
<li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-protocol-test 0.63.0</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-query 0.60.7</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-runtime-api 1.7.2</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-runtime 1.7.2</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-types 1.2.7</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-smithy-xml 0.60.9</a></li>\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-types 1.3.3</a></li>\n                </ul>\n                <pre class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. 
For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. 
For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. 
However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/Frommi/miniz_oxide/tree/master/miniz_oxide \">miniz_oxide 0.8.0</a></li>\n                    <li><a href=\" https://github.com/taiki-e/pin-project \">pin-project-internal 1.1.6</a></li>\n                    <li><a href=\" https://github.com/taiki-e/pin-project-lite \">pin-project-lite 0.2.14</a></li>\n                    <li><a href=\" https://github.com/taiki-e/pin-project \">pin-project 1.1.6</a></li>\n                    <li><a href=\" https://github.com/taiki-e/portable-atomic \">portable-atomic 1.9.0</a></li>\n                    <li><a href=\" https://github.com/Actyx/sync_wrapper \">sync_wrapper 0.1.2</a></li>\n                    <li><a href=\" https://github.com/gyscos/zstd-rs \">zstd-safe 5.0.2+zstd.1.5.2</a></li>\n                    <li><a href=\" https://github.com/gyscos/zstd-rs \">zstd-safe 7.2.1</a></li>\n                    <li><a href=\" https://github.com/gyscos/zstd-rs \">zstd-sys 2.0.13+zstd.1.5.6</a></li>\n                </ul>\n                <pre class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. 
However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/jhpratt/deranged \">deranged 0.3.11</a></li>\n                    <li><a href=\" https://github.com/time-rs/time \">time-core 0.1.2</a></li>\n                </ul>\n                <pre class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. 
For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. 
For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2022 Jacob Pratt et al.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/jhpratt/num-conv \">num-conv 0.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2023 Jacob Pratt\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/jhpratt/powerfmt \">powerfmt 0.2.0</a></li>\n                </ul>\n                <pre class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2023 Jacob Pratt et al.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/time-rs/time \">time-macros 0.2.18</a></li>\n                    <li><a href=\" https://github.com/time-rs/time \">time 0.3.36</a></li>\n                </ul>\n                <pre class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2024 Jacob Pratt et al.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/smithy-lang/smithy-rs \">aws-sigv4 1.2.4</a></li>\n                    <li><a href=\" https://github.com/tormol/encode_unicode \">encode_unicode 0.3.6</a></li>\n                    <li><a href=\" https://github.com/hsivonen/encoding_rs \">encoding_rs 0.8.32</a></li>\n                    <li><a href=\" https://github.com/mitsuhiko/fragile \">fragile 2.0.0</a></li>\n                    <li><a href=\" https://github.com/nvzqz/static-assertions-rs \">static_assertions 1.1.0</a></li>\n                    <li><a href=\" https://github.com/Lokathor/tinyvec \">tinyvec 1.8.0</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/utils/tree/master/zeroize \">zeroize 1.8.1</a></li>\n                </ul>\n                <pre 
class=\"license-text\">\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      
&quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/wvwwvwwv/scalable-delayed-dealloc/ \">sdd 3.0.3</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. 
However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   Copyright 2024-present Changgyoo Park\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/assert-rs/predicates-rs \">predicates 2.1.5</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. 
However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows-core 0.52.0</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows-sys 0.48.0</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows-sys 0.52.0</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows-sys 0.59.0</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows-targets 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows-targets 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_aarch64_gnullvm 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_aarch64_gnullvm 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_aarch64_msvc 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_aarch64_msvc 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_i686_gnu 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_i686_gnu 0.52.6</a></li>\n 
                   <li><a href=\" https://github.com/microsoft/windows-rs \">windows_i686_gnullvm 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_i686_msvc 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_i686_msvc 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_x86_64_gnu 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_x86_64_gnu 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_x86_64_gnullvm 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_x86_64_gnullvm 0.52.6</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_x86_64_msvc 0.48.5</a></li>\n                    <li><a href=\" https://github.com/microsoft/windows-rs \">windows_x86_64_msvc 0.52.6</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. 
For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. 
For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright (c) Microsoft Corporation.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/Soveu/tinyvec_macros \">tinyvec_macros 0.1.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2020 Tomasz &quot;Soveu&quot; Marx\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/google/zerocopy \">zerocopy-derive 0.7.35</a></li>\n                    <li><a href=\" https://github.com/google/zerocopy \">zerocopy 0.7.35</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2023 The Fuchsia Authors\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/open-telemetry/opentelemetry-rust \">opentelemetry-http 0.9.0</a></li>\n                    <li><a href=\" https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-otlp \">opentelemetry-otlp 0.13.0</a></li>\n                    <li><a href=\" https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-proto \">opentelemetry-proto 0.3.0</a></li>\n                    <li><a href=\" https://github.com/open-telemetry/opentelemetry-rust/tree/main/opentelemetry-semantic-conventions \">opentelemetry-semantic-conventions 0.12.0</a></li>\n                    <li><a href=\" https://github.com/open-telemetry/opentelemetry-rust \">opentelemetry 0.20.0</a></li>\n                    <li><a href=\" 
https://github.com/open-telemetry/opentelemetry-rust \">opentelemetry_api 0.20.0</a></li>\n                    <li><a href=\" https://github.com/open-telemetry/opentelemetry-rust \">opentelemetry_sdk 0.20.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. 
For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. 
For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2023 The OpenTelemetry Authors\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/daxpedda/web-time \">web-time 1.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2023 dAxpeDDa\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/mheffner/rust-sketches-ddsketch \">sketches-ddsketch 0.3.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [2019] [Mike Heffner]\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/juhaku/utoipa \">utoipa-gen 4.3.1</a></li>\n                    <li><a href=\" https://github.com/juhaku/utoipa \">utoipa 4.2.3</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/enarx/ciborium \">ciborium-io 0.2.2</a></li>\n                    <li><a href=\" https://github.com/enarx/ciborium \">ciborium-ll 0.2.2</a></li>\n                    <li><a href=\" https://github.com/enarx/ciborium \">ciborium 0.2.2</a></li>\n                    <li><a href=\" https://github.com/clap-rs/clap \">clap_builder 4.5.20</a></li>\n                    <li><a href=\" https://github.com/clap-rs/clap \">clap_lex 0.7.2</a></li>\n                    <li><a href=\" https://github.com/vinted/elasticsearch-dsl-rs \">elasticsearch-dsl 0.4.22</a></li>\n                    <li><a href=\" https://github.com/tmccombs/json-comments-rs \">json_comments 0.2.2</a></li>\n                    <li><a href=\" https://github.com/MiSawa/time-fmt 
\">time-fmt 0.3.8</a></li>\n                    <li><a href=\" https://github.com/cameron1024/unarray \">unarray 0.1.4</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made 
available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/krisprice/ipnet \">ipnet 2.10.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2017 Juniper Networks, Inc.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/actix/actix-net.git \">bytestring 1.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2017-NOW Actix Team\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/bikeshedder/deadpool \">deadpool-runtime 0.1.4</a></li>\n                    <li><a href=\" https://github.com/bikeshedder/deadpool \">deadpool 0.9.5</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2019 Michael P. Jung\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tikv/fail-rs \">fail 0.5.1</a></li>\n                    <li><a href=\" https://github.com/tikv/rust-prometheus \">prometheus 0.13.4</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2019 TiKV Project Authors.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-cli/anstyle.git \">anstyle-parse 0.2.5</a></li>\n                    <li><a href=\" https://github.com/llogiq/bytecount \">bytecount 0.6.8</a></li>\n                    <li><a href=\" https://github.com/utkarshkukreti/diff.rs \">diff 0.1.13</a></li>\n                    <li><a href=\" https://github.com/achanda/ipnetwork \">ipnetwork 0.20.0</a></li>\n                    <li><a href=\" https://github.com/derekdreery/normalize-line-endings \">normalize-line-endings 0.3.0</a></li>\n                    <li><a href=\" https://github.com/assert-rs/predicates-rs/tree/master/crates/core \">predicates-core 1.0.8</a></li>\n                    <li><a href=\" https://github.com/assert-rs/predicates-rs/tree/master/crates/tree \">predicates-tree 
1.0.11</a></li>\n                    <li><a href=\" https://github.com/assert-rs/predicates-rs \">predicates 3.1.2</a></li>\n                    <li><a href=\" https://github.com/retep998/winapi-rs \">winapi 0.3.9</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. 
For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. 
For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright {yyyy} {name of copyright owner}\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-cli/anstyle.git \">anstream 0.6.15</a></li>\n                    <li><a href=\" https://github.com/rust-cli/anstyle \">anstyle-query 1.1.1</a></li>\n                    <li><a href=\" https://github.com/rust-cli/anstyle.git \">anstyle-wincon 3.0.4</a></li>\n                    <li><a href=\" https://github.com/rust-cli/anstyle.git \">anstyle 1.0.8</a></li>\n                    <li><a href=\" https://github.com/hyunsik/bytesize/ \">bytesize 1.3.0</a></li>\n                    <li><a href=\" https://github.com/clap-rs/clap \">clap 4.5.20</a></li>\n                    <li><a href=\" https://github.com/jamesmunns/cobs.rs \">cobs 0.2.3</a></li>\n                    <li><a href=\" https://github.com/rust-cli/anstyle \">colorchoice 
1.0.2</a></li>\n                    <li><a href=\" https://github.com/srijs/rust-crc32fast \">crc32fast 1.4.2</a></li>\n                    <li><a href=\" https://github.com/rust-cli/env_logger \">env_logger 0.10.2</a></li>\n                    <li><a href=\" https://github.com/KokaKiwi/rust-hex \">hex 0.4.3</a></li>\n                    <li><a href=\" https://github.com/tailhook/humantime \">humantime 2.1.0</a></li>\n                    <li><a href=\" https://github.com/polyfill-rs/is_terminal_polyfill \">is_terminal_polyfill 1.70.1</a></li>\n                    <li><a href=\" https://github.com/rust-pretty-assertions/rust-pretty-assertions \">pretty_assertions 1.4.1</a></li>\n                    <li><a href=\" http://github.com/tailhook/quick-error \">quick-error 1.2.3</a></li>\n                    <li><a href=\" https://github.com/toml-rs/toml \">serde_spanned 0.6.8</a></li>\n                    <li><a href=\" https://github.com/sfackler/tokio-io-timeout \">tokio-io-timeout 1.2.0</a></li>\n                    <li><a href=\" https://github.com/toml-rs/toml \">toml 0.7.8</a></li>\n                    <li><a href=\" https://github.com/toml-rs/toml \">toml_datetime 0.6.8</a></li>\n                    <li><a href=\" https://github.com/toml-rs/toml \">toml_edit 0.19.15</a></li>\n                    <li><a href=\" https://github.com/stusmall/ttl_cache \">ttl_cache 0.5.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright {yyyy} {name of copyright owner}\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/http-rs/http-types \">http-types 2.12.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. 
However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   Copyright 2019 Yoshua Wuyts\n   Copyright 2016-2018 Michael Tilli (Pyfisch) &amp; &#x60;httpdate&#x60; contributors\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/wvwwvwwv/scalable-concurrent-containers/ \">scc 2.2.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. 
However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   Copyright 2020-2024 Changgyoo Park\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/RumovZ/android-tzdata \">android-tzdata 0.1.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1.  
Definitions.\n\n    &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n    and distribution as defined by Sections 1 through 9 of this document.\n\n    &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n    the copyright owner that is granting the License.\n\n    &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n    other entities that control, are controlled by, or are under common\n    control with that entity. For the purposes of this definition,\n    &quot;control&quot; means (i) the power, direct or indirect, to cause the\n    direction or management of such entity, whether by contract or\n    otherwise, or (ii) ownership of fifty percent (50%) or more of the\n    outstanding shares, or (iii) beneficial ownership of such entity.\n\n    &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n    exercising permissions granted by this License.\n\n    &quot;Source&quot; form shall mean the preferred form for making modifications,\n    including but not limited to software source code, documentation\n    source, and configuration files.\n\n    &quot;Object&quot; form shall mean any form resulting from mechanical\n    transformation or translation of a Source form, including but\n    not limited to compiled object code, generated documentation,\n    and conversions to other media types.\n\n    &quot;Work&quot; shall mean the work of authorship, whether in Source or\n    Object form, made available under the License, as indicated by a\n    copyright notice that is included in or attached to the work\n    (an example is provided in the Appendix below).\n\n    &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n    form, that is based on (or derived from) the Work and for which the\n    editorial revisions, annotations, elaborations, or other modifications\n    represent, as a whole, an original work of authorship. 
For the purposes\n    of this License, Derivative Works shall not include works that remain\n    separable from, or merely link (or bind by name) to the interfaces of,\n    the Work and Derivative Works thereof.\n\n    &quot;Contribution&quot; shall mean any work of authorship, including\n    the original version of the Work and any modifications or additions\n    to that Work or Derivative Works thereof, that is intentionally\n    submitted to Licensor for inclusion in the Work by the copyright owner\n    or by an individual or Legal Entity authorized to submit on behalf of\n    the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n    means any form of electronic, verbal, or written communication sent\n    to the Licensor or its representatives, including but not limited to\n    communication on electronic mailing lists, source code control systems,\n    and issue tracking systems that are managed by, or on behalf of, the\n    Licensor for the purpose of discussing and improving the Work, but\n    excluding communication that is conspicuously marked or otherwise\n    designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n    &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n    on behalf of whom a Contribution has been received by Licensor and\n    subsequently incorporated within the Work.\n\n2.  Grant of Copyright License. Subject to the terms and conditions of\n    this License, each Contributor hereby grants to You a perpetual,\n    worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n    copyright license to reproduce, prepare Derivative Works of,\n    publicly display, publicly perform, sublicense, and distribute the\n    Work and such Derivative Works in Source or Object form.\n\n3.  Grant of Patent License. 
Subject to the terms and conditions of\n    this License, each Contributor hereby grants to You a perpetual,\n    worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n    (except as stated in this section) patent license to make, have made,\n    use, offer to sell, sell, import, and otherwise transfer the Work,\n    where such license applies only to those patent claims licensable\n    by such Contributor that are necessarily infringed by their\n    Contribution(s) alone or by combination of their Contribution(s)\n    with the Work to which such Contribution(s) was submitted. If You\n    institute patent litigation against any entity (including a\n    cross-claim or counterclaim in a lawsuit) alleging that the Work\n    or a Contribution incorporated within the Work constitutes direct\n    or contributory patent infringement, then any patent licenses\n    granted to You under this License for that Work shall terminate\n    as of the date such litigation is filed.\n\n4.  Redistribution. 
You may reproduce and distribute copies of the\n    Work or Derivative Works thereof in any medium, with or without\n    modifications, and in Source or Object form, provided that You\n    meet the following conditions:\n\n    (a) You must give any other recipients of the Work or\n    Derivative Works a copy of this License; and\n\n    (b) You must cause any modified files to carry prominent notices\n    stating that You changed the files; and\n\n    (c) You must retain, in the Source form of any Derivative Works\n    that You distribute, all copyright, patent, trademark, and\n    attribution notices from the Source form of the Work,\n    excluding those notices that do not pertain to any part of\n    the Derivative Works; and\n\n    (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n    distribution, then any Derivative Works that You distribute must\n    include a readable copy of the attribution notices contained\n    within such NOTICE file, excluding those notices that do not\n    pertain to any part of the Derivative Works, in at least one\n    of the following places: within a NOTICE text file distributed\n    as part of the Derivative Works; within the Source form or\n    documentation, if provided along with the Derivative Works; or,\n    within a display generated by the Derivative Works, if and\n    wherever such third-party notices normally appear. The contents\n    of the NOTICE file are for informational purposes only and\n    do not modify the License. 
You may add Your own attribution\n    notices within Derivative Works that You distribute, alongside\n    or as an addendum to the NOTICE text from the Work, provided\n    that such additional attribution notices cannot be construed\n    as modifying the License.\n\n    You may add Your own copyright statement to Your modifications and\n    may provide additional or different license terms and conditions\n    for use, reproduction, or distribution of Your modifications, or\n    for any such Derivative Works as a whole, provided Your use,\n    reproduction, and distribution of the Work otherwise complies with\n    the conditions stated in this License.\n\n5.  Submission of Contributions. Unless You explicitly state otherwise,\n    any Contribution intentionally submitted for inclusion in the Work\n    by You to the Licensor shall be under the terms and conditions of\n    this License, without any additional terms or conditions.\n    Notwithstanding the above, nothing herein shall supersede or modify\n    the terms of any separate license agreement you may have executed\n    with Licensor regarding such Contributions.\n\n6.  Trademarks. This License does not grant permission to use the trade\n    names, trademarks, service marks, or product names of the Licensor,\n    except as required for reasonable and customary use in describing the\n    origin of the Work and reproducing the content of the NOTICE file.\n\n7.  Disclaimer of Warranty. Unless required by applicable law or\n    agreed to in writing, Licensor provides the Work (and each\n    Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n    implied, including, without limitation, any warranties or conditions\n    of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n    PARTICULAR PURPOSE. 
You are solely responsible for determining the\n    appropriateness of using or redistributing the Work and assume any\n    risks associated with Your exercise of permissions under this License.\n\n8.  Limitation of Liability. In no event and under no legal theory,\n    whether in tort (including negligence), contract, or otherwise,\n    unless required by applicable law (such as deliberate and grossly\n    negligent acts) or agreed to in writing, shall any Contributor be\n    liable to You for damages, including any direct, indirect, special,\n    incidental, or consequential damages of any character arising as a\n    result of this License or out of the use or inability to use the\n    Work (including but not limited to damages for loss of goodwill,\n    work stoppage, computer failure or malfunction, or any and all\n    other commercial damages or losses), even if such Contributor\n    has been advised of the possibility of such damages.\n\n9.  Accepting Warranty or Additional Liability. While redistributing\n    the Work or Derivative Works thereof, You may choose to offer,\n    and charge a fee for, acceptance of support, warranty, indemnity,\n    or other liability obligations and/or rights consistent with this\n    License. However, in accepting such obligations, You may act only\n    on Your own behalf and on Your sole responsibility, not on behalf\n    of any other Contributor, and only if You agree to indemnify,\n    defend, and hold each Contributor harmless for any liability\n    incurred by, or claims asserted against, such Contributor by reason\n    of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. (Don&#x27;t include\n      the brackets!)  
The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/awslabs/aws-sdk-rust \">aws-sdk-s3 1.54.0</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-sdk-rust \">aws-sdk-sqs 1.45.0</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-sdk-rust \">aws-sdk-sso 1.45.0</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-sdk-rust \">aws-sdk-ssooidc 1.46.0</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-sdk-rust \">aws-sdk-sts 1.45.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                                Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tmiasko/shell-words \">shell-words 1.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                               Apache License\n                         Version 2.0, January 2004\n                      http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. 
Definitions.\n\n  &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n  and distribution as defined by Sections 1 through 9 of this document.\n\n  &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n  the copyright owner that is granting the License.\n\n  &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n  other entities that control, are controlled by, or are under common\n  control with that entity. For the purposes of this definition,\n  &quot;control&quot; means (i) the power, direct or indirect, to cause the\n  direction or management of such entity, whether by contract or\n  otherwise, or (ii) ownership of fifty percent (50%) or more of the\n  outstanding shares, or (iii) beneficial ownership of such entity.\n\n  &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n  exercising permissions granted by this License.\n\n  &quot;Source&quot; form shall mean the preferred form for making modifications,\n  including but not limited to software source code, documentation\n  source, and configuration files.\n\n  &quot;Object&quot; form shall mean any form resulting from mechanical\n  transformation or translation of a Source form, including but\n  not limited to compiled object code, generated documentation,\n  and conversions to other media types.\n\n  &quot;Work&quot; shall mean the work of authorship, whether in Source or\n  Object form, made available under the License, as indicated by a\n  copyright notice that is included in or attached to the work\n  (an example is provided in the Appendix below).\n\n  &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n  form, that is based on (or derived from) the Work and for which the\n  editorial revisions, annotations, elaborations, or other modifications\n  represent, as a whole, an original work of authorship. 
For the purposes\n  of this License, Derivative Works shall not include works that remain\n  separable from, or merely link (or bind by name) to the interfaces of,\n  the Work and Derivative Works thereof.\n\n  &quot;Contribution&quot; shall mean any work of authorship, including\n  the original version of the Work and any modifications or additions\n  to that Work or Derivative Works thereof, that is intentionally\n  submitted to Licensor for inclusion in the Work by the copyright owner\n  or by an individual or Legal Entity authorized to submit on behalf of\n  the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n  means any form of electronic, verbal, or written communication sent\n  to the Licensor or its representatives, including but not limited to\n  communication on electronic mailing lists, source code control systems,\n  and issue tracking systems that are managed by, or on behalf of, the\n  Licensor for the purpose of discussing and improving the Work, but\n  excluding communication that is conspicuously marked or otherwise\n  designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n  &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n  on behalf of whom a Contribution has been received by Licensor and\n  subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n  this License, each Contributor hereby grants to You a perpetual,\n  worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n  copyright license to reproduce, prepare Derivative Works of,\n  publicly display, publicly perform, sublicense, and distribute the\n  Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n  this License, each Contributor hereby grants to You a perpetual,\n  worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n  (except as stated in this section) patent license to make, have made,\n  use, offer to sell, sell, import, and otherwise transfer the Work,\n  where such license applies only to those patent claims licensable\n  by such Contributor that are necessarily infringed by their\n  Contribution(s) alone or by combination of their Contribution(s)\n  with the Work to which such Contribution(s) was submitted. If You\n  institute patent litigation against any entity (including a\n  cross-claim or counterclaim in a lawsuit) alleging that the Work\n  or a Contribution incorporated within the Work constitutes direct\n  or contributory patent infringement, then any patent licenses\n  granted to You under this License for that Work shall terminate\n  as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n  Work or Derivative Works thereof in any medium, with or without\n  modifications, and in Source or Object form, provided that You\n  meet the following conditions:\n\n  (a) You must give any other recipients of the Work or\n      Derivative Works a copy of this License; and\n\n  (b) You must cause any modified files to carry prominent notices\n      stating that You changed the files; and\n\n  (c) You must retain, in the Source form of any Derivative Works\n      that You distribute, all copyright, patent, trademark, and\n      attribution notices from the Source form of the Work,\n      excluding those notices that do not pertain to any part of\n      the Derivative Works; and\n\n  (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n      distribution, then any Derivative Works that You distribute must\n      include a readable copy of the attribution notices contained\n      within such NOTICE file, excluding those notices that do 
not\n      pertain to any part of the Derivative Works, in at least one\n      of the following places: within a NOTICE text file distributed\n      as part of the Derivative Works; within the Source form or\n      documentation, if provided along with the Derivative Works; or,\n      within a display generated by the Derivative Works, if and\n      wherever such third-party notices normally appear. The contents\n      of the NOTICE file are for informational purposes only and\n      do not modify the License. You may add Your own attribution\n      notices within Derivative Works that You distribute, alongside\n      or as an addendum to the NOTICE text from the Work, provided\n      that such additional attribution notices cannot be construed\n      as modifying the License.\n\n  You may add Your own copyright statement to Your modifications and\n  may provide additional or different license terms and conditions\n  for use, reproduction, or distribution of Your modifications, or\n  for any such Derivative Works as a whole, provided Your use,\n  reproduction, and distribution of the Work otherwise complies with\n  the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n  any Contribution intentionally submitted for inclusion in the Work\n  by You to the Licensor shall be under the terms and conditions of\n  this License, without any additional terms or conditions.\n  Notwithstanding the above, nothing herein shall supersede or modify\n  the terms of any separate license agreement you may have executed\n  with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n  names, trademarks, service marks, or product names of the Licensor,\n  except as required for reasonable and customary use in describing the\n  origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. 
Unless required by applicable law or\n  agreed to in writing, Licensor provides the Work (and each\n  Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n  implied, including, without limitation, any warranties or conditions\n  of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n  PARTICULAR PURPOSE. You are solely responsible for determining the\n  appropriateness of using or redistributing the Work and assume any\n  risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n  whether in tort (including negligence), contract, or otherwise,\n  unless required by applicable law (such as deliberate and grossly\n  negligent acts) or agreed to in writing, shall any Contributor be\n  liable to You for damages, including any direct, indirect, special,\n  incidental, or consequential damages of any character arising as a\n  result of this License or out of the use or inability to use the\n  Work (including but not limited to damages for loss of goodwill,\n  work stoppage, computer failure or malfunction, or any and all\n  other commercial damages or losses), even if such Contributor\n  has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n  the Work or Derivative Works thereof, You may choose to offer,\n  and charge a fee for, acceptance of support, warranty, indemnity,\n  or other liability obligations and/or rights consistent with this\n  License. 
However, in accepting such obligations, You may act only\n  on Your own behalf and on Your sole responsibility, not on behalf\n  of any other Contributor, and only if You agree to indemnify,\n  defend, and hold each Contributor harmless for any liability\n  incurred by, or claims asserted against, such Contributor by reason\n  of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n  To apply the Apache License to your work, attach the following\n  boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n  replaced with your own identifying information. (Don&#x27;t include\n  the brackets!)  The text should be enclosed in the appropriate\n  comment syntax for the file format. We also recommend that a\n  file or class name and description of purpose be included on the\n  same &quot;printed page&quot; as the copyright notice for easier\n  identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n   http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/takuyaa/yada \">yada 0.5.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                       
       Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and 
for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/dtolnay/anyhow \">anyhow 1.0.89</a></li>\n                    <li><a href=\" https://github.com/dtolnay/async-trait \">async-trait 0.1.83</a></li>\n                    <li><a href=\" https://github.com/zslayton/cron \">cron 0.12.1</a></li>\n                    <li><a href=\" https://github.com/dtolnay/dtoa \">dtoa 1.0.9</a></li>\n                    <li><a href=\" https://github.com/dtolnay/dyn-clone \">dyn-clone 1.0.17</a></li>\n                    <li><a href=\" https://github.com/dtolnay/erased-serde \">erased-serde 0.4.5</a></li>\n                    <li><a href=\" https://github.com/dtolnay/inventory \">inventory 0.3.15</a></li>\n                    <li><a href=\" https://github.com/dtolnay/itoa \">itoa 1.0.11</a></li>\n                    <li><a href=\" https://github.com/rust-lang/libc \">libc 0.2.159</a></li>\n                    <li><a href=\" https://github.com/dtolnay/prettyplease \">prettyplease 0.1.25</a></li>\n                    <li><a href=\" https://github.com/dtolnay/prettyplease \">prettyplease 0.2.22</a></li>\n                    <li><a href=\" https://github.com/SergioBenitez/proc-macro2-diagnostics \">proc-macro2-diagnostics 0.10.1</a></li>\n                    <li><a href=\" https://github.com/dtolnay/proc-macro2 \">proc-macro2 1.0.87</a></li>\n                    
<li><a href=\" https://github.com/dtolnay/quote \">quote 1.0.37</a></li>\n                    <li><a href=\" https://github.com/dtolnay/rustversion \">rustversion 1.0.17</a></li>\n                    <li><a href=\" https://github.com/dtolnay/ryu \">ryu 1.0.18</a></li>\n                    <li><a href=\" https://github.com/dtolnay/semver \">semver 1.0.23</a></li>\n                    <li><a href=\" https://github.com/serde-rs/serde \">serde 1.0.210</a></li>\n                    <li><a href=\" https://github.com/serde-rs/serde \">serde_derive 1.0.210</a></li>\n                    <li><a href=\" https://github.com/serde-rs/json \">serde_json 1.0.112</a></li>\n                    <li><a href=\" https://github.com/dtolnay/path-to-error \">serde_path_to_error 0.1.16</a></li>\n                    <li><a href=\" https://github.com/samscott89/serde_qs \">serde_qs 0.12.0</a></li>\n                    <li><a href=\" https://github.com/samscott89/serde_qs \">serde_qs 0.8.5</a></li>\n                    <li><a href=\" https://github.com/nox/serde_urlencoded \">serde_urlencoded 0.7.1</a></li>\n                    <li><a href=\" https://github.com/dtolnay/serde-yaml \">serde_yaml 0.9.30</a></li>\n                    <li><a href=\" https://github.com/dtolnay/syn \">syn 2.0.79</a></li>\n                    <li><a href=\" https://github.com/dtolnay/thiserror \">thiserror-impl 1.0.64</a></li>\n                    <li><a href=\" https://github.com/dtolnay/thiserror \">thiserror 1.0.64</a></li>\n                    <li><a href=\" https://github.com/dtolnay/typeid \">typeid 1.0.2</a></li>\n                    <li><a href=\" https://github.com/dtolnay/typetag \">typetag-impl 0.2.18</a></li>\n                    <li><a href=\" https://github.com/dtolnay/typetag \">typetag 0.2.18</a></li>\n                    <li><a href=\" https://github.com/dtolnay/unicode-ident \">unicode-ident 1.0.13</a></li>\n                    <li><a href=\" https://github.com/SimonSapin/rust-utf8 \">utf-8 
0.7.6</a></li>\n                    <li><a href=\" https://github.com/alacritty/vte \">utf8parse 0.2.2</a></li>\n                    <li><a href=\" https://github.com/alacritty/vte \">vte 0.10.1</a></li>\n                    <li><a href=\" https://github.com/alacritty/vte \">vte_generate_state_changes 0.1.2</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-channel 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-core 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-executor 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-io 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-macro 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-sink 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-task 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures-util 0.3.31</a></li>\n                    <li><a href=\" https://github.com/rust-lang/futures-rs \">futures 0.3.31</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. 
Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright (c) 2016 Alex Crichton\nCopyright (c) 2017 The Tokio Authors\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/paholg/typenum \">typenum 1.17.0</a></li>\n                </ul>\n                <pre 
class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that 
is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. 
Grant of Patent License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. 
You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. 
You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  
The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2014 Paho Lurie-Gregg\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/seanmonstar/reqwest \">reqwest 0.11.27</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2016 Sean McArthur\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/SergioBenitez/yansi \">yansi 1.0.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                 
             Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work 
and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2017 Sergio Benitez\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/http \">http 0.2.12</a></li>\n                    <li><a href=\" https://github.com/hyperium/http \">http 
1.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; 
shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. 
You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. 
You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  
The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2017 http-rs authors\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rustls/tokio-rustls \">tokio-rustls 0.24.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2017 quininer kel\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-lang-nursery/pin-utils \">pin-utils 0.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">      
                        Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived 
from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2018 The pin-utils authors\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/RustCrypto/signatures/tree/master/ecdsa \">ecdsa 0.14.8</a></li>\n                    <li><a href=\" 
https://github.com/RustCrypto/signatures/tree/master/rfc6979 \">rfc6979 0.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is 
provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. 
You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. 
You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  
The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2018-2022 RustCrypto Developers\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/cryptocorrosion/cryptocorrosion \">ppv-lite86 0.2.20</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. 
Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2019 The CryptoCorrosion Contributors\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n   http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://gitlab.com/CreepySkeleton/proc-macro-error \">proc-macro-error-attr 1.0.4</a></li>\n                </ul>\n             
   <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   
form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. 
Grant of Patent License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. 
You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. 
You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  
The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2019-2020 CreepySkeleton &lt;creepy-skeleton@yandex.ru&gt;\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/strawlab/iana-time-zone \">iana-time-zone-haiku 0.1.2</a></li>\n                    <li><a href=\" https://github.com/strawlab/iana-time-zone \">iana-time-zone 0.1.61</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. 
Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright 2020 Andrew Straw\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/RazrFalcon/memmap2-rs \">memmap2 0.9.5</a></li>\n                </ul>\n                <pre class=\"license-text\">              
                Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the 
Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [2015] [Dan Burkert]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/gimli-rs/addr2line \">addr2line 0.24.2</a></li>\n                    <li><a href=\" https://github.com/tkaitchuck/ahash 
\">ahash 0.8.11</a></li>\n                    <li><a href=\" https://github.com/vorner/arc-swap \">arc-swap 1.7.1</a></li>\n                    <li><a href=\" https://github.com/bluss/arrayvec \">arrayvec 0.5.2</a></li>\n                    <li><a href=\" https://github.com/smol-rs/async-channel \">async-channel 1.9.0</a></li>\n                    <li><a href=\" https://github.com/Nullus157/async-compression \">async-compression 0.4.13</a></li>\n                    <li><a href=\" https://github.com/cuviper/autocfg \">autocfg 1.4.0</a></li>\n                    <li><a href=\" https://github.com/rust-lang/backtrace-rs \">backtrace 0.3.74</a></li>\n                    <li><a href=\" https://github.com/marshallpierce/rust-base64 \">base64 0.13.1</a></li>\n                    <li><a href=\" https://github.com/marshallpierce/rust-base64 \">base64 0.20.0</a></li>\n                    <li><a href=\" https://github.com/marshallpierce/rust-base64 \">base64 0.21.7</a></li>\n                    <li><a href=\" https://github.com/marshallpierce/rust-base64 \">base64 0.22.1</a></li>\n                    <li><a href=\" https://github.com/bitflags/bitflags \">bitflags 1.3.2</a></li>\n                    <li><a href=\" https://github.com/bitflags/bitflags \">bitflags 2.6.0</a></li>\n                    <li><a href=\" https://github.com/Nullus157/bs58-rs \">bs58 0.5.1</a></li>\n                    <li><a href=\" https://github.com/fitzgen/bumpalo \">bumpalo 3.16.0</a></li>\n                    <li><a href=\" https://github.com/vorner/bytes-utils \">bytes-utils 0.1.4</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/bzip2-rs \">bzip2-sys 0.1.11+1.0.8</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/bzip2-rs \">bzip2 0.4.4</a></li>\n                    <li><a href=\" https://github.com/japaric/cast.rs \">cast 0.3.0</a></li>\n                    <li><a href=\" https://github.com/Nullus157/cbor-diag-rs \">cbor-diag 0.1.12</a></li>\n  
                  <li><a href=\" https://github.com/rust-lang/cc-rs \">cc 1.1.28</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/cfg-if \">cfg-if 1.0.0</a></li>\n                    <li><a href=\" https://github.com/smol-rs/concurrent-queue \">concurrent-queue 2.5.0</a></li>\n                    <li><a href=\" https://github.com/servo/core-foundation-rs \">core-foundation-sys 0.8.7</a></li>\n                    <li><a href=\" https://github.com/servo/core-foundation-rs \">core-foundation 0.9.4</a></li>\n                    <li><a href=\" https://github.com/bheisler/criterion.rs \">criterion-plot 0.5.0</a></li>\n                    <li><a href=\" https://github.com/bheisler/criterion.rs \">criterion 0.5.1</a></li>\n                    <li><a href=\" https://github.com/crossbeam-rs/crossbeam \">crossbeam-channel 0.5.13</a></li>\n                    <li><a href=\" https://github.com/crossbeam-rs/crossbeam \">crossbeam-deque 0.8.5</a></li>\n                    <li><a href=\" https://github.com/crossbeam-rs/crossbeam \">crossbeam-epoch 0.9.18</a></li>\n                    <li><a href=\" https://github.com/crossbeam-rs/crossbeam \">crossbeam-utils 0.8.20</a></li>\n                    <li><a href=\" https://github.com/rayon-rs/either \">either 1.13.0</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/encoding_rs_io \">encoding_rs_io 0.1.7</a></li>\n                    <li><a href=\" https://github.com/cuviper/equivalent \">equivalent 1.0.1</a></li>\n                    <li><a href=\" https://github.com/lambda-fairy/rust-errno \">errno 0.3.9</a></li>\n                    <li><a href=\" https://github.com/smol-rs/event-listener \">event-listener 2.5.3</a></li>\n                    <li><a href=\" https://github.com/smol-rs/fastrand \">fastrand 1.9.0</a></li>\n                    <li><a href=\" https://github.com/smol-rs/fastrand \">fastrand 2.1.1</a></li>\n                    <li><a href=\" 
https://github.com/alexcrichton/filetime \">filetime 0.2.25</a></li>\n                    <li><a href=\" https://github.com/petgraph/fixedbitset \">fixedbitset 0.4.2</a></li>\n                    <li><a href=\" https://github.com/rust-lang/flate2-rs \">flate2 1.0.34</a></li>\n                    <li><a href=\" https://github.com/servo/rust-fnv \">fnv 1.0.7</a></li>\n                    <li><a href=\" https://github.com/servo/rust-url \">form_urlencoded 1.2.1</a></li>\n                    <li><a href=\" https://github.com/al8n/fs4-rs \">fs4 0.8.4</a></li>\n                    <li><a href=\" https://github.com/smol-rs/futures-lite \">futures-lite 1.13.0</a></li>\n                    <li><a href=\" https://github.com/async-rs/futures-timer \">futures-timer 3.0.3</a></li>\n                    <li><a href=\" https://github.com/gimli-rs/gimli \">gimli 0.31.1</a></li>\n                    <li><a href=\" https://github.com/rust-lang/glob \">glob 0.3.1</a></li>\n                    <li><a href=\" https://github.com/zkcrypto/group \">group 0.12.1</a></li>\n                    <li><a href=\" https://github.com/rust-lang/hashbrown \">hashbrown 0.12.3</a></li>\n                    <li><a href=\" https://github.com/rust-lang/hashbrown \">hashbrown 0.14.5</a></li>\n                    <li><a href=\" https://github.com/rust-lang/hashbrown \">hashbrown 0.15.0</a></li>\n                    <li><a href=\" https://github.com/withoutboats/heck \">heck 0.4.1</a></li>\n                    <li><a href=\" https://github.com/hermit-os/hermit-rs \">hermit-abi 0.3.9</a></li>\n                    <li><a href=\" https://github.com/hermit-os/hermit-rs \">hermit-abi 0.4.0</a></li>\n                    <li><a href=\" https://github.com/seanmonstar/httparse \">httparse 1.9.5</a></li>\n                    <li><a href=\" https://github.com/rustls/hyper-rustls \">hyper-rustls 0.24.2</a></li>\n                    <li><a href=\" https://github.com/hjr3/hyper-timeout \">hyper-timeout 0.4.1</a></li>\n     
               <li><a href=\" https://github.com/servo/rust-url/ \">idna 0.5.0</a></li>\n                    <li><a href=\" https://github.com/bluss/indexmap \">indexmap 1.9.3</a></li>\n                    <li><a href=\" https://github.com/bluss/indexmap \">indexmap 2.1.0</a></li>\n                    <li><a href=\" https://github.com/rust-itertools/itertools \">itertools 0.10.5</a></li>\n                    <li><a href=\" https://github.com/rust-itertools/itertools \">itertools 0.12.1</a></li>\n                    <li><a href=\" https://github.com/rust-itertools/itertools \">itertools 0.13.0</a></li>\n                    <li><a href=\" https://github.com/rust-lang/jobserver-rs \">jobserver 0.1.32</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen/tree/master/crates/js-sys \">js-sys 0.3.71</a></li>\n                    <li><a href=\" https://github.com/rust-lang-nursery/lazy-static.rs \">lazy_static 1.5.0</a></li>\n                    <li><a href=\" https://github.com/rust-lang/libm \">libm 0.2.8</a></li>\n                    <li><a href=\" https://github.com/sunfishcode/linux-raw-sys \">linux-raw-sys 0.4.14</a></li>\n                    <li><a href=\" https://github.com/Amanieu/parking_lot \">lock_api 0.4.12</a></li>\n                    <li><a href=\" https://github.com/rust-lang/log \">log 0.4.22</a></li>\n                    <li><a href=\" https://github.com/gnzlbg/match_cfg \">match_cfg 0.1.0</a></li>\n                    <li><a href=\" https://github.com/hyperium/mime \">mime 0.3.17</a></li>\n                    <li><a href=\" https://github.com/asomers/mockall \">mockall 0.11.4</a></li>\n                    <li><a href=\" https://github.com/asomers/mockall \">mockall_derive 0.11.4</a></li>\n                    <li><a href=\" https://github.com/havarnov/multimap \">multimap 0.8.3</a></li>\n                    <li><a href=\" https://github.com/rust-num/num-bigint \">num-bigint 0.4.6</a></li>\n                    <li><a 
href=\" https://github.com/rust-num/num-integer \">num-integer 0.1.46</a></li>\n                    <li><a href=\" https://github.com/rust-num/num-rational \">num-rational 0.4.2</a></li>\n                    <li><a href=\" https://github.com/rust-num/num-traits \">num-traits 0.2.19</a></li>\n                    <li><a href=\" https://github.com/seanmonstar/num_cpus \">num_cpus 1.16.0</a></li>\n                    <li><a href=\" https://github.com/gimli-rs/object \">object 0.36.5</a></li>\n                    <li><a href=\" https://github.com/matklad/once_cell \">once_cell 1.20.2</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/openssl-probe \">openssl-probe 0.1.5</a></li>\n                    <li><a href=\" https://github.com/smol-rs/parking \">parking 2.2.1</a></li>\n                    <li><a href=\" https://github.com/Amanieu/parking_lot \">parking_lot 0.12.3</a></li>\n                    <li><a href=\" https://github.com/Amanieu/parking_lot \">parking_lot_core 0.9.10</a></li>\n                    <li><a href=\" https://github.com/servo/rust-url/ \">percent-encoding 2.3.1</a></li>\n                    <li><a href=\" https://github.com/petgraph/petgraph \">petgraph 0.6.5</a></li>\n                    <li><a href=\" https://github.com/rust-lang/pkg-config-rs \">pkg-config 0.3.31</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet 0.33.0</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet_base 0.33.0</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet_datalink 0.33.0</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet_macros 0.33.0</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet_macros_support 0.33.0</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet_packet 0.33.0</a></li>\n                    <li><a href=\" 
https://github.com/libpnet/libpnet \">pnet_sys 0.33.0</a></li>\n                    <li><a href=\" https://github.com/libpnet/libpnet \">pnet_transport 0.33.0</a></li>\n                    <li><a href=\" https://github.com/proptest-rs/proptest \">proptest 1.5.0</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/prost \">prost-build 0.11.9</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/prost \">prost-derive 0.11.9</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/prost \">prost-types 0.11.9</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/prost \">prost 0.11.9</a></li>\n                    <li><a href=\" https://github.com/rayon-rs/rayon \">rayon-core 1.12.1</a></li>\n                    <li><a href=\" https://github.com/rayon-rs/rayon \">rayon 1.10.0</a></li>\n                    <li><a href=\" https://github.com/rust-lang/regex/tree/master/regex-automata \">regex-automata 0.4.8</a></li>\n                    <li><a href=\" https://github.com/rust-lang/regex/tree/master/regex-lite \">regex-lite 0.1.6</a></li>\n                    <li><a href=\" https://github.com/rust-lang/regex \">regex-syntax 0.6.29</a></li>\n                    <li><a href=\" https://github.com/rust-lang/regex/tree/master/regex-syntax \">regex-syntax 0.8.5</a></li>\n                    <li><a href=\" https://github.com/rust-lang/regex \">regex 1.11.0</a></li>\n                    <li><a href=\" https://github.com/RazrFalcon/roxmltree \">roxmltree 0.14.1</a></li>\n                    <li><a href=\" https://github.com/rust-lang/rustc-demangle \">rustc-demangle 0.1.24</a></li>\n                    <li><a href=\" https://github.com/rust-lang-nursery/rustc-hash \">rustc-hash 1.1.0</a></li>\n                    <li><a href=\" https://github.com/djc/rustc-version-rs \">rustc_version 0.4.1</a></li>\n                    <li><a href=\" https://github.com/bytecodealliance/rustix \">rustix 0.38.37</a></li>\n        
            <li><a href=\" https://github.com/ctz/rustls-native-certs \">rustls-native-certs 0.6.3</a></li>\n                    <li><a href=\" https://github.com/rustls/pemfile \">rustls-pemfile 1.0.4</a></li>\n                    <li><a href=\" https://github.com/rustls/rustls \">rustls 0.21.12</a></li>\n                    <li><a href=\" https://github.com/altsysrq/rusty-fork \">rusty-fork 0.3.0</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/scoped-tls \">scoped-tls 1.0.1</a></li>\n                    <li><a href=\" https://github.com/bluss/scopeguard \">scopeguard 1.2.0</a></li>\n                    <li><a href=\" https://github.com/rustls/sct.rs \">sct 0.7.1</a></li>\n                    <li><a href=\" https://github.com/kornelski/rust-security-framework \">security-framework-sys 2.12.0</a></li>\n                    <li><a href=\" https://github.com/kornelski/rust-security-framework \">security-framework 2.11.1</a></li>\n                    <li><a href=\" https://github.com/jonasbb/serde_with/ \">serde_with 3.11.0</a></li>\n                    <li><a href=\" https://github.com/jonasbb/serde_with/ \">serde_with_macros 3.11.0</a></li>\n                    <li><a href=\" https://github.com/vorner/signal-hook \">signal-hook-registry 1.4.2</a></li>\n                    <li><a href=\" https://github.com/servo/rust-smallvec \">smallvec 1.13.2</a></li>\n                    <li><a href=\" https://github.com/rust-lang/socket2 \">socket2 0.5.7</a></li>\n                    <li><a href=\" https://github.com/storyyeller/stable_deref_trait \">stable_deref_trait 1.2.0</a></li>\n                    <li><a href=\" https://github.com/dtolnay/syn \">syn 1.0.109</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/tar-rs \">tar 0.4.42</a></li>\n                    <li><a href=\" https://github.com/Stebalien/tempfile \">tempfile 3.13.0</a></li>\n                    <li><a href=\" https://github.com/tov/thousands-rs \">thousands 
0.2.0</a></li>\n                    <li><a href=\" https://github.com/Amanieu/thread_local-rs \">thread_local 1.1.8</a></li>\n                    <li><a href=\" https://github.com/bheisler/TinyTemplate \">tinytemplate 1.2.1</a></li>\n                    <li><a href=\" https://github.com/snapview/tungstenite-rs \">tungstenite 0.21.0</a></li>\n                    <li><a href=\" https://github.com/seanmonstar/unicase \">unicase 2.7.0</a></li>\n                    <li><a href=\" https://github.com/servo/unicode-bidi \">unicode-bidi 0.3.17</a></li>\n                    <li><a href=\" https://github.com/unicode-rs/unicode-normalization \">unicode-normalization 0.1.24</a></li>\n                    <li><a href=\" https://github.com/unicode-rs/unicode-width \">unicode-width 0.1.14</a></li>\n                    <li><a href=\" https://github.com/servo/rust-url \">url 2.5.2</a></li>\n                    <li><a href=\" https://github.com/uuid-rs/uuid \">uuid 1.10.0</a></li>\n                    <li><a href=\" https://github.com/SergioBenitez/version_check \">version_check 0.9.5</a></li>\n                    <li><a href=\" https://github.com/alexcrichton/wait-timeout \">wait-timeout 0.2.0</a></li>\n                    <li><a href=\" https://github.com/smol-rs/waker-fn \">waker-fn 1.2.0</a></li>\n                    <li><a href=\" https://github.com/bytecodealliance/wasi \">wasi 0.11.0+wasi-snapshot-preview1</a></li>\n                    <li><a href=\" https://github.com/bytecodealliance/wasi \">wasi 0.9.0+wasi-snapshot-preview1</a></li>\n                    <li><a href=\" https://github.com/wasix-org/wasix-abi-rust \">wasix 0.12.21</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen/tree/master/crates/backend \">wasm-bindgen-backend 0.2.94</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen/tree/master/crates/futures \">wasm-bindgen-futures 0.4.44</a></li>\n                    <li><a href=\" 
https://github.com/rustwasm/wasm-bindgen/tree/master/crates/macro-support \">wasm-bindgen-macro-support 0.2.94</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen/tree/master/crates/macro \">wasm-bindgen-macro 0.2.94</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen/tree/master/crates/shared \">wasm-bindgen-shared 0.2.94</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen \">wasm-bindgen 0.2.94</a></li>\n                    <li><a href=\" https://github.com/rustwasm/wasm-bindgen/tree/master/crates/web-sys \">web-sys 0.3.71</a></li>\n                    <li><a href=\" https://github.com/LukeMathWalker/wiremock-rs \">wiremock 0.5.22</a></li>\n                    <li><a href=\" https://github.com/Stebalien/xattr \">xattr 1.3.1</a></li>\n                    <li><a href=\" https://github.com/RazrFalcon/xmlparser \">xmlparser 0.13.6</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/zkcrypto/ff \">ff 0.12.1</a></li>\n                </ul>\n                <pre class=\"license-text\">             
                 Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the 
Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/contain-rs/bit-set \">bit-set 0.5.3</a></li>\n                    <li><a href=\" 
https://github.com/contain-rs/bit-vec \">bit-vec 0.6.3</a></li>\n                    <li><a href=\" https://github.com/marcianx/downcast-rs \">downcast-rs 1.2.1</a></li>\n                    <li><a href=\" https://github.com/contain-rs/linked-hash-map \">linked-hash-map 0.5.6</a></li>\n                    <li><a href=\" https://github.com/Alexhuszagh/minimal-lexical \">minimal-lexical 0.2.1</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/RustCrypto/block-ciphers \">aes 0.8.4</a></li>\n                    <li><a href=\" 
https://github.com/RustCrypto/formats/tree/master/base16ct \">base16ct 0.1.1</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/formats/tree/master/base64ct \">base64ct 1.6.0</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/utils \">block-buffer 0.10.4</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/traits \">cipher 0.4.4</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/formats/tree/master/const-oid \">const-oid 0.9.6</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/utils \">cpufeatures 0.2.14</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/crypto-bigint \">crypto-bigint 0.4.9</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/crypto-bigint \">crypto-bigint 0.5.5</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/traits \">crypto-common 0.1.6</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/formats/tree/master/der \">der 0.6.1</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/traits \">digest 0.10.7</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/traits/tree/master/elliptic-curve \">elliptic-curve 0.12.3</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/MACs \">hmac 0.12.1</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/utils \">inout 0.1.3</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/hashes \">md-5 0.10.6</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/elliptic-curves/tree/master/p256 \">p256 0.11.1</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/traits/tree/master/password-hash \">password-hash 0.4.2</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/password-hashes/tree/master/pbkdf2 \">pbkdf2 0.11.0</a></li>\n                   
 <li><a href=\" https://github.com/RustCrypto/formats/tree/master/pkcs8 \">pkcs8 0.9.0</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/formats/tree/master/sec1 \">sec1 0.3.0</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/hashes \">sha1 0.10.6</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/hashes \">sha2 0.10.8</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/traits/tree/master/signature \">signature 1.6.4</a></li>\n                    <li><a href=\" https://github.com/RustCrypto/formats/tree/master/spki \">spki 0.6.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n   http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-random/rand \">rand 0.8.5</a></li>\n                </ul>\n                <pre class=\"license-text\">      
                        Apache License\n                        Version 2.0, January 2004\n                     https://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived 
from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-random/rand \">rand_core 0.6.4</a></li>\n                    <li><a href=\" https://github.com/rust-random/rand \">rand_distr 0.4.3</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     https://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-random/getrandom \">getrandom 0.1.16</a></li>\n                    <li><a href=\" https://github.com/rust-random/getrandom \">getrandom 0.2.15</a></li>\n                    <li><a href=\" https://github.com/rust-random/rand \">rand 0.7.3</a></li>\n                    <li><a href=\" https://github.com/rust-random/rand \">rand_chacha 0.2.2</a></li>\n                    <li><a href=\" https://github.com/rust-random/rand \">rand_chacha 0.3.1</a></li>\n                    <li><a href=\" https://github.com/rust-random/rand \">rand_core 0.5.1</a></li>\n                    <li><a href=\" https://github.com/rust-random/rand \">rand_hc 0.2.0</a></li>\n                    <li><a href=\" 
https://github.com/rust-random/rngs \">rand_xorshift 0.3.0</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     https://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the 
Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. 
You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. 
You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  
The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttps://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-lang/cargo \">home 0.5.9</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\n                        Version 2.0, January 2004\n                     https://www.apache.org/licenses/LICENSE-2.0\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. 
For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttps://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/zakarumych/allocator-api2 \">allocator-api2 0.2.18</a></li>\n                </ul>\n                <pre 
class=\"license-text\">                              Apache License\r\n                        Version 2.0, January 2004\r\n                     http://www.apache.org/licenses/\r\n\r\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\r\n\r\n1. Definitions.\r\n\r\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\r\n   and distribution as defined by Sections 1 through 9 of this document.\r\n\r\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\r\n   the copyright owner that is granting the License.\r\n\r\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\r\n   other entities that control, are controlled by, or are under common\r\n   control with that entity. For the purposes of this definition,\r\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\r\n   direction or management of such entity, whether by contract or\r\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\r\n   outstanding shares, or (iii) beneficial ownership of such entity.\r\n\r\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\r\n   exercising permissions granted by this License.\r\n\r\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\r\n   including but not limited to software source code, documentation\r\n   source, and configuration files.\r\n\r\n   &quot;Object&quot; form shall mean any form resulting from mechanical\r\n   transformation or translation of a Source form, including but\r\n   not limited to compiled object code, generated documentation,\r\n   and conversions to other media types.\r\n\r\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\r\n   Object form, made available under the License, as indicated by a\r\n   copyright notice that is included in or attached to the work\r\n   (an example is provided in the Appendix below).\r\n\r\n   
&quot;Derivative Works&quot; shall mean any work, whether in Source or Object\r\n   form, that is based on (or derived from) the Work and for which the\r\n   editorial revisions, annotations, elaborations, or other modifications\r\n   represent, as a whole, an original work of authorship. For the purposes\r\n   of this License, Derivative Works shall not include works that remain\r\n   separable from, or merely link (or bind by name) to the interfaces of,\r\n   the Work and Derivative Works thereof.\r\n\r\n   &quot;Contribution&quot; shall mean any work of authorship, including\r\n   the original version of the Work and any modifications or additions\r\n   to that Work or Derivative Works thereof, that is intentionally\r\n   submitted to Licensor for inclusion in the Work by the copyright owner\r\n   or by an individual or Legal Entity authorized to submit on behalf of\r\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\r\n   means any form of electronic, verbal, or written communication sent\r\n   to the Licensor or its representatives, including but not limited to\r\n   communication on electronic mailing lists, source code control systems,\r\n   and issue tracking systems that are managed by, or on behalf of, the\r\n   Licensor for the purpose of discussing and improving the Work, but\r\n   excluding communication that is conspicuously marked or otherwise\r\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\r\n\r\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\r\n   on behalf of whom a Contribution has been received by Licensor and\r\n   subsequently incorporated within the Work.\r\n\r\n2. Grant of Copyright License. 
Subject to the terms and conditions of\r\n   this License, each Contributor hereby grants to You a perpetual,\r\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\r\n   copyright license to reproduce, prepare Derivative Works of,\r\n   publicly display, publicly perform, sublicense, and distribute the\r\n   Work and such Derivative Works in Source or Object form.\r\n\r\n3. Grant of Patent License. Subject to the terms and conditions of\r\n   this License, each Contributor hereby grants to You a perpetual,\r\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\r\n   (except as stated in this section) patent license to make, have made,\r\n   use, offer to sell, sell, import, and otherwise transfer the Work,\r\n   where such license applies only to those patent claims licensable\r\n   by such Contributor that are necessarily infringed by their\r\n   Contribution(s) alone or by combination of their Contribution(s)\r\n   with the Work to which such Contribution(s) was submitted. If You\r\n   institute patent litigation against any entity (including a\r\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\r\n   or a Contribution incorporated within the Work constitutes direct\r\n   or contributory patent infringement, then any patent licenses\r\n   granted to You under this License for that Work shall terminate\r\n   as of the date such litigation is filed.\r\n\r\n4. Redistribution. 
You may reproduce and distribute copies of the\r\n   Work or Derivative Works thereof in any medium, with or without\r\n   modifications, and in Source or Object form, provided that You\r\n   meet the following conditions:\r\n\r\n   (a) You must give any other recipients of the Work or\r\n       Derivative Works a copy of this License; and\r\n\r\n   (b) You must cause any modified files to carry prominent notices\r\n       stating that You changed the files; and\r\n\r\n   (c) You must retain, in the Source form of any Derivative Works\r\n       that You distribute, all copyright, patent, trademark, and\r\n       attribution notices from the Source form of the Work,\r\n       excluding those notices that do not pertain to any part of\r\n       the Derivative Works; and\r\n\r\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\r\n       distribution, then any Derivative Works that You distribute must\r\n       include a readable copy of the attribution notices contained\r\n       within such NOTICE file, excluding those notices that do not\r\n       pertain to any part of the Derivative Works, in at least one\r\n       of the following places: within a NOTICE text file distributed\r\n       as part of the Derivative Works; within the Source form or\r\n       documentation, if provided along with the Derivative Works; or,\r\n       within a display generated by the Derivative Works, if and\r\n       wherever such third-party notices normally appear. The contents\r\n       of the NOTICE file are for informational purposes only and\r\n       do not modify the License. 
You may add Your own attribution\r\n       notices within Derivative Works that You distribute, alongside\r\n       or as an addendum to the NOTICE text from the Work, provided\r\n       that such additional attribution notices cannot be construed\r\n       as modifying the License.\r\n\r\n   You may add Your own copyright statement to Your modifications and\r\n   may provide additional or different license terms and conditions\r\n   for use, reproduction, or distribution of Your modifications, or\r\n   for any such Derivative Works as a whole, provided Your use,\r\n   reproduction, and distribution of the Work otherwise complies with\r\n   the conditions stated in this License.\r\n\r\n5. Submission of Contributions. Unless You explicitly state otherwise,\r\n   any Contribution intentionally submitted for inclusion in the Work\r\n   by You to the Licensor shall be under the terms and conditions of\r\n   this License, without any additional terms or conditions.\r\n   Notwithstanding the above, nothing herein shall supersede or modify\r\n   the terms of any separate license agreement you may have executed\r\n   with Licensor regarding such Contributions.\r\n\r\n6. Trademarks. This License does not grant permission to use the trade\r\n   names, trademarks, service marks, or product names of the Licensor,\r\n   except as required for reasonable and customary use in describing the\r\n   origin of the Work and reproducing the content of the NOTICE file.\r\n\r\n7. Disclaimer of Warranty. Unless required by applicable law or\r\n   agreed to in writing, Licensor provides the Work (and each\r\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\r\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\r\n   implied, including, without limitation, any warranties or conditions\r\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\r\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\r\n   appropriateness of using or redistributing the Work and assume any\r\n   risks associated with Your exercise of permissions under this License.\r\n\r\n8. Limitation of Liability. In no event and under no legal theory,\r\n   whether in tort (including negligence), contract, or otherwise,\r\n   unless required by applicable law (such as deliberate and grossly\r\n   negligent acts) or agreed to in writing, shall any Contributor be\r\n   liable to You for damages, including any direct, indirect, special,\r\n   incidental, or consequential damages of any character arising as a\r\n   result of this License or out of the use or inability to use the\r\n   Work (including but not limited to damages for loss of goodwill,\r\n   work stoppage, computer failure or malfunction, or any and all\r\n   other commercial damages or losses), even if such Contributor\r\n   has been advised of the possibility of such damages.\r\n\r\n9. Accepting Warranty or Additional Liability. While redistributing\r\n   the Work or Derivative Works thereof, You may choose to offer,\r\n   and charge a fee for, acceptance of support, warranty, indemnity,\r\n   or other liability obligations and/or rights consistent with this\r\n   License. 
However, in accepting such obligations, You may act only\r\n   on Your own behalf and on Your sole responsibility, not on behalf\r\n   of any other Contributor, and only if You agree to indemnify,\r\n   defend, and hold each Contributor harmless for any liability\r\n   incurred by, or claims asserted against, such Contributor by reason\r\n   of your accepting any such warranty or additional liability.\r\n\r\nEND OF TERMS AND CONDITIONS\r\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://gitlab.com/CreepySkeleton/proc-macro-error \">proc-macro-error 1.0.4</a></li>\n                </ul>\n                <pre class=\"license-text\">                              Apache License\r\n                        Version 2.0, January 2004\r\n                     http://www.apache.org/licenses/\r\n\r\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\r\n\r\n1. Definitions.\r\n\r\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\r\n   and distribution as defined by Sections 1 through 9 of this document.\r\n\r\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\r\n   the copyright owner that is granting the License.\r\n\r\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\r\n   other entities that control, are controlled by, or are under common\r\n   control with that entity. 
For the purposes of this definition,\r\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\r\n   direction or management of such entity, whether by contract or\r\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\r\n   outstanding shares, or (iii) beneficial ownership of such entity.\r\n\r\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\r\n   exercising permissions granted by this License.\r\n\r\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\r\n   including but not limited to software source code, documentation\r\n   source, and configuration files.\r\n\r\n   &quot;Object&quot; form shall mean any form resulting from mechanical\r\n   transformation or translation of a Source form, including but\r\n   not limited to compiled object code, generated documentation,\r\n   and conversions to other media types.\r\n\r\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\r\n   Object form, made available under the License, as indicated by a\r\n   copyright notice that is included in or attached to the work\r\n   (an example is provided in the Appendix below).\r\n\r\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\r\n   form, that is based on (or derived from) the Work and for which the\r\n   editorial revisions, annotations, elaborations, or other modifications\r\n   represent, as a whole, an original work of authorship. 
For the purposes\r\n   of this License, Derivative Works shall not include works that remain\r\n   separable from, or merely link (or bind by name) to the interfaces of,\r\n   the Work and Derivative Works thereof.\r\n\r\n   &quot;Contribution&quot; shall mean any work of authorship, including\r\n   the original version of the Work and any modifications or additions\r\n   to that Work or Derivative Works thereof, that is intentionally\r\n   submitted to Licensor for inclusion in the Work by the copyright owner\r\n   or by an individual or Legal Entity authorized to submit on behalf of\r\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\r\n   means any form of electronic, verbal, or written communication sent\r\n   to the Licensor or its representatives, including but not limited to\r\n   communication on electronic mailing lists, source code control systems,\r\n   and issue tracking systems that are managed by, or on behalf of, the\r\n   Licensor for the purpose of discussing and improving the Work, but\r\n   excluding communication that is conspicuously marked or otherwise\r\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\r\n\r\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\r\n   on behalf of whom a Contribution has been received by Licensor and\r\n   subsequently incorporated within the Work.\r\n\r\n2. Grant of Copyright License. Subject to the terms and conditions of\r\n   this License, each Contributor hereby grants to You a perpetual,\r\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\r\n   copyright license to reproduce, prepare Derivative Works of,\r\n   publicly display, publicly perform, sublicense, and distribute the\r\n   Work and such Derivative Works in Source or Object form.\r\n\r\n3. Grant of Patent License. 
Subject to the terms and conditions of\r\n   this License, each Contributor hereby grants to You a perpetual,\r\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\r\n   (except as stated in this section) patent license to make, have made,\r\n   use, offer to sell, sell, import, and otherwise transfer the Work,\r\n   where such license applies only to those patent claims licensable\r\n   by such Contributor that are necessarily infringed by their\r\n   Contribution(s) alone or by combination of their Contribution(s)\r\n   with the Work to which such Contribution(s) was submitted. If You\r\n   institute patent litigation against any entity (including a\r\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\r\n   or a Contribution incorporated within the Work constitutes direct\r\n   or contributory patent infringement, then any patent licenses\r\n   granted to You under this License for that Work shall terminate\r\n   as of the date such litigation is filed.\r\n\r\n4. Redistribution. 
You may reproduce and distribute copies of the\r\n   Work or Derivative Works thereof in any medium, with or without\r\n   modifications, and in Source or Object form, provided that You\r\n   meet the following conditions:\r\n\r\n   (a) You must give any other recipients of the Work or\r\n       Derivative Works a copy of this License; and\r\n\r\n   (b) You must cause any modified files to carry prominent notices\r\n       stating that You changed the files; and\r\n\r\n   (c) You must retain, in the Source form of any Derivative Works\r\n       that You distribute, all copyright, patent, trademark, and\r\n       attribution notices from the Source form of the Work,\r\n       excluding those notices that do not pertain to any part of\r\n       the Derivative Works; and\r\n\r\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\r\n       distribution, then any Derivative Works that You distribute must\r\n       include a readable copy of the attribution notices contained\r\n       within such NOTICE file, excluding those notices that do not\r\n       pertain to any part of the Derivative Works, in at least one\r\n       of the following places: within a NOTICE text file distributed\r\n       as part of the Derivative Works; within the Source form or\r\n       documentation, if provided along with the Derivative Works; or,\r\n       within a display generated by the Derivative Works, if and\r\n       wherever such third-party notices normally appear. The contents\r\n       of the NOTICE file are for informational purposes only and\r\n       do not modify the License. 
You may add Your own attribution\r\n       notices within Derivative Works that You distribute, alongside\r\n       or as an addendum to the NOTICE text from the Work, provided\r\n       that such additional attribution notices cannot be construed\r\n       as modifying the License.\r\n\r\n   You may add Your own copyright statement to Your modifications and\r\n   may provide additional or different license terms and conditions\r\n   for use, reproduction, or distribution of Your modifications, or\r\n   for any such Derivative Works as a whole, provided Your use,\r\n   reproduction, and distribution of the Work otherwise complies with\r\n   the conditions stated in this License.\r\n\r\n5. Submission of Contributions. Unless You explicitly state otherwise,\r\n   any Contribution intentionally submitted for inclusion in the Work\r\n   by You to the Licensor shall be under the terms and conditions of\r\n   this License, without any additional terms or conditions.\r\n   Notwithstanding the above, nothing herein shall supersede or modify\r\n   the terms of any separate license agreement you may have executed\r\n   with Licensor regarding such Contributions.\r\n\r\n6. Trademarks. This License does not grant permission to use the trade\r\n   names, trademarks, service marks, or product names of the Licensor,\r\n   except as required for reasonable and customary use in describing the\r\n   origin of the Work and reproducing the content of the NOTICE file.\r\n\r\n7. Disclaimer of Warranty. Unless required by applicable law or\r\n   agreed to in writing, Licensor provides the Work (and each\r\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\r\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\r\n   implied, including, without limitation, any warranties or conditions\r\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\r\n   PARTICULAR PURPOSE. 
You are solely responsible for determining the\r\n   appropriateness of using or redistributing the Work and assume any\r\n   risks associated with Your exercise of permissions under this License.\r\n\r\n8. Limitation of Liability. In no event and under no legal theory,\r\n   whether in tort (including negligence), contract, or otherwise,\r\n   unless required by applicable law (such as deliberate and grossly\r\n   negligent acts) or agreed to in writing, shall any Contributor be\r\n   liable to You for damages, including any direct, indirect, special,\r\n   incidental, or consequential damages of any character arising as a\r\n   result of this License or out of the use or inability to use the\r\n   Work (including but not limited to damages for loss of goodwill,\r\n   work stoppage, computer failure or malfunction, or any and all\r\n   other commercial damages or losses), even if such Contributor\r\n   has been advised of the possibility of such damages.\r\n\r\n9. Accepting Warranty or Additional Liability. While redistributing\r\n   the Work or Derivative Works thereof, You may choose to offer,\r\n   and charge a fee for, acceptance of support, warranty, indemnity,\r\n   or other liability obligations and/or rights consistent with this\r\n   License. However, in accepting such obligations, You may act only\r\n   on Your own behalf and on Your sole responsibility, not on behalf\r\n   of any other Contributor, and only if You agree to indemnify,\r\n   defend, and hold each Contributor harmless for any liability\r\n   incurred by, or claims asserted against, such Contributor by reason\r\n   of your accepting any such warranty or additional liability.\r\n\r\nEND OF TERMS AND CONDITIONS\r\n\r\nAPPENDIX: How to apply the Apache License to your work.\r\n\r\n   To apply the Apache License to your work, attach the following\r\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\r\n   replaced with your own identifying information. 
(Don&#x27;t include\r\n   the brackets!)  The text should be enclosed in the appropriate\r\n   comment syntax for the file format. We also recommend that a\r\n   file or class name and description of purpose be included on the\r\n   same &quot;printed page&quot; as the copyright notice for easier\r\n   identification within third-party archives.\r\n\r\nCopyright 2019-2020 CreepySkeleton &lt;creepy-skeleton@yandex.ru&gt;\r\n\r\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\r\nyou may not use this file except in compliance with the License.\r\nYou may obtain a copy of the License at\r\n\r\n    http://www.apache.org/licenses/LICENSE-2.0\r\n\r\nUnless required by applicable law or agreed to in writing, software\r\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\r\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\nSee the License for the specific language governing permissions and\r\nlimitations under the License.\r\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/zesterer/flume \">flume 0.11.0</a></li>\n                </ul>\n                <pre class=\"license-text\">   Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tikv/async-speed-limit \">async-speed-limit 0.4.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. 
Definitions.\n\n      &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      &quot;control&quot; means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      &quot;Source&quot; form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      &quot;Object&quot; form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      &quot;Work&quot; shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n      form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, 
an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      &quot;Contribution&quot; shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n      &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets &quot;{}&quot;\n      replaced with your own identifying information. 
(Don&#x27;t include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same &quot;printed page&quot; as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright {}\n\n   Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/pyfisch/httpdate \">httpdate 1.0.3</a></li>\n                </ul>\n                <pre class=\"license-text\">Apache License\nVersion 2.0, January 2004\nhttp://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n&quot;License&quot; shall mean the terms and conditions for use, reproduction,\nand distribution as defined by Sections 1 through 9 of this document.\n\n&quot;Licensor&quot; shall mean the copyright owner or entity authorized by\nthe copyright owner that is granting the License.\n\n&quot;Legal Entity&quot; shall mean the union of the acting entity and all\nother entities that control, are controlled by, or are under common\ncontrol with that entity. 
For the purposes of this definition,\n&quot;control&quot; means (i) the power, direct or indirect, to cause the\ndirection or management of such entity, whether by contract or\notherwise, or (ii) ownership of fifty percent (50%) or more of the\noutstanding shares, or (iii) beneficial ownership of such entity.\n\n&quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\nexercising permissions granted by this License.\n\n&quot;Source&quot; form shall mean the preferred form for making modifications,\nincluding but not limited to software source code, documentation\nsource, and configuration files.\n\n&quot;Object&quot; form shall mean any form resulting from mechanical\ntransformation or translation of a Source form, including but\nnot limited to compiled object code, generated documentation,\nand conversions to other media types.\n\n&quot;Work&quot; shall mean the work of authorship, whether in Source or\nObject form, made available under the License, as indicated by a\ncopyright notice that is included in or attached to the work\n(an example is provided in the Appendix below).\n\n&quot;Derivative Works&quot; shall mean any work, whether in Source or Object\nform, that is based on (or derived from) the Work and for which the\neditorial revisions, annotations, elaborations, or other modifications\nrepresent, as a whole, an original work of authorship. For the purposes\nof this License, Derivative Works shall not include works that remain\nseparable from, or merely link (or bind by name) to the interfaces of,\nthe Work and Derivative Works thereof.\n\n&quot;Contribution&quot; shall mean any work of authorship, including\nthe original version of the Work and any modifications or additions\nto that Work or Derivative Works thereof, that is intentionally\nsubmitted to Licensor for inclusion in the Work by the copyright owner\nor by an individual or Legal Entity authorized to submit on behalf of\nthe copyright owner. 
For the purposes of this definition, &quot;submitted&quot;\nmeans any form of electronic, verbal, or written communication sent\nto the Licensor or its representatives, including but not limited to\ncommunication on electronic mailing lists, source code control systems,\nand issue tracking systems that are managed by, or on behalf of, the\nLicensor for the purpose of discussing and improving the Work, but\nexcluding communication that is conspicuously marked or otherwise\ndesignated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n&quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\non behalf of whom a Contribution has been received by Licensor and\nsubsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\nthis License, each Contributor hereby grants to You a perpetual,\nworldwide, non-exclusive, no-charge, royalty-free, irrevocable\ncopyright license to reproduce, prepare Derivative Works of,\npublicly display, publicly perform, sublicense, and distribute the\nWork and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. Subject to the terms and conditions of\nthis License, each Contributor hereby grants to You a perpetual,\nworldwide, non-exclusive, no-charge, royalty-free, irrevocable\n(except as stated in this section) patent license to make, have made,\nuse, offer to sell, sell, import, and otherwise transfer the Work,\nwhere such license applies only to those patent claims licensable\nby such Contributor that are necessarily infringed by their\nContribution(s) alone or by combination of their Contribution(s)\nwith the Work to which such Contribution(s) was submitted. 
If You\ninstitute patent litigation against any entity (including a\ncross-claim or counterclaim in a lawsuit) alleging that the Work\nor a Contribution incorporated within the Work constitutes direct\nor contributory patent infringement, then any patent licenses\ngranted to You under this License for that Work shall terminate\nas of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\nWork or Derivative Works thereof in any medium, with or without\nmodifications, and in Source or Object form, provided that You\nmeet the following conditions:\n\n(a) You must give any other recipients of the Work or\nDerivative Works a copy of this License; and\n\n(b) You must cause any modified files to carry prominent notices\nstating that You changed the files; and\n\n(c) You must retain, in the Source form of any Derivative Works\nthat You distribute, all copyright, patent, trademark, and\nattribution notices from the Source form of the Work,\nexcluding those notices that do not pertain to any part of\nthe Derivative Works; and\n\n(d) If the Work includes a &quot;NOTICE&quot; text file as part of its\ndistribution, then any Derivative Works that You distribute must\ninclude a readable copy of the attribution notices contained\nwithin such NOTICE file, excluding those notices that do not\npertain to any part of the Derivative Works, in at least one\nof the following places: within a NOTICE text file distributed\nas part of the Derivative Works; within the Source form or\ndocumentation, if provided along with the Derivative Works; or,\nwithin a display generated by the Derivative Works, if and\nwherever such third-party notices normally appear. The contents\nof the NOTICE file are for informational purposes only and\ndo not modify the License. 
You may add Your own attribution\nnotices within Derivative Works that You distribute, alongside\nor as an addendum to the NOTICE text from the Work, provided\nthat such additional attribution notices cannot be construed\nas modifying the License.\n\nYou may add Your own copyright statement to Your modifications and\nmay provide additional or different license terms and conditions\nfor use, reproduction, or distribution of Your modifications, or\nfor any such Derivative Works as a whole, provided Your use,\nreproduction, and distribution of the Work otherwise complies with\nthe conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\nany Contribution intentionally submitted for inclusion in the Work\nby You to the Licensor shall be under the terms and conditions of\nthis License, without any additional terms or conditions.\nNotwithstanding the above, nothing herein shall supersede or modify\nthe terms of any separate license agreement you may have executed\nwith Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\nnames, trademarks, service marks, or product names of the Licensor,\nexcept as required for reasonable and customary use in describing the\norigin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. Unless required by applicable law or\nagreed to in writing, Licensor provides the Work (and each\nContributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\nimplied, including, without limitation, any warranties or conditions\nof TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\nPARTICULAR PURPOSE. You are solely responsible for determining the\nappropriateness of using or redistributing the Work and assume any\nrisks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. 
In no event and under no legal theory,\nwhether in tort (including negligence), contract, or otherwise,\nunless required by applicable law (such as deliberate and grossly\nnegligent acts) or agreed to in writing, shall any Contributor be\nliable to You for damages, including any direct, indirect, special,\nincidental, or consequential damages of any character arising as a\nresult of this License or out of the use or inability to use the\nWork (including but not limited to damages for loss of goodwill,\nwork stoppage, computer failure or malfunction, or any and all\nother commercial damages or losses), even if such Contributor\nhas been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\nthe Work or Derivative Works thereof, You may choose to offer,\nand charge a fee for, acceptance of support, warranty, indemnity,\nor other liability obligations and/or rights consistent with this\nLicense. However, in accepting such obligations, You may act only\non Your own behalf and on Your sole responsibility, not on behalf\nof any other Contributor, and only if You agree to indemnify,\ndefend, and hold each Contributor harmless for any liability\nincurred by, or claims asserted against, such Contributor by reason\nof your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\nTo apply the Apache License to your work, attach the following\nboilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\nreplaced with your own identifying information. (Don&#x27;t include\nthe brackets!)  The text should be enclosed in the appropriate\ncomment syntax for the file format. 
We also recommend that a\nfile or class name and description of purpose be included on the\nsame &quot;printed page&quot; as the copyright notice for easier\nidentification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/nical/android_system_properties \">android_system_properties 0.1.5</a></li>\n                    <li><a href=\" https://github.com/zrzka/anes-rs \">anes 0.1.6</a></li>\n                    <li><a href=\" https://github.com/zowens/crc32c \">crc32c 0.6.8</a></li>\n                    <li><a href=\" https://github.com/starkat99/half-rs \">half 2.4.1</a></li>\n                    <li><a href=\" https://github.com/veddan/rust-htmlescape \">htmlescape 0.3.1</a></li>\n                    <li><a href=\" https://gitlab.com/kornelski/http-serde \">http-serde 1.1.3</a></li>\n                    <li><a href=\" https://gitlab.com/kornelski/http-serde \">http-serde 2.1.1</a></li>\n                    <li><a href=\" https://github.com/TedDriggs/ident_case \">ident_case 1.0.1</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-lambda-rust-runtime \">lambda_http 0.8.3</a></li>\n                    <li><a href=\" 
https://github.com/awslabs/aws-lambda-rust-runtime \">lambda_runtime 0.13.0</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-lambda-rust-runtime \">lambda_runtime 0.8.3</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-lambda-rust-runtime \">lambda_runtime_api_client 0.11.1</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-lambda-rust-runtime \">lambda_runtime_api_client 0.8.0</a></li>\n                    <li><a href=\" https://github.com/stainless-steel/md5 \">md5 0.7.0</a></li>\n                    <li><a href=\" https://github.com/faern/oneshot \">oneshot 0.1.8</a></li>\n                    <li><a href=\" https://github.com/someguynamedjosh/ouroboros \">ouroboros 0.18.4</a></li>\n                    <li><a href=\" https://github.com/someguynamedjosh/ouroboros \">ouroboros_macro 0.18.4</a></li>\n                    <li><a href=\" https://github.com/jamesmunns/postcard \">postcard 1.0.10</a></li>\n                    <li><a href=\" https://github.com/eminence/procfs \">procfs-core 0.16.0</a></li>\n                    <li><a href=\" https://github.com/eminence/procfs \">procfs 0.16.0</a></li>\n                    <li><a href=\" https://github.com/comex/rust-shlex \">shlex 1.3.0</a></li>\n                    <li><a href=\" https://github.com/jedisct1/rust-siphash \">siphasher 0.3.11</a></li>\n                    <li><a href=\" https://github.com/mullvad/system-configuration-rs \">system-configuration-sys 0.5.0</a></li>\n                    <li><a href=\" https://github.com/mullvad/system-configuration-rs \">system-configuration 0.5.1</a></li>\n                    <li><a href=\" https://pijul.org/darcs/user \">username 0.2.0</a></li>\n                    <li><a href=\" https://github.com/retep998/winapi-rs \">winapi-i686-pc-windows-gnu 0.4.0</a></li>\n                    <li><a href=\" https://github.com/retep998/winapi-rs \">winapi-x86_64-pc-windows-gnu 0.4.0</a></li>\n                
</ul>\n                <pre class=\"license-text\">Apache License\nVersion 2.0, January 2004\nhttp://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. Definitions.\n\n&quot;License&quot; shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.\n\n&quot;Licensor&quot; shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.\n\n&quot;Legal Entity&quot; shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, &quot;control&quot; means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.\n\n&quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity exercising permissions granted by this License.\n\n&quot;Source&quot; form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.\n\n&quot;Object&quot; form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.\n\n&quot;Work&quot; shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).\n\n&quot;Derivative Works&quot; shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as 
a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.\n\n&quot;Contribution&quot; shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, &quot;submitted&quot; means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n&quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:\n\n     (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and\n\n     (b) You must cause any modified files to carry prominent notices stating that You changed the files; and\n\n     (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and\n\n     (d) If the Work includes a &quot;NOTICE&quot; text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a 
NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.\n\n     You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.\n\n7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. 
However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\nTo apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets &quot;[]&quot; replaced with your own identifying information. (Don&#x27;t include the brackets!)  The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same &quot;printed page&quot; as the copyright notice for easier identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Apache-2.0\">Apache License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/chronotope/chrono \">chrono 0.4.38</a></li>\n                </ul>\n                <pre class=\"license-text\">Rust-chrono is dual-licensed under The MIT License [1] 
and\nApache 2.0 License [2]. Copyright (c) 2014--2017, Kang Seonghoon and\ncontributors.\n\nNota Bene: This is same as the Rust Project&#x27;s own license.\n\n\n[1]: &lt;http://opensource.org/licenses/MIT&gt;, which is reproduced below:\n\n~~~~\nThe MIT License (MIT)\n\nCopyright (c) 2014, Kang Seonghoon.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n~~~~\n\n\n[2]: &lt;http://www.apache.org/licenses/LICENSE-2.0&gt;, which is reproduced below:\n\n~~~~\n                              Apache License\n                        Version 2.0, January 2004\n                     http://www.apache.org/licenses/\n\nTERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n1. 
Definitions.\n\n   &quot;License&quot; shall mean the terms and conditions for use, reproduction,\n   and distribution as defined by Sections 1 through 9 of this document.\n\n   &quot;Licensor&quot; shall mean the copyright owner or entity authorized by\n   the copyright owner that is granting the License.\n\n   &quot;Legal Entity&quot; shall mean the union of the acting entity and all\n   other entities that control, are controlled by, or are under common\n   control with that entity. For the purposes of this definition,\n   &quot;control&quot; means (i) the power, direct or indirect, to cause the\n   direction or management of such entity, whether by contract or\n   otherwise, or (ii) ownership of fifty percent (50%) or more of the\n   outstanding shares, or (iii) beneficial ownership of such entity.\n\n   &quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity\n   exercising permissions granted by this License.\n\n   &quot;Source&quot; form shall mean the preferred form for making modifications,\n   including but not limited to software source code, documentation\n   source, and configuration files.\n\n   &quot;Object&quot; form shall mean any form resulting from mechanical\n   transformation or translation of a Source form, including but\n   not limited to compiled object code, generated documentation,\n   and conversions to other media types.\n\n   &quot;Work&quot; shall mean the work of authorship, whether in Source or\n   Object form, made available under the License, as indicated by a\n   copyright notice that is included in or attached to the work\n   (an example is provided in the Appendix below).\n\n   &quot;Derivative Works&quot; shall mean any work, whether in Source or Object\n   form, that is based on (or derived from) the Work and for which the\n   editorial revisions, annotations, elaborations, or other modifications\n   represent, as a whole, an original work of authorship. 
For the purposes\n   of this License, Derivative Works shall not include works that remain\n   separable from, or merely link (or bind by name) to the interfaces of,\n   the Work and Derivative Works thereof.\n\n   &quot;Contribution&quot; shall mean any work of authorship, including\n   the original version of the Work and any modifications or additions\n   to that Work or Derivative Works thereof, that is intentionally\n   submitted to Licensor for inclusion in the Work by the copyright owner\n   or by an individual or Legal Entity authorized to submit on behalf of\n   the copyright owner. For the purposes of this definition, &quot;submitted&quot;\n   means any form of electronic, verbal, or written communication sent\n   to the Licensor or its representatives, including but not limited to\n   communication on electronic mailing lists, source code control systems,\n   and issue tracking systems that are managed by, or on behalf of, the\n   Licensor for the purpose of discussing and improving the Work, but\n   excluding communication that is conspicuously marked or otherwise\n   designated in writing by the copyright owner as &quot;Not a Contribution.&quot;\n\n   &quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity\n   on behalf of whom a Contribution has been received by Licensor and\n   subsequently incorporated within the Work.\n\n2. Grant of Copyright License. Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   copyright license to reproduce, prepare Derivative Works of,\n   publicly display, publicly perform, sublicense, and distribute the\n   Work and such Derivative Works in Source or Object form.\n\n3. Grant of Patent License. 
Subject to the terms and conditions of\n   this License, each Contributor hereby grants to You a perpetual,\n   worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n   (except as stated in this section) patent license to make, have made,\n   use, offer to sell, sell, import, and otherwise transfer the Work,\n   where such license applies only to those patent claims licensable\n   by such Contributor that are necessarily infringed by their\n   Contribution(s) alone or by combination of their Contribution(s)\n   with the Work to which such Contribution(s) was submitted. If You\n   institute patent litigation against any entity (including a\n   cross-claim or counterclaim in a lawsuit) alleging that the Work\n   or a Contribution incorporated within the Work constitutes direct\n   or contributory patent infringement, then any patent licenses\n   granted to You under this License for that Work shall terminate\n   as of the date such litigation is filed.\n\n4. Redistribution. You may reproduce and distribute copies of the\n   Work or Derivative Works thereof in any medium, with or without\n   modifications, and in Source or Object form, provided that You\n   meet the following conditions:\n\n   (a) You must give any other recipients of the Work or\n       Derivative Works a copy of this License; and\n\n   (b) You must cause any modified files to carry prominent notices\n       stating that You changed the files; and\n\n   (c) You must retain, in the Source form of any Derivative Works\n       that You distribute, all copyright, patent, trademark, and\n       attribution notices from the Source form of the Work,\n       excluding those notices that do not pertain to any part of\n       the Derivative Works; and\n\n   (d) If the Work includes a &quot;NOTICE&quot; text file as part of its\n       distribution, then any Derivative Works that You distribute must\n       include a readable copy of the attribution notices contained\n       within such NOTICE file, 
excluding those notices that do not\n       pertain to any part of the Derivative Works, in at least one\n       of the following places: within a NOTICE text file distributed\n       as part of the Derivative Works; within the Source form or\n       documentation, if provided along with the Derivative Works; or,\n       within a display generated by the Derivative Works, if and\n       wherever such third-party notices normally appear. The contents\n       of the NOTICE file are for informational purposes only and\n       do not modify the License. You may add Your own attribution\n       notices within Derivative Works that You distribute, alongside\n       or as an addendum to the NOTICE text from the Work, provided\n       that such additional attribution notices cannot be construed\n       as modifying the License.\n\n   You may add Your own copyright statement to Your modifications and\n   may provide additional or different license terms and conditions\n   for use, reproduction, or distribution of Your modifications, or\n   for any such Derivative Works as a whole, provided Your use,\n   reproduction, and distribution of the Work otherwise complies with\n   the conditions stated in this License.\n\n5. Submission of Contributions. Unless You explicitly state otherwise,\n   any Contribution intentionally submitted for inclusion in the Work\n   by You to the Licensor shall be under the terms and conditions of\n   this License, without any additional terms or conditions.\n   Notwithstanding the above, nothing herein shall supersede or modify\n   the terms of any separate license agreement you may have executed\n   with Licensor regarding such Contributions.\n\n6. Trademarks. This License does not grant permission to use the trade\n   names, trademarks, service marks, or product names of the Licensor,\n   except as required for reasonable and customary use in describing the\n   origin of the Work and reproducing the content of the NOTICE file.\n\n7. 
Disclaimer of Warranty. Unless required by applicable law or\n   agreed to in writing, Licensor provides the Work (and each\n   Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n   implied, including, without limitation, any warranties or conditions\n   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n   PARTICULAR PURPOSE. You are solely responsible for determining the\n   appropriateness of using or redistributing the Work and assume any\n   risks associated with Your exercise of permissions under this License.\n\n8. Limitation of Liability. In no event and under no legal theory,\n   whether in tort (including negligence), contract, or otherwise,\n   unless required by applicable law (such as deliberate and grossly\n   negligent acts) or agreed to in writing, shall any Contributor be\n   liable to You for damages, including any direct, indirect, special,\n   incidental, or consequential damages of any character arising as a\n   result of this License or out of the use or inability to use the\n   Work (including but not limited to damages for loss of goodwill,\n   work stoppage, computer failure or malfunction, or any and all\n   other commercial damages or losses), even if such Contributor\n   has been advised of the possibility of such damages.\n\n9. Accepting Warranty or Additional Liability. While redistributing\n   the Work or Derivative Works thereof, You may choose to offer,\n   and charge a fee for, acceptance of support, warranty, indemnity,\n   or other liability obligations and/or rights consistent with this\n   License. 
However, in accepting such obligations, You may act only\n   on Your own behalf and on Your sole responsibility, not on behalf\n   of any other Contributor, and only if You agree to indemnify,\n   defend, and hold each Contributor harmless for any liability\n   incurred by, or claims asserted against, such Contributor by reason\n   of your accepting any such warranty or additional liability.\n\nEND OF TERMS AND CONDITIONS\n\nAPPENDIX: How to apply the Apache License to your work.\n\n   To apply the Apache License to your work, attach the following\n   boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;\n   replaced with your own identifying information. (Don&#x27;t include\n   the brackets!)  The text should be enclosed in the appropriate\n   comment syntax for the file format. We also recommend that a\n   file or class name and description of purpose be included on the\n   same &quot;printed page&quot; as the copyright notice for easier\n   identification within third-party archives.\n\nCopyright [yyyy] [name of copyright owner]\n\nLicensed under the Apache License, Version 2.0 (the &quot;License&quot;);\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n\thttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an &quot;AS IS&quot; BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n~~~~\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"BSD-3-Clause\">BSD 3-Clause &quot;New&quot; or &quot;Revised&quot; License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/ibraheemdev/matchit \">matchit 0.7.3</a></li>\n                
</ul>\n                <pre class=\"license-text\">BSD 3-Clause License\n\nCopyright (c) 2013, Julien Schmidt\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n1. Redistributions of source code must retain the above copyright notice, this\n   list of conditions and the following disclaimer.\n\n2. Redistributions in binary form must reproduce the above copyright notice,\n   this list of conditions and the following disclaimer in the documentation\n   and/or other materials provided with the distribution.\n\n3. Neither the name of the copyright holder nor the names of its\n   contributors may be used to endorse or promote products derived from\n   this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS &quot;AS IS&quot;\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"BSD-3-Clause\">BSD 3-Clause &quot;New&quot; or &quot;Revised&quot; License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/CurrySoftware/rust-stemmers \">rust-stemmers 1.2.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2001, Dr Martin Porter\nCopyright (c) 2004,2005, Richard Boulton\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions\nare met:\n\n  1. Redistributions of source code must retain the above copyright notice,\n     this list of conditions and the following disclaimer.\n  2. Redistributions in binary form must reproduce the above copyright notice,\n     this list of conditions and the following disclaimer in the documentation\n     and/or other materials provided with the distribution.\n  3. 
Neither the name of the Snowball project nor the names of its contributors\n     may be used to endorse or promote products derived from this software\n     without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS &quot;AS IS&quot; AND\nANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED\nWARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR\nANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES\n(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\nLOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON\nANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\nSOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"BSD-3-Clause\">BSD 3-Clause &quot;New&quot; or &quot;Revised&quot; License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/dalek-cryptography/subtle \">subtle 2.6.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2016-2017 Isis Agora Lovecruft, Henry de Valence. All rights reserved.\nCopyright (c) 2016-2024 Isis Agora Lovecruft. All rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are\nmet:\n\n1. Redistributions of source code must retain the above copyright\nnotice, this list of conditions and the following disclaimer.\n\n2. 
Redistributions in binary form must reproduce the above copyright\nnotice, this list of conditions and the following disclaimer in the\ndocumentation and/or other materials provided with the distribution.\n\n3. Neither the name of the copyright holder nor the names of its\ncontributors may be used to endorse or promote products derived from\nthis software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS &quot;AS\nIS&quot; AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED\nTO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A\nPARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\nHOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\nSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED\nTO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR\nPROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF\nLIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING\nNEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\nSOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. \n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"BSD-3-Clause\">BSD 3-Clause &quot;New&quot; or &quot;Revised&quot; License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/sebcrozet/instant \">instant 0.1.13</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019, Sébastien Crozet\r\nAll rights reserved.\r\n\r\nRedistribution and use in source and binary forms, with or without\r\nmodification, are permitted provided that the following conditions are met:\r\n\r\n1. Redistributions of source code must retain the above copyright notice, this\r\n   list of conditions and the following disclaimer.\r\n\r\n2. 
Redistributions in binary form must reproduce the above copyright notice,\r\n   this list of conditions and the following disclaimer in the documentation\r\n   and/or other materials provided with the distribution.\r\n\r\n3. Neither the name of the author nor the names of its contributors may be used\r\n   to endorse or promote products derived from this software without specific\r\n   prior written permission.\r\n\r\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS &quot;AS IS&quot; AND\r\nANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED\r\nWARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\r\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\r\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\r\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\r\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\r\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\r\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\r\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\r\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"BSD-3-Clause\">BSD 3-Clause &quot;New&quot; or &quot;Revised&quot; License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hsivonen/encoding_rs \">encoding_rs 0.8.32</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright © WHATWG (Apple, Google, Mozilla, Microsoft).\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n1. Redistributions of source code must retain the above copyright notice, this\n   list of conditions and the following disclaimer.\n\n2. 
Redistributions in binary form must reproduce the above copyright notice,\n   this list of conditions and the following disclaimer in the documentation\n   and/or other materials provided with the distribution.\n\n3. Neither the name of the copyright holder nor the names of its\n   contributors may be used to endorse or promote products derived from\n   this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS &quot;AS IS&quot;\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"CC0-1.0\">Creative Commons Zero v1.0 Universal</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/cesarb/constant_time_eq \">constant_time_eq 0.1.5</a></li>\n                    <li><a href=\" https://crates.io/crates/encoding-index-japanese \">encoding-index-japanese 1.20141219.5</a></li>\n                    <li><a href=\" https://crates.io/crates/encoding-index-korean \">encoding-index-korean 1.20141219.5</a></li>\n                    <li><a href=\" https://crates.io/crates/encoding-index-simpchinese \">encoding-index-simpchinese 1.20141219.5</a></li>\n                    <li><a href=\" 
https://crates.io/crates/encoding-index-singlebyte \">encoding-index-singlebyte 1.20141219.5</a></li>\n                    <li><a href=\" https://crates.io/crates/encoding-index-tradchinese \">encoding-index-tradchinese 1.20141219.5</a></li>\n                    <li><a href=\" https://crates.io/crates/encoding_index_tests \">encoding_index_tests 0.1.4</a></li>\n                </ul>\n                <pre class=\"license-text\">Creative Commons Legal Code\n\nCC0 1.0 Universal\n\n    CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE\n    LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN\n    ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS\n    INFORMATION ON AN &quot;AS-IS&quot; BASIS. CREATIVE COMMONS MAKES NO WARRANTIES\n    REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS\n    PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM\n    THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED\n    HEREUNDER.\n\nStatement of Purpose\n\nThe laws of most jurisdictions throughout the world automatically confer\nexclusive Copyright and Related Rights (defined below) upon the creator\nand subsequent owner(s) (each and all, an &quot;owner&quot;) of an original work of\nauthorship and/or a database (each, a &quot;Work&quot;).\n\nCertain owners wish to permanently relinquish those rights to a Work for\nthe purpose of contributing to a commons of creative, cultural and\nscientific works (&quot;Commons&quot;) that the public can reliably and without fear\nof later claims of infringement build upon, modify, incorporate in other\nworks, reuse and redistribute as freely as possible in any form whatsoever\nand for any purposes, including without limitation commercial purposes.\nThese owners may contribute to the Commons to promote the ideal of a free\nculture and the further production of creative, cultural and scientific\nworks, or to gain reputation or greater distribution for their Work in\npart 
through the use and efforts of others.\n\nFor these and/or other purposes and motivations, and without any\nexpectation of additional consideration or compensation, the person\nassociating CC0 with a Work (the &quot;Affirmer&quot;), to the extent that he or she\nis an owner of Copyright and Related Rights in the Work, voluntarily\nelects to apply CC0 to the Work and publicly distribute the Work under its\nterms, with knowledge of his or her Copyright and Related Rights in the\nWork and the meaning and intended legal effect of CC0 on those rights.\n\n1. Copyright and Related Rights. A Work made available under CC0 may be\nprotected by copyright and related or neighboring rights (&quot;Copyright and\nRelated Rights&quot;). Copyright and Related Rights include, but are not\nlimited to, the following:\n\n  i. the right to reproduce, adapt, distribute, perform, display,\n     communicate, and translate a Work;\n ii. moral rights retained by the original author(s) and/or performer(s);\niii. publicity and privacy rights pertaining to a person&#x27;s image or\n     likeness depicted in a Work;\n iv. rights protecting against unfair competition in regards to a Work,\n     subject to the limitations in paragraph 4(a), below;\n  v. rights protecting the extraction, dissemination, use and reuse of data\n     in a Work;\n vi. database rights (such as those arising under Directive 96/9/EC of the\n     European Parliament and of the Council of 11 March 1996 on the legal\n     protection of databases, and under any national implementation\n     thereof, including any amended or successor version of such\n     directive); and\nvii. other similar, equivalent or corresponding rights throughout the\n     world based on applicable law or treaty, and any national\n     implementations thereof.\n\n2. Waiver. 
To the greatest extent permitted by, but not in contravention\nof, applicable law, Affirmer hereby overtly, fully, permanently,\nirrevocably and unconditionally waives, abandons, and surrenders all of\nAffirmer&#x27;s Copyright and Related Rights and associated claims and causes\nof action, whether now known or unknown (including existing as well as\nfuture claims and causes of action), in the Work (i) in all territories\nworldwide, (ii) for the maximum duration provided by applicable law or\ntreaty (including future time extensions), (iii) in any current or future\nmedium and for any number of copies, and (iv) for any purpose whatsoever,\nincluding without limitation commercial, advertising or promotional\npurposes (the &quot;Waiver&quot;). Affirmer makes the Waiver for the benefit of each\nmember of the public at large and to the detriment of Affirmer&#x27;s heirs and\nsuccessors, fully intending that such Waiver shall not be subject to\nrevocation, rescission, cancellation, termination, or any other legal or\nequitable action to disrupt the quiet enjoyment of the Work by the public\nas contemplated by Affirmer&#x27;s express Statement of Purpose.\n\n3. Public License Fallback. Should any part of the Waiver for any reason\nbe judged legally invalid or ineffective under applicable law, then the\nWaiver shall be preserved to the maximum extent permitted taking into\naccount Affirmer&#x27;s express Statement of Purpose. 
In addition, to the\nextent the Waiver is so judged Affirmer hereby grants to each affected\nperson a royalty-free, non transferable, non sublicensable, non exclusive,\nirrevocable and unconditional license to exercise Affirmer&#x27;s Copyright and\nRelated Rights in the Work (i) in all territories worldwide, (ii) for the\nmaximum duration provided by applicable law or treaty (including future\ntime extensions), (iii) in any current or future medium and for any number\nof copies, and (iv) for any purpose whatsoever, including without\nlimitation commercial, advertising or promotional purposes (the\n&quot;License&quot;). The License shall be deemed effective as of the date CC0 was\napplied by Affirmer to the Work. Should any part of the License for any\nreason be judged legally invalid or ineffective under applicable law, such\npartial invalidity or ineffectiveness shall not invalidate the remainder\nof the License, and in such case Affirmer hereby affirms that he or she\nwill not (i) exercise any of his or her remaining Copyright and Related\nRights in the Work or (ii) assert any associated claims and causes of\naction with respect to the Work, in either case contrary to Affirmer&#x27;s\nexpress Statement of Purpose.\n\n4. Limitations and Disclaimers.\n\n a. No trademark or patent rights held by Affirmer are waived, abandoned,\n    surrendered, licensed or otherwise affected by this document.\n b. Affirmer offers the Work as-is and makes no representations or\n    warranties of any kind concerning the Work, express, implied,\n    statutory or otherwise, including without limitation warranties of\n    title, merchantability, fitness for a particular purpose, non\n    infringement, or the absence of latent or other defects, accuracy, or\n    the present or absence of errors, whether or not discoverable, all to\n    the greatest extent permissible under applicable law.\n c. 
Affirmer disclaims responsibility for clearing rights of other persons\n    that may apply to the Work or any use thereof, including without\n    limitation any person&#x27;s Copyright and Related Rights in the Work.\n    Further, Affirmer disclaims responsibility for obtaining any necessary\n    consents, permissions or other rights required for any use of the\n    Work.\n d. Affirmer understands and acknowledges that Creative Commons is not a\n    party to this document and has no duty or obligation with respect to\n    this CC0 or use of the Work.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"ISC\">ISC License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/briansmith/ring \">ring 0.17.8</a></li>\n                </ul>\n                <pre class=\"license-text\">   Copyright 2015-2016 Brian Smith.\n\n   Permission to use, copy, modify, and/or distribute this software for any\n   purpose with or without fee is hereby granted, provided that the above\n   copyright notice and this permission notice appear in all copies.\n\n   THE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND THE AUTHORS DISCLAIM ALL WARRANTIES\n   WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF\n   MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY\n   SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES\n   WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION\n   OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN\n   CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"ISC\">ISC License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/briansmith/ring \">ring 0.17.8</a></li>\n                </ul>\n                <pre class=\"license-text\">/* Copyright (c) 2015, Google Inc.\n *\n * Permission to use, copy, modify, and/or distribute this software for any\n * purpose with or without fee is hereby granted, provided that the above\n * copyright notice and this permission notice appear in all copies.\n *\n * THE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND THE AUTHOR DISCLAIMS ALL WARRANTIES\n * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF\n * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY\n * SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES\n * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION\n * OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN\n * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
*/\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"ISC\">ISC License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/briansmith/untrusted \">untrusted 0.9.0</a></li>\n                </ul>\n                <pre class=\"license-text\">// Copyright 2015-2016 Brian Smith.\n//\n// Permission to use, copy, modify, and/or distribute this software for any\n// purpose with or without fee is hereby granted, provided that the above\n// copyright notice and this permission notice appear in all copies.\n//\n// THE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND THE AUTHORS DISCLAIM ALL WARRANTIES\n// WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF\n// MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR\n// ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES\n// WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN\n// ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF\n// OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"ISC\">ISC License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rustls/webpki \">rustls-webpki 0.101.7</a></li>\n                </ul>\n                <pre class=\"license-text\">// Copyright 2021 Brian Smith.\n//\n// Permission to use, copy, modify, and/or distribute this software for any\n// purpose with or without fee is hereby granted, provided that the above\n// copyright notice and this permission notice appear in all copies.\n//\n// THE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND THE AUTHORS DISCLAIM ALL WARRANTIES\n// WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF\n// MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR\n// ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES\n// WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN\n// ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF\n// OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.\n\n#[test]\nfn cert_without_extensions_test() {\n    // Check the certificate is valid with\n    // &#x60;openssl x509 -in cert_without_extensions.der -inform DER -text -noout&#x60;\n    const CERT_WITHOUT_EXTENSIONS_DER: &amp;[u8] &#x3D; include_bytes!(&quot;cert_without_extensions.der&quot;);\n\n    assert!(webpki::EndEntityCert::try_from(CERT_WITHOUT_EXTENSIONS_DER).is_ok());\n}\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"ISC\">ISC License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/jedisct1/rust-coarsetime \">coarsetime 0.1.34</a></li>\n                </ul>\n                <pre class=\"license-text\">ISC License:\n\nCopyright (c) 2004-2010 by Internet Systems Consortium, Inc. (&quot;ISC&quot;)\nCopyright (c) 1995-2003 by Internet Software Consortium\n\nPermission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot; AND ISC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/allan2/dotenvy \">dotenvy 0.15.7</a></li>\n                </ul>\n                <pre class=\"license-text\"># The MIT License (MIT)\n\nCopyright (c) 2014 Santiago Lapresta and contributors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/mio \">mio 1.0.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2014 Carl Lerche and other MIO contributors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/SimonSapin/rust-std-candidates \">matches 0.1.10</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2014-2016 Simon Sapin\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/Geal/nom \">nom 7.1.3</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2014-2019 Geoffroy Couprie\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n&quot;Software&quot;), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/headers \">headers-core 0.2.0</a></li>\n                    <li><a href=\" https://github.com/hyperium/headers \">headers 0.3.9</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2014-2019 Sean McArthur\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/mikedilger/float-cmp \">float-cmp 0.9.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2014-2020 Optimal Computing (NZ) Ltd\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the &quot;Software&quot;), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies\nof the Software, and to permit persons to whom the Software is furnished to do\nso, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING\nFROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS\nIN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/hyper \">hyper 0.14.30</a></li>\n                    <li><a href=\" https://github.com/hyperium/hyper \">hyper 1.4.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2014-2021 Sean McArthur\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/gentoo90/winreg-rs \">winreg 0.50.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2015 Igor Shaula\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/reem/rust-ordered-float \">ordered-float 3.9.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2015 Jonathan Reem\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/harryfei/which-rs.git \">which 4.4.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2015 fangyuanziti\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/steffengy/schannel-rs \">schannel 0.1.26</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2015 steffengy\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/briansmith/ring \">ring 0.17.8</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2015-2016 the fiat-crypto authors (see\nhttps://github.com/mit-plv/fiat-crypto/blob/master/AUTHORS).\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-oss/bitpacking \">bitpacking 0.9.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2016 Paul Masurel\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/snapview/tokio-tungstenite \">tokio-tungstenite 0.21.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2017 Daniel Abramov\nCopyright (c) 2017 Alexey Galakhov\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rust-cli/termtree \">termtree 0.4.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2017 Doug Tangren\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n&quot;Software&quot;), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://gitlab.redox-os.org/redox-os/syscall \">redox_syscall 0.5.7</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2017 Redox OS Developers\n\nMIT License\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n&quot;Software&quot;), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/h2 \">h2 0.3.26</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2017 h2 authors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/bytes \">bytes 1.7.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018 Carl Lerche\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tantivy-search/levenshtein-automata \">levenshtein_automata 0.2.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018 Paul Masurel\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/palfrey/serial_test/ \">serial_test 3.1.1</a></li>\n                    <li><a href=\" https://github.com/palfrey/serial_test/ \">serial_test_derive 3.1.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018 Tom Parker-Shemilt\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-inc/census \">census 0.4.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018 by Quickwit, Inc. \n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy 0.23.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018 by the project authors, as listed in the AUTHORS file. \n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/seanmonstar/want \">want 0.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018-2019 Sean McArthur\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/seanmonstar/warp \">warp 0.3.7</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018-2020 Sean McArthur\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/seanmonstar/try-lock \">try-lock 0.2.5</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2018-2023 Sean McArthur\nCopyright (c) 2016 Alex Crichton\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/axum \">axum 0.6.20</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Axum Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/slab \">slab 0.4.9</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Carl Lerche\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/davidpdrsn/assert-json-diff.git \">assert-json-diff 1.1.0</a></li>\n                    <li><a href=\" https://github.com/davidpdrsn/assert-json-diff.git \">assert-json-diff 2.0.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 David Pedersen\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hawkw/sharded-slab \">sharded-slab 0.1.7</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Eliza Weisman\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hawkw/matchers \">matchers 0.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Eliza Weisman\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/http-body \">http-body-util 0.1.2</a></li>\n                    <li><a href=\" https://github.com/hyperium/http-body \">http-body 0.4.6</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Hyper Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/stepancheg/rust-protobuf/ \">protobuf 2.28.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Stepan Koltsov\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.\nIN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,\nDAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR\nOTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE\nOR OTHER DEALINGS IN THE SOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/tracing 
\">tracing-attributes 0.1.27</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing \">tracing-core 0.1.32</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing \">tracing-log 0.1.4</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing \">tracing-log 0.2.0</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing-opentelemetry \">tracing-opentelemetry 0.20.0</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing \">tracing-serde 0.1.3</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing \">tracing-subscriber 0.3.18</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tracing \">tracing 0.1.40</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Tokio Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tower-rs/tower \">tower-layer 0.3.3</a></li>\n                    <li><a href=\" https://github.com/tower-rs/tower \">tower-service 0.3.3</a></li>\n                    <li><a href=\" https://github.com/tower-rs/tower \">tower 0.4.13</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019 Tower Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tower-rs/tower-http \">tower-http 0.4.4</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019-2021 Tower Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/http-body \">http-body 1.0.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2019-2024 Sean McArthur &amp; Hyper Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/tonic \">tonic-build 0.9.2</a></li>\n                    <li><a href=\" https://github.com/hyperium/tonic \">tonic 0.9.2</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2020 Lucio Franco\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/calavera/query-map-rs \">query_map 0.7.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2021 David Calavera &lt;david.calavera@gmail.com&gt;\n\nMIT License\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n&quot;Software&quot;), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/tokio-metrics \">tokio-metrics 0.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2022 Tokio Contributors\n\nPermission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://crates.io/crates/mrecordlog \">mrecordlog 0.4.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2022 by Quickwit, Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/hyperium/hyper-util \">hyper-util 0.1.9</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2023 Sean McArthur\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://crates.io/crates/whichlang \">whichlang 0.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2023 by Quickwit Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-inc/murmurhash32 \">murmurhash32 0.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2024 by Quickwit Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/axum \">axum-core 0.3.4</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright 2021 Axum Contributors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/PSeitz/rust_measure_time \">measure_time 0.8.3</a></li>\n                </ul>\n                <pre class=\"license-text\">Includes portions of humantime\nCopyright (c) 2016 The humantime Developers\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/jeromefroe/lru-rs.git \">lru 0.12.5</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2016 Jerome Froelich\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/svartalf/hostname \">hostname 0.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2016 fengcen\nCopyright (c) 2019 svartalf\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/dylanhart/ulid-rs \">ulid 1.1.3</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2017 Dylan Hart\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/TedDriggs/darling \">darling 0.20.10</a></li>\n                    <li><a href=\" https://github.com/TedDriggs/darling \">darling_core 0.20.10</a></li>\n                    <li><a href=\" https://github.com/TedDriggs/darling \">darling_macro 0.20.10</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2017 Ted Driggs\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/dunmatt/no-std-net \">no-std-net 0.6.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2018 M@\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/awslabs/aws-lambda-rust-runtime \">aws_lambda_events 0.12.1</a></li>\n                    <li><a href=\" https://github.com/awslabs/aws-lambda-rust-runtime \">aws_lambda_events 0.15.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2018 Sam Rijs and Christian Legnitto\nCopyright 2023 Amazon.com, Inc. or its affiliates\n\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/bojand/infer \">infer 0.2.3</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2019 Bojan\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/tokio \">tokio-macros 2.4.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2019 Yoshua Wuyts\nCopyright (c) Tokio Contributors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/brunoczim/fslock \">fslock 0.2.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2019 brunoczim\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tabac/hyperloglog.rs \">hyperloglogplus 0.4.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2020 Anastasios Bakogiannis\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rousan/multer-rs \">multer 2.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2020 Rousan Ali\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/zenlist/serde_dynamo \">serde_dynamo 4.2.14</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2020 Zenlist\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/MarcusGrass/parse-range-headers \">http-range-header 0.3.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2021 MarcusGrass\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/zhiburt/tabled \">papergrid 0.10.0</a></li>\n                    <li><a href=\" https://github.com/zhiburt/tabled \">tabled 0.14.0</a></li>\n                    <li><a href=\" https://github.com/zhiburt/tabled \">tabled_derive 0.6.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2021 Maxim Zhiburt\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/ibraheemdev/matchit \">matchit 0.7.3</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2022 Ibraheem Ahmed\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/zhiburt/ansi-str \">ansi-str 0.8.0</a></li>\n                    <li><a href=\" https://gitlab.com/zhiburt/ansitok \">ansitok 0.2.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2022 Maxim Zhiburt\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/Nugine/outref \">outref 0.5.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2022 Nugine\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://gitlab.redox-os.org/redox-os/libredox.git \">libredox 0.1.3</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2023 4lDO2\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/kdr-aus/numfmt \">numfmt 1.1.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) 2023 kurtlawrence\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-oss/chitchat \">chitchat 0.8.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">ownedbytes 0.7.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-bitpacker 0.6.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-columnar 0.3.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-common 0.7.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-query-grammar 0.22.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-sstable 0.3.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-stacker 0.3.0</a></li>\n                    <li><a href=\" https://github.com/quickwit-oss/tantivy \">tantivy-tokenizer-api 0.3.0</a></li>\n                    <li><a href=\" https://github.com/retep998/winapi-rs \">advapi32-sys 0.2.0</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/async-stream \">async-stream-impl 0.3.6</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/async-stream \">async-stream 0.3.6</a></li>\n                    <li><a href=\" https://github.com/Nugine/simd \">base64-simd 0.8.0</a></li>\n                    <li><a href=\" https://crates.io/crates/crunchy \">crunchy 0.2.2</a></li>\n                    <li><a 
href=\" https://github.com/DimaKudosh/difflib \">difflib 0.4.0</a></li>\n                    <li><a href=\" https://github.com/stephaneyfx/enum-iterator.git \">enum-iterator 1.5.0</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-cc-cedict-builder 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-cc-cedict 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-compress 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-core 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-decompress 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-dictionary 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-ipadic-builder 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-ipadic-neologd-builder 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-ipadic 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-ko-dic-builder 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-ko-dic 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-tokenizer 0.27.2</a></li>\n                    <li><a href=\" https://github.com/lindera-morphology/lindera \">lindera-unidic-builder 0.27.2</a></li>\n                    <li><a href=\" https://github.com/hasezoey/new_string_template \">new_string_template 1.5.3</a></li>\n                    <li><a href=\" https://github.com/ogham/rust-number-prefix \">number_prefix 0.4.0</a></li>\n   
                 <li><a href=\" https://github.com/plotters-rs/plotters \">plotters-backend 0.3.7</a></li>\n                    <li><a href=\" https://github.com/plotters-rs/plotters.git \">plotters-svg 0.3.7</a></li>\n                    <li><a href=\" https://github.com/plotters-rs/plotters \">plotters 0.3.7</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/valuable \">valuable 0.1.0</a></li>\n                    <li><a href=\" https://github.com/Nugine/simd \">vsimd 0.8.0</a></li>\n                    <li><a href=\" https://github.com/retep998/winapi-rs \">winapi-build 0.1.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) &lt;year&gt; &lt;copyright holders&gt;\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/tokio-rs/tokio \">tokio-stream 0.1.16</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tokio \">tokio-util 0.7.12</a></li>\n                    <li><a href=\" https://github.com/tokio-rs/tokio \">tokio 1.40.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\n\nCopyright (c) Tokio Contributors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/danaugrs/overload \">overload 0.1.1</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License\r\n\r\nCopyright (c) 2019 Daniel Augusto Rizzi Salvadori\r\n\r\nPermission is hereby granted, free of charge, to any person obtaining a copy\r\nof this software and associated documentation files (the &quot;Software&quot;), to deal\r\nin the Software without restriction, including without limitation the rights\r\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\r\ncopies of the Software, and to permit persons to whom the Software is\r\nfurnished to do so, subject to the following conditions:\r\n\r\nThe above copyright notice and this permission notice shall be included in all\r\ncopies or substantial portions of the Software.\r\n\r\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\r\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\r\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\r\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\r\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\r\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\r\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/fkoep/downcast-rs \">downcast 0.11.0</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License (MIT)\n\nCopyright (c) 2017 Felix Köpge\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/davidpdrsn/ext.git \">extend 0.1.2</a></li>\n                </ul>\n                <pre class=\"license-text\">MIT License Copyright (c) 2020 David Pedersen\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is furnished\nto do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice (including the next\nparagraph) shall be included in all copies or substantial portions of the\nSoftware.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS\nFOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS\nOR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,\nWHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF\nOR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/sunfishcode/is-terminal \">is-terminal 0.4.13</a></li>\n                    <li><a href=\" https://github.com/upsuper/retain_mut \">retain_mut 0.1.9</a></li>\n                    <li><a href=\" https://github.com/PSeitz/serde_json_borrow \">serde_json_borrow 0.5.1</a></li>\n                    <li><a href=\" https://github.com/dtolnay/unsafe-libyaml \">unsafe-libyaml 0.2.11</a></li>\n                </ul>\n                <pre class=\"license-text\">Permission is hereby granted, free of charge, to any\nperson obtaining a copy of this software and associated\ndocumentation files (the &quot;Software&quot;), to deal in the\nSoftware without restriction, including without\nlimitation the rights to use, copy, modify, merge,\npublish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software\nis furnished to do so, subject to the following\nconditions:\n\nThe above copyright notice and this permission notice\nshall be included in all copies or substantial portions\nof the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED\nTO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT\nSHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY\nCLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR\nIN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\nDEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/winnow-rs/winnow \">winnow 0.5.40</a></li>\n                </ul>\n                <pre class=\"license-text\">Permission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n&quot;Software&quot;), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/lifthrasiir/rust-encoding \">encoding 0.2.33</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2013, Kang Seonghoon.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/nushell/nu-ansi-term \">nu-ansi-term 0.46.0</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2014 Benjamin Sago\nCopyright (c) 2021-2022 The Nushell Project Developers\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/mvdnes/spin-rs.git \">spin 0.9.8</a></li>\n                    <li><a href=\" https://github.com/zip-rs/zip.git \">zip 0.6.6</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2014 Mathijs van de Nes\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/BurntSushi/aho-corasick \">aho-corasick 1.1.3</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/byteorder \">byteorder 1.5.0</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/rust-csv \">csv-core 0.1.11</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/rust-csv \">csv 1.3.0</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/memchr \">memchr 2.7.4</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/regex-automata \">regex-automata 0.1.10</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/termcolor \">termcolor 1.4.1</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/utf8-ranges \">utf8-ranges 1.0.5</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/walkdir \">walkdir 2.5.0</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2015 Andrew Gallant\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright 
notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/quickwit-inc/fst \">tantivy-fst 0.5.0</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2015 Andrew Gallant\nCopyright (c) 2019 Paul Masurel\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rapidfuzz/strsim-rs \">strsim 0.11.1</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2015 Danny Guo\nCopyright (c) 2016 Titus Wormer &lt;tituswormer@gmail.com&gt;\nCopyright (c) 2018 Akash Kurdekar\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/retep998/winapi-rs \">winapi 0.2.8</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2015 Peter Atashian\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/saghm/rust-separator \">separator 0.4.1</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2015 Saghm Rossi\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/ia0/data-encoding \">data-encoding 2.6.0</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2015-2020 Julien Cretin\nCopyright (c) 2017-2020 Google Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/BurntSushi/same-file \">same-file 1.0.6</a></li>\n                    <li><a href=\" https://github.com/BurntSushi/winapi-util \">winapi-util 0.1.9</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2017 Andrew Gallant\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/console-rs/console \">console 0.15.8</a></li>\n                    <li><a href=\" https://github.com/mitsuhiko/dialoguer \">dialoguer 0.10.4</a></li>\n                    <li><a href=\" https://github.com/console-rs/indicatif \">indicatif 0.17.8</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2017 Armin Ronacher &lt;armin.ronacher@active-4.com&gt;\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/pyros2097/rust-embed \">rust-embed-impl 6.8.1</a></li>\n                    <li><a href=\" https://github.com/pyros2097/rust-embed \">rust-embed-utils 7.8.1</a></li>\n                    <li><a href=\" https://github.com/pyros2097/rust-embed \">rust-embed 6.8.1</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2018 pyros2097\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://hg.sr.ht/~icefox/oorandom \">oorandom 11.1.4</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2019 Simon Heath\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/avitex/rust-aliasable \">aliasable 0.1.3</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2020 James Dyson &lt;avitex@wfxlabs.com&gt;\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/pseitz/lz4_flex \">lz4_flex 0.11.3</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\n\nCopyright (c) 2020 Pascal Seitz\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the &quot;Software&quot;), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software is furnished to do so,\nsubject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS\nFOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR\nCOPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER\nIN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN\nCONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/gyscos/zstd-rs \">zstd 0.11.2+zstd.1.5.2</a></li>\n                    <li><a href=\" https://github.com/gyscos/zstd-rs \">zstd 0.13.2</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\nCopyright (c) 2016 Alexandre Bury\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &quot;Software&quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/servo/bincode \">bincode 1.3.3</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\r\n\r\nCopyright (c) 2014 Ty Overby\r\n\r\nPermission is hereby granted, free of charge, to any person obtaining a copy\r\nof this software and associated documentation files (the &quot;Software&quot;), to deal\r\nin the Software without restriction, including without limitation the rights\r\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\r\ncopies of the Software, and to permit persons to whom the Software is\r\nfurnished to do so, subject to the following conditions:\r\n\r\nThe above copyright notice and this permission notice shall be included in all\r\ncopies or substantial portions of the Software.\r\n\r\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\r\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\r\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\r\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\r\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\r\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\r\nSOFTWARE.\r\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/abonander/mime_guess \">mime_guess 2.0.5</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\r\n\r\nCopyright (c) 2015 Austin Bonander\r\n\r\nPermission is hereby granted, free of charge, to any person obtaining a copy\r\nof this software and associated documentation files (the &quot;Software&quot;), to deal\r\nin the Software without restriction, including without limitation the rights\r\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\r\ncopies of the Software, and to permit persons to whom the Software is\r\nfurnished to do so, subject to the following conditions:\r\n\r\nThe above copyright notice and this permission notice shall be included in all\r\ncopies or substantial portions of the Software.\r\n\r\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\r\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\r\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\r\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\r\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\r\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\r\nSOFTWARE.\r\n\r\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/fizyk20/generic-array.git \">generic-array 0.14.7</a></li>\n                </ul>\n                <pre class=\"license-text\">The MIT License (MIT)\r\n\r\nCopyright (c) 2015 Bartłomiej Kamiński\r\n\r\nPermission is hereby granted, free of charge, to any person obtaining a copy\r\nof this software and associated documentation files (the &quot;Software&quot;), to deal\r\nin the Software without restriction, including without limitation the rights\r\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\r\ncopies of the Software, and to permit persons to whom the Software is\r\nfurnished to do so, subject to the following conditions:\r\n\r\nThe above copyright notice and this permission notice shall be included in all\r\ncopies or substantial portions of the Software.\r\n\r\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\r\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\r\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\r\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\r\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\r\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\r\nSOFTWARE.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MIT\">MIT License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/kornelski/rust_urlencoding \">urlencoding 2.1.3</a></li>\n                </ul>\n                <pre class=\"license-text\">© 2016 Bertram Truong\n© 2021 Kornel Lesiński\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the &quot;Software&quot;), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MPL-2.0\">Mozilla Public License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/mackwic/colored \">colored 2.1.0</a></li>\n                </ul>\n                <pre class=\"license-text\">Mozilla Public License Version 2.0\n&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;\n\n1. Definitions\n--------------\n\n1.1. &quot;Contributor&quot;\n    means each individual or legal entity that creates, contributes to\n    the creation of, or owns Covered Software.\n\n1.2. &quot;Contributor Version&quot;\n    means the combination of the Contributions of others (if any) used\n    by a Contributor and that particular Contributor&#x27;s Contribution.\n\n1.3. &quot;Contribution&quot;\n    means Covered Software of a particular Contributor.\n\n1.4. &quot;Covered Software&quot;\n    means Source Code Form to which the initial Contributor has attached\n    the notice in Exhibit A, the Executable Form of such Source Code\n    Form, and Modifications of such Source Code Form, in each case\n    including portions thereof.\n\n1.5. 
&quot;Incompatible With Secondary Licenses&quot;\n    means\n\n    (a) that the initial Contributor has attached the notice described\n        in Exhibit B to the Covered Software; or\n\n    (b) that the Covered Software was made available under the terms of\n        version 1.1 or earlier of the License, but not also under the\n        terms of a Secondary License.\n\n1.6. &quot;Executable Form&quot;\n    means any form of the work other than Source Code Form.\n\n1.7. &quot;Larger Work&quot;\n    means a work that combines Covered Software with other material, in \n    a separate file or files, that is not Covered Software.\n\n1.8. &quot;License&quot;\n    means this document.\n\n1.9. &quot;Licensable&quot;\n    means having the right to grant, to the maximum extent possible,\n    whether at the time of the initial grant or subsequently, any and\n    all of the rights conveyed by this License.\n\n1.10. &quot;Modifications&quot;\n    means any of the following:\n\n    (a) any file in Source Code Form that results from an addition to,\n        deletion from, or modification of the contents of Covered\n        Software; or\n\n    (b) any new file in Source Code Form that contains any Covered\n        Software.\n\n1.11. &quot;Patent Claims&quot; of a Contributor\n    means any patent claim(s), including without limitation, method,\n    process, and apparatus claims, in any patent Licensable by such\n    Contributor that would be infringed, but for the grant of the\n    License, by the making, using, selling, offering for sale, having\n    made, import, or transfer of either its Contributions or its\n    Contributor Version.\n\n1.12. &quot;Secondary License&quot;\n    means either the GNU General Public License, Version 2.0, the GNU\n    Lesser General Public License, Version 2.1, the GNU Affero General\n    Public License, Version 3.0, or any later versions of those\n    licenses.\n\n1.13. 
&quot;Source Code Form&quot;\n    means the form of the work preferred for making modifications.\n\n1.14. &quot;You&quot; (or &quot;Your&quot;)\n    means an individual or a legal entity exercising rights under this\n    License. For legal entities, &quot;You&quot; includes any entity that\n    controls, is controlled by, or is under common control with You. For\n    purposes of this definition, &quot;control&quot; means (a) the power, direct\n    or indirect, to cause the direction or management of such entity,\n    whether by contract or otherwise, or (b) ownership of more than\n    fifty percent (50%) of the outstanding shares or beneficial\n    ownership of such entity.\n\n2. License Grants and Conditions\n--------------------------------\n\n2.1. Grants\n\nEach Contributor hereby grants You a world-wide, royalty-free,\nnon-exclusive license:\n\n(a) under intellectual property rights (other than patent or trademark)\n    Licensable by such Contributor to use, reproduce, make available,\n    modify, display, perform, distribute, and otherwise exploit its\n    Contributions, either on an unmodified basis, with Modifications, or\n    as part of a Larger Work; and\n\n(b) under Patent Claims of such Contributor to make, use, sell, offer\n    for sale, have made, import, and otherwise transfer either its\n    Contributions or its Contributor Version.\n\n2.2. Effective Date\n\nThe licenses granted in Section 2.1 with respect to any Contribution\nbecome effective for each Contribution on the date the Contributor first\ndistributes such Contribution.\n\n2.3. Limitations on Grant Scope\n\nThe licenses granted in this Section 2 are the only rights granted under\nthis License. 
No additional rights or licenses will be implied from the\ndistribution or licensing of Covered Software under this License.\nNotwithstanding Section 2.1(b) above, no patent license is granted by a\nContributor:\n\n(a) for any code that a Contributor has removed from Covered Software;\n    or\n\n(b) for infringements caused by: (i) Your and any other third party&#x27;s\n    modifications of Covered Software, or (ii) the combination of its\n    Contributions with other software (except as part of its Contributor\n    Version); or\n\n(c) under Patent Claims infringed by Covered Software in the absence of\n    its Contributions.\n\nThis License does not grant any rights in the trademarks, service marks,\nor logos of any Contributor (except as may be necessary to comply with\nthe notice requirements in Section 3.4).\n\n2.4. Subsequent Licenses\n\nNo Contributor makes additional grants as a result of Your choice to\ndistribute the Covered Software under a subsequent version of this\nLicense (see Section 10.2) or under the terms of a Secondary License (if\npermitted under the terms of Section 3.3).\n\n2.5. Representation\n\nEach Contributor represents that the Contributor believes its\nContributions are its original creation(s) or it has sufficient rights\nto grant the rights to its Contributions conveyed by this License.\n\n2.6. Fair Use\n\nThis License is not intended to limit any rights You have under\napplicable copyright doctrines of fair use, fair dealing, or other\nequivalents.\n\n2.7. Conditions\n\nSections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted\nin Section 2.1.\n\n3. Responsibilities\n-------------------\n\n3.1. Distribution of Source Form\n\nAll distribution of Covered Software in Source Code Form, including any\nModifications that You create or to which You contribute, must be under\nthe terms of this License. 
You must inform recipients that the Source\nCode Form of the Covered Software is governed by the terms of this\nLicense, and how they can obtain a copy of this License. You may not\nattempt to alter or restrict the recipients&#x27; rights in the Source Code\nForm.\n\n3.2. Distribution of Executable Form\n\nIf You distribute Covered Software in Executable Form then:\n\n(a) such Covered Software must also be made available in Source Code\n    Form, as described in Section 3.1, and You must inform recipients of\n    the Executable Form how they can obtain a copy of such Source Code\n    Form by reasonable means in a timely manner, at a charge no more\n    than the cost of distribution to the recipient; and\n\n(b) You may distribute such Executable Form under the terms of this\n    License, or sublicense it under different terms, provided that the\n    license for the Executable Form does not attempt to limit or alter\n    the recipients&#x27; rights in the Source Code Form under this License.\n\n3.3. Distribution of a Larger Work\n\nYou may create and distribute a Larger Work under terms of Your choice,\nprovided that You also comply with the requirements of this License for\nthe Covered Software. If the Larger Work is a combination of Covered\nSoftware with a work governed by one or more Secondary Licenses, and the\nCovered Software is not Incompatible With Secondary Licenses, this\nLicense permits You to additionally distribute such Covered Software\nunder the terms of such Secondary License(s), so that the recipient of\nthe Larger Work may, at their option, further distribute the Covered\nSoftware under the terms of either this License or such Secondary\nLicense(s).\n\n3.4. 
Notices\n\nYou may not remove or alter the substance of any license notices\n(including copyright notices, patent notices, disclaimers of warranty,\nor limitations of liability) contained within the Source Code Form of\nthe Covered Software, except that You may alter any license notices to\nthe extent required to remedy known factual inaccuracies.\n\n3.5. Application of Additional Terms\n\nYou may choose to offer, and to charge a fee for, warranty, support,\nindemnity or liability obligations to one or more recipients of Covered\nSoftware. However, You may do so only on Your own behalf, and not on\nbehalf of any Contributor. You must make it absolutely clear that any\nsuch warranty, support, indemnity, or liability obligation is offered by\nYou alone, and You hereby agree to indemnify every Contributor for any\nliability incurred by such Contributor as a result of warranty, support,\nindemnity or liability terms You offer. You may include additional\ndisclaimers of warranty and limitations of liability specific to any\njurisdiction.\n\n4. Inability to Comply Due to Statute or Regulation\n---------------------------------------------------\n\nIf it is impossible for You to comply with any of the terms of this\nLicense with respect to some or all of the Covered Software due to\nstatute, judicial order, or regulation then You must: (a) comply with\nthe terms of this License to the maximum extent possible; and (b)\ndescribe the limitations and the code they affect. Such description must\nbe placed in a text file included with all distributions of the Covered\nSoftware under this License. Except to the extent prohibited by statute\nor regulation, such description must be sufficiently detailed for a\nrecipient of ordinary skill to be able to understand it.\n\n5. Termination\n--------------\n\n5.1. The rights granted under this License will terminate automatically\nif You fail to comply with any of its terms. 
However, if You become\ncompliant, then the rights granted under this License from a particular\nContributor are reinstated (a) provisionally, unless and until such\nContributor explicitly and finally terminates Your grants, and (b) on an\nongoing basis, if such Contributor fails to notify You of the\nnon-compliance by some reasonable means prior to 60 days after You have\ncome back into compliance. Moreover, Your grants from a particular\nContributor are reinstated on an ongoing basis if such Contributor\nnotifies You of the non-compliance by some reasonable means, this is the\nfirst time You have received notice of non-compliance with this License\nfrom such Contributor, and You become compliant prior to 30 days after\nYour receipt of the notice.\n\n5.2. If You initiate litigation against any entity by asserting a patent\ninfringement claim (excluding declaratory judgment actions,\ncounter-claims, and cross-claims) alleging that a Contributor Version\ndirectly or indirectly infringes any patent, then the rights granted to\nYou by any and all Contributors for the Covered Software under Section\n2.1 of this License shall terminate.\n\n5.3. In the event of termination under Sections 5.1 or 5.2 above, all\nend user license agreements (excluding distributors and resellers) which\nhave been validly granted by You or Your distributors under this License\nprior to termination shall survive termination.\n\n************************************************************************\n*                                                                      *\n*  6. 
Disclaimer of Warranty                                           *\n*  -------------------------                                           *\n*                                                                      *\n*  Covered Software is provided under this License on an &quot;as is&quot;       *\n*  basis, without warranty of any kind, either expressed, implied, or  *\n*  statutory, including, without limitation, warranties that the       *\n*  Covered Software is free of defects, merchantable, fit for a        *\n*  particular purpose or non-infringing. The entire risk as to the     *\n*  quality and performance of the Covered Software is with You.        *\n*  Should any Covered Software prove defective in any respect, You     *\n*  (not any Contributor) assume the cost of any necessary servicing,   *\n*  repair, or correction. This disclaimer of warranty constitutes an   *\n*  essential part of this License. No use of any Covered Software is   *\n*  authorized under this License except under this disclaimer.         *\n*                                                                      *\n************************************************************************\n\n************************************************************************\n*                                                                      *\n*  7. 
Limitation of Liability                                          *\n*  --------------------------                                          *\n*                                                                      *\n*  Under no circumstances and under no legal theory, whether tort      *\n*  (including negligence), contract, or otherwise, shall any           *\n*  Contributor, or anyone who distributes Covered Software as          *\n*  permitted above, be liable to You for any direct, indirect,         *\n*  special, incidental, or consequential damages of any character      *\n*  including, without limitation, damages for lost profits, loss of    *\n*  goodwill, work stoppage, computer failure or malfunction, or any    *\n*  and all other commercial damages or losses, even if such party      *\n*  shall have been informed of the possibility of such damages. This   *\n*  limitation of liability shall not apply to liability for death or   *\n*  personal injury resulting from such party&#x27;s negligence to the       *\n*  extent applicable law prohibits such limitation. Some               *\n*  jurisdictions do not allow the exclusion or limitation of           *\n*  incidental or consequential damages, so this exclusion and          *\n*  limitation may not apply to You.                                    *\n*                                                                      *\n************************************************************************\n\n8. Litigation\n-------------\n\nAny litigation relating to this License may be brought only in the\ncourts of a jurisdiction where the defendant maintains its principal\nplace of business and such litigation shall be governed by laws of that\njurisdiction, without reference to its conflict-of-law provisions.\nNothing in this Section shall prevent a party&#x27;s ability to bring\ncross-claims or counter-claims.\n\n9. 
Miscellaneous\n----------------\n\nThis License represents the complete agreement concerning the subject\nmatter hereof. If any provision of this License is held to be\nunenforceable, such provision shall be reformed only to the extent\nnecessary to make it enforceable. Any law or regulation which provides\nthat the language of a contract shall be construed against the drafter\nshall not be used to construe this License against a Contributor.\n\n10. Versions of the License\n---------------------------\n\n10.1. New Versions\n\nMozilla Foundation is the license steward. Except as provided in Section\n10.3, no one other than the license steward has the right to modify or\npublish new versions of this License. Each version will be given a\ndistinguishing version number.\n\n10.2. Effect of New Versions\n\nYou may distribute the Covered Software under the terms of the version\nof the License under which You originally received the Covered Software,\nor under the terms of any subsequent version published by the license\nsteward.\n\n10.3. Modified Versions\n\nIf you create software not governed by this License, and you want to\ncreate a new license for such software, you may create and use a\nmodified version of this License if you rename the license and remove\nany references to the name of the license steward (except to note that\nsuch modified license differs from this License).\n\n10.4. Distributing Source Code Form that is Incompatible With Secondary\nLicenses\n\nIf You choose to distribute Source Code Form that is Incompatible With\nSecondary Licenses under the terms of this version of the License, the\nnotice described in Exhibit B of this License must be attached.\n\nExhibit A - Source Code Form License Notice\n-------------------------------------------\n\n  This Source Code Form is subject to the terms of the Mozilla Public\n  License, v. 2.0. 
If a copy of the MPL was not distributed with this\n  file, You can obtain one at http://mozilla.org/MPL/2.0/.\n\nIf it is not possible or desirable to put the notice in a particular\nfile, then You may include the notice in a location (such as a LICENSE\nfile in a relevant directory) where a recipient would be likely to look\nfor such a notice.\n\nYou may add additional accurate notices of copyright ownership.\n\nExhibit B - &quot;Incompatible With Secondary Licenses&quot; Notice\n---------------------------------------------------------\n\n  This Source Code Form is &quot;Incompatible With Secondary Licenses&quot;, as\n  defined by the Mozilla Public License, v. 2.0.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"MPL-2.0\">Mozilla Public License 2.0</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/rustls/webpki-roots \">webpki-roots 0.25.4</a></li>\n                </ul>\n                <pre class=\"license-text\">Mozilla Public License Version 2.0\n&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;\n\n1. Definitions\n--------------\n\n1.1. &quot;Contributor&quot;\n    means each individual or legal entity that creates, contributes to\n    the creation of, or owns Covered Software.\n\n1.2. &quot;Contributor Version&quot;\n    means the combination of the Contributions of others (if any) used\n    by a Contributor and that particular Contributor&#x27;s Contribution.\n\n1.3. &quot;Contribution&quot;\n    means Covered Software of a particular Contributor.\n\n1.4. 
&quot;Covered Software&quot;\n    means Source Code Form to which the initial Contributor has attached\n    the notice in Exhibit A, the Executable Form of such Source Code\n    Form, and Modifications of such Source Code Form, in each case\n    including portions thereof.\n\n1.5. &quot;Incompatible With Secondary Licenses&quot;\n    means\n\n    (a) that the initial Contributor has attached the notice described\n        in Exhibit B to the Covered Software; or\n\n    (b) that the Covered Software was made available under the terms of\n        version 1.1 or earlier of the License, but not also under the\n        terms of a Secondary License.\n\n1.6. &quot;Executable Form&quot;\n    means any form of the work other than Source Code Form.\n\n1.7. &quot;Larger Work&quot;\n    means a work that combines Covered Software with other material, in \n    a separate file or files, that is not Covered Software.\n\n1.8. &quot;License&quot;\n    means this document.\n\n1.9. &quot;Licensable&quot;\n    means having the right to grant, to the maximum extent possible,\n    whether at the time of the initial grant or subsequently, any and\n    all of the rights conveyed by this License.\n\n1.10. &quot;Modifications&quot;\n    means any of the following:\n\n    (a) any file in Source Code Form that results from an addition to,\n        deletion from, or modification of the contents of Covered\n        Software; or\n\n    (b) any new file in Source Code Form that contains any Covered\n        Software.\n\n1.11. &quot;Patent Claims&quot; of a Contributor\n    means any patent claim(s), including without limitation, method,\n    process, and apparatus claims, in any patent Licensable by such\n    Contributor that would be infringed, but for the grant of the\n    License, by the making, using, selling, offering for sale, having\n    made, import, or transfer of either its Contributions or its\n    Contributor Version.\n\n1.12. 
&quot;Secondary License&quot;\n    means either the GNU General Public License, Version 2.0, the GNU\n    Lesser General Public License, Version 2.1, the GNU Affero General\n    Public License, Version 3.0, or any later versions of those\n    licenses.\n\n1.13. &quot;Source Code Form&quot;\n    means the form of the work preferred for making modifications.\n\n1.14. &quot;You&quot; (or &quot;Your&quot;)\n    means an individual or a legal entity exercising rights under this\n    License. For legal entities, &quot;You&quot; includes any entity that\n    controls, is controlled by, or is under common control with You. For\n    purposes of this definition, &quot;control&quot; means (a) the power, direct\n    or indirect, to cause the direction or management of such entity,\n    whether by contract or otherwise, or (b) ownership of more than\n    fifty percent (50%) of the outstanding shares or beneficial\n    ownership of such entity.\n\n2. License Grants and Conditions\n--------------------------------\n\n2.1. Grants\n\nEach Contributor hereby grants You a world-wide, royalty-free,\nnon-exclusive license:\n\n(a) under intellectual property rights (other than patent or trademark)\n    Licensable by such Contributor to use, reproduce, make available,\n    modify, display, perform, distribute, and otherwise exploit its\n    Contributions, either on an unmodified basis, with Modifications, or\n    as part of a Larger Work; and\n\n(b) under Patent Claims of such Contributor to make, use, sell, offer\n    for sale, have made, import, and otherwise transfer either its\n    Contributions or its Contributor Version.\n\n2.2. Effective Date\n\nThe licenses granted in Section 2.1 with respect to any Contribution\nbecome effective for each Contribution on the date the Contributor first\ndistributes such Contribution.\n\n2.3. Limitations on Grant Scope\n\nThe licenses granted in this Section 2 are the only rights granted under\nthis License. 
No additional rights or licenses will be implied from the\ndistribution or licensing of Covered Software under this License.\nNotwithstanding Section 2.1(b) above, no patent license is granted by a\nContributor:\n\n(a) for any code that a Contributor has removed from Covered Software;\n    or\n\n(b) for infringements caused by: (i) Your and any other third party&#x27;s\n    modifications of Covered Software, or (ii) the combination of its\n    Contributions with other software (except as part of its Contributor\n    Version); or\n\n(c) under Patent Claims infringed by Covered Software in the absence of\n    its Contributions.\n\nThis License does not grant any rights in the trademarks, service marks,\nor logos of any Contributor (except as may be necessary to comply with\nthe notice requirements in Section 3.4).\n\n2.4. Subsequent Licenses\n\nNo Contributor makes additional grants as a result of Your choice to\ndistribute the Covered Software under a subsequent version of this\nLicense (see Section 10.2) or under the terms of a Secondary License (if\npermitted under the terms of Section 3.3).\n\n2.5. Representation\n\nEach Contributor represents that the Contributor believes its\nContributions are its original creation(s) or it has sufficient rights\nto grant the rights to its Contributions conveyed by this License.\n\n2.6. Fair Use\n\nThis License is not intended to limit any rights You have under\napplicable copyright doctrines of fair use, fair dealing, or other\nequivalents.\n\n2.7. Conditions\n\nSections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted\nin Section 2.1.\n\n3. Responsibilities\n-------------------\n\n3.1. Distribution of Source Form\n\nAll distribution of Covered Software in Source Code Form, including any\nModifications that You create or to which You contribute, must be under\nthe terms of this License. 
You must inform recipients that the Source\nCode Form of the Covered Software is governed by the terms of this\nLicense, and how they can obtain a copy of this License. You may not\nattempt to alter or restrict the recipients&#x27; rights in the Source Code\nForm.\n\n3.2. Distribution of Executable Form\n\nIf You distribute Covered Software in Executable Form then:\n\n(a) such Covered Software must also be made available in Source Code\n    Form, as described in Section 3.1, and You must inform recipients of\n    the Executable Form how they can obtain a copy of such Source Code\n    Form by reasonable means in a timely manner, at a charge no more\n    than the cost of distribution to the recipient; and\n\n(b) You may distribute such Executable Form under the terms of this\n    License, or sublicense it under different terms, provided that the\n    license for the Executable Form does not attempt to limit or alter\n    the recipients&#x27; rights in the Source Code Form under this License.\n\n3.3. Distribution of a Larger Work\n\nYou may create and distribute a Larger Work under terms of Your choice,\nprovided that You also comply with the requirements of this License for\nthe Covered Software. If the Larger Work is a combination of Covered\nSoftware with a work governed by one or more Secondary Licenses, and the\nCovered Software is not Incompatible With Secondary Licenses, this\nLicense permits You to additionally distribute such Covered Software\nunder the terms of such Secondary License(s), so that the recipient of\nthe Larger Work may, at their option, further distribute the Covered\nSoftware under the terms of either this License or such Secondary\nLicense(s).\n\n3.4. 
Notices\n\nYou may not remove or alter the substance of any license notices\n(including copyright notices, patent notices, disclaimers of warranty,\nor limitations of liability) contained within the Source Code Form of\nthe Covered Software, except that You may alter any license notices to\nthe extent required to remedy known factual inaccuracies.\n\n3.5. Application of Additional Terms\n\nYou may choose to offer, and to charge a fee for, warranty, support,\nindemnity or liability obligations to one or more recipients of Covered\nSoftware. However, You may do so only on Your own behalf, and not on\nbehalf of any Contributor. You must make it absolutely clear that any\nsuch warranty, support, indemnity, or liability obligation is offered by\nYou alone, and You hereby agree to indemnify every Contributor for any\nliability incurred by such Contributor as a result of warranty, support,\nindemnity or liability terms You offer. You may include additional\ndisclaimers of warranty and limitations of liability specific to any\njurisdiction.\n\n4. Inability to Comply Due to Statute or Regulation\n---------------------------------------------------\n\nIf it is impossible for You to comply with any of the terms of this\nLicense with respect to some or all of the Covered Software due to\nstatute, judicial order, or regulation then You must: (a) comply with\nthe terms of this License to the maximum extent possible; and (b)\ndescribe the limitations and the code they affect. Such description must\nbe placed in a text file included with all distributions of the Covered\nSoftware under this License. Except to the extent prohibited by statute\nor regulation, such description must be sufficiently detailed for a\nrecipient of ordinary skill to be able to understand it.\n\n5. Termination\n--------------\n\n5.1. The rights granted under this License will terminate automatically\nif You fail to comply with any of its terms. 
However, if You become\ncompliant, then the rights granted under this License from a particular\nContributor are reinstated (a) provisionally, unless and until such\nContributor explicitly and finally terminates Your grants, and (b) on an\nongoing basis, if such Contributor fails to notify You of the\nnon-compliance by some reasonable means prior to 60 days after You have\ncome back into compliance. Moreover, Your grants from a particular\nContributor are reinstated on an ongoing basis if such Contributor\nnotifies You of the non-compliance by some reasonable means, this is the\nfirst time You have received notice of non-compliance with this License\nfrom such Contributor, and You become compliant prior to 30 days after\nYour receipt of the notice.\n\n5.2. If You initiate litigation against any entity by asserting a patent\ninfringement claim (excluding declaratory judgment actions,\ncounter-claims, and cross-claims) alleging that a Contributor Version\ndirectly or indirectly infringes any patent, then the rights granted to\nYou by any and all Contributors for the Covered Software under Section\n2.1 of this License shall terminate.\n\n5.3. In the event of termination under Sections 5.1 or 5.2 above, all\nend user license agreements (excluding distributors and resellers) which\nhave been validly granted by You or Your distributors under this License\nprior to termination shall survive termination.\n\n************************************************************************\n*                                                                      *\n*  6. 
Disclaimer of Warranty                                           *\n*  -------------------------                                           *\n*                                                                      *\n*  Covered Software is provided under this License on an &quot;as is&quot;       *\n*  basis, without warranty of any kind, either expressed, implied, or  *\n*  statutory, including, without limitation, warranties that the       *\n*  Covered Software is free of defects, merchantable, fit for a        *\n*  particular purpose or non-infringing. The entire risk as to the     *\n*  quality and performance of the Covered Software is with You.        *\n*  Should any Covered Software prove defective in any respect, You     *\n*  (not any Contributor) assume the cost of any necessary servicing,   *\n*  repair, or correction. This disclaimer of warranty constitutes an   *\n*  essential part of this License. No use of any Covered Software is   *\n*  authorized under this License except under this disclaimer.         *\n*                                                                      *\n************************************************************************\n\n************************************************************************\n*                                                                      *\n*  7. 
Limitation of Liability                                          *\n*  --------------------------                                          *\n*                                                                      *\n*  Under no circumstances and under no legal theory, whether tort      *\n*  (including negligence), contract, or otherwise, shall any           *\n*  Contributor, or anyone who distributes Covered Software as          *\n*  permitted above, be liable to You for any direct, indirect,         *\n*  special, incidental, or consequential damages of any character      *\n*  including, without limitation, damages for lost profits, loss of    *\n*  goodwill, work stoppage, computer failure or malfunction, or any    *\n*  and all other commercial damages or losses, even if such party      *\n*  shall have been informed of the possibility of such damages. This   *\n*  limitation of liability shall not apply to liability for death or   *\n*  personal injury resulting from such party&#x27;s negligence to the       *\n*  extent applicable law prohibits such limitation. Some               *\n*  jurisdictions do not allow the exclusion or limitation of           *\n*  incidental or consequential damages, so this exclusion and          *\n*  limitation may not apply to You.                                    *\n*                                                                      *\n************************************************************************\n\n8. Litigation\n-------------\n\nAny litigation relating to this License may be brought only in the\ncourts of a jurisdiction where the defendant maintains its principal\nplace of business and such litigation shall be governed by laws of that\njurisdiction, without reference to its conflict-of-law provisions.\nNothing in this Section shall prevent a party&#x27;s ability to bring\ncross-claims or counter-claims.\n\n9. 
Miscellaneous\n----------------\n\nThis License represents the complete agreement concerning the subject\nmatter hereof. If any provision of this License is held to be\nunenforceable, such provision shall be reformed only to the extent\nnecessary to make it enforceable. Any law or regulation which provides\nthat the language of a contract shall be construed against the drafter\nshall not be used to construe this License against a Contributor.\n\n10. Versions of the License\n---------------------------\n\n10.1. New Versions\n\nMozilla Foundation is the license steward. Except as provided in Section\n10.3, no one other than the license steward has the right to modify or\npublish new versions of this License. Each version will be given a\ndistinguishing version number.\n\n10.2. Effect of New Versions\n\nYou may distribute the Covered Software under the terms of the version\nof the License under which You originally received the Covered Software,\nor under the terms of any subsequent version published by the license\nsteward.\n\n10.3. Modified Versions\n\nIf you create software not governed by this License, and you want to\ncreate a new license for such software, you may create and use a\nmodified version of this License if you rename the license and remove\nany references to the name of the license steward (except to note that\nsuch modified license differs from this License).\n\n10.4. Distributing Source Code Form that is Incompatible With Secondary\nLicenses\n\nIf You choose to distribute Source Code Form that is Incompatible With\nSecondary Licenses under the terms of this version of the License, the\nnotice described in Exhibit B of this License must be attached.\n\nExhibit A - Source Code Form License Notice\n-------------------------------------------\n\n  This Source Code Form is subject to the terms of the Mozilla Public\n  License, v. 2.0. 
If a copy of the MPL was not distributed with this\n  file, You can obtain one at https://mozilla.org/MPL/2.0/.\n\nIf it is not possible or desirable to put the notice in a particular\nfile, then You may include the notice in a location (such as a LICENSE\nfile in a relevant directory) where a recipient would be likely to look\nfor such a notice.\n\nYou may add additional accurate notices of copyright ownership.\n\nExhibit B - &quot;Incompatible With Secondary Licenses&quot; Notice\n---------------------------------------------------------\n\n  This Source Code Form is &quot;Incompatible With Secondary Licenses&quot;, as\n  defined by the Mozilla Public License, v. 2.0.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"OpenSSL\">OpenSSL License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/briansmith/ring \">ring 0.17.8</a></li>\n                </ul>\n                <pre class=\"license-text\">/* &#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;\n * Copyright (c) 1998-2011 The OpenSSL Project.  All rights reserved.\n *\n * Redistribution and use in source and binary forms, with or without\n * modification, are permitted provided that the following conditions\n * are met:\n *\n * 1. Redistributions of source code must retain the above copyright\n *    notice, this list of conditions and the following disclaimer. \n *\n * 2. 
Redistributions in binary form must reproduce the above copyright\n *    notice, this list of conditions and the following disclaimer in\n *    the documentation and/or other materials provided with the\n *    distribution.\n *\n * 3. All advertising materials mentioning features or use of this\n *    software must display the following acknowledgment:\n *    &quot;This product includes software developed by the OpenSSL Project\n *    for use in the OpenSSL Toolkit. (http://www.openssl.org/)&quot;\n *\n * 4. The names &quot;OpenSSL Toolkit&quot; and &quot;OpenSSL Project&quot; must not be used to\n *    endorse or promote products derived from this software without\n *    prior written permission. For written permission, please contact\n *    openssl-core@openssl.org.\n *\n * 5. Products derived from this software may not be called &quot;OpenSSL&quot;\n *    nor may &quot;OpenSSL&quot; appear in their names without prior written\n *    permission of the OpenSSL Project.\n *\n * 6. Redistributions of any form whatsoever must retain the following\n *    acknowledgment:\n *    &quot;This product includes software developed by the OpenSSL Project\n *    for use in the OpenSSL Toolkit (http://www.openssl.org/)&quot;\n *\n * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT &#x60;&#x60;AS IS&#x27;&#x27; AND ANY\n * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\n * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR\n * PURPOSE ARE DISCLAIMED.  
IN NO EVENT SHALL THE OpenSSL PROJECT OR\n * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)\n * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,\n * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)\n * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED\n * OF THE POSSIBILITY OF SUCH DAMAGE.\n * &#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;\n *\n * This product includes cryptographic software written by Eric Young\n * (eay@cryptsoft.com).  This product includes software written by Tim\n * Hudson (tjh@cryptsoft.com).\n *\n */</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Unicode-DFS-2016\">Unicode License Agreement - Data Files and Software (2016)</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/dtolnay/unicode-ident \">unicode-ident 1.0.13</a></li>\n                </ul>\n                <pre class=\"license-text\">UNICODE, INC. 
LICENSE AGREEMENT - DATA FILES AND SOFTWARE\n\nSee Terms of Use &lt;https://www.unicode.org/copyright.html&gt;\nfor definitions of Unicode Inc.’s Data Files and Software.\n\nNOTICE TO USER: Carefully read the following legal agreement.\nBY DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING UNICODE INC.&#x27;S\nDATA FILES (&quot;DATA FILES&quot;), AND/OR SOFTWARE (&quot;SOFTWARE&quot;),\nYOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE\nTERMS AND CONDITIONS OF THIS AGREEMENT.\nIF YOU DO NOT AGREE, DO NOT DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE\nTHE DATA FILES OR SOFTWARE.\n\nCOPYRIGHT AND PERMISSION NOTICE\n\nCopyright © 1991-2022 Unicode, Inc. All rights reserved.\nDistributed under the Terms of Use in https://www.unicode.org/copyright.html.\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of the Unicode data files and any associated documentation\n(the &quot;Data Files&quot;) or Unicode software and any associated documentation\n(the &quot;Software&quot;) to deal in the Data Files or Software\nwithout restriction, including without limitation the rights to use,\ncopy, modify, merge, publish, distribute, and/or sell copies of\nthe Data Files or Software, and to permit persons to whom the Data Files\nor Software are furnished to do so, provided that either\n(a) this copyright and permission notice appear with all copies\nof the Data Files or Software, or\n(b) this copyright and permission notice appear in associated\nDocumentation.\n\nTHE DATA FILES AND SOFTWARE ARE PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF\nANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE\nWARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT OF THIRD PARTY RIGHTS.\nIN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS\nNOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL\nDAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,\nDATA OR PROFITS, WHETHER IN AN 
ACTION OF CONTRACT, NEGLIGENCE OR OTHER\nTORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR\nPERFORMANCE OF THE DATA FILES OR SOFTWARE.\n\nExcept as contained in this notice, the name of a copyright holder\nshall not be used in advertising or otherwise to promote the sale,\nuse or other dealings in these Data Files or Software without prior\nwritten authorization of the copyright holder.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Zlib\">zlib License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/orlp/foldhash \">foldhash 0.1.3</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2024 Orson Peters\n\nThis software is provided &#x27;as-is&#x27;, without any express or implied warranty. In\nno event will the authors be held liable for any damages arising from the use of\nthis software.\n\nPermission is granted to anyone to use this software for any purpose, including\ncommercial applications, and to alter it and redistribute it freely, subject to\nthe following restrictions:\n\n1. The origin of this software must not be misrepresented; you must not claim\n    that you wrote the original software. If you use this software in a product,\n    an acknowledgment in the product documentation would be appreciated but is\n    not required.\n\n2. Altered source versions must be plainly marked as such, and must not be\n    misrepresented as being the original software.\n\n3. 
This notice may not be removed or altered from any source distribution.</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"Zlib\">zlib License</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/Absolucy/nanorand-rs \">nanorand 0.7.0</a></li>\n                </ul>\n                <pre class=\"license-text\">The zlib/libpng License\n&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;\n\nCopyright (c) 2021 lucy\n\nThis software is provided &#x27;as-is&#x27;, without any express or implied warranty. In\nno event will the authors be held liable for any damages arising from the use of\nthis software.\n\nPermission is granted to anyone to use this software for any purpose, including\ncommercial applications, and to alter it and redistribute it freely, subject to\nthe following restrictions:\n\n1.  The origin of this software must not be misrepresented; you must not claim\n    that you wrote the original software. If you use this software in a product,\n    an acknowledgment in the product documentation would be appreciated but is\n    not required.\n\n2.  Altered source versions must be plainly marked as such, and must not be\n    misrepresented as being the original software.\n\n3.  This notice may not be removed or altered from any source distribution.\n</pre>\n            </li>\n            <li class=\"license\">\n                <h3 id=\"zlib-acknowledgement\">zlib/libpng License with Acknowledgement</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    <li><a href=\" https://github.com/fulmicoton/fastdivide \">fastdivide 0.4.1</a></li>\n                </ul>\n                <pre class=\"license-text\">Copyright (c) 2002-2007 Charlie Poole\nCopyright (c) 2002-2004 James W. Newkirk, Michael C. 
Two, Alexei A. Vorontsov\nCopyright (c) 2000-2002 Philip A. Craig\n\nThis software is provided &#x27;as-is&#x27;, without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.\n\nPermission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:\n\n1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment (see the following) in the product documentation is required.\n\n     Portions Copyright (c) 2002-2007 Charlie Poole or Copyright (c) 2002-2004 James W. Newkirk, Michael C. Two, Alexei A. Vorontsov or Copyright (c) 2000-2002 Philip A. Craig\n\n2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.\n\n3. This notice may not be removed or altered from any source distribution.\n</pre>\n            </li>\n        </ul>\n    </main>\n</body>\n\n</html>\n"
  },
  {
    "path": "quickwit/license-tool.toml",
    "content": "[overrides]\n# rust-license-tool can't find the license for crunchy-0.2.2, but it's MIT\n# according to the crate's Cargo.toml.\n\"crunchy-0.2.2\" = { license = \"MIT\", origin = \"https://github.com/eira-fransham/crunchy\" }\n\n# `ring` has a custom license that is mostly \"ISC-style\" but parts of it also fall under OpenSSL licensing.\n\"ring-0.17.8\" = { license = \"ISC AND Custom\" }\n"
  },
  {
    "path": "quickwit/quickwit-actors/Cargo.toml",
    "content": "[package]\nname = \"quickwit-actors\"\ndescription = \"Actor framework powering Quickwit services\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nflume = { workspace = true }\nfutures = { workspace = true }\nonce_cell = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nsync_wrapper = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-common = { workspace = true }\n\n[features]\ntestsuite = []\n\n[dev-dependencies]\nrand = { workspace = true }\ncriterion = { workspace = true }\n\n[[bench]]\nname = \"bench\"\nharness = false\n"
  },
  {
    "path": "quickwit/quickwit-actors/LICENSE",
    "content": "The files under the quickwit-actors/ subdirectory are published under the MIT license.\n\nCopyright (c) 2023 by Quickwit Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n"
  },
  {
    "path": "quickwit/quickwit-actors/README.md",
    "content": "# Quickwit actors\n\nYet another actor crate for rust.\nThis crate exists specifically to answer quickwit needs.\nThe API may change in the future.\n\n## Objective\n\n- Producing easy-to-reason with code: Quickwit's indexing pipeline is complex as it is.\n- Easy to test actors.\n- Control over the runtime.\n\n## Non-objective\n\n- High number of message throughput. Most of message exchanged in quickwit\nare \"large\". For instance, it can hold a temp directory with gigabytes worth of data.\nThe actor dealing with the highest number of messages are the indexer and sources.\nOne message then typically holds a batch of records.\n\n# Features\n\n- Actor message box\n- The framework is meant to run asynchronous actors by default, but it can also run actors that are blocking for long amount of time. The message handler methods are technically asynchronous in both case, but the `Actor::runner` method makes it possible to run an actor with blocking code on a dedicated thread.\n- A scheduler actor that makes it possible to mock simulate time.\n\n# Example\n\n```rust\nuse std::time::Duration;\nuse async_trait::async_trait;\nuse quickwit_actors::{Handler, Actor, Universe, ActorContext, ActorExitStatus, Mailbox};\n\n#[derive(Default)]\nstruct PingReceiver;\n\nimpl Actor for PingReceiver {\n    type ObservableState = ();\n    fn observable_state(&self) -> Self::ObservableState {}\n}\n\n#[async_trait]\nimpl Handler<Ping> for PingReceiver {\n    type Reply = String;\n    async fn handle(\n        &mut self,\n        _msg: Ping,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<String, ActorExitStatus> {\n        Ok(\"Pong\".to_string())\n    }\n}\n\nstruct PingSender {\n    peer: Mailbox<PingReceiver>,\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n#[derive(Debug)]\nstruct Ping;\n\n#[async_trait]\nimpl Actor for PingSender {\n    type ObservableState = ();\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    async fn initialize(&mut self, ctx: 
&ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        ctx.send_self_message(Loop).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Loop> for PingSender {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: Loop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let reply_msg = ctx.ask(&self.peer, Ping).await.unwrap();\n        println!(\"{reply_msg}\");\n        ctx.schedule_self_msg(Duration::from_secs(1), Loop).await;\n        Ok(())\n    }\n}\n\n#[tokio::main]\nasync fn main() {\n    let universe = Universe::new();\n\n    let (recv_mailbox, _) = universe.spawn_builder().spawn(PingReceiver::default());\n\n    let ping_sender = PingSender { peer: recv_mailbox };\n    let (_, ping_sender_handler) = universe.spawn_builder().spawn(ping_sender);\n\n    ping_sender_handler.join().await;\n}\n```\n"
  },
  {
    "path": "quickwit/quickwit-actors/benches/bench.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\nuse criterion::{BenchmarkId, Criterion, criterion_group, criterion_main};\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Universe};\n\n#[derive(Default)]\nstruct DoNothingActor<const YIELD_AFTER_EACH_MESSAGE: bool>(u64);\n\n#[async_trait]\nimpl<const YIELD_AFTER_EACH_MESSAGE: bool> Actor for DoNothingActor<YIELD_AFTER_EACH_MESSAGE> {\n    type ObservableState = u64;\n\n    fn observable_state(&self) -> u64 {\n        self.0\n    }\n\n    fn yield_after_each_message(&self) -> bool {\n        YIELD_AFTER_EACH_MESSAGE\n    }\n}\n\n#[derive(Debug)]\nstruct AddMessage(u64);\n\n#[async_trait]\nimpl<const YIELD_AFTER_EACH_MESSAGE: bool> Handler<AddMessage>\n    for DoNothingActor<YIELD_AFTER_EACH_MESSAGE>\n{\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        msg: AddMessage,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.0 += msg.0;\n        Ok(())\n    }\n}\n\nasync fn actor_bench_code<const YIELD_AFTER_EACH_MESSAGE: bool>(num_messages: usize) {\n    let universe = Universe::default();\n    let actor: DoNothingActor<YIELD_AFTER_EACH_MESSAGE> = DoNothingActor::default();\n    let (mailbox, handle) = universe.spawn_builder().spawn(actor);\n    for _ in 0..num_messages {\n        mailbox.send_message(AddMessage(1)).await.unwrap();\n   
 }\n    drop(mailbox);\n    let (_, total) = handle.join().await;\n    assert_eq!(total, num_messages as u64);\n}\n\nasync fn flume_bench_code(num_messages: usize) {\n    let (tx, rx) = flume::unbounded::<AddMessage>();\n    for _ in 0..num_messages {\n        tx.send_async(AddMessage(1)).await.unwrap();\n    }\n    let join = tokio::task::spawn(async move {\n        let mut sum = 0;\n        while rx.recv_async().await.is_ok() {\n            sum += 1;\n        }\n        sum\n    });\n    drop(tx);\n    let total = join.await.unwrap();\n    assert_eq!(total, num_messages as u64);\n}\n\nasync fn chan_with_priority_bench_code(num_messages: usize) {\n    let (tx, rx) =\n        quickwit_actors::channel_with_priority::channel(quickwit_actors::QueueCapacity::Unbounded);\n    for _ in 0..num_messages {\n        tx.send_low_priority(AddMessage(1)).await.unwrap();\n    }\n    let join = tokio::task::spawn(async move {\n        let mut sum = 0;\n        while rx.recv().await.is_ok() {\n            sum += 1;\n        }\n        sum\n    });\n    drop(tx);\n    let total = join.await.unwrap();\n    assert_eq!(total, num_messages as u64);\n}\n\nfn message_throughput(c: &mut Criterion) {\n    let num_messages = [10_000]; // [1, 1_000, 10_000]\n    for num_messages in num_messages {\n        c.bench_with_input(\n            BenchmarkId::new(\n                \"unlimited_capacity_actors_yield_after_each_message\",\n                num_messages,\n            ),\n            &num_messages,\n            |b, &num_messages| {\n                // Insert a call to `to_async` to convert the bencher to async mode.\n                // The timing loops are the same as with the normal bencher.\n                let runtime = tokio::runtime::Builder::new_multi_thread()\n                    .enable_all()\n                    .build()\n                    .unwrap();\n                b.to_async(runtime)\n                    .iter(|| actor_bench_code::<true>(num_messages));\n            },\n      
  );\n        c.bench_with_input(\n            BenchmarkId::new(\n                \"unlimited_capacity_actors_no_yield_after_each_message\",\n                num_messages,\n            ),\n            &num_messages,\n            |b, &num_messages| {\n                // Insert a call to `to_async` to convert the bencher to async mode.\n                // The timing loops are the same as with the normal bencher.\n                let runtime = tokio::runtime::Builder::new_multi_thread()\n                    .enable_all()\n                    .build()\n                    .unwrap();\n                b.to_async(runtime)\n                    .iter(|| actor_bench_code::<false>(num_messages));\n            },\n        );\n        c.bench_with_input(\n            BenchmarkId::new(\"unlimited_capacity_flume\", num_messages),\n            &num_messages,\n            |b, &num_messages| {\n                // Insert a call to `to_async` to convert the bencher to async mode.\n                // The timing loops are the same as with the normal bencher.\n                let runtime = tokio::runtime::Builder::new_multi_thread()\n                    .enable_all()\n                    .build()\n                    .unwrap();\n                b.to_async(runtime).iter(|| flume_bench_code(num_messages));\n            },\n        );\n        c.bench_with_input(\n            BenchmarkId::new(\"unlimited_capacity_chan_with_priority\", num_messages),\n            &num_messages,\n            |b, &num_messages| {\n                // Insert a call to `to_async` to convert the bencher to async mode.\n                // The timing loops are the same as with the normal bencher.\n                let runtime = tokio::runtime::Builder::new_multi_thread()\n                    .enable_all()\n                    .build()\n                    .unwrap();\n                b.to_async(runtime)\n                    .iter(|| chan_with_priority_bench_code(num_messages));\n            },\n        );\n    
}\n}\n\ncriterion_group!(benches, message_throughput);\ncriterion_main!(benches);\n"
  },
  {
    "path": "quickwit/quickwit-actors/examples/ping_actor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, Universe};\nuse rand::prelude::IteratorRandom;\n\nstruct PingReceiver {\n    name: &'static str,\n    num_ping_received: usize,\n}\n\nimpl PingReceiver {\n    pub fn with_name(name: &'static str) -> Self {\n        PingReceiver {\n            name,\n            num_ping_received: 0,\n        }\n    }\n}\n\nimpl Actor for PingReceiver {\n    type ObservableState = usize;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.num_ping_received\n    }\n}\n\n#[async_trait]\nimpl Handler<Ping> for PingReceiver {\n    type Reply = String;\n    async fn handle(\n        &mut self,\n        _msg: Ping,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<String, ActorExitStatus> {\n        self.num_ping_received += 1;\n        Ok(format!(\n            \"Actor `{}` received {} pings\",\n            self.name, self.num_ping_received\n        ))\n    }\n}\n\n// ------------------\n\n#[derive(Default)]\nstruct PingSender {\n    peers: Vec<Mailbox<PingReceiver>>,\n    num_ping_emitted: usize,\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n#[derive(Debug)]\nstruct Ping;\n\n#[derive(Debug)]\npub struct AddPeer(Mailbox<PingReceiver>);\n\n#[async_trait]\nimpl Actor for PingSender {\n    type ObservableState = 
();\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        ctx.send_self_message(Loop).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Loop> for PingSender {\n    type Reply = ();\n\n    async fn handle(&mut self, _: Loop, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        let random_peer_id_opt = (0..self.peers.len()).choose(&mut rand::rng());\n        if let Some(random_peer_id) = random_peer_id_opt {\n            match ctx.ask(&self.peers[random_peer_id], Ping).await {\n                Ok(reply_msg) => {\n                    println!(\"{reply_msg}\");\n                }\n                Err(_send_error) => {\n                    self.peers.swap_remove(random_peer_id);\n                }\n            }\n        }\n        self.num_ping_emitted += 1;\n        if self.num_ping_emitted == 10 {\n            return Err(ActorExitStatus::Success);\n        }\n        ctx.schedule_self_msg(Duration::from_secs(1), Loop);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<AddPeer> for PingSender {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        add_peer_msg: AddPeer,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let AddPeer(peer_mailbox) = add_peer_msg;\n        self.peers.push(peer_mailbox);\n        Ok(())\n    }\n}\n\n#[tokio::main]\nasync fn main() {\n    let universe = Universe::new();\n\n    let (roger_mailbox, _) = universe\n        .spawn_builder()\n        .spawn(PingReceiver::with_name(\"Roger\"));\n\n    let (myriam_mailbox, _) = universe\n        .spawn_builder()\n        .spawn(PingReceiver::with_name(\"Myriam\"));\n\n    let (ping_sender_mailbox, ping_sender_handler) =\n        universe.spawn_builder().spawn(PingSender::default());\n\n    ping_sender_mailbox\n        .send_message(AddPeer(roger_mailbox))\n        .await\n        .unwrap();\n    
ping_sender_mailbox\n        .send_message(AddPeer(myriam_mailbox))\n        .await\n        .unwrap();\n\n    ping_sender_handler.join().await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/actor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::type_name;\nuse std::fmt;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse thiserror::Error;\n\nuse crate::{ActorContext, QueueCapacity, SendError};\n\n/// The actor exit status represents the outcome of the execution of an actor,\n/// after the end of the execution.\n///\n/// It is in many ways, similar to the exit status code of a program.\n#[derive(Clone, Debug, Error)]\npub enum ActorExitStatus {\n    /// The actor successfully exited.\n    ///\n    /// It happens either because:\n    /// - all of the existing mailboxes were dropped and the actor message queue was exhausted. No\n    ///   new message could ever arrive to the actor. (This exit is triggered by the framework.) or\n    /// - the actor `process_message` method returned `Err(ExitStatusCode::Success)`. 
(This exit is\n    ///   triggered by the actor implementer.)\n    ///\n    /// (This is equivalent to exit status code 0.)\n    /// Note that this is not really an error.\n    #[error(\"success\")]\n    Success,\n\n    /// The actor was asked to gracefully shut down.\n    ///\n    /// (Semantically equivalent to exit status code 130, triggered by SIGINT aka Ctrl-C, or\n    /// SIGQUIT)\n    #[error(\"quit\")]\n    Quit,\n\n    /// The actor tried to send a message to a downstream actor and failed.\n    /// The logic ruled that the actor should be killed.\n    ///\n    /// (Semantically equivalent to exit status code 141, triggered by SIGPIPE)\n    #[error(\"downstream actor exited\")]\n    DownstreamClosed,\n\n    /// The actor was killed.\n    ///\n    /// It can happen because:\n    /// - it received `Command::Kill`.\n    /// - its kill switch was activated.\n    ///\n    /// (Semantically equivalent to exit status code 137, triggered by SIGKILL)\n    #[error(\"killed\")]\n    Killed,\n\n    /// An unexpected error happened while processing a message.\n    #[error(\"failure(cause={0:?})\")]\n    Failure(Arc<anyhow::Error>),\n\n    /// The thread or the task executing the actor loop panicked.\n    #[error(\"panicked\")]\n    Panicked,\n}\n\nimpl From<anyhow::Error> for ActorExitStatus {\n    fn from(err: anyhow::Error) -> Self {\n        ActorExitStatus::Failure(Arc::new(err))\n    }\n}\n\nimpl ActorExitStatus {\n    pub fn is_success(&self) -> bool {\n        matches!(self, ActorExitStatus::Success)\n    }\n}\n\nimpl From<SendError> for ActorExitStatus {\n    fn from(_: SendError) -> Self {\n        ActorExitStatus::DownstreamClosed\n    }\n}\n\n/// An actor has an internal state and processes a stream of messages.\n/// Each actor has a mailbox where the messages are enqueued before being processed.\n///\n/// While processing a message, the actor typically\n/// - updates its state;\n/// - emits one or more messages to other actors.\n#[async_trait]\npub trait Actor: 
Send + Sized + 'static {\n    /// Piece of state that can be copied for assert in unit test, admin, etc.\n    type ObservableState: fmt::Debug + serde::Serialize + Send + Sync + Clone;\n    /// A name identifying the type of actor.\n    ///\n    /// Ideally respect the `CamelCase` convention.\n    ///\n    /// It does not need to be \"instance-unique\", and can be the name of\n    /// the actor implementation.\n    fn name(&self) -> String {\n        type_name::<Self>().to_string()\n    }\n\n    /// The runner method makes it possible to decide the environment\n    /// of execution of the Actor.\n    ///\n    /// Actor with a handler that may block for more than 50 microseconds\n    /// should use the `ActorRunner::DedicatedThread`.\n    fn runtime_handle(&self) -> tokio::runtime::Handle {\n        tokio::runtime::Handle::current()\n    }\n\n    /// If set to true, the actor will yield after every single\n    /// message.\n    ///\n    /// For actors that are calling `.await` regularly,\n    /// returning `false` can yield better performance.\n    fn yield_after_each_message(&self) -> bool {\n        true\n    }\n\n    /// The Actor's incoming mailbox queue capacity. It is set when the actor is spawned.\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Unbounded\n    }\n\n    /// Extracts an observable state. Useful for unit tests, and admin UI.\n    ///\n    /// This function should return quickly.\n    fn observable_state(&self) -> Self::ObservableState;\n\n    /// Initialize is called before running the actor.\n    ///\n    /// This function is useful for instance to schedule an initial message in a looping\n    /// actor.\n    ///\n    /// It can be compared just to an implicit Initial message.\n    ///\n    /// Returning an ActorExitStatus will therefore have the same effect as if it\n    /// was in `process_message` (e.g. 
the actor will stop, the finalize method will be called,\n    /// the kill switch may be activated etc.)\n    async fn initialize(&mut self, _ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        Ok(())\n    }\n\n    /// This function is called after a series of one or several messages have been processed and\n    /// no more messages are available.\n    ///\n    /// It is a great place to have the actor \"sleep\".\n    ///\n    /// Quickwit's Indexer actor for instance uses `on_drained_messages` to\n    /// schedule indexing in such a way that an indexer drains all of its\n    /// available messages and sleeps for some amount of time.\n    async fn on_drained_messages(\n        &mut self,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        Ok(())\n    }\n\n    /// Hook that can be set up to define what should happen upon actor exit.\n    /// This hook is called only once.\n    ///\n    /// It is always called regardless of the reason why the actor exited.\n    /// The exit status is passed as an argument to make it possible to act conditionally\n    /// upon it.\n    /// For instance, it is often better to do as little work as possible on a killed actor.\n    /// It can be done by checking the `exit_status` and performing an early-exit if it is\n    /// equal to `ActorExitStatus::Killed`.\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        _ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        Ok(())\n    }\n}\n\n/// Message handler that allows actor to defer the reply\n#[async_trait::async_trait]\npub trait DeferableReplyHandler<M>: Actor {\n    type Reply: Send + 'static;\n\n    async fn handle_message(\n        &mut self,\n        message: M,\n        reply: impl FnOnce(Self::Reply) + Send + Sync + 'static,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus>\n    where\n        M: Send + 'static;\n}\n\n/// Message handler that requires actor 
to provide immediate response\n#[async_trait::async_trait]\npub trait Handler<M>: Actor {\n    type Reply: Send + 'static;\n\n    /// Processes a message.\n    ///\n    /// If an exit status is returned as an error, the actor will exit.\n    /// It will stop processing messages, the finalize method will be called,\n    /// and its exit status will be the one defined in the error.\n    async fn handle(\n        &mut self,\n        message: M,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus>;\n}\n\n#[async_trait::async_trait]\nimpl<H, M> DeferableReplyHandler<M> for H\nwhere H: Handler<M>\n{\n    type Reply = H::Reply;\n\n    async fn handle_message(\n        &mut self,\n        message: M,\n        reply: impl FnOnce(Self::Reply) + Send + 'static,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus>\n    where\n        M: Send + 'static,\n    {\n        self.handle(message, ctx).await.map(reply)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/actor_context.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::Infallible;\nuse std::fmt;\nuse std::future::Future;\nuse std::ops::Deref;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicBool, Ordering};\nuse std::time::Duration;\n\nuse quickwit_common::metrics::IntCounter;\nuse quickwit_common::{KillSwitch, Progress, ProtectedZoneGuard};\nuse tokio::sync::{oneshot, watch};\nuse tracing::{debug, error};\n\n#[cfg(any(test, feature = \"testsuite\"))]\nuse crate::Universe;\nuse crate::actor_state::AtomicState;\nuse crate::registry::ActorRegistry;\nuse crate::spawn_builder::{SpawnBuilder, SpawnContext};\nuse crate::{\n    Actor, ActorExitStatus, ActorState, AskError, Command, DeferableReplyHandler, Mailbox,\n    SendError, TrySendError,\n};\n\n// TODO hide all of this public stuff\npub struct ActorContext<A: Actor> {\n    inner: Arc<ActorContextInner<A>>,\n}\n\nimpl<A: Actor> Clone for ActorContext<A> {\n    fn clone(&self) -> Self {\n        ActorContext {\n            inner: self.inner.clone(),\n        }\n    }\n}\n\nimpl<A: Actor> Deref for ActorContext<A> {\n    type Target = ActorContextInner<A>;\n\n    fn deref(&self) -> &Self::Target {\n        self.inner.as_ref()\n    }\n}\n\npub struct ActorContextInner<A: Actor> {\n    spawn_ctx: SpawnContext,\n    self_mailbox: Mailbox<A>,\n    progress: Progress,\n    actor_state: AtomicState,\n    backpressure_micros_counter_opt: 
Option<IntCounter>,\n    observable_state_tx: watch::Sender<A::ObservableState>,\n    // Boolean marking the presence of an observe message in the actor's high priority queue.\n    observe_enqueued: AtomicBool,\n}\n\nimpl<A: Actor> ActorContext<A> {\n    pub(crate) fn new(\n        self_mailbox: Mailbox<A>,\n        spawn_ctx: SpawnContext,\n        observable_state_tx: watch::Sender<A::ObservableState>,\n        backpressure_micros_counter_opt: Option<IntCounter>,\n    ) -> Self {\n        ActorContext {\n            inner: ActorContextInner {\n                self_mailbox,\n                spawn_ctx,\n                progress: Progress::default(),\n                actor_state: AtomicState::default(),\n                observable_state_tx,\n                backpressure_micros_counter_opt,\n                observe_enqueued: AtomicBool::new(false),\n            }\n            .into(),\n        }\n    }\n\n    pub fn spawn_ctx(&self) -> &SpawnContext {\n        &self.spawn_ctx\n    }\n\n    /// Sleeps for a given amount of time.\n    ///\n    /// That sleep is measured by the universe scheduler, which means that it can be\n    /// shortened if `Universe::simulate_sleep(..)` is used.\n    ///\n    /// While sleeping, an actor is NOT protected from its supervisor.\n    /// It is up to the user to call `ActorContext::protect_future(..)`.\n    pub async fn sleep(&self, duration: Duration) {\n        let scheduler_client = &self.spawn_ctx().scheduler_client;\n        scheduler_client.dec_no_advance_time();\n        scheduler_client.sleep(duration).await;\n        scheduler_client.inc_no_advance_time();\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(\n        universe: &Universe,\n        actor_mailbox: Mailbox<A>,\n        observable_state_tx: watch::Sender<A::ObservableState>,\n    ) -> Self {\n        Self::new(\n            actor_mailbox,\n            universe.spawn_ctx.clone(),\n            observable_state_tx,\n            None,\n        
)\n    }\n\n    pub fn mailbox(&self) -> &Mailbox<A> {\n        &self.self_mailbox\n    }\n\n    pub(crate) fn registry(&self) -> &ActorRegistry {\n        &self.spawn_ctx.registry\n    }\n\n    pub fn actor_instance_id(&self) -> &str {\n        self.mailbox().actor_instance_id()\n    }\n\n    /// This function returns a guard that prevents any supervisor from identifying the\n    /// actor as dead.\n    /// The protection ends when the `ProtectedZoneGuard` is dropped.\n    ///\n    /// In an ideal world, you should never need to call this function.\n    /// It is only useful in some corner cases, like calling a long blocking function\n    /// from an external library that you trust.\n    pub fn protect_zone(&self) -> ProtectedZoneGuard {\n        self.progress.protect_zone()\n    }\n\n    /// Executes a future in a protected zone.\n    pub async fn protect_future<Fut, T>(&self, future: Fut) -> T\n    where Fut: Future<Output = T> {\n        let _guard = self.protect_zone();\n        future.await\n    }\n\n    /// Cooperatively yields, while keeping the actor protected.\n    pub async fn yield_now(&self) {\n        self.protect_future(tokio::task::yield_now()).await;\n    }\n\n    /// Returns a reference to the actor kill switch.\n    /// This should rarely be used.\n    ///\n    /// For instance, when quitting from the process_message function, prefer simply\n    /// returning `Err(ActorExitStatus::Failure(..))`\n    pub fn kill_switch(&self) -> &KillSwitch {\n        &self.spawn_ctx.kill_switch\n    }\n\n    #[must_use]\n    pub fn progress(&self) -> &Progress {\n        &self.progress\n    }\n\n    pub fn spawn_actor<SpawnedActor: Actor>(&self) -> SpawnBuilder<SpawnedActor> {\n        self.spawn_ctx.clone().spawn_builder()\n    }\n\n    /// Records some progress.\n    /// This function is only useful when implementing actors that may take more than\n    /// `HEARTBEAT` to process a single message.\n    /// In that case, you can call this function in the middle of the 
process_message method\n    /// to prevent the actor from being identified as blocked or dead.\n    pub fn record_progress(&self) {\n        self.progress.record_progress();\n    }\n\n    pub(crate) fn state(&self) -> ActorState {\n        self.actor_state.get_state()\n    }\n\n    pub fn pause(&self) {\n        self.actor_state.pause();\n    }\n\n    pub(crate) fn resume(&self) {\n        self.actor_state.resume();\n    }\n\n    /// Sets the queue as observed and returns the previous value.\n    /// This method is used to make sure we do not have Observe messages\n    /// stacking up in the observe queue.\n    pub(crate) fn set_observe_enqueued_and_return_previous(&self) -> bool {\n        self.observe_enqueued.swap(true, Ordering::Relaxed)\n    }\n\n    /// Updates the observable state of the actor.\n    pub fn observe(&self, actor: &mut A) -> A::ObservableState {\n        let obs_state = actor.observable_state();\n        self.inner.observe_enqueued.store(false, Ordering::Relaxed);\n        let _ = self.observable_state_tx.send(obs_state.clone());\n        obs_state\n    }\n\n    pub(crate) fn exit(&self, exit_status: &ActorExitStatus) {\n        self.actor_state.exit(exit_status.is_success());\n        if should_activate_kill_switch(exit_status) {\n            error!(actor=%self.actor_instance_id(), exit_status=?exit_status, \"exit activating-kill-switch\");\n            self.kill_switch().kill();\n        }\n    }\n\n    /// Posts a message in an actor's mailbox.\n    ///\n    /// This method does not wait for the message to be handled by the\n    /// target actor. However, it returns a oneshot receiver that the caller\n    /// can `.await` to obtain the reply.\n    /// If the reply is important, the `.ask(...)` method is\n    /// likely more appropriate.\n    ///\n    /// Dropping the receiver channel will not cancel the\n    /// processing of the message. 
It is a very common usage.\n    /// In fact most actors are expected to send message in a\n    /// fire-and-forget fashion.\n    ///\n    /// Regular messages (as opposed to commands) are queued and guaranteed\n    /// to be processed in FIFO order.\n    ///\n    /// This method hides logic to prevent an actor from being identified\n    /// as frozen if the destination actor channel is saturated, and we\n    /// are simply experiencing back pressure.\n    pub async fn send_message<DestActor, M>(\n        &self,\n        mailbox: &Mailbox<DestActor>,\n        msg: M,\n    ) -> Result<oneshot::Receiver<DestActor::Reply>, SendError>\n    where\n        DestActor: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let _guard = self.protect_zone();\n        debug!(from=%self.self_mailbox.actor_instance_id(), send=%mailbox.actor_instance_id(), msg=?msg);\n        mailbox\n            .send_message_with_backpressure_counter(\n                msg,\n                self.backpressure_micros_counter_opt.as_ref(),\n            )\n            .await\n    }\n\n    pub async fn ask<DestActor, M, T>(\n        &self,\n        mailbox: &Mailbox<DestActor>,\n        msg: M,\n    ) -> Result<T, AskError<Infallible>>\n    where\n        DestActor: DeferableReplyHandler<M, Reply = T>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let _guard = self.protect_zone();\n        debug!(from=%self.self_mailbox.actor_instance_id(), send=%mailbox.actor_instance_id(), msg=?msg, \"ask\");\n        mailbox\n            .ask_with_backpressure_counter(msg, self.backpressure_micros_counter_opt.as_ref())\n            .await\n    }\n\n    /// Similar to `send_message`, except this method\n    /// waits asynchronously for the actor reply.\n    pub async fn ask_for_res<DestActor, M, T, E>(\n        &self,\n        mailbox: &Mailbox<DestActor>,\n        msg: M,\n    ) -> Result<T, AskError<E>>\n    where\n        DestActor: DeferableReplyHandler<M, Reply = 
Result<T, E>>,\n        M: fmt::Debug + Send + Sync + 'static,\n        E: fmt::Debug,\n    {\n        let _guard = self.protect_zone();\n        debug!(from=%self.self_mailbox.actor_instance_id(), send=%mailbox.actor_instance_id(), msg=?msg, \"ask\");\n        mailbox.ask_for_res(msg).await\n    }\n\n    /// Sends the Success message to terminate the destination actor with the Success exit status.\n    ///\n    /// The message is queued like any regular message, so that pending messages will be processed\n    /// first.\n    pub async fn send_exit_with_success<Dest: Actor>(\n        &self,\n        mailbox: &Mailbox<Dest>,\n    ) -> Result<(), SendError> {\n        let _guard = self.protect_zone();\n        debug!(from=%self.self_mailbox.actor_instance_id(), to=%mailbox.actor_instance_id(), \"success\");\n        mailbox.send_message(Command::ExitWithSuccess).await?;\n        Ok(())\n    }\n\n    /// Sends a message to an actor's own mailbox.\n    ///\n    /// Warning: This method is dangerous as it can very easily\n    /// cause a deadlock.\n    pub async fn send_self_message<M>(\n        &self,\n        msg: M,\n    ) -> Result<oneshot::Receiver<A::Reply>, SendError>\n    where\n        A: DeferableReplyHandler<M>,\n        M: 'static + Sync + Send + fmt::Debug,\n    {\n        debug!(self=%self.self_mailbox.actor_instance_id(), msg=?msg, \"self_send\");\n        self.self_mailbox.send_message(msg).await\n    }\n\n    /// Attempts to send a message to itself.\n    /// The message will be queued in the actor's own low-priority queue.\n    ///\n    /// Warning: This method will always fail if\n    /// an actor has a capacity of 0.\n    pub fn try_send_self_message<M>(\n        &self,\n        msg: M,\n    ) -> Result<oneshot::Receiver<A::Reply>, TrySendError<M>>\n    where\n        A: DeferableReplyHandler<M>,\n        M: 'static + Sync + Send + fmt::Debug,\n    {\n        self.self_mailbox.try_send_message(msg)\n    }\n\n    /// Schedules a message that will be sent to the 
high-priority queue of the\n    /// actor Mailbox once `after_duration` has elapsed.\n    ///\n    /// Note that this holds a reference to the actor mailbox until the message\n    /// is actually sent.\n    pub fn schedule_self_msg<M>(&self, after_duration: Duration, message: M)\n    where\n        A: DeferableReplyHandler<M>,\n        M: Sync + Send + std::fmt::Debug + 'static,\n    {\n        let self_mailbox = self.mailbox().clone();\n        let callback = move || {\n            let _ = self_mailbox.send_message_with_high_priority(message);\n        };\n        self.spawn_ctx().schedule_event(callback, after_duration);\n    }\n}\n\n/// If an actor exits in an unexpected manner, its kill\n/// switch will be activated, and all other actors under the same\n/// kill switch will be killed.\nfn should_activate_kill_switch(exit_status: &ActorExitStatus) -> bool {\n    match exit_status {\n        ActorExitStatus::DownstreamClosed => true,\n        ActorExitStatus::Failure(_) => true,\n        ActorExitStatus::Panicked => true,\n        ActorExitStatus::Success => false,\n        ActorExitStatus::Quit => false,\n        ActorExitStatus::Killed => false,\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/actor_handle.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::ops::Deref;\n\nuse serde::Serialize;\nuse tokio::sync::{oneshot, watch};\nuse tracing::error;\n\nuse crate::actor_state::ActorState;\nuse crate::command::Observe;\nuse crate::mailbox::Priority;\nuse crate::observation::ObservationType;\nuse crate::registry::ActorJoinHandle;\nuse crate::{Actor, ActorContext, ActorExitStatus, Command, Mailbox, Observation};\n\n/// An Actor Handle serves as an address to communicate with an actor.\npub struct ActorHandle<A: Actor> {\n    actor_context: ActorContext<A>,\n    last_state: watch::Receiver<A::ObservableState>,\n    join_handle: ActorJoinHandle,\n}\n\n/// Describes the health of a given actor.\n#[derive(Clone, Eq, PartialEq, Debug, Hash, Serialize)]\npub enum Health {\n    /// The actor is running and behaving as expected.\n    Healthy,\n    /// No progress was registered, or the process terminated with an error\n    FailureOrUnhealthy,\n    /// The actor terminated successfully.\n    Success,\n}\n\n/// Message received by health probe handlers.\n#[derive(Clone, Debug)]\npub struct Healthz;\n\nimpl<A: Actor> fmt::Debug for ActorHandle<A> {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"ActorHandle\")\n            .field(\"name\", &self.actor_context.actor_instance_id())\n            .finish()\n    }\n}\n\npub 
trait Supervisable {\n    fn name(&self) -> &str;\n\n    /// Check for the ActorState (has it terminated?), and provided `check_for_progress`\n    /// is set to `true`, it will also check for the progress of the actor.\n    fn check_health(&self, check_for_progress: bool) -> Health;\n\n    fn state(&self) -> ActorState;\n}\n\nimpl<A: Actor> Supervisable for ActorHandle<A> {\n    fn name(&self) -> &str {\n        self.actor_context.actor_instance_id()\n    }\n\n    fn state(&self) -> ActorState {\n        self.actor_context.state()\n    }\n\n    /// Harvests the health of the actor by checking its state (see [`ActorState`]) and,\n    /// provided `check_for_progress` is set to true, it will check its progress too\n    /// (see `Progress`).\n    ///\n    /// When `check_for_progress` is set to true, calling this method resets its progress state\n    /// to \"no update\" (see `ProgressState`). As a consequence, only one supervisor or probe\n    /// should periodically invoke this method during the lifetime of the actor.\n    fn check_health(&self, check_for_progress: bool) -> Health {\n        let actor_state = self.state();\n        if actor_state == ActorState::Success {\n            return Health::Success;\n        }\n        if actor_state == ActorState::Failure {\n            error!(actor = self.name(), \"actor-exit-without-success\");\n            return Health::FailureOrUnhealthy;\n        }\n        if !check_for_progress\n            || self\n                .actor_context\n                .progress()\n                .registered_activity_since_last_call()\n        {\n            Health::Healthy\n        } else {\n            error!(actor = self.name(), \"actor-timeout\");\n            Health::FailureOrUnhealthy\n        }\n    }\n}\n\nimpl<A: Actor> ActorHandle<A> {\n    pub(crate) fn new(\n        last_state: watch::Receiver<A::ObservableState>,\n        join_handle: ActorJoinHandle,\n        actor_context: ActorContext<A>,\n    ) -> Self {\n        
ActorHandle {\n            actor_context,\n            last_state,\n            join_handle,\n        }\n    }\n\n    pub fn state(&self) -> ActorState {\n        self.actor_context.state()\n    }\n\n    /// Processes all of the pending messages, and returns a snapshot of\n    /// the observable state of the actor after this.\n    ///\n    /// This method is mostly useful for tests.\n    ///\n    /// To actually observe the state of an actor for ops purposes,\n    /// prefer using the `.observe()` method.\n    ///\n    /// This method times out if reaching the end of the message queue takes longer than\n    /// `OBSERVE_TIMEOUT`.\n    pub async fn process_pending_and_observe(&self) -> Observation<A::ObservableState> {\n        self.observe_with_priority(Priority::Low).await\n    }\n\n    /// Observes the current state.\n    ///\n    /// The observation will be scheduled as a high priority message, therefore it will be executed\n    /// after the current active message and the current command queue have been processed.\n    ///\n    /// This method does not do anything to prevent Observe messages from stacking up.\n    /// In supervisors, prefer using `refresh_observe`.\n    pub async fn observe(&self) -> Observation<A::ObservableState> {\n        self.observe_with_priority(Priority::High).await\n    }\n\n    /// Triggers an observation.\n    /// It is scheduled as a high priority\n    /// message, and will hence be executed as soon as possible.\n    ///\n    /// This method does not enqueue an Observe request if there is already one in\n    /// the queue.\n    ///\n    /// The resulting observation eventually becomes accessible through the\n    /// observation watch channel.\n    ///\n    /// This function returning does NOT mean that the observation was executed.\n    pub fn refresh_observe(&self) {\n        let observation_already_enqueued = self\n            .actor_context\n            .set_observe_enqueued_and_return_previous();\n        if !observation_already_enqueued {\n            let 
_ = self\n                .actor_context\n                .mailbox()\n                .send_message_with_high_priority(Observe);\n        }\n    }\n\n    async fn observe_with_priority(&self, priority: Priority) -> Observation<A::ObservableState> {\n        if !self.actor_context.state().is_exit() {\n            if let Ok(oneshot_rx) = self\n                .actor_context\n                .mailbox()\n                .send_message_with_priority(Observe, priority)\n                .await\n            {\n                // The timeout is required here. If the actor fails, its inbox is properly dropped\n                // but the send channel might actually prevent the oneshot\n                // Receiver from being dropped.\n                return self.wait_for_observable_state_callback(oneshot_rx).await;\n            } else {\n                error!(\n                    actor_id=%self.actor_context.actor_instance_id(),\n                    \"Failed to send observe message\"\n                );\n            }\n        }\n        let state = self.last_observation().clone();\n        Observation {\n            obs_type: ObservationType::PostMortem,\n            state,\n        }\n    }\n\n    /// Pauses the actor. The actor will stop processing messages from the low priority\n    /// channel, but its work can be resumed by calling the method `.resume()`.\n    pub fn pause(&self) {\n        let _ = self\n            .actor_context\n            .mailbox()\n            .send_message_with_high_priority(Command::Pause);\n    }\n\n    /// Resumes a paused actor.\n    pub fn resume(&self) {\n        let _ = self\n            .actor_context\n            .mailbox()\n            .send_message_with_high_priority(Command::Resume);\n    }\n\n    /// Kills the actor. Its finalize function will still be called.\n    ///\n    /// This function also activates the actor kill switch.\n    ///\n    /// The other difference with quit is the exit status. 
It is important,\n    /// as the finalize logic may behave differently depending on the exit status.\n    pub async fn kill(self) -> (ActorExitStatus, A::ObservableState) {\n        self.actor_context.kill_switch().kill();\n        let _ = self\n            .actor_context\n            .mailbox()\n            .send_message_with_high_priority(Command::Nudge);\n        self.join().await\n    }\n\n    /// Gracefully quits the actor, regardless of whether there are pending messages or not.\n    /// Its finalize function will be called.\n    ///\n    /// The kill switch is not activated.\n    ///\n    /// The other difference with kill is the exit status. It is important,\n    /// as the finalize logic may behave differently depending on the exit status.\n    pub async fn quit(self) -> (ActorExitStatus, A::ObservableState) {\n        let _ = self\n            .actor_context\n            .mailbox()\n            .send_message_with_high_priority(Command::Quit);\n        self.join().await\n    }\n\n    /// Waits until the actor exits by itself. 
This is the equivalent of `Thread::join`.\n    pub async fn join(self) -> (ActorExitStatus, A::ObservableState) {\n        let exit_status = self.join_handle.join().await;\n        let observation = self.last_state.borrow().clone();\n        (exit_status, observation)\n    }\n\n    pub fn last_observation(&self) -> impl Deref<Target = A::ObservableState> + '_ {\n        self.last_state.borrow()\n    }\n\n    async fn wait_for_observable_state_callback(\n        &self,\n        rx: oneshot::Receiver<A::ObservableState>,\n    ) -> Observation<A::ObservableState> {\n        let scheduler_client = &self.actor_context.spawn_ctx().scheduler_client;\n        let observable_state_or_timeout =\n            scheduler_client.timeout(crate::OBSERVE_TIMEOUT, rx).await;\n        match observable_state_or_timeout {\n            Ok(Ok(state)) => {\n                let obs_type = ObservationType::Alive;\n                Observation { obs_type, state }\n            }\n            Ok(Err(_)) => {\n                let state = self.last_observation().clone();\n                let obs_type = ObservationType::PostMortem;\n                Observation { obs_type, state }\n            }\n            Err(_) => {\n                let state = self.last_observation().clone();\n                let obs_type = if self.actor_context.state().is_exit() {\n                    ObservationType::PostMortem\n                } else {\n                    ObservationType::Timeout\n                };\n                Observation { obs_type, state }\n            }\n        }\n    }\n\n    pub fn mailbox(&self) -> &Mailbox<A> {\n        self.actor_context.mailbox()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::atomic::{AtomicU32, Ordering};\n    use std::time::Duration;\n\n    use async_trait::async_trait;\n\n    use super::*;\n    use crate::{Handler, Universe};\n\n    #[derive(Default)]\n    struct PanickingActor {\n        count: usize,\n    }\n\n    impl Actor for PanickingActor {\n        
type ObservableState = usize;\n        fn observable_state(&self) -> usize {\n            self.count\n        }\n    }\n\n    #[derive(Debug)]\n    struct Panic;\n\n    #[async_trait]\n    impl Handler<Panic> for PanickingActor {\n        type Reply = ();\n        async fn handle(\n            &mut self,\n            _message: Panic,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            self.count += 1;\n            panic!(\"Oops\");\n        }\n    }\n\n    #[derive(Default)]\n    struct ExitActor {\n        count: usize,\n    }\n\n    impl Actor for ExitActor {\n        type ObservableState = usize;\n        fn observable_state(&self) -> usize {\n            self.count\n        }\n    }\n\n    #[derive(Debug)]\n    struct Exit;\n\n    #[async_trait]\n    impl Handler<Exit> for ExitActor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            _msg: Exit,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            self.count += 1;\n            Err(ActorExitStatus::DownstreamClosed)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_panic_in_actor() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, handle) = universe.spawn_builder().spawn(PanickingActor::default());\n        mailbox.send_message(Panic).await?;\n        let (exit_status, count) = handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Panicked));\n        assert!(matches!(count, 1)); //< Upon panick we cannot get a post mortem state.\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_exit() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, handle) = universe.spawn_builder().spawn(ExitActor::default());\n        mailbox.send_message(Exit).await?;\n        let (exit_status, count) = handle.join().await;\n        
assert!(matches!(exit_status, ActorExitStatus::DownstreamClosed));\n        assert!(matches!(count, 1)); //< Upon panic we cannot get a post mortem state.\n        Ok(())\n    }\n\n    #[derive(Default)]\n    struct ObserveActor {\n        observe: AtomicU32,\n    }\n\n    #[async_trait]\n    impl Actor for ObserveActor {\n        type ObservableState = u32;\n\n        fn observable_state(&self) -> u32 {\n            self.observe.fetch_add(1, Ordering::Relaxed)\n        }\n\n        async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n            ctx.send_self_message(YieldLoop).await?;\n            Ok(())\n        }\n    }\n\n    #[derive(Debug)]\n    struct YieldLoop;\n\n    #[async_trait]\n    impl Handler<YieldLoop> for ObserveActor {\n        type Reply = ();\n        async fn handle(\n            &mut self,\n            _: YieldLoop,\n            ctx: &ActorContext<Self>,\n        ) -> Result<Self::Reply, ActorExitStatus> {\n            ctx.sleep(Duration::from_millis(25)).await; // OBSERVE_TIMEOUT.mul_f32(10.0f32)).await;\n            ctx.send_self_message(YieldLoop).await?;\n            Ok(())\n        }\n    }\n\n    #[tokio::test]\n    async fn test_observation_debounce() {\n        // TODO investigate why Universe::with_accelerated_time() does not work here.\n        let universe = Universe::new();\n        let (_, actor_handle) = universe.spawn_builder().spawn(ObserveActor::default());\n        for _ in 0..10 {\n            actor_handle.refresh_observe();\n            universe.sleep(Duration::from_millis(10)).await;\n        }\n        let (_last_obs, num_obs) = actor_handle.quit().await;\n        assert!(num_obs < 8);\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/actor_state.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::atomic::{AtomicU32, Ordering};\n\n#[repr(u32)]\n#[derive(Clone, Copy, Debug, Eq, PartialEq)]\npub enum ActorState {\n    /// Running means that the actor consumes and processes both low priority messages (regular\n    /// message) and high priority message commands.\n    Running = 0,\n    /// Pause means that the actor only consumes and processes high priority messages. Typically\n    /// commands as well as scheduled messages.\n    Paused = 1,\n    /// Success means that the actor exited and cannot return to any other states.\n    Success = 2,\n    /// Failure means that the actor exited with a failure or panicked.\n    Failure = 3,\n}\n\nimpl From<u32> for ActorState {\n    fn from(actor_state_u32: u32) -> Self {\n        match actor_state_u32 {\n            0 => ActorState::Running,\n            1 => ActorState::Paused,\n            2 => ActorState::Success,\n            3 => ActorState::Failure,\n            _ => {\n                panic!(\n                    \"Found forbidden u32 value for ActorState `{actor_state_u32}`. 
This should \\\n                     never happen.\"\n                );\n            }\n        }\n    }\n}\n\nimpl From<ActorState> for AtomicState {\n    fn from(state: ActorState) -> Self {\n        AtomicState(AtomicU32::from(state as u32))\n    }\n}\n\nimpl ActorState {\n    pub fn is_running(&self) -> bool {\n        *self == ActorState::Running\n    }\n\n    pub fn is_exit(&self) -> bool {\n        match self {\n            ActorState::Running | ActorState::Paused => false,\n            ActorState::Success | ActorState::Failure => true,\n        }\n    }\n}\n\npub(crate) struct AtomicState(AtomicU32);\n\nimpl Default for AtomicState {\n    fn default() -> Self {\n        AtomicState(AtomicU32::new(ActorState::Running as u32))\n    }\n}\n\nimpl AtomicState {\n    pub(crate) fn pause(&self) {\n        let _ = self\n            .0\n            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |state| {\n                if ActorState::from(state).is_running() {\n                    return Some(ActorState::Paused as u32);\n                }\n                None\n            });\n    }\n\n    pub(crate) fn resume(&self) {\n        let _ = self.0.compare_exchange(\n            ActorState::Paused as u32,\n            ActorState::Running as u32,\n            Ordering::SeqCst,\n            Ordering::SeqCst,\n        );\n    }\n\n    pub(crate) fn exit(&self, success: bool) {\n        let new_state = if success {\n            ActorState::Success\n        } else {\n            ActorState::Failure\n        };\n        self.0.fetch_max(new_state as u32, Ordering::Release);\n    }\n\n    pub fn get_state(&self) -> ActorState {\n        ActorState::from(self.0.load(Ordering::Acquire))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    enum Operation {\n        Pause,\n        Resume,\n        ExitSuccess,\n        ExitFailure,\n    }\n\n    impl Operation {\n        fn apply(&self, state: &AtomicState) {\n            match self {\n                
Operation::Pause => {\n                    state.pause();\n                }\n                Operation::Resume => state.resume(),\n                Operation::ExitSuccess => state.exit(true),\n                Operation::ExitFailure => state.exit(false),\n            }\n        }\n    }\n\n    #[track_caller]\n    fn test_transition(from_state: ActorState, op: Operation, expected_state: ActorState) {\n        let state = AtomicState::from(from_state);\n        op.apply(&state);\n        assert_eq!(state.get_state(), expected_state);\n    }\n\n    #[test]\n    fn test_atomic_state_from_running() {\n        test_transition(ActorState::Running, Operation::Pause, ActorState::Paused);\n        test_transition(ActorState::Running, Operation::Resume, ActorState::Running);\n        test_transition(\n            ActorState::Running,\n            Operation::ExitSuccess,\n            ActorState::Success,\n        );\n        test_transition(ActorState::Paused, Operation::Pause, ActorState::Paused);\n        test_transition(ActorState::Paused, Operation::Resume, ActorState::Running);\n        test_transition(\n            ActorState::Paused,\n            Operation::ExitSuccess,\n            ActorState::Success,\n        );\n        test_transition(\n            ActorState::Success,\n            Operation::ExitFailure,\n            ActorState::Failure,\n        );\n\n        test_transition(ActorState::Success, Operation::Pause, ActorState::Success);\n        test_transition(ActorState::Success, Operation::Resume, ActorState::Success);\n        test_transition(\n            ActorState::Success,\n            Operation::ExitSuccess,\n            ActorState::Success,\n        );\n\n        test_transition(ActorState::Failure, Operation::Pause, ActorState::Failure);\n        test_transition(ActorState::Failure, Operation::Resume, ActorState::Failure);\n        test_transition(\n            ActorState::Failure,\n            Operation::ExitSuccess,\n            ActorState::Failure,\n  
      );\n        test_transition(\n            ActorState::Failure,\n            Operation::ExitFailure,\n            ActorState::Failure,\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/channel_with_priority.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Mutex;\nuse std::sync::atomic::{AtomicBool, Ordering};\n\nuse flume::TryRecvError;\nuse thiserror::Error;\n\n#[derive(Default)]\nstruct LockedOption<T> {\n    opt: Mutex<Option<T>>,\n    has_val: AtomicBool,\n}\n\nimpl<T> LockedOption<T> {\n    pub fn none() -> Self {\n        LockedOption {\n            opt: Mutex::new(None),\n            has_val: AtomicBool::new(false),\n        }\n    }\n\n    pub fn is_some(&self) -> bool {\n        self.has_val.load(Ordering::Acquire)\n    }\n\n    pub fn is_none(&self) -> bool {\n        !self.is_some()\n    }\n\n    pub fn take(&self) -> Option<T> {\n        if !self.has_val.load(Ordering::Acquire) {\n            return None;\n        }\n        let mut lock = self.opt.lock().unwrap();\n        let val_opt = lock.take();\n        self.has_val.store(false, Ordering::Release);\n        val_opt\n    }\n\n    pub fn place(&self, val: T) {\n        let mut lock = self.opt.lock().unwrap();\n        self.has_val.store(true, Ordering::Release);\n        *lock = Some(val);\n    }\n}\n\n#[derive(Debug, Error)]\npub enum SendError {\n    #[error(\"the channel is closed\")]\n    Disconnected,\n    #[error(\"the channel is full\")]\n    Full,\n}\n\n#[derive(Debug, Error)]\npub enum TrySendError<M> {\n    #[error(\"the channel is closed\")]\n    Disconnected,\n    #[error(\"the channel is full\")]\n    
Full(M),\n}\n\nimpl<M> From<flume::TrySendError<M>> for TrySendError<M> {\n    fn from(err: flume::TrySendError<M>) -> Self {\n        match err {\n            flume::TrySendError::Full(msg) => TrySendError::Full(msg),\n            flume::TrySendError::Disconnected(_) => TrySendError::Disconnected,\n        }\n    }\n}\n\n#[derive(Clone, Copy, Debug, Error, Eq, PartialEq)]\npub enum RecvError {\n    #[error(\"no message are currently available\")]\n    NoMessageAvailable,\n    #[error(\"all senders were dropped and no pending messages are in the channel\")]\n    Disconnected,\n}\n\nimpl From<flume::RecvTimeoutError> for RecvError {\n    fn from(flume_err: flume::RecvTimeoutError) -> Self {\n        match flume_err {\n            flume::RecvTimeoutError::Timeout => Self::NoMessageAvailable,\n            flume::RecvTimeoutError::Disconnected => Self::Disconnected,\n        }\n    }\n}\n\nimpl<T> From<flume::SendError<T>> for SendError {\n    fn from(_send_error: flume::SendError<T>) -> Self {\n        SendError::Disconnected\n    }\n}\n\nimpl<T> From<flume::TrySendError<T>> for SendError {\n    fn from(try_send_error: flume::TrySendError<T>) -> Self {\n        match try_send_error {\n            flume::TrySendError::Full(_) => SendError::Full,\n            flume::TrySendError::Disconnected(_) => SendError::Disconnected,\n        }\n    }\n}\n\n#[derive(Clone, Copy, Debug)]\npub enum QueueCapacity {\n    Bounded(usize),\n    Unbounded,\n}\n\n/// Creates a channel with the ability to send high priority messages.\n///\n/// A high priority message is guaranteed to be consumed before any\n/// low priority message sent after it.\npub fn channel<T>(queue_capacity: QueueCapacity) -> (Sender<T>, Receiver<T>) {\n    let (high_priority_tx, high_priority_rx) = flume::unbounded();\n    let (low_priority_tx, low_priority_rx) = match queue_capacity {\n        QueueCapacity::Bounded(cap) => flume::bounded(cap),\n        QueueCapacity::Unbounded => flume::unbounded(),\n    };\n    
let receiver = Receiver {\n        low_priority_rx,\n        high_priority_rx,\n        _high_priority_tx: high_priority_tx.clone(),\n        pending_low_priority_message: LockedOption::none(),\n        _clone_is_forbidden: CloneIsForbidden,\n    };\n    let sender = Sender {\n        low_priority_tx,\n        high_priority_tx,\n    };\n    (sender, receiver)\n}\n\npub struct Sender<T> {\n    low_priority_tx: flume::Sender<T>,\n    high_priority_tx: flume::Sender<T>,\n}\n\nimpl<T> Sender<T> {\n    pub fn is_disconnected(&self) -> bool {\n        self.low_priority_tx.is_disconnected()\n    }\n\n    pub fn try_send_low_priority(&self, msg: T) -> Result<(), TrySendError<T>> {\n        self.low_priority_tx.try_send(msg)?;\n        Ok(())\n    }\n\n    pub async fn send_low_priority(&self, msg: T) -> Result<(), SendError> {\n        self.low_priority_tx.send_async(msg).await?;\n        Ok(())\n    }\n\n    pub fn send_high_priority(&self, msg: T) -> Result<(), SendError> {\n        self.high_priority_tx.send(msg)?;\n        Ok(())\n    }\n}\n\n// Message to future generations. 
I created this flag to prevent you\n// from naively making a struct cloneable.\n// The drop implementation drains the elements in the channel.\nstruct CloneIsForbidden;\n\npub struct Receiver<T> {\n    low_priority_rx: flume::Receiver<T>,\n    high_priority_rx: flume::Receiver<T>,\n    _high_priority_tx: flume::Sender<T>,\n    pending_low_priority_message: LockedOption<T>,\n    _clone_is_forbidden: CloneIsForbidden,\n}\n\nimpl<T> Drop for Receiver<T> {\n    fn drop(&mut self) {\n        // Flume strangely (tokio::mpsc does not behave like this for instance)\n        // does not drop the messages in the channel when all receivers are dropped.\n        //\n        // They are only dropped when both the receivers AND the sender are dropped.\n        // We fix this behavior by draining the channel upon drop.\n        self.high_priority_rx.drain();\n        self.low_priority_rx.drain();\n    }\n}\n\nimpl<T> Receiver<T> {\n    pub fn is_empty(&self) -> bool {\n        self.low_priority_rx.is_empty()\n            && self.pending_low_priority_message.is_none()\n            && self.high_priority_rx.is_empty()\n    }\n\n    pub fn try_recv_high_priority_message(&self) -> Result<T, RecvError> {\n        match self.high_priority_rx.try_recv() {\n            Ok(msg) => Ok(msg),\n            Err(TryRecvError::Disconnected) => {\n                unreachable!(\n                    \"This can never happen, as the high priority Sender is owned by the Receiver.\"\n                );\n            }\n            Err(TryRecvError::Empty) => {\n                if self.low_priority_rx.is_disconnected() {\n                    // We check that no new high priority message was sent\n                    // in between.\n                    if let Ok(msg) = self.high_priority_rx.try_recv() {\n                        Ok(msg)\n                    } else {\n                        Err(RecvError::Disconnected)\n                    }\n                } else {\n                    
Err(RecvError::NoMessageAvailable)\n                }\n            }\n        }\n    }\n\n    pub fn try_recv(&self) -> Result<T, RecvError> {\n        if let Ok(msg) = self.high_priority_rx.try_recv() {\n            return Ok(msg);\n        }\n        if let Some(pending_msg) = self.pending_low_priority_message.take() {\n            return Ok(pending_msg);\n        }\n        match self.low_priority_rx.try_recv() {\n            Ok(low_msg) => {\n                if let Ok(high_msg) = self.high_priority_rx.try_recv() {\n                    self.pending_low_priority_message.place(low_msg);\n                    Ok(high_msg)\n                } else {\n                    Ok(low_msg)\n                }\n            }\n            Err(TryRecvError::Disconnected) => {\n                if let Ok(high_msg) = self.high_priority_rx.try_recv() {\n                    Ok(high_msg)\n                } else {\n                    Err(RecvError::Disconnected)\n                }\n            }\n            Err(TryRecvError::Empty) => Err(RecvError::NoMessageAvailable),\n        }\n    }\n\n    pub async fn recv_high_priority(&self) -> T {\n        self.high_priority_rx\n            .recv_async()\n            .await\n            .expect(\"The Receiver owns the high priority Sender to avoid any disconnection.\")\n    }\n\n    pub async fn recv(&self) -> Result<T, RecvError> {\n        if let Ok(msg) = self.try_recv_high_priority_message() {\n            return Ok(msg);\n        }\n        if let Some(pending_msg) = self.pending_low_priority_message.take() {\n            return Ok(pending_msg);\n        }\n        tokio::select! 
{\n            // We don't really care about fairness here.\n            // We will double check if there is a command or not anyway.\n            biased;\n            high_priority_msg_res = self.high_priority_rx.recv_async() => {\n                match high_priority_msg_res {\n                    Ok(high_priority_msg) => {\n                        Ok(high_priority_msg)\n                    },\n                    Err(_) => {\n                        unreachable!(\"The Receiver owns the high priority Sender to avoid any disconnection.\")\n                    },\n                }\n            }\n            low_priority_msg_res = self.low_priority_rx.recv_async() => {\n                match low_priority_msg_res {\n                    Ok(low_priority_msg) => {\n                        if let Ok(high_priority_msg) = self.try_recv_high_priority_message() {\n                            self.pending_low_priority_message.place(low_priority_msg);\n                            Ok(high_priority_msg)\n                        } else {\n                            Ok(low_priority_msg)\n                        }\n                    },\n                    Err(flume::RecvError::Disconnected) => {\n                        if let Ok(high_priority_msg) = self.try_recv_high_priority_message() {\n                            Ok(high_priority_msg)\n                        } else {\n                            Err(RecvError::Disconnected)\n                        }\n                    }\n                }\n           }\n        }\n    }\n\n    /// Drain all of the pending low priority messages and return them.\n    pub fn drain_low_priority(&self) -> Vec<T> {\n        let mut messages = Vec::new();\n        while let Ok(msg) = self.low_priority_rx.try_recv() {\n            messages.push(msg);\n        }\n        messages\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn 
test_channel_with_priority_drop_receiver_drop_messages() {\n        let arc_high = Arc::new(());\n        let arc_low = Arc::new(());\n        let (tx, rx) = super::channel(QueueCapacity::Bounded(2));\n        tx.send_high_priority(arc_high.clone()).unwrap();\n        tx.send_low_priority(arc_low.clone()).await.unwrap();\n        assert_eq!(Arc::strong_count(&arc_high), 2);\n        assert_eq!(Arc::strong_count(&arc_low), 2);\n        drop(rx);\n        assert_eq!(Arc::strong_count(&arc_high), 1);\n        assert_eq!(Arc::strong_count(&arc_low), 1);\n    }\n\n    #[test]\n    fn test_locked_option_new_empty() {\n        let locked_option: LockedOption<usize> = LockedOption::none();\n        assert_eq!(locked_option.take(), None);\n    }\n\n    #[test]\n    fn test_locked_option_place() {\n        let locked_option = LockedOption::none();\n        locked_option.place(1);\n        assert_eq!(locked_option.take(), Some(1));\n    }\n\n    #[test]\n    fn test_locked_option_place_twice_keep_last() {\n        let locked_option = LockedOption::none();\n        locked_option.place(1);\n        locked_option.place(2);\n        assert_eq!(locked_option.take(), Some(2));\n    }\n\n    #[test]\n    fn test_locked_option_place_take_twice() {\n        let locked_option = LockedOption::none();\n        locked_option.place(1);\n        assert_eq!(locked_option.take(), Some(1));\n        assert_eq!(locked_option.take(), None);\n    }\n\n    #[tokio::test]\n    async fn test_recv_priority() -> anyhow::Result<()> {\n        let (sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        sender.send_low_priority(1).await?;\n        sender.send_high_priority(2)?;\n        assert_eq!(receiver.recv().await, Ok(2));\n        assert_eq!(receiver.recv().await, Ok(1));\n        assert!(\n            tokio::time::timeout(Duration::from_millis(50), receiver.recv())\n                .await\n                .is_err()\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n 
   async fn test_try_recv() -> anyhow::Result<()> {\n        let (sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        sender.send_low_priority(1).await?;\n        assert_eq!(receiver.try_recv(), Ok(1));\n        assert_eq!(receiver.try_recv(), Err(RecvError::NoMessageAvailable));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_try_recv_high_priority() -> anyhow::Result<()> {\n        let (sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        sender.send_low_priority(1).await?;\n        assert_eq!(\n            receiver.try_recv_high_priority_message(),\n            Err(RecvError::NoMessageAvailable)\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_recv_high_priority_ignore_disconnection() -> anyhow::Result<()> {\n        let (sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        std::mem::drop(sender);\n        assert!(\n            tokio::time::timeout(Duration::from_millis(100), receiver.recv_high_priority())\n                .await\n                .is_err()\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_recv_disconnect() -> anyhow::Result<()> {\n        let (sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        std::mem::drop(sender);\n        assert_eq!(receiver.recv().await, Err(RecvError::Disconnected));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_recv_timeout_simple() -> anyhow::Result<()> {\n        let (_sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        assert!(matches!(\n            receiver.try_recv(),\n            Err(RecvError::NoMessageAvailable)\n        ));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_try_recv_priority_corner_case() -> anyhow::Result<()> {\n        let (sender, receiver) = super::channel::<usize>(QueueCapacity::Unbounded);\n        tokio::task::spawn(async move {\n            
tokio::time::sleep(Duration::from_millis(10)).await;\n            sender.send_high_priority(1)?;\n            sender.send_low_priority(2).await?;\n            Result::<(), SendError>::Ok(())\n        });\n        assert_eq!(receiver.recv().await, Ok(1));\n        assert_eq!(receiver.try_recv(), Ok(2));\n        assert!(matches!(receiver.try_recv(), Err(RecvError::Disconnected)));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_try_recv_high_low() {\n        let (tx, rx) = super::channel::<usize>(QueueCapacity::Unbounded);\n        tx.send_low_priority(1).await.unwrap();\n        tx.send_high_priority(2).unwrap();\n        assert_eq!(rx.try_recv(), Ok(2));\n        assert_eq!(rx.try_recv(), Ok(1));\n        assert_eq!(rx.try_recv(), Err(RecvError::NoMessageAvailable));\n    }\n\n    #[tokio::test]\n    async fn test_try_recv_high() {\n        let (tx, rx) = super::channel::<usize>(QueueCapacity::Unbounded);\n        tx.send_low_priority(1).await.unwrap();\n        tx.send_high_priority(2).unwrap();\n        assert_eq!(rx.try_recv_high_priority_message(), Ok(2));\n        assert_eq!(\n            rx.try_recv_high_priority_message(),\n            Err(RecvError::NoMessageAvailable)\n        );\n        assert_eq!(rx.try_recv(), Ok(1));\n        assert_eq!(rx.try_recv(), Err(RecvError::NoMessageAvailable));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/command.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\n\nuse crate::{Actor, ActorContext, ActorExitStatus, Handler};\n\n/// Commands are messages that can be sent to control the behavior of an actor.\n///\n/// They are similar to UNIX signals.\n///\n/// They are treated with a higher priority than regular actor messages.\n#[derive(Debug)]\npub enum Command {\n    /// Temporarily pauses the actor. A paused actor only checks\n    /// on its high priority channel and still shows \"progress\". It appears as\n    /// healthy to the supervisor.\n    ///\n    /// Scheduled messages are still processed.\n    ///\n    /// Semantically, it is similar to SIGSTOP.\n    Pause,\n\n    /// Resumes a paused actor. 
If the actor was not paused, this command\n    /// has no effect.\n    ///\n    /// Semantically, it is similar to SIGCONT.\n    Resume,\n\n    /// Stops the actor with a success exit status code.\n    ///\n    /// Upstream actors that terminate should send the `ExitWithSuccess`\n    /// command to downstream actors to inform them that there are no more\n    /// incoming messages.\n    ///\n    /// It is similar to `Quit`, except for the resulting exit status.\n    ExitWithSuccess,\n\n    /// Asks the actor to gracefully shut down.\n    ///\n    /// The actor will stop processing messages and its finalize function will\n    /// be called.\n    ///\n    /// The exit status is then `ActorExitStatus::Quit`.\n    ///\n    /// This is the equivalent of sending SIGINT/Ctrl-C to a process.\n    Quit,\n\n    /// Nudging is a no-op message.\n    ///\n    /// Its only effect is to wake up actors that are stuck waiting\n    /// for a message.\n    ///\n    /// This is useful to kill actors properly or for tests.\n    /// Actors stuck waiting for a message do not have any timeout to\n    /// check for their killswitch signal.\n    ///\n    /// Note: Historically, actors used to have a timeout, then\n    /// the wake up logic worked using a Kill command.\n    /// However, after the introduction of supervision, it became common\n    /// to recycle a mailbox.\n    ///\n    /// After a panic for instance, the supervisor of an actor might kill\n    /// it by activating its killswitch and sending a Kill message.\n    ///\n    /// The respawned actor would receive its predecessor's mailbox and\n    /// possibly end up processing a Kill message as its first message.\n    Nudge,\n}\n\n#[async_trait]\nimpl<A: Actor> Handler<Command> for A {\n    type Reply = ();\n\n    /// Handles a command.\n    ///\n    /// If an error is returned, the actor exits\n    /// and its exit status will be the one defined in the error.\n    async fn handle(\n        &mut self,\n        command: Command,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> 
{\n        match command {\n            Command::Pause => {\n                ctx.pause();\n                Ok(())\n            }\n            Command::ExitWithSuccess => Err(ActorExitStatus::Success),\n            Command::Quit => Err(ActorExitStatus::Quit),\n            Command::Nudge => Ok(()),\n            Command::Resume => {\n                ctx.resume();\n                Ok(())\n            }\n        }\n    }\n}\n\n/// Asks the actor to update its ObservableState.\n///\n/// The observation is then available using the `ActorHandle::last_observation()`\n/// method.\n#[derive(Debug)]\npub struct Observe;\n\n#[async_trait]\nimpl<A: Actor> Handler<Observe> for A {\n    type Reply = A::ObservableState;\n\n    async fn handle(\n        &mut self,\n        _observe: Observe,\n        ctx: &ActorContext<Self>,\n    ) -> Result<A::ObservableState, ActorExitStatus> {\n        Ok(ctx.observe(self))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/envelope.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::Any;\nuse std::fmt;\n\nuse async_trait::async_trait;\nuse tokio::sync::oneshot;\n\nuse crate::actor::DeferableReplyHandler;\nuse crate::scheduler::NoAdvanceTimeGuard;\nuse crate::{Actor, ActorContext, ActorExitStatus};\n\n/// An `Envelope` is just a way to capture the handler\n/// of a message and hide its type.\n///\n/// Messages can have different types but somehow need to be pushed to a\n/// queue with a single type.\n/// Before appending, we capture the right handler implementation\n/// in the form of a `Box<dyn Envelope>`, and append that to the queue.\npub struct Envelope<A> {\n    handler_envelope: Box<dyn EnvelopeT<A>>,\n    _no_advance_time_guard: Option<NoAdvanceTimeGuard>,\n}\n\nimpl<A: Actor> Envelope<A> {\n    /// Returns the message as a boxed any.\n    ///\n    /// This method is only useful in unit tests.\n    pub fn message(&mut self) -> Box<dyn Any> {\n        self.handler_envelope.message()\n    }\n\n    pub fn message_typed<M: 'static>(&mut self) -> Option<M> {\n        if let Ok(boxed_msg) = self.handler_envelope.message().downcast::<M>() {\n            Some(*boxed_msg)\n        } else {\n            None\n        }\n    }\n\n    /// Executes the captured handle function.\n    ///\n    /// When exiting, also returns the message type name.\n    pub async fn handle_message(\n        &mut self,\n        actor: 
&mut A,\n        ctx: &ActorContext<A>,\n    ) -> Result<(), (ActorExitStatus, &'static str)> {\n        let handling_res = self.handler_envelope.handle_message(actor, ctx).await;\n        if let Err(exit_status) = handling_res {\n            return Err((exit_status, self.handler_envelope.message_type_name()));\n        }\n        Ok(())\n    }\n}\n\nimpl<A: Actor> fmt::Debug for Envelope<A> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        let msg_str = self.handler_envelope.debug_msg();\n        f.debug_tuple(\"Envelope\").field(&msg_str).finish()\n    }\n}\n\n#[async_trait]\ntrait EnvelopeT<A: Actor>: Send {\n    fn message_type_name(&self) -> &'static str;\n\n    fn debug_msg(&self) -> String;\n\n    /// Returns the message as a boxed any.\n    ///\n    /// This method is only useful in unit tests.\n    fn message(&mut self) -> Box<dyn Any>;\n\n    /// Execute the captured handle function.\n    async fn handle_message(\n        &mut self,\n        actor: &mut A,\n        ctx: &ActorContext<A>,\n    ) -> Result<(), ActorExitStatus>;\n}\n\n#[async_trait]\nimpl<A, M> EnvelopeT<A> for Option<(oneshot::Sender<A::Reply>, M)>\nwhere\n    A: DeferableReplyHandler<M>,\n    M: fmt::Debug + Send + 'static,\n{\n    fn message_type_name(&self) -> &'static str {\n        std::any::type_name::<M>()\n    }\n\n    fn debug_msg(&self) -> String {\n        #[allow(clippy::needless_option_take)]\n        if let Some((_response_tx, msg)) = self.as_ref().take() {\n            format!(\"{msg:?}\")\n        } else {\n            \"<consumed>\".to_string()\n        }\n    }\n\n    fn message(&mut self) -> Box<dyn Any> {\n        if let Some((_, message)) = self.take() {\n            Box::new(message)\n        } else {\n            Box::new(())\n        }\n    }\n\n    async fn handle_message(\n        &mut self,\n        actor: &mut A,\n        ctx: &ActorContext<A>,\n    ) -> Result<(), ActorExitStatus> {\n        let (response_tx, msg) = self\n            
.take()\n            .expect(\"handle_message should never be called twice.\");\n        actor\n            .handle_message(\n                msg,\n                |response| {\n                    // A SendError is fine here. The caller just did not wait\n                    // for our response and dropped its Receiver channel.\n                    let _ = response_tx.send(response);\n                },\n                ctx,\n            )\n            .await?;\n        Ok(())\n    }\n}\n\npub(crate) fn wrap_in_envelope<A, M>(\n    msg: M,\n    no_advance_time_guard: Option<NoAdvanceTimeGuard>,\n) -> (Envelope<A>, oneshot::Receiver<A::Reply>)\nwhere\n    A: DeferableReplyHandler<M>,\n    M: fmt::Debug + Send + 'static,\n{\n    let (response_tx, response_rx) = oneshot::channel();\n    let handler_envelope = Some((response_tx, msg));\n    let envelope = Envelope {\n        handler_envelope: Box::new(handler_envelope),\n        _no_advance_time_guard: no_advance_time_guard,\n    };\n    (envelope, response_rx)\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\n//! quickwit-actors is a simplified actor framework for quickwit.\n//!\n//! It solves the following problems:\n//! - have sync and async tasks communicate together\n//! - make these tasks observable\n//! - make these tasks modular and testable\n//! - detect when some task is stuck and does not progress anymore\n\nuse std::fmt;\nuse std::num::NonZeroU64;\n\nuse once_cell::sync::Lazy;\nuse tokio::time::Duration;\nmod actor;\nmod actor_context;\nmod actor_handle;\nmod actor_state;\n#[doc(hidden)]\npub mod channel_with_priority;\nmod command;\nmod envelope;\nmod mailbox;\nmod observation;\nmod registry;\npub(crate) mod scheduler;\nmod spawn_builder;\nmod supervisor;\n\npub use scheduler::{SchedulerClient, start_scheduler};\n\n#[cfg(test)]\npub(crate) mod tests;\nmod universe;\n\npub use actor::{Actor, ActorExitStatus, DeferableReplyHandler, Handler};\npub use actor_handle::{ActorHandle, Health, Healthz, Supervisable};\npub use command::{Command, Observe};\npub use observation::{Observation, ObservationType};\nuse quickwit_common::KillSwitch;\npub use spawn_builder::SpawnContext;\nuse thiserror::Error;\nuse tracing::{info, warn};\npub use universe::Universe;\n\npub use self::actor_context::ActorContext;\npub use self::actor_state::ActorState;\npub use self::channel_with_priority::{QueueCapacity, RecvError, 
SendError, TrySendError};\npub use self::mailbox::{Inbox, Mailbox, WeakMailbox};\npub use self::registry::ActorObservation;\npub use self::supervisor::{Supervisor, SupervisorMetrics, SupervisorState};\n\n/// Heartbeat used to verify that actors are progressing.\n///\n/// If an actor does not advertise progress within an interval of duration `HEARTBEAT`,\n/// its supervisor will consider it as blocked and will proceed to kill it, as well\n/// as all of the actors that share the killswitch.\npub static HEARTBEAT: Lazy<Duration> = Lazy::new(heartbeat_from_env_or_default);\n\n/// Returns the actor's heartbeat duration:\n/// - Derived from `QW_ACTOR_HEARTBEAT_SECS` if set and valid.\n/// - Defaults to 30 seconds or 500ms for tests.\nfn heartbeat_from_env_or_default() -> Duration {\n    if cfg!(any(test, feature = \"testsuite\")) {\n        // Right now some unit tests end when we detect that a\n        // pipeline has terminated, which can require waiting\n        // for a heartbeat.\n        //\n        // We use a shorter heartbeat to reduce the time running unit tests.\n        return Duration::from_millis(500);\n    }\n    match std::env::var(\"QW_ACTOR_HEARTBEAT_SECS\") {\n        Ok(actor_heartbeat_secs_str) => {\n            if let Ok(actor_heartbeat_secs) = actor_heartbeat_secs_str.parse::<NonZeroU64>() {\n                info!(\"set the actor heartbeat to {actor_heartbeat_secs} seconds\");\n                return Duration::from_secs(actor_heartbeat_secs.get());\n            } else {\n                warn!(\n                    \"failed to parse `QW_ACTOR_HEARTBEAT_SECS={actor_heartbeat_secs_str}` in \\\n                     seconds > 0, using default heartbeat (30 seconds)\"\n                );\n            };\n        }\n        Err(std::env::VarError::NotUnicode(os_str)) => {\n            warn!(\n                \"failed to parse `QW_ACTOR_HEARTBEAT_SECS={os_str:?}` in a valid unicode string, \\\n                 using default heartbeat (30 
seconds)\"\n            );\n        }\n        Err(std::env::VarError::NotPresent) => {}\n    }\n    Duration::from_secs(30)\n}\n\n/// Time we accept to wait for a new observation.\n///\n/// Once this time is elapsed, we just return the last observation.\nconst OBSERVE_TIMEOUT: Duration = Duration::from_secs(3);\n\n/// Error that occurred while calling `ActorContext::ask(..)` or `Universe::ask`\n#[derive(Error, Debug)]\npub enum AskError<E: fmt::Debug> {\n    #[error(\"message could not be delivered\")]\n    MessageNotDelivered,\n    #[error(\"error while the message was being processed\")]\n    ProcessMessageError,\n    #[error(\"the handler returned an error: `{0:?}`\")]\n    ErrorReply(#[from] E),\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/mailbox.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::Any;\nuse std::convert::Infallible;\nuse std::fmt;\nuse std::sync::atomic::{AtomicUsize, Ordering};\nuse std::sync::{Arc, OnceLock, Weak};\nuse std::time::Instant;\n\nuse quickwit_common::metrics::{GaugeGuard, IntCounter, IntGauge};\nuse tokio::sync::oneshot;\n\nuse crate::channel_with_priority::{Receiver, Sender, TrySendError};\nuse crate::envelope::{Envelope, wrap_in_envelope};\nuse crate::scheduler::SchedulerClient;\nuse crate::{Actor, AskError, Command, DeferableReplyHandler, QueueCapacity, RecvError, SendError};\n\n/// A mailbox is the object that makes it possible to send a message\n/// to an actor.\n///\n/// It is lightweight to clone.\n///\n/// The actor holds its `Inbox` counterpart.\n///\n/// The mailbox can receive high priority and low priority messages.\n/// Commands are typically sent as high priority messages, whereas regular\n/// actor messages are sent to the low priority channel.\n///\n/// Whenever a high priority message is available, it is processed\n/// before low priority messages.\n///\n/// If all mailboxes are dropped, the actor will process all of the pending messages\n/// and gracefully exit with [`crate::actor::ActorExitStatus::Success`].\npub struct Mailbox<A: Actor> {\n    inner: Arc<Inner<A>>,\n    // We do not rely on the `Arc::strong_count` here to avoid an intricate\n    // race condition. 
We want to make sure the processing of the `Nudge`\n    // message happens AFTER we decrement the refcount.\n    ref_count: Arc<AtomicUsize>,\n}\n\nimpl<A: Actor> Mailbox<A> {\n    pub fn downgrade(&self) -> WeakMailbox<A> {\n        WeakMailbox {\n            inner: Arc::downgrade(&self.inner),\n            ref_count: Arc::downgrade(&self.ref_count),\n        }\n    }\n}\n\nimpl<A: Actor> Drop for Mailbox<A> {\n    fn drop(&mut self) {\n        let old_val = self.ref_count.fetch_sub(1, Ordering::SeqCst);\n        if old_val == 2 {\n            // This was the last mailbox.\n            // `ref_count == 1` means that only the mailbox in the ActorContext\n            // is remaining.\n            let _ = self.send_message_with_high_priority(Command::Nudge);\n        }\n    }\n}\n\n#[derive(Copy, Clone)]\npub(crate) enum Priority {\n    High,\n    Low,\n}\n\nimpl<A: Actor> Clone for Mailbox<A> {\n    fn clone(&self) -> Self {\n        self.ref_count.fetch_add(1, Ordering::SeqCst);\n        Mailbox {\n            inner: self.inner.clone(),\n            ref_count: self.ref_count.clone(),\n        }\n    }\n}\n\nimpl<A: Actor> Mailbox<A> {\n    pub(crate) fn is_last_mailbox(&self) -> bool {\n        self.ref_count.load(Ordering::SeqCst) == 1\n    }\n\n    pub fn id(&self) -> &str {\n        &self.inner.instance_id\n    }\n\n    pub(crate) fn scheduler_client(&self) -> Option<&SchedulerClient> {\n        self.inner.scheduler_client_opt.as_ref()\n    }\n}\n\nstruct Inner<A: Actor> {\n    pub(crate) tx: Sender<Envelope<A>>,\n    scheduler_client_opt: Option<SchedulerClient>,\n    instance_id: String,\n}\n\nimpl<A: Actor> fmt::Debug for Mailbox<A> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_tuple(\"Mailbox\")\n            .field(&self.actor_instance_id())\n            .finish()\n    }\n}\n\nimpl<A: Actor> Mailbox<A> {\n    pub fn actor_instance_id(&self) -> &str {\n        &self.inner.instance_id\n    }\n\n    pub fn is_disconnected(&self) 
-> bool {\n        self.inner.tx.is_disconnected()\n    }\n\n    /// Sends a message to the actor owning the associated inbox.\n    ///\n    /// From an actor context, use the `ActorContext::send_message` method instead.\n    ///\n    /// SendError is returned if the actor has already exited.\n    pub async fn send_message<M>(\n        &self,\n        message: M,\n    ) -> Result<oneshot::Receiver<A::Reply>, SendError>\n    where\n        A: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        self.send_message_with_backpressure_counter(message, None)\n            .await\n    }\n\n    /// Attempts to queue a message in the low priority channel of the mailbox.\n    ///\n    /// If sending the message would block, the method simply returns `TrySendError::Full(message)`.\n    pub fn try_send_message<M>(\n        &self,\n        message: M,\n    ) -> Result<oneshot::Receiver<A::Reply>, TrySendError<M>>\n    where\n        A: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let (envelope, response_rx) = self.wrap_in_envelope(message);\n        self.inner\n            .tx\n            .try_send_low_priority(envelope)\n            .map_err(|err| {\n                match err {\n                    TrySendError::Disconnected => TrySendError::Disconnected,\n                    TrySendError::Full(mut envelope) => {\n                        // We need to unpack the envelope.\n                        let message: M = envelope.message_typed().unwrap();\n                        TrySendError::Full(message)\n                    }\n                }\n            })?;\n        Ok(response_rx)\n    }\n\n    fn wrap_in_envelope<M>(&self, message: M) -> (Envelope<A>, oneshot::Receiver<A::Reply>)\n    where\n        A: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let guard = self\n            .inner\n            .scheduler_client_opt\n            .as_ref()\n            
.map(|scheduler_client| scheduler_client.no_advance_time_guard());\n        wrap_in_envelope(message, guard)\n    }\n\n    /// Sends a message to the actor owning the associated inbox.\n    ///\n    /// If the actor experiences some backpressure, then\n    /// `backpressure_micros` will be increased by the amount of\n    /// microseconds of backpressure experienced.\n    pub async fn send_message_with_backpressure_counter<M>(\n        &self,\n        message: M,\n        backpressure_micros_counter_opt: Option<&IntCounter>,\n    ) -> Result<oneshot::Receiver<A::Reply>, SendError>\n    where\n        A: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let (envelope, response_rx) = self.wrap_in_envelope(message);\n        match self.inner.tx.try_send_low_priority(envelope) {\n            Ok(()) => Ok(response_rx),\n            Err(TrySendError::Full(envelope)) => {\n                if let Some(backpressure_micros_counter) = backpressure_micros_counter_opt {\n                    let now = Instant::now();\n                    self.inner.tx.send_low_priority(envelope).await?;\n                    let elapsed = now.elapsed();\n                    backpressure_micros_counter.inc_by(elapsed.as_micros() as u64);\n                } else {\n                    self.inner.tx.send_low_priority(envelope).await?;\n                }\n                Ok(response_rx)\n            }\n            Err(TrySendError::Disconnected) => Err(SendError::Disconnected),\n        }\n    }\n\n    pub fn send_message_with_high_priority<M>(\n        &self,\n        message: M,\n    ) -> Result<oneshot::Receiver<A::Reply>, SendError>\n    where\n        A: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let (envelope, response_rx) = self.wrap_in_envelope(message);\n        self.inner.tx.send_high_priority(envelope)?;\n        Ok(response_rx)\n    }\n\n    pub(crate) async fn send_message_with_priority<M>(\n        &self,\n        
message: M,\n        priority: Priority,\n    ) -> Result<oneshot::Receiver<A::Reply>, SendError>\n    where\n        A: DeferableReplyHandler<M>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let (envelope, response_rx) = self.wrap_in_envelope(message);\n        match priority {\n            Priority::High => self.inner.tx.send_high_priority(envelope)?,\n            Priority::Low => {\n                self.inner.tx.send_low_priority(envelope).await?;\n            }\n        }\n        Ok(response_rx)\n    }\n\n    /// Similar to `send_message`, except this method\n    /// waits asynchronously for the actor reply.\n    ///\n    /// From an actor context, use the `ActorContext::ask` method instead.\n    pub async fn ask<M, T>(&self, message: M) -> Result<T, AskError<Infallible>>\n    where\n        A: DeferableReplyHandler<M, Reply = T>,\n        M: fmt::Debug + Send + 'static,\n    {\n        self.ask_with_backpressure_counter(message, None).await\n    }\n\n    /// Similar to `ask`, but if a backpressure counter is passed,\n    /// it increments the amount of time spent in the backpressure.\n    ///\n    /// The backpressure duration only includes the amount of time\n    /// it took to `queue` the request into the actor pipeline.\n    ///\n    /// It does not include\n    /// - the amount spent waiting in the queue,\n    /// - the amount spent processing the message.\n    ///\n    /// From an actor context, use the `ActorContext::ask` method instead.\n    pub async fn ask_with_backpressure_counter<M, T>(\n        &self,\n        message: M,\n        backpressure_micros_counter_opt: Option<&IntCounter>,\n    ) -> Result<T, AskError<Infallible>>\n    where\n        A: DeferableReplyHandler<M, Reply = T>,\n        M: fmt::Debug + Send + 'static,\n    {\n        let resp = self\n            .send_message_with_backpressure_counter(message, backpressure_micros_counter_opt)\n            .await;\n        resp.map_err(|_send_error| 
AskError::MessageNotDelivered)?\n            .await\n            .map_err(|_| AskError::ProcessMessageError)\n    }\n\n    /// Similar to `send_message`, except this method\n    /// waits asynchronously for the actor reply.\n    ///\n    /// From an actor context, use the `ActorContext::ask` method instead.\n    pub async fn ask_for_res<M, T, E>(&self, message: M) -> Result<T, AskError<E>>\n    where\n        A: DeferableReplyHandler<M, Reply = Result<T, E>>,\n        M: fmt::Debug + Send + 'static,\n        E: fmt::Debug,\n    {\n        self.send_message(message)\n            .await\n            .map_err(|_send_error| AskError::MessageNotDelivered)?\n            .await\n            .map_err(|_| AskError::ProcessMessageError)?\n            .map_err(AskError::from)\n    }\n}\n\nstruct InboxInner<A: Actor> {\n    rx: Receiver<Envelope<A>>,\n    _inboxes_count_gauge_guard: GaugeGuard<'static>,\n}\n\npub struct Inbox<A: Actor> {\n    inner: Arc<InboxInner<A>>,\n}\n\nimpl<A: Actor> Clone for Inbox<A> {\n    fn clone(&self) -> Self {\n        Inbox {\n            inner: self.inner.clone(),\n        }\n    }\n}\n\nimpl<A: Actor> Inbox<A> {\n    pub(crate) fn is_empty(&self) -> bool {\n        self.inner.rx.is_empty()\n    }\n\n    pub(crate) async fn recv(&self) -> Result<Envelope<A>, RecvError> {\n        self.inner.rx.recv().await\n    }\n\n    pub(crate) async fn recv_cmd_and_scheduled_msg_only(&self) -> Envelope<A> {\n        self.inner.rx.recv_high_priority().await\n    }\n\n    pub(crate) fn try_recv(&self) -> Result<Envelope<A>, RecvError> {\n        self.inner.rx.try_recv()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn recv_typed_message<M: 'static>(&self) -> Result<M, RecvError> {\n        loop {\n            match self.inner.rx.recv().await {\n                Ok(mut envelope) => {\n                    if let Some(msg) = envelope.message_typed() {\n                        return Ok(msg);\n                    }\n                }\n    
            Err(err) => {\n                    return Err(err);\n                }\n            }\n        }\n    }\n\n    /// Destroys the inbox and returns the list of pending messages or commands\n    /// in the low priority channel.\n    ///\n    /// Warning this iterator might never be exhausted if there is a living\n    /// mailbox associated to it.\n    pub fn drain_for_test(&self) -> Vec<Box<dyn Any>> {\n        self.inner\n            .rx\n            .drain_low_priority()\n            .into_iter()\n            .map(|mut envelope| envelope.message())\n            .collect()\n    }\n\n    /// Destroys the inbox and returns the list of pending messages or commands\n    /// in the low priority channel.\n    ///\n    /// Warning this iterator might never be exhausted if there is a living\n    /// mailbox associated to it.\n    pub fn drain_for_test_typed<M: 'static>(&self) -> Vec<M> {\n        self.inner\n            .rx\n            .drain_low_priority()\n            .into_iter()\n            .flat_map(|mut envelope| envelope.message_typed())\n            .collect()\n    }\n}\n\nfn get_actor_inboxes_count_gauge_guard() -> GaugeGuard<'static> {\n    static INBOX_GAUGE: std::sync::OnceLock<IntGauge> = OnceLock::new();\n    let gauge = INBOX_GAUGE.get_or_init(|| {\n        quickwit_common::metrics::new_gauge(\n            \"inboxes_count\",\n            \"overall count of actors\",\n            \"actor\",\n            &[],\n        )\n    });\n    let mut gauge_guard = GaugeGuard::from_gauge(gauge);\n    gauge_guard.add(1);\n    gauge_guard\n}\n\npub(crate) fn create_mailbox<A: Actor>(\n    actor_name: String,\n    queue_capacity: QueueCapacity,\n    scheduler_client_opt: Option<SchedulerClient>,\n) -> (Mailbox<A>, Inbox<A>) {\n    let (tx, rx) = crate::channel_with_priority::channel(queue_capacity);\n    let ref_count = Arc::new(AtomicUsize::new(1));\n    let mailbox = Mailbox {\n        inner: Arc::new(Inner {\n            tx,\n            instance_id: 
quickwit_common::new_coolid(&actor_name),\n            scheduler_client_opt,\n        }),\n        ref_count,\n    };\n    let inner = InboxInner {\n        rx,\n        _inboxes_count_gauge_guard: get_actor_inboxes_count_gauge_guard(),\n    };\n    let inbox = Inbox {\n        inner: Arc::new(inner),\n    };\n    (mailbox, inbox)\n}\n\npub struct WeakMailbox<A: Actor> {\n    inner: Weak<Inner<A>>,\n    ref_count: Weak<AtomicUsize>,\n}\n\nimpl<A: Actor> Clone for WeakMailbox<A> {\n    fn clone(&self) -> Self {\n        Self {\n            inner: self.inner.clone(),\n            ref_count: self.ref_count.clone(),\n        }\n    }\n}\n\nimpl<A: Actor> WeakMailbox<A> {\n    pub fn upgrade(&self) -> Option<Mailbox<A>> {\n        let inner = self.inner.upgrade()?;\n        let ref_count = self.ref_count.upgrade()?;\n        ref_count.fetch_add(1, Ordering::SeqCst);\n        Some(Mailbox { inner, ref_count })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::mem;\n    use std::time::Duration;\n\n    use super::*;\n    use crate::tests::{Ping, PingReceiverActor};\n    use crate::{ActorContext, ActorExitStatus, Handler, Universe};\n\n    #[tokio::test]\n    async fn test_weak_mailbox_downgrade_upgrade() {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, _inbox) = universe.create_test_mailbox::<PingReceiverActor>();\n        let weak_mailbox = mailbox.downgrade();\n        assert!(weak_mailbox.upgrade().is_some());\n    }\n\n    #[tokio::test]\n    async fn test_weak_mailbox_failing_upgrade() {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, _inbox) = universe.create_test_mailbox::<PingReceiverActor>();\n        let weak_mailbox = mailbox.downgrade();\n        drop(mailbox);\n        assert!(weak_mailbox.upgrade().is_none());\n    }\n\n    struct BackPressureActor;\n\n    impl Actor for BackPressureActor {\n        type ObservableState = ();\n\n        fn observable_state(&self) -> Self::ObservableState 
{}\n\n        fn queue_capacity(&self) -> QueueCapacity {\n            QueueCapacity::Bounded(0)\n        }\n\n        fn yield_after_each_message(&self) -> bool {\n            false\n        }\n    }\n\n    use async_trait::async_trait;\n\n    #[async_trait]\n    impl Handler<Duration> for BackPressureActor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            sleep_duration: Duration,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            if !sleep_duration.is_zero() {\n                tokio::time::sleep(sleep_duration).await;\n            }\n            Ok(())\n        }\n    }\n\n    #[tokio::test]\n    async fn test_mailbox_send_with_backpressure_counter_low_backpressure() {\n        let universe = Universe::with_accelerated_time();\n        let back_pressure_actor = BackPressureActor;\n        let (mailbox, _handle) = universe.spawn_builder().spawn(back_pressure_actor);\n        // We send a first message to make sure the actor has been properly spawned and is listening\n        // for new messages.\n        mailbox\n            .ask_with_backpressure_counter(Duration::default(), None)\n            .await\n            .unwrap();\n        // At this point the actor was started and even processed a message entirely.\n        let backpressure_micros_counter =\n            IntCounter::new(\"test_counter\", \"help for test_counter\").unwrap();\n        let wait_duration = Duration::from_millis(1);\n        let processed = mailbox\n            .send_message_with_backpressure_counter(\n                wait_duration,\n                Some(&backpressure_micros_counter),\n            )\n            .await\n            .unwrap();\n        assert!(backpressure_micros_counter.get() < 500);\n        processed.await.unwrap();\n        assert!(backpressure_micros_counter.get() < 500);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn 
test_mailbox_send_with_backpressure_counter_backpressure() {\n        let universe = Universe::with_accelerated_time();\n        let back_pressure_actor = BackPressureActor;\n        let (mailbox, _handle) = universe.spawn_builder().spawn(back_pressure_actor);\n        // We send a first message to make sure the actor has been properly spawned and is listening\n        // for new messages.\n        mailbox\n            .ask_with_backpressure_counter(Duration::default(), None)\n            .await\n            .unwrap();\n        let backpressure_micros_counter =\n            IntCounter::new(\"test_counter\", \"help for test_counter\").unwrap();\n        let wait_duration = Duration::from_millis(1);\n        mailbox\n            .send_message_with_backpressure_counter(\n                wait_duration,\n                Some(&backpressure_micros_counter),\n            )\n            .await\n            .unwrap();\n        // That second message will present some backpressure, since the capacity is 0 and\n        // the first message will take 1000 micros to be processed.\n        mailbox\n            .send_message_with_backpressure_counter(\n                Duration::default(),\n                Some(&backpressure_micros_counter),\n            )\n            .await\n            .unwrap();\n        assert!(backpressure_micros_counter.get() > 1_000u64);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_mailbox_waiting_for_processing_does_not_counter_as_backpressure() {\n        let universe = Universe::with_accelerated_time();\n        let back_pressure_actor = BackPressureActor;\n        let (mailbox, _handle) = universe.spawn_builder().spawn(back_pressure_actor);\n        mailbox\n            .ask_with_backpressure_counter(Duration::default(), None)\n            .await\n            .unwrap();\n        let backpressure_micros_counter =\n            IntCounter::new(\"test_counter\", \"help for test_counter\").unwrap();\n        let 
start = Instant::now();\n        mailbox\n            .ask_with_backpressure_counter(Duration::from_millis(1), None)\n            .await\n            .unwrap();\n        let elapsed = start.elapsed();\n        assert!(elapsed.as_micros() > 1000);\n        assert_eq!(backpressure_micros_counter.get(), 0);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_try_send() {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, _inbox) = universe\n            .create_mailbox::<PingReceiverActor>(\"hello\".to_string(), QueueCapacity::Bounded(1));\n        assert!(mailbox.try_send_message(Ping).is_ok());\n        assert!(matches!(\n            mailbox.try_send_message(Ping).unwrap_err(),\n            TrySendError::Full(Ping)\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_try_send_disconnect() {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, inbox) = universe\n            .create_mailbox::<PingReceiverActor>(\"hello\".to_string(), QueueCapacity::Bounded(1));\n        assert!(mailbox.try_send_message(Ping).is_ok());\n        mem::drop(inbox);\n        assert!(matches!(\n            mailbox.try_send_message(Ping).unwrap_err(),\n            TrySendError::Disconnected\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_weak_mailbox_ref_count() {\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, _inbox) = universe\n            .create_mailbox::<PingReceiverActor>(\"hello\".to_string(), QueueCapacity::Bounded(1));\n        assert!(mailbox.is_last_mailbox());\n        let weak_mailbox = mailbox.downgrade();\n        let second_mailbox = weak_mailbox.upgrade().unwrap();\n        assert!(!mailbox.is_last_mailbox());\n        drop(second_mailbox);\n        assert!(mailbox.is_last_mailbox());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/observation.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::ops::Deref;\n\n#[derive(Debug)]\npub struct Observation<ObservableState> {\n    pub obs_type: ObservationType,\n    pub state: ObservableState,\n}\n\nimpl<ObservableState> Deref for Observation<ObservableState> {\n    type Target = ObservableState;\n\n    fn deref(&self) -> &Self::Target {\n        &self.state\n    }\n}\n\n// Describes the actual outcome of an observation.\n#[derive(Clone, Copy, Debug, Eq, PartialEq)]\npub enum ObservationType {\n    /// The actor is alive and was able to snapshot its state within `HEARTBEAT`.\n    Alive,\n    /// An observation could not be made within `HEARTBEAT`, because\n    /// the actor had too much work. In that case, in a best effort fashion, the\n    /// last observed state is returned. The actor will still update its state,\n    /// as soon as it has finished processing the current message.\n    Timeout,\n    /// The actor has exited. The post-mortem state is joined.\n    PostMortem,\n}\n\nimpl<State: fmt::Debug + PartialEq> PartialEq for Observation<State> {\n    fn eq(&self, other: &Self) -> bool {\n        self.obs_type.eq(&other.obs_type) && self.state.eq(&other.state)\n    }\n}\n\nimpl<State: fmt::Debug + PartialEq + Eq> Eq for Observation<State> {}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/registry.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::{Any, TypeId};\nuse std::collections::HashMap;\nuse std::pin::Pin;\nuse std::sync::{Arc, RwLock};\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse futures::future::{self, Shared};\nuse futures::{Future, FutureExt};\nuse serde::Serialize;\nuse serde_json::Value as JsonValue;\nuse tokio::task::JoinHandle;\n\nuse crate::command::Observe;\nuse crate::mailbox::WeakMailbox;\nuse crate::{Actor, ActorExitStatus, Command, Mailbox};\n\nstruct TypedJsonObservable<A: Actor> {\n    actor_instance_id: String,\n    weak_mailbox: WeakMailbox<A>,\n    join_handle: ActorJoinHandle,\n}\n\n#[async_trait]\ntrait JsonObservable: Sync + Send {\n    fn is_disconnected(&self) -> bool;\n    fn any(&self) -> &dyn Any;\n    fn actor_instance_id(&self) -> &str;\n    async fn observe(&self) -> Option<JsonValue>;\n    async fn quit(&self) -> ActorExitStatus;\n    async fn join(&self) -> ActorExitStatus;\n}\n\n#[async_trait]\nimpl<A: Actor> JsonObservable for TypedJsonObservable<A> {\n    fn is_disconnected(&self) -> bool {\n        self.weak_mailbox\n            .upgrade()\n            .map(|mailbox| mailbox.is_disconnected())\n            .unwrap_or(true)\n    }\n    fn any(&self) -> &dyn Any {\n        &self.weak_mailbox\n    }\n    fn actor_instance_id(&self) -> &str {\n        self.actor_instance_id.as_str()\n    }\n    async fn 
observe(&self) -> Option<JsonValue> {\n        let mailbox = self.weak_mailbox.upgrade()?;\n        let oneshot_rx = mailbox.send_message_with_high_priority(Observe).ok()?;\n        let state: <A as Actor>::ObservableState = oneshot_rx.await.ok()?;\n        serde_json::to_value(&state).ok()\n    }\n\n    async fn quit(&self) -> ActorExitStatus {\n        if let Some(mailbox) = self.weak_mailbox.upgrade() {\n            let _ = mailbox.send_message_with_high_priority(Command::Quit);\n        }\n        self.join().await\n    }\n\n    async fn join(&self) -> ActorExitStatus {\n        self.join_handle.join().await\n    }\n}\n\n#[derive(Default, Clone)]\npub(crate) struct ActorRegistry {\n    actors: Arc<RwLock<HashMap<TypeId, ActorRegistryForSpecificType>>>,\n}\n\nstruct ActorRegistryForSpecificType {\n    type_name: &'static str,\n    observables: Vec<Arc<dyn JsonObservable>>,\n}\n\nimpl ActorRegistryForSpecificType {\n    fn for_type<A>() -> ActorRegistryForSpecificType {\n        ActorRegistryForSpecificType {\n            type_name: std::any::type_name::<A>(),\n            observables: Vec::new(),\n        }\n    }\n\n    fn gc(&mut self) {\n        let mut i = 0;\n        while i < self.observables.len() {\n            if self.observables[i].is_disconnected() {\n                self.observables.swap_remove(i);\n            } else {\n                i += 1;\n            }\n        }\n    }\n}\n\n#[derive(Serialize, Debug)]\npub struct ActorObservation {\n    pub type_name: &'static str,\n    pub instance_id: String,\n    pub obs: Option<JsonValue>,\n}\n\nimpl ActorRegistry {\n    pub fn register<A: Actor>(&self, mailbox: &Mailbox<A>, join_handle: ActorJoinHandle) {\n        let typed_id = TypeId::of::<A>();\n        let actor_instance_id = mailbox.actor_instance_id().to_string();\n        let weak_mailbox = mailbox.downgrade();\n        self.actors\n            .write()\n            .unwrap()\n            .entry(typed_id)\n            .or_insert_with(|| 
ActorRegistryForSpecificType::for_type::<A>())\n            .observables\n            .push(Arc::new(TypedJsonObservable {\n                weak_mailbox,\n                actor_instance_id,\n                join_handle,\n            }));\n    }\n\n    pub async fn observe(&self, timeout: Duration) -> Vec<ActorObservation> {\n        self.gc();\n        let mut obs_futures = Vec::new();\n        for registry_for_type in self.actors.read().unwrap().values() {\n            for obs in &registry_for_type.observables {\n                if obs.is_disconnected() {\n                    continue;\n                }\n                let obs_clone = obs.clone();\n                let type_name = registry_for_type.type_name;\n                let instance_id = obs.actor_instance_id().to_string();\n                obs_futures.push(async move {\n                    let obs = tokio::time::timeout(timeout, obs_clone.observe())\n                        .await\n                        .unwrap_or(None);\n                    ActorObservation {\n                        type_name,\n                        instance_id,\n                        obs,\n                    }\n                });\n            }\n        }\n        future::join_all(obs_futures.into_iter()).await\n    }\n\n    pub fn get<A: Actor>(&self) -> Vec<Mailbox<A>> {\n        let mut lock = self.actors.write().unwrap();\n        get_iter::<A>(&mut lock).collect()\n    }\n\n    pub fn get_one<A: Actor>(&self) -> Option<Mailbox<A>> {\n        let mut lock = self.actors.write().unwrap();\n        get_iter::<A>(&mut lock).next()\n    }\n\n    fn gc(&self) {\n        for registry_for_type in self.actors.write().unwrap().values_mut() {\n            registry_for_type.gc();\n        }\n    }\n\n    pub async fn quit(&self) -> HashMap<String, ActorExitStatus> {\n        let mut obs_futures = Vec::new();\n        let mut actor_ids = Vec::new();\n        for registry_for_type in self.actors.read().unwrap().values() {\n            for 
obs in &registry_for_type.observables {\n                let obs_clone = obs.clone();\n                obs_futures.push(async move { obs_clone.quit().await });\n                actor_ids.push(obs.actor_instance_id().to_string());\n            }\n        }\n        let res = future::join_all(obs_futures).await;\n        actor_ids.into_iter().zip(res).collect()\n    }\n\n    pub fn is_empty(&self) -> bool {\n        self.actors\n            .read()\n            .unwrap()\n            .values()\n            .all(|registry_for_type| {\n                registry_for_type\n                    .observables\n                    .iter()\n                    .all(|obs| obs.is_disconnected())\n            })\n    }\n}\n\nfn get_iter<A: Actor>(\n    actors: &mut HashMap<TypeId, ActorRegistryForSpecificType>,\n) -> impl Iterator<Item = Mailbox<A>> + '_ {\n    let typed_id = TypeId::of::<A>();\n    actors\n        .get(&typed_id)\n        .into_iter()\n        .flat_map(|registry_for_type| {\n            registry_for_type\n                .observables\n                .iter()\n                .flat_map(|box_any| box_any.any().downcast_ref::<WeakMailbox<A>>())\n                .flat_map(|weak_mailbox| weak_mailbox.upgrade())\n        })\n        .filter(|mailbox| !mailbox.is_disconnected())\n}\n\n/// This structure contains an optional exit handle. 
The handle is present\n/// until the join() method is called.\n#[derive(Clone)]\npub(crate) struct ActorJoinHandle {\n    holder: Shared<Pin<Box<dyn Future<Output = ActorExitStatus> + Send>>>,\n}\n\nimpl ActorJoinHandle {\n    pub(crate) fn new(join_handle: JoinHandle<ActorExitStatus>) -> Self {\n        ActorJoinHandle {\n            holder: Self::inner_join(join_handle).boxed().shared(),\n        }\n    }\n\n    async fn inner_join(join_handle: JoinHandle<ActorExitStatus>) -> ActorExitStatus {\n        join_handle.await.unwrap_or_else(|join_err| {\n            if join_err.is_panic() {\n                ActorExitStatus::Panicked\n            } else {\n                ActorExitStatus::Killed\n            }\n        })\n    }\n\n    /// Joins the actor and returns its exit status.\n    ///\n    /// The underlying future is shared, so every invocation returns the same exit status.\n    pub(crate) async fn join(&self) -> ActorExitStatus {\n        self.holder.clone().await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use crate::Universe;\n    use crate::tests::PingReceiverActor;\n\n    #[tokio::test]\n    async fn test_registry() {\n        let test_actor = PingReceiverActor::default();\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, _handle) = universe.spawn_builder().spawn(test_actor);\n        let _actor_mailbox = universe.get_one::<PingReceiverActor>().unwrap();\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_registry_killed_actor() {\n        let test_actor = PingReceiverActor::default();\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handle) = universe.spawn_builder().spawn(test_actor);\n        handle.kill().await;\n        assert!(universe.get_one::<PingReceiverActor>().is_none());\n    }\n\n    #[tokio::test]\n    async fn test_registry_last_mailbox_dropped_actor() {\n        let test_actor = PingReceiverActor::default();\n        let universe = 
Universe::with_accelerated_time();\n        let (mailbox, handle) = universe.spawn_builder().spawn(test_actor);\n        drop(mailbox);\n        handle.join().await;\n        assert!(universe.get_one::<PingReceiverActor>().is_none());\n    }\n\n    #[tokio::test]\n    async fn test_get_actor_states() {\n        let test_actor = PingReceiverActor::default();\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, _handle) = universe.spawn_builder().spawn(test_actor);\n        let obs = universe.observe(Duration::from_millis(1000)).await;\n        assert_eq!(obs.len(), 1);\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/scheduler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Reverse;\nuse std::collections::BinaryHeap;\nuse std::collections::binary_heap::PeekMut;\nuse std::future::Future;\nuse std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};\nuse std::sync::{Arc, Weak};\nuse std::time::{Duration, Instant};\n\nuse quickwit_common::spawn_named_task;\nuse tokio::sync::oneshot;\nuse tokio::task::JoinHandle;\n\ntype Callback = Box<dyn FnOnce() + Sync + Send + 'static>;\n\nstruct TimeoutEvent {\n    deadline: Instant,\n    event_id: u64, //< only useful to break ties in a deterministic way.\n    callback: Callback,\n}\n\nimpl PartialEq for TimeoutEvent {\n    fn eq(&self, other: &Self) -> bool {\n        self.event_id == other.event_id\n    }\n}\n\nimpl Eq for TimeoutEvent {}\n\nimpl PartialOrd for TimeoutEvent {\n    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl Ord for TimeoutEvent {\n    fn cmp(&self, other: &Self) -> std::cmp::Ordering {\n        self.deadline\n            .cmp(&other.deadline)\n            .then_with(|| self.event_id.cmp(&other.event_id))\n    }\n}\n\nenum SchedulerMessage {\n    ProcessTime,\n    Schedule {\n        callback: Callback,\n        timeout: Duration,\n    },\n}\n\n#[derive(Clone)]\npub struct SchedulerClient {\n    inner: Arc<SchedulerClientInner>,\n}\n\nstruct SchedulerClientInner {\n    
no_advance_time_guard_count: AtomicUsize,\n    accelerate_time: AtomicBool,\n    tx: flume::Sender<SchedulerMessage>,\n}\n\nimpl SchedulerClient {\n    /// Returns true if someone asked for the time to be accelerated.\n    fn time_is_accelerated(&self) -> bool {\n        self.inner.accelerate_time.load(Ordering::Relaxed)\n    }\n\n    /// Returns true if something is preventing time from being accelerated.\n    fn is_advance_time_forbidden(&self) -> bool {\n        self.inner\n            .no_advance_time_guard_count\n            .load(Ordering::SeqCst)\n            > 0\n    }\n\n    /// Schedules a new event.\n    /// Once `timeout` has elapsed, `callback` is\n    /// executed.\n    ///\n    /// `callback` will be executed in the scheduler task, so it is\n    /// required to be short.\n    pub fn schedule_event<F: FnOnce() + Send + Sync + 'static>(\n        &self,\n        callback: F,\n        timeout: Duration,\n    ) {\n        let _ = self.inner.tx.send(SchedulerMessage::Schedule {\n            callback: Box::new(callback),\n            timeout,\n        });\n    }\n\n    // Increases the number of reasons not to advance simulated time.\n    pub(crate) fn inc_no_advance_time(&self) {\n        self.inner\n            .no_advance_time_guard_count\n            .fetch_add(1, Ordering::SeqCst);\n    }\n\n    // Decreases the number of reasons not to advance simulated time.\n    //\n    // If the number reaches 0, we trigger a `process_time`.\n    pub(crate) fn dec_no_advance_time(&self) {\n        let previous_count = self\n            .inner\n            .no_advance_time_guard_count\n            .fetch_sub(1, Ordering::SeqCst);\n        if previous_count == 1 {\n            self.process_time();\n        }\n    }\n\n    /// Switches the scheduler to accelerated time mode.\n    ///\n    /// The scheduler will jump in time whenever no `NoAdvanceTimeGuard` is alive.\n    pub fn accelerate_time(&self) {\n        self.inner.accelerate_time.store(true, Ordering::Relaxed);\n 
       self.process_time();\n    }\n\n    pub async fn sleep(&self, duration: Duration) {\n        let (oneshot_tx, oneshot_rx) = oneshot::channel();\n        self.schedule_event(\n            move || {\n                let _ = oneshot_tx.send(());\n            },\n            duration,\n        );\n        let _ = oneshot_rx.await;\n    }\n\n    pub async fn timeout<O>(\n        &self,\n        duration: Duration,\n        fut: impl Future<Output = O>,\n    ) -> Result<O, ()> {\n        tokio::select! {\n            _ = self.sleep(duration) => {\n                Err(())\n            },\n            future_output = fut => {\n                Ok(future_output)\n            }\n        }\n    }\n\n    // Triggers an event, telling the Scheduler to process time,\n    // checks whether some scheduled events have timed out, or whether we should\n    // jump forward in time.\n    pub(crate) fn process_time(&self) {\n        let _ = self.inner.tx.send(SchedulerMessage::ProcessTime);\n    }\n\n    /// Returns a `NoAdvanceTimeGuard` which calls `inc_no_advance_time`\n    /// on `NoAdvanceTimeGuard::new` and `dec_no_advance_time` when dropped.\n    pub fn no_advance_time_guard(&self) -> NoAdvanceTimeGuard {\n        NoAdvanceTimeGuard::new(self.clone())\n    }\n}\n\npub struct NoAdvanceTimeGuard {\n    scheduler_client: SchedulerClient,\n}\n\nimpl NoAdvanceTimeGuard {\n    fn new(scheduler_client: SchedulerClient) -> Self {\n        scheduler_client.inc_no_advance_time();\n        NoAdvanceTimeGuard { scheduler_client }\n    }\n}\n\nimpl Drop for NoAdvanceTimeGuard {\n    fn drop(&mut self) {\n        self.scheduler_client.dec_no_advance_time();\n    }\n}\n\npub fn start_scheduler() -> SchedulerClient {\n    let (tx, rx) = flume::unbounded::<SchedulerMessage>();\n    let scheduler_client = SchedulerClient {\n        inner: Arc::new(SchedulerClientInner {\n            no_advance_time_guard_count: AtomicUsize::default(),\n            accelerate_time: Default::default(),\n        
    tx,\n        }),\n    };\n    let mut scheduler = Scheduler::new(&scheduler_client);\n    spawn_named_task(\n        async move {\n            while let Ok(scheduler_message) = rx.recv_async().await {\n                match scheduler_message {\n                    SchedulerMessage::ProcessTime => scheduler.process_time(),\n                    SchedulerMessage::Schedule { callback, timeout } => {\n                        scheduler.process_schedule(callback, timeout);\n                    }\n                }\n            }\n        },\n        \"scheduler\",\n    );\n    scheduler_client\n}\n\nstruct Scheduler {\n    // We assign an event_id to every event just to break ties\n    // if two events are scheduled at the same time.\n    event_id_generator: u64,\n    // Simulated time shift which defines the scheduler time reference as `simulated_time =\n    // Instant::now() + simulated_time_shift`. By default `simulated_time_shift` is set to 0\n    // but it can be shifted when the scheduler has to process a simulated sleep event.\n    simulated_time_shift: Duration,\n    future_events: BinaryHeap<Reverse<TimeoutEvent>>,\n    next_timeout: Option<JoinHandle<()>>,\n    weak_scheduler_client: Weak<SchedulerClientInner>,\n}\n\nimpl Scheduler {\n    /// Processes \"time\".\n    ///\n    /// This:\n    /// - identifies all events that have elapsed and executes their callbacks,\n    /// - advances time if necessary,\n    /// - schedules a message to make sure process_time is called in time for the next event.\n    fn process_time(&mut self) {\n        let now = self.simulated_now();\n        // Pops all elapsed events and executes the associated callback.\n        while let Some(next_event_peek) = self.future_events.peek_mut() {\n            if next_event_peek.0.deadline > now {\n                // The next event is out of scope.\n                break;\n            }\n            let next_event = PeekMut::pop(next_event_peek);\n            (next_event.0.callback)();\n       
 }\n\n        // If the conditions to accelerate time are met, we can\n        // advance time and jump straight to the next timeout.\n        self.advance_time_if_necessary();\n        self.schedule_next_timeout();\n    }\n\n    /// Schedules a new event.\n    fn process_schedule(&mut self, callback: Callback, timeout: Duration) {\n        let new_evt_deadline = self.simulated_now() + timeout;\n        let timeout_event = self.timeout_event(new_evt_deadline, callback);\n        self.future_events.push(Reverse(timeout_event));\n        self.process_time();\n    }\n\n    fn scheduler_client(&self) -> Option<SchedulerClient> {\n        let scheduler_client = SchedulerClient {\n            inner: self.weak_scheduler_client.upgrade()?,\n        };\n        Some(scheduler_client)\n    }\n\n    /// Schedules a Timeout event callback if necessary.\n    fn schedule_next_timeout(&mut self) {\n        let Some(scheduler_client) = self.scheduler_client() else {\n            return;\n        };\n        let simulated_now = self.simulated_now();\n        let Some(next_deadline) = self.next_event_deadline() else {\n            return;\n        };\n        let timeout: Duration = if next_deadline <= simulated_now {\n            // This should almost never happen, because we supposedly triggered\n            // all pending events.\n            //\n            // But time has advanced as we were calling the different callbacks\n            // so it is actually possible.\n            Duration::default()\n        } else {\n            next_deadline - simulated_now\n        };\n        if let Some(previous_join_handle) = self.next_timeout.take() {\n            // The next event timeout is about to change. 
Let's cancel the previous\n            // scheduled event.\n            previous_join_handle.abort();\n        }\n        let new_join_handle: JoinHandle<()> = tokio::task::spawn(async move {\n            if timeout.is_zero() {\n                tokio::task::yield_now().await;\n            } else {\n                tokio::time::sleep(timeout).await;\n            }\n            scheduler_client.process_time();\n        });\n        self.next_timeout = Some(new_join_handle);\n    }\n}\n\nimpl Scheduler {\n    pub fn new(scheduler_client: &SchedulerClient) -> Self {\n        Scheduler {\n            event_id_generator: 0u64,\n            simulated_time_shift: Duration::default(),\n            future_events: Default::default(),\n            next_timeout: None,\n            weak_scheduler_client: Arc::downgrade(&scheduler_client.inner),\n        }\n    }\n\n    /// Updates the simulated time shift, if appropriate.\n    ///\n    /// We advance time if:\n    /// - someone is actually requesting for a simulated fast forward in time. (if\n    ///   Universe::simulate_time_shift(..) 
has been called).\n    /// - no message is queued for processing, no initialize or no finalize is being processed.\n    fn advance_time_if_necessary(&mut self) {\n        let Some(scheduler_client) = self.scheduler_client() else {\n            return;\n        };\n        if !scheduler_client.time_is_accelerated() {\n            return;\n        }\n        if scheduler_client.is_advance_time_forbidden() {\n            return;\n        }\n        let Some(advance_to_instant) = self.next_event_deadline() else {\n            return;\n        };\n        let now = self.simulated_now();\n        if let Some(time_shift) = advance_to_instant.checked_duration_since(now) {\n            self.simulated_time_shift += time_shift;\n        }\n    }\n\n    fn next_event_deadline(&self) -> Option<Instant> {\n        self.future_events.peek().map(|rev| rev.0.deadline)\n    }\n\n    fn simulated_now(&self) -> Instant {\n        Instant::now() + self.simulated_time_shift\n    }\n\n    fn timeout_event(&mut self, deadline: Instant, callback: Callback) -> TimeoutEvent {\n        let event_id = self.event_id_generator;\n        self.event_id_generator += 1;\n        TimeoutEvent {\n            deadline,\n            event_id,\n            callback,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n    use std::time::{Duration, Instant};\n\n    use async_trait::async_trait;\n\n    use crate::{Actor, ActorContext, ActorExitStatus, Handler, Universe};\n\n    struct ClockActor {\n        count: Arc<AtomicUsize>,\n    }\n\n    #[derive(Debug)]\n    struct Tick;\n\n    #[async_trait]\n    impl Actor for ClockActor {\n        type ObservableState = ();\n        fn observable_state(&self) -> Self::ObservableState {}\n\n        async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n            self.handle(Tick, ctx).await\n        }\n    }\n\n    #[async_trait]\n    impl 
Handler<Tick> for ClockActor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            _tick: Tick,\n            ctx: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            self.count.fetch_add(1, Ordering::SeqCst);\n            ctx.schedule_self_msg(Duration::from_secs(1), Tick);\n            Ok(())\n        }\n    }\n\n    #[tokio::test]\n    async fn test_scheduler_advance_time_fast_forward_initialize() {\n        quickwit_common::setup_logging_for_tests();\n        let count: Arc<AtomicUsize> = Default::default();\n        let simple_actor = ClockActor {\n            count: count.clone(),\n        };\n        let universe = Universe::with_accelerated_time();\n        universe.spawn_builder().spawn(simple_actor);\n        assert_eq!(count.load(Ordering::SeqCst), 0);\n        universe.sleep(Duration::from_millis(15)).await;\n        assert_eq!(count.load(Ordering::SeqCst), 1);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_scheduler_advance_time_fast_forward_scheduled_message() {\n        let start = Instant::now();\n        quickwit_common::setup_logging_for_tests();\n        let count: Arc<AtomicUsize> = Default::default();\n        let simple_actor = ClockActor {\n            count: count.clone(),\n        };\n        let universe = Universe::with_accelerated_time();\n        universe.spawn_builder().spawn(simple_actor);\n        assert_eq!(count.load(Ordering::SeqCst), 0);\n        universe.sleep(Duration::from_secs(10)).await;\n        assert_eq!(count.load(Ordering::SeqCst), 10);\n        let elapsed = start.elapsed();\n        // The whole point is to accelerate time.\n        assert!(elapsed.as_millis() < 50);\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/spawn_builder.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse quickwit_common::metrics::IntCounter;\nuse sync_wrapper::SyncWrapper;\nuse tokio::sync::watch;\nuse tracing::{debug, error, info};\n\nuse crate::envelope::Envelope;\nuse crate::mailbox::{Inbox, create_mailbox};\nuse crate::registry::{ActorJoinHandle, ActorRegistry};\nuse crate::scheduler::{NoAdvanceTimeGuard, SchedulerClient};\nuse crate::supervisor::Supervisor;\nuse crate::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, KillSwitch, Mailbox, QueueCapacity,\n};\n\n#[derive(Clone)]\npub struct SpawnContext {\n    pub(crate) scheduler_client: SchedulerClient,\n    pub(crate) kill_switch: KillSwitch,\n    pub(crate) registry: ActorRegistry,\n}\n\nimpl SpawnContext {\n    pub fn new(scheduler_client: SchedulerClient) -> Self {\n        SpawnContext {\n            scheduler_client,\n            kill_switch: Default::default(),\n            registry: ActorRegistry::default(),\n        }\n    }\n\n    pub fn spawn_builder<A: Actor>(&self) -> SpawnBuilder<A> {\n        SpawnBuilder::new(self.child_context())\n    }\n\n    pub fn create_mailbox<A: Actor>(\n        &self,\n        actor_name: impl ToString,\n        queue_capacity: QueueCapacity,\n    ) -> (Mailbox<A>, Inbox<A>) {\n        create_mailbox(\n            actor_name.to_string(),\n            queue_capacity,\n       
     Some(self.scheduler_client.clone()),\n        )\n    }\n\n    pub fn child_context(&self) -> SpawnContext {\n        SpawnContext {\n            scheduler_client: self.scheduler_client.clone(),\n            kill_switch: self.kill_switch.child(),\n            registry: self.registry.clone(),\n        }\n    }\n\n    /// Schedules a new event.\n    /// Once `timeout` is elapsed, the future `fut` is\n    /// executed.\n    ///\n    /// `fut` will be executed in the scheduler task, so it is\n    /// required to be short.\n    pub fn schedule_event<F: FnOnce() + Send + Sync + 'static>(\n        &self,\n        callback: F,\n        timeout: Duration,\n    ) {\n        self.scheduler_client.schedule_event(callback, timeout)\n    }\n}\n\n/// `SpawnBuilder` makes it possible to configure misc parameters before spawning an actor.\n#[derive(Clone)]\npub struct SpawnBuilder<A: Actor> {\n    spawn_ctx: SpawnContext,\n    #[allow(clippy::type_complexity)]\n    mailboxes: Option<(Mailbox<A>, Inbox<A>)>,\n    backpressure_micros_counter_opt: Option<IntCounter>,\n}\n\nimpl<A: Actor> SpawnBuilder<A> {\n    pub(crate) fn new(spawn_ctx: SpawnContext) -> Self {\n        SpawnBuilder {\n            spawn_ctx,\n            mailboxes: None,\n            backpressure_micros_counter_opt: None,\n        }\n    }\n\n    /// Sets a specific kill switch for the actor.\n    ///\n    /// By default, the kill switch is inherited from the context that was used to\n    /// spawn the actor.\n    pub fn set_kill_switch(mut self, kill_switch: KillSwitch) -> Self {\n        self.spawn_ctx.kill_switch = kill_switch;\n        self\n    }\n\n    /// Sets a specific set of mailbox.\n    ///\n    /// By default, a brand new set of mailboxes will be created\n    /// when the actor is spawned.\n    ///\n    /// This function makes it possible to create non-DAG networks\n    /// of actors.\n    pub fn set_mailboxes(mut self, mailbox: Mailbox<A>, inbox: Inbox<A>) -> Self {\n        self.mailboxes = 
Some((mailbox, inbox));\n        self\n    }\n\n    /// Adds a counter to track the amount of time the actor is\n    /// spending in \"backpressure\".\n    ///\n    /// When using `.ask` the amount of time counted may be misleading.\n    /// (See `Mailbox::ask_with_backpressure_counter` for more details)\n    pub fn set_backpressure_micros_counter(\n        mut self,\n        backpressure_micros_counter: IntCounter,\n    ) -> Self {\n        self.backpressure_micros_counter_opt = Some(backpressure_micros_counter);\n        self\n    }\n\n    fn take_or_create_mailboxes(&mut self, actor: &A) -> (Mailbox<A>, Inbox<A>) {\n        if let Some((mailbox, inbox)) = self.mailboxes.take() {\n            return (mailbox, inbox);\n        }\n        let actor_name = actor.name();\n        let queue_capacity = actor.queue_capacity();\n        self.spawn_ctx.create_mailbox(actor_name, queue_capacity)\n    }\n\n    fn create_actor_context_and_inbox(\n        mut self,\n        actor: &A,\n    ) -> (\n        ActorContext<A>,\n        Inbox<A>,\n        watch::Receiver<A::ObservableState>,\n    ) {\n        let (mailbox, inbox) = self.take_or_create_mailboxes(actor);\n        let obs_state = actor.observable_state();\n        let (state_tx, state_rx) = watch::channel(obs_state);\n        let ctx = ActorContext::new(\n            mailbox,\n            self.spawn_ctx.clone(),\n            state_tx,\n            self.backpressure_micros_counter_opt,\n        );\n        (ctx, inbox, state_rx)\n    }\n\n    /// Spawns an async actor.\n    pub fn spawn(self, actor: A) -> (Mailbox<A>, ActorHandle<A>) {\n        // We prevent fast forward of the scheduler during  initialization.\n        let no_advance_time_guard = self.spawn_ctx.scheduler_client.no_advance_time_guard();\n        let runtime_handle = actor.runtime_handle();\n        let (ctx, inbox, state_rx) = self.create_actor_context_and_inbox(&actor);\n        debug!(actor_id = %ctx.actor_instance_id(), \"spawn-actor\");\n        
let mailbox = ctx.mailbox().clone();\n        let ctx_clone = ctx.clone();\n        let loop_async_actor_future =\n            async move { actor_loop(actor, inbox, no_advance_time_guard, ctx).await };\n        let join_handle = ActorJoinHandle::new(quickwit_common::spawn_named_task_on(\n            loop_async_actor_future,\n            std::any::type_name::<A>(),\n            &runtime_handle,\n        ));\n        ctx_clone.registry().register(&mailbox, join_handle.clone());\n        let actor_handle = ActorHandle::new(state_rx, join_handle, ctx_clone);\n        (mailbox, actor_handle)\n    }\n\n    pub fn supervise_fn<F: Fn() -> A + Send + 'static>(\n        mut self,\n        actor_factory: F,\n    ) -> (Mailbox<A>, ActorHandle<Supervisor<A>>) {\n        let actor = actor_factory();\n        let actor_name = actor.name();\n        let (mailbox, inbox) = self.take_or_create_mailboxes(&actor);\n        self.mailboxes = Some((mailbox, inbox.clone()));\n        let child_ctx = self.spawn_ctx.child_context();\n        let parent_spawn_ctx = std::mem::replace(&mut self.spawn_ctx, child_ctx);\n        let (mailbox, actor_handle) = self.spawn(actor);\n        let supervisor = Supervisor::new(actor_name, Box::new(actor_factory), inbox, actor_handle);\n        let (_supervisor_mailbox, supervisor_handle) =\n            parent_spawn_ctx.spawn_builder().spawn(supervisor);\n        (mailbox, supervisor_handle)\n    }\n}\n\nimpl<A: Actor + Clone> SpawnBuilder<A> {\n    pub fn supervise(self, actor: A) -> (Mailbox<A>, ActorHandle<Supervisor<A>>) {\n        self.supervise_fn(move || actor.clone())\n    }\n}\n\nimpl<A: Actor + Default> SpawnBuilder<A> {\n    pub fn supervise_default(self) -> (Mailbox<A>, ActorHandle<Supervisor<A>>) {\n        self.supervise_fn(Default::default)\n    }\n}\n\nenum ActorExitPhase {\n    Initializing,\n    Handling { message: &'static str },\n    Running,\n    OnDrainedMessaged,\n    Completed,\n}\n\nimpl fmt::Debug for ActorExitPhase {\n    fn 
fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        match self {\n            ActorExitPhase::Initializing => write!(f, \"initializing\"),\n            ActorExitPhase::Handling { message } => write!(f, \"handling({message})\"),\n            ActorExitPhase::Running => write!(f, \"running\"),\n            ActorExitPhase::OnDrainedMessaged => write!(f, \"on_drained_messages\"),\n            ActorExitPhase::Completed => write!(f, \"completed\"),\n        }\n    }\n}\n\n/// Receives an envelope from either the high priority queue or the low priority queue.\n///\n/// In the paused state, the actor will only attempt to receive high priority messages.\n///\n/// If no message is available, this function will yield until a message arrives.\n/// If a high priority message arrives first, it is guaranteed to be processed first.\n/// The other way around is, however, not guaranteed.\nasync fn recv_envelope<A: Actor>(inbox: &mut Inbox<A>, ctx: &ActorContext<A>) -> Envelope<A> {\n    if ctx.state().is_running() {\n        ctx.protect_future(inbox.recv()).await.expect(\n            \"Disconnection should be impossible because the ActorContext holds a Mailbox too\",\n        )\n    } else {\n        // The actor is paused. 
We only process command and scheduled message.\n        ctx.protect_future(inbox.recv_cmd_and_scheduled_msg_only())\n            .await\n    }\n}\n\nfn try_recv_envelope<A: Actor>(inbox: &mut Inbox<A>) -> Option<Envelope<A>> {\n    inbox.try_recv().ok()\n}\n\nstruct ActorExecutionEnv<A: Actor> {\n    actor: SyncWrapper<A>,\n    inbox: Inbox<A>,\n    ctx: ActorContext<A>,\n}\n\nimpl<A: Actor> ActorExecutionEnv<A> {\n    async fn initialize(&mut self) -> Result<(), ActorExitStatus> {\n        self.actor.get_mut().initialize(&self.ctx).await\n    }\n\n    async fn process_messages(&mut self) -> (ActorExitStatus, ActorExitPhase) {\n        loop {\n            if let Err((exit_status, exit_phase)) = self.process_all_available_messages().await {\n                return (exit_status, exit_phase);\n            }\n        }\n    }\n\n    async fn process_one_message(\n        &mut self,\n        mut envelope: Envelope<A>,\n    ) -> Result<(), (ActorExitStatus, ActorExitPhase)> {\n        self.yield_and_check_if_killed().await?;\n        envelope\n            .handle_message(self.actor.get_mut(), &self.ctx)\n            .await\n            .map_err(|(exit_status, message)| {\n                (exit_status, ActorExitPhase::Handling { message })\n            })?;\n        Ok(())\n    }\n\n    async fn yield_and_check_if_killed(&mut self) -> Result<(), (ActorExitStatus, ActorExitPhase)> {\n        if self.ctx.kill_switch().is_dead() {\n            return Err((ActorExitStatus::Killed, ActorExitPhase::Running));\n        }\n        if self.actor.get_mut().yield_after_each_message() {\n            self.ctx.yield_now().await;\n            if self.ctx.kill_switch().is_dead() {\n                return Err((ActorExitStatus::Killed, ActorExitPhase::Running));\n            }\n        } else {\n            self.ctx.record_progress();\n        }\n        Ok(())\n    }\n\n    async fn process_all_available_messages(\n        &mut self,\n    ) -> Result<(), (ActorExitStatus, ActorExitPhase)> 
{\n        self.yield_and_check_if_killed().await?;\n        let envelope = recv_envelope(&mut self.inbox, &self.ctx).await;\n        self.process_one_message(envelope).await?;\n        // If the actor is Running (not Paused), we consume all the messages in the mailbox\n        // and call `on_drained_message`.\n        if self.ctx.state().is_running() {\n            loop {\n                while let Some(envelope) = try_recv_envelope(&mut self.inbox) {\n                    self.process_one_message(envelope).await?;\n                }\n                // We have reached the last message.\n                // Let's still yield and see if we have more messages:\n                // an upstream actor might have experienced backpressure, and is now waiting for our\n                // mailbox to have some room.\n                self.ctx.yield_now().await;\n                if self.inbox.is_empty() {\n                    break;\n                }\n            }\n            self.actor\n                .get_mut()\n                .on_drained_messages(&self.ctx)\n                .await\n                .map_err(|exit_status| (exit_status, ActorExitPhase::OnDrainedMessaged))?;\n        }\n        if self.ctx.mailbox().is_last_mailbox() {\n            // We double check here that the mailbox does not contain any messages,\n            // as someone on different runtime thread could have added a last message\n            // and dropped the last mailbox right before this block.\n            // See #4248\n            if self.inbox.is_empty() {\n                // No one will be able to send us more messages.\n                // We can exit the actor.\n                return Err((ActorExitStatus::Success, ActorExitPhase::Completed));\n            }\n        }\n\n        Ok(())\n    }\n\n    async fn finalize(&mut self, exit_status: ActorExitStatus) -> ActorExitStatus {\n        let _no_advance_time_guard = self\n            .ctx\n            .mailbox()\n            
.scheduler_client()\n            .map(|scheduler_client| scheduler_client.no_advance_time_guard());\n        if let Err(finalize_error) = self\n            .actor\n            .get_mut()\n            .finalize(&exit_status, &self.ctx)\n            .await\n            .with_context(|| format!(\"finalization of actor {}\", self.actor.get_mut().name()))\n        {\n            error!(error=?finalize_error, \"finalizing failed, set exit status to panicked\");\n            return ActorExitStatus::Panicked;\n        }\n        exit_status\n    }\n}\n\nimpl<A: Actor> Drop for ActorExecutionEnv<A> {\n    // We rely on this object internally to fetch a post-mortem state,\n    // even in case of a panic.\n    fn drop(&mut self) {\n        self.ctx.observe(self.actor.get_mut());\n    }\n}\n\nasync fn actor_loop<A: Actor>(\n    actor: A,\n    inbox: Inbox<A>,\n    no_advance_time_guard: NoAdvanceTimeGuard,\n    ctx: ActorContext<A>,\n) -> ActorExitStatus {\n    let mut actor_env = ActorExecutionEnv {\n        actor: SyncWrapper::new(actor),\n        inbox,\n        ctx,\n    };\n\n    let initialize_exit_status_res: Result<(), ActorExitStatus> = actor_env.initialize().await;\n    drop(no_advance_time_guard);\n\n    let (after_process_exit_status, exit_phase) =\n        if let Err(initialize_exit_status) = initialize_exit_status_res {\n            // We do not process messages if `initialize` yields an error.\n            // We still call `finalize`, however.\n            (initialize_exit_status, ActorExitPhase::Initializing)\n        } else {\n            actor_env.process_messages().await\n        };\n\n    let actor_id = actor_env.ctx.actor_instance_id();\n    match after_process_exit_status {\n        ActorExitStatus::Success\n        | ActorExitStatus::Quit\n        | ActorExitStatus::DownstreamClosed\n        | ActorExitStatus::Killed => {\n            info!(actor_id, phase = ?exit_phase, exit_status = ?after_process_exit_status, \"actor-exit\");\n        }\n        
ActorExitStatus::Failure(_) | ActorExitStatus::Panicked => {\n            error!(actor_id, phase = ?exit_phase, exit_status = ?after_process_exit_status, \"actor-exit\");\n        }\n    };\n\n    // TODO the no advance time guard for finalize has a race condition. Ideally we would\n    // like to have the guard before we drop the last envelope.\n    let final_exit_status = actor_env.finalize(after_process_exit_status).await;\n    // The last observation is collected on `ActorExecutionEnv::Drop`.\n    actor_env.ctx.exit(&final_exit_status);\n    final_exit_status\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/supervisor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\nuse serde::Serialize;\nuse tracing::{info, warn};\n\nuse crate::mailbox::Inbox;\nuse crate::{Actor, ActorContext, ActorExitStatus, ActorHandle, Handler, Health, Supervisable};\n\n#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize)]\npub struct SupervisorMetrics {\n    pub num_panics: usize,\n    pub num_errors: usize,\n    pub num_kills: usize,\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize)]\npub struct SupervisorState<S> {\n    pub metrics: SupervisorMetrics,\n    pub state_opt: Option<S>,\n}\n\nimpl<S> Default for SupervisorState<S> {\n    fn default() -> Self {\n        SupervisorState {\n            metrics: Default::default(),\n            state_opt: None,\n        }\n    }\n}\n\npub struct Supervisor<A: Actor> {\n    actor_name: String,\n    actor_factory: Box<dyn Fn() -> A + Send>,\n    inbox: Inbox<A>,\n    handle_opt: Option<ActorHandle<A>>,\n    metrics: SupervisorMetrics,\n}\n\n#[derive(Debug, Copy, Clone)]\nstruct SuperviseLoop;\n\n#[async_trait]\nimpl<A: Actor> Actor for Supervisor<A> {\n    type ObservableState = SupervisorState<A::ObservableState>;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        let state_opt: Option<A::ObservableState> = self\n            .handle_opt\n            .as_ref()\n            .map(|handle| handle.last_observation().clone());\n     
   SupervisorState {\n            metrics: self.metrics,\n            state_opt,\n        }\n    }\n\n    fn name(&self) -> String {\n        format!(\"Supervisor({})\", self.actor_name)\n    }\n\n    fn queue_capacity(&self) -> crate::QueueCapacity {\n        crate::QueueCapacity::Unbounded\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        ctx.schedule_self_msg(*crate::HEARTBEAT, SuperviseLoop);\n        Ok(())\n    }\n\n    async fn finalize(\n        &mut self,\n        exit_status: &ActorExitStatus,\n        _ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        match exit_status {\n            ActorExitStatus::Quit => {\n                if let Some(handle) = self.handle_opt.take() {\n                    handle.quit().await;\n                }\n            }\n            ActorExitStatus::Killed => {\n                if let Some(handle) = self.handle_opt.take() {\n                    handle.kill().await;\n                }\n            }\n            ActorExitStatus::Failure(_)\n            | ActorExitStatus::Success\n            | ActorExitStatus::DownstreamClosed => {}\n            ActorExitStatus::Panicked => {}\n        }\n\n        Ok(())\n    }\n}\n\nimpl<A: Actor> Supervisor<A> {\n    pub(crate) fn new(\n        actor_name: String,\n        actor_factory: Box<dyn Fn() -> A + Send>,\n        inbox: Inbox<A>,\n        handle: ActorHandle<A>,\n    ) -> Self {\n        Supervisor {\n            actor_name,\n            actor_factory,\n            inbox,\n            handle_opt: Some(handle),\n            metrics: Default::default(),\n        }\n    }\n\n    async fn supervise(\n        &mut self,\n        ctx: &ActorContext<Supervisor<A>>,\n    ) -> Result<(), ActorExitStatus> {\n        let handle_ref = self\n            .handle_opt\n            .as_ref()\n            .expect(\"The actor handle should always be set.\");\n        match handle_ref.check_health(true) {\n            
Health::Healthy => {\n                handle_ref.refresh_observe();\n                return Ok(());\n            }\n            Health::FailureOrUnhealthy => {}\n            Health::Success => {\n                return Err(ActorExitStatus::Success);\n            }\n        }\n        warn!(\"unhealthy-actor\");\n        // The actor is failing: we need to restart it.\n        let actor_handle = self.handle_opt.take().unwrap();\n        let actor_mailbox = actor_handle.mailbox().clone();\n        let (actor_exit_status, _last_state) = if !actor_handle.state().is_exit() {\n            // The actor is probably frozen.\n            // Let's kill it.\n            warn!(\"killing\");\n            actor_handle.kill().await\n        } else {\n            actor_handle.join().await\n        };\n        match actor_exit_status {\n            ActorExitStatus::Success => {\n                return Err(ActorExitStatus::Success);\n            }\n            ActorExitStatus::Quit => {\n                return Err(ActorExitStatus::Quit);\n            }\n            ActorExitStatus::DownstreamClosed => {\n                return Err(ActorExitStatus::DownstreamClosed);\n            }\n            ActorExitStatus::Killed => {\n                self.metrics.num_kills += 1;\n            }\n            ActorExitStatus::Failure(_err) => {\n                self.metrics.num_errors += 1;\n            }\n            ActorExitStatus::Panicked => {\n                self.metrics.num_panics += 1;\n            }\n        }\n        info!(\"respawning-actor\");\n        let (_, actor_handle) = ctx\n            .spawn_actor()\n            .set_mailboxes(actor_mailbox, self.inbox.clone())\n            .set_kill_switch(ctx.kill_switch().child())\n            .spawn((*self.actor_factory)());\n        self.handle_opt = Some(actor_handle);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl<A: Actor> Handler<SuperviseLoop> for Supervisor<A> {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n   
     _msg: SuperviseLoop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        self.supervise(ctx).await?;\n        ctx.schedule_self_msg(*crate::HEARTBEAT, SuperviseLoop);\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use async_trait::async_trait;\n    use tracing::info;\n\n    use crate::supervisor::SupervisorMetrics;\n    use crate::tests::{Ping, PingReceiverActor};\n    use crate::{Actor, ActorContext, ActorExitStatus, AskError, Handler, Observe, Universe};\n\n    #[derive(Copy, Clone, Debug)]\n    enum FailingActorMessage {\n        Panic,\n        ReturnError,\n        Increment,\n        Freeze(Duration),\n    }\n\n    #[derive(Default, Clone)]\n    struct FailingActor {\n        counter: usize,\n    }\n\n    #[async_trait]\n    impl Actor for FailingActor {\n        type ObservableState = usize;\n\n        fn name(&self) -> String {\n            \"FailingActor\".to_string()\n        }\n\n        fn observable_state(&self) -> Self::ObservableState {\n            self.counter\n        }\n\n        async fn finalize(\n            &mut self,\n            _exit_status: &ActorExitStatus,\n            _ctx: &ActorContext<Self>,\n        ) -> anyhow::Result<()> {\n            info!(\"finalize-failing-actor\");\n            Ok(())\n        }\n    }\n\n    #[async_trait]\n    impl Handler<FailingActorMessage> for FailingActor {\n        type Reply = usize;\n\n        async fn handle(\n            &mut self,\n            msg: FailingActorMessage,\n            ctx: &ActorContext<Self>,\n        ) -> Result<Self::Reply, ActorExitStatus> {\n            match msg {\n                FailingActorMessage::Panic => {\n                    panic!(\"Failing actor panicked\");\n                }\n                FailingActorMessage::ReturnError => {\n                    return Err(ActorExitStatus::from(anyhow::anyhow!(\n                        \"failing actor error\"\n                    )));\n  
              }\n                FailingActorMessage::Increment => {\n                    self.counter += 1;\n                }\n                FailingActorMessage::Freeze(wait_duration) => {\n                    ctx.sleep(wait_duration).await;\n                }\n            }\n            Ok(self.counter)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_supervisor_restart_on_panic() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let actor = FailingActor::default();\n        let (mailbox, supervisor_handle) = universe.spawn_builder().supervise(actor);\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            2\n        );\n        assert!(mailbox.ask(FailingActorMessage::Panic).await.is_err());\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        assert_eq!(\n            supervisor_handle.observe().await.metrics,\n            SupervisorMetrics {\n                num_panics: 1,\n                num_errors: 0,\n                num_kills: 0\n            }\n        );\n        assert!(!matches!(\n            supervisor_handle.quit().await.0,\n            ActorExitStatus::Panicked\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_supervisor_restart_on_error() {\n        let universe = Universe::with_accelerated_time();\n        let actor = FailingActor::default();\n        let (mailbox, supervisor_handle) = universe.spawn_builder().supervise(actor);\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            2\n        );\n        
assert!(mailbox.ask(FailingActorMessage::ReturnError).await.is_err());\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        assert_eq!(\n            supervisor_handle.observe().await.metrics,\n            SupervisorMetrics {\n                num_panics: 0,\n                num_errors: 1,\n                num_kills: 0\n            }\n        );\n        assert!(!matches!(\n            supervisor_handle.quit().await.0,\n            ActorExitStatus::Panicked\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_supervisor_kills_and_restart_frozen_actor() {\n        let universe = Universe::with_accelerated_time();\n        let actor = FailingActor::default();\n        let (mailbox, supervisor_handle) = universe.spawn_builder().supervise(actor);\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            2\n        );\n        assert_eq!(\n            supervisor_handle.observe().await.metrics,\n            SupervisorMetrics {\n                num_panics: 0,\n                num_errors: 0,\n                num_kills: 0\n            }\n        );\n        mailbox\n            .send_message(FailingActorMessage::Freeze(\n                crate::HEARTBEAT.mul_f32(3.0f32),\n            ))\n            .await\n            .unwrap();\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        assert_eq!(\n            supervisor_handle.observe().await.metrics,\n            SupervisorMetrics {\n                num_panics: 0,\n                num_errors: 0,\n                num_kills: 1\n            }\n        );\n        assert!(!matches!(\n            supervisor_handle.quit().await.0,\n            ActorExitStatus::Panicked\n        ));\n    }\n\n    
#[tokio::test]\n    async fn test_supervisor_forwards_quit_commands() {\n        let universe = Universe::with_accelerated_time();\n        let actor = FailingActor::default();\n        let (mailbox, supervisor_handle) = universe.spawn_builder().supervise(actor);\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        let (exit_status, _state) = supervisor_handle.quit().await;\n        assert!(matches!(\n            mailbox\n                .ask(FailingActorMessage::Increment)\n                .await\n                .unwrap_err(),\n            AskError::MessageNotDelivered\n        ));\n        assert!(matches!(exit_status, ActorExitStatus::Quit));\n    }\n\n    #[tokio::test]\n    async fn test_supervisor_forwards_kill_command() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let actor = FailingActor::default();\n        let (mailbox, supervisor_handle) = universe.spawn_builder().supervise(actor);\n        assert_eq!(\n            mailbox.ask(FailingActorMessage::Increment).await.unwrap(),\n            1\n        );\n        let (exit_status, _state) = supervisor_handle.kill().await;\n        assert!(mailbox.ask(FailingActorMessage::Increment).await.is_err());\n        assert!(matches!(\n            mailbox\n                .ask(FailingActorMessage::Increment)\n                .await\n                .unwrap_err(),\n            AskError::MessageNotDelivered\n        ));\n        assert!(matches!(exit_status, ActorExitStatus::Killed));\n    }\n\n    #[tokio::test]\n    async fn test_supervisor_exits_successfully_when_supervised_actor_mailbox_is_dropped() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let actor = FailingActor::default();\n        let (_, supervisor_handle) = universe.spawn_builder().supervise(actor);\n        let (exit_status, 
_state) = supervisor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_supervisor_state() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let ping_actor = PingReceiverActor::default();\n        let (mailbox, handler) = universe.spawn_builder().supervise(ping_actor);\n        let obs = handler.observe().await;\n        assert_eq!(obs.state.state_opt, Some(0));\n        let _ = mailbox.ask(Ping).await;\n        assert_eq!(mailbox.ask(Observe).await.unwrap(), 1);\n        universe.sleep(Duration::from_secs(60)).await;\n        let obs = handler.observe().await;\n        assert_eq!(obs.state.state_opt, Some(1));\n        handler.quit().await;\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cell::Cell;\nuse std::collections::HashMap;\nuse std::ops::Mul;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_common::new_coolid;\nuse serde::Serialize;\n\nuse crate::observation::ObservationType;\nuse crate::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, ActorState, Command, Handler, Health,\n    Mailbox, Observation, Supervisable, Universe,\n};\n\n// An actor that receives ping messages.\n#[derive(Default, Clone)]\npub struct PingReceiverActor {\n    ping_count: usize,\n}\n\nimpl Actor for PingReceiverActor {\n    type ObservableState = usize;\n\n    fn name(&self) -> String {\n        \"Ping\".to_string()\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.ping_count\n    }\n}\n\n#[derive(Debug)]\npub struct Ping;\n\n#[async_trait]\nimpl Handler<Ping> for PingReceiverActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: Ping,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.ping_count += 1;\n        assert_eq!(ctx.state(), ActorState::Running);\n        Ok(())\n    }\n}\n\n#[derive(Default)]\npub struct PingerSenderActor {\n    count: usize,\n    peers: HashMap<String, Mailbox<PingReceiverActor>>,\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize)]\npub struct SenderState {\n    pub 
count: usize,\n    pub num_peers: usize,\n}\n\n#[derive(Clone, Debug)]\npub struct AddPeer(Mailbox<PingReceiverActor>);\n\nimpl Actor for PingerSenderActor {\n    type ObservableState = SenderState;\n\n    fn name(&self) -> String {\n        \"PingSender\".to_string()\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        SenderState {\n            count: self.count,\n            num_peers: self.peers.len(),\n        }\n    }\n}\n\n#[async_trait]\nimpl Handler<Ping> for PingerSenderActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: Ping,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.count += 1;\n        for peer in self.peers.values() {\n            let _ = peer.send_message(Ping).await;\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<AddPeer> for PingerSenderActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: AddPeer,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let AddPeer(peer) = message;\n        let peer_id = peer.actor_instance_id().to_string();\n        self.peers.insert(peer_id, peer);\n        Ok(())\n    }\n}\n\n#[tokio::test]\nasync fn test_actor_stops_when_last_mailbox_is_dropped() {\n    quickwit_common::setup_logging_for_tests();\n    let universe = Universe::with_accelerated_time();\n    let (ping_recv_mailbox, ping_recv_handle) =\n        universe.spawn_builder().spawn(PingReceiverActor::default());\n    drop(ping_recv_mailbox);\n    let (exit_status, _) = ping_recv_handle.join().await;\n    assert!(exit_status.is_success());\n}\n\n#[tokio::test]\nasync fn test_ping_actor() {\n    quickwit_common::setup_logging_for_tests();\n    let universe = Universe::with_accelerated_time();\n    let (ping_recv_mailbox, ping_recv_handle) =\n        universe.spawn_builder().spawn(PingReceiverActor::default());\n    let (ping_sender_mailbox, 
ping_sender_handle) =\n        universe.spawn_builder().spawn(PingerSenderActor::default());\n    assert_eq!(\n        ping_recv_handle.observe().await,\n        Observation {\n            obs_type: ObservationType::Alive,\n            state: 0\n        }\n    );\n    // No peers. This one will have no impact.\n    let ping_recv_mailbox = ping_recv_mailbox.clone();\n    assert!(ping_sender_mailbox.send_message(Ping).await.is_ok());\n    assert!(\n        ping_sender_mailbox\n            .send_message(AddPeer(ping_recv_mailbox.clone()))\n            .await\n            .is_ok()\n    );\n    assert_eq!(\n        ping_sender_handle.process_pending_and_observe().await,\n        Observation {\n            obs_type: ObservationType::Alive,\n            state: SenderState {\n                num_peers: 1,\n                count: 1\n            }\n        }\n    );\n    assert!(ping_sender_mailbox.send_message(Ping).await.is_ok());\n    assert!(ping_sender_mailbox.send_message(Ping).await.is_ok());\n    assert_eq!(\n        ping_sender_handle.process_pending_and_observe().await,\n        Observation {\n            obs_type: ObservationType::Alive,\n            state: SenderState {\n                num_peers: 1,\n                count: 3\n            }\n        }\n    );\n    assert_eq!(\n        ping_recv_handle.process_pending_and_observe().await,\n        Observation {\n            obs_type: ObservationType::Alive,\n            state: 2\n        }\n    );\n    universe.kill();\n    assert_eq!(\n        ping_recv_handle.process_pending_and_observe().await,\n        Observation {\n            obs_type: ObservationType::PostMortem,\n            state: 2\n        }\n    );\n    assert_eq!(\n        ping_sender_handle.process_pending_and_observe().await,\n        Observation {\n            obs_type: ObservationType::PostMortem,\n            state: SenderState {\n                num_peers: 1,\n                count: 3\n            }\n        }\n    );\n    
ping_sender_handle.join().await;\n    assert!(ping_sender_mailbox.send_message(Ping).await.is_err());\n}\n\nstruct BuggyActor;\n\n#[derive(Clone, Debug)]\nstruct DoNothing;\n\n#[derive(Clone, Debug)]\nstruct Block;\n\nimpl Actor for BuggyActor {\n    type ObservableState = ();\n\n    fn name(&self) -> String {\n        \"BuggyActor\".to_string()\n    }\n\n    fn observable_state(&self) {}\n}\n\n#[async_trait]\nimpl Handler<DoNothing> for BuggyActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: DoNothing,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Block> for BuggyActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: Block,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        while ctx.kill_switch().is_alive() {\n            tokio::task::yield_now().await;\n        }\n        Ok(())\n    }\n}\n\n#[tokio::test]\nasync fn test_timeouting_actor() {\n    let universe = Universe::with_accelerated_time();\n    let (buggy_mailbox, buggy_handle) = universe.spawn_builder().spawn(BuggyActor);\n    let buggy_mailbox = buggy_mailbox;\n    assert_eq!(\n        buggy_handle.observe().await.obs_type,\n        ObservationType::Alive\n    );\n    assert!(buggy_mailbox.send_message(DoNothing).await.is_ok());\n    assert_eq!(\n        buggy_handle.observe().await.obs_type,\n        ObservationType::Alive\n    );\n    assert!(buggy_mailbox.send_message(Block).await.is_ok());\n\n    assert_eq!(buggy_handle.check_health(true), Health::Healthy);\n    assert_eq!(\n        buggy_handle.process_pending_and_observe().await.obs_type,\n        ObservationType::Timeout\n    );\n    assert_eq!(buggy_handle.check_health(true), Health::Healthy);\n    universe.sleep(crate::HEARTBEAT.mul(2)).await;\n    assert_eq!(buggy_handle.check_health(true), Health::FailureOrUnhealthy);\n    
buggy_handle.kill().await;\n}\n\n#[tokio::test]\nasync fn test_pause_actor() {\n    quickwit_common::setup_logging_for_tests();\n    let universe = Universe::with_accelerated_time();\n    let (ping_mailbox, ping_handle) = universe.spawn_builder().spawn(PingReceiverActor::default());\n    for _ in 0u32..1000u32 {\n        assert!(ping_mailbox.send_message(Ping).await.is_ok());\n    }\n    assert!(\n        ping_mailbox\n            .send_message_with_high_priority(Command::Pause)\n            .is_ok()\n    );\n    let first_state = ping_handle.observe().await.state;\n    assert!(first_state < 1000);\n    let second_state = ping_handle.observe().await.state;\n    assert_eq!(first_state, second_state);\n    assert!(\n        ping_mailbox\n            .send_message_with_high_priority(Command::Resume)\n            .is_ok()\n    );\n    let end_state = ping_handle.process_pending_and_observe().await.state;\n    assert_eq!(end_state, 1000);\n    universe.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_actor_running_states() {\n    quickwit_common::setup_logging_for_tests();\n    let universe = Universe::with_accelerated_time();\n    let (ping_mailbox, ping_handle) = universe.spawn_builder().spawn(PingReceiverActor::default());\n    assert_eq!(ping_handle.state(), ActorState::Running);\n    for _ in 0u32..10u32 {\n        assert!(ping_mailbox.send_message(Ping).await.is_ok());\n    }\n    let obs = ping_handle.process_pending_and_observe().await;\n    assert_eq!(*obs, 10);\n    universe.sleep(Duration::from_millis(1)).await;\n    assert_eq!(ping_handle.state(), ActorState::Running);\n    universe.assert_quit().await;\n}\n\n#[derive(Clone, Debug, Default, Serialize)]\nstruct LoopingActor {\n    pub loop_count: usize,\n    pub single_shot_count: usize,\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n#[derive(Debug)]\nstruct SingleShot;\n\n#[async_trait]\nimpl Actor for LoopingActor {\n    type ObservableState = Self;\n\n    fn observable_state(&self) -> 
Self::ObservableState {\n        self.clone()\n    }\n\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.handle(Loop, ctx).await\n    }\n}\n\n#[async_trait]\nimpl Handler<Loop> for LoopingActor {\n    type Reply = ();\n    async fn handle(\n        &mut self,\n        _msg: Loop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.loop_count += 1;\n        ctx.send_self_message(Loop).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<SingleShot> for LoopingActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _msg: SingleShot,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.single_shot_count += 1;\n        Ok(())\n    }\n}\n\n#[tokio::test(flavor = \"multi_thread\")]\nasync fn test_looping() -> anyhow::Result<()> {\n    let universe = Universe::with_accelerated_time();\n    let looping_actor = LoopingActor::default();\n    let (looping_actor_mailbox, looping_actor_handle) =\n        universe.spawn_builder().spawn(looping_actor);\n    assert!(looping_actor_mailbox.send_message(SingleShot).await.is_ok());\n    looping_actor_handle.process_pending_and_observe().await;\n    let (exit_status, state) = looping_actor_handle.quit().await;\n    assert!(matches!(exit_status, ActorExitStatus::Quit));\n    assert_eq!(state.single_shot_count, 1);\n    assert!(state.loop_count > 0);\n    Ok(())\n}\n\n#[derive(Default)]\nstruct SummingActor {\n    sum: u64,\n}\n\n#[async_trait]\nimpl Handler<u64> for SummingActor {\n    type Reply = ();\n\n    async fn handle(&mut self, add: u64, _ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.sum += add;\n        Ok(())\n    }\n}\n\nimpl Actor for SummingActor {\n    type ObservableState = u64;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        
self.sum\n    }\n}\n\n#[derive(Default)]\nstruct SpawningActor {\n    res: u64,\n    handle_opt: Option<(Mailbox<SummingActor>, ActorHandle<SummingActor>)>,\n}\n\n#[async_trait]\nimpl Actor for SpawningActor {\n    type ObservableState = u64;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.res\n    }\n\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        _ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        if let Some((_, child_handler)) = self.handle_opt.take() {\n            self.res = child_handler.process_pending_and_observe().await.state;\n            child_handler.kill().await;\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<u64> for SpawningActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: u64,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let (mailbox, _) = self\n            .handle_opt\n            .get_or_insert_with(|| ctx.spawn_actor().spawn(SummingActor::default()));\n        ctx.send_message(mailbox, message).await?;\n        Ok(())\n    }\n}\n\n#[tokio::test]\nasync fn test_actor_spawning_actor() -> anyhow::Result<()> {\n    let universe = Universe::with_accelerated_time();\n    let (mailbox, handle) = universe.spawn_builder().spawn(SpawningActor::default());\n    mailbox.send_message(1).await?;\n    mailbox.send_message(2).await?;\n    mailbox.send_message(3).await?;\n    drop(mailbox);\n    let (exit, result) = handle.join().await;\n    assert!(matches!(exit, ActorExitStatus::Success));\n    assert_eq!(result, 6);\n    Ok(())\n}\n\nstruct BuggyFinalizeActor;\n\n#[async_trait]\nimpl Actor for BuggyFinalizeActor {\n    type ObservableState = ();\n\n    fn name(&self) -> String {\n        \"BuggyFinalizeActor\".to_string()\n    }\n\n    fn observable_state(&self) {}\n\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        _: 
&ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        anyhow::bail!(\"finalize error\")\n    }\n}\n\n#[tokio::test]\nasync fn test_actor_finalize_error_set_exit_status_to_panicked() -> anyhow::Result<()> {\n    let universe = Universe::with_accelerated_time();\n    let (mailbox, handle) = universe.spawn_builder().spawn(BuggyFinalizeActor);\n    assert!(matches!(handle.state(), ActorState::Running));\n    drop(mailbox);\n    let (exit, _) = handle.join().await;\n    assert!(matches!(exit, ActorExitStatus::Panicked));\n    Ok(())\n}\n\n#[derive(Default)]\nstruct Adder(u64);\n\nimpl Actor for Adder {\n    type ObservableState = u64;\n\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.0\n    }\n}\n\n#[derive(Debug)]\nstruct AddOperand(u64);\n\n#[async_trait]\nimpl Handler<AddOperand> for Adder {\n    type Reply = u64;\n\n    async fn handle(\n        &mut self,\n        add_op: AddOperand,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<u64, ActorExitStatus> {\n        self.0 += add_op.0;\n        Ok(self.0)\n    }\n}\n\n#[tokio::test]\nasync fn test_actor_return_response() -> anyhow::Result<()> {\n    let universe = Universe::with_accelerated_time();\n    let adder = Adder::default();\n    let (mailbox, _handle) = universe.spawn_builder().spawn(adder);\n    let plus_two = mailbox.send_message(AddOperand(2)).await?;\n    let plus_two_plus_four = mailbox.send_message(AddOperand(4)).await?;\n    assert_eq!(plus_two.await.unwrap(), 2);\n    assert_eq!(plus_two_plus_four.await.unwrap(), 6);\n    universe.assert_quit().await;\n    Ok(())\n}\n\n#[derive(Default)]\nstruct TestActorWithDrain {\n    counts: ProcessAndDrainCounts,\n}\n\n#[derive(Clone, Copy, Debug, Default, Eq, PartialEq, Serialize)]\nstruct ProcessAndDrainCounts {\n    process_calls_count: usize,\n    drain_calls_count: usize,\n}\n\n#[async_trait]\nimpl Actor for TestActorWithDrain {\n    type 
ObservableState = ProcessAndDrainCounts;\n\n    fn observable_state(&self) -> ProcessAndDrainCounts {\n        self.counts\n    }\n\n    async fn on_drained_messages(\n        &mut self,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.counts.drain_calls_count += 1;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<()> for TestActorWithDrain {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: (),\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        self.counts.process_calls_count += 1;\n        Ok(())\n    }\n}\n\n#[tokio::test]\nasync fn test_drain_is_called() {\n    quickwit_common::setup_logging_for_tests();\n    let universe = Universe::with_accelerated_time();\n    let test_actor_with_drain = TestActorWithDrain::default();\n    let (mailbox, handle) = universe.spawn_builder().spawn(test_actor_with_drain);\n    assert_eq!(\n        *handle.process_pending_and_observe().await,\n        ProcessAndDrainCounts {\n            process_calls_count: 0,\n            drain_calls_count: 0\n        }\n    );\n    handle.pause();\n    mailbox.send_message(()).await.unwrap();\n    mailbox.send_message(()).await.unwrap();\n    mailbox.send_message(()).await.unwrap();\n    handle.resume();\n    universe.sleep(Duration::from_millis(1)).await;\n    assert_eq!(\n        *handle.process_pending_and_observe().await,\n        ProcessAndDrainCounts {\n            process_calls_count: 3,\n            drain_calls_count: 1\n        }\n    );\n    mailbox.send_message(()).await.unwrap();\n    universe.sleep(Duration::from_millis(1)).await;\n    assert_eq!(\n        *handle.process_pending_and_observe().await,\n        ProcessAndDrainCounts {\n            process_calls_count: 4,\n            drain_calls_count: 2\n        }\n    );\n    universe.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_unsync_actor() {\n    #[derive(Default)]\n    struct 
UnsyncActor(Cell<u64>);\n\n    impl Actor for UnsyncActor {\n        type ObservableState = u64;\n\n        fn observable_state(&self) -> Self::ObservableState {\n            self.0.get()\n        }\n    }\n\n    #[async_trait]\n    impl Handler<u64> for UnsyncActor {\n        type Reply = u64;\n\n        async fn handle(\n            &mut self,\n            number: u64,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<u64, ActorExitStatus> {\n            *self.0.get_mut() += number;\n            Ok(self.0.get())\n        }\n    }\n    let universe = Universe::with_accelerated_time();\n    let unsync_message_actor = UnsyncActor::default();\n    let (mailbox, _handle) = universe.spawn_builder().spawn(unsync_message_actor);\n\n    let response = mailbox.ask(1).await.unwrap();\n    assert_eq!(response, 1);\n\n    universe.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_unsync_actor_message() {\n    #[derive(Default)]\n    struct UnsyncMessageActor(u64);\n\n    impl Actor for UnsyncMessageActor {\n        type ObservableState = u64;\n\n        fn observable_state(&self) -> Self::ObservableState {\n            self.0\n        }\n    }\n\n    #[async_trait]\n    impl Handler<Cell<u64>> for UnsyncMessageActor {\n        type Reply = anyhow::Result<u64>;\n\n        async fn handle(\n            &mut self,\n            number: Cell<u64>,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<anyhow::Result<u64>, ActorExitStatus> {\n            self.0 += number.get();\n            Ok(Ok(self.0))\n        }\n    }\n    let universe = Universe::with_accelerated_time();\n    let unsync_message_actor = UnsyncMessageActor::default();\n    let (mailbox, _handle) = universe.spawn_builder().spawn(unsync_message_actor);\n\n    let response_rx = mailbox.send_message(Cell::new(1)).await.unwrap();\n    assert_eq!(response_rx.await.unwrap().unwrap(), 1);\n\n    let response = mailbox.ask(Cell::new(1)).await.unwrap().unwrap();\n    assert_eq!(response, 2);\n\n   
 let response = mailbox.ask_for_res(Cell::new(1)).await.unwrap();\n    assert_eq!(response, 3);\n\n    let response_rx = mailbox\n        .send_message_with_high_priority(Cell::new(1))\n        .unwrap();\n    assert_eq!(response_rx.await.unwrap().unwrap(), 4);\n\n    let response_rx = mailbox.try_send_message(Cell::new(1)).unwrap();\n    assert_eq!(response_rx.await.unwrap().unwrap(), 5);\n\n    universe.assert_quit().await;\n}\n\nstruct FakeActorService {\n    // We use a cool id to make sure in the test that we get twice the same instance.\n    cool_id: String,\n}\n\n#[derive(Debug)]\nstruct GetCoolId;\n\nimpl Actor for FakeActorService {\n    type ObservableState = ();\n\n    fn observable_state(&self) {}\n}\n\n#[async_trait]\nimpl Handler<GetCoolId> for FakeActorService {\n    type Reply = String;\n\n    async fn handle(\n        &mut self,\n        _: GetCoolId,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.cool_id.clone())\n    }\n}\n\nimpl Default for FakeActorService {\n    fn default() -> Self {\n        FakeActorService {\n            cool_id: new_coolid(\"fake-actor\"),\n        }\n    }\n}\n\n#[tokio::test]\nasync fn test_get_or_spawn() {\n    let universe = Universe::new();\n    let mailbox1: Mailbox<FakeActorService> = universe.get_or_spawn_one();\n    let id1 = mailbox1.ask(GetCoolId).await.unwrap();\n    let mailbox2: Mailbox<FakeActorService> = universe.get_or_spawn_one();\n    let id2 = mailbox2.ask(GetCoolId).await.unwrap();\n    assert_eq!(id1, id2);\n    universe.assert_quit().await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-actors/src/universe.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::thread;\nuse std::time::Duration;\n\nuse crate::mailbox::create_mailbox;\nuse crate::registry::ActorObservation;\nuse crate::scheduler::start_scheduler;\nuse crate::spawn_builder::{SpawnBuilder, SpawnContext};\nuse crate::{Actor, ActorExitStatus, Command, Inbox, Mailbox, QueueCapacity};\n\n/// Universe serves as the top-level context in which Actor can be spawned.\n///\n/// It is *not* a singleton. 
A typical application will usually have only one universe hosting all\n/// of the actors but it is not a requirement.\n///\n/// In particular, unit tests all have their own universe and hence can be executed in parallel.\npub struct Universe {\n    pub(crate) spawn_ctx: SpawnContext,\n}\n\nimpl Default for Universe {\n    fn default() -> Universe {\n        Universe::new()\n    }\n}\n\nimpl Universe {\n    /// Creates a new universe.\n    pub fn new() -> Universe {\n        let scheduler_client = start_scheduler();\n        Universe {\n            spawn_ctx: SpawnContext::new(scheduler_client),\n        }\n    }\n\n    /// Creates a universe where time is accelerated.\n    ///\n    /// Time is accelerated in a way to exhibit a behavior as close as possible\n    /// to what would have happened with normal time but faster.\n    ///\n    /// The time \"jumps\" only happen when no actor is processing any message,\n    /// running initialization or finalize.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn with_accelerated_time() -> Universe {\n        let universe = Universe::new();\n        universe.spawn_ctx().scheduler_client.accelerate_time();\n        universe\n    }\n\n    pub fn spawn_ctx(&self) -> &SpawnContext {\n        &self.spawn_ctx\n    }\n\n    pub fn create_test_mailbox<A: Actor>(&self) -> (Mailbox<A>, Inbox<A>) {\n        create_mailbox(\"test-mailbox\".to_string(), QueueCapacity::Unbounded, None)\n    }\n\n    pub fn create_mailbox<A: Actor>(\n        &self,\n        actor_name: impl ToString,\n        queue_capacity: QueueCapacity,\n    ) -> (Mailbox<A>, Inbox<A>) {\n        self.spawn_ctx.create_mailbox(actor_name, queue_capacity)\n    }\n\n    pub fn get<A: Actor>(&self) -> Vec<Mailbox<A>> {\n        self.spawn_ctx.registry.get::<A>()\n    }\n\n    pub fn get_one<A: Actor>(&self) -> Option<Mailbox<A>> {\n        self.spawn_ctx.registry.get_one::<A>()\n    }\n\n    pub fn get_or_spawn_one<A: Actor + Default>(&self) -> Mailbox<A> {\n        
if let Some(actor_mailbox) = self.spawn_ctx.registry.get_one::<A>() {\n            actor_mailbox\n        } else {\n            let actor_default = A::default();\n            let (mailbox, _handler) = self.spawn_builder().spawn(actor_default);\n            mailbox\n        }\n    }\n\n    pub async fn observe(&self, timeout: Duration) -> Vec<ActorObservation> {\n        self.spawn_ctx.registry.observe(timeout).await\n    }\n\n    pub fn kill(&self) {\n        self.spawn_ctx.kill_switch.kill();\n    }\n\n    /// This function acts as a drop-in replacement for\n    /// `tokio::time::sleep`.\n    ///\n    /// It can however be accelerated when using a time-accelerated\n    /// universe.\n    pub async fn sleep(&self, duration: Duration) {\n        self.spawn_ctx.scheduler_client.sleep(duration).await;\n    }\n\n    pub fn spawn_builder<A: Actor>(&self) -> SpawnBuilder<A> {\n        self.spawn_ctx.spawn_builder()\n    }\n\n    /// Tells an actor to process its pending messages, then stop processing new messages\n    /// and exit successfully.\n    pub async fn send_exit_with_success<A: Actor>(\n        &self,\n        mailbox: &Mailbox<A>,\n    ) -> Result<(), crate::SendError> {\n        mailbox.send_message(Command::ExitWithSuccess).await?;\n        Ok(())\n    }\n\n    /// Gracefully quits all registered actors.\n    pub async fn quit(&self) -> HashMap<String, ActorExitStatus> {\n        self.spawn_ctx.registry.quit().await\n    }\n\n    /// Gracefully quits all registered actors and asserts that none of them panicked.\n    ///\n    /// This is useful for testing purposes to detect failed asserts in actors.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn assert_quit(self) {\n        assert!(\n            !self\n                .quit()\n                .await\n                .values()\n                .any(|status| matches!(status, ActorExitStatus::Panicked))\n        );\n    }\n}\n\nimpl Drop for Universe {\n    fn drop(&mut self) {\n        if 
cfg!(any(test, feature = \"testsuite\"))\n            && !self.spawn_ctx.registry.is_empty()\n            && !thread::panicking()\n        {\n            panic!(\n                \"There are still running actors at the end of the test. Did you call \\\n                 universe.assert_quit()?\"\n            );\n        }\n        self.spawn_ctx.kill_switch.kill();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use core::panic;\n    use std::time::Duration;\n\n    use async_trait::async_trait;\n\n    use crate::{Actor, ActorContext, ActorExitStatus, Handler, Universe};\n\n    #[derive(Default)]\n    pub struct CountingMinutesActor {\n        count: usize,\n    }\n\n    #[async_trait]\n    impl Actor for CountingMinutesActor {\n        type ObservableState = usize;\n\n        fn observable_state(&self) -> usize {\n            self.count\n        }\n\n        async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n            self.handle(Loop, ctx).await\n        }\n    }\n\n    #[derive(Debug)]\n    struct Loop;\n\n    #[async_trait]\n    impl Handler<Loop> for CountingMinutesActor {\n        type Reply = ();\n        async fn handle(\n            &mut self,\n            _msg: Loop,\n            ctx: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            self.count += 1;\n            ctx.schedule_self_msg(Duration::from_secs(60), Loop);\n            Ok(())\n        }\n    }\n\n    #[derive(Default)]\n    pub struct ExitPanickingActor {}\n\n    #[async_trait]\n    impl Actor for ExitPanickingActor {\n        type ObservableState = ();\n\n        fn observable_state(&self) -> Self::ObservableState {}\n    }\n\n    impl Drop for ExitPanickingActor {\n        fn drop(&mut self) {\n            panic!(\"Panicking on drop\")\n        }\n    }\n\n    #[tokio::test]\n    async fn test_schedule_for_actor() {\n        let universe = Universe::with_accelerated_time();\n        let actor_with_schedule = 
CountingMinutesActor::default();\n        let (_mailbox, handler) = universe.spawn_builder().spawn(actor_with_schedule);\n        let count_after_initialization = handler.process_pending_and_observe().await.state;\n        assert_eq!(count_after_initialization, 1);\n        universe.sleep(Duration::from_secs(200)).await;\n        let count_after_advance_time = handler.process_pending_and_observe().await.state;\n        assert_eq!(count_after_advance_time, 4);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_actor_quit_after_universe_quit() {\n        let universe = Universe::with_accelerated_time();\n        let actor_with_schedule = CountingMinutesActor::default();\n        let (_mailbox, handler) = universe.spawn_builder().spawn(actor_with_schedule);\n        universe.sleep(Duration::from_secs(200)).await;\n        let res = universe.quit().await;\n        assert_eq!(res.len(), 1);\n        assert!(matches!(\n            res.values().next().unwrap(),\n            ActorExitStatus::Quit\n        ));\n        assert!(matches!(handler.quit().await, (ActorExitStatus::Quit, 4)));\n    }\n\n    #[tokio::test]\n    async fn test_universe_join_after_actor_quit() {\n        let universe = Universe::default();\n        let actor_with_schedule = CountingMinutesActor::default();\n        let (_mailbox, handler) = universe.spawn_builder().spawn(actor_with_schedule);\n        assert!(matches!(handler.quit().await, (ActorExitStatus::Quit, 1)));\n        assert!(\n            !universe\n                .quit()\n                .await\n                .values()\n                .any(|status| matches!(status, ActorExitStatus::Panicked))\n        );\n    }\n\n    #[tokio::test]\n    async fn test_universe_quit_with_panicking_actor() {\n        let universe = Universe::default();\n        let panicking_actor = ExitPanickingActor::default();\n        let actor_with_schedule = CountingMinutesActor::default();\n        let (_mailbox, _handler) = 
universe.spawn_builder().spawn(panicking_actor);\n        let (_mailbox, _handler) = universe.spawn_builder().spawn(actor_with_schedule);\n        assert!(\n            universe\n                .quit()\n                .await\n                .values()\n                .any(|status| matches!(status, ActorExitStatus::Panicked))\n        );\n    }\n\n    #[tokio::test]\n    #[should_panic(\n        expected = \"There are still running actors at the end of the test. Did you call \\\n                    universe.assert_quit()?\"\n    )]\n    async fn test_enforce_universe_assert_quit_calls() {\n        let universe = Universe::with_accelerated_time();\n        let actor_with_schedule = CountingMinutesActor::default();\n        let _ = universe.spawn_builder().spawn(actor_with_schedule);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-aws/Cargo.toml",
    "content": "[package]\nname = \"quickwit-aws\"\ndescription = \"Set up AWS config and clients\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\naws-config = { workspace = true }\naws-runtime = { workspace = true }\naws-sdk-kinesis = { workspace = true, optional = true }\naws-sdk-s3 = { workspace = true }\naws-sdk-sqs = { workspace = true, optional = true }\naws-smithy-async = { workspace = true }\nfutures = { workspace = true }\ntokio = { workspace = true }\n\nquickwit-common = { workspace = true }\n\n[features]\nkinesis = [\"aws-sdk-kinesis\"]\nsqs = [\"aws-sdk-sqs\"]\n"
  },
  {
    "path": "quickwit/quickwit-aws/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![allow(clippy::match_like_matches_macro)]\n\nuse aws_runtime::retries::classifiers::{THROTTLING_ERRORS, TRANSIENT_ERRORS};\nuse aws_sdk_s3::error::SdkError;\nuse aws_sdk_s3::operation::abort_multipart_upload::AbortMultipartUploadError;\nuse aws_sdk_s3::operation::complete_multipart_upload::CompleteMultipartUploadError;\nuse aws_sdk_s3::operation::create_multipart_upload::CreateMultipartUploadError;\nuse aws_sdk_s3::operation::delete_object::DeleteObjectError;\nuse aws_sdk_s3::operation::delete_objects::DeleteObjectsError;\nuse aws_sdk_s3::operation::get_object::GetObjectError;\nuse aws_sdk_s3::operation::head_object::HeadObjectError;\nuse aws_sdk_s3::operation::put_object::PutObjectError;\nuse aws_sdk_s3::operation::upload_part::UploadPartError;\n\nuse crate::retry::AwsRetryable;\n\nimpl<E> AwsRetryable for SdkError<E>\nwhere E: AwsRetryable\n{\n    fn is_retryable(&self) -> bool {\n        match self {\n            SdkError::ConstructionFailure(_) => false,\n            SdkError::TimeoutError(_) => true,\n            SdkError::DispatchFailure(_) => false,\n            SdkError::ResponseError(_) => true,\n            SdkError::ServiceError(error) => error.err().is_retryable(),\n            _ => false,\n        }\n    }\n}\n\nfn is_retryable(meta: &aws_sdk_s3::error::ErrorMetadata) -> bool {\n    if let Some(code) = meta.code() {\n        
THROTTLING_ERRORS.contains(&code)\n            || TRANSIENT_ERRORS.contains(&code)\n            || code == \"InternalError\" // this is somehow not considered transient, despite the\n    // associated error message containing \"Please try again.\"\n    } else {\n        false\n    }\n}\n\nimpl AwsRetryable for GetObjectError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for DeleteObjectError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for DeleteObjectsError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for UploadPartError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for CompleteMultipartUploadError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for AbortMultipartUploadError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for CreateMultipartUploadError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for PutObjectError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\nimpl AwsRetryable for HeadObjectError {\n    fn is_retryable(&self) -> bool {\n        is_retryable(self.meta())\n    }\n}\n\n#[cfg(feature = \"kinesis\")]\nmod kinesis {\n    use aws_sdk_kinesis::operation::create_stream::CreateStreamError;\n    use aws_sdk_kinesis::operation::delete_stream::DeleteStreamError;\n    use aws_sdk_kinesis::operation::describe_stream::DescribeStreamError;\n    use aws_sdk_kinesis::operation::get_records::GetRecordsError;\n    use aws_sdk_kinesis::operation::get_shard_iterator::GetShardIteratorError;\n    use aws_sdk_kinesis::operation::list_shards::ListShardsError;\n    use 
aws_sdk_kinesis::operation::list_streams::ListStreamsError;\n    use aws_sdk_kinesis::operation::merge_shards::MergeShardsError;\n    use aws_sdk_kinesis::operation::split_shard::SplitShardError;\n\n    use super::*;\n\n    impl AwsRetryable for GetRecordsError {\n        fn is_retryable(&self) -> bool {\n            match self {\n                GetRecordsError::KmsThrottlingException(_) => true,\n                GetRecordsError::ProvisionedThroughputExceededException(_) => true,\n                _ => false,\n            }\n        }\n    }\n\n    impl AwsRetryable for GetShardIteratorError {\n        fn is_retryable(&self) -> bool {\n            matches!(\n                self,\n                GetShardIteratorError::ProvisionedThroughputExceededException(_)\n            )\n        }\n    }\n\n    impl AwsRetryable for ListShardsError {\n        fn is_retryable(&self) -> bool {\n            matches!(\n                self,\n                ListShardsError::ResourceInUseException(_)\n                    | ListShardsError::LimitExceededException(_)\n            )\n        }\n    }\n\n    impl AwsRetryable for CreateStreamError {\n        fn is_retryable(&self) -> bool {\n            matches!(\n                self,\n                CreateStreamError::ResourceInUseException(_)\n                    | CreateStreamError::LimitExceededException(_)\n            )\n        }\n    }\n\n    impl AwsRetryable for DeleteStreamError {\n        fn is_retryable(&self) -> bool {\n            matches!(\n                self,\n                DeleteStreamError::ResourceInUseException(_)\n                    | DeleteStreamError::LimitExceededException(_)\n            )\n        }\n    }\n\n    impl AwsRetryable for DescribeStreamError {\n        fn is_retryable(&self) -> bool {\n            matches!(self, DescribeStreamError::LimitExceededException(_))\n        }\n    }\n\n    impl AwsRetryable for ListStreamsError {\n        fn is_retryable(&self) -> bool {\n            
matches!(self, ListStreamsError::LimitExceededException(_))\n        }\n    }\n\n    impl AwsRetryable for MergeShardsError {\n        fn is_retryable(&self) -> bool {\n            matches!(\n                self,\n                MergeShardsError::ResourceInUseException(_)\n                    | MergeShardsError::LimitExceededException(_)\n            )\n        }\n    }\n\n    impl AwsRetryable for SplitShardError {\n        fn is_retryable(&self) -> bool {\n            matches!(\n                self,\n                SplitShardError::ResourceInUseException(_)\n                    | SplitShardError::LimitExceededException(_)\n            )\n        }\n    }\n}\n\n#[cfg(feature = \"sqs\")]\nmod sqs {\n    use aws_sdk_sqs::operation::change_message_visibility::ChangeMessageVisibilityError;\n    use aws_sdk_sqs::operation::delete_message_batch::DeleteMessageBatchError;\n    use aws_sdk_sqs::operation::receive_message::ReceiveMessageError;\n\n    use super::*;\n\n    impl AwsRetryable for ReceiveMessageError {\n        fn is_retryable(&self) -> bool {\n            false\n        }\n    }\n\n    impl AwsRetryable for DeleteMessageBatchError {\n        fn is_retryable(&self) -> bool {\n            false\n        }\n    }\n\n    impl AwsRetryable for ChangeMessageVisibilityError {\n        fn is_retryable(&self) -> bool {\n            false\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-aws/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse aws_config::retry::RetryConfig;\nuse aws_config::stalled_stream_protection::StalledStreamProtectionConfig;\nuse aws_config::{BehaviorVersion, Region};\npub use aws_smithy_async::rt::sleep::TokioSleep;\nuse tokio::sync::OnceCell;\n\npub mod error;\npub mod retry;\n\npub const DEFAULT_AWS_REGION: Region = Region::from_static(\"us-east-1\");\n\n/// Initialises and returns the AWS config.\npub async fn get_aws_config() -> &'static aws_config::SdkConfig {\n    static SDK_CONFIG: OnceCell<aws_config::SdkConfig> = OnceCell::const_new();\n\n    SDK_CONFIG\n        .get_or_init(|| async {\n            aws_config::defaults(aws_behavior_version())\n                .stalled_stream_protection(StalledStreamProtectionConfig::enabled().build())\n                // Currently handle this ourselves so probably best for now to leave it as is.\n                .retry_config(RetryConfig::disabled())\n                .sleep_impl(TokioSleep::default())\n                .load()\n                .await\n        })\n        .await\n}\n\n/// Returns the AWS behavior version.\npub fn aws_behavior_version() -> BehaviorVersion {\n    BehaviorVersion::v2026_01_12()\n}\n"
  },
  {
    "path": "quickwit/quickwit-aws/src/retry.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Debug;\n\nuse futures::{Future, TryFutureExt};\nuse quickwit_common::retry::{\n    Retry, RetryParams, Retryable, TokioSleep, retry_with_mockable_sleep,\n};\n\npub trait AwsRetryable {\n    fn is_retryable(&self) -> bool {\n        false\n    }\n}\n\nimpl<E> AwsRetryable for Retry<E> {\n    fn is_retryable(&self) -> bool {\n        match self {\n            Retry::Transient(_) => true,\n            Retry::Permanent(_) => false,\n        }\n    }\n}\n\n#[derive(Debug)]\nstruct AwsRetryableWrapper<E>(E);\n\nimpl<E> Retryable for AwsRetryableWrapper<E>\nwhere E: AwsRetryable\n{\n    fn is_retryable(&self) -> bool {\n        self.0.is_retryable()\n    }\n}\n\npub async fn aws_retry<U, E, Fut>(retry_params: &RetryParams, f: impl Fn() -> Fut) -> Result<U, E>\nwhere\n    Fut: Future<Output = Result<U, E>>,\n    E: AwsRetryable + Debug + 'static,\n{\n    retry_with_mockable_sleep(\n        retry_params,\n        || f().map_err(AwsRetryableWrapper),\n        TokioSleep,\n    )\n    .await\n    .map_err(|error| error.0)\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/Cargo.toml",
    "content": "[package]\nname = \"quickwit-cli\"\ndescription = \"Command line interface for launching and managing Quickwit clusters\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\ndefault-run = \"quickwit\"\n\n[[bin]]\nname = \"quickwit\"\npath = \"src/main.rs\"\n\n[[bin]]\nname = \"generate_markdown\"\npath = \"src/generate_markdown.rs\"\n\n[dependencies]\nanyhow = { workspace = true }\nbacktrace = { workspace = true, optional = true }\nbytesize = { workspace = true }\nchrono = { workspace = true }\nclap = { workspace = true }\ncolored = { workspace = true }\nconsole-subscriber = { workspace = true, optional = true }\ndialoguer = { workspace = true }\nfutures = { workspace = true }\nhumantime = { workspace = true }\nindicatif = { workspace = true }\nitertools = { workspace = true }\nnumfmt = { workspace = true }\nonce_cell = { workspace = true }\nopenssl-probe = { workspace = true, optional = true }\nopentelemetry = { workspace = true }\nopentelemetry-appender-tracing = { workspace = true }\nopentelemetry_sdk = { workspace = true }\nopentelemetry-otlp = { workspace = true }\nreqwest = { workspace = true }\nrustls = { workspace = true }\nserde_json = { workspace = true }\ntabled = { workspace = true }\ntempfile = { workspace = true }\nthiserror = { workspace = true }\nthousands = { workspace = true }\ntikv-jemalloc-ctl = { workspace = true, optional = true }\ntikv-jemallocator = { workspace = true, optional = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntoml = { workspace = true }\ntracing = { workspace = true }\ntracing-opentelemetry = { workspace = true }\ntracing-subscriber = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-index-management = { 
workspace = true }\nquickwit-indexing = { workspace = true }\nquickwit-ingest = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-rest-client = { workspace = true }\nquickwit-search = { workspace = true }\nquickwit-serve = { workspace = true }\nquickwit-storage = { workspace = true }\nquickwit-telemetry = { workspace = true }\n\n[dev-dependencies]\npredicates = { workspace = true }\nreqwest = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[features]\njemalloc = [\"dep:tikv-jemalloc-ctl\", \"dep:tikv-jemallocator\"]\njemalloc-profiled = [\n  \"dep:backtrace\",\n  \"quickwit-common/jemalloc-profiled\",\n  \"quickwit-serve/jemalloc-profiled\"\n]\nci-test = []\npprof = [\"quickwit-serve/pprof\"]\nopenssl-support = [\"openssl-probe\"]\n# Requires to enable tokio unstable via RUSTFLAGS=\"--cfg tokio_unstable\"\ntokio-console = [\"console-subscriber\", \"quickwit-common/named_tasks\"]\nrelease-feature-set = [\n  \"jemalloc\",\n  \"openssl-support\",\n  \"pprof\",\n  \"quickwit-indexing/kafka\",\n  \"quickwit-indexing/kinesis\",\n  \"quickwit-indexing/pulsar\",\n  \"quickwit-indexing/sqs\",\n  \"quickwit-indexing/vrl\",\n  \"quickwit-serve/lambda\",\n  \"quickwit-storage/azure\",\n  \"quickwit-storage/gcs\",\n  \"quickwit-metastore/postgres\",\n]\nrelease-feature-vendored-set = [\n  \"jemalloc\",\n  \"openssl-support\",\n  \"pprof\",\n  \"quickwit-indexing/kinesis\",\n  \"quickwit-indexing/pulsar\",\n  \"quickwit-indexing/sqs\",\n  \"quickwit-indexing/vrl\",\n  \"quickwit-indexing/vendored-kafka\",\n  \"quickwit-serve/lambda\",\n  \"quickwit-storage/azure\",\n  \"quickwit-storage/gcs\",\n  
\"quickwit-metastore/postgres\",\n]\nrelease-macos-feature-vendored-set = [\n  \"jemalloc\",\n  \"openssl-support\",\n  \"quickwit-indexing/kinesis\",\n  \"quickwit-indexing/pulsar\",\n  \"quickwit-indexing/sqs\",\n  \"quickwit-indexing/vrl\",\n  \"quickwit-indexing/vendored-kafka-macos\",\n  \"quickwit-serve/lambda\",\n  \"quickwit-storage/azure\",\n  \"quickwit-storage/gcs\",\n  \"quickwit-metastore/postgres\",\n]\nrelease-jemalloc-profiled = [\n  \"release-feature-set\",\n  \"jemalloc-profiled\",\n]\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/checklist.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Display;\n\nuse colored::{Color, Colorize};\nuse itertools::Itertools;\nuse thiserror::Error;\n\n/// Quickwit main colors slightly adapted to be readable on a terminal.\npub const BLUE_COLOR: Color = Color::TrueColor {\n    r: 22,\n    g: 74,\n    b: 209,\n};\n\npub const GREEN_COLOR: Color = Color::Green;\npub const WHITE_COLOR: Color = Color::TrueColor {\n    r: 255,\n    g: 255,\n    b: 255,\n};\npub const RED_COLOR: Color = Color::TrueColor {\n    r: 230,\n    g: 0,\n    b: 34,\n};\n\npub fn print_checklist(check_list_results: &[(&str, anyhow::Result<()>)]) {\n    eprintln!(\n        \"\\n{}\\n{}\",\n        \"---------------------------------------------------\".color(GREEN_COLOR),\n        \" Connectivity checklist \"\n            .color(WHITE_COLOR)\n            .on_color(GREEN_COLOR)\n    );\n    let mut errors = Vec::new();\n    for (check_item_name, check_item_result) in check_list_results {\n        let outcome_symbol = if check_item_result.is_ok() {\n            \"✔\".color(GREEN_COLOR) // '✓'\n        } else {\n            \"✖\".color(RED_COLOR) //𐄂\n        };\n        eprintln!(\" {outcome_symbol} {check_item_name}\");\n        if let Err(check_item_err) = check_item_result {\n            errors.push((check_item_name, check_item_err));\n        }\n    }\n    if errors.is_empty() {\n        println!();\n        
return;\n    }\n    eprintln!(\n        \"{}\\n{}\",\n        \"---------------------------------------------------\".color(RED_COLOR),\n        \" Error Details \".color(WHITE_COLOR).on_color(RED_COLOR)\n    );\n    for (check_item_name, check_item_err) in errors {\n        eprintln!(\n            \"\\n{}\\n{:?}\",\n            format!(\" ✖ {check_item_name}\").color(RED_COLOR),\n            check_item_err\n        );\n    }\n    eprintln!(\"\\n\\n\");\n}\n\n/// Runs a checklist and prints its successes and failures to stderr.\n///\n/// Returns a `ChecklistError` collecting the failed checks if at least one check fails.\npub fn run_checklist(checks: Vec<(&str, anyhow::Result<()>)>) -> Result<(), ChecklistError> {\n    print_checklist(&checks);\n    if !checks\n        .iter()\n        .all(|(_, check_items_res)| check_items_res.is_ok())\n    {\n        return Err(ChecklistError::from_results(checks));\n    }\n\n    Ok(())\n}\n\n#[derive(Error, Debug)]\npub struct ChecklistError {\n    pub errors: Vec<(String, anyhow::Result<()>)>,\n}\n\nimpl ChecklistError {\n    pub fn from_results(results: Vec<(&str, anyhow::Result<()>)>) -> Self {\n        let errors = results\n            .into_iter()\n            .filter(|(_, check_res)| check_res.is_err())\n            .map(|(check_elem, check_res)| (check_elem.to_string(), check_res))\n            .collect();\n        ChecklistError { errors }\n    }\n}\n\nimpl Display for ChecklistError {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        let err_string = self\n            .errors\n            .iter()\n            .map(|(check_item, check_item_err)| {\n                format!(\n                    \"\\n{}: {}\",\n                    check_item,\n                    check_item_err\n                        .as_ref()\n                        .expect_err(\"ChecklistError can't contain success results\")\n                )\n            })\n            .join(\"\");\n        write!(f, \"{err_string}\")\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/cli.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse anyhow::{Context, bail};\nuse clap::{Arg, ArgAction, ArgMatches, Command, arg};\nuse quickwit_serve::EnvFilterReloadFn;\nuse tracing::Level;\n\nuse crate::index::{IndexCliCommand, build_index_command};\nuse crate::service::{RunCliCommand, build_run_command};\nuse crate::source::{SourceCliCommand, build_source_command};\nuse crate::split::{SplitCliCommand, build_split_command};\nuse crate::tool::{ToolCliCommand, build_tool_command};\n\npub fn build_cli() -> Command {\n    Command::new(\"Quickwit\")\n        .arg(\n            // Following https://no-color.org/\n            Arg::new(\"no-color\")\n                .long(\"no-color\")\n                .help(\n                    \"Disable ANSI terminal codes (colors, etc...) 
being injected into the logging \\\n                     output\",\n                )\n                .env(\"NO_COLOR\")\n                .value_parser(clap::builder::FalseyValueParser::new())\n                .global(true)\n                .action(ArgAction::SetTrue),\n        )\n        .arg(arg!(-y --\"yes\" \"Assume \\\"yes\\\" as an answer to all prompts and run non-interactively.\")\n            .global(true)\n            .required(false)\n        )\n        .subcommand(build_run_command().display_order(1))\n        .subcommand(build_index_command().display_order(2))\n        .subcommand(build_source_command().display_order(3))\n        .subcommand(build_split_command().display_order(4))\n        .subcommand(build_tool_command().display_order(5))\n        .arg_required_else_help(true)\n        .disable_help_subcommand(true)\n        .subcommand_required(true)\n}\n\n#[derive(Debug, PartialEq)]\npub enum CliCommand {\n    Run(RunCliCommand),\n    Index(IndexCliCommand),\n    Split(SplitCliCommand),\n    Source(SourceCliCommand),\n    Tool(ToolCliCommand),\n}\n\nimpl CliCommand {\n    pub fn default_log_level(&self) -> Level {\n        match self {\n            CliCommand::Run(_) => Level::INFO,\n            CliCommand::Index(subcommand) => subcommand.default_log_level(),\n            CliCommand::Source(_) => Level::ERROR,\n            CliCommand::Split(_) => Level::ERROR,\n            CliCommand::Tool(_) => Level::ERROR,\n        }\n    }\n\n    pub fn parse_cli_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let (subcommand, submatches) = matches\n            .remove_subcommand()\n            .context(\"failed to parse command\")?;\n        match subcommand.as_str() {\n            \"index\" => IndexCliCommand::parse_cli_args(submatches).map(CliCommand::Index),\n            \"run\" => RunCliCommand::parse_cli_args(submatches).map(CliCommand::Run),\n            \"source\" => 
SourceCliCommand::parse_cli_args(submatches).map(CliCommand::Source),\n            \"split\" => SplitCliCommand::parse_cli_args(submatches).map(CliCommand::Split),\n            \"tool\" => ToolCliCommand::parse_cli_args(submatches).map(CliCommand::Tool),\n            _ => bail!(\"unknown command `{subcommand}`\"),\n        }\n    }\n\n    pub async fn execute(self, env_filter_reload_fn: EnvFilterReloadFn) -> anyhow::Result<()> {\n        match self {\n            CliCommand::Index(subcommand) => subcommand.execute().await,\n            CliCommand::Run(subcommand) => subcommand.execute(env_filter_reload_fn).await,\n            CliCommand::Source(subcommand) => subcommand.execute().await,\n            CliCommand::Split(subcommand) => subcommand.execute().await,\n            CliCommand::Tool(subcommand) => subcommand.execute().await,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/cli_doc_ext.toml",
    "content": "[index.create]\nlong_about = \"\"\"\nCreates an index of ID `index` at `index-uri` configured by a [YAML config file](../configuration/index-config.md) located at `index-config`.\nThe index config lets you define the mapping of your document on the index and how each field is stored and indexed.\nIf `index-uri` is omitted, `index-uri` will be set to `{default_index_root_uri}/{index}`, more info on [Quickwit config docs](../configuration/node-config.md).\nThe command fails if an index already exists unless `overwrite` is passed.\nWhen `overwrite` is enabled, the command deletes all the files stored at `index-uri` before creating a new index.\n\"\"\"\n\n[[index.create.examples]]\nname= \"Create a new index.\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncurl -o wikipedia_index_config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/wikipedia/index-config.yaml\nquickwit index create --endpoint=http://127.0.0.1:7280 --index-config wikipedia_index_config.yaml\n'''\n\n[index.ingest]\nlong_about = \"\"\"\nIndexes a dataset consisting of newline-delimited JSON objects located at `input-path` or read from *stdin*.\nThe data is appended to the target index of ID `index` unless `overwrite` is passed. `input-path` can be a file or another command output piped into stdin.\nCurrently, only local datasets are supported.\nBy default, Quickwit's indexer will work with a heap of 2 GiB of memory. 
Learn how to change `heap-size` in the [index config doc page](../configuration/index-config.md).\n\"\"\"\n\n[[index.ingest.examples]]\nname = \"Indexing a dataset from a file\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncurl -o wiki-articles-10000.json https://quickwit-datasets-public.s3.amazonaws.com/wiki-articles-10000.json\nquickwit index ingest --endpoint=http://127.0.0.1:7280 --index wikipedia --input-path wiki-articles-10000.json\n'''\n\n[[index.ingest.examples]]\nname = \"Indexing a dataset from stdin\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncat wiki-articles-10000.json | quickwit index ingest --endpoint=http://127.0.0.1:7280 --index wikipedia\n'''\n\n[tool.gc]\nnote = \"\"\"\nIntermediate files are created while executing Quickwit commands.\nThese intermediate files are always cleaned up at the end of each successfully executed command.\nHowever, failed or interrupted commands can leave behind intermediate files that need to be removed.\nAlso, note that a very short grace period (a few seconds, for example) can cause the removal of intermediate files that are still being operated on, especially when using Quickwit concurrently on the same index.\nIn practice, you can settle for the default value (1 hour) and only specify a lower value if you really know what you are doing.\n\"\"\"\n\n[index.search]\nlong_about = \"\"\"\nSearches an index with ID `--index` and returns the documents matching the query specified with `--query`.\nMore details on the [query language page](query-language.md).\nThe offset of the first hit returned and the number of hits returned can be set with the `start-offset` and `max-hits` options.\nThe `search-fields` option overrides the default search fields, that is, the list of fields Quickwit searches when \nthe user query does not explicitly target a field. 
Quickwit will return snippets of the matching content when requested via the `snippet-fields` option.\nSearch can also be limited to a time range using the `start-timestamp` and `end-timestamp` options.\nThese timestamp options are useful for boosting query performance when using a time series dataset.\n\n:::warning\nThe `start-timestamp` and `end-timestamp` options should be specified in seconds regardless of the timestamp field precision. The timestamp field precision only affects how the field is stored as a fast field, whereas the document filtering is always performed in seconds.\n:::\n\"\"\"\n\n[[index.search.examples]]\nname = \"Searching an index\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\"\n# If you have jq installed.\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\" | jq '.hits[].title'\n'''\n\n[[index.search.examples]]\nname = \"Sorting documents by their BM25 score\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"obama\" --sort-by-score\n'''\n\n[[index.search.examples]]\nname = \"Limiting the result set to 50 hits\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\" --max-hits 50\n# If you have jq installed.\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"Barack Obama\" --max-hits 50 | jq '.num_hits'\n'''\n\n[[index.search.examples]]\nname = \"Looking for matches in the body only\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new 
terminal and run:\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"obama\" --search-fields body\n# If you have jq installed.\nquickwit index search --endpoint=http://127.0.0.1:7280 --index wikipedia --query \"obama\" --search-fields body | jq '.hits[].title'\n'''\n\n[[index.list.examples]]\nname = \"List indexes\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index list --endpoint=http://127.0.0.1:7280\n# Or with alias.\nquickwit index ls --endpoint=http://127.0.0.1:7280\n\n                                    Indexes                                     \n+-----------+--------------------------------------------------------+\n| Index ID  |                       Index URI                        |\n+-----------+--------------------------------------------------------+\n| hdfs-logs | file:///home/quickwit-indices/qwdata/indexes/hdfs-logs |\n+-----------+--------------------------------------------------------+\n| wikipedia | file:///home/quickwit-indices/qwdata/indexes/wikipedia |\n+-----------+--------------------------------------------------------+\n\n'''\n\n\n[[index.describe.examples]]\nname = \"Displays descriptive statistics of your index\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index describe --endpoint=http://127.0.0.1:7280 --index wikipedia\n\n1. General infos\n===============================================================================\nIndex id:                           wikipedia\nIndex uri:                          file:///home/quickwit-indices/qwdata/indexes/wikipedia\nNumber of published splits:         1\nNumber of published documents:      300000\nSize of published splits:           448 MB\n\n2. 
Statistics on splits\n===============================================================================\nDocument count stats:\nMean ± σ in [min … max]:            300000 ± 0 in [300000 … 300000]\nQuantiles [1%, 25%, 50%, 75%, 99%]: [300000, 300000, 300000, 300000, 300000]\n\nSize in MB stats:\nMean ± σ in [min … max]:            448 ± 0 in [448 … 448]\nQuantiles [1%, 25%, 50%, 75%, 99%]: [448, 448, 448, 448, 448]\n'''\n\n[[index.delete.examples]]\nname = \"Delete your index\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit index delete --index wikipedia --endpoint=http://127.0.0.1:7280\n'''\n\n\n[run]\nlong_about = \"\"\"\n\n### Indexer service\n\nThe indexer service runs indexing pipelines assigned by the control plane.\n\n### Searcher service \nStarts a web server at `rest_listing_address:rest_list_port` that exposes the [Quickwit REST API](rest-api.md)\nwhere `rest_listing_address` and `rest_list_port` are defined in Quickwit config file (quickwit.yaml).\nThe node can optionally join a cluster using the `peer_seeds` parameter.\nThis list of node addresses is used to discover the remaining peer nodes in the cluster through a gossip protocol, see [chitchat](https://github.com/quickwit-oss/chitchat).\n\n### Metastore service\n\nThe metastore service exposes Quickwit metastore over the network. This is a core internal service that is needed to operate Quickwit. As such, at least one running instance of this service is required for other services to work.\n\n### Control plane service\n\nThe control plane service schedules indexing tasks to indexers. 
It listens to metastore events such as\na source create, delete, or toggle, or an index delete, and reacts accordingly to update the indexing plan.\n\n### Janitor service\n\nThe Janitor service runs maintenance tasks on indexes: garbage collection, document deletion, and retention policy tasks.\n\n:::note\nQuickwit needs to open the following ports for cluster formation and workload distribution:\n\n    TCP port (default is 7280) for REST API\n    TCP and UDP port (default is 7280) for cluster membership protocol\n    TCP port + 1 (default is 7281) for gRPC address for the distributed search\n\nIf ports are already taken, the `run` command will fail.\n:::\n\"\"\"\n\n[[run.examples]]\nname = \"Start indexer and metastore services\"\ncommand = \"quickwit run --service indexer --service metastore --endpoint=http://127.0.0.1:7280\"\n\n[[run.examples]]\nname = \"Start control plane, metastore, and janitor services\"\ncommand = \"quickwit run --service control_plane --service metastore --service janitor --config=./config/quickwit.yaml\"\n\n[[run.examples]]\nname = \"Make a search request on a wikipedia index\"\ncommand = '''\n# To create the wikipedia index and ingest data, see our tutorials at https://quickwit.io/docs/get-started/.\n# Start a searcher.\nquickwit run --service searcher --service metastore --config=./config/quickwit.yaml\n# Make a request.\ncurl \"http://127.0.0.1:7280/api/v1/wikipedia/search?query=barack+obama\"\n'''\n\n[[source.examples]]\nname = \"Add a Kafka source to the `wikipedia` index\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\ncat << EOF > wikipedia-kafka-source.json\n{\n  \"version\": \"0.7\",\n  \"source_id\": \"kafka-source\",\n  \"source_type\": \"kafka\",\n  \"params\": {\n    \"topic\": \"wikipedia\",\n      \"client_params\": {\n        \"bootstrap.servers\": \"localhost:9092\",\n        \"group.id\": \"my-group-id\",\n        \"security.protocol\": 
\"SSL\"\n      }\n  }\n}\nEOF\nquickwit source create --endpoint=http://127.0.0.1:7280 --index wikipedia --config-file wikipedia-kafka-source.json\n'''\n\n[[source.list.examples]]\nname = \"List `wikipedia` index sources\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit source list --endpoint=http://127.0.0.1:7280 --index wikipedia\n'''\n\n\n[[source.delete.examples]]\nname = \"Delete a `wikipedia-source` source\"\ncommand = '''\n# Start a Quickwit server.\nquickwit run --service metastore --config=./config/quickwit.yaml\n# Open a new terminal and run:\nquickwit source delete --endpoint=http://127.0.0.1:7280 --index wikipedia --source wikipedia-source\n'''\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/generate_markdown.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse clap::Command;\nuse quickwit_cli::cli::build_cli;\nuse quickwit_serve::BuildInfo;\nuse toml::Value;\n\n#[tokio::main]\nasync fn main() -> anyhow::Result<()> {\n    let version_text = BuildInfo::get_version_text();\n    let app = build_cli()\n        .version(version_text)\n        .disable_help_subcommand(true);\n\n    generate_markdown_from_clap(&app);\n    Ok(())\n}\n\nfn markdown_for_command(command: &Command, doc_extensions: &toml::Value) {\n    let command_name = command.get_name();\n    let command_ext: Option<&Value> = doc_extensions.get(command_name.to_string());\n    markdown_for_command_helper(command, command_ext, command_name.to_string(), Vec::new());\n}\n\nfn markdown_for_subcommand(\n    subcommand: &Command,\n    command_group: Vec<String>,\n    doc_extensions: &toml::Value,\n    level: usize,\n) {\n    let subcommand_name = subcommand.get_name();\n\n    let command_name = format!(\"{} {}\", command_group.join(\" \"), subcommand_name);\n    let header_level = \"#\".repeat(level);\n    println!(\"{header_level} {command_name}\\n\");\n\n    let subcommand_ext: Option<&Value> = {\n        let mut val_opt: Option<&Value> = doc_extensions.get(command_group[0].to_string());\n        for command in command_group\n            .iter()\n            .skip(1)\n            .chain(&[subcommand_name.to_string()])\n        {\n            if 
let Some(val) = val_opt {\n                val_opt = val.get(command);\n            }\n        }\n        val_opt\n    };\n    markdown_for_command_helper(subcommand, subcommand_ext, command_name, command_group);\n}\n\nfn markdown_for_command_helper(\n    subcommand: &Command,\n    subcommand_ext: Option<&Value>,\n    command_name: String,\n    command_group: Vec<String>,\n) {\n    let long_about_opt: Option<&str> =\n        subcommand_ext.and_then(|el| el.get(\"long_about\").and_then(|el| el.as_str()));\n\n    let note: Option<&str> =\n        subcommand_ext.and_then(|el| el.get(\"note\").and_then(|el| el.as_str()));\n\n    let examples_opt: Option<&Vec<Value>> =\n        subcommand_ext.and_then(|el| el.get(\"examples\").and_then(|el| el.as_array()));\n\n    if let Some(about) = long_about_opt {\n        if !about.trim().is_empty() {\n            println!(\"{about}  \");\n        }\n    } else if let Some(about) = subcommand.get_about()\n        && !about.to_string().trim().is_empty()\n    {\n        println!(\"{about}  \");\n    }\n\n    if let Some(note) = note {\n        println!(\":::note\");\n        println!(\"{note}\");\n        println!(\":::\");\n    }\n\n    println!(\n        \"`quickwit {} {} [args]`\",\n        command_group.join(\" \"),\n        subcommand.get_name()\n    );\n    for alias in subcommand.get_all_aliases() {\n        println!(\"`quickwit {} {} [args]`\", command_group.join(\" \"), alias);\n    }\n\n    let arguments = subcommand\n        .get_arguments()\n        .filter(|arg| !(arg.get_id() == \"help\" || arg.get_id() == \"version\"))\n        .collect::<Vec<_>>();\n    if !arguments.is_empty() {\n        println!(\"\\n*Synopsis*\\n\");\n\n        println!(\"```bash\");\n        println!(\"quickwit {command_name}\");\n        for arg in &arguments {\n            let is_required = arg.is_required_set();\n            let is_bool = !arg.get_action().takes_values();\n\n            let mut commando = format!(\"--{}\", arg.get_id());\n      
      if !is_bool {\n                commando = format!(\"{} <{}>\", commando, arg.get_id());\n            }\n            if !is_required {\n                commando = format!(\"[{commando}]\");\n            }\n            println!(\"    {commando}\");\n        }\n        println!(\"```\");\n        println!(\"\\n*Options*\\n\");\n\n        // Check if any options have defaults to know if the \"Default\" column is needed\n        let has_defaults = arguments\n            .iter()\n            .any(|arg| !arg.get_default_values().is_empty());\n\n        if has_defaults {\n            println!(\"| Option | Description | Default |\");\n            println!(\"|-----------------|-------------|--------:|\");\n            for arg in arguments {\n                let default = if let Some(val) = arg.get_default_values().first() {\n                    format!(\"`{}`\", val.to_str().unwrap())\n                } else {\n                    \"\".to_string()\n                };\n                println!(\n                    \"| `--{}` | {} | {} |\",\n                    arg.get_id(),\n                    arg.get_help().unwrap_or_default(),\n                    default\n                );\n            }\n        } else {\n            println!(\"| Option | Description |\");\n            println!(\"|-----------------|-------------|\");\n            for arg in arguments {\n                println!(\n                    \"| `--{}` | {} |\",\n                    arg.get_id(),\n                    arg.get_help().unwrap_or_default()\n                );\n            }\n        }\n    }\n\n    if let Some(examples) = examples_opt {\n        println!(\"\\n*Examples*\\n\");\n        for example in examples {\n            println!(\"*{}*\", example.get(\"name\").unwrap().as_str().unwrap());\n            println!(\n                \"```bash\\n{}\\n```\\n\",\n                example.get(\"command\").unwrap().as_str().unwrap()\n            );\n        }\n    }\n}\n\nfn 
generate_markdown_from_clap(command: &Command) {\n    let ext_toml = include_str!(\"cli_doc_ext.toml\");\n    let doc_extensions: Value = ext_toml.parse::<Value>().unwrap();\n\n    let commands = command.get_subcommands();\n    for command in commands {\n        let command_name = command.get_name(); // index, split, source\n        println!(\"## {command_name}\");\n        if let Some(about) = command.get_long_about().or_else(|| command.get_about())\n            && !about.to_string().trim().is_empty()\n        {\n            println!(\"{about}\\n\");\n        }\n\n        if command.get_subcommands().count() == 0 {\n            markdown_for_command(command, &doc_extensions);\n            continue;\n        }\n\n        let excluded_doc_commands = [\"merge\", \"local-search\"];\n        for subcommand in command\n            .get_subcommands()\n            .filter(|subcommand| !excluded_doc_commands.contains(&subcommand.get_name()))\n        {\n            let commands = vec![command.get_name().to_string()];\n            markdown_for_subcommand(subcommand, commands, &doc_extensions, 3);\n\n            for subsubcommand in subcommand.get_subcommands() {\n                let commands = vec![\n                    command.get_name().to_string(),\n                    subcommand.get_name().to_string(),\n                ];\n                markdown_for_subcommand(subsubcommand, commands, &doc_extensions, 4);\n            }\n        }\n    }\n    std::process::exit(0);\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/index.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::fmt::Display;\nuse std::num::NonZeroUsize;\nuse std::ops::Div;\nuse std::path::PathBuf;\nuse std::str::FromStr;\nuse std::time::{Duration, Instant};\n\nuse anyhow::{Context, anyhow, bail};\nuse bytesize::ByteSize;\nuse clap::{Arg, ArgAction, ArgMatches, Command, arg};\nuse colored::Colorize;\nuse indicatif::{ProgressBar, ProgressStyle};\nuse itertools::Itertools;\nuse numfmt::{Formatter, Scales};\nuse quickwit_common::tower::{Rate, RateEstimator, SmaRateEstimator};\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{ConfigFormat, IndexConfig};\nuse quickwit_metastore::{IndexMetadata, Split, SplitState};\nuse quickwit_proto::search::{CountHits, SortField, SortOrder};\nuse quickwit_proto::types::IndexId;\nuse quickwit_rest_client::models::{IngestSource, SearchResponseRestClient};\nuse quickwit_rest_client::rest_client::{CommitType, IngestEvent};\nuse quickwit_serve::{ListSplitsQueryParams, SearchRequestQueryString, SortBy};\nuse quickwit_storage::{StorageResolver, load_file};\nuse tabled::settings::object::{FirstRow, Rows, Segment};\nuse tabled::settings::panel::Footer;\nuse tabled::settings::{Alignment, Format, Modify, Panel, Remove, Rotate, Style};\nuse tabled::{Table, Tabled};\nuse tracing::{Level, debug};\n\nuse crate::checklist::{GREEN_COLOR, RED_COLOR};\nuse crate::stats::{mean, percentile, 
std_deviation};\nuse crate::{ClientArgs, client_args, make_table, prompt_confirmation};\n\npub fn build_index_command() -> Command {\n    Command::new(\"index\")\n        .about(\"Manages indexes: creates, updates, deletes, ingests, searches, describes...\")\n        .args(client_args())\n        .subcommand(\n            Command::new(\"create\")\n                .display_order(1)\n                .about(\"Creates an index from an index config file.\")\n                .args(&[\n                    arg!(--\"index-config\" <INDEX_CONFIG> \"Location of the index config file.\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--overwrite \"Overwrites pre-existing index. This will delete all existing data stored at `index-uri` before creating a new index.\")\n                        .display_order(2)\n                        .required(false),\n                ])\n            )\n        .subcommand(\n            Command::new(\"update\")\n            .display_order(1)\n            .about(\"Updates an index using an index config file.\")\n            .long_about(\"This command follows PUT semantics, which means that all the fields of the current configuration are replaced by the values specified in this request or the associated defaults. In particular, if the field is optional (e.g. `retention_policy`), omitting it will delete the associated configuration. 
If the new configuration file contains updates that cannot be applied, the request fails, and none of the updates are applied.\")\n            .args(&[\n                arg!(--index <INDEX> \"ID of the target index\")\n                    .display_order(1)\n                    .required(true),\n                arg!(--\"index-config\" <INDEX_CONFIG> \"Location of the index config file.\")\n                    .display_order(2)\n                    .required(true),\n                arg!(--\"create\" \"Create the index if it does not already exist.\")\n                    .display_order(3)\n                    .required(false),\n            ])\n        )\n        .subcommand(\n            Command::new(\"clear\")\n                .display_order(3)\n                .alias(\"clr\")\n                .about(\"Clears an index: deletes all splits and resets checkpoint.\")\n                .long_about(\"Deletes all its splits and resets its checkpoint. This operation is destructive and cannot be undone, proceed with caution.\")\n                .args(&[\n                    arg!(--index <INDEX> \"Index ID\")\n                        .display_order(1)\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"delete\")\n                .display_order(4)\n                .alias(\"del\")\n                .about(\"Deletes an index.\")\n                .long_about(\"Deletes an index. 
This operation is destructive and cannot be undone, proceed with caution.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--\"dry-run\" \"Executes the command in dry run mode and only displays the list of splits candidates for deletion.\")\n                        .required(false),\n                ])\n            )\n        .subcommand(\n            Command::new(\"describe\")\n                .display_order(5)\n                .about(\"Displays descriptive statistics of an index.\")\n                .long_about(\"Displays descriptive statistics of an index. Displayed statistics are: number of published splits, number of documents, splits min/max timestamps, size of splits.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"list\")\n                .alias(\"ls\")\n                .display_order(6)\n                .about(\"List indexes.\")\n            )\n        .subcommand(\n            Command::new(\"ingest\")\n                .display_order(7)\n                .about(\"Ingest NDJSON documents with the ingest API.\")\n                .long_about(\"Reads NDJSON documents from a file or streamed from stdin and sends them into ingest API.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--\"input-path\" <INPUT_PATH> \"Location of the input file.\")\n                        .required(false),\n                    arg!(--\"batch-size-limit\" <BATCH_SIZE_LIMIT> \"Size limit of each submitted document batch.\")\n                        
.required(false),\n                    Arg::new(\"wait\")\n                        .long(\"wait\")\n                        .short('w')\n                        .help(\"Wait for all documents to be committed and available for search before exiting. Applies only to the last batch, see [#5417](https://github.com/quickwit-oss/quickwit/issues/5417).\")\n                        .action(ArgAction::SetTrue),\n                    Arg::new(\"detailed-response\")\n                        .long(\"detailed-response\")\n                        .help(\"Print detailed errors. Enabling might impact performance negatively.\")\n                        .action(ArgAction::SetTrue),\n                    Arg::new(\"force\")\n                        .long(\"force\")\n                        .short('f')\n                        .help(\"Force a commit after the last document is sent, and wait for all documents to be committed and available for search before exiting. Applies only to the last batch, see [#5417](https://github.com/quickwit-oss/quickwit/issues/5417).\")\n                        .action(ArgAction::SetTrue)\n                        .conflicts_with(\"wait\"),\n                    Arg::new(\"commit-timeout\")\n                        .long(\"commit-timeout\")\n                        .help(\"Timeout for ingest operations that require waiting for the final commit (`--wait` or `--force`). 
This is different from the `commit_timeout_secs` indexing setting, which sets the maximum time before committing splits after their creation.\")\n                        .required(false)\n                        .global(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"search\")\n                .display_order(8)\n                .about(\"Searches an index.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--query <QUERY> \"Query expressed in natural query language ((barack AND obama) OR \\\"president of united states\\\"). Learn more on https://quickwit.io/docs/reference/search-language.\")\n                        .display_order(2)\n                        .required(true),\n                    arg!(--aggregation <AGG> \"JSON serialized aggregation request in tantivy/elasticsearch format.\")\n                        .required(false),\n                    arg!(--\"max-hits\" <MAX_HITS> \"Maximum number of hits returned.\")\n                        .default_value(\"20\")\n                        .required(false),\n                    arg!(--\"start-offset\" <OFFSET> \"Offset in the global result set of the first hit returned.\")\n                        .default_value(\"0\")\n                        .required(false),\n                    arg!(--\"search-fields\" <FIELD_NAME> \"List of fields that Quickwit will search into if the user query does not explicitly target a field in the query. It overrides the default search fields defined in the index config. Space-separated list, e.g. \\\"field1 field2\\\". \")\n                        .num_args(1..)\n                        .required(false),\n                    arg!(--\"snippet-fields\" <FIELD_NAME> \"List of fields that Quickwit will return snippet highlight on. Space-separated list, e.g. \\\"field1 field2\\\". 
\")\n                        .num_args(1..)\n                        .required(false),\n                    arg!(--\"start-timestamp\" <TIMESTAMP> \"Filters out documents before that timestamp (time-series indexes only).\")\n                        .required(false),\n                    arg!(--\"end-timestamp\" <TIMESTAMP> \"Filters out documents after that timestamp (time-series indexes only).\")\n                        .required(false),\n                    arg!(--\"sort-by-score\" \"Sorts documents by their BM25 score.\")\n                        .required(false),\n                ])\n            )\n        .arg_required_else_help(true)\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ClearIndexArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct CreateIndexArgs {\n    pub client_args: ClientArgs,\n    pub index_config_uri: Uri,\n    pub overwrite: bool,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct UpdateIndexArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub index_config_uri: Uri,\n    pub create: bool,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct DescribeIndexArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct IngestDocsArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub input_path_opt: Option<PathBuf>,\n    pub batch_size_limit_opt: Option<ByteSize>,\n    pub commit_type: CommitType,\n    pub detailed_response: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct SearchIndexArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub query: String,\n    pub aggregation: Option<String>,\n    pub max_hits: usize,\n    pub start_offset: usize,\n    pub search_fields: Option<Vec<String>>,\n    pub snippet_fields: Option<Vec<String>>,\n    pub start_timestamp: 
Option<i64>,\n    pub end_timestamp: Option<i64>,\n    pub sort_by_score: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct DeleteIndexArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub dry_run: bool,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ListIndexesArgs {\n    pub client_args: ClientArgs,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub enum IndexCliCommand {\n    Clear(ClearIndexArgs),\n    Create(CreateIndexArgs),\n    Update(UpdateIndexArgs),\n    Delete(DeleteIndexArgs),\n    Describe(DescribeIndexArgs),\n    Ingest(IngestDocsArgs),\n    List(ListIndexesArgs),\n    Search(SearchIndexArgs),\n}\n\nimpl IndexCliCommand {\n    pub fn default_log_level(&self) -> Level {\n        match self {\n            Self::Search(_) => Level::ERROR,\n            _ => Level::INFO,\n        }\n    }\n\n    pub fn parse_cli_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let (subcommand, submatches) = matches\n            .remove_subcommand()\n            .context(\"failed to parse index subcommand\")?;\n        match subcommand.as_str() {\n            \"clear\" => Self::parse_clear_args(submatches),\n            \"create\" => Self::parse_create_args(submatches),\n            \"delete\" => Self::parse_delete_args(submatches),\n            \"describe\" => Self::parse_describe_args(submatches),\n            \"ingest\" => Self::parse_ingest_args(submatches),\n            \"list\" => Self::parse_list_args(submatches),\n            \"search\" => Self::parse_search_args(submatches),\n            \"update\" => Self::parse_update_args(submatches),\n            _ => bail!(\"unknown index subcommand `{subcommand}`\"),\n        }\n    }\n\n    fn parse_clear_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg\");\n 
       let assume_yes = matches.get_flag(\"yes\");\n        Ok(Self::Clear(ClearIndexArgs {\n            client_args,\n            index_id,\n            assume_yes,\n        }))\n    }\n\n    fn parse_create_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_config_uri = matches\n            .remove_one::<String>(\"index-config\")\n            .map(|uri| Uri::from_str(&uri))\n            .expect(\"`index-config` should be a required arg\")?;\n        let overwrite = matches.get_flag(\"overwrite\");\n        let assume_yes = matches.get_flag(\"yes\");\n\n        Ok(Self::Create(CreateIndexArgs {\n            client_args,\n            index_config_uri,\n            overwrite,\n            assume_yes,\n        }))\n    }\n\n    fn parse_update_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg\");\n        let index_config_uri = matches\n            .remove_one::<String>(\"index-config\")\n            .map(|uri| Uri::from_str(&uri))\n            .expect(\"`index-config` should be a required arg\")?;\n        let create = matches.get_flag(\"create\");\n        let assume_yes = matches.get_flag(\"yes\");\n\n        Ok(Self::Update(UpdateIndexArgs {\n            index_id,\n            client_args,\n            index_config_uri,\n            create,\n            assume_yes,\n        }))\n    }\n\n    fn parse_describe_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg\");\n\n        Ok(Self::Describe(DescribeIndexArgs {\n            client_args,\n            index_id,\n        }))\n    }\n\n    
fn parse_list_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        Ok(Self::List(ListIndexesArgs { client_args }))\n    }\n\n    fn parse_ingest_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse_for_ingest(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg\");\n        let input_path_opt = if let Some(input_path) = matches.remove_one::<String>(\"input-path\") {\n            Uri::from_str(&input_path)?\n                .filepath()\n                .map(|path| path.to_path_buf())\n        } else {\n            None\n        };\n        let detailed_response: bool = matches.get_flag(\"detailed-response\");\n        let batch_size_limit_opt = matches\n            .remove_one::<String>(\"batch-size-limit\")\n            .map(|limit| limit.parse::<ByteSize>())\n            .transpose()\n            .map_err(|error| anyhow!(error))?;\n        let commit_type = match (matches.get_flag(\"wait\"), matches.get_flag(\"force\")) {\n            (false, false) => CommitType::Auto,\n            (false, true) => CommitType::Force,\n            (true, false) => CommitType::WaitFor,\n            (true, true) => bail!(\"`--wait` and `--force` are mutually exclusive options\"),\n        };\n\n        if commit_type == CommitType::Auto && client_args.commit_timeout.is_some() {\n            bail!(\"`--commit-timeout` can only be used with --wait or --force options\");\n        }\n\n        Ok(Self::Ingest(IngestDocsArgs {\n            client_args,\n            index_id,\n            input_path_opt,\n            batch_size_limit_opt,\n            commit_type,\n            detailed_response,\n        }))\n    }\n\n    fn parse_search_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n     
       .expect(\"`index` should be a required arg\");\n        let query = matches\n            .remove_one::<String>(\"query\")\n            .context(\"`query` should be a required arg\")?;\n        let aggregation = matches.remove_one::<String>(\"aggregation\");\n\n        let max_hits = matches\n            .remove_one::<String>(\"max-hits\")\n            .expect(\"`max-hits` should have a default value.\")\n            .parse()?;\n        let start_offset = matches\n            .remove_one::<String>(\"start-offset\")\n            .expect(\"`start-offset` should have a default value.\")\n            .parse()?;\n        let search_fields = matches\n            .remove_many::<String>(\"search-fields\")\n            .map(|values| values.collect());\n        let snippet_fields = matches\n            .remove_many::<String>(\"snippet-fields\")\n            .map(|values| values.collect());\n        let sort_by_score = matches.get_flag(\"sort-by-score\");\n        let start_timestamp = matches\n            .remove_one::<String>(\"start-timestamp\")\n            .map(|ts| ts.parse())\n            .transpose()?;\n        let end_timestamp = matches\n            .remove_one::<String>(\"end-timestamp\")\n            .map(|ts| ts.parse())\n            .transpose()?;\n        let client_args = ClientArgs::parse(&mut matches)?;\n        Ok(Self::Search(SearchIndexArgs {\n            index_id,\n            query,\n            aggregation,\n            max_hits,\n            start_offset,\n            search_fields,\n            snippet_fields,\n            start_timestamp,\n            end_timestamp,\n            client_args,\n            sort_by_score,\n        }))\n    }\n\n    fn parse_delete_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg\");\n        let dry_run = 
matches.get_flag(\"dry-run\");\n        let assume_yes = matches.get_flag(\"yes\");\n        Ok(Self::Delete(DeleteIndexArgs {\n            index_id,\n            dry_run,\n            client_args,\n            assume_yes,\n        }))\n    }\n\n    pub async fn execute(self) -> anyhow::Result<()> {\n        match self {\n            Self::Clear(args) => clear_index_cli(args).await,\n            Self::Create(args) => create_index_cli(args).await,\n            Self::Delete(args) => delete_index_cli(args).await,\n            Self::Describe(args) => describe_index_cli(args).await,\n            Self::Ingest(args) => ingest_docs_cli(args).await,\n            Self::List(args) => list_index_cli(args).await,\n            Self::Search(args) => search_index_cli(args).await,\n            Self::Update(args) => update_index_cli(args).await,\n        }\n    }\n}\n\npub async fn clear_index_cli(args: ClearIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"clear-index\");\n    if !args.assume_yes {\n        let prompt = format!(\n            \"This operation will delete all the splits of the index `{}` and reset its \\\n             checkpoint. 
Do you want to proceed?\",\n            args.index_id\n        );\n        if !prompt_confirmation(&prompt, false) {\n            return Ok(());\n        }\n    }\n    let qw_client = args.client_args.client();\n    qw_client.indexes().clear(&args.index_id).await?;\n    println!(\"{} Index successfully cleared.\", \"✔\".color(GREEN_COLOR),);\n    Ok(())\n}\n\npub async fn create_index_cli(args: CreateIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"create-index\");\n    println!(\"❯ Creating index...\");\n    let storage_resolver = StorageResolver::unconfigured();\n    let file_content = load_file(&storage_resolver, &args.index_config_uri).await?;\n    let index_config_str: String = std::str::from_utf8(&file_content)\n        .with_context(|| format!(\"Invalid utf8: `{}`\", args.index_config_uri))?\n        .to_string();\n    let config_format = ConfigFormat::sniff_from_uri(&args.index_config_uri)?;\n    let qw_client = args.client_args.client();\n    // TODO: nice to have: check first if the index exists by sending a GET request. If we get\n    // a 404, the index does not exist. If it exists, we can display the prompt.\n    if args.overwrite && !args.assume_yes {\n        // Stop if user answers no.\n        let prompt = \"This operation will overwrite the index and delete all its data. 
Do you \\\n                      want to proceed?\"\n            .to_string();\n        if !prompt_confirmation(&prompt, false) {\n            return Ok(());\n        }\n    }\n    qw_client\n        .indexes()\n        .create(&index_config_str, config_format, args.overwrite)\n        .await?;\n    println!(\"{} Index successfully created.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\npub async fn update_index_cli(args: UpdateIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"update-index\");\n    println!(\"❯ Updating index...\");\n    let storage_resolver = StorageResolver::unconfigured();\n    let file_content = load_file(&storage_resolver, &args.index_config_uri).await?;\n    let index_config_str = std::str::from_utf8(&file_content)\n        .with_context(|| {\n            format!(\n                \"index config file `{}` contains some invalid UTF-8 characters\",\n                args.index_config_uri\n            )\n        })?\n        .to_string();\n    let config_format = ConfigFormat::sniff_from_uri(&args.index_config_uri)?;\n    let qw_client = args.client_args.client();\n    if !args.assume_yes {\n        let prompt = \"This operation will update the index configuration. 
Do you want to proceed?\";\n        if !prompt_confirmation(prompt, false) {\n            return Ok(());\n        }\n    }\n    qw_client\n        .indexes()\n        .update(\n            &args.index_id,\n            &index_config_str,\n            config_format,\n            args.create,\n        )\n        .await?;\n    println!(\"{} Index successfully updated.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\npub async fn list_index_cli(args: ListIndexesArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"list-index\");\n    let qw_client = args.client_args.client();\n    let indexes_metadatas = qw_client.indexes().list().await?;\n    let index_table = make_list_indexes_table(\n        indexes_metadatas\n            .into_iter()\n            .map(IndexMetadata::into_index_config),\n    );\n    println!(\"\\n{index_table}\\n\");\n    Ok(())\n}\n\nfn make_list_indexes_table<I>(indexes: I) -> Table\nwhere I: IntoIterator<Item = IndexConfig> {\n    let rows = indexes\n        .into_iter()\n        .map(|index| IndexRow {\n            index_id: index.index_id,\n            index_uri: index.index_uri,\n        })\n        .sorted_by(|left, right| left.index_id.cmp(&right.index_id));\n    make_table(\"Indexes\", rows, false)\n}\n\n#[derive(Tabled)]\nstruct IndexRow {\n    #[tabled(rename = \"Index ID\")]\n    index_id: IndexId,\n    #[tabled(rename = \"Index URI\")]\n    index_uri: Uri,\n}\n\npub async fn describe_index_cli(args: DescribeIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"describe-index\");\n    let qw_client = args.client_args.client();\n    let index_metadata = qw_client.indexes().get(&args.index_id).await?;\n    let list_splits_query_params = ListSplitsQueryParams::default();\n    let splits = qw_client\n        .splits(&args.index_id)\n        .list(list_splits_query_params)\n        .await?;\n    let index_stats = IndexStats::from_metadata(index_metadata, splits)?;\n    println!(\"{}\", index_stats.display_as_table());\n    
Ok(())\n}\n\npub struct IndexStats {\n    pub index_id: IndexId,\n    pub index_uri: Uri,\n    pub num_published_splits: usize,\n    pub size_published_splits: ByteSize,\n    pub num_published_docs: u64,\n    pub size_published_docs_uncompressed: ByteSize,\n    pub timestamp_field_name: Option<String>,\n    pub timestamp_range: Option<(i64, i64)>,\n    pub num_docs_descriptive: Option<DescriptiveStats>,\n    pub num_bytes_descriptive: Option<DescriptiveStats>,\n}\n\nimpl Tabled for IndexStats {\n    const LENGTH: usize = 9;\n\n    fn fields(&self) -> Vec<Cow<'_, str>> {\n        let num_published_docs = format!(\n            \"{} ({})\",\n            format_to_si_scale(self.num_published_docs),\n            separate_thousands(self.num_published_docs)\n        );\n\n        [\n            self.index_id.to_string(),\n            self.index_uri.to_string(),\n            num_published_docs,\n            self.size_published_docs_uncompressed.to_string(),\n            separate_thousands(self.num_published_splits),\n            self.size_published_splits.to_string(),\n            display_option_in_table(&self.timestamp_field_name),\n            display_timestamp(&self.timestamp_range.map(|(start, _end)| start)),\n            display_timestamp(&self.timestamp_range.map(|(_start, end)| end)),\n        ]\n        .into_iter()\n        .map(|field| field.into())\n        .collect()\n    }\n\n    fn headers() -> Vec<Cow<'static, str>> {\n        [\n            \"Index ID\",\n            \"Index URI\",\n            \"Number of published documents\",\n            \"Size of published documents (uncompressed)\",\n            \"Number of published splits\",\n            \"Size of published splits\",\n            \"Timestamp field\",\n            \"Timestamp range start\",\n            \"Timestamp range end\",\n        ]\n        .into_iter()\n        .map(|header| header.into())\n        .collect()\n    }\n}\n\nfn format_to_si_scale(num: impl numfmt::Numeric) -> String {\n    let 
mut si_scale_formatter = Formatter::new().scales(Scales::metric());\n    si_scale_formatter.fmt2(num).to_string()\n}\n\nfn separate_thousands(num: impl numfmt::Numeric) -> String {\n    let mut thousands_separator_formatter = Formatter::new()\n        .separator(',')\n        // NOTE: .separator(sep) only panics if sep.len_utf8() != 1\n        .expect(\"`,` separator should be valid\")\n        .precision(numfmt::Precision::Significance(3));\n\n    thousands_separator_formatter.fmt2(num).to_string()\n}\n\nfn display_option_in_table(opt: &Option<impl Display>) -> String {\n    match opt {\n        Some(opt_val) => format!(\"\\\"{opt_val}\\\"\"),\n        None => \"Field does not exist for the index.\".to_string(),\n    }\n}\n\nfn display_timestamp(timestamp: &Option<i64>) -> String {\n    match timestamp {\n        Some(timestamp) => {\n            let datetime = chrono::DateTime::from_timestamp_millis(*timestamp * 1000)\n                .map(|datetime| datetime.format(\"%Y-%m-%d %H:%M:%S\").to_string())\n                .unwrap_or_else(|| \"Invalid timestamp!\".to_string());\n            format!(\"{datetime} (Timestamp: {timestamp})\")\n        }\n        _ => \"Timestamp does not exist for the index.\".to_string(),\n    }\n}\n\nimpl IndexStats {\n    pub fn from_metadata(\n        index_metadata: IndexMetadata,\n        splits: Vec<Split>,\n    ) -> anyhow::Result<Self> {\n        let published_splits: Vec<Split> = splits\n            .into_iter()\n            .filter(|split| split.split_state == SplitState::Published)\n            .collect();\n        let splits_num_docs = published_splits\n            .iter()\n            .map(|split| split.split_metadata.num_docs as u64)\n            .sorted()\n            .collect_vec();\n\n        let total_num_docs = splits_num_docs.iter().sum::<u64>();\n\n        let splits_bytes = published_splits\n            .iter()\n            .map(|split| split.split_metadata.footer_offsets.end)\n            .sorted()\n            
.collect_vec();\n        let total_num_bytes = splits_bytes.iter().sum::<u64>();\n        let total_uncompressed_num_bytes = published_splits\n            .iter()\n            .map(|split| split.split_metadata.uncompressed_docs_size_in_bytes)\n            .sum::<u64>();\n\n        let timestamp_range = if index_metadata\n            .index_config()\n            .doc_mapping\n            .timestamp_field\n            .is_some()\n        {\n            let time_min = published_splits\n                .iter()\n                .flat_map(|split| split.split_metadata.time_range.clone())\n                .map(|time_range| *time_range.start())\n                .min();\n            let time_max = published_splits\n                .iter()\n                .flat_map(|split| split.split_metadata.time_range.clone())\n                .map(|time_range| *time_range.end())\n                .max();\n            if let (Some(time_min), Some(time_max)) = (time_min, time_max) {\n                Some((time_min, time_max))\n            } else {\n                None\n            }\n        } else {\n            None\n        };\n\n        let (num_docs_descriptive, num_bytes_descriptive) = if !published_splits.is_empty() {\n            (\n                DescriptiveStats::maybe_new(&splits_num_docs),\n                DescriptiveStats::maybe_new(&splits_bytes),\n            )\n        } else {\n            (None, None)\n        };\n        let index_config = index_metadata.into_index_config();\n\n        Ok(Self {\n            index_id: index_config.index_id.clone(),\n            index_uri: index_config.index_uri.clone(),\n            num_published_splits: published_splits.len(),\n            size_published_splits: ByteSize(total_num_bytes),\n            num_published_docs: total_num_docs,\n            size_published_docs_uncompressed: ByteSize(total_uncompressed_num_bytes),\n            timestamp_field_name: index_config.doc_mapping.timestamp_field,\n            timestamp_range,\n        
    num_docs_descriptive,\n            num_bytes_descriptive,\n        })\n    }\n\n    pub fn display_as_table(&self) -> String {\n        let mut tables = Vec::new();\n        let index_stats_table = create_table(self, \"General Information\", true);\n        tables.push(index_stats_table);\n\n        if let Some(docs_stats) = &self.num_docs_descriptive {\n            let doc_stats_table = docs_stats.into_table(\"Published documents count stats\");\n            tables.push(doc_stats_table);\n        }\n\n        if let Some(size_stats) = &self.num_bytes_descriptive {\n            let size_stats_in_mb = size_stats / 1_000_000.0;\n            let size_stats_table = size_stats_in_mb.into_table(\"Published splits size stats (MB)\");\n            tables.push(size_stats_table);\n        }\n\n        Table::builder(tables.into_iter().map(|table| table.to_string()))\n            .build()\n            .with(Modify::new(Segment::all()).with(Alignment::center_vertical()))\n            .with(Remove::row(FirstRow))\n            .with(Style::empty())\n            .to_string()\n    }\n}\n\nfn create_table(table: impl Tabled, header: &str, is_vertical: bool) -> Table {\n    let mut table = Table::new(vec![table]);\n\n    // Make the field names GREEN :D\n    table.with(Modify::new(Rows::first()).with(Format::content(|column| {\n        column.color(GREEN_COLOR).to_string()\n    })));\n\n    if is_vertical {\n        table.with(Rotate::Left).with(Rotate::Bottom);\n    }\n\n    table\n        .with(Panel::header(header))\n        // Makes the table header bright green and bold.\n        .with(Modify::new(Rows::first()).with(Format::content(|header| {\n            header.bright_green().bold().to_string()\n        })))\n        .with(\n            Modify::new(Segment::all())\n                .with(Alignment::left())\n                .with(Alignment::top()),\n        )\n        .with(Footer::new(\"\\n\"))\n        .with(Style::psql());\n\n    table\n}\n\n#[derive(Debug, Clone, 
Copy)]\npub struct DescriptiveStats {\n    summary_stats: SummaryStats,\n    quantiles: Quantiles,\n}\n\nimpl DescriptiveStats {\n    pub fn into_table(self, header: &str) -> Table {\n        let summary_stats_table = create_table(self.summary_stats, header, true);\n        let quantiles_table = create_table(self.quantiles, \"Quantiles\", false);\n        let mut table =\n            Table::builder([summary_stats_table.to_string(), quantiles_table.to_string()]).build();\n\n        table\n            .with(Style::empty())\n            .with(Remove::row(FirstRow))\n            // We separate tables with a newline already, this is to separate quantile part of the\n            // table further away from the next table.\n            .with(Footer::new(\"\\n\"));\n\n        table\n    }\n}\n\nimpl Div<f32> for &DescriptiveStats {\n    type Output = DescriptiveStats;\n\n    fn div(self, rhs: f32) -> Self::Output {\n        DescriptiveStats {\n            summary_stats: self.summary_stats / rhs,\n            quantiles: self.quantiles / rhs,\n        }\n    }\n}\n\n#[derive(Debug, Clone, Copy)]\npub struct SummaryStats {\n    mean_val: f32,\n    std_val: f32,\n    min_val: u64,\n    max_val: u64,\n}\n\nimpl Div<f32> for SummaryStats {\n    type Output = Self;\n\n    fn div(self, rhs: f32) -> Self::Output {\n        Self {\n            mean_val: self.mean_val / rhs,\n            std_val: self.std_val / rhs,\n            min_val: self.min_val / rhs as u64,\n            max_val: self.max_val / rhs as u64,\n        }\n    }\n}\n\n#[derive(Debug, Clone, Copy)]\npub struct Quantiles {\n    q1: f32,\n    q25: f32,\n    q50: f32,\n    q75: f32,\n    q99: f32,\n}\n\nimpl Div<f32> for Quantiles {\n    type Output = Self;\n\n    fn div(self, rhs: f32) -> Self::Output {\n        Self {\n            q1: self.q1 / rhs,\n            q25: self.q25 / rhs,\n            q50: self.q50 / rhs,\n            q75: self.q75 / rhs,\n            q99: self.q99 / rhs,\n        }\n    }\n}\n\nimpl 
DescriptiveStats {\n    pub fn maybe_new(values: &[u64]) -> Option<DescriptiveStats> {\n        if values.is_empty() {\n            return None;\n        }\n\n        Some(DescriptiveStats {\n            summary_stats: SummaryStats {\n                mean_val: mean(values),\n                std_val: std_deviation(values),\n                min_val: *values.iter().min().expect(\"Values should not be empty.\"),\n                max_val: *values.iter().max().expect(\"Values should not be empty.\"),\n            },\n            quantiles: Quantiles {\n                q1: percentile(values, 1),\n                q25: percentile(values, 25),\n                q50: percentile(values, 50),\n                q75: percentile(values, 75),\n                q99: percentile(values, 99),\n            },\n        })\n    }\n}\n\nimpl Tabled for SummaryStats {\n    const LENGTH: usize = 4;\n\n    fn fields(&self) -> Vec<Cow<'_, str>> {\n        [\n            separate_thousands(self.mean_val),\n            separate_thousands(self.min_val),\n            separate_thousands(self.max_val),\n            separate_thousands(self.std_val),\n        ]\n        .into_iter()\n        .map(|field| field.into())\n        .collect()\n    }\n\n    fn headers() -> Vec<Cow<'static, str>> {\n        [\n            \"Mean\".to_string(),\n            \"Min\".to_string(),\n            \"Max\".to_string(),\n            \"Standard deviation\".to_string(),\n        ]\n        .into_iter()\n        .map(|header| header.into())\n        .collect()\n    }\n}\n\nimpl Tabled for Quantiles {\n    const LENGTH: usize = 5;\n\n    fn fields(&self) -> Vec<Cow<'_, str>> {\n        [\n            separate_thousands(self.q1),\n            separate_thousands(self.q25),\n            separate_thousands(self.q50),\n            separate_thousands(self.q75),\n            separate_thousands(self.q99),\n        ]\n        .into_iter()\n        .map(|field| field.into())\n        .collect()\n    }\n\n    fn headers() -> 
Vec<Cow<'static, str>> {\n        [\n            \"1%\".to_string(),\n            \"25%\".to_string(),\n            \"50%\".to_string(),\n            \"75%\".to_string(),\n            \"99%\".to_string(),\n        ]\n        .into_iter()\n        .map(|header| header.into())\n        .collect()\n    }\n}\n\npub async fn ingest_docs_cli(args: IngestDocsArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"ingest-docs\");\n    let mut rate_estimator = SmaRateEstimator::new(\n        NonZeroUsize::new(8).unwrap(),\n        Duration::from_millis(250),\n        Duration::from_secs(1),\n    );\n    if let Some(input_path) = &args.input_path_opt {\n        println!(\"❯ Ingesting documents from {}.\", input_path.display());\n    } else {\n        println!(\"❯ Ingesting documents from stdin.\");\n    }\n    let progress_bar = match &args.input_path_opt {\n        Some(filepath) => {\n            let file_len = std::fs::metadata(filepath).context(\"file not found\")?.len();\n            ProgressBar::new(file_len)\n        }\n        None => ProgressBar::new_spinner(),\n    };\n    progress_bar.enable_steady_tick(Duration::from_millis(100));\n    progress_bar.set_style(progress_bar_style());\n    progress_bar.set_message(\"0MiB/s\");\n    // It is not used by the rate estimator anyway.\n    let useless_start_time = Instant::now();\n    let mut update_progress_bar = |ingest_event: IngestEvent| {\n        match ingest_event {\n            IngestEvent::IngestedDocBatch(num_bytes) => {\n                rate_estimator.update(useless_start_time, Instant::now(), num_bytes as u64);\n                progress_bar.inc(num_bytes as u64)\n            }\n            IngestEvent::Sleep => {} // To\n        };\n        let throughput = rate_estimator.work() as f64 / (1024 * 1024) as f64;\n        progress_bar.set_message(format!(\"{throughput:.1} MiB/s\"));\n    };\n\n    let mut qw_client_builder = args.client_args.client_builder();\n    if args.detailed_response {\n        
qw_client_builder = qw_client_builder.detailed_response(args.detailed_response);\n    }\n    let qw_client = qw_client_builder.build();\n    let ingest_source = match args.input_path_opt {\n        Some(filepath) => IngestSource::File(filepath),\n        None => IngestSource::Stdin,\n    };\n    let batch_size_limit_opt = args\n        .batch_size_limit_opt\n        .map(|batch_size_limit| batch_size_limit.as_u64() as usize);\n    let response = qw_client\n        .ingest(\n            &args.index_id,\n            ingest_source,\n            batch_size_limit_opt,\n            Some(&mut update_progress_bar),\n            args.commit_type,\n        )\n        .await?;\n    progress_bar.finish();\n    println!(\n        \"{} Ingested {} document(s) successfully.\",\n        \"✔\".color(GREEN_COLOR),\n        response\n            .num_ingested_docs\n            // TODO(#5604) remove unwrap\n            .unwrap_or(response.num_docs_for_processing),\n    );\n    if let Some(rejected) = response.num_rejected_docs\n        && rejected > 0\n    {\n        println!(\n            \"{} Rejected {} document(s).\",\n            \"✖\".color(RED_COLOR),\n            rejected\n        );\n    }\n    if let Some(parse_failures) = response.parse_failures {\n        if !parse_failures.is_empty() {\n            println!(\"Detailed parse failures:\");\n        }\n        for (idx, failure) in parse_failures.iter().enumerate() {\n            let reason_value = serde_json::to_value(failure.reason).unwrap();\n            println!();\n            println!(\"┌ error {}\", idx + 1);\n            println!(\"├ reason: {}\", reason_value.as_str().unwrap());\n            println!(\"├ message: {}\", failure.message);\n            println!(\"└ document: {}\", failure.document);\n        }\n    }\n    Ok(())\n}\n\nfn progress_bar_style() -> ProgressStyle {\n    ProgressStyle::with_template(\n        \"{spinner:.blue} [{elapsed_precise}] {bytes}/{total_bytes} ({msg})\",\n    )\n    
.expect(\"Progress style should always be valid.\")\n    .tick_strings(&[\"⠋\", \"⠙\", \"⠹\", \"⠸\", \"⠼\", \"⠴\", \"⠦\", \"⠧\", \"⠇\", \"⠏\"])\n}\n\npub async fn search_index(args: SearchIndexArgs) -> anyhow::Result<SearchResponseRestClient> {\n    let aggs: Option<serde_json::Value> = args\n        .aggregation\n        .map(|aggs_string| {\n            serde_json::from_str(&aggs_string).context(\"failed to deserialize aggregations\")\n        })\n        .transpose()?;\n    let sort_fields = if args.sort_by_score {\n        vec![SortField {\n            field_name: \"_score\".to_string(),\n            sort_order: SortOrder::Desc as i32,\n            sort_datetime_format: None,\n        }]\n    } else {\n        Vec::new()\n    };\n    let sort_by = SortBy { sort_fields };\n    let search_request = SearchRequestQueryString {\n        query: args.query,\n        aggs,\n        search_fields: args.search_fields.clone(),\n        snippet_fields: args.snippet_fields.clone(),\n        start_timestamp: args.start_timestamp,\n        end_timestamp: args.end_timestamp,\n        max_hits: args.max_hits as u64,\n        start_offset: args.start_offset as u64,\n        sort_by,\n        count_all: CountHits::CountAll,\n        ..Default::default()\n    };\n    let qw_client = args.client_args.client();\n    let search_response = qw_client.search(&args.index_id, search_request).await?;\n    Ok(search_response)\n}\n\npub async fn search_index_cli(args: SearchIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"search-index\");\n    let search_response_rest = search_index(args).await?;\n    let search_response_json = serde_json::to_string_pretty(&search_response_rest)?;\n    println!(\"{search_response_json}\");\n    Ok(())\n}\n\npub async fn delete_index_cli(args: DeleteIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"delete-index\");\n    if !args.dry_run && !args.assume_yes {\n        let prompt = \"This operation will delete the index. 
Do you want to proceed?\".to_string();\n        if !prompt_confirmation(&prompt, false) {\n            return Ok(());\n        }\n    }\n\n    println!(\"❯ Deleting index...\");\n    let qw_client = args.client_args.client();\n    let affected_files = qw_client\n        .indexes()\n        .delete(&args.index_id, args.dry_run)\n        .await?;\n\n    if args.dry_run {\n        if affected_files.is_empty() {\n            println!(\"Only the index will be deleted since it does not contain any data files.\");\n            return Ok(());\n        }\n        println!(\n            \"The following files will be removed from the index `{}`\",\n            args.index_id\n        );\n        for split_info in affected_files {\n            println!(\" - {}\", split_info.file_name.display());\n        }\n        return Ok(());\n    }\n    println!(\"{} Index successfully deleted.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\n#[cfg(test)]\nmod test {\n\n    use std::ops::RangeInclusive;\n\n    use quickwit_metastore::SplitMetadata;\n\n    use super::*;\n\n    pub fn split_metadata_for_test(\n        split_id: &str,\n        num_docs: usize,\n        time_range: RangeInclusive<i64>,\n        size: u64,\n    ) -> SplitMetadata {\n        let mut split_metadata = SplitMetadata::for_test(split_id.to_string());\n        split_metadata.num_docs = num_docs;\n        split_metadata.time_range = Some(time_range);\n        split_metadata.footer_offsets = (size - 10)..size;\n        split_metadata\n    }\n\n    #[test]\n    fn test_index_stats() -> anyhow::Result<()> {\n        let index_id = \"index-stats-env\".to_string();\n        let split_id_1 = \"test_split_id_1\".to_string();\n        let split_id_2 = \"test_split_id_2\".to_string();\n        let index_uri = \"s3://some-test-bucket\";\n\n        let index_metadata = IndexMetadata::for_test(&index_id, index_uri);\n        let mut split_metadata_1 =\n            split_metadata_for_test(&split_id_1, 100_000, 1111..=2222, 
15_000_000);\n        split_metadata_1.uncompressed_docs_size_in_bytes = 19_000_000;\n        let mut split_metadata_2 =\n            split_metadata_for_test(&split_id_2, 100_000, 1000..=3000, 30_000_000);\n        split_metadata_2.uncompressed_docs_size_in_bytes = 36_000_000;\n\n        let split_data_1 = Split {\n            split_metadata: split_metadata_1,\n            split_state: SplitState::Published,\n            update_timestamp: 0,\n            publish_timestamp: Some(10),\n        };\n        let split_data_2 = Split {\n            split_metadata: split_metadata_2,\n            split_state: SplitState::MarkedForDeletion,\n            update_timestamp: 0,\n            publish_timestamp: Some(10),\n        };\n\n        let index_stats =\n            IndexStats::from_metadata(index_metadata, vec![split_data_1, split_data_2])?;\n\n        assert_eq!(index_stats.index_id, index_id);\n        assert_eq!(index_stats.index_uri.as_str(), index_uri);\n        assert_eq!(index_stats.num_published_splits, 1);\n        assert_eq!(index_stats.size_published_splits, ByteSize::mb(15));\n        assert_eq!(index_stats.num_published_docs, 100_000);\n        assert_eq!(\n            index_stats.size_published_docs_uncompressed,\n            ByteSize::mb(19)\n        );\n        assert_eq!(\n            index_stats.timestamp_field_name,\n            Some(\"timestamp\".to_string())\n        );\n        assert_eq!(index_stats.timestamp_range, Some((1111, 2222)));\n\n        Ok(())\n    }\n\n    #[test]\n    fn test_descriptive_stats() -> anyhow::Result<()> {\n        let split_id = \"stat-test-split\".to_string();\n        let template_split = Split {\n            split_state: SplitState::Published,\n            update_timestamp: 10,\n            publish_timestamp: Some(10),\n            split_metadata: SplitMetadata::default(),\n        };\n\n        let split_metadata_1 = split_metadata_for_test(&split_id, 70_000, 10..=12, 60_000_000);\n        let split_metadata_2 = 
split_metadata_for_test(&split_id, 120_000, 11..=15, 145_000_000);\n        let split_metadata_3 = split_metadata_for_test(&split_id, 90_000, 15..=22, 115_000_000);\n        let split_metadata_4 = split_metadata_for_test(&split_id, 40_000, 22..=22, 55_000_000);\n\n        let mut split_1 = template_split.clone();\n        split_1.split_metadata = split_metadata_1;\n        let mut split_2 = template_split.clone();\n        split_2.split_metadata = split_metadata_2;\n        let mut split_3 = template_split.clone();\n        split_3.split_metadata = split_metadata_3;\n        let mut split_4 = template_split;\n        split_4.split_metadata = split_metadata_4;\n\n        let splits = [split_1, split_2, split_3, split_4];\n\n        let splits_num_docs = splits\n            .iter()\n            .map(|split| split.split_metadata.num_docs as u64)\n            .sorted()\n            .collect_vec();\n\n        let splits_bytes = splits\n            .iter()\n            .map(|split| split.split_metadata.footer_offsets.end)\n            .sorted()\n            .collect_vec();\n\n        let num_docs_descriptive = DescriptiveStats::maybe_new(&splits_num_docs);\n        let num_bytes_descriptive = DescriptiveStats::maybe_new(&splits_bytes);\n\n        assert!(num_docs_descriptive.is_some());\n        assert!(num_bytes_descriptive.is_some());\n\n        let num_docs_descriptive = num_docs_descriptive.unwrap();\n        let num_bytes_descriptive = num_bytes_descriptive.unwrap();\n\n        assert_eq!(num_docs_descriptive.quantiles.q1, 40900.0);\n        assert_eq!(num_docs_descriptive.quantiles.q25, 62500.0);\n        assert_eq!(num_docs_descriptive.quantiles.q50, 80000.0);\n        assert_eq!(num_docs_descriptive.quantiles.q75, 97500.0);\n        assert_eq!(num_docs_descriptive.quantiles.q99, 119100.0);\n\n        assert_eq!(num_bytes_descriptive.quantiles.q1, 55150000.0);\n        assert_eq!(num_bytes_descriptive.quantiles.q25, 58750000.0);\n        
assert_eq!(num_bytes_descriptive.quantiles.q50, 87500000.0);\n        assert_eq!(num_bytes_descriptive.quantiles.q75, 122500000.0);\n        assert_eq!(num_bytes_descriptive.quantiles.q99, 144100000.0);\n\n        let descriptive_stats_none = DescriptiveStats::maybe_new(&[]);\n        assert!(descriptive_stats_none.is_none());\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/jemalloc.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse quickwit_common::metrics::MEMORY_METRICS;\nuse tikv_jemallocator::Jemalloc;\nuse tracing::error;\n\n#[cfg(feature = \"jemalloc-profiled\")]\n#[global_allocator]\npub static GLOBAL: quickwit_common::jemalloc_profiled::JemallocProfiled =\n    quickwit_common::jemalloc_profiled::JemallocProfiled(Jemalloc);\n\n#[cfg(not(feature = \"jemalloc-profiled\"))]\n#[global_allocator]\npub static GLOBAL: Jemalloc = Jemalloc;\n\nconst JEMALLOC_METRICS_POLLING_INTERVAL: Duration = Duration::from_secs(1);\n\npub async fn jemalloc_metrics_loop() -> tikv_jemalloc_ctl::Result<()> {\n    let memory_metrics = MEMORY_METRICS.clone();\n\n    // Obtain a MIB for the `epoch`, `stats.active`, `stats.allocated`, and `stats.resident` keys:\n    let epoch_mib = tikv_jemalloc_ctl::epoch::mib()?;\n    let active_mib = tikv_jemalloc_ctl::stats::active::mib()?;\n    let allocated_mib = tikv_jemalloc_ctl::stats::allocated::mib()?;\n    let resident_mib = tikv_jemalloc_ctl::stats::resident::mib()?;\n\n    let mut poll_interval = tokio::time::interval(JEMALLOC_METRICS_POLLING_INTERVAL);\n\n    loop {\n        poll_interval.tick().await;\n\n        // Many statistics are cached and only updated when the epoch is advanced:\n        epoch_mib.advance()?;\n\n        // Read statistics using MIB keys:\n        let active = active_mib.read()?;\n        
memory_metrics.active_bytes.set(active as i64);\n\n        let allocated = allocated_mib.read()?;\n        memory_metrics.allocated_bytes.set(allocated as i64);\n\n        let resident = resident_mib.read()?;\n        memory_metrics.resident_bytes.set(resident as i64);\n    }\n}\n\npub fn start_jemalloc_metrics_loop() {\n    tokio::task::spawn(async {\n        if let Err(error) = jemalloc_metrics_loop().await {\n            error!(%error, \"failed to collect metrics from jemalloc\");\n        }\n    });\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nuse std::collections::HashSet;\nuse std::str::FromStr;\nuse std::sync::OnceLock;\n\nuse anyhow::Context;\nuse clap::{Arg, ArgMatches, arg};\nuse dialoguer::Confirm;\nuse dialoguer::theme::ColorfulTheme;\nuse quickwit_common::runtimes::RuntimesConfig;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_config::{\n    ConfigFormat, DEFAULT_QW_CONFIG_PATH, MetastoreConfigs, NodeConfig, SourceConfig,\n    StorageConfigs,\n};\nuse quickwit_indexing::check_source_connectivity;\nuse quickwit_metastore::{IndexMetadataResponseExt, MetastoreResolver};\nuse quickwit_proto::metastore::{IndexMetadataRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_rest_client::models::Timeout;\nuse quickwit_rest_client::rest_client::{DEFAULT_BASE_URL, QuickwitClient, QuickwitClientBuilder};\nuse quickwit_storage::{StorageResolver, load_file};\nuse reqwest::Url;\nuse tabled::settings::object::Rows;\nuse tabled::settings::panel::Header;\nuse tabled::settings::{Alignment, Modify, Style};\nuse tabled::{Table, Tabled};\nuse tracing::info;\n\nuse crate::checklist::run_checklist;\n\npub mod checklist;\npub mod cli;\npub mod index;\n#[cfg(feature = \"jemalloc\")]\npub mod jemalloc;\npub mod logger;\npub mod metrics;\npub mod service;\npub mod source;\npub mod split;\npub mod 
stats;\npub mod tool;\n\n/// Throughput calculation window size.\nconst THROUGHPUT_WINDOW_SIZE: usize = 5;\n\npub const QW_ENABLE_TOKIO_CONSOLE_ENV_KEY: &str = \"QW_ENABLE_TOKIO_CONSOLE\";\n\npub const QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER_ENV_KEY: &str =\n    \"QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER\";\n\nfn config_cli_arg() -> Arg {\n    Arg::new(\"config\")\n        .long(\"config\")\n        .help(\"Config file location\")\n        .env(\"QW_CONFIG\")\n        .default_value(DEFAULT_QW_CONFIG_PATH)\n        .global(true)\n        .display_order(1)\n}\n\nfn client_args() -> Vec<Arg> {\n    vec![\n        arg!(--\"endpoint\" <QW_CLUSTER_ENDPOINT> \"Quickwit cluster endpoint.\")\n            .default_value(\"http://127.0.0.1:7280\")\n            .env(\"QW_CLUSTER_ENDPOINT\")\n            .required(false)\n            .display_order(1)\n            .global(true),\n        Arg::new(\"timeout\")\n            .long(\"timeout\")\n            .help(\"Duration of the timeout.\")\n            .required(false)\n            .global(true)\n            .display_order(2),\n        Arg::new(\"connect-timeout\")\n            .long(\"connect-timeout\")\n            .help(\"Duration of the connect timeout.\")\n            .required(false)\n            .global(true)\n            .display_order(3),\n        Arg::new(\"retries\")\n            .long(\"retries\")\n            .help(\n                \"Maximum number of retries for transient errors. Default value is 0. 
The total \\\n                 number of attempts will be `1 + RETRIES`.\",\n            )\n            .required(false)\n            .global(true)\n            .default_value(\"0\")\n            .display_order(4),\n    ]\n}\n\npub fn install_default_crypto_ring_provider() {\n    static CALL_ONLY_ONCE: OnceLock<Result<(), ()>> = OnceLock::new();\n    CALL_ONLY_ONCE\n        .get_or_init(|| {\n            rustls::crypto::ring::default_provider()\n                .install_default()\n                .map_err(|_| ())\n        })\n        .expect(\"rustls crypto ring default provider installation should not fail\");\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ClientArgs {\n    pub cluster_endpoint: Url,\n    pub connect_timeout: Option<Timeout>,\n    pub timeout: Option<Timeout>,\n    pub commit_timeout: Option<Timeout>,\n    pub num_retries: u32,\n}\n\nimpl Default for ClientArgs {\n    fn default() -> Self {\n        Self {\n            cluster_endpoint: Url::parse(DEFAULT_BASE_URL).unwrap(),\n            connect_timeout: None,\n            timeout: None,\n            commit_timeout: None,\n            num_retries: 0,\n        }\n    }\n}\n\nimpl ClientArgs {\n    pub fn client_builder(self) -> QuickwitClientBuilder {\n        let mut builder = QuickwitClientBuilder::new(self.cluster_endpoint);\n        if let Some(connect_timeout) = self.connect_timeout {\n            builder = builder.connect_timeout(connect_timeout);\n        }\n        if let Some(timeout) = self.timeout {\n            builder = builder.timeout(timeout);\n            builder = builder.search_timeout(timeout);\n            builder = builder.ingest_timeout(timeout);\n        }\n        if let Some(commit_timeout) = self.commit_timeout {\n            builder = builder.commit_timeout(commit_timeout);\n        }\n        builder.num_retries(self.num_retries)\n    }\n\n    pub fn client(self) -> QuickwitClient {\n        self.client_builder().build()\n    }\n\n    pub fn parse_for_ingest(matches: 
&mut ArgMatches) -> anyhow::Result<Self> {\n        Self::parse_inner(matches, true)\n    }\n\n    pub fn parse(matches: &mut ArgMatches) -> anyhow::Result<Self> {\n        Self::parse_inner(matches, false)\n    }\n\n    fn parse_inner(matches: &mut ArgMatches, process_ingest: bool) -> anyhow::Result<Self> {\n        let cluster_endpoint = matches\n            .remove_one::<String>(\"endpoint\")\n            .map(|endpoint_str| Url::from_str(&endpoint_str))\n            .expect(\"`endpoint` should be a required arg\")?;\n        let connect_timeout =\n            if let Some(duration) = matches.remove_one::<String>(\"connect-timeout\") {\n                Some(parse_duration_or_none(&duration)?)\n            } else {\n                None\n            };\n        let timeout = if let Some(duration) = matches.remove_one::<String>(\"timeout\") {\n            Some(parse_duration_or_none(&duration)?)\n        } else {\n            None\n        };\n        let commit_timeout = if process_ingest {\n            if let Some(duration) = matches.remove_one::<String>(\"commit-timeout\") {\n                Some(parse_duration_or_none(&duration)?)\n            } else {\n                None\n            }\n        } else {\n            None\n        };\n        let num_retries = matches\n            .remove_one::<String>(\"retries\")\n            .map(|retries| retries.parse::<u32>())\n            .expect(\"`retries` should have a default value\")?;\n        Ok(Self {\n            cluster_endpoint,\n            connect_timeout,\n            timeout,\n            commit_timeout,\n            num_retries,\n        })\n    }\n}\n\npub fn parse_duration_or_none(duration_with_unit_str: &str) -> anyhow::Result<Timeout> {\n    if duration_with_unit_str == \"none\" {\n        Ok(Timeout::none())\n    } else {\n        humantime::parse_duration(duration_with_unit_str)\n            .map(Timeout::new)\n            .context(\"failed to parse timeout\")\n    }\n}\n\npub fn 
start_actor_runtimes(\n    runtimes_config: RuntimesConfig,\n    services: &HashSet<QuickwitService>,\n) -> anyhow::Result<()> {\n    if services.contains(&QuickwitService::Indexer)\n        || services.contains(&QuickwitService::Janitor)\n        || services.contains(&QuickwitService::ControlPlane)\n    {\n        quickwit_common::runtimes::initialize_runtimes(runtimes_config)\n            .context(\"failed to start actor runtimes\")?;\n    }\n    Ok(())\n}\n\n/// Loads a node config located at `config_uri` with the default storage configuration.\nasync fn load_node_config(config_uri: &Uri) -> anyhow::Result<NodeConfig> {\n    let config_content = load_file(&StorageResolver::unconfigured(), config_uri)\n        .await\n        .context(\"failed to load node config\")?;\n    let config_format = ConfigFormat::sniff_from_uri(config_uri)?;\n    let config = NodeConfig::load(config_format, config_content.as_slice())\n        .await\n        .with_context(|| format!(\"failed to parse node config `{config_uri}`\"))?;\n    info!(config_uri=%config_uri, config=?config, \"loaded node config\");\n    Ok(config)\n}\n\nfn get_resolvers(\n    storage_configs: &StorageConfigs,\n    metastore_configs: &MetastoreConfigs,\n) -> (StorageResolver, MetastoreResolver) {\n    // The CLI tests rely on the unconfigured singleton resolvers, so it's better to return them if\n    // the storage and metastore configs are not set.\n    if storage_configs.is_empty() && metastore_configs.is_empty() {\n        return (\n            StorageResolver::unconfigured(),\n            MetastoreResolver::unconfigured(),\n        );\n    }\n    let storage_resolver = StorageResolver::configured(storage_configs);\n    let metastore_resolver =\n        MetastoreResolver::configured(storage_resolver.clone(), metastore_configs);\n    (storage_resolver, metastore_resolver)\n}\n\n/// Runs connectivity checks for a given `metastore_uri` and `index_id`.\n/// Optionally, it takes a `SourceConfig` that will be 
checked instead\n/// of the index's sources.\npub async fn run_index_checklist(\n    metastore: &mut MetastoreServiceClient,\n    storage_resolver: &StorageResolver,\n    index_id: &str,\n    source_config_opt: Option<&SourceConfig>,\n) -> anyhow::Result<()> {\n    let mut checks: Vec<(&str, anyhow::Result<()>)> = Vec::new();\n    for metastore_endpoint in metastore.endpoints() {\n        // If it's not a database, the metastore is file-backed. To display a nicer message to the\n        // user, we check the metastore storage connectivity before the metastore connectivity\n        // check, which checks the storage anyway.\n        if !metastore_endpoint.protocol().is_database() {\n            let metastore_storage = storage_resolver.resolve(&metastore_endpoint).await?;\n            checks.push((\n                \"metastore storage\",\n                metastore_storage.check_connectivity().await,\n            ));\n        }\n    }\n    checks.push((\"metastore\", metastore.check_connectivity().await));\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await?\n        .deserialize_index_metadata()?;\n    let index_storage = storage_resolver.resolve(index_metadata.index_uri()).await?;\n    checks.push((\"index storage\", index_storage.check_connectivity().await));\n\n    if let Some(source_config) = source_config_opt {\n        checks.push((\n            source_config.source_id.as_str(),\n            check_source_connectivity(storage_resolver, source_config).await,\n        ));\n    } else {\n        for source_config in index_metadata.sources.values() {\n            checks.push((\n                source_config.source_id.as_str(),\n                check_source_connectivity(storage_resolver, source_config).await,\n            ));\n        }\n    }\n    run_checklist(checks)?;\n    Ok(())\n}\n\n/// Constructs a table for display.\npub fn make_table<T: Tabled>(\n    header: &str,\n    
rows: impl IntoIterator<Item = T>,\n    transpose: bool,\n) -> Table {\n    let mut table = if transpose {\n        let index_builder = Table::builder(rows).index();\n        index_builder.column(0).transpose().build()\n    } else {\n        Table::builder(rows).build()\n    };\n\n    table\n        .with(Modify::new(Rows::new(1..)).with(Alignment::left()))\n        .with(Style::ascii())\n        .with(Header::new(header))\n        .with(Modify::new(Rows::new(0..1)).with(Alignment::center()));\n\n    table\n}\n\n/// Prompts user for confirmation.\nfn prompt_confirmation(prompt: &str, default: bool) -> bool {\n    if Confirm::with_theme(&ColorfulTheme::default())\n        .with_prompt(prompt)\n        .default(default)\n        .interact()\n        .unwrap()\n    {\n        true\n    } else {\n        println!(\"Aborting.\");\n        false\n    }\n}\n\npub mod busy_detector {\n    use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};\n    use std::time::Instant;\n\n    use once_cell::sync::Lazy;\n    use tracing::debug;\n\n    use crate::metrics::CLI_METRICS;\n\n    // we need that time reference to use an atomic and not a mutex for LAST_UNPARK\n    static TIME_REF: Lazy<Instant> = Lazy::new(Instant::now);\n    static ENABLED: AtomicBool = AtomicBool::new(false);\n\n    const ALLOWED_DELAY_MICROS: u64 = 5000;\n    const DEBUG_SUPPRESSION_MICROS: u64 = 30_000_000;\n\n    // LAST_UNPARK_TIMESTAMP and NEXT_DEBUG_TIMESTAMP are semantically micro-second\n    // precision timestamps, but we use atomics to allow accessing them without locks.\n    thread_local!(static LAST_UNPARK_TIMESTAMP: AtomicU64 = const { AtomicU64::new(0) });\n    static NEXT_DEBUG_TIMESTAMP: AtomicU64 = AtomicU64::new(0);\n    static SUPPRESSED_DEBUG_COUNT: AtomicU64 = AtomicU64::new(0);\n\n    pub fn set_enabled(enabled: bool) {\n        ENABLED.store(enabled, Ordering::Relaxed);\n    }\n\n    pub fn thread_unpark() {\n        LAST_UNPARK_TIMESTAMP.with(|time| {\n            let now = 
Instant::now()\n                .checked_duration_since(*TIME_REF)\n                .unwrap_or_default();\n            time.store(now.as_micros() as u64, Ordering::Relaxed);\n        })\n    }\n\n    pub fn thread_park() {\n        if !ENABLED.load(Ordering::Relaxed) {\n            return;\n        }\n\n        LAST_UNPARK_TIMESTAMP.with(|time| {\n            let now = Instant::now()\n                .checked_duration_since(*TIME_REF)\n                .unwrap_or_default();\n            let now = now.as_micros() as u64;\n            let delta = now - time.load(Ordering::Relaxed);\n            CLI_METRICS\n                .thread_unpark_duration_microseconds\n                .with_label_values([])\n                .observe(delta as f64);\n            if delta > ALLOWED_DELAY_MICROS {\n                emit_debug(delta, now);\n            }\n        })\n    }\n\n    fn emit_debug(delta: u64, now: u64) {\n        if NEXT_DEBUG_TIMESTAMP\n            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |next_debug| {\n                if next_debug < now {\n                    Some(now + DEBUG_SUPPRESSION_MICROS)\n                } else {\n                    None\n                }\n            })\n            .is_err()\n        {\n            // a debug was emitted recently, don't emit log for this one\n            SUPPRESSED_DEBUG_COUNT.fetch_add(1, Ordering::Relaxed);\n            return;\n        }\n\n        let suppressed = SUPPRESSED_DEBUG_COUNT.swap(0, Ordering::Relaxed);\n        if suppressed == 0 {\n            debug!(\"thread wasn't parked for {delta}µs, is the runtime too busy?\");\n        } else {\n            debug!(\n                \"thread wasn't parked for {delta}µs, is the runtime too busy? 
({suppressed} \\\n                 similar messages suppressed)\"\n            );\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_config::{S3StorageConfig, StorageConfigs};\n    use quickwit_rest_client::models::Timeout;\n\n    use super::*;\n    use crate::parse_duration_or_none;\n\n    #[test]\n    fn test_parse_duration_or_none() -> anyhow::Result<()> {\n        assert_eq!(parse_duration_or_none(\"1s\")?, Timeout::from_secs(1));\n        assert_eq!(parse_duration_or_none(\"2m\")?, Timeout::from_mins(2));\n        assert_eq!(parse_duration_or_none(\"3h\")?, Timeout::from_hours(3));\n        assert_eq!(parse_duration_or_none(\"4d\")?, Timeout::from_days(4));\n        assert_eq!(parse_duration_or_none(\"none\")?, Timeout::none());\n        assert!(parse_duration_or_none(\"something\").is_err());\n        Ok(())\n    }\n\n    #[test]\n    fn test_get_resolvers() {\n        let s3_storage_config = S3StorageConfig {\n            force_path_style_access: true,\n            ..Default::default()\n        };\n        let storage_configs = StorageConfigs::new(vec![s3_storage_config.into()]);\n        let metastore_configs = MetastoreConfigs::default();\n        let (_storage_resolver, _metastore_resolver) =\n            get_resolvers(&storage_configs, &metastore_configs);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/logger.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\nuse std::{env, fmt};\n\nuse anyhow::Context;\nuse opentelemetry::trace::TracerProvider;\nuse opentelemetry::{KeyValue, global};\nuse opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;\nuse opentelemetry_sdk::logs::SdkLoggerProvider;\nuse opentelemetry_sdk::propagation::TraceContextPropagator;\nuse opentelemetry_sdk::trace::{BatchConfigBuilder, SdkTracerProvider};\nuse opentelemetry_sdk::{Resource, trace};\nuse quickwit_common::{get_bool_from_env, get_from_env_opt};\nuse quickwit_serve::{BuildInfo, EnvFilterReloadFn};\nuse time::format_description::BorrowedFormatItem;\nuse tracing::{Event, Level, Subscriber};\nuse tracing_subscriber::EnvFilter;\nuse tracing_subscriber::field::RecordFields;\nuse tracing_subscriber::fmt::FmtContext;\nuse tracing_subscriber::fmt::format::{\n    DefaultFields, Format, FormatEvent, FormatFields, Full, Json, JsonFields, Writer,\n};\nuse tracing_subscriber::fmt::time::UtcTime;\nuse tracing_subscriber::layer::SubscriberExt;\nuse tracing_subscriber::prelude::*;\nuse tracing_subscriber::registry::LookupSpan;\n\nuse crate::QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER_ENV_KEY;\n#[cfg(feature = \"tokio-console\")]\nuse crate::QW_ENABLE_TOKIO_CONSOLE_ENV_KEY;\n\n/// Load the default logging filter from the environment. 
The filter can later\n/// be updated using the result callback of [setup_logging_and_tracing].\nfn startup_env_filter(level: Level) -> anyhow::Result<EnvFilter> {\n    let env_filter = env::var(\"RUST_LOG\")\n        .map(|_| EnvFilter::from_default_env())\n        .or_else(|_| EnvFilter::try_new(format!(\"quickwit={level},tantivy=WARN\")))\n        .context(\"failed to set up tracing env filter\")?;\n    Ok(env_filter)\n}\n\ntype ReloadLayer = tracing_subscriber::reload::Layer<EnvFilter, tracing_subscriber::Registry>;\n\npub fn setup_logging_and_tracing(\n    level: Level,\n    ansi_colors: bool,\n    build_info: &BuildInfo,\n) -> anyhow::Result<(\n    EnvFilterReloadFn,\n    Option<(SdkTracerProvider, SdkLoggerProvider)>,\n)> {\n    #[cfg(feature = \"tokio-console\")]\n    {\n        if get_bool_from_env(QW_ENABLE_TOKIO_CONSOLE_ENV_KEY, false) {\n            console_subscriber::init();\n            return Ok((quickwit_serve::do_nothing_env_filter_reload_fn(), None));\n        }\n    }\n    global::set_text_map_propagator(TraceContextPropagator::new());\n\n    let event_format = EventFormat::get_from_env();\n    let fmt_fields = event_format.format_fields();\n    let registry = tracing_subscriber::registry();\n\n    let (reloadable_env_filter, reload_handle) = ReloadLayer::new(startup_env_filter(level)?);\n\n    #[cfg(not(feature = \"jemalloc-profiled\"))]\n    let registry = registry.with(reloadable_env_filter).with(\n        tracing_subscriber::fmt::layer()\n            .event_format(event_format)\n            .fmt_fields(fmt_fields)\n            .with_ansi(ansi_colors),\n    );\n\n    #[cfg(feature = \"jemalloc-profiled\")]\n    let registry = jemalloc_profiled::configure_registry(\n        registry,\n        event_format,\n        fmt_fields,\n        ansi_colors,\n        level,\n        reloadable_env_filter,\n    )?;\n\n    // Note on disabling ANSI characters: setting the ansi boolean on event format is insufficient.\n    // It is thus set on layers, see 
https://github.com/tokio-rs/tracing/issues/1817\n    let provider_opt = if get_bool_from_env(QW_ENABLE_OPENTELEMETRY_OTLP_EXPORTER_ENV_KEY, false) {\n        let span_exporter = opentelemetry_otlp::SpanExporter::builder()\n            .with_tonic()\n            .build()\n            .context(\"failed to initialize OpenTelemetry OTLP exporter\")?;\n        let span_processor = trace::BatchSpanProcessor::builder(span_exporter)\n            .with_batch_config(\n                BatchConfigBuilder::default()\n                    // Quickwit can generate a lot of spans, especially in debug mode, and the\n                    // default queue size of 2048 is too small.\n                    .with_max_queue_size(32_768)\n                    .build(),\n            )\n            .build();\n\n        let resource = Resource::builder()\n            .with_service_name(\"quickwit\")\n            .with_attribute(KeyValue::new(\"service.version\", build_info.version.clone()))\n            .build();\n\n        let logs_exporter = opentelemetry_otlp::LogExporter::builder()\n            .with_tonic()\n            .build()\n            .context(\"failed to initialize OpenTelemetry OTLP logs\")?;\n\n        let logger_provider = SdkLoggerProvider::builder()\n            .with_resource(resource.clone())\n            .with_batch_exporter(logs_exporter)\n            .build();\n\n        let tracing_provider = opentelemetry_sdk::trace::SdkTracerProvider::builder()\n            .with_span_processor(span_processor)\n            .with_resource(resource)\n            .build();\n\n        let tracer = tracing_provider.tracer(\"quickwit\");\n        let telemetry_layer = tracing_opentelemetry::layer().with_tracer(tracer);\n\n        // Bridge between tracing logs and otel tracing events\n        let logs_otel_layer = OpenTelemetryTracingBridge::new(&logger_provider);\n\n        registry\n            .with(telemetry_layer)\n            .with(logs_otel_layer)\n            .try_init()\n            
.context(\"failed to register tracing subscriber\")?;\n        Some((tracing_provider, logger_provider))\n    } else {\n        registry\n            .try_init()\n            .context(\"failed to register tracing subscriber\")?;\n        None\n    };\n\n    Ok((\n        Arc::new(move |env_filter_def: &str| {\n            let new_env_filter = EnvFilter::try_new(env_filter_def)?;\n            reload_handle.reload(new_env_filter)?;\n            Ok(())\n        }),\n        provider_opt,\n    ))\n}\n\n/// We do not rely on the RFC3339 implementation, because it has a nanosecond precision.\n/// See discussion here: https://github.com/time-rs/time/discussions/418\nfn time_formatter() -> UtcTime<Vec<BorrowedFormatItem<'static>>> {\n    let time_format = time::format_description::parse(\n        \"[year]-[month]-[day]T[hour]:[minute]:[second].[subsecond digits:3]Z\",\n    )\n    .expect(\"time format description should be valid\");\n    UtcTime::new(time_format)\n}\n\nenum EventFormat<'a> {\n    Full(Format<Full, UtcTime<Vec<BorrowedFormatItem<'a>>>>),\n    Json(Format<Json>),\n}\n\nimpl EventFormat<'_> {\n    /// Gets the log format from the environment variable `QW_LOG_FORMAT`. 
Returns a JSON\n    /// formatter if the variable is set to `json`, otherwise returns a full formatter.\n    fn get_from_env() -> Self {\n        if get_from_env_opt::<String>(\"QW_LOG_FORMAT\", false)\n            .map(|log_format| log_format.eq_ignore_ascii_case(\"json\"))\n            .unwrap_or(false)\n        {\n            let json_format = tracing_subscriber::fmt::format().json();\n            EventFormat::Json(json_format)\n        } else {\n            let full_format = tracing_subscriber::fmt::format()\n                .with_target(true)\n                .with_timer(time_formatter());\n\n            EventFormat::Full(full_format)\n        }\n    }\n\n    fn format_fields(&self) -> FieldFormat {\n        match self {\n            EventFormat::Full(_) => FieldFormat::Default(DefaultFields::new()),\n            EventFormat::Json(_) => FieldFormat::Json(JsonFields::new()),\n        }\n    }\n}\n\nimpl<S, N> FormatEvent<S, N> for EventFormat<'_>\nwhere\n    S: Subscriber + for<'a> LookupSpan<'a>,\n    N: for<'a> FormatFields<'a> + 'static,\n{\n    fn format_event(\n        &self,\n        ctx: &FmtContext<'_, S, N>,\n        writer: Writer<'_>,\n        event: &Event<'_>,\n    ) -> fmt::Result {\n        match self {\n            EventFormat::Full(format) => format.format_event(ctx, writer, event),\n            EventFormat::Json(format) => format.format_event(ctx, writer, event),\n        }\n    }\n}\n\nenum FieldFormat {\n    Default(DefaultFields),\n    Json(JsonFields),\n}\n\nimpl FormatFields<'_> for FieldFormat {\n    fn format_fields<R: RecordFields>(&self, writer: Writer<'_>, fields: R) -> fmt::Result {\n        match self {\n            FieldFormat::Default(default_fields) => default_fields.format_fields(writer, fields),\n            FieldFormat::Json(json_fields) => json_fields.format_fields(writer, fields),\n        }\n    }\n}\n\n/// Logger configurations specific to the jemalloc profiler.\n///\n/// A custom event formatter is used to print the 
backtrace of the\n/// profiling events.\n#[cfg(feature = \"jemalloc-profiled\")]\npub(super) mod jemalloc_profiled {\n    use std::fmt;\n\n    use quickwit_common::jemalloc_profiled::JEMALLOC_PROFILER_TARGET;\n    use time::format_description::BorrowedFormatItem;\n    use tracing::{Event, Level, Metadata, Subscriber};\n    use tracing_subscriber::Layer;\n    use tracing_subscriber::filter::filter_fn;\n    use tracing_subscriber::fmt::format::{DefaultFields, Writer};\n    use tracing_subscriber::fmt::time::{FormatTime, UtcTime};\n    use tracing_subscriber::fmt::{FmtContext, FormatEvent, FormatFields, FormattedFields};\n    use tracing_subscriber::layer::SubscriberExt;\n    use tracing_subscriber::registry::LookupSpan;\n\n    use super::{EventFormat, FieldFormat, startup_env_filter, time_formatter};\n    use crate::logger::ReloadLayer;\n\n    /// An event formatter specific to the memory profiler output.\n    ///\n    /// Also displays a backtrace after the spans and fields of the tracing\n    /// event (into separate lines).\n    struct ProfilingFormat {\n        time_formatter: UtcTime<Vec<BorrowedFormatItem<'static>>>,\n    }\n\n    impl Default for ProfilingFormat {\n        fn default() -> Self {\n            Self {\n                time_formatter: time_formatter(),\n            }\n        }\n    }\n\n    impl<S, N> FormatEvent<S, N> for ProfilingFormat\n    where\n        S: Subscriber + for<'a> LookupSpan<'a>,\n        N: for<'a> FormatFields<'a> + 'static,\n    {\n        fn format_event(\n            &self,\n            ctx: &FmtContext<'_, S, N>,\n            mut writer: Writer<'_>,\n            event: &Event<'_>,\n        ) -> fmt::Result {\n            self.time_formatter.format_time(&mut writer)?;\n            write!(writer, \" {JEMALLOC_PROFILER_TARGET} \")?;\n            if let Some(scope) = ctx.event_scope() {\n                let mut seen = false;\n\n                for span in scope.from_root() {\n                    write!(writer, \"{}\", 
span.metadata().name())?;\n                    seen = true;\n\n                    let ext = span.extensions();\n                    if let Some(fields) = &ext.get::<FormattedFields<N>>()\n                        && !fields.is_empty()\n                    {\n                        write!(writer, \"{{{fields}}}:\")?;\n                    }\n                }\n\n                if seen {\n                    writer.write_char(' ')?;\n                }\n            };\n\n            ctx.format_fields(writer.by_ref(), event)?;\n            writeln!(writer)?;\n\n            // Print a backtrace to help identify the callsite\n            backtrace::trace(|frame| {\n                backtrace::resolve_frame(frame, |symbol| {\n                    if let Some(symbol_name) = symbol.name() {\n                        let _ = writeln!(writer, \"{symbol_name}\");\n                    } else {\n                        let _ = writeln!(writer, \"symb failed\");\n                    }\n                });\n                true\n            });\n            Ok(())\n        }\n    }\n\n    fn profiler_tracing_filter(metadata: &Metadata) -> bool {\n        metadata.is_span() || (metadata.is_event() && metadata.target() == JEMALLOC_PROFILER_TARGET)\n    }\n\n    /// Configures the regular logging layer and a specific layer that gathers\n    /// extra debug information for the jemalloc profiler.\n    ///\n    /// The jemalloc profiler formatter disables the env filter reloading\n    /// because the [tracing_subscriber::reload::Layer] seems to overwrite the\n    /// filter configured by [profiler_tracing_filter()] even though it is\n    /// applied to a separate layer.\n    pub(super) fn configure_registry<S>(\n        registry: S,\n        event_format: EventFormat<'static>,\n        fmt_fields: FieldFormat,\n        ansi_colors: bool,\n        level: Level,\n        _reloadable_env_filter: ReloadLayer,\n    ) -> anyhow::Result<impl Subscriber + for<'span> LookupSpan<'span>>\n    
where\n        S: Subscriber + for<'span> LookupSpan<'span>,\n    {\n        Ok(registry\n            .with(\n                tracing_subscriber::fmt::layer()\n                    .event_format(ProfilingFormat::default())\n                    .fmt_fields(DefaultFields::new())\n                    .with_ansi(ansi_colors)\n                    .with_filter(filter_fn(profiler_tracing_filter)),\n            )\n            .with(\n                tracing_subscriber::fmt::layer()\n                    .event_format(event_format)\n                    .fmt_fields(fmt_fields)\n                    .with_ansi(ansi_colors)\n                    .with_filter(startup_env_filter(level)?),\n            ))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/main.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![recursion_limit = \"256\"]\n\nuse std::collections::BTreeMap;\n\nuse anyhow::Context;\nuse colored::Colorize;\nuse quickwit_cli::checklist::RED_COLOR;\nuse quickwit_cli::cli::{CliCommand, build_cli};\n#[cfg(feature = \"jemalloc\")]\nuse quickwit_cli::jemalloc::start_jemalloc_metrics_loop;\nuse quickwit_cli::logger::setup_logging_and_tracing;\nuse quickwit_cli::{busy_detector, install_default_crypto_ring_provider};\nuse quickwit_common::runtimes::scrape_tokio_runtime_metrics;\nuse quickwit_serve::BuildInfo;\nuse tracing::error;\n\n/// The main Tokio runtime uses `ceil(num_cores / 3)` worker threads by default, which can be\n/// overridden with the `QW_TOKIO_RUNTIME_NUM_THREADS` environment variable.\nfn get_main_runtime_num_threads() -> usize {\n    let default_num_runtime_threads: usize = quickwit_common::num_cpus().div_ceil(3);\n    quickwit_common::get_from_env(\n        \"QW_TOKIO_RUNTIME_NUM_THREADS\",\n        default_num_runtime_threads,\n        false,\n    )\n}\n\nfn main() -> anyhow::Result<()> {\n    let main_runtime_num_threads: usize = get_main_runtime_num_threads();\n    let rt = tokio::runtime::Builder::new_multi_thread()\n        .enable_all()\n        .on_thread_unpark(busy_detector::thread_unpark)\n        .on_thread_park(busy_detector::thread_park)\n        .thread_name(\"main_runtime_thread\")\n        .worker_threads(main_runtime_num_threads)\n    
    .build()\n        .context(\"failed to start main Tokio runtime\")?;\n\n    scrape_tokio_runtime_metrics(rt.handle(), \"main\");\n\n    rt.block_on(main_impl())\n}\n\nfn register_build_info_metric() {\n    use itertools::Itertools;\n    let build_info = BuildInfo::get();\n    let mut build_kvs = BTreeMap::default();\n    build_kvs.insert(\"build_date\", build_info.build_date.to_string());\n    build_kvs.insert(\"commit_hash\", build_info.commit_short_hash.to_string());\n    build_kvs.insert(\"version\", build_info.version.to_string());\n    if !build_info.commit_tags.is_empty() {\n        let tags_str = build_info.commit_tags.iter().join(\",\");\n        build_kvs.insert(\"commit_tags\", tags_str);\n    }\n    build_kvs.insert(\"target\", build_info.build_target.to_string());\n    quickwit_common::metrics::register_info(\"build_info\", \"Quickwit's build info\", build_kvs);\n}\n\nasync fn main_impl() -> anyhow::Result<()> {\n    #[cfg(feature = \"openssl-support\")]\n    unsafe {\n        openssl_probe::init_openssl_env_vars()\n    };\n    register_build_info_metric();\n\n    let about_text = about_text();\n    let version_text = BuildInfo::get_version_text();\n\n    let app = build_cli().about(about_text).version(version_text);\n    let matches = app.get_matches();\n    let ansi_colors = !matches.get_flag(\"no-color\");\n\n    let command = match CliCommand::parse_cli_args(matches) {\n        Ok(command) => command,\n        Err(error) => {\n            eprintln!(\"failed to parse command line arguments: {error:?}\");\n            std::process::exit(1);\n        }\n    };\n\n    install_default_crypto_ring_provider();\n\n    #[cfg(feature = \"jemalloc\")]\n    start_jemalloc_metrics_loop();\n\n    let build_info = BuildInfo::get();\n    let (env_filter_reload_fn, tracer_provider_opt) =\n        setup_logging_and_tracing(command.default_log_level(), ansi_colors, build_info)?;\n\n    let return_code: i32 = if let Err(command_error) = 
command.execute(env_filter_reload_fn).await {\n        error!(error=%command_error, \"command failed\");\n        eprintln!(\n            \"{} command failed: {:?}\\n\",\n            \"✘\".color(RED_COLOR),\n            command_error\n        );\n        1\n    } else {\n        0\n    };\n\n    if let Some((trace_provider, logs_provider)) = tracer_provider_opt {\n        trace_provider\n            .shutdown()\n            .context(\"failed to shutdown OpenTelemetry tracer provider\")?;\n        logs_provider\n            .shutdown()\n            .context(\"failed to shutdown OpenTelemetry logs provider\")?;\n    }\n\n    std::process::exit(return_code)\n}\n\n/// Return the about text with telemetry info.\nfn about_text() -> String {\n    let mut about_text = String::from(\n        \"Sub-second search & analytics engine on cloud storage.\\n  Find more information at https://quickwit.io/docs\\n\\n\",\n    );\n    if !quickwit_telemetry::is_telemetry_disabled() {\n        about_text += \"Telemetry: enabled\";\n    }\n    about_text\n}\n\n#[cfg(test)]\nmod tests {\n    use std::str::FromStr;\n    use std::time::Duration;\n\n    use bytesize::ByteSize;\n    use quickwit_cli::ClientArgs;\n    use quickwit_cli::cli::{CliCommand, build_cli};\n    use quickwit_cli::index::{\n        ClearIndexArgs, CreateIndexArgs, DeleteIndexArgs, DescribeIndexArgs, IndexCliCommand,\n        IngestDocsArgs, SearchIndexArgs,\n    };\n    use quickwit_cli::split::{DescribeSplitArgs, SplitCliCommand};\n    use quickwit_cli::tool::{\n        ExtractSplitArgs, GarbageCollectIndexArgs, LocalIngestDocsArgs, LocalSearchArgs, MergeArgs,\n        ToolCliCommand,\n    };\n    use quickwit_common::uri::Uri;\n    use quickwit_config::SourceInputFormat;\n    use quickwit_rest_client::models::Timeout;\n    use quickwit_rest_client::rest_client::CommitType;\n    use reqwest::Url;\n\n    #[test]\n    fn test_parse_clear_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = 
app\n            .try_get_matches_from([\"index\", \"clear\", \"--index\", \"wikipedia\"])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_cmd = CliCommand::Index(IndexCliCommand::Clear(ClearIndexArgs {\n            client_args: ClientArgs::default(),\n            index_id: \"wikipedia\".to_string(),\n            assume_yes: false,\n        }));\n        assert_eq!(command, expected_cmd);\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\"index\", \"clear\", \"--index\", \"wikipedia\", \"--yes\"])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_cmd = CliCommand::Index(IndexCliCommand::Clear(ClearIndexArgs {\n            client_args: ClientArgs::default(),\n            index_id: \"wikipedia\".to_string(),\n            assume_yes: true,\n        }));\n        assert_eq!(command, expected_cmd);\n    }\n\n    #[test]\n    fn test_parse_create_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let _ = app\n            .try_get_matches_from([\"new\", \"--index-uri\", \"file:///indexes/wikipedia\"])\n            .unwrap_err();\n\n        let app = build_cli().no_binary_name(true);\n        let matches =\n            app.try_get_matches_from([\"index\", \"create\", \"--index-config\", \"index-conf.yaml\"])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        let expected_index_config_uri = Uri::from_str(&format!(\n            \"file://{}/index-conf.yaml\",\n            std::env::current_dir().unwrap().display()\n        ))\n        .unwrap();\n        let expected_cmd = CliCommand::Index(IndexCliCommand::Create(CreateIndexArgs {\n            client_args: ClientArgs::default(),\n            index_config_uri: expected_index_config_uri.clone(),\n            overwrite: false,\n            assume_yes: false,\n        
}));\n        assert_eq!(command, expected_cmd);\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"create\",\n            \"--index-config\",\n            \"index-conf.yaml\",\n            \"--overwrite\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        let expected_cmd = CliCommand::Index(IndexCliCommand::Create(CreateIndexArgs {\n            client_args: ClientArgs::default(),\n            index_config_uri: expected_index_config_uri,\n            overwrite: true,\n            assume_yes: false,\n        }));\n        assert_eq!(command, expected_cmd);\n\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_ingest_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"ingest\",\n            \"--index\",\n            \"wikipedia\",\n            \"--endpoint\",\n            \"http://127.0.0.1:8000\",\n            \"--retries\",\n            \"2\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Ingest(\n                IngestDocsArgs {\n                    client_args,\n                    index_id,\n                    input_path_opt: None,\n                    batch_size_limit_opt: None,\n                    commit_type: CommitType::Auto,\n                    detailed_response: false,\n                })) if &index_id == \"wikipedia\"\n                && client_args.timeout.is_none()\n                && client_args.connect_timeout.is_none()\n                && client_args.commit_timeout.is_none()\n                && client_args.cluster_endpoint == Url::from_str(\"http://127.0.0.1:8000\").unwrap()\n                && client_args.num_retries == 2\n        ));\n\n        let app = 
build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"ingest\",\n            \"--index\",\n            \"wikipedia\",\n            \"--detailed-response\",\n            \"--batch-size-limit\",\n            \"8MB\",\n            \"--force\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Ingest(\n                IngestDocsArgs {\n                    client_args,\n                    index_id,\n                    input_path_opt: None,\n                    batch_size_limit_opt: Some(batch_size_limit),\n                    commit_type: CommitType::Force,\n                    detailed_response: true,\n                })) if &index_id == \"wikipedia\"\n                        && client_args.cluster_endpoint == Url::from_str(\"http://127.0.0.1:7280\").unwrap()\n                        && client_args.timeout.is_none()\n                        && client_args.connect_timeout.is_none()\n                        && client_args.commit_timeout.is_none()\n                        && client_args.num_retries == 0\n                        && batch_size_limit == ByteSize::mb(8)\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"ingest\",\n            \"--index\",\n            \"wikipedia\",\n            \"--batch-size-limit\",\n            \"4KB\",\n            \"--wait\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Ingest(\n                IngestDocsArgs {\n                    client_args,\n                    index_id,\n                    input_path_opt: None,\n                    batch_size_limit_opt: Some(batch_size_limit),\n                    commit_type: 
CommitType::WaitFor,\n                    detailed_response: false,\n                })) if &index_id == \"wikipedia\"\n                    && client_args.cluster_endpoint == Url::from_str(\"http://127.0.0.1:7280\").unwrap()\n                    && client_args.timeout.is_none()\n                    && client_args.connect_timeout.is_none()\n                    && client_args.commit_timeout.is_none()\n                    && client_args.num_retries == 0\n                    && batch_size_limit == ByteSize::kb(4)\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"ingest\",\n            \"--index\",\n            \"wikipedia\",\n            \"--timeout\",\n            \"10s\",\n            \"--connect-timeout\",\n            \"2s\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Ingest(\n                IngestDocsArgs {\n                    client_args,\n                    index_id,\n                    input_path_opt: None,\n                    batch_size_limit_opt: None,\n                    commit_type: CommitType::Auto,\n                    detailed_response: false,\n                })) if &index_id == \"wikipedia\"\n                        && client_args.cluster_endpoint == Url::from_str(\"http://127.0.0.1:7280\").unwrap()\n                        && client_args.timeout == Some(Timeout::from_secs(10))\n                        && client_args.connect_timeout == Some(Timeout::from_secs(2))\n                        && client_args.commit_timeout.is_none()\n                        && client_args.num_retries == 0\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"ingest\",\n            \"--index\",\n            \"wikipedia\",\n            
\"--timeout\",\n            \"none\",\n            \"--wait\",\n            \"--connect-timeout\",\n            \"15s\",\n            \"--commit-timeout\",\n            \"4h\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Ingest(\n                IngestDocsArgs {\n                    client_args,\n                    index_id,\n                    input_path_opt: None,\n                    batch_size_limit_opt: None,\n                    commit_type: CommitType::WaitFor,\n                    detailed_response: false,\n                })) if &index_id == \"wikipedia\"\n                        && client_args.cluster_endpoint == Url::from_str(\"http://127.0.0.1:7280\").unwrap()\n                        && client_args.timeout == Some(Timeout::none())\n                        && client_args.connect_timeout == Some(Timeout::from_secs(15))\n                        && client_args.commit_timeout == Some(Timeout::from_hours(4))\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        assert_eq!(\n            app.try_get_matches_from([\n                \"index\",\n                \"ingest\",\n                \"--index\",\n                \"wikipedia\",\n                \"--wait\",\n                \"--force\",\n            ])\n            .unwrap_err()\n            .kind(),\n            clap::error::ErrorKind::ArgumentConflict\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_local_ingest_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\n                \"tool\",\n                \"local-ingest\",\n                \"--index\",\n                \"wikipedia\",\n                \"--config\",\n                \"/config.yaml\",\n                \"--overwrite\",\n                \"--keep-cache\",\n                \"--input-format\",\n          
      \"plain\",\n                \"--transform-script\",\n                \".message = downcase(string!(.message))\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::LocalIngest(\n                LocalIngestDocsArgs {\n                    config_uri,\n                    index_id,\n                    input_path_opt: None,\n                    input_format,\n                    overwrite,\n                    vrl_script: Some(vrl_script),\n                    clear_cache,\n                })) if &index_id == \"wikipedia\"\n                       && config_uri == Uri::from_str(\"file:///config.yaml\").unwrap()\n                       && vrl_script == \".message = downcase(string!(.message))\"\n                       && overwrite\n                       && !clear_cache\n                       && input_format == SourceInputFormat::PlainText,\n        ));\n    }\n\n    #[test]\n    fn test_parse_search_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"search\",\n            \"--index\",\n            \"wikipedia\",\n            \"--query\",\n            \"Barack Obama\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Search(SearchIndexArgs {\n                index_id,\n                query,\n                max_hits: 20,\n                start_offset: 0,\n                search_fields: None,\n                snippet_fields: None,\n                start_timestamp: None,\n                end_timestamp: None,\n                aggregation: None,\n                ..\n            })) if &index_id == \"wikipedia\" && &query == \"Barack Obama\"\n        ));\n\n        let app = 
build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"index\",\n            \"search\",\n            \"--index\",\n            \"wikipedia\",\n            \"--query\",\n            \"Barack Obama\",\n            \"--max-hits\",\n            \"50\",\n            \"--start-offset\",\n            \"100\",\n            \"--start-timestamp\",\n            \"0\",\n            \"--end-timestamp\",\n            \"1\",\n            \"--search-fields\",\n            \"title\",\n            \"url\",\n            \"--snippet-fields\",\n            \"body\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Search(SearchIndexArgs {\n                client_args: _,\n                index_id,\n                query,\n                aggregation: None,\n                max_hits: 50,\n                start_offset: 100,\n                search_fields: Some(search_field_names),\n                snippet_fields: Some(snippet_field_names),\n                start_timestamp: Some(0),\n                end_timestamp: Some(1),\n                sort_by_score: false,\n            })) if &index_id == \"wikipedia\"\n                  && query == \"Barack Obama\"\n                  && search_field_names == vec![\"title\".to_string(), \"url\".to_string()]\n                  && snippet_field_names == vec![\"body\".to_string()]\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_local_search_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\n                \"tool\",\n                \"local-search\",\n                \"--index\",\n                \"wikipedia\",\n                \"--query\",\n                \"Barack Obama\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        
assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::LocalSearch(LocalSearchArgs {\n                index_id,\n                query,\n                max_hits: 20,\n                start_offset: 0,\n                search_fields: None,\n                snippet_fields: None,\n                start_timestamp: None,\n                end_timestamp: None,\n                aggregation: None,\n                ..\n            })) if &index_id == \"wikipedia\" && &query == \"Barack Obama\"\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\n                \"tool\",\n                \"local-search\",\n                \"--index\",\n                \"wikipedia\",\n                \"--query\",\n                \"Barack Obama\",\n                \"--max-hits\",\n                \"50\",\n                \"--start-offset\",\n                \"100\",\n                \"--start-timestamp\",\n                \"0\",\n                \"--end-timestamp\",\n                \"1\",\n                \"--search-fields\",\n                \"title\",\n                \"url\",\n                \"--snippet-fields\",\n                \"body\",\n                \"--sort-by-field=-score\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::LocalSearch(LocalSearchArgs {\n                config_uri: _,\n                index_id,\n                query,\n                aggregation: None,\n                max_hits: 50,\n                start_offset: 100,\n                search_fields: Some(search_field_names),\n                snippet_fields: Some(snippet_field_names),\n                start_timestamp: Some(0),\n                end_timestamp: Some(1),\n                sort_by_field: Some(sort_by_field),\n            })) if &index_id == 
\"wikipedia\"\n                  && query == \"Barack Obama\"\n                  && search_field_names == vec![\"title\".to_string(), \"url\".to_string()]\n                  && snippet_field_names == vec![\"body\".to_string()]\n                  && sort_by_field == \"-score\"\n        ));\n    }\n\n    #[test]\n    fn test_parse_delete_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\"index\", \"delete\", \"--index\", \"wikipedia\"])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Delete(DeleteIndexArgs {\n                index_id,\n                dry_run: false,\n                ..\n            })) if &index_id == \"wikipedia\"\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\"index\", \"delete\", \"--index\", \"wikipedia\", \"--dry-run\"])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Delete(DeleteIndexArgs {\n                index_id,\n                dry_run: true,\n                ..\n            })) if &index_id == \"wikipedia\"\n        ));\n    }\n\n    #[test]\n    fn test_parse_describe_index_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from([\"index\", \"describe\", \"--index\", \"wikipedia\"])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Index(IndexCliCommand::Describe(DescribeIndexArgs {\n                index_id,\n                ..\n            })) if &index_id == \"wikipedia\"\n        ));\n    }\n\n    #[test]\n    fn 
test_parse_split_describe_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"split\",\n            \"describe\",\n            \"--index\",\n            \"wikipedia\",\n            \"--split\",\n            \"ABC\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Split(SplitCliCommand::Describe(DescribeSplitArgs {\n                index_id,\n                split_id,\n                verbose: false,\n                ..\n            })) if &index_id == \"wikipedia\" && &split_id == \"ABC\"\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_split_extract_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"tool\",\n            \"extract-split\",\n            \"--index\",\n            \"wikipedia\",\n            \"--split\",\n            \"ABC\",\n            \"--target-dir\",\n            \"datadir\",\n            \"--config\",\n            \"/config.yaml\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::ExtractSplit(ExtractSplitArgs {\n                index_id,\n                split_id,\n                target_dir,\n                ..\n            })) if &index_id == \"wikipedia\" && &split_id == \"ABC\" && target_dir == *\"datadir\"\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_garbage_collect_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"tool\",\n            \"gc\",\n            \"--index\",\n            \"wikipedia\",\n            \"--config\",\n            \"/config.yaml\",\n        ])?;\n        let command 
= CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::GarbageCollect(GarbageCollectIndexArgs {\n                index_id,\n                grace_period,\n                dry_run: false,\n                ..\n            })) if &index_id == \"wikipedia\" && grace_period == Duration::from_secs(60 * 60)\n        ));\n\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"tool\",\n            \"gc\",\n            \"--index\",\n            \"wikipedia\",\n            \"--grace-period\",\n            \"5m\",\n            \"--config\",\n            \"/config.yaml\",\n            \"--dry-run\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        let expected_config_uri = Uri::from_str(\"file:///config.yaml\").unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::GarbageCollect(GarbageCollectIndexArgs {\n                index_id,\n                grace_period,\n                config_uri,\n                dry_run: true,\n            })) if &index_id == \"wikipedia\" && grace_period == Duration::from_secs(5 * 60) && config_uri == expected_config_uri\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_merge_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from([\n            \"tool\",\n            \"merge\",\n            \"--index\",\n            \"wikipedia\",\n            \"--source\",\n            \"ingest-source\",\n            \"--config\",\n            \"/config.yaml\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Tool(ToolCliCommand::Merge(MergeArgs {\n                index_id,\n                source_id,\n                ..\n            })) if 
&index_id == \"wikipedia\" && source_id == \"ingest-source\"\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_no_color() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        // as this is only a test, and it would be extremely inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        let previous_no_color_res = std::env::var(\"NO_COLOR\");\n        {\n            unsafe { std::env::set_var(\"NO_COLOR\", \"whatever_interpreted_as_true\") };\n            let app = build_cli().no_binary_name(true);\n            let matches = app.try_get_matches_from([\"run\"]).unwrap();\n            let no_color = matches.get_flag(\"no-color\");\n            assert!(no_color);\n        }\n        {\n            // empty string is false.\n            unsafe { std::env::set_var(\"NO_COLOR\", \"\") };\n            let app = build_cli().no_binary_name(true);\n            let matches = app.try_get_matches_from([\"run\"]).unwrap();\n            let no_color = matches.get_flag(\"no-color\");\n            assert!(!no_color);\n        }\n        {\n            // the explicit `--no-color` flag enables no-color even when `NO_COLOR` is empty.\n            let app = build_cli().no_binary_name(true);\n            let matches = app.try_get_matches_from([\"run\", \"--no-color\"]).unwrap();\n            let no_color = matches.get_flag(\"no-color\");\n            assert!(no_color);\n        }\n        if let Ok(previous_no_color) = previous_no_color_res {\n            unsafe { std::env::set_var(\"NO_COLOR\", previous_no_color) };\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{HistogramVec, new_histogram_vec};\n\npub struct CliMetrics {\n    pub thread_unpark_duration_microseconds: HistogramVec<0>,\n}\n\nimpl Default for CliMetrics {\n    fn default() -> Self {\n        CliMetrics {\n            thread_unpark_duration_microseconds: new_histogram_vec(\n                \"thread_unpark_duration_microseconds\",\n                \"Duration for which a thread of the main tokio runtime is unparked.\",\n                \"cli\",\n                &[],\n                [],\n                quickwit_common::metrics::exponential_buckets(5.0, 5.0, 5).unwrap(),\n            ),\n        }\n    }\n}\n\n/// Serve counters exposes a bunch a set of metrics about the request received to quickwit.\npub static CLI_METRICS: Lazy<CliMetrics> = Lazy::new(CliMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::pin::pin;\nuse std::str::FromStr;\n\nuse clap::{ArgAction, ArgMatches, Command, arg};\nuse colored::Colorize;\nuse futures::future::select;\nuse itertools::Itertools;\nuse quickwit_common::runtimes::RuntimesConfig;\nuse quickwit_common::uri::{Protocol, Uri};\nuse quickwit_config::NodeConfig;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_serve::tcp_listener::DefaultTcpListenerResolver;\nuse quickwit_serve::{BuildInfo, EnvFilterReloadFn, serve_quickwit};\nuse quickwit_telemetry::payload::{QuickwitFeature, QuickwitTelemetryInfo, TelemetryEvent};\nuse tokio::signal;\nuse tracing::{debug, info};\n\nuse crate::checklist::{BLUE_COLOR, RED_COLOR};\nuse crate::{config_cli_arg, get_resolvers, load_node_config, start_actor_runtimes};\n\npub fn build_run_command() -> Command {\n    Command::new(\"run\")\n        .about(\"Starts a Quickwit node.\")\n        .long_about(\"Starts a Quickwit node with all services enabled by default: `indexer`, `searcher`, `metastore`, `control-plane`, and `janitor`.\")\n        .arg(config_cli_arg())\n        .args(&[\n            arg!(--\"service\" <SERVICE> \"Services (`indexer`, `searcher`, `metastore`, `control-plane`, or `janitor`) to run. 
If unspecified, all the supported services are started.\")\n                .action(ArgAction::Append)\n                .required(false),\n        ])\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct RunCliCommand {\n    pub config_uri: Uri,\n    pub services: Option<HashSet<QuickwitService>>,\n}\n\nasync fn listen_interrupt() {\n    async fn ctrl_c() {\n        signal::ctrl_c()\n            .await\n            .expect(\"registering a signal handler for SIGINT should not fail\");\n        // carriage return to hide the ^C echo from the terminal\n        print!(\"\\r\");\n    }\n    ctrl_c().await;\n    println!(\n        \"{} Graceful shutdown initiated. Waiting for ingested data to be indexed. This may take a \\\n         few minutes. Press Ctrl+C again to force shutdown.\",\n        \"❢\".color(BLUE_COLOR)\n    );\n    tokio::spawn(async {\n        ctrl_c().await;\n        println!(\n            \"{} Quickwit was forcefully shut down. Some data might not have been indexed.\",\n            \"✘\".color(RED_COLOR)\n        );\n        std::process::exit(1);\n    });\n}\n\nasync fn listen_sigterm() {\n    signal::unix::signal(signal::unix::SignalKind::terminate())\n        .expect(\"registering a signal handler for SIGTERM should not fail\")\n        .recv()\n        .await;\n    info!(\"SIGTERM received\");\n}\n\nimpl RunCliCommand {\n    pub fn parse_cli_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let config_uri = matches\n            .remove_one::<String>(\"config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`config` should be a required arg.\")?;\n        let services = matches\n            .remove_many::<String>(\"service\")\n            .map(|values| {\n                let services: Result<HashSet<_>, _> = values\n                    .into_iter()\n                    .map(|service_str| QuickwitService::from_str(&service_str))\n                    .collect();\n                services\n            })\n       
     .transpose()?;\n        Ok(RunCliCommand {\n            config_uri,\n            services,\n        })\n    }\n\n    pub async fn execute(&self, env_filter_reload_fn: EnvFilterReloadFn) -> anyhow::Result<()> {\n        debug!(args = ?self, \"run-service\");\n        let version_text = BuildInfo::get_version_text();\n        info!(\"quickwit version: {version_text}\");\n        let mut node_config = load_node_config(&self.config_uri).await?;\n        let (storage_resolver, metastore_resolver) =\n            get_resolvers(&node_config.storage_configs, &node_config.metastore_configs);\n        crate::busy_detector::set_enabled(true);\n\n        if let Some(services) = &self.services {\n            info!(services = %services.iter().join(\", \"), \"setting services from override\");\n            node_config.enabled_services.clone_from(services);\n        }\n        let telemetry_handle_opt =\n            quickwit_telemetry::start_telemetry_loop(quickwit_telemetry_info(&node_config));\n        quickwit_telemetry::send_telemetry_event(TelemetryEvent::RunCommand).await;\n        // TODO move in serve quickwit?\n        let runtimes_config = RuntimesConfig::default();\n        start_actor_runtimes(runtimes_config, &node_config.enabled_services)?;\n        let shutdown_signal = Box::pin(async {\n            select(pin!(listen_interrupt()), pin!(listen_sigterm())).await;\n        });\n        let serve_result = serve_quickwit(\n            node_config,\n            runtimes_config,\n            metastore_resolver,\n            storage_resolver,\n            DefaultTcpListenerResolver,\n            shutdown_signal,\n            env_filter_reload_fn,\n        )\n        .await;\n        let return_code = match serve_result {\n            Ok(_) => 0,\n            Err(_) => 1,\n        };\n        quickwit_telemetry::send_telemetry_event(TelemetryEvent::EndCommand { return_code }).await;\n        if let Some(telemetry_handle) = telemetry_handle_opt {\n            
telemetry_handle.terminate_telemetry().await;\n        }\n        serve_result?;\n        info!(\"quickwit successfully terminated\");\n        Ok(())\n    }\n}\n\nfn quickwit_telemetry_info(config: &NodeConfig) -> QuickwitTelemetryInfo {\n    let mut features = HashSet::new();\n    if config.indexer_config.enable_otlp_endpoint {\n        features.insert(QuickwitFeature::Otlp);\n    }\n    if config.jaeger_config.enable_endpoint {\n        features.insert(QuickwitFeature::Jaeger);\n    }\n    // The metastore URI is only relevant if the metastore is enabled.\n    if config.is_service_enabled(QuickwitService::Metastore) {\n        let feature = if config.metastore_uri.protocol() == Protocol::PostgreSQL {\n            QuickwitFeature::PostgresqMetastore\n        } else {\n            QuickwitFeature::FileBackedMetastore\n        };\n        features.insert(feature);\n    }\n    let services = config\n        .enabled_services\n        .iter()\n        .map(|service| service.to_string())\n        .collect();\n    QuickwitTelemetryInfo::new(services, features)\n}\n\n#[cfg(test)]\nmod tests {\n\n    use super::*;\n    use crate::cli::{CliCommand, build_cli};\n\n    #[test]\n    fn test_parse_service_run_args_all_services() -> anyhow::Result<()> {\n        let command = build_cli().no_binary_name(true);\n        let matches = command.try_get_matches_from(vec![\"run\", \"--config\", \"/config.yaml\"])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        let expected_config_uri = Uri::from_str(\"file:///config.yaml\").unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Run(RunCliCommand {\n                config_uri,\n                services,\n                ..\n            })\n            if config_uri == expected_config_uri && services.is_none()\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_service_run_args_indexer_only() -> anyhow::Result<()> {\n        let command = 
build_cli().no_binary_name(true);\n        let matches = command.try_get_matches_from(vec![\n            \"run\",\n            \"--config\",\n            \"/config.yaml\",\n            \"--service\",\n            \"indexer\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        let expected_config_uri = Uri::from_str(\"file:///config.yaml\").unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Run(RunCliCommand {\n                config_uri,\n                services,\n                ..\n            })\n            if config_uri == expected_config_uri && services.as_ref().unwrap().len() == 1 && services.as_ref().unwrap().iter().cloned().next().unwrap() == QuickwitService::Indexer\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_service_run_args_searcher_and_metastore() -> anyhow::Result<()> {\n        let command = build_cli().no_binary_name(true);\n        let matches = command.try_get_matches_from(vec![\n            \"run\",\n            \"--config\",\n            \"/config.yaml\",\n            \"--service\",\n            \"searcher\",\n            \"--service\",\n            \"metastore\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_config_uri = Uri::from_str(\"file:///config.yaml\").unwrap();\n        let expected_services =\n            HashSet::from_iter([QuickwitService::Metastore, QuickwitService::Searcher]);\n        assert!(matches!(\n            command,\n            CliCommand::Run(RunCliCommand {\n                config_uri,\n                services,\n                ..\n            })\n            if config_uri == expected_config_uri && services.as_ref().unwrap().len() == 2 && services.as_ref().unwrap() == &expected_services\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_service_run_indexer_only_args() -> anyhow::Result<()> {\n        let command = build_cli().no_binary_name(true);\n   
     let matches = command.try_get_matches_from(vec![\n            \"run\",\n            \"--config\",\n            \"/config.yaml\",\n            \"--service\",\n            \"indexer\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        let expected_config_uri = Uri::from_str(\"file:///config.yaml\").unwrap();\n        assert!(matches!(\n            command,\n            CliCommand::Run(RunCliCommand {\n                config_uri,\n                services,\n                ..\n            })\n            if config_uri == expected_config_uri && services.as_ref().unwrap().len() == 1 && services.as_ref().unwrap().contains(&QuickwitService::Indexer)\n        ));\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::str::FromStr;\n\nuse anyhow::{Context, bail};\nuse clap::{ArgMatches, Command, arg};\nuse colored::Colorize;\nuse itertools::Itertools;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{ConfigFormat, SourceConfig, validate_identifier};\nuse quickwit_metastore::checkpoint::SourceCheckpoint;\nuse quickwit_proto::types::{IndexId, SourceId};\nuse quickwit_storage::{StorageResolver, load_file};\nuse serde_json::Value as JsonValue;\nuse tabled::{Table, Tabled};\nuse tracing::debug;\n\nuse crate::checklist::GREEN_COLOR;\nuse crate::{ClientArgs, client_args, make_table, prompt_confirmation};\n\npub fn build_source_command() -> Command {\n    Command::new(\"source\")\n        .about(\"Manages sources: creates, updates, deletes sources...\")\n        .args(client_args())\n        .subcommand(\n            Command::new(\"create\")\n                .about(\"Adds a new source to an index.\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--\"source-config\" <SOURCE_CONFIG> \"Path to source config file. 
Please, refer to the documentation for more details.\")\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"update\")\n                .about(\"Updates an existing source.\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"ID of the source\")\n                        .display_order(2)\n                        .required(true),\n                    arg!(--\"source-config\" <SOURCE_CONFIG> \"Path to source config file. Please, refer to the documentation for more details.\")\n                        .required(true),\n                    arg!(--\"create\" \"Create the source if it does not already exist.\")\n                        .required(false),\n                ])\n            )\n        .subcommand(\n            Command::new(\"enable\")\n                .about(\"Enables a source for an index.\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"ID of the source.\")\n                        .display_order(2)\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"disable\")\n                .about(\"Disables a source for an index.\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"ID of the source.\")\n                        .display_order(2)\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            
Command::new(\"ingest-api\")\n                .about(\"Enables/disables the ingest API of an index.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--enable \"Enables the ingest API.\")\n                        .display_order(2),\n                    arg!(--disable \"Disables the ingest API.\")\n                        .display_order(3)\n                        .conflicts_with(\"enable\"),\n                ])\n            )\n        .subcommand(\n            Command::new(\"delete\")\n                .about(\"Deletes a source from an index.\")\n                .alias(\"del\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"ID of the source.\")\n                        .display_order(2)\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"describe\")\n                .about(\"Describes a source.\")\n                .alias(\"desc\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"ID of the source.\")\n                        .display_order(2)\n                        .required(true),\n                ])\n            )\n        .subcommand(\n            Command::new(\"list\")\n                .about(\"Lists the sources of an index.\")\n                .alias(\"ls\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                ])\n        
    )\n        .subcommand(\n            Command::new(\"reset-checkpoint\")\n                .about(\"Resets a source checkpoint.\")\n                .alias(\"reset\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"Index ID\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"Source ID\")\n                        .display_order(2)\n                        .required(true),\n                ])\n            )\n        .arg_required_else_help(true)\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct CreateSourceArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub source_config_uri: Uri,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct UpdateSourceArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub source_id: SourceId,\n    pub source_config_uri: Uri,\n    pub create: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ToggleSourceArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub source_id: SourceId,\n    pub enable: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct DeleteSourceArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub source_id: SourceId,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct DescribeSourceArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub source_id: SourceId,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ListSourcesArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ResetCheckpointArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub source_id: SourceId,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub enum SourceCliCommand {\n    CreateSource(CreateSourceArgs),\n    UpdateSource(UpdateSourceArgs),\n    ToggleSource(ToggleSourceArgs),\n    
DeleteSource(DeleteSourceArgs),\n    DescribeSource(DescribeSourceArgs),\n    ListSources(ListSourcesArgs),\n    ResetCheckpoint(ResetCheckpointArgs),\n}\n\nimpl SourceCliCommand {\n    pub async fn execute(self) -> anyhow::Result<()> {\n        match self {\n            Self::CreateSource(args) => create_source_cli(args).await,\n            Self::UpdateSource(args) => update_source_cli(args).await,\n            Self::ToggleSource(args) => toggle_source_cli(args).await,\n            Self::DeleteSource(args) => delete_source_cli(args).await,\n            Self::DescribeSource(args) => describe_source_cli(args).await,\n            Self::ListSources(args) => list_sources_cli(args).await,\n            Self::ResetCheckpoint(args) => reset_checkpoint_cli(args).await,\n        }\n    }\n\n    pub fn parse_cli_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let (subcommand, submatches) = matches\n            .remove_subcommand()\n            .context(\"failed to parse source subcommand\")?;\n        match subcommand.as_str() {\n            \"create\" => Self::parse_create_args(submatches).map(Self::CreateSource),\n            \"update\" => Self::parse_update_args(submatches).map(Self::UpdateSource),\n            \"enable\" => {\n                Self::parse_toggle_source_args(&subcommand, submatches).map(Self::ToggleSource)\n            }\n            \"disable\" => {\n                Self::parse_toggle_source_args(&subcommand, submatches).map(Self::ToggleSource)\n            }\n            \"delete\" => Self::parse_delete_args(submatches).map(Self::DeleteSource),\n            \"describe\" => Self::parse_describe_args(submatches).map(Self::DescribeSource),\n            \"list\" => Self::parse_list_args(submatches).map(Self::ListSources),\n            \"reset-checkpoint\" => {\n                Self::parse_reset_checkpoint_args(submatches).map(Self::ResetCheckpoint)\n            }\n            _ => bail!(\"unknown source subcommand `{subcommand}`\"),\n        
}\n    }\n\n    fn parse_create_args(mut matches: ArgMatches) -> anyhow::Result<CreateSourceArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_config_uri = matches\n            .remove_one::<String>(\"source-config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`source-config` should be a required arg.\")?;\n        Ok(CreateSourceArgs {\n            client_args,\n            index_id,\n            source_config_uri,\n        })\n    }\n\n    fn parse_update_args(mut matches: ArgMatches) -> anyhow::Result<UpdateSourceArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_id = matches\n            .remove_one::<String>(\"source\")\n            .expect(\"`source` should be a required arg.\");\n        let source_config_uri = matches\n            .remove_one::<String>(\"source-config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`source-config` should be a required arg.\")?;\n        let create = matches.get_flag(\"create\");\n\n        Ok(UpdateSourceArgs {\n            client_args,\n            index_id,\n            source_id,\n            source_config_uri,\n            create,\n        })\n    }\n\n    fn parse_toggle_source_args(\n        subcommand: &str,\n        mut matches: ArgMatches,\n    ) -> anyhow::Result<ToggleSourceArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_id = matches\n            .remove_one::<String>(\"source\")\n            .expect(\"`source` should be a 
required arg.\");\n        let enable = matches!(subcommand, \"enable\");\n        Ok(ToggleSourceArgs {\n            client_args,\n            index_id,\n            source_id,\n            enable,\n        })\n    }\n\n    fn parse_delete_args(mut matches: ArgMatches) -> anyhow::Result<DeleteSourceArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_id = matches\n            .remove_one::<String>(\"source\")\n            .expect(\"`source` should be a required arg.\");\n        let assume_yes = matches.get_flag(\"yes\");\n        Ok(DeleteSourceArgs {\n            client_args,\n            index_id,\n            source_id,\n            assume_yes,\n        })\n    }\n\n    fn parse_describe_args(mut matches: ArgMatches) -> anyhow::Result<DescribeSourceArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_id = matches\n            .remove_one::<String>(\"source\")\n            .expect(\"`source` should be a required arg.\");\n        Ok(DescribeSourceArgs {\n            client_args,\n            index_id,\n            source_id,\n        })\n    }\n\n    fn parse_list_args(mut matches: ArgMatches) -> anyhow::Result<ListSourcesArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        Ok(ListSourcesArgs {\n            client_args,\n            index_id,\n        })\n    }\n\n    fn parse_reset_checkpoint_args(mut matches: ArgMatches) -> anyhow::Result<ResetCheckpointArgs> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n  
          .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_id = matches\n            .remove_one::<String>(\"source\")\n            .expect(\"`source` should be a required arg.\");\n        let assume_yes = matches.get_flag(\"yes\");\n        Ok(ResetCheckpointArgs {\n            client_args,\n            index_id,\n            source_id,\n            assume_yes,\n        })\n    }\n}\n\nasync fn create_source_cli(args: CreateSourceArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"create-source\");\n    println!(\"❯ Creating source...\");\n    let storage_resolver = StorageResolver::unconfigured();\n    let source_config_content = load_file(&storage_resolver, &args.source_config_uri).await?;\n    let source_config_str: &str = std::str::from_utf8(&source_config_content)\n        .with_context(|| format!(\"source config is not utf-8: {}\", args.source_config_uri))?;\n    let config_format = ConfigFormat::sniff_from_uri(&args.source_config_uri)?;\n    let qw_client = args.client_args.client();\n    qw_client\n        .sources(&args.index_id)\n        .create(source_config_str, config_format)\n        .await?;\n    println!(\"{} Source successfully created.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\nasync fn update_source_cli(args: UpdateSourceArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"update-source\");\n    println!(\"❯ Updating source...\");\n    let storage_resolver = StorageResolver::unconfigured();\n    let source_config_content = load_file(&storage_resolver, &args.source_config_uri).await?;\n    let source_config_str: &str = std::str::from_utf8(&source_config_content)\n        .with_context(|| format!(\"source config is not utf-8: {}\", args.source_config_uri))?;\n    let config_format = ConfigFormat::sniff_from_uri(&args.source_config_uri)?;\n    let qw_client = args.client_args.client();\n    qw_client\n        .sources(&args.index_id)\n        .update(\n            
&args.source_id,\n            source_config_str,\n            config_format,\n            args.create,\n        )\n        .await?;\n    println!(\"{} Source successfully updated.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\nasync fn toggle_source_cli(args: ToggleSourceArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"toggle-source\");\n    println!(\"❯ Toggling source...\");\n    let qw_client = args.client_args.client();\n    qw_client\n        .sources(&args.index_id)\n        .toggle(&args.source_id, args.enable)\n        .await\n        .context(\"failed to update source\")?;\n\n    let toggled_state_name = if args.enable { \"enabled\" } else { \"disabled\" };\n    println!(\n        \"{} Source successfully {}.\",\n        \"✔\".color(GREEN_COLOR),\n        toggled_state_name\n    );\n    Ok(())\n}\n\nasync fn delete_source_cli(args: DeleteSourceArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"delete-source\");\n    println!(\"❯ Deleting source...\");\n    validate_identifier(\"Source ID\", &args.source_id)?;\n\n    if !args.assume_yes {\n        let prompt = \"This operation will delete the source. 
Do you want to proceed?\".to_string();\n        if !prompt_confirmation(&prompt, false) {\n            return Ok(());\n        }\n    }\n\n    let qw_client = args.client_args.client();\n    qw_client\n        .sources(&args.index_id)\n        .delete(&args.source_id)\n        .await\n        .context(\"failed to delete source\")?;\n    println!(\"{} Source successfully deleted.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\nasync fn describe_source_cli(args: DescribeSourceArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"describe-source\");\n    let qw_client = args.client_args.client();\n    let index_metadata = qw_client\n        .indexes()\n        .get(&args.index_id)\n        .await\n        .context(\"failed to fetch index metadata\")?;\n    let source_checkpoint = index_metadata\n        .checkpoint\n        .source_checkpoint(&args.source_id)\n        .cloned()\n        .unwrap_or_default();\n    let (source_table, params_table, checkpoint_table) = make_describe_source_tables(\n        source_checkpoint,\n        index_metadata.sources.into_values(),\n        &args.source_id,\n    )?;\n    display_tables(&[source_table, params_table, checkpoint_table]);\n    Ok(())\n}\n\nfn make_describe_source_tables<I>(\n    checkpoint: SourceCheckpoint,\n    sources: I,\n    source_id: &str,\n) -> anyhow::Result<(Table, Table, Table)>\nwhere\n    I: IntoIterator<Item = SourceConfig>,\n{\n    let source = sources\n        .into_iter()\n        .find(|source| source.source_id == source_id)\n        .with_context(|| format!(\"source `{source_id}` does not exist\"))?;\n\n    let source_rows = vec![SourceRow {\n        source_id: source.source_id.clone(),\n        source_type: source.source_type().as_str().to_string(),\n        enabled: source.enabled.to_string(),\n    }];\n    let source_table = make_table(\"Source\", source_rows, true);\n\n    let params_rows = flatten_json(source.params())\n        .into_iter()\n        .map(|(key, value)| ParamsRow { key, value 
})\n        .sorted_by(|left, right| left.key.cmp(&right.key));\n    let params_table = make_table(\"Parameters\", params_rows, false);\n\n    let checkpoint_rows = checkpoint\n        .iter()\n        .map(|(partition_id, position)| CheckpointRow {\n            partition_id: partition_id.0.to_string(),\n            offset: position.to_string(),\n        })\n        .sorted_by(|left, right| left.partition_id.cmp(&right.partition_id));\n    let checkpoint_table = make_table(\"Checkpoint\", checkpoint_rows, false);\n    Ok((source_table, params_table, checkpoint_table))\n}\n\nasync fn list_sources_cli(args: ListSourcesArgs) -> anyhow::Result<()> {\n    let qw_client = args.client_args.client();\n    let index_metadata = qw_client\n        .indexes()\n        .get(&args.index_id)\n        .await\n        .context(\"failed to fetch index metadata\")?;\n    let table = make_list_sources_table(index_metadata.sources.into_values());\n    display_tables(&[table]);\n    Ok(())\n}\n\nfn make_list_sources_table<I>(sources: I) -> Table\nwhere I: IntoIterator<Item = SourceConfig> {\n    let rows = sources\n        .into_iter()\n        .map(|source| SourceRow {\n            source_type: source.source_type().as_str().to_string(),\n            source_id: source.source_id,\n            enabled: source.enabled.to_string(),\n        })\n        .sorted_by(|left, right| left.source_id.cmp(&right.source_id));\n    make_table(\"Sources\", rows, false)\n}\n\n#[derive(Tabled)]\nstruct SourceRow {\n    #[tabled(rename = \"ID\")]\n    source_id: SourceId,\n    #[tabled(rename = \"Type\")]\n    source_type: String,\n    #[tabled(rename = \"Enabled\")]\n    enabled: String,\n}\n\n#[derive(Tabled)]\nstruct ParamsRow {\n    #[tabled(rename = \"Key\")]\n    key: String,\n    #[tabled(rename = \"Value\")]\n    value: JsonValue,\n}\n\n#[derive(Tabled)]\nstruct CheckpointRow {\n    #[tabled(rename = \"Partition ID\")]\n    partition_id: String,\n    #[tabled(rename = \"Offset\")]\n    offset: 
String,\n}\n\nfn display_tables(tables: &[Table]) {\n    println!(\n        \"{}\",\n        tables.iter().map(|table| table.to_string()).join(\"\\n\\n\")\n    );\n}\n\nasync fn reset_checkpoint_cli(args: ResetCheckpointArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"reset-checkpoint-source\");\n    println!(\"❯ Resetting source checkpoint...\");\n    if !args.assume_yes {\n        let prompt =\n            \"This operation will reset the source checkpoints. Do you want to proceed?\".to_string();\n        if !prompt_confirmation(&prompt, false) {\n            return Ok(());\n        }\n    }\n    let qw_client = args.client_args.client();\n    qw_client\n        .sources(&args.index_id)\n        .reset_checkpoint(&args.source_id)\n        .await?;\n    println!(\n        \"{} Checkpoint successfully deleted.\",\n        \"✔\".color(GREEN_COLOR)\n    );\n    Ok(())\n}\n\n/// Recursively flattens a JSON object into a vector of `(path, value)` tuples where `path`\n/// represents the full path of each property in the original object. For instance, `{\"root\": true,\n/// \"parent\": {\"child\": 0}}` yields `[(\"root\", true), (\"parent.child\", 0)]`. 
Arrays are not\n/// flattened.\nfn flatten_json(value: JsonValue) -> Vec<(String, JsonValue)> {\n    let mut acc = Vec::new();\n    let mut values = vec![(String::new(), value)];\n\n    while let Some((root, value)) = values.pop() {\n        if let JsonValue::Object(obj) = value {\n            for (key, val) in obj {\n                values.push((\n                    if root.is_empty() {\n                        key\n                    } else {\n                        format!(\"{root}.{key}\")\n                    },\n                    val,\n                ));\n            }\n            continue;\n        }\n        acc.push((root, value))\n    }\n    acc\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n    use std::str::FromStr;\n\n    use quickwit_config::{SourceInputFormat, SourceParams};\n    use quickwit_metastore::checkpoint::PartitionId;\n    use quickwit_proto::types::Position;\n    use serde_json::json;\n\n    use super::*;\n    use crate::cli::{CliCommand, build_cli};\n\n    #[test]\n    fn test_flatten_json() {\n        assert!(flatten_json(json!({})).is_empty());\n\n        assert_eq!(\n            flatten_json(json!(JsonValue::Null)),\n            vec![(\"\".to_string(), JsonValue::Null)]\n        );\n        assert_eq!(\n            flatten_json(\n                json!({\"foo\": {\"bar\": JsonValue::Bool(true)}, \"baz\": JsonValue::Bool(false)})\n            ),\n            vec![\n                (\"baz\".to_string(), JsonValue::Bool(false)),\n                (\"foo.bar\".to_string(), JsonValue::Bool(true)),\n            ]\n        );\n    }\n\n    #[test]\n    fn test_parse_create_source_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from(vec![\n                \"source\",\n                \"create\",\n                \"--index\",\n                \"hdfs-logs\",\n                \"--source-config\",\n                \"/source-conf.yaml\",\n            ])\n 
           .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_command =\n            CliCommand::Source(SourceCliCommand::CreateSource(CreateSourceArgs {\n                client_args: ClientArgs::default(),\n                index_id: \"hdfs-logs\".to_string(),\n                source_config_uri: Uri::from_str(\"file:///source-conf.yaml\").unwrap(),\n            }));\n        assert_eq!(command, expected_command);\n    }\n\n    #[test]\n    fn test_parse_update_source_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from(vec![\n                \"source\",\n                \"update\",\n                \"--index\",\n                \"hdfs-logs\",\n                \"--source\",\n                \"kafka-foo\",\n                \"--source-config\",\n                \"/source-conf.yaml\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_command =\n            CliCommand::Source(SourceCliCommand::UpdateSource(UpdateSourceArgs {\n                client_args: ClientArgs::default(),\n                index_id: \"hdfs-logs\".to_string(),\n                source_id: \"kafka-foo\".to_string(),\n                source_config_uri: Uri::from_str(\"file:///source-conf.yaml\").unwrap(),\n                create: false,\n            }));\n        assert_eq!(command, expected_command);\n    }\n\n    #[test]\n    fn test_parse_toggle_source_args() {\n        {\n            let app = build_cli().no_binary_name(true);\n            let matches = app\n                .try_get_matches_from(vec![\n                    \"source\",\n                    \"enable\",\n                    \"--index\",\n                    \"hdfs-logs\",\n                    \"--source\",\n                    \"kafka-foo\",\n                ])\n                .unwrap();\n            let command = 
CliCommand::parse_cli_args(matches).unwrap();\n            let expected_command =\n                CliCommand::Source(SourceCliCommand::ToggleSource(ToggleSourceArgs {\n                    client_args: ClientArgs::default(),\n                    index_id: \"hdfs-logs\".to_string(),\n                    source_id: \"kafka-foo\".to_string(),\n                    enable: true,\n                }));\n            assert_eq!(command, expected_command);\n        }\n        {\n            let app = build_cli().no_binary_name(true);\n            let matches = app\n                .try_get_matches_from(vec![\n                    \"source\",\n                    \"disable\",\n                    \"--index\",\n                    \"hdfs-logs\",\n                    \"--source\",\n                    \"kafka-foo\",\n                ])\n                .unwrap();\n            let command = CliCommand::parse_cli_args(matches).unwrap();\n            let expected_command =\n                CliCommand::Source(SourceCliCommand::ToggleSource(ToggleSourceArgs {\n                    client_args: ClientArgs::default(),\n                    index_id: \"hdfs-logs\".to_string(),\n                    source_id: \"kafka-foo\".to_string(),\n                    enable: false,\n                }));\n            assert_eq!(command, expected_command);\n        }\n    }\n\n    #[test]\n    fn test_parse_delete_source_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from(vec![\n                \"source\",\n                \"delete\",\n                \"--index\",\n                \"hdfs-logs\",\n                \"--source\",\n                \"hdfs-logs-source\",\n                \"--yes\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_command =\n            CliCommand::Source(SourceCliCommand::DeleteSource(DeleteSourceArgs {\n                
client_args: ClientArgs::default(),\n                index_id: \"hdfs-logs\".to_string(),\n                source_id: \"hdfs-logs-source\".to_string(),\n                assume_yes: true,\n            }));\n        assert_eq!(command, expected_command);\n    }\n\n    #[test]\n    fn test_parse_describe_source_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from(vec![\n                \"source\",\n                \"describe\",\n                \"--index\",\n                \"hdfs-logs\",\n                \"--source\",\n                \"hdfs-logs-source\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_command =\n            CliCommand::Source(SourceCliCommand::DescribeSource(DescribeSourceArgs {\n                client_args: ClientArgs::default(),\n                index_id: \"hdfs-logs\".to_string(),\n                source_id: \"hdfs-logs-source\".to_string(),\n            }));\n        assert_eq!(command, expected_command);\n    }\n\n    #[test]\n    fn test_parse_reset_checkpoint_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from(vec![\n                \"source\",\n                \"reset-checkpoint\",\n                \"--index\",\n                \"hdfs-logs\",\n                \"--source\",\n                \"hdfs-logs-source\",\n                \"--yes\",\n            ])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_command =\n            CliCommand::Source(SourceCliCommand::ResetCheckpoint(ResetCheckpointArgs {\n                client_args: ClientArgs::default(),\n                index_id: \"hdfs-logs\".to_string(),\n                source_id: \"hdfs-logs-source\".to_string(),\n                assume_yes: true,\n            }));\n        assert_eq!(command, 
expected_command);\n    }\n\n    #[test]\n    fn test_make_describe_source_tables() {\n        assert!(\n            make_describe_source_tables(SourceCheckpoint::default(), [], \"source-does-not-exist\")\n                .is_err()\n        );\n\n        let checkpoint: SourceCheckpoint = vec![(\"shard-000\", \"\"), (\"shard-001\", \"1234567890\")]\n            .into_iter()\n            .map(|(partition_id, offset)| {\n                (PartitionId::from(partition_id), Position::offset(offset))\n            })\n            .collect();\n        let sources = vec![SourceConfig {\n            source_id: \"foo-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::file_from_str(\"path/to/file\").unwrap(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }];\n        let expected_source = vec![SourceRow {\n            source_id: \"foo-source\".to_string(),\n            source_type: \"file\".to_string(),\n            enabled: \"true\".to_string(),\n        }];\n        let expected_uri = Uri::from_str(\"path/to/file\").unwrap();\n        let expected_params = vec![ParamsRow {\n            key: \"filepath\".to_string(),\n            value: JsonValue::String(expected_uri.to_string()),\n        }];\n        let expected_checkpoint = vec![\n            CheckpointRow {\n                partition_id: \"shard-000\".to_string(),\n                offset: \"\".to_string(),\n            },\n            CheckpointRow {\n                partition_id: \"shard-001\".to_string(),\n                offset: \"1234567890\".to_string(),\n            },\n        ];\n        let (source_table, params_table, checkpoint_table) =\n            make_describe_source_tables(checkpoint, sources, \"foo-source\").unwrap();\n        assert_eq!(\n            source_table.to_string(),\n            make_table(\"Source\", expected_source, true).to_string()\n        );\n        
assert_eq!(\n            params_table.to_string(),\n            make_table(\"Parameters\", expected_params, false).to_string()\n        );\n        assert_eq!(\n            checkpoint_table.to_string(),\n            make_table(\"Checkpoint\", expected_checkpoint, false).to_string()\n        );\n    }\n\n    #[test]\n    fn test_parse_list_sources_args() {\n        let app = build_cli().no_binary_name(true);\n        let matches = app\n            .try_get_matches_from(vec![\"source\", \"list\", \"--index\", \"hdfs-logs\"])\n            .unwrap();\n        let command = CliCommand::parse_cli_args(matches).unwrap();\n        let expected_command = CliCommand::Source(SourceCliCommand::ListSources(ListSourcesArgs {\n            client_args: ClientArgs::default(),\n            index_id: \"hdfs-logs\".to_string(),\n        }));\n        assert_eq!(command, expected_command);\n    }\n\n    #[test]\n    fn test_make_list_sources_table() {\n        let sources = [\n            SourceConfig {\n                source_id: \"foo-source\".to_string(),\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::stdin(),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            },\n            SourceConfig {\n                source_id: \"bar-source\".to_string(),\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::stdin(),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            },\n        ];\n        let expected_sources = [\n            SourceRow {\n                source_id: \"bar-source\".to_string(),\n                source_type: \"stdin\".to_string(),\n                enabled: \"true\".to_string(),\n            },\n            SourceRow {\n                source_id: \"foo-source\".to_string(),\n                source_type: 
\"stdin\".to_string(),\n                enabled: \"true\".to_string(),\n            },\n        ];\n        assert_eq!(\n            make_list_sources_table(sources).to_string(),\n            make_table(\"Sources\", expected_sources, false).to_string()\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/split.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::str::FromStr;\n\nuse anyhow::{Context, bail};\nuse clap::{ArgMatches, Command, arg};\nuse colored::Colorize;\nuse itertools::Itertools;\nuse quickwit_metastore::{Split, SplitState};\nuse quickwit_proto::types::{IndexId, SplitId};\nuse quickwit_serve::ListSplitsQueryParams;\nuse tabled::{Table, Tabled};\nuse time::{Date, OffsetDateTime, PrimitiveDateTime, format_description};\nuse tracing::debug;\n\nuse crate::checklist::GREEN_COLOR;\nuse crate::{ClientArgs, client_args, make_table, prompt_confirmation};\n\npub fn build_split_command() -> Command {\n    Command::new(\"split\")\n        .about(\"Manages splits: lists, describes, marks for deletion...\")\n        .args(client_args())\n        .subcommand(\n            Command::new(\"list\")\n                .about(\"Lists the splits of an index.\")\n                .alias(\"ls\")\n                .args(&[\n                    arg!(--index <INDEX> \"Target index ID\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--\"offset\" <OFFSET> \"Number of splits to skip.\")\n                        .display_order(2)\n                        .required(false),\n                    arg!(--\"limit\" <LIMIT> \"Maximum number of splits to retrieve.\")\n                        .display_order(3)\n                        .required(false),\n    
                arg!(--states <SPLIT_STATES> \"Selects the splits whose states are included in this comma-separated list of states. Possible values are `staged`, `published`, and `marked`.\")\n                        .display_order(4)\n                        .required(false)\n                        .value_delimiter(','),\n                    arg!(--\"create-date\" <CREATE_DATE> \"Selects the splits whose creation dates are before this date.\")\n                        .display_order(5)\n                        .required(false),\n                    arg!(--\"start-date\" <START_DATE> \"Selects the splits that contain documents after this date (time-series indexes only).\")\n                        .display_order(6)\n                        .required(false),\n                    arg!(--\"end-date\" <END_DATE> \"Selects the splits that contain documents before this date (time-series indexes only).\")\n                        .display_order(7)\n                        .required(false),\n                    // See #2762:\n                    // arg!(--tags <TAGS> \"Selects the splits whose tags are all included in this comma-separated list of tags.\")\n                    //     .display_order(6)\n                    //     .required(false)\n                    //     .use_value_delimiter(true),\n                    arg!(--\"output-format\" <OUTPUT_FORMAT> \"Output format. 
Possible values are `table`, `json`, and `pretty-json`.\")\n                        .alias(\"format\")\n                        .display_order(8)\n                        .required(false)\n                ])\n            )\n        .subcommand(\n            Command::new(\"describe\")\n                .about(\"Displays metadata about a split.\")\n                .alias(\"desc\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--split <SPLIT> \"ID of the target split\")\n                        .display_order(2)\n                        .required(true),\n                    arg!(--verbose \"Displays additional metadata about the hotcache.\"),\n                ])\n            )\n        .subcommand(\n            Command::new(\"mark-for-deletion\")\n                .about(\"Marks one or multiple splits of an index for deletion.\")\n                .alias(\"mark\")\n                .args(&[\n                    arg!(--index <INDEX_ID> \"Target index ID\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--splits <SPLIT_IDS> \"Comma-separated list of split IDs\")\n                        .display_order(2)\n                        .required(true)\n                        .value_delimiter(','),\n                    arg!(-y --\"yes\" \"Assume \\\"yes\\\" as an answer to all prompts and run non-interactively.\")\n                        .required(false),\n                ])\n            )\n        .arg_required_else_help(true)\n}\n\n#[derive(Debug, Eq, PartialEq)]\nenum OutputFormat {\n    Table, // Default\n    Json,\n    PrettyJson,\n}\n\nimpl FromStr for OutputFormat {\n    type Err = anyhow::Error;\n\n    fn from_str(output_format_str: &str) -> anyhow::Result<Self> {\n        match output_format_str {\n            \"json\" => Ok(OutputFormat::Json),\n    
        \"pretty-json\" | \"pretty_json\" => Ok(OutputFormat::PrettyJson),\n            \"table\" => Ok(OutputFormat::Table),\n            _ => bail!(\n                \"unknown output format `{output_format_str}`. supported formats are: `table`, \\\n                 `json`, and `pretty-json`\"\n            ),\n        }\n    }\n}\n\n#[derive(Debug, PartialEq)]\npub struct ListSplitArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub offset: Option<usize>,\n    pub limit: Option<usize>,\n    pub split_states: Option<Vec<SplitState>>,\n    pub create_date: Option<OffsetDateTime>,\n    pub start_date: Option<OffsetDateTime>,\n    pub end_date: Option<OffsetDateTime>,\n    // pub tags: Option<TagFilterAst>,\n    output_format: OutputFormat,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct MarkForDeletionArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub split_ids: Vec<String>,\n    pub assume_yes: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct DescribeSplitArgs {\n    pub client_args: ClientArgs,\n    pub index_id: IndexId,\n    pub split_id: SplitId,\n    pub verbose: bool,\n}\n\n#[derive(Debug, PartialEq)]\npub enum SplitCliCommand {\n    List(ListSplitArgs),\n    MarkForDeletion(MarkForDeletionArgs),\n    Describe(DescribeSplitArgs),\n}\n\nimpl SplitCliCommand {\n    pub fn parse_cli_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let (subcommand, submatches) = matches\n            .remove_subcommand()\n            .context(\"failed to split subcommand\")?;\n        match subcommand.as_str() {\n            \"describe\" => Self::parse_describe_args(submatches),\n            \"list\" => Self::parse_list_args(submatches),\n            \"mark-for-deletion\" => Self::parse_mark_for_deletion_args(submatches),\n            _ => bail!(\"unknown split subcommand `{subcommand}`\"),\n        }\n    }\n\n    fn parse_list_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let 
client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let offset = matches\n            .remove_one::<String>(\"offset\")\n            .and_then(|s| s.parse::<usize>().ok());\n        let limit = matches\n            .remove_one::<String>(\"limit\")\n            .and_then(|s| s.parse::<usize>().ok());\n        let split_states = matches\n            .remove_many::<String>(\"states\")\n            .map(|values| {\n                values\n                    .into_iter()\n                    .dedup()\n                    .map(|split_state_str| parse_split_state(&split_state_str))\n                    .collect::<Result<Vec<_>, _>>()\n            })\n            .transpose()?;\n        let create_date = matches\n            .remove_one::<String>(\"create-date\")\n            .map(|date_str| parse_date(&date_str, \"create\"))\n            .transpose()?;\n        let start_date = matches\n            .remove_one::<String>(\"start-date\")\n            .map(|date_str| parse_date(&date_str, \"start\"))\n            .transpose()?;\n        let end_date = matches\n            .remove_one::<String>(\"end-date\")\n            .map(|date_str| parse_date(&date_str, \"end\"))\n            .transpose()?;\n        // let tags = matches.values_of(\"tags\").map(|values| {\n        //     TagFilterAst::And(\n        //         values\n        //             .into_iter()\n        //             .map(|value| TagFilterAst::Tag {\n        //                 get_flag: true,\n        //                 tag: value.to_string(),\n        //             })\n        //             .collect(),\n        //     )\n        // });\n        let output_format = matches\n            .remove_one::<String>(\"output-format\")\n            .map(|s| OutputFormat::from_str(s.as_str()))\n            .transpose()?\n            .unwrap_or(OutputFormat::Table);\n        
Ok(Self::List(ListSplitArgs {\n            client_args,\n            index_id,\n            offset,\n            limit,\n            split_states,\n            start_date,\n            end_date,\n            create_date,\n            // tags,\n            output_format,\n        }))\n    }\n\n    fn parse_mark_for_deletion_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let split_ids = matches\n            .remove_many::<String>(\"splits\")\n            .expect(\"`splits` should be a required arg.\")\n            .collect();\n        let assume_yes = matches.get_flag(\"yes\");\n        Ok(Self::MarkForDeletion(MarkForDeletionArgs {\n            client_args,\n            index_id,\n            split_ids,\n            assume_yes,\n        }))\n    }\n\n    fn parse_describe_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let split_id = matches\n            .remove_one::<String>(\"split\")\n            .expect(\"`split` should be a required arg.\");\n        let client_args = ClientArgs::parse(&mut matches)?;\n        let verbose = matches.get_flag(\"verbose\");\n\n        Ok(Self::Describe(DescribeSplitArgs {\n            client_args,\n            index_id,\n            split_id,\n            verbose,\n        }))\n    }\n\n    pub async fn execute(self) -> anyhow::Result<()> {\n        match self {\n            Self::List(args) => list_split_cli(args).await,\n            Self::MarkForDeletion(args) => mark_splits_for_deletion_cli(args).await,\n            Self::Describe(args) => describe_split_cli(args).await,\n        }\n    }\n}\n\nasync fn list_split_cli(args: ListSplitArgs) -> anyhow::Result<()> 
{\n    debug!(args=?args, \"list-split\");\n    let qw_client = args.client_args.client();\n    let list_splits_query_params = ListSplitsQueryParams {\n        offset: args.offset,\n        limit: args.limit,\n        split_states: args.split_states,\n        start_timestamp: args.start_date.map(OffsetDateTime::unix_timestamp),\n        end_timestamp: args.end_date.map(OffsetDateTime::unix_timestamp),\n        end_create_timestamp: args.create_date.map(OffsetDateTime::unix_timestamp),\n    };\n    // TODO: plug tags.\n    // if let Some(tags) = args.tags {\n    //     query = query.with_tags_filter(tags);\n    // }\n    let splits = qw_client\n        .splits(&args.index_id)\n        .list(list_splits_query_params)\n        .await\n        .context(\"failed to list splits\")?;\n    let output = match args.output_format {\n        OutputFormat::Json => serde_json::to_string(&splits)?,\n        OutputFormat::PrettyJson => serde_json::to_string_pretty(&splits)?,\n        OutputFormat::Table => make_split_table(&splits, \"Splits\").to_string(),\n    };\n    println!(\"{output}\");\n    Ok(())\n}\n\nasync fn mark_splits_for_deletion_cli(args: MarkForDeletionArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"mark-splits-for-deletion\");\n    println!(\"❯ Marking splits for deletion...\");\n    if !args.assume_yes {\n        let prompt = \"This operation will mark splits for deletion, those splits will be deleted \\\n                      after the next garbage collection. 
Do you want to proceed?\";\n        if !prompt_confirmation(prompt, false) {\n            return Ok(());\n        }\n    }\n    let qw_client = args.client_args.client();\n    qw_client\n        .splits(&args.index_id)\n        .mark_for_deletion(args.split_ids)\n        .await?;\n    println!(\n        \"{} Splits successfully marked for deletion.\",\n        \"✔\".color(GREEN_COLOR)\n    );\n    Ok(())\n}\n\nasync fn describe_split_cli(args: DescribeSplitArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"describe-split\");\n    let qw_client = args.client_args.client();\n    let list_splits_query_params = ListSplitsQueryParams::default();\n    let split = qw_client\n        .splits(&args.index_id)\n        .list(list_splits_query_params)\n        .await\n        .context(\"failed to fetch splits\")?\n        .into_iter()\n        .find(|split| split.split_id() == args.split_id)\n        .with_context(|| {\n            format!(\n                \"could not find split metadata in metastore {}\",\n                args.split_id\n            )\n        })?;\n\n    println!(\"{}\", make_split_table(&[split], \"Split\"));\n\n    // TODO: if we have access to the storage, we could fetch that.\n    // let split_file = PathBuf::from(format!(\"{}.split\", args.split_id));\n    // let (split_footer, _) = read_split_footer(index_storage, &split_file).await?;\n    // let stats = BundleDirectory::get_stats_split(split_footer.clone())?;\n    // let hotcache_bytes = get_hotcache_from_split(split_footer)?;\n\n    // let mut file_rows = Vec::new();\n\n    // for (path, size) in stats {\n    //     file_rows.push(FileRow {\n    //         file_name: path.to_str().unwrap().to_string(),\n    //         size: format_size(size, DECIMAL),\n    //     });\n    // }\n    // println!(\n    //     \"{}\",\n    //     make_table(\"Files in Split\", file_rows.into_iter(), false)\n    // );\n    // if args.verbose {\n    //     let mut hotcache_files = Vec::new();\n    //     let 
hotcache_stats = HotDirectory::get_stats_per_file(hotcache_bytes)?;\n    //     for (path, size) in hotcache_stats {\n    //         hotcache_files.push(FileRow {\n    //             file_name: path.to_str().unwrap().to_string(),\n    //             size: format_size(size, DECIMAL),\n    //         });\n    //     }\n    //     let hotcache_table = make_table(\"Files in Hotcache\", hotcache_files.into_iter(), false);\n    //     println!(\"{hotcache_table}\");\n    // }\n    Ok(())\n}\n\nfn make_split_table(splits: &[Split], title: &str) -> Table {\n    let rows = splits\n        .iter()\n        .map(|split| {\n            let time_range = if let Some(time_range) = &split.split_metadata.time_range {\n                format!(\"[{time_range:?}]\")\n            } else {\n                \"[*]\".to_string()\n            };\n            let created_at =\n                OffsetDateTime::from_unix_timestamp(split.split_metadata.create_timestamp)\n                    .expect(\"Failed to create `OffsetDateTime` from split create timestamp.\");\n            let updated_at = OffsetDateTime::from_unix_timestamp(split.update_timestamp)\n                .expect(\"Failed to create `OffsetDateTime` from split update timestamp.\");\n\n            SplitRow {\n                split_id: split.split_metadata.split_id.clone(),\n                split_state: split.split_state,\n                num_docs: split.split_metadata.num_docs,\n                size_mega_bytes: split.split_metadata.uncompressed_docs_size_in_bytes / 1_000_000,\n                created_at,\n                updated_at,\n                time_range,\n            }\n        })\n        .sorted_by(|left, right| left.created_at.cmp(&right.created_at));\n    make_table(title, rows, false)\n}\n\nfn parse_date(date_arg: &str, option_name: &str) -> anyhow::Result<OffsetDateTime> {\n    let description = format_description::parse(\"[year]-[month]-[day]\")?;\n    if let Ok(date) = Date::parse(date_arg, &description) {\n        
return Ok(date.with_hms(0, 0, 0)?.assume_utc());\n    }\n\n    for datetime_format in [\n        \"[year]-[month]-[day] [hour]:[minute]\",\n        \"[year]-[month]-[day] [hour]:[minute]:[second]\",\n        \"[year]-[month]-[day]T[hour]:[minute]\",\n        \"[year]-[month]-[day]T[hour]:[minute]:[second]\",\n    ] {\n        let description = format_description::parse(datetime_format)?;\n        if let Ok(datetime) = PrimitiveDateTime::parse(date_arg, &description) {\n            return Ok(datetime.assume_utc());\n        }\n    }\n    bail!(\n        \"failed to parse --{}-date option parameter `{}`. supported format is `YYYY-MM-DD[ \\\n         HH:MM[:SS]]`\",\n        option_name,\n        date_arg\n    );\n}\n\nfn parse_split_state(split_state_arg: &str) -> anyhow::Result<SplitState> {\n    let split_state = match split_state_arg.to_lowercase().as_str() {\n        \"staged\" => SplitState::Staged,\n        \"published\" => SplitState::Published,\n        \"marked\" => SplitState::MarkedForDeletion,\n        _ => bail!(format!(\n            \"unknown split state `{split_state_arg}`. 
possible values are `staged`, `published`, \\\n             and `marked`\"\n        )),\n    };\n    Ok(split_state)\n}\n\n#[derive(Tabled)]\nstruct SplitRow {\n    #[tabled(rename = \"ID\")]\n    split_id: SplitId,\n    #[tabled(rename = \"State\")]\n    split_state: SplitState,\n    #[tabled(rename = \"Num docs\")]\n    num_docs: usize,\n    #[tabled(rename = \"Size (MB)\")]\n    size_mega_bytes: u64,\n    #[tabled(rename = \"Created at\")]\n    created_at: OffsetDateTime,\n    #[tabled(rename = \"Updated at\")]\n    updated_at: OffsetDateTime,\n    #[tabled(rename = \"Time range\")]\n    time_range: String,\n}\n\n#[cfg(test)]\nmod tests {\n    use reqwest::Url;\n    use time::macros::datetime;\n\n    use super::*;\n    use crate::cli::{CliCommand, build_cli};\n\n    #[test]\n    fn test_parse_list_split_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from(vec![\n            \"split\",\n            \"list\",\n            \"--index\",\n            \"hdfs\",\n            \"--states\",\n            \"staged,published\",\n            \"--create-date\",\n            \"2020-12-24\",\n            \"--start-date\",\n            \"2020-12-24\",\n            \"--end-date\",\n            \"2020-12-25T12:42\",\n            // \"--tags\",\n            // \"tenant:a,service:zk\",\n            \"--format\",\n            \"json\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n\n        let expected_split_states = Some(vec![SplitState::Staged, SplitState::Published]);\n        let expected_create_date = Some(datetime!(2020-12-24 00:00 UTC));\n        let expected_start_date = Some(datetime!(2020-12-24 00:00 UTC));\n        let expected_end_date = Some(datetime!(2020-12-25 12:42 UTC));\n        // let expected_tags = Some(TagFilterAst::And(vec![\n        //     TagFilterAst::Tag {\n        //         get_flag: true,\n        //         tag: \"tenant:a\".to_string(),\n       
 //     },\n        //     TagFilterAst::Tag {\n        //         get_flag: true,\n        //         tag: \"service:zk\".to_string(),\n        //     },\n        // ]));\n        let expected_output_format = OutputFormat::Json;\n        assert!(matches!(\n            command,\n            CliCommand::Split(SplitCliCommand::List(ListSplitArgs {\n                index_id,\n                split_states,\n                create_date,\n                start_date,\n                end_date,\n                // tags,\n                output_format,\n                ..\n            })) if index_id == \"hdfs\"\n                   && split_states == expected_split_states\n                   && create_date == expected_create_date\n                   && start_date == expected_start_date\n                   && end_date == expected_end_date\n                   // && tags == expected_tags\n                   && output_format == expected_output_format\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_split_mark_for_deletion_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from(vec![\n            \"split\",\n            \"mark\",\n            \"--endpoint\",\n            \"https://quickwit-cluster.io\",\n            \"--index\",\n            \"wikipedia\",\n            \"--splits\",\n            \"split1,split2\",\n            \"--yes\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Split(SplitCliCommand::MarkForDeletion(MarkForDeletionArgs {\n                client_args,\n                index_id,\n                split_ids,\n                assume_yes,\n            })) if client_args.cluster_endpoint == Url::from_str(\"https://quickwit-cluster.io\").unwrap()\n                && index_id == \"wikipedia\"\n                && split_ids == vec![\"split1\".to_string(), 
\"split2\".to_string()]\n                && assume_yes\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_split_describe_args() -> anyhow::Result<()> {\n        let app = build_cli().no_binary_name(true);\n        let matches = app.try_get_matches_from(vec![\n            \"split\",\n            \"describe\",\n            \"--index\",\n            \"wikipedia\",\n            \"--split\",\n            \"ABC\",\n        ])?;\n        let command = CliCommand::parse_cli_args(matches)?;\n        assert!(matches!(\n            command,\n            CliCommand::Split(SplitCliCommand::Describe(DescribeSplitArgs {\n                index_id,\n                split_id,\n                verbose: false,\n                ..\n            })) if &index_id == \"wikipedia\" && &split_id == \"ABC\"\n        ));\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_date() {\n        assert_eq!(\n            parse_date(\"2020-12-24\", \"create\").unwrap(),\n            datetime!(2020-12-24 00:00 UTC)\n        );\n        assert_eq!(\n            parse_date(\"2020-12-24 10:20\", \"create\").unwrap(),\n            datetime!(2020-12-24 10:20 UTC)\n        );\n        assert_eq!(\n            parse_date(\"2020-12-24T10:20\", \"create\").unwrap(),\n            datetime!(2020-12-24 10:20 UTC)\n        );\n        assert_eq!(\n            parse_date(\"2020-12-24 10:20:30\", \"create\").unwrap(),\n            datetime!(2020-12-24 10:20:30 UTC)\n        );\n        assert_eq!(\n            parse_date(\"2020-12-24T10:20:30\", \"create\").unwrap(),\n            datetime!(2020-12-24 10:20:30 UTC)\n        );\n    }\n\n    #[test]\n    fn test_parse_split_state() {\n        assert_eq!(parse_split_state(\"Staged\").unwrap(), SplitState::Staged);\n        assert_eq!(\n            parse_split_state(\"Published\").unwrap(),\n            SplitState::Published\n        );\n        assert_eq!(\n            parse_split_state(\"Marked\").unwrap(),\n            
SplitState::MarkedForDeletion\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/stats.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub(crate) fn mean(values: &[u64]) -> f32 {\n    assert!(!values.is_empty());\n    let sum: u64 = values.iter().sum();\n    sum as f32 / values.len() as f32\n}\n\npub(crate) fn std_deviation(values: &[u64]) -> f32 {\n    assert!(!values.is_empty());\n    let mean = mean(values);\n    let variance = values\n        .iter()\n        .map(|value| {\n            let diff = mean - (*value as f32);\n            diff * diff\n        })\n        .sum::<f32>()\n        / values.len() as f32;\n    variance.sqrt()\n}\n\n/// Return percentile of sorted values using linear interpolation.\npub(crate) fn percentile(sorted_values: &[u64], percent: usize) -> f32 {\n    assert!(!sorted_values.is_empty());\n    assert!(percent <= 100);\n    if sorted_values.len() == 1 {\n        return sorted_values[0] as f32;\n    }\n    if percent == 100 {\n        return sorted_values[sorted_values.len() - 1] as f32;\n    }\n    let length = (sorted_values.len() - 1) as f32;\n    let rank = (percent as f32 / 100f32) * length;\n    let lrank = rank.floor();\n    let d = rank - lrank;\n    let n = lrank as usize;\n    let lo = sorted_values[n] as f32;\n    let hi = sorted_values[n + 1] as f32;\n    lo + (hi - lo) * d\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/src/tool.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashSet, VecDeque};\nuse std::io::{IsTerminal, Stdout, Write, stdout};\nuse std::num::NonZeroUsize;\nuse std::path::PathBuf;\nuse std::str::FromStr;\nuse std::time::{Duration, Instant};\nuse std::{env, fmt, io};\n\nuse anyhow::{Context, bail};\nuse clap::{ArgMatches, Command, arg};\nuse colored::{ColoredString, Colorize};\nuse humantime::format_duration;\nuse quickwit_actors::{ActorExitStatus, ActorHandle, Mailbox, Universe};\nuse quickwit_cluster::{\n    ChannelTransport, Cluster, ClusterMember, FailureDetectorConfig, make_client_grpc_config,\n};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::runtimes::RuntimesConfig;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_config::{\n    CLI_SOURCE_ID, IndexerConfig, NodeConfig, SourceConfig, SourceInputFormat, SourceParams,\n    TransformConfig, VecSourceParams,\n};\nuse quickwit_index_management::{IndexService, clear_cache_directory};\nuse quickwit_indexing::IndexingPipeline;\nuse quickwit_indexing::actors::{IndexingService, MergePipeline, MergeSchedulerService};\nuse quickwit_indexing::models::{\n    DetachIndexingPipeline, DetachMergePipeline, IndexingStatistics, SpawnPipeline,\n};\nuse quickwit_ingest::IngesterPool;\nuse quickwit_metastore::IndexMetadataResponseExt;\nuse 
quickwit_proto::indexing::CpuCapacity;\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse quickwit_proto::metastore::{IndexMetadataRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_proto::search::{CountHits, SearchResponse};\nuse quickwit_proto::types::{IndexId, PipelineUid, SourceId, SplitId};\nuse quickwit_search::{SearchResponseRest, single_node_search};\nuse quickwit_serve::{\n    BodyFormat, SearchRequestQueryString, SortBy, search_request_from_api_request,\n};\nuse quickwit_storage::{BundleStorage, Storage};\nuse thousands::Separable;\nuse tracing::{debug, info};\n\nuse crate::checklist::{GREEN_COLOR, RED_COLOR};\nuse crate::{\n    THROUGHPUT_WINDOW_SIZE, config_cli_arg, get_resolvers, load_node_config, run_index_checklist,\n    start_actor_runtimes,\n};\n\npub fn build_tool_command() -> Command {\n    Command::new(\"tool\")\n        .about(\"Performs utility operations. Requires a node config.\")\n        .arg(config_cli_arg())\n        .subcommand(\n            Command::new(\"local-ingest\")\n                .display_order(10)\n                .about(\"Indexes NDJSON documents locally.\")\n                .long_about(\"Local ingest indexes locally NDJSON documents from a file or from stdin and uploads splits on the configured storage.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--\"input-path\" <INPUT_PATH> \"Location of the input file.\")\n                        .required(false),\n                    arg!(--\"input-format\" <INPUT_FORMAT> \"Format of the input data.\")\n                        .default_value(\"json\")\n                        .required(false),\n                    arg!(--overwrite \"Overwrites pre-existing index.\")\n                        .required(false),\n                    arg!(--\"transform-script\" <SCRIPT> \"VRL program to transform docs 
before ingesting.\")\n                        .required(false),\n                    arg!(--\"keep-cache\" \"Does not clear local cache directory upon completion.\")\n                        .required(false),\n                ])\n            )\n        .subcommand(\n            Command::new(\"local-search\")\n                .display_order(10)\n                .about(\"Searches an index locally.\")\n                .long_about(\"Searches an index directly on the configured storage without using a server.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--query <QUERY> \"Query expressed in natural query language ((barack AND obama) OR \\\"president of united states\\\"). Learn more on https://quickwit.io/docs/reference/search-language.\")\n                        .display_order(2)\n                        .required(true),\n                    arg!(--aggregation <AGG> \"JSON serialized aggregation request in tantivy/elasticsearch format.\")\n                        .required(false),\n                    arg!(--\"max-hits\" <MAX_HITS> \"Maximum number of hits returned.\")\n                        .default_value(\"20\")\n                        .required(false),\n                    arg!(--\"start-offset\" <OFFSET> \"Offset in the global result set of the first hit returned.\")\n                        .default_value(\"0\")\n                        .required(false),\n                    arg!(--\"search-fields\" <FIELD_NAME> \"List of fields that Quickwit will search into if the user query does not explicitly target a field in the query. It overrides the default search fields defined in the index config. Space-separated list, e.g. \\\"field1 field2\\\". 
\")\n                        .num_args(1..)\n                        .required(false),\n                    arg!(--\"snippet-fields\" <FIELD_NAME> \"List of fields that Quickwit will return snippet highlight on. Space-separated list, e.g. \\\"field1 field2\\\". \")\n                        .num_args(1..)\n                        .required(false),\n                    arg!(--\"start-timestamp\" <TIMESTAMP> \"Filters out documents before that timestamp (time-series indexes only).\")\n                        .required(false),\n                    arg!(--\"end-timestamp\" <TIMESTAMP> \"Filters out documents after that timestamp (time-series indexes only).\")\n                        .required(false),\n                    arg!(--\"sort-by-field\" <SORT_BY_FIELD> \"Sort by field.\")\n                        .required(false),\n                ])\n            )\n        .subcommand(\n            Command::new(\"extract-split\")\n                .about(\"Downloads and extracts a split to a directory.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--split <SPLIT> \"ID of the target split\")\n                        .display_order(2)\n                        .required(true),\n                    arg!(--\"target-dir\" <TARGET_DIR> \"Directory to extract the split to.\"),\n                ])\n            )\n        .subcommand(\n            Command::new(\"gc\")\n                .display_order(10)\n                .about(\"Garbage collects stale staged splits and splits marked for deletion.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--\"grace-period\" <GRACE_PERIOD> \"Threshold period after which stale staged splits are garbage collected.\")\n              
          .default_value(\"1h\")\n                        .required(false),\n                    arg!(--\"dry-run\" \"Executes the command in dry run mode and only displays the list of splits candidates for garbage collection.\")\n                        .required(false),\n                ])\n            )\n        .subcommand(\n            Command::new(\"merge\")\n                .display_order(10)\n                .about(\"Merges all the splits for a given Node ID, index ID, source ID.\")\n                .args(&[\n                    arg!(--index <INDEX> \"ID of the target index.\")\n                        .display_order(1)\n                        .required(true),\n                    arg!(--source <SOURCE_ID> \"ID of the target source.\")\n                        .display_order(2)\n                        .required(true),\n                ])\n            )\n        .arg_required_else_help(true)\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct LocalIngestDocsArgs {\n    pub config_uri: Uri,\n    pub index_id: IndexId,\n    pub input_path_opt: Option<Uri>,\n    pub input_format: SourceInputFormat,\n    pub overwrite: bool,\n    pub vrl_script: Option<String>,\n    pub clear_cache: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct LocalSearchArgs {\n    pub config_uri: Uri,\n    pub index_id: IndexId,\n    pub query: String,\n    pub aggregation: Option<String>,\n    pub max_hits: usize,\n    pub start_offset: usize,\n    pub search_fields: Option<Vec<String>>,\n    pub snippet_fields: Option<Vec<String>>,\n    pub start_timestamp: Option<i64>,\n    pub end_timestamp: Option<i64>,\n    pub sort_by_field: Option<String>,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct GarbageCollectIndexArgs {\n    pub config_uri: Uri,\n    pub index_id: IndexId,\n    pub grace_period: Duration,\n    pub dry_run: bool,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct MergeArgs {\n    pub config_uri: Uri,\n    pub index_id: IndexId,\n    pub source_id: 
SourceId,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub struct ExtractSplitArgs {\n    pub config_uri: Uri,\n    pub index_id: IndexId,\n    pub split_id: SplitId,\n    pub target_dir: PathBuf,\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub enum ToolCliCommand {\n    GarbageCollect(GarbageCollectIndexArgs),\n    LocalIngest(LocalIngestDocsArgs),\n    LocalSearch(LocalSearchArgs),\n    Merge(MergeArgs),\n    ExtractSplit(ExtractSplitArgs),\n}\n\nimpl ToolCliCommand {\n    pub fn parse_cli_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let (subcommand, submatches) = matches\n            .remove_subcommand()\n            .context(\"failed to parse tool subcommand\")?;\n        match subcommand.as_str() {\n            \"gc\" => Self::parse_garbage_collect_args(submatches),\n            \"local-ingest\" => Self::parse_local_ingest_args(submatches),\n            \"local-search\" => Self::parse_local_search_args(submatches),\n            \"merge\" => Self::parse_merge_args(submatches),\n            \"extract-split\" => Self::parse_extract_split_args(submatches),\n            _ => bail!(\"unknown tool subcommand `{subcommand}`\"),\n        }\n    }\n\n    fn parse_local_ingest_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let config_uri = matches\n            .remove_one::<String>(\"config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`config` should be a required arg.\")?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let input_path_opt = if let Some(input_path) = matches.remove_one::<String>(\"input-path\") {\n            Some(Uri::from_str(&input_path)?)\n        } else {\n            None\n        };\n\n        let input_format = matches\n            .remove_one::<String>(\"input-format\")\n            .map(|input_format| SourceInputFormat::from_str(&input_format))\n            .expect(\"`input-format` 
should have a default value.\")\n            .map_err(|err| anyhow::anyhow!(err))?;\n        let overwrite = matches.get_flag(\"overwrite\");\n        let vrl_script = matches.remove_one::<String>(\"transform-script\");\n        let clear_cache = !matches.get_flag(\"keep-cache\");\n\n        Ok(Self::LocalIngest(LocalIngestDocsArgs {\n            config_uri,\n            index_id,\n            input_path_opt,\n            input_format,\n            overwrite,\n            vrl_script,\n            clear_cache,\n        }))\n    }\n\n    fn parse_local_search_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let config_uri = matches\n            .remove_one::<String>(\"config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`config` should be a required arg.\")?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let query = matches\n            .remove_one::<String>(\"query\")\n            .context(\"`query` should be a required arg\")?;\n        let aggregation = matches.remove_one::<String>(\"aggregation\");\n        let max_hits = matches\n            .remove_one::<String>(\"max-hits\")\n            .expect(\"`max-hits` should have a default value.\")\n            .parse()?;\n        let start_offset = matches\n            .remove_one::<String>(\"start-offset\")\n            .expect(\"`start-offset` should have a default value.\")\n            .parse()?;\n        let search_fields = matches\n            .remove_many::<String>(\"search-fields\")\n            .map(|values| values.collect());\n        let snippet_fields = matches\n            .remove_many::<String>(\"snippet-fields\")\n            .map(|values| values.collect());\n        let sort_by_field = matches.remove_one::<String>(\"sort-by-field\");\n        let start_timestamp = matches\n            .remove_one::<String>(\"start-timestamp\")\n            .map(|ts| 
ts.parse())\n            .transpose()?;\n        let end_timestamp = matches\n            .remove_one::<String>(\"end-timestamp\")\n            .map(|ts| ts.parse())\n            .transpose()?;\n        Ok(Self::LocalSearch(LocalSearchArgs {\n            config_uri,\n            index_id,\n            query,\n            aggregation,\n            max_hits,\n            start_offset,\n            search_fields,\n            snippet_fields,\n            start_timestamp,\n            end_timestamp,\n            sort_by_field,\n        }))\n    }\n\n    fn parse_merge_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let config_uri = matches\n            .remove_one::<String>(\"config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`config` should be a required arg.\")?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let source_id = matches\n            .remove_one::<String>(\"source\")\n            .expect(\"`source` should be a required arg.\");\n        Ok(Self::Merge(MergeArgs {\n            index_id,\n            source_id,\n            config_uri,\n        }))\n    }\n\n    fn parse_garbage_collect_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let config_uri = matches\n            .get_one(\"config\")\n            .map(|uri_str: &String| Uri::from_str(uri_str))\n            .expect(\"`config` should be a required arg.\")?;\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let grace_period = matches\n            .get_one(\"grace-period\")\n            .map(|duration_str: &String| humantime::parse_duration(duration_str))\n            .expect(\"`grace-period` should have a default value.\")?;\n        let dry_run = matches.get_flag(\"dry-run\");\n        Ok(Self::GarbageCollect(GarbageCollectIndexArgs 
{\n            index_id,\n            grace_period,\n            dry_run,\n            config_uri,\n        }))\n    }\n\n    fn parse_extract_split_args(mut matches: ArgMatches) -> anyhow::Result<Self> {\n        let index_id = matches\n            .remove_one::<String>(\"index\")\n            .expect(\"`index` should be a required arg.\");\n        let split_id = matches\n            .remove_one::<String>(\"split\")\n            .expect(\"`split` should be a required arg.\");\n        let config_uri = matches\n            .remove_one::<String>(\"config\")\n            .map(|uri_str| Uri::from_str(&uri_str))\n            .expect(\"`config` should be a required arg.\")?;\n        let target_dir = matches\n            .remove_one::<String>(\"target-dir\")\n            .map(PathBuf::from)\n            .expect(\"`target-dir` should be a required arg.\");\n        Ok(Self::ExtractSplit(ExtractSplitArgs {\n            config_uri,\n            index_id,\n            split_id,\n            target_dir,\n        }))\n    }\n\n    pub async fn execute(self) -> anyhow::Result<()> {\n        match self {\n            Self::GarbageCollect(args) => garbage_collect_index_cli(args).await,\n            Self::LocalIngest(args) => local_ingest_docs_cli(args).await,\n            Self::LocalSearch(args) => local_search_cli(args).await,\n            Self::Merge(args) => merge_cli(args).await,\n            Self::ExtractSplit(args) => extract_split_cli(args).await,\n        }\n    }\n}\n\npub async fn local_ingest_docs_cli(args: LocalIngestDocsArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"local-ingest-docs\");\n    println!(\"❯ Ingesting documents locally...\");\n\n    let config = load_node_config(&args.config_uri).await?;\n    let (storage_resolver, metastore_resolver) =\n        get_resolvers(&config.storage_configs, &config.metastore_configs);\n    let mut metastore = metastore_resolver.resolve(&config.metastore_uri).await?;\n\n    let source_params = if let Some(uri) = 
args.input_path_opt.as_ref() {\n        SourceParams::file_from_uri(uri.clone())\n    } else {\n        SourceParams::stdin()\n    };\n    let transform_config = args\n        .vrl_script\n        .map(|vrl_script| TransformConfig::new(vrl_script, None));\n    let source_config = SourceConfig {\n        source_id: CLI_SOURCE_ID.to_string(),\n        num_pipelines: NonZeroUsize::MIN,\n        enabled: true,\n        source_params,\n        transform_config,\n        input_format: args.input_format,\n    };\n    run_index_checklist(\n        &mut metastore,\n        &storage_resolver,\n        &args.index_id,\n        Some(&source_config),\n    )\n    .await?;\n\n    if args.overwrite {\n        let mut index_service = IndexService::new(metastore.clone(), storage_resolver.clone());\n        index_service.clear_index(&args.index_id).await?;\n    }\n    // The indexing service needs to update its cluster chitchat state so that the control plane is\n    // aware of the running tasks. We thus create a fake cluster to instantiate the indexing service\n    // and avoid impacting potential control plane running on the cluster.\n    let cluster = create_empty_cluster(&config).await?;\n    let indexer_config = IndexerConfig {\n        ..Default::default()\n    };\n    let runtimes_config = RuntimesConfig::default();\n    start_actor_runtimes(\n        runtimes_config,\n        &HashSet::from_iter([QuickwitService::Indexer]),\n    )?;\n    let universe = Universe::new();\n    let merge_scheduler_service_mailbox = universe.get_or_spawn_one();\n    let indexing_server = IndexingService::new(\n        config.node_id.clone(),\n        config.data_dir_path.clone(),\n        indexer_config,\n        runtimes_config.num_threads_blocking,\n        cluster,\n        metastore,\n        None,\n        merge_scheduler_service_mailbox,\n        IngesterPool::default(),\n        storage_resolver,\n        EventBroker::default(),\n    )\n    .await?;\n    let (indexing_server_mailbox, 
indexing_server_handle) =\n        universe.spawn_builder().spawn(indexing_server);\n    let pipeline_id = indexing_server_mailbox\n        .ask_for_res(SpawnPipeline {\n            index_id: args.index_id.clone(),\n            source_config,\n            pipeline_uid: PipelineUid::random(),\n        })\n        .await?;\n    let merge_pipeline_handle = indexing_server_mailbox\n        .ask_for_res(DetachMergePipeline {\n            pipeline_id: pipeline_id.merge_pipeline_id(),\n        })\n        .await?;\n    let indexing_pipeline_handle = indexing_server_mailbox\n        .ask_for_res(DetachIndexingPipeline { pipeline_id })\n        .await?;\n\n    if args.input_path_opt.is_none() && io::stdin().is_terminal() {\n        let eof_shortcut = match env::consts::OS {\n            \"windows\" => \"CTRL+Z\",\n            _ => \"CTRL+D\",\n        };\n        println!(\n            \"Please, enter JSON documents one line at a time.\\nEnd your input using \\\n             {eof_shortcut}.\"\n        );\n    }\n    let statistics =\n        start_statistics_reporting_loop(indexing_pipeline_handle, args.input_path_opt.is_none())\n            .await?;\n    merge_pipeline_handle\n        .mailbox()\n        .ask(quickwit_indexing::FinishPendingMergesAndShutdownPipeline)\n        .await?;\n    merge_pipeline_handle.join().await;\n    // Shutdown the indexing server.\n    universe\n        .send_exit_with_success(&indexing_server_mailbox)\n        .await?;\n    indexing_server_handle.join().await;\n    universe.quit().await;\n    if statistics.num_published_splits > 0 {\n        println!(\n            \"Now, you can query the index with the following command:\\nquickwit index search \\\n             --index {} --config ./config/quickwit.yaml --query \\\"my query\\\"\",\n            args.index_id\n        );\n    }\n\n    if args.clear_cache {\n        println!(\"Clearing local cache directory...\");\n        clear_cache_directory(&config.data_dir_path).await?;\n        
println!(\"{} Local cache directory cleared.\", \"✔\".color(GREEN_COLOR));\n    }\n\n    match statistics.num_invalid_docs {\n        0 => {\n            println!(\"{} Documents successfully indexed.\", \"✔\".color(GREEN_COLOR));\n            Ok(())\n        }\n        _ => bail!(\"failed to ingest all the documents\"),\n    }\n}\n\npub async fn local_search_cli(args: LocalSearchArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"local-search\");\n    println!(\"❯ Searching directly on the index storage (without calling REST API)...\");\n    let config = load_node_config(&args.config_uri).await?;\n    let (storage_resolver, metastore_resolver) =\n        get_resolvers(&config.storage_configs, &config.metastore_configs);\n    let metastore: MetastoreServiceClient =\n        metastore_resolver.resolve(&config.metastore_uri).await?;\n    let aggs = args\n        .aggregation\n        .map(|agg_string| serde_json::from_str(&agg_string))\n        .transpose()?;\n    let sort_by: SortBy = args.sort_by_field.map(SortBy::from).unwrap_or_default();\n    let search_request_query_string = SearchRequestQueryString {\n        query: args.query,\n        start_offset: args.start_offset as u64,\n        max_hits: args.max_hits as u64,\n        search_fields: args.search_fields,\n        snippet_fields: args.snippet_fields,\n        start_timestamp: args.start_timestamp,\n        end_timestamp: args.end_timestamp,\n        aggs,\n        format: BodyFormat::Json,\n        sort_by,\n        count_all: CountHits::CountAll,\n        allow_failed_splits: false,\n    };\n    let search_request =\n        search_request_from_api_request(vec![args.index_id], search_request_query_string)?;\n    debug!(search_request=?search_request, \"search-request\");\n    let search_response: SearchResponse =\n        single_node_search(search_request, metastore, storage_resolver).await?;\n    let search_response_rest = SearchResponseRest::try_from(search_response)?;\n    let search_response_json = 
serde_json::to_string_pretty(&search_response_rest)?;\n    println!(\"{search_response_json}\");\n    Ok(())\n}\n\npub async fn merge_cli(args: MergeArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"run-merge-operations\");\n    println!(\"❯ Merging splits locally...\");\n    let config = load_node_config(&args.config_uri).await?;\n    let (storage_resolver, metastore_resolver) =\n        get_resolvers(&config.storage_configs, &config.metastore_configs);\n    let mut metastore = metastore_resolver.resolve(&config.metastore_uri).await?;\n    run_index_checklist(&mut metastore, &storage_resolver, &args.index_id, None).await?;\n    // The indexing service needs to update its cluster chitchat state so that the control plane is\n    // aware of the running tasks. We thus create a fake cluster to instantiate the indexing service\n    // and avoid impacting potential control plane running on the cluster.\n    let cluster = create_empty_cluster(&config).await?;\n    let runtimes_config = RuntimesConfig::default();\n    start_actor_runtimes(\n        runtimes_config,\n        &HashSet::from_iter([QuickwitService::Indexer]),\n    )?;\n    let indexer_config = IndexerConfig::default();\n    let universe = Universe::new();\n    let merge_scheduler_service: Mailbox<MergeSchedulerService> = universe.get_or_spawn_one();\n    let indexing_server = IndexingService::new(\n        config.node_id,\n        config.data_dir_path,\n        indexer_config,\n        runtimes_config.num_threads_blocking,\n        cluster,\n        metastore,\n        None,\n        merge_scheduler_service,\n        IngesterPool::default(),\n        storage_resolver,\n        EventBroker::default(),\n    )\n    .await?;\n    let (indexing_service_mailbox, indexing_service_handle) =\n        universe.spawn_builder().spawn(indexing_server);\n    let pipeline_id = indexing_service_mailbox\n        .ask_for_res(SpawnPipeline {\n            index_id: args.index_id,\n            source_config: SourceConfig 
{\n                source_id: args.source_id,\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::Vec(VecSourceParams::default()),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            },\n            pipeline_uid: PipelineUid::random(),\n        })\n        .await?;\n    let pipeline_handle: ActorHandle<MergePipeline> = indexing_service_mailbox\n        .ask_for_res(DetachMergePipeline {\n            pipeline_id: pipeline_id.merge_pipeline_id(),\n        })\n        .await?;\n\n    let mut check_interval = tokio::time::interval(Duration::from_secs(1));\n    loop {\n        check_interval.tick().await;\n\n        pipeline_handle.refresh_observe();\n        let observation = pipeline_handle.last_observation();\n\n        if observation.num_ongoing_merges == 0 {\n            info!(\"merge pipeline has no more ongoing merges, exiting\");\n            break;\n        }\n\n        if pipeline_handle.state().is_exit() {\n            info!(\"merge pipeline has exited, exiting\");\n            break;\n        }\n    }\n\n    let (pipeline_exit_status, _pipeline_statistics) = pipeline_handle.quit().await;\n    indexing_service_handle.quit().await;\n    if !matches!(\n        pipeline_exit_status,\n        ActorExitStatus::Success | ActorExitStatus::Quit\n    ) {\n        bail!(pipeline_exit_status);\n    }\n    println!(\"{} Merge successful.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\npub async fn garbage_collect_index_cli(args: GarbageCollectIndexArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"garbage-collect-index\");\n    println!(\"❯ Garbage collecting index...\");\n\n    let config = load_node_config(&args.config_uri).await?;\n    let (storage_resolver, metastore_resolver) =\n        get_resolvers(&config.storage_configs, &config.metastore_configs);\n    let metastore = 
metastore_resolver.resolve(&config.metastore_uri).await?;\n    let mut index_service = IndexService::new(metastore, storage_resolver);\n    let removal_info = index_service\n        .garbage_collect_index(&args.index_id, args.grace_period, args.dry_run)\n        .await?;\n    if removal_info.removed_split_entries.is_empty() && removal_info.failed_splits.is_empty() {\n        println!(\"No dangling files to garbage collect.\");\n        return Ok(());\n    }\n\n    if args.dry_run {\n        println!(\"The following files will be garbage collected.\");\n        for split_info in removal_info.removed_split_entries {\n            println!(\" - {}\", split_info.file_name.display());\n        }\n        return Ok(());\n    }\n\n    if !removal_info.failed_splits.is_empty() {\n        println!(\"The following splits were attempted to be removed, but failed.\");\n        for split_info in &removal_info.failed_splits {\n            println!(\" - {}\", split_info.split_id);\n        }\n        println!(\n            \"{} Splits were unable to be removed.\",\n            removal_info.failed_splits.len()\n        );\n    }\n\n    let deleted_bytes: u64 = removal_info\n        .removed_split_entries\n        .iter()\n        .map(|split_info| split_info.file_size_bytes.as_u64())\n        .sum();\n    println!(\n        \"{}MB of storage garbage collected.\",\n        deleted_bytes / 1_000_000\n    );\n\n    if removal_info.failed_splits.is_empty() {\n        println!(\n            \"{} Index successfully garbage collected.\",\n            \"✔\".color(GREEN_COLOR)\n        );\n    } else if removal_info.removed_split_entries.is_empty()\n        && !removal_info.failed_splits.is_empty()\n    {\n        println!(\"{} Failed to garbage collect index.\", \"✘\".color(RED_COLOR));\n    } else {\n        println!(\n            \"{} Index partially garbage collected.\",\n            \"✘\".color(RED_COLOR)\n        );\n    }\n\n    Ok(())\n}\n\nasync fn extract_split_cli(args: 
ExtractSplitArgs) -> anyhow::Result<()> {\n    debug!(args=?args, \"extract-split\");\n    println!(\"❯ Extracting split...\");\n\n    let config = load_node_config(&args.config_uri).await?;\n    let (storage_resolver, metastore_resolver) =\n        get_resolvers(&config.storage_configs, &config.metastore_configs);\n    let metastore = metastore_resolver.resolve(&config.metastore_uri).await?;\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(args.index_id))\n        .await?\n        .deserialize_index_metadata()?;\n    let index_storage = storage_resolver.resolve(index_metadata.index_uri()).await?;\n    let split_file = PathBuf::from(format!(\"{}.split\", args.split_id));\n    let split_data = index_storage.get_all(split_file.as_path()).await?;\n    let (_hotcache_bytes, bundle_storage) = BundleStorage::open_from_split_data_with_owned_bytes(\n        index_storage,\n        split_file,\n        split_data,\n    )?;\n    std::fs::create_dir_all(&args.target_dir)?;\n    for path in bundle_storage.iter_files() {\n        let mut out_path = args.target_dir.to_owned();\n        out_path.push(path);\n        println!(\"Copying {out_path:?}\");\n        bundle_storage.copy_to_file(path, &out_path).await?;\n    }\n\n    println!(\"{} Split successfully extracted.\", \"✔\".color(GREEN_COLOR));\n    Ok(())\n}\n\n/// Starts a tokio task that displays the indexing statistics\n/// every once in a while.\npub async fn start_statistics_reporting_loop(\n    pipeline_handle: ActorHandle<IndexingPipeline>,\n    is_stdin: bool,\n) -> anyhow::Result<IndexingStatistics> {\n    let mut stdout_handle = stdout();\n    let start_time = Instant::now();\n    let mut throughput_calculator = ThroughputCalculator::new(start_time);\n    let mut report_interval = tokio::time::interval(Duration::from_secs(1));\n\n    loop {\n        // TODO fixme. 
The way we wait today is a bit lame: if the indexing pipeline exits, we will\n        // still wait up to an entire heartbeat... Ideally we should select between two\n        // futures.\n        report_interval.tick().await;\n        // Try to receive with a timeout of 1 second.\n        // 1 second is also the frequency at which we update the statistics in the console.\n        pipeline_handle.refresh_observe();\n\n        let observation = pipeline_handle.last_observation();\n\n        // Let's not display live statistics to allow screen to scroll.\n        if observation.num_docs > 0 {\n            display_statistics(&mut stdout_handle, &mut throughput_calculator, &observation)?;\n        }\n\n        if pipeline_handle.state().is_exit() {\n            break;\n        }\n    }\n    let (pipeline_exit_status, pipeline_statistics) = pipeline_handle.join().await;\n    if !pipeline_exit_status.is_success() {\n        bail!(pipeline_exit_status);\n    }\n    // If we have received zero docs at this point,\n    // there is no point in displaying a report.\n    if pipeline_statistics.num_docs == 0 {\n        return Ok(pipeline_statistics);\n    }\n\n    if is_stdin {\n        display_statistics(\n            &mut stdout_handle,\n            &mut throughput_calculator,\n            &pipeline_statistics,\n        )?;\n    }\n    // Display the end-of-task report.\n    println!();\n    let secs = Duration::from_secs(start_time.elapsed().as_secs());\n    if pipeline_statistics.num_invalid_docs == 0 {\n        println!(\n            \"Indexed {} documents in {}.\",\n            pipeline_statistics.num_docs.separate_with_commas(),\n            format_duration(secs)\n        );\n    } else {\n        let num_indexed_docs = (pipeline_statistics.num_docs\n            - pipeline_statistics.num_invalid_docs)\n            .separate_with_commas();\n\n        let error_rate = (pipeline_statistics.num_invalid_docs as f64\n            / pipeline_statistics.num_docs as f64)\n            * 
100.0;\n\n        println!(\n            \"Indexed {} out of {} documents in {}. Failed to index {} document(s). {}\\n\",\n            num_indexed_docs,\n            pipeline_statistics.num_docs.separate_with_commas(),\n            format_duration(secs),\n            pipeline_statistics.num_invalid_docs.separate_with_commas(),\n            colorize_error_rate(error_rate),\n        );\n    }\n\n    Ok(pipeline_statistics)\n}\n\nfn colorize_error_rate(error_rate: f64) -> ColoredString {\n    let error_rate_message = format!(\"({error_rate:.1}% error rate)\");\n    if error_rate < 1.0 {\n        error_rate_message.yellow()\n    } else if error_rate < 5.0 {\n        error_rate_message.truecolor(255, 181, 46) //< Orange\n    } else {\n        error_rate_message.red()\n    }\n}\n\n/// A struct to print data on the standard output.\nstruct Printer<'a> {\n    pub stdout: &'a mut Stdout,\n}\n\nimpl Printer<'_> {\n    pub fn print_header(&mut self, header: &str) -> io::Result<()> {\n        write!(&mut self.stdout, \" {}\", header.bright_blue())?;\n        Ok(())\n    }\n\n    pub fn print_value(&mut self, fmt_args: fmt::Arguments) -> io::Result<()> {\n        write!(&mut self.stdout, \" {fmt_args}\")\n    }\n\n    pub fn flush(&mut self) -> io::Result<()> {\n        self.stdout.flush()\n    }\n}\n\nfn display_statistics(\n    stdout: &mut Stdout,\n    throughput_calculator: &mut ThroughputCalculator,\n    statistics: &IndexingStatistics,\n) -> anyhow::Result<()> {\n    let elapsed_duration = time::Duration::try_from(throughput_calculator.elapsed_time())?;\n    let elapsed_time = format!(\n        \"{:02}:{:02}:{:02}\",\n        elapsed_duration.whole_hours(),\n        elapsed_duration.whole_minutes() % 60,\n        elapsed_duration.whole_seconds() % 60\n    );\n    let throughput_mb_s = throughput_calculator.calculate(statistics.total_bytes_processed);\n    let mut printer = Printer { stdout };\n    printer.print_header(\"Num docs\")?;\n    
printer.print_value(format_args!(\"{:>7}\", statistics.num_docs))?;\n    printer.print_header(\"Parse errs\")?;\n    printer.print_value(format_args!(\"{:>5}\", statistics.num_invalid_docs))?;\n    printer.print_header(\"PublSplits\")?;\n    printer.print_value(format_args!(\"{:>3}\", statistics.num_published_splits))?;\n    printer.print_header(\"Input size\")?;\n    printer.print_value(format_args!(\n        \"{:>5}MB\",\n        statistics.total_bytes_processed / 1_000_000\n    ))?;\n    printer.print_header(\"Thrghput\")?;\n    printer.print_value(format_args!(\"{throughput_mb_s:>5.2}MB/s\"))?;\n    printer.print_header(\"Time\")?;\n    printer.print_value(format_args!(\"{elapsed_time}\\n\"))?;\n    printer.flush()?;\n    Ok(())\n}\n\n/// ThroughputCalculator is used to calculate throughput.\nstruct ThroughputCalculator {\n    /// Stores the time series of processed bytes value.\n    processed_bytes_values: VecDeque<(Instant, u64)>,\n    /// Store the time this calculator started\n    start_time: Instant,\n}\n\nimpl ThroughputCalculator {\n    /// Creates new instance.\n    pub fn new(start_time: Instant) -> Self {\n        let processed_bytes_values: VecDeque<(Instant, u64)> = (0..THROUGHPUT_WINDOW_SIZE)\n            .map(|_| (start_time, 0u64))\n            .collect();\n        Self {\n            processed_bytes_values,\n            start_time,\n        }\n    }\n\n    /// Calculates the throughput.\n    pub fn calculate(&mut self, current_processed_bytes: u64) -> f64 {\n        self.processed_bytes_values.pop_front();\n        let current_instant = Instant::now();\n        let (first_instant, first_processed_bytes) = *self.processed_bytes_values.front().unwrap();\n        let elapsed_time = (current_instant - first_instant).as_millis() as f64 / 1_000f64;\n        self.processed_bytes_values\n            .push_back((current_instant, current_processed_bytes));\n        (current_processed_bytes - first_processed_bytes) as f64\n            / 1_000_000f64\n      
      / elapsed_time.max(1f64)\n    }\n\n    pub fn elapsed_time(&self) -> Duration {\n        self.start_time.elapsed()\n    }\n}\n\nasync fn create_empty_cluster(config: &NodeConfig) -> anyhow::Result<Cluster> {\n    let self_node = ClusterMember {\n        node_id: config.node_id.clone(),\n        generation_id: quickwit_cluster::GenerationId::now(),\n        is_ready: false,\n        enabled_services: HashSet::new(),\n        gossip_advertise_addr: config.gossip_advertise_addr,\n        grpc_advertise_addr: config.grpc_advertise_addr,\n        indexing_tasks: Vec::new(),\n        indexing_cpu_capacity: CpuCapacity::zero(),\n        ingester_status: IngesterStatus::default(),\n        availability_zone: None,\n    };\n    let client_grpc_config = make_client_grpc_config(&config.grpc_config)?;\n    let cluster = Cluster::join(\n        config.cluster_id.clone(),\n        self_node,\n        config.gossip_advertise_addr,\n        Vec::new(),\n        config.gossip_interval,\n        FailureDetectorConfig::default(),\n        &ChannelTransport::default(),\n        client_grpc_config,\n    )\n    .await?;\n\n    Ok(cluster)\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/tests/Pipfile",
    "content": "[[source]]\nurl = \"https://pypi.org/simple\"\nverify_ssl = true\nname = \"pypi\"\n\n[packages]\nawscli-local = \"*\"\n\n[dev-packages]\n\n[requires]\npython_version = \"3.11\"\n"
  },
  {
    "path": "quickwit/quickwit-cli/tests/cli.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![recursion_limit = \"256\"]\n#![allow(clippy::bool_assert_comparison)]\n\nmod helpers;\n\nuse std::path::Path;\n\nuse anyhow::Result;\nuse clap::error::ErrorKind;\nuse helpers::{TestEnv, TestStorageType, uri_from_path};\nuse quickwit_cli::checklist::ChecklistError;\nuse quickwit_cli::cli::build_cli;\nuse quickwit_cli::index::{\n    CreateIndexArgs, DeleteIndexArgs, SearchIndexArgs, UpdateIndexArgs, create_index_cli,\n    delete_index_cli, search_index, update_index_cli,\n};\nuse quickwit_cli::tool::{\n    GarbageCollectIndexArgs, LocalIngestDocsArgs, garbage_collect_index_cli, local_ingest_docs_cli,\n};\nuse quickwit_common::fs::get_cache_directory_path;\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{CLI_SOURCE_ID, RetentionPolicy, SourceInputFormat};\nuse quickwit_metastore::{\n    ListSplitsRequestExt, MetastoreResolver, MetastoreServiceExt, MetastoreServiceStreamSplitsExt,\n    SplitMetadata, SplitState, StageSplitsRequestExt,\n};\nuse quickwit_proto::metastore::{\n    DeleteSplitsRequest, EntityKind, IndexMetadataRequest, ListSplitsRequest,\n    MarkSplitsForDeletionRequest, MetastoreError, MetastoreService, StageSplitsRequest,\n};\nuse serde_json::{Number, Value, json};\nuse tokio::time::{Duration, sleep};\n\nuse crate::helpers::{PACKAGE_BIN_NAME, create_test_env, 
upload_test_file};\n\nasync fn create_logs_index(test_env: &TestEnv) -> anyhow::Result<()> {\n    let args = CreateIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_config_uri: test_env.resource_files.index_config.clone(),\n        overwrite: false,\n        assume_yes: true,\n    };\n    create_index_cli(args).await\n}\n\nasync fn local_ingest_docs(uri: Uri, test_env: &TestEnv) -> anyhow::Result<()> {\n    let args = LocalIngestDocsArgs {\n        config_uri: test_env.resource_files.config.clone(),\n        index_id: test_env.index_id.clone(),\n        input_path_opt: Some(uri),\n        input_format: SourceInputFormat::Json,\n        overwrite: false,\n        clear_cache: true,\n        vrl_script: None,\n    };\n    local_ingest_docs_cli(args).await\n}\n\nasync fn local_ingest_log_docs(test_env: &TestEnv) -> anyhow::Result<()> {\n    local_ingest_docs(test_env.resource_files.log_docs.clone(), test_env).await\n}\n\n#[test]\nfn test_cmd_help() {\n    let cmd = build_cli();\n    let error = cmd\n        .try_get_matches_from(vec![PACKAGE_BIN_NAME, \"--help\"])\n        .unwrap_err();\n    // on `--help` clap returns an error.\n    assert_eq!(error.kind(), ErrorKind::DisplayHelp);\n}\n\n#[tokio::test]\nasync fn test_cmd_create() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-create-cmd\");\n    let test_env = create_test_env(index_id, TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    let index_metadata = test_env.index_metadata().await.unwrap();\n    assert_eq!(index_metadata.index_id(), test_env.index_id);\n\n    // Creating an existing index should fail.\n    let error = create_logs_index(&test_env).await.unwrap_err();\n    assert!(error.to_string().contains(\"already exist(s)\"),);\n}\n\n#[tokio::test]\nasync fn test_cmd_create_no_index_uri() {\n    
quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-create-cmd-no-index-uri\");\n    let test_env = create_test_env(index_id, TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n\n    let index_config_without_uri = test_env.resource_files.index_config_without_uri.clone();\n    let args = CreateIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_config_uri: index_config_without_uri,\n        overwrite: false,\n        assume_yes: true,\n    };\n\n    let response = create_index_cli(args).await;\n    response.unwrap();\n\n    let index_metadata = test_env.index_metadata().await.unwrap();\n    assert_eq!(index_metadata.index_id(), test_env.index_id);\n    assert_eq!(index_metadata.index_uri(), &test_env.index_uri);\n}\n\n#[tokio::test]\nasync fn test_cmd_create_overwrite() {\n    // Create non existing index with --overwrite.\n    let index_id = append_random_suffix(\"test-create-non-existing-index-with-overwrite\");\n    let test_env = create_test_env(index_id, TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n\n    let index_config_without_uri = test_env.resource_files.index_config_without_uri.clone();\n    let args = CreateIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_config_uri: index_config_without_uri,\n        overwrite: true,\n        assume_yes: true,\n    };\n\n    create_index_cli(args).await.unwrap();\n\n    let index_metadata = test_env.index_metadata().await.unwrap();\n    assert_eq!(index_metadata.index_id(), &test_env.index_id);\n    assert_eq!(index_metadata.index_uri(), &test_env.index_uri);\n}\n\n#[test]\nfn test_cmd_create_with_ill_formed_command() {\n    // Attempt to create with ill-formed new command.\n    let app = build_cli();\n    let error = app\n        .try_get_matches_from(vec![PACKAGE_BIN_NAME, \"index\", 
\"create\"])\n        .unwrap_err();\n    assert_eq!(error.kind(), ErrorKind::MissingRequiredArgument);\n}\n\n#[tokio::test]\nasync fn test_cmd_ingest_on_non_existing_index() {\n    let index_id = append_random_suffix(\"index-does-not-exist\");\n    let test_env = create_test_env(index_id, TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n\n    let args = LocalIngestDocsArgs {\n        config_uri: test_env.resource_files.config,\n        index_id: \"index-does-not-exist\".to_string(),\n        input_path_opt: Some(test_env.resource_files.log_docs.clone()),\n        input_format: SourceInputFormat::Json,\n        overwrite: false,\n        clear_cache: true,\n        vrl_script: None,\n    };\n\n    let error = local_ingest_docs_cli(args).await.unwrap_err();\n\n    assert_eq!(\n        error.root_cause().downcast_ref::<MetastoreError>().unwrap(),\n        &MetastoreError::NotFound(EntityKind::Index {\n            index_id: \"index-does-not-exist\".to_string()\n        })\n    );\n}\n\n#[tokio::test]\nasync fn test_ingest_docs_cli_keep_cache() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-index-keep-cache\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    let args = LocalIngestDocsArgs {\n        config_uri: test_env.resource_files.config,\n        index_id,\n        input_path_opt: Some(test_env.resource_files.log_docs.clone()),\n        input_format: SourceInputFormat::Json,\n        overwrite: false,\n        clear_cache: false,\n        vrl_script: None,\n    };\n\n    local_ingest_docs_cli(args).await.unwrap();\n    // Ensure cache directory is not empty.\n    let cache_directory_path = get_cache_directory_path(&test_env.data_dir_path);\n    
assert!(cache_directory_path.read_dir().unwrap().next().is_some());\n}\n\n#[tokio::test]\nasync fn test_ingest_docs_cli() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-index-simple\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n    let index_uid = test_env.index_metadata().await.unwrap().index_uid;\n\n    let args = LocalIngestDocsArgs {\n        config_uri: test_env.resource_files.config.clone(),\n        index_id: index_id.clone(),\n        input_path_opt: Some(test_env.resource_files.log_docs.clone()),\n        input_format: SourceInputFormat::Json,\n        overwrite: false,\n        clear_cache: true,\n        vrl_script: None,\n    };\n\n    local_ingest_docs_cli(args).await.unwrap();\n\n    let splits_metadata: Vec<SplitMetadata> = test_env\n        .metastore()\n        .await\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n        .await\n        .unwrap()\n        .collect_splits_metadata()\n        .await\n        .unwrap();\n\n    assert_eq!(splits_metadata.len(), 1);\n    assert_eq!(splits_metadata[0].num_docs, 5);\n\n    // Ensure cache directory is empty.\n    let cache_directory_path = get_cache_directory_path(&test_env.data_dir_path);\n    assert!(cache_directory_path.read_dir().unwrap().next().is_none());\n\n    let does_not_exist_uri = uri_from_path(&test_env.data_dir_path)\n        .join(\"file-does-not-exist.json\")\n        .unwrap();\n\n    // Ingesting a non-existing file should fail.\n    let args = LocalIngestDocsArgs {\n        config_uri: test_env.resource_files.config,\n        index_id: test_env.index_id,\n        input_path_opt: Some(does_not_exist_uri),\n        input_format: SourceInputFormat::Json,\n        overwrite: false,\n        clear_cache: true,\n        vrl_script: 
None,\n    };\n\n    let error = local_ingest_docs_cli(args).await.unwrap_err();\n\n    assert!(matches!(\n        error.root_cause().downcast_ref::<ChecklistError>().unwrap(),\n        ChecklistError {\n            errors\n        } if errors.len() == 1 && errors[0].0 == CLI_SOURCE_ID\n    ));\n}\n\n#[tokio::test]\nasync fn test_reingest_same_file_cli() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-index-simple\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n    let index_uid = test_env.index_metadata().await.unwrap().index_uid;\n\n    for _ in 0..2 {\n        let args = LocalIngestDocsArgs {\n            config_uri: test_env.resource_files.config.clone(),\n            index_id: index_id.clone(),\n            input_path_opt: Some(test_env.resource_files.log_docs.clone()),\n            input_format: SourceInputFormat::Json,\n            overwrite: false,\n            clear_cache: true,\n            vrl_script: None,\n        };\n\n        local_ingest_docs_cli(args).await.unwrap();\n    }\n\n    let splits_metadata: Vec<SplitMetadata> = test_env\n        .metastore()\n        .await\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n        .await\n        .unwrap()\n        .collect_splits_metadata()\n        .await\n        .unwrap();\n\n    assert_eq!(splits_metadata.len(), 1);\n    assert_eq!(splits_metadata[0].num_docs, 5);\n}\n\n/// Helper function to compare two JSON payloads.\n///\n/// Arrays and objects are compared recursively. Numbers are compared as `f64`\n/// with a relative tolerance, so values that differ only by floating-point\n/// rounding introduced by JSON (de)serialization are treated as equal.\n#[track_caller]\nfn assert_flexible_json_eq(value_json: &serde_json::Value, expected_json: &serde_json::Value) {\n    match (value_json, expected_json) {\n        
(Value::Array(left_arr), Value::Array(right_arr)) => {\n            assert_eq!(\n                left_arr.len(),\n                right_arr.len(),\n                \"left: {left_arr:?} right: {right_arr:?}\"\n            );\n            for i in 0..left_arr.len() {\n                assert_flexible_json_eq(&left_arr[i], &right_arr[i]);\n            }\n        }\n        (Value::Object(left_obj), Value::Object(right_obj)) => {\n            assert_eq!(\n                left_obj.len(),\n                right_obj.len(),\n                \"left: {left_obj:?} right: {right_obj:?}\"\n            );\n            for (k, v) in left_obj {\n                if let Some(right_value) = right_obj.get(k) {\n                    assert_flexible_json_eq(v, right_value);\n                } else {\n                    panic!(\"Missing key `{k}`\");\n                }\n            }\n        }\n        (Value::Number(left_num), Value::Number(right_num)) => {\n            let left = left_num.as_f64().unwrap();\n            let right = right_num.as_f64().unwrap();\n            assert!(\n                (left - right).abs() / (1e-32 + left + right).abs() < 1e-4,\n                \"left: {left:?} right: {right:?}\"\n            );\n        }\n        (left, right) => {\n            assert_eq!(left, right);\n        }\n    }\n}\n\n#[tokio::test]\nasync fn test_cmd_search_aggregation() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-search-cmd\");\n    let test_env = create_test_env(index_id, TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    let aggregation: Value = json!(\n    {\n      \"range_buckets\": {\n        \"range\": {\n          \"field\": \"ts\",\n          \"ranges\": [\n            { \"to\": 72057597000000000f64 },\n            { \"from\": 
72057597000000000f64, \"to\": 72057600000000000f64 },\n            { \"from\": 72057600000000000f64, \"to\": 72057604000000000f64 },\n            { \"from\": 72057604000000000f64 },\n          ]\n        },\n        \"aggs\": {\n          \"average_ts\": {\n            \"avg\": { \"field\": \"ts\" }\n          }\n        }\n      }\n    });\n\n    // search with aggregation\n    let args = SearchIndexArgs {\n        index_id: test_env.index_id.clone(),\n        query: \"paris OR tokio OR london\".to_string(),\n        aggregation: Some(serde_json::to_string(&aggregation).unwrap()),\n        max_hits: 10,\n        start_offset: 0,\n        search_fields: Some(vec![\"city\".to_string()]),\n        snippet_fields: None,\n        start_timestamp: None,\n        end_timestamp: None,\n        client_args: test_env.default_client_args(),\n        sort_by_score: false,\n    };\n    let search_response = search_index(args).await.unwrap();\n\n    let aggregation_res = search_response.aggregations.unwrap();\n    let expected_json = serde_json::json!({\n        \"range_buckets\": {\n            \"buckets\": [\n                {\n                    \"average_ts\": {\n                        \"value\": null\n                    },\n                    \"doc_count\": 0,\n                    \"key\": \"*-1972-04-13T23:59:57Z\",\n                    \"to\": 72057597000000000f64,\n                    \"to_as_string\": \"1972-04-13T23:59:57Z\"\n                },\n                {\n                    \"average_ts\": {\n                        \"value\": 72057597500000000f64\n                    },\n                    \"doc_count\": 2,\n                    \"from\": 72057597000000000f64,\n                    \"from_as_string\": \"1972-04-13T23:59:57Z\",\n                    \"key\": \"1972-04-13T23:59:57Z-1972-04-14T00:00:00Z\",\n                    \"to\": 72057600000000000f64,\n                    \"to_as_string\": \"1972-04-14T00:00:00Z\"\n                },\n                
{\n                    \"average_ts\": {\n                        \"value\": null\n                    },\n                    \"doc_count\": 0,\n                    \"from\": 72057600000000000f64,\n                    \"from_as_string\": \"1972-04-14T00:00:00Z\",\n                    \"key\": \"1972-04-14T00:00:00Z-1972-04-14T00:00:04Z\",\n                    \"to\": 72057604000000000f64,\n                    \"to_as_string\": \"1972-04-14T00:00:04Z\"\n                },\n                {\n                    \"average_ts\": {\n                        \"value\": 72057606333333330f64\n                    },\n                    \"doc_count\": 3,\n                    \"from\": 72057604000000000f64,\n                    \"from_as_string\": \"1972-04-14T00:00:04Z\",\n                    \"key\": \"1972-04-14T00:00:04Z-*\"\n                }\n            ]\n        }\n    });\n    assert_flexible_json_eq(&aggregation_res, &expected_json);\n}\n\n#[tokio::test]\nasync fn test_cmd_search_with_snippets() -> Result<()> {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-search-cmd\");\n    let test_env = create_test_env(index_id, TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    // search with snippets\n    let args = SearchIndexArgs {\n        index_id: test_env.index_id.clone(),\n        query: \"event:baz\".to_string(),\n        aggregation: None,\n        max_hits: 10,\n        start_offset: 0,\n        search_fields: None,\n        snippet_fields: Some(vec![\"event\".to_string()]),\n        start_timestamp: None,\n        end_timestamp: None,\n        client_args: test_env.default_client_args(),\n        sort_by_score: false,\n    };\n    let search_response = search_index(args).await.unwrap();\n    assert_eq!(search_response.hits.len(), 1);\n    
let hit = &search_response.hits[0];\n    assert_eq!(hit, &json!({\"event\": \"baz\", \"ts\": 72057604}));\n    assert_eq!(\n        search_response.snippets.unwrap()[0],\n        json!({\n            \"event\": [ \"<b>baz</b>\"]\n        })\n    );\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_search_index_cli() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-search-cmd\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    let create_search_args = |query: &str| SearchIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id: index_id.clone(),\n        query: query.to_string(),\n        aggregation: None,\n        max_hits: 20,\n        start_offset: 0,\n        search_fields: None,\n        snippet_fields: None,\n        start_timestamp: None,\n        end_timestamp: None,\n        sort_by_score: false,\n    };\n\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    let args = create_search_args(\"level:info\");\n\n    // search_index_cli calls search_index and prints the SearchResponse\n    let search_res = search_index(args).await.unwrap();\n    assert_eq!(search_res.num_hits, 2);\n\n    // search with tag pruning\n    let args = create_search_args(\"+level:info +city:paris\");\n\n    // search_index_cli calls search_index and prints the SearchResponse\n    let search_res = search_index(args).await.unwrap();\n    assert_eq!(search_res.num_hits, 1);\n\n    // search with tag pruning\n    let args = create_search_args(\"level:info AND city:conakry\");\n\n    // search_index_cli calls search_index and prints the SearchResponse\n    let search_res = search_index(args).await.unwrap();\n    assert_eq!(search_res.num_hits, 0);\n}\n\n#[tokio::test]\nasync fn test_cmd_update_index() {\n    
quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-update-cmd\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    // add retention policy\n    let args = UpdateIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id: index_id.clone(),\n        index_config_uri: test_env.resource_files.index_config_with_retention.clone(),\n        create: false,\n        assume_yes: true,\n    };\n    update_index_cli(args).await.unwrap();\n    let index_metadata = test_env.index_metadata().await.unwrap();\n    assert_eq!(index_metadata.index_id(), test_env.index_id);\n    assert_eq!(\n        index_metadata.index_config.retention_policy_opt,\n        Some(RetentionPolicy {\n            retention_period: String::from(\"1 week\"),\n            evaluation_schedule: String::from(\"daily\")\n        })\n    );\n\n    // remove retention policy\n    let args = UpdateIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id,\n        index_config_uri: test_env.resource_files.index_config.clone(),\n        create: false,\n        assume_yes: true,\n    };\n    update_index_cli(args).await.unwrap();\n    let index_metadata = test_env.index_metadata().await.unwrap();\n    assert_eq!(index_metadata.index_id(), test_env.index_id);\n    assert_eq!(index_metadata.index_config.retention_policy_opt, None);\n}\n\n#[tokio::test]\nasync fn test_delete_index_cli_dry_run() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-delete-cmd--dry-run\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    let 
refresh_metastore = |metastore| async {\n        // In this test we rely on the file backed metastore\n        // and the file backed metastore caches results.\n        // Therefore we need to force reading the disk to fetch updates.\n        //\n        // We do that by dropping and recreating our metastore.\n        drop(metastore);\n        MetastoreResolver::unconfigured()\n            .resolve(&test_env.metastore_uri)\n            .await\n    };\n\n    let create_delete_args = |dry_run| DeleteIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id: index_id.clone(),\n        dry_run,\n        assume_yes: true,\n    };\n\n    let mut metastore = MetastoreResolver::unconfigured()\n        .resolve(&test_env.metastore_uri)\n        .await\n        .unwrap();\n\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n    // On empty index.\n    let args = create_delete_args(true);\n\n    delete_index_cli(args).await.unwrap();\n    // On dry run index should still exist\n    let mut metastore = refresh_metastore(metastore).await.unwrap();\n    metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap();\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    // On non-empty index\n    let args = create_delete_args(true);\n\n    delete_index_cli(args).await.unwrap();\n    // On dry run index should still exist\n    let mut metastore = refresh_metastore(metastore).await.unwrap();\n    metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap();\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n}\n\n#[tokio::test]\nasync fn test_delete_index_cli() {\n    let index_id = append_random_suffix(\"test-delete-cmd\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n 
       .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    let args = DeleteIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id: index_id.clone(),\n        assume_yes: true,\n        dry_run: false,\n    };\n\n    delete_index_cli(args).await.unwrap();\n\n    assert!(test_env.index_metadata().await.is_err());\n}\n\n#[tokio::test]\nasync fn test_garbage_collect_cli_no_grace() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-gc-cmd--no-grace-period\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n    let index_uid = test_env.index_metadata().await.unwrap().index_uid;\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    let metastore = MetastoreResolver::unconfigured()\n        .resolve(&test_env.metastore_uri)\n        .await\n        .unwrap();\n\n    let refresh_metastore = |metastore| async {\n        // In this test we rely on the file backed metastore and write on\n        // a different process. 
The file backed metastore caches results.\n        // Therefore we need to force reading the disk.\n        //\n        // We do that by dropping and recreating our metastore.\n        drop(metastore);\n        MetastoreResolver::unconfigured()\n            .resolve(&test_env.metastore_uri)\n            .await\n    };\n\n    let create_gc_args = |dry_run| GarbageCollectIndexArgs {\n        config_uri: test_env.resource_files.config.clone(),\n        index_id: index_id.clone(),\n        grace_period: Duration::from_secs(3600),\n        dry_run,\n    };\n\n    let splits_metadata = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits_metadata()\n        .await\n        .unwrap();\n    assert_eq!(splits_metadata.len(), 1);\n\n    let args = create_gc_args(false);\n\n    garbage_collect_index_cli(args).await.unwrap();\n\n    // On gc splits within grace period should still exist.\n    let index_path = test_env.indexes_dir_path.join(&test_env.index_id);\n    assert_eq!(index_path.try_exists().unwrap(), true);\n\n    let split_ids = vec![splits_metadata[0].split_id().to_string()];\n    let metastore = refresh_metastore(metastore).await.unwrap();\n    let mark_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid.clone(), split_ids.clone());\n    metastore\n        .mark_splits_for_deletion(mark_for_deletion_request)\n        .await\n        .unwrap();\n\n    let args = create_gc_args(true);\n\n    garbage_collect_index_cli(args).await.unwrap();\n\n    // On `dry_run = true` splits `MarkedForDeletion` should still exist.\n    for split_id in split_ids.iter() {\n        let split_file = quickwit_common::split_file(split_id);\n        let split_filepath = index_path.join(split_file);\n        assert_eq!(split_filepath.try_exists().unwrap(), true);\n    }\n\n    let args = create_gc_args(false);\n\n    
garbage_collect_index_cli(args).await.unwrap();\n\n    // If split is `MarkedForDeletion` it should be deleted after gc run\n    for split_id in split_ids.iter() {\n        let split_file = quickwit_common::split_file(split_id);\n        let split_filepath = index_path.join(split_file);\n        assert_eq!(split_filepath.try_exists().unwrap(), false);\n    }\n\n    let metastore = refresh_metastore(metastore).await.unwrap();\n    assert_eq!(\n        metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n            .await\n            .unwrap()\n            .collect_splits_metadata()\n            .await\n            .unwrap()\n            .len(),\n        0\n    );\n\n    let args = DeleteIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id,\n        dry_run: false,\n        assume_yes: true,\n    };\n\n    delete_index_cli(args).await.unwrap();\n\n    assert_eq!(index_path.try_exists().unwrap(), false);\n}\n\n#[tokio::test]\nasync fn test_garbage_collect_index_cli() {\n    let index_id = append_random_suffix(\"test-gc-cmd\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n    let index_uid = test_env.index_metadata().await.unwrap().index_uid;\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    let refresh_metastore = |metastore| async {\n        // In this test we rely on the file backed metastore and\n        // modify it but the file backed metastore caches results.\n        // Therefore we need to force reading the disk to update split info.\n        //\n        // We do that by dropping and recreating our metastore.\n        drop(metastore);\n        MetastoreResolver::unconfigured()\n            .resolve(&test_env.metastore_uri)\n            .await\n    };\n\n    let create_gc_args = |grace_period_secs| 
GarbageCollectIndexArgs {\n        config_uri: test_env.resource_files.config.clone(),\n        index_id: index_id.clone(),\n        grace_period: Duration::from_secs(grace_period_secs),\n        dry_run: false,\n    };\n\n    let metastore = MetastoreResolver::unconfigured()\n        .resolve(&test_env.metastore_uri)\n        .await\n        .unwrap();\n\n    let splits_metadata = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits_metadata()\n        .await\n        .unwrap();\n    assert_eq!(splits_metadata.len(), 1);\n\n    let index_path = test_env.indexes_dir_path.join(&test_env.index_id);\n    let split_filename = quickwit_common::split_file(splits_metadata[0].split_id.as_str());\n    let split_path = index_path.join(&split_filename);\n    assert_eq!(split_path.try_exists().unwrap(), true);\n\n    let args = create_gc_args(3600);\n\n    garbage_collect_index_cli(args).await.unwrap();\n\n    // Split should still exist within grace period.\n    let metastore = refresh_metastore(metastore).await.unwrap();\n    let splits_metadata = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits_metadata()\n        .await\n        .unwrap();\n    assert_eq!(splits_metadata.len(), 1);\n\n    // The following steps help turn an existing published split into a staged one\n    // without deleting the files.\n    let split_metadata = splits_metadata[0].clone();\n    metastore\n        .mark_splits_for_deletion(MarkSplitsForDeletionRequest::new(\n            index_uid.clone(),\n            vec![split_metadata.split_id.to_string()],\n        ))\n        .await\n        .unwrap();\n    metastore\n        .delete_splits(DeleteSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            split_ids: splits_metadata\n                .into_iter()\n            
    .map(|split_metadata| split_metadata.split_id)\n                .collect(),\n        })\n        .await\n        .unwrap();\n    metastore\n        .stage_splits(\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap(),\n        )\n        .await\n        .unwrap();\n    assert_eq!(split_path.try_exists().unwrap(), true);\n\n    let metastore = refresh_metastore(metastore).await.unwrap();\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    assert_eq!(splits[0].split_state, SplitState::Staged);\n\n    let args = create_gc_args(3600);\n\n    garbage_collect_index_cli(args).await.unwrap();\n\n    assert_eq!(split_path.try_exists().unwrap(), true);\n    // Staged splits should still exist within grace period.\n    let metastore = refresh_metastore(metastore).await.unwrap();\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    assert_eq!(splits.len(), 1);\n    assert_eq!(splits[0].split_state, SplitState::Staged);\n\n    // Wait for grace period.\n    // TODO: edit split update timestamps and remove this sleep.\n    sleep(Duration::from_secs(2)).await;\n\n    let args = create_gc_args(1);\n\n    garbage_collect_index_cli(args).await.unwrap();\n\n    let metastore = refresh_metastore(metastore).await.unwrap();\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    // Splits should be deleted from both metastore and file system.\n    assert_eq!(splits.len(), 0);\n    assert_eq!(split_path.try_exists().unwrap(), 
false);\n}\n\n/// testing the api via cli commands\n#[tokio::test]\nasync fn test_all_local_index() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = append_random_suffix(\"test-all\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::LocalFileSystem)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    let metadata_file_exists = test_env\n        .storage\n        .exists(&Path::new(&test_env.index_id).join(\"metastore.json\"))\n        .await\n        .unwrap();\n    assert!(metadata_file_exists);\n\n    local_ingest_log_docs(&test_env).await.unwrap();\n\n    let query_response = reqwest::get(format!(\n        \"http://127.0.0.1:{}/api/v1/{}/search?query=level:info\",\n        test_env.rest_listen_port, test_env.index_id\n    ))\n    .await\n    .unwrap()\n    .text()\n    .await\n    .unwrap();\n\n    let result: Value = serde_json::from_str(&query_response).unwrap();\n    assert_eq!(result[\"num_hits\"], Value::Number(Number::from(2i64)));\n\n    let args = DeleteIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id,\n        dry_run: false,\n        assume_yes: true,\n    };\n    delete_index_cli(args).await.unwrap();\n\n    let metadata_file_exists = test_env\n        .storage\n        .exists(&Path::new(&test_env.index_id).join(\"metastore.json\"))\n        .await\n        .unwrap();\n    assert_eq!(metadata_file_exists, false);\n}\n\n/// testing the api via cli commands\n#[tokio::test]\n#[cfg_attr(not(feature = \"ci-test\"), ignore)]\nasync fn test_all_with_s3_localstack_cli() {\n    let index_id = append_random_suffix(\"test-all--cli-s3-localstack\");\n    let test_env = create_test_env(index_id.clone(), TestStorageType::S3)\n        .await\n        .unwrap();\n    test_env.start_server().await.unwrap();\n    create_logs_index(&test_env).await.unwrap();\n\n    let s3_uri = upload_test_file(\n        
test_env.storage_resolver.clone(),\n        test_env\n            .resource_files\n            .log_docs\n            .filepath()\n            .unwrap()\n            .to_path_buf(),\n        \"quickwit-integration-tests\",\n        \"sources/\",\n        &append_random_suffix(\"test-all--cli-s3-localstack\"),\n    )\n    .await;\n\n    local_ingest_docs(s3_uri, &test_env).await.unwrap();\n\n    // Cli search\n    let args = SearchIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id: index_id.clone(),\n        query: \"level:info\".to_string(),\n        aggregation: None,\n        max_hits: 20,\n        start_offset: 0,\n        search_fields: None,\n        snippet_fields: None,\n        start_timestamp: None,\n        end_timestamp: None,\n        sort_by_score: false,\n    };\n\n    let search_res = search_index(args).await.unwrap();\n    assert_eq!(search_res.num_hits, 2);\n\n    let query_response = reqwest::get(format!(\n        \"http://127.0.0.1:{}/api/v1/{}/search?query=level:info\",\n        test_env.rest_listen_port, test_env.index_id,\n    ))\n    .await\n    .unwrap()\n    .text()\n    .await\n    .unwrap();\n\n    let result: Value = serde_json::from_str(&query_response).unwrap();\n    assert_eq!(result[\"num_hits\"], Value::Number(Number::from(2i64)));\n\n    let args = DeleteIndexArgs {\n        client_args: test_env.default_client_args(),\n        index_id: index_id.clone(),\n        dry_run: false,\n        assume_yes: true,\n    };\n\n    delete_index_cli(args).await.unwrap();\n\n    assert_eq!(\n        test_env\n            .storage\n            .exists(Path::new(&test_env.index_id))\n            .await\n            .unwrap(),\n        false\n    );\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/tests/helpers.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fs;\nuse std::path::{Path, PathBuf};\nuse std::str::FromStr;\nuse std::sync::Arc;\n\nuse anyhow::Context;\nuse predicates::str;\nuse quickwit_cli::ClientArgs;\nuse quickwit_cli::service::RunCliCommand;\nuse quickwit_common::net::find_available_tcp_port;\nuse quickwit_common::test_utils::wait_for_server_ready;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_metastore::{IndexMetadata, IndexMetadataResponseExt, MetastoreResolver};\nuse quickwit_proto::metastore::{IndexMetadataRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_proto::types::IndexId;\nuse quickwit_storage::{Storage, StorageResolver};\nuse reqwest::Url;\nuse tempfile::{TempDir, tempdir};\nuse tracing::error;\n\npub const PACKAGE_BIN_NAME: &str = \"quickwit\";\n\nconst DEFAULT_INDEX_CONFIG: &str = r#\"\n    version: 0.8\n\n    index_id: #index_id\n    index_uri: #index_uri\n\n    doc_mapping:\n      field_mappings:\n        - name: ts\n          type: datetime\n          input_formats:\n            - unix_timestamp\n          output_format: unix_timestamp_secs\n          fast_precision: seconds\n          fast: true\n        - name: level\n          type: text\n          stored: false\n        - name: event\n          type: text\n        - name: device\n          type: text\n          stored: false\n          
tokenizer: raw\n        - name: city\n          type: text\n          stored: false\n          tokenizer: raw\n\n      timestamp_field: ts\n      tag_fields: [city, device]\n\n    indexing_settings:\n      resources:\n        heap_size: 50MB\n\n    search_settings:\n      default_search_fields: [event]\n\"#;\n\nconst RETENTION_CONFIG: &str = r#\"\n    retention:\n      period: 1 week\n      schedule: daily\n\"#;\n\nconst DEFAULT_QUICKWIT_CONFIG: &str = r#\"\n    version: 0.8\n    metastore_uri: #metastore_uri\n    data_dir: #data_dir\n    rest:\n        listen_port: #rest_listen_port\n    grpc_listen_port: #grpc_listen_port\n\"#;\n\nconst LOGS_JSON_DOCS: &str = r#\"{\"event\": \"foo\", \"level\": \"info\", \"ts\": 72057597, \"device\": \"rpi\", \"city\": \"tokio\"}\n{\"event\": \"bar\", \"level\": \"error\", \"ts\": 72057598, \"device\": \"rpi\", \"city\": \"paris\"}\n{\"event\": \"baz\", \"level\": \"warning\", \"ts\": 72057604, \"device\": \"fbit\", \"city\": \"london\"}\n{\"event\": \"buz\", \"level\": \"debug\", \"ts\": 72057607, \"device\": \"rpi\", \"city\": \"paris\"}\n{\"event\": \"biz\", \"level\": \"info\", \"ts\": 72057608, \"device\": \"fbit\", \"city\": \"paris\"}\"#;\n\nconst WIKI_JSON_DOCS: &str = r#\"{\"body\": \"foo\", \"title\": \"shimroy\", \"url\": \"https://wiki.com?id=10\"}\n{\"body\": \"bar\", \"title\": \"shimray\", \"url\": \"https://wiki.com?id=12\"}\n{\"body\": \"baz\", \"title\": \"preshow\", \"url\": \"https://wiki.com?id=11\"}\n{\"body\": \"buz\", \"title\": \"frederick\", \"url\": \"https://wiki.com?id=48\"}\n{\"body\": \"biz\", \"title\": \"modern\", \"url\": \"https://wiki.com?id=13\"}\n\"#;\n\npub struct TestResourceFiles {\n    pub config: Uri,\n    pub index_config: Uri,\n    pub index_config_without_uri: Uri,\n    pub index_config_with_retention: Uri,\n    pub log_docs: Uri,\n}\n\n/// Holds information about the test environment.\npub struct TestEnv {\n    /// The temporary directory of the test.\n    _temp_dir: 
TempDir,\n    /// Path of the directory where indexing directories are created.\n    pub data_dir_path: PathBuf,\n    /// Path of the directory where indexes are stored.\n    pub indexes_dir_path: PathBuf,\n    /// Resource files needed for the test.\n    pub resource_files: TestResourceFiles,\n    /// The metastore URI.\n    pub metastore_uri: Uri,\n    pub metastore_resolver: MetastoreResolver,\n\n    pub cluster_endpoint: Url,\n\n    /// The index ID.\n    pub index_id: IndexId,\n    pub index_uri: Uri,\n    pub rest_listen_port: u16,\n    pub storage_resolver: StorageResolver,\n    pub storage: Arc<dyn Storage>,\n}\n\nimpl TestEnv {\n    // For caching reasons, it's safer to always create a new instance and then make your\n    // assertions.\n    pub async fn metastore(&self) -> MetastoreServiceClient {\n        self.metastore_resolver\n            .resolve(&self.metastore_uri)\n            .await\n            .unwrap()\n    }\n\n    pub async fn index_metadata(&self) -> anyhow::Result<IndexMetadata> {\n        let index_metadata = self\n            .metastore()\n            .await\n            .index_metadata(IndexMetadataRequest::for_index_id(self.index_id.clone()))\n            .await?\n            .deserialize_index_metadata()?;\n        Ok(index_metadata)\n    }\n\n    pub async fn start_server(&self) -> anyhow::Result<()> {\n        let run_command = RunCliCommand {\n            config_uri: self.resource_files.config.clone(),\n            services: Some(QuickwitService::supported_services()),\n        };\n        tokio::spawn(async move {\n            if let Err(error) = run_command\n                .execute(quickwit_serve::do_nothing_env_filter_reload_fn())\n                .await\n            {\n                error!(err=?error, \"failed to start a quickwit server\");\n            }\n        });\n        wait_for_server_ready(([127, 0, 0, 1], self.rest_listen_port).into()).await?;\n        Ok(())\n    }\n\n    pub fn default_client_args(&self) -> ClientArgs {\n        
ClientArgs {\n            cluster_endpoint: self.cluster_endpoint.clone(),\n            ..Default::default()\n        }\n    }\n}\n\npub enum TestStorageType {\n    S3,\n    LocalFileSystem,\n}\n\npub fn uri_from_path(path: &Path) -> Uri {\n    Uri::from_str(path.to_str().unwrap()).unwrap()\n}\n\n/// Creates all necessary artifacts in a test environment.\npub async fn create_test_env(\n    index_id: IndexId,\n    storage_type: TestStorageType,\n) -> anyhow::Result<TestEnv> {\n    let temp_dir = tempdir()?;\n    let data_dir_path = temp_dir.path().join(\"data\");\n    let indexes_dir_path = data_dir_path.join(\"indexes\");\n    let resources_dir_path = temp_dir.path().join(\"resources\");\n\n    for dir_path in [&data_dir_path, &indexes_dir_path, &resources_dir_path] {\n        fs::create_dir(dir_path)?;\n    }\n\n    // TODO: refactor when we have a singleton storage resolver.\n    let metastore_uri = match storage_type {\n        TestStorageType::LocalFileSystem => {\n            Uri::from_str(&format!(\"file://{}\", indexes_dir_path.display())).unwrap()\n        }\n        TestStorageType::S3 => Uri::for_test(\"s3://quickwit-integration-tests/indexes\"),\n    };\n    let storage_resolver = StorageResolver::unconfigured();\n    let storage = storage_resolver.resolve(&metastore_uri).await?;\n    let metastore_resolver = MetastoreResolver::unconfigured();\n    let index_uri = metastore_uri.join(&index_id).unwrap();\n    let index_config_path = resources_dir_path.join(\"index_config.yaml\");\n    fs::write(\n        &index_config_path,\n        DEFAULT_INDEX_CONFIG\n            .replace(\"#index_id\", &index_id)\n            .replace(\"#index_uri\", index_uri.as_str()),\n    )?;\n    let index_config_without_uri_path = resources_dir_path.join(\"index_config_without_uri.yaml\");\n    fs::write(\n        &index_config_without_uri_path,\n        DEFAULT_INDEX_CONFIG\n            .replace(\"#index_id\", &index_id)\n            .replace(\"index_uri: #index_uri\\n\", 
\"\"),\n    )?;\n    let index_config_with_retention_path =\n        resources_dir_path.join(\"index_config_with_retention.yaml\");\n    fs::write(\n        &index_config_with_retention_path,\n        format!(\"{DEFAULT_INDEX_CONFIG}{RETENTION_CONFIG}\")\n            .replace(\"#index_id\", &index_id)\n            .replace(\"#index_uri\", index_uri.as_str()),\n    )?;\n    let node_config_path = resources_dir_path.join(\"config.yaml\");\n    let rest_listen_port = find_available_tcp_port()?;\n    let grpc_listen_port = find_available_tcp_port()?;\n    fs::write(\n        &node_config_path,\n        // A poor's man templating engine reloaded...\n        DEFAULT_QUICKWIT_CONFIG\n            .replace(\"#metastore_uri\", metastore_uri.as_str())\n            .replace(\"#data_dir\", data_dir_path.to_str().unwrap())\n            .replace(\"#rest_listen_port\", &rest_listen_port.to_string())\n            .replace(\"#grpc_listen_port\", &grpc_listen_port.to_string()),\n    )?;\n    let log_docs_path = resources_dir_path.join(\"logs.json\");\n    fs::write(&log_docs_path, LOGS_JSON_DOCS)?;\n    let wikipedia_docs_path = resources_dir_path.join(\"wikis.json\");\n    fs::write(wikipedia_docs_path, WIKI_JSON_DOCS)?;\n\n    let cluster_endpoint = Url::parse(&format!(\"http://localhost:{rest_listen_port}\"))\n        .context(\"failed to parse cluster endpoint\")?;\n\n    let resource_files = TestResourceFiles {\n        config: uri_from_path(&node_config_path),\n        index_config: uri_from_path(&index_config_path),\n        index_config_without_uri: uri_from_path(&index_config_without_uri_path),\n        index_config_with_retention: uri_from_path(&index_config_with_retention_path),\n        log_docs: uri_from_path(&log_docs_path),\n    };\n\n    Ok(TestEnv {\n        _temp_dir: temp_dir,\n        data_dir_path,\n        indexes_dir_path,\n        resource_files,\n        metastore_uri,\n        metastore_resolver,\n        cluster_endpoint,\n        index_id,\n        
index_uri,\n        rest_listen_port,\n        storage_resolver,\n        storage,\n    })\n}\n\n/// TODO: this should be part of the test env setup\npub async fn upload_test_file(\n    storage_resolver: StorageResolver,\n    local_src_path: PathBuf,\n    bucket: &str,\n    prefix: &str,\n    filename: &str,\n) -> Uri {\n    let test_data = tokio::fs::read(local_src_path).await.unwrap();\n    let src_location = format!(\"s3://{bucket}/{prefix}\");\n    let storage_uri = Uri::from_str(&src_location).unwrap();\n    let storage = storage_resolver.resolve(&storage_uri).await.unwrap();\n    storage\n        .put(&PathBuf::from(filename), Box::new(test_data))\n        .await\n        .unwrap();\n    storage_uri.join(filename).unwrap()\n}\n"
  },
  {
    "path": "quickwit/quickwit-cli/tests/prepare_tests.sh",
    "content": "#!/bin/bash\nawslocal s3 mb s3://quickwit-integration-tests && awslocal s3 rm --recursive s3://quickwit-integration-tests\n"
  },
  {
    "path": "quickwit/quickwit-cluster/Cargo.toml",
    "content": "[package]\nname = \"quickwit-cluster\"\ndescription = \"Cluster membership based on Chitchat\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbytesize = { workspace = true }\nchitchat = { workspace = true }\nfutures = { workspace = true }\nitertools = { workspace = true }\nonce_cell = { workspace = true }\npin-project = { workspace = true }\nrand = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntokio-stream = { workspace = true }\ntonic = { workspace = true }\ntracing = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[features]\ntestsuite = []\n\n[dev-dependencies]\nrand = { workspace = true }\n\nchitchat = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\n\n[package.metadata.cargo-machete]\n# used inside code generated by utoipa\nignored = [\"serde_json\"]\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/change.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::collections::btree_map::Entry;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\n\nuse chitchat::{ChitchatId, NodeState};\nuse futures::Stream;\nuse pin_project::pin_project;\nuse quickwit_common::sorted_iter::{KeyDiff, SortedByKeyIterator};\nuse quickwit_common::tower::{ClientGrpcConfig, make_channel, warmup_channel};\nuse quickwit_proto::types::NodeId;\nuse tokio::sync::mpsc;\nuse tokio_stream::wrappers::UnboundedReceiverStream;\nuse tonic::transport::Channel;\nuse tracing::{info, warn};\n\nuse crate::ClusterNode;\nuse crate::member::NodeStateExt;\n\n/// Describes a change in the cluster.\n#[derive(Debug, Clone)]\npub enum ClusterChange {\n    Add(ClusterNode),\n    Update {\n        previous: ClusterNode,\n        updated: ClusterNode,\n    },\n    Remove(ClusterNode),\n}\n\n/// A stream of cluster change events.\n#[pin_project]\npub struct ClusterChangeStream(#[pin] UnboundedReceiverStream<ClusterChange>);\n\nimpl ClusterChangeStream {\n    pub fn new_unbounded() -> (Self, mpsc::UnboundedSender<ClusterChange>) {\n        let (change_stream_tx, change_stream_rx) = mpsc::unbounded_channel();\n        (\n            Self(UnboundedReceiverStream::new(change_stream_rx)),\n            change_stream_tx,\n        )\n    }\n}\n\nimpl Stream for ClusterChangeStream {\n    type Item = ClusterChange;\n\n    
fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {\n        self.project().0.poll_next(cx)\n    }\n}\n\n/// A factory for creating cluster change streams.\npub trait ClusterChangeStreamFactory: Clone + Send + 'static {\n    fn create(&self) -> ClusterChangeStream;\n}\n\n/// Compares the digests of the previous and new sets of live nodes, identifies the changes that\n/// occurred in the cluster, and emits the corresponding events, focusing on ready nodes only.\npub(crate) async fn compute_cluster_change_events(\n    cluster_id: &str,\n    self_chitchat_id: &ChitchatId,\n    previous_nodes: &mut BTreeMap<NodeId, ClusterNode>,\n    previous_node_states: &BTreeMap<ChitchatId, NodeState>,\n    new_node_states: &BTreeMap<ChitchatId, NodeState>,\n    client_grpc_config: &ClientGrpcConfig,\n) -> Vec<ClusterChange> {\n    let mut cluster_events = Vec::new();\n\n    for key_diff in previous_node_states\n        .iter()\n        .diff_by_key(new_node_states.iter())\n    {\n        match key_diff {\n            // The node has joined the cluster.\n            KeyDiff::Added(chitchat_id, node_state) => {\n                let node_events = compute_cluster_change_events_on_added(\n                    cluster_id,\n                    self_chitchat_id,\n                    chitchat_id,\n                    node_state,\n                    previous_nodes,\n                    client_grpc_config.clone(),\n                )\n                .await;\n\n                cluster_events.extend(node_events);\n            }\n            // The node's state has changed.\n            KeyDiff::Unchanged(chitchat_id, previous_node_state, new_node_state)\n                if previous_node_state.max_version() != new_node_state.max_version() =>\n            {\n                let node_event_opt = compute_cluster_change_events_on_updated(\n                    cluster_id,\n                    self_chitchat_id,\n                    chitchat_id,\n                    
new_node_state,\n                    previous_nodes,\n                )\n                .await;\n\n                if let Some(node_event) = node_event_opt {\n                    cluster_events.push(node_event);\n                }\n            }\n            // The node's state has not changed.\n            KeyDiff::Unchanged(_chitchat_id, _previous_max_version, _new_max_version) => {}\n            // The node has left the cluster, i.e. it is considered dead by the failure detector.\n            KeyDiff::Removed(chitchat_id, _node_state) => {\n                let node_event_opt =\n                    compute_cluster_change_events_on_removed(chitchat_id, previous_nodes);\n\n                if let Some(node_event) = node_event_opt {\n                    cluster_events.push(node_event);\n                }\n            }\n        };\n    }\n    cluster_events\n}\n\nasync fn compute_cluster_change_events_on_added(\n    cluster_id: &str,\n    self_chitchat_id: &ChitchatId,\n    new_chitchat_id: &ChitchatId,\n    new_node_state: &NodeState,\n    previous_nodes: &mut BTreeMap<NodeId, ClusterNode>,\n    client_grpc_config: ClientGrpcConfig,\n) -> Vec<ClusterChange> {\n    let is_self_node = self_chitchat_id == new_chitchat_id;\n    let new_node_id: NodeId = new_chitchat_id.node_id.clone().into();\n    let maybe_previous_node_entry = previous_nodes.entry(new_node_id);\n\n    let mut events = Vec::new();\n    let mut verb = \"joined\";\n\n    if let Entry::Occupied(previous_node_entry) = maybe_previous_node_entry {\n        let previous_node_ref = previous_node_entry.get();\n\n        if previous_node_ref.chitchat_id().generation_id > new_chitchat_id.generation_id {\n            warn!(\n                node_id=%new_chitchat_id.node_id,\n                generation_id=%new_chitchat_id.generation_id,\n                \"ignoring node `{}` rejoining the cluster with a lower generation ID\",\n                new_chitchat_id.node_id\n            );\n            return events;\n     
   }\n        let previous_node = previous_node_entry.remove();\n        verb = \"rejoined\";\n\n        if previous_node.is_ready() {\n            events.push(ClusterChange::Remove(previous_node));\n        }\n    }\n    let Some(new_node) = try_new_node(\n        cluster_id,\n        new_chitchat_id,\n        new_node_state,\n        is_self_node,\n        &client_grpc_config,\n    )\n    .await\n    else {\n        return events;\n    };\n    info!(\n        node_id=%new_chitchat_id.node_id,\n        generation_id=%new_chitchat_id.generation_id,\n        \"node `{}` has {verb} the cluster\",\n        new_chitchat_id.node_id,\n    );\n    let new_node_id: NodeId = new_node.node_id().into();\n    previous_nodes.insert(new_node_id, new_node.clone());\n\n    if new_node.is_ready() {\n        info!(\n            node_id=%new_chitchat_id.node_id,\n            generation_id=%new_chitchat_id.generation_id,\n            \"node `{}` has transitioned to ready state\",\n            new_chitchat_id.node_id\n        );\n        warmup_channel(new_node.channel()).await;\n        events.push(ClusterChange::Add(new_node));\n    }\n    events\n}\n\nasync fn compute_cluster_change_events_on_updated(\n    cluster_id: &str,\n    self_chitchat_id: &ChitchatId,\n    updated_chitchat_id: &ChitchatId,\n    updated_node_state: &NodeState,\n    previous_nodes: &mut BTreeMap<NodeId, ClusterNode>,\n) -> Option<ClusterChange> {\n    let previous_node = previous_nodes.get(&updated_chitchat_id.node_id)?.clone();\n\n    if previous_node.chitchat_id().generation_id > updated_chitchat_id.generation_id {\n        warn!(\n            node_id=%updated_chitchat_id.node_id,\n            generation_id=%updated_chitchat_id.generation_id,\n            \"ignoring node `{}` update with a lower generation ID\",\n            updated_chitchat_id.node_id\n        );\n        return None;\n    }\n    let previous_channel = previous_node.channel();\n    let is_self_node = self_chitchat_id == 
updated_chitchat_id;\n    let updated_node = try_new_node_with_channel(\n        cluster_id,\n        updated_chitchat_id,\n        updated_node_state,\n        previous_channel,\n        is_self_node,\n    )?;\n    let updated_node_id: NodeId = updated_node.chitchat_id().node_id.clone().into();\n    previous_nodes.insert(updated_node_id, updated_node.clone());\n\n    if !previous_node.is_ready() && updated_node.is_ready() {\n        warmup_channel(updated_node.channel()).await;\n\n        info!(\n            node_id=%updated_chitchat_id.node_id,\n            generation_id=%updated_chitchat_id.generation_id,\n            \"node `{}` has transitioned to ready state\",\n            updated_chitchat_id.node_id\n        );\n        Some(ClusterChange::Add(updated_node))\n    } else if previous_node.is_ready() && !updated_node.is_ready() {\n        info!(\n            node_id=%updated_chitchat_id.node_id,\n            generation_id=%updated_chitchat_id.generation_id,\n            \"node `{}` has transitioned out of ready state\",\n            updated_chitchat_id.node_id\n        );\n        Some(ClusterChange::Remove(updated_node))\n    } else if previous_node.is_ready() && updated_node.is_ready() {\n        Some(ClusterChange::Update {\n            previous: previous_node,\n            updated: updated_node,\n        })\n    } else {\n        None\n    }\n}\n\nfn compute_cluster_change_events_on_removed(\n    removed_chitchat_id: &ChitchatId,\n    previous_nodes: &mut BTreeMap<NodeId, ClusterNode>,\n) -> Option<ClusterChange> {\n    let removed_node_id: NodeId = removed_chitchat_id.node_id.clone().into();\n\n    if let Entry::Occupied(previous_node_entry) = previous_nodes.entry(removed_node_id) {\n        let previous_node_ref = previous_node_entry.get();\n\n        if previous_node_ref.chitchat_id().generation_id == removed_chitchat_id.generation_id {\n            info!(\n                node_id=%removed_chitchat_id.node_id,\n                
generation_id=%removed_chitchat_id.generation_id,\n                \"node `{}` has left the cluster\",\n                removed_chitchat_id.node_id\n            );\n            let previous_node = previous_node_entry.remove();\n\n            if previous_node.is_ready() {\n                return Some(ClusterChange::Remove(previous_node));\n            }\n        }\n    };\n    None\n}\n\nfn try_new_node_with_channel(\n    cluster_id: &str,\n    chitchat_id: &ChitchatId,\n    node_state: &NodeState,\n    channel: Channel,\n    is_self_node: bool,\n) -> Option<ClusterNode> {\n    match ClusterNode::try_new(chitchat_id.clone(), node_state, channel, is_self_node) {\n        Ok(node) => Some(node),\n        Err(error) => {\n            warn!(\n                cluster_id=%cluster_id,\n                node_id=%chitchat_id.node_id,\n                error=%error,\n                \"failed to create cluster node from Chitchat node state\"\n            );\n            None\n        }\n    }\n}\n\nasync fn try_new_node(\n    cluster_id: &str,\n    chitchat_id: &ChitchatId,\n    node_state: &NodeState,\n    is_self_node: bool,\n    grpc_config: &ClientGrpcConfig,\n) -> Option<ClusterNode> {\n    match node_state.grpc_advertise_addr() {\n        Ok(socket_addr) => {\n            let channel = make_channel(socket_addr, grpc_config.clone()).await;\n            try_new_node_with_channel(cluster_id, chitchat_id, node_state, channel, is_self_node)\n        }\n        Err(error) => {\n            warn!(\n                cluster_id=%cluster_id,\n                node_id=%chitchat_id.node_id,\n                error=%error,\n                \"failed to read or parse gRPC advertise address\"\n            );\n            None\n        }\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod for_test {\n    use std::sync::{Arc, Mutex};\n\n    use tokio::sync::mpsc;\n\n    use super::*;\n\n    #[derive(Clone, Default)]\n    pub struct ClusterChangeStreamFactoryForTest {\n        
inner: Arc<Mutex<Option<mpsc::UnboundedSender<ClusterChange>>>>,\n    }\n\n    impl ClusterChangeStreamFactoryForTest {\n        pub fn change_stream_tx(&self) -> mpsc::UnboundedSender<ClusterChange> {\n            self.inner.lock().unwrap().take().unwrap()\n        }\n    }\n\n    impl ClusterChangeStreamFactory for ClusterChangeStreamFactoryForTest {\n        fn create(&self) -> ClusterChangeStream {\n            let (change_stream, change_stream_tx) = ClusterChangeStream::new_unbounded();\n            *self.inner.lock().unwrap() = Some(change_stream_tx);\n            change_stream\n        }\n    }\n}\n\n#[cfg(test)]\npub(crate) mod tests {\n    use std::collections::HashSet;\n    use std::net::SocketAddr;\n\n    use itertools::Itertools;\n    use quickwit_config::service::QuickwitService;\n    use tonic::transport::Channel;\n\n    use super::*;\n    use crate::member::{\n        ENABLED_SERVICES_KEY, GRPC_ADVERTISE_ADDR_KEY, READINESS_KEY, READINESS_VALUE_NOT_READY,\n        READINESS_VALUE_READY,\n    };\n\n    pub(crate) struct NodeStateBuilder {\n        enabled_services: HashSet<QuickwitService>,\n        grpc_advertise_addr: SocketAddr,\n        readiness: bool,\n        key_values: Vec<(String, String)>,\n    }\n\n    impl Default for NodeStateBuilder {\n        fn default() -> Self {\n            Self {\n                enabled_services: QuickwitService::supported_services(),\n                grpc_advertise_addr: \"127.0.0.1:7281\".parse().unwrap(),\n                readiness: false,\n                key_values: Vec::new(),\n            }\n        }\n    }\n\n    impl NodeStateBuilder {\n        pub(crate) fn with_grpc_advertise_addr(mut self, grpc_advertise_addr: SocketAddr) -> Self {\n            self.grpc_advertise_addr = grpc_advertise_addr;\n            self\n        }\n\n        pub(crate) fn with_readiness(mut self, readiness: bool) -> Self {\n            self.readiness = readiness;\n            self\n        }\n\n        pub(crate) fn 
with_key_value(mut self, key: &str, value: &str) -> Self {\n            self.key_values.push((key.to_string(), value.to_string()));\n            self\n        }\n\n        pub(crate) fn build(self) -> NodeState {\n            let mut node_state = NodeState::for_test();\n\n            node_state.set(\n                ENABLED_SERVICES_KEY,\n                self.enabled_services\n                    .iter()\n                    .map(|service| service.as_str())\n                    .join(\",\"),\n            );\n            node_state.set(\n                GRPC_ADVERTISE_ADDR_KEY,\n                self.grpc_advertise_addr.to_string(),\n            );\n            node_state.set(\n                READINESS_KEY,\n                if self.readiness {\n                    READINESS_VALUE_READY\n                } else {\n                    READINESS_VALUE_NOT_READY\n                },\n            );\n            for (key, value) in self.key_values {\n                node_state.set(key, value);\n            }\n            node_state\n        }\n    }\n\n    #[tokio::test]\n    async fn test_compute_cluster_change_events_on_added() {\n        let cluster_id = \"test-cluster\".to_string();\n        let self_port = 1234;\n        let self_chitchat_id = ChitchatId::for_local_test(self_port);\n        {\n            // New node joins the cluster with an invalid gRPC advertise address.\n            let port = 1235;\n            let new_chitchat_id = ChitchatId::for_local_test(port);\n            let mut new_node_state = NodeStateBuilder::default().build();\n            new_node_state.set(GRPC_ADVERTISE_ADDR_KEY, \"bogus-grpc-advertise-addr\");\n            let mut previous_nodes = BTreeMap::new();\n\n            let events = compute_cluster_change_events_on_added(\n                &cluster_id,\n                &self_chitchat_id,\n                &new_chitchat_id,\n                &new_node_state,\n                &mut previous_nodes,\n                Default::default(),\n         
   )\n            .await;\n            assert!(events.is_empty());\n            assert!(previous_nodes.is_empty());\n        }\n        {\n            // New node joins the cluster but is not ready.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let new_chitchat_id = ChitchatId::for_local_test(port);\n            let new_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(false)\n                .build();\n            let mut previous_nodes = BTreeMap::new();\n\n            let events = compute_cluster_change_events_on_added(\n                &cluster_id,\n                &self_chitchat_id,\n                &new_chitchat_id,\n                &new_node_state,\n                &mut previous_nodes,\n                Default::default(),\n            )\n            .await;\n            assert!(events.is_empty());\n\n            let node = previous_nodes.get(&new_chitchat_id.node_id).unwrap();\n\n            assert_eq!(node.chitchat_id(), &new_chitchat_id);\n            assert_eq!(node.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(!node.is_self_node());\n            assert!(!node.is_ready());\n        }\n        {\n            // New node joins the cluster and is ready.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let new_chitchat_id = ChitchatId::for_local_test(port);\n            let new_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let mut previous_nodes = BTreeMap::new();\n\n            let events = compute_cluster_change_events_on_added(\n                &cluster_id,\n                &self_chitchat_id,\n                &new_chitchat_id,\n                
&new_node_state,\n                &mut previous_nodes,\n                Default::default(),\n            )\n            .await;\n\n            let ClusterChange::Add(node) = &events[0] else {\n                panic!(\"expected `ClusterChange::Add` event, got `{:?}`\", events[0]);\n            };\n            assert_eq!(node.chitchat_id(), &new_chitchat_id);\n            assert_eq!(node.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(!node.is_self_node());\n            assert!(node.is_ready());\n            assert_eq!(previous_nodes.get(&new_chitchat_id.node_id).unwrap(), node);\n\n            // Node rejoins with same node ID but newer generation ID.\n            let mut rejoined_chitchat_id = ChitchatId::for_local_test(port);\n            rejoined_chitchat_id.generation_id += 1;\n\n            let events = compute_cluster_change_events_on_added(\n                &cluster_id,\n                &self_chitchat_id,\n                &rejoined_chitchat_id,\n                &new_node_state,\n                &mut previous_nodes,\n                Default::default(),\n            )\n            .await;\n            assert_eq!(events.len(), 2);\n\n            let ClusterChange::Remove(removed_node) = &events[0] else {\n                panic!(\n                    \"expected `ClusterChange::Remove` event, got `{:?}`\",\n                    events[0]\n                );\n            };\n            assert_eq!(removed_node.chitchat_id(), &new_chitchat_id);\n\n            let ClusterChange::Add(rejoined_node) = &events[1] else {\n                panic!(\"expected `ClusterChange::Add` event, got `{:?}`\", events[1]);\n            };\n            assert_eq!(rejoined_node.chitchat_id(), &rejoined_chitchat_id);\n            assert_eq!(\n                previous_nodes.get(&rejoined_chitchat_id.node_id).unwrap(),\n                rejoined_node\n            );\n\n            // Node comes back from the dead with an older generation ID.\n            let events = 
compute_cluster_change_events_on_added(\n                &cluster_id,\n                &self_chitchat_id,\n                &new_chitchat_id,\n                &new_node_state,\n                &mut previous_nodes,\n                Default::default(),\n            )\n            .await;\n            assert!(events.is_empty());\n            assert_eq!(\n                previous_nodes.get(&rejoined_chitchat_id.node_id).unwrap(),\n                rejoined_node\n            );\n        }\n        {\n            // Self node joins the cluster and is ready.\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], self_port + 1).into();\n            let new_chitchat_id = self_chitchat_id.clone();\n            let new_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let mut previous_nodes = BTreeMap::new();\n\n            let events = compute_cluster_change_events_on_added(\n                &cluster_id,\n                &self_chitchat_id,\n                &new_chitchat_id,\n                &new_node_state,\n                &mut previous_nodes,\n                Default::default(),\n            )\n            .await;\n            assert_eq!(events.len(), 1);\n\n            let ClusterChange::Add(node) = &events[0] else {\n                panic!(\"expected `ClusterChange::Add` event, got `{:?}`\", events[0]);\n            };\n            assert_eq!(node.chitchat_id(), &new_chitchat_id);\n            assert_eq!(node.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(node.is_self_node());\n            assert!(node.is_ready());\n            assert_eq!(previous_nodes.get(&new_chitchat_id.node_id).unwrap(), node);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_compute_cluster_change_events_on_updated() {\n        let cluster_id = \"test-cluster\".to_string();\n        let self_port = 1234;\n        let 
self_chitchat_id = ChitchatId::for_local_test(self_port);\n        {\n            // Node becomes ready.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let updated_chitchat_id = ChitchatId::for_local_test(port);\n            let updated_node_id: NodeId = updated_chitchat_id.node_id.clone().into();\n            let previous_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(false)\n                .build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                updated_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n            )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(updated_node_id, previous_node)]);\n\n            let updated_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .with_key_value(\"my-key\", \"my-value\")\n                .build();\n            let event = compute_cluster_change_events_on_updated(\n                &cluster_id,\n                &self_chitchat_id,\n                &updated_chitchat_id,\n                &updated_node_state,\n                &mut previous_nodes,\n            )\n            .await\n            .unwrap();\n            let ClusterChange::Add(node) = event else {\n                panic!(\"expected `ClusterChange::Add` event, got `{event:?}`\");\n            };\n            assert_eq!(node.chitchat_id(), &updated_chitchat_id);\n            assert_eq!(node.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(node.is_ready());\n            
assert!(!node.is_self_node());\n            assert_eq!(\n                previous_nodes.get(&updated_chitchat_id.node_id).unwrap(),\n                &node\n            );\n        }\n        {\n            // Node changes.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let updated_chitchat_id = ChitchatId::for_local_test(port);\n            let updated_node_id: NodeId = updated_chitchat_id.node_id.clone().into();\n            let previous_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                updated_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n            )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(updated_node_id, previous_node)]);\n\n            let updated_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .with_key_value(\"my-key\", \"my-value\")\n                .build();\n            let event = compute_cluster_change_events_on_updated(\n                &cluster_id,\n                &self_chitchat_id,\n                &updated_chitchat_id,\n                &updated_node_state,\n                &mut previous_nodes,\n            )\n            .await\n            .unwrap();\n\n            let ClusterChange::Update { updated, .. 
} = event else {\n                panic!(\"expected `ClusterChange::Update` event, got `{event:?}`\");\n            };\n            assert_eq!(updated.chitchat_id(), &updated_chitchat_id);\n            assert_eq!(updated.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(!updated.is_self_node());\n            assert!(updated.is_ready());\n            assert_eq!(\n                previous_nodes.get(&updated_chitchat_id.node_id).unwrap(),\n                &updated\n            );\n        }\n        {\n            // Node is no longer ready.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let updated_chitchat_id = ChitchatId::for_local_test(port);\n            let updated_node_id: NodeId = updated_chitchat_id.node_id.clone().into();\n            let previous_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                updated_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n            )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(updated_node_id, previous_node)]);\n\n            let updated_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(false)\n                .with_key_value(\"my-key\", \"my-value\")\n                .build();\n            let event = compute_cluster_change_events_on_updated(\n                &cluster_id,\n                &self_chitchat_id,\n                &updated_chitchat_id,\n                &updated_node_state,\n                
&mut previous_nodes,\n            )\n            .await\n            .unwrap();\n            let ClusterChange::Remove(node) = event else {\n                panic!(\"expected `ClusterChange::Remove` event, got `{event:?}`\");\n            };\n            assert_eq!(node.chitchat_id(), &updated_chitchat_id);\n            assert_eq!(node.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(!node.is_self_node());\n            assert!(!node.is_ready());\n            assert_eq!(\n                previous_nodes.get(&updated_chitchat_id.node_id).unwrap(),\n                &node\n            );\n        }\n        {\n            // Ignore node update with a lower generation ID.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let updated_chitchat_id = ChitchatId::for_local_test(port);\n            let updated_node_id: NodeId = updated_chitchat_id.node_id.clone().into();\n            let mut previous_chitchat_id = updated_chitchat_id.clone();\n            previous_chitchat_id.generation_id += 1;\n            let previous_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                previous_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n            )\n            .unwrap();\n            let mut previous_nodes =\n                BTreeMap::from_iter([(updated_node_id, previous_node.clone())]);\n\n            let updated_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(false)\n                
.with_key_value(\"my-key\", \"my-value\")\n                .build();\n            let event_opt = compute_cluster_change_events_on_updated(\n                &cluster_id,\n                &self_chitchat_id,\n                &updated_chitchat_id,\n                &updated_node_state,\n                &mut previous_nodes,\n            )\n            .await;\n            assert!(event_opt.is_none());\n\n            assert_eq!(\n                previous_nodes.get(&updated_chitchat_id.node_id).unwrap(),\n                &previous_node\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_compute_cluster_change_events_on_removed() {\n        {\n            // Node leaves the cluster but it's missing from the previous live nodes.\n            let port = 1235;\n            let removed_chitchat_id = ChitchatId::for_local_test(port);\n            let mut previous_nodes = BTreeMap::default();\n\n            let event_opt =\n                compute_cluster_change_events_on_removed(&removed_chitchat_id, &mut previous_nodes);\n            assert!(event_opt.is_none());\n        }\n        {\n            // Node leaves the cluster in not ready state.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let removed_chitchat_id = ChitchatId::for_local_test(port);\n            let removed_node_id: NodeId = removed_chitchat_id.node_id.clone().into();\n            let previous_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(false)\n                .build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                removed_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n      
      )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(removed_node_id, previous_node)]);\n\n            let event_opt =\n                compute_cluster_change_events_on_removed(&removed_chitchat_id, &mut previous_nodes);\n            assert!(event_opt.is_none());\n            assert!(!previous_nodes.contains_key(&removed_chitchat_id.node_id));\n        }\n        {\n            // Node leaves the cluster in ready state.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let removed_chitchat_id = ChitchatId::for_local_test(port);\n            let removed_node_id: NodeId = removed_chitchat_id.node_id.clone().into();\n            let removed_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let removed_node = ClusterNode::try_new(\n                removed_chitchat_id.clone(),\n                &removed_node_state,\n                channel,\n                false,\n            )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(removed_node_id.clone(), removed_node)]);\n\n            let event =\n                compute_cluster_change_events_on_removed(&removed_chitchat_id, &mut previous_nodes)\n                    .unwrap();\n\n            let ClusterChange::Remove(node) = event else {\n                panic!(\"expected `ClusterChange::Remove` event, got `{event:?}`\");\n            };\n            assert_eq!(node.chitchat_id(), &removed_chitchat_id);\n            assert_eq!(node.grpc_advertise_addr(), grpc_advertise_addr);\n            assert!(!node.is_self_node());\n            assert!(node.is_ready());\n            assert!(!previous_nodes.contains_key(&removed_chitchat_id.node_id));\n        }\n 
       {\n            // Node leaves the cluster in ready state but in the meantime it has rejoined the\n            // cluster with a newer generation ID.\n            let port = 1235;\n            let grpc_advertise_addr: SocketAddr = ([127, 0, 0, 1], port + 1).into();\n            let removed_chitchat_id = ChitchatId::for_local_test(port);\n\n            let mut rejoined_chitchat_id = removed_chitchat_id.clone();\n            rejoined_chitchat_id.generation_id += 1;\n            let rejoined_node_id: NodeId = rejoined_chitchat_id.node_id.clone().into();\n            let rejoined_node_state = NodeStateBuilder::default()\n                .with_grpc_advertise_addr(grpc_advertise_addr)\n                .with_readiness(true)\n                .build();\n            let channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let rejoined_node = ClusterNode::try_new(\n                rejoined_chitchat_id.clone(),\n                &rejoined_node_state,\n                channel,\n                false,\n            )\n            .unwrap();\n            let mut previous_nodes =\n                BTreeMap::from_iter([(rejoined_node_id.clone(), rejoined_node.clone())]);\n\n            let event_opt =\n                compute_cluster_change_events_on_removed(&removed_chitchat_id, &mut previous_nodes);\n            assert!(event_opt.is_none());\n            assert_eq!(\n                previous_nodes.get(&rejoined_node_id).unwrap(),\n                &rejoined_node\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_compute_cluster_change_events() {\n        let cluster_id = \"test-cluster\".to_string();\n        let self_port = 1234;\n        let self_chitchat_id = ChitchatId::for_local_test(self_port);\n        let self_node_id: NodeId = self_chitchat_id.node_id.clone().into();\n        {\n            let mut previous_nodes = BTreeMap::default();\n            let previous_node_states = BTreeMap::default();\n            
let new_node_states = BTreeMap::default();\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                &mut previous_nodes,\n                &previous_node_states,\n                &new_node_states,\n                &Default::default(),\n            )\n            .await;\n            assert!(events.is_empty());\n        }\n        {\n            // Node remained unchanged.\n            let previous_node_state = NodeStateBuilder::default().with_readiness(true).build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                self_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n            )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(self_node_id.clone(), previous_node)]);\n            let previous_node_states =\n                BTreeMap::from_iter([(self_chitchat_id.clone(), previous_node_state)]);\n\n            let new_node_state = NodeStateBuilder::default().with_readiness(true).build();\n            let new_node_states = BTreeMap::from_iter([(self_chitchat_id.clone(), new_node_state)]);\n\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                &mut previous_nodes,\n                &previous_node_states,\n                &new_node_states,\n                &Default::default(),\n            )\n            .await;\n            assert!(events.is_empty());\n        }\n        {\n            // Node joins the cluster.\n            let mut previous_nodes = BTreeMap::default();\n            let previous_node_states = BTreeMap::default();\n            let new_chitchat_id = ChitchatId::for_local_test(self_port + 1);\n            let 
new_node_state = NodeStateBuilder::default().with_readiness(true).build();\n            let new_node_states = BTreeMap::from_iter([(new_chitchat_id, new_node_state)]);\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                &mut previous_nodes,\n                &previous_node_states,\n                &new_node_states,\n                &Default::default(),\n            )\n            .await;\n            assert_eq!(events.len(), 1);\n\n            let ClusterChange::Add(_node) = events[0].clone() else {\n                panic!(\"expected `ClusterChange::Add` event, got `{:?}`\", events[0]);\n            };\n\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                &mut previous_nodes,\n                &new_node_states,\n                &new_node_states,\n                &Default::default(),\n            )\n            .await;\n            assert_eq!(events.len(), 0);\n        }\n        {\n            // Node changes.\n            let previous_node_state = NodeStateBuilder::default().with_readiness(true).build();\n            let previous_channel = Channel::from_static(\"http://127.0.0.1:12345/\").connect_lazy();\n            let is_self_node = true;\n            let previous_node = ClusterNode::try_new(\n                self_chitchat_id.clone(),\n                &previous_node_state,\n                previous_channel,\n                is_self_node,\n            )\n            .unwrap();\n            let mut previous_nodes = BTreeMap::from_iter([(self_node_id, previous_node)]);\n            let previous_node_states =\n                BTreeMap::from_iter([(self_chitchat_id.clone(), previous_node_state)]);\n\n            let new_node_state = NodeStateBuilder::default()\n                .with_readiness(true)\n                .with_key_value(\"my-key\", \"my-value\")\n                .build();\n            
let new_node_states = BTreeMap::from_iter([(self_chitchat_id.clone(), new_node_state)]);\n\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                &mut previous_nodes,\n                &previous_node_states,\n                &new_node_states,\n                &Default::default(),\n            )\n            .await;\n            assert_eq!(events.len(), 1);\n\n            let ClusterChange::Update { .. } = events[0].clone() else {\n                panic!(\n                    \"expected `ClusterChange::Update` event, got `{:?}`\",\n                    events[0]\n                );\n            };\n\n            // Node leaves the cluster.\n            let new_node_states = BTreeMap::default();\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                &mut previous_nodes,\n                &previous_node_states,\n                &new_node_states,\n                &Default::default(),\n            )\n            .await;\n            assert_eq!(events.len(), 1);\n\n            let ClusterChange::Remove(_node) = events[0].clone() else {\n                panic!(\n                    \"expected `ClusterChange::Remove` event, got `{:?}`\",\n                    events[0]\n                );\n            };\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/cluster.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, HashMap, HashSet};\nuse std::fmt::{Debug, Display};\nuse std::net::SocketAddr;\nuse std::str::FromStr;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse chitchat::transport::Transport;\nuse chitchat::{\n    Chitchat, ChitchatConfig, ChitchatHandle, ChitchatId, ClusterStateSnapshot,\n    FailureDetectorConfig, KeyChangeEvent, ListenerHandle, NodeState, spawn_chitchat,\n};\nuse itertools::Itertools;\nuse quickwit_common::tower::ClientGrpcConfig;\nuse quickwit_proto::indexing::{IndexingPipelineId, IndexingTask, PipelineMetrics};\nuse quickwit_proto::types::{NodeId, NodeIdRef, PipelineUid, ShardId};\nuse serde::{Deserialize, Serialize};\nuse tokio::sync::{Mutex, RwLock, mpsc, watch};\nuse tokio::time::timeout;\nuse tokio_stream::StreamExt;\nuse tokio_stream::wrappers::WatchStream;\nuse tracing::{info, warn};\n\nuse crate::change::{ClusterChange, ClusterChangeStreamFactory, compute_cluster_change_events};\nuse crate::grpc_gossip::spawn_catchup_callback_task;\nuse crate::member::{\n    AVAILABILITY_ZONE_KEY, ClusterMember, ENABLED_SERVICES_KEY, GRPC_ADVERTISE_ADDR_KEY,\n    NodeStateExt, PIPELINE_METRICS_PREFIX, READINESS_KEY, READINESS_VALUE_NOT_READY,\n    READINESS_VALUE_READY, build_cluster_member,\n};\nuse crate::metrics::spawn_metrics_task;\nuse crate::{ClusterChangeStream, 
ClusterNode};\n\nconst MARKED_FOR_DELETION_GRACE_PERIOD: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(2_500) // 2.5 secs\n} else {\n    Duration::from_secs(3_600 * 2) // 2 hours.\n};\n\n// An indexing task key is formatted as\n// `{INDEXING_TASK_PREFIX}{PIPELINE_ULID}`.\nconst INDEXING_TASK_PREFIX: &str = \"indexer.task:\";\n\n#[derive(Clone)]\npub struct Cluster {\n    cluster_id: String,\n    self_chitchat_id: ChitchatId,\n    /// Socket address (UDP) the node listens on for receiving gossip messages.\n    pub gossip_listen_addr: SocketAddr,\n    // TODO this object contains a tls config. We might want to change it to a\n    // ArcSwap<ClientGrpcConfig> or something so that some task can watch for new certificates\n    // and update this (hot reloading)\n    client_grpc_config: ClientGrpcConfig,\n    gossip_interval: Duration,\n    inner: Arc<RwLock<InnerCluster>>,\n}\n\nimpl Debug for Cluster {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        formatter\n            .debug_struct(\"Cluster\")\n            .field(\"cluster_id\", &self.cluster_id)\n            .field(\"self_node_id\", &self.self_chitchat_id.node_id)\n            .field(\"gossip_listen_addr\", &self.gossip_listen_addr)\n            .field(\n                \"gossip_advertise_addr\",\n                &self.self_chitchat_id.gossip_advertise_addr,\n            )\n            .field(\"gossip_interval\", &self.gossip_interval)\n            .finish()\n    }\n}\n\nimpl Cluster {\n    pub fn cluster_id(&self) -> &str {\n        &self.cluster_id\n    }\n\n    pub fn self_chitchat_id(&self) -> &ChitchatId {\n        &self.self_chitchat_id\n    }\n\n    pub fn self_node_id(&self) -> &NodeIdRef {\n        NodeIdRef::from_str(&self.self_chitchat_id.node_id)\n    }\n\n    pub fn gossip_listen_addr(&self) -> SocketAddr {\n        self.gossip_listen_addr\n    }\n\n    pub fn gossip_advertise_addr(&self) -> SocketAddr {\n        
self.self_chitchat_id.gossip_advertise_addr\n    }\n\n    #[allow(clippy::too_many_arguments)]\n    pub async fn join(\n        cluster_id: String,\n        self_node: ClusterMember,\n        gossip_listen_addr: SocketAddr,\n        peer_seed_addrs: Vec<String>,\n        gossip_interval: Duration,\n        failure_detector_config: FailureDetectorConfig,\n        transport: &dyn Transport,\n        client_grpc_config: ClientGrpcConfig,\n    ) -> anyhow::Result<Self> {\n        info!(\n            cluster_id=%cluster_id,\n            node_id=%self_node.node_id,\n            generation_id=self_node.generation_id.as_u64(),\n            enabled_services=?self_node.enabled_services,\n            gossip_listen_addr=%gossip_listen_addr,\n            gossip_advertise_addr=%self_node.gossip_advertise_addr,\n            grpc_advertise_addr=%self_node.grpc_advertise_addr,\n            peer_seed_addrs=%peer_seed_addrs.join(\", \"),\n            \"joining cluster\"\n        );\n        // Set up catchup callback and extra liveness predicate functions.\n        let (catchup_callback_tx, catchup_callback_rx) = watch::channel(());\n        let catchup_callback = move || {\n            let _ = catchup_callback_tx.send(());\n        };\n        let extra_liveness_predicate = |node_state: &NodeState| {\n            [ENABLED_SERVICES_KEY, GRPC_ADVERTISE_ADDR_KEY]\n                .iter()\n                .all(|key| node_state.contains_key(key))\n        };\n        let chitchat_config = ChitchatConfig {\n            cluster_id: cluster_id.clone(),\n            chitchat_id: self_node.chitchat_id(),\n            listen_addr: gossip_listen_addr,\n            seed_nodes: peer_seed_addrs,\n            failure_detector_config,\n            gossip_interval,\n            marked_for_deletion_grace_period: MARKED_FOR_DELETION_GRACE_PERIOD,\n            catchup_callback: Some(Box::new(catchup_callback)),\n            extra_liveness_predicate: Some(Box::new(extra_liveness_predicate)),\n        
};\n        let mut initial_key_values = vec![\n            (\n                ENABLED_SERVICES_KEY.to_string(),\n                self_node.enabled_services.iter().join(\",\"),\n            ),\n            (\n                GRPC_ADVERTISE_ADDR_KEY.to_string(),\n                self_node.grpc_advertise_addr.to_string(),\n            ),\n            (\n                READINESS_KEY.to_string(),\n                READINESS_VALUE_NOT_READY.to_string(),\n            ),\n        ];\n\n        if let Some(az) = &self_node.availability_zone {\n            initial_key_values.push((AVAILABILITY_ZONE_KEY.to_string(), az.clone()));\n        }\n        let chitchat_handle =\n            spawn_chitchat(chitchat_config, initial_key_values, transport).await?;\n\n        let chitchat = chitchat_handle.chitchat();\n        let chitchat_guard = chitchat.lock().await;\n        let live_nodes_rx = chitchat_guard.live_nodes_watcher();\n        let live_nodes_stream = chitchat_guard.live_nodes_watch_stream();\n        let (ready_members_tx, ready_members_rx) = watch::channel(Vec::new());\n        spawn_ready_members_task(cluster_id.clone(), live_nodes_stream, ready_members_tx);\n        drop(chitchat_guard);\n\n        let weak_chitchat = Arc::downgrade(&chitchat);\n        spawn_metrics_task(weak_chitchat.clone(), self_node.chitchat_id());\n\n        spawn_catchup_callback_task(\n            cluster_id.clone(),\n            self_node.chitchat_id(),\n            weak_chitchat,\n            live_nodes_rx,\n            catchup_callback_rx.clone(),\n            client_grpc_config.clone(),\n        )\n        .await;\n\n        let inner = InnerCluster {\n            cluster_id: cluster_id.clone(),\n            self_chitchat_id: self_node.chitchat_id(),\n            chitchat_handle,\n            live_nodes: BTreeMap::new(),\n            change_stream_subscribers: Vec::new(),\n            ready_members_rx,\n        };\n        let cluster = Cluster {\n            cluster_id,\n            
self_chitchat_id: self_node.chitchat_id(),\n            gossip_listen_addr,\n            gossip_interval,\n            inner: Arc::new(RwLock::new(inner)),\n            client_grpc_config,\n        };\n        spawn_change_stream_task(cluster.clone()).await;\n        Ok(cluster)\n    }\n\n    /// Deprecated: this is going away soon.\n    pub async fn ready_members(&self) -> Vec<ClusterMember> {\n        self.inner.read().await.ready_members_rx.borrow().clone()\n    }\n\n    /// Deprecated: this is going away soon.\n    async fn ready_members_watcher(&self) -> WatchStream<Vec<ClusterMember>> {\n        WatchStream::new(self.inner.read().await.ready_members_rx.clone())\n    }\n\n    /// Returns the live nodes that are currently ready.\n    pub async fn ready_nodes(&self) -> Vec<ClusterNode> {\n        // A read lock is sufficient here: the live nodes are only inspected, never mutated.\n        self.inner\n            .read()\n            .await\n            .live_nodes\n            .values()\n            .filter(|node| node.is_ready())\n            .cloned()\n            .collect()\n    }\n\n    /// Returns a stream of changes affecting the set of ready nodes in the cluster.\n    pub fn change_stream(&self) -> ClusterChangeStream {\n        let (change_stream, change_stream_tx) = ClusterChangeStream::new_unbounded();\n        let inner = self.inner.clone();\n        // We spawn a task so the signature of this function is sync.\n        let future = async move {\n            let mut inner = inner.write().await;\n            for node in inner.live_nodes.values() {\n                if node.is_ready() {\n                    change_stream_tx\n                        .send(ClusterChange::Add(node.clone()))\n                        .expect(\"receiver end of the channel should be open\");\n                }\n            }\n            inner.change_stream_subscribers.push(change_stream_tx);\n        };\n        tokio::spawn(future);\n        change_stream\n    }\n\n    /// Returns whether the self node is ready.\n    pub async fn is_self_node_ready(&self) -> bool {\n        self.chitchat()\n            .await\n            
.lock()\n            .await\n            .node_state(&self.self_chitchat_id)\n            .expect(\"The self node should always be present in the set of live nodes.\")\n            .is_ready()\n    }\n\n    /// Sets the self node's readiness.\n    pub async fn set_self_node_readiness(&self, readiness: bool) {\n        let readiness_value = if readiness {\n            READINESS_VALUE_READY\n        } else {\n            READINESS_VALUE_NOT_READY\n        };\n        self.set_self_key_value(READINESS_KEY, readiness_value)\n            .await\n    }\n\n    /// Sets a key-value pair on the cluster node's state.\n    pub async fn set_self_key_value(&self, key: impl Display, value: impl Display) {\n        self.chitchat()\n            .await\n            .lock()\n            .await\n            .self_node_state()\n            .set(key, value);\n    }\n\n    /// Sets a key-value pair on the cluster node's state.\n    pub async fn set_self_key_value_delete_after_ttl(\n        &self,\n        key: impl ToString,\n        value: impl ToString,\n    ) {\n        let chitchat = self.chitchat().await;\n        let mut chitchat_lock = chitchat.lock().await;\n        let chitchat_self_node = chitchat_lock.self_node_state();\n        let key = key.to_string();\n        chitchat_self_node.set_with_ttl(key.clone(), value);\n    }\n\n    pub async fn get_self_key_value(&self, key: &str) -> Option<String> {\n        self.chitchat()\n            .await\n            .lock()\n            .await\n            .self_node_state()\n            .get(key)\n            .map(|value| value.to_string())\n    }\n\n    pub async fn remove_self_key(&self, key: &str) {\n        self.chitchat()\n            .await\n            .lock()\n            .await\n            .self_node_state()\n            .delete(key)\n    }\n\n    pub async fn subscribe(\n        &self,\n        key_prefix: &str,\n        callback: impl Fn(KeyChangeEvent) + Send + Sync + 'static,\n    ) -> ListenerHandle {\n        
self.chitchat()\n            .await\n            .lock()\n            .await\n            .subscribe_event(key_prefix, callback)\n    }\n\n    /// Waits until the predicate holds true for the set of ready members.\n    pub async fn wait_for_ready_members<F>(\n        &self,\n        mut predicate: F,\n        timeout_after: Duration,\n    ) -> anyhow::Result<()>\n    where\n        F: FnMut(&[ClusterMember]) -> bool,\n    {\n        timeout(\n            timeout_after,\n            self.ready_members_watcher()\n                .await\n                .skip_while(|members| !predicate(members))\n                .next(),\n        )\n        .await\n        .context(\"deadline has passed before predicate held true\")?;\n        Ok(())\n    }\n\n    /// Returns a snapshot of the cluster state, including the underlying Chitchat state.\n    pub async fn snapshot(&self) -> ClusterSnapshot {\n        let chitchat = self.chitchat().await;\n        let chitchat_guard = chitchat.lock().await;\n        let chitchat_state_snapshot = chitchat_guard.state_snapshot();\n        let mut ready_nodes = HashSet::new();\n        let mut live_nodes = HashSet::new();\n\n        for chitchat_id in chitchat_guard.live_nodes().cloned() {\n            let node_state = chitchat_guard.node_state(&chitchat_id).expect(\n                \"The node should always be present in the cluster state because we hold the \\\n                 Chitchat mutex.\",\n            );\n            if node_state.is_ready() {\n                ready_nodes.insert(chitchat_id);\n            } else {\n                live_nodes.insert(chitchat_id);\n            }\n        }\n        let dead_nodes = chitchat_guard.dead_nodes().cloned().collect::<HashSet<_>>();\n\n        ClusterSnapshot {\n            cluster_id: self.cluster_id.clone(),\n            self_node_id: self.self_chitchat_id.node_id.clone(),\n            ready_nodes,\n            live_nodes,\n            dead_nodes,\n            chitchat_state_snapshot,\n       
 }\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn leave(&self) {\n        info!(\n            cluster_id=%self.cluster_id,\n            node_id=%self.self_chitchat_id.node_id,\n            \"leaving the cluster\"\n        );\n        self.set_self_node_readiness(false).await;\n        tokio::time::sleep(self.gossip_interval * 2).await;\n    }\n\n    pub async fn initiate_shutdown(&self) -> anyhow::Result<()> {\n        self.inner.read().await.chitchat_handle.initiate_shutdown()\n    }\n\n    /// This exposes in chitchat some metrics about the CPU usage of cooperative pipelines.\n    /// The metrics are exposed as follows:\n    /// Key:        pipeline_metrics:<index_uid>:<source_id>\n    /// Value:      179m,76MB/s\n    pub async fn update_self_node_pipeline_metrics(\n        &self,\n        pipeline_metrics: &HashMap<&IndexingPipelineId, PipelineMetrics>,\n    ) {\n        let chitchat = self.chitchat().await;\n        let mut chitchat_guard = chitchat.lock().await;\n        let node_state = chitchat_guard.self_node_state();\n        let mut current_metrics_keys: HashSet<String> = node_state\n            .iter_prefix(PIPELINE_METRICS_PREFIX)\n            .map(|(key, _)| key.to_string())\n            .collect();\n        for (pipeline_id, metrics) in pipeline_metrics {\n            let key = format!(\"{PIPELINE_METRICS_PREFIX}{pipeline_id}\");\n            current_metrics_keys.remove(&key);\n            node_state.set(key, metrics.to_string());\n        }\n        for obsolete_task_key in current_metrics_keys {\n            node_state.delete(&obsolete_task_key);\n        }\n    }\n\n    /// Updates indexing tasks in chitchat state.\n    /// Each task is stored under its own key as follows:\n    /// - key: `{INDEXING_TASK_PREFIX}{pipeline_uid}`\n    /// - value: `{index_uid}:{source_id}:{params_fingerprint}:{shard_ids}`, where `shard_ids` is\n    ///   a sorted, comma-separated list of shard IDs.\n    ///\n    /// Keys present in chitchat state but not in the given 
`indexing_tasks` are marked for\n    /// deletion.\n    pub async fn update_self_node_indexing_tasks(&self, indexing_tasks: &[IndexingTask]) {\n        let chitchat = self.chitchat().await;\n        let mut chitchat_guard = chitchat.lock().await;\n        let node_state = chitchat_guard.self_node_state();\n        set_indexing_tasks_in_node_state(indexing_tasks, node_state);\n    }\n\n    pub async fn chitchat(&self) -> Arc<Mutex<Chitchat>> {\n        self.inner.read().await.chitchat_handle.chitchat()\n    }\n\n    pub async fn chitchat_server_termination_watcher(\n        &self,\n    ) -> impl Future<Output = anyhow::Result<()>> + use<> {\n        self.inner\n            .read()\n            .await\n            .chitchat_handle\n            .termination_watcher()\n    }\n}\n\nimpl ClusterChangeStreamFactory for Cluster {\n    fn create(&self) -> ClusterChangeStream {\n        self.change_stream()\n    }\n}\n\n/// Deprecated: this is going away soon.\nfn spawn_ready_members_task(\n    cluster_id: String,\n    mut live_nodes_stream: WatchStream<BTreeMap<ChitchatId, NodeState>>,\n    ready_members_tx: watch::Sender<Vec<ClusterMember>>,\n) {\n    let fut = async move {\n        while let Some(new_live_nodes) = live_nodes_stream.next().await {\n            let mut new_ready_members = Vec::with_capacity(new_live_nodes.len());\n\n            for (chitchat_id, node_state) in new_live_nodes {\n                let member = match build_cluster_member(chitchat_id, &node_state) {\n                    Ok(member) => member,\n                    Err(error) => {\n                        warn!(\n                            cluster_id=%cluster_id,\n                            error=?error,\n                            \"Failed to build cluster member from Chitchat node state.\"\n                        );\n                        continue;\n                    }\n                };\n                if member.is_ready {\n                    new_ready_members.push(member);\n           
     }\n            }\n            if *ready_members_tx.borrow() != new_ready_members\n                && ready_members_tx.send(new_ready_members).is_err()\n            {\n                break;\n            }\n        }\n    };\n    tokio::spawn(fut);\n}\n\n/// Parses indexing tasks from the chitchat node state.\npub fn parse_indexing_tasks(node_state: &NodeState) -> Vec<IndexingTask> {\n    node_state\n        .iter_prefix(INDEXING_TASK_PREFIX)\n        .map(|(key, versioned_value)| (key, versioned_value.value.as_str()))\n        .flat_map(|(key, value)| {\n            let indexing_task_opt = chitchat_kv_to_indexing_task(key, value);\n            if indexing_task_opt.is_none() {\n                warn!(key=%key, value=%value, \"failed to parse indexing task from chitchat kv\");\n            }\n            indexing_task_opt\n        })\n        .collect()\n}\n\n/// Writes the given indexing tasks in the given node state.\n///\n/// If previous indexing tasks were present in the node state but were not in the given tasks, they\n/// are marked for deletion.\npub(crate) fn set_indexing_tasks_in_node_state(\n    indexing_tasks: &[IndexingTask],\n    node_state: &mut NodeState,\n) {\n    let mut current_indexing_tasks_keys: HashSet<String> = node_state\n        .iter_prefix(INDEXING_TASK_PREFIX)\n        .map(|(key, _)| key.to_string())\n        .collect();\n    for indexing_task in indexing_tasks {\n        let (key, value) = indexing_task_to_chitchat_kv(indexing_task);\n        current_indexing_tasks_keys.remove(&key);\n        node_state.set(key, value);\n    }\n    for obsolete_task_key in current_indexing_tasks_keys {\n        node_state.delete(&obsolete_task_key);\n    }\n}\n\nfn indexing_task_to_chitchat_kv(indexing_task: &IndexingTask) -> (String, String) {\n    let IndexingTask {\n        index_uid: _,\n        source_id,\n        shard_ids,\n        pipeline_uid: _,\n        params_fingerprint: _,\n    } = indexing_task;\n    let index_uid = 
indexing_task.index_uid();\n    let key = format!(\"{INDEXING_TASK_PREFIX}{}\", indexing_task.pipeline_uid());\n    let shard_ids_str = shard_ids.iter().sorted().join(\",\");\n    let fingerprint = indexing_task.params_fingerprint;\n    let value = format!(\"{index_uid}:{source_id}:{fingerprint}:{shard_ids_str}\");\n    (key, value)\n}\n\nfn parse_shard_ids_str(shard_ids_str: &str) -> Vec<ShardId> {\n    shard_ids_str\n        .split(',')\n        .filter(|shard_id_str| !shard_id_str.is_empty())\n        .map(ShardId::from)\n        .collect()\n}\n\nfn chitchat_kv_to_indexing_task(key: &str, value: &str) -> Option<IndexingTask> {\n    let pipeline_uid_str = key.strip_prefix(INDEXING_TASK_PREFIX)?;\n    let pipeline_uid = PipelineUid::from_str(pipeline_uid_str).ok()?;\n    let mut field_iterator = value.rsplitn(4, ':');\n    let shards_str = field_iterator.next()?;\n    let fingerprint_str = field_iterator.next()?;\n    let source_id = field_iterator.next()?;\n    let index_uid = field_iterator.next()?;\n    let params_fingerprint: u64 = fingerprint_str.parse().ok()?;\n    let index_uid = index_uid.parse().ok()?;\n    let shard_ids = parse_shard_ids_str(shards_str);\n    Some(IndexingTask {\n        index_uid: Some(index_uid),\n        source_id: source_id.to_string(),\n        pipeline_uid: Some(pipeline_uid),\n        shard_ids,\n        params_fingerprint,\n    })\n}\n\nasync fn spawn_change_stream_task(cluster: Cluster) {\n    let cluster_guard = cluster.inner.read().await;\n    let cluster_id = cluster_guard.cluster_id.clone();\n    let client_grpc_config = cluster.client_grpc_config.clone();\n    let self_chitchat_id = cluster_guard.self_chitchat_id.clone();\n    let chitchat = cluster_guard.chitchat_handle.chitchat();\n    let weak_cluster = Arc::downgrade(&cluster.inner);\n    drop(cluster_guard);\n    drop(cluster);\n\n    let mut previous_live_node_states = BTreeMap::new();\n    let mut live_nodes_watch_stream = 
chitchat.lock().await.live_nodes_watch_stream();\n\n    let future = async move {\n        while let Some(new_live_node_states) = live_nodes_watch_stream.next().await {\n            let Some(cluster) = weak_cluster.upgrade() else {\n                break;\n            };\n            let mut cluster_guard = cluster.write().await;\n            let previous_live_nodes = &mut cluster_guard.live_nodes;\n\n            let events = compute_cluster_change_events(\n                &cluster_id,\n                &self_chitchat_id,\n                previous_live_nodes,\n                &previous_live_node_states,\n                &new_live_node_states,\n                &client_grpc_config,\n            )\n            .await;\n            if !events.is_empty() {\n                cluster_guard\n                    .change_stream_subscribers\n                    .retain(|change_stream_tx| {\n                        events\n                            .iter()\n                            .all(|event| change_stream_tx.send(event.clone()).is_ok())\n                    });\n            }\n            previous_live_node_states = new_live_node_states;\n        }\n    };\n    tokio::spawn(future);\n}\n\nstruct InnerCluster {\n    cluster_id: String,\n    self_chitchat_id: ChitchatId,\n    chitchat_handle: ChitchatHandle,\n    live_nodes: BTreeMap<NodeId, ClusterNode>,\n    change_stream_subscribers: Vec<mpsc::UnboundedSender<ClusterChange>>,\n    ready_members_rx: watch::Receiver<Vec<ClusterMember>>,\n}\n\n// Not used within the code, used for documentation.\n#[derive(Debug, utoipa::ToSchema)]\npub struct NodeIdSchema {\n    #[schema(example = \"node-1\")]\n    /// The unique identifier of the node in the cluster.\n    pub node_id: String,\n\n    #[schema(example = \"1683736537\", value_type = u64)]\n    /// A numeric identifier incremented every time the node leaves and rejoins the cluster.\n    pub generation_id: u64,\n\n    #[schema(example = \"127.0.0.1:8000\", value_type = 
String)]\n    /// The socket address peers should use to gossip with the node.\n    pub gossip_advertise_addr: SocketAddr,\n}\n\n#[derive(Debug, Serialize, Deserialize, utoipa::ToSchema)]\npub struct ClusterSnapshot {\n    #[schema(example = \"qw-cluster-1\")]\n    /// The ID of the cluster that the node is a part of.\n    pub cluster_id: String,\n\n    #[schema(value_type = NodeIdSchema)]\n    /// The unique ID of the current node.\n    pub self_node_id: String,\n\n    #[schema(value_type  = Vec<NodeIdSchema>)]\n    /// The set of cluster node IDs that are ready to handle requests.\n    pub ready_nodes: HashSet<ChitchatId>,\n\n    #[schema(value_type  = Vec<NodeIdSchema>)]\n    /// The set of cluster node IDs that are alive but not ready.\n    pub live_nodes: HashSet<ChitchatId>,\n\n    #[schema(value_type  = Vec<NodeIdSchema>)]\n    /// The set of cluster node IDs flagged as dead or faulty.\n    pub dead_nodes: HashSet<ChitchatId>,\n\n    #[schema(\n        value_type = Object,\n        example = json!({\n            \"key_values\": {\n                \"grpc_advertise_addr\": \"127.0.0.1:8080\",\n                \"enabled_services\": \"searcher\",\n            },\n            \"max_version\": 5,\n        })\n    )]\n    /// A complete snapshot of the Chitchat cluster state.\n    pub chitchat_state_snapshot: ClusterStateSnapshot,\n}\n\n/// Computes the gRPC port from the listen address for tests.\n#[cfg(any(test, feature = \"testsuite\"))]\npub fn grpc_addr_from_listen_addr_for_test(listen_addr: SocketAddr) -> SocketAddr {\n    let grpc_port = listen_addr.port() + 1u16;\n    (listen_addr.ip(), grpc_port).into()\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\npub async fn create_cluster_for_test_with_id(\n    node_id: NodeId,\n    gossip_advertise_port: u16,\n    cluster_id: String,\n    peer_seed_addrs: Vec<String>,\n    enabled_services: &HashSet<quickwit_config::service::QuickwitService>,\n    transport: &dyn Transport,\n    self_node_readiness: bool,\n) -> 
anyhow::Result<Cluster> {\n    use quickwit_proto::indexing::PIPELINE_FULL_CAPACITY;\n    use quickwit_proto::ingest::ingester::IngesterStatus;\n    let gossip_advertise_addr: SocketAddr = ([127, 0, 0, 1], gossip_advertise_port).into();\n    let self_node = ClusterMember {\n        node_id,\n        generation_id: crate::GenerationId(1),\n        is_ready: self_node_readiness,\n        enabled_services: enabled_services.clone(),\n        gossip_advertise_addr,\n        grpc_advertise_addr: grpc_addr_from_listen_addr_for_test(gossip_advertise_addr),\n        indexing_tasks: Vec::new(),\n        indexing_cpu_capacity: PIPELINE_FULL_CAPACITY,\n        ingester_status: IngesterStatus::default(),\n        availability_zone: None,\n    };\n    let failure_detector_config = create_failure_detector_config_for_test();\n    let cluster = Cluster::join(\n        cluster_id,\n        self_node,\n        gossip_advertise_addr,\n        peer_seed_addrs,\n        Duration::from_millis(25),\n        failure_detector_config,\n        transport,\n        Default::default(),\n    )\n    .await?;\n    cluster.set_self_node_readiness(self_node_readiness).await;\n    Ok(cluster)\n}\n\n/// Creates a failure detector config for tests.\n#[cfg(any(test, feature = \"testsuite\"))]\nfn create_failure_detector_config_for_test() -> FailureDetectorConfig {\n    FailureDetectorConfig {\n        phi_threshold: 5.0,\n        initial_interval: Duration::from_millis(25),\n        ..Default::default()\n    }\n}\n\n/// Creates a local cluster listening on a random port.\n#[cfg(any(test, feature = \"testsuite\"))]\npub async fn create_cluster_for_test(\n    seeds: Vec<String>,\n    enabled_services: &[&str],\n    transport: &dyn Transport,\n    self_node_readiness: bool,\n) -> anyhow::Result<Cluster> {\n    use std::sync::atomic::{AtomicU16, Ordering};\n\n    use quickwit_config::service::QuickwitService;\n\n    static GOSSIP_ADVERTISE_PORT_SEQUENCE: AtomicU16 = AtomicU16::new(1u16);\n    let 
gossip_advertise_port = GOSSIP_ADVERTISE_PORT_SEQUENCE.fetch_add(1, Ordering::Relaxed);\n    let node_id: NodeId = format!(\"node-{gossip_advertise_port}\").into();\n\n    let enabled_services = enabled_services\n        .iter()\n        .map(|service_str| QuickwitService::from_str(service_str))\n        .collect::<Result<HashSet<_>, _>>()?;\n    let cluster = create_cluster_for_test_with_id(\n        node_id,\n        gossip_advertise_port,\n        \"test-cluster\".to_string(),\n        seeds,\n        &enabled_services,\n        transport,\n        self_node_readiness,\n    )\n    .await?;\n    Ok(cluster)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashMap;\n    use std::net::SocketAddr;\n    use std::time::Duration;\n\n    use chitchat::transport::ChannelTransport;\n    use itertools::Itertools;\n    use quickwit_common::test_utils::wait_until_predicate;\n    use quickwit_config::service::QuickwitService;\n    use quickwit_proto::indexing::IndexingTask;\n    use quickwit_proto::types::IndexUid;\n    use rand::Rng;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_single_node_cluster_readiness() {\n        let transport = ChannelTransport::default();\n        let node = create_cluster_for_test(Vec::new(), &[], &transport, false)\n            .await\n            .unwrap();\n\n        let mut ready_members_watcher = node.ready_members_watcher().await;\n        let ready_members = ready_members_watcher.next().await.unwrap();\n\n        assert!(ready_members.is_empty());\n        assert!(!node.is_self_node_ready().await);\n\n        let cluster_snapshot = node.snapshot().await;\n        assert!(cluster_snapshot.ready_nodes.is_empty());\n\n        let self_node_state = cluster_snapshot\n            .chitchat_state_snapshot\n            .node_states\n            .into_iter()\n            .find(|node_state| node_state.chitchat_id() == &node.self_chitchat_id)\n            .unwrap();\n        assert_eq!(\n            
self_node_state.get(READINESS_KEY).unwrap(),\n            READINESS_VALUE_NOT_READY\n        );\n\n        node.set_self_node_readiness(true).await;\n\n        let ready_members = ready_members_watcher.next().await.unwrap();\n        assert_eq!(ready_members.len(), 1);\n        assert!(node.is_self_node_ready().await);\n\n        let cluster_snapshot = node.snapshot().await;\n        assert_eq!(cluster_snapshot.ready_nodes.len(), 1);\n\n        let self_node_state = cluster_snapshot\n            .chitchat_state_snapshot\n            .node_states\n            .into_iter()\n            .find(|node_state| node_state.chitchat_id() == &node.self_chitchat_id)\n            .unwrap();\n        assert_eq!(\n            self_node_state.get(READINESS_KEY).unwrap(),\n            READINESS_VALUE_READY\n        );\n\n        node.set_self_node_readiness(false).await;\n\n        let ready_members = ready_members_watcher.next().await.unwrap();\n        assert!(ready_members.is_empty());\n        assert!(!node.is_self_node_ready().await);\n\n        let cluster_snapshot = node.snapshot().await;\n        assert!(cluster_snapshot.ready_nodes.is_empty());\n\n        let self_node_state = cluster_snapshot\n            .chitchat_state_snapshot\n            .node_states\n            .into_iter()\n            .find(|node_state| node_state.chitchat_id() == &node.self_chitchat_id)\n            .unwrap();\n        assert_eq!(\n            self_node_state.get(READINESS_KEY).unwrap(),\n            READINESS_VALUE_NOT_READY\n        );\n        node.leave().await;\n    }\n\n    #[tokio::test]\n    async fn test_cluster_multiple_nodes() -> anyhow::Result<()> {\n        let transport = ChannelTransport::default();\n        let node_1 = create_cluster_for_test(Vec::new(), &[], &transport, true).await?;\n        let node_1_change_stream = node_1.change_stream();\n\n        let peer_seeds = vec![node_1.gossip_listen_addr.to_string()];\n        let node_2 = create_cluster_for_test(peer_seeds, &[], 
&transport, true).await?;\n\n        let peer_seeds = vec![node_2.gossip_listen_addr.to_string()];\n        let node_3 = create_cluster_for_test(peer_seeds, &[], &transport, true).await?;\n\n        let wait_secs = Duration::from_secs(30);\n\n        for node in [&node_1, &node_2, &node_3] {\n            node.wait_for_ready_members(|members| members.len() == 3, wait_secs)\n                .await\n                .unwrap();\n        }\n        let members: Vec<SocketAddr> = node_1\n            .ready_members()\n            .await\n            .into_iter()\n            .map(|member| member.gossip_advertise_addr)\n            .sorted()\n            .collect();\n        let mut expected_members = vec![\n            node_1.gossip_listen_addr,\n            node_2.gossip_listen_addr,\n            node_3.gossip_listen_addr,\n        ];\n        expected_members.sort();\n        assert_eq!(members, expected_members);\n\n        node_2.leave().await;\n        node_1\n            .wait_for_ready_members(|members| members.len() == 2, wait_secs)\n            .await\n            .unwrap();\n\n        node_3.leave().await;\n        node_1\n            .wait_for_ready_members(|members| members.len() == 1, wait_secs)\n            .await\n            .unwrap();\n\n        node_1.leave().await;\n        drop(node_1);\n\n        let cluster_changes: Vec<ClusterChange> = node_1_change_stream.collect().await;\n        assert_eq!(cluster_changes.len(), 6);\n        assert!(matches!(&cluster_changes[0], ClusterChange::Add(_)));\n        assert!(matches!(&cluster_changes[1], ClusterChange::Add(_)));\n        assert!(matches!(&cluster_changes[2], ClusterChange::Add(_)));\n        assert!(matches!(&cluster_changes[3], ClusterChange::Remove(_)));\n        assert!(matches!(&cluster_changes[4], ClusterChange::Remove(_)));\n        assert!(matches!(&cluster_changes[5], ClusterChange::Remove(_)));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_multi_node_cluster_readiness() {\n   
     let transport = ChannelTransport::default();\n        let node_1 =\n            create_cluster_for_test(Vec::new(), &[\"searcher\", \"indexer\"], &transport, true)\n                .await\n                .unwrap();\n\n        let peer_seeds = vec![node_1.gossip_listen_addr.to_string()];\n        let node_2 = create_cluster_for_test(peer_seeds, &[\"indexer\"], &transport, false)\n            .await\n            .unwrap();\n\n        let wait_secs = Duration::from_secs(5);\n\n        // Both node 1 and node 2 see only one ready member.\n        node_1\n            .wait_for_ready_members(|members| members.len() == 1, wait_secs)\n            .await\n            .unwrap();\n\n        node_2\n            .wait_for_ready_members(|members| members.len() == 1, wait_secs)\n            .await\n            .unwrap();\n\n        // Now, node 2 becomes ready.\n        node_2.set_self_node_readiness(true).await;\n\n        // Both node 1 and node 2 now see two ready members.\n        node_1\n            .wait_for_ready_members(|members| members.len() == 2, wait_secs)\n            .await\n            .unwrap();\n\n        node_2\n            .wait_for_ready_members(|members| members.len() == 2, wait_secs)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_cluster_members_built_from_chitchat_state() {\n        let transport = ChannelTransport::default();\n        let cluster1 = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let cluster2 = create_cluster_for_test(\n            vec![cluster1.gossip_listen_addr.to_string()],\n            &[\"indexer\", \"metastore\"],\n            &transport,\n            true,\n        )\n        .await\n        .unwrap();\n        let index_uid: IndexUid = IndexUid::for_test(\"index-1\", 1);\n        let indexing_task1 = IndexingTask {\n            pipeline_uid: Some(PipelineUid::for_test(1u128)),\n            
index_uid: Some(index_uid.clone()),\n            source_id: \"source-1\".to_string(),\n            shard_ids: Vec::new(),\n            params_fingerprint: 0,\n        };\n        let indexing_task2 = IndexingTask {\n            pipeline_uid: Some(PipelineUid::for_test(2u128)),\n            index_uid: Some(index_uid.clone()),\n            source_id: \"source-1\".to_string(),\n            shard_ids: Vec::new(),\n            params_fingerprint: 0,\n        };\n        cluster2\n            .set_self_key_value(GRPC_ADVERTISE_ADDR_KEY, \"127.0.0.1:1001\")\n            .await;\n        cluster2\n            .update_self_node_indexing_tasks(&[indexing_task1.clone(), indexing_task2.clone()])\n            .await;\n        cluster1\n            .wait_for_ready_members(|members| members.len() == 2, Duration::from_secs(30))\n            .await\n            .unwrap();\n        let members = cluster1.ready_members().await;\n        let member_node_1 = members\n            .iter()\n            .find(|member| member.chitchat_id() == cluster1.self_chitchat_id)\n            .unwrap();\n        let member_node_2 = members\n            .iter()\n            .find(|member| member.chitchat_id() == cluster2.self_chitchat_id)\n            .unwrap();\n        assert_eq!(\n            member_node_1.enabled_services,\n            HashSet::from_iter([QuickwitService::Indexer])\n        );\n        assert!(member_node_1.indexing_tasks.is_empty());\n        assert_eq!(\n            member_node_2.grpc_advertise_addr,\n            ([127, 0, 0, 1], 1001).into()\n        );\n        assert_eq!(\n            member_node_2.enabled_services,\n            HashSet::from_iter([QuickwitService::Indexer, QuickwitService::Metastore].into_iter())\n        );\n\n        assert_eq!(\n            &member_node_2.indexing_tasks,\n            &[indexing_task1, indexing_task2]\n        );\n    }\n\n    #[tokio::test]\n    async fn test_chitchat_state_set_high_number_of_tasks() {\n        let transport = 
ChannelTransport::default();\n        let cluster1 = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let cluster2 = Arc::new(\n            create_cluster_for_test(\n                vec![cluster1.gossip_listen_addr.to_string()],\n                &[\"indexer\", \"metastore\"],\n                &transport,\n                true,\n            )\n            .await\n            .unwrap(),\n        );\n        let cluster3 = Arc::new(\n            create_cluster_for_test(\n                vec![cluster1.gossip_listen_addr.to_string()],\n                &[\"indexer\", \"metastore\"],\n                &transport,\n                true,\n            )\n            .await\n            .unwrap(),\n        );\n        let mut random_generator = rand::rng();\n        // TODO: increase it back to 1000 when https://github.com/quickwit-oss/chitchat/issues/81 is fixed\n        let indexing_tasks = (0..500)\n            .map(|pipeline_id| {\n                let index_id = random_generator.random_range(0..=10_000);\n                let source_id = random_generator.random_range(0..=100);\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(pipeline_id as u128)),\n                    index_uid: Some(\n                        format!(\"index-{index_id}:11111111111111111111111111\")\n                            .parse()\n                            .unwrap(),\n                    ),\n                    source_id: format!(\"source-{source_id}\"),\n                    shard_ids: Vec::new(),\n                    params_fingerprint: 0,\n                }\n            })\n            .collect_vec();\n        cluster1\n            .update_self_node_indexing_tasks(&indexing_tasks)\n            .await;\n        for cluster in [&cluster2, &cluster3] {\n            let cluster_clone = cluster.clone();\n            let indexing_tasks_clone = indexing_tasks.clone();\n            
wait_until_predicate(\n                move || {\n                    test_indexing_tasks_in_given_node(\n                        cluster_clone.clone(),\n                        cluster1.self_chitchat_id.gossip_advertise_addr,\n                        indexing_tasks_clone.clone(),\n                    )\n                },\n                Duration::from_secs(5),\n                Duration::from_millis(100),\n            )\n            .await\n            .unwrap();\n        }\n\n        // Mark tasks for deletion.\n        cluster1.update_self_node_indexing_tasks(&[]).await;\n        for cluster in [&cluster2, &cluster3] {\n            let cluster_clone = cluster.clone();\n            wait_until_predicate(\n                move || {\n                    test_indexing_tasks_in_given_node(\n                        cluster_clone.clone(),\n                        cluster1.self_chitchat_id.gossip_advertise_addr,\n                        Vec::new(),\n                    )\n                },\n                Duration::from_secs(4),\n                Duration::from_millis(500),\n            )\n            .await\n            .unwrap();\n        }\n\n        // Re-add tasks.\n        cluster1\n            .update_self_node_indexing_tasks(&indexing_tasks)\n            .await;\n        for cluster in [&cluster2, &cluster3] {\n            let cluster_clone = cluster.clone();\n            let indexing_tasks_clone = indexing_tasks.clone();\n            wait_until_predicate(\n                move || {\n                    test_indexing_tasks_in_given_node(\n                        cluster_clone.clone(),\n                        cluster1.self_chitchat_id.gossip_advertise_addr,\n                        indexing_tasks_clone.clone(),\n                    )\n                },\n                Duration::from_secs(4),\n                Duration::from_millis(500),\n            )\n            .await\n            .unwrap();\n        }\n    }\n\n    async fn 
test_indexing_tasks_in_given_node(\n        cluster: Arc<Cluster>,\n        gossip_advertise_addr: SocketAddr,\n        indexing_tasks: Vec<IndexingTask>,\n    ) -> bool {\n        let members = cluster.ready_members().await;\n        let node_opt = members\n            .iter()\n            .find(|member| member.gossip_advertise_addr == gossip_advertise_addr);\n        let Some(node) = node_opt else {\n            return false;\n        };\n        let node_grouped_tasks: HashMap<IndexingTask, usize> = node\n            .indexing_tasks\n            .iter()\n            .chunk_by(|task| (*task).clone())\n            .into_iter()\n            .map(|(key, group)| (key, group.count()))\n            .collect();\n        let grouped_tasks: HashMap<IndexingTask, usize> = indexing_tasks\n            .iter()\n            .chunk_by(|task| (*task).clone())\n            .into_iter()\n            .map(|(key, group)| (key, group.count()))\n            .collect();\n        node_grouped_tasks == grouped_tasks\n    }\n\n    #[tokio::test]\n    async fn test_chitchat_state_with_malformatted_indexing_task_key() {\n        let transport = ChannelTransport::default();\n        let node = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        {\n            let chitchat_handle = node.inner.read().await.chitchat_handle.chitchat();\n            let mut chitchat_guard = chitchat_handle.lock().await;\n            chitchat_guard.self_node_state().set(\n                format!(\"{INDEXING_TASK_PREFIX}01BX5ZZKBKACTAV9WEVGEMMVS0\"),\n                \"my_index:00000000000000000000000000:my_source:41:1,3\".to_string(),\n            );\n            chitchat_guard.self_node_state().set(\n                format!(\"{INDEXING_TASK_PREFIX}01BX5ZZKBKACTAV9WEVGEMMVS1\"),\n                \"my_index-00000000000000000000000000-my_source:53:3,5\".to_string(),\n            );\n        }\n        node.wait_for_ready_members(|members| 
members.len() == 1, Duration::from_secs(5))\n            .await\n            .unwrap();\n        let ready_members = node.ready_members().await;\n        assert_eq!(ready_members[0].indexing_tasks.len(), 1);\n    }\n\n    #[tokio::test]\n    async fn test_cluster_id_isolation() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let transport = ChannelTransport::default();\n\n        let cluster1a = create_cluster_for_test_with_id(\n            \"node-11\".into(),\n            11,\n            \"cluster1\".to_string(),\n            Vec::new(),\n            &HashSet::default(),\n            &transport,\n            true,\n        )\n        .await?;\n        let cluster2a = create_cluster_for_test_with_id(\n            \"node-21\".into(),\n            21,\n            \"cluster2\".to_string(),\n            vec![cluster1a.gossip_listen_addr.to_string()],\n            &HashSet::default(),\n            &transport,\n            true,\n        )\n        .await?;\n        let cluster1b = create_cluster_for_test_with_id(\n            \"node-12\".into(),\n            12,\n            \"cluster1\".to_string(),\n            vec![\n                cluster1a.gossip_listen_addr.to_string(),\n                cluster2a.gossip_listen_addr.to_string(),\n            ],\n            &HashSet::default(),\n            &transport,\n            true,\n        )\n        .await?;\n        let cluster2b = create_cluster_for_test_with_id(\n            \"node-22\".into(),\n            22,\n            \"cluster2\".to_string(),\n            vec![\n                cluster1a.gossip_listen_addr.to_string(),\n                cluster2a.gossip_listen_addr.to_string(),\n            ],\n            &HashSet::default(),\n            &transport,\n            true,\n        )\n        .await?;\n\n        let wait_secs = Duration::from_secs(10);\n\n        for cluster in [&cluster1a, &cluster2a, &cluster1b, &cluster2b] {\n            cluster\n                
.wait_for_ready_members(|members| members.len() == 2, wait_secs)\n                .await\n                .unwrap();\n        }\n\n        let members_a: Vec<SocketAddr> = cluster1a\n            .ready_members()\n            .await\n            .iter()\n            .map(|member| member.gossip_advertise_addr)\n            .sorted()\n            .collect();\n        let mut expected_members_a =\n            vec![cluster1a.gossip_listen_addr, cluster1b.gossip_listen_addr];\n        expected_members_a.sort();\n        assert_eq!(members_a, expected_members_a);\n\n        let members_b: Vec<SocketAddr> = cluster2a\n            .ready_members()\n            .await\n            .iter()\n            .map(|member| member.gossip_advertise_addr)\n            .sorted()\n            .collect();\n        let mut expected_members_b =\n            vec![cluster2a.gossip_listen_addr, cluster2b.gossip_listen_addr];\n        expected_members_b.sort();\n        assert_eq!(members_b, expected_members_b);\n\n        Ok(())\n    }\n\n    fn test_serialize_indexing_tasks_aux(\n        indexing_tasks: &[IndexingTask],\n        node_state: &mut NodeState,\n    ) {\n        set_indexing_tasks_in_node_state(indexing_tasks, node_state);\n        let ser_deser_indexing_tasks = parse_indexing_tasks(node_state);\n        assert_eq!(indexing_tasks, ser_deser_indexing_tasks);\n    }\n\n    #[test]\n    fn test_serialize_indexing_tasks() {\n        let mut node_state = NodeState::for_test();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        test_serialize_indexing_tasks_aux(&[], &mut node_state);\n        test_serialize_indexing_tasks_aux(\n            &[IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"my-source1\".to_string(),\n                shard_ids: vec![ShardId::from(1), ShardId::from(2)],\n                params_fingerprint: 0,\n            }],\n 
           &mut node_state,\n        );\n        // change in the set of shards\n        test_serialize_indexing_tasks_aux(\n            &[IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"my-source1\".to_string(),\n                shard_ids: vec![ShardId::from(1), ShardId::from(2), ShardId::from(3)],\n                params_fingerprint: 0,\n            }],\n            &mut node_state,\n        );\n        test_serialize_indexing_tasks_aux(\n            &[\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"my-source1\".to_string(),\n                    shard_ids: vec![ShardId::from(1), ShardId::from(2)],\n                    params_fingerprint: 0,\n                },\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"my-source1\".to_string(),\n                    shard_ids: vec![ShardId::from(3), ShardId::from(4)],\n                    params_fingerprint: 0,\n                },\n            ],\n            &mut node_state,\n        );\n        // different index.\n        test_serialize_indexing_tasks_aux(\n            &[\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"my-source1\".to_string(),\n                    shard_ids: vec![ShardId::from(1), ShardId::from(2)],\n                    params_fingerprint: 0,\n                },\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                    index_uid: Some(IndexUid::for_test(\"test-index2\", 0)),\n                    
source_id: \"my-source1\".to_string(),\n                    shard_ids: vec![ShardId::from(3), ShardId::from(4)],\n                    params_fingerprint: 0,\n                },\n            ],\n            &mut node_state,\n        );\n        // same index, different source.\n        test_serialize_indexing_tasks_aux(\n            &[\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"my-source1\".to_string(),\n                    shard_ids: vec![ShardId::from(1), ShardId::from(2)],\n                    params_fingerprint: 0,\n                },\n                IndexingTask {\n                    pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"my-source2\".to_string(),\n                    shard_ids: vec![ShardId::from(3), ShardId::from(4)],\n                    params_fingerprint: 0,\n                },\n            ],\n            &mut node_state,\n        );\n    }\n\n    #[test]\n    fn test_parse_shard_ids_str() {\n        assert!(parse_shard_ids_str(\"\").is_empty());\n        assert!(parse_shard_ids_str(\",\").is_empty());\n        assert_eq!(\n            parse_shard_ids_str(\"00000000000000000012,\"),\n            [ShardId::from(12)]\n        );\n        assert_eq!(\n            parse_shard_ids_str(\"00000000000000000012,00000000000000000023,\"),\n            [ShardId::from(12), ShardId::from(23)]\n        );\n    }\n\n    #[test]\n    fn test_parse_chitchat_kv() {\n        assert!(\n            chitchat_kv_to_indexing_task(\"invalidulid\", \"my_index:uid:my_source:42:1,3\").is_none()\n        );\n        let task = super::chitchat_kv_to_indexing_task(\n            \"indexer.task:01BX5ZZKBKACTAV9WEVGEMMVS0\",\n            \"my_index:00000000000000000000000000:my_source:42:00000000000000000001,\\\n             
00000000000000000003\",\n        )\n        .unwrap();\n        assert_eq!(task.params_fingerprint, 42);\n        assert_eq!(\n            task.pipeline_uid(),\n            PipelineUid::from_str(\"01BX5ZZKBKACTAV9WEVGEMMVS0\").unwrap()\n        );\n        assert_eq!(\n            &task.index_uid().to_string(),\n            \"my_index:00000000000000000000000000\"\n        );\n        assert_eq!(&task.source_id, \"my_source\");\n        assert_eq!(&task.shard_ids, &[ShardId::from(1), ShardId::from(3)]);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/grpc_gossip.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::iter::zip;\nuse std::net::SocketAddr;\nuse std::sync::{Arc, Weak};\nuse std::time::{Duration, Instant};\n\nuse chitchat::{Chitchat, ChitchatId, NodeState, VersionedValue};\nuse futures::Future;\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_common::tower::ClientGrpcConfig;\nuse quickwit_proto::cluster::{ClusterService, ClusterServiceClient, FetchClusterStateRequest};\nuse rand::seq::IteratorRandom;\nuse tokio::sync::{Mutex, watch};\nuse tokio_stream::StreamExt;\nuse tokio_stream::wrappers::WatchStream;\nuse tracing::{info, warn};\n\nuse crate::grpc_service::cluster_grpc_client;\nuse crate::member::NodeStateExt;\nuse crate::metrics::CLUSTER_METRICS;\n\nconst MAX_GOSSIP_PEERS: usize = 3;\n\n/// Spawns a task that selects a few ready peers and fetches the cluster state from them via gRPC.\n/// It runs once at startup, then on each `catchup_callback_rx` change, throttled by a 60-second\n/// interval.\npub(crate) async fn spawn_catchup_callback_task(\n    cluster_id: String,\n    self_chitchat_id: ChitchatId,\n    weak_chitchat: Weak<Mutex<Chitchat>>,\n    live_nodes_rx: watch::Receiver<BTreeMap<ChitchatId, NodeState>>,\n    mut catchup_callback_rx: watch::Receiver<()>,\n    client_grpc_config: ClientGrpcConfig,\n) {\n    let catchup_callback_future = async move {\n        let mut interval = tokio::time::interval(Duration::from_secs(60));\n        interval.tick().await;\n\n        loop {\n            let Some(chitchat) = 
weak_chitchat.upgrade() else {\n                return;\n            };\n            perform_grpc_gossip_rounds(\n                cluster_id.clone(),\n                &self_chitchat_id,\n                chitchat,\n                live_nodes_rx.clone(),\n                |socket_addr| cluster_grpc_client(socket_addr, client_grpc_config.clone()),\n            )\n            .await;\n\n            interval.tick().await;\n\n            if catchup_callback_rx.changed().await.is_err() {\n                return;\n            }\n        }\n    };\n    tokio::spawn(catchup_callback_future);\n}\n\nasync fn perform_grpc_gossip_rounds<ClusterServiceClientFactory, Fut>(\n    cluster_id: String,\n    self_chitchat_id: &ChitchatId,\n    chitchat: Arc<Mutex<Chitchat>>,\n    live_nodes_rx: watch::Receiver<BTreeMap<ChitchatId, NodeState>>,\n    grpc_client_factory: ClusterServiceClientFactory,\n) where\n    ClusterServiceClientFactory: Fn(SocketAddr) -> Fut,\n    Fut: Future<Output = ClusterServiceClient>,\n{\n    wait_for_gossip_candidates(\n        self_chitchat_id,\n        live_nodes_rx.clone(),\n        Duration::from_secs(10),\n    )\n    .await;\n\n    let now = Instant::now();\n    let (node_ids, grpc_advertise_addrs) =\n        select_gossip_candidates(self_chitchat_id, live_nodes_rx);\n\n    if node_ids.is_empty() {\n        info!(\"no peer nodes to pull the cluster state from\");\n        return;\n    }\n    info!(\"pulling cluster state from node(s): {node_ids:?}\");\n\n    for (node_id, grpc_advertise_addr) in zip(node_ids, grpc_advertise_addrs) {\n        let cluster_client = grpc_client_factory(grpc_advertise_addr).await;\n\n        let request = FetchClusterStateRequest {\n            cluster_id: cluster_id.clone(),\n        };\n        let Ok(response) = cluster_client.fetch_cluster_state(request).await else {\n            warn!(\"failed to fetch cluster state from node `{node_id}`\");\n            continue;\n        };\n        
CLUSTER_METRICS.grpc_gossip_rounds_total.inc();\n\n        let mut chitchat_guard = chitchat.lock().await;\n\n        for proto_node_state in response.node_states {\n            let proto_chitchat_id = proto_node_state\n                .chitchat_id\n                .expect(\"`chitchat_id` should be a required field\");\n            let chitchat_id = ChitchatId {\n                node_id: proto_chitchat_id.node_id.clone(),\n                generation_id: proto_chitchat_id.generation_id,\n                gossip_advertise_addr: proto_chitchat_id\n                    .gossip_advertise_addr\n                    .parse()\n                    .expect(\"`gossip_advertise_addr` should be a valid socket address\"),\n            };\n            if chitchat_id == *self_chitchat_id {\n                continue;\n            }\n            let now = tokio::time::Instant::now();\n            let key_values = proto_node_state.key_values.into_iter().map(|key_value| {\n                let status: chitchat::DeletionStatus = match key_value.status() {\n                    quickwit_proto::cluster::DeletionStatus::Set => chitchat::DeletionStatus::Set,\n                    quickwit_proto::cluster::DeletionStatus::Deleted => {\n                        chitchat::DeletionStatus::Deleted(now)\n                    }\n                    quickwit_proto::cluster::DeletionStatus::DeleteAfterTtl => {\n                        chitchat::DeletionStatus::DeleteAfterTtl(now)\n                    }\n                };\n                (\n                    key_value.key,\n                    VersionedValue {\n                        value: key_value.value,\n                        version: key_value.version,\n                        status,\n                    },\n                )\n            });\n            chitchat_guard.reset_node_state_if_update(\n                &chitchat_id,\n                key_values,\n                proto_node_state.max_version,\n                
proto_node_state.last_gc_version,\n            );\n        }\n    }\n    info!(\"pulled cluster state in {}\", now.elapsed().pretty_display());\n}\n\nasync fn wait_for_gossip_candidates(\n    self_chitchat_id: &ChitchatId,\n    live_nodes_rx: watch::Receiver<BTreeMap<ChitchatId, NodeState>>,\n    timeout_after: Duration,\n) {\n    let live_nodes_stream = WatchStream::new(live_nodes_rx);\n    let _ = tokio::time::timeout(\n        timeout_after,\n        live_nodes_stream\n            .skip_while(|node_states| {\n                node_states.len() < MAX_GOSSIP_PEERS\n                    && node_states\n                        .values()\n                        .filter(|node_state| {\n                            find_gossip_candidate_grpc_addr(self_chitchat_id, node_state).is_some()\n                        })\n                        .count()\n                        < MAX_GOSSIP_PEERS\n            })\n            .next(),\n    )\n    .await;\n}\n\nfn select_gossip_candidates(\n    self_chitchat_id: &ChitchatId,\n    live_nodes_rx: watch::Receiver<BTreeMap<ChitchatId, NodeState>>,\n) -> (Vec<String>, Vec<SocketAddr>) {\n    live_nodes_rx\n        .borrow()\n        .values()\n        .filter_map(|node_state| {\n            find_gossip_candidate_grpc_addr(self_chitchat_id, node_state)\n                .map(|grpc_addr| (&node_state.chitchat_id().node_id, grpc_addr))\n        })\n        .choose_multiple(&mut rand::rng(), MAX_GOSSIP_PEERS)\n        .into_iter()\n        .map(|(node_id, grpc_addr)| (node_id.clone(), grpc_addr))\n        .unzip()\n}\n\n/// Returns the gRPC advertise address of the node if it is a gossip candidate.\nfn find_gossip_candidate_grpc_addr(\n    self_chitchat_id: &ChitchatId,\n    node_state: &NodeState,\n) -> Option<SocketAddr> {\n    // Ignore self node, including previous generations, and nodes that are not ready.\n    if self_chitchat_id.node_id == node_state.chitchat_id().node_id || !node_state.is_ready() {\n        return None;\n    }\n    
node_state.grpc_advertise_addr().ok()\n}\n\n#[cfg(test)]\nmod tests {\n    use chitchat::transport::ChannelTransport;\n    use quickwit_proto::cluster::{\n        ChitchatId as ProtoChitchatId, DeletionStatus, FetchClusterStateResponse,\n        MockClusterService, NodeState as ProtoNodeState, VersionedKeyValue,\n    };\n\n    use super::*;\n    use crate::change::tests::NodeStateBuilder;\n    use crate::create_cluster_for_test;\n    use crate::member::{GRPC_ADVERTISE_ADDR_KEY, READINESS_KEY, READINESS_VALUE_READY};\n\n    #[tokio::test]\n    async fn test_find_gossip_candidate_grpc_addr() {\n        let gossip_advertise_addr: SocketAddr = \"127.0.0.1:10000\".parse().unwrap();\n        let grpc_advertise_addr: SocketAddr = \"127.0.0.1:10001\".parse().unwrap();\n        let self_chitchat_id =\n            ChitchatId::new(\"test-node-foo\".to_string(), 1, gossip_advertise_addr);\n\n        let node_state = NodeStateBuilder::default()\n            .with_readiness(true)\n            .with_grpc_advertise_addr(grpc_advertise_addr)\n            .build();\n        let grpc_addr = find_gossip_candidate_grpc_addr(&self_chitchat_id, &node_state).unwrap();\n        assert_eq!(grpc_addr, grpc_advertise_addr);\n\n        let node_state = NodeStateBuilder::default()\n            .with_readiness(false)\n            .with_grpc_advertise_addr(grpc_advertise_addr)\n            .build();\n        let grpc_addr_opt = find_gossip_candidate_grpc_addr(&self_chitchat_id, &node_state);\n        assert!(grpc_addr_opt.is_none());\n\n        let node_state = NodeStateBuilder::default().with_readiness(false).build();\n        let grpc_addr_opt = find_gossip_candidate_grpc_addr(&self_chitchat_id, &node_state);\n        assert!(grpc_addr_opt.is_none());\n\n        let self_chitchat_id = ChitchatId::new(\"test-node\".to_string(), 1, gossip_advertise_addr);\n        let node_state = NodeStateBuilder::default()\n            .with_readiness(true)\n            
.with_grpc_advertise_addr(grpc_advertise_addr)\n            .build();\n        let grpc_addr_opt = find_gossip_candidate_grpc_addr(&self_chitchat_id, &node_state);\n        assert!(grpc_addr_opt.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_perform_grpc_gossip_rounds() {\n        let peer_seeds = Vec::new();\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(peer_seeds, &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let cluster_id = cluster.cluster_id().to_string();\n        let self_chitchat_id = cluster.self_chitchat_id();\n        let chitchat = cluster.chitchat().await;\n\n        let grpc_client_factory = |_: SocketAddr| {\n            Box::pin(async {\n                let mut mock_cluster_service = MockClusterService::new();\n                mock_cluster_service\n                    .expect_fetch_cluster_state()\n                    .returning(|_request| {\n                        let response = FetchClusterStateResponse {\n                            node_states: vec![ProtoNodeState {\n                                chitchat_id: Some(ProtoChitchatId {\n                                    node_id: \"node-4\".to_string(),\n                                    generation_id: 0,\n                                    gossip_advertise_addr: \"127.0.0.1:14000\".to_string(),\n                                }),\n                                key_values: vec![VersionedKeyValue {\n                                    key: \"foo\".to_string(),\n                                    value: \"bar\".to_string(),\n                                    version: 2,\n\n                                    status: DeletionStatus::Set as i32,\n                                }],\n                                max_version: 2,\n                                last_gc_version: 1,\n                            }],\n                            ..Default::default()\n                    
    };\n                        Ok(response)\n                    });\n                ClusterServiceClient::from_mock(mock_cluster_service)\n            })\n        };\n        let live_nodes = BTreeMap::from_iter([\n            {\n                let chitchat_id = ChitchatId::for_local_test(11_000);\n                let mut node_state = NodeState::for_test();\n\n                node_state.set(GRPC_ADVERTISE_ADDR_KEY, \"127.0.0.1:11001\");\n                node_state.set(READINESS_KEY, READINESS_VALUE_READY);\n                (chitchat_id, node_state)\n            },\n            {\n                let chitchat_id = ChitchatId::for_local_test(12_000);\n                let mut node_state = NodeState::for_test();\n\n                node_state.set(GRPC_ADVERTISE_ADDR_KEY, \"127.0.0.1:12001\");\n                node_state.set(READINESS_KEY, READINESS_VALUE_READY);\n                (chitchat_id, node_state)\n            },\n            {\n                let chitchat_id = ChitchatId::for_local_test(13_000);\n                let mut node_state = NodeState::for_test();\n\n                node_state.set(GRPC_ADVERTISE_ADDR_KEY, \"127.0.0.1:13001\");\n                node_state.set(READINESS_KEY, READINESS_VALUE_READY);\n                (chitchat_id, node_state)\n            },\n        ]);\n        let (_live_nodes_tx, live_nodes_rx) = watch::channel(live_nodes);\n\n        perform_grpc_gossip_rounds(\n            cluster_id,\n            self_chitchat_id,\n            chitchat.clone(),\n            live_nodes_rx,\n            grpc_client_factory,\n        )\n        .await;\n\n        let chitchat_mutex_guard = chitchat.lock().await;\n        let chitchat_id = ChitchatId {\n            node_id: \"node-4\".to_string(),\n            generation_id: 0,\n            gossip_advertise_addr: \"127.0.0.1:14000\".parse().unwrap(),\n        };\n        let node_state = chitchat_mutex_guard.node_state(&chitchat_id).unwrap();\n        assert_eq!(node_state.num_key_values(), 1);\n     
   assert_eq!(node_state.get(\"foo\").unwrap(), \"bar\");\n        assert_eq!(node_state.max_version(), 2);\n        assert_eq!(node_state.last_gc_version(), 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/grpc_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::net::SocketAddr;\n\nuse bytesize::ByteSize;\nuse itertools::Itertools;\nuse once_cell::sync::Lazy;\nuse quickwit_common::tower::{ClientGrpcConfig, GrpcMetricsLayer, make_channel};\nuse quickwit_proto::cluster::cluster_service_grpc_server::ClusterServiceGrpcServer;\nuse quickwit_proto::cluster::{\n    ChitchatId as ProtoChitchatId, ClusterError, ClusterResult, ClusterService,\n    ClusterServiceClient, ClusterServiceGrpcServerAdapter, FetchClusterStateRequest,\n    FetchClusterStateResponse, NodeState as ProtoNodeState, VersionedKeyValue,\n};\nuse tonic::async_trait;\n\nuse crate::Cluster;\n\nconst MAX_MESSAGE_SIZE: ByteSize = ByteSize::mib(64);\n\nstatic CLUSTER_GRPC_CLIENT_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"cluster\", \"client\"));\nstatic CLUSTER_GRPC_SERVER_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"cluster\", \"server\"));\n\npub(crate) async fn cluster_grpc_client(\n    socket_addr: SocketAddr,\n    client_grpc_config: ClientGrpcConfig,\n) -> ClusterServiceClient {\n    let channel = make_channel(socket_addr, client_grpc_config).await;\n\n    ClusterServiceClient::tower()\n        .stack_layer(CLUSTER_GRPC_CLIENT_METRICS_LAYER.clone())\n        .build_from_channel(socket_addr, channel, MAX_MESSAGE_SIZE, None)\n}\n\npub fn cluster_grpc_server(\n   
 cluster: Cluster,\n) -> ClusterServiceGrpcServer<ClusterServiceGrpcServerAdapter> {\n    ClusterServiceClient::tower()\n        .stack_layer(CLUSTER_GRPC_SERVER_METRICS_LAYER.clone())\n        .build(cluster)\n        .as_grpc_service(MAX_MESSAGE_SIZE)\n}\n\n#[async_trait]\nimpl ClusterService for Cluster {\n    async fn fetch_cluster_state(\n        &self,\n        request: FetchClusterStateRequest,\n    ) -> ClusterResult<FetchClusterStateResponse> {\n        if request.cluster_id != self.cluster_id() {\n            return Err(ClusterError::Internal(\"wrong cluster\".to_string()));\n        }\n        let chitchat = self.chitchat().await;\n        let chitchat_guard = chitchat.lock().await;\n\n        let num_nodes = chitchat_guard.node_states().len();\n        let mut proto_node_states = Vec::with_capacity(num_nodes);\n\n        for (chitchat_id, node_state) in chitchat_guard.node_states() {\n            let proto_chitchat_id = ProtoChitchatId {\n                node_id: chitchat_id.node_id.clone(),\n                generation_id: chitchat_id.generation_id,\n                gossip_advertise_addr: chitchat_id.gossip_advertise_addr.to_string(),\n            };\n\n            let key_values: Vec<VersionedKeyValue> = node_state\n                .key_values_including_deleted()\n                .map(|(key, versioned_value)| {\n                    let key_value_status_proto = match versioned_value.status {\n                        chitchat::DeletionStatus::Set => {\n                            quickwit_proto::cluster::DeletionStatus::Set\n                        }\n                        chitchat::DeletionStatus::Deleted(_) => {\n                            quickwit_proto::cluster::DeletionStatus::Deleted\n                        }\n                        chitchat::DeletionStatus::DeleteAfterTtl(_) => {\n                            quickwit_proto::cluster::DeletionStatus::DeleteAfterTtl\n                        }\n                    };\n                    
VersionedKeyValue {\n                        key: key.to_string(),\n                        value: versioned_value.value.clone(),\n                        version: versioned_value.version,\n                        status: key_value_status_proto as i32,\n                    }\n                })\n                .sorted_unstable_by_key(|key_value| key_value.version)\n                .collect();\n            if key_values.is_empty() {\n                continue;\n            }\n            let proto_node_state = ProtoNodeState {\n                chitchat_id: Some(proto_chitchat_id),\n                key_values,\n                max_version: node_state.max_version(),\n                last_gc_version: node_state.last_gc_version(),\n            };\n            proto_node_states.push(proto_node_state);\n        }\n        let response = FetchClusterStateResponse {\n            cluster_id: request.cluster_id,\n            node_states: proto_node_states,\n        };\n        Ok(response)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use chitchat::transport::ChannelTransport;\n\n    use super::*;\n    use crate::create_cluster_for_test;\n    use crate::member::{ENABLED_SERVICES_KEY, GRPC_ADVERTISE_ADDR_KEY, READINESS_KEY};\n\n    #[tokio::test]\n    async fn test_fetch_cluster_state() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n\n        let cluster_id = cluster.cluster_id().to_string();\n        let node_id = cluster.self_node_id().to_owned();\n\n        cluster.set_self_key_value(\"foo\", \"bar\").await;\n\n        let fetch_cluster_state_request = FetchClusterStateRequest {\n            cluster_id: cluster_id.clone(),\n        };\n        let mut fetch_cluster_state_response = cluster\n            .fetch_cluster_state(fetch_cluster_state_request)\n            .await\n            .unwrap();\n        assert_eq!(\n            
fetch_cluster_state_response.cluster_id,\n            cluster.cluster_id()\n        );\n        assert_eq!(fetch_cluster_state_response.node_states.len(), 1);\n\n        let node_state = &mut fetch_cluster_state_response.node_states[0];\n\n        let chitchat_id = node_state.chitchat_id.clone().unwrap();\n        assert_eq!(chitchat_id.node_id, node_id);\n        assert_eq!(chitchat_id.generation_id, 1);\n\n        node_state\n            .key_values\n            .sort_unstable_by(|left, right| left.key.cmp(&right.key));\n\n        assert_eq!(node_state.key_values.len(), 4);\n        assert_eq!(node_state.key_values[0].key, ENABLED_SERVICES_KEY);\n        assert_eq!(node_state.key_values[0].value, \"indexer\");\n\n        assert_eq!(node_state.key_values[1].key, \"foo\");\n        assert_eq!(node_state.key_values[1].value, \"bar\");\n\n        assert_eq!(node_state.key_values[2].key, GRPC_ADVERTISE_ADDR_KEY);\n\n        assert_eq!(node_state.key_values[3].key, READINESS_KEY);\n        assert_eq!(node_state.key_values[3].value, \"READY\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nmod change;\nmod cluster;\nmod grpc_gossip;\nmod grpc_service;\nmod member;\nmod metrics;\nmod node;\n\nuse std::net::SocketAddr;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\npub use chitchat::transport::ChannelTransport;\nuse chitchat::transport::{Socket, Transport, UdpSocket};\nuse chitchat::{ChitchatMessage, Serializable};\npub use chitchat::{FailureDetectorConfig, KeyChangeEvent, ListenerHandle};\npub use grpc_service::cluster_grpc_server;\nuse quickwit_common::metrics::IntCounter;\nuse quickwit_common::tower::ClientGrpcConfig;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_config::{GrpcConfig, NodeConfig, TlsConfig};\nuse quickwit_proto::indexing::CpuCapacity;\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse quickwit_proto::tonic::transport::{Certificate, ClientTlsConfig, Identity};\nuse time::OffsetDateTime;\n\n#[cfg(any(test, feature = \"testsuite\"))]\npub use crate::change::for_test::*;\npub use crate::change::{ClusterChange, ClusterChangeStream, ClusterChangeStreamFactory};\npub use crate::cluster::{Cluster, ClusterSnapshot, NodeIdSchema};\n#[cfg(any(test, feature = \"testsuite\"))]\npub use crate::cluster::{\n    create_cluster_for_test, create_cluster_for_test_with_id, grpc_addr_from_listen_addr_for_test,\n};\npub use crate::member::{ClusterMember, 
INDEXING_CPU_CAPACITY_KEY};\npub use crate::node::ClusterNode;\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq)]\npub struct GenerationId(u64);\n\nimpl GenerationId {\n    pub fn as_u64(&self) -> u64 {\n        self.0\n    }\n\n    pub fn now() -> Self {\n        Self(OffsetDateTime::now_utc().unix_timestamp_nanos() as u64)\n    }\n}\n\nimpl From<u64> for GenerationId {\n    fn from(generation_id: u64) -> Self {\n        Self(generation_id)\n    }\n}\n\nstruct CountingUdpTransport;\n\nstruct CountingUdpSocket {\n    socket: UdpSocket,\n    gossip_recv: IntCounter,\n    gossip_recv_bytes: IntCounter,\n    gossip_send: IntCounter,\n    gossip_send_bytes: IntCounter,\n}\n\n#[async_trait]\nimpl Socket for CountingUdpSocket {\n    async fn send(&mut self, to: SocketAddr, msg: ChitchatMessage) -> anyhow::Result<()> {\n        let msg_len = msg.serialized_len() as u64;\n        self.socket.send(to, msg).await?;\n        self.gossip_send.inc();\n        self.gossip_send_bytes.inc_by(msg_len);\n        Ok(())\n    }\n\n    async fn recv(&mut self) -> anyhow::Result<(SocketAddr, ChitchatMessage)> {\n        let (socket_addr, msg) = self.socket.recv().await?;\n        self.gossip_recv.inc();\n        let msg_len = msg.serialized_len() as u64;\n        self.gossip_recv_bytes.inc_by(msg_len);\n        Ok((socket_addr, msg))\n    }\n}\n\n#[async_trait]\nimpl Transport for CountingUdpTransport {\n    async fn open(&self, listen_addr: SocketAddr) -> anyhow::Result<Box<dyn Socket>> {\n        let socket = UdpSocket::open(listen_addr).await?;\n        Ok(Box::new(CountingUdpSocket {\n            socket,\n            gossip_recv: crate::metrics::CLUSTER_METRICS\n                .gossip_recv_messages_total\n                .clone(),\n            gossip_recv_bytes: crate::metrics::CLUSTER_METRICS\n                .gossip_recv_bytes_total\n                .clone(),\n            gossip_send: crate::metrics::CLUSTER_METRICS\n                .gossip_sent_messages_total\n                
.clone(),\n            gossip_send_bytes: crate::metrics::CLUSTER_METRICS\n                .gossip_sent_bytes_total\n                .clone(),\n        }))\n    }\n}\n\npub async fn start_cluster_service(node_config: &NodeConfig) -> anyhow::Result<Cluster> {\n    let cluster_id = node_config.cluster_id.clone();\n    let gossip_listen_addr = node_config.gossip_listen_addr;\n    let peer_seed_addrs = node_config.peer_seed_addrs().await?;\n    let indexing_tasks = Vec::new();\n\n    let node_id = node_config.node_id.clone();\n    let generation_id = GenerationId::now();\n    let is_ready = false;\n    let indexing_cpu_capacity = if node_config.is_service_enabled(QuickwitService::Indexer) {\n        node_config.indexer_config.cpu_capacity\n    } else {\n        CpuCapacity::zero()\n    };\n    let self_node = ClusterMember {\n        node_id,\n        generation_id,\n        is_ready,\n        enabled_services: node_config.enabled_services.clone(),\n        gossip_advertise_addr: node_config.gossip_advertise_addr,\n        grpc_advertise_addr: node_config.grpc_advertise_addr,\n        indexing_tasks,\n        indexing_cpu_capacity,\n        ingester_status: IngesterStatus::default(),\n        availability_zone: node_config.availability_zone.clone(),\n    };\n    let failure_detector_config = FailureDetectorConfig {\n        dead_node_grace_period: Duration::from_secs(2 * 60 * 60), // 2 hours\n        ..Default::default()\n    };\n    let client_grpc_config = make_client_grpc_config(&node_config.grpc_config)?;\n    let cluster = Cluster::join(\n        cluster_id,\n        self_node,\n        gossip_listen_addr,\n        peer_seed_addrs,\n        node_config.gossip_interval,\n        failure_detector_config,\n        &CountingUdpTransport,\n        client_grpc_config,\n    )\n    .await?;\n    if node_config\n        .enabled_services\n        .contains(&QuickwitService::Indexer)\n    {\n        cluster\n            .set_self_key_value(INDEXING_CPU_CAPACITY_KEY, 
indexing_cpu_capacity)\n            .await;\n    }\n    Ok(cluster)\n}\n\npub fn make_client_grpc_config(grpc_config: &GrpcConfig) -> anyhow::Result<ClientGrpcConfig> {\n    let tls_config_opt = grpc_config\n        .tls\n        .as_ref()\n        .map(make_client_tls_config)\n        .transpose()?;\n    Ok(ClientGrpcConfig {\n        keep_alive_opt: grpc_config.keep_alive.clone().map(Into::into),\n        tls_config_opt,\n    })\n}\n\nfn make_client_tls_config(tls_config: &TlsConfig) -> anyhow::Result<ClientTlsConfig> {\n    let pem = std::fs::read_to_string(&tls_config.ca_path)?;\n    let ca = Certificate::from_pem(pem);\n    let mut tls = ClientTlsConfig::new().ca_certificate(ca);\n\n    if tls_config.validate_client {\n        let cert = std::fs::read_to_string(&tls_config.cert_path)?;\n        let key = std::fs::read_to_string(&tls_config.key_path)?;\n        let identity = Identity::from_pem(cert, key);\n        tls = tls.identity(identity);\n    }\n    if let Some(expected_name) = &tls_config.expected_name {\n        tls = tls.domain_name(expected_name);\n    }\n\n    Ok(tls)\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/member.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::mem::size_of;\nuse std::net::SocketAddr;\nuse std::str::FromStr;\n\nuse anyhow::Context;\nuse chitchat::{ChitchatId, NodeState, Version};\nuse quickwit_common::shared_consts::INGESTER_STATUS_KEY;\nuse quickwit_proto::indexing::{CpuCapacity, IndexingTask};\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse quickwit_proto::types::NodeId;\nuse tracing::{error, warn};\n\nuse crate::cluster::parse_indexing_tasks;\nuse crate::{GenerationId, QuickwitService};\n\n// Keys used to store member's data in chitchat state.\npub(crate) const GRPC_ADVERTISE_ADDR_KEY: &str = \"grpc_advertise_addr\";\npub(crate) const ENABLED_SERVICES_KEY: &str = \"enabled_services\";\npub(crate) const PIPELINE_METRICS_PREFIX: &str = \"pipeline_metrics:\";\n\n// Readiness key and values used to store node's readiness in Chitchat state.\npub(crate) const READINESS_KEY: &str = \"readiness\";\npub(crate) const READINESS_VALUE_READY: &str = \"READY\";\npub(crate) const READINESS_VALUE_NOT_READY: &str = \"NOT_READY\";\n\npub(crate) const AVAILABILITY_ZONE_KEY: &str = \"availability_zone\";\n\npub const INDEXING_CPU_CAPACITY_KEY: &str = \"indexing_cpu_capacity\";\n\npub(crate) trait NodeStateExt {\n    fn grpc_advertise_addr(&self) -> anyhow::Result<SocketAddr>;\n\n    fn is_ready(&self) -> bool;\n\n    fn size_bytes(&self) -> 
usize;\n\n    fn ingester_status(&self) -> IngesterStatus;\n\n    fn availability_zone(&self) -> Option<String>;\n}\n\nimpl NodeStateExt for NodeState {\n    fn grpc_advertise_addr(&self) -> anyhow::Result<SocketAddr> {\n        self.get(GRPC_ADVERTISE_ADDR_KEY)\n            .with_context(|| {\n                format!(\"could not find key `{GRPC_ADVERTISE_ADDR_KEY}` in Chitchat node state\")\n            })\n            .map(|grpc_advertise_addr_value| {\n                grpc_advertise_addr_value.parse().with_context(|| {\n                    format!(\"failed to parse gRPC advertise address `{grpc_advertise_addr_value}`\")\n                })\n            })?\n    }\n\n    fn is_ready(&self) -> bool {\n        self.get(READINESS_KEY)\n            .map(|health_value| health_value == READINESS_VALUE_READY)\n            .unwrap_or(false)\n    }\n\n    // TODO: Expose more accurate size of the state in Chitchat.\n    fn size_bytes(&self) -> usize {\n        const SIZE_OF_VERSION: usize = size_of::<Version>();\n        const SIZE_OF_TOMBSTONE: usize = size_of::<u64>();\n\n        self.key_values_including_deleted()\n            .map(|(key, value)| key.len() + value.value.len() + SIZE_OF_VERSION + SIZE_OF_TOMBSTONE)\n            .sum()\n    }\n\n    fn ingester_status(&self) -> IngesterStatus {\n        self.get(INGESTER_STATUS_KEY)\n            .and_then(IngesterStatus::from_json_str_name)\n            .unwrap_or_default()\n    }\n\n    fn availability_zone(&self) -> Option<String> {\n        self.get(AVAILABILITY_ZONE_KEY).map(|az| az.to_string())\n    }\n}\n\n/// Cluster member.\n#[derive(Debug, Clone, Eq, PartialEq)]\npub struct ClusterMember {\n    /// A unique node ID across the cluster.\n    /// The Chitchat node ID is the concatenation of the node ID and the start timestamp:\n    /// `{node_id}/{start_timestamp}`.\n    pub node_id: NodeId,\n    /// The start timestamp (nanoseconds) of the node.\n    pub generation_id: GenerationId,\n    /// Enabled services, i.e. 
services configured to run on the node. Depending on the node and\n    /// service health, each service may or may not be available/running.\n    pub enabled_services: HashSet<QuickwitService>,\n    /// Gossip advertise address, i.e. the address that other nodes should use to gossip with the\n    /// node.\n    pub gossip_advertise_addr: SocketAddr,\n    /// gRPC advertise address, i.e. the address that other nodes should use to communicate with\n    /// the node via gRPC.\n    pub grpc_advertise_addr: SocketAddr,\n    /// Running indexing plan.\n    /// Empty if the node is not an indexer or the indexer has not yet started any indexing\n    /// pipelines.\n    pub indexing_tasks: Vec<IndexingTask>,\n    /// Indexing CPU capacity of the node expressed in millicpus.\n    pub indexing_cpu_capacity: CpuCapacity,\n    /// Status of the ingester service running on the node. `IngesterStatus::Unspecified` if the\n    /// node is not an ingester.\n    pub ingester_status: IngesterStatus,\n    /// Whether the node is ready to serve requests.\n    pub is_ready: bool,\n    /// Availability zone the node is running in, if enabled.\n    pub availability_zone: Option<String>,\n}\n\nimpl ClusterMember {\n    pub fn chitchat_id(&self) -> ChitchatId {\n        ChitchatId::new(\n            self.node_id.clone().into(),\n            self.generation_id.as_u64(),\n            self.gossip_advertise_addr,\n        )\n    }\n}\n\nimpl From<ClusterMember> for ChitchatId {\n    fn from(member: ClusterMember) -> Self {\n        member.chitchat_id()\n    }\n}\n\nfn parse_indexing_cpu_capacity(node_state: &NodeState) -> CpuCapacity {\n    let Some(indexing_capacity_str) = node_state.get(INDEXING_CPU_CAPACITY_KEY) else {\n        return CpuCapacity::zero();\n    };\n    if let Ok(indexing_capacity) = CpuCapacity::from_str(indexing_capacity_str) {\n        indexing_capacity\n    } else {\n        error!(indexing_capacity=?indexing_capacity_str, \"received an unparsable indexing capacity from 
node\");\n        CpuCapacity::zero()\n    }\n}\n\n// Builds a cluster member from a [`NodeState`].\npub(crate) fn build_cluster_member(\n    chitchat_id: ChitchatId,\n    node_state: &NodeState,\n) -> anyhow::Result<ClusterMember> {\n    let is_ready = node_state.is_ready();\n    let enabled_services = node_state\n        .get(ENABLED_SERVICES_KEY)\n        .ok_or_else(|| {\n            anyhow::anyhow!(\n                \"could not find `{}` key in node `{}` state\",\n                ENABLED_SERVICES_KEY,\n                chitchat_id.node_id\n            )\n        })\n        .map(|enabled_services_str| {\n            parse_enabled_services_str(enabled_services_str, &chitchat_id.node_id)\n        })?;\n    let grpc_advertise_addr = node_state.grpc_advertise_addr()?;\n    let indexing_tasks = parse_indexing_tasks(node_state);\n    let indexing_cpu_capacity = parse_indexing_cpu_capacity(node_state);\n    let ingester_status = node_state.ingester_status();\n    let availability_zone = node_state.availability_zone();\n\n    let member = ClusterMember {\n        node_id: chitchat_id.node_id.into(),\n        generation_id: chitchat_id.generation_id.into(),\n        is_ready,\n        enabled_services,\n        gossip_advertise_addr: chitchat_id.gossip_advertise_addr,\n        grpc_advertise_addr,\n        indexing_tasks,\n        indexing_cpu_capacity,\n        ingester_status,\n        availability_zone,\n    };\n    Ok(member)\n}\n\nfn parse_enabled_services_str(\n    enabled_services_str: &str,\n    node_id: &str,\n) -> HashSet<QuickwitService> {\n    let enabled_services: HashSet<QuickwitService> = enabled_services_str\n        .split(',')\n        .filter(|service_str| !service_str.is_empty())\n        .filter_map(|service_str| match service_str.parse() {\n            Ok(service) => Some(service),\n            Err(_) => {\n                warn!(\n                    node_id=%node_id,\n                    service=%service_str,\n                    \"Found unknown 
service enabled on node.\"\n                );\n                None\n            }\n        })\n        .collect();\n    if enabled_services.is_empty() {\n        warn!(\n            node_id=%node_id,\n            \"Node has no enabled services.\"\n        )\n    }\n    enabled_services\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::net::SocketAddr;\nuse std::sync::Weak;\nuse std::time::Duration;\n\nuse chitchat::{Chitchat, ChitchatId};\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{IntCounter, IntGauge, new_counter, new_gauge};\nuse tokio::sync::Mutex;\n\nuse crate::member::NodeStateExt;\n\npub struct ClusterMetrics {\n    pub live_nodes: IntGauge,\n    pub ready_nodes: IntGauge,\n    pub zombie_nodes: IntGauge,\n    pub dead_nodes: IntGauge,\n    pub cluster_state_size_bytes: IntGauge,\n    pub node_state_size_bytes: IntGauge,\n    pub node_state_keys: IntGauge,\n    pub gossip_recv_messages_total: IntCounter,\n    pub gossip_recv_bytes_total: IntCounter,\n    pub gossip_sent_messages_total: IntCounter,\n    pub gossip_sent_bytes_total: IntCounter,\n    pub grpc_gossip_rounds_total: IntCounter,\n}\n\nimpl Default for ClusterMetrics {\n    fn default() -> Self {\n        ClusterMetrics {\n            live_nodes: new_gauge(\n                \"live_nodes\",\n                \"The number of live nodes observed locally.\",\n                \"cluster\",\n                &[],\n            ),\n            ready_nodes: new_gauge(\n                \"ready_nodes\",\n                \"The number of ready nodes observed locally.\",\n                \"cluster\",\n                &[],\n            ),\n            zombie_nodes: 
new_gauge(\n                \"zombie_nodes\",\n                \"The number of zombie nodes observed locally.\",\n                \"cluster\",\n                &[],\n            ),\n            dead_nodes: new_gauge(\n                \"dead_nodes\",\n                \"The number of dead nodes observed locally.\",\n                \"cluster\",\n                &[],\n            ),\n            cluster_state_size_bytes: new_gauge(\n                \"cluster_state_size_bytes\",\n                \"The size of the cluster state in bytes.\",\n                \"cluster\",\n                &[],\n            ),\n            node_state_keys: new_gauge(\n                \"node_state_keys\",\n                \"The number of keys in the node state.\",\n                \"cluster\",\n                &[],\n            ),\n            node_state_size_bytes: new_gauge(\n                \"node_state_size_bytes\",\n                \"The size of the node state in bytes.\",\n                \"cluster\",\n                &[],\n            ),\n            gossip_recv_messages_total: new_counter(\n                \"gossip_recv_messages_total\",\n                \"Total number of gossip messages received.\",\n                \"cluster\",\n                &[],\n            ),\n            gossip_recv_bytes_total: new_counter(\n                \"gossip_recv_bytes_total\",\n                \"Total amount of gossip data received in bytes.\",\n                \"cluster\",\n                &[],\n            ),\n            gossip_sent_messages_total: new_counter(\n                \"gossip_sent_messages_total\",\n                \"Total number of gossip messages sent.\",\n                \"cluster\",\n                &[],\n            ),\n            gossip_sent_bytes_total: new_counter(\n                \"gossip_sent_bytes_total\",\n                \"Total amount of gossip data sent in bytes.\",\n                \"cluster\",\n                &[],\n            ),\n            
grpc_gossip_rounds_total: new_counter(\n                \"grpc_gossip_rounds_total\",\n                \"Total number of gRPC gossip rounds performed with peer nodes.\",\n                \"cluster\",\n                &[],\n            ),\n        }\n    }\n}\n\npub static CLUSTER_METRICS: Lazy<ClusterMetrics> = Lazy::new(ClusterMetrics::default);\n\npub(crate) fn spawn_metrics_task(\n    weak_chitchat: Weak<Mutex<Chitchat>>,\n    self_chitchat_id: ChitchatId,\n) {\n    const METRICS_INTERVAL: Duration = Duration::from_secs(15);\n\n    const SIZE_OF_GENERATION_ID: usize = std::mem::size_of::<u64>();\n    const SIZE_OF_SOCKET_ADDR: usize = std::mem::size_of::<SocketAddr>();\n\n    let future = async move {\n        let mut interval = tokio::time::interval(METRICS_INTERVAL);\n\n        while let Some(chitchat) = weak_chitchat.upgrade() {\n            interval.tick().await;\n\n            let mut num_ready_nodes = 0;\n            let mut cluster_state_size_bytes = 0;\n\n            let chitchat_guard = chitchat.lock().await;\n            let live_nodes: HashSet<&ChitchatId> = chitchat_guard.live_nodes().collect();\n\n            let num_live_nodes = live_nodes.len();\n            let num_zombie_nodes = chitchat_guard.scheduled_for_deletion_nodes().count();\n            let num_dead_nodes = chitchat_guard.dead_nodes().count();\n\n            for (chitchat_id, node_state) in chitchat_guard.node_states() {\n                if live_nodes.contains(chitchat_id) && node_state.is_ready() {\n                    num_ready_nodes += 1;\n                }\n                let chitchat_id_size_bytes =\n                    chitchat_id.node_id.len() + SIZE_OF_GENERATION_ID + SIZE_OF_SOCKET_ADDR;\n                let node_state_size_bytes = node_state.size_bytes();\n\n                cluster_state_size_bytes += chitchat_id_size_bytes + node_state_size_bytes;\n\n                if *chitchat_id == self_chitchat_id {\n                    CLUSTER_METRICS\n                        
.node_state_keys\n                        .set(node_state.num_key_values() as i64);\n                    CLUSTER_METRICS\n                        .node_state_size_bytes\n                        .set(node_state_size_bytes as i64);\n                }\n            }\n            drop(chitchat_guard);\n\n            CLUSTER_METRICS.live_nodes.set(num_live_nodes as i64);\n            CLUSTER_METRICS.ready_nodes.set(num_ready_nodes as i64);\n            CLUSTER_METRICS.zombie_nodes.set(num_zombie_nodes as i64);\n            CLUSTER_METRICS.dead_nodes.set(num_dead_nodes as i64);\n\n            CLUSTER_METRICS\n                .cluster_state_size_bytes\n                .set(cluster_state_size_bytes as i64);\n        }\n    };\n    tokio::spawn(future);\n}\n"
  },
  {
    "path": "quickwit/quickwit-cluster/src/node.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::fmt::Debug;\nuse std::net::SocketAddr;\nuse std::sync::Arc;\n\nuse chitchat::{ChitchatId, NodeState};\nuse quickwit_config::service::QuickwitService;\nuse quickwit_proto::indexing::{CpuCapacity, IndexingTask};\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse quickwit_proto::types::NodeIdRef;\nuse tonic::transport::Channel;\n\nuse crate::member::build_cluster_member;\n\n#[derive(Clone)]\npub struct ClusterNode {\n    inner: Arc<InnerNode>,\n}\n\nimpl ClusterNode {\n    /// Attempts to create a new `ClusterNode` from a Chitchat `NodeState`.\n    pub(crate) fn try_new(\n        chitchat_id: ChitchatId,\n        node_state: &NodeState,\n        channel: Channel,\n        is_self_node: bool,\n    ) -> anyhow::Result<Self> {\n        let member = build_cluster_member(chitchat_id.clone(), node_state)?;\n        let inner = InnerNode {\n            chitchat_id,\n            channel,\n            enabled_services: member.enabled_services,\n            grpc_advertise_addr: member.grpc_advertise_addr,\n            indexing_tasks: member.indexing_tasks,\n            indexing_capacity: member.indexing_cpu_capacity,\n            ingester_status: member.ingester_status,\n            is_ready: member.is_ready,\n            is_self_node,\n            availability_zone: member.availability_zone,\n        };\n   
     let node = ClusterNode {\n            inner: Arc::new(inner),\n        };\n        Ok(node)\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn for_test(\n        node_id: &str,\n        port: u16,\n        is_self_node: bool,\n        enabled_services: &[&str],\n        indexing_tasks: &[IndexingTask],\n        ingester_status: IngesterStatus,\n    ) -> Self {\n        use quickwit_common::shared_consts::INGESTER_STATUS_KEY;\n        use quickwit_common::tower::{ClientGrpcConfig, make_channel};\n\n        use crate::cluster::set_indexing_tasks_in_node_state;\n        use crate::member::{ENABLED_SERVICES_KEY, GRPC_ADVERTISE_ADDR_KEY};\n\n        let gossip_advertise_addr = ([127, 0, 0, 1], port).into();\n        let grpc_advertise_addr = ([127, 0, 0, 1], port + 1).into();\n        let chitchat_id = ChitchatId::new(node_id.to_string(), 0, gossip_advertise_addr);\n        let channel = make_channel(grpc_advertise_addr, ClientGrpcConfig::default()).await;\n        let mut node_state = NodeState::for_test();\n        node_state.set(ENABLED_SERVICES_KEY, enabled_services.join(\",\"));\n        node_state.set(GRPC_ADVERTISE_ADDR_KEY, grpc_advertise_addr.to_string());\n        node_state.set(INGESTER_STATUS_KEY, ingester_status.as_json_str_name());\n        set_indexing_tasks_in_node_state(indexing_tasks, &mut node_state);\n        Self::try_new(chitchat_id, &node_state, channel, is_self_node).unwrap()\n    }\n\n    pub fn chitchat_id(&self) -> &ChitchatId {\n        &self.inner.chitchat_id\n    }\n\n    pub fn node_id(&self) -> &NodeIdRef {\n        NodeIdRef::from_str(&self.inner.chitchat_id.node_id)\n    }\n\n    pub fn channel(&self) -> Channel {\n        self.inner.channel.clone()\n    }\n\n    pub fn enabled_services(&self) -> &HashSet<QuickwitService> {\n        &self.inner.enabled_services\n    }\n\n    pub fn is_indexer(&self) -> bool {\n        self.inner\n            .enabled_services\n            
.contains(&QuickwitService::Indexer)\n    }\n\n    pub fn is_ingester(&self) -> bool {\n        self.inner\n            .enabled_services\n            .contains(&QuickwitService::Indexer)\n    }\n\n    pub fn is_searcher(&self) -> bool {\n        self.inner\n            .enabled_services\n            .contains(&QuickwitService::Searcher)\n    }\n\n    pub fn grpc_advertise_addr(&self) -> SocketAddr {\n        self.inner.grpc_advertise_addr\n    }\n\n    pub fn indexing_tasks(&self) -> &[IndexingTask] {\n        &self.inner.indexing_tasks\n    }\n\n    pub fn indexing_capacity(&self) -> CpuCapacity {\n        self.inner.indexing_capacity\n    }\n\n    pub fn ingester_status(&self) -> IngesterStatus {\n        self.inner.ingester_status\n    }\n\n    pub fn is_ready(&self) -> bool {\n        self.inner.is_ready\n    }\n\n    pub fn is_self_node(&self) -> bool {\n        self.inner.is_self_node\n    }\n\n    pub fn availability_zone(&self) -> Option<&str> {\n        self.inner.availability_zone.as_deref()\n    }\n}\n\nimpl Debug for ClusterNode {\n    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        f.debug_struct(\"Node\")\n            .field(\"node_id\", &self.inner.chitchat_id.node_id)\n            .field(\"enabled_services\", &self.inner.enabled_services)\n            .field(\"is_ready\", &self.inner.is_ready)\n            .finish()\n    }\n}\n\n#[cfg(test)]\nimpl PartialEq for ClusterNode {\n    fn eq(&self, other: &Self) -> bool {\n        self.inner.chitchat_id == other.inner.chitchat_id\n            && self.inner.enabled_services == other.inner.enabled_services\n            && self.inner.grpc_advertise_addr == other.inner.grpc_advertise_addr\n            && self.inner.indexing_tasks == other.inner.indexing_tasks\n            && self.inner.is_ready == other.inner.is_ready\n            && self.inner.is_self_node == other.inner.is_self_node\n            && self.inner.availability_zone == other.inner.availability_zone\n    
}\n}\n\nstruct InnerNode {\n    chitchat_id: ChitchatId,\n    channel: Channel,\n    enabled_services: HashSet<QuickwitService>,\n    grpc_advertise_addr: SocketAddr,\n    indexing_tasks: Vec<IndexingTask>,\n    indexing_capacity: CpuCapacity,\n    ingester_status: IngesterStatus,\n    is_ready: bool,\n    is_self_node: bool,\n    availability_zone: Option<String>,\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/Cargo.toml",
    "content": "[package]\nname = \"quickwit-codegen\"\ndescription = \"Generate traits, adapters, and gRPC clients/servers from proto files.\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nheck = { workspace = true }\nprettyplease = { workspace = true }\nproc-macro2 = { workspace = true }\nprost-build = { workspace = true }\nquote = { workspace = true }\nsyn = { workspace = true }\ntonic-prost-build = { workspace = true }\n\n[dev-dependencies]\nfutures = { workspace = true }\nserde = { workspace = true }\n"
  },
  {
    "path": "quickwit/quickwit-codegen/README.md",
    "content": "# Quickwit codegen\n\n## Getting Started\n\n1. Describe your service in a proto file.\n\n2. Define an error and a result type for your service. The error type must implement `quickwit_proto::error::GrpcServiceError` and have at least the three following variants: `Internal`, `Timeout`, and `Unavailable`.\n\n3. Add the following dependencies to your project:\n\n```toml\n[dependencies]\nasync-trait = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nhttp = { workspace = true }\nhyper = { workspace = true }\nprost = { workspace = true }\nserde = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntonic = { workspace = true }\ntower = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\nmockall = { workspace = true }\n\n[build-dependencies]\nquickwit-codegen = { workspace = true }\n```\n\n4. Run the code generation logic as part of a Cargo build script:\n\n```rust\nuse quickwit_codegen::Codegen;\n\nfn main() {\n    Codegen::builder()\n        .with_protos(&[\"src/hello.proto\"])\n        .with_output_dir(\"src/\")\n        .with_result_type_path(\"crate::HelloResult\")\n        .with_error_type_path(\"crate::HelloError\")\n        .run()\n        .unwrap();\n}\n```\n\n5. If additional prost settings need to be configured they can be provided the following way:\n\n```rust\nuse quickwit_codegen::Codegen;\n\nfn main() {\n    let mut config = prost_build::Config::default();\n    config.bytes([\"PingRequest.name\", \"PingResponse.name\"]);\n    Codegen::builder()\n        .with_protos(&[\"src/hello.proto\"])\n        .with_output_dir(\"src/codegen/\")\n        .with_result_type_path(\"crate::HelloResult\")\n        .with_error_type_path(\"crate::HelloError\")\n        .with_prost_config(config)\n        .run()\n        .unwrap();\n}\n```\n\n\n6. 
Import and implement the generated service trait and use the various generated adapters to instantiate a gRPC server, or use a local or remote gRPC implementation with the same client interface.\n\nCheck out the complete working example in the `quickwit-codegen-example` crate.\n"
  },
  {
    "path": "quickwit/quickwit-codegen/example/Cargo.toml",
    "content": "[package]\nname = \"quickwit-codegen-example\"\ndescription = \"Demonstrates how to set up, configure, and run code generation for a simple service\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbytesize = { workspace = true }\nfutures = { workspace = true }\nhttp = { workspace = true }\nmockall = { workspace = true, optional = true }\nprost = { workspace = true }\nserde = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntokio-stream = { workspace = true }\ntonic = { workspace = true }\ntonic-prost = { workspace = true }\ntower = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\nmockall = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\n\n[build-dependencies]\nquickwit-codegen = { workspace = true }\n\n[features]\ntestsuite = [\"mockall\"]\n"
  },
  {
    "path": "quickwit/quickwit-codegen/example/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_codegen::Codegen;\n\nfn main() {\n    Codegen::builder()\n        .with_protos(&[\"src/hello.proto\"])\n        .with_output_dir(\"src/codegen/\")\n        .with_result_type_path(\"crate::HelloResult\")\n        .with_error_type_path(\"crate::HelloError\")\n        .generate_extra_service_methods()\n        .generate_rpc_name_impls()\n        .run()\n        .unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/example/src/codegen/hello.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct HelloRequest {\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct HelloResponse {\n    #[prost(string, tag = \"1\")]\n    pub message: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GoodbyeRequest {\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GoodbyeResponse {\n    #[prost(string, tag = \"1\")]\n    pub message: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PingRequest {\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PingResponse {\n    #[prost(string, tag = \"1\")]\n    pub message: ::prost::alloc::string::String,\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for HelloRequest {\n    fn rpc_name() -> &'static str {\n        \"hello\"\n    }\n}\nimpl RpcName for GoodbyeRequest {\n    fn rpc_name() -> &'static str {\n        \"goodbye\"\n    }\n}\nimpl RpcName for PingRequest {\n    fn rpc_name() -> &'static str {\n        \"ping\"\n    }\n}\npub type HelloStream<T> = 
quickwit_common::ServiceStream<crate::HelloResult<T>>;\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait Hello: std::fmt::Debug + Send + Sync + 'static {\n    ///Says hello.\n    async fn hello(&self, request: HelloRequest) -> crate::HelloResult<HelloResponse>;\n    ///Says goodbye.\n    async fn goodbye(\n        &self,\n        request: GoodbyeRequest,\n    ) -> crate::HelloResult<GoodbyeResponse>;\n    ///Ping pong.\n    async fn ping(\n        &self,\n        request: quickwit_common::ServiceStream<PingRequest>,\n    ) -> crate::HelloResult<HelloStream<PingResponse>>;\n    async fn check_connectivity(&self) -> anyhow::Result<()>;\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri>;\n}\n#[derive(Debug, Clone)]\npub struct HelloClient {\n    inner: InnerHelloClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerHelloClient(std::sync::Arc<dyn Hello>);\nimpl HelloClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: Hello,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: < MockHello > (),\n            \"`MockHello` must be wrapped in a `MockHelloWrapper`: use `HelloClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerHelloClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> hello_grpc_server::HelloGrpcServer<HelloGrpcServerAdapter> {\n        let adapter = HelloGrpcServerAdapter::new(self.clone());\n        hello_grpc_server::HelloGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            
.send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = hello_grpc_client::HelloGrpcClient::new(channel)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = HelloGrpcClientAdapter::new(client, connection_keys_watcher);\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> HelloClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = hello_grpc_client::HelloGrpcClient::new(balance_channel)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = 
HelloGrpcClientAdapter::new(client, connection_keys_watcher);\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        HelloMailbox<A>: Hello,\n    {\n        HelloClient::new(HelloMailbox::new(mailbox))\n    }\n    pub fn tower() -> HelloTowerLayerStack {\n        HelloTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockHello) -> Self {\n        let mock_wrapper = mock_hello::MockHelloWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockHello::new())\n    }\n}\n#[async_trait::async_trait]\nimpl Hello for HelloClient {\n    async fn hello(&self, request: HelloRequest) -> crate::HelloResult<HelloResponse> {\n        self.inner.0.hello(request).await\n    }\n    async fn goodbye(\n        &self,\n        request: GoodbyeRequest,\n    ) -> crate::HelloResult<GoodbyeResponse> {\n        self.inner.0.goodbye(request).await\n    }\n    async fn ping(\n        &self,\n        request: quickwit_common::ServiceStream<PingRequest>,\n    ) -> crate::HelloResult<HelloStream<PingResponse>> {\n        self.inner.0.ping(request).await\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.inner.0.check_connectivity().await\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        self.inner.0.endpoints()\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_hello {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockHelloWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockHello>,\n    }\n    #[async_trait::async_trait]\n    impl Hello for MockHelloWrapper {\n        async fn hello(\n            &self,\n            request: 
super::HelloRequest,\n        ) -> crate::HelloResult<super::HelloResponse> {\n            self.inner.lock().await.hello(request).await\n        }\n        async fn goodbye(\n            &self,\n            request: super::GoodbyeRequest,\n        ) -> crate::HelloResult<super::GoodbyeResponse> {\n            self.inner.lock().await.goodbye(request).await\n        }\n        async fn ping(\n            &self,\n            request: quickwit_common::ServiceStream<super::PingRequest>,\n        ) -> crate::HelloResult<HelloStream<super::PingResponse>> {\n            self.inner.lock().await.ping(request).await\n        }\n        async fn check_connectivity(&self) -> anyhow::Result<()> {\n            self.inner.lock().await.check_connectivity().await\n        }\n        fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n            futures::executor::block_on(self.inner.lock()).endpoints()\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<HelloRequest> for InnerHelloClient {\n    type Response = HelloResponse;\n    type Error = crate::HelloError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: HelloRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.hello(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<GoodbyeRequest> for InnerHelloClient {\n    type Response = GoodbyeResponse;\n    type Error = crate::HelloError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn 
call(&mut self, request: GoodbyeRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.goodbye(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<quickwit_common::ServiceStream<PingRequest>> for InnerHelloClient {\n    type Response = HelloStream<PingResponse>;\n    type Error = crate::HelloError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(\n        &mut self,\n        request: quickwit_common::ServiceStream<PingRequest>,\n    ) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.ping(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct HelloTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerHelloClient,\n    hello_svc: quickwit_common::tower::BoxService<\n        HelloRequest,\n        HelloResponse,\n        crate::HelloError,\n    >,\n    goodbye_svc: quickwit_common::tower::BoxService<\n        GoodbyeRequest,\n        GoodbyeResponse,\n        crate::HelloError,\n    >,\n    ping_svc: quickwit_common::tower::BoxService<\n        quickwit_common::ServiceStream<PingRequest>,\n        HelloStream<PingResponse>,\n        crate::HelloError,\n    >,\n}\n#[async_trait::async_trait]\nimpl Hello for HelloTowerServiceStack {\n    async fn hello(&self, request: HelloRequest) -> crate::HelloResult<HelloResponse> {\n        self.hello_svc.clone().ready().await?.call(request).await\n    }\n    async fn goodbye(\n        &self,\n        request: GoodbyeRequest,\n    ) -> crate::HelloResult<GoodbyeResponse> {\n        self.goodbye_svc.clone().ready().await?.call(request).await\n    }\n    async fn ping(\n        &self,\n        request: 
quickwit_common::ServiceStream<PingRequest>,\n    ) -> crate::HelloResult<HelloStream<PingResponse>> {\n        self.ping_svc.clone().ready().await?.call(request).await\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.inner.0.check_connectivity().await\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        self.inner.0.endpoints()\n    }\n}\ntype HelloLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<HelloRequest, HelloResponse, crate::HelloError>,\n    HelloRequest,\n    HelloResponse,\n    crate::HelloError,\n>;\ntype GoodbyeLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        GoodbyeRequest,\n        GoodbyeResponse,\n        crate::HelloError,\n    >,\n    GoodbyeRequest,\n    GoodbyeResponse,\n    crate::HelloError,\n>;\ntype PingLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        quickwit_common::ServiceStream<PingRequest>,\n        HelloStream<PingResponse>,\n        crate::HelloError,\n    >,\n    quickwit_common::ServiceStream<PingRequest>,\n    HelloStream<PingResponse>,\n    crate::HelloError,\n>;\n#[derive(Debug, Default)]\npub struct HelloTowerLayerStack {\n    hello_layers: Vec<HelloLayer>,\n    goodbye_layers: Vec<GoodbyeLayer>,\n    ping_layers: Vec<PingLayer>,\n}\nimpl HelloTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    HelloRequest,\n                    HelloResponse,\n                    crate::HelloError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                HelloRequest,\n                HelloResponse,\n                crate::HelloError,\n            >,\n        >>::Service: tower::Service<\n                HelloRequest,\n                
Response = HelloResponse,\n                Error = crate::HelloError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                HelloRequest,\n                HelloResponse,\n                crate::HelloError,\n            >,\n        >>::Service as tower::Service<HelloRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GoodbyeRequest,\n                    GoodbyeResponse,\n                    crate::HelloError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GoodbyeRequest,\n                GoodbyeResponse,\n                crate::HelloError,\n            >,\n        >>::Service: tower::Service<\n                GoodbyeRequest,\n                Response = GoodbyeResponse,\n                Error = crate::HelloError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GoodbyeRequest,\n                GoodbyeResponse,\n                crate::HelloError,\n            >,\n        >>::Service as tower::Service<GoodbyeRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    quickwit_common::ServiceStream<PingRequest>,\n                    HelloStream<PingResponse>,\n                    crate::HelloError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                quickwit_common::ServiceStream<PingRequest>,\n                HelloStream<PingResponse>,\n                crate::HelloError,\n            >,\n        >>::Service: tower::Service<\n                quickwit_common::ServiceStream<PingRequest>,\n                Response 
= HelloStream<PingResponse>,\n                Error = crate::HelloError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                quickwit_common::ServiceStream<PingRequest>,\n                HelloStream<PingResponse>,\n                crate::HelloError,\n            >,\n        >>::Service as tower::Service<\n            quickwit_common::ServiceStream<PingRequest>,\n        >>::Future: Send + 'static,\n    {\n        self.hello_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.goodbye_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.ping_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_hello_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    HelloRequest,\n                    HelloResponse,\n                    crate::HelloError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                HelloRequest,\n                Response = HelloResponse,\n                Error = crate::HelloError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<HelloRequest>>::Future: Send + 'static,\n    {\n        self.hello_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_goodbye_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GoodbyeRequest,\n                    GoodbyeResponse,\n                    crate::HelloError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                GoodbyeRequest,\n                Response = GoodbyeResponse,\n                Error = crate::HelloError,\n            > + 
Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<GoodbyeRequest>>::Future: Send + 'static,\n    {\n        self.goodbye_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_ping_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    quickwit_common::ServiceStream<PingRequest>,\n                    HelloStream<PingResponse>,\n                    crate::HelloError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                quickwit_common::ServiceStream<PingRequest>,\n                Response = HelloStream<PingResponse>,\n                Error = crate::HelloError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            quickwit_common::ServiceStream<PingRequest>,\n        >>::Future: Send + 'static,\n    {\n        self.ping_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> HelloClient\n    where\n        T: Hello,\n    {\n        let inner_client = InnerHelloClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> HelloClient {\n        let client = HelloClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n     
   max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> HelloClient {\n        let client = HelloClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> HelloClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        HelloMailbox<A>: Hello,\n    {\n        let inner_client = InnerHelloClient(\n            std::sync::Arc::new(HelloMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockHello) -> HelloClient {\n        let client = HelloClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(self, inner_client: InnerHelloClient) -> HelloClient {\n        let hello_svc = self\n            .hello_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let goodbye_svc = self\n            .goodbye_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let ping_svc = self\n            .ping_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n       
 let tower_svc_stack = HelloTowerServiceStack {\n            inner: inner_client,\n            hello_svc,\n            goodbye_svc,\n            ping_svc,\n        };\n        HelloClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct HelloMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::HelloError>,\n}\nimpl<A: quickwit_actors::Actor> HelloMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for HelloMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for HelloMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::HelloError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::HelloError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! 
This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> Hello for HelloMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    HelloMailbox<\n        A,\n    >: tower::Service<\n            HelloRequest,\n            Response = HelloResponse,\n            Error = crate::HelloError,\n            Future = BoxFuture<HelloResponse, crate::HelloError>,\n        >\n        + tower::Service<\n            GoodbyeRequest,\n            Response = GoodbyeResponse,\n            Error = crate::HelloError,\n            Future = BoxFuture<GoodbyeResponse, crate::HelloError>,\n        >\n        + tower::Service<\n            quickwit_common::ServiceStream<PingRequest>,\n            Response = HelloStream<PingResponse>,\n            Error = crate::HelloError,\n            Future = BoxFuture<HelloStream<PingResponse>, crate::HelloError>,\n        >,\n{\n    async fn hello(&self, request: HelloRequest) -> crate::HelloResult<HelloResponse> {\n        self.clone().call(request).await\n    }\n    async fn goodbye(\n        &self,\n        request: GoodbyeRequest,\n    ) -> crate::HelloResult<GoodbyeResponse> {\n        self.clone().call(request).await\n    }\n    async fn ping(\n        &self,\n        request: quickwit_common::ServiceStream<PingRequest>,\n    ) -> crate::HelloResult<HelloStream<PingResponse>> {\n        self.clone().call(request).await\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if 
self.inner.is_disconnected() {\n            anyhow::bail!(\"actor `{}` is disconnected\", self.inner.actor_instance_id())\n        }\n        Ok(())\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        vec![\n            quickwit_common::uri::Uri::from_str(& format!(\"actor://localhost/{}\", self\n            .inner.actor_instance_id())).expect(\"URI should be valid\")\n        ]\n    }\n}\n#[derive(Debug, Clone)]\npub struct HelloGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> HelloGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> Hello for HelloGrpcClientAdapter<hello_grpc_client::HelloGrpcClient<T>>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn hello(&self, request: HelloRequest) -> crate::HelloResult<HelloResponse> {\n        self.inner\n            .clone()\n            .hello(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                HelloRequest::rpc_name(),\n            ))\n    }\n    async fn goodbye(\n        &self,\n        request: GoodbyeRequest,\n    ) -> crate::HelloResult<GoodbyeResponse> {\n        self.inner\n            .clone()\n            
.goodbye(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                GoodbyeRequest::rpc_name(),\n            ))\n    }\n    async fn ping(\n        &self,\n        request: quickwit_common::ServiceStream<PingRequest>,\n    ) -> crate::HelloResult<HelloStream<PingResponse>> {\n        self.inner\n            .clone()\n            .ping(request)\n            .await\n            .map(|response| {\n                let streaming: tonic::Streaming<_> = response.into_inner();\n                let stream = quickwit_common::ServiceStream::from(streaming);\n                stream\n                    .map_err(|status| crate::error::grpc_status_to_service_error(\n                        status,\n                        PingRequest::rpc_name(),\n                    ))\n            })\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                PingRequest::rpc_name(),\n            ))\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if self.connection_addrs_rx.borrow().is_empty() {\n            anyhow::bail!(\"no server currently available\")\n        }\n        Ok(())\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        self.connection_addrs_rx\n            .borrow()\n            .iter()\n            .flat_map(|addr| quickwit_common::uri::Uri::from_str(\n                &format!(\"grpc://{addr}/{}.{}\", \"hello\", \"Hello\"),\n            ))\n            .collect()\n    }\n}\n#[derive(Debug)]\npub struct HelloGrpcServerAdapter {\n    inner: InnerHelloClient,\n}\nimpl HelloGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: Hello,\n    {\n        Self {\n            inner: InnerHelloClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl 
hello_grpc_server::HelloGrpc for HelloGrpcServerAdapter {\n    async fn hello(\n        &self,\n        request: tonic::Request<HelloRequest>,\n    ) -> Result<tonic::Response<HelloResponse>, tonic::Status> {\n        self.inner\n            .0\n            .hello(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn goodbye(\n        &self,\n        request: tonic::Request<GoodbyeRequest>,\n    ) -> Result<tonic::Response<GoodbyeResponse>, tonic::Status> {\n        self.inner\n            .0\n            .goodbye(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    type PingStream = quickwit_common::ServiceStream<tonic::Result<PingResponse>>;\n    async fn ping(\n        &self,\n        request: tonic::Request<tonic::Streaming<PingRequest>>,\n    ) -> Result<tonic::Response<Self::PingStream>, tonic::Status> {\n        self.inner\n            .0\n            .ping({\n                let streaming: tonic::Streaming<_> = request.into_inner();\n                quickwit_common::ServiceStream::from(streaming)\n            })\n            .await\n            .map(|stream| tonic::Response::new(\n                stream.map_err(crate::error::grpc_error_to_grpc_status),\n            ))\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod hello_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct HelloGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl HelloGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by 
connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> HelloGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> HelloGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            HelloGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, 
encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Says hello.\n        pub async fn hello(\n            &mut self,\n            request: impl tonic::IntoRequest<super::HelloRequest>,\n        ) -> std::result::Result<tonic::Response<super::HelloResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\"/hello.Hello/Hello\");\n            let mut req = request.into_request();\n            req.extensions_mut().insert(GrpcMethod::new(\"hello.Hello\", \"Hello\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Says goodbye.\n        pub async fn goodbye(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GoodbyeRequest>,\n        
) -> std::result::Result<\n            tonic::Response<super::GoodbyeResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\"/hello.Hello/Goodbye\");\n            let mut req = request.into_request();\n            req.extensions_mut().insert(GrpcMethod::new(\"hello.Hello\", \"Goodbye\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Ping pong.\n        pub async fn ping(\n            &mut self,\n            request: impl tonic::IntoStreamingRequest<Message = super::PingRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::PingResponse>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\"/hello.Hello/Ping\");\n            let mut req = request.into_streaming_request();\n            req.extensions_mut().insert(GrpcMethod::new(\"hello.Hello\", \"Ping\"));\n            self.inner.streaming(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod hello_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing 
gRPC methods that should be implemented for use with HelloGrpcServer.\n    #[async_trait]\n    pub trait HelloGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Says hello.\n        async fn hello(\n            &self,\n            request: tonic::Request<super::HelloRequest>,\n        ) -> std::result::Result<tonic::Response<super::HelloResponse>, tonic::Status>;\n        /// Says goodbye.\n        async fn goodbye(\n            &self,\n            request: tonic::Request<super::GoodbyeRequest>,\n        ) -> std::result::Result<tonic::Response<super::GoodbyeResponse>, tonic::Status>;\n        /// Server streaming response type for the Ping method.\n        type PingStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::PingResponse, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// Ping pong.\n        async fn ping(\n            &self,\n            request: tonic::Request<tonic::Streaming<super::PingRequest>>,\n        ) -> std::result::Result<tonic::Response<Self::PingStream>, tonic::Status>;\n    }\n    #[derive(Debug)]\n    pub struct HelloGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> HelloGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n   
         interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for HelloGrpcServer<T>\n    where\n        T: HelloGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> 
Self::Future {\n            match req.uri().path() {\n                \"/hello.Hello/Hello\" => {\n                    #[allow(non_camel_case_types)]\n                    struct HelloSvc<T: HelloGrpc>(pub Arc<T>);\n                    impl<T: HelloGrpc> tonic::server::UnaryService<super::HelloRequest>\n                    for HelloSvc<T> {\n                        type Response = super::HelloResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::HelloRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as HelloGrpc>::hello(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = HelloSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            
.apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/hello.Hello/Goodbye\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GoodbyeSvc<T: HelloGrpc>(pub Arc<T>);\n                    impl<T: HelloGrpc> tonic::server::UnaryService<super::GoodbyeRequest>\n                    for GoodbyeSvc<T> {\n                        type Response = super::GoodbyeResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GoodbyeRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as HelloGrpc>::goodbye(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GoodbyeSvc(inner);\n                        let codec = 
tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/hello.Hello/Ping\" => {\n                    #[allow(non_camel_case_types)]\n                    struct PingSvc<T: HelloGrpc>(pub Arc<T>);\n                    impl<\n                        T: HelloGrpc,\n                    > tonic::server::StreamingService<super::PingRequest>\n                    for PingSvc<T> {\n                        type Response = super::PingResponse;\n                        type ResponseStream = T::PingStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<tonic::Streaming<super::PingRequest>>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as HelloGrpc>::ping(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                  
  let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = PingSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        
}\n    }\n    impl<T> Clone for HelloGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"hello.Hello\";\n    impl<T> tonic::server::NamedService for HelloGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/example/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse quickwit_actors::AskError;\nuse quickwit_common::tower::TimeoutExceeded;\nuse quickwit_proto::error::GrpcServiceError;\npub use quickwit_proto::error::{grpc_error_to_grpc_status, grpc_status_to_service_error};\nuse quickwit_proto::{ServiceError, ServiceErrorCode};\nuse serde::{Deserialize, Serialize};\n\n// Service errors have to be handwritten before codegen.\n#[derive(Debug, thiserror::Error, Serialize, Deserialize)]\npub enum HelloError {\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n    #[error(\"invalid argument: {0}\")]\n    InvalidArgument(String),\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests,\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl ServiceError for HelloError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(_) => ServiceErrorCode::Internal,\n            Self::InvalidArgument(_) => ServiceErrorCode::BadRequest,\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for HelloError {\n    fn new_internal(message: String) -> Self {\n        
Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n\nimpl<E> From<AskError<E>> for HelloError\nwhere E: fmt::Debug\n{\n    fn from(error: AskError<E>) -> Self {\n        HelloError::Internal(format!(\"{error:?}\"))\n    }\n}\n\nimpl From<TimeoutExceeded> for HelloError {\n    fn from(_: TimeoutExceeded) -> Self {\n        HelloError::Timeout(\"client\".to_string())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/example/src/hello.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage hello;\n\nmessage HelloRequest {\n    string name = 1;\n}\n\nmessage HelloResponse {\n    string message = 1;\n}\n\nmessage GoodbyeRequest {\n    string name = 1;\n}\n\nmessage GoodbyeResponse {\n    string message = 1;\n}\n\nmessage PingRequest {\n    string name = 1;\n}\n\nmessage PingResponse {\n    string message = 1;\n}\n\nservice Hello {\n    // Says hello.\n    rpc Hello(HelloRequest) returns (HelloResponse);\n\n    // Says goodbye.\n    rpc Goodbye(GoodbyeRequest) returns (GoodbyeResponse);\n\n    // Ping pong.\n    rpc Ping(stream PingRequest) returns (stream PingResponse);\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/example/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod error;\n\n#[path = \"codegen/hello.rs\"]\nmod hello;\n\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicUsize, Ordering};\nuse std::task::{Context, Poll};\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse futures::StreamExt;\nuse quickwit_common::ServiceStream;\nuse quickwit_common::uri::Uri;\nuse tower::{Layer, Service};\n\npub use crate::error::HelloError;\npub use crate::hello::*;\n\npub type HelloResult<T> = Result<T, HelloError>;\n\n#[derive(Debug, Clone)]\nstruct Counter<S> {\n    counter: Arc<AtomicUsize>,\n    inner: S,\n}\n\nimpl<S, R> Service<R> for Counter<S>\nwhere S: Service<R>\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = S::Future;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.inner.poll_ready(cx)\n    }\n\n    fn call(&mut self, req: R) -> Self::Future {\n        self.counter.fetch_add(1, Ordering::Relaxed);\n        self.inner.call(req)\n    }\n}\n\n#[derive(Debug, Clone, Default)]\n#[allow(dead_code)]\nstruct CounterLayer {\n    counter: Arc<AtomicUsize>,\n}\n\nimpl<S> Layer<S> for CounterLayer {\n    type Service = Counter<S>;\n\n    fn layer(&self, inner: S) -> Self::Service {\n        Counter {\n            counter: self.counter.clone(),\n            inner,\n        }\n    }\n}\n\n#[allow(dead_code)]\nfn 
spawn_ping_response_stream(\n    mut request_stream: ServiceStream<PingRequest>,\n) -> ServiceStream<HelloResult<PingResponse>> {\n    let (ping_tx, service_stream) = ServiceStream::new_bounded(1);\n    let future = async move {\n        let mut name = \"\".to_string();\n        let mut interval = tokio::time::interval(Duration::from_millis(100));\n\n        loop {\n            tokio::select! {\n                request_opt = request_stream.next() => {\n                    match request_opt {\n                        Some(request) => name = request.name,\n                        _ => break,\n                    };\n                }\n                _ = interval.tick() => {\n                    if name.is_empty() {\n                        continue;\n                    }\n                    if name == \"stop\" {\n                        break;\n                    }\n                    if ping_tx.send(Ok(PingResponse {\n                        message: format!(\"Pong, {name}!\")\n                    })).await.is_err() {\n                        break;\n                    }\n                }\n            }\n        }\n    };\n    tokio::spawn(future);\n    service_stream\n}\n\n#[derive(Debug, Clone, Default)]\n#[allow(dead_code)]\nstruct HelloImpl {\n    delay: Duration,\n}\n\n#[async_trait]\nimpl Hello for HelloImpl {\n    async fn hello(&self, request: HelloRequest) -> HelloResult<HelloResponse> {\n        tokio::time::sleep(self.delay).await;\n\n        if request.name.is_empty() {\n            return Err(HelloError::InvalidArgument(\"name is empty\".to_string()));\n        }\n        Ok(HelloResponse {\n            message: format!(\"Hello, {}!\", request.name),\n        })\n    }\n\n    async fn goodbye(&self, request: GoodbyeRequest) -> HelloResult<GoodbyeResponse> {\n        tokio::time::sleep(self.delay).await;\n\n        Ok(GoodbyeResponse {\n            message: format!(\"Goodbye, {}!\", request.name),\n        })\n    }\n\n    async fn ping(\n        
&self,\n        request: ServiceStream<PingRequest>,\n    ) -> HelloResult<HelloStream<PingResponse>> {\n        Ok(spawn_ping_response_stream(request))\n    }\n\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        Ok(())\n    }\n\n    fn endpoints(&self) -> Vec<Uri> {\n        Vec::new()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::fmt;\n    use std::net::SocketAddr;\n    use std::str::FromStr;\n    use std::sync::atomic::Ordering;\n\n    use bytesize::ByteSize;\n    use quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Universe};\n    use quickwit_common::tower::{BalanceChannel, Change, TimeoutLayer};\n    use tokio::sync::mpsc::error::TrySendError;\n    use tokio_stream::StreamExt;\n    use tonic::codec::CompressionEncoding;\n    use tonic::transport::{Endpoint, Server};\n    use tonic::{Code, Status};\n\n    use super::*;\n    use crate::hello::MockHello;\n    use crate::hello::hello_grpc_server::HelloGrpcServer;\n    use crate::hello_grpc_client::HelloGrpcClient;\n    use crate::{CounterLayer, GoodbyeRequest, GoodbyeResponse};\n\n    const MAX_GRPC_MESSAGE_SIZE: ByteSize = ByteSize::mib(1);\n\n    #[tokio::test]\n    async fn test_hello_codegen() {\n        let hello = HelloImpl::default();\n\n        assert_eq!(\n            hello\n                .hello(HelloRequest {\n                    name: \"World\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, World!\".to_string()\n            }\n        );\n\n        let client = HelloClient::new(hello.clone()).clone();\n\n        assert_eq!(\n            client\n                .hello(HelloRequest {\n                    name: \"World\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, World!\".to_string()\n            }\n        );\n\n        let (ping_stream_tx, ping_stream) = 
ServiceStream::new_bounded(1);\n        let mut pong_stream = client.ping(ping_stream).await.unwrap();\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"World\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, World!\"\n        );\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"Mundo\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, Mundo!\"\n        );\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"stop\".to_string(),\n            })\n            .unwrap();\n        assert!(pong_stream.next().await.is_none());\n\n        let error = ping_stream_tx\n            .try_send(PingRequest {\n                name: \"stop\".to_string(),\n            })\n            .unwrap_err();\n        assert!(matches!(error, TrySendError::Closed(_)));\n\n        let mut mock_hello = MockHello::new();\n\n        mock_hello.expect_hello().returning(|_| {\n            Ok(HelloResponse {\n                message: \"Hello, Mock!\".to_string(),\n            })\n        });\n\n        assert_eq!(\n            mock_hello\n                .hello(HelloRequest {\n                    name: \"\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, Mock!\".to_string()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn test_hello_codegen_grpc() {\n        let grpc_server =\n            HelloClient::new(HelloImpl::default()).as_grpc_service(MAX_GRPC_MESSAGE_SIZE);\n        let addr: SocketAddr = \"127.0.0.1:6666\".parse().unwrap();\n\n        tokio::spawn({\n            async move {\n                Server::builder()\n                    
.add_service(grpc_server)\n                    .serve(addr)\n                    .await\n                    .unwrap();\n            }\n        });\n        let channel = BalanceChannel::from_channel(\n            \"127.0.0.1:6666\".parse().unwrap(),\n            Endpoint::from_static(\"http://127.0.0.1:6666\").connect_lazy(),\n        );\n        let grpc_client = HelloClient::from_balance_channel(channel, MAX_GRPC_MESSAGE_SIZE, None);\n\n        assert_eq!(\n            grpc_client\n                .hello(HelloRequest {\n                    name: \"gRPC client\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, gRPC client!\".to_string()\n            }\n        );\n\n        assert!(matches!(\n            grpc_client\n                .hello(HelloRequest {\n                    name: \"\".to_string()\n                })\n                .await\n                .unwrap_err(),\n            HelloError::InvalidArgument(_)\n        ));\n\n        let (ping_stream_tx, ping_stream) = ServiceStream::new_bounded(1);\n        let mut pong_stream = grpc_client.ping(ping_stream).await.unwrap();\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"gRPC client\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, gRPC client!\"\n        );\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"stop\".to_string(),\n            })\n            .unwrap();\n        assert!(pong_stream.next().await.is_none());\n\n        let error = ping_stream_tx\n            .try_send(PingRequest {\n                name: \"stop\".to_string(),\n            })\n            .unwrap_err();\n        assert!(matches!(error, TrySendError::Closed(_)));\n\n        grpc_client.check_connectivity().await.unwrap();\n        assert_eq!(\n     
       grpc_client.endpoints(),\n            vec![Uri::from_str(\"grpc://127.0.0.1:6666/hello.Hello\").unwrap()]\n        );\n\n        // The connectivity check fails if there is no client behind the channel.\n        let (balance_channel, _): (BalanceChannel<SocketAddr>, _) = BalanceChannel::new();\n        let grpc_client =\n            HelloClient::from_balance_channel(balance_channel, MAX_GRPC_MESSAGE_SIZE, None);\n        assert_eq!(\n            grpc_client\n                .check_connectivity()\n                .await\n                .unwrap_err()\n                .to_string(),\n            \"no server currently available\"\n        );\n    }\n\n    #[tokio::test]\n    async fn test_hello_codegen_grpc_with_compression() {\n        #[derive(Debug, Clone)]\n        struct CheckCompression<S> {\n            inner: S,\n        }\n\n        impl<S, ReqBody, ResBody> Service<http::Request<ReqBody>> for CheckCompression<S>\n        where\n            S: Service<http::Request<ReqBody>, Response = http::Response<ResBody>>\n                + Clone\n                + Send\n                + 'static,\n            S::Future: Send + 'static,\n            ReqBody: Send + 'static,\n        {\n            type Response = S::Response;\n            type Error = S::Error;\n            type Future = BoxFuture<Self::Response, Self::Error>;\n\n            fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n                self.inner.poll_ready(cx)\n            }\n\n            fn call(&mut self, request: http::Request<ReqBody>) -> Self::Future {\n                let Some(grpc_encoding) = request.headers().get(\"grpc-encoding\") else {\n                    panic!(\"request should be compressed\");\n                };\n                assert!(grpc_encoding.to_str().unwrap().contains(\"zstd\"));\n\n                let Some(grpc_accept_encoding) = request.headers().get(\"grpc-accept-encoding\")\n                else {\n                    
panic!(\"client should accept compressed responses\");\n                };\n                assert!(grpc_accept_encoding.to_str().unwrap().contains(\"zstd\"));\n                let fut = self.inner.call(request);\n\n                Box::pin(async move {\n                    let response = fut.await?;\n\n                    let grpc_status_code = Status::from_header_map(response.headers())\n                        .map(|status| status.code())\n                        .unwrap_or(Code::Ok);\n\n                    if grpc_status_code == Code::Ok {\n                        let Some(grpc_encoding) = response.headers().get(\"grpc-encoding\") else {\n                            panic!(\"response should be compressed\");\n                        };\n                        assert!(grpc_encoding.to_str().unwrap().contains(\"zstd\"));\n                    }\n                    Ok(response)\n                })\n            }\n        }\n\n        #[derive(Debug, Clone)]\n        struct CheckCompressionLayer;\n\n        impl<S> Layer<S> for CheckCompressionLayer {\n            type Service = CheckCompression<S>;\n\n            fn layer(&self, inner: S) -> Self::Service {\n                Self::Service { inner }\n            }\n        }\n\n        let grpc_server =\n            HelloClient::new(HelloImpl::default()).as_grpc_service(MAX_GRPC_MESSAGE_SIZE);\n        let addr: SocketAddr = \"127.0.0.1:33333\".parse().unwrap();\n\n        tokio::spawn({\n            async move {\n                Server::builder()\n                    .layer(CheckCompressionLayer)\n                    .add_service(grpc_server)\n                    .serve(addr)\n                    .await\n                    .unwrap();\n            }\n        });\n        let channel = BalanceChannel::from_channel(\n            \"127.0.0.1:33333\".parse().unwrap(),\n            Endpoint::from_static(\"http://127.0.0.1:33333\").connect_lazy(),\n        );\n        let grpc_client = 
HelloClient::from_balance_channel(\n            channel,\n            MAX_GRPC_MESSAGE_SIZE,\n            Some(CompressionEncoding::Zstd),\n        );\n\n        assert_eq!(\n            grpc_client\n                .hello(HelloRequest {\n                    name: \"gRPC client\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, gRPC client!\".to_string()\n            }\n        );\n\n        assert!(matches!(\n            grpc_client\n                .hello(HelloRequest {\n                    name: \"\".to_string()\n                })\n                .await\n                .unwrap_err(),\n            HelloError::InvalidArgument(_)\n        ));\n\n        let (ping_stream_tx, ping_stream) = ServiceStream::new_bounded(1);\n        let mut pong_stream = grpc_client.ping(ping_stream).await.unwrap();\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"gRPC client\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, gRPC client!\"\n        );\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"stop\".to_string(),\n            })\n            .unwrap();\n        assert!(pong_stream.next().await.is_none());\n\n        let error = ping_stream_tx\n            .try_send(PingRequest {\n                name: \"stop\".to_string(),\n            })\n            .unwrap_err();\n        assert!(matches!(error, TrySendError::Closed(_)));\n    }\n\n    #[tokio::test]\n    async fn test_hello_codegen_actor() {\n        #[derive(Debug)]\n        struct HelloActor;\n\n        impl Actor for HelloActor {\n            type ObservableState = ();\n\n            fn observable_state(&self) -> Self::ObservableState {}\n        }\n\n        #[async_trait]\n        impl Handler<HelloRequest> for HelloActor {\n           
 type Reply = HelloResult<HelloResponse>;\n\n            async fn handle(\n                &mut self,\n                message: HelloRequest,\n                _ctx: &ActorContext<Self>,\n            ) -> Result<Self::Reply, ActorExitStatus> {\n                Ok(Ok(HelloResponse {\n                    message: format!(\"Hello, {}!\", message.name),\n                }))\n            }\n        }\n\n        #[async_trait]\n        impl Handler<GoodbyeRequest> for HelloActor {\n            type Reply = HelloResult<GoodbyeResponse>;\n\n            async fn handle(\n                &mut self,\n                message: GoodbyeRequest,\n                _ctx: &ActorContext<Self>,\n            ) -> Result<Self::Reply, ActorExitStatus> {\n                Ok(Ok(GoodbyeResponse {\n                    message: format!(\"Goodbye, {}!\", message.name),\n                }))\n            }\n        }\n\n        #[async_trait]\n        impl Handler<ServiceStream<PingRequest>> for HelloActor {\n            type Reply = HelloResult<HelloStream<PingResponse>>;\n\n            async fn handle(\n                &mut self,\n                message: ServiceStream<PingRequest>,\n                _ctx: &ActorContext<Self>,\n            ) -> Result<Self::Reply, ActorExitStatus> {\n                Ok(Ok(spawn_ping_response_stream(message)))\n            }\n        }\n\n        let universe = Universe::new();\n        let hello_actor = HelloActor;\n        let (actor_mailbox, _actor_handle) = universe.spawn_builder().spawn(hello_actor);\n        let actor_client = HelloClient::from_mailbox(actor_mailbox.clone());\n\n        assert_eq!(\n            actor_client\n                .hello(HelloRequest {\n                    name: \"beautiful actor\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, beautiful actor!\".to_string()\n            }\n        );\n\n        
actor_client.check_connectivity().await.unwrap();\n        assert_eq!(\n            actor_client.endpoints(),\n            vec![\n                Uri::from_str(&format!(\n                    \"actor://localhost/{}\",\n                    actor_mailbox.actor_instance_id()\n                ))\n                .unwrap()\n            ]\n        );\n\n        let (ping_stream_tx, ping_stream) = ServiceStream::new_bounded(1);\n        let mut pong_stream = actor_client.ping(ping_stream).await.unwrap();\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"beautiful actor\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, beautiful actor!\"\n        );\n\n        let hello_tower = HelloClient::tower().build_from_mailbox(actor_mailbox);\n\n        assert_eq!(\n            hello_tower\n                .hello(HelloRequest {\n                    name: \"Tower actor\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, Tower actor!\".to_string()\n            }\n        );\n\n        assert_eq!(\n            hello_tower\n                .goodbye(GoodbyeRequest {\n                    name: \"Tower actor\".to_string()\n                })\n                .await\n                .unwrap(),\n            GoodbyeResponse {\n                message: \"Goodbye, Tower actor!\".to_string()\n            }\n        );\n\n        let (ping_stream_tx, ping_stream) = ServiceStream::new_bounded(1);\n        let mut pong_stream = actor_client.ping(ping_stream).await.unwrap();\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"beautiful Tower actor\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, beautiful Tower 
actor!\"\n        );\n\n        universe.assert_quit().await;\n\n        actor_client.check_connectivity().await.unwrap_err();\n    }\n\n    #[tokio::test]\n    async fn test_hello_codegen_tower_stack_layers() {\n        let layer = CounterLayer::default();\n        let hello_layer = CounterLayer::default();\n        let goodbye_layer = CounterLayer::default();\n        let ping_layer = CounterLayer::default();\n\n        let hello_tower = HelloClient::tower()\n            .stack_layer(layer.clone())\n            .stack_hello_layer(hello_layer.clone())\n            .stack_goodbye_layer(goodbye_layer.clone())\n            .stack_ping_layer(ping_layer.clone())\n            .build(HelloImpl::default());\n\n        hello_tower\n            .hello(HelloRequest {\n                name: \"Tower\".to_string(),\n            })\n            .await\n            .unwrap();\n\n        hello_tower\n            .goodbye(GoodbyeRequest {\n                name: \"Tower\".to_string(),\n            })\n            .await\n            .unwrap();\n\n        let (ping_stream_tx, ping_stream) = ServiceStream::new_bounded(1);\n        let mut pong_stream = hello_tower.ping(ping_stream).await.unwrap();\n\n        ping_stream_tx\n            .try_send(PingRequest {\n                name: \"Tower\".to_string(),\n            })\n            .unwrap();\n        assert_eq!(\n            pong_stream.next().await.unwrap().unwrap().message,\n            \"Pong, Tower!\"\n        );\n\n        assert_eq!(layer.counter.load(Ordering::Relaxed), 3);\n        assert_eq!(hello_layer.counter.load(Ordering::Relaxed), 1);\n        assert_eq!(goodbye_layer.counter.load(Ordering::Relaxed), 1);\n        assert_eq!(ping_layer.counter.load(Ordering::Relaxed), 1);\n    }\n\n    #[tokio::test]\n    async fn test_hello_codegen_tower_stack_layer_ordering() {\n        trait AppendSuffix {\n            fn append_suffix(&mut self, suffix: &'static str);\n        }\n\n        impl AppendSuffix for HelloRequest {\n      
      fn append_suffix(&mut self, suffix: &'static str) {\n                self.name.push_str(suffix);\n            }\n        }\n\n        impl AppendSuffix for GoodbyeRequest {\n            fn append_suffix(&mut self, suffix: &'static str) {\n                self.name.push_str(suffix);\n            }\n        }\n\n        impl AppendSuffix for PingRequest {\n            fn append_suffix(&mut self, suffix: &'static str) {\n                self.name.push_str(suffix);\n            }\n        }\n\n        impl AppendSuffix for ServiceStream<PingRequest> {\n            fn append_suffix(&mut self, _suffix: &'static str) {}\n        }\n\n        #[derive(Debug, Clone)]\n        struct AppendSuffixService<S> {\n            inner: S,\n            suffix: &'static str,\n        }\n\n        impl<S, R> Service<R> for AppendSuffixService<S>\n        where\n            S: Service<R, Error = HelloError>,\n            S::Response: fmt::Debug,\n            S::Future: Send + 'static,\n            R: AppendSuffix,\n        {\n            type Response = S::Response;\n            type Error = HelloError;\n            type Future = BoxFuture<S::Response, S::Error>;\n\n            fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n                self.inner.poll_ready(cx)\n            }\n\n            fn call(&mut self, mut req: R) -> Self::Future {\n                req.append_suffix(self.suffix);\n                let inner = self.inner.call(req);\n                Box::pin(inner)\n            }\n        }\n\n        #[derive(Debug, Clone)]\n        struct AppendSuffixLayer {\n            suffix: &'static str,\n        }\n\n        impl AppendSuffixLayer {\n            fn new(suffix: &'static str) -> Self {\n                Self { suffix }\n            }\n        }\n\n        impl<S> Layer<S> for AppendSuffixLayer {\n            type Service = AppendSuffixService<S>;\n\n            fn layer(&self, inner: S) -> Self::Service {\n                
AppendSuffixService {\n                    inner,\n                    suffix: self.suffix,\n                }\n            }\n        }\n        let hello_tower = HelloClient::tower()\n            .stack_layer(AppendSuffixLayer::new(\"->foo\"))\n            .stack_hello_layer(AppendSuffixLayer::new(\"->bar\"))\n            .stack_layer(AppendSuffixLayer::new(\"->qux\"))\n            .stack_hello_layer(AppendSuffixLayer::new(\"->tox\"))\n            .stack_goodbye_layer(AppendSuffixLayer::new(\"->moo\"))\n            .build(HelloImpl::default());\n\n        let response = hello_tower\n            .hello(HelloRequest {\n                name: \"\".to_string(),\n            })\n            .await\n            .unwrap();\n        assert_eq!(response.message, \"Hello, ->foo->bar->qux->tox!\");\n\n        let response = hello_tower\n            .goodbye(GoodbyeRequest {\n                name: \"\".to_string(),\n            })\n            .await\n            .unwrap();\n        assert_eq!(response.message, \"Goodbye, ->foo->qux->moo!\");\n    }\n\n    #[tokio::test]\n    async fn test_from_channel() {\n        let balance_channel = BalanceChannel::from_channel(\n            \"127.0.0.1:7777\".parse().unwrap(),\n            Endpoint::from_static(\"http://127.0.0.1:7777\").connect_lazy(),\n        );\n        HelloClient::from_balance_channel(balance_channel, MAX_GRPC_MESSAGE_SIZE, None);\n    }\n\n    #[tokio::test]\n    async fn test_balance_channel() {\n        let hello = HelloImpl::default();\n        let grpc_server_adapter = HelloGrpcServerAdapter::new(hello);\n        let grpc_server = HelloGrpcServer::new(grpc_server_adapter);\n        let addr: SocketAddr = \"127.0.0.1:11111\".parse().unwrap();\n\n        tokio::spawn({\n            async move {\n                Server::builder()\n                    .add_service(grpc_server)\n                    .serve(addr)\n                    .await\n                    .unwrap();\n            }\n        });\n        let 
(balance_channel, balance_channel_tx) = BalanceChannel::new();\n        let channel = Endpoint::from_static(\"http://127.0.0.1:11111\").connect_lazy();\n        balance_channel_tx\n            .send(Change::Insert(\"foo\", channel))\n            .unwrap();\n\n        let mut grpc_client = HelloGrpcClient::new(balance_channel.clone());\n\n        assert_eq!(\n            grpc_client\n                .hello(HelloRequest {\n                    name: \"Client\".to_string()\n                })\n                .await\n                .unwrap()\n                .into_inner(),\n            HelloResponse {\n                message: \"Hello, Client!\".to_string()\n            }\n        );\n        assert_eq!(balance_channel.num_connections(), 1);\n    }\n\n    #[tokio::test]\n    async fn test_hello_codegen_mock() {\n        let mut mock_hello = MockHello::new();\n        mock_hello.expect_hello().returning(|_| {\n            Ok(HelloResponse {\n                message: \"Hello, mock!\".to_string(),\n            })\n        });\n        mock_hello.expect_check_connectivity().returning(|| Ok(()));\n        let hello = HelloClient::from_mock(mock_hello);\n\n        assert_eq!(\n            hello\n                .hello(HelloRequest {\n                    name: \"World\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, mock!\".to_string()\n            }\n        );\n        assert_eq!(\n            hello\n                .clone()\n                .hello(HelloRequest {\n                    name: \"World\".to_string()\n                })\n                .await\n                .unwrap(),\n            HelloResponse {\n                message: \"Hello, mock!\".to_string()\n            }\n        );\n        hello.check_connectivity().await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_transport_errors_handling() {\n        quickwit_common::setup_logging_for_tests();\n\n 
       let addr: SocketAddr = \"127.0.0.1:9999\".parse().unwrap();\n        let channel = Endpoint::from_static(\"http://127.0.0.1:9999\")\n            .timeout(Duration::from_millis(100))\n            .connect_lazy();\n        let max_message_size = ByteSize::mib(1);\n        let grpc_client = HelloClient::from_channel(addr, channel, max_message_size, None);\n\n        let error = grpc_client\n            .hello(HelloRequest {\n                name: \"Client\".to_string(),\n            })\n            .await\n            .unwrap_err();\n        assert!(matches!(error, HelloError::Unavailable(_)));\n\n        let hello = HelloImpl {\n            delay: Duration::from_secs(1),\n        };\n        let grpc_server_adapter = HelloGrpcServerAdapter::new(hello);\n        let grpc_server: HelloGrpcServer<HelloGrpcServerAdapter> =\n            HelloGrpcServer::new(grpc_server_adapter);\n        let addr: SocketAddr = \"127.0.0.1:9999\".parse().unwrap();\n\n        tokio::spawn({\n            async move {\n                Server::builder()\n                    .add_service(grpc_server)\n                    .serve(addr)\n                    .await\n                    .unwrap();\n            }\n        });\n        let error = grpc_client\n            .hello(HelloRequest {\n                name: \"Client\".to_string(),\n            })\n            .await\n            .unwrap_err();\n        assert!(matches!(error, HelloError::Timeout(_)));\n    }\n\n    #[tokio::test]\n    async fn test_balanced_channel_timeout_with_server_crash() {\n        let addr_str = \"127.0.0.1:11112\";\n        let addr: SocketAddr = addr_str.parse().unwrap();\n        // We want to abruptly stop a server without even sending the connection\n        // RST packet. 
Simply dropping the tonic Server is not enough, so we\n        // spawn a thread and freeze it with thread::park().\n        std::thread::spawn(move || {\n            let server_fut = async {\n                let hello = HelloImpl {\n                    // delay the response so that the server freezes in the middle of the request\n                    delay: Duration::from_millis(1000),\n                };\n                let grpc_server_adapter = HelloGrpcServerAdapter::new(hello);\n                let grpc_server = HelloGrpcServer::new(grpc_server_adapter);\n                tokio::select! {\n                    // wait just enough to let the client perform its request\n                    _ = tokio::time::sleep(Duration::from_millis(100)) => {}\n                    _ = Server::builder().add_service(grpc_server).serve(addr) => {}\n                };\n                std::thread::park();\n                println!(\"Thread unparked, unexpected\");\n            };\n            tokio::runtime::Builder::new_current_thread()\n                .enable_all()\n                .build()\n                .unwrap()\n                .block_on(server_fut);\n        });\n\n        // create a client that will try to connect to the server\n        let (balance_channel, balance_channel_tx) = BalanceChannel::new();\n        let channel = Endpoint::from_str(&format!(\"http://{addr_str}\"))\n            .unwrap()\n            .connect_lazy();\n        balance_channel_tx\n            .send(Change::Insert(addr, channel))\n            .unwrap();\n\n        let grpc_client = HelloClient::tower()\n            // this test hangs forever if we comment out the TimeoutLayer, which\n            // shows that a request without explicit timeout might hang forever\n            .stack_layer(TimeoutLayer::new(Duration::from_secs(3)))\n            .build_from_balance_channel(balance_channel, ByteSize::mib(1), None);\n\n        let response_fut = async move {\n            grpc_client\n                
.hello(HelloRequest {\n                    name: \"World\".to_string(),\n                })\n                .await\n        };\n        response_fut\n            .await\n            .expect_err(\"should have timed out at the client level\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/src/codegen.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse anyhow::ensure;\nuse heck::{ToSnakeCase, ToUpperCamelCase};\nuse proc_macro2::TokenStream;\nuse prost_build::{Comments, Method, Service, ServiceGenerator};\nuse quote::{ToTokens, quote};\nuse syn::{Ident, parse_quote};\n\nuse crate::ProstConfig;\n\npub struct Codegen;\n\nimpl Codegen {\n    pub fn run(mut args: CodegenBuilder) -> anyhow::Result<()> {\n        let service_generator = Box::new(QuickwitServiceGenerator::new(\n            args.result_type_path,\n            args.error_type_path,\n            args.generate_extra_service_methods,\n            args.generate_prom_labels_for_requests,\n        ));\n        args.prost_config\n            .protoc_arg(\"--experimental_allow_proto3_optional\")\n            .type_attribute(\n                \".\",\n                \"#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\",\n            )\n            .field_attribute(\n                \"DocBatch.doc_buffer\",\n                \"#[schema(value_type = String, format = Binary)]\",\n            )\n            .enum_attribute(\".\", \"#[serde(rename_all=\\\"snake_case\\\")]\")\n            .service_generator(service_generator)\n            .out_dir(args.output_dir);\n\n        for proto in args.protos {\n            println!(\"cargo:rerun-if-changed={proto}\");\n            args.prost_config.compile_protos(&[proto], 
&args.includes)?;\n        }\n        Ok(())\n    }\n\n    pub fn builder() -> CodegenBuilder {\n        CodegenBuilder::default()\n    }\n}\n\n#[derive(Default)]\npub struct CodegenBuilder {\n    protos: Vec<String>,\n    includes: Vec<String>,\n    output_dir: String,\n    prost_config: ProstConfig,\n    result_type_path: String,\n    error_type_path: String,\n    generate_extra_service_methods: bool,\n    generate_prom_labels_for_requests: bool,\n}\n\nimpl CodegenBuilder {\n    pub fn with_protos(mut self, protos: &[&str]) -> Self {\n        self.protos = protos.iter().map(|proto| proto.to_string()).collect();\n        self\n    }\n\n    pub fn with_includes(mut self, includes: &[&str]) -> Self {\n        self.includes = includes.iter().map(|include| include.to_string()).collect();\n        self\n    }\n\n    pub fn with_output_dir(mut self, path: &str) -> Self {\n        self.output_dir = path.to_string();\n        self\n    }\n\n    pub fn with_result_type_path(mut self, path: &str) -> Self {\n        self.result_type_path = path.to_string();\n        self\n    }\n\n    pub fn with_error_type_path(mut self, path: &str) -> Self {\n        self.error_type_path = path.to_string();\n        self\n    }\n\n    pub fn with_prost_config(mut self, prost_config: ProstConfig) -> Self {\n        self.prost_config = prost_config;\n        self\n    }\n\n    pub fn generate_extra_service_methods(mut self) -> Self {\n        self.generate_extra_service_methods = true;\n        self\n    }\n\n    /// Generates `RpcName` trait implementations for request types.\n    pub fn generate_rpc_name_impls(mut self) -> Self {\n        self.generate_prom_labels_for_requests = true;\n        self\n    }\n\n    pub fn run(self) -> anyhow::Result<()> {\n        ensure!(!self.protos.is_empty(), \"proto file list is empty\");\n        ensure!(!self.output_dir.is_empty(), \"output directory is undefined\");\n        ensure!(!self.result_type_path.is_empty(), \"result type is undefined\");\n        
ensure!(!self.error_type_path.is_empty(), \"error type is undefined\");\n\n        Codegen::run(self)\n    }\n}\n\nstruct QuickwitServiceGenerator {\n    result_type_path: String,\n    error_type_path: String,\n    generate_extra_service_methods: bool,\n    generate_prom_labels_for_requests: bool,\n    inner: Box<dyn ServiceGenerator>,\n}\n\nimpl QuickwitServiceGenerator {\n    fn new(\n        result_type_path: String,\n        error_type_path: String,\n        generate_extra_service_methods: bool,\n        generate_prom_labels_for_requests: bool,\n    ) -> Self {\n        let inner = Box::new(WithSuffixServiceGenerator::new(\n            \"Grpc\",\n            tonic_prost_build::configure().service_generator(),\n        ));\n        Self {\n            result_type_path,\n            error_type_path,\n            generate_extra_service_methods,\n            generate_prom_labels_for_requests,\n            inner,\n        }\n    }\n}\n\nimpl ServiceGenerator for QuickwitServiceGenerator {\n    fn generate(&mut self, service: Service, buf: &mut String) {\n        let tokens = generate_all(\n            &service,\n            &self.result_type_path,\n            &self.error_type_path,\n            self.generate_extra_service_methods,\n            self.generate_prom_labels_for_requests,\n        );\n        let ast: syn::File = syn::parse2(tokens).expect(\"Tokenstream should be a valid Syn AST.\");\n        let pretty_code = prettyplease::unparse(&ast);\n        buf.push_str(&pretty_code);\n\n        self.inner.generate(service, buf)\n    }\n\n    fn finalize(&mut self, buf: &mut String) {\n        self.inner.finalize(buf);\n    }\n}\n\nstruct CodegenContext {\n    package_name: String,\n    service_name: Ident,\n    result_type: syn::Path,\n    error_type: syn::Path,\n    stream_type: Ident,\n    stream_type_alias: TokenStream,\n    methods: Vec<SynMethod>,\n    client_name: Ident,\n    inner_client_name: Ident,\n    tower_svc_stack_name: Ident,\n    
tower_layer_stack_name: Ident,\n    mailbox_name: Ident,\n    mock_mod_name: Ident,\n    mock_name: Ident,\n    grpc_client_name: Ident,\n    grpc_client_adapter_name: Ident,\n    grpc_client_package_name: Ident,\n    grpc_server_name: Ident,\n    grpc_server_adapter_name: Ident,\n    grpc_server_package_name: Ident,\n    grpc_service_name: Ident,\n    generate_extra_service_methods: bool,\n}\n\nimpl CodegenContext {\n    fn from_service(\n        service: &Service,\n        result_type_path: &str,\n        error_type_path: &str,\n        generate_extra_service_methods: bool,\n    ) -> Self {\n        let service_name = quote::format_ident!(\"{}\", service.name);\n        let mock_mod_name = quote::format_ident!(\"mock_{}\", service.name.to_snake_case());\n        let mock_name = quote::format_ident!(\"Mock{}\", service.name);\n\n        let result_type = syn::parse_str::<syn::Path>(result_type_path)\n            .expect(\"Result path should be a valid result path such as `crate::HelloResult`.\");\n        let error_type = syn::parse_str::<syn::Path>(error_type_path)\n            .expect(\"Error path should be a valid error path such as `crate::error::HelloError`.\");\n        let stream_type = quote::format_ident!(\"{}Stream\", service.name);\n        let stream_type_alias = if service.methods.iter().any(|method| method.server_streaming) {\n            quote! 
{\n                pub type #stream_type<T> = quickwit_common::ServiceStream<#result_type<T>>;\n            }\n        } else {\n            TokenStream::new()\n        };\n\n        let methods = SynMethod::parse_prost_methods(&service.methods);\n\n        let client_name = quote::format_ident!(\"{}Client\", service.name);\n        let inner_client_name = quote::format_ident!(\"Inner{}\", client_name);\n        let tower_svc_stack_name = quote::format_ident!(\"{}TowerServiceStack\", service.name);\n        let tower_layer_stack_name = quote::format_ident!(\"{}TowerLayerStack\", service.name);\n        let mailbox_name = quote::format_ident!(\"{}Mailbox\", service.name);\n\n        let grpc_client_name = quote::format_ident!(\"{}GrpcClient\", service.name);\n        let grpc_client_adapter_name = quote::format_ident!(\"{}GrpcClientAdapter\", service.name);\n        let grpc_client_package_name =\n            quote::format_ident!(\"{}_grpc_client\", service.name.to_snake_case());\n        let package_name = service.package.clone();\n\n        let grpc_server_name = quote::format_ident!(\"{}GrpcServer\", service.name);\n        let grpc_server_adapter_name = quote::format_ident!(\"{}GrpcServerAdapter\", service.name);\n        let grpc_server_package_name =\n            quote::format_ident!(\"{}_grpc_server\", service.name.to_snake_case());\n\n        let grpc_service_name = quote::format_ident!(\"{}Grpc\", service.name);\n\n        Self {\n            package_name,\n            service_name,\n            result_type,\n            error_type,\n            stream_type,\n            stream_type_alias,\n            methods,\n            client_name,\n            inner_client_name,\n            tower_svc_stack_name,\n            tower_layer_stack_name,\n            mailbox_name,\n            mock_mod_name,\n            mock_name,\n            grpc_client_name,\n            grpc_client_adapter_name,\n            grpc_client_package_name,\n            grpc_server_name,\n   
         grpc_server_adapter_name,\n            grpc_server_package_name,\n            grpc_service_name,\n            generate_extra_service_methods,\n        }\n    }\n}\n\nfn generate_all(\n    service: &Service,\n    result_type_path: &str,\n    error_type_path: &str,\n    generate_extra_service_methods: bool,\n    generate_prom_labels_for_requests: bool,\n) -> TokenStream {\n    let context = CodegenContext::from_service(\n        service,\n        result_type_path,\n        error_type_path,\n        generate_extra_service_methods,\n    );\n    let stream_type_alias = &context.stream_type_alias;\n    let service_trait = generate_service_trait(&context);\n    let client = generate_client(&context);\n    let tower_services = generate_tower_services(&context);\n    let tower_svc_stack = generate_tower_svc_stack(&context);\n    let tower_layer_stack = generate_tower_layer_stack(&context);\n    let tower_mailbox = generate_tower_mailbox(&context);\n    let grpc_client_adapter = generate_grpc_client_adapter(&context);\n    let grpc_server_adapter = generate_grpc_server_adapter(&context);\n    let prom_labels_impl = if generate_prom_labels_for_requests {\n        generate_prom_labels_impl_for_requests(&context)\n    } else {\n        TokenStream::new()\n    };\n\n    quote! 
{\n        // The line below is necessary to opt out of the license header check.\n        /// BEGIN quickwit-codegen\n        #[allow(unused_imports)]\n        use std::str::FromStr;\n        use tower::{Layer, Service, ServiceExt};\n        #prom_labels_impl\n\n        #stream_type_alias\n\n        #service_trait\n\n        #client\n\n        pub type BoxFuture<T, E> = std::pin::Pin<Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>>;\n\n        #tower_services\n\n        #tower_svc_stack\n\n        #tower_layer_stack\n\n        #tower_mailbox\n\n        #grpc_client_adapter\n\n        #grpc_server_adapter\n    }\n}\n\nstruct SynMethod {\n    name: Ident,\n    proto_name: Ident,\n    comments: Vec<syn::Attribute>,\n    request_type: syn::Path,\n    response_type: syn::Path,\n    client_streaming: bool,\n    server_streaming: bool,\n}\n\nimpl SynMethod {\n    fn request_type(&self, mock: bool) -> TokenStream {\n        let request_type = if mock {\n            let request_type = &self.request_type;\n            quote! { super::#request_type }\n        } else {\n            self.request_type.to_token_stream()\n        };\n        if self.client_streaming {\n            quote! { quickwit_common::ServiceStream<#request_type> }\n        } else {\n            request_type\n        }\n    }\n\n    fn rpc_name(&self, mock: bool) -> TokenStream {\n        let request_type = &self.request_type;\n\n        if mock {\n            quote! { super::#request_type::rpc_name() }\n        } else {\n            quote! { #request_type::rpc_name() }\n        }\n    }\n\n    fn response_type(&self, context: &CodegenContext, mock: bool) -> TokenStream {\n        let response_type = if mock {\n            let response_type = &self.response_type;\n            quote! 
{ super::#response_type }\n        } else {\n            self.response_type.to_token_stream()\n        };\n        if self.server_streaming {\n            let stream_type = &context.stream_type;\n            quote! { #stream_type<#response_type> }\n        } else {\n            response_type\n        }\n    }\n\n    fn parse_prost_methods(methods: &[Method]) -> Vec<Self> {\n        let mut syn_methods = Vec::with_capacity(methods.len());\n\n        for method in methods {\n            let name = quote::format_ident!(\"{}\", method.name);\n            let proto_name = quote::format_ident!(\"{}\", method.proto_name);\n            let comments = generate_comment_attributes(&method.comments);\n            let request_type = syn::parse_str::<syn::Path>(&method.input_type).unwrap();\n            let response_type = syn::parse_str::<syn::Path>(&method.output_type).unwrap();\n\n            let syn_method = SynMethod {\n                name,\n                proto_name,\n                comments,\n                request_type,\n                response_type,\n                client_streaming: method.client_streaming,\n                server_streaming: method.server_streaming,\n            };\n            syn_methods.push(syn_method);\n        }\n        syn_methods\n    }\n}\n\nfn generate_prom_labels_impl_for_requests(context: &CodegenContext) -> TokenStream {\n    let mut rpc_name_impls = Vec::new();\n\n    for syn_method in &context.methods {\n        let request_type = syn_method.request_type.to_token_stream();\n        let rpc_name = &syn_method.name.to_string();\n        let rpc_name_impl = quote! {\n            impl RpcName for #request_type {\n                fn rpc_name() -> &'static str {\n                    #rpc_name\n                }\n            }\n        };\n        rpc_name_impls.extend(rpc_name_impl);\n    }\n    if rpc_name_impls.is_empty() {\n        return TokenStream::new();\n    }\n    quote! 
{\n        use quickwit_common::tower::RpcName;\n\n        #(#rpc_name_impls)*\n    }\n}\n\nfn generate_comment_attributes(comments: &Comments) -> Vec<syn::Attribute> {\n    let mut attributes = Vec::with_capacity(comments.leading.len());\n\n    for comment in &comments.leading {\n        let comment = syn::LitStr::new(comment, proc_macro2::Span::call_site());\n        let attribute: syn::Attribute = parse_quote! {\n            #[doc = #comment]\n        };\n        attributes.push(attribute);\n    }\n    attributes\n}\n\nfn generate_service_trait(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let trait_methods = generate_service_trait_methods(context);\n    let extra_trait_methods = if context.generate_extra_service_methods {\n        quote! {\n            async fn check_connectivity(&self) -> anyhow::Result<()>;\n            fn endpoints(&self) -> Vec<quickwit_common::uri::Uri>;\n        }\n    } else {\n        TokenStream::new()\n    };\n\n    quote! {\n        #[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n        #[async_trait::async_trait]\n        pub trait #service_name: std::fmt::Debug + Send + Sync + 'static {\n            #trait_methods\n            #extra_trait_methods\n        }\n    }\n}\n\nfn generate_service_trait_methods(context: &CodegenContext) -> TokenStream {\n    let result_type = &context.result_type;\n\n    let mut stream = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let comments = &syn_method.comments;\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n        let method = quote! 
{\n            #(#comments)*\n            async fn #method_name(&self, request: #request_type) -> #result_type<#response_type>;\n        };\n        stream.extend(method);\n    }\n    stream\n}\n\nfn generate_extra_methods_calling_inner() -> TokenStream {\n    quote! {\n        async fn check_connectivity(&self) -> anyhow::Result<()> {\n            self.inner.0.check_connectivity().await\n        }\n\n        fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n            self.inner.0.endpoints()\n        }\n    }\n}\n\nfn generate_client(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let client_name = &context.client_name;\n    let inner_client_name = &context.inner_client_name;\n\n    let grpc_client_name = &context.grpc_client_name;\n    let grpc_client_adapter_name = &context.grpc_client_adapter_name;\n    let grpc_client_package_name = &context.grpc_client_package_name;\n\n    let grpc_server_name = &context.grpc_server_name;\n    let grpc_server_adapter_name = &context.grpc_server_adapter_name;\n    let grpc_server_package_name = &context.grpc_server_package_name;\n\n    let client_methods = generate_client_methods(context, false);\n    let mock_mod_name = &context.mock_mod_name;\n    let mock_methods = generate_client_methods(context, true);\n    let mailbox_name = &context.mailbox_name;\n    let tower_layer_stack_name = &context.tower_layer_stack_name;\n    let mock_name = &context.mock_name;\n    let mock_wrapper_name = quote::format_ident!(\"{}Wrapper\", mock_name);\n    let error_message = format!(\n        \"`{mock_name}` must be wrapped in a `{mock_wrapper_name}`: use \\\n         `{client_name}::from_mock(mock)` to instantiate the client\"\n    );\n    let extra_client_methods = if context.generate_extra_service_methods {\n        generate_extra_methods_calling_inner()\n    } else {\n        TokenStream::new()\n    };\n    let extra_mock_methods = if context.generate_extra_service_methods {\n        
quote! {\n            async fn check_connectivity(&self) -> anyhow::Result<()> {\n                self.inner.lock().await.check_connectivity().await\n            }\n\n            fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n                futures::executor::block_on(self.inner.lock()).endpoints()\n            }\n        }\n    } else {\n        TokenStream::new()\n    };\n\n    quote! {\n        #[derive(Debug, Clone)]\n        pub struct #client_name {\n            inner: #inner_client_name,\n        }\n\n        #[derive(Debug, Clone)]\n        struct #inner_client_name(std::sync::Arc<dyn #service_name>);\n\n        impl #client_name {\n            pub fn new<T>(instance: T) -> Self\n            where\n                T: #service_name,\n            {\n                #[cfg(any(test, feature = \"testsuite\"))]\n                assert!(std::any::TypeId::of::<T>() != std::any::TypeId::of::<#mock_name>(), #error_message);\n                Self {\n                    inner: #inner_client_name(std::sync::Arc::new(instance)),\n                }\n            }\n\n            pub fn as_grpc_service(&self, max_message_size: bytesize::ByteSize) -> #grpc_server_package_name::#grpc_server_name<#grpc_server_adapter_name> {\n                let adapter = #grpc_server_adapter_name::new(self.clone());\n                #grpc_server_package_name::#grpc_server_name::new(adapter)\n                    // Servers accept both Gzip and Zstd. 
The order is not important because the client decides which encoding to use.\n                    .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n                    .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n                    .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n                    .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n                    .max_decoding_message_size(max_message_size.0 as usize)\n                    .max_encoding_message_size(max_message_size.0 as usize)\n            }\n\n            pub fn from_channel(\n                addr: std::net::SocketAddr,\n                channel: tonic::transport::Channel,\n                max_message_size: bytesize::ByteSize,\n                compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n            ) -> Self\n            {\n                let (_, connection_keys_watcher) = tokio::sync::watch::channel(std::collections::HashSet::from_iter([addr]));\n                let mut client = #grpc_client_package_name::#grpc_client_name::new(channel)\n                    .max_decoding_message_size(max_message_size.0 as usize)\n                    .max_encoding_message_size(max_message_size.0 as usize);\n                if let Some(compression_encoding) = compression_encoding_opt {\n                    client = client\n                        .accept_compressed(compression_encoding)\n                        .send_compressed(compression_encoding);\n                }\n                let adapter = #grpc_client_adapter_name::new(client, connection_keys_watcher);\n                Self::new(adapter)\n            }\n\n            pub fn from_balance_channel(\n                balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n                max_message_size: bytesize::ByteSize,\n                compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n            ) -> #client_name\n            {\n    
            let connection_keys_watcher = balance_channel.connection_keys_watcher();\n                let mut client = #grpc_client_package_name::#grpc_client_name::new(balance_channel)\n                    .max_decoding_message_size(max_message_size.0 as usize)\n                    .max_encoding_message_size(max_message_size.0 as usize);\n                if let Some(compression_encoding) = compression_encoding_opt {\n                    client = client\n                        .accept_compressed(compression_encoding)\n                        .send_compressed(compression_encoding);\n                }\n                let adapter = #grpc_client_adapter_name::new(client, connection_keys_watcher);\n                Self::new(adapter)\n            }\n\n            pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n            where\n                A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n                #mailbox_name<A>: #service_name,\n            {\n                #client_name::new(#mailbox_name::new(mailbox))\n            }\n\n            pub fn tower() -> #tower_layer_stack_name {\n                #tower_layer_stack_name::default()\n            }\n\n            #[cfg(any(test, feature = \"testsuite\"))]\n            pub fn from_mock(mock: #mock_name) -> Self {\n                let mock_wrapper = #mock_mod_name::#mock_wrapper_name {\n                    inner: tokio::sync::Mutex::new(mock)\n                };\n                Self::new(mock_wrapper)\n            }\n\n            #[cfg(any(test, feature = \"testsuite\"))]\n            pub fn mocked() -> Self {\n                Self::from_mock(#mock_name::new())\n            }\n        }\n\n        #[async_trait::async_trait]\n        impl #service_name for #client_name {\n            #client_methods\n            #extra_client_methods\n        }\n\n        #[cfg(any(test, feature = \"testsuite\"))]\n        pub mod #mock_mod_name {\n            use super::*;\n\n            
#[derive(Debug)]\n            pub struct #mock_wrapper_name {\n                pub(super) inner: tokio::sync::Mutex<#mock_name>\n            }\n\n            #[async_trait::async_trait]\n            impl #service_name for #mock_wrapper_name {\n                #mock_methods\n                #extra_mock_methods\n            }\n        }\n    }\n}\n\nfn generate_client_methods(context: &CodegenContext, mock: bool) -> TokenStream {\n    let result_type = &context.result_type;\n\n    let mut stream = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = syn_method.request_type(mock);\n        let response_type = syn_method.response_type(context, mock);\n\n        let body = if !mock {\n            quote! {\n                self.inner.0.#method_name(request).await\n            }\n        } else {\n            quote! {\n                self.inner.lock().await.#method_name(request).await\n            }\n        };\n        let method = quote! {\n            async fn #method_name(&self, request: #request_type) -> #result_type<#response_type> {\n                #body\n            }\n        };\n        stream.extend(method);\n    }\n    stream\n}\n\nfn generate_tower_services(context: &CodegenContext) -> TokenStream {\n    let inner_client_name = &context.inner_client_name;\n    let error_type = &context.error_type;\n\n    let mut stream = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let service = quote! 
{\n            impl tower::Service<#request_type> for #inner_client_name {\n                type Response = #response_type;\n                type Error = #error_type;\n                type Future = BoxFuture<Self::Response, Self::Error>;\n\n                fn poll_ready(\n                    &mut self,\n                    _cx: &mut std::task::Context<'_>,\n                ) -> std::task::Poll<Result<(), Self::Error>> {\n                    std::task::Poll::Ready(Ok(()))\n                }\n\n                fn call(&mut self, request: #request_type) -> Self::Future {\n                    let svc = self.clone();\n                    let fut = async move { svc.0.#method_name(request).await };\n                    Box::pin(fut)\n                }\n            }\n        };\n        stream.extend(service);\n    }\n    stream\n}\n\nfn generate_tower_svc_stack(context: &CodegenContext) -> TokenStream {\n    let tower_svc_stack_name = &context.tower_svc_stack_name;\n    let inner_client_name = &context.inner_client_name;\n    let tower_svc_stack_attributes = generate_tower_svc_stack_attributes(context);\n    let tower_svc_stack_service_impl = generate_tower_svc_stack_service_impl(context);\n\n    quote! 
{\n        /// A tower service stack is a set of tower services.\n        #[derive(Debug)]\n        struct #tower_svc_stack_name {\n            // TODO: remove this field once `check_connectivity` is used for all services.\n            #[allow(dead_code)]\n            inner: #inner_client_name,\n\n            #tower_svc_stack_attributes\n        }\n\n        #tower_svc_stack_service_impl\n    }\n}\n\nfn generate_tower_svc_stack_attributes(context: &CodegenContext) -> TokenStream {\n    let error_type = &context.error_type;\n\n    let mut stream = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let attribute_name = quote::format_ident!(\"{}_svc\", syn_method.name);\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let attribute = quote! {\n            #attribute_name: quickwit_common::tower::BoxService<#request_type, #response_type, #error_type>,\n        };\n        stream.extend(attribute);\n    }\n    stream\n}\n\nfn generate_tower_svc_stack_service_impl(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let tower_svc_stack_name = &context.tower_svc_stack_name;\n    let result_type = &context.result_type;\n    let extra_client_methods = if context.generate_extra_service_methods {\n        generate_extra_methods_calling_inner()\n    } else {\n        TokenStream::new()\n    };\n    let mut methods = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let attribute_name = quote::format_ident!(\"{}_svc\", syn_method.name);\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let attribute = quote! 
{\n            async fn #method_name(&self, request: #request_type) -> #result_type<#response_type> {\n                self.#attribute_name.clone().ready().await?.call(request).await\n            }\n        };\n        methods.extend(attribute);\n    }\n\n    quote! {\n        #[async_trait::async_trait]\n        impl #service_name for #tower_svc_stack_name {\n            #methods\n            #extra_client_methods\n        }\n    }\n}\n\nfn generate_tower_layer_stack(context: &CodegenContext) -> TokenStream {\n    let tower_layer_stack_name = &context.tower_layer_stack_name;\n    let (tower_layer_stack_types, layer_stack_attributes) =\n        generate_layer_stack_types_and_attributes(context);\n    let layer_stack_impl = generate_layer_stack_impl(context);\n\n    quote! {\n        #tower_layer_stack_types\n\n        #[derive(Debug, Default)]\n        pub struct #tower_layer_stack_name {\n            #layer_stack_attributes\n        }\n\n        #layer_stack_impl\n    }\n}\n\nfn generate_layer_stack_types_and_attributes(\n    context: &CodegenContext,\n) -> (TokenStream, TokenStream) {\n    let error_type = &context.error_type;\n\n    let mut type_aliases = TokenStream::new();\n    let mut attributes = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let service_name_upper_camel_case = syn_method.name.to_string().to_upper_camel_case();\n        let type_alias_name = quote::format_ident!(\"{service_name_upper_camel_case}Layer\");\n        let attribute_name = quote::format_ident!(\"{}_layers\", syn_method.name);\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let type_alias = quote! {\n            type #type_alias_name = quickwit_common::tower::BoxLayer<quickwit_common::tower::BoxService<#request_type, #response_type, #error_type>, #request_type, #response_type, #error_type>;\n        };\n        let attribute = quote! 
{\n            #attribute_name: Vec<#type_alias_name>,\n        };\n        type_aliases.extend(type_alias);\n        attributes.extend(attribute);\n    }\n    (type_aliases, attributes)\n}\n\nfn generate_layer_stack_impl(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let client_name = &context.client_name;\n    let inner_client_name = &context.inner_client_name;\n    let mailbox_name = &context.mailbox_name;\n    let mock_name = &context.mock_name;\n    let tower_svc_stack_name = &context.tower_svc_stack_name;\n    let tower_layer_stack_name = &context.tower_layer_stack_name;\n    let error_type = &context.error_type;\n\n    let mut shared_layer_method_bounds = TokenStream::new();\n    let mut layer_method_bounds = TokenStream::new();\n    let mut layer_method_statements = TokenStream::new();\n    let mut layer_methods = TokenStream::new();\n    let mut svc_statements = TokenStream::new();\n    let mut svc_attribute_idents = Vec::with_capacity(context.methods.len());\n\n    for syn_method in &context.methods {\n        let layer_attribute_name = quote::format_ident!(\"{}_layers\", syn_method.name);\n        let layer_method_name = quote::format_ident!(\"stack_{}_layer\", syn_method.name);\n        let svc_attribute_name = quote::format_ident!(\"{}_svc\", syn_method.name);\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let shared_layer_method_bound = quote! 
{\n            L: tower::Layer<quickwit_common::tower::BoxService<#request_type, #response_type, #error_type>> + Clone + Send + Sync + 'static,\n            <L as tower::Layer<quickwit_common::tower::BoxService<#request_type, #response_type, #error_type>>>::Service: tower::Service<#request_type, Response = #response_type, Error = #error_type> + Clone + Send + Sync + 'static,\n            <<L as tower::Layer<quickwit_common::tower::BoxService<#request_type, #response_type, #error_type>>>::Service as tower::Service<#request_type>>::Future: Send + 'static,\n        };\n        let layer_method_bound = quote! {\n            L: tower::Layer<quickwit_common::tower::BoxService<#request_type, #response_type, #error_type>> + Send + Sync + 'static,\n            L::Service: tower::Service<#request_type, Response = #response_type, Error = #error_type> + Clone + Send + Sync + 'static,\n            <L::Service as tower::Service<#request_type>>::Future: Send + 'static,\n        };\n        let layer_method_statement = quote! {\n            self.#layer_attribute_name.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        };\n        let layer_method = quote! {\n            pub fn #layer_method_name<L>(\n                mut self,\n                layer: L\n            ) -> Self\n            where\n                #layer_method_bound\n            {\n                self.#layer_attribute_name.push(quickwit_common::tower::BoxLayer::new(layer));\n                self\n            }\n        };\n        shared_layer_method_bounds.extend(shared_layer_method_bound);\n        layer_method_bounds.extend(layer_method_bound);\n        layer_method_statements.extend(layer_method_statement);\n        layer_methods.extend(layer_method);\n\n        let svc_statement = quote! 
{\n            let #svc_attribute_name = self.#layer_attribute_name.into_iter().rev().fold(quickwit_common::tower::BoxService::new(inner_client.clone()), |svc, layer| layer.layer(svc));\n        };\n        svc_statements.extend(svc_statement);\n\n        svc_attribute_idents.push(svc_attribute_name);\n    }\n\n    quote! {\n        impl #tower_layer_stack_name {\n            pub fn stack_layer<L>(mut self, layer: L) -> Self\n            where\n                #shared_layer_method_bounds\n            {\n                #layer_method_statements\n                self\n            }\n\n            #layer_methods\n\n            pub fn build<T>(self, instance: T) -> #client_name\n            where\n                T: #service_name\n            {\n                let inner_client = #inner_client_name(std::sync::Arc::new(instance));\n                self.build_from_inner_client(inner_client)\n            }\n\n            pub fn build_from_channel(\n                self,\n                addr: std::net::SocketAddr,\n                channel: tonic::transport::Channel,\n                max_message_size: bytesize::ByteSize,\n                compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n            ) -> #client_name\n            {\n                let client =  #client_name::from_channel(addr, channel, max_message_size, compression_encoding_opt);\n                let inner_client = client.inner;\n                self.build_from_inner_client(inner_client)\n            }\n\n            pub fn build_from_balance_channel(\n                self,\n                balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n                max_message_size: bytesize::ByteSize,\n                compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n            ) -> #client_name\n            {\n                let client =  #client_name::from_balance_channel(balance_channel, max_message_size, compression_encoding_opt);\n         
       let inner_client = client.inner;\n                self.build_from_inner_client(inner_client)\n            }\n\n            pub fn build_from_mailbox<A>(self, mailbox: quickwit_actors::Mailbox<A>) -> #client_name\n            where\n                A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n                #mailbox_name<A>: #service_name,\n            {\n                let inner_client = #inner_client_name(std::sync::Arc::new(#mailbox_name::new(mailbox)));\n                self.build_from_inner_client(inner_client)\n            }\n\n            #[cfg(any(test, feature = \"testsuite\"))]\n            pub fn build_from_mock(self, mock: #mock_name) -> #client_name {\n                let client = #client_name::from_mock(mock);\n                let inner_client = client.inner;\n                self.build_from_inner_client(inner_client)\n            }\n\n            fn build_from_inner_client(self, inner_client: #inner_client_name) -> #client_name\n            {\n                #svc_statements\n\n                let tower_svc_stack = #tower_svc_stack_name {\n                    inner: inner_client,\n                    #(#svc_attribute_idents),*\n                };\n                #client_name::new(tower_svc_stack)\n            }\n        }\n    }\n}\n\nfn generate_tower_mailbox(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let mailbox_name = &context.mailbox_name;\n    let error_type = &context.error_type;\n    let extra_mailbox_methods = if context.generate_extra_service_methods {\n        quote! 
{\n            async fn check_connectivity(&self) -> anyhow::Result<()> {\n                if self.inner.is_disconnected() {\n                    anyhow::bail!(\"actor `{}` is disconnected\", self.inner.actor_instance_id())\n                }\n                Ok(())\n            }\n\n            fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n                vec![quickwit_common::uri::Uri::from_str(&format!(\"actor://localhost/{}\", self.inner.actor_instance_id())).expect(\"URI should be valid\")]\n            }\n        }\n    } else {\n        TokenStream::new()\n    };\n\n    let (mailbox_bounds, mailbox_methods) = generate_mailbox_bounds_and_methods(context);\n\n    quote! {\n        #[derive(Debug, Clone)]\n        struct MailboxAdapter<A: quickwit_actors::Actor, E> {\n            inner: quickwit_actors::Mailbox<A>,\n            phantom: std::marker::PhantomData<E>,\n        }\n\n        impl<A, E> std::ops::Deref for MailboxAdapter<A, E> where A: quickwit_actors::Actor {\n            type Target = quickwit_actors::Mailbox<A>;\n\n            fn deref(&self) -> &Self::Target {\n                &self.inner\n            }\n        }\n\n        #[derive(Debug)]\n        pub struct #mailbox_name<A: quickwit_actors::Actor> {\n            inner: MailboxAdapter<A, #error_type>\n        }\n\n        impl <A: quickwit_actors::Actor> #mailbox_name<A> {\n            pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n                let inner = MailboxAdapter {\n                    inner: instance,\n                    phantom: std::marker::PhantomData,\n                };\n                Self {\n                    inner\n                }\n            }\n        }\n\n        impl <A: quickwit_actors::Actor> Clone for #mailbox_name<A> {\n            fn clone(&self) -> Self {\n                let inner = MailboxAdapter {\n                    inner: self.inner.clone(),\n                    phantom: std::marker::PhantomData,\n                };\n          
      Self { inner }\n            }\n        }\n\n        impl<A, M, T, E> tower::Service<M> for #mailbox_name<A>\n        where\n            A: quickwit_actors::Actor + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send + 'static,\n            M: std::fmt::Debug + Send + 'static,\n            T: Send + 'static,\n            E: std::fmt::Debug + Send + 'static,\n            #error_type: From<quickwit_actors::AskError<E>>,\n        {\n            type Response = T;\n            type Error = #error_type;\n            type Future = BoxFuture<Self::Response, Self::Error>;\n\n            fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), Self::Error>> {\n                //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n                //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n                //! mailbox in `poll_ready` and consume it in `call`.\n                std::task::Poll::Ready(Ok(()))\n            }\n\n            fn call(&mut self, message: M) -> Self::Future {\n                let mailbox = self.inner.clone();\n                let fut = async move {\n                    mailbox\n                        .ask_for_res(message)\n                        .await\n                        .map_err(|error| error.into())\n                };\n                Box::pin(fut)\n            }\n        }\n\n        #[async_trait::async_trait]\n        impl<A> #service_name for #mailbox_name<A>\n        where\n            A: quickwit_actors::Actor + std::fmt::Debug,\n            #mailbox_name<A>: #(#mailbox_bounds)+*,\n        {\n            #mailbox_methods\n            #extra_mailbox_methods\n        }\n    }\n}\n\nfn generate_mailbox_bounds_and_methods(\n    context: &CodegenContext,\n) -> (Vec<TokenStream>, TokenStream) {\n    let result_type = &context.result_type;\n    let error_type = &context.error_type;\n\n    let mut bounds = 
Vec::with_capacity(context.methods.len());\n    let mut methods = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = syn_method.request_type(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let bound = quote! {\n            tower::Service<#request_type, Response = #response_type, Error = #error_type, Future = BoxFuture<#response_type, #error_type>>\n        };\n        bounds.push(bound);\n\n        let method = quote! {\n            async fn #method_name(&self, request: #request_type) -> #result_type<#response_type> {\n                self.clone().call(request).await\n            }\n        };\n        methods.extend(method);\n    }\n    (bounds, methods)\n}\n\nfn generate_grpc_client_adapter(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let service_name_string = service_name.to_string();\n    let grpc_client_package_name = &context.grpc_client_package_name;\n    let grpc_client_package_name_string = &context.package_name.to_string();\n    let grpc_client_name = &context.grpc_client_name;\n    let grpc_client_adapter_name = &context.grpc_client_adapter_name;\n    let grpc_server_adapter_methods = generate_grpc_client_adapter_methods(context);\n    let extra_grpc_server_adapter_methods = if context.generate_extra_service_methods {\n        quote! 
{\n            async fn check_connectivity(&self) -> anyhow::Result<()> {\n                if self.connection_addrs_rx.borrow().is_empty() {\n                    anyhow::bail!(\"no server currently available\")\n                }\n                Ok(())\n            }\n\n            fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n                self.connection_addrs_rx\n                    .borrow()\n                    .iter()\n                    .flat_map(|addr| quickwit_common::uri::Uri::from_str(&format!(\"grpc://{addr}/{}.{}\", #grpc_client_package_name_string, #service_name_string)))\n                    .collect()\n            }\n        }\n    } else {\n        TokenStream::new()\n    };\n\n    quote! {\n        #[derive(Debug, Clone)]\n        pub struct #grpc_client_adapter_name<T> {\n            inner: T,\n            // TODO: remove this field once `check_connectivity` is used for all services.\n            #[allow(dead_code)]\n            connection_addrs_rx: tokio::sync::watch::Receiver<std::collections::HashSet<std::net::SocketAddr>>,\n        }\n\n        impl<T> #grpc_client_adapter_name<T> {\n            pub fn new(instance: T, connection_addrs_rx: tokio::sync::watch::Receiver<std::collections::HashSet<std::net::SocketAddr>>) -> Self {\n                Self {\n                    inner: instance,\n                    connection_addrs_rx\n                }\n            }\n        }\n\n        #[async_trait::async_trait]\n        impl<T> #service_name for #grpc_client_adapter_name<#grpc_client_package_name::#grpc_client_name<T>>\n        where\n            T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send + Sync + 'static,\n            T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n            <T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError> + Send,\n            T::Future: Send\n        {\n            #grpc_server_adapter_methods\n      
      #extra_grpc_server_adapter_methods\n        }\n    }\n}\n\nfn generate_grpc_client_adapter_methods(context: &CodegenContext) -> TokenStream {\n    let result_type = &context.result_type;\n\n    let mut stream = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = syn_method.request_type(false);\n        let rpc_name = syn_method.rpc_name(false);\n        let response_type = syn_method.response_type(context, false);\n\n        let into_response_type = if syn_method.server_streaming {\n            quote! { |response|\n                {\n                    let streaming: tonic::Streaming<_> = response.into_inner();\n                    let stream = quickwit_common::ServiceStream::from(streaming);\n                    stream.map_err(|status| crate::error::grpc_status_to_service_error(status, #rpc_name))\n                }\n            }\n        } else {\n            quote! { |response| response.into_inner() }\n        };\n        let method = quote! 
{\n            async fn #method_name(&self, request: #request_type) -> #result_type<#response_type> {\n                self.inner\n                    .clone()\n                    .#method_name(request)\n                    .await\n                    .map(#into_response_type)\n                    .map_err(|status| crate::error::grpc_status_to_service_error(status, #rpc_name))\n            }\n        };\n        stream.extend(method);\n    }\n    stream\n}\n\nfn generate_grpc_server_adapter(context: &CodegenContext) -> TokenStream {\n    let service_name = &context.service_name;\n    let inner_client_name = &context.inner_client_name;\n    let grpc_server_package_name = &context.grpc_server_package_name;\n    let grpc_service_name = &context.grpc_service_name;\n    let grpc_server_adapter_name = &context.grpc_server_adapter_name;\n    let grpc_server_adapter_methods = generate_grpc_server_adapter_methods(context);\n\n    quote! {\n        #[derive(Debug)]\n        pub struct #grpc_server_adapter_name {\n            inner: #inner_client_name,\n        }\n\n        impl #grpc_server_adapter_name {\n            pub fn new<T>(instance: T) -> Self\n            where T: #service_name {\n                Self {\n                    inner: #inner_client_name(std::sync::Arc::new(instance)),\n                }\n            }\n        }\n\n        #[async_trait::async_trait]\n        impl #grpc_server_package_name::#grpc_service_name for #grpc_server_adapter_name {\n            #grpc_server_adapter_methods\n        }\n    }\n}\n\nfn generate_grpc_server_adapter_methods(context: &CodegenContext) -> TokenStream {\n    let mut stream = TokenStream::new();\n\n    for syn_method in &context.methods {\n        let method_name = syn_method.name.to_token_stream();\n        let request_type = if syn_method.client_streaming {\n            let request_type = &syn_method.request_type;\n            quote! 
{ tonic::Streaming<#request_type> }\n        } else {\n            syn_method.request_type.to_token_stream()\n        };\n        let method_arg = if syn_method.client_streaming {\n            quote! {\n                {\n                    let streaming: tonic::Streaming<_> = request.into_inner();\n                    quickwit_common::ServiceStream::from(streaming)\n                }\n            }\n        } else {\n            quote! { request.into_inner() }\n        };\n        let response_type = if syn_method.server_streaming {\n            let associated_type_name = quote::format_ident!(\"{}Stream\", syn_method.proto_name);\n            quote! { Self::#associated_type_name }\n        } else {\n            syn_method.response_type.to_token_stream()\n        };\n        let associated_type = if syn_method.server_streaming {\n            let associated_type_name = quote::format_ident!(\"{}Stream\", syn_method.proto_name);\n            let response_type = &syn_method.response_type;\n            quote! { type #associated_type_name = quickwit_common::ServiceStream<tonic::Result<#response_type>>; }\n        } else {\n            TokenStream::new()\n        };\n        let into_response_type = if syn_method.server_streaming {\n            quote! {\n                |stream| tonic::Response::new(stream.map_err(crate::error::grpc_error_to_grpc_status))\n            }\n        } else {\n            quote! { tonic::Response::new }\n        };\n        let method = quote! 
{\n            #associated_type\n\n            async fn #method_name(&self, request: tonic::Request<#request_type>) -> Result<tonic::Response<#response_type>, tonic::Status> {\n                self.inner\n                    .0\n                    .#method_name(#method_arg)\n                    .await\n                    .map(#into_response_type)\n                    .map_err(crate::error::grpc_error_to_grpc_status)\n            }\n        };\n        stream.extend(method);\n    }\n    stream\n}\n\n/// A [`ServiceGenerator`] wrapper that appends a suffix to the name of the wrapped service. It is\n/// used to add a `Grpc` suffix to the service, client, and server generated by tonic.\nstruct WithSuffixServiceGenerator {\n    suffix: String,\n    inner: Box<dyn ServiceGenerator>,\n}\n\nimpl WithSuffixServiceGenerator {\n    fn new(suffix: &str, service_generator: Box<dyn ServiceGenerator>) -> Self {\n        Self {\n            suffix: suffix.to_string(),\n            inner: service_generator,\n        }\n    }\n}\n\nimpl ServiceGenerator for WithSuffixServiceGenerator {\n    fn generate(&mut self, mut service: Service, buf: &mut String) {\n        service.name = format!(\"{}{}\", service.name, self.suffix);\n        self.inner.generate(service, buf);\n    }\n\n    fn finalize(&mut self, buf: &mut String) {\n        self.inner.finalize(buf);\n    }\n\n    fn finalize_package(&mut self, package: &str, buf: &mut String) {\n        self.inner.finalize_package(package, buf);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-codegen/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod codegen;\n\npub use codegen::Codegen;\npub use prost_build::Config as ProstConfig;\n"
  },
  {
    "path": "quickwit/quickwit-common/Cargo.toml",
    "content": "[package]\nname = \"quickwit-common\"\ndescription = \"Shared utilities for Quickwit\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-speed-limit = { workspace = true }\nasync-trait = { workspace = true }\nbacktrace = { workspace = true, optional = true }\nbytesize = { workspace = true }\ncoarsetime = { workspace = true }\ndyn-clone = { workspace = true }\nenv_logger = { workspace = true }\nfnv = { workspace = true }\nfutures = { workspace = true }\ngovernor = { workspace = true }\nhome = { workspace = true }\nhostname = { workspace = true }\nhttp = { workspace = true }\nhyper = { workspace = true }\nhyper-util = { workspace = true, optional = true }\nitertools = { workspace = true }\nonce_cell = { workspace = true }\npin-project = { workspace = true }\npnet = { workspace = true }\nprometheus = { workspace = true }\nrand = { workspace = true }\nrayon = { workspace = true }\nregex = { workspace = true }\nserde = { workspace = true }\nsiphasher = { workspace = true }\nsysinfo = { workspace = true }\ntempfile = { workspace = true }\nthiserror = { workspace = true }\ntikv-jemallocator = { workspace = true, optional = true }\ntokio = { workspace = true }\ntokio-metrics = { workspace = true }\ntokio-stream = { workspace = true }\ntonic = { workspace = true, features = [\n    \"tls-native-roots\",\n    \"server\",\n    \"channel\",\n] }\ntower = { workspace = true }\ntracing = { workspace = true }\n\n[features]\ntestsuite = [\"hyper-util\"]\nnamed_tasks = [\"tokio/tracing\"]\njemalloc-profiled = [\n    \"named_tasks\",\n    \"dep:backtrace\",\n    \"dep:tikv-jemallocator\",\n]\n\n[dev-dependencies]\nhyper-util = { workspace = true }\nproptest = { workspace = true }\nserde_json = { workspace = true }\nserial_test = { workspace = true }\ntempfile = { 
workspace = true }\ntokio = { workspace = true, features = [\"test-util\"] }\n"
  },
  {
    "path": "quickwit/quickwit-common/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nfn main() {\n    println!(\"cargo::rustc-check-cfg=cfg(tokio_unstable)\");\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/alloc_tracker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::collections::hash_map::Entry;\n\nuse bytesize::ByteSize;\n\n#[derive(Debug)]\nstruct Allocation {\n    pub callsite_hash: u64,\n    pub size: ByteSize,\n}\n\n#[derive(Debug, Copy, Clone)]\npub struct AllocStat {\n    pub count: u64,\n    pub size: ByteSize,\n    pub last_report: ByteSize,\n}\n\n#[derive(Debug)]\nenum TrackerStatus {\n    Started { reporting_interval: ByteSize },\n    Stopped,\n}\n\n/// WARN:\n/// - keys and values in these maps should not allocate!\n/// - we assume HashMaps don't allocate if their capacity is not exceeded\n#[derive(Debug)]\npub struct Allocations {\n    memory_locations: HashMap<usize, Allocation>,\n    max_tracked_memory_locations: usize,\n    callsite_statistics: HashMap<u64, AllocStat>,\n    max_tracked_callsites: usize,\n    status: TrackerStatus,\n}\n\nimpl Default for Allocations {\n    fn default() -> Self {\n        let max_tracked_memory_locations = 128 * 1024;\n        let max_tracked_callsites = 32 * 1024;\n        // TODO: We use a load factor of 0.5 to avoid resizing. 
There is no\n        // strict guarantee with std::collections::HashMap that it's enough, but\n        // it seems to be the case in practice (see test_tracker_full).\n        Self {\n            memory_locations: HashMap::with_capacity(2 * max_tracked_memory_locations),\n            max_tracked_memory_locations,\n            callsite_statistics: HashMap::with_capacity(2 * max_tracked_callsites),\n            max_tracked_callsites,\n            status: TrackerStatus::Stopped,\n        }\n    }\n}\n\npub enum AllocRecordingResponse {\n    ThresholdExceeded(AllocStat),\n    ThresholdNotExceeded,\n    TrackerFull(&'static str),\n    NotStarted,\n}\n\npub enum ReallocRecordingResponse {\n    ThresholdExceeded {\n        statistics: AllocStat,\n        callsite_hash: u64,\n    },\n    ThresholdNotExceeded,\n    NotStarted,\n}\n\nimpl Allocations {\n    pub fn init(&mut self, reporting_interval_bytes: u64) {\n        self.memory_locations.clear();\n        self.callsite_statistics.clear();\n        self.status = TrackerStatus::Started {\n            reporting_interval: ByteSize(reporting_interval_bytes),\n        }\n    }\n\n    /// Records an allocation and occasionally reports the cumulated allocation\n    /// size for the provided callsite_hash.\n    ///\n    /// Every time the total allocated size for a given callsite_hash exceeds\n    /// the previous reported value by at least reporting_interval, the new total\n    /// allocated size is reported.\n    ///\n    /// WARN: this function should not allocate!\n    pub fn record_allocation(\n        &mut self,\n        callsite_hash: u64,\n        size_bytes: u64,\n        ptr: *mut u8,\n    ) -> AllocRecordingResponse {\n        let TrackerStatus::Started { reporting_interval } = self.status else {\n            return AllocRecordingResponse::NotStarted;\n        };\n        if self.max_tracked_memory_locations == self.memory_locations.len() {\n            return 
AllocRecordingResponse::TrackerFull(\"memory_locations\");\n        }\n        if self.max_tracked_callsites == self.callsite_statistics.len() {\n            return AllocRecordingResponse::TrackerFull(\"tracked_callsites\");\n        }\n        self.memory_locations.insert(\n            ptr as usize,\n            Allocation {\n                callsite_hash,\n                size: ByteSize(size_bytes),\n            },\n        );\n        let entry = self\n            .callsite_statistics\n            .entry(callsite_hash)\n            .and_modify(|stat| {\n                stat.count += 1;\n                stat.size += size_bytes;\n            })\n            .or_insert(AllocStat {\n                count: 1,\n                size: ByteSize(size_bytes),\n                last_report: ByteSize(0),\n            });\n        let new_threshold_exceeded = entry.size >= (entry.last_report + reporting_interval);\n        if new_threshold_exceeded {\n            let reported_statistic = *entry;\n            entry.last_report = entry.size;\n            AllocRecordingResponse::ThresholdExceeded(reported_statistic)\n        } else {\n            AllocRecordingResponse::ThresholdNotExceeded\n        }\n    }\n\n    /// Updates the memory location and size of an existing allocation. 
Only\n    /// update the statistics if the original allocation was recorded.\n    ///\n    /// WARN: this function should not allocate!\n    pub fn record_reallocation(\n        &mut self,\n        new_size_bytes: u64,\n        old_ptr: *mut u8,\n        new_ptr: *mut u8,\n    ) -> ReallocRecordingResponse {\n        let TrackerStatus::Started { reporting_interval } = self.status else {\n            return ReallocRecordingResponse::NotStarted;\n        };\n        let (callsite_hash, old_size_bytes) = if old_ptr != new_ptr {\n            let Some(old_alloc) = self.memory_locations.remove(&(old_ptr as usize)) else {\n                return ReallocRecordingResponse::ThresholdNotExceeded;\n            };\n            self.memory_locations.insert(\n                new_ptr as usize,\n                Allocation {\n                    callsite_hash: old_alloc.callsite_hash,\n                    size: ByteSize(new_size_bytes),\n                },\n            );\n            (old_alloc.callsite_hash, old_alloc.size.0)\n        } else {\n            let Some(alloc) = self.memory_locations.get_mut(&(old_ptr as usize)) else {\n                return ReallocRecordingResponse::ThresholdNotExceeded;\n            };\n            let old_size_bytes = alloc.size.0;\n            alloc.size = ByteSize(new_size_bytes);\n            (alloc.callsite_hash, old_size_bytes)\n        };\n\n        let delta = new_size_bytes as i64 - old_size_bytes as i64;\n\n        let Some(current_stat) = self.callsite_statistics.get_mut(&callsite_hash) else {\n            // tables are inconsistent, this should not happen\n            return ReallocRecordingResponse::ThresholdNotExceeded;\n        };\n        current_stat.size = ByteSize((current_stat.size.0 as i64 + delta) as u64);\n        let new_threshold_exceeded =\n            current_stat.size >= (current_stat.last_report + reporting_interval);\n        if new_threshold_exceeded {\n            let reported_statistic = *current_stat;\n            
current_stat.last_report = current_stat.size;\n            ReallocRecordingResponse::ThresholdExceeded {\n                statistics: reported_statistic,\n                callsite_hash,\n            }\n        } else {\n            ReallocRecordingResponse::ThresholdNotExceeded\n        }\n    }\n\n    /// WARN: this function should not allocate!\n    pub fn record_deallocation(&mut self, ptr: *mut u8) {\n        if let TrackerStatus::Stopped = self.status {\n            return;\n        }\n        let Some(Allocation {\n            size,\n            callsite_hash,\n            ..\n        }) = self.memory_locations.remove(&(ptr as usize))\n        else {\n            // this was allocated before the tracking started\n            return;\n        };\n        if let Entry::Occupied(mut content) = self.callsite_statistics.entry(callsite_hash) {\n            let new_size_bytes = content.get().size.0.saturating_sub(size.0);\n            let new_count = content.get().count.saturating_sub(1);\n            content.get_mut().count = new_count;\n            content.get_mut().size = ByteSize(new_size_bytes);\n            if content.get().count == 0 {\n                content.remove();\n            }\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    fn as_ptr(i: usize) -> *mut u8 {\n        i as *mut u8\n    }\n\n    #[test]\n    fn test_record_allocation_and_deallocation() {\n        let mut allocations = Allocations::default();\n        allocations.init(2000);\n        let callsite_hash_1 = 777;\n\n        let ptr_1 = as_ptr(1);\n        let response = allocations.record_allocation(callsite_hash_1, 1500, ptr_1);\n        assert!(matches!(\n            response,\n            AllocRecordingResponse::ThresholdNotExceeded\n        ));\n\n        let ptr_2 = as_ptr(2);\n        let response = allocations.record_allocation(callsite_hash_1, 1500, ptr_2);\n        let AllocRecordingResponse::ThresholdExceeded(statistic) = response else {\n            
panic!(\"Expected ThresholdExceeded response\");\n        };\n        assert_eq!(statistic.count, 2);\n        assert_eq!(statistic.size, ByteSize(3000));\n        assert_eq!(statistic.last_report, ByteSize(0));\n\n        allocations.record_deallocation(ptr_2);\n\n        // the threshold was already crossed\n        let ptr_3 = as_ptr(3);\n        let response = allocations.record_allocation(callsite_hash_1, 1500, ptr_3);\n        assert!(matches!(\n            response,\n            AllocRecordingResponse::ThresholdNotExceeded\n        ));\n\n        // this is a brand new call site with different statistics\n        let callsite_hash_2 = 42;\n        let ptr_4 = as_ptr(4);\n        let response = allocations.record_allocation(callsite_hash_2, 1500, ptr_4);\n        assert!(matches!(\n            response,\n            AllocRecordingResponse::ThresholdNotExceeded\n        ));\n    }\n\n    #[test]\n    fn test_record_allocation_and_reallocation() {\n        let mut allocations = Allocations::default();\n        allocations.init(2000);\n        let callsite_hash_1 = 777;\n\n        let ptr_1 = as_ptr(1);\n        let response = allocations.record_allocation(callsite_hash_1, 1500, ptr_1);\n        assert!(matches!(\n            response,\n            AllocRecordingResponse::ThresholdNotExceeded\n        ));\n\n        let ptr_2 = as_ptr(2);\n        let response = allocations.record_allocation(callsite_hash_1, 1500, ptr_2);\n        let AllocRecordingResponse::ThresholdExceeded(statistic) = response else {\n            panic!(\"Expected ThresholdExceeded response\");\n        };\n        assert_eq!(statistic.count, 2);\n        assert_eq!(statistic.size, ByteSize(3000));\n        assert_eq!(statistic.last_report, ByteSize(0));\n\n        // alloc grows a little bit\n        let response = allocations.record_reallocation(2000, ptr_1, ptr_1);\n        assert!(matches!(\n            response,\n            ReallocRecordingResponse::ThresholdNotExceeded\n        
));\n\n        // alloc grows a lot\n        let response = allocations.record_reallocation(4000, ptr_1, ptr_1);\n        let ReallocRecordingResponse::ThresholdExceeded {\n            statistics,\n            callsite_hash,\n        } = response\n        else {\n            panic!(\"Expected ThresholdExceeded response\");\n        };\n        assert_eq!(statistics.count, 2);\n        assert_eq!(statistics.size, ByteSize(5500));\n        assert_eq!(statistics.last_report, ByteSize(3000));\n        assert_eq!(callsite_hash, callsite_hash_1);\n\n        // alloc grows a little bit and moves\n        let ptr_3 = as_ptr(3);\n        let response = allocations.record_reallocation(4500, ptr_1, ptr_3);\n        assert!(matches!(\n            response,\n            ReallocRecordingResponse::ThresholdNotExceeded\n        ));\n\n        // alloc grows a lot and moves\n        let ptr_4 = as_ptr(4);\n        let response = allocations.record_reallocation(6000, ptr_3, ptr_4);\n        let ReallocRecordingResponse::ThresholdExceeded {\n            statistics,\n            callsite_hash,\n        } = response\n        else {\n            panic!(\"Expected ThresholdExceeded response\");\n        };\n        assert_eq!(statistics.count, 2);\n        assert_eq!(statistics.size, ByteSize(7500));\n        assert_eq!(statistics.last_report, ByteSize(5500));\n        assert_eq!(callsite_hash, callsite_hash_1);\n\n        // once an existing allocation has moved, its previous location can be re-allocated\n        let response = allocations.record_allocation(callsite_hash_1, 2000, ptr_1);\n        let AllocRecordingResponse::ThresholdExceeded(statistics) = response else {\n            panic!(\"Expected ThresholdExceeded response\");\n        };\n        assert_eq!(statistics.count, 3);\n        assert_eq!(statistics.size, ByteSize(9500));\n        assert_eq!(statistics.last_report, ByteSize(7500));\n        assert_eq!(callsite_hash, callsite_hash_1);\n\n        // reallocation is ignored on 
unknown allocation\n        let ptr_404 = as_ptr(404);\n        let response = allocations.record_reallocation(10000, ptr_404, ptr_404);\n        assert!(matches!(\n            response,\n            ReallocRecordingResponse::ThresholdNotExceeded\n        ));\n    }\n\n    #[test]\n    fn test_tracker_full() {\n        let mut allocations = Allocations::default();\n        allocations.init(1024 * 1024 * 1024);\n        let max_tracked_locations = allocations.max_tracked_memory_locations;\n\n        // Track a first allocation. This one is not removed throughout this test.\n        let first_location_ptr = as_ptr(1);\n        let response = allocations.record_allocation(777, 10, first_location_ptr);\n        assert!(matches!(\n            response,\n            AllocRecordingResponse::ThresholdNotExceeded\n        ));\n        let ref_addr = allocations\n            .memory_locations\n            .get(&(first_location_ptr as usize))\n            .unwrap() as *const Allocation;\n        // Assert that no hashmap resize occurs by tracking the address\n        // stability of the first value. 
Using HashMap::capacity() proved not to\n        // be reliable (unclear spec).\n        let assert_locations_map_didnt_move = |allocations: &Allocations, loc: &str| {\n            assert_eq!(\n                allocations\n                    .memory_locations\n                    .get(&(first_location_ptr as usize))\n                    .unwrap() as *const Allocation,\n                ref_addr,\n                \"{loc}\",\n            );\n        };\n\n        // fill the table\n        let moving_ptr_range = (first_location_ptr as usize + 1)\n            ..(first_location_ptr as usize + max_tracked_locations);\n        for i in moving_ptr_range.clone() {\n            let ptr = as_ptr(i);\n            let response = allocations.record_allocation(777, 10, ptr);\n            assert!(matches!(\n                response,\n                AllocRecordingResponse::ThresholdNotExceeded\n            ));\n            assert_locations_map_didnt_move(&allocations, \"fill\");\n        }\n        assert_eq!(allocations.memory_locations.len(), max_tracked_locations);\n\n        // the table is full, no more allocation is tracked\n        let response = allocations.record_allocation(777, 10, as_ptr(moving_ptr_range.end));\n        assert!(matches!(\n            response,\n            AllocRecordingResponse::TrackerFull(\"memory_locations\")\n        ));\n        assert_locations_map_didnt_move(&allocations, \"full\");\n\n        // run a heavy insert/remove workload\n        let last_location = 10 * max_tracked_locations;\n        for i in moving_ptr_range.end..=last_location {\n            let removed_ptr = as_ptr(i - 1);\n            allocations.record_deallocation(removed_ptr);\n            let inserted_ptr = as_ptr(i);\n            let response = allocations.record_allocation(888, 10, inserted_ptr);\n            assert!(matches!(\n                response,\n                AllocRecordingResponse::ThresholdNotExceeded\n            ));\n            
assert_locations_map_didnt_move(&allocations, \"reinsert\");\n        }\n\n        // reallocations are fine because they don't create an entry in the map\n        let response =\n            allocations.record_reallocation(10, as_ptr(last_location), as_ptr(last_location + 1));\n        assert!(matches!(\n            response,\n            ReallocRecordingResponse::ThresholdNotExceeded,\n        ));\n        assert_locations_map_didnt_move(&allocations, \"realloc\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/binary_heap.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::{Ordering, Reverse};\nuse std::collections::BinaryHeap;\nuse std::iter::FusedIterator;\n\n// TODO: Remove this once `BinaryHeap::into_iter_sorted` is stabilized.\n\n#[must_use = \"iterators are lazy and do nothing unless consumed\"]\n#[derive(Clone, Debug)]\npub struct IntoIterSorted<T> {\n    inner: BinaryHeap<T>,\n}\n\nimpl<T> IntoIterSorted<T> {\n    pub fn new(instance: BinaryHeap<T>) -> Self {\n        Self { inner: instance }\n    }\n}\n\nimpl<T: Ord> Iterator for IntoIterSorted<T> {\n    type Item = T;\n\n    #[inline]\n    fn next(&mut self) -> Option<T> {\n        self.inner.pop()\n    }\n\n    #[inline]\n    fn size_hint(&self) -> (usize, Option<usize>) {\n        let exact = self.inner.len();\n        (exact, Some(exact))\n    }\n}\n\nimpl<T: Ord> ExactSizeIterator for IntoIterSorted<T> {}\n\nimpl<T: Ord> FusedIterator for IntoIterSorted<T> {}\n\n/// Consumes an iterator entirely and return the top-K best element according to a scoring key.\n/// Behavior under the presence of ties is unspecified.\npub fn top_k<T, SortKeyFn, O>(\n    mut items: impl Iterator<Item = T>,\n    k: usize,\n    sort_key_fn: SortKeyFn,\n) -> Vec<T>\nwhere\n    SortKeyFn: Fn(&T) -> O,\n    O: Ord,\n{\n    if k == 0 {\n        return Vec::new();\n    }\n    let mut heap: BinaryHeap<Reverse<OrderItemPair<O, T>>> = BinaryHeap::with_capacity(k);\n   
 for _ in 0..k {\n        if let Some(item) = items.next() {\n            let order: O = sort_key_fn(&item);\n            heap.push(Reverse(OrderItemPair { order, item }));\n        } else {\n            break;\n        }\n    }\n    if heap.len() == k {\n        for item in items {\n            let mut head = heap.peek_mut().unwrap();\n            let order = sort_key_fn(&item);\n            if head.0.order < order {\n                *head = Reverse(OrderItemPair { order, item });\n            }\n        }\n    }\n    let resulting_top_k: Vec<T> = heap\n        .into_sorted_vec()\n        .into_iter()\n        .map(|order_item| order_item.0.item)\n        .collect();\n    resulting_top_k\n}\n\n#[derive(Clone)]\nstruct OrderItemPair<O: Ord, T> {\n    order: O,\n    item: T,\n}\n\nimpl<O: Ord, T> Ord for OrderItemPair<O, T> {\n    fn cmp(&self, other: &Self) -> Ordering {\n        self.order.cmp(&other.order)\n    }\n}\n\nimpl<O: Ord, T> PartialOrd for OrderItemPair<O, T> {\n    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl<O: Ord, T> PartialEq for OrderItemPair<O, T> {\n    fn eq(&self, other: &Self) -> bool {\n        self.order.cmp(&other.order) == Ordering::Equal\n    }\n}\n\nimpl<O: Ord, T> Eq for OrderItemPair<O, T> {}\n\npub trait SortKeyMapper<Value> {\n    type Key;\n    fn get_sort_key(&self, value: &Value) -> Self::Key;\n}\n\n/// Progressively compute top-k.\n#[derive(Clone)]\npub struct TopK<T, O: Ord, S> {\n    heap: BinaryHeap<Reverse<OrderItemPair<O, T>>>,\n    pub sort_key_mapper: S,\n    k: usize,\n}\n\nimpl<T, O, S> TopK<T, O, S>\nwhere\n    O: Ord,\n    S: SortKeyMapper<T, Key = O>,\n{\n    /// Create a new top-k computer.\n    pub fn new(k: usize, sort_key_mapper: S) -> Self {\n        TopK {\n            heap: BinaryHeap::with_capacity(k),\n            sort_key_mapper,\n            k,\n        }\n    }\n\n    /// Whether k elements are already stored.\n    pub fn at_capacity(&self) 
-> bool {\n        self.heap.len() >= self.k\n    }\n\n    pub fn max_len(&self) -> usize {\n        self.k\n    }\n\n    /// Try to add new entries, if they are better than the current worst.\n    pub fn add_entries(&mut self, mut items: impl Iterator<Item = T>) {\n        if self.k == 0 {\n            return;\n        }\n        while !self.at_capacity() {\n            if let Some(item) = items.next() {\n                let order: O = self.sort_key_mapper.get_sort_key(&item);\n                self.heap.push(Reverse(OrderItemPair { order, item }));\n            } else {\n                return;\n            }\n        }\n\n        for item in items {\n            let mut head = self.heap.peek_mut().unwrap();\n            let order = self.sort_key_mapper.get_sort_key(&item);\n            if head.0.order < order {\n                *head = Reverse(OrderItemPair { order, item });\n            }\n        }\n    }\n\n    pub fn add_entry(&mut self, item: T) {\n        self.add_entries(std::iter::once(item))\n    }\n\n    /// Get a reference to the worst entry.\n    pub fn peek_worst(&self) -> Option<&T> {\n        self.heap.peek().map(|entry| &entry.0.item)\n    }\n\n    /// Get a Vec of sorted entries.\n    pub fn finalize(self) -> Vec<T> {\n        self.heap\n            .into_sorted_vec()\n            .into_iter()\n            .map(|order_item| order_item.0.item)\n            .collect()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use super::*;\n\n    #[test]\n    fn test_top_k() {\n        let top_k = super::top_k(vec![1u32, 2, 3].into_iter(), 2, |n| *n);\n        assert_eq!(&top_k, &[3, 2]);\n        let top_k = super::top_k(vec![1u32, 2, 3].into_iter(), 2, |n| Reverse(*n));\n        assert_eq!(&top_k, &[1, 2]);\n        let top_k = super::top_k(vec![1u32, 2, 2].into_iter(), 4, |n| *n);\n        assert_eq!(&top_k, &[2u32, 2, 1]);\n        let top_k = super::top_k(vec![1u32, 2, 2].into_iter(), 4, |n| *n);\n        assert_eq!(&top_k, &[2u32, 2, 1]);\n        let 
top_k: Vec<u32> = super::top_k(Vec::new().into_iter(), 4, |n| *n);\n        assert!(top_k.is_empty());\n    }\n\n    #[test]\n    fn test_incremental_top_k() {\n        struct Mapper(bool);\n        impl SortKeyMapper<u32> for Mapper {\n            type Key = u32;\n            fn get_sort_key(&self, value: &u32) -> u32 {\n                if self.0 { u32::MAX - value } else { *value }\n            }\n        }\n        let mut top_k = TopK::new(2, Mapper(false));\n        top_k.add_entries([1u32, 2, 3].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&2));\n        assert_eq!(&top_k.finalize(), &[3, 2]);\n\n        let mut top_k = TopK::new(2, Mapper(false));\n        top_k.add_entries([1u32].into_iter());\n        assert!(!top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&1));\n        top_k.add_entries([3].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&1));\n        top_k.add_entries([2].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&2));\n        assert_eq!(&top_k.finalize(), &[3, 2]);\n\n        let mut top_k = TopK::new(2, Mapper(true));\n        top_k.add_entries([1u32, 2, 3].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&2));\n        assert_eq!(&top_k.finalize(), &[1, 2]);\n\n        let mut top_k = TopK::new(2, Mapper(true));\n        top_k.add_entries([1u32].into_iter());\n        assert!(!top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&1));\n        top_k.add_entries([3].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&3));\n        top_k.add_entries([2].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&2));\n        assert_eq!(&top_k.finalize(), &[1, 2]);\n\n        let mut top_k = TopK::new(4, Mapper(false));\n        
top_k.add_entries([2u32, 1, 2].into_iter());\n        assert!(!top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&1));\n        assert_eq!(&top_k.finalize(), &[2, 2, 1]);\n\n        let mut top_k = TopK::new(4, Mapper(false));\n        top_k.add_entries([2u32].into_iter());\n        assert!(!top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&2));\n        top_k.add_entries([1].into_iter());\n        assert!(!top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&1));\n        top_k.add_entries([2].into_iter());\n        assert!(!top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), Some(&1));\n        assert_eq!(&top_k.finalize(), &[2, 2, 1]);\n\n        let mut top_k = TopK::<u32, u32, _>::new(4, Mapper(false));\n        top_k.add_entries([].into_iter());\n        assert!(top_k.finalize().is_empty());\n\n        let mut top_k = TopK::new(0, Mapper(false));\n        top_k.add_entries([1u32, 2, 3].into_iter());\n        assert!(top_k.at_capacity());\n        assert_eq!(top_k.peek_worst(), None);\n        assert!(top_k.finalize().is_empty());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/coolid.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse rand::distr::Alphanumeric;\nuse rand::prelude::*;\n\nconst ADJECTIVES: &[&str] = &[\n    \"aged\",\n    \"ancient\",\n    \"autumn\",\n    \"billowing\",\n    \"bitter\",\n    \"black\",\n    \"blue\",\n    \"bold\",\n    \"broken\",\n    \"cold\",\n    \"cool\",\n    \"crimson\",\n    \"damp\",\n    \"dark\",\n    \"dawn\",\n    \"delicate\",\n    \"divine\",\n    \"dry\",\n    \"empty\",\n    \"falling\",\n    \"floral\",\n    \"fragrant\",\n    \"frosty\",\n    \"green\",\n    \"hidden\",\n    \"holy\",\n    \"icy\",\n    \"late\",\n    \"lingering\",\n    \"little\",\n    \"lively\",\n    \"long\",\n    \"misty\",\n    \"morning\",\n    \"muddy\",\n    \"nameless\",\n    \"old\",\n    \"patient\",\n    \"polished\",\n    \"proud\",\n    \"purple\",\n    \"quiet\",\n    \"red\",\n    \"restless\",\n    \"rough\",\n    \"shy\",\n    \"silent\",\n    \"small\",\n    \"snowy\",\n    \"solitary\",\n    \"sparkling\",\n    \"spring\",\n    \"still\",\n    \"summer\",\n    \"throbbing\",\n    \"twilight\",\n    \"wandering\",\n    \"weathered\",\n    \"white\",\n    \"wild\",\n    \"winter\",\n    \"wispy\",\n    \"withered\",\n    \"young\",\n];\n\n/// Returns a randomly generated id\npub fn new_coolid(name: &str) -> String {\n    let mut rng = rand::rng();\n    let adjective = ADJECTIVES[rng.random_range(0..ADJECTIVES.len())];\n    let slug: 
String = rng\n        .sample_iter(&Alphanumeric)\n        .take(4)\n        .map(char::from)\n        .collect();\n    format!(\"{name}-{adjective}-{slug}\")\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashSet;\n\n    use super::new_coolid;\n\n    #[test]\n    fn test_coolid() {\n        let cool_ids: HashSet<String> = std::iter::repeat_with(|| new_coolid(\"hello\"))\n            .take(100)\n            .collect();\n        assert_eq!(cool_ids.len(), 100);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/cpus.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZero;\nuse std::sync::OnceLock;\n\nuse tracing::{error, info, warn};\n\nconst QW_NUM_CPUS_ENV_KEY: &str = \"QW_NUM_CPUS\";\nconst KUBERNETES_LIMITS_CPU: &str = \"KUBERNETES_LIMITS_CPU\";\n\n/// Return the number of vCPU/hyperthreads available.\n/// The following methods are used in order:\n/// - from the `QW_NUM_CPUS` environment variable\n/// - from the `KUBERNETES_LIMITS_CPU` environment variable\n/// - from the operating system\n/// - default to 2.\npub fn num_cpus() -> usize {\n    static NUM_CPUS: OnceLock<usize> = OnceLock::new();\n    *NUM_CPUS.get_or_init(num_cpus_aux)\n}\n\nfn num_cpus_aux() -> usize {\n    let num_cpus_from_os_opt = std::thread::available_parallelism()\n        .map(NonZero::get)\n        .inspect_err(|err| {\n            error!(error=?err, \"failed to detect the number of threads available: arbitrarily returning 2\");\n        })\n        .ok();\n    let num_cpus_from_env_opt = get_num_cpus_from_env(QW_NUM_CPUS_ENV_KEY);\n    let num_cpus_from_k8s_limit = get_num_cpus_from_env(KUBERNETES_LIMITS_CPU);\n\n    if let Some(num_cpus) = num_cpus_from_env_opt {\n        return num_cpus;\n    }\n\n    if let Some(num_cpus_from_k8s_limit) = num_cpus_from_k8s_limit {\n        info!(\n            \"num cpus from k8s limit: {},  possibly overriding os value {:?}\",\n            num_cpus_from_k8s_limit, 
num_cpus_from_os_opt\n        );\n        return num_cpus_from_k8s_limit;\n    }\n\n    if let Some(num_cpus_from_os_opt) = num_cpus_from_os_opt {\n        info!(\"num cpus from os: {}\", num_cpus_from_os_opt);\n        return num_cpus_from_os_opt;\n    }\n\n    warn!(\"failed to detect number of cpus. defaulting to 2\");\n    2\n}\n\nfn parse_cpu_to_mcpu(cpu_string: &str) -> Result<usize, &'static str> {\n    let trimmed_str = cpu_string.trim();\n\n    if trimmed_str.is_empty() {\n        return Err(\"input cpu_string cannot be empty\");\n    }\n\n    if let Some(val_str) = trimmed_str.strip_suffix('m') {\n        // The value is already in millicores.\n        val_str\n            .parse::<usize>()\n            .map_err(|_| \"invalid millicore value\")\n    } else {\n        // The value is in CPU cores.\n        let value = trimmed_str\n            .parse::<f64>()\n            .map_err(|_| \"invalid float value\")?;\n        Ok((value * 1000.0f64) as usize)\n    }\n}\n\n// Get the number of CPUs from an environment variable.\n// The value is expected to be in k8s format (200m means 200 millicores, 2 means 2 cores)\n//\n// We then get the number of vCPUs by rounding any non-integer value up.\nfn get_num_cpus_from_env(env_key: &str) -> Option<usize> {\n    let k8s_cpu_limit_str: String = crate::get_from_env_opt(env_key, false)?;\n    let mcpus = parse_cpu_to_mcpu(&k8s_cpu_limit_str)\n        .inspect_err(|err_msg| {\n            warn!(\n                \"failed to parse k8s cpu limit (`{}`): {}\",\n                k8s_cpu_limit_str, err_msg\n            );\n        })\n        .ok()?;\n    let num_vcpus = mcpus.div_ceil(1000);\n    Some(num_vcpus)\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_millicores() {\n        assert_eq!(parse_cpu_to_mcpu(\"500m\").unwrap(), 500);\n        assert_eq!(parse_cpu_to_mcpu(\"100m\").unwrap(), 100);\n        assert_eq!(parse_cpu_to_mcpu(\"2500m\").unwrap(), 2500);\n    }\n\n    #[test]\n    fn 
test_cores() {\n        assert_eq!(parse_cpu_to_mcpu(\"1\").unwrap(), 1000);\n        assert_eq!(parse_cpu_to_mcpu(\"2\").unwrap(), 2000);\n    }\n\n    #[test]\n    fn test_fractional_cores() {\n        assert_eq!(parse_cpu_to_mcpu(\"0.5\").unwrap(), 500);\n        assert_eq!(parse_cpu_to_mcpu(\"1.5\").unwrap(), 1500);\n        assert_eq!(parse_cpu_to_mcpu(\"0.25\").unwrap(), 250);\n    }\n\n    #[test]\n    fn test_with_whitespace() {\n        assert_eq!(parse_cpu_to_mcpu(\" 750m \").unwrap(), 750);\n        assert_eq!(parse_cpu_to_mcpu(\" 0.75 \").unwrap(), 750);\n    }\n\n    #[test]\n    fn test_invalid_input() {\n        assert!(parse_cpu_to_mcpu(\"\").is_err());\n        assert!(parse_cpu_to_mcpu(\"   \").is_err());\n        assert!(parse_cpu_to_mcpu(\"abc\").is_err());\n        assert!(parse_cpu_to_mcpu(\"1a\").is_err());\n        assert!(parse_cpu_to_mcpu(\"m500\").is_err());\n        assert!(parse_cpu_to_mcpu(\"500m1\").is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/fs.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::{Path, PathBuf};\n\nuse bytesize::ByteSize;\nuse sysinfo::{Disk, DiskRefreshKind};\nuse tokio;\n\n/// Deletes the contents of a directory.\npub async fn empty_dir<P: AsRef<Path>>(path: P) -> anyhow::Result<()> {\n    let mut entries = tokio::fs::read_dir(path).await?;\n    while let Some(entry) = entries.next_entry().await? {\n        if entry.file_type().await?.is_dir() {\n            tokio::fs::remove_dir_all(entry.path()).await?\n        } else {\n            tokio::fs::remove_file(entry.path()).await?;\n        }\n    }\n    Ok(())\n}\n\n/// Helper function to get the indexer split cache path.\npub fn get_cache_directory_path(data_dir_path: &Path) -> PathBuf {\n    data_dir_path.join(\"indexer-split-cache\").join(\"splits\")\n}\n\n/// Get the total size of the disk containing the given directory, or `None` if\n/// it couldn't be determined.\npub fn get_disk_size(dir_path: &Path) -> Option<ByteSize> {\n    let disks = sysinfo::Disks::new_with_refreshed_list_specifics(\n        DiskRefreshKind::nothing().with_storage(),\n    );\n    let mut best_match: Option<(&Disk, PathBuf)> = None;\n    let dir_path = dir_path.canonicalize().ok()?;\n    for disk in disks.list() {\n        let canonical_mount_path = disk.mount_point().canonicalize().ok()?;\n        if dir_path.starts_with(&canonical_mount_path) {\n            match best_match 
{\n                Some((_, best_mount_point))\n                    if canonical_mount_path.starts_with(&best_mount_point) =>\n                {\n                    best_match = Some((disk, canonical_mount_path.clone()));\n                }\n                None => {\n                    best_match = Some((disk, canonical_mount_path.clone()));\n                }\n                _ => {}\n            }\n        }\n        if canonical_mount_path.starts_with(&dir_path) && canonical_mount_path != dir_path {\n            // If a disk is mounted within the directory, we can't determine the\n            // size of the directory's disk.\n            return None;\n        }\n    }\n    best_match.map(|(disk, _)| ByteSize::b(disk.total_space()))\n}\n\n#[cfg(test)]\nmod tests {\n    use tempfile;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_empty_dir() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n\n        let file_path = temp_dir.path().join(\"file\");\n        tokio::fs::File::create(file_path).await?;\n\n        let subdir = temp_dir.path().join(\"subdir\");\n        tokio::fs::create_dir(&subdir).await?;\n\n        let subfile_path = subdir.join(\"subfile\");\n        tokio::fs::File::create(subfile_path).await?;\n\n        empty_dir(temp_dir.path()).await?;\n        assert!(\n            tokio::fs::read_dir(temp_dir.path())\n                .await?\n                .next_entry()\n                .await?\n                .is_none()\n        );\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/io.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// This file contains code copied from the Resource trait\n// in async-speed-limit from the TiKV project.\n// https://github.com/tikv/async-speed-limit/blob/master/src/io.rs\n//\n// Copyright 2019 TiKV Project Authors. Licensed under MIT or Apache-2.0.\n\n// We are simply porting the logic to tokio here and adding the functionality to\n// plug some metrics.\n\nuse std::future::Future;\nuse std::io;\nuse std::io::IoSlice;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\nuse std::time::Duration;\n\npub use async_speed_limit::Limiter;\nuse async_speed_limit::clock::StandardClock;\nuse async_speed_limit::limiter::Consume;\nuse bytesize::ByteSize;\nuse once_cell::sync::Lazy;\nuse pin_project::pin_project;\nuse prometheus::IntCounter;\nuse tokio::io::AsyncWrite;\n\nuse crate::metrics::{IntCounterVec, new_counter_vec};\nuse crate::{KillSwitch, Progress, ProtectedZoneGuard};\n\n// Max 1MB at a time.\nconst MAX_NUM_BYTES_WRITTEN_AT_ONCE: usize = 1 << 20;\n\nfn truncate_bytes(bytes: &[u8]) -> &[u8] {\n    let num_bytes = bytes.len().min(MAX_NUM_BYTES_WRITTEN_AT_ONCE);\n    &bytes[..num_bytes]\n}\n\nstruct IoMetrics {\n    write_bytes: IntCounterVec<1>,\n}\n\nimpl Default for IoMetrics {\n    fn default() -> Self {\n        let write_bytes = new_counter_vec(\n            \"write_bytes\",\n            \"Number of bytes written by a given component in 
[indexer, merger, deleter, \\\n             split_downloader_{merge,delete}]\",\n            \"\",\n            &[],\n            [\"component\"],\n        );\n        Self { write_bytes }\n    }\n}\n\nstatic IO_METRICS: Lazy<IoMetrics> = Lazy::new(IoMetrics::default);\n\n/// Parameter used in `async_speed_limit`.\n///\n/// The default value is good and does not need to be tweaked.\n/// We use a smaller value in unit tests to get reasonably accurate throttling over very\n/// short periods of time.\n///\n/// For more details, please refer to `async_speed_limit` documentation.\nconst REFILL_DURATION: Duration = if cfg!(test) {\n    Duration::from_millis(10)\n} else {\n    // Default value in async_speed_limit\n    Duration::from_millis(100)\n};\n\npub fn limiter(throughput: ByteSize) -> Limiter {\n    Limiter::builder(throughput.as_u64() as f64)\n        .refill(REFILL_DURATION)\n        .build()\n}\n\n#[derive(Clone)]\npub struct IoControls {\n    throughput_limiter_opt: Option<Limiter>,\n    bytes_counter: IntCounter,\n    progress: Progress,\n    kill_switch: KillSwitch,\n}\n\nimpl Default for IoControls {\n    fn default() -> Self {\n        let default_bytes_counter =\n            IntCounter::new(\"default_write_num_bytes\", \"Default write counter.\").unwrap();\n        IoControls {\n            throughput_limiter_opt: None,\n            progress: Progress::default(),\n            kill_switch: KillSwitch::default(),\n            bytes_counter: default_bytes_counter,\n        }\n    }\n}\n\nimpl IoControls {\n    #[must_use]\n    pub fn progress(&self) -> &Progress {\n        &self.progress\n    }\n\n    pub fn kill(&self) {\n        self.kill_switch.kill();\n    }\n\n    pub fn num_bytes(&self) -> u64 {\n        self.bytes_counter.get()\n    }\n\n    pub fn check_if_alive(&self) -> io::Result<ProtectedZoneGuard> {\n        if self.kill_switch.is_dead() {\n            return Err(io::Error::other(\"directory kill switch was activated\"));\n        }\n        let 
guard = self.progress.protect_zone();\n        Ok(guard)\n    }\n\n    pub fn set_component(mut self, component: &str) -> Self {\n        self.bytes_counter = IO_METRICS.write_bytes.with_label_values([component]);\n        self\n    }\n\n    pub fn set_throughput_limit(self, throughput: ByteSize) -> Self {\n        let throughput_limiter = Limiter::builder(throughput.as_u64() as f64)\n            .refill(REFILL_DURATION)\n            .build();\n        self.set_throughput_limiter_opt(Some(throughput_limiter))\n    }\n\n    pub fn set_throughput_limiter_opt(mut self, throughput_limiter_opt: Option<Limiter>) -> Self {\n        self.throughput_limiter_opt = throughput_limiter_opt;\n        self\n    }\n\n    pub fn set_bytes_counter(mut self, bytes_counter: IntCounter) -> Self {\n        self.bytes_counter = bytes_counter;\n        self\n    }\n\n    pub fn set_progress(mut self, progress: Progress) -> Self {\n        self.progress = progress;\n        self\n    }\n\n    pub fn set_kill_switch(mut self, kill_switch: KillSwitch) -> Self {\n        self.kill_switch = kill_switch;\n        self\n    }\n    fn consume_blocking(&self, num_bytes: usize) -> io::Result<()> {\n        let _guard = self.check_if_alive()?;\n        if let Some(throughput_limiter) = &self.throughput_limiter_opt {\n            throughput_limiter.blocking_consume(num_bytes);\n        }\n        self.bytes_counter.inc_by(num_bytes as u64);\n        Ok(())\n    }\n}\n\n#[pin_project]\npub struct ControlledWrite<A: IoControlsAccess, W> {\n    #[pin]\n    underlying_wrt: W,\n    waiter: Option<Consume<StandardClock, ()>>,\n    io_controls_access: A,\n}\n\nimpl<A: IoControlsAccess, W: AsyncWrite> ControlledWrite<A, W> {\n    // This function was copied from TiKV's `async-speed-limit`.\n    // Copyright 2019 TiKV Project Authors. Licensed under MIT or Apache-2.0.\n    /// Wraps a poll function with a delay after it.\n    ///\n    /// This method calls the given `poll` function until it is fulfilled. 
After\n    /// that, the result is saved into this `Resource` instance (therefore\n    /// different `poll_***` calls should not be interleaving), while returning\n    /// `Pending` until the limiter has completely consumed the result.\n    #[allow(dead_code)]\n    pub(crate) fn poll_limited(\n        self: Pin<&mut Self>,\n        cx: &mut Context<'_>,\n        poll: impl FnOnce(Pin<&mut W>, &mut Context<'_>) -> Poll<io::Result<usize>>,\n    ) -> Poll<io::Result<usize>> {\n        let this = self.project();\n\n        let _protect_guard = match this\n            .io_controls_access\n            .apply(|io_controls| io_controls.check_if_alive())\n        {\n            Ok(protect_guard) => protect_guard,\n            Err(io_err) => {\n                return Poll::Ready(Err(io_err));\n            }\n        };\n\n        if let Some(waiter) = this.waiter {\n            let res = Pin::new(waiter).poll(cx);\n            if res.is_pending() {\n                return Poll::Pending;\n            }\n            *this.waiter = None;\n        }\n\n        let res: Poll<io::Result<usize>> = poll(this.underlying_wrt, cx);\n        if let Poll::Ready(obj) = &res {\n            let len = *obj.as_ref().unwrap_or(&0);\n            if len > 0 {\n                let waiter = this.io_controls_access.apply(|io_controls| {\n                    io_controls.bytes_counter.inc_by(len as u64);\n                    io_controls\n                        .throughput_limiter_opt\n                        .as_ref()\n                        .map(|limiter| limiter.consume(len))\n                });\n                *this.waiter = waiter\n            }\n        }\n        res\n    }\n}\n\n/// Quirky spec: truncates the list of bufs, and keep as many leftmost elements\n/// as possible, within the constraint of not exceeding `max_len` bytes.\n///\n/// Please keep this function private\nfn quirky_truncate_slices<'a, 'b>(bufs: &'b [IoSlice<'a>], max_len: usize) -> &'b [IoSlice<'a>] {\n    if 
bufs.is_empty() {\n        return bufs;\n    }\n    let mut cumulated_len = bufs[0].len();\n    for (i, buf) in bufs.iter().enumerate().skip(1) {\n        cumulated_len += buf.len();\n        if cumulated_len > max_len {\n            return &bufs[..i];\n        }\n    }\n    bufs\n}\n\nimpl<A: IoControlsAccess, W: AsyncWrite> AsyncWrite for ControlledWrite<A, W> {\n    fn poll_write(\n        self: Pin<&mut Self>,\n        cx: &mut Context<'_>,\n        buf: &[u8],\n    ) -> Poll<io::Result<usize>> {\n        let buf = truncate_bytes(buf);\n        // The shadowing is on purpose.\n        self.poll_limited(cx, |r, cx| r.poll_write(cx, buf))\n    }\n\n    fn poll_write_vectored(\n        self: Pin<&mut Self>,\n        cx: &mut Context<'_>,\n        bufs: &[IoSlice<'_>],\n    ) -> Poll<io::Result<usize>> {\n        if bufs.is_empty() {\n            return Poll::Ready(Ok(0));\n        }\n        // The shadowing is on purpose.\n        let bufs = quirky_truncate_slices(bufs, MAX_NUM_BYTES_WRITTEN_AT_ONCE);\n        self.poll_limited(cx, |r, cx| r.poll_write_vectored(cx, bufs))\n    }\n\n    fn poll_flush(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {\n        self.project().underlying_wrt.poll_flush(cx)\n    }\n\n    fn poll_shutdown(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), io::Error>> {\n        self.project().underlying_wrt.poll_shutdown(cx)\n    }\n}\n\npub trait IoControlsAccess: Sized {\n    fn wrap_write<W>(self, wrt: W) -> ControlledWrite<Self, W> {\n        ControlledWrite {\n            underlying_wrt: wrt,\n            waiter: None,\n            io_controls_access: self,\n        }\n    }\n\n    fn apply<F, R>(&self, f: F) -> R\n    where F: Fn(&IoControls) -> R;\n}\n\nimpl IoControlsAccess for IoControls {\n    fn apply<F, R>(&self, f: F) -> R\n    where F: Fn(&IoControls) -> R {\n        f(self)\n    }\n}\n\nimpl<A, W> ControlledWrite<A, W>\nwhere A: IoControlsAccess\n{\n    pub fn underlying_wrt(&mut self) -> 
&mut W {\n        &mut self.underlying_wrt\n    }\n\n    fn check_if_alive(&self) -> io::Result<ProtectedZoneGuard> {\n        self.io_controls_access\n            .apply(|io_controls| io_controls.check_if_alive())\n    }\n}\n\nimpl<A, W: io::Write> io::Write for ControlledWrite<A, W>\nwhere A: IoControlsAccess\n{\n    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {\n        let buf = truncate_bytes(buf);\n        let written_num_bytes = self.underlying_wrt.write(buf)?;\n        self.io_controls_access\n            .apply(|io_controls| io_controls.consume_blocking(written_num_bytes))?;\n        Ok(written_num_bytes)\n    }\n\n    fn flush(&mut self) -> io::Result<()> {\n        // We deliberately ignore the kill switch status on flush.\n        // This is because the `RAMDirectory` currently panics if flush\n        // is not called before `Drop`.\n        let _guard = self.check_if_alive();\n        self.underlying_wrt.flush()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::io::{IoSlice, Write};\n    use std::time::Duration;\n\n    use bytesize::ByteSize;\n    use tokio::io::{AsyncWriteExt, sink};\n    use tokio::time::Instant;\n\n    use crate::io::{IoControls, IoControlsAccess};\n\n    #[tokio::test]\n    async fn test_controlled_writer_limited_async() {\n        let io_controls = IoControls::default().set_throughput_limit(ByteSize::mb(2));\n        let mut controlled_write = io_controls.clone().wrap_write(sink());\n        let buf = vec![44u8; 1_000];\n        let start = Instant::now();\n        // We write 200 KB\n        for _ in 0..200 {\n            controlled_write.write_all(&buf).await.unwrap();\n        }\n        controlled_write.flush().await.unwrap();\n        let elapsed = start.elapsed();\n        assert!(elapsed >= Duration::from_millis(50));\n        assert!(elapsed <= Duration::from_millis(150));\n        assert_eq!(io_controls.num_bytes(), 200_000u64);\n    }\n\n    #[tokio::test]\n    async fn 
test_controlled_writer_no_limit_async() {\n        let io_controls = IoControls::default();\n        let mut controlled_write = io_controls.clone().wrap_write(sink());\n        let buf = vec![44u8; 1_000];\n        let start = Instant::now();\n        // We write 2MB\n        for _ in 0..2_000 {\n            controlled_write.write_all(&buf).await.unwrap();\n        }\n        controlled_write.flush().await.unwrap();\n        let elapsed = start.elapsed();\n        assert!(elapsed <= Duration::from_millis(10));\n        assert_eq!(io_controls.num_bytes(), 2_000_000u64);\n    }\n\n    #[test]\n    fn test_controlled_writer_limited_sync() {\n        let io_controls = IoControls::default().set_throughput_limit(ByteSize::mb(2));\n        let mut controlled_write = io_controls.clone().wrap_write(std::io::sink());\n        let buf = vec![44u8; 1_000];\n        let start = Instant::now();\n        // We write 200 KB\n        for _ in 0..200 {\n            controlled_write.write_all(&buf).unwrap();\n        }\n        controlled_write.flush().unwrap();\n        let elapsed = start.elapsed();\n        assert!(elapsed >= Duration::from_millis(50));\n        assert!(elapsed <= Duration::from_millis(150));\n        assert_eq!(io_controls.num_bytes(), 200_000u64);\n    }\n\n    #[test]\n    fn test_controlled_writer_no_limit_sync() {\n        let io_controls = IoControls::default();\n        let mut controlled_write = io_controls.clone().wrap_write(std::io::sink());\n        let buf = vec![44u8; 1_000];\n        let start = Instant::now();\n        // We write 2MB\n        for _ in 0..2_000 {\n            controlled_write.write_all(&buf).unwrap();\n        }\n        controlled_write.flush().unwrap();\n        let elapsed = start.elapsed();\n        assert!(elapsed <= Duration::from_millis(5));\n        assert_eq!(io_controls.num_bytes(), 2_000_000u64);\n    }\n\n    #[test]\n    fn test_truncate_io_slices_one_slice_too_long_corner_case() {\n        let one_slice = 
IoSlice::new(&b\"abcdef\"[..]);\n        assert_eq!(super::quirky_truncate_slices(&[one_slice], 2).len(), 1);\n    }\n\n    #[test]\n    fn test_truncate_io_empty() {\n        assert_eq!(super::quirky_truncate_slices(&[], 2).len(), 0);\n    }\n\n    #[test]\n    fn test_truncate_io_slices() {\n        let slices = &[\n            IoSlice::new(&b\"abc\"[..]),\n            IoSlice::new(&b\"defg\"[..]),\n            IoSlice::new(&b\"hi\"[..]),\n        ];\n        assert_eq!(super::quirky_truncate_slices(slices, 0).len(), 1);\n        assert_eq!(super::quirky_truncate_slices(slices, 6).len(), 1);\n        assert_eq!(super::quirky_truncate_slices(slices, 7).len(), 2);\n        assert_eq!(super::quirky_truncate_slices(slices, 9).len(), 3);\n        assert_eq!(super::quirky_truncate_slices(slices, 10).len(), 3);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/jemalloc_profiled.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::alloc::{GlobalAlloc, Layout};\nuse std::hash::Hasher;\nuse std::sync::Mutex;\nuse std::sync::atomic::{AtomicBool, AtomicU64, Ordering};\n\nuse bytesize::ByteSize;\nuse once_cell::sync::Lazy;\nuse tikv_jemallocator::Jemalloc;\nuse tracing::{error, info, trace};\n\nuse crate::alloc_tracker::{\n    AllocRecordingResponse, AllocStat, Allocations, ReallocRecordingResponse,\n};\n\nconst DEFAULT_MIN_ALLOC_BYTES_FOR_PROFILING: u64 = 64 * 1024;\nconst DEFAULT_REPORTING_INTERVAL_BYTES: u64 = 1024 * 1024 * 1024;\n\n/// This custom target name is used to filter profiling events in the tracing\n/// subscriber. It is also included in the printed log.\npub const JEMALLOC_PROFILER_TARGET: &str = \"jemprof\";\n\n/// Atomics are used to communicate configurations between the start/stop\n/// endpoints and the [JemallocProfiled] allocator wrapper.\n///\n/// The flags are padded to avoid false sharing of the CPU cache line between\n/// threads. 
128 bytes is the cache line size on x86_64 and arm64.\n#[repr(align(128))]\nstruct Flags {\n    /// The minimum allocation size that is recorded by the tracker.\n    min_alloc_bytes_for_profiling: AtomicU64,\n    /// Whether the profiling is started or not.\n    enabled: AtomicBool,\n    /// Padding to make sure we fill the cache line.\n    _padding: [u8; 119], // 128 (align) - 8 (u64) - 1 (bool)\n}\n\nstatic FLAGS: Flags = Flags {\n    min_alloc_bytes_for_profiling: AtomicU64::new(DEFAULT_MIN_ALLOC_BYTES_FOR_PROFILING),\n    enabled: AtomicBool::new(false),\n    _padding: [0; 119],\n};\n\nstatic ALLOCATION_TRACKER: Lazy<Mutex<Allocations>> =\n    Lazy::new(|| Mutex::new(Allocations::default()));\n\n/// Starts measuring heap allocations and logs important leaks.\n///\n/// This function uses a wrapper around the global Jemalloc allocator to\n/// instrument it.\n///\n/// Each time an allocation bigger than min_alloc_bytes_for_profiling is\n/// performed, it is recorded in a map and the statistics for its call site are\n/// updated. Tracking allocations is costly because it requires acquiring a\n/// global mutex. Setting a reasonable value for min_alloc_bytes_for_profiling\n/// is crucial. For instance, for a search aggregation request, tracking every\n/// allocation (min_alloc_bytes_for_profiling=1) is typically 100x slower than\n/// using a minimum of 64kB.\n///\n/// During profiling, the statistics per call site are used to log when specific\n/// thresholds are exceeded. For each call site, the allocated memory is logged\n/// (with a backtrace) every time it exceeds the last logged allocated memory by\n/// at least alloc_bytes_triggering_backtrace. 
This logging interval should\n/// usually be set to a value of at least 500MB to limit the logging verbosity.\npub fn start_profiling(\n    min_alloc_bytes_for_profiling: Option<u64>,\n    alloc_bytes_triggering_backtrace: Option<u64>,\n) {\n    #[cfg(miri)]\n    tracing::warn!(\n        \"heap profiling is not supported with Miri because in that case the `backtrace` crate \\\n         allocates\"\n    );\n\n    // Call backtrace once to warm up symbolization allocations (~30MB)\n    backtrace::trace(|frame| {\n        backtrace::resolve_frame(frame, |_| {});\n        true\n    });\n\n    let alloc_bytes_triggering_backtrace =\n        alloc_bytes_triggering_backtrace.unwrap_or(DEFAULT_REPORTING_INTERVAL_BYTES);\n    ALLOCATION_TRACKER\n        .lock()\n        .unwrap()\n        .init(alloc_bytes_triggering_backtrace);\n\n    let min_alloc_bytes_for_profiling =\n        min_alloc_bytes_for_profiling.unwrap_or(DEFAULT_MIN_ALLOC_BYTES_FOR_PROFILING);\n\n    // stdout() might allocate a buffer on first use. If the first allocation\n    // tracked comes from stdout, it will trigger a deadlock. 
Logging here\n    // guarantees that it doesn't happen.\n    info!(\n        min_alloc_for_profiling = %ByteSize(min_alloc_bytes_for_profiling),\n        alloc_triggering_backtrace = %ByteSize(alloc_bytes_triggering_backtrace),\n        \"heap profiling running\"\n    );\n\n    // Use strong ordering to make sure all threads see these changes in this order\n    FLAGS\n        .min_alloc_bytes_for_profiling\n        .store(min_alloc_bytes_for_profiling, Ordering::SeqCst);\n    FLAGS.enabled.store(true, Ordering::SeqCst);\n}\n\n/// Stops measuring heap allocations.\n///\n/// The allocation tracking tables and the symbol cache are not cleared.\npub fn stop_profiling() {\n    // Use strong ordering to make sure all threads see these changes in this order\n    let previously_enabled = FLAGS.enabled.swap(false, Ordering::SeqCst);\n    FLAGS\n        .min_alloc_bytes_for_profiling\n        .store(DEFAULT_MIN_ALLOC_BYTES_FOR_PROFILING, Ordering::SeqCst);\n\n    info!(previously_enabled, \"heap profiling stopped\");\n}\n\n/// Wraps the Jemalloc global allocator calls with tracking routines.\n///\n/// The tracking routines are called only when FLAGS.enabled is set to true\n/// (calling [start_profiling()]). We load it with [Ordering::Relaxed] because\n/// it's fine to miss or record extra allocation events and prefer limiting the\n/// performance impact when profiling is not enabled.\n///\n/// Note: It's important to ensure that no allocations are performed inside the\n/// allocator! 
It can cause an abort, a panic or even a deadlock.\npub struct JemallocProfiled(pub Jemalloc);\n\nunsafe impl GlobalAlloc for JemallocProfiled {\n    #[inline]\n    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {\n        let ptr = unsafe { self.0.alloc(layout) };\n        if FLAGS.enabled.load(Ordering::Relaxed) {\n            track_alloc_call(ptr, layout);\n        }\n        ptr\n    }\n\n    #[inline]\n    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {\n        let ptr = unsafe { self.0.alloc_zeroed(layout) };\n        if FLAGS.enabled.load(Ordering::Relaxed) {\n            track_alloc_call(ptr, layout);\n        }\n        ptr\n    }\n\n    #[inline]\n    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {\n        if FLAGS.enabled.load(Ordering::Relaxed) {\n            track_dealloc_call(ptr, layout);\n        }\n        unsafe { self.0.dealloc(ptr, layout) }\n    }\n\n    #[inline]\n    unsafe fn realloc(&self, old_ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {\n        let new_ptr = unsafe { self.0.realloc(old_ptr, layout, new_size) };\n        if FLAGS.enabled.load(Ordering::Relaxed) {\n            track_realloc_call(old_ptr, new_ptr, layout, new_size);\n        }\n        new_ptr\n    }\n}\n\n/// Prints both a backtrace and a Tokio tracing log\n///\n/// Warning: stdout writer might allocate a buffer on first use\nfn identify_callsite(callsite_hash: u64, stat: AllocStat) {\n    // To generate a complete trace:\n    // - tokio/tracing feature must be enabled, otherwise un-instrumented tasks will not propagate\n    //   parent spans\n    // - the tracing fmt subscriber filter must keep all spans for this event (TRACE level). 
See the\n    //   logger configuration for more details.\n    trace!(target: JEMALLOC_PROFILER_TARGET, callsite=callsite_hash, allocs=stat.count, size=%stat.size);\n}\n\nfn backtrace_hash() -> u64 {\n    let mut hasher = fnv::FnvHasher::default();\n    backtrace::trace(|frame| {\n        hasher.write_usize(frame.ip() as usize);\n        true\n    });\n    hasher.finish()\n}\n\n/// Warning: this function should not allocate!\n#[cold]\nfn track_alloc_call(ptr: *mut u8, layout: Layout) {\n    if layout.size() >= FLAGS.min_alloc_bytes_for_profiling.load(Ordering::Relaxed) as usize {\n        let callsite_hash = backtrace_hash();\n        let recording_response = ALLOCATION_TRACKER.lock().unwrap().record_allocation(\n            callsite_hash,\n            layout.size() as u64,\n            ptr,\n        );\n\n        match recording_response {\n            AllocRecordingResponse::ThresholdExceeded(stat_for_trace) => {\n                identify_callsite(callsite_hash, stat_for_trace);\n            }\n            AllocRecordingResponse::TrackerFull(table_name) => {\n                // this message might be displayed multiple times but that's fine\n                // warning: stdout writer might allocate a buffer on first use\n                error!(\"heap profiling stopped, {table_name} full\");\n                FLAGS.enabled.store(false, Ordering::Relaxed);\n            }\n            AllocRecordingResponse::ThresholdNotExceeded => {}\n            AllocRecordingResponse::NotStarted => {}\n        }\n    }\n}\n\n/// Warning: this function should not allocate!\n#[cold]\nfn track_dealloc_call(ptr: *mut u8, layout: Layout) {\n    if layout.size() >= FLAGS.min_alloc_bytes_for_profiling.load(Ordering::Relaxed) as usize {\n        ALLOCATION_TRACKER.lock().unwrap().record_deallocation(ptr);\n    }\n}\n\n/// Warning: this function should not allocate!\n#[cold]\nfn track_realloc_call(old_ptr: *mut u8, new_ptr: *mut u8, current_layout: Layout, new_size: usize) {\n    if 
current_layout.size() >= FLAGS.min_alloc_bytes_for_profiling.load(Ordering::Relaxed) as usize\n    {\n        let recording_response = ALLOCATION_TRACKER.lock().unwrap().record_reallocation(\n            new_size as u64,\n            old_ptr,\n            new_ptr,\n        );\n\n        match recording_response {\n            ReallocRecordingResponse::ThresholdExceeded {\n                statistics,\n                callsite_hash,\n            } => {\n                identify_callsite(callsite_hash, statistics);\n            }\n            ReallocRecordingResponse::ThresholdNotExceeded => {}\n            ReallocRecordingResponse::NotStarted => {}\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_size_of_flags() {\n        assert_eq!(std::mem::size_of::<Flags>(), 128);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/kill_switch.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::atomic::{AtomicBool, Ordering};\nuse std::sync::{Arc, Mutex, Weak};\n\nuse tracing::debug;\n\n#[derive(Clone, Default)]\npub struct KillSwitch {\n    inner: Arc<Inner>,\n}\n\nstruct Inner {\n    alive: AtomicBool,\n    children: Mutex<Vec<Weak<Inner>>>,\n}\n\nimpl Default for Inner {\n    fn default() -> Self {\n        Self {\n            alive: AtomicBool::new(true),\n            children: Mutex::default(),\n        }\n    }\n}\n\nfn garbage_collect(children: &mut Vec<Weak<Inner>>) {\n    let mut i = 0;\n    while i < children.len() {\n        if Weak::strong_count(&children[i]) == 0 {\n            children.swap_remove(i);\n        } else {\n            i += 1;\n        }\n    }\n}\n\nimpl KillSwitch {\n    pub fn is_alive(&self) -> bool {\n        self.inner.alive.load(Ordering::Relaxed)\n    }\n\n    pub fn is_dead(&self) -> bool {\n        !self.is_alive()\n    }\n\n    pub fn kill(&self) {\n        self.inner.kill();\n    }\n\n    // Creates a child killswitch.\n    //\n    // If the parent kill switch is dead to begin with, the child will be dead too.\n    pub fn child(&self) -> KillSwitch {\n        let mut lock = self.inner.children.lock().unwrap();\n        let child_inner = Inner {\n            alive: AtomicBool::new(self.is_alive()),\n            ..Default::default()\n        };\n        garbage_collect(&mut lock);\n  
      let child_inner_arc = Arc::new(child_inner);\n        lock.push(Arc::downgrade(&child_inner_arc));\n        KillSwitch {\n            inner: child_inner_arc,\n        }\n    }\n}\n\nimpl Inner {\n    pub fn kill(&self) {\n        debug!(\"kill-switch-activated\");\n        self.alive.store(false, Ordering::Relaxed);\n        let mut lock = self.children.lock().unwrap();\n        for weak in lock.drain(..) {\n            if let Some(inner) = weak.upgrade() {\n                inner.kill();\n            }\n        }\n    }\n}\n#[cfg(test)]\nmod tests {\n    use super::KillSwitch;\n\n    #[test]\n    fn test_kill_switch() {\n        let kill_switch = KillSwitch::default();\n        assert!(kill_switch.is_alive());\n        assert!(!kill_switch.is_dead());\n        kill_switch.kill();\n        assert!(!kill_switch.is_alive());\n        assert!(kill_switch.is_dead());\n        kill_switch.kill();\n        assert!(!kill_switch.is_alive());\n        assert!(kill_switch.is_dead());\n    }\n\n    #[test]\n    fn test_kill_switch_child() {\n        let kill_switch = KillSwitch::default();\n        let child_kill_switch = kill_switch.child();\n        let child_kill_switch2 = kill_switch.child();\n        assert!(child_kill_switch.is_alive());\n        assert!(child_kill_switch2.is_alive());\n        kill_switch.kill();\n        assert!(child_kill_switch.is_dead());\n        assert!(child_kill_switch2.is_dead());\n    }\n\n    #[test]\n    fn test_kill_switch_grandchildren() {\n        let kill_switch = KillSwitch::default();\n        let child_kill_switch = kill_switch.child();\n        let grandchild_kill_switch = child_kill_switch.child();\n        assert!(kill_switch.is_alive());\n        assert!(child_kill_switch.is_alive());\n        assert!(grandchild_kill_switch.is_alive());\n        kill_switch.kill();\n        assert!(kill_switch.is_dead());\n        assert!(child_kill_switch.is_dead());\n        assert!(grandchild_kill_switch.is_dead());\n    }\n\n    
#[test]\n    fn test_kill_switch_to_quoque_me_fili() {\n        let kill_switch = KillSwitch::default();\n        let child_kill_switch = kill_switch.child();\n        assert!(kill_switch.is_alive());\n        assert!(child_kill_switch.is_alive());\n        child_kill_switch.kill();\n        assert!(kill_switch.is_alive());\n        assert!(child_kill_switch.is_dead());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nmod coolid;\n\n#[cfg(feature = \"jemalloc-profiled\")]\npub(crate) mod alloc_tracker;\npub mod binary_heap;\nmod cpus;\npub mod fs;\npub mod io;\n#[cfg(feature = \"jemalloc-profiled\")]\npub mod jemalloc_profiled;\nmod kill_switch;\npub mod metrics;\npub mod net;\nmod path_hasher;\npub mod pretty;\nmod progress;\npub mod pubsub;\npub mod rand;\npub mod rate_limited_tracing;\npub mod rate_limiter;\npub mod rendezvous_hasher;\npub mod retry;\npub mod ring_buffer;\npub mod runtimes;\npub mod shared_consts;\npub mod sorted_iter;\npub mod stream_utils;\npub mod temp_dir;\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod test_utils;\npub mod thread_pool;\npub mod tower;\npub mod type_map;\npub mod uri;\n\nmod socket_addr_legacy_hash;\n\nuse std::env;\nuse std::fmt::{Debug, Display};\nuse std::future::Future;\nuse std::ops::{Range, RangeInclusive};\nuse std::str::FromStr;\n\npub use coolid::new_coolid;\npub use cpus::num_cpus;\npub use kill_switch::KillSwitch;\npub use path_hasher::PathHasher;\npub use progress::{Progress, ProtectedZoneGuard};\npub use socket_addr_legacy_hash::SocketAddrLegacyHash;\npub use stream_utils::{BoxStream, ServiceStream};\nuse tracing::{error, info};\n\n/// Returns true at compile time. 
This function is mostly used with serde to initialize boolean\n/// fields to true.\npub const fn true_fn() -> bool {\n    true\n}\n\n/// Returns whether the given boolean value is true. This function is mostly used with serde to skip\n/// serializing boolean fields with `skip_serializing_if = \"is_true\"` when the value is true.\npub fn is_true(value: &bool) -> bool {\n    *value\n}\n\npub fn chunk_range(range: Range<usize>, chunk_size: usize) -> impl Iterator<Item = Range<usize>> {\n    range.clone().step_by(chunk_size).map(move |block_start| {\n        let block_end = (block_start + chunk_size).min(range.end);\n        block_start..block_end\n    })\n}\n\npub fn into_u64_range(range: Range<usize>) -> Range<u64> {\n    range.start as u64..range.end as u64\n}\n\npub fn setup_logging_for_tests() {\n    let _ = env_logger::builder().format_timestamp(None).try_init();\n}\n\npub fn split_file(split_id: impl Display) -> String {\n    format!(\"{split_id}.split\")\n}\n\nfn get_from_env_opt_aux<T: Debug>(\n    key: &str,\n    parse_fn: impl FnOnce(&str) -> Option<T>,\n    sensitive: bool,\n) -> Option<T> {\n    let value_str = std::env::var(key).ok()?;\n    let Some(value) = parse_fn(&value_str) else {\n        error!(value=%value_str, \"failed to parse environment variable `{key}` value\");\n        return None;\n    };\n    if sensitive {\n        info!(\"using environment variable `{key}` value\");\n    } else {\n        info!(value=?value, \"using environment variable `{key}` value\");\n    }\n    Some(value)\n}\n\npub fn get_from_env<T: FromStr + Debug>(key: &str, default_value: T, sensitive: bool) -> T {\n    if let Some(value) = get_from_env_opt(key, sensitive) {\n        value\n    } else {\n        info!(default_value=?default_value, \"using environment variable `{key}` default value\");\n        default_value\n    }\n}\n\npub fn get_from_env_opt<T: FromStr + Debug>(key: &str, sensitive: bool) -> Option<T> {\n    get_from_env_opt_aux(key, |val_str| 
val_str.parse().ok(), sensitive)\n}\n\npub fn get_bool_from_env_opt(key: &str) -> Option<bool> {\n    get_from_env_opt_aux(key, parse_bool_lenient, false)\n}\n\npub fn get_bool_from_env(key: &str, default_value: bool) -> bool {\n    if let Some(flag_value) = get_bool_from_env_opt(key) {\n        flag_value\n    } else {\n        info!(default_value=%default_value, \"using environment variable `{key}` default value\");\n        default_value\n    }\n}\n\npub fn truncate_str(text: &str, max_len: usize) -> &str {\n    if max_len > text.len() {\n        return text;\n    }\n\n    let mut truncation_index = max_len;\n    while !text.is_char_boundary(truncation_index) {\n        truncation_index -= 1;\n    }\n    &text[..truncation_index]\n}\n\n/// Extracts time range from optional start and end timestamps.\npub fn extract_time_range(\n    start_timestamp_opt: Option<i64>,\n    end_timestamp_opt: Option<i64>,\n) -> Option<Range<i64>> {\n    match (start_timestamp_opt, end_timestamp_opt) {\n        (Some(start_timestamp), Some(end_timestamp)) => Some(Range {\n            start: start_timestamp,\n            end: end_timestamp,\n        }),\n        (_, Some(end_timestamp)) => Some(Range {\n            start: i64::MIN,\n            end: end_timestamp,\n        }),\n        (Some(start_timestamp), _) => Some(Range {\n            start: start_timestamp,\n            end: i64::MAX,\n        }),\n        _ => None,\n    }\n}\n\n/// Takes 2 intervals and returns true iff their intersection is empty\npub fn is_disjoint(left: &Range<i64>, right: &RangeInclusive<i64>) -> bool {\n    left.end <= *right.start() || *right.end() < left.start\n}\n\n/// For use with the `skip_serializing_if` serde attribute.\npub fn is_false(value: &bool) -> bool {\n    !*value\n}\n\npub fn no_color() -> bool {\n    matches!(env::var(\"NO_COLOR\"), Ok(value) if !value.is_empty())\n}\n\n#[macro_export]\nmacro_rules! 
assert_eventually {\n    ($cond:expr, $timeout:expr, $interval:expr) => {\n        let start = std::time::Instant::now();\n        loop {\n            if $cond {\n                break;\n            }\n            if start.elapsed() > $timeout {\n                panic!(\n                    \"assertion failed: condition `{}` never became true within {} ms\",\n                    stringify!($cond),\n                    $timeout.as_millis()\n                );\n            }\n            tokio::time::sleep($interval).await;\n        }\n    };\n    ($cond:expr, $timeout:expr) => {\n        assert_eventually!($cond, $timeout, std::time::Duration::from_millis(50));\n    };\n    ($cond:expr) => {\n        assert_eventually!($cond, std::time::Duration::from_secs(1));\n    };\n}\n\n#[macro_export]\nmacro_rules! ignore_error_kind {\n    ($kind:path, $expr:expr) => {\n        match $expr {\n            Ok(_) => Ok(()),\n            Err(error) if error.kind() == $kind => Ok(()),\n            Err(error) => Err(error),\n        }\n    };\n}\n\n#[inline]\npub const fn div_ceil_u32(lhs: u32, rhs: u32) -> u32 {\n    let d = lhs / rhs;\n    let r = lhs % rhs;\n    if r > 0 { d + 1 } else { d }\n}\n\n#[inline]\npub const fn div_ceil(lhs: i64, rhs: i64) -> i64 {\n    let d = lhs / rhs;\n    let r = lhs % rhs;\n    if (r > 0 && rhs > 0) || (r < 0 && rhs < 0) {\n        d + 1\n    } else {\n        d\n    }\n}\n\n// The following are helpers to build named tasks.\n//\n// Named tasks require the tokio feature `tracing` to be enabled. If the\n// `named_tasks` feature is disabled, this is a no-op.\n//\n// By default, these functions ignore the name passed and act like\n// a regular call to `tokio::spawn`.\n//\n// If the user compiles `quickwit-cli` with the `tokio-console` feature, then\n// tasks will automatically be named. This is not just \"visual sugar\".\n//\n// Without names, tasks will only show their spawn site on tokio-console. 
This\n// is a catastrophe for actors, which all share the same spawn site.\n//\n// The #[track_caller] annotation is used to show the right spawn site in the\n// Tokio TRACE spans (only available when the tokio/tracing feature is on).\n//\n// # Naming\n//\n// Actors will get named after their type, which is fine. For other tasks,\n// please use `snake_case`.\n\n#[cfg(not(all(tokio_unstable, feature = \"named_tasks\")))]\n#[track_caller]\npub fn spawn_named_task<F>(future: F, _name: &'static str) -> tokio::task::JoinHandle<F::Output>\nwhere\n    F: Future + Send + 'static,\n    F::Output: Send + 'static,\n{\n    tokio::task::spawn(future)\n}\n\n#[cfg(not(all(tokio_unstable, feature = \"named_tasks\")))]\n#[track_caller]\npub fn spawn_named_task_on<F>(\n    future: F,\n    _name: &'static str,\n    runtime: &tokio::runtime::Handle,\n) -> tokio::task::JoinHandle<F::Output>\nwhere\n    F: Future + Send + 'static,\n    F::Output: Send + 'static,\n{\n    runtime.spawn(future)\n}\n\n#[cfg(all(tokio_unstable, feature = \"named_tasks\"))]\n#[track_caller]\npub fn spawn_named_task<F>(future: F, name: &'static str) -> tokio::task::JoinHandle<F::Output>\nwhere\n    F: Future + Send + 'static,\n    F::Output: Send + 'static,\n{\n    tokio::task::Builder::new()\n        .name(name)\n        .spawn(future)\n        .unwrap()\n}\n\n#[cfg(all(tokio_unstable, feature = \"named_tasks\"))]\n#[track_caller]\npub fn spawn_named_task_on<F>(\n    future: F,\n    name: &'static str,\n    runtime: &tokio::runtime::Handle,\n) -> tokio::task::JoinHandle<F::Output>\nwhere\n    F: Future + Send + 'static,\n    F::Output: Send + 'static,\n{\n    tokio::task::Builder::new()\n        .name(name)\n        .spawn_on(future, runtime)\n        .unwrap()\n}\n\npub fn parse_bool_lenient(bool_str: &str) -> Option<bool> {\n    let trimmed_bool_str = bool_str.trim();\n\n    for truthy_value in [\"true\", \"yes\", \"1\"] {\n        if trimmed_bool_str.eq_ignore_ascii_case(truthy_value) {\n            return 
Some(true);\n        }\n    }\n    for falsy_value in [\"false\", \"no\", \"0\"] {\n        if trimmed_bool_str.eq_ignore_ascii_case(falsy_value) {\n            return Some(false);\n        }\n    }\n    None\n}\n\n#[cfg(test)]\nmod tests {\n    use std::io::ErrorKind;\n\n    use super::*;\n\n    #[test]\n    fn test_get_from_env() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        // as this is only a test, and it would be extremely inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        const TEST_KEY: &str = \"TEST_KEY\";\n        assert_eq!(super::get_from_env(TEST_KEY, 10, false), 10);\n        unsafe { std::env::set_var(TEST_KEY, \"15\") };\n        assert_eq!(super::get_from_env(TEST_KEY, 10, false), 15);\n        unsafe { std::env::set_var(TEST_KEY, \"1invalidnumber\") };\n        assert_eq!(super::get_from_env(TEST_KEY, 10, false), 10);\n    }\n\n    #[test]\n    fn test_truncate_str() {\n        assert_eq!(truncate_str(\"\", 0), \"\");\n        assert_eq!(truncate_str(\"\", 3), \"\");\n        assert_eq!(truncate_str(\"hello\", 0), \"\");\n        assert_eq!(truncate_str(\"hello\", 5), \"hello\");\n        assert_eq!(truncate_str(\"hello\", 6), \"hello\");\n        assert_eq!(truncate_str(\"hello-world\", 5), \"hello\");\n        assert_eq!(truncate_str(\"hello-world\", 6), \"hello-\");\n        assert_eq!(truncate_str(\"hello🧑‍🔬world\", 6), \"hello\");\n        assert_eq!(truncate_str(\"hello🧑‍🔬world\", 7), \"hello\");\n    }\n\n    #[test]\n    fn test_ignore_io_error_macro() {\n        ignore_error_kind!(\n            ErrorKind::NotFound,\n            std::fs::remove_file(\"file-does-not-exist\")\n        )\n        .unwrap();\n    }\n\n    #[test]\n    fn test_div_ceil() {\n        assert_eq!(div_ceil(5, 1), 5);\n        assert_eq!(div_ceil(5, 2), 3);\n        assert_eq!(div_ceil(6, 2), 3);\n\n        assert_eq!(div_ceil(3, 3), 1);\n        
assert_eq!(div_ceil(2, 3), 1);\n        assert_eq!(div_ceil(1, 3), 1);\n        assert_eq!(div_ceil(0, 3), 0);\n        assert_eq!(div_ceil(-1, 3), 0);\n        assert_eq!(div_ceil(-2, 3), 0);\n\n        assert_eq!(div_ceil(-5, 1), -5);\n        assert_eq!(div_ceil(-5, 2), -2);\n        assert_eq!(div_ceil(-6, 2), -3);\n\n        assert_eq!(div_ceil(5, -1), -5);\n        assert_eq!(div_ceil(5, -2), -2);\n        assert_eq!(div_ceil(6, -2), -3);\n\n        assert_eq!(div_ceil(-5, -1), 5);\n        assert_eq!(div_ceil(-5, -2), 3);\n        assert_eq!(div_ceil(-6, -2), 3);\n    }\n\n    #[test]\n    fn test_div_ceil_u32() {\n        assert_eq!(div_ceil_u32(5, 1), 5);\n        assert_eq!(div_ceil_u32(5, 2), 3);\n        assert_eq!(div_ceil_u32(6, 2), 3);\n        assert_eq!(div_ceil_u32(3, 3), 1);\n        assert_eq!(div_ceil_u32(2, 3), 1);\n        assert_eq!(div_ceil_u32(1, 3), 1);\n        assert_eq!(div_ceil_u32(0, 3), 0);\n    }\n\n    #[test]\n    fn test_parse_bool_lenient() {\n        assert_eq!(parse_bool_lenient(\"true\"), Some(true));\n        assert_eq!(parse_bool_lenient(\"TRUE\"), Some(true));\n        assert_eq!(parse_bool_lenient(\"True\"), Some(true));\n        assert_eq!(parse_bool_lenient(\"yes\"), Some(true));\n        assert_eq!(parse_bool_lenient(\" 1\"), Some(true));\n\n        assert_eq!(parse_bool_lenient(\"false\"), Some(false));\n        assert_eq!(parse_bool_lenient(\"FALSE\"), Some(false));\n        assert_eq!(parse_bool_lenient(\"False\"), Some(false));\n        assert_eq!(parse_bool_lenient(\"no\"), Some(false));\n        assert_eq!(parse_bool_lenient(\"0 \"), Some(false));\n\n        assert_eq!(parse_bool_lenient(\"foo\"), None);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, HashMap};\nuse std::sync::{LazyLock, OnceLock};\n\nuse prometheus::{Gauge, HistogramOpts, Opts, TextEncoder};\npub use prometheus::{\n    Histogram, HistogramTimer, HistogramVec as PrometheusHistogramVec, IntCounter,\n    IntCounterVec as PrometheusIntCounterVec, IntGauge, IntGaugeVec as PrometheusIntGaugeVec,\n    exponential_buckets, linear_buckets,\n};\n\n#[derive(Clone)]\npub struct HistogramVec<const N: usize> {\n    underlying: PrometheusHistogramVec,\n}\n\nimpl<const N: usize> HistogramVec<N> {\n    pub fn with_label_values(&self, label_values: [&str; N]) -> Histogram {\n        self.underlying.with_label_values(&label_values)\n    }\n}\n\n#[derive(Clone)]\npub struct IntCounterVec<const N: usize> {\n    underlying: PrometheusIntCounterVec,\n}\n\nimpl<const N: usize> IntCounterVec<N> {\n    pub fn new(\n        name: &str,\n        help: &str,\n        subsystem: &str,\n        const_labels: &[(&str, &str)],\n        label_names: [&str; N],\n    ) -> IntCounterVec<N> {\n        let owned_const_labels: HashMap<String, String> = const_labels\n            .iter()\n            .map(|(label_name, label_value)| (label_name.to_string(), label_value.to_string()))\n            .collect();\n        let counter_opts = Opts::new(name, help)\n            .namespace(\"quickwit\")\n            .subsystem(subsystem)\n    
        .const_labels(owned_const_labels);\n        let underlying = PrometheusIntCounterVec::new(counter_opts, &label_names)\n            .expect(\"failed to create counter vec\");\n        IntCounterVec { underlying }\n    }\n\n    pub fn with_label_values(&self, label_values: [&str; N]) -> IntCounter {\n        self.underlying.with_label_values(&label_values)\n    }\n}\n\n#[derive(Clone)]\npub struct IntGaugeVec<const N: usize> {\n    underlying: PrometheusIntGaugeVec,\n}\n\nimpl<const N: usize> IntGaugeVec<N> {\n    pub fn with_label_values(&self, label_values: [&str; N]) -> IntGauge {\n        self.underlying.with_label_values(&label_values)\n    }\n}\n\npub fn register_info(name: &'static str, help: &'static str, kvs: BTreeMap<&'static str, String>) {\n    let mut counter_opts = Opts::new(name, help).namespace(\"quickwit\");\n    for (k, v) in kvs {\n        counter_opts = counter_opts.const_label(k, v);\n    }\n    let counter = IntCounter::with_opts(counter_opts).expect(\"failed to create counter\");\n    counter.inc();\n    prometheus::register(Box::new(counter)).expect(\"failed to register counter\");\n}\n\npub fn new_counter(\n    name: &str,\n    help: &str,\n    subsystem: &str,\n    const_labels: &[(&str, &str)],\n) -> IntCounter {\n    let owned_const_labels: HashMap<String, String> = const_labels\n        .iter()\n        .map(|(label_name, label_value)| (label_name.to_string(), label_value.to_string()))\n        .collect();\n    let counter_opts = Opts::new(name, help)\n        .namespace(\"quickwit\")\n        .subsystem(subsystem)\n        .const_labels(owned_const_labels);\n    let counter = IntCounter::with_opts(counter_opts).expect(\"failed to create counter\");\n    prometheus::register(Box::new(counter.clone())).expect(\"failed to register counter\");\n    counter\n}\n\npub fn new_counter_vec<const N: usize>(\n    name: &str,\n    help: &str,\n    subsystem: &str,\n    const_labels: &[(&str, &str)],\n    label_names: [&str; N],\n) -> 
IntCounterVec<N> {\n    let int_counter_vec = IntCounterVec::new(name, help, subsystem, const_labels, label_names);\n    let collector = Box::new(int_counter_vec.underlying.clone());\n    prometheus::register(collector).expect(\"failed to register counter vec\");\n    int_counter_vec\n}\n\npub fn new_float_gauge(\n    name: &str,\n    help: &str,\n    subsystem: &str,\n    const_labels: &[(&str, &str)],\n) -> Gauge {\n    let owned_const_labels: HashMap<String, String> = const_labels\n        .iter()\n        .map(|(label_name, label_value)| (label_name.to_string(), label_value.to_string()))\n        .collect();\n    let gauge_opts = Opts::new(name, help)\n        .namespace(\"quickwit\")\n        .subsystem(subsystem)\n        .const_labels(owned_const_labels);\n    let gauge = Gauge::with_opts(gauge_opts).expect(\"failed to create float gauge\");\n    prometheus::register(Box::new(gauge.clone())).expect(\"failed to register float gauge\");\n    gauge\n}\n\npub fn new_gauge(\n    name: &str,\n    help: &str,\n    subsystem: &str,\n    const_labels: &[(&str, &str)],\n) -> IntGauge {\n    let owned_const_labels: HashMap<String, String> = const_labels\n        .iter()\n        .map(|(label_name, label_value)| (label_name.to_string(), label_value.to_string()))\n        .collect();\n    let gauge_opts = Opts::new(name, help)\n        .namespace(\"quickwit\")\n        .subsystem(subsystem)\n        .const_labels(owned_const_labels);\n    let gauge = IntGauge::with_opts(gauge_opts).expect(\"failed to create gauge\");\n    prometheus::register(Box::new(gauge.clone())).expect(\"failed to register gauge\");\n    gauge\n}\n\npub fn new_gauge_vec<const N: usize>(\n    name: &str,\n    help: &str,\n    subsystem: &str,\n    const_labels: &[(&str, &str)],\n    label_names: [&str; N],\n) -> IntGaugeVec<N> {\n    let owned_const_labels: HashMap<String, String> = const_labels\n        .iter()\n        .map(|(label_name, label_value)| (label_name.to_string(), 
label_value.to_string()))\n        .collect();\n    let gauge_opts = Opts::new(name, help)\n        .namespace(\"quickwit\")\n        .subsystem(subsystem)\n        .const_labels(owned_const_labels);\n    let underlying =\n        PrometheusIntGaugeVec::new(gauge_opts, &label_names).expect(\"failed to create gauge vec\");\n\n    let collector = Box::new(underlying.clone());\n    prometheus::register(collector).expect(\"failed to register gauge vec\");\n\n    IntGaugeVec { underlying }\n}\n\npub fn new_histogram(name: &str, help: &str, subsystem: &str, buckets: Vec<f64>) -> Histogram {\n    let histogram_opts = HistogramOpts::new(name, help)\n        .namespace(\"quickwit\")\n        .subsystem(subsystem)\n        .buckets(buckets);\n    let histogram = Histogram::with_opts(histogram_opts).expect(\"failed to create histogram\");\n    prometheus::register(Box::new(histogram.clone())).expect(\"failed to register histogram\");\n    histogram\n}\n\npub fn new_histogram_vec<const N: usize>(\n    name: &str,\n    help: &str,\n    subsystem: &str,\n    const_labels: &[(&str, &str)],\n    label_names: [&str; N],\n    buckets: Vec<f64>,\n) -> HistogramVec<N> {\n    let owned_const_labels: HashMap<String, String> = const_labels\n        .iter()\n        .map(|(label_name, label_value)| (label_name.to_string(), label_value.to_string()))\n        .collect();\n    let histogram_opts = HistogramOpts::new(name, help)\n        .namespace(\"quickwit\")\n        .subsystem(subsystem)\n        .const_labels(owned_const_labels)\n        .buckets(buckets);\n    let underlying = PrometheusHistogramVec::new(histogram_opts, &label_names)\n        .expect(\"failed to create histogram vec\");\n\n    let collector = Box::new(underlying.clone());\n    prometheus::register(collector).expect(\"failed to register histogram vec\");\n\n    HistogramVec { underlying }\n}\n\npub struct GaugeGuard<'a> {\n    gauge: &'a IntGauge,\n    delta: i64,\n}\n\nimpl std::fmt::Debug for GaugeGuard<'_> {\n    
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        self.delta.fmt(f)\n    }\n}\n\nimpl<'a> GaugeGuard<'a> {\n    pub fn from_gauge(gauge: &'a IntGauge) -> Self {\n        Self { gauge, delta: 0i64 }\n    }\n\n    pub fn get(&self) -> i64 {\n        self.delta\n    }\n\n    pub fn add(&mut self, delta: i64) {\n        self.gauge.add(delta);\n        self.delta += delta;\n    }\n\n    pub fn sub(&mut self, delta: i64) {\n        self.gauge.sub(delta);\n        self.delta -= delta;\n    }\n}\n\nimpl Drop for GaugeGuard<'_> {\n    fn drop(&mut self) {\n        self.gauge.sub(self.delta)\n    }\n}\n\npub struct OwnedGaugeGuard {\n    gauge: IntGauge,\n    delta: i64,\n}\n\nimpl std::fmt::Debug for OwnedGaugeGuard {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        self.delta.fmt(f)\n    }\n}\n\nimpl OwnedGaugeGuard {\n    pub fn from_gauge(gauge: IntGauge) -> Self {\n        Self { gauge, delta: 0i64 }\n    }\n\n    pub fn get(&self) -> i64 {\n        self.delta\n    }\n\n    pub fn add(&mut self, delta: i64) {\n        self.gauge.add(delta);\n        self.delta += delta;\n    }\n\n    pub fn sub(&mut self, delta: i64) {\n        self.gauge.sub(delta);\n        self.delta -= delta;\n    }\n}\n\nimpl Drop for OwnedGaugeGuard {\n    fn drop(&mut self) {\n        self.gauge.sub(self.delta)\n    }\n}\n\npub fn metrics_text_payload() -> Result<String, String> {\n    let metric_families = prometheus::gather();\n    // Arbitrary non-zero size in order to skip a bunch of\n    // buffer growth-reallocations when encoding metrics.\n    let mut buffer = String::with_capacity(1024);\n    let encoder = TextEncoder::new();\n    match encoder.encode_utf8(&metric_families, &mut buffer) {\n        Ok(()) => Ok(buffer),\n        Err(e) => Err(e.to_string()),\n    }\n}\n\n#[derive(Clone)]\npub struct MemoryMetrics {\n    pub active_bytes: IntGauge,\n    pub allocated_bytes: IntGauge,\n    pub resident_bytes: IntGauge,\n    pub in_flight: 
InFlightDataGauges,\n}\n\nimpl Default for MemoryMetrics {\n    fn default() -> Self {\n        Self {\n            active_bytes: new_gauge(\n                \"active_bytes\",\n                \"Total number of bytes in active pages allocated by the application, as reported \\\n                 by jemalloc `stats.active`.\",\n                \"memory\",\n                &[],\n            ),\n            allocated_bytes: new_gauge(\n                \"allocated_bytes\",\n                \"Total number of bytes allocated by the application, as reported by jemalloc \\\n                 `stats.allocated`.\",\n                \"memory\",\n                &[],\n            ),\n            resident_bytes: new_gauge(\n                \"resident_bytes\",\n                \"Total number of bytes in physically resident data pages mapped by the \\\n                 allocator, as reported by jemalloc `stats.resident`.\",\n                \"memory\",\n                &[],\n            ),\n            in_flight: InFlightDataGauges::default(),\n        }\n    }\n}\n\n#[derive(Clone)]\npub struct InFlightDataGauges {\n    pub rest_server: IntGauge,\n    pub ingest_router: IntGauge,\n    pub ingester_persist: IntGauge,\n    pub ingester_replicate: IntGauge,\n    pub wal: IntGauge,\n    pub fetch_stream: IntGauge,\n    pub multi_fetch_stream: IntGauge,\n    pub doc_processor_mailbox: IntGauge,\n    pub indexer_mailbox: IntGauge,\n    pub index_writer: IntGauge,\n    in_flight_gauge_vec: IntGaugeVec<1>,\n}\n\nimpl Default for InFlightDataGauges {\n    fn default() -> Self {\n        let in_flight_gauge_vec = new_gauge_vec(\n            \"in_flight_data_bytes\",\n            \"Amount of data in-flight in various buffers in bytes.\",\n            \"memory\",\n            &[],\n            [\"component\"],\n        );\n        Self {\n            rest_server: in_flight_gauge_vec.with_label_values([\"rest_server\"]),\n            ingest_router: 
in_flight_gauge_vec.with_label_values([\"ingest_router\"]),\n            ingester_persist: in_flight_gauge_vec.with_label_values([\"ingester_persist\"]),\n            ingester_replicate: in_flight_gauge_vec.with_label_values([\"ingester_replicate\"]),\n            wal: in_flight_gauge_vec.with_label_values([\"wal\"]),\n            fetch_stream: in_flight_gauge_vec.with_label_values([\"fetch_stream\"]),\n            multi_fetch_stream: in_flight_gauge_vec.with_label_values([\"multi_fetch_stream\"]),\n            doc_processor_mailbox: in_flight_gauge_vec.with_label_values([\"doc_processor_mailbox\"]),\n            indexer_mailbox: in_flight_gauge_vec.with_label_values([\"indexer_mailbox\"]),\n            index_writer: in_flight_gauge_vec.with_label_values([\"index_writer\"]),\n            in_flight_gauge_vec: in_flight_gauge_vec.clone(),\n        }\n    }\n}\n\nimpl InFlightDataGauges {\n    #[inline]\n    pub fn file(&self) -> &IntGauge {\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        GAUGE.get_or_init(|| self.in_flight_gauge_vec.with_label_values([\"file_source\"]))\n    }\n\n    #[inline]\n    pub fn ingest(&self) -> &IntGauge {\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        GAUGE.get_or_init(|| {\n            self.in_flight_gauge_vec\n                .with_label_values([\"ingest_source\"])\n        })\n    }\n\n    #[inline]\n    pub fn kafka(&self) -> &IntGauge {\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        GAUGE.get_or_init(|| self.in_flight_gauge_vec.with_label_values([\"kafka_source\"]))\n    }\n\n    #[inline]\n    pub fn kinesis(&self) -> &IntGauge {\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        GAUGE.get_or_init(|| {\n            self.in_flight_gauge_vec\n                .with_label_values([\"kinesis_source\"])\n        })\n    }\n\n    #[inline]\n    pub fn pubsub(&self) -> &IntGauge {\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        
GAUGE.get_or_init(|| {\n            self.in_flight_gauge_vec\n                .with_label_values([\"pubsub_source\"])\n        })\n    }\n\n    #[inline]\n    pub fn pulsar(&self) -> &IntGauge {\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        GAUGE.get_or_init(|| {\n            self.in_flight_gauge_vec\n                .with_label_values([\"pulsar_source\"])\n        })\n    }\n\n    #[inline]\n    pub fn other(&self) -> &IntGauge {\n        // NOTE: this gauge previously reused the \"pulsar_source\" label (likely a copy-paste\n        // slip); \"other_source\" is an assumed label that follows the sibling naming pattern.\n        static GAUGE: OnceLock<IntGauge> = OnceLock::new();\n        GAUGE.get_or_init(|| {\n            self.in_flight_gauge_vec\n                .with_label_values([\"other_source\"])\n        })\n    }\n}\n\n/// This function returns `index_id` as is if per-index metrics are enabled, or projects it to\n/// `\"__any__\"` otherwise.\npub fn index_label(index_id: &str) -> &str {\n    static PER_INDEX_METRICS_ENABLED: LazyLock<bool> =\n        LazyLock::new(|| !crate::get_bool_from_env(\"QW_DISABLE_PER_INDEX_METRICS\", false));\n\n    if *PER_INDEX_METRICS_ENABLED {\n        index_id\n    } else {\n        \"__any__\"\n    }\n}\n\npub static MEMORY_METRICS: LazyLock<MemoryMetrics> = LazyLock::new(MemoryMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-common/src/net.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ffi::OsString;\nuse std::fmt::Display;\nuse std::io;\nuse std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, TcpListener};\nuse std::str::FromStr;\n\nuse anyhow::{Context, bail};\nuse itertools::Itertools;\nuse once_cell::sync::OnceCell;\nuse pnet::datalink::{self, NetworkInterface};\nuse pnet::ipnetwork::IpNetwork;\nuse serde::{Deserialize, Serialize, Serializer};\nuse tokio::net::{ToSocketAddrs, lookup_host};\n\n/// Represents a host, i.e. 
an IP address (`127.0.0.1`) or a hostname (`localhost`).\n#[derive(Clone, Debug, Eq, PartialEq)]\npub enum Host {\n    Hostname(String),\n    IpAddr(IpAddr),\n}\n\nimpl Host {\n    /// Returns [`true`] for the \"unspecified\" address (all bits set to zero).\n    pub fn is_unspecified(&self) -> bool {\n        match &self {\n            Host::Hostname(_) => false,\n            Host::IpAddr(ip_addr) => ip_addr.is_unspecified(),\n        }\n    }\n\n    /// Appends `port` to `self` and returns a [`HostAddr`].\n    pub fn with_port(&self, port: u16) -> HostAddr {\n        HostAddr {\n            host: self.clone(),\n            port,\n        }\n    }\n\n    /// Resolves the host if necessary and returns an [`IpAddr`].\n    pub async fn resolve(&self) -> anyhow::Result<IpAddr> {\n        match &self {\n            Host::Hostname(hostname) => get_socket_addr(&(hostname.as_str(), 0))\n                .await\n                .map(|socket_addr| socket_addr.ip()),\n            Host::IpAddr(ip_addr) => Ok(*ip_addr),\n        }\n    }\n}\n\nimpl Default for Host {\n    fn default() -> Self {\n        Host::IpAddr(IpAddr::V4(Ipv4Addr::LOCALHOST))\n    }\n}\n\nimpl Display for Host {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        match self {\n            Host::Hostname(hostname) => hostname.fmt(formatter),\n            Host::IpAddr(ip_addr) => ip_addr.fmt(formatter),\n        }\n    }\n}\n\nimpl Serialize for Host {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        match self {\n            Host::Hostname(hostname) => hostname.serialize(serializer),\n            Host::IpAddr(ip_addr) => ip_addr.serialize(serializer),\n        }\n    }\n}\n\nimpl<'de> Deserialize<'de> for Host {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: serde::Deserializer<'de> {\n        let host_str: String = Deserialize::deserialize(deserializer)?;\n        
host_str.parse().map_err(serde::de::Error::custom)\n    }\n}\n\nimpl From<IpAddr> for Host {\n    fn from(ip_addr: IpAddr) -> Self {\n        Host::IpAddr(ip_addr)\n    }\n}\n\nimpl From<Ipv4Addr> for Host {\n    fn from(ip_addr: Ipv4Addr) -> Self {\n        Host::IpAddr(IpAddr::V4(ip_addr))\n    }\n}\n\nimpl From<Ipv6Addr> for Host {\n    fn from(ip_addr: Ipv6Addr) -> Self {\n        Host::IpAddr(IpAddr::V6(ip_addr))\n    }\n}\n\nimpl FromStr for Host {\n    type Err = anyhow::Error;\n\n    fn from_str(host: &str) -> Result<Self, Self::Err> {\n        if let Ok(ip_addr) = host.parse::<IpAddr>() {\n            return Ok(Self::IpAddr(ip_addr));\n        }\n        if is_valid_hostname(host) {\n            return Ok(Self::Hostname(host.to_string()));\n        }\n        bail!(\"failed to parse host: `{host}`\")\n    }\n}\n\n/// Represents an address `<host>:<port>` where `host` can be an IP address or a hostname.\n#[derive(Clone, Debug)]\npub struct HostAddr {\n    host: Host,\n    port: u16,\n}\n\nimpl HostAddr {\n    /// Attempts to parse a `host_addr`.\n    /// If no port is defined, it just accepts the host and uses the given default port.\n    ///\n    /// This function supports:\n    /// - IPv4\n    /// - IPv4:port\n    /// - IPv6\n    /// - \\[IPv6\\]:port -- IPv6 addresses contain colons, 
so brackets are customarily required.\n    /// - hostname\n    /// - hostname:port\n    pub fn parse_with_default_port(host_addr: &str, default_port: u16) -> anyhow::Result<Self> {\n        if let Ok(socket_addr) = host_addr.parse::<SocketAddr>() {\n            return Ok(Self {\n                host: Host::IpAddr(socket_addr.ip()),\n                port: socket_addr.port(),\n            });\n        }\n        if let Ok(ip_addr) = host_addr.parse::<IpAddr>() {\n            return Ok(Self {\n                host: Host::IpAddr(ip_addr),\n                port: default_port,\n            });\n        }\n        let (hostname, port) = if let Some((hostname_str, port_str)) = host_addr.split_once(':') {\n            let port_u16 = port_str.parse::<u16>().with_context(|| {\n                format!(\"failed to parse address `{host_addr}`: port is invalid\")\n            })?;\n            (hostname_str, port_u16)\n        } else {\n            (host_addr, default_port)\n        };\n        if !is_valid_hostname(hostname) {\n            bail!(\n                \"failed to parse address `{}`: hostname is invalid\",\n                host_addr\n            )\n        }\n        Ok(Self {\n            host: Host::Hostname(hostname.to_string()),\n            port,\n        })\n    }\n\n    /// Resolves the host if necessary and returns a `SocketAddr`.\n    pub async fn resolve(&self) -> anyhow::Result<SocketAddr> {\n        self.host\n            .resolve()\n            .await\n            .map(|ip_addr| SocketAddr::new(ip_addr, self.port))\n    }\n\n    /// Skips DNS resolution if possible and returns the host address as a `SocketAddr`.\n    pub fn to_socket_addr(self) -> Option<SocketAddr> {\n        if let Host::IpAddr(ip_addr) = self.host {\n            Some(SocketAddr::new(ip_addr, self.port))\n        } else {\n            None\n        }\n    }\n}\n\nimpl Display for HostAddr {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n     
   match self.host {\n            Host::IpAddr(IpAddr::V6(_)) => write!(formatter, \"[{}]:{}\", self.host, self.port),\n            _ => write!(formatter, \"{}:{}\", self.host, self.port),\n        }\n    }\n}\n\n/// Finds a random available TCP port.\n///\n/// This function induces a race condition, use it only in unit tests.\npub fn find_available_tcp_port() -> anyhow::Result<u16> {\n    let socket: SocketAddr = ([127, 0, 0, 1], 0u16).into();\n    let listener = TcpListener::bind(socket)?;\n    let port = listener.local_addr()?.port();\n    Ok(port)\n}\n\n/// Attempts to find the private IP of the host. Returns the matching interface name along with it.\npub fn find_private_ip() -> Option<(String, IpAddr)> {\n    _find_private_ip(&datalink::interfaces())\n}\n\n// Inner function for testing purposes.\nfn _find_private_ip(interfaces: &[NetworkInterface]) -> Option<(String, IpAddr)> {\n    // The way we do this is the following:\n    // 1. List the network interfaces\n    // 2. Filter out the interfaces that are not up\n    // 3. Filter out the networks that are not routable and private\n    // 4. Sort the IP addresses by:\n    //      - type (IPv4 first)\n    //      - mode (default first)\n    //      - size of network address space (desc)\n    // 5. 
Pick the first one\n    interfaces\n        .iter()\n        .filter(|interface| interface.is_up())\n        .flat_map(|interface| {\n            interface\n                .ips\n                .iter()\n                .filter(|ip_net| is_forwardable_ip(&ip_net.ip()) && is_private_ip(&ip_net.ip()))\n                .map(move |ip_net| (interface, ip_net))\n        })\n        .sorted_by_key(|(interface, ip_net)| {\n            (\n                ip_net.is_ipv6(),\n                is_dormant(interface),\n                std::cmp::Reverse(ip_net.prefix()),\n            )\n        })\n        .next()\n        .map(|(interface, ip_net)| (interface.name.clone(), ip_net.ip()))\n}\n\n#[cfg(any(target_os = \"linux\", target_os = \"android\"))]\nfn is_dormant(interface: &NetworkInterface) -> bool {\n    interface.is_dormant()\n}\n\n#[cfg(not(any(target_os = \"linux\", target_os = \"android\")))]\nfn is_dormant(_interface: &NetworkInterface) -> bool {\n    false\n}\n\n/// Converts an object into a resolved `SocketAddr`.\npub async fn get_socket_addr<T: ToSocketAddrs + std::fmt::Debug>(\n    addr: &T,\n) -> anyhow::Result<SocketAddr> {\n    lookup_host(addr)\n        .await\n        .with_context(|| format!(\"failed to parse address or resolve hostname {addr:?}\"))?\n        .next()\n        .ok_or_else(|| {\n            anyhow::anyhow!(\"DNS resolution did not yield any record for hostname {addr:?}\")\n        })\n}\n\nfn is_forwardable_ip(ip_addr: &IpAddr) -> bool {\n    static NON_FORWARDABLE_NETWORKS: OnceCell<Vec<IpNetwork>> = OnceCell::new();\n    NON_FORWARDABLE_NETWORKS\n        .get_or_init(|| {\n            // Blacklist of non-forwardable IP blocks taken from RFC6890\n            [\n                \"0.0.0.0/8\",\n                \"127.0.0.0/8\",\n                \"169.254.0.0/16\",\n                \"192.0.0.0/24\",\n                \"192.0.2.0/24\",\n                \"198.51.100.0/24\",\n                \"2001:10::/28\",\n                \"2001:db8::/32\",\n       
         \"203.0.113.0/24\",\n                \"240.0.0.0/4\",\n                \"255.255.255.255/32\",\n                \"::/128\",\n                \"::1/128\",\n                \"::ffff:0:0/96\",\n                \"fe80::/10\",\n            ]\n            .iter()\n            .map(|network| network.parse().expect(\"Failed to parse network range. This should never happen! Please, report on https://github.com/quickwit-oss/quickwit/issues.\"))\n            .collect()\n        })\n        .iter()\n        .all(|network| !network.contains(*ip_addr))\n}\n\nfn is_private_ip(ip_addr: &IpAddr) -> bool {\n    static PRIVATE_NETWORKS: OnceCell<Vec<IpNetwork>> = OnceCell::new();\n    PRIVATE_NETWORKS\n        .get_or_init(|| {\n            [\"192.168.0.0/16\", \"172.16.0.0/12\", \"10.0.0.0/8\", \"fc00::/7\"]\n                .iter()\n                .map(|network| network.parse().expect(\"Failed to parse network range. This should never happen! Please, report on https://github.com/quickwit-oss/quickwit/issues.\"))\n                .collect()\n        })\n        .iter()\n        .any(|network| network.contains(*ip_addr))\n}\n\npub fn get_hostname() -> io::Result<String> {\n    _get_hostname(hostname::get()?)\n}\n\n// Inner function for testing purposes.\nfn _get_hostname(hostname: OsString) -> io::Result<String> {\n    let hostname_lossy = hostname.to_string_lossy();\n    if is_valid_hostname(&hostname_lossy) {\n        Ok(hostname_lossy.to_string())\n    } else {\n        Err(io::Error::other(format!(\n            \"invalid hostname: `{hostname_lossy}`\"\n        )))\n    }\n}\n\npub fn get_short_hostname() -> io::Result<String> {\n    Ok(get_hostname()?\n        .split('.')\n        .next()\n        .expect(\"Split should never fail.\")\n        .to_string())\n}\n\n/// Returns whether a hostname is valid according to [RFC 1123](https://www.rfc-editor.org/rfc/rfc1123).\n///\n/// A hostname is valid if the following conditions are met:\n///\n/// - It does not start or end 
with `-` or `.`.\n/// - It does not contain any characters outside of the alphanumeric range, except for `-` and `.`.\n/// - It is not empty.\n/// - It is 253 or fewer characters.\n/// - Its labels (characters separated by `.`) are not empty.\n/// - Its labels are 63 or fewer characters.\n/// - Its labels do not start or end with '-' or '.'.\npub fn is_valid_hostname(hostname: &str) -> bool {\n    if hostname.is_empty() || hostname.len() > 253 {\n        return false;\n    }\n    if !hostname\n        .chars()\n        .all(|ch| ch.is_ascii_alphanumeric() || ch == '-' || ch == '.')\n    {\n        return false;\n    }\n    if hostname.split('.').any(|label| {\n        label.is_empty() || label.len() > 63 || label.starts_with('-') || label.ends_with('-')\n    }) {\n        return false;\n    }\n    true\n}\n\n#[cfg(test)]\nmod tests {\n    use std::net::Ipv6Addr;\n\n    use pnet::ipnetwork::{Ipv4Network, Ipv6Network};\n    use serde_json::Value as JsonValue;\n\n    use super::*;\n\n    #[test]\n    fn test_parse_host() {\n        assert_eq!(\n            \"127.0.0.1\".parse::<Host>().unwrap(),\n            Host::from(Ipv4Addr::LOCALHOST)\n        );\n        assert_eq!(\n            \"::1\".parse::<Host>().unwrap(),\n            Host::from(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1))\n        );\n        assert_eq!(\n            \"localhost\".parse::<Host>().unwrap(),\n            Host::Hostname(\"localhost\".to_string())\n        );\n    }\n\n    #[test]\n    fn test_deserialize_host() {\n        assert_eq!(\n            serde_json::from_str::<Host>(\"\\\"127.0.0.1\\\"\").unwrap(),\n            Host::from(Ipv4Addr::LOCALHOST)\n        );\n        assert_eq!(\n            serde_json::from_str::<Host>(\"\\\"::1\\\"\").unwrap(),\n            Host::from(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1))\n        );\n        assert_eq!(\n            serde_json::from_str::<Host>(\"\\\"localhost\\\"\").unwrap(),\n            Host::Hostname(\"localhost\".to_string())\n        );\n    }\n\n  
  #[test]\n    fn test_serialize_host() {\n        assert_eq!(\n            serde_json::to_value(Host::from(Ipv4Addr::LOCALHOST)).unwrap(),\n            JsonValue::String(\"127.0.0.1\".to_string())\n        );\n        assert_eq!(\n            serde_json::to_value(Host::from(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1))).unwrap(),\n            JsonValue::String(\"::1\".to_string())\n        );\n        assert_eq!(\n            serde_json::to_value(Host::Hostname(\"localhost\".to_string())).unwrap(),\n            JsonValue::String(\"localhost\".to_string())\n        );\n    }\n\n    fn test_parse_addr_helper(addr: &str, expected_addr_opt: Option<&str>) {\n        let addr_res = HostAddr::parse_with_default_port(addr, 1337);\n        if let Some(expected_addr) = expected_addr_opt {\n            assert!(\n                addr_res.is_ok(),\n                \"Parsing `{addr}` was expected to succeed.\"\n            );\n            assert_eq!(addr_res.unwrap().to_string(), expected_addr);\n        } else {\n            assert!(\n                addr_res.is_err(),\n                \"Parsing `{}` was expected to fail, got `{}`\",\n                addr,\n                addr_res.unwrap()\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_parse_addr_with_ips() {\n        // IPv4\n        test_parse_addr_helper(\"127.0.0.1\", Some(\"127.0.0.1:1337\"));\n        test_parse_addr_helper(\"127.0.0.1:100\", Some(\"127.0.0.1:100\"));\n        test_parse_addr_helper(\"127.0..1:100\", None);\n\n        // IPv6\n        test_parse_addr_helper(\n            \"2001:0db8:85a3:0000:0000:8a2e:0370:7334\",\n            Some(\"[2001:db8:85a3::8a2e:370:7334]:1337\"),\n        );\n        test_parse_addr_helper(\"2001:0db8:85a3:0000:0000:8a2e:0370:7334:1000\", None);\n        test_parse_addr_helper(\n            \"[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:1000\",\n            Some(\"[2001:db8:85a3::8a2e:370:7334]:1000\"),\n        );\n        
test_parse_addr_helper(\"[2001:0db8:1000\", None);\n        test_parse_addr_helper(\"2001:0db8:85a3:0000:0000:8a2e:0370:7334]:1000\", None);\n\n        // Hostname\n        test_parse_addr_helper(\"google.com\", Some(\"google.com:1337\"));\n        test_parse_addr_helper(\"google.com:1000\", Some(\"google.com:1000\"));\n    }\n\n    #[test]\n    fn test_is_valid_hostname() {\n        for hostname in &[\n            \"VaLiD-HoStNaMe\",\n            \"50-name\",\n            \"235235\",\n            \"example.com\",\n            \"VaLid.HoStNaMe\",\n            \"123.456\",\n        ] {\n            assert!(\n                is_valid_hostname(hostname),\n                \"Hostname `{hostname}` is valid.\",\n            );\n        }\n\n        for hostname in &[\n            \"-invalid-name\",\n            \"also-invalid-\",\n            \"asdf@fasd\",\n            \"@asdfl\",\n            \"asd f@\",\n            \".invalid\",\n            \"invalid.name.\",\n            \"foo.label-is-way-to-longgggggggggggggggggggggggggggggggggggggggggggg.org\",\n            \"invalid.-starting.char\",\n            \"invalid.ending-.char\",\n            \"empty..label\",\n        ] {\n            assert!(\n                !is_valid_hostname(hostname),\n                \"Hostname `{hostname}` is invalid.\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_find_private_ip() {\n        assert!(_find_private_ip(&[]).is_none());\n\n        let interfaces = [\n            NetworkInterface {\n                name: \"lo\".to_string(),\n                description: \"\".to_string(),\n                index: 1,\n                mac: None,\n                ips: vec![\n                    IpNetwork::V4(Ipv4Network::new(\"127.0.0.1\".parse().unwrap(), 8).unwrap()),\n                    IpNetwork::V6(Ipv6Network::new(\"::1\".parse().unwrap(), 128).unwrap()),\n                ],\n                flags: 65609,\n            },\n            NetworkInterface {\n                name: 
\"docker0\".to_string(),\n                description: \"\".to_string(),\n                index: 2,\n                mac: None,\n                ips: vec![\n                    IpNetwork::V6(\n                        Ipv6Network::new(\"fe80::42:69ff:fe8e:e739\".parse().unwrap(), 64).unwrap(),\n                    ),\n                    IpNetwork::V4(Ipv4Network::new(\"172.17.0.1\".parse().unwrap(), 8).unwrap()),\n                ],\n                flags: 4099,\n            },\n            NetworkInterface {\n                name: \"eth0\".to_string(),\n                description: \"\".to_string(),\n                index: 3,\n                mac: None,\n                ips: vec![\n                    IpNetwork::V6(\n                        Ipv6Network::new(\"fe80::84ed:78c:ec06:bf53\".parse().unwrap(), 64).unwrap(),\n                    ),\n                    IpNetwork::V4(Ipv4Network::new(\"192.168.1.70\".parse().unwrap(), 24).unwrap()),\n                ],\n                flags: 69699,\n            },\n        ];\n        let (interface_name, ip_addr) = _find_private_ip(&interfaces).unwrap();\n        assert_eq!(interface_name, \"eth0\");\n        assert_eq!(ip_addr, \"192.168.1.70\".parse::<IpAddr>().unwrap());\n    }\n\n    #[test]\n    fn test_is_forwardable_ip() {\n        for ip in [\"192.168.0.42\", \"172.16.0.42\", \"10.0.0.42\"] {\n            assert!(\n                is_forwardable_ip(&ip.parse::<IpAddr>().unwrap()),\n                \"IP `{ip}` is forwardable!\"\n            );\n        }\n        for ip in [\"127.0.0.42\", \"169.254.0.42\", \"192.0.0.42\"] {\n            assert!(\n                !is_forwardable_ip(&ip.parse::<IpAddr>().unwrap()),\n                \"IP `{ip}` is not forwardable!\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_is_private_ip() {\n        for ip in [\"192.168.0.42\", \"172.16.0.42\", \"10.0.0.42\"] {\n            assert!(\n                is_private_ip(&ip.parse::<IpAddr>().unwrap()),\n                
\"IP `{ip}` is private!\"\n            );\n        }\n        for ip in [\"192.169.0.42\", \"172.32.0.42\", \"11.0.0.42\"] {\n            assert!(\n                !is_private_ip(&ip.parse::<IpAddr>().unwrap()),\n                \"IP `{ip}` is public!\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_get_hostname() {\n        assert_eq!(\n            get_hostname().unwrap(),\n            hostname::get().unwrap().to_string_lossy().to_string()\n        );\n        _get_hostname(OsString::from(\"\")).unwrap_err();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/path_hasher.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::hash::Hasher;\n\n/// We use 255 as a separator as it isn't used by utf-8.\n///\n/// Tantivy uses 1 because it is more convenient for range queries, but we don't\n/// care about the sort order here.\n///\n/// Note: changing this is not retro-compatible!\nconst SEPARATOR: &[u8] = &[255];\n\n/// Mini wrapper over the FnvHasher to incrementally hash nodes\n/// in a tree.\n///\n/// Its purpose is to:\n/// - work around the lack of Clone in the fnv Hasher\n/// - enforce a 1 byte separator between segments\n#[derive(Default)]\npub struct PathHasher {\n    hasher: fnv::FnvHasher,\n}\n\nimpl Clone for PathHasher {\n    #[inline(always)]\n    fn clone(&self) -> PathHasher {\n        PathHasher {\n            hasher: fnv::FnvHasher::with_key(self.hasher.finish()),\n        }\n    }\n}\n\nimpl PathHasher {\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn hash_path(segments: &[&[u8]]) -> u64 {\n        let mut hasher = Self::default();\n        for segment in segments {\n            hasher.append(segment);\n        }\n        hasher.finish_leaf()\n    }\n\n    /// Appends a new segment to our path.\n    ///\n    /// In order to avoid natural collisions, (e.g. 
&[\"ab\", \"c\"] and &[\"a\", \"bc\"]),\n    /// we write the one-byte `SEPARATOR` (255, not a null byte) after each segment.\n    #[inline]\n    pub fn append(&mut self, payload: &[u8]) {\n        self.hasher.write(payload);\n        self.hasher.write(SEPARATOR);\n    }\n\n    #[inline]\n    pub fn finish_leaf(&self) -> u64 {\n        self.hasher.finish()\n    }\n\n    #[inline]\n    pub fn finish_intermediate(&self) -> u64 {\n        let mut intermediate = fnv::FnvHasher::with_key(self.hasher.finish());\n        intermediate.write(SEPARATOR);\n        intermediate.finish()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/pretty.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\npub struct PrettySample<I>(I, usize);\n\nimpl<I> PrettySample<I> {\n    pub fn new(slice: I, sample_size: usize) -> Self {\n        Self(slice, sample_size)\n    }\n}\n\nimpl<I> fmt::Debug for PrettySample<I>\nwhere\n    I: IntoIterator + Clone,\n    I::Item: fmt::Debug,\n{\n    fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(formatter, \"[\")?;\n\n        // In general, we will receive a reference (&[...], &HashMap...) 
or a Map<_> of them.\n        // So we either perform a Copy, or a cheap Clone of a simple struct\n        let mut iter = self.0.clone().into_iter().enumerate();\n        for (i, item) in &mut iter {\n            if i > 0 {\n                write!(formatter, \", \")?;\n            }\n            write!(formatter, \"{item:?}\")?;\n            // Compare with `i + 1` to avoid underflowing `self.1 - 1` when the sample size is 0.\n            if i + 1 >= self.1 {\n                break;\n            }\n        }\n        let left = iter.count();\n        if left > 0 {\n            write!(formatter, \", and {left} more\")?;\n        }\n        write!(formatter, \"]\")?;\n        Ok(())\n    }\n}\n\npub trait PrettyDisplay {\n    fn pretty_display(&self) -> impl fmt::Display;\n}\n\nstruct DurationPrettyDisplay<'a>(&'a Duration);\n\nimpl fmt::Display for DurationPrettyDisplay<'_> {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        // This is enough for my current use cases. To be extended as you see fit.\n        let duration_millis = self.0.as_millis();\n\n        if duration_millis < 1_000 {\n            return write!(formatter, \"{duration_millis}ms\");\n        }\n        // Zero-pad the hundredths so that 1_050ms renders as \"1.05s\" rather than \"1.5s\".\n        write!(\n            formatter,\n            \"{}.{:02}s\",\n            duration_millis / 1_000,\n            duration_millis % 1_000 / 10\n        )\n    }\n}\n\nimpl PrettyDisplay for Duration {\n    fn pretty_display(&self) -> impl fmt::Display {\n        DurationPrettyDisplay(self)\n    }\n}\n\nstruct SequencePrettyDisplay<I>(I);\n\nimpl<I> fmt::Display for SequencePrettyDisplay<I>\nwhere\n    I: IntoIterator + Clone,\n    I::Item: fmt::Display,\n{\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(f, \"[\")?;\n\n        // In general, we will receive a reference (&[...], &HashMap...) 
or a Map<_> of them.\n        // So we either perform a Copy, or a cheap Clone of a simple struct\n        let mut iter = self.0.clone().into_iter().peekable();\n\n        while let Some(item) = iter.next() {\n            write!(f, \"{item}\")?;\n            if iter.peek().is_some() {\n                write!(f, \", \")?;\n            }\n        }\n        write!(f, \"]\")\n    }\n}\n\nimpl<T: fmt::Display> PrettyDisplay for &[T] {\n    fn pretty_display(&self) -> impl fmt::Display {\n        SequencePrettyDisplay(*self)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_pretty_sample() {\n        let pretty_sample = PrettySample::<&[usize]>::new(&[], 2);\n        assert_eq!(format!(\"{pretty_sample:?}\"), \"[]\");\n\n        let pretty_sample = PrettySample::new(&[1], 2);\n        assert_eq!(format!(\"{pretty_sample:?}\"), \"[1]\");\n\n        let pretty_sample = PrettySample::new(&[1, 2], 2);\n        assert_eq!(format!(\"{pretty_sample:?}\"), \"[1, 2]\");\n\n        let pretty_sample = PrettySample::new(&[1, 2, 3], 2);\n        assert_eq!(format!(\"{pretty_sample:?}\"), \"[1, 2, and 1 more]\");\n\n        let pretty_sample = PrettySample::new(&[1, 2, 3, 4], 2);\n        assert_eq!(format!(\"{pretty_sample:?}\"), \"[1, 2, and 2 more]\");\n    }\n\n    #[test]\n    fn test_duration_pretty_display() {\n        let duration = Duration::from_millis(0);\n        assert_eq!(format!(\"{}\", duration.pretty_display()), \"0ms\");\n\n        let duration = Duration::from_millis(125);\n        assert_eq!(format!(\"{}\", duration.pretty_display()), \"125ms\");\n\n        let duration = Duration::from_millis(1_000);\n        assert_eq!(format!(\"{}\", duration.pretty_display()), \"1.00s\");\n\n        let duration = Duration::from_millis(1_050);\n        assert_eq!(format!(\"{}\", duration.pretty_display()), \"1.05s\");\n\n        let duration = Duration::from_millis(1_125);\n        assert_eq!(format!(\"{}\", duration.pretty_display()), \"1.12s\");\n    }\n\n    #[test]\n    fn test_sequence_pretty_display() {\n        let empty_slice: &[i32] = &[];\n        
assert_eq!(format!(\"{}\", empty_slice.pretty_display()), \"[]\");\n\n        let slice_one: &[i32] = &[1];\n        assert_eq!(format!(\"{}\", slice_one.pretty_display()), \"[1]\");\n\n        let slice_two: &[i32] = &[1, 2];\n        assert_eq!(format!(\"{}\", slice_two.pretty_display()), \"[1, 2]\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/progress.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicU32, Ordering};\n\nuse futures::Future;\n\n/// Progress makes it possible to register some progress.\n/// It is used in lieu of healthcheck.\n///\n/// If no progress is observed until the next heartbeat, the actor will be killed.\n#[derive(Clone)]\npub struct Progress(Arc<AtomicU32>);\n\n#[derive(Clone, Copy, Debug, Eq, PartialEq)]\nenum ProgressState {\n    // No update recorded since the last call to .check_for_update()\n    NoUpdate,\n    // An update was recorded since the last call to .check_for_update()\n    Updated,\n    // The actor is in the protected zone.\n    //\n    // The protected zone should seldom be used. 
It is useful\n    // when calling an external library that is blocking for instance.\n    //\n    // Another use case is blocking when sending a message to another actor\n    // with a saturated message bus.\n    // The failure detection is then considered to be the problem of\n    // the downstream actor.\n    //\n    // As long as the actor is in the protected zone, healthchecking won't apply\n    // to it.\n    //\n    // The value inside starts at 0.\n    ProtectedZone(u32),\n}\n\n#[allow(clippy::from_over_into)]\nimpl Into<u32> for ProgressState {\n    fn into(self) -> u32 {\n        match self {\n            ProgressState::NoUpdate => 0,\n            ProgressState::Updated => 1,\n            ProgressState::ProtectedZone(level) => 2 + level,\n        }\n    }\n}\n\nimpl From<u32> for ProgressState {\n    fn from(level: u32) -> Self {\n        match level {\n            0 => ProgressState::NoUpdate,\n            1 => ProgressState::Updated,\n            level => ProgressState::ProtectedZone(level - 2),\n        }\n    }\n}\n\nimpl Default for Progress {\n    fn default() -> Progress {\n        Progress(Arc::new(AtomicU32::new(ProgressState::Updated.into())))\n    }\n}\n\nimpl Progress {\n    pub fn record_progress(&self) {\n        self.0\n            .fetch_max(ProgressState::Updated.into(), Ordering::Relaxed);\n    }\n\n    /// Executes a future in a protected zone.\n    pub async fn protect_future<Fut, T>(&self, future: Fut) -> T\n    where Fut: Future<Output = T> {\n        let _guard = self.protect_zone();\n        future.await\n    }\n\n    pub fn protect_zone(&self) -> ProtectedZoneGuard {\n        loop {\n            let previous_state: ProgressState = self.0.load(Ordering::SeqCst).into();\n            let new_state = match previous_state {\n                ProgressState::NoUpdate | ProgressState::Updated => ProgressState::ProtectedZone(0),\n                ProgressState::ProtectedZone(level) => ProgressState::ProtectedZone(level + 1),\n            };\n 
           if self\n                .0\n                .compare_exchange(\n                    previous_state.into(),\n                    new_state.into(),\n                    Ordering::SeqCst,\n                    Ordering::SeqCst,\n                )\n                .is_ok()\n            {\n                return ProtectedZoneGuard(self.0.clone());\n            }\n        }\n    }\n\n    /// This method mutates the state as follows and returns true if\n    /// the object was in the protected zone or had change registered.\n    /// - Updated -> NoUpdate, returns true\n    /// - NoUpdate -> NoUpdate, returns false\n    /// - ProtectedZone -> ProtectedZone, returns true\n    pub fn registered_activity_since_last_call(&self) -> bool {\n        let previous_state: ProgressState = self\n            .0\n            .compare_exchange(\n                ProgressState::Updated.into(),\n                ProgressState::NoUpdate.into(),\n                Ordering::Relaxed,\n                Ordering::Relaxed,\n            )\n            .unwrap_or_else(|previous_value| previous_value)\n            .into();\n        previous_state != ProgressState::NoUpdate\n    }\n}\n\npub struct ProtectedZoneGuard(Arc<AtomicU32>);\n\nimpl Drop for ProtectedZoneGuard {\n    fn drop(&mut self) {\n        let previous_state: ProgressState = self.0.fetch_sub(1, Ordering::SeqCst).into();\n        assert!(matches!(previous_state, ProgressState::ProtectedZone(_)));\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::Progress;\n\n    #[test]\n    fn test_progress() {\n        let progress = Progress::default();\n        assert!(progress.registered_activity_since_last_call());\n        progress.record_progress();\n        assert!(progress.registered_activity_since_last_call());\n        assert!(!progress.registered_activity_since_last_call());\n    }\n\n    #[test]\n    fn test_progress_protect_zone() {\n        let progress = Progress::default();\n        
assert!(progress.registered_activity_since_last_call());\n        progress.record_progress();\n        assert!(progress.registered_activity_since_last_call());\n        {\n            let _protect_guard = progress.protect_zone();\n            assert!(progress.registered_activity_since_last_call());\n            assert!(progress.registered_activity_since_last_call());\n        }\n        assert!(progress.registered_activity_since_last_call());\n        assert!(!progress.registered_activity_since_last_call());\n    }\n\n    #[test]\n    fn test_progress_several_protect_zone() {\n        let progress = Progress::default();\n        assert!(progress.registered_activity_since_last_call());\n        progress.record_progress();\n        assert!(progress.registered_activity_since_last_call());\n        let first_protect_guard = progress.protect_zone();\n        let second_protect_guard = progress.protect_zone();\n        assert!(progress.registered_activity_since_last_call());\n        assert!(progress.registered_activity_since_last_call());\n        std::mem::drop(first_protect_guard);\n        assert!(progress.registered_activity_since_last_call());\n        assert!(progress.registered_activity_since_last_call());\n        std::mem::drop(second_protect_guard);\n        assert!(progress.registered_activity_since_last_call());\n        assert!(!progress.registered_activity_since_last_call());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/pubsub.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::sync::atomic::{AtomicUsize, Ordering};\nuse std::sync::{Arc, Mutex, Weak};\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse tokio::sync::Mutex as TokioMutex;\n\nuse crate::rate_limited_warn;\nuse crate::type_map::TypeMap;\n\nconst EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT: Duration = Duration::from_secs(10);\n\npub trait Event: fmt::Debug + Clone + Send + Sync + 'static {}\n\n#[async_trait]\npub trait EventSubscriber<E>: Send + Sync + 'static {\n    async fn handle_event(&mut self, event: E);\n}\n\n#[async_trait]\nimpl<E, F> EventSubscriber<E> for F\nwhere\n    E: Event,\n    F: FnMut(E) + Send + Sync + 'static,\n{\n    async fn handle_event(&mut self, event: E) {\n        (self)(event);\n    }\n}\n\ntype EventSubscriptions<E> = HashMap<usize, EventSubscription<E>>;\n\n/// The event broker makes it possible to\n/// - emit specific local events\n/// - subscribe to these local events\n///\n/// The event broker is not distributed in itself. 
Only events emitted\n/// locally will be received by the subscribers.\n///\n/// It is however possible to locally subscribe a handler to a kind of event,\n/// that will in turn run a RPC to other nodes.\n#[derive(Debug, Clone, Default)]\npub struct EventBroker {\n    inner: Arc<InnerEventBroker>,\n}\n\n#[derive(Debug, Default)]\nstruct InnerEventBroker {\n    subscription_sequence: AtomicUsize,\n    subscriptions: Mutex<TypeMap>,\n}\n\nimpl EventBroker {\n    // The point of this private method is to allow the public subscribe method to have only one\n    // generic argument and avoid the ugly `::<E, _>` syntax.\n    fn subscribe_aux<E, S>(&self, subscriber: S, with_timeout: bool) -> EventSubscriptionHandle\n    where\n        E: Event,\n        S: EventSubscriber<E> + Send + Sync + 'static,\n    {\n        let mut subscriptions = self\n            .inner\n            .subscriptions\n            .lock()\n            .expect(\"lock should not be poisoned\");\n\n        if !subscriptions.contains::<EventSubscriptions<E>>() {\n            subscriptions.insert::<EventSubscriptions<E>>(HashMap::new());\n        }\n        let subscription_id = self\n            .inner\n            .subscription_sequence\n            .fetch_add(1, Ordering::Relaxed);\n\n        let subscriber_name = std::any::type_name::<S>();\n        let subscription = EventSubscription {\n            subscriber_name,\n            subscriber: Arc::new(TokioMutex::new(Box::new(subscriber))),\n            with_timeout,\n        };\n        let typed_subscriptions = subscriptions\n            .get_mut::<EventSubscriptions<E>>()\n            .expect(\"subscription map should exist\");\n        typed_subscriptions.insert(subscription_id, subscription);\n\n        EventSubscriptionHandle {\n            subscription_id,\n            broker: Arc::downgrade(&self.inner),\n            drop_me: |subscription_id, broker| {\n                let mut subscriptions = broker\n                    .subscriptions\n        
            .lock()\n                    .expect(\"lock should not be poisoned\");\n                if let Some(typed_subscriptions) = subscriptions.get_mut::<EventSubscriptions<E>>()\n                {\n                    typed_subscriptions.remove(&subscription_id);\n                }\n            },\n        }\n    }\n\n    /// Subscribes to an event type.\n    ///\n    /// The callback should be as light as possible.\n    ///\n    /// # Disclaimer\n    ///\n    /// If the callback takes more than `EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT` to execute,\n    /// the callback future will be aborted.\n    #[must_use]\n    pub fn subscribe<E>(&self, subscriber: impl EventSubscriber<E>) -> EventSubscriptionHandle\n    where E: Event {\n        self.subscribe_aux(subscriber, true)\n    }\n\n    /// Subscribes to an event type.\n    ///\n    /// The callback should be as light as possible.\n    #[must_use]\n    pub fn subscribe_without_timeout<E>(\n        &self,\n        subscriber: impl EventSubscriber<E>,\n    ) -> EventSubscriptionHandle\n    where\n        E: Event,\n    {\n        self.subscribe_aux(subscriber, false)\n    }\n\n    /// Publishes an event.\n    pub fn publish<E>(&self, event: E)\n    where E: Event {\n        let subscriptions = self\n            .inner\n            .subscriptions\n            .lock()\n            .expect(\"lock should not be poisoned\");\n        if let Some(typed_subscriptions) = subscriptions.get::<EventSubscriptions<E>>() {\n            for subscription in typed_subscriptions.values() {\n                subscription.trigger(event.clone());\n            }\n        }\n    }\n}\n\nstruct EventSubscription<E> {\n    // We put that in the subscription in order to avoid having to take the lock\n    // to access it.\n    subscriber_name: &'static str,\n    subscriber: Arc<TokioMutex<Box<dyn EventSubscriber<E>>>>,\n    with_timeout: bool,\n}\n\nimpl<E: Event> EventSubscription<E> {\n    /// Call the callback associated with the 
subscription.\n    fn trigger(&self, event: E) {\n        if self.with_timeout {\n            self.trigger_abort_on_timeout(event);\n        } else {\n            self.trigger_just_log_on_timeout(event)\n        }\n    }\n\n    /// Spawns a task to run the given subscription.\n    ///\n    /// Just logs a warning if it took more than `EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT`\n    /// for the future to execute.\n    fn trigger_just_log_on_timeout(&self, event: E) {\n        let subscriber_name = self.subscriber_name;\n        let subscriber = self.subscriber.clone();\n        // This task is just here to log a warning if the callback takes too long to execute.\n        let log_timeout_task_handle = tokio::task::spawn(async move {\n            tokio::time::sleep(EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT).await;\n            let event_name = std::any::type_name::<E>();\n            rate_limited_warn!(\n                limit_per_min = 10,\n                \"{subscriber_name}'s handler for {event_name} did not finish within {}ms\",\n                EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT.as_millis()\n            );\n        });\n        tokio::task::spawn(async move {\n            subscriber.lock().await.handle_event(event).await;\n            // The callback has terminated, let's abort the timeout task.\n            log_timeout_task_handle.abort();\n        });\n    }\n\n    /// Spawns a task to run the given subscription.\n    ///\n    /// Aborts the future execution and logs a warning if it takes more than\n    /// `EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT`.\n    fn trigger_abort_on_timeout(&self, event: E) {\n        let subscriber_name = self.subscriber_name;\n        let subscriber = self.subscriber.clone();\n        let fut = async move {\n            if tokio::time::timeout(EVENT_SUBSCRIPTION_CALLBACK_TIMEOUT, async {\n                subscriber.lock().await.handle_event(event).await\n            })\n            .await\n            .is_err()\n            {\n                let 
event_name = std::any::type_name::<E>();\n                rate_limited_warn!(\n                    limit_per_min = 10,\n                    \"{subscriber_name}'s handler for {event_name} timed out, abort\"\n                );\n            }\n        };\n        tokio::task::spawn(fut);\n    }\n}\n\n#[derive(Clone)]\npub struct EventSubscriptionHandle {\n    subscription_id: usize,\n    broker: Weak<InnerEventBroker>,\n    drop_me: fn(usize, &InnerEventBroker),\n}\n\nimpl EventSubscriptionHandle {\n    pub fn cancel(self) {}\n\n    /// By default, dropping a subscription handle cancels the subscription.\n    /// `forever` consumes the handle and avoids cancelling the subscription on drop.\n    pub fn forever(mut self) {\n        self.broker = Weak::new();\n    }\n}\n\nimpl Drop for EventSubscriptionHandle {\n    fn drop(&mut self) {\n        if let Some(broker) = self.broker.upgrade() {\n            (self.drop_me)(self.subscription_id, &broker);\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n\n    use super::*;\n\n    #[derive(Debug, Clone)]\n    struct MyEvent {\n        value: usize,\n    }\n\n    impl Event for MyEvent {}\n\n    #[derive(Debug, Clone)]\n    struct MySubscriber {\n        counter: Arc<AtomicUsize>,\n    }\n\n    #[async_trait]\n    impl EventSubscriber<MyEvent> for MySubscriber {\n        async fn handle_event(&mut self, event: MyEvent) {\n            self.counter.store(event.value, Ordering::Relaxed);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_event_broker() {\n        let event_broker = EventBroker::default();\n        let counter = Arc::new(AtomicUsize::new(0));\n        let subscriber = MySubscriber {\n            counter: counter.clone(),\n        };\n        let subscription_handle = event_broker.subscribe(subscriber);\n\n        let event = MyEvent { value: 42 };\n        event_broker.publish(event);\n\n        
tokio::time::sleep(Duration::from_millis(1)).await;\n        assert_eq!(counter.load(Ordering::Relaxed), 42);\n\n        subscription_handle.cancel();\n\n        let event = MyEvent { value: 1337 };\n        event_broker.publish(event);\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        assert_eq!(counter.load(Ordering::Relaxed), 42);\n    }\n\n    #[tokio::test]\n    async fn test_event_broker_handle_drop() {\n        let event_broker = EventBroker::default();\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        drop(event_broker.subscribe(move |event: MyEvent| {\n            tx.send(event.value).unwrap();\n        }));\n        event_broker.publish(MyEvent { value: 42 });\n        assert!(rx.recv().await.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_event_broker_handle_cancel() {\n        let event_broker = EventBroker::default();\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        event_broker\n            .subscribe(move |event: MyEvent| {\n                tx.send(event.value).unwrap();\n            })\n            .cancel();\n        event_broker.publish(MyEvent { value: 42 });\n        assert!(rx.recv().await.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_event_broker_handle_forever() {\n        let event_broker = EventBroker::default();\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        event_broker\n            .subscribe(move |event: MyEvent| {\n                tx.send(event.value).unwrap();\n            })\n            .forever();\n        event_broker.publish(MyEvent { value: 42 });\n        assert_eq!(rx.recv().await, Some(42));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/rand.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse rand::Rng;\nuse rand::distr::Alphanumeric;\n\n/// Appends a random suffix composed of a hyphen and five random alphanumeric characters.\npub fn append_random_suffix(string: &str) -> String {\n    let rng = rand::rng();\n    let mut randomized_string = String::with_capacity(string.len() + 6);\n    randomized_string.push_str(string);\n    randomized_string.push('-');\n\n    for random_byte in rng.sample_iter(&Alphanumeric).take(5) {\n        randomized_string.push(char::from(random_byte));\n    }\n    randomized_string\n}\n\n#[cfg(test)]\nmod tests {\n    use super::append_random_suffix;\n\n    #[test]\n    fn test_append_random_suffix() -> anyhow::Result<()> {\n        let randomized = append_random_suffix(\"\");\n        let mut chars = randomized.chars();\n        assert_eq!(chars.next(), Some('-'));\n        assert_eq!(chars.clone().count(), 5);\n        assert!(chars.all(|ch| ch.is_ascii_alphanumeric()));\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/rate_limited_tracing.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// TODO coarsetime has a recent() instead of now() which is essentially free (atomic read instead of\n// vdso call), but needs us to spawn a future/thread updating that value regularly\n\nuse std::sync::atomic::{AtomicU64, Ordering};\n\nuse coarsetime::{Duration, Instant};\n\n/// Metadata for a log site. This is stored inside a single AtomicU64 when not in use.\n///\n/// `call_count` is the number of calls since the last upgrade of generation, it's stored\n/// in the lower 32b of the atomic, so it can just be incremented on the fast path.\n/// `generation` is the number of times we reset the `call_count`. 
It isn't used as is, and\n/// is just compared to itself to detect and handle properly concurrent resets from multiple\n/// threads.\n#[derive(Clone, Copy)]\nstruct LogSiteMetadata {\n    generation: u32,\n    call_count: u32,\n}\n\nimpl From<u64> for LogSiteMetadata {\n    fn from(val: u64) -> LogSiteMetadata {\n        LogSiteMetadata {\n            generation: (val >> 32) as u32,\n            call_count: (val & ((1 << 32) - 1)) as u32,\n        }\n    }\n}\n\nimpl From<LogSiteMetadata> for u64 {\n    fn from(count: LogSiteMetadata) -> u64 {\n        ((count.generation as u64) << 32) + count.call_count as u64\n    }\n}\n\n/// Helper function used in [`rate_limited_tracing`] to determine if this line should log,\n/// and update the related counters.\npub fn should_log<F: Fn() -> Instant>(\n    count_atomic: &AtomicU64,\n    last_reset_atomic: &AtomicU64,\n    limit: u32,\n    now: F,\n) -> bool {\n    //  count_atomic is treated as 2 u32: upper bits count \"generation\", lower bits count number of\n    //  calls since LAST_RESET. 
We assume there won't be 2**32 calls to this log in ~60s.\n    //  Generation is free to wrap around.\n\n    // Because the lower 32 bits are storing the log count, we can\n    // increment the entire u64 to record this log call.\n    let logsite_meta_u64 = count_atomic.fetch_add(1, Ordering::Acquire);\n    if logsite_meta_u64 == 0 {\n        // this can only be reached the very 1st time we log\n        last_reset_atomic.store(now().as_ticks(), Ordering::Release);\n    }\n\n    let LogSiteMetadata {\n        generation,\n        call_count,\n    } = logsite_meta_u64.into();\n\n    if call_count < limit {\n        return true;\n    }\n\n    let current_time = Duration::from_ticks(now().as_ticks());\n    let last_reset = Duration::from_ticks(last_reset_atomic.load(Ordering::Acquire));\n\n    let should_reset = current_time.abs_diff(last_reset) >= Duration::from_secs(60);\n\n    if !should_reset {\n        // we are over-limit and not far enough in time to reset: don't log\n        return false;\n    }\n\n    let mut update_time = false;\n\n    let update_res =\n        count_atomic.fetch_update(Ordering::Release, Ordering::Acquire, |current_count| {\n            let mut current_count: LogSiteMetadata = current_count.into();\n            if generation == current_count.generation {\n                // we can update generation&time, so we can definitely log\n                update_time = true;\n                let new_count = LogSiteMetadata {\n                    generation: generation.wrapping_add(1),\n                    call_count: 1,\n                };\n                Some(new_count.into())\n            } else {\n                // we can't update generation&time, but maybe we can still log?\n                update_time = false;\n                if current_count.call_count < limit {\n                    // we can log, update the count\n                    current_count.call_count += 1;\n                    Some(current_count.into())\n                } else {\n    
                // we can't log, save some contention by not recording that we tried to\n                    // log, and exit in error\n                    None\n                }\n            }\n        });\n    let can_log = update_res.is_ok();\n\n    // technically there is a race condition if we stay stuck *here* for > 60s, which\n    // could cause us to log more than required. This is unlikely to happen, and not\n    // really a big issue.\n\n    if update_time {\n        // *we* updated generation, so we must update last_reset too\n        last_reset_atomic.store(current_time.as_ticks(), Ordering::Release);\n    }\n    can_log\n}\n\n#[macro_export]\nmacro_rules! rate_limited_tracing {\n    ($log_fn:ident, limit_per_min=$limit:literal, $($args:tt)*) => {{\n        use ::std::sync::atomic::AtomicU64;\n        use $crate::rate_limited_tracing::CoarsetimeInstant;\n\n        static COUNT: AtomicU64 = AtomicU64::new(0);\n        // we can't get time from constant context, so we pre-initialize with zero\n        static LAST_RESET: AtomicU64 = AtomicU64::new(0);\n\n        if $crate::rate_limited_tracing::should_log(&COUNT, &LAST_RESET, $limit, CoarsetimeInstant::now) {\n            ::tracing::$log_fn!($($args)*);\n        }\n    }};\n}\n\n#[macro_export]\nmacro_rules! rate_limited_trace {\n    ($unit:ident=$limit:literal, $($args:tt)*) => {\n        $crate::rate_limited_tracing::rate_limited_tracing!(trace, $unit=$limit, $($args)*)\n    };\n}\n#[macro_export]\nmacro_rules! rate_limited_debug {\n    ($unit:ident=$limit:literal, $($args:tt)*) => {\n        $crate::rate_limited_tracing::rate_limited_tracing!(debug, $unit=$limit, $($args)*)\n    };\n}\n#[macro_export]\nmacro_rules! rate_limited_info {\n    ($unit:ident=$limit:literal, $($args:tt)*) => {\n        $crate::rate_limited_tracing::rate_limited_tracing!(info, $unit=$limit, $($args)*)\n    };\n}\n#[macro_export]\nmacro_rules! 
rate_limited_warn {\n    ($unit:ident=$limit:literal, $($args:tt)*) => {\n        $crate::rate_limited_tracing::rate_limited_tracing!(warn, $unit=$limit, $($args)*)\n    };\n}\n#[macro_export]\nmacro_rules! rate_limited_error {\n    ($unit:ident=$limit:literal, $($args:tt)*) => {\n        $crate::rate_limited_tracing::rate_limited_tracing!(error, $unit=$limit, $($args)*)\n    };\n}\n\nfn _check_macro_works() {\n    rate_limited_info!(limit_per_min = 10, \"test {}\", \"test\");\n}\n\n#[doc(hidden)]\npub use coarsetime::Instant as CoarsetimeInstant;\npub use rate_limited_debug;\npub use rate_limited_error;\npub use rate_limited_info;\npub use rate_limited_trace;\n#[doc(hidden)]\npub use rate_limited_tracing;\npub use rate_limited_warn;\n\n#[cfg(test)]\nmod tests {\n    use std::sync::atomic::{AtomicU64, Ordering};\n\n    use coarsetime::{Duration, Instant};\n\n    use super::should_log;\n\n    // TODO as this is atomic code, we should test it with multiple threads to verify it behaves\n    // like we'd expect, maybe using something like `loom`?\n\n    #[test]\n    fn test_rate_limited_log_single_thread() {\n        let count = AtomicU64::new(0);\n        let last_reset = AtomicU64::new(0);\n        let limit = 5u64;\n\n        let mut simulated_time = Instant::now();\n        let simulation_step = Duration::from_secs(1);\n\n        assert!(should_log(&count, &last_reset, limit as _, || {\n            simulated_time\n        }));\n        assert_eq!(count.load(Ordering::Relaxed), 1);\n        let reset_timestamp = last_reset.load(Ordering::Relaxed);\n        assert_ne!(reset_timestamp, 0);\n\n        simulated_time += simulation_step;\n\n        for i in 1..limit {\n            // we log as many times as expected\n            assert!(should_log(&count, &last_reset, limit as _, || {\n                simulated_time\n            }));\n            assert_eq!(count.load(Ordering::Relaxed), i + 1);\n            assert_eq!(last_reset.load(Ordering::Relaxed), 
reset_timestamp);\n            simulated_time += simulation_step;\n        }\n\n        for i in limit..(limit * 2) {\n            // we don't log, nor update\n            assert!(!should_log(&count, &last_reset, limit as _, || {\n                simulated_time\n            }));\n            assert_eq!(count.load(Ordering::Relaxed), i + 1);\n            assert_eq!(last_reset.load(Ordering::Relaxed), reset_timestamp);\n            simulated_time += simulation_step;\n        }\n\n        // advance enough to reset counter\n        simulated_time += simulation_step * 60;\n\n        assert!(should_log(&count, &last_reset, limit as _, || {\n            simulated_time\n        }));\n        // counter got reset, generation increased\n        assert_eq!(count.load(Ordering::Relaxed), 1 + (1 << 32));\n        // last reset changed too\n        assert_ne!(last_reset.load(Ordering::Relaxed), reset_timestamp);\n        let reset_timestamp = last_reset.load(Ordering::Relaxed);\n\n        for i in 1..limit {\n            // we log as many times as expected\n            assert!(should_log(&count, &last_reset, limit as _, || {\n                simulated_time\n            }));\n            assert_eq!(count.load(Ordering::Relaxed), i + 1 + (1 << 32));\n            assert_eq!(last_reset.load(Ordering::Relaxed), reset_timestamp);\n            simulated_time += simulation_step;\n        }\n\n        for i in limit..(limit * 2) {\n            // we don't log, nor update\n            assert!(!should_log(&count, &last_reset, limit as _, || {\n                simulated_time\n            }));\n            assert_eq!(count.load(Ordering::Relaxed), i + 1 + (1 << 32));\n            assert_eq!(last_reset.load(Ordering::Relaxed), reset_timestamp);\n            simulated_time += simulation_step;\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/rate_limiter.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Add;\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\nuse governor::clock::{Clock, DefaultClock, Reference};\nuse governor::nanos::Nanos;\n\nuse crate::tower::{ConstantRate, Rate};\n\n#[derive(Debug, Clone, Copy)]\npub struct RateLimiterSettings {\n    // After a long period of inactivity, the rate limiter can accumulate some \"credits\"\n    // up to what we call a `burst_limit`.\n    //\n    // Until these credits are expired, the rate limiter may exceed temporarily its rate limit.\n    pub burst_limit: u64,\n    pub rate_limit: ConstantRate,\n    // The refill period has an effect on the resolution at which the\n    // rate limiting is enforced.\n    //\n    // `Instant::now()` is guaranteed to be called at most once per refill_period.\n    pub refill_period: Duration,\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl Default for RateLimiterSettings {\n    fn default() -> Self {\n        // 10 MB burst limit.\n        let burst_limit = ByteSize::mb(10).as_u64();\n        // 5 MB/s rate limit.\n        let rate_limit = ConstantRate::bytes_per_sec(ByteSize::mb(5));\n        // Refill every 100ms.\n        let refill_period = Duration::from_millis(100);\n\n        Self {\n            burst_limit,\n            rate_limit,\n            refill_period,\n        }\n    }\n}\n\n/// A bursty token-based rate 
limiter.\n#[derive(Debug, Clone)]\npub struct RateLimiter<C: Clock = DefaultClock> {\n    // Maximum number of permits that can be accumulated.\n    max_capacity: u64,\n    // Number of permits available.\n    available_permits: u64,\n    refill_amount: u64,\n    refill_period: Duration,\n    refill_period_nanos: u64,\n    refill_at: C::Instant,\n    clock: C,\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl Default for RateLimiter<DefaultClock> {\n    fn default() -> Self {\n        Self::from_settings(RateLimiterSettings::default())\n    }\n}\n\nimpl RateLimiter<DefaultClock> {\n    /// Creates a new rate limiter from the given settings using the default clock.\n    pub fn from_settings(settings: RateLimiterSettings) -> Self {\n        Self::from_settings_with_clock(settings, DefaultClock::default())\n    }\n}\n\nimpl<C: Clock> RateLimiter<C> {\n    /// Creates a new rate limiter from the given settings with a custom clock.\n    pub fn from_settings_with_clock(settings: RateLimiterSettings, clock: C) -> Self {\n        let max_capacity = settings.burst_limit;\n        let refill_period = settings.refill_period;\n        let rate_limit = settings.rate_limit.rescale(refill_period);\n        let now = clock.now();\n\n        Self {\n            max_capacity,\n            available_permits: max_capacity,\n            refill_amount: rate_limit.work(),\n            refill_period,\n            refill_period_nanos: refill_period.as_nanos() as u64,\n            refill_at: now.add(Nanos::from(refill_period)),\n            clock,\n        }\n    }\n\n    /// Returns the number of permits available.\n    pub fn available_permits(&mut self) -> u64 {\n        self.refill(self.clock.now());\n        self.available_permits\n    }\n\n    /// Acquires some permits from the rate limiter. 
Returns whether the permits were acquired.\n    pub fn acquire(&mut self, num_permits: u64) -> bool {\n        self.refill(self.clock.now());\n        self.acquire_inner(num_permits)\n    }\n\n    /// Acquires some permits expressed in bytes from the rate limiter. Returns whether the permits\n    /// were acquired.\n    pub fn acquire_bytes(&mut self, bytes: ByteSize) -> bool {\n        self.acquire(bytes.as_u64())\n    }\n\n    /// Drains all the permits from the rate limiter, effectively disabling all the operations\n    /// guarded by the rate limiter for one refill period.\n    pub fn drain(&mut self) {\n        self.available_permits = 0;\n        self.refill_at = self.clock.now().add(Nanos::from(self.refill_period));\n    }\n\n    /// Gives back some unused permits to the rate limiter.\n    pub fn release(&mut self, num_permits: u64) {\n        self.available_permits = self.max_capacity.min(self.available_permits + num_permits);\n    }\n\n    fn acquire_inner(&mut self, num_permits: u64) -> bool {\n        if self.available_permits >= num_permits {\n            self.available_permits -= num_permits;\n            true\n        } else {\n            false\n        }\n    }\n\n    fn refill(&mut self, now: C::Instant) {\n        if now.lt(&self.refill_at) {\n            return;\n        }\n        let elapsed_nanos = now.duration_since(self.refill_at).as_u64();\n        // More than one refill period may have elapsed so we need to take that into account.\n        let refill =\n            self.refill_amount + self.refill_amount * elapsed_nanos / self.refill_period_nanos;\n        self.available_permits = self.max_capacity.min(self.available_permits + refill);\n        self.refill_at = now.add(Nanos::from(self.refill_period));\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use governor::clock::FakeRelativeClock;\n\n    use super::*;\n\n    #[test]\n    fn test_rate_limiter_acquire() {\n        let settings = RateLimiterSettings {\n            burst_limit: 
ByteSize::mb(2).as_u64(),\n            rate_limit: ConstantRate::bytes_per_sec(ByteSize::mb(1)),\n            refill_period: Duration::from_millis(100),\n        };\n        let clock = FakeRelativeClock::default();\n        let mut rate_limiter = RateLimiter::from_settings_with_clock(settings, clock.clone());\n        assert_eq!(rate_limiter.max_capacity, ByteSize::mb(2).as_u64());\n        assert_eq!(rate_limiter.available_permits, ByteSize::mb(2).as_u64());\n        assert_eq!(rate_limiter.refill_amount, ByteSize::kb(100).as_u64());\n        assert_eq!(rate_limiter.refill_period, Duration::from_millis(100));\n\n        assert!(rate_limiter.acquire_bytes(ByteSize::mb(1)));\n        assert!(rate_limiter.acquire_bytes(ByteSize::mb(1)));\n        assert!(!rate_limiter.acquire_bytes(ByteSize::kb(1)));\n\n        clock.advance(Duration::from_millis(100));\n\n        assert!(rate_limiter.acquire_bytes(ByteSize::kb(100)));\n        assert!(!rate_limiter.acquire_bytes(ByteSize::kb(20)));\n\n        clock.advance(Duration::from_millis(250));\n\n        assert!(rate_limiter.acquire_bytes(ByteSize::kb(125)));\n        assert!(rate_limiter.acquire_bytes(ByteSize::kb(125)));\n        assert!(!rate_limiter.acquire_bytes(ByteSize::kb(20)));\n    }\n\n    #[test]\n    fn test_rate_limiter_drain() {\n        let settings = RateLimiterSettings {\n            burst_limit: ByteSize::mb(2).as_u64(),\n            rate_limit: ConstantRate::bytes_per_sec(ByteSize::mb(1)),\n            refill_period: Duration::from_millis(100),\n        };\n        let clock = FakeRelativeClock::default();\n        let mut rate_limiter = RateLimiter::from_settings_with_clock(settings, clock.clone());\n        rate_limiter.drain();\n        assert_eq!(rate_limiter.available_permits, 0);\n\n        clock.advance(Duration::from_millis(50));\n        rate_limiter.refill(clock.now());\n        assert_eq!(rate_limiter.available_permits, 0);\n\n        clock.advance(Duration::from_millis(50));\n        
rate_limiter.refill(clock.now());\n        assert!(rate_limiter.available_permits >= ByteSize::kb(100).as_u64());\n    }\n\n    #[test]\n    fn test_rate_limiter_release() {\n        let settings = RateLimiterSettings {\n            burst_limit: 1,\n            rate_limit: ConstantRate::bytes_per_sec(ByteSize::mb(1)),\n            refill_period: Duration::from_millis(100),\n        };\n        let mut rate_limiter = RateLimiter::from_settings(settings);\n        rate_limiter.acquire(1);\n        assert_eq!(rate_limiter.available_permits, 0);\n\n        rate_limiter.release(1);\n        assert_eq!(rate_limiter.available_permits, 1);\n\n        rate_limiter.release(1);\n        assert_eq!(rate_limiter.available_permits, 1);\n    }\n\n    #[test]\n    fn test_rate_limiter_refill() {\n        let settings = RateLimiterSettings {\n            burst_limit: ByteSize::mb(2).as_u64(),\n            rate_limit: ConstantRate::bytes_per_sec(ByteSize::mb(1)),\n            refill_period: Duration::from_millis(100),\n        };\n        let clock = FakeRelativeClock::default();\n        let mut rate_limiter = RateLimiter::from_settings_with_clock(settings, clock.clone());\n\n        rate_limiter.available_permits = 0;\n        assert_eq!(rate_limiter.available_permits, 0);\n\n        rate_limiter.available_permits = 0;\n        clock.advance(Duration::from_millis(100));\n        rate_limiter.refill(clock.now());\n        assert_eq!(rate_limiter.available_permits, ByteSize::kb(100).as_u64());\n\n        rate_limiter.available_permits = 0;\n        clock.advance(Duration::from_millis(110));\n        rate_limiter.refill(clock.now());\n        assert_eq!(rate_limiter.available_permits, ByteSize::kb(110).as_u64());\n\n        rate_limiter.available_permits = 0;\n        clock.advance(Duration::from_millis(210));\n        rate_limiter.refill(clock.now());\n        assert_eq!(rate_limiter.available_permits, ByteSize::kb(210).as_u64());\n    }\n\n    #[test]\n    fn 
test_rate_limiter_available_permits() {\n        let settings = RateLimiterSettings {\n            burst_limit: ByteSize::mb(2).as_u64(),\n            rate_limit: ConstantRate::bytes_per_sec(ByteSize::mb(1)),\n            refill_period: Duration::from_millis(100),\n        };\n        let clock = FakeRelativeClock::default();\n        let mut rate_limiter = RateLimiter::from_settings_with_clock(settings, clock.clone());\n\n        rate_limiter.available_permits = 0;\n        clock.advance(Duration::from_millis(100));\n        assert_eq!(rate_limiter.available_permits(), ByteSize::kb(100).as_u64());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/rendezvous_hasher.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Reverse;\nuse std::hash::{Hash, Hasher};\n\nuse siphasher::sip::SipHasher;\n\n/// Computes the affinity of a node for a given `key`.\n/// A higher value means a higher affinity.\n/// This is the `rendezvous hash`.\npub fn node_affinity<T: Hash, U: Hash>(node: T, key: &U) -> u64 {\n    let mut state = SipHasher::new();\n    key.hash(&mut state);\n    node.hash(&mut state);\n    state.finish()\n}\n\n/// Sorts the list of nodes by decreasing affinity for the given `key`.\n/// This is called rendezvous hashing.\npub fn sort_by_rendez_vous_hash<T: Hash, U: Hash>(nodes: &mut [T], key: U) {\n    nodes.sort_by_cached_key(|node| Reverse(node_affinity(node, &key)));\n}\n\n#[cfg(test)]\nmod tests {\n    use std::net::SocketAddr;\n\n    use super::*;\n    use crate::SocketAddrLegacyHash;\n\n    fn test_socket_addr(last_byte: u8) -> SocketAddr {\n        ([127, 0, 0, last_byte], 10_000u16).into()\n    }\n\n    #[test]\n    fn test_utils_sort_by_rendez_vous_hash() {\n        let socket1 = test_socket_addr(1);\n        let socket2 = test_socket_addr(2);\n        let socket3 = test_socket_addr(3);\n        let socket4 = test_socket_addr(4);\n\n        let legacy_socket1 = SocketAddrLegacyHash(&socket1);\n        let legacy_socket2 = SocketAddrLegacyHash(&socket2);\n        let legacy_socket3 = SocketAddrLegacyHash(&socket3);\n        let legacy_socket4 
= SocketAddrLegacyHash(&socket4);\n\n        let mut socket_set1 = vec![\n            legacy_socket4,\n            legacy_socket3,\n            legacy_socket1,\n            legacy_socket2,\n        ];\n        sort_by_rendez_vous_hash(&mut socket_set1, \"key\");\n\n        let mut socket_set2 = vec![legacy_socket1, legacy_socket2, legacy_socket4];\n        sort_by_rendez_vous_hash(&mut socket_set2, \"key\");\n\n        let mut socket_set3 = vec![legacy_socket1, legacy_socket4];\n        sort_by_rendez_vous_hash(&mut socket_set3, \"key\");\n\n        assert_eq!(\n            socket_set1,\n            &[\n                legacy_socket1,\n                legacy_socket2,\n                legacy_socket3,\n                legacy_socket4\n            ]\n        );\n        assert_eq!(\n            socket_set2,\n            &[legacy_socket1, legacy_socket2, legacy_socket4]\n        );\n        assert_eq!(socket_set3, &[legacy_socket1, legacy_socket4]);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/retry.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Debug;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse futures::Future;\nuse rand::Rng;\nuse tracing::{debug, warn};\n\npub trait Retryable {\n    fn is_retryable(&self) -> bool {\n        false\n    }\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub enum Retry<E> {\n    Permanent(E),\n    Transient(E),\n}\n\nimpl<E> Retry<E> {\n    pub fn into_inner(self) -> E {\n        match self {\n            Self::Transient(error) => error,\n            Self::Permanent(error) => error,\n        }\n    }\n}\n\nimpl<E> Retryable for Retry<E> {\n    fn is_retryable(&self) -> bool {\n        match self {\n            Retry::Permanent(_) => false,\n            Retry::Transient(_) => true,\n        }\n    }\n}\n\n#[derive(Debug, Clone, Copy)]\npub struct RetryParams {\n    pub base_delay: Duration,\n    pub max_delay: Duration,\n    pub max_attempts: usize,\n}\n\nimpl RetryParams {\n    /// Creates a new [`RetryParams`] instance using the same settings as the standard retry policy\n    /// defined in the AWS SDK for Rust.\n    pub fn standard() -> Self {\n        Self {\n            base_delay: Duration::from_secs(1),\n            max_delay: Duration::from_secs(20),\n            max_attempts: 3,\n        }\n    }\n\n    /// Creates a new [`RetryParams`] instance using settings that are more aggressive than those of\n    /// the standard 
policy for services that are more resilient to retries, usually managed\n    /// cloud services.\n    pub fn aggressive() -> Self {\n        Self {\n            base_delay: Duration::from_millis(250),\n            max_delay: Duration::from_secs(20),\n            max_attempts: 5,\n        }\n    }\n\n    /// Creates a new [`RetryParams`] instance that does not perform any retries.\n    pub fn no_retries() -> Self {\n        Self {\n            base_delay: Duration::ZERO,\n            max_delay: Duration::ZERO,\n            max_attempts: 1,\n        }\n    }\n\n    /// Computes the delay after which a new attempt should be performed. The randomized delay\n    /// increases after each attempt (exponential backoff with jitter: the delay is drawn\n    /// uniformly between half of the capped delay and the capped delay). Implementation and default\n    /// values originate from the Java SDK. See also: <https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/>.\n    ///\n    /// The caller should pass the number of attempts that have been performed so far. Not to be\n    /// confused with the number of retries, which is one less than the number of attempts.\n    ///\n    /// # Panics\n    ///\n    /// Panics if `num_attempts` is zero.\n    pub fn compute_delay(&self, num_attempts: usize) -> Duration {\n        assert!(num_attempts > 0, \"num_attempts should be greater than zero\");\n        let num_attempts = num_attempts.min(32);\n        let delay_ms = (self.base_delay.as_millis() as u64)\n            .saturating_mul(2u64.saturating_pow(num_attempts as u32 - 1));\n        let capped_delay_ms = delay_ms.min(self.max_delay.as_millis() as u64);\n        let half_delay_ms = capped_delay_ms.div_ceil(2);\n        let jitter_range = half_delay_ms..capped_delay_ms + 1;\n        let jittered_delay_ms = rand::rng().random_range(jitter_range);\n        Duration::from_millis(jittered_delay_ms)\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        Self {\n            base_delay: Duration::from_millis(1),\n    
        max_delay: Duration::from_millis(2),\n            max_attempts: 3,\n        }\n    }\n}\n\n#[async_trait]\npub trait MockableSleep {\n    async fn sleep(&self, duration: Duration);\n}\n\npub struct TokioSleep;\n\n#[async_trait]\nimpl MockableSleep for TokioSleep {\n    async fn sleep(&self, duration: Duration) {\n        tokio::time::sleep(duration).await;\n    }\n}\n\npub async fn retry_with_mockable_sleep<U, E, Fut>(\n    retry_params: &RetryParams,\n    f: impl Fn() -> Fut,\n    mockable_sleep: impl MockableSleep,\n) -> Result<U, E>\nwhere\n    Fut: Future<Output = Result<U, E>>,\n    E: Retryable + Debug + 'static,\n{\n    let mut num_attempts = 0;\n\n    loop {\n        let response = f().await;\n\n        let error = match response {\n            Ok(response) => {\n                return Ok(response);\n            }\n            Err(error) => error,\n        };\n        if !error.is_retryable() {\n            return Err(error);\n        }\n        num_attempts += 1;\n\n        if num_attempts >= retry_params.max_attempts {\n            warn!(\n                num_attempts=%num_attempts,\n                \"request failed\"\n            );\n            return Err(error);\n        }\n        let delay = retry_params.compute_delay(num_attempts);\n        debug!(\n            num_attempts=%num_attempts,\n            delay_ms=%delay.as_millis(),\n            error=?error,\n            \"request failed, retrying\"\n        );\n        mockable_sleep.sleep(delay).await;\n    }\n}\n\npub async fn retry<U, E, Fut>(retry_params: &RetryParams, f: impl Fn() -> Fut) -> Result<U, E>\nwhere\n    Fut: Future<Output = Result<U, E>>,\n    E: Retryable + Debug + 'static,\n{\n    retry_with_mockable_sleep(retry_params, f, TokioSleep).await\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::RwLock;\n    use std::time::Duration;\n\n    use futures::future::ready;\n\n    use super::{MockableSleep, RetryParams, Retryable, retry_with_mockable_sleep};\n\n    #[derive(Debug, 
Eq, PartialEq)]\n    pub enum Retry<E> {\n        Permanent(E),\n        Transient(E),\n    }\n\n    impl<E> Retryable for Retry<E> {\n        fn is_retryable(&self) -> bool {\n            match self {\n                Retry::Permanent(_) => false,\n                Retry::Transient(_) => true,\n            }\n        }\n    }\n\n    struct NoopSleep;\n\n    #[async_trait::async_trait]\n    impl MockableSleep for NoopSleep {\n        async fn sleep(&self, _duration: Duration) {\n            // This is a no-op implementation, so we do nothing here.\n        }\n    }\n\n    async fn simulate_retries<T>(values: Vec<Result<T, Retry<usize>>>) -> Result<T, Retry<usize>> {\n        let noop_mock = NoopSleep;\n        let values_it = RwLock::new(values.into_iter());\n        retry_with_mockable_sleep(\n            &RetryParams {\n                base_delay: Duration::from_millis(1),\n                max_delay: Duration::from_millis(2),\n                max_attempts: 30,\n            },\n            || ready(values_it.write().unwrap().next().unwrap()),\n            noop_mock,\n        )\n        .await\n    }\n\n    #[tokio::test]\n    async fn test_retry_accepts_ok() {\n        assert_eq!(simulate_retries(vec![Ok(())]).await, Ok(()));\n    }\n\n    #[tokio::test]\n    async fn test_retry_does_retry() {\n        assert_eq!(\n            simulate_retries(vec![Err(Retry::Transient(1)), Ok(())]).await,\n            Ok(())\n        );\n    }\n\n    #[tokio::test]\n    async fn test_retry_stops_retrying_on_non_retryable_error() {\n        assert_eq!(\n            simulate_retries(vec![Err(Retry::Permanent(1)), Ok(())]).await,\n            Err(Retry::Permanent(1))\n        );\n    }\n\n    #[tokio::test]\n    async fn test_retry_retries_up_at_most_attempts_times() {\n        let retry_sequence: Vec<_> = (0..30)\n            .map(|retry_id| Err(Retry::Transient(retry_id)))\n            .chain(Some(Ok(())))\n            .collect();\n        assert_eq!(\n            
simulate_retries(retry_sequence).await,\n            Err(Retry::Transient(29))\n        );\n    }\n\n    #[tokio::test]\n    async fn test_retry_retries_up_to_max_attempts_times() {\n        let retry_sequence: Vec<_> = (0..29)\n            .map(|retry_id| Err(Retry::Transient(retry_id)))\n            .chain(Some(Ok(())))\n            .collect();\n        assert_eq!(simulate_retries(retry_sequence).await, Ok(()));\n    }\n\n    fn test_retry_delay_does_not_overflow_aux(retry_params: RetryParams) {\n        for i in 1..100 {\n            let delay = retry_params.compute_delay(i);\n            assert!(delay <= retry_params.max_delay);\n            if retry_params.base_delay <= retry_params.max_delay {\n                assert!(delay * 2 >= retry_params.base_delay);\n            }\n        }\n    }\n\n    proptest::proptest! {\n        #[test]\n        fn test_retry_delay_does_not_overflow(\n            max_attempts in 1..1_000usize,\n            base_delay in 0..1_000u64,\n            max_delay in 0..60_000u64,\n        ) {\n            let retry_params = RetryParams {\n                max_attempts,\n                base_delay: Duration::from_millis(base_delay),\n                max_delay: Duration::from_millis(max_delay),\n            };\n            test_retry_delay_does_not_overflow_aux(retry_params);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/ring_buffer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::{Debug, Formatter};\n\n/// Fixed-size buffer that keeps the last N elements pushed into it.\n///\n/// `head` is the write cursor. It advances by one on each push and wraps\n/// back to 0 when it reaches N, overwriting the oldest element.\n///\n/// ```text\n/// RingBuffer<u32, 4> after pushing 1, 2, 3, 4, 5, 6:\n///\n///   buffer = [5, 6, 3, 4]    head = 2    len = 4\n///                 ^\n///                 next write goes here\n///\n///   logical view (oldest → newest): [3, 4, 5, 6]\n/// ```\npub struct RingBuffer<T: Copy + Default, const N: usize> {\n    buffer: [T; N],\n    head: usize,\n    len: usize,\n}\n\nimpl<T: Copy + Default, const N: usize> Default for RingBuffer<T, N> {\n    fn default() -> Self {\n        Self {\n            buffer: [T::default(); N],\n            head: 0,\n            len: 0,\n        }\n    }\n}\n\nimpl<T: Copy + Default + Debug, const N: usize> Debug for RingBuffer<T, N> {\n    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {\n        f.debug_list().entries(self.iter()).finish()\n    }\n}\n\nimpl<T: Copy + Default, const N: usize> RingBuffer<T, N> {\n    pub fn push_back(&mut self, value: T) {\n        self.buffer[self.head] = value;\n        self.head = (self.head + 1) % N;\n        if self.len < N {\n            self.len += 1;\n        }\n    }\n\n    pub fn last(&self) -> Option<T> 
{\n        if self.len == 0 {\n            return None;\n        }\n        Some(self.buffer[(self.head + N - 1) % N])\n    }\n\n    pub fn front(&self) -> Option<T> {\n        if self.len == 0 {\n            return None;\n        }\n        Some(self.buffer[(self.head + N - self.len) % N])\n    }\n\n    pub fn len(&self) -> usize {\n        self.len\n    }\n\n    pub fn is_empty(&self) -> bool {\n        self.len == 0\n    }\n\n    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {\n        let start = (self.head + N - self.len) % N;\n        (0..self.len).map(move |i| &self.buffer[(start + i) % N])\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_empty() {\n        let rb = RingBuffer::<u32, 4>::default();\n        assert!(rb.is_empty());\n        assert_eq!(rb.len(), 0);\n        assert_eq!(rb.last(), None);\n        assert_eq!(rb.front(), None);\n        assert_eq!(rb.iter().count(), 0);\n    }\n\n    #[test]\n    fn test_single_push() {\n        let mut rb = RingBuffer::<u32, 4>::default();\n        rb.push_back(10);\n        assert_eq!(rb.len(), 1);\n        assert!(!rb.is_empty());\n        assert_eq!(rb.last(), Some(10));\n        assert_eq!(rb.front(), Some(10));\n        assert_eq!(rb.iter().copied().collect::<Vec<_>>(), vec![10]);\n    }\n\n    #[test]\n    fn test_partial_fill() {\n        let mut rb = RingBuffer::<u32, 4>::default();\n        rb.push_back(1);\n        rb.push_back(2);\n        rb.push_back(3);\n        assert_eq!(rb.len(), 3);\n        assert_eq!(rb.last(), Some(3));\n        assert_eq!(rb.front(), Some(1));\n        assert_eq!(rb.iter().copied().collect::<Vec<_>>(), vec![1, 2, 3]);\n    }\n\n    #[test]\n    fn test_exactly_full() {\n        let mut rb = RingBuffer::<u32, 4>::default();\n        for i in 1..=4 {\n            rb.push_back(i);\n        }\n        assert_eq!(rb.len(), 4);\n        assert_eq!(rb.last(), Some(4));\n        assert_eq!(rb.front(), Some(1));\n        
assert_eq!(rb.iter().copied().collect::<Vec<_>>(), vec![1, 2, 3, 4]);\n    }\n\n    #[test]\n    fn test_wrap_around() {\n        let mut rb = RingBuffer::<u32, 4>::default();\n        for i in 1..=6 {\n            rb.push_back(i);\n        }\n        assert_eq!(rb.len(), 4);\n        assert_eq!(rb.last(), Some(6));\n        assert_eq!(rb.front(), Some(3));\n        assert_eq!(rb.iter().copied().collect::<Vec<_>>(), vec![3, 4, 5, 6]);\n    }\n\n    #[test]\n    fn test_many_wraps() {\n        let mut rb = RingBuffer::<u32, 3>::default();\n        for i in 1..=100 {\n            rb.push_back(i);\n        }\n        assert_eq!(rb.len(), 3);\n        assert_eq!(rb.last(), Some(100));\n        assert_eq!(rb.front(), Some(98));\n        assert_eq!(rb.iter().copied().collect::<Vec<_>>(), vec![98, 99, 100]);\n    }\n\n    #[test]\n    fn test_debug() {\n        let mut rb = RingBuffer::<u32, 3>::default();\n        rb.push_back(1);\n        rb.push_back(2);\n        assert_eq!(format!(\"{:?}\", rb), \"[1, 2]\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/runtimes.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::sync::atomic::{AtomicUsize, Ordering};\nuse std::time::Duration;\n\nuse once_cell::sync::OnceCell;\nuse prometheus::{Gauge, IntCounter, IntGauge};\nuse tokio::runtime::Runtime;\nuse tokio_metrics::{RuntimeMetrics, RuntimeMonitor};\n\nuse crate::metrics::{new_counter, new_float_gauge, new_gauge};\n\nstatic RUNTIMES: OnceCell<HashMap<RuntimeType, tokio::runtime::Runtime>> = OnceCell::new();\n\n/// Describes which runtime an actor should run on.\n#[derive(Clone, Copy, Debug, Hash, Eq, PartialEq)]\npub enum RuntimeType {\n    /// The blocking runtime runs blocking actors.\n    /// This runtime is only used as a nice thread pool with\n    /// the same interface as tokio tasks.\n    ///\n    /// This runtime should not be used to run tokio\n    /// io operations.\n    ///\n    /// Tasks are allowed to block for an arbitrary amount of time.\n    Blocking,\n\n    /// The non-blocking runtime is closer to what one would expect from\n    /// a regular tokio runtime.\n    ///\n    /// Tasks are expected to yield within 500 microseconds.\n    NonBlocking,\n}\n\n#[derive(Debug, Clone, Copy)]\npub struct RuntimesConfig {\n    /// Number of worker threads allocated to the non-blocking runtime.\n    pub num_threads_non_blocking: usize,\n    /// Number of worker threads allocated to the blocking runtime.\n    pub 
num_threads_blocking: usize,\n}\n\nimpl RuntimesConfig {\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn light_for_tests() -> RuntimesConfig {\n        RuntimesConfig {\n            num_threads_blocking: 1,\n            num_threads_non_blocking: 1,\n        }\n    }\n\n    pub fn with_num_cpus(num_cpus: usize) -> Self {\n        // Non-blocking tasks are supposed to be IO intensive and should not require many threads.\n        // On the other hand, the blocking actors are CPU intensive. We allocate\n        // almost all of the threads to them.\n        match num_cpus {\n            0..=3 => {\n                // We do not have enough vCPUs to allocate a full thread to\n                // non-blocking.\n                RuntimesConfig {\n                    num_threads_non_blocking: 1,\n                    num_threads_blocking: num_cpus,\n                }\n            }\n            4..=6 => RuntimesConfig {\n                num_threads_non_blocking: 1,\n                num_threads_blocking: num_cpus - 1,\n            },\n            7.. 
=> RuntimesConfig {\n                num_threads_non_blocking: 2,\n                num_threads_blocking: num_cpus - 2,\n            },\n        }\n    }\n}\n\nimpl Default for RuntimesConfig {\n    fn default() -> Self {\n        let num_cpus = crate::num_cpus();\n        Self::with_num_cpus(num_cpus)\n    }\n}\n\nfn start_runtimes(config: RuntimesConfig) -> HashMap<RuntimeType, Runtime> {\n    let mut runtimes = HashMap::with_capacity(2);\n\n    let disable_lifo_slot = crate::get_bool_from_env(\"QW_DISABLE_TOKIO_LIFO_SLOT\", true);\n\n    let mut blocking_runtime_builder = tokio::runtime::Builder::new_multi_thread();\n    if disable_lifo_slot {\n        blocking_runtime_builder.disable_lifo_slot();\n    }\n    let blocking_runtime = blocking_runtime_builder\n        .worker_threads(config.num_threads_blocking)\n        .thread_name_fn(|| {\n            static ATOMIC_ID: AtomicUsize = AtomicUsize::new(0);\n            let id = ATOMIC_ID.fetch_add(1, Ordering::AcqRel);\n            format!(\"blocking-{id}\")\n        })\n        .enable_all()\n        .build()\n        .unwrap();\n\n    scrape_tokio_runtime_metrics(blocking_runtime.handle(), \"blocking\");\n    runtimes.insert(RuntimeType::Blocking, blocking_runtime);\n\n    let non_blocking_runtime = tokio::runtime::Builder::new_multi_thread()\n        .worker_threads(config.num_threads_non_blocking)\n        .thread_name_fn(|| {\n            static ATOMIC_ID: AtomicUsize = AtomicUsize::new(0);\n            let id = ATOMIC_ID.fetch_add(1, Ordering::AcqRel);\n            format!(\"non-blocking-{id}\")\n        })\n        .enable_all()\n        .build()\n        .unwrap();\n\n    scrape_tokio_runtime_metrics(non_blocking_runtime.handle(), \"non_blocking\");\n    runtimes.insert(RuntimeType::NonBlocking, non_blocking_runtime);\n\n    runtimes\n}\n\npub fn initialize_runtimes(runtimes_config: RuntimesConfig) -> anyhow::Result<()> {\n    RUNTIMES.get_or_init(|| start_runtimes(runtimes_config));\n    Ok(())\n}\n\nimpl 
RuntimeType {\n    pub fn get_runtime_handle(self) -> tokio::runtime::Handle {\n        RUNTIMES\n            .get_or_init(|| {\n                #[cfg(any(test, feature = \"testsuite\"))]\n                {\n                    tracing::warn!(\"starting Tokio actor runtimes for tests\");\n                    start_runtimes(RuntimesConfig::light_for_tests())\n                }\n                #[cfg(not(any(test, feature = \"testsuite\")))]\n                {\n                    panic!(\"Tokio runtimes not initialized. Please, report this issue on GitHub: https://github.com/quickwit-oss/quickwit/issues.\");\n                }\n            })\n            .get(&self)\n            .unwrap()\n            .handle()\n            .clone()\n    }\n}\n\n/// Spawns a background task\npub fn scrape_tokio_runtime_metrics(handle: &tokio::runtime::Handle, label: &'static str) {\n    let runtime_monitor = RuntimeMonitor::new(handle);\n    handle.spawn(async move {\n        let mut interval = tokio::time::interval(Duration::from_secs(1));\n        let mut prometheus_runtime_metrics = PrometheusRuntimeMetrics::new(label);\n\n        for tokio_runtime_metrics in runtime_monitor.intervals() {\n            interval.tick().await;\n            prometheus_runtime_metrics.update(&tokio_runtime_metrics);\n        }\n    });\n}\n\nstruct PrometheusRuntimeMetrics {\n    scheduled_tasks: IntGauge,\n    worker_busy_duration_milliseconds_total: IntCounter,\n    worker_busy_ratio: Gauge,\n    worker_threads: IntGauge,\n}\n\nimpl PrometheusRuntimeMetrics {\n    pub fn new(label: &'static str) -> Self {\n        Self {\n            scheduled_tasks: new_gauge(\n                \"tokio_scheduled_tasks\",\n                \"The total number of tasks currently scheduled in workers' local queues.\",\n                \"runtime\",\n                &[(\"runtime_type\", label)],\n            ),\n            worker_busy_duration_milliseconds_total: new_counter(\n                
\"tokio_worker_busy_duration_milliseconds_total\",\n                \" The total amount of time worker threads were busy.\",\n                \"runtime\",\n                &[(\"runtime_type\", label)],\n            ),\n            worker_busy_ratio: new_float_gauge(\n                \"tokio_worker_busy_ratio\",\n                \"The ratio of time worker threads were busy since the last time runtime metrics \\\n                 were collected.\",\n                \"runtime\",\n                &[(\"runtime_type\", label)],\n            ),\n            worker_threads: new_gauge(\n                \"tokio_worker_threads\",\n                \"The number of worker threads used by the runtime.\",\n                \"runtime\",\n                &[(\"runtime_type\", label)],\n            ),\n        }\n    }\n\n    pub fn update(&mut self, runtime_metrics: &RuntimeMetrics) {\n        self.scheduled_tasks\n            .set(runtime_metrics.total_local_queue_depth as i64);\n        self.worker_busy_duration_milliseconds_total\n            .inc_by(runtime_metrics.total_busy_duration.as_millis() as u64);\n        self.worker_busy_ratio.set(runtime_metrics.busy_ratio());\n        self.worker_threads\n            .set(runtime_metrics.workers_count as i64);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_runtimes_config_default() {\n        let runtime_default = RuntimesConfig::default();\n        assert!(runtime_default.num_threads_non_blocking <= runtime_default.num_threads_blocking);\n        assert!(runtime_default.num_threads_non_blocking <= 2);\n    }\n\n    #[test]\n    fn test_runtimes_with_given_num_cpus_10() {\n        let runtime = RuntimesConfig::with_num_cpus(10);\n        assert_eq!(runtime.num_threads_blocking, 8);\n        assert_eq!(runtime.num_threads_non_blocking, 2);\n    }\n\n    #[test]\n    fn test_runtimes_with_given_num_cpus_3() {\n        let runtime = RuntimesConfig::with_num_cpus(3);\n        
assert_eq!(runtime.num_threads_blocking, 3);\n        assert_eq!(runtime.num_threads_non_blocking, 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/shared_consts.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::OnceLock;\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\nuse tracing::warn;\n\n/// Field name reserved for storing the dynamically indexed fields.\npub const FIELD_PRESENCE_FIELD_NAME: &str = \"_field_presence\";\n\npub const MINIMUM_DELETION_GRACE_PERIOD: Duration = Duration::from_secs(5 * 60); // 5mn\nconst MAXIMUM_DELETION_GRACE_PERIOD: Duration = Duration::from_secs(2 * 24 * 3600); // 2 days\n\n/// We cannot safely delete splits right away as a:\n/// - in-flight queries could actually have selected this split,\n/// - scroll queries may also have a point in time on these splits.\n///\n/// We deal this probably by introducing a grace period. A split is first marked as delete,\n/// and hence won't be selected for search. 
After a few minutes, once it is reasonably safe to assume\n/// that all queries involving this split have terminated, we effectively delete the split.\n/// This duration is controlled by the `QW_SPLIT_DELETION_GRACE_PERIOD_SECS` environment variable.\npub fn split_deletion_grace_period() -> Duration {\n    const DEFAULT_DELETION_GRACE_PERIOD: Duration = Duration::from_secs(60 * 32); // 32 min\n\n    static SPLIT_DELETION_GRACE_PERIOD_SECS_LOCK: OnceLock<Duration> = std::sync::OnceLock::new();\n    *SPLIT_DELETION_GRACE_PERIOD_SECS_LOCK.get_or_init(|| {\n        let deletion_grace_period_secs: u64 = crate::get_from_env(\n            \"QW_SPLIT_DELETION_GRACE_PERIOD_SECS\",\n            DEFAULT_DELETION_GRACE_PERIOD.as_secs(),\n            false,\n        );\n        let deletion_grace_period_secs_clamped: u64 = deletion_grace_period_secs.clamp(\n            MINIMUM_DELETION_GRACE_PERIOD.as_secs(),\n            MAXIMUM_DELETION_GRACE_PERIOD.as_secs(),\n        );\n        if deletion_grace_period_secs_clamped != deletion_grace_period_secs {\n            warn!(\n                \"The deletion grace period is clamped to {} seconds. 
The provided value was {} \\\n                 seconds.\",\n                deletion_grace_period_secs_clamped, deletion_grace_period_secs\n            );\n        }\n        Duration::from_secs(deletion_grace_period_secs_clamped)\n    })\n}\n\n/// In order to amortize the cost of search with scroll, we fetch more documents than are\n/// being requested.\npub const SCROLL_BATCH_LEN: usize = 1_000;\n\n/// Key prefix used in chitchat to broadcast the list of primary shards hosted by a leader.\npub const INGESTER_PRIMARY_SHARDS_PREFIX: &str = \"ingester.primary_shards:\";\n\n/// Key used in chitchat to broadcast the status of an ingester.\npub const INGESTER_STATUS_KEY: &str = \"ingester.status\";\n\n/// Prefix used in chitchat to broadcast per-source ingester capacity scores and open shard counts.\npub const INGESTER_CAPACITY_SCORE_PREFIX: &str = \"ingester.capacity_score:\";\n\n/// File name for the encoded list of fields in the split.\npub const SPLIT_FIELDS_FILE_NAME: &str = \"split_fields\";\n\n/// More or less the indexing throughput of a core\n/// i.e. PIPELINE_THROUGHPUT / PIPELINE_FULL_CAPACITY\npub const DEFAULT_SHARD_THROUGHPUT_LIMIT: ByteSize = ByteSize::mib(5);\n/// Large enough to absorb small bursts but should remain defensive against unbalanced shards.\npub const DEFAULT_SHARD_BURST_LIMIT: ByteSize = ByteSize::mib(50);\n\n/// A compromise between \"exponential\" scale up and moderate shard count increase.\npub const DEFAULT_SHARD_SCALE_UP_FACTOR: f32 = 1.5;\n\n// (Just a reexport).\npub use bytesize::MIB;\n"
  },
  {
    "path": "quickwit/quickwit-common/src/socket_addr_legacy_hash.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::hash::Hasher;\nuse std::net::SocketAddr;\n\n/// Computes the hash of socket addr, the way it was done before Rust 1.81\n///\n/// In <https://github.com/rust-lang/rust/commit/ba620344301aaa3b2733575a0696cdfd877edbdf>\n/// rustc change the implementation of Hash for IpAddr v4 and v6.\n///\n/// The idea was to not hash an array of bytes but instead interpret it as a register\n/// and hash this.\n///\n/// This was done for performance reason, but this change the result of the hash function\n/// used to compute affinity in quickwit. 
As a result, the switch would invalidate all\n/// existing caches.\n///\n/// In order to avoid this, we introduce the following wrapper type that reproduces the old\n/// behavior.\n#[repr(transparent)]\n#[derive(Debug, Eq, PartialEq, Copy, Clone)]\npub struct SocketAddrLegacyHash<'a>(pub &'a SocketAddr);\n\nimpl std::hash::Hash for SocketAddrLegacyHash<'_> {\n    fn hash<H: Hasher>(&self, state: &mut H) {\n        std::mem::discriminant(self.0).hash(state);\n        match self.0 {\n            SocketAddr::V4(socket_addr_v4) => {\n                socket_addr_v4.ip().octets().hash(state);\n                socket_addr_v4.port().hash(state);\n            }\n            SocketAddr::V6(socket_addr_v6) => {\n                socket_addr_v6.ip().octets().hash(state);\n                socket_addr_v6.port().hash(state);\n                socket_addr_v6.flowinfo().hash(state);\n                socket_addr_v6.scope_id().hash(state);\n            }\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::net::SocketAddrV6;\n\n    use super::*;\n\n    fn sample_socket_addr_v4() -> SocketAddr {\n        \"17.12.15.3:1834\".parse().unwrap()\n    }\n\n    fn sample_socket_addr_v6() -> SocketAddr {\n        let mut socket_addr_v6: SocketAddrV6 = \"[fe80::240:63ff:fede:3c19]:8080\".parse().unwrap();\n        socket_addr_v6.set_scope_id(4047u32);\n        socket_addr_v6.set_flowinfo(303u32);\n        socket_addr_v6.into()\n    }\n\n    fn compute_hash(hashable: impl std::hash::Hash) -> u64 {\n        // I wish I could have used the sip hasher but we don't have the deps here and I did\n        // not want to move that code to quickwit-common.\n        //\n        // If tests break because Rust changed its default hasher, we can just update the tests in\n        // this file with the new values.\n        let mut hasher = siphasher::sip::SipHasher::default();\n        hashable.hash(&mut hasher);\n        hasher.finish()\n    }\n\n    #[test]\n    fn test_legacy_hash_socket_addr_v4() {\n   
     let h = compute_hash(SocketAddrLegacyHash(&sample_socket_addr_v4()));\n        // This value is coming from using rust 1.80 to hash socket addr\n        assert_eq!(h, 8725442259486497862);\n    }\n\n    #[test]\n    fn test_legacy_hash_socket_addr_v6() {\n        let h = compute_hash(SocketAddrLegacyHash(&sample_socket_addr_v6()));\n        // This value is coming from using rust 1.80 to hash socket addr\n        assert_eq!(h, 14277248675058176752);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/sorted_iter.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Ordering;\nuse std::collections::{btree_map, btree_set};\nuse std::iter::Peekable;\n\n/// Marks sorted iterators, typically iterators over [`btree_set::BTreeSet`] and\n/// [`btree_map::BTreeMap`].\ntrait Sorted {}\n\n/// Defines helper methods on sorted iterators.\npub trait SortedIterator: Iterator + Sized {\n    /// Compares two sorted iterators and returns the diff.\n    fn diff<U>(self, other: U) -> DiffIterator<Self, U>\n    where U: SortedIterator<Item = Self::Item> {\n        DiffIterator {\n            left: self.peekable(),\n            right: other.peekable(),\n        }\n    }\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub enum Diff<K> {\n    Added(K),\n    Unchanged(K),\n    Removed(K),\n}\n\npub struct DiffIterator<T: Iterator, U: Iterator> {\n    left: Peekable<T>,\n    right: Peekable<U>,\n}\n\nimpl<T, U, K> Iterator for DiffIterator<T, U>\nwhere\n    T: Iterator<Item = K>,\n    U: Iterator<Item = K>,\n    K: Ord,\n{\n    type Item = Diff<K>;\n\n    fn next(&mut self) -> Option<Self::Item> {\n        match (self.left.peek(), self.right.peek()) {\n            (Some(left), Some(right)) => match left.cmp(right) {\n                Ordering::Less => {\n                    let left = self\n                        .left\n                        .next()\n                        .expect(\"The left iterator should not be 
empty.\");\n                    Some(Diff::Removed(left))\n                }\n                Ordering::Equal => {\n                    let left = self\n                        .left\n                        .next()\n                        .expect(\"The left iterator should not be empty.\");\n                    self.right.next();\n                    Some(Diff::Unchanged(left))\n                }\n                Ordering::Greater => {\n                    let right = self\n                        .right\n                        .next()\n                        .expect(\"The right iterator should not be empty.\");\n                    Some(Diff::Added(right))\n                }\n            },\n            (Some(_), None) => {\n                let left = self\n                    .left\n                    .next()\n                    .expect(\"The left iterator should not be empty.\");\n                Some(Diff::Removed(left))\n            }\n            (None, Some(_)) => {\n                let right = self\n                    .right\n                    .next()\n                    .expect(\"The right iterator should not be empty.\");\n                Some(Diff::Added(right))\n            }\n            (None, None) => None,\n        }\n    }\n}\n\nimpl<T> SortedIterator for T where T: Iterator + Sorted {}\n\nimpl<K, V> Sorted for btree_map::IntoKeys<K, V> {}\nimpl<K, V> Sorted for btree_map::IntoValues<K, V> {}\nimpl<K, V> Sorted for btree_map::Keys<'_, K, V> {}\nimpl<K, V> Sorted for btree_map::Values<'_, K, V> {}\nimpl<K> Sorted for btree_set::IntoIter<K> {}\nimpl<K> Sorted for btree_set::Iter<'_, K> {}\n\n/// Same as [`SortedIterator`] but for (key, value) pairs sorted by key.\npub trait SortedByKeyIterator<K, V>: Iterator + Sized {\n    /// Compares the keys of two sorted key-value iterators and returns the diff.\n    fn diff_by_key<U, W>(self, other: U) -> DiffByKeyIterator<Self, U>\n    where U: SortedByKeyIterator<K, W> {\n        DiffByKeyIterator 
{\n            left: self.peekable(),\n            right: other.peekable(),\n        }\n    }\n}\n\n#[derive(Debug, Eq, PartialEq)]\npub enum KeyDiff<K, V, W> {\n    Added(K, W),\n    Unchanged(K, V, W),\n    Removed(K, V),\n}\n\npub struct DiffByKeyIterator<T: Iterator, U: Iterator> {\n    left: Peekable<T>,\n    right: Peekable<U>,\n}\n\nimpl<T, U, K, V, W> Iterator for DiffByKeyIterator<T, U>\nwhere\n    T: Iterator<Item = (K, V)>,\n    U: Iterator<Item = (K, W)>,\n    K: Ord,\n{\n    type Item = KeyDiff<K, V, W>;\n\n    fn next(&mut self) -> Option<Self::Item> {\n        match (self.left.peek(), self.right.peek()) {\n            (Some((left_key, _)), Some((right_key, _))) => match left_key.cmp(right_key) {\n                Ordering::Less => {\n                    let (left_key, left_value) = self\n                        .left\n                        .next()\n                        .expect(\"The left iterator should not be empty.\");\n                    Some(KeyDiff::Removed(left_key, left_value))\n                }\n                Ordering::Equal => {\n                    let (left_key, left_value) = self\n                        .left\n                        .next()\n                        .expect(\"The left iterator should not be empty.\");\n                    let (_, right_value) = self\n                        .right\n                        .next()\n                        .expect(\"The right iterator should not be empty.\");\n                    Some(KeyDiff::Unchanged(left_key, left_value, right_value))\n                }\n                Ordering::Greater => {\n                    let (right_key, right_value) = self\n                        .right\n                        .next()\n                        .expect(\"The right iterator should not be empty.\");\n                    Some(KeyDiff::Added(right_key, right_value))\n                }\n            },\n            (Some(_), None) => {\n                let (left_key, left_value) = self\n     
               .left\n                    .next()\n                    .expect(\"The left iterator should not be empty.\");\n                Some(KeyDiff::Removed(left_key, left_value))\n            }\n            (None, Some(_)) => {\n                let (right_key, right_value) = self\n                    .right\n                    .next()\n                    .expect(\"The right iterator should not be empty.\");\n                Some(KeyDiff::Added(right_key, right_value))\n            }\n            (None, None) => None,\n        }\n    }\n}\n\nimpl<T, K, V> SortedByKeyIterator<K, V> for T where T: Iterator<Item = (K, V)> + Sorted {}\n\nimpl<K, V> Sorted for btree_map::IntoIter<K, V> {}\nimpl<K, V> Sorted for btree_map::Iter<'_, K, V> {}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::{BTreeMap, BTreeSet};\n\n    use super::*;\n\n    #[test]\n    fn test_diff() {\n        {\n            let left: BTreeSet<u64> = Vec::new().into_iter().collect();\n            let right: BTreeSet<u64> = Vec::new().into_iter().collect();\n            let diff: Vec<_> = left.iter().diff(right.iter()).collect();\n            assert_eq!(diff, Vec::new());\n        }\n        {\n            let left: BTreeSet<_> = vec![1].into_iter().collect();\n            let right: BTreeSet<_> = Vec::new().into_iter().collect();\n            let diff: Vec<_> = left.iter().diff(right.iter()).collect();\n            assert_eq!(diff, vec![Diff::Removed(&1)]);\n        }\n        {\n            let left: BTreeSet<_> = Vec::new().into_iter().collect();\n            let right: BTreeSet<_> = vec![1].into_iter().collect();\n            let diff: Vec<_> = left.iter().diff(right.iter()).collect();\n            assert_eq!(diff, vec![Diff::Added(&1)]);\n        }\n        {\n            let left: BTreeSet<_> = vec![1].into_iter().collect();\n            let right: BTreeSet<_> = vec![1].into_iter().collect();\n            let diff: Vec<_> = left.iter().diff(right.iter()).collect();\n            
assert_eq!(diff, vec![Diff::Unchanged(&1)]);\n        }\n        {\n            let left: BTreeSet<_> = vec![1, 3, 5, 7].into_iter().collect();\n            let right: BTreeSet<_> = vec![2, 4, 5, 6].into_iter().collect();\n            let diff: Vec<_> = left.iter().diff(right.iter()).collect();\n            assert_eq!(\n                diff,\n                vec![\n                    Diff::Removed(&1),\n                    Diff::Added(&2),\n                    Diff::Removed(&3),\n                    Diff::Added(&4),\n                    Diff::Unchanged(&5),\n                    Diff::Added(&6),\n                    Diff::Removed(&7),\n                ]\n            );\n        }\n    }\n\n    #[test]\n    fn test_diff_by_key() {\n        {\n            let left: BTreeMap<u64, u64> = Vec::new().into_iter().collect();\n            let right: BTreeMap<u64, u64> = Vec::new().into_iter().collect();\n            let key_diff: Vec<_> = left.iter().diff_by_key(right.iter()).collect();\n            assert_eq!(key_diff, Vec::new());\n        }\n        {\n            let left: BTreeMap<_, _> = vec![(1, 1)].into_iter().collect();\n            let right: BTreeMap<_, &'static str> = Vec::new().into_iter().collect();\n            let key_diff: Vec<_> = left.iter().diff_by_key(right.iter()).collect();\n            assert_eq!(key_diff, vec![KeyDiff::Removed(&1, &1)]);\n        }\n        {\n            let left: BTreeMap<_, usize> = Vec::new().into_iter().collect();\n            let right: BTreeMap<_, _> = vec![(1, \"a\")].into_iter().collect();\n            let key_diff: Vec<_> = left.iter().diff_by_key(right.iter()).collect();\n            assert_eq!(key_diff, vec![KeyDiff::Added(&1, &\"a\")]);\n        }\n        {\n            let left: BTreeMap<_, _> = vec![(1, 11)].into_iter().collect();\n            let right: BTreeMap<_, _> = vec![(1, \"a\")].into_iter().collect();\n            let key_diff: Vec<_> = left.iter().diff_by_key(right.iter()).collect();\n            
assert_eq!(key_diff, vec![KeyDiff::Unchanged(&1, &11, &\"a\")]);\n        }\n        {\n            let left: BTreeMap<_, _> = vec![(1, 1), (3, 3), (5, 5), (7, 7)].into_iter().collect();\n            let right: BTreeMap<_, _> = vec![(2, \"b\"), (4, \"d\"), (5, \"e\"), (6, \"f\")]\n                .into_iter()\n                .collect();\n            let key_diff: Vec<_> = left.iter().diff_by_key(right.iter()).collect();\n            assert_eq!(\n                key_diff,\n                vec![\n                    KeyDiff::Removed(&1, &1),\n                    KeyDiff::Added(&2, &\"b\"),\n                    KeyDiff::Removed(&3, &3),\n                    KeyDiff::Added(&4, &\"d\"),\n                    KeyDiff::Unchanged(&5, &5, &\"e\"),\n                    KeyDiff::Added(&6, &\"f\"),\n                    KeyDiff::Removed(&7, &7),\n                ]\n            );\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/stream_utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::TypeId;\nuse std::fmt;\nuse std::pin::Pin;\n\nuse bytesize::ByteSize;\nuse futures::{Stream, StreamExt, TryStreamExt, stream};\nuse prometheus::IntGauge;\nuse tokio::sync::{mpsc, watch};\nuse tokio_stream::wrappers::{ReceiverStream, UnboundedReceiverStream, WatchStream};\nuse tracing::warn;\n\nuse crate::metrics::GaugeGuard;\nuse crate::tower::RpcName;\n\npub type BoxStream<T> = Pin<Box<dyn Stream<Item = T> + Send + Unpin + 'static>>;\n\n/// A stream impl for code-generated services with streaming endpoints.\npub struct ServiceStream<T> {\n    inner: BoxStream<T>,\n}\n\nimpl<T> ServiceStream<T>\nwhere T: Send + 'static\n{\n    pub fn new(inner: BoxStream<T>) -> Self {\n        Self { inner }\n    }\n\n    pub fn empty() -> Self {\n        Self {\n            inner: Box::pin(stream::empty()),\n        }\n    }\n\n    pub fn map<F, U>(self, f: F) -> ServiceStream<U>\n    where\n        F: FnMut(T) -> U + Send + 'static,\n        U: Send + 'static,\n    {\n        ServiceStream {\n            inner: Box::pin(self.inner.map(f)),\n        }\n    }\n}\n\nimpl<T> fmt::Debug for ServiceStream<T>\nwhere T: 'static\n{\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(f, \"ServiceStream<{:?}>\", TypeId::of::<T>())\n    }\n}\n\nimpl<T> Unpin for ServiceStream<T> {}\n\nimpl<T> ServiceStream<T>\nwhere T: Send + 
'static\n{\n    pub fn new_bounded(capacity: usize) -> (mpsc::Sender<T>, Self) {\n        let (sender, receiver) = mpsc::channel(capacity);\n        (sender, receiver.into())\n    }\n\n    pub fn new_bounded_with_gauge(\n        capacity: usize,\n        gauge: &'static IntGauge,\n    ) -> (TrackedSender<T>, Self) {\n        let (sender, receiver) = mpsc::channel(capacity);\n        let tracked_sender = TrackedSender { sender, gauge };\n        let receiver_stream =\n            ReceiverStream::new(receiver).map(|value: InFlightValue<T>| value.into_inner());\n        let service_stream = Self {\n            inner: Box::pin(receiver_stream),\n        };\n        (tracked_sender, service_stream)\n    }\n\n    pub fn new_unbounded() -> (mpsc::UnboundedSender<T>, Self) {\n        let (sender, receiver) = mpsc::unbounded_channel();\n        (sender, receiver.into())\n    }\n\n    pub fn new_unbounded_with_gauge(gauge: &'static IntGauge) -> (TrackedUnboundedSender<T>, Self) {\n        let (sender, receiver) = mpsc::unbounded_channel();\n        let tracked_sender = TrackedUnboundedSender { sender, gauge };\n        let receiver_stream = UnboundedReceiverStream::new(receiver)\n            .map(|value: InFlightValue<T>| value.into_inner());\n        let service_stream = Self {\n            inner: Box::pin(receiver_stream),\n        };\n        (tracked_sender, service_stream)\n    }\n}\n\nimpl<T> ServiceStream<T>\nwhere T: Clone + Send + Sync + 'static\n{\n    pub fn new_watch(init: T) -> (watch::Sender<T>, Self) {\n        let (sender, receiver) = watch::channel(init);\n        (sender, receiver.into())\n    }\n}\n\nimpl<T, E> ServiceStream<Result<T, E>>\nwhere\n    T: Send + 'static,\n    E: Send + 'static,\n{\n    pub fn map_err<F, U>(self, f: F) -> ServiceStream<Result<T, U>>\n    where\n        F: FnMut(E) -> U + Send + 'static,\n        U: Send + 'static,\n    {\n        ServiceStream {\n            inner: Box::pin(self.inner.map_err(f)),\n        }\n    
}\n}\n\nimpl<T> Stream for ServiceStream<T> {\n    type Item = T;\n\n    fn poll_next(\n        mut self: std::pin::Pin<&mut Self>,\n        cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Option<Self::Item>> {\n        Pin::new(&mut self.inner).poll_next(cx)\n    }\n}\n\nimpl<T> From<mpsc::Receiver<T>> for ServiceStream<T>\nwhere T: Send + 'static\n{\n    fn from(receiver: mpsc::Receiver<T>) -> Self {\n        Self {\n            inner: Box::pin(ReceiverStream::new(receiver)),\n        }\n    }\n}\n\nimpl<T> From<mpsc::UnboundedReceiver<T>> for ServiceStream<T>\nwhere T: Send + 'static\n{\n    fn from(receiver: mpsc::UnboundedReceiver<T>) -> Self {\n        Self {\n            inner: Box::pin(UnboundedReceiverStream::new(receiver)),\n        }\n    }\n}\n\nimpl<T> From<watch::Receiver<T>> for ServiceStream<T>\nwhere T: Clone + Send + Sync + 'static\n{\n    fn from(receiver: watch::Receiver<T>) -> Self {\n        Self {\n            inner: Box::pin(WatchStream::new(receiver)),\n        }\n    }\n}\n\n/// Adapts a server-side tonic::Streaming into a ServiceStream of `Result<T, tonic::Status>`. Once\n/// an error is encountered, the stream will be closed and subsequent calls to `poll_next` will\n/// return `None`.\nimpl<T> From<tonic::Streaming<T>> for ServiceStream<Result<T, tonic::Status>>\nwhere T: Send + 'static\n{\n    fn from(streaming: tonic::Streaming<T>) -> Self {\n        Self {\n            inner: Box::pin(streaming),\n        }\n    }\n}\n\n/// Adapts a client-side tonic::Streaming into a ServiceStream of `T`. 
Once an error is encountered,\n/// the stream will be closed and subsequent calls to `poll_next` will return `None`.\nimpl<T> From<tonic::Streaming<T>> for ServiceStream<T>\nwhere T: Send + 'static\n{\n    fn from(streaming: tonic::Streaming<T>) -> Self {\n        let message_stream = stream::unfold(streaming, |mut streaming| {\n            Box::pin(async {\n                match streaming.message().await {\n                    Ok(Some(message)) => Some((message, streaming)),\n                    Ok(None) => None,\n                    Err(error) => {\n                        warn!(error=?error, \"gRPC transport error\");\n                        None\n                    }\n                }\n            })\n        });\n        Self {\n            inner: Box::pin(message_stream),\n        }\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl<T> From<Vec<T>> for ServiceStream<T>\nwhere T: Send + 'static\n{\n    fn from(values: Vec<T>) -> Self {\n        Self {\n            inner: Box::pin(stream::iter(values)),\n        }\n    }\n}\n\nimpl<T> RpcName for ServiceStream<T>\nwhere T: RpcName\n{\n    fn rpc_name() -> &'static str {\n        T::rpc_name()\n    }\n}\n\npub struct InFlightValue<T>(T, #[allow(dead_code)] GaugeGuard<'static>);\n\nimpl<T> fmt::Debug for InFlightValue<T>\nwhere T: fmt::Debug\n{\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(f, \"{:?}\", self.0)\n    }\n}\n\nimpl<T> InFlightValue<T> {\n    pub fn new(value: T, value_size: ByteSize, gauge: &'static IntGauge) -> Self {\n        let mut gauge_guard = GaugeGuard::from_gauge(gauge);\n        gauge_guard.add(value_size.as_u64() as i64);\n\n        Self(value, gauge_guard)\n    }\n\n    pub fn into_inner(self) -> T {\n        self.0\n    }\n}\n\npub struct TrackedSender<T> {\n    sender: mpsc::Sender<InFlightValue<T>>,\n    gauge: &'static IntGauge,\n}\n\nimpl<T> TrackedSender<T> {\n    pub async fn send(\n        &self,\n        value: T,\n        
value_size: ByteSize,\n    ) -> Result<(), mpsc::error::SendError<T>> {\n        self.sender\n            .send(InFlightValue::new(value, value_size, self.gauge))\n            .await\n            .map_err(|send_error| mpsc::error::SendError(send_error.0.0))\n    }\n}\n\npub struct TrackedUnboundedSender<T> {\n    sender: mpsc::UnboundedSender<InFlightValue<T>>,\n    gauge: &'static IntGauge,\n}\n\nimpl<T> TrackedUnboundedSender<T> {\n    pub fn send(&self, value: T, value_size: ByteSize) -> Result<(), mpsc::error::SendError<T>> {\n        self.sender\n            .send(InFlightValue::new(value, value_size, self.gauge))\n            .map_err(|send_error| mpsc::error::SendError(send_error.0.0))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use once_cell::sync::Lazy;\n\n    use super::*;\n    use crate::metrics::new_gauge;\n\n    #[tokio::test]\n    async fn test_service_stream_map() {\n        let mapped_values = ServiceStream::from(vec![0, 1, 2, 3])\n            .map(|x| x * 2)\n            .collect::<Vec<_>>()\n            .await;\n        assert_eq!(mapped_values, vec![0, 2, 4, 6]);\n    }\n\n    #[tokio::test]\n    async fn test_tracked_service_stream_bounded() {\n        static TEST_GAUGE: Lazy<IntGauge> =\n            Lazy::new(|| new_gauge(\"common\", \"help\", \"test_tracked_service_stream_bounded\", &[]));\n\n        let (service_stream_tx, mut service_stream) =\n            ServiceStream::new_bounded_with_gauge(3, &TEST_GAUGE);\n\n        service_stream_tx.send(1, ByteSize(42)).await.unwrap();\n        assert_eq!(TEST_GAUGE.get(), 42);\n\n        service_stream_tx.send(2, ByteSize(1337)).await.unwrap();\n        assert_eq!(TEST_GAUGE.get(), 1379);\n\n        let value = service_stream.next().await.unwrap();\n        assert_eq!(value, 1);\n        assert_eq!(TEST_GAUGE.get(), 1337);\n    }\n\n    #[tokio::test]\n    async fn test_tracked_service_stream_unbounded() {\n        static TEST_GAUGE: Lazy<IntGauge> = Lazy::new(|| {\n            new_gauge(\n           
     \"common\",\n                \"help\",\n                \"test_tracked_service_stream_unbounded\",\n                &[],\n            )\n        });\n\n        let (service_stream_tx, mut service_stream) =\n            ServiceStream::new_unbounded_with_gauge(&TEST_GAUGE);\n\n        service_stream_tx.send(1, ByteSize(42)).unwrap();\n        assert_eq!(TEST_GAUGE.get(), 42);\n\n        service_stream_tx.send(2, ByteSize(1337)).unwrap();\n        assert_eq!(TEST_GAUGE.get(), 1379);\n\n        let value = service_stream.next().await.unwrap();\n        assert_eq!(value, 1);\n        assert_eq!(TEST_GAUGE.get(), 1337);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/temp_dir.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse tempfile::TempDir;\nuse tokio::fs;\n\nuse crate::ignore_error_kind;\n\nconst MAX_LENGTH: usize = 255;\n\nconst SEPARATOR: char = '%';\n\nconst NUM_RAND_CHARS: usize = 6;\n\n/// Creates the specified directory. If the directory already exists, deletes its contents.\npub async fn create_or_purge_directory(path: &Path) -> io::Result<PathBuf> {\n    // Delete if exists and recreate scratch directory.\n    ignore_error_kind!(io::ErrorKind::NotFound, fs::remove_dir_all(path).await)?;\n    fs::create_dir_all(path).await?;\n    Ok(path.to_path_buf())\n}\n\n/// A temporary directory. This directory is deleted when the object is dropped.\n#[derive(Debug, Clone)]\npub struct TempDirectory {\n    inner: Arc<TempDir>,\n    _parent: Option<Arc<TempDir>>,\n}\n\nimpl TempDirectory {\n    /// A path where the temporary directory is pointing to.\n    pub fn path(&self) -> &Path {\n        self.inner.path()\n    }\n\n    /// Creates a new temporary directory with the current temporary directory.\n    /// The new directory keeps a pointer to the parent directory to perevent it\n    /// from premature deletion. 
The directory is deleted when the object is dropped.\n    pub fn named_temp_child(&self, prefix: &str) -> io::Result<TempDirectory> {\n        Ok(TempDirectory {\n            inner: Arc::new(\n                tempfile::Builder::new()\n                    .prefix(prefix)\n                    .tempdir_in(self.path())?,\n            ),\n            _parent: Some(self.inner.clone()),\n        })\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        Builder::default().tempdir().unwrap()\n    }\n}\n\n/// A temporary directory builder.\n#[derive(Debug)]\npub struct Builder<'a> {\n    parts: Vec<&'a str>,\n    max_length: usize,\n    separator: char,\n    num_rand_chars: usize,\n}\n\nimpl Default for Builder<'_> {\n    fn default() -> Self {\n        Self {\n            parts: Default::default(),\n            max_length: MAX_LENGTH,\n            separator: SEPARATOR,\n            num_rand_chars: NUM_RAND_CHARS,\n        }\n    }\n}\n\nimpl<'a> Builder<'a> {\n    /// Specifies the number of random bytes to add at the end of the directory name. Default is 6.\n    pub fn rand_bytes(&mut self, rand: usize) -> &mut Self {\n        self.num_rand_chars = rand;\n        self\n    }\n\n    /// Specifies the maximum length of the directory name in characters. 
Default is 255 characters.\n    pub fn max_length(&mut self, max_length: usize) -> &mut Self {\n        self.max_length = max_length;\n        self\n    }\n\n    /// Adds a part to the directory name prefix. Empty parts are ignored.\n    pub fn join(&mut self, name: &'a str) -> &mut Self {\n        if !name.is_empty() {\n            self.parts.push(name.as_ref());\n        }\n        self\n    }\n\n    fn push_str(buffer: &mut String, addition: &'a str, size: usize) -> usize {\n        let len = addition.len();\n        if len <= size {\n            buffer.push_str(addition);\n            return len;\n        } else if size < 3 {\n            buffer.push_str(&addition[0..size]);\n        } else {\n            let half = size - size / 2;\n            buffer.push_str(&addition[0..half - 1]);\n            buffer.push_str(\"..\");\n            buffer.push_str(&addition[addition.len() - (size - half) + 1..]);\n        }\n        size\n    }\n\n    /// Constructs the prefix from the parts specified by the join function.\n    /// If parts are small enough they will be simply concatenated with the\n    /// separator character in between. If parts are too large they will be\n    /// truncated by replacing the middle of each part with \"..\". 
The resulting\n    /// string will be at most max_length characters long.\n    fn prefix(&self) -> io::Result<String> {\n        if self.parts.is_empty() {\n            return Ok(String::new());\n        }\n        let separator_count = if self.num_rand_chars > 0 {\n            self.parts.len()\n        } else {\n            self.parts.len() - 1\n        };\n        // We want to preserve at least one letter from each part with separators.\n        if self.max_length < self.parts.len() + separator_count + self.num_rand_chars {\n            return Err(io::Error::new(\n                io::ErrorKind::InvalidInput,\n                \"the filename limit is too small\",\n            ));\n        }\n        // Calculate how many characters from the parts we can use in the final string.\n        let len_without_separators = self.max_length - separator_count - self.num_rand_chars;\n        // Calculate how many characters per part we can use.\n        let average_len = len_without_separators / self.parts.len();\n        // Account for the fact that the average length may not be a whole number.\n        let mut leftovers = len_without_separators % self.parts.len();\n        // We will have some long parts and some short parts. The short parts (parts shorter\n        // than average) can \"donate\" their space to the large parts. That allows us to\n        // use all available space. 
In this loop we are counting how many characters large\n        // parts can use in addition to the average.\n        for part in &self.parts {\n            if part.len() <= average_len {\n                // Adjust the available length from the parts that are shorter\n                leftovers += average_len - part.len();\n            }\n        }\n        // Build the final string by concatenating the parts while cutting them to the desired\n        // length.\n        let mut buf = String::new();\n        for (i, part) in self.parts.iter().enumerate() {\n            if part.len() <= average_len {\n                // If the part is shorter than the average - we just add it\n                Self::push_str(&mut buf, part, average_len);\n            } else {\n                // If the part is longer than the average - we can cut it down to average_len +\n                // leftovers\n                let pushed = Self::push_str(&mut buf, part, average_len + leftovers) - average_len;\n                // We now need to adjust leftovers by the number of additional characters that we\n                // pushed above average_len\n                leftovers -= pushed;\n            }\n            // The last separator is only added if there are random bytes at the end\n            if i < self.parts.len() - 1 || self.num_rand_chars > 0 {\n                buf.push(self.separator)\n            }\n        }\n        Ok(buf)\n    }\n\n    /// Creates a temporary directory in the temp directory of the operating system\n    pub fn tempdir(&self) -> io::Result<TempDirectory> {\n        Ok(TempDirectory {\n            inner: Arc::new(\n                tempfile::Builder::new()\n                    .rand_bytes(self.num_rand_chars)\n                    .prefix(&self.prefix()?)\n                    .tempdir()?,\n            ),\n            _parent: None,\n        })\n    }\n\n    /// Creates a temporary directory in the specified directory\n    pub fn tempdir_in<P: AsRef<Path>>(&self, dir: P) 
-> io::Result<TempDirectory> {\n        Ok(TempDirectory {\n            inner: Arc::new(\n                tempfile::Builder::new()\n                    .rand_bytes(self.num_rand_chars)\n                    .prefix(&self.prefix()?)\n                    .tempdir_in(dir)?,\n            ),\n            _parent: None,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::cmp;\n\n    use rand::Rng;\n\n    use super::*;\n\n    #[test]\n    fn test_push_str() {\n        assert_truncate(\"abcdef\", 100, \"abcdef\", 6);\n        assert_truncate(\"abcdef\", 6, \"abcdef\", 6);\n        assert_truncate(\"abcdefghijklmnopqrstuvwxyz\", 5, \"ab..z\", 5);\n        assert_truncate(\"abcdefghijklmnopqrstuvwxyz\", 4, \"a..z\", 4);\n        assert_truncate(\"abcdefghijklmnopqrstuvwxyz\", 3, \"a..\", 3);\n        assert_truncate(\"abcdefghijklmnopqrstuvwxyz\", 2, \"ab\", 2);\n        assert_truncate(\"abcdefghijklmnopqrstuvwxyz\", 1, \"a\", 1);\n        assert_truncate(\"abcde\", 10, \"abcde\", 5);\n        assert_truncate(\"abcde\", 5, \"abcde\", 5);\n        assert_truncate(\"abcde\", 4, \"a..e\", 4);\n        assert_truncate(\"abcde\", 3, \"a..\", 3);\n        assert_truncate(\"abcde\", 2, \"ab\", 2);\n        assert_truncate(\"abcde\", 1, \"a\", 1);\n    }\n\n    fn assert_truncate(addition: &str, size: usize, expected_addition: &str, expected_size: usize) {\n        let mut buf = String::new();\n        let size = Builder::push_str(&mut buf, addition, size);\n        assert_eq!(expected_addition, buf);\n        assert_eq!(expected_size, size);\n    }\n\n    #[test]\n    fn test_random_failures() {\n        assert_prefix(\n            vec![\"AAAAAAAAAA\", \"AA\", \"AAAAAAA\", \"AAAA\", \"AAAA\", \"AAAAAAAAA\"],\n            35,\n            \"AAAAAAAAAA%AA%AA..A%AAAA%AAAA%A..A%\",\n        );\n        assert_prefix(\n            vec![\n                \"AAAAAA\",\n                \"AAAAAAAAA\",\n                \"AAAAAA\",\n                \"AAA\",\n                
\"A\",\n                \"AAAAAAA\",\n                \"AAAAAAAAAA\",\n                \"AAAAA\",\n            ],\n            55,\n            \"AAAAAA%AAAAAAAAA%AAAAAA%AAA%A%AAAAAAA%AAAAAAAAAA%AAAAA%\",\n        );\n        assert_prefix(\n            vec![\"AAAAAAAAA\", \"\", \"AAAAAAA\", \"AAAAAAA\"],\n            25,\n            \"AAA..AAA%AAAAAAA%AAAAAAA%\",\n        );\n    }\n\n    #[test]\n    fn test_prefix() {\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 15, \"0%abcde%uvwxyz%\");\n\n        assert_prefix(vec![\"a\", \"b\"], 100, \"a%b%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 100, \"abcde%uvwxyz%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 13, \"abcde%uvwxyz%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 12, \"abcde%uv..z%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 11, \"abcde%u..z%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 10, \"a..e%u..z%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 9, \"a..e%u..%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 8, \"a..%u..%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 7, \"a..%uv%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 6, \"ab%uv%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 5, \"ab%u%\");\n        assert_prefix(vec![\"abcde\", \"uvwxyz\"], 4, \"a%u%\");\n        assert_prefix_err(\n            \"the filename limit is too small\",\n            vec![\"abcde\", \"uvwxyz\"],\n            3,\n        );\n\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 15, \"0%abcde%uvwxyz%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 14, \"0%abcde%uv..z%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 13, \"0%abcde%u..z%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 12, \"0%abcde%u..%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 11, \"0%abcde%uv%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 10, 
\"0%a..e%uv%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 9, \"0%a..%uv%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 8, \"0%a..%u%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 7, \"0%ab%u%\");\n        assert_prefix(vec![\"0\", \"abcde\", \"uvwxyz\"], 6, \"0%a%u%\");\n        assert_prefix_err(\n            \"the filename limit is too small\",\n            vec![\"0\", \"abcde\", \"uvwxyz\"],\n            5,\n        );\n    }\n\n    fn assert_prefix(parts: Vec<&str>, size: usize, expected_path: &str) {\n        let mut builder = Builder::default();\n        builder.rand_bytes(5);\n        builder.max_length(size + 5); // Size of random suffix\n        for part in parts.iter() {\n            builder.join(part);\n        }\n        let prefix = builder.prefix().unwrap();\n        assert_eq!(expected_path, prefix, \"parts: {parts:?} len: {size:?}\");\n    }\n\n    fn assert_prefix_err(expected_err: &str, parts: Vec<&str>, size: usize) {\n        let mut builder = Builder::default();\n        builder.rand_bytes(5);\n        builder.max_length(size + 5); // Size of random suffix\n        for part in parts.iter() {\n            builder.join(part);\n        }\n        let error = builder.prefix().unwrap_err();\n        assert_eq!(expected_err, error.to_string());\n    }\n\n    #[test]\n    fn test_prefix_random() {\n        let mut rng = rand::rng();\n        let template = \"A\".repeat(100);\n        for _ in 0..10000 {\n            let rand_bytes = rng.random_range(0..4);\n            let parts_num = rng.random_range(0..10);\n            let mut builder = Builder::default();\n            builder.rand_bytes(rand_bytes);\n            let mut max_size = 0;\n            for _ in 0..parts_num {\n                let size = 1 + rng.random_range(0..10);\n                builder.join(&template[0..size]);\n                max_size += size + 1;\n            }\n            let separator_count = if rand_bytes > 0 {\n             
   parts_num\n            } else {\n                // no separator at the end\n                if max_size > 0 {\n                    max_size -= 1;\n                    parts_num - 1\n                } else {\n                    parts_num\n                }\n            };\n            let limit_threshold = parts_num + separator_count + rand_bytes;\n            if parts_num > 0 && rng.random() {\n                builder.max_length(rng.random_range(0..limit_threshold));\n                assert_eq!(\n                    \"the filename limit is too small\",\n                    builder.prefix().unwrap_err().to_string()\n                );\n            } else {\n                let len = limit_threshold + rng.random_range(0..100);\n                builder.max_length(len);\n                let builder_debug = format!(\"{builder:?}, len {len}\");\n                let builder_prefix = builder.prefix().unwrap();\n                assert_eq!(\n                    builder_prefix.len(),\n                    cmp::min(len - rand_bytes, max_size),\n                    \"{builder_debug} -> {builder_prefix}\"\n                );\n            }\n        }\n    }\n\n    #[test]\n    fn test_directory_creation_and_removal() {\n        let directory = Builder::default()\n            .join(\"foo\")\n            .join(\"bar\")\n            .join(\"baz\")\n            .rand_bytes(0)\n            .tempdir()\n            .unwrap();\n        assert_eq!(directory.path().file_name().unwrap(), \"foo%bar%baz\");\n        let path = directory.path().to_path_buf();\n        assert!(path.try_exists().unwrap());\n        drop(directory);\n        assert!(!path.try_exists().unwrap());\n    }\n\n    #[test]\n    fn test_directory_creation_and_removal_with_random_bytes() {\n        let directory = Builder::default()\n            .join(\"foo\")\n            .join(\"bar\")\n            .join(\"baz\")\n            .rand_bytes(4)\n            .tempdir()\n            .unwrap();\n        let filename = 
directory.path().file_name().unwrap().to_str().unwrap();\n        assert_eq!(&filename[0..filename.len() - 4], \"foo%bar%baz%\");\n        let path = directory.path().to_path_buf();\n        assert!(path.try_exists().unwrap());\n        drop(directory);\n        assert!(!path.try_exists().unwrap());\n    }\n\n    #[test]\n    fn test_directory_randomness() {\n        let mut directories = Vec::new();\n        let mut paths = Vec::new();\n        let temp_dir = Builder::default().tempdir().unwrap();\n        // Try creating the maximum number of directories for a single random byte\n        // On case-insensitive filesystems we can only have 36 different directories a-z,0-9\n        for _ in 0..36 {\n            let dir = Builder::default()\n                .join(\"test\")\n                .rand_bytes(1)\n                .tempdir_in(temp_dir.path())\n                .unwrap();\n            assert_eq!(dir.path().parent().unwrap(), temp_dir.path());\n            paths.push(dir.path().to_path_buf());\n            directories.push(dir);\n        }\n        for path in paths.iter() {\n            assert!(path.try_exists().unwrap());\n        }\n        drop(directories);\n        for path in paths.iter() {\n            assert!(!path.try_exists().unwrap());\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/test_utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::net::SocketAddr;\nuse std::time::Duration;\n\nuse futures::Future;\nuse hyper::Uri;\nuse tokio::time::error::Elapsed;\nuse tower::Service as _;\n\npub async fn wait_until_predicate<Fut>(\n    predicate: impl Fn() -> Fut,\n    timeout: Duration,\n    retry_interval: Duration,\n) -> Result<(), Elapsed>\nwhere\n    Fut: Future<Output = bool>,\n{\n    tokio::time::timeout(timeout, async move {\n        loop {\n            if predicate().await {\n                break;\n            }\n            tokio::time::sleep(retry_interval).await\n        }\n    })\n    .await\n}\n\n/// Tries to connect at most 3 times to `SocketAddr`.\n/// If not successful, returns an error.\n/// This is a convenient function to wait before sending gRPC requests\n/// to this `SocketAddr`.\npub async fn wait_for_server_ready(socket_addr: SocketAddr) -> anyhow::Result<()> {\n    let mut num_attempts = 0;\n    let max_num_attempts = 10;\n    let uri = Uri::builder()\n        .scheme(\"http\")\n        .authority(socket_addr.to_string().as_str())\n        .path_and_query(\"/\")\n        .build()?;\n\n    while num_attempts < max_num_attempts {\n        tokio::time::sleep(Duration::from_millis(50 * (num_attempts + 1))).await;\n        let mut http = hyper_util::client::legacy::connect::HttpConnector::new();\n        match http.call(uri.clone()).await {\n            
Ok(_) => break,\n            Err(_) => {\n                println!(\n                    \"Failed to connect to `{}`, retrying {}/{}\",\n                    socket_addr,\n                    num_attempts + 1,\n                    max_num_attempts\n                );\n                num_attempts += 1;\n            }\n        }\n    }\n    if num_attempts == max_num_attempts {\n        anyhow::bail!(\"too many attempts to connect to `{}`\", socket_addr);\n    }\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/thread_pool.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::sync::Arc;\n\nuse futures::{Future, TryFutureExt};\nuse once_cell::sync::Lazy;\nuse prometheus::IntGauge;\nuse tokio::sync::oneshot;\nuse tracing::error;\n\nuse crate::metrics::{GaugeGuard, IntGaugeVec, OwnedGaugeGuard, new_gauge_vec};\n\n/// An executor backed by a thread pool to run CPU-intensive tasks.\n///\n/// tokio::spawn_blocking should only used for IO-bound tasks, as it has not limit on its\n/// thread count.\n#[derive(Clone)]\npub struct ThreadPool {\n    thread_pool: Arc<rayon::ThreadPool>,\n    ongoing_tasks: IntGauge,\n    pending_tasks: IntGauge,\n}\n\nimpl ThreadPool {\n    pub fn new(name: &'static str, num_threads_opt: Option<usize>) -> ThreadPool {\n        let mut rayon_pool_builder = rayon::ThreadPoolBuilder::new()\n            .thread_name(move |thread_id| format!(\"quickwit-{name}-{thread_id}\"))\n            .panic_handler(move |_my_panic| {\n                error!(\"task running in the quickwit {name} thread pool panicked\");\n            });\n        if let Some(num_threads) = num_threads_opt {\n            rayon_pool_builder = rayon_pool_builder.num_threads(num_threads);\n        }\n        let thread_pool = rayon_pool_builder\n            .build()\n            .expect(\"failed to spawn thread pool\");\n        let ongoing_tasks = THREAD_POOL_METRICS.ongoing_tasks.with_label_values([name]);\n    
    let pending_tasks = THREAD_POOL_METRICS.pending_tasks.with_label_values([name]);\n        ThreadPool {\n            thread_pool: Arc::new(thread_pool),\n            ongoing_tasks,\n            pending_tasks,\n        }\n    }\n\n    pub fn get_underlying_rayon_thread_pool(&self) -> Arc<rayon::ThreadPool> {\n        self.thread_pool.clone()\n    }\n\n    /// Function similar to `tokio::spawn_blocking`.\n    ///\n    /// Here are two important differences however:\n    ///\n    /// 1) The task runs on a rayon thread pool managed by Quickwit. This pool is specifically used\n    ///    only to run CPU-intensive work and is configured to contain `num_cpus` cores.\n    ///\n    /// 2) Before the task is effectively scheduled, we check that the spawner is still interested\n    ///    in its result.\n    ///\n    /// It is therefore required to `await` the result of this\n    /// function to get any work done.\n    ///\n    /// This is nice because it makes work that has been scheduled\n    /// but is not running yet \"cancellable\".\n    pub fn run_cpu_intensive<F, R>(\n        &self,\n        cpu_intensive_fn: F,\n    ) -> impl Future<Output = Result<R, Panicked>>\n    where\n        F: FnOnce() -> R + Send + 'static,\n        R: Send + 'static,\n    {\n        let span = tracing::Span::current();\n        let ongoing_tasks = self.ongoing_tasks.clone();\n        let mut pending_tasks_guard: OwnedGaugeGuard =\n            OwnedGaugeGuard::from_gauge(self.pending_tasks.clone());\n        pending_tasks_guard.add(1i64);\n        let (tx, rx) = oneshot::channel();\n        self.thread_pool.spawn(move || {\n            drop(pending_tasks_guard);\n            if tx.is_closed() {\n                return;\n            }\n            let _guard = span.enter();\n            let mut ongoing_task_guard = GaugeGuard::from_gauge(&ongoing_tasks);\n            ongoing_task_guard.add(1i64);\n            let result = cpu_intensive_fn();\n            let _ = tx.send(result);\n        
});\n        rx.map_err(|_| Panicked)\n    }\n}\n\n/// Run a small (<200ms) CPU-intensive task on a dedicated thread pool with a few threads.\n///\n/// When running blocking io (or side-effects in general), prefer using `tokio::spawn_blocking`\n/// instead. When running long tasks or a set of tasks that you expect to take more than 33% of\n/// your vCPUs, use a dedicated thread/runtime or executor instead.\n///\n/// Disclaimer: The function will not be executed if the Future is dropped.\n#[must_use = \"run_cpu_intensive will not run if the future it returns is dropped\"]\npub fn run_cpu_intensive<F, R>(cpu_intensive_fn: F) -> impl Future<Output = Result<R, Panicked>>\nwhere\n    F: FnOnce() -> R + Send + 'static,\n    R: Send + 'static,\n{\n    static SMALL_TASK_EXECUTOR: std::sync::OnceLock<ThreadPool> = std::sync::OnceLock::new();\n    SMALL_TASK_EXECUTOR\n        .get_or_init(|| {\n            let num_threads: usize = (crate::num_cpus() / 3).max(2);\n            ThreadPool::new(\"small_tasks\", Some(num_threads))\n        })\n        .run_cpu_intensive(cpu_intensive_fn)\n}\n\n#[derive(Clone, Copy, Debug, Eq, PartialEq)]\npub struct Panicked;\n\nimpl fmt::Display for Panicked {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"scheduled task panicked\")\n    }\n}\n\nimpl std::error::Error for Panicked {}\n\nstruct ThreadPoolMetrics {\n    ongoing_tasks: IntGaugeVec<1>,\n    pending_tasks: IntGaugeVec<1>,\n}\n\nimpl Default for ThreadPoolMetrics {\n    fn default() -> Self {\n        ThreadPoolMetrics {\n            ongoing_tasks: new_gauge_vec(\n                \"ongoing_tasks\",\n                \"number of tasks being currently processed by threads in the thread pool\",\n                \"thread_pool\",\n                &[],\n                [\"pool\"],\n            ),\n            pending_tasks: new_gauge_vec(\n                \"pending_tasks\",\n                \"number of tasks waiting in the queue before being processed by the 
thread pool\",\n                \"thread_pool\",\n                &[],\n                [\"pool\"],\n            ),\n        }\n    }\n}\n\nstatic THREAD_POOL_METRICS: Lazy<ThreadPoolMetrics> = Lazy::new(ThreadPoolMetrics::default);\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicU64, Ordering};\n    use std::time::Duration;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_run_cpu_intensive() {\n        assert_eq!(run_cpu_intensive(|| 1).await, Ok(1));\n    }\n\n    #[tokio::test]\n    async fn test_run_cpu_intensive_panicks() {\n        assert!(run_cpu_intensive(|| panic!(\"\")).await.is_err());\n    }\n\n    #[tokio::test]\n    async fn test_run_cpu_intensive_panicks_do_not_shrink_thread_pool() {\n        for _ in 0..100 {\n            assert!(run_cpu_intensive(|| panic!(\"\")).await.is_err());\n        }\n    }\n\n    #[tokio::test]\n    async fn test_run_cpu_intensive_abort() {\n        let counter: Arc<AtomicU64> = Default::default();\n        let mut futures = Vec::new();\n        for _ in 0..1_000 {\n            let counter_clone = counter.clone();\n            let fut = run_cpu_intensive(move || {\n                std::thread::sleep(Duration::from_millis(5));\n                counter_clone.fetch_add(1, Ordering::SeqCst)\n            });\n            // The first few tasks should run, but the others should get cancelled.\n            futures.push(tokio::time::timeout(Duration::from_millis(1), fut));\n        }\n        futures::future::join_all(futures).await;\n        assert!(counter.load(Ordering::SeqCst) < 100);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/box_layer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::sync::Arc;\n\nuse tower::layer::layer_fn;\nuse tower::{Layer, Service};\n\nuse crate::tower::BoxService;\n\npub struct BoxLayer<S, R, T, E> {\n    inner: Arc<dyn Layer<S, Service = BoxService<R, T, E>> + Send + Sync + 'static>,\n}\n\nimpl<S, R, T, E> BoxLayer<S, R, T, E> {\n    pub fn new<L>(inner_layer: L) -> Self\n    where\n        L: Layer<S> + Send + Sync + 'static,\n        L::Service: Service<R, Response = T, Error = E> + Clone + Send + Sync + 'static,\n        <L::Service as Service<R>>::Future: Send + 'static,\n    {\n        let layer = layer_fn(move |inner_svc: S| {\n            let outer_layer = inner_layer.layer(inner_svc);\n            BoxService::new(outer_layer)\n        });\n\n        Self {\n            inner: Arc::new(layer),\n        }\n    }\n}\n\nimpl<S, R, T, E> Layer<S> for BoxLayer<S, R, T, E> {\n    type Service = BoxService<R, T, E>;\n\n    fn layer(&self, inner: S) -> Self::Service {\n        self.inner.layer(inner)\n    }\n}\n\nimpl<S, R, T, E> Clone for BoxLayer<S, R, T, E> {\n    fn clone(&self) -> Self {\n        Self {\n            inner: self.inner.clone(),\n        }\n    }\n}\n\nimpl<S, R, T, E> fmt::Debug for BoxLayer<S, R, T, E> {\n    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {\n        fmt.debug_struct(\"BoxLayer\").finish()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/box_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::task::{Context, Poll};\n\nuse tower::{Service, ServiceExt};\n\nuse super::BoxFuture;\n\ntrait CloneService<R, T, E>:\n    Service<R, Response = T, Error = E, Future = BoxFuture<T, E>>\n    + dyn_clone::DynClone\n    + Send\n    + Sync\n    + 'static\n{\n}\n\ndyn_clone::clone_trait_object!(<R, T, E> CloneService<R, T, E>);\n\nimpl<S, R, T, E> CloneService<R, T, E> for S where S: Service<R, Response = T, Error = E, Future = BoxFuture<T, E>>\n        + Clone\n        + Send\n        + Sync\n        + 'static\n{\n}\n\npub struct BoxService<R, T, E> {\n    inner: Box<dyn CloneService<R, T, E>>,\n}\n\nimpl<R, T, E> Clone for BoxService<R, T, E> {\n    fn clone(&self) -> Self {\n        Self {\n            inner: self.inner.clone(),\n        }\n    }\n}\n\nimpl<R, T, E> BoxService<R, T, E> {\n    pub fn new<S>(inner: S) -> Self\n    where\n        S: Service<R, Response = T, Error = E> + Clone + Send + Sync + 'static,\n        S::Future: Send + 'static,\n    {\n        let inner = Box::new(inner.map_future(|fut| Box::pin(fut) as _));\n        BoxService { inner }\n    }\n}\n\nimpl<R, T, E> Service<R> for BoxService<R, T, E> {\n    type Response = T;\n    type Error = E;\n    type Future = BoxFuture<T, E>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), E>> {\n        self.inner.poll_ready(cx)\n    }\n\n 
   fn call(&mut self, request: R) -> BoxFuture<T, E> {\n        self.inner.call(request)\n    }\n}\n\nimpl<T, U, E> fmt::Debug for BoxService<T, U, E> {\n    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {\n        fmt.debug_struct(\"BoxService\").finish()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/buffer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::error::Error;\nuse std::marker::PhantomData;\nuse std::task::{Context, Poll};\nuse std::{error, fmt};\n\nuse futures::TryFutureExt as _;\nuse tower::buffer::Buffer as TowerBuffer;\nuse tower::buffer::error::{Closed, ServiceError};\nuse tower::{Layer, Service};\n\nuse super::{BoxError, BoxFuture};\n\n#[derive(Debug, thiserror::Error)]\npub enum BufferError {\n    #[error(\"the buffer's worker closed unexpectedly\")]\n    Closed,\n    #[error(\"the buffer service returned an unknown error\")]\n    Unknown,\n}\n\n/// A wrapper around [`tower::buffer::Buffer`] service that preserves the original error type.\npub struct Buffer<S, R>\nwhere S: Service<R>\n{\n    bound: usize,\n    inner: TowerBuffer<R, <S as Service<R>>::Future>,\n}\n\nimpl<S, R> Buffer<S, R>\nwhere\n    S: Service<R>,\n    S::Error: Into<BoxError>,\n{\n    pub fn new(service: S, bound: usize) -> Self\n    where\n        S: Send + 'static,\n        S::Future: Send,\n        S::Error: Send + Sync,\n        R: Send + 'static,\n    {\n        Self {\n            bound,\n            inner: TowerBuffer::new(service, bound),\n        }\n    }\n}\n\nimpl<S, R> Service<R> for Buffer<S, R>\nwhere\n    R: Send + 'static,\n    S: Service<R>,\n    S::Error: error::Error + From<BufferError> + Into<BoxError> + Clone + Send + Sync + 'static,\n    S::Future: Send + 'static,\n{\n    type 
Response = S::Response;\n    type Error = S::Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.inner.poll_ready(cx).map_err(downcast_error)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        let fut = self.inner.call(request).map_err(downcast_error);\n        Box::pin(fut)\n    }\n}\n\n/// Downcasts an error boxed as [`tower::BoxError`] by the buffer service back into the original\n/// error `E`.\nfn downcast_error<E>(error: BoxError) -> E\nwhere E: error::Error + From<BufferError> + Clone + 'static {\n    if let Some(error) = error.downcast_ref::<E>() {\n        return error.clone();\n    }\n    // This happens when the buffer worker is dead.\n    if error.downcast_ref::<Closed>().is_some() {\n        return BufferError::Closed.into();\n    }\n    // This happens when the inner service returns an error on `poll_ready`.\n    if let Some(service_error) = error.downcast_ref::<ServiceError>()\n        && let Some(source) = service_error.source()\n        && let Some(inner) = source.downcast_ref::<E>()\n    {\n        return inner.clone();\n    }\n    // This will happen only if the buffer service implementation adds a new error type.\n    BufferError::Unknown.into()\n}\n\nimpl<S, R> fmt::Debug for Buffer<S, R>\nwhere S: Service<R>\n{\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"Buffer\")\n            .field(\"bound\", &self.bound)\n            .finish()\n    }\n}\n\nimpl<S, R> Clone for Buffer<S, R>\nwhere\n    S: Service<R>,\n    R: Send + 'static,\n    <S as Service<R>>::Future: Send + 'static,\n{\n    fn clone(&self) -> Self {\n        Self {\n            bound: self.bound,\n            inner: self.inner.clone(),\n        }\n    }\n}\n\npub struct BufferLayer<R> {\n    bound: usize,\n    _phantom: PhantomData<fn(R)>,\n}\n\nimpl<R> BufferLayer<R> {\n    pub fn new(bound: usize) -> 
Self {\n        Self {\n            bound,\n            _phantom: PhantomData,\n        }\n    }\n}\n\nimpl<S, R> Layer<S> for BufferLayer<R>\nwhere\n    S: Service<R> + Send + 'static,\n    S::Future: Send,\n    S::Error: error::Error + From<BufferError> + Into<BoxError> + Clone + Send + Sync + 'static,\n    R: Send + 'static,\n{\n    type Service = Buffer<S, R>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        Buffer::new(service, self.bound)\n    }\n}\n\nimpl<R> fmt::Debug for BufferLayer<R> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"BufferLayer\")\n            .field(\"bound\", &self.bound)\n            .finish()\n    }\n}\n\nimpl<R> Clone for BufferLayer<R> {\n    fn clone(&self) -> Self {\n        *self\n    }\n}\n\nimpl<R> Copy for BufferLayer<R> {}\n\n#[cfg(test)]\nmod tests {\n    use tower::ServiceExt;\n\n    use super::*;\n\n    #[derive(Debug, Clone, thiserror::Error, PartialEq, Eq)]\n    enum MyServiceError {\n        #[error(\"service is exhausted\")]\n        Exhausted,\n        #[error(\"service is unavailable\")]\n        Unavailable,\n        #[error(\"service attempted to divide by zero\")]\n        ZeroDivision,\n    }\n\n    impl From<BufferError> for MyServiceError {\n        fn from(_: BufferError) -> Self {\n            MyServiceError::Unavailable\n        }\n    }\n\n    #[derive(Debug, Default)]\n    struct MyService {\n        num_calls: usize,\n    }\n\n    impl Service<(usize, usize)> for MyService {\n        type Response = usize;\n        type Error = MyServiceError;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n\n        fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n            self.num_calls += 1;\n\n            if self.num_calls > 2 {\n                Poll::Ready(Err(MyServiceError::Exhausted))\n            } else {\n                Poll::Ready(Ok(()))\n            }\n        }\n\n        fn call(&mut self, (dividend, 
divisor): (usize, usize)) -> Self::Future {\n            let fut = async move {\n                if divisor == 0 {\n                    Err(MyServiceError::ZeroDivision)\n                } else {\n                    Ok(dividend / divisor)\n                }\n            };\n            Box::pin(fut)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_buffer_error() {\n        let mut service = BufferLayer::new(1).layer(MyService::default());\n\n        assert_eq!(\n            service.ready().await.unwrap().call((10, 2)).await.unwrap(),\n            5\n        );\n        assert_eq!(\n            service\n                .ready()\n                .await\n                .unwrap()\n                .call((10, 0))\n                .await\n                .unwrap_err(),\n            MyServiceError::ZeroDivision\n        );\n        assert_eq!(\n            service\n                .ready()\n                .await\n                .unwrap()\n                .call((10, 0))\n                .await\n                .unwrap_err(),\n            MyServiceError::Exhausted\n        );\n    }\n\n    #[tokio::test]\n    async fn test_buffer_closed() {\n        let (inner, worker) = TowerBuffer::pair(MyService::default(), 1);\n        let handle = tokio::spawn(worker);\n\n        let mut service: Buffer<MyService, (usize, usize)> = Buffer { bound: 1, inner };\n        let res: usize = service.ready().await.unwrap().call((10, 2)).await.unwrap();\n        assert_eq!(res, 5);\n\n        handle.abort();\n        handle.await.unwrap_err();\n\n        assert_eq!(\n            service.ready().await.unwrap_err(),\n            MyServiceError::Unavailable\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/change.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n/// A change enum similar to `tower::discover::Change` but cloneable.\n// TODO: Remove when the next version of tower (0.4.14?) is released.\n#[derive(Debug, Clone)]\npub enum Change<K, V> {\n    Insert(K, V),\n    Remove(K),\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/circuit_breaker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::sync::{Arc, Mutex};\nuse std::task::{Context, Poll};\nuse std::time::Duration;\n\nuse pin_project::pin_project;\nuse prometheus::IntCounter;\nuse tokio::time::Instant;\nuse tower::{Layer, Service};\n\n/// The circuit breaker layer implements the [circuit breaker pattern](https://martinfowler.com/bliki/CircuitBreaker.html).\n///\n/// It counts the errors emitted by the inner service, and if the number of errors exceeds a certain\n/// threshold within a certain time window, it will \"open\" the circuit.\n///\n/// Requests will then be rejected for a given timeout.\n/// After this timeout, the circuit breaker ends up in a HalfOpen state. It will allow a single\n/// request to pass through. Depending on the result of this request, the circuit breaker will\n/// either close the circuit again or open it again.\n///\n/// Implementation detail:\n///\n/// A circuit breaker needs to have some logic to estimate the chances for the next request\n/// to fail. In this implementation, we use a simple heuristic that does not take in account\n/// successes. 
We simply count the number of errors that happened in the last window.\n///\n/// The circuit breaker does not attempt to measure the error rate accurately.\n/// Instead, it counts errors and checks the time window in which these errors occurred.\n/// This approach is accurate enough, robust, very easy to code and avoids calling the\n/// `Instant::now()` at every error in the open state.\n#[derive(Debug, Clone)]\npub struct CircuitBreakerLayer<Evaluator> {\n    max_error_count_per_time_window: u32,\n    time_window: Duration,\n    timeout: Duration,\n    evaluator: Evaluator,\n    circuit_break_total: prometheus::IntCounter,\n}\n\npub trait CircuitBreakerEvaluator: Clone {\n    type Response;\n    type Error;\n    fn is_circuit_breaker_error(&self, output: &Result<Self::Response, Self::Error>) -> bool;\n    fn make_circuit_breaker_output(&self) -> Self::Error;\n    fn make_layer(\n        self,\n        max_num_errors_per_secs: u32,\n        timeout: Duration,\n        circuit_break_total: prometheus::IntCounter,\n    ) -> CircuitBreakerLayer<Self> {\n        CircuitBreakerLayer {\n            max_error_count_per_time_window: max_num_errors_per_secs,\n            time_window: Duration::from_secs(1),\n            timeout,\n            evaluator: self,\n            circuit_break_total,\n        }\n    }\n}\n\nimpl<S, Evaluator: CircuitBreakerEvaluator> Layer<S> for CircuitBreakerLayer<Evaluator> {\n    type Service = CircuitBreaker<S, Evaluator>;\n\n    fn layer(&self, service: S) -> CircuitBreaker<S, Evaluator> {\n        let time_window = Duration::from_millis(self.time_window.as_millis() as u64);\n        let timeout = Duration::from_millis(self.timeout.as_millis() as u64);\n        CircuitBreaker {\n            underlying: service,\n            circuit_breaker_inner: Arc::new(Mutex::new(CircuitBreakerInner {\n                max_error_count_per_time_window: self.max_error_count_per_time_window,\n                time_window,\n                timeout,\n         
       state: CircuitBreakerState::Closed(ClosedState {\n                    error_counter: 0u32,\n                    error_window_end: Instant::now() + time_window,\n                }),\n                evaluator: self.evaluator.clone(),\n                circuit_break_total: self.circuit_break_total.clone(),\n            })),\n        }\n    }\n}\n\nstruct CircuitBreakerInner<Evaluator> {\n    max_error_count_per_time_window: u32,\n    time_window: Duration,\n    timeout: Duration,\n    evaluator: Evaluator,\n    state: CircuitBreakerState,\n    circuit_break_total: IntCounter,\n}\n\nimpl<Evaluator> CircuitBreakerInner<Evaluator> {\n    fn get_state(&mut self) -> CircuitBreakerState {\n        let new_state = match self.state {\n            CircuitBreakerState::Open { until } => {\n                let now = Instant::now();\n                if now < until {\n                    CircuitBreakerState::Open { until }\n                } else {\n                    CircuitBreakerState::HalfOpen\n                }\n            }\n            other => other,\n        };\n        self.state = new_state;\n        new_state\n    }\n\n    fn receive_error(&mut self) {\n        match self.state {\n            CircuitBreakerState::HalfOpen => {\n                self.circuit_break_total.inc();\n                self.state = CircuitBreakerState::Open {\n                    until: Instant::now() + self.timeout,\n                }\n            }\n            CircuitBreakerState::Open { .. 
} => {}\n            CircuitBreakerState::Closed(ClosedState {\n                error_counter,\n                error_window_end,\n            }) => {\n                if error_counter < self.max_error_count_per_time_window {\n                    self.state = CircuitBreakerState::Closed(ClosedState {\n                        error_counter: error_counter + 1,\n                        error_window_end,\n                    });\n                    return;\n                }\n                let now = Instant::now();\n                if now < error_window_end {\n                    self.circuit_break_total.inc();\n                    self.state = CircuitBreakerState::Open {\n                        until: now + self.timeout,\n                    };\n                } else {\n                    self.state = CircuitBreakerState::Closed(ClosedState {\n                        error_counter: 0u32,\n                        error_window_end: now + self.time_window,\n                    });\n                }\n            }\n        }\n    }\n\n    fn receive_success(&mut self) {\n        match self.state {\n            CircuitBreakerState::HalfOpen | CircuitBreakerState::Open { .. } => {\n                self.state = CircuitBreakerState::Closed(ClosedState {\n                    error_counter: 0u32,\n                    error_window_end: Instant::now() + self.time_window,\n                });\n            }\n            CircuitBreakerState::Closed { .. 
} => {\n                // We could actually take that as a signal.\n            }\n        }\n    }\n}\n\n#[derive(Clone)]\npub struct CircuitBreaker<S, Evaluator> {\n    underlying: S,\n    circuit_breaker_inner: Arc<Mutex<CircuitBreakerInner<Evaluator>>>,\n}\n\nimpl<S, Evaluator> std::fmt::Debug for CircuitBreaker<S, Evaluator> {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        f.debug_struct(\"CircuitBreaker\").finish()\n    }\n}\n\n#[derive(Debug, Clone, Copy)]\nenum CircuitBreakerState {\n    Open { until: Instant },\n    HalfOpen,\n    Closed(ClosedState),\n}\n\n#[derive(Debug, Clone, Copy)]\nstruct ClosedState {\n    error_counter: u32,\n    error_window_end: Instant,\n}\n\nimpl<S, R, Evaluator> Service<R> for CircuitBreaker<S, Evaluator>\nwhere\n    S: Service<R>,\n    Evaluator: CircuitBreakerEvaluator<Response = S::Response, Error = S::Error>,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = CircuitBreakerFuture<S::Future, Evaluator>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        let mut inner = self.circuit_breaker_inner.lock().unwrap();\n        let state = inner.get_state();\n        match state {\n            CircuitBreakerState::Closed { .. } | CircuitBreakerState::HalfOpen => {\n                self.underlying.poll_ready(cx)\n            }\n            CircuitBreakerState::Open { .. 
} => {\n                let circuit_break_error = inner.evaluator.make_circuit_breaker_output();\n                Poll::Ready(Err(circuit_break_error))\n            }\n        }\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        CircuitBreakerFuture {\n            underlying_fut: self.underlying.call(request),\n            circuit_breaker_inner: self.circuit_breaker_inner.clone(),\n        }\n    }\n}\n\n#[pin_project]\npub struct CircuitBreakerFuture<F, Evaluator> {\n    #[pin]\n    underlying_fut: F,\n    circuit_breaker_inner: Arc<Mutex<CircuitBreakerInner<Evaluator>>>,\n}\n\nimpl<Response, Error, F, Evaluator> Future for CircuitBreakerFuture<F, Evaluator>\nwhere\n    F: Future<Output = Result<Response, Error>>,\n    Evaluator: CircuitBreakerEvaluator<Response = Response, Error = Error>,\n{\n    type Output = F::Output;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let circuit_breaker_inner = self.circuit_breaker_inner.clone();\n        let poll_res = self.project().underlying_fut.poll(cx);\n        match poll_res {\n            Poll::Pending => Poll::Pending,\n            Poll::Ready(result) => {\n                let mut circuit_breaker_inner_lock = circuit_breaker_inner.lock().unwrap();\n                let is_circuit_breaker_error = circuit_breaker_inner_lock\n                    .evaluator\n                    .is_circuit_breaker_error(&result);\n                if is_circuit_breaker_error {\n                    circuit_breaker_inner_lock.receive_error();\n                } else {\n                    circuit_breaker_inner_lock.receive_success();\n                }\n                Poll::Ready(result)\n            }\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::atomic::{AtomicBool, Ordering};\n\n    use tower::{ServiceBuilder, ServiceExt};\n\n    use super::*;\n\n    #[derive(Debug)]\n    enum TestError {\n        CircuitBreak,\n        ServiceError,\n    }\n\n    
#[derive(Debug, Clone, Copy)]\n    struct TestCircuitBreakerEvaluator;\n\n    impl CircuitBreakerEvaluator for TestCircuitBreakerEvaluator {\n        type Response = ();\n        type Error = TestError;\n\n        fn is_circuit_breaker_error(&self, output: &Result<Self::Response, Self::Error>) -> bool {\n            output.is_err()\n        }\n\n        fn make_circuit_breaker_output(&self) -> TestError {\n            TestError::CircuitBreak\n        }\n    }\n\n    #[tokio::test]\n    async fn test_circuit_breaker() {\n        tokio::time::pause();\n        let test_switch: Arc<AtomicBool> = Arc::new(AtomicBool::new(true));\n\n        const TIMEOUT: Duration = Duration::from_millis(500);\n\n        let int_counter: prometheus::IntCounter =\n            IntCounter::new(\"circuit_break_total_test\", \"test circuit breaker counter\").unwrap();\n        let mut service = ServiceBuilder::new()\n            .layer(TestCircuitBreakerEvaluator.make_layer(10, TIMEOUT, int_counter))\n            .service_fn(|_| async {\n                if test_switch.load(Ordering::Relaxed) {\n                    Ok(())\n                } else {\n                    Err(TestError::ServiceError)\n                }\n            });\n\n        service.ready().await.unwrap().call(()).await.unwrap();\n\n        for _ in 0..1_000 {\n            service.ready().await.unwrap().call(()).await.unwrap();\n        }\n\n        test_switch.store(false, Ordering::Relaxed);\n\n        let mut service_error_count = 0;\n        let mut circuit_break_count = 0;\n        for _ in 0..1_000 {\n            match service.ready().await {\n                Ok(service) => {\n                    service.call(()).await.unwrap_err();\n                    service_error_count += 1;\n                }\n                Err(_circuit_breaker_error) => {\n                    circuit_break_count += 1;\n                }\n            }\n        }\n\n        assert_eq!(service_error_count + circuit_break_count, 1_000);\n        
assert_eq!(service_error_count, 11);\n\n        tokio::time::advance(TIMEOUT).await;\n\n        // The test request at half open fails.\n        for _ in 0..1_000 {\n            match service.ready().await {\n                Ok(service) => {\n                    service.call(()).await.unwrap_err();\n                    service_error_count += 1;\n                }\n                Err(_circuit_breaker_error) => {\n                    circuit_break_count += 1;\n                }\n            }\n        }\n\n        assert_eq!(service_error_count + circuit_break_count, 2_000);\n        assert_eq!(service_error_count, 12);\n\n        test_switch.store(true, Ordering::Relaxed);\n        tokio::time::advance(TIMEOUT).await;\n\n        // The test request at half open succeeds.\n        for _ in 0..1_000 {\n            service.ready().await.unwrap().call(()).await.unwrap();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/delay.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\nuse std::time::Duration;\n\nuse pin_project::pin_project;\nuse tokio::time::Sleep;\nuse tower::{Layer, Service};\n\n/// Delays a request by `delay` seconds.\n#[derive(Debug, Clone)]\npub struct Delay<S> {\n    inner: S,\n    delay: Duration,\n}\n\nimpl<S, R> Service<R> for Delay<S>\nwhere S: Service<R>\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = DelayFuture<S::Future>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.inner.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        DelayFuture {\n            inner: self.inner.call(request),\n            sleep: tokio::time::sleep(self.delay),\n            slept: false,\n        }\n    }\n}\n\n#[pin_project]\n#[derive(Debug)]\npub struct DelayFuture<F> {\n    #[pin]\n    inner: F,\n    #[pin]\n    sleep: Sleep,\n    slept: bool,\n}\n\nimpl<F, T, E> Future for DelayFuture<F>\nwhere F: Future<Output = Result<T, E>>\n{\n    type Output = Result<T, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let this = self.project();\n\n        if !*this.slept {\n            match this.sleep.poll(cx) {\n                Poll::Ready(_) => *this.slept = true,\n            
    Poll::Pending => return Poll::Pending,\n            }\n        }\n        this.inner.poll(cx)\n    }\n}\n\n/// Applies a delay to requests via the supplied inner service.\n#[derive(Debug, Clone)]\npub struct DelayLayer {\n    delay: Duration,\n}\n\nimpl DelayLayer {\n    /// Creates a new `DelayLayer` with the specified delay.\n    pub fn new(delay: Duration) -> Self {\n        Self { delay }\n    }\n}\n\nimpl<S> Layer<S> for DelayLayer {\n    type Service = Delay<S>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        Delay {\n            inner: service,\n            delay: self.delay,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Instant;\n\n    use tokio::time::Duration;\n    use tower::{ServiceBuilder, ServiceExt};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_delay() {\n        let delay = Duration::from_millis(100);\n        let mut service = ServiceBuilder::new()\n            .layer(DelayLayer::new(delay))\n            .service_fn(|_| async { Ok::<_, ()>(()) });\n\n        let start = Instant::now();\n        service.ready().await.unwrap().call(()).await.unwrap();\n\n        let elapsed = start.elapsed();\n        assert!(elapsed >= delay);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/estimate_rate.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::marker::PhantomData;\nuse std::task::{Context, Poll};\nuse std::time::Instant;\n\nuse tower::load::CompleteOnResponse;\nuse tower::load::completion::TrackCompletionFuture;\nuse tower::{Layer, Service};\n\nuse super::{Cost, RateEstimator};\n\npub struct Handle<T: RateEstimator> {\n    started_at: Instant,\n    work: u64,\n    estimator: T,\n}\n\nimpl<T> Drop for Handle<T>\nwhere T: RateEstimator\n{\n    fn drop(&mut self) {\n        let ended_at = Instant::now();\n        self.estimator.update(self.started_at, ended_at, self.work);\n    }\n}\n\n/// Estimates the quantity of work the underlying service can handle over a period of time.\n///\n/// Each request is decorated with a `Handle` that measures the time necessary to process the\n/// request and, on drop, updates the rate estimator on which it holds a reference.\n#[derive(Debug, Clone)]\npub struct EstimateRate<S, T> {\n    service: S,\n    estimator: T,\n}\n\nimpl<S, T> EstimateRate<S, T>\nwhere T: RateEstimator\n{\n    /// Creates a new rate estimator.\n    pub fn new(service: S, estimator: T) -> Self {\n        Self { service, estimator }\n    }\n\n    fn handle(&self, work: u64) -> Handle<T> {\n        Handle {\n            started_at: Instant::now(),\n            work,\n            estimator: self.estimator.clone(),\n        }\n    }\n}\n\nimpl<S, R, T> Service<R> for 
EstimateRate<S, T>\nwhere\n    S: Service<R>,\n    R: Cost,\n    T: RateEstimator,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = TrackCompletionFuture<S::Future, CompleteOnResponse, Handle<T>>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.service.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        let handle = self.handle(request.cost());\n        TrackCompletionFuture::new(\n            CompleteOnResponse::default(),\n            handle,\n            self.service.call(request),\n        )\n    }\n}\n\n/// Estimates the quantity of work the underlying\n/// service can handle over a period of time.\n#[derive(Debug, Clone)]\npub struct EstimateRateLayer<R, T> {\n    estimator: T,\n    _phantom: PhantomData<R>,\n}\n\nimpl<R, T> EstimateRateLayer<R, T> {\n    /// Creates a new estimate rate layer.\n    pub fn new(estimator: T) -> Self {\n        Self {\n            estimator,\n            _phantom: PhantomData,\n        }\n    }\n}\n\nimpl<S, R, T> Layer<S> for EstimateRateLayer<R, T>\nwhere\n    S: Service<R>,\n    R: Cost,\n    T: RateEstimator,\n{\n    type Service = EstimateRate<S, T>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        EstimateRate::new(service, self.estimator.clone())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicU64, Ordering};\n    use std::time::Duration;\n\n    use tower::ServiceExt;\n\n    use super::*;\n    use crate::tower::Rate;\n\n    struct Request;\n\n    impl Cost for Request {\n        fn cost(&self) -> u64 {\n            42\n        }\n    }\n\n    #[derive(Debug, Clone, Default)]\n    struct DummyEstimator {\n        work: Arc<AtomicU64>,\n        duration_micros: Arc<AtomicU64>,\n    }\n\n    impl Rate for DummyEstimator {\n        fn work(&self) -> u64 {\n            self.work.load(Ordering::Relaxed)\n        }\n\n        fn period(&self) 
-> Duration {\n            Duration::from_micros(self.duration_micros.load(Ordering::Relaxed))\n        }\n    }\n\n    impl RateEstimator for DummyEstimator {\n        fn update(&mut self, started_at: Instant, ended_at: Instant, work: u64) {\n            self.work.store(work, Ordering::Relaxed);\n            self.duration_micros.store(\n                (ended_at - started_at).as_micros() as u64,\n                Ordering::Relaxed,\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_estimate_rate() {\n        let estimator = DummyEstimator::default();\n        let mut service = EstimateRate::new(\n            tower::service_fn(|_: Request| async move { Ok::<_, ()>(()) }),\n            estimator.clone(),\n        );\n        service.ready().await.unwrap().call(Request).await.unwrap();\n        assert_eq!(service.estimator.work(), 42);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/event_listener.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\n\nuse futures::{Future, ready};\nuse pin_project::pin_project;\nuse tower::{Layer, Service};\n\nuse crate::pubsub::{Event, EventBroker};\n\n#[derive(Clone)]\npub struct EventListener<S> {\n    inner: S,\n    event_broker: EventBroker,\n}\n\nimpl<S> EventListener<S> {\n    pub fn new(inner: S, event_broker: EventBroker) -> Self {\n        Self {\n            inner,\n            event_broker,\n        }\n    }\n}\n\nimpl<S, R> Service<R> for EventListener<S>\nwhere\n    S: Service<R>,\n    R: Event,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = ResponseFuture<S::Future, R>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.inner.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        let inner = self.inner.call(request.clone());\n        ResponseFuture {\n            inner,\n            event_broker: self.event_broker.clone(),\n            request: Some(request),\n        }\n    }\n}\n\n#[derive(Debug, Clone)]\npub struct EventListenerLayer {\n    event_broker: EventBroker,\n}\n\nimpl EventListenerLayer {\n    pub fn new(event_broker: EventBroker) -> Self {\n        Self { event_broker }\n    }\n}\n\nimpl<S> Layer<S> for EventListenerLayer {\n    type Service = 
EventListener<S>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        EventListener::new(service, self.event_broker.clone())\n    }\n}\n\n/// Response future for [`EventListener`].\n#[pin_project]\npub struct ResponseFuture<F, R> {\n    #[pin]\n    inner: F,\n    event_broker: EventBroker,\n    request: Option<R>,\n}\n\nimpl<R, F, T, E> Future for ResponseFuture<F, R>\nwhere\n    R: Event,\n    F: Future<Output = Result<T, E>>,\n{\n    type Output = Result<T, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let this = self.project();\n        let response = ready!(this.inner.poll(cx));\n\n        if response.is_ok() {\n            this.event_broker\n                .publish(this.request.take().expect(\"request should be set\"));\n        }\n        Poll::Ready(Ok(response?))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n    use std::time::Duration;\n\n    use async_trait::async_trait;\n\n    use super::*;\n    use crate::pubsub::EventSubscriber;\n\n    #[derive(Debug, Clone, Copy)]\n    struct MyEvent {\n        return_ok: bool,\n    }\n\n    impl Event for MyEvent {}\n\n    struct MySubscriber {\n        counter: Arc<AtomicUsize>,\n    }\n\n    #[async_trait]\n    impl EventSubscriber<MyEvent> for MySubscriber {\n        async fn handle_event(&mut self, _event: MyEvent) {\n            self.counter.fetch_add(1, Ordering::Relaxed);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_event_listener() {\n        let event_broker = EventBroker::default();\n        let counter = Arc::new(AtomicUsize::new(0));\n        let subscriber = MySubscriber {\n            counter: counter.clone(),\n        };\n        let _subscription_handle = event_broker.subscribe::<MyEvent>(subscriber);\n\n        let layer = EventListenerLayer::new(event_broker);\n\n        let mut service = layer.layer(tower::service_fn(|request: MyEvent| async move {\n    
        if request.return_ok { Ok(()) } else { Err(()) }\n        }));\n        let request = MyEvent { return_ok: false };\n        service.call(request).await.unwrap_err();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        assert_eq!(counter.load(Ordering::Relaxed), 0);\n\n        let request = MyEvent { return_ok: true };\n        service.call(request).await.unwrap();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        assert_eq!(counter.load(Ordering::Relaxed), 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/load_shed.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::sync::Arc;\nuse std::task::{Context, Poll};\n\nuse pin_project::pin_project;\nuse tokio::sync::{OwnedSemaphorePermit, Semaphore};\nuse tower::{Layer, Service};\n\n/// Tracks the number of in-flight requests being processed by a service and rejects new incoming\n/// requests if the number of in-flight requests exceeds a specified limit.\n#[derive(Debug)]\npub struct LoadShed<S> {\n    inner: S,\n    permits: Arc<Semaphore>,\n    permit_opt: Option<OwnedSemaphorePermit>,\n}\n\nimpl<S> Clone for LoadShed<S>\nwhere S: Clone\n{\n    fn clone(&self) -> Self {\n        Self {\n            inner: self.inner.clone(),\n            permits: self.permits.clone(),\n            permit_opt: None,\n        }\n    }\n}\n\npub trait MakeLoadShedError {\n    fn make_load_shed_error() -> Self;\n}\n\nimpl<S, R> Service<R> for LoadShed<S>\nwhere\n    S: Service<R>,\n    S::Error: MakeLoadShedError,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = LoadShedFuture<S::Future>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        if self.permit_opt.is_none() {\n            if let Ok(permit) = self.permits.clone().try_acquire_owned() {\n                self.permit_opt = Some(permit);\n            } else {\n                return 
Poll::Ready(Err(S::Error::make_load_shed_error()));\n            }\n        }\n        self.inner.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        let permit = self\n            .permit_opt\n            .take()\n            .expect(\"`poll_ready` should be called before `call`\");\n\n        LoadShedFuture {\n            inner: self.inner.call(request),\n            permit,\n        }\n    }\n}\n\n#[pin_project]\n#[derive(Debug)]\npub struct LoadShedFuture<F> {\n    #[pin]\n    inner: F,\n    permit: OwnedSemaphorePermit,\n}\n\nimpl<F, T, E> Future for LoadShedFuture<F>\nwhere F: Future<Output = Result<T, E>>\n{\n    type Output = Result<T, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        self.project().inner.poll(cx)\n    }\n}\n\n/// Allows at most `max_in_flight_requests` in-flight requests before rejecting new incoming\n/// requests.\n#[derive(Debug, Clone)]\npub struct LoadShedLayer {\n    max_in_flight_requests: usize,\n}\n\nimpl LoadShedLayer {\n    /// Creates a new `LoadShedLayer` allowing at most `max_in_flight_requests` in-flight requests\n    /// before rejecting new incoming requests.\n    pub fn new(max_in_flight_requests: usize) -> Self {\n        Self {\n            max_in_flight_requests,\n        }\n    }\n}\n\nimpl<S> Layer<S> for LoadShedLayer {\n    type Service = LoadShed<S>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        LoadShed {\n            inner: service,\n            permits: Arc::new(Semaphore::new(self.max_in_flight_requests)),\n            permit_opt: None,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tower::{ServiceBuilder, ServiceExt};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_load_shed() {\n        #[derive(Debug)]\n        struct MyError;\n\n        impl MakeLoadShedError for MyError {\n            fn make_load_shed_error() -> Self {\n                MyError\n            }\n        }\n        let 
mut service = ServiceBuilder::new()\n            .layer(LoadShedLayer::new(1))\n            .service_fn(|_| async { Ok::<_, MyError>(()) });\n\n        let in_flight_fut = service.ready().await.unwrap().call(());\n        service.ready().await.unwrap_err();\n\n        drop(in_flight_fut);\n        service.ready().await.unwrap().call(()).await.unwrap();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\nuse std::time::Instant;\n\nuse futures::{Future, ready};\nuse pin_project::{pin_project, pinned_drop};\nuse prometheus::exponential_buckets;\nuse tower::{Layer, Service};\n\nuse crate::metrics::{\n    HistogramVec, IntCounterVec, IntGaugeVec, new_counter_vec, new_gauge_vec, new_histogram_vec,\n};\n\npub trait RpcName {\n    fn rpc_name() -> &'static str;\n}\n\n#[derive(Clone)]\npub struct GrpcMetrics<S> {\n    inner: S,\n    requests_total: IntCounterVec<2>,\n    requests_in_flight: IntGaugeVec<1>,\n    request_duration_seconds: HistogramVec<2>,\n}\n\nimpl<S, R> Service<R> for GrpcMetrics<S>\nwhere\n    S: Service<R>,\n    R: RpcName,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = ResponseFuture<S::Future>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.inner.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        let start = Instant::now();\n        let rpc_name = R::rpc_name();\n        let inner = self.inner.call(request);\n\n        self.requests_in_flight.with_label_values([rpc_name]).inc();\n\n        ResponseFuture {\n            inner,\n            start,\n            rpc_name,\n            status: \"cancelled\",\n            requests_total: 
self.requests_total.clone(),\n            requests_in_flight: self.requests_in_flight.clone(),\n            request_duration_seconds: self.request_duration_seconds.clone(),\n        }\n    }\n}\n\n#[derive(Clone)]\npub struct GrpcMetricsLayer {\n    requests_total: IntCounterVec<2>,\n    requests_in_flight: IntGaugeVec<1>,\n    request_duration_seconds: HistogramVec<2>,\n}\n\nimpl GrpcMetricsLayer {\n    pub fn new(subsystem: &'static str, kind: &'static str) -> Self {\n        Self {\n            requests_total: new_counter_vec(\n                \"grpc_requests_total\",\n                \"Total number of gRPC requests processed.\",\n                subsystem,\n                &[(\"kind\", kind)],\n                [\"rpc\", \"status\"],\n            ),\n            requests_in_flight: new_gauge_vec(\n                \"grpc_requests_in_flight\",\n                \"Number of gRPC requests in-flight.\",\n                subsystem,\n                &[(\"kind\", kind)],\n                [\"rpc\"],\n            ),\n            request_duration_seconds: new_histogram_vec(\n                \"grpc_request_duration_seconds\",\n                \"Duration of request in seconds.\",\n                subsystem,\n                &[(\"kind\", kind)],\n                [\"rpc\", \"status\"],\n                exponential_buckets(0.001, 2.0, 12).unwrap(),\n            ),\n        }\n    }\n}\n\nimpl<S> Layer<S> for GrpcMetricsLayer {\n    type Service = GrpcMetrics<S>;\n\n    fn layer(&self, inner: S) -> Self::Service {\n        GrpcMetrics {\n            inner,\n            requests_total: self.requests_total.clone(),\n            requests_in_flight: self.requests_in_flight.clone(),\n            request_duration_seconds: self.request_duration_seconds.clone(),\n        }\n    }\n}\n\n/// Response future for [`GrpcMetrics`].\n#[pin_project(PinnedDrop)]\npub struct ResponseFuture<F> {\n    #[pin]\n    inner: F,\n    start: Instant,\n    rpc_name: &'static str,\n    status: &'static 
str,\n    requests_total: IntCounterVec<2>,\n    requests_in_flight: IntGaugeVec<1>,\n    request_duration_seconds: HistogramVec<2>,\n}\n\n#[pinned_drop]\nimpl<F> PinnedDrop for ResponseFuture<F> {\n    fn drop(self: Pin<&mut Self>) {\n        let elapsed = self.start.elapsed().as_secs_f64();\n        let label_values = [self.rpc_name, self.status];\n\n        self.requests_total.with_label_values(label_values).inc();\n        self.request_duration_seconds\n            .with_label_values(label_values)\n            .observe(elapsed);\n        self.requests_in_flight\n            .with_label_values([self.rpc_name])\n            .dec();\n    }\n}\n\nimpl<F, T, E> Future for ResponseFuture<F>\nwhere F: Future<Output = Result<T, E>>\n{\n    type Output = Result<T, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let this = self.project();\n        let response = ready!(this.inner.poll(cx));\n        *this.status = if response.is_ok() { \"success\" } else { \"error\" };\n        Poll::Ready(Ok(response?))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    struct HelloRequest;\n\n    impl RpcName for HelloRequest {\n        fn rpc_name() -> &'static str {\n            \"hello\"\n        }\n    }\n\n    struct GoodbyeRequest;\n\n    impl RpcName for GoodbyeRequest {\n        fn rpc_name() -> &'static str {\n            \"goodbye\"\n        }\n    }\n\n    #[tokio::test]\n    async fn test_grpc_metrics() {\n        let layer = GrpcMetricsLayer::new(\"quickwit_test\", \"server\");\n\n        let mut hello_service =\n            layer\n                .clone()\n                .layer(tower::service_fn(|request: HelloRequest| async move {\n                    Ok::<_, ()>(request)\n                }));\n        let mut goodbye_service =\n            layer\n                .clone()\n                .layer(tower::service_fn(|request: GoodbyeRequest| async move {\n                    Ok::<_, ()>(request)\n                
}));\n\n        hello_service.call(HelloRequest).await.unwrap();\n\n        assert_eq!(\n            layer\n                .requests_total\n                .with_label_values([\"hello\", \"success\"])\n                .get(),\n            1\n        );\n        assert_eq!(\n            layer\n                .requests_total\n                .with_label_values([\"goodbye\", \"success\"])\n                .get(),\n            0\n        );\n\n        goodbye_service.call(GoodbyeRequest).await.unwrap();\n\n        assert_eq!(\n            layer\n                .requests_total\n                .with_label_values([\"goodbye\", \"success\"])\n                .get(),\n            1\n        );\n\n        let hello_future = hello_service.call(HelloRequest);\n        drop(hello_future);\n\n        assert_eq!(\n            layer\n                .requests_total\n                .with_label_values([\"hello\", \"cancelled\"])\n                .get(),\n            1\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod box_layer;\nmod box_service;\nmod buffer;\nmod change;\nmod circuit_breaker;\nmod delay;\nmod estimate_rate;\nmod event_listener;\nmod load_shed;\nmod metrics;\nmod one_task_per_call_layer;\nmod pool;\nmod rate;\nmod rate_estimator;\nmod rate_limit;\nmod retry;\nmod timeout;\nmod transport;\n\nuse std::error;\nuse std::pin::Pin;\n\npub use box_layer::BoxLayer;\npub use box_service::BoxService;\npub use buffer::{Buffer, BufferError, BufferLayer};\npub use change::Change;\npub use circuit_breaker::{CircuitBreaker, CircuitBreakerEvaluator, CircuitBreakerLayer};\npub use delay::{Delay, DelayLayer};\npub use estimate_rate::{EstimateRate, EstimateRateLayer};\npub use event_listener::{EventListener, EventListenerLayer};\nuse futures::Future;\npub use load_shed::{LoadShed, LoadShedLayer, MakeLoadShedError};\npub use metrics::{GrpcMetrics, GrpcMetricsLayer, RpcName};\npub use one_task_per_call_layer::{OneTaskPerCallLayer, TaskCancelled};\npub use pool::Pool;\npub use rate::{ConstantRate, Rate};\npub use rate_estimator::{RateEstimator, SmaRateEstimator};\npub use rate_limit::{RateLimit, RateLimitLayer};\npub use retry::{RetryLayer, RetryPolicy};\npub use timeout::{Timeout, TimeoutExceeded, TimeoutLayer};\npub use transport::{\n    BalanceChannel, ClientGrpcConfig, KeepAliveConfig, make_channel, warmup_channel,\n};\n\npub type BoxError = Box<dyn 
error::Error + Send + Sync + 'static>;\n\npub type BoxFuture<T, E> = Pin<Box<dyn Future<Output = Result<T, E>> + Send + 'static>>;\n\npub type BoxFutureInfaillible<T> = Pin<Box<dyn Future<Output = T> + Send + 'static>>;\n\npub trait Cost {\n    fn cost(&self) -> u64;\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/one_task_per_call_layer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\n\nuse pin_project::pin_project;\nuse tokio::task::{JoinError, JoinHandle};\nuse tower::{Layer, Service};\nuse tracing::error;\n\nuse crate::tower::RpcName;\n\n/// This layer spawns a new task for each call to the inner service.\n///\n/// This is useful for service where the handle is not cancel-safe:\n/// On a connection drop for instance, tonic can cancel the Future associated\n/// to a request execution.\n///\n/// By executing it on a dedicated task, we ensure the future is run to\n/// completion.\n///\n/// Disclaimer: This layer should be used with caution, as it means that timeout\n/// are not possible anymore.\n///\n/// It also can behave in an unexpected way when combined with layers like the\n/// `GlobalConcurrencyLimitLayer`.\npub struct OneTaskPerCallLayer;\n\nimpl<S: Clone> Layer<S> for OneTaskPerCallLayer {\n    type Service = OneTaskPerCallService<S>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        OneTaskPerCallService { service }\n    }\n}\n\n#[derive(Clone)]\npub struct OneTaskPerCallService<S> {\n    service: S,\n}\n\nimpl<S, Request> Service<Request> for OneTaskPerCallService<S>\nwhere\n    S: Service<Request>,\n    S::Future: Send + 'static,\n    S::Response: Send + 'static,\n    S::Error: From<TaskCancelled> + Send + 
'static,\n    Request: fmt::Debug + Send + RpcName + 'static,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = UnwrapOrElseFuture<S::Response, S::Error>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.service.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: Request) -> Self::Future {\n        let request_name: &'static str = Request::rpc_name();\n        let future = self.service.call(request);\n        let join_handle = tokio::spawn(future);\n        UnwrapOrElseFuture {\n            request_name,\n            join_handle,\n        }\n    }\n}\n\n#[pin_project]\npub struct UnwrapOrElseFuture<T, E> {\n    request_name: &'static str,\n    #[pin]\n    join_handle: JoinHandle<Result<T, E>>,\n}\n\nimpl<T, E> Future for UnwrapOrElseFuture<T, E>\nwhere E: From<TaskCancelled>\n{\n    type Output = Result<T, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let request_name = self.request_name;\n        let pinned_join_handle: Pin<&mut JoinHandle<Result<T, E>>> = self.project().join_handle;\n        match pinned_join_handle.poll(cx) {\n            Poll::Ready(Ok(Ok(t))) => Poll::Ready(Ok(t)),\n            Poll::Ready(Ok(Err(e))) => Poll::Ready(Err(e)),\n            Poll::Ready(Err(join_error)) => {\n                error!(\n                    \"task running the request `{}` was cancelled or panicked. please report! 
\\\n                     JoinError: {:?}\",\n                    request_name, join_error\n                );\n                let task_cancelled = TaskCancelled {\n                    request_name,\n                    join_error,\n                };\n                Poll::Ready(Err(E::from(task_cancelled)))\n            }\n            Poll::Pending => Poll::Pending,\n        }\n    }\n}\n\npub struct TaskCancelled {\n    pub request_name: &'static str,\n    pub join_error: JoinError,\n}\n\nimpl std::fmt::Display for TaskCancelled {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        let TaskCancelled {\n            request_name,\n            join_error,\n        } = self;\n        write!(\n            f,\n            \"task running `{request_name}` was cancelled or panicked. JoinError: {join_error:?}\"\n        )\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use tokio::sync::Mutex;\n    use tower::ServiceExt;\n\n    use super::*;\n    use crate::tower::RpcName;\n\n    #[derive(Debug)]\n    struct Request;\n\n    impl RpcName for Request {\n        fn rpc_name() -> &'static str {\n            \"dummy_request\"\n        }\n    }\n\n    #[derive(Debug)]\n    struct DummyError;\n\n    impl From<TaskCancelled> for DummyError {\n        fn from(_task_cancelled: TaskCancelled) -> DummyError {\n            DummyError\n        }\n    }\n\n    // In this toy example, we want to make sure, upon all observation\n    // left == right.\n    //\n    // In reality, OneTaskPerCallLayer is meant to protect more complicated\n    // invariants.\n    #[derive(Default)]\n    struct State {\n        left: usize,\n        right: usize,\n    }\n\n    #[tokio::test]\n    async fn test_task_cancelled() {\n        let state: Arc<Mutex<State>> = Default::default();\n        let state_clone: Arc<Mutex<State>> = state.clone();\n        let service = tower::service_fn(move |_request: Request| {\n            let 
state_clone = state.clone();\n            async move {\n                let mut lock = state_clone.lock().await;\n                assert_eq!(lock.left, lock.right);\n                lock.left += 1;\n                // If the task was cancelled at this point, it would leave us with\n                // a broken invariant.\n                tokio::time::sleep(Duration::from_millis(100)).await;\n                lock.right += 1;\n                Result::Ok::<(), DummyError>(())\n            }\n        });\n        let mut one_task_per_call_service = OneTaskPerCallService { service };\n        tokio::select!(\n            _ = async { one_task_per_call_service.ready().await.unwrap().call(Request).await } => {\n                panic!(\"this should have timed out\");\n            },\n            _ = tokio::time::sleep(Duration::from_millis(10)) => (),\n        );\n        let state_guard = state_clone.lock().await;\n        assert_eq!(state_guard.left, state_guard.right);\n        assert_eq!(state_guard.left, 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/pool.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::TypeId;\nuse std::borrow::Borrow;\nuse std::cmp::{Eq, PartialEq};\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::hash::Hash;\nuse std::sync::{Arc, RwLock};\n\nuse futures::{Stream, StreamExt};\n\nuse super::Change;\n\n/// A pool of `V` values identified by `K` keys. The pool can be updated manually by calling the\n/// `add/remove` methods or by listening to a stream of changes.\npub struct Pool<K, V> {\n    pool: Arc<RwLock<HashMap<K, V>>>,\n}\n\nimpl<K, V> fmt::Debug for Pool<K, V>\nwhere\n    K: 'static,\n    V: 'static,\n{\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(f, \"Pool<{:?}, {:?}>\", TypeId::of::<K>(), TypeId::of::<V>())\n    }\n}\n\nimpl<K, V> Clone for Pool<K, V> {\n    fn clone(&self) -> Self {\n        Self {\n            pool: self.pool.clone(),\n        }\n    }\n}\n\nimpl<K, V> Default for Pool<K, V>\nwhere K: Eq + PartialEq + Hash\n{\n    fn default() -> Self {\n        Self {\n            pool: Arc::new(RwLock::new(HashMap::default())),\n        }\n    }\n}\n\nimpl<K, V> Pool<K, V>\nwhere\n    K: Eq + PartialEq + Hash + Clone + Send + Sync + 'static,\n    V: Clone + Send + Sync + 'static,\n{\n    /// Listens for the changes emitted by the stream and updates the pool accordingly.\n    pub fn listen_for_changes(\n        &self,\n        change_stream: impl Stream<Item 
= Change<K, V>> + Send + 'static,\n    ) {\n        let pool = self.clone();\n        let future = async move {\n            change_stream\n                .for_each(|change| async {\n                    match change {\n                        Change::Insert(key, service) => {\n                            pool.insert(key, service);\n                        }\n                        Change::Remove(key) => {\n                            pool.remove(&key);\n                        }\n                    }\n                })\n                .await;\n        };\n        tokio::spawn(future);\n    }\n\n    /// Returns whether the pool is empty.\n    pub fn is_empty(&self) -> bool {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .is_empty()\n    }\n\n    /// Returns the number of values in the pool.\n    pub fn len(&self) -> usize {\n        self.pool.read().expect(\"lock should not be poisoned\").len()\n    }\n\n    /// Returns all the keys in the pool.\n    pub fn keys(&self) -> Vec<K> {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .keys()\n            .cloned()\n            .collect()\n    }\n\n    /// Returns all the key-value pairs in the pool.\n    pub fn keys_values(&self) -> Vec<(K, V)> {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .iter()\n            .map(|(key, value)| (key.clone(), value.clone()))\n            .collect()\n    }\n\n    /// Returns all the values in the pool.\n    pub fn values(&self) -> Vec<V> {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .values()\n            .cloned()\n            .collect()\n    }\n\n    /// Returns all the key-value pairs in the pool.\n    pub fn pairs(&self) -> Vec<(K, V)> {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n        
    .iter()\n            .map(|(key, value)| (key.clone(), value.clone()))\n            .collect()\n    }\n\n    /// Returns whether the pool contains a value for the given key.\n    pub fn contains_key<Q>(&self, key: &Q) -> bool\n    where\n        Q: Hash + Eq + ?Sized,\n        K: Borrow<Q>,\n    {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .contains_key(key)\n    }\n\n    /// Returns the value associated with the given key.\n    pub fn get<Q>(&self, key: &Q) -> Option<V>\n    where\n        Q: Hash + Eq + ?Sized,\n        K: Borrow<Q>,\n    {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .get(key)\n            .cloned()\n    }\n\n    /// Finds a key-value pair in the pool that satisfies the given predicate.\n    pub fn find(&self, func: impl Fn(&K, &V) -> bool) -> Option<(K, V)> {\n        self.pool\n            .read()\n            .expect(\"lock should not be poisoned\")\n            .iter()\n            .find(|(key, value)| func(key, value))\n            .map(|(key, value)| (key.clone(), value.clone()))\n    }\n\n    /// Adds a value to the pool.\n    pub fn insert(&self, key: K, service: V) {\n        self.pool\n            .write()\n            .expect(\"lock should not be poisoned\")\n            .insert(key, service);\n    }\n\n    /// Removes a value from the pool.\n    fn remove(&self, key: &K) {\n        self.pool\n            .write()\n            .expect(\"lock should not be poisoned\")\n            .remove(key);\n    }\n}\n\nimpl<K, V> FromIterator<(K, V)> for Pool<K, V>\nwhere K: Eq + PartialEq + Hash\n{\n    fn from_iter<I>(iter: I) -> Self\n    where I: IntoIterator<Item = (K, V)> {\n        Self {\n            pool: Arc::new(RwLock::new(HashMap::from_iter(iter))),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use tokio_stream::wrappers::ReceiverStream;\n\n    use super::*;\n\n    #[tokio::test]\n    
async fn test_pool() {\n        let (change_stream_tx, change_stream_rx) = tokio::sync::mpsc::channel(10);\n        let change_stream = ReceiverStream::new(change_stream_rx);\n\n        let pool = Pool::default();\n        pool.listen_for_changes(change_stream);\n\n        assert!(pool.is_empty());\n        assert_eq!(pool.len(), 0);\n\n        change_stream_tx.send(Change::Insert(1, 11)).await.unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert!(!pool.is_empty());\n        assert_eq!(pool.len(), 1);\n\n        assert!(pool.contains_key(&1));\n        assert_eq!(pool.get(&1), Some(11));\n\n        change_stream_tx.send(Change::Insert(2, 21)).await.unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert_eq!(pool.len(), 2);\n        assert_eq!(pool.get(&2), Some(21));\n\n        assert_eq!(pool.find(|k, _| *k == 1), Some((1, 11)));\n\n        let mut pairs = pool.pairs();\n        pairs.sort();\n\n        assert_eq!(pairs, vec![(1, 11), (2, 21)]);\n\n        change_stream_tx.send(Change::Insert(1, 12)).await.unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert_eq!(pool.get(&1), Some(12));\n\n        change_stream_tx.send(Change::Remove(1)).await.unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert_eq!(pool.len(), 1);\n\n        change_stream_tx.send(Change::Remove(2)).await.unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert!(pool.is_empty());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/rate.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\n\npub trait Rate: Clone {\n    /// Returns the amount of work per time period.\n    fn work(&self) -> u64;\n\n    /// Returns the amount of work in bytes per time period.\n    fn work_bytes(&self) -> ByteSize {\n        ByteSize(self.work())\n    }\n\n    /// Returns the duration of a time period.\n    fn period(&self) -> Duration;\n}\n\n/// A rate of unit of work per time period.\n#[derive(Debug, Copy, Clone)]\npub struct ConstantRate {\n    work: u64,\n    period: Duration,\n}\n\nimpl ConstantRate {\n    /// Creates a new constant rate.\n    ///\n    /// # Panics\n    ///\n    /// This function panics if `period` is 0 while work is != 0.\n    pub const fn new(work: u64, period: Duration) -> Self {\n        assert!(!period.is_zero() || work == 0u64);\n        Self { work, period }\n    }\n\n    pub const fn bytes_per_period(bytes: ByteSize, period: Duration) -> Self {\n        let work = bytes.as_u64();\n        Self::new(work, period)\n    }\n\n    pub const fn bytes_per_sec(bytes: ByteSize) -> Self {\n        Self::bytes_per_period(bytes, Duration::from_secs(1))\n    }\n\n    /// Changes the scale of the rate, i.e. 
the duration of the time period, while keeping the rate\n    /// constant.\n    ///\n    /// # Panics\n    ///\n    /// This function panics if `new_period` is 0 while `work` is != 0.\n    pub fn rescale(&self, new_period: Duration) -> Self {\n        if self.work == 0u64 {\n            return Self::new(0u64, new_period);\n        }\n        assert!(!new_period.is_zero());\n        let new_work = self.work() as u128 * new_period.as_nanos() / self.period().as_nanos();\n        Self::new(new_work as u64, new_period)\n    }\n}\n\nimpl Rate for ConstantRate {\n    fn work(&self) -> u64 {\n        self.work\n    }\n\n    fn period(&self) -> Duration {\n        self.period\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    #[should_panic]\n    fn test_rescale_zero_duration_panics() {\n        ConstantRate::bytes_per_period(ByteSize::b(1), Duration::default());\n    }\n\n    #[test]\n    fn test_rescale_zero_duration_accepted_if_no_work() {\n        let rate = ConstantRate::bytes_per_period(ByteSize::b(0), Duration::default());\n        let rescaled_rate = rate.rescale(Duration::from_secs(1));\n        assert_eq!(rescaled_rate.work_bytes(), ByteSize::b(0));\n        assert_eq!(rescaled_rate.period(), Duration::from_secs(1));\n    }\n\n    #[test]\n    fn test_rescale() {\n        let rate = ConstantRate::bytes_per_period(ByteSize::mib(5), Duration::from_secs(5));\n        let rescaled_rate = rate.rescale(Duration::from_secs(1));\n        assert_eq!(rescaled_rate.work_bytes(), ByteSize::mib(1));\n        assert_eq!(rescaled_rate.period(), Duration::from_secs(1));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/rate_estimator.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicU64, Ordering};\nuse std::time::{Duration, Instant};\n\nuse super::Rate;\n\npub trait RateEstimator: Rate {\n    fn update(&mut self, started_at: Instant, ended_at: Instant, work: u64);\n}\n\n/// Simple moving average rate estimator. Tracks the average rate of work over a sliding time\n/// window.\n#[derive(Debug, Clone)]\npub struct SmaRateEstimator {\n    inner: Arc<InnerSmaRateEstimator>,\n}\n\n#[derive(Debug)]\nstruct InnerSmaRateEstimator {\n    anchor: Instant,\n    buckets: Box<[Bucket]>,\n    bucket_period_millis: u64,\n    period_millis: u64,\n    num_buckets: u64,\n}\n\nimpl SmaRateEstimator {\n    /// Creates a new simple moving average rate estimator.\n    ///\n    /// The rate returned is the rate measured over the last `n - 1` buckets. The\n    /// ongoing bucket is not taken into account.\n    /// In other words, we are returning a rolling average that spans over a period\n    /// of `(num_buckets - 1) * bucket_period`.\n    ///\n    /// The `period` argument is just a `scaling unit`. 
A period of 1s means that\n    /// the number returned by `work` is expressed in `bytes / second`.\n    ///\n    /// This rate estimator is bucket-based and outputs the average rate of work over the previous\n    /// closed `n-1` buckets.\n    ///\n    /// # Panics\n    ///\n    /// This function panics if `bucket_period` is < 100ms or `period` is < 1ms.\n    pub fn new(num_buckets: NonZeroUsize, bucket_period: Duration, period: Duration) -> Self {\n        assert!(bucket_period.as_millis() >= 100);\n        assert!(period.as_millis() > 0);\n\n        let mut buckets = Vec::with_capacity(num_buckets.get());\n        for _ in 0..num_buckets.get() {\n            buckets.push(Bucket::default());\n        }\n        let inner = InnerSmaRateEstimator {\n            anchor: Instant::now(),\n            buckets: buckets.into_boxed_slice(),\n            bucket_period_millis: bucket_period.as_millis() as u64,\n            num_buckets: num_buckets.get() as u64,\n            period_millis: period.as_millis() as u64,\n        };\n        Self {\n            inner: Arc::new(inner),\n        }\n    }\n\n    fn work_in_bucket(&self, bucket_ord: u64) -> u64 {\n        self.inner.buckets[bucket_ord as usize % self.inner.buckets.len()]\n            .work_for_bucket(bucket_ord)\n    }\n\n    fn work_at(&self, now: Instant) -> u64 {\n        let elapsed_ms: u64 = now.duration_since(self.inner.anchor).as_millis() as u64;\n        let current_bucket_ord = elapsed_ms / self.inner.bucket_period_millis;\n        let num_buckets = self.inner.num_buckets - 1u64;\n        let bucket_range = current_bucket_ord.saturating_sub(num_buckets)..current_bucket_ord;\n        let cumulative_work: u64 = bucket_range\n            .map(|bucket_ord| self.work_in_bucket(bucket_ord))\n            .sum();\n        (cumulative_work * self.inner.period_millis)\n            / (self.inner.bucket_period_millis * num_buckets)\n    }\n}\n\nimpl Rate for SmaRateEstimator {\n    /// Returns the estimated amount of 
work performed during a `period`.\n    ///\n    /// This estimation is computed by summing the amount of work tracked in the previous\n    /// `n-1` buckets and dividing it by the duration of the `n-1` bucket periods.\n    fn work(&self) -> u64 {\n        self.work_at(Instant::now())\n    }\n\n    fn period(&self) -> Duration {\n        Duration::from_millis(self.inner.period_millis)\n    }\n}\n\n#[inline]\nfn compute_bucket_ord_hash(bucket_ord: u64) -> u8 {\n    // We pick 241 because it is the highest prime number below 256\n    // that can be computed easily.\n    //\n    // The fact that it is prime makes it so that it is complemented by the\n    // bucket id for any value of num_buckets (well except multiples of 241)\n    // thanks to the Chinese remainder theorem.\n    (bucket_ord % 241) as u8\n}\n\nimpl RateEstimator for SmaRateEstimator {\n    fn update(&mut self, _started_at: Instant, ended_at: Instant, work: u64) {\n        let elapsed = ended_at.duration_since(self.inner.anchor).as_millis() as u64;\n        let num_buckets = self.inner.num_buckets;\n        let bucket_ord = elapsed / self.inner.bucket_period_millis;\n        let bucket = &self.inner.buckets[(bucket_ord % num_buckets) as usize];\n        bucket.increment_work(work, bucket_ord);\n    }\n}\n\n/// Rate estimator bucket. 
The 56 least significant bits of the atomic integer store the amount of\n/// work, while the most significant 8 bits encode a hash of the bucket ord.\n///\n/// The hash is used to ensure that we know exactly when to reset the bucket's work.\n#[derive(Debug, Default)]\nstruct Bucket {\n    // This atomic is actually encoding two things:\n    // - low bits [0..56): the amount of work recorded in the bucket.\n    // - high bits [56..64): an 8-bit hash of the bucket ord (`bucket_ord % 241`).\n    bits: AtomicU64,\n}\n\nconst WORK_MASK: u64 = (1u64 << 56) - 1;\n\nstruct BucketVal {\n    work: u64,\n    bucket_ord_hash: u8,\n}\n\nimpl From<u64> for BucketVal {\n    #[inline]\n    fn from(bucket_bits: u64) -> BucketVal {\n        BucketVal {\n            work: bucket_bits & WORK_MASK,\n            bucket_ord_hash: (bucket_bits >> 56) as u8,\n        }\n    }\n}\n\nimpl From<BucketVal> for u64 {\n    #[inline]\n    fn from(value: BucketVal) -> Self {\n        (value.bucket_ord_hash as u64) << 56 | value.work\n    }\n}\n\nimpl Bucket {\n    fn work_for_bucket(&self, bucket_ord: u64) -> u64 {\n        let bucket_val = BucketVal::from(self.bits.load(Ordering::Relaxed));\n        if bucket_val.bucket_ord_hash == compute_bucket_ord_hash(bucket_ord) {\n            bucket_val.work\n        } else {\n            0\n        }\n    }\n\n    fn increment_work(&self, work: u64, bucket_ord: u64) {\n        let expected_bucket_ord_hash: u8 = compute_bucket_ord_hash(bucket_ord);\n        let current_bits = self.bits.fetch_add(work, Ordering::Relaxed) + work;\n        let bucket_val = BucketVal::from(current_bits);\n\n        // This is not the bucket we targeted, we need to retry and update the bucket with the new\n        // bucket_ord and a reset value.\n        if bucket_val.bucket_ord_hash != expected_bucket_ord_hash {\n            let mut expected_bits = current_bits;\n            let new_bits: u64 = BucketVal {\n                work,\n                bucket_ord_hash: 
expected_bucket_ord_hash,\n            }\n            .into();\n\n            while let Err(current_bits) = self.bits.compare_exchange(\n                expected_bits,\n                new_bits,\n                Ordering::AcqRel,\n                Ordering::Acquire,\n            ) {\n                if BucketVal::from(current_bits).bucket_ord_hash == expected_bucket_ord_hash {\n                    // Some thread managed to successfully flip the color. We're good.\n                    self.bits.fetch_add(work, Ordering::Relaxed);\n                    break;\n                } else {\n                    // We keep trying.\n                    expected_bits = current_bits;\n                }\n            }\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Barrier;\n    use std::thread;\n\n    use super::*;\n\n    #[test]\n    fn test_bucket() {\n        let bucket = Bucket::default();\n        assert_eq!(bucket.work_for_bucket(0u64), 0);\n\n        // First pass, the bucket is red.\n        bucket.increment_work(1, 0u64);\n        assert_eq!(bucket.work_for_bucket(0u64), 1);\n        assert_eq!(bucket.work_for_bucket(1u64), 0);\n\n        bucket.increment_work(2, 0u64);\n        assert_eq!(bucket.work_for_bucket(0u64), 3);\n\n        // Second pass, the bucket is now black.\n        bucket.increment_work(5, 1u64);\n        assert_eq!(bucket.work_for_bucket(1u64), 5);\n        assert_eq!(bucket.work_for_bucket(0u64), 0);\n\n        bucket.increment_work(7, 1u64);\n        assert_eq!(bucket.work_for_bucket(1u64), 12);\n\n        // Third pass, the bucket is red again.\n        bucket.increment_work(9, 2u64);\n        assert_eq!(bucket.work_for_bucket(2u64), 9);\n\n        bucket.increment_work(11, 2u64);\n        assert_eq!(bucket.work_for_bucket(2u64), 20);\n\n        for num_threads in [1, 2, 3, 5, 10, 20] {\n            let barrier = Arc::new(Barrier::new(num_threads));\n            let bucket = Arc::new(Bucket::default());\n            let mut 
cumulative_work = 0;\n            let mut handles = Vec::with_capacity(num_threads);\n\n            for i in 0..num_threads {\n                let barrier = barrier.clone();\n                let bucket = bucket.clone();\n                cumulative_work += i as u64;\n\n                handles.push(thread::spawn(move || {\n                    barrier.wait();\n                    // First time we increment the work in this second pass. All the threads will\n                    // attempt to flip the bucket's color. Only one should succeed.\n                    bucket.increment_work(i as u64, 3u64);\n                }));\n            }\n            for handle in handles {\n                handle.join().unwrap();\n            }\n            assert_eq!(bucket.work_for_bucket(3u64), cumulative_work);\n        }\n    }\n\n    #[test]\n    fn test_sma_rate_estimator() {\n        let num_buckets = NonZeroUsize::new(3).unwrap();\n        let bucket_period = Duration::from_secs(1);\n        let period = Duration::from_millis(100);\n\n        let mut estimator = SmaRateEstimator::new(num_buckets, bucket_period, period);\n        assert_eq!(estimator.work(), 0);\n        assert_eq!(estimator.period(), Duration::from_millis(100));\n\n        let anchor = estimator.inner.anchor;\n\n        let started_at = anchor;\n        let ended_at = started_at + Duration::from_millis(0);\n        estimator.update(started_at, ended_at, 100);\n        assert_eq!(estimator.inner.buckets[0].work_for_bucket(0), 100);\n\n        let ended_at = started_at + Duration::from_millis(999);\n        estimator.update(started_at, ended_at, 200);\n        assert_eq!(estimator.inner.buckets[0].work_for_bucket(0), 300);\n\n        assert_eq!(estimator.work_at(anchor), 0);\n\n        let ended_at = started_at + Duration::from_millis(1_000);\n        estimator.update(started_at, ended_at, 300);\n        assert_eq!(estimator.inner.buckets[1].work_for_bucket(1), 300);\n\n        let ended_at = started_at + 
Duration::from_millis(1_999);\n        estimator.update(started_at, ended_at, 600);\n        assert_eq!(estimator.inner.buckets[1].work_for_bucket(1), 900);\n\n        assert_eq!(\n            estimator.work_at(anchor + Duration::from_secs(2)),\n            (300 + 900) / 20\n        );\n\n        let ended_at = started_at + Duration::from_millis(2_000);\n        estimator.update(started_at, ended_at, 800);\n        assert_eq!(estimator.inner.buckets[2].work_for_bucket(2), 800);\n\n        let ended_at = started_at + Duration::from_millis(2_999);\n        estimator.update(started_at, ended_at, 1_000);\n        assert_eq!(estimator.inner.buckets[2].work_for_bucket(2), 1_800);\n\n        assert_eq!(estimator.work_at(anchor + Duration::from_secs(3)), 135);\n\n        let ended_at = started_at + Duration::from_millis(3_000);\n        estimator.update(started_at, ended_at, 500);\n        assert_eq!(estimator.inner.buckets[0].work_for_bucket(0), 0);\n        assert_eq!(estimator.inner.buckets[0].work_for_bucket(3), 500);\n    }\n\n    #[test]\n    fn test_sma_rate_skipped_bucket() {\n        let num_buckets = NonZeroUsize::new(10).unwrap();\n        let bucket_period = Duration::from_secs(1);\n        let period = Duration::from_secs(1);\n\n        let mut estimator = SmaRateEstimator::new(num_buckets, bucket_period, period);\n\n        assert_eq!(estimator.work(), 0);\n\n        let anchor = estimator.inner.anchor;\n\n        // We fill all of the bucket with 100 work.\n        for i in 0..10 {\n            let ended_at = anchor + Duration::from_secs(1) * i;\n            estimator.update(ended_at, ended_at, 100);\n        }\n\n        assert_eq!(estimator.work_at(anchor + Duration::from_secs(10)), 100);\n\n        // Now let's assume there isn't any work ongoing for 4s.\n        // Over the last 9 seconds, we have received 500 works\n        //\n        // After the reset, we should have the following buckets:\n        // We expect a mean of 44 work/s.\n        // |0, 0, 
0, 0, 0, 100*, 100, 100, 100, 100|\n        //\n        // Since the current bucket (idx = 5) is not taken into account, this leads\n        // to an average of 400 / 9 = 44 work units.\n        assert_eq!(estimator.work_at(anchor + Duration::from_secs(15)), 44);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/rate_limit.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\n\nuse futures::ready;\nuse tokio::time::{Instant, Sleep};\nuse tower::{Layer, Service};\n\nuse super::Cost;\nuse super::rate::Rate;\n\n/// Enforces a rate limit on the quantity of work the underlying\n/// service can handle over a period of time. This implementation is a generalization of\n/// `tower::limit::RateLimit`, which is limited to a constant rate of requests over a period of\n/// time.\n#[derive(Debug)]\npub struct RateLimit<S, T> {\n    inner: S,\n    rate: T,\n    state: State,\n    sleep: Pin<Box<Sleep>>,\n}\n\n#[derive(Debug)]\nenum State {\n    // The service has hit its limit.\n    Limited { debit: u64 },\n    Ready { deadline: Instant, credit: u64 },\n}\n\nimpl<S, T> RateLimit<S, T>\nwhere T: Rate\n{\n    /// Creates a new rate limiter.\n    pub fn new(inner: S, rate: T) -> Self {\n        let deadline = Instant::now();\n        let state = State::Ready {\n            deadline,\n            credit: rate.work(),\n        };\n\n        Self {\n            inner,\n            rate,\n            state,\n            // The sleep won't actually be used with this duration, but\n            // we create it eagerly so that we can reset it in place rather than\n            // `Box::pin`ning a new `Sleep` every time we need one.\n            sleep: 
Box::pin(tokio::time::sleep_until(deadline)),\n        }\n    }\n\n    /// Gets a reference to the inner service.\n    pub fn get_ref(&self) -> &S {\n        &self.inner\n    }\n\n    /// Gets a mutable reference to the inner service.\n    pub fn get_mut(&mut self) -> &mut S {\n        &mut self.inner\n    }\n\n    /// Consumes `self`, returning the inner service\n    pub fn into_inner(self) -> S {\n        self.inner\n    }\n}\n\nimpl<S, R, T> Service<R> for RateLimit<S, T>\nwhere\n    S: Service<R>,\n    R: Cost,\n    T: Rate,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = S::Future;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        let debit = match self.state {\n            State::Ready { .. } => return Poll::Ready(ready!(self.inner.poll_ready(cx))),\n            State::Limited { debit } => {\n                if Pin::new(&mut self.sleep).poll(cx).is_pending() {\n                    return Poll::Pending;\n                }\n                debit\n            }\n        };\n        let deposit = self.rate.work();\n\n        if deposit >= debit {\n            self.state = State::Ready {\n                deadline: Instant::now() + self.rate.period(),\n                credit: deposit - debit,\n            };\n            Poll::Ready(ready!(self.inner.poll_ready(cx)))\n        } else {\n            self.state = State::Limited {\n                debit: debit - deposit,\n            };\n            self.sleep\n                .as_mut()\n                .reset(Instant::now() + self.rate.period());\n            Poll::Pending\n        }\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        match self.state {\n            State::Ready {\n                mut deadline,\n                mut credit,\n            } => {\n                let now = Instant::now();\n\n                // If the period has elapsed, reset it.\n                if now >= deadline {\n                   
 deadline = now + self.rate.period();\n                    credit = self.rate.work();\n                }\n                let withdrawal = request.cost();\n\n                if credit >= withdrawal {\n                    credit -= withdrawal;\n                    self.state = State::Ready { deadline, credit };\n                } else {\n                    // The service is disabled until further notice\n                    // Reset the sleep future in place, so that we don't have to\n                    // deallocate the existing box and allocate a new one.\n                    let debit = withdrawal - credit;\n                    self.state = State::Limited { debit };\n                    self.sleep.as_mut().reset(deadline);\n                }\n\n                // Call the inner future\n                self.inner.call(request)\n            }\n            State::Limited { .. } => {\n                panic!(\"Service not ready; `poll_ready` must be called first!\")\n            }\n        }\n    }\n}\n\n/// Enforces a rate limit on the quantity of work the underlying\n/// service can handle over a period of time.\n#[derive(Debug, Clone)]\npub struct RateLimitLayer<T> {\n    rate: T,\n}\n\nimpl<T> RateLimitLayer<T> {\n    /// Creates new rate limit layer.\n    pub fn new(rate: T) -> Self {\n        Self { rate }\n    }\n}\n\nimpl<S, T> Layer<S> for RateLimitLayer<T>\nwhere T: Rate\n{\n    type Service = RateLimit<S, T>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        RateLimit::new(service, self.rate.clone())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicU64, Ordering};\n    use std::time::Duration;\n\n    use futures::future::join_all;\n    use tower::{ServiceBuilder, ServiceExt};\n\n    use super::*;\n    use crate::tower::buffer::BufferError;\n    use crate::tower::{BufferLayer, ConstantRate};\n\n    struct Request {\n        cost: u64,\n    }\n\n    impl Request {\n        fn random() -> Self 
{\n            Self {\n                cost: rand::random::<u64>() % 100,\n            }\n        }\n    }\n\n    impl Cost for Request {\n        fn cost(&self) -> u64 {\n            self.cost\n        }\n    }\n\n    #[derive(Debug, Clone, thiserror::Error)]\n    #[error(\"rate meter error\")]\n    struct RateMeterError;\n\n    impl From<BufferError> for RateMeterError {\n        fn from(_: BufferError) -> Self {\n            Self\n        }\n    }\n\n    #[derive(Debug, Clone)]\n    struct RateMeter {\n        cumulated_work: Arc<AtomicU64>,\n    }\n\n    impl RateMeter {\n        fn new() -> Self {\n            Self {\n                cumulated_work: Arc::new(AtomicU64::new(0)),\n            }\n        }\n    }\n\n    impl Service<Request> for RateMeter {\n        type Response = ();\n        type Error = RateMeterError;\n        type Future = futures::future::Ready<Result<Self::Response, Self::Error>>;\n\n        fn poll_ready(&mut self, _: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n\n        fn call(&mut self, request: Request) -> Self::Future {\n            self.cumulated_work\n                .fetch_add(request.cost, Ordering::Relaxed);\n            futures::future::ready(Ok(()))\n        }\n    }\n\n    #[tokio::test]\n    async fn test_rate_limit_over_multiple_periods() {\n        let work = 1000;\n        let period = 100;\n\n        let rate = ConstantRate::new(work, Duration::from_millis(period));\n        let meter = RateMeter::new();\n        let mut service = ServiceBuilder::new()\n            .layer(BufferLayer::new(10))\n            .layer(RateLimitLayer::new(rate))\n            .service(meter.clone());\n\n        let now = Instant::now();\n        service\n            .ready()\n            .await\n            .unwrap()\n            .call(Request { cost: 1 })\n            .await\n            .unwrap();\n        // The request should go through immediately but in some rare instance the test is 
slow to\n        // run and the call to `call` takes more than 1 ms.\n        assert!(now.elapsed() < Duration::from_millis(5));\n\n        let now = Instant::now();\n        // The first request goes through, but the second one is rate limited.\n        service\n            .ready()\n            .await\n            .unwrap()\n            .call(Request { cost: 2 * work - 1 })\n            .await\n            .unwrap();\n        service\n            .ready()\n            .await\n            .unwrap()\n            .call(Request { cost: 1 })\n            .await\n            .unwrap();\n        assert!(now.elapsed() >= Duration::from_millis(period));\n        assert!(now.elapsed() < Duration::from_millis(2 * period));\n    }\n\n    #[tokio::test]\n    async fn test_rate_limit() {\n        let work = 1000;\n        let period = 100;\n        let deadline = 500;\n        let expected_cumulated_work = work * (deadline / period);\n\n        let rate = ConstantRate::new(work, Duration::from_millis(period));\n        let meter = RateMeter::new();\n        let service = ServiceBuilder::new()\n            .layer(BufferLayer::new(10))\n            .layer(RateLimitLayer::new(rate))\n            .service(meter.clone());\n\n        let futures = (0..5).map(|_| {\n            let mut service = service.clone();\n            tokio::time::timeout(Duration::from_millis(deadline), async move {\n                loop {\n                    service\n                        .ready()\n                        .await\n                        .unwrap()\n                        .call(Request::random())\n                        .await\n                        .unwrap();\n                }\n            })\n        });\n        join_all(futures).await;\n        let cumulated_work = meter.cumulated_work.load(Ordering::Relaxed);\n        assert!(cumulated_work > expected_cumulated_work * 95 / 100);\n        assert!(cumulated_work < expected_cumulated_work * 105 / 100)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/retry.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::type_name;\nuse std::fmt;\n\nuse tokio::time::Sleep;\nuse tower::Layer;\nuse tower::retry::{Policy, Retry};\nuse tracing::debug;\n\nuse crate::retry::{RetryParams, Retryable};\n\n/// Retry layer copy/pasted from `tower::retry::RetryLayer`\n/// but which implements `Clone`.\nimpl<P, S> Layer<S> for RetryLayer<P>\nwhere P: Clone\n{\n    type Service = Retry<P, S>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        let policy = self.policy.clone();\n        Retry::new(policy, service)\n    }\n}\n\n#[derive(Clone, Debug)]\npub struct RetryLayer<P> {\n    policy: P,\n}\n\nimpl<P> RetryLayer<P> {\n    /// Create a new [`RetryLayer`] from a retry policy\n    pub fn new(policy: P) -> Self {\n        RetryLayer { policy }\n    }\n}\n\n#[derive(Clone, Copy, Debug)]\npub struct RetryPolicy {\n    num_attempts: usize,\n    retry_params: RetryParams,\n}\n\nimpl From<RetryParams> for RetryPolicy {\n    fn from(retry_params: RetryParams) -> Self {\n        Self {\n            num_attempts: 0,\n            retry_params,\n        }\n    }\n}\n\nimpl<R, T, E> Policy<R, T, E> for RetryPolicy\nwhere\n    R: Clone,\n    E: fmt::Debug + Retryable,\n{\n    type Future = Sleep;\n\n    fn retry(&mut self, _request: &mut R, result: &mut Result<T, E>) -> Option<Self::Future> {\n        match result {\n            Ok(_) => None,\n            
Err(error) => {\n                self.num_attempts += 1;\n\n                if !error.is_retryable() || self.num_attempts >= self.retry_params.max_attempts {\n                    None\n                } else {\n                    let delay = self.retry_params.compute_delay(self.num_attempts);\n                    debug!(\n                        num_attempts=%self.num_attempts,\n                        delay_millis=%delay.as_millis(),\n                        error=?error,\n                        \"{} request failed, retrying.\", type_name::<R>()\n                    );\n                    let sleep_fut = tokio::time::sleep(delay);\n                    Some(sleep_fut)\n                }\n            }\n        }\n    }\n\n    fn clone_request(&mut self, request: &R) -> Option<R> {\n        Some(request.clone())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::atomic::{AtomicUsize, Ordering};\n    use std::sync::{Arc, Mutex};\n    use std::task::{Context, Poll};\n\n    use futures::future::{Ready, ready};\n    use tower::{Layer, Service, ServiceExt};\n\n    use super::*;\n\n    #[derive(Debug, Eq, PartialEq)]\n    pub enum Retry<E> {\n        Permanent(E),\n        Transient(E),\n    }\n\n    impl<E> Retryable for Retry<E> {\n        fn is_retryable(&self) -> bool {\n            match self {\n                Retry::Permanent(_) => false,\n                Retry::Transient(_) => true,\n            }\n        }\n    }\n\n    #[derive(Debug, Clone, Default)]\n    struct HelloService;\n\n    type HelloResults = Arc<Mutex<Vec<Result<(), Retry<()>>>>>;\n\n    #[derive(Debug, Clone, Default)]\n    struct HelloRequest {\n        num_attempts: Arc<AtomicUsize>,\n        results: HelloResults,\n    }\n\n    impl Service<HelloRequest> for HelloService {\n        type Response = ();\n        type Error = Retry<()>;\n        type Future = Ready<Result<(), Retry<()>>>;\n\n        fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n      
      Poll::Ready(Ok(()))\n        }\n\n        fn call(&mut self, request: HelloRequest) -> Self::Future {\n            request.num_attempts.fetch_add(1, Ordering::Relaxed);\n            let result = request\n                .results\n                .lock()\n                .expect(\"lock should not be poisoned\")\n                .pop()\n                .unwrap_or(Err(Retry::Permanent(())));\n            ready(result)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_retry_policy() {\n        let retry_policy = RetryPolicy::from(RetryParams::for_test());\n        let retry_layer = RetryLayer::new(retry_policy);\n        let mut retry_hello_service = retry_layer.layer(HelloService);\n\n        let hello_request = HelloRequest {\n            results: Arc::new(Mutex::new(vec![Ok(())])),\n            ..Default::default()\n        };\n        retry_hello_service\n            .ready()\n            .await\n            .unwrap()\n            .call(hello_request.clone())\n            .await\n            .unwrap();\n        assert_eq!(hello_request.num_attempts.load(Ordering::Relaxed), 1);\n\n        let hello_request = HelloRequest {\n            results: Arc::new(Mutex::new(vec![Ok(()), Err(Retry::Transient(()))])),\n            ..Default::default()\n        };\n        retry_hello_service\n            .ready()\n            .await\n            .unwrap()\n            .call(hello_request.clone())\n            .await\n            .unwrap();\n        assert_eq!(hello_request.num_attempts.load(Ordering::Relaxed), 2);\n\n        let hello_request = HelloRequest {\n            results: Arc::new(Mutex::new(vec![\n                Err(Retry::Transient(())),\n                Err(Retry::Transient(())),\n                Err(Retry::Transient(())),\n            ])),\n            ..Default::default()\n        };\n        retry_hello_service\n            .ready()\n            .await\n            .unwrap()\n            .call(hello_request.clone())\n            .await\n           
 .unwrap_err();\n        assert_eq!(hello_request.num_attempts.load(Ordering::Relaxed), 3);\n\n        let hello_request = HelloRequest::default();\n        retry_hello_service\n            .ready()\n            .await\n            .unwrap()\n            .call(hello_request.clone())\n            .await\n            .unwrap_err();\n        assert_eq!(hello_request.num_attempts.load(Ordering::Relaxed), 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/timeout.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\nuse std::time::Duration;\n\nuse pin_project::pin_project;\nuse tokio::time::Sleep;\nuse tower::{Layer, Service};\n\n#[derive(Debug, Clone)]\npub struct Timeout<S> {\n    service: S,\n    timeout: Duration,\n}\nimpl<S> Timeout<S> {\n    /// Creates a new [`Timeout`]\n    pub fn new(service: S, timeout: Duration) -> Self {\n        Timeout { service, timeout }\n    }\n}\n\nimpl<S, R> Service<R> for Timeout<S>\nwhere\n    S: Service<R>,\n    S::Error: From<TimeoutExceeded>,\n{\n    type Response = S::Response;\n    type Error = S::Error;\n    type Future = TimeoutFuture<S::Future>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.service.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: R) -> Self::Future {\n        TimeoutFuture {\n            inner: self.service.call(request),\n            sleep: tokio::time::sleep(self.timeout),\n        }\n    }\n}\n\n/// The error type for the `Timeout` service.\n#[derive(Debug, PartialEq, Eq)]\npub struct TimeoutExceeded;\n\n#[pin_project]\n#[derive(Debug)]\npub struct TimeoutFuture<F> {\n    #[pin]\n    inner: F,\n    #[pin]\n    sleep: Sleep,\n}\n\nimpl<F, T, E> Future for TimeoutFuture<F>\nwhere\n    F: Future<Output = Result<T, E>>,\n    E: 
From<TimeoutExceeded>,\n{\n    type Output = Result<T, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let this = self.project();\n\n        match this.inner.poll(cx) {\n            Poll::Ready(v) => return Poll::Ready(v),\n            Poll::Pending => {}\n        }\n\n        // Now check the timeout\n        match this.sleep.poll(cx) {\n            Poll::Pending => Poll::Pending,\n            Poll::Ready(_) => Poll::Ready(Err(TimeoutExceeded.into())),\n        }\n    }\n}\n\n/// This is similar to tower's Timeout Layer except it requires\n/// the error of the service to implement `From<TimeoutExceeded>`.\n///\n/// If the inner service does not complete within the specified duration,\n/// the response will be aborted with the error `TimeoutExceeded`.\n///\n/// Note that when used in combination with a retry layer, this should be\n/// stacked on top of it for the timeout to be retried.\n#[derive(Debug, Clone)]\npub struct TimeoutLayer {\n    timeout: Duration,\n}\n\nimpl TimeoutLayer {\n    /// Creates a new `TimeoutLayer` with the specified delay.\n    pub fn new(timeout: Duration) -> Self {\n        Self { timeout }\n    }\n}\n\nimpl<S> Layer<S> for TimeoutLayer {\n    type Service = Timeout<S>;\n\n    fn layer(&self, service: S) -> Self::Service {\n        Timeout::new(service, self.timeout)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tokio::time::Duration;\n    use tower::{ServiceBuilder, ServiceExt};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_timeout() {\n        let delay = Duration::from_millis(100);\n        let mut service = ServiceBuilder::new()\n            .layer(TimeoutLayer::new(delay))\n            .service_fn(|_| async {\n                // sleep for 1 sec\n                tokio::time::sleep(Duration::from_secs(1)).await;\n                Ok::<_, TimeoutExceeded>(())\n            });\n\n        let res = service.ready().await.unwrap().call(()).await;\n        assert_eq!(res, 
Err(TimeoutExceeded));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/tower/transport.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::convert::Infallible;\nuse std::fmt;\nuse std::hash::Hash;\nuse std::net::SocketAddr;\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\nuse std::time::Duration;\n\nuse futures::stream::once;\nuse futures::{Stream, StreamExt};\nuse tokio::sync::{mpsc, watch};\nuse tokio_stream::wrappers::UnboundedReceiverStream;\nuse tonic::transport::channel::ClientTlsConfig;\nuse tonic::transport::{Channel, Endpoint, Uri};\nuse tower::balance::p2c::Balance;\nuse tower::buffer::Buffer;\nuse tower::discover::Change as TowerChange;\nuse tower::load::{CompleteOnResponse, PendingRequestsDiscover};\nuse tower::{BoxError, Service, ServiceExt};\n\nuse super::{BoxFuture, Change};\nuse crate::BoxStream;\n\n// Transforms a boxed stream of `Change<K, Channel>` into a stream of `Result<TowerChange<K,\n// Channel>, Infallible>>` while keeping track of the number of connections.\nstruct ChangeStreamAdapter<K> {\n    changes: BoxStream<Change<K, Channel>>,\n    connection_keys_tx: watch::Sender<HashSet<K>>,\n    keys: HashSet<K>,\n}\n\n// A blanket `Discover` implementation exists for any `Stream<Item = Result<Change<K, V>, E>>`\nimpl<K> Stream for ChangeStreamAdapter<K>\nwhere K: Hash + Eq + Clone\n{\n    type Item = Result<TowerChange<K, Channel>, Infallible>;\n\n    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> 
Poll<Option<Self::Item>> {\n        match Pin::new(&mut *self.changes).poll_next(cx) {\n            Poll::Pending | Poll::Ready(None) => Poll::Pending,\n            Poll::Ready(Some(change)) => match change {\n                Change::Insert(key, channel) => {\n                    if self.keys.insert(key.clone()) {\n                        self.connection_keys_tx.send_modify(|connection_keys| {\n                            connection_keys.insert(key.clone());\n                        });\n                    }\n                    Poll::Ready(Some(Ok(TowerChange::Insert(key, channel))))\n                }\n                Change::Remove(key) => {\n                    if self.keys.remove(&key) {\n                        self.connection_keys_tx.send_modify(|connection_keys| {\n                            connection_keys.remove(&key);\n                        });\n                    }\n                    Poll::Ready(Some(Ok(TowerChange::Remove(key))))\n                }\n            },\n        }\n    }\n}\n\nimpl<K> Unpin for ChangeStreamAdapter<K> where K: Hash + Eq + Clone {}\n\ntype HttpRequest = http::Request<tonic::body::Body>;\ntype HttpResponse = http::Response<tonic::body::Body>;\ntype ChangeStream<K> = UnboundedReceiverStream<Result<TowerChange<K, Channel>, Infallible>>;\ntype Discover<K> = PendingRequestsDiscover<ChangeStream<K>, CompleteOnResponse>;\ntype ChannelImpl<K> =\n    Buffer<HttpRequest, <Balance<Discover<K>, HttpRequest> as Service<HttpRequest>>::Future>;\n\n#[derive(Clone)]\npub struct BalanceChannel<K: Hash + Eq + Clone + Send> {\n    inner: ChannelImpl<K>,\n    connection_keys_rx: watch::Receiver<HashSet<K>>,\n}\n\nimpl<K> BalanceChannel<K>\nwhere K: Hash + Eq + Send + Sync + Clone + 'static\n{\n    pub fn new() -> (Self, mpsc::UnboundedSender<Change<K, Channel>>) {\n        let (change_tx, change_rx) = mpsc::unbounded_channel();\n        let changes = UnboundedReceiverStream::new(change_rx);\n        let channel = 
Self::from_stream(changes);\n        (channel, change_tx)\n    }\n\n    pub fn from_channel(key: K, channel: Channel) -> Self {\n        Self::from_stream(once(Box::pin(async { Change::Insert(key, channel) })))\n    }\n\n    pub fn from_stream<S>(changes: S) -> Self\n    where S: Stream<Item = Change<K, Channel>> + Send + Unpin + 'static {\n        let (connection_keys_tx, connection_keys_rx) = watch::channel(HashSet::new());\n        let change_stream = unlazy_stream(ChangeStreamAdapter::<K> {\n            changes: Box::pin(changes),\n            connection_keys_tx,\n            keys: HashSet::new(),\n        });\n        let completion = CompleteOnResponse::default();\n        let pending_requests_discover = PendingRequestsDiscover::new(change_stream, completion);\n        let balance_svc = Balance::new(pending_requests_discover);\n        let buffer_svc = Buffer::new(balance_svc, 512);\n\n        BalanceChannel {\n            inner: buffer_svc,\n            connection_keys_rx,\n        }\n    }\n\n    pub fn num_connections(&self) -> usize {\n        self.connection_keys_rx.borrow().len()\n    }\n\n    pub fn connection_keys_watcher(&self) -> watch::Receiver<HashSet<K>> {\n        self.connection_keys_rx.clone()\n    }\n\n    pub async fn wait_for(\n        &self,\n        timeout_after: Duration,\n        predicate: impl Fn(&HashSet<K>) -> bool,\n    ) -> bool {\n        tokio::time::timeout(\n            timeout_after,\n            self.connection_keys_watcher().wait_for(predicate),\n        )\n        .await\n        .is_ok()\n    }\n}\n\n/// `tower::buffer::Buffer` and `tower::balance::Balance` lazily poll their inner services. As a\n/// result, the underlying discover stream is only polled when requests are made to the\n/// `BalanceChannel`. When the channel is idle, the pool of connections is not updated and\n/// `num_connections` can be inaccurate. 
Since this number is used to determine whether a service is\n/// ready or not, we must poll the stream eagerly to always supply an up-to-date value.\nfn unlazy_stream<S, T>(mut inner_stream: S) -> UnboundedReceiverStream<T>\nwhere\n    T: Send + 'static,\n    S: Stream<Item = T> + Send + Unpin + 'static,\n{\n    let (outer_stream_tx, outer_stream_rx) = mpsc::unbounded_channel();\n    let future = async move {\n        while let Some(item) = inner_stream.next().await {\n            if outer_stream_tx.send(item).is_err() {\n                break;\n            }\n        }\n    };\n    tokio::spawn(future);\n    UnboundedReceiverStream::new(outer_stream_rx)\n}\n\nimpl<K> Service<HttpRequest> for BalanceChannel<K>\nwhere K: Hash + Eq + Clone + Send\n{\n    type Response = HttpResponse;\n    type Error = BoxError;\n    type Future = BoxFuture<HttpResponse, BoxError>;\n\n    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {\n        self.inner.poll_ready(cx)\n    }\n\n    fn call(&mut self, request: HttpRequest) -> Self::Future {\n        Box::pin(self.inner.call(request))\n    }\n}\n\nimpl<K> fmt::Debug for BalanceChannel<K>\nwhere K: Hash + Eq + Clone + Send + Sync + 'static\n{\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"BalanceChannel\")\n            .field(\"num_connections\", &self.num_connections())\n            .finish()\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq)]\npub struct KeepAliveConfig {\n    pub interval: Duration,\n    pub timeout: Duration,\n}\n\n#[derive(Clone, Default)]\npub struct ClientGrpcConfig {\n    pub keep_alive_opt: Option<KeepAliveConfig>,\n    pub tls_config_opt: Option<ClientTlsConfig>,\n}\n\n/// Creates a channel from a socket address.\n///\n/// The function is marked as `async` because it requires an executor (`connect_lazy`).\npub async fn make_channel(\n    socket_addr: SocketAddr,\n    client_grpc_config: ClientGrpcConfig,\n) -> Channel {\n    let 
ClientGrpcConfig {\n        keep_alive_opt,\n        tls_config_opt,\n    } = client_grpc_config;\n    let scheme = if tls_config_opt.is_some() {\n        \"https\"\n    } else {\n        \"http\"\n    };\n    let uri = Uri::builder()\n        .scheme(scheme)\n        .authority(socket_addr.to_string())\n        .path_and_query(\"/\")\n        .build()\n        .expect(\"provided arguments should be valid\");\n    let mut endpoint = Endpoint::from(uri).connect_timeout(Duration::from_secs(5));\n    if let Some(tls_config) = tls_config_opt {\n        // TODO: Return an error instead of panicking on an invalid TLS config.\n        endpoint = endpoint.tls_config(tls_config).expect(\"TLS config should be valid\");\n    }\n    if let Some(keep_alive) = keep_alive_opt {\n        endpoint = endpoint\n            .keep_alive_while_idle(true)\n            .http2_keep_alive_interval(keep_alive.interval)\n            .keep_alive_timeout(keep_alive.timeout);\n    }\n    endpoint.connect_lazy()\n}\n\n/// Forces a channel to initiate the underlying HTTP connection. Calling this function only makes\n/// sense for channels connected lazily.\n///\n/// The function is marked as `async` because it requires a tokio runtime.\npub async fn warmup_channel(channel: Channel) {\n    tokio::spawn(channel.ready_oneshot());\n}\n\n#[cfg(test)]\nmod tests {\n    use futures::StreamExt;\n    use tonic::transport::Endpoint;\n    use tower::ServiceExt;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_channel_discover() {\n        let (change_tx, change_rx) = mpsc::unbounded_channel();\n        let (connection_keys_tx, connection_keys_rx) = watch::channel(HashSet::new());\n\n        let mut channel_discover = ChangeStreamAdapter::<&str> {\n            changes: Box::pin(UnboundedReceiverStream::new(change_rx)),\n            connection_keys_tx,\n            keys: HashSet::new(),\n        };\n        assert!(connection_keys_rx.borrow().is_empty());\n\n        let channel = Endpoint::from_static(\"http://[::1]:1212\").connect_lazy();\n        change_tx.send(Change::Insert(\"foo\", 
channel)).unwrap();\n\n        let change = channel_discover.next().await.unwrap().unwrap();\n        assert!(matches!(change, TowerChange::Insert(\"foo\", _)));\n        assert_eq!(*connection_keys_rx.borrow(), HashSet::from_iter([\"foo\"]));\n\n        let channel = Endpoint::from_static(\"http://[::1]:1337\").connect_lazy();\n        change_tx.send(Change::Insert(\"foo\", channel)).unwrap();\n\n        let change = channel_discover.next().await.unwrap().unwrap();\n        assert!(matches!(change, TowerChange::Insert(\"foo\", _)));\n        assert_eq!(*connection_keys_rx.borrow(), HashSet::from_iter([\"foo\"]));\n\n        change_tx.send(Change::Remove(\"bar\")).unwrap();\n        let change = channel_discover.next().await.unwrap().unwrap();\n\n        assert!(matches!(change, TowerChange::Remove(\"bar\")));\n        assert_eq!(*connection_keys_rx.borrow(), HashSet::from_iter([\"foo\"]));\n\n        change_tx.send(Change::Remove(\"foo\")).unwrap();\n        let change = channel_discover.next().await.unwrap().unwrap();\n\n        assert!(matches!(change, TowerChange::Remove(\"foo\")));\n        assert!(connection_keys_rx.borrow().is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_balance_channel() {\n        let (mut balance_channel, change_tx) = BalanceChannel::<&str>::new();\n        let mut num_connections_watcher = balance_channel.connection_keys_watcher();\n        assert_eq!(balance_channel.num_connections(), 0);\n\n        let channel = Endpoint::from_static(\"http://[::1]:1212\").connect_lazy();\n        change_tx.send(Change::Insert(\"foo\", channel)).unwrap();\n        num_connections_watcher.changed().await.unwrap();\n        assert_eq!(balance_channel.num_connections(), 1);\n\n        change_tx.send(Change::Remove(\"foo\")).unwrap();\n        num_connections_watcher.changed().await.unwrap();\n        assert_eq!(balance_channel.num_connections(), 0);\n\n        // `ready()` is lying... 
See `unlazy_stream()` comment.\n        balance_channel.ready().await.unwrap();\n\n        // The rest of the test lives in the `quickwit-codegen-example` crate.\n        // TODO: Move the test here.\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/type_map.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::{Any, TypeId};\nuse std::collections::HashMap;\n\n#[derive(Debug, Default)]\npub struct TypeMap(HashMap<TypeId, Box<dyn Any + Send + Sync>>);\n\nimpl TypeMap {\n    pub fn contains<T: Any + Send + Sync>(&self) -> bool {\n        self.0.contains_key(&TypeId::of::<T>())\n    }\n\n    pub fn insert<T: Any + Send + Sync>(&mut self, instance: T) {\n        self.0.insert(TypeId::of::<T>(), Box::new(instance));\n    }\n\n    pub fn get<T: Any + Send + Sync>(&self) -> Option<&T> {\n        self.0.get(&TypeId::of::<T>()).map(|instance| {\n            instance\n                .downcast_ref::<T>()\n                .expect(\"Instance should be of type T.\")\n        })\n    }\n\n    pub fn get_mut<T: Any + Send + Sync>(&mut self) -> Option<&mut T> {\n        self.0.get_mut(&TypeId::of::<T>()).map(|instance| {\n            instance\n                .downcast_mut::<T>()\n                .expect(\"Instance should be of type T.\")\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-common/src/uri.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::env;\nuse std::fmt::{Debug, Display};\nuse std::hash::Hash;\nuse std::path::{Component, Path, PathBuf};\nuse std::str::FromStr;\n\nuse anyhow::{Context, bail};\nuse once_cell::sync::OnceCell;\nuse regex::Regex;\nuse serde::de::Error;\nuse serde::{Deserialize, Serialize, Serializer};\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Hash, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\n#[repr(u8)]\npub enum Protocol {\n    Actor = 1,\n    Azure = 2,\n    File = 3,\n    Grpc = 4,\n    PostgreSQL = 5,\n    Ram = 6,\n    S3 = 7,\n    Google = 8,\n}\n\nimpl Protocol {\n    pub fn as_str(&self) -> &str {\n        match &self {\n            Protocol::Actor => \"actor\",\n            Protocol::Azure => \"azure\",\n            Protocol::File => \"file\",\n            Protocol::Grpc => \"grpc\",\n            Protocol::PostgreSQL => \"postgresql\",\n            Protocol::Ram => \"ram\",\n            Protocol::S3 => \"s3\",\n            Protocol::Google => \"gs\",\n        }\n    }\n\n    pub fn is_file(&self) -> bool {\n        matches!(&self, Protocol::File)\n    }\n\n    pub fn is_file_storage(&self) -> bool {\n        matches!(&self, Protocol::File | Protocol::Ram)\n    }\n\n    pub fn is_object_storage(&self) -> bool {\n        matches!(&self, Protocol::Azure | Protocol::S3 | Protocol::Google)\n    }\n\n   
 pub fn is_database(&self) -> bool {\n        matches!(&self, Protocol::PostgreSQL)\n    }\n}\n\nimpl Display for Protocol {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        write!(formatter, \"{}\", self.as_str())\n    }\n}\n\nimpl FromStr for Protocol {\n    type Err = anyhow::Error;\n\n    fn from_str(protocol: &str) -> anyhow::Result<Self> {\n        match protocol {\n            \"azure\" => Ok(Protocol::Azure),\n            \"file\" => Ok(Protocol::File),\n            \"grpc\" => Ok(Protocol::Grpc),\n            \"actor\" => Ok(Protocol::Actor),\n            \"pg\" | \"postgres\" | \"postgresql\" => Ok(Protocol::PostgreSQL),\n            \"ram\" => Ok(Protocol::Ram),\n            \"s3\" => Ok(Protocol::S3),\n            \"gs\" => Ok(Protocol::Google),\n            _ => bail!(\"unknown URI protocol `{protocol}`\"),\n        }\n    }\n}\n\nconst PROTOCOL_SEPARATOR: &str = \"://\";\n\n/// Encapsulates the URI type.\n///\n/// A URI's string representation is guaranteed to start\n/// with the protocol's `as_str()` representation.\n///\n/// # Disclaimer\n///\n/// A `Uri` has to be built using `Uri::from_str`.\n/// This function has some normalization behavior.\n/// Some protocols have several acceptable string representations (`pg`, `postgres`, `postgresql`).\n///\n/// If the representation in the input string is not canonical, it will get normalized.\n/// In other words, a parsed URI may not have the same string representation as the original\n/// string.\n#[derive(Clone, Eq, PartialEq, Hash)]\npub struct Uri {\n    uri: String,\n    protocol: Protocol,\n}\n\nimpl Uri {\n    /// This is only used in tests. 
We artificially restrict the lifetime to 'static\n    /// to avoid misuses.\n    pub fn for_test(uri: &'static str) -> Self {\n        Uri::from_str(uri).unwrap()\n    }\n\n    /// Returns the extension of the URI.\n    pub fn extension(&self) -> Option<&str> {\n        Path::new(&self.uri).extension()?.to_str()\n    }\n\n    /// Returns the URI as a string slice.\n    pub fn as_str(&self) -> &str {\n        &self.uri\n    }\n\n    /// Returns the protocol of the URI.\n    pub fn protocol(&self) -> Protocol {\n        self.protocol\n    }\n\n    /// Strips sensitive information such as credentials from URI.\n    fn as_redacted_str(&self) -> Cow<'_, str> {\n        if self.protocol().is_database() {\n            static DATABASE_URI_PATTERN: OnceCell<Regex> = OnceCell::new();\n            DATABASE_URI_PATTERN\n                .get_or_init(|| {\n                    Regex::new(\"(?P<before>^.*://.*)(?P<password>:.*@)(?P<after>.*)\")\n                        .expect(\"regular expression should compile\")\n                })\n                .replace(&self.uri, \"$before:***redacted***@$after\")\n        } else {\n            Cow::Borrowed(&self.uri)\n        }\n    }\n\n    pub fn redact(&mut self) {\n        self.uri = self.as_redacted_str().into_owned();\n    }\n\n    /// Returns the file path of the URI.\n    /// Applies only to `file://` and `ram://` URIs.\n    pub fn filepath(&self) -> Option<&Path> {\n        if self.protocol().is_file_storage() {\n            Some(self.path())\n        } else {\n            None\n        }\n    }\n\n    /// Returns the parent URI.\n    /// Does not apply to PostgreSQL URIs.\n    pub fn parent(&self) -> Option<Uri> {\n        if self.protocol().is_database() {\n            return None;\n        }\n        let path = self.path();\n        let protocol = self.protocol();\n\n        if protocol == Protocol::S3 && path.components().count() < 2 {\n            return None;\n        }\n        if protocol == Protocol::Azure && 
path.components().count() < 3 {\n            return None;\n        }\n        if protocol == Protocol::Google && path.components().count() < 2 {\n            return None;\n        }\n        let parent_path = path.parent()?;\n\n        Some(Self {\n            uri: format!(\"{protocol}{PROTOCOL_SEPARATOR}{}\", parent_path.display()),\n            protocol,\n        })\n    }\n\n    fn path(&self) -> &Path {\n        Path::new(&self.uri[self.protocol.as_str().len() + PROTOCOL_SEPARATOR.len()..])\n    }\n\n    /// Returns the last component of the URI.\n    pub fn file_name(&self) -> Option<&Path> {\n        if self.protocol() == Protocol::PostgreSQL {\n            return None;\n        }\n        let path = self.path();\n\n        if self.protocol() == Protocol::S3 && path.components().count() < 2 {\n            return None;\n        }\n        if self.protocol() == Protocol::Azure && path.components().count() < 3 {\n            return None;\n        }\n        if self.protocol() == Protocol::Google && path.components().count() < 2 {\n            return None;\n        }\n        path.file_name().map(Path::new)\n    }\n\n    /// Consumes the [`Uri`] struct and returns the normalized URI as a string.\n    pub fn into_string(self) -> String {\n        self.uri\n    }\n\n    /// Creates a new [`Uri`] with `path` adjoined to `self`.\n    /// Fails if `path` is absolute.\n    pub fn join<P: AsRef<Path> + std::fmt::Debug>(&self, path: P) -> anyhow::Result<Self> {\n        if path.as_ref().is_absolute() {\n            bail!(\n                \"cannot join URI `{}` with absolute path `{:?}`\",\n                self.uri,\n                path\n            );\n        }\n        let joined = match self.protocol() {\n            Protocol::File => Path::new(&self.uri)\n                .join(path)\n                .to_string_lossy()\n                .to_string(),\n            Protocol::PostgreSQL => bail!(\n                \"cannot join PostgreSQL URI `{}` with path `{:?}`\",\n   
             self.uri,\n                path\n            ),\n            _ => format!(\n                \"{}{}{}\",\n                self.uri,\n                if self.uri.ends_with('/') { \"\" } else { \"/\" },\n                path.as_ref().display(),\n            ),\n        };\n        Ok(Self {\n            uri: joined,\n            protocol: self.protocol,\n        })\n    }\n\n    /// Attempts to construct a [`Uri`] from a string.\n    /// A `file://` protocol is assumed if not specified.\n    /// File URIs are resolved (normalized) relative to the current working directory\n    /// unless an absolute path is specified.\n    /// Handles special characters such as `~`, `.`, `..`.\n    fn parse_str(uri_str: &str) -> anyhow::Result<Self> {\n        // CAUTION: Do not display the URI in error messages to avoid leaking credentials.\n        if uri_str.is_empty() {\n            bail!(\"failed to parse empty URI\");\n        }\n        let (protocol, mut path) = match uri_str.split_once(PROTOCOL_SEPARATOR) {\n            None => (Protocol::File, uri_str.to_string()),\n            Some((protocol, path)) => (Protocol::from_str(protocol)?, path.to_string()),\n        };\n        if protocol == Protocol::File {\n            if path.starts_with('~') {\n                // We only accept `~` (alias to the home directory) and `~/path/to/something`.\n                // If there is something following the `~` that is not `/`, we bail.\n                if path.len() > 1 && !path.starts_with(\"~/\") {\n                    bail!(\"failed to normalize URI: tilde expansion is only partially supported\");\n                }\n\n                let home_dir_path = home::home_dir()\n                    .context(\"failed to normalize URI: could not resolve home directory\")?\n                    .to_string_lossy()\n                    .to_string();\n\n                path.replace_range(0..1, &home_dir_path);\n            }\n            if Path::new(&path).is_relative() {\n            
    let current_dir = env::current_dir().context(\n                    \"failed to normalize URI: could not resolve current working directory. the \\\n                     directory does not exist or user has insufficient permissions\",\n                )?;\n                path = current_dir.join(path).to_string_lossy().to_string();\n            }\n            path = normalize_path(Path::new(&path))\n                .to_string_lossy()\n                .to_string();\n        }\n        Ok(Self {\n            uri: format!(\"{protocol}{PROTOCOL_SEPARATOR}{path}\"),\n            protocol,\n        })\n    }\n}\n\nimpl AsRef<str> for Uri {\n    fn as_ref(&self) -> &str {\n        &self.uri\n    }\n}\n\nimpl Debug for Uri {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        formatter\n            .debug_struct(\"Uri\")\n            .field(\"uri\", &self.as_redacted_str())\n            .finish()\n    }\n}\n\nimpl Display for Uri {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        write!(formatter, \"{}\", self.as_redacted_str())\n    }\n}\n\nimpl FromStr for Uri {\n    type Err = anyhow::Error;\n\n    fn from_str(uri_str: &str) -> anyhow::Result<Self> {\n        Uri::parse_str(uri_str)\n    }\n}\n\nimpl PartialEq<&str> for Uri {\n    fn eq(&self, other: &&str) -> bool {\n        &self.uri == other\n    }\n}\n\nimpl PartialEq<String> for Uri {\n    fn eq(&self, other: &String) -> bool {\n        &self.uri == other\n    }\n}\n\nimpl<'de> Deserialize<'de> for Uri {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: serde::Deserializer<'de> {\n        let uri_str: Cow<'de, str> = Deserialize::deserialize(deserializer)?;\n        let uri = Uri::from_str(&uri_str).map_err(D::Error::custom)?;\n        Ok(uri)\n    }\n}\n\nimpl Serialize for Uri {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        
serializer.serialize_str(&self.uri)\n    }\n}\n\n/// Normalizes a path by resolving the `.` and `..` components.\n/// Unlike `Path::canonicalize`, this helper is purely lexical:\n/// it does not access the filesystem, so it neither checks file existence\n/// nor resolves symlinks.\n/// <https://github.com/rust-lang/cargo/blob/fede83ccf973457de319ba6fa0e36ead454d2e20/src/cargo/util/paths.rs#L61>\nfn normalize_path(path: &Path) -> PathBuf {\n    let mut components = path.components().peekable();\n    let mut resulting_path_buf =\n        if let Some(component @ Component::Prefix(..)) = components.peek().cloned() {\n            components.next();\n            PathBuf::from(component.as_os_str())\n        } else {\n            PathBuf::new()\n        };\n\n    for component in components {\n        match component {\n            Component::Prefix(..) => unreachable!(),\n            Component::RootDir => {\n                resulting_path_buf.push(component.as_os_str());\n            }\n            Component::CurDir => {}\n            Component::ParentDir => {\n                resulting_path_buf.pop();\n            }\n            Component::Normal(inner_component) => {\n                resulting_path_buf.push(inner_component);\n            }\n        }\n    }\n    resulting_path_buf\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_try_new_uri() {\n        Uri::from_str(\"\").unwrap_err();\n\n        let home_dir = home::home_dir().unwrap();\n        let current_dir = env::current_dir().unwrap();\n\n        let uri = Uri::from_str(\"file:///home/foo/bar\").unwrap();\n        assert_eq!(uri.protocol(), Protocol::File);\n        assert_eq!(uri.filepath(), Some(Path::new(\"/home/foo/bar\")));\n        assert_eq!(uri, \"file:///home/foo/bar\");\n        assert_eq!(uri, \"file:///home/foo/bar\".to_string());\n        assert_eq!(\n            Uri::from_str(\"file:///foo./bar..\").unwrap(),\n            \"file:///foo./bar..\"\n        );\n        
assert_eq!(\n            Uri::from_str(\"home/homer/docs/dognuts\").unwrap(),\n            format!(\"file://{}/home/homer/docs/dognuts\", current_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"home/homer/docs/../dognuts\").unwrap(),\n            format!(\"file://{}/home/homer/dognuts\", current_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"home/homer/docs/../../dognuts\").unwrap(),\n            format!(\"file://{}/home/dognuts\", current_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"/home/homer/docs/dognuts\").unwrap(),\n            \"file:///home/homer/docs/dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"~\").unwrap(),\n            format!(\"file://{}\", home_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"~/\").unwrap(),\n            format!(\"file://{}\", home_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"~anything/bar\").unwrap_err().to_string(),\n            \"failed to normalize URI: tilde expansion is only partially supported\"\n        );\n        assert_eq!(\n            Uri::from_str(\"~/.\").unwrap(),\n            format!(\"file://{}\", home_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"~/..\").unwrap(),\n            format!(\"file://{}\", home_dir.parent().unwrap().display())\n        );\n        assert_eq!(\n            Uri::from_str(\"file://\").unwrap(),\n            format!(\"file://{}\", current_dir.display())\n        );\n        assert_eq!(Uri::from_str(\"file:///\").unwrap(), \"file:///\");\n        assert_eq!(\n            Uri::from_str(\"file://.\").unwrap(),\n            format!(\"file://{}\", current_dir.display())\n        );\n        assert_eq!(\n            Uri::from_str(\"file://..\").unwrap(),\n            format!(\"file://{}\", current_dir.parent().unwrap().display())\n        );\n        assert_eq!(\n            
Uri::from_str(\"s3://home/homer/docs/dognuts\").unwrap(),\n            \"s3://home/homer/docs/dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"s3://home/homer/docs/../dognuts\").unwrap(),\n            \"s3://home/homer/docs/../dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"azure://account/container/docs/dognuts\").unwrap(),\n            \"azure://account/container/docs/dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"azure://account/container/homer/docs/../dognuts\").unwrap(),\n            \"azure://account/container/homer/docs/../dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"gs://bucket/docs/dognuts\").unwrap(),\n            \"gs://bucket/docs/dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"gs://bucket/homer/docs/../dognuts\").unwrap(),\n            \"gs://bucket/homer/docs/../dognuts\"\n        );\n        assert_eq!(\n            Uri::from_str(\"actor://localhost:7281/an-actor-id\").unwrap(),\n            \"actor://localhost:7281/an-actor-id\"\n        );\n\n        assert_eq!(\n            Uri::from_str(\"http://localhost:9000/quickwit\")\n                .unwrap_err()\n                .to_string(),\n            \"unknown URI protocol `http`\"\n        );\n    }\n\n    #[test]\n    fn test_uri_protocol() {\n        assert_eq!(Uri::for_test(\"file:///home\").protocol(), Protocol::File);\n        assert_eq!(Uri::for_test(\"ram:///in-memory\").protocol(), Protocol::Ram);\n        assert_eq!(Uri::for_test(\"s3://bucket/key\").protocol(), Protocol::S3);\n        assert_eq!(\n            Uri::for_test(\"azure://account/bucket/key\").protocol(),\n            Protocol::Azure\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/key\").protocol(),\n            Protocol::Google\n        );\n        assert_eq!(\n            Uri::for_test(\"postgres://localhost:5432/metastore\").protocol(),\n            Protocol::PostgreSQL\n        
);\n        assert_eq!(\n            Uri::for_test(\"postgresql://localhost:5432/metastore\").protocol(),\n            Protocol::PostgreSQL\n        );\n    }\n\n    #[test]\n    fn test_uri_extension() {\n        assert!(Uri::for_test(\"s3://\").extension().is_none());\n\n        assert_eq!(\n            Uri::for_test(\"s3://config.json\").extension().unwrap(),\n            \"json\"\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://config.foo\").extension().unwrap(),\n            \"foo\"\n        );\n    }\n\n    #[test]\n    fn test_uri_join() {\n        assert_eq!(\n            Uri::for_test(\"file:///\").join(\"foo\").unwrap(),\n            \"file:///foo\"\n        );\n        assert_eq!(\n            Uri::for_test(\"file:///foo\").join(\"bar\").unwrap(),\n            \"file:///foo/bar\"\n        );\n        assert_eq!(\n            Uri::for_test(\"file:///foo/\").join(\"bar\").unwrap(),\n            \"file:///foo/bar\"\n        );\n        assert_eq!(\n            Uri::for_test(\"ram://foo\").join(\"bar\").unwrap(),\n            \"ram://foo/bar\"\n        );\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/\").join(\"key\").unwrap(),\n            \"s3://bucket/key\"\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://account/container\")\n                .join(\"key\")\n                .unwrap(),\n            \"azure://account/container/key\"\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket\").join(\"key\").unwrap(),\n            \"gs://bucket/key\"\n        );\n        Uri::for_test(\"s3://bucket/\").join(\"/key\").unwrap_err();\n        Uri::for_test(\"azure://account/container/\")\n            .join(\"/key\")\n            .unwrap_err();\n        Uri::for_test(\"postgres://username:password@localhost:5432/metastore\")\n            .join(\"table\")\n            .unwrap_err();\n    }\n\n    #[test]\n    fn test_uri_parent() {\n        
assert!(Uri::for_test(\"file:///\").parent().is_none());\n        assert_eq!(Uri::for_test(\"file:///foo\").parent().unwrap(), \"file:///\");\n        assert_eq!(Uri::for_test(\"file:///foo/\").parent().unwrap(), \"file:///\");\n        assert_eq!(\n            Uri::for_test(\"file:///foo/bar\").parent().unwrap(),\n            \"file:///foo\"\n        );\n        assert!(\n            Uri::for_test(\"postgres://localhost:5432/db\")\n                .parent()\n                .is_none()\n        );\n\n        assert!(Uri::for_test(\"ram:///\").parent().is_none());\n        assert_eq!(Uri::for_test(\"ram:///foo\").parent().unwrap(), \"ram:///\");\n        assert_eq!(Uri::for_test(\"ram:///foo/\").parent().unwrap(), \"ram:///\");\n        assert_eq!(\n            Uri::for_test(\"ram:///foo/bar\").parent().unwrap(),\n            \"ram:///foo\"\n        );\n        assert!(Uri::for_test(\"s3://bucket\").parent().is_none());\n        assert!(Uri::for_test(\"s3://bucket/\").parent().is_none());\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/foo\").parent().unwrap(),\n            \"s3://bucket\"\n        );\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/foo/\").parent().unwrap(),\n            \"s3://bucket\"\n        );\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/foo/bar\").parent().unwrap(),\n            \"s3://bucket/foo\"\n        );\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/foo/bar/\").parent().unwrap(),\n            \"s3://bucket/foo\"\n        );\n        assert!(Uri::for_test(\"azure://account/\").parent().is_none());\n        assert!(Uri::for_test(\"azure://account\").parent().is_none());\n        assert!(\n            Uri::for_test(\"azure://account/container/\")\n                .parent()\n                .is_none()\n        );\n        assert!(\n            Uri::for_test(\"azure://account/container\")\n                .parent()\n                .is_none()\n        );\n        assert_eq!(\n       
     Uri::for_test(\"azure://account/container/foo\")\n                .parent()\n                .unwrap(),\n            \"azure://account/container\"\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://account/container/foo/\")\n                .parent()\n                .unwrap(),\n            \"azure://account/container\"\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://account/container/foo/bar\")\n                .parent()\n                .unwrap(),\n            \"azure://account/container/foo\"\n        );\n        assert!(Uri::for_test(\"gs://bucket\").parent().is_none());\n        assert!(Uri::for_test(\"gs://bucket/\").parent().is_none());\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/foo\").parent().unwrap(),\n            \"gs://bucket\"\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/foo/\").parent().unwrap(),\n            \"gs://bucket\"\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/foo/bar\").parent().unwrap(),\n            \"gs://bucket/foo\"\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/foo/bar/\").parent().unwrap(),\n            \"gs://bucket/foo\"\n        );\n    }\n\n    #[test]\n    fn test_uri_file_name() {\n        assert!(Uri::for_test(\"file:///\").file_name().is_none());\n        assert_eq!(\n            Uri::for_test(\"file:///foo\").file_name().unwrap(),\n            Path::new(\"foo\")\n        );\n        assert_eq!(\n            Uri::for_test(\"file:///foo/\").file_name().unwrap(),\n            Path::new(\"foo\")\n        );\n        assert!(\n            Uri::for_test(\"postgres://localhost:5432/db\")\n                .file_name()\n                .is_none()\n        );\n\n        assert!(Uri::for_test(\"ram:///\").file_name().is_none());\n        assert_eq!(\n            Uri::for_test(\"ram:///foo\").file_name().unwrap(),\n            Path::new(\"foo\")\n        );\n        assert_eq!(\n            
Uri::for_test(\"ram:///foo/\").file_name().unwrap(),\n            Path::new(\"foo\")\n        );\n        assert!(Uri::for_test(\"s3://bucket\").file_name().is_none());\n        assert!(Uri::for_test(\"s3://bucket/\").file_name().is_none());\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/foo\").file_name().unwrap(),\n            Path::new(\"foo\"),\n        );\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/foo/\").file_name().unwrap(),\n            Path::new(\"foo\"),\n        );\n        assert!(Uri::for_test(\"azure://account\").file_name().is_none());\n        assert!(Uri::for_test(\"azure://account/\").file_name().is_none());\n        assert!(\n            Uri::for_test(\"azure://account/container\")\n                .file_name()\n                .is_none()\n        );\n        assert!(\n            Uri::for_test(\"azure://account/container/\")\n                .file_name()\n                .is_none()\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://account/container/foo\")\n                .file_name()\n                .unwrap(),\n            Path::new(\"foo\"),\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://account/container/foo/\")\n                .file_name()\n                .unwrap(),\n            Path::new(\"foo\"),\n        );\n        assert!(Uri::for_test(\"gs://bucket\").file_name().is_none());\n        assert!(Uri::for_test(\"gs://bucket/\").file_name().is_none());\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/foo\").file_name().unwrap(),\n            Path::new(\"foo\"),\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/foo/\").file_name().unwrap(),\n            Path::new(\"foo\"),\n        );\n    }\n\n    #[test]\n    fn test_uri_filepath() {\n        assert_eq!(\n            Uri::for_test(\"file:///\").filepath().unwrap(),\n            Path::new(\"/\")\n        );\n        assert_eq!(\n            
Uri::for_test(\"file:///foo\").filepath().unwrap(),\n            Path::new(\"/foo\")\n        );\n        assert_eq!(Uri::for_test(\"ram:///\").filepath().unwrap(), Path::new(\"/\"));\n        assert_eq!(\n            Uri::for_test(\"ram:///foo\").filepath().unwrap(),\n            Path::new(\"/foo\")\n        );\n        assert!(Uri::for_test(\"s3://bucket/\").filepath().is_none());\n        assert!(\n            Uri::for_test(\"azure://account/container/\")\n                .filepath()\n                .is_none()\n        );\n        assert!(\n            Uri::for_test(\"azure://account/container/foo.json\")\n                .filepath()\n                .is_none()\n        );\n        assert!(Uri::for_test(\"gs://bucket/\").filepath().is_none());\n    }\n\n    #[test]\n    fn test_uri_as_redacted_str() {\n        assert_eq!(\n            Uri::for_test(\"s3://bucket/key\").as_redacted_str(),\n            \"s3://bucket/key\"\n        );\n        assert_eq!(\n            Uri::for_test(\"azure://account/container/key\").as_redacted_str(),\n            \"azure://account/container/key\"\n        );\n        assert_eq!(\n            Uri::for_test(\"gs://bucket/key\").as_redacted_str(),\n            \"gs://bucket/key\"\n        );\n        assert_eq!(\n            Uri::for_test(\"postgres://localhost:5432/metastore\").as_redacted_str(),\n            \"postgresql://localhost:5432/metastore\"\n        );\n        assert_eq!(\n            Uri::for_test(\"pg://username@localhost:5432/metastore\").as_redacted_str(),\n            \"postgresql://username@localhost:5432/metastore\"\n        );\n        {\n            for protocol in [\"postgres\", \"postgresql\"] {\n                let uri = Uri::from_str(&format!(\n                    \"{protocol}://username:password@localhost:5432/metastore\"\n                ))\n                .unwrap();\n                let expected_uri =\n                    \"postgresql://username:***redacted***@localhost:5432/metastore\".to_string();\n    
            assert_eq!(uri.as_redacted_str(), expected_uri);\n                assert_eq!(format!(\"{uri}\"), expected_uri);\n                assert_eq!(\n                    format!(\"{uri:?}\"),\n                    format!(\"Uri {{ uri: \\\"{expected_uri}\\\" }}\")\n                );\n            }\n        }\n    }\n\n    #[test]\n    fn test_uri_serialize() {\n        let uri = Uri::for_test(\"s3://bucket/key\");\n        assert_eq!(\n            serde_json::to_value(uri).unwrap(),\n            serde_json::Value::String(\"s3://bucket/key\".to_string())\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/Cargo.toml",
    "content": "[package]\nname = \"quickwit-config\"\ndescription = \"Define and manage Quickwit configuration objects\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nchrono = { workspace = true }\ncron = { workspace = true }\nenum-iterator = { workspace = true }\nhttp = { workspace = true }\nhttp-serde = { workspace = true }\nhumantime = { workspace = true }\nitertools = { workspace = true }\njson_comments = { workspace = true }\nnew_string_template = { workspace = true }\nonce_cell = { workspace = true }\nregex = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nserde_with = { workspace = true }\nserde_yaml = { workspace = true }\nsiphasher = { workspace = true }\ntoml = { workspace = true }\ntracing = { workspace = true }\nutoipa = { workspace = true }\nvrl = { workspace = true, optional = true }\n\nquickwit-common = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\ntokio = { workspace = true }\n\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\n\n[features]\ntestsuite = []\nvrl = [\"dep:vrl\"]\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/index_config/hdfs-logs-create-config.yaml",
    "content": "version: 0.8\n\ndoc_mapping:\n  field_mappings:\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: timestamp\n      type: i64\n      fast: true\n\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/index_config/hdfs-logs.json",
    "content": "# Comments are supported.\n{\n    \"version\": \"0.7\",\n    \"index_id\": \"hdfs-logs\",\n    \"index_uri\": \"s3://quickwit-indexes/hdfs-logs\",\n    \"doc_mapping\": {\n        \"tokenizers\": [\n            {\n                \"name\": \"service_regex\",\n                \"type\": \"regex\",\n                \"pattern\": \"\\\\w*\"\n            }\n        ],\n        \"field_mappings\": [\n            {\n                \"name\": \"tenant_id\",\n                \"type\": \"u64\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"timestamp\",\n                \"type\": \"datetime\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"severity_text\",\n                \"type\": \"text\",\n                \"tokenizer\": \"raw\"\n            },\n            {\n                \"name\": \"body\",\n                \"type\": \"text\",\n                \"tokenizer\": \"default\",\n                \"record\": \"position\"\n            },\n            {\n                \"name\": \"resource\",\n                \"type\": \"object\",\n                \"field_mappings\": [\n                    {\n                        \"name\": \"service\",\n                        \"type\": \"text\",\n                        \"tokenizer\": \"service_regex\"\n                    }\n                ]\n            }\n        ],\n        \"tag_fields\": [\"tenant_id\"],\n        \"timestamp_field\": \"timestamp\",\n        \"store_source\": true\n    },\n    \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n    },\n    \"indexing_settings\": {\n        \"commit_timeout_secs\": 61,\n        \"split_num_docs_target\": 10000001,\n        \"merge_policy\": {\n            \"type\": \"stable_log\",\n            \"merge_factor\": 9,\n            \"max_merge_factor\": 11,\n            \"maturation_period\": \"48 hours\"\n        },\n        \"resources\": {\n           
 \"heap_size\": \"3G\"\n        }\n    },\n    \"ingest_settings\": {\n        \"min_shards\": 12\n    },\n    \"search_settings\": {\n        \"default_search_fields\": [\"severity_text\", \"body\"]\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/index_config/hdfs-logs.toml",
    "content": "version = \"0.7\"\nindex_id = \"hdfs-logs\"\nindex_uri = \"s3://quickwit-indexes/hdfs-logs\"\n\n[doc_mapping]\ntokenizers = [\n  { name = \"service_regex\", type = \"regex\", pattern = \"\\\\w*\" },\n]\nfield_mappings = [\n  { name = \"tenant_id\", type = \"u64\", fast = true },\n  { name = \"timestamp\", type = \"datetime\", fast = true },\n  { name = \"severity_text\", type = \"text\", tokenizer = \"raw\" },\n  { name = \"body\", type = \"text\", tokenizer = \"default\", record = \"position\" },\n  { name = \"resource\", type = \"object\", field_mappings = [ { name = \"service\", type = \"text\", tokenizer = \"service_regex\" } ] },\n]\ntag_fields = [ \"tenant_id\" ]\nstore_source = true\ntimestamp_field = \"timestamp\"\n\n[retention]\nperiod = \"90 days\"\nschedule = \"daily\"\n\n[indexing_settings]\ncommit_timeout_secs = 61\nsplit_num_docs_target = 10_000_001\n\n[indexing_settings.merge_policy]\ntype = \"stable_log\"\nmerge_factor = 9\nmax_merge_factor = 11\nmaturation_period = \"48 hours\"\n\n[indexing_settings.resources]\nheap_size = \"3G\"\n\n[ingest_settings]\nmin_shards = 12\n\n[search_settings]\ndefault_search_fields = [ \"severity_text\", \"body\" ]\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/index_config/hdfs-logs.yaml",
    "content": "version: 0.8\nindex_id: hdfs-logs\nindex_uri: s3://quickwit-indexes/hdfs-logs\n\ndoc_mapping:\n  tokenizers:\n    - name: service_regex\n      type: regex\n      pattern: \"\\\\w*\"\n  field_mappings:\n    - name: tenant_id\n      type: u64\n      fast: true\n    - name: timestamp\n      type: datetime\n      fast: true\n    - name: severity_text\n      type: text\n      tokenizer: raw\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: resource\n      type: object\n      field_mappings:\n        - name: service\n          type: text\n          tokenizer: service_regex\n  tag_fields: [tenant_id]\n  timestamp_field: timestamp\n  store_source: true\n\nretention:\n  period: 90 days\n  schedule: daily\n\nindexing_settings:\n  commit_timeout_secs: 61\n  split_num_docs_target: 10000001\n  merge_policy:\n    type: \"stable_log\"\n    merge_factor: 9\n    max_merge_factor: 11\n    maturation_period: 48 hours\n  resources:\n    heap_size: 3G\n\ningest_settings:\n  min_shards: 12\n\nsearch_settings:\n  default_search_fields: [severity_text, body]\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/index_config/minimal-hdfs-logs.yaml",
    "content": "version: 0.8\n\nindex_id: hdfs-logs\nindex_uri: s3://quickwit-indexes/hdfs-logs\n\ndoc_mapping:\n  field_mappings:\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n\nsearch_settings:\n  default_search_fields: [body]\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/index_config/partial-hdfs-logs.yaml",
    "content": "version: 0.8\n\nindex_id: hdfs-logs\nindex_uri: s3://quickwit-indexes/hdfs-logs\n\ndoc_mapping:\n  field_mappings:\n    - name: body\n      type: text\n      tokenizer: default\n      record: position\n    - name: timestamp\n      type: i64\n      fast: true\n\nindexing_settings:\n  commit_timeout_secs: 42\n  merge_policy:\n    type: \"stable_log\"\n\nsearch_settings:\n  default_search_fields: [body]\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/node_config/quickwit.json",
    "content": "# Comments are supported.\n{\n    \"version\": \"0.7\",\n    \"cluster_id\": \"quickwit-cluster\",\n    \"node_id\": \"my-unique-node-id\",\n    \"availability_zone\": \"az-1\",\n    \"enabled_services\": [\n        \"janitor\",\n        \"metastore\"\n    ],\n    \"listen_address\": \"0.0.0.0\",\n    \"advertise_address\": \"172.0.0.12\",\n    \"gossip_listen_port\": 2222,\n    \"grpc_listen_port\": 3333,\n    \"peer_seeds\": [\n        \"quickwit-searcher-0.local\",\n        \"quickwit-searcher-1.local\"\n    ],\n    \"data_dir\": \"/opt/quickwit/data\",\n    \"metastore_uri\": \"postgres://username:password@host:port/db\",\n    \"default_index_root_uri\": \"s3://quickwit-indexes\",\n    \"rest\": {\n        \"listen_port\": 1111,\n        \"extra_headers\": {\n            \"x-header-1\": \"header-value-1\",\n            \"x-header-2\": \"header-value-2\"\n        }\n    },\n    \"grpc\": {\n        \"max_message_size\": \"10 MB\"\n    },\n    \"storage\": {\n        \"azure\": {\n            \"account\": \"quickwit-dev\"\n        },\n        \"s3\": {\n            \"flavor\": \"gcs\",\n            \"endpoint\": \"http://localhost:4566\",\n            \"force_path_style_access\": true\n        }\n    },\n    \"metastore\": {\n        \"postgres\": {\n            \"min_connections\": 1,\n            \"max_num_connections\": 12,\n            \"acquire_connection_timeout\": \"30s\",\n            \"idle_connection_timeout\": \"30min\",\n            \"max_connection_lifetime\": \"1h\"\n        }\n    },\n    \"indexer\": {\n        \"enable_otlp_endpoint\": true,\n        \"split_store_max_num_bytes\": \"1T\",\n        \"split_store_max_num_splits\": 10000,\n        \"max_concurrent_split_uploads\": 8,\n        \"max_merge_write_throughput\": \"100mb\",\n        \"merge_concurrency\": 2\n    },\n    \"ingest_api\": {\n        \"replication_factor\": 2\n    },\n    \"searcher\": {\n        \"aggregation_memory_limit\": \"1G\",\n        
\"aggregation_bucket_limit\": 500000,\n        \"fast_field_cache_capacity\": \"10G\",\n        \"split_footer_cache_capacity\": \"1G\",\n        \"max_num_concurrent_split_streams\": 120,\n        \"max_num_concurrent_split_searches\": 150,\n        \"storage_timeout_policy\": {\n            \"min_throughtput_bytes_per_secs\": 100000,\n            \"timeout_millis\": 2000,\n            \"max_num_retries\": 2\n        },\n        \"lambda\": {\n            \"function_name\": \"quickwit-lambda-leaf-search\",\n            \"max_splits_per_invocation\": 10,\n            \"offload_threshold\": 30,\n            \"auto_deploy\": {\n                \"execution_role_arn\": \"arn:aws:iam::123456789012:role/quickwit-lambda-role\",\n                \"memory_size\": \"5 GiB\",\n                \"invocation_timeout_secs\": 15\n            }\n        }\n    },\n    \"jaeger\": {\n        \"enable_endpoint\": true,\n        \"lookback_period_hours\": 24,\n        \"max_trace_duration_secs\": 600,\n        \"max_fetch_spans\": 1000\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/node_config/quickwit.toml",
    "content": "version = \"0.7\"\n\ncluster_id = \"quickwit-cluster\"\nnode_id = \"my-unique-node-id\"\navailability_zone = \"az-1\"\nenabled_services = [ \"janitor\", \"metastore\" ]\nlisten_address = \"0.0.0.0\"\nadvertise_address = \"172.0.0.12\"\ngossip_listen_port = 2222\ngrpc_listen_port = 3333\npeer_seeds = [ \"quickwit-searcher-0.local\", \"quickwit-searcher-1.local\" ]\ndata_dir = \"/opt/quickwit/data\"\nmetastore_uri = \"postgres://username:password@host:port/db\"\ndefault_index_root_uri = \"s3://quickwit-indexes\"\n\n[rest]\nlisten_port = 1111\n\n[rest.extra_headers]\nx-header-1 = \"header-value-1\"\nx-header-2 = \"header-value-2\"\n\n[grpc]\nmax_message_size = \"10 MB\"\n\n[storage.azure]\naccount = \"quickwit-dev\"\n\n[storage.s3]\nflavor = \"gcs\"\nendpoint = \"http://localhost:4566\"\nforce_path_style_access = true\n\n[metastore.postgres]\nmin_connections = 1\nmax_num_connections = 12\nacquire_connection_timeout = \"30s\"\nidle_connection_timeout = \"30min\"\nmax_connection_lifetime = \"1h\"\n\n[indexer]\nenable_otlp_endpoint = true\nsplit_store_max_num_bytes = \"1T\"\nsplit_store_max_num_splits = 10_000\nmax_concurrent_split_uploads = 8\nmax_merge_write_throughput = \"100mb\"\nmerge_concurrency = 2\n\n[ingest_api]\nreplication_factor = 2\n\n[searcher]\naggregation_memory_limit = \"1G\"\naggregation_bucket_limit = 500_000\nfast_field_cache_capacity = \"10G\"\nsplit_footer_cache_capacity = \"1G\"\nmax_num_concurrent_split_streams = 120\nmax_num_concurrent_split_searches = 150\n\n[searcher.storage_timeout_policy]\nmin_throughtput_bytes_per_secs = 100000\ntimeout_millis = 2000\nmax_num_retries = 2\n\n[searcher.lambda]\nfunction_name = \"quickwit-lambda-leaf-search\"\nmax_splits_per_invocation = 10\noffload_threshold = 30\n\n[searcher.lambda.auto_deploy]\nexecution_role_arn = \"arn:aws:iam::123456789012:role/quickwit-lambda-role\"\nmemory_size = \"5 GiB\"\ninvocation_timeout_secs = 15\n\n[jaeger]\nenable_endpoint = true\nlookback_period_hours = 
24\nmax_trace_duration_secs = 600\nmax_fetch_spans = 1_000\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/node_config/quickwit.wrongkey.yaml",
    "content": "version: 0.8\nsearcher:\n  fast_field_cache_capacity: 10G\n  # Typo here: the key is supposed to be `max_num_concurrent_split_searches`.\n  max_num_concurrent_split_searches_with_typo: 150\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/node_config/quickwit.yaml",
    "content": "version: 0.8\n\ncluster_id: quickwit-cluster\nnode_id: my-unique-node-id\navailability_zone: az-1\nenabled_services:\n  - janitor\n  - metastore\nlisten_address: 0.0.0.0\nadvertise_address: 172.0.0.12\ngossip_listen_port: 2222\ngrpc_listen_port: 3333\npeer_seeds:\n  - quickwit-searcher-0.local\n  - quickwit-searcher-1.local\ndata_dir: /opt/quickwit/data\nmetastore_uri: postgres://username:password@host:port/db\ndefault_index_root_uri: s3://quickwit-indexes\n\nrest:\n  listen_port: 1111\n  extra_headers:\n    x-header-1: header-value-1\n    x-header-2: header-value-2\n\ngrpc:\n  max_message_size: 10 MB\n\nstorage:\n  azure:\n    account: quickwit-dev\n  s3:\n    flavor: gcs\n    endpoint: http://localhost:4566\n    force_path_style_access: true\n\nmetastore:\n  postgres:\n    min_connections: 1\n    max_num_connections: 12\n    acquire_connection_timeout: 30s\n    idle_connection_timeout: 30min\n    max_connection_lifetime: 1h\n\nindexer:\n  enable_otlp_endpoint: true\n  split_store_max_num_bytes: 1T\n  split_store_max_num_splits: 10000\n  max_concurrent_split_uploads: 8\n  max_merge_write_throughput: 100mb\n  merge_concurrency: 2\n\ningest_api:\n  replication_factor: 2\n\nsearcher:\n  aggregation_memory_limit: 1G\n  aggregation_bucket_limit: 500000\n  fast_field_cache_capacity: 10G\n  split_footer_cache_capacity: 1G\n  max_num_concurrent_split_streams: 120\n  max_num_concurrent_split_searches: 150\n  storage_timeout_policy:\n    min_throughtput_bytes_per_secs: 100000\n    timeout_millis: 2000\n    max_num_retries: 2\n  lambda:\n    function_name: quickwit-lambda-leaf-search\n    max_splits_per_invocation: 10\n    offload_threshold: 30\n    auto_deploy:\n      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role\n      memory_size: 5 GiB\n      invocation_timeout_secs: 15\n\njaeger:\n  enable_endpoint: true\n  lookback_period_hours: 24\n  max_trace_duration_secs: 600\n  max_fetch_spans: 1000\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/source_config/ingest-api-source.json",
    "content": "{\n  \"version\": \"0.7\",\n  \"source_id\": \"_ingest-api-source\",\n  \"enabled\": true,\n  \"source_type\": \"ingest-api\",\n  \"transform\": {\n    \"script\": \".message = downcase(string!(.message))\"\n  }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/source_config/kafka-source.json",
    "content": "{\n    \"version\": \"0.7\",\n    \"source_id\": \"hdfs-logs-kafka-source\",\n    \"desired_num_pipelines\": 2,\n    \"source_type\": \"kafka\",\n    \"params\": {\n        \"topic\": \"cloudera-cluster-logs\",\n        \"client_params\": {\n            \"bootstrap.servers\": \"localhost:9092\"\n        }\n    },\n    \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"local\"\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/resources/tests/source_config/kinesis-source.yaml",
    "content": "version: 0.8\nsource_id: hdfs-logs-kinesis-source\nsource_type: kinesis\nparams:\n  stream_name: emr-cluster-logs\ntransform:\n  script: .message = downcase(string!(.message))\n  timezone: local\n"
  },
  {
    "path": "quickwit/quickwit-config/src/cluster_config/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytesize::ByteSize;\nuse quickwit_common::uri::Uri;\n\n/// An embryo of a cluster config.\n// TODO: Move to `quickwit-config` and version object.\n#[derive(Debug, Clone)]\npub struct ClusterConfig {\n    pub cluster_id: String,\n    pub auto_create_indexes: bool,\n    pub default_index_root_uri: Uri,\n    pub replication_factor: usize,\n    pub shard_throughput_limit: ByteSize,\n    pub shard_scale_up_factor: f32,\n}\n\nimpl ClusterConfig {\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        ClusterConfig {\n            cluster_id: \"test-cluster\".to_string(),\n            auto_create_indexes: false,\n            default_index_root_uri: Uri::for_test(\"ram:///indexes\"),\n            replication_factor: 1,\n            shard_throughput_limit: quickwit_common::shared_consts::DEFAULT_SHARD_THROUGHPUT_LIMIT,\n            shard_scale_up_factor: 1.01,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/config_value.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::str::FromStr;\nuse std::{any, fmt};\n\nuse anyhow::{self, Context};\nuse serde::{Deserialize, Deserializer};\nuse tracing::warn;\n\nuse crate::qw_env_vars::{QW_ENV_VARS, QW_NONE};\n\n#[derive(Debug, Clone, Eq, PartialEq)]\npub(crate) struct ConfigValue<T, const E: usize> {\n    /// Value provided by the user in a config file.\n    provided: Option<T>,\n    /// Value provided by Quickwit as default.\n    default: Option<T>,\n}\n\nimpl<T, const E: usize> ConfigValue<T, E>\nwhere\n    T: FromStr,\n    <T as FromStr>::Err: fmt::Debug,\n{\n    pub(crate) fn with_default(value: T) -> Self {\n        Self {\n            provided: None,\n            default: Some(value),\n        }\n    }\n\n    pub(crate) fn none() -> Self {\n        Self {\n            provided: None,\n            default: None,\n        }\n    }\n\n    #[cfg(test)]\n    pub(crate) fn for_test(value: T) -> Self {\n        Self {\n            provided: Some(value),\n            default: None,\n        }\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub(crate) fn unwrap(self) -> T {\n        self.provided.or(self.default).unwrap()\n    }\n\n    pub(crate) fn resolve_optional(\n        self,\n        env_vars: &HashMap<String, String>,\n    ) -> anyhow::Result<Option<T>> {\n        // QW env vars take precedence over the config 
file values.\n        if E > QW_NONE\n            && let Some(env_var_key) = QW_ENV_VARS.get(&E)\n            && let Some(env_var_value) = env_vars.get(*env_var_key).filter(|val| {\n                if val.is_empty() {\n                    warn!(\n                        \"environment variable `{}` is set but value is empty\",\n                        *env_var_key\n                    );\n                    false\n                } else {\n                    true\n                }\n            })\n        {\n            let value = env_var_value.parse::<T>().map_err(|error| {\n                anyhow::anyhow!(\n                    \"failed to convert value `{env_var_value}` read from environment variable \\\n                     `{env_var_key}` to type `{}`: {error:?}\",\n                    any::type_name::<T>(),\n                )\n            })?;\n            return Ok(Some(value));\n        }\n        Ok(self.provided.or(self.default))\n    }\n\n    pub(crate) fn resolve(self, env_vars: &HashMap<String, String>) -> anyhow::Result<T> {\n        self.resolve_optional(env_vars)?.context(\n            \"failed to resolve field value: no value was provided via environment variable or \\\n             config file, and the field has no default\",\n        )\n    }\n}\n\nimpl<T, const E: usize> Default for ConfigValue<T, E>\nwhere T: Default\n{\n    fn default() -> Self {\n        Self {\n            provided: None,\n            default: Some(T::default()),\n        }\n    }\n}\n\nimpl<'de, T, const E: usize> Deserialize<'de> for ConfigValue<T, E>\nwhere T: Deserialize<'de>\n{\n    fn deserialize<D>(deserializer: D) -> Result<ConfigValue<T, E>, D::Error>\n    where D: Deserializer<'de> {\n        let value: Option<T> = Deserialize::deserialize(deserializer)?;\n        Ok(ConfigValue {\n            provided: value,\n            default: None,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::qw_env_vars::{\n        
QW_AVAILABILITY_ZONE, QW_CLUSTER_ID, QW_GOSSIP_LISTEN_PORT, QW_NODE_ID, QW_REST_LISTEN_PORT,\n    };\n\n    #[test]\n    fn test_config_value_resolve_optional() {\n        {\n            let env_vars = HashMap::new();\n            let rest_listen_port = ConfigValue::<usize, QW_REST_LISTEN_PORT>::none();\n            assert!(\n                rest_listen_port\n                    .resolve_optional(&env_vars)\n                    .unwrap()\n                    .is_none()\n            );\n        }\n        {\n            let env_vars = HashMap::new();\n            let rest_listen_port = ConfigValue::<usize, QW_REST_LISTEN_PORT>::with_default(7280);\n            assert_eq!(\n                rest_listen_port\n                    .resolve_optional(&env_vars)\n                    .unwrap()\n                    .unwrap(),\n                7280\n            );\n        }\n        {\n            let env_vars = HashMap::new();\n            let rest_listen_port = ConfigValue::<usize, QW_REST_LISTEN_PORT> {\n                provided: Some(5678),\n                default: Some(7820),\n            };\n            assert_eq!(\n                rest_listen_port\n                    .resolve_optional(&env_vars)\n                    .unwrap()\n                    .unwrap(),\n                5678\n            );\n        }\n        {\n            let mut env_vars = HashMap::new();\n            env_vars.insert(\"QW_REST_LISTEN_PORT\".to_string(), \"foobar\".to_string());\n            let rest_listen_port = ConfigValue::<usize, QW_REST_LISTEN_PORT> {\n                provided: Some(5678),\n                default: Some(7820),\n            };\n            rest_listen_port.resolve_optional(&env_vars).unwrap_err();\n        }\n        {\n            let mut env_vars = HashMap::new();\n            env_vars.insert(\"QW_REST_LISTEN_PORT\".to_string(), \"1234\".to_string());\n            let rest_listen_port = ConfigValue::<usize, QW_REST_LISTEN_PORT> {\n                provided: Some(5678),\n 
               default: Some(7820),\n            };\n            assert_eq!(\n                rest_listen_port\n                    .resolve_optional(&env_vars)\n                    .unwrap()\n                    .unwrap(),\n                1234\n            );\n        }\n    }\n\n    #[test]\n    fn test_config_value_resolve() {\n        let env_vars = HashMap::new();\n        let rest_listen_port = ConfigValue::<usize, QW_REST_LISTEN_PORT>::none();\n        rest_listen_port.resolve(&env_vars).unwrap_err();\n    }\n\n    #[test]\n    fn test_config_value_resolve_optional_empty_string() {\n        let mut env_vars = HashMap::new();\n        env_vars.insert(\"QW_AVAILABILITY_ZONE\".to_string(), \"\".to_string());\n        let az = ConfigValue::<usize, QW_AVAILABILITY_ZONE>::none();\n        assert!(az.resolve_optional(&env_vars).unwrap().is_none());\n    }\n\n    #[test]\n    fn test_config_value_deserialize() {\n        fn default_cluster_id() -> ConfigValue<String, QW_CLUSTER_ID> {\n            ConfigValue::with_default(\"default-cluster\".to_string())\n        }\n\n        fn default_node_id() -> ConfigValue<String, QW_NODE_ID> {\n            ConfigValue::with_default(\"default-node\".to_string())\n        }\n\n        fn default_rest_listen_port() -> ConfigValue<usize, QW_REST_LISTEN_PORT> {\n            ConfigValue::with_default(7280)\n        }\n\n        #[derive(Deserialize)]\n        struct Config {\n            #[serde(default)]\n            version: ConfigValue<usize, QW_NONE>,\n            #[serde(default = \"default_cluster_id\")]\n            cluster_id: ConfigValue<String, QW_CLUSTER_ID>,\n            #[serde(default = \"default_node_id\")]\n            node_id: ConfigValue<String, QW_NODE_ID>,\n            #[serde(default = \"default_rest_listen_port\")]\n            rest_listen_port: ConfigValue<usize, QW_REST_LISTEN_PORT>,\n            gossip_listen_port: ConfigValue<String, QW_GOSSIP_LISTEN_PORT>,\n        }\n        let config = 
serde_yaml::from_str::<Config>(\n            r#\"\n            cluster_id: qw-cluster\n            \"#,\n        )\n        .unwrap();\n\n        let mut env_vars = HashMap::new();\n        env_vars.insert(\"QW_REST_LISTEN_PORT\".to_string(), \"1234\".to_string());\n\n        assert_eq!(config.version.resolve(&env_vars).unwrap(), 0);\n        assert_eq!(config.cluster_id.resolve(&env_vars).unwrap(), \"qw-cluster\");\n        assert_eq!(config.node_id.resolve(&env_vars).unwrap(), \"default-node\");\n        assert_eq!(config.rest_listen_port.resolve(&env_vars).unwrap(), 1234);\n        assert!(\n            config\n                .gossip_listen_port\n                .resolve_optional(&env_vars)\n                .unwrap()\n                .is_none()\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/index_config/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub(crate) mod serialize;\n\nuse std::collections::HashSet;\nuse std::hash::{Hash, Hasher};\nuse std::num::NonZeroUsize;\nuse std::str::FromStr;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse anyhow::{Context, ensure};\nuse bytesize::ByteSize;\nuse chrono::Utc;\nuse cron::Schedule;\nuse humantime::parse_duration;\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{is_true, true_fn};\nuse quickwit_doc_mapper::{DocMapper, DocMapperBuilder, DocMapping};\nuse quickwit_proto::types::IndexId;\nuse serde::{Deserialize, Serialize};\npub use serialize::{load_index_config_from_user_config, load_index_config_update};\nuse siphasher::sip::SipHasher;\nuse tracing::warn;\n\nuse crate::index_config::serialize::VersionedIndexConfig;\nuse crate::merge_policy_config::MergePolicyConfig;\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct IndexingResources {\n    #[schema(value_type = String, default = \"2 GB\")]\n    #[serde(default = \"IndexingResources::default_heap_size\")]\n    #[serde(with = \"crate::serde_utils::bytesize_serde\")]\n    pub heap_size: ByteSize,\n    // DEPRECATED: See #4439\n    #[schema(value_type = String)]\n    #[serde(default)]\n    #[serde(skip_serializing)]\n    max_merge_write_throughput: Option<ByteSize>,\n}\n\nimpl PartialEq for IndexingResources {\n    fn 
eq(&self, other: &Self) -> bool {\n        self.heap_size == other.heap_size\n    }\n}\n\nimpl Hash for IndexingResources {\n    fn hash<H: Hasher>(&self, state: &mut H) {\n        self.heap_size.hash(state);\n    }\n}\n\nimpl IndexingResources {\n    fn default_heap_size() -> ByteSize {\n        ByteSize::gb(2)\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        Self {\n            heap_size: ByteSize::mb(20),\n            ..Default::default()\n        }\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        if self.max_merge_write_throughput.is_some() {\n            warn!(\n                \"`max_merge_write_throughput` is deprecated and will be removed in a future \\\n                 version. See #4439. A global limit now exists in indexer configuration.\"\n            );\n        }\n        Ok(())\n    }\n}\n\nimpl Default for IndexingResources {\n    fn default() -> Self {\n        Self {\n            heap_size: Self::default_heap_size(),\n            max_merge_write_throughput: None,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Hash, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct IndexingSettings {\n    #[schema(default = 60)]\n    #[serde(default = \"IndexingSettings::default_commit_timeout_secs\")]\n    pub commit_timeout_secs: usize,\n    #[schema(default = 8)]\n    #[serde(default = \"IndexingSettings::default_docstore_compression_level\")]\n    pub docstore_compression_level: i32,\n    #[schema(default = 1_000_000)]\n    #[serde(default = \"IndexingSettings::default_docstore_blocksize\")]\n    pub docstore_blocksize: usize,\n    /// The merge policy aims to eventually produce mature splits that have a larger size but\n    /// are within close range of `split_num_docs_target`.\n    ///\n    /// In other words, splits that contain a number of documents greater than or equal to\n    /// `split_num_docs_target` are considered mature and never 
merged.\n    #[serde(default = \"IndexingSettings::default_split_num_docs_target\")]\n    pub split_num_docs_target: usize,\n    #[serde(default)]\n    pub merge_policy: MergePolicyConfig,\n    #[serde(default)]\n    pub resources: IndexingResources,\n}\n\nimpl IndexingSettings {\n    pub fn commit_timeout(&self) -> Duration {\n        Duration::from_secs(self.commit_timeout_secs as u64)\n    }\n\n    fn default_commit_timeout_secs() -> usize {\n        60\n    }\n\n    pub fn default_docstore_blocksize() -> usize {\n        1_000_000\n    }\n\n    pub fn default_docstore_compression_level() -> i32 {\n        8\n    }\n\n    pub fn default_split_num_docs_target() -> usize {\n        10_000_000\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        Self {\n            resources: IndexingResources::for_test(),\n            ..Default::default()\n        }\n    }\n}\n\nimpl Default for IndexingSettings {\n    fn default() -> Self {\n        Self {\n            commit_timeout_secs: Self::default_commit_timeout_secs(),\n            docstore_blocksize: Self::default_docstore_blocksize(),\n            docstore_compression_level: Self::default_docstore_compression_level(),\n            split_num_docs_target: Self::default_split_num_docs_target(),\n            merge_policy: MergePolicyConfig::default(),\n            resources: IndexingResources::default(),\n        }\n    }\n}\n\n/// Settings for ingestion.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct IngestSettings {\n    /// Configures the minimum number of shards to use for ingestion.\n    #[schema(default = 1, value_type = usize)]\n    #[serde(default = \"IngestSettings::default_min_shards\")]\n    pub min_shards: NonZeroUsize,\n    /// Whether to validate documents against the current doc mapping during ingestion.\n    /// Defaults to true. 
When false, documents will be written directly to the WAL without\n    /// validation, but might still be rejected during indexing when applying the doc mapping\n    /// in the doc processor. In that case, the documents are dropped and a warning is logged.\n    ///\n    /// Note that when a source has a VRL transform configured, documents are not validated against\n    /// the doc mapping during ingestion either.\n    #[schema(default = true, value_type = bool)]\n    #[serde(default = \"true_fn\", skip_serializing_if = \"is_true\")]\n    pub validate_docs: bool,\n}\n\nimpl IngestSettings {\n    pub fn default_min_shards() -> NonZeroUsize {\n        NonZeroUsize::MIN\n    }\n}\n\nimpl Default for IngestSettings {\n    fn default() -> Self {\n        Self {\n            min_shards: Self::default_min_shards(),\n            validate_docs: true,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct SearchSettings {\n    #[serde(default)]\n    pub default_search_fields: Vec<String>,\n}\n\n#[derive(Clone, Debug, Hash, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct RetentionPolicy {\n    /// Duration of time for which the splits should be retained, expressed in a human-friendly way\n    /// (`1 hour`, `3 days`, `1 week`, ...).\n    #[serde(rename = \"period\")]\n    pub retention_period: String,\n\n    /// Defines the frequency at which the retention policy is evaluated and applied, expressed in\n    /// a human-friendly way (`hourly`, `daily`, ...) 
or as a cron expression (`0 0 * * * *`,\n    /// `0 0 0 * * *`).\n    #[serde(default = \"RetentionPolicy::default_schedule\")]\n    #[serde(rename = \"schedule\")]\n    pub evaluation_schedule: String,\n}\n\nimpl RetentionPolicy {\n    pub fn default_schedule() -> String {\n        \"hourly\".to_string()\n    }\n\n    pub fn retention_period(&self) -> anyhow::Result<Duration> {\n        parse_duration(&self.retention_period).with_context(|| {\n            format!(\n                \"failed to parse retention period `{}`\",\n                self.retention_period\n            )\n        })\n    }\n\n    pub fn evaluation_schedule(&self) -> anyhow::Result<Schedule> {\n        let evaluation_schedule = prepend_at_char(&self.evaluation_schedule);\n\n        Schedule::from_str(&evaluation_schedule).with_context(|| {\n            format!(\n                \"failed to parse retention evaluation schedule `{}`\",\n                self.evaluation_schedule\n            )\n        })\n    }\n\n    pub fn duration_until_next_evaluation(&self) -> anyhow::Result<Duration> {\n        let schedule = self.evaluation_schedule()?;\n        let future_date = schedule\n            .upcoming(Utc)\n            .next()\n            .expect(\"Failed to obtain next evaluation date.\");\n        let duration = (future_date - Utc::now())\n            .to_std()\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        Ok(duration)\n    }\n\n    pub(super) fn validate(&self) -> anyhow::Result<()> {\n        self.retention_period()?;\n        self.evaluation_schedule()?;\n        Ok(())\n    }\n}\n\n/// Prepends an `@` char at the start of the cron expression if necessary:\n/// `hourly` -> `@hourly`\nfn prepend_at_char(schedule: &str) -> String {\n    let trimmed_schedule = schedule.trim();\n\n    if !trimmed_schedule.is_empty()\n        && !trimmed_schedule.starts_with('@')\n        && trimmed_schedule.chars().all(|ch| ch.is_ascii_alphabetic())\n    {\n        return 
format!(\"@{trimmed_schedule}\");\n    }\n    trimmed_schedule.to_string()\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]\n#[serde(deny_unknown_fields)]\n#[serde(into = \"VersionedIndexConfig\")]\n#[serde(try_from = \"VersionedIndexConfig\")]\npub struct IndexConfig {\n    pub index_id: IndexId,\n    pub index_uri: Uri,\n    pub doc_mapping: DocMapping,\n    pub indexing_settings: IndexingSettings,\n    pub ingest_settings: IngestSettings,\n    pub search_settings: SearchSettings,\n    pub retention_policy_opt: Option<RetentionPolicy>,\n}\n\nimpl IndexConfig {\n    /// Return a fingerprint of parameters relevant for indexers\n    ///\n    /// This should remain private to this crate to avoid confusion with the\n    /// full indexing pipeline fingerprint that also includes the source's\n    /// fingerprint.\n    pub(crate) fn indexing_params_fingerprint(&self) -> u64 {\n        let mut hasher = SipHasher::new();\n        self.doc_mapping.doc_mapping_uid.hash(&mut hasher);\n        self.indexing_settings.hash(&mut hasher);\n        hasher.finish()\n    }\n\n    /// Compares IndexConfig level fingerprints\n    ///\n    /// This method is meant to enable IndexConfig level fingerprint comparison\n    /// without taking the risk of mixing them up with pipeline level\n    /// fingerprints (computed by\n    /// [`crate::indexing_pipeline_params_fingerprint()`]).\n    pub fn equals_fingerprint(&self, other: &Self) -> bool {\n        self.indexing_params_fingerprint() == other.indexing_params_fingerprint()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(index_id: &str, index_uri: &str) -> Self {\n        let index_uri = Uri::from_str(index_uri).unwrap();\n        let doc_mapping_json = r#\"{\n            \"doc_mapping_uid\": \"00000000000000000000000000\",\n            \"mode\": \"lenient\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": 
\"datetime\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"body\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"response_date\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"response_time\",\n                    \"type\": \"f64\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"response_payload\",\n                    \"type\": \"bytes\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"owner\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"attributes\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"tags\",\n                            \"type\": \"array<i64>\"\n                        },\n                        {\n                            \"name\": \"server\",\n                            \"type\": \"text\"\n                        },\n                        {\n                            \"name\": \"server.status\",\n                            \"type\": \"array<text>\"\n                        },\n                        {\n                            \"name\": \"server.payload\",\n                            \"type\": \"array<bytes>\"\n                        }\n                    ]\n                }\n            ],\n            \"timestamp_field\": \"timestamp\",\n            \"tag_fields\": [\"owner\"],\n            \"store_source\": true\n        }\"#;\n        let doc_mapping = serde_json::from_str(doc_mapping_json).unwrap();\n        let 
indexing_settings = IndexingSettings {\n            resources: IndexingResources::for_test(),\n            ..Default::default()\n        };\n        let search_settings = SearchSettings {\n            default_search_fields: vec![\n                \"body\".to_string(),\n                r#\"attributes.server\"#.to_string(),\n                r\"attributes.server\\.status\".to_string(),\n            ],\n        };\n        IndexConfig {\n            index_id: index_id.to_string(),\n            index_uri,\n            doc_mapping,\n            indexing_settings,\n            ingest_settings: IngestSettings::default(),\n            search_settings,\n            retention_policy_opt: None,\n        }\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl crate::TestableForRegression for IndexConfig {\n    fn sample_for_regression() -> Self {\n        use std::collections::BTreeSet;\n        use std::num::NonZeroU32;\n\n        use quickwit_doc_mapper::Mode;\n        use quickwit_proto::types::DocMappingUid;\n\n        use crate::merge_policy_config::StableLogMergePolicyConfig;\n\n        let tenant_id_mapping = serde_json::from_str(\n            r#\"{\n                \"name\": \"tenant_id\",\n                \"type\": \"u64\",\n                \"fast\": true\n        }\"#,\n        )\n        .unwrap();\n        let timestamp_mapping = serde_json::from_str(\n            r#\"{\n                \"name\": \"timestamp\",\n                \"type\": \"datetime\",\n                \"fast\": true\n        }\"#,\n        )\n        .unwrap();\n        let log_level_mapping = serde_json::from_str(\n            r#\"{\n                \"name\": \"log_level\",\n                \"type\": \"text\",\n                \"tokenizer\": \"raw\"\n        }\"#,\n        )\n        .unwrap();\n        let message_mapping = serde_json::from_str(\n            r#\"{\n                \"name\": \"message\",\n                \"type\": \"text\",\n                \"record\": \"position\",\n       
         \"tokenizer\": \"default\"\n        }\"#,\n        )\n        .unwrap();\n        let tokenizer = serde_json::from_str(\n            r#\"{\n                \"name\": \"custom_tokenizer\",\n                \"type\": \"regex\",\n                \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\"\n            }\"#,\n        )\n        .unwrap();\n        let doc_mapping = DocMapping {\n            doc_mapping_uid: DocMappingUid::for_test(1),\n            mode: Mode::default(),\n            field_mappings: vec![\n                tenant_id_mapping,\n                timestamp_mapping,\n                log_level_mapping,\n                message_mapping,\n            ],\n            timestamp_field: Some(\"timestamp\".to_string()),\n            tag_fields: BTreeSet::from_iter([\"tenant_id\".to_string(), \"log_level\".to_string()]),\n            partition_key: Some(\"tenant_id\".to_string()),\n            max_num_partitions: NonZeroU32::new(100).unwrap(),\n            index_field_presence: true,\n            store_document_size: false,\n            store_source: true,\n            tokenizers: vec![tokenizer],\n        };\n        let stable_log_config = StableLogMergePolicyConfig {\n            merge_factor: 9,\n            max_merge_factor: 11,\n            ..Default::default()\n        };\n        let merge_policy = MergePolicyConfig::StableLog(stable_log_config);\n        let indexing_resources = IndexingResources {\n            heap_size: ByteSize::mb(50),\n            ..Default::default()\n        };\n        let indexing_settings = IndexingSettings {\n            commit_timeout_secs: 301,\n            split_num_docs_target: 10_000_001,\n            merge_policy,\n            resources: indexing_resources,\n            ..Default::default()\n        };\n        let ingest_settings = IngestSettings {\n            min_shards: NonZeroUsize::new(12).unwrap(),\n            validate_docs: true,\n        };\n        let search_settings = SearchSettings {\n            
default_search_fields: vec![\"message\".to_string()],\n        };\n        let retention_policy_opt = Some(RetentionPolicy {\n            retention_period: \"90 days\".to_string(),\n            evaluation_schedule: \"daily\".to_string(),\n        });\n        IndexConfig {\n            index_id: \"my-index\".to_string(),\n            index_uri: Uri::for_test(\"s3://quickwit-indexes/my-index\"),\n            doc_mapping,\n            indexing_settings,\n            ingest_settings,\n            search_settings,\n            retention_policy_opt,\n        }\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        assert_eq!(self.index_id, other.index_id);\n        assert_eq!(self.index_uri, other.index_uri);\n        assert_eq!(self.doc_mapping, other.doc_mapping);\n        assert_eq!(self.indexing_settings, other.indexing_settings);\n        assert_eq!(self.ingest_settings, other.ingest_settings);\n        assert_eq!(self.search_settings, other.search_settings);\n        assert_eq!(self.retention_policy_opt, other.retention_policy_opt);\n    }\n}\n\n/// Builds and returns the doc mapper associated with an index.\npub fn build_doc_mapper(\n    doc_mapping: &DocMapping,\n    search_settings: &SearchSettings,\n) -> anyhow::Result<Arc<DocMapper>> {\n    let builder = DocMapperBuilder {\n        doc_mapping: doc_mapping.clone(),\n        default_search_fields: search_settings.default_search_fields.clone(),\n        legacy_type_tag: None,\n    };\n    let doc_mapper = builder.try_build()?;\n    Ok(Arc::new(doc_mapper))\n}\n\n/// Validates the objects that make up an index configuration. 
This is a \"free\" function as opposed\n/// to a method on `IndexConfig` so we can reuse it for validating index templates.\npub(super) fn validate_index_config(\n    doc_mapping: &DocMapping,\n    indexing_settings: &IndexingSettings,\n    search_settings: &SearchSettings,\n    retention_policy_opt: &Option<RetentionPolicy>,\n) -> anyhow::Result<()> {\n    // Note: this needs a deep refactoring to separate the doc mapping configuration\n    // from the doc mapper implementations.\n    // TODO: see if we should store the byproduct on the IndexConfig.\n    build_doc_mapper(doc_mapping, search_settings)?;\n\n    indexing_settings.merge_policy.validate()?;\n    indexing_settings.resources.validate()?;\n\n    if let Some(retention_policy) = retention_policy_opt {\n        retention_policy.validate()?;\n\n        ensure!(\n            doc_mapping.timestamp_field.is_some(),\n            \"retention policy requires a timestamp field, but doc mapping does not declare one\"\n        );\n    }\n    Ok(())\n}\n\n/// Returns the updated doc mapping and a boolean indicating whether a mutation occurred.\n///\n/// The logic goes as follows:\n/// 1. If the new doc mapping is the same as the current doc mapping, ignoring their UIDs, returns\n///    the current doc mapping and `false`, indicating that no mutation occurred.\n/// 2. 
If the new doc mapping is different from the current doc mapping, verifies the following\n///    constraints before returning the new doc mapping and `true`, indicating that a mutation\n///    occurred:\n///    - The doc mapping UID should differ from the current one\n///    - The timestamp field should remain the same\n///    - The tokenizers should be a superset of the current tokenizers\n///    - A doc mapper can be built from the new doc mapping\npub fn prepare_doc_mapping_update(\n    mut new_doc_mapping: DocMapping,\n    current_doc_mapping: &DocMapping,\n    search_settings: &SearchSettings,\n) -> anyhow::Result<(DocMapping, bool)> {\n    // Save the new doc mapping UID in a temporary variable and override it with the current doc\n    // mapping UID to compare the two doc mappings, ignoring their UIDs.\n    let new_doc_mapping_uid = new_doc_mapping.doc_mapping_uid;\n    new_doc_mapping.doc_mapping_uid = current_doc_mapping.doc_mapping_uid;\n\n    if new_doc_mapping == *current_doc_mapping {\n        return Ok((new_doc_mapping, false));\n    }\n    // Restore the new doc mapping UID.\n    new_doc_mapping.doc_mapping_uid = new_doc_mapping_uid;\n\n    ensure!(\n        new_doc_mapping.doc_mapping_uid != current_doc_mapping.doc_mapping_uid,\n        \"new doc mapping UID should differ from the current one, current UID `{}`, new UID `{}`\",\n        current_doc_mapping.doc_mapping_uid,\n        new_doc_mapping.doc_mapping_uid,\n    );\n    let new_timestamp_field = new_doc_mapping.timestamp_field.as_deref();\n    let current_timestamp_field = current_doc_mapping.timestamp_field.as_deref();\n    ensure!(\n        new_timestamp_field == current_timestamp_field,\n        \"updating timestamp field is not allowed, current timestamp field `{}`, new timestamp \\\n         field `{}`\",\n        current_timestamp_field.unwrap_or(\"none\"),\n        new_timestamp_field.unwrap_or(\"none\"),\n    );\n    // TODO: Unsure this constraint is required, should we relax it?\n    
let new_tokenizers: HashSet<_> = new_doc_mapping.tokenizers.iter().collect();\n    let current_tokenizers: HashSet<_> = current_doc_mapping.tokenizers.iter().collect();\n    ensure!(\n        new_tokenizers.is_superset(&current_tokenizers),\n        \"updating tokenizers is allowed only if adding new tokenizers, current tokenizers \\\n         `{current_tokenizers:?}`, new tokenizers `{new_tokenizers:?}`\",\n    );\n    build_doc_mapper(&new_doc_mapping, search_settings).context(\"invalid doc mapping\")?;\n    Ok((new_doc_mapping, true))\n}\n\n#[cfg(test)]\nmod tests {\n\n    use cron::TimeUnitSpec;\n    use quickwit_doc_mapper::{Mode, ModeType, TokenizerEntry};\n    use quickwit_proto::types::DocMappingUid;\n\n    use super::*;\n    use crate::ConfigFormat;\n    use crate::merge_policy_config::MergePolicyConfig;\n\n    fn get_index_config_filepath(index_config_filename: &str) -> String {\n        format!(\n            \"{}/resources/tests/index_config/{}\",\n            env!(\"CARGO_MANIFEST_DIR\"),\n            index_config_filename\n        )\n    }\n\n    #[track_caller]\n    fn test_index_config_parse_aux(config_format: ConfigFormat) {\n        let index_config_filepath =\n            get_index_config_filepath(&format!(\"hdfs-logs.{config_format:?}\").to_lowercase());\n        let file = std::fs::read_to_string(index_config_filepath).unwrap();\n        let index_config = load_index_config_from_user_config(\n            config_format,\n            file.as_bytes(),\n            &Uri::for_test(\"s3://defaultbucket/\"),\n        )\n        .unwrap();\n        assert_eq!(index_config.doc_mapping.tokenizers.len(), 1);\n        assert_eq!(index_config.doc_mapping.tokenizers[0].name, \"service_regex\");\n        assert_eq!(index_config.doc_mapping.field_mappings.len(), 5);\n        assert_eq!(index_config.doc_mapping.field_mappings[0].name, \"tenant_id\");\n        assert_eq!(index_config.doc_mapping.field_mappings[1].name, \"timestamp\");\n        assert_eq!(\n       
     index_config.doc_mapping.field_mappings[2].name,\n            \"severity_text\"\n        );\n        assert_eq!(index_config.doc_mapping.field_mappings[3].name, \"body\");\n        assert_eq!(index_config.doc_mapping.field_mappings[4].name, \"resource\");\n\n        assert_eq!(\n            index_config\n                .doc_mapping\n                .tag_fields\n                .into_iter()\n                .collect::<Vec<String>>(),\n            vec![\"tenant_id\".to_string()]\n        );\n        let expected_retention_policy = RetentionPolicy {\n            retention_period: \"90 days\".to_string(),\n            evaluation_schedule: \"daily\".to_string(),\n        };\n        assert_eq!(\n            index_config.retention_policy_opt.unwrap(),\n            expected_retention_policy\n        );\n        assert!(index_config.doc_mapping.store_source);\n\n        assert_eq!(\n            index_config.doc_mapping.timestamp_field.unwrap(),\n            \"timestamp\"\n        );\n        assert_eq!(index_config.indexing_settings.commit_timeout_secs, 61);\n        assert_eq!(\n            index_config.indexing_settings.merge_policy,\n            MergePolicyConfig::StableLog(crate::StableLogMergePolicyConfig {\n                merge_factor: 9,\n                max_merge_factor: 11,\n                maturation_period: Duration::from_secs(48 * 3600),\n                ..Default::default()\n            })\n        );\n        assert_eq!(\n            index_config.indexing_settings.resources,\n            IndexingResources {\n                heap_size: ByteSize::gb(3),\n                ..Default::default()\n            }\n        );\n        assert_eq!(index_config.ingest_settings.min_shards.get(), 12);\n        assert_eq!(\n            index_config.search_settings,\n            SearchSettings {\n                default_search_fields: vec![\"severity_text\".to_string(), \"body\".to_string()],\n            }\n        );\n    }\n\n    #[test]\n    fn 
test_index_config_from_json() {\n        test_index_config_parse_aux(ConfigFormat::Json);\n    }\n\n    #[test]\n    fn test_index_config_from_toml() {\n        test_index_config_parse_aux(ConfigFormat::Toml);\n    }\n\n    #[test]\n    fn test_index_config_from_yaml() {\n        test_index_config_parse_aux(ConfigFormat::Yaml);\n    }\n\n    #[test]\n    fn test_indexer_config_default_values() {\n        let default_index_root_uri = Uri::for_test(\"s3://defaultbucket/\");\n        {\n            let index_config_filepath = get_index_config_filepath(\"minimal-hdfs-logs.yaml\");\n            let file_content = std::fs::read_to_string(index_config_filepath).unwrap();\n            let index_config = load_index_config_from_user_config(\n                ConfigFormat::Yaml,\n                file_content.as_bytes(),\n                &default_index_root_uri,\n            )\n            .unwrap();\n\n            assert_eq!(index_config.index_id, \"hdfs-logs\");\n            assert_eq!(index_config.index_uri, \"s3://quickwit-indexes/hdfs-logs\");\n            assert_eq!(index_config.doc_mapping.field_mappings.len(), 1);\n            assert_eq!(index_config.doc_mapping.field_mappings[0].name, \"body\");\n            assert!(!index_config.doc_mapping.store_source);\n            assert_eq!(index_config.indexing_settings, IndexingSettings::default());\n            assert_eq!(index_config.ingest_settings, IngestSettings::default());\n\n            let expected_search_settings = SearchSettings {\n                default_search_fields: vec![\"body\".to_string()],\n            };\n            assert_eq!(index_config.search_settings, expected_search_settings);\n            assert!(index_config.retention_policy_opt.is_none());\n        }\n        {\n            let index_config_filepath = get_index_config_filepath(\"partial-hdfs-logs.yaml\");\n            let file_content = std::fs::read_to_string(index_config_filepath).unwrap();\n            let index_config = 
load_index_config_from_user_config(\n                ConfigFormat::Yaml,\n                file_content.as_bytes(),\n                &default_index_root_uri,\n            )\n            .unwrap();\n\n            assert_eq!(index_config.index_id, \"hdfs-logs\");\n            assert_eq!(index_config.index_uri, \"s3://quickwit-indexes/hdfs-logs\");\n            assert_eq!(index_config.doc_mapping.field_mappings.len(), 2);\n            assert_eq!(index_config.doc_mapping.field_mappings[0].name, \"body\");\n            assert_eq!(index_config.doc_mapping.field_mappings[1].name, \"timestamp\");\n            assert!(!index_config.doc_mapping.store_source);\n            assert_eq!(\n                index_config.indexing_settings,\n                IndexingSettings {\n                    commit_timeout_secs: 42,\n                    merge_policy: MergePolicyConfig::default(),\n                    resources: IndexingResources {\n                        ..Default::default()\n                    },\n                    ..Default::default()\n                }\n            );\n            assert_eq!(\n                index_config.search_settings,\n                SearchSettings {\n                    default_search_fields: vec![\"body\".to_string()],\n                }\n            );\n        }\n    }\n\n    #[test]\n    #[should_panic(expected = \"empty URI\")]\n    fn test_config_validates_uris() {\n        let config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            index_uri: ''\n            doc_mapping: {}\n        \"#;\n        serde_yaml::from_str::<IndexConfig>(config_yaml).unwrap();\n    }\n\n    #[test]\n    fn test_minimal_index_config_default_dynamic() {\n        let config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            index_uri: \"s3://my-index\"\n            doc_mapping: {}\n        \"#;\n        let minimal_config: IndexConfig = load_index_config_from_user_config(\n            
ConfigFormat::Yaml,\n            config_yaml.as_bytes(),\n            &Uri::for_test(\"s3://my-index\"),\n        )\n        .unwrap();\n        assert_eq!(\n            minimal_config.doc_mapping.mode.mode_type(),\n            ModeType::Dynamic\n        );\n    }\n\n    #[test]\n    fn test_index_config_with_malformed_maturation_duration() {\n        let config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            index_uri: \"s3://my-index\"\n            doc_mapping: {}\n            indexing_settings:\n              merge_policy:\n                type: limit_merge\n                maturation_period: x\n        \"#;\n        let parsing_config_error = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            config_yaml.as_bytes(),\n            &Uri::for_test(\"s3://my-index\"),\n        )\n        .unwrap_err();\n        println!(\"{parsing_config_error:?}\");\n        assert!(\n            parsing_config_error\n                .root_cause()\n                .to_string()\n                .contains(\"failed to parse human-readable duration `x`\")\n        );\n    }\n\n    #[test]\n    fn test_retention_policy_serialization() {\n        let retention_policy = RetentionPolicy {\n            retention_period: \"90 days\".to_string(),\n            evaluation_schedule: \"hourly\".to_string(),\n        };\n        let retention_policy_yaml = serde_yaml::to_string(&retention_policy).unwrap();\n        assert_eq!(\n            serde_yaml::from_str::<RetentionPolicy>(&retention_policy_yaml).unwrap(),\n            retention_policy,\n        );\n    }\n\n    #[test]\n    fn test_retention_policy_deserialization() {\n        {\n            let retention_policy_yaml = r#\"\n            period: 90 days\n        \"#;\n            let retention_policy =\n                serde_yaml::from_str::<RetentionPolicy>(retention_policy_yaml).unwrap();\n\n            let expected_retention_policy = RetentionPolicy {\n                
retention_period: \"90 days\".to_string(),\n                evaluation_schedule: \"hourly\".to_string(),\n            };\n            assert_eq!(retention_policy, expected_retention_policy);\n        }\n        {\n            let retention_policy_yaml = r#\"\n            period: 90 days\n            schedule: daily\n        \"#;\n            let retention_policy =\n                serde_yaml::from_str::<RetentionPolicy>(retention_policy_yaml).unwrap();\n\n            let expected_retention_policy = RetentionPolicy {\n                retention_period: \"90 days\".to_string(),\n                evaluation_schedule: \"daily\".to_string(),\n            };\n            assert_eq!(retention_policy, expected_retention_policy);\n        }\n    }\n\n    #[test]\n    fn test_parse_retention_policy_period() {\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: \"hourly\".to_string(),\n            };\n            assert_eq!(\n                retention_policy.retention_period().unwrap(),\n                Duration::from_secs(3600)\n            );\n            {\n                let retention_policy = RetentionPolicy {\n                    retention_period: \"foo\".to_string(),\n                    evaluation_schedule: \"hourly\".to_string(),\n                };\n                assert_eq!(\n                    retention_policy.retention_period().unwrap_err().to_string(),\n                    \"failed to parse retention period `foo`\"\n                );\n            }\n        }\n    }\n\n    #[test]\n    fn test_prepend_at_char() {\n        assert_eq!(prepend_at_char(\"\"), \"\");\n        assert_eq!(prepend_at_char(\"* * 0 0 0\"), \"* * 0 0 0\");\n        assert_eq!(prepend_at_char(\"hourly\"), \"@hourly\");\n        assert_eq!(prepend_at_char(\"@hourly\"), \"@hourly\");\n    }\n\n    #[test]\n    fn test_parse_retention_policy_schedule() {\n        let hourly_schedule 
= Schedule::from_str(\"@hourly\").unwrap();\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: \"@hourly\".to_string(),\n            };\n            assert_eq!(\n                retention_policy.evaluation_schedule().unwrap(),\n                hourly_schedule\n            );\n        }\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: \"hourly\".to_string(),\n            };\n            assert_eq!(\n                retention_policy.evaluation_schedule().unwrap(),\n                hourly_schedule\n            );\n        }\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: \"0 * * * * *\".to_string(),\n            };\n            let evaluation_schedule = retention_policy.evaluation_schedule().unwrap();\n            assert_eq!(evaluation_schedule.seconds().count(), 1);\n            assert_eq!(evaluation_schedule.minutes().count(), 60);\n        }\n    }\n\n    #[test]\n    fn test_retention_policy_validate() {\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: \"hourly\".to_string(),\n            };\n            retention_policy.validate().unwrap();\n        }\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"foo\".to_string(),\n                evaluation_schedule: \"hourly\".to_string(),\n            };\n            retention_policy.validate().unwrap_err();\n        }\n        {\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: \"foo\".to_string(),\n            };\n            
retention_policy.validate().unwrap_err();\n        }\n    }\n\n    #[test]\n    fn test_retention_schedule_duration() {\n        let schedule_test_helper_fn = |schedule_str: &str| {\n            let schedule = Schedule::from_str(&prepend_at_char(schedule_str)).unwrap();\n            let retention_policy = RetentionPolicy {\n                retention_period: \"1 hour\".to_string(),\n                evaluation_schedule: schedule_str.to_string(),\n            };\n\n            let next_evaluation_duration = chrono::Duration::nanoseconds(\n                retention_policy\n                    .duration_until_next_evaluation()\n                    .unwrap()\n                    .as_nanos() as i64,\n            );\n            let next_evaluation_date = Utc::now() + next_evaluation_duration;\n            let expected_date = schedule.upcoming(Utc).next().unwrap();\n            assert_eq!(next_evaluation_date.timestamp(), expected_date.timestamp());\n        };\n\n        schedule_test_helper_fn(\"hourly\");\n        schedule_test_helper_fn(\"daily\");\n        schedule_test_helper_fn(\"weekly\");\n        schedule_test_helper_fn(\"monthly\");\n        schedule_test_helper_fn(\"* * * ? 
* ?\");\n    }\n\n    #[test]\n    fn test_ingest_settings_serde() {\n        let settings = IngestSettings {\n            min_shards: NonZeroUsize::MIN,\n            validate_docs: false,\n        };\n        let settings_yaml = serde_yaml::to_string(&settings).unwrap();\n        assert!(settings_yaml.contains(\"validate_docs\"));\n\n        let expected_settings: IngestSettings = serde_yaml::from_str(&settings_yaml).unwrap();\n        assert_eq!(settings, expected_settings);\n\n        let settings = IngestSettings {\n            min_shards: NonZeroUsize::MIN,\n            validate_docs: true,\n        };\n        let settings_yaml = serde_yaml::to_string(&settings).unwrap();\n        assert!(!settings_yaml.contains(\"validate_docs\"));\n\n        let expected_settings: IngestSettings = serde_yaml::from_str(&settings_yaml).unwrap();\n        assert_eq!(settings, expected_settings);\n\n        let settings_yaml = r#\"\n            min_shards: 0\n        \"#;\n        let error = serde_yaml::from_str::<IngestSettings>(settings_yaml).unwrap_err();\n        assert!(error.to_string().contains(\"expected a nonzero\"));\n    }\n\n    #[test]\n    fn test_prepare_doc_mapping_update() {\n        let current_index_config = IndexConfig::for_test(\"test-index\", \"s3://test-index\");\n        let mut current_doc_mapping = current_index_config.doc_mapping;\n        let search_settings = current_index_config.search_settings;\n\n        let tokenizer_json = r#\"\n            {\n                \"name\": \"breton-tokenizer\",\n                \"type\": \"regex\",\n                \"pattern\": \"crêpes*\"\n            }\n            \"#;\n        let tokenizer: TokenizerEntry = serde_json::from_str(tokenizer_json).unwrap();\n\n        current_doc_mapping.tokenizers.push(tokenizer.clone());\n\n        // The new doc mapping should have a different doc mapping UID.\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        new_doc_mapping.store_source = false; // This 
is set to `true` for the current doc mapping.\n        let error =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap_err()\n                .to_string();\n        assert!(error.contains(\"doc mapping UID should differ\"));\n\n        // The new doc mapping should not change the timestamp field.\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        new_doc_mapping.doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.timestamp_field = Some(\"ts\".to_string()); // This is set to `timestamp` for the current doc mapping.\n        let error =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap_err()\n                .to_string();\n        assert!(error.contains(\"timestamp field\"));\n\n        // The new doc mapping should not remove the timestamp field.\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        new_doc_mapping.doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.timestamp_field = None;\n        let error =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap_err()\n                .to_string();\n        assert!(error.contains(\"timestamp field\"));\n\n        // The new doc mapping should not remove tokenizers.\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        new_doc_mapping.doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.tokenizers.clear();\n        let error =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap_err()\n                .to_string();\n        assert!(error.contains(\"tokenizers\"));\n\n        // The new doc mapping should be \"buildable\" into a doc mapper.\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        
new_doc_mapping.doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.tokenizers.push(tokenizer);\n        let error =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap_err()\n                .source()\n                .unwrap()\n                .to_string();\n        assert!(error.contains(\"duplicated custom tokenizer\"));\n\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        new_doc_mapping.doc_mapping_uid = DocMappingUid::random();\n        let (updated_doc_mapping, mutation_occurred) =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap();\n        assert!(!mutation_occurred);\n        assert_eq!(\n            updated_doc_mapping.doc_mapping_uid,\n            current_doc_mapping.doc_mapping_uid\n        );\n        assert_eq!(updated_doc_mapping, current_doc_mapping);\n\n        let mut new_doc_mapping = current_doc_mapping.clone();\n        let new_doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.doc_mapping_uid = new_doc_mapping_uid;\n        new_doc_mapping.mode = Mode::Strict;\n        let (updated_doc_mapping, mutation_occurred) =\n            prepare_doc_mapping_update(new_doc_mapping, &current_doc_mapping, &search_settings)\n                .unwrap();\n        assert!(mutation_occurred);\n        assert_eq!(updated_doc_mapping.doc_mapping_uid, new_doc_mapping_uid);\n        assert_eq!(updated_doc_mapping.mode, Mode::Strict);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/index_config/serialize.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse anyhow::{Context, ensure};\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::types::{DocMappingUid, IndexId};\nuse serde::{Deserialize, Serialize};\nuse tracing::info;\n\nuse super::{IngestSettings, validate_index_config};\nuse crate::{\n    ConfigFormat, DocMapping, IndexConfig, IndexingSettings, RetentionPolicy, SearchSettings,\n    prepare_doc_mapping_update, validate_identifier,\n};\n\n/// Alias for the latest serialization format.\ntype IndexConfigForSerialization = IndexConfigV0_8;\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(tag = \"version\")]\npub(crate) enum VersionedIndexConfig {\n    // The two versions use the same format but for v0.8 and below, we need to set the\n    // `doc_mapping_uid` to the nil value upon deserialization.\n    #[serde(rename = \"0.9\")]\n    V0_9(IndexConfigV0_8),\n    // Retro compatibility\n    #[serde(rename = \"0.8\")]\n    #[serde(alias = \"0.7\")]\n    V0_8(IndexConfigV0_8),\n}\n\nimpl From<VersionedIndexConfig> for IndexConfigForSerialization {\n    fn from(versioned_config: VersionedIndexConfig) -> IndexConfigForSerialization {\n        match versioned_config {\n            VersionedIndexConfig::V0_8(v0_8) => v0_8,\n            VersionedIndexConfig::V0_9(v0_8) => v0_8,\n        }\n    }\n}\n\n/// Parses and validates an [`IndexConfig`] as supplied by a user with 
a given [`ConfigFormat`],\n/// config content and a `default_index_root_uri`.\npub fn load_index_config_from_user_config(\n    config_format: ConfigFormat,\n    config_content: &[u8],\n    default_index_root_uri: &Uri,\n) -> anyhow::Result<IndexConfig> {\n    let versioned_index_config: VersionedIndexConfig = config_format.parse(config_content)?;\n    let index_config_for_serialization: IndexConfigForSerialization = versioned_index_config.into();\n    index_config_for_serialization.build_and_validate(Some(default_index_root_uri))\n}\n\n/// Parses and validates an [`IndexConfig`] update.\n///\n/// Ensures that the new configuration is valid on its own and consistent with\n/// the current index config. If the new configuration omits some fields, the\n/// default values will be used, not those of the current index config.\npub fn load_index_config_update(\n    config_format: ConfigFormat,\n    index_config_bytes: &[u8],\n    default_index_root_uri: &Uri,\n    current_index_config: &IndexConfig,\n) -> anyhow::Result<IndexConfig> {\n    let mut new_index_config = load_index_config_from_user_config(\n        config_format,\n        index_config_bytes,\n        default_index_root_uri,\n    )?;\n    ensure!(\n        current_index_config.index_id == new_index_config.index_id,\n        \"`index_id` in config file {} does not match updated `index_id` {}\",\n        new_index_config.index_id,\n        current_index_config.index_id\n    );\n    ensure!(\n        current_index_config.index_uri == new_index_config.index_uri,\n        \"`index_uri` cannot be updated, current value {}, new expected value {}\",\n        current_index_config.index_uri,\n        new_index_config.index_uri\n    );\n    let (updated_doc_mapping, _mutation_occurred) = prepare_doc_mapping_update(\n        new_index_config.doc_mapping,\n        &current_index_config.doc_mapping,\n        &new_index_config.search_settings,\n    )?;\n    new_index_config.doc_mapping = updated_doc_mapping;\n\n    
Ok(new_index_config)\n}\n\nimpl IndexConfigForSerialization {\n    fn index_uri_or_fallback_to_default(\n        &self,\n        default_index_root_uri_opt: Option<&Uri>,\n    ) -> anyhow::Result<Uri> {\n        if let Some(index_uri) = &self.index_uri {\n            return Ok(index_uri.clone());\n        }\n        let default_index_root_uri = default_index_root_uri_opt.context(\"missing `index_uri`\")?;\n        let index_uri: Uri = default_index_root_uri.join(&self.index_id)\n            .context(\"failed to create default index URI. this should never happen! please, report on https://github.com/quickwit-oss/quickwit/issues\")?;\n        info!(\n            index_id=%self.index_id,\n            index_uri=%index_uri,\n            \"index config does not specify `index_uri`, falling back to default value\",\n        );\n        Ok(index_uri)\n    }\n\n    pub fn build_and_validate(\n        self,\n        default_index_root_uri: Option<&Uri>,\n    ) -> anyhow::Result<IndexConfig> {\n        validate_identifier(\"index\", &self.index_id)?;\n\n        let index_uri = self.index_uri_or_fallback_to_default(default_index_root_uri)?;\n\n        let index_config = IndexConfig {\n            index_id: self.index_id,\n            index_uri,\n            doc_mapping: self.doc_mapping,\n            indexing_settings: self.indexing_settings,\n            ingest_settings: self.ingest_settings,\n            search_settings: self.search_settings,\n            retention_policy_opt: self.retention_policy_opt,\n        };\n        validate_index_config(\n            &index_config.doc_mapping,\n            &index_config.indexing_settings,\n            &index_config.search_settings,\n            &index_config.retention_policy_opt,\n        )?;\n        Ok(index_config)\n    }\n}\n\nimpl From<IndexConfig> for VersionedIndexConfig {\n    fn from(index_config: IndexConfig) -> Self {\n        VersionedIndexConfig::V0_9(index_config.into())\n    }\n}\n\nimpl TryFrom<VersionedIndexConfig> 
for IndexConfig {\n    type Error = anyhow::Error;\n\n    fn try_from(versioned_index_config: VersionedIndexConfig) -> anyhow::Result<Self> {\n        match versioned_index_config {\n            VersionedIndexConfig::V0_8(mut v0_8) => {\n                // Override the randomly generated doc mapping UID with the nil value.\n                v0_8.doc_mapping.doc_mapping_uid = DocMappingUid::default();\n                v0_8.build_and_validate(None)\n            }\n            VersionedIndexConfig::V0_9(v0_8) => v0_8.build_and_validate(None),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct IndexConfigV0_8 {\n    #[schema(value_type = String)]\n    pub index_id: IndexId,\n    #[schema(value_type = String)]\n    #[serde(default)]\n    pub index_uri: Option<Uri>,\n    pub doc_mapping: DocMapping,\n    #[serde(default)]\n    pub indexing_settings: IndexingSettings,\n    #[serde(default)]\n    pub ingest_settings: IngestSettings,\n    #[serde(default)]\n    pub search_settings: SearchSettings,\n    #[serde(rename = \"retention\")]\n    #[serde(default)]\n    pub retention_policy_opt: Option<RetentionPolicy>,\n}\n\nimpl From<IndexConfig> for IndexConfigV0_8 {\n    fn from(index_config: IndexConfig) -> Self {\n        IndexConfigV0_8 {\n            index_id: index_config.index_id,\n            index_uri: Some(index_config.index_uri),\n            doc_mapping: index_config.doc_mapping,\n            indexing_settings: index_config.indexing_settings,\n            ingest_settings: index_config.ingest_settings,\n            search_settings: index_config.search_settings,\n            retention_policy_opt: index_config.retention_policy_opt,\n        }\n    }\n}\n\n#[cfg(test)]\nmod test {\n    use super::*;\n    use crate::merge_policy_config::{MergePolicyConfig, StableLogMergePolicyConfig};\n\n    fn minimal_index_config_for_serialization() -> IndexConfigForSerialization {\n        
serde_yaml::from_str(\n            r#\"\n            index_id: hdfs-logs\n            index_uri: s3://quickwit-indexes/hdfs-logs\n\n            doc_mapping:\n                field_mappings:\n                    - name: body\n                      type: text\n                      tokenizer: default\n                      record: position\n\n            search_settings:\n                default_search_fields: [body]\n        \"#,\n        )\n        .unwrap()\n    }\n\n    #[test]\n    fn test_validate_invalid_merge_policy() {\n        // Not yet invalid, but we modify it right after this.\n        let mut invalid_index_config: IndexConfigForSerialization =\n            minimal_index_config_for_serialization();\n        // Set a max merge factor to an inconsistent value.\n        let mut stable_log_merge_policy_config = StableLogMergePolicyConfig::default();\n        stable_log_merge_policy_config.max_merge_factor =\n            stable_log_merge_policy_config.merge_factor - 1;\n        invalid_index_config.indexing_settings.merge_policy =\n            MergePolicyConfig::StableLog(stable_log_merge_policy_config);\n        let validation_err = invalid_index_config\n            .build_and_validate(None)\n            .unwrap_err()\n            .to_string();\n        assert_eq!(\n            validation_err,\n            \"index config merge policy `max_merge_factor` must be superior or equal to \\\n             `merge_factor`\"\n        );\n    }\n\n    #[test]\n    fn test_validate_retention_policy() {\n        // Not yet invalid, but we modify it right after this.\n        let mut invalid_index_config: IndexConfigForSerialization =\n            minimal_index_config_for_serialization();\n        invalid_index_config.retention_policy_opt = Some(RetentionPolicy {\n            retention_period: \"90 days\".to_string(),\n            evaluation_schedule: \"hourly\".to_string(),\n        });\n        let validation_err = invalid_index_config\n            
.build_and_validate(None)\n            .unwrap_err()\n            .to_string();\n        assert!(validation_err.contains(\"retention policy requires a timestamp field\"));\n    }\n\n    #[test]\n    fn test_minimal_index_config_missing_root_uri_no_default_uri() {\n        let config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping: {}\n        \"#;\n        let config_parse_result: anyhow::Result<IndexConfig> =\n            ConfigFormat::Yaml.parse(config_yaml.as_bytes());\n        assert!(format!(\"{:?}\", config_parse_result.unwrap_err()).contains(\"missing `index_uri`\"));\n    }\n\n    #[test]\n    fn test_minimal_index_config_missing_root_uri_with_default_index_root_uri() {\n        let config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping: {}\n        \"#;\n        {\n            let index_config: IndexConfig = load_index_config_from_user_config(\n                ConfigFormat::Yaml,\n                config_yaml.as_bytes(),\n                // No trailing slash: the index ID must still be joined with `/`.\n                &Uri::for_test(\"s3://mybucket\"),\n            )\n            .unwrap();\n            assert_eq!(index_config.index_uri.as_str(), \"s3://mybucket/hdfs-logs\");\n        }\n    }\n\n    #[test]\n    fn test_update_index_root_uri() {\n        let original_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping: {}\n        \"#;\n        let default_root = Uri::for_test(\"s3://mybucket\");\n        let original_config: IndexConfig = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            original_config_yaml.as_bytes(),\n            &default_root,\n        )\n        .unwrap();\n        {\n            // Omit `index_uri` so the update falls back to the default.\n            let updated_config_yaml = r#\"\n                version: 0.8\n                index_id: hdfs-logs\n                doc_mapping: {}\n            \"#;\n            let 
updated_config = load_index_config_update(\n                ConfigFormat::Yaml,\n                updated_config_yaml.as_bytes(),\n                &default_root,\n                &original_config,\n            )\n            .unwrap();\n            assert_eq!(updated_config.index_uri.as_str(), \"s3://mybucket/hdfs-logs\");\n        }\n        {\n            // use the current index_uri explicitly\n            let updated_config_yaml = r#\"\n                version: 0.8\n                index_id: hdfs-logs\n                index_uri: s3://mybucket/hdfs-logs\n                doc_mapping: {}\n            \"#;\n            let updated_config = load_index_config_update(\n                ConfigFormat::Yaml,\n                updated_config_yaml.as_bytes(),\n                &default_root,\n                &original_config,\n            )\n            .unwrap();\n            assert_eq!(updated_config.index_uri.as_str(), \"s3://mybucket/hdfs-logs\");\n        }\n        {\n            // try using a different index_uri\n            let updated_config_yaml = r#\"\n                version: 0.8\n                index_id: hdfs-logs\n                index_uri: s3://mybucket/new-directory/\n                doc_mapping: {}\n            \"#;\n            let load_error = load_index_config_update(\n                ConfigFormat::Yaml,\n                updated_config_yaml.as_bytes(),\n                &default_root,\n                &original_config,\n            )\n            .unwrap_err();\n            assert!(format!(\"{load_error:?}\").contains(\"`index_uri` cannot be updated\"));\n        }\n    }\n\n    #[test]\n    fn test_update_reset_defaults() {\n        let original_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                field_mappings:\n                    - name: timestamp\n                      type: datetime\n                      fast: true\n                timestamp_field: timestamp\n\n            
search_settings:\n                default_search_fields: [body]\n\n            indexing_settings:\n                commit_timeout_secs: 10\n\n            retention:\n                period: 90 days\n                schedule: daily\n        \"#;\n        let default_root = Uri::for_test(\"s3://mybucket\");\n        let original_config: IndexConfig = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            original_config_yaml.as_bytes(),\n            &default_root,\n        )\n        .unwrap();\n\n        let updated_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                field_mappings:\n                    - name: timestamp\n                      type: datetime\n                      fast: true\n                timestamp_field: timestamp\n        \"#;\n        let updated_config = load_index_config_update(\n            ConfigFormat::Yaml,\n            updated_config_yaml.as_bytes(),\n            &default_root,\n            &original_config,\n        )\n        .unwrap();\n        assert_eq!(\n            updated_config.search_settings.default_search_fields,\n            Vec::<String>::default(),\n        );\n        assert_eq!(\n            updated_config.indexing_settings.commit_timeout_secs,\n            IndexingSettings::default_commit_timeout_secs()\n        );\n        assert_eq!(updated_config.retention_policy_opt, None);\n    }\n\n    #[test]\n    fn test_update_doc_mappings() {\n        let original_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping: {}\n        \"#;\n        let default_root = Uri::for_test(\"s3://mybucket\");\n        let original_config: IndexConfig = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            original_config_yaml.as_bytes(),\n            &default_root,\n        )\n        .unwrap();\n\n        let updated_config_yaml = r#\"\n            version: 0.8\n       
     index_id: hdfs-logs\n            doc_mapping:\n                field_mappings:\n                    - name: body\n                      type: text\n                      tokenizer: default\n                      record: position\n        \"#;\n        let updated_config = load_index_config_update(\n            ConfigFormat::Yaml,\n            updated_config_yaml.as_bytes(),\n            &default_root,\n            &original_config,\n        )\n        .unwrap();\n        assert_eq!(updated_config.doc_mapping.field_mappings.len(), 1);\n    }\n\n    #[test]\n    fn test_update_doc_mappings_failing_cases() {\n        let original_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                mode: lenient\n                doc_mapping_uid: 00000000000000000000000000\n                timestamp_field: timestamp\n                field_mappings:\n                    - name: timestamp\n                      type: datetime\n                      fast: true\n        \"#;\n        let default_root = Uri::for_test(\"s3://mybucket\");\n        let original_config: IndexConfig = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            original_config_yaml.as_bytes(),\n            &default_root,\n        )\n        .unwrap();\n\n        let updated_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                mode: lenient\n                doc_mapping_uid: 00000000000000000000000000\n                timestamp_field: timestamp\n                field_mappings:\n                    - name: timestamp\n                      type: datetime\n                      fast: true\n                    - name: body\n                      type: text\n                      tokenizer: default\n                      record: position\n        \"#;\n        load_index_config_update(\n            ConfigFormat::Yaml,\n            
updated_config_yaml.as_bytes(),\n            &default_root,\n            &original_config,\n        )\n        .expect_err(\"mapping changed but uid fixed should error\");\n\n        let updated_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                mode: lenient\n                field_mappings:\n                    - name: timestamp\n                      type: datetime\n                      fast: true\n        \"#;\n        load_index_config_update(\n            ConfigFormat::Yaml,\n            updated_config_yaml.as_bytes(),\n            &default_root,\n            &original_config,\n        )\n        .expect_err(\"timestamp field removed should error\");\n\n        let updated_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                mode: lenient\n                timestamp_field: timestamp\n                field_mappings:\n                    - name: body\n                      type: text\n                      tokenizer: default\n                      record: position\n        \"#;\n        load_index_config_update(\n            ConfigFormat::Yaml,\n            updated_config_yaml.as_bytes(),\n            &default_root,\n            &original_config,\n        )\n        .expect_err(\"field required for timestamp is absent\");\n\n        let updated_config_yaml = r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n                mode: lenient\n                timestamp_field: timestamp\n                field_mappings:\n                    - name: timestamp\n                      type: datetime\n                      fast: true\n            search_settings:\n              default_search_fields: [\"i_dont_exist\"]\n        \"#;\n        load_index_config_update(\n            ConfigFormat::Yaml,\n            updated_config_yaml.as_bytes(),\n            &default_root,\n            
&original_config,\n        )\n        .expect_err(\"field required for default search is absent\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/index_template/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod serialize;\n\nuse anyhow::ensure;\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::types::{DocMappingUid, IndexId};\nuse serde::{Deserialize, Serialize};\npub use serialize::{IndexTemplateV0_8, VersionedIndexTemplate};\n\nuse crate::index_config::{IngestSettings, validate_index_config};\nuse crate::{\n    DocMapping, IndexConfig, IndexingSettings, RetentionPolicy, SearchSettings,\n    validate_identifier, validate_index_id_pattern,\n};\n\npub type IndexTemplateId = String;\npub type IndexIdPattern = String;\n\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\n#[serde(into = \"VersionedIndexTemplate\")]\n#[serde(from = \"VersionedIndexTemplate\")]\npub struct IndexTemplate {\n    pub template_id: IndexTemplateId,\n    pub index_id_patterns: Vec<IndexIdPattern>,\n    #[serde(default)]\n    pub index_root_uri: Option<Uri>,\n    #[serde(default)]\n    pub priority: usize,\n    #[serde(default)]\n    pub description: Option<String>,\n    pub doc_mapping: DocMapping,\n    #[serde(default)]\n    pub indexing_settings: IndexingSettings,\n    #[serde(default)]\n    pub ingest_settings: IngestSettings,\n    #[serde(default)]\n    pub search_settings: SearchSettings,\n    #[serde(rename = \"retention\")]\n    #[serde(default)]\n    pub retention_policy_opt: Option<RetentionPolicy>,\n}\n\nimpl IndexTemplate {\n    pub fn apply_template(\n 
       &self,\n        index_id: IndexId,\n        default_index_root_uri: &Uri,\n    ) -> anyhow::Result<IndexConfig> {\n        let index_uri = self\n            .index_root_uri\n            .as_ref()\n            .unwrap_or(default_index_root_uri)\n            .join(&index_id)?;\n\n        // Ensure that the doc mapping UID is truly unique per index.\n        let mut doc_mapping = self.doc_mapping.clone();\n        doc_mapping.doc_mapping_uid = DocMappingUid::random();\n\n        let index_config = IndexConfig {\n            index_id,\n            index_uri,\n            doc_mapping,\n            indexing_settings: self.indexing_settings.clone(),\n            ingest_settings: self.ingest_settings.clone(),\n            search_settings: self.search_settings.clone(),\n            retention_policy_opt: self.retention_policy_opt.clone(),\n        };\n        Ok(index_config)\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        validate_identifier(\"template\", &self.template_id)?;\n\n        ensure!(\n            !self.index_id_patterns.is_empty(),\n            \"`index_id_patterns` must not be empty\"\n        );\n        for index_id_pattern in &self.index_id_patterns {\n            validate_index_id_pattern(index_id_pattern, true)?;\n        }\n        validate_index_config(\n            &self.doc_mapping,\n            &self.indexing_settings,\n            &self.search_settings,\n            &self.retention_policy_opt,\n        )?;\n        Ok(())\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(template_id: &str, index_id_patterns: &[&str], priority: usize) -> Self {\n        let index_id_patterns: Vec<IndexIdPattern> = index_id_patterns\n            .iter()\n            .map(|pattern| pattern.to_string())\n            .collect();\n\n        let doc_mapping_json = r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"ts\",\n                    \"type\": \"datetime\",\n            
        \"fast\": true\n                },\n                {\n                    \"name\": \"message\",\n                    \"type\": \"json\"\n                }\n            ],\n            \"timestamp_field\": \"ts\"\n        }\"#;\n        let doc_mapping: DocMapping = serde_json::from_str(doc_mapping_json).unwrap();\n\n        IndexTemplate {\n            template_id: template_id.to_string(),\n            index_root_uri: Some(Uri::for_test(\"ram:///indexes\")),\n            index_id_patterns,\n            priority,\n            description: Some(\"Test description.\".to_string()),\n            doc_mapping,\n            indexing_settings: IndexingSettings::default(),\n            ingest_settings: IngestSettings::default(),\n            search_settings: SearchSettings::default(),\n            retention_policy_opt: None,\n        }\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl crate::TestableForRegression for IndexTemplate {\n    fn sample_for_regression() -> Self {\n        let template_id = \"test-template\".to_string();\n        let index_id_patterns = vec![\n            \"test-index-foo*\".to_string(),\n            \"-test-index-foobar\".to_string(),\n        ];\n\n        let doc_mapping_json = r#\"{\n            \"doc_mapping_uid\": \"00000000000000000000000001\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"ts\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"message\",\n                    \"type\": \"json\"\n                }\n            ],\n            \"timestamp_field\": \"ts\"\n        }\"#;\n        let doc_mapping: DocMapping = serde_json::from_str(doc_mapping_json).unwrap();\n\n        IndexTemplate {\n            template_id: template_id.to_string(),\n            index_root_uri: Some(Uri::for_test(\"ram:///indexes\")),\n            index_id_patterns,\n            priority: 100,\n    
        description: Some(\"Test description.\".to_string()),\n            doc_mapping,\n            indexing_settings: IndexingSettings::default(),\n            ingest_settings: IngestSettings::default(),\n            search_settings: SearchSettings::default(),\n            retention_policy_opt: Some(RetentionPolicy {\n                retention_period: \"42 days\".to_string(),\n                evaluation_schedule: \"daily\".to_string(),\n            }),\n        }\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        assert_eq!(self, other);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_index_template_serde() {\n        let index_template_yaml = r#\"\n            version: 0.8\n\n            template_id: test-template\n            index_id_patterns:\n              - test-index-*\n              - -test-index-foo\n            description: Test description.\n            priority: 100\n            doc_mapping:\n              field_mappings:\n                - name: ts\n                  type: datetime\n                  fast: true\n                - name: message\n                  type: json\n              timestamp_field: ts\n        \"#;\n        let index_template: IndexTemplate = serde_yaml::from_str(index_template_yaml).unwrap();\n        assert_eq!(index_template.template_id, \"test-template\");\n        assert_eq!(index_template.index_id_patterns.len(), 2);\n        assert_eq!(\n            index_template.index_id_patterns,\n            [\"test-index-*\", \"-test-index-foo\"]\n        );\n        assert_eq!(index_template.priority, 100);\n        assert_eq!(index_template.description.unwrap(), \"Test description.\");\n        assert_eq!(index_template.doc_mapping.timestamp_field.unwrap(), \"ts\");\n    }\n\n    #[test]\n    fn test_index_template_apply() {\n        let mut index_template = IndexTemplate::for_test(\"test-template\", &[\"test-index-*\"], 0);\n\n        index_template.indexing_settings = 
IndexingSettings {\n            commit_timeout_secs: 42,\n            ..Default::default()\n        };\n        index_template.search_settings = SearchSettings {\n            default_search_fields: vec![\"message\".to_string()],\n        };\n        index_template.retention_policy_opt = Some(RetentionPolicy {\n            retention_period: \"42 days\".to_string(),\n            evaluation_schedule: \"hourly\".to_string(),\n        });\n        let default_index_root_uri = Uri::for_test(\"s3://test-bucket/indexes\");\n\n        let index_config_foo = index_template\n            .apply_template(\"test-index-foo\".to_string(), &default_index_root_uri)\n            .unwrap();\n\n        assert_eq!(index_config_foo.index_id, \"test-index-foo\");\n        assert_eq!(index_config_foo.index_uri, \"ram:///indexes/test-index-foo\");\n\n        assert_eq!(index_config_foo.doc_mapping.timestamp_field.unwrap(), \"ts\");\n        assert_eq!(index_config_foo.indexing_settings.commit_timeout_secs, 42);\n        assert_eq!(\n            index_config_foo.search_settings.default_search_fields,\n            [\"message\"]\n        );\n        let retention_policy = index_config_foo.retention_policy_opt.unwrap();\n        assert_eq!(retention_policy.retention_period, \"42 days\");\n        assert_eq!(retention_policy.evaluation_schedule, \"hourly\");\n\n        index_template.index_root_uri = None;\n\n        let index_config_bar = index_template\n            .apply_template(\"test-index-bar\".to_string(), &default_index_root_uri)\n            .unwrap();\n\n        assert_eq!(index_config_bar.index_id, \"test-index-bar\");\n        assert_eq!(\n            index_config_bar.index_uri,\n            \"s3://test-bucket/indexes/test-index-bar\"\n        );\n        assert_ne!(\n            index_config_foo.doc_mapping.doc_mapping_uid,\n            index_config_bar.doc_mapping.doc_mapping_uid\n        );\n    }\n\n    #[test]\n    fn test_index_template_validate() {\n        let index_template 
= IndexTemplate::for_test(\"\", &[], 0);\n        let error = index_template.validate().unwrap_err();\n        assert!(error.to_string().contains(\"template ID `` is invalid\"));\n\n        let index_template = IndexTemplate::for_test(\"test-template\", &[], 0);\n        let error = index_template.validate().unwrap_err();\n        assert!(error.to_string().contains(\"empty\"));\n\n        let index_template = IndexTemplate::for_test(\"test-template\", &[\"\"], 0);\n        let error = index_template.validate().unwrap_err();\n        assert!(error.to_string().contains(\"index ID pattern `` is invalid\"));\n\n        let mut index_template = IndexTemplate::for_test(\"test-template\", &[\"test-index-*\"], 0);\n        index_template.retention_policy_opt = Some(RetentionPolicy {\n            retention_period: \"\".to_string(),\n            evaluation_schedule: \"\".to_string(),\n        });\n        let error = index_template.validate().unwrap_err();\n        assert!(\n            error\n                .to_string()\n                .contains(\"failed to parse retention period\")\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/index_template/serialize.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::uri::Uri;\nuse serde::{Deserialize, Serialize};\n\nuse super::{IndexIdPattern, IndexTemplate, IndexTemplateId};\nuse crate::index_config::IngestSettings;\nuse crate::{DocMapping, IndexingSettings, RetentionPolicy, SearchSettings};\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(tag = \"version\")]\npub enum VersionedIndexTemplate {\n    #[serde(rename = \"0.9\")]\n    #[serde(alias = \"0.8\")]\n    #[serde(alias = \"0.7\")]\n    V0_8(IndexTemplateV0_8),\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct IndexTemplateV0_8 {\n    #[schema(value_type = String)]\n    pub template_id: IndexTemplateId,\n    /// Glob patterns (e.g., `logs-foo*`) with negation by prepending `-` (e.g `-logs-fool`).\n    #[schema(value_type = Vec<String>)]\n    pub index_id_patterns: Vec<IndexIdPattern>,\n    /// The actual index URI is the concatenation of this with the index id.\n    #[schema(value_type = String)]\n    #[serde(default)]\n    pub index_root_uri: Option<Uri>,\n    /// When multiple templates match an index, the one with the highest priority is selected.\n    #[serde(default)]\n    pub priority: usize,\n    #[serde(default)]\n    pub description: Option<String>,\n\n    pub doc_mapping: DocMapping,\n    #[serde(default)]\n    pub 
indexing_settings: IndexingSettings,\n    #[serde(default)]\n    pub ingest_settings: IngestSettings,\n    #[serde(default)]\n    pub search_settings: SearchSettings,\n    #[serde(default)]\n    pub retention: Option<RetentionPolicy>,\n}\n\nimpl From<VersionedIndexTemplate> for IndexTemplate {\n    fn from(versioned_index_template: VersionedIndexTemplate) -> Self {\n        match versioned_index_template {\n            VersionedIndexTemplate::V0_8(v0_8) => v0_8.into(),\n        }\n    }\n}\n\nimpl From<IndexTemplate> for VersionedIndexTemplate {\n    fn from(index_template: IndexTemplate) -> Self {\n        VersionedIndexTemplate::V0_8(index_template.into())\n    }\n}\n\nimpl From<IndexTemplateV0_8> for IndexTemplate {\n    fn from(index_template_v0_8: IndexTemplateV0_8) -> Self {\n        IndexTemplate {\n            template_id: index_template_v0_8.template_id,\n            index_id_patterns: index_template_v0_8.index_id_patterns,\n            index_root_uri: index_template_v0_8.index_root_uri,\n            priority: index_template_v0_8.priority,\n            description: index_template_v0_8.description,\n            doc_mapping: index_template_v0_8.doc_mapping,\n            indexing_settings: index_template_v0_8.indexing_settings,\n            ingest_settings: index_template_v0_8.ingest_settings,\n            search_settings: index_template_v0_8.search_settings,\n            retention_policy_opt: index_template_v0_8.retention,\n        }\n    }\n}\n\nimpl From<IndexTemplate> for IndexTemplateV0_8 {\n    fn from(index_template: IndexTemplate) -> Self {\n        IndexTemplateV0_8 {\n            template_id: index_template.template_id,\n            index_id_patterns: index_template.index_id_patterns,\n            index_root_uri: index_template.index_root_uri,\n            priority: index_template.priority,\n            description: index_template.description,\n            doc_mapping: index_template.doc_mapping,\n            indexing_settings: 
index_template.indexing_settings,\n            ingest_settings: index_template.ingest_settings,\n            search_settings: index_template.search_settings,\n            retention: index_template.retention_policy_opt,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nuse std::hash::Hasher;\nuse std::str::FromStr;\n\nuse anyhow::{Context, bail, ensure};\nuse json_comments::StripComments;\nuse once_cell::sync::Lazy;\nuse quickwit_common::get_bool_from_env;\nuse quickwit_common::net::is_valid_hostname;\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::types::NodeIdRef;\nuse regex::Regex;\n\nmod cluster_config;\nmod config_value;\nmod index_config;\nmod index_template;\npub mod merge_policy_config;\nmod metastore_config;\nmod node_config;\nmod qw_env_vars;\npub(crate) mod serde_utils;\npub mod service;\nmod source_config;\nmod storage_config;\nmod templating;\n\npub use cluster_config::ClusterConfig;\n// We export that one for backward compatibility.\n// See #2048\nuse index_config::serialize::{IndexConfigV0_8, VersionedIndexConfig};\npub use index_config::{\n    IndexConfig, IndexingResources, IndexingSettings, IngestSettings, RetentionPolicy,\n    SearchSettings, build_doc_mapper, load_index_config_from_user_config, load_index_config_update,\n    prepare_doc_mapping_update,\n};\npub use quickwit_doc_mapper::DocMapping;\nuse serde::Serialize;\nuse serde::de::DeserializeOwned;\nuse serde_json::Value as JsonValue;\nuse siphasher::sip::SipHasher;\nuse source_config::FileSourceParamsForSerde;\npub use source_config::{\n    CLI_SOURCE_ID, FileSourceMessageType, 
FileSourceNotification, FileSourceParams, FileSourceSqs,\n    INGEST_API_SOURCE_ID, INGEST_V2_SOURCE_ID, KafkaSourceParams, KinesisSourceParams,\n    PubSubSourceParams, PulsarSourceAuth, PulsarSourceParams, RegionOrEndpoint, SourceConfig,\n    SourceInputFormat, SourceParams, TransformConfig, VecSourceParams, VoidSourceParams,\n    load_source_config_from_user_config, load_source_config_update,\n};\nuse tracing::warn;\n\nuse crate::index_template::IndexTemplateV0_8;\npub use crate::index_template::{IndexTemplate, IndexTemplateId, VersionedIndexTemplate};\nuse crate::merge_policy_config::{\n    ConstWriteAmplificationMergePolicyConfig, MergePolicyConfig, StableLogMergePolicyConfig,\n};\npub use crate::metastore_config::{\n    MetastoreBackend, MetastoreConfig, MetastoreConfigs, PostgresMetastoreConfig,\n};\npub use crate::node_config::{\n    CacheConfig, CachePolicy, DEFAULT_QW_CONFIG_PATH, GrpcConfig, IndexerConfig, IngestApiConfig,\n    JaegerConfig, KeepAliveConfig, LambdaConfig, LambdaDeployConfig, NodeConfig, RestConfig,\n    SearcherConfig, SplitCacheLimits, StorageTimeoutPolicy, TlsConfig,\n};\nuse crate::source_config::serialize::{SourceConfigV0_7, SourceConfigV0_8, VersionedSourceConfig};\npub use crate::storage_config::{\n    AzureStorageConfig, FileStorageConfig, GoogleCloudStorageConfig, RamStorageConfig,\n    S3StorageConfig, StorageBackend, StorageBackendFlavor, StorageConfig, StorageConfigs,\n};\n\n/// Returns true if the ingest API v2 is enabled.\npub fn enable_ingest_v2() -> bool {\n    static ENABLE_INGEST_V2: Lazy<bool> =\n        Lazy::new(|| get_bool_from_env(\"QW_ENABLE_INGEST_V2\", true));\n    *ENABLE_INGEST_V2\n}\n\n/// Returns true if the ingest API v1 is disabled.\npub fn disable_ingest_v1() -> bool {\n    static DISABLE_INGEST_V1: Lazy<bool> =\n        Lazy::new(|| get_bool_from_env(\"QW_DISABLE_INGEST_V1\", false));\n    *DISABLE_INGEST_V1\n}\n\n#[derive(utoipa::OpenApi)]\n#[openapi(components(schemas(\n    
ConstWriteAmplificationMergePolicyConfig,\n    DocMapping,\n    FileSourceMessageType,\n    FileSourceNotification,\n    FileSourceParamsForSerde,\n    FileSourceSqs,\n    IndexConfigV0_8,\n    IndexingResources,\n    IndexingSettings,\n    IndexTemplateV0_8,\n    IngestSettings,\n    KafkaSourceParams,\n    KinesisSourceParams,\n    MergePolicyConfig,\n    PubSubSourceParams,\n    PulsarSourceAuth,\n    PulsarSourceParams,\n    RegionOrEndpoint,\n    RetentionPolicy,\n    SearchSettings,\n    SourceConfigV0_7,\n    SourceConfigV0_8,\n    SourceInputFormat,\n    SourceParams,\n    StableLogMergePolicyConfig,\n    TransformConfig,\n    VecSourceParams,\n    VersionedIndexConfig,\n    VersionedIndexTemplate,\n    VersionedSourceConfig,\n    VoidSourceParams,\n)))]\n/// Schemas used for OpenAPI generation that are part of this crate.\npub struct ConfigApiSchemas;\n\n/// Checks whether an identifier conforms to Quickwit naming conventions.\npub fn validate_identifier(label: &str, value: &str) -> anyhow::Result<()> {\n    static IDENTIFIER_REGEX: Lazy<Regex> = Lazy::new(|| {\n        Regex::new(r\"^[a-zA-Z][a-zA-Z0-9-_\\.]{2,254}$\").expect(\"regular expression should compile\")\n    });\n    ensure!(\n        IDENTIFIER_REGEX.is_match(value),\n        \"{label} ID `{value}` is invalid: identifiers must match the following regular \\\n         expression: `^[a-zA-Z][a-zA-Z0-9-_\\\\.]{{2,254}}$`\"\n    );\n    Ok(())\n}\n\n/// Checks whether an index ID pattern conforms to Quickwit conventions.\n/// Index ID patterns accept the same characters as identifiers AND accept `*`\n/// chars to allow for glob-like patterns.\npub fn validate_index_id_pattern(pattern: &str, allow_negative: bool) -> anyhow::Result<()> {\n    static IDENTIFIER_REGEX_WITH_GLOB_PATTERN: Lazy<Regex> = Lazy::new(|| {\n        Regex::new(r\"^[a-zA-Z\\*][a-zA-Z0-9-_\\.\\*]{0,254}$\")\n            .expect(\"regular expression should compile\")\n    });\n    static 
IDENTIFIER_REGEX_WITH_GLOB_PATTERN_NEGATIVE: Lazy<Regex> = Lazy::new(|| {\n        Regex::new(r\"^-?[a-zA-Z\\*][a-zA-Z0-9-_\\.\\*]{0,254}$\")\n            .expect(\"regular expression should compile\")\n    });\n\n    let regex = if allow_negative {\n        &IDENTIFIER_REGEX_WITH_GLOB_PATTERN_NEGATIVE\n    } else {\n        &IDENTIFIER_REGEX_WITH_GLOB_PATTERN\n    };\n\n    if !regex.is_match(pattern) {\n        bail!(\n            \"index ID pattern `{pattern}` is invalid: patterns must match the following regular \\\n             expression: `^[a-zA-Z\\\\*][a-zA-Z0-9-_\\\\.\\\\*]{{0,254}}$`\"\n        );\n    }\n    // Forbid consecutive stars in the pattern to steer users toward simpler patterns,\n    // as consecutive stars do not bring any value.\n    if pattern.contains(\"**\") {\n        bail!(\n            \"index ID pattern `{pattern}` is invalid: patterns must not contain multiple \\\n             consecutive `*`\"\n        );\n    }\n    // If there is no star in the pattern, we need at least 3 characters.\n    if !pattern.contains('*') && pattern.len() < 3 {\n        bail!(\n            \"index ID pattern `{pattern}` is invalid: an index ID must have at least 3 characters\"\n        );\n    }\n    Ok(())\n}\n\npub fn validate_node_id(node_id: &NodeIdRef) -> anyhow::Result<()> {\n    if !is_valid_hostname(node_id.as_str()) {\n        bail!(\n            \"node identifier `{node_id}` is invalid. 
node identifiers must be valid short \\\n             hostnames (see RFC 1123)\"\n        );\n    }\n    Ok(())\n}\n\n#[derive(Copy, Clone, Debug, Eq, PartialEq)]\npub enum ConfigFormat {\n    Json,\n    Toml,\n    Yaml,\n}\n\nimpl ConfigFormat {\n    pub fn as_str(&self) -> &'static str {\n        match self {\n            ConfigFormat::Json => \"json\",\n            ConfigFormat::Toml => \"toml\",\n            ConfigFormat::Yaml => \"yaml\",\n        }\n    }\n\n    pub fn sniff_from_uri(uri: &Uri) -> anyhow::Result<ConfigFormat> {\n        let extension_str: &str = uri.extension().with_context(|| {\n            format!(\n                \"failed to parse config file `{uri}`: file extension is missing. supported file \\\n                 formats and extensions are JSON (.json), TOML (.toml), and YAML (.yaml or .yml)\"\n            )\n        })?;\n        ConfigFormat::from_str(extension_str)\n            .with_context(|| format!(\"failed to identify configuration file format {uri}\"))\n    }\n\n    pub fn parse<T>(&self, payload: &[u8]) -> anyhow::Result<T>\n    where T: DeserializeOwned {\n        match self {\n            ConfigFormat::Json => {\n                let mut json_value: JsonValue =\n                    serde_json::from_reader(StripComments::new(payload))?;\n                let version_value = json_value.get_mut(\"version\").context(\"missing version\")?;\n                if let Some(version_number) = version_value.as_u64() {\n                    warn!(version_value=?version_value, \"`version` should be a string\");\n                    *version_value = JsonValue::String(version_number.to_string());\n                }\n                serde_json::from_value(json_value).context(\"failed to parse JSON file\")\n            }\n            ConfigFormat::Toml => {\n                let payload_str = std::str::from_utf8(payload)\n                    .context(\"configuration file contains invalid UTF-8 characters\")?;\n                let mut toml_value: 
toml::Value =\n                    toml::from_str(payload_str).context(\"failed to parse TOML file\")?;\n                let version_value = toml_value.get_mut(\"version\").context(\"missing version\")?;\n                if let Some(version_number) = version_value.as_integer() {\n                    warn!(version_value=?version_value, \"`version` should be a string\");\n                    *version_value = toml::Value::String(version_number.to_string());\n                    // Reserialize the whole document, not just the patched `version` value,\n                    // so that the rest of the config survives the round trip.\n                    let reserialized = toml::to_string(&toml_value)\n                        .context(\"failed to reserialize toml config\")?;\n                    toml::from_str(&reserialized).context(\"failed to parse TOML file\")\n                } else {\n                    toml::from_str(payload_str).context(\"failed to parse TOML file\")\n                }\n            }\n            ConfigFormat::Yaml => {\n                serde_yaml::from_slice(payload).context(\"failed to parse YAML file\")\n            }\n        }\n    }\n}\n\nimpl FromStr for ConfigFormat {\n    type Err = anyhow::Error;\n\n    fn from_str(ext: &str) -> anyhow::Result<Self> {\n        match ext {\n            \"json\" => Ok(Self::Json),\n            \"toml\" => Ok(Self::Toml),\n            \"yaml\" | \"yml\" => Ok(Self::Yaml),\n            _ => bail!(\n                \"file extension `.{ext}` is not supported. supported file formats and extensions \\\n                 are JSON (.json), TOML (.toml), and YAML (.yaml or .yml)\",\n            ),\n        }\n    }\n}\n\npub trait TestableForRegression: Serialize + DeserializeOwned {\n    /// Produces an instance of `Self` whose serialization output will be tested against future\n    /// versions of the format for backward compatibility.\n    fn sample_for_regression() -> Self;\n\n    /// Asserts that `self` and `other` are equal. 
It must panic if they are not.\n    fn assert_equality(&self, other: &Self);\n}\n\n/// Returns a fingerprint (a hash) of all the parameters that should force an\n/// indexing pipeline to restart upon index or source config updates.\npub fn indexing_pipeline_params_fingerprint(\n    index_config: &IndexConfig,\n    source_config: &SourceConfig,\n) -> u64 {\n    let mut hasher = SipHasher::new();\n    hasher.write_u64(index_config.indexing_params_fingerprint());\n    hasher.write_u64(source_config.indexing_params_fingerprint());\n    hasher.finish()\n}\n\n#[cfg(test)]\nmod tests {\n    use super::validate_identifier;\n    use crate::validate_index_id_pattern;\n\n    #[test]\n    fn test_validate_identifier() {\n        validate_identifier(\"cluster\", \"\").unwrap_err();\n        validate_identifier(\"cluster\", \"-\").unwrap_err();\n        validate_identifier(\"cluster\", \"_\").unwrap_err();\n        validate_identifier(\"cluster\", \"f\").unwrap_err();\n        validate_identifier(\"cluster\", \"fo\").unwrap_err();\n        validate_identifier(\"cluster\", \"_fo\").unwrap_err();\n        validate_identifier(\"cluster\", \"_foo\").unwrap_err();\n        validate_identifier(\"cluster\", \".foo.bar\").unwrap_err();\n        validate_identifier(\"cluster\", \"foo\").unwrap();\n        validate_identifier(\"cluster\", \"f-_\").unwrap();\n        validate_identifier(\"index\", \"foo.bar\").unwrap();\n\n        assert!(\n            validate_identifier(\"cluster\", \"foo!\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"cluster ID `foo!` is invalid\")\n        );\n    }\n\n    #[test]\n    fn test_validate_index_id_pattern() {\n        validate_index_id_pattern(\"*\", false).unwrap();\n        validate_index_id_pattern(\"abc.*\", false).unwrap();\n        validate_index_id_pattern(\"ab\", false).unwrap_err();\n        validate_index_id_pattern(\"\", false).unwrap_err();\n        validate_index_id_pattern(\"**\", 
false).unwrap_err();\n        assert!(\n            validate_index_id_pattern(\"foo!\", false)\n                .unwrap_err()\n                .to_string()\n                .contains(\"index ID pattern `foo!` is invalid:\")\n        );\n        validate_index_id_pattern(\"-abc\", true).unwrap();\n        validate_index_id_pattern(\"-abc\", false).unwrap_err();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/merge_policy_config.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse serde::{Deserialize, Deserializer, Serialize, Serializer, de};\n\nfn is_zero(value: &usize) -> bool {\n    *value == 0\n}\n\n#[derive(Debug, Clone, Serialize, Deserialize, Eq, PartialEq, Hash, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct ConstWriteAmplificationMergePolicyConfig {\n    /// Number of splits to merge together in a single merge operation.\n    #[serde(default = \"default_merge_factor\")]\n    pub merge_factor: usize,\n    /// Maximum number of splits that can be merged together in a single merge operation.\n    #[serde(default = \"default_max_merge_factor\")]\n    pub max_merge_factor: usize,\n    /// Maximum number of merges that a given split should undergo.\n    #[serde(default = \"default_max_merge_ops\")]\n    pub max_merge_ops: usize,\n    /// Duration relative to `split.created_timestamp` after which a split\n    /// becomes mature.\n    /// If `now() >= split.created_timestamp + maturation_period` then\n    /// the split is considered mature.\n    #[schema(value_type = String)]\n    #[serde(default = \"default_maturation_period\")]\n    #[serde(deserialize_with = \"parse_human_duration\")]\n    #[serde(serialize_with = \"serialize_duration\")]\n    pub maturation_period: Duration,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub 
max_finalize_merge_operations: usize,\n    /// Splits with a number of docs higher than\n    /// `max_finalize_split_num_docs` will not be considered\n    /// for finalize split merge operations.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub max_finalize_split_num_docs: Option<usize>,\n}\n\nimpl Default for ConstWriteAmplificationMergePolicyConfig {\n    fn default() -> ConstWriteAmplificationMergePolicyConfig {\n        ConstWriteAmplificationMergePolicyConfig {\n            max_merge_ops: default_max_merge_ops(),\n            merge_factor: default_merge_factor(),\n            max_merge_factor: default_max_merge_factor(),\n            maturation_period: default_maturation_period(),\n            max_finalize_merge_operations: 0,\n            max_finalize_split_num_docs: None,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, Eq, PartialEq, Hash, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct StableLogMergePolicyConfig {\n    /// Number of docs below which all splits are considered as belonging to the same level.\n    #[serde(default = \"default_min_level_num_docs\")]\n    pub min_level_num_docs: usize,\n    /// Number of splits to merge together in a single merge operation.\n    #[serde(default = \"default_merge_factor\")]\n    pub merge_factor: usize,\n    /// Maximum number of splits that can be merged together in a single merge operation.\n    #[serde(default = \"default_max_merge_factor\")]\n    pub max_merge_factor: usize,\n    /// Duration relative to `split.created_timestamp` after which a split\n    /// becomes mature.\n    /// If `now() >= split.created_timestamp + maturation_period` then\n    /// the split is mature.\n    #[schema(value_type = String)]\n    #[serde(default = \"default_maturation_period\")]\n    #[serde(deserialize_with = \"parse_human_duration\")]\n    #[serde(serialize_with = \"serialize_duration\")]\n    pub maturation_period: Duration,\n}\n\nfn 
default_merge_factor() -> usize {\n    10\n}\n\nfn default_max_merge_factor() -> usize {\n    12\n}\n\nfn default_max_merge_ops() -> usize {\n    4\n}\n\nfn default_min_level_num_docs() -> usize {\n    100_000\n}\n\nfn default_maturation_period() -> Duration {\n    Duration::from_secs(48 * 3600)\n}\n\nimpl Default for StableLogMergePolicyConfig {\n    fn default() -> Self {\n        StableLogMergePolicyConfig {\n            min_level_num_docs: default_min_level_num_docs(),\n            merge_factor: default_merge_factor(),\n            max_merge_factor: default_max_merge_factor(),\n            maturation_period: default_maturation_period(),\n        }\n    }\n}\n\nfn parse_human_duration<'de, D>(deserializer: D) -> Result<Duration, D::Error>\nwhere D: Deserializer<'de> {\n    let value: String = Deserialize::deserialize(deserializer)?;\n    let duration = humantime::parse_duration(&value).map_err(|error| {\n        de::Error::custom(format!(\n            \"failed to parse human-readable duration `{value}`: {error:?}\",\n        ))\n    })?;\n    Ok(duration)\n}\n\nfn serialize_duration<S>(value: &Duration, s: S) -> Result<S::Ok, S::Error>\nwhere S: Serializer {\n    let value_str = humantime::format_duration(*value).to_string();\n    s.serialize_str(&value_str)\n}\n\n#[derive(Debug, Serialize, Deserialize, Clone, Eq, PartialEq, Hash, utoipa::ToSchema)]\n#[serde(tag = \"type\")]\n#[serde(deny_unknown_fields)]\npub enum MergePolicyConfig {\n    #[serde(rename = \"no_merge\")]\n    Nop,\n    #[serde(rename = \"limit_merge\")]\n    ConstWriteAmplification(ConstWriteAmplificationMergePolicyConfig),\n    #[serde(rename = \"stable_log\")]\n    #[serde(alias = \"default\")]\n    StableLog(StableLogMergePolicyConfig),\n}\n\nimpl Default for MergePolicyConfig {\n    fn default() -> Self {\n        MergePolicyConfig::StableLog(StableLogMergePolicyConfig::default())\n    }\n}\n\nimpl MergePolicyConfig {\n    pub fn noop() -> Self {\n        MergePolicyConfig::Nop\n    }\n\n    
pub fn validate(&self) -> anyhow::Result<()> {\n        let (merge_factor, max_merge_factor) = match self {\n            MergePolicyConfig::Nop => {\n                return Ok(());\n            }\n            MergePolicyConfig::ConstWriteAmplification(config) => {\n                (config.merge_factor, config.max_merge_factor)\n            }\n            MergePolicyConfig::StableLog(config) => (config.merge_factor, config.max_merge_factor),\n        };\n        if max_merge_factor < merge_factor {\n            anyhow::bail!(\n                \"index config merge policy `max_merge_factor` must be superior or equal to \\\n                 `merge_factor`\"\n            );\n        }\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/metastore_config.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\nuse std::ops::Deref;\nuse std::time::Duration;\n\nuse anyhow::{Context, ensure};\nuse humantime::parse_duration;\nuse itertools::Itertools;\nuse serde::{Deserialize, Serialize};\nuse serde_with::{EnumMap, serde_as};\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum MetastoreBackend {\n    File,\n    #[serde(alias = \"pg\", alias = \"postgres\")]\n    PostgreSQL,\n}\n\n/// Holds the metastore configurations defined in the `metastore` section of node config files.\n///\n/// ```yaml\n/// metastore:\n///   file:\n///     polling_interval: 30s\n///\n///   postgres:\n///     max_connections: 12\n/// ```\n#[serde_as]\n#[derive(Debug, Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\npub struct MetastoreConfigs(#[serde_as(as = \"EnumMap\")] Vec<MetastoreConfig>);\n\nimpl MetastoreConfigs {\n    pub fn redact(&mut self) {\n        for metastore_config in &mut self.0 {\n            metastore_config.redact();\n        }\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        for metastore_config in &self.0 {\n            metastore_config.validate()?;\n        }\n        let backends: Vec<MetastoreBackend> = self\n            .0\n            .iter()\n            .map(|metastore_config| metastore_config.backend())\n 
           .sorted()\n            .collect();\n\n        for (left, right) in backends.iter().zip(backends.iter().skip(1)) {\n            ensure!(\n                left != right,\n                \"{left:?} metastore config is defined multiple times\"\n            );\n        }\n        Ok(())\n    }\n\n    pub fn find_file(&self) -> Option<&FileMetastoreConfig> {\n        self.0\n            .iter()\n            .find_map(|metastore_config| match metastore_config {\n                MetastoreConfig::File(file_metastore_config) => Some(file_metastore_config),\n                _ => None,\n            })\n    }\n\n    pub fn find_postgres(&self) -> Option<&PostgresMetastoreConfig> {\n        self.0\n            .iter()\n            .find_map(|metastore_config| match metastore_config {\n                MetastoreConfig::PostgreSQL(postgres_metastore_config) => {\n                    Some(postgres_metastore_config)\n                }\n                _ => None,\n            })\n    }\n}\n\nimpl Deref for MetastoreConfigs {\n    type Target = Vec<MetastoreConfig>;\n\n    fn deref(&self) -> &Self::Target {\n        &self.0\n    }\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum MetastoreConfig {\n    File(FileMetastoreConfig),\n    #[serde(alias = \"pg\", alias = \"postgres\")]\n    PostgreSQL(PostgresMetastoreConfig),\n}\n\nimpl MetastoreConfig {\n    pub fn backend(&self) -> MetastoreBackend {\n        match self {\n            Self::File(_) => MetastoreBackend::File,\n            Self::PostgreSQL(_) => MetastoreBackend::PostgreSQL,\n        }\n    }\n\n    pub fn as_file(&self) -> Option<&FileMetastoreConfig> {\n        match self {\n            Self::File(file_metastore_config) => Some(file_metastore_config),\n            _ => None,\n        }\n    }\n\n    pub fn as_postgres(&self) -> Option<&PostgresMetastoreConfig> {\n        match self {\n            Self::PostgreSQL(postgres_metastore_config) => 
Some(postgres_metastore_config),\n            _ => None,\n        }\n    }\n\n    pub fn redact(&mut self) {\n        // TODO: Implement this method when we end up storing secrets in the\n        // metastore config.\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        match self {\n            Self::File(file_metastore_config) => file_metastore_config.validate()?,\n            Self::PostgreSQL(postgres_metastore_config) => postgres_metastore_config.validate()?,\n        }\n        Ok(())\n    }\n}\n\nimpl From<FileMetastoreConfig> for MetastoreConfig {\n    fn from(file_metastore_config: FileMetastoreConfig) -> Self {\n        Self::File(file_metastore_config)\n    }\n}\n\nimpl From<PostgresMetastoreConfig> for MetastoreConfig {\n    fn from(postgres_metastore_config: PostgresMetastoreConfig) -> Self {\n        Self::PostgreSQL(postgres_metastore_config)\n    }\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct PostgresMetastoreConfig {\n    #[serde(default = \"PostgresMetastoreConfig::default_min_connections\")]\n    pub min_connections: usize,\n    #[serde(\n        alias = \"max_num_connections\",\n        default = \"PostgresMetastoreConfig::default_max_connections\"\n    )]\n    pub max_connections: NonZeroUsize,\n    #[serde(default = \"PostgresMetastoreConfig::default_acquire_connection_timeout\")]\n    pub acquire_connection_timeout: String,\n    #[serde(default = \"PostgresMetastoreConfig::default_idle_connection_timeout\")]\n    pub idle_connection_timeout: String,\n    #[serde(default = \"PostgresMetastoreConfig::default_max_connection_lifetime\")]\n    pub max_connection_lifetime: String,\n}\n\nimpl Default for PostgresMetastoreConfig {\n    fn default() -> Self {\n        Self {\n            min_connections: Self::default_min_connections(),\n            max_connections: Self::default_max_connections(),\n            acquire_connection_timeout: 
Self::default_acquire_connection_timeout(),\n            idle_connection_timeout: Self::default_idle_connection_timeout(),\n            max_connection_lifetime: Self::default_max_connection_lifetime(),\n        }\n    }\n}\n\nimpl PostgresMetastoreConfig {\n    pub fn default_min_connections() -> usize {\n        0\n    }\n\n    pub fn default_max_connections() -> NonZeroUsize {\n        NonZeroUsize::new(10).unwrap()\n    }\n\n    pub fn default_acquire_connection_timeout() -> String {\n        \"10s\".to_string()\n    }\n\n    pub fn default_idle_connection_timeout() -> String {\n        \"10min\".to_string()\n    }\n\n    pub fn default_max_connection_lifetime() -> String {\n        \"30min\".to_string()\n    }\n\n    pub fn acquire_connection_timeout(&self) -> anyhow::Result<Duration> {\n        parse_duration(&self.acquire_connection_timeout).with_context(|| {\n            format!(\n                \"failed to parse `acquire_connection_timeout` value `{}`\",\n                self.acquire_connection_timeout\n            )\n        })\n    }\n\n    pub fn idle_connection_timeout_opt(&self) -> anyhow::Result<Option<Duration>> {\n        if self.idle_connection_timeout.is_empty() || self.idle_connection_timeout == \"0\" {\n            return Ok(None);\n        }\n        let idle_connection_timeout =\n            parse_duration(&self.idle_connection_timeout).with_context(|| {\n                format!(\n                    \"failed to parse `idle_connection_timeout` value `{}`\",\n                    self.idle_connection_timeout\n                )\n            })?;\n        if idle_connection_timeout.is_zero() {\n            Ok(None)\n        } else {\n            Ok(Some(idle_connection_timeout))\n        }\n    }\n\n    pub fn max_connection_lifetime_opt(&self) -> anyhow::Result<Option<Duration>> {\n        if self.max_connection_lifetime.is_empty() || self.max_connection_lifetime == \"0\" {\n            return Ok(None);\n        }\n        let 
max_connection_lifetime =\n            parse_duration(&self.max_connection_lifetime).with_context(|| {\n                format!(\n                    \"failed to parse `max_connection_lifetime` value `{}`\",\n                    self.max_connection_lifetime\n                )\n            })?;\n        if max_connection_lifetime.is_zero() {\n            Ok(None)\n        } else {\n            Ok(Some(max_connection_lifetime))\n        }\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        ensure!(\n            self.min_connections <= self.max_connections.get(),\n            \"`min_connections` must be less than or equal to `max_connections`\"\n        );\n        self.acquire_connection_timeout()?;\n        self.idle_connection_timeout_opt()?;\n        self.max_connection_lifetime_opt()?;\n        Ok(())\n    }\n}\n\n#[derive(Debug, Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct FileMetastoreConfig;\n\nimpl FileMetastoreConfig {\n    pub fn validate(&self) -> anyhow::Result<()> {\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_metastore_configs_serde() {\n        let metastore_configs_yaml = \"\";\n        let metastore_configs: MetastoreConfigs =\n            serde_yaml::from_str(metastore_configs_yaml).unwrap();\n        assert!(metastore_configs.is_empty());\n\n        let metastore_configs_yaml = r#\"\n                postgres:\n                    max_connections: 12\n            \"#;\n        let metastore_configs: MetastoreConfigs =\n            serde_yaml::from_str(metastore_configs_yaml).unwrap();\n\n        let expected_metastore_configs = MetastoreConfigs(vec![\n            PostgresMetastoreConfig {\n                max_connections: NonZeroUsize::new(12).unwrap(),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        assert_eq!(metastore_configs, expected_metastore_configs);\n    }\n\n    
#[test]\n    fn test_metastore_configs_validate() {\n        let metastore_configs = MetastoreConfigs(vec![\n            PostgresMetastoreConfig {\n                max_connections: NonZeroUsize::new(12).unwrap(),\n                ..Default::default()\n            }\n            .into(),\n            PostgresMetastoreConfig {\n                max_connections: NonZeroUsize::new(12).unwrap(),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        let error = metastore_configs.validate().unwrap_err();\n        assert!(error.to_string().contains(\"defined multiple times\"));\n\n        let metastore_configs = MetastoreConfigs(vec![\n            PostgresMetastoreConfig {\n                acquire_connection_timeout: \"15\".to_string(),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        let error = metastore_configs.validate().unwrap_err();\n        assert!(error.to_string().contains(\"`acquire_connection_timeout`\"));\n    }\n\n    #[test]\n    fn test_pg_metastore_config_serde() {\n        {\n            let pg_metastore_config_yaml = \"\";\n            let pg_metastore_config: PostgresMetastoreConfig =\n                serde_yaml::from_str(pg_metastore_config_yaml).unwrap();\n            assert_eq!(pg_metastore_config, PostgresMetastoreConfig::default());\n        }\n        {\n            let pg_metastore_config_yaml = r#\"\n                max_connections: 12\n            \"#;\n            let pg_metastore_config: PostgresMetastoreConfig =\n                serde_yaml::from_str(pg_metastore_config_yaml).unwrap();\n\n            let expected_pg_metastore_config = PostgresMetastoreConfig {\n                max_connections: NonZeroUsize::new(12).unwrap(),\n                ..Default::default()\n            };\n            assert_eq!(pg_metastore_config, expected_pg_metastore_config);\n        }\n        {\n            let pg_metastore_config_yaml = r#\"\n                min_connections: 
6\n                max_connections: 12\n                acquire_connection_timeout: 500ms\n                idle_connection_timeout: 1h\n                max_connection_lifetime: 1d\n            \"#;\n            let pg_metastore_config: PostgresMetastoreConfig =\n                serde_yaml::from_str(pg_metastore_config_yaml).unwrap();\n\n            let expected_pg_metastore_config = PostgresMetastoreConfig {\n                min_connections: 6,\n                max_connections: NonZeroUsize::new(12).unwrap(),\n                acquire_connection_timeout: \"500ms\".to_string(),\n                idle_connection_timeout: \"1h\".to_string(),\n                max_connection_lifetime: \"1d\".to_string(),\n            };\n            assert_eq!(pg_metastore_config, expected_pg_metastore_config);\n            assert_eq!(\n                pg_metastore_config.acquire_connection_timeout().unwrap(),\n                Duration::from_millis(500)\n            );\n            assert_eq!(\n                pg_metastore_config.idle_connection_timeout_opt().unwrap(),\n                Some(Duration::from_secs(3600))\n            );\n            assert_eq!(\n                pg_metastore_config.max_connection_lifetime_opt().unwrap(),\n                Some(Duration::from_secs(24 * 3600))\n            );\n        }\n        {\n            let pg_metastore_config_yaml = r#\"\n                min_connections: 6\n                max_connections: 12\n                acquire_connection_timeout: 15s\n                idle_connection_timeout: \"\"\n                max_connection_lifetime: 0\n            \"#;\n            let pg_metastore_config: PostgresMetastoreConfig =\n                serde_yaml::from_str(pg_metastore_config_yaml).unwrap();\n\n            let expected_pg_metastore_config = PostgresMetastoreConfig {\n                min_connections: 6,\n                max_connections: NonZeroUsize::new(12).unwrap(),\n                acquire_connection_timeout: \"15s\".to_string(),\n               
 idle_connection_timeout: \"\".to_string(),\n                max_connection_lifetime: \"0\".to_string(),\n            };\n            assert_eq!(pg_metastore_config, expected_pg_metastore_config);\n            assert_eq!(\n                pg_metastore_config.acquire_connection_timeout().unwrap(),\n                Duration::from_secs(15)\n            );\n            assert!(\n                pg_metastore_config\n                    .idle_connection_timeout_opt()\n                    .unwrap()\n                    .is_none()\n            );\n            assert!(\n                pg_metastore_config\n                    .max_connection_lifetime_opt()\n                    .unwrap()\n                    .is_none(),\n            );\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/node_config/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod serialize;\n\nuse std::collections::{HashMap, HashSet};\nuse std::env;\nuse std::net::SocketAddr;\nuse std::num::{NonZeroU32, NonZeroU64, NonZeroUsize};\nuse std::path::PathBuf;\nuse std::time::Duration;\n\nuse anyhow::{bail, ensure};\nuse bytesize::ByteSize;\nuse http::HeaderMap;\nuse quickwit_common::net::HostAddr;\nuse quickwit_common::shared_consts::{\n    DEFAULT_SHARD_BURST_LIMIT, DEFAULT_SHARD_SCALE_UP_FACTOR, DEFAULT_SHARD_THROUGHPUT_LIMIT,\n};\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::indexing::CpuCapacity;\nuse quickwit_proto::tonic::codec::CompressionEncoding;\nuse quickwit_proto::types::NodeId;\nuse serde::{Deserialize, Deserializer, Serialize};\nuse tracing::{info, warn};\n\nuse crate::node_config::serialize::load_node_config_with_env;\nuse crate::serde_utils::DurationAsStr;\nuse crate::service::QuickwitService;\nuse crate::storage_config::StorageConfigs;\nuse crate::{ConfigFormat, MetastoreConfigs};\n\npub const DEFAULT_QW_CONFIG_PATH: &str = \"config/quickwit.yaml\";\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct RestConfig {\n    pub listen_addr: SocketAddr,\n    pub cors_allow_origins: Vec<String>,\n    #[serde(with = \"http_serde::header_map\")]\n    pub extra_headers: HeaderMap,\n    #[serde(default)]\n    pub tls: 
Option<TlsConfig>,\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct GrpcConfig {\n    #[serde(default = \"GrpcConfig::default_max_message_size\")]\n    pub max_message_size: ByteSize,\n    #[serde(default)]\n    pub tls: Option<TlsConfig>,\n    // If set, keeps idle connections alive by periodically performing a\n    // keep-alive ping request.\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub keep_alive: Option<KeepAliveConfig>,\n}\n\nfn default_http2_keep_alive_interval() -> DurationAsStr {\n    DurationAsStr::try_from(\"10s\".to_string()).unwrap()\n}\n\nfn default_keep_alive_timeout() -> DurationAsStr {\n    DurationAsStr::try_from(\"5s\".to_string()).unwrap()\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\npub struct KeepAliveConfig {\n    // Set the HTTP/2 KEEP_ALIVE_INTERVAL. This is the time the connection\n    // should be idle before sending a keepalive ping.\n    #[serde(default = \"default_http2_keep_alive_interval\")]\n    pub interval: DurationAsStr,\n\n    // Set the HTTP/2 KEEP_ALIVE_TIMEOUT. This is the time to wait for an ACK\n    // after sending a keepalive ping. 
If the server doesn't respond within\n    // this time, the connection might be considered dead.\n    // Tonic uses hyper's default (20 seconds) if not set.\n    #[serde(default = \"default_keep_alive_timeout\")]\n    pub timeout: DurationAsStr,\n}\n\nimpl From<KeepAliveConfig> for quickwit_common::tower::KeepAliveConfig {\n    fn from(val: KeepAliveConfig) -> Self {\n        quickwit_common::tower::KeepAliveConfig {\n            interval: *val.interval,\n            timeout: *val.timeout,\n        }\n    }\n}\n\nimpl GrpcConfig {\n    fn default_max_message_size() -> ByteSize {\n        ByteSize::mib(20)\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        ensure!(\n            self.max_message_size >= ByteSize::mb(1),\n            \"max gRPC message size (`grpc.max_message_size`) must be at least 1MB, got `{}`\",\n            self.max_message_size\n        );\n        Ok(())\n    }\n}\n\nimpl Default for GrpcConfig {\n    fn default() -> Self {\n        Self {\n            max_message_size: Self::default_max_message_size(),\n            tls: None,\n            keep_alive: None,\n        }\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct TlsConfig {\n    pub cert_path: String,\n    pub key_path: String,\n    #[serde(default)]\n    pub ca_path: String,\n    #[serde(default)]\n    pub expected_name: Option<String>,\n    #[serde(default)]\n    pub validate_client: bool,\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct IndexerConfig {\n    #[serde(default = \"IndexerConfig::default_split_store_max_num_bytes\")]\n    pub split_store_max_num_bytes: ByteSize,\n    #[serde(default = \"IndexerConfig::default_split_store_max_num_splits\")]\n    pub split_store_max_num_splits: usize,\n    #[serde(default = \"IndexerConfig::default_max_concurrent_split_uploads\")]\n    pub max_concurrent_split_uploads: usize,\n    /// Limits the IO 
throughput of the `SplitDownloader` and the `MergeExecutor`.\n    /// On hardware where IO is constrained, it makes sure that merges (a batch operation)\n    /// do not starve indexing itself (a latency-sensitive operation).\n    #[serde(default)]\n    pub max_merge_write_throughput: Option<ByteSize>,\n    /// Maximum number of merge or delete operations that can be executed concurrently.\n    /// (defaults to `num_cpus * 2 / 3`, with a minimum of 1).\n    #[serde(default = \"IndexerConfig::default_merge_concurrency\")]\n    pub merge_concurrency: NonZeroUsize,\n    /// Enables the OpenTelemetry exporter endpoint to ingest logs and traces via the OpenTelemetry\n    /// Protocol (OTLP).\n    #[serde(default = \"IndexerConfig::default_enable_otlp_endpoint\")]\n    pub enable_otlp_endpoint: bool,\n    #[serde(default = \"IndexerConfig::default_enable_cooperative_indexing\")]\n    pub enable_cooperative_indexing: bool,\n    #[serde(default = \"IndexerConfig::default_cpu_capacity\")]\n    pub cpu_capacity: CpuCapacity,\n}\n\nimpl IndexerConfig {\n    fn default_enable_cooperative_indexing() -> bool {\n        false\n    }\n\n    fn default_enable_otlp_endpoint() -> bool {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        {\n            false\n        }\n        #[cfg(not(any(test, feature = \"testsuite\")))]\n        {\n            quickwit_common::get_bool_from_env(\"QW_ENABLE_OTLP_ENDPOINT\", true)\n        }\n    }\n\n    fn default_max_concurrent_split_uploads() -> usize {\n        12\n    }\n\n    pub fn default_split_store_max_num_bytes() -> ByteSize {\n        ByteSize::gib(100)\n    }\n\n    pub fn default_split_store_max_num_splits() -> usize {\n        1_000\n    }\n\n    pub fn default_merge_concurrency() -> NonZeroUsize {\n        NonZeroUsize::new(quickwit_common::num_cpus() * 2 / 3).unwrap_or(NonZeroUsize::MIN)\n    }\n\n    fn default_cpu_capacity() -> CpuCapacity {\n        CpuCapacity::one_cpu_thread() * (quickwit_common::num_cpus() as u32)\n    }\n\n    
#[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> anyhow::Result<Self> {\n        use quickwit_proto::indexing::PIPELINE_FULL_CAPACITY;\n        let indexer_config = IndexerConfig {\n            enable_cooperative_indexing: false,\n            enable_otlp_endpoint: true,\n            split_store_max_num_bytes: ByteSize::mb(1),\n            split_store_max_num_splits: 3,\n            max_concurrent_split_uploads: 4,\n            cpu_capacity: PIPELINE_FULL_CAPACITY * 4u32,\n            max_merge_write_throughput: None,\n            merge_concurrency: NonZeroUsize::new(3).unwrap(),\n        };\n        Ok(indexer_config)\n    }\n}\n\nimpl Default for IndexerConfig {\n    fn default() -> Self {\n        Self {\n            enable_cooperative_indexing: Self::default_enable_cooperative_indexing(),\n            enable_otlp_endpoint: Self::default_enable_otlp_endpoint(),\n            split_store_max_num_bytes: Self::default_split_store_max_num_bytes(),\n            split_store_max_num_splits: Self::default_split_store_max_num_splits(),\n            max_concurrent_split_uploads: Self::default_max_concurrent_split_uploads(),\n            cpu_capacity: Self::default_cpu_capacity(),\n            merge_concurrency: Self::default_merge_concurrency(),\n            max_merge_write_throughput: None,\n        }\n    }\n}\n\n#[derive(Debug, Clone, Copy, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct SplitCacheLimits {\n    pub max_num_bytes: ByteSize,\n    #[serde(default = \"SplitCacheLimits::default_max_num_splits\")]\n    pub max_num_splits: NonZeroU32,\n    #[serde(default = \"SplitCacheLimits::default_num_concurrent_downloads\")]\n    pub num_concurrent_downloads: NonZeroU32,\n    #[serde(default = \"SplitCacheLimits::default_max_file_descriptors\")]\n    pub max_file_descriptors: NonZeroU32,\n}\n\nimpl SplitCacheLimits {\n    fn default_max_num_splits() -> NonZeroU32 {\n        NonZeroU32::new(10_000).unwrap()\n    }\n\n    fn 
default_num_concurrent_downloads() -> NonZeroU32 {\n        NonZeroU32::new(1).unwrap()\n    }\n\n    fn default_max_file_descriptors() -> NonZeroU32 {\n        NonZeroU32::new(100).unwrap()\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields, default)]\npub struct SearcherConfig {\n    pub aggregation_memory_limit: ByteSize,\n    pub aggregation_bucket_limit: u32,\n\n    #[serde(alias = \"fast_field_cache_capacity\")]\n    #[serde(\n        deserialize_with = \"CacheConfig::deserialize_with_default::<_, {ByteSize::gb(1).as_u64()}>\"\n    )]\n    pub fast_field_cache: CacheConfig,\n    #[serde(alias = \"split_footer_cache_capacity\")]\n    #[serde(deserialize_with = \"CacheConfig::deserialize_with_default::<_, \\\n                                {ByteSize::mb(500).as_u64()}>\")]\n    pub split_footer_cache: CacheConfig,\n    #[serde(alias = \"partial_request_cache_capacity\")]\n    #[serde(deserialize_with = \"CacheConfig::deserialize_with_default::<_, \\\n                                {ByteSize::mb(64).as_u64()}>\")]\n    pub partial_request_cache: CacheConfig,\n    #[serde(alias = \"predicate_cache_capacity\")]\n    #[serde(deserialize_with = \"CacheConfig::deserialize_with_default::<_, \\\n                                {ByteSize::mb(256).as_u64()}>\")]\n    pub predicate_cache: CacheConfig,\n\n    pub max_num_concurrent_split_searches: usize,\n    pub max_splits_per_search: Option<usize>,\n    // Deprecated: stream search requests are no longer supported.\n    #[serde(alias = \"max_num_concurrent_split_streams\", default, skip_serializing)]\n    pub _max_num_concurrent_split_streams: Option<serde::de::IgnoredAny>,\n    // Strangely, if None, this will also have the effect of not forwarding\n    // to searcher.\n    // TODO document and fix if necessary.\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub split_cache: Option<SplitCacheLimits>,\n    #[serde(default = 
\"SearcherConfig::default_request_timeout_secs\")]\n    request_timeout_secs: NonZeroU64,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub storage_timeout_policy: Option<StorageTimeoutPolicy>,\n    pub warmup_memory_budget: ByteSize,\n    pub warmup_single_split_initial_allocation: ByteSize,\n    /// Lambda configuration for serverless leaf search execution.\n    /// If set, enables Lambda execution for leaf search.\n    ///\n    /// If set, and Quickwit cannot access the Lambda (after a deploy attempt if\n    /// auto deploy is set up), Quickwit will log an error and\n    /// fail on startup.\n    #[serde(default)]\n    pub lambda: Option<LambdaConfig>,\n}\n\n/// Configuration for AWS Lambda leaf search execution.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct LambdaConfig {\n    /// AWS Lambda function name.\n    #[serde(default = \"LambdaConfig::default_function_name\")]\n    pub function_name: String,\n    /// Maximum number of splits per Lambda invocation.\n    #[serde(default = \"LambdaConfig::default_max_splits_per_invocation\")]\n    pub max_splits_per_invocation: NonZeroUsize,\n    /// Maximum number of splits to process locally before offloading to Lambda.\n    /// When the number of pending split searches exceeds this threshold,\n    /// new splits are offloaded to Lambda instead of being queued locally.\n    /// A value of 0 offloads everything to Lambda.\n    #[serde(default = \"LambdaConfig::default_offload_threshold\")]\n    pub offload_threshold: usize,\n    /// Auto-deploy configuration. 
If set, Quickwit will automatically deploy\n    /// the Lambda function at startup.\n    /// If deploying a lambda fails, Quickwit will log an error and fail.\n    #[serde(default)]\n    pub auto_deploy: Option<LambdaDeployConfig>,\n}\n\nimpl LambdaConfig {\n    #[cfg(feature = \"testsuite\")]\n    pub fn for_test() -> Self {\n        Self {\n            function_name: Self::default_function_name(),\n            max_splits_per_invocation: Self::default_max_splits_per_invocation(),\n            offload_threshold: Self::default_offload_threshold(),\n            auto_deploy: None,\n        }\n    }\n\n    fn default_function_name() -> String {\n        \"quickwit-lambda-search\".to_string()\n    }\n    fn default_max_splits_per_invocation() -> NonZeroUsize {\n        NonZeroUsize::new(10).unwrap()\n    }\n    fn default_offload_threshold() -> usize {\n        100\n    }\n}\n\n/// Configuration for automatic Lambda function deployment.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct LambdaDeployConfig {\n    /// IAM execution role ARN for the Lambda function.\n    /// The role only requires GetObject permission to the targeted S3 bucket.\n    pub execution_role_arn: String,\n    /// Memory size for the Lambda function.\n    /// It will be rounded up to the nearest multiple of 1MiB.\n    #[serde(default = \"LambdaDeployConfig::default_memory_size\")]\n    pub memory_size: ByteSize,\n    /// Timeout for Lambda invocations in seconds.\n    #[serde(default = \"LambdaDeployConfig::default_invocation_timeout_secs\")]\n    pub invocation_timeout_secs: u64,\n}\n\nimpl LambdaDeployConfig {\n    fn default_memory_size() -> ByteSize {\n        // Empirically this implies between 4 and 6 vCPUs.\n        ByteSize::gib(5)\n    }\n    fn default_invocation_timeout_secs() -> u64 {\n        15\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct CacheConfig {\n 
   #[serde(default)]\n    capacity: Option<ByteSize>,\n    #[serde(default)]\n    policy: Option<CachePolicy>,\n\n    // Cache configs inside the virtual cache aren't allowed to contain virtual cache\n    #[serde(default)]\n    pub virtual_caches: Vec<CacheConfig>,\n}\n\nimpl CacheConfig {\n    pub fn no_cache() -> Self {\n        CacheConfig {\n            capacity: None,\n            policy: None,\n            virtual_caches: Vec::new(),\n        }\n    }\n\n    pub fn default_with_capacity(capacity: ByteSize) -> Self {\n        CacheConfig {\n            capacity: Some(capacity),\n            policy: None,\n            virtual_caches: Vec::new(),\n        }\n    }\n\n    pub fn capacity(&self) -> ByteSize {\n        // this should always be there\n        self.capacity.unwrap_or_default()\n    }\n\n    pub fn capacity_for_virtual_cache(&mut self, real_capacity: ByteSize) -> ByteSize {\n        let capacity = self.capacity.unwrap_or(real_capacity);\n        self.capacity = Some(capacity);\n        capacity\n    }\n\n    pub fn policy(&self) -> CachePolicy {\n        self.policy.unwrap_or_default()\n    }\n\n    pub fn policy_for_virtual_cache(&mut self, real_policy: CachePolicy) -> CachePolicy {\n        let policy = self.policy.unwrap_or(real_policy);\n        self.policy = Some(policy);\n        policy\n    }\n\n    fn deserialize_with_default<'de, D, const DEFAULT_CAPACITY: u64>(\n        deserializer: D,\n    ) -> Result<CacheConfig, D::Error>\n    where D: Deserializer<'de> {\n        use serde_with::{DeserializeAs, FromInto, PickFirst, Same};\n\n        let mut cache_config: CacheConfig =\n            PickFirst::<(Same, FromInto<ByteSize>)>::deserialize_as(deserializer)?;\n        if cache_config.capacity.is_none() {\n            cache_config.capacity = Some(ByteSize::b(DEFAULT_CAPACITY));\n        }\n        Ok(cache_config)\n    }\n}\n\nimpl From<ByteSize> for CacheConfig {\n    fn from(capacity: ByteSize) -> Self {\n        
CacheConfig::default_with_capacity(capacity)\n    }\n}\n\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Default)]\n#[serde(rename_all = \"kebab-case\")]\npub enum CachePolicy {\n    #[default]\n    Lru,\n    S3Fifo,\n    TinyLfu,\n}\n\nimpl std::fmt::Display for CachePolicy {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        match self {\n            CachePolicy::Lru => f.write_str(\"lru\"),\n            CachePolicy::S3Fifo => f.write_str(\"s3-fifo\"),\n            CachePolicy::TinyLfu => f.write_str(\"tiny-lfu\"),\n        }\n    }\n}\n\n/// Configuration controlling how fast a searcher should timeout a `get_slice`\n/// request to retry it.\n///\n/// [Amazon's best practise](https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/timeouts-and-retries-for-latency-sensitive-applications.html)\n/// suggests that to ensure low latency, it is best to:\n/// - retry small GET request after 2s\n/// - retry large GET request when the throughput is below some percentile.\n///\n/// This policy is inspired by this guidance. 
It does not track instantaneous throughput, but\n/// computes an overall timeout using the following formula:\n/// `timeout_offset + num_bytes_get_request / min_throughtput`\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\npub struct StorageTimeoutPolicy {\n    pub min_throughtput_bytes_per_secs: u64,\n    pub timeout_millis: u64,\n    // Disclaimer: this is the number of retries, so the overall max number of\n    // attempts is `max_num_retries + 1`.\n    pub max_num_retries: usize,\n}\n\nimpl StorageTimeoutPolicy {\n    pub fn compute_timeout(&self, num_bytes: usize) -> impl Iterator<Item = Duration> {\n        let min_download_time_secs: f64 = if self.min_throughtput_bytes_per_secs == 0 {\n            0.0f64\n        } else {\n            num_bytes as f64 / self.min_throughtput_bytes_per_secs as f64\n        };\n        let timeout = Duration::from_millis(self.timeout_millis)\n            + Duration::from_secs_f64(min_download_time_secs);\n        std::iter::repeat_n(timeout, self.max_num_retries + 1)\n    }\n}\n\nimpl Default for SearcherConfig {\n    fn default() -> Self {\n        SearcherConfig {\n            fast_field_cache: CacheConfig::default_with_capacity(ByteSize::gb(1)),\n            split_footer_cache: CacheConfig::default_with_capacity(ByteSize::mb(500)),\n            partial_request_cache: CacheConfig::default_with_capacity(ByteSize::mb(64)),\n            predicate_cache: CacheConfig::default_with_capacity(ByteSize::mb(256)),\n            max_num_concurrent_split_searches: 100,\n            max_splits_per_search: None,\n            _max_num_concurrent_split_streams: None,\n            aggregation_memory_limit: ByteSize::mb(500),\n            aggregation_bucket_limit: 65000,\n            split_cache: None,\n            request_timeout_secs: Self::default_request_timeout_secs(),\n            storage_timeout_policy: None,\n            warmup_memory_budget: ByteSize::gb(100),\n            warmup_single_split_initial_allocation: 
ByteSize::gb(1),\n            lambda: None,\n        }\n    }\n}\n\nimpl SearcherConfig {\n    /// The timeout after which a search should be cancelled\n    pub fn request_timeout(&self) -> Duration {\n        Duration::from_secs(self.request_timeout_secs.get())\n    }\n    fn default_request_timeout_secs() -> NonZeroU64 {\n        NonZeroU64::new(30).unwrap()\n    }\n    fn validate(&self) -> anyhow::Result<()> {\n        if let Some(split_cache_limits) = self.split_cache {\n            if self.max_num_concurrent_split_searches\n                > split_cache_limits.max_file_descriptors.get() as usize\n            {\n                anyhow::bail!(\n                    \"max_num_concurrent_split_searches ({}) must be lower or equal to \\\n                     split_cache.max_file_descriptors ({})\",\n                    self.max_num_concurrent_split_searches,\n                    split_cache_limits.max_file_descriptors\n                );\n            }\n            if self.warmup_single_split_initial_allocation > self.warmup_memory_budget {\n                anyhow::bail!(\n                    \"warmup_single_split_initial_allocation ({}) must be lower or equal to \\\n                     warmup_memory_budget ({})\",\n                    self.warmup_single_split_initial_allocation,\n                    self.warmup_memory_budget\n                );\n            }\n        }\n        Ok(())\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"lowercase\")]\npub enum CompressionAlgorithm {\n    Gzip,\n    Zstd,\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields, default)]\npub struct IngestApiConfig {\n    /// Maximum memory space taken by the ingest WAL\n    pub max_queue_memory_usage: ByteSize,\n    /// Maximum disk space taken by the ingest WAL\n    pub max_queue_disk_usage: ByteSize,\n    replication_factor: usize,\n    pub content_length_limit: ByteSize,\n    /// (hidden) 
Targeted throughput for each shard\n    pub shard_throughput_limit: ByteSize,\n    /// (hidden) Maximum accumulated throughput capacity for underutilized\n    /// shards, allowing the throughput limit to be temporarily exceeded\n    pub shard_burst_limit: ByteSize,\n    /// (hidden) new_shard_count = ceil(old_shard_count * shard_scale_up_factor)\n    ///\n    /// Setting this too high will be cancelled out by the arbiter that prevents\n    /// creating too many shards at once.\n    pub shard_scale_up_factor: f32,\n    #[serde(default)]\n    pub grpc_compression_algorithm: Option<CompressionAlgorithm>,\n}\n\nimpl Default for IngestApiConfig {\n    fn default() -> Self {\n        Self {\n            max_queue_memory_usage: ByteSize::gib(2),\n            max_queue_disk_usage: ByteSize::gib(4),\n            replication_factor: 1,\n            content_length_limit: ByteSize::mib(10),\n            shard_throughput_limit: DEFAULT_SHARD_THROUGHPUT_LIMIT,\n            shard_burst_limit: DEFAULT_SHARD_BURST_LIMIT,\n            shard_scale_up_factor: DEFAULT_SHARD_SCALE_UP_FACTOR,\n            grpc_compression_algorithm: None,\n        }\n    }\n}\n\nimpl IngestApiConfig {\n    /// Returns the replication factor, as defined in environment variable or in the configuration\n    /// in that order (the environment variable can override the configuration).\n    pub fn replication_factor(&self) -> anyhow::Result<NonZeroUsize> {\n        if let Ok(replication_factor_str) = env::var(\"QW_INGEST_REPLICATION_FACTOR\") {\n            let replication_factor = match replication_factor_str.trim() {\n                \"1\" => 1,\n                \"2\" => 2,\n                _ => bail!(\n                    \"replication factor must be either 1 or 2, got `{replication_factor_str}`\"\n                ),\n            };\n            return Ok(NonZeroUsize::new(replication_factor)\n                .expect(\"replication factor should be either 1 or 2\"));\n        }\n        ensure!(\n           
 self.replication_factor >= 1 && self.replication_factor <= 2,\n            \"replication factor must be either 1 or 2, got `{}`\",\n            self.replication_factor\n        );\n        Ok(NonZeroUsize::new(self.replication_factor)\n            .expect(\"replication factor should be either 1 or 2\"))\n    }\n\n    pub fn grpc_compression_encoding(&self) -> Option<CompressionEncoding> {\n        self.grpc_compression_algorithm\n            .as_ref()\n            .map(|algorithm| match algorithm {\n                CompressionAlgorithm::Gzip => CompressionEncoding::Gzip,\n                CompressionAlgorithm::Zstd => CompressionEncoding::Zstd,\n            })\n    }\n\n    fn validate(&self) -> anyhow::Result<()> {\n        self.replication_factor()?;\n        ensure!(\n            self.max_queue_disk_usage > ByteSize::mib(256),\n            \"max_queue_disk_usage must be at least 256 MiB, got `{}`\",\n            self.max_queue_disk_usage.display().si()\n        );\n        ensure!(\n            self.max_queue_disk_usage >= self.max_queue_memory_usage,\n            \"max_queue_disk_usage ({}) must be at least max_queue_memory_usage ({})\",\n            self.max_queue_disk_usage.display().si(),\n            self.max_queue_memory_usage.display().si()\n        );\n        info!(\n            \"ingestion shard throughput limit: {}\",\n            self.shard_throughput_limit.display().si()\n        );\n        ensure!(\n            self.shard_throughput_limit >= ByteSize::mib(1)\n                && self.shard_throughput_limit <= ByteSize::mib(20),\n            \"shard_throughput_limit ({}) must be within 1mb and 20mb\",\n            self.shard_throughput_limit.display().si()\n        );\n        // The newline delimited format is persisted as something a bit larger\n        // (lines prefixed with their length)\n        let estimated_persist_size = ByteSize::b(3 * self.content_length_limit.as_u64() / 2);\n        ensure!(\n            self.shard_burst_limit >= 
estimated_persist_size,\n            \"shard_burst_limit ({}) must be at least 1.5*content_length_limit ({})\",\n            self.shard_burst_limit,\n            estimated_persist_size,\n        );\n        ensure!(\n            self.shard_scale_up_factor > 1.0,\n            \"shard_scale_up_factor ({}) must be greater than 1\",\n            self.shard_scale_up_factor,\n        );\n        Ok(())\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct JaegerConfig {\n    /// Enables the gRPC endpoint that allows the Jaeger Query Service to connect and retrieve\n    /// traces.\n    #[serde(default = \"JaegerConfig::default_enable_endpoint\")]\n    pub enable_endpoint: bool,\n    /// How far back in time we look for spans when queries are not time-bound (`get_services`,\n    /// `get_operations`, `get_trace` operations).\n    #[serde(default = \"JaegerConfig::default_lookback_period_hours\")]\n    lookback_period_hours: NonZeroU64,\n    /// The assumed maximum duration of a trace in seconds.\n    ///\n    /// Finding a trace happens in two phases: the first phase identifies at least one span that\n    /// matches the query, while the second phase retrieves the spans that belong to the trace.\n    /// The `max_trace_duration_secs` parameter is used during the second phase to restrict the\n    /// search time interval to [span.end_timestamp - max_trace_duration, span.start_timestamp\n    /// + max_trace_duration].\n    #[serde(default = \"JaegerConfig::default_max_trace_duration_secs\")]\n    max_trace_duration_secs: NonZeroU64,\n    /// The maximum number of spans that can be retrieved in a single request.\n    #[serde(default = \"JaegerConfig::default_max_fetch_spans\")]\n    pub max_fetch_spans: NonZeroU64,\n}\n\nimpl JaegerConfig {\n    pub fn lookback_period(&self) -> Duration {\n        Duration::from_secs(self.lookback_period_hours.get() * 3600)\n    }\n\n    pub fn max_trace_duration(&self) -> Duration 
{\n        Duration::from_secs(self.max_trace_duration_secs.get())\n    }\n\n    fn default_enable_endpoint() -> bool {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        {\n            false\n        }\n        #[cfg(not(any(test, feature = \"testsuite\")))]\n        {\n            quickwit_common::get_bool_from_env(\"QW_ENABLE_JAEGER_ENDPOINT\", true)\n        }\n    }\n\n    fn default_lookback_period_hours() -> NonZeroU64 {\n        NonZeroU64::new(72).unwrap() // 3 days\n    }\n\n    fn default_max_trace_duration_secs() -> NonZeroU64 {\n        NonZeroU64::new(3600).unwrap() // 1 hour\n    }\n\n    fn default_max_fetch_spans() -> NonZeroU64 {\n        NonZeroU64::new(10_000).unwrap() // 10k spans\n    }\n}\n\nimpl Default for JaegerConfig {\n    fn default() -> Self {\n        Self {\n            enable_endpoint: Self::default_enable_endpoint(),\n            lookback_period_hours: Self::default_lookback_period_hours(),\n            max_trace_duration_secs: Self::default_max_trace_duration_secs(),\n            max_fetch_spans: Self::default_max_fetch_spans(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize)]\npub struct NodeConfig {\n    pub cluster_id: String,\n    pub node_id: NodeId,\n    pub availability_zone: Option<String>,\n    pub enabled_services: HashSet<QuickwitService>,\n    pub gossip_listen_addr: SocketAddr,\n    pub grpc_listen_addr: SocketAddr,\n    pub gossip_advertise_addr: SocketAddr,\n    pub grpc_advertise_addr: SocketAddr,\n    pub gossip_interval: Duration,\n    pub peer_seeds: Vec<String>,\n    pub data_dir_path: PathBuf,\n    pub metastore_uri: Uri,\n    pub default_index_root_uri: Uri,\n    pub rest_config: RestConfig,\n    pub grpc_config: GrpcConfig,\n    pub storage_configs: StorageConfigs,\n    pub metastore_configs: MetastoreConfigs,\n    pub indexer_config: IndexerConfig,\n    pub searcher_config: SearcherConfig,\n    pub ingest_api_config: IngestApiConfig,\n    pub jaeger_config: JaegerConfig,\n}\n\nimpl 
NodeConfig {\n    pub fn is_service_enabled(&self, service: QuickwitService) -> bool {\n        self.enabled_services.contains(&service)\n    }\n\n    /// Parses and validates a [`NodeConfig`] from a given URI and config content.\n    pub async fn load(config_format: ConfigFormat, config_content: &[u8]) -> anyhow::Result<Self> {\n        let env_vars = env::vars().collect::<HashMap<_, _>>();\n        let config = load_node_config_with_env(config_format, config_content, &env_vars).await?;\n        if !config.data_dir_path.try_exists()? {\n            bail!(\n                \"data dir `{}` does not exist\",\n                config.data_dir_path.display()\n            );\n        }\n        Ok(config)\n    }\n\n    /// Returns the list of peer seed addresses. The addresses MUST NOT be resolved. Otherwise, the\n    /// DNS-based discovery mechanism implemented in Chitchat will not work correctly.\n    pub async fn peer_seed_addrs(&self) -> anyhow::Result<Vec<String>> {\n        let mut peer_seed_addrs = Vec::new();\n        let default_gossip_port = self.gossip_listen_addr.port();\n\n        // We want to pass non-resolved addresses to Chitchat but still want to resolve them for\n        // validation purposes. Additionally, we need to append a default port if necessary and\n        // finally return the addresses as strings, which is tricky for IPv6. 
We let the logic baked\n        // in `HostAddr` handle this complexity.\n        let mut found_something = false;\n        for peer_seed in &self.peer_seeds {\n            let peer_seed_addr = HostAddr::parse_with_default_port(peer_seed, default_gossip_port)?;\n            if let Err(error) = peer_seed_addr.resolve().await {\n                warn!(peer_seed = %peer_seed_addr, error = ?error, \"failed to resolve peer seed address\");\n            } else {\n                found_something = true;\n            }\n            peer_seed_addrs.push(peer_seed_addr.to_string())\n        }\n        if !self.peer_seeds.is_empty() && !found_something {\n            warn!(\"failed to resolve all the peer seed addresses\")\n        }\n        Ok(peer_seed_addrs)\n    }\n\n    pub fn redact(&mut self) {\n        self.metastore_configs.redact();\n        self.metastore_uri.redact();\n        self.storage_configs.redact();\n    }\n\n    /// Creates a config with defaults suitable for testing.\n    ///\n    /// Uses the default ports without ensuring that they are available.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        serialize::node_config_for_tests_from_ports(7280, 7281)\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test_from_ports(rest_listen_port: u16, grpc_listen_port: u16) -> Self {\n        serialize::node_config_for_tests_from_ports(rest_listen_port, grpc_listen_port)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::indexing::CpuCapacity;\n\n    use super::*;\n    use crate::IndexerConfig;\n\n    #[test]\n    fn test_indexer_config_serialization() {\n        {\n            let indexer_config: IndexerConfig = serde_json::from_str(r#\"{}\"#).unwrap();\n            assert_eq!(&indexer_config, &IndexerConfig::default());\n            assert!(indexer_config.cpu_capacity.cpu_millis() > 0);\n            assert_eq!(indexer_config.cpu_capacity.cpu_millis() % 1_000, 0);\n        }\n        {\n    
        let indexer_config: IndexerConfig =\n                serde_yaml::from_str(r#\"cpu_capacity: 1.5\"#).unwrap();\n            assert_eq!(\n                indexer_config.cpu_capacity,\n                CpuCapacity::from_cpu_millis(1500)\n            );\n            let indexer_config_json = serde_json::to_value(&indexer_config).unwrap();\n            assert_eq!(\n                indexer_config_json\n                    .get(\"cpu_capacity\")\n                    .unwrap()\n                    .as_str()\n                    .unwrap(),\n                \"1500m\"\n            );\n        }\n        {\n            let indexer_config: IndexerConfig =\n                serde_yaml::from_str(r#\"merge_concurrency: 5\"#).unwrap();\n            assert_eq!(\n                indexer_config.merge_concurrency,\n                NonZeroUsize::new(5).unwrap()\n            );\n            let indexer_config_json = serde_json::to_value(&indexer_config).unwrap();\n            assert_eq!(\n                indexer_config_json\n                    .get(\"merge_concurrency\")\n                    .unwrap()\n                    .as_u64()\n                    .unwrap(),\n                5\n            );\n        }\n        {\n            let indexer_config: IndexerConfig =\n                serde_yaml::from_str(r#\"cpu_capacity: 1500m\"#).unwrap();\n            assert_eq!(\n                indexer_config.cpu_capacity,\n                CpuCapacity::from_cpu_millis(1500)\n            );\n            let indexer_config_json = serde_json::to_value(&indexer_config).unwrap();\n            assert_eq!(\n                indexer_config_json\n                    .get(\"cpu_capacity\")\n                    .unwrap()\n                    .as_str()\n                    .unwrap(),\n                \"1500m\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_validate_ingest_api_default() {\n        let ingest_api_config: IngestApiConfig = serde_yaml::from_str(\"\").unwrap();\n        
assert!(ingest_api_config.validate().is_ok());\n        assert_eq!(ingest_api_config, IngestApiConfig::default());\n    }\n\n    #[test]\n    fn test_validate_ingest_api_config() {\n        {\n            let ingest_api_config: IngestApiConfig = serde_yaml::from_str(\n                r#\"\n                    max_queue_disk_usage: 100M\n                    grpc_compression_algorithm: zstd\n                \"#,\n            )\n            .unwrap();\n            assert_eq!(\n                ingest_api_config.validate().unwrap_err().to_string(),\n                \"max_queue_disk_usage must be at least 256 MiB, got `100.0 MB`\"\n            );\n            assert_eq!(\n                ingest_api_config.grpc_compression_encoding().unwrap(),\n                CompressionEncoding::Zstd\n            );\n        }\n        {\n            let ingest_api_config: IngestApiConfig = serde_yaml::from_str(\n                r#\"\n                    max_queue_memory_usage: 600M\n                    max_queue_disk_usage: 500M\n                \"#,\n            )\n            .unwrap();\n            assert_eq!(\n                ingest_api_config.validate().unwrap_err().to_string(),\n                \"max_queue_disk_usage (500.0 MB) must be at least max_queue_memory_usage (600.0 \\\n                 MB)\"\n            );\n        }\n        {\n            let ingest_api_config: IngestApiConfig = serde_yaml::from_str(\n                r#\"\n                    shard_throughput_limit: 21M\n                \"#,\n            )\n            .unwrap();\n            assert_eq!(\n                ingest_api_config.validate().unwrap_err().to_string(),\n                \"shard_throughput_limit (21.0 MB) must be within 1mb and 20mb\"\n            );\n        }\n    }\n\n    #[track_caller]\n    fn test_keepalive_config_serialization_aux(\n        keep_alive_json: serde_json::Value,\n        expected: quickwit_common::tower::KeepAliveConfig,\n    ) {\n        let keep_alive_config: KeepAliveConfig 
=\n            serde_json::from_value(keep_alive_json.clone()).unwrap();\n        let keep_alive_deser: quickwit_common::tower::KeepAliveConfig =\n            keep_alive_config.clone().into();\n        assert_eq!(&keep_alive_deser, &expected);\n        let keep_alive_config_deser_ser = serde_json::to_value(keep_alive_config).unwrap();\n        let keep_alive_config_deser_ser_deser: KeepAliveConfig =\n            serde_json::from_value(keep_alive_config_deser_ser).unwrap();\n        let keep_alive_config_deser_ser_deser: quickwit_common::tower::KeepAliveConfig =\n            keep_alive_config_deser_ser_deser.into();\n        assert_eq!(&keep_alive_config_deser_ser_deser, &expected);\n    }\n\n    #[test]\n    fn test_keepalive_config_serialization() {\n        test_keepalive_config_serialization_aux(\n            serde_json::json!({}),\n            quickwit_common::tower::KeepAliveConfig {\n                interval: Duration::from_secs(10),\n                timeout: Duration::from_secs(5),\n            },\n        );\n        test_keepalive_config_serialization_aux(\n            serde_json::json!({\n                \"interval\": \"3s\",\n                \"timeout\": \"1s\",\n            }),\n            quickwit_common::tower::KeepAliveConfig {\n                interval: Duration::from_secs(3),\n                timeout: Duration::from_secs(1),\n            },\n        );\n    }\n\n    #[test]\n    fn test_grpc_config_serialization_default() {\n        let grpc_config: GrpcConfig = serde_json::from_str(r#\"{}\"#).unwrap();\n        assert_eq!(\n            grpc_config.max_message_size,\n            GrpcConfig::default().max_message_size\n        );\n\n        let grpc_config: GrpcConfig = serde_yaml::from_str(\n            r#\"\n                max_message_size: 4MiB\n            \"#,\n        )\n        .unwrap();\n        assert_eq!(grpc_config.max_message_size, ByteSize::mib(4));\n    }\n\n    #[test]\n    fn test_grpc_config_validate() {\n        let grpc_config 
= GrpcConfig {\n            max_message_size: ByteSize::mb(1),\n            tls: None,\n            keep_alive: None,\n        };\n        assert!(grpc_config.validate().is_ok());\n\n        let grpc_config = GrpcConfig {\n            max_message_size: ByteSize::kb(1),\n            tls: None,\n            keep_alive: None,\n        };\n        assert!(grpc_config.validate().is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/node_config/serialize.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::net::{IpAddr, SocketAddr};\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse anyhow::{Context, bail};\nuse bytesize::ByteSize;\nuse http::HeaderMap;\nuse quickwit_common::fs::get_disk_size;\nuse quickwit_common::net::{Host, find_private_ip, get_short_hostname};\nuse quickwit_common::new_coolid;\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::types::NodeId;\nuse serde::{Deserialize, Serialize};\nuse tracing::{info, warn};\n\nuse super::{GrpcConfig, RestConfig};\nuse crate::config_value::ConfigValue;\nuse crate::qw_env_vars::*;\nuse crate::service::QuickwitService;\nuse crate::storage_config::StorageConfigs;\nuse crate::templating::render_config;\nuse crate::{\n    ConfigFormat, IndexerConfig, IngestApiConfig, JaegerConfig, MetastoreConfigs, NodeConfig,\n    SearcherConfig, TlsConfig, validate_identifier, validate_node_id,\n};\n\npub const DEFAULT_CLUSTER_ID: &str = \"quickwit-default-cluster\";\n\npub const DEFAULT_DATA_DIR_PATH: &str = \"qwdata\";\n\npub const DEFAULT_GOSSIP_INTERVAL: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(25)\n} else {\n    Duration::from_secs(1)\n};\n\n// Default config values in the order they appear in [`NodeConfigBuilder`].\nfn default_cluster_id() -> ConfigValue<String, QW_CLUSTER_ID> {\n    
ConfigValue::with_default(DEFAULT_CLUSTER_ID.to_string())\n}\n\nfn default_node_id() -> ConfigValue<String, QW_NODE_ID> {\n    let node_id = match get_short_hostname() {\n        Ok(short_hostname) => short_hostname,\n        Err(error) => {\n            let node_id = new_coolid(\"node\");\n            warn!(error=?error, \"failed to determine hostname or hostname was invalid, falling back to random node ID `{}`\", node_id);\n            node_id\n        }\n    };\n    ConfigValue::with_default(node_id)\n}\n\nfn default_availability_zone() -> ConfigValue<String, QW_AVAILABILITY_ZONE> {\n    ConfigValue::none()\n}\n\n#[derive(Clone, Debug, Default, Serialize, Deserialize, PartialEq)]\nstruct List(Vec<String>);\n\nimpl FromStr for List {\n    type Err = anyhow::Error;\n\n    fn from_str(list_str: &str) -> Result<Self, Self::Err> {\n        let list = list_str\n            .split(',')\n            .map(|elem| elem.trim().to_string())\n            .filter(|elem| !elem.is_empty())\n            .collect();\n        Ok(List(list))\n    }\n}\n\nfn default_enabled_services() -> ConfigValue<List, QW_ENABLED_SERVICES> {\n    ConfigValue::with_default(List(\n        QuickwitService::supported_services()\n            .into_iter()\n            .map(|service| service.to_string())\n            .collect(),\n    ))\n}\n\nfn default_listen_address() -> ConfigValue<String, QW_LISTEN_ADDRESS> {\n    ConfigValue::with_default(Host::default().to_string())\n}\n\nfn default_rest_listen_port() -> u16 {\n    7280\n}\n\nfn default_data_dir_uri() -> ConfigValue<Uri, QW_DATA_DIR> {\n    ConfigValue::with_default(Uri::from_str(DEFAULT_DATA_DIR_PATH).unwrap())\n}\n\n/// Returns the default advertise host.\nfn default_advertise_host(listen_ip: &IpAddr) -> anyhow::Result<Host> {\n    if listen_ip.is_unspecified() {\n        if let Some((interface_name, private_ip)) = find_private_ip() {\n            info!(advertise_address=%private_ip, interface_name=%interface_name, \"using sniffed advertise 
address `{private_ip}`\");\n            return Ok(Host::from(private_ip));\n        }\n        bail!(\"listen address `{listen_ip}` is unspecified and advertise address is not set\");\n    }\n    info!(advertise_address=%listen_ip, \"using listen address `{listen_ip}` as advertise address\");\n    Ok(Host::from(*listen_ip))\n}\n\n// The default metastore URI and the default index root URI are the same (if you exclude the\n// `polling_interval` parameter). This is a convenient setting for testing with a file-backed\n// metastore and index splits stored locally.\n// For a given index `index-id`, the metastore file is located at\n// `./qwdata/indexes/{index-id}/metastore.json` and the splits are stored in\n// the directory `./qwdata/indexes/{index-id}/splits`.\nfn default_metastore_uri(data_dir_uri: &Uri) -> Uri {\n    data_dir_uri.join(\"indexes#polling_interval=30s\").expect(\"Failed to create default metastore URI. This should never happen! Please, report on https://github.com/quickwit-oss/quickwit/issues.\")\n}\n\n// See comment above.\nfn default_index_root_uri(data_dir_uri: &Uri) -> Uri {\n    data_dir_uri.join(\"indexes\").expect(\"Failed to create default index root URI. This should never happen! 
Please, report on https://github.com/quickwit-oss/quickwit/issues.\")\n}\n\npub async fn load_node_config_with_env(\n    config_format: ConfigFormat,\n    config_content: &[u8],\n    env_vars: &HashMap<String, String>,\n) -> anyhow::Result<NodeConfig> {\n    let rendered_config_content = render_config(config_content)?;\n    let versioned_node_config: VersionedNodeConfig =\n        config_format.parse(rendered_config_content.as_bytes())?;\n    let node_config_builder: NodeConfigBuilder = versioned_node_config.into();\n    let config = node_config_builder.build_and_validate(env_vars).await?;\n    Ok(config)\n}\n\n#[derive(Debug, Deserialize)]\n#[serde(tag = \"version\")]\nenum VersionedNodeConfig {\n    #[serde(rename = \"0.8\")]\n    // Retro compatibility.\n    #[serde(alias = \"0.7\")]\n    V0_8(NodeConfigBuilder),\n}\n\nimpl From<VersionedNodeConfig> for NodeConfigBuilder {\n    fn from(versioned_node_config: VersionedNodeConfig) -> Self {\n        match versioned_node_config {\n            VersionedNodeConfig::V0_8(node_config_builder) => node_config_builder,\n        }\n    }\n}\n\n#[serde_with::serde_as]\n#[derive(Debug, Deserialize, PartialEq)]\n#[serde(deny_unknown_fields)]\nstruct NodeConfigBuilder {\n    #[serde(default = \"default_cluster_id\")]\n    cluster_id: ConfigValue<String, QW_CLUSTER_ID>,\n    #[serde(default = \"default_node_id\")]\n    node_id: ConfigValue<String, QW_NODE_ID>,\n    #[serde(default = \"default_availability_zone\")]\n    availability_zone: ConfigValue<String, QW_AVAILABILITY_ZONE>,\n    #[serde(default = \"default_enabled_services\")]\n    enabled_services: ConfigValue<List, QW_ENABLED_SERVICES>,\n    #[serde(default = \"default_listen_address\")]\n    listen_address: ConfigValue<String, QW_LISTEN_ADDRESS>,\n    advertise_address: ConfigValue<String, QW_ADVERTISE_ADDRESS>,\n    // Deprecated, use `rest.listen_port` instead.\n    rest_listen_port: Option<u16>,\n    gossip_listen_port: ConfigValue<u16, QW_GOSSIP_LISTEN_PORT>,\n    
grpc_listen_port: ConfigValue<u16, QW_GRPC_LISTEN_PORT>,\n    gossip_interval_ms: ConfigValue<u32, QW_GOSSIP_INTERVAL_MS>,\n    #[serde(default)]\n    peer_seeds: ConfigValue<List, QW_PEER_SEEDS>,\n    #[serde(rename = \"data_dir\")]\n    #[serde(default = \"default_data_dir_uri\")]\n    data_dir_uri: ConfigValue<Uri, QW_DATA_DIR>,\n    metastore_uri: ConfigValue<Uri, QW_METASTORE_URI>,\n    default_index_root_uri: ConfigValue<Uri, QW_DEFAULT_INDEX_ROOT_URI>,\n    #[serde(rename = \"rest\")]\n    #[serde(default)]\n    rest_config_builder: RestConfigBuilder,\n    #[serde(rename = \"grpc\")]\n    #[serde(default)]\n    grpc_config: GrpcConfig,\n    #[serde(rename = \"storage\")]\n    #[serde(default)]\n    storage_configs: StorageConfigs,\n    #[serde(rename = \"metastore\")]\n    #[serde(default)]\n    metastore_configs: MetastoreConfigs,\n    #[serde(rename = \"indexer\")]\n    #[serde(default)]\n    indexer_config: IndexerConfig,\n    #[serde(rename = \"searcher\")]\n    #[serde(default)]\n    searcher_config: SearcherConfig,\n    #[serde(rename = \"ingest_api\")]\n    #[serde(default)]\n    ingest_api_config: IngestApiConfig,\n    #[serde(rename = \"jaeger\")]\n    #[serde(default)]\n    jaeger_config: JaegerConfig,\n}\n\nimpl NodeConfigBuilder {\n    pub async fn build_and_validate(\n        mut self,\n        env_vars: &HashMap<String, String>,\n    ) -> anyhow::Result<NodeConfig> {\n        let node_id = self.node_id.resolve(env_vars).map(NodeId::new)?;\n        let availability_zone = self.availability_zone.resolve_optional(env_vars)?;\n\n        let enabled_services = self\n            .enabled_services\n            .resolve(env_vars)?\n            .0\n            .into_iter()\n            .map(|service| service.parse())\n            .collect::<Result<_, _>>()?;\n\n        let listen_address = self.listen_address.resolve(env_vars)?;\n        let listen_host = listen_address.parse::<Host>()?;\n        let listen_ip = listen_host.resolve().await?;\n\n        
if let Some(rest_listen_port) = self.rest_listen_port {\n            if self.rest_config_builder.listen_port.is_some() {\n                bail!(\n                    \"conflicting configuration values: please use only `rest.listen_port`, \\\n                     `rest_listen_port` is deprecated and should not be used alongside \\\n                     `rest.listen_port`. Update your configuration to use `rest.listen_port`.\"\n                );\n            }\n            warn!(\"`rest_listen_port` is deprecated, use `rest.listen_port` instead\");\n            self.rest_config_builder.listen_port = Some(rest_listen_port);\n        }\n\n        let rest_config = self\n            .rest_config_builder\n            .build_and_validate(listen_ip, env_vars)?;\n\n        self.grpc_config.validate()?;\n\n        let gossip_listen_port = self\n            .gossip_listen_port\n            .resolve_optional(env_vars)?\n            .unwrap_or(rest_config.listen_addr.port());\n        let gossip_listen_addr = SocketAddr::new(listen_ip, gossip_listen_port);\n\n        let grpc_listen_port = self\n            .grpc_listen_port\n            .resolve_optional(env_vars)?\n            .unwrap_or(rest_config.listen_addr.port() + 1);\n        let grpc_listen_addr = SocketAddr::new(listen_ip, grpc_listen_port);\n\n        let advertise_address = self.advertise_address.resolve_optional(env_vars)?;\n        let advertise_host = advertise_address\n            .map(|addr| addr.parse::<Host>())\n            .unwrap_or_else(|| default_advertise_host(&listen_ip))?;\n\n        let advertise_ip = advertise_host.resolve().await?;\n        let gossip_advertise_addr = SocketAddr::new(advertise_ip, gossip_listen_port);\n        let grpc_advertise_addr = SocketAddr::new(advertise_ip, grpc_listen_port);\n\n        let data_dir_uri = self.data_dir_uri.resolve(env_vars)?;\n        let data_dir_path = data_dir_uri\n            .filepath()\n            .with_context(|| {\n                format!(\n       
             \"data dir must be located on the local file system. current location: \\\n                     `{data_dir_uri}`\"\n                )\n            })?\n            .to_path_buf();\n\n        let metastore_uri = self\n            .metastore_uri\n            .resolve_optional(env_vars)?\n            .unwrap_or_else(|| default_metastore_uri(&data_dir_uri));\n\n        let default_index_root_uri = self\n            .default_index_root_uri\n            .resolve_optional(env_vars)?\n            .unwrap_or_else(|| default_index_root_uri(&data_dir_uri));\n\n        self.storage_configs.validate()?;\n        self.storage_configs.apply_flavors();\n        self.ingest_api_config.validate()?;\n        self.searcher_config.validate()?;\n\n        let gossip_interval = self\n            .gossip_interval_ms\n            .resolve_optional(env_vars)?\n            .map(|gossip_interval_ms| Duration::from_millis(gossip_interval_ms as u64))\n            .unwrap_or(DEFAULT_GOSSIP_INTERVAL);\n\n        let node_config = NodeConfig {\n            cluster_id: self.cluster_id.resolve(env_vars)?,\n            node_id,\n            availability_zone,\n            enabled_services,\n            gossip_listen_addr,\n            grpc_listen_addr,\n            gossip_advertise_addr,\n            grpc_advertise_addr,\n            gossip_interval,\n            peer_seeds: self.peer_seeds.resolve(env_vars)?.0,\n            data_dir_path,\n            metastore_uri,\n            default_index_root_uri,\n            rest_config,\n            grpc_config: self.grpc_config,\n            metastore_configs: self.metastore_configs,\n            storage_configs: self.storage_configs,\n            indexer_config: self.indexer_config,\n            searcher_config: self.searcher_config,\n            ingest_api_config: self.ingest_api_config,\n            jaeger_config: self.jaeger_config,\n        };\n\n        validate(&node_config)?;\n        Ok(node_config)\n    }\n}\n\nfn 
validate(node_config: &NodeConfig) -> anyhow::Result<()> {\n    validate_identifier(\"cluster\", &node_config.cluster_id)?;\n    validate_node_id(&node_config.node_id)?;\n\n    if node_config.cluster_id == DEFAULT_CLUSTER_ID {\n        warn!(\"cluster ID is not set, falling back to default value `{DEFAULT_CLUSTER_ID}`\",);\n    }\n    if node_config.peer_seeds.is_empty() {\n        warn!(\"peer seeds are empty\");\n    }\n    validate_disk_usage(node_config);\n    Ok(())\n}\n\n/// A list of all the known disk budgets.\n///\n/// External and unbounded disk usages, e.g., the indexing workbench (`indexing/`) and the\n/// delete task workbench (`delete_task_service/`), are not included.\n#[derive(Default, Debug)]\nstruct ExpectedDiskUsage {\n    // indexer / ingester\n    split_store_max_num_bytes: Option<ByteSize>,\n    max_queue_disk_usage: Option<ByteSize>,\n    // searcher\n    split_cache: Option<ByteSize>,\n}\n\nimpl ExpectedDiskUsage {\n    fn from_config(node_config: &NodeConfig) -> Self {\n        let mut expected = Self::default();\n        if node_config.is_service_enabled(QuickwitService::Indexer) {\n            expected.max_queue_disk_usage =\n                Some(node_config.ingest_api_config.max_queue_disk_usage);\n            expected.split_store_max_num_bytes =\n                Some(node_config.indexer_config.split_store_max_num_bytes);\n        }\n        if node_config.is_service_enabled(QuickwitService::Searcher) {\n            expected.split_cache = node_config\n                .searcher_config\n                .split_cache\n                .map(|limits| limits.max_num_bytes);\n        }\n        expected\n    }\n\n    fn total(&self) -> ByteSize {\n        self.split_store_max_num_bytes.unwrap_or_default()\n            + self.max_queue_disk_usage.unwrap_or_default()\n            + self.split_cache.unwrap_or_default()\n    }\n}\n\nfn validate_disk_usage(node_config: &NodeConfig) {\n    if let Some(volume_size) = 
get_disk_size(&node_config.data_dir_path) {\n        let expected_disk_usage = ExpectedDiskUsage::from_config(node_config);\n        if expected_disk_usage.total() > volume_size {\n            warn!(\n                ?volume_size,\n                ?expected_disk_usage,\n                \"data dir volume too small\"\n            );\n        }\n    }\n}\n\n#[cfg(test)]\nimpl Default for NodeConfigBuilder {\n    fn default() -> Self {\n        Self {\n            cluster_id: default_cluster_id(),\n            node_id: default_node_id(),\n            availability_zone: ConfigValue::none(),\n            enabled_services: default_enabled_services(),\n            listen_address: default_listen_address(),\n            rest_listen_port: None,\n            gossip_listen_port: ConfigValue::none(),\n            grpc_listen_port: ConfigValue::none(),\n            gossip_interval_ms: ConfigValue::none(),\n            advertise_address: ConfigValue::none(),\n            peer_seeds: ConfigValue::with_default(List::default()),\n            data_dir_uri: default_data_dir_uri(),\n            metastore_uri: ConfigValue::none(),\n            default_index_root_uri: ConfigValue::none(),\n            rest_config_builder: RestConfigBuilder::default(),\n            grpc_config: GrpcConfig::default(),\n            storage_configs: StorageConfigs::default(),\n            metastore_configs: MetastoreConfigs::default(),\n            indexer_config: IndexerConfig::default(),\n            searcher_config: SearcherConfig::default(),\n            ingest_api_config: IngestApiConfig::default(),\n            jaeger_config: JaegerConfig::default(),\n        }\n    }\n}\n\n#[serde_with::serde_as]\n#[derive(Debug, Deserialize, PartialEq, Default)]\n#[serde(deny_unknown_fields)]\nstruct RestConfigBuilder {\n    #[serde(default)]\n    listen_port: Option<u16>,\n    #[serde(default)]\n    #[serde_as(deserialize_as = \"serde_with::OneOrMany<_>\")]\n    pub cors_allow_origins: Vec<String>,\n    #[serde(with 
= \"http_serde::header_map\")]\n    #[serde(default)]\n    pub extra_headers: HeaderMap,\n    #[serde(default)]\n    pub tls: Option<TlsConfig>,\n}\n\nimpl RestConfigBuilder {\n    fn build_and_validate(\n        self,\n        listen_ip: IpAddr,\n        env_vars: &HashMap<String, String>,\n    ) -> anyhow::Result<RestConfig> {\n        let listen_port_from_config_or_default =\n            self.listen_port.unwrap_or(default_rest_listen_port());\n        let listen_port = ConfigValue::<u16, QW_REST_LISTEN_PORT>::with_default(\n            listen_port_from_config_or_default,\n        )\n        .resolve(env_vars)?;\n        let rest_config = RestConfig {\n            listen_addr: SocketAddr::new(listen_ip, listen_port),\n            cors_allow_origins: self.cors_allow_origins,\n            extra_headers: self.extra_headers,\n            tls: self.tls,\n        };\n        Ok(rest_config)\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\npub fn node_config_for_tests_from_ports(\n    rest_listen_port: u16,\n    grpc_listen_port: u16,\n) -> NodeConfig {\n    let node_id = NodeId::new(default_node_id().unwrap());\n    let enabled_services = QuickwitService::supported_services();\n    let availability_zone = Some(String::from(\"az-1\"));\n    let listen_address = Host::default();\n    let rest_listen_addr = listen_address\n        .with_port(rest_listen_port)\n        .to_socket_addr()\n        .expect(\"default host should be an IP address\");\n    let gossip_listen_addr = listen_address\n        .with_port(rest_listen_port)\n        .to_socket_addr()\n        .expect(\"default host should be an IP address\");\n    let grpc_listen_addr = listen_address\n        .with_port(grpc_listen_port)\n        .to_socket_addr()\n        .expect(\"default host should be an IP address\");\n\n    let data_dir_uri = default_data_dir_uri().unwrap();\n    let data_dir_path = data_dir_uri\n        .filepath()\n        .expect(\"The default data dir should be valid directory 
path.\")\n        .to_path_buf();\n    let metastore_uri = default_metastore_uri(&data_dir_uri);\n    let default_index_root_uri = default_index_root_uri(&data_dir_uri);\n    let rest_config = RestConfig {\n        listen_addr: rest_listen_addr,\n        cors_allow_origins: Vec::new(),\n        extra_headers: HeaderMap::new(),\n        tls: None,\n    };\n    NodeConfig {\n        cluster_id: default_cluster_id().unwrap(),\n        node_id,\n        availability_zone,\n        enabled_services,\n        gossip_advertise_addr: gossip_listen_addr,\n        grpc_advertise_addr: grpc_listen_addr,\n        gossip_listen_addr,\n        grpc_listen_addr,\n        gossip_interval: Duration::from_millis(25u64),\n        peer_seeds: Vec::new(),\n        data_dir_path,\n        metastore_uri,\n        default_index_root_uri,\n        rest_config,\n        grpc_config: GrpcConfig::default(),\n        storage_configs: StorageConfigs::default(),\n        metastore_configs: MetastoreConfigs::default(),\n        indexer_config: IndexerConfig::default(),\n        searcher_config: SearcherConfig::default(),\n        ingest_api_config: IngestApiConfig::default(),\n        jaeger_config: JaegerConfig::default(),\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::env;\n    use std::net::Ipv4Addr;\n    use std::num::{NonZeroU64, NonZeroUsize};\n    use std::path::Path;\n\n    use bytesize::ByteSize;\n    use itertools::Itertools;\n\n    use super::*;\n    use crate::storage_config::StorageBackendFlavor;\n    use crate::{CacheConfig, LambdaConfig, LambdaDeployConfig};\n\n    fn get_config_filepath(config_filename: &str) -> String {\n        format!(\n            \"{}/resources/tests/node_config/{}\",\n            env!(\"CARGO_MANIFEST_DIR\"),\n            config_filename\n        )\n    }\n\n    async fn test_node_config_parse_aux(config_format: ConfigFormat) -> anyhow::Result<()> {\n        let config_filepath =\n            
get_config_filepath(&format!(\"quickwit.{config_format:?}\").to_lowercase());\n        let file = std::fs::read_to_string(&config_filepath).unwrap();\n        let env_vars = HashMap::default();\n        let config = load_node_config_with_env(config_format, file.as_bytes(), &env_vars).await?;\n        assert_eq!(config.cluster_id, \"quickwit-cluster\");\n        assert_eq!(config.enabled_services.len(), 2);\n\n        assert!(config.is_service_enabled(QuickwitService::Janitor));\n        assert!(config.is_service_enabled(QuickwitService::Metastore));\n\n        assert_eq!(config.availability_zone.unwrap(), \"az-1\");\n        assert_eq!(\n            config.rest_config.listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::UNSPECIFIED), 1111)\n        );\n        assert_eq!(\n            config.rest_config.extra_headers.get(\"x-header-1\").unwrap(),\n            \"header-value-1\"\n        );\n        assert_eq!(\n            config.rest_config.extra_headers.get(\"x-header-2\").unwrap(),\n            \"header-value-2\"\n        );\n        assert_eq!(config.grpc_config.max_message_size, ByteSize::mb(10));\n\n        assert_eq!(\n            config.gossip_listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::UNSPECIFIED), 2222)\n        );\n        assert_eq!(\n            config.grpc_listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::UNSPECIFIED), 3333)\n        );\n        assert_eq!(\n            config.gossip_advertise_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 12)), 2222)\n        );\n        assert_eq!(\n            config.grpc_advertise_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 12)), 3333)\n        );\n        assert_eq!(\n            config.peer_seeds,\n            vec![\n                \"quickwit-searcher-0.local\".to_string(),\n                \"quickwit-searcher-1.local\".to_string()\n            ]\n        );\n        assert_eq!(config.data_dir_path, 
Path::new(\"/opt/quickwit/data\"));\n        assert_eq!(\n            config.metastore_uri,\n            \"postgresql://username:password@host:port/db\"\n        );\n        assert_eq!(config.default_index_root_uri, \"s3://quickwit-indexes\");\n\n        let azure_storage_config = config.storage_configs.find_azure().unwrap();\n        assert_eq!(\n            azure_storage_config.account_name.as_ref().unwrap(),\n            \"quickwit-dev\"\n        );\n\n        let s3_storage_config = config.storage_configs.find_s3().unwrap();\n        assert_eq!(s3_storage_config.flavor.unwrap(), StorageBackendFlavor::Gcs);\n        assert_eq!(\n            s3_storage_config.endpoint.as_ref().unwrap(),\n            \"http://localhost:4566\"\n        );\n        assert!(s3_storage_config.force_path_style_access);\n        assert!(s3_storage_config.disable_multi_object_delete);\n        assert!(s3_storage_config.disable_multipart_upload);\n\n        let postgres_config = config.metastore_configs.find_postgres().unwrap();\n        assert_eq!(postgres_config.min_connections, 1);\n        assert_eq!(postgres_config.max_connections.get(), 12);\n        assert_eq!(\n            postgres_config.acquire_connection_timeout().unwrap(),\n            Duration::from_secs(30)\n        );\n        assert_eq!(\n            postgres_config.acquire_connection_timeout().unwrap(),\n            Duration::from_secs(30)\n        );\n        assert_eq!(\n            postgres_config.idle_connection_timeout_opt().unwrap(),\n            Some(Duration::from_secs(1800))\n        );\n        assert_eq!(\n            postgres_config.max_connection_lifetime_opt().unwrap(),\n            Some(Duration::from_secs(3600))\n        );\n\n        assert_eq!(\n            config.indexer_config,\n            IndexerConfig {\n                enable_otlp_endpoint: true,\n                split_store_max_num_bytes: ByteSize::tb(1),\n                split_store_max_num_splits: 10_000,\n                
max_concurrent_split_uploads: 8,\n                merge_concurrency: NonZeroUsize::new(2).unwrap(),\n                cpu_capacity: IndexerConfig::default_cpu_capacity(),\n                enable_cooperative_indexing: false,\n                max_merge_write_throughput: Some(ByteSize::mb(100)),\n            }\n        );\n        assert_eq!(\n            config.ingest_api_config,\n            IngestApiConfig {\n                replication_factor: 2,\n                ..Default::default()\n            }\n        );\n        assert_eq!(\n            config.searcher_config,\n            SearcherConfig {\n                aggregation_memory_limit: ByteSize::gb(1),\n                aggregation_bucket_limit: 500_000,\n                fast_field_cache: CacheConfig::default_with_capacity(ByteSize::gb(10)),\n                split_footer_cache: CacheConfig::default_with_capacity(ByteSize::gb(1)),\n                partial_request_cache: CacheConfig::default_with_capacity(ByteSize::mb(64)),\n                predicate_cache: CacheConfig::default_with_capacity(ByteSize::mb(256)),\n                max_num_concurrent_split_searches: 150,\n                max_splits_per_search: None,\n                _max_num_concurrent_split_streams: Some(serde::de::IgnoredAny),\n                split_cache: None,\n                request_timeout_secs: NonZeroU64::new(30).unwrap(),\n                storage_timeout_policy: Some(crate::StorageTimeoutPolicy {\n                    min_throughtput_bytes_per_secs: 100_000,\n                    timeout_millis: 2_000,\n                    max_num_retries: 2\n                }),\n                warmup_memory_budget: ByteSize::gb(100),\n                warmup_single_split_initial_allocation: ByteSize::gb(1),\n                lambda: Some(LambdaConfig {\n                    function_name: \"quickwit-lambda-leaf-search\".to_string(),\n                    max_splits_per_invocation: NonZeroUsize::new(10).unwrap(),\n                    offload_threshold: 30,\n       
             auto_deploy: Some(LambdaDeployConfig {\n                        execution_role_arn: \"arn:aws:iam::123456789012:role/quickwit-lambda-role\"\n                            .to_string(),\n                        memory_size: ByteSize::gib(5),\n                        invocation_timeout_secs: 15,\n                    }),\n                }),\n            }\n        );\n        assert_eq!(\n            config.jaeger_config,\n            JaegerConfig {\n                enable_endpoint: true,\n                lookback_period_hours: NonZeroU64::new(24).unwrap(),\n                max_trace_duration_secs: NonZeroU64::new(600).unwrap(),\n                max_fetch_spans: NonZeroU64::new(1_000).unwrap(),\n            }\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_node_config_parse_json() {\n        test_node_config_parse_aux(ConfigFormat::Json)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_node_config_parse_toml() {\n        test_node_config_parse_aux(ConfigFormat::Toml)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_node_config_parse_yaml() {\n        test_node_config_parse_aux(ConfigFormat::Yaml)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_config_contains_wrong_values() {\n        let config_filepath = get_config_filepath(\"quickwit.wrongkey.yaml\");\n        let config_str = std::fs::read_to_string(&config_filepath).unwrap();\n        let parsing_error = super::load_node_config_with_env(\n            ConfigFormat::Yaml,\n            config_str.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .unwrap_err();\n        assert!(\n            format!(\"{parsing_error:?}\")\n                .contains(\"unknown field `max_num_concurrent_split_searches_with_typo`\")\n        );\n    }\n\n    #[tokio::test]\n    async fn test_node_config_default_values_minimal() {\n        
let config_yaml = \"version: 0.8\";\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .unwrap();\n        assert_eq!(config.cluster_id, DEFAULT_CLUSTER_ID);\n        assert_eq!(config.node_id, get_short_hostname().unwrap());\n        assert_eq!(config.availability_zone, None);\n        assert_eq!(\n            config.enabled_services,\n            QuickwitService::supported_services()\n        );\n        assert_eq!(\n            config.rest_config.listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 7280)\n        );\n        assert_eq!(\n            config.gossip_listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 7280)\n        );\n        assert_eq!(\n            config.grpc_listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 7281)\n        );\n        assert_eq!(\n            config.data_dir_path.to_string_lossy(),\n            format!(\"{}/qwdata\", env::current_dir().unwrap().display())\n        );\n        assert_eq!(\n            config.metastore_uri,\n            format!(\n                \"file://{}/qwdata/indexes#polling_interval=30s\",\n                env::current_dir().unwrap().display()\n            )\n        );\n        assert_eq!(\n            config.default_index_root_uri,\n            format!(\n                \"file://{}/qwdata/indexes\",\n                env::current_dir().unwrap().display()\n            )\n        );\n        assert_eq!(config.ingest_api_config.replication_factor, 1);\n    }\n\n    #[tokio::test]\n    async fn test_node_config_env_var_override() {\n        let config_yaml = \"version: 0.8\";\n        let mut env_vars = HashMap::new();\n        env_vars.insert(\"QW_CLUSTER_ID\".to_string(), \"test-cluster\".to_string());\n        env_vars.insert(\"QW_NODE_ID\".to_string(), \"test-node\".to_string());\n        
env_vars.insert(\n            \"QW_ENABLED_SERVICES\".to_string(),\n            \"indexer,metastore\".to_string(),\n        );\n        env_vars.insert(\"QW_LISTEN_ADDRESS\".to_string(), \"172.0.0.12\".to_string());\n        env_vars.insert(\"QW_ADVERTISE_ADDRESS\".to_string(), \"172.0.0.13\".to_string());\n        env_vars.insert(\"QW_REST_LISTEN_PORT\".to_string(), \"1234\".to_string());\n        env_vars.insert(\"QW_GOSSIP_LISTEN_PORT\".to_string(), \"5678\".to_string());\n        env_vars.insert(\"QW_GRPC_LISTEN_PORT\".to_string(), \"9012\".to_string());\n        env_vars.insert(\n            \"QW_PEER_SEEDS\".to_string(),\n            \"test-peer-seed-0,test-peer-seed-1\".to_string(),\n        );\n        env_vars.insert(\"QW_DATA_DIR\".to_string(), \"test-data-dir\".to_string());\n        env_vars.insert(\n            \"QW_METASTORE_URI\".to_string(),\n            \"postgresql://test-user:test-password@test-host:4321/test-db\".to_string(),\n        );\n        env_vars.insert(\n            \"QW_DEFAULT_INDEX_ROOT_URI\".to_string(),\n            \"s3://quickwit-indexes/prod\".to_string(),\n        );\n        let config =\n            load_node_config_with_env(ConfigFormat::Yaml, config_yaml.as_bytes(), &env_vars)\n                .await\n                .unwrap();\n        assert_eq!(config.cluster_id, \"test-cluster\");\n        assert_eq!(config.node_id, \"test-node\");\n        assert_eq!(config.enabled_services.len(), 2);\n        assert_eq!(\n            config\n                .enabled_services\n                .iter()\n                .sorted_by_key(|service| service.as_str())\n                .collect::<Vec<_>>(),\n            &[&QuickwitService::Indexer, &QuickwitService::Metastore]\n        );\n        assert_eq!(\n            config.rest_config.listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 12)), 1234)\n        );\n        assert_eq!(\n            config.gossip_listen_addr,\n            
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 12)), 5678)\n        );\n        assert_eq!(\n            config.grpc_listen_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 12)), 9012)\n        );\n        assert_eq!(\n            config.gossip_advertise_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 13)), 5678)\n        );\n        assert_eq!(\n            config.grpc_advertise_addr,\n            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(172, 0, 0, 13)), 9012)\n        );\n        assert_eq!(\n            config.peer_seeds,\n            vec![\n                \"test-peer-seed-0\".to_string(),\n                \"test-peer-seed-1\".to_string()\n            ]\n        );\n        assert_eq!(\n            config.data_dir_path,\n            env::current_dir().unwrap().join(\"test-data-dir\")\n        );\n        assert_eq!(\n            config.metastore_uri,\n            \"postgresql://test-user:test-password@test-host:4321/test-db\"\n        );\n        assert_eq!(config.default_index_root_uri, \"s3://quickwit-indexes/prod\");\n    }\n\n    #[tokio::test]\n    async fn test_quickwit_config_default_values_storage() {\n        let config_yaml = r#\"\n            version: 0.8\n            node_id: \"node-1\"\n            metastore_uri: postgres://username:password@host:port/db\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .unwrap();\n        assert_eq!(config.cluster_id, DEFAULT_CLUSTER_ID);\n        assert_eq!(config.node_id, \"node-1\");\n        assert_eq!(\n            config.metastore_uri,\n            \"postgresql://username:password@host:port/db\"\n        );\n    }\n\n    #[tokio::test]\n    async fn test_node_config_config_default_values_default_indexer_searcher_config() {\n        let config_yaml = r#\"\n            version: 0.8\n            metastore_uri: 
postgres://username:password@host:port/db\n            data_dir: /opt/quickwit/data\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .unwrap();\n        assert_eq!(\n            config.metastore_uri,\n            \"postgresql://username:password@host:port/db\"\n        );\n        assert_eq!(config.indexer_config, IndexerConfig::default());\n        assert_eq!(config.searcher_config, SearcherConfig::default());\n        assert_eq!(config.ingest_api_config, IngestApiConfig::default());\n        assert_eq!(config.jaeger_config, JaegerConfig::default());\n    }\n\n    #[tokio::test]\n    async fn test_node_config_validate() {\n        let config_filepath = get_config_filepath(\"quickwit.toml\");\n        let file_content = std::fs::read_to_string(&config_filepath).unwrap();\n\n        let data_dir_path = env::current_dir().unwrap();\n        let mut env_vars = HashMap::new();\n        env_vars.insert(\n            \"QW_DATA_DIR\".to_string(),\n            data_dir_path.to_string_lossy().to_string(),\n        );\n        load_node_config_with_env(ConfigFormat::Toml, file_content.as_bytes(), &env_vars)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_peer_socket_addrs() {\n        {\n            let node_config = NodeConfigBuilder {\n                ..Default::default()\n            }\n            .build_and_validate(&HashMap::new())\n            .await\n            .unwrap();\n            assert!(node_config.peer_seed_addrs().await.unwrap().is_empty());\n        }\n        {\n            let node_config = NodeConfigBuilder {\n                rest_config_builder: RestConfigBuilder {\n                    listen_port: Some(1789),\n                    ..Default::default()\n                },\n                peer_seeds: ConfigValue::for_test(List(vec![\n                    
\"unresolvable.example.com\".to_string(),\n                    \"localhost\".to_string(),\n                    \"localhost:1337\".to_string(),\n                    \"127.0.0.1\".to_string(),\n                    \"127.0.0.1:1337\".to_string(),\n                ])),\n                ..Default::default()\n            }\n            .build_and_validate(&HashMap::new())\n            .await\n            .unwrap();\n            assert_eq!(\n                node_config.peer_seed_addrs().await.unwrap(),\n                vec![\n                    \"unresolvable.example.com:1789\".to_string(),\n                    \"localhost:1789\".to_string(),\n                    \"localhost:1337\".to_string(),\n                    \"127.0.0.1:1789\".to_string(),\n                    \"127.0.0.1:1337\".to_string()\n                ]\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_socket_addr_ports() {\n        {\n            let node_config = NodeConfigBuilder {\n                listen_address: default_listen_address(),\n                ..Default::default()\n            }\n            .build_and_validate(&HashMap::new())\n            .await\n            .unwrap();\n            assert_eq!(\n                node_config.rest_config.listen_addr.to_string(),\n                \"127.0.0.1:7280\"\n            );\n            assert_eq!(node_config.gossip_listen_addr.to_string(), \"127.0.0.1:7280\");\n            assert_eq!(node_config.grpc_listen_addr.to_string(), \"127.0.0.1:7281\");\n        }\n        {\n            let node_config = NodeConfigBuilder {\n                listen_address: default_listen_address(),\n                rest_config_builder: RestConfigBuilder {\n                    listen_port: Some(1789),\n                    ..Default::default()\n                },\n                ..Default::default()\n            }\n            .build_and_validate(&HashMap::new())\n            .await\n            .unwrap();\n            assert_eq!(\n                
node_config.rest_config.listen_addr.to_string(),\n                \"127.0.0.1:1789\"\n            );\n            assert_eq!(node_config.gossip_listen_addr.to_string(), \"127.0.0.1:1789\");\n            assert_eq!(node_config.grpc_listen_addr.to_string(), \"127.0.0.1:1790\");\n        }\n        {\n            let node_config = NodeConfigBuilder {\n                listen_address: default_listen_address(),\n                gossip_listen_port: ConfigValue::for_test(1889),\n                grpc_listen_port: ConfigValue::for_test(1989),\n                rest_config_builder: RestConfigBuilder {\n                    listen_port: Some(1789),\n                    ..Default::default()\n                },\n                ..Default::default()\n            }\n            .build_and_validate(&HashMap::new())\n            .await\n            .unwrap();\n            assert_eq!(\n                node_config.rest_config.listen_addr.to_string(),\n                \"127.0.0.1:1789\"\n            );\n            assert_eq!(node_config.gossip_listen_addr.to_string(), \"127.0.0.1:1889\");\n            assert_eq!(node_config.grpc_listen_addr.to_string(), \"127.0.0.1:1989\");\n        }\n    }\n\n    #[tokio::test]\n    async fn test_rest_deprecated_listen_port_config() {\n        // This test should be removed once deprecated `rest_listen_port` field is removed.\n        let node_config = NodeConfigBuilder {\n            rest_listen_port: Some(1789),\n            listen_address: default_listen_address(),\n            rest_config_builder: RestConfigBuilder {\n                listen_port: None,\n                ..Default::default()\n            },\n            ..Default::default()\n        }\n        .build_and_validate(&HashMap::new())\n        .await\n        .unwrap();\n        assert_eq!(\n            node_config.rest_config.listen_addr.to_string(),\n            \"127.0.0.1:1789\"\n        );\n        assert_eq!(node_config.gossip_listen_addr.to_string(), \"127.0.0.1:1789\");\n        
assert_eq!(node_config.grpc_listen_addr.to_string(), \"127.0.0.1:1790\");\n    }\n\n    #[tokio::test]\n    async fn test_load_config_with_validation_error() {\n        let config_filepath = get_config_filepath(\"quickwit.yaml\");\n        let file = std::fs::read_to_string(&config_filepath).unwrap();\n        let error = NodeConfig::load(ConfigFormat::Yaml, file.as_bytes())\n            .await\n            .unwrap_err();\n        assert!(error.to_string().contains(\"data dir\"));\n    }\n\n    #[tokio::test]\n    async fn test_config_validates_uris() {\n        {\n            let config_yaml = r#\"\n            version: 0.8\n            node_id: 1\n            metastore_uri: ''\n        \"#;\n            assert!(\n                load_node_config_with_env(\n                    ConfigFormat::Yaml,\n                    config_yaml.as_bytes(),\n                    &Default::default()\n                )\n                .await\n                .is_err()\n            );\n        }\n        {\n            let config_yaml = r#\"\n            version: 0.8\n            node_id: 1\n            metastore_uri: postgres://username:password@host:port/db\n            default_index_root_uri: ''\n        \"#;\n            assert!(\n                load_node_config_with_env(\n                    ConfigFormat::Yaml,\n                    config_yaml.as_bytes(),\n                    &Default::default()\n                )\n                .await\n                .is_err()\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_node_config_data_dir_accepts_both_file_uris_and_file_paths() {\n        {\n            let config_yaml = r#\"\n                version: 0.8\n                data_dir: /opt/quickwit/data\n            \"#;\n            let config = load_node_config_with_env(\n                ConfigFormat::Yaml,\n                config_yaml.as_bytes(),\n                &HashMap::default(),\n            )\n            .await\n            .unwrap();\n            
assert_eq!(&config.data_dir_path, Path::new(\"/opt/quickwit/data\"));\n        }\n        {\n            let config_yaml = r#\"\n                version: 0.8\n                data_dir: file:///opt/quickwit/data\n            \"#;\n            let config = load_node_config_with_env(\n                ConfigFormat::Yaml,\n                config_yaml.as_bytes(),\n                &HashMap::default(),\n            )\n            .await\n            .unwrap();\n            assert_eq!(&config.data_dir_path, Path::new(\"/opt/quickwit/data\"));\n        }\n        {\n            let config_yaml = r#\"\n                version: 0.8\n                data_dir: s3://indexes/foo\n            \"#;\n            let error = load_node_config_with_env(\n                ConfigFormat::Yaml,\n                config_yaml.as_bytes(),\n                &HashMap::default(),\n            )\n            .await\n            .unwrap_err();\n            assert!(error.to_string().contains(\"data dir must be located\"));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_config_invalid_when_both_listen_ports_params_are_configured() {\n        let config_yaml = r#\"\n                version: 0.8\n                rest_listen_port: 1789\n                rest:\n                  listen_port: 1789\n            \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            config_yaml.as_bytes(),\n            &HashMap::default(),\n        )\n        .await\n        .unwrap_err();\n        assert_eq!(\n            &config.to_string(),\n            \"conflicting configuration values: please use only `rest.listen_port`, \\\n             `rest_listen_port` is deprecated and should not be used alongside \\\n             `rest.listen_port`. 
Update your configuration to use `rest.listen_port`.\"\n        );\n    }\n\n    #[test]\n    fn test_jaeger_config_rejects_null_values() {\n        let jaeger_config_yaml = r#\"\n            enable_endpoint: true\n            max_trace_duration_secs: 0\n        \"#;\n        let error = serde_yaml::from_str::<JaegerConfig>(jaeger_config_yaml).unwrap_err();\n        assert!(\n            error\n                .to_string()\n                .contains(\"max_trace_duration_secs: invalid value: integer `0`\")\n        )\n    }\n\n    #[tokio::test]\n    async fn test_rest_config_accepts_wildcard() {\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              cors_allow_origins: '*'\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .expect(\"Deserialize rest config\");\n        assert_eq!(config.rest_config.cors_allow_origins, [\"*\"]);\n    }\n\n    #[tokio::test]\n    async fn test_rest_config_accepts_single_origin() {\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              cors_allow_origins:\n                - https://www.my-domain.com\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .expect(\"Deserialize rest config\");\n        assert_eq!(\n            config.rest_config.cors_allow_origins,\n            [\"https://www.my-domain.com\"]\n        );\n\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              cors_allow_origins: http://192.168.0.108:7280\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        
.await\n        .expect(\"Deserialize rest config\");\n        assert_eq!(\n            config.rest_config.cors_allow_origins,\n            [\"http://192.168.0.108:7280\"]\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_config_accepts_multi_origin() {\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              cors_allow_origins:\n                - https://www.my-domain.com\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .expect(\"Deserialize rest config\");\n        assert_eq!(\n            config.rest_config.cors_allow_origins,\n            [\"https://www.my-domain.com\"]\n        );\n\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              cors_allow_origins:\n                - https://www.my-domain.com\n                - https://www.my-other-domain.com\n        \"#;\n        let config = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .expect(\"Deserialize rest config\");\n        assert_eq!(\n            config.rest_config.cors_allow_origins,\n            [\n                \"https://www.my-domain.com\",\n                \"https://www.my-other-domain.com\"\n            ]\n        );\n\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              rest_cors_allow_origins:\n        \"#;\n        load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .expect_err(\"Config should not allow empty origins.\");\n\n        let rest_config_yaml = r#\"\n            version: 0.8\n            rest:\n              cors_allow_origins:\n                -\n 
       \"#;\n        load_node_config_with_env(\n            ConfigFormat::Yaml,\n            rest_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .expect_err(\"Config should not allow empty origins.\");\n    }\n\n    #[tokio::test]\n    async fn test_node_config_validates_ingest_config() {\n        let ingest_config = IngestApiConfig {\n            replication_factor: 0,\n            ..Default::default()\n        };\n        let error_message = ingest_config.validate().unwrap_err().to_string();\n        assert!(error_message.contains(\"either 1 or 2, got `0`\"));\n\n        let ingest_config = IngestApiConfig {\n            replication_factor: 3,\n            ..Default::default()\n        };\n        let error_message = ingest_config.validate().unwrap_err().to_string();\n        assert!(error_message.contains(\"either 1 or 2, got `3`\"));\n\n        let node_config_yaml = r#\"\n            version: 0.8\n            ingest_api:\n              replication_factor: 0\n        \"#;\n        let error_message = load_node_config_with_env(\n            ConfigFormat::Yaml,\n            node_config_yaml.as_bytes(),\n            &Default::default(),\n        )\n        .await\n        .unwrap_err()\n        .to_string();\n        assert!(error_message.contains(\"replication factor\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/qw_env_vars.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse once_cell::sync::Lazy;\n\n/// Expands the list of QW environment variables into constants of the form `const <ENV_VAR_KEY>:\n/// usize = <env var index>;` and builds the map `QW_ENV_VARS` of environment variable index to\n/// environment variable key.\nmacro_rules! qw_env_vars {\n    (@step $idx:expr,) => {};\n\n    (@step $idx:expr, $head:ident, $($tail:ident,)*) => {\n        pub(crate) const $head: usize = $idx;\n\n        qw_env_vars!(@step $idx + 1usize, $($tail,)*);\n    };\n\n    ($($ident:ident),*) => {\n        qw_env_vars!(@step 0usize, $($ident,)*);\n\n        pub(crate) static QW_ENV_VARS: Lazy<HashMap<usize, &'static str>> = Lazy::new(|| {\n            let mut env_vars = HashMap::new();\n            $(env_vars.insert($ident, stringify!($ident));)*\n            env_vars\n        });\n    }\n}\n\n// These environment variable keys can be declared in any order with the exception of `QW_NONE`,\n// which must be declared first.\nqw_env_vars!(\n    QW_NONE,\n    QW_CLUSTER_ID,\n    QW_NODE_ID,\n    QW_AVAILABILITY_ZONE,\n    QW_ENABLED_SERVICES,\n    QW_LISTEN_ADDRESS,\n    QW_ADVERTISE_ADDRESS,\n    QW_REST_LISTEN_PORT,\n    QW_GOSSIP_LISTEN_PORT,\n    QW_GRPC_LISTEN_PORT,\n    QW_GOSSIP_INTERVAL_MS,\n    QW_PEER_SEEDS,\n    QW_DATA_DIR,\n    QW_METASTORE_URI,\n    
QW_DEFAULT_INDEX_ROOT_URI\n);\n\n#[cfg(test)]\nmod tests {\n\n    use super::*;\n\n    #[test]\n    fn test_qw_env_vars_expansion() {\n        assert_eq!(QW_NONE, 0);\n\n        assert_eq!(QW_CLUSTER_ID, 1);\n        assert_eq!(QW_ENV_VARS.get(&QW_CLUSTER_ID).unwrap(), &\"QW_CLUSTER_ID\");\n\n        assert_eq!(QW_ENV_VARS.get(&QW_NODE_ID).unwrap(), &\"QW_NODE_ID\");\n        assert_eq!(QW_NODE_ID, 2);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/serde_utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::ops::Deref;\nuse std::time::Duration;\n\nuse serde::{Deserialize, Serialize};\n\n/// Custom serde module for ByteSize that serializes as raw byte count (u64).\n/// This ensures perfect roundtrip consistency regardless of display format changes\n/// in the bytesize crate. Deserialization still accepts human-readable strings\n/// like \"2 GB\" via bytesize's default deserializer.\npub mod bytesize_serde {\n    use bytesize::ByteSize;\n    use serde::{Deserialize, Deserializer, Serializer};\n\n    pub fn serialize<S>(byte_size: &ByteSize, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        serializer.serialize_u64(byte_size.as_u64())\n    }\n\n    pub fn deserialize<'de, D>(deserializer: D) -> Result<ByteSize, D::Error>\n    where D: Deserializer<'de> {\n        ByteSize::deserialize(deserializer)\n    }\n}\n\n#[derive(Serialize, Deserialize, Clone)]\n#[serde(try_from = \"String\", into = \"String\")]\npub struct DurationAsStr {\n    duration_str: String,\n    duration: Duration,\n}\n\nimpl TryFrom<String> for DurationAsStr {\n    type Error = humantime::DurationError;\n\n    fn try_from(duration_str: String) -> Result<Self, Self::Error> {\n        let duration = humantime::parse_duration(&duration_str)?;\n        Ok(DurationAsStr {\n            duration_str,\n            duration,\n        })\n    
}\n}\n\nimpl From<DurationAsStr> for String {\n    fn from(duration_as_str: DurationAsStr) -> String {\n        duration_as_str.duration_str\n    }\n}\n\nimpl Deref for DurationAsStr {\n    type Target = Duration;\n\n    fn deref(&self) -> &Self::Target {\n        &self.duration\n    }\n}\n\nimpl From<DurationAsStr> for Duration {\n    fn from(duration_as_str: DurationAsStr) -> Self {\n        *duration_as_str\n    }\n}\n\nimpl fmt::Debug for DurationAsStr {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        self.duration_str.fmt(f)\n    }\n}\n\nimpl PartialEq for DurationAsStr {\n    fn eq(&self, other: &Self) -> bool {\n        // We do not check for the chosen representation here\n        self.duration == other.duration\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use super::*;\n\n    #[test]\n    fn test_duration_deserialize() {\n        let duration: DurationAsStr = serde_json::from_str(\"\\\"10s\\\"\").unwrap();\n        assert_eq!(*duration, Duration::from_secs(10));\n        let deser_error = serde_json::from_str::<DurationAsStr>(\"\\\"10\\\"\").unwrap_err();\n        assert_eq!(\n            deser_error.to_string(),\n            \"time unit needed, for example 10sec or 10ms\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::fmt::Display;\nuse std::str::FromStr;\n\nuse anyhow::bail;\nuse enum_iterator::{Sequence, all};\nuse itertools::Itertools;\nuse serde::Serialize;\n\n#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash, Serialize, Sequence)]\n#[serde(into = \"&'static str\")]\npub enum QuickwitService {\n    ControlPlane,\n    Indexer,\n    Searcher,\n    Janitor,\n    Metastore,\n}\n\n#[allow(clippy::from_over_into)]\nimpl Into<&'static str> for QuickwitService {\n    fn into(self) -> &'static str {\n        self.as_str()\n    }\n}\n\nimpl QuickwitService {\n    pub fn as_str(&self) -> &'static str {\n        match self {\n            QuickwitService::ControlPlane => \"control_plane\",\n            QuickwitService::Indexer => \"indexer\",\n            QuickwitService::Searcher => \"searcher\",\n            QuickwitService::Janitor => \"janitor\",\n            QuickwitService::Metastore => \"metastore\",\n        }\n    }\n\n    pub fn supported_services() -> HashSet<QuickwitService> {\n        all::<QuickwitService>().collect()\n    }\n}\n\nimpl Display for QuickwitService {\n    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        write!(f, \"{}\", self.as_str())\n    }\n}\n\nimpl FromStr for QuickwitService {\n    type Err = anyhow::Error;\n\n    fn from_str(service_str: &str) -> Result<Self, 
Self::Err> {\n        match service_str {\n            \"control-plane\" | \"control_plane\" => Ok(QuickwitService::ControlPlane),\n            \"indexer\" => Ok(QuickwitService::Indexer),\n            \"searcher\" => Ok(QuickwitService::Searcher),\n            \"janitor\" => Ok(QuickwitService::Janitor),\n            \"metastore\" => Ok(QuickwitService::Metastore),\n            _ => {\n                bail!(\n                    \"failed to parse service `{service_str}`. supported services are: `{}`\",\n                    QuickwitService::supported_services().iter().join(\"`, `\")\n                )\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/source_config/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub(crate) mod serialize;\n\nuse std::borrow::Cow;\nuse std::hash::{Hash, Hasher};\nuse std::num::NonZeroUsize;\nuse std::str::FromStr;\n\nuse anyhow::ensure;\nuse bytes::Bytes;\nuse quickwit_common::is_false;\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::SourceId;\nuse regex::Regex;\nuse serde::de::Error;\nuse serde::{Deserialize, Deserializer, Serialize};\nuse serde_json::Value as JsonValue;\n// For backward compatibility.\nuse serialize::VersionedSourceConfig;\npub use serialize::{load_source_config_from_user_config, load_source_config_update};\nuse siphasher::sip::SipHasher;\n\nuse crate::{disable_ingest_v1, enable_ingest_v2};\n\n/// Reserved source ID for the `quickwit index ingest` CLI command.\npub const CLI_SOURCE_ID: &str = \"_ingest-cli-source\";\n\n/// Reserved source ID used for Quickwit ingest API.\npub const INGEST_API_SOURCE_ID: &str = \"_ingest-api-source\";\n\n/// Reserved source ID used for native Quickwit ingest.\n/// (this is for ingest v2)\npub const INGEST_V2_SOURCE_ID: &str = \"_ingest-source\";\n\npub const RESERVED_SOURCE_IDS: &[&str] =\n    &[CLI_SOURCE_ID, INGEST_API_SOURCE_ID, INGEST_V2_SOURCE_ID];\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(into = \"VersionedSourceConfig\")]\n#[serde(try_from = \"VersionedSourceConfig\")]\npub 
struct SourceConfig {\n    pub source_id: SourceId,\n\n    /// Number of indexing pipelines to run on a cluster for the source.\n    pub num_pipelines: NonZeroUsize,\n\n    // Denotes if this source is enabled.\n    pub enabled: bool,\n\n    pub source_params: SourceParams,\n\n    pub transform_config: Option<TransformConfig>,\n\n    // Denotes the input data format.\n    #[serde(default)]\n    pub input_format: SourceInputFormat,\n}\n\nimpl SourceConfig {\n    pub fn source_type(&self) -> SourceType {\n        self.source_params.source_type()\n    }\n\n    // TODO: Remove after source factory refactor.\n    pub fn params(&self) -> JsonValue {\n        match &self.source_params {\n            SourceParams::File(params) => serde_json::to_value(params),\n            SourceParams::PubSub(params) => serde_json::to_value(params),\n            SourceParams::Ingest => serde_json::to_value(()),\n            SourceParams::IngestApi => serde_json::to_value(()),\n            SourceParams::IngestCli => serde_json::to_value(()),\n            SourceParams::Kafka(params) => serde_json::to_value(params),\n            SourceParams::Kinesis(params) => serde_json::to_value(params),\n            SourceParams::Pulsar(params) => serde_json::to_value(params),\n            SourceParams::Stdin => serde_json::to_value(()),\n            SourceParams::Vec(params) => serde_json::to_value(params),\n            SourceParams::Void(params) => serde_json::to_value(params),\n        }\n        .expect(\"`SourceParams` should be JSON serializable\")\n    }\n\n    /// Creates the default CLI source config. 
The CLI source ingests data from stdin.\n    pub fn cli() -> Self {\n        Self {\n            source_id: CLI_SOURCE_ID.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::IngestCli,\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }\n    }\n\n    /// Creates a native Quickwit ingest source. The ingest source ingests data from an ingester.\n    pub fn ingest_v2() -> Self {\n        Self {\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: enable_ingest_v2(),\n            source_params: SourceParams::Ingest,\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }\n    }\n\n    /// Creates the default ingest-api source config.\n    pub fn ingest_api_default() -> Self {\n        Self {\n            source_id: INGEST_API_SOURCE_ID.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: !disable_ingest_v1(),\n            source_params: SourceParams::IngestApi,\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }\n    }\n\n    /// Returns a fingerprint of parameters relevant for indexers.\n    ///\n    /// This should remain private to this crate to avoid confusion with the\n    /// full indexing pipeline fingerprint that also includes the index config's\n    /// fingerprint.\n    pub(crate) fn indexing_params_fingerprint(&self) -> u64 {\n        let mut hasher = SipHasher::new();\n        self.input_format.hash(&mut hasher);\n        self.num_pipelines.hash(&mut hasher);\n        self.source_params.hash(&mut hasher);\n        self.transform_config.hash(&mut hasher);\n        hasher.finish()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(source_id: &str, source_params: SourceParams) -> Self {\n        Self {\n            
source_id: source_id.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params,\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl crate::TestableForRegression for SourceConfig {\n    fn sample_for_regression() -> Self {\n        SourceConfig {\n            source_id: \"kafka-source\".to_string(),\n            num_pipelines: NonZeroUsize::new(2).unwrap(),\n            enabled: true,\n            source_params: SourceParams::Kafka(KafkaSourceParams {\n                topic: \"kafka-topic\".to_string(),\n                client_log_level: None,\n                client_params: serde_json::json!({}),\n                enable_backfill_mode: false,\n            }),\n            transform_config: Some(TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: default_timezone(),\n            }),\n            input_format: SourceInputFormat::Json,\n        }\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        assert_eq!(self, other);\n    }\n}\n\n#[derive(\n    Clone, Copy, Debug, Default, Eq, PartialEq, Serialize, Deserialize, Hash, utoipa::ToSchema,\n)]\n#[serde(rename_all = \"snake_case\")]\npub enum SourceInputFormat {\n    #[default]\n    Json,\n    OtlpLogsJson,\n    #[serde(alias = \"otlp_logs_proto\")]\n    OtlpLogsProtobuf,\n    #[serde(alias = \"otlp_trace_json\")]\n    OtlpTracesJson,\n    #[serde(\n        alias = \"otlp_trace_proto\",\n        alias = \"otlp_trace_protobuf\",\n        alias = \"otlp_traces_proto\"\n    )]\n    OtlpTracesProtobuf,\n    #[serde(alias = \"plain\")]\n    PlainText,\n}\n\nimpl FromStr for SourceInputFormat {\n    type Err = String;\n\n    fn from_str(format_str: &str) -> Result<Self, String> {\n        match format_str {\n            \"json\" => Ok(Self::Json),\n            \"plain\" => 
Ok(Self::PlainText),\n            unknown => Err(format!(\"unknown source input format: `{unknown}`\")),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, Hash, utoipa::ToSchema)]\n#[serde(tag = \"source_type\", content = \"params\", rename_all = \"snake_case\")]\npub enum SourceParams {\n    #[schema(value_type = FileSourceParamsForSerde)]\n    File(FileSourceParams),\n    Ingest,\n    #[serde(rename = \"ingest-api\")]\n    IngestApi,\n    #[serde(rename = \"ingest-cli\")]\n    IngestCli,\n    Kafka(KafkaSourceParams),\n    Kinesis(KinesisSourceParams),\n    #[serde(rename = \"pubsub\")]\n    PubSub(PubSubSourceParams),\n    Pulsar(PulsarSourceParams),\n    Stdin,\n    Vec(VecSourceParams),\n    Void(VoidSourceParams),\n}\n\nimpl SourceParams {\n    pub fn file_from_uri(uri: Uri) -> Self {\n        Self::File(FileSourceParams::Filepath(uri))\n    }\n\n    pub fn file_from_str<P: AsRef<str>>(filepath: P) -> anyhow::Result<Self> {\n        Uri::from_str(filepath.as_ref()).map(Self::file_from_uri)\n    }\n\n    pub fn stdin() -> Self {\n        Self::Stdin\n    }\n\n    pub fn void() -> Self {\n        Self::Void(VoidSourceParams)\n    }\n\n    fn source_type(&self) -> SourceType {\n        match self {\n            SourceParams::File(_) => SourceType::File,\n            SourceParams::Ingest => SourceType::IngestV2,\n            SourceParams::IngestApi => SourceType::IngestV1,\n            SourceParams::IngestCli => SourceType::Cli,\n            SourceParams::Kafka(_) => SourceType::Kafka,\n            SourceParams::Kinesis(_) => SourceType::Kinesis,\n            SourceParams::PubSub(_) => SourceType::PubSub,\n            SourceParams::Pulsar(_) => SourceType::Pulsar,\n            SourceParams::Stdin => SourceType::Stdin,\n            SourceParams::Vec(_) => SourceType::Vec,\n            SourceParams::Void(_) => SourceType::Void,\n        }\n    }\n\n    fn validate_update(&self, new_source_params: &SourceParams) -> 
anyhow::Result<()> {\n        match (self, new_source_params) {\n            (\n                SourceParams::File(FileSourceParams::Notifications(current)),\n                SourceParams::File(FileSourceParams::Notifications(new)),\n            ) => current.validate_update(new),\n            (SourceParams::Kafka(current), SourceParams::Kafka(new)) => {\n                current.validate_update(new)\n            }\n            (SourceParams::Kinesis(current), SourceParams::Kinesis(new)) => {\n                current.validate_update(new)\n            }\n            (SourceParams::PubSub(current), SourceParams::PubSub(new)) => {\n                current.validate_update(new)\n            }\n            (SourceParams::Pulsar(current), SourceParams::Pulsar(new)) => {\n                current.validate_update(new)\n            }\n            (current, new) if current.source_type() != new.source_type() => Err(anyhow::anyhow!(\n                \"source type cannot be changed, current type {}\",\n                current.source_type(),\n            )),\n            _ => Err(anyhow::anyhow!(\n                \"source type {} cannot be updated\",\n                self.source_type(),\n            )),\n        }\n    }\n}\n\n#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\npub enum FileSourceMessageType {\n    /// See <https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-content-structure.html>\n    S3Notification,\n    /// A string with the URI of the file (e.g `s3://bucket/key`)\n    RawUri,\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\npub struct FileSourceSqs {\n    pub queue_url: String,\n    pub message_type: FileSourceMessageType,\n    #[serde(default = \"default_deduplication_window_duration_secs\")]\n    pub deduplication_window_duration_secs: u32,\n    #[serde(default = \"default_deduplication_window_max_messages\")]\n    pub 
deduplication_window_max_messages: u32,\n    #[serde(default = \"default_deduplication_cleanup_interval_secs\")]\n    pub deduplication_cleanup_interval_secs: u32,\n}\n\nfn default_deduplication_window_duration_secs() -> u32 {\n    3600\n}\n\nfn default_deduplication_window_max_messages() -> u32 {\n    100_000\n}\n\nfn default_deduplication_cleanup_interval_secs() -> u32 {\n    60\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(tag = \"type\", rename_all = \"snake_case\")]\npub enum FileSourceNotification {\n    Sqs(FileSourceSqs),\n}\n\nimpl FileSourceNotification {\n    fn validate_update(&self, other: &Self) -> anyhow::Result<()> {\n        match (self, other) {\n            (Self::Sqs(_), Self::Sqs(_)) => {\n                // changing the queue or the deduplication settings should be fine\n                Ok(())\n            }\n        }\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Deserialize, Serialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub(super) struct FileSourceParamsForSerde {\n    #[serde(default, skip_serializing_if = \"Vec::is_empty\")]\n    notifications: Vec<FileSourceNotification>,\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    filepath: Option<String>,\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize)]\n#[serde(\n    try_from = \"FileSourceParamsForSerde\",\n    into = \"FileSourceParamsForSerde\"\n)]\npub enum FileSourceParams {\n    Notifications(FileSourceNotification),\n    Filepath(Uri),\n}\n\nimpl TryFrom<FileSourceParamsForSerde> for FileSourceParams {\n    type Error = Cow<'static, str>;\n\n    fn try_from(mut value: FileSourceParamsForSerde) -> Result<Self, Self::Error> {\n        if value.filepath.is_some() && !value.notifications.is_empty() {\n            return Err(\n                \"File source parameters `notifications` and `filepath` are mutually exclusive\"\n                    .into(),\n            );\n     
   }\n        if let Some(filepath) = value.filepath {\n            let uri = Uri::from_str(&filepath).map_err(|err| err.to_string())?;\n            Ok(FileSourceParams::Filepath(uri))\n        } else if value.notifications.len() == 1 {\n            Ok(FileSourceParams::Notifications(\n                value.notifications.remove(0),\n            ))\n        } else if value.notifications.len() > 1 {\n            Err(\"Only one notification can be specified for now\".into())\n        } else {\n            Err(\n                \"Either `notifications` or `filepath` must be specified as file source parameters\"\n                    .into(),\n            )\n        }\n    }\n}\n\nimpl From<FileSourceParams> for FileSourceParamsForSerde {\n    fn from(value: FileSourceParams) -> Self {\n        match value {\n            FileSourceParams::Filepath(uri) => Self {\n                filepath: Some(uri.to_string()),\n                notifications: vec![],\n            },\n            FileSourceParams::Notifications(notification) => Self {\n                filepath: None,\n                notifications: vec![notification],\n            },\n        }\n    }\n}\n\nimpl FileSourceParams {\n    pub fn from_filepath<P: AsRef<str>>(filepath: P) -> anyhow::Result<Self> {\n        Uri::from_str(filepath.as_ref()).map(Self::Filepath)\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct KafkaSourceParams {\n    /// Name of the topic that the source consumes.\n    pub topic: String,\n    /// Kafka client log level. 
Possible values are `debug`, `info`, `warn`, and `error`.\n    #[schema(value_type = String)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub client_log_level: Option<String>,\n    /// Kafka client configuration parameters.\n    #[schema(value_type = Object)]\n    #[serde(default = \"serde_json::Value::default\")]\n    #[serde(skip_serializing_if = \"serde_json::Value::is_null\")]\n    pub client_params: JsonValue,\n    /// When backfill mode is enabled, the source exits after reaching the end of the topic.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_false\")]\n    pub enable_backfill_mode: bool,\n}\n\nimpl KafkaSourceParams {\n    fn validate_update(&self, other: &Self) -> anyhow::Result<()> {\n        // Updating the topic would likely mess up the checkpoints because the\n        // Kafka partition IDs are used as metastore checkpoint PartitionId\n        // and their uniqueness is not guaranteed across topics.\n        ensure!(self.topic == other.topic, \"Kafka topic cannot be updated\");\n        Ok(())\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct PubSubSourceParams {\n    /// Name of the subscription that the source consumes.\n    pub subscription: String,\n    /// When backfill mode is enabled, the source exits after reaching the end of the topic.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_false\")]\n    pub enable_backfill_mode: bool,\n    /// GCP service account credentials (`None` will use default via\n    /// GOOGLE_APPLICATION_CREDENTIALS)\n    /// Path to a google_cloud_auth::credentials::CredentialsFile serialized in JSON. 
See also\n    /// `<https://cloud.google.com/docs/authentication/application-default-credentials>` and\n    /// `<https://github.com/yoshidan/google-cloud-rust/tree/main/pubsub#automatically>` and\n    /// `<https://docs.rs/google-cloud-auth/0.12.0/google_cloud_auth/credentials/struct.CredentialsFile.html>`.\n    pub credentials_file: Option<String>,\n    /// GCP project ID (Defaults to credentials file project ID).\n    pub project_id: Option<String>,\n    /// Maximum number of messages returned by a pull request (default 1,000)\n    pub max_messages_per_pull: Option<i32>,\n}\n\nimpl PubSubSourceParams {\n    fn validate_update(&self, _other: &Self) -> anyhow::Result<()> {\n        // experimental source, no validation is performed\n        Ok(())\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"lowercase\")]\npub enum RegionOrEndpoint {\n    Region(String),\n    Endpoint(String),\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(try_from = \"KinesisSourceParamsInner\")]\npub struct KinesisSourceParams {\n    pub stream_name: String,\n    #[serde(flatten)]\n    pub region_or_endpoint: Option<RegionOrEndpoint>,\n    /// When backfill mode is enabled, the source exits after reaching the end of the stream.\n    #[serde(skip_serializing_if = \"is_false\")]\n    pub enable_backfill_mode: bool,\n}\n\nimpl KinesisSourceParams {\n    fn validate_update(&self, other: &Self) -> anyhow::Result<()> {\n        // Changing the stream would likely mess up the checkpoints because the\n        // Kinesis shard IDs are used as metastore checkpoint PartitionId, and\n        // their uniqueness is only guaranteed within a stream.\n        ensure!(\n            self.stream_name == other.stream_name,\n            \"Kinesis stream_name cannot be updated\"\n        );\n        ensure!(\n            self.region_or_endpoint == other.region_or_endpoint,\n            
\"Kinesis region or endpoint cannot be updated\"\n        );\n        Ok(())\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Deserialize)]\n#[serde(deny_unknown_fields)]\nstruct KinesisSourceParamsInner {\n    pub stream_name: String,\n    pub region: Option<String>,\n    pub endpoint: Option<String>,\n    #[serde(default)]\n    pub enable_backfill_mode: bool,\n}\n\nimpl TryFrom<KinesisSourceParamsInner> for KinesisSourceParams {\n    type Error = &'static str;\n\n    fn try_from(value: KinesisSourceParamsInner) -> Result<Self, Self::Error> {\n        if value.region.is_some() && value.endpoint.is_some() {\n            return Err(\"Kinesis source parameters `region` and `endpoint` are mutually exclusive\");\n        }\n        let region = value.region.map(RegionOrEndpoint::Region);\n        let endpoint = value.endpoint.map(RegionOrEndpoint::Endpoint);\n        let region_or_endpoint = region.or(endpoint);\n\n        Ok(KinesisSourceParams {\n            stream_name: value.stream_name,\n            region_or_endpoint,\n            enable_backfill_mode: value.enable_backfill_mode,\n        })\n    }\n}\n\n#[derive(Clone, Debug, Default, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct VecSourceParams {\n    #[schema(value_type = Vec<String>)]\n    pub docs: Vec<Bytes>,\n    pub batch_num_docs: usize,\n    #[serde(default)]\n    pub partition: String,\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct VoidSourceParams;\n\n#[derive(\n    Clone, Debug, Eq, PartialEq, Hash, serde::Serialize, serde::Deserialize, utoipa::ToSchema,\n)]\n#[serde(deny_unknown_fields)]\npub struct PulsarSourceParams {\n    /// List of the topics that the source consumes.\n    pub topics: Vec<String>,\n    #[serde(deserialize_with = \"pulsar_uri\")]\n    /// The connection URI for pulsar.\n    pub address: String,\n    #[schema(default = 
\"quickwit\")]\n    #[serde(default = \"default_consumer_name\")]\n    /// The name to register with the pulsar source.\n    pub consumer_name: String,\n    // Serde yaml has some specific behaviour when deserializing\n    // enums (see https://github.com/dtolnay/serde-yaml/issues/342)\n    // and requires explicitly stating `default` in order to make the parameter\n    // optional on the yaml config.\n    #[serde(default, with = \"serde_yaml::with::singleton_map\")]\n    /// Authentication for pulsar.\n    pub authentication: Option<PulsarSourceAuth>,\n}\n\nimpl PulsarSourceParams {\n    fn validate_update(&self, _other: &Self) -> anyhow::Result<()> {\n        // In the Pulsar source, we use use combinations of the topic+partition\n        // (generated by the Pulsar client library) as metastore checkpoint\n        // PartitionId, and those are \"guaranteed\" to be unique across topics.\n        Ok(())\n    }\n}\n\n#[derive(\n    Clone, Debug, Eq, PartialEq, Hash, serde::Serialize, serde::Deserialize, utoipa::ToSchema,\n)]\n#[serde(rename_all = \"lowercase\")]\npub enum PulsarSourceAuth {\n    Token(String),\n    Oauth2 {\n        issuer_url: String,\n        credentials_url: String,\n        audience: Option<String>,\n        scope: Option<String>,\n    },\n}\n\n// Deserializing a string into an pulsar uri.\nfn pulsar_uri<'de, D>(deserializer: D) -> Result<String, D::Error>\nwhere D: Deserializer<'de> {\n    let uri: String = Deserialize::deserialize(deserializer)?;\n    let re: Regex = Regex::new(r\"pulsar(\\+ssl)?://.*\").expect(\"regular expression should compile\");\n\n    if !re.is_match(uri.as_str()) {\n        return Err(Error::custom(format!(\n            \"invalid Pulsar uri provided, must be in the format of `pulsar://host:port/path`. 
\\\n             got: `{uri}`\"\n        )));\n    }\n\n    Ok(uri)\n}\n\nfn default_consumer_name() -> String {\n    \"quickwit\".to_string()\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq, Hash, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct TransformConfig {\n    /// [VRL] source code of the transform compiled to a VRL [`Program`](vrl::compiler::Program).\n    ///\n    /// [VRL]: https://vector.dev/docs/reference/vrl/\n    #[serde(rename = \"script\")]\n    vrl_script: String,\n\n    /// Timezone used in the VRL [`Program`](vrl::compiler::Program) for date and time\n    /// manipulations. Defaults to `UTC` if no timezone is specified.\n    #[serde(default = \"default_timezone\")]\n    timezone: String,\n}\n\nfn default_timezone() -> String {\n    \"UTC\".to_string()\n}\n\nimpl TransformConfig {\n    /// Creates a new [`TransformConfig`] instance from the provided VRL script and optional\n    /// timezone.\n    pub fn new(vrl_script: String, timezone_opt: Option<String>) -> Self {\n        Self {\n            vrl_script,\n            timezone: timezone_opt.unwrap_or_else(default_timezone),\n        }\n    }\n\n    #[cfg(feature = \"vrl\")]\n    pub(crate) fn validate_vrl_script(&self) -> anyhow::Result<()> {\n        self.compile_vrl_script()?;\n        Ok(())\n    }\n\n    #[cfg(not(feature = \"vrl\"))]\n    pub(crate) fn validate_vrl_script(&self) -> anyhow::Result<()> {\n        // If we are missing the VRL feature we do not return an error here,\n        // to avoid breaking unit tests.\n        //\n        // We do return an explicit error on instantiation of the program however.\n        Ok(())\n    }\n\n    #[cfg(feature = \"vrl\")]\n    /// Compiles the VRL script to a VRL [`Program`](vrl::compiler::Program) and returns it along\n    /// with the timezone.\n    pub fn compile_vrl_script(\n        &self,\n    ) -> anyhow::Result<(vrl::compiler::Program, vrl::compiler::TimeZone)> {\n        use anyhow::Context;\n        
let timezone = vrl::compiler::TimeZone::parse(&self.timezone).with_context(|| {\n            format!(\n                \"failed to parse timezone: `{}`. timezone must be a valid name \\\n            in the TZ database: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones\",\n                self.timezone,\n            )\n        })?;\n        // Append \"\\n.\" to the script to return the entire document and not only the modified\n        // fields.\n        let vrl_script = self.vrl_script.clone() + \"\\n.\";\n        let functions = vrl::stdlib::all();\n\n        let compilation_res = match vrl::compiler::compile(&vrl_script, &functions) {\n            Ok(compilation_res) => compilation_res,\n            Err(diagnostics) => {\n                let mut formatter = vrl::diagnostic::Formatter::new(&vrl_script, diagnostics);\n                formatter.enable_colors(!quickwit_common::no_color());\n                anyhow::bail!(\"failed to compile VRL script:\\n {formatter}\")\n            }\n        };\n\n        let vrl::compiler::CompilationResult {\n            program, warnings, ..\n        } = compilation_res;\n\n        if !warnings.is_empty() {\n            let mut formatter = vrl::diagnostic::Formatter::new(&vrl_script, warnings);\n            formatter.enable_colors(!quickwit_common::no_color());\n            tracing::warn!(\"VRL program compiled with some warnings: {formatter}\");\n        }\n        Ok((program, timezone))\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(vrl_script: &str) -> Self {\n        Self {\n            vrl_script: vrl_script.to_string(),\n            timezone: default_timezone(),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZero;\n    use std::str::FromStr;\n\n    use quickwit_common::uri::Uri;\n    use serde_json::json;\n\n    use super::*;\n    use crate::source_config::RegionOrEndpoint;\n    use crate::{ConfigFormat, FileSourceParams, KinesisSourceParams};\n\n    fn 
get_source_config_filepath(source_config_filename: &str) -> String {\n        format!(\n            \"{}/resources/tests/source_config/{}\",\n            env!(\"CARGO_MANIFEST_DIR\"),\n            source_config_filename\n        )\n    }\n\n    #[tokio::test]\n    async fn test_load_kafka_source_config() {\n        let source_config_filepath = get_source_config_filepath(\"kafka-source.json\");\n        let file_content = std::fs::read_to_string(&source_config_filepath).unwrap();\n        let source_config_uri = Uri::from_str(&source_config_filepath).unwrap();\n        let config_format = ConfigFormat::sniff_from_uri(&source_config_uri).unwrap();\n        let source_config =\n            load_source_config_from_user_config(config_format, file_content.as_bytes()).unwrap();\n        let expected_source_config = SourceConfig {\n            source_id: \"hdfs-logs-kafka-source\".to_string(),\n            num_pipelines: NonZeroUsize::new(2).unwrap(),\n            enabled: true,\n            source_params: SourceParams::Kafka(KafkaSourceParams {\n                topic: \"cloudera-cluster-logs\".to_string(),\n                client_log_level: None,\n                client_params: json! 
{{\"bootstrap.servers\": \"localhost:9092\"}},\n                enable_backfill_mode: false,\n            }),\n            transform_config: Some(TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: \"local\".to_string(),\n            }),\n            input_format: SourceInputFormat::Json,\n        };\n        assert_eq!(source_config, expected_source_config);\n        assert_eq!(source_config.num_pipelines.get(), 2);\n    }\n\n    #[test]\n    fn test_kafka_source_params_serialization() {\n        {\n            let params = KafkaSourceParams {\n                topic: \"my-topic\".to_string(),\n                client_log_level: None,\n                client_params: json!(null),\n                enable_backfill_mode: false,\n            };\n            let params_yaml = serde_yaml::to_string(&params).unwrap();\n\n            assert_eq!(\n                serde_yaml::from_str::<KafkaSourceParams>(&params_yaml).unwrap(),\n                params,\n            )\n        }\n        {\n            let params = KafkaSourceParams {\n                topic: \"my-topic\".to_string(),\n                client_log_level: Some(\"info\".to_string()),\n                client_params: json! 
{{\"bootstrap.servers\": \"localhost:9092\"}},\n                enable_backfill_mode: false,\n            };\n            let params_yaml = serde_yaml::to_string(&params).unwrap();\n\n            assert_eq!(\n                serde_yaml::from_str::<KafkaSourceParams>(&params_yaml).unwrap(),\n                params,\n            )\n        }\n    }\n\n    #[test]\n    fn test_kafka_source_params_deserialization() {\n        {\n            let yaml = r#\"\n                    topic: my-topic\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<KafkaSourceParams>(yaml).unwrap(),\n                KafkaSourceParams {\n                    topic: \"my-topic\".to_string(),\n                    client_log_level: None,\n                    client_params: json!(null),\n                    enable_backfill_mode: false,\n                }\n            );\n        }\n        {\n            let yaml = r#\"\n                    topic: my-topic\n                    client_log_level: info\n                    client_params:\n                        bootstrap.servers: localhost:9092\n                    enable_backfill_mode: true\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<KafkaSourceParams>(yaml).unwrap(),\n                KafkaSourceParams {\n                    topic: \"my-topic\".to_string(),\n                    client_log_level: Some(\"info\".to_string()),\n                    client_params: json! 
{{\"bootstrap.servers\": \"localhost:9092\"}},\n                    enable_backfill_mode: true,\n                }\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_load_kinesis_source_config() {\n        let source_config_filepath = get_source_config_filepath(\"kinesis-source.yaml\");\n        let file_content = std::fs::read_to_string(&source_config_filepath).unwrap();\n        let source_config_uri = Uri::from_str(&source_config_filepath).unwrap();\n        let config_format = ConfigFormat::sniff_from_uri(&source_config_uri).unwrap();\n        let source_config =\n            load_source_config_from_user_config(config_format, file_content.as_bytes()).unwrap();\n        let expected_source_config = SourceConfig {\n            source_id: \"hdfs-logs-kinesis-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Kinesis(KinesisSourceParams {\n                stream_name: \"emr-cluster-logs\".to_string(),\n                region_or_endpoint: None,\n                enable_backfill_mode: false,\n            }),\n            transform_config: Some(TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: \"local\".to_string(),\n            }),\n            input_format: SourceInputFormat::Json,\n        };\n        assert_eq!(source_config, expected_source_config);\n        assert_eq!(source_config.num_pipelines.get(), 1);\n    }\n\n    #[tokio::test]\n    async fn test_load_invalid_source_config() {\n        {\n            let content = r#\"\n            {\n                \"version\": \"0.7\",\n                \"source_id\": \"hdfs-logs-void-source\",\n                \"desired_num_pipelines\": 0,\n                \"max_num_pipelines_per_indexer\": 1,\n                \"source_type\": \"void\",\n                \"params\": {}\n            }\n            \"#;\n            let error = 
load_source_config_from_user_config(ConfigFormat::Json, content.as_bytes())\n                .unwrap_err();\n            assert!(\n                error\n                    .to_string()\n                    .contains(\"`desired_num_pipelines` must be\")\n            );\n        }\n        // {\n        //     let content = r#\"\n        //     {\n        //         \"version\": \"0.7\",\n        //         \"source_id\": \"hdfs-logs-void-source\",\n        //         \"desired_num_pipelines\": 1,\n        //         \"max_num_pipelines_per_indexer\": 0,\n        //         \"source_type\": \"void\",\n        //         \"params\": {}\n        //     }\n        //     \"#;\n        //     let error = load_source_config_from_user_config(ConfigFormat::Json,\n        // content.as_bytes())         .unwrap_err();\n        //     assert!(error\n        //         .to_string()\n        //         .contains(\"`max_num_pipelines_per_indexer` must be\"));\n        // }\n        {\n            let content = r#\"\n            {\n                \"version\": \"0.8\",\n                \"source_id\": \"hdfs-logs-void-source\",\n                \"num_pipelines\": 2,\n                \"source_type\": \"void\",\n                \"params\": {}\n            }\n            \"#;\n            let error = load_source_config_from_user_config(ConfigFormat::Json, content.as_bytes())\n                .unwrap_err();\n            assert!(error.to_string().contains(\"supports multiple pipelines\"));\n        }\n        {\n            let content = r#\"\n            {\n                \"version\": \"0.7\",\n                \"source_id\": \"hdfs-logs-void-source\",\n                \"desired_num_pipelines\": 2,\n                \"max_num_pipelines_per_indexer\": 1,\n                \"source_type\": \"void\",\n                \"params\": {}\n            }\n            \"#;\n            let error = load_source_config_from_user_config(ConfigFormat::Json, content.as_bytes())\n                
.unwrap_err();\n            assert!(error.to_string().contains(\"supports multiple pipelines\"));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_load_valid_distributed_source_config_0_7() {\n        {\n            let content = r#\"\n            {\n                \"version\": \"0.7\",\n                \"source_id\": \"hdfs-logs-kafka-source\",\n                \"desired_num_pipelines\": 3,\n                \"max_num_pipelines_per_indexer\": 3,\n                \"source_type\": \"kafka\",\n                \"params\": {\n                    \"topic\": \"my-topic\"\n                }\n            }\n            \"#;\n            let source_config =\n                load_source_config_from_user_config(ConfigFormat::Json, content.as_bytes())\n                    .unwrap();\n            assert_eq!(source_config.num_pipelines.get(), 3);\n        }\n        {\n            let content = r#\"\n            {\n                \"version\": \"0.7\",\n                \"source_id\": \"hdfs-logs-pulsar-source\",\n                \"desired_num_pipelines\": 3,\n                \"max_num_pipelines_per_indexer\": 3,\n                \"source_type\": \"pulsar\",\n                \"params\": {\n                    \"topics\": [\"my-topic\"],\n                    \"address\": \"http://localhost:6650\"\n                }\n            }\n            \"#;\n            load_source_config_from_user_config(ConfigFormat::Json, content.as_bytes())\n                .unwrap_err();\n            // TODO: uncomment asserts once distributed indexing is activated for pulsar.\n            // assert_eq!(source_config.num_pipelines(), 3);\n            // assert_eq!(source_config.max_num_pipelines_per_indexer(), 3);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_load_valid_distributed_source_config() {\n        {\n            let content = r#\"\n            {\n                \"version\": \"0.8\",\n                \"source_id\": \"hdfs-logs-kafka-source\",\n                
\"num_pipelines\": 3,\n                \"source_type\": \"kafka\",\n                \"params\": {\n                    \"topic\": \"my-topic\"\n                }\n            }\n            \"#;\n            let source_config =\n                load_source_config_from_user_config(ConfigFormat::Json, content.as_bytes())\n                    .unwrap();\n            assert_eq!(source_config.num_pipelines.get(), 3);\n        }\n    }\n\n    #[test]\n    fn test_file_source_params_serde() {\n        {\n            let yaml = r#\"\n                filepath: source-path.json\n            \"#;\n            let file_params_deserialized = serde_yaml::from_str::<FileSourceParams>(yaml).unwrap();\n            let uri = Uri::from_str(\"source-path.json\").unwrap();\n            assert_eq!(file_params_deserialized, FileSourceParams::Filepath(uri));\n            let file_params_reserialized = serde_json::to_value(file_params_deserialized).unwrap();\n            file_params_reserialized\n                .get(\"filepath\")\n                .unwrap()\n                .as_str()\n                .unwrap()\n                .contains(\"source-path.json\");\n        }\n        {\n            let yaml = r#\"\n                notifications:\n                  - type: sqs\n                    queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/queue-name\n                    message_type: s3_notification\n            \"#;\n            let file_params_deserialized = serde_yaml::from_str::<FileSourceParams>(yaml).unwrap();\n            assert_eq!(\n                file_params_deserialized,\n                FileSourceParams::Notifications(FileSourceNotification::Sqs(FileSourceSqs {\n                    queue_url: \"https://sqs.us-east-1.amazonaws.com/123456789012/queue-name\"\n                        .to_string(),\n                    message_type: FileSourceMessageType::S3Notification,\n                    deduplication_window_duration_secs: 
default_deduplication_window_duration_secs(\n                    ),\n                    deduplication_window_max_messages: default_deduplication_window_max_messages(),\n                    deduplication_cleanup_interval_secs:\n                        default_deduplication_cleanup_interval_secs()\n                })),\n            );\n            let file_params_reserialized = serde_json::to_value(&file_params_deserialized).unwrap();\n            assert_eq!(\n                file_params_reserialized,\n                json!({\"notifications\": [{\n                    \"type\": \"sqs\",\n                    \"queue_url\": \"https://sqs.us-east-1.amazonaws.com/123456789012/queue-name\",\n                    \"message_type\": \"s3_notification\",\n                    \"deduplication_window_duration_secs\": default_deduplication_window_duration_secs(),\n                    \"deduplication_window_max_messages\": default_deduplication_window_max_messages(),\n                    \"deduplication_cleanup_interval_secs\": default_deduplication_cleanup_interval_secs(),\n                }]})\n            );\n        }\n        {\n            let yaml = r#\"\n                filepath: source-path.json\n                notifications:\n                  - type: sqs\n                    queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/queue-name\n                    message_type: s3_notification\n            \"#;\n            let error = serde_yaml::from_str::<FileSourceParams>(yaml).unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"File source parameters `notifications` and `filepath` are mutually exclusive\"\n            );\n        }\n        {\n            let yaml = r#\"\n                notifications:\n                  - type: sqs\n                    queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/queue1\n                    message_type: s3_notification\n                  - type: sqs\n                    
queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/queue2\n                    message_type: s3_notification\n            \"#;\n            let error = serde_yaml::from_str::<FileSourceParams>(yaml).unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"Only one notification can be specified for now\"\n            );\n        }\n        {\n            let json = r#\"\n            {\n                \"notifications\": [\n                    {\n                        \"queue_url\": \"https://sqs.us-east-1.amazonaws.com/123456789012/queue\",\n                        \"message_type\": \"s3_notification\"\n                    }\n                ]\n            }\n            \"#;\n            let error = serde_json::from_str::<FileSourceParams>(json).unwrap_err();\n            assert!(error.to_string().contains(\"missing field `type`\"));\n        }\n    }\n\n    #[test]\n    fn test_kinesis_source_params_serialization() {\n        {\n            let params = KinesisSourceParams {\n                stream_name: \"my-stream\".to_string(),\n                region_or_endpoint: None,\n                enable_backfill_mode: false,\n            };\n            let params_yaml = serde_yaml::to_string(&params).unwrap();\n\n            assert_eq!(\n                serde_yaml::from_str::<KinesisSourceParams>(&params_yaml).unwrap(),\n                params,\n            )\n        }\n        {\n            let params = KinesisSourceParams {\n                stream_name: \"my-stream\".to_string(),\n                region_or_endpoint: Some(RegionOrEndpoint::Region(\"us-west-1\".to_string())),\n                enable_backfill_mode: false,\n            };\n            let params_yaml = serde_yaml::to_string(&params).unwrap();\n\n            assert_eq!(\n                serde_yaml::from_str::<KinesisSourceParams>(&params_yaml).unwrap(),\n                params,\n            )\n        }\n        {\n            let params = 
KinesisSourceParams {\n                stream_name: \"my-stream\".to_string(),\n                region_or_endpoint: Some(RegionOrEndpoint::Endpoint(\n                    \"https://localhost:4566\".to_string(),\n                )),\n                enable_backfill_mode: false,\n            };\n            let params_yaml = serde_yaml::to_string(&params).unwrap();\n\n            assert_eq!(\n                ConfigFormat::Yaml\n                    .parse::<KinesisSourceParams>(params_yaml.as_bytes())\n                    .unwrap(),\n                params,\n            )\n        }\n    }\n\n    #[test]\n    fn test_kinesis_source_params_deserialization() {\n        {\n            let yaml = r#\"\n                    stream_name: my-stream\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<KinesisSourceParams>(yaml).unwrap(),\n                KinesisSourceParams {\n                    stream_name: \"my-stream\".to_string(),\n                    region_or_endpoint: None,\n                    enable_backfill_mode: false,\n                }\n            );\n        }\n        {\n            let yaml = r#\"\n                    stream_name: my-stream\n                    region: us-west-1\n                    enable_backfill_mode: true\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<KinesisSourceParams>(yaml).unwrap(),\n                KinesisSourceParams {\n                    stream_name: \"my-stream\".to_string(),\n                    region_or_endpoint: Some(RegionOrEndpoint::Region(\"us-west-1\".to_string())),\n                    enable_backfill_mode: true,\n                }\n            );\n        }\n        {\n            let yaml = r#\"\n                    stream_name: my-stream\n                    region: us-west-1\n                    endpoint: https://localhost:4566\n                \"#;\n            let error = serde_yaml::from_str::<KinesisSourceParams>(yaml).unwrap_err();\n       
     assert!(error.to_string().starts_with(\"Kinesis source parameters \"));\n        }\n    }\n\n    #[test]\n    fn test_pulsar_source_params_deserialization() {\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://localhost:6560\n                    consumer_name: my-pulsar-consumer\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://localhost:6560\".to_string(),\n                    consumer_name: \"my-pulsar-consumer\".to_string(),\n                    authentication: None,\n                }\n            );\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://localhost:6560\n                    consumer_name: my-pulsar-consumer\n                    authentication:\n                        token: my-token\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://localhost:6560\".to_string(),\n                    consumer_name: \"my-pulsar-consumer\".to_string(),\n                    authentication: Some(PulsarSourceAuth::Token(\"my-token\".to_string())),\n                }\n            );\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://localhost:6560\n                    consumer_name: my-pulsar-consumer\n                    authentication:\n                        oauth2:\n                            issuer_url: https://my-issuer:9000/path\n             
               credentials_url: https://my-credentials.com/path\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://localhost:6560\".to_string(),\n                    consumer_name: \"my-pulsar-consumer\".to_string(),\n                    authentication: Some(PulsarSourceAuth::Oauth2 {\n                        issuer_url: \"https://my-issuer:9000/path\".to_string(),\n                        credentials_url: \"https://my-credentials.com/path\".to_string(),\n                        audience: None,\n                        scope: None,\n                    }),\n                }\n            );\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://localhost:6560\n                    consumer_name: my-pulsar-consumer\n                    authentication:\n                        oauth2:\n                            issuer_url: https://my-issuer:9000/path\n                            credentials_url: https://my-credentials.com/path\n                            audience: my-audience\n                            scope: \"read+write\"\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://localhost:6560\".to_string(),\n                    consumer_name: \"my-pulsar-consumer\".to_string(),\n                    authentication: Some(PulsarSourceAuth::Oauth2 {\n                        issuer_url: \"https://my-issuer:9000/path\".to_string(),\n                        credentials_url: \"https://my-credentials.com/path\".to_string(),\n                        audience: 
Some(\"my-audience\".to_string()),\n                        scope: Some(\"read+write\".to_string()),\n                    }),\n                }\n            );\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                \"#;\n            serde_yaml::from_str::<PulsarSourceParams>(yaml)\n                .expect_err(\"Parameters should error on missing address\");\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://localhost:6560\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://localhost:6560\".to_string(),\n                    consumer_name: default_consumer_name(),\n                    authentication: None,\n                }\n            );\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: invalid-address\n                \"#;\n            serde_yaml::from_str::<PulsarSourceParams>(yaml)\n                .expect_err(\"Pulsar config should reject invalid address\");\n        }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://some-host:80/valid-path\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://some-host:80/valid-path\".to_string(),\n                    consumer_name: default_consumer_name(),\n                    authentication: None,\n                }\n            );\n     
   }\n\n        {\n            let yaml = r#\"\n                    topics:\n                        - my-topic\n                    address: pulsar://2345:0425:2CA1:0000:0000:0567:5673:23b5:80/valid-path\n                \"#;\n            assert_eq!(\n                serde_yaml::from_str::<PulsarSourceParams>(yaml).unwrap(),\n                PulsarSourceParams {\n                    topics: vec![\"my-topic\".to_string()],\n                    address: \"pulsar://2345:0425:2CA1:0000:0000:0567:5673:23b5:80/valid-path\"\n                        .to_string(),\n                    consumer_name: default_consumer_name(),\n                    authentication: None,\n                }\n            );\n        }\n    }\n\n    #[cfg(feature = \"vrl\")]\n    #[tokio::test]\n    async fn test_load_ingest_api_source_config() {\n        let source_config_filepath = get_source_config_filepath(\"ingest-api-source.json\");\n        let file_content = std::fs::read(source_config_filepath).unwrap();\n        let source_config: SourceConfig = ConfigFormat::Json.parse(&file_content).unwrap();\n        let expected_source_config = SourceConfig {\n            source_id: INGEST_API_SOURCE_ID.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::IngestApi,\n            transform_config: Some(TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: default_timezone(),\n            }),\n            input_format: SourceInputFormat::Json,\n        };\n        assert_eq!(source_config, expected_source_config);\n        assert_eq!(source_config.num_pipelines.get(), 1);\n    }\n\n    #[test]\n    fn test_transform_config_serialization() {\n        {\n            let transform_config = TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: \"local\".to_string(),\n            
};\n            let transform_config_yaml = serde_yaml::to_string(&transform_config).unwrap();\n            assert_eq!(\n                serde_yaml::from_str::<TransformConfig>(&transform_config_yaml).unwrap(),\n                transform_config,\n            );\n        }\n        {\n            let transform_config = TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: default_timezone(),\n            };\n            let transform_config_yaml = serde_yaml::to_string(&transform_config).unwrap();\n            assert_eq!(\n                serde_yaml::from_str::<TransformConfig>(&transform_config_yaml).unwrap(),\n                transform_config,\n            );\n        }\n    }\n\n    #[test]\n    fn test_transform_config_deserialization() {\n        {\n            let transform_config_yaml = r#\"\n                script: .message = downcase(string!(.message))\n            \"#;\n            let transform_config =\n                serde_yaml::from_str::<TransformConfig>(transform_config_yaml).unwrap();\n\n            let expected_transform_config = TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: default_timezone(),\n            };\n            assert_eq!(transform_config, expected_transform_config);\n        }\n        {\n            let transform_config_yaml = r#\"\n                script: .message = downcase(string!(.message))\n                timezone: Turkey\n            \"#;\n            let transform_config =\n                serde_yaml::from_str::<TransformConfig>(transform_config_yaml).unwrap();\n\n            let expected_transform_config = TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: \"Turkey\".to_string(),\n            };\n            assert_eq!(transform_config, expected_transform_config);\n        }\n    }\n\n    
#[cfg(feature = \"vrl\")]\n    #[test]\n    fn test_transform_config_compile_vrl_script() {\n        {\n            let transform_config = TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: \"Turkey\".to_string(),\n            };\n            transform_config.compile_vrl_script().unwrap();\n        }\n        {\n            let transform_config = TransformConfig {\n                vrl_script: r#\"\n                . = parse_json!(string!(.message))\n                .timestamp = to_unix_timestamp(timestamp!(.timestamp))\n                del(.username)\n                .message = downcase(string!(.message))\n                \"#\n                .to_string(),\n                timezone: default_timezone(),\n            };\n            transform_config.compile_vrl_script().unwrap();\n        }\n        {\n            let transform_config = TransformConfig {\n                vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                timezone: \"foo\".to_string(),\n            };\n            let error = transform_config.compile_vrl_script().unwrap_err();\n            assert!(error.to_string().starts_with(\"failed to parse timezone\"));\n        }\n        {\n            let transform_config = TransformConfig {\n                vrl_script: \"foo\".to_string(),\n                timezone: \"Turkey\".to_string(),\n            };\n            let error = transform_config.compile_vrl_script().unwrap_err();\n            assert!(error.to_string().starts_with(\"failed to compile\"));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_source_config_plain_text_input_format() {\n        let file_content = r#\"{\n            \"version\": \"0.7\",\n            \"source_id\": \"logs-file-source\",\n            \"desired_num_pipelines\": 1,\n            \"max_num_pipelines_per_indexer\": 1,\n            \"source_type\": \"file\",\n            \"params\": {\n              
\"filepath\": \"s3://mybucket/test_non_json_corpus.txt\"\n            },\n            \"input_format\": \"plain_text\"\n        }\"#;\n        let source_config =\n            load_source_config_from_user_config(ConfigFormat::Json, file_content.as_bytes())\n                .unwrap();\n        assert_eq!(source_config.input_format, SourceInputFormat::PlainText);\n    }\n\n    #[tokio::test]\n    async fn test_update_kafka_source_config() {\n        let source_config_filepath = get_source_config_filepath(\"kafka-source.json\");\n        let file_content = std::fs::read(&source_config_filepath).unwrap();\n        let source_config_uri = Uri::from_str(&source_config_filepath).unwrap();\n        let config_format = ConfigFormat::sniff_from_uri(&source_config_uri).unwrap();\n        {\n            let mut existing_source_config =\n                load_source_config_from_user_config(config_format, &file_content).unwrap();\n            existing_source_config.num_pipelines = NonZero::new(4).unwrap();\n            let new_source_config =\n                load_source_config_update(config_format, &file_content, &existing_source_config)\n                    .unwrap();\n\n            let expected_source_config = SourceConfig {\n                source_id: \"hdfs-logs-kafka-source\".to_string(),\n                num_pipelines: NonZeroUsize::new(2).unwrap(),\n                enabled: true,\n                source_params: SourceParams::Kafka(KafkaSourceParams {\n                    topic: \"cloudera-cluster-logs\".to_string(),\n                    client_log_level: None,\n                    client_params: json! 
{{\"bootstrap.servers\": \"localhost:9092\"}},\n                    enable_backfill_mode: false,\n                }),\n                transform_config: Some(TransformConfig {\n                    vrl_script: \".message = downcase(string!(.message))\".to_string(),\n                    timezone: \"local\".to_string(),\n                }),\n                input_format: SourceInputFormat::Json,\n            };\n            assert_eq!(new_source_config, expected_source_config);\n            assert_eq!(new_source_config.num_pipelines.get(), 2);\n        }\n        {\n            // the source type cannot be updated\n            let mut existing_source_config =\n                load_source_config_from_user_config(config_format, &file_content).unwrap();\n            existing_source_config.source_params = SourceParams::Kinesis(KinesisSourceParams {\n                stream_name: \"my-stream\".to_string(),\n                region_or_endpoint: None,\n                enable_backfill_mode: false,\n            });\n            load_source_config_update(config_format, &file_content, &existing_source_config)\n                .unwrap_err();\n        }\n        {\n            // the topic cannot be updated\n            let mut existing_source_config =\n                load_source_config_from_user_config(config_format, &file_content).unwrap();\n            let SourceParams::Kafka(kafka_params) = &mut existing_source_config.source_params\n            else {\n                panic!(\"expected Kafka source params\");\n            };\n            kafka_params.topic = \"other_topic_name\".to_string();\n            load_source_config_update(config_format, &file_content, &existing_source_config)\n                .unwrap_err();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/source_config/serialize.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\n\nuse anyhow::{bail, ensure};\nuse quickwit_proto::types::SourceId;\nuse serde::{Deserialize, Serialize};\n\nuse super::{RESERVED_SOURCE_IDS, TransformConfig};\nuse crate::{\n    ConfigFormat, FileSourceParams, SourceConfig, SourceInputFormat, SourceParams,\n    validate_identifier,\n};\n\ntype SourceConfigForSerialization = SourceConfigV0_8;\n\n#[derive(Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\n#[serde(tag = \"version\")]\npub enum VersionedSourceConfig {\n    #[serde(rename = \"0.9\")]\n    #[serde(alias = \"0.8\")]\n    V0_8(SourceConfigV0_8),\n    // Retro compatibility.\n    #[serde(rename = \"0.7\")]\n    V0_7(SourceConfigV0_7),\n}\n\nimpl From<VersionedSourceConfig> for SourceConfigForSerialization {\n    fn from(versioned_source_config: VersionedSourceConfig) -> Self {\n        match versioned_source_config {\n            VersionedSourceConfig::V0_7(v0_7) => v0_7.into(),\n            VersionedSourceConfig::V0_8(v0_8) => v0_8,\n        }\n    }\n}\n\n/// Parses and validates an [`SourceConfig`] as supplied by a user with a given [`ConfigFormat`],\n/// and config content.\npub fn load_source_config_from_user_config(\n    config_format: ConfigFormat,\n    config_content: &[u8],\n) -> anyhow::Result<SourceConfig> {\n    let versioned_source_config: VersionedSourceConfig = 
config_format.parse(config_content)?;\n    let source_config_for_serialization: SourceConfigForSerialization =\n        versioned_source_config.into();\n    source_config_for_serialization.validate_and_build()\n}\n\n/// Parses and validates a [`SourceConfig`] update.\n///\n/// Ensures that the new configuration is valid in itself and compared to the\n/// current source config. If the new configuration omits some fields, the\n/// default values will be used, not those of the current source config.\npub fn load_source_config_update(\n    config_format: ConfigFormat,\n    config_content: &[u8],\n    current_source_config: &SourceConfig,\n) -> anyhow::Result<SourceConfig> {\n    let versioned_source_config: VersionedSourceConfig = config_format.parse(config_content)?;\n    let source_config_for_serialization: SourceConfigForSerialization =\n        versioned_source_config.into();\n    let new_source_config = source_config_for_serialization.validate_and_build()?;\n\n    ensure!(\n        current_source_config.source_id == new_source_config.source_id,\n        \"existing `source_id` {} does not match updated `source_id` {}\",\n        current_source_config.source_id,\n        new_source_config.source_id\n    );\n\n    current_source_config\n        .source_params\n        .validate_update(&new_source_config.source_params)?;\n\n    Ok(new_source_config)\n}\n\nimpl SourceConfigForSerialization {\n    /// Checks the validity of the `SourceConfig` as a \"deserializable source\".\n    ///\n    /// Two remarks:\n    /// - This does not check connectivity; it just validates configuration without performing any\n    ///   IO. See `check_connectivity(..)`.\n    /// - This is used each time the `SourceConfig` is deserialized (at creation but also during\n    ///   communications with the metastore). 
When ingesting from stdin, we programmatically create\n    ///   an invalid `SourceConfig` and only use it locally.\n    fn validate_and_build(self) -> anyhow::Result<SourceConfig> {\n        if !RESERVED_SOURCE_IDS.contains(&self.source_id.as_str()) {\n            validate_identifier(\"source\", &self.source_id)?;\n        }\n        let num_pipelines = NonZeroUsize::new(self.num_pipelines)\n            .ok_or_else(|| anyhow::anyhow!(\"`desired_num_pipelines` must be strictly positive\"))?;\n        match &self.source_params {\n            SourceParams::Stdin => {\n                bail!(\n                    \"stdin can only be used as source through the CLI command `quickwit tool \\\n                     local-ingest`\"\n                );\n            }\n            SourceParams::File(_)\n            | SourceParams::Kafka(_)\n            | SourceParams::Kinesis(_)\n            | SourceParams::Pulsar(_) => {\n                // TODO consider any validation opportunity\n            }\n            SourceParams::PubSub(_)\n            | SourceParams::Ingest\n            | SourceParams::IngestApi\n            | SourceParams::IngestCli\n            | SourceParams::Vec(_)\n            | SourceParams::Void(_) => {}\n        }\n        match &self.source_params {\n            SourceParams::PubSub(_)\n            | SourceParams::Kafka(_)\n            | SourceParams::File(FileSourceParams::Notifications(_)) => {}\n            _ => {\n                if self.num_pipelines > 1 {\n                    bail!(\"Quickwit currently supports multiple pipelines only for GCP PubSub or Kafka sources. 
open an issue https://github.com/quickwit-oss/quickwit/issues if you need the feature for other source types\");\n                }\n            }\n        }\n\n        if let Some(transform_config) = &self.transform {\n            if matches!(\n                self.input_format,\n                SourceInputFormat::OtlpLogsJson\n                    | SourceInputFormat::OtlpLogsProtobuf\n                    | SourceInputFormat::OtlpTracesJson\n                    | SourceInputFormat::OtlpTracesProtobuf\n            ) {\n                bail!(\"VRL transforms are not supported for OTLP input formats\");\n            }\n            transform_config.validate_vrl_script()?;\n        }\n\n        Ok(SourceConfig {\n            source_id: self.source_id,\n            num_pipelines,\n            enabled: self.enabled,\n            source_params: self.source_params,\n            transform_config: self.transform,\n            input_format: self.input_format,\n        })\n    }\n}\n\nimpl From<SourceConfig> for SourceConfigV0_8 {\n    fn from(source_config: SourceConfig) -> Self {\n        SourceConfigV0_8 {\n            source_id: source_config.source_id,\n            num_pipelines: source_config.num_pipelines.get(),\n            enabled: source_config.enabled,\n            source_params: source_config.source_params,\n            transform: source_config.transform_config,\n            input_format: source_config.input_format,\n        }\n    }\n}\n\nimpl From<SourceConfig> for VersionedSourceConfig {\n    fn from(source_config: SourceConfig) -> Self {\n        VersionedSourceConfig::V0_8(source_config.into())\n    }\n}\n\nimpl TryFrom<VersionedSourceConfig> for SourceConfig {\n    type Error = anyhow::Error;\n\n    fn try_from(versioned_source_config: VersionedSourceConfig) -> anyhow::Result<Self> {\n        let v1: SourceConfigV0_8 = versioned_source_config.into();\n        v1.validate_and_build()\n    }\n}\n\nfn default_max_num_pipelines_per_indexer() -> usize {\n    
1\n}\n\nfn default_num_pipelines() -> usize {\n    1\n}\n\nfn default_source_enabled() -> bool {\n    true\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct SourceConfigV0_7 {\n    #[schema(value_type = String)]\n    pub source_id: SourceId,\n\n    #[serde(\n        default = \"default_max_num_pipelines_per_indexer\",\n        alias = \"num_pipelines\"\n    )]\n    pub max_num_pipelines_per_indexer: usize,\n\n    #[serde(default = \"default_num_pipelines\")]\n    pub desired_num_pipelines: usize,\n\n    // Denotes if this source is enabled.\n    #[serde(default = \"default_source_enabled\")]\n    pub enabled: bool,\n\n    #[serde(flatten)]\n    pub source_params: SourceParams,\n\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub transform: Option<TransformConfig>,\n\n    // Denotes the input data format.\n    #[serde(default)]\n    pub input_format: SourceInputFormat,\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct SourceConfigV0_8 {\n    #[schema(value_type = String)]\n    pub source_id: SourceId,\n\n    #[serde(default = \"default_num_pipelines\")]\n    pub num_pipelines: usize,\n\n    // Denotes if this source is enabled.\n    #[serde(default = \"default_source_enabled\")]\n    pub enabled: bool,\n\n    #[serde(flatten)]\n    pub source_params: SourceParams,\n\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub transform: Option<TransformConfig>,\n\n    // Denotes the input data format.\n    #[serde(default)]\n    pub input_format: SourceInputFormat,\n}\n\nimpl From<SourceConfigV0_7> for SourceConfigV0_8 {\n    fn from(source_config_v0_7: SourceConfigV0_7) -> Self {\n        let SourceConfigV0_7 {\n            source_id,\n            max_num_pipelines_per_indexer: _,\n            desired_num_pipelines,\n            enabled,\n            source_params,\n            
transform,\n            input_format,\n        } = source_config_v0_7;\n        SourceConfigV0_8 {\n            source_id,\n            num_pipelines: desired_num_pipelines,\n            enabled,\n            source_params,\n            transform,\n            input_format,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/storage_config.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Deref;\nuse std::sync::OnceLock;\nuse std::{env, fmt};\n\nuse anyhow::ensure;\nuse itertools::Itertools;\nuse quickwit_common::get_bool_from_env;\nuse serde::{Deserialize, Serialize};\nuse serde_with::{EnumMap, serde_as};\n\n/// Lists the storage backends supported by Quickwit.\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum StorageBackend {\n    /// Azure Blob Storage\n    Azure,\n    /// Local file system\n    File,\n    /// Google Cloud Storage\n    Google,\n    /// In-memory storage, for testing purposes\n    Ram,\n    /// Amazon S3 or S3-compatible storage\n    S3,\n}\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum StorageBackendFlavor {\n    /// Digital Ocean Spaces\n    #[serde(alias = \"do\")]\n    DigitalOcean,\n    /// Garage\n    Garage,\n    /// Google Cloud Storage\n    #[serde(alias = \"gcp\", alias = \"google\")]\n    Gcs,\n    /// MinIO\n    #[serde(rename = \"minio\")]\n    MinIO,\n}\n\n/// Holds the storage configurations defined in the `storage` section of node config files.\n///\n/// ```yaml\n/// storage:\n///   azure:\n///     account: test-account\n///\n///   s3:\n///     endpoint: http://localhost:4566\n/// 
```\n#[serde_as]\n#[derive(Debug, Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\npub struct StorageConfigs(#[serde_as(as = \"EnumMap\")] Vec<StorageConfig>);\n\nimpl StorageConfigs {\n    pub fn new(storage_configs: Vec<StorageConfig>) -> Self {\n        Self(storage_configs)\n    }\n\n    pub fn redact(&mut self) {\n        for storage_config in self.0.iter_mut() {\n            storage_config.redact();\n        }\n    }\n\n    pub fn apply_flavors(&mut self) {\n        for storage_config in self.0.iter_mut() {\n            if let StorageConfig::S3(s3_storage_config) = storage_config {\n                s3_storage_config.apply_flavor();\n            }\n        }\n    }\n\n    pub fn validate(&self) -> anyhow::Result<()> {\n        let backends: Vec<StorageBackend> = self\n            .0\n            .iter()\n            .map(|storage_config| storage_config.backend())\n            .sorted()\n            .collect();\n\n        for (left, right) in backends.iter().zip(backends.iter().skip(1)) {\n            ensure!(\n                left != right,\n                \"{left:?} storage config is defined multiple times\",\n            );\n        }\n        Ok(())\n    }\n\n    pub fn find_azure(&self) -> Option<&AzureStorageConfig> {\n        self.0\n            .iter()\n            .find_map(|storage_config| match storage_config {\n                StorageConfig::Azure(azure_storage_config) => Some(azure_storage_config),\n                _ => None,\n            })\n    }\n\n    pub fn find_google(&self) -> Option<&GoogleCloudStorageConfig> {\n        self.0\n            .iter()\n            .find_map(|storage_config| match storage_config {\n                StorageConfig::Google(google_storage_config) => Some(google_storage_config),\n                _ => None,\n            })\n    }\n\n    pub fn find_file(&self) -> Option<&FileStorageConfig> {\n        self.0\n            .iter()\n            .find_map(|storage_config| match storage_config {\n                
StorageConfig::File(file_storage_config) => Some(file_storage_config),\n                _ => None,\n            })\n    }\n\n    pub fn find_ram(&self) -> Option<&RamStorageConfig> {\n        self.0\n            .iter()\n            .find_map(|storage_config| match storage_config {\n                StorageConfig::Ram(ram_storage_config) => Some(ram_storage_config),\n                _ => None,\n            })\n    }\n\n    pub fn find_s3(&self) -> Option<&S3StorageConfig> {\n        self.0\n            .iter()\n            .find_map(|storage_config| match storage_config {\n                StorageConfig::S3(s3_storage_config) => Some(s3_storage_config),\n                _ => None,\n            })\n    }\n}\n\nimpl Deref for StorageConfigs {\n    type Target = Vec<StorageConfig>;\n\n    fn deref(&self) -> &Self::Target {\n        &self.0\n    }\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum StorageConfig {\n    Azure(AzureStorageConfig),\n    File(FileStorageConfig),\n    Ram(RamStorageConfig),\n    S3(S3StorageConfig),\n    Google(GoogleCloudStorageConfig),\n}\n\nimpl StorageConfig {\n    pub fn redact(&mut self) {\n        match self {\n            Self::Azure(azure_storage_config) => azure_storage_config.redact(),\n            Self::File(_) | Self::Ram(_) | Self::Google(_) => {}\n            Self::S3(s3_storage_config) => s3_storage_config.redact(),\n        }\n    }\n\n    pub fn as_azure(&self) -> Option<&AzureStorageConfig> {\n        match self {\n            Self::Azure(azure_storage_config) => Some(azure_storage_config),\n            _ => None,\n        }\n    }\n\n    pub fn as_file(&self) -> Option<&FileStorageConfig> {\n        match self {\n            Self::File(file_storage_config) => Some(file_storage_config),\n            _ => None,\n        }\n    }\n\n    pub fn as_ram(&self) -> Option<&RamStorageConfig> {\n        match self {\n            Self::Ram(ram_storage_config) => 
Some(ram_storage_config),\n            _ => None,\n        }\n    }\n\n    pub fn as_s3(&self) -> Option<&S3StorageConfig> {\n        match self {\n            Self::S3(s3_storage_config) => Some(s3_storage_config),\n            _ => None,\n        }\n    }\n\n    pub fn as_google(&self) -> Option<&GoogleCloudStorageConfig> {\n        match self {\n            Self::Google(google_cloud_storage_config) => Some(google_cloud_storage_config),\n            _ => None,\n        }\n    }\n}\n\nimpl From<AzureStorageConfig> for StorageConfig {\n    fn from(azure_storage_config: AzureStorageConfig) -> Self {\n        Self::Azure(azure_storage_config)\n    }\n}\n\nimpl From<FileStorageConfig> for StorageConfig {\n    fn from(file_storage_config: FileStorageConfig) -> Self {\n        Self::File(file_storage_config)\n    }\n}\n\nimpl From<RamStorageConfig> for StorageConfig {\n    fn from(ram_storage_config: RamStorageConfig) -> Self {\n        Self::Ram(ram_storage_config)\n    }\n}\n\nimpl From<S3StorageConfig> for StorageConfig {\n    fn from(s3_storage_config: S3StorageConfig) -> Self {\n        Self::S3(s3_storage_config)\n    }\n}\n\nimpl From<GoogleCloudStorageConfig> for StorageConfig {\n    fn from(google_cloud_storage_config: GoogleCloudStorageConfig) -> Self {\n        Self::Google(google_cloud_storage_config)\n    }\n}\n\nimpl StorageConfig {\n    pub fn backend(&self) -> StorageBackend {\n        match self {\n            Self::Azure(_) => StorageBackend::Azure,\n            Self::File(_) => StorageBackend::File,\n            Self::Ram(_) => StorageBackend::Ram,\n            Self::S3(_) => StorageBackend::S3,\n            Self::Google(_) => StorageBackend::Google,\n        }\n    }\n}\n\n#[derive(Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct AzureStorageConfig {\n    #[serde(default)]\n    #[serde(rename = \"account\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub account_name: 
Option<String>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub access_key: Option<String>,\n}\n\nimpl AzureStorageConfig {\n    pub const AZURE_STORAGE_ACCOUNT_ENV_VAR: &'static str = \"QW_AZURE_STORAGE_ACCOUNT\";\n\n    pub const AZURE_STORAGE_ACCESS_KEY_ENV_VAR: &'static str = \"QW_AZURE_STORAGE_ACCESS_KEY\";\n\n    /// Redacts the access key.\n    pub fn redact(&mut self) {\n        if let Some(access_key) = self.access_key.as_mut() {\n            *access_key = \"***redacted***\".to_string();\n        }\n    }\n\n    /// Attempts to find the storage account name in the environment variable\n    /// `QW_AZURE_STORAGE_ACCOUNT` or node config.\n    pub fn resolve_account_name(&self) -> Option<String> {\n        env::var(Self::AZURE_STORAGE_ACCOUNT_ENV_VAR)\n            .ok()\n            .or_else(|| self.account_name.clone())\n    }\n\n    /// Attempts to find the storage account access key in the environment variable\n    /// `QW_AZURE_STORAGE_ACCESS_KEY` or node config.\n    pub fn resolve_access_key(&self) -> Option<String> {\n        env::var(Self::AZURE_STORAGE_ACCESS_KEY_ENV_VAR)\n            .ok()\n            .or_else(|| self.access_key.clone())\n    }\n}\n\nimpl fmt::Debug for AzureStorageConfig {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"AzureStorageConfig\")\n            .field(\"account_name\", &self.account_name)\n            .field(\n                \"access_key\",\n                &self.access_key.as_ref().map(|_| \"***redacted***\"),\n            )\n            .finish()\n    }\n}\n\n#[derive(Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct S3StorageConfig {\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub flavor: Option<StorageBackendFlavor>,\n    #[serde(default)]\n    pub access_key_id: Option<String>,\n    #[serde(default)]\n    pub secret_access_key: Option<String>,\n 
   #[serde(default)]\n    pub region: Option<String>,\n    #[serde(default)]\n    pub endpoint: Option<String>,\n    #[serde(default)]\n    pub force_path_style_access: bool,\n    #[serde(alias = \"disable_multi_object_delete_requests\")]\n    #[serde(default)]\n    pub disable_multi_object_delete: bool,\n    #[serde(default)]\n    pub disable_multipart_upload: bool,\n}\n\nimpl S3StorageConfig {\n    fn apply_flavor(&mut self) {\n        match self.flavor {\n            Some(StorageBackendFlavor::DigitalOcean) => {\n                self.force_path_style_access = true;\n                self.disable_multi_object_delete = true;\n            }\n            Some(StorageBackendFlavor::Garage) => {\n                self.region = Some(\"garage\".to_string());\n                self.force_path_style_access = true;\n            }\n            Some(StorageBackendFlavor::Gcs) => {\n                self.disable_multi_object_delete = true;\n                self.disable_multipart_upload = true;\n            }\n            Some(StorageBackendFlavor::MinIO) => {\n                self.region = Some(\"minio\".to_string());\n                self.force_path_style_access = true;\n            }\n            _ => {}\n        }\n    }\n\n    pub fn redact(&mut self) {\n        if let Some(secret_access_key) = self.secret_access_key.as_mut() {\n            *secret_access_key = \"***redacted***\".to_string();\n        }\n    }\n\n    pub fn endpoint(&self) -> Option<String> {\n        env::var(\"QW_S3_ENDPOINT\")\n            .ok()\n            .or_else(|| self.endpoint.clone())\n    }\n\n    pub fn force_path_style_access(&self) -> Option<bool> {\n        static FORCE_PATH_STYLE: OnceLock<Option<bool>> = OnceLock::new();\n        *FORCE_PATH_STYLE.get_or_init(|| {\n            let force_path_style_access = get_bool_from_env(\n                \"QW_S3_FORCE_PATH_STYLE_ACCESS\",\n                self.force_path_style_access,\n            );\n            Some(force_path_style_access)\n        
})\n    }\n}\n\nimpl fmt::Debug for S3StorageConfig {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"S3StorageConfig\")\n            .field(\"access_key_id\", &self.access_key_id)\n            .field(\n                \"secret_access_key\",\n                &self.secret_access_key.as_ref().map(|_| \"***redacted***\"),\n            )\n            .field(\"region\", &self.region)\n            .field(\"endpoint\", &self.endpoint)\n            .field(\"force_path_style_access\", &self.force_path_style_access)\n            .field(\n                \"disable_multi_object_delete\",\n                &self.disable_multi_object_delete,\n            )\n            .finish()\n    }\n}\n\n#[derive(Debug, Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct FileStorageConfig;\n\n#[derive(Debug, Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct RamStorageConfig;\n\n#[derive(Debug, Clone, Default, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct GoogleCloudStorageConfig {\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub credential_path: Option<String>,\n}\n\nimpl GoogleCloudStorageConfig {\n    pub const GOOGLE_CLOUD_STORAGE_CREDENTIAL_PATH_ENV_VAR: &'static str =\n        \"QW_GOOGLE_CLOUD_STORAGE_CREDENTIAL_PATH\";\n\n    /// Attempts to find the credential path in the environment variable\n    /// `QW_GOOGLE_CLOUD_STORAGE_CREDENTIAL_PATH` or the config.\n    pub fn resolve_credential_path(&self) -> Option<String> {\n        env::var(Self::GOOGLE_CLOUD_STORAGE_CREDENTIAL_PATH_ENV_VAR)\n            .ok()\n            .or_else(|| self.credential_path.clone())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_storage_configs_serde() {\n        let storage_configs_yaml = \"\";\n        let storage_configs: StorageConfigs = 
serde_yaml::from_str(storage_configs_yaml).unwrap();\n        assert!(storage_configs.is_empty());\n\n        let storage_configs_yaml = r#\"\n                azure:\n                    account: test-account\n                s3:\n                    endpoint: http://localhost:4566\n            \"#;\n        let storage_configs: StorageConfigs = serde_yaml::from_str(storage_configs_yaml).unwrap();\n\n        let expected_storage_configs = StorageConfigs(vec![\n            AzureStorageConfig {\n                account_name: Some(\"test-account\".to_string()),\n                ..Default::default()\n            }\n            .into(),\n            S3StorageConfig {\n                endpoint: Some(\"http://localhost:4566\".to_string()),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        assert_eq!(storage_configs, expected_storage_configs);\n    }\n\n    #[test]\n    fn test_storage_configs_apply_flavors() {\n        let mut storage_configs = StorageConfigs(vec![\n            S3StorageConfig {\n                flavor: Some(StorageBackendFlavor::DigitalOcean),\n                ..Default::default()\n            }\n            .into(),\n            S3StorageConfig {\n                flavor: Some(StorageBackendFlavor::Garage),\n                ..Default::default()\n            }\n            .into(),\n            S3StorageConfig {\n                flavor: Some(StorageBackendFlavor::Gcs),\n                ..Default::default()\n            }\n            .into(),\n            S3StorageConfig {\n                flavor: Some(StorageBackendFlavor::MinIO),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        storage_configs.apply_flavors();\n\n        let do_storage_config = storage_configs[0].as_s3().unwrap();\n        assert!(do_storage_config.force_path_style_access);\n        assert!(do_storage_config.disable_multi_object_delete);\n\n        let garage_storage_config = 
storage_configs[1].as_s3().unwrap();\n        assert_eq!(garage_storage_config.region, Some(\"garage\".to_string()));\n        assert!(garage_storage_config.force_path_style_access);\n\n        let gcs_storage_config = storage_configs[2].as_s3().unwrap();\n        assert!(gcs_storage_config.disable_multi_object_delete);\n        assert!(gcs_storage_config.disable_multipart_upload);\n\n        let minio_storage_config = storage_configs[3].as_s3().unwrap();\n        assert_eq!(minio_storage_config.region, Some(\"minio\".to_string()));\n        assert!(minio_storage_config.force_path_style_access);\n    }\n\n    #[test]\n    fn test_storage_configs_validate() {\n        let storage_configs = StorageConfigs(vec![\n            AzureStorageConfig {\n                account_name: Some(\"test-account\".to_string()),\n                ..Default::default()\n            }\n            .into(),\n            AzureStorageConfig {\n                account_name: Some(\"prod-account\".to_string()),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        storage_configs.validate().unwrap_err();\n    }\n\n    #[test]\n    fn test_storage_configs_redact() {\n        let mut storage_configs = StorageConfigs(vec![\n            AzureStorageConfig {\n                access_key: Some(\"test-azure-access-key\".to_string()),\n                ..Default::default()\n            }\n            .into(),\n            S3StorageConfig {\n                secret_access_key: Some(\"test-s3-secret-access-key\".to_string()),\n                ..Default::default()\n            }\n            .into(),\n        ]);\n        storage_configs.redact();\n\n        assert_eq!(\n            storage_configs\n                .find_azure()\n                .unwrap()\n                .access_key\n                .as_ref()\n                .unwrap(),\n            \"***redacted***\"\n        );\n        assert_eq!(\n            storage_configs\n                .find_s3()\n         
       .unwrap()\n                .secret_access_key\n                .as_ref()\n                .unwrap(),\n            \"***redacted***\"\n        );\n    }\n\n    #[test]\n    fn test_storage_azure_config_serde() {\n        {\n            let azure_storage_config_yaml = r#\"\n                account: test-account\n            \"#;\n            let azure_storage_config: AzureStorageConfig =\n                serde_yaml::from_str(azure_storage_config_yaml).unwrap();\n\n            let expected_azure_config = AzureStorageConfig {\n                account_name: Some(\"test-account\".to_string()),\n                ..Default::default()\n            };\n            assert_eq!(azure_storage_config, expected_azure_config);\n        }\n        {\n            let azure_storage_config_yaml = r#\"\n                account: test-account\n                access_key: test-access-key\n            \"#;\n            let azure_storage_config: AzureStorageConfig =\n                serde_yaml::from_str(azure_storage_config_yaml).unwrap();\n\n            let expected_azure_config = AzureStorageConfig {\n                account_name: Some(\"test-account\".to_string()),\n                access_key: Some(\"test-access-key\".to_string()),\n            };\n            assert_eq!(azure_storage_config, expected_azure_config);\n        }\n    }\n\n    #[test]\n    fn test_storage_google_config_serde() {\n        {\n            let google_cloud_storage_config_yaml = r#\"\n                credential_path: /path/to/credential.json\n            \"#;\n            let google_cloud_storage_config: GoogleCloudStorageConfig =\n                serde_yaml::from_str(google_cloud_storage_config_yaml).unwrap();\n\n            let expected_google_cloud_storage_config = GoogleCloudStorageConfig {\n                credential_path: Some(\"/path/to/credential.json\".to_string()),\n            };\n            assert_eq!(\n                google_cloud_storage_config,\n                
expected_google_cloud_storage_config\n            );\n        }\n    }\n\n    #[test]\n    fn test_storage_s3_config_serde() {\n        {\n            let s3_storage_config_yaml = r#\"\n                endpoint: http://localhost:4566\n            \"#;\n            let s3_storage_config: S3StorageConfig =\n                serde_yaml::from_str(s3_storage_config_yaml).unwrap();\n\n            let expected_s3_config = S3StorageConfig {\n                endpoint: Some(\"http://localhost:4566\".to_string()),\n                ..Default::default()\n            };\n            assert_eq!(s3_storage_config, expected_s3_config);\n        }\n        {\n            let s3_storage_config_yaml = r#\"\n                region: us-east-1\n                endpoint: http://localhost:4566\n                force_path_style_access: true\n                disable_multi_object_delete_requests: true\n                disable_multipart_upload: true\n            \"#;\n            let s3_storage_config: S3StorageConfig =\n                serde_yaml::from_str(s3_storage_config_yaml).unwrap();\n\n            let expected_s3_config = S3StorageConfig {\n                region: Some(\"us-east-1\".to_string()),\n                endpoint: Some(\"http://localhost:4566\".to_string()),\n                force_path_style_access: true,\n                disable_multi_object_delete: true,\n                disable_multipart_upload: true,\n                ..Default::default()\n            };\n            assert_eq!(s3_storage_config, expected_s3_config);\n        }\n    }\n\n    #[test]\n    fn test_storage_s3_config_flavor_serde() {\n        {\n            let s3_storage_config_yaml = r#\"\n                flavor: digital_ocean\n            \"#;\n            let s3_storage_config: S3StorageConfig =\n                serde_yaml::from_str(s3_storage_config_yaml).unwrap();\n\n            assert_eq!(\n                s3_storage_config.flavor,\n                Some(StorageBackendFlavor::DigitalOcean)\n            
);\n        }\n        {\n            let s3_storage_config_yaml = r#\"\n                flavor: garage\n            \"#;\n            let s3_storage_config: S3StorageConfig =\n                serde_yaml::from_str(s3_storage_config_yaml).unwrap();\n\n            assert_eq!(s3_storage_config.flavor, Some(StorageBackendFlavor::Garage));\n        }\n        {\n            let s3_storage_config_yaml = r#\"\n                flavor: gcs\n            \"#;\n            let s3_storage_config: S3StorageConfig =\n                serde_yaml::from_str(s3_storage_config_yaml).unwrap();\n\n            assert_eq!(s3_storage_config.flavor, Some(StorageBackendFlavor::Gcs));\n        }\n        {\n            let s3_storage_config_yaml = r#\"\n                flavor: minio\n            \"#;\n            let s3_storage_config: S3StorageConfig =\n                serde_yaml::from_str(s3_storage_config_yaml).unwrap();\n\n            assert_eq!(s3_storage_config.flavor, Some(StorageBackendFlavor::MinIO));\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-config/src/templating.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::io::BufRead;\n\nuse anyhow::{Context, Result, bail};\nuse new_string_template::template::Template;\nuse once_cell::sync::Lazy;\nuse regex::Regex;\nuse tracing::debug;\n\n// Matches `${value}` if value is formatted as:\n// `ENV_VAR` or `ENV_VAR:-DEFAULT`\n// Ignores whitespace inside the curly braces\nstatic TEMPLATE_ENV_VAR_CAPTURE: Lazy<Regex> = Lazy::new(|| {\n    Regex::new(r\"\\$\\{\\s*([A-Za-z0-9_]+)\\s*(?::\\-\\s*([^\\s\\}]+)\\s*)?}\")\n        .expect(\"regular expression should compile\")\n});\n\npub fn render_config(config_content: &[u8]) -> Result<String> {\n    let template_str = std::str::from_utf8(config_content)\n        .context(\"config file contains invalid UTF-8 characters\")?;\n\n    let mut values = HashMap::new();\n\n    for (line_no, line_result) in config_content.lines().enumerate() {\n        let line = line_result?;\n\n        for captures in TEMPLATE_ENV_VAR_CAPTURE.captures_iter(&line) {\n            let env_var_key = captures\n                .get(1)\n                .expect(\"captures should always have at least one match\")\n                .as_str();\n            let substitution_value = {\n                if line.trim_start().starts_with('#') {\n                    debug!(\n                        env_var_name=%env_var_key,\n                        \"config file line #{line_no} is 
commented out, skipping\"\n                    );\n                    // This line is commented out, return the line as is.\n                    captures\n                        .get(0)\n                        .expect(\"0th capture should always be set\")\n                        .as_str()\n                        .to_string()\n                } else if let Ok(env_var_value) = std::env::var(env_var_key) {\n                    debug!(\n                        env_var_name=%env_var_key,\n                        env_var_value=%env_var_value,\n                        \"environment variable is set, substituting with environment variable value\"\n                    );\n                    env_var_value\n                } else if let Some(default_match) = captures.get(2) {\n                    let default_value = default_match.as_str().to_string();\n                    debug!(\n                        env_var_name=%env_var_key,\n                        default_value=%default_value,\n                        \"environment variable is not set, substituting with default value\"\n                    );\n                    default_value\n                } else {\n                    bail!(\n                        \"failed to render config file template: environment variable \\\n                         `{env_var_key}` is not set and no default value is provided\"\n                    );\n                }\n            };\n            values.insert(env_var_key.to_string(), substitution_value);\n        }\n    }\n    let template = Template::new(template_str).with_regex(&TEMPLATE_ENV_VAR_CAPTURE);\n    let rendered = template\n        .render_string(&values)\n        .context(\"failed to render config file template\")?;\n    Ok(rendered)\n}\n\n#[cfg(test)]\nmod test {\n    use std::env;\n\n    use super::render_config;\n\n    #[test]\n    fn test_template_render() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        
// as this is only a test, and it would be extremely inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        let config_content = b\"metastore_uri: ${TEST_TEMPLATE_RENDER_ENV_VAR_PLEASE_DONT_NOTICE}\";\n        unsafe {\n            env::set_var(\n                \"TEST_TEMPLATE_RENDER_ENV_VAR_PLEASE_DONT_NOTICE\",\n                \"s3://test-bucket/metastore\",\n            )\n        };\n        let rendered = render_config(config_content).unwrap();\n        unsafe { std::env::remove_var(\"TEST_TEMPLATE_RENDER_ENV_VAR_PLEASE_DONT_NOTICE\") };\n        assert_eq!(rendered, \"metastore_uri: s3://test-bucket/metastore\");\n    }\n\n    #[test]\n    fn test_template_render_supports_whitespaces() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        // as this is only a test, and it would be extremely inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        unsafe {\n            env::set_var(\n                \"TEST_TEMPLATE_RENDER_WHITESPACE_QW_TEST\",\n                \"s3://test-bucket/metastore\",\n            )\n        };\n        {\n            let config_content = b\"metastore_uri: ${  TEST_TEMPLATE_RENDER_WHITESPACE_QW_TEST  }\";\n            let rendered = render_config(config_content).unwrap();\n            assert_eq!(rendered, \"metastore_uri: s3://test-bucket/metastore\");\n        }\n    }\n\n    #[test]\n    fn test_template_render_with_default_value() {\n        {\n            let config_content =\n                b\"metastore_uri: ${QW_ENV_VAR_DOES_NOT_EXIST:-s3://test-bucket/metastore}\";\n            let rendered = render_config(config_content).unwrap();\n            assert_eq!(rendered, \"metastore_uri: s3://test-bucket/metastore\");\n        }\n        {\n            let config_content =\n                b\"metastore_uri: ${  QW_ENV_VAR_DOES_NOT_EXIST  :-  s3://test-bucket/metastore  }\";\n            let rendered = 
render_config(config_content).unwrap();\n            assert_eq!(rendered, \"metastore_uri: s3://test-bucket/metastore\");\n        }\n    }\n\n    #[test]\n    fn test_template_render_should_panic() {\n        let config_content = b\"metastore_uri: ${QW_ENV_VAR_DOES_NOT_EXIST}\";\n        render_config(config_content).unwrap_err();\n    }\n\n    #[test]\n    fn test_template_render_with_default_use_env() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        // as this is only a test, and it would be extremely inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        let config_content =\n            b\"metastore_uri: ${TEST_TEMPLATE_RENDER_ENV_VAR_DEFAULT_USE_ENV:-s3://test-bucket/wrongbucket}\";\n        unsafe {\n            env::set_var(\n                \"TEST_TEMPLATE_RENDER_ENV_VAR_DEFAULT_USE_ENV\",\n                \"s3://test-bucket/metastore\",\n            )\n        };\n        let rendered = render_config(config_content).unwrap();\n        unsafe { std::env::remove_var(\"TEST_TEMPLATE_RENDER_ENV_VAR_DEFAULT_USE_ENV\") };\n        assert_eq!(rendered, \"metastore_uri: s3://test-bucket/metastore\");\n    }\n\n    #[test]\n    fn test_template_render_with_multiple_vars_per_line() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        // as this is only a test, and it would be extremely inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        let config_content =\n            b\"metastore_uri: s3://${RENDER_MULTIPLE_BUCKET}/${RENDER_MULTIPLE_PREFIX:-index}#polling_interval=${RENDER_MULTIPLE_INTERVAL}s\";\n        unsafe {\n            env::set_var(\"RENDER_MULTIPLE_BUCKET\", \"test-bucket\");\n            env::set_var(\"RENDER_MULTIPLE_PREFIX\", \"metastore\");\n            env::set_var(\"RENDER_MULTIPLE_INTERVAL\", \"30\");\n        }\n        let rendered = 
render_config(config_content).unwrap();\n        unsafe {\n            std::env::remove_var(\"RENDER_MULTIPLE_BUCKET\");\n            std::env::remove_var(\"RENDER_MULTIPLE_PREFIX\");\n            std::env::remove_var(\"RENDER_MULTIPLE_INTERVAL\");\n        }\n        assert_eq!(\n            rendered,\n            \"metastore_uri: s3://test-bucket/metastore#polling_interval=30s\"\n        );\n    }\n\n    #[test]\n    fn test_template_render_ignores_commented_lines() {\n        {\n            let config_content = b\"# metastore_uri: ${QW_ENV_VAR_DOES_NOT_EXIST}\";\n            let rendered = render_config(config_content).unwrap();\n            assert_eq!(rendered, \"# metastore_uri: ${QW_ENV_VAR_DOES_NOT_EXIST}\");\n        }\n        {\n            let config_content =\n                b\" # metastore_uri: ${ QW_ENV_VAR_DOES_NOT_EXIST :- default-value }\";\n            let rendered = render_config(config_content).unwrap();\n            assert_eq!(\n                rendered,\n                \" # metastore_uri: ${ QW_ENV_VAR_DOES_NOT_EXIST :- default-value }\"\n            );\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/Cargo.toml",
    "content": "[package]\nname = \"quickwit-control-plane\"\ndescription = \"Control plane service implementation\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbytesize = { workspace = true }\nfnv = { workspace = true }\nfutures = { workspace = true }\nitertools = { workspace = true }\nlru = { workspace = true }\nmockall = { workspace = true, optional = true }\nonce_cell = { workspace = true }\nrand = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nsmallvec = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\nulid = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-ingest = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\nfutures = { workspace = true }\nmockall = { workspace = true }\nproptest = { workspace = true }\nrand = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-cluster = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-indexing = { workspace = true }\nquickwit-ingest = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\n\n[features]\ntestsuite = [\"mockall\"]\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/README.md",
    "content": "# Quickwit Control Plane\n\nThe Control Plane is responsible for scheduling indexing tasks to indexers. Its role is to ensure that the cluster is correctly running all indexing tasks on each indexer.\n\nAn indexing task is identified by a tuple `(IndexId, SourceId, Option<Vec<ShardId>>)`.\n\n## Scheduling algorithm\n\nThe control plane keeps an up-to-date partial view of the metastore.\nThis is enforced by routing all commands that alter indexes, shards, or sources\nthrough the control plane.\n\nOn startup, or when a metastore event is received, the scheduler computes the list of indexing tasks.\nIt then applies a placement algorithm to decide which indexer should run each indexing task. The result of this placement is called the physical indexing plan, and it associates each indexer with a list of indexing tasks.\n\nThe control plane then sends gRPC requests to the indexers that are not already following their assigned part of the indexing plan.\n\n```mermaid\nflowchart TB\n    StartScheduling(Start scheduling)--\"(Sources, Nodes)\"-->BuildPhysical\n    style StartScheduling fill:#ff0026,fill-opacity:0.5,stroke:#ff0026,stroke-width:4px\n    BuildPhysical[Build Physical Plan]--PhysicalPlan-->Apply\n    Apply[Apply plan to each indexer] --IndexerPlan--> Indexer1\n    Apply --IndexerPlan--> Indexer2\n    Apply --IndexerPlan--> Indexer...\n```\n\n## Control loop\n\nEach indexer reports its currently running plan via chitchat.\nA control loop makes sure that this cluster state matches the latest applied plan.\nIf a divergence is observed (for instance, if a node leaves the cluster), or if a node reports that it is not running a given pipeline, the control plane takes the necessary action (recomputing the physical plan or reapplying the plan, respectively).\n\n## Read more in the Rust docs\n\n[Scheduler Rust docs](./src/scheduler.rs#L66)\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/control_plane.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::btree_map::Entry;\nuse std::collections::{BTreeMap, BTreeSet};\nuse std::fmt;\nuse std::fmt::Formatter;\nuse std::num::NonZeroUsize;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse futures::stream::FuturesUnordered;\nuse futures::{Future, StreamExt};\nuse itertools::Itertools;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, DeferableReplyHandler, Handler, Mailbox,\n    Supervisor, Universe, WeakMailbox,\n};\nuse quickwit_cluster::{ClusterChange, ClusterChangeStream, ClusterChangeStreamFactory};\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_common::pubsub::EventSubscriber;\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{Progress, shared_consts};\nuse quickwit_config::{ClusterConfig, IndexConfig, IndexTemplate, SourceConfig};\nuse quickwit_ingest::{IngesterPool, LocalShardsUpdate};\nuse quickwit_metastore::{CreateIndexRequestExt, CreateIndexResponseExt, IndexMetadataResponseExt};\nuse quickwit_proto::control_plane::{\n    AdviseResetShardsRequest, AdviseResetShardsResponse, ControlPlaneError, ControlPlaneResult,\n    GetOrCreateOpenShardsRequest, GetOrCreateOpenShardsResponse, GetOrCreateOpenShardsSubrequest,\n};\nuse quickwit_proto::indexing::ShardPositionsUpdate;\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse 
quickwit_proto::metastore::{\n    AddSourceRequest, CreateIndexRequest, CreateIndexResponse, DeleteIndexRequest,\n    DeleteShardsRequest, DeleteSourceRequest, EmptyResponse, FindIndexTemplateMatchesRequest,\n    IndexMetadataResponse, IndexTemplateMatch, MetastoreError, MetastoreResult, MetastoreService,\n    MetastoreServiceClient, PruneShardsRequest, ToggleSourceRequest, UpdateIndexRequest,\n    UpdateSourceRequest, serde_utils,\n};\nuse quickwit_proto::types::{IndexId, IndexUid, NodeId, ShardId, SourceId, SourceUid};\nuse serde::Serialize;\nuse serde_json::{Value as JsonValue, json};\nuse tokio::sync::watch;\nuse tracing::{Level, debug, enabled, error, info};\n\nuse crate::IndexerPool;\nuse crate::cooldown_map::{CooldownMap, CooldownStatus};\nuse crate::debouncer::Debouncer;\nuse crate::indexing_scheduler::{IndexingScheduler, IndexingSchedulerState};\nuse crate::ingest::IngestController;\nuse crate::ingest::ingest_controller::{IngestControllerStats, RebalanceShardsCallback};\nuse crate::model::ControlPlaneModel;\n\n/// Interval between two controls (or checks) of the desired plan VS running plan.\npub(crate) const CONTROL_PLAN_LOOP_INTERVAL: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(100)\n} else {\n    Duration::from_secs(5)\n};\n\n/// Minimum period between two identical shard pruning operations.\nconst PRUNE_SHARDS_DEFAULT_COOLDOWN_PERIOD: Duration = Duration::from_secs(120);\n\n/// Minimum period between two rebuild plan operations.\nconst REBUILD_PLAN_COOLDOWN_PERIOD: Duration = Duration::from_secs(2);\n\n#[derive(Debug)]\nstruct ControlPlanLoop;\n\n#[derive(Debug, Default, Clone, Copy)]\nstruct RebuildPlan;\n\npub struct ControlPlane {\n    cluster_config: ClusterConfig,\n    cluster_change_stream_opt: Option<ClusterChangeStream>,\n    // The control plane state is split into two independent functions, that we naturally isolated\n    // code wise and state wise.\n    //\n    // - The indexing scheduler is in charge 
of managing indexers: it decides which indexer should\n    // index which source/shards.\n    // - the ingest controller is in charge of managing ingesters: it opens and closes shards on\n    // the different ingesters.\n    indexing_scheduler: IndexingScheduler,\n    ingest_controller: IngestController,\n    metastore: MetastoreServiceClient,\n    model: ControlPlaneModel,\n    prune_shard_cooldown: CooldownMap<(IndexId, SourceId)>,\n    rebuild_plan_debouncer: Debouncer,\n    readiness_tx: watch::Sender<bool>,\n    // Disables the control loop. This is useful for unit testing.\n    disable_control_loop: bool,\n}\n\nimpl fmt::Debug for ControlPlane {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        f.debug_struct(\"ControlPlane\").finish()\n    }\n}\n\nimpl ControlPlane {\n    pub fn spawn(\n        universe: &Universe,\n        cluster_config: ClusterConfig,\n        self_node_id: NodeId,\n        cluster_change_stream_factory: impl ClusterChangeStreamFactory,\n        indexer_pool: IndexerPool,\n        ingester_pool: IngesterPool,\n        metastore: MetastoreServiceClient,\n    ) -> (\n        Mailbox<Self>,\n        ActorHandle<Supervisor<Self>>,\n        watch::Receiver<bool>,\n    ) {\n        let disable_control_loop = false;\n        Self::spawn_inner(\n            universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            metastore,\n            disable_control_loop,\n        )\n    }\n\n    #[allow(clippy::too_many_arguments)]\n    fn spawn_inner(\n        universe: &Universe,\n        cluster_config: ClusterConfig,\n        self_node_id: NodeId,\n        cluster_change_stream_factory: impl ClusterChangeStreamFactory,\n        indexer_pool: IndexerPool,\n        ingester_pool: IngesterPool,\n        metastore: MetastoreServiceClient,\n        disable_control_loop: bool,\n    ) -> (\n        Mailbox<Self>,\n        
ActorHandle<Supervisor<Self>>,\n        watch::Receiver<bool>,\n    ) {\n        info!(\"starting control plane\");\n\n        let (readiness_tx, readiness_rx) = watch::channel(false);\n        let (control_plane_mailbox, control_plane_handle) =\n            universe.spawn_builder().supervise_fn(move || {\n                let cluster_id = cluster_config.cluster_id.clone();\n                let replication_factor = cluster_config.replication_factor;\n                let shard_throughput_limit_mib: f32 = cluster_config.shard_throughput_limit.as_u64()\n                    as f32\n                    / shared_consts::MIB as f32;\n                let indexing_scheduler =\n                    IndexingScheduler::new(cluster_id, self_node_id.clone(), indexer_pool.clone());\n                let ingest_controller = IngestController::new(\n                    metastore.clone(),\n                    ingester_pool.clone(),\n                    replication_factor,\n                    shard_throughput_limit_mib,\n                    cluster_config.shard_scale_up_factor,\n                );\n\n                let readiness_tx = readiness_tx.clone();\n                let _ = readiness_tx.send(false);\n\n                ControlPlane {\n                    cluster_config: cluster_config.clone(),\n                    cluster_change_stream_opt: Some(cluster_change_stream_factory.create()),\n                    indexing_scheduler,\n                    ingest_controller,\n                    metastore: metastore.clone(),\n                    model: Default::default(),\n                    prune_shard_cooldown: CooldownMap::new(NonZeroUsize::new(1024).unwrap()),\n                    rebuild_plan_debouncer: Debouncer::new(REBUILD_PLAN_COOLDOWN_PERIOD),\n                    readiness_tx,\n                    disable_control_loop,\n                }\n            });\n        (control_plane_mailbox, control_plane_handle, readiness_rx)\n    }\n}\n\n#[derive(Debug, Clone, Serialize, 
Default)]\npub struct ControlPlaneObservableState {\n    pub indexing_scheduler: IndexingSchedulerState,\n    pub ingest_controller: IngestControllerStats,\n    pub num_indexes: usize,\n    pub num_sources: usize,\n    pub readiness: bool,\n}\n\n#[async_trait]\nimpl Actor for ControlPlane {\n    type ObservableState = ControlPlaneObservableState;\n\n    fn name(&self) -> String {\n        \"ControlPlane\".to_string()\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        ControlPlaneObservableState {\n            indexing_scheduler: self.indexing_scheduler.observable_state(),\n            ingest_controller: self.ingest_controller.stats,\n            num_indexes: self.model.num_indexes(),\n            num_sources: self.model.num_sources(),\n            readiness: *self.readiness_tx.borrow(),\n        }\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        crate::metrics::CONTROL_PLANE_METRICS.restart_total.inc();\n\n        self.model\n            .load_from_metastore(&mut self.metastore, ctx.progress())\n            .await\n            .context(\"failed to initialize control plane model\")?;\n\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n\n        self.ingest_controller.sync_with_all_ingesters(&self.model);\n\n        ctx.schedule_self_msg(CONTROL_PLAN_LOOP_INTERVAL, ControlPlanLoop);\n\n        let weak_mailbox = ctx.mailbox().downgrade();\n        let cluster_change_stream = self\n            .cluster_change_stream_opt\n            .take()\n            .expect(\"`initialize` should be called only once\");\n        spawn_watch_indexers_task(weak_mailbox, cluster_change_stream);\n        let _ = self.readiness_tx.send(true);\n        Ok(())\n    }\n}\n\nimpl ControlPlane {\n    async fn auto_create_indexes(\n        &mut self,\n        subrequests: &[GetOrCreateOpenShardsSubrequest],\n        progress: &Progress,\n    ) -> MetastoreResult<()> {\n        if 
!self.cluster_config.auto_create_indexes {\n            return Ok(());\n        }\n        let mut index_ids = Vec::new();\n\n        for subrequest in subrequests {\n            if self.model.index_uid(&subrequest.index_id).is_none() {\n                index_ids.push(subrequest.index_id.clone());\n            }\n        }\n        if index_ids.is_empty() {\n            return Ok(());\n        }\n        let find_index_template_matches_request = FindIndexTemplateMatchesRequest { index_ids };\n        let find_index_template_matches_response = progress\n            .protect_future(\n                self.metastore\n                    .find_index_template_matches(find_index_template_matches_request),\n            )\n            .await?;\n\n        let mut create_index_futures = FuturesUnordered::new();\n\n        for index_template_match in find_index_template_matches_response.matches {\n            // TODO: It's a bit brutal to fail the entire operation if applying a single index\n            // template fails. We should return a partial failure instead for the subrequest. 
I\n            // want to do so in an upcoming refactor where the `GetOrCreateOpenShardsRequest` will\n            // be processed in multiple steps in a dedicated workbench.\n            let index_config = apply_index_template_match(\n                index_template_match,\n                &self.cluster_config.default_index_root_uri,\n            )?;\n            // We disable ingest V1 for index templates.\n            let source_configs = [SourceConfig::ingest_v2(), SourceConfig::cli()];\n\n            let create_index_request = CreateIndexRequest::try_from_index_and_source_configs(\n                &index_config,\n                &source_configs,\n            )?;\n            let create_index_future = {\n                let metastore = self.metastore.clone();\n                async move { metastore.create_index(create_index_request).await }\n            };\n            create_index_futures.push(create_index_future);\n        }\n        while let Some(create_index_response_result) =\n            progress.protect_future(create_index_futures.next()).await\n        {\n            // Same here.\n            let create_index_response = create_index_response_result?;\n            let index_metadata = create_index_response.deserialize_index_metadata()?;\n            self.model.add_index(index_metadata);\n        }\n        Ok(())\n    }\n\n    /// Deletes a set of shards from the metastore and the control plane model.\n    ///\n    /// If the shards were already absent this operation is considered successful.\n    async fn delete_shards(\n        &mut self,\n        source_uid: &SourceUid,\n        shard_ids: &[ShardId],\n        progress: &Progress,\n    ) -> anyhow::Result<()> {\n        debug!(\n            index_uid=%source_uid.index_uid,\n            source_id=%source_uid.source_id,\n            shard_ids=%shard_ids.pretty_display(),\n            \"deleting shards\"\n        );\n        let delete_shards_request = DeleteShardsRequest {\n            index_uid: 
Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.clone(),\n            shard_ids: shard_ids.to_vec(),\n            force: false,\n        };\n        // We use a slightly different strategy here than for other handlers:\n        // all metastore errors end up failing and respawning the control plane.\n        //\n        // This is because deleting shards is done in reaction to an event\n        // and we do not really have the freedom to return an error to a caller like for other\n        // calls: there is no caller.\n        progress\n            .protect_future(self.metastore.delete_shards(delete_shards_request))\n            .await\n            .context(\"failed to delete shards from metastore\")?;\n\n        self.model.delete_shards(source_uid, shard_ids);\n\n        info!(\n            index_uid=%source_uid.index_uid,\n            source_id=%source_uid.source_id,\n            shard_ids=%shard_ids.pretty_display(),\n            \"deleted shards\"\n        );\n        Ok(())\n    }\n\n    fn debug_info(&self) -> JsonValue {\n        // Build the union of ingesters tracked by ingester pool and the model.\n        let mut ingesters: BTreeMap<NodeId, JsonValue> = BTreeMap::new();\n\n        for (ingester_id, ingester) in self.ingest_controller.ingester_pool.keys_values() {\n            let ingester_json = json!({\n                \"available\": true,\n                \"status\": ingester.status.as_json_str_name(),\n            });\n            ingesters.insert(ingester_id.clone(), ingester_json);\n        }\n        for shard in self.model.all_shards() {\n            let ingester_id = NodeId::from(shard.leader_id.clone());\n\n            if let Entry::Vacant(entry) = ingesters.entry(ingester_id.clone()) {\n                let ingester_json = json!({\n                    \"available\": false,\n                    \"status\": IngesterStatus::default(),\n                });\n                entry.insert(ingester_json);\n            }\n        
}\n\n        let physical_indexing_plan: Vec<JsonValue> = self\n            .indexing_scheduler\n            .observable_state()\n            .last_applied_physical_plan\n            .map(|plan| {\n                plan.indexing_tasks_per_indexer()\n                    .iter()\n                    .map(|(node_id, tasks)| {\n                        json!({\n                            \"node_id\": node_id.clone(),\n                            \"tasks\": tasks.clone(),\n                        })\n                    })\n                    .collect()\n            })\n            .unwrap_or_default();\n\n        let mut per_index_and_leader_shards_json: BTreeMap<\n            IndexUid,\n            BTreeMap<String, Vec<JsonValue>>,\n        > = BTreeMap::new();\n\n        for (source_uid, shard_entries) in self.model.all_shards_with_source() {\n            for shard_entry in shard_entries {\n                let shard_json = json!({\n                    \"index_uid\": source_uid.index_uid,\n                    \"source_id\": source_uid.source_id,\n                    \"shard_id\": shard_entry.shard_id,\n                    \"shard_state\": shard_entry.shard_state().as_json_str_name(),\n                    \"leader_id\": shard_entry.leader_id,\n                    \"follower_id\": shard_entry.follower_id,\n                    \"publish_position_inclusive\": shard_entry.publish_position_inclusive(),\n                });\n                per_index_and_leader_shards_json\n                    .entry(source_uid.index_uid.clone())\n                    .or_default()\n                    .entry(shard_entry.leader_id.clone())\n                    .or_default()\n                    .push(shard_json);\n            }\n        }\n        json!({\n            \"ingesters\": ingesters,\n            \"physical_indexing_plan\": physical_indexing_plan,\n            \"shard_table\": per_index_and_leader_shards_json,\n        })\n    }\n\n    /// Rebuilds the indexing plan.\n    ///\n    
/// This method includes some debouncing logic. Every call will be followed by a cooldown\n    /// period.\n    ///\n    /// This method returns a future that can be awaited to ensure that the relevant rebuild plan\n    /// operation has been executed.\n    fn rebuild_plan_debounced(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> impl Future<Output = ()> + use<> {\n        let next_rebuild_waiter = self\n            .indexing_scheduler\n            .next_rebuild_tracker\n            .next_rebuild_waiter();\n        self.rebuild_plan_debouncer\n            .self_send_with_cooldown::<RebuildPlan>(ctx);\n        next_rebuild_waiter\n    }\n}\n\n#[async_trait]\nimpl Handler<RebuildPlan> for ControlPlane {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: RebuildPlan,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.indexing_scheduler.rebuild_plan(&self.model);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<ShardPositionsUpdate> for ControlPlane {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        shard_positions_update: ShardPositionsUpdate,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if enabled!(Level::DEBUG) {\n            let pretty_positions: Vec<String> = shard_positions_update\n                .updated_shard_positions\n                .iter()\n                .map(|(shard_id, position)| format!(\"{shard_id}:{}\", position.pretty_display()))\n                .sorted()\n                .collect();\n\n            debug!(\n                index_uid=%shard_positions_update.source_uid.index_uid,\n                source_id=%shard_positions_update.source_uid.source_id,\n                positions=%pretty_positions.as_slice().pretty_display(),\n                \"received shard positions update\"\n            );\n        }\n        let Some(shard_entries) = self\n            .model\n            
.get_shards_for_source_mut(&shard_positions_update.source_uid)\n        else {\n            // The source no longer exists.\n            return Ok(());\n        };\n\n        let mut shard_ids_to_close = Vec::new();\n        for (shard_id, position) in shard_positions_update.updated_shard_positions {\n            if let Some(shard) = shard_entries.get_mut(&shard_id) {\n                shard.publish_position_inclusive =\n                    Some(shard.publish_position_inclusive().max(position.clone()));\n                if position.is_eof() {\n                    // identify shards that have reached EOF but have not yet been removed.\n                    info!(\n                        index_uid=%shard_positions_update.source_uid.index_uid,\n                        source_id=%shard_positions_update.source_uid.source_id,\n                        %shard_id,\n                        position=%position.pretty_display(),\n                        \"received shard eof via gossip\"\n                    );\n                    shard_ids_to_close.push(shard_id);\n                }\n            }\n        }\n        if shard_ids_to_close.is_empty() {\n            return Ok(());\n        }\n        self.delete_shards(\n            &shard_positions_update.source_uid,\n            &shard_ids_to_close,\n            ctx.progress(),\n        )\n        .await?;\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<ControlPlanLoop> for ControlPlane {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: ControlPlanLoop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if self.disable_control_loop {\n            return Ok(());\n        }\n        if let Err(metastore_error) = self\n            .ingest_controller\n            .rebalance_shards(&mut self.model, ctx.mailbox(), ctx.progress())\n            .await\n        {\n            return 
convert_metastore_error::<()>(metastore_error).map(|_| ());\n        }\n        self.indexing_scheduler.control_running_plan(&self.model);\n        ctx.schedule_self_msg(CONTROL_PLAN_LOOP_INTERVAL, ControlPlanLoop);\n        Ok(())\n    }\n}\n\n/// This function converts a metastore error into an actor error.\n///\n/// If the metastore error implies that the transaction has not been\n/// successful, then we do not need to restart the control plane.\n/// If the metastore error does not let us know whether the transaction was\n/// successful or not, we need to restart the actor and have it load its state from\n/// the metastore.\n///\n/// This function also logs errors.\nfn convert_metastore_error<T>(\n    metastore_error: MetastoreError,\n) -> Result<ControlPlaneResult<T>, ActorExitStatus> {\n    // If true, we know that the transaction has not been recorded in the metastore.\n    // If false, we simply are not sure whether the transaction has been recorded or not.\n    let is_transaction_certainly_aborted = metastore_error.is_transaction_certainly_aborted();\n    if is_transaction_certainly_aborted {\n        // If the metastore transaction is certain to have been aborted,\n        // this is actually a good thing.\n        // We do not need to restart the control plane.\n        if !matches!(metastore_error, MetastoreError::AlreadyExists(_)) {\n            // It is not always an error to attempt to create an object that already exists.\n            // In particular, we create two otel indexes on startup.\n            // It will be up to the client to decide what to do there.\n            error!(err=?metastore_error, transaction_outcome=\"aborted\", \"metastore error\");\n        }\n        crate::metrics::CONTROL_PLANE_METRICS\n            .metastore_error_aborted\n            .inc();\n        Ok(Err(ControlPlaneError::Metastore(metastore_error)))\n    } else {\n        // If the metastore transaction may have been executed, we need to restart the control plane\n    
    // so that it gets resynced with the metastore state.\n        error!(error=?metastore_error, transaction_outcome=\"maybe-executed\", \"metastore error\");\n        crate::metrics::CONTROL_PLANE_METRICS\n            .metastore_error_maybe_executed\n            .inc();\n        Err(ActorExitStatus::from(anyhow::anyhow!(metastore_error)))\n    }\n}\n\n// This handler is a metastore call proxied through the control plane: we must first forward the\n// request to the metastore, and then act on the event.\n#[async_trait]\nimpl DeferableReplyHandler<CreateIndexRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<CreateIndexResponse>;\n\n    async fn handle_message(\n        &mut self,\n        request: CreateIndexRequest,\n        reply: impl FnOnce(Self::Reply) + Send + Sync + 'static,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        debug!(\"creating index\");\n\n        let response = match ctx\n            .protect_future(self.metastore.create_index(request))\n            .await\n        {\n            Ok(response) => response,\n            Err(metastore_error) => {\n                reply(convert_metastore_error(metastore_error)?);\n                return Ok(());\n            }\n        };\n        let index_metadata = match response.deserialize_index_metadata() {\n            Ok(index_metadata) => index_metadata,\n            Err(serde_error) => {\n                error!(error=?serde_error, \"failed to deserialize index metadata\");\n                return Err(ActorExitStatus::from(anyhow::anyhow!(serde_error)));\n            }\n        };\n        let index_uid = index_metadata.index_uid.clone();\n\n        // Now, create index can also add sources to support creating indexes automatically from\n        // index and source config templates.\n        let should_rebuild_plan = !index_metadata.sources.is_empty();\n        self.model.add_index(index_metadata);\n\n        if should_rebuild_plan {\n            let 
rebuild_plan_notifier = self.rebuild_plan_debounced(ctx);\n            tokio::task::spawn(async move {\n                rebuild_plan_notifier.await;\n                reply(Ok(response));\n            });\n        } else {\n            reply(Ok(response));\n        }\n        info!(%index_uid, \"created index\");\n        Ok(())\n    }\n}\n\n// This handler is a metastore call proxied through the control plane: we must first forward the\n// request to the metastore, and then act on the event.\n#[async_trait]\nimpl Handler<UpdateIndexRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<IndexMetadataResponse>;\n\n    async fn handle(\n        &mut self,\n        request: UpdateIndexRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        debug!(%index_uid, \"updating index\");\n\n        let response = match ctx\n            .protect_future(self.metastore.update_index(request))\n            .await\n        {\n            Ok(response) => response,\n            Err(metastore_error) => {\n                return convert_metastore_error(metastore_error);\n            }\n        };\n        let index_metadata = match response.deserialize_index_metadata() {\n            Ok(index_metadata) => index_metadata,\n            Err(serde_error) => {\n                error!(error=?serde_error, \"failed to deserialize index metadata\");\n                return Err(ActorExitStatus::from(anyhow::anyhow!(serde_error)));\n            }\n        };\n        if self\n            .model\n            .update_index_config(&index_uid, index_metadata.index_config)?\n        {\n            let _rebuild_plan_notifier = self.rebuild_plan_debounced(ctx);\n        }\n        info!(%index_uid, \"updated index\");\n        Ok(Ok(response))\n    }\n}\n\n// This handler is a metastore call proxied through the control plane: we must first forward the\n// request to the metastore, and 
then act on the event.\n#[async_trait]\nimpl Handler<DeleteIndexRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<EmptyResponse>;\n\n    async fn handle(\n        &mut self,\n        request: DeleteIndexRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        debug!(%index_uid, \"deleting index\");\n\n        if let Err(metastore_error) = ctx\n            .protect_future(self.metastore.delete_index(request))\n            .await\n        {\n            return convert_metastore_error(metastore_error);\n        };\n        let ingester_needing_resync: BTreeSet<NodeId> = self\n            .model\n            .list_shards_for_index(&index_uid)\n            .flat_map(|shard_entry| shard_entry.ingesters())\n            .map(|node_id_ref| node_id_ref.to_owned())\n            .collect();\n\n        self.model.delete_index(&index_uid);\n\n        self.ingest_controller\n            .sync_with_ingesters(&ingester_needing_resync, &self.model);\n\n        // TODO: Refine the event. Notify index will have the effect to reload the entire state from\n        // the metastore. 
We should update the state of the control plane.\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n\n        info!(%index_uid, \"deleted index\");\n        let response = EmptyResponse {};\n        Ok(Ok(response))\n    }\n}\n\n// This handler is a metastore call proxied through the control plane: we must first forward the\n// request to the metastore, and then act on the event.\n#[async_trait]\nimpl Handler<AddSourceRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<EmptyResponse>;\n\n    async fn handle(\n        &mut self,\n        request: AddSourceRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let source_config: SourceConfig =\n            match serde_utils::from_json_str(&request.source_config_json) {\n                Ok(source_config) => source_config,\n                Err(error) => {\n                    error!(%error, \"failed to deserialize source config\");\n                    return Ok(Err(ControlPlaneError::from(error)));\n                }\n            };\n        let source_id = source_config.source_id.clone();\n        debug!(%index_uid, source_id, \"adding source\");\n\n        if let Err(error) = ctx.protect_future(self.metastore.add_source(request)).await {\n            return Ok(Err(ControlPlaneError::from(error)));\n        };\n        self.model\n            .add_source(&index_uid, source_config)\n            .context(\"failed to add source\")?;\n\n        info!(%index_uid, source_id, \"added source\");\n\n        // TODO: Refine the event. Notify index will have the effect to reload the entire state from\n        // the metastore. 
We should update the state of the control plane.\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n\n        let response = EmptyResponse {};\n        Ok(Ok(response))\n    }\n}\n\n#[async_trait]\nimpl Handler<UpdateSourceRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<EmptyResponse>;\n\n    async fn handle(\n        &mut self,\n        request: UpdateSourceRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let source_config: SourceConfig =\n            match serde_utils::from_json_str(&request.source_config_json) {\n                Ok(source_config) => source_config,\n                Err(error) => {\n                    error!(%error, \"failed to deserialize source config\");\n                    return Ok(Err(ControlPlaneError::from(error)));\n                }\n            };\n        let source_id = source_config.source_id.clone();\n        debug!(%index_uid, source_id, \"updating source\");\n\n        if let Err(error) = ctx\n            .protect_future(self.metastore.update_source(request))\n            .await\n        {\n            return Ok(Err(ControlPlaneError::from(error)));\n        };\n        self.model\n            .update_source(&index_uid, source_config)\n            .context(\"failed to update source\")?;\n\n        // TODO: Refine the event. Notify index will have the effect to reload the entire state from\n        // the metastore. 
We should update the state of the control plane.\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n\n        info!(%index_uid, source_id, \"updated source\");\n        let response = EmptyResponse {};\n        Ok(Ok(response))\n    }\n}\n\n// This handler is a metastore call proxied through the control plane: we must first forward the\n// request to the metastore, and then act on the event.\n#[async_trait]\nimpl Handler<ToggleSourceRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<EmptyResponse>;\n\n    async fn handle(\n        &mut self,\n        request: ToggleSourceRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let source_id = request.source_id.clone();\n        let enable = request.enable;\n        debug!(%index_uid, source_id, enable, \"toggling source\");\n\n        if let Err(error) = ctx\n            .protect_future(self.metastore.toggle_source(request))\n            .await\n        {\n            return Ok(Err(ControlPlaneError::from(error)));\n        };\n\n        let mutation_occurred = self\n            .model\n            .toggle_source(&index_uid, &source_id, enable)\n            .context(\"failed to toggle source\")?;\n\n        if mutation_occurred {\n            let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n        }\n        info!(%index_uid, source_id, enabled=enable, \"toggled source\");\n        let response = EmptyResponse {};\n        Ok(Ok(response))\n    }\n}\n\n// This handler is a metastore call proxied through the control plane: we must first forward the\n// request to the metastore, and then act on the event.\n#[async_trait]\nimpl Handler<DeleteSourceRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<EmptyResponse>;\n\n    async fn handle(\n        &mut self,\n        request: DeleteSourceRequest,\n        ctx: &ActorContext<Self>,\n    ) -> 
Result<ControlPlaneResult<EmptyResponse>, ActorExitStatus> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let source_id = request.source_id.clone();\n        debug!(%index_uid, source_id, \"deleting source\");\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n\n        if let Err(metastore_error) = ctx\n            .protect_future(self.metastore.delete_source(request))\n            .await\n        {\n            // TODO: If the metastore returns an error but somehow succeeds in deleting the source,\n            // the control plane will restart and the shards will remain on the ingesters.\n            //\n            // This is tracked in #4274\n            return convert_metastore_error(metastore_error);\n        };\n\n        let ingesters_needing_resync: BTreeSet<NodeId> =\n            if let Some(shard_entries) = self.model.get_shards_for_source(&source_uid) {\n                shard_entries\n                    .values()\n                    .flat_map(|shard_entry| shard_entry.ingesters())\n                    .map(|node_id_ref| node_id_ref.to_owned())\n                    .collect()\n            } else {\n                BTreeSet::new()\n            };\n\n        self.ingest_controller\n            .sync_with_ingesters(&ingesters_needing_resync, &self.model);\n\n        self.model.delete_source(&source_uid);\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n\n        info!(\n            index_uid=%source_uid.index_uid,\n            source_id=%source_uid.source_id,\n            \"deleted source\"\n        );\n        let response = EmptyResponse {};\n        Ok(Ok(response))\n    }\n}\n\n#[async_trait]\nimpl Handler<PruneShardsRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<EmptyResponse>;\n\n    async fn handle(\n        &mut self,\n        request: PruneShardsRequest,\n        _ctx: 
&ActorContext<Self>,\n    ) -> Result<ControlPlaneResult<EmptyResponse>, ActorExitStatus> {\n        let interval = request\n            .interval_secs\n            .map(|interval_secs| Duration::from_secs(interval_secs as u64))\n            .unwrap_or_else(|| PRUNE_SHARDS_DEFAULT_COOLDOWN_PERIOD);\n\n        // A very basic debounce is enough here, missing one call to the pruning API is fine\n        let status = self.prune_shard_cooldown.update(\n            (\n                request.index_uid().index_id.clone(),\n                request.source_id.clone(),\n            ),\n            interval,\n        );\n        if let CooldownStatus::Ready = status\n            && let Err(metastore_error) = self.metastore.prune_shards(request).await\n        {\n            return convert_metastore_error(metastore_error);\n        };\n        // Return ok regardless of whether the call was successful or debounced\n        let response = EmptyResponse {};\n        Ok(Ok(response))\n    }\n}\n\n// This is neither a proxied call nor a metastore callback.\n#[async_trait]\nimpl Handler<GetOrCreateOpenShardsRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<GetOrCreateOpenShardsResponse>;\n\n    async fn handle(\n        &mut self,\n        request: GetOrCreateOpenShardsRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        if let Err(metastore_error) = self\n            .auto_create_indexes(&request.subrequests, ctx.progress())\n            .await\n        {\n            return convert_metastore_error(metastore_error);\n        }\n        match self\n            .ingest_controller\n            .get_or_create_open_shards(request, &mut self.model, ctx.progress())\n            .await\n        {\n            Ok(response) => {\n                let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n                Ok(Ok(response))\n            }\n            Err(metastore_error) => 
convert_metastore_error(metastore_error),\n        }\n    }\n}\n\n// This is neither a proxied call nor a metastore callback.\n#[async_trait]\nimpl Handler<AdviseResetShardsRequest> for ControlPlane {\n    type Reply = ControlPlaneResult<AdviseResetShardsResponse>;\n\n    async fn handle(\n        &mut self,\n        request: AdviseResetShardsRequest,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let response = self\n            .ingest_controller\n            .advise_reset_shards(request, &self.model);\n        Ok(Ok(response))\n    }\n}\n\n#[async_trait]\nimpl Handler<LocalShardsUpdate> for ControlPlane {\n    type Reply = ControlPlaneResult<()>;\n\n    async fn handle(\n        &mut self,\n        local_shards_update: LocalShardsUpdate,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        if let Err(metastore_error) = self\n            .ingest_controller\n            .handle_local_shards_update(local_shards_update, &mut self.model, ctx.progress())\n            .await\n        {\n            return convert_metastore_error(metastore_error);\n        }\n        let _rebuild_plan_waiter = self.rebuild_plan_debounced(ctx);\n        Ok(Ok(()))\n    }\n}\n\n#[derive(Debug)]\npub struct GetDebugInfo;\n\n#[async_trait]\nimpl Handler<GetDebugInfo> for ControlPlane {\n    type Reply = JsonValue;\n\n    async fn handle(\n        &mut self,\n        _: GetDebugInfo,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.debug_info())\n    }\n}\n\n#[derive(Clone)]\npub struct ControlPlaneEventSubscriber(WeakMailbox<ControlPlane>);\n\nimpl ControlPlaneEventSubscriber {\n    pub fn new(weak_control_plane_mailbox: WeakMailbox<ControlPlane>) -> Self {\n        Self(weak_control_plane_mailbox)\n    }\n}\n\n#[async_trait]\nimpl EventSubscriber<LocalShardsUpdate> for ControlPlaneEventSubscriber {\n    async fn handle_event(&mut self, 
local_shards_update: LocalShardsUpdate) {\n        if let Some(control_plane_mailbox) = self.0.upgrade()\n            && let Err(error) = control_plane_mailbox\n                .send_message(local_shards_update)\n                .await\n        {\n            error!(%error, \"failed to forward local shards update to control plane\");\n        }\n    }\n}\n\n#[async_trait]\nimpl EventSubscriber<ShardPositionsUpdate> for ControlPlaneEventSubscriber {\n    async fn handle_event(&mut self, shard_positions_update: ShardPositionsUpdate) {\n        if let Some(control_plane_mailbox) = self.0.upgrade()\n            && let Err(error) = control_plane_mailbox\n                .send_message(shard_positions_update)\n                .await\n        {\n            error!(%error, \"failed to forward shard positions update to control plane\");\n        }\n    }\n}\n\nfn apply_index_template_match(\n    index_template_match: IndexTemplateMatch,\n    default_index_root_uri: &Uri,\n) -> MetastoreResult<IndexConfig> {\n    let index_template: IndexTemplate =\n        serde_utils::from_json_str(&index_template_match.index_template_json)?;\n    let index_config = index_template\n        .apply_template(index_template_match.index_id, default_index_root_uri)\n        .map_err(|error| MetastoreError::Internal {\n            message: \"failed to apply index template\".to_string(),\n            cause: error.to_string(),\n        })?;\n    Ok(index_config)\n}\n\n#[derive(Debug)]\nstruct RebalanceShards;\n\n#[async_trait]\nimpl Handler<RebalanceShards> for ControlPlane {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: RebalanceShards,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        if let Err(error) = self\n            .ingest_controller\n            .rebalance_shards(&mut self.model, ctx.mailbox(), ctx.progress())\n            .await\n        {\n            return convert_metastore_error::<()>(error).map(|_| 
());\n        };\n        self.indexing_scheduler.rebuild_plan(&self.model);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<RebalanceShardsCallback> for ControlPlane {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: RebalanceShardsCallback,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let num_closed_shards = message.closed_shards.len();\n        debug!(\"closing {num_closed_shards} shards after rebalance\");\n\n        for closed_shard in message.closed_shards {\n            let shard_id = closed_shard.shard_id().clone();\n            let source_uid = SourceUid {\n                index_uid: closed_shard.index_uid().clone(),\n                source_id: closed_shard.source_id,\n            };\n            self.model.close_shards(&source_uid, &[shard_id]);\n        }\n        // We drop the rebalance guard explicitly here to put some emphasis on where the rebalance\n        // lock is released.\n        drop(message.rebalance_guard);\n        Ok(())\n    }\n}\n\nfn spawn_watch_indexers_task(\n    weak_mailbox: WeakMailbox<ControlPlane>,\n    cluster_change_stream: ClusterChangeStream,\n) {\n    tokio::spawn(watcher_indexers(weak_mailbox, cluster_change_stream));\n}\n\nasync fn watcher_indexers(\n    weak_mailbox: WeakMailbox<ControlPlane>,\n    mut cluster_change_stream: ClusterChangeStream,\n) {\n    while let Some(cluster_change) = cluster_change_stream.next().await {\n        let Some(mailbox) = weak_mailbox.upgrade() else {\n            return;\n        };\n\n        // Ingesters have two readiness levels:\n        // 1. Cluster connectivity: node is up and can reach the metastore (similar to other nodes)\n        // 2. 
Shard readiness: IngesterStatus::Ready indicates the ingester can accept new shards\n        // We rebalance shards when either readiness level changes.\n        let mut trigger_rebalance = false;\n        match cluster_change {\n            ClusterChange::Add(node) if node.is_indexer() => {\n                if node.ingester_status().is_ready() {\n                    info!(\n                        \"indexer `{}` with status `{}` joined the cluster: rebalancing shards and \\\n                         rebuilding indexing plan\",\n                        node.node_id(),\n                        node.ingester_status().as_json_str_name()\n                    );\n                    trigger_rebalance = true;\n                }\n            }\n            ClusterChange::Remove(node) if node.is_indexer() => {\n                info!(\n                    \"indexer `{}` left the cluster: rebalancing shards and rebuilding indexing \\\n                     plan\",\n                    node.node_id()\n                );\n                trigger_rebalance = true\n            }\n            ClusterChange::Update { previous, updated } if updated.is_indexer() => {\n                let was_ready = previous.ingester_status().is_ready();\n                let is_ready = updated.ingester_status().is_ready();\n\n                if was_ready ^ is_ready {\n                    info!(\n                        \"indexer `{}` status changed to `{}`: rebalancing shards and rebuilding \\\n                         indexing plan\",\n                        updated.node_id(),\n                        updated.ingester_status().as_json_str_name()\n                    );\n                    trigger_rebalance = true;\n                }\n            }\n            _ => {}\n        }\n        if trigger_rebalance && mailbox.send_message(RebalanceShards).await.is_err() {\n            return;\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZero;\n    use std::sync::Arc;\n\n    use 
mockall::Sequence;\n    use quickwit_actors::{AskError, Observe, SupervisorMetrics};\n    use quickwit_cluster::{ClusterChangeStreamFactoryForTest, ClusterNode};\n    use quickwit_config::{\n        CLI_SOURCE_ID, INGEST_V2_SOURCE_ID, IndexConfig, KafkaSourceParams, SourceParams,\n    };\n    use quickwit_indexing::IndexingService;\n    use quickwit_ingest::IngesterPoolEntry;\n    use quickwit_metastore::{\n        CreateIndexRequestExt, IndexMetadata, ListIndexesMetadataResponseExt,\n    };\n    use quickwit_proto::control_plane::{\n        GetOrCreateOpenShardsFailureReason, GetOrCreateOpenShardsSubrequest,\n    };\n    use quickwit_proto::indexing::{\n        ApplyIndexingPlanRequest, ApplyIndexingPlanResponse, CpuCapacity, IndexingServiceClient,\n        MockIndexingService,\n    };\n    use quickwit_proto::ingest::ingester::{\n        IngesterServiceClient, IngesterStatus, InitShardSuccess, InitShardsResponse,\n        MockIngesterService, RetainShardsResponse,\n    };\n    use quickwit_proto::ingest::{Shard, ShardPKey, ShardState};\n    use quickwit_proto::metastore::{\n        DeleteShardsResponse, EntityKind, FindIndexTemplateMatchesResponse,\n        ListIndexesMetadataRequest, ListIndexesMetadataResponse, ListShardsRequest,\n        ListShardsResponse, ListShardsSubresponse, MetastoreError, MockMetastoreService,\n        OpenShardSubresponse, OpenShardsResponse, SourceType,\n    };\n    use quickwit_proto::types::{DocMappingUid, Position};\n    use tokio::sync::Mutex;\n\n    use super::*;\n    use crate::IndexerNodeInfo;\n\n    #[tokio::test]\n    async fn test_control_plane_create_index() {\n        let universe = Universe::with_accelerated_time();\n        let self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        
let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_create_index()\n            .withf(|create_index_request| {\n                let index_config: IndexConfig =\n                    create_index_request.deserialize_index_config().unwrap();\n                assert_eq!(index_config.index_id, \"test-index\");\n                assert_eq!(index_config.index_uri, \"ram:///test-index\");\n                true\n            })\n            .returning(move |_| {\n                let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n                let index_metadata_json = serde_json::to_string(&index_metadata).unwrap();\n                let response = CreateIndexResponse {\n                    index_uid: Some(index_uid_clone.clone()),\n                    index_metadata_json,\n                };\n                Ok(response)\n            });\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = control_plane_mailbox\n            .ask_for_res(create_index_request)\n            .await\n            .unwrap();\n        assert_eq!(create_index_response.index_uid(), &index_uid);\n\n       
 // TODO: Test that create index event is properly sent to ingest controller.\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_delete_index() {\n        let universe = Universe::with_accelerated_time();\n        let self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_delete_index()\n            .withf(move |delete_index_request| delete_index_request.index_uid() == &index_uid_clone)\n            .returning(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let delete_index_request = DeleteIndexRequest {\n            index_uid: Some(index_uid),\n        };\n        control_plane_mailbox\n            .ask_for_res(delete_index_request)\n            .await\n            .unwrap();\n\n        // TODO: Test that delete index event is properly sent to ingest controller.\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_add_source() {\n        let universe = Universe::with_accelerated_time();\n        let 
self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://test\");\n        index_metadata\n            .add_source(SourceConfig::ingest_v2())\n            .unwrap();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_add_source()\n            .withf(|add_source_request| {\n                let source_config: SourceConfig =\n                    serde_json::from_str(&add_source_request.source_config_json).unwrap();\n                assert_eq!(source_config.source_id, \"test-source\");\n                assert_eq!(source_config.source_type(), SourceType::Void);\n                true\n            })\n            .return_once(|_| Ok(EmptyResponse {}));\n        // the list_indexes_metadata and list_shards calls are made when the control plane starts\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(move |_| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_shards()\n            .return_once(move |_| Ok(ListShardsResponse::default()));\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_config = 
SourceConfig::for_test(\"test-source\", SourceParams::void());\n        let add_source_request = AddSourceRequest {\n            index_uid: Some(index_uid),\n            source_config_json: serde_json::to_string(&source_config).unwrap(),\n        };\n        control_plane_mailbox\n            .ask_for_res(add_source_request)\n            .await\n            .unwrap();\n\n        // TODO: Test that add source event is properly sent to ingest controller.\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_update_source() {\n        let universe = Universe::with_accelerated_time();\n        let pipelines_after_update = 3;\n        let self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n        let mut mock_indexer = MockIndexingService::new();\n        // called once when the control plane starts\n        mock_indexer\n            .expect_apply_indexing_plan()\n            .withf(|request| request.indexing_tasks.len() == 1)\n            .return_once(|_| Ok(ApplyIndexingPlanResponse {}));\n        // called again after the update (3 tasks because 3 pipelines)\n        mock_indexer\n            .expect_apply_indexing_plan()\n            .withf(move |request| request.indexing_tasks.len() == pipelines_after_update)\n            .return_once(|_| Ok(ApplyIndexingPlanResponse {}));\n        let indexer = IndexingServiceClient::from_mock(mock_indexer);\n        let indexer_info = IndexerNodeInfo {\n            node_id: self_node_id.clone(),\n            generation_id: 0,\n            client: indexer,\n            indexing_tasks: Vec::new(),\n            indexing_capacity: CpuCapacity::from_cpu_millis(1_000),\n        };\n        indexer_pool.insert(self_node_id.clone(), indexer_info);\n\n        let ingester_pool = IngesterPool::default();\n\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://tata\");\n        index_metadata\n            
.add_source(SourceConfig::ingest_v2())\n            .unwrap();\n\n        let mut test_source_config = SourceConfig::for_test(\n            \"test-source\",\n            SourceParams::Kafka(KafkaSourceParams {\n                topic: \"test-topic\".to_string(),\n                client_log_level: None,\n                enable_backfill_mode: false,\n                client_params: json!({}),\n            }),\n        );\n        index_metadata\n            .add_source(test_source_config.clone())\n            .unwrap();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_update_source()\n            .withf(move |update_source_request| {\n                let source_config: SourceConfig =\n                    serde_json::from_str(&update_source_request.source_config_json).unwrap();\n                assert_eq!(source_config.source_id, \"test-source\");\n                assert_eq!(source_config.source_type(), SourceType::Kafka);\n                assert_eq!(\n                    source_config.num_pipelines,\n                    NonZero::new(pipelines_after_update).unwrap()\n                );\n                true\n            })\n            .return_once(|_| Ok(EmptyResponse {}));\n        // the list_indexes_metadata and list_shards calls are made when the control plane starts\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(move |_| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_shards()\n            .return_once(move |_| Ok(ListShardsResponse::default()));\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            
&universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        test_source_config.num_pipelines = NonZero::new(pipelines_after_update).unwrap();\n        let update_source_request = UpdateSourceRequest {\n            index_uid: Some(index_uid),\n            source_config_json: serde_json::to_string(&test_source_config).unwrap(),\n        };\n        control_plane_mailbox\n            .ask_for_res(update_source_request)\n            .await\n            .unwrap();\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_toggle_source() {\n        let universe = Universe::with_accelerated_time();\n        let self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://toto\");\n        index_metadata\n            .add_source(SourceConfig::ingest_v2())\n            .unwrap();\n\n        let test_source_config = SourceConfig::for_test(\"test-source\", SourceParams::void());\n        index_metadata.add_source(test_source_config).unwrap();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|_| Ok(ListIndexesMetadataResponse::for_test(vec![index_metadata])));\n        mock_metastore\n            .expect_list_shards()\n            .return_once(move |_| Ok(ListShardsResponse::default()));\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_toggle_source()\n            .times(1)\n         
   .return_once(move |toggle_source_request| {\n                assert_eq!(toggle_source_request.index_uid(), &index_uid_clone);\n                assert_eq!(toggle_source_request.source_id, \"test-source\");\n                Ok(EmptyResponse {})\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_toggle_source()\n            .times(1)\n            .return_once(move |toggle_source_request| {\n                assert_eq!(toggle_source_request.index_uid(), &index_uid_clone);\n                assert_eq!(toggle_source_request.source_id, \"test-source\");\n                assert!(!toggle_source_request.enable);\n                Ok(EmptyResponse {})\n            });\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let enable_source_request = ToggleSourceRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: \"test-source\".to_string(),\n            enable: true,\n        };\n        control_plane_mailbox\n            .ask_for_res(enable_source_request)\n            .await\n            .unwrap();\n\n        let disable_source_request = ToggleSourceRequest {\n            index_uid: Some(index_uid),\n            source_id: \"test-source\".to_string(),\n            enable: false,\n        };\n        control_plane_mailbox\n            .ask_for_res(disable_source_request)\n            .await\n            .unwrap();\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_delete_source() {\n   
     let universe = Universe::with_accelerated_time();\n        let self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_delete_source()\n            .withf(move |delete_source_request| {\n                assert_eq!(delete_source_request.index_uid(), &index_uid_clone);\n                assert_eq!(delete_source_request.source_id, \"test-source\");\n                true\n            })\n            .returning(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let delete_source_request = DeleteSourceRequest {\n            index_uid: Some(index_uid),\n            source_id: \"test-source\".to_string(),\n        };\n        control_plane_mailbox\n            .ask_for_res(delete_source_request)\n            .await\n            .unwrap();\n\n        // TODO: Test that delete source event is properly sent to ingest controller.\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_get_or_create_open_shards() {\n        let universe = Universe::with_accelerated_time();\n    
    let self_node_id: NodeId = \"test-node\".into();\n        let indexer_pool = IndexerPool::default();\n\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(|_| {\n                let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n                let mut source_config = SourceConfig::ingest_v2();\n                source_config.enabled = true;\n                index_metadata.add_source(source_config).unwrap();\n                Ok(ListIndexesMetadataResponse::for_test(vec![index_metadata]))\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_list_shards()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_uid(), &index_uid_clone);\n                assert_eq!(subrequest.source_id, INGEST_V2_SOURCE_ID);\n\n                let subresponses = vec![ListShardsSubresponse {\n                    index_uid: Some(index_uid_clone.clone()),\n                    source_id: INGEST_V2_SOURCE_ID.to_string(),\n                    shards: vec![Shard {\n                        index_uid: Some(index_uid_clone.clone()),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        ..Default::default()\n                    }],\n                }];\n                let response = ListShardsResponse { subresponses };\n                Ok(response)\n            });\n\n        let cluster_config = ClusterConfig::for_test();\n        let 
cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            self_node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let get_open_shards_request = GetOrCreateOpenShardsRequest {\n            subrequests: vec![GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index\".to_string(),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n            }],\n            closed_shards: Vec::new(),\n            unavailable_leaders: Vec::new(),\n        };\n        let get_open_shards_response = control_plane_mailbox\n            .ask_for_res(get_open_shards_request)\n            .await\n            .unwrap();\n        assert_eq!(get_open_shards_response.successes.len(), 1);\n        assert_eq!(get_open_shards_response.failures.len(), 0);\n\n        let subresponse = &get_open_shards_response.successes[0];\n        assert_eq!(subresponse.index_uid(), &index_uid);\n        assert_eq!(subresponse.source_id, INGEST_V2_SOURCE_ID);\n        assert_eq!(subresponse.open_shards.len(), 1);\n        assert_eq!(subresponse.open_shards[0].shard_id(), ShardId::from(1));\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_supervision_reload_from_metastore() {\n        let universe = Universe::default();\n        let node_id = NodeId::new(\"test_node\".to_string());\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n        let mut mock_metastore = MockMetastoreService::new();\n\n        let mut index_0 = IndexMetadata::for_test(\"test-index-0\", \"ram:///test-index-0\");\n        let source = 
SourceConfig::ingest_v2();\n        index_0.add_source(source.clone()).unwrap();\n\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(2) // 1 for the first initialization, 1 after the respawn of the control plane.\n            .returning(|list_indexes_request: ListIndexesMetadataRequest| {\n                assert_eq!(list_indexes_request, ListIndexesMetadataRequest::all());\n                Ok(ListIndexesMetadataResponse::for_test(Vec::new()))\n            });\n        mock_metastore.expect_list_shards().return_once(\n            |_list_shards_request: ListShardsRequest| {\n                let list_shards_resp = ListShardsResponse {\n                    subresponses: Vec::new(),\n                };\n                Ok(list_shards_resp)\n            },\n        );\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_metadata_json = serde_json::to_string(&index_metadata).unwrap();\n\n        mock_metastore.expect_create_index().times(1).return_once(\n            |_create_index_request: CreateIndexRequest| {\n                Ok(CreateIndexResponse {\n                    index_uid: index_metadata.index_uid.into(),\n                    index_metadata_json,\n                })\n            },\n        );\n        mock_metastore.expect_create_index().times(1).return_once(\n            |create_index_request: CreateIndexRequest| {\n                Err(MetastoreError::AlreadyExists(EntityKind::Index {\n                    index_id: create_index_request\n                        .deserialize_index_config()\n                        .unwrap()\n                        .index_id,\n                }))\n            },\n        );\n        mock_metastore.expect_create_index().times(1).return_once(\n            |_create_index_request: CreateIndexRequest| {\n                Err(MetastoreError::Connection {\n                    message: \"Fake connection error.\".to_string(),\n              
  })\n            },\n        );\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, control_plane_handle, mut readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        tokio::time::timeout(\n            Duration::from_secs(5),\n            readiness_rx.wait_for(|readiness| *readiness),\n        )\n        .await\n        .unwrap()\n        .unwrap();\n\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n\n        // A happy path: we simply create the index.\n        control_plane_mailbox\n            .ask_for_res(create_index_request.clone())\n            .await\n            .unwrap();\n\n        // Now let's see what happens if we attempt to create the same index a second time.\n        let control_plane_error: ControlPlaneError = control_plane_mailbox\n            .ask(create_index_request.clone())\n            .await\n            .unwrap()\n            .unwrap_err();\n\n        // That kind of error clearly indicates that the transaction has failed.\n        // The control plane does not need to be restarted.\n        assert!(\n            matches!(control_plane_error, ControlPlaneError::Metastore(MetastoreError::AlreadyExists(entity)) if entity == EntityKind::Index { index_id: \"test-index\".to_string() })\n        );\n\n        control_plane_mailbox.ask(Observe).await.unwrap();\n\n        assert_eq!(\n            control_plane_handle\n                .process_pending_and_observe()\n                .await\n                .metrics,\n          
  SupervisorMetrics {\n                num_panics: 0,\n                num_errors: 0,\n                num_kills: 0\n            }\n        );\n\n        // Now let's see what happens with a grayer type of error.\n        let control_plane_error: AskError<ControlPlaneError> = control_plane_mailbox\n            .ask_for_res(create_index_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(control_plane_error, AskError::ProcessMessageError));\n\n        // This time, the control plane is restarted.\n        control_plane_mailbox.ask(Observe).await.unwrap();\n        assert_eq!(\n            control_plane_handle\n                .process_pending_and_observe()\n                .await\n                .metrics,\n            SupervisorMetrics {\n                num_panics: 0,\n                num_errors: 1,\n                num_kills: 0\n            }\n        );\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_delete_shard_on_eof() {\n        let universe = Universe::with_accelerated_time();\n        let node_id = NodeId::new(\"test-control-plane\".to_string());\n        let indexer_pool = IndexerPool::default();\n        let (client_mailbox, client_inbox) = universe.create_test_mailbox();\n        let client = IndexingServiceClient::from_mailbox::<IndexingService>(client_mailbox);\n        let indexer_node_info = IndexerNodeInfo {\n            node_id: NodeId::new(\"test-indexer\".to_string()),\n            generation_id: 0,\n            client,\n            indexing_tasks: Vec::new(),\n            indexing_capacity: CpuCapacity::from_cpu_millis(4_000),\n        };\n        indexer_pool.insert(indexer_node_info.node_id.clone(), indexer_node_info);\n        let ingester_pool = IngesterPool::default();\n        let mut mock_metastore = MockMetastoreService::new();\n\n        let mut index_0 = IndexMetadata::for_test(\"test-index-0\", \"ram:///test-index-0\");\n        let mut source = 
SourceConfig::ingest_v2();\n        source.enabled = true;\n        index_0.add_source(source.clone()).unwrap();\n\n        let index_0_clone = index_0.clone();\n        mock_metastore.expect_list_indexes_metadata().return_once(\n            move |list_indexes_request: ListIndexesMetadataRequest| {\n                assert_eq!(list_indexes_request, ListIndexesMetadataRequest::all());\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_0_clone.clone(),\n                ]))\n            },\n        );\n        let index_uid_clone = index_0.index_uid.clone();\n        mock_metastore.expect_delete_shards().return_once(\n            move |delete_shards_request: DeleteShardsRequest| {\n                assert_eq!(delete_shards_request.index_uid(), &index_uid_clone);\n                assert_eq!(delete_shards_request.source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(delete_shards_request.shard_ids, [ShardId::from(17)]);\n                assert!(!delete_shards_request.force);\n\n                let response = DeleteShardsResponse {\n                    index_uid: delete_shards_request.index_uid,\n                    source_id: delete_shards_request.source_id,\n                    successes: delete_shards_request.shard_ids,\n                    failures: Vec::new(),\n                };\n                Ok(response)\n            },\n        );\n\n        let mut shard = Shard {\n            index_uid: Some(index_0.index_uid.clone()),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n            shard_id: Some(ShardId::from(17)),\n            leader_id: \"test-ingester\".to_string(),\n            publish_position_inclusive: Some(Position::Beginning),\n            ..Default::default()\n        };\n        shard.set_shard_state(ShardState::Open);\n\n        let index_uid_clone = index_0.index_uid.clone();\n        mock_metastore.expect_list_shards().return_once(\n            move |_list_shards_request: ListShardsRequest| 
{\n                let list_shards_resp = ListShardsResponse {\n                    subresponses: vec![ListShardsSubresponse {\n                        index_uid: Some(index_uid_clone),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: vec![shard],\n                    }],\n                };\n                Ok(list_shards_resp)\n            },\n        );\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let source_uid = SourceUid {\n            index_uid: index_0.index_uid.clone(),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n        };\n\n        // This update should not trigger anything in the control plane.\n        control_plane_mailbox\n            .ask(ShardPositionsUpdate {\n                source_uid: source_uid.clone(),\n                updated_shard_positions: vec![(ShardId::from(17), Position::offset(1_000u64))],\n            })\n            .await\n            .unwrap();\n\n        let control_plane_obs: ControlPlaneObservableState =\n            control_plane_mailbox.ask(Observe).await.unwrap();\n        let last_applied_physical_plan = control_plane_obs\n            .indexing_scheduler\n            .last_applied_physical_plan\n            .unwrap();\n        let indexing_tasks = last_applied_physical_plan\n            .indexing_tasks_per_indexer()\n            .get(\"test-indexer\")\n            .unwrap();\n        assert_eq!(indexing_tasks.len(), 1);\n        assert_eq!(indexing_tasks[0].shard_ids, [ShardId::from(17)]);\n\n        
let control_plane_debug_info = control_plane_mailbox.ask(GetDebugInfo).await.unwrap();\n        let shard = &control_plane_debug_info[\"shard_table\"]\n            [\"test-index-0:00000000000000000000000000\"][\"test-ingester\"][0];\n        assert_eq!(shard[\"shard_id\"], \"00000000000000000017\");\n        assert_eq!(shard[\"publish_position_inclusive\"], \"00000000000000001000\");\n\n        let _ = client_inbox.drain_for_test();\n\n        universe.sleep(Duration::from_secs(30)).await;\n        // This update should trigger the deletion of the shard and a new indexing plan.\n        control_plane_mailbox\n            .ask(ShardPositionsUpdate {\n                source_uid,\n                updated_shard_positions: vec![(ShardId::from(17), Position::eof(1_000u64))],\n            })\n            .await\n            .unwrap();\n\n        let control_plane_obs: ControlPlaneObservableState =\n            control_plane_mailbox.ask(Observe).await.unwrap();\n        let last_applied_physical_plan = control_plane_obs\n            .indexing_scheduler\n            .last_applied_physical_plan\n            .unwrap();\n        let indexing_tasks = last_applied_physical_plan\n            .indexing_tasks_per_indexer()\n            .get(\"test-indexer\")\n            .unwrap();\n        assert!(indexing_tasks.is_empty());\n\n        let apply_plan_requests = client_inbox.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n        let last_apply_plan_request = apply_plan_requests.last().unwrap();\n        assert!(last_apply_plan_request.indexing_tasks.is_empty());\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_fill_shard_table_position_from_metastore_on_startup() {\n        let universe = Universe::with_accelerated_time();\n        let node_id = NodeId::new(\"test-control-plane\".to_string());\n        let indexer_pool = IndexerPool::default();\n        let (client_mailbox, _client_inbox) = universe.create_test_mailbox();\n        let 
client = IndexingServiceClient::from_mailbox::<IndexingService>(client_mailbox);\n        let indexer_node_info = IndexerNodeInfo {\n            node_id: NodeId::new(\"test-indexer\".to_string()),\n            generation_id: 0,\n            client,\n            indexing_tasks: Vec::new(),\n            indexing_capacity: CpuCapacity::from_cpu_millis(4_000),\n        };\n        indexer_pool.insert(indexer_node_info.node_id.clone(), indexer_node_info);\n        let ingester_pool = IngesterPool::default();\n        let mut mock_metastore = MockMetastoreService::new();\n\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let mut source_config = SourceConfig::ingest_v2();\n        source_config.enabled = true;\n        index_metadata.add_source(source_config.clone()).unwrap();\n\n        let index_metadata_clone = index_metadata.clone();\n        mock_metastore.expect_list_indexes_metadata().return_once(\n            move |list_indexes_request: ListIndexesMetadataRequest| {\n                assert_eq!(list_indexes_request, ListIndexesMetadataRequest::all());\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata_clone,\n                ]))\n            },\n        );\n\n        let mut shard = Shard {\n            index_uid: Some(index_metadata.index_uid.clone()),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n            shard_id: Some(ShardId::from(17)),\n            leader_id: \"test-ingester\".to_string(),\n            publish_position_inclusive: Some(Position::Offset(1234u64.into())),\n            ..Default::default()\n        };\n        shard.set_shard_state(ShardState::Open);\n\n        let index_uid_clone = index_metadata.index_uid.clone();\n        mock_metastore.expect_list_shards().return_once(\n            move |_list_shards_request: ListShardsRequest| {\n                let list_shards_resp = ListShardsResponse {\n                    
subresponses: vec![ListShardsSubresponse {\n                        index_uid: Some(index_uid_clone),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: vec![shard],\n                    }],\n                };\n                Ok(list_shards_resp)\n            },\n        );\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let control_plane_debug_info = control_plane_mailbox.ask(GetDebugInfo).await.unwrap();\n        let shard = &control_plane_debug_info[\"shard_table\"]\n            [\"test-index:00000000000000000000000000\"][\"test-ingester\"][0];\n        assert_eq!(shard[\"shard_id\"], \"00000000000000000017\");\n        assert_eq!(shard[\"publish_position_inclusive\"], \"00000000000000001234\");\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_delete_non_existing_shard() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::default();\n        let node_id = NodeId::new(\"test-control-plane\".to_string());\n        let indexer_pool = IndexerPool::default();\n        let (client_mailbox, _client_inbox) = universe.create_test_mailbox();\n        let client = IndexingServiceClient::from_mailbox::<IndexingService>(client_mailbox);\n        let indexer_node_info = IndexerNodeInfo {\n            node_id: NodeId::new(\"test-indexer\".to_string()),\n            generation_id: 0,\n            client,\n            indexing_tasks: Vec::new(),\n            indexing_capacity: 
CpuCapacity::from_cpu_millis(4_000),\n        };\n        indexer_pool.insert(indexer_node_info.node_id.clone(), indexer_node_info);\n        let ingester_pool = IngesterPool::default();\n        let mut mock_metastore = MockMetastoreService::new();\n\n        let mut index_0 = IndexMetadata::for_test(\"test-index-0\", \"ram:///test-index-0\");\n        let mut source = SourceConfig::ingest_v2();\n        source.enabled = true;\n        index_0.add_source(source.clone()).unwrap();\n\n        let index_0_clone = index_0.clone();\n        mock_metastore.expect_list_indexes_metadata().return_once(\n            move |list_indexes_request: ListIndexesMetadataRequest| {\n                assert_eq!(list_indexes_request, ListIndexesMetadataRequest::all());\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_0_clone.clone(),\n                ]))\n            },\n        );\n        let index_uid_clone = index_0.index_uid.clone();\n        mock_metastore.expect_delete_shards().return_once(\n            move |delete_shards_request: DeleteShardsRequest| {\n                assert_eq!(delete_shards_request.index_uid(), &index_uid_clone);\n                assert_eq!(delete_shards_request.source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(delete_shards_request.shard_ids, [ShardId::from(17)]);\n                assert!(!delete_shards_request.force);\n\n                let response = DeleteShardsResponse {\n                    index_uid: delete_shards_request.index_uid,\n                    source_id: delete_shards_request.source_id,\n                    successes: delete_shards_request.shard_ids,\n                    failures: Vec::new(),\n                };\n                Ok(response)\n            },\n        );\n\n        let index_uid_clone = index_0.index_uid.clone();\n        mock_metastore.expect_list_shards().return_once(\n            move |_list_shards_request: ListShardsRequest| {\n                let list_shards_resp = 
ListShardsResponse {\n                    subresponses: vec![ListShardsSubresponse {\n                        index_uid: Some(index_uid_clone),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: Vec::new(),\n                    }],\n                };\n                Ok(list_shards_resp)\n            },\n        );\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        let source_uid = SourceUid {\n            index_uid: index_0.index_uid.clone(),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n        };\n\n        // This update should not trigger anything in the control plane.\n        control_plane_mailbox\n            .ask(ShardPositionsUpdate {\n                source_uid: source_uid.clone(),\n                updated_shard_positions: vec![(ShardId::from(17), Position::eof(1_000u64))],\n            })\n            .await\n            .unwrap();\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_delete_index() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::default();\n        let node_id = NodeId::new(\"test-control-plane\".to_string());\n        let indexer_pool = IndexerPool::default();\n\n        let ingester_pool = IngesterPool::default();\n        let mut mock_ingester = MockIngesterService::new();\n        let mut seq = Sequence::new();\n\n        let mut index_0 = IndexMetadata::for_test(\"test-index-0\", \"ram:///test-index-0\");\n        let mut source = 
SourceConfig::ingest_v2();\n        source.enabled = true;\n        index_0.add_source(source.clone()).unwrap();\n\n        let index_uid_clone = index_0.index_uid.clone();\n        let index_0_clone = index_0.clone();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            .in_sequence(&mut seq)\n            .returning(move |list_indexes_request: ListIndexesMetadataRequest| {\n                assert_eq!(list_indexes_request, ListIndexesMetadataRequest::all());\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_0_clone.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_shards()\n            .times(1)\n            .in_sequence(&mut seq)\n            .returning(move |_list_shards_request: ListShardsRequest| {\n                let list_shards_resp = ListShardsResponse {\n                    subresponses: vec![ListShardsSubresponse {\n                        index_uid: Some(index_uid_clone.clone()),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: vec![Shard {\n                            index_uid: Some(index_uid_clone.clone()),\n                            source_id: source.source_id.to_string(),\n                            shard_id: Some(ShardId::from(15)),\n                            leader_id: \"node1\".to_string(),\n                            follower_id: None,\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: None,\n                            publish_token: None,\n                            update_timestamp: 1724158996,\n                        }],\n                    }],\n                };\n                Ok(list_shards_resp)\n            });\n\n        
mock_ingester\n            .expect_retain_shards()\n            .times(1)\n            .in_sequence(&mut seq)\n            .returning(|request| {\n                assert_eq!(request.retain_shards_for_sources.len(), 1);\n                assert_eq!(\n                    request.retain_shards_for_sources[0].shard_ids,\n                    [ShardId::from(15)]\n                );\n                Ok(RetainShardsResponse {})\n            });\n\n        let index_uid_clone = index_0.index_uid.clone();\n        mock_metastore\n            .expect_delete_index()\n            .times(1)\n            .in_sequence(&mut seq)\n            .returning(move |delete_index_request: DeleteIndexRequest| {\n                assert_eq!(delete_index_request.index_uid(), &index_uid_clone);\n                Ok(EmptyResponse {})\n            });\n        mock_ingester\n            .expect_retain_shards()\n            .times(1)\n            .in_sequence(&mut seq)\n            .returning(|mut request| {\n                assert_eq!(request.retain_shards_for_sources.len(), 1);\n                let retain_shards_for_source = request.retain_shards_for_sources.pop().unwrap();\n                assert!(&retain_shards_for_source.shard_ids.is_empty());\n                Ok(RetainShardsResponse {})\n            });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(\"node1\".into(), ingester);\n\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        // Deleting the index 
should leave the ingester with no shards to retain.\n        control_plane_mailbox\n            .ask(DeleteIndexRequest {\n                index_uid: Some(index_0.index_uid),\n            })\n            .await\n            .unwrap()\n            .unwrap();\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_delete_source() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::default();\n        let node_id = NodeId::new(\"test-control-plane\".to_string());\n        let indexer_pool = IndexerPool::default();\n\n        let ingester_pool = IngesterPool::default();\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_retain_shards()\n            .times(2)\n            .returning(|request| {\n                assert_eq!(request.retain_shards_for_sources.len(), 1);\n                assert_eq!(\n                    request.retain_shards_for_sources[0].shard_ids,\n                    [ShardId::from(15)]\n                );\n                Ok(RetainShardsResponse {})\n            });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(\"node1\".into(), ingester);\n\n        let mut index_0 = IndexMetadata::for_test(\"test-index-0\", \"ram:///test-index-0\");\n        let index_uid_clone = index_0.index_uid.clone();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_delete_source().return_once(\n            move |delete_source_request: DeleteSourceRequest| {\n                assert_eq!(delete_source_request.index_uid(), &index_uid_clone);\n                assert_eq!(&delete_source_request.source_id, INGEST_V2_SOURCE_ID);\n                Ok(EmptyResponse {})\n            },\n        );\n\n        let mut source = SourceConfig::ingest_v2();\n        source.enabled = true;\n        
index_0.add_source(source.clone()).unwrap();\n\n        let index_0_clone = index_0.clone();\n        mock_metastore.expect_list_indexes_metadata().return_once(\n            move |list_indexes_request: ListIndexesMetadataRequest| {\n                assert_eq!(list_indexes_request, ListIndexesMetadataRequest::all());\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_0_clone.clone(),\n                ]))\n            },\n        );\n\n        let index_uid_clone = index_0.index_uid.clone();\n        mock_metastore.expect_list_shards().return_once(\n            move |_list_shards_request: ListShardsRequest| {\n                let list_shards_resp = ListShardsResponse {\n                    subresponses: vec![ListShardsSubresponse {\n                        index_uid: Some(index_uid_clone.clone()),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: vec![Shard {\n                            index_uid: Some(index_uid_clone),\n                            source_id: source.source_id.to_string(),\n                            shard_id: Some(ShardId::from(15)),\n                            leader_id: \"node1\".to_string(),\n                            follower_id: None,\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: None,\n                            publish_token: None,\n                            update_timestamp: 1724158996,\n                        }],\n                    }],\n                };\n                Ok(list_shards_resp)\n            },\n        );\n        let cluster_config = ClusterConfig::for_test();\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n  
          cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n        // Deleting the source should be forwarded to the metastore and succeed.\n        control_plane_mailbox\n            .ask(DeleteSourceRequest {\n                index_uid: Some(index_0.index_uid),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n            })\n            .await\n            .unwrap()\n            .unwrap();\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_auto_create_indexes_on_get_or_create_open_shards_request() {\n        let universe = Universe::default();\n\n        let mut cluster_config = ClusterConfig::for_test();\n        cluster_config.auto_create_indexes = true;\n\n        let node_id = NodeId::from(\"test-node\");\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_metastore = MockMetastoreService::new();\n\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n\n        mock_metastore\n            .expect_find_index_template_matches()\n            .return_once(|request| {\n                assert_eq!(request.index_ids, [\"test-index-foo\"]);\n\n                let index_template =\n                    IndexTemplate::for_test(\"test-template-foo\", &[\"test-index-foo*\"], 100);\n                let index_template_json = serde_json::to_string(&index_template).unwrap();\n\n                Ok(FindIndexTemplateMatchesResponse {\n                    matches: vec![IndexTemplateMatch {\n                        template_id: \"test-template-foo\".to_string(),\n                        index_id: 
\"test-index-foo\".to_string(),\n                        index_template_json,\n                    }],\n                })\n            });\n\n        mock_metastore.expect_create_index().return_once(|request| {\n            let index_config = request.deserialize_index_config().unwrap();\n            assert_eq!(index_config.index_id, \"test-index-foo\");\n            assert_eq!(index_config.index_uri, \"ram:///indexes/test-index-foo\");\n\n            let source_configs = request.deserialize_source_configs().unwrap();\n            assert_eq!(source_configs.len(), 2);\n            // assert_eq!(source_configs[0].source_id, INGEST_API_SOURCE_ID);\n            assert_eq!(source_configs[0].source_id, INGEST_V2_SOURCE_ID);\n            assert_eq!(source_configs[1].source_id, CLI_SOURCE_ID);\n\n            let index_uid = IndexUid::for_test(\"test-index-foo\", 0);\n            let mut index_metadata = IndexMetadata::new_with_index_uid(index_uid, index_config);\n\n            for source_config in source_configs {\n                index_metadata.add_source(source_config).unwrap();\n            }\n            let index_metadata_json = serde_json::to_string(&index_metadata).unwrap();\n\n            Ok(CreateIndexResponse {\n                index_uid: index_metadata.index_uid.into(),\n                index_metadata_json,\n            })\n        });\n\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory,\n            indexer_pool,\n            ingester_pool,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        );\n\n        let response = control_plane_mailbox\n            .ask(GetOrCreateOpenShardsRequest {\n                subrequests: vec![GetOrCreateOpenShardsSubrequest {\n                    subrequest_id: 0,\n                    index_id: \"test-index-foo\".to_string(),\n                    
source_id: INGEST_V2_SOURCE_ID.to_string(),\n                }],\n                closed_shards: Vec::new(),\n                unavailable_leaders: Vec::new(),\n            })\n            .await\n            .unwrap()\n            .unwrap();\n        assert!(response.successes.is_empty());\n        assert_eq!(response.failures.len(), 1);\n        assert!(matches!(\n            response.failures[0].reason(),\n            GetOrCreateOpenShardsFailureReason::NoIngestersAvailable\n        ));\n\n        let control_plane_state = control_plane_mailbox.ask(Observe).await.unwrap();\n        assert_eq!(control_plane_state.num_indexes, 1);\n        assert_eq!(control_plane_state.num_sources, 1);\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_watch_indexers() {\n        let universe = Universe::with_accelerated_time();\n        let (control_plane_mailbox, control_plane_inbox) = universe.create_test_mailbox();\n        let weak_control_plane_mailbox = control_plane_mailbox.downgrade();\n\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n        let cluster_change_stream = cluster_change_stream_factory.create();\n        spawn_watch_indexers_task(weak_control_plane_mailbox, cluster_change_stream);\n\n        let cluster_change_stream_tx = cluster_change_stream_factory.change_stream_tx();\n\n        // a non-indexer node status change doesn't trigger a shard rebalancing.\n        let metastore_node = ClusterNode::for_test(\n            \"test-metastore\",\n            1515,\n            false,\n            &[\"metastore\"],\n            &[],\n            IngesterStatus::Unspecified,\n        )\n        .await;\n        let cluster_change = ClusterChange::Add(metastore_node);\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        assert!(\n            control_plane_inbox\n                
.drain_for_test_typed::<RebalanceShards>()\n                .is_empty()\n        );\n\n        // an indexer initializing doesn't trigger a shard rebalancing.\n        let indexer_node_initializing: ClusterNode = ClusterNode::for_test(\n            \"test-indexer\",\n            1515,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Initializing,\n        )\n        .await;\n        let cluster_change = ClusterChange::Add(indexer_node_initializing);\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        assert!(\n            control_plane_inbox\n                .drain_for_test_typed::<RebalanceShards>()\n                .is_empty()\n        );\n\n        // an indexer ready triggers a shard rebalancing.\n        let indexer_node: ClusterNode = ClusterNode::for_test(\n            \"test-indexer\",\n            1515,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Ready,\n        )\n        .await;\n        let cluster_change = ClusterChange::Add(indexer_node.clone());\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        let RebalanceShards = control_plane_inbox.recv_typed_message().await.unwrap();\n\n        // removing an indexer node triggers a shard rebalancing.\n        let cluster_change = ClusterChange::Remove(indexer_node.clone());\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        let RebalanceShards = control_plane_inbox.recv_typed_message().await.unwrap();\n\n        // a change in IngesterStatus readiness triggers a shard rebalancing.\n        let node_ready = ClusterNode::for_test(\n            \"test-indexer\",\n            1515,\n            false,\n            &[\"indexer\"],\n            &[],\n            
IngesterStatus::Ready,\n        )\n        .await;\n        let node_retiring = ClusterNode::for_test(\n            \"test-indexer\",\n            1515,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Retiring,\n        )\n        .await;\n        let cluster_change = ClusterChange::Update {\n            previous: node_ready,\n            updated: node_retiring,\n        };\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        tokio::time::sleep(Duration::from_millis(1)).await;\n        let RebalanceShards = control_plane_inbox.recv_typed_message().await.unwrap();\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_rebuilds_plan_on_indexer_joins_or_leaves_the_cluster() {\n        let universe = Universe::with_accelerated_time();\n\n        let cluster_config = ClusterConfig::for_test();\n        let node_id = NodeId::from(\"test-control-plane\");\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let disable_control_loop = true;\n        let (_control_plane_mailbox, control_plane_handle, _readiness_rx) =\n            ControlPlane::spawn_inner(\n                &universe,\n                cluster_config,\n                node_id,\n                cluster_change_stream_factory.clone(),\n                indexer_pool.clone(),\n                ingester_pool,\n                metastore,\n                disable_control_loop,\n            );\n        let cluster_change_stream_tx = 
cluster_change_stream_factory.change_stream_tx();\n        let indexer_node: ClusterNode = ClusterNode::for_test(\n            \"test-indexer\",\n            1515,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Ready,\n        )\n        .await;\n        let cluster_change = ClusterChange::Add(indexer_node.clone());\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        universe.sleep(Duration::from_secs(10)).await;\n\n        let ingest_controller_stats = control_plane_handle\n            .process_pending_and_observe()\n            .await\n            .state_opt\n            .as_ref()\n            .unwrap()\n            .ingest_controller;\n        assert_eq!(ingest_controller_stats.num_rebalance_shards_ops, 1);\n\n        let cluster_change = ClusterChange::Remove(indexer_node);\n        cluster_change_stream_tx.send(cluster_change).unwrap();\n\n        universe.sleep(Duration::from_secs(10)).await;\n\n        let ingest_controller_stats = control_plane_handle\n            .process_pending_and_observe()\n            .await\n            .state_opt\n            .as_ref()\n            .unwrap()\n            .ingest_controller;\n        assert_eq!(ingest_controller_stats.num_rebalance_shards_ops, 2);\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_handles_rebalance_shards_callback() {\n        let universe = Universe::with_accelerated_time();\n\n        let cluster_config = ClusterConfig::for_test();\n        let node_id = NodeId::from(\"test-control-plane\");\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n\n        let indexer_pool = IndexerPool::default();\n        let ingester_pool = IngesterPool::default();\n        let ingester_id = NodeId::from(\"test-ingester\");\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_retain_shards()\n           
 .return_once(|_| Ok(RetainShardsResponse {}));\n        mock_ingester.expect_init_shards().return_once(|request| {\n            let shard = request.subrequests[0].shard().clone();\n            let response = InitShardsResponse {\n                successes: vec![InitShardSuccess {\n                    subrequest_id: 0,\n                    shard: Some(shard),\n                }],\n                failures: Vec::new(),\n            };\n            Ok(response)\n        });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(ingester_id, ingester);\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n\n        mock_metastore.expect_create_index().return_once(|request| {\n            let index_config = request.deserialize_index_config().unwrap();\n            let source_configs = request.deserialize_source_configs().unwrap();\n            let mut index_metadata = IndexMetadata::new_with_index_uid(\n                IndexUid::for_test(&index_config.index_id, 0),\n                index_config,\n            );\n            for source_config in source_configs {\n                index_metadata.add_source(source_config).unwrap();\n            }\n            let index_metadata_json = serde_json::to_string(&index_metadata).unwrap();\n            let response = CreateIndexResponse {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0u128)),\n                index_metadata_json,\n            };\n            Ok(response)\n        });\n        mock_metastore.expect_open_shards().return_once(|_| {\n            let response = OpenShardsResponse {\n                subresponses: vec![OpenShardSubresponse {\n                    subrequest_id: 0,\n                    open_shard: Some(Shard {\n            
            index_uid: Some(IndexUid::for_test(\"test-index\", 0u128)),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shard_id: Some(ShardId::from(0u64)),\n                        leader_id: \"test-ingester\".to_string(),\n                        follower_id: None,\n                        shard_state: ShardState::Open as i32,\n                        doc_mapping_uid: Some(DocMappingUid::default()),\n                        publish_position_inclusive: Some(Position::Beginning),\n                        publish_token: None,\n                        update_timestamp: 1724158996,\n                    }),\n                }],\n            };\n            Ok(response)\n        });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory.clone(),\n            indexer_pool.clone(),\n            ingester_pool,\n            metastore,\n        );\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///test-index\");\n        let mut source_config = SourceConfig::ingest_v2();\n        source_config.enabled = true;\n\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, &[source_config])\n                .unwrap();\n        control_plane_mailbox\n            .ask(create_index_request)\n            .await\n            .unwrap()\n            .unwrap();\n\n        let get_or_create_open_shards_request = GetOrCreateOpenShardsRequest {\n            subrequests: vec![GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index\".to_string(),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n            }],\n            closed_shards: Vec::new(),\n            
unavailable_leaders: Vec::new(),\n        };\n        control_plane_mailbox\n            .ask(get_or_create_open_shards_request)\n            .await\n            .unwrap()\n            .unwrap();\n\n        let closed_shards = vec![\n            ShardPKey {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0u128)),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(0u64)),\n            },\n            ShardPKey {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0u128)),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(1u64)),\n            },\n        ];\n        let rebalance_lock = Arc::new(Mutex::new(()));\n        let rebalance_guard = rebalance_lock.clone().lock_owned().await;\n        let callback = RebalanceShardsCallback {\n            closed_shards,\n            rebalance_guard,\n        };\n        control_plane_mailbox.ask(callback).await.unwrap();\n\n        let control_plane_debug_info = control_plane_mailbox.ask(GetDebugInfo).await.unwrap();\n        let shard = &control_plane_debug_info[\"shard_table\"]\n            [\"test-index:00000000000000000000000000\"][\"test-ingester\"][0];\n        assert_eq!(shard[\"shard_id\"], \"00000000000000000000\");\n        assert_eq!(shard[\"shard_state\"], \"closed\");\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_control_plane_get_debug_info() {\n        let universe = Universe::with_accelerated_time();\n\n        let cluster_config = ClusterConfig::for_test();\n        let node_id = NodeId::from(\"test-control-plane\");\n        let cluster_change_stream_factory = ClusterChangeStreamFactoryForTest::default();\n\n        let indexer_pool = IndexerPool::default();\n        let ingester_id = NodeId::from(\"test-ingester\");\n\n        let mut mock_indexer = MockIndexingService::new();\n        mock_indexer\n            
.expect_apply_indexing_plan()\n            .return_once(|_| Ok(ApplyIndexingPlanResponse {}));\n        let indexer = IndexingServiceClient::from_mock(mock_indexer);\n\n        let indexer_info = IndexerNodeInfo {\n            node_id: ingester_id.clone(),\n            generation_id: 0,\n            client: indexer,\n            indexing_tasks: Vec::new(),\n            indexing_capacity: CpuCapacity::from_cpu_millis(1_000),\n        };\n        indexer_pool.insert(ingester_id.clone(), indexer_info);\n\n        let ingester_pool = IngesterPool::default();\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_retain_shards()\n            .return_once(|_| Ok(RetainShardsResponse {}));\n        mock_ingester.expect_init_shards().return_once(|request| {\n            let shard = request.subrequests[0].shard().clone();\n            let response = InitShardsResponse {\n                successes: vec![InitShardSuccess {\n                    subrequest_id: 0,\n                    shard: Some(shard),\n                }],\n                failures: Vec::new(),\n            };\n            Ok(response)\n        });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(ingester_id, ingester);\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|_| Ok(ListIndexesMetadataResponse::for_test(Vec::new())));\n\n        mock_metastore.expect_create_index().return_once(|request| {\n            let index_config = request.deserialize_index_config().unwrap();\n            let source_configs = request.deserialize_source_configs().unwrap();\n            let mut index_metadata = IndexMetadata::new_with_index_uid(\n                IndexUid::for_test(&index_config.index_id, 0),\n                index_config,\n            );\n            for 
source_config in source_configs {\n                index_metadata.add_source(source_config).unwrap();\n            }\n            let index_metadata_json = serde_json::to_string(&index_metadata).unwrap();\n            let response = CreateIndexResponse {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0u128)),\n                index_metadata_json,\n            };\n            Ok(response)\n        });\n        mock_metastore.expect_open_shards().return_once(|_| {\n            let response = OpenShardsResponse {\n                subresponses: vec![OpenShardSubresponse {\n                    subrequest_id: 0,\n                    open_shard: Some(Shard {\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0u128)),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shard_id: Some(ShardId::from(0u64)),\n                        leader_id: \"test-ingester\".to_string(),\n                        follower_id: None,\n                        shard_state: ShardState::Open as i32,\n                        doc_mapping_uid: Some(DocMappingUid::default()),\n                        publish_position_inclusive: Some(Position::Beginning),\n                        publish_token: None,\n                        update_timestamp: 1724158996,\n                    }),\n                }],\n            };\n            Ok(response)\n        });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let (control_plane_mailbox, _control_plane_handle, _readiness_rx) = ControlPlane::spawn(\n            &universe,\n            cluster_config,\n            node_id,\n            cluster_change_stream_factory.clone(),\n            indexer_pool.clone(),\n            ingester_pool,\n            metastore,\n        );\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///test-index\");\n        let mut source_config = SourceConfig::ingest_v2();\n        
source_config.enabled = true;\n\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, &[source_config])\n                .unwrap();\n        control_plane_mailbox\n            .ask(create_index_request)\n            .await\n            .unwrap()\n            .unwrap();\n\n        let get_or_create_open_shards_request = GetOrCreateOpenShardsRequest {\n            subrequests: vec![GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index\".to_string(),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n            }],\n            closed_shards: Vec::new(),\n            unavailable_leaders: Vec::new(),\n        };\n        control_plane_mailbox\n            .ask(get_or_create_open_shards_request)\n            .await\n            .unwrap()\n            .unwrap();\n\n        let control_plane_debug_info = control_plane_mailbox.ask(GetDebugInfo).await.unwrap();\n\n        assert_eq!(\n            control_plane_debug_info[\"physical_indexing_plan\"][0][\"node_id\"],\n            \"test-ingester\"\n        );\n        let shard = &control_plane_debug_info[\"shard_table\"]\n            [\"test-index:00000000000000000000000000\"][\"test-ingester\"][0];\n        assert_eq!(shard[\"index_uid\"], \"test-index:00000000000000000000000000\");\n        assert_eq!(shard[\"source_id\"], INGEST_V2_SOURCE_ID);\n        assert_eq!(shard[\"shard_id\"], \"00000000000000000000\");\n        assert_eq!(shard[\"shard_state\"], \"open\");\n        assert_eq!(shard[\"leader_id\"], \"test-ingester\");\n        assert_eq!(shard[\"follower_id\"], JsonValue::Null);\n        assert_eq!(\n            shard[\"publish_position_inclusive\"],\n            json!(Position::Beginning)\n        );\n\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/cooldown_map.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Debug;\nuse std::hash::Hash;\nuse std::num::NonZeroUsize;\nuse std::time::{Duration, Instant};\n\nuse lru::LruCache;\n\n/// A map that keeps track of a cooldown deadline for each of its keys.\n///\n/// Internally it uses an [`LruCache`] to prune the oldest entries when the\n/// capacity is reached. If the capacity is reached but the oldest entry is not\n/// outdated, the capacity is extended (2x).\npub struct CooldownMap<K>(LruCache<K, Instant>);\n\n#[derive(Debug, PartialEq)]\npub enum CooldownStatus {\n    Ready,\n    InCooldown,\n}\n\nimpl<K: Hash + Eq> CooldownMap<K> {\n    pub fn new(capacity: NonZeroUsize) -> Self {\n        Self(LruCache::new(capacity))\n    }\n\n    /// Updates the deadline for the given key if it isn't currently in cooldown.\n    ///\n    /// The status returned is the one before the update (after an update, the\n    /// status is always `InCooldown`).\n    pub fn update(&mut self, key: K, cooldown_interval: Duration) -> CooldownStatus {\n        let deadline_opt = self.0.get_mut(&key);\n        let now = Instant::now();\n        if let Some(deadline) = deadline_opt {\n            if *deadline > now {\n                CooldownStatus::InCooldown\n            } else {\n                *deadline = now + cooldown_interval;\n                CooldownStatus::Ready\n            }\n        } else {\n            
let capacity: usize = self.0.cap().into();\n            if self.0.len() == capacity\n                && let Some((_, deadline)) = self.0.peek_lru()\n                && *deadline > now\n            {\n                // the oldest entry is not outdated, grow the LRU\n                self.0.resize(NonZeroUsize::new(capacity * 2).unwrap());\n            }\n            self.0.push(key, now + cooldown_interval);\n            CooldownStatus::Ready\n        }\n    }\n}\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_cooldown_map_resize() {\n        let mut cooldown_map = CooldownMap::new(NonZeroUsize::new(2).unwrap());\n        let cooldown_interval = Duration::from_secs(1);\n        assert_eq!(\n            cooldown_map.update(\"test_key1\", cooldown_interval),\n            CooldownStatus::Ready\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key1\", cooldown_interval),\n            CooldownStatus::InCooldown\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key2\", cooldown_interval),\n            CooldownStatus::Ready\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key2\", cooldown_interval),\n            CooldownStatus::InCooldown\n        );\n        // Hitting the capacity, the map should grow transparently\n        assert_eq!(\n            cooldown_map.update(\"test_key3\", cooldown_interval),\n            CooldownStatus::Ready\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key1\", cooldown_interval),\n            CooldownStatus::InCooldown\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key2\", cooldown_interval),\n            CooldownStatus::InCooldown\n        );\n        assert_eq!(cooldown_map.0.cap(), NonZeroUsize::new(4).unwrap());\n    }\n\n    #[test]\n    fn test_cooldown_map_expired() {\n        let mut cooldown_map = CooldownMap::new(NonZeroUsize::new(2).unwrap());\n        let cooldown_interval_short = 
Duration::from_millis(100);\n        let cooldown_interval_long = Duration::from_secs(5);\n\n        assert_eq!(\n            cooldown_map.update(\"test_key_short\", cooldown_interval_short),\n            CooldownStatus::Ready\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key_long\", cooldown_interval_long),\n            CooldownStatus::Ready\n        );\n\n        std::thread::sleep(cooldown_interval_short.mul_f32(1.1));\n        assert_eq!(\n            cooldown_map.update(\"test_key_short\", cooldown_interval_short),\n            CooldownStatus::Ready\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key_long\", cooldown_interval_long),\n            CooldownStatus::InCooldown\n        );\n    }\n\n    #[test]\n    fn test_cooldown_map_eviction() {\n        let mut cooldown_map = CooldownMap::new(NonZeroUsize::new(2).unwrap());\n        let cooldown_interval_short = Duration::from_millis(100);\n        let cooldown_interval_long = Duration::from_secs(5);\n\n        assert_eq!(\n            cooldown_map.update(\"test_key_short\", cooldown_interval_short),\n            CooldownStatus::Ready\n        );\n        assert_eq!(\n            cooldown_map.update(\"test_key_long_1\", cooldown_interval_long),\n            CooldownStatus::Ready\n        );\n\n        // after the cooldown period `test_key_short` should be evicted when adding a new key\n        std::thread::sleep(cooldown_interval_short.mul_f32(1.1));\n        assert_eq!(cooldown_map.0.len(), 2);\n        assert_eq!(\n            cooldown_map.update(\"test_key_long_2\", cooldown_interval_long),\n            CooldownStatus::Ready\n        );\n        assert_eq!(cooldown_map.0.len(), 2);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/debouncer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::{Arc, Mutex};\nuse std::time::Duration;\n\nuse quickwit_actors::{Actor, ActorContext, Handler};\n\n/// A debouncer is a helper to debounce events.\n///\n/// The debouncing takes a `cooldown_period` parameter and works as you may expect:\n///\n///\n///    time                t=0                   t=COOLDOWN            t=2*COOLDOWN\n///                        |                          |                     |\n///  ----------------------------------------------------------------------------------------------------\n///    event               *     * *    *                                   *   * *\n///    debounced effect    o                          o                     o\n///\n/// In particular, note the event triggered at `t=COOLDOWN`.\n#[derive(Clone)]\npub struct Debouncer {\n    cooldown_period: Duration,\n    cooldown_state: Arc<Mutex<DebouncerState>>,\n}\n\n#[derive(Clone, Copy, Debug, PartialEq)]\nenum DebouncerState {\n    /// More than `cooldown_period` has elapsed since we last emitted an event.\n    NoCooldown,\n    // Less than `cooldown_period` has elapsed since we last emitted an event,\n    // and no event has been received since then.\n    CooldownNotScheduled,\n    // Less than `cooldown_period` has elapsed since we last emitted an event,\n    // and we have already received an event during this cooldown 
period.\n    CooldownScheduled,\n}\n\nimpl DebouncerState {\n    fn accept(self, transition: Transition) -> DebouncerState {\n        use DebouncerState::*;\n        use Transition::*;\n        match (self, transition) {\n            (NoCooldown, Emit) => CooldownNotScheduled,\n            (NoCooldown, CooldownExpired) => unreachable!(),\n            (CooldownNotScheduled, Emit) => CooldownScheduled,\n            (CooldownNotScheduled, CooldownExpired) => NoCooldown,\n            (CooldownScheduled, Emit) => CooldownScheduled,\n            (CooldownScheduled, CooldownExpired) => NoCooldown,\n        }\n    }\n}\n\nenum Transition {\n    CooldownExpired,\n    Emit,\n}\n\n#[allow(dead_code)]\nimpl Debouncer {\n    pub fn new(cooldown_period: Duration) -> Debouncer {\n        Debouncer {\n            cooldown_period,\n            cooldown_state: Arc::new(Mutex::new(DebouncerState::NoCooldown)),\n        }\n    }\n\n    /// Updates the state according to the transition, and returns the state before the transition.\n    /// The entire transition is atomic.\n    fn accept_transition(&self, transition: Transition) -> DebouncerState {\n        let mut lock = self.cooldown_state.lock().unwrap();\n        let previous_state = *lock;\n        let new_state = previous_state.accept(transition);\n        *lock = new_state;\n        previous_state\n    }\n\n    fn emit_message<A, M>(&self, ctx: &ActorContext<A>)\n    where\n        A: Actor + Handler<M>,\n        M: Default + std::fmt::Debug + Send + Sync + 'static,\n    {\n        let _ = ctx.mailbox().send_message_with_high_priority(M::default());\n    }\n\n    fn schedule_post_cooldown_callback<A, M>(&self, ctx: &ActorContext<A>)\n    where\n        A: Actor + Handler<M>,\n        M: Default + std::fmt::Debug + Send + Sync + 'static,\n    {\n        let ctx_clone = ctx.clone();\n        let self_clone = self.clone();\n        let callback = move || {\n            let previous_state = 
self_clone.accept_transition(Transition::CooldownExpired);\n            if previous_state == DebouncerState::CooldownScheduled {\n                self_clone.self_send_with_cooldown(&ctx_clone);\n            }\n        };\n        ctx.spawn_ctx()\n            .schedule_event(callback, self.cooldown_period);\n    }\n\n    pub fn self_send_with_cooldown<M>(&self, ctx: &ActorContext<impl Handler<M>>)\n    where M: Default + std::fmt::Debug + Send + Sync + 'static {\n        let cooldown_state = self.accept_transition(Transition::Emit);\n        match cooldown_state {\n            DebouncerState::NoCooldown => {\n                self.emit_message(ctx);\n                self.schedule_post_cooldown_callback(ctx);\n            }\n            DebouncerState::CooldownNotScheduled | DebouncerState::CooldownScheduled => {}\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use async_trait::async_trait;\n    use quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Universe};\n\n    use crate::debouncer::Debouncer;\n\n    struct DebouncingActor {\n        count: usize,\n        debouncer: Debouncer,\n    }\n\n    impl DebouncingActor {\n        pub fn new(cooldown_duration: Duration) -> DebouncingActor {\n            DebouncingActor {\n                count: 0,\n                debouncer: Debouncer::new(cooldown_duration),\n            }\n        }\n    }\n\n    #[derive(Debug, Default)]\n    struct Increment;\n\n    #[derive(Debug)]\n    struct DebouncedIncrement;\n\n    #[async_trait]\n    impl Actor for DebouncingActor {\n        type ObservableState = usize;\n\n        fn observable_state(&self) -> Self::ObservableState {\n            self.count\n        }\n    }\n\n    #[async_trait]\n    impl Handler<Increment> for DebouncingActor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            _message: Increment,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<Self::Reply, 
ActorExitStatus> {\n            self.count += 1;\n            Ok(())\n        }\n    }\n\n    #[async_trait]\n    impl Handler<DebouncedIncrement> for DebouncingActor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            _message: DebouncedIncrement,\n            ctx: &ActorContext<Self>,\n        ) -> Result<Self::Reply, ActorExitStatus> {\n            self.debouncer.self_send_with_cooldown::<Increment>(ctx);\n            Ok(())\n        }\n    }\n\n    #[tokio::test]\n    async fn test_debouncer() {\n        let universe = Universe::default();\n        let cooldown_period = Duration::from_millis(1_000);\n        let debouncer = DebouncingActor::new(cooldown_period);\n        let (debouncer_mailbox, debouncer_handle) = universe.spawn_builder().spawn(debouncer);\n        {\n            let count = *debouncer_handle.process_pending_and_observe().await;\n            assert_eq!(count, 0);\n        }\n        {\n            let _ = debouncer_mailbox.ask(DebouncedIncrement).await;\n            let count = *debouncer_handle.process_pending_and_observe().await;\n            assert_eq!(count, 1);\n        }\n        for _ in 0..10 {\n            let _ = debouncer_mailbox.ask(DebouncedIncrement).await;\n            let count = *debouncer_handle.process_pending_and_observe().await;\n            assert_eq!(count, 1);\n        }\n        {\n            universe.sleep(cooldown_period.mul_f32(1.2f32)).await;\n            let count = *debouncer_handle.process_pending_and_observe().await;\n            assert_eq!(count, 2);\n        }\n        {\n            let _ = debouncer_mailbox.ask(DebouncedIncrement).await;\n            let count = *debouncer_handle.process_pending_and_observe().await;\n            assert_eq!(count, 2);\n        }\n        {\n            universe.sleep(cooldown_period * 2).await;\n            let count = *debouncer_handle.process_pending_and_observe().await;\n            assert_eq!(count, 3);\n        }\n        
universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_plan.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse fnv::FnvHashMap;\nuse quickwit_proto::indexing::IndexingTask;\nuse serde::Serialize;\n\n/// A [`PhysicalIndexingPlan`] defines the list of indexing tasks\n/// each indexer, identified by its node ID, should run.\n/// TODO(fmassot): a metastore version number will be attached to the plan\n/// to identify if the plan is up to date with the metastore.\n#[derive(Debug, PartialEq, Clone, Serialize)]\npub struct PhysicalIndexingPlan {\n    indexing_tasks_per_indexer_id: FnvHashMap<String, Vec<IndexingTask>>,\n}\n\nimpl PhysicalIndexingPlan {\n    pub fn with_indexer_ids(indexer_ids: &[String]) -> PhysicalIndexingPlan {\n        PhysicalIndexingPlan {\n            indexing_tasks_per_indexer_id: indexer_ids\n                .iter()\n                .map(|indexer_id| (indexer_id.clone(), Vec::new()))\n                .collect(),\n        }\n    }\n\n    pub fn add_indexing_task(&mut self, indexer_id: &str, indexing_task: IndexingTask) {\n        self.indexing_tasks_per_indexer_id\n            .entry(indexer_id.to_string())\n            .or_default()\n            .push(indexing_task);\n    }\n\n    /// Returns the hashmap of (indexer ID, indexing tasks).\n    pub fn indexing_tasks_per_indexer(&self) -> &FnvHashMap<String, Vec<IndexingTask>> {\n        &self.indexing_tasks_per_indexer_id\n    }\n\n    pub fn num_indexers(&self) -> usize {\n        
self.indexing_tasks_per_indexer_id.len()\n    }\n\n    /// Returns a mutable reference to the hashmap of (indexer ID, indexing tasks).\n    pub fn indexing_tasks_per_indexer_mut(&mut self) -> &mut FnvHashMap<String, Vec<IndexingTask>> {\n        &mut self.indexing_tasks_per_indexer_id\n    }\n\n    /// Returns the indexing tasks assigned to the given indexer, or `None` if the\n    /// indexer is not part of the plan.\n    pub fn indexer(&self, indexer_id: &str) -> Option<&[IndexingTask]> {\n        self.indexing_tasks_per_indexer_id\n            .get(indexer_id)\n            .map(Vec::as_slice)\n    }\n\n    /// Sorts the shard IDs of each task, then sorts each indexer's tasks by\n    /// index UID, source ID, first shard ID, and pipeline UID.\n    pub fn normalize(&mut self) {\n        for tasks in self.indexing_tasks_per_indexer_id.values_mut() {\n            for task in tasks.iter_mut() {\n                task.shard_ids.sort_unstable();\n            }\n            tasks.sort_unstable_by(|left, right| {\n                left.index_uid\n                    .cmp(&right.index_uid)\n                    .then_with(|| left.source_id.cmp(&right.source_id))\n                    .then_with(|| left.shard_ids.first().cmp(&right.shard_ids.first()))\n                    .then_with(|| left.pipeline_uid.cmp(&right.pipeline_uid))\n            });\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_scheduler/change_tracker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse tokio::sync::watch;\n\n/// This object makes it possible to track for the completion of the next rebuild.\npub struct RebuildNotifier {\n    generation_processed_tx: watch::Sender<usize>,\n    generation_processed_rx: watch::Receiver<usize>,\n    generation: usize,\n}\n\nimpl Default for RebuildNotifier {\n    fn default() -> Self {\n        let (generation_processed_tx, generation_processed_rx) = watch::channel(0);\n\n        Self {\n            generation_processed_tx,\n            generation_processed_rx,\n            generation: 1,\n        }\n    }\n}\n\nimpl RebuildNotifier {\n    /// Returns a future that resolves when the next rebuild is completed.\n    ///\n    /// If an ongoing build T exists, it will not resolve upon build T's completion.\n    /// It will only be resolved upon build T+1's completion, or any subsequent build.\n    pub fn next_rebuild_waiter(&mut self) -> impl std::future::Future<Output = ()> + use<> {\n        let mut generation_processed_rx = self.generation_processed_rx.clone();\n        let current_generation = self.generation;\n        async move {\n            loop {\n                if *generation_processed_rx.borrow() >= current_generation {\n                    return;\n                }\n                if generation_processed_rx.changed().await.is_err() {\n                    
return;\n                }\n            }\n        }\n    }\n\n    /// Starts a new rebuild.\n    pub fn start_rebuild(&mut self) -> Arc<NotifyChangeOnDrop> {\n        let generation = self.generation;\n        self.generation += 1;\n        Arc::new(NotifyChangeOnDrop {\n            generation,\n            generation_processed_tx: self.generation_processed_tx.clone(),\n        })\n    }\n}\n\npub struct NotifyChangeOnDrop {\n    generation: usize,\n    generation_processed_tx: watch::Sender<usize>,\n}\n\nimpl Drop for NotifyChangeOnDrop {\n    fn drop(&mut self) {\n        if self.generation < *self.generation_processed_tx.borrow() {\n            return;\n        }\n        let _ = self.generation_processed_tx.send(self.generation);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_change_tracker() {\n        let mut change_tracker = RebuildNotifier::default();\n        let waiter = change_tracker.next_rebuild_waiter();\n        let change_notifier = change_tracker.start_rebuild();\n        drop(change_notifier);\n        waiter.await;\n    }\n\n    #[tokio::test]\n    async fn test_change_tracker_ongoing_is_not_good() {\n        let mut change_tracker = RebuildNotifier::default();\n        let change_notifier = change_tracker.start_rebuild();\n        let waiter = change_tracker.next_rebuild_waiter();\n        let waiter2 = change_tracker.next_rebuild_waiter();\n        drop(change_notifier);\n        let change_notifier2 = change_tracker.start_rebuild();\n        let timeout_res = tokio::time::timeout(Duration::from_millis(100), waiter).await;\n        assert!(timeout_res.is_err());\n        drop(change_notifier2);\n        waiter2.await;\n    }\n\n    #[tokio::test]\n    async fn test_change_tracker_all_waiters_are_notified() {\n        let mut change_tracker = RebuildNotifier::default();\n        let waiter = change_tracker.next_rebuild_waiter();\n        let waiter2 = 
change_tracker.next_rebuild_waiter();\n        let change_notifier = change_tracker.start_rebuild();\n        drop(change_notifier);\n        waiter.await;\n        waiter2.await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_scheduler/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod change_tracker;\nmod scheduling;\n\nuse std::cmp::Ordering;\nuse std::fmt;\nuse std::num::NonZeroU32;\nuse std::sync::{Arc, OnceLock};\nuse std::time::{Duration, Instant};\n\nuse fnv::{FnvHashMap, FnvHashSet};\nuse itertools::Itertools;\nuse once_cell::sync::OnceCell;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_config::{FileSourceParams, SourceParams, indexing_pipeline_params_fingerprint};\nuse quickwit_proto::indexing::{\n    ApplyIndexingPlanRequest, CpuCapacity, IndexingService, IndexingTask, PIPELINE_FULL_CAPACITY,\n    PIPELINE_THROUGHPUT,\n};\nuse quickwit_proto::types::NodeId;\nuse scheduling::{SourceToSchedule, SourceToScheduleType};\nuse serde::Serialize;\nuse tracing::{debug, info, warn};\n\nuse crate::indexing_plan::PhysicalIndexingPlan;\nuse crate::indexing_scheduler::change_tracker::{NotifyChangeOnDrop, RebuildNotifier};\nuse crate::indexing_scheduler::scheduling::build_physical_indexing_plan;\nuse crate::metrics::ShardLocalityMetrics;\nuse crate::model::{ControlPlaneModel, ShardEntry, ShardLocations};\nuse crate::{IndexerNodeInfo, IndexerPool};\n\nconst DEFAULT_ENABLE_VARIABLE_SHARD_LOAD: bool = false;\n\npub(crate) const MIN_DURATION_BETWEEN_SCHEDULING: Duration =\n    if cfg!(any(test, feature = \"testsuite\")) {\n        Duration::from_millis(50)\n    } else {\n        Duration::from_secs(30)\n    
};\n\n#[derive(Debug, Clone, Default, Serialize)]\npub struct IndexingSchedulerState {\n    pub num_applied_physical_indexing_plan: usize,\n    pub num_schedule_indexing_plan: usize,\n    pub last_applied_physical_plan: Option<PhysicalIndexingPlan>,\n    #[serde(skip)]\n    pub last_applied_plan_timestamp: Option<Instant>,\n}\n\n/// The [`IndexingScheduler`] is responsible for listing indexing tasks and assigning them to\n/// indexers.\n/// We call this duty `scheduling`. Contrary to what the name suggests, most indexing tasks\n/// run indefinitely. We just borrowed the terminology from Kubernetes.\n///\n/// Scheduling executes the following steps:\n/// 1. Build a [`PhysicalIndexingPlan`] from the list of logical indexing tasks. See\n///    [`build_physical_indexing_plan`] for the implementation details.\n/// 2. Apply the [`PhysicalIndexingPlan`]: for each indexer, the scheduler sends the indexing tasks\n///    over gRPC. An indexer immediately returns an Ok and applies the received plan\n///    asynchronously. Any errors (network) happening in this step are ignored. The scheduler runs\n///    a control loop that regularly checks if indexers are actually running their plans (more\n///    details in the next section).\n///\n/// All events altering the list of indexes and sources are proxied\n/// through the control plane. The control plane model is therefore guaranteed to be up-to-date\n/// (at the cost of making the control plane a single point of failure).\n///\n/// Each change to the model triggers the production of a new `PhysicalIndexingPlan`.\n///\n/// A `ControlPlanLoop` event is scheduled every `CONTROL_PLAN_LOOP_INTERVAL` and steers\n/// the cluster toward the last applied [`PhysicalIndexingPlan`].\n///\n/// This physical plan is a desired state. 
Even after that state is reached, it can be altered due\n/// to a faulty server, for instance.\n///\n/// We then need to detect deviation, possibly recompute the desired `PhysicalIndexingPlan`\n/// and steer the cluster back to the right state.\n///\n/// First, to detect deviation, the control plane gathers an eventually consistent view of what is\n/// running on the different nodes of the cluster: the `running plan`. This is done via `chitchat`.\n///\n/// If the list of node ids has changed, the scheduler will trigger a new scheduling.\n/// If the indexing tasks do not match, the scheduler will reapply the last applied plan.\n/// Concretely, it will send the faulty nodes the plan they are supposed to follow.\n///\n/// Finally, to give each indexer time to run its indexing tasks, the control\n/// plane will wait at least [`MIN_DURATION_BETWEEN_SCHEDULING`] before comparing the desired\n/// plan with the running plan.\npub struct IndexingScheduler {\n    cluster_id: String,\n    self_node_id: NodeId,\n    indexer_pool: IndexerPool,\n    state: IndexingSchedulerState,\n    pub(crate) next_rebuild_tracker: RebuildNotifier,\n}\n\nimpl fmt::Debug for IndexingScheduler {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"IndexingScheduler\")\n            .field(\"cluster_id\", &self.cluster_id)\n            .field(\"node_id\", &self.self_node_id)\n            .field(\n                \"last_applied_plan_ts\",\n                &self.state.last_applied_plan_timestamp,\n            )\n            .finish()\n    }\n}\n\nfn enable_variable_shard_load() -> bool {\n    static IS_SHARD_LOAD_CP_ENABLED: OnceCell<bool> = OnceCell::new();\n    *IS_SHARD_LOAD_CP_ENABLED.get_or_init(|| {\n        if let Some(enable_flag) =\n            quickwit_common::get_bool_from_env_opt(\"QW_ENABLE_VARIABLE_SHARD_LOAD\")\n        {\n            return enable_flag;\n        }\n        // For backward compatibility, if 
QW_DISABLE_VARIABLE_SHARD_LOAD is set, we accept this\n        // value too.\n        if let Some(disable_flag) =\n            quickwit_common::get_bool_from_env_opt(\"QW_DISABLE_VARIABLE_SHARD_LOAD\")\n        {\n            warn!(\n                disable = disable_flag,\n                \"QW_DISABLE_VARIABLE_SHARD_LOAD is deprecated. Please use \\\n                 QW_ENABLE_VARIABLE_SHARD_LOAD instead. We will use your setting in this version, \\\n                 but will likely ignore it in future versions.\"\n            );\n            return !disable_flag;\n        }\n        // Defaulting to false\n        info!(\n            \"QW_ENABLE_VARIABLE_SHARD_LOAD not set, defaulting to {}\",\n            DEFAULT_ENABLE_VARIABLE_SHARD_LOAD\n        );\n        DEFAULT_ENABLE_VARIABLE_SHARD_LOAD\n    })\n}\n\n/// Computes the CPU load associated with a single shard of a given index.\n///\n/// The array passed contains all the data we have about the shards of the index.\n/// This function averages their statistics.\n///\n/// For the moment, this function only takes into account the measured throughput,\n/// and assumes a constant CPU usage of 4 vCPU = 20 MB/s.\n///\n/// It does not take into account the variation that could arise from the different\n/// doc mapping / nature of the data, etc.\nfn compute_load_per_shard(shard_entries: &[&ShardEntry]) -> NonZeroU32 {\n    if enable_variable_shard_load() {\n        let num_shards = shard_entries.len().max(1) as u64;\n        let average_throughput_per_shard_bytes: u64 = shard_entries\n            .iter()\n            .map(|shard_entry| shard_entry.long_term_ingestion_rate.0 as u64 * bytesize::MIB)\n            .sum::<u64>()\n            .div_ceil(num_shards)\n            // A shard throughput cannot exceed PIPELINE_THROUGHPUT in the long term (this is\n            // enforced by the configuration).\n            .min(PIPELINE_THROUGHPUT.as_u64());\n        let num_cpu_millis = (PIPELINE_FULL_CAPACITY.cpu_millis() as u64\n       
     * average_throughput_per_shard_bytes)\n            / PIPELINE_THROUGHPUT.as_u64();\n        const MIN_CPU_LOAD_PER_SHARD: u32 = 50u32;\n        NonZeroU32::new((num_cpu_millis as u32).max(MIN_CPU_LOAD_PER_SHARD)).unwrap()\n    } else {\n        get_default_load_per_shard()\n    }\n}\n\nfn get_default_load_per_shard() -> NonZeroU32 {\n    static DEFAULT_LOAD_PER_SHARD: OnceLock<NonZeroU32> = OnceLock::new();\n    *DEFAULT_LOAD_PER_SHARD.get_or_init(|| {\n        let default_load_per_shard = quickwit_common::get_from_env(\n            \"QW_DEFAULT_LOAD_PER_SHARD\",\n            PIPELINE_FULL_CAPACITY.cpu_millis() / 4,\n            false,\n        );\n        NonZeroU32::new(default_load_per_shard).unwrap()\n    })\n}\n\nfn get_sources_to_schedule(model: &ControlPlaneModel) -> Vec<SourceToSchedule> {\n    let mut sources = Vec::new();\n\n    for (source_uid, source_config) in model.source_configs() {\n        if !source_config.enabled {\n            continue;\n        }\n        let params_fingerprint = model\n            .index_metadata(&source_uid.index_uid)\n            .map(|index_meta| {\n                indexing_pipeline_params_fingerprint(&index_meta.index_config, source_config)\n            })\n            .unwrap_or_default();\n        match source_config.source_params {\n            SourceParams::File(FileSourceParams::Filepath(_))\n            | SourceParams::IngestCli\n            | SourceParams::Stdin\n            | SourceParams::Void(_)\n            | SourceParams::Vec(_) => { // We don't need to schedule those.\n            }\n\n            SourceParams::IngestApi => {\n                // TODO ingest v1 is scheduled differently\n                sources.push(SourceToSchedule {\n                    source_uid,\n                    source_type: SourceToScheduleType::IngestV1,\n                    params_fingerprint,\n                });\n            }\n            SourceParams::Ingest => {\n                // Expect: the source should exist since we 
just read it from `source_configs`.\n                // Note that we keep all shards, including Closed shards:\n                // A closed shard still needs to be indexed.\n                let shard_entries: Vec<&ShardEntry> = model\n                    .get_shards_for_source(&source_uid)\n                    .expect(\"source should exist\")\n                    .values()\n                    .collect();\n                if shard_entries.is_empty() {\n                    continue;\n                }\n                let shard_ids = shard_entries\n                    .iter()\n                    .map(|shard_entry| shard_entry.shard_id().clone())\n                    .collect();\n                let load_per_shard = compute_load_per_shard(&shard_entries[..]);\n                sources.push(SourceToSchedule {\n                    source_uid,\n                    source_type: SourceToScheduleType::Sharded {\n                        shard_ids,\n                        load_per_shard,\n                    },\n                    params_fingerprint,\n                });\n            }\n            SourceParams::Kafka(_)\n            | SourceParams::Kinesis(_)\n            | SourceParams::PubSub(_)\n            | SourceParams::Pulsar(_)\n            | SourceParams::File(FileSourceParams::Notifications(_)) => {\n                sources.push(SourceToSchedule {\n                    source_uid,\n                    source_type: SourceToScheduleType::NonSharded {\n                        num_pipelines: source_config.num_pipelines.get() as u32,\n                        // FIXME\n                        load_per_pipeline: NonZeroU32::new(PIPELINE_FULL_CAPACITY.cpu_millis())\n                            .unwrap(),\n                    },\n                    params_fingerprint,\n                });\n            }\n        }\n    }\n    sources\n}\n\nimpl IndexingScheduler {\n    pub fn new(cluster_id: String, self_node_id: NodeId, indexer_pool: IndexerPool) -> Self {\n        
IndexingScheduler {\n            cluster_id,\n            self_node_id,\n            indexer_pool,\n            state: IndexingSchedulerState::default(),\n            next_rebuild_tracker: RebuildNotifier::default(),\n        }\n    }\n\n    pub fn observable_state(&self) -> IndexingSchedulerState {\n        self.state.clone()\n    }\n\n    // Should be called whenever a change in the list of index/shard\n    // has happened.\n    //\n    // Prefer not calling this method directly, and instead call\n    // `ControlPlane::rebuild_indexing_plan_debounced`.\n    pub(crate) fn rebuild_plan(&mut self, model: &ControlPlaneModel) {\n        crate::metrics::CONTROL_PLANE_METRICS.schedule_total.inc();\n\n        let notify_on_drop = self.next_rebuild_tracker.start_rebuild();\n\n        let sources = get_sources_to_schedule(model);\n\n        let indexers: Vec<IndexerNodeInfo> = self.get_indexers_from_indexer_pool();\n\n        let indexer_id_to_cpu_capacities: FnvHashMap<String, CpuCapacity> = indexers\n            .iter()\n            .filter_map(|indexer| {\n                if indexer.indexing_capacity.cpu_millis() > 0 {\n                    Some((indexer.node_id.to_string(), indexer.indexing_capacity))\n                } else {\n                    None\n                }\n            })\n            .collect();\n\n        if indexer_id_to_cpu_capacities.is_empty() {\n            if !sources.is_empty() {\n                warn!(\"no indexing capacity available, cannot schedule an indexing plan\");\n            }\n            return;\n        };\n\n        let shard_locations = model.shard_locations();\n        let new_physical_plan = build_physical_indexing_plan(\n            &sources,\n            &indexer_id_to_cpu_capacities,\n            self.state.last_applied_physical_plan.as_ref(),\n            &shard_locations,\n        );\n        let shard_locality_metrics =\n            get_shard_locality_metrics(&new_physical_plan, &shard_locations);\n        
crate::metrics::CONTROL_PLANE_METRICS.set_shard_locality_metrics(shard_locality_metrics);\n        if let Some(last_applied_plan) = &self.state.last_applied_physical_plan {\n            let plans_diff = get_indexing_plans_diff(\n                last_applied_plan.indexing_tasks_per_indexer(),\n                new_physical_plan.indexing_tasks_per_indexer(),\n            );\n            // No need to apply the new plan as it is the same as the old one.\n            if plans_diff.is_empty() {\n                return;\n            }\n        }\n        self.apply_physical_indexing_plan(&indexers, new_physical_plan, Some(notify_on_drop));\n        self.state.num_schedule_indexing_plan += 1;\n    }\n\n    /// Checks if the last applied plan corresponds to the running indexing tasks present in the\n    /// chitchat cluster state. If true, do nothing.\n    /// - If node IDs differ, schedule a new indexing plan.\n    /// - If indexing tasks differ, apply again the last plan.\n    pub(crate) fn control_running_plan(&mut self, model: &ControlPlaneModel) {\n        let last_applied_plan =\n            if let Some(last_applied_plan) = &self.state.last_applied_physical_plan {\n                last_applied_plan\n            } else {\n                // If there is no plan, the node is probably starting and the scheduler did not find\n                // indexers yet. 
In this case, we want to schedule as soon as possible to find new\n                // indexers.\n                self.rebuild_plan(model);\n                return;\n            };\n        if let Some(last_applied_plan_timestamp) = self.state.last_applied_plan_timestamp\n            && Instant::now().duration_since(last_applied_plan_timestamp)\n                < MIN_DURATION_BETWEEN_SCHEDULING\n        {\n            return;\n        }\n        let indexers: Vec<IndexerNodeInfo> = self.get_indexers_from_indexer_pool();\n        let running_indexing_tasks_by_node_id: FnvHashMap<String, Vec<IndexingTask>> = indexers\n            .iter()\n            .map(|indexer| (indexer.node_id.to_string(), indexer.indexing_tasks.clone()))\n            .collect();\n\n        let indexing_plans_diff = get_indexing_plans_diff(\n            &running_indexing_tasks_by_node_id,\n            last_applied_plan.indexing_tasks_per_indexer(),\n        );\n        if !indexing_plans_diff.has_same_nodes() {\n            info!(plans_diff=?indexing_plans_diff, \"running plan and last applied plan node IDs differ: schedule an indexing plan\");\n            self.rebuild_plan(model);\n        } else if !indexing_plans_diff.has_same_tasks() {\n            // Some nodes may have not received their tasks, apply it again.\n            info!(plans_diff=?indexing_plans_diff, \"running tasks and last applied tasks differ: reapply last plan\");\n            self.apply_physical_indexing_plan(&indexers, last_applied_plan.clone(), None);\n        }\n    }\n\n    fn get_indexers_from_indexer_pool(&self) -> Vec<IndexerNodeInfo> {\n        self.indexer_pool.values()\n    }\n\n    fn apply_physical_indexing_plan(\n        &mut self,\n        indexers: &[IndexerNodeInfo],\n        new_physical_plan: PhysicalIndexingPlan,\n        notify_on_drop: Option<Arc<NotifyChangeOnDrop>>,\n    ) {\n        debug!(new_physical_plan=?new_physical_plan, \"apply physical indexing plan\");\n        
crate::metrics::CONTROL_PLANE_METRICS.apply_plan_total.inc();\n        for (node_id, indexing_tasks) in new_physical_plan.indexing_tasks_per_indexer() {\n            // We don't want to block on a slow indexer so we apply this change asynchronously\n            // TODO not blocking is cool, but we need to make sure there is not accumulation\n            // possible here.\n            let notify_on_drop = notify_on_drop.clone();\n            tokio::spawn({\n                let indexer = indexers\n                    .iter()\n                    .find(|indexer| indexer.node_id == *node_id)\n                    .expect(\"This should never happen as the plan was built from these indexers.\")\n                    .clone();\n                let indexing_tasks = indexing_tasks.clone();\n                async move {\n                    if let Err(error) = indexer\n                        .client\n                        .clone()\n                        .apply_indexing_plan(ApplyIndexingPlanRequest { indexing_tasks })\n                        .await\n                    {\n                        warn!(\n                            %error,\n                            node_id=%indexer.node_id,\n                            generation_id=indexer.generation_id,\n                            \"failed to apply indexing plan to indexer\"\n                        );\n                    }\n                    drop(notify_on_drop);\n                }\n            });\n        }\n        self.state.num_applied_physical_indexing_plan += 1;\n        self.state.last_applied_plan_timestamp = Some(Instant::now());\n        self.state.last_applied_physical_plan = Some(new_physical_plan);\n    }\n}\n\nstruct IndexingPlansDiff<'a> {\n    pub missing_node_ids: FnvHashSet<&'a str>,\n    pub unplanned_node_ids: FnvHashSet<&'a str>,\n    pub missing_tasks_by_node_id: FnvHashMap<&'a str, Vec<&'a IndexingTask>>,\n    pub unplanned_tasks_by_node_id: FnvHashMap<&'a str, Vec<&'a 
IndexingTask>>,\n}\n\nimpl IndexingPlansDiff<'_> {\n    pub fn has_same_nodes(&self) -> bool {\n        self.missing_node_ids.is_empty() && self.unplanned_node_ids.is_empty()\n    }\n\n    pub fn has_same_tasks(&self) -> bool {\n        self.missing_tasks_by_node_id\n            .values()\n            .map(Vec::len)\n            .sum::<usize>()\n            == 0\n            && self\n                .unplanned_tasks_by_node_id\n                .values()\n                .map(Vec::len)\n                .sum::<usize>()\n                == 0\n    }\n\n    pub fn is_empty(&self) -> bool {\n        self.has_same_nodes() && self.has_same_tasks()\n    }\n}\n\nfn get_shard_locality_metrics(\n    physical_plan: &PhysicalIndexingPlan,\n    shard_locations: &ShardLocations,\n) -> ShardLocalityMetrics {\n    let mut num_local_shards = 0;\n    let mut num_remote_shards = 0;\n    for (indexer, tasks) in physical_plan.indexing_tasks_per_indexer() {\n        for task in tasks {\n            for shard_id in &task.shard_ids {\n                if shard_locations\n                    .get_shard_locations(shard_id)\n                    .iter()\n                    .any(|node| node.as_str() == indexer)\n                {\n                    num_local_shards += 1;\n                } else {\n                    num_remote_shards += 1;\n                }\n            }\n        }\n    }\n    ShardLocalityMetrics {\n        num_remote_shards,\n        num_local_shards,\n    }\n}\n\nimpl fmt::Debug for IndexingPlansDiff<'_> {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        if self.has_same_nodes() && self.has_same_tasks() {\n            return write!(formatter, \"EmptyIndexingPlansDiff\");\n        }\n        write!(formatter, \"IndexingPlansDiff(\")?;\n        let mut separator = \"\";\n        if !self.missing_node_ids.is_empty() {\n            write!(\n                formatter,\n                \"missing_node_ids={:?}\",\n                
PrettySample::new(&self.missing_node_ids, 10)\n            )?;\n            separator = \", \"\n        }\n        if !self.unplanned_node_ids.is_empty() {\n            write!(\n                formatter,\n                \"{separator}unplanned_node_ids={:?}\",\n                PrettySample::new(&self.unplanned_node_ids, 10)\n            )?;\n            separator = \", \"\n        }\n        if !self.missing_tasks_by_node_id.is_empty() {\n            write!(formatter, \"{separator}missing_tasks_by_node_id=\",)?;\n            format_indexing_task_map(formatter, &self.missing_tasks_by_node_id)?;\n            separator = \", \"\n        }\n        if !self.unplanned_tasks_by_node_id.is_empty() {\n            write!(formatter, \"{separator}unplanned_tasks_by_node_id=\",)?;\n            format_indexing_task_map(formatter, &self.unplanned_tasks_by_node_id)?;\n        }\n        write!(formatter, \")\")\n    }\n}\n\nfn format_indexing_task_map(\n    formatter: &mut std::fmt::Formatter,\n    indexing_tasks: &FnvHashMap<&str, Vec<&IndexingTask>>,\n) -> std::fmt::Result {\n    // we show at most 5 nodes, and aggregate the results for the other.\n    // we show at most 10 indexes, but aggregate results after.\n    // we always aggregate shard ids\n    // we hide pipeline id and incarnation id, they are not very useful in most case, but take a\n    // lot of place\n    const MAX_NODE: usize = 5;\n    const MAX_INDEXES: usize = 10;\n    let mut index_displayed = 0;\n    write!(formatter, \"{{\")?;\n    let mut indexer_iter = indexing_tasks.iter().enumerate();\n    for (i, (index_name, tasks)) in &mut indexer_iter {\n        if i != 0 {\n            write!(formatter, \", \")?;\n        }\n        if index_displayed != MAX_INDEXES - 1 {\n            write!(formatter, \"{index_name:?}: [\")?;\n            let mut tasks_iter = tasks.iter().enumerate();\n            for (i, task) in &mut tasks_iter {\n                if i != 0 {\n                    write!(formatter, \", \")?;\n    
            }\n                write!(\n                    formatter,\n                    r#\"(index_id: \"{}\", source_id: \"{}\", shard_count: {})\"#,\n                    task.index_uid.as_ref().unwrap().index_id,\n                    task.source_id,\n                    task.shard_ids.len()\n                )?;\n                index_displayed += 1;\n                if index_displayed == MAX_INDEXES - 1 {\n                    let (task_count, shard_count) = tasks_iter.fold((0, 0), |(t, s), (_, task)| {\n                        (t + 1, s + task.shard_ids.len())\n                    });\n                    if task_count > 0 {\n                        write!(\n                            formatter,\n                            \" and {task_count} tasks and {shard_count} shards\"\n                        )?;\n                    }\n                    break;\n                }\n            }\n            write!(formatter, \"]\")?;\n        } else {\n            write!(\n                formatter,\n                \"{index_name:?}: [with {} tasks and {} shards]\",\n                tasks.len(),\n                tasks.iter().map(|task| task.shard_ids.len()).sum::<usize>()\n            )?;\n        }\n        if i == MAX_NODE - 1 {\n            break;\n        }\n    }\n    let (indexer, tasks, shards) = indexer_iter.fold((0, 0, 0), |(i, t, s), (_, (_, task))| {\n        (\n            i + 1,\n            t + task.len(),\n            s + task.iter().map(|task| task.shard_ids.len()).sum::<usize>(),\n        )\n    });\n    if indexer > 0 {\n        write!(\n            formatter,\n            \" and {indexer} more indexers, handling {tasks} tasks and {shards} shards}}\"\n        )\n    } else {\n        write!(formatter, \"}}\")\n    }\n}\n\n/// Returns the difference between the `running_plan` retrieved from the chitchat state and\n/// the last plan applied by the scheduler.\nfn get_indexing_plans_diff<'a>(\n    running_plan: &'a FnvHashMap<String, 
Vec<IndexingTask>>,\n    last_applied_plan: &'a FnvHashMap<String, Vec<IndexingTask>>,\n) -> IndexingPlansDiff<'a> {\n    // Nodes diff.\n    let running_node_ids: FnvHashSet<&str> = running_plan\n        .keys()\n        .map(|node_id| node_id.as_str())\n        .collect();\n    let planned_node_ids: FnvHashSet<&str> = last_applied_plan\n        .keys()\n        .map(|node_id| node_id.as_str())\n        .collect();\n    let missing_node_ids: FnvHashSet<&str> = planned_node_ids\n        .difference(&running_node_ids)\n        .copied()\n        .collect();\n    let unplanned_node_ids: FnvHashSet<&str> = running_node_ids\n        .difference(&planned_node_ids)\n        .copied()\n        .collect();\n    // Tasks diff.\n    let mut missing_tasks_by_node_id: FnvHashMap<&str, Vec<&IndexingTask>> = FnvHashMap::default();\n    let mut unplanned_tasks_by_node_id: FnvHashMap<&str, Vec<&IndexingTask>> =\n        FnvHashMap::default();\n    for node_id in running_node_ids.iter().chain(planned_node_ids.iter()) {\n        let running_tasks = running_plan\n            .get(*node_id)\n            .map(Vec::as_slice)\n            .unwrap_or_else(|| &[]);\n        let last_applied_tasks = last_applied_plan\n            .get(*node_id)\n            .map(Vec::as_slice)\n            .unwrap_or_else(|| &[]);\n        let (missing_tasks, unplanned_tasks) =\n            get_indexing_tasks_diff(running_tasks, last_applied_tasks);\n        missing_tasks_by_node_id.insert(*node_id, missing_tasks);\n        unplanned_tasks_by_node_id.insert(*node_id, unplanned_tasks);\n    }\n    IndexingPlansDiff {\n        missing_node_ids,\n        unplanned_node_ids,\n        missing_tasks_by_node_id,\n        unplanned_tasks_by_node_id,\n    }\n}\n\n/// Computes the difference between `running_tasks` and `last_applied_tasks` and returns a tuple\n/// of `missing_tasks` and `unplanned_tasks`.\n/// Note: we need to handle duplicate tasks in each array, so we count them and make the diff.\nfn 
get_indexing_tasks_diff<'a>(\n    running_tasks: &'a [IndexingTask],\n    last_applied_tasks: &'a [IndexingTask],\n) -> (Vec<&'a IndexingTask>, Vec<&'a IndexingTask>) {\n    let mut missing_tasks: Vec<&IndexingTask> = Vec::new();\n    let mut unplanned_tasks: Vec<&IndexingTask> = Vec::new();\n    let grouped_running_tasks: FnvHashMap<&IndexingTask, usize> = running_tasks\n        .iter()\n        .chunk_by(|&task| task)\n        .into_iter()\n        .map(|(key, group)| (key, group.count()))\n        .collect();\n    let grouped_last_applied_tasks: FnvHashMap<&IndexingTask, usize> = last_applied_tasks\n        .iter()\n        .chunk_by(|&task| task)\n        .into_iter()\n        .map(|(key, group)| (key, group.count()))\n        .collect();\n    let all_tasks: FnvHashSet<&IndexingTask> =\n        FnvHashSet::from_iter(running_tasks.iter().chain(last_applied_tasks.iter()));\n    for task in all_tasks {\n        let running_task_count = grouped_running_tasks.get(task).unwrap_or(&0);\n        let desired_task_count = grouped_last_applied_tasks.get(task).unwrap_or(&0);\n        match running_task_count.cmp(desired_task_count) {\n            Ordering::Greater => {\n                unplanned_tasks\n                    .extend_from_slice(&vec![task; running_task_count - desired_task_count]);\n            }\n            Ordering::Less => {\n                missing_tasks\n                    .extend_from_slice(&vec![task; desired_task_count - running_task_count])\n            }\n            _ => {}\n        }\n    }\n\n    (missing_tasks, unplanned_tasks)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n    use std::str::FromStr;\n\n    use proptest::{prop_compose, proptest};\n    use quickwit_config::{IndexConfig, KafkaSourceParams, SourceConfig, SourceParams};\n    use quickwit_metastore::IndexMetadata;\n    use quickwit_proto::types::{IndexUid, PipelineUid, ShardId, SourceUid};\n\n    use super::*;\n    use crate::model::ShardLocations;\n    #[test]\n  
  fn test_indexing_plans_diff() {\n        let index_uid = IndexUid::from_str(\"index-1:11111111111111111111111111\").unwrap();\n        let index_uid2 = IndexUid::from_str(\"index-2:11111111111111111111111111\").unwrap();\n        {\n            let running_plan = FnvHashMap::default();\n            let desired_plan = FnvHashMap::default();\n            let indexing_plans_diff = get_indexing_plans_diff(&running_plan, &desired_plan);\n            assert!(indexing_plans_diff.is_empty());\n        }\n        {\n            let mut running_plan = FnvHashMap::default();\n            let mut desired_plan = FnvHashMap::default();\n            let task_1 = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(10u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            let task_1b = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(11u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            let task_2 = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(20u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-2\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            running_plan.insert(\n                \"indexer-1\".to_string(),\n                vec![task_1.clone(), task_1b.clone(), task_2.clone()],\n            );\n            desired_plan.insert(\n                \"indexer-1\".to_string(),\n                vec![task_2, task_1.clone(), task_1b.clone()],\n            );\n            let indexing_plans_diff = get_indexing_plans_diff(&running_plan, &desired_plan);\n            
assert!(indexing_plans_diff.is_empty());\n        }\n        {\n            let mut running_plan = FnvHashMap::default();\n            let mut desired_plan = FnvHashMap::default();\n            let task_1 = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            let task_2 = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-2\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            running_plan.insert(\"indexer-1\".to_string(), vec![task_1.clone()]);\n            desired_plan.insert(\"indexer-1\".to_string(), vec![task_2.clone()]);\n\n            let indexing_plans_diff = get_indexing_plans_diff(&running_plan, &desired_plan);\n            assert!(!indexing_plans_diff.is_empty());\n            assert!(indexing_plans_diff.has_same_nodes());\n            assert!(!indexing_plans_diff.has_same_tasks());\n            assert_eq!(\n                indexing_plans_diff.unplanned_tasks_by_node_id,\n                FnvHashMap::from_iter([(\"indexer-1\", vec![&task_1])])\n            );\n            assert_eq!(\n                indexing_plans_diff.missing_tasks_by_node_id,\n                FnvHashMap::from_iter([(\"indexer-1\", vec![&task_2])])\n            );\n        }\n        {\n            // Task assigned to indexer-1 in desired plan but another one running.\n            let mut running_plan = FnvHashMap::default();\n            let mut desired_plan = FnvHashMap::default();\n            let task_1 = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                index_uid: Some(index_uid.clone()),\n                
source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            let task_2 = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                index_uid: Some(index_uid2.clone()),\n                source_id: \"source-2\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            running_plan.insert(\"indexer-2\".to_string(), vec![task_2.clone()]);\n            desired_plan.insert(\"indexer-1\".to_string(), vec![task_1.clone()]);\n\n            let indexing_plans_diff = get_indexing_plans_diff(&running_plan, &desired_plan);\n            assert!(!indexing_plans_diff.is_empty());\n            assert!(!indexing_plans_diff.has_same_nodes());\n            assert!(!indexing_plans_diff.has_same_tasks());\n            assert_eq!(\n                indexing_plans_diff.missing_node_ids,\n                FnvHashSet::from_iter([\"indexer-1\"])\n            );\n            assert_eq!(\n                indexing_plans_diff.unplanned_node_ids,\n                FnvHashSet::from_iter([\"indexer-2\"])\n            );\n            assert_eq!(\n                indexing_plans_diff.missing_tasks_by_node_id,\n                FnvHashMap::from_iter([(\"indexer-1\", vec![&task_1]), (\"indexer-2\", Vec::new())])\n            );\n            assert_eq!(\n                indexing_plans_diff.unplanned_tasks_by_node_id,\n                FnvHashMap::from_iter([(\"indexer-2\", vec![&task_2]), (\"indexer-1\", Vec::new())])\n            );\n        }\n        {\n            // Diff with 3 same tasks running but only one on the desired plan.\n            let mut running_plan = FnvHashMap::default();\n            let mut desired_plan = FnvHashMap::default();\n            let task_1a = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(10u128)),\n                index_uid: Some(index_uid.clone()),\n           
     source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            let task_1b = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(11u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            let task_1c = IndexingTask {\n                pipeline_uid: Some(PipelineUid::for_test(12u128)),\n                index_uid: Some(index_uid.clone()),\n                source_id: \"source-1\".to_string(),\n                shard_ids: Vec::new(),\n                params_fingerprint: 0,\n            };\n            running_plan.insert(\"indexer-1\".to_string(), vec![task_1a.clone()]);\n            desired_plan.insert(\n                \"indexer-1\".to_string(),\n                vec![task_1a.clone(), task_1b.clone(), task_1c.clone()],\n            );\n\n            let indexing_plans_diff = get_indexing_plans_diff(&running_plan, &desired_plan);\n            assert!(!indexing_plans_diff.is_empty());\n            assert!(indexing_plans_diff.has_same_nodes());\n            assert!(!indexing_plans_diff.has_same_tasks());\n            assert_eq!(\n                indexing_plans_diff.missing_tasks_by_node_id,\n                FnvHashMap::from_iter([(\"indexer-1\", vec![&task_1b, &task_1c])])\n            );\n        }\n    }\n\n    #[test]\n    fn test_get_sources_to_schedule() {\n        let mut model = ControlPlaneModel::default();\n        let kafka_source_params = KafkaSourceParams {\n            topic: \"kafka-topic\".to_string(),\n            client_log_level: None,\n            client_params: serde_json::json!({}),\n            enable_backfill_mode: false,\n        };\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = 
index_metadata.index_uid.clone();\n        model.add_index(index_metadata);\n        model\n            .add_source(\n                &index_uid,\n                SourceConfig {\n                    source_id: \"source_disabled\".to_string(),\n                    num_pipelines: NonZeroUsize::new(3).unwrap(),\n                    enabled: false,\n                    source_params: SourceParams::Kafka(kafka_source_params.clone()),\n                    transform_config: None,\n                    input_format: Default::default(),\n                },\n            )\n            .unwrap();\n        model\n            .add_source(\n                &index_uid,\n                SourceConfig {\n                    source_id: \"source_enabled\".to_string(),\n                    num_pipelines: NonZeroUsize::new(2).unwrap(),\n                    enabled: true,\n                    source_params: SourceParams::Kafka(kafka_source_params.clone()),\n                    transform_config: None,\n                    input_format: Default::default(),\n                },\n            )\n            .unwrap();\n        model\n            .add_source(\n                &index_uid,\n                SourceConfig {\n                    source_id: \"ingest_v1\".to_string(),\n                    num_pipelines: NonZeroUsize::new(2).unwrap(),\n                    enabled: true,\n                    // ingest v1\n                    source_params: SourceParams::IngestApi,\n                    transform_config: None,\n                    input_format: Default::default(),\n                },\n            )\n            .unwrap();\n        model\n            .add_source(\n                &index_uid,\n                SourceConfig {\n                    source_id: \"ingest_v2\".to_string(),\n                    num_pipelines: NonZeroUsize::new(2).unwrap(),\n                    enabled: true,\n                    // ingest v2\n                    source_params: SourceParams::Ingest,\n                   
 transform_config: None,\n                    input_format: Default::default(),\n                },\n            )\n            .unwrap();\n        // ingest v2 without any open shard is skipped.\n        model\n            .add_source(\n                &index_uid,\n                SourceConfig {\n                    source_id: \"ingest_v2_without_shard\".to_string(),\n                    num_pipelines: NonZeroUsize::new(2).unwrap(),\n                    enabled: true,\n                    // ingest v2\n                    source_params: SourceParams::Ingest,\n                    transform_config: None,\n                    input_format: Default::default(),\n                },\n            )\n            .unwrap();\n        model\n            .add_source(\n                &index_uid,\n                SourceConfig {\n                    source_id: \"ingest_cli\".to_string(),\n                    num_pipelines: NonZeroUsize::new(2).unwrap(),\n                    enabled: true,\n                    // ingest v1\n                    source_params: SourceParams::IngestCli,\n                    transform_config: None,\n                    input_format: Default::default(),\n                },\n            )\n            .unwrap();\n        let shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: \"ingest_v2\".to_string(),\n            shard_id: Some(ShardId::from(17)),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        model.insert_shards(&index_uid, &\"ingest_v2\".to_string(), vec![shard]);\n        let shards: Vec<SourceToSchedule> = get_sources_to_schedule(&model);\n        assert_eq!(shards.len(), 3);\n    }\n\n    #[test]\n    fn test_build_physical_indexing_plan_simple() {\n        let source_1 = SourceUid {\n            index_uid: IndexUid::for_test(\"index-1\", 0),\n            source_id: \"source1\".to_string(),\n        };\n        let source_2 = SourceUid {\n            
index_uid: IndexUid::for_test(\"index-2\", 0),\n            source_id: \"source2\".to_string(),\n        };\n        let sources = [\n            SourceToSchedule {\n                source_uid: source_1.clone(),\n                source_type: SourceToScheduleType::NonSharded {\n                    num_pipelines: 3,\n                    load_per_pipeline: NonZeroU32::new(1_000).unwrap(),\n                },\n                params_fingerprint: 0,\n            },\n            SourceToSchedule {\n                source_uid: source_2.clone(),\n                source_type: SourceToScheduleType::NonSharded {\n                    num_pipelines: 2,\n                    load_per_pipeline: NonZeroU32::new(1_000).unwrap(),\n                },\n                params_fingerprint: 0,\n            },\n        ];\n        let mut indexer_max_loads = FnvHashMap::default();\n        indexer_max_loads.insert(\"indexer1\".to_string(), mcpu(3_000));\n        indexer_max_loads.insert(\"indexer2\".to_string(), mcpu(3_000));\n        let shard_locations = ShardLocations::default();\n        let physical_plan =\n            build_physical_indexing_plan(&sources[..], &indexer_max_loads, None, &shard_locations);\n        assert_eq!(physical_plan.indexing_tasks_per_indexer().len(), 2);\n        let indexing_tasks_1 = physical_plan.indexer(\"indexer1\").unwrap();\n        assert_eq!(indexing_tasks_1.len(), 2);\n        let indexer_2_tasks = physical_plan.indexer(\"indexer2\").unwrap();\n        assert_eq!(indexer_2_tasks.len(), 3);\n    }\n\n    #[test]\n    fn test_debug_indexing_task_map() {\n        let mut map = FnvHashMap::default();\n        let task1 = IndexingTask {\n            index_uid: Some(IndexUid::for_test(\"index1\", 123)),\n            source_id: \"my-source\".to_string(),\n            pipeline_uid: Some(PipelineUid::random()),\n            shard_ids: vec![\"shard1\".into()],\n            params_fingerprint: 0,\n        };\n        let task2 = IndexingTask {\n            
index_uid: Some(IndexUid::for_test(\"index2\", 123)),\n            source_id: \"my-source\".to_string(),\n            pipeline_uid: Some(PipelineUid::random()),\n            shard_ids: vec![\"shard2\".into(), \"shard3\".into()],\n            params_fingerprint: 0,\n        };\n        let task3 = IndexingTask {\n            index_uid: Some(IndexUid::for_test(\"index3\", 123)),\n            source_id: \"my-source\".to_string(),\n            pipeline_uid: Some(PipelineUid::random()),\n            shard_ids: vec![\"shard6\".into()],\n            params_fingerprint: 0,\n        };\n        // order made to map with the debug for lisibility\n        map.insert(\"indexer5\", vec![&task2]);\n        map.insert(\"indexer4\", vec![&task1]);\n        map.insert(\"indexer3\", vec![&task1, &task3]);\n        map.insert(\"indexer2\", vec![&task2, &task3, &task1, &task2]);\n        map.insert(\"indexer1\", vec![&task1, &task2, &task3, &task1]);\n        map.insert(\"indexer6\", vec![&task1, &task2, &task3]);\n        let plan = IndexingPlansDiff {\n            missing_node_ids: FnvHashSet::default(),\n            unplanned_node_ids: FnvHashSet::default(),\n            missing_tasks_by_node_id: map,\n            unplanned_tasks_by_node_id: FnvHashMap::default(),\n        };\n\n        let debug = format!(\"{plan:?}\");\n        assert_eq!(\n            debug,\n            r#\"IndexingPlansDiff(missing_tasks_by_node_id={\"indexer5\": [(index_id: \"index2\", source_id: \"my-source\", shard_count: 2)], \"indexer4\": [(index_id: \"index1\", source_id: \"my-source\", shard_count: 1)], \"indexer3\": [(index_id: \"index1\", source_id: \"my-source\", shard_count: 1), (index_id: \"index3\", source_id: \"my-source\", shard_count: 1)], \"indexer2\": [(index_id: \"index2\", source_id: \"my-source\", shard_count: 2), (index_id: \"index3\", source_id: \"my-source\", shard_count: 1), (index_id: \"index1\", source_id: \"my-source\", shard_count: 1), (index_id: \"index2\", source_id: 
\"my-source\", shard_count: 2)], \"indexer1\": [(index_id: \"index1\", source_id: \"my-source\", shard_count: 1) and 3 tasks and 4 shards] and 1 more indexers, handling 3 tasks and 4 shards})\"#\n        );\n    }\n\n    proptest! {\n        #[test]\n        fn test_building_indexing_tasks_and_physical_plan(num_indexers in 1usize..50usize, index_id_sources in proptest::collection::vec(gen_kafka_source(), 1..20)) {\n            let index_uids: fnv::FnvHashSet<IndexUid> =\n                index_id_sources.iter()\n                    .map(|(index_uid, _)| index_uid.clone())\n                    .collect();\n            let mut model = ControlPlaneModel::default();\n            for index_uid in index_uids {\n                let index_config = IndexConfig::for_test(&index_uid.index_id, &format!(\"ram://test/{index_uid}\"));\n                model.add_index(IndexMetadata::new_with_index_uid(index_uid, index_config));\n            }\n            for (index_uid, source_config) in &index_id_sources {\n                model.add_source(index_uid, source_config.clone()).unwrap();\n            }\n\n            let sources: Vec<SourceToSchedule> = get_sources_to_schedule(&model);\n            let mut indexer_max_loads = FnvHashMap::default();\n            for i in 0..num_indexers {\n                let indexer_id = format!(\"indexer-{i}\");\n                indexer_max_loads.insert(indexer_id, mcpu(4_000));\n            }\n            let shard_locations = ShardLocations::default();\n            let _physical_indexing_plan = build_physical_indexing_plan(&sources, &indexer_max_loads, None, &shard_locations);\n        }\n    }\n\n    use quickwit_config::SourceInputFormat;\n    use quickwit_proto::indexing::mcpu;\n    use quickwit_proto::ingest::{Shard, ShardState};\n\n    fn kafka_source_params_for_test() -> SourceParams {\n        SourceParams::Kafka(KafkaSourceParams {\n            topic: \"topic\".to_string(),\n            client_log_level: None,\n            client_params: 
serde_json::json!({\n                \"bootstrap.servers\": \"localhost:9092\",\n            }),\n            enable_backfill_mode: true,\n        })\n    }\n\n    prop_compose! {\n      fn gen_kafka_source()\n        (index_idx in 0usize..100usize, num_pipelines in 1usize..51usize) -> (IndexUid, SourceConfig) {\n          let index_uid = IndexUid::for_test(&format!(\"index-id-{index_idx}\"), 0 /* this is the index uid */);\n          let source_id = quickwit_common::rand::append_random_suffix(\"kafka-source\");\n          (index_uid, SourceConfig {\n              source_id,\n              num_pipelines: NonZeroUsize::new(num_pipelines).unwrap(),\n              enabled: true,\n              source_params: kafka_source_params_for_test(),\n              transform_config: None,\n              input_format: SourceInputFormat::Json,\n          })\n      }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_scheduler/scheduling/README.md",
    "content": "# Scheduling logic\n\nQuickwit needs to assign indexing tasks to a set of indexer nodes.\nWe call the result of this decision the physical indexing plan.\n\nWe also want to observe some interesting properties such as:\n- (A) we want to avoid moving indexing tasks from one indexer to another one needlessly.\n- (B) we want a source to be spread amongst as few nodes as possible\n- (C) we want to balance the load between nodes as soon as the load is significantly (>30%) higher than the average (target) load\n- (D) when we are working with the Ingest API source, we prefer to colocate indexers on\n  the ingesters holding the data.\n\n# Problem abstraction\n\nTo simplify the logic and make it easier to test, we first abstract this into the following\noptimization problem. In Quickwit, we have two types of sources:\n\n- The push API source: these sources have a given (changing) set of shards associated with them.\n  A shard is rate-limited to ensure its throughput is lower than `5MB/s` worth of\n  uncompressed data. This guarantees that a given shard can be indexed by a\n  single indexing pipeline.\n\n- Other sources, like Kafka. It is a very common use case to use Quickwit to index large\n  amounts of historical data. Right now, the user is therefore expected to supply a desired\n  number of pipelines.\n\nRouters send their batches to the different ingesters they know using round-robin logic.\nWe assume that a router's list of known shards is eventually updated after a shard addition, so that\nwe can assume that shards have roughly the same load.\n\nIndexers report the observed load of all of their pipelines.\nThis load is assumed to be unidimensional. This is imperfect of course: indexing consumes network, IO, etc.\nStill, for the sake of simplification, we pick one metric, measured as the amount of CPU spent\nin the indexer. 
\n\nThe control plane consolidates this figure to create a `load_per_shard` metric expressed in millicpu.\n\nThe hypotheses above allow us to see both Kafka and ingest sources through the same lens, and stop\nmaking a distinction between shards.\n\nIn our scheduler, a source simply has:\n- an identifier (a `u32`)\n- a number of shards.\n- a load per shard, stored as a `u32` and expressed in thousandths of a CPU.\n\nAn indexer has:\n- a maximum total load (that we will need to measure or configure).\n\nThe problem is now greatly simplified.\nA solution is a sparse matrix of `(num_indexers, num_sources)` that holds a number of shards to be indexed.\nThe different constraints and desired properties can all be re-expressed. For instance:\n- We want the dot product of the load per shard vector with each row to be close to the average load of each node (C)\n- We do not want a large distance between the previous and the new solution matrices (A)\n- We want that matrix to be as sparse as possible (B)\n\nNote that the constraint (C) is enforced differently depending on the load:\n- shards can be placed freely on nodes up to 30% of their capacity\n- above this threshold, we try to assign shards to indexers so that the total load on each indexer is close to the average load\n\nTo express the affinity constraint (D) we could similarly define a matrix of `(num_indexers, num_sources)` with affinity scores and compute a distance with the solution matrix.\n\nThe actual cost function we would craft is, however, not linear: it is a combination of multiple distances like those described above.\n\n# The heuristic\n\nWe use the following heuristic.\n\nWhile assigning shards to nodes, we try to ensure that workloads are balanced (except for very small cluster loads). This is achieved by calculating a virtual capacity for each indexer. We calculate 120% of the total load on the entire cluster, then divide it up proportionally between indexers according to their capacity. 
By respecting this virtual capacity when assigning shards to indexers, we make sure that all indexers have a load close to the average load.\n\n## Phase 1: Remove extraneous shards\n\nStarting from the existing solution, we first reduce it to make sure we do not have too many shards assigned. This happens when a source was scaled down or deleted.\nThis is done by reducing the number of shards wherever needed, picking nodes with few shards first.\n\nWe call the resulting solution the \"reduced solution\". The reduced solution is usually not a valid solution, as some shards\nmay have been added. We will add these in Phase 3.\n\nIf we compute the distance to the previous solution, we want to use the \"reduced solution\" and not the actual\nprevious solution.\n\n## Phase 2: Enforce nodes maximum load\n\nWe then remove entire sources from nodes where the load is higher than the capacity (load <30%) or virtual capacity (load >30%).\nFor every given node, we first remove the sources that have a small overall load on the node.\n\nMatrix-wise, note that phase 1 and phase 2 produce a matrix lower than or equal to the previous solution.\n\n## Phase 3: Greedy assignment\n\nAt this point we have reached a solution that fits on the cluster, but it may still be missing several shards.\nWe therefore use a greedy algorithm to allocate these shards. We assign the shards source by source, in decreasing order of total load.\n\nWe try assigning shards to indexers while trying to respect their virtual capacity. Because of the uneven size of shards and the greedy approach, this problem might not have a solution. In that case we iteratively grow the virtual capacity by 20% until the solution fits.\n\nShards for each source are placed in two steps:\n- in a first iteration we assign shards that have affinity scores (D)\n- in a second iteration we assign the rest of the shards starting with the node having the highest capacity\n\n## Phase 4: Optimization\n\nThis is not implemented yet. 
We could craft a proper optimization cost and use a breadth-first search (BFS) to explore\nbetter solutions.\n\n\n# Code organization\n\nAll of this scheduling is done in the `scheduling` directory.\nClients only have to call the `build_physical_indexing_plan` function.\n\nThe code converts the list of sources into a \"scheduling problem\" that abstracts away Kafka pipelines and ingest v2 pipelines.\nThe problem then goes through our optimization code.\nThe solution at this point only contains the number of shards of each type to be assigned to each indexer.\nThe function expands this solution into a complete physical plan, with shard ids and pipelines.\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_scheduler/scheduling/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub mod scheduling_logic;\npub mod scheduling_logic_model;\n\nuse std::collections::HashMap;\nuse std::num::NonZeroU32;\n\nuse fnv::{FnvHashMap, FnvHashSet};\nuse quickwit_common::rate_limited_debug;\nuse quickwit_proto::indexing::{CpuCapacity, IndexingTask};\nuse quickwit_proto::types::{PipelineUid, ShardId, SourceUid};\nuse scheduling_logic_model::{IndexerOrd, SourceOrd};\nuse tracing::{error, warn};\n\nuse crate::indexing_plan::PhysicalIndexingPlan;\nuse crate::indexing_scheduler::scheduling::scheduling_logic_model::{\n    IndexerAssignment, SchedulingProblem, SchedulingSolution,\n};\nuse crate::model::ShardLocations;\n\n/// If we have several pipelines below this threshold we\n/// reduce the number of pipelines.\n///\n/// Note that even for 2 pipelines, this creates a hysteresis effect.\n///\n/// Starting from a single pipeline,\n/// an overall load above 80% is enough to trigger the creation of a\n/// second pipeline.\n///\n/// Coming back to a single pipeline requires having a load per pipeline\n/// of 30%. 
Which translates into an overall load of 60%.\nconst CPU_PER_PIPELINE_LOAD_LOWER_THRESHOLD: CpuCapacity = CpuCapacity::from_cpu_millis(1_200);\n\n/// That's 80% of a period\nconst MAX_LOAD_PER_PIPELINE: CpuCapacity = CpuCapacity::from_cpu_millis(3_200);\n\nfn populate_problem(\n    source: &SourceToSchedule,\n    problem: &mut SchedulingProblem,\n) -> Option<SourceOrd> {\n    match &source.source_type {\n        SourceToScheduleType::IngestV1 => {\n            // TODO ingest v1 is scheduled differently\n            None\n        }\n        SourceToScheduleType::Sharded {\n            shard_ids,\n            load_per_shard,\n        } => {\n            let num_shards = shard_ids.len() as u32;\n            let source_ord = problem.add_source(num_shards, *load_per_shard);\n            Some(source_ord)\n        }\n        SourceToScheduleType::NonSharded {\n            num_pipelines,\n            load_per_pipeline,\n        } => {\n            let source_ord = problem.add_source(*num_pipelines, *load_per_pipeline);\n            Some(source_ord)\n        }\n    }\n}\n\n#[derive(Default)]\nstruct IdToOrdMap<'a> {\n    indexer_ids: Vec<String>,\n    sources: Vec<&'a SourceToSchedule>,\n    indexer_id_to_indexer_ord: FnvHashMap<String, IndexerOrd>,\n    source_uid_to_source_ord: FnvHashMap<SourceUid, SourceOrd>,\n}\n\nimpl<'a> IdToOrdMap<'a> {\n    // All source added are required to have a different source uid.\n    fn add_source(&mut self, source: &'a SourceToSchedule) -> SourceOrd {\n        let source_ord = self.source_uid_to_source_ord.len() as SourceOrd;\n        let previous_item = self\n            .source_uid_to_source_ord\n            .insert(source.source_uid.clone(), source_ord);\n        assert!(previous_item.is_none());\n        self.sources.push(source);\n        source_ord\n    }\n\n    fn source_ord(&self, source_uid: &SourceUid) -> Option<SourceOrd> {\n        self.source_uid_to_source_ord.get(source_uid).copied()\n    }\n\n    fn source(&self, 
source_uid: &SourceUid) -> Option<(SourceOrd, &'a SourceToSchedule)> {\n        let source_ord = self.source_uid_to_source_ord.get(source_uid).copied()?;\n        Some((source_ord, self.sources[source_ord as usize]))\n    }\n\n    fn indexer_ord(&self, indexer_id: &str) -> Option<IndexerOrd> {\n        self.indexer_id_to_indexer_ord.get(indexer_id).copied()\n    }\n\n    fn add_indexer_id(&mut self, indexer_id: String) -> IndexerOrd {\n        let indexer_ord = self.indexer_ids.len() as IndexerOrd;\n        self.indexer_id_to_indexer_ord\n            .insert(indexer_id.clone(), indexer_ord);\n        self.indexer_ids.push(indexer_id);\n        indexer_ord\n    }\n}\n\nfn convert_physical_plan_to_solution(\n    plan: &PhysicalIndexingPlan,\n    id_to_ord_map: &IdToOrdMap,\n    solution: &mut SchedulingSolution,\n) {\n    for (indexer_id, indexing_tasks) in plan.indexing_tasks_per_indexer() {\n        if let Some(indexer_ord) = id_to_ord_map.indexer_ord(indexer_id) {\n            let indexer_assignment = &mut solution.indexer_assignments[indexer_ord];\n            for indexing_task in indexing_tasks {\n                let source_uid = SourceUid {\n                    index_uid: indexing_task.index_uid().clone(),\n                    source_id: indexing_task.source_id.clone(),\n                };\n                if let Some((source_ord, source)) = id_to_ord_map.source(&source_uid) {\n                    match &source.source_type {\n                        SourceToScheduleType::Sharded { .. } => {\n                            indexer_assignment\n                                .add_shards(source_ord, indexing_task.shard_ids.len() as u32);\n                        }\n                        SourceToScheduleType::NonSharded { .. 
} => {\n                            // For non-sharded sources like Kafka, one pipeline = one shard in the\n                            // solutions\n                            indexer_assignment.add_shards(source_ord, 1);\n                        }\n                        SourceToScheduleType::IngestV1 => {\n                            // Ingest V1 is not part of the logical placement algorithm.\n                        }\n                    }\n                }\n            }\n        }\n    }\n}\n\n#[derive(Debug)]\npub struct SourceToSchedule {\n    pub source_uid: SourceUid,\n    pub source_type: SourceToScheduleType,\n    pub params_fingerprint: u64,\n}\n\n#[derive(Debug)]\npub enum SourceToScheduleType {\n    Sharded {\n        shard_ids: Vec<ShardId>,\n        load_per_shard: NonZeroU32,\n    },\n    NonSharded {\n        num_pipelines: u32,\n        load_per_pipeline: NonZeroU32,\n    },\n    // deprecated\n    IngestV1,\n}\n\nfn compute_max_num_shards_per_pipeline(source_type: &SourceToScheduleType) -> NonZeroU32 {\n    match &source_type {\n        SourceToScheduleType::Sharded { load_per_shard, .. 
} => {\n            NonZeroU32::new(MAX_LOAD_PER_PIPELINE.cpu_millis() / load_per_shard.get())\n                .unwrap_or_else(|| {\n                    // We throttle shards at ingestion to ensure that a shard does not\n                    // exceed 5MB/s.\n                    //\n                    // This value has been chosen to make sure that one full pipeline\n                    // should always be able to handle the load of one shard.\n                    //\n                    // However it is possible for the system to take more than this\n                    // when it is playing catch up.\n                    //\n                    // This is a transitory state, and not a problem per se.\n                    warn!(\"load per shard is higher than `MAX_LOAD_PER_PIPELINE`\");\n                    NonZeroU32::MIN // also colloquially known as `1`\n                })\n        }\n        SourceToScheduleType::IngestV1 | SourceToScheduleType::NonSharded { .. } => {\n            NonZeroU32::new(1u32).unwrap()\n        }\n    }\n}\n\n// This converts a scheduling solution for a given node and a given source.\n// Major quirk however:\n// For sharded sources, this function only partially performs this conversion.\n// In the resulting tasks, some of the shards may not be allocated.\n// The remaining shards will be added in a postprocessing pass.\nfn convert_scheduling_solution_to_physical_plan_single_node_single_source(\n    mut remaining_num_shards_to_schedule_on_node: u32,\n    // Specific to the source.\n    previous_tasks: &[&IndexingTask],\n    source: &SourceToSchedule,\n) -> Vec<IndexingTask> {\n    match &source.source_type {\n        SourceToScheduleType::Sharded {\n            shard_ids,\n            load_per_shard,\n        } => {\n            if remaining_num_shards_to_schedule_on_node == 0 {\n                return Vec::new();\n            }\n            // For the moment we do something deliberately suboptimal.\n            let max_num_pipelines = 
quickwit_common::div_ceil_u32(\n                remaining_num_shards_to_schedule_on_node * load_per_shard.get(),\n                CPU_PER_PIPELINE_LOAD_LOWER_THRESHOLD.cpu_millis(),\n            );\n            let max_num_shards_per_pipeline: NonZeroU32 =\n                compute_max_num_shards_per_pipeline(&source.source_type);\n            let mut new_tasks = Vec::new();\n            for previous_task in previous_tasks {\n                let max_shard_in_pipeline = max_num_shards_per_pipeline\n                    .get()\n                    .min(remaining_num_shards_to_schedule_on_node)\n                    as usize;\n                let shard_ids: Vec<ShardId> = previous_task\n                    .shard_ids\n                    .iter()\n                    .filter(|shard_id| shard_ids.contains(shard_id))\n                    .take(max_shard_in_pipeline)\n                    .cloned()\n                    .collect();\n                remaining_num_shards_to_schedule_on_node -= shard_ids.len() as u32;\n                let pipeline_uid = if previous_task.params_fingerprint == source.params_fingerprint\n                {\n                    previous_task.pipeline_uid\n                } else {\n                    Some(PipelineUid::random())\n                };\n                let new_task = IndexingTask {\n                    index_uid: previous_task.index_uid.clone(),\n                    source_id: previous_task.source_id.clone(),\n                    pipeline_uid,\n                    shard_ids,\n                    params_fingerprint: source.params_fingerprint,\n                };\n                new_tasks.push(new_task);\n                if new_tasks.len() >= max_num_pipelines as usize {\n                    break;\n                }\n                if remaining_num_shards_to_schedule_on_node == 0 {\n                    break;\n                }\n            }\n            new_tasks\n        }\n        SourceToScheduleType::NonSharded { .. 
} => {\n            // For non-sharded sources, `num_shards` is simply the number of pipelines.\n            let mut indexing_tasks: Vec<IndexingTask> = previous_tasks\n                .iter()\n                .take(remaining_num_shards_to_schedule_on_node as usize)\n                .map(|task| (*task).clone())\n                .collect();\n            for indexing_task in &mut indexing_tasks {\n                if indexing_task.params_fingerprint != source.params_fingerprint {\n                    indexing_task.params_fingerprint = source.params_fingerprint;\n                    indexing_task.pipeline_uid = Some(PipelineUid::random());\n                }\n            }\n            indexing_tasks.resize_with(remaining_num_shards_to_schedule_on_node as usize, || {\n                IndexingTask {\n                    index_uid: Some(source.source_uid.index_uid.clone()),\n                    source_id: source.source_uid.source_id.clone(),\n                    pipeline_uid: Some(PipelineUid::random()),\n                    shard_ids: Vec::new(),\n                    params_fingerprint: source.params_fingerprint,\n                }\n            });\n            indexing_tasks\n        }\n        SourceToScheduleType::IngestV1 => {\n            // Ingest V1 is simple. 
One pipeline per indexer node.\n            if let Some(indexing_task) = previous_tasks.first() {\n                // The pipeline already exists, let's reuse it.\n                let mut indexing_task = (*indexing_task).clone();\n                if indexing_task.params_fingerprint != source.params_fingerprint {\n                    indexing_task.params_fingerprint = source.params_fingerprint;\n                    indexing_task.pipeline_uid = Some(PipelineUid::random());\n                }\n                vec![indexing_task]\n            } else {\n                // The source is new, we need to create a new task.\n                vec![IndexingTask {\n                    index_uid: Some(source.source_uid.index_uid.clone()),\n                    source_id: source.source_uid.source_id.clone(),\n                    pipeline_uid: Some(PipelineUid::random()),\n                    shard_ids: Vec::new(),\n                    params_fingerprint: source.params_fingerprint,\n                }]\n            }\n        }\n    }\n}\n\nfn convert_scheduling_solution_to_physical_plan_single_node(\n    indexer_assignment: &IndexerAssignment,\n    previous_tasks: &[IndexingTask],\n    sources: &[SourceToSchedule],\n    id_to_ord_map: &IdToOrdMap,\n) -> Vec<IndexingTask> {\n    let mut tasks = Vec::new();\n    for source in sources {\n        let source_num_shards =\n            if let Some(source_ord) = id_to_ord_map.source_ord(&source.source_uid) {\n                indexer_assignment.num_shards(source_ord)\n            } else {\n                // This can happen for IngestV1\n                1u32\n            };\n        let source_pipelines: Vec<&IndexingTask> = previous_tasks\n            .iter()\n            .filter(|task| {\n                task.index_uid() == &source.source_uid.index_uid\n                    && task.source_id == source.source_uid.source_id\n            })\n            .collect();\n        let source_tasks = 
convert_scheduling_solution_to_physical_plan_single_node_single_source(\n            source_num_shards,\n            &source_pipelines[..],\n            source,\n        );\n        tasks.extend(source_tasks);\n    }\n    // code goes here.\n    tasks.sort_by(|left: &IndexingTask, right: &IndexingTask| {\n        left.index_uid\n            .cmp(&right.index_uid)\n            .then_with(|| left.source_id.cmp(&right.source_id))\n    });\n    tasks\n}\n\n/// This function takes a scheduling solution (which abstracts the notion of pipelines,\n/// and shard ids) and builds a physical plan, attempting to make as little change as possible\n/// to the existing pipelines.\n///\n/// We do not support moving shard from one pipeline to another, so if required this function may\n/// also return instruction about deleting / adding new shards.\nfn convert_scheduling_solution_to_physical_plan(\n    solution: &SchedulingSolution,\n    id_to_ord_map: &IdToOrdMap,\n    sources: &[SourceToSchedule],\n    previous_plan_opt: Option<&PhysicalIndexingPlan>,\n    shard_locations: &ShardLocations,\n) -> PhysicalIndexingPlan {\n    let mut indexer_assignments = solution.indexer_assignments.clone();\n    let mut new_physical_plan = PhysicalIndexingPlan::with_indexer_ids(&id_to_ord_map.indexer_ids);\n    for (indexer_id, indexer_assignment) in id_to_ord_map\n        .indexer_ids\n        .iter()\n        .zip(&mut indexer_assignments)\n    {\n        let previous_tasks_for_indexer = previous_plan_opt\n            .and_then(|previous_plan| previous_plan.indexer(indexer_id))\n            .unwrap_or(&[]);\n        // First we attempt to recycle existing pipelines.\n        let new_plan_indexing_tasks_for_indexer: Vec<IndexingTask> =\n            convert_scheduling_solution_to_physical_plan_single_node(\n                indexer_assignment,\n                previous_tasks_for_indexer,\n                sources,\n                id_to_ord_map,\n            );\n        for indexing_task in 
new_plan_indexing_tasks_for_indexer {\n            new_physical_plan.add_indexing_task(indexer_id, indexing_task);\n        }\n    }\n\n    // We still need to do some extra work for sharded sources: assign missing shards, and possibly\n    // adding extra pipelines.\n    for source in sources {\n        let SourceToScheduleType::Sharded { shard_ids, .. } = &source.source_type else {\n            continue;\n        };\n        let source_ord = id_to_ord_map.source_ord(&source.source_uid).unwrap();\n        let mut scheduled_shards: FnvHashSet<ShardId> = FnvHashSet::default();\n        let mut remaining_num_shards_per_node: HashMap<String, NonZeroU32> =\n            HashMap::with_capacity(new_physical_plan.num_indexers());\n        for (indexer, indexing_tasks) in new_physical_plan.indexing_tasks_per_indexer_mut() {\n            let indexer_ord = id_to_ord_map.indexer_ord(indexer).unwrap();\n            let mut num_shards_for_indexer_source: u32 =\n                indexer_assignments[indexer_ord].num_shards(source_ord);\n            for indexing_task in indexing_tasks {\n                if indexing_task.index_uid() == &source.source_uid.index_uid\n                    && indexing_task.source_id == source.source_uid.source_id\n                {\n                    indexing_task.shard_ids.retain(|shard_id| {\n                        let shard_added = scheduled_shards.insert(shard_id.clone());\n                        if shard_added {\n                            true\n                        } else {\n                            error!(\n                                \"this should never happen. 
shard was allocated into two pipelines.\"\n                            );\n                            false\n                        }\n                    });\n                    num_shards_for_indexer_source -= indexing_task.shard_ids.len() as u32;\n                }\n            }\n            if let Some(num_shards_for_indexer_source_non_zero) =\n                NonZeroU32::new(num_shards_for_indexer_source)\n            {\n                remaining_num_shards_per_node\n                    .insert(indexer.clone(), num_shards_for_indexer_source_non_zero);\n            }\n        }\n\n        // `missing_shards` lists the shards that are not scheduled into a pipeline yet.\n        let missing_shards: Vec<ShardId> = shard_ids\n            .iter()\n            .filter(|shard_id| !scheduled_shards.contains(shard_id))\n            .cloned()\n            .collect();\n\n        // Let's assign the missing shards.\n        let max_shard_per_pipeline = compute_max_num_shards_per_pipeline(&source.source_type);\n\n        let shard_to_indexer: HashMap<ShardId, String> = assign_shards(\n            missing_shards,\n            remaining_num_shards_per_node,\n            shard_locations,\n        );\n        for (shard_id, indexer) in shard_to_indexer {\n            add_shard_to_indexer(\n                shard_id,\n                indexer,\n                &source.source_uid,\n                max_shard_per_pipeline,\n                &mut new_physical_plan,\n                source.params_fingerprint,\n            );\n        }\n    }\n\n    new_physical_plan.normalize();\n\n    new_physical_plan\n}\n\n/// This function is meant to be called after we have solved the scheduling\n/// problem, so we already know the number of shards to be assigned on each indexer node.\n/// We now need to determine precisely where each shard should be assigned.\n///\n/// It assigns the missing shards for a given source to the indexers, given:\n/// - the total number of shards that are to be 
scheduled on each node\n/// - the shard locations\n///\n/// This function assigns shards to a node hosting them in priority.\n///\n/// The current implementation is a heuristic.\n/// In the first pass, we attempt to assign as many shards as possible on the\n/// node hosting them.\nfn assign_shards(\n    missing_shards: Vec<ShardId>,\n    mut remaining_num_shards_per_node: HashMap<String, NonZeroU32>,\n    shard_locations: &ShardLocations,\n) -> HashMap<ShardId, String> {\n    let mut shard_to_indexer: HashMap<ShardId, String> =\n        HashMap::with_capacity(missing_shards.len());\n\n    // In a first pass, we assign as many shards as possible to their hosting nodes.\n    let mut remaining_missing_shards: Vec<ShardId> = Vec::new();\n    for shard_id in missing_shards {\n        // As a heuristic, among the available nodes hosting the shard, we pick the one with\n        // the fewest remaining shards to schedule.\n        let indexer_hosting_shard: Option<(NonZeroU32, &str)> = shard_locations\n            .get_shard_locations(&shard_id)\n            .iter()\n            .map(|node_id| node_id.as_str())\n            .flat_map(|node_id| {\n                let num_shards = remaining_num_shards_per_node.get(node_id)?;\n                Some((*num_shards, node_id))\n            })\n            .min_by_key(|(num_shards, _node_id)| *num_shards);\n        if let Some((_num_shards, indexer)) = indexer_hosting_shard {\n            decrement_num_shards(indexer, &mut remaining_num_shards_per_node);\n            shard_to_indexer.insert(shard_id, indexer.to_string());\n        } else {\n            remaining_missing_shards.push(shard_id);\n        }\n    }\n\n    // In a second pass, we assign the remaining shards to any node that still has room.\n    for shard_id in remaining_missing_shards {\n        let indexer = remaining_num_shards_per_node\n            .keys()\n            .next()\n            .expect(\"failed to assign all shards. 
please report\")\n            .to_string();\n        decrement_num_shards(&indexer, &mut remaining_num_shards_per_node);\n        shard_to_indexer.insert(shard_id, indexer.to_string());\n    }\n    assert!(remaining_num_shards_per_node.is_empty());\n\n    shard_to_indexer\n}\n\nfn decrement_num_shards(\n    node_id: &str,\n    remaining_num_shards_to_schedule_per_indexers: &mut HashMap<String, NonZeroU32>,\n) {\n    {\n        let previous_num_shards = remaining_num_shards_to_schedule_per_indexers\n            .get_mut(node_id)\n            .unwrap();\n        if let Some(new_num_shards) = NonZeroU32::new(previous_num_shards.get() - 1) {\n            *previous_num_shards = new_num_shards;\n            return;\n        }\n    }\n    remaining_num_shards_to_schedule_per_indexers.remove(node_id);\n}\n\n// Checks that the physical plan indeed matches the scheduling solution.\nfn assert_post_condition_physical_plan_match_solution(\n    physical_plan: &PhysicalIndexingPlan,\n    solution: &SchedulingSolution,\n    id_to_ord_map: &IdToOrdMap,\n) {\n    let num_indexers = physical_plan.indexing_tasks_per_indexer().len();\n    assert_eq!(num_indexers, solution.indexer_assignments.len());\n    assert_eq!(num_indexers, id_to_ord_map.indexer_ids.len());\n    let mut reconstructed_solution = SchedulingSolution::with_num_indexers(num_indexers);\n    convert_physical_plan_to_solution(physical_plan, id_to_ord_map, &mut reconstructed_solution);\n    assert_eq!(\n        solution.indexer_assignments,\n        reconstructed_solution.indexer_assignments\n    );\n}\n\nfn add_shard_to_indexer(\n    missing_shard: ShardId,\n    indexer: String,\n    source_uid: &SourceUid,\n    max_shard_per_pipeline: NonZeroU32,\n    new_physical_plan: &mut PhysicalIndexingPlan,\n    params_fingerprint: u64,\n) {\n    let indexer_tasks = new_physical_plan\n        .indexing_tasks_per_indexer_mut()\n        .entry(indexer)\n        .or_default();\n\n    let indexing_task_opt = indexer_tasks\n       
 .iter_mut()\n        .filter(|indexing_task| {\n            indexing_task.index_uid() == &source_uid.index_uid\n                && indexing_task.source_id == source_uid.source_id\n        })\n        .filter(|task| task.shard_ids.len() < max_shard_per_pipeline.get() as usize)\n        .min_by_key(|task| task.shard_ids.len());\n\n    if let Some(indexing_task) = indexing_task_opt {\n        indexing_task.shard_ids.push(missing_shard);\n    } else {\n        // We haven't found any pipeline with remaining room.\n        // It is time to create a new pipeline.\n        indexer_tasks.push(IndexingTask {\n            index_uid: Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.clone(),\n            pipeline_uid: Some(PipelineUid::random()),\n            shard_ids: vec![missing_shard],\n            params_fingerprint,\n        });\n    }\n}\n\n// If the total node capacities is lower than 120% of the problem load, this\n// function scales the load of the indexer to reach this limit.\nfn inflate_node_capacities_if_necessary(problem: &mut SchedulingProblem) {\n    // First we scale the problem to the point where any indexer can fit the largest shard.\n    let Some(largest_shard_load) = problem.sources().map(|source| source.load_per_shard).max()\n    else {\n        return;\n    };\n\n    // We first artificially scale down the node capacities.\n    //\n    // The node capacity is an estimate of the amount of CPU available on a given indexer node.\n    // It has two purpose,\n    // - under a lot of load, indexer will receive work proportional to their relative capacity.\n    // - under low load, the absolute magnitude will be used by the scheduler, to decide whether\n    // to prefer having a balanced workload over other criteria (all pipeline from a same index on\n    // the same node, indexing local shards, etc.).\n    //\n    // The default CPU capacity is detected from the OS. 
Using these values directly leads to\n    // a non-uniform distribution of the load which is very confusing for users. We artificially\n    // scale down the indexer capacities.\n    problem.scale_node_capacities(0.3f32);\n\n    let min_indexer_capacity = (0..problem.num_indexers())\n        .map(|indexer_ord| problem.indexer_cpu_capacity(indexer_ord))\n        .min()\n        .expect(\"At least one indexer is required\");\n\n    assert_ne!(min_indexer_capacity.cpu_millis(), 0);\n    if min_indexer_capacity.cpu_millis() < largest_shard_load.get() {\n        let scaling_factor =\n            (largest_shard_load.get() as f32) / (min_indexer_capacity.cpu_millis() as f32);\n        problem.scale_node_capacities(scaling_factor);\n    }\n\n    let total_node_capacities: f32 = problem.total_node_capacities().cpu_millis() as f32;\n    let total_load: f32 = problem.total_load() as f32;\n    let inflated_total_load = total_load * 1.2f32;\n    if inflated_total_load >= total_node_capacities {\n        // We need to inflate our node capacities to match the problem.\n        let ratio = inflated_total_load / total_node_capacities;\n        problem.scale_node_capacities(ratio);\n    }\n}\n\n/// Creates a physical plan given the current situation of the cluster and the list of sources\n/// to schedule.\n///\n/// The scheduling problem abstracts away all notions of shard ids, source types, and node_ids,\n/// to transform scheduling into a math problem.\n///\n/// This function's implementation therefore proceeds as follows:\n/// 1) transform our problem into a scheduling problem. Something closer to a well-defined\n///    optimization problem. In particular this step removes:\n///    - the notion of shard ids, and only considers a number of shards being allocated.\n///    - node_ids and source ids. 
These are replaced by integers.\n/// 2) convert the current situation of the cluster into a previous scheduling solution.\n/// 3) compute the new scheduling solution.\n/// 4) convert the new scheduling solution back to the real world by reallocating the shard ids.\n///\n/// TODO cut into pipelines.\n///\n/// Panics if any sharded source has no shards.\npub fn build_physical_indexing_plan(\n    sources: &[SourceToSchedule],\n    indexer_id_to_cpu_capacities: &FnvHashMap<String, CpuCapacity>,\n    previous_plan_opt: Option<&PhysicalIndexingPlan>,\n    shard_locations: &ShardLocations,\n) -> PhysicalIndexingPlan {\n    // Asserts that the sources are valid.\n    check_sources(sources);\n\n    // We convert our problem into a simplified scheduling problem.\n    // In this simplified version, nodes and sources are just ids.\n    // Instead of individual shard ids, we just keep count of shards.\n    // Similarly, instead of accurate locality, we just keep the number of shards local\n    // to an indexer.\n    let (id_to_ord_map, problem) =\n        convert_to_simplified_problem(indexer_id_to_cpu_capacities, sources, shard_locations);\n\n    // Populate the previous solution, if any.\n    let mut previous_solution = problem.new_solution();\n    if let Some(previous_plan) = previous_plan_opt {\n        convert_physical_plan_to_solution(previous_plan, &id_to_ord_map, &mut previous_solution);\n    }\n\n    // Compute the new scheduling solution using a heuristic.\n    let new_solution = scheduling_logic::solve(problem, previous_solution);\n\n    // Convert the new scheduling solution back to a physical plan.\n    let new_physical_plan = convert_scheduling_solution_to_physical_plan(\n        &new_solution,\n        &id_to_ord_map,\n        sources,\n        previous_plan_opt,\n        shard_locations,\n    );\n\n    assert_post_condition_physical_plan_match_solution(\n        &new_physical_plan,\n        &new_solution,\n        &id_to_ord_map,\n    );\n\n    
new_physical_plan\n}\n\n/// Makes any checks on the sources.\n/// Sharded sources are not allowed to have no shards.\nfn check_sources(sources: &[SourceToSchedule]) {\n    for source in sources {\n        if let SourceToScheduleType::Sharded { shard_ids, .. } = &source.source_type {\n            assert!(!shard_ids.is_empty())\n        }\n    }\n}\n\nfn convert_to_simplified_problem<'a>(\n    indexer_id_to_cpu_capacities: &'a FnvHashMap<String, CpuCapacity>,\n    sources: &'a [SourceToSchedule],\n    shard_locations: &ShardLocations,\n) -> (IdToOrdMap<'a>, SchedulingProblem) {\n    // Convert our problem to a scheduling problem.\n    let mut id_to_ord_map: IdToOrdMap<'a> = IdToOrdMap::default();\n\n    // We use a Vec as a `IndexOrd` -> Max load map.\n    let mut indexer_cpu_capacities: Vec<CpuCapacity> =\n        Vec::with_capacity(indexer_id_to_cpu_capacities.len());\n    for (indexer_id, &cpu_capacity) in indexer_id_to_cpu_capacities {\n        let indexer_ord = id_to_ord_map.add_indexer_id(indexer_id.clone());\n        assert_eq!(indexer_ord, indexer_cpu_capacities.len() as IndexerOrd);\n        indexer_cpu_capacities.push(cpu_capacity);\n    }\n\n    let mut problem = SchedulingProblem::with_indexer_cpu_capacities(indexer_cpu_capacities);\n\n    for source in sources {\n        if let Some(source_ord) = populate_problem(source, &mut problem) {\n            let registered_source_ord = id_to_ord_map.add_source(source);\n            if let SourceToScheduleType::Sharded { shard_ids, .. 
} = &source.source_type {\n                for shard_id in shard_ids {\n                    for &indexer in shard_locations.get_shard_locations(shard_id) {\n                        let Some(indexer_ord) = id_to_ord_map.indexer_ord(indexer.as_str()) else {\n                            // This happens if the ingester is unavailable.\n                            rate_limited_debug!(\n                                limit_per_min = 10,\n                                \"failed to find indexer ord for indexer {indexer}\"\n                            );\n                            continue;\n                        };\n                        problem.inc_affinity(source_ord, indexer_ord);\n                    }\n                }\n            }\n            assert_eq!(source_ord, registered_source_ord);\n        }\n    }\n    (id_to_ord_map, problem)\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::collections::{HashMap, HashSet};\n    use std::num::NonZeroU32;\n    use std::str::FromStr;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n\n    use fnv::FnvHashMap;\n    use itertools::Itertools;\n    use quickwit_proto::indexing::{CpuCapacity, IndexingTask, mcpu};\n    use quickwit_proto::types::{IndexUid, NodeId, PipelineUid, ShardId, SourceUid};\n    use rand::prelude::IndexedRandom;\n\n    use super::{\n        SourceToSchedule, SourceToScheduleType, build_physical_indexing_plan,\n        convert_scheduling_solution_to_physical_plan_single_node_single_source,\n    };\n    use crate::indexing_plan::PhysicalIndexingPlan;\n    use crate::indexing_scheduler::get_shard_locality_metrics;\n    use crate::indexing_scheduler::scheduling::assign_shards;\n    use crate::model::ShardLocations;\n\n    fn source_id() -> SourceUid {\n        static COUNTER: AtomicUsize = AtomicUsize::new(0);\n        let index = IndexUid::for_test(\"test_index\", 0);\n        let source_id = COUNTER.fetch_add(1, Ordering::SeqCst);\n        SourceUid {\n            index_uid: index,\n            
source_id: format!(\"source_{source_id}\"),\n        }\n    }\n\n    #[test]\n    fn test_build_physical_plan() {\n        let indexer1 = \"indexer1\".to_string();\n        let indexer2 = \"indexer2\".to_string();\n        let source_uid0 = source_id();\n        let source_uid1 = source_id();\n        let source_uid2 = source_id();\n        let source_0 = SourceToSchedule {\n            source_uid: source_uid0.clone(),\n            source_type: SourceToScheduleType::Sharded {\n                shard_ids: vec![\n                    ShardId::from(0),\n                    ShardId::from(1),\n                    ShardId::from(2),\n                    ShardId::from(3),\n                    ShardId::from(4),\n                    ShardId::from(5),\n                    ShardId::from(6),\n                    ShardId::from(7),\n                ],\n                load_per_shard: NonZeroU32::new(1_000).unwrap(),\n            },\n            params_fingerprint: 0,\n        };\n        let source_1 = SourceToSchedule {\n            source_uid: source_uid1.clone(),\n            source_type: SourceToScheduleType::NonSharded {\n                num_pipelines: 2,\n                load_per_pipeline: NonZeroU32::new(3_200).unwrap(),\n            },\n            params_fingerprint: 0,\n        };\n        let source_2 = SourceToSchedule {\n            source_uid: source_uid2.clone(),\n            source_type: SourceToScheduleType::IngestV1,\n            params_fingerprint: 0,\n        };\n        let mut indexer_id_to_cpu_capacities = FnvHashMap::default();\n        indexer_id_to_cpu_capacities.insert(indexer1.clone(), mcpu(16_000));\n        indexer_id_to_cpu_capacities.insert(indexer2.clone(), mcpu(16_000));\n        let shard_locations = ShardLocations::default();\n        let indexing_plan = build_physical_indexing_plan(\n            &[source_0, source_1, source_2],\n            &indexer_id_to_cpu_capacities,\n            None,\n            &shard_locations,\n        );\n        
assert_eq!(indexing_plan.indexing_tasks_per_indexer().len(), 2);\n\n        let node1_plan = indexing_plan.indexer(&indexer1).unwrap();\n        let node2_plan = indexing_plan.indexer(&indexer2).unwrap();\n\n        // both non-sharded pipelines get scheduled on the same node.\n        assert_eq!(node1_plan.len(), 3);\n        assert_eq!(&node1_plan[0].source_id, &source_uid1.source_id);\n        assert!(&node1_plan[0].shard_ids.is_empty());\n        assert_eq!(&node1_plan[1].source_id, &source_uid1.source_id);\n        assert!(&node1_plan[1].shard_ids.is_empty());\n        assert_eq!(&node1_plan[2].source_id, &source_uid2.source_id);\n        assert!(&node1_plan[2].shard_ids.is_empty());\n\n        assert_eq!(node2_plan.len(), 4);\n        assert_eq!(&node2_plan[0].source_id, &source_uid0.source_id);\n\n        let mut shard_ids: HashSet<ShardId> = HashSet::default();\n        let mut shard_lens = Vec::new();\n        shard_lens.push(node2_plan[0].shard_ids.len());\n        shard_ids.extend(node2_plan[0].shard_ids.iter().cloned());\n        assert_eq!(&node2_plan[1].source_id, &source_uid0.source_id);\n        shard_lens.push(node2_plan[1].shard_ids.len());\n        shard_ids.extend(node2_plan[1].shard_ids.iter().cloned());\n        assert_eq!(&node2_plan[2].source_id, &source_uid0.source_id);\n        shard_lens.push(node2_plan[2].shard_ids.len());\n        shard_ids.extend(node2_plan[2].shard_ids.iter().cloned());\n        assert_eq!(shard_ids.len(), 8);\n        assert_eq!(&node2_plan[3].source_id, &source_uid2.source_id);\n        shard_lens.sort();\n        assert_eq!(&shard_lens[..], &[2, 3, 3]);\n    }\n\n    #[test]\n    fn test_build_physical_plan_with_locality() {\n        let num_indexers = 10;\n        let num_shards: usize = 1000;\n        let indexers: Vec<NodeId> = (0..num_indexers)\n            .map(|indexer_id| NodeId::new(format!(\"indexer{indexer_id}\")))\n            .collect();\n        let source_uids: Vec<SourceUid> = 
std::iter::repeat_with(source_id).take(1_000).collect();\n        let shard_ids: Vec<ShardId> = (0..num_shards as u64).map(ShardId::from).collect();\n        let sources: Vec<SourceToSchedule> = (0..num_shards)\n            .map(|i| SourceToSchedule {\n                source_uid: source_uids[i].clone(),\n                source_type: SourceToScheduleType::Sharded {\n                    shard_ids: vec![shard_ids[i].clone()],\n                    load_per_shard: NonZeroU32::new(250).unwrap(),\n                },\n                params_fingerprint: 0,\n            })\n            .collect();\n\n        let mut indexer_id_to_cpu_capacities = FnvHashMap::default();\n        for indexer in &indexers {\n            indexer_id_to_cpu_capacities.insert(indexer.as_str().to_string(), mcpu(16_000));\n        }\n        let mut rng = rand::rng();\n\n        let mut shard_locations = ShardLocations::default();\n        for shard_id in &shard_ids {\n            let indexer = indexers[..].choose(&mut rng).unwrap();\n            shard_locations.add_location(shard_id, indexer);\n        }\n\n        let plan = build_physical_indexing_plan(\n            &sources,\n            &indexer_id_to_cpu_capacities,\n            None,\n            &shard_locations,\n        );\n        assert_eq!(plan.indexing_tasks_per_indexer().len(), num_indexers);\n        let metrics = get_shard_locality_metrics(&plan, &shard_locations);\n        assert_eq!(\n            metrics.num_remote_shards + metrics.num_local_shards,\n            num_shards\n        );\n        assert!(metrics.num_remote_shards < 10);\n    }\n\n    #[tokio::test]\n    async fn test_build_physical_indexing_plan_with_not_enough_indexers() {\n        let source_uid1 = source_id();\n        let source_1 = SourceToSchedule {\n            source_uid: source_uid1.clone(),\n            source_type: SourceToScheduleType::NonSharded {\n                num_pipelines: 2,\n                load_per_pipeline: NonZeroU32::new(1000).unwrap(),\n     
       },\n            params_fingerprint: 0,\n        };\n        let sources = vec![source_1];\n\n        let indexer1 = \"indexer1\".to_string();\n        let mut indexer_max_loads = FnvHashMap::default();\n        let shard_locations = ShardLocations::default();\n        {\n            indexer_max_loads.insert(indexer1.clone(), mcpu(1_999));\n            // This test what happens when there isn't enough capacity on the cluster.\n            let physical_plan =\n                build_physical_indexing_plan(&sources, &indexer_max_loads, None, &shard_locations);\n            assert_eq!(physical_plan.indexing_tasks_per_indexer().len(), 1);\n            let expected_tasks = physical_plan.indexer(&indexer1).unwrap();\n            assert_eq!(expected_tasks.len(), 2);\n            assert_eq!(&expected_tasks[0].source_id, &source_uid1.source_id);\n        }\n        {\n            indexer_max_loads.insert(indexer1.clone(), mcpu(2_000));\n            // This test what happens when there isn't enough capacity on the cluster.\n            let physical_plan =\n                build_physical_indexing_plan(&sources, &indexer_max_loads, None, &shard_locations);\n            assert_eq!(physical_plan.indexing_tasks_per_indexer().len(), 1);\n            let expected_tasks = physical_plan.indexer(&indexer1).unwrap();\n            assert_eq!(expected_tasks.len(), 2);\n            assert_eq!(&expected_tasks[0].source_id, &source_uid1.source_id);\n            assert!(expected_tasks[0].shard_ids.is_empty());\n            assert_eq!(&expected_tasks[1].source_id, &source_uid1.source_id);\n            assert!(expected_tasks[1].shard_ids.is_empty());\n        }\n    }\n\n    fn make_indexing_tasks(\n        source_uid: &SourceUid,\n        shards: &[(PipelineUid, &[ShardId])],\n    ) -> Vec<IndexingTask> {\n        let mut plan = Vec::new();\n        for (pipeline_uid, shard_ids) in shards {\n            plan.push(IndexingTask {\n                index_uid: 
Some(source_uid.index_uid.clone()),\n                source_id: source_uid.source_id.clone(),\n                pipeline_uid: Some(*pipeline_uid),\n                shard_ids: shard_ids.to_vec(),\n                params_fingerprint: 0,\n            });\n        }\n        plan\n    }\n\n    #[test]\n    fn test_group_shards_into_pipeline_simple() {\n        let source_uid = source_id();\n        let indexing_tasks = make_indexing_tasks(\n            &source_uid,\n            &[\n                (\n                    PipelineUid::for_test(1u128),\n                    &[ShardId::from(1), ShardId::from(2)],\n                ),\n                (\n                    PipelineUid::for_test(2u128),\n                    &[ShardId::from(3), ShardId::from(4), ShardId::from(5)],\n                ),\n            ],\n        );\n        let sources = vec![SourceToSchedule {\n            source_uid: source_uid.clone(),\n            source_type: SourceToScheduleType::Sharded {\n                shard_ids: vec![\n                    ShardId::from(0),\n                    ShardId::from(1),\n                    ShardId::from(3),\n                    ShardId::from(4),\n                    ShardId::from(5),\n                ],\n                load_per_shard: NonZeroU32::new(1_000).unwrap(),\n            },\n            params_fingerprint: 0,\n        }];\n        let mut indexer_id_to_cpu_capacities = FnvHashMap::default();\n        indexer_id_to_cpu_capacities.insert(\"node1\".to_string(), mcpu(10_000));\n        let mut indexing_plan = PhysicalIndexingPlan::with_indexer_ids(&[\"node1\".to_string()]);\n        for indexing_task in indexing_tasks {\n            indexing_plan.add_indexing_task(\"node1\", indexing_task);\n        }\n        let shard_locations = ShardLocations::default();\n        let new_plan = build_physical_indexing_plan(\n            &sources,\n            &indexer_id_to_cpu_capacities,\n            Some(&indexing_plan),\n            &shard_locations,\n        );\n  
      let indexing_tasks = new_plan.indexer(\"node1\").unwrap();\n        assert_eq!(indexing_tasks.len(), 2);\n        assert_eq!(\n            &indexing_tasks[0].shard_ids,\n            &[ShardId::from(0), ShardId::from(1)]\n        );\n        assert_eq!(\n            &indexing_tasks[1].shard_ids,\n            &[ShardId::from(3), ShardId::from(4), ShardId::from(5)]\n        );\n    }\n\n    fn group_shards_into_pipelines_aux(\n        source_uid: &SourceUid,\n        shard_ids: &[u64],\n        previous_pipeline_shards: &[(PipelineUid, &[ShardId])],\n        load_per_shard: CpuCapacity,\n    ) -> Vec<IndexingTask> {\n        let indexing_tasks = make_indexing_tasks(source_uid, previous_pipeline_shards);\n        let sources = vec![SourceToSchedule {\n            source_uid: source_uid.clone(),\n            source_type: SourceToScheduleType::Sharded {\n                shard_ids: shard_ids.iter().copied().map(ShardId::from).collect(),\n                load_per_shard: NonZeroU32::new(load_per_shard.cpu_millis()).unwrap(),\n            },\n            params_fingerprint: 0,\n        }];\n        const NODE: &str = \"node1\";\n        let mut indexer_id_to_cpu_capacities = FnvHashMap::default();\n        indexer_id_to_cpu_capacities.insert(NODE.to_string(), mcpu(10_000));\n        let mut indexing_plan = PhysicalIndexingPlan::with_indexer_ids(&[\"node1\".to_string()]);\n        for indexing_task in indexing_tasks {\n            indexing_plan.add_indexing_task(NODE, indexing_task);\n        }\n        let shard_locations = ShardLocations::default();\n        let new_plan = build_physical_indexing_plan(\n            &sources,\n            &indexer_id_to_cpu_capacities,\n            Some(&indexing_plan),\n            &shard_locations,\n        );\n        let mut indexing_tasks = new_plan.indexer(NODE).unwrap().to_vec();\n        for indexing_task in &mut indexing_tasks {\n            indexing_task.shard_ids.sort();\n        }\n        // We sort indexing tasks for 
normalization purpose\n        indexing_tasks.sort_by_key(|task| task.shard_ids[0].clone());\n        indexing_tasks\n    }\n\n    #[test]\n    fn test_group_shards_load_per_shard_too_high() {\n        let source_uid = source_id();\n        let indexing_tasks =\n            group_shards_into_pipelines_aux(&source_uid, &[1, 2], &[], mcpu(4_000));\n        assert_eq!(indexing_tasks.len(), 2);\n    }\n\n    #[test]\n    fn test_group_shards_into_pipeline_hysteresis() {\n        let source_uid = source_id();\n        let indexing_tasks_1 = group_shards_into_pipelines_aux(\n            &source_uid,\n            &[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n            &[],\n            mcpu(400),\n        );\n        assert_eq!(indexing_tasks_1.len(), 2);\n        let indexing_tasks_len_1: Vec<usize> = indexing_tasks_1\n            .iter()\n            .map(|task| task.shard_ids.len())\n            .sorted()\n            .collect();\n        assert_eq!(&indexing_tasks_len_1, &[3, 8]);\n\n        let pipeline_tasks1: Vec<(PipelineUid, &[ShardId])> = indexing_tasks_1\n            .iter()\n            .map(|task| (task.pipeline_uid(), &task.shard_ids[..]))\n            .collect();\n\n        // With the same set of shards, an increase of load triggers the creation of a new task.\n        let indexing_tasks_2 = group_shards_into_pipelines_aux(\n            &source_uid,\n            &[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n            &pipeline_tasks1[..],\n            mcpu(600),\n        );\n        assert_eq!(indexing_tasks_2.len(), 3);\n        let indexing_tasks_len_2: Vec<usize> = indexing_tasks_2\n            .iter()\n            .map(|task| task.shard_ids.len())\n            .sorted()\n            .collect();\n        assert_eq!(&indexing_tasks_len_2, &[1, 5, 5]);\n\n        // Now the load comes back to normal\n        // The hysteresis takes effect. 
We do not switch back to 2 pipelines.\n        let pipeline_tasks_2: Vec<(PipelineUid, &[ShardId])> = indexing_tasks_2\n            .iter()\n            .map(|task| (task.pipeline_uid(), &task.shard_ids[..]))\n            .collect();\n        assert_eq!(indexing_tasks_2.len(), 3);\n        let indexing_tasks_3 = group_shards_into_pipelines_aux(\n            &source_uid,\n            &[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n            &pipeline_tasks_2,\n            mcpu(400),\n        );\n        assert_eq!(&indexing_tasks_3, &indexing_tasks_2);\n\n        let pipeline_tasks3: Vec<(PipelineUid, &[ShardId])> = indexing_tasks_3\n            .iter()\n            .map(|task| (task.pipeline_uid(), &task.shard_ids[..]))\n            .collect();\n        // Now a further lower load.\n        let indexing_tasks_4 = group_shards_into_pipelines_aux(\n            &source_uid,\n            &[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n            &pipeline_tasks3,\n            mcpu(200),\n        );\n        assert_eq!(indexing_tasks_4.len(), 2);\n        let indexing_tasks_len_4: Vec<usize> = indexing_tasks_4\n            .iter()\n            .map(|task| task.shard_ids.len())\n            .sorted()\n            .collect();\n        assert_eq!(&indexing_tasks_len_4, &[5, 6]);\n    }\n\n    /// We want to make sure for small pipelines, we still reschedule them with the same\n    /// pipeline uid.\n    #[test]\n    fn test_group_shards_into_pipeline_single_small_pipeline() {\n        let source_uid = source_id();\n        let pipeline_uid = PipelineUid::for_test(1u128);\n        let indexing_tasks = group_shards_into_pipelines_aux(\n            &source_uid,\n            &[12],\n            &[(pipeline_uid, &[ShardId::from(12)])],\n            mcpu(100),\n        );\n        assert_eq!(indexing_tasks.len(), 1);\n        let indexing_task = &indexing_tasks[0];\n        assert_eq!(&indexing_task.shard_ids, &[ShardId::from(12)]);\n        assert_eq!(indexing_task.pipeline_uid.unwrap(), 
pipeline_uid);\n    }\n\n    #[test]\n    fn test_assign_missing_shards() {\n        let shard0 = ShardId::from(0);\n        let shard1 = ShardId::from(1);\n        let shard2 = ShardId::from(2);\n        let shard3 = ShardId::from(3);\n\n        let missing_shards = vec![\n            shard0.clone(),\n            shard1.clone(),\n            shard2.clone(),\n            shard3.clone(),\n        ];\n        let node1 = NodeId::new(\"node1\".to_string());\n        let node2 = NodeId::new(\"node2\".to_string());\n        // This node is missing from the capacity map.\n        // It should not be assigned any task despite being present in shard locations.\n        let node_missing = NodeId::new(\"node_missing\".to_string());\n        let mut remaining_num_shards_per_node = HashMap::default();\n        remaining_num_shards_per_node\n            .insert(node1.as_str().to_string(), NonZeroU32::new(3).unwrap());\n        remaining_num_shards_per_node\n            .insert(node2.as_str().to_string(), NonZeroU32::new(1).unwrap());\n\n        let mut shard_locations: ShardLocations = ShardLocations::default();\n        // shard1 on 1\n        shard_locations.add_location(&shard1, &node1);\n        // shard2 on 2\n        shard_locations.add_location(&shard2, &node2);\n        // shard3 on both 1 and 2\n        shard_locations.add_location(&shard3, &node1);\n        shard_locations.add_location(&shard3, &node2);\n        shard_locations.add_location(&shard0, &node_missing);\n\n        let shard_to_indexer = assign_shards(\n            missing_shards,\n            remaining_num_shards_per_node,\n            &shard_locations,\n        );\n        assert_eq!(shard_to_indexer.len(), 4);\n        assert_eq!(shard_to_indexer.get(&shard1).unwrap(), \"node1\");\n        assert_eq!(shard_to_indexer.get(&shard2).unwrap(), \"node2\");\n        assert_eq!(shard_to_indexer.get(&shard3).unwrap(), \"node1\");\n        assert_eq!(shard_to_indexer.get(&shard0).unwrap(), \"node1\");\n    }\n\n  
  #[test]\n    fn test_solution_reconstruction() {\n        let sources_to_schedule = vec![\n            SourceToSchedule {\n                source_uid: SourceUid {\n                    index_uid: IndexUid::from_str(\"otel-logs-v0_6:01HKYD1SE37C90KSH21CD1M11A\")\n                        .unwrap(),\n                    source_id: \"_ingest-api-source\".to_string(),\n                },\n                source_type: SourceToScheduleType::IngestV1,\n                params_fingerprint: 0,\n            },\n            SourceToSchedule {\n                source_uid: SourceUid {\n                    index_uid: IndexUid::from_str(\n                        \"simian_chico_12856033706389338959:01HKYD414H1WVSASC5YD972P39\",\n                    )\n                    .unwrap(),\n                    source_id: \"_ingest-source\".to_string(),\n                },\n                source_type: SourceToScheduleType::Sharded {\n                    shard_ids: vec![ShardId::from(1)],\n                    load_per_shard: NonZeroU32::new(250).unwrap(),\n                },\n                params_fingerprint: 0,\n            },\n        ];\n        let mut capacities = FnvHashMap::default();\n        capacities.insert(\"indexer-1\".to_string(), CpuCapacity::from_cpu_millis(8000));\n        let shard_locations = ShardLocations::default();\n        build_physical_indexing_plan(&sources_to_schedule, &capacities, None, &shard_locations);\n    }\n\n    #[test]\n    fn test_convert_scheduling_solution_to_physical_plan_single_node_single_source_sharded() {\n        let source_uid = SourceUid {\n            index_uid: IndexUid::new_with_random_ulid(\"testindex\"),\n            source_id: \"testsource\".to_string(),\n        };\n        let previous_task1 = IndexingTask {\n            index_uid: Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.to_string(),\n            pipeline_uid: Some(PipelineUid::random()),\n            shard_ids: vec![ShardId::from(1), 
ShardId::from(4), ShardId::from(5)],\n            params_fingerprint: 0,\n        };\n        let previous_task2 = IndexingTask {\n            index_uid: Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.to_string(),\n            pipeline_uid: Some(PipelineUid::random()),\n            shard_ids: vec![\n                ShardId::from(6),\n                ShardId::from(7),\n                ShardId::from(8),\n                ShardId::from(9),\n                ShardId::from(10),\n            ],\n            params_fingerprint: 0,\n        };\n        {\n            let sharded_source = SourceToSchedule {\n                source_uid: source_uid.clone(),\n                source_type: SourceToScheduleType::Sharded {\n                    shard_ids: vec![\n                        ShardId::from(1),\n                        ShardId::from(2),\n                        ShardId::from(4),\n                        ShardId::from(6),\n                    ],\n                    load_per_shard: NonZeroU32::new(1_000).unwrap(),\n                },\n                params_fingerprint: 0,\n            };\n            let tasks = convert_scheduling_solution_to_physical_plan_single_node_single_source(\n                4,\n                &[&previous_task1, &previous_task2],\n                &sharded_source,\n            );\n            assert_eq!(tasks.len(), 2);\n            assert_eq!(tasks[0].index_uid(), &source_uid.index_uid);\n            assert_eq!(tasks[0].shard_ids, [ShardId::from(1), ShardId::from(4)]);\n            assert_eq!(tasks[1].index_uid(), &source_uid.index_uid);\n            assert_eq!(tasks[1].shard_ids, [ShardId::from(6)]);\n        }\n        {\n            // smaller shards force a merge into a single pipeline\n            let sharded_source = SourceToSchedule {\n                source_uid: source_uid.clone(),\n                source_type: SourceToScheduleType::Sharded {\n                    shard_ids: vec![\n                        
ShardId::from(1),\n                        ShardId::from(2),\n                        ShardId::from(4),\n                        ShardId::from(6),\n                    ],\n                    load_per_shard: NonZeroU32::new(250).unwrap(),\n                },\n                params_fingerprint: 0,\n            };\n            let tasks = convert_scheduling_solution_to_physical_plan_single_node_single_source(\n                4,\n                &[&previous_task1, &previous_task2],\n                &sharded_source,\n            );\n            assert_eq!(tasks.len(), 1);\n            assert_eq!(tasks[0].index_uid(), &source_uid.index_uid);\n            assert_eq!(tasks[0].shard_ids, [ShardId::from(1), ShardId::from(4)]);\n        }\n    }\n\n    #[test]\n    fn test_convert_scheduling_solution_to_physical_plan_single_node_single_source_non_sharded() {\n        let source_uid = SourceUid {\n            index_uid: IndexUid::new_with_random_ulid(\"testindex\"),\n            source_id: \"testsource\".to_string(),\n        };\n        let pipeline_uid1 = PipelineUid::random();\n        let previous_task1 = IndexingTask {\n            index_uid: Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.to_string(),\n            pipeline_uid: Some(pipeline_uid1),\n            shard_ids: Vec::new(),\n            params_fingerprint: 0,\n        };\n        let pipeline_uid2 = PipelineUid::random();\n        let previous_task2 = IndexingTask {\n            index_uid: Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.to_string(),\n            pipeline_uid: Some(pipeline_uid2),\n            shard_ids: Vec::new(),\n            params_fingerprint: 0,\n        };\n        {\n            let sharded_source = SourceToSchedule {\n                source_uid: source_uid.clone(),\n                source_type: SourceToScheduleType::NonSharded {\n                    num_pipelines: 1,\n                    load_per_pipeline: 
NonZeroU32::new(4000).unwrap(),\n                },\n                params_fingerprint: 0,\n            };\n            let tasks = convert_scheduling_solution_to_physical_plan_single_node_single_source(\n                1,\n                &[&previous_task1, &previous_task2],\n                &sharded_source,\n            );\n            assert_eq!(tasks.len(), 1);\n            assert_eq!(tasks[0].index_uid(), &source_uid.index_uid);\n            assert!(tasks[0].shard_ids.is_empty());\n            assert_eq!(tasks[0].pipeline_uid.as_ref().unwrap(), &pipeline_uid1);\n        }\n        {\n            let sharded_source = SourceToSchedule {\n                source_uid: source_uid.clone(),\n                source_type: SourceToScheduleType::NonSharded {\n                    num_pipelines: 0,\n                    load_per_pipeline: NonZeroU32::new(1_000).unwrap(),\n                },\n                params_fingerprint: 0,\n            };\n            let tasks = convert_scheduling_solution_to_physical_plan_single_node_single_source(\n                0,\n                &[&previous_task1, &previous_task2],\n                &sharded_source,\n            );\n            assert_eq!(tasks.len(), 0);\n        }\n        {\n            let sharded_source = SourceToSchedule {\n                source_uid: source_uid.clone(),\n                source_type: SourceToScheduleType::NonSharded {\n                    num_pipelines: 2,\n                    load_per_pipeline: NonZeroU32::new(1_000).unwrap(),\n                },\n                params_fingerprint: 0,\n            };\n            let tasks = convert_scheduling_solution_to_physical_plan_single_node_single_source(\n                2,\n                &[&previous_task1, &previous_task2],\n                &sharded_source,\n            );\n            assert_eq!(tasks.len(), 2);\n            assert_eq!(tasks[0].index_uid(), &source_uid.index_uid);\n            assert!(tasks[0].shard_ids.is_empty());\n            
assert_eq!(tasks[0].pipeline_uid.as_ref().unwrap(), &pipeline_uid1);\n            assert_eq!(tasks[1].index_uid(), &source_uid.index_uid);\n            assert!(tasks[1].shard_ids.is_empty());\n            assert_eq!(tasks[1].pipeline_uid.as_ref().unwrap(), &pipeline_uid2);\n        }\n        {\n            let sharded_source = SourceToSchedule {\n                source_uid: source_uid.clone(),\n                source_type: SourceToScheduleType::NonSharded {\n                    num_pipelines: 2,\n                    load_per_pipeline: NonZeroU32::new(1_000).unwrap(),\n                },\n                params_fingerprint: 0,\n            };\n            let tasks = convert_scheduling_solution_to_physical_plan_single_node_single_source(\n                2,\n                &[&previous_task1],\n                &sharded_source,\n            );\n            assert_eq!(tasks.len(), 2);\n            assert_eq!(tasks[0].index_uid(), &source_uid.index_uid);\n            assert!(tasks[0].shard_ids.is_empty());\n            assert_eq!(tasks[0].pipeline_uid.as_ref().unwrap(), &pipeline_uid1);\n            assert_eq!(tasks[1].index_uid(), &source_uid.index_uid);\n            assert!(tasks[1].shard_ids.is_empty());\n            assert_ne!(tasks[1].pipeline_uid.as_ref().unwrap(), &pipeline_uid1);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_scheduler/scheduling/scheduling_logic.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Reverse;\nuse std::collections::BTreeMap;\nuse std::collections::btree_map::Entry;\n\nuse itertools::Itertools;\nuse quickwit_proto::indexing::CpuCapacity;\n\nuse super::scheduling_logic_model::*;\nuse crate::indexing_scheduler::scheduling::inflate_node_capacities_if_necessary;\n\n// ------------------------------------------------------------------------------------\n// High level algorithm\n\nfn check_contract_conditions(problem: &SchedulingProblem, solution: &SchedulingSolution) {\n    assert_eq!(problem.num_indexers(), solution.num_indexers());\n    for (node_id, indexer_assignment) in solution.indexer_assignments.iter().enumerate() {\n        assert_eq!(indexer_assignment.indexer_ord, node_id);\n    }\n    for (source_ord, source) in problem.sources().enumerate() {\n        assert_eq!(source_ord as SourceOrd, source.source_ord);\n    }\n}\n\npub fn solve(\n    mut problem: SchedulingProblem,\n    previous_solution: SchedulingSolution,\n) -> SchedulingSolution {\n    // We first inflate the indexer capacities to make sure they globally\n    // have at least 120% of the total problem load. 
This is done proportionally\n    // to their original capacity.\n    inflate_node_capacities_if_necessary(&mut problem);\n    // As a heuristic, to offer stability, we work iteratively\n    // from the previous solution.\n    let mut solution = previous_solution;\n    // We first run a few asserts to ensure that the problem is correct.\n    check_contract_conditions(&problem, &solution);\n    // Due to scale down or the entire removal of some sources, we might have\n    // too many shards in the current solution.\n    // Let's first shave off the extraneous shards.\n    remove_extraneous_shards(&problem, &mut solution);\n    // Because the load associated to shards can change, some indexers\n    // may have too much work assigned to them.\n    // Again, we shave off some shards to make sure they are\n    // within their capacity.\n    enforce_indexers_cpu_capacity(&problem, &mut solution);\n    // The solution now meets the constraint, but it does not necessarily\n    // contain all of the shards that we need to assign.\n    //\n    // We first assign sources to indexers that have some affinity with them\n    // (provided they have the capacity.)\n    place_unassigned_shards_with_affinity(&problem, &mut solution);\n    // Finally we assign the remaining shards, regardless of whether they have affinity\n    // or not.\n    place_unassigned_shards_ignoring_affinity(problem, &solution)\n}\n\n// -------------------------------------------------------------------------\n// Phase 1\n// Remove shards in solution that are not needed anymore\n\nfn remove_extraneous_shards(problem: &SchedulingProblem, solution: &mut SchedulingSolution) {\n    let mut num_shards_per_source: Vec<u32> = vec![0; problem.num_sources()];\n    for indexer_assignment in &solution.indexer_assignments {\n        if let Some((&source_ord, _)) = indexer_assignment.num_shards_per_source.last_key_value() {\n            assert!(source_ord < problem.num_sources() as SourceOrd);\n        }\n        for 
(&source, &source_num_shards) in &indexer_assignment.num_shards_per_source {\n            num_shards_per_source[source as usize] += source_num_shards;\n        }\n    }\n    let num_shards_per_source_to_remove: Vec<(SourceOrd, u32)> = num_shards_per_source\n        .into_iter()\n        .zip(problem.sources())\n        .flat_map(|(num_shards, source)| {\n            let target_num_shards = source.num_shards;\n            if target_num_shards < num_shards {\n                Some((source.source_ord, num_shards - target_num_shards))\n            } else {\n                None\n            }\n        })\n        .collect();\n\n    let mut nodes_with_source: BTreeMap<SourceOrd, Vec<IndexerOrd>> = BTreeMap::default();\n    for (node_id, indexer_assignment) in solution.indexer_assignments.iter().enumerate() {\n        for (&source, &num_shards) in &indexer_assignment.num_shards_per_source {\n            if num_shards > 0 {\n                nodes_with_source.entry(source).or_default().push(node_id);\n            }\n        }\n    }\n\n    let mut indexer_available_capacity: Vec<i32> = solution\n        .indexer_assignments\n        .iter()\n        .map(|indexer_assignment| indexer_assignment.indexer_available_capacity(problem))\n        .collect();\n\n    for (source_ord, mut num_shards_to_remove) in num_shards_per_source_to_remove {\n        let nodes_with_source = nodes_with_source\n            .get_mut(&source_ord)\n            // Unwrap is safe here. 
By construction, if we need to decrease the number of shards of a\n            // given source, at least one node has it.\n            .unwrap();\n        nodes_with_source.sort_by_key(|&node_id| indexer_available_capacity[node_id]);\n        for node_id in nodes_with_source.iter().copied() {\n            let indexer_assignment = &mut solution.indexer_assignments[node_id];\n            let previous_num_shards = indexer_assignment.num_shards(source_ord);\n            assert!(previous_num_shards > 0);\n            assert!(num_shards_to_remove > 0);\n            let num_shards_removed = previous_num_shards.min(num_shards_to_remove);\n            indexer_assignment.remove_shards(source_ord, num_shards_removed);\n            num_shards_to_remove -= num_shards_removed;\n            // We update the node capacity since its load has changed.\n            indexer_available_capacity[node_id] =\n                indexer_assignment.indexer_available_capacity(problem);\n            if num_shards_to_remove == 0 {\n                // No more work to do for this source.\n                break;\n            }\n        }\n    }\n    assert_remove_extraneous_shards_post_condition(problem, solution);\n}\n\nfn assert_remove_extraneous_shards_post_condition(\n    problem: &SchedulingProblem,\n    solution: &SchedulingSolution,\n) {\n    let mut num_shards_per_source: Vec<u32> = vec![0; problem.num_sources()];\n    for indexer_assignment in &solution.indexer_assignments {\n        for (&source, &load) in &indexer_assignment.num_shards_per_source {\n            num_shards_per_source[source as usize] += load;\n        }\n    }\n    for source in problem.sources() {\n        assert!(num_shards_per_source[source.source_ord as usize] <= source.num_shards);\n    }\n}\n\n// -------------------------------------------------------------------------\n// Phase 2\n// Remove sources from the nodes that exceed their maximum load.\n\nfn enforce_indexers_cpu_capacity(problem: &SchedulingProblem, 
solution: &mut SchedulingSolution) {\n    for indexer_assignment in &mut solution.indexer_assignments {\n        let indexer_cpu_capacity: CpuCapacity =\n            problem.indexer_cpu_capacity(indexer_assignment.indexer_ord);\n        enforce_indexer_cpu_capacity(problem, indexer_cpu_capacity, indexer_assignment);\n    }\n}\n\nfn enforce_indexer_cpu_capacity(\n    problem: &SchedulingProblem,\n    indexer_cpu_capacity: CpuCapacity,\n    indexer_assignment: &mut IndexerAssignment,\n) {\n    let total_load = indexer_assignment.total_cpu_load(problem);\n    if total_load <= indexer_cpu_capacity.cpu_millis() {\n        return;\n    }\n    let mut load_to_remove: CpuCapacity =\n        CpuCapacity::from_cpu_millis(total_load) - indexer_cpu_capacity;\n    let mut source_cpu_capacities: Vec<(CpuCapacity, SourceOrd)> = indexer_assignment\n        .num_shards_per_source\n        .iter()\n        .map(|(&source_ord, num_shards)| {\n            let load_for_source = problem.source_load_per_shard(source_ord).get() * num_shards;\n            (CpuCapacity::from_cpu_millis(load_for_source), source_ord)\n        })\n        .collect();\n    source_cpu_capacities.sort();\n    for (source_cpu_capacity, source_ord) in source_cpu_capacities {\n        indexer_assignment.num_shards_per_source.remove(&source_ord);\n        load_to_remove = if load_to_remove <= source_cpu_capacity {\n            break;\n        } else {\n            load_to_remove - source_cpu_capacity\n        };\n    }\n    assert_enforce_nodes_cpu_capacity_post_condition(problem, indexer_assignment);\n}\n\nfn assert_enforce_nodes_cpu_capacity_post_condition(\n    problem: &SchedulingProblem,\n    indexer_assignment: &IndexerAssignment,\n) {\n    let total_load = indexer_assignment.total_cpu_load(problem);\n    assert!(\n        total_load\n            <= problem\n                .indexer_cpu_capacity(indexer_assignment.indexer_ord)\n                .cpu_millis()\n    );\n}\n\n// 
----------------------------------------------------\n// Phase 3\n// Place unassigned sources.\n//\n// We use a greedy algorithm as a simple heuristic here.\n//\n// We go through the sources in decreasing order of their load,\n// in two passes.\n//\n// In the first pass, we have a look at\n// the nodes with which there is an affinity.\n//\n// If one of them has room for all of the shards, then we assign all\n// of the shards to it.\n//\n// In the second pass, we just put as many shards as possible on the node\n// with the highest available capacity.\n//\n// If this algorithm fails to place all remaining shards, we inflate\n// the node capacities by 20% in the scheduling problem and start from the beginning.\n\nfn attempt_place_unassigned_shards(\n    unassigned_shards: &[Source],\n    problem: &SchedulingProblem,\n    partial_solution: &SchedulingSolution,\n) -> Result<SchedulingSolution, NotEnoughCapacity> {\n    let mut solution = partial_solution.clone();\n    for source in unassigned_shards {\n        let indexers_with_most_available_capacity =\n            compute_indexer_available_capacity(problem, &solution)\n                .sorted_by_key(|(indexer_ord, capacity)| Reverse((*capacity, *indexer_ord)));\n        place_unassigned_shards_single_source(\n            source,\n            indexers_with_most_available_capacity,\n            &mut solution,\n        )?;\n    }\n    assert_place_unassigned_shards_post_condition(problem, &solution);\n    Ok(solution)\n}\n\nfn place_unassigned_shards_with_affinity(\n    problem: &SchedulingProblem,\n    solution: &mut SchedulingSolution,\n) {\n    let mut unassigned_shards: Vec<Source> = compute_unassigned_sources(problem, solution);\n    unassigned_shards.sort_by_key(|source| {\n        let load = source.num_shards * source.load_per_shard.get();\n        Reverse(load)\n    });\n    for source in &unassigned_shards {\n        // List of indexers with a non-null affinity and some available capacity, sorted by\n        // 
(affinity, available capacity) in that order.\n        let indexers_with_affinity_and_available_capacity = source\n            .affinities\n            .iter()\n            .filter(|&(_, &affinity)| affinity != 0u32)\n            .map(|(&indexer_ord, affinity)| {\n                let available_capacity =\n                    solution.indexer_assignments[indexer_ord].indexer_available_capacity(problem);\n                let capacity = CpuCapacity::from_cpu_millis(available_capacity as u32);\n                (indexer_ord, affinity, capacity)\n            })\n            .sorted_by_key(|(indexer_ord, affinity, capacity)| {\n                Reverse((*affinity, *capacity, *indexer_ord))\n            })\n            .map(|(indexer_ord, _, capacity)| (indexer_ord, capacity));\n        let _ = place_unassigned_shards_single_source(\n            source,\n            indexers_with_affinity_and_available_capacity,\n            solution,\n        );\n    }\n}\n\n#[must_use]\nfn place_unassigned_shards_ignoring_affinity(\n    mut problem: SchedulingProblem,\n    partial_solution: &SchedulingSolution,\n) -> SchedulingSolution {\n    let mut unassigned_shards: Vec<Source> = compute_unassigned_sources(&problem, partial_solution);\n    unassigned_shards.sort_by_key(|source| {\n        let load = source.num_shards * source.load_per_shard.get();\n        Reverse(load)\n    });\n\n    // Thanks to the call to `inflate_node_capacities_if_necessary`, we are\n    // certain that even on our first attempt, the total capacity of the indexer\n    // exceeds 120% of the partial solution. If a large shard needs to be placed\n    // in an already well balanced solution, it may not fit on any node. In that\n    // case, we iteratively grow the virtual capacity until it can be placed.\n    //\n    // 1.2^30 is about 240. 
If we reach 30 attempts we are certain to have a\n    // logical bug.\n    for attempt_number in 0..30 {\n        match attempt_place_unassigned_shards(&unassigned_shards[..], &problem, partial_solution) {\n            Ok(mut solution) => {\n                // the higher the attempt number, the more unbalanced the solution\n                if attempt_number > 0 {\n                    tracing::warn!(\n                        attempt_number = attempt_number,\n                        \"capacity re-scaled, scheduling solution likely unbalanced\"\n                    );\n                }\n                solution.capacity_scaling_iterations = attempt_number;\n                return solution;\n            }\n            Err(NotEnoughCapacity) => {\n                problem.scale_node_capacities(1.2f32);\n            }\n        }\n    }\n    unreachable!(\"Failed to assign all of the sources\");\n}\n\nfn assert_place_unassigned_shards_post_condition(\n    problem: &SchedulingProblem,\n    solution: &SchedulingSolution,\n) {\n    // We make sure that all shards are placed.\n    for source in problem.sources() {\n        let num_assigned_shards: u32 = solution\n            .indexer_assignments\n            .iter()\n            .map(|indexer_assignment| indexer_assignment.num_shards(source.source_ord))\n            .sum();\n        assert_eq!(num_assigned_shards, source.num_shards);\n    }\n    // We make sure that the node capacity is respected.\n    for indexer_assignment in &solution.indexer_assignments {\n        // We call this function just to check that the indexer assignment does not exceed its\n        // capacity. 
(it includes an assert that panics if it happens).\n        assert_enforce_nodes_cpu_capacity_post_condition(problem, indexer_assignment);\n    }\n}\n\nstruct NotEnoughCapacity;\n\n/// Returns `Err(NotEnoughCapacity)` iff the algorithm was unable to pack all of the shards\n/// of the source amongst the nodes, given their node capacity.\nfn place_unassigned_shards_single_source(\n    source: &Source,\n    mut indexer_with_capacities: impl Iterator<Item = (IndexerOrd, CpuCapacity)>,\n    solution: &mut SchedulingSolution,\n) -> Result<(), NotEnoughCapacity> {\n    let mut num_shards = source.num_shards;\n    while num_shards > 0 {\n        let Some((indexer_ord, available_capacity)) = indexer_with_capacities.next() else {\n            return Err(NotEnoughCapacity);\n        };\n        let num_placable_shards = available_capacity.cpu_millis() / source.load_per_shard;\n        let num_shards_to_place = num_placable_shards.min(num_shards);\n        // Update the solution, the shard load, and the number of shards to place.\n        solution.indexer_assignments[indexer_ord]\n            .add_shards(source.source_ord, num_shards_to_place);\n        num_shards -= num_shards_to_place;\n    }\n    Ok(())\n}\n\n/// Compute the sources/shards that have not been assigned to any indexer yet.\n/// Affinities are also updated, with the limitation described in `Source`.\nfn compute_unassigned_sources(\n    problem: &SchedulingProblem,\n    solution: &SchedulingSolution,\n) -> Vec<Source> {\n    let mut unassigned_sources: BTreeMap<SourceOrd, Source> = problem\n        .sources()\n        .map(|source| (source.source_ord as SourceOrd, source))\n        .collect();\n    for (indexer_ord, indexer_assignment) in solution.indexer_assignments.iter().enumerate() {\n        for (&source_ord, &num_shards) in &indexer_assignment.num_shards_per_source {\n            if num_shards == 0 {\n                continue;\n            }\n            let Entry::Occupied(mut entry) = 
unassigned_sources.entry(source_ord) else {\n                panic!(\"The solution contains more shards than the actual problem.\");\n            };\n            if !entry.get_mut().remove_shards(indexer_ord, num_shards) {\n                entry.remove();\n            }\n        }\n    }\n    unassigned_sources.into_values().collect()\n}\n\n/// Returns an iterator over the available capacity of each indexer.\n///\n/// Panics if one of the indexers is over-assigned.\nfn compute_indexer_available_capacity<'a>(\n    problem: &'a SchedulingProblem,\n    solution: &'a SchedulingSolution,\n) -> impl Iterator<Item = (IndexerOrd, CpuCapacity)> + 'a {\n    solution\n        .indexer_assignments\n        .iter()\n        .map(|indexer_assignment| {\n            let available_capacity: i32 = indexer_assignment.indexer_available_capacity(problem);\n            assert!(available_capacity >= 0i32);\n            (\n                indexer_assignment.indexer_ord,\n                CpuCapacity::from_cpu_millis(available_capacity as u32),\n            )\n        })\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroU32;\n\n    use proptest::prelude::*;\n    use quickwit_proto::indexing::mcpu;\n\n    use super::*;\n\n    #[test]\n    fn test_remove_extraneous_shards() {\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(4_000), mcpu(5_000)]);\n        problem.add_source(1, NonZeroU32::new(1_000u32).unwrap());\n        let mut solution = problem.new_solution();\n        solution.indexer_assignments[0].add_shards(0, 3);\n        solution.indexer_assignments[1].add_shards(0, 3);\n        remove_extraneous_shards(&problem, &mut solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 0);\n        assert_eq!(solution.indexer_assignments[1].num_shards(0), 1);\n    }\n\n    #[test]\n    fn test_remove_extraneous_shards_2() {\n        let mut problem =\n            
SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(5_000), mcpu(4_000)]);\n        problem.add_source(2, NonZeroU32::new(1_000).unwrap());\n        let mut solution = problem.new_solution();\n        solution.indexer_assignments[0].add_shards(0, 3);\n        solution.indexer_assignments[1].add_shards(0, 3);\n        remove_extraneous_shards(&problem, &mut solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 2);\n        assert_eq!(solution.indexer_assignments[1].num_shards(0), 0);\n    }\n\n    #[test]\n    fn test_remove_missing_sources() {\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(5_000), mcpu(4_000)]);\n        // Source 0\n        problem.add_source(0, NonZeroU32::new(1_000).unwrap());\n        // Source 1\n        problem.add_source(2, NonZeroU32::new(1_000).unwrap());\n        let mut solution = problem.new_solution();\n        solution.indexer_assignments[0].add_shards(0, 1);\n        solution.indexer_assignments[0].add_shards(1, 1);\n        solution.indexer_assignments[1].add_shards(1, 2);\n        remove_extraneous_shards(&problem, &mut solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 0);\n        assert_eq!(solution.indexer_assignments[0].num_shards(1), 1);\n        assert_eq!(solution.indexer_assignments[1].num_shards(0), 0);\n        assert_eq!(solution.indexer_assignments[1].num_shards(1), 1);\n    }\n\n    #[test]\n    fn test_enforce_nodes_cpu_capacity() {\n        let mut problem = SchedulingProblem::with_indexer_cpu_capacities(vec![\n            mcpu(5_000),\n            mcpu(5_000),\n            mcpu(5_000),\n            mcpu(5_000),\n            mcpu(7_000),\n        ]);\n        // Source 0\n        problem.add_source(10, NonZeroU32::new(3_000).unwrap());\n        problem.add_source(10, NonZeroU32::new(2_000).unwrap());\n        problem.add_source(10, NonZeroU32::new(1_001).unwrap());\n        let mut solution = 
problem.new_solution();\n\n        // node 0 does not exceed its capacity\n        solution.indexer_assignments[0].add_shards(0, 1);\n\n        // node 1 exceeds its capacity with a single source\n        solution.indexer_assignments[1].add_shards(0, 2);\n\n        // node 2 is precisely at capacity\n        solution.indexer_assignments[2].add_shards(0, 1);\n        solution.indexer_assignments[2].add_shards(1, 1);\n\n        // node 3 exceeds its capacity because of several sources\n        // We choose to remove sources entirely (as opposed to removing only shards that do not fit)\n        solution.indexer_assignments[3].add_shards(0, 1);\n        solution.indexer_assignments[3].add_shards(2, 2);\n\n        // node 4 exceeds its capacity because of several sources\n        // We choose to remove sources entirely (as opposed to removing only shards that do not fit)\n        solution.indexer_assignments[4].add_shards(0, 1);\n        solution.indexer_assignments[4].add_shards(1, 1);\n        solution.indexer_assignments[4].add_shards(2, 2);\n\n        enforce_indexers_cpu_capacity(&problem, &mut solution);\n\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 1);\n        assert_eq!(solution.indexer_assignments[0].num_shards(1), 0);\n        assert_eq!(solution.indexer_assignments[0].num_shards(2), 0);\n\n        // We remove sources entirely!\n        assert_eq!(solution.indexer_assignments[1].num_shards(0), 0);\n        assert_eq!(solution.indexer_assignments[1].num_shards(1), 0);\n        assert_eq!(solution.indexer_assignments[1].num_shards(2), 0);\n\n        assert_eq!(solution.indexer_assignments[2].num_shards(0), 1);\n        assert_eq!(solution.indexer_assignments[2].num_shards(1), 1);\n        assert_eq!(solution.indexer_assignments[2].num_shards(2), 0);\n\n        assert_eq!(solution.indexer_assignments[3].num_shards(0), 1);\n        assert_eq!(solution.indexer_assignments[3].num_shards(1), 0);\n        
assert_eq!(solution.indexer_assignments[3].num_shards(2), 0);\n\n        assert_eq!(solution.indexer_assignments[4].num_shards(0), 1);\n        assert_eq!(solution.indexer_assignments[4].num_shards(1), 0);\n        assert_eq!(solution.indexer_assignments[4].num_shards(2), 2);\n    }\n\n    #[test]\n    fn test_compute_unassigned_shards_simple() {\n        let mut problem = SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(4_000)]);\n        problem.add_source(4, NonZeroU32::new(1000).unwrap());\n        problem.add_source(4, NonZeroU32::new(1_000).unwrap());\n        let solution = problem.new_solution();\n        let unassigned_shards = compute_unassigned_sources(&problem, &solution);\n        assert_eq!(\n            unassigned_shards[0],\n            Source {\n                source_ord: 0,\n                load_per_shard: NonZeroU32::new(1_000).unwrap(),\n                num_shards: 4,\n                affinities: BTreeMap::default(),\n            }\n        );\n    }\n\n    #[test]\n    fn test_compute_unassigned_shards_with_non_trivial_solution() {\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(50_000), mcpu(40_000)]);\n        problem.add_source(5, NonZeroU32::new(1_000).unwrap());\n        problem.add_source(15, NonZeroU32::new(2_000).unwrap());\n        let mut solution = problem.new_solution();\n\n        solution.indexer_assignments[0].add_shards(0, 1);\n        solution.indexer_assignments[0].add_shards(1, 3);\n        solution.indexer_assignments[1].add_shards(0, 2);\n        solution.indexer_assignments[1].add_shards(1, 3);\n        let unassigned_shards = compute_unassigned_sources(&problem, &solution);\n        assert_eq!(\n            unassigned_shards[0],\n            Source {\n                source_ord: 0,\n                load_per_shard: NonZeroU32::new(1_000).unwrap(),\n                num_shards: 5 - (1 + 2),\n                affinities: Default::default(),\n            }\n        );\n  
      assert_eq!(\n            unassigned_shards[1],\n            Source {\n                source_ord: 1,\n                load_per_shard: NonZeroU32::new(2_000).unwrap(),\n                num_shards: 15 - (3 + 3),\n                affinities: Default::default(),\n            }\n        );\n    }\n\n    #[test]\n    fn test_place_unassigned_shards_simple() {\n        let mut problem = SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(4_000)]);\n        problem.add_source(4, NonZeroU32::new(1_000).unwrap());\n        let partial_solution = problem.new_solution();\n        let solution = place_unassigned_shards_ignoring_affinity(problem, &partial_solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 4);\n    }\n\n    #[test]\n    fn test_place_unassigned_shards_with_affinity() {\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(4_000), mcpu(4000)]);\n        problem.add_source(4, NonZeroU32::new(1_000).unwrap());\n        problem.add_source(4, NonZeroU32::new(1_000).unwrap());\n        problem.inc_affinity(0, 1);\n        problem.inc_affinity(1, 0);\n        let mut solution = problem.new_solution();\n        place_unassigned_shards_with_affinity(&problem, &mut solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(1), 4);\n        assert_eq!(solution.indexer_assignments[1].num_shards(0), 4);\n    }\n\n    #[test]\n    fn test_place_unassigned_shards_reach_capacity() {\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(50_000), mcpu(40_000)]);\n        problem.add_source(5, NonZeroU32::new(1_000).unwrap());\n        problem.add_source(15, NonZeroU32::new(2_000).unwrap());\n        let mut solution = problem.new_solution();\n        solution.indexer_assignments[0].add_shards(0, 1);\n        solution.indexer_assignments[0].add_shards(1, 3);\n        solution.indexer_assignments[1].add_shards(0, 2);\n        
solution.indexer_assignments[1].add_shards(1, 3);\n        let unassigned_shards = compute_unassigned_sources(&problem, &solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 1);\n        assert_eq!(solution.indexer_assignments[0].num_shards(1), 3);\n        assert_eq!(solution.indexer_assignments[1].num_shards(0), 2);\n        assert_eq!(solution.indexer_assignments[1].num_shards(1), 3);\n        assert_eq!(\n            unassigned_shards[0],\n            Source {\n                source_ord: 0,\n                load_per_shard: NonZeroU32::new(1_000).unwrap(),\n                num_shards: 5 - (1 + 2),\n                affinities: Default::default(),\n            }\n        );\n        assert_eq!(\n            unassigned_shards[1],\n            Source {\n                source_ord: 1,\n                load_per_shard: NonZeroU32::new(2_000).unwrap(),\n                num_shards: 15 - (3 + 3),\n                affinities: Default::default(),\n            }\n        );\n    }\n\n    #[test]\n    fn test_solve() {\n        let mut problem = SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(800)]);\n        problem.add_source(43, NonZeroU32::new(1).unwrap());\n        problem.add_source(379, NonZeroU32::new(1).unwrap());\n        let previous_solution = problem.new_solution();\n        solve(problem, previous_solution);\n    }\n\n    fn indexer_cpu_capacity_strat() -> impl Strategy<Value = CpuCapacity> {\n        prop_oneof![\n            1u32..10_000u32,\n            Just(1u32),\n            800u32..1200u32,\n            1900u32..2100u32,\n        ]\n        .prop_map(CpuCapacity::from_cpu_millis)\n    }\n\n    fn num_shards() -> impl Strategy<Value = u32> {\n        0u32..3u32\n    }\n\n    fn source_strat() -> impl Strategy<Value = (u32, NonZeroU32)> {\n        let load_strat = prop_oneof![\n            Just(1u32),\n            Just(2u32),\n            Just(10u32),\n            Just(100u32),\n            Just(250u32),\n            
1u32..1_000u32\n        ];\n        (\n            num_shards(),\n            load_strat.prop_map(|load| NonZeroU32::new(load).unwrap()),\n        )\n    }\n\n    fn problem_strategy(\n        num_nodes: usize,\n        num_sources: usize,\n    ) -> impl Strategy<Value = SchedulingProblem> {\n        let indexer_cpu_capacity_strat =\n            proptest::collection::vec(indexer_cpu_capacity_strat(), num_nodes);\n        let sources_strat = proptest::collection::vec(source_strat(), num_sources);\n        (indexer_cpu_capacity_strat, sources_strat).prop_map(|(node_cpu_capacities, sources)| {\n            let mut problem = SchedulingProblem::with_indexer_cpu_capacities(node_cpu_capacities);\n            for (num_shards, load_per_shard) in sources {\n                problem.add_source(num_shards, load_per_shard);\n            }\n            problem\n        })\n    }\n\n    fn num_nodes_strat() -> impl Strategy<Value = usize> {\n        prop_oneof![\n            3 => 1usize..3,\n            1 => 4usize..10,\n        ]\n    }\n    fn num_sources_strat() -> impl Strategy<Value = usize> {\n        prop_oneof![\n            3 => 0usize..3,\n            1 => 4usize..10,\n        ]\n    }\n\n    fn indexer_assignments_strategy(num_sources: usize) -> impl Strategy<Value = Vec<u32>> {\n        proptest::collection::vec(0u32..3u32, num_sources)\n    }\n\n    fn initial_solution_strategy(\n        num_nodes: usize,\n        num_sources: usize,\n    ) -> impl Strategy<Value = SchedulingSolution> {\n        proptest::collection::vec(indexer_assignments_strategy(num_sources), num_nodes).prop_map(\n            move |indexer_assignments: Vec<Vec<u32>>| {\n                let mut solution = SchedulingSolution::with_num_indexers(num_nodes);\n                for (node_id, indexer_assignment) in indexer_assignments.iter().enumerate() {\n                    for (source_ord, num_shards) in indexer_assignment.iter().copied().enumerate() {\n                        
solution.indexer_assignments[node_id]\n                            .add_shards(source_ord as u32, num_shards);\n                    }\n                }\n                solution\n            },\n        )\n    }\n\n    fn problem_solution_strategy() -> impl Strategy<Value = (SchedulingProblem, SchedulingSolution)>\n    {\n        (num_nodes_strat(), num_sources_strat()).prop_flat_map(move |(num_nodes, num_sources)| {\n            (\n                problem_strategy(num_nodes, num_sources),\n                initial_solution_strategy(num_nodes, num_sources),\n            )\n        })\n    }\n\n    #[test]\n    fn test_problem_missing_capacities() {\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![CpuCapacity::from_cpu_millis(100)]);\n        problem.add_source(1, NonZeroU32::new(1).unwrap());\n        let mut previous_solution = problem.new_solution();\n        previous_solution.indexer_assignments[0].add_shards(0, 0);\n        let solution = solve(problem, previous_solution);\n        assert_eq!(solution.indexer_assignments[0].num_shards(0), 1);\n    }\n\n    #[test]\n    fn test_problem_unbalanced_simple() {\n        let mut problem = SchedulingProblem::with_indexer_cpu_capacities(vec![\n            CpuCapacity::from_cpu_millis(1),\n            CpuCapacity::from_cpu_millis(1),\n        ]);\n        problem.add_source(1, NonZeroU32::new(10).unwrap());\n        for _ in 0..10 {\n            problem.add_source(1, NonZeroU32::new(1).unwrap());\n        }\n        let previous_solution = problem.new_solution();\n        let solution = solve(problem.clone(), previous_solution);\n        let available_capacities: Vec<u32> = solution\n            .indexer_assignments\n            .iter()\n            .map(|indexer_assignment: &IndexerAssignment| {\n                indexer_assignment.total_cpu_load(&problem)\n            })\n            .collect();\n        assert_eq!(available_capacities.len(), 2);\n        let (min, max) = 
available_capacities\n            .into_iter()\n            .minmax()\n            .into_option()\n            .unwrap();\n        assert_eq!(min, 10);\n        assert_eq!(max, 10);\n    }\n\n    proptest! {\n        #[test]\n        fn test_proptest_post_conditions((problem, solution) in problem_solution_strategy()) {\n            let solution_1 = solve(problem.clone(), solution);\n            let solution_2 = solve(problem.clone(), solution_1.clone());\n            // TODO: This assert actually fails for some scenarii. We say it is fine\n            // for now as long as the solution does not change again during the\n            // next resolution:\n            // let has_solution_changed_once = solution_1.indexer_assignments != solution_2.indexer_assignments;\n            // assert!(!has_solution_changed_once, \"Solution changed for same problem\\nSolution 1:{solution_1:?}\\nSolution 2: {solution_2:?}\");\n            let solution_3 = solve(problem, solution_2.clone());\n            let has_solution_changed_again = solution_2.indexer_assignments != solution_3.indexer_assignments;\n            assert!(!has_solution_changed_again, \"solution unstable!!!\\nSolution 1: {solution_1:?}\\nSolution 2: {solution_2:?}\\nSolution 3: {solution_3:?}\");\n        }\n    }\n\n    #[test]\n    fn test_capacity_scaling_iteration_required() {\n        // Create a problem where affinity constraints cause suboptimal placement\n        // requiring iterative scaling despite initial capacity scaling.\n        let mut problem =\n            SchedulingProblem::with_indexer_cpu_capacities(vec![mcpu(3000), mcpu(3000)]);\n        problem.add_source(1, NonZeroU32::new(2500).unwrap()); // Source 0\n        problem.add_source(1, NonZeroU32::new(2500).unwrap()); // Source 1\n        problem.add_source(1, NonZeroU32::new(1500).unwrap()); // Source 2\n        let previous_solution = problem.new_solution();\n        let solution = solve(problem, previous_solution);\n\n        
assert_eq!(solution.capacity_scaling_iterations, 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/indexing_scheduler/scheduling/scheduling_logic_model.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::collections::btree_map::Entry;\nuse std::num::NonZeroU32;\n\nuse quickwit_proto::indexing::CpuCapacity;\n\npub type SourceOrd = u32;\npub type IndexerOrd = usize;\n\n#[derive(Clone, Debug, Eq, PartialEq)]\npub struct Source {\n    pub source_ord: SourceOrd,\n    pub load_per_shard: NonZeroU32,\n    /// Affinities of the source for each indexer.\n    /// In the beginning, affinities are initialized to be the count of shards of the source\n    /// that are located on the indexer.\n    ///\n    /// As we compute unassigned sources, we decrease the affinity by the given number of shards,\n    /// saturating at 0.\n    ///\n    /// As a result we only have the invariant\n    /// and `affinity(source, indexer) <= num shard of source on indexer`\n    pub affinities: BTreeMap<IndexerOrd, u32>,\n    pub num_shards: u32,\n}\n\nimpl Source {\n    // Remove a given number of shards, located on the given indexer.\n    // Returns `false` if and only if all of the shards have been removed.\n    //\n    // This function also decrease the affinity of the source for the given indexer\n    // by num_shards_to_remove in a saturating way.\n    //\n    // # Panics\n    //\n    // If the source does have that many total number of shards to begin with.\n    pub fn remove_shards(&mut self, indexer_ord: usize, 
num_shards_to_remove: u32) -> bool {\n        if num_shards_to_remove == 0u32 {\n            return self.num_shards > 0u32;\n        }\n        let entry = self.affinities.entry(indexer_ord);\n        self.num_shards = self\n            .num_shards\n            .checked_sub(num_shards_to_remove)\n            .expect(\"removing more shards than available.\");\n        if self.num_shards == 0u32 {\n            self.affinities.clear();\n            return false;\n        }\n        if let Entry::Occupied(mut affinity_with_indexer_entry) = entry {\n            let affinity_with_indexer: &mut u32 = affinity_with_indexer_entry.get_mut();\n            let affinity_after_removal = affinity_with_indexer.saturating_sub(num_shards_to_remove);\n            if affinity_after_removal == 0u32 {\n                affinity_with_indexer_entry.remove();\n            } else {\n                *affinity_with_indexer = affinity_after_removal;\n            }\n        }\n        true\n    }\n}\n\n#[derive(Debug, Clone)]\npub struct SchedulingProblem {\n    sources: Vec<Source>,\n    indexer_cpu_capacities: Vec<CpuCapacity>,\n}\n\nimpl SchedulingProblem {\n    /// Problem constructor.\n    ///\n    /// Panics if the list of indexers is empty or if one of the\n    /// indexer has a null capacity.\n    pub fn with_indexer_cpu_capacities(\n        indexer_cpu_capacities: Vec<CpuCapacity>,\n    ) -> SchedulingProblem {\n        assert!(!indexer_cpu_capacities.is_empty());\n        assert!(\n            indexer_cpu_capacities\n                .iter()\n                .all(|cpu_capacity| cpu_capacity.cpu_millis() > 0)\n        );\n        // TODO assert for affinity.\n        SchedulingProblem {\n            sources: Vec::new(),\n            indexer_cpu_capacities,\n        }\n    }\n\n    pub fn new_solution(&self) -> SchedulingSolution {\n        SchedulingSolution::with_num_indexers(self.indexer_cpu_capacities.len())\n    }\n\n    pub fn indexer_cpu_capacity(&self, indexer_ord: IndexerOrd) -> 
CpuCapacity {\n        self.indexer_cpu_capacities[indexer_ord]\n    }\n\n    /// Scales the cpu capacity by the given scaling factor.\n    ///\n    /// Resulting cpu capacity are ceiled to the next integer millicpus value.\n    pub fn scale_node_capacities(&mut self, scale: f32) {\n        for capacity in &mut self.indexer_cpu_capacities {\n            let scaled_cpu_millis = (capacity.cpu_millis() as f32 * scale).ceil() as u32;\n            *capacity = CpuCapacity::from_cpu_millis(scaled_cpu_millis);\n        }\n    }\n\n    pub fn total_node_capacities(&self) -> CpuCapacity {\n        self.indexer_cpu_capacities\n            .iter()\n            .copied()\n            .fold(CpuCapacity::zero(), |left, right| left + right)\n    }\n\n    pub fn total_load(&self) -> u32 {\n        self.sources\n            .iter()\n            .map(|source| source.num_shards * source.load_per_shard.get())\n            .sum()\n    }\n\n    pub fn sources(&self) -> impl Iterator<Item = Source> + '_ {\n        self.sources.iter().cloned()\n    }\n\n    pub fn add_source(&mut self, num_shards: u32, load_per_shard: NonZeroU32) -> SourceOrd {\n        let source_ord = self.sources.len() as SourceOrd;\n        self.sources.push(Source {\n            source_ord,\n            num_shards,\n            load_per_shard,\n            affinities: Default::default(),\n        });\n        source_ord\n    }\n\n    /// Increases the affinity source <-> indexer by 1.\n    /// This is done to record that the indexer is hosting one shard of the source.\n    pub fn inc_affinity(&mut self, source_ord: SourceOrd, indexer_ord: IndexerOrd) {\n        let affinity: &mut u32 = self.sources[source_ord as usize]\n            .affinities\n            .entry(indexer_ord)\n            .or_default();\n        *affinity += 1;\n    }\n\n    pub fn source_load_per_shard(&self, source_ord: SourceOrd) -> NonZeroU32 {\n        self.sources[source_ord as usize].load_per_shard\n    }\n\n    pub fn num_sources(&self) -> 
usize {\n        self.sources.len()\n    }\n\n    pub fn num_indexers(&self) -> usize {\n        self.indexer_cpu_capacities.len()\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq)]\npub struct IndexerAssignment {\n    pub indexer_ord: IndexerOrd,\n    pub num_shards_per_source: BTreeMap<SourceOrd, u32>,\n}\n\nimpl IndexerAssignment {\n    pub fn new(indexer_ord: IndexerOrd) -> IndexerAssignment {\n        IndexerAssignment {\n            indexer_ord,\n            num_shards_per_source: Default::default(),\n        }\n    }\n\n    /// Returns the number of available `mcpu` in the indexer.\n    /// If the indexer is over-assigned this method returns a negative value.\n    pub fn indexer_available_capacity(&self, problem: &SchedulingProblem) -> i32 {\n        let total_cpu_load = self.total_cpu_load(problem);\n        let indexer_cpu_capacity = problem.indexer_cpu_capacities[self.indexer_ord];\n        indexer_cpu_capacity.cpu_millis() as i32 - total_cpu_load as i32\n    }\n\n    pub fn total_cpu_load(&self, problem: &SchedulingProblem) -> u32 {\n        self.num_shards_per_source\n            .iter()\n            .map(|(source_ord, num_shards)| {\n                problem.source_load_per_shard(*source_ord).get() * num_shards\n            })\n            .sum()\n    }\n\n    pub fn num_shards(&self, source_ord: SourceOrd) -> u32 {\n        self.num_shards_per_source\n            .get(&source_ord)\n            .copied()\n            .unwrap_or(0u32)\n    }\n\n    /// Adds shards to a source (no-op if `num_shards` is 0).\n    pub fn add_shards(&mut self, source_ord: u32, num_shards: u32) {\n        // No need to fill indexer_assignments with empty assignments.\n        if num_shards == 0 {\n            return;\n        }\n        *self.num_shards_per_source.entry(source_ord).or_default() += num_shards;\n    }\n\n    pub fn remove_shards(&mut self, source_ord: u32, num_shards_removed: u32) {\n        let entry = self.num_shards_per_source.entry(source_ord);\n        let 
Entry::Occupied(mut occupied_entry) = entry else {\n            assert_eq!(num_shards_removed, 0);\n            return;\n        };\n        let previous_shard_count = *occupied_entry.get();\n        assert!(previous_shard_count >= num_shards_removed);\n        if previous_shard_count > num_shards_removed {\n            *occupied_entry.get_mut() -= num_shards_removed\n        } else {\n            occupied_entry.remove();\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\npub struct SchedulingSolution {\n    pub indexer_assignments: Vec<IndexerAssignment>,\n    // used for tests\n    pub capacity_scaling_iterations: usize,\n}\n\nimpl SchedulingSolution {\n    pub fn with_num_indexers(num_indexers: usize) -> SchedulingSolution {\n        SchedulingSolution {\n            indexer_assignments: (0..num_indexers).map(IndexerAssignment::new).collect(),\n            capacity_scaling_iterations: 0,\n        }\n    }\n    pub fn num_indexers(&self) -> usize {\n        self.indexer_assignments.len()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    fn test_source() -> Source {\n        let mut affinities: BTreeMap<usize, u32> = Default::default();\n        affinities.insert(7, 3u32);\n        affinities.insert(11, 2u32);\n        Source {\n            source_ord: 0u32,\n            load_per_shard: NonZeroU32::new(1000u32).unwrap(),\n            affinities,\n            num_shards: 2 + 3,\n        }\n    }\n\n    #[test]\n    fn test_source_remove_simple() {\n        let mut source = test_source();\n        assert!(source.remove_shards(7, 2));\n        assert_eq!(source.num_shards, 5 - 2);\n        assert_eq!(source.affinities.get(&7).copied(), Some(1));\n        assert_eq!(source.affinities.get(&11).copied(), Some(2));\n    }\n\n    #[test]\n    fn test_source_remove_all_affinity() {\n        let mut source = test_source();\n        assert!(source.remove_shards(7, 3));\n        assert_eq!(source.num_shards, 5 - 3);\n        
assert!(!source.affinities.contains_key(&7));\n        assert_eq!(source.affinities.get(&11).copied(), Some(2));\n    }\n\n    #[test]\n    fn test_source_remove_more_than_affinity() {\n        let mut source = test_source();\n        assert!(source.remove_shards(7, 4));\n        assert_eq!(source.num_shards, 5 - 4);\n        assert!(!source.affinities.contains_key(&7));\n        assert_eq!(source.affinities.get(&11).copied(), Some(2));\n    }\n\n    #[test]\n    fn test_source_remove_all_shards() {\n        let mut source = test_source();\n        assert!(!source.remove_shards(7, 5));\n        assert_eq!(source.num_shards, 0);\n        assert!(source.affinities.is_empty());\n    }\n\n    #[test]\n    #[should_panic]\n    fn test_source_remove_more_than_all_shards() {\n        let mut source = test_source();\n        assert!(source.remove_shards(7, 6));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/ingest/ingest_controller.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::btree_map::Entry;\nuse std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};\nuse std::fmt;\nuse std::future::Future;\nuse std::num::NonZeroUsize;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse fnv::FnvHashSet;\nuse futures::StreamExt;\nuse futures::stream::FuturesUnordered;\nuse itertools::{Itertools as _, MinMaxResult};\nuse quickwit_actors::Mailbox;\nuse quickwit_common::Progress;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_ingest::{IngesterPool, LeaderId, LocalShardsUpdate};\nuse quickwit_proto::control_plane::{\n    AdviseResetShardsRequest, AdviseResetShardsResponse, GetOrCreateOpenShardsFailureReason,\n    GetOrCreateOpenShardsRequest, GetOrCreateOpenShardsResponse, GetOrCreateOpenShardsSubrequest,\n    GetOrCreateOpenShardsSuccess,\n};\nuse quickwit_proto::ingest::ingester::{\n    CloseShardsRequest, CloseShardsResponse, IngesterService, IngesterStatus, InitShardFailure,\n    InitShardSubrequest, InitShardsRequest, InitShardsResponse, RetainShardsForSource,\n    RetainShardsRequest,\n};\nuse quickwit_proto::ingest::{\n    Shard, ShardIdPosition, ShardIdPositions, ShardIds, ShardPKey, ShardState,\n};\nuse quickwit_proto::metastore::{\n    MetastoreResult, MetastoreService, MetastoreServiceClient, OpenShardSubrequest,\n    OpenShardsRequest, OpenShardsResponse, serde_utils,\n};\nuse 
quickwit_proto::types::{IndexUid, NodeId, NodeIdRef, Position, ShardId, SourceUid};\nuse rand::prelude::IndexedRandom;\nuse rand::rngs::ThreadRng;\nuse rand::seq::SliceRandom;\nuse rand::{Rng, RngCore, rng};\nuse serde::{Deserialize, Serialize};\nuse tokio::sync::{Mutex, OwnedMutexGuard};\nuse tracing::{Level, debug, enabled, error, info, instrument, warn};\nuse ulid::Ulid;\n\nuse super::scaling_arbiter::ScalingArbiter;\nuse crate::control_plane::ControlPlane;\nuse crate::ingest::wait_handle::WaitHandle;\nuse crate::model::{ControlPlaneModel, ScalingMode, ShardEntry, ShardStats};\n\nconst CLOSE_SHARDS_REQUEST_TIMEOUT: Duration = if cfg!(test) {\n    Duration::from_millis(50)\n} else {\n    Duration::from_secs(3)\n};\n\nconst INIT_SHARDS_REQUEST_TIMEOUT: Duration = CLOSE_SHARDS_REQUEST_TIMEOUT;\n\nconst CLOSE_SHARDS_UPON_REBALANCE_DELAY: Duration = if cfg!(test) {\n    Duration::ZERO\n} else {\n    Duration::from_secs(10)\n};\n\nconst FIRE_AND_FORGET_TIMEOUT: Duration = Duration::from_secs(3);\n\n/// Spawns a new task to execute the given future,\n/// and stops polling it/drops it after a timeout.\n///\n/// All errors are ignored, and not even logged.\nfn fire_and_forget(\n    fut: impl Future<Output = ()> + Send + 'static,\n    operation: impl std::fmt::Display + Send + 'static,\n) {\n    tokio::spawn(async move {\n        if let Err(_timeout_elapsed) = tokio::time::timeout(FIRE_AND_FORGET_TIMEOUT, fut).await {\n            error!(%operation, \"timeout elapsed\");\n        }\n    });\n}\n\n// Returns a random position of the els `slice`, such that the element in this array is NOT\n// `except_el`.\nfn pick_position(\n    els: &[&NodeIdRef],\n    except_el_opt: Option<&NodeIdRef>,\n    rng: &mut ThreadRng,\n) -> Option<usize> {\n    let except_pos_opt =\n        except_el_opt.and_then(|except_el| els.iter().position(|el| *el == except_el));\n    if let Some(except_pos) = except_pos_opt {\n        let pos = rng.random_range(0..els.len() - 1);\n        if pos >= 
except_pos {\n            Some(pos + 1)\n        } else {\n            Some(pos)\n        }\n    } else {\n        Some(rng.random_range(0..els.len()))\n    }\n}\n\n/// Pick a node from the `shard_count_to_node_ids` that is different from `except_node_opt`.\n/// We pick in priority nodes with the least number of shards, and we break any tie randomly.\n///\n/// Once a node has been found, we update the `shard_count_to_node_ids` to reflect the new state.\n/// In particular, the ingester node is moved from its previous shard_count level to its new\n/// shard_count level. In particular, a shard_count entry that is empty should be removed from the\n/// BTreeMap.\nfn pick_one<'a>(\n    shard_count_to_node_ids: &mut BTreeMap<usize, Vec<&'a NodeIdRef>>,\n    except_node_opt: Option<&'a NodeIdRef>,\n    rng: &mut ThreadRng,\n) -> Option<&'a NodeIdRef> {\n    let (&shard_count, _) = shard_count_to_node_ids.iter().find(|(_, node_ids)| {\n        let Some(except_node) = except_node_opt else {\n            return true;\n        };\n        if node_ids.len() >= 2 {\n            return true;\n        }\n        let Some(&single_node_id) = node_ids.first() else {\n            return false;\n        };\n        single_node_id != except_node\n    })?;\n    let mut shard_entry = shard_count_to_node_ids.entry(shard_count);\n    let Entry::Occupied(occupied_shard_entry) = &mut shard_entry else {\n        panic!();\n    };\n    let nodes = occupied_shard_entry.get_mut();\n    let position = pick_position(nodes, except_node_opt, rng)?;\n\n    let node_id = nodes.swap_remove(position);\n    let new_shard_count = shard_count + 1;\n    let should_remove_entry = nodes.is_empty();\n\n    if should_remove_entry {\n        shard_count_to_node_ids.remove(&shard_count);\n    }\n    shard_count_to_node_ids\n        .entry(new_shard_count)\n        .or_default()\n        .push(node_id);\n    Some(node_id)\n}\n\n/// Pick two ingester nodes from `shard_count_to_node_ids` different one from each 
other.\n/// Ingesters with the lower number of shards are preferred.\nfn pick_two<'a>(\n    shard_count_to_node_ids: &mut BTreeMap<usize, Vec<&'a NodeIdRef>>,\n    rng: &mut ThreadRng,\n) -> Option<(&'a NodeIdRef, &'a NodeIdRef)> {\n    let leader = pick_one(shard_count_to_node_ids, None, rng)?;\n    let follower = pick_one(shard_count_to_node_ids, Some(leader), rng)?;\n    Some((leader, follower))\n}\n\nfn allocate_shards(\n    node_id_shard_counts: &HashMap<NodeId, usize>,\n    num_shards: usize,\n    replication_enabled: bool,\n) -> Option<Vec<(&NodeIdRef, Option<&NodeIdRef>)>> {\n    let mut shard_count_to_node_ids: BTreeMap<usize, Vec<&NodeIdRef>> = BTreeMap::default();\n    for (node_id, &num_shards) in node_id_shard_counts {\n        shard_count_to_node_ids\n            .entry(num_shards)\n            .or_default()\n            .push(node_id.as_ref());\n    }\n    let mut rng = rng();\n    let mut shard_allocations: Vec<(&NodeIdRef, Option<&NodeIdRef>)> =\n        Vec::with_capacity(num_shards);\n    for _ in 0..num_shards {\n        if replication_enabled {\n            let (leader, follower) = pick_two(&mut shard_count_to_node_ids, &mut rng)?;\n            shard_allocations.push((leader, Some(follower)));\n        } else {\n            let leader = pick_one(&mut shard_count_to_node_ids, None, &mut rng)?;\n            shard_allocations.push((leader, None));\n        }\n    }\n    Some(shard_allocations)\n}\n\n#[derive(Debug, Default, Clone, Copy, Serialize, Deserialize)]\npub struct IngestControllerStats {\n    pub num_rebalance_shards_ops: usize,\n}\n\npub struct IngestController {\n    pub(crate) ingester_pool: IngesterPool,\n    pub(crate) stats: IngestControllerStats,\n    metastore: MetastoreServiceClient,\n    replication_factor: usize,\n    // This lock ensures that only one rebalance operation is performed at a time.\n    rebalance_lock: Arc<Mutex<()>>,\n    scaling_arbiter: ScalingArbiter,\n}\n\nimpl fmt::Debug for IngestController {\n    fn 
fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"IngestController\")\n            .field(\"ingester_pool\", &self.ingester_pool)\n            .field(\"metastore\", &self.metastore)\n            .field(\"replication_factor\", &self.replication_factor)\n            .finish()\n    }\n}\n\n/// Updates both the metastore and the control plane.\n/// If successful, the control plane is guaranteed to be in sync with the metastore.\n/// If an error is returned, the control plane might be out of sync with the metastore.\n/// It is up to the client to check the error type and see if the control plane actor should be\n/// restarted.\nasync fn open_shards_on_metastore_and_model(\n    open_shard_subrequests: Vec<OpenShardSubrequest>,\n    metastore: &mut MetastoreServiceClient,\n    model: &mut ControlPlaneModel,\n) -> MetastoreResult<OpenShardsResponse> {\n    if open_shard_subrequests.is_empty() {\n        return Ok(OpenShardsResponse {\n            subresponses: Vec::new(),\n        });\n    }\n    let open_shards_request = OpenShardsRequest {\n        subrequests: open_shard_subrequests,\n    };\n    let open_shards_response = metastore.open_shards(open_shards_request).await?;\n    for open_shard_subresponse in &open_shards_response.subresponses {\n        if let Some(shard) = &open_shard_subresponse.open_shard {\n            let shard = shard.clone();\n            let index_uid = shard.index_uid().clone();\n            let source_id = shard.source_id.clone();\n            model.insert_shards(&index_uid, &source_id, vec![shard]);\n        }\n    }\n    Ok(open_shards_response)\n}\n\nfn get_open_shard_from_model(\n    get_open_shards_subrequest: &GetOrCreateOpenShardsSubrequest,\n    model: &ControlPlaneModel,\n    unavailable_leaders: &FnvHashSet<NodeId>,\n) -> Result<Option<GetOrCreateOpenShardsSuccess>, GetOrCreateOpenShardsFailureReason> {\n    let Some(index_uid) = model.index_uid(&get_open_shards_subrequest.index_id) else {\n        return 
Err(GetOrCreateOpenShardsFailureReason::IndexNotFound);\n    };\n    let Some(open_shard_entries) = model.find_open_shards(\n        index_uid,\n        &get_open_shards_subrequest.source_id,\n        unavailable_leaders,\n    ) else {\n        return Err(GetOrCreateOpenShardsFailureReason::SourceNotFound);\n    };\n    if open_shard_entries.is_empty() {\n        return Ok(None);\n    }\n    // We already have open shards. Let's return them.\n    let open_shards: Vec<Shard> = open_shard_entries\n        .into_iter()\n        .map(|shard_entry| shard_entry.shard)\n        .collect();\n    Ok(Some(GetOrCreateOpenShardsSuccess {\n        subrequest_id: get_open_shards_subrequest.subrequest_id,\n        index_uid: Some(index_uid.clone()),\n        source_id: get_open_shards_subrequest.source_id.clone(),\n        open_shards,\n    }))\n}\n\nimpl IngestController {\n    pub fn new(\n        metastore: MetastoreServiceClient,\n        ingester_pool: IngesterPool,\n        replication_factor: usize,\n        max_shard_ingestion_throughput_mib_per_sec: f32,\n        shard_scale_up_factor: f32,\n    ) -> Self {\n        IngestController {\n            metastore,\n            ingester_pool,\n            replication_factor,\n            rebalance_lock: Arc::new(Mutex::new(())),\n            stats: IngestControllerStats::default(),\n            scaling_arbiter: ScalingArbiter::with_max_shard_ingestion_throughput_mib_per_sec(\n                max_shard_ingestion_throughput_mib_per_sec,\n                shard_scale_up_factor,\n            ),\n        }\n    }\n\n    /// Sends a retain shard request to the given list of ingesters.\n    ///\n    /// If the request fails, we just log an error.\n    pub(crate) fn sync_with_ingesters(\n        &self,\n        ingesters: &BTreeSet<NodeId>,\n        model: &ControlPlaneModel,\n    ) {\n        for ingester in ingesters {\n            self.sync_with_ingester(ingester, model);\n        }\n    }\n\n    pub(crate) fn 
sync_with_all_ingesters(&self, model: &ControlPlaneModel) {\n        let ingesters: Vec<NodeId> = self.ingester_pool.keys();\n        for ingester in ingesters {\n            self.sync_with_ingester(&ingester, model);\n        }\n    }\n\n    /// Syncs the ingester in a fire and forget manner.\n    ///\n    /// The returned oneshot is just here for unit test to wait for the operation to terminate.\n    fn sync_with_ingester(&self, ingester_id: &NodeId, model: &ControlPlaneModel) -> WaitHandle {\n        info!(ingester = %ingester_id, \"sync_with_ingester\");\n        let (wait_drop_guard, wait_handle) = WaitHandle::new();\n        let Some(ingester) = self.ingester_pool.get(ingester_id) else {\n            // TODO: (Maybe) We should mark the ingester as unavailable, and stop advertise its\n            // shard to routers.\n            warn!(\"failed to sync with ingester `{ingester_id}`: not available\");\n            return wait_handle;\n        };\n        let mut retain_shards_req = RetainShardsRequest::default();\n        for (source_uid, shard_ids) in &*model.list_shards_for_node(ingester_id) {\n            let shards_for_source = RetainShardsForSource {\n                index_uid: Some(source_uid.index_uid.clone()),\n                source_id: source_uid.source_id.clone(),\n                shard_ids: shard_ids.iter().cloned().collect(),\n            };\n            retain_shards_req\n                .retain_shards_for_sources\n                .push(shards_for_source);\n        }\n        info!(%ingester_id, \"retain shards ingester\");\n        let operation: String = format!(\"retain shards `{ingester_id}`\");\n        fire_and_forget(\n            async move {\n                if let Err(retain_shards_err) =\n                    ingester.client.retain_shards(retain_shards_req).await\n                {\n                    error!(%retain_shards_err, \"retain shards error\");\n                }\n                // just a way to force moving the drop guard.\n  
              drop(wait_drop_guard);\n            },\n            operation,\n        );\n        wait_handle\n    }\n\n    fn handle_closed_shards(&self, closed_shards: Vec<ShardIds>, model: &mut ControlPlaneModel) {\n        for closed_shard in closed_shards {\n            let index_uid: IndexUid = closed_shard.index_uid().clone();\n            let source_id = closed_shard.source_id;\n\n            let source_uid = SourceUid {\n                index_uid,\n                source_id,\n            };\n            let closed_shard_ids = model.close_shards(&source_uid, &closed_shard.shard_ids);\n\n            if !closed_shard_ids.is_empty() {\n                info!(\n                    index_id=%source_uid.index_uid.index_id,\n                    source_id=%source_uid.source_id,\n                    shard_ids=?PrettySample::new(&closed_shard_ids, 5),\n                    \"closed {} shards reported by router\",\n                    closed_shard_ids.len()\n                );\n            }\n        }\n    }\n\n    pub(crate) async fn handle_local_shards_update(\n        &mut self,\n        local_shards_update: LocalShardsUpdate,\n        model: &mut ControlPlaneModel,\n        progress: &Progress,\n    ) -> MetastoreResult<()> {\n        let shard_stats = model.update_shards(\n            &local_shards_update.source_uid,\n            &local_shards_update.shard_infos,\n        );\n        let min_shards = model\n            .index_metadata(&local_shards_update.source_uid.index_uid)\n            .expect(\"index should exist\")\n            .index_config\n            .ingest_settings\n            .min_shards;\n\n        let Some(scaling_mode) = self.scaling_arbiter.should_scale(shard_stats, min_shards) else {\n            return Ok(());\n        };\n        match scaling_mode {\n            ScalingMode::Up(num_shards) => {\n                self.try_scale_up_shards(\n                    local_shards_update.source_uid,\n                    shard_stats,\n                    
model,\n                    progress,\n                    num_shards,\n                )\n                .await?;\n            }\n            ScalingMode::Down => {\n                self.try_scale_down_shards(\n                    local_shards_update.source_uid,\n                    shard_stats,\n                    min_shards,\n                    model,\n                    progress,\n                )\n                .await?;\n            }\n        }\n\n        Ok(())\n    }\n\n    /// Finds the open shards that satisfies the [`GetOrCreateOpenShardsRequest`] request sent by an\n    /// ingest router. First, the control plane checks its internal shard table to find\n    /// candidates. If it does not contain any, the control plane will ask\n    /// the metastore to open new shards.\n    pub(crate) async fn get_or_create_open_shards(\n        &mut self,\n        get_open_shards_request: GetOrCreateOpenShardsRequest,\n        model: &mut ControlPlaneModel,\n        progress: &Progress,\n    ) -> MetastoreResult<GetOrCreateOpenShardsResponse> {\n        // Closing shards is an operation performed by ingesters,\n        // so the control plane is not necessarily aware that they are closed.\n        //\n        // Routers can report closed shards so that we can update our\n        // internal state.\n        self.handle_closed_shards(get_open_shards_request.closed_shards, model);\n\n        let num_subrequests = get_open_shards_request.subrequests.len();\n        let mut get_or_create_open_shards_successes = Vec::with_capacity(num_subrequests);\n        let mut get_or_create_open_shards_failures = Vec::new();\n\n        let mut per_source_num_shards_to_open = HashMap::new();\n\n        let unavailable_leaders: FnvHashSet<NodeId> = get_open_shards_request\n            .unavailable_leaders\n            .into_iter()\n            .map(NodeId::from)\n            .collect();\n\n        // We do a first pass to identify the shards that are missing from the model and need 
to be\n        // created.\n        for get_open_shards_subrequest in &get_open_shards_request.subrequests {\n            if let Ok(None) =\n                get_open_shard_from_model(get_open_shards_subrequest, model, &unavailable_leaders)\n            {\n                // We did not find any open shard in the model, we will have to create one.\n                // Let's keep track of all of the source that require new shards, so we can batch\n                // create them after this loop.\n                let index_uid = model\n                    .index_uid(&get_open_shards_subrequest.index_id)\n                    .expect(\"index should exist\")\n                    .clone();\n                let min_shards = model\n                    .index_metadata(&index_uid)\n                    .expect(\"index should exist\")\n                    .index_config\n                    .ingest_settings\n                    .min_shards\n                    .get();\n                let source_uid = SourceUid {\n                    index_uid,\n                    source_id: get_open_shards_subrequest.source_id.clone(),\n                };\n                per_source_num_shards_to_open.insert(source_uid, min_shards);\n            }\n        }\n\n        if let Err(metastore_error) = self\n            .try_open_shards(\n                per_source_num_shards_to_open,\n                model,\n                &unavailable_leaders,\n                progress,\n            )\n            .await\n        {\n            // We experienced a metastore error. If this is not certain abort, we need\n            // to restart the control plane, to make sure the control plane is not out-of-sync.\n            //\n            if !metastore_error.is_transaction_certainly_aborted() {\n                return Err(metastore_error);\n            } else {\n                // If not, let's just log something.\n                // This is not critical. 
We will just return some failure in the response.\n                error!(error=?metastore_error, \"failed to open shards on the metastore\");\n            }\n        }\n        for get_open_shards_subrequest in get_open_shards_request.subrequests {\n            match get_open_shard_from_model(\n                &get_open_shards_subrequest,\n                model,\n                &unavailable_leaders,\n            ) {\n                Ok(Some(success)) => {\n                    get_or_create_open_shards_successes.push(success);\n                }\n                Ok(None) => {\n                    get_or_create_open_shards_failures.push(\n                        GetOrCreateOpenShardsFailureReason::NoIngestersAvailable\n                            .create_failure(get_open_shards_subrequest),\n                    );\n                }\n                Err(failure_reason) => {\n                    get_or_create_open_shards_failures\n                        .push(failure_reason.create_failure(get_open_shards_subrequest));\n                }\n            }\n        }\n        let response = GetOrCreateOpenShardsResponse {\n            successes: get_or_create_open_shards_successes,\n            failures: get_or_create_open_shards_failures,\n        };\n        Ok(response)\n    }\n\n    /// Allocates and assigns new shards to ingesters.\n    fn allocate_shards(\n        &self,\n        num_shards_to_allocate: usize,\n        unavailable_leaders: &FnvHashSet<NodeId>,\n        model: &ControlPlaneModel,\n    ) -> Option<Vec<(NodeId, Option<NodeId>)>> {\n        // Count of open shards per available ingester node (including ingesters with 0 open\n        // shards).\n        let mut per_node_num_open_shards: HashMap<NodeId, usize> = self\n            .ingester_pool\n            .keys_values()\n            .into_iter()\n            .filter(|(ingester_id, ingester)| {\n                ingester.status.is_ready() && !unavailable_leaders.contains(ingester_id)\n        
    })\n            .map(|(ingester_id, _)| (ingester_id, 0))\n            .collect();\n\n        let num_ingesters = per_node_num_open_shards.len();\n\n        if num_ingesters == 0 {\n            warn!(\"failed to allocate {num_shards_to_allocate} shards: no ingesters available\");\n            return None;\n        }\n\n        if self.replication_factor > num_ingesters {\n            warn!(\n                \"failed to allocate {num_shards_to_allocate} shards: replication factor is \\\n                 greater than the number of available ingesters\"\n            );\n            return None;\n        }\n\n        for shard in model.all_shards() {\n            if shard.is_open() && !unavailable_leaders.contains(&shard.leader_id) {\n                for ingest_node in shard.ingesters() {\n                    if let Some(shard_count) =\n                        per_node_num_open_shards.get_mut(ingest_node.as_str())\n                    {\n                        *shard_count += 1;\n                    } else {\n                        // The shard is not present in the `per_node_num_open_shards` map.\n                        // This is normal. 
It just means an ingester is temporarily unavailable,\n                        // either from the control plane view (not present in the ingester pool\n                        // as a result of information from chitchat), or because it is in the\n                        // unavailable leaders set.\n                    }\n                }\n            }\n        }\n\n        assert!(self.replication_factor == 1 || self.replication_factor == 2);\n        let leader_follower_pairs: Vec<(&NodeIdRef, Option<&NodeIdRef>)> = allocate_shards(\n            &per_node_num_open_shards,\n            num_shards_to_allocate,\n            self.replication_factor == 2,\n        )?;\n        Some(\n            leader_follower_pairs\n                .into_iter()\n                .map(|(leader_id, follower_id)| {\n                    (leader_id.to_owned(), follower_id.map(NodeIdRef::to_owned))\n                })\n                .collect(),\n        )\n    }\n\n    /// Calls init shards on the leaders hosting newly opened shards.\n    async fn init_shards(\n        &self,\n        init_shard_subrequests: Vec<InitShardSubrequest>,\n        progress: &Progress,\n    ) -> InitShardsResponse {\n        let mut successes = Vec::with_capacity(init_shard_subrequests.len());\n        let mut failures = Vec::new();\n\n        let mut per_leader_shards_to_init: HashMap<String, Vec<InitShardSubrequest>> =\n            HashMap::new();\n\n        for init_shard_subrequest in init_shard_subrequests {\n            let leader_id = init_shard_subrequest.shard().leader_id.clone();\n            per_leader_shards_to_init\n                .entry(leader_id)\n                .or_default()\n                .push(init_shard_subrequest);\n        }\n        let mut init_shards_futures = FuturesUnordered::new();\n\n        for (leader_id, subrequests) in per_leader_shards_to_init {\n            let init_shard_failures: Vec<InitShardFailure> = subrequests\n                
.iter()\n                .map(|subrequest| {\n                    let shard = subrequest.shard();\n\n                    InitShardFailure {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: Some(shard.index_uid().clone()),\n                        source_id: shard.source_id.clone(),\n                        shard_id: Some(shard.shard_id().clone()),\n                    }\n                })\n                .collect();\n            let Some(leader) = self.ingester_pool.get(&leader_id) else {\n                warn!(\"failed to init shards: ingester `{leader_id}` is unavailable\");\n                failures.extend(init_shard_failures);\n                continue;\n            };\n            let init_shards_request = InitShardsRequest { subrequests };\n            let init_shards_future = async move {\n                let init_shards_result = tokio::time::timeout(\n                    INIT_SHARDS_REQUEST_TIMEOUT,\n                    leader.client.init_shards(init_shards_request),\n                )\n                .await;\n                (leader_id.clone(), init_shards_result, init_shard_failures)\n            };\n            init_shards_futures.push(init_shards_future);\n        }\n        while let Some((leader_id, init_shards_result, init_shard_failures)) =\n            progress.protect_future(init_shards_futures.next()).await\n        {\n            match init_shards_result {\n                Ok(Ok(init_shards_response)) => {\n                    successes.extend(init_shards_response.successes);\n                    failures.extend(init_shards_response.failures);\n                }\n                Ok(Err(error)) => {\n                    error!(%error, \"failed to init shards on `{leader_id}`\");\n                    failures.extend(init_shard_failures);\n                }\n                Err(_elapsed) => {\n                    error!(\"failed to init shards on `{leader_id}`: request timed out\");\n          
          failures.extend(init_shard_failures);\n                }\n            }\n        }\n        InitShardsResponse {\n            successes,\n            failures,\n        }\n    }\n\n    /// Attempts to increase the number of shards. This operation is rate limited to avoid creating\n    /// too many shards in a short period of time. As a result, this method may not create any\n    /// shard.\n    async fn try_scale_up_shards(\n        &mut self,\n        source_uid: SourceUid,\n        shard_stats: ShardStats,\n        model: &mut ControlPlaneModel,\n        progress: &Progress,\n        num_shards_to_open: usize,\n    ) -> MetastoreResult<()> {\n        if !model\n            .acquire_scaling_permits(&source_uid, ScalingMode::Up(num_shards_to_open))\n            .unwrap_or(false)\n        {\n            return Ok(());\n        }\n        let new_num_open_shards = shard_stats.num_open_shards + num_shards_to_open;\n        let new_shards_per_source: HashMap<SourceUid, usize> =\n            HashMap::from_iter([(source_uid.clone(), num_shards_to_open)]);\n        let successful_source_uids_res = self\n            .try_open_shards(new_shards_per_source, model, &Default::default(), progress)\n            .await;\n\n        match successful_source_uids_res {\n            Ok(successful_source_uids) => {\n                assert!(successful_source_uids.len() <= 1);\n\n                if successful_source_uids.is_empty() {\n                    // We did not manage to create the shard.\n                    // We can release our permit.\n                    model.release_scaling_permits(&source_uid, ScalingMode::Up(num_shards_to_open));\n                    warn!(\n                        index_uid=%source_uid.index_uid,\n                        source_id=%source_uid.source_id,\n                        \"scaling up number of shards to {new_num_open_shards} failed: shard initialization failure\"\n                    );\n                } else {\n                    
info!(\n                        index_id=%source_uid.index_uid.index_id,\n                        source_id=%source_uid.source_id,\n                        \"successfully scaled up number of shards to {new_num_open_shards}\"\n                    );\n                }\n                Ok(())\n            }\n            Err(metastore_error) => {\n                // We did not manage to create the shard.\n                // We can release our permit, but we also need to return the error to the caller, in\n                // order to restart the control plane actor if necessary.\n                warn!(\n                    index_id=%source_uid.index_uid.index_id,\n                    source_id=%source_uid.source_id,\n                    \"scaling up number of shards to {new_num_open_shards} failed: {metastore_error:?}\"\n                );\n                model.release_scaling_permits(&source_uid, ScalingMode::Up(num_shards_to_open));\n                Err(metastore_error)\n            }\n        }\n    }\n\n    /// Attempts to open shards for different sources.\n    /// `per_source_num_shards_to_open` maps each source to the number of shards to open for it.\n    ///\n    /// This function returns, for each source, the number of shards successfully opened.\n    ///\n    /// As long as no metastore error is returned this function leaves the control plane model\n    /// in sync with the metastore.\n    ///\n    /// Also, this function only updates the control plane model and the metastore after\n    /// having successfully initialized a shard (and possibly its replica) on the ingester.\n    ///\n    /// This function can be partially successful: if init_shards was unsuccessful for some shard,\n    /// then the successfully initialized shards will still be recorded in the metastore/control\n    /// plane model.\n    ///\n    /// Sources for which no shard could be opened are absent from the returned map.\n    async fn try_open_shards(\n        &mut self,\n        per_source_num_shards_to_open: HashMap<SourceUid, usize>,\n      
  model: &mut ControlPlaneModel,\n        unavailable_leaders: &FnvHashSet<NodeId>,\n        progress: &Progress,\n    ) -> MetastoreResult<HashMap<SourceUid, usize>> {\n        let total_num_shards_to_open: usize = per_source_num_shards_to_open.values().sum();\n\n        if total_num_shards_to_open == 0 {\n            return Ok(HashMap::new());\n        }\n        // TODO unavailable leaders\n        let Some(leader_follower_pairs) =\n            self.allocate_shards(total_num_shards_to_open, unavailable_leaders, model)\n        else {\n            return Ok(HashMap::new());\n        };\n\n        let source_uids_with_multiplicity = per_source_num_shards_to_open\n            .iter()\n            .flat_map(|(source_uid, count)| std::iter::repeat_n(source_uid, *count));\n\n        let mut init_shard_subrequests: Vec<InitShardSubrequest> = Vec::new();\n\n        for (subrequest_id, (source_uid, (leader_id, follower_id_opt))) in\n            source_uids_with_multiplicity\n                .zip(leader_follower_pairs)\n                .enumerate()\n        {\n            let shard_id = ShardId::from(Ulid::new());\n\n            let index_metadata = model\n                .index_metadata(&source_uid.index_uid)\n                .expect(\"index should exist\");\n            let has_transform = model\n                .source_metadata(source_uid)\n                .expect(\"source should exist\")\n                .transform_config\n                .is_some();\n            let validate_docs =\n                index_metadata.index_config.ingest_settings.validate_docs && !has_transform;\n            let doc_mapping = &index_metadata.index_config.doc_mapping;\n            let doc_mapping_uid = doc_mapping.doc_mapping_uid;\n            let doc_mapping_json = serde_utils::to_json_str(doc_mapping)?;\n\n            let shard = Shard {\n                index_uid: Some(source_uid.index_uid.clone()),\n                source_id: source_uid.source_id.clone(),\n                shard_id: 
Some(shard_id),\n                leader_id: leader_id.to_string(),\n                follower_id: follower_id_opt.as_ref().map(ToString::to_string),\n                shard_state: ShardState::Open as i32,\n                doc_mapping_uid: Some(doc_mapping_uid),\n                publish_position_inclusive: Some(Position::Beginning),\n                publish_token: None,\n                update_timestamp: 0, // assigned later by the metastore\n            };\n            let init_shard_subrequest = InitShardSubrequest {\n                subrequest_id: subrequest_id as u32,\n                shard: Some(shard),\n                doc_mapping_json,\n                validate_docs,\n            };\n            init_shard_subrequests.push(init_shard_subrequest);\n        }\n\n        // Let's first attempt to initialize these shards.\n        let init_shards_response = self.init_shards(init_shard_subrequests, progress).await;\n\n        let open_shard_subrequests = init_shards_response\n            .successes\n            .into_iter()\n            .enumerate()\n            .map(|(subrequest_id, init_shard_success)| {\n                let shard = init_shard_success.shard();\n\n                OpenShardSubrequest {\n                    subrequest_id: subrequest_id as u32,\n                    index_uid: shard.index_uid.clone(),\n                    source_id: shard.source_id.clone(),\n                    shard_id: shard.shard_id.clone(),\n                    leader_id: shard.leader_id.clone(),\n                    follower_id: shard.follower_id.clone(),\n                    doc_mapping_uid: shard.doc_mapping_uid,\n                    // Shards are acquired by the ingest sources\n                    publish_token: None,\n                }\n            })\n            .collect();\n\n        let open_shards_response = progress\n            .protect_future(open_shards_on_metastore_and_model(\n                open_shard_subrequests,\n                &mut self.metastore,\n             
   model,\n            ))\n            .await?;\n\n        let mut per_source_num_opened_shards: HashMap<SourceUid, usize> = HashMap::new();\n\n        for open_shard_subresponse in open_shards_response.subresponses {\n            let source_uid = open_shard_subresponse.open_shard().source_uid();\n            *per_source_num_opened_shards.entry(source_uid).or_default() += 1;\n        }\n\n        Ok(per_source_num_opened_shards)\n    }\n\n    /// Attempts to decrease the number of shards. This operation is rate limited to avoid closing\n    /// shards too aggressively. As a result, this method may not close any shard.\n    async fn try_scale_down_shards(\n        &self,\n        source_uid: SourceUid,\n        shard_stats: ShardStats,\n        min_shards: NonZeroUsize,\n        model: &mut ControlPlaneModel,\n        progress: &Progress,\n    ) -> MetastoreResult<()> {\n        // The scaling arbiter should not suggest scaling down if the number of shards is already\n        // below the minimum, but we're just being defensive here.\n        if shard_stats.num_open_shards <= min_shards.get() {\n            return Ok(());\n        }\n        if !model\n            .acquire_scaling_permits(&source_uid, ScalingMode::Down)\n            .unwrap_or(false)\n        {\n            return Ok(());\n        }\n        let new_num_open_shards = shard_stats.num_open_shards - 1;\n\n        info!(\n            index_id=%source_uid.index_uid.index_id,\n            source_id=%source_uid.source_id,\n            \"scaling down number of shards to {new_num_open_shards}\"\n        );\n        let Some((leader_id, shard_id)) = find_scale_down_candidate(&source_uid, model) else {\n            model.release_scaling_permits(&source_uid, ScalingMode::Down);\n            return Ok(());\n        };\n        info!(\"scaling down shard {shard_id} from {leader_id}\");\n        let Some(ingester) = self.ingester_pool.get(&leader_id) else {\n            model.release_scaling_permits(&source_uid, 
ScalingMode::Down);\n            return Ok(());\n        };\n        let shard_pkeys = vec![ShardPKey {\n            index_uid: Some(source_uid.index_uid.clone()),\n            source_id: source_uid.source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n        }];\n        let close_shards_request = CloseShardsRequest { shard_pkeys };\n\n        if let Err(error) = progress\n            .protect_future(ingester.client.close_shards(close_shards_request))\n            .await\n        {\n            warn!(\"failed to scale down number of shards: {error}\");\n            model.release_scaling_permits(&source_uid, ScalingMode::Down);\n            return Ok(());\n        }\n        model.close_shards(&source_uid, &[shard_id]);\n        Ok(())\n    }\n\n    pub(crate) fn advise_reset_shards(\n        &self,\n        request: AdviseResetShardsRequest,\n        model: &ControlPlaneModel,\n    ) -> AdviseResetShardsResponse {\n        info!(\n            \"received advise reset shards request from `{}`\",\n            request.ingester_id\n        );\n        debug!(shard_ids=?summarize_shard_ids(&request.shard_ids), \"advise reset shards\");\n\n        let mut shards_to_delete: Vec<ShardIds> = Vec::new();\n        let mut shards_to_truncate: Vec<ShardIdPositions> = Vec::new();\n\n        for shard_ids in request.shard_ids {\n            let index_uid = shard_ids.index_uid().clone();\n            let source_id = shard_ids.source_id.clone();\n\n            let source_uid = SourceUid {\n                index_uid,\n                source_id,\n            };\n            let Some(shard_entries) = model.get_shards_for_source(&source_uid) else {\n                // The source no longer exists: we can safely delete all the shards.\n                shards_to_delete.push(shard_ids);\n                continue;\n            };\n            let mut shard_ids_to_delete = Vec::new();\n            let mut shard_positions_to_truncate = Vec::new();\n\n            for shard_id in 
shard_ids.shard_ids {\n                if let Some(shard_entry) = shard_entries.get(&shard_id) {\n                    let publish_position_inclusive = shard_entry.publish_position_inclusive();\n\n                    shard_positions_to_truncate.push(ShardIdPosition {\n                        shard_id: Some(shard_id),\n                        publish_position_inclusive: Some(publish_position_inclusive),\n                    });\n                } else {\n                    shard_ids_to_delete.push(shard_id);\n                }\n            }\n            if !shard_ids_to_delete.is_empty() {\n                shards_to_delete.push(ShardIds {\n                    index_uid: Some(source_uid.index_uid.clone()),\n                    source_id: source_uid.source_id.clone(),\n                    shard_ids: shard_ids_to_delete,\n                });\n            }\n            if !shard_positions_to_truncate.is_empty() {\n                shards_to_truncate.push(ShardIdPositions {\n                    index_uid: Some(source_uid.index_uid),\n                    source_id: source_uid.source_id,\n                    shard_positions: shard_positions_to_truncate,\n                });\n            }\n        }\n        if enabled!(Level::DEBUG) {\n            let shards_to_truncate: Vec<(&str, Position)> = shards_to_truncate\n                .iter()\n                .flat_map(|shard_positions| {\n                    shard_positions\n                        .shard_positions\n                        .iter()\n                        .map(|shard_id_position| {\n                            (\n                                shard_id_position.shard_id().as_str(),\n                                shard_id_position.publish_position_inclusive(),\n                            )\n                        })\n                })\n                .collect();\n            debug!(shard_ids_to_delete=?summarize_shard_ids(&shards_to_delete), shards_to_truncate=?shards_to_truncate, \"advise reset shards 
response\");\n        }\n\n        AdviseResetShardsResponse {\n            shards_to_delete,\n            shards_to_truncate,\n        }\n    }\n\n    /// Rebalances shards from ingesters with too many shards to ingesters with too few shards.\n    /// Moving a shard consists of closing the shard on the source ingester and opening a new\n    /// one on the target ingester.\n    ///\n    /// This method is guarded by a lock to ensure that only one rebalance operation is performed at\n    /// a time.\n    #[instrument(skip_all)]\n    pub(crate) async fn rebalance_shards(\n        &mut self,\n        model: &mut ControlPlaneModel,\n        mailbox: &Mailbox<ControlPlane>,\n        progress: &Progress,\n    ) -> MetastoreResult<usize> {\n        let Ok(rebalance_guard) = self.rebalance_lock.clone().try_lock_owned() else {\n            debug!(\"skipping rebalance: another rebalance is already in progress\");\n            return Ok(0);\n        };\n        self.stats.num_rebalance_shards_ops += 1;\n\n        let shards_to_rebalance: Vec<Shard> = self.compute_shards_to_rebalance(model);\n\n        crate::metrics::CONTROL_PLANE_METRICS\n            .rebalance_shards\n            .set(shards_to_rebalance.len() as i64);\n\n        if shards_to_rebalance.is_empty() {\n            debug!(\"skipping rebalance: no shards to rebalance\");\n            return Ok(0);\n        }\n        let mut per_source_num_shards_to_open: HashMap<SourceUid, usize> = HashMap::new();\n\n        for shard in &shards_to_rebalance {\n            *per_source_num_shards_to_open\n                .entry(shard.source_uid())\n                .or_default() += 1;\n        }\n        let mut per_source_num_opened_shards: HashMap<SourceUid, usize> = self\n            .try_open_shards(\n                per_source_num_shards_to_open,\n                model,\n                &Default::default(),\n                progress,\n            )\n            .await\n            .inspect_err(|error| {\n                
error!(%error, \"failed to open shards during rebalance\");\n                crate::metrics::CONTROL_PLANE_METRICS\n                    .rebalance_shards\n                    .set(0);\n            })?;\n\n        let num_opened_shards: usize = per_source_num_opened_shards.values().sum();\n\n        crate::metrics::CONTROL_PLANE_METRICS\n            .rebalance_shards\n            .set(num_opened_shards as i64);\n\n        for source_uid in per_source_num_opened_shards.keys() {\n            // We temporarily disable the ability to scale down the number of shards for\n            // the source to avoid closing the shards we just opened.\n            model.drain_scaling_permits(source_uid, ScalingMode::Down);\n        }\n\n        // Close as many shards as we opened. Because `try_open_shards` might fail partially, we\n        // must only close the shards that we successfully opened.\n        let mut shards_to_close = Vec::with_capacity(shards_to_rebalance.len());\n\n        for shard in shards_to_rebalance {\n            let source_uid = shard.source_uid();\n            let Some(num_open_shards) = per_source_num_opened_shards.get_mut(&source_uid) else {\n                continue;\n            };\n            if *num_open_shards == 0 {\n                continue;\n            };\n            *num_open_shards -= 1;\n            shards_to_close.push(shard);\n        }\n        let close_shards_fut = self.close_shards(shards_to_close);\n        let mailbox_clone = mailbox.clone();\n\n        let close_shards_and_send_callback_fut = async move {\n            // We wait for a few seconds before closing the shards to give the ingesters some time\n            // to learn about the ones we just opened via gossip.\n            tokio::time::sleep(CLOSE_SHARDS_UPON_REBALANCE_DELAY).await;\n\n            let closed_shards = close_shards_fut.await;\n\n            if closed_shards.is_empty() {\n                return;\n            }\n            let callback = 
RebalanceShardsCallback {\n                closed_shards,\n                rebalance_guard,\n            };\n            let _ = mailbox_clone.send_message(callback).await;\n        };\n        tokio::spawn(close_shards_and_send_callback_fut);\n\n        if num_opened_shards > 0 {\n            info!(\"rebalance opened {num_opened_shards} new shards\");\n        }\n        Ok(num_opened_shards)\n    }\n\n    /// Computes shards that need to be rebalanced.\n    ///\n    /// This function identifies which shards should be moved to achieve a balance across available\n    /// ingesters.\n    /// It does not mutate any state. It just identifies the list of shards\n    /// that need to be rebalanced.\n    ///\n    /// Unfortunately, we cannot move shards that are on unavailable ingesters.\n    /// The closing operation can only be done by the leader of that shard.\n    /// For this reason, we exclude these shards from the rebalance process.\n    fn compute_shards_to_rebalance(&self, model: &ControlPlaneModel) -> Vec<Shard> {\n        let mut per_ready_ingester_shards: HashMap<NodeId, Vec<&Shard>> = HashMap::new();\n        let mut retiring_ingesters: HashSet<NodeId> = HashSet::new();\n\n        for (ingester_id, ingester) in self.ingester_pool.keys_values() {\n            if ingester.status.is_ready() {\n                per_ready_ingester_shards.insert(ingester_id, Vec::new());\n            } else if ingester.status == IngesterStatus::Retiring {\n                retiring_ingesters.insert(ingester_id);\n            }\n        }\n\n        let mut shards_to_rebalance: Vec<Shard> = Vec::new();\n        let mut num_ready_shards: usize = 0;\n\n        for shard in model.all_shards() {\n            if !shard.is_open() {\n                continue;\n            }\n            let leader_id_ref = NodeIdRef::from_str(&shard.leader_id);\n\n            if let Some(shards) = per_ready_ingester_shards.get_mut(leader_id_ref) {\n                // Shards on ready ingesters participate 
in the balancing logic.\n                num_ready_shards += 1;\n                shards.push(&shard.shard);\n            } else if retiring_ingesters.contains(leader_id_ref) {\n                // All open shards on retiring ingesters must be rebalanced.\n                shards_to_rebalance.push(shard.shard.clone());\n            }\n        }\n\n        let num_retiring_shards = shards_to_rebalance.len();\n        let num_ready_ingesters = per_ready_ingester_shards.len();\n\n        let mut rng = rng();\n        let mut per_leader_open_shards_shuffled: Vec<Vec<&Shard>> = per_ready_ingester_shards\n            .into_values()\n            .map(|mut shards| {\n                shards.shuffle(&mut rng);\n                shards\n            })\n            .collect();\n\n        // This is conceptually a `loop`, but since we know it should exit within\n        // `num_ready_shards` iterations, we defensively use a for-loop.\n        for _ in 0..num_ready_shards {\n            let MinMaxResult::MinMax(min_shards, max_shards) = per_leader_open_shards_shuffled\n                .iter_mut()\n                .minmax_by_key(|shards| shards.len())\n            else {\n                // There are fewer than 2 ingesters.\n                // Nothing to do here.\n                break;\n            };\n\n            // We leave a tolerance of 1/10 (at least 2 shards) between the min and max number of\n            // shards per leader.\n            const TOLERANCE_INV_RATIO: usize = 10;\n            if max_shards.len()\n                < min_shards.len() + min_shards.len().div_ceil(TOLERANCE_INV_RATIO).max(2)\n            {\n                break;\n            }\n\n            let shard = max_shards.pop().expect(\"shards should not be empty\");\n            shards_to_rebalance.push(shard.clone());\n            min_shards.push(shard);\n        }\n\n        if shards_to_rebalance.is_empty() {\n            debug!(\"no shards to rebalance\");\n        } else {\n            info!(\n                num_ready_shards,\n                
num_ready_ingesters,\n                num_retiring_shards,\n                num_shards_to_rebalance = shards_to_rebalance.len(),\n                \"rebalancing shards\"\n            );\n        }\n        shards_to_rebalance\n    }\n\n    /// Attempts to close the list of shards passed as argument.\n    ///\n    /// If ingesters are not available, the shards are not closed.\n    fn close_shards(\n        &self,\n        shards_to_close: Vec<Shard>,\n    ) -> impl Future<Output = Vec<ShardPKey>> + Send + 'static {\n        let mut per_leader_shards_to_close: HashMap<LeaderId, Vec<ShardPKey>> = HashMap::new();\n\n        for shard in shards_to_close {\n            let shard_pkey = ShardPKey {\n                index_uid: shard.index_uid,\n                source_id: shard.source_id,\n                shard_id: shard.shard_id,\n            };\n            let leader_id = NodeId::from(shard.leader_id);\n            per_leader_shards_to_close\n                .entry(leader_id)\n                .or_default()\n                .push(shard_pkey);\n        }\n        let mut close_shards_futures = FuturesUnordered::new();\n\n        for (leader_id, shard_pkeys) in per_leader_shards_to_close {\n            let Some(ingester) = self.ingester_pool.get(&leader_id) else {\n                warn!(\"failed to close shards: ingester `{leader_id}` is unavailable\");\n                continue;\n            };\n            let shards_to_close_request = CloseShardsRequest { shard_pkeys };\n            let close_shards_future = async move {\n                tokio::time::timeout(\n                    CLOSE_SHARDS_REQUEST_TIMEOUT,\n                    ingester.client.close_shards(shards_to_close_request),\n                )\n                .await\n            };\n            close_shards_futures.push(close_shards_future);\n        }\n        async move {\n            let mut closed_shards = Vec::new();\n\n            while let Some(close_shards_result) = close_shards_futures.next().await {\n  
              match close_shards_result {\n                    Ok(Ok(CloseShardsResponse { successes })) => {\n                        closed_shards.extend(successes);\n                    }\n                    Ok(Err(error)) => {\n                        error!(%error, \"failed to close shards\");\n                    }\n                    Err(_elapsed) => {\n                        error!(\"close shards request timed out\");\n                    }\n                }\n            }\n            closed_shards\n        }\n    }\n}\n\nfn summarize_shard_ids(shard_ids: &[ShardIds]) -> Vec<&str> {\n    shard_ids\n        .iter()\n        .flat_map(|source_shard_ids| {\n            source_shard_ids\n                .shard_ids\n                .iter()\n                .map(|shard_id| shard_id.as_str())\n        })\n        .collect()\n}\n\n/// When rebalancing shards, shards to move are closed some time after new shards are opened.\n/// Because we don't want to stall the control plane event loop while waiting for the close shards\n/// requests to complete, we use a callback to handle the results of those close shards requests.\n#[derive(Debug)]\npub(crate) struct RebalanceShardsCallback {\n    pub closed_shards: Vec<ShardPKey>,\n    pub rebalance_guard: OwnedMutexGuard<()>,\n}\n\n/// Finds a shard on the ingester with the highest number of open\n/// shards for this source.\n///\n/// If multiple shards are hosted on that ingester, one of them is chosen at random.\nfn find_scale_down_candidate(\n    source_uid: &SourceUid,\n    model: &ControlPlaneModel,\n) -> Option<(NodeId, ShardId)> {\n    let mut per_leader_shard_entries: HashMap<&String, Vec<&ShardEntry>> = HashMap::new();\n    let mut rng = rng();\n\n    for shard in model.get_shards_for_source(source_uid)?.values() {\n        if shard.is_open() {\n            per_leader_shard_entries\n                .entry(&shard.leader_id)\n                .or_default()\n                .push(shard);\n 
       }\n    }\n    per_leader_shard_entries\n        .into_iter()\n        // We use a random number to break ties... The HashMap is randomly seeded so this\n        // should not make much difference, but we might want to be as explicit as possible.\n        .max_by_key(|(_leader_id, shard_entries)| (shard_entries.len(), rng.next_u32()))\n        .map(|(leader_id, shard_entries)| {\n            (\n                leader_id.clone().into(),\n                shard_entries.choose(&mut rng).unwrap().shard_id().clone(),\n            )\n        })\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::BTreeSet;\n    use std::str::FromStr;\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n\n    use itertools::Itertools;\n    use quickwit_actors::Universe;\n    use quickwit_common::setup_logging_for_tests;\n    use quickwit_common::shared_consts::DEFAULT_SHARD_THROUGHPUT_LIMIT;\n    use quickwit_common::tower::DelayLayer;\n    use quickwit_config::{DocMapping, INGEST_V2_SOURCE_ID, SourceConfig};\n    use quickwit_ingest::{IngesterPoolEntry, RateMibPerSec, ShardInfo};\n    use quickwit_metastore::IndexMetadata;\n    use quickwit_proto::control_plane::GetOrCreateOpenShardsSubrequest;\n    use quickwit_proto::ingest::ingester::{\n        CloseShardsResponse, IngesterServiceClient, IngesterStatus, InitShardSuccess,\n        InitShardsResponse, MockIngesterService, RetainShardsResponse,\n    };\n    use quickwit_proto::ingest::{IngestV2Error, Shard, ShardState};\n    use quickwit_proto::metastore::{\n        self, MetastoreError, MockMetastoreService, OpenShardSubresponse,\n    };\n    use quickwit_proto::types::{DocMappingUid, Position, SourceId};\n\n    use super::*;\n\n    const TEST_SHARD_THROUGHPUT_LIMIT_MIB: f32 =\n        DEFAULT_SHARD_THROUGHPUT_LIMIT.as_u64() as f32 / quickwit_common::shared_consts::MIB as f32;\n\n    #[tokio::test]\n    async fn test_ingest_controller_get_or_create_open_shards() {\n        let source_id: &'static 
str = \"test-source\";\n\n        let index_id_0 = \"test-index-0\";\n        let mut index_metadata_0 =\n            IndexMetadata::for_test(index_id_0, \"ram://indexes/test-index-0\");\n        let index_uid_0 = index_metadata_0.index_uid.clone();\n\n        let doc_mapping_uid_0 = DocMappingUid::random();\n        index_metadata_0.index_config.doc_mapping.doc_mapping_uid = doc_mapping_uid_0;\n\n        let index_id_1 = \"test-index-1\";\n        let mut index_metadata_1 =\n            IndexMetadata::for_test(index_id_1, \"ram://indexes/test-index-1\");\n        let index_uid_1 = index_metadata_1.index_uid.clone();\n\n        let doc_mapping_uid_1 = DocMappingUid::random();\n        index_metadata_1.index_config.doc_mapping.doc_mapping_uid = doc_mapping_uid_1;\n\n        let progress = Progress::default();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_open_shards().once().returning({\n            let index_uid_1 = index_uid_1.clone();\n\n            move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n                assert_eq!(request.subrequests[0].index_uid(), &index_uid_1);\n                assert_eq!(request.subrequests[0].source_id, source_id);\n                assert_eq!(request.subrequests[0].doc_mapping_uid(), doc_mapping_uid_1);\n\n                let subresponses = vec![metastore::OpenShardSubresponse {\n                    subrequest_id: 1,\n                    open_shard: Some(Shard {\n                        index_uid: index_uid_1.clone().into(),\n                        source_id: source_id.to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: \"test-ingester-2\".to_string(),\n                        doc_mapping_uid: Some(doc_mapping_uid_1),\n                        ..Default::default()\n                    }),\n                }];\n                let response 
= metastore::OpenShardsResponse { subresponses };\n                Ok(response)\n            }\n        });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n\n        let mock_ingester = MockIngesterService::new();\n        let client = IngesterServiceClient::from_mock(mock_ingester);\n\n        let ingester_pool = IngesterPool::default();\n        ingester_pool.insert(\n            NodeId::from(\"test-ingester-1\"),\n            IngesterPoolEntry::ready_with_client(client.clone()),\n        );\n\n        let mut mock_ingester = MockIngesterService::new();\n        let index_uid_1_clone = index_uid_1.clone();\n        mock_ingester\n            .expect_init_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n\n                let shard = subrequest.shard();\n                assert_eq!(shard.index_uid(), &index_uid_1_clone);\n                assert_eq!(shard.source_id, source_id);\n                assert_eq!(shard.leader_id, \"test-ingester-2\");\n\n                let successes = vec![InitShardSuccess {\n                    subrequest_id: request.subrequests[0].subrequest_id,\n                    shard: Some(shard.clone()),\n                }];\n                let response = InitShardsResponse {\n                    successes,\n                    failures: Vec::new(),\n                };\n                Ok(response)\n            });\n        let ingester = IngesterServiceClient::from_mock(mock_ingester);\n        ingester_pool.insert(\n            NodeId::from(\"test-ingester-2\"),\n            IngesterPoolEntry::ready_with_client(ingester.clone()),\n        );\n\n        let replication_factor = 2;\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n         
   1.001,\n        );\n\n        let mut model = ControlPlaneModel::default();\n        model.add_index(index_metadata_0.clone());\n        model.add_index(index_metadata_1.clone());\n\n        let mut source_config = SourceConfig::ingest_v2();\n        source_config.source_id = source_id.to_string();\n\n        model\n            .add_source(&index_uid_0, source_config.clone())\n            .unwrap();\n        model.add_source(&index_uid_1, source_config).unwrap();\n\n        let shards = vec![\n            Shard {\n                index_uid: index_uid_0.clone().into(),\n                source_id: source_id.to_string(),\n                shard_id: Some(ShardId::from(1)),\n                leader_id: \"test-ingester-0\".to_string(),\n                shard_state: ShardState::Open as i32,\n                doc_mapping_uid: Some(doc_mapping_uid_0),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: index_uid_0.clone().into(),\n                source_id: source_id.to_string(),\n                shard_id: Some(ShardId::from(2)),\n                leader_id: \"test-ingester-1\".to_string(),\n                shard_state: ShardState::Open as i32,\n                doc_mapping_uid: Some(doc_mapping_uid_0),\n                ..Default::default()\n            },\n        ];\n\n        model.insert_shards(&index_uid_0, &source_id.into(), shards);\n\n        let request = GetOrCreateOpenShardsRequest {\n            subrequests: Vec::new(),\n            closed_shards: Vec::new(),\n            unavailable_leaders: Vec::new(),\n        };\n        let response = controller\n            .get_or_create_open_shards(request, &mut model, &progress)\n            .await\n            .unwrap();\n\n        assert_eq!(response.successes.len(), 0);\n        assert_eq!(response.failures.len(), 0);\n\n        let subrequests = vec![\n            GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 0,\n                index_id: 
\"test-index-0\".to_string(),\n                source_id: source_id.to_string(),\n            },\n            GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 1,\n                index_id: \"test-index-1\".to_string(),\n                source_id: source_id.to_string(),\n            },\n            GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 2,\n                index_id: \"index-not-found\".to_string(),\n                source_id: \"source-not-found\".to_string(),\n            },\n            GetOrCreateOpenShardsSubrequest {\n                subrequest_id: 3,\n                index_id: \"test-index-0\".to_string(),\n                source_id: \"source-not-found\".to_string(),\n            },\n        ];\n        let closed_shards = Vec::new();\n        let unavailable_leaders = vec![\"test-ingester-0\".to_string()];\n        let request = GetOrCreateOpenShardsRequest {\n            subrequests,\n            closed_shards,\n            unavailable_leaders,\n        };\n        let response = controller\n            .get_or_create_open_shards(request, &mut model, &progress)\n            .await\n            .unwrap();\n\n        assert_eq!(response.successes.len(), 2);\n        assert_eq!(response.failures.len(), 2);\n\n        let success = &response.successes[0];\n        assert_eq!(success.subrequest_id, 0);\n        assert_eq!(success.index_uid(), &index_uid_0);\n        assert_eq!(success.source_id, source_id);\n        assert_eq!(success.open_shards.len(), 1);\n        assert_eq!(success.open_shards[0].shard_id(), ShardId::from(2));\n        assert_eq!(success.open_shards[0].leader_id, \"test-ingester-1\");\n        assert_eq!(success.open_shards[0].doc_mapping_uid(), doc_mapping_uid_0);\n\n        let success = &response.successes[1];\n        assert_eq!(success.subrequest_id, 1);\n        assert_eq!(success.index_uid(), &index_uid_1);\n        assert_eq!(success.source_id, source_id);\n        
assert_eq!(success.open_shards.len(), 1);\n        assert_eq!(success.open_shards[0].shard_id(), ShardId::from(1));\n        assert_eq!(success.open_shards[0].leader_id, \"test-ingester-2\");\n        assert_eq!(success.open_shards[0].doc_mapping_uid(), doc_mapping_uid_1);\n\n        let failure = &response.failures[0];\n        assert_eq!(failure.subrequest_id, 2);\n        assert_eq!(failure.index_id, \"index-not-found\");\n        assert_eq!(failure.source_id, \"source-not-found\");\n        assert_eq!(\n            failure.reason(),\n            GetOrCreateOpenShardsFailureReason::IndexNotFound\n        );\n\n        let failure = &response.failures[1];\n        assert_eq!(failure.subrequest_id, 3);\n        assert_eq!(failure.index_id, index_id_0);\n        assert_eq!(failure.source_id, \"source-not-found\");\n        assert_eq!(\n            failure.reason(),\n            GetOrCreateOpenShardsFailureReason::SourceNotFound\n        );\n\n        assert_eq!(model.num_shards(), 3);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_get_or_create_open_shards_metastore_failure() {\n        let source_id: &'static str = \"test-source\";\n\n        let index_id_0 = \"test-index-0\";\n        let index_metadata_0 = IndexMetadata::for_test(index_id_0, \"ram://indexes/test-index-0\");\n        let index_uid_0 = index_metadata_0.index_uid.clone();\n        let index_uid_0_clone = index_uid_0.clone();\n\n        let progress = Progress::default();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_open_shards()\n            .once()\n            .returning(move |_| {\n                Err(MetastoreError::Internal {\n                    message: \"this error could be mean anything. 
transaction success or failure!\"\n                        .to_string(),\n                    cause: \"\".to_string(),\n                })\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_init_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n\n                let shard = subrequest.shard();\n                assert_eq!(shard.index_uid(), &index_uid_0);\n                assert_eq!(shard.source_id, source_id);\n                assert_eq!(shard.leader_id, \"test-ingester-1\");\n\n                let successes = vec![InitShardSuccess {\n                    subrequest_id: request.subrequests[0].subrequest_id,\n                    shard: Some(shard.clone()),\n                }];\n                let response = InitShardsResponse {\n                    successes,\n                    failures: Vec::new(),\n                };\n                Ok(response)\n            });\n        let client = IngesterServiceClient::from_mock(mock_ingester);\n\n        let ingester_pool = IngesterPool::default();\n        ingester_pool.insert(\n            NodeId::from(\"test-ingester-1\"),\n            IngesterPoolEntry::ready_with_client(client.clone()),\n        );\n\n        let replication_factor = 1;\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool,\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let mut model = ControlPlaneModel::default();\n        model.add_index(index_metadata_0.clone());\n\n        let mut source_config = SourceConfig::ingest_v2();\n        source_config.source_id = source_id.to_string();\n\n        model\n            
.add_source(&index_uid_0_clone, source_config.clone())\n            .unwrap();\n\n        let subrequests = vec![GetOrCreateOpenShardsSubrequest {\n            subrequest_id: 0,\n            index_id: \"test-index-0\".to_string(),\n            source_id: source_id.to_string(),\n        }];\n        let request = GetOrCreateOpenShardsRequest {\n            subrequests,\n            closed_shards: Vec::new(),\n            unavailable_leaders: Vec::new(),\n        };\n\n        let metastore_error = controller\n            .get_or_create_open_shards(request, &mut model, &progress)\n            .await\n            .unwrap_err();\n\n        assert!(!metastore_error.is_transaction_certainly_aborted());\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_get_open_shards_handles_closed_shards() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 2;\n\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool,\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n        let mut model = ControlPlaneModel::default();\n\n        let index_uid = IndexUid::for_test(\"test-index-0\", 0);\n        let source_id: SourceId = \"test-source\".to_string();\n\n        let shards = vec![Shard {\n            shard_id: Some(ShardId::from(1)),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            leader_id: \"test-ingester-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        let request = GetOrCreateOpenShardsRequest {\n            subrequests: Vec::new(),\n            closed_shards: vec![ShardIds {\n                index_uid: index_uid.clone().into(),\n                source_id: 
source_id.clone(),\n                shard_ids: vec![ShardId::from(1), ShardId::from(2)],\n            }],\n            unavailable_leaders: Vec::new(),\n        };\n        let progress = Progress::default();\n\n        controller\n            .get_or_create_open_shards(request, &mut model, &progress)\n            .await\n            .unwrap();\n\n        let shard_1 = model\n            .all_shards()\n            .find(|shard| shard.shard_id() == ShardId::from(1))\n            .unwrap();\n        assert!(shard_1.is_closed());\n    }\n\n    #[test]\n    fn test_ingest_controller_allocate_shards() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 2;\n\n        let controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let mut model = ControlPlaneModel::default();\n\n        let leader_follower_pairs_opt =\n            controller.allocate_shards(0, &FnvHashSet::default(), &model);\n        assert!(leader_follower_pairs_opt.is_none());\n\n        ingester_pool.insert(\n            NodeId::from(\"test-ingester-1\"),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::mocked()),\n        );\n\n        let leader_follower_pairs_opt =\n            controller.allocate_shards(0, &FnvHashSet::default(), &model);\n\n        // We have only one node so with a replication factor of 2, we can't\n        // find any solution.\n        assert!(leader_follower_pairs_opt.is_none());\n\n        ingester_pool.insert(\n            \"test-ingester-2\".into(),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::mocked()),\n        );\n\n        let leader_follower_pairs = controller\n            .allocate_shards(0, &FnvHashSet::default(), &model)\n            .unwrap();\n\n        // We 
tried to allocate 0 shards, so an empty vec makes sense.\n        assert!(leader_follower_pairs.is_empty());\n\n        let leader_follower_pairs = controller\n            .allocate_shards(1, &FnvHashSet::default(), &model)\n            .unwrap();\n\n        assert_eq!(leader_follower_pairs.len(), 1);\n\n        // The leader is picked at random: both ingesters have the same number of shards.\n        if leader_follower_pairs[0].0 == \"test-ingester-1\" {\n            assert_eq!(\n                leader_follower_pairs[0].1,\n                Some(NodeId::from(\"test-ingester-2\"))\n            );\n        } else {\n            assert_eq!(leader_follower_pairs[0].0, \"test-ingester-2\");\n            assert_eq!(\n                leader_follower_pairs[0].1,\n                Some(NodeId::from(\"test-ingester-1\"))\n            );\n        }\n\n        let leader_follower_pairs = controller\n            .allocate_shards(2, &FnvHashSet::default(), &model)\n            .unwrap();\n        assert_eq!(leader_follower_pairs.len(), 2);\n\n        for leader_follower_pair in leader_follower_pairs {\n            if leader_follower_pair.0 == \"test-ingester-1\" {\n                assert_eq!(\n                    leader_follower_pair.1,\n                    Some(NodeId::from(\"test-ingester-2\"))\n                );\n            } else {\n                assert_eq!(leader_follower_pair.0, \"test-ingester-2\");\n                assert_eq!(\n                    leader_follower_pair.1,\n                    Some(NodeId::from(\"test-ingester-1\"))\n                );\n            }\n        }\n\n        let leader_follower_pairs = controller\n            .allocate_shards(3, &FnvHashSet::default(), &model)\n            .unwrap();\n        assert_eq!(leader_follower_pairs.len(), 3);\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n\n        let source_id: SourceId = \"test-source\".to_string();\n        let open_shards = vec![Shard {\n            index_uid: 
Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"test-ingester-1\".to_string(),\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, open_shards);\n\n        let leader_follower_pairs = controller\n            .allocate_shards(3, &FnvHashSet::default(), &model)\n            .unwrap();\n        assert_eq!(leader_follower_pairs.len(), 3);\n        assert_eq!(leader_follower_pairs[0].0, \"test-ingester-2\");\n        assert_eq!(\n            leader_follower_pairs[0].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        assert_eq!(leader_follower_pairs[1].0, \"test-ingester-2\");\n        assert_eq!(\n            leader_follower_pairs[1].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        assert_eq!(leader_follower_pairs[2].0, \"test-ingester-2\");\n        assert_eq!(\n            leader_follower_pairs[2].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        let open_shards = vec![\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(2)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-1\".to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(3)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-1\".to_string(),\n                ..Default::default()\n            },\n        ];\n        model.insert_shards(&index_uid, &source_id, open_shards);\n\n        let leader_follower_pairs = controller\n            
.allocate_shards(1, &FnvHashSet::default(), &model)\n            .unwrap();\n        assert_eq!(leader_follower_pairs.len(), 1);\n        // Ingester 1 already has two shards, so ingester 2 is picked as leader\n        assert_eq!(leader_follower_pairs[0].0, \"test-ingester-2\");\n        assert_eq!(\n            leader_follower_pairs[0].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        ingester_pool.insert(\n            \"test-ingester-3\".into(),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::mocked()),\n        );\n        let unavailable_leaders = FnvHashSet::from_iter([NodeId::from(\"test-ingester-2\")]);\n        let leader_follower_pairs = controller\n            .allocate_shards(4, &unavailable_leaders, &model)\n            .unwrap();\n        // Ingester 2 is unavailable. Ingester 1 has open shards. Ingester 3 ends up leader.\n        assert_eq!(leader_follower_pairs.len(), 4);\n        assert_eq!(leader_follower_pairs[0].0, \"test-ingester-3\");\n        assert_eq!(\n            leader_follower_pairs[0].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        assert_eq!(leader_follower_pairs[1].0, \"test-ingester-3\");\n        assert_eq!(\n            leader_follower_pairs[1].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        assert_eq!(leader_follower_pairs[2].0, \"test-ingester-3\");\n        assert_eq!(\n            leader_follower_pairs[2].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n\n        assert_eq!(leader_follower_pairs[3].0, \"test-ingester-3\");\n        assert_eq!(\n            leader_follower_pairs[3].1,\n            Some(NodeId::from(\"test-ingester-1\"))\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_init_shards() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n\n        let 
controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let ingester_id_0 = NodeId::from(\"test-ingester-0\");\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_init_shards()\n            .once()\n            .returning(|mut request| {\n                assert_eq!(request.subrequests.len(), 2);\n\n                request\n                    .subrequests\n                    .sort_by_key(|subrequest| subrequest.subrequest_id);\n\n                let subrequest_0 = &request.subrequests[0];\n                assert_eq!(subrequest_0.subrequest_id, 0);\n\n                let shard_0 = request.subrequests[0].shard();\n                assert_eq!(shard_0.index_uid(), &(\"test-index\", 0));\n                assert_eq!(shard_0.source_id, \"test-source\");\n                assert_eq!(shard_0.shard_id(), ShardId::from(0));\n                assert_eq!(shard_0.leader_id, \"test-ingester-0\");\n\n                let subrequest_1 = &request.subrequests[1];\n                assert_eq!(subrequest_1.subrequest_id, 1);\n\n                let shard_1 = request.subrequests[1].shard();\n                assert_eq!(shard_1.index_uid(), &(\"test-index\", 0));\n                assert_eq!(shard_1.source_id, \"test-source\");\n                assert_eq!(shard_1.shard_id(), ShardId::from(1));\n                assert_eq!(shard_1.leader_id, \"test-ingester-0\");\n\n                let successes = vec![InitShardSuccess {\n                    subrequest_id: 0,\n                    shard: Some(shard_0.clone()),\n                }];\n                let failures = vec![InitShardFailure {\n                    subrequest_id: 1,\n                    index_uid: shard_1.index_uid.clone(),\n                    source_id: shard_1.source_id.clone(),\n                    shard_id: 
shard_1.shard_id.clone(),\n                }];\n                let response = InitShardsResponse {\n                    successes,\n                    failures,\n                };\n                Ok(response)\n            });\n        let ingester_0 = IngesterServiceClient::from_mock(mock_ingester_0);\n        ingester_pool.insert(\n            ingester_id_0,\n            IngesterPoolEntry::ready_with_client(ingester_0),\n        );\n\n        let ingester_id_1 = NodeId::from(\"test-ingester-1\");\n        let mut mock_ingester_1 = MockIngesterService::new();\n        mock_ingester_1\n            .expect_init_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.subrequest_id, 2);\n\n                let shard = request.subrequests[0].shard();\n                assert_eq!(shard.index_uid(), &(\"test-index\", 0));\n                assert_eq!(shard.source_id, \"test-source\");\n                assert_eq!(shard.shard_id(), ShardId::from(2));\n                assert_eq!(shard.leader_id, \"test-ingester-1\");\n\n                Err(IngestV2Error::Internal(\"internal error\".to_string()))\n            });\n        let ingester_1 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1));\n        ingester_pool.insert(ingester_id_1, ingester_1);\n\n        let ingester_id_2 = NodeId::from(\"test-ingester-2\");\n        let mut mock_ingester_2 = MockIngesterService::new();\n        mock_ingester_2.expect_init_shards().never();\n\n        let client_2 = IngesterServiceClient::tower()\n            .stack_init_shards_layer(DelayLayer::new(INIT_SHARDS_REQUEST_TIMEOUT * 2))\n            .build_from_mock(mock_ingester_2);\n        ingester_pool.insert(\n            ingester_id_2,\n            IngesterPoolEntry::ready_with_client(client_2),\n        );\n\n        let 
init_shards_response = controller\n            .init_shards(Vec::new(), &Progress::default())\n            .await;\n        assert_eq!(init_shards_response.successes.len(), 0);\n        assert_eq!(init_shards_response.failures.len(), 0);\n\n        // In this test:\n        // - ingester 0 will initialize shard 0 successfully and fail to initialize shard 1;\n        // - ingester 1 will return an error;\n        // - ingester 2 will time out;\n        // - ingester 3 will be unavailable.\n\n        let init_shard_subrequests: Vec<InitShardSubrequest> = vec![\n            InitShardSubrequest {\n                subrequest_id: 0,\n                shard: Some(Shard {\n                    index_uid: IndexUid::for_test(\"test-index\", 0).into(),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(0)),\n                    leader_id: \"test-ingester-0\".to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                }),\n                doc_mapping_json: \"{}\".to_string(),\n                validate_docs: false,\n            },\n            InitShardSubrequest {\n                subrequest_id: 1,\n                shard: Some(Shard {\n                    index_uid: IndexUid::for_test(\"test-index\", 0).into(),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    leader_id: \"test-ingester-0\".to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                }),\n                doc_mapping_json: \"{}\".to_string(),\n                validate_docs: false,\n            },\n            InitShardSubrequest {\n                subrequest_id: 2,\n                shard: Some(Shard {\n                    index_uid: IndexUid::for_test(\"test-index\", 0).into(),\n                    source_id: 
\"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(2)),\n                    leader_id: \"test-ingester-1\".to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                }),\n                doc_mapping_json: \"{}\".to_string(),\n                validate_docs: false,\n            },\n            InitShardSubrequest {\n                subrequest_id: 3,\n                shard: Some(Shard {\n                    index_uid: IndexUid::for_test(\"test-index\", 0).into(),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(3)),\n                    leader_id: \"test-ingester-2\".to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                }),\n                doc_mapping_json: \"{}\".to_string(),\n                validate_docs: false,\n            },\n            InitShardSubrequest {\n                subrequest_id: 4,\n                shard: Some(Shard {\n                    index_uid: IndexUid::for_test(\"test-index\", 0).into(),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(4)),\n                    leader_id: \"test-ingester-3\".to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                }),\n                doc_mapping_json: \"{}\".to_string(),\n                validate_docs: false,\n            },\n        ];\n        let init_shards_response = controller\n            .init_shards(init_shard_subrequests, &Progress::default())\n            .await;\n        assert_eq!(init_shards_response.successes.len(), 1);\n        assert_eq!(init_shards_response.failures.len(), 4);\n\n        let success = &init_shards_response.successes[0];\n        assert_eq!(success.subrequest_id, 0);\n\n        let mut failures = 
init_shards_response.failures;\n        failures.sort_by_key(|failure| failure.subrequest_id);\n\n        assert_eq!(failures[0].subrequest_id, 1);\n        assert_eq!(failures[1].subrequest_id, 2);\n        assert_eq!(failures[2].subrequest_id, 3);\n        assert_eq!(failures[3].subrequest_id, 4);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_try_open_shards() {\n        let doc_mapping_uid = DocMappingUid::random();\n        let expected_doc_mapping = doc_mapping_uid;\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_open_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.subrequest_id, 0);\n\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(subrequest.leader_id, \"test-ingester-1\");\n                assert_eq!(subrequest.doc_mapping_uid(), expected_doc_mapping);\n\n                let subresponses = vec![metastore::OpenShardSubresponse {\n                    subrequest_id: 0,\n                    open_shard: Some(Shard {\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_id: Some(ShardId::from(0)),\n                        leader_id: \"test-ingester-1\".to_string(),\n                        shard_state: ShardState::Open as i32,\n                        doc_mapping_uid: Some(expected_doc_mapping),\n                        ..Default::default()\n                    }),\n                }];\n                let response = metastore::OpenShardsResponse { subresponses };\n                Ok(response)\n            });\n        let metastore = 
MetastoreServiceClient::from_mock(mock_metastore);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://indexes/test-index\");\n        index_metadata.sources.insert(\n            source_id.clone(),\n            SourceConfig::for_test(&source_id, quickwit_config::SourceParams::void()),\n        );\n\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"field_mappings\": [{{\n                        \"name\": \"message\",\n                        \"type\": \"text\"\n                }}]\n            }}\"#\n        );\n        let doc_mapping: DocMapping = serde_json::from_str(&doc_mapping_json).unwrap();\n        let expected_doc_mapping = doc_mapping.clone();\n        index_metadata.index_config.doc_mapping = doc_mapping;\n\n        let mut model = ControlPlaneModel::default();\n        model.add_index(index_metadata);\n\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_init_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.subrequest_id, 0);\n\n                let doc_mapping: DocMapping =\n                    
serde_json::from_str(&subrequest.doc_mapping_json).unwrap();\n                assert_eq!(doc_mapping, expected_doc_mapping);\n\n                let shard = request.subrequests[0].shard();\n                assert_eq!(shard.index_uid(), &(\"test-index\", 0));\n                assert_eq!(shard.source_id, \"test-source\");\n                assert_eq!(shard.leader_id, \"test-ingester-1\");\n                assert_eq!(shard.doc_mapping_uid(), doc_mapping_uid);\n\n                let successes = vec![InitShardSuccess {\n                    subrequest_id: 0,\n                    shard: Some(shard.clone()),\n                }];\n                let response = InitShardsResponse {\n                    successes,\n                    failures: Vec::new(),\n                };\n                Ok(response)\n            });\n\n        ingester_pool.insert(\n            NodeId::from(\"test-ingester-1\"),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester)),\n        );\n        let source_uids: HashMap<SourceUid, usize> = HashMap::from_iter([(source_uid.clone(), 1)]);\n        let unavailable_leaders = FnvHashSet::default();\n        let progress = Progress::default();\n\n        let per_source_num_opened_shards = controller\n            .try_open_shards(source_uids, &mut model, &unavailable_leaders, &progress)\n            .await\n            .unwrap();\n\n        assert_eq!(per_source_num_opened_shards.len(), 1);\n        assert_eq!(*per_source_num_opened_shards.get(&source_uid).unwrap(), 1);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_handle_local_shards_update() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_open_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.subrequests.len(), 1);\n                let subrequest = &request.subrequests[0];\n\n                assert_eq!(subrequest.index_uid(), 
&IndexUid::for_test(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(subrequest.leader_id, \"test-ingester\");\n\n                Err(MetastoreError::InvalidArgument {\n                    message: \"failed to open shards\".to_string(),\n                })\n            });\n        mock_metastore\n            .expect_open_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.subrequests.len(), 1);\n                let subrequest: &OpenShardSubrequest = &request.subrequests[0];\n\n                assert_eq!(subrequest.index_uid(), &IndexUid::for_test(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(subrequest.leader_id, \"test-ingester\");\n\n                let shard = Shard {\n                    index_uid: subrequest.index_uid.clone(),\n                    source_id: subrequest.source_id.clone(),\n                    shard_id: subrequest.shard_id.clone(),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: subrequest.leader_id.clone(),\n                    follower_id: subrequest.follower_id.clone(),\n                    doc_mapping_uid: subrequest.doc_mapping_uid,\n                    publish_position_inclusive: Some(Position::Beginning),\n                    publish_token: None,\n                    update_timestamp: 1724158996,\n                };\n                let response = OpenShardsResponse {\n                    subresponses: vec![OpenShardSubresponse {\n                        subrequest_id: subrequest.subrequest_id,\n                        open_shard: Some(shard),\n                    }],\n                };\n                Ok(response)\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n\n        
let mut controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://indexes/test-index\");\n        let source_id: SourceId = \"test-source\".to_string();\n        index_metadata.sources.insert(\n            source_id.clone(),\n            SourceConfig::for_test(&source_id, quickwit_config::SourceParams::void()),\n        );\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let mut model = ControlPlaneModel::default();\n        model.add_index(index_metadata);\n        let progress = Progress::default();\n\n        let shards = vec![Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-ingester\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, shards);\n        let shard_entries: Vec<ShardEntry> = model.all_shards().cloned().collect();\n\n        assert_eq!(shard_entries.len(), 1);\n        assert_eq!(shard_entries[0].short_term_ingestion_rate, 0);\n\n        // Test update shard ingestion rate but no scale down because num open shards is 1.\n        let shard_infos = BTreeSet::from_iter([ShardInfo {\n            shard_id: ShardId::from(1),\n            shard_state: ShardState::Open,\n            short_term_ingestion_rate: RateMibPerSec(1),\n            long_term_ingestion_rate: RateMibPerSec(1),\n        }]);\n        let local_shards_update = LocalShardsUpdate {\n            leader_id: \"test-ingester\".into(),\n            source_uid: 
source_uid.clone(),\n            shard_infos,\n        };\n\n        controller\n            .handle_local_shards_update(local_shards_update, &mut model, &progress)\n            .await\n            .unwrap();\n\n        let shard_entries: Vec<ShardEntry> = model.all_shards().cloned().collect();\n        assert_eq!(shard_entries.len(), 1);\n        assert_eq!(shard_entries[0].short_term_ingestion_rate, 1);\n\n        // Test update shard ingestion rate with failing scale down.\n        let shards = vec![Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"test-ingester\".to_string(),\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        let shard_entries: Vec<ShardEntry> = model.all_shards().cloned().collect();\n        assert_eq!(shard_entries.len(), 2);\n\n        let mut mock_ingester = MockIngesterService::new();\n\n        let index_uid_clone = index_uid.clone();\n        mock_ingester.expect_init_shards().returning(\n            move |init_shard_request: InitShardsRequest| {\n                assert_eq!(init_shard_request.subrequests.len(), 1);\n                let init_shard_subrequest: &InitShardSubrequest =\n                    &init_shard_request.subrequests[0];\n                assert!(init_shard_subrequest.validate_docs);\n                Ok(InitShardsResponse {\n                    successes: vec![InitShardSuccess {\n                        subrequest_id: init_shard_subrequest.subrequest_id,\n                        shard: init_shard_subrequest.shard.clone(),\n                    }],\n                    failures: Vec::new(),\n                })\n            },\n        );\n        mock_ingester\n            .expect_close_shards()\n            .returning(move |request| {\n                
assert_eq!(request.shard_pkeys.len(), 1);\n                assert_eq!(request.shard_pkeys[0].index_uid(), &index_uid_clone);\n                assert_eq!(request.shard_pkeys[0].source_id, \"test-source\");\n                Err(IngestV2Error::Internal(\n                    \"failed to close shards\".to_string(),\n                ))\n            });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(\"test-ingester\".into(), ingester);\n\n        let shard_infos = BTreeSet::from_iter([\n            ShardInfo {\n                shard_id: ShardId::from(1),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(1),\n                long_term_ingestion_rate: RateMibPerSec(1),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(2),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(1),\n                long_term_ingestion_rate: RateMibPerSec(1),\n            },\n        ]);\n        let local_shards_update = LocalShardsUpdate {\n            leader_id: \"test-ingester\".into(),\n            source_uid: source_uid.clone(),\n            shard_infos,\n        };\n        controller\n            .handle_local_shards_update(local_shards_update, &mut model, &progress)\n            .await\n            .unwrap();\n\n        // Test update shard ingestion rate with failing scale up.\n        let shard_infos = BTreeSet::from_iter([\n            ShardInfo {\n                shard_id: ShardId::from(1),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(4),\n                long_term_ingestion_rate: RateMibPerSec(4),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(2),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: 
RateMibPerSec(4),\n                long_term_ingestion_rate: RateMibPerSec(4),\n            },\n        ]);\n        let local_shards_update = LocalShardsUpdate {\n            leader_id: \"test-ingester\".into(),\n            source_uid: source_uid.clone(),\n            shard_infos,\n        };\n\n        // The first request fails due to an error on the metastore.\n        let MetastoreError::InvalidArgument { .. } = controller\n            .handle_local_shards_update(local_shards_update.clone(), &mut model, &progress)\n            .await\n            .unwrap_err()\n        else {\n            panic!();\n        };\n\n        // The second request works!\n        controller\n            .handle_local_shards_update(local_shards_update, &mut model, &progress)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_disable_validation_when_vrl() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_open_shards()\n            .once()\n            .returning(|request| {\n                let subrequest: &OpenShardSubrequest = &request.subrequests[0];\n                let shard = Shard {\n                    index_uid: subrequest.index_uid.clone(),\n                    source_id: subrequest.source_id.clone(),\n                    shard_id: subrequest.shard_id.clone(),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: subrequest.leader_id.clone(),\n                    follower_id: subrequest.follower_id.clone(),\n                    doc_mapping_uid: subrequest.doc_mapping_uid,\n                    publish_position_inclusive: Some(Position::Beginning),\n                    publish_token: None,\n                    update_timestamp: 1724158996,\n                };\n                let response = OpenShardsResponse {\n                    subresponses: vec![OpenShardSubresponse {\n                        subrequest_id: 
subrequest.subrequest_id,\n                        open_shard: Some(shard),\n                    }],\n                };\n                Ok(response)\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://indexes/test-index\");\n        let source_id: SourceId = \"test-source\".to_string();\n        let mut source_config =\n            SourceConfig::for_test(&source_id, quickwit_config::SourceParams::void());\n        // set a vrl script\n        source_config.transform_config =\n            Some(quickwit_config::TransformConfig::new(\"\".to_string(), None));\n        index_metadata\n            .sources\n            .insert(source_id.clone(), source_config);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let mut model = ControlPlaneModel::default();\n        model.add_index(index_metadata);\n        let progress = Progress::default();\n\n        let shards = vec![Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-ingester\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        let mut mock_ingester = MockIngesterService::new();\n\n        mock_ingester.expect_init_shards().returning(\n            move 
|init_shard_request: InitShardsRequest| {\n                assert_eq!(init_shard_request.subrequests.len(), 1);\n                let init_shard_subrequest: &InitShardSubrequest =\n                    &init_shard_request.subrequests[0];\n                // we have vrl, so no validation\n                assert!(!init_shard_subrequest.validate_docs);\n                Ok(InitShardsResponse {\n                    successes: vec![InitShardSuccess {\n                        subrequest_id: init_shard_subrequest.subrequest_id,\n                        shard: init_shard_subrequest.shard.clone(),\n                    }],\n                    failures: Vec::new(),\n                })\n            },\n        );\n\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(\"test-ingester\".into(), ingester);\n\n        let shard_infos = BTreeSet::from_iter([ShardInfo {\n            shard_id: ShardId::from(1),\n            shard_state: ShardState::Open,\n            short_term_ingestion_rate: RateMibPerSec(4),\n            long_term_ingestion_rate: RateMibPerSec(4),\n        }]);\n        let local_shards_update = LocalShardsUpdate {\n            leader_id: \"test-ingester\".into(),\n            source_uid: source_uid.clone(),\n            shard_infos,\n        };\n\n        controller\n            .handle_local_shards_update(local_shards_update, &mut model, &progress)\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_try_scale_up_shards() {\n        let mut mock_metastore = MockMetastoreService::new();\n\n        let index_uid = IndexUid::from_str(\"test-index:00000000000000000000000000\").unwrap();\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_open_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 
1);\n                assert_eq!(request.subrequests[0].index_uid(), &index_uid_clone);\n                assert_eq!(request.subrequests[0].source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(request.subrequests[0].leader_id, \"test-ingester\");\n\n                Err(MetastoreError::InvalidArgument {\n                    message: \"failed to open shards\".to_string(),\n                })\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_open_shards()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n                assert_eq!(request.subrequests[0].index_uid(), &index_uid_clone);\n                assert_eq!(request.subrequests[0].source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(request.subrequests[0].leader_id, \"test-ingester\");\n\n                let subresponses = vec![metastore::OpenShardSubresponse {\n                    subrequest_id: 0,\n                    open_shard: Some(Shard {\n                        index_uid: Some(index_uid.clone()),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        leader_id: \"test-ingester\".to_string(),\n                        shard_state: ShardState::Open as i32,\n                        ..Default::default()\n                    }),\n                }];\n                let response = metastore::OpenShardsResponse { subresponses };\n                Ok(response)\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let index_uid = 
IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = INGEST_V2_SOURCE_ID.to_string();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let shard_stats = ShardStats {\n            num_open_shards: 2,\n            ..Default::default()\n        };\n        let mut model = ControlPlaneModel::default();\n        let index_metadata =\n            IndexMetadata::for_test(&index_uid.index_id, \"ram://indexes/test-index:0\");\n        model.add_index(index_metadata);\n\n        let source_config = SourceConfig::ingest_v2();\n        model.add_source(&index_uid, source_config).unwrap();\n\n        let progress = Progress::default();\n\n        // Test could not find leader because no ingester in pool\n        controller\n            .try_scale_up_shards(source_uid.clone(), shard_stats, &mut model, &progress, 1)\n            .await\n            .unwrap();\n\n        let mut mock_ingester = MockIngesterService::new();\n\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_init_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.subrequest_id, 0);\n\n                let shard = request.subrequests[0].shard();\n                assert_eq!(shard.index_uid(), &index_uid_clone);\n                assert_eq!(shard.source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(shard.leader_id, \"test-ingester\");\n\n                Err(IngestV2Error::Internal(\"failed to init shards\".to_string()))\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_init_shards()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = 
&request.subrequests[0];\n                assert_eq!(subrequest.subrequest_id, 0);\n\n                let shard = subrequest.shard();\n                assert_eq!(shard.index_uid(), &index_uid_clone);\n                assert_eq!(shard.source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(shard.leader_id, \"test-ingester\");\n\n                let successes = vec![InitShardSuccess {\n                    subrequest_id: request.subrequests[0].subrequest_id,\n                    shard: Some(shard.clone()),\n                }];\n                let response = InitShardsResponse {\n                    successes,\n                    failures: Vec::new(),\n                };\n                Ok(response)\n            });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(\"test-ingester\".into(), ingester);\n\n        // Test failed to open shards.\n        controller\n            .try_scale_up_shards(source_uid.clone(), shard_stats, &mut model, &progress, 1)\n            .await\n            .unwrap();\n        assert_eq!(model.all_shards().count(), 0);\n\n        // Test failed to init shards.\n        controller\n            .try_scale_up_shards(source_uid.clone(), shard_stats, &mut model, &progress, 1)\n            .await\n            .unwrap_err();\n        assert_eq!(model.all_shards().count(), 0);\n\n        // Test successfully opened shard.\n        controller\n            .try_scale_up_shards(source_uid.clone(), shard_stats, &mut model, &progress, 1)\n            .await\n            .unwrap();\n        assert_eq!(\n            model.all_shards().filter(|shard| shard.is_open()).count(),\n            1\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_try_scale_down_shards() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n\n    
    let controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".to_string();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let shard_stats = ShardStats {\n            num_open_shards: 2,\n            ..Default::default()\n        };\n        let min_shards = NonZeroUsize::MIN;\n        let mut model = ControlPlaneModel::default();\n        let progress = Progress::default();\n\n        // Test could not find a scale down candidate.\n        controller\n            .try_scale_down_shards(\n                source_uid.clone(),\n                shard_stats,\n                min_shards,\n                &mut model,\n                &progress,\n            )\n            .await\n            .unwrap();\n\n        let shards = vec![Shard {\n            shard_id: Some(ShardId::from(1)),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            leader_id: \"test-ingester\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        // Test ingester is unavailable.\n        controller\n            .try_scale_down_shards(\n                source_uid.clone(),\n                shard_stats,\n                min_shards,\n                &mut model,\n                &progress,\n            )\n            .await\n            .unwrap();\n\n        let mut mock_ingester = MockIngesterService::new();\n\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_close_shards()\n            .once()\n            
.returning(move |request| {\n                assert_eq!(request.shard_pkeys.len(), 1);\n                assert_eq!(request.shard_pkeys[0].index_uid(), &index_uid_clone);\n                assert_eq!(request.shard_pkeys[0].source_id, \"test-source\");\n                assert_eq!(request.shard_pkeys[0].shard_id(), ShardId::from(1));\n\n                Err(IngestV2Error::Internal(\n                    \"failed to close shards\".to_string(),\n                ))\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_close_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.shard_pkeys.len(), 1);\n                assert_eq!(request.shard_pkeys[0].index_uid(), &index_uid_clone);\n                assert_eq!(request.shard_pkeys[0].source_id, \"test-source\");\n                assert_eq!(request.shard_pkeys[0].shard_id(), ShardId::from(1));\n\n                let response = CloseShardsResponse {\n                    successes: request.shard_pkeys,\n                };\n                Ok(response)\n            });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n        ingester_pool.insert(\"test-ingester\".into(), ingester);\n\n        // Test failed to close shard.\n        controller\n            .try_scale_down_shards(\n                source_uid.clone(),\n                shard_stats,\n                min_shards,\n                &mut model,\n                &progress,\n            )\n            .await\n            .unwrap();\n        assert!(model.all_shards().all(|shard| shard.is_open()));\n\n        // Test successfully closed shard.\n        controller\n            .try_scale_down_shards(\n                source_uid.clone(),\n                shard_stats,\n                min_shards,\n                &mut model,\n                &progress,\n            )\n            .await\n          
  .unwrap();\n        assert!(model.all_shards().all(|shard| shard.is_closed()));\n\n        let shards = vec![Shard {\n            shard_id: Some(ShardId::from(2)),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            leader_id: \"test-ingester\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        // Test rate limited.\n        controller\n            .try_scale_down_shards(\n                source_uid.clone(),\n                shard_stats,\n                min_shards,\n                &mut model,\n                &progress,\n            )\n            .await\n            .unwrap();\n        assert!(model.all_shards().any(|shard| shard.is_open()));\n    }\n\n    #[test]\n    fn test_find_scale_down_candidate() {\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".to_string();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let mut model = ControlPlaneModel::default();\n\n        assert!(find_scale_down_candidate(&source_uid, &model).is_none());\n\n        let shards = vec![\n            Shard {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-0\".to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(2)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-0\".to_string(),\n                
..Default::default()\n            },\n            Shard {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(3)),\n                shard_state: ShardState::Closed as i32, //< this one is closed\n                leader_id: \"test-ingester-0\".to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(4)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-1\".to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(5)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-1\".to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(6)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-1\".to_string(),\n                ..Default::default()\n            },\n        ];\n        // That's 3 open shards on ingester-1, 2 open shards and one closed shard on ingester-0.\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        let shard_infos = BTreeSet::from_iter([\n            ShardInfo {\n                shard_id: ShardId::from(1),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: quickwit_ingest::RateMibPerSec(1),\n                long_term_ingestion_rate: quickwit_ingest::RateMibPerSec(1),\n            },\n            ShardInfo {\n                shard_id: 
ShardId::from(2),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: quickwit_ingest::RateMibPerSec(2),\n                long_term_ingestion_rate: quickwit_ingest::RateMibPerSec(2),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(3),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: quickwit_ingest::RateMibPerSec(3),\n                long_term_ingestion_rate: quickwit_ingest::RateMibPerSec(3),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(4),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: quickwit_ingest::RateMibPerSec(4),\n                long_term_ingestion_rate: quickwit_ingest::RateMibPerSec(4),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(5),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: quickwit_ingest::RateMibPerSec(5),\n                long_term_ingestion_rate: quickwit_ingest::RateMibPerSec(5),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(6),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: quickwit_ingest::RateMibPerSec(6),\n                long_term_ingestion_rate: quickwit_ingest::RateMibPerSec(6),\n            },\n        ]);\n        model.update_shards(&source_uid, &shard_infos);\n\n        let (leader_id, _shard_id) = find_scale_down_candidate(&source_uid, &model).unwrap();\n        // We pick ingester 1 as it has more open shards.\n        assert_eq!(leader_id, \"test-ingester-1\");\n    }\n\n    #[tokio::test]\n    async fn test_sync_with_ingesters() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 2;\n\n        let controller = IngestController::new(\n            metastore,\n            
ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".to_string();\n        let mut model = ControlPlaneModel::default();\n        let shards = vec![\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"node-1\".to_string(),\n                follower_id: Some(\"node-2\".to_string()),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(2)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"node-2\".to_string(),\n                follower_id: Some(\"node-3\".to_string()),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(3)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"node-2\".to_string(),\n                follower_id: Some(\"node-1\".to_string()),\n                ..Default::default()\n            },\n        ];\n        model.insert_shards(&index_uid, &source_id, shards);\n\n        let mut mock_ingester_1 = MockIngesterService::new();\n        let mock_ingester_2 = MockIngesterService::new();\n        let mock_ingester_3 = MockIngesterService::new();\n\n        let count_calls = Arc::new(AtomicUsize::new(0));\n        let count_calls_clone = count_calls.clone();\n        mock_ingester_1\n            .expect_retain_shards()\n            .once()\n            
.returning(move |request| {\n                assert_eq!(request.retain_shards_for_sources.len(), 1);\n                assert_eq!(\n                    request.retain_shards_for_sources[0].shard_ids,\n                    [ShardId::from(1), ShardId::from(3)]\n                );\n                count_calls_clone.fetch_add(1, Ordering::Release);\n                Ok(RetainShardsResponse {})\n            });\n        ingester_pool.insert(\n            \"node-1\".into(),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1)),\n        );\n        ingester_pool.insert(\n            \"node-2\".into(),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_2)),\n        );\n        ingester_pool.insert(\n            \"node-3\".into(),\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_3)),\n        );\n        let node_id = \"node-1\".into();\n        let wait_handle = controller.sync_with_ingester(&node_id, &model);\n        wait_handle.wait().await;\n        assert_eq!(count_calls.load(Ordering::Acquire), 1);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_advise_reset_shards() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 2;\n\n        let controller = IngestController::new(\n            metastore,\n            ingester_pool,\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let mut model = ControlPlaneModel::default();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id_00: SourceId = \"test-source-0\".into();\n        let source_id_01: SourceId = \"test-source-1\".into();\n\n        let shards = vec![Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id_00.clone(),\n          
  shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            publish_position_inclusive: Some(Position::offset(1337u64)),\n            ..Default::default()\n        }];\n        model.insert_shards(&index_uid, &source_id_00, shards);\n\n        let advise_reset_shards_request = AdviseResetShardsRequest {\n            ingester_id: \"test-ingester\".to_string(),\n            shard_ids: vec![\n                ShardIds {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id_00.clone(),\n                    shard_ids: vec![ShardId::from(1), ShardId::from(2)],\n                },\n                ShardIds {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id_01.clone(),\n                    shard_ids: vec![ShardId::from(3)],\n                },\n            ],\n        };\n        let advise_reset_shards_response =\n            controller.advise_reset_shards(advise_reset_shards_request, &model);\n\n        assert_eq!(advise_reset_shards_response.shards_to_delete.len(), 2);\n\n        let shard_to_delete_00 = &advise_reset_shards_response.shards_to_delete[0];\n        assert_eq!(shard_to_delete_00.index_uid(), &index_uid);\n        assert_eq!(shard_to_delete_00.source_id, source_id_00);\n        assert_eq!(shard_to_delete_00.shard_ids.len(), 1);\n        assert_eq!(shard_to_delete_00.shard_ids[0], ShardId::from(2));\n\n        let shard_to_delete_01 = &advise_reset_shards_response.shards_to_delete[1];\n        assert_eq!(shard_to_delete_01.index_uid(), &index_uid);\n        assert_eq!(shard_to_delete_01.source_id, source_id_01);\n        assert_eq!(shard_to_delete_01.shard_ids.len(), 1);\n        assert_eq!(shard_to_delete_01.shard_ids[0], ShardId::from(3));\n\n        assert_eq!(advise_reset_shards_response.shards_to_truncate.len(), 1);\n\n        let shard_to_truncate = &advise_reset_shards_response.shards_to_truncate[0];\n        
assert_eq!(shard_to_truncate.index_uid(), &index_uid);\n        assert_eq!(shard_to_truncate.source_id, source_id_00);\n        assert_eq!(shard_to_truncate.shard_positions.len(), 1);\n        assert_eq!(\n            shard_to_truncate.shard_positions[0].shard_id(),\n            ShardId::from(1)\n        );\n        assert_eq!(\n            shard_to_truncate.shard_positions[0].publish_position_inclusive(),\n            Position::offset(1337u64)\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_close_shards() {\n        let metastore = MetastoreServiceClient::mocked();\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let closed_shards = controller.close_shards(Vec::new()).await;\n        assert_eq!(closed_shards.len(), 0);\n\n        let ingester_id_0 = NodeId::from(\"test-ingester-0\");\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_close_shards()\n            .once()\n            .returning(|mut request| {\n                assert_eq!(request.shard_pkeys.len(), 2);\n\n                request\n                    .shard_pkeys\n                    .sort_by(|left, right| left.shard_id().cmp(right.shard_id()));\n\n                let shard_0 = &request.shard_pkeys[0];\n                assert_eq!(shard_0.index_uid(), &IndexUid::for_test(\"test-index\", 0));\n                assert_eq!(shard_0.source_id, \"test-source\");\n                assert_eq!(shard_0.shard_id(), ShardId::from(0));\n\n                let shard_1 = &request.shard_pkeys[1];\n                assert_eq!(shard_1.index_uid(), &IndexUid::for_test(\"test-index\", 0));\n                assert_eq!(shard_1.source_id, \"test-source\");\n                
assert_eq!(shard_1.shard_id(), ShardId::from(1));\n\n                let response = CloseShardsResponse {\n                    successes: vec![shard_0.clone()],\n                };\n                Ok(response)\n            });\n        let ingester_0 = IngesterServiceClient::from_mock(mock_ingester_0);\n        ingester_pool.insert(\n            ingester_id_0.clone(),\n            IngesterPoolEntry::ready_with_client(ingester_0),\n        );\n\n        let ingester_id_1 = NodeId::from(\"test-ingester-1\");\n        let mut mock_ingester_1 = MockIngesterService::new();\n        mock_ingester_1\n            .expect_close_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.shard_pkeys.len(), 1);\n\n                let shard = &request.shard_pkeys[0];\n                assert_eq!(shard.index_uid(), &IndexUid::for_test(\"test-index\", 0));\n                assert_eq!(shard.source_id, \"test-source\");\n                assert_eq!(shard.shard_id(), ShardId::from(2));\n\n                Err(IngestV2Error::Internal(\"internal error\".to_string()))\n            });\n        let ingester_1 = IngesterServiceClient::from_mock(mock_ingester_1);\n        ingester_pool.insert(\n            ingester_id_1.clone(),\n            IngesterPoolEntry::ready_with_client(ingester_1),\n        );\n\n        let ingester_id_2 = NodeId::from(\"test-ingester-2\");\n        let mut mock_ingester_2 = MockIngesterService::new();\n        mock_ingester_2.expect_close_shards().never();\n\n        let client_2 = IngesterServiceClient::tower()\n            .stack_close_shards_layer(DelayLayer::new(CLOSE_SHARDS_REQUEST_TIMEOUT * 2))\n            .build_from_mock(mock_ingester_2);\n        ingester_pool.insert(\n            ingester_id_2.clone(),\n            IngesterPoolEntry::ready_with_client(client_2),\n        );\n\n        // In this test:\n        // - ingester 0 will close shard 0 successfully and fail to close shard 1;\n        // - ingester 1 
will return an error;\n        // - ingester 2 will time out;\n        // - ingester 3 will be unavailable.\n\n        let shards_to_close = vec![\n            Shard {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(0)),\n                leader_id: ingester_id_0.to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                leader_id: ingester_id_0.to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(2)),\n                leader_id: ingester_id_1.to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(3)),\n                leader_id: ingester_id_2.to_string(),\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(4)),\n                leader_id: \"test-ingester-3\".to_string(),\n                ..Default::default()\n            },\n        ];\n        let closed_shards = controller.close_shards(shards_to_close).await;\n        assert_eq!(closed_shards.len(), 1);\n\n        let closed_shard = &closed_shards[0];\n        assert_eq!(closed_shard.index_uid(), &(\"test-index\", 0));\n        
assert_eq!(closed_shard.source_id, \"test-source\");\n        assert_eq!(closed_shard.shard_id(), ShardId::from(0));\n    }\n\n    #[tokio::test]\n    async fn test_ingest_controller_rebalance_shards() {\n        setup_logging_for_tests();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_open_shards().return_once(|request| {\n            assert_eq!(request.subrequests.len(), 1);\n\n            let subrequest_0 = &request.subrequests[0];\n            assert_eq!(subrequest_0.subrequest_id, 0);\n            assert_eq!(subrequest_0.index_uid(), &(\"test-index\", 0));\n            assert_eq!(subrequest_0.source_id, INGEST_V2_SOURCE_ID.to_string());\n            assert_eq!(subrequest_0.leader_id, \"test-ingester-1\");\n            assert!(subrequest_0.follower_id.is_none());\n\n            let subresponses = vec![metastore::OpenShardSubresponse {\n                subrequest_id: 0,\n                open_shard: Some(Shard {\n                    index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                    source_id: INGEST_V2_SOURCE_ID.to_string(),\n                    shard_id: subrequest_0.shard_id.clone(),\n                    leader_id: \"test-ingester-1\".to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                }),\n            }];\n            let response = metastore::OpenShardsResponse { subresponses };\n            Ok(response)\n        });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let mut controller = IngestController::new(\n            metastore,\n            ingester_pool.clone(),\n            replication_factor,\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n\n        let mut model = ControlPlaneModel::default();\n\n        let universe = 
Universe::with_accelerated_time();\n        let (control_plane_mailbox, control_plane_inbox) = universe.create_test_mailbox();\n        let progress = Progress::default();\n\n        let num_opened_shards = controller\n            .rebalance_shards(&mut model, &control_plane_mailbox, &progress)\n            .await\n            .unwrap();\n        assert_eq!(num_opened_shards, 0);\n\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram://indexes/test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        model.add_index(index_metadata);\n\n        let source_config = SourceConfig::ingest_v2();\n        model.add_source(&index_uid, source_config).unwrap();\n\n        // In this test, ingester 0 hosts 5 shards but there are two ingesters in the cluster.\n        // `rebalance_shards` will attempt to move 2 shards to ingester 1. However, it will fail to\n        // init one shard, so only one shard will be actually moved.\n\n        let open_shards = vec![\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(0)),\n                leader_id: \"test-ingester-0\".to_string(),\n                shard_state: ShardState::Open as i32,\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(1)),\n                leader_id: \"test-ingester-0\".to_string(),\n                shard_state: ShardState::Open as i32,\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(2)),\n                leader_id: \"test-ingester-0\".to_string(),\n                
shard_state: ShardState::Open as i32,\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(3)),\n                leader_id: \"test-ingester-0\".to_string(),\n                shard_state: ShardState::Open as i32,\n                ..Default::default()\n            },\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(ShardId::from(4)),\n                leader_id: \"test-ingester-0\".to_string(),\n                shard_state: ShardState::Open as i32,\n                ..Default::default()\n            },\n        ];\n        model.insert_shards(&index_uid, &INGEST_V2_SOURCE_ID.to_string(), open_shards);\n\n        let ingester_id_0 = NodeId::from(\"test-ingester-0\");\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_close_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.shard_pkeys.len(), 1);\n\n                let shard = &request.shard_pkeys[0];\n                assert_eq!(shard.index_uid(), &(\"test-index\", 0));\n                assert_eq!(shard.source_id, INGEST_V2_SOURCE_ID);\n                // assert_eq!(shard.shard_id(), ShardId::from(2));\n\n                let response = CloseShardsResponse {\n                    successes: vec![shard.clone()],\n                };\n                Ok(response)\n            });\n        let ingester_0 = IngesterServiceClient::from_mock(mock_ingester_0);\n        ingester_pool.insert(\n            ingester_id_0.clone(),\n            IngesterPoolEntry::ready_with_client(ingester_0),\n        );\n\n        let ingester_id_1 = NodeId::from(\"test-ingester-1\");\n        let mut mock_ingester_1 = MockIngesterService::new();\n   
     mock_ingester_1.expect_init_shards().return_once(|request| {\n            assert_eq!(request.subrequests.len(), 2);\n\n            let subrequest_0 = &request.subrequests[0];\n            assert_eq!(subrequest_0.subrequest_id, 0);\n\n            let shard_0 = request.subrequests[0].shard();\n            assert_eq!(shard_0.index_uid(), &(\"test-index\", 0));\n            assert_eq!(shard_0.source_id, INGEST_V2_SOURCE_ID.to_string());\n            assert_eq!(shard_0.leader_id, \"test-ingester-1\");\n            assert!(shard_0.follower_id.is_none());\n\n            let subrequest_1 = &request.subrequests[1];\n            assert_eq!(subrequest_1.subrequest_id, 1);\n\n            let shard_1 = request.subrequests[1].shard();\n            assert_eq!(shard_1.index_uid(), &(\"test-index\", 0));\n            assert_eq!(shard_1.source_id, INGEST_V2_SOURCE_ID.to_string());\n            assert_eq!(shard_1.leader_id, \"test-ingester-1\");\n            assert!(shard_1.follower_id.is_none());\n\n            let successes = vec![InitShardSuccess {\n                subrequest_id: request.subrequests[0].subrequest_id,\n                shard: Some(shard_0.clone()),\n            }];\n            let failures = vec![InitShardFailure {\n                subrequest_id: request.subrequests[1].subrequest_id,\n                index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                shard_id: Some(shard_1.shard_id().clone()),\n            }];\n            let response = InitShardsResponse {\n                successes,\n                failures,\n            };\n            Ok(response)\n        });\n        let ingester_1 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1));\n        ingester_pool.insert(ingester_id_1.clone(), ingester_1);\n\n        let num_opened_shards = controller\n            .rebalance_shards(&mut model, &control_plane_mailbox, 
&progress)\n            .await\n            .unwrap();\n        assert_eq!(num_opened_shards, 1);\n\n        let callback: RebalanceShardsCallback = tokio::time::timeout(\n            CLOSE_SHARDS_REQUEST_TIMEOUT * 2,\n            control_plane_inbox.recv_typed_message(),\n        )\n        .await\n        .unwrap()\n        .unwrap();\n        assert_eq!(callback.closed_shards.len(), 1);\n    }\n\n    // #[track_caller]\n    fn test_allocate_shards_aux_aux(\n        shard_counts_map: &HashMap<NodeId, usize>,\n        num_shards: usize,\n        replication_enabled: bool,\n    ) {\n        let shard_allocations_opt =\n            super::allocate_shards(shard_counts_map, num_shards, replication_enabled);\n        if num_shards == 0 {\n            assert_eq!(shard_allocations_opt, Some(Vec::new()));\n            return;\n        }\n        let num_nodes_required = if replication_enabled { 2 } else { 1 };\n        if shard_counts_map.len() < num_nodes_required {\n            assert!(shard_allocations_opt.is_none());\n            return;\n        }\n        let shard_allocations = shard_allocations_opt.unwrap();\n        let mut total_counts: HashMap<&NodeIdRef, usize> = HashMap::default();\n        assert_eq!(shard_allocations.len(), num_shards);\n        if num_shards == 0 {\n            return;\n        }\n        for (leader, follower_opt) in shard_allocations {\n            assert_eq!(follower_opt.is_some(), replication_enabled);\n            *total_counts.entry(leader).or_default() += 1;\n            if let Some(follower) = follower_opt {\n                *total_counts.entry(follower).or_default() += 1;\n                assert_ne!(follower, leader);\n            }\n        }\n        for (shard, count) in shard_counts_map {\n            if let Some(shard_count) = total_counts.get_mut(shard.as_ref()) {\n                *shard_count += *count;\n            }\n        }\n        let (min, max) = total_counts\n            .values()\n            .copied()\n           
 .minmax()\n            .into_option()\n            .unwrap();\n        if !replication_enabled {\n            // This strict check only applies without replication. If replication is enabled, we\n            // can end up being forced to not spread shards as evenly as we would wish, for\n            // instance if there are only two nodes and they are initially unbalanced.\n            assert!(min + 1 >= max);\n        } else {\n            let (previous_min, previous_max) = shard_counts_map\n                .values()\n                .copied()\n                .minmax()\n                .into_option()\n                .unwrap();\n            // The algorithm is supposed to reduce the variance.\n            // Of course sometimes it is not possible. For instance for 3 nodes that are\n            // perfectly balanced to begin with, if we ask for a single shard.\n            assert!((previous_max - previous_min).max(1) >= (max - min));\n        }\n    }\n\n    fn test_allocate_shards_aux(shard_counts: &[usize]) {\n        let mut shard_counts_map: HashMap<NodeId, usize> = HashMap::new();\n        let shards: Vec<String> = (0..shard_counts.len())\n            .map(|i| format!(\"shard-{i}\"))\n            .collect();\n        for (shard, &shard_count) in shards.into_iter().zip(shard_counts.iter()) {\n            shard_counts_map.insert(NodeId::from(shard), shard_count);\n        }\n        for i in 0..10 {\n            test_allocate_shards_aux_aux(&shard_counts_map, i, false);\n            test_allocate_shards_aux_aux(&shard_counts_map, i, true);\n        }\n    }\n\n    use proptest::prelude::*;\n\n    proptest! 
{\n        #[test]\n        fn test_proptest_allocate_shards(shard_counts in proptest::collection::vec(0..10usize, 0..10usize)) {\n            test_allocate_shards_aux(&shard_counts);\n        }\n    }\n\n    #[test]\n    fn test_allocate_shards_prop_test() {\n        test_allocate_shards_aux(&[]);\n        test_allocate_shards_aux(&[1]);\n        test_allocate_shards_aux(&[1, 1]);\n        test_allocate_shards_aux(&[1, 2]);\n        test_allocate_shards_aux(&[1, 4]);\n        test_allocate_shards_aux(&[2, 3, 2]);\n        test_allocate_shards_aux(&[2, 4, 6]);\n        test_allocate_shards_aux(&[2, 3, 10]);\n    }\n\n    #[test]\n    fn test_allocate_shards_prop_test_bug() {\n        test_allocate_shards_aux(&[7, 7, 7]);\n    }\n\n    #[test]\n    fn test_pick_one() {\n        let mut shard_counts = BTreeMap::default();\n        shard_counts.insert(\n            1,\n            vec![NodeIdRef::from_str(\"node1\"), NodeIdRef::from_str(\"node2\")],\n        );\n        let mut rng = rand::rng();\n        let node = pick_one(\n            &mut shard_counts,\n            Some(NodeIdRef::from_str(\"node2\")),\n            &mut rng,\n        )\n        .unwrap();\n        assert_eq!(node.as_str(), \"node1\");\n        assert_eq!(shard_counts.len(), 2);\n        assert_eq!(\n            &shard_counts.get(&1).unwrap()[..],\n            &[NodeIdRef::from_str(\"node2\")]\n        );\n        assert_eq!(\n            &shard_counts.get(&2).unwrap()[..],\n            &[NodeIdRef::from_str(\"node1\")]\n        );\n        let node = pick_one(&mut shard_counts, None, &mut rng).unwrap();\n        assert_eq!(node.as_str(), \"node2\");\n        assert_eq!(shard_counts.len(), 1);\n        assert_eq!(\n            &shard_counts.get(&2).unwrap()[..],\n            &[NodeIdRef::from_str(\"node1\"), NodeIdRef::from_str(\"node2\")]\n        );\n    }\n\n    /// Test helper for compute_shards_to_rebalance.\n    /// The reason for testing both available and unavailable ingesters with open 
shards is to\n    /// ensure the algorithm holds up when some open shards are led by ingesters that are absent\n    /// from the ingester pool.\n    ///\n    /// - `ready_ingester_shards`: number of open shards on each ready ingester\n    /// - `unavailable_ingester_shards`: number of open shards on each unavailable ingester\n    /// - `retiring_ingester_shards`: number of open shards on each retiring ingester\n    fn test_compute_shards_to_rebalance_aux(\n        ready_ingester_shards: &[usize],\n        unavailable_ingester_shards: &[usize],\n        retiring_ingester_shards: &[usize],\n    ) {\n        let index_id = \"test-index\";\n        let index_metadata = IndexMetadata::for_test(index_id, \"ram://indexes/test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        let source_id: SourceId = \"test-source\".to_string();\n\n        let mut model = ControlPlaneModel::default();\n        model.add_index(index_metadata.clone());\n\n        let mut source_config = SourceConfig::ingest_v2();\n        source_config.source_id = source_id.to_string();\n        model.add_source(&index_uid, source_config).unwrap();\n\n        let ingester_pool = IngesterPool::default();\n        let mock_ingester = MockIngesterService::new();\n        let ingester_client = IngesterServiceClient::from_mock(mock_ingester);\n\n        let ready_ids: Vec<String> = (0..ready_ingester_shards.len())\n            .map(|i| format!(\"ready-ingester-{}\", i))\n            .collect();\n\n        for ingester_id in &ready_ids {\n            let ingester = IngesterPoolEntry {\n                client: ingester_client.clone(),\n                status: IngesterStatus::Ready,\n                availability_zone: None,\n            };\n            ingester_pool.insert(NodeId::from(ingester_id.clone()), ingester);\n        }\n\n        let unavailable_ids: Vec<String> = (0..unavailable_ingester_shards.len())\n            .map(|i| format!(\"unavailable-ingester-{}\", i))\n            .collect();\n\n        let retiring_ids: Vec<String> = (0..retiring_ingester_shards.len())\n            .map(|i| format!(\"retiring-ingester-{}\", i))\n            
.collect();\n\n        for ingester_id in &retiring_ids {\n            let ingester = IngesterPoolEntry {\n                client: ingester_client.clone(),\n                status: IngesterStatus::Retiring,\n                availability_zone: None,\n            };\n            ingester_pool.insert(NodeId::from(ingester_id.clone()), ingester);\n        }\n\n        let mut shards: Vec<Shard> = Vec::new();\n        let mut shard_id: u64 = 0;\n\n        for (idx, &num_shards) in ready_ingester_shards.iter().enumerate() {\n            for _ in 0..num_shards {\n                shards.push(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(shard_id)),\n                    leader_id: ready_ids[idx].clone(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                });\n                shard_id += 1;\n            }\n        }\n\n        // Shards on unavailable ingesters - these shouldn't affect rebalancing calculations\n        for (idx, &num_shards) in unavailable_ingester_shards.iter().enumerate() {\n            for _ in 0..num_shards {\n                shards.push(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(shard_id)),\n                    leader_id: unavailable_ids[idx].clone(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                });\n                shard_id += 1;\n            }\n        }\n\n        let num_retiring_shards: usize = retiring_ingester_shards.iter().sum();\n\n        // Shards on retiring ingesters - all of these should be rebalanced\n        for (idx, &num_shards) in retiring_ingester_shards.iter().enumerate() {\n            for _ in 0..num_shards {\n                shards.push(Shard {\n         
           index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(shard_id)),\n                    leader_id: retiring_ids[idx].clone(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                });\n                shard_id += 1;\n            }\n        }\n\n        model.insert_shards(&index_uid, &source_id, shards.clone());\n\n        let controller = IngestController::new(\n            MetastoreServiceClient::mocked(),\n            ingester_pool.clone(),\n            2, // replication_factor\n            TEST_SHARD_THROUGHPUT_LIMIT_MIB,\n            1.001,\n        );\n        let shards_to_rebalance = controller.compute_shards_to_rebalance(&model);\n\n        // All shards on retiring ingesters must be rebalanced.\n        let num_retiring_shards_to_rebalance = shards_to_rebalance\n            .iter()\n            .filter(|shard| shard.leader_id.starts_with(\"retiring-\"))\n            .count();\n        assert_eq!(num_retiring_shards_to_rebalance, num_retiring_shards);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let shard_ids_to_rebalance: Vec<ShardId> = shards_to_rebalance\n            .iter()\n            .flat_map(|shard| shard.shard_id.clone())\n            .collect();\n\n        let closed_shard_ids = model.close_shards(&source_uid, &shard_ids_to_rebalance);\n        assert_eq!(closed_shard_ids.len(), shards_to_rebalance.len());\n\n        let mut per_ready_ingester_num_shards: HashMap<&str, usize> = ready_ids\n            .iter()\n            .map(|ready_id| (ready_id.as_str(), 0))\n            .collect();\n\n        for shard in model.all_shards() {\n            if !shard.is_open() {\n                continue;\n            }\n            if let Some(count_shard) =\n                
per_ready_ingester_num_shards.get_mut(shard.leader_id.as_str())\n            {\n                *count_shard += 1;\n            }\n        }\n\n        // Now we move the different shards to ready ingesters (not retiring ones).\n        // We can only simulate this if there are ready ingesters to receive shards.\n        if !ready_ids.is_empty() {\n            let mut per_ingester_num_shards_sorted: BTreeSet<(usize, &str)> =\n                per_ready_ingester_num_shards\n                    .into_iter()\n                    .map(|(ingester_id, num_shards)| (num_shards, ingester_id))\n                    .collect();\n            let mut opened_shards: Vec<Shard> = Vec::new();\n            for _ in 0..shards_to_rebalance.len() {\n                let (num_shards, ingester_id) = per_ingester_num_shards_sorted.pop_first().unwrap();\n                let opened_shard = Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.to_string(),\n                    shard_id: Some(ShardId::from(shard_id)),\n                    leader_id: ingester_id.to_string(),\n                    shard_state: ShardState::Open as i32,\n                    ..Default::default()\n                };\n                per_ingester_num_shards_sorted.insert((num_shards + 1, ingester_id));\n                opened_shards.push(opened_shard);\n                shard_id += 1;\n            }\n\n            if let Some((min_shards, max_shards)) = per_ingester_num_shards_sorted\n                .iter()\n                .map(|(num_shards, _)| num_shards)\n                .copied()\n                .minmax()\n                .into_option()\n            {\n                assert!(min_shards + min_shards.div_ceil(10).max(2) >= max_shards);\n            }\n\n            // Test stability of the algorithm: mark the retiring ingesters as\n            // decommissioned, insert the new shards, and verify no further rebalance is\n            // needed among the ready 
ingesters.\n            for ingester_id in &retiring_ids {\n                let ingester = IngesterPoolEntry {\n                    client: ingester_client.clone(),\n                    status: IngesterStatus::Decommissioned,\n                    availability_zone: None,\n                };\n                ingester_pool.insert(NodeId::from(ingester_id.clone()), ingester);\n            }\n            model.insert_shards(&index_uid, &source_id, opened_shards);\n\n            let shards_to_rebalance = controller.compute_shards_to_rebalance(&model);\n            assert!(shards_to_rebalance.is_empty());\n        }\n    }\n\n    proptest! {\n        #[test]\n        fn test_compute_shards_to_rebalance_proptest(\n            ready_shards in proptest::collection::vec(0..13usize, 0..13usize),\n            unavailable_shards in proptest::collection::vec(0..13usize, 0..5usize),\n            retiring_shards in proptest::collection::vec(0..5usize, 0..5usize),\n        ) {\n            test_compute_shards_to_rebalance_aux(&ready_shards, &unavailable_shards, &retiring_shards);\n        }\n    }\n\n    #[test]\n    fn test_compute_shards_to_rebalance() {\n        test_compute_shards_to_rebalance_aux(&[], &[], &[]);\n        test_compute_shards_to_rebalance_aux(&[0], &[], &[]);\n        test_compute_shards_to_rebalance_aux(&[1], &[], &[]);\n        test_compute_shards_to_rebalance_aux(&[0, 1], &[], &[]);\n        test_compute_shards_to_rebalance_aux(&[0, 1], &[1], &[]);\n        test_compute_shards_to_rebalance_aux(&[0, 1, 2], &[3, 4], &[]);\n        // Retiring ingesters: all their shards must be rebalanced\n        test_compute_shards_to_rebalance_aux(&[1, 1], &[], &[3]);\n        test_compute_shards_to_rebalance_aux(&[0, 0, 0], &[], &[5]);\n        test_compute_shards_to_rebalance_aux(&[2], &[], &[1, 2]);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/ingest/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub(crate) mod ingest_controller;\nmod scaling_arbiter;\nmod wait_handle;\n\npub use ingest_controller::IngestController;\npub use wait_handle::WaitHandle;\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/ingest/scaling_arbiter.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\n\nuse crate::model::{ScalingMode, ShardStats};\n\npub(crate) struct ScalingArbiter {\n    // Threshold in MiB/s below which we decrease the number of shards.\n    scale_down_shards_threshold_mib_per_sec: f32,\n\n    // Per shard threshold in MiB/s above which we increase the number of shards.\n    //\n    // We want scaling up to be reactive, so we first inspect the short\n    // term threshold.\n    //\n    // However, this threshold is based on a very short window of time: 5s.\n    //\n    // In order to avoid having back and forth scaling up and down in response to temporary\n    // punctual spikes of a few MB, we also compute what would be the long term ingestion rate\n    // after scaling up, and double check that it is above the long term threshold.\n    scale_up_shards_short_term_threshold_mib_per_sec: f32,\n    scale_up_shards_long_term_threshold_mib_per_sec: f32,\n    // The max increase factor of the number of shards in one scale up operation\n    shard_scale_up_factor: f32,\n}\n\nimpl ScalingArbiter {\n    pub fn with_max_shard_ingestion_throughput_mib_per_sec(\n        max_shard_throughput_mib_per_sec: f32,\n        shard_scale_up_factor: f32,\n    ) -> ScalingArbiter {\n        ScalingArbiter {\n            scale_up_shards_short_term_threshold_mib_per_sec: max_shard_throughput_mib_per_sec\n              
  * 0.8f32,\n            scale_up_shards_long_term_threshold_mib_per_sec: max_shard_throughput_mib_per_sec\n                * 0.3f32,\n            scale_down_shards_threshold_mib_per_sec: max_shard_throughput_mib_per_sec * 0.2f32,\n            shard_scale_up_factor,\n        }\n    }\n\n    /// Computes the maximum number of shards we can have without going below\n    /// the long term scale up threshold\n    fn long_term_scale_up_threshold_max_shards(&self, shard_stats: ShardStats) -> usize {\n        (shard_stats.avg_long_term_ingestion_rate * shard_stats.num_open_shards as f32\n            / self.scale_up_shards_long_term_threshold_mib_per_sec)\n            .floor() as usize\n    }\n\n    /// Computes the next number of shards we should have according to the scaling factor\n    fn scale_up_factor_target_shards(&self, shard_stats: ShardStats) -> usize {\n        (shard_stats.num_open_shards as f32 * self.shard_scale_up_factor).ceil() as usize\n    }\n\n    /// Scale based on the \"per shard average\" metric\n    ///\n    /// Returns `None` when there are no open shards because in that case routers are expected to\n    /// make the [`quickwit_proto::control_plane::GetOrCreateOpenShardsRequest`]\n    pub(crate) fn should_scale(\n        &self,\n        shard_stats: ShardStats,\n        min_shards: NonZeroUsize,\n    ) -> Option<ScalingMode> {\n        // If ingest is idle, there is nothing to do. 
Idle shards are automatically closed by\n        // ingesters (see `quickwit_ingest::ingest_v2::idle::CloseIdleShardsTask`).\n        if shard_stats.num_open_shards == 0 || shard_stats.avg_long_term_ingestion_rate == 0.0 {\n            return None;\n        }\n        if shard_stats.num_open_shards < min_shards.get() {\n            let num_shards_to_open = min_shards.get() - shard_stats.num_open_shards;\n            let scaling_mode = ScalingMode::Up(num_shards_to_open);\n            return Some(scaling_mode);\n        }\n        // Scale up based on the short term metric value while making sure that\n        // the long term value doesn't get near the scale down threshold.\n        if shard_stats.avg_short_term_ingestion_rate\n            >= self.scale_up_shards_short_term_threshold_mib_per_sec\n        {\n            let new_calculated_num_shards = usize::min(\n                self.long_term_scale_up_threshold_max_shards(shard_stats),\n                self.scale_up_factor_target_shards(shard_stats),\n            );\n\n            let target_num_shards = usize::max(min_shards.get(), new_calculated_num_shards);\n\n            if target_num_shards > shard_stats.num_open_shards {\n                let num_shards_to_open = target_num_shards - shard_stats.num_open_shards;\n                let scaling_mode = ScalingMode::Up(num_shards_to_open);\n                return Some(scaling_mode);\n            }\n        }\n        // On the other hand, scale down only based on the long term metric value to avoid\n        // being sensitive to very short drops in ingestion\n        if shard_stats.avg_long_term_ingestion_rate <= self.scale_down_shards_threshold_mib_per_sec\n            && shard_stats.num_open_shards > min_shards.get()\n        {\n            return Some(ScalingMode::Down);\n        }\n        None\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n\n    use super::ScalingArbiter;\n    use crate::model::{ScalingMode, ShardStats};\n\n    
#[test]\n    fn test_scaling_arbiter_one_by_one() {\n        // use shard throughput 10MiB to simplify calculations\n        // with a factor close to 1 shards are effectively added 1 by 1\n        let scaling_arbiter =\n            ScalingArbiter::with_max_shard_ingestion_throughput_mib_per_sec(10.0, 1.01);\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 0,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 0.0,\n                    avg_long_term_ingestion_rate: 0.0,\n                },\n                NonZeroUsize::MIN\n            ),\n            None,\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 5.0,\n                    avg_long_term_ingestion_rate: 6.0,\n                },\n                NonZeroUsize::MIN\n            ),\n            None\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 8.1,\n                    avg_long_term_ingestion_rate: 8.1,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Up(1))\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 2,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 8.1,\n                    avg_long_term_ingestion_rate: 8.1,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Up(1))\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                
ShardStats {\n                    num_open_shards: 2,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 3.0,\n                    avg_long_term_ingestion_rate: 1.5,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Down)\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 3.0,\n                    avg_long_term_ingestion_rate: 1.5,\n                },\n                NonZeroUsize::MIN\n            ),\n            None,\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 8.0,\n                    avg_long_term_ingestion_rate: 3.0,\n                },\n                NonZeroUsize::MIN\n            ),\n            None,\n        );\n    }\n\n    #[test]\n    fn test_scaling_arbiter_2x() {\n        // use shard throughput 10MiB to simplify calculations\n        let scaling_arbiter =\n            ScalingArbiter::with_max_shard_ingestion_throughput_mib_per_sec(10.0, 2.);\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 0,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 0.0,\n                    avg_long_term_ingestion_rate: 0.0,\n                },\n                NonZeroUsize::MIN\n            ),\n            None,\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 2,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 5.0,\n              
      avg_long_term_ingestion_rate: 6.0,\n                },\n                NonZeroUsize::MIN\n            ),\n            None\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 8.1,\n                    avg_long_term_ingestion_rate: 8.1,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Up(1))\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 2,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 8.1,\n                    avg_long_term_ingestion_rate: 8.1,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Up(2))\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 2,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 3.0,\n                    avg_long_term_ingestion_rate: 1.5,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Down)\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 3.0,\n                    avg_long_term_ingestion_rate: 1.5,\n                },\n                NonZeroUsize::MIN\n            ),\n            None,\n        );\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 1,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 
8.0,\n                    avg_long_term_ingestion_rate: 3.1,\n                },\n                NonZeroUsize::MIN\n            ),\n            None,\n        );\n        // Scale by just 1 if 2 would bring us too close to the scale down threshold\n        assert_eq!(\n            scaling_arbiter.should_scale(\n                ShardStats {\n                    num_open_shards: 2,\n                    num_closed_shards: 0,\n                    avg_short_term_ingestion_rate: 8.1,\n                    avg_long_term_ingestion_rate: 5.,\n                },\n                NonZeroUsize::MIN\n            ),\n            Some(ScalingMode::Up(1)),\n        );\n    }\n\n    #[test]\n    fn test_scale_up_computations() {\n        // use shard throughput 10MiB to simplify calculations\n        let scaling_arbiter =\n            ScalingArbiter::with_max_shard_ingestion_throughput_mib_per_sec(10.0, 1.5);\n\n        let shard_stats = ShardStats {\n            num_open_shards: 0,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 0.,\n            avg_long_term_ingestion_rate: 0.,\n        };\n        assert_eq!(\n            scaling_arbiter.long_term_scale_up_threshold_max_shards(shard_stats),\n            0\n        );\n        assert_eq!(\n            scaling_arbiter.scale_up_factor_target_shards(shard_stats),\n            0\n        );\n\n        let shard_stats = ShardStats {\n            num_open_shards: 1,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 5.0,\n            avg_long_term_ingestion_rate: 6.1,\n        };\n        assert_eq!(\n            scaling_arbiter.long_term_scale_up_threshold_max_shards(shard_stats),\n            2\n        );\n        assert_eq!(\n            scaling_arbiter.scale_up_factor_target_shards(shard_stats),\n            2\n        );\n\n        let shard_stats = ShardStats {\n            num_open_shards: 2,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 
5.0,\n            avg_long_term_ingestion_rate: 1.1,\n        };\n        assert_eq!(\n            scaling_arbiter.long_term_scale_up_threshold_max_shards(shard_stats),\n            0\n        );\n        assert_eq!(\n            scaling_arbiter.scale_up_factor_target_shards(shard_stats),\n            3\n        );\n\n        let shard_stats = ShardStats {\n            num_open_shards: 2,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 5.0,\n            avg_long_term_ingestion_rate: 6.1,\n        };\n        assert_eq!(\n            scaling_arbiter.long_term_scale_up_threshold_max_shards(shard_stats),\n            4\n        );\n        assert_eq!(\n            scaling_arbiter.scale_up_factor_target_shards(shard_stats),\n            3\n        );\n\n        let shard_stats = ShardStats {\n            num_open_shards: 5,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 5.0,\n            avg_long_term_ingestion_rate: 1.1,\n        };\n        assert_eq!(\n            scaling_arbiter.long_term_scale_up_threshold_max_shards(shard_stats),\n            1\n        );\n        assert_eq!(\n            scaling_arbiter.scale_up_factor_target_shards(shard_stats),\n            8\n        );\n    }\n\n    #[test]\n    fn test_scaling_arbiter_idle() {\n        let scaling_arbiter =\n            ScalingArbiter::with_max_shard_ingestion_throughput_mib_per_sec(10.0, 1.5);\n\n        let shard_stats = ShardStats {\n            num_open_shards: 0,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 0.0,\n            avg_long_term_ingestion_rate: 0.0,\n        };\n        let min_shards = NonZeroUsize::MIN;\n        let scaling_mode = scaling_arbiter.should_scale(shard_stats, min_shards);\n        assert!(scaling_mode.is_none());\n\n        let shard_stats = ShardStats {\n            num_open_shards: 1,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 0.0,\n            
avg_long_term_ingestion_rate: 0.0,\n        };\n        let min_shards = NonZeroUsize::new(2).unwrap();\n        let scaling_mode = scaling_arbiter.should_scale(shard_stats, min_shards);\n        assert!(scaling_mode.is_none());\n    }\n\n    #[test]\n    fn test_scaling_arbiter_min_shards() {\n        let scaling_arbiter =\n            ScalingArbiter::with_max_shard_ingestion_throughput_mib_per_sec(10.0, 1.5);\n\n        let shard_stats = ShardStats {\n            num_open_shards: 1,\n            num_closed_shards: 0,\n            avg_short_term_ingestion_rate: 5.0,\n            avg_long_term_ingestion_rate: 1.0,\n        };\n        let min_shards = NonZeroUsize::new(5).unwrap();\n        let scaling_mode = scaling_arbiter\n            .should_scale(shard_stats, min_shards)\n            .unwrap();\n        assert_eq!(scaling_mode, ScalingMode::Up(4));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/ingest/wait_handle.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse tokio::sync::oneshot;\n\npub struct WaitHandle {\n    rx: oneshot::Receiver<()>,\n}\n\nimpl WaitHandle {\n    pub fn new() -> (WaitDropGuard, WaitHandle) {\n        let (tx, rx) = oneshot::channel();\n        let wait_drop_guard = WaitDropGuard(tx);\n        let wait_handle = WaitHandle { rx };\n        (wait_drop_guard, wait_handle)\n    }\n\n    pub async fn wait(self) {\n        let _ = self.rx.await;\n    }\n}\n\npub struct WaitDropGuard(#[allow(dead_code)] oneshot::Sender<()>);\n\n#[cfg(test)]\nmod tests {\n    use tokio::sync::oneshot::error::TryRecvError;\n\n    #[tokio::test]\n    async fn test_wait_handle_simple() {\n        let (wait_drop_handle, mut wait_handle) = super::WaitHandle::new();\n        assert!(matches!(\n            wait_handle.rx.try_recv().unwrap_err(),\n            TryRecvError::Empty\n        ));\n        drop(wait_drop_handle);\n        wait_handle.wait().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub mod control_plane;\npub mod indexing_plan;\npub mod indexing_scheduler;\npub mod ingest;\npub(crate) mod metrics;\npub(crate) mod model;\n\nuse quickwit_common::tower::Pool;\nuse quickwit_proto::indexing::{CpuCapacity, IndexingServiceClient, IndexingTask};\nuse quickwit_proto::types::NodeId;\n\n/// Indexer-node specific information stored in the pool of available indexer nodes\n#[derive(Debug, Clone)]\npub struct IndexerNodeInfo {\n    pub node_id: NodeId,\n    pub generation_id: u64,\n    pub client: IndexingServiceClient,\n    pub indexing_tasks: Vec<IndexingTask>,\n    pub indexing_capacity: CpuCapacity,\n}\n\npub type IndexerPool = Pool<NodeId, IndexerNodeInfo>;\n\nmod cooldown_map;\nmod debouncer;\n#[cfg(test)]\nmod tests;\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    IntCounter, IntGauge, IntGaugeVec, new_counter, new_gauge, new_gauge_vec,\n};\n\n#[derive(Debug, Clone, Copy)]\npub struct ShardLocalityMetrics {\n    pub num_remote_shards: usize,\n    pub num_local_shards: usize,\n}\n\npub struct ControlPlaneMetrics {\n    // Indexes and shards tracked by the control plane.\n    pub indexes_total: IntGauge,\n    pub open_shards: IntGaugeVec<1>,\n    pub closed_shards: IntGaugeVec<1>,\n\n    // Operations performed by the control plane.\n    pub apply_plan_total: IntCounter,\n    pub rebalance_shards: IntGauge,\n    pub restart_total: IntCounter,\n    pub schedule_total: IntCounter,\n\n    // Metastore errors.\n    pub metastore_error_aborted: IntCounter,\n    pub metastore_error_maybe_executed: IntCounter,\n\n    // Indexing plan metrics.\n    pub local_shards: IntGauge,\n    pub remote_shards: IntGauge,\n}\n\nimpl ControlPlaneMetrics {\n    pub fn set_shard_locality_metrics(&self, shard_locality_metrics: ShardLocalityMetrics) {\n        self.local_shards\n            .set(shard_locality_metrics.num_local_shards as i64);\n        self.remote_shards\n            .set(shard_locality_metrics.num_remote_shards as i64);\n    }\n}\n\nimpl Default for ControlPlaneMetrics {\n    fn default() -> Self {\n        let open_shards = new_gauge_vec(\n          
  \"shards\",\n            \"Number of open and closed shards tracked by the ingest controller\",\n            \"control_plane\",\n            &[(\"state\", \"open\")],\n            [\"index_id\"],\n        );\n        let closed_shards = new_gauge_vec(\n            \"shards\",\n            \"Number of open and closed shards tracked by the ingest controller\",\n            \"control_plane\",\n            &[(\"state\", \"closed\")],\n            [\"index_id\"],\n        );\n        let indexed_shards = new_gauge_vec(\n            \"indexed_shards\",\n            \"Number of (remote/local) shards in the indexing plan\",\n            \"control_plane\",\n            &[],\n            [\"locality\"],\n        );\n        let local_shards = indexed_shards.with_label_values([\"local\"]);\n        let remote_shards = indexed_shards.with_label_values([\"remote\"]);\n\n        ControlPlaneMetrics {\n            indexes_total: new_gauge(\n                \"indexes_total\",\n                \"Number of indexes tracked by the control plane.\",\n                \"control_plane\",\n                &[],\n            ),\n            open_shards,\n            closed_shards,\n            apply_plan_total: new_counter(\n                \"apply_plan_total\",\n                \"Number of control plane `apply plan` operations.\",\n                \"control_plane\",\n                &[],\n            ),\n            rebalance_shards: new_gauge(\n                \"rebalance_shards\",\n                \"Number of shards rebalanced by the control plane.\",\n                \"control_plane\",\n                &[],\n            ),\n            restart_total: new_counter(\n                \"restart_total\",\n                \"Number of control plane restarts.\",\n                \"control_plane\",\n                &[],\n            ),\n            schedule_total: new_counter(\n                \"schedule_total\",\n                \"Number of control plane `schedule` operations.\",\n              
  \"control_plane\",\n                &[],\n            ),\n            metastore_error_aborted: new_counter(\n                \"metastore_error_aborted\",\n                \"Number of aborted metastore transaction (= do not trigger a control plane \\\n                 restart)\",\n                \"control_plane\",\n                &[],\n            ),\n            metastore_error_maybe_executed: new_counter(\n                \"metastore_error_maybe_executed\",\n                \"Number of metastore transaction with an uncertain outcome (= do trigger a \\\n                 control plane restart)\",\n                \"control_plane\",\n                &[],\n            ),\n            local_shards,\n            remote_shards,\n        }\n    }\n}\n\npub static CONTROL_PLANE_METRICS: Lazy<ControlPlaneMetrics> =\n    Lazy::new(ControlPlaneMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/model/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod shard_table;\n\nuse std::borrow::Cow;\nuse std::collections::BTreeSet;\nuse std::mem;\nuse std::ops::Deref;\nuse std::time::Instant;\n\nuse anyhow::bail;\nuse fnv::{FnvHashMap, FnvHashSet};\nuse futures::StreamExt;\nuse quickwit_common::Progress;\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_config::{INGEST_V2_SOURCE_ID, IndexConfig, SourceConfig, enable_ingest_v2};\nuse quickwit_ingest::ShardInfos;\nuse quickwit_metastore::{AddSourceRequestExt, IndexMetadata, ListIndexesMetadataResponseExt};\nuse quickwit_proto::control_plane::ControlPlaneResult;\nuse quickwit_proto::ingest::Shard;\nuse quickwit_proto::metastore::{\n    self, AddSourceRequest, EntityKind, ListIndexesMetadataRequest, ListShardsSubrequest,\n    ListShardsSubresponse, MetastoreError, MetastoreResult, MetastoreService,\n    MetastoreServiceClient, SourceType, ToggleSourceRequest,\n};\nuse quickwit_proto::types::{IndexId, IndexUid, NodeId, ShardId, SourceId, SourceUid};\npub(super) use shard_table::{ScalingMode, ShardEntry, ShardLocations, ShardStats, ShardTable};\nuse tracing::{debug, error, info, instrument, warn};\n\n/// The control plane maintains a model in sync with the metastore.\n///\n/// The model stays consistent with the metastore, because all\n/// the mutations (create/delete index, add/delete source, etc.) 
go through the control plane.\n///\n/// If a mutation yields an error, the control plane is killed\n/// and restarted.\n///\n/// Upon startup, it loads its entire state from the metastore.\n#[derive(Default, Debug)]\npub(crate) struct ControlPlaneModel {\n    index_uid_table: FnvHashMap<IndexId, IndexUid>,\n    index_table: FnvHashMap<IndexUid, IndexMetadata>,\n    shard_table: ShardTable,\n}\n\nimpl ControlPlaneModel {\n    /// Clears the entire state of the model.\n    pub fn clear(&mut self) {\n        *self = Default::default();\n    }\n\n    pub fn num_indexes(&self) -> usize {\n        self.index_table.len()\n    }\n\n    pub fn num_sources(&self) -> usize {\n        self.shard_table.num_sources()\n    }\n\n    pub fn shard_locations(&self) -> ShardLocations<'_> {\n        self.shard_table.shard_locations()\n    }\n\n    #[cfg(test)]\n    pub fn num_shards(&self) -> usize {\n        self.shard_table.num_shards()\n    }\n\n    #[instrument(skip_all)]\n    pub async fn load_from_metastore(\n        &mut self,\n        metastore: &mut MetastoreServiceClient,\n        progress: &Progress,\n    ) -> ControlPlaneResult<()> {\n        const BATCH_SIZE: usize = 500;\n\n        let now = Instant::now();\n        self.clear();\n\n        let indexes_metadata = progress\n            .protect_future(metastore.list_indexes_metadata(ListIndexesMetadataRequest::all()))\n            .await?\n            .deserialize_indexes_metadata()\n            .await?;\n\n        let num_indexes = indexes_metadata.len();\n        self.index_table.reserve(num_indexes);\n\n        for index_metadata in indexes_metadata {\n            self.add_index(index_metadata);\n        }\n        self.create_or_enable_ingest_v2_sources_if_necessary(metastore, progress)\n            .await?;\n\n        let mut num_sources = 0;\n        let mut num_shards = 0;\n\n        let mut next_list_shards_request = metastore::ListShardsRequest::default();\n\n        for (idx, index_metadata) in 
self.index_table.values().enumerate() {\n            for source_config in index_metadata.sources.values() {\n                num_sources += 1;\n\n                if source_config.source_type() == SourceType::IngestV2 {\n                    let request = ListShardsSubrequest {\n                        index_uid: index_metadata.index_uid.clone().into(),\n                        source_id: source_config.source_id.clone(),\n                        shard_state: None,\n                    };\n                    next_list_shards_request.subrequests.push(request);\n                }\n            }\n            let num_subrequests = next_list_shards_request.subrequests.len();\n\n            if num_subrequests > 0 && (num_subrequests >= BATCH_SIZE || idx == num_indexes - 1) {\n                let list_shards_request = mem::take(&mut next_list_shards_request);\n                let list_shards_response = progress\n                    .protect_future(metastore.list_shards(list_shards_request))\n                    .await?;\n\n                for list_shards_subresponse in list_shards_response.subresponses {\n                    num_shards += list_shards_subresponse.shards.len();\n\n                    let ListShardsSubresponse {\n                        index_uid,\n                        source_id,\n                        shards,\n                    } = list_shards_subresponse;\n                    let index_uid = index_uid.expect(\"`index_uid` should be a required field\");\n                    self.shard_table\n                        .insert_shards(&index_uid, &source_id, shards);\n                }\n            }\n        }\n        info!(\n            \"synced control plane model with metastore in {} ({num_indexes} indexes, \\\n             {num_sources} sources, {num_shards} shards)\",\n            now.elapsed().pretty_display()\n        );\n        Ok(())\n    }\n\n    pub fn index_uid(&self, index_id: &str) -> Option<&IndexUid> {\n        
self.index_uid_table.get(index_id)\n    }\n\n    pub fn index_metadata(&self, index_uid: &IndexUid) -> Option<&IndexMetadata> {\n        self.index_table.get(index_uid)\n    }\n\n    pub(crate) fn source_metadata(&self, source_uid: &SourceUid) -> Option<&SourceConfig> {\n        self.index_metadata(&source_uid.index_uid)?\n            .sources\n            .get(&source_uid.source_id)\n    }\n\n    fn update_metrics(&self) {\n        crate::metrics::CONTROL_PLANE_METRICS\n            .indexes_total\n            .set(self.index_table.len() as i64);\n    }\n\n    pub(crate) fn source_configs(&self) -> impl Iterator<Item = (SourceUid, &SourceConfig)> + '_ {\n        self.index_table.values().flat_map(|index_metadata| {\n            index_metadata\n                .sources\n                .iter()\n                .map(move |(source_id, source_config)| {\n                    (\n                        SourceUid {\n                            index_uid: index_metadata.index_uid.clone(),\n                            source_id: source_id.clone(),\n                        },\n                        source_config,\n                    )\n                })\n        })\n    }\n\n    pub(crate) fn add_index(&mut self, index_metadata: IndexMetadata) {\n        let index_uid = index_metadata.index_uid.clone();\n        let index_id = index_uid.index_id.clone();\n\n        self.index_uid_table.insert(index_id, index_uid.clone());\n\n        for (source_id, source_config) in &index_metadata.sources {\n            if source_config.source_type() == SourceType::IngestV2 {\n                self.shard_table.add_source(&index_uid, source_id);\n            }\n        }\n        self.index_table.insert(index_uid, index_metadata);\n        self.update_metrics();\n    }\n\n    /// Updates the configuration of the specified index, returning an error if\n    /// the index didn't exist.\n    pub(crate) fn update_index_config(\n        &mut self,\n        index_uid: &IndexUid,\n        
index_config: IndexConfig,\n    ) -> anyhow::Result<bool> {\n        let Some(index_model) = self.index_table.get_mut(index_uid) else {\n            bail!(\"index `{}` not found\", index_uid.index_id);\n        };\n        let fp_changed = !index_model.index_config.equals_fingerprint(&index_config);\n        index_model.index_config = index_config;\n        Ok(fp_changed)\n    }\n\n    pub(crate) fn delete_index(&mut self, index_uid: &IndexUid) {\n        self.index_table.remove(index_uid);\n        self.index_uid_table.remove(&index_uid.index_id);\n        self.shard_table.delete_index(&index_uid.index_id);\n        self.update_metrics();\n    }\n\n    /// Adds a source to a given index. Returns an error if the source already\n    /// exists.\n    pub(crate) fn add_source(\n        &mut self,\n        index_uid: &IndexUid,\n        source_config: SourceConfig,\n    ) -> ControlPlaneResult<()> {\n        let index_metadata = self.index_table.get_mut(index_uid).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.to_string(),\n            })\n        })?;\n        index_metadata.add_source(source_config.clone())?;\n\n        if source_config.source_type() == SourceType::IngestV2 {\n            self.shard_table\n                .add_source(index_uid, &source_config.source_id);\n        }\n        Ok(())\n    }\n\n    pub(crate) fn update_source(\n        &mut self,\n        index_uid: &IndexUid,\n        source_config: SourceConfig,\n    ) -> ControlPlaneResult<()> {\n        let index_metadata = self.index_table.get_mut(index_uid).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.to_string(),\n            })\n        })?;\n        index_metadata.update_source(source_config)?;\n        Ok(())\n    }\n\n    pub(crate) fn delete_source(&mut self, source_uid: &SourceUid) {\n        // Removing shards from shard table.\n        self.shard_table\n         
   .delete_source(&source_uid.index_uid, &source_uid.source_id);\n        // Remove source from index metadata.\n        let Some(index_metadata) = self.index_table.get_mut(&source_uid.index_uid) else {\n            warn!(index_uid=%source_uid.index_uid, source_id=%source_uid.source_id, \"delete source: index not found\");\n            return;\n        };\n        if index_metadata\n            .sources\n            .remove(&source_uid.source_id)\n            .is_none()\n        {\n            warn!(index_uid=%source_uid.index_uid, source_id=%source_uid.source_id, \"delete source: source not found\");\n        };\n    }\n\n    /// Returns `true` if the source status has changed, `false` otherwise.\n    /// Returns an error if the source could not be found.\n    pub(crate) fn toggle_source(\n        &mut self,\n        index_uid: &IndexUid,\n        source_id: &SourceId,\n        enable: bool,\n    ) -> ControlPlaneResult<bool> {\n        let index_model = self.index_table.get_mut(index_uid).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.to_string(),\n            })\n        })?;\n        let source_config = index_model.sources.get_mut(source_id).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Source {\n                index_id: index_uid.to_string(),\n                source_id: source_id.clone(),\n            })\n        })?;\n        let has_changed = source_config.enabled != enable;\n        source_config.enabled = enable;\n        Ok(has_changed)\n    }\n\n    pub(crate) fn all_shards(&self) -> impl Iterator<Item = &ShardEntry> + '_ {\n        self.shard_table.all_shards()\n    }\n\n    pub(crate) fn all_shards_with_source(\n        &self,\n    ) -> impl Iterator<Item = (&SourceUid, impl Iterator<Item = &ShardEntry>)> + '_ {\n        self.shard_table.all_shards_with_source()\n    }\n\n    pub fn list_shards_for_node(\n        &self,\n        ingester: &NodeId,\n    ) -> impl 
Deref<Target = FnvHashMap<SourceUid, BTreeSet<ShardId>>> + '_ {\n        if let Some(shards_for_node) = self.shard_table.list_shards_for_node(ingester) {\n            Cow::Borrowed(shards_for_node)\n        } else {\n            Cow::Owned(FnvHashMap::default())\n        }\n    }\n\n    pub fn list_shards_for_index<'a>(\n        &'a self,\n        index_uid: &'a IndexUid,\n    ) -> impl Iterator<Item = &'a ShardEntry> + 'a {\n        self.shard_table.list_shards_for_index(index_uid)\n    }\n\n    /// Lists the shards of a given source. Returns `None` if the source does not exist.\n    pub fn get_shards_for_source(\n        &self,\n        source_uid: &SourceUid,\n    ) -> Option<&FnvHashMap<ShardId, ShardEntry>> {\n        self.shard_table.get_shards(source_uid)\n    }\n\n    /// Lists the shards of a given source. Returns `None` if the source does not exist.\n    pub fn get_shards_for_source_mut(\n        &mut self,\n        source_uid: &SourceUid,\n    ) -> Option<&mut FnvHashMap<ShardId, ShardEntry>> {\n        self.shard_table.get_shards_mut(source_uid)\n    }\n\n    /// Inserts the shards that have just been opened by calling `open_shards` on the metastore.\n    pub fn insert_shards(\n        &mut self,\n        index_uid: &IndexUid,\n        source_id: &SourceId,\n        opened_shards: Vec<Shard>,\n    ) {\n        self.shard_table\n            .insert_shards(index_uid, source_id, opened_shards);\n    }\n\n    /// Finds open shards for a given index and source and whose leaders are not in the set of\n    /// unavailable ingesters.\n    pub fn find_open_shards(\n        &self,\n        index_uid: &IndexUid,\n        source_id: &SourceId,\n        unavailable_leaders: &FnvHashSet<NodeId>,\n    ) -> Option<Vec<ShardEntry>> {\n        self.shard_table\n            .find_open_shards(index_uid, source_id, unavailable_leaders)\n    }\n\n    /// Updates the state and ingestion rate of the shards according to the given shard infos.\n    pub fn update_shards(\n        
&mut self,\n        source_uid: &SourceUid,\n        shard_infos: &ShardInfos,\n    ) -> ShardStats {\n        debug!(\n            index_uid=%source_uid.index_uid,\n            source_id=%source_uid.source_id,\n            \"updating shards\"\n        );\n        self.shard_table.update_shards(source_uid, shard_infos)\n    }\n\n    /// Sets the state of the shards identified by their index UID, source ID, and shard IDs to\n    /// `Closed`.\n    pub fn close_shards(&mut self, source_uid: &SourceUid, shard_ids: &[ShardId]) -> Vec<ShardId> {\n        debug!(\n            index_uid=%source_uid.index_uid,\n            source_id=%source_uid.source_id,\n            shard_ids=%shard_ids.pretty_display(),\n            \"closing shards\"\n        );\n        self.shard_table.close_shards(source_uid, shard_ids)\n    }\n\n    /// Removes the shards identified by their index UID, source ID, and shard IDs.\n    pub fn delete_shards(&mut self, source_uid: &SourceUid, shard_ids: &[ShardId]) {\n        debug!(\n            index_uid=%source_uid.index_uid,\n            source_id=%source_uid.source_id,\n            shard_ids=%shard_ids.pretty_display(),\n            \"deleting shards\"\n        );\n        self.shard_table.delete_shards(source_uid, shard_ids);\n    }\n\n    pub fn acquire_scaling_permits(\n        &mut self,\n        source_uid: &SourceUid,\n        scaling_mode: ScalingMode,\n    ) -> Option<bool> {\n        self.shard_table\n            .acquire_scaling_permits(source_uid, scaling_mode)\n    }\n\n    pub fn drain_scaling_permits(&mut self, source_uid: &SourceUid, scaling_mode: ScalingMode) {\n        self.shard_table\n            .drain_scaling_permits(source_uid, scaling_mode)\n    }\n\n    pub fn release_scaling_permits(&mut self, source_uid: &SourceUid, scaling_mode: ScalingMode) {\n        self.shard_table\n            .release_scaling_permits(source_uid, scaling_mode)\n    }\n\n    // Quickwit 0.9 uses the ingest v2 source by default. 
For indexes created prior to 0.9, we need\n    // to ensure that the ingest v2 source is created and enabled if necessary.\n    //\n    // TODO(#5604)\n    async fn create_or_enable_ingest_v2_sources_if_necessary(\n        &mut self,\n        metastore: &mut MetastoreServiceClient,\n        progress: &Progress,\n    ) -> ControlPlaneResult<()> {\n        // User has voluntarily disabled ingest v2, nothing to do.\n        if !enable_ingest_v2() {\n            return Ok(());\n        }\n        // Indexes for which the ingest v2 source needs to be created.\n        let mut sources_to_create = Vec::new();\n        // Indexes for which the ingest v2 source needs to be enabled.\n        let mut sources_to_enable = Vec::new();\n\n        for (index_uid, index_metadata) in &self.index_table {\n            let ingest_v2_source_opt = index_metadata.sources.get(INGEST_V2_SOURCE_ID);\n\n            if let Some(ingest_v2_source) = ingest_v2_source_opt {\n                if !ingest_v2_source.enabled {\n                    sources_to_enable.push(index_uid.clone());\n                }\n            } else {\n                sources_to_create.push(index_uid.clone());\n            }\n        }\n        self.create_ingest_v2_sources(sources_to_create, metastore, progress)\n            .await?;\n        self.enable_ingest_v2_sources(sources_to_enable, metastore, progress)\n            .await?;\n        Ok(())\n    }\n\n    async fn create_ingest_v2_sources(\n        &mut self,\n        sources_to_create: Vec<IndexUid>,\n        metastore: &mut MetastoreServiceClient,\n        progress: &Progress,\n    ) -> MetastoreResult<()> {\n        let num_sources_to_create = sources_to_create.len();\n        let now = Instant::now();\n        info!(\"adding ingest v2 source to {num_sources_to_create} indexes\");\n\n        let mut add_source_futures = Vec::with_capacity(num_sources_to_create);\n\n        for index_uid in sources_to_create {\n            let metastore = metastore.clone();\n       
     let source_config = SourceConfig::ingest_v2();\n            let add_source_request =\n                AddSourceRequest::try_from_source_config(index_uid.clone(), &source_config)?;\n            let add_source_future = async move {\n                let add_source_result = metastore.add_source(add_source_request).await;\n                match add_source_result {\n                    Ok(_) => Ok((index_uid, source_config)),\n                    Err(error) => Err((index_uid, error)),\n                }\n            };\n            add_source_futures.push(add_source_future);\n        }\n        let mut add_source_result_stream =\n            futures::stream::iter(add_source_futures).buffer_unordered(100);\n        let mut num_errors = 0;\n\n        while let Some(add_source_result) = progress\n            .protect_future(add_source_result_stream.next())\n            .await\n        {\n            match add_source_result {\n                Ok((index_uid, source_config)) => {\n                    self.add_source(&index_uid, source_config)?;\n                }\n                Err((index_uid, error)) => {\n                    num_errors += 1;\n                    debug!(%error, %index_uid, \"failed to add ingest v2 source to index\");\n                }\n            }\n        }\n        if num_errors > 0 {\n            error!(\"failed to add ingest v2 sources to {num_errors} indexes\");\n        }\n        info!(\n            \"added ingest v2 source to {num_sources_to_create} indexes in {}\",\n            now.elapsed().pretty_display()\n        );\n        Ok(())\n    }\n\n    async fn enable_ingest_v2_sources(\n        &mut self,\n        sources_to_enable: Vec<IndexUid>,\n        metastore: &mut MetastoreServiceClient,\n        progress: &Progress,\n    ) -> MetastoreResult<()> {\n        let num_sources_to_enable = sources_to_enable.len();\n        let now = Instant::now();\n        info!(\"enabling {num_sources_to_enable} ingest v2 sources\");\n\n        let mut 
toggle_source_futures = Vec::with_capacity(num_sources_to_enable);\n\n        for index_uid in sources_to_enable {\n            let metastore = metastore.clone();\n            let toggle_source_request = ToggleSourceRequest {\n                index_uid: index_uid.clone().into(),\n                source_id: INGEST_V2_SOURCE_ID.to_string(),\n                enable: true,\n            };\n            let toggle_source_future = async move {\n                let toggle_source_result = metastore.toggle_source(toggle_source_request).await;\n                match toggle_source_result {\n                    Ok(_) => Ok(index_uid),\n                    Err(error) => Err((index_uid, error)),\n                }\n            };\n            toggle_source_futures.push(toggle_source_future);\n        }\n        let mut toggle_source_result_stream =\n            futures::stream::iter(toggle_source_futures).buffer_unordered(100);\n        let mut num_errors = 0;\n\n        let ingest_v2_source_id = INGEST_V2_SOURCE_ID.to_string();\n\n        while let Some(toggle_source_result) = progress\n            .protect_future(toggle_source_result_stream.next())\n            .await\n        {\n            match toggle_source_result {\n                Ok(index_uid) => {\n                    self.toggle_source(&index_uid, &ingest_v2_source_id, true)?;\n                }\n                Err((index_uid, error)) => {\n                    num_errors += 1;\n                    debug!(%error, %index_uid, \"failed to enable ingest v2 source\");\n                }\n            }\n        }\n        if num_errors > 0 {\n            error!(\"failed to enable {num_errors} ingest v2 sources\");\n        }\n        info!(\n            \"enabled {num_sources_to_enable} ingest v2 sources in {}\",\n            now.elapsed().pretty_display()\n        );\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use metastore::EmptyResponse;\n    use quickwit_config::{INGEST_V2_SOURCE_ID, SourceConfig, 
SourceParams, TransformConfig};\n    use quickwit_metastore::IndexMetadata;\n    use quickwit_proto::ingest::{Shard, ShardState};\n    use quickwit_proto::metastore::{ListIndexesMetadataResponse, MockMetastoreService};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_control_plane_model_load_shard_table() {\n        let index_uid0 = IndexUid::for_test(\"test-index-0\", 0);\n        let index_uid1 = IndexUid::for_test(\"test-index-1\", 0);\n        let index_uid2 = IndexUid::for_test(\"test-index-2\", 0);\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(|request| {\n                assert_eq!(request, ListIndexesMetadataRequest::all());\n\n                let mut index_0 = IndexMetadata::for_test(\"test-index-0\", \"ram:///test-index-0\");\n                let mut source_config = SourceConfig::ingest_v2();\n                source_config.enabled = true;\n                index_0.add_source(source_config.clone()).unwrap();\n\n                let mut index_1 = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n                source_config.enabled = false;\n                index_1.add_source(source_config).unwrap();\n\n                let mut index_2 = IndexMetadata::for_test(\"test-index-2\", \"ram:///test-index-2\");\n                index_2.add_source(SourceConfig::cli()).unwrap();\n\n                let indexes = vec![index_0, index_1, index_2];\n                Ok(ListIndexesMetadataResponse::for_test(indexes))\n            });\n        let index_uid2_clone = index_uid2.clone();\n        mock_metastore\n            .expect_add_source()\n            .return_once(move |request| {\n                assert_eq!(*request.index_uid(), index_uid2_clone);\n\n                let source_config = request.deserialize_source_config().unwrap();\n                assert_eq!(source_config.source_id, INGEST_V2_SOURCE_ID);\n                
assert_eq!(source_config.source_type(), SourceType::IngestV2);\n\n                Ok(EmptyResponse {})\n            });\n        let index_uid1_clone = index_uid1.clone();\n        mock_metastore\n            .expect_toggle_source()\n            .return_once(move |request| {\n                assert_eq!(*request.index_uid(), index_uid1_clone);\n                assert_eq!(request.source_id, INGEST_V2_SOURCE_ID);\n                assert!(request.enable);\n\n                Ok(EmptyResponse {})\n            });\n        let index_uid0_clone = index_uid0.clone();\n        let index_uid1_clone = index_uid1.clone();\n        let index_uid2_clone = index_uid2.clone();\n        mock_metastore\n            .expect_list_shards()\n            .return_once(move |mut request| {\n                assert_eq!(request.subrequests.len(), 3);\n\n                request\n                    .subrequests\n                    .sort_by(|left, right| left.index_uid().cmp(right.index_uid()));\n\n                assert_eq!(request.subrequests[0].index_uid(), &index_uid0_clone);\n                assert_eq!(request.subrequests[0].source_id, INGEST_V2_SOURCE_ID);\n                assert!(request.subrequests[0].shard_state.is_none());\n\n                assert_eq!(request.subrequests[1].index_uid(), &index_uid1_clone);\n                assert_eq!(request.subrequests[1].source_id, INGEST_V2_SOURCE_ID);\n                assert!(request.subrequests[1].shard_state.is_none());\n\n                assert_eq!(request.subrequests[2].index_uid(), &index_uid2_clone);\n                assert_eq!(request.subrequests[2].source_id, INGEST_V2_SOURCE_ID);\n                assert!(request.subrequests[2].shard_state.is_none());\n\n                let subresponses = vec![\n                    metastore::ListShardsSubresponse {\n                        index_uid: Some(index_uid0_clone.clone()),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: vec![Shard {\n         
                   shard_id: Some(ShardId::from(42)),\n                            index_uid: Some(index_uid0_clone.clone()),\n                            source_id: INGEST_V2_SOURCE_ID.to_string(),\n                            shard_state: ShardState::Open as i32,\n                            leader_id: \"node1\".to_string(),\n                            ..Default::default()\n                        }],\n                    },\n                    metastore::ListShardsSubresponse {\n                        index_uid: Some(index_uid1_clone.clone()),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shards: Vec::new(),\n                    },\n                ];\n                let response = metastore::ListShardsResponse { subresponses };\n                Ok(response)\n            });\n        let mut model = ControlPlaneModel::default();\n        let mut metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let progress = Progress::default();\n        model\n            .load_from_metastore(&mut metastore, &progress)\n            .await\n            .unwrap();\n\n        assert_eq!(model.index_table.len(), 3);\n        assert_eq!(*model.index_uid(\"test-index-0\").unwrap(), index_uid0);\n        assert_eq!(*model.index_uid(\"test-index-1\").unwrap(), index_uid1);\n        assert_eq!(*model.index_uid(\"test-index-2\").unwrap(), index_uid2);\n\n        assert_eq!(model.shard_table.num_shards(), 1);\n\n        let source_uid_0 = SourceUid {\n            index_uid: index_uid0.clone(),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n        };\n        let shards: Vec<&ShardEntry> = model\n            .shard_table\n            .get_shards(&source_uid_0)\n            .unwrap()\n            .values()\n            .collect();\n        assert_eq!(shards.len(), 1);\n        assert_eq!(shards[0].shard_id(), ShardId::from(42));\n\n        let source_uid_1 = SourceUid {\n            index_uid: 
index_uid1.clone(),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n        };\n        let shards: Vec<&ShardEntry> = model\n            .shard_table\n            .get_shards(&source_uid_1)\n            .unwrap()\n            .values()\n            .collect();\n        assert_eq!(shards.len(), 0);\n    }\n\n    #[test]\n    fn test_control_plane_model_add_index() {\n        let mut model = ControlPlaneModel::default();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes\");\n        let index_uid = index_metadata.index_uid.clone();\n        model.add_index(index_metadata.clone());\n\n        assert_eq!(model.index_table.len(), 1);\n        assert_eq!(model.index_table.get(&index_uid).unwrap(), &index_metadata);\n\n        assert_eq!(model.index_uid_table.len(), 1);\n        assert_eq!(*model.index_uid(\"test-index\").unwrap(), index_uid);\n    }\n\n    #[test]\n    fn test_control_plane_model_add_index_with_sources() {\n        let mut model = ControlPlaneModel::default();\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes\");\n        index_metadata.add_source(SourceConfig::cli()).unwrap();\n        index_metadata\n            .add_source(SourceConfig::ingest_v2())\n            .unwrap();\n        let index_uid = index_metadata.index_uid.clone();\n        model.add_index(index_metadata.clone());\n\n        assert_eq!(model.index_table.len(), 1);\n        assert_eq!(model.index_table.get(&index_uid).unwrap(), &index_metadata);\n\n        assert_eq!(model.index_uid_table.len(), 1);\n        assert_eq!(*model.index_uid(\"test-index\").unwrap(), index_uid);\n\n        assert_eq!(model.shard_table.num_sources(), 1);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: INGEST_V2_SOURCE_ID.to_string(),\n        };\n        assert_eq!(model.shard_table.get_shards(&source_uid).unwrap().len(), 0);\n    }\n\n    #[test]\n    fn 
test_control_plane_model_update_index_config() {\n        let mut model = ControlPlaneModel::default();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes\");\n        let index_uid = index_metadata.index_uid.clone();\n        model.add_index(index_metadata.clone());\n\n        // Update the index config\n        let mut index_config = index_metadata.index_config.clone();\n        index_config.search_settings.default_search_fields = vec![\"myfield\".to_string()];\n        model\n            .update_index_config(&index_uid, index_config.clone())\n            .unwrap();\n\n        assert_eq!(model.index_table.len(), 1);\n        assert_eq!(\n            model.index_table.get(&index_uid).unwrap().index_config,\n            index_config\n        );\n    }\n\n    #[test]\n    fn test_control_plane_model_update_sources() {\n        let mut model = ControlPlaneModel::default();\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes\");\n        let mut my_source = SourceConfig::for_test(\"my-source\", SourceParams::void());\n        index_metadata.add_source(my_source.clone()).unwrap();\n        index_metadata\n            .add_source(SourceConfig::ingest_v2())\n            .unwrap();\n        let index_uid = index_metadata.index_uid.clone();\n        model.add_index(index_metadata.clone());\n\n        // Update a source\n        my_source.transform_config = Some(TransformConfig::new(\"del(.username)\".to_string(), None));\n        model.update_source(&index_uid, my_source.clone()).unwrap();\n\n        assert_eq!(model.index_table.len(), 1);\n        assert_eq!(\n            model\n                .index_table\n                .get(&index_uid)\n                .unwrap()\n                .sources\n                .get(\"my-source\")\n                .unwrap(),\n            &my_source\n        );\n    }\n\n    #[test]\n    fn test_control_plane_model_delete_index() {\n        let mut model = 
ControlPlaneModel::default();\n\n        let mut index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes\");\n        let index_uid = index_metadata.index_uid.clone();\n        model.delete_index(&index_uid);\n\n        index_metadata\n            .add_source(SourceConfig::ingest_v2())\n            .unwrap();\n        model.add_index(index_metadata);\n\n        model.delete_index(&index_uid);\n\n        assert!(model.index_table.is_empty());\n        assert!(model.index_uid_table.is_empty());\n        assert_eq!(model.shard_table.num_sources(), 0);\n    }\n\n    #[test]\n    fn test_control_plane_model_toggle_source() {\n        let mut model = ControlPlaneModel::default();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes\");\n        let index_uid = index_metadata.index_uid.clone();\n        model.add_index(index_metadata);\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::void());\n        model.add_source(&index_uid, source_config).unwrap();\n        {\n            let has_changed = model\n                .toggle_source(&index_uid, &\"test-source\".to_string(), true)\n                .unwrap();\n            assert!(!has_changed);\n        }\n        {\n            let has_changed = model\n                .toggle_source(&index_uid, &\"test-source\".to_string(), true)\n                .unwrap();\n            assert!(!has_changed);\n        }\n        {\n            let has_changed = model\n                .toggle_source(&index_uid, &\"test-source\".to_string(), false)\n                .unwrap();\n            assert!(has_changed);\n        }\n        {\n            let has_changed = model\n                .toggle_source(&index_uid, &\"test-source\".to_string(), false)\n                .unwrap();\n            assert!(!has_changed);\n        }\n        {\n            let has_changed = model\n                .toggle_source(&index_uid, &\"test-source\".to_string(), true)\n          
      .unwrap();\n            assert!(has_changed);\n        }\n        {\n            let has_changed = model\n                .toggle_source(&index_uid, &\"test-source\".to_string(), true)\n                .unwrap();\n            assert!(!has_changed);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/model/shard_table.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::hash_map::Entry;\nuse std::collections::{BTreeSet, HashMap};\nuse std::ops::{Deref, DerefMut};\nuse std::time::Duration;\n\nuse fnv::{FnvHashMap, FnvHashSet};\nuse quickwit_common::metrics::index_label;\nuse quickwit_common::rate_limiter::{RateLimiter, RateLimiterSettings};\nuse quickwit_common::tower::ConstantRate;\nuse quickwit_ingest::{RateMibPerSec, ShardInfo, ShardInfos};\nuse quickwit_proto::ingest::{Shard, ShardState};\nuse quickwit_proto::types::{IndexUid, NodeId, ShardId, SourceId, SourceUid};\nuse tracing::{error, info, warn};\n\n/// Limits the number of scale up operations that can happen to a source to 5 per minute.\nconst SCALING_UP_RATE_LIMITER_SETTINGS: RateLimiterSettings = RateLimiterSettings {\n    burst_limit: 5,\n    rate_limit: ConstantRate::new(5, Duration::from_secs(60)),\n    refill_period: Duration::from_secs(12),\n};\n\n/// Limits the number of shards that can be closed for scaling down a source to 1 per minute.\nconst SCALING_DOWN_RATE_LIMITER_SETTINGS: RateLimiterSettings = RateLimiterSettings {\n    burst_limit: 1,\n    rate_limit: ConstantRate::new(1, Duration::from_secs(60)),\n    refill_period: Duration::from_secs(60),\n};\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq)]\npub(crate) enum ScalingMode {\n    /// Scale up by adding this number of shards\n    Up(usize),\n    /// Scale down by 
removing one shard\n    Down,\n}\n\n#[derive(Debug, Clone)]\npub(crate) struct ShardEntry {\n    pub shard: Shard,\n    pub short_term_ingestion_rate: RateMibPerSec,\n    pub long_term_ingestion_rate: RateMibPerSec,\n}\n\nimpl Deref for ShardEntry {\n    type Target = Shard;\n\n    fn deref(&self) -> &Self::Target {\n        &self.shard\n    }\n}\n\nimpl DerefMut for ShardEntry {\n    fn deref_mut(&mut self) -> &mut Self::Target {\n        &mut self.shard\n    }\n}\n\nimpl From<Shard> for ShardEntry {\n    fn from(shard: Shard) -> Self {\n        Self {\n            shard,\n            short_term_ingestion_rate: RateMibPerSec::default(),\n            long_term_ingestion_rate: RateMibPerSec::default(),\n        }\n    }\n}\n\n#[derive(Debug)]\npub(crate) struct ShardTableEntry {\n    shard_entries: FnvHashMap<ShardId, ShardEntry>,\n    scaling_up_rate_limiter: RateLimiter,\n    scaling_down_rate_limiter: RateLimiter,\n}\n\nimpl Default for ShardTableEntry {\n    fn default() -> Self {\n        Self {\n            shard_entries: Default::default(),\n            scaling_up_rate_limiter: RateLimiter::from_settings(SCALING_UP_RATE_LIMITER_SETTINGS),\n            scaling_down_rate_limiter: RateLimiter::from_settings(\n                SCALING_DOWN_RATE_LIMITER_SETTINGS,\n            ),\n        }\n    }\n}\n\nimpl ShardTableEntry {\n    fn is_empty(&self) -> bool {\n        self.shard_entries.is_empty()\n    }\n\n    fn shards_stats(&self) -> ShardStats {\n        let mut num_open_shards = 0;\n        let mut num_closed_shards = 0;\n        let mut short_term_ingestion_rate_sum = 0;\n        let mut long_term_ingestion_rate_sum = 0;\n\n        for shard_entry in self.shard_entries.values() {\n            if shard_entry.is_open() {\n                num_open_shards += 1;\n                short_term_ingestion_rate_sum += shard_entry.short_term_ingestion_rate.0 as usize;\n                long_term_ingestion_rate_sum += shard_entry.long_term_ingestion_rate.0 as usize;\n        
    } else if shard_entry.is_closed() {\n                num_closed_shards += 1;\n            }\n        }\n        let avg_short_term_ingestion_rate = if num_open_shards > 0 {\n            short_term_ingestion_rate_sum as f32 / num_open_shards as f32\n        } else {\n            0.0\n        };\n        let avg_long_term_ingestion_rate = if num_open_shards > 0 {\n            long_term_ingestion_rate_sum as f32 / num_open_shards as f32\n        } else {\n            0.0\n        };\n        ShardStats {\n            num_open_shards,\n            num_closed_shards,\n            avg_short_term_ingestion_rate,\n            avg_long_term_ingestion_rate,\n        }\n    }\n}\n\n#[derive(Default)]\npub struct ShardLocations<'a> {\n    shard_locations: HashMap<&'a ShardId, smallvec::SmallVec<[&'a NodeId; 2]>>,\n}\n\nimpl<'a> ShardLocations<'a> {\n    pub(crate) fn add_location(&mut self, shard_id: &'a ShardId, ingester_id: &'a NodeId) {\n        let locations = self.shard_locations.entry(shard_id).or_default();\n        if locations.contains(&ingester_id) {\n            warn!(\"shard {shard_id:?} was registered twice on the same ingester {ingester_id:?}\");\n        } else {\n            locations.push(ingester_id);\n        }\n    }\n\n    /// Returns the list of ingesters holding the given shard.\n    /// No guarantee is made on the order of the returned list.\n    pub fn get_shard_locations(&self, shard_id: &ShardId) -> &[&'a NodeId] {\n        let Some(node_ids) = self.shard_locations.get(shard_id) else {\n            return &[];\n        };\n        node_ids.as_slice()\n    }\n}\n\n// A table that keeps track of the existing shards for each index and source,\n// and for each ingester, the list of shards it is supposed to host.\n//\n// (All mutable methods must maintain these two invariants.)\n#[derive(Debug, Default)]\npub(crate) struct ShardTable {\n    table_entries: FnvHashMap<SourceUid, ShardTableEntry>,\n    ingester_shards: FnvHashMap<NodeId, FnvHashMap<SourceUid, 
BTreeSet<ShardId>>>,\n}\n\n// Removes the shards from the ingester_shards map.\n//\n// This function is used to maintain the shard table invariant.\n// Logs an error if the shard was not found in the ingester_shards map.\nfn remove_shard_from_ingesters_internal(\n    source_uid: &SourceUid,\n    shard: &Shard,\n    ingester_shards: &mut FnvHashMap<NodeId, FnvHashMap<SourceUid, BTreeSet<ShardId>>>,\n) {\n    for node in shard.ingesters() {\n        let ingester_shards = ingester_shards\n            .get_mut(node)\n            .expect(\"shard table reached inconsistent state\");\n        let shard_ids = ingester_shards.get_mut(source_uid).unwrap();\n        let shard_was_removed = shard_ids.remove(shard.shard_id());\n        if !shard_was_removed {\n            error!(\n                \"shard table has reached an inconsistent state. shard {shard:?} was removed from \\\n                 the shard table but was apparently not in the ingester_shards map.\"\n            );\n        }\n    }\n}\n\nimpl ShardTable {\n    /// Returns a ShardLocations object that maps each shard to the list of ingesters hosting it.\n    /// All shards are considered regardless of their state (including unavailable).\n    pub fn shard_locations(&self) -> ShardLocations<'_> {\n        let mut shard_locations = ShardLocations::default();\n        for (ingester_id, source_shards) in &self.ingester_shards {\n            for shard_ids in source_shards.values() {\n                for shard_id in shard_ids {\n                    shard_locations.add_location(shard_id, ingester_id);\n                }\n            }\n        }\n        shard_locations\n    }\n\n    /// Removes all the entries that match the target index ID.\n    pub fn delete_index(&mut self, index_id: &str) {\n        let shards_removed = self\n            .table_entries\n            .iter()\n            .filter(|(source_uid, _)| source_uid.index_uid.index_id == index_id)\n            .flat_map(|(source_uid, shard_table_entry)| {\n  
              shard_table_entry\n                    .shard_entries\n                    .values()\n                    .map(move |shard_entry: &ShardEntry| (source_uid, &shard_entry.shard))\n            });\n        for (source_uid, shard) in shards_removed {\n            remove_shard_from_ingesters_internal(source_uid, shard, &mut self.ingester_shards);\n        }\n        self.table_entries\n            .retain(|source_uid, _| source_uid.index_uid.index_id != index_id);\n        self.check_invariant();\n    }\n\n    /// Checks whether the shard table is consistent.\n    ///\n    /// Panics if it is not.\n    #[allow(clippy::mutable_key_type)]\n    fn check_invariant(&self) {\n        // This function is expensive! Let's not call it in release mode.\n        if !cfg!(debug_assertions) {\n            return;\n        };\n        let mut shard_sets_in_shard_table = FnvHashSet::default();\n        for (source_uid, shard_table_entry) in &self.table_entries {\n            for (shard_id, shard_entry) in &shard_table_entry.shard_entries {\n                debug_assert_eq!(shard_id, shard_entry.shard.shard_id());\n                debug_assert_eq!(&source_uid.index_uid, shard_entry.shard.index_uid());\n                for node in shard_entry.shard.ingesters() {\n                    shard_sets_in_shard_table.insert((node, source_uid, shard_id));\n                }\n            }\n        }\n        for (node, ingester_shards) in &self.ingester_shards {\n            for (source_uid, shard_ids) in ingester_shards {\n                for shard_id in shard_ids {\n                    let shard_table_entry = self.table_entries.get(source_uid).unwrap();\n                    debug_assert!(shard_table_entry.shard_entries.contains_key(shard_id));\n                    debug_assert!(shard_sets_in_shard_table.remove(&(node, source_uid, shard_id)));\n                }\n            }\n        }\n    }\n\n    /// Lists all the shards hosted on a given node, regardless of whether it is a\n  
  /// leader or a follower.\n    pub fn list_shards_for_node(\n        &self,\n        ingester: &NodeId,\n    ) -> Option<&FnvHashMap<SourceUid, BTreeSet<ShardId>>> {\n        self.ingester_shards.get(ingester)\n    }\n\n    pub fn list_shards_for_index<'a>(\n        &'a self,\n        index_uid: &'a IndexUid,\n    ) -> impl Iterator<Item = &'a ShardEntry> + 'a {\n        self.table_entries\n            .iter()\n            .filter(move |(source_uid, _)| source_uid.index_uid == *index_uid)\n            .flat_map(|(_, shard_table_entry)| shard_table_entry.shard_entries.values())\n    }\n\n    pub fn num_sources(&self) -> usize {\n        self.table_entries.len()\n    }\n\n    #[cfg(test)]\n    pub fn num_shards(&self) -> usize {\n        self.table_entries\n            .values()\n            .map(|shard_table_entry| shard_table_entry.shard_entries.len())\n            .sum()\n    }\n\n    /// Adds a new empty entry for the given index and source.\n    ///\n    /// TODO check and document the behavior on error (if the source was already here).\n    pub fn add_source(&mut self, index_uid: &IndexUid, source_id: &SourceId) {\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let table_entry = ShardTableEntry::default();\n        let previous_table_entry_opt = self.table_entries.insert(source_uid, table_entry);\n        if let Some(previous_table_entry) = previous_table_entry_opt\n            && !previous_table_entry.is_empty()\n        {\n            error!(\n                \"shard table entry for index `{}` and source `{}` already exists\",\n                index_uid.index_id, source_id\n            );\n        }\n        self.check_invariant();\n    }\n\n    pub fn delete_source(&mut self, index_uid: &IndexUid, source_id: &SourceId) {\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n      
  let Some(shard_table_entry) = self.table_entries.remove(&source_uid) else {\n            return;\n        };\n        for shard_entry in shard_table_entry.shard_entries.values() {\n            remove_shard_from_ingesters_internal(\n                &source_uid,\n                &shard_entry.shard,\n                &mut self.ingester_shards,\n            );\n        }\n        self.check_invariant();\n    }\n\n    pub(crate) fn all_shards(&self) -> impl Iterator<Item = &ShardEntry> + '_ {\n        self.table_entries\n            .values()\n            .flat_map(|table_entry| table_entry.shard_entries.values())\n    }\n\n    pub(crate) fn all_shards_with_source(\n        &self,\n    ) -> impl Iterator<Item = (&SourceUid, impl Iterator<Item = &ShardEntry>)> + '_ {\n        self.table_entries\n            .iter()\n            .map(|(source, shard_table)| (source, shard_table.shard_entries.values()))\n    }\n\n    /// Lists the shards of a given source. Returns `None` if the source does not exist.\n    pub fn get_shards(&self, source_uid: &SourceUid) -> Option<&FnvHashMap<ShardId, ShardEntry>> {\n        self.table_entries\n            .get(source_uid)\n            .map(|table_entry| &table_entry.shard_entries)\n    }\n\n    /// Lists the shards of a given source. 
Returns `None` if the source does not exist.\n    pub fn get_shards_mut(\n        &mut self,\n        source_uid: &SourceUid,\n    ) -> Option<&mut FnvHashMap<ShardId, ShardEntry>> {\n        self.table_entries\n            .get_mut(source_uid)\n            .map(|table_entry| &mut table_entry.shard_entries)\n    }\n\n    /// Inserts the shards into the shard table.\n    pub fn insert_shards(\n        &mut self,\n        index_uid: &IndexUid,\n        source_id: &SourceId,\n        opened_shards: Vec<Shard>,\n    ) {\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        for shard in &opened_shards {\n            if shard.index_uid() != &source_uid.index_uid || shard.source_id != source_uid.source_id\n            {\n                panic!(\n                    \"shard source UID `{}/{}` does not match source UID `{source_uid}`\",\n                    shard.index_uid(),\n                    shard.source_id,\n                );\n            }\n        }\n        for shard in &opened_shards {\n            for node in shard.ingesters() {\n                let ingester_shards = self.ingester_shards.entry(node.to_owned()).or_default();\n                let shard_ids = ingester_shards.entry(source_uid.clone()).or_default();\n                shard_ids.insert(shard.shard_id().clone());\n            }\n        }\n        match self.table_entries.entry(source_uid.clone()) {\n            Entry::Occupied(mut entry) => {\n                let table_entry = entry.get_mut();\n                for opened_shard in opened_shards {\n                    // We only insert shards that we don't know about because the control plane\n                    // knows more about the state of the shards than the metastore.\n                    table_entry\n                        .shard_entries\n                        .entry(opened_shard.shard_id().clone())\n                        .or_insert_with(|| 
ShardEntry::from(opened_shard));\n                }\n            }\n            // This should never happen if the control plane view is consistent with the state of\n            // the metastore, so should we panic here? Warnings are most likely going to go\n            // unnoticed.\n            Entry::Vacant(entry) => {\n                warn!(\n                    \"control plane inconsistent with metastore: inserting shards for a \\\n                     non-existing source (please report)\"\n                );\n                let shard_entries: FnvHashMap<ShardId, ShardEntry> = opened_shards\n                    .into_iter()\n                    .map(|shard| (shard.shard_id().clone(), shard.into()))\n                    .collect();\n                let table_entry = ShardTableEntry {\n                    shard_entries,\n                    ..Default::default()\n                };\n                entry.insert(table_entry);\n            }\n        }\n        // Let's now update the open shard metrics for this specific index.\n        self.update_shard_metrics_for_source_uid(&source_uid);\n        self.check_invariant();\n    }\n\n    /// Finds open shards for a given index and source and whose leaders are not in the set of\n    /// unavailable ingesters.\n    pub fn find_open_shards(\n        &self,\n        index_uid: &IndexUid,\n        source_id: &SourceId,\n        unavailable_leaders: &FnvHashSet<NodeId>,\n    ) -> Option<Vec<ShardEntry>> {\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let table_entry = self.table_entries.get(&source_uid)?;\n        let open_shards: Vec<ShardEntry> = table_entry\n            .shard_entries\n            .values()\n            .filter(|shard_entry| {\n                shard_entry.shard.is_open() && !unavailable_leaders.contains(&shard_entry.leader_id)\n            })\n            .cloned()\n            .collect();\n        
Some(open_shards)\n    }\n\n    pub fn update_shard_metrics_for_source_uid(&self, source_uid: &SourceUid) {\n        let Some(table_entry) = self.table_entries.get(source_uid) else {\n            return;\n        };\n        let index_id = source_uid.index_uid.index_id.as_str();\n        let index_label = index_label(index_id);\n\n        // If `index_label(index_id)` returns `index_id`, then per-index metrics are enabled and we\n        // can update the metrics for this specific index.\n        if index_label == index_id {\n            let shard_stats = table_entry.shards_stats();\n            crate::metrics::CONTROL_PLANE_METRICS\n                .open_shards\n                .with_label_values([index_label])\n                .set(shard_stats.num_open_shards as i64);\n            crate::metrics::CONTROL_PLANE_METRICS\n                .closed_shards\n                .with_label_values([index_label])\n                .set(shard_stats.num_closed_shards as i64);\n            return;\n        }\n        // Per-index metrics are disabled, so we update the metrics for all sources.\n        let mut num_open_shards = 0;\n        let mut num_closed_shards = 0;\n\n        for shard_entry in self.all_shards() {\n            if shard_entry.is_open() {\n                num_open_shards += 1;\n            } else if shard_entry.is_closed() {\n                num_closed_shards += 1;\n            }\n        }\n        crate::metrics::CONTROL_PLANE_METRICS\n            .open_shards\n            .with_label_values([index_label])\n            .set(num_open_shards as i64);\n        crate::metrics::CONTROL_PLANE_METRICS\n            .closed_shards\n            .with_label_values([index_label])\n            .set(num_closed_shards as i64);\n    }\n\n    pub fn update_shards(\n        &mut self,\n        source_uid: &SourceUid,\n        shard_infos: &ShardInfos,\n    ) -> ShardStats {\n        let Some(table_entry) = self.table_entries.get_mut(source_uid) else {\n            return 
ShardStats::default();\n        };\n        for shard_info in shard_infos {\n            let ShardInfo {\n                shard_id,\n                shard_state,\n                short_term_ingestion_rate,\n                long_term_ingestion_rate,\n            } = shard_info;\n\n            if let Some(shard_entry) = table_entry.shard_entries.get_mut(shard_id) {\n                shard_entry.short_term_ingestion_rate = *short_term_ingestion_rate;\n                shard_entry.long_term_ingestion_rate = *long_term_ingestion_rate;\n                // `ShardInfos` are broadcasted via Chitchat and eventually consistent. As a\n                // result, we can only trust the `Closed` state, which is final.\n                if shard_state.is_closed() {\n                    shard_entry.set_shard_state(ShardState::Closed);\n                }\n            }\n        }\n        table_entry.shards_stats()\n    }\n\n    /// Sets the state of the shards identified by their index UID, source ID, and shard IDs to\n    /// `Closed`.\n    pub fn close_shards(&mut self, source_uid: &SourceUid, shard_ids: &[ShardId]) -> Vec<ShardId> {\n        let Some(table_entry) = self.table_entries.get_mut(source_uid) else {\n            return Vec::new();\n        };\n        let mut closed_shard_ids = Vec::new();\n\n        for shard_id in shard_ids {\n            if let Some(shard_entry) = table_entry.shard_entries.get_mut(shard_id) {\n                if !shard_entry.is_closed() {\n                    shard_entry.set_shard_state(ShardState::Closed);\n                    closed_shard_ids.push(shard_id.clone());\n                }\n            } else {\n                info!(\n                    index_id=%source_uid.index_uid.index_id,\n                    source_id=%source_uid.source_id,\n                    %shard_id,\n                    \"ignoring attempt to close shard: it is unknown (probably because it has been deleted)\"\n                );\n            }\n        }\n        
self.update_shard_metrics_for_source_uid(source_uid);\n        closed_shard_ids\n    }\n\n    /// Removes the shards identified by their index UID, source ID, and shard IDs.\n    pub fn delete_shards(&mut self, source_uid: &SourceUid, shard_ids: &[ShardId]) {\n        let Some(table_entry) = self.table_entries.get_mut(source_uid) else {\n            return;\n        };\n        let mut shard_entries_to_remove: Vec<ShardEntry> = Vec::new();\n        for shard_id in shard_ids {\n            if let Some(shard_entry) = table_entry.shard_entries.remove(shard_id) {\n                shard_entries_to_remove.push(shard_entry);\n            } else {\n                warn!(shard=%shard_id, \"deleting a non-existing shard\");\n            }\n        }\n        for shard_entry in shard_entries_to_remove {\n            remove_shard_from_ingesters_internal(\n                source_uid,\n                &shard_entry.shard,\n                &mut self.ingester_shards,\n            );\n        }\n        self.check_invariant();\n        self.update_shard_metrics_for_source_uid(source_uid);\n    }\n\n    pub fn acquire_scaling_permits(\n        &mut self,\n        source_uid: &SourceUid,\n        scaling_mode: ScalingMode,\n    ) -> Option<bool> {\n        let table_entry = self.table_entries.get_mut(source_uid)?;\n        let scaling_rate_limiter = match scaling_mode {\n            ScalingMode::Up(_) => &mut table_entry.scaling_up_rate_limiter,\n            ScalingMode::Down => &mut table_entry.scaling_down_rate_limiter,\n        };\n        Some(scaling_rate_limiter.acquire(1))\n    }\n\n    pub fn drain_scaling_permits(&mut self, source_uid: &SourceUid, scaling_mode: ScalingMode) {\n        if let Some(table_entry) = self.table_entries.get_mut(source_uid) {\n            let scaling_rate_limiter = match scaling_mode {\n                ScalingMode::Up(_) => &mut table_entry.scaling_up_rate_limiter,\n                ScalingMode::Down => &mut table_entry.scaling_down_rate_limiter,\n    
        };\n            scaling_rate_limiter.drain();\n        }\n    }\n\n    pub fn release_scaling_permits(&mut self, source_uid: &SourceUid, scaling_mode: ScalingMode) {\n        if let Some(table_entry) = self.table_entries.get_mut(source_uid) {\n            let scaling_rate_limiter = match scaling_mode {\n                ScalingMode::Up(_) => &mut table_entry.scaling_up_rate_limiter,\n                ScalingMode::Down => &mut table_entry.scaling_down_rate_limiter,\n            };\n            scaling_rate_limiter.release(1);\n        }\n    }\n}\n\n#[derive(Clone, Copy, Default)]\npub(crate) struct ShardStats {\n    pub num_open_shards: usize,\n    pub num_closed_shards: usize,\n    /// Average short-term ingestion rate (MiB/s) over all open shards.\n    pub avg_short_term_ingestion_rate: f32,\n    /// Average long-term ingestion rate (MiB/s) over all open shards.\n    pub avg_long_term_ingestion_rate: f32,\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::BTreeSet;\n\n    use itertools::Itertools;\n    use quickwit_proto::ingest::Shard;\n\n    use super::*;\n\n    impl ShardTableEntry {\n        pub fn shards(&self) -> Vec<Shard> {\n            self.shard_entries\n                .values()\n                .map(|shard_entry| shard_entry.shard.clone())\n                .sorted_unstable_by(|left, right| left.shard_id.cmp(&right.shard_id))\n                .collect()\n        }\n    }\n\n    impl ShardTable {\n        pub fn find_open_shards_sorted(\n            &self,\n            index_uid: &IndexUid,\n            source_id: &SourceId,\n            unavailable_leaders: &FnvHashSet<NodeId>,\n        ) -> Option<Vec<ShardEntry>> {\n            self.find_open_shards(index_uid, source_id, unavailable_leaders)\n                .map(|mut shards| {\n                    shards.sort_unstable_by(|left, right| {\n                        left.shard.shard_id.cmp(&right.shard.shard_id)\n                    });\n                    shards\n                })\n      
  }\n    }\n\n    #[test]\n    fn test_shard_table_delete_index() {\n        let mut shard_table = ShardTable::default();\n        shard_table.delete_index(\"test-index\");\n\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index-foo\", 0);\n        let source_id_0 = \"test-source-0\".to_string();\n        shard_table.add_source(&index_uid_0, &source_id_0);\n\n        let source_id_1 = \"test-source-1\".to_string();\n        shard_table.add_source(&index_uid_0, &source_id_1);\n\n        let index_uid_1: IndexUid = IndexUid::for_test(\"test-index-bar\", 1);\n        shard_table.add_source(&index_uid_1, &source_id_0);\n\n        shard_table.delete_index(\"test-index-foo\");\n        assert_eq!(shard_table.table_entries.len(), 1);\n\n        assert!(shard_table.table_entries.contains_key(&SourceUid {\n            index_uid: index_uid_1,\n            source_id: source_id_0\n        }));\n    }\n\n    #[test]\n    fn test_shard_table_add_source() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let mut shard_table = ShardTable::default();\n        shard_table.add_source(&index_uid, &source_id);\n        assert_eq!(shard_table.table_entries.len(), 1);\n\n        let source_uid = SourceUid {\n            index_uid,\n            source_id,\n        };\n        let table_entry = shard_table.table_entries.get(&source_uid).unwrap();\n        assert!(table_entry.shard_entries.is_empty());\n    }\n\n    #[test]\n    fn test_shard_table_list_shards() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let mut shard_table = ShardTable::default();\n\n        assert!(shard_table.get_shards(&source_uid).is_none());\n\n        shard_table.add_source(&index_uid, 
&source_id);\n        let shards = shard_table.get_shards(&source_uid).unwrap();\n        assert_eq!(shards.len(), 0);\n\n        let shard_01 = Shard {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Closed as i32,\n            ..Default::default()\n        };\n        shard_table.insert_shards(&index_uid, &source_id, vec![shard_01]);\n\n        let shards = shard_table.get_shards(&source_uid).unwrap();\n        assert_eq!(shards.len(), 1);\n    }\n\n    #[test]\n    fn test_shard_table_insert_newly_opened_shards() {\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let mut shard_table = ShardTable::default();\n\n        let shard_01 = Shard {\n            index_uid: index_uid_0.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        shard_table.insert_shards(&index_uid_0, &source_id, vec![shard_01.clone()]);\n\n        assert_eq!(shard_table.table_entries.len(), 1);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid_0.clone(),\n            source_id: source_id.clone(),\n        };\n        let table_entry = shard_table.table_entries.get(&source_uid).unwrap();\n        let shards = table_entry.shards();\n        assert_eq!(shards.len(), 1);\n        assert_eq!(shards[0], shard_01);\n\n        shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .shard_entries\n            .get_mut(&ShardId::from(1))\n            .unwrap()\n            .set_shard_state(ShardState::Unavailable);\n\n        let shard_02 = 
Shard {\n            index_uid: index_uid_0.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n\n        shard_table.insert_shards(\n            &index_uid_0,\n            &source_id,\n            vec![shard_01.clone(), shard_02.clone()],\n        );\n\n        assert_eq!(shard_table.table_entries.len(), 1);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid_0.clone(),\n            source_id: source_id.clone(),\n        };\n        let table_entry = shard_table.table_entries.get(&source_uid).unwrap();\n        let shards = table_entry.shards();\n        assert_eq!(shards.len(), 2);\n        assert_eq!(shards[0].shard_state(), ShardState::Unavailable);\n        assert_eq!(shards[1], shard_02);\n    }\n\n    #[test]\n    fn test_shard_table_find_open_shards() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let mut shard_table = ShardTable::default();\n        shard_table.add_source(&index_uid, &source_id);\n\n        let mut unavailable_ingesters = FnvHashSet::default();\n\n        let open_shards = shard_table\n            .find_open_shards_sorted(&index_uid, &source_id, &unavailable_ingesters)\n            .unwrap();\n        assert_eq!(open_shards.len(), 0);\n\n        let shard_01 = Shard {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Closed as i32,\n            ..Default::default()\n        };\n        let shard_02 = Shard {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: 
Some(ShardId::from(2)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Unavailable as i32,\n            ..Default::default()\n        };\n        let shard_03 = Shard {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(3)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_04 = Shard {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(4)),\n            leader_id: \"test-leader-1\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        shard_table.insert_shards(\n            &index_uid,\n            &source_id,\n            vec![shard_01, shard_02, shard_03.clone(), shard_04.clone()],\n        );\n        let open_shards = shard_table\n            .find_open_shards_sorted(&index_uid, &source_id, &unavailable_ingesters)\n            .unwrap();\n        assert_eq!(open_shards.len(), 2);\n        assert_eq!(open_shards[0].shard, shard_03);\n        assert_eq!(open_shards[1].shard, shard_04);\n\n        unavailable_ingesters.insert(\"test-leader-0\".into());\n\n        let open_shards = shard_table\n            .find_open_shards_sorted(&index_uid, &source_id, &unavailable_ingesters)\n            .unwrap();\n        assert_eq!(open_shards.len(), 1);\n        assert_eq!(open_shards[0].shard, shard_04);\n    }\n\n    #[test]\n    fn test_shard_table_update_shards() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let mut shard_table = ShardTable::default();\n\n        let shard_01 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n        
    shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_02 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_03 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(3)),\n            shard_state: ShardState::Unavailable as i32,\n            ..Default::default()\n        };\n        let shard_04 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(4)),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        shard_table.insert_shards(\n            &index_uid,\n            &source_id,\n            vec![shard_01, shard_02, shard_03, shard_04],\n        );\n        let source_uid = SourceUid {\n            index_uid,\n            source_id,\n        };\n        let shard_infos = BTreeSet::from_iter([\n            ShardInfo {\n                shard_id: ShardId::from(1),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(1),\n                long_term_ingestion_rate: RateMibPerSec(1),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(2),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(2),\n                long_term_ingestion_rate: RateMibPerSec(2),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(3),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(3),\n                long_term_ingestion_rate: 
RateMibPerSec(3),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(4),\n                shard_state: ShardState::Closed,\n                short_term_ingestion_rate: RateMibPerSec(4),\n                long_term_ingestion_rate: RateMibPerSec(4),\n            },\n            ShardInfo {\n                shard_id: ShardId::from(5),\n                shard_state: ShardState::Open,\n                short_term_ingestion_rate: RateMibPerSec(5),\n                long_term_ingestion_rate: RateMibPerSec(5),\n            },\n        ]);\n        let shard_stats = shard_table.update_shards(&source_uid, &shard_infos);\n        assert_eq!(shard_stats.num_open_shards, 2);\n        assert_eq!(shard_stats.avg_short_term_ingestion_rate, 1.5);\n\n        let shard_entries: Vec<ShardEntry> = shard_table\n            .get_shards(&source_uid)\n            .unwrap()\n            .values()\n            .cloned()\n            .sorted_unstable_by(|left, right| left.shard.shard_id.cmp(&right.shard.shard_id))\n            .collect();\n        assert_eq!(shard_entries.len(), 4);\n\n        assert_eq!(shard_entries[0].shard.shard_id(), ShardId::from(1));\n        assert_eq!(shard_entries[0].shard.shard_state(), ShardState::Open);\n        assert_eq!(shard_entries[0].short_term_ingestion_rate, RateMibPerSec(1));\n\n        assert_eq!(shard_entries[1].shard.shard_id(), ShardId::from(2));\n        assert_eq!(shard_entries[1].shard.shard_state(), ShardState::Open);\n        assert_eq!(shard_entries[1].short_term_ingestion_rate, RateMibPerSec(2));\n\n        assert_eq!(shard_entries[2].shard.shard_id(), ShardId::from(3));\n        assert_eq!(\n            shard_entries[2].shard.shard_state(),\n            ShardState::Unavailable\n        );\n        assert_eq!(shard_entries[2].short_term_ingestion_rate, RateMibPerSec(3));\n\n        assert_eq!(shard_entries[3].shard.shard_id(), ShardId::from(4));\n  
      assert_eq!(shard_entries[3].shard.shard_state(), ShardState::Closed);\n        assert_eq!(shard_entries[3].short_term_ingestion_rate, RateMibPerSec(4));\n    }\n\n    #[test]\n    fn test_shard_table_close_shards() {\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid_1: IndexUid = IndexUid::for_test(\"test-index\", 1);\n        let source_id = \"test-source\".to_string();\n\n        let mut shard_table = ShardTable::default();\n\n        let shard_01 = Shard {\n            index_uid: index_uid_0.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_02 = Shard {\n            index_uid: index_uid_0.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Closed as i32,\n            ..Default::default()\n        };\n        let shard_11 = Shard {\n            index_uid: index_uid_1.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        shard_table.insert_shards(&index_uid_0, &source_id, vec![shard_01, shard_02]);\n        shard_table.insert_shards(&index_uid_1, &source_id, vec![shard_11]);\n\n        let source_uid_0 = SourceUid {\n            index_uid: index_uid_0,\n            source_id,\n        };\n        let closed_shard_ids = shard_table.close_shards(\n            &source_uid_0,\n            &[ShardId::from(1), ShardId::from(2), ShardId::from(3)],\n        );\n        assert_eq!(closed_shard_ids, &[ShardId::from(1)]);\n\n        let 
table_entry = shard_table.table_entries.get(&source_uid_0).unwrap();\n        let shards = table_entry.shards();\n        assert_eq!(shards[0].shard_state(), ShardState::Closed);\n    }\n\n    #[test]\n    fn test_shard_table_delete_shards() {\n        let mut shard_table = ShardTable::default();\n\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid_1: IndexUid = IndexUid::for_test(\"test-index\", 1);\n        let source_id = \"test-source\".to_string();\n\n        let shard_01 = Shard {\n            index_uid: index_uid_0.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_02 = Shard {\n            index_uid: index_uid_0.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_11 = Shard {\n            index_uid: index_uid_1.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-leader-0\".to_string(),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        shard_table.insert_shards(&index_uid_0, &source_id, vec![shard_01.clone(), shard_02]);\n        shard_table.insert_shards(&index_uid_1, &source_id, vec![shard_11]);\n\n        let source_uid_0 = SourceUid {\n            index_uid: index_uid_0.clone(),\n            source_id: source_id.clone(),\n        };\n        shard_table.delete_shards(&source_uid_0, &[ShardId::from(2)]);\n\n        let source_uid_1 = SourceUid {\n            index_uid: index_uid_1.clone(),\n            source_id: 
source_id.clone(),\n        };\n        shard_table.delete_shards(&source_uid_1, &[ShardId::from(1)]);\n\n        assert_eq!(shard_table.table_entries.len(), 2);\n\n        let table_entry = shard_table.table_entries.get(&source_uid_0).unwrap();\n        let shards = table_entry.shards();\n        assert_eq!(shards.len(), 1);\n        assert_eq!(shards[0], shard_01);\n\n        let table_entry = shard_table.table_entries.get(&source_uid_1).unwrap();\n        assert!(table_entry.is_empty());\n    }\n\n    #[test]\n    fn test_shard_table_acquire_scaling_up_permits() {\n        let mut shard_table = ShardTable::default();\n\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        assert!(\n            shard_table\n                .acquire_scaling_permits(&source_uid, ScalingMode::Up(1))\n                .is_none()\n        );\n\n        shard_table.add_source(&index_uid, &source_id);\n\n        let previous_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_up_rate_limiter\n            .available_permits();\n\n        assert!(\n            shard_table\n                .acquire_scaling_permits(&source_uid, ScalingMode::Up(1))\n                .unwrap()\n        );\n\n        let new_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_up_rate_limiter\n            .available_permits();\n\n        assert_eq!(new_available_permits, previous_available_permits - 1);\n    }\n\n    #[test]\n    fn test_shard_table_acquire_scaling_down_permits() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        let 
mut shard_table = ShardTable::default();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        assert!(\n            shard_table\n                .acquire_scaling_permits(&source_uid, ScalingMode::Down)\n                .is_none()\n        );\n\n        shard_table.add_source(&index_uid, &source_id);\n\n        let previous_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_down_rate_limiter\n            .available_permits();\n\n        assert!(\n            shard_table\n                .acquire_scaling_permits(&source_uid, ScalingMode::Down)\n                .unwrap()\n        );\n\n        let new_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_down_rate_limiter\n            .available_permits();\n\n        assert_eq!(new_available_permits, previous_available_permits - 1);\n    }\n\n    #[test]\n    fn test_shard_table_release_scaling_up_permits() {\n        let mut shard_table = ShardTable::default();\n\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        shard_table.add_source(&index_uid, &source_id);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let previous_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_up_rate_limiter\n            .available_permits();\n\n        assert!(\n            shard_table\n                .acquire_scaling_permits(&source_uid, ScalingMode::Up(1))\n                .unwrap()\n        );\n\n        shard_table.release_scaling_permits(&source_uid, ScalingMode::Up(1));\n\n        let 
new_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_up_rate_limiter\n            .available_permits();\n\n        assert_eq!(new_available_permits, previous_available_permits);\n    }\n\n    #[test]\n    fn test_shard_table_release_scaling_down_permits() {\n        let mut shard_table = ShardTable::default();\n\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n\n        shard_table.add_source(&index_uid, &source_id);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: source_id.clone(),\n        };\n        let previous_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_down_rate_limiter\n            .available_permits();\n\n        assert!(\n            shard_table\n                .acquire_scaling_permits(&source_uid, ScalingMode::Down)\n                .unwrap()\n        );\n\n        shard_table.release_scaling_permits(&source_uid, ScalingMode::Down);\n\n        let new_available_permits = shard_table\n            .table_entries\n            .get_mut(&source_uid)\n            .unwrap()\n            .scaling_down_rate_limiter\n            .available_permits();\n\n        assert_eq!(new_available_permits, previous_available_permits);\n    }\n\n    #[test]\n    fn test_shard_locations() {\n        let shard1 = ShardId::from(\"shard1\");\n        let shard2 = ShardId::from(\"shard1\");\n        let unlisted_shard = ShardId::from(\"unlisted\");\n        let node1 = NodeId::new(\"node1\".to_string());\n        let node2 = NodeId::new(\"node2\".to_string());\n        let mut shard_locations = ShardLocations::default();\n        shard_locations.add_location(&shard1, &node1);\n        shard_locations.add_location(&shard1, &node2);\n        // add location called 
several times should be counted once.\n        shard_locations.add_location(&shard2, &node2);\n        assert_eq!(\n            shard_locations.get_shard_locations(&shard1),\n            &[&node1, &node2]\n        );\n        assert_eq!(\n            shard_locations.get_shard_locations(&shard2),\n            &[&node1, &node2]\n        );\n        // If the shard is not listed, we do not panic but just return an empty list.\n        assert!(\n            shard_locations\n                .get_shard_locations(&unlisted_shard)\n                .is_empty()\n        );\n    }\n\n    #[test]\n    fn test_shard_table_shard_locations() {\n        let mut shard_table = ShardTable::default();\n\n        let index_uid0: IndexUid = IndexUid::for_test(\"test-index0\", 0);\n        let source_id = \"test-source0\".to_string();\n        shard_table.add_source(&index_uid0, &source_id);\n\n        let index_uid1: IndexUid = IndexUid::for_test(\"test-index1\", 0);\n        let source_id = \"test-source1\".to_string();\n        shard_table.add_source(&index_uid1, &source_id);\n\n        let source_uid0 = SourceUid {\n            index_uid: index_uid0.clone(),\n            source_id: source_id.clone(),\n        };\n\n        let source_uid1 = SourceUid {\n            index_uid: index_uid1.clone(),\n            source_id: source_id.clone(),\n        };\n\n        let make_shard = |source_uid: &SourceUid,\n                          leader_id: &str,\n                          shard_id: u64,\n                          follower_id: Option<&str>,\n                          shard_state: ShardState| {\n            Shard {\n                index_uid: source_uid.index_uid.clone().into(),\n                source_id: source_uid.source_id.clone(),\n                shard_id: Some(ShardId::from(shard_id)),\n                leader_id: leader_id.to_string(),\n                follower_id: follower_id.map(|s| s.to_string()),\n                shard_state: shard_state as i32,\n                
..Default::default()\n            }\n        };\n\n        shard_table.insert_shards(\n            &source_uid0.index_uid,\n            &source_uid0.source_id,\n            vec![\n                make_shard(\n                    &source_uid0,\n                    \"indexer1\",\n                    0,\n                    Some(\"indexer2\"),\n                    ShardState::Open,\n                ),\n                make_shard(&source_uid0, \"indexer1\", 1, None, ShardState::Closed),\n                make_shard(&source_uid0, \"indexer2\", 2, None, ShardState::Open),\n            ],\n        );\n\n        shard_table.insert_shards(\n            &source_uid1.index_uid,\n            &source_uid1.source_id,\n            vec![\n                make_shard(\n                    &source_uid1,\n                    \"indexer2\",\n                    3,\n                    Some(\"indexer1\"),\n                    ShardState::Unavailable,\n                ),\n                make_shard(\n                    &source_uid1,\n                    \"indexer2\",\n                    3,\n                    Some(\"indexer1\"),\n                    ShardState::Open,\n                ),\n            ],\n        );\n\n        let shard_locations = shard_table.shard_locations();\n        let get_sorted_locations_for_shard = |shard_id: u64| {\n            let mut locations = shard_locations\n                .get_shard_locations(&ShardId::from(shard_id))\n                .to_vec();\n            locations.sort();\n            locations\n        };\n        assert_eq!(\n            &get_sorted_locations_for_shard(0u64),\n            &[&NodeId::from(\"indexer1\"), &NodeId::from(\"indexer2\")]\n        );\n        assert_eq!(\n            &get_sorted_locations_for_shard(1u64),\n            &[&NodeId::from(\"indexer1\")]\n        );\n        assert_eq!(\n            &get_sorted_locations_for_shard(2u64),\n            &[&NodeId::from(\"indexer2\")]\n        );\n        assert_eq!(\n            
&get_sorted_locations_for_shard(3u64),\n            &[&NodeId::from(\"indexer1\"), &NodeId::from(\"indexer2\")]\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-control-plane/src/tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\nuse std::time::Duration;\n\nuse fnv::FnvHashMap;\nuse futures::{Stream, StreamExt};\nuse quickwit_actors::{Inbox, Mailbox, Observe, Universe};\nuse quickwit_cluster::{ChannelTransport, Cluster, ClusterChange, create_cluster_for_test};\nuse quickwit_common::test_utils::wait_until_predicate;\nuse quickwit_common::tower::{Change, Pool};\nuse quickwit_config::service::QuickwitService;\nuse quickwit_config::{\n    ClusterConfig, KafkaSourceParams, SourceConfig, SourceInputFormat, SourceParams,\n};\nuse quickwit_indexing::IndexingService;\nuse quickwit_metastore::{IndexMetadata, ListIndexesMetadataResponseExt};\nuse quickwit_proto::indexing::{ApplyIndexingPlanRequest, CpuCapacity, IndexingServiceClient};\nuse quickwit_proto::metastore::{\n    ListIndexesMetadataResponse, ListShardsResponse, MetastoreServiceClient, MockMetastoreService,\n};\nuse quickwit_proto::types::NodeId;\nuse serde_json::json;\n\nuse crate::IndexerNodeInfo;\nuse crate::control_plane::{CONTROL_PLAN_LOOP_INTERVAL, ControlPlane};\nuse crate::indexing_scheduler::MIN_DURATION_BETWEEN_SCHEDULING;\n\nfn index_metadata_for_test(index_id: &str, source_id: &str, num_pipelines: usize) -> IndexMetadata {\n    let mut index_metadata = IndexMetadata::for_test(index_id, \"ram://indexes/test-index\");\n    let ingest_source_config = SourceConfig::ingest_v2();\n    
index_metadata.add_source(ingest_source_config).unwrap();\n\n    let kafka_source_config = SourceConfig {\n        enabled: true,\n        source_id: source_id.to_string(),\n        num_pipelines: NonZeroUsize::new(num_pipelines).unwrap(),\n        source_params: SourceParams::Kafka(KafkaSourceParams {\n            topic: \"topic\".to_string(),\n            client_log_level: None,\n            client_params: json!({\n            \"bootstrap.servers\": \"localhost:9092\",\n            }),\n            enable_backfill_mode: true,\n        }),\n        transform_config: None,\n        input_format: SourceInputFormat::Json,\n    };\n    index_metadata.add_source(kafka_source_config).unwrap();\n    index_metadata\n}\n\npub fn test_indexer_change_stream(\n    cluster_change_stream: impl Stream<Item = ClusterChange> + Send + 'static,\n    indexing_clients: FnvHashMap<NodeId, Mailbox<IndexingService>>,\n) -> impl Stream<Item = Change<NodeId, IndexerNodeInfo>> + Send + 'static {\n    cluster_change_stream.filter_map(move |cluster_change| {\n        let indexing_clients = indexing_clients.clone();\n        Box::pin(async move {\n            match cluster_change {\n                ClusterChange::Add(node)\n                    if node.enabled_services().contains(&QuickwitService::Indexer) =>\n                {\n                    let node_id = node.node_id().to_owned();\n                    let generation_id = node.chitchat_id().generation_id;\n                    let indexing_tasks = node.indexing_tasks().to_vec();\n                    let client_mailbox = indexing_clients.get(&node_id).unwrap().clone();\n                    let client = IndexingServiceClient::from_mailbox(client_mailbox);\n                    let change = Change::Insert(\n                        node_id.clone(),\n                        IndexerNodeInfo {\n                            node_id,\n                            generation_id,\n                            client,\n                            
indexing_tasks,\n                            indexing_capacity: CpuCapacity::from_cpu_millis(4_000),\n                        },\n                    );\n                    Some(change)\n                }\n                ClusterChange::Remove(node) => Some(Change::Remove(node.node_id().to_owned())),\n                _ => None,\n            }\n        })\n    })\n}\n\nasync fn start_control_plane(\n    cluster: Cluster,\n    indexers: &[&Cluster],\n    universe: &Universe,\n) -> (Vec<Inbox<IndexingService>>, Mailbox<ControlPlane>) {\n    let index_1 = \"test-indexing-plan-1\";\n    let source_1 = \"source-1\";\n    let index_2 = \"test-indexing-plan-2\";\n    let source_2 = \"source-2\";\n    let index_metadata_1 = index_metadata_for_test(index_1, source_1, 2);\n    let mut index_metadata_2 = index_metadata_for_test(index_2, source_2, 1);\n    index_metadata_2.create_timestamp = index_metadata_1.create_timestamp + 1;\n    let mut mock_metastore = MockMetastoreService::new();\n    mock_metastore.expect_list_indexes_metadata().returning(\n        move |_list_indexes_request: quickwit_proto::metastore::ListIndexesMetadataRequest| {\n            let indexes_metadata = vec![index_metadata_2.clone(), index_metadata_1.clone()];\n            Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n        },\n    );\n    mock_metastore.expect_list_shards().returning(|_| {\n        Ok(ListShardsResponse {\n            subresponses: Vec::new(),\n        })\n    });\n    let mut indexer_inboxes = Vec::new();\n\n    let indexer_pool = Pool::default();\n    let ingester_pool = Pool::default();\n    let mut indexing_clients = FnvHashMap::default();\n\n    for indexer in indexers {\n        let (indexing_service_mailbox, indexing_service_inbox) = universe.create_test_mailbox();\n        indexing_clients.insert(indexer.self_node_id().to_owned(), indexing_service_mailbox);\n        indexer_inboxes.push(indexing_service_inbox);\n    }\n    let indexer_change_stream =\n        
test_indexer_change_stream(cluster.change_stream(), indexing_clients);\n    indexer_pool.listen_for_changes(indexer_change_stream);\n\n    let mut cluster_config = ClusterConfig::for_test();\n    cluster_config.cluster_id = cluster.cluster_id().to_string();\n\n    let self_node_id = cluster.self_node_id().to_owned();\n    let (control_plane_mailbox, _control_plane_handle, _is_ready_rx) = ControlPlane::spawn(\n        universe,\n        cluster_config,\n        self_node_id,\n        cluster,\n        indexer_pool,\n        ingester_pool,\n        MetastoreServiceClient::from_mock(mock_metastore),\n    );\n\n    (indexer_inboxes, control_plane_mailbox)\n}\n\n#[tokio::test]\nasync fn test_scheduler_scheduling_and_control_loop_apply_plan_again() {\n    quickwit_common::setup_logging_for_tests();\n    let transport = ChannelTransport::default();\n    let cluster =\n        create_cluster_for_test(Vec::new(), &[\"indexer\", \"control_plane\"], &transport, true)\n            .await\n            .unwrap();\n    cluster\n        .wait_for_ready_members(|members| members.len() == 1, Duration::from_secs(5))\n        .await\n        .unwrap();\n    let universe = Universe::with_accelerated_time();\n    let (indexing_service_inboxes, control_plane_mailbox) =\n        start_control_plane(cluster.clone(), &[&cluster.clone()], &universe).await;\n    let indexing_service_inbox = indexing_service_inboxes[0].clone();\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    let indexing_service_inbox_messages =\n        indexing_service_inbox.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 1);\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 1);\n    assert!(scheduler_state.last_applied_physical_plan.is_some());\n    assert_eq!(indexing_service_inbox_messages.len(), 1);\n\n    // After a CONTROL_PLAN_LOOP_INTERVAL, the 
control loop will check if the desired plan is\n    // running on the indexer. As chitchat state of the indexer is not updated (we did\n    // not instantiate a indexing service for that), the control loop will apply again\n    // the same plan.\n    // Check first the plan is not updated before `MIN_DURATION_BETWEEN_SCHEDULING`.\n    tokio::time::sleep(MIN_DURATION_BETWEEN_SCHEDULING.mul_f32(0.5)).await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 1);\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 1);\n\n    // After `MIN_DURATION_BETWEEN_SCHEDULING`, we should see a plan update.\n    tokio::time::sleep(MIN_DURATION_BETWEEN_SCHEDULING.mul_f32(0.7)).await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    let indexing_service_inbox_messages =\n        indexing_service_inbox.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 1);\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 2);\n    assert_eq!(indexing_service_inbox_messages.len(), 1);\n    let indexing_tasks = indexing_service_inbox_messages\n        .first()\n        .unwrap()\n        .indexing_tasks\n        .clone();\n\n    // Update the indexer state and check that the indexer does not receive any new\n    // `ApplyIndexingPlanRequest`.\n    cluster\n        .update_self_node_indexing_tasks(&indexing_tasks)\n        .await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 2);\n    let indexing_service_inbox_messages =\n        indexing_service_inbox.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    
assert_eq!(indexing_service_inbox_messages.len(), 0);\n\n    // Update the indexer state with a different plan and check that the indexer does now\n    // receive a new `ApplyIndexingPlanRequest`.\n    cluster\n        .update_self_node_indexing_tasks(&[indexing_tasks[0].clone()])\n        .await;\n    tokio::time::sleep(MIN_DURATION_BETWEEN_SCHEDULING.mul_f32(1.2)).await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 3);\n    let indexing_service_inbox_messages =\n        indexing_service_inbox.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    assert_eq!(indexing_service_inbox_messages.len(), 1);\n    universe.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_scheduler_scheduling_no_indexer() {\n    let transport = ChannelTransport::default();\n    let cluster = create_cluster_for_test(Vec::new(), &[\"control_plane\"], &transport, true)\n        .await\n        .unwrap();\n    let universe = Universe::with_accelerated_time();\n    let (indexing_service_inboxes, control_plane_mailbox) =\n        start_control_plane(cluster.clone(), &[], &universe).await;\n    assert_eq!(indexing_service_inboxes.len(), 0);\n\n    // No indexer.\n    universe.sleep(CONTROL_PLAN_LOOP_INTERVAL).await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 0);\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 0);\n    assert!(scheduler_state.last_applied_physical_plan.is_none());\n\n    // There is no indexer, we should observe no\n    // scheduling.\n    universe.sleep(Duration::from_secs(60)).await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    
assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 0);\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 0);\n    assert!(scheduler_state.last_applied_physical_plan.is_none());\n    universe.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_scheduler_scheduling_multiple_indexers() {\n    let transport = ChannelTransport::default();\n    let cluster = create_cluster_for_test(Vec::new(), &[\"control_plane\"], &transport, true)\n        .await\n        .unwrap();\n    let cluster_indexer_1 = create_cluster_for_test(\n        vec![cluster.gossip_advertise_addr().to_string()],\n        &[\"indexer\"],\n        &transport,\n        true,\n    )\n    .await\n    .unwrap();\n    let cluster_indexer_2 = create_cluster_for_test(\n        vec![cluster.gossip_advertise_addr().to_string()],\n        &[\"indexer\"],\n        &transport,\n        true,\n    )\n    .await\n    .unwrap();\n    let universe = Universe::new();\n    let (indexing_service_inboxes, control_plane_mailbox) = start_control_plane(\n        cluster.clone(),\n        &[&cluster_indexer_1, &cluster_indexer_2],\n        &universe,\n    )\n    .await;\n    let indexing_service_inbox_1 = indexing_service_inboxes[0].clone();\n    let indexing_service_inbox_2 = indexing_service_inboxes[1].clone();\n\n    // No indexer.\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    let indexing_service_inbox_messages =\n        indexing_service_inbox_1.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 0);\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 0);\n    assert!(scheduler_state.last_applied_physical_plan.is_none());\n    assert_eq!(indexing_service_inbox_messages.len(), 0);\n\n    cluster\n        .wait_for_ready_members(\n            |members| {\n                members\n                    .iter()\n                    
.any(|member| member.enabled_services.contains(&QuickwitService::Indexer))\n            },\n            Duration::from_secs(5),\n        )\n        .await\n        .unwrap();\n\n    // Wait for chitchat update, sheduler will detect new indexers and schedule a plan.\n    wait_until_predicate(\n        || {\n            let control_plane_mailbox_clone = control_plane_mailbox.clone();\n            async move {\n                let scheduler_state = control_plane_mailbox_clone\n                    .ask(Observe)\n                    .await\n                    .unwrap()\n                    .indexing_scheduler;\n                scheduler_state.num_schedule_indexing_plan == 1\n            }\n        },\n        CONTROL_PLAN_LOOP_INTERVAL * 4,\n        Duration::from_millis(100),\n    )\n    .await\n    .unwrap();\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        .indexing_scheduler;\n    assert_eq!(scheduler_state.num_applied_physical_indexing_plan, 1);\n    let indexing_service_inbox_messages_1 =\n        indexing_service_inbox_1.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    let indexing_service_inbox_messages_2 =\n        indexing_service_inbox_2.drain_for_test_typed::<ApplyIndexingPlanRequest>();\n    assert_eq!(indexing_service_inbox_messages_1.len(), 1);\n    assert_eq!(indexing_service_inbox_messages_2.len(), 1);\n    cluster_indexer_1\n        .update_self_node_indexing_tasks(&indexing_service_inbox_messages_1[0].indexing_tasks)\n        .await;\n    cluster_indexer_2\n        .update_self_node_indexing_tasks(&indexing_service_inbox_messages_2[0].indexing_tasks)\n        .await;\n\n    // Wait 2 CONTROL_PLAN_LOOP_INTERVAL again and check the scheduler will not apply the plan\n    // several times.\n    universe.sleep(CONTROL_PLAN_LOOP_INTERVAL * 2).await;\n    let scheduler_state = control_plane_mailbox\n        .ask(Observe)\n        .await\n        .unwrap()\n        
.indexing_scheduler;\n    assert_eq!(scheduler_state.num_schedule_indexing_plan, 1);\n\n    // Shutdown cluster and wait until the new scheduling.\n    cluster_indexer_2.leave().await;\n\n    cluster\n        .wait_for_ready_members(\n            |members| {\n                members\n                    .iter()\n                    .filter(|member| member.enabled_services.contains(&QuickwitService::Indexer))\n                    .count()\n                    == 1\n            },\n            Duration::from_secs(5),\n        )\n        .await\n        .unwrap();\n\n    wait_until_predicate(\n        || {\n            let scheduler_handler_mailbox_clone = control_plane_mailbox.clone();\n            async move {\n                let scheduler_state = scheduler_handler_mailbox_clone\n                    .ask(Observe)\n                    .await\n                    .unwrap()\n                    .indexing_scheduler;\n                scheduler_state.num_schedule_indexing_plan == 2\n            }\n        },\n        CONTROL_PLAN_LOOP_INTERVAL * 10,\n        Duration::from_millis(100),\n    )\n    .await\n    .unwrap();\n\n    universe.assert_quit().await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-datetime/Cargo.toml",
    "content": "[package]\nname = \"quickwit-datetime\"\ndescription = \"Date and datetime utilities for Quickwit\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nitertools = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntantivy = { workspace = true }\ntime = { workspace = true }\ntime-fmt = \"0.3.8\"\n"
  },
  {
    "path": "quickwit/quickwit-datetime/README.md",
    "content": "Why a datetime crate? Why is it no in quickwit-common or where it is consumed?\n\n- We don't want to add a dependency to tantivy in quickwit-common\n- We need this date logic both in quickwit-query and in quickwit-docmapper\n"
  },
  {
    "path": "quickwit/quickwit-datetime/src/date_time_format.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Display;\nuse std::str::FromStr;\n\nuse serde::de::Error;\nuse serde::{Deserialize, Deserializer, Serialize};\nuse serde_json::Value as JsonValue;\nuse time::Month;\nuse time::format_description::well_known::{Iso8601, Rfc2822, Rfc3339};\n\nuse crate::java_date_time_format::is_strftime_formatting;\nuse crate::{StrptimeParser, TantivyDateTime};\n\n/// Specifies the datetime and unix timestamp formats to use when parsing date strings.\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Default)]\npub enum DateTimeInputFormat {\n    Iso8601,\n    Rfc2822,\n    #[default]\n    Rfc3339,\n    Strptime(StrptimeParser),\n    Timestamp,\n}\n\nimpl DateTimeInputFormat {\n    pub fn as_str(&self) -> &str {\n        match self {\n            DateTimeInputFormat::Iso8601 => \"iso8601\",\n            DateTimeInputFormat::Rfc2822 => \"rfc2822\",\n            DateTimeInputFormat::Rfc3339 => \"rfc3339\",\n            DateTimeInputFormat::Strptime(parser) => parser.strptime_format.as_str(),\n            DateTimeInputFormat::Timestamp => \"unix_timestamp\",\n        }\n    }\n}\n\nimpl Display for DateTimeInputFormat {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        formatter.write_str(self.as_str())\n    }\n}\n\nimpl FromStr for DateTimeInputFormat {\n    type Err = String;\n\n    fn from_str(date_time_format_str: 
&str) -> Result<Self, Self::Err> {\n        let date_time_format = match date_time_format_str.to_lowercase().as_str() {\n            \"iso8601\" => DateTimeInputFormat::Iso8601,\n            \"rfc2822\" => DateTimeInputFormat::Rfc2822,\n            \"rfc3339\" => DateTimeInputFormat::Rfc3339,\n            \"unix_timestamp\" => DateTimeInputFormat::Timestamp,\n            _ => {\n                if !is_strftime_formatting(date_time_format_str) {\n                    return Err(format!(\n                        \"unknown input format: `{date_time_format_str}`. a custom date time \\\n                         format must contain at least one `strftime` special characters\"\n                    ));\n                }\n                DateTimeInputFormat::Strptime(StrptimeParser::from_strptime(date_time_format_str)?)\n            }\n        };\n        Ok(date_time_format)\n    }\n}\n\nimpl Serialize for DateTimeInputFormat {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: serde::Serializer {\n        serializer.serialize_str(self.as_str())\n    }\n}\n\nimpl<'de> Deserialize<'de> for DateTimeInputFormat {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let date_time_format_str: String = Deserialize::deserialize(deserializer)?;\n        let date_time_format = date_time_format_str.parse().map_err(D::Error::custom)?;\n        Ok(date_time_format)\n    }\n}\n\n/// Specifies the datetime format to use when displaying datetime values.\n#[derive(Clone, Debug, Eq, PartialEq, Hash, Default)]\npub enum DateTimeOutputFormat {\n    Iso8601,\n    Rfc2822,\n    #[default]\n    Rfc3339,\n    Strptime(StrptimeParser),\n    TimestampSecs,\n    TimestampMillis,\n    TimestampMicros,\n    TimestampNanos,\n}\n\nimpl DateTimeOutputFormat {\n    pub fn as_str(&self) -> &str {\n        match self {\n            DateTimeOutputFormat::Iso8601 => \"iso8601\",\n            DateTimeOutputFormat::Rfc2822 
=> \"rfc2822\",\n            DateTimeOutputFormat::Rfc3339 => \"rfc3339\",\n            DateTimeOutputFormat::Strptime(parser) => parser.strptime_format.as_str(),\n            DateTimeOutputFormat::TimestampSecs => \"unix_timestamp_secs\",\n            DateTimeOutputFormat::TimestampMillis => \"unix_timestamp_millis\",\n            DateTimeOutputFormat::TimestampMicros => \"unix_timestamp_micros\",\n            DateTimeOutputFormat::TimestampNanos => \"unix_timestamp_nanos\",\n        }\n    }\n\n    pub fn format_to_json(&self, date_time: TantivyDateTime) -> Result<JsonValue, String> {\n        let date = date_time.into_utc();\n        let format_result = match &self {\n            DateTimeOutputFormat::Rfc3339 => date.format(&Rfc3339).map(JsonValue::String),\n            DateTimeOutputFormat::Iso8601 => date.format(&Iso8601::DEFAULT).map(JsonValue::String),\n            DateTimeOutputFormat::Rfc2822 => date.format(&Rfc2822).map(JsonValue::String),\n            DateTimeOutputFormat::Strptime(strftime_parser) => strftime_parser\n                .format_date_time(&date)\n                .map(JsonValue::String),\n            DateTimeOutputFormat::TimestampSecs => {\n                Ok(JsonValue::Number(date_time.into_timestamp_secs().into()))\n            }\n            DateTimeOutputFormat::TimestampMillis => {\n                Ok(JsonValue::Number(date_time.into_timestamp_millis().into()))\n            }\n            DateTimeOutputFormat::TimestampMicros => {\n                Ok(JsonValue::Number(date_time.into_timestamp_micros().into()))\n            }\n            DateTimeOutputFormat::TimestampNanos => {\n                Ok(JsonValue::Number(date_time.into_timestamp_nanos().into()))\n            }\n        };\n        format_result.map_err(|error| error.to_string())\n    }\n}\n\nimpl Display for DateTimeOutputFormat {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        formatter.write_str(self.as_str())\n    }\n}\n\nimpl 
FromStr for DateTimeOutputFormat {\n    type Err = String;\n\n    fn from_str(date_time_format_str: &str) -> Result<Self, Self::Err> {\n        let date_time_format = match date_time_format_str.to_lowercase().as_str() {\n            \"iso8601\" => DateTimeOutputFormat::Iso8601,\n            \"rfc2822\" => DateTimeOutputFormat::Rfc2822,\n            \"rfc3339\" => DateTimeOutputFormat::Rfc3339,\n            \"unix_timestamp_secs\" => DateTimeOutputFormat::TimestampSecs,\n            \"unix_timestamp_millis\" => DateTimeOutputFormat::TimestampMillis,\n            \"unix_timestamp_micros\" => DateTimeOutputFormat::TimestampMicros,\n            \"unix_timestamp_nanos\" => DateTimeOutputFormat::TimestampNanos,\n            _ => {\n                if !is_strftime_formatting(date_time_format_str) {\n                    return Err(format!(\n                        \"unknown output format: `{date_time_format_str}`. a custom date time \\\n                         format must contain at least one `strftime` special characters\"\n                    ));\n                }\n                DateTimeOutputFormat::Strptime(StrptimeParser::from_strptime(date_time_format_str)?)\n            }\n        };\n        Ok(date_time_format)\n    }\n}\n\nimpl Serialize for DateTimeOutputFormat {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: serde::Serializer {\n        serializer.serialize_str(self.as_str())\n    }\n}\n\nimpl<'de> Deserialize<'de> for DateTimeOutputFormat {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let date_time_format_str: String = Deserialize::deserialize(deserializer)?;\n        let date_time_format = date_time_format_str.parse().map_err(D::Error::custom)?;\n        Ok(date_time_format)\n    }\n}\n\n/// Infers the year of a parsed date time. 
It assumes that events appear more often delayed than in\n/// the future and, as a result, skews towards the past year.\npub(super) fn infer_year(\n    parsed_month_opt: Option<Month>,\n    this_month: Month,\n    this_year: i32,\n) -> i32 {\n    let Some(parsed_month) = parsed_month_opt else {\n        return this_year;\n    };\n    if parsed_month as u8 > this_month as u8 + 3 {\n        return this_year - 1;\n    }\n    this_year\n}\n\n#[cfg(test)]\nmod tests {\n    use time::Month;\n\n    use super::*;\n\n    #[test]\n    fn test_date_time_input_format_ser() {\n        let date_time_formats_json = serde_json::to_value(&[\n            DateTimeInputFormat::Iso8601,\n            DateTimeInputFormat::Rfc2822,\n            DateTimeInputFormat::Rfc3339,\n            DateTimeInputFormat::Timestamp,\n        ])\n        .unwrap();\n\n        let expected_date_time_formats =\n            serde_json::json!([\"iso8601\", \"rfc2822\", \"rfc3339\", \"unix_timestamp\",]);\n        assert_eq!(date_time_formats_json, expected_date_time_formats);\n    }\n\n    #[test]\n    fn test_date_time_input_format_deser() {\n        let date_time_formats_json = r#\"\n            [\n                \"iso8601\",\n                \"rfc2822\",\n                \"rfc3339\",\n                \"unix_timestamp\"\n            ]\n            \"#;\n        let date_time_formats: Vec<DateTimeInputFormat> =\n            serde_json::from_str(date_time_formats_json).unwrap();\n        let expected_date_time_formats = [\n            DateTimeInputFormat::Iso8601,\n            DateTimeInputFormat::Rfc2822,\n            DateTimeInputFormat::Rfc3339,\n            DateTimeInputFormat::Timestamp,\n        ];\n        assert_eq!(date_time_formats, &expected_date_time_formats);\n    }\n\n    #[test]\n    fn test_date_time_output_format_ser() {\n        let date_time_formats_json = serde_json::to_value(&[\n            DateTimeOutputFormat::Iso8601,\n            DateTimeOutputFormat::Rfc2822,\n            
DateTimeOutputFormat::Rfc3339,\n            DateTimeOutputFormat::TimestampSecs,\n            DateTimeOutputFormat::TimestampMillis,\n            DateTimeOutputFormat::TimestampMicros,\n            DateTimeOutputFormat::TimestampNanos,\n        ])\n        .unwrap();\n\n        let expected_date_time_formats = serde_json::json!([\n            \"iso8601\",\n            \"rfc2822\",\n            \"rfc3339\",\n            \"unix_timestamp_secs\",\n            \"unix_timestamp_millis\",\n            \"unix_timestamp_micros\",\n            \"unix_timestamp_nanos\",\n        ]);\n        assert_eq!(date_time_formats_json, expected_date_time_formats);\n    }\n\n    #[test]\n    fn test_date_time_output_format_deser() {\n        let date_time_formats_json = r#\"\n            [\n                \"iso8601\",\n                \"rfc2822\",\n                \"rfc3339\",\n                \"unix_timestamp_secs\",\n                \"unix_timestamp_millis\",\n                \"unix_timestamp_micros\",\n                \"unix_timestamp_nanos\"\n            ]\n            \"#;\n        let date_time_formats: Vec<DateTimeOutputFormat> =\n            serde_json::from_str(date_time_formats_json).unwrap();\n        let expected_date_time_formats = [\n            DateTimeOutputFormat::Iso8601,\n            DateTimeOutputFormat::Rfc2822,\n            DateTimeOutputFormat::Rfc3339,\n            DateTimeOutputFormat::TimestampSecs,\n            DateTimeOutputFormat::TimestampMillis,\n            DateTimeOutputFormat::TimestampMicros,\n            DateTimeOutputFormat::TimestampNanos,\n        ];\n        assert_eq!(date_time_formats, &expected_date_time_formats);\n    }\n\n    #[test]\n    fn test_fail_date_time_input_format_from_str_with_unknown_format() {\n        let formats = vec![\n            \"test%\",\n            \"test-%v\",\n            \"test-%q\",\n            \"unix_timestamp_secs\",\n            \"unix_timestamp_seconds\",\n        ];\n        for format in formats {\n         
   let error_str = DateTimeInputFormat::from_str(format)\n                .unwrap_err()\n                .to_string();\n            assert!(error_str.contains(&format!(\"unknown input format: `{format}`\")));\n        }\n    }\n\n    #[test]\n    fn test_fail_date_time_output_format_from_str_with_unknown_format() {\n        let formats = vec![\"test%\", \"test-%v\", \"test-%q\", \"unix_timestamp_seconds\"];\n        for format in formats {\n            let error_str = DateTimeOutputFormat::from_str(format)\n                .unwrap_err()\n                .to_string();\n            assert!(error_str.contains(&format!(\"unknown output format: `{format}`\")));\n        }\n    }\n\n    #[test]\n    fn test_infer_year() {\n        let inferred_year = infer_year(None, Month::January, 2024);\n        assert_eq!(inferred_year, 2024);\n\n        let inferred_year = infer_year(Some(Month::December), Month::January, 2024);\n        assert_eq!(inferred_year, 2023);\n\n        let inferred_year = infer_year(Some(Month::January), Month::January, 2024);\n        assert_eq!(inferred_year, 2024);\n\n        let inferred_year = infer_year(Some(Month::February), Month::January, 2024);\n        assert_eq!(inferred_year, 2024);\n\n        let inferred_year = infer_year(Some(Month::March), Month::January, 2024);\n        assert_eq!(inferred_year, 2024);\n\n        let inferred_year = infer_year(Some(Month::April), Month::January, 2024);\n        assert_eq!(inferred_year, 2024);\n\n        let inferred_year = infer_year(Some(Month::May), Month::January, 2024);\n        assert_eq!(inferred_year, 2023);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-datetime/src/date_time_parsing.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse itertools::Itertools;\nuse time::OffsetDateTime;\nuse time::format_description::well_known::{Iso8601, Rfc2822, Rfc3339};\n\nuse super::date_time_format::DateTimeInputFormat;\nuse crate::TantivyDateTime;\n\n// Minimum supported timestamp value in seconds (13 Apr 1972 23:59:55 GMT).\nconst MIN_TIMESTAMP_SECONDS: i64 = 72_057_595;\n\n// Maximum supported timestamp value in seconds (16 Mar 2242 12:56:31 GMT).\nconst MAX_TIMESTAMP_SECONDS: i64 = 8_589_934_591;\n\npub fn parse_date_time_str(\n    date_time_str: &str,\n    date_time_formats: &[DateTimeInputFormat],\n) -> Result<TantivyDateTime, String> {\n    let trimmed_date_time_str = date_time_str.trim_ascii();\n\n    for date_time_format in date_time_formats {\n        let date_time_opt = match date_time_format {\n            DateTimeInputFormat::Iso8601 => parse_iso8601(trimmed_date_time_str)\n                .map(TantivyDateTime::from_utc)\n                .ok(),\n            DateTimeInputFormat::Rfc2822 => parse_rfc2822(trimmed_date_time_str)\n                .map(TantivyDateTime::from_utc)\n                .ok(),\n            DateTimeInputFormat::Rfc3339 => parse_rfc3339(trimmed_date_time_str)\n                .map(TantivyDateTime::from_utc)\n                .ok(),\n            DateTimeInputFormat::Strptime(parser) => parser\n                
.parse_date_time(trimmed_date_time_str)\n                .map(TantivyDateTime::from_utc)\n                .ok(),\n            DateTimeInputFormat::Timestamp => parse_timestamp_str(trimmed_date_time_str),\n        };\n        if let Some(date_time) = date_time_opt {\n            return Ok(date_time);\n        }\n    }\n    Err(format!(\n        \"failed to parse datetime `{date_time_str}` using the following formats: `{}`\",\n        date_time_formats\n            .iter()\n            .map(|date_time_format| date_time_format.as_str())\n            .join(\"`, `\")\n    ))\n}\n\npub fn parse_timestamp_float(\n    timestamp: f64,\n    date_time_formats: &[DateTimeInputFormat],\n) -> Result<TantivyDateTime, String> {\n    if !date_time_formats.contains(&DateTimeInputFormat::Timestamp) {\n        return Err(format!(\n            \"failed to parse datetime `{timestamp}` using the following formats: `{}`\",\n            date_time_formats\n                .iter()\n                .map(|date_time_format| date_time_format.as_str())\n                .join(\"`, `\")\n        ));\n    }\n    let duration_since_epoch = Duration::try_from_secs_f64(timestamp)\n        .map_err(|error| format!(\"failed to parse datetime `{timestamp}`: {error}\"))?;\n    let timestamp_nanos = duration_since_epoch.as_nanos() as i64;\n    Ok(TantivyDateTime::from_timestamp_nanos(timestamp_nanos))\n}\n\npub fn parse_timestamp_int(\n    timestamp: i64,\n    date_time_formats: &[DateTimeInputFormat],\n) -> Result<TantivyDateTime, String> {\n    if !date_time_formats.contains(&DateTimeInputFormat::Timestamp) {\n        return Err(format!(\n            \"failed to parse datetime `{timestamp}` using the following formats: `{}`\",\n            date_time_formats\n                .iter()\n                .map(|date_time_format| date_time_format.as_str())\n                .join(\"`, `\")\n        ));\n    }\n    parse_timestamp(timestamp)\n}\n\npub fn parse_timestamp_str(timestamp_str: &str) -> 
Option<TantivyDateTime> {\n    if let Ok(timestamp) = timestamp_str.parse::<i64>() {\n        return parse_timestamp(timestamp).ok();\n    }\n    if let Some((timestamp_secs_str, subsecond_digits_str)) = timestamp_str.split_once('.') {\n        if subsecond_digits_str.is_empty() {\n            return parse_timestamp_str(timestamp_secs_str);\n        }\n        if let Ok(timestamp_secs @ MIN_TIMESTAMP_SECONDS..=MAX_TIMESTAMP_SECONDS) =\n            timestamp_secs_str.parse::<i64>()\n        {\n            let num_subsecond_digits = subsecond_digits_str.len().min(9);\n\n            if let Ok(subsecond_digits) =\n                subsecond_digits_str[..num_subsecond_digits].parse::<i64>()\n            {\n                let nanos = subsecond_digits * 10i64.pow(9 - num_subsecond_digits as u32);\n                let timestamp_nanos = timestamp_secs * 1_000_000_000 + nanos;\n                return Some(TantivyDateTime::from_timestamp_nanos(timestamp_nanos));\n            }\n        }\n    }\n    None\n}\n\n/// Parses a ISO8601 date.\nfn parse_iso8601(value: &str) -> Result<OffsetDateTime, String> {\n    OffsetDateTime::parse(value, &Iso8601::DEFAULT).map_err(|error| error.to_string())\n}\n\n/// Parses a RFC2822 date.\nfn parse_rfc2822(value: &str) -> Result<OffsetDateTime, String> {\n    OffsetDateTime::parse(value, &Rfc2822).map_err(|error| error.to_string())\n}\n\n/// Parses a RFC3339 date.\nfn parse_rfc3339(value: &str) -> Result<OffsetDateTime, String> {\n    OffsetDateTime::parse(value, &Rfc3339).map_err(|error| error.to_string())\n}\n\n/// Returns the appropriate [`TantivyDateTime`] for the specified Unix timestamp.\n///\n/// This function will choose the timestamp precision based on the value range.\n/// The tradeoff is that we can only support dates ranging:\n/// - from `13 Apr 1972 23:59:55`: smallest value that can be converted to all precisions.\n/// - to: `16 Mar 2242 12:56:31`: greatest value that can be converted to all precisions.\npub fn 
parse_timestamp(timestamp: i64) -> Result<TantivyDateTime, String> {\n    const MIN_TIMESTAMP_MILLIS: i64 = MIN_TIMESTAMP_SECONDS * 1000;\n    const MAX_TIMESTAMP_MILLIS: i64 = MAX_TIMESTAMP_SECONDS * 1000;\n\n    const MIN_TIMESTAMP_MICROS: i64 = MIN_TIMESTAMP_SECONDS * 1_000_000;\n    const MAX_TIMESTAMP_MICROS: i64 = MAX_TIMESTAMP_SECONDS * 1_000_000;\n\n    const MIN_TIMESTAMP_NANOS: i64 = MIN_TIMESTAMP_SECONDS * 1_000_000_000;\n    const MAX_TIMESTAMP_NANOS: i64 = MAX_TIMESTAMP_SECONDS * 1_000_000_000;\n\n    match timestamp {\n        MIN_TIMESTAMP_SECONDS..=MAX_TIMESTAMP_SECONDS => {\n            Ok(TantivyDateTime::from_timestamp_secs(timestamp))\n        }\n        MIN_TIMESTAMP_MILLIS..=MAX_TIMESTAMP_MILLIS => {\n            Ok(TantivyDateTime::from_timestamp_millis(timestamp))\n        }\n        MIN_TIMESTAMP_MICROS..=MAX_TIMESTAMP_MICROS => {\n            Ok(TantivyDateTime::from_timestamp_micros(timestamp))\n        }\n        MIN_TIMESTAMP_NANOS..=MAX_TIMESTAMP_NANOS => {\n            Ok(TantivyDateTime::from_timestamp_nanos(timestamp))\n        }\n        _ => Err(format!(\n            \"failed to parse unix timestamp `{timestamp}`. 
Quickwit only supports timestamp values \\\n             ranging from `13 Apr 1972 23:59:55` to `16 Mar 2242 12:56:31`\"\n        )),\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use time::Month;\n    use time::macros::datetime;\n\n    use super::*;\n    use crate::StrptimeParser;\n    use crate::date_time_format::infer_year;\n\n    #[test]\n    fn test_parse_iso8601() {\n        let date_time = parse_iso8601(\"20120521T120914Z\").unwrap();\n        assert_eq!(date_time, datetime!(2012-05-21 12:09:14 UTC));\n    }\n\n    #[test]\n    fn test_parse_rfc2822() {\n        let date_time = parse_rfc2822(\"Mon, 21 May 2012 12:09:14 GMT\").unwrap();\n        assert_eq!(date_time, datetime!(2012-05-21 12:09:14 UTC));\n    }\n\n    #[test]\n    fn test_parse_rfc3339() {\n        let date_time = parse_rfc3339(\"2012-05-21T12:09:14-00:00\").unwrap();\n        assert_eq!(date_time, datetime!(2012-05-21 12:09:14 UTC));\n    }\n\n    #[test]\n    fn test_parse_strptime() {\n        let test_data = vec![\n            (\n                \" %Y-%m-%d %H:%M:%S \",\n                \"2012-05-21 12:09:14\",\n                datetime!(2012-05-21 12:09:14 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S %z\",\n                \" 2012-05-21 12:09:14 +0000 \",\n                datetime!(2012-05-21 12:09:14 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S %z\",\n                \"2012-05-21 12:09:14 +0200\",\n                datetime!(2012-05-21 10:09:14 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S %z\",\n                \"2012-05-21 12:09:14 -0300\",\n                datetime!(2012-05-21 15:09:14 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S %z\",\n                \"2012-05-21 12:09:14 -03:00\",\n                datetime!(2012-05-21 15:09:14 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S.%f\",\n                \"2024-01-31 18:40:19.950\",\n                
datetime!(2024-01-31 18:40:19.950000000 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S.%f\",\n                \"2024-01-31 18:40:19.950188\",\n                datetime!(2024-01-31 18:40:19.950188000 UTC),\n            ),\n            (\n                \"%Y-%m-%d %H:%M:%S.%f\",\n                \"2024-01-31 18:40:19.950188123\",\n                datetime!(2024-01-31 18:40:19.950188123 UTC),\n            ),\n            (\"%b %d %H:%M:%S\", \"Mar  6 17:40:02\", {\n                let dt = datetime!(1900-03-06 17:40:02 UTC);\n                let now = OffsetDateTime::now_utc();\n                let year = infer_year(Some(Month::March), now.month(), now.year());\n                dt.replace_year(year).unwrap()\n            }),\n            (\n                \"%Y-%m-%dT%H:%M:%S.%f%z\",\n                \"2024-03-21T03:45:02.561820768-0400\",\n                datetime!(2024-03-21 03:45:02.561820768 -04:00),\n            ),\n        ];\n        for (fmt, date_time_str, expected) in test_data {\n            let parser = DateTimeInputFormat::Strptime(StrptimeParser::from_strptime(fmt).unwrap());\n            let result = parse_date_time_str(date_time_str, &[parser]);\n            if let Err(error) = &result {\n                panic!(\n                    \"failed to parse `{date_time_str}` using the following strptime format \\\n                     `{fmt}`: {error}\"\n                )\n            }\n            assert_eq!(result.unwrap(), TantivyDateTime::from_utc(expected));\n        }\n    }\n\n    #[test]\n    fn test_parse_date_without_time() {\n        let strptime_parser = StrptimeParser::from_strptime(\"%Y-%m-%d\").unwrap();\n        let date = strptime_parser.parse_date_time(\"2012-05-21\").unwrap();\n        assert_eq!(date, datetime!(2012-05-21 00:00:00 UTC));\n    }\n\n    #[test]\n    fn test_parse_date_am_pm_hour_not_zeroed() {\n        let strptime_parser = StrptimeParser::from_strptime(\"%Y-%m-%d %I:%M:%S %p\").unwrap();\n       
 let date = strptime_parser\n            .parse_date_time(\"2012-05-21 10:05:12 pm\")\n            .unwrap();\n        assert_eq!(date, datetime!(2012-05-21 22:05:12 UTC));\n    }\n\n    #[test]\n    fn test_parse_date_time_str() {\n        for date_time_str in [\n            \"20120521T120914Z \",\n            \" Mon, 21 May 2012 12:09:14 GMT\",\n            \" 2012-05-21T12:09:14-00:00 \",\n            \"2012-05-21 12:09:14\",\n            \" 2012/05/21 12:09:14\",\n            \"2012/05/21 12:09:14 +00:00\",\n            \"1337602154 \",\n            \" 1337602154.0 \",\n        ] {\n            let date_time = parse_date_time_str(\n                date_time_str,\n                &[\n                    DateTimeInputFormat::Iso8601,\n                    DateTimeInputFormat::Rfc2822,\n                    DateTimeInputFormat::Rfc3339,\n                    DateTimeInputFormat::Strptime(\n                        StrptimeParser::from_strptime(\"%Y-%m-%d %H:%M:%S\").unwrap(),\n                    ),\n                    DateTimeInputFormat::Strptime(\n                        StrptimeParser::from_strptime(\"%Y/%m/%d %H:%M:%S\").unwrap(),\n                    ),\n                    DateTimeInputFormat::Strptime(\n                        StrptimeParser::from_strptime(\"%Y/%m/%d %H:%M:%S %z\").unwrap(),\n                    ),\n                    DateTimeInputFormat::Timestamp,\n                ],\n            )\n            .unwrap();\n            assert_eq!(\n                date_time.into_timestamp_secs(),\n                datetime!(2012-05-21 12:09:14 UTC).unix_timestamp()\n            );\n        }\n        let error = parse_date_time_str(\n            \"foo\",\n            &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Rfc2822],\n        )\n        .unwrap_err();\n        assert_eq!(\n            error,\n            \"failed to parse datetime `foo` using the following formats: `iso8601`, `rfc2822`\"\n        );\n    }\n\n    #[test]\n    fn 
test_parse_timestamp_float() {\n        let unix_ts_secs = OffsetDateTime::now_utc().unix_timestamp();\n        {\n            let date_time = parse_timestamp_float(\n                unix_ts_secs as f64,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Timestamp],\n            )\n            .unwrap();\n            assert_eq!(date_time.into_timestamp_millis(), unix_ts_secs * 1_000);\n        }\n        {\n            let date_time = parse_timestamp_float(\n                unix_ts_secs as f64 + 0.1230,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Timestamp],\n            )\n            .unwrap();\n            assert!((date_time.into_timestamp_millis() - (unix_ts_secs * 1_000 + 123)).abs() <= 1);\n        }\n        {\n            let date_time = parse_timestamp_float(\n                unix_ts_secs as f64 + 0.1234560,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Timestamp],\n            )\n            .unwrap();\n            assert!(\n                (date_time.into_timestamp_micros() - (unix_ts_secs * 1_000_000 + 123_456)).abs()\n                    <= 1\n            );\n        }\n        {\n            let date_time = parse_timestamp_float(\n                unix_ts_secs as f64 + 0.123456789,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Timestamp],\n            )\n            .unwrap();\n            assert!(\n                (date_time.into_timestamp_nanos() - (unix_ts_secs * 1_000_000_000 + 123_456_789))\n                    .abs()\n                    <= 100\n            );\n        }\n        {\n            let error = parse_timestamp_float(\n                1668730394917.01,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Rfc2822],\n            )\n            .unwrap_err();\n            assert_eq!(\n                error,\n                \"failed to parse datetime `1668730394917.01` using the following formats: \\\n                 `iso8601`, 
`rfc2822`\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_parse_timestamp_int() {\n        {\n            let unix_ts_secs = OffsetDateTime::now_utc().unix_timestamp();\n            let date_time = parse_timestamp_int(\n                unix_ts_secs,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Timestamp],\n            )\n            .unwrap();\n            assert_eq!(date_time.into_timestamp_secs(), unix_ts_secs);\n        }\n        {\n            let error = parse_timestamp_int(\n                1668730394917,\n                &[DateTimeInputFormat::Iso8601, DateTimeInputFormat::Rfc2822],\n            )\n            .unwrap_err();\n            assert_eq!(\n                error,\n                \"failed to parse datetime `1668730394917` using the following formats: `iso8601`, \\\n                 `rfc2822`\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_parse_timestamp_str() {\n        let date_time = parse_timestamp_str(\"123456789\").unwrap();\n        assert_eq!(date_time.into_timestamp_secs(), 123456789);\n\n        let date_time = parse_timestamp_str(\"123456789.\").unwrap();\n        assert_eq!(date_time.into_timestamp_secs(), 123456789);\n\n        let date_time = parse_timestamp_str(\"123456789.0\").unwrap();\n        assert_eq!(date_time.into_timestamp_secs(), 123456789);\n\n        let date_time = parse_timestamp_str(\"123456789.1\").unwrap();\n        assert_eq!(date_time.into_timestamp_millis(), 123456789100);\n\n        let date_time = parse_timestamp_str(\"123456789.100000001\").unwrap();\n        assert_eq!(date_time.into_timestamp_nanos(), 123456789100000001);\n\n        let date_time = parse_timestamp_str(\"123456789.1000000011\").unwrap();\n        assert_eq!(date_time.into_timestamp_nanos(), 123456789100000001);\n    }\n\n    #[test]\n    fn test_parse_date_time_millis() {\n        for date_time_str in [\n            \"20120521T120914.12Z\",\n            
\"2012-05-21T12:09:14.12-00:00\",\n            \"2012-05-21 12:09:14.120\",\n        ] {\n            let date_time = parse_date_time_str(\n                date_time_str,\n                &[\n                    DateTimeInputFormat::Iso8601,\n                    DateTimeInputFormat::Rfc3339,\n                    DateTimeInputFormat::Strptime(\n                        StrptimeParser::from_strptime(\"%Y-%m-%d %H:%M:%S.%f\").unwrap(),\n                    ),\n                ],\n            )\n            .unwrap();\n            assert_eq!(\n                date_time.into_timestamp_micros() as i128,\n                datetime!(2012-05-21 12:09:14.12 UTC).unix_timestamp_nanos() / 1_000\n            );\n        }\n    }\n\n    #[test]\n    fn test_parse_timestamp() {\n        let now = OffsetDateTime::now_utc();\n        {\n            let unix_ts_secs = now.unix_timestamp();\n            let date_time = parse_timestamp(unix_ts_secs).unwrap();\n            assert_eq!(date_time.into_timestamp_secs(), unix_ts_secs);\n        }\n        {\n            let unix_ts_millis = (now.unix_timestamp_nanos() / 1_000_000) as i64;\n            let date_time = parse_timestamp(unix_ts_millis).unwrap();\n            assert_eq!(date_time.into_timestamp_millis(), unix_ts_millis);\n        }\n        {\n            let unix_ts_micros = (now.unix_timestamp_nanos() / 1_000) as i64;\n            let date_time = parse_timestamp(unix_ts_micros).unwrap();\n            assert_eq!(date_time.into_timestamp_micros(), unix_ts_micros);\n        }\n        {\n            let unix_ts_nanos = now.unix_timestamp_nanos() as i64;\n            let date_time = parse_timestamp(unix_ts_nanos).unwrap();\n            assert_eq!(date_time.into_timestamp_nanos(), unix_ts_nanos);\n        }\n        {\n            let min_supported_date =\n                OffsetDateTime::parse(\"1972-04-13T23:59:55.00Z\", &Rfc3339).unwrap();\n            let parsed_date_time = 
parse_timestamp(min_supported_date.unix_timestamp()).unwrap();\n            assert_eq!(\n                parsed_date_time.into_timestamp_secs(),\n                min_supported_date.unix_timestamp()\n            );\n            assert_eq!(\n                parsed_date_time.into_timestamp_micros(),\n                min_supported_date.unix_timestamp_nanos() as i64 / 1_000\n            );\n        }\n        {\n            let max_supported_date =\n                OffsetDateTime::parse(\"2242-03-16T12:56:31.00Z\", &Rfc3339).unwrap();\n            let parsed_date_time = parse_timestamp(max_supported_date.unix_timestamp()).unwrap();\n            assert_eq!(\n                parsed_date_time.into_timestamp_secs(),\n                max_supported_date.unix_timestamp()\n            );\n            assert_eq!(\n                parsed_date_time.into_timestamp_micros(),\n                max_supported_date.unix_timestamp_nanos() as i64 / 1_000\n            );\n        }\n        {\n            let less_than_supported_date = MIN_TIMESTAMP_SECONDS - 1;\n            let parse_err = parse_timestamp(less_than_supported_date).unwrap_err();\n            assert!(parse_err.contains(\"failed to parse unix timestamp\"));\n        }\n        {\n            let greater_than_supported_date = MAX_TIMESTAMP_SECONDS + 1;\n            let parse_err = parse_timestamp(greater_than_supported_date).unwrap_err();\n            assert!(parse_err.contains(\"failed to parse unix timestamp\"));\n        }\n        {\n            let unix_epoch = 0;\n            let parse_err = parse_timestamp(unix_epoch).unwrap_err();\n            assert!(parse_err.contains(\"failed to parse unix timestamp\"));\n\n            let parse_err = parse_timestamp(MIN_TIMESTAMP_SECONDS << 7).unwrap_err();\n            assert!(parse_err.contains(\"failed to parse unix timestamp\"));\n\n            let parse_err = parse_timestamp(MIN_TIMESTAMP_SECONDS << 17).unwrap_err();\n            assert!(parse_err.contains(\"failed to parse 
unix timestamp\"));\n\n            let parse_err = parse_timestamp(MIN_TIMESTAMP_SECONDS << 27).unwrap_err();\n            assert!(parse_err.contains(\"failed to parse unix timestamp\"));\n        }\n    }\n\n    #[test]\n    fn test_parse_timestamp_min_max_values() {\n        {\n            let min_ts_millis = MIN_TIMESTAMP_SECONDS * 1_000;\n            let date_time = parse_timestamp(min_ts_millis).unwrap();\n            assert_eq!(date_time.into_timestamp_millis(), min_ts_millis);\n\n            let min_ts_micros = MIN_TIMESTAMP_SECONDS * 1_000_000;\n            let date_time = parse_timestamp(min_ts_micros).unwrap();\n            assert_eq!(date_time.into_timestamp_micros(), min_ts_micros);\n\n            let min_ts_nanos = MIN_TIMESTAMP_SECONDS * 1_000_000_000;\n            let date_time = parse_timestamp(min_ts_nanos).unwrap();\n            assert_eq!(date_time.into_timestamp_micros() * 1000, min_ts_nanos);\n        }\n        {\n            let max_ts_seconds = MAX_TIMESTAMP_SECONDS;\n            let date_time = parse_timestamp(max_ts_seconds).unwrap();\n            assert_eq!(date_time.into_timestamp_secs(), max_ts_seconds);\n\n            let max_ts_millis = MAX_TIMESTAMP_SECONDS * 1_000;\n            let date_time = parse_timestamp(max_ts_millis).unwrap();\n            assert_eq!(date_time.into_timestamp_millis(), max_ts_millis);\n\n            let max_ts_micros = MAX_TIMESTAMP_SECONDS * 1_000_000;\n            let date_time = parse_timestamp(max_ts_micros).unwrap();\n            assert_eq!(date_time.into_timestamp_micros(), max_ts_micros);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-datetime/src/java_date_time_format.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::num::NonZeroU8;\nuse std::sync::OnceLock;\n\nuse time::error::{Format, TryFromParsed};\nuse time::format_description::modifier::{\n    Day, Hour, Minute, Month as MonthModifier, Padding, Second, Subsecond, SubsecondDigits,\n    WeekNumber, WeekNumberRepr, Weekday, WeekdayRepr, Year, YearRepr,\n};\nuse time::format_description::{Component, OwnedFormatItem};\nuse time::parsing::Parsed;\nuse time::{Month, OffsetDateTime, PrimitiveDateTime, UtcOffset};\nuse time_fmt::parse::time_format_item::parse_to_format_item;\n\nuse crate::date_time_format;\n\nconst JAVA_DATE_FORMAT_TOKENS: &[&str] = &[\n    \"yyyy\",\n    \"xxxx\",\n    \"SSSSSSSSS\", // For nanoseconds\n    \"SSSSSSS\",   // For microseconds\n    \"SSSSSS\",    // For fractional seconds up to six digits\n    \"SSSSS\",\n    \"SSSS\",\n    \"SSS\",\n    \"SS\",\n    \"ZZ\",\n    \"ww\",\n    \"w[w]\",\n    \"MM\",\n    \"dd\",\n    \"HH\",\n    \"hh\",\n    \"kk\",\n    \"mm\",\n    \"ss\",\n    \"aa\",\n    \"a\",\n    \"w\",\n    \"M\",\n    \"d\",\n    \"H\",\n    \"h\",\n    \"k\",\n    \"m\",\n    \"s\",\n    \"S\",\n    \"Z\",\n    \"e\",\n];\n\nfn literal(s: &[u8]) -> OwnedFormatItem {\n    // builds a boxed slice from a slice\n    let boxed_slice: Box<[u8]> = s.to_vec().into_boxed_slice();\n    
OwnedFormatItem::Literal(boxed_slice)\n}\n\n#[inline]\nfn get_padding(ptn: &str) -> Padding {\n    if ptn.len() == 2 {\n        Padding::Zero\n    } else {\n        Padding::None\n    }\n}\n\nfn build_zone_offset(_: &str) -> Option<OwnedFormatItem> {\n    // 'Z' literal to represent UTC offset\n    let z_literal = OwnedFormatItem::Literal(Box::from(b\"Z\".as_ref()));\n\n    // Offset in '+/-HH:MM' format\n    let offset_with_delimiter_items: Box<[OwnedFormatItem]> = vec![\n        OwnedFormatItem::Component(Component::OffsetHour(Default::default())),\n        OwnedFormatItem::Literal(Box::from(b\":\".as_ref())),\n        OwnedFormatItem::Component(Component::OffsetMinute(Default::default())),\n    ]\n    .into_boxed_slice();\n    let offset_with_delimiter_compound = OwnedFormatItem::Compound(offset_with_delimiter_items);\n\n    // Offset in '+/-HHMM' format\n    let offset_items: Box<[OwnedFormatItem]> = vec![\n        OwnedFormatItem::Component(Component::OffsetHour(Default::default())),\n        OwnedFormatItem::Component(Component::OffsetMinute(Default::default())),\n    ]\n    .into_boxed_slice();\n    let offset_compound = OwnedFormatItem::Compound(offset_items);\n\n    Some(OwnedFormatItem::First(\n        vec![z_literal, offset_with_delimiter_compound, offset_compound].into_boxed_slice(),\n    ))\n}\n\n// There is a `YearRepr::LastTwo` representation in the time crate, but the parser is unreliable, so\n// we only support `YearRepr::Full` for now. 
See also https://github.com/time-rs/time/issues/649.\nconst fn year_item() -> Option<OwnedFormatItem> {\n    let mut year_component = Year::default();\n    year_component.repr = YearRepr::Full;\n    Some(OwnedFormatItem::Component(Component::Year(year_component)))\n}\n\nfn build_month_item(ptn: &str) -> Option<OwnedFormatItem> {\n    let mut month: MonthModifier = Default::default();\n    month.padding = get_padding(ptn);\n    Some(OwnedFormatItem::Component(Component::Month(month)))\n}\n\nfn build_day_item(ptn: &str) -> Option<OwnedFormatItem> {\n    let mut day = Day::default();\n    day.padding = get_padding(ptn);\n    Some(OwnedFormatItem::Component(Component::Day(day)))\n}\n\nfn build_day_of_week_item(_: &str) -> Option<OwnedFormatItem> {\n    let mut weekday = Weekday::default();\n    weekday.repr = WeekdayRepr::Monday;\n    weekday.one_indexed = false;\n    Some(OwnedFormatItem::Component(Component::Weekday(weekday)))\n}\n\nfn build_week_of_year_item(ptn: &str) -> Option<OwnedFormatItem> {\n    let mut week_number = WeekNumber::default();\n    week_number.repr = WeekNumberRepr::Monday;\n    week_number.padding = get_padding(ptn);\n    Some(OwnedFormatItem::Component(Component::WeekNumber(\n        week_number,\n    )))\n}\n\nfn build_hour_item(ptn: &str) -> Option<OwnedFormatItem> {\n    let mut hour = Hour::default();\n    hour.padding = get_padding(ptn);\n    hour.is_12_hour_clock = false;\n    Some(OwnedFormatItem::Component(Component::Hour(hour)))\n}\n\nfn build_minute_item(ptn: &str) -> Option<OwnedFormatItem> {\n    let mut minute: Minute = Default::default();\n    minute.padding = get_padding(ptn);\n    Some(OwnedFormatItem::Component(Component::Minute(minute)))\n}\n\nfn build_second_item(ptn: &str) -> Option<OwnedFormatItem> {\n    let mut second: Second = Default::default();\n    second.padding = get_padding(ptn);\n    Some(OwnedFormatItem::Component(Component::Second(second)))\n}\n\nfn build_fraction_of_second_item(_ptn: &str) -> 
Option<OwnedFormatItem> {\n    let mut subsecond: Subsecond = Default::default();\n    subsecond.digits = SubsecondDigits::OneOrMore;\n    Some(OwnedFormatItem::Component(Component::Subsecond(subsecond)))\n}\n\nfn parse_java_datetime_format_items_recursive(\n    chars: &mut std::iter::Peekable<std::str::Chars>,\n) -> Result<Vec<OwnedFormatItem>, String> {\n    let mut items = Vec::new();\n\n    while let Some(&c) = chars.peek() {\n        match c {\n            '[' => {\n                chars.next();\n                let optional_items = parse_java_datetime_format_items_recursive(chars)?;\n                items.push(OwnedFormatItem::Optional(Box::new(\n                    OwnedFormatItem::Compound(optional_items.into_boxed_slice()),\n                )));\n            }\n            ']' => {\n                chars.next();\n                break;\n            }\n            '\\'' => {\n                chars.next();\n                let mut literal_str = String::new();\n                while let Some(&next_c) = chars.peek() {\n                    if next_c == '\\'' {\n                        chars.next();\n                        break;\n                    } else {\n                        literal_str.push(next_c);\n                        chars.next();\n                    }\n                }\n                items.push(literal(literal_str.as_bytes()));\n            }\n            _ => {\n                if let Some(format_item) = match_java_date_format_token(chars)? 
{\n                    items.push(format_item);\n                } else {\n                    // Treat as a literal character\n                    items.push(literal(c.to_string().as_bytes()));\n                    chars.next();\n                }\n            }\n        }\n    }\n\n    Ok(items)\n}\n\n// Elasticsearch/OpenSearch uses a set of preconfigured formats, more information could be found\n// here https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html\nfn match_java_date_format_token(\n    chars: &mut std::iter::Peekable<std::str::Chars>,\n) -> Result<Option<OwnedFormatItem>, String> {\n    if chars.peek().is_none() {\n        return Ok(None);\n    }\n\n    let remaining: String = chars.clone().collect();\n\n    // Try to match the longest possible token\n    for token in JAVA_DATE_FORMAT_TOKENS {\n        if remaining.starts_with(token) {\n            for _ in 0..token.len() {\n                chars.next();\n            }\n\n            let format_item = match *token {\n                \"yyyy\" | \"xxxx\" => year_item(),\n                \"MM\" | \"M\" => build_month_item(token),\n                \"dd\" | \"d\" => build_day_item(token),\n                \"HH\" | \"H\" => build_hour_item(token),\n                \"mm\" | \"m\" => build_minute_item(token),\n                \"ss\" | \"s\" => build_second_item(token),\n                \"SSSSSSSSS\" | \"SSSSSSS\" | \"SSSSSS\" | \"SSSSS\" | \"SSSS\" | \"SSS\" | \"SS\" | \"S\" => {\n                    build_fraction_of_second_item(token)\n                }\n                \"Z\" => build_zone_offset(token),\n                \"ww\" | \"w[w]\" | \"w\" => build_week_of_year_item(token),\n                \"e\" => build_day_of_week_item(token),\n                _ => return Err(format!(\"unrecognized token '{token}'\")),\n            };\n            return Ok(format_item);\n        }\n    }\n\n    Ok(None)\n}\n\n// Check if the given date time format is a common alias and replace it 
with the\n// Java date format it is mapped to, if any.\n// If the java_datetime_format is not an alias, it is expected to be a\n// java date time format and should be returned as is.\nfn resolve_java_datetime_format_alias(java_datetime_format: &str) -> &str {\n    static JAVA_DATE_FORMAT_ALIASES: OnceLock<HashMap<&'static str, &'static str>> =\n        OnceLock::new();\n    let java_datetime_format_map = JAVA_DATE_FORMAT_ALIASES.get_or_init(|| {\n        let mut m = HashMap::new();\n        m.insert(\"date_optional_time\", \"yyyy-MM-dd['T'HH:mm:ss.SSSZ]\");\n        m.insert(\n            \"strict_date_optional_time\",\n            \"yyyy[-MM[-dd['T'HH[:mm[:ss[.SSS[Z]]]]]]]\",\n        );\n        m.insert(\n            \"strict_date_optional_time_nanos\",\n            \"yyyy[-MM[-dd['T'HH:mm:ss.SSSSSSZ]]]\",\n        );\n        m.insert(\"basic_date\", \"yyyyMMdd\");\n\n        m.insert(\"strict_basic_week_date\", \"xxxx'W'wwe\");\n        m.insert(\"basic_week_date\", \"xxxx'W'wwe\");\n\n        m.insert(\"strict_basic_week_date_time\", \"xxxx'W'wwe'T'HHmmss.SSSZ\");\n        m.insert(\"basic_week_date_time\", \"xxxx'W'wwe'T'HHmmss.SSSZ\");\n\n        m.insert(\n            \"strict_basic_week_date_time_no_millis\",\n            \"xxxx'W'wwe'T'HHmmssZ\",\n        );\n        m.insert(\"basic_week_date_time_no_millis\", \"xxxx'W'wwe'T'HHmmssZ\");\n\n        m.insert(\"strict_week_date\", \"xxxx-'W'ww-e\");\n        m.insert(\"week_date\", \"xxxx-'W'w[w]-e\");\n        m\n    });\n    java_datetime_format_map\n        .get(java_datetime_format)\n        .copied()\n        .unwrap_or(java_datetime_format)\n}\n\n/// A date time parser that holds the format specification as a `Box<[OwnedFormatItem]>`.\n#[derive(Clone)]\npub struct StrptimeParser {\n    pub(crate) strptime_format: String,\n    items: Box<[OwnedFormatItem]>,\n}\n\npub fn parse_java_datetime_format_items(\n    java_datetime_format: &str,\n) -> Result<Box<[OwnedFormatItem]>, String> {\n    let mut chars = 
java_datetime_format.chars().peekable();\n    let items = parse_java_datetime_format_items_recursive(&mut chars)?;\n    Ok(items.into_boxed_slice())\n}\n\nimpl StrptimeParser {\n    /// Parses a date, assuming UTC if no timezone is specified.\n    /// See `parse_date_time_with_default_timezone` for more details.\n    pub fn parse_date_time(&self, date_time_str: &str) -> Result<OffsetDateTime, String> {\n        self.parse_date_time_with_default_timezone(date_time_str, UtcOffset::UTC)\n    }\n\n    /// Parses a date. If no timezone is specified, the timezone passed as\n    /// `default_offset` is assumed. If the time is missing, it is automatically set to 00:00:00.\n    pub fn parse_date_time_with_default_timezone(\n        &self,\n        date_time_str: &str,\n        default_offset: UtcOffset,\n    ) -> Result<OffsetDateTime, String> {\n        let mut parsed = Parsed::new();\n        if !parsed\n            .parse_items(date_time_str.as_bytes(), &self.items)\n            .map_err(|err| err.to_string())?\n            .is_empty()\n        {\n            return Err(format!(\n                \"datetime string `{date_time_str}` does not match strptime format `{}`\",\n                self.strptime_format\n            ));\n        }\n\n        // The parsed datetime contains a date but seems to be missing \"time\".\n        // We complete it artificially with 00:00:00.\n        if parsed.hour_24().is_none()\n            && !(parsed.hour_12().is_some() && parsed.hour_12_is_pm().is_some())\n        {\n            parsed.set_hour_24(0u8);\n            parsed.set_minute(0u8);\n            parsed.set_second(0u8);\n        }\n\n        if parsed.year().is_none() {\n            let now = OffsetDateTime::now_utc();\n            let year = date_time_format::infer_year(parsed.month(), now.month(), now.year());\n            parsed.set_year(year);\n        }\n\n        if parsed.day().is_none() && parsed.monday_week_number().is_none() {\n            
parsed.set_day(NonZeroU8::try_from(1u8).unwrap());\n        }\n\n        if parsed.month().is_none() && parsed.monday_week_number().is_none() {\n            parsed.set_month(Month::January);\n        }\n\n        if parsed.offset_hour().is_some() {\n            let offset_datetime: OffsetDateTime = parsed\n                .try_into()\n                .map_err(|err: TryFromParsed| err.to_string())?;\n            return Ok(offset_datetime);\n        }\n        let primitive_date_time: PrimitiveDateTime = parsed\n            .try_into()\n            .map_err(|err: TryFromParsed| err.to_string())?;\n        Ok(primitive_date_time.assume_offset(default_offset))\n    }\n\n    pub fn format_date_time(&self, date_time: &OffsetDateTime) -> Result<String, Format> {\n        date_time.format(&self.items)\n    }\n\n    pub fn from_strptime(strptime_format: &str) -> Result<StrptimeParser, String> {\n        let items: Box<[OwnedFormatItem]> = parse_to_format_item(strptime_format)\n            .map_err(|err| format!(\"invalid strptime format `{strptime_format}`: {err}\"))?\n            .into_iter()\n            .map(|item| item.into())\n            .collect::<Vec<_>>()\n            .into_boxed_slice();\n        Ok(StrptimeParser::new(strptime_format.to_string(), items))\n    }\n\n    pub fn from_java_datetime_format(java_datetime_format: &str) -> Result<StrptimeParser, String> {\n        let java_datetime_format_resolved =\n            resolve_java_datetime_format_alias(java_datetime_format);\n        let items: Box<[OwnedFormatItem]> =\n            parse_java_datetime_format_items(java_datetime_format_resolved)?;\n        Ok(StrptimeParser::new(java_datetime_format.to_string(), items))\n    }\n\n    fn new(strptime_format: String, items: Box<[OwnedFormatItem]>) -> Self {\n        StrptimeParser {\n            strptime_format,\n            items,\n        }\n    }\n}\n\nimpl PartialEq for StrptimeParser {\n    fn eq(&self, other: &Self) -> bool {\n        self.strptime_format == 
other.strptime_format\n    }\n}\n\nimpl Eq for StrptimeParser {}\n\nimpl std::fmt::Debug for StrptimeParser {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        formatter\n            .debug_struct(\"StrptimeParser\")\n            .field(\"format\", &self.strptime_format)\n            .finish()\n    }\n}\n\nimpl std::hash::Hash for StrptimeParser {\n    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {\n        self.strptime_format.hash(state);\n    }\n}\n\n// `Strftime` format special characters.\n// These characters are taken from the parsing crate we use for compatibility.\nconst STRFTIME_FORMAT_MARKERS: [&str; 36] = [\n    \"%a\", \"%A\", \"%b\", \"%B\", \"%c\", \"%C\", \"%d\", \"%D\", \"%e\", \"%f\", \"%F\", \"%h\", \"%H\", \"%I\", \"%j\", \"%k\",\n    \"%l\", \"%m\", \"%M\", \"%n\", \"%p\", \"%P\", \"%r\", \"%R\", \"%S\", \"%t\", \"%T\", \"%U\", \"%w\", \"%W\", \"%x\", \"%X\",\n    \"%y\", \"%Y\", \"%z\", \"%Z\",\n];\n\n// Checks if a format contains `strftime` special characters.\npub fn is_strftime_formatting(format_str: &str) -> bool {\n    STRFTIME_FORMAT_MARKERS\n        .iter()\n        .any(|marker| format_str.contains(marker))\n}\n\n#[cfg(test)]\nmod tests {\n    use time::macros::datetime;\n\n    use super::*;\n    use crate::java_date_time_format::parse_java_datetime_format_items;\n\n    #[test]\n    fn test_parse_datetime_format_missing_time() {\n        let parser = StrptimeParser::from_strptime(\"%Y-%m-%d\").unwrap();\n        assert_eq!(\n            parser.parse_date_time(\"2021-01-01\").unwrap(),\n            datetime!(2021-01-01 00:00:00 UTC)\n        );\n    }\n\n    #[test]\n    fn test_parse_datetime_format_strict_on_trailing_data() {\n        let parser = StrptimeParser::from_strptime(\"%Y-%m-%d\").unwrap();\n        let error = parser.parse_date_time(\"2021-01-01TABC\").unwrap_err();\n        assert_eq!(\n            error,\n            \"datetime string `2021-01-01TABC` does not match strptime 
format `%Y-%m-%d`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_strptime_with_timezone() {\n        let parser = StrptimeParser::from_strptime(\"%Y-%m-%dT%H:%M:%S %z\").unwrap();\n        let offset_datetime = parser\n            .parse_date_time(\"2021-01-01T11:00:03 +07:00\")\n            .unwrap();\n        assert_eq!(offset_datetime, datetime!(2021-01-01 11:00:03 +7));\n    }\n\n    #[track_caller]\n    fn test_parse_java_datetime_aux(\n        java_date_time_format: &str,\n        date_str: &str,\n        expected_datetime: OffsetDateTime,\n    ) {\n        let parser = StrptimeParser::from_java_datetime_format(java_date_time_format).unwrap();\n        let datetime = parser.parse_date_time(date_str).unwrap();\n        assert_eq!(datetime, expected_datetime);\n    }\n\n    #[test]\n    fn test_parse_java_datetime_format() {\n        test_parse_java_datetime_aux(\"yyyyMMdd\", \"20210101\", datetime!(2021-01-01 00:00:00 UTC));\n        test_parse_java_datetime_aux(\n            \"yyyy MM dd\",\n            \"2021 01 01\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"yyyy!MM?dd\",\n            \"2021!01?01\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"yyyy!MM?dd'T'HH:\",\n            \"2021!01?01T13:\",\n            datetime!(2021-01-01 13:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"yyyy!MM?dd['T'[HH:]]\",\n            \"2021!01?01\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"yyyy!MM?dd['T'[HH:]\",\n            \"2021!01?01T\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"yyyy!MM?dd['T'[HH:]]\",\n            \"2021!01?01T13:\",\n            datetime!(2021-01-01 13:00:00 UTC),\n        );\n    }\n\n    #[test]\n    fn 
test_parse_java_missing_time() {\n        test_parse_java_datetime_aux(\n            \"yyyy-MM-dd\",\n            \"2021-01-01\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n    }\n\n    #[test]\n    fn test_parse_java_optional_missing_time() {\n        test_parse_java_datetime_aux(\n            \"yyyy-MM-dd[ HH:mm:ss]\",\n            \"2021-01-01\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"yyyy-MM-dd[ HH:mm:ss]\",\n            \"2021-01-01 12:34:56\",\n            datetime!(2021-01-01 12:34:56 UTC),\n        );\n    }\n\n    #[test]\n    fn test_parse_java_datetime_format_aliases() {\n        test_parse_java_datetime_aux(\n            \"date_optional_time\",\n            \"2021-01-01\",\n            datetime!(2021-01-01 00:00:00 UTC),\n        );\n        test_parse_java_datetime_aux(\n            \"date_optional_time\",\n            \"2021-01-21T03:01:22.312+01:00\",\n            datetime!(2021-01-21 03:01:22.312 +1),\n        );\n    }\n\n    #[test]\n    fn test_parse_java_week_formats() {\n        test_parse_java_datetime_aux(\n            \"basic_week_date\",\n            \"2024W313\",\n            datetime!(2024-08-01 0:00:00.0 +00:00:00),\n        );\n        let parser = StrptimeParser::from_java_datetime_format(\"basic_week_date\").unwrap();\n        parser.parse_date_time(\"24W313\").unwrap_err();\n\n        let parser = StrptimeParser::from_java_datetime_format(\"basic_week_date\").unwrap();\n        parser.parse_date_time(\"1W313\").unwrap_err();\n\n        test_parse_java_datetime_aux(\n            \"basic_week_date_time\",\n            \"2018W313T121212.1Z\",\n            datetime!(2018-08-02 12:12:12.1 +00:00:00),\n        );\n        test_parse_java_datetime_aux(\n            \"basic_week_date_time\",\n            \"2018W313T121212.123Z\",\n            datetime!(2018-08-02 12:12:12.123 +00:00:00),\n        );\n        test_parse_java_datetime_aux(\n          
  \"basic_week_date_time\",\n            \"2018W313T121212.123456789Z\",\n            datetime!(2018-08-02 12:12:12.123456789 +00:00:00),\n        );\n        test_parse_java_datetime_aux(\n            \"basic_week_date_time\",\n            \"2018W313T121212.123+0100\",\n            datetime!(2018-08-02 12:12:12.123 +01:00:00),\n        );\n        test_parse_java_datetime_aux(\n            \"basic_week_date_time_no_millis\",\n            \"2018W313T121212Z\",\n            datetime!(2018-08-02 12:12:12.0 +00:00:00),\n        );\n        test_parse_java_datetime_aux(\n            \"basic_week_date_time_no_millis\",\n            \"2018W313T121212+0100\",\n            datetime!(2018-08-02 12:12:12.0 +01:00:00),\n        );\n        test_parse_java_datetime_aux(\n            \"basic_week_date_time_no_millis\",\n            \"2018W313T121212+01:00\",\n            datetime!(2018-08-02 12:12:12.0 +01:00:00),\n        );\n\n        test_parse_java_datetime_aux(\n            \"week_date\",\n            \"2012-W48-6\",\n            datetime!(2012-12-02 0:00:00.0 +00:00:00),\n        );\n\n        test_parse_java_datetime_aux(\n            \"week_date\",\n            \"2012-W01-6\",\n            datetime!(2012-01-08 0:00:00.0 +00:00:00),\n        );\n\n        test_parse_java_datetime_aux(\n            \"week_date\",\n            \"2012-W1-6\",\n            datetime!(2012-01-08 0:00:00.0 +00:00:00),\n        );\n    }\n\n    #[test]\n    fn test_parse_java_strict_week_formats() {\n        test_parse_java_datetime_aux(\n            \"strict_basic_week_date\",\n            \"2024W313\",\n            datetime!(2024-08-01 0:00:00.0 +00:00:00),\n        );\n\n        test_parse_java_datetime_aux(\n            \"strict_week_date\",\n            \"2012-W48-6\",\n            datetime!(2012-12-02 0:00:00.0 +00:00:00),\n        );\n\n        test_parse_java_datetime_aux(\n            \"strict_week_date\",\n            \"2012-W01-6\",\n            datetime!(2012-01-08 0:00:00.0 
+00:00:00),\n        );\n    }\n\n    #[test]\n    fn test_parse_strict_date_optional_time() {\n        let parser =\n            StrptimeParser::from_java_datetime_format(\"strict_date_optional_time\").unwrap();\n        let dates = [\n            \"2019\",\n            \"2019-03\",\n            \"2019-03-23\",\n            \"2019-03-23T21:34\",\n            \"2019-03-23T21:34:46\",\n            \"2019-03-23T21:34:46.123Z\",\n            \"2019-03-23T21:35:46.123+00:00\",\n            \"2019-03-23T21:36:46.123+03:00\",\n            \"2019-03-23T21:37:46.123+0300\",\n        ];\n        let expected = [\n            datetime!(2019-01-01 00:00:00 UTC),\n            datetime!(2019-03-01 00:00:00 UTC),\n            datetime!(2019-03-23 00:00:00 UTC),\n            datetime!(2019-03-23 21:34 UTC),\n            datetime!(2019-03-23 21:34:46 UTC),\n            datetime!(2019-03-23 21:34:46.123 UTC),\n            datetime!(2019-03-23 21:35:46.123 UTC),\n            datetime!(2019-03-23 21:36:46.123 +03:00:00),\n            datetime!(2019-03-23 21:37:46.123 +03:00:00),\n        ];\n        for (date_str, &expected_dt) in dates.iter().zip(expected.iter()) {\n            let parsed_dt = parser\n                .parse_date_time(date_str)\n                .unwrap_or_else(|error| panic!(\"failed to parse {date_str}: {error}\"));\n            assert_eq!(parsed_dt, expected_dt);\n        }\n    }\n\n    #[test]\n    fn test_parse_strict_date_optional_time_nanos() {\n        let parser =\n            StrptimeParser::from_java_datetime_format(\"strict_date_optional_time_nanos\").unwrap();\n        let dates = [\n            \"2019\",\n            \"2019-03\",\n            \"2019-03-23\",\n            \"2019-03-23T21:34:46.123456789Z\",\n            \"2019-03-23T21:35:46.123456789+00:00\",\n            \"2019-03-23T21:36:46.123456789+03:00\",\n            \"2019-03-23T21:37:46.123456789+0300\",\n        ];\n        let expected = [\n            datetime!(2019-01-01 00:00:00 UTC),\n   
         datetime!(2019-03-01 00:00:00 UTC),\n            datetime!(2019-03-23 00:00:00 UTC),\n            datetime!(2019-03-23 21:34:46.123456789 UTC),\n            datetime!(2019-03-23 21:35:46.123456789 UTC),\n            datetime!(2019-03-23 21:36:46.123456789 +03:00:00),\n            datetime!(2019-03-23 21:37:46.123456789 +03:00:00),\n        ];\n        for (date_str, &expected_dt) in dates.iter().zip(expected.iter()) {\n            let parsed_dt = parser\n                .parse_date_time(date_str)\n                .unwrap_or_else(|error| panic!(\"failed to parse {date_str}: {error}\"));\n            assert_eq!(parsed_dt, expected_dt);\n        }\n    }\n\n    #[test]\n    fn test_parse_java_datetime_format_items() {\n        let format_str = \"xxxx'W'wwe\";\n        let result = parse_java_datetime_format_items(format_str).unwrap();\n\n        // We expect the tokens to be parsed as:\n        // - 'xxxx' (week-based year)\n        // - 'W' (literal)\n        // - 'ww' (week of year)\n        // - 'e' (day of week)\n\n        assert_eq!(result.len(), 4);\n\n        // Verify each token\n        match &result[0] {\n            OwnedFormatItem::Component(Component::Year(year)) => {\n                assert_eq!(year.repr, YearRepr::Full);\n            }\n            unexpected => panic!(\"expected Year, but found: {unexpected:?}\",),\n        }\n        match &result[1] {\n            OwnedFormatItem::Literal(lit) => assert_eq!(lit.as_ref(), b\"W\"),\n            unexpected => panic!(\"expected literal 'W', but found: {unexpected:?}\"),\n        }\n        match &result[2] {\n            OwnedFormatItem::Component(Component::WeekNumber(_)) => {}\n            unexpected => panic!(\"expected WeekNumber component, but found: {unexpected:?}\"),\n        }\n        match &result[3] {\n            OwnedFormatItem::Component(Component::Weekday(_)) => {}\n            unexpected => panic!(\"expected Weekday component, but found: {unexpected:?}\"),\n        }\n    }\n\n   
 #[test]\n    fn test_parse_java_datetime_format_with_literals() {\n        let format = \"yyyy'T'Z-HHuu\";\n        let parser = StrptimeParser::from_java_datetime_format(format).unwrap();\n\n        let test_cases = [\n            (\"2023TZ-14uu\", datetime!(2023-01-01 14:00:00 UTC)),\n            (\"2024TZ-05uu\", datetime!(2024-01-01 05:00:00 UTC)),\n            (\"2025TZ-23uu\", datetime!(2025-01-01 23:00:00 UTC)),\n        ];\n\n        for (input, expected) in test_cases.iter() {\n            let result = parser.parse_date_time(input).unwrap();\n            assert_eq!(result, *expected, \"failed to parse {input}\");\n        }\n\n        // Test error case\n        let error_case = \"2023-1430\";\n        assert!(\n            parser.parse_date_time(error_case).is_err(),\n            \"expected error for input: {error_case}\",\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-datetime/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod date_time_format;\nmod date_time_parsing;\npub mod java_date_time_format;\n\npub use date_time_format::{DateTimeInputFormat, DateTimeOutputFormat};\npub use date_time_parsing::{\n    parse_date_time_str, parse_timestamp, parse_timestamp_float, parse_timestamp_int,\n};\npub use java_date_time_format::StrptimeParser;\npub use tantivy::DateTime as TantivyDateTime;\n"
  },
  {
    "path": "quickwit/quickwit-directories/Cargo.toml",
    "content": "[package]\nname = \"quickwit-directories\"\ndescription = \"Custom `tantivy::Directory` implementations for Quickwit\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\npostcard = { workspace = true }\nserde = { workspace = true }\ntantivy = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[dev-dependencies]\ntempfile = { workspace = true }\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/bundle_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::TryInto;\nuse std::fmt::Debug;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse quickwit_storage::{BundleStorageFileOffsets, OwnedBytes, Storage, StorageResult};\nuse tantivy::directory::error::OpenReadError;\nuse tantivy::directory::{FileHandle, FileSlice};\nuse tantivy::{Directory, HasLen};\n\n/// BundleDirectory is a read-only directory that makes it possible to\n/// open a split and serve the file it contains via tantivy's `Directory`.\n///\n/// It is the `Directory` equivalent of `BundleStorage`.\n///\n/// Split Format:\n/// `[Files][FilesMetadata][FilesMetadata length 8 byte Little endian][Hotcache][Hotcache length 8\n/// byte Little endian]`\n#[derive(Clone)]\npub struct BundleDirectory {\n    file: FileSlice,\n    file_offsets: BundleStorageFileOffsets,\n}\n\nimpl Debug for BundleDirectory {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"BundleDirectory\")\n    }\n}\n\n/// Loads the split footer from a storage and path.\n///\n/// Returns (SplitFooter, BundleFooter)\n/// SplitFooter [BundleMetadata, BundleMetadata Len, Hotcache, Hotcache len]\n/// BundleFooter [BundleMetadata, BundleMetadata Len]\npub async fn read_split_footer(\n    storage: Arc<dyn Storage>,\n    path: &Path,\n) -> StorageResult<(OwnedBytes, OwnedBytes)> {\n    let file_len = 
storage.file_num_bytes(path).await? as usize;\n\n    let hotcache_len_bytes = storage.get_slice(path, file_len - 8..file_len).await?;\n    let hotcache_len = u64::from_le_bytes(hotcache_len_bytes.as_ref().try_into().unwrap()) as usize;\n\n    let second_footer_start = file_len - 8 - hotcache_len - 8;\n    let second_footer_bytes = storage\n        .get_slice(path, second_footer_start..second_footer_start + 8)\n        .await?;\n    let second_footer_len =\n        u64::from_le_bytes(second_footer_bytes.as_ref().try_into().unwrap()) as usize;\n\n    let split_footer = storage\n        .get_slice(path, second_footer_start - second_footer_len..file_len)\n        .await?;\n    let only_bundle_footer = split_footer.slice(0..second_footer_len + 8);\n\n    Ok((split_footer, only_bundle_footer))\n}\n\n/// Returns two slices for the given split: `[body and bundle metadata] [hotcache]`\nfn split_footer(file_slice: FileSlice) -> io::Result<(FileSlice, FileSlice)> {\n    let (body_and_footer_slice, footer_len_slice) = file_slice.split_from_end(4);\n    let footer_len_bytes = footer_len_slice.read_bytes()?;\n    let footer_len = u32::from_le_bytes(footer_len_bytes.as_slice().try_into().unwrap());\n    Ok(body_and_footer_slice.split_from_end(footer_len as usize))\n}\n\n/// Returns the hotcache bytes of the given split.\npub fn get_hotcache_from_split(data: OwnedBytes) -> io::Result<OwnedBytes> {\n    let split_file = FileSlice::new(Arc::new(data));\n    let (_, hotcache) = split_footer(split_file)?;\n    hotcache.read_bytes()\n}\n\nimpl BundleDirectory {\n    /// Get files and their sizes in a split.\n    pub fn get_stats_split(data: OwnedBytes) -> anyhow::Result<Vec<(PathBuf, u64)>> {\n        let split_file = FileSlice::new(Arc::new(data));\n        let (body_and_bundle_metadata, hot_cache) = split_footer(split_file)?;\n        let file_offsets = BundleStorageFileOffsets::open(body_and_bundle_metadata)?;\n\n        let mut files_and_size: Vec<(_, _)> = 
file_offsets\n            .files\n            .into_iter()\n            .map(|(file, range)| (file, range.end - range.start))\n            .collect();\n\n        files_and_size.push((\n            PathBuf::from(\"hotcache\".to_string()),\n            hot_cache.len() as u64,\n        ));\n\n        files_and_size.sort();\n        Ok(files_and_size)\n    }\n\n    /// Opens a split file.\n    pub fn open_split(split_file: FileSlice) -> io::Result<BundleDirectory> {\n        // First we remove the hotcache from our file slice.\n        let (body_and_bundle_metadata, _hot_cache) = split_footer(split_file)?;\n        BundleDirectory::open_bundle(body_and_bundle_metadata).map_err(io::Error::other)\n    }\n\n    /// Opens a BundleDirectory, given a file containing the bundle data.\n    pub fn open_bundle(file: FileSlice) -> anyhow::Result<BundleDirectory> {\n        let file_offsets = BundleStorageFileOffsets::open(file.clone())?;\n        Ok(BundleDirectory { file, file_offsets })\n    }\n}\n\nimpl Directory for BundleDirectory {\n    fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn FileHandle>, OpenReadError> {\n        let file_slice = self.open_read(path)?;\n        Ok(Arc::new(file_slice))\n    }\n\n    fn open_read(&self, path: &Path) -> Result<FileSlice, OpenReadError> {\n        let byte_range = self\n            .file_offsets\n            .get(path)\n            .ok_or_else(|| OpenReadError::FileDoesNotExist(path.to_path_buf()))?;\n        Ok(self\n            .file\n            .slice(byte_range.start as usize..byte_range.end as usize))\n    }\n\n    fn atomic_read(&self, path: &Path) -> Result<Vec<u8>, OpenReadError> {\n        let file_slice = self.open_read(path)?;\n        let payload = file_slice\n            .read_bytes()\n            .map_err(|io_error| OpenReadError::wrap_io_error(io_error, path.to_path_buf()))?;\n        Ok(payload.to_vec())\n    }\n\n    fn exists(&self, path: &Path) -> Result<bool, OpenReadError> {\n        
Ok(self.file_offsets.exists(path))\n    }\n\n    crate::read_only_directory!();\n}\n\n#[cfg(test)]\nmod tests {\n    use std::fs::File;\n    use std::io::Write;\n\n    use quickwit_common::shared_consts::SPLIT_FIELDS_FILE_NAME;\n    use quickwit_storage::{PutPayload, SplitPayloadBuilder};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_bundle_directory_stats() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(&[123, 76])?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(&[99, 55, 44])?;\n\n        let split_streamer = SplitPayloadBuilder::get_split_payload(\n            &[test_filepath1.clone(), test_filepath2.clone()],\n            &[],\n            &[\n                1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,\n            ],\n        )?;\n\n        let buffer = split_streamer.read_all().await?;\n\n        // check stats\n        let stats = BundleDirectory::get_stats_split(buffer)?;\n\n        assert_eq!(stats[0], (PathBuf::from(\"f1\".to_string()), 2_u64));\n        assert_eq!(stats[1], (PathBuf::from(\"f2\".to_string()), 3_u64));\n        assert_eq!(stats[2], (PathBuf::from(\"hotcache\".to_string()), 18_u64));\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_bundle_directory() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(&[123, 76])?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(&[99, 55, 44])?;\n\n        let split_streamer = SplitPayloadBuilder::get_split_payload(\n            
&[test_filepath1.clone(), test_filepath2.clone()],\n            &[],\n            &[\n                1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,\n            ],\n        )?;\n\n        let buffer = split_streamer.read_all().await?;\n\n        let bundle_file_slice = FileSlice::from(buffer.to_vec());\n\n        let bundle_dir = BundleDirectory::open_split(bundle_file_slice)?;\n\n        assert!(bundle_dir.exists(Path::new(\"f1\")).unwrap());\n        assert!(bundle_dir.exists(Path::new(\"f2\")).unwrap());\n        assert!(!bundle_dir.exists(Path::new(\"f3\")).unwrap());\n\n        let f1_data = bundle_dir.atomic_read(Path::new(\"f1\"))?;\n        assert_eq!(&*f1_data, &[123u8, 76u8]);\n\n        let f2_data = bundle_dir.atomic_read(Path::new(\"f2\"))?;\n        assert_eq!(&f2_data[..], &[99, 55, 44]);\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_stream_split_to_bundle_and_open() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(&[123, 76])?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(&[99, 55, 44])?;\n\n        let split_streamer = SplitPayloadBuilder::get_split_payload(\n            &[test_filepath1.clone(), test_filepath2.clone()],\n            &[5, 5, 5],\n            &[1, 2, 3],\n        )?;\n\n        let data = split_streamer.read_all().await?;\n\n        let bundle_dir = BundleDirectory::open_split(FileSlice::from(data.to_vec()))?;\n\n        let field_data = bundle_dir.atomic_read(Path::new(SPLIT_FIELDS_FILE_NAME))?;\n        assert_eq!(&*field_data, &[5, 5, 5]);\n\n        let f1_data = bundle_dir.atomic_read(Path::new(\"f1\"))?;\n        assert_eq!(&*f1_data, &[123u8, 76u8]);\n\n        let f2_data = bundle_dir.atomic_read(Path::new(\"f2\"))?;\n        
assert_eq!(&f2_data[..], &[99, 55, 44]);\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/caching_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse async_trait::async_trait;\nuse quickwit_storage::ByteRangeCache;\nuse tantivy::directory::error::OpenReadError;\nuse tantivy::directory::{FileHandle, OwnedBytes};\nuse tantivy::{Directory, HasLen};\n\n/// The caching directory is a simple cache that wraps another directory.\n#[derive(Clone)]\npub struct CachingDirectory {\n    underlying: Arc<dyn Directory>,\n    // TODO fixme: that's a pretty ugly cache we have here.\n    cache: ByteRangeCache,\n}\n\nimpl CachingDirectory {\n    /// Creates a new CachingDirectory.\n    ///\n    /// Warning: The resulting CachingDirectory will cache all information without ever\n    /// removing any item from the cache.\n    pub fn new_unbounded(underlying: Arc<dyn Directory>) -> CachingDirectory {\n        let byte_range_cache = ByteRangeCache::with_infinite_capacity(\n            &quickwit_storage::STORAGE_METRICS.shortlived_cache,\n        );\n        CachingDirectory::new(underlying, byte_range_cache)\n    }\n\n    /// Creates a new CachingDirectory backed by the given cache.\n    ///\n    /// Warning: The resulting CachingDirectory will cache all information without ever\n    /// removing any item from the cache.\n    pub fn new(underlying: Arc<dyn Directory>, cache: ByteRangeCache) -> CachingDirectory {\n        CachingDirectory { 
underlying, cache }\n    }\n}\n\nimpl fmt::Debug for CachingDirectory {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"CachingDirectory({:?})\", self.underlying)\n    }\n}\n\nstruct CachingFileHandle {\n    path: PathBuf,\n    cache: ByteRangeCache,\n    underlying_filehandle: Arc<dyn FileHandle>,\n}\n\nimpl fmt::Debug for CachingFileHandle {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            f,\n            \"CachingFileHandle(path={:?}, underlying={:?})\",\n            &self.path,\n            self.underlying_filehandle.as_ref()\n        )\n    }\n}\n\n#[async_trait]\nimpl FileHandle for CachingFileHandle {\n    fn read_bytes(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        if let Some(bytes) = self.cache.get_slice(&self.path, byte_range.clone()) {\n            return Ok(bytes);\n        }\n        let owned_bytes = self.underlying_filehandle.read_bytes(byte_range.clone())?;\n        self.cache\n            .put_slice(self.path.clone(), byte_range, owned_bytes.clone());\n        Ok(owned_bytes)\n    }\n\n    async fn read_bytes_async(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        if let Some(owned_bytes) = self.cache.get_slice(&self.path, byte_range.clone()) {\n            return Ok(owned_bytes);\n        }\n        let read_bytes = self\n            .underlying_filehandle\n            .read_bytes_async(byte_range.clone())\n            .await?;\n        self.cache\n            .put_slice(self.path.clone(), byte_range, read_bytes.clone());\n        Ok(read_bytes)\n    }\n}\n\nimpl HasLen for CachingFileHandle {\n    fn len(&self) -> usize {\n        self.underlying_filehandle.len()\n    }\n}\n\nimpl Directory for CachingDirectory {\n    fn exists(&self, path: &Path) -> std::result::Result<bool, OpenReadError> {\n        self.underlying.exists(path)\n    }\n\n    fn get_file_handle(\n        &self,\n        path: &Path,\n    ) -> 
std::result::Result<Arc<dyn FileHandle>, OpenReadError> {\n        let underlying_filehandle = self.underlying.get_file_handle(path)?;\n        let caching_file_handle = CachingFileHandle {\n            path: path.to_path_buf(),\n            cache: self.cache.clone(),\n            underlying_filehandle,\n        };\n        Ok(Arc::new(caching_file_handle))\n    }\n\n    fn atomic_read(&self, path: &Path) -> std::result::Result<Vec<u8>, OpenReadError> {\n        let file_handle = self.get_file_handle(path)?;\n        let len = file_handle.len();\n        let owned_bytes = file_handle\n            .read_bytes(0..len)\n            .map_err(|io_error| OpenReadError::wrap_io_error(io_error, path.to_path_buf()))?;\n        Ok(owned_bytes.as_slice().to_vec())\n    }\n\n    crate::read_only_directory!();\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::path::Path;\n    use std::sync::Arc;\n\n    use tantivy::Directory;\n    use tantivy::directory::RamDirectory;\n\n    use super::CachingDirectory;\n    use crate::DebugProxyDirectory;\n\n    #[test]\n    fn test_caching_directory() -> tantivy::Result<()> {\n        let ram_directory = RamDirectory::default();\n        let test_path = Path::new(\"test\");\n        ram_directory.atomic_write(test_path, &b\"test\"[..])?;\n        let debug_proxy_directory = Arc::new(DebugProxyDirectory::wrap(ram_directory));\n        let caching_directory = CachingDirectory::new_unbounded(debug_proxy_directory.clone());\n        caching_directory.atomic_read(test_path)?;\n        caching_directory.atomic_read(test_path)?;\n        assert_eq!(debug_proxy_directory.drain_read_operations().count(), 1);\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/debug_proxy_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::{Arc, Mutex};\nuse std::time::{Duration, Instant};\nuse std::{fmt, io, mem};\n\nuse async_trait::async_trait;\nuse tantivy::directory::error::OpenReadError;\nuse tantivy::directory::{FileHandle, OwnedBytes};\nuse tantivy::{Directory, HasLen};\nuse time::OffsetDateTime;\n\n#[derive(Clone, Default)]\nstruct OperationBuffer(Arc<Mutex<Vec<ReadOperation>>>);\n\nimpl fmt::Debug for OperationBuffer {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"OperationBuffer\")\n    }\n}\n\nimpl OperationBuffer {\n    fn drain(&self) -> impl Iterator<Item = ReadOperation> + 'static {\n        let mut guard = self.0.lock().expect(\"Mutex poisoned\");\n        let ops: Vec<ReadOperation> = mem::take(guard.as_mut());\n        ops.into_iter()\n    }\n\n    fn push(&self, read_operation: ReadOperation) {\n        let mut guard = self.0.lock().expect(\"Mutex poisoned\");\n        guard.push(read_operation);\n    }\n}\n\n/// A ReadOperation records meta data about a read operation.\n/// It is recorded by the `DebugProxyDirectory`.\n#[derive(Clone, Debug, Eq, PartialEq)]\npub struct ReadOperation {\n    /// Path that was read\n    pub path: PathBuf,\n    /// If fetching a range of data, the start offset, else 0.\n    pub offset: usize,\n    /// The number of bytes 
fetched\n    pub num_bytes: usize,\n    /// The date at which the operation was performed (UTC timezone).\n    pub start_date: OffsetDateTime,\n    /// The elapsed time to run the read operation.\n    pub duration: Duration,\n}\n\nstruct ReadOperationBuilder {\n    start_date: OffsetDateTime,\n    start_instant: Instant,\n    path: PathBuf,\n    offset: usize,\n}\n\nimpl ReadOperationBuilder {\n    pub fn new(path: &Path) -> Self {\n        let start_instant = Instant::now();\n        let start_date = OffsetDateTime::now_utc();\n        ReadOperationBuilder {\n            start_date,\n            start_instant,\n            path: path.to_path_buf(),\n            offset: 0,\n        }\n    }\n\n    pub fn with_offset(self, offset: usize) -> Self {\n        ReadOperationBuilder {\n            start_date: self.start_date,\n            start_instant: self.start_instant,\n            path: self.path,\n            offset,\n        }\n    }\n\n    fn terminate(self, num_bytes: usize) -> ReadOperation {\n        let duration = self.start_instant.elapsed();\n        ReadOperation {\n            path: self.path.clone(),\n            offset: self.offset,\n            num_bytes,\n            start_date: self.start_date,\n            duration,\n        }\n    }\n}\n\n/// The debug proxy wraps another directory and simply acts as a proxy\n/// recording all of its read operations.\n///\n/// It has two purposes:\n/// - It is used when building our hotcache, to identify the file sections that should be in the\n///   hotcache.\n/// - It is used in the search-api to provide debugging/performance information.\n#[derive(Debug)]\npub struct DebugProxyDirectory<D: Directory> {\n    underlying: Arc<D>,\n    operations: OperationBuffer,\n}\n\nimpl<D: Directory> Clone for DebugProxyDirectory<D> {\n    fn clone(&self) -> Self {\n        DebugProxyDirectory {\n            underlying: self.underlying.clone(),\n            operations: self.operations.clone(),\n        }\n    }\n}\n\nimpl<D: 
Directory> DebugProxyDirectory<D> {\n    /// Wraps another directory to log all of its read operations.\n    pub fn wrap(directory: D) -> Self {\n        DebugProxyDirectory {\n            underlying: Arc::new(directory),\n            operations: OperationBuffer::default(),\n        }\n    }\n\n    /// Returns all of the existing read operations.\n    ///\n    /// Calling this \"drains\" the existing queue of operations.\n    pub fn drain_read_operations(&self) -> impl Iterator<Item = ReadOperation> + '_ {\n        self.operations.drain()\n    }\n\n    /// Adds a new operation\n    fn register(&self, read_op: ReadOperation) {\n        self.operations.push(read_op);\n    }\n\n    /// Adds a new operation in an async fashion.\n    async fn register_async(&self, read_op: ReadOperation) {\n        self.operations.push(read_op);\n    }\n}\n\nstruct DebugProxyFileHandle<D: Directory> {\n    directory: DebugProxyDirectory<D>,\n    underlying: Arc<dyn FileHandle>,\n    path: PathBuf,\n}\n\n#[async_trait]\nimpl<D: Directory> FileHandle for DebugProxyFileHandle<D> {\n    fn read_bytes(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        let read_operation_builder =\n            ReadOperationBuilder::new(&self.path).with_offset(byte_range.start);\n        let payload = self.underlying.read_bytes(byte_range)?;\n        let read_operation = read_operation_builder.terminate(payload.len());\n        self.directory.register(read_operation);\n        Ok(payload)\n    }\n\n    async fn read_bytes_async(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        let read_operation_builder =\n            ReadOperationBuilder::new(&self.path).with_offset(byte_range.start);\n        let payload = self.underlying.read_bytes_async(byte_range).await?;\n        let read_operation = read_operation_builder.terminate(payload.len());\n        self.directory.register_async(read_operation).await;\n        Ok(payload)\n    }\n}\n\nimpl<D: Directory> fmt::Debug for 
DebugProxyFileHandle<D> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"DebugProxyFileHandle({:?})\", &self.underlying)\n    }\n}\n\nimpl<D: Directory> HasLen for DebugProxyFileHandle<D> {\n    fn len(&self) -> usize {\n        self.underlying.len()\n    }\n}\n\nimpl<D: Directory> Directory for DebugProxyDirectory<D> {\n    fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn FileHandle>, OpenReadError> {\n        let underlying = self.underlying.get_file_handle(path)?;\n        Ok(Arc::new(DebugProxyFileHandle {\n            underlying,\n            directory: self.clone(),\n            path: path.to_owned(),\n        }))\n    }\n\n    fn exists(&self, path: &Path) -> Result<bool, OpenReadError> {\n        self.underlying.exists(path)\n    }\n\n    fn atomic_read(&self, path: &Path) -> Result<Vec<u8>, OpenReadError> {\n        let read_operation_builder = ReadOperationBuilder::new(path);\n        let payload = self.underlying.atomic_read(path)?;\n        let read_operation = read_operation_builder.terminate(payload.len());\n        self.register(read_operation);\n        Ok(payload.to_vec())\n    }\n\n    crate::read_only_directory!();\n}\n\n#[cfg(test)]\nmod tests {\n    use std::io::Write;\n    use std::path::Path;\n\n    use tantivy::Directory;\n    use tantivy::directory::{RamDirectory, TerminatingWrite};\n\n    use super::DebugProxyDirectory;\n\n    const TEST_PATH: &str = \"test.file\";\n    const TEST_PAYLOAD: &[u8] = b\"hello happy tax payer\";\n\n    fn make_test_directory() -> tantivy::Result<RamDirectory> {\n        let ram_directory = RamDirectory::create();\n        let mut wrt = ram_directory.open_write(Path::new(TEST_PATH))?;\n        wrt.write_all(TEST_PAYLOAD)?;\n        wrt.flush()?;\n        wrt.terminate()?;\n        Ok(ram_directory)\n    }\n\n    #[test]\n    fn test_debug_proxy_atomic_read() -> tantivy::Result<()> {\n        let debug_proxy = DebugProxyDirectory::wrap(make_test_directory()?);\n        
let test_path = Path::new(TEST_PATH);\n        let read_data = debug_proxy.atomic_read(test_path)?;\n        assert_eq!(&read_data[..], TEST_PAYLOAD);\n        let operations: Vec<crate::ReadOperation> = debug_proxy.drain_read_operations().collect();\n        println!(\"operations {operations:?}\");\n        assert_eq!(operations.len(), 1);\n        let op0 = &operations[0];\n        assert_eq!(op0.offset, 0);\n        assert_eq!(op0.num_bytes, 21);\n        assert_eq!(op0.path, test_path);\n        Ok(())\n    }\n\n    #[test]\n    fn test_debug_proxy_open_read_read_sync() -> tantivy::Result<()> {\n        let test_path = Path::new(TEST_PATH);\n        let debug_proxy = DebugProxyDirectory::wrap(make_test_directory()?);\n        let read_data = debug_proxy.open_read(test_path)?;\n        assert_eq!(read_data.read_bytes_slice(1..3)?.as_slice(), b\"el\");\n        let operations: Vec<crate::ReadOperation> = debug_proxy.drain_read_operations().collect();\n        assert_eq!(operations.len(), 1);\n        let op = &operations[0];\n        assert_eq!(op.path, test_path);\n        assert_eq!(op.offset, 1);\n        assert_eq!(op.num_bytes, 2);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_debug_proxy_open_read_read_async() {\n        let test_path = Path::new(TEST_PATH);\n        let debug_proxy = DebugProxyDirectory::wrap(make_test_directory().unwrap());\n        let read_data = debug_proxy.open_read(test_path).unwrap();\n        assert_eq!(\n            read_data\n                .read_bytes_slice_async(1..3)\n                .await\n                .unwrap()\n                .as_slice(),\n            b\"el\"\n        );\n        let operations: Vec<crate::ReadOperation> = debug_proxy.drain_read_operations().collect();\n        assert_eq!(operations.len(), 1);\n        let op = &operations[0];\n        assert_eq!(op.path, test_path);\n        assert_eq!(op.offset, 1);\n        assert_eq!(op.num_bytes, 2);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/hot_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse anyhow::{Context, bail};\nuse async_trait::async_trait;\nuse quickwit_storage::VersionedComponent;\nuse serde::{Deserialize, Serialize};\nuse tantivy::directory::error::OpenReadError;\nuse tantivy::directory::{FileHandle, FileSlice, OwnedBytes};\nuse tantivy::error::DataCorruption;\nuse tantivy::{Directory, HasLen, Index, IndexReader, ReloadPolicy, TantivyError};\n\nuse crate::{CachingDirectory, DebugProxyDirectory};\n\n#[derive(Clone, Copy, Default)]\n#[repr(u32)]\npub enum HotDirectoryVersions {\n    #[default]\n    V1 = 1,\n}\n\nimpl VersionedComponent for HotDirectoryVersions {\n    const MAGIC_NUMBER: u32 = 2_557_869_106u32;\n    type Component = HotDirectoryMeta;\n\n    fn to_version_code(self) -> u32 {\n        self as u32\n    }\n\n    fn try_from_version_code_impl(code: u32) -> Option<Self> {\n        match code {\n            1u32 => Some(Self::V1),\n            _ => None,\n        }\n    }\n\n    fn deserialize_impl(&self, bytes: &mut OwnedBytes) -> anyhow::Result<HotDirectoryMeta> {\n        match self {\n            Self::V1 => {\n                if bytes.len() < 4 {\n                    bail!(\"data too short (len={})\", bytes.len());\n                }\n                let len = 
bytes.read_u32() as usize;\n                let hot_directory_meta = postcard::from_bytes(&bytes.as_slice()[..len])\n                    .context(\"failed to deserialize hot directory meta\")?;\n                bytes.advance(len);\n                Ok(hot_directory_meta)\n            }\n        }\n    }\n\n    fn serialize_impl(component: &Self::Component, output: &mut Vec<u8>) {\n        let buf = postcard::to_stdvec(component).unwrap();\n        output.extend_from_slice(&(buf.len() as u32).to_le_bytes());\n        output.extend_from_slice(&buf[..]);\n    }\n}\n\n#[derive(Serialize, Deserialize)]\npub struct HotDirectoryMeta {\n    file_lengths: HashMap<PathBuf, u64>,\n    slice_offsets: Vec<(PathBuf, u64)>,\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\nstruct SliceCacheIndexEntry {\n    start: usize, //< legacy. We keep this instead of range due to existing indices.\n    stop: usize,\n    addr: usize,\n}\n\nimpl SliceCacheIndexEntry {\n    pub fn len(&self) -> usize {\n        self.range().len()\n    }\n\n    pub fn range(&self) -> Range<usize> {\n        self.start..self.stop\n    }\n}\n\n#[derive(Serialize, Deserialize, Default)]\npub struct SliceCacheIndex {\n    total_len: u64,\n    slices: Vec<SliceCacheIndexEntry>,\n}\nimpl SliceCacheIndex {\n    pub fn is_complete(&self) -> bool {\n        if self.slices.len() != 1 {\n            return false;\n        }\n        self.slices[0].len() as u64 == self.total_len\n    }\n\n    pub fn get(&self, byte_range: Range<usize>) -> Option<usize> {\n        let entry_idx = match self\n            .slices\n            .binary_search_by_key(&byte_range.start, |entry| entry.range().start)\n        {\n            Ok(idx) => idx,\n            Err(0) => {\n                return None;\n            }\n            Err(idx_after) => idx_after - 1,\n        };\n        let entry = &self.slices[entry_idx];\n        if entry.range().start > byte_range.start || entry.range().end < byte_range.end {\n            return None;\n  
      }\n        Some(entry.addr + byte_range.start - entry.range().start)\n    }\n}\n\n#[derive(Default)]\nstruct StaticDirectoryCacheBuilder {\n    file_cache_builder: HashMap<PathBuf, StaticSliceCacheBuilder>,\n    file_lengths: HashMap<PathBuf, u64>, // a mapping from file path to file size in bytes\n}\n\nimpl StaticDirectoryCacheBuilder {\n    pub fn add_file(&mut self, path: &Path, file_len: u64) -> &mut StaticSliceCacheBuilder {\n        self.file_lengths.insert(path.to_owned(), file_len);\n        self.file_cache_builder\n            .entry(path.to_owned())\n            .or_insert_with(|| StaticSliceCacheBuilder::new(file_len))\n    }\n\n    /// Flush needs to be called afterwards.\n    pub fn write(self, wrt: &mut dyn io::Write) -> tantivy::Result<()> {\n        let mut data_buffer = Vec::new();\n        let mut data_idx: Vec<(PathBuf, u64)> = Vec::new();\n        let mut offset = 0u64;\n        for (path, cache) in self.file_cache_builder {\n            let buf = cache.flush()?;\n            data_idx.push((path, offset));\n            offset += buf.len() as u64;\n            data_buffer.extend_from_slice(&buf);\n        }\n        let hot_directory_metas = HotDirectoryMeta {\n            file_lengths: self.file_lengths,\n            slice_offsets: data_idx,\n        };\n        let buffer = HotDirectoryVersions::serialize(&hot_directory_metas);\n        wrt.write_all(&buffer)?;\n        wrt.write_all(&data_buffer)?;\n        Ok(())\n    }\n}\n\n#[derive(Debug)]\nstruct StaticDirectoryCache {\n    file_lengths: HashMap<PathBuf, u64>,\n    slices: HashMap<PathBuf, Arc<StaticSliceCache>>,\n}\n\nimpl StaticDirectoryCache {\n    pub fn open(mut bytes: OwnedBytes) -> anyhow::Result<StaticDirectoryCache> {\n        let HotDirectoryMeta {\n            mut slice_offsets,\n            file_lengths,\n        } = HotDirectoryVersions::try_read_component(&mut bytes)?;\n        slice_offsets.push((PathBuf::default(), bytes.len() as u64));\n        let slices = 
slice_offsets\n            .windows(2)\n            .map(|slice_offsets_window| {\n                let path = slice_offsets_window[0].0.clone();\n                let start = slice_offsets_window[0].1 as usize;\n                let end = slice_offsets_window[1].1 as usize;\n                StaticSliceCache::open(bytes.slice(start..end)).map(|s| (path, Arc::new(s)))\n            })\n            .collect::<tantivy::Result<_>>()?;\n        Ok(StaticDirectoryCache {\n            file_lengths,\n            slices,\n        })\n    }\n\n    pub fn get_slice(&self, path: &Path) -> Arc<StaticSliceCache> {\n        self.slices.get(path).cloned().unwrap_or_default()\n    }\n\n    pub fn get_file_length(&self, path: &Path) -> Option<u64> {\n        self.file_lengths.get(path).copied()\n    }\n\n    pub fn get_file_lengths(&self) -> Vec<(PathBuf, u64)> {\n        let mut entries = self\n            .file_lengths\n            .iter()\n            .map(|(path, len)| (path.clone(), *len))\n            .collect::<Vec<_>>();\n        entries.sort_by_key(|el| el.0.to_owned());\n        entries\n    }\n}\n\n/// A `StaticSliceCache` is a static (read-only) cache storing byte slices of a single file.\npub struct StaticSliceCache {\n    bytes: OwnedBytes,\n    index: SliceCacheIndex,\n}\n\nimpl Default for StaticSliceCache {\n    fn default() -> StaticSliceCache {\n        StaticSliceCache {\n            bytes: OwnedBytes::empty(),\n            index: SliceCacheIndex::default(),\n        }\n    }\n}\n\nimpl StaticSliceCache {\n    pub fn open(owned_bytes: OwnedBytes) -> tantivy::Result<Self> {\n        let owned_bytes_len = owned_bytes.len();\n        assert!(owned_bytes_len >= 8);\n        let (body, len_bytes) = owned_bytes.split(owned_bytes_len - 8);\n        let mut body_len_bytes = [0u8; 8];\n        body_len_bytes.copy_from_slice(len_bytes.as_slice());\n        let body_len = u64::from_le_bytes(body_len_bytes);\n        let (body, idx) = body.split(body_len as usize);\n        let idx_bytes = idx.as_slice();\n        let index: 
SliceCacheIndex = postcard::from_bytes(idx_bytes).map_err(|err| {\n            DataCorruption::comment_only(format!(\"failed to deserialize the slice index: {err:?}\"))\n        })?;\n        Ok(StaticSliceCache { bytes: body, index })\n    }\n\n    pub fn try_read_all(&self) -> Option<OwnedBytes> {\n        if !self.index.is_complete() {\n            return None;\n        }\n        Some(self.bytes.clone())\n    }\n\n    pub fn try_read_bytes(&self, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        if byte_range.is_empty() {\n            return Some(OwnedBytes::empty());\n        }\n        if let Some(start) = self.index.get(byte_range.clone()) {\n            return Some(self.bytes.slice(start..start + byte_range.len()));\n        }\n        None\n    }\n}\n\nstruct StaticSliceCacheBuilder {\n    wrt: Vec<u8>,\n    slices: Vec<SliceCacheIndexEntry>,\n    offset: u64,\n    total_len: u64,\n}\n\nimpl StaticSliceCacheBuilder {\n    pub fn new(total_len: u64) -> StaticSliceCacheBuilder {\n        StaticSliceCacheBuilder {\n            wrt: Vec::new(),\n            slices: Vec::new(),\n            offset: 0u64,\n            total_len,\n        }\n    }\n\n    pub fn add_bytes(&mut self, bytes: &[u8], start: usize) {\n        self.wrt.extend_from_slice(bytes);\n        let end = start + bytes.len();\n        self.slices.push(SliceCacheIndexEntry {\n            start,\n            stop: end,\n            addr: self.offset as usize,\n        });\n        self.offset += bytes.len() as u64;\n    }\n\n    fn merged_slices(&mut self) -> tantivy::Result<Vec<SliceCacheIndexEntry>> {\n        if self.slices.is_empty() {\n            return Ok(Vec::new());\n        }\n        self.slices.sort_unstable_by_key(|e| e.range().start);\n        let mut slices = Vec::with_capacity(self.slices.len());\n        let mut last = self.slices[0].clone();\n        for segment in &self.slices[1..] 
{\n            if segment.range().start < last.range().end {\n                return Err(tantivy::TantivyError::InvalidArgument(format!(\n                    \"two segments are overlapping on byte {}\",\n                    segment.range().start\n                )));\n            }\n            if last.stop == segment.range().start\n                && (last.addr + last.range().len() == segment.addr)\n            {\n                // We merge the current segment with the previous one\n                last.stop += segment.range().len();\n            } else {\n                slices.push(last);\n                last = segment.clone();\n            }\n        }\n        slices.push(last);\n        Ok(slices)\n    }\n\n    pub fn flush(mut self) -> tantivy::Result<Vec<u8>> {\n        let merged_slices = self.merged_slices()?;\n        let slices_idx = SliceCacheIndex {\n            total_len: self.total_len,\n            slices: merged_slices,\n        };\n        self.wrt.extend_from_slice(\n            &postcard::to_allocvec(&slices_idx).map_err(|err| {\n                TantivyError::InternalError(format!(\"could not serialize {err:?}\"))\n            })?,\n        );\n        self.wrt.extend_from_slice(&self.offset.to_le_bytes()[..]);\n        Ok(self.wrt)\n    }\n}\n\nimpl fmt::Debug for StaticSliceCache {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"SliceCache()\")\n    }\n}\n\n/// The hot directory accelerates a given directory,\n/// by placing a static cache in front of a directory.\n///\n/// The `HotDirectory` does not implement write operations. 
It has been\n/// designed for quickwit in order to regroup all of the small random\n/// read operations required to open an index.\n/// All of these operations are gathered into a single file called the\n/// hotcache.\n#[derive(Clone)]\npub struct HotDirectory {\n    inner: Arc<InnerHotDirectory>,\n}\n\nimpl HotDirectory {\n    /// Wraps an index, with a static cache serialized into `hot_cache_bytes`.\n    pub fn open<D: Directory>(\n        underlying: D,\n        hot_cache_bytes: OwnedBytes,\n    ) -> anyhow::Result<HotDirectory> {\n        let static_cache = StaticDirectoryCache::open(hot_cache_bytes)?;\n        Ok(HotDirectory {\n            inner: Arc::new(InnerHotDirectory {\n                underlying: Box::new(underlying),\n                cache: Arc::new(static_cache),\n            }),\n        })\n    }\n\n    /// Get all the files in the directory and their sizes.\n    ///\n    /// The actual cached data is a very small fraction of this length.\n    pub fn get_file_lengths(&self) -> Vec<(PathBuf, u64)> {\n        self.inner.cache.get_file_lengths()\n    }\n}\n\nstruct FileSliceWithCache {\n    underlying: FileSlice,\n    static_cache: Arc<StaticSliceCache>,\n    file_length: u64,\n}\n\n#[async_trait]\nimpl FileHandle for FileSliceWithCache {\n    fn read_bytes(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        if let Some(found_bytes) = self.static_cache.try_read_bytes(byte_range.clone()) {\n            return Ok(found_bytes);\n        }\n        self.underlying.read_bytes_slice(byte_range)\n    }\n\n    async fn read_bytes_async(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        if let Some(found_bytes) = self.static_cache.try_read_bytes(byte_range.clone()) {\n            return Ok(found_bytes);\n        }\n        self.underlying.read_bytes_slice_async(byte_range).await\n    }\n}\n\nimpl fmt::Debug for FileSliceWithCache {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, 
\"FileSliceWithCache({:?})\", &self.underlying)\n    }\n}\n\nimpl HasLen for FileSliceWithCache {\n    fn len(&self) -> usize {\n        self.file_length as usize\n    }\n}\n\nstruct InnerHotDirectory {\n    underlying: Box<dyn Directory>,\n    cache: Arc<StaticDirectoryCache>,\n}\n\nimpl fmt::Debug for HotDirectory {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            f,\n            \"HotDirectory(dir={:?}, cache={:?})\",\n            self.inner.underlying.as_ref(),\n            self.inner.cache.as_ref()\n        )\n    }\n}\n\nimpl Directory for HotDirectory {\n    fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn FileHandle>, OpenReadError> {\n        let file_length = self\n            .inner\n            .cache\n            .get_file_length(path)\n            .ok_or_else(|| OpenReadError::FileDoesNotExist(path.to_owned()))?;\n        let underlying_filehandle = self.inner.underlying.get_file_handle(path)?;\n        let underlying = FileSlice::new_with_num_bytes(underlying_filehandle, file_length as usize);\n        let file_slice_with_cache = FileSliceWithCache {\n            underlying,\n            static_cache: self.inner.cache.get_slice(path),\n            file_length,\n        };\n        Ok(Arc::new(file_slice_with_cache))\n    }\n\n    fn exists(&self, path: &std::path::Path) -> Result<bool, OpenReadError> {\n        Ok(self.inner.cache.get_file_length(path).is_some())\n    }\n\n    fn atomic_read(&self, path: &std::path::Path) -> Result<Vec<u8>, OpenReadError> {\n        let slice_cache = self.inner.cache.get_slice(path);\n        if let Some(all_bytes) = slice_cache.try_read_all() {\n            return Ok(all_bytes.as_slice().to_owned());\n        }\n        self.inner.underlying.atomic_read(path)\n    }\n\n    crate::read_only_directory!();\n}\n\nfn list_index_files(index: &Index) -> tantivy::Result<HashSet<PathBuf>> {\n    let index_meta = index.load_metas()?;\n    let mut files: HashSet<PathBuf> = 
index_meta\n        .segments\n        .into_iter()\n        .flat_map(|segment_meta| segment_meta.list_files())\n        .collect();\n    files.insert(Path::new(\"meta.json\").to_path_buf());\n    files.insert(Path::new(\".managed.json\").to_path_buf());\n    Ok(files)\n}\n\n/// Given a tantivy directory, automatically identify the parts that should be loaded on startup\n/// and writes a static cache file called hotcache in the `output`.\n///\n/// See [`HotDirectory`] for more information.\npub fn write_hotcache<D: Directory>(\n    directory: D,\n    output: &mut dyn io::Write,\n) -> tantivy::Result<()> {\n    // We use the caching directory here in order to defensively ensure that\n    // the content of the directory that will be written in the hotcache is precisely\n    // the same that was read on the first pass.\n    let caching_directory = CachingDirectory::new_unbounded(Arc::new(directory));\n    let debug_proxy_directory = DebugProxyDirectory::wrap(caching_directory);\n    let index = Index::open(debug_proxy_directory.clone())?;\n    let schema = index.schema();\n    let reader: IndexReader = index\n        .reader_builder()\n        .reload_policy(ReloadPolicy::Manual)\n        .try_into()?;\n    let searcher = reader.searcher();\n    for (field, field_entry) in schema.fields() {\n        if !field_entry.is_indexed() {\n            continue;\n        }\n        for reader in searcher.segment_readers() {\n            let _inv_idx = reader.inverted_index(field)?;\n        }\n    }\n    let mut cache_builder = StaticDirectoryCacheBuilder::default();\n    let read_operations = debug_proxy_directory.drain_read_operations();\n    let mut per_file_slices: HashMap<PathBuf, HashSet<Range<usize>>> = HashMap::default();\n    for read_operation in read_operations {\n        per_file_slices\n            .entry(read_operation.path)\n            .or_default()\n            .insert(read_operation.offset..read_operation.offset + read_operation.num_bytes);\n    }\n    let 
index_files = list_index_files(&index)?;\n    for file_path in index_files {\n        let file_slice_res = debug_proxy_directory.open_read(&file_path);\n        if let Err(tantivy::directory::error::OpenReadError::FileDoesNotExist(_)) = file_slice_res {\n            continue;\n        }\n        let file_slice = file_slice_res?;\n        let file_cache_builder = cache_builder.add_file(&file_path, file_slice.len() as u64);\n        if let Some(intervals) = per_file_slices.get(&file_path) {\n            for byte_range in intervals {\n                let len = byte_range.len();\n                // We do not want to store slices that are too large in the hotcache,\n                // but on the other hand, the term dictionary index and the docstore\n                // index are required for quickwit to work.\n                //\n                // Warning: we need to work on string here because `Path::ends_with`\n                // has very different semantics.\n                let file_path_str = file_path.to_string_lossy();\n                if file_path_str.ends_with(\"store\")\n                    || file_path_str.ends_with(\"term\")\n                    || len < 10_000_000\n                {\n                    let bytes = file_slice.read_bytes_slice(byte_range.clone())?;\n                    file_cache_builder.add_bytes(bytes.as_slice(), byte_range.start);\n                }\n            }\n        }\n    }\n    cache_builder.write(output)?;\n    output.flush()?;\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_empty_slice_cache_index() -> tantivy::Result<()> {\n        let slice_cache_builder = StaticSliceCacheBuilder::new(10u64);\n        let cache_data = slice_cache_builder.flush()?;\n        let owned_bytes = OwnedBytes::new(cache_data);\n        let slice_cache = StaticSliceCache::open(owned_bytes)?;\n        assert!(slice_cache.try_read_bytes(5..6).is_none());\n        Ok(())\n    }\n\n    #[test]\n    fn 
test_simple_slice_cache_index() -> tantivy::Result<()> {\n        let mut slice_cache_builder = StaticSliceCacheBuilder::new(10u64);\n        slice_cache_builder.add_bytes(b\"abc\", 2);\n        let cache_data = slice_cache_builder.flush()?;\n        let owned_bytes = OwnedBytes::new(cache_data);\n        let slice_cache = StaticSliceCache::open(owned_bytes)?;\n        assert_eq!(\n            slice_cache.try_read_bytes(2..5).unwrap().as_slice(),\n            &b\"abc\"[..]\n        );\n        assert_eq!(\n            slice_cache.try_read_bytes(2..3).unwrap().as_slice(),\n            &b\"a\"[..]\n        );\n        assert_eq!(\n            slice_cache.try_read_bytes(3..5).unwrap().as_slice(),\n            &b\"bc\"[..]\n        );\n        assert_eq!(\n            slice_cache.try_read_bytes(4..5).unwrap().as_slice(),\n            &b\"c\"[..]\n        );\n        assert!(slice_cache.try_read_bytes(5..6).is_none());\n        assert!(slice_cache.try_read_bytes(4..6).is_none());\n        assert!(slice_cache.try_read_bytes(6..7).is_none());\n        assert_eq!(\n            slice_cache.try_read_bytes(6..6).unwrap().as_slice(),\n            &b\"\"[..]\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_several_segments() -> tantivy::Result<()> {\n        let mut slice_cache_builder = StaticSliceCacheBuilder::new(100u64);\n        slice_cache_builder.add_bytes(b\"def\", 6);\n        slice_cache_builder.add_bytes(b\"ghi\", 12);\n        slice_cache_builder.add_bytes(b\"abc\", 2);\n        let cache_data = slice_cache_builder.flush()?;\n        let owned_bytes = OwnedBytes::new(cache_data);\n        let slice_cache = StaticSliceCache::open(owned_bytes)?;\n        assert_eq!(\n            slice_cache.try_read_bytes(2..5).unwrap().as_slice(),\n            &b\"abc\"[..]\n        );\n        assert_eq!(\n            slice_cache.try_read_bytes(2..3).unwrap().as_slice(),\n            &b\"a\"[..]\n        );\n        assert_eq!(\n            
slice_cache.try_read_bytes(3..5).unwrap().as_slice(),\n            &b\"bc\"[..]\n        );\n        assert_eq!(\n            slice_cache.try_read_bytes(4..5).unwrap().as_slice(),\n            &b\"c\"[..]\n        );\n        assert!(slice_cache.try_read_bytes(5..6).is_none());\n        assert!(slice_cache.try_read_bytes(4..6).is_none());\n        assert_eq!(\n            slice_cache.try_read_bytes(6..7).unwrap().as_slice(),\n            &b\"d\"[..]\n        );\n        assert!(slice_cache.try_read_bytes(2..7).is_none());\n        Ok(())\n    }\n\n    #[test]\n    fn test_slice_cache_merged_entries() -> tantivy::Result<()> {\n        let mut slice_cache_builder = StaticSliceCacheBuilder::new(100u64);\n        slice_cache_builder.add_bytes(b\"abc\", 2);\n        slice_cache_builder.add_bytes(b\"def\", 5);\n        let cache_data = slice_cache_builder.flush()?;\n        let owned_bytes = OwnedBytes::new(cache_data);\n        let slice_cache = StaticSliceCache::open(owned_bytes)?;\n        assert_eq!(\n            slice_cache.try_read_bytes(3..7).unwrap().as_slice(),\n            &b\"bcde\"[..]\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_slice_cache_unmergeable_entries() -> tantivy::Result<()> {\n        let mut slice_cache_builder = StaticSliceCacheBuilder::new(100u64);\n        slice_cache_builder.add_bytes(b\"def\", 5);\n        slice_cache_builder.add_bytes(b\"abc\", 2);\n        let cache_data = slice_cache_builder.flush()?;\n        let owned_bytes = OwnedBytes::new(cache_data);\n        let slice_cache = StaticSliceCache::open(owned_bytes)?;\n        assert!(slice_cache.try_read_bytes(3..7).is_none());\n        Ok(())\n    }\n\n    #[test]\n    fn test_slice_cache_overlapping_entries() {\n        let mut slice_cache_builder = StaticSliceCacheBuilder::new(100u64);\n        slice_cache_builder.add_bytes(b\"abcd\", 2);\n        slice_cache_builder.add_bytes(b\"def\", 5);\n        assert!(slice_cache_builder.flush().is_err());\n    }\n\n    
#[test]\n    fn test_slice_entry_serialization() -> anyhow::Result<()> {\n        let slice_entry = super::SliceCacheIndexEntry {\n            start: 1,\n            stop: 5,\n            addr: 4,\n        };\n        let bytes = postcard::to_allocvec(&slice_entry)?;\n        assert_eq!(&bytes[..], &[1, 5, 4]);\n        Ok(())\n    }\n\n    #[test]\n    fn test_slice_directory_cache() {\n        let one_path = Path::new(\"one.txt\");\n        let two_path = Path::new(\"two.txt\");\n        let three_path = Path::new(\"three.txt\");\n        let four_path = Path::new(\"four.txt\");\n\n        let mut directory_cache_builder = StaticDirectoryCacheBuilder::default();\n        directory_cache_builder\n            .add_file(one_path, 100)\n            .add_bytes(b\" happy t\", 5);\n        directory_cache_builder\n            .add_file(two_path, 200)\n            .add_bytes(b\"my name\", 0);\n        directory_cache_builder.add_file(three_path, 300);\n\n        let mut buffer = Vec::new();\n        directory_cache_builder.write(&mut buffer).unwrap();\n        let directory_cache = StaticDirectoryCache::open(OwnedBytes::new(buffer)).unwrap();\n\n        assert_eq!(directory_cache.get_file_length(one_path), Some(100));\n        assert_eq!(directory_cache.get_file_length(two_path), Some(200));\n        assert_eq!(directory_cache.get_file_length(three_path), Some(300));\n        assert_eq!(directory_cache.get_file_length(four_path), None);\n\n        let file_lengths = directory_cache.get_file_lengths();\n        assert_eq!(file_lengths[0], (one_path.to_owned(), 100));\n        assert_eq!(file_lengths[1], (three_path.to_owned(), 300));\n        assert_eq!(file_lengths[2], (two_path.to_owned(), 200));\n\n        assert_eq!(\n            directory_cache\n                .get_slice(one_path)\n                .try_read_bytes(6..11)\n                .unwrap()\n                .as_ref(),\n            b\"happy\"\n        );\n        assert_eq!(\n            directory_cache\n       
         .get_slice(two_path)\n                .try_read_bytes(3..7)\n                .unwrap()\n                .as_ref(),\n            b\"name\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! This crate contains all of the building pieces that make quickwit's IO possible.\n//!\n//! - The `StorageDirectory` just wraps a `Storage` trait to make it compatible with tantivy's\n//!   Directory API.\n//! - The `BundleDirectory` bundles multiple files into a single file.\n//! - The `HotDirectory` wraps another directory with a static cache.\n//! - The `CachingDirectory` wraps a Directory with a dynamic cache.\n//! - The `DebugProxyDirectory` acts as a proxy to another directory to instrument it and record\n//!   all of its IO.\n#![warn(missing_docs)]\n#![deny(clippy::disallowed_methods)]\n\nmod bundle_directory;\nmod caching_directory;\nmod debug_proxy_directory;\nmod hot_directory;\nmod storage_directory;\nmod union_directory;\n\npub use self::bundle_directory::{BundleDirectory, get_hotcache_from_split, read_split_footer};\npub use self::caching_directory::CachingDirectory;\npub use self::debug_proxy_directory::{DebugProxyDirectory, ReadOperation};\npub use self::hot_directory::{HotDirectory, write_hotcache};\npub use self::storage_directory::StorageDirectory;\npub use self::union_directory::UnionDirectory;\n\nmacro_rules! 
read_only_directory {\n    () => {\n        fn atomic_write(&self, _path: &Path, _data: &[u8]) -> io::Result<()> {\n            unimplemented!(\"read-only\")\n        }\n\n        fn delete(&self, _path: &Path) -> Result<(), tantivy::directory::error::DeleteError> {\n            unimplemented!(\"read-only\")\n        }\n\n        fn open_write(\n            &self,\n            _path: &Path,\n        ) -> Result<tantivy::directory::WritePtr, tantivy::directory::error::OpenWriteError> {\n            unimplemented!(\"read-only\")\n        }\n\n        fn sync_directory(&self) -> io::Result<()> {\n            unimplemented!(\"read-only\")\n        }\n\n        fn watch(\n            &self,\n            _watch_callback: tantivy::directory::WatchCallback,\n        ) -> tantivy::Result<tantivy::directory::WatchHandle> {\n            Ok(tantivy::directory::WatchHandle::empty())\n        }\n\n        fn acquire_lock(\n            &self,\n            _lock: &tantivy::directory::Lock,\n        ) -> Result<tantivy::directory::DirectoryLock, tantivy::directory::error::LockError> {\n            Ok(tantivy::directory::DirectoryLock::from(Box::new(|| {})))\n        }\n    };\n}\npub(crate) use read_only_directory;\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/storage_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Debug;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_storage::{OwnedBytes, Storage};\nuse tantivy::directory::FileHandle;\nuse tantivy::directory::error::OpenReadError;\nuse tantivy::{Directory, HasLen};\nuse tracing::{error, instrument};\n\nstruct StorageDirectoryFileHandle {\n    storage_directory: StorageDirectory,\n    path: PathBuf,\n}\n\nimpl HasLen for StorageDirectoryFileHandle {\n    fn len(&self) -> usize {\n        unimplemented!()\n    }\n}\n\nimpl fmt::Debug for StorageDirectoryFileHandle {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            f,\n            \"StorageDirectoryFileHandle({:?}, dir={:?})\",\n            &self.path, self.storage_directory\n        )\n    }\n}\n\n#[async_trait]\nimpl FileHandle for StorageDirectoryFileHandle {\n    fn read_bytes(&self, _byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        Err(unsupported_operation(&self.path))\n    }\n\n    #[instrument(level = \"debug\", fields(path = %self.path.to_string_lossy(), byte_range_size = byte_range.end - byte_range.start), skip(self))]\n    async fn read_bytes_async(&self, byte_range: Range<usize>) -> io::Result<OwnedBytes> {\n        if byte_range.is_empty() {\n        
    return Ok(OwnedBytes::empty());\n        }\n        let object_bytes = self\n            .storage_directory\n            .get_slice(&self.path, byte_range)\n            .await?;\n        Ok(object_bytes)\n    }\n}\n\n/// Directory backed by a quickwit `Storage` abstraction.\n///\n/// It should not be used in a context outside quickwit, as it contains\n/// several pitfalls:\n/// - Fetching data synchronously is unsupported: `read_bytes` and `atomic_read` return an error.\n/// - Writing data panics.\n///\n/// This directory fetches slices of data from a possibly distant storage\n/// every time `read_bytes_async` is called.\n#[derive(Clone)]\npub struct StorageDirectory {\n    storage: Arc<dyn Storage>,\n}\n\nimpl Debug for StorageDirectory {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"StorageDirectory({:?})\", self.uri())\n    }\n}\n\nimpl StorageDirectory {\n    /// Creates a new StorageDirectory, backed by the given `storage`.\n    pub fn new(storage: Arc<dyn Storage>) -> StorageDirectory {\n        StorageDirectory { storage }\n    }\n\n    /// Fetches a slice of bytes from a file asynchronously.\n    pub async fn get_slice(&self, path: &Path, range: Range<usize>) -> io::Result<OwnedBytes> {\n        let payload: OwnedBytes = self.storage.get_slice(path, range).await?;\n        Ok(payload)\n    }\n\n    /// Fetches an entire file asynchronously.\n    pub async fn get_all(&self, path: &Path) -> io::Result<OwnedBytes> {\n        let payload: OwnedBytes = self.storage.get_all(path).await?;\n        Ok(payload)\n    }\n\n    /// Returns the URI associated with the underlying storage.\n    pub fn uri(&self) -> &Uri {\n        self.storage.uri()\n    }\n}\n\nfn unsupported_operation(path: &Path) -> io::Error {\n    let error = \"unsupported operation: `StorageDirectory` only supports async reads\";\n    error!(error, ?path);\n    io::Error::other(format!(\"{error}: {}\", path.display()))\n}\n\nimpl Directory for StorageDirectory {\n    fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn 
FileHandle>, OpenReadError> {\n        Ok(Arc::new(StorageDirectoryFileHandle {\n            storage_directory: self.clone(),\n            path: path.to_path_buf(),\n        }))\n    }\n\n    fn atomic_read(&self, path: &Path) -> Result<Vec<u8>, OpenReadError> {\n        Err(OpenReadError::wrap_io_error(\n            unsupported_operation(path),\n            path.to_path_buf(),\n        ))\n    }\n\n    fn exists(&self, path: &std::path::Path) -> Result<bool, OpenReadError> {\n        Err(OpenReadError::wrap_io_error(\n            unsupported_operation(path),\n            path.to_path_buf(),\n        ))\n    }\n\n    crate::read_only_directory!();\n}\n"
  },
  {
    "path": "quickwit/quickwit-directories/src/union_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse tantivy::Directory;\nuse tantivy::directory::error::{DeleteError, OpenReadError, OpenWriteError};\nuse tantivy::directory::{FileHandle, WatchHandle};\n\n/// A union directory takes a bunch of directories and stacks them, similarly to UnionFS.\n/// The resulting directory is a virtual view of the union of the different directories.\n///\n/// If a path exists in several directories, the first directory of the list containing\n/// the path shadows the others.\n///\n/// The first directory of the list receives all write operations.\n/// Deletes, on the other hand, are applied to all directories containing the file.\n#[derive(Clone, Debug)]\npub struct UnionDirectory {\n    directories: Arc<Vec<Box<dyn Directory>>>,\n}\n\nimpl UnionDirectory {\n    /// Creates a new union directory.\n    pub fn union_of(directories: Vec<Box<dyn Directory>>) -> UnionDirectory {\n        UnionDirectory {\n            directories: Arc::new(directories),\n        }\n    }\n\n    /// Helper function to find the first directory containing the given path.\n    fn find_directory_for_path(&self, path: &Path) -> Result<&dyn Directory, OpenReadError> {\n        for directory in self.directories.iter() {\n            if directory.exists(path)? 
{\n                return Ok(directory.as_ref());\n            }\n        }\n        Err(OpenReadError::FileDoesNotExist(path.to_path_buf()))\n    }\n}\n\nfn convert_open_to_delete_error(open_err: OpenReadError) -> DeleteError {\n    match open_err {\n        OpenReadError::FileDoesNotExist(path) => DeleteError::FileDoesNotExist(path),\n        OpenReadError::IoError { io_error, filepath } => {\n            DeleteError::IoError { io_error, filepath }\n        }\n        err @ OpenReadError::IncompatibleIndex(_) => DeleteError::IoError {\n            io_error: Arc::new(io::Error::new(io::ErrorKind::Unsupported, err)),\n            filepath: PathBuf::from(\"/\"),\n        },\n    }\n}\n\nimpl Directory for UnionDirectory {\n    fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn FileHandle>, OpenReadError> {\n        let directory = self.find_directory_for_path(path)?;\n        directory.get_file_handle(path)\n    }\n\n    fn exists(&self, path: &Path) -> Result<bool, OpenReadError> {\n        match self.find_directory_for_path(path) {\n            Ok(_) => Ok(true),\n            Err(OpenReadError::FileDoesNotExist(_)) => Ok(false),\n            Err(err) => Err(err),\n        }\n    }\n\n    fn atomic_read(&self, path: &Path) -> Result<Vec<u8>, OpenReadError> {\n        let directory = self.find_directory_for_path(path)?;\n        directory.atomic_read(path)\n    }\n\n    fn open_write(&self, path: &Path) -> Result<tantivy::directory::WritePtr, OpenWriteError> {\n        self.directories[0].open_write(path)\n    }\n\n    fn delete(&self, path: &Path) -> Result<(), DeleteError> {\n        let mut found_file = false;\n        for directory in self.directories.iter() {\n            // Check for existence first, so that read-only directories that do not\n            // contain the file are never asked to delete it.\n            match directory.exists(path) {\n                Ok(true) => {\n                    directory.delete(path)?;\n                    found_file = true;\n                }\n                Ok(false) => {}\n       
         Err(exist_err) => {\n                    return Err(convert_open_to_delete_error(exist_err));\n                }\n            }\n        }\n        if !found_file {\n            return Err(DeleteError::FileDoesNotExist(path.to_path_buf()));\n        }\n        Ok(())\n    }\n\n    fn atomic_write(&self, path: &Path, data: &[u8]) -> io::Result<()> {\n        self.directories[0].atomic_write(path, data)\n    }\n\n    fn watch(&self, callback: tantivy::directory::WatchCallback) -> tantivy::Result<WatchHandle> {\n        self.directories[0].watch(callback)\n    }\n\n    fn sync_directory(&self) -> io::Result<()> {\n        self.directories[0].sync_directory()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::path::Path;\n\n    use tantivy::directory::{Directory, RamDirectory};\n\n    use crate::UnionDirectory;\n\n    #[test]\n    fn test_union_directory_atomic_simple() -> anyhow::Result<()> {\n        let dir1 = RamDirectory::create();\n        let dir2 = RamDirectory::create();\n        dir1.atomic_write(Path::new(\"path1\"), &b\"data1\"[..])?;\n        dir2.atomic_write(Path::new(\"path2\"), &b\"data2\"[..])?;\n        let union_directory = UnionDirectory::union_of(vec![Box::new(dir1), Box::new(dir2)]);\n        {\n            let payload_1 = union_directory.atomic_read(Path::new(\"path1\"))?;\n            assert_eq!(payload_1, b\"data1\");\n        }\n        {\n            let payload_1 = union_directory\n                .open_read(Path::new(\"path1\"))?\n                .read_bytes()?;\n            assert_eq!(payload_1.as_slice(), b\"data1\");\n        }\n        {\n            let payload_2 = union_directory.atomic_read(Path::new(\"path2\"))?;\n            assert_eq!(payload_2, b\"data2\");\n        }\n        {\n            let payload_2 = union_directory\n                .open_read(Path::new(\"path2\"))?\n                .read_bytes()?;\n            assert_eq!(payload_2.as_slice(), b\"data2\");\n        }\n        Ok(())\n    }\n\n    #[test]\n    
fn test_union_directory_shadowing() -> anyhow::Result<()> {\n        let dir1 = RamDirectory::create();\n        let dir2 = RamDirectory::create();\n        dir1.atomic_write(Path::new(\"shadowed_path\"), &b\"shadower\"[..])?;\n        dir2.atomic_write(Path::new(\"shadowed_path\"), &b\"shadowee\"[..])?;\n        let union_directory = UnionDirectory::union_of(vec![Box::new(dir1), Box::new(dir2)]);\n        let payload = union_directory.atomic_read(Path::new(\"shadowed_path\"))?;\n        assert_eq!(payload, b\"shadower\");\n        Ok(())\n    }\n\n    #[test]\n    fn test_union_directory_exists() -> anyhow::Result<()> {\n        let dir1 = RamDirectory::create();\n        dir1.atomic_write(Path::new(\"path1\"), &b\"data1\"[..])?;\n        dir1.atomic_write(Path::new(\"shadowed_path\"), &b\"shadower\"[..])?;\n\n        let dir2 = RamDirectory::create();\n        dir2.atomic_write(Path::new(\"path2\"), &b\"data2\"[..])?;\n        dir2.atomic_write(Path::new(\"shadowed_path\"), &b\"shadowee\"[..])?;\n\n        let union_directory = UnionDirectory::union_of(vec![Box::new(dir1), Box::new(dir2)]);\n        assert!(union_directory.exists(Path::new(\"path1\"))?);\n        assert!(union_directory.exists(Path::new(\"path2\"))?);\n        assert!(union_directory.exists(Path::new(\"shadowed_path\"))?);\n\n        assert!(!union_directory.exists(Path::new(\"path3\"))?);\n        Ok(())\n    }\n\n    #[test]\n    fn test_union_directory_delete() -> anyhow::Result<()> {\n        let dir1 = RamDirectory::create();\n        dir1.atomic_write(Path::new(\"path1\"), &b\"data1\"[..])?;\n        dir1.atomic_write(Path::new(\"shadowed_path\"), &b\"shadower\"[..])?;\n\n        let dir2 = RamDirectory::create();\n        dir2.atomic_write(Path::new(\"path2\"), &b\"data2\"[..])?;\n        dir2.atomic_write(Path::new(\"shadowed_path\"), &b\"shadowee\"[..])?;\n\n        let union_directory = UnionDirectory::union_of(vec![Box::new(dir1), Box::new(dir2)]);\n\n        
union_directory.delete(Path::new(\"path1\"))?;\n        assert!(!union_directory.exists(Path::new(\"path1\"))?);\n\n        union_directory.delete(Path::new(\"path2\"))?;\n        assert!(!union_directory.exists(Path::new(\"path2\"))?);\n\n        union_directory.delete(Path::new(\"shadowed_path\"))?;\n        assert!(!union_directory.exists(Path::new(\"shadowed_path\"))?);\n\n        union_directory.delete(Path::new(\"path3\")).unwrap_err();\n        Ok(())\n    }\n    #[test]\n    fn test_union_directory_write() -> anyhow::Result<()> {\n        let dir1 = RamDirectory::create();\n        dir1.atomic_write(Path::new(\"path1\"), &b\"data1\"[..])?;\n\n        let dir2 = RamDirectory::create();\n        dir2.atomic_write(Path::new(\"path2\"), &b\"data2\"[..])?;\n\n        let union_directory = UnionDirectory::union_of(vec![Box::new(dir1), Box::new(dir2)]);\n        union_directory.atomic_write(Path::new(\"path1\"), &b\"data1 data1\"[..])?;\n        union_directory.atomic_write(Path::new(\"path3\"), &b\"data3\"[..])?;\n        {\n            let payload = union_directory.atomic_read(Path::new(\"path1\"))?;\n            assert_eq!(payload, b\"data1 data1\");\n        }\n        {\n            let payload = union_directory.atomic_read(Path::new(\"path3\"))?;\n            assert_eq!(payload, b\"data3\");\n        }\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/Cargo.toml",
    "content": "[package]\nname = \"quickwit-doc-mapper\"\ndescription = \"Index schema and document mapping\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nbase64 = { workspace = true }\nfnv = { workspace = true }\nhex = { workspace = true }\nindexmap = { workspace = true }\nitertools = { workspace = true }\nnom = { workspace = true }\nonce_cell = { workspace = true }\nregex = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nserde_json_borrow = { workspace = true }\nsiphasher = { workspace = true }\ntantivy = { workspace = true }\nthiserror = { workspace = true }\ntracing = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-datetime = { workspace = true }\nquickwit-macros = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-query = { workspace = true }\n\n[dev-dependencies]\nbinggan = { workspace = true }\nmatches = { workspace = true }\nserde_yaml = { workspace = true }\ntime = { workspace = true }\n\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-query = { workspace = true }\n\n[features]\ntestsuite = []\n\n[[bench]]\nname = \"doc_to_json_bench\"\nharness = false\n\n[[bench]]\nname = \"routing_expression_bench\"\nharness = false\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/benches/data/simple-parse-bench.json",
    "content": "{\"id\":1,\"first_name\":\"Giulia\",\"last_name\":\"Chaplain\",\"email\":\"gchaplain0@ameblo.jp\"}\n{\"id\":2,\"first_name\":\"Vivyan\",\"last_name\":\"Shitliffe\",\"email\":\"vshitliffe1@skype.com\"}\n{\"id\":3,\"first_name\":\"Phip\",\"last_name\":\"Ribey\",\"email\":\"pribey2@sitemeter.com\"}\n{\"id\":4,\"first_name\":\"Theressa\",\"last_name\":\"Gamlin\",\"email\":\"tgamlin3@alibaba.com\"}\n{\"id\":5,\"first_name\":\"Monica\",\"last_name\":\"Buney\",\"email\":\"mbuney4@abc.net.au\"}\n{\"id\":6,\"first_name\":\"Adore\",\"last_name\":\"Brickhill\",\"email\":\"abrickhill5@liveinternet.ru\"}\n{\"id\":7,\"first_name\":\"Germana\",\"last_name\":\"Culligan\",\"email\":\"gculligan6@forbes.com\"}\n{\"id\":8,\"first_name\":\"Jorgan\",\"last_name\":\"Provost\",\"email\":\"jprovost7@naver.com\"}\n{\"id\":9,\"first_name\":\"Dianemarie\",\"last_name\":\"Dorney\",\"email\":\"ddorney8@alexa.com\"}\n{\"id\":10,\"first_name\":\"Philipa\",\"last_name\":\"Cocozza\",\"email\":\"pcocozza9@eventbrite.com\"}\n{\"id\":11,\"first_name\":\"Adena\",\"last_name\":\"Frickey\",\"email\":\"africkeya@php.net\"}\n{\"id\":12,\"first_name\":\"Noelyn\",\"last_name\":\"Jocelyn\",\"email\":\"njocelynb@addtoany.com\"}\n{\"id\":13,\"first_name\":\"Cammy\",\"last_name\":\"Norwell\",\"email\":\"cnorwellc@yale.edu\"}\n{\"id\":14,\"first_name\":\"Eadie\",\"last_name\":\"Pipworth\",\"email\":\"epipworthd@barnesandnoble.com\"}\n{\"id\":15,\"first_name\":\"Tandy\",\"last_name\":\"Lenahan\",\"email\":\"tlenahane@tripod.com\"}\n{\"id\":16,\"first_name\":\"Honoria\",\"last_name\":\"Van Weedenburg\",\"email\":\"hvanweedenburgf@discovery.com\"}\n{\"id\":17,\"first_name\":\"Felita\",\"last_name\":\"O' 
Mullane\",\"email\":\"fomullaneg@msu.edu\"}\n{\"id\":18,\"first_name\":\"Austin\",\"last_name\":\"Brownstein\",\"email\":\"abrownsteinh@prlog.org\"}\n{\"id\":19,\"first_name\":\"Leigh\",\"last_name\":\"Berzins\",\"email\":\"lberzinsi@walmart.com\"}\n{\"id\":20,\"first_name\":\"Rachele\",\"last_name\":\"Adamsson\",\"email\":\"radamssonj@csmonitor.com\"}\n{\"id\":21,\"first_name\":\"Barbabra\",\"last_name\":\"Wilacot\",\"email\":\"bwilacotk@kickstarter.com\"}\n{\"id\":22,\"first_name\":\"Griffin\",\"last_name\":\"Jone\",\"email\":\"gjonel@google.pl\"}\n{\"id\":23,\"first_name\":\"Michel\",\"last_name\":\"Bothie\",\"email\":\"mbothiem@shareasale.com\"}\n{\"id\":24,\"first_name\":\"Callie\",\"last_name\":\"Selley\",\"email\":\"cselleyn@gmpg.org\"}\n{\"id\":25,\"first_name\":\"Gleda\",\"last_name\":\"O'Lahy\",\"email\":\"golahyo@nih.gov\"}\n{\"id\":26,\"first_name\":\"Alia\",\"last_name\":\"Ladel\",\"email\":\"aladelp@phpbb.com\"}\n{\"id\":27,\"first_name\":\"Gusti\",\"last_name\":\"McVitty\",\"email\":\"gmcvittyq@redcross.org\"}\n{\"id\":28,\"first_name\":\"Carolann\",\"last_name\":\"Pachmann\",\"email\":\"cpachmannr@goodreads.com\"}\n{\"id\":29,\"first_name\":\"Agata\",\"last_name\":\"Nyssen\",\"email\":\"anyssens@utexas.edu\"}\n{\"id\":30,\"first_name\":\"Jerrie\",\"last_name\":\"Craddy\",\"email\":\"jcraddyt@cbc.ca\"}\n{\"id\":31,\"first_name\":\"Nariko\",\"last_name\":\"Von 
Brook\",\"email\":\"nvonbrooku@comcast.net\"}\n{\"id\":32,\"first_name\":\"Zacharias\",\"last_name\":\"Gobel\",\"email\":\"zgobelv@wordpress.org\"}\n{\"id\":33,\"first_name\":\"Sidnee\",\"last_name\":\"Whettleton\",\"email\":\"swhettletonw@lycos.com\"}\n{\"id\":34,\"first_name\":\"Orlan\",\"last_name\":\"Adamovitch\",\"email\":\"oadamovitchx@hibu.com\"}\n{\"id\":35,\"first_name\":\"Lotty\",\"last_name\":\"Eddolls\",\"email\":\"leddollsy@oracle.com\"}\n{\"id\":36,\"first_name\":\"Sarge\",\"last_name\":\"Tongue\",\"email\":\"stonguez@shutterfly.com\"}\n{\"id\":37,\"first_name\":\"Dalia\",\"last_name\":\"Fisbey\",\"email\":\"dfisbey10@ca.gov\"}\n{\"id\":38,\"first_name\":\"Christin\",\"last_name\":\"Yokel\",\"email\":\"cyokel11@weebly.com\"}\n{\"id\":39,\"first_name\":\"Maryjo\",\"last_name\":\"Thridgould\",\"email\":\"mthridgould12@apple.com\"}\n{\"id\":40,\"first_name\":\"Maxie\",\"last_name\":\"Nock\",\"email\":\"mnock13@symantec.com\"}\n{\"id\":41,\"first_name\":\"Shani\",\"last_name\":\"Breeds\",\"email\":\"sbreeds14@who.int\"}\n{\"id\":42,\"first_name\":\"Loraine\",\"last_name\":\"Sainthill\",\"email\":\"lsainthill15@icq.com\"}\n{\"id\":43,\"first_name\":\"Lorain\",\"last_name\":\"Davidou\",\"email\":\"ldavidou16@hexun.com\"}\n{\"id\":44,\"first_name\":\"Ameline\",\"last_name\":\"Dymoke\",\"email\":\"adymoke17@usgs.gov\"}\n{\"id\":45,\"first_name\":\"Tod\",\"last_name\":\"Mendenhall\",\"email\":\"tmendenhall18@skype.com\"}\n{\"id\":46,\"first_name\":\"Cloris\",\"last_name\":\"Pengilley\",\"email\":\"cpengilley19@upenn.edu\"}\n{\"id\":47,\"first_name\":\"Kathi\",\"last_name\":\"Ortells\",\"email\":\"kortells1a@state.tx.us\"}\n{\"id\":48,\"first_name\":\"Sal\",\"last_name\":\"Praill\",\"email\":\"spraill1b@deviantart.com\"}\n{\"id\":49,\"first_name\":\"Gideon\",\"last_name\":\"McCauley\",\"email\":\"gmccauley1c@wordpress.com\"}\n{\"id\":50,\"first_name\":\"Rickie\",\"last_name\":\"Zanitti\",\"email\":\"rzanitti1d@jigsy.com\"}\n{\"id\":51,\"first_name\":\"Leodora\",
\"last_name\":\"Chaloner\",\"email\":\"lchaloner1e@163.com\"}\n{\"id\":52,\"first_name\":\"Freida\",\"last_name\":\"Strethill\",\"email\":\"fstrethill1f@photobucket.com\"}\n{\"id\":53,\"first_name\":\"Noach\",\"last_name\":\"Coot\",\"email\":\"ncoot1g@omniture.com\"}\n{\"id\":54,\"first_name\":\"Shawn\",\"last_name\":\"Booij\",\"email\":\"sbooij1h@eepurl.com\"}\n{\"id\":55,\"first_name\":\"Currey\",\"last_name\":\"Boyford\",\"email\":\"cboyford1i@netvibes.com\"}\n{\"id\":56,\"first_name\":\"Kaitlin\",\"last_name\":\"Ripon\",\"email\":\"kripon1j@prnewswire.com\"}\n{\"id\":57,\"first_name\":\"Quintina\",\"last_name\":\"Hallows\",\"email\":\"qhallows1k@wikimedia.org\"}\n{\"id\":58,\"first_name\":\"Gallagher\",\"last_name\":\"Degoey\",\"email\":\"gdegoey1l@ox.ac.uk\"}\n{\"id\":59,\"first_name\":\"Bride\",\"last_name\":\"Lodin\",\"email\":\"blodin1m@storify.com\"}\n{\"id\":60,\"first_name\":\"Wilhelmine\",\"last_name\":\"Longworth\",\"email\":\"wlongworth1n@drupal.org\"}\n{\"id\":61,\"first_name\":\"Giacinta\",\"last_name\":\"Gulliman\",\"email\":\"ggulliman1o@ow.ly\"}\n{\"id\":62,\"first_name\":\"Whitney\",\"last_name\":\"Swallwell\",\"email\":\"wswallwell1p@businesswire.com\"}\n{\"id\":63,\"first_name\":\"Boris\",\"last_name\":\"Larraway\",\"email\":\"blarraway1q@disqus.com\"}\n{\"id\":64,\"first_name\":\"Guenevere\",\"last_name\":\"Pierse\",\"email\":\"gpierse1r@sourceforge.net\"}\n{\"id\":65,\"first_name\":\"Leela\",\"last_name\":\"O'Carran\",\"email\":\"locarran1s@telegraph.co.uk\"}\n{\"id\":66,\"first_name\":\"Anna-diana\",\"last_name\":\"Marple\",\"email\":\"amarple1t@unesco.org\"}\n{\"id\":67,\"first_name\":\"Irita\",\"last_name\":\"Hayto\",\"email\":\"ihayto1u@samsung.com\"}\n{\"id\":68,\"first_name\":\"Rance\",\"last_name\":\"Urwen\",\"email\":\"rurwen1v@t.co\"}\n{\"id\":69,\"first_name\":\"Mollie\",\"last_name\":\"Frowing\",\"email\":\"mfrowing1w@comcast.net\"}\n{\"id\":70,\"first_name\":\"Ellsworth\",\"last_name\":\"Sandell\",\"email\":\"esandell1x@google.com
\"}\n{\"id\":71,\"first_name\":\"Jude\",\"last_name\":\"Rooksby\",\"email\":\"jrooksby1y@issuu.com\"}\n{\"id\":72,\"first_name\":\"Zahara\",\"last_name\":\"Edworthie\",\"email\":\"zedworthie1z@mysql.com\"}\n{\"id\":73,\"first_name\":\"Prentiss\",\"last_name\":\"Meddings\",\"email\":\"pmeddings20@deviantart.com\"}\n{\"id\":74,\"first_name\":\"Chrissie\",\"last_name\":\"Lechmere\",\"email\":\"clechmere21@abc.net.au\"}\n{\"id\":75,\"first_name\":\"Rosie\",\"last_name\":\"Danels\",\"email\":\"rdanels22@flavors.me\"}\n{\"id\":76,\"first_name\":\"Cchaddie\",\"last_name\":\"Hatfield\",\"email\":\"chatfield23@ehow.com\"}\n{\"id\":77,\"first_name\":\"Deana\",\"last_name\":\"Goter\",\"email\":\"dgoter24@yellowpages.com\"}\n{\"id\":78,\"first_name\":\"Agatha\",\"last_name\":\"Witchard\",\"email\":\"awitchard25@google.cn\"}\n{\"id\":79,\"first_name\":\"Rhiamon\",\"last_name\":\"Bleckly\",\"email\":\"rbleckly26@nymag.com\"}\n{\"id\":80,\"first_name\":\"Kellie\",\"last_name\":\"Karpushkin\",\"email\":\"kkarpushkin27@seattletimes.com\"}\n{\"id\":81,\"first_name\":\"Konstanze\",\"last_name\":\"Ramsbottom\",\"email\":\"kramsbottom28@gov.uk\"}\n{\"id\":82,\"first_name\":\"Doy\",\"last_name\":\"Servant\",\"email\":\"dservant29@amazon.de\"}\n{\"id\":83,\"first_name\":\"Marj\",\"last_name\":\"Kenford\",\"email\":\"mkenford2a@studiopress.com\"}\n{\"id\":84,\"first_name\":\"Amalia\",\"last_name\":\"Hubbock\",\"email\":\"ahubbock2b@upenn.edu\"}\n{\"id\":85,\"first_name\":\"Jeannie\",\"last_name\":\"Vannah\",\"email\":\"jvannah2c@sourceforge.net\"}\n{\"id\":86,\"first_name\":\"Janina\",\"last_name\":\"Wigelsworth\",\"email\":\"jwigelsworth2d@instagram.com\"}\n{\"id\":87,\"first_name\":\"Ermina\",\"last_name\":\"Patshull\",\"email\":\"epatshull2e@arstechnica.com\"}\n{\"id\":88,\"first_name\":\"Abbi\",\"last_name\":\"Joseland\",\"email\":\"ajoseland2f@discuz.net\"}\n{\"id\":89,\"first_name\":\"Lavinie\",\"last_name\":\"Cosson\",\"email\":\"lcosson2g@ucoz.com\"}\n{\"id\":90,\"first_name\":\"No
ble\",\"last_name\":\"Wyborn\",\"email\":\"nwyborn2h@sun.com\"}\n{\"id\":91,\"first_name\":\"Lulita\",\"last_name\":\"Tunnicliff\",\"email\":\"ltunnicliff2i@dagondesign.com\"}\n{\"id\":92,\"first_name\":\"Debera\",\"last_name\":\"Juris\",\"email\":\"djuris2j@samsung.com\"}\n{\"id\":93,\"first_name\":\"Dania\",\"last_name\":\"Heersema\",\"email\":\"dheersema2k@ox.ac.uk\"}\n{\"id\":94,\"first_name\":\"Jacki\",\"last_name\":\"Leveridge\",\"email\":\"jleveridge2l@dropbox.com\"}\n{\"id\":95,\"first_name\":\"Zara\",\"last_name\":\"Sainsbury-Brown\",\"email\":\"zsainsburybrown2m@thetimes.co.uk\"}\n{\"id\":96,\"first_name\":\"Rhianna\",\"last_name\":\"Pittson\",\"email\":\"rpittson2n@ameblo.jp\"}\n{\"id\":97,\"first_name\":\"Bevon\",\"last_name\":\"Rugge\",\"email\":\"brugge2o@163.com\"}\n{\"id\":98,\"first_name\":\"Balduin\",\"last_name\":\"Crosen\",\"email\":\"bcrosen2p@nationalgeographic.com\"}\n{\"id\":99,\"first_name\":\"Jens\",\"last_name\":\"Muspratt\",\"email\":\"jmuspratt2q@wordpress.org\"}\n{\"id\":100,\"first_name\":\"Butch\",\"last_name\":\"Rijkeseis\",\"email\":\"brijkeseis2r@wordpress.com\"}\n{\"id\":101,\"first_name\":\"Garold\",\"last_name\":\"Tincey\",\"email\":\"gtincey2s@unesco.org\"}\n{\"id\":102,\"first_name\":\"Krishna\",\"last_name\":\"Starkie\",\"email\":\"kstarkie2t@gizmodo.com\"}\n{\"id\":103,\"first_name\":\"Thomasine\",\"last_name\":\"Tickner\",\"email\":\"ttickner2u@slashdot.org\"}\n{\"id\":104,\"first_name\":\"Tuesday\",\"last_name\":\"Osmon\",\"email\":\"tosmon2v@vkontakte.ru\"}\n{\"id\":105,\"first_name\":\"Elberta\",\"last_name\":\"Ellsbury\",\"email\":\"eellsbury2w@cmu.edu\"}\n{\"id\":106,\"first_name\":\"Rudyard\",\"last_name\":\"Barrie\",\"email\":\"rbarrie2x@wisc.edu\"}\n{\"id\":107,\"first_name\":\"Cash\",\"last_name\":\"Cloutt\",\"email\":\"ccloutt2y@twitter.com\"}\n{\"id\":108,\"first_name\":\"Jammal\",\"last_name\":\"Bateman\",\"email\":\"jbateman2z@va.gov\"}\n{\"id\":109,\"first_name\":\"Binnie\",\"last_name\":\"Siddall\",\"email\":
\"bsiddall30@narod.ru\"}\n{\"id\":110,\"first_name\":\"Cirillo\",\"last_name\":\"Stockbridge\",\"email\":\"cstockbridge31@reddit.com\"}\n{\"id\":111,\"first_name\":\"Cherie\",\"last_name\":\"Sommerlie\",\"email\":\"csommerlie32@scientificamerican.com\"}\n{\"id\":112,\"first_name\":\"Jaquenette\",\"last_name\":\"Autrie\",\"email\":\"jautrie33@umich.edu\"}\n{\"id\":113,\"first_name\":\"Ellswerth\",\"last_name\":\"Bethell\",\"email\":\"ebethell34@adobe.com\"}\n{\"id\":114,\"first_name\":\"Earvin\",\"last_name\":\"Millmore\",\"email\":\"emillmore35@kickstarter.com\"}\n{\"id\":115,\"first_name\":\"Hall\",\"last_name\":\"Ruddle\",\"email\":\"hruddle36@tripod.com\"}\n{\"id\":116,\"first_name\":\"Eveleen\",\"last_name\":\"O'Kennavain\",\"email\":\"eokennavain37@squarespace.com\"}\n{\"id\":117,\"first_name\":\"Dillie\",\"last_name\":\"Petrescu\",\"email\":\"dpetrescu38@ibm.com\"}\n{\"id\":118,\"first_name\":\"Celestia\",\"last_name\":\"Burwood\",\"email\":\"cburwood39@oaic.gov.au\"}\n{\"id\":119,\"first_name\":\"Lynna\",\"last_name\":\"Barnsdall\",\"email\":\"lbarnsdall3a@marketwatch.com\"}\n{\"id\":120,\"first_name\":\"Jandy\",\"last_name\":\"Noods\",\"email\":\"jnoods3b@behance.net\"}\n{\"id\":121,\"first_name\":\"Guillemette\",\"last_name\":\"Chinn\",\"email\":\"gchinn3c@comcast.net\"}\n{\"id\":122,\"first_name\":\"Abagael\",\"last_name\":\"Keenleyside\",\"email\":\"akeenleyside3d@usatoday.com\"}\n{\"id\":123,\"first_name\":\"Arlan\",\"last_name\":\"Kubczak\",\"email\":\"akubczak3e@apache.org\"}\n{\"id\":124,\"first_name\":\"Tybi\",\"last_name\":\"Flobert\",\"email\":\"tflobert3f@cbsnews.com\"}\n{\"id\":125,\"first_name\":\"Hodge\",\"last_name\":\"Champley\",\"email\":\"hchampley3g@etsy.com\"}\n{\"id\":126,\"first_name\":\"Betta\",\"last_name\":\"Navarijo\",\"email\":\"bnavarijo3h@xrea.com\"}\n{\"id\":127,\"first_name\":\"Stacee\",\"last_name\":\"Brunetti\",\"email\":\"sbrunetti3i@artisteer.com\"}\n{\"id\":128,\"first_name\":\"Hurlee\",\"last_name\":\"Bowstead\",\"email\"
:\"hbowstead3j@mit.edu\"}\n{\"id\":129,\"first_name\":\"Linet\",\"last_name\":\"Binnall\",\"email\":\"lbinnall3k@blogger.com\"}\n{\"id\":130,\"first_name\":\"Daile\",\"last_name\":\"Borrett\",\"email\":\"dborrett3l@npr.org\"}\n{\"id\":131,\"first_name\":\"Kay\",\"last_name\":\"Coneybeare\",\"email\":\"kconeybeare3m@amazon.co.uk\"}\n{\"id\":132,\"first_name\":\"Menard\",\"last_name\":\"Chatan\",\"email\":\"mchatan3n@nih.gov\"}\n{\"id\":133,\"first_name\":\"Anthiathia\",\"last_name\":\"Innman\",\"email\":\"ainnman3o@wiley.com\"}\n{\"id\":134,\"first_name\":\"Lenka\",\"last_name\":\"Polland\",\"email\":\"lpolland3p@xing.com\"}\n{\"id\":135,\"first_name\":\"Allina\",\"last_name\":\"Custed\",\"email\":\"acusted3q@usa.gov\"}\n{\"id\":136,\"first_name\":\"Janessa\",\"last_name\":\"Gerckens\",\"email\":\"jgerckens3r@nyu.edu\"}\n{\"id\":137,\"first_name\":\"Magdalen\",\"last_name\":\"Doerr\",\"email\":\"mdoerr3s@lycos.com\"}\n{\"id\":138,\"first_name\":\"Pail\",\"last_name\":\"Sellstrom\",\"email\":\"psellstrom3t@sohu.com\"}\n{\"id\":139,\"first_name\":\"Lissy\",\"last_name\":\"Pindell\",\"email\":\"lpindell3u@dailymail.co.uk\"}\n{\"id\":140,\"first_name\":\"Beatriz\",\"last_name\":\"Elintune\",\"email\":\"belintune3v@cdc.gov\"}\n{\"id\":141,\"first_name\":\"Arlana\",\"last_name\":\"Carik\",\"email\":\"acarik3w@bigcartel.com\"}\n{\"id\":142,\"first_name\":\"Conny\",\"last_name\":\"Anster\",\"email\":\"canster3x@domainmarket.com\"}\n{\"id\":143,\"first_name\":\"Alvera\",\"last_name\":\"Carn\",\"email\":\"acarn3y@biblegateway.com\"}\n{\"id\":144,\"first_name\":\"Adriaens\",\"last_name\":\"Farfoot\",\"email\":\"afarfoot3z@dion.ne.jp\"}\n{\"id\":145,\"first_name\":\"Rosabelle\",\"last_name\":\"Mardoll\",\"email\":\"rmardoll40@i2i.jp\"}\n{\"id\":146,\"first_name\":\"Riki\",\"last_name\":\"Orrett\",\"email\":\"rorrett41@fda.gov\"}\n{\"id\":147,\"first_name\":\"Zorah\",\"last_name\":\"Jaime\",\"email\":\"zjaime42@xing.com\"}\n{\"id\":148,\"first_name\":\"Godwin\",\"last_name\":\"Bi
rkinshaw\",\"email\":\"gbirkinshaw43@mail.ru\"}\n{\"id\":149,\"first_name\":\"Karoly\",\"last_name\":\"Cowl\",\"email\":\"kcowl44@rambler.ru\"}\n{\"id\":150,\"first_name\":\"Myrtle\",\"last_name\":\"Gostling\",\"email\":\"mgostling45@psu.edu\"}\n{\"id\":151,\"first_name\":\"Matthew\",\"last_name\":\"Shird\",\"email\":\"mshird46@linkedin.com\"}\n{\"id\":152,\"first_name\":\"Tanner\",\"last_name\":\"Kubelka\",\"email\":\"tkubelka47@state.tx.us\"}\n{\"id\":153,\"first_name\":\"Cosimo\",\"last_name\":\"Broune\",\"email\":\"cbroune48@sogou.com\"}\n{\"id\":154,\"first_name\":\"Rafa\",\"last_name\":\"Vear\",\"email\":\"rvear49@theglobeandmail.com\"}\n{\"id\":155,\"first_name\":\"Emmalynn\",\"last_name\":\"Elcomb\",\"email\":\"eelcomb4a@cam.ac.uk\"}\n{\"id\":156,\"first_name\":\"Karolina\",\"last_name\":\"Mayler\",\"email\":\"kmayler4b@furl.net\"}\n{\"id\":157,\"first_name\":\"Serene\",\"last_name\":\"Piers\",\"email\":\"spiers4c@etsy.com\"}\n{\"id\":158,\"first_name\":\"Garnet\",\"last_name\":\"Reynault\",\"email\":\"greynault4d@trellian.com\"}\n{\"id\":159,\"first_name\":\"Decca\",\"last_name\":\"Kauscher\",\"email\":\"dkauscher4e@eepurl.com\"}\n{\"id\":160,\"first_name\":\"Olga\",\"last_name\":\"Willas\",\"email\":\"owillas4f@npr.org\"}\n{\"id\":161,\"first_name\":\"Neil\",\"last_name\":\"Chatel\",\"email\":\"nchatel4g@nifty.com\"}\n{\"id\":162,\"first_name\":\"Jim\",\"last_name\":\"Terrelly\",\"email\":\"jterrelly4h@360.cn\"}\n{\"id\":163,\"first_name\":\"Ximenez\",\"last_name\":\"Saffen\",\"email\":\"xsaffen4i@deliciousdays.com\"}\n{\"id\":164,\"first_name\":\"Kandace\",\"last_name\":\"Skitt\",\"email\":\"kskitt4j@webs.com\"}\n{\"id\":165,\"first_name\":\"Dinny\",\"last_name\":\"Borman\",\"email\":\"dborman4k@issuu.com\"}\n{\"id\":166,\"first_name\":\"Wolfy\",\"last_name\":\"McVanamy\",\"email\":\"wmcvanamy4l@msn.com\"}\n{\"id\":167,\"first_name\":\"Tito\",\"last_name\":\"Orlton\",\"email\":\"torlton4m@discuz.net\"}\n{\"id\":168,\"first_name\":\"Leia\",\"last_name\":\"
Evelyn\",\"email\":\"levelyn4n@dagondesign.com\"}\n{\"id\":169,\"first_name\":\"Kamila\",\"last_name\":\"Jeste\",\"email\":\"kjeste4o@imgur.com\"}\n{\"id\":170,\"first_name\":\"Annissa\",\"last_name\":\"Aitchinson\",\"email\":\"aaitchinson4p@msn.com\"}\n{\"id\":171,\"first_name\":\"Zulema\",\"last_name\":\"Pimme\",\"email\":\"zpimme4q@lulu.com\"}\n{\"id\":172,\"first_name\":\"Nichole\",\"last_name\":\"Nealand\",\"email\":\"nnealand4r@arstechnica.com\"}\n{\"id\":173,\"first_name\":\"Barth\",\"last_name\":\"Carver\",\"email\":\"bcarver4s@bloomberg.com\"}\n{\"id\":174,\"first_name\":\"Theodosia\",\"last_name\":\"Helkin\",\"email\":\"thelkin4t@shinystat.com\"}\n{\"id\":175,\"first_name\":\"Sander\",\"last_name\":\"Monkton\",\"email\":\"smonkton4u@globo.com\"}\n{\"id\":176,\"first_name\":\"Claus\",\"last_name\":\"Mattsson\",\"email\":\"cmattsson4v@joomla.org\"}\n{\"id\":177,\"first_name\":\"Julianne\",\"last_name\":\"Bettenay\",\"email\":\"jbettenay4w@com.com\"}\n{\"id\":178,\"first_name\":\"Conant\",\"last_name\":\"Da Costa\",\"email\":\"cdacosta4x@cam.ac.uk\"}\n{\"id\":179,\"first_name\":\"Patten\",\"last_name\":\"Goldby\",\"email\":\"pgoldby4y@nasa.gov\"}\n{\"id\":180,\"first_name\":\"Courtenay\",\"last_name\":\"Taft\",\"email\":\"ctaft4z@indiegogo.com\"}\n{\"id\":181,\"first_name\":\"Kirsten\",\"last_name\":\"Gore\",\"email\":\"kgore50@angelfire.com\"}\n{\"id\":182,\"first_name\":\"Quincy\",\"last_name\":\"Cosslett\",\"email\":\"qcosslett51@netscape.com\"}\n{\"id\":183,\"first_name\":\"Sherri\",\"last_name\":\"Marchenko\",\"email\":\"smarchenko52@simplemachines.org\"}\n{\"id\":184,\"first_name\":\"Marianna\",\"last_name\":\"Van 
Schafflaer\",\"email\":\"mvanschafflaer53@ftc.gov\"}\n{\"id\":185,\"first_name\":\"Idalina\",\"last_name\":\"Mullinger\",\"email\":\"imullinger54@dot.gov\"}\n{\"id\":186,\"first_name\":\"Pauline\",\"last_name\":\"Volk\",\"email\":\"pvolk55@addtoany.com\"}\n{\"id\":187,\"first_name\":\"Joel\",\"last_name\":\"Kaasman\",\"email\":\"jkaasman56@youtube.com\"}\n{\"id\":188,\"first_name\":\"Tremaine\",\"last_name\":\"Follin\",\"email\":\"tfollin57@answers.com\"}\n{\"id\":189,\"first_name\":\"Dyane\",\"last_name\":\"Shasnan\",\"email\":\"dshasnan58@whitehouse.gov\"}\n{\"id\":190,\"first_name\":\"Deny\",\"last_name\":\"Packe\",\"email\":\"dpacke59@wordpress.com\"}\n{\"id\":191,\"first_name\":\"Danella\",\"last_name\":\"Clifford\",\"email\":\"dclifford5a@toplist.cz\"}\n{\"id\":192,\"first_name\":\"Melesa\",\"last_name\":\"Ballach\",\"email\":\"mballach5b@netscape.com\"}\n{\"id\":193,\"first_name\":\"Annabel\",\"last_name\":\"Bragginton\",\"email\":\"abragginton5c@de.vu\"}\n{\"id\":194,\"first_name\":\"Reagen\",\"last_name\":\"Boullin\",\"email\":\"rboullin5d@e-recht24.de\"}\n{\"id\":195,\"first_name\":\"Cassi\",\"last_name\":\"Chieco\",\"email\":\"cchieco5e@go.com\"}\n{\"id\":196,\"first_name\":\"Rollins\",\"last_name\":\"Hurdiss\",\"email\":\"rhurdiss5f@indiatimes.com\"}\n{\"id\":197,\"first_name\":\"Ole\",\"last_name\":\"Martusov\",\"email\":\"omartusov5g@washingtonpost.com\"}\n{\"id\":198,\"first_name\":\"Hillyer\",\"last_name\":\"Godson\",\"email\":\"hgodson5h@51.la\"}\n{\"id\":199,\"first_name\":\"Wat\",\"last_name\":\"Trusdale\",\"email\":\"wtrusdale5i@alexa.com\"}\n{\"id\":200,\"first_name\":\"Dotti\",\"last_name\":\"MacClancey\",\"email\":\"dmacclancey5j@51.la\"}\n{\"id\":201,\"first_name\":\"Cassius\",\"last_name\":\"Vaughan-Hughes\",\"email\":\"cvaughanhughes5k@pcworld.com\"}\n{\"id\":202,\"first_name\":\"Aleksandr\",\"last_name\":\"Rossey\",\"email\":\"arossey5l@imageshack.us\"}\n{\"id\":203,\"first_name\":\"Wakefield\",\"last_name\":\"Goodhay\",\"email\":\"wgoodha
y5m@gov.uk\"}\n{\"id\":204,\"first_name\":\"Patrizio\",\"last_name\":\"Loutheane\",\"email\":\"ploutheane5n@goo.gl\"}\n{\"id\":205,\"first_name\":\"Dayna\",\"last_name\":\"Blaylock\",\"email\":\"dblaylock5o@fda.gov\"}\n{\"id\":206,\"first_name\":\"Ad\",\"last_name\":\"Henken\",\"email\":\"ahenken5p@wikimedia.org\"}\n{\"id\":207,\"first_name\":\"Selene\",\"last_name\":\"Saunderson\",\"email\":\"ssaunderson5q@over-blog.com\"}\n{\"id\":208,\"first_name\":\"Almeda\",\"last_name\":\"Hlavecek\",\"email\":\"ahlavecek5r@ted.com\"}\n{\"id\":209,\"first_name\":\"Jessamine\",\"last_name\":\"Coaster\",\"email\":\"jcoaster5s@ning.com\"}\n{\"id\":210,\"first_name\":\"Valry\",\"last_name\":\"McCollum\",\"email\":\"vmccollum5t@upenn.edu\"}\n{\"id\":211,\"first_name\":\"Cami\",\"last_name\":\"Treherne\",\"email\":\"ctreherne5u@istockphoto.com\"}\n{\"id\":212,\"first_name\":\"Bartholemy\",\"last_name\":\"Tharme\",\"email\":\"btharme5v@businessinsider.com\"}\n{\"id\":213,\"first_name\":\"Sallyann\",\"last_name\":\"Selcraig\",\"email\":\"sselcraig5w@pcworld.com\"}\n{\"id\":214,\"first_name\":\"Felipa\",\"last_name\":\"Faichnie\",\"email\":\"ffaichnie5x@jimdo.com\"}\n{\"id\":215,\"first_name\":\"Charis\",\"last_name\":\"Lakenton\",\"email\":\"clakenton5y@goo.gl\"}\n{\"id\":216,\"first_name\":\"Hildagard\",\"last_name\":\"Klesse\",\"email\":\"hklesse5z@washington.edu\"}\n{\"id\":217,\"first_name\":\"Zoe\",\"last_name\":\"Doul\",\"email\":\"zdoul60@imdb.com\"}\n{\"id\":218,\"first_name\":\"Mab\",\"last_name\":\"Thorrold\",\"email\":\"mthorrold61@freewebs.com\"}\n{\"id\":219,\"first_name\":\"Shay\",\"last_name\":\"Pringer\",\"email\":\"springer62@arizona.edu\"}\n{\"id\":220,\"first_name\":\"Sandi\",\"last_name\":\"Petford\",\"email\":\"spetford63@symantec.com\"}\n{\"id\":221,\"first_name\":\"Yuma\",\"last_name\":\"Pilmer\",\"email\":\"ypilmer64@sciencedaily.com\"}\n{\"id\":222,\"first_name\":\"Jaymee\",\"last_name\":\"Bennen\",\"email\":\"jbennen65@a8.net\"}\n{\"id\":223,\"first_name\":\"C
hris\",\"last_name\":\"Carwithim\",\"email\":\"ccarwithim66@tripadvisor.com\"}\n{\"id\":224,\"first_name\":\"Denney\",\"last_name\":\"Shillum\",\"email\":\"dshillum67@hugedomains.com\"}\n{\"id\":225,\"first_name\":\"Odille\",\"last_name\":\"Marshall\",\"email\":\"omarshall68@aboutads.info\"}\n{\"id\":226,\"first_name\":\"Forrest\",\"last_name\":\"MacMoyer\",\"email\":\"fmacmoyer69@bbc.co.uk\"}\n{\"id\":227,\"first_name\":\"Matteo\",\"last_name\":\"Millhill\",\"email\":\"mmillhill6a@ibm.com\"}\n{\"id\":228,\"first_name\":\"Loni\",\"last_name\":\"Kedie\",\"email\":\"lkedie6b@reverbnation.com\"}\n{\"id\":229,\"first_name\":\"Roland\",\"last_name\":\"Lipyeat\",\"email\":\"rlipyeat6c@aol.com\"}\n{\"id\":230,\"first_name\":\"Merrick\",\"last_name\":\"Catterell\",\"email\":\"mcatterell6d@barnesandnoble.com\"}\n{\"id\":231,\"first_name\":\"Lucias\",\"last_name\":\"Kadar\",\"email\":\"lkadar6e@slideshare.net\"}\n{\"id\":232,\"first_name\":\"Koral\",\"last_name\":\"Sendall\",\"email\":\"ksendall6f@ted.com\"}\n{\"id\":233,\"first_name\":\"Pollyanna\",\"last_name\":\"Asbrey\",\"email\":\"pasbrey6g@salon.com\"}\n{\"id\":234,\"first_name\":\"Gorden\",\"last_name\":\"Guinn\",\"email\":\"gguinn6h@usa.gov\"}\n{\"id\":235,\"first_name\":\"Cal\",\"last_name\":\"Nower\",\"email\":\"cnower6i@cloudflare.com\"}\n{\"id\":236,\"first_name\":\"Waldon\",\"last_name\":\"McGruar\",\"email\":\"wmcgruar6j@sbwire.com\"}\n{\"id\":237,\"first_name\":\"Ginger\",\"last_name\":\"Cheers\",\"email\":\"gcheers6k@gnu.org\"}\n{\"id\":238,\"first_name\":\"Jeremiah\",\"last_name\":\"Ivanitsa\",\"email\":\"jivanitsa6l@mlb.com\"}\n{\"id\":239,\"first_name\":\"Lind\",\"last_name\":\"Marcu\",\"email\":\"lmarcu6m@twitpic.com\"}\n{\"id\":240,\"first_name\":\"Sigismond\",\"last_name\":\"Emmer\",\"email\":\"semmer6n@hatena.ne.jp\"}\n{\"id\":241,\"first_name\":\"Zedekiah\",\"last_name\":\"Davidsen\",\"email\":\"zdavidsen6o@google.ru\"}\n{\"id\":242,\"first_name\":\"Alex\",\"last_name\":\"Formie\",\"email\":\"aformie6p
@loc.gov\"}\n{\"id\":243,\"first_name\":\"Reid\",\"last_name\":\"Goodhall\",\"email\":\"rgoodhall6q@loc.gov\"}\n{\"id\":244,\"first_name\":\"Gray\",\"last_name\":\"Forge\",\"email\":\"gforge6r@virginia.edu\"}\n{\"id\":245,\"first_name\":\"Dalston\",\"last_name\":\"Batteson\",\"email\":\"dbatteson6s@dailymail.co.uk\"}\n{\"id\":246,\"first_name\":\"Base\",\"last_name\":\"Devey\",\"email\":\"bdevey6t@cocolog-nifty.com\"}\n{\"id\":247,\"first_name\":\"Beret\",\"last_name\":\"Bann\",\"email\":\"bbann6u@pcworld.com\"}\n{\"id\":248,\"first_name\":\"Lois\",\"last_name\":\"Dudney\",\"email\":\"ldudney6v@blog.com\"}\n{\"id\":249,\"first_name\":\"Garth\",\"last_name\":\"Renner\",\"email\":\"grenner6w@wsj.com\"}\n{\"id\":250,\"first_name\":\"Dorette\",\"last_name\":\"Baglan\",\"email\":\"dbaglan6x@cisco.com\"}\n{\"id\":251,\"first_name\":\"Joe\",\"last_name\":\"Painter\",\"email\":\"jpainter6y@unblog.fr\"}\n{\"id\":252,\"first_name\":\"Emilie\",\"last_name\":\"Radborn\",\"email\":\"eradborn6z@geocities.jp\"}\n{\"id\":253,\"first_name\":\"Dehlia\",\"last_name\":\"Betchley\",\"email\":\"dbetchley70@slideshare.net\"}\n{\"id\":254,\"first_name\":\"Ertha\",\"last_name\":\"Makepeace\",\"email\":\"emakepeace71@samsung.com\"}\n{\"id\":255,\"first_name\":\"Brita\",\"last_name\":\"Currie\",\"email\":\"bcurrie72@jigsy.com\"}\n{\"id\":256,\"first_name\":\"Blisse\",\"last_name\":\"Collimore\",\"email\":\"bcollimore73@hhs.gov\"}\n{\"id\":257,\"first_name\":\"Stearn\",\"last_name\":\"Tattersall\",\"email\":\"stattersall74@youtu.be\"}\n{\"id\":258,\"first_name\":\"Randi\",\"last_name\":\"Lambertini\",\"email\":\"rlambertini75@dot.gov\"}\n{\"id\":259,\"first_name\":\"Beryle\",\"last_name\":\"Aspray\",\"email\":\"baspray76@biglobe.ne.jp\"}\n{\"id\":260,\"first_name\":\"Pansy\",\"last_name\":\"Ricketts\",\"email\":\"pricketts77@google.co.uk\"}\n{\"id\":261,\"first_name\":\"Jeniece\",\"last_name\":\"Eveque\",\"email\":\"jeveque78@ucoz.ru\"}\n{\"id\":262,\"first_name\":\"Kelsey\",\"last_name\":\"De
sorts\",\"email\":\"kdesorts79@blog.com\"}\n{\"id\":263,\"first_name\":\"Cecilla\",\"last_name\":\"Hunting\",\"email\":\"chunting7a@rediff.com\"}\n{\"id\":264,\"first_name\":\"Robbie\",\"last_name\":\"Rudeyeard\",\"email\":\"rrudeyeard7b@twitpic.com\"}\n{\"id\":265,\"first_name\":\"Ronny\",\"last_name\":\"Cloy\",\"email\":\"rcloy7c@wufoo.com\"}\n{\"id\":266,\"first_name\":\"Fallon\",\"last_name\":\"McGarrie\",\"email\":\"fmcgarrie7d@domainmarket.com\"}\n{\"id\":267,\"first_name\":\"Andree\",\"last_name\":\"Salazar\",\"email\":\"asalazar7e@aol.com\"}\n{\"id\":268,\"first_name\":\"Terri\",\"last_name\":\"Pentlow\",\"email\":\"tpentlow7f@narod.ru\"}\n{\"id\":269,\"first_name\":\"Teodoro\",\"last_name\":\"Guerrazzi\",\"email\":\"tguerrazzi7g@trellian.com\"}\n{\"id\":270,\"first_name\":\"Ashton\",\"last_name\":\"Kirimaa\",\"email\":\"akirimaa7h@cdc.gov\"}\n{\"id\":271,\"first_name\":\"Nikolas\",\"last_name\":\"Vidineev\",\"email\":\"nvidineev7i@tuttocitta.it\"}\n{\"id\":272,\"first_name\":\"Myrtia\",\"last_name\":\"Karsh\",\"email\":\"mkarsh7j@wired.com\"}\n{\"id\":273,\"first_name\":\"Modesty\",\"last_name\":\"Boissier\",\"email\":\"mboissier7k@aol.com\"}\n{\"id\":274,\"first_name\":\"Cyndia\",\"last_name\":\"Leemans\",\"email\":\"cleemans7l@cornell.edu\"}\n{\"id\":275,\"first_name\":\"Shantee\",\"last_name\":\"Maykin\",\"email\":\"smaykin7m@wix.com\"}\n{\"id\":276,\"first_name\":\"Emmerich\",\"last_name\":\"Gergus\",\"email\":\"egergus7n@paypal.com\"}\n{\"id\":277,\"first_name\":\"Lindsey\",\"last_name\":\"Chaulk\",\"email\":\"lchaulk7o@sfgate.com\"}\n{\"id\":278,\"first_name\":\"Romona\",\"last_name\":\"Murie\",\"email\":\"rmurie7p@sun.com\"}\n{\"id\":279,\"first_name\":\"Bel\",\"last_name\":\"Hitzschke\",\"email\":\"bhitzschke7q@jugem.jp\"}\n{\"id\":280,\"first_name\":\"Hilton\",\"last_name\":\"Haythorn\",\"email\":\"hhaythorn7r@sciencedaily.com\"}\n{\"id\":281,\"first_name\":\"Lydie\",\"last_name\":\"Heinrici\",\"email\":\"lheinrici7s@cnbc.com\"}\n{\"id\":282,\"firs
t_name\":\"Gwynne\",\"last_name\":\"Harriman\",\"email\":\"gharriman7t@linkedin.com\"}\n{\"id\":283,\"first_name\":\"Boy\",\"last_name\":\"Abrahmson\",\"email\":\"babrahmson7u@mit.edu\"}\n{\"id\":284,\"first_name\":\"Rayner\",\"last_name\":\"Murrill\",\"email\":\"rmurrill7v@freewebs.com\"}\n{\"id\":285,\"first_name\":\"Steward\",\"last_name\":\"Lodovichi\",\"email\":\"slodovichi7w@ucoz.com\"}\n{\"id\":286,\"first_name\":\"Zelig\",\"last_name\":\"Guillet\",\"email\":\"zguillet7x@vistaprint.com\"}\n{\"id\":287,\"first_name\":\"Merrily\",\"last_name\":\"Millen\",\"email\":\"mmillen7y@cam.ac.uk\"}\n{\"id\":288,\"first_name\":\"Raff\",\"last_name\":\"Goold\",\"email\":\"rgoold7z@canalblog.com\"}\n{\"id\":289,\"first_name\":\"Aland\",\"last_name\":\"Richards\",\"email\":\"arichards80@people.com.cn\"}\n{\"id\":290,\"first_name\":\"Ambrose\",\"last_name\":\"Fanning\",\"email\":\"afanning81@telegraph.co.uk\"}\n{\"id\":291,\"first_name\":\"Dolly\",\"last_name\":\"McConnulty\",\"email\":\"dmcconnulty82@pen.io\"}\n{\"id\":292,\"first_name\":\"Eleen\",\"last_name\":\"Muffen\",\"email\":\"emuffen83@reddit.com\"}\n{\"id\":293,\"first_name\":\"Rachele\",\"last_name\":\"Cleminshaw\",\"email\":\"rcleminshaw84@apache.org\"}\n{\"id\":294,\"first_name\":\"Royall\",\"last_name\":\"Grierson\",\"email\":\"rgrierson85@chron.com\"}\n{\"id\":295,\"first_name\":\"Goldie\",\"last_name\":\"Bouskill\",\"email\":\"gbouskill86@vkontakte.ru\"}\n{\"id\":296,\"first_name\":\"Padraic\",\"last_name\":\"Manolov\",\"email\":\"pmanolov87@163.com\"}\n{\"id\":297,\"first_name\":\"Olivie\",\"last_name\":\"Corcut\",\"email\":\"ocorcut88@nyu.edu\"}\n{\"id\":298,\"first_name\":\"Cly\",\"last_name\":\"Peete\",\"email\":\"cpeete89@feedburner.com\"}\n{\"id\":299,\"first_name\":\"Stanly\",\"last_name\":\"Grieswood\",\"email\":\"sgrieswood8a@google.it\"}\n{\"id\":300,\"first_name\":\"Cullin\",\"last_name\":\"Hammatt\",\"email\":\"chammatt8b@hostgator.com\"}\n{\"id\":301,\"first_name\":\"Talbert\",\"last_name\":\"Lill
iman\",\"email\":\"tlilliman8c@wufoo.com\"}\n{\"id\":302,\"first_name\":\"Britteny\",\"last_name\":\"Kubica\",\"email\":\"bkubica8d@jiathis.com\"}\n{\"id\":303,\"first_name\":\"Leona\",\"last_name\":\"Matthaus\",\"email\":\"lmatthaus8e@chicagotribune.com\"}\n{\"id\":304,\"first_name\":\"Minda\",\"last_name\":\"Emmerson\",\"email\":\"memmerson8f@cnn.com\"}\n{\"id\":305,\"first_name\":\"Reinhard\",\"last_name\":\"Dudderidge\",\"email\":\"rdudderidge8g@twitter.com\"}\n{\"id\":306,\"first_name\":\"Ode\",\"last_name\":\"Isaac\",\"email\":\"oisaac8h@mashable.com\"}\n{\"id\":307,\"first_name\":\"Tannie\",\"last_name\":\"Duffan\",\"email\":\"tduffan8i@netlog.com\"}\n{\"id\":308,\"first_name\":\"Riccardo\",\"last_name\":\"Heaney\",\"email\":\"rheaney8j@g.co\"}\n{\"id\":309,\"first_name\":\"Leann\",\"last_name\":\"Klimpt\",\"email\":\"lklimpt8k@wsj.com\"}\n{\"id\":310,\"first_name\":\"Jacquelyn\",\"last_name\":\"Reddle\",\"email\":\"jreddle8l@prlog.org\"}\n{\"id\":311,\"first_name\":\"Pebrook\",\"last_name\":\"Gladdor\",\"email\":\"pgladdor8m@hp.com\"}\n{\"id\":312,\"first_name\":\"Devy\",\"last_name\":\"Keers\",\"email\":\"dkeers8n@theglobeandmail.com\"}\n{\"id\":313,\"first_name\":\"Em\",\"last_name\":\"Bullock\",\"email\":\"ebullock8o@theglobeandmail.com\"}\n{\"id\":314,\"first_name\":\"Catharine\",\"last_name\":\"Rabbitt\",\"email\":\"crabbitt8p@usnews.com\"}\n{\"id\":315,\"first_name\":\"Roderick\",\"last_name\":\"Barette\",\"email\":\"rbarette8q@icq.com\"}\n{\"id\":316,\"first_name\":\"Joshia\",\"last_name\":\"MacIver\",\"email\":\"jmaciver8r@biblegateway.com\"}\n{\"id\":317,\"first_name\":\"Jarrad\",\"last_name\":\"Donnan\",\"email\":\"jdonnan8s@is.gd\"}\n{\"id\":318,\"first_name\":\"Sophey\",\"last_name\":\"Corriea\",\"email\":\"scorriea8t@liveinternet.ru\"}\n{\"id\":319,\"first_name\":\"Aura\",\"last_name\":\"Pancast\",\"email\":\"apancast8u@google.com.hk\"}\n{\"id\":320,\"first_name\":\"Benedikta\",\"last_name\":\"Billin\",\"email\":\"bbillin8v@cmu.edu\"}\n{\"id\":3
21,\"first_name\":\"Hermine\",\"last_name\":\"Bidgood\",\"email\":\"hbidgood8w@businessweek.com\"}\n{\"id\":322,\"first_name\":\"Woody\",\"last_name\":\"Sellack\",\"email\":\"wsellack8x@hibu.com\"}\n{\"id\":323,\"first_name\":\"Wrennie\",\"last_name\":\"Ivermee\",\"email\":\"wivermee8y@tuttocitta.it\"}\n{\"id\":324,\"first_name\":\"Miranda\",\"last_name\":\"Pyle\",\"email\":\"mpyle8z@bbb.org\"}\n{\"id\":325,\"first_name\":\"Cathee\",\"last_name\":\"Dowdeswell\",\"email\":\"cdowdeswell90@unblog.fr\"}\n{\"id\":326,\"first_name\":\"Georgine\",\"last_name\":\"Beesley\",\"email\":\"gbeesley91@amazon.com\"}\n{\"id\":327,\"first_name\":\"Marmaduke\",\"last_name\":\"Musprat\",\"email\":\"mmusprat92@fda.gov\"}\n{\"id\":328,\"first_name\":\"Jermayne\",\"last_name\":\"Lindro\",\"email\":\"jlindro93@spotify.com\"}\n{\"id\":329,\"first_name\":\"Kris\",\"last_name\":\"Tripe\",\"email\":\"ktripe94@phoca.cz\"}\n{\"id\":330,\"first_name\":\"Ofella\",\"last_name\":\"Antushev\",\"email\":\"oantushev95@desdev.cn\"}\n{\"id\":331,\"first_name\":\"Tod\",\"last_name\":\"Macia\",\"email\":\"tmacia96@engadget.com\"}\n{\"id\":332,\"first_name\":\"Jelene\",\"last_name\":\"Cecere\",\"email\":\"jcecere97@51.la\"}\n{\"id\":333,\"first_name\":\"Britney\",\"last_name\":\"Sanches\",\"email\":\"bsanches98@princeton.edu\"}\n{\"id\":334,\"first_name\":\"Quinn\",\"last_name\":\"Pirolini\",\"email\":\"qpirolini99@oracle.com\"}\n{\"id\":335,\"first_name\":\"Costanza\",\"last_name\":\"Wharby\",\"email\":\"cwharby9a@shareasale.com\"}\n{\"id\":336,\"first_name\":\"Steven\",\"last_name\":\"Edel\",\"email\":\"sedel9b@google.com.au\"}\n{\"id\":337,\"first_name\":\"Quinton\",\"last_name\":\"Simonnet\",\"email\":\"qsimonnet9c@mail.ru\"}\n{\"id\":338,\"first_name\":\"Conroy\",\"last_name\":\"Sorey\",\"email\":\"csorey9d@mapy.cz\"}\n{\"id\":339,\"first_name\":\"Kacey\",\"last_name\":\"Tweedy\",\"email\":\"ktweedy9e@telegraph.co.uk\"}\n{\"id\":340,\"first_name\":\"Aubrie\",\"last_name\":\"Oatley\",\"email\":\"aoatle
y9f@amazon.com\"}\n{\"id\":341,\"first_name\":\"Gerhard\",\"last_name\":\"Rizzolo\",\"email\":\"grizzolo9g@ftc.gov\"}\n{\"id\":342,\"first_name\":\"Keenan\",\"last_name\":\"Godier\",\"email\":\"kgodier9h@harvard.edu\"}\n{\"id\":343,\"first_name\":\"Alard\",\"last_name\":\"Tubridy\",\"email\":\"atubridy9i@ask.com\"}\n{\"id\":344,\"first_name\":\"Nadine\",\"last_name\":\"Naden\",\"email\":\"nnaden9j@ca.gov\"}\n{\"id\":345,\"first_name\":\"Alessandro\",\"last_name\":\"Timson\",\"email\":\"atimson9k@gov.uk\"}\n{\"id\":346,\"first_name\":\"Mirella\",\"last_name\":\"Shurville\",\"email\":\"mshurville9l@google.co.uk\"}\n{\"id\":347,\"first_name\":\"Libbie\",\"last_name\":\"Waterman\",\"email\":\"lwaterman9m@abc.net.au\"}\n{\"id\":348,\"first_name\":\"Cordy\",\"last_name\":\"Selesnick\",\"email\":\"cselesnick9n@freewebs.com\"}\n{\"id\":349,\"first_name\":\"Gerrard\",\"last_name\":\"Roney\",\"email\":\"groney9o@xrea.com\"}\n{\"id\":350,\"first_name\":\"Paulita\",\"last_name\":\"Giacomi\",\"email\":\"pgiacomi9p@utexas.edu\"}\n{\"id\":351,\"first_name\":\"Martguerita\",\"last_name\":\"Ceaser\",\"email\":\"mceaser9q@toplist.cz\"}\n{\"id\":352,\"first_name\":\"Alexei\",\"last_name\":\"Kellitt\",\"email\":\"akellitt9r@joomla.org\"}\n{\"id\":353,\"first_name\":\"Domenico\",\"last_name\":\"Byard\",\"email\":\"dbyard9s@histats.com\"}\n{\"id\":354,\"first_name\":\"Herby\",\"last_name\":\"Piele\",\"email\":\"hpiele9t@edublogs.org\"}\n{\"id\":355,\"first_name\":\"Shaughn\",\"last_name\":\"Ramsby\",\"email\":\"sramsby9u@ning.com\"}\n{\"id\":356,\"first_name\":\"Britni\",\"last_name\":\"Maginot\",\"email\":\"bmaginot9v@walmart.com\"}\n{\"id\":357,\"first_name\":\"May\",\"last_name\":\"Manshaw\",\"email\":\"mmanshaw9w@slideshare.net\"}\n{\"id\":358,\"first_name\":\"Onofredo\",\"last_name\":\"Corcut\",\"email\":\"ocorcut9x@vistaprint.com\"}\n{\"id\":359,\"first_name\":\"Lincoln\",\"last_name\":\"Mantrup\",\"email\":\"lmantrup9y@yolasite.com\"}\n{\"id\":360,\"first_name\":\"Ora\",\"last_nam
e\":\"Kearton\",\"email\":\"okearton9z@bravesites.com\"}\n{\"id\":361,\"first_name\":\"Gaylord\",\"last_name\":\"Kulicke\",\"email\":\"gkulickea0@pen.io\"}\n{\"id\":362,\"first_name\":\"Thibaut\",\"last_name\":\"Easbie\",\"email\":\"teasbiea1@amazonaws.com\"}\n{\"id\":363,\"first_name\":\"Doro\",\"last_name\":\"Metts\",\"email\":\"dmettsa2@tiny.cc\"}\n{\"id\":364,\"first_name\":\"Lon\",\"last_name\":\"Breslane\",\"email\":\"lbreslanea3@sciencedaily.com\"}\n{\"id\":365,\"first_name\":\"Coreen\",\"last_name\":\"Coultass\",\"email\":\"ccoultassa4@skyrock.com\"}\n{\"id\":366,\"first_name\":\"Donall\",\"last_name\":\"Cusack\",\"email\":\"dcusacka5@cyberchimps.com\"}\n{\"id\":367,\"first_name\":\"Derrek\",\"last_name\":\"O'Sharry\",\"email\":\"dosharrya6@cbslocal.com\"}\n{\"id\":368,\"first_name\":\"Barth\",\"last_name\":\"Thieme\",\"email\":\"bthiemea7@odnoklassniki.ru\"}\n{\"id\":369,\"first_name\":\"Wyatt\",\"last_name\":\"Alderton\",\"email\":\"waldertona8@whitehouse.gov\"}\n{\"id\":370,\"first_name\":\"Correna\",\"last_name\":\"Colmore\",\"email\":\"ccolmorea9@icio.us\"}\n{\"id\":371,\"first_name\":\"Kacy\",\"last_name\":\"Weippert\",\"email\":\"kweippertaa@devhub.com\"}\n{\"id\":372,\"first_name\":\"Trista\",\"last_name\":\"Androsik\",\"email\":\"tandrosikab@boston.com\"}\n{\"id\":373,\"first_name\":\"Joyan\",\"last_name\":\"Abramski\",\"email\":\"jabramskiac@gnu.org\"}\n{\"id\":374,\"first_name\":\"Rafi\",\"last_name\":\"Garfield\",\"email\":\"rgarfieldad@yellowpages.com\"}\n{\"id\":375,\"first_name\":\"Wilton\",\"last_name\":\"Tankard\",\"email\":\"wtankardae@networksolutions.com\"}\n{\"id\":376,\"first_name\":\"Noak\",\"last_name\":\"Crielly\",\"email\":\"ncriellyaf@guardian.co.uk\"}\n{\"id\":377,\"first_name\":\"Louisette\",\"last_name\":\"Redrup\",\"email\":\"lredrupag@phoca.cz\"}\n{\"id\":378,\"first_name\":\"Anneliese\",\"last_name\":\"Pegram\",\"email\":\"apegramah@vinaora.com\"}\n{\"id\":379,\"first_name\":\"Jenny\",\"last_name\":\"Guirau\",\"email\":\"jgui
rauai@sciencedirect.com\"}\n{\"id\":380,\"first_name\":\"Phaedra\",\"last_name\":\"Plaistowe\",\"email\":\"pplaistoweaj@usatoday.com\"}\n{\"id\":381,\"first_name\":\"Terri\",\"last_name\":\"Mathew\",\"email\":\"tmathewak@jigsy.com\"}\n{\"id\":382,\"first_name\":\"Allegra\",\"last_name\":\"Wenn\",\"email\":\"awennal@squidoo.com\"}\n{\"id\":383,\"first_name\":\"Davita\",\"last_name\":\"Fergyson\",\"email\":\"dfergysonam@wix.com\"}\n{\"id\":384,\"first_name\":\"Genovera\",\"last_name\":\"Billinge\",\"email\":\"gbillingean@edublogs.org\"}\n{\"id\":385,\"first_name\":\"Lyndsay\",\"last_name\":\"Zuker\",\"email\":\"lzukerao@npr.org\"}\n{\"id\":386,\"first_name\":\"Bess\",\"last_name\":\"Ryrie\",\"email\":\"bryrieap@cnbc.com\"}\n{\"id\":387,\"first_name\":\"Rena\",\"last_name\":\"Orniz\",\"email\":\"rornizaq@cargocollective.com\"}\n{\"id\":388,\"first_name\":\"Baudoin\",\"last_name\":\"Walklate\",\"email\":\"bwalklatear@uiuc.edu\"}\n{\"id\":389,\"first_name\":\"Cherri\",\"last_name\":\"Pollicote\",\"email\":\"cpollicoteas@state.tx.us\"}\n{\"id\":390,\"first_name\":\"Yanaton\",\"last_name\":\"Knappitt\",\"email\":\"yknappittat@ucoz.com\"}\n{\"id\":391,\"first_name\":\"Audi\",\"last_name\":\"Izkovicz\",\"email\":\"aizkoviczau@google.es\"}\n{\"id\":392,\"first_name\":\"Lewes\",\"last_name\":\"Chilcott\",\"email\":\"lchilcottav@amazonaws.com\"}\n{\"id\":393,\"first_name\":\"Mariquilla\",\"last_name\":\"Pinck\",\"email\":\"mpinckaw@example.com\"}\n{\"id\":394,\"first_name\":\"Lodovico\",\"last_name\":\"Tadman\",\"email\":\"ltadmanax@vk.com\"}\n{\"id\":395,\"first_name\":\"Yancey\",\"last_name\":\"Beardsall\",\"email\":\"ybeardsallay@printfriendly.com\"}\n{\"id\":396,\"first_name\":\"Matilda\",\"last_name\":\"Fedorski\",\"email\":\"mfedorskiaz@biglobe.ne.jp\"}\n{\"id\":397,\"first_name\":\"Kellen\",\"last_name\":\"Cleveley\",\"email\":\"kcleveleyb0@state.gov\"}\n{\"id\":398,\"first_name\":\"Dave\",\"last_name\":\"Aglione\",\"email\":\"daglioneb1@barnesandnoble.com\"}\n{\"id\":39
9,\"first_name\":\"Sanders\",\"last_name\":\"Noades\",\"email\":\"snoadesb2@wufoo.com\"}\n{\"id\":400,\"first_name\":\"Ingar\",\"last_name\":\"Asser\",\"email\":\"iasserb3@youku.com\"}\n{\"id\":401,\"first_name\":\"Sandie\",\"last_name\":\"Gregore\",\"email\":\"sgregoreb4@msu.edu\"}\n{\"id\":402,\"first_name\":\"Georgiana\",\"last_name\":\"Statefield\",\"email\":\"gstatefieldb5@kickstarter.com\"}\n{\"id\":403,\"first_name\":\"Jackquelin\",\"last_name\":\"Frugier\",\"email\":\"jfrugierb6@unicef.org\"}\n{\"id\":404,\"first_name\":\"Hillary\",\"last_name\":\"Dallyn\",\"email\":\"hdallynb7@loc.gov\"}\n{\"id\":405,\"first_name\":\"Townsend\",\"last_name\":\"Syde\",\"email\":\"tsydeb8@answers.com\"}\n{\"id\":406,\"first_name\":\"Rina\",\"last_name\":\"Scurrah\",\"email\":\"rscurrahb9@wunderground.com\"}\n{\"id\":407,\"first_name\":\"Frank\",\"last_name\":\"Sheer\",\"email\":\"fsheerba@howstuffworks.com\"}\n{\"id\":408,\"first_name\":\"Marice\",\"last_name\":\"Bertie\",\"email\":\"mbertiebb@senate.gov\"}\n{\"id\":409,\"first_name\":\"Cookie\",\"last_name\":\"McMillan\",\"email\":\"cmcmillanbc@webeden.co.uk\"}\n{\"id\":410,\"first_name\":\"Emmy\",\"last_name\":\"Lauthian\",\"email\":\"elauthianbd@dedecms.com\"}\n{\"id\":411,\"first_name\":\"Kory\",\"last_name\":\"Francklin\",\"email\":\"kfrancklinbe@furl.net\"}\n{\"id\":412,\"first_name\":\"Gardiner\",\"last_name\":\"Senyard\",\"email\":\"gsenyardbf@wikispaces.com\"}\n{\"id\":413,\"first_name\":\"Mercedes\",\"last_name\":\"Kolczynski\",\"email\":\"mkolczynskibg@posterous.com\"}\n{\"id\":414,\"first_name\":\"Myriam\",\"last_name\":\"Saben\",\"email\":\"msabenbh@wix.com\"}\n{\"id\":415,\"first_name\":\"Chevy\",\"last_name\":\"Quinell\",\"email\":\"cquinellbi@dyndns.org\"}\n{\"id\":416,\"first_name\":\"Ed\",\"last_name\":\"Heddon\",\"email\":\"eheddonbj@usda.gov\"}\n{\"id\":417,\"first_name\":\"Gerek\",\"last_name\":\"Baddiley\",\"email\":\"gbaddileybk@diigo.com\"}\n{\"id\":418,\"first_name\":\"Edgardo\",\"last_name\":\"Caresw
ell\",\"email\":\"ecareswellbl@amazonaws.com\"}\n{\"id\":419,\"first_name\":\"Hunfredo\",\"last_name\":\"Gibbard\",\"email\":\"hgibbardbm@istockphoto.com\"}\n{\"id\":420,\"first_name\":\"Audie\",\"last_name\":\"Siddle\",\"email\":\"asiddlebn@squidoo.com\"}\n{\"id\":421,\"first_name\":\"Adey\",\"last_name\":\"Kingsford\",\"email\":\"akingsfordbo@telegraph.co.uk\"}\n{\"id\":422,\"first_name\":\"Ethelyn\",\"last_name\":\"Vanyushkin\",\"email\":\"evanyushkinbp@amazonaws.com\"}\n{\"id\":423,\"first_name\":\"Bron\",\"last_name\":\"Edger\",\"email\":\"bedgerbq@bandcamp.com\"}\n{\"id\":424,\"first_name\":\"Cathrine\",\"last_name\":\"Arnaldo\",\"email\":\"carnaldobr@imgur.com\"}\n{\"id\":425,\"first_name\":\"Rickie\",\"last_name\":\"Yeskov\",\"email\":\"ryeskovbs@slate.com\"}\n{\"id\":426,\"first_name\":\"Gale\",\"last_name\":\"Choat\",\"email\":\"gchoatbt@google.de\"}\n{\"id\":427,\"first_name\":\"Garik\",\"last_name\":\"Leak\",\"email\":\"gleakbu@privacy.gov.au\"}\n{\"id\":428,\"first_name\":\"Timofei\",\"last_name\":\"Whiteoak\",\"email\":\"twhiteoakbv@tinyurl.com\"}\n{\"id\":429,\"first_name\":\"Wally\",\"last_name\":\"Caughan\",\"email\":\"wcaughanbw@msu.edu\"}\n{\"id\":430,\"first_name\":\"Yancy\",\"last_name\":\"Stealfox\",\"email\":\"ystealfoxbx@berkeley.edu\"}\n{\"id\":431,\"first_name\":\"Dela\",\"last_name\":\"Strong\",\"email\":\"dstrongby@free.fr\"}\n{\"id\":432,\"first_name\":\"Dougie\",\"last_name\":\"Tewnion\",\"email\":\"dtewnionbz@discovery.com\"}\n{\"id\":433,\"first_name\":\"Kelly\",\"last_name\":\"Frean\",\"email\":\"kfreanc0@imageshack.us\"}\n{\"id\":434,\"first_name\":\"Connie\",\"last_name\":\"Blaschek\",\"email\":\"cblaschekc1@wikipedia.org\"}\n{\"id\":435,\"first_name\":\"Michell\",\"last_name\":\"D'Ambrogio\",\"email\":\"mdambrogioc2@sfgate.com\"}\n{\"id\":436,\"first_name\":\"Larine\",\"last_name\":\"Comber\",\"email\":\"lcomberc3@com.com\"}\n{\"id\":437,\"first_name\":\"Giacopo\",\"last_name\":\"Linde\",\"email\":\"glindec4@tripadvisor.com\"}\n{\
"id\":438,\"first_name\":\"Debbi\",\"last_name\":\"Whaley\",\"email\":\"dwhaleyc5@studiopress.com\"}\n{\"id\":439,\"first_name\":\"Alva\",\"last_name\":\"Matyasik\",\"email\":\"amatyasikc6@gmpg.org\"}\n{\"id\":440,\"first_name\":\"Mead\",\"last_name\":\"Andrini\",\"email\":\"mandrinic7@arstechnica.com\"}\n{\"id\":441,\"first_name\":\"Esme\",\"last_name\":\"Casetti\",\"email\":\"ecasettic8@furl.net\"}\n{\"id\":442,\"first_name\":\"Barbara\",\"last_name\":\"Piel\",\"email\":\"bpielc9@addtoany.com\"}\n{\"id\":443,\"first_name\":\"Slade\",\"last_name\":\"Coaker\",\"email\":\"scoakerca@spiegel.de\"}\n{\"id\":444,\"first_name\":\"Read\",\"last_name\":\"Wackley\",\"email\":\"rwackleycb@va.gov\"}\n{\"id\":445,\"first_name\":\"Min\",\"last_name\":\"Dunnet\",\"email\":\"mdunnetcc@friendfeed.com\"}\n{\"id\":446,\"first_name\":\"Barbabra\",\"last_name\":\"Taffarello\",\"email\":\"btaffarellocd@samsung.com\"}\n{\"id\":447,\"first_name\":\"Sadie\",\"last_name\":\"Stanlick\",\"email\":\"sstanlickce@vistaprint.com\"}\n{\"id\":448,\"first_name\":\"Keefe\",\"last_name\":\"Inglese\",\"email\":\"kinglesecf@nationalgeographic.com\"}\n{\"id\":449,\"first_name\":\"Domenic\",\"last_name\":\"Tomasek\",\"email\":\"dtomasekcg@privacy.gov.au\"}\n{\"id\":450,\"first_name\":\"Felic\",\"last_name\":\"Rydzynski\",\"email\":\"frydzynskich@flickr.com\"}\n{\"id\":451,\"first_name\":\"Alfy\",\"last_name\":\"Hamman\",\"email\":\"ahammanci@slate.com\"}\n{\"id\":452,\"first_name\":\"Townie\",\"last_name\":\"Tomini\",\"email\":\"ttominicj@printfriendly.com\"}\n{\"id\":453,\"first_name\":\"Evangelia\",\"last_name\":\"Badrick\",\"email\":\"ebadrickck@facebook.com\"}\n{\"id\":454,\"first_name\":\"Caron\",\"last_name\":\"Cornil\",\"email\":\"ccornilcl@newsvine.com\"}\n{\"id\":455,\"first_name\":\"Ernie\",\"last_name\":\"Reddin\",\"email\":\"ereddincm@tmall.com\"}\n{\"id\":456,\"first_name\":\"Arley\",\"last_name\":\"Wardall\",\"email\":\"awardallcn@tripod.com\"}\n{\"id\":457,\"first_name\":\"Robinet\",\"last_
name\":\"Sam\",\"email\":\"rsamco@upenn.edu\"}\n{\"id\":458,\"first_name\":\"Melisenda\",\"last_name\":\"Timeby\",\"email\":\"mtimebycp@shinystat.com\"}\n{\"id\":459,\"first_name\":\"Shantee\",\"last_name\":\"Annes\",\"email\":\"sannescq@va.gov\"}\n{\"id\":460,\"first_name\":\"Molli\",\"last_name\":\"Mish\",\"email\":\"mmishcr@dot.gov\"}\n{\"id\":461,\"first_name\":\"Merralee\",\"last_name\":\"Vickerman\",\"email\":\"mvickermancs@baidu.com\"}\n{\"id\":462,\"first_name\":\"Wilhelmina\",\"last_name\":\"Heminsley\",\"email\":\"wheminsleyct@is.gd\"}\n{\"id\":463,\"first_name\":\"Granville\",\"last_name\":\"MacKeague\",\"email\":\"gmackeaguecu@pbs.org\"}\n{\"id\":464,\"first_name\":\"Sean\",\"last_name\":\"Loftus\",\"email\":\"sloftuscv@psu.edu\"}\n{\"id\":465,\"first_name\":\"Nevins\",\"last_name\":\"Gawke\",\"email\":\"ngawkecw@stumbleupon.com\"}\n{\"id\":466,\"first_name\":\"Gunilla\",\"last_name\":\"Lucock\",\"email\":\"glucockcx@deviantart.com\"}\n{\"id\":467,\"first_name\":\"Haydon\",\"last_name\":\"Fiddy\",\"email\":\"hfiddycy@imgur.com\"}\n{\"id\":468,\"first_name\":\"Ema\",\"last_name\":\"Salatino\",\"email\":\"esalatinocz@sitemeter.com\"}\n{\"id\":469,\"first_name\":\"Valentin\",\"last_name\":\"Yakovliv\",\"email\":\"vyakovlivd0@furl.net\"}\n{\"id\":470,\"first_name\":\"Carri\",\"last_name\":\"Saltern\",\"email\":\"csalternd1@biblegateway.com\"}\n{\"id\":471,\"first_name\":\"Kristos\",\"last_name\":\"Stanmore\",\"email\":\"kstanmored2@4shared.com\"}\n{\"id\":472,\"first_name\":\"Adriena\",\"last_name\":\"Bes\",\"email\":\"abesd3@constantcontact.com\"}\n{\"id\":473,\"first_name\":\"Cristabel\",\"last_name\":\"Bortolutti\",\"email\":\"cbortoluttid4@pinterest.com\"}\n{\"id\":474,\"first_name\":\"Hersh\",\"last_name\":\"Lock\",\"email\":\"hlockd5@forbes.com\"}\n{\"id\":475,\"first_name\":\"Yoshi\",\"last_name\":\"Marler\",\"email\":\"ymarlerd6@free.fr\"}\n{\"id\":476,\"first_name\":\"Tremaine\",\"last_name\":\"Librey\",\"email\":\"tlibreyd7@trellian.com\"}\n{\"id\"
:477,\"first_name\":\"Inge\",\"last_name\":\"Strawbridge\",\"email\":\"istrawbridged8@barnesandnoble.com\"}\n{\"id\":478,\"first_name\":\"Pascal\",\"last_name\":\"Carvill\",\"email\":\"pcarvilld9@washington.edu\"}\n{\"id\":479,\"first_name\":\"Zabrina\",\"last_name\":\"Ianitti\",\"email\":\"zianittida@yellowpages.com\"}\n{\"id\":480,\"first_name\":\"Almeta\",\"last_name\":\"Wessell\",\"email\":\"awesselldb@bbb.org\"}\n{\"id\":481,\"first_name\":\"Crissie\",\"last_name\":\"Troy\",\"email\":\"ctroydc@rambler.ru\"}\n{\"id\":482,\"first_name\":\"Xena\",\"last_name\":\"Frammingham\",\"email\":\"xframminghamdd@t-online.de\"}\n{\"id\":483,\"first_name\":\"Zilvia\",\"last_name\":\"Grinvalds\",\"email\":\"zgrinvaldsde@example.com\"}\n{\"id\":484,\"first_name\":\"Brit\",\"last_name\":\"Twelftree\",\"email\":\"btwelftreedf@shop-pro.jp\"}\n{\"id\":485,\"first_name\":\"Brianne\",\"last_name\":\"Johannes\",\"email\":\"bjohannesdg@nymag.com\"}\n{\"id\":486,\"first_name\":\"Felicle\",\"last_name\":\"MacRury\",\"email\":\"fmacrurydh@jiathis.com\"}\n{\"id\":487,\"first_name\":\"Salli\",\"last_name\":\"Chillingworth\",\"email\":\"schillingworthdi@gravatar.com\"}\n{\"id\":488,\"first_name\":\"Merline\",\"last_name\":\"Bodd\",\"email\":\"mbodddj@yandex.ru\"}\n{\"id\":489,\"first_name\":\"Christian\",\"last_name\":\"Pengelley\",\"email\":\"cpengelleydk@flavors.me\"}\n{\"id\":490,\"first_name\":\"Dallas\",\"last_name\":\"Sollowaye\",\"email\":\"dsollowayedl@zdnet.com\"}\n{\"id\":491,\"first_name\":\"Matias\",\"last_name\":\"Austen\",\"email\":\"maustendm@ow.ly\"}\n{\"id\":492,\"first_name\":\"Carney\",\"last_name\":\"Bergin\",\"email\":\"cbergindn@tiny.cc\"}\n{\"id\":493,\"first_name\":\"Carol\",\"last_name\":\"Vannikov\",\"email\":\"cvannikovdo@oakley.com\"}\n{\"id\":494,\"first_name\":\"Gail\",\"last_name\":\"Garwood\",\"email\":\"ggarwooddp@cbc.ca\"}\n{\"id\":495,\"first_name\":\"Adela\",\"last_name\":\"Baddam\",\"email\":\"abaddamdq@blinklist.com\"}\n{\"id\":496,\"first_name\":\"Bogey
\",\"last_name\":\"Tomala\",\"email\":\"btomaladr@samsung.com\"}\n{\"id\":497,\"first_name\":\"Annabel\",\"last_name\":\"Pinsent\",\"email\":\"apinsentds@yelp.com\"}\n{\"id\":498,\"first_name\":\"Marijn\",\"last_name\":\"Trevarthen\",\"email\":\"mtrevarthendt@google.ca\"}\n{\"id\":499,\"first_name\":\"Arabelle\",\"last_name\":\"Corneliussen\",\"email\":\"acorneliussendu@census.gov\"}\n{\"id\":500,\"first_name\":\"Ritchie\",\"last_name\":\"Rosenblum\",\"email\":\"rrosenblumdv@i2i.jp\"}\n{\"id\":501,\"first_name\":\"Marji\",\"last_name\":\"Clarage\",\"email\":\"mclaragedw@gravatar.com\"}\n{\"id\":502,\"first_name\":\"Yolanthe\",\"last_name\":\"Doddemeade\",\"email\":\"ydoddemeadedx@cpanel.net\"}\n{\"id\":503,\"first_name\":\"Larine\",\"last_name\":\"Rodd\",\"email\":\"lrodddy@ft.com\"}\n{\"id\":504,\"first_name\":\"Adolpho\",\"last_name\":\"Bleasdale\",\"email\":\"ableasdaledz@unicef.org\"}\n{\"id\":505,\"first_name\":\"Rriocard\",\"last_name\":\"Roggeman\",\"email\":\"rroggemane0@salon.com\"}\n{\"id\":506,\"first_name\":\"Sissie\",\"last_name\":\"Ephgrave\",\"email\":\"sephgravee1@wikipedia.org\"}\n{\"id\":507,\"first_name\":\"Shepherd\",\"last_name\":\"Davidde\",\"email\":\"sdaviddee2@wikimedia.org\"}\n{\"id\":508,\"first_name\":\"Cecilla\",\"last_name\":\"Girt\",\"email\":\"cgirte3@soundcloud.com\"}\n{\"id\":509,\"first_name\":\"Alanah\",\"last_name\":\"Newtown\",\"email\":\"anewtowne4@cocolog-nifty.com\"}\n{\"id\":510,\"first_name\":\"Marvin\",\"last_name\":\"Duckhouse\",\"email\":\"mduckhousee5@icq.com\"}\n{\"id\":511,\"first_name\":\"Esme\",\"last_name\":\"Crouch\",\"email\":\"ecrouche6@foxnews.com\"}\n{\"id\":512,\"first_name\":\"Cthrine\",\"last_name\":\"Yelding\",\"email\":\"cyeldinge7@stumbleupon.com\"}\n{\"id\":513,\"first_name\":\"Nolly\",\"last_name\":\"Gude\",\"email\":\"ngudee8@bing.com\"}\n{\"id\":514,\"first_name\":\"Kimball\",\"last_name\":\"O' 
Mulderrig\",\"email\":\"komulderrige9@newyorker.com\"}\n{\"id\":515,\"first_name\":\"Felicdad\",\"last_name\":\"Mutlow\",\"email\":\"fmutlowea@wunderground.com\"}\n{\"id\":516,\"first_name\":\"Sybila\",\"last_name\":\"Kirke\",\"email\":\"skirkeeb@w3.org\"}\n{\"id\":517,\"first_name\":\"Aubrey\",\"last_name\":\"Horney\",\"email\":\"ahorneyec@1688.com\"}\n{\"id\":518,\"first_name\":\"Prudy\",\"last_name\":\"Hartles\",\"email\":\"phartlesed@scribd.com\"}\n{\"id\":519,\"first_name\":\"Saw\",\"last_name\":\"Olyunin\",\"email\":\"solyuninee@google.it\"}\n{\"id\":520,\"first_name\":\"Pearline\",\"last_name\":\"Fasham\",\"email\":\"pfashamef@chron.com\"}\n{\"id\":521,\"first_name\":\"Gretta\",\"last_name\":\"Vasilmanov\",\"email\":\"gvasilmanoveg@godaddy.com\"}\n{\"id\":522,\"first_name\":\"Pearle\",\"last_name\":\"Scougal\",\"email\":\"pscougaleh@si.edu\"}\n{\"id\":523,\"first_name\":\"Marney\",\"last_name\":\"Mariotte\",\"email\":\"mmariotteei@newsvine.com\"}\n{\"id\":524,\"first_name\":\"Bamby\",\"last_name\":\"Lockhart\",\"email\":\"blockhartej@mayoclinic.com\"}\n{\"id\":525,\"first_name\":\"Miguela\",\"last_name\":\"Baumadier\",\"email\":\"mbaumadierek@go.com\"}\n{\"id\":526,\"first_name\":\"Lusa\",\"last_name\":\"Bartalin\",\"email\":\"lbartalinel@fc2.com\"}\n{\"id\":527,\"first_name\":\"Helene\",\"last_name\":\"Parlott\",\"email\":\"hparlottem@wikimedia.org\"}\n{\"id\":528,\"first_name\":\"Manuel\",\"last_name\":\"Scallon\",\"email\":\"mscallonen@apple.com\"}\n{\"id\":529,\"first_name\":\"Wendie\",\"last_name\":\"O' 
Scallan\",\"email\":\"woscallaneo@aboutads.info\"}\n{\"id\":530,\"first_name\":\"Benito\",\"last_name\":\"Kerry\",\"email\":\"bkerryep@cmu.edu\"}\n{\"id\":531,\"first_name\":\"Dorthy\",\"last_name\":\"Skinner\",\"email\":\"dskinnereq@ustream.tv\"}\n{\"id\":532,\"first_name\":\"Demetris\",\"last_name\":\"Picton\",\"email\":\"dpictoner@typepad.com\"}\n{\"id\":533,\"first_name\":\"Barrie\",\"last_name\":\"Hurkett\",\"email\":\"bhurkettes@mlb.com\"}\n{\"id\":534,\"first_name\":\"Lucretia\",\"last_name\":\"Wherry\",\"email\":\"lwherryet@amazon.com\"}\n{\"id\":535,\"first_name\":\"Molly\",\"last_name\":\"Castagna\",\"email\":\"mcastagnaeu@google.cn\"}\n{\"id\":536,\"first_name\":\"Waylin\",\"last_name\":\"Chieco\",\"email\":\"wchiecoev@youtube.com\"}\n{\"id\":537,\"first_name\":\"Bobinette\",\"last_name\":\"Headingham\",\"email\":\"bheadinghamew@intel.com\"}\n{\"id\":538,\"first_name\":\"Malinda\",\"last_name\":\"Gerardi\",\"email\":\"mgerardiex@samsung.com\"}\n{\"id\":539,\"first_name\":\"Rhodia\",\"last_name\":\"Kenderdine\",\"email\":\"rkenderdineey@arstechnica.com\"}\n{\"id\":540,\"first_name\":\"Saundra\",\"last_name\":\"Brader\",\"email\":\"sbraderez@msu.edu\"}\n{\"id\":541,\"first_name\":\"Mariana\",\"last_name\":\"Buye\",\"email\":\"mbuyef0@wikimedia.org\"}\n{\"id\":542,\"first_name\":\"Hagan\",\"last_name\":\"Stoop\",\"email\":\"hstoopf1@tripadvisor.com\"}\n{\"id\":543,\"first_name\":\"Marlene\",\"last_name\":\"Keane\",\"email\":\"mkeanef2@google.cn\"}\n{\"id\":544,\"first_name\":\"Sayer\",\"last_name\":\"Eggar\",\"email\":\"seggarf3@unicef.org\"}\n{\"id\":545,\"first_name\":\"Eve\",\"last_name\":\"Allanson\",\"email\":\"eallansonf4@china.com.cn\"}\n{\"id\":546,\"first_name\":\"Arielle\",\"last_name\":\"Kytter\",\"email\":\"akytterf5@spotify.com\"}\n{\"id\":547,\"first_name\":\"Deerdre\",\"last_name\":\"Cabbell\",\"email\":\"dcabbellf6@youtube.com\"}\n{\"id\":548,\"first_name\":\"Nathan\",\"last_name\":\"Cromleholme\",\"email\":\"ncromleholmef7@fema.gov\"}\n{\"id
\":549,\"first_name\":\"Brion\",\"last_name\":\"Recher\",\"email\":\"brecherf8@usgs.gov\"}\n{\"id\":550,\"first_name\":\"Tiena\",\"last_name\":\"Grealish\",\"email\":\"tgrealishf9@mail.ru\"}\n{\"id\":551,\"first_name\":\"Dane\",\"last_name\":\"Durtnal\",\"email\":\"ddurtnalfa@latimes.com\"}\n{\"id\":552,\"first_name\":\"Bealle\",\"last_name\":\"Lesurf\",\"email\":\"blesurffb@vinaora.com\"}\n{\"id\":553,\"first_name\":\"Evangelina\",\"last_name\":\"Lawrie\",\"email\":\"elawriefc@ameblo.jp\"}\n{\"id\":554,\"first_name\":\"Svend\",\"last_name\":\"Leel\",\"email\":\"sleelfd@ucoz.ru\"}\n{\"id\":555,\"first_name\":\"Cristen\",\"last_name\":\"Klimkov\",\"email\":\"cklimkovfe@about.me\"}\n{\"id\":556,\"first_name\":\"Devon\",\"last_name\":\"Lanchbury\",\"email\":\"dlanchburyff@umn.edu\"}\n{\"id\":557,\"first_name\":\"Rem\",\"last_name\":\"Cordes\",\"email\":\"rcordesfg@yolasite.com\"}\n{\"id\":558,\"first_name\":\"Romy\",\"last_name\":\"Mattiazzi\",\"email\":\"rmattiazzifh@craigslist.org\"}\n{\"id\":559,\"first_name\":\"Wit\",\"last_name\":\"Attenborrow\",\"email\":\"wattenborrowfi@auda.org.au\"}\n{\"id\":560,\"first_name\":\"Nanni\",\"last_name\":\"Studders\",\"email\":\"nstuddersfj@yandex.ru\"}\n{\"id\":561,\"first_name\":\"Mandie\",\"last_name\":\"Trembley\",\"email\":\"mtrembleyfk@gizmodo.com\"}\n{\"id\":562,\"first_name\":\"Babette\",\"last_name\":\"Clemmens\",\"email\":\"bclemmensfl@acquirethisname.com\"}\n{\"id\":563,\"first_name\":\"Ban\",\"last_name\":\"Bennion\",\"email\":\"bbennionfm@github.com\"}\n{\"id\":564,\"first_name\":\"Gail\",\"last_name\":\"Trevon\",\"email\":\"gtrevonfn@upenn.edu\"}\n{\"id\":565,\"first_name\":\"Karine\",\"last_name\":\"Alexandrescu\",\"email\":\"kalexandrescufo@globo.com\"}\n{\"id\":566,\"first_name\":\"Sabrina\",\"last_name\":\"Klaes\",\"email\":\"sklaesfp@example.com\"}\n{\"id\":567,\"first_name\":\"Adriane\",\"last_name\":\"Figgins\",\"email\":\"afigginsfq@cafepress.com\"}\n{\"id\":568,\"first_name\":\"Terese\",\"last_name\":\"Goldn
ey\",\"email\":\"tgoldneyfr@purevolume.com\"}\n{\"id\":569,\"first_name\":\"Ashlie\",\"last_name\":\"Bowling\",\"email\":\"abowlingfs@dion.ne.jp\"}\n{\"id\":570,\"first_name\":\"Rivi\",\"last_name\":\"Laurenz\",\"email\":\"rlaurenzft@php.net\"}\n{\"id\":571,\"first_name\":\"Phillip\",\"last_name\":\"Longstaffe\",\"email\":\"plongstaffefu@goo.gl\"}\n{\"id\":572,\"first_name\":\"Adrian\",\"last_name\":\"Jewes\",\"email\":\"ajewesfv@elpais.com\"}\n{\"id\":573,\"first_name\":\"Muriel\",\"last_name\":\"Ladlow\",\"email\":\"mladlowfw@vkontakte.ru\"}\n{\"id\":574,\"first_name\":\"Ange\",\"last_name\":\"Habishaw\",\"email\":\"ahabishawfx@walmart.com\"}\n{\"id\":575,\"first_name\":\"Kennith\",\"last_name\":\"Olive\",\"email\":\"kolivefy@tamu.edu\"}\n{\"id\":576,\"first_name\":\"Vonnie\",\"last_name\":\"Eastbrook\",\"email\":\"veastbrookfz@merriam-webster.com\"}\n{\"id\":577,\"first_name\":\"Doralyn\",\"last_name\":\"Scarbarrow\",\"email\":\"dscarbarrowg0@gnu.org\"}\n{\"id\":578,\"first_name\":\"Adams\",\"last_name\":\"Sharpley\",\"email\":\"asharpleyg1@blogger.com\"}\n{\"id\":579,\"first_name\":\"Gilberte\",\"last_name\":\"Camamile\",\"email\":\"gcamamileg2@dailymail.co.uk\"}\n{\"id\":580,\"first_name\":\"Emylee\",\"last_name\":\"Siley\",\"email\":\"esileyg3@friendfeed.com\"}\n{\"id\":581,\"first_name\":\"Hanan\",\"last_name\":\"Falvey\",\"email\":\"hfalveyg4@springer.com\"}\n{\"id\":582,\"first_name\":\"Nerty\",\"last_name\":\"Marqyes\",\"email\":\"nmarqyesg5@thetimes.co.uk\"}\n{\"id\":583,\"first_name\":\"Neysa\",\"last_name\":\"Mossdale\",\"email\":\"nmossdaleg6@de.vu\"}\n{\"id\":584,\"first_name\":\"Allayne\",\"last_name\":\"Crookall\",\"email\":\"acrookallg7@amazon.de\"}\n{\"id\":585,\"first_name\":\"Ally\",\"last_name\":\"Lesly\",\"email\":\"aleslyg8@howstuffworks.com\"}\n{\"id\":586,\"first_name\":\"Saraann\",\"last_name\":\"Rosson\",\"email\":\"srossong9@e-recht24.de\"}\n{\"id\":587,\"first_name\":\"Marissa\",\"last_name\":\"Garn\",\"email\":\"mgarnga@360.cn\"}\n{\"i
d\":588,\"first_name\":\"Rusty\",\"last_name\":\"Jenteau\",\"email\":\"rjenteaugb@spiegel.de\"}\n{\"id\":589,\"first_name\":\"Daria\",\"last_name\":\"Danilchenko\",\"email\":\"ddanilchenkogc@networkadvertising.org\"}\n{\"id\":590,\"first_name\":\"Emmi\",\"last_name\":\"Duny\",\"email\":\"edunygd@dedecms.com\"}\n{\"id\":591,\"first_name\":\"Felice\",\"last_name\":\"Manser\",\"email\":\"fmanserge@bizjournals.com\"}\n{\"id\":592,\"first_name\":\"Domeniga\",\"last_name\":\"Garrand\",\"email\":\"dgarrandgf@goo.ne.jp\"}\n{\"id\":593,\"first_name\":\"Tani\",\"last_name\":\"Bampton\",\"email\":\"tbamptongg@huffingtonpost.com\"}\n{\"id\":594,\"first_name\":\"Paige\",\"last_name\":\"Holdren\",\"email\":\"pholdrengh@hugedomains.com\"}\n{\"id\":595,\"first_name\":\"Junia\",\"last_name\":\"Stoppe\",\"email\":\"jstoppegi@slideshare.net\"}\n{\"id\":596,\"first_name\":\"Krista\",\"last_name\":\"Hardwidge\",\"email\":\"khardwidgegj@people.com.cn\"}\n{\"id\":597,\"first_name\":\"Theodosia\",\"last_name\":\"Bleddon\",\"email\":\"tbleddongk@geocities.jp\"}\n{\"id\":598,\"first_name\":\"Esmaria\",\"last_name\":\"Gomez\",\"email\":\"egomezgl@slate.com\"}\n{\"id\":599,\"first_name\":\"Karisa\",\"last_name\":\"Dearnley\",\"email\":\"kdearnleygm@nyu.edu\"}\n{\"id\":600,\"first_name\":\"Dud\",\"last_name\":\"Cuddon\",\"email\":\"dcuddongn@youku.com\"}\n{\"id\":601,\"first_name\":\"Daile\",\"last_name\":\"Mylechreest\",\"email\":\"dmylechreestgo@ox.ac.uk\"}\n{\"id\":602,\"first_name\":\"Josselyn\",\"last_name\":\"Chaplyn\",\"email\":\"jchaplyngp@shop-pro.jp\"}\n{\"id\":603,\"first_name\":\"Klarrisa\",\"last_name\":\"Balnave\",\"email\":\"kbalnavegq@microsoft.com\"}\n{\"id\":604,\"first_name\":\"Nolie\",\"last_name\":\"Petters\",\"email\":\"npettersgr@opera.com\"}\n{\"id\":605,\"first_name\":\"Bruno\",\"last_name\":\"Vautin\",\"email\":\"bvautings@earthlink.net\"}\n{\"id\":606,\"first_name\":\"Ki\",\"last_name\":\"Stucke\",\"email\":\"kstuckegt@newyorker.com\"}\n{\"id\":607,\"first_name\":\"Je
rad\",\"last_name\":\"MacGettigen\",\"email\":\"jmacgettigengu@nyu.edu\"}\n{\"id\":608,\"first_name\":\"Waverly\",\"last_name\":\"Gwyer\",\"email\":\"wgwyergv@sogou.com\"}\n{\"id\":609,\"first_name\":\"Norene\",\"last_name\":\"Detloff\",\"email\":\"ndetloffgw@istockphoto.com\"}\n{\"id\":610,\"first_name\":\"Alfonse\",\"last_name\":\"Mont\",\"email\":\"amontgx@mozilla.org\"}\n{\"id\":611,\"first_name\":\"Shepperd\",\"last_name\":\"Duffy\",\"email\":\"sduffygy@jigsy.com\"}\n{\"id\":612,\"first_name\":\"Petronille\",\"last_name\":\"Doughty\",\"email\":\"pdoughtygz@prweb.com\"}\n{\"id\":613,\"first_name\":\"Northrop\",\"last_name\":\"Stent\",\"email\":\"nstenth0@nytimes.com\"}\n{\"id\":614,\"first_name\":\"Marline\",\"last_name\":\"Ferrar\",\"email\":\"mferrarh1@utexas.edu\"}\n{\"id\":615,\"first_name\":\"Brenda\",\"last_name\":\"Sancho\",\"email\":\"bsanchoh2@com.com\"}\n{\"id\":616,\"first_name\":\"Leeanne\",\"last_name\":\"Candlish\",\"email\":\"lcandlishh3@merriam-webster.com\"}\n{\"id\":617,\"first_name\":\"Brianna\",\"last_name\":\"Starford\",\"email\":\"bstarfordh4@theguardian.com\"}\n{\"id\":618,\"first_name\":\"Leroi\",\"last_name\":\"Smissen\",\"email\":\"lsmissenh5@loc.gov\"}\n{\"id\":619,\"first_name\":\"Nev\",\"last_name\":\"Belvin\",\"email\":\"nbelvinh6@dedecms.com\"}\n{\"id\":620,\"first_name\":\"Kellen\",\"last_name\":\"Strowlger\",\"email\":\"kstrowlgerh7@about.com\"}\n{\"id\":621,\"first_name\":\"Ninette\",\"last_name\":\"Kerton\",\"email\":\"nkertonh8@google.com.au\"}\n{\"id\":622,\"first_name\":\"Ruby\",\"last_name\":\"Klewer\",\"email\":\"rklewerh9@engadget.com\"}\n{\"id\":623,\"first_name\":\"Berte\",\"last_name\":\"Joynes\",\"email\":\"bjoynesha@icq.com\"}\n{\"id\":624,\"first_name\":\"Jacob\",\"last_name\":\"Houseago\",\"email\":\"jhouseagohb@ocn.ne.jp\"}\n{\"id\":625,\"first_name\":\"Wandis\",\"last_name\":\"Souster\",\"email\":\"wsousterhc@ask.com\"}\n{\"id\":626,\"first_name\":\"Kelila\",\"last_name\":\"Guillon\",\"email\":\"kguillonhd@vistap
rint.com\"}\n{\"id\":627,\"first_name\":\"Gretchen\",\"last_name\":\"Mellsop\",\"email\":\"gmellsophe@nationalgeographic.com\"}\n{\"id\":628,\"first_name\":\"Agnese\",\"last_name\":\"Rider\",\"email\":\"ariderhf@bloglovin.com\"}\n{\"id\":629,\"first_name\":\"Sabina\",\"last_name\":\"Spilsburie\",\"email\":\"sspilsburiehg@who.int\"}\n{\"id\":630,\"first_name\":\"Aubrie\",\"last_name\":\"Patey\",\"email\":\"apateyhh@linkedin.com\"}\n{\"id\":631,\"first_name\":\"Wolfie\",\"last_name\":\"Sommerville\",\"email\":\"wsommervillehi@domainmarket.com\"}\n{\"id\":632,\"first_name\":\"Anderea\",\"last_name\":\"Haversum\",\"email\":\"ahaversumhj@pbs.org\"}\n{\"id\":633,\"first_name\":\"Alessandro\",\"last_name\":\"Giovani\",\"email\":\"agiovanihk@cbslocal.com\"}\n{\"id\":634,\"first_name\":\"Arlette\",\"last_name\":\"Dulwich\",\"email\":\"adulwichhl@google.nl\"}\n{\"id\":635,\"first_name\":\"Reamonn\",\"last_name\":\"Flintiff\",\"email\":\"rflintiffhm@wikia.com\"}\n{\"id\":636,\"first_name\":\"Lowe\",\"last_name\":\"Redding\",\"email\":\"lreddinghn@naver.com\"}\n{\"id\":637,\"first_name\":\"Dannel\",\"last_name\":\"Lloyds\",\"email\":\"dlloydsho@odnoklassniki.ru\"}\n{\"id\":638,\"first_name\":\"Gasparo\",\"last_name\":\"Curtoys\",\"email\":\"gcurtoyshp@tuttocitta.it\"}\n{\"id\":639,\"first_name\":\"Larry\",\"last_name\":\"Shatliff\",\"email\":\"lshatliffhq@mayoclinic.com\"}\n{\"id\":640,\"first_name\":\"Faina\",\"last_name\":\"Dauby\",\"email\":\"fdaubyhr@washingtonpost.com\"}\n{\"id\":641,\"first_name\":\"Kara-lynn\",\"last_name\":\"Prise\",\"email\":\"kprisehs@live.com\"}\n{\"id\":642,\"first_name\":\"Garry\",\"last_name\":\"Patinkin\",\"email\":\"gpatinkinht@goo.gl\"}\n{\"id\":643,\"first_name\":\"Bryan\",\"last_name\":\"Eyrl\",\"email\":\"beyrlhu@indiegogo.com\"}\n{\"id\":644,\"first_name\":\"Dominick\",\"last_name\":\"Goracci\",\"email\":\"dgoraccihv@cdbaby.com\"}\n{\"id\":645,\"first_name\":\"Sabine\",\"last_name\":\"Dami\",\"email\":\"sdamihw@php.net\"}\n{\"id\":646,\"fir
st_name\":\"Simeon\",\"last_name\":\"Czajka\",\"email\":\"sczajkahx@auda.org.au\"}\n{\"id\":647,\"first_name\":\"Bertrando\",\"last_name\":\"Ostler\",\"email\":\"bostlerhy@networksolutions.com\"}\n{\"id\":648,\"first_name\":\"Guillaume\",\"last_name\":\"Halahan\",\"email\":\"ghalahanhz@weibo.com\"}\n{\"id\":649,\"first_name\":\"Artus\",\"last_name\":\"Shotton\",\"email\":\"ashottoni0@goodreads.com\"}\n{\"id\":650,\"first_name\":\"Thurston\",\"last_name\":\"Privett\",\"email\":\"tprivetti1@goo.ne.jp\"}\n{\"id\":651,\"first_name\":\"Dagny\",\"last_name\":\"Handford\",\"email\":\"dhandfordi2@homestead.com\"}\n{\"id\":652,\"first_name\":\"Bathsheba\",\"last_name\":\"Pordall\",\"email\":\"bpordalli3@howstuffworks.com\"}\n{\"id\":653,\"first_name\":\"Gwynne\",\"last_name\":\"Vallens\",\"email\":\"gvallensi4@feedburner.com\"}\n{\"id\":654,\"first_name\":\"Faun\",\"last_name\":\"McMurrugh\",\"email\":\"fmcmurrughi5@merriam-webster.com\"}\n{\"id\":655,\"first_name\":\"Kerry\",\"last_name\":\"Gooding\",\"email\":\"kgoodingi6@macromedia.com\"}\n{\"id\":656,\"first_name\":\"Sol\",\"last_name\":\"Baskerfield\",\"email\":\"sbaskerfieldi7@amazon.co.uk\"}\n{\"id\":657,\"first_name\":\"Belvia\",\"last_name\":\"Risebrow\",\"email\":\"brisebrowi8@netlog.com\"}\n{\"id\":658,\"first_name\":\"Kelila\",\"last_name\":\"Stocken\",\"email\":\"kstockeni9@xinhuanet.com\"}\n{\"id\":659,\"first_name\":\"Raff\",\"last_name\":\"Kelland\",\"email\":\"rkellandia@technorati.com\"}\n{\"id\":660,\"first_name\":\"Sophi\",\"last_name\":\"Bengough\",\"email\":\"sbengoughib@booking.com\"}\n{\"id\":661,\"first_name\":\"Annmarie\",\"last_name\":\"Ivins\",\"email\":\"aivinsic@wikispaces.com\"}\n{\"id\":662,\"first_name\":\"Nettie\",\"last_name\":\"Camings\",\"email\":\"ncamingsid@amazon.co.jp\"}\n{\"id\":663,\"first_name\":\"Ted\",\"last_name\":\"Alcoran\",\"email\":\"talcoranie@hhs.gov\"}\n{\"id\":664,\"first_name\":\"Tim\",\"last_name\":\"Murden\",\"email\":\"tmurdenif@jugem.jp\"}\n{\"id\":665,\"first_name\
":\"Latrina\",\"last_name\":\"Baines\",\"email\":\"lbainesig@sphinn.com\"}\n{\"id\":666,\"first_name\":\"Sanders\",\"last_name\":\"Crampsey\",\"email\":\"scrampseyih@cocolog-nifty.com\"}\n{\"id\":667,\"first_name\":\"Laurena\",\"last_name\":\"Bristowe\",\"email\":\"lbristoweii@sciencedaily.com\"}\n{\"id\":668,\"first_name\":\"Brose\",\"last_name\":\"Blanchet\",\"email\":\"bblanchetij@archive.org\"}\n{\"id\":669,\"first_name\":\"Jacintha\",\"last_name\":\"Kimmel\",\"email\":\"jkimmelik@psu.edu\"}\n{\"id\":670,\"first_name\":\"Nat\",\"last_name\":\"Hast\",\"email\":\"nhastil@ifeng.com\"}\n{\"id\":671,\"first_name\":\"Ealasaid\",\"last_name\":\"MacHoste\",\"email\":\"emachosteim@nps.gov\"}\n{\"id\":672,\"first_name\":\"Merralee\",\"last_name\":\"Phippen\",\"email\":\"mphippenin@usgs.gov\"}\n{\"id\":673,\"first_name\":\"Donella\",\"last_name\":\"Sanzio\",\"email\":\"dsanzioio@theatlantic.com\"}\n{\"id\":674,\"first_name\":\"Giorgi\",\"last_name\":\"Chaff\",\"email\":\"gchaffip@taobao.com\"}\n{\"id\":675,\"first_name\":\"Bennie\",\"last_name\":\"Smallsman\",\"email\":\"bsmallsmaniq@webmd.com\"}\n{\"id\":676,\"first_name\":\"Georgie\",\"last_name\":\"Crole\",\"email\":\"gcroleir@dot.gov\"}\n{\"id\":677,\"first_name\":\"Petra\",\"last_name\":\"Chappelow\",\"email\":\"pchappelowis@tmall.com\"}\n{\"id\":678,\"first_name\":\"Dalton\",\"last_name\":\"Wewell\",\"email\":\"dwewellit@tiny.cc\"}\n{\"id\":679,\"first_name\":\"Kinnie\",\"last_name\":\"Guilaem\",\"email\":\"kguilaemiu@ezinearticles.com\"}\n{\"id\":680,\"first_name\":\"Merci\",\"last_name\":\"Doyle\",\"email\":\"mdoyleiv@boston.com\"}\n{\"id\":681,\"first_name\":\"Constance\",\"last_name\":\"Tilson\",\"email\":\"ctilsoniw@usda.gov\"}\n{\"id\":682,\"first_name\":\"Paige\",\"last_name\":\"Sygroves\",\"email\":\"psygrovesix@sfgate.com\"}\n{\"id\":683,\"first_name\":\"Rutherford\",\"last_name\":\"Ughi\",\"email\":\"rughiiy@webeden.co.uk\"}\n{\"id\":684,\"first_name\":\"Jamil\",\"last_name\":\"Crighton\",\"email\":\"jcrigh
toniz@china.com.cn\"}\n{\"id\":685,\"first_name\":\"Gaspard\",\"last_name\":\"Lockner\",\"email\":\"glocknerj0@hao123.com\"}\n{\"id\":686,\"first_name\":\"Sindee\",\"last_name\":\"Beade\",\"email\":\"sbeadej1@sbwire.com\"}\n{\"id\":687,\"first_name\":\"Irina\",\"last_name\":\"Perren\",\"email\":\"iperrenj2@oracle.com\"}\n{\"id\":688,\"first_name\":\"Annis\",\"last_name\":\"Asker\",\"email\":\"aaskerj3@whitehouse.gov\"}\n{\"id\":689,\"first_name\":\"Ingram\",\"last_name\":\"MacGiany\",\"email\":\"imacgianyj4@blogspot.com\"}\n{\"id\":690,\"first_name\":\"Germaine\",\"last_name\":\"Maltby\",\"email\":\"gmaltbyj5@usda.gov\"}\n{\"id\":691,\"first_name\":\"Berkly\",\"last_name\":\"Prazor\",\"email\":\"bprazorj6@flickr.com\"}\n{\"id\":692,\"first_name\":\"Ferguson\",\"last_name\":\"Kyffin\",\"email\":\"fkyffinj7@goodreads.com\"}\n{\"id\":693,\"first_name\":\"Winonah\",\"last_name\":\"Furze\",\"email\":\"wfurzej8@alexa.com\"}\n{\"id\":694,\"first_name\":\"Merwin\",\"last_name\":\"Ionnidis\",\"email\":\"mionnidisj9@slashdot.org\"}\n{\"id\":695,\"first_name\":\"Jerrilee\",\"last_name\":\"Speerman\",\"email\":\"jspeermanja@tumblr.com\"}\n{\"id\":696,\"first_name\":\"Leslie\",\"last_name\":\"Mulvy\",\"email\":\"lmulvyjb@bravesites.com\"}\n{\"id\":697,\"first_name\":\"Eryn\",\"last_name\":\"Stoffel\",\"email\":\"estoffeljc@infoseek.co.jp\"}\n{\"id\":698,\"first_name\":\"Ladonna\",\"last_name\":\"Bosward\",\"email\":\"lboswardjd@php.net\"}\n{\"id\":699,\"first_name\":\"Giustino\",\"last_name\":\"Killelea\",\"email\":\"gkilleleaje@statcounter.com\"}\n{\"id\":700,\"first_name\":\"Dillie\",\"last_name\":\"Angell\",\"email\":\"dangelljf@weather.com\"}\n{\"id\":701,\"first_name\":\"Henri\",\"last_name\":\"Arnison\",\"email\":\"harnisonjg@bluehost.com\"}\n{\"id\":702,\"first_name\":\"Fina\",\"last_name\":\"Joules\",\"email\":\"fjoulesjh@webmd.com\"}\n{\"id\":703,\"first_name\":\"Elden\",\"last_name\":\"Shortan\",\"email\":\"eshortanji@github.com\"}\n{\"id\":704,\"first_name\":\"Vevay\"
,\"last_name\":\"Imison\",\"email\":\"vimisonjj@slideshare.net\"}\n{\"id\":705,\"first_name\":\"Marcellina\",\"last_name\":\"Jagg\",\"email\":\"mjaggjk@dmoz.org\"}\n{\"id\":706,\"first_name\":\"Evin\",\"last_name\":\"Lamacraft\",\"email\":\"elamacraftjl@dagondesign.com\"}\n{\"id\":707,\"first_name\":\"Marcel\",\"last_name\":\"Edy\",\"email\":\"medyjm@aboutads.info\"}\n{\"id\":708,\"first_name\":\"Orlan\",\"last_name\":\"Drei\",\"email\":\"odreijn@oaic.gov.au\"}\n{\"id\":709,\"first_name\":\"Egbert\",\"last_name\":\"Shillington\",\"email\":\"eshillingtonjo@simplemachines.org\"}\n{\"id\":710,\"first_name\":\"Opal\",\"last_name\":\"Oldnall\",\"email\":\"ooldnalljp@indiatimes.com\"}\n{\"id\":711,\"first_name\":\"Brianne\",\"last_name\":\"Penticost\",\"email\":\"bpenticostjq@webmd.com\"}\n{\"id\":712,\"first_name\":\"Pip\",\"last_name\":\"Blaney\",\"email\":\"pblaneyjr@weather.com\"}\n{\"id\":713,\"first_name\":\"Michaella\",\"last_name\":\"Goldsberry\",\"email\":\"mgoldsberryjs@dell.com\"}\n{\"id\":714,\"first_name\":\"Jerrilee\",\"last_name\":\"Paridge\",\"email\":\"jparidgejt@wikipedia.org\"}\n{\"id\":715,\"first_name\":\"Joelly\",\"last_name\":\"Knightley\",\"email\":\"jknightleyju@joomla.org\"}\n{\"id\":716,\"first_name\":\"Eloisa\",\"last_name\":\"Lee\",\"email\":\"eleejv@abc.net.au\"}\n{\"id\":717,\"first_name\":\"Andee\",\"last_name\":\"Boscott\",\"email\":\"aboscottjw@angelfire.com\"}\n{\"id\":718,\"first_name\":\"Menard\",\"last_name\":\"Bazley\",\"email\":\"mbazleyjx@si.edu\"}\n{\"id\":719,\"first_name\":\"Hadria\",\"last_name\":\"MacDonough\",\"email\":\"hmacdonoughjy@blogspot.com\"}\n{\"id\":720,\"first_name\":\"Demetrius\",\"last_name\":\"Ghelardi\",\"email\":\"dghelardijz@4shared.com\"}\n{\"id\":721,\"first_name\":\"Ingra\",\"last_name\":\"Boshard\",\"email\":\"iboshardk0@forbes.com\"}\n{\"id\":722,\"first_name\":\"Shelley\",\"last_name\":\"Cradoc\",\"email\":\"scradock1@google.nl\"}\n{\"id\":723,\"first_name\":\"Bertrando\",\"last_name\":\"Wurst\",\"email
\":\"bwurstk2@hhs.gov\"}\n{\"id\":724,\"first_name\":\"Duky\",\"last_name\":\"Moresby\",\"email\":\"dmoresbyk3@w3.org\"}\n{\"id\":725,\"first_name\":\"Lynda\",\"last_name\":\"Matzeitis\",\"email\":\"lmatzeitisk4@bing.com\"}\n{\"id\":726,\"first_name\":\"Galvan\",\"last_name\":\"Challen\",\"email\":\"gchallenk5@nasa.gov\"}\n{\"id\":727,\"first_name\":\"Bette-ann\",\"last_name\":\"Lytlle\",\"email\":\"blytllek6@linkedin.com\"}\n{\"id\":728,\"first_name\":\"Henderson\",\"last_name\":\"Tonsley\",\"email\":\"htonsleyk7@wikipedia.org\"}\n{\"id\":729,\"first_name\":\"Daffi\",\"last_name\":\"Welch\",\"email\":\"dwelchk8@geocities.com\"}\n{\"id\":730,\"first_name\":\"Enrique\",\"last_name\":\"Emig\",\"email\":\"eemigk9@digg.com\"}\n{\"id\":731,\"first_name\":\"Darnall\",\"last_name\":\"Tupman\",\"email\":\"dtupmanka@indiegogo.com\"}\n{\"id\":732,\"first_name\":\"Vicki\",\"last_name\":\"Trayes\",\"email\":\"vtrayeskb@phpbb.com\"}\n{\"id\":733,\"first_name\":\"Quintus\",\"last_name\":\"Sancroft\",\"email\":\"qsancroftkc@ycombinator.com\"}\n{\"id\":734,\"first_name\":\"Karola\",\"last_name\":\"Mille\",\"email\":\"kmillekd@ustream.tv\"}\n{\"id\":735,\"first_name\":\"Aretha\",\"last_name\":\"Callum\",\"email\":\"acallumke@washingtonpost.com\"}\n{\"id\":736,\"first_name\":\"Karisa\",\"last_name\":\"Stainer\",\"email\":\"kstainerkf@nsw.gov.au\"}\n{\"id\":737,\"first_name\":\"Carine\",\"last_name\":\"Goom\",\"email\":\"cgoomkg@whitehouse.gov\"}\n{\"id\":738,\"first_name\":\"Town\",\"last_name\":\"Hannan\",\"email\":\"thannankh@harvard.edu\"}\n{\"id\":739,\"first_name\":\"Micheal\",\"last_name\":\"Arnaudin\",\"email\":\"marnaudinki@theatlantic.com\"}\n{\"id\":740,\"first_name\":\"Shaun\",\"last_name\":\"Prendergrass\",\"email\":\"sprendergrasskj@mapquest.com\"}\n{\"id\":741,\"first_name\":\"Chastity\",\"last_name\":\"Waszczyk\",\"email\":\"cwaszczykkk@gravatar.com\"}\n{\"id\":742,\"first_name\":\"Christy\",\"last_name\":\"Northey\",\"email\":\"cnortheykl@nymag.com\"}\n{\"id\":743,\"f
irst_name\":\"Melamie\",\"last_name\":\"Triggel\",\"email\":\"mtriggelkm@myspace.com\"}\n{\"id\":744,\"first_name\":\"Duffy\",\"last_name\":\"Albrook\",\"email\":\"dalbrookkn@oakley.com\"}\n{\"id\":745,\"first_name\":\"Viv\",\"last_name\":\"Millwall\",\"email\":\"vmillwallko@technorati.com\"}\n{\"id\":746,\"first_name\":\"Abie\",\"last_name\":\"Cacacie\",\"email\":\"acacaciekp@reference.com\"}\n{\"id\":747,\"first_name\":\"Micah\",\"last_name\":\"Howden\",\"email\":\"mhowdenkq@youtube.com\"}\n{\"id\":748,\"first_name\":\"Gerladina\",\"last_name\":\"Sheeran\",\"email\":\"gsheerankr@soundcloud.com\"}\n{\"id\":749,\"first_name\":\"Reidar\",\"last_name\":\"Withur\",\"email\":\"rwithurks@1688.com\"}\n{\"id\":750,\"first_name\":\"Killy\",\"last_name\":\"Stroulger\",\"email\":\"kstroulgerkt@webs.com\"}\n{\"id\":751,\"first_name\":\"Penelope\",\"last_name\":\"Foli\",\"email\":\"pfoliku@ucoz.ru\"}\n{\"id\":752,\"first_name\":\"Pascal\",\"last_name\":\"Blethyn\",\"email\":\"pblethynkv@people.com.cn\"}\n{\"id\":753,\"first_name\":\"Jacobo\",\"last_name\":\"Renols\",\"email\":\"jrenolskw@nhs.uk\"}\n{\"id\":754,\"first_name\":\"Donelle\",\"last_name\":\"Jarrell\",\"email\":\"djarrellkx@about.com\"}\n{\"id\":755,\"first_name\":\"Hakim\",\"last_name\":\"Pietrzyk\",\"email\":\"hpietrzykky@123-reg.co.uk\"}\n{\"id\":756,\"first_name\":\"Fania\",\"last_name\":\"Hallick\",\"email\":\"fhallickkz@state.gov\"}\n{\"id\":757,\"first_name\":\"Boote\",\"last_name\":\"Gomersal\",\"email\":\"bgomersall0@virginia.edu\"}\n{\"id\":758,\"first_name\":\"Luis\",\"last_name\":\"Valler\",\"email\":\"lvallerl1@zimbio.com\"}\n{\"id\":759,\"first_name\":\"Shana\",\"last_name\":\"Vittel\",\"email\":\"svittell2@virginia.edu\"}\n{\"id\":760,\"first_name\":\"Onofredo\",\"last_name\":\"Philliphs\",\"email\":\"ophilliphsl3@vistaprint.com\"}\n{\"id\":761,\"first_name\":\"Osmond\",\"last_name\":\"Moulson\",\"email\":\"omoulsonl4@fema.gov\"}\n{\"id\":762,\"first_name\":\"Ly\",\"last_name\":\"Greenan\",\"email\":\"
lgreenanl5@ucla.edu\"}\n{\"id\":763,\"first_name\":\"Mervin\",\"last_name\":\"Koop\",\"email\":\"mkoopl6@mediafire.com\"}\n{\"id\":764,\"first_name\":\"Ferrel\",\"last_name\":\"Redfearn\",\"email\":\"fredfearnl7@nyu.edu\"}\n{\"id\":765,\"first_name\":\"Robby\",\"last_name\":\"Huglin\",\"email\":\"rhuglinl8@nature.com\"}\n{\"id\":766,\"first_name\":\"Kendre\",\"last_name\":\"Youle\",\"email\":\"kyoulel9@domainmarket.com\"}\n{\"id\":767,\"first_name\":\"Windy\",\"last_name\":\"Rubel\",\"email\":\"wrubella@telegraph.co.uk\"}\n{\"id\":768,\"first_name\":\"Crystal\",\"last_name\":\"Carmichael\",\"email\":\"ccarmichaellb@admin.ch\"}\n{\"id\":769,\"first_name\":\"Agata\",\"last_name\":\"Penner\",\"email\":\"apennerlc@tinyurl.com\"}\n{\"id\":770,\"first_name\":\"Odey\",\"last_name\":\"Morse\",\"email\":\"omorseld@wired.com\"}\n{\"id\":771,\"first_name\":\"Siegfried\",\"last_name\":\"Glackin\",\"email\":\"sglackinle@hao123.com\"}\n{\"id\":772,\"first_name\":\"Norbie\",\"last_name\":\"Reiners\",\"email\":\"nreinerslf@cmu.edu\"}\n{\"id\":773,\"first_name\":\"Kipp\",\"last_name\":\"Lowdes\",\"email\":\"klowdeslg@privacy.gov.au\"}\n{\"id\":774,\"first_name\":\"Dyann\",\"last_name\":\"Francklyn\",\"email\":\"dfrancklynlh@google.com.br\"}\n{\"id\":775,\"first_name\":\"Gwennie\",\"last_name\":\"McGlynn\",\"email\":\"gmcglynnli@live.com\"}\n{\"id\":776,\"first_name\":\"Viviyan\",\"last_name\":\"Erdis\",\"email\":\"verdislj@clickbank.net\"}\n{\"id\":777,\"first_name\":\"Hallie\",\"last_name\":\"Sherewood\",\"email\":\"hsherewoodlk@trellian.com\"}\n{\"id\":778,\"first_name\":\"Redd\",\"last_name\":\"Stenton\",\"email\":\"rstentonll@infoseek.co.jp\"}\n{\"id\":779,\"first_name\":\"Wendall\",\"last_name\":\"Bath\",\"email\":\"wbathlm@upenn.edu\"}\n{\"id\":780,\"first_name\":\"Corilla\",\"last_name\":\"Zanetello\",\"email\":\"czanetelloln@sfgate.com\"}\n{\"id\":781,\"first_name\":\"Christye\",\"last_name\":\"Dracey\",\"email\":\"cdraceylo@qq.com\"}\n{\"id\":782,\"first_name\":\"Nester\",\
"last_name\":\"Farleigh\",\"email\":\"nfarleighlp@usgs.gov\"}\n{\"id\":783,\"first_name\":\"Langsdon\",\"last_name\":\"Haggard\",\"email\":\"lhaggardlq@reddit.com\"}\n{\"id\":784,\"first_name\":\"Chev\",\"last_name\":\"Hay\",\"email\":\"chaylr@timesonline.co.uk\"}\n{\"id\":785,\"first_name\":\"Burlie\",\"last_name\":\"Cutchee\",\"email\":\"bcutcheels@dagondesign.com\"}\n{\"id\":786,\"first_name\":\"Darya\",\"last_name\":\"Mitchinson\",\"email\":\"dmitchinsonlt@bizjournals.com\"}\n{\"id\":787,\"first_name\":\"Bibi\",\"last_name\":\"Skitral\",\"email\":\"bskitrallu@homestead.com\"}\n{\"id\":788,\"first_name\":\"Kaylee\",\"last_name\":\"Olivo\",\"email\":\"kolivolv@census.gov\"}\n{\"id\":789,\"first_name\":\"Lenore\",\"last_name\":\"Roseblade\",\"email\":\"lrosebladelw@ning.com\"}\n{\"id\":790,\"first_name\":\"Tulley\",\"last_name\":\"Gonthard\",\"email\":\"tgonthardlx@bloomberg.com\"}\n{\"id\":791,\"first_name\":\"Olav\",\"last_name\":\"Galfour\",\"email\":\"ogalfourly@icq.com\"}\n{\"id\":792,\"first_name\":\"Nicolas\",\"last_name\":\"Margarson\",\"email\":\"nmargarsonlz@free.fr\"}\n{\"id\":793,\"first_name\":\"Reine\",\"last_name\":\"Klugman\",\"email\":\"rklugmanm0@fc2.com\"}\n{\"id\":794,\"first_name\":\"Gnni\",\"last_name\":\"Grewcock\",\"email\":\"ggrewcockm1@clickbank.net\"}\n{\"id\":795,\"first_name\":\"Lorain\",\"last_name\":\"Crossby\",\"email\":\"lcrossbym2@cdc.gov\"}\n{\"id\":796,\"first_name\":\"Angil\",\"last_name\":\"Toll\",\"email\":\"atollm3@deliciousdays.com\"}\n{\"id\":797,\"first_name\":\"Georgianne\",\"last_name\":\"Piotrowski\",\"email\":\"gpiotrowskim4@goo.gl\"}\n{\"id\":798,\"first_name\":\"Sheelagh\",\"last_name\":\"Orwin\",\"email\":\"sorwinm5@xing.com\"}\n{\"id\":799,\"first_name\":\"Ingrid\",\"last_name\":\"Dallon\",\"email\":\"idallonm6@noaa.gov\"}\n{\"id\":800,\"first_name\":\"Tab\",\"last_name\":\"Thomasson\",\"email\":\"tthomassonm7@columbia.edu\"}\n{\"id\":801,\"first_name\":\"Merridie\",\"last_name\":\"Scandroot\",\"email\":\"mscandroo
tm8@wsj.com\"}\n{\"id\":802,\"first_name\":\"Morty\",\"last_name\":\"MacDunleavy\",\"email\":\"mmacdunleavym9@canalblog.com\"}\n{\"id\":803,\"first_name\":\"Lind\",\"last_name\":\"Jordanson\",\"email\":\"ljordansonma@tinypic.com\"}\n{\"id\":804,\"first_name\":\"Field\",\"last_name\":\"Iiannoni\",\"email\":\"fiiannonimb@over-blog.com\"}\n{\"id\":805,\"first_name\":\"Sammie\",\"last_name\":\"Whimper\",\"email\":\"swhimpermc@imageshack.us\"}\n{\"id\":806,\"first_name\":\"Davy\",\"last_name\":\"Darthe\",\"email\":\"ddarthemd@netvibes.com\"}\n{\"id\":807,\"first_name\":\"Salli\",\"last_name\":\"Binstead\",\"email\":\"sbinsteadme@hubpages.com\"}\n{\"id\":808,\"first_name\":\"Betty\",\"last_name\":\"Chown\",\"email\":\"bchownmf@flavors.me\"}\n{\"id\":809,\"first_name\":\"Dinnie\",\"last_name\":\"Ilyushkin\",\"email\":\"dilyushkinmg@archive.org\"}\n{\"id\":810,\"first_name\":\"Renee\",\"last_name\":\"Daymond\",\"email\":\"rdaymondmh@accuweather.com\"}\n{\"id\":811,\"first_name\":\"Eddie\",\"last_name\":\"Duley\",\"email\":\"eduleymi@lulu.com\"}\n{\"id\":812,\"first_name\":\"Izak\",\"last_name\":\"Latour\",\"email\":\"ilatourmj@narod.ru\"}\n{\"id\":813,\"first_name\":\"Maura\",\"last_name\":\"Stuckford\",\"email\":\"mstuckfordmk@nymag.com\"}\n{\"id\":814,\"first_name\":\"Sven\",\"last_name\":\"Clampin\",\"email\":\"sclampinml@163.com\"}\n{\"id\":815,\"first_name\":\"Marlon\",\"last_name\":\"Bischoff\",\"email\":\"mbischoffmm@wsj.com\"}\n{\"id\":816,\"first_name\":\"Gustave\",\"last_name\":\"Hardbattle\",\"email\":\"ghardbattlemn@mozilla.com\"}\n{\"id\":817,\"first_name\":\"Alaine\",\"last_name\":\"Dietzler\",\"email\":\"adietzlermo@timesonline.co.uk\"}\n{\"id\":818,\"first_name\":\"Alisa\",\"last_name\":\"Ghirardi\",\"email\":\"aghirardimp@blogspot.com\"}\n{\"id\":819,\"first_name\":\"Irena\",\"last_name\":\"Goskar\",\"email\":\"igoskarmq@archive.org\"}\n{\"id\":820,\"first_name\":\"Eugenius\",\"last_name\":\"Taillant\",\"email\":\"etaillantmr@linkedin.com\"}\n{\"id\":821,\"
first_name\":\"Patton\",\"last_name\":\"Garbert\",\"email\":\"pgarbertms@drupal.org\"}\n{\"id\":822,\"first_name\":\"Callie\",\"last_name\":\"Kubera\",\"email\":\"ckuberamt@ed.gov\"}\n{\"id\":823,\"first_name\":\"Carrissa\",\"last_name\":\"Duplain\",\"email\":\"cduplainmu@bing.com\"}\n{\"id\":824,\"first_name\":\"Rena\",\"last_name\":\"Thominga\",\"email\":\"rthomingamv@cocolog-nifty.com\"}\n{\"id\":825,\"first_name\":\"Adriaens\",\"last_name\":\"Lye\",\"email\":\"alyemw@wikipedia.org\"}\n{\"id\":826,\"first_name\":\"Robena\",\"last_name\":\"Tackett\",\"email\":\"rtackettmx@360.cn\"}\n{\"id\":827,\"first_name\":\"Yvon\",\"last_name\":\"Emanuele\",\"email\":\"yemanuelemy@odnoklassniki.ru\"}\n{\"id\":828,\"first_name\":\"Marcel\",\"last_name\":\"Beckinsall\",\"email\":\"mbeckinsallmz@blinklist.com\"}\n{\"id\":829,\"first_name\":\"Donaugh\",\"last_name\":\"Gaitskill\",\"email\":\"dgaitskilln0@cyberchimps.com\"}\n{\"id\":830,\"first_name\":\"Daloris\",\"last_name\":\"Leman\",\"email\":\"dlemann1@etsy.com\"}\n{\"id\":831,\"first_name\":\"Cad\",\"last_name\":\"Fermin\",\"email\":\"cferminn2@blogs.com\"}\n{\"id\":832,\"first_name\":\"Brigida\",\"last_name\":\"Hurry\",\"email\":\"bhurryn3@wunderground.com\"}\n{\"id\":833,\"first_name\":\"Carlene\",\"last_name\":\"Duns\",\"email\":\"cdunsn4@timesonline.co.uk\"}\n{\"id\":834,\"first_name\":\"King\",\"last_name\":\"Giblett\",\"email\":\"kgiblettn5@bbc.co.uk\"}\n{\"id\":835,\"first_name\":\"Emelita\",\"last_name\":\"Benito\",\"email\":\"ebeniton6@dell.com\"}\n{\"id\":836,\"first_name\":\"Valentine\",\"last_name\":\"MacCaughey\",\"email\":\"vmaccaugheyn7@dropbox.com\"}\n{\"id\":837,\"first_name\":\"Donnell\",\"last_name\":\"Pitcock\",\"email\":\"dpitcockn8@eepurl.com\"}\n{\"id\":838,\"first_name\":\"Dasie\",\"last_name\":\"Goburn\",\"email\":\"dgoburnn9@sciencedirect.com\"}\n{\"id\":839,\"first_name\":\"Berty\",\"last_name\":\"Klulicek\",\"email\":\"bklulicekna@artisteer.com\"}\n{\"id\":840,\"first_name\":\"Franzen\",\"last_name
\":\"Pindred\",\"email\":\"fpindrednb@dropbox.com\"}\n{\"id\":841,\"first_name\":\"Othilia\",\"last_name\":\"Mattia\",\"email\":\"omattianc@hugedomains.com\"}\n{\"id\":842,\"first_name\":\"Analise\",\"last_name\":\"Absolom\",\"email\":\"aabsolomnd@over-blog.com\"}\n{\"id\":843,\"first_name\":\"Bella\",\"last_name\":\"Cowndley\",\"email\":\"bcowndleyne@networksolutions.com\"}\n{\"id\":844,\"first_name\":\"Rich\",\"last_name\":\"Sweedland\",\"email\":\"rsweedlandnf@studiopress.com\"}\n{\"id\":845,\"first_name\":\"Sinclair\",\"last_name\":\"Bonsale\",\"email\":\"sbonsaleng@icq.com\"}\n{\"id\":846,\"first_name\":\"Thurston\",\"last_name\":\"Blumsom\",\"email\":\"tblumsomnh@foxnews.com\"}\n{\"id\":847,\"first_name\":\"Howey\",\"last_name\":\"Dufoure\",\"email\":\"hdufoureni@geocities.jp\"}\n{\"id\":848,\"first_name\":\"Hannie\",\"last_name\":\"Kryzhov\",\"email\":\"hkryzhovnj@deliciousdays.com\"}\n{\"id\":849,\"first_name\":\"Anneliese\",\"last_name\":\"Winchcum\",\"email\":\"awinchcumnk@ifeng.com\"}\n{\"id\":850,\"first_name\":\"Ronda\",\"last_name\":\"Chicotti\",\"email\":\"rchicottinl@liveinternet.ru\"}\n{\"id\":851,\"first_name\":\"Lacy\",\"last_name\":\"Dennis\",\"email\":\"ldennisnm@paypal.com\"}\n{\"id\":852,\"first_name\":\"Chery\",\"last_name\":\"Leasor\",\"email\":\"cleasornn@ning.com\"}\n{\"id\":853,\"first_name\":\"Melli\",\"last_name\":\"Gowler\",\"email\":\"mgowlerno@prlog.org\"}\n{\"id\":854,\"first_name\":\"Audi\",\"last_name\":\"Ratnage\",\"email\":\"aratnagenp@sbwire.com\"}\n{\"id\":855,\"first_name\":\"Marci\",\"last_name\":\"Cato\",\"email\":\"mcatonq@vinaora.com\"}\n{\"id\":856,\"first_name\":\"Verena\",\"last_name\":\"de 
Guerre\",\"email\":\"vdeguerrenr@latimes.com\"}\n{\"id\":857,\"first_name\":\"Guglielmo\",\"last_name\":\"Wiltshaw\",\"email\":\"gwiltshawns@macromedia.com\"}\n{\"id\":858,\"first_name\":\"Thatch\",\"last_name\":\"Palin\",\"email\":\"tpalinnt@elegantthemes.com\"}\n{\"id\":859,\"first_name\":\"Amaleta\",\"last_name\":\"Godthaab\",\"email\":\"agodthaabnu@yellowpages.com\"}\n{\"id\":860,\"first_name\":\"Danna\",\"last_name\":\"Bertome\",\"email\":\"dbertomenv@jimdo.com\"}\n{\"id\":861,\"first_name\":\"Terrance\",\"last_name\":\"Lade\",\"email\":\"tladenw@php.net\"}\n{\"id\":862,\"first_name\":\"Arlie\",\"last_name\":\"Runsey\",\"email\":\"arunseynx@icq.com\"}\n{\"id\":863,\"first_name\":\"Ericha\",\"last_name\":\"Tamas\",\"email\":\"etamasny@businesswire.com\"}\n{\"id\":864,\"first_name\":\"Annissa\",\"last_name\":\"Carine\",\"email\":\"acarinenz@sitemeter.com\"}\n{\"id\":865,\"first_name\":\"Isaac\",\"last_name\":\"Conybear\",\"email\":\"iconybearo0@imgur.com\"}\n{\"id\":866,\"first_name\":\"Susy\",\"last_name\":\"Perris\",\"email\":\"sperriso1@patch.com\"}\n{\"id\":867,\"first_name\":\"Michele\",\"last_name\":\"Malcher\",\"email\":\"mmalchero2@google.com\"}\n{\"id\":868,\"first_name\":\"Benn\",\"last_name\":\"Serot\",\"email\":\"bseroto3@altervista.org\"}\n{\"id\":869,\"first_name\":\"Hewett\",\"last_name\":\"Smoote\",\"email\":\"hsmooteo4@dot.gov\"}\n{\"id\":870,\"first_name\":\"Renie\",\"last_name\":\"Rallings\",\"email\":\"rrallingso5@ox.ac.uk\"}\n{\"id\":871,\"first_name\":\"Sammy\",\"last_name\":\"Trew\",\"email\":\"strewo6@slideshare.net\"}\n{\"id\":872,\"first_name\":\"Enos\",\"last_name\":\"Fisbburne\",\"email\":\"efisbburneo7@webs.com\"}\n{\"id\":873,\"first_name\":\"Yancy\",\"last_name\":\"Rookwell\",\"email\":\"yrookwello8@sina.com.cn\"}\n{\"id\":874,\"first_name\":\"Iolande\",\"last_name\":\"Shillingford\",\"email\":\"ishillingfordo9@forbes.com\"}\n{\"id\":875,\"first_name\":\"Yorker\",\"last_name\":\"Downes\",\"email\":\"ydownesoa@addthis.com\"}\n{\"id\"
:876,\"first_name\":\"Laina\",\"last_name\":\"Jaulme\",\"email\":\"ljaulmeob@elpais.com\"}\n{\"id\":877,\"first_name\":\"Reta\",\"last_name\":\"Argont\",\"email\":\"rargontoc@harvard.edu\"}\n{\"id\":878,\"first_name\":\"Mirabelle\",\"last_name\":\"Schach\",\"email\":\"mschachod@pen.io\"}\n{\"id\":879,\"first_name\":\"Nataline\",\"last_name\":\"Cornish\",\"email\":\"ncornishoe@bbb.org\"}\n{\"id\":880,\"first_name\":\"Rab\",\"last_name\":\"MacPaden\",\"email\":\"rmacpadenof@ameblo.jp\"}\n{\"id\":881,\"first_name\":\"Cheryl\",\"last_name\":\"Blaske\",\"email\":\"cblaskeog@slate.com\"}\n{\"id\":882,\"first_name\":\"Walton\",\"last_name\":\"Fishburn\",\"email\":\"wfishburnoh@china.com.cn\"}\n{\"id\":883,\"first_name\":\"Leoine\",\"last_name\":\"Habercham\",\"email\":\"lhaberchamoi@dailymotion.com\"}\n{\"id\":884,\"first_name\":\"Caria\",\"last_name\":\"Lemmers\",\"email\":\"clemmersoj@prweb.com\"}\n{\"id\":885,\"first_name\":\"Ebenezer\",\"last_name\":\"Renny\",\"email\":\"erennyok@smugmug.com\"}\n{\"id\":886,\"first_name\":\"Max\",\"last_name\":\"Overy\",\"email\":\"moveryol@elegantthemes.com\"}\n{\"id\":887,\"first_name\":\"Patience\",\"last_name\":\"Bilyard\",\"email\":\"pbilyardom@hexun.com\"}\n{\"id\":888,\"first_name\":\"Aubree\",\"last_name\":\"Burdekin\",\"email\":\"aburdekinon@house.gov\"}\n{\"id\":889,\"first_name\":\"Grover\",\"last_name\":\"Trivett\",\"email\":\"gtrivettoo@stumbleupon.com\"}\n{\"id\":890,\"first_name\":\"Brittani\",\"last_name\":\"Durkin\",\"email\":\"bdurkinop@chronoengine.com\"}\n{\"id\":891,\"first_name\":\"Mair\",\"last_name\":\"Denyer\",\"email\":\"mdenyeroq@livejournal.com\"}\n{\"id\":892,\"first_name\":\"Antons\",\"last_name\":\"Pond-Jones\",\"email\":\"apondjonesor@netvibes.com\"}\n{\"id\":893,\"first_name\":\"Terri\",\"last_name\":\"Edgeworth\",\"email\":\"tedgeworthos@youtu.be\"}\n{\"id\":894,\"first_name\":\"Rikki\",\"last_name\":\"Schust\",\"email\":\"rschustot@hatena.ne.jp\"}\n{\"id\":895,\"first_name\":\"Emanuel\",\"last_name\":
\"Magee\",\"email\":\"emageeou@shutterfly.com\"}\n{\"id\":896,\"first_name\":\"Leodora\",\"last_name\":\"Dewick\",\"email\":\"ldewickov@ycombinator.com\"}\n{\"id\":897,\"first_name\":\"Lani\",\"last_name\":\"Caskey\",\"email\":\"lcaskeyow@nyu.edu\"}\n{\"id\":898,\"first_name\":\"Ashla\",\"last_name\":\"Ordemann\",\"email\":\"aordemannox@shareasale.com\"}\n{\"id\":899,\"first_name\":\"Bran\",\"last_name\":\"Glidder\",\"email\":\"bglidderoy@dyndns.org\"}\n{\"id\":900,\"first_name\":\"Ricardo\",\"last_name\":\"Sarle\",\"email\":\"rsarleoz@msu.edu\"}\n{\"id\":901,\"first_name\":\"Marcille\",\"last_name\":\"Strevens\",\"email\":\"mstrevensp0@house.gov\"}\n{\"id\":902,\"first_name\":\"Corbet\",\"last_name\":\"Thurner\",\"email\":\"cthurnerp1@theatlantic.com\"}\n{\"id\":903,\"first_name\":\"Peirce\",\"last_name\":\"Poveleye\",\"email\":\"ppoveleyep2@so-net.ne.jp\"}\n{\"id\":904,\"first_name\":\"Berti\",\"last_name\":\"Baldacco\",\"email\":\"bbaldaccop3@guardian.co.uk\"}\n{\"id\":905,\"first_name\":\"Jemima\",\"last_name\":\"Menichino\",\"email\":\"jmenichinop4@mashable.com\"}\n{\"id\":906,\"first_name\":\"Hobart\",\"last_name\":\"Dawtry\",\"email\":\"hdawtryp5@nationalgeographic.com\"}\n{\"id\":907,\"first_name\":\"Tiena\",\"last_name\":\"Giannazzo\",\"email\":\"tgiannazzop6@goodreads.com\"}\n{\"id\":908,\"first_name\":\"Buck\",\"last_name\":\"Sturley\",\"email\":\"bsturleyp7@apache.org\"}\n{\"id\":909,\"first_name\":\"Corly\",\"last_name\":\"Sidgwick\",\"email\":\"csidgwickp8@elegantthemes.com\"}\n{\"id\":910,\"first_name\":\"Lynnea\",\"last_name\":\"Bezzant\",\"email\":\"lbezzantp9@rakuten.co.jp\"}\n{\"id\":911,\"first_name\":\"Skipp\",\"last_name\":\"Shepperd\",\"email\":\"sshepperdpa@apple.com\"}\n{\"id\":912,\"first_name\":\"Jeffry\",\"last_name\":\"Grierson\",\"email\":\"jgriersonpb@nih.gov\"}\n{\"id\":913,\"first_name\":\"Killian\",\"last_name\":\"Grzegorzewski\",\"email\":\"kgrzegorzewskipc@homestead.com\"}\n{\"id\":914,\"first_name\":\"Phebe\",\"last_name\":\"Holt
away\",\"email\":\"pholtawaypd@tinypic.com\"}\n{\"id\":915,\"first_name\":\"Morgan\",\"last_name\":\"Glader\",\"email\":\"mgladerpe@newsvine.com\"}\n{\"id\":916,\"first_name\":\"Dallon\",\"last_name\":\"Hamshere\",\"email\":\"dhamsherepf@geocities.com\"}\n{\"id\":917,\"first_name\":\"Sullivan\",\"last_name\":\"Jorden\",\"email\":\"sjordenpg@umich.edu\"}\n{\"id\":918,\"first_name\":\"Barbara\",\"last_name\":\"Simak\",\"email\":\"bsimakph@nsw.gov.au\"}\n{\"id\":919,\"first_name\":\"Arlyne\",\"last_name\":\"Guiduzzi\",\"email\":\"aguiduzzipi@pcworld.com\"}\n{\"id\":920,\"first_name\":\"Raff\",\"last_name\":\"Tremathick\",\"email\":\"rtremathickpj@webs.com\"}\n{\"id\":921,\"first_name\":\"Ailsun\",\"last_name\":\"Castelain\",\"email\":\"acastelainpk@engadget.com\"}\n{\"id\":922,\"first_name\":\"Zelda\",\"last_name\":\"Malt\",\"email\":\"zmaltpl@icio.us\"}\n{\"id\":923,\"first_name\":\"Chanda\",\"last_name\":\"Loram\",\"email\":\"clorampm@about.me\"}\n{\"id\":924,\"first_name\":\"Kiel\",\"last_name\":\"Binford\",\"email\":\"kbinfordpn@latimes.com\"}\n{\"id\":925,\"first_name\":\"Sawyer\",\"last_name\":\"Lesslie\",\"email\":\"slessliepo@webnode.com\"}\n{\"id\":926,\"first_name\":\"Billi\",\"last_name\":\"Hunte\",\"email\":\"bhuntepp@bravesites.com\"}\n{\"id\":927,\"first_name\":\"Thaxter\",\"last_name\":\"Mellows\",\"email\":\"tmellowspq@twitpic.com\"}\n{\"id\":928,\"first_name\":\"Shani\",\"last_name\":\"Djokic\",\"email\":\"sdjokicpr@fastcompany.com\"}\n{\"id\":929,\"first_name\":\"Hardy\",\"last_name\":\"Ambrogelli\",\"email\":\"hambrogellips@goo.ne.jp\"}\n{\"id\":930,\"first_name\":\"Antonie\",\"last_name\":\"Georgins\",\"email\":\"ageorginspt@seesaa.net\"}\n{\"id\":931,\"first_name\":\"Ennis\",\"last_name\":\"Schuck\",\"email\":\"eschuckpu@globo.com\"}\n{\"id\":932,\"first_name\":\"Jermayne\",\"last_name\":\"Reeson\",\"email\":\"jreesonpv@networkadvertising.org\"}\n{\"id\":933,\"first_name\":\"Claudio\",\"last_name\":\"Stener\",\"email\":\"cstenerpw@dyndns.org\"}\n{\
"id\":934,\"first_name\":\"Stella\",\"last_name\":\"McLeoid\",\"email\":\"smcleoidpx@bigcartel.com\"}\n{\"id\":935,\"first_name\":\"Steven\",\"last_name\":\"Warby\",\"email\":\"swarbypy@cnn.com\"}\n{\"id\":936,\"first_name\":\"Oby\",\"last_name\":\"Prangle\",\"email\":\"opranglepz@dedecms.com\"}\n{\"id\":937,\"first_name\":\"Kellsie\",\"last_name\":\"Roberson\",\"email\":\"krobersonq0@skyrock.com\"}\n{\"id\":938,\"first_name\":\"Chiquia\",\"last_name\":\"De la croix\",\"email\":\"cdelacroixq1@virginia.edu\"}\n{\"id\":939,\"first_name\":\"Richie\",\"last_name\":\"Pyett\",\"email\":\"rpyettq2@hexun.com\"}\n{\"id\":940,\"first_name\":\"Darb\",\"last_name\":\"Pavitt\",\"email\":\"dpavittq3@bbb.org\"}\n{\"id\":941,\"first_name\":\"Gwenneth\",\"last_name\":\"Champken\",\"email\":\"gchampkenq4@stanford.edu\"}\n{\"id\":942,\"first_name\":\"Roger\",\"last_name\":\"Lghan\",\"email\":\"rlghanq5@cdbaby.com\"}\n{\"id\":943,\"first_name\":\"Aurelia\",\"last_name\":\"Golt\",\"email\":\"agoltq6@opera.com\"}\n{\"id\":944,\"first_name\":\"Stefa\",\"last_name\":\"Polini\",\"email\":\"spoliniq7@elpais.com\"}\n{\"id\":945,\"first_name\":\"Elden\",\"last_name\":\"Kuschek\",\"email\":\"ekuschekq8@imageshack.us\"}\n{\"id\":946,\"first_name\":\"Lucille\",\"last_name\":\"Davidy\",\"email\":\"ldavidyq9@paginegialle.it\"}\n{\"id\":947,\"first_name\":\"Amelina\",\"last_name\":\"Rabson\",\"email\":\"arabsonqa@ihg.com\"}\n{\"id\":948,\"first_name\":\"Rustin\",\"last_name\":\"Pickrill\",\"email\":\"rpickrillqb@dedecms.com\"}\n{\"id\":949,\"first_name\":\"Nicol\",\"last_name\":\"Gargett\",\"email\":\"ngargettqc@mit.edu\"}\n{\"id\":950,\"first_name\":\"Malachi\",\"last_name\":\"Chipman\",\"email\":\"mchipmanqd@harvard.edu\"}\n{\"id\":951,\"first_name\":\"Zebulon\",\"last_name\":\"Wackly\",\"email\":\"zwacklyqe@diigo.com\"}\n{\"id\":952,\"first_name\":\"Casi\",\"last_name\":\"Cosans\",\"email\":\"ccosansqf@pbs.org\"}\n{\"id\":953,\"first_name\":\"Gustavo\",\"last_name\":\"Hampton\",\"email\":\"ghampt
onqg@tinyurl.com\"}\n{\"id\":954,\"first_name\":\"Yves\",\"last_name\":\"Dineen\",\"email\":\"ydineenqh@godaddy.com\"}\n{\"id\":955,\"first_name\":\"Ursala\",\"last_name\":\"Oller\",\"email\":\"uollerqi@jigsy.com\"}\n{\"id\":956,\"first_name\":\"Emlynn\",\"last_name\":\"Girardin\",\"email\":\"egirardinqj@zdnet.com\"}\n{\"id\":957,\"first_name\":\"Jarid\",\"last_name\":\"Fargie\",\"email\":\"jfargieqk@chicagotribune.com\"}\n{\"id\":958,\"first_name\":\"Laurens\",\"last_name\":\"Danihelka\",\"email\":\"ldanihelkaql@gmpg.org\"}\n{\"id\":959,\"first_name\":\"Ignaz\",\"last_name\":\"Drinan\",\"email\":\"idrinanqm@cbslocal.com\"}\n{\"id\":960,\"first_name\":\"Michaela\",\"last_name\":\"Benning\",\"email\":\"mbenningqn@ocn.ne.jp\"}\n{\"id\":961,\"first_name\":\"Anita\",\"last_name\":\"Dericot\",\"email\":\"adericotqo@ihg.com\"}\n{\"id\":962,\"first_name\":\"Giselbert\",\"last_name\":\"Grene\",\"email\":\"ggreneqp@arizona.edu\"}\n{\"id\":963,\"first_name\":\"Daphne\",\"last_name\":\"Deny\",\"email\":\"ddenyqq@google.co.uk\"}\n{\"id\":964,\"first_name\":\"Josefa\",\"last_name\":\"Scoular\",\"email\":\"jscoularqr@cargocollective.com\"}\n{\"id\":965,\"first_name\":\"Papagena\",\"last_name\":\"Blatcher\",\"email\":\"pblatcherqs@time.com\"}\n{\"id\":966,\"first_name\":\"Symon\",\"last_name\":\"Fearneley\",\"email\":\"sfearneleyqt@usgs.gov\"}\n{\"id\":967,\"first_name\":\"Flinn\",\"last_name\":\"Oak\",\"email\":\"foakqu@wsj.com\"}\n{\"id\":968,\"first_name\":\"Aeriela\",\"last_name\":\"Ofen\",\"email\":\"aofenqv@about.me\"}\n{\"id\":969,\"first_name\":\"Belia\",\"last_name\":\"Abdee\",\"email\":\"babdeeqw@lycos.com\"}\n{\"id\":970,\"first_name\":\"Dee\",\"last_name\":\"Sigg\",\"email\":\"dsiggqx@360.cn\"}\n{\"id\":971,\"first_name\":\"Gilberte\",\"last_name\":\"Kitchin\",\"email\":\"gkitchinqy@harvard.edu\"}\n{\"id\":972,\"first_name\":\"Adelaide\",\"last_name\":\"Clinch\",\"email\":\"aclinchqz@opera.com\"}\n{\"id\":973,\"first_name\":\"Lemmie\",\"last_name\":\"Gonnet\",\"email\"
:\"lgonnetr0@geocities.com\"}\n{\"id\":974,\"first_name\":\"Redd\",\"last_name\":\"Cham\",\"email\":\"rchamr1@mtv.com\"}\n{\"id\":975,\"first_name\":\"Hester\",\"last_name\":\"Belton\",\"email\":\"hbeltonr2@craigslist.org\"}\n{\"id\":976,\"first_name\":\"Barry\",\"last_name\":\"Sharrard\",\"email\":\"bsharrardr3@mozilla.com\"}\n{\"id\":977,\"first_name\":\"Carney\",\"last_name\":\"Skepper\",\"email\":\"cskepperr4@vkontakte.ru\"}\n{\"id\":978,\"first_name\":\"Karleen\",\"last_name\":\"Baigent\",\"email\":\"kbaigentr5@topsy.com\"}\n{\"id\":979,\"first_name\":\"Jany\",\"last_name\":\"Geraghty\",\"email\":\"jgeraghtyr6@google.com\"}\n{\"id\":980,\"first_name\":\"Valdemar\",\"last_name\":\"Kleinfeld\",\"email\":\"vkleinfeldr7@github.io\"}\n{\"id\":981,\"first_name\":\"Dierdre\",\"last_name\":\"Sydenham\",\"email\":\"dsydenhamr8@uiuc.edu\"}\n{\"id\":982,\"first_name\":\"Florella\",\"last_name\":\"Libermore\",\"email\":\"flibermorer9@europa.eu\"}\n{\"id\":983,\"first_name\":\"Stanley\",\"last_name\":\"Agron\",\"email\":\"sagronra@census.gov\"}\n{\"id\":984,\"first_name\":\"Estel\",\"last_name\":\"Guerrieri\",\"email\":\"eguerrierirb@wikispaces.com\"}\n{\"id\":985,\"first_name\":\"Leonie\",\"last_name\":\"Potebury\",\"email\":\"lpoteburyrc@ebay.co.uk\"}\n{\"id\":986,\"first_name\":\"Freeland\",\"last_name\":\"Caselli\",\"email\":\"fcasellird@nydailynews.com\"}\n{\"id\":987,\"first_name\":\"Sol\",\"last_name\":\"Skamell\",\"email\":\"sskamellre@gmpg.org\"}\n{\"id\":988,\"first_name\":\"Jakie\",\"last_name\":\"Portal\",\"email\":\"jportalrf@freewebs.com\"}\n{\"id\":989,\"first_name\":\"Flory\",\"last_name\":\"Stothart\",\"email\":\"fstothartrg@google.co.jp\"}\n{\"id\":990,\"first_name\":\"Lacy\",\"last_name\":\"Scotter\",\"email\":\"lscotterrh@pagesperso-orange.fr\"}\n{\"id\":991,\"first_name\":\"Mauricio\",\"last_name\":\"Adamthwaite\",\"email\":\"madamthwaiteri@cloudflare.com\"}\n{\"id\":992,\"first_name\":\"Bev\",\"last_name\":\"Whisson\",\"email\":\"bwhissonrj@de.vu\"}\n{
\"id\":993,\"first_name\":\"Eryn\",\"last_name\":\"Dowbakin\",\"email\":\"edowbakinrk@salon.com\"}\n{\"id\":994,\"first_name\":\"Marlo\",\"last_name\":\"Craxford\",\"email\":\"mcraxfordrl@aboutads.info\"}\n{\"id\":995,\"first_name\":\"Tracy\",\"last_name\":\"Dougliss\",\"email\":\"tdouglissrm@php.net\"}\n{\"id\":996,\"first_name\":\"Hermann\",\"last_name\":\"Frantzen\",\"email\":\"hfrantzenrn@sitemeter.com\"}\n{\"id\":997,\"first_name\":\"Vivien\",\"last_name\":\"Drewery\",\"email\":\"vdreweryro@imgur.com\"}\n{\"id\":998,\"first_name\":\"Papageno\",\"last_name\":\"Greenstead\",\"email\":\"pgreensteadrp@seattletimes.com\"}\n{\"id\":999,\"first_name\":\"Freeman\",\"last_name\":\"Laguerre\",\"email\":\"flaguerrerq@cisco.com\"}\n{\"id\":1000,\"first_name\":\"Cameron\",\"last_name\":\"Tocque\",\"email\":\"ctocquerr@newsvine.com\"}"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/benches/data/simple-routing-expression-bench.json",
    "content": "{\"timestamp\": 1698386133268880, \"source\": \"custom_dealercrawl\", \"vin\": \"2GCUDDED0R1145310\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133269998, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKMKT9PR546598\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133270959, \"source\": \"custom_dealercrawl\", \"vin\": \"1GT49PE78PF120870\", \"vid\": \"ae45b13a0a0e094a6d02e389bbed910c\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133271875, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS2DKL0MR129721\", \"vid\": \"b3504f5c0a0e0a9939aadd20361ee300\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133272644, \"source\": \"custom_dealercrawl\", \"vin\": \"1GT49RE75LF325021\", \"vid\": \"c4d98e290a0e081d6bd1a3a56ae3316c\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133273487, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKBED9PR545024\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133274290, 
\"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVKKWXPJ299402\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133275046, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVKKW4PJ315044\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133275396, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVJKW8PJ298811\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133275696, \"source\": \"custom_dealercrawl\", \"vin\": \"3GTU9DED0LG360508\", \"vid\": \"f41107ba0a0e0a9a2d62f34c045a09fb\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133275998, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVFKW5PJ303735\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133276276, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNERNKW7PJ296754\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133276550, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDJED5RZ165154\", \"vid\": \"\", 
\"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133276810, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDJED5RZ152193\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133277182, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDGED8RZ138291\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133277466, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDGED6PZ120207\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133277734, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDGE83RZ154188\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133278077, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEEL1RZ134255\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133278398, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED9PZ113918\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133278656, \"source\": \"custom_dealercrawl\", \"vin\": \"1FM5K8GC1LGB34005\", \"vid\": \"c4d98aa70a0e081d6bd1a3a56f70d391\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133278989, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED7PZ280620\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133279252, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED2PZ304676\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133279560, \"source\": \"custom_dealercrawl\", 
\"vin\": \"1GCUDEED1RZ165143\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133279874, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYEED3LZ335014\", \"vid\": \"233aa6af0a0e087f102b3b1d5cd62eea\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133280134, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRDDED3RZ176824\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133280403, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W2BT7KEG15697\", \"vid\": \"e8ea1e580a0e0a9260e76faa2bd7b4f1\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133280675, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRDAED0PZ185444\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133281034, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1RG4NFB39096\", \"vid\": \"195b2dcd0a0e0a9a61540bbc7f86eac8\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133281359, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTUUHEL3NZ514919\", \"vid\": 
\"ddba2a100a0e081d2e4f88904a3868a5\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133281604, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YVE74RF172409\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133281892, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4EKL5MR400679\", \"vid\": \"ce4cf2cf0a0e0a941d829089895c9fae\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133282239, \"source\": \"custom_dealercrawl\", \"vin\": \"3C63R3EL0NG159716\", \"vid\": \"5e681fd40a0e0a92336a2c1ea2a91f48\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133282533, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YTE73RF130474\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133282796, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YSE7XRF160923\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133283072, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YSE78RF160340\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": 
\"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133283390, \"source\": \"custom_dealercrawl\", \"vin\": \"ZASPAKBN5N7D33543\", \"vid\": \"50f38ef90a0e0a992b96b96b80dfbc36\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133283672, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YSE76RF160112\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133283988, \"source\": \"custom_dealercrawl\", \"vin\": \"ZARFANAN2L7626049\", \"vid\": \"faeda5f50a0e0a9a61540bbc6e49a9be\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133284302, \"source\": \"custom_dealercrawl\", \"vin\": \"WAUVVAFR6AA009328\", \"vid\": \"3c5a9bc50a0e087f42924702c68ba470\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133284526, \"source\": \"custom_dealercrawl\", \"vin\": \"2GCUDAEDXR1145166\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133284799, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YPEY0R1157790\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133285091, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YNE70R1120928\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133285352, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YME7XR1144180\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133285580, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YME7XR1144177\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133285863, \"source\": \"custom_dealercrawl\", \"vin\": \"WBA7T4C08NCK38450\", \"vid\": \"f42080210a0e0a943a15604b5241a31f\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133286120, \"source\": 
\"custom_dealercrawl\", \"vin\": \"5UXCR6C03L9C58192\", \"vid\": \"500a3d1e0a0e094a3883bfc3d8ceccc5\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133286466, \"source\": \"custom_dealercrawl\", \"vin\": \"WBXYH9C08P5V63695\", \"vid\": \"4184a7810a0e0a9a6585850e913a6837\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133286778, \"source\": \"custom_dealercrawl\", \"vin\": \"1HTKJPVK2PH387965\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133287356, \"source\": \"custom_dealercrawl\", \"vin\": \"1HTKJPVK0PH598842\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133287735, \"source\": \"custom_dealercrawl\", \"vin\": \"KL4CJESBXLB327918\", \"vid\": \"184583ce0a0e0a913c8e19b0dec8f9c7\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133288096, \"source\": \"custom_dealercrawl\", \"vin\": \"KL4CJASB6LB010338\", \"vid\": \"18469a970a0e0a943b89ecfdcc517607\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133288369, \"source\": 
\"custom_dealercrawl\", \"vin\": \"1GNSKTKL4PR536961\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133288638, \"source\": \"custom_dealercrawl\", \"vin\": \"KL4CJASB0LB340301\", \"vid\": \"3c5a94950a0e087f429247026603afc8\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133288935, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKRKD0PR547368\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133289205, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKPKD7PR537974\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133289479, \"source\": \"custom_dealercrawl\", \"vin\": \"2G1FK1EJ5D9177823\", \"vid\": \"71ecfa260a0e0a906df43755e35c073a\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133289771, \"source\": \"custom_dealercrawl\", \"vin\": \"2G1FK3DJ5C9187909\", \"vid\": \"1845efab0a0e0a9a61540bbc97d89eed\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133290064, \"source\": \"custom_dealercrawl\", \"vin\": 
\"2G1FK3DJ9B9194103\", \"vid\": \"1845703e0a0e0a930a4d1cb070747df2\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133290357, \"source\": \"custom_dealercrawl\", \"vin\": \"2G1FK3DJXB9171817\", \"vid\": \"1bdbf7d00a0e094a62ba84ba2c2466f0\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133290657, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1ZD5ST5RF108083\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133290954, \"source\": \"custom_dealercrawl\", \"vin\": \"2G1FK3DJ8B9188647\", \"vid\": \"5d740da40a0e0a171983e9bd0190c4bc\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133291291, \"source\": \"custom_dealercrawl\", \"vin\": \"2G1FT1EW9B9121678\", \"vid\": \"40123a2d0a0e0a992b96b96b5c7f2fa4\", \"date\": \"2023-10-26\", \"domain\": \"www.basilvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133291568, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1YC3D49P5139882\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133291887, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1G1YB3D46P5142225\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133292167, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1FG3D74P0159539\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133292413, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1FE1R70P0159072\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133292656, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YSE73RF149875\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133292987, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YSE71RF154511\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133293285, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC3YLE77RF186014\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133293580, \"source\": \"custom_dealercrawl\", \"vin\": \"1GB5YSE72RF248118\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133293836, \"source\": \"custom_dealercrawl\", \"vin\": \"1GB3YTE72RF155547\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133294117, \"source\": \"custom_dealercrawl\", \"vin\": \"1GB3YSE76RF219549\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133294362, \"source\": \"custom_dealercrawl\", \"vin\": \"1GB3YSE73RF165997\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133309877, \"source\": \"custom_dealercrawl\", \"vin\": \"1GB0GRFP7P1150104\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133310190, \"source\": \"custom_dealercrawl\", 
\"vin\": \"1G1ZE5ST6PF240800\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133310465, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1ZD5ST8RF108112\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133310752, \"source\": \"custom_dealercrawl\", \"vin\": \"KL7CJPSB5JB703730\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133311070, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MUSL9NB055307\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133311423, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXXEV1KS627838\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133311695, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEV5LL267497\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133311990, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXSEV3KS603494\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": 
\"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133312245, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUYEEL3MG247234\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133333811, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUYDED3LG221363\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133334141, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCPDFEK1NG501398\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133334392, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCPDCED4NG665390\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133334642, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YUEY6N1232937\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133334946, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YPE78LF156247\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E 
INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133335284, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YPE71MF258796\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133335565, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC1YNE70MF193849\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133335905, \"source\": \"custom_dealercrawl\", \"vin\": \"1GAZGPFG7L1237494\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133336163, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1YY2D73H5103077\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133351555, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1YB3D70K5109322\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133351882, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1PE5SB7E7112507\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133352152, \"source\": \"custom_dealercrawl\", \"vin\": \"1G11F5SL4FF195476\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133352432, \"source\": \"custom_dealercrawl\", \"vin\": \"5GAEVAKW0LJ217885\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133352711, \"source\": \"custom_dealercrawl\", \"vin\": \"3GKALVEV5ML305732\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133353010, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRYGEL2LZ112522\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133353345, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRYEED3LZ304173\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133353611, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1GCRYDED7NZ193379\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133353912, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPYFEDXKZ289103\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133364666, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPYFED6LZ320090\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133365021, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPYFED2LZ129380\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133365308, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPYCEF5MZ451321\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133365540, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCGTDE34G1330583\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133365853, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC5YMEY5PF202554\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": 
{\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133366418, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YTE70NF180621\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133366723, \"source\": \"custom_dealercrawl\", \"vin\": \"2G11Z5S39K9114885\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133367257, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKRKT7NR323437\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133367625, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKRKD5MR256750\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133381057, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKGKLXMR352045\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133381359, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKCKD7MR476355\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133381609, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVKKW2LJ237602\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133381884, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYGET0MZ323357\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133382123, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYEEDXLZ104871\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133382411, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYBEF3KZ319717\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133382722, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEE83PZ102870\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133383062, \"source\": \"custom_dealercrawl\", \"vin\": \"SADF12FX3L1Z88367\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133383382, \"source\": 
\"custom_dealercrawl\", \"vin\": \"NMTKHMBX2MR137493\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133398814, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPNCAC1H7255561\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133399150, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDMB5C10L6604551\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133399443, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDCR3LE4P5042602\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133399745, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8J23A49KU969899\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133400025, \"source\": \"custom_dealercrawl\", \"vin\": \"JTHG81F20N5048760\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133400304, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEKU5JR8M5852396\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": 
\"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133400531, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEBU5JRXG5358237\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133400909, \"source\": \"custom_dealercrawl\", \"vin\": \"JTDKAMFU6N3180730\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133401206, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTW7AFP6M1310458\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133401483, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTW7AFG9L1268253\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133405976, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTUUDED1NZ643825\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133406311, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTP9EEL4MZ171923\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E 
GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133406559, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTG6CEN1H1223768\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133406822, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS2JKL6PR311586\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133407094, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKKNXLSXJZ113083\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133422702, \"source\": \"custom_dealercrawl\", \"vin\": \"WUAWAAFC9JN902605\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133423065, \"source\": \"custom_dealercrawl\", \"vin\": \"WP1AA2A53NLB03425\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133423422, \"source\": \"custom_dealercrawl\", \"vin\": \"WDD1K6JB1KF072204\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133423698, \"source\": \"custom_dealercrawl\", \"vin\": \"WBAJS7C09LCE16859\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133424014, \"source\": \"custom_dealercrawl\", \"vin\": \"WBAGV8C03PCL44518\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133424317, \"source\": \"custom_dealercrawl\", \"vin\": \"WAULFAFH8DN013907\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133424629, \"source\": \"custom_dealercrawl\", \"vin\": \"WAUC4CF58NA009405\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133424938, \"source\": \"custom_dealercrawl\", \"vin\": \"WAUA4CF57MA004024\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133425227, \"source\": \"custom_dealercrawl\", \"vin\": \"WA1LAAF76KD007304\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133454121, \"source\": \"custom_dealercrawl\", \"vin\": \"WA1CAAFY7M2017816\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133454595, \"source\": \"custom_dealercrawl\", \"vin\": \"WA1BNAFY0L2017008\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133454922, \"source\": \"custom_dealercrawl\", \"vin\": \"WA12ABGE0LB035790\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133455212, \"source\": \"custom_dealercrawl\", \"vin\": \"W1K3G4FB9NJ375434\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133455530, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDJSKFC3MS005112\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133455947, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDGZRBH0LS008662\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133456265, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDDZ3DC9LS239674\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133456503, \"source\": \"custom_dealercrawl\", \"vin\": \"5NPEF4JA7LH023396\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133456772, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS2CAD5LH250070\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133478001, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMTJ5DZ3NUL28344\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133478424, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ3D93KUL43281\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133478730, \"source\": \"custom_dealercrawl\", \"vin\": \"5J8TC2H64ML800234\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133479015, \"source\": 
\"custom_dealercrawl\", \"vin\": \"5J6RM4H51GL087560\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133479322, \"source\": \"custom_dealercrawl\", \"vin\": \"5FNYF6H59MB014961\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133479594, \"source\": \"custom_dealercrawl\", \"vin\": \"5FNRL6H95LB010646\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133479924, \"source\": \"custom_dealercrawl\", \"vin\": \"55SWF8EBXLU325311\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133480181, \"source\": \"custom_dealercrawl\", \"vin\": \"4T1G11AKXMU407495\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133480425, \"source\": \"custom_dealercrawl\", \"vin\": \"4S4BTGKD7L3186191\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133511190, \"source\": \"custom_dealercrawl\", \"vin\": \"4S3GKAV67L3614428\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133511520, \"source\": \"custom_dealercrawl\", \"vin\": \"4S3BNAR67K3009805\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133511884, \"source\": \"custom_dealercrawl\", \"vin\": \"4JGFB4KB8LA042033\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133512133, \"source\": \"custom_dealercrawl\", \"vin\": \"4JGFB4JB9LA180908\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133512416, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV2B7AX9LM149646\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133512664, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV2B7AX1LM001880\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133512998, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV0B7AXXMM019223\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133513341, \"source\": \"custom_dealercrawl\", \"vin\": \"JM3TCBDY0G0107485\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133513569, \"source\": \"custom_dealercrawl\", \"vin\": \"JM3KFBCM3L0848807\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133554061, \"source\": \"custom_dealercrawl\", \"vin\": \"JM1GL1WY8L1511947\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133554439, \"source\": \"custom_dealercrawl\", \"vin\": \"JF2SKADC3MH450456\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133554813, \"source\": \"custom_dealercrawl\", \"vin\": 
\"JF2SKACC0KH523118\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133555130, \"source\": \"custom_dealercrawl\", \"vin\": \"5YFEPMAE6MP244430\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133555454, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG34J23MG004125\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133555834, \"source\": \"custom_dealercrawl\", \"vin\": \"5UXCR6C02N9K75226\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133556136, \"source\": \"custom_dealercrawl\", \"vin\": \"5UX53DP02N9K31493\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133556516, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFCZ5AN4MX268847\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133556913, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDJZRBH9MS106838\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133575788, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDBB5LT106928\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133576134, \"source\": \"custom_dealercrawl\", \"vin\": \"2FMPK4K98LBA22192\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133576436, \"source\": \"custom_dealercrawl\", \"vin\": \"2FMPK4J94NBA51676\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133576696, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1N77KR657719\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133577029, \"source\": \"custom_dealercrawl\", 
\"vin\": \"2C4RC1EG5LR121448\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133577341, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG0MR574995\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133577629, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDZBT7HH513223\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133577978, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2UR2CAXKC618353\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133578265, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6ED1EK1NN606666\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133591613, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6BA1F45KN525158\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133591993, \"source\": \"custom_dealercrawl\", \"vin\": \"1N4AA6EV0LC379776\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": 
\"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133592308, \"source\": \"custom_dealercrawl\", \"vin\": \"1HGCV3F56MA020208\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133592563, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWEM7BU5RM006791\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133592968, \"source\": \"custom_dealercrawl\", \"vin\": \"3VVEX7B2XRM008059\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133593294, \"source\": \"custom_dealercrawl\", \"vin\": \"3VVEX7B22RM009819\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133593662, \"source\": \"custom_dealercrawl\", \"vin\": \"WVWHA7CDXRW125301\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133594035, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4JKL2NR104786\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": 
\"22310\"}}\n{\"timestamp\": 1698386133594330, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV4X7B21RM015803\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133601233, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4FKL2MR305325\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133601540, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4DKJ4LR186629\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133601975, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2KR2CA7RC529308\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133602258, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2KR2CA5RC528464\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133602584, \"source\": \"custom_dealercrawl\", \"vin\": \"WVWAR7AN1PE010224\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133602903, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1V2VMPE8XPC015467\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133603211, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6DH5RL3L0150549\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133603510, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6AW5SX8K0107586\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133603832, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTRF3B61KEC64391\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133620398, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1E80MFB98630\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133620725, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1EP9MFA96665\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133621066, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT8W3BT7JEC71958\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133621345, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT8W2BT6NEC29443\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133621619, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W3BT5LEC65056\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133621963, \"source\": \"custom_dealercrawl\", \"vin\": \"3VVMB7AX4RM035085\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133622372, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMSK8DH0LGC55725\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133622679, \"source\": \"custom_dealercrawl\", 
\"vin\": \"1FMJU2ATXKEA77571\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133622982, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV8B7AX7RM036670\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133630960, \"source\": \"custom_dealercrawl\", \"vin\": \"WVW2A7CDXRW124459\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133631372, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2AE2CA4RC213775\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133631672, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2AE2CA8RC213150\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133631970, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2WNPE89PC046503\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133632242, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2WNPE88PC049859\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133632473, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2WNPE84PC038521\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133632831, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2FE2CA5RC214603\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133633138, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6SRFFT7NN457229\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133655734, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6SRFFT5LN125674\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133656118, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2WNPE88PC047884\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133656416, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2WNPE82PC045824\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133656679, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2FR2CA6RC524389\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133657102, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2FR2CA7RC529066\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133657380, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2JNPE80PC050319\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133657630, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWBM7BU0RM020102\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133657919, \"source\": \"custom_dealercrawl\", \"vin\": \"3VW7M7BU5RM021963\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133658194, \"source\": \"custom_dealercrawl\", \"vin\": \"3VW9T7BU0RM021669\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133669957, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV2B7AX2KM060323\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133670355, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV2B7AX3LM144345\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133670736, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2RE2CA8MC203553\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133671100, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWGZ7AJXBM076102\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133671448, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV2B7AX3KM052120\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133671731, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV2B7AX2MM003719\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133672124, \"source\": \"custom_dealercrawl\", \"vin\": \"1V2LR2CA5KC565816\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133672413, \"source\": \"custom_dealercrawl\", \"vin\": \"KMHGH4JH4CU052338\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133731015, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDEPCAA8M7084818\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133731396, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1AT3BB6MC786205\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133731701, \"source\": \"custom_dealercrawl\", \"vin\": 
\"4T1K61BKXLU010846\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133732018, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1DR2MM5KC649659\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133732301, \"source\": \"custom_dealercrawl\", \"vin\": \"19UUB1F39LA016667\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133732565, \"source\": \"custom_dealercrawl\", \"vin\": \"JN1BJ1CW0NW683093\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133732868, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4FA54198L519317\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133733178, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG6GL152115\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133733415, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4AA2D14AL141924\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133733648, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1AT3CBXMC845562\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133773860, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG5CL215389\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133774253, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4FA24117L218604\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133774574, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1EG1FFA18612\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133774999, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4GJXAGXLW223308\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133775330, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4AA2D19BL557680\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133775680, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1EG3HFA54823\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133776001, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDXJG5NH107400\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133776318, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBG5KC539748\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133776560, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4FA39S93P344607\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133776906, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4AA2D14BL537658\", \"vid\": 
\"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133791019, \"source\": \"custom_dealercrawl\", \"vin\": \"19UDE4H37PA009583\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133791372, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RR7KG4JS253111\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133791688, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4FA39S32P723231\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133791991, \"source\": \"custom_dealercrawl\", \"vin\": \"5J8TC2H34KL026652\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133792307, \"source\": \"custom_dealercrawl\", \"vin\": \"2HKRW2H58NH660864\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133792544, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG7HL580177\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133792879, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG9EL250097\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133793185, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8BT3DD7NW270480\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133793435, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1DR3DK7NC202985\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133810322, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1J4FA69S73P310361\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133810644, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG6EL179599\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133810975, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG6FL632329\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133811314, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG1DL610140\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133811592, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG8DL517163\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133811910, \"source\": \"custom_dealercrawl\", \"vin\": \"JM1BM1T7XF1263126\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133812190, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG0CL126524\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133812437, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJKBG8M8173296\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133812684, \"source\": \"custom_dealercrawl\", \"vin\": \"5NPEC4ACXBH284871\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133812997, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJHEG9N8621383\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133818109, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG9JL838785\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133818418, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFLG2LC135600\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133818675, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFCG6LC437438\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133818983, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBGXLC243352\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133819329, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBGXKC674837\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133819610, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CP5CV4ML475174\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133819911, \"source\": \"custom_dealercrawl\", \"vin\": \"JF1GPAA62E8220984\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133820184, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1AB8CVXLY263601\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133820411, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG2FL571688\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133826619, \"source\": \"custom_dealercrawl\", \"vin\": \"JH4KC1F94EC005217\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133826957, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CN8EV5ML880397\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133827282, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4BA3H13AL218920\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133827553, \"source\": \"custom_dealercrawl\", \"vin\": 
\"3KPF24AD0ME269643\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133827902, \"source\": \"custom_dealercrawl\", \"vin\": \"3FMCR9B63MRA46769\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133828201, \"source\": \"custom_dealercrawl\", \"vin\": \"3FA6P0T91LR197271\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133828433, \"source\": \"custom_dealercrawl\", \"vin\": \"3FA6P0SU5LR235414\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133828734, \"source\": \"custom_dealercrawl\", \"vin\": \"3FA6P0HD2KR159334\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133829069, \"source\": \"custom_dealercrawl\", \"vin\": \"3FA6P0H9XHR192831\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133848546, \"source\": \"custom_dealercrawl\", \"vin\": \"3CZRU6H52MM717351\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133848871, \"source\": \"custom_dealercrawl\", \"vin\": \"3C63R3EJ2JG430333\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133849167, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4PDCGB8LT268751\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133849537, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDCB2MT511421\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133849814, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4BKL6MR254436\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133850232, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4AKJXLR180302\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133850525, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1AB8CV5MY263300\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133850802, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CP5CV6NL522318\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133851134, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKKNSLS6KZ233905\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133858634, \"source\": \"custom_dealercrawl\", \"vin\": \"JTMDJREVXJD202989\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133858979, \"source\": 
\"custom_dealercrawl\", \"vin\": \"5N1AZ2AS6MC147137\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133859326, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG1FL518979\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133859618, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWFG2GL180104\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133859934, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG9GL226000\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133860238, \"source\": \"custom_dealercrawl\", \"vin\": \"WMW13DJ01N2S48557\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133860494, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKPDRS9MZ151849\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133860801, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKNHRS1PZ177153\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", 
\"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133861126, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1DL1GS1PC352045\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133861387, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKNGRS4KZ256044\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133870550, \"source\": \"custom_dealercrawl\", \"vin\": \"WA1VWBF78ND001041\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133870906, \"source\": \"custom_dealercrawl\", \"vin\": \"19UUB1F55LA012441\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133871236, \"source\": \"custom_dealercrawl\", \"vin\": \"19UUB3F39LA001355\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133871537, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKNDRS7MZ110528\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133871899, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKNDRS4NZ115963\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133872184, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKNDRS1NZ111403\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133872441, \"source\": \"custom_dealercrawl\", \"vin\": \"19UDE4H68PA020232\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133872689, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6AA1E56HN529816\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133887049, \"source\": \"custom_dealercrawl\", \"vin\": \"19UDE4H68PA011188\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133887400, \"source\": 
\"custom_dealercrawl\", \"vin\": \"1GYFZDR4XLF127072\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133887674, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYFZDR4XKF212265\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133887996, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBG0MC565936\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133888359, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6DX5RKXL0135783\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133888666, \"source\": \"custom_dealercrawl\", \"vin\": \"5J8TC2H33KL039411\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133888980, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN4NM496905\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133889300, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6DS1E38C0117529\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133889537, \"source\": \"custom_dealercrawl\", \"vin\": \"5J8YD4H88LL013138\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.speedcraftvw.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133905875, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBG8LC415281\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386133906219, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBG0LC273797\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133906455, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFAGXKC853932\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133906700, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFAG4LC242439\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133907041, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJMDXXKD299938\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133907382, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXEG3MW561939\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386133907659, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXAGXPW581986\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133907962, \"source\": \"custom_dealercrawl\", \"vin\": \"19XFL2H86NE014300\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.blasiuschevrolet.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133908232, \"source\": 
\"custom_dealercrawl\", \"vin\": \"1C4BJWDG7FL766041\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133976110, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG2FL585896\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133976476, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG5GL110244\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133976837, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJWEG4EL203990\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133977140, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJWEG0CL201943\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133977383, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJWEG6HL502319\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386133977633, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWEG4HL631049\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", 
\"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133977956, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWEG2DL574683\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386133978248, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG7GL266690\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386133978473, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG2EL115933\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386134023945, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG8GL289248\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386134024319, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXDN1MW544542\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386134024610, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG7HL635002\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386134025015, \"source\": 
\"custom_dealercrawl\", \"vin\": \"1C4HJWDG5CL140882\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386134025300, \"source\": \"custom_dealercrawl\", \"vin\": \"1J8GA59138L587270\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386134025560, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG8FL685517\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386134025847, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWEGXFL529140\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386134026152, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG4FL745289\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386134026519, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWEG8HL751694\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386134026777, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWDG8DL594924\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto 
Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386134155775, \"source\": \"custom_dealercrawl\", \"vin\": \"1J4BA3H17AL138715\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386134156150, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWEG7GL284881\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"thejeepdepot.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386134998956, \"source\": \"custom_dealercrawl\", \"vin\": \"JTDKAMFU6N3166469\", \"vid\": \"dc91a3f80a0e0a995e11ea711d17f70f\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386135016926, \"source\": \"custom_dealercrawl\", \"vin\": \"4T1C11AK3LU369735\", \"vid\": \"e6f6f8360a0e0a992cac085c2ced42f5\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386135017300, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1ZD5ST1NF121455\", \"vid\": \"e577a5f20a0e0a936891e4acda13017a\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386135017620, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6DB5RK7N0116335\", \"vid\": \"32b380bf0a0e0a906c8422562b1409de\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386135017935, \"source\": \"custom_dealercrawl\", \"vin\": \"2T3WFREV4GW288225\", \"vid\": \"4dfc33f70a0e094a3883bfc33cd2fa8a\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386135018250, \"source\": \"custom_dealercrawl\", \"vin\": \"3CZRU5H51NM739353\", \"vid\": \"dc91a04c0a0e0a995e11ea7109195bb5\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386135018488, \"source\": \"custom_dealercrawl\", \"vin\": \"2HGFC2F54HH533722\", \"vid\": \"ae684a6f0a0e0a9057a7066da9769c19\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386135155940, \"source\": \"custom_dealercrawl\", \"vin\": \"5J6RT6H53NL023709\", \"vid\": \"dc91a6120a0e0a995e11ea719977e5ab\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386135156346, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6HJTFG9NL120530\", \"vid\": \"ff35ad660a0e094a1e8bea621ee5b473\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386135156658, \"source\": \"custom_dealercrawl\", \"vin\": \"1HD1KBM15CB624431\", \"vid\": \"306e8c1e0a0e0a90290b5a5cca8f87a3\", \"date\": \"2023-10-26\", \"domain\": \"www.shapentoyota.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136305075, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD5RE701490\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136354464, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK33DF4RG157196\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": 
\"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136354835, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRL4LC7PG230943\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136355170, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD3PE621757\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136355509, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD5RE701974\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136355870, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK6CDF9RG154740\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136356173, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD0PE617505\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136356482, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD2PE622723\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136356833, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD1PE613530\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136357142, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD7PE618750\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136357426, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD7PE621597\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136381496, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD0PE621926\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136381837, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD9PE684741\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136382147, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD5RE706320\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136382413, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24ADXPE690516\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136382675, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD1RE702944\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136382999, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD9PE690636\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136383344, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD5RE701358\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136383655, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD2RE704721\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136383926, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD2RE695468\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136384201, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJ23AU9R7900577\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136398802, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJ23AU1P7894254\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136399132, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD4PE666194\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136399497, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF54AD2RE704659\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136399828, \"source\": \"custom_dealercrawl\", \"vin\": 
\"5XXG64J28RG251896\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136400111, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J2XRG253990\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136400411, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPU3DF6R7217821\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136400676, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J27RG250822\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136400986, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK33DF5RG150077\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136401313, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPUCDF7R7222896\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136405694, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK33DF6RG161430\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136406109, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRG4LC6PG237239\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136406389, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRG4LCXPG238068\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136406660, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRG4LC8PG230289\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136407124, \"source\": \"custom_dealercrawl\", \"vin\": 
\"5XYRL4LC2PG231207\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136421565, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK53DF8RG144509\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136421909, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK53DF6RG151877\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136422226, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRH4LF1PG230254\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136422494, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRH4LF5PG225302\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136422815, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK7CDF3RG154665\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136423076, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK7CDF3RG151832\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136423373, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRH4LF4PG230054\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136423658, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK7CDF7RG162302\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136423989, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP64GC0RG444118\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136451599, \"source\": \"custom_dealercrawl\", 
\"vin\": \"5XYRHDLF4PG226845\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136451954, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP3DGC3RG450732\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136452294, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDRMDLH4P5217245\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136452625, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP5DGC4RG456873\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136452888, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP5DGC2RG452918\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136453167, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD2RE708154\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136453393, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD8RE708241\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", 
\"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136453633, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD6RE708948\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136453893, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J20RG251892\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136454146, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK33DF5RG160656\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136471573, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG44J88RG254680\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136471898, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG34J27RG251333\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136472169, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRH4LF8PG231983\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 
1698386136472398, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRH4LF9PG228638\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136472652, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP5DGC0RG450892\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136472984, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYPH4A19HG277077\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136473264, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJN2A2XK7010658\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136473510, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD8ME324436\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136473817, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXGU4L38JG248299\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136474118, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJ23AU5L7704613\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136502735, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJ23AU8N7168307\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136503071, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJ23AU6M7790693\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136503394, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF24AD5LE247104\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136503676, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF44AC5PE525653\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136503940, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF44AC9NE483050\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136504224, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF44ACXPE570068\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136504469, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDMA5C14L6583416\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136504767, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J24NG144841\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136505125, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPF54AD5PE662887\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136505424, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPNCAC5M7891992\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136516101, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPU3AG4P7059375\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136516433, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPRCA67L7744778\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136516736, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP3DHC1MG101186\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136517054, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK53AF4PG062486\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136517316, \"source\": \"custom_dealercrawl\", \"vin\": \"KNAE35LC3M6092222\", 
\"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136517542, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP34HC8NG225957\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136517788, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRK4LF7NG150987\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136518127, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK5CDF2RG146465\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136518383, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP5DGC7PG355453\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136535933, \"source\": \"custom_dealercrawl\", \"vin\": \"2GNALDEK4C6216921\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136536251, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6AD0EV6FN733787\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136536484, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4PDDEG1JT277972\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136536770, \"source\": \"custom_dealercrawl\", \"vin\": \"5NPE34AF6KH820628\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136537073, \"source\": \"custom_dealercrawl\", \"vin\": \"3VV3B7AX7KM131651\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136537424, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMCU9H90LUB85622\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136537678, \"source\": \"custom_dealercrawl\", \"vin\": \"4S4BTADC7M3194424\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136537963, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8J33AL0LU119214\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136538243, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXDG5JW240322\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136538477, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTER4EH9LLA76742\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136558046, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEZU5JR2G5117223\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136558350, \"source\": \"custom_dealercrawl\", \"vin\": \"4T1K61AK8NU717125\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136558602, \"source\": \"custom_dealercrawl\", \"vin\": \"3FMCR9B64NRE22333\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136558916, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDDZRFH4KS976680\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136559193, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFAZ5CN1MX114472\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136559514, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJKAG7N8559081\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136559789, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFAY5F18KX849412\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136560092, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN5MM372527\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136560334, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1GCPYFEDXKZ134891\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136560558, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1FF1R73P0127097\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136586505, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYS4KKJ7LR181476\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136586847, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1E84NFA16705\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136587129, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1RG9NFB37859\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136587448, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK6CAF2PG015406\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136587825, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24ADXPE528749\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136588097, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDNB4H37N6064109\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136588357, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPU3AF2P7087411\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136588615, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDCC3LCXN5533088\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136588910, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPRCA6XN7009537\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136607117, \"source\": \"custom_dealercrawl\", \"vin\": 
\"5XYP5DHC2NG320217\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136607458, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRGDLC0NG100939\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136607729, \"source\": \"custom_dealercrawl\", \"vin\": \"KNAE55LC2N6120376\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136608072, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDPN3AC2M7853460\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136608320, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRLDLCXNG094997\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136608559, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPFK4A7XJE194018\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136608888, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J25MG078993\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136609176, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK6CAF6PG099097\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136609479, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYP34HC1MG122958\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136609710, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJMDX5KD476797\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136655889, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1E83NFA67922\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136656205, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAL3EK9ES607576\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136656513, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1AB8BV6NY210611\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136656822, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFLA5DB8NX002984\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136657146, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJKBG3M8109148\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136657424, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXEN9MW634584\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136657649, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFLA5DB2NX013205\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136696239, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJKEGXM8111412\", \"vid\": 
\"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136696522, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDZAG8MH560548\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136696840, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8AY2NC8L9621163\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136697123, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMJB3AE6NH123464\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136697401, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RDHDGXMC626841\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136697648, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJCBB2NT187593\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136697991, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6AA1EF4NN105622\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": 
\"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136698272, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RDHDGXMC608226\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136698521, \"source\": \"custom_dealercrawl\", \"vin\": \"2T3C1RFVXMW105342\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136726637, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6JJTBG3ML597895\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136726998, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG8NR138591\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136727362, \"source\": \"custom_dealercrawl\", \"vin\": \"5NPEJ4J23NH134442\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136727641, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKKNMLA7KZ197077\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136727935, \"source\": \"custom_dealercrawl\", \"vin\": \"1N4BL4DV9NN315392\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136728249, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJEAG5MC595235\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136728506, \"source\": \"custom_dealercrawl\", \"vin\": \"4T3LWRFV5NU050513\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136728760, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6HJTAG9ML506340\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136729088, \"source\": \"custom_dealercrawl\", \"vin\": 
\"2C3CDZFJ6MH655209\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136729337, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXKEV0NL100567\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136751446, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFJA5DBXNX009152\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136751760, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEBU5JR5L5749774\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136752117, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREFT4NN464082\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136752422, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1AT3BA7MC801608\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136752695, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMSK7FH2NGA51809\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car 
Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136753002, \"source\": \"custom_dealercrawl\", \"vin\": \"JTMC1RFV7LD051915\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136753311, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXDN3LW169980\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136753608, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJLMX3ND521279\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136753904, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREBT3NN231686\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136754178, \"source\": \"custom_dealercrawl\", \"vin\": \"1HGCV1F42JA018494\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136794988, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFEY5F19MX276542\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136795383, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREFT6NN372908\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136795681, \"source\": \"custom_dealercrawl\", \"vin\": \"2FMPK3K93MBA60313\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136795996, \"source\": \"custom_dealercrawl\", \"vin\": \"2FMPK4J93NBA90520\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136796255, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXEN4LW115117\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136796540, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDZFJ5NH256907\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136796849, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDDB8NT155217\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136797162, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS24AJ9NH377743\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136797405, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS4DALXNH467057\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136855490, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFJC5DB4PX021326\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136855856, \"source\": \"custom_dealercrawl\", \"vin\": \"JTMCY7AJ0M4102192\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136856165, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFNA5ECXNX008866\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136856414, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8J33AL7MU350113\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136856665, \"source\": \"custom_dealercrawl\", \"vin\": \"KMHRC8A33PU227154\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136856983, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPWDED0KZ231270\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136857271, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG9NR109309\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136857497, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6AA1EF4NN103272\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136857775, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJKBG8M8135289\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136858067, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMDZ5BN5NM130647\", 
\"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136894091, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RR7KG0NS188148\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136894388, \"source\": \"custom_dealercrawl\", \"vin\": \"1LN6L9NP7L5607706\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136894690, \"source\": \"custom_dealercrawl\", \"vin\": \"4T1K61AK5PU151251\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136895004, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8R54HE4MU295193\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136895303, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJJDG8M8152293\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136895597, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREFT0NN126517\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136895904, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS2HKJXKR258293\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136896185, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CP5CV9ML524515\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136896442, \"source\": \"custom_dealercrawl\", \"vin\": \"WBA13AL04N7K04177\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136896728, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6SRFFT9PN642286\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136906072, \"source\": \"custom_dealercrawl\", \"vin\": \"4JGFB4KB4MA375072\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136906367, \"source\": \"custom_dealercrawl\", \"vin\": 
\"KM8J33A46MU318049\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136935195, \"source\": \"custom_dealercrawl\", \"vin\": \"2T3E6RFV4MW010883\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136935497, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMJJ3LT8MEL05501\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136935813, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMCU0H62LUC34238\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136936143, \"source\": \"custom_dealercrawl\", \"vin\": \"3GKALPEVXML376629\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386136936420, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ6J96NBL12115\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136936697, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1DR3CC6NC265344\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136936999, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS24AJXNH457634\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136937280, \"source\": \"custom_dealercrawl\", \"vin\": \"5NTJDDAF7PH044577\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136937504, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8J33AL5MU389332\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136978979, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6HJTFG5NL131587\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136979354, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1FTEW1E51LFA62166\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136979635, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKKNXLS9MZ226320\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136979944, \"source\": \"custom_dealercrawl\", \"vin\": \"1N4BL4CV9NN349561\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136980215, \"source\": \"custom_dealercrawl\", \"vin\": \"5NPLM4AG3MH018635\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136980457, \"source\": \"custom_dealercrawl\", \"vin\": \"7FARW2H80KE055200\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386136980719, \"source\": \"custom_dealercrawl\", \"vin\": \"WDDUG6GB5JA396421\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386136981037, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXFG3LW200787\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz 
of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386136981337, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKGKL5MR349036\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386136981562, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS24AJ9PH538482\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137014586, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXDN7MW544609\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137014937, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJMMB8PD100719\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137015218, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMJU1LT9MEA29019\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137015541, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDBB6NT120808\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137015861, \"source\": \"custom_dealercrawl\", \"vin\": \"3FMCR9D98NRD55677\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137016145, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFCZ5AN5MX253287\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137016397, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFLG7KC784727\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137016691, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREJT3MN625026\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137017025, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVNKW3NJ159111\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137017298, \"source\": \"custom_dealercrawl\", \"vin\": \"3GKALVEV3LL191860\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137047061, \"source\": \"custom_dealercrawl\", \"vin\": 
\"5NMS5DALXMH367441\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137047364, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4JJXR63MW727878\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137047644, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMCU0G66NUA69542\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137047945, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1AT3BA7MC707728\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137048205, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1FB1RX3N0119297\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137048539, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFRM5F18KX139294\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137048818, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS2HKD8NR106754\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": 
\"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137049103, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDCB6NT203069\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137049399, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4PDCBG1JT476647\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137049688, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWJL7AT8EM608802\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137092556, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFAZ5CN1MX096510\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137092890, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJMCB2LD612178\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137093221, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6JJTBM1NL146827\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137093471, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYGED2NZ172158\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137093713, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6AA1CF7NN100630\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137094003, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEBU5JR0F5224951\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137094267, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDXBG5NH141030\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137094515, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXKEV9NS200169\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137094756, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJEBG4NC125570\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137141293, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS64AJ4NH433284\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137141642, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8R74GE7PU583886\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137141977, \"source\": \"custom_dealercrawl\", \"vin\": \"3FTTW8E94NRA67662\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137142302, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEBU5JR0L5768605\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137142532, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CP5BV5ML516185\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137142806, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFBG9KC746160\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137143116, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREHT0NN260957\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137143415, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN6LM301707\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137143689, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFMA5DB3PX079687\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137144001, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJMCBXLD626328\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137155683, \"source\": \"custom_dealercrawl\", \"vin\": \"1FA6P8TH5M5101765\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137156005, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCGTCEN3M1134271\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137156300, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5DL2MG652975\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137406128, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CN8BV7ML878526\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137497913, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUDFED5RG145371\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137498223, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCNDAEK3RG148565\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137498538, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCNDAED2RG142676\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137498873, \"source\": \"custom_dealercrawl\", \"vin\": \"2GCUDDEDXP1126700\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137499197, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPDKEKXPZ276927\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137499491, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRDAED1PZ307454\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137499824, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1GCUDEED6PZ311131\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137500075, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUDDED0PG332079\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137500371, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDAED3PZ314013\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137525110, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED8PZ311616\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137525482, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEG7RL112394\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137525768, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEG6RL112516\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137526061, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRDDED0RZ107394\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": 
\"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137526366, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVGKW2PJ299648\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137526601, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPDKEK2RZ114700\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137526921, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPDKEK8RZ109128\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137527181, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPSCEK0P1230517\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137527476, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXXEG0RL144623\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137555500, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MTSL8PB216713\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 
1698386137555814, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1ZE5ST5RF116942\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137556115, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED2RZ109292\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137556425, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MMS23RB033154\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137556694, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVJKW2PJ314226\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137557050, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED0RZ138130\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137557359, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YME70R1147170\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137557626, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDEED5RZ143825\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137557936, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXHEG4RL170793\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137570833, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXSEG5RL172078\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137571130, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEG1RS155585\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137571443, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKNKD7PR469941\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137571716, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXHEG1RL179807\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137571964, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKBKRS9RS158000\", 
\"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137572270, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC5YME7XRF229050\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137572552, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUDJEL0RZ163772\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137572884, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEG2RS163341\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137573178, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YYEY9RF257949\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137573418, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUDHE88RG140311\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137589079, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUDEE88PG337006\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137589386, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCNDAED7RG142673\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137589740, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEG2RL198200\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137590072, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKBKRS3RS162320\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137590367, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKBKRS5RS145146\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137590619, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKDCRJ0RS145448\", \"vid\": \"\", \"date\": 
\"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137590937, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MRSLXRB063628\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137591238, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUDCED0RG147088\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137591537, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKBJR48RS165726\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137599682, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKPKDXPR543753\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137600042, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKBLRS9RS146922\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137600325, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKRKD1PR483812\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select 
Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137600607, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1CP5DV9LL514774\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137600976, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMJF3AEXPH197347\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137601261, \"source\": \"custom_dealercrawl\", \"vin\": \"5UXTY3C02M9F02255\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137601521, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MTSL1NB140880\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137601832, \"source\": \"custom_dealercrawl\", \"vin\": \"KL7CJLSB4MB347333\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137602091, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFAG9JC244958\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137602346, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJJBG5N8547908\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137607014, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6ED0EBXLN727929\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137607334, \"source\": \"custom_dealercrawl\", \"vin\": \"3TYAX5GN7NT043922\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137607620, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MMS24RB084260\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137607965, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MTSL0RB079298\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137608229, \"source\": \"custom_dealercrawl\", \"vin\": \"JTJDY7AX6L4315717\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137608527, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8JFCA1XNU013250\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137608814, \"source\": \"custom_dealercrawl\", \"vin\": \"2GC4YPEY0R1161984\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137609111, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKSKD6PR535717\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137609392, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPDKEK5RZ180187\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137615588, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1FW6S04P4186257\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137615978, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5DL7MG579196\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137616273, \"source\": \"custom_dealercrawl\", \"vin\": 
\"5NMJE3AE9NH027254\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137616560, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMJF3AE2NH064546\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137616839, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS2DAJ3NH438519\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137617233, \"source\": \"custom_dealercrawl\", \"vin\": \"5NPEL4JA0MH090061\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137617530, \"source\": \"custom_dealercrawl\", \"vin\": \"JTDS4MCE1MJ064897\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137617793, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4SJVBT0NS110875\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137618153, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCRYDEDXKZ171288\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137618440, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YNE79MF312088\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137627979, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCPYFED2MG329385\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137628288, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YREY7MF270973\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137628546, \"source\": \"custom_dealercrawl\", \"vin\": \"3GTU2PEJ2JG106243\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137628878, \"source\": \"custom_dealercrawl\", \"vin\": \"JTDKARFP8J3094207\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137629174, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWFG4EL243426\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137629472, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXSEV1LS606086\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137629767, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFPC5DB3NX005917\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137630088, \"source\": \"custom_dealercrawl\", \"vin\": \"KL79MUSL8PB026139\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137630377, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1C6JJTEGXML621380\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137630660, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMJU1HT7MEA04323\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137655725, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5FLXNG161536\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137656059, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCUYDET4LG395966\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137656329, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYEEL6LZ229051\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137656561, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8AY2BA4M9375216\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137675234, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNKBBRA6KS615651\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137675559, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S51JU137784\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137675959, \"source\": \"custom_dealercrawl\", \"vin\": \"1C3CDFBB4FD180883\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137676253, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S51JU128051\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137676498, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S56JU127221\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137676741, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S56JU139174\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137677064, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S5XJU142885\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137677345, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S54JU139545\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137677569, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S56JU140938\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137703733, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RC6S55JU149906\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137704069, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKKNMLA0KZ286201\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137704350, \"source\": \"custom_dealercrawl\", \"vin\": \"3GTU2PEJ3JG472163\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137704594, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1AB7AP7DL621699\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137704915, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTER4FH4KLA37926\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137705239, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1E85PFB86912\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137705475, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCPWBEK5NZ229232\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137705733, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS2GKCXKR318195\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137706008, \"source\": \"custom_dealercrawl\", \"vin\": \"1N4AL3AP4GC124621\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137706295, \"source\": \"custom_dealercrawl\", \"vin\": \"2FMGK5C86DBD17093\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137724231, \"source\": \"custom_dealercrawl\", \"vin\": \"2HKRM4H75EH685840\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137724533, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1JC6SH1G4124100\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137724834, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJEAG1KC662443\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137725120, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNCJPSB7JL157831\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137725386, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDXGJ6MH539835\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137725631, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNTKGE76DG304982\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137725971, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5FL9NG147627\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137726302, \"source\": \"custom_dealercrawl\", \"vin\": \"5GAEVBKW3NJ109995\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137726528, \"source\": \"custom_dealercrawl\", \"vin\": \"5J6RW1H83NA022107\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137726771, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8J33AL0MU283614\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137744913, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8R74HE5NU487492\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137745217, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6SRFET8LN322100\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137745474, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1RA6S57HU178629\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137745705, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXSEV2JS570423\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137746031, \"source\": \"custom_dealercrawl\", \"vin\": \"ZACNJDD14MPM38918\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137746315, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YUEY9MF202211\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137746553, \"source\": \"custom_dealercrawl\", \"vin\": \"1HGCV1F59KA126704\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137746870, \"source\": \"custom_dealercrawl\", \"vin\": \"3C63RRJL7NG292014\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137747154, \"source\": \"custom_dealercrawl\", \"vin\": \"3FA6P0CD4KR265016\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137761896, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJMBX3KD194225\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137762225, \"source\": \"custom_dealercrawl\", \"vin\": 
\"2C4RC1DG8MR538596\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137762465, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG8MR598218\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137762717, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG3MR512006\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137763049, \"source\": \"custom_dealercrawl\", \"vin\": \"2GNAXLEX8L6202238\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137763383, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W2BT1CEC44876\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137763665, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG9MR527707\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137763991, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG4MR574627\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137764258, \"source\": \"custom_dealercrawl\", \"vin\": \"2C4RC1BG5MR589377\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137764487, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKCKC9KR168480\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137776577, \"source\": \"custom_dealercrawl\", \"vin\": \"7FARW2H90JE064373\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137776958, \"source\": \"custom_dealercrawl\", \"vin\": 
\"5XYP2DHC1MG189126\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137777243, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1EP1LFB60955\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137777506, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMJF3AE7PH198584\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137777792, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCUYEED5KZ218405\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137778138, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4AJWAG7FL616415\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137778399, \"source\": \"custom_dealercrawl\", \"vin\": \"5YJ3E1EAXLF806293\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137778681, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8AZ2AF8M9718538\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137778999, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMSK8DH7LGB92168\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137779259, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6KJ5RS2GU155131\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137792080, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4BJWEG2HL600141\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137792375, \"source\": \"custom_dealercrawl\", \"vin\": 
\"1FTEW1E45KFB48523\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137792639, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC2KVEG1JZ244548\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137792918, \"source\": \"custom_dealercrawl\", \"vin\": \"YV4102PK2L1558063\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137793155, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1EG7JKE84707\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137793422, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1E55MFA26101\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137793664, \"source\": \"custom_dealercrawl\", \"vin\": \"1GC4YRE75PF120808\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137793999, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS2HKJ7KR403287\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", 
\"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137794260, \"source\": \"custom_dealercrawl\", \"vin\": \"1N4AL21E49N555332\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137815725, \"source\": \"custom_dealercrawl\", \"vin\": \"3FADP4DJXKM117260\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137816049, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXKEV8MS129576\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137816338, \"source\": \"custom_dealercrawl\", \"vin\": \"1GT42YEY5JF102806\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137816627, \"source\": \"custom_dealercrawl\", \"vin\": \"3GTUUGE88PG169282\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137816954, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1YC3D45N5120114\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137817237, 
\"source\": \"custom_dealercrawl\", \"vin\": \"3VWD07AJ8FM407391\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137817481, \"source\": \"custom_dealercrawl\", \"vin\": \"4JGFB4JB1NA635690\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137817768, \"source\": \"custom_dealercrawl\", \"vin\": \"3GTP2VE38BG376842\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.denooyerchevy.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137818072, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8AY2ND5LX015310\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137818330, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8JFCA17PU130559\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137848345, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMSK7DH4LGB11136\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137848717, \"source\": \"custom_dealercrawl\", \"vin\": \"1GKS1JKL2MR493317\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", 
\"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137849024, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDZFJ6MH594248\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137849321, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5DJXMG554629\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137849557, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNAXUEV8NL189736\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137849887, \"source\": \"custom_dealercrawl\", \"vin\": \"5NMS24AJ1NH444545\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137850164, \"source\": \"custom_dealercrawl\", \"vin\": \"SHHFK8G73LU200431\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137850443, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6HJTAG4LL126434\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137850677, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6SRFBT0LN253424\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137850988, \"source\": \"custom_dealercrawl\", \"vin\": \"JA4ARUAU7NU008285\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137879804, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4HJXDG6NW181352\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137880135, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCRCSE00AG142879\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137880396, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1AZ2BJ1MC144068\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137880659, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEAAAAH4MJ054957\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137880997, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8SRDHF5HU180607\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137881289, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1CP1NKD56276\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137881529, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTMF1C5XMKE39178\", \"vid\": \"\", 
\"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137881847, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDGY5B15NS189825\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137882144, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8AY2NE1L9781225\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137882413, \"source\": \"custom_dealercrawl\", \"vin\": \"JTEHU5JR7M5960812\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137905725, \"source\": \"custom_dealercrawl\", \"vin\": \"1N4AA6CV7MC506947\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137906072, \"source\": \"custom_dealercrawl\", \"vin\": \"3GCPABEK3NG512100\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137906358, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWDP7AJXCM365815\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", 
\"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137906590, \"source\": \"custom_dealercrawl\", \"vin\": \"JA4J3UA81NZ021668\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137906881, \"source\": \"custom_dealercrawl\", \"vin\": \"SALRRBBV5HA028828\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137907155, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFAG4JC166444\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137936272, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6HJTAG6LL211646\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137936575, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSCAKC5KR254077\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137936950, \"source\": \"custom_dealercrawl\", \"vin\": \"2T2HZMAA5LC181352\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137937224, \"source\": \"custom_dealercrawl\", \"vin\": \"JM1BL1K6XB1493156\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137937468, \"source\": \"custom_dealercrawl\", \"vin\": \"ZACNJBD15KPJ99803\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137937696, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREBG6NN177836\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137938000, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMSK7FH4NGA51990\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137938288, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W2BN5LEE33760\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137938522, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMYJ9YY5PNL02646\", \"vid\": \"e368e6420a0e0a9350297a7f1dbf8c95\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137946566, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1ZE5ST7GF310885\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137946881, \"source\": \"custom_dealercrawl\", \"vin\": \"5LM5J7XC9PGL22819\", \"vid\": \"49043f600a0e0a9448ce3dca6c11e8f1\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137947182, \"source\": \"custom_dealercrawl\", \"vin\": \"5LM5J7XC2PGL21902\", \"vid\": \"81007aaf0a0e0a9a31c50ba54c4a67db\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137947488, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDXGJ3NH261798\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137947759, \"source\": \"custom_dealercrawl\", \"vin\": \"5GAERBKW3PJ156810\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": 
\"84020\"}}\n{\"timestamp\": 1698386137948093, \"source\": \"custom_dealercrawl\", \"vin\": \"5TDDZRBH4MS524074\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137948366, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DA9PUL22544\", \"vid\": \"0e22dc460a0e0a9a61540bbc5594e809\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137948669, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DA4PUL23164\", \"vid\": \"1b13aae80a0e0a9a61540bbc94781d68\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137948999, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DAXPUL25985\", \"vid\": \"211188f90a0e0a904cc2c9a75e2fcabc\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137958532, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RREBT5NN201475\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137958861, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMJK2AT3MEA19567\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137959106, \"source\": \"custom_dealercrawl\", \"vin\": \"1HGCV1F16LA108806\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137959432, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DA1PUL22960\", \"vid\": \"251ff63a0a0e087f4292470254c06e82\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137959716, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K92PBL25585\", \"vid\": \"2111862b0a0e0a904cc2c9a7e33f7570\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137960032, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K91PBL26842\", \"vid\": \"2cd0a5ad0a0e0a92527f8edc2e762e3a\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137960325, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K91PBL25884\", \"vid\": \"2cd0ac6e0a0e0a940a1dcbad7f6a1d02\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137960576, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDZAG5MH534344\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137960884, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJCCB4LT231170\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386137975493, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K9XPBL24152\", \"vid\": \"33fa75590a0e0a926c4309f6f6b4334d\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137975845, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K9XPBL22997\", \"vid\": \"452c43220a0e087f102b3b1d2e421a63\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386137976116, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K9XPBL24815\", \"vid\": \"4eea3f940a0e0a992b96b96bdd79410b\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", 
\"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137976388, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K90PBL29702\", \"vid\": \"561a30dd0a0e0a992b96b96b83982840\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137976631, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K90PBL22166\", \"vid\": \"793efd8e0a0e0a912674cfa5ccc05176\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386137976961, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K99PBL21114\", \"vid\": \"7f3139130a0e0a933ddc904ab161b393\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137977233, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K98PBL21511\", \"vid\": \"7f313edc0a0e0a933ddc904abf73e022\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386137977477, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K92PBL24372\", \"vid\": \"810071490a0e0a946c889dafee59ccca\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386137977705, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA2PUL22699\", \"vid\": \"4eea45480a0e0a992b96b96b03633e44\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138008968, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA2PUL25571\", \"vid\": \"4eea478f0a0e0a9169ea2c186ad93338\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138009265, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTEW1C89MKD57358\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138009501, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ9JP3KBL39106\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138009768, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA2PUL22153\", \"vid\": \"636d0e230a0e0a99740e454b907d3269\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138010085, \"source\": \"custom_dealercrawl\", \"vin\": \"4T1K61AKXPU729019\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138010346, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA2PUL20452\", \"vid\": \"8da53bab0a0e0a9033941ce1aa090bdb\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138010684, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA1PUL20863\", \"vid\": \"9072aaa40a0e0a9157b4dc7f15f2b848\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138011024, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA6PUL20552\", \"vid\": \"b6a36f3a0a0e0a934f6f2292ff376a40\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138011363, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA0PUL20692\", \"vid\": \"c19953780a0e087f29bc94ace939fba5\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138033517, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA6PUL20759\", \"vid\": \"c19958e50a0e087f29bc94ac1d076e00\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138033868, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W2BTXGEB94033\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138034178, \"source\": \"custom_dealercrawl\", \"vin\": \"1G1FD1RS6P0133096\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138034454, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K96PBL23340\", \"vid\": \"0f9c3e9c0a0e0a936ce9110d5b4e5e5b\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138034726, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K93PBL23019\", \"vid\": \"0f9c412a0a0e0a936ce9110dcb98ca02\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138035103, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDDB2MT598638\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138035425, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K99PBL25583\", \"vid\": \"2111836d0a0e0a904cc2c9a7906bcb05\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138035744, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DA6PUL21061\", \"vid\": \"292785750a0e087f102b3b1d73003275\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138036078, \"source\": \"custom_dealercrawl\", \"vin\": \"3N1AB7AP3GY219556\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138068734, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DA9PUL26285\", \"vid\": \"2f52432c0a0e0a992b96b96b777a9a06\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138069096, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DAXPUL07812\", \"vid\": \"3eb562b80a0e0a9315b6be49cba0bd43\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of 
Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138069336, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMJJ2KT7MEL00866\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138069695, \"source\": \"custom_dealercrawl\", \"vin\": \"JTDEPMAE2MJ170203\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138070030, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2DA8PUL19392\", \"vid\": \"70ead7ae0a0e0a931b3f4bbcb6eed50b\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138070329, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8JF3AE0NU144531\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138070576, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA7PUL25517\", \"vid\": \"21117ff60a0e0a904cc2c9a7b2539a22\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138070949, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA9PUL21632\", \"vid\": \"22d009240a0e0a904cc2c9a7505f6af0\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138071243, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RR7LT1MS516476\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138103554, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA5PUL23152\", \"vid\": \"2cd0a7c30a0e0a940a1dcbadd338bf00\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138103927, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKHKC3JR373238\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138104192, \"source\": \"custom_dealercrawl\", \"vin\": \"1GT12TEG5GF124312\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138104509, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1CA9PUL22135\", \"vid\": \"44821d2d0a0e0a90290b5a5c2dd63fdc\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138104852, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTUUDEL2NZ504861\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138105198, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1DA8PUL22125\", \"vid\": \"4eea3c640a0e094a3883bfc37a4a2ae1\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138105482, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K95PBL20817\", \"vid\": \"ed61e5820a0e081d1b14d1fde2a437d0\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 
1698386138105751, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K99PBL26412\", \"vid\": \"f6d8cd970a0e0a9a61540bbca562695a\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138106058, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K98PBL25820\", \"vid\": \"f6d8d0010a0e0a9a61540bbcb53cb21f\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138150417, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5DL2MG605137\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138150729, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5DL6NG237675\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138151054, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8J95PBL25047\", \"vid\": \"561a32650a0e0a992b96b96b42af7b4f\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138151373, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMJJ2LG9PEL11406\", \"vid\": \"070302900a0e0a9a6d5216d69b6254bc\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138151647, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMJJ2LG7PEL08777\", \"vid\": \"0b568b240a0e0a934b7bca1ad31a3920\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138151939, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMJJ3LGXPEL05555\", \"vid\": \"2c5e74e10a0e081d0793dde267eafd54\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138152230, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWC57BU3MM003966\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138152499, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K94PBL20842\", \"vid\": \"b6a375830a0e0a9174221517fb8cd624\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138152765, \"source\": \"custom_dealercrawl\", \"vin\": \"5UXCR6C0XLLL65224\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138153062, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8J33A44KU074849\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", 
\"zip\": \"84020\"}}\n{\"timestamp\": 1698386138155696, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K93PBL22890\", \"vid\": \"b6a388f90a0e087f1ff43b27278079b0\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138214559, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K95PBL21157\", \"vid\": \"bae40af70a0e094a74e2589d3e23e58b\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138214980, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8R54HE6LU156987\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138215337, \"source\": \"custom_dealercrawl\", \"vin\": \"WBXYJ3C34JEP75934\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138215628, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6RD7PT3CS211729\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138216014, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W2BT6KEC78570\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138216289, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1RG4PFC37774\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138216549, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCGSCEAXK1256106\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138216863, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNSKBKC7JR167420\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138217153, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K92PBL23612\", \"vid\": \"c3b38a640a0e094a69d51377b6b3ace9\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138240527, 
\"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K90PBL23608\", \"vid\": \"dd6e746f0a0e094a5fad348d5a56227c\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138240915, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K96PBL22267\", \"vid\": \"e18dc1530a0e0a90625213be23934685\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138241282, \"source\": \"custom_dealercrawl\", \"vin\": \"2FMDK4KC4ABB58167\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138241540, \"source\": \"custom_dealercrawl\", \"vin\": \"2T3C1RFV6NC209362\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138241814, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMPJ8K92PBL19298\", \"vid\": \"e64a6d190a0e094a72a865f2c1ca7879\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138242109, \"source\": \"custom_dealercrawl\", \"vin\": \"3C4NJDBN7PT517103\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138242364, \"source\": 
\"custom_dealercrawl\", \"vin\": \"5LMCJ2D96MUL17206\", \"vid\": \"5c888f390a0e0a91701c19688d571fe7\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138242627, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ2D95LUL27062\", \"vid\": \"65a7beef0a0e0a9a7139bc36dac452b2\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138242946, \"source\": \"custom_dealercrawl\", \"vin\": \"3VVDX7B25PM339141\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138272344, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1C90MUL25412\", \"vid\": \"510e3d970a0e087f42924702f0854360\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138272630, \"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ1D92MUL28147\", \"vid\": \"65a7c1460a0e0a9a7139bc3601f4e41c\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138273004, \"source\": \"custom_dealercrawl\", \"vin\": \"KL4CJASB8GB664070\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138273303, 
\"source\": \"custom_dealercrawl\", \"vin\": \"5LMCJ3D90KUL09346\", \"vid\": \"e45cee2a0a0e0a9260e76faadbdf3e97\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138273543, \"source\": \"custom_dealercrawl\", \"vin\": \"2LMHJ5AT8ABJ15645\", \"vid\": \"e2b6a6e00a0e0a9a0f753ea1d8b4b7ed\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138273895, \"source\": \"custom_dealercrawl\", \"vin\": \"KM8R4DHE2NU483201\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138274192, \"source\": \"custom_dealercrawl\", \"vin\": \"3LN6L2LU7GR617901\", \"vid\": \"3c805d3b0a0e081d114b2b417f2f1fbe\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138274469, \"source\": \"custom_dealercrawl\", \"vin\": \"1G6KD57Y69U130623\", \"vid\": \"dfbf7a870a0e094a0c1254ea94dd8c80\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138274705, \"source\": \"custom_dealercrawl\", \"vin\": \"2GNALCEK9F1156422\", \"vid\": \"e67defb00a0e094a5fad348d839627af\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": 
\"22310\"}}\n{\"timestamp\": 1698386138301130, \"source\": \"custom_dealercrawl\", \"vin\": \"1G11H5SL6EF247232\", \"vid\": \"d7a1a21b0a0e0a9410ae358d62433a35\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138301464, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDME5C17F6028890\", \"vid\": \"2d21b0f10a0e0a992b96b96b9b2f9be1\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138301904, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYPGDA36JG388051\", \"vid\": \"c26bec1c0a0e0a9a49b1b26ae6a085af\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138302222, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDJP3A55F7147782\", \"vid\": \"3c8061bb0a0e094a3883bfc36fecfacb\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138302523, \"source\": \"custom_dealercrawl\", \"vin\": \"WDCGG5HB6DF979212\", \"vid\": \"2d21b27f0a0e0a992b96b96b09c73640\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138302841, \"source\": \"custom_dealercrawl\", \"vin\": \"WD4PG2EE6H3316917\", \"vid\": \"1cd336bf0a0e087f78ac9134d392d371\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138303147, \"source\": \"custom_dealercrawl\", \"vin\": \"WDDPK4HA1DF055529\", \"vid\": \"673f6c4a0a0e081d455631da840515ba\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138303479, \"source\": \"custom_dealercrawl\", \"vin\": \"4A37L3ETXBE002156\", \"vid\": \"3c805eb90a0e081d114b2b41d737513c\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138303764, \"source\": \"custom_dealercrawl\", \"vin\": \"JN8AF5MV5CT109129\", \"vid\": \"2ec579b80a0e087f102b3b1dc061c674\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138304085, \"source\": \"custom_dealercrawl\", \"vin\": \"LRBFZNR45PD033256\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138353621, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4SJVBT9NS123429\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138353991, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK6CDF2RG150027\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138354304, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK6CDFXRG151927\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138354541, \"source\": \"custom_dealercrawl\", \"vin\": \"5N1AR2MM6FC703799\", \"vid\": \"90ca2b640a0e081d2055119d8144148e\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138354862, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJFCG0LC426192\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138355134, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6TRVBG6JE104146\", \"vid\": \"e45ce82c0a0e0a9a2d62f34ce9cd30f7\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138355431, \"source\": \"custom_dealercrawl\", \"vin\": \"1GYKPGRSXLZ101512\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138355709, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRK4LF0PG228710\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138356028, \"source\": \"custom_dealercrawl\", \"vin\": \"YV4902DZ0D2425320\", \"vid\": \"2d21afaa0a0e0a992b96b96bbc5065c9\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138372630, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYRKDLF6PG221339\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138372946, \"source\": \"custom_dealercrawl\", \"vin\": \"YV4A22PK5G1075263\", \"vid\": \"9b0fb40d0a0e0a9209f3bda229b8e03a\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138373271, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN3NM525116\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138373533, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1E8XMFB63514\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138373853, 
\"source\": \"custom_dealercrawl\", \"vin\": \"KNDC3DLC7P5156664\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138374135, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDC3DLC5P5150832\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138374379, \"source\": \"custom_dealercrawl\", \"vin\": \"1GNEVHKW7JJ117763\", \"vid\": \"f434a51d0a0e094a5fad348d81d4441b\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138374638, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4RJJDG2P8733437\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138374898, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD5PE616348\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138396113, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD9PE618703\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138396502, \"source\": \"custom_dealercrawl\", \"vin\": \"4S4BTANC7M3210359\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138396890, \"source\": \"custom_dealercrawl\", \"vin\": \"5GAEVBKW3KJ119700\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138397199, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT7W2BT3MED43801\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138397491, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24ADXPE611467\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138397820, \"source\": \"custom_dealercrawl\", 
\"vin\": \"1FT8W2BT9NED14213\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138398112, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA25AD4PE612104\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138398394, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24ADXPE606043\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138398636, \"source\": \"custom_dealercrawl\", \"vin\": \"3KPA24AD8PE613489\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138405847, \"source\": \"custom_dealercrawl\", \"vin\": \"3GNEC12078G294611\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138406131, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN0PM543303\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138406432, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4SDJETXCC104326\", \"vid\": \"e67df11e0a0e094a5fad348d36a283af\", \"date\": 
\"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138406724, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN4PM535009\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138407008, \"source\": \"custom_dealercrawl\", \"vin\": \"JF2GTACC1K8326039\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138407312, \"source\": \"custom_dealercrawl\", \"vin\": \"3TMCZ5AN5PM542017\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138417387, \"source\": \"custom_dealercrawl\", \"vin\": \"WA1A4AFY3J2176035\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138417706, \"source\": \"custom_dealercrawl\", \"vin\": \"3VWC57BU3KM100033\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. 
Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138418024, \"source\": \"custom_dealercrawl\", \"vin\": \"ZFBCFYDT6GP396129\", \"vid\": \"673f71400a0e081d455631daa6d41924\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138418289, \"source\": \"custom_dealercrawl\", \"vin\": \"1GCGTEEN5J1207440\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138418579, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J25RG253556\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138418875, \"source\": \"custom_dealercrawl\", \"vin\": \"1HGCV2F96JA008617\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138419198, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK33DF5RG163332\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138419492, \"source\": \"custom_dealercrawl\", \"vin\": \"2C3CDZFJ4MH507477\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 
1698386138419782, \"source\": \"custom_dealercrawl\", \"vin\": \"5TFDY5F12MX021288\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138433813, \"source\": \"custom_dealercrawl\", \"vin\": \"5XXG64J25RG251287\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138434140, \"source\": \"custom_dealercrawl\", \"vin\": \"1ZVBP8EM5C5279402\", \"vid\": \"3e071cff0a0e0a9169ea2c18c86e2ec6\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138434404, \"source\": \"custom_dealercrawl\", \"vin\": \"1GTV2UEH6FZ350999\", \"vid\": \"673f6a8c0a0e081d455631da145b193a\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138434653, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1E5XNKE75685\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138434976, \"source\": \"custom_dealercrawl\", \"vin\": \"KMHGN4JE6JU225675\", \"vid\": \"f434a3ef0a0e094a5fad348dca398bb7\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138435282, 
\"source\": \"custom_dealercrawl\", \"vin\": \"KMHGH4JH4GU103729\", \"vid\": \"0e38c41e0a0e0a92630d01ba2853198a\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138435554, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK33DF9RG158361\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138435891, \"source\": \"custom_dealercrawl\", \"vin\": \"19XFA1F5XAE072672\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138436173, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4PJLCB5KD164768\", \"vid\": \"673f6de60a0e081d455631da1f15c287\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138471288, \"source\": \"custom_dealercrawl\", \"vin\": \"1FT8W3BT5MED79385\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138471606, \"source\": \"custom_dealercrawl\", \"vin\": \"3CZRU6H10MM749521\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138471891, \"source\": \"custom_dealercrawl\", 
\"vin\": \"1C4RJEAG7JC429648\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138472178, \"source\": \"custom_dealercrawl\", \"vin\": \"1FMJK2AT3KEA77398\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138472414, \"source\": \"custom_dealercrawl\", \"vin\": \"1C4NJRBB0CD509587\", \"vid\": \"d5879ab10a0e0a93227b34edb258601a\", \"date\": \"2023-10-26\", \"domain\": \"www.southgatelincoln.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138472675, \"source\": \"custom_dealercrawl\", \"vin\": \"1C6SRFMT5NN307686\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"4\", \"name\": \"Hendrick Motors of Charlotte\", \"address\": \"5201 E INDEPENDENCE BLVD\", \"zip\": \"28212\"}}\n{\"timestamp\": 1698386138472985, \"source\": \"custom_dealercrawl\", \"vin\": \"KNDCR3LE2R5124881\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"2\", \"name\": \"Mercedes-Benz of Spokane\", \"address\": \"21802 E GEORGE GEE AVE\", \"zip\": \"99019\"}}\n{\"timestamp\": 1698386138473286, \"source\": \"custom_dealercrawl\", \"vin\": \"1FTFW1ED9PFC02837\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138473521, \"source\": \"custom_dealercrawl\", \"vin\": \"JTKKT624860169672\", \"vid\": \"82836027\", \"date\": \"2023-10-26\", 
\"domain\": \"www.napletonnissanschererville.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138483669, \"source\": \"custom_dealercrawl\", \"vin\": \"5XYK6CDFXRG164046\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofchattanooga.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138484008, \"source\": \"custom_dealercrawl\", \"vin\": \"JF1VA1J6XH9814889\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"1\", \"name\": \"Ozzy's Car Company\", \"address\": \"4195 W. Chinden Blvd\", \"zip\": \"83714\"}}\n{\"timestamp\": 1698386138484314, \"source\": \"custom_dealercrawl\", \"vin\": \"YV4952CZ5D1668226\", \"vid\": \"82186503\", \"date\": \"2023-10-26\", \"domain\": \"www.napletonnissanschererville.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n{\"timestamp\": 1698386138484591, \"source\": \"custom_dealercrawl\", \"vin\": \"3C6UR5CJ9KG619579\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.parklinemotors.com\", \"seller\": {\"id\": \"3\", \"name\": \"Select Auto Imports\", \"address\": \"5630 S Van Dorn St\", \"zip\": \"22310\"}}\n{\"timestamp\": 1698386138484912, \"source\": \"custom_dealercrawl\", \"vin\": \"1N6AA1EF1LN505490\", \"vid\": \"\", \"date\": \"2023-10-26\", \"domain\": \"www.kiaofabilene.com\", \"seller\": {\"id\": \"5\", \"name\": \"Mercedes-Benz of Draper\", \"address\": \"11548 S LONE PEAK PARKWAY\", \"zip\": \"84020\"}}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/benches/doc_to_json_bench.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse binggan::plugins::*;\nuse binggan::{BenchRunner, INSTRUMENTED_SYSTEM, PeakMemAlloc, black_box};\nuse quickwit_doc_mapper::DocMapper;\nuse tantivy::TantivyDocument;\n\nconst SIMPLE_JSON_TEST_DATA: &str = include_str!(\"data/simple-parse-bench.json\");\nconst ROUTING_TEST_DATA: &str = include_str!(\"data/simple-routing-expression-bench.json\");\n\nconst DOC_MAPPER_CONF_SIMPLE_JSON: &str = r#\"{\n    \"type\": \"default\",\n    \"default_search_fields\": [],\n    \"tag_fields\": [],\n    \"field_mappings\": [\n        {\"name\": \"id\", \"type\": \"u64\", \"fast\": false },\n        {\"name\": \"first_name\", \"type\": \"text\" },\n        {\"name\": \"last_name\", \"type\": \"text\" },\n        {\"name\": \"email\", \"type\": \"text\" }\n    ]\n}\"#;\n\n/// Note that {\"name\": \"date\", \"type\": \"datetime\", \"input_formats\": [\"%Y-%m-%d\"], \"output_format\":\n/// \"%Y-%m-%d\"}, is removed since tantivy parsing only supports RFC3339\nconst ROUTING_DOC_MAPPER_CONF: &str = r#\"{\n    \"type\": \"default\",\n    \"default_search_fields\": [],\n    \"tag_fields\": [],\n    \"field_mappings\": [\n        {\"name\": \"timestamp\", \"type\": \"datetime\", \"input_formats\": [\"unix_timestamp\"], \"output_format\": \"%Y-%m-%d %H:%M:%S\", \"output_format\": \"%Y-%m-%d %H:%M:%S\", \"fast\": true },\n        {\"name\": \"source\", \"type\": 
\"text\" },\n        {\"name\": \"vin\", \"type\": \"text\" },\n        {\"name\": \"vid\", \"type\": \"text\" },\n        {\"name\": \"domain\", \"type\": \"text\" },\n        {\"name\": \"seller\", \"type\": \"object\", \"field_mappings\": [\n            {\"name\": \"id\", \"type\": \"text\" },\n            {\"name\": \"name\", \"type\": \"text\" },\n            {\"name\": \"address\", \"type\": \"text\" },\n            {\"name\": \"zip\", \"type\": \"text\" }\n        ]}\n    ],\n    \"partition_key\": \"seller.id\"\n}\"#;\n\n#[global_allocator]\npub static GLOBAL: &PeakMemAlloc<std::alloc::System> = &INSTRUMENTED_SYSTEM;\n\nfn get_test_data(\n    name: &'static str,\n    raw: &'static str,\n    doc_mapper: &'static str,\n) -> (&'static str, usize, Vec<&'static str>, Box<DocMapper>) {\n    let lines: Vec<&str> = raw.lines().map(|line| line.trim()).collect();\n    (\n        name,\n        raw.len(),\n        lines,\n        serde_json::from_str(doc_mapper).unwrap(),\n    )\n}\n\nfn run_bench() {\n    let inputs: Vec<(&str, usize, Vec<&str>, Box<DocMapper>)> = vec![\n        (get_test_data(\n            \"flat_json\",\n            SIMPLE_JSON_TEST_DATA,\n            DOC_MAPPER_CONF_SIMPLE_JSON,\n        )),\n        (get_test_data(\"routing_json\", ROUTING_TEST_DATA, ROUTING_DOC_MAPPER_CONF)),\n    ];\n\n    let mut runner: BenchRunner = BenchRunner::new();\n\n    runner.config().set_num_iter_for_bench(1);\n    runner.config().set_num_iter_for_group(100);\n    runner\n        .add_plugin(CacheTrasher::default())\n        .add_plugin(BPUTrasher::default())\n        .add_plugin(PeakMemAllocPlugin::new(GLOBAL));\n\n    for (input_name, size, data, doc_mapper) in inputs.iter() {\n        let dynamic_doc_mapper: DocMapper =\n            serde_json::from_str(r#\"{ \"mode\": \"dynamic\" }\"#).unwrap();\n        let mut group = runner.new_group();\n        group.set_name(input_name);\n        group.set_input_size(*size);\n        group.register_with_input(\"doc_mapper\", 
data, |lines| {\n            for line in lines {\n                black_box(doc_mapper.doc_from_json_str(line).unwrap());\n            }\n        });\n\n        group.register_with_input(\"doc_mapper_dynamic\", data, |lines| {\n            for line in lines {\n                black_box(dynamic_doc_mapper.doc_from_json_str(line).unwrap());\n            }\n        });\n\n        group.register_with_input(\"tantivy parse json\", data, |lines| {\n            let schema = doc_mapper.schema();\n            for line in lines {\n                let _doc = black_box(TantivyDocument::parse_json(&schema, line).unwrap());\n            }\n        });\n        group.run();\n    }\n}\n\nfn main() {\n    run_bench();\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/benches/routing_expression_bench.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse binggan::plugins::*;\nuse binggan::{BenchRunner, INSTRUMENTED_SYSTEM, PeakMemAlloc};\nuse quickwit_doc_mapper::RoutingExpr;\nuse serde_json::Value as JsonValue;\n\n#[global_allocator]\npub static GLOBAL: &PeakMemAlloc<std::alloc::System> = &INSTRUMENTED_SYSTEM;\n\nconst JSON_TEST_DATA: &str = include_str!(\"data/simple-routing-expression-bench.json\");\n\nfn run_bench() {\n    let json_lines: Vec<serde_json::Map<String, JsonValue>> = JSON_TEST_DATA\n        .lines()\n        .map(|line| serde_json::from_str(line).unwrap())\n        .collect();\n\n    let mut runner: BenchRunner = BenchRunner::new();\n\n    runner\n        .add_plugin(CacheTrasher::default())\n        .add_plugin(PeakMemAllocPlugin::new(GLOBAL));\n\n    {\n        let (input_name, size, data) = &(\"routing_expr\", JSON_TEST_DATA.len(), &json_lines);\n        let mut group = runner.new_group();\n        group.set_name(input_name);\n        group.set_input_size(*size);\n        group.register_with_input(\"simple-eval-hash\", data, |lines| {\n            let routing_expr = RoutingExpr::new(\"seller.id\").unwrap();\n            for json in lines.iter() {\n                routing_expr.eval_hash(json);\n            }\n        });\n\n        group.run();\n    }\n}\n\nfn main() {\n    run_bench();\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/date_time_type.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse indexmap::IndexSet;\nuse quickwit_common::true_fn;\nuse quickwit_datetime::{DateTimeInputFormat, DateTimeOutputFormat, TantivyDateTime};\nuse serde::{Deserialize, Deserializer, Serialize};\nuse serde_json::Value as JsonValue;\nuse tantivy::schema::{DateTimePrecision, OwnedValue as TantivyValue};\n\n/// A struct holding DateTime field options.\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitDateTimeOptions {\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n\n    /// Accepted input formats.\n    #[serde(default)]\n    pub input_formats: InputFormats,\n\n    /// Output format\n    #[serde(default)]\n    pub output_format: DateTimeOutputFormat,\n\n    /// Internal storage precision.\n    #[serde(default)]\n    #[serde(alias = \"precision\")]\n    pub fast_precision: DateTimePrecision,\n\n    #[serde(default = \"true_fn\")]\n    pub indexed: bool,\n\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n\n    #[serde(default)]\n    pub fast: bool,\n}\n\nimpl Default for QuickwitDateTimeOptions {\n    fn default() -> Self {\n        Self {\n            description: None,\n            input_formats: InputFormats::default(),\n            output_format: DateTimeOutputFormat::default(),\n            
fast_precision: DateTimePrecision::default(),\n            indexed: true,\n            stored: true,\n            fast: false,\n        }\n    }\n}\n\nimpl QuickwitDateTimeOptions {\n    pub(crate) fn validate_json(\n        &self,\n        json_value: &serde_json_borrow::Value,\n    ) -> Result<(), String> {\n        match json_value {\n            serde_json_borrow::Value::Number(timestamp) => {\n                // `.as_f64()` actually converts integers to floats, so we must check for integers\n                // first.\n                if let Some(timestamp_i64) = timestamp.as_i64() {\n                    quickwit_datetime::parse_timestamp_int(timestamp_i64, &self.input_formats.0)?;\n                    Ok(())\n                } else if let Some(timestamp_f64) = timestamp.as_f64() {\n                    quickwit_datetime::parse_timestamp_float(timestamp_f64, &self.input_formats.0)?;\n                    Ok(())\n                } else {\n                    Err(format!(\n                        \"failed to convert timestamp to f64 ({:?}). 
this should never happen\",\n                        serde_json::Number::from(*timestamp)\n                    ))\n                }\n            }\n            serde_json_borrow::Value::Str(date_time_str) => {\n                quickwit_datetime::parse_date_time_str(date_time_str, &self.input_formats.0)?;\n                Ok(())\n            }\n            _ => Err(format!(\n                \"failed to parse datetime: expected a float, integer, or string, got \\\n                 `{json_value}`\"\n            )),\n        }\n    }\n\n    pub(crate) fn parse_json(&self, json_value: &JsonValue) -> Result<TantivyValue, String> {\n        let date_time = match json_value {\n            JsonValue::Number(timestamp) => {\n                // `.as_f64()` actually converts integers to floats, so we must check for integers\n                // first.\n                if let Some(timestamp_i64) = timestamp.as_i64() {\n                    quickwit_datetime::parse_timestamp_int(timestamp_i64, &self.input_formats.0)?\n                } else if let Some(timestamp_f64) = timestamp.as_f64() {\n                    quickwit_datetime::parse_timestamp_float(timestamp_f64, &self.input_formats.0)?\n                } else {\n                    return Err(format!(\n                        \"failed to parse datetime `{timestamp:?}`: value is larger than i64::MAX\",\n                    ));\n                }\n            }\n            JsonValue::String(date_time_str) => {\n                quickwit_datetime::parse_date_time_str(date_time_str, &self.input_formats.0)?\n            }\n            _ => {\n                return Err(format!(\n                    \"failed to parse datetime: expected a float, integer, or string, got \\\n                     `{json_value}`\"\n                ));\n            }\n        };\n        Ok(TantivyValue::Date(date_time))\n    }\n\n    pub(crate) fn reparse_tantivy_value(\n        &self,\n        tantivy_value: &TantivyValue,\n    ) -> 
Option<TantivyDateTime> {\n        match tantivy_value {\n            TantivyValue::Date(date) => Some(*date),\n            TantivyValue::Str(date_time_str) => {\n                quickwit_datetime::parse_date_time_str(date_time_str, &self.input_formats.0).ok()\n            }\n            TantivyValue::U64(timestamp_u64) => {\n                let timestamp_i64 = (*timestamp_u64).try_into().ok()?;\n                quickwit_datetime::parse_timestamp_int(timestamp_i64, &self.input_formats.0).ok()\n            }\n            TantivyValue::I64(timestamp_i64) => {\n                quickwit_datetime::parse_timestamp_int(*timestamp_i64, &self.input_formats.0).ok()\n            }\n            TantivyValue::F64(timestamp_f64) => {\n                quickwit_datetime::parse_timestamp_float(*timestamp_f64, &self.input_formats.0).ok()\n            }\n            _ => None,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize)]\npub struct InputFormats(Vec<DateTimeInputFormat>);\n\nimpl Default for InputFormats {\n    fn default() -> Self {\n        Self(vec![\n            DateTimeInputFormat::Rfc3339,\n            DateTimeInputFormat::Timestamp,\n        ])\n    }\n}\n\nimpl<'de> Deserialize<'de> for InputFormats {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let date_time_formats = IndexSet::<DateTimeInputFormat>::deserialize(deserializer)?;\n\n        if date_time_formats.is_empty() {\n            return Ok(InputFormats::default());\n        }\n        Ok(InputFormats(date_time_formats.into_iter().collect()))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use time::macros::datetime;\n\n    use super::*;\n    use crate::doc_mapper::FieldMappingType;\n    use crate::{Cardinality, FieldMappingEntry};\n\n    #[test]\n    fn test_date_time_options_single_value_deser() {\n        let field_mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                
\"name\": \"updated_at\",\n                \"type\": \"datetime\",\n                \"description\": \"When the record was last updated.\",\n                \"input_formats\": [\n                    \"rfc3339\"\n                ],\n                \"fast_precision\": \"milliseconds\",\n                \"indexed\": true,\n                \"fast\": true,\n                \"stored\": false\n            }\n            \"#,\n        )\n        .unwrap();\n\n        assert_eq!(field_mapping_entry.name, \"updated_at\");\n\n        let date_time_options = match field_mapping_entry.mapping_type {\n            FieldMappingType::DateTime(date_time_options, Cardinality::SingleValued) => {\n                date_time_options\n            }\n            _ => panic!(\"Expected a date time field mapping\"),\n        };\n        let expected_input_formats = InputFormats(vec![DateTimeInputFormat::Rfc3339]);\n        let expected_date_time_options = QuickwitDateTimeOptions {\n            description: Some(\"When the record was last updated.\".to_string()),\n            input_formats: expected_input_formats,\n            output_format: DateTimeOutputFormat::Rfc3339,\n            fast_precision: DateTimePrecision::Milliseconds,\n            indexed: true,\n            fast: true,\n            stored: false,\n        };\n        assert_eq!(date_time_options, expected_date_time_options);\n    }\n\n    #[test]\n    fn test_backward_compatibility_after_fast_precision_rename() {\n        let field_mapping_entry: FieldMappingEntry = serde_json::from_str(\n            r#\"\n        {\n            \"name\": \"updated_at\",\n            \"type\": \"datetime\",\n            \"description\": \"When the record was last updated.\",\n            \"input_formats\": [\"rfc3339\"],\n            \"precision\": \"milliseconds\",\n            \"indexed\": true,\n            \"fast\": true,\n            \"stored\": false\n        }\n    \"#,\n        )\n        .unwrap();\n\n        if let 
FieldMappingType::DateTime(date_time_options, _) = field_mapping_entry.mapping_type {\n            assert_eq!(\n                date_time_options.fast_precision,\n                DateTimePrecision::Milliseconds\n            );\n        } else {\n            panic!(\"Expected a date time field mapping\");\n        }\n    }\n\n    #[test]\n    fn test_date_time_options_multi_values_deser() {\n        let field_mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"updated_at\",\n                \"type\": \"array<datetime>\",\n                \"description\": \"When the record was last updated.\",\n                \"input_formats\": [\n                    \"rfc3339\"\n                ],\n                \"output_format\": \"unix_timestamp_secs\",\n                \"fast_precision\": \"milliseconds\",\n                \"indexed\": true,\n                \"fast\": true,\n                \"stored\": false\n            }\n            \"#,\n        )\n        .unwrap();\n\n        assert_eq!(field_mapping_entry.name, \"updated_at\");\n\n        let date_time_options = match field_mapping_entry.mapping_type {\n            FieldMappingType::DateTime(date_time_options, Cardinality::MultiValued) => {\n                date_time_options\n            }\n            _ => panic!(\"Expected a date time field mapping.\"),\n        };\n        let expected_input_formats = InputFormats(vec![DateTimeInputFormat::Rfc3339]);\n        let expected_date_time_options = QuickwitDateTimeOptions {\n            description: Some(\"When the record was last updated.\".to_string()),\n            input_formats: expected_input_formats,\n            output_format: DateTimeOutputFormat::TimestampSecs,\n            fast_precision: DateTimePrecision::Milliseconds,\n            indexed: true,\n            fast: true,\n            stored: false,\n        };\n        assert_eq!(date_time_options, expected_date_time_options);\n    }\n\n    
#[test]\n    fn test_date_time_options_deser_default() {\n        let date_time_options = serde_json::from_str::<QuickwitDateTimeOptions>(\"{}\").unwrap();\n        assert_eq!(date_time_options, QuickwitDateTimeOptions::default());\n        assert_eq!(\n            date_time_options.input_formats.0,\n            &[DateTimeInputFormat::Rfc3339, DateTimeInputFormat::Timestamp]\n        );\n        assert_eq!(\n            date_time_options.output_format,\n            DateTimeOutputFormat::Rfc3339\n        );\n        assert_eq!(date_time_options.fast_precision, DateTimePrecision::Seconds);\n        assert!(date_time_options.indexed);\n        assert!(date_time_options.stored);\n        assert!(!date_time_options.fast);\n    }\n\n    #[test]\n    fn test_date_time_options_deser_denies_unknown_fields() {\n        let error = serde_json::from_str::<QuickwitDateTimeOptions>(\n            r#\"\n            {\n                \"tokenizer\": \"raw\",\n            }\n            \"#,\n        )\n        .unwrap_err()\n        .to_string();\n        assert!(error.contains(\"unknown field `tokenizer`\"));\n\n        let error = serde_json::from_str::<QuickwitDateTimeOptions>(\n            r#\"\n            {\n                \"fast_precision\": \"hours\",\n            }\n            \"#,\n        )\n        .unwrap_err()\n        .to_string();\n        assert!(error.contains(\"unknown variant `hours`\"));\n    }\n\n    #[test]\n    fn test_date_time_options_ser() {\n        let field_mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"updated_at\",\n                \"type\": \"datetime\",\n                \"description\": \"When the record was last updated.\",\n                \"input_formats\": [\"iso8601\"]\n            }\"#,\n        )\n        .unwrap();\n\n        let entry_json = serde_json::to_value(&field_mapping_entry).unwrap();\n        assert_eq!(\n            entry_json,\n            
serde_json::json!({\n                \"name\": \"updated_at\",\n                \"type\": \"datetime\",\n                \"description\": \"When the record was last updated.\",\n                \"input_formats\": [\"iso8601\"],\n                \"output_format\": \"rfc3339\",\n                \"fast_precision\": \"seconds\",\n                \"indexed\": true,\n                \"fast\": false,\n                \"stored\": true\n            })\n        );\n    }\n\n    #[test]\n    fn test_deserialize_input_formats_deser() {\n        {\n            let input_formats_json = r#\"[]\"#;\n            let input_formats: InputFormats = serde_json::from_str(input_formats_json).unwrap();\n            assert_eq!(\n                input_formats.0,\n                &[DateTimeInputFormat::Rfc3339, DateTimeInputFormat::Timestamp]\n            );\n        }\n        {\n            let input_formats_json = r#\"[\"rfc3339\", \"unix_timestamp\", \"unix_timestamp\"]\"#;\n            let input_formats: InputFormats = serde_json::from_str(input_formats_json).unwrap();\n            assert_eq!(\n                input_formats.0,\n                &[DateTimeInputFormat::Rfc3339, DateTimeInputFormat::Timestamp]\n            );\n        }\n    }\n\n    #[test]\n    fn test_deserialize_invalid_input_formats_should_error() {\n        {\n            let input_formats_json = r#\"[\"rfc3339\", \"%Y-%Q-%d\"]\"#;\n            let error = serde_json::from_str::<InputFormats>(input_formats_json)\n                .unwrap_err()\n                .to_string();\n            assert!(error.contains(\"invalid strptime format\"));\n        }\n    }\n\n    #[test]\n    fn test_date_time_options_parse_json() {\n        let date_time_options = QuickwitDateTimeOptions {\n            input_formats: InputFormats(vec![\n                DateTimeInputFormat::Rfc3339,\n                DateTimeInputFormat::Timestamp,\n            ]),\n            ..Default::default()\n        };\n        let expected_timestamp = 
datetime!(2012-05-21 12:09:14 UTC).unix_timestamp();\n        {\n            let json_value = serde_json::json!(\"2012-05-21T12:09:14-00:00\");\n            let tantivy_value = date_time_options.parse_json(&json_value).unwrap();\n            let date_time = match tantivy_value {\n                TantivyValue::Date(date_time) => date_time,\n                other => panic!(\"Expected a tantivy date time, got `{other:?}`.\"),\n            };\n            assert_eq!(date_time.into_timestamp_secs(), expected_timestamp);\n        }\n        {\n            let json_value = serde_json::json!(expected_timestamp);\n            let tantivy_value = date_time_options.parse_json(&json_value).unwrap();\n            let date_time = match tantivy_value {\n                TantivyValue::Date(date_time) => date_time,\n                other => panic!(\"Expected a tantivy date time, got `{other:?}`.\"),\n            };\n            assert_eq!(date_time.into_timestamp_secs(), expected_timestamp);\n        }\n        {\n            let json_value = serde_json::json!(expected_timestamp as f64);\n            let tantivy_value = date_time_options.parse_json(&json_value).unwrap();\n            let date_time = match tantivy_value {\n                TantivyValue::Date(date_time) => date_time,\n                other => panic!(\"Expected a tantivy date time, got `{other:?}`.\"),\n            };\n            assert_eq!(date_time.into_timestamp_secs(), expected_timestamp);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/doc_mapper_builder.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::de::IgnoredAny;\nuse serde::{Deserialize, Serialize};\n\nuse crate::{DocMapper, DocMapping};\n\n/// DocMapperBuilder is here\n/// to create a valid DocMapper.\n///\n/// It is also used to serialize/deserialize a DocMapper.\n/// note that this is not the way is the DocMapping is deserialized\n/// from the configuration.\n#[derive(Clone, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct DocMapperBuilder {\n    /// Doc mapping.\n    #[serde(flatten)]\n    pub doc_mapping: DocMapping,\n    /// Default search field names.\n    #[serde(default)]\n    pub default_search_fields: Vec<String>,\n\n    /// Allow the \"type\" field separately.\n    /// This is a residue from when the DocMapper was a trait.\n    #[serde(rename = \"type\", default)]\n    #[serde(skip_serializing)]\n    pub legacy_type_tag: Option<IgnoredAny>,\n}\n\n#[cfg(test)]\nimpl Default for DocMapperBuilder {\n    fn default() -> Self {\n        serde_json::from_str(\"{}\").unwrap()\n    }\n}\n\nimpl DocMapperBuilder {\n    /// Build a valid `DocMapper`.\n    /// This will consume your `DocMapperBuilder`.\n    pub fn try_build(self) -> anyhow::Result<DocMapper> {\n        self.try_into()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::ModeType;\n\n    #[test]\n    fn test_default_mapper_builder_deserialize_from_empty_object() {\n     
   let default_doc_mapper_builder: DocMapperBuilder = serde_json::from_str(\"{}\").unwrap();\n        assert_eq!(\n            default_doc_mapper_builder.doc_mapping.mode.mode_type(),\n            ModeType::Dynamic\n        );\n        assert!(\n            default_doc_mapper_builder\n                .doc_mapping\n                .field_mappings\n                .is_empty()\n        );\n        assert!(\n            default_doc_mapper_builder\n                .doc_mapping\n                .timestamp_field\n                .is_none()\n        );\n        assert!(default_doc_mapper_builder.doc_mapping.tag_fields.is_empty());\n        assert_eq!(default_doc_mapper_builder.doc_mapping.store_source, false);\n        assert!(default_doc_mapper_builder.default_search_fields.is_empty());\n    }\n\n    #[test]\n    fn test_default_mapper_builder_extra_field() {\n        assert!(serde_json::from_str::<DocMapperBuilder>(r#\"{\"unknownfield\": \"blop\"}\"#).is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/doc_mapper_impl.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, BTreeSet, HashSet};\nuse std::num::NonZeroU32;\nuse std::sync::Arc;\n\nuse anyhow::{Context, bail};\nuse fnv::FnvHashSet;\nuse quickwit_proto::types::DocMappingUid;\nuse quickwit_query::create_default_quickwit_tokenizer_manager;\nuse quickwit_query::query_ast::{BuildTantivyAstContext, QueryAst};\nuse quickwit_query::tokenizers::TokenizerManager;\nuse serde::{Deserialize, Serialize};\nuse serde_json::{self, Value as JsonValue};\nuse serde_json_borrow::Map as BorrowedJsonMap;\nuse tantivy::TantivyDocument as Document;\nuse tantivy::query::Query;\nuse tantivy::schema::{Field, FieldType, INDEXED, OwnedValue as TantivyValue, STORED, Schema};\n\nuse super::DocMapperBuilder;\nuse super::field_mapping_entry::RAW_TOKENIZER_NAME;\nuse super::field_presence::populate_field_presence;\nuse super::tantivy_val_to_json::tantivy_value_to_json;\nuse crate::doc_mapper::mapping_tree::{\n    JsonValueIterator, MappingNode, MappingNodeRoot, build_field_path_from_str, build_mapping_tree,\n    map_primitive_json_to_concatenate_value,\n};\nuse crate::doc_mapper::{FieldMappingType, JsonObject, Partition};\nuse crate::query_builder::build_query;\nuse crate::routing_expression::RoutingExpr;\nuse crate::{\n    Cardinality, DOCUMENT_SIZE_FIELD_NAME, DYNAMIC_FIELD_NAME, DocMapping, DocParsingError,\n    FIELD_PRESENCE_FIELD_NAME, Mode, 
ModeType, NamedField, QueryParserError, SOURCE_FIELD_NAME,\n    TokenizerEntry, WarmupInfo,\n};\n\nconst FIELD_PRESENCE_FIELD: Field = Field::from_field_id(0u32);\n\n/// A `DocMapper` defines a set of rules to map JSON fields\n/// to tantivy index fields.\n///\n/// The main rules are defined by the field mappings.\n#[derive(Clone, Serialize, Deserialize)]\n#[serde(into = \"DocMapperBuilder\", try_from = \"DocMapperBuilder\")]\npub struct DocMapper {\n    /// The UID of the doc mapping.\n    doc_mapping_uid: DocMappingUid,\n    /// Field in which the source should be stored.\n    /// This field is only valid when using the schema associated with the default\n    /// doc mapper, and therefore cannot be used in the `query` method.\n    source_field: Option<Field>,\n    /// Indexes field presence. It is necessary to enable this in order to run exists\n    /// queries.\n    index_field_presence: bool,\n    /// Field in which the dynamically mapped fields should be stored.\n    /// This field is only valid when using the schema associated with the default\n    /// doc mapper, and therefore cannot be used in the `query` method.\n    dynamic_field: Option<Field>,\n    /// Field in which the length of the source document is stored as a fast field.\n    document_size_field: Option<Field>,\n    /// Default list of field names used for search.\n    default_search_field_names: Vec<String>,\n    /// Timestamp field name.\n    timestamp_field_name: Option<String>,\n    /// Timestamp field path (name parsed)\n    timestamp_field_path: Option<Vec<String>>,\n    /// Root node of the field mapping tree.\n    /// See [`MappingNode`].\n    field_mappings: MappingNode,\n    /// Concatenate fields which need to learn about any element put in `dynamic_field`.\n    concatenate_dynamic_fields: Vec<Field>,\n    /// Schema generated by the store source and field mappings parameters.\n    schema: Schema,\n    /// List of field names used for tagging.\n    tag_field_names: BTreeSet<String>,\n    /// The 
partition key is a DSL used to route documents\n    /// into specific splits.\n    partition_key: RoutingExpr,\n    /// Maximum number of partitions\n    max_num_partitions: NonZeroU32,\n    /// Defines how unmapped fields should be handled.\n    mode: Mode,\n    /// User-defined tokenizers.\n    tokenizer_entries: Vec<TokenizerEntry>,\n    /// Tokenizer manager.\n    tokenizer_manager: TokenizerManager,\n}\n\nfn validate_timestamp_field(\n    timestamp_field_path: &str,\n    mapping_root_node: &MappingNode,\n) -> anyhow::Result<()> {\n    if timestamp_field_path.starts_with('.') || timestamp_field_path.starts_with(\"\\\\.\") {\n        bail!(\"timestamp field `{timestamp_field_path}` should not start with a `.`\");\n    }\n    if timestamp_field_path.ends_with('.') {\n        bail!(\"timestamp field `{timestamp_field_path}` should not end with a `.`\");\n    }\n    let Some(timestamp_field_type) =\n        mapping_root_node.find_field_mapping_type(timestamp_field_path)\n    else {\n        bail!(\"could not find timestamp field `{timestamp_field_path}` in field mappings\");\n    };\n    if let FieldMappingType::DateTime(date_time_option, cardinality) = &timestamp_field_type {\n        if cardinality != &Cardinality::SingleValued {\n            bail!(\"timestamp field `{timestamp_field_path}` should be single-valued\");\n        }\n        if !date_time_option.fast {\n            bail!(\"timestamp field `{timestamp_field_path}` should be a fast field\");\n        }\n    } else {\n        bail!(\"timestamp field `{timestamp_field_path}` should be a datetime field\");\n    }\n    Ok(())\n}\n\nimpl From<DocMapper> for DocMapperBuilder {\n    fn from(default_doc_mapper: DocMapper) -> Self {\n        let partition_key_str = default_doc_mapper.partition_key.to_string();\n        let partition_key_opt: Option<String> = if !partition_key_str.is_empty() {\n            Some(partition_key_str)\n        } else {\n            None\n        };\n        let doc_mapping = 
DocMapping {\n            doc_mapping_uid: default_doc_mapper.doc_mapping_uid,\n            mode: default_doc_mapper.mode,\n            field_mappings: default_doc_mapper.field_mappings.into(),\n            timestamp_field: default_doc_mapper.timestamp_field_name,\n            tag_fields: default_doc_mapper.tag_field_names,\n            partition_key: partition_key_opt,\n            max_num_partitions: default_doc_mapper.max_num_partitions,\n            index_field_presence: default_doc_mapper.index_field_presence,\n            store_document_size: default_doc_mapper.document_size_field.is_some(),\n            store_source: default_doc_mapper.source_field.is_some(),\n            tokenizers: default_doc_mapper.tokenizer_entries,\n        };\n        Self {\n            doc_mapping,\n            default_search_fields: default_doc_mapper.default_search_field_names,\n            legacy_type_tag: None,\n        }\n    }\n}\n\nimpl TryFrom<DocMapperBuilder> for DocMapper {\n    type Error = anyhow::Error;\n\n    fn try_from(builder: DocMapperBuilder) -> anyhow::Result<DocMapper> {\n        let mut schema_builder = Schema::builder();\n\n        // We want the field ID of the field presence field to be 0, so we add it to the schema\n        // first.\n        let field_presence_field = schema_builder.add_u64_field(FIELD_PRESENCE_FIELD_NAME, INDEXED);\n        assert_eq!(field_presence_field, FIELD_PRESENCE_FIELD);\n\n        let doc_mapping = builder.doc_mapping;\n\n        let dynamic_field = if let Mode::Dynamic(json_options) = &doc_mapping.mode {\n            Some(schema_builder.add_json_field(DYNAMIC_FIELD_NAME, json_options.clone()))\n        } else {\n            None\n        };\n        let document_size_field = if doc_mapping.store_document_size {\n            let document_size_field_options = tantivy::schema::NumericOptions::default().set_fast();\n            Some(\n                schema_builder.add_u64_field(DOCUMENT_SIZE_FIELD_NAME, 
document_size_field_options),\n            )\n        } else {\n            None\n        };\n        let source_field = if doc_mapping.store_source {\n            Some(schema_builder.add_json_field(SOURCE_FIELD_NAME, STORED))\n        } else {\n            None\n        };\n        let MappingNodeRoot {\n            field_mappings,\n            concatenate_dynamic_fields,\n        } = build_mapping_tree(&doc_mapping.field_mappings, &mut schema_builder)?;\n        if !concatenate_dynamic_fields.is_empty() && dynamic_field.is_none() {\n            bail!(\"concatenate field has `include_dynamic_fields` set, but index isn't dynamic\");\n        }\n        let timestamp_field_path = if let Some(timestamp_field_name) = &doc_mapping.timestamp_field\n        {\n            validate_timestamp_field(timestamp_field_name, &field_mappings)?;\n            Some(build_field_path_from_str(timestamp_field_name))\n        } else {\n            None\n        };\n        let schema = schema_builder.build();\n\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n        let mut custom_tokenizer_names = HashSet::new();\n        for tokenizer_config_entry in &doc_mapping.tokenizers {\n            if custom_tokenizer_names.contains(&tokenizer_config_entry.name) {\n                bail!(\n                    \"duplicated custom tokenizer: `{}`\",\n                    tokenizer_config_entry.name\n                );\n            }\n            if tokenizer_manager\n                .get_tokenizer(&tokenizer_config_entry.name)\n                .is_some()\n            {\n                bail!(\n                    \"custom tokenizer name `{}` should be different from built-in tokenizer's \\\n                     names\",\n                    tokenizer_config_entry.name\n                );\n            }\n            let tokenizer = tokenizer_config_entry\n                .config\n                .text_analyzer()\n                .map_err(|error| {\n                    
anyhow::anyhow!(\n                        \"failed to build tokenizer `{}`: {:?}\",\n                        tokenizer_config_entry.name,\n                        error\n                    )\n                })?;\n            let does_lowercasing = tokenizer_config_entry\n                .config\n                .filters\n                .iter()\n                .any(|filter| matches!(filter, crate::TokenFilterType::LowerCaser));\n            tokenizer_manager.register(&tokenizer_config_entry.name, tokenizer, does_lowercasing);\n            custom_tokenizer_names.insert(&tokenizer_config_entry.name);\n        }\n        validate_fields_tokenizers(&schema, &tokenizer_manager)?;\n\n        // Resolve default search fields\n        let mut default_search_field_names = Vec::new();\n        for default_search_field_name in &builder.default_search_fields {\n            if default_search_field_names.contains(default_search_field_name) {\n                bail!(\n                    \"duplicated default search field: `{}`\",\n                    default_search_field_name\n                )\n            }\n            let (default_search_field, _json_path) = schema\n                .find_field_with_default(default_search_field_name, dynamic_field)\n                .with_context(|| {\n                    format!(\"unknown default search field `{default_search_field_name}`\")\n                })?;\n            if !schema.get_field_entry(default_search_field).is_indexed() {\n                bail!(\"default search field `{default_search_field_name}` is not indexed\",);\n            }\n            default_search_field_names.push(default_search_field_name.clone());\n        }\n\n        // Resolve tag fields\n        for tag_field_name in &doc_mapping.tag_fields {\n            validate_tag(tag_field_name, &schema)?;\n        }\n\n        let partition_key_expr: &str = doc_mapping.partition_key.as_deref().unwrap_or(\"\");\n        let partition_key = 
RoutingExpr::new(partition_key_expr).with_context(|| {\n            format!(\"failed to interpret the partition key: `{partition_key_expr}`\")\n        })?;\n\n        // If valid, partition key fields should be considered as tags.\n        let mut tag_field_names = doc_mapping.tag_fields;\n\n        for partition_key in partition_key.field_names() {\n            if validate_tag(&partition_key, &schema).is_ok() {\n                tag_field_names.insert(partition_key);\n            }\n        }\n        Ok(DocMapper {\n            doc_mapping_uid: doc_mapping.doc_mapping_uid,\n            schema,\n            index_field_presence: doc_mapping.index_field_presence,\n            source_field,\n            dynamic_field,\n            document_size_field,\n            default_search_field_names,\n            timestamp_field_name: doc_mapping.timestamp_field,\n            timestamp_field_path,\n            field_mappings,\n            concatenate_dynamic_fields,\n            tag_field_names,\n            partition_key,\n            max_num_partitions: doc_mapping.max_num_partitions,\n            mode: doc_mapping.mode,\n            tokenizer_entries: doc_mapping.tokenizers,\n            tokenizer_manager,\n        })\n    }\n}\n\n/// Checks that a given field name is a valid candidate for a tag.\n///\n/// The conditions are:\n/// - the field must be str, u64, or i64\n/// - if str, the field must use the `raw` tokenizer for indexing.\n/// - the field must be indexed.\nfn validate_tag(tag_field_name: &str, schema: &Schema) -> Result<(), anyhow::Error> {\n    if tag_field_name.starts_with('.') || tag_field_name.starts_with(\"\\\\.\") {\n        bail!(\"tag field `{tag_field_name}` should not start with a `.`\");\n    }\n    if tag_field_name.ends_with('.') {\n        bail!(\"tag field `{tag_field_name}` should not end with a `.`\");\n    }\n    let field = schema\n        .get_field(tag_field_name)\n        .with_context(|| format!(\"unknown tag field: 
`{tag_field_name}`\"))?;\n    let field_type = schema.get_field_entry(field).field_type();\n    match field_type {\n        FieldType::Str(options) => {\n            let tokenizer_opt = options\n                .get_indexing_options()\n                .map(|text_options: &tantivy::schema::TextFieldIndexing| text_options.tokenizer());\n            if tokenizer_opt != Some(RAW_TOKENIZER_NAME) {\n                bail!(\"tags collection is only allowed on text fields with the `raw` tokenizer\");\n            }\n        }\n        FieldType::U64(_) | FieldType::I64(_) => {\n            // u64 and i64 are accepted as tags.\n        }\n        _ => {\n            // We avoid the bytes / bool / f64 types,\n            // as they are generally speaking poor tags and we want to avoid\n            // bugs associated with the multiplicity of their representation.\n            //\n            // (Tags rely heavily on string manipulation and we want to\n            // avoid a \"ZRP because you searched for 0.100 instead of 0.1\",\n            // or `myflag:1`, `myflag:True` instead of `myflag:true`.)\n            bail!(\n                \"tags collection is not allowed on `{}` fields\",\n                field_type.value_type().name().to_lowercase()\n            )\n        }\n    }\n    if !field_type.is_indexed() {\n        bail!(\n            \"tag fields are required to be indexed. 
(`{}` is not configured as indexed)\",\n            tag_field_name\n        )\n    }\n    Ok(())\n}\n\n/// Checks that a given text/json field name has a registered tokenizer.\nfn validate_fields_tokenizers(\n    schema: &Schema,\n    tokenizer_manager: &TokenizerManager,\n) -> Result<(), anyhow::Error> {\n    for (_, field_entry) in schema.fields() {\n        let tokenizer_name_opt = match field_entry.field_type() {\n            FieldType::Str(options) => options\n                .get_indexing_options()\n                .map(|text_options: &tantivy::schema::TextFieldIndexing| text_options.tokenizer()),\n            FieldType::JsonObject(options) => options\n                .get_text_indexing_options()\n                .map(|text_options: &tantivy::schema::TextFieldIndexing| text_options.tokenizer()),\n            _ => None,\n        };\n        if let Some(tokenizer_name) = tokenizer_name_opt\n            && tokenizer_manager.get_tokenizer(tokenizer_name).is_none()\n        {\n            bail!(\n                \"unknown tokenizer `{}` for field `{}`\",\n                tokenizer_name,\n                field_entry.name()\n            );\n        }\n    }\n    Ok(())\n}\n\nimpl std::fmt::Debug for DocMapper {\n    fn fmt(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {\n        formatter\n            .debug_struct(\"DocMapper\")\n            .field(\"store_source\", &self.source_field.is_some())\n            .field(\n                \"default_search_field_names\",\n                &self.default_search_field_names,\n            )\n            .field(\"timestamp_field_name\", &self.timestamp_field_name())\n            // TODO: complete it.\n            .finish()\n    }\n}\n\nfn extract_single_obj(\n    doc: &mut BTreeMap<String, Vec<TantivyValue>>,\n    key: &str,\n) -> anyhow::Result<Option<serde_json::Map<String, JsonValue>>> {\n    let mut values = if let Some(values) = doc.remove(key) {\n        values\n    } else {\n        return Ok(None);\n   
 };\n    if values.len() > 1 {\n        bail!(\n            \"invalid named document. there are more than 1 value associated to the `{key}` field\"\n        );\n    }\n    match values.pop() {\n        Some(TantivyValue::Object(dynamic_json_obj)) => Ok(Some(\n            dynamic_json_obj\n                .into_iter()\n                .map(|(key, val)| (key, tantivy_value_to_json(val)))\n                .collect(),\n        )),\n        Some(_) => {\n            bail!(\"the `{key}` value has to be a json object\");\n        }\n        None => Ok(None),\n    }\n}\n\nimpl DocMapper {\n    /// Returns the unique identifier of the doc mapping.\n    pub fn doc_mapping_uid(&self) -> DocMappingUid {\n        self.doc_mapping_uid\n    }\n\n    /// Validates a JSON object according to the doc mapper.\n    pub fn validate_json_obj(&self, json_obj: &BorrowedJsonMap) -> Result<(), DocParsingError> {\n        let is_strict = self.mode.mode_type() == ModeType::Strict;\n        let mut field_path = Vec::new();\n        self.field_mappings\n            .validate_from_json(json_obj, is_strict, &mut field_path)?;\n        if let Some(timestamp_field_path) = &self.timestamp_field_path {\n            let missing_ts_field =\n                || DocParsingError::RequiredField(\"timestamp field is required\".to_string());\n            match &timestamp_field_path[..] {\n                [] => (), // ?\n                [single_part] => {\n                    let obj = json_obj.get(single_part).ok_or_else(missing_ts_field)?;\n                    if !(obj.is_string() || obj.is_number()) {\n                        return Err(missing_ts_field());\n                    }\n                }\n                [first_part, more_part @ ..] 
=> {\n                    let mut obj = json_obj.get(first_part).ok_or_else(missing_ts_field)?;\n                    for part in more_part {\n                        obj = obj\n                            .as_object()\n                            .ok_or_else(missing_ts_field)?\n                            .get(part)\n                            .ok_or_else(missing_ts_field)?;\n                    }\n                    if !(obj.is_string() || obj.is_number()) {\n                        return Err(missing_ts_field());\n                    }\n                }\n            };\n        }\n        Ok(())\n    }\n\n    /// Parses a JSON byte slice into a tantivy [`Document`].\n    pub fn doc_from_json_bytes(\n        &self,\n        json_doc: &[u8],\n    ) -> Result<(Partition, Document), DocParsingError> {\n        let json_obj: JsonObject = serde_json::from_slice(json_doc).map_err(|_| {\n            let json_doc_sample: String = std::str::from_utf8(json_doc)\n                .map(|doc_str| doc_str.chars().take(20).chain(\"...\".chars()).collect())\n                .unwrap_or_else(|_| \"document contains some invalid UTF-8 characters\".to_string());\n            DocParsingError::NotJsonObject(json_doc_sample)\n        })?;\n        self.doc_from_json_obj(json_obj, json_doc.len() as u64)\n    }\n\n    /// Parses a JSON string into a tantivy [`Document`].\n    pub fn doc_from_json_str(\n        &self,\n        json_doc: &str,\n    ) -> Result<(Partition, Document), DocParsingError> {\n        let json_obj: JsonObject = serde_json::from_str(json_doc).map_err(|_| {\n            let json_doc_sample: String = json_doc.chars().take(20).chain(\"...\".chars()).collect();\n            DocParsingError::NotJsonObject(json_doc_sample)\n        })?;\n        self.doc_from_json_obj(json_obj, json_doc.len() as u64)\n    }\n\n    /// Transforms a JSON object into a tantivy [`Document`] according to the rules\n    /// defined for the `DocMapper`.\n    pub fn doc_from_json_obj(\n        
&self,\n        json_obj: JsonObject,\n        document_len: u64,\n    ) -> Result<(Partition, Document), DocParsingError> {\n        let partition: Partition = self.partition_key.eval_hash(&json_obj);\n\n        let mut dynamic_json_obj = serde_json::Map::default();\n        let mut field_path = Vec::new();\n        let mut document = Document::default();\n\n        if let Some(source_field) = self.source_field {\n            document.add_object(\n                source_field,\n                json_obj\n                    .clone()\n                    .into_iter()\n                    .map(|(key, val)| (key, TantivyValue::from(val)))\n                    .collect(),\n            );\n        }\n\n        let mode = self.mode.mode_type();\n        self.field_mappings.doc_from_json(\n            json_obj,\n            mode,\n            &mut document,\n            &mut field_path,\n            &mut dynamic_json_obj,\n        )?;\n\n        if let Some(dynamic_field) = self.dynamic_field\n            && !dynamic_json_obj.is_empty()\n        {\n            if !self.concatenate_dynamic_fields.is_empty() {\n                let json_obj_values =\n                    JsonValueIterator::new(serde_json::Value::Object(dynamic_json_obj.clone()))\n                        .flat_map(map_primitive_json_to_concatenate_value);\n\n                for value in json_obj_values {\n                    for concatenate_dynamic_field in self.concatenate_dynamic_fields.iter() {\n                        document.add_field_value(*concatenate_dynamic_field, &value);\n                    }\n                }\n            }\n            document.add_object(\n                dynamic_field,\n                dynamic_json_obj\n                    .into_iter()\n                    .map(|(key, val)| (key, TantivyValue::from(val)))\n                    .collect(),\n            );\n        }\n\n        if let Some(document_size_field) = self.document_size_field {\n            
document.add_u64(document_size_field, document_len);\n        }\n\n        if self.index_field_presence {\n            let field_presence_hashes: FnvHashSet<u64> =\n                populate_field_presence(&document, &self.schema, true);\n            for field_presence_hash in field_presence_hashes {\n                document.add_field_value(FIELD_PRESENCE_FIELD, &field_presence_hash);\n            }\n        }\n        Ok((partition, document))\n    }\n\n    /// Converts a tantivy named Document to the json format.\n    ///\n    /// Tantivy does not have any notion of cardinality nor object.\n    /// It is therefore up to the `DocMapper` to pick a tantivy named document\n    /// and convert it into a final quickwit document.\n    ///\n    /// Because this operation is dependent on the `DocMapper`, this\n    /// method is meant to be called on the root node using the most recent\n    /// `DocMapper`. This ensures that the different hits are formatted according\n    /// to the same schema.\n    pub fn doc_to_json(\n        &self,\n        mut named_doc: BTreeMap<String, Vec<TantivyValue>>,\n    ) -> anyhow::Result<serde_json::Map<String, JsonValue>> {\n        let mut doc_json =\n            extract_single_obj(&mut named_doc, DYNAMIC_FIELD_NAME)?.unwrap_or_default();\n        let mut field_path: Vec<&str> = Vec::new();\n        self.field_mappings\n            .populate_json(&mut named_doc, &mut field_path, &mut doc_json);\n        if let Some(source_json) = extract_single_obj(&mut named_doc, SOURCE_FIELD_NAME)? 
{\n            doc_json.insert(\n                SOURCE_FIELD_NAME.to_string(),\n                JsonValue::Object(source_json),\n            );\n        }\n        if matches!(\n            self.mode,\n            Mode::Dynamic(ref opt) if opt.stored\n        ) {\n            // if we are in dynamic mode and there are other fields left, we should output them.\n            // They probably come from older schemas when these fields had a dedicated entry\n            'field: for (key, mut value) in named_doc {\n                if key.starts_with('_') {\n                    // this is an internal field, not meant to be shown\n                    continue 'field;\n                }\n                let Ok(path) = crate::routing_expression::parse_field_name(&key) else {\n                    continue 'field;\n                };\n                let Some((last_segment, path)) = path.split_last() else {\n                    continue 'field;\n                };\n                let mut map = &mut doc_json;\n                for segment in path {\n                    let obj = if map.contains_key(&**segment) {\n                        // we have to do this strange dance to please the borrow checker\n                        map.get_mut(&**segment).unwrap()\n                    } else {\n                        map.insert(segment.to_string(), serde_json::Map::new().into());\n                        map.get_mut(&**segment).unwrap()\n                    };\n                    let JsonValue::Object(inner_map) = obj else {\n                        continue 'field;\n                    };\n                    map = inner_map;\n                }\n                map.entry(&**last_segment).or_insert_with(|| {\n                    if value.len() == 1 {\n                        tantivy_value_to_json(value.pop().unwrap())\n                    } else {\n                        JsonValue::Array(value.into_iter().map(tantivy_value_to_json).collect())\n                    }\n                
});\n            }\n        }\n\n        Ok(doc_json)\n    }\n\n    /// Returns the query.\n    ///\n    /// Considering schema evolution, splits within an index can have different schema\n    /// over time. So `split_schema` is the schema of the split the query is targeting.\n    pub fn query(\n        &self,\n        split_schema: Schema,\n        query_ast: QueryAst,\n        with_validation: bool,\n        cache_context: Option<(Arc<dyn quickwit_query::query_ast::PredicateCache>, String)>,\n    ) -> Result<(Box<dyn Query>, WarmupInfo), QueryParserError> {\n        build_query(\n            query_ast,\n            &BuildTantivyAstContext {\n                schema: &split_schema,\n                tokenizer_manager: self.tokenizer_manager(),\n                search_fields: &self.default_search_field_names[..],\n                with_validation,\n            },\n            cache_context,\n        )\n    }\n\n    /// Returns the list of search fields to search into, when no field is specified.\n    /// (See `UserInputQuery`).\n    pub fn default_search_fields(&self) -> &[String] {\n        &self.default_search_field_names\n    }\n\n    /// Returns the schema.\n    ///\n    /// Considering schema evolution, splits within an index can have different schema\n    /// over time. 
The schema returned here represents the most up-to-date schema of the index.\n    pub fn schema(&self) -> Schema {\n        self.schema.clone()\n    }\n\n    /// Returns the timestamp field name.\n    pub fn timestamp_field_name(&self) -> Option<&str> {\n        self.timestamp_field_name.as_deref()\n    }\n\n    /// Returns the tag `NamedField`s on the current schema.\n    /// Returns an error if a tag field is not found in this schema.\n    pub fn tag_named_fields(&self) -> anyhow::Result<Vec<NamedField>> {\n        let index_schema = self.schema();\n        self.tag_field_names()\n            .iter()\n            .map(|field_name| {\n                index_schema\n                    .get_field(field_name)\n                    .context(format!(\"field `{field_name}` must exist in the schema\"))\n                    .map(|field| NamedField {\n                        name: field_name.clone(),\n                        field,\n                        field_type: index_schema.get_field_entry(field).field_type().clone(),\n                    })\n            })\n            .collect::<Result<Vec<_>, _>>()\n    }\n\n    /// Returns the names of the tag fields.\n    /// Partition key fields that qualify as tags are included.\n    pub fn tag_field_names(&self) -> BTreeSet<String> {\n        self.tag_field_names.clone()\n    }\n\n    /// Returns the maximum number of partitions.\n    pub fn max_num_partitions(&self) -> NonZeroU32 {\n        self.max_num_partitions\n    }\n\n    /// Returns the tokenizer manager.\n    pub fn tokenizer_manager(&self) -> &TokenizerManager {\n        &self.tokenizer_manager\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::{HashMap, HashSet};\n    use std::iter::zip;\n\n    use itertools::Itertools;\n    use quickwit_common::PathHasher;\n    use quickwit_query::query_ast::query_ast_from_user_text;\n    use serde_json::{self, Value as JsonValue, json};\n    use tantivy::schema::{\n        FieldType, 
IndexRecordOption, OwnedValue as TantivyValue, OwnedValue, Type, Value,\n    };\n\n    use super::DocMapper;\n    use crate::doc_mapper::field_mapping_entry::{DEFAULT_TOKENIZER_NAME, RAW_TOKENIZER_NAME};\n    use crate::{\n        DOCUMENT_SIZE_FIELD_NAME, DYNAMIC_FIELD_NAME, DocMapperBuilder, DocParsingError,\n        FIELD_PRESENCE_FIELD_NAME, SOURCE_FIELD_NAME,\n    };\n\n    fn example_json_doc_value() -> JsonValue {\n        serde_json::json!({\n            \"timestamp\": 1586960586i64,\n            \"body\": \"20200415T072306-0700 INFO This is a great log\",\n            \"response_date2\": \"2021-12-19T16:39:57+00:00\",\n            \"response_date\": \"2021-12-19T16:39:57Z\",\n            \"response_time\": 2.3,\n            \"response_payload\": \"YWJj\",\n            \"owner\": \"foo\",\n            \"isImportant\": false,\n            \"attributes\": {\n                \"server\": \"ABC\",\n                \"tags\": [22, 23],\n                \"server.status\": [\"200\", \"201\"],\n                \"server.payload\": [\"YQ==\", \"Yg==\"]\n            }\n        })\n    }\n\n    const EXPECTED_JSON_PATHS_AND_VALUES: &str = r#\"{\n            \"timestamp\": [\"2020-04-15T14:23:06Z\"],\n            \"body\": [\"20200415T072306-0700 INFO This is a great log\"],\n            \"response_date\": [\"2021-12-19T16:39:57Z\"],\n            \"response_time\": [2.3],\n            \"response_payload\": [\"YWJj\"],\n            \"owner\": [\"foo\"],\n            \"isImportant\": [false],\n            \"body_other_tokenizer\": [\"20200415T072306-0700 INFO This is a great log\"],\n            \"attributes.server\": [\"ABC\"],\n            \"attributes.server\\\\.payload\": [\"YQ==\", \"Yg==\"],\n            \"attributes.tags\": [22, 23],\n            \"attributes.server\\\\.status\": [\"200\", \"201\"]\n        }\"#;\n\n    #[test]\n    fn test_json_deserialize() -> anyhow::Result<()> {\n        let config = crate::default_doc_mapper_for_test();\n        
assert!(config.source_field.is_some());\n        let mut default_search_field_names: Vec<String> = config.default_search_field_names;\n        default_search_field_names.sort();\n        assert_eq!(\n            default_search_field_names,\n            [\"attributes.server\", r\"attributes.server\\.status\", \"body\"]\n        );\n        assert_eq!(config.field_mappings.num_fields(), 10);\n        Ok(())\n    }\n\n    #[test]\n    fn test_parsing_document() {\n        let json_doc = example_json_doc_value();\n        let doc_mapper = crate::default_doc_mapper_for_test();\n        let (_, document) = doc_mapper\n            .doc_from_json_obj(json_doc.as_object().unwrap().clone(), 0)\n            .unwrap();\n        let schema = doc_mapper.schema();\n        // 9 property entries + 1 field \"_source\" + 2 values for \"tags\" field\n        // + 2 values in \"server.status\" field + 2 values in \"server.payload\" field\n        // + 7 values for field presence\n        assert_eq!(document.len(), 23);\n        let expected_json_paths_and_values: HashMap<String, JsonValue> =\n            serde_json::from_str(EXPECTED_JSON_PATHS_AND_VALUES).unwrap();\n        let mut field_presences: HashSet<u64> = HashSet::new();\n        for (field, value) in document.field_values() {\n            let owned_value: OwnedValue = value.into();\n            let field_name = schema.get_field_name(field);\n            if field_name == SOURCE_FIELD_NAME {\n                // some part of aws-sdk enables `preserve_order` on serde_json.\n                // to get \"normal\" equality, we are forced to recreate the json object\n                // with sorted keys.\n                let sorted_json_values = json_doc\n                    .as_object()\n                    .unwrap()\n                    .clone()\n                    .into_iter()\n                    .sorted_by(|k1, k2| k1.0.cmp(&k2.0))\n                    .collect::<serde_json::Map<_, _>>();\n                assert_eq!(\n     
               tantivy::schema::OwnedValue::from(value.as_value()),\n                    tantivy::schema::OwnedValue::from(sorted_json_values)\n                );\n            } else if field_name == DYNAMIC_FIELD_NAME {\n                assert_eq!(\n                    serde_json::to_string(&owned_value).unwrap(),\n                    r#\"{\"response_date2\":\"2021-12-19T16:39:57Z\"}\"#\n                );\n            } else if field_name == FIELD_PRESENCE_FIELD_NAME {\n                let field_presence_u64 = value.as_u64().unwrap();\n                field_presences.insert(field_presence_u64);\n            } else {\n                let value = serde_json::to_string(&owned_value).unwrap();\n                let is_value_in_expected_values = expected_json_paths_and_values\n                    .get(field_name)\n                    .unwrap()\n                    .as_array()\n                    .unwrap()\n                    .iter()\n                    .map(|expected_value| format!(\"{expected_value}\"))\n                    .any(|expected_value| expected_value == value);\n                if !is_value_in_expected_values {\n                    panic!(\"Could not find: {value:?} in {expected_json_paths_and_values:?}\");\n                }\n            }\n        }\n        assert_eq!(field_presences.len(), 7);\n        let timestamp_field = schema.get_field(\"timestamp\").unwrap();\n        let body_field = schema.get_field(\"body\").unwrap();\n        let attributes_field = schema.get_field(\"attributes.server\").unwrap();\n        assert!(!field_presences.contains(&PathHasher::hash_path(&[\n            &timestamp_field.field_id().to_le_bytes()[..]\n        ])));\n        assert!(field_presences.contains(&PathHasher::hash_path(&[\n            &body_field.field_id().to_le_bytes()[..]\n        ])));\n        assert!(field_presences.contains(&PathHasher::hash_path(&[\n            &attributes_field.field_id().to_le_bytes()[..]\n        ])));\n    }\n\n    #[test]\n    fn 
test_accept_parsing_document_with_unknown_fields_and_missing_fields() {\n        let doc_mapper = crate::default_doc_mapper_for_test();\n        doc_mapper\n            .doc_from_json_str(\n                r#\"{\n                \"timestamp\": 1586960586000,\n                \"unknown_field\": \"20200415T072306-0700 INFO This is a great log\",\n                \"response_date\": \"2021-12-19T16:39:57+00:00\",\n                \"response_time\": 12,\n                \"response_payload\": \"YWJj\"\n            }\"#,\n            )\n            .unwrap();\n    }\n\n    #[test]\n    fn test_fail_to_parse_document_with_wrong_cardinality() -> anyhow::Result<()> {\n        let doc_mapper = crate::default_doc_mapper_for_test();\n        let result = doc_mapper.doc_from_json_str(\n            r#\"{\n                \"timestamp\": 1586960586000,\n                \"body\": [\"text 1\", \"text 2\"]\n            }\"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error,\n            DocParsingError::MultiValuesNotSupported(\"body\".to_owned())\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_fail_to_parse_document_with_wrong_value() -> anyhow::Result<()> {\n        let doc_mapper = crate::default_doc_mapper_for_test();\n        let result = doc_mapper.doc_from_json_str(\n            r#\"{\n                \"timestamp\": 1586960586000,\n                \"body\": 1\n            }\"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error,\n            DocParsingError::ValueError(\"body\".to_owned(), \"expected string, got `1`\".to_owned())\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_timestamp_field_in_object_is_valid() {\n        serde_json::from_str::<DocMapper>(\n            r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"some_obj\",\n                   
 \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"timestamp\",\n                            \"type\": \"datetime\",\n                            \"fast\": true\n                        }\n                    ]\n                }\n            ],\n            \"timestamp_field\": \"some_obj.timestamp\"\n        }\"#,\n        )\n        .unwrap();\n\n        serde_yaml::from_str::<DocMapper>(\n            r#\"\n            field_mappings:\n              - name: some_obj\n                type: object\n                field_mappings:\n                  - name: timestamp\n                    type: datetime\n                    fast: true\n            timestamp_field: some_obj.timestamp\n        \"#,\n        )\n        .unwrap();\n    }\n\n    #[test]\n    fn test_timestamp_field_with_dots_in_its_name_is_valid() {\n        serde_json::from_str::<DocMapper>(\n            r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"my.timestamp\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                }\n            ],\n            \"timestamp_field\": \"my\\\\.timestamp\"\n        }\"#,\n        )\n        .unwrap();\n\n        serde_yaml::from_str::<DocMapper>(\n            r#\"\n            field_mappings:\n              - name: my.timestamp\n                type: datetime\n                fast: true\n            timestamp_field: \"my\\\\.timestamp\"\n        \"#,\n        )\n        .unwrap();\n    }\n\n    #[test]\n    fn test_timestamp_field_that_start_with_dot_is_invalid() {\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"my.timestamp\",\n                        \"type\": \"datetime\",\n                        \"fast\": true\n                    }\n                ],\n 
               \"timestamp_field\": \".my.timestamp\"\n            }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"timestamp field `.my.timestamp` should not start with a `.`\",\n        );\n\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"my.timestamp\",\n                        \"type\": \"datetime\",\n                        \"fast\": true\n                    }\n                ],\n                \"timestamp_field\": \"\\\\.my\\\\.timestamp\"\n            }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"timestamp field `\\\\.my\\\\.timestamp` should not start with a `.`\",\n        )\n    }\n\n    #[test]\n    fn test_timestamp_field_that_ends_with_dot_is_invalid() {\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                    \"timestamp_field\": \"my.timestamp.\"\n                }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"timestamp field `my.timestamp.` should not end with a `.`\",\n        );\n\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                    \"timestamp_field\": \"my\\\\.timestamp\\\\.\"\n                }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"timestamp field `my\\\\.timestamp\\\\.` should not end with a `.`\",\n        )\n    }\n\n    #[test]\n    fn test_tag_field_name_that_starts_with_dot_is_invalid() {\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                    \"tag_fields\": [\".my.tag\"]\n                }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"tag field `.my.tag` should not start with a `.`\",\n        );\n\n        
assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                    \"tag_fields\": [\"\\\\.my\\\\.tag\"]\n                }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"tag field `\\\\.my\\\\.tag` should not start with a `.`\",\n        )\n    }\n\n    #[test]\n    fn test_tag_field_name_that_ends_with_dot_is_invalid() {\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                    \"tag_fields\": [\"my.tag.\"]\n                }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"tag field `my.tag.` should not end with a `.`\",\n        );\n\n        assert_eq!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                    \"tag_fields\": [\"my\\\\.tag\\\\.\"]\n                }\"#,\n            )\n            .unwrap_err()\n            .to_string(),\n            \"tag field `my\\\\.tag\\\\.` should not end with a `.`\",\n        )\n    }\n\n    #[test]\n    fn test_fail_to_build_doc_mapper_with_timestamp_field_with_multivalues_cardinality() {\n        let doc_mapper = r#\"{\n            \"timestamp_field\": \"timestamp\",\n            \"tag_fields\": [],\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"array<i64>\"\n                }\n            ]\n        }\"#;\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let expected_msg = \"timestamp field `timestamp` should be a datetime field\";\n        assert_eq!(&builder.try_build().unwrap_err().to_string(), &expected_msg);\n    }\n\n    #[test]\n    fn test_fail_to_build_doc_mapper_with_non_fast_timestamp_field() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"timestamp_field\": \"timestamp\",\n            \"tag_fields\": [],\n            
\"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"datetime\",\n                    \"fast\": false\n                }\n            ]\n        }\"#;\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let expected_msg = \"timestamp field `timestamp` should be a fast field\";\n        assert_eq!(&builder.try_build().unwrap_err().to_string(), &expected_msg);\n    }\n\n    #[test]\n    fn test_fail_to_build_doc_mapper_with_duplicate_fields() {\n        {\n            let doc_mapper = r#\"{\n                \"field_mappings\": [\n                    {\"name\": \"body\",\"type\": \"text\"},\n                    {\"name\": \"body\",\"type\": \"bytes\"}\n                ]\n            }\"#;\n            let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n            let expected_msg = \"duplicated field definition `body`\";\n            assert_eq!(&builder.try_build().unwrap_err().to_string(), expected_msg);\n        }\n\n        {\n            let doc_mapper = r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"identity\",\n                        \"type\": \"object\",\n                        \"field_mappings\": [\n                            {\"type\": \"text\", \"name\": \"username\"},\n                            {\"type\": \"text\", \"name\": \"username\"}\n                        ]\n                    },\n                    {\"type\": \"text\", \"name\": \"body\"}\n                ]\n            }\"#;\n            let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n            let expected_msg = \"duplicated field definition `username`\";\n            assert_eq!(&builder.try_build().unwrap_err().to_string(), expected_msg);\n        }\n    }\n\n    #[test]\n    fn test_should_build_doc_mapper_with_duplicate_fields_at_different_level() {\n        let 
doc_mapper = r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"identity\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\"type\": \"text\", \"name\": \"body\"},\n                        {\"type\": \"text\", \"name\": \"username\"}\n                    ]\n                },\n                {\"type\": \"text\", \"name\": \"body\"}\n            ]\n        }\"#;\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        assert!(builder.try_build().is_ok());\n    }\n\n    #[test]\n    fn test_fail_to_build_doc_mapper_with_multivalued_timestamp_field() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"timestamp_field\": \"timestamp\",\n            \"tag_fields\": [],\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"array<datetime>\",\n                    \"fast\": true\n                }\n            ]\n        }\"#;\n\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let expected_msg = \"timestamp field `timestamp` should be single-valued\";\n        assert_eq!(&builder.try_build().unwrap_err().to_string(), expected_msg);\n    }\n\n    #[test]\n    fn test_fail_with_field_name_equal_to_source() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"tag_fields\": [],\n            \"field_mappings\": [\n                {\n                    \"name\": \"_source\",\n                    \"type\": \"i64\"\n                }\n            ]\n        }\"#;\n        let deser_err = serde_json::from_str::<DocMapperBuilder>(doc_mapper)\n            .err()\n            .unwrap();\n        assert!(\n            deser_err\n                .to_string()\n                .contains(\"the following fields are reserved for Quickwit 
internal usage\")\n        );\n    }\n\n    #[test]\n    fn test_fail_to_parse_document_with_wrong_base64_value() -> anyhow::Result<()> {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"timestamp_field\": null,\n            \"tag_fields\": [],\n            \"field_mappings\": [\n                {\n                    \"name\": \"image\",\n                    \"type\": \"bytes\",\n                    \"stored\": true\n                }\n            ]\n        }\"#;\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper)?;\n        let doc_mapper = builder.try_build()?;\n        let result = doc_mapper.doc_from_json_str(\n            r#\"{\n            \"image\": \"invalid base64 data\"\n        }\"#,\n        );\n        let expected_msg = \"the field `image` could not be parsed: expected base64 string, got \\\n                            `invalid base64 data`: Invalid symbol 32, offset 7.\";\n        assert_eq!(result.unwrap_err().to_string(), expected_msg);\n        Ok(())\n    }\n\n    #[test]\n    fn test_parse_document_with_tag_fields() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"index_field_presence\": true,\n            \"timestamp_field\": null,\n            \"tag_fields\": [\"city\"],\n            \"store_source\": true,\n            \"field_mappings\": [\n                {\n                    \"name\": \"city\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"image\",\n                    \"type\": \"bytes\",\n                    \"stored\": true\n                }\n            ]\n        }\"#;\n\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let doc_mapper = builder.try_build().unwrap();\n        let schema = doc_mapper.schema();\n        let 
json_doc_value: JsonValue = serde_json::json!({\n            \"city\": \"tokio\",\n            \"image\": \"YWJj\"\n        });\n        let (_, document) = doc_mapper\n            .doc_from_json_obj(json_doc_value.as_object().unwrap().clone(), 0)\n            .unwrap();\n\n        // 2 properties, + 1 value for \"_source\" + 2 for field presence.\n        assert_eq!(document.len(), 5);\n        let expected_json_paths_and_values: HashMap<String, JsonValue> = serde_json::from_str(\n            r#\"{\n                \"city\": [\"tokio\"],\n                \"image\": [\"YWJj\"]\n            }\"#,\n        )\n        .unwrap();\n        let mut field_presences: HashSet<u64> = HashSet::default();\n        document.field_values().for_each(|(field, value)| {\n            let owned_value: OwnedValue = value.into();\n            let field_name = schema.get_field_name(field);\n            if field_name == SOURCE_FIELD_NAME {\n                assert_eq!(\n                    tantivy::schema::OwnedValue::from(value.as_value()),\n                    tantivy::schema::OwnedValue::from(json_doc_value.as_object().unwrap().clone())\n                );\n            } else if field_name == FIELD_PRESENCE_FIELD_NAME {\n                let field_value_hash = value.as_u64().unwrap();\n                field_presences.insert(field_value_hash);\n            } else {\n                let value = serde_json::to_string(&owned_value).unwrap();\n                let is_value_in_expected_values = expected_json_paths_and_values\n                    .get(field_name)\n                    .unwrap()\n                    .as_array()\n                    .unwrap()\n                    .iter()\n                    .map(|expected_value| format!(\"{expected_value}\"))\n                    .any(|expected_value| expected_value == value);\n                assert!(is_value_in_expected_values);\n            }\n        });\n        assert_eq!(field_presences.len(), 2);\n        let city_field = 
schema.get_field(\"city\").unwrap();\n        let image_field = schema.get_field(\"image\").unwrap();\n        assert!(field_presences.contains(&PathHasher::hash_path(&[\n            &city_field.field_id().to_le_bytes()\n        ])));\n        assert!(field_presences.contains(&PathHasher::hash_path(&[\n            &image_field.field_id().to_le_bytes()\n        ])));\n    }\n\n    #[test]\n    fn test_partition_key_in_tags() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"timestamp_field\": null,\n            \"tag_fields\": [\"city\"],\n            \"store_source\": true,\n            \"partition_key\": \"hash_mod((service,division,city), 50)\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"city\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"division\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"service\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                }\n            ]\n        }\"#;\n\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let doc_mapper = builder.try_build().unwrap();\n        let tag_fields: Vec<_> = doc_mapper.tag_field_names.into_iter().collect();\n        assert_eq!(tag_fields, vec![\"city\", \"division\", \"service\",]);\n    }\n\n    #[test]\n    fn test_partition_key_in_tags_without_explicit_tags() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"timestamp_field\": null,\n            \"store_source\": true,\n            \"partition_key\": \"service,hash_mod((division,city), 50)\",\n 
           \"field_mappings\": [\n                {\n                    \"name\": \"city\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"division\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"service\",\n                    \"type\": \"text\",\n                    \"stored\": true,\n                    \"tokenizer\": \"raw\"\n                }\n            ]\n        }\"#;\n\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let doc_mapper = builder.try_build().unwrap();\n        let tag_fields: Vec<_> = doc_mapper.tag_field_names.into_iter().collect();\n        assert_eq!(tag_fields, vec![\"city\", \"division\", \"service\",]);\n    }\n\n    #[test]\n    fn test_build_doc_mapper_with_tag_field_with_dots_in_its_name() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"tag_fields\": [\"my\\\\.city\\\\.id\"],\n            \"field_mappings\": [\n                {\n                    \"name\": \"my.city.id\",\n                    \"type\": \"u64\"\n                }\n            ]\n        }\"#;\n        serde_json::from_str::<DocMapper>(doc_mapper).unwrap();\n    }\n\n    #[test]\n    fn test_build_doc_mapper_with_tag_field_in_object() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"tag_fields\": [\"location.city\"],\n            \"field_mappings\": [\n                {\n                    \"name\": \"location\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"city\",\n                            \"type\": \"u64\"\n                    
    }\n                    ]\n                }\n            ]\n        }\"#;\n        serde_json::from_str::<DocMapper>(doc_mapper).unwrap();\n    }\n\n    #[test]\n    fn test_fail_to_build_doc_mapper_with_wrong_tag_fields_types() -> anyhow::Result<()> {\n        let doc_mapper_one = r#\"{\n            \"default_search_fields\": [],\n            \"tag_fields\": [\"city\"],\n            \"field_mappings\": [\n                {\n                    \"name\": \"city\",\n                    \"type\": \"text\"\n                }\n            ]\n        }\"#;\n        assert_eq!(\n            serde_json::from_str::<DocMapperBuilder>(doc_mapper_one)?\n                .try_build()\n                .unwrap_err()\n                .to_string(),\n            \"tags collection is only allowed on text fields with the `raw` tokenizer\".to_string(),\n        );\n\n        let doc_mapper_two = r#\"{\n            \"default_search_fields\": [],\n            \"tag_fields\": [\"photo\"],\n            \"field_mappings\": [\n                {\n                    \"name\": \"photo\",\n                    \"type\": \"bytes\"\n                }\n            ]\n        }\"#;\n        assert_eq!(\n            serde_json::from_str::<DocMapperBuilder>(doc_mapper_two)?\n                .try_build()\n                .unwrap_err()\n                .to_string(),\n            \"tags collection is not allowed on `bytes` fields\".to_string(),\n        );\n        Ok(())\n    }\n\n    // See #1132\n    #[test]\n    fn test_by_default_store_source_is_false_and_fields_are_stored_individually() {\n        let doc_mapper = r#\"{\n            \"default_search_fields\": [],\n            \"field_mappings\": [\n                {\n                    \"name\": \"my-field\",\n                    \"type\": \"u64\",\n                    \"indexed\": true\n                }\n            ]\n        }\"#;\n        let builder = serde_json::from_str::<DocMapperBuilder>(doc_mapper).unwrap();\n        let 
default_doc_mapper = builder.try_build().unwrap();\n        assert!(default_doc_mapper.source_field.is_none());\n        let schema = default_doc_mapper.schema();\n        let field = schema.get_field(\"my-field\").unwrap();\n        let field_entry = schema.get_field_entry(field);\n        assert!(field_entry.is_stored());\n    }\n\n    #[test]\n    fn test_lenient_mode_schema() {\n        let default_doc_mapper: DocMapper =\n            serde_json::from_str(r#\"{ \"mode\": \"lenient\" }\"#).unwrap();\n        let schema = default_doc_mapper.schema();\n        assert_eq!(schema.num_fields(), 1);\n        assert!(default_doc_mapper.default_search_field_names.is_empty());\n    }\n\n    #[test]\n    fn test_dynamic_mode_schema() {\n        let default_doc_mapper: DocMapper =\n            serde_json::from_str(r#\"{ \"mode\": \"dynamic\" }\"#).unwrap();\n        let schema = default_doc_mapper.schema();\n        assert_eq!(schema.num_fields(), 2);\n        let dynamic_field = schema.get_field(DYNAMIC_FIELD_NAME).unwrap();\n        let dynamic_field_entry = schema.get_field_entry(dynamic_field);\n        assert_eq!(dynamic_field_entry.field_type().value_type(), Type::Json);\n        // the dynamic field will be added implicitly at search time.\n        assert!(default_doc_mapper.default_search_field_names.is_empty());\n    }\n\n    #[test]\n    fn test_dynamic_mode_schema_not_indexed() {\n        let default_doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"mode\": \"dynamic\",\n            \"dynamic_mapping\": {\n                \"indexed\": false,\n                \"stored\": true\n            }\n        }\"#,\n        )\n        .unwrap();\n        let schema = default_doc_mapper.schema();\n        assert_eq!(schema.num_fields(), 2);\n        let dynamic_field = schema.get_field(DYNAMIC_FIELD_NAME).unwrap();\n        let dynamic_field_entry = schema.get_field_entry(dynamic_field);\n        let FieldType::JsonObject(json_opt) = 
dynamic_field_entry.field_type() else {\n            panic!(\"Expected a json object\");\n        };\n        assert_eq!(json_opt.is_indexed(), false);\n        assert!(default_doc_mapper.default_search_field_names.is_empty());\n    }\n\n    #[test]\n    fn test_strict_mode_simple() {\n        let default_doc_mapper: DocMapper =\n            serde_json::from_str(r#\"{ \"mode\": \"strict\" }\"#).unwrap();\n        let parsing_err = default_doc_mapper\n            .doc_from_json_str(r#\"{ \"a\": { \"b\": 5, \"c\": 6 } }\"#)\n            .err()\n            .unwrap();\n        assert!(\n            matches!(parsing_err, DocParsingError::NoSuchFieldInSchema(field_name) if field_name == \"a\")\n        );\n    }\n\n    #[test]\n    fn test_strict_mode_inner() {\n        let default_doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"some_obj\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"child_a\",\n                            \"type\": \"text\"\n                        }\n                    ]\n                }\n            ],\n            \"mode\": \"strict\"\n        }\"#,\n        )\n        .unwrap();\n        assert!(\n            default_doc_mapper\n                .doc_from_json_str(r#\"{ \"some_obj\": { \"child_a\": \"hello\" } }\"#)\n                .is_ok()\n        );\n        let parsing_err = default_doc_mapper\n            .doc_from_json_str(r#\"{ \"some_obj\": { \"child_a\": \"hello\", \"child_b\": 6 } }\"#)\n            .err()\n            .unwrap();\n        assert!(\n            matches!(parsing_err, DocParsingError::NoSuchFieldInSchema(field_name) if field_name == \"some_obj.child_b\")\n        );\n    }\n\n    #[test]\n    fn test_lenient_mode_simple() {\n        let default_doc_mapper: DocMapper =\n            serde_json::from_str(r#\"{ \"mode\": 
\"lenient\" }\"#).unwrap();\n        let (_, doc) = default_doc_mapper\n            .doc_from_json_str(r#\"{ \"a\": { \"b\": 5, \"c\": 6 } }\"#)\n            .unwrap();\n        assert_eq!(doc.len(), 0);\n    }\n\n    #[track_caller]\n    fn test_doc_from_json_test_aux(\n        doc_mapper_json: &str,\n        field: &str,\n        document_json: &str,\n        expected_values: Vec<TantivyValue>,\n    ) {\n        let default_doc_mapper: DocMapper = serde_json::from_str(doc_mapper_json).unwrap();\n        let schema = default_doc_mapper.schema();\n        let field = schema.get_field(field).unwrap();\n        let (_, doc) = default_doc_mapper.doc_from_json_str(document_json).unwrap();\n\n        let values: Vec<OwnedValue> = doc.get_all(field).map(|value| value.into()).collect();\n        assert_eq!(values.len(), expected_values.len());\n\n        for (value, expected_value) in zip(values, expected_values) {\n            assert_eq!(value, expected_value);\n        }\n    }\n\n    #[test]\n    fn test_dynamic_mode_simple() {\n        test_doc_from_json_test_aux(\n            r#\"{ \"mode\": \"dynamic\" }\"#,\n            DYNAMIC_FIELD_NAME,\n            r#\"{ \"a\": { \"b\": 5, \"c\": 6 } }\"#,\n            vec![\n                json!({\n                    \"a\": {\n                        \"b\": 5,\n                        \"c\": 6\n                    }\n                })\n                .into(),\n            ],\n        );\n    }\n\n    #[test]\n    fn test_dynamic_mode_inner() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_obj\",\n                        \"type\": \"object\",\n                        \"field_mappings\": [\n                            {\n                                \"name\": \"child_a\",\n                                \"type\": \"text\"\n                            }\n                        ]\n                    }\n      
          ],\n                \"mode\": \"dynamic\"\n            }\"#,\n            DYNAMIC_FIELD_NAME,\n            r#\"{ \"some_obj\": { \"child_a\": \"\", \"child_b\": {\"c\": 3} }, \"some_obj2\": 4 }\"#,\n            vec![\n                json!({\n                    \"some_obj\": {\n                        \"child_b\": {\n                            \"c\": 3\n                        }\n                    },\n                    \"some_obj2\": 4\n                })\n                .into(),\n            ],\n        );\n    }\n\n    #[test]\n    fn test_json_object_in_mapping() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_obj\",\n                        \"type\": \"object\",\n                        \"field_mappings\": [\n                            {\n                                \"name\": \"json_obj\",\n                                \"type\": \"json\"\n                            }\n                        ]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"some_obj.json_obj\",\n            r#\"{ \"some_obj\": { \"json_obj\": {\"hello\": 2} } }\"#,\n            vec![\n                json!({\n                    \"hello\": 2\n                })\n                .into(),\n            ],\n        );\n    }\n\n    #[test]\n    fn test_reject_invalid_concatenate_field() {\n        assert!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"inexistent_field\"]\n                    }\n                ]\n            }\"#\n            )\n            .unwrap_err()\n            .to_string()\n            .contains(\"uses an unknown field\")\n        );\n   
     assert!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"include_dynamic_fields\": true\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#\n            )\n            .unwrap_err()\n            .to_string()\n            .contains(\n                \"concatenate field has `include_dynamic_fields` set, but index isn't dynamic\"\n            )\n        );\n        assert!(\n            serde_json::from_str::<DocMapper>(\n                r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\"\n                    }\n                ]\n            }\"#\n            )\n            .unwrap_err()\n            .to_string()\n            .contains(\"concatenate type must have at least one sub-field\")\n        );\n    }\n\n    #[test]\n    fn test_concatenate_field_in_default_field() {\n        serde_json::from_str::<DocMapper>(\n            r#\"{\n                \"default_search_fields\": [\"concat\"],\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_text\",\n                        \"type\": \"text\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n        )\n        .unwrap();\n    }\n\n    #[test]\n    fn test_concatenate_field_in_mapping() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        
\"name\": \"some_text\",\n                        \"type\": \"text\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_text\": \"this is a text\"}\"#,\n            vec![\"this is a text\".into()],\n        );\n    }\n\n    #[test]\n    fn test_concatenate_field_in_mapping_dynamic() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"include_dynamic_fields\": true\n                    }\n                ],\n                \"mode\": \"dynamic\"\n            }\"#,\n            \"concat\",\n            r#\"{\"other_field\": \"this is a text\"}\"#,\n            vec![\"this is a text\".into()],\n        );\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"include_dynamic_fields\": true\n                    }\n                ],\n                \"mode\": \"dynamic\"\n            }\"#,\n            \"concat\",\n            r#\"{\"first_field\": \"this is a text\", \"second_field\": \"this is a text field too\"}\"#,\n            vec![\"this is a text\".into(), \"this is a text field too\".into()],\n        );\n    }\n\n    #[test]\n    fn test_concatenate_field_in_mapping_integer() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_int\",\n                        \"type\": 
\"u64\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_int\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_int\": 25}\"#,\n            vec![25_u64.into()],\n        );\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"include_dynamic_fields\": true\n                    }\n                ],\n                \"mode\": \"dynamic\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_int\": 25}\"#,\n            // i64 comes before u64\n            vec![25_i64.into()],\n        );\n    }\n\n    #[test]\n    fn test_concatenate_field_in_mapping_boolean() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_bool\",\n                        \"type\": \"bool\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_bool\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_bool\": false}\"#,\n            vec![false.into()],\n        );\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"include_dynamic_fields\": true\n                    }\n                ],\n            
    \"mode\": \"dynamic\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_bool\": true}\"#,\n            vec![true.into()],\n        );\n    }\n\n    #[test]\n    fn test_concatenate_field_array() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_text\",\n                        \"type\": \"array<text>\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_text\": [\"this is a text\", \"this is a text too\"]}\"#,\n            vec![\"this is a text\".into(), \"this is a text too\".into()],\n        );\n    }\n\n    #[test]\n    fn test_concatenate_multiple_field() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_text\",\n                        \"type\": \"text\"\n                    },\n                    {\n                        \"name\": \"other_text\",\n                        \"type\": \"text\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\", \"other_text\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{\"some_text\": \"this is a text\", \"other_text\": \"this is a text too\"}\"#,\n            vec![\"this is a text\".into(), \"this is a text too\".into()],\n        );\n    }\n\n    #[test]\n    fn test_concatenate_field_object() {\n        
test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_obj\",\n                        \"type\": \"object\",\n                        \"field_mappings\": [\n                            {\n                                \"name\": \"json_obj\",\n                                \"type\": \"json\"\n                            }\n                        ]\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_obj.json_obj\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{ \"some_obj\": { \"json_obj\": {\"hello\": \"world\"} } }\"#,\n            vec![\"world\".into()],\n        );\n    }\n\n    /*\n     * In the future we may want to make this work. Currently it isn't supported and fails at index\n     * creation\n    #[test]\n    fn test_concatenate_field_json_subpath() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"json_obj\",\n                        \"type\": \"json\"\n                    },\n                    {\n                        \"name\": \"concat\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"json_obj.hello\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat\",\n            r#\"{ \"json_obj\": { \"hello\": \"1\", \"world\": \"2\"} }\"#,\n            vec![\"1\".into()],\n        );\n    }\n    */\n\n    #[test]\n    fn test_concatenate_field_text() {\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n               
         \"name\": \"some_text\",\n                        \"type\": \"text\"\n                    },\n                    {\n                        \"name\": \"concat1\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    },\n                    {\n                        \"name\": \"concat2\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat1\",\n            r#\"{\"some_text\": \"this is a text\"}\"#,\n            vec![\"this is a text\".into()],\n        );\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"field_mappings\": [\n                    {\n                        \"name\": \"some_text\",\n                        \"type\": \"text\"\n                    },\n                    {\n                        \"name\": \"concat1\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    },\n                    {\n                        \"name\": \"concat2\",\n                        \"type\": \"concatenate\",\n                        \"concatenate_fields\": [\"some_text\"]\n                    }\n                ],\n                \"mode\": \"strict\"\n            }\"#,\n            \"concat2\",\n            r#\"{\"some_text\": \"this is a text\"}\"#,\n            vec![\"this is a text\".into()],\n        );\n    }\n\n    #[test]\n    fn test_length_field() {\n        let raw_doc = r#\"{ \"some_obj\": { \"json_obj\": {\"hello\": 2} } }\"#;\n        test_doc_from_json_test_aux(\n            r#\"{\n                \"document_length\": true,\n                \"mode\": \"dynamic\"\n            }\"#,\n            DOCUMENT_SIZE_FIELD_NAME,\n            raw_doc,\n            
vec![(raw_doc.len() as u64).into()],\n        );\n    }\n\n    fn default_doc_mapper_query_aux(doc_mapper: &DocMapper, query: &str) -> Result<String, String> {\n        let query_ast = query_ast_from_user_text(query, None)\n            .parse_user_query(doc_mapper.default_search_fields())\n            .map_err(|err| err.to_string())?;\n        let (query, _) = doc_mapper\n            .query(doc_mapper.schema(), query_ast, true, None)\n            .map_err(|err| err.to_string())?;\n        Ok(format!(\"{query:?}\"))\n    }\n\n    #[test]\n    fn test_doc_mapper_sub_field_query_on_non_json_field_should_error() {\n        let doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"field_mappings\": [{\"name\": \"body\", \"type\": \"text\"}],\n            \"mode\": \"dynamic\"\n        }\"#,\n        )\n        .unwrap();\n        assert_eq!(\n            default_doc_mapper_query_aux(&doc_mapper, \"body.wrong_field:hello\").unwrap_err(),\n            \"invalid query: field does not exist: `body.wrong_field`\"\n        );\n    }\n\n    #[test]\n    fn test_doc_mapper_accept_sub_field_query_on_json_field() {\n        let doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"field_mappings\": [{\"name\": \"body\", \"type\": \"json\"}],\n            \"mode\": \"dynamic\"\n        }\"#,\n        )\n        .unwrap();\n        assert_eq!(\n            default_doc_mapper_query_aux(&doc_mapper, \"body.dynamic_field:hello\"),\n            Ok(\n                r#\"TermQuery(Term(field=2, type=Json, path=dynamic_field, type=Str, \"hello\"))\"#\n                    .to_string()\n            )\n        );\n    }\n\n    #[test]\n    fn test_doc_mapper_object_dot_collision_with_object_field() {\n        let doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"identity\",\n                    \"type\": \"object\",\n                    
\"field_mappings\": [{\"type\": \"text\", \"name\": \"username\"}]\n                },\n                {\"type\": \"text\", \"name\": \"identity.username\"}\n            ]\n        }\"#,\n        )\n        .unwrap();\n        assert_eq!(\n            default_doc_mapper_query_aux(&doc_mapper, \"identity.username:toto\").unwrap(),\n            r#\"TermQuery(Term(field=2, type=Str, \"toto\"))\"#\n        );\n        assert_eq!(\n            default_doc_mapper_query_aux(&doc_mapper, r\"identity\\.username:toto\").unwrap(),\n            r#\"TermQuery(Term(field=3, type=Str, \"toto\"))\"#\n        );\n    }\n\n    #[test]\n    fn test_doc_mapper_object_dot_collision_with_json_field() {\n        let doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"field_mappings\": [\n                {\"name\": \"identity\", \"type\": \"json\"},\n                {\"type\": \"text\", \"name\": \"identity.username\"}\n            ]\n        }\"#,\n        )\n        .unwrap();\n        assert_eq!(\n            default_doc_mapper_query_aux(&doc_mapper, \"identity.username:toto\").unwrap(),\n            r#\"TermQuery(Term(field=2, type=Json, path=username, type=Str, \"toto\"))\"#\n        );\n        assert_eq!(\n            default_doc_mapper_query_aux(&doc_mapper, r\"identity\\.username:toto\").unwrap(),\n            r#\"TermQuery(Term(field=3, type=Str, \"toto\"))\"#\n        );\n    }\n\n    #[test]\n    fn test_doc_mapper_default_tokenizers() {\n        let doc_mapper: DocMapper = serde_json::from_str(\n            r#\"{\n            \"field_mappings\": [\n                {\"name\": \"json_field\", \"type\": \"json\"},\n                {\"name\": \"text_field\", \"type\": \"text\"}\n            ]\n        }\"#,\n        )\n        .unwrap();\n        let schema = doc_mapper.schema();\n\n        {\n            let json_field = schema.get_field(\"json_field\").unwrap();\n            let FieldType::JsonObject(json_options) =\n                
schema.get_field_entry(json_field).field_type()\n            else {\n                panic!()\n            };\n            let text_indexing_options = json_options.get_text_indexing_options().unwrap();\n            assert_eq!(text_indexing_options.tokenizer(), RAW_TOKENIZER_NAME);\n            assert_eq!(\n                text_indexing_options.index_option(),\n                IndexRecordOption::Basic\n            );\n        }\n\n        {\n            let text_field = schema.get_field(\"text_field\").unwrap();\n            let FieldType::Str(text_options) = schema.get_field_entry(text_field).field_type()\n            else {\n                panic!()\n            };\n            assert_eq!(\n                text_options.get_indexing_options().unwrap().tokenizer(),\n                DEFAULT_TOKENIZER_NAME\n            );\n        }\n    }\n\n    #[test]\n    fn test_find_field_mapping_type() {\n        let mapper = serde_json::from_str::<DocMapper>(\n            r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"some_obj\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"timestamp\",\n                            \"type\": \"datetime\",\n                            \"fast\": true\n                        },\n                        {\n                            \"name\": \"object2\",\n                            \"type\": \"object\",\n                            \"field_mappings\": [\n                                {\n                                    \"name\": \"id\",\n                                    \"type\": \"u64\"\n                                },\n                                {\n                                    \"name\": \"my.id\",\n                                    \"type\": \"u64\"\n                                }\n                            ]\n                        }\n                    ]\n 
               },\n                {\n                    \"name\": \"my.timestamp\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                }\n            ]\n        }\"#,\n        )\n        .unwrap();\n        mapper\n            .field_mappings\n            .find_field_mapping_type(\"some_obj.timestamp\")\n            .unwrap();\n        mapper\n            .field_mappings\n            .find_field_mapping_type(\"some_obj.object2.id\")\n            .unwrap();\n        mapper\n            .field_mappings\n            .find_field_mapping_type(\"some_obj.object2\")\n            .unwrap();\n        mapper\n            .field_mappings\n            .find_field_mapping_type(\"some_obj.object2.my\\\\.id\")\n            .unwrap();\n        mapper\n            .field_mappings\n            .find_field_mapping_type(\"my\\\\.timestamp\")\n            .unwrap();\n    }\n\n    #[test]\n    fn test_build_doc_mapper_with_custom_ngram_tokenizer() {\n        let mapper = serde_json::from_str::<DocMapper>(\n            r#\"{\n            \"tokenizers\": [\n                {\n                    \"name\": \"my_tokenizer\",\n                    \"filters\": [\"lower_caser\", \"ascii_folding\", \"remove_long\"],\n                    \"type\": \"ngram\",\n                    \"min_gram\": 3,\n                    \"max_gram\": 5\n                }\n            ],\n            \"field_mappings\": [\n                {\n                    \"name\": \"my_text\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"my_tokenizer\"\n                }\n            ]\n        }\"#,\n        )\n        .unwrap();\n        let field_mapping_type = mapper\n            .field_mappings\n            .find_field_mapping_type(\"my_text\")\n            .unwrap();\n        match &field_mapping_type {\n            super::FieldMappingType::Text(options, _) => {\n                assert!(options.indexing_options.is_some());\n                
let tokenizer = &options.indexing_options.as_ref().unwrap().tokenizer;\n                assert_eq!(tokenizer.name(), \"my_tokenizer\");\n            }\n            _ => panic!(\"Expected a text field\"),\n        }\n        assert!(\n            mapper\n                .tokenizer_manager()\n                .get_tokenizer(\"my_tokenizer\")\n                .is_some()\n        );\n    }\n\n    #[test]\n    fn test_build_doc_mapper_should_fail_with_unknown_tokenizer() {\n        let mapper_builder = serde_json::from_str::<DocMapperBuilder>(\n            r#\"{\n            \"field_mappings\": [\n                {\n                    \"name\": \"my_text\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"my_tokenizer\"\n                }\n            ]\n        }\"#,\n        )\n        .unwrap();\n        let mapper = mapper_builder.try_build();\n        let error_msg = mapper.unwrap_err().to_string();\n        assert!(error_msg.contains(\"unknown tokenizer\"));\n    }\n\n    #[test]\n    fn test_build_doc_mapper_tokenizer_manager_with_custom_tokenizer() {\n        let mapper = serde_json::from_str::<DocMapper>(\n            r#\"{\n            \"tokenizers\": [\n                {\n                    \"name\": \"my_tokenizer\",\n                    \"filters\": [\"lower_caser\"],\n                    \"type\": \"ngram\",\n                    \"min_gram\": 3,\n                    \"max_gram\": 5\n                }\n            ],\n            \"field_mappings\": [\n                {\n                    \"name\": \"my_text\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"my_tokenizer\"\n                }\n            ]\n        }\"#,\n        )\n        .unwrap();\n        let mut tokenizer = mapper\n            .tokenizer_manager()\n            .get_tokenizer(\"my_tokenizer\")\n            .unwrap();\n        let mut token_stream = tokenizer.token_stream(\"HELLO WORLD\");\n        
assert_eq!(token_stream.next().unwrap().text, \"hel\");\n        assert_eq!(token_stream.next().unwrap().text, \"hell\");\n        assert_eq!(token_stream.next().unwrap().text, \"hello\");\n    }\n\n    #[test]\n    fn test_build_doc_mapper_with_custom_invalid_regex_tokenizer() {\n        let mapper_builder = serde_json::from_str::<DocMapperBuilder>(\n            r#\"{\n            \"tokenizers\": [\n                {\n                    \"name\": \"my_tokenizer\",\n                    \"type\": \"regex\",\n                    \"pattern\": \"(my_pattern\"\n                }\n            ],\n            \"field_mappings\": [\n                {\n                    \"name\": \"my_text\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"my_tokenizer\"\n                }\n            ]\n        }\"#,\n        )\n        .unwrap();\n        let mapper = mapper_builder.try_build();\n        assert!(mapper.is_err());\n        let error_mesg = mapper.unwrap_err().to_string();\n        assert!(error_mesg.contains(\"invalid regex tokenizer\"));\n    }\n\n    #[test]\n    fn test_doc_mapper_with_custom_tokenizer_equivalent_to_default() {\n        let mapper = serde_json::from_str::<DocMapper>(\n            r#\"{\n            \"tokenizers\": [\n                {\n                    \"name\": \"my_tokenizer\",\n                    \"filters\": [\"remove_long\", \"lower_caser\"],\n                    \"type\": \"simple\",\n                    \"min_gram\": 3,\n                    \"max_gram\": 5\n                }\n            ],\n            \"field_mappings\": [\n                {\n                    \"name\": \"my_text\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"my_tokenizer\"\n                }\n            ]\n        }\"#,\n        )\n        .unwrap();\n        let mut default_tokenizer = mapper.tokenizer_manager().get_tokenizer(\"default\").unwrap();\n        let mut tokenizer = mapper\n            
.tokenizer_manager()\n            .get_tokenizer(\"my_tokenizer\")\n            .unwrap();\n        let text = \"I've seen things... seen things you little people wouldn't believe.\";\n        let mut default_token_stream = default_tokenizer.token_stream(text);\n        let mut token_stream = tokenizer.token_stream(text);\n        for _ in 0..10 {\n            assert_eq!(\n                default_token_stream.next().unwrap().text,\n                token_stream.next().unwrap().text\n            );\n        }\n    }\n\n    #[test]\n    fn test_deserialize_doc_after_mapping_change_json_to_obj() {\n        use serde::Deserialize;\n        use tantivy::Document;\n\n        let old_mapper = json!({\n            \"field_mappings\": [\n                {\"name\": \"body\", \"type\": \"json\"}\n            ]\n        });\n\n        let builder = DocMapperBuilder::deserialize(old_mapper.clone()).unwrap();\n        let old_mapper = builder.try_build().unwrap();\n\n        let JsonValue::Object(doc) = json!({\n            \"body\": {\n                \"field.1\": \"hola\",\n                \"field2\": {\n                    \"key\": \"val\",\n                    \"arr\": [1,\"abc\", {\"k\": \"v\"}],\n                },\n                \"field3\": [\"a\", \"b\"]\n            }\n        }) else {\n            panic!();\n        };\n        let tantivy_doc = old_mapper.doc_from_json_obj(doc.clone(), 0).unwrap().1;\n        let named_doc = tantivy_doc.to_named_doc(&old_mapper.schema());\n\n        let new_mapper = json!({\n            \"field_mappings\": [\n                {\n                    \"name\": \"body\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\"name\": \"field.1\", \"type\": \"text\"},\n                        {\"name\": \"field2\", \"type\": \"json\"},\n                        {\"name\": \"field3\", \"type\": \"array<text>\"},\n                    ]\n                }\n            ]\n        });\n 
       let builder = DocMapperBuilder::deserialize(new_mapper).unwrap();\n        let new_mapper = builder.try_build().unwrap();\n\n        assert_eq!(new_mapper.doc_to_json(named_doc.0).unwrap(), doc);\n    }\n\n    #[test]\n    fn test_deserialize_doc_after_mapping_change_obj_to_json() {\n        use serde::Deserialize;\n        use tantivy::Document;\n\n        let old_mapper = json!({\n            \"field_mappings\": [\n                {\n                    \"name\": \"body\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\"name\": \"field.1\", \"type\": \"text\"},\n                        {\"name\": \"field2\", \"type\": \"json\"},\n                        {\"name\": \"field3\", \"type\": \"array<text>\"},\n                    ]\n                }\n            ]\n        });\n\n        let builder = DocMapperBuilder::deserialize(old_mapper.clone()).unwrap();\n        let old_mapper = builder.try_build().unwrap();\n\n        let JsonValue::Object(doc) = json!({\n            \"body\": {\n                \"field.1\": \"hola\",\n                \"field2\": {\n                    \"key\": \"val\",\n                    \"arr\": [1,\"abc\", {\"k\": \"v\"}],\n                },\n                \"field3\": [\"a\", \"b\"]\n            }\n        }) else {\n            panic!();\n        };\n        let tantivy_doc = old_mapper.doc_from_json_obj(doc.clone(), 0).unwrap().1;\n        let named_doc = tantivy_doc.to_named_doc(&old_mapper.schema());\n\n        let new_mapper = json!({\n            \"field_mappings\": [\n                {\"name\": \"body\", \"type\": \"json\"}\n            ]\n        });\n        let builder = DocMapperBuilder::deserialize(new_mapper).unwrap();\n        let new_mapper = builder.try_build().unwrap();\n\n        assert_eq!(new_mapper.doc_to_json(named_doc.0).unwrap(), doc);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/field_mapping_entry.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::convert::TryFrom;\n\nuse anyhow::bail;\nuse base64::prelude::{BASE64_STANDARD, Engine};\nuse once_cell::sync::Lazy;\nuse quickwit_common::true_fn;\nuse regex::Regex;\nuse serde::{Deserialize, Serialize};\nuse serde_json::Value as JsonValue;\nuse tantivy::schema::{\n    IndexRecordOption, JsonObjectOptions, OwnedValue as TantivyValue, TextFieldIndexing,\n    TextOptions, Type,\n};\n\nuse super::FieldMappingType;\nuse super::date_time_type::QuickwitDateTimeOptions;\nuse crate::doc_mapper::field_mapping_type::QuickwitFieldType;\nuse crate::{Cardinality, QW_RESERVED_FIELD_NAMES};\n\n#[derive(Serialize, Deserialize, Default, Clone, Debug, PartialEq)]\npub struct QuickwitObjectOptions {\n    pub field_mappings: Vec<FieldMappingEntry>,\n}\n\n/// A `FieldMappingEntry` defines how a field is indexed, stored,\n/// and mapped from a JSON document to the related index fields.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n#[serde(\n    try_from = \"FieldMappingEntryForSerialization\",\n    into = \"FieldMappingEntryForSerialization\"\n)]\npub struct FieldMappingEntry {\n    /// Field name in the index schema.\n    pub name: String,\n    /// Property parameters which define the type and the way the value must be indexed.\n    pub mapping_type: FieldMappingType,\n}\n\n// Struct used for serialization and 
deserialization\n// Main advantage: a flat structure, which gives us flexibility\n// if we want to add some syntactic sugar in the mapping.\n// Main drawback: it mixes a bunch of unrelated parameters, but\n// this seems reasonable.\n//\n// We do not rely on enum with inline tagging and flatten because\n// - serde does not support it in combination with `deny_unknown_fields`\n// - it is clumsy to handle `array<type>` keys.\n\n// Docs below are used for OpenAPI generation:\n/// A `FieldMappingEntry` defines how a field is indexed, stored,\n/// and mapped from a JSON document to the related index fields.\n///\n/// Property parameters which define the way the value must be indexed.\n///\n/// Properties are determined by the specified type, for more information\n/// please see: <https://quickwit.io/docs/configuration/index-config#field-types>\n#[derive(Clone, Serialize, Deserialize, Debug, utoipa::ToSchema)]\npub(crate) struct FieldMappingEntryForSerialization {\n    /// Field name in the index schema.\n    name: String,\n    #[serde(rename = \"type\")]\n    type_id: String,\n    #[serde(flatten)]\n    #[schema(value_type = HashMap<String, Object>)]\n    pub field_mapping_json: serde_json::Map<String, JsonValue>,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitNumericOptions {\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n    #[serde(default = \"true_fn\")]\n    pub indexed: bool,\n    #[serde(default)]\n    pub fast: bool,\n    #[serde(default = \"true_fn\")]\n    pub coerce: bool,\n    #[serde(default)]\n    pub output_format: NumericOutputFormat,\n}\n\nimpl Default for QuickwitNumericOptions {\n    fn default() -> Self {\n        Self {\n            description: None,\n            indexed: true,\n            stored: true,\n            fast: false,\n      
      coerce: true,\n            output_format: NumericOutputFormat::default(),\n        }\n    }\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitBoolOptions {\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n    #[serde(default = \"true_fn\")]\n    pub indexed: bool,\n    #[serde(default)]\n    pub fast: bool,\n}\n\nimpl Default for QuickwitBoolOptions {\n    fn default() -> Self {\n        Self {\n            description: None,\n            indexed: true,\n            stored: true,\n            fast: false,\n        }\n    }\n}\n\n/// Options associated to a bytes field.\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitBytesOptions {\n    /// Optional description of the bytes field.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    /// If true, the field will be stored in the doc store.\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n    /// If true, the field will be indexed.\n    #[serde(default = \"true_fn\")]\n    pub indexed: bool,\n    /// If true, the field will be stored in columnar format.\n    #[serde(default)]\n    pub fast: bool,\n    /// Input format of the bytes field.\n    #[serde(default)]\n    pub input_format: BinaryFormat,\n    /// Output format of the bytes field.\n    #[serde(default)]\n    pub output_format: BinaryFormat,\n}\n\nimpl Default for QuickwitBytesOptions {\n    fn default() -> Self {\n        Self {\n            description: None,\n            indexed: true,\n            stored: true,\n            fast: false,\n            input_format: BinaryFormat::default(),\n            output_format: BinaryFormat::default(),\n        }\n    }\n}\n\n/// Available 
binary formats.\n#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash, Default, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum BinaryFormat {\n    /// Base64 format.\n    #[default]\n    Base64,\n    /// Hexadecimal format.\n    Hex,\n}\n\nimpl BinaryFormat {\n    /// Returns the string representation of the format.\n    pub fn as_str(&self) -> &str {\n        match self {\n            Self::Base64 => \"base64\",\n            Self::Hex => \"hex\",\n        }\n    }\n\n    /// Encodes the bytes in this format and returns them as a `serde_json::Value` string.\n    pub fn format_to_json(&self, value: &[u8]) -> JsonValue {\n        match self {\n            Self::Base64 => BASE64_STANDARD.encode(value).into(),\n            Self::Hex => hex::encode(value).into(),\n        }\n    }\n\n    /// Decodes a string encoded in this format into raw bytes.\n    pub fn parse_str(&self, byte_str: &str) -> Result<Vec<u8>, String> {\n        let payload = match self {\n            Self::Base64 => BASE64_STANDARD\n                .decode(byte_str)\n                .map_err(|base64_decode_err| {\n                    format!(\"expected base64 string, got `{byte_str}`: {base64_decode_err}\")\n                })?,\n            Self::Hex => hex::decode(byte_str).map_err(|hex_decode_err| {\n                format!(\"expected hex string, got `{byte_str}`: {hex_decode_err}\")\n            })?,\n        };\n        Ok(payload)\n    }\n\n    /// Parses the `serde_json::Value` into `tantivy::schema::Value`.\n    pub fn parse_json(&self, json_val: &JsonValue) -> Result<TantivyValue, String> {\n        let byte_str = if let JsonValue::String(byte_str) = json_val {\n            byte_str\n        } else {\n            return Err(format!(\n                \"expected {} string, got `{json_val}`\",\n                self.as_str()\n            ));\n        };\n        let payload = self.parse_str(byte_str)?;\n        Ok(TantivyValue::Bytes(payload))\n    }\n}\n\n#[derive(Clone, Copy, Debug, Eq, 
PartialEq, Hash, Default, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum NumericOutputFormat {\n    #[default]\n    Number,\n    String,\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitIpAddrOptions {\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n    #[serde(default = \"true_fn\")]\n    pub indexed: bool,\n    #[serde(default)]\n    pub fast: bool,\n}\n\nimpl Default for QuickwitIpAddrOptions {\n    fn default() -> Self {\n        Self {\n            description: None,\n            indexed: true,\n            stored: true,\n            fast: false,\n        }\n    }\n}\n\n#[derive(Clone, PartialEq, Debug, Eq, Serialize, Deserialize, utoipa::ToSchema)]\npub struct QuickwitTextTokenizer(Cow<'static, str>);\n\npub(crate) const DEFAULT_TOKENIZER_NAME: &str = \"default\";\n\npub(crate) const RAW_TOKENIZER_NAME: &str = \"raw\";\n\nimpl Default for QuickwitTextTokenizer {\n    fn default() -> Self {\n        Self::from_static(DEFAULT_TOKENIZER_NAME)\n    }\n}\n\nimpl QuickwitTextTokenizer {\n    pub const fn from_static(name: &'static str) -> Self {\n        Self(Cow::Borrowed(name))\n    }\n    pub(crate) fn name(&self) -> &str {\n        &self.0\n    }\n    pub fn raw() -> Self {\n        Self::from_static(RAW_TOKENIZER_NAME)\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\npub enum QuickwitTextNormalizer {\n    Raw,\n    Lowercase,\n}\n\nimpl QuickwitTextNormalizer {\n    pub fn get_name(&self) -> &str {\n        match self {\n            QuickwitTextNormalizer::Raw => \"raw\",\n            QuickwitTextNormalizer::Lowercase => \"lowercase\",\n        }\n    }\n}\n\n#[derive(Clone, PartialEq, Debug)]\npub struct TextIndexingOptions {\n    pub 
tokenizer: QuickwitTextTokenizer,\n    pub record: IndexRecordOption,\n    pub fieldnorms: bool,\n}\n\nimpl TextIndexingOptions {\n    fn from_parts_text(\n        indexed: bool,\n        tokenizer: Option<QuickwitTextTokenizer>,\n        record: Option<IndexRecordOption>,\n        fieldnorms: bool,\n    ) -> anyhow::Result<Option<Self>> {\n        if indexed {\n            Ok(Some(TextIndexingOptions {\n                tokenizer: tokenizer.unwrap_or_default(),\n                record: record.unwrap_or(IndexRecordOption::Basic),\n                fieldnorms,\n            }))\n        } else {\n            if tokenizer.is_some() || record.is_some() || fieldnorms {\n                bail!(\n                    \"`record`, `tokenizer`, and `fieldnorms` parameters are allowed only if \\\n                     indexed is true\"\n                )\n            }\n            Ok(None)\n        }\n    }\n\n    fn from_parts_json(\n        indexed: bool,\n        tokenizer: Option<QuickwitTextTokenizer>,\n        record: Option<IndexRecordOption>,\n    ) -> anyhow::Result<Option<Self>> {\n        if indexed {\n            Ok(Some(TextIndexingOptions {\n                tokenizer: tokenizer.unwrap_or_else(QuickwitTextTokenizer::raw),\n                record: record.unwrap_or(IndexRecordOption::Basic),\n                fieldnorms: false,\n            }))\n        } else {\n            if tokenizer.is_some() || record.is_some() {\n                bail!(\"`record` and `tokenizer` parameters are allowed only if indexed is true\")\n            }\n            Ok(None)\n        }\n    }\n\n    fn from_parts_concatenate(\n        tokenizer: Option<QuickwitTextTokenizer>,\n        record: Option<IndexRecordOption>,\n    ) -> anyhow::Result<Self> {\n        let text_index_options_opt = Self::from_parts_text(true, tokenizer, record, false)?;\n        let text_index_options = text_index_options_opt.expect(\"concatenate field must be indexed\");\n        Ok(text_index_options)\n    }\n\n    
fn to_parts_text(\n        this: Option<Self>,\n    ) -> (\n        bool, // indexed\n        Option<QuickwitTextTokenizer>,\n        Option<IndexRecordOption>,\n        bool, // fieldnorms\n    ) {\n        match this {\n            Some(this) => (\n                true,\n                Some(this.tokenizer),\n                Some(this.record),\n                this.fieldnorms,\n            ),\n            None => (false, None, None, false),\n        }\n    }\n\n    fn to_parts_json(\n        this: Option<Self>,\n    ) -> (\n        bool, // indexed\n        Option<QuickwitTextTokenizer>,\n        Option<IndexRecordOption>,\n    ) {\n        let (indexed, tokenizer, record, _fieldnorms) = TextIndexingOptions::to_parts_text(this);\n        (indexed, tokenizer, record)\n    }\n\n    fn to_parts_concatenate(\n        this: Self,\n    ) -> (Option<QuickwitTextTokenizer>, Option<IndexRecordOption>) {\n        let (_indexed, tokenizer, record, _fieldnorms) =\n            TextIndexingOptions::to_parts_text(Some(this));\n        (tokenizer, record)\n    }\n\n    fn default_json() -> Self {\n        TextIndexingOptions {\n            tokenizer: QuickwitTextTokenizer::raw(),\n            record: IndexRecordOption::Basic,\n            fieldnorms: false,\n        }\n    }\n}\n\nimpl Default for TextIndexingOptions {\n    fn default() -> Self {\n        TextIndexingOptions {\n            tokenizer: QuickwitTextTokenizer::default(),\n            record: IndexRecordOption::Basic,\n            fieldnorms: false,\n        }\n    }\n}\n\n#[quickwit_macros::serde_multikey]\n#[derive(Clone, PartialEq, Serialize, Deserialize, Debug, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitTextOptions {\n    #[schema(value_type = String)]\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    #[serde_multikey(\n        deserializer = TextIndexingOptions::from_parts_text,\n        serializer = 
TextIndexingOptions::to_parts_text,\n        fields = (\n            #[serde(default = \"true_fn\")]\n            pub indexed: bool,\n            #[serde(default)]\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            pub tokenizer: Option<QuickwitTextTokenizer>,\n            #[schema(value_type = IndexRecordOptionSchema)]\n            #[serde(default)]\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            pub record: Option<IndexRecordOption>,\n            #[serde(default)]\n            pub fieldnorms: bool,\n        ),\n    )]\n    pub indexing_options: Option<TextIndexingOptions>,\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n    #[serde(default)]\n    pub fast: FastFieldOptions,\n}\n\n#[derive(Default, Clone, Debug, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(\n    into = \"FastFieldOptionsForSerialization\",\n    from = \"FastFieldOptionsForSerialization\"\n)]\npub enum FastFieldOptions {\n    #[default]\n    Disabled,\n    EnabledWithNormalizer {\n        normalizer: QuickwitTextNormalizer,\n    },\n}\n\nimpl FastFieldOptions {\n    pub fn default_enabled() -> Self {\n        FastFieldOptions::EnabledWithNormalizer {\n            normalizer: QuickwitTextNormalizer::Raw,\n        }\n    }\n}\n\n#[derive(Serialize, Deserialize)]\n#[serde(untagged)]\nenum FastFieldOptionsForSerialization {\n    IsEnabled(bool),\n    EnabledWithNormalizer { normalizer: QuickwitTextNormalizer },\n}\n\nimpl From<FastFieldOptionsForSerialization> for FastFieldOptions {\n    fn from(fast_field_options: FastFieldOptionsForSerialization) -> Self {\n        match fast_field_options {\n            FastFieldOptionsForSerialization::IsEnabled(is_enabled) => {\n                if is_enabled {\n                    FastFieldOptions::default_enabled()\n                } else {\n                    FastFieldOptions::Disabled\n                }\n            }\n            
FastFieldOptionsForSerialization::EnabledWithNormalizer { normalizer } => {\n                FastFieldOptions::EnabledWithNormalizer { normalizer }\n            }\n        }\n    }\n}\n\nimpl From<FastFieldOptions> for FastFieldOptionsForSerialization {\n    fn from(fast_field_options: FastFieldOptions) -> Self {\n        match fast_field_options {\n            FastFieldOptions::Disabled => FastFieldOptionsForSerialization::IsEnabled(false),\n            FastFieldOptions::EnabledWithNormalizer { normalizer } => {\n                FastFieldOptionsForSerialization::EnabledWithNormalizer { normalizer }\n            }\n        }\n    }\n}\n\nimpl Default for QuickwitTextOptions {\n    fn default() -> Self {\n        Self {\n            description: None,\n            indexing_options: Some(TextIndexingOptions::default()),\n            stored: true,\n            fast: FastFieldOptions::default(),\n        }\n    }\n}\n\nimpl From<QuickwitTextOptions> for TextOptions {\n    fn from(quickwit_text_options: QuickwitTextOptions) -> Self {\n        let mut text_options = TextOptions::default();\n        if quickwit_text_options.stored {\n            text_options = text_options.set_stored();\n        }\n        match &quickwit_text_options.fast {\n            FastFieldOptions::EnabledWithNormalizer { normalizer } => {\n                text_options = text_options.set_fast(Some(normalizer.get_name()));\n            }\n            FastFieldOptions::Disabled => {}\n        }\n        if let Some(indexing_options) = quickwit_text_options.indexing_options {\n            let text_field_indexing = TextFieldIndexing::default()\n                .set_index_option(indexing_options.record)\n                .set_fieldnorms(indexing_options.fieldnorms)\n                .set_tokenizer(indexing_options.tokenizer.name());\n\n            text_options = text_options.set_indexing_options(text_field_indexing);\n        }\n        text_options\n    
}\n}\n\n#[allow(unused)]\n#[derive(utoipa::ToSchema)]\npub enum IndexRecordOptionSchema {\n    /// Records only the `DocId`s.\n    #[schema(rename = \"basic\")]\n    Basic,\n    /// Records the document ids as well as the term frequency.\n    /// The term frequency can help give better scoring of the documents.\n    #[schema(rename = \"freq\")]\n    WithFreqs,\n    /// Records the document id, the term frequency and the positions of\n    /// the occurrences in the document.\n    #[schema(rename = \"position\")]\n    WithFreqsAndPositions,\n}\n\n/// Options associated to a json field.\n///\n/// `QuickwitJsonOptions` is also used to configure\n/// the dynamic mapping.\n#[quickwit_macros::serde_multikey]\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitJsonOptions {\n    /// Optional description of the JSON object.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    #[serde_multikey(\n        deserializer = TextIndexingOptions::from_parts_json,\n        serializer = TextIndexingOptions::to_parts_json,\n        fields = (\n            /// If true, all of the elements in the json object will be indexed.\n            #[serde(default = \"true_fn\")]\n            pub indexed: bool,\n            /// Sets the tokenizer that should be used with the text fields in the\n            /// json object.\n            #[serde(default)]\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            pub tokenizer: Option<QuickwitTextTokenizer>,\n            /// Sets how much information should be added in the index\n            /// with each token.\n            ///\n            /// Setting `record` is only allowed if `indexed` is true.\n            #[schema(value_type = IndexRecordOptionSchema)]\n            #[serde(default)]\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            pub record: 
Option<IndexRecordOption>,\n        ),\n    )]\n    /// Options for indexing text in a Json field.\n    pub indexing_options: Option<TextIndexingOptions>,\n    /// If true, the field will be stored in the doc store.\n    #[serde(default = \"true_fn\")]\n    pub stored: bool,\n    /// If true, the '.' in json keys will be expanded.\n    #[serde(default = \"true_fn\")]\n    pub expand_dots: bool,\n    /// If true, the json object will be stored in columnar format.\n    #[serde(default)]\n    pub fast: FastFieldOptions,\n}\n\nimpl QuickwitJsonOptions {\n    /// Build a default QuickwitJsonOptions for dynamic fields.\n    pub fn default_dynamic() -> Self {\n        QuickwitJsonOptions {\n            fast: FastFieldOptions::default_enabled(),\n            ..Default::default()\n        }\n    }\n}\n\nimpl Default for QuickwitJsonOptions {\n    fn default() -> Self {\n        QuickwitJsonOptions {\n            description: None,\n            indexing_options: Some(TextIndexingOptions::default_json()),\n            stored: true,\n            expand_dots: true,\n            fast: FastFieldOptions::default(),\n        }\n    }\n}\n\nimpl From<QuickwitJsonOptions> for JsonObjectOptions {\n    fn from(quickwit_json_options: QuickwitJsonOptions) -> Self {\n        let mut json_options = JsonObjectOptions::default();\n        if quickwit_json_options.stored {\n            json_options = json_options.set_stored();\n        }\n        if let Some(indexing_options) = quickwit_json_options.indexing_options {\n            let text_field_indexing = TextFieldIndexing::default()\n                .set_tokenizer(indexing_options.tokenizer.name())\n                .set_index_option(indexing_options.record);\n            json_options = json_options.set_indexing_options(text_field_indexing);\n        }\n        if quickwit_json_options.expand_dots {\n            json_options = json_options.set_expand_dots_enabled();\n        }\n        match &quickwit_json_options.fast {\n            
FastFieldOptions::EnabledWithNormalizer { normalizer } => {\n                json_options = json_options.set_fast(Some(normalizer.get_name()));\n            }\n            FastFieldOptions::Disabled => {}\n        }\n        json_options\n    }\n}\n\n/// Options associated to a concatenate field.\n#[quickwit_macros::serde_multikey]\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct QuickwitConcatenateOptions {\n    /// Optional description of the concatenate field.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub description: Option<String>,\n    /// Fields to concatenate.\n    #[serde(default)]\n    pub concatenate_fields: Vec<String>,\n    #[serde(default)]\n    pub include_dynamic_fields: bool,\n    #[serde_multikey(\n        deserializer = TextIndexingOptions::from_parts_concatenate,\n        serializer = TextIndexingOptions::to_parts_concatenate,\n        fields = (\n            /// Sets the tokenizer that should be used with the text fields in the\n            /// concatenate field.\n            #[serde(default)]\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            pub tokenizer: Option<QuickwitTextTokenizer>,\n            /// Sets how much information should be added in the index\n            /// with each token.\n            #[schema(value_type = IndexRecordOptionSchema)]\n            #[serde(default)]\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            pub record: Option<IndexRecordOption>,\n        ),\n    )]\n    /// Options for indexing text in a concatenate field.\n    pub indexing_options: TextIndexingOptions,\n}\n\nimpl Default for QuickwitConcatenateOptions {\n    fn default() -> Self {\n        QuickwitConcatenateOptions {\n            description: None,\n            concatenate_fields: Vec::new(),\n            include_dynamic_fields: false,\n            indexing_options: TextIndexingOptions {\n        
        tokenizer: QuickwitTextTokenizer::default(),\n                record: IndexRecordOption::Basic,\n                fieldnorms: false,\n            },\n        }\n    }\n}\n\nimpl From<QuickwitConcatenateOptions> for JsonObjectOptions {\n    fn from(quickwit_text_options: QuickwitConcatenateOptions) -> Self {\n        let mut text_options = JsonObjectOptions::default();\n        let text_field_indexing = TextFieldIndexing::default()\n            .set_index_option(quickwit_text_options.indexing_options.record)\n            .set_fieldnorms(quickwit_text_options.indexing_options.fieldnorms)\n            .set_tokenizer(quickwit_text_options.indexing_options.tokenizer.name());\n\n        text_options = text_options.set_indexing_options(text_field_indexing);\n        text_options\n    }\n}\n\nfn deserialize_mapping_type(\n    quickwit_field_type: QuickwitFieldType,\n    json: JsonValue,\n) -> anyhow::Result<FieldMappingType> {\n    let (typ, cardinality) = match quickwit_field_type {\n        QuickwitFieldType::Simple(typ) => (typ, Cardinality::SingleValued),\n        QuickwitFieldType::Array(typ) => (typ, Cardinality::MultiValued),\n        QuickwitFieldType::Object => {\n            let object_options: QuickwitObjectOptions = serde_json::from_value(json)?;\n            if object_options.field_mappings.is_empty() {\n                anyhow::bail!(\"object type must have at least one field mapping\");\n            }\n            return Ok(FieldMappingType::Object(object_options));\n        }\n        QuickwitFieldType::Concatenate => {\n            let concatenate_options: QuickwitConcatenateOptions = serde_json::from_value(json)?;\n            if concatenate_options.concatenate_fields.is_empty()\n                && !concatenate_options.include_dynamic_fields\n            {\n                anyhow::bail!(\"concatenate type must have at least one sub-field\");\n            }\n            return Ok(FieldMappingType::Concatenate(concatenate_options));\n        }\n    
};\n    match typ {\n        Type::Str => {\n            let text_options: QuickwitTextOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::Text(text_options, cardinality))\n        }\n        Type::U64 => {\n            let numeric_options: QuickwitNumericOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::U64(numeric_options, cardinality))\n        }\n        Type::I64 => {\n            let numeric_options: QuickwitNumericOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::I64(numeric_options, cardinality))\n        }\n        Type::F64 => {\n            let numeric_options: QuickwitNumericOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::F64(numeric_options, cardinality))\n        }\n        Type::Bool => {\n            let bool_options: QuickwitBoolOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::Bool(bool_options, cardinality))\n        }\n        Type::IpAddr => {\n            let ip_addr_options: QuickwitIpAddrOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::IpAddr(ip_addr_options, cardinality))\n        }\n        Type::Date => {\n            let date_time_options = serde_json::from_value::<QuickwitDateTimeOptions>(json)?;\n            Ok(FieldMappingType::DateTime(date_time_options, cardinality))\n        }\n        Type::Facet => unimplemented!(\"Facets are not supported in Quickwit yet.\"),\n        Type::Bytes => {\n            let bytes_options: QuickwitBytesOptions = serde_json::from_value(json)?;\n            if bytes_options.fast && cardinality == Cardinality::MultiValued {\n                bail!(\"fast field is not allowed for array<bytes>\");\n            }\n            Ok(FieldMappingType::Bytes(bytes_options, cardinality))\n        }\n        Type::Json => {\n            let json_options: QuickwitJsonOptions = serde_json::from_value(json)?;\n            Ok(FieldMappingType::Json(json_options, 
cardinality))\n        }\n    }\n}\n\nimpl TryFrom<FieldMappingEntryForSerialization> for FieldMappingEntry {\n    type Error = String;\n\n    fn try_from(value: FieldMappingEntryForSerialization) -> Result<Self, String> {\n        validate_field_mapping_name(&value.name).map_err(|err| err.to_string())?;\n        let quickwit_field_type =\n            QuickwitFieldType::parse_type_id(&value.type_id).ok_or_else(|| {\n                format!(\n                    \"field `{}` has an unknown type: `{}`\",\n                    &value.name, &value.type_id\n                )\n            })?;\n        let mapping_type = deserialize_mapping_type(\n            quickwit_field_type,\n            JsonValue::Object(value.field_mapping_json),\n        )\n        .map_err(|err| format!(\"error while parsing field `{}`: {}\", value.name, err))?;\n        Ok(FieldMappingEntry {\n            name: value.name,\n            mapping_type,\n        })\n    }\n}\n\n/// Serialize object into a `Map` of json values.\nfn serialize_to_map<S: Serialize>(val: &S) -> Option<serde_json::Map<String, JsonValue>> {\n    let json_val = serde_json::to_value(val).ok()?;\n    if let JsonValue::Object(map) = json_val {\n        Some(map)\n    } else {\n        None\n    }\n}\n\nfn typed_mapping_to_json_params(\n    field_mapping_type: FieldMappingType,\n) -> serde_json::Map<String, JsonValue> {\n    match field_mapping_type {\n        FieldMappingType::Text(text_options, _) => serialize_to_map(&text_options),\n        FieldMappingType::U64(options, _)\n        | FieldMappingType::I64(options, _)\n        | FieldMappingType::F64(options, _) => serialize_to_map(&options),\n        FieldMappingType::Bool(options, _) => serialize_to_map(&options),\n        FieldMappingType::Bytes(options, _) => serialize_to_map(&options),\n        FieldMappingType::IpAddr(options, _) => serialize_to_map(&options),\n        FieldMappingType::DateTime(date_time_options, _) => serialize_to_map(&date_time_options),\n        
FieldMappingType::Json(json_options, _) => serialize_to_map(&json_options),\n        FieldMappingType::Object(object_options) => serialize_to_map(&object_options),\n        FieldMappingType::Concatenate(concatenate_options) => {\n            serialize_to_map(&concatenate_options)\n        }\n    }\n    .unwrap()\n}\n\nimpl From<FieldMappingEntry> for FieldMappingEntryForSerialization {\n    fn from(field_mapping_entry: FieldMappingEntry) -> FieldMappingEntryForSerialization {\n        let type_id = field_mapping_entry\n            .mapping_type\n            .quickwit_field_type()\n            .to_type_id();\n        let field_mapping_json = typed_mapping_to_json_params(field_mapping_entry.mapping_type);\n        FieldMappingEntryForSerialization {\n            name: field_mapping_entry.name,\n            type_id,\n            field_mapping_json,\n        }\n    }\n}\n\n/// Regular expression validating a field mapping name.\npub const FIELD_MAPPING_NAME_PATTERN: &str = r\"^[@$_\\-a-zA-Z][@$_/\\.\\-a-zA-Z0-9]{0,254}$\";\n\n/// Validates a field mapping name.\n/// Returns `Ok(())` if the name can be used for a field mapping.\n///\n/// A field mapping name:\n/// - can only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, `.`,\n///   hyphens `-`, underscores `_`, at `@` and dollar `$` signs;\n/// - must not start with a dot or a digit;\n/// - must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`,\n///   `_field_presence`;\n/// - must not be longer than 255 characters.\npub fn validate_field_mapping_name(field_mapping_name: &str) -> anyhow::Result<()> {\n    static FIELD_MAPPING_NAME_PTN: Lazy<Regex> =\n        Lazy::new(|| Regex::new(FIELD_MAPPING_NAME_PATTERN).unwrap());\n\n    if QW_RESERVED_FIELD_NAMES.contains(&field_mapping_name) {\n        bail!(\n            \"field name `{field_mapping_name}` is reserved. 
the following fields are reserved for \\\n             Quickwit internal usage: {}\",\n            QW_RESERVED_FIELD_NAMES.join(\", \"),\n        );\n    }\n    if FIELD_MAPPING_NAME_PTN.is_match(field_mapping_name) {\n        return Ok(());\n    }\n    if field_mapping_name.is_empty() {\n        bail!(\"field name is empty\");\n    }\n    if field_mapping_name.starts_with('.') {\n        bail!(\n            \"field name `{}` must not start with a dot `.`\",\n            field_mapping_name\n        );\n    }\n    if field_mapping_name.len() > 255 {\n        bail!(\n            \"field name `{}` is too long. field names must not be longer than 255 characters\",\n            field_mapping_name\n        )\n    }\n    let first_char = field_mapping_name.chars().next().unwrap();\n    if !first_char.is_ascii_alphabetic() {\n        bail!(\n            \"field name `{}` is invalid. field names must start with an uppercase or lowercase \\\n             ASCII letter, or an underscore `_`\",\n            field_mapping_name\n        )\n    }\n    bail!(\n        \"field name `{}` contains illegal characters. 
field names must only contain uppercase and \\\n         lowercase ASCII letters, digits, hyphens `-`, periods `.`, and underscores `_`\",\n        field_mapping_name\n    );\n}\n\n#[cfg(test)]\nmod tests {\n    use anyhow::bail;\n    use matches::matches;\n    use serde_json::json;\n    use tantivy::schema::{IndexRecordOption, JsonObjectOptions, TextOptions};\n\n    use super::*;\n    use crate::Cardinality;\n    use crate::doc_mapper::{FastFieldOptions, FieldMappingType};\n\n    #[test]\n    fn test_validate_field_mapping_name() {\n        assert!(\n            validate_field_mapping_name(\"\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"is empty\")\n        );\n        assert!(\n            validate_field_mapping_name(&\"a\".repeat(256))\n                .unwrap_err()\n                .to_string()\n                .contains(\"is too long\")\n        );\n        assert!(\n            validate_field_mapping_name(\"0\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"must start with\")\n        );\n        assert!(\n            validate_field_mapping_name(\".my-field\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"must not start with\")\n        );\n        assert!(\n            validate_field_mapping_name(\"_source\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"are reserved for Quickwit\")\n        );\n        assert!(\n            validate_field_mapping_name(\"_dynamic\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"are reserved for Quickwit\")\n        );\n        assert!(\n            validate_field_mapping_name(\"my-field!\")\n                .unwrap_err()\n                .to_string()\n                .contains(\"illegal characters\")\n        );\n        assert!(validate_field_mapping_name(\"_my_field\").is_ok());\n        
assert!(validate_field_mapping_name(\"-my-field\").is_ok());\n        assert!(validate_field_mapping_name(\"my-field\").is_ok());\n        assert!(validate_field_mapping_name(\"my.field\").is_ok());\n        assert!(validate_field_mapping_name(\"my_field\").is_ok());\n        assert!(validate_field_mapping_name(\"$my_field@\").is_ok());\n        assert!(validate_field_mapping_name(\"my/field\").is_ok());\n        assert!(validate_field_mapping_name(&\"a\".repeat(255)).is_ok());\n    }\n\n    #[test]\n    fn test_quickwit_json_options_default() {\n        let serde_default_json_options: QuickwitJsonOptions = serde_json::from_str(\"{}\").unwrap();\n        assert_eq!(serde_default_json_options, QuickwitJsonOptions::default())\n    }\n\n    #[test]\n    fn test_tantivy_text_options_from_quickwit_text_options() {\n        let tantivy_text_option = TextOptions::from(QuickwitTextOptions::default());\n\n        assert_eq!(tantivy_text_option.is_stored(), true);\n        assert_eq!(tantivy_text_option.is_fast(), false);\n\n        match tantivy_text_option.get_indexing_options() {\n            Some(text_field_indexing) => {\n                assert_eq!(text_field_indexing.index_option(), IndexRecordOption::Basic);\n                assert_eq!(text_field_indexing.fieldnorms(), false);\n                assert_eq!(text_field_indexing.tokenizer(), \"default\");\n            }\n            _ => panic!(\"text field indexing is None\"),\n        }\n    }\n\n    #[test]\n    fn test_tantivy_json_options_from_quickwit_json_options() {\n        let tantivy_json_option = JsonObjectOptions::from(QuickwitJsonOptions::default());\n        assert_eq!(tantivy_json_option.is_stored(), true);\n        match tantivy_json_option.get_text_indexing_options() {\n            Some(text_field_indexing) => {\n                assert_eq!(text_field_indexing.index_option(), IndexRecordOption::Basic);\n                assert_eq!(text_field_indexing.tokenizer(), \"raw\");\n            }\n            _ => 
panic!(\"text field indexing is None\"),\n        }\n    }\n\n    #[test]\n    fn test_deserialize_text_mapping_entry_not_indexed() -> anyhow::Result<()> {\n        let mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"data_binary\",\n                \"type\": \"text\",\n                \"indexed\": false,\n                \"stored\": true\n            }\"#,\n        )?;\n        assert_eq!(mapping_entry.name, \"data_binary\");\n        match mapping_entry.mapping_type {\n            FieldMappingType::Text(options, _) => {\n                assert_eq!(options.stored, true);\n                assert!(options.indexing_options.is_none());\n            }\n            _ => panic!(\"wrong property type\"),\n        }\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_text_mapping_entry_not_indexed_invalid() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"data_binary\",\n                \"type\": \"text\",\n                \"indexed\": false,\n                \"record\": \"basic\"\n            }\n            \"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"error while parsing field `data_binary`: `record`, `tokenizer`, and `fieldnorms` \\\n             parameters are allowed only if indexed is true\"\n        );\n    }\n\n    #[test]\n    fn test_deserialize_json_mapping_entry_not_indexed() -> anyhow::Result<()> {\n        let mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"data_binary\",\n                \"type\": \"json\",\n                \"indexed\": false,\n                \"stored\": true\n            }\"#,\n        )?;\n        assert_eq!(mapping_entry.name, \"data_binary\");\n        match mapping_entry.mapping_type {\n    
        FieldMappingType::Json(options, _) => {\n                assert_eq!(options.stored, true);\n                assert!(options.indexing_options.is_none());\n            }\n            _ => panic!(\"wrong property type\"),\n        }\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_json_mapping_entry_not_indexed_invalid() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"data_binary\",\n                \"type\": \"json\",\n                \"indexed\": false,\n                \"record\": \"basic\"\n            }\n            \"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"error while parsing field `data_binary`: `record` and `tokenizer` parameters are \\\n             allowed only if indexed is true\"\n        );\n    }\n\n    #[test]\n    fn test_deserialize_invalid_text_mapping_entry() -> anyhow::Result<()> {\n        let mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"text\",\n                \"stored\": true,\n                \"record\": \"notexist\"\n            }\n            \"#,\n        );\n        assert!(mapping_entry.is_err());\n        assert_eq!(\n            mapping_entry.unwrap_err().to_string(),\n            \"error while parsing field `my_field_name`: unknown variant `notexist`, expected one \\\n             of `basic`, `freq`, `position`\"\n                .to_string()\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_invalid_json_mapping_entry() -> anyhow::Result<()> {\n        let mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n        {\n            \"name\": \"my_field_name\",\n            \"type\": \"json\",\n            \"blub\": true\n        }\n    \"#,\n        
);\n        assert!(mapping_entry.is_err());\n        assert!(\n            mapping_entry\n                .unwrap_err()\n                .to_string()\n                .contains(\"error while parsing field `my_field_name`: unknown field `blub`\")\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_text_mapping_entry() -> anyhow::Result<()> {\n        let mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n        {\n            \"name\": \"my_field_name\",\n            \"type\": \"text\",\n            \"stored\": true,\n            \"record\": \"basic\",\n            \"tokenizer\": \"lowercase\"\n        }\n        \"#,\n        )?;\n        assert_eq!(mapping_entry.name, \"my_field_name\");\n        match mapping_entry.mapping_type {\n            FieldMappingType::Text(options, _) => {\n                assert_eq!(options.stored, true);\n                let indexing_options = options.indexing_options.unwrap();\n                assert_eq!(indexing_options.tokenizer.name(), \"lowercase\");\n                assert_eq!(indexing_options.record, IndexRecordOption::Basic);\n            }\n            _ => panic!(\"wrong property type\"),\n        }\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_valid_fieldnorms() -> anyhow::Result<()> {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n        {\n            \"name\": \"my_field_name\",\n            \"type\": \"text\",\n            \"stored\": true,\n            \"indexed\": true,\n            \"fieldnorms\": true,\n            \"record\": \"basic\",\n            \"tokenizer\": \"en_stem\"\n        }\"#,\n        );\n        match result.unwrap().mapping_type {\n            FieldMappingType::Text(options, _) => {\n                assert_eq!(options.stored, true);\n                let indexing_options = options.indexing_options.unwrap();\n                assert_eq!(indexing_options.fieldnorms, true);\n            }\n          
  _ => panic!(\"wrong property type\"),\n        }\n\n        Ok(())\n    }\n\n    #[test]\n    fn test_error_on_text_with_invalid_options() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"text\",\n                \"indexed\": false,\n                \"tokenizer\": \"default\",\n                \"record\": \"position\"\n            }\n            \"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"error while parsing field `my_field_name`: `record`, `tokenizer`, and `fieldnorms` \\\n             parameters are allowed only if indexed is true\"\n        );\n    }\n\n    #[test]\n    fn test_error_on_unknown_fields() -> anyhow::Result<()> {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"text\",\n                \"indexing\": false,\n                \"tokenizer\": \"default\",\n                \"record\": \"position\"\n            }\n            \"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert!(error.to_string().contains(\"unknown field `indexing`\"));\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_object_mapping_entry() {\n        let mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n            \"name\": \"my_field_name\",\n            \"type\": \"object\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"my_field_name\",\n                    \"type\": \"text\"\n                }\n            ]\n            }\n            \"#,\n        )\n        .unwrap();\n        assert_eq!(mapping_entry.name, \"my_field_name\");\n        match 
mapping_entry.mapping_type {\n            FieldMappingType::Object(options) => {\n                assert_eq!(options.field_mappings.len(), 1);\n            }\n            _ => panic!(\"wrong property type\"),\n        }\n    }\n\n    #[test]\n    fn test_deserialize_object_mapping_with_no_field_mappings() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"object\",\n                \"field_mappings\": []\n            }\n            \"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"error while parsing field `my_field_name`: object type must have at least one field \\\n             mapping\"\n        );\n    }\n\n    #[test]\n    fn test_deserialize_mapping_with_unknown_type() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"my custom type\"\n            }\n            \"#,\n        );\n        assert!(result.is_err());\n        let error = result.unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"field `my_field_name` has an unknown type: `my custom type`\"\n        );\n    }\n\n    #[test]\n    fn test_deserialize_i64_mapping_with_invalid_name() {\n        assert!(\n            serde_json::from_str::<FieldMappingEntry>(\n                r#\"\n            {\n                \"name\": \"this is not ok\",\n                \"type\": \"i64\"\n            }\n            \"#,\n            )\n            .unwrap_err()\n            .to_string()\n            .contains(\"illegal characters\")\n        );\n    }\n\n    #[test]\n    fn test_deserialize_i64_parsing_error_with_text_options() {\n        let error = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n     
           \"name\": \"my_field_name\",\n                \"type\": \"i64\",\n                \"tokenizer\": \"basic\"\n            }\n            \"#,\n        )\n        .unwrap_err();\n\n        assert_eq!(\n            error.to_string(),\n            \"error while parsing field `my_field_name`: unknown field `tokenizer`, expected one \\\n             of `description`, `stored`, `indexed`, `fast`, `coerce`, `output_format`\"\n        );\n    }\n\n    #[test]\n    fn test_deserialize_i64_mapping_multivalued() -> anyhow::Result<()> {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"array<i64>\"\n            }\n            \"#,\n        )?;\n\n        match result.mapping_type {\n            FieldMappingType::I64(options, cardinality) => {\n                assert_eq!(options.indexed, true); // default\n                assert_eq!(options.fast, false); // default\n                assert_eq!(options.stored, true); // default\n                assert_eq!(cardinality, Cardinality::MultiValued);\n            }\n            _ => bail!(\"Wrong type\"),\n        }\n\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_i64_mapping_singlevalued() -> anyhow::Result<()> {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"i64\"\n            }\n            \"#,\n        )?;\n\n        match result.mapping_type {\n            FieldMappingType::I64(options, cardinality) => {\n                assert_eq!(options.indexed, true); // default\n                assert_eq!(options.fast, false); // default\n                assert_eq!(options.stored, true); // default\n                assert_eq!(cardinality, Cardinality::SingleValued);\n            }\n            _ => bail!(\"Wrong type\"),\n        }\n\n        Ok(())\n    }\n\n    #[test]\n 
   fn test_serialize_i64_mapping() -> anyhow::Result<()> {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"i64\"\n            }\n            \"#,\n        )?;\n        let entry_str = serde_json::to_value(&entry)?;\n        assert_eq!(\n            entry_str,\n            serde_json::json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"i64\",\n                \"stored\": true,\n                \"fast\": false,\n                \"indexed\": true,\n                \"coerce\": true,\n                \"output_format\": \"number\"\n            })\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_u64_mapping_with_wrong_options() {\n        assert_eq!(\n            serde_json::from_str::<FieldMappingEntry>(\n                r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"u64\",\n                \"tokenizer\": \"basic\"\n            }\"#\n            )\n            .unwrap_err()\n            .to_string(),\n            \"error while parsing field `my_field_name`: unknown field `tokenizer`, expected one \\\n             of `description`, `stored`, `indexed`, `fast`, `coerce`, `output_format`\"\n        );\n    }\n\n    #[test]\n    fn test_deserialize_u64_u64_mapping_multivalued() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"array<u64>\"\n            }\n            \"#,\n        )\n        .unwrap();\n\n        if let FieldMappingType::U64(options, cardinality) = result.mapping_type {\n            assert_eq!(options.indexed, true); // default\n            assert_eq!(options.fast, false); // default\n            assert_eq!(options.stored, true); // default\n            assert_eq!(cardinality, Cardinality::MultiValued);\n       
 } else {\n            panic!(\"Wrong type\");\n        }\n    }\n\n    #[test]\n    fn test_deserialize_u64_mapping_singlevalued() {\n        let result = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"u64\"\n            }\n            \"#,\n        )\n        .unwrap();\n        if let FieldMappingType::U64(options, cardinality) = result.mapping_type {\n            assert_eq!(options.indexed, true); // default\n            assert_eq!(options.fast, false); // default\n            assert_eq!(options.stored, true); // default\n            assert_eq!(cardinality, Cardinality::SingleValued);\n        } else {\n            panic!(\"Wrong type\");\n        }\n    }\n\n    #[test]\n    fn test_serialize_u64_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"u64\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_str = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_str,\n            serde_json::json!({\n                \"name\": \"my_field_name\",\n                \"type\":\"u64\",\n                \"stored\": true,\n                \"fast\": false,\n                \"indexed\": true,\n                \"coerce\": true,\n                \"output_format\": \"number\"\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_f64_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"f64\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n       
         \"type\":\"f64\",\n                \"stored\": true,\n                \"fast\": false,\n                \"indexed\": true,\n                \"coerce\": true,\n                \"output_format\": \"number\"\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_bool_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"bool\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"bool\",\n                \"stored\": true,\n                \"fast\": false,\n                \"indexed\": true,\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_ip_addr_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"ip_address\",\n                \"description\": \"Client IP address\",\n                \"type\": \"ip\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_str = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_str,\n            serde_json::json!({\n                \"name\": \"ip_address\",\n                \"description\": \"Client IP address\",\n                \"type\": \"ip\",\n                \"stored\": true,\n                \"fast\": false,\n                \"indexed\": true\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_text_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"text\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = 
serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"text\",\n                \"fast\": false,\n                \"stored\": true,\n                \"indexed\": true,\n                \"record\": \"basic\",\n                \"tokenizer\": \"default\",\n                \"fieldnorms\": false,\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_text_fast_field_normalizer() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"text\",\n                \"fast\": {\"normalizer\": \"lowercase\"}\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"text\",\n                \"fast\": {\"normalizer\": \"lowercase\"},\n                \"stored\": true,\n                \"indexed\": true,\n                \"record\": \"basic\",\n                \"tokenizer\": \"default\",\n                \"fieldnorms\": false,\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_text_mapping_multivalued() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"array<text>\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"array<text>\",\n                \"stored\": true,\n                \"indexed\": true,\n                \"record\": \"basic\",\n     
           \"tokenizer\": \"default\",\n                \"fieldnorms\": false,\n                \"fast\": false,\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_date_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"datetime\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"datetime\",\n                \"input_formats\": [\"rfc3339\", \"unix_timestamp\"],\n                \"output_format\": \"rfc3339\",\n                \"fast_precision\": \"seconds\",\n                \"stored\": true,\n                \"indexed\": true,\n                \"fast\": false,\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_date_arr_mapping() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"array<datetime>\",\n                \"fast_precision\": \"milliseconds\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"array<datetime>\",\n                \"input_formats\": [\"rfc3339\", \"unix_timestamp\"],\n                \"output_format\": \"rfc3339\",\n                \"fast_precision\": \"milliseconds\",\n                \"stored\": true,\n                \"indexed\": true,\n                \"fast\": false,\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_bytes_mapping() {\n        let entry = 
serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"bytes\",\n                \"input_format\": \"hex\",\n                \"output_format\": \"base64\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"bytes\",\n                \"stored\": true,\n                \"indexed\": true,\n                \"fast\": false,\n                \"input_format\": \"hex\",\n                \"output_format\": \"base64\"\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_bytes_mapping_arr() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"array<bytes>\"\n            }\n            \"#,\n        )\n        .unwrap();\n        let entry_deserser = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_deserser,\n            json!({\n                \"name\": \"my_field_name\",\n                \"type\": \"array<bytes>\",\n                \"stored\": true,\n                \"indexed\": true,\n                \"fast\": false,\n                \"input_format\": \"base64\",\n                \"output_format\": \"base64\"\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_bytes_mapping_arr_and_fast_forbidden() {\n        let err = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"type\": \"array<bytes>\",\n                \"fast\": true\n            }\n            \"#,\n        )\n        .err()\n        .unwrap();\n        assert_eq!(\n            err.to_string(),\n            \"error while 
parsing field `my_field_name`: fast field is not allowed for array<bytes>\",\n        );\n    }\n\n    #[test]\n    fn test_parse_json_mapping_singlevalue() {\n        let field_mapping_entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"type\": \"json\",\n                \"name\": \"my_json_field\",\n                \"stored\": true\n            }\n            \"#,\n        )\n        .unwrap();\n        let expected_json_options = QuickwitJsonOptions {\n            description: None,\n            indexing_options: Some(TextIndexingOptions::default_json()),\n            stored: true,\n            fast: FastFieldOptions::Disabled,\n            expand_dots: true,\n        };\n        assert_eq!(&field_mapping_entry.name, \"my_json_field\");\n        assert!(\n            matches!(field_mapping_entry.mapping_type, FieldMappingType::Json(json_config,\n            Cardinality::SingleValued) if json_config == expected_json_options)\n        );\n    }\n\n    #[test]\n    fn test_quickwit_json_options_default_tokenizer_is_raw() {\n        let quickwit_json_options = QuickwitJsonOptions::default();\n        assert_eq!(\n            quickwit_json_options\n                .indexing_options\n                .unwrap()\n                .tokenizer\n                .name(),\n            \"raw\"\n        );\n    }\n\n    #[test]\n    fn test_quickwit_json_options_default_fast_is_false() {\n        let quickwit_json_options = QuickwitJsonOptions::default();\n        assert_eq!(quickwit_json_options.fast, FastFieldOptions::Disabled);\n    }\n\n    #[test]\n    fn test_quickwit_json_options_default_consistent_with_default() {\n        let quickwit_json_options: QuickwitJsonOptions = serde_json::from_str(\"{}\").unwrap();\n        assert_eq!(quickwit_json_options, QuickwitJsonOptions::default());\n    }\n\n    #[test]\n    fn test_parse_json_mapping_multivalued() {\n        let field_mapping_entry = 
serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"type\": \"array<json>\",\n                \"name\": \"my_json_field_multi\",\n                \"tokenizer\": \"raw\",\n                \"stored\": false,\n                \"fast\": false\n            }\n            \"#,\n        )\n        .unwrap();\n        let expected_json_options = QuickwitJsonOptions {\n            description: None,\n            indexing_options: Some(TextIndexingOptions::default_json()),\n            stored: false,\n            expand_dots: true,\n            fast: FastFieldOptions::Disabled,\n        };\n        assert_eq!(&field_mapping_entry.name, \"my_json_field_multi\");\n        assert!(\n            matches!(field_mapping_entry.mapping_type, FieldMappingType::Json(json_config,\n            Cardinality::MultiValued) if json_config == expected_json_options)\n        );\n    }\n\n    #[test]\n    fn test_serialize_i64_with_description_field() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"description\": \"If you see this description, your test failed\",\n                \"type\": \"i64\"\n            }\"#,\n        )\n        .unwrap();\n\n        let entry_str = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_str,\n            serde_json::json!({\n                \"name\": \"my_field_name\",\n                \"description\": \"If you see this description, your test failed\",\n                \"type\": \"i64\",\n                \"stored\": true,\n                \"fast\": false,\n                \"indexed\": true,\n                \"coerce\": true,\n                \"output_format\": \"number\"\n            })\n        );\n    }\n\n    #[test]\n    fn test_serialize_text_with_description_field() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n       
         \"name\": \"my_field_name\",\n                \"description\": \"If you see this description, your test failed\",\n                \"type\": \"text\"\n            }\"#,\n        )\n        .unwrap();\n\n        let entry_str = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_str,\n            serde_json::json!({\n                \"name\": \"my_field_name\",\n                \"description\": \"If you see this description, your test failed\",\n                \"type\": \"text\",\n                \"fast\": false,\n                \"stored\": true,\n                \"indexed\": true,\n                \"record\": \"basic\",\n                \"tokenizer\": \"default\",\n                \"fieldnorms\": false,\n            })\n        );\n    }\n\n    #[test]\n    fn test_serialize_json_with_description_field() {\n        let entry = serde_json::from_str::<FieldMappingEntry>(\n            r#\"\n            {\n                \"name\": \"my_field_name\",\n                \"description\": \"If you see this description, your test failed\",\n                \"type\": \"json\"\n            }\"#,\n        )\n        .unwrap();\n\n        let entry_str = serde_json::to_value(&entry).unwrap();\n        assert_eq!(\n            entry_str,\n            serde_json::json!({\n                \"name\": \"my_field_name\",\n                \"description\": \"If you see this description, your test failed\",\n                \"type\": \"json\",\n                \"stored\": true,\n                \"indexed\": true,\n                \"tokenizer\": \"raw\",\n                \"record\": \"basic\",\n                \"fast\": false,\n                \"expand_dots\": true,\n            })\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/field_mapping_type.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse tantivy::schema::Type;\n\nuse super::date_time_type::QuickwitDateTimeOptions;\nuse super::field_mapping_entry::QuickwitBoolOptions;\nuse crate::Cardinality;\nuse crate::doc_mapper::field_mapping_entry::{\n    QuickwitBytesOptions, QuickwitConcatenateOptions, QuickwitIpAddrOptions, QuickwitJsonOptions,\n    QuickwitNumericOptions, QuickwitObjectOptions, QuickwitTextOptions,\n};\n\n/// A `FieldMappingType` defines the type and indexing options\n/// of a mapping field.\n#[derive(Clone, Debug, PartialEq)]\npub enum FieldMappingType {\n    /// String mapping type configuration.\n    Text(QuickwitTextOptions, Cardinality),\n    /// Signed 64-bit integer mapping type configuration.\n    I64(QuickwitNumericOptions, Cardinality),\n    /// Unsigned 64-bit integer mapping type configuration.\n    U64(QuickwitNumericOptions, Cardinality),\n    /// DateTime mapping type configuration\n    DateTime(QuickwitDateTimeOptions, Cardinality),\n    /// 64-bit float mapping type configuration.\n    F64(QuickwitNumericOptions, Cardinality),\n    /// Bool mapping type configuration.\n    Bool(QuickwitBoolOptions, Cardinality),\n    /// IP Address mapping type configuration.\n    IpAddr(QuickwitIpAddrOptions, Cardinality),\n    /// Bytes mapping type configuration.\n    Bytes(QuickwitBytesOptions, Cardinality),\n    /// Json mapping type configuration.\n    
Json(QuickwitJsonOptions, Cardinality),\n    /// Object mapping type configuration.\n    Object(QuickwitObjectOptions),\n    /// Concatenate field mapping type configuration.\n    Concatenate(QuickwitConcatenateOptions),\n}\n\nimpl FieldMappingType {\n    /// Returns the `QuickwitFieldType` of this mapping: a simple or array primitive type,\n    /// `object`, or `concatenate`.\n    pub fn quickwit_field_type(&self) -> QuickwitFieldType {\n        let (primitive_type, cardinality) = match self {\n            FieldMappingType::Text(_, cardinality) => (Type::Str, *cardinality),\n            FieldMappingType::I64(_, cardinality) => (Type::I64, *cardinality),\n            FieldMappingType::U64(_, cardinality) => (Type::U64, *cardinality),\n            FieldMappingType::F64(_, cardinality) => (Type::F64, *cardinality),\n            FieldMappingType::Bool(_, cardinality) => (Type::Bool, *cardinality),\n            FieldMappingType::IpAddr(_, cardinality) => (Type::IpAddr, *cardinality),\n            FieldMappingType::DateTime(_, cardinality) => (Type::Date, *cardinality),\n            FieldMappingType::Bytes(_, cardinality) => (Type::Bytes, *cardinality),\n            FieldMappingType::Json(_, cardinality) => (Type::Json, *cardinality),\n            FieldMappingType::Object(_) => {\n                return QuickwitFieldType::Object;\n            }\n            FieldMappingType::Concatenate(_) => return QuickwitFieldType::Concatenate,\n        };\n        match cardinality {\n            Cardinality::SingleValued => QuickwitFieldType::Simple(primitive_type),\n            Cardinality::MultiValued => QuickwitFieldType::Array(primitive_type),\n        }\n    }\n}\n\n/// Type of a field mapping as it appears in the `type` attribute of a mapping entry,\n/// e.g. `text`, `array<i64>`, `object`, or `concatenate`.\n#[derive(Debug, Eq, PartialEq)]\npub enum QuickwitFieldType {\n    /// Single-valued primitive type, e.g. `i64`.\n    Simple(Type),\n    /// Object type (`object`).\n    Object,\n    /// Concatenate type (`concatenate`).\n    Concatenate,\n    /// Multi-valued primitive type, e.g. `array<i64>`.\n    Array(Type),\n}\n\nimpl QuickwitFieldType {\n    /// Returns the type identifier, e.g. `i64`, `array<i64>`, or `object`.\n    pub fn to_type_id(&self) -> String {\n        match self {\n            QuickwitFieldType::Simple(typ) => primitive_type_to_str(typ).to_string(),\n            QuickwitFieldType::Object => \"object\".to_string(),\n            
QuickwitFieldType::Array(typ) => format!(\"array<{}>\", primitive_type_to_str(typ)),\n            QuickwitFieldType::Concatenate => \"concatenate\".to_string(),\n        }\n    }\n\n    pub fn parse_type_id(type_str: &str) -> Option<QuickwitFieldType> {\n        if type_str == \"object\" {\n            return Some(QuickwitFieldType::Object);\n        }\n        if type_str == \"concatenate\" {\n            return Some(QuickwitFieldType::Concatenate);\n        }\n        if type_str.starts_with(\"array<\") && type_str.ends_with('>') {\n            let parsed_type_str = parse_primitive_type(&type_str[6..type_str.len() - 1])?;\n            return Some(QuickwitFieldType::Array(parsed_type_str));\n        }\n        let parsed_type_str = parse_primitive_type(type_str)?;\n        Some(QuickwitFieldType::Simple(parsed_type_str))\n    }\n}\n\nfn parse_primitive_type(primitive_type_str: &str) -> Option<Type> {\n    match primitive_type_str {\n        \"text\" => Some(Type::Str),\n        \"u64\" => Some(Type::U64),\n        \"i64\" => Some(Type::I64),\n        \"f64\" => Some(Type::F64),\n        \"bool\" => Some(Type::Bool),\n        \"ip\" => Some(Type::IpAddr),\n        \"datetime\" => Some(Type::Date),\n        \"bytes\" => Some(Type::Bytes),\n        \"json\" => Some(Type::Json),\n        _unknown_type => None,\n    }\n}\n\nfn primitive_type_to_str(primitive_type: &Type) -> &'static str {\n    match primitive_type {\n        Type::Str => \"text\",\n        Type::U64 => \"u64\",\n        Type::I64 => \"i64\",\n        Type::F64 => \"f64\",\n        Type::Bool => \"bool\",\n        Type::IpAddr => \"ip\",\n        Type::Date => \"datetime\",\n        Type::Bytes => \"bytes\",\n        Type::Json => \"json\",\n        Type::Facet => {\n            unimplemented!(\"Facets are not supported by quickwit at the moment.\")\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::schema::Type;\n\n    use super::QuickwitFieldType;\n\n    #[track_caller]\n    fn 
test_parse_type_aux(type_str: &str, expected: Option<QuickwitFieldType>) {\n        let quickwit_field_type = QuickwitFieldType::parse_type_id(type_str);\n        assert_eq!(quickwit_field_type, expected);\n    }\n\n    #[test]\n    fn test_parse_type() {\n        test_parse_type_aux(\"array<i64>\", Some(QuickwitFieldType::Array(Type::I64)));\n        test_parse_type_aux(\"array<text>\", Some(QuickwitFieldType::Array(Type::Str)));\n        test_parse_type_aux(\"array<texto>\", None);\n        test_parse_type_aux(\"text\", Some(QuickwitFieldType::Simple(Type::Str)));\n        test_parse_type_aux(\"object\", Some(QuickwitFieldType::Object));\n        test_parse_type_aux(\"object2\", None);\n        test_parse_type_aux(\"bool\", Some(QuickwitFieldType::Simple(Type::Bool)));\n        test_parse_type_aux(\"ip\", Some(QuickwitFieldType::Simple(Type::IpAddr)));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/field_presence.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse fnv::FnvHashSet;\nuse quickwit_common::PathHasher;\nuse tantivy::Document;\nuse tantivy::schema::document::{ReferenceValue, ReferenceValueLeaf};\nuse tantivy::schema::{FieldType, Schema, Value};\n\n/// Populates the field presence for a document.\n///\n/// The field presence is a set of hashes that represent the fields that are present in the\n/// document. Each hash is computed from the field path.\n///\n/// It is only added if the field is indexed and not fast.\npub(crate) fn populate_field_presence<D: Document>(\n    document: &D,\n    schema: &Schema,\n    populate_object_fields: bool,\n) -> FnvHashSet<u64> {\n    let mut field_presence_hashes: FnvHashSet<u64> =\n        FnvHashSet::with_capacity_and_hasher(schema.num_fields(), Default::default());\n    for (field, value) in document.iter_fields_and_values() {\n        let field_entry = schema.get_field_entry(field);\n        if !field_entry.is_indexed() || field_entry.is_fast() {\n            // We are using tantivy's ExistsQuery for fast fields.\n            continue;\n        }\n        let mut path_hasher: PathHasher = PathHasher::default();\n        path_hasher.append(&field.field_id().to_le_bytes()[..]);\n        if let Some(json_obj) = value.as_object() {\n            let is_expand_dots_enabled: bool =\n                if let FieldType::JsonObject(json_options) = 
field_entry.field_type() {\n                    json_options.is_expand_dots_enabled()\n                } else {\n                    false\n                };\n            let mut subfields_populator = SubfieldsPopulator {\n                populate_object_fields,\n                is_expand_dots_enabled,\n                field_presence_hashes,\n            };\n            subfields_populator.populate_field_presence_for_json_obj(path_hasher, json_obj);\n            field_presence_hashes = subfields_populator.field_presence_hashes;\n        } else {\n            field_presence_hashes.insert(path_hasher.finish_leaf());\n        }\n    }\n    field_presence_hashes\n}\n\n/// A struct to help populate field presence hashes for nested JSON fields.\nstruct SubfieldsPopulator {\n    populate_object_fields: bool,\n    is_expand_dots_enabled: bool,\n    field_presence_hashes: FnvHashSet<u64>,\n}\n\nimpl SubfieldsPopulator {\n    #[inline]\n    fn populate_field_presence_for_json_value<'a>(\n        &mut self,\n        path_hasher: PathHasher,\n        json_value: impl Value<'a>,\n    ) {\n        match json_value.as_value() {\n            ReferenceValue::Leaf(ReferenceValueLeaf::Null) => {}\n            ReferenceValue::Leaf(_) => {\n                self.field_presence_hashes.insert(path_hasher.finish_leaf());\n            }\n            ReferenceValue::Array(items) => {\n                for item in items {\n                    self.populate_field_presence_for_json_value(path_hasher.clone(), item);\n                }\n            }\n            ReferenceValue::Object(json_obj) => {\n                self.populate_field_presence_for_json_obj(path_hasher, json_obj);\n            }\n        }\n    }\n\n    fn populate_field_presence_for_json_obj<'a, I, V>(\n        &mut self,\n        path_hasher: PathHasher,\n        json_obj: I,\n    ) where\n        I: Iterator<Item = (&'a str, V)>,\n        V: Value<'a>,\n    {\n        if self.populate_object_fields {\n            
self.field_presence_hashes\n                .insert(path_hasher.finish_intermediate());\n        }\n        for (field_key, field_value) in json_obj {\n            let mut child_path_hasher = path_hasher.clone();\n            if self.is_expand_dots_enabled {\n                let mut expanded_key = field_key.split('.').peekable();\n                while let Some(segment) = expanded_key.next() {\n                    child_path_hasher.append(segment.as_bytes());\n                    if self.populate_object_fields && expanded_key.peek().is_some() {\n                        self.field_presence_hashes\n                            .insert(child_path_hasher.finish_intermediate());\n                    }\n                }\n            } else {\n                child_path_hasher.append(field_key.as_bytes());\n            };\n            self.populate_field_presence_for_json_value(child_path_hasher, field_value);\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::TantivyDocument;\n    use tantivy::schema::*;\n\n    use super::*;\n\n    #[test]\n    fn test_populate_field_presence_basic() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"indexed_text\", TEXT);\n        schema_builder.add_text_field(\"text_not_indexed\", STORED);\n        let schema = schema_builder.build();\n        let json_doc = r#\"{\"indexed_text\": \"hello\", \"text_not_indexed\": \"world\"}\"#;\n        let document = TantivyDocument::parse_json(&schema, json_doc).unwrap();\n\n        let field_presence = populate_field_presence(&document, &schema, true);\n        assert_eq!(field_presence.len(), 1);\n    }\n\n    #[test]\n    fn test_populate_field_presence_with_array() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"list\", TEXT);\n        let schema = schema_builder.build();\n        let json_doc = r#\"{\"list\": [\"value1\", \"value2\"]}\"#;\n        let document = 
TantivyDocument::parse_json(&schema, json_doc).unwrap();\n\n        let field_presence = populate_field_presence(&document, &schema, true);\n        assert_eq!(field_presence.len(), 1);\n    }\n\n    #[test]\n    fn test_populate_field_presence_with_json() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_json_field(\"json\", TEXT);\n        let schema = schema_builder.build();\n        let json_doc = r#\"{\"json\": {\"subfield\": \"a\"}}\"#;\n        let document = TantivyDocument::parse_json(&schema, json_doc).unwrap();\n\n        let field_presence = populate_field_presence(&document, &schema, false);\n        assert_eq!(field_presence.len(), 1);\n        let field_presence = populate_field_presence(&document, &schema, true);\n        assert_eq!(field_presence.len(), 2);\n    }\n\n    #[test]\n    fn test_populate_field_presence_with_nested_jsons() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_json_field(\"json\", TEXT);\n        let schema = schema_builder.build();\n        let json_doc = r#\"{\"json\": {\"subfield\": {\"subsubfield\": \"a\"}}}\"#;\n        let document = TantivyDocument::parse_json(&schema, json_doc).unwrap();\n\n        let field_presence = populate_field_presence(&document, &schema, false);\n        assert_eq!(field_presence.len(), 1);\n        let field_presence = populate_field_presence(&document, &schema, true);\n        assert_eq!(field_presence.len(), 3);\n    }\n\n    #[test]\n    fn test_populate_field_presence_with_array_of_objects() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_json_field(\"json\", TEXT);\n        let schema = schema_builder.build();\n        let json_doc = r#\"{\"json\": {\"list\": [{\"key1\":\"value1\"}, {\"key2\":\"value2\"}]}}\"#;\n        let document = TantivyDocument::parse_json(&schema, json_doc).unwrap();\n\n        let field_presence = populate_field_presence(&document, &schema, false);\n        
assert_eq!(field_presence.len(), 2);\n        let field_presence = populate_field_presence(&document, &schema, true);\n        assert_eq!(field_presence.len(), 4);\n    }\n\n    #[test]\n    fn test_populate_field_presence_with_expand_dots() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_json_field(\n            \"json\",\n            Into::<JsonObjectOptions>::into(TEXT).set_expand_dots_enabled(),\n        );\n        let schema = schema_builder.build();\n        let json_doc = r#\"{\"json\": {\"key.with.dots\": \"value\"}}\"#;\n        let document = TantivyDocument::parse_json(&schema, json_doc).unwrap();\n\n        let field_presence = populate_field_presence(&document, &schema, false);\n        assert_eq!(field_presence.len(), 1);\n        let field_presence = populate_field_presence(&document, &schema, true);\n        assert_eq!(field_presence.len(), 4);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/mapping_tree.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::type_name;\nuse std::collections::BTreeMap;\nuse std::net::IpAddr;\nuse std::str::FromStr;\n\nuse anyhow::bail;\nuse itertools::Itertools;\nuse serde_json::Value as JsonValue;\nuse serde_json_borrow::{Map as BorrowedJsonMap, Value as BorrowedJsonValue};\nuse tantivy::TantivyDocument as Document;\nuse tantivy::schema::{\n    BytesOptions, DateOptions, Field, IntoIpv6Addr, IpAddrOptions, JsonObjectOptions,\n    NumericOptions, OwnedValue as TantivyValue, SchemaBuilder, TextOptions,\n};\n\nuse super::date_time_type::QuickwitDateTimeOptions;\nuse super::field_mapping_entry::QuickwitBoolOptions;\nuse super::tantivy_val_to_json::formatted_tantivy_value_to_json;\nuse crate::doc_mapper::field_mapping_entry::{\n    QuickwitBytesOptions, QuickwitIpAddrOptions, QuickwitNumericOptions, QuickwitObjectOptions,\n    QuickwitTextOptions,\n};\nuse crate::doc_mapper::{FieldMappingType, QuickwitJsonOptions};\nuse crate::{Cardinality, DocParsingError, FieldMappingEntry, ModeType};\n\n#[derive(Clone, Debug)]\npub enum LeafType {\n    Bool(QuickwitBoolOptions),\n    Bytes(QuickwitBytesOptions),\n    DateTime(QuickwitDateTimeOptions),\n    F64(QuickwitNumericOptions),\n    I64(QuickwitNumericOptions),\n    U64(QuickwitNumericOptions),\n    IpAddr(QuickwitIpAddrOptions),\n    Json(QuickwitJsonOptions),\n    Text(QuickwitTextOptions),\n}\n\nenum 
MapOrArrayIter {\n    Array(std::vec::IntoIter<JsonValue>),\n    Map(serde_json::map::IntoIter),\n    Value(JsonValue),\n}\n\nimpl Iterator for MapOrArrayIter {\n    type Item = JsonValue;\n\n    fn next(&mut self) -> Option<JsonValue> {\n        match self {\n            MapOrArrayIter::Array(iter) => iter.next(),\n            MapOrArrayIter::Map(iter) => iter.next().map(|(_, val)| val),\n            MapOrArrayIter::Value(val) => {\n                if val.is_null() {\n                    None\n                } else {\n                    Some(std::mem::take(val))\n                }\n            }\n        }\n    }\n}\n\n/// Iterate over all primitive values inside the provided JsonValue, ignoring Nulls, and opening\n/// arrays and objects.\npub(crate) struct JsonValueIterator {\n    currently_itered: Vec<MapOrArrayIter>,\n}\n\nimpl JsonValueIterator {\n    pub fn new(source: JsonValue) -> JsonValueIterator {\n        let base_value = match source {\n            JsonValue::Array(array) => MapOrArrayIter::Array(array.into_iter()),\n            JsonValue::Object(map) => MapOrArrayIter::Map(map.into_iter()),\n            other => MapOrArrayIter::Value(other),\n        };\n        JsonValueIterator {\n            currently_itered: vec![base_value],\n        }\n    }\n}\n\nimpl Iterator for JsonValueIterator {\n    type Item = JsonValue;\n\n    fn next(&mut self) -> Option<JsonValue> {\n        loop {\n            let currently_itered = self.currently_itered.last_mut()?;\n            match currently_itered.next() {\n                Some(JsonValue::Array(array)) => self\n                    .currently_itered\n                    .push(MapOrArrayIter::Array(array.into_iter())),\n                Some(JsonValue::Object(map)) => self\n                    .currently_itered\n                    .push(MapOrArrayIter::Map(map.into_iter())),\n                Some(JsonValue::Null) => continue,\n                Some(other) => return Some(other),\n                None => {\n        
            self.currently_itered.pop();\n                    continue;\n                }\n            }\n        }\n    }\n}\n\nenum OneOrIter<T, I: Iterator<Item = T>> {\n    One(Option<T>),\n    Iter(I),\n}\n\nimpl<T, I: Iterator<Item = T>> OneOrIter<T, I> {\n    pub fn one(item: T) -> Self {\n        OneOrIter::One(Some(item))\n    }\n}\n\nimpl<T, I: Iterator<Item = T>> Iterator for OneOrIter<T, I> {\n    type Item = T;\n\n    fn next(&mut self) -> Option<T> {\n        match self {\n            OneOrIter::Iter(iter) => iter.next(),\n            OneOrIter::One(item) => std::mem::take(item),\n        }\n    }\n}\n\n/// Similar to the native `From<JsonValue> for TantivyValue` implementation, with a\n/// subtle difference: no automatic parsing to DateTime is performed when the string\n/// is a valid RFC3339 date. This enables some level of range querying through prefix\n/// queries despite concatenate fields not supporting fast fields.\npub(crate) fn map_primitive_json_to_concatenate_value(value: JsonValue) -> Option<TantivyValue> {\n    match value {\n        JsonValue::Array(_) | JsonValue::Object(_) | JsonValue::Null => None,\n        JsonValue::String(text) => Some(TantivyValue::Str(text)),\n        JsonValue::Bool(val) => Some((val).into()),\n        JsonValue::Number(number) => {\n            if let Some(val) = i64::from_json_number(&number) {\n                Some((val).into())\n            } else if let Some(val) = u64::from_json_number(&number) {\n                Some((val).into())\n            } else {\n                f64::from_json_number(&number).map(|val| (val).into())\n            }\n        }\n    }\n}\n\nimpl LeafType {\n    fn validate_from_json(&self, json_val: &BorrowedJsonValue) -> Result<(), String> {\n        match self {\n            LeafType::Text(_) => {\n                if json_val.is_string() {\n                    Ok(())\n                } else {\n                    Err(format!(\"expected string, got `{json_val}`\"))\n                
}\n            }\n            LeafType::I64(numeric_options) => {\n                i64::validate_json(json_val, numeric_options.coerce).map(|_| ())\n            }\n            LeafType::U64(numeric_options) => {\n                u64::validate_json(json_val, numeric_options.coerce).map(|_| ())\n            }\n            LeafType::F64(numeric_options) => {\n                f64::validate_json(json_val, numeric_options.coerce).map(|_| ())\n            }\n            LeafType::Bool(_) => {\n                if json_val.is_bool() {\n                    Ok(())\n                } else {\n                    Err(format!(\"expected boolean, got `{json_val}`\"))\n                }\n            }\n            LeafType::IpAddr(_) => {\n                let Some(ip_address) = json_val.as_str() else {\n                    return Err(format!(\"expected string, got `{json_val}`\"));\n                };\n                IpAddr::from_str(ip_address)\n                    .map_err(|err| format!(\"failed to parse IP address `{ip_address}`: {err}\"))?;\n                Ok(())\n            }\n            LeafType::DateTime(date_time_options) => {\n                date_time_options.validate_json(json_val).map(|_| ())\n            }\n            LeafType::Bytes(binary_options) => {\n                if let Some(byte_str) = json_val.as_str() {\n                    binary_options.input_format.parse_str(byte_str)?;\n                    Ok(())\n                } else {\n                    Err(format!(\n                        \"expected {} string, got `{json_val}`\",\n                        binary_options.input_format.as_str()\n                    ))\n                }\n            }\n            LeafType::Json(_) => {\n                if json_val.is_object() {\n                    Ok(())\n                } else {\n                    Err(format!(\"expected object, got `{json_val}`\"))\n                }\n            }\n        }\n    }\n\n    fn value_from_json(&self, json_val: JsonValue) -> 
Result<TantivyValue, String> {\n        match self {\n            LeafType::Text(_) => {\n                if let JsonValue::String(text) = json_val {\n                    Ok(TantivyValue::Str(text))\n                } else {\n                    Err(format!(\"expected string, got `{json_val}`\"))\n                }\n            }\n            LeafType::I64(numeric_options) => i64::from_json(json_val, numeric_options.coerce),\n            LeafType::U64(numeric_options) => u64::from_json(json_val, numeric_options.coerce),\n            LeafType::F64(numeric_options) => f64::from_json(json_val, numeric_options.coerce),\n            LeafType::Bool(_) => {\n                if let JsonValue::Bool(val) = json_val {\n                    Ok(TantivyValue::Bool(val))\n                } else {\n                    Err(format!(\"expected boolean, got `{json_val}`\"))\n                }\n            }\n            LeafType::IpAddr(_) => {\n                if let JsonValue::String(ip_address) = json_val {\n                    let ipv6_value = IpAddr::from_str(ip_address.as_str())\n                        .map_err(|err| format!(\"failed to parse IP address `{ip_address}`: {err}\"))?\n                        .into_ipv6_addr();\n                    Ok(TantivyValue::IpAddr(ipv6_value))\n                } else {\n                    Err(format!(\"expected string, got `{json_val}`\"))\n                }\n            }\n            LeafType::DateTime(date_time_options) => date_time_options.parse_json(&json_val),\n            LeafType::Bytes(binary_options) => binary_options.input_format.parse_json(&json_val),\n            LeafType::Json(_) => {\n                if let JsonValue::Object(json_obj) = json_val {\n                    Ok(TantivyValue::Object(\n                        json_obj\n                            .into_iter()\n                            .map(|(key, val)| (key, val.into()))\n                            .collect(),\n                    ))\n                } else {\n     
               Err(format!(\"expected object, got `{json_val}`\"))\n                }\n            }\n        }\n    }\n\n    fn concatenate_values_from_json(\n        &self,\n        json_val: JsonValue,\n    ) -> Result<impl Iterator<Item = TantivyValue>, String> {\n        match self {\n            LeafType::Text(_) => {\n                if let JsonValue::String(text) = json_val {\n                    Ok(OneOrIter::one(TantivyValue::Str(text)))\n                } else {\n                    Err(format!(\"expected string, got `{json_val}`\"))\n                }\n            }\n            LeafType::I64(numeric_options) => {\n                let val = i64::from_json_to_self(&json_val, numeric_options.coerce)?;\n                Ok(OneOrIter::one((val).into()))\n            }\n            LeafType::U64(numeric_options) => {\n                let val = u64::from_json_to_self(&json_val, numeric_options.coerce)?;\n                Ok(OneOrIter::one((val).into()))\n            }\n            LeafType::F64(numeric_options) => {\n                let val = f64::from_json_to_self(&json_val, numeric_options.coerce)?;\n                Ok(OneOrIter::one((val).into()))\n            }\n            LeafType::Bool(_) => {\n                if let JsonValue::Bool(val) = json_val {\n                    Ok(OneOrIter::one((val).into()))\n                } else {\n                    Err(format!(\"expected boolean, got `{json_val}`\"))\n                }\n            }\n            LeafType::IpAddr(_) => Err(\"unsupported concat type: IpAddr\".to_string()),\n            LeafType::DateTime(_date_time_options) => {\n                Err(\"unsupported concat type: DateTime\".to_string())\n            }\n            LeafType::Bytes(_binary_options) => Err(\"unsupported concat type: Bytes\".to_string()),\n            LeafType::Json(_) => {\n                if let JsonValue::Object(json_obj) = json_val {\n                    Ok(OneOrIter::Iter(\n                        json_obj\n                 
           .into_iter()\n                            .flat_map(|(_key, val)| JsonValueIterator::new(val))\n                            .flat_map(map_primitive_json_to_concatenate_value),\n                    ))\n                } else {\n                    Err(format!(\"expected object, got `{json_val}`\"))\n                }\n            }\n        }\n    }\n\n    fn supported_for_concat(&self) -> bool {\n        use LeafType::*;\n        matches!(self, Text(_) | U64(_) | I64(_) | F64(_) | Bool(_) | Json(_))\n        /*\n            // Since concat is a JSON field, anything that JSON supports can be supported\n            DateTime(_), // Could be supported if the date is converted to Rfc3339\n            IpAddr(_),\n            // won't be supported\n            Bytes(_),\n        */\n    }\n}\n\n#[derive(Clone)]\npub(crate) struct MappingLeaf {\n    field: Field,\n    typ: LeafType,\n    cardinality: Cardinality,\n    // concatenate fields this field is part of\n    concatenate: Vec<Field>,\n}\n\nimpl MappingLeaf {\n    fn validate_from_json(\n        &self,\n        json_value: &BorrowedJsonValue,\n        path: &[&str],\n    ) -> Result<(), DocParsingError> {\n        if json_value.is_null() {\n            // We just ignore `null`.\n            return Ok(());\n        }\n        if let BorrowedJsonValue::Array(els) = json_value {\n            if self.cardinality == Cardinality::SingleValued {\n                return Err(DocParsingError::MultiValuesNotSupported(path.join(\".\")));\n            }\n            for el_json_val in els {\n                if el_json_val.is_null() {\n                    // We just ignore `null`.\n                    continue;\n                }\n                self.typ\n                    .validate_from_json(el_json_val)\n                    .map_err(|err_msg| DocParsingError::ValueError(path.join(\".\"), err_msg))?;\n            }\n            return Ok(());\n        }\n\n        self.typ\n            
.validate_from_json(json_value)\n            .map_err(|err_msg| DocParsingError::ValueError(path.join(\".\"), err_msg))?;\n\n        Ok(())\n    }\n\n    pub fn doc_from_json(\n        &self,\n        json_val: JsonValue,\n        document: &mut Document,\n        path: &mut [String],\n    ) -> Result<(), DocParsingError> {\n        if json_val.is_null() {\n            // We just ignore `null`.\n            return Ok(());\n        }\n        if let JsonValue::Array(els) = json_val {\n            if self.cardinality == Cardinality::SingleValued {\n                return Err(DocParsingError::MultiValuesNotSupported(path.join(\".\")));\n            }\n            for el_json_val in els {\n                if el_json_val.is_null() {\n                    // We just ignore `null`.\n                    continue;\n                }\n                if !self.concatenate.is_empty() {\n                    let concat_values = self\n                        .typ\n                        .concatenate_values_from_json(el_json_val.clone())\n                        .map_err(|err_msg| DocParsingError::ValueError(path.join(\".\"), err_msg))?;\n                    for concat_value in concat_values {\n                        for field in &self.concatenate {\n                            document.add_field_value(*field, &concat_value);\n                        }\n                    }\n                }\n                let value = self\n                    .typ\n                    .value_from_json(el_json_val)\n                    .map_err(|err_msg| DocParsingError::ValueError(path.join(\".\"), err_msg))?;\n                document.add_field_value(self.field, &value);\n            }\n            return Ok(());\n        }\n\n        if !self.concatenate.is_empty() {\n            let concat_values = self\n                .typ\n                .concatenate_values_from_json(json_val.clone())\n                .map_err(|err_msg| DocParsingError::ValueError(path.join(\".\"), err_msg))?;\n       
     for concat_value in concat_values {\n                for field in &self.concatenate {\n                    document.add_field_value(*field, &concat_value);\n                }\n            }\n        }\n        let value = self\n            .typ\n            .value_from_json(json_val)\n            .map_err(|err_msg| DocParsingError::ValueError(path.join(\".\"), err_msg))?;\n        document.add_field_value(self.field, &value);\n        Ok(())\n    }\n\n    fn populate_json<'a>(\n        &'a self,\n        named_doc: &mut BTreeMap<String, Vec<TantivyValue>>,\n        field_path: &[&'a str],\n        doc_json: &mut serde_json::Map<String, JsonValue>,\n    ) {\n        if let Some(json_val) =\n            extract_json_val(self.get_type(), named_doc, field_path, self.cardinality)\n        {\n            insert_json_val(field_path, json_val, doc_json);\n        }\n    }\n\n    pub fn get_type(&self) -> &LeafType {\n        &self.typ\n    }\n}\n\nfn extract_json_val(\n    leaf_type: &LeafType,\n    named_doc: &mut BTreeMap<String, Vec<TantivyValue>>,\n    field_path: &[&str],\n    cardinality: Cardinality,\n) -> Option<JsonValue> {\n    let mut full_path = field_path.join(\".\");\n    let vals: Vec<TantivyValue> = if let Some(vals) = named_doc.remove(&full_path) {\n        // we have our value directly\n        vals\n    } else {\n        let mut end_range = full_path.clone();\n        full_path.push('.');\n        // '/' is the character directly after . 
lexicographically\n        end_range.push('/');\n\n        // TODO use BTreeMap::drain once it exists and is stable\n        let matches = named_doc\n            .range::<String, _>(&full_path..&end_range)\n            .map(|(k, _)| k.clone())\n            .collect::<Vec<_>>();\n\n        if !matches.is_empty() {\n            let mut map = Vec::new();\n            for match_ in matches {\n                let Some(suffix) = match_.strip_prefix(&full_path) else {\n                    // this should never happen\n                    continue;\n                };\n                let Some(tantivy_values) = named_doc.remove(&match_) else {\n                    continue;\n                };\n\n                add_key_to_vec_map(&mut map, suffix, tantivy_values);\n            }\n            vec![TantivyValue::Object(map)]\n        } else {\n            // we didn't find our value, or any child of it, but maybe what we search is actually a\n            // json field closer to the root?\n            let mut split_point_iter = (1..(field_path.len())).rev();\n            loop {\n                let split_point = split_point_iter.next()?;\n                let (doc_path, json_path) = field_path.split_at(split_point);\n                let prefix_path = doc_path.join(\".\");\n                if let Some(vals) = named_doc.get_mut(&prefix_path) {\n                    // if we found a possible json field, there is no point in searching higher, our\n                    // result would have been in it.\n                    break extract_val_from_tantivy_val(json_path, vals);\n                }\n            }\n        }\n    };\n    let mut vals_with_correct_type_it = vals\n        .into_iter()\n        .flat_map(|value| formatted_tantivy_value_to_json(value, leaf_type));\n    match cardinality {\n        Cardinality::SingleValued => vals_with_correct_type_it.next(),\n        Cardinality::MultiValued => Some(JsonValue::Array(vals_with_correct_type_it.collect())),\n    }\n}\n\n/// 
extract a subfield from a TantivyValue. The path must be non-empty\nfn extract_val_from_tantivy_val(\n    full_path: &[&str],\n    tantivy_values: &mut [TantivyValue],\n) -> Vec<TantivyValue> {\n    // return *objects* matching path\n    fn extract_val_aux<'a>(\n        path: &[&str],\n        tantivy_values: &'a mut [TantivyValue],\n    ) -> Vec<&'a mut Vec<(String, TantivyValue)>> {\n        let mut maps: Vec<&'a mut Vec<(String, TantivyValue)>> = tantivy_values\n            .iter_mut()\n            .filter_map(|value| {\n                if let TantivyValue::Object(map) = value {\n                    Some(map)\n                } else {\n                    None\n                }\n            })\n            .collect();\n        let mut scratch_buffer = Vec::new();\n        for path_segment in path {\n            scratch_buffer.extend(\n                maps.drain(..)\n                    .flatten()\n                    .filter(|(key, _)| key == path_segment)\n                    .filter_map(|(_, value)| {\n                        if let TantivyValue::Object(map) = value {\n                            Some(map)\n                        } else {\n                            None\n                        }\n                    }),\n            );\n            std::mem::swap(&mut maps, &mut scratch_buffer);\n        }\n        maps\n    }\n\n    let Some((last_segment, path)) = full_path.split_last() else {\n        return Vec::new();\n    };\n\n    let mut results = Vec::new();\n    for object in extract_val_aux(path, tantivy_values) {\n        // TODO use extract_if once it's stable\n        let mut i = 0;\n        while i < object.len() {\n            if object[i].0 == *last_segment {\n                let (_, val) = object.swap_remove(i);\n                match val {\n                    TantivyValue::Array(mut vals) => results.append(&mut vals),\n                    _ => results.push(val),\n                }\n            } else {\n                i += 1;\n        
    }\n        }\n    }\n\n    results\n}\n\nfn add_key_to_vec_map(\n    mut map: &mut Vec<(String, TantivyValue)>,\n    suffix: &str,\n    mut tantivy_value: Vec<TantivyValue>,\n) {\n    let Ok(full_inner_path) = crate::routing_expression::parse_field_name(suffix) else {\n        return;\n    };\n    let Some((last_segment, inner_path)) = full_inner_path.split_last() else {\n        return;\n    };\n    for path_segment in inner_path {\n        // there is a cleaner way with find(), but the borrow checker is unhappy for no real reason\n        // thinking there are lifetime issues between two exclusive branches\n        map = if let Some(pos) = map.iter().position(|(key, _)| key == path_segment) {\n            if let (_, TantivyValue::Object(ref mut value)) = map[pos] {\n                value\n            } else {\n                // there is already a key before the end of the path ?!\n                return;\n            }\n        } else {\n            map.push((path_segment.to_string(), TantivyValue::Object(Vec::new())));\n            let TantivyValue::Object(ref mut new_map) = map.last_mut().unwrap().1 else {\n                unreachable!();\n            };\n            new_map\n        }\n    }\n    // if we are here the doc mapping was changed from obj to json. We don't really know if the\n    // field of that obj was multivalued or not. As a best effort, we say it was multivalued\n    // if we have !=1 value. 
We could always return a vec, but then *every* field would be\n    // transformed into an array of itself.\n    if tantivy_value.len() == 1 {\n        map.push((last_segment.to_string(), tantivy_value.pop().unwrap()));\n    } else {\n        map.push((last_segment.to_string(), TantivyValue::Array(tantivy_value)));\n    }\n}\n\nfn insert_json_val(\n    field_path: &[&str], //< may not be empty\n    json_val: JsonValue,\n    mut doc_json: &mut serde_json::Map<String, JsonValue>,\n) {\n    let (last_field_name, up_to_last) = field_path.split_last().expect(\"Empty path is forbidden\");\n    for &field_name in up_to_last {\n        let entry = doc_json\n            .entry(field_name.to_string())\n            .or_insert_with(|| JsonValue::Object(Default::default()));\n        if let JsonValue::Object(child_json_obj) = entry {\n            doc_json = child_json_obj;\n        } else {\n            return;\n        }\n    }\n    doc_json.insert(last_field_name.to_string(), json_val);\n}\n\npub(crate) trait NumVal: Sized + FromStr + ToString + Into<TantivyValue> {\n    fn from_json_number(num: &serde_json::Number) -> Option<Self>;\n\n    fn validate_json(json_val: &BorrowedJsonValue, coerce: bool) -> Result<(), String> {\n        match json_val {\n            BorrowedJsonValue::Number(num_val) => {\n                let num_val = serde_json::Number::from(*num_val);\n                Self::from_json_number(&num_val).ok_or_else(|| {\n                    format!(\n                        \"expected {}, got inconvertible JSON number `{}`\",\n                        type_name::<Self>(),\n                        num_val\n                    )\n                })?;\n                Ok(())\n            }\n            BorrowedJsonValue::Str(str_val) => {\n                if coerce {\n                    str_val.parse::<Self>().map_err(|_| {\n                        format!(\n                            \"failed to coerce JSON string `\\\"{str_val}\\\"` to {}\",\n                        
    type_name::<Self>()\n                        )\n                    })?;\n                    Ok(())\n                } else {\n                    Err(format!(\n                        \"expected JSON number, got string `\\\"{str_val}\\\"`. enable coercion to {} \\\n                         with the `coerce` parameter in the field mapping\",\n                        type_name::<Self>()\n                    ))\n                }\n            }\n            _ => {\n                let message = if coerce {\n                    format!(\"expected JSON number or string, got `{json_val}`\")\n                } else {\n                    format!(\"expected JSON number, got `{json_val}`\")\n                };\n                Err(message)\n            }\n        }\n    }\n\n    fn from_json_to_self(json_val: &JsonValue, coerce: bool) -> Result<Self, String> {\n        match json_val {\n            JsonValue::Number(num_val) => Self::from_json_number(num_val).ok_or_else(|| {\n                format!(\n                    \"expected {}, got inconvertible JSON number `{}`\",\n                    type_name::<Self>(),\n                    num_val\n                )\n            }),\n            JsonValue::String(str_val) => {\n                if coerce {\n                    str_val.parse::<Self>().map_err(|_| {\n                        format!(\n                            \"failed to coerce JSON string `\\\"{str_val}\\\"` to {}\",\n                            type_name::<Self>()\n                        )\n                    })\n                } else {\n                    Err(format!(\n                        \"expected JSON number, got string `\\\"{str_val}\\\"`. 
enable coercion to {} \\\n                         with the `coerce` parameter in the field mapping\",\n                        type_name::<Self>()\n                    ))\n                }\n            }\n            _ => {\n                let message = if coerce {\n                    format!(\"expected JSON number or string, got `{json_val}`\")\n                } else {\n                    format!(\"expected JSON number, got `{json_val}`\")\n                };\n                Err(message)\n            }\n        }\n    }\n\n    fn from_json(json_val: JsonValue, coerce: bool) -> Result<TantivyValue, String> {\n        Self::from_json_to_self(&json_val, coerce).map(Self::into)\n    }\n}\n\nimpl NumVal for u64 {\n    fn from_json_number(num: &serde_json::Number) -> Option<Self> {\n        num.as_u64()\n    }\n}\n\nimpl NumVal for i64 {\n    fn from_json_number(num: &serde_json::Number) -> Option<Self> {\n        num.as_i64()\n    }\n}\nimpl NumVal for f64 {\n    fn from_json_number(num: &serde_json::Number) -> Option<Self> {\n        num.as_f64()\n    }\n}\n\n#[derive(Clone, Default)]\npub(crate) struct MappingNode {\n    pub branches: fnv::FnvHashMap<String, MappingTree>,\n    branches_order: Vec<String>,\n}\n\nfn get_or_insert_path<'a>(\n    path: &[String],\n    mut dynamic_json_obj: &'a mut serde_json::Map<String, JsonValue>,\n) -> &'a mut serde_json::Map<String, JsonValue> {\n    for field_name in path {\n        let child_json_val = dynamic_json_obj\n            .entry(field_name.clone())\n            .or_insert_with(|| JsonValue::Object(Default::default()));\n        dynamic_json_obj = if let JsonValue::Object(child_map) = child_json_val {\n            child_map\n        } else {\n            panic!(\"Expected Json object.\");\n        };\n    }\n    dynamic_json_obj\n}\n\nimpl MappingNode {\n    /// Finds the field mapping type for a given field path in the mapping tree.\n    /// Dots in `field_path_as_str` define the boundaries between field names.\n   
 /// If a dot is part of a field name, it must be escaped with '\\'.\n    pub fn find_field_mapping_type(&self, field_path_as_str: &str) -> Option<FieldMappingType> {\n        let field_path = build_field_path_from_str(field_path_as_str);\n        self.internal_find_field_mapping_type(&field_path)\n    }\n\n    fn internal_find_field_mapping_type(&self, field_path: &[String]) -> Option<FieldMappingType> {\n        let (first_path_fragment, sub_field_path) = field_path.split_first()?;\n        let child_tree = self.branches.get(first_path_fragment)?;\n        match (child_tree, sub_field_path.is_empty()) {\n            (_, true) => Some(child_tree.clone().into()),\n            (MappingTree::Leaf(_), false) => None,\n            (MappingTree::Node(child_node), false) => {\n                child_node.internal_find_field_mapping_type(sub_field_path)\n            }\n        }\n    }\n\n    /// Finds the mapping leaf for a given field path in the mapping tree.\n    /// Returns `None` if the path is empty, does not exist, or resolves to an object node\n    /// rather than a leaf.\n    /// Dots in `field_path_as_str` define the boundaries between field names.\n    /// If a dot is part of a field name, it must be escaped with '\\'.\n    pub fn find_field_mapping_leaf<'a>(\n        &'a mut self,\n        field_path_as_str: &str,\n    ) -> Option<impl Iterator<Item = &'a mut MappingLeaf> + use<'a>> {\n        let field_path = build_field_path_from_str(field_path_as_str);\n        self.internal_find_field_mapping_leaf(&field_path)\n    }\n\n    fn internal_find_field_mapping_leaf<'a>(\n        &'a mut self,\n        field_path: &[String],\n    ) -> Option<impl Iterator<Item = &'a mut MappingLeaf> + use<'a>> {\n        let (first_path_fragment, sub_field_path) = field_path.split_first()?;\n        let child_tree = self.branches.get_mut(first_path_fragment)?;\n        match (child_tree, sub_field_path.is_empty()) {\n            (MappingTree::Leaf(_), false) => None,\n            (MappingTree::Node(child_node), false) => {\n                
child_node.internal_find_field_mapping_leaf(sub_field_path)\n            }\n            (MappingTree::Leaf(leaf), true) => Some([leaf].into_iter()),\n            (MappingTree::Node(_), true) => None,\n        }\n    }\n\n    #[cfg(test)]\n    pub fn num_fields(&self) -> usize {\n        self.branches.len()\n    }\n\n    pub fn insert(&mut self, path: &str, node: MappingTree) {\n        self.branches_order.push(path.to_string());\n        self.branches.insert(path.to_string(), node);\n    }\n\n    pub fn ordered_field_mapping_entries(&self) -> Vec<FieldMappingEntry> {\n        assert_eq!(self.branches.len(), self.branches_order.len());\n        let mut field_mapping_entries = Vec::new();\n        for field_name in &self.branches_order {\n            let child_tree = self.branches.get(field_name).expect(\"Missing field\");\n            let field_mapping_entry = FieldMappingEntry {\n                name: field_name.clone(),\n                mapping_type: child_tree.clone().into(),\n            };\n            field_mapping_entries.push(field_mapping_entry);\n        }\n        field_mapping_entries\n    }\n\n    pub fn validate_from_json<'a>(\n        &self,\n        json_obj: &'a BorrowedJsonMap,\n        strict_mode: bool,\n        path: &mut Vec<&'a str>,\n    ) -> Result<(), DocParsingError> {\n        for (field_name, json_val) in json_obj.iter() {\n            if let Some(child_tree) = self.branches.get(field_name) {\n                path.push(field_name);\n                child_tree.validate_from_json(json_val, path, strict_mode)?;\n                path.pop();\n            } else if strict_mode {\n                path.push(field_name);\n                let field_path = path.join(\".\");\n                return Err(DocParsingError::NoSuchFieldInSchema(field_path));\n            }\n        }\n        Ok(())\n    }\n\n    pub fn doc_from_json(\n        &self,\n        json_obj: serde_json::Map<String, JsonValue>,\n        mode: ModeType,\n        document: &mut 
Document,\n        path: &mut Vec<String>,\n        dynamic_json_obj: &mut serde_json::Map<String, JsonValue>,\n    ) -> Result<(), DocParsingError> {\n        for (field_name, val) in json_obj {\n            if let Some(child_tree) = self.branches.get(&field_name) {\n                path.push(field_name);\n                child_tree.doc_from_json(val, mode, document, path, dynamic_json_obj)?;\n                path.pop();\n            } else {\n                match mode {\n                    ModeType::Lenient => {\n                        // In lenient mode we simply ignore these unmapped fields.\n                    }\n                    ModeType::Dynamic => {\n                        let dynamic_json_obj_after_path =\n                            get_or_insert_path(path, dynamic_json_obj);\n                        dynamic_json_obj_after_path.insert(field_name, val);\n                    }\n                    ModeType::Strict => {\n                        path.push(field_name);\n                        let field_path = path.join(\".\");\n                        return Err(DocParsingError::NoSuchFieldInSchema(field_path));\n                    }\n                }\n            }\n        }\n        Ok(())\n    }\n\n    pub fn populate_json<'a>(\n        &'a self,\n        named_doc: &mut BTreeMap<String, Vec<TantivyValue>>,\n        field_path: &mut Vec<&'a str>,\n        doc_json: &mut serde_json::Map<String, JsonValue>,\n    ) {\n        for (field_name, field_mapping) in &self.branches {\n            field_path.push(field_name);\n            field_mapping.populate_json(named_doc, field_path, doc_json);\n            field_path.pop();\n        }\n    }\n}\n\nimpl From<MappingTree> for FieldMappingType {\n    fn from(mapping_tree: MappingTree) -> Self {\n        match mapping_tree {\n            MappingTree::Leaf(leaf) => leaf.into(),\n            MappingTree::Node(node) => FieldMappingType::Object(QuickwitObjectOptions {\n                field_mappings: 
node.into(),\n            }),\n        }\n    }\n}\n\nimpl From<MappingLeaf> for FieldMappingType {\n    fn from(leaf: MappingLeaf) -> Self {\n        match leaf.typ {\n            LeafType::Text(opt) => FieldMappingType::Text(opt, leaf.cardinality),\n            LeafType::I64(opt) => FieldMappingType::I64(opt, leaf.cardinality),\n            LeafType::U64(opt) => FieldMappingType::U64(opt, leaf.cardinality),\n            LeafType::F64(opt) => FieldMappingType::F64(opt, leaf.cardinality),\n            LeafType::Bool(opt) => FieldMappingType::Bool(opt, leaf.cardinality),\n            LeafType::IpAddr(opt) => FieldMappingType::IpAddr(opt, leaf.cardinality),\n            LeafType::DateTime(opt) => FieldMappingType::DateTime(opt, leaf.cardinality),\n            LeafType::Bytes(opt) => FieldMappingType::Bytes(opt, leaf.cardinality),\n            LeafType::Json(opt) => FieldMappingType::Json(opt, leaf.cardinality),\n        }\n    }\n}\n\nimpl From<MappingNode> for Vec<FieldMappingEntry> {\n    fn from(node: MappingNode) -> Self {\n        node.ordered_field_mapping_entries()\n    }\n}\n\n#[derive(Clone)]\npub(crate) enum MappingTree {\n    Leaf(MappingLeaf),\n    Node(MappingNode),\n}\n\nimpl MappingTree {\n    fn validate_from_json<'a>(\n        &self,\n        json_value: &'a BorrowedJsonValue<'a>,\n        field_path: &mut Vec<&'a str>,\n        strict_mode: bool,\n    ) -> Result<(), DocParsingError> {\n        match self {\n            MappingTree::Leaf(mapping_leaf) => {\n                mapping_leaf.validate_from_json(json_value, field_path)\n            }\n            MappingTree::Node(mapping_node) => {\n                if let Some(json_obj) = json_value.as_object() {\n                    mapping_node.validate_from_json(json_obj, strict_mode, field_path)\n                } else {\n                    Err(DocParsingError::ValueError(\n                        field_path.join(\".\"),\n                        format!(\"expected an JSON object, got 
{json_value}\"),\n                    ))\n                }\n            }\n        }\n    }\n\n    fn doc_from_json(\n        &self,\n        json_value: JsonValue,\n        mode: ModeType,\n        document: &mut Document,\n        path: &mut Vec<String>,\n        dynamic_json_obj: &mut serde_json::Map<String, JsonValue>,\n    ) -> Result<(), DocParsingError> {\n        match self {\n            MappingTree::Leaf(mapping_leaf) => {\n                mapping_leaf.doc_from_json(json_value, document, path)\n            }\n            MappingTree::Node(mapping_node) => {\n                if let JsonValue::Object(json_obj) = json_value {\n                    mapping_node.doc_from_json(json_obj, mode, document, path, dynamic_json_obj)\n                } else {\n                    Err(DocParsingError::ValueError(\n                        path.join(\".\"),\n                        format!(\"expected an JSON object, got {json_value}\"),\n                    ))\n                }\n            }\n        }\n    }\n\n    fn populate_json<'a>(\n        &'a self,\n        named_doc: &mut BTreeMap<String, Vec<TantivyValue>>,\n        field_path: &mut Vec<&'a str>,\n        doc_json: &mut serde_json::Map<String, JsonValue>,\n    ) {\n        match self {\n            MappingTree::Leaf(mapping_leaf) => {\n                mapping_leaf.populate_json(named_doc, field_path, doc_json)\n            }\n            MappingTree::Node(mapping_node) => {\n                mapping_node.populate_json(named_doc, field_path, doc_json);\n            }\n        }\n    }\n}\n\npub(crate) struct MappingNodeRoot {\n    /// The root of a mapping tree\n    pub field_mappings: MappingNode,\n    /// The list of concatenate fields which includes the dynamic field\n    pub concatenate_dynamic_fields: Vec<Field>,\n}\n\npub(crate) fn build_mapping_tree(\n    entries: &[FieldMappingEntry],\n    schema: &mut SchemaBuilder,\n) -> anyhow::Result<MappingNodeRoot> {\n    let mut field_path = Vec::new();\n    
build_mapping_tree_from_entries(entries, &mut field_path, schema)\n}\n\nfn build_mapping_tree_from_entries<'a>(\n    entries: &'a [FieldMappingEntry],\n    field_path: &mut Vec<&'a str>,\n    schema: &mut SchemaBuilder,\n) -> anyhow::Result<MappingNodeRoot> {\n    let mut mapping_node = MappingNode::default();\n    let mut concatenate_fields = Vec::new();\n    let mut concatenate_dynamic_fields = Vec::new();\n    for entry in entries {\n        if let FieldMappingType::Concatenate(_) = &entry.mapping_type {\n            concatenate_fields.push(entry);\n        } else {\n            field_path.push(&entry.name);\n            if mapping_node.branches.contains_key(&entry.name) {\n                bail!(\"duplicated field definition `{}`\", entry.name);\n            }\n            let (child_tree, mut dynamic_fields) =\n                build_mapping_from_field_type(&entry.mapping_type, field_path, schema)?;\n            field_path.pop();\n            mapping_node.insert(&entry.name, child_tree);\n            concatenate_dynamic_fields.append(&mut dynamic_fields);\n        }\n    }\n    for concatenate_field_entry in concatenate_fields {\n        let FieldMappingType::Concatenate(options) = &concatenate_field_entry.mapping_type else {\n            // we only pushed Concatenate fields in `concatenate_fields`\n            unreachable!();\n        };\n        let name = &concatenate_field_entry.name;\n        if mapping_node.branches.contains_key(name) {\n            bail!(\"duplicated field definition `{}`\", name);\n        }\n        let text_options: JsonObjectOptions = options.clone().into();\n        let field = schema.add_json_field(name, text_options);\n        for sub_field in &options.concatenate_fields {\n            for matched_field in\n                mapping_node\n                    .find_field_mapping_leaf(sub_field)\n                    .ok_or_else(|| {\n                        anyhow::anyhow!(\"concatenate field uses an unknown field `{sub_field}`\")\n    
                })?\n            {\n                if !matched_field.typ.supported_for_concat() {\n                    bail!(\n                        \"subfield `{}` not supported inside a concatenate field\",\n                        sub_field\n                    );\n                }\n                matched_field.concatenate.push(field);\n            }\n        }\n        if options.include_dynamic_fields {\n            concatenate_dynamic_fields.push(field);\n        }\n    }\n    Ok(MappingNodeRoot {\n        field_mappings: mapping_node,\n        concatenate_dynamic_fields,\n    })\n}\n\nfn get_numeric_options_for_bool_field(\n    quickwit_bool_options: &QuickwitBoolOptions,\n) -> NumericOptions {\n    let mut numeric_options = NumericOptions::default();\n    if quickwit_bool_options.stored {\n        numeric_options = numeric_options.set_stored();\n    }\n    if quickwit_bool_options.indexed {\n        numeric_options = numeric_options.set_indexed();\n    }\n    if quickwit_bool_options.fast {\n        numeric_options = numeric_options.set_fast();\n    }\n    numeric_options\n}\n\nfn get_numeric_options_for_numeric_field(\n    quickwit_numeric_options: &QuickwitNumericOptions,\n) -> NumericOptions {\n    let mut numeric_options = NumericOptions::default();\n    if quickwit_numeric_options.stored {\n        numeric_options = numeric_options.set_stored();\n    }\n    if quickwit_numeric_options.indexed {\n        numeric_options = numeric_options.set_indexed();\n    }\n    if quickwit_numeric_options.fast {\n        numeric_options = numeric_options.set_fast();\n    }\n    numeric_options\n}\n\nfn get_date_time_options(quickwit_date_time_options: &QuickwitDateTimeOptions) -> DateOptions {\n    let mut date_time_options = DateOptions::default();\n    if quickwit_date_time_options.stored {\n        date_time_options = date_time_options.set_stored();\n    }\n    if quickwit_date_time_options.indexed {\n        date_time_options = 
date_time_options.set_indexed();\n    }\n    if quickwit_date_time_options.fast {\n        date_time_options = date_time_options.set_fast();\n    }\n    date_time_options.set_precision(quickwit_date_time_options.fast_precision)\n}\n\nfn get_bytes_options(quickwit_bytes_options: &QuickwitBytesOptions) -> BytesOptions {\n    let mut bytes_options = BytesOptions::default();\n    if quickwit_bytes_options.indexed {\n        bytes_options = bytes_options.set_indexed();\n    }\n    if quickwit_bytes_options.fast {\n        bytes_options = bytes_options.set_fast();\n    }\n    if quickwit_bytes_options.stored {\n        bytes_options = bytes_options.set_stored();\n    }\n    bytes_options\n}\n\nfn get_ip_address_options(quickwit_ip_address_options: &QuickwitIpAddrOptions) -> IpAddrOptions {\n    let mut ip_address_options = IpAddrOptions::default();\n    if quickwit_ip_address_options.stored {\n        ip_address_options = ip_address_options.set_stored();\n    }\n    if quickwit_ip_address_options.indexed {\n        ip_address_options = ip_address_options.set_indexed();\n    }\n    if quickwit_ip_address_options.fast {\n        ip_address_options = ip_address_options.set_fast();\n    }\n    ip_address_options\n}\n\n/// Creates a tantivy field name for a given field path.\n///\n/// By field path, we mean the list of `field_name` that are crossed\n/// to reach the field starting from the root of the document.\n/// There can be more than one due to quickwit object type.\n///\n/// We simply concatenate these field names, interleaving them with '.'.\n/// If a field name itself contains a '.', we escape it with '\\'.\n/// ('\\' itself is forbidden).\nfn field_name_for_field_path(field_path: &[&str]) -> String {\n    field_path.iter().cloned().map(escape_dots).join(\".\")\n}\n\n/// Builds the sequence of field names crossed to reach the field\n/// starting from the root of the document.\n/// Dots '.' 
define the boundaries between field names.\n/// If a dot is part of a field name, it must be escaped with '\\'.\npub(crate) fn build_field_path_from_str(field_path_as_str: &str) -> Vec<String> {\n    let mut field_path = Vec::new();\n    let mut current_path_fragment = String::new();\n    let mut escaped = false;\n    for char in field_path_as_str.chars() {\n        if escaped {\n            current_path_fragment.push(char);\n            escaped = false;\n        } else if char == '\\\\' {\n            escaped = true;\n        } else if char == '.' {\n            let path_fragment = std::mem::take(&mut current_path_fragment);\n            field_path.push(path_fragment);\n        } else {\n            current_path_fragment.push(char);\n        }\n    }\n    if !current_path_fragment.is_empty() {\n        field_path.push(current_path_fragment);\n    }\n    field_path\n}\n\nfn escape_dots(field_name: &str) -> String {\n    let mut escaped_field_name = String::new();\n    for chr in field_name.chars() {\n        if chr == '.' 
{\n            escaped_field_name.push('\\\\');\n        }\n        escaped_field_name.push(chr);\n    }\n    escaped_field_name\n}\n\n/// build a sub-mapping tree from the fields it contains.\n///\n/// also returns the list of concatenate fields which consume the dynamic field\nfn build_mapping_from_field_type<'a>(\n    field_mapping_type: &'a FieldMappingType,\n    field_path: &mut Vec<&'a str>,\n    schema_builder: &mut SchemaBuilder,\n) -> anyhow::Result<(MappingTree, Vec<Field>)> {\n    let field_name = field_name_for_field_path(field_path);\n    match field_mapping_type {\n        FieldMappingType::Text(options, cardinality) => {\n            let text_options: TextOptions = options.clone().into();\n            let field = schema_builder.add_text_field(&field_name, text_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::Text(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::I64(options, cardinality) => {\n            let numeric_options = get_numeric_options_for_numeric_field(options);\n            let field = schema_builder.add_i64_field(&field_name, numeric_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::I64(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::U64(options, cardinality) => {\n            let numeric_options = get_numeric_options_for_numeric_field(options);\n            let field = schema_builder.add_u64_field(&field_name, numeric_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::U64(options.clone()),\n                
cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::F64(options, cardinality) => {\n            let numeric_options = get_numeric_options_for_numeric_field(options);\n            let field = schema_builder.add_f64_field(&field_name, numeric_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::F64(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::Bool(options, cardinality) => {\n            let numeric_options = get_numeric_options_for_bool_field(options);\n            let field = schema_builder.add_bool_field(&field_name, numeric_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::Bool(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::IpAddr(options, cardinality) => {\n            let ip_addr_options = get_ip_address_options(options);\n            let field = schema_builder.add_ip_addr_field(&field_name, ip_addr_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::IpAddr(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::DateTime(options, cardinality) => {\n            let date_time_options = get_date_time_options(options);\n            let field = schema_builder.add_date_field(&field_name, date_time_options);\n            let mapping_leaf = MappingLeaf {\n 
               field,\n                typ: LeafType::DateTime(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::Bytes(options, cardinality) => {\n            let bytes_options = get_bytes_options(options);\n            let field = schema_builder.add_bytes_field(&field_name, bytes_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::Bytes(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::Json(options, cardinality) => {\n            let json_options = JsonObjectOptions::from(options.clone());\n            let field = schema_builder.add_json_field(&field_name, json_options);\n            let mapping_leaf = MappingLeaf {\n                field,\n                typ: LeafType::Json(options.clone()),\n                cardinality: *cardinality,\n                concatenate: Vec::new(),\n            };\n            Ok((MappingTree::Leaf(mapping_leaf), Vec::new()))\n        }\n        FieldMappingType::Object(entries) => {\n            let MappingNodeRoot {\n                field_mappings,\n                concatenate_dynamic_fields,\n            } = build_mapping_tree_from_entries(\n                &entries.field_mappings,\n                field_path,\n                schema_builder,\n            )?;\n            Ok((\n                MappingTree::Node(field_mappings),\n                concatenate_dynamic_fields,\n            ))\n        }\n        FieldMappingType::Concatenate(_) => {\n            bail!(\"Concatenate shouldn't reach build_mapping_from_field_type: this is a bug\")\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::net::IpAddr;\n\n    use 
serde_json::{Value as JsonValue, json};\n    use tantivy::schema::{Field, IntoIpv6Addr, OwnedValue as TantivyValue, Value};\n    use tantivy::{DateTime, TantivyDocument as Document};\n    use time::OffsetDateTime;\n    use time::macros::datetime;\n\n    use super::{\n        JsonValueIterator, LeafType, MapOrArrayIter, MappingLeaf, add_key_to_vec_map,\n        extract_val_from_tantivy_val,\n    };\n    use crate::Cardinality;\n    use crate::doc_mapper::date_time_type::QuickwitDateTimeOptions;\n    use crate::doc_mapper::field_mapping_entry::{\n        BinaryFormat, QuickwitBoolOptions, QuickwitBytesOptions, QuickwitIpAddrOptions,\n        QuickwitNumericOptions, QuickwitTextOptions,\n    };\n\n    #[test]\n    fn test_field_name_from_field_path() {\n        // not really a possibility, but still, let's test it.\n        assert_eq!(super::field_name_for_field_path(&[]), \"\");\n        assert_eq!(super::field_name_for_field_path(&[\"hello\"]), \"hello\");\n        assert_eq!(\n            super::field_name_for_field_path(&[\"one\", \"two\", \"three\"]),\n            \"one.two.three\"\n        );\n        assert_eq!(\n            super::field_name_for_field_path(&[\"one\", \"two\", \"three\"]),\n            \"one.two.three\"\n        );\n        assert_eq!(super::field_name_for_field_path(&[\"one.two\"]), r\"one\\.two\");\n        assert_eq!(\n            super::field_name_for_field_path(&[\"one.two\", \"three\"]),\n            r\"one\\.two.three\"\n        );\n    }\n\n    #[test]\n    fn test_get_or_insert_path() {\n        let mut map = Default::default();\n        super::get_or_insert_path(&[\"a\".to_string(), \"b\".to_string()], &mut map)\n            .insert(\"c\".to_string(), JsonValue::from(3u64));\n        assert_eq!(\n            &serde_json::to_value(&map).unwrap(),\n            &serde_json::json!({\n                \"a\": {\n                    \"b\": {\n                        \"c\": 3u64\n                    }\n                }\n            })\n       
 );\n        super::get_or_insert_path(&[\"a\".to_string(), \"b\".to_string()], &mut map)\n            .insert(\"d\".to_string(), JsonValue::from(2u64));\n        assert_eq!(\n            &serde_json::to_value(&map).unwrap(),\n            &serde_json::json!({\n                \"a\": {\n                    \"b\": {\n                        \"c\": 3u64,\n                        \"d\": 2u64\n                    }\n                }\n            })\n        );\n        super::get_or_insert_path(&[\"e\".to_string()], &mut map)\n            .insert(\"f\".to_string(), JsonValue::from(5u64));\n        assert_eq!(\n            &serde_json::to_value(&map).unwrap(),\n            &serde_json::json!({\n                \"a\": {\n                    \"b\": {\n                        \"c\": 3u64,\n                        \"d\": 2u64\n                    }\n                },\n                \"e\": { \"f\": 5u64 }\n            })\n        );\n        super::get_or_insert_path(&[], &mut map).insert(\"g\".to_string(), JsonValue::from(6u64));\n        assert_eq!(\n            &serde_json::to_value(&map).unwrap(),\n            &serde_json::json!({\n                \"a\": {\n                    \"b\": {\n                        \"c\": 3u64,\n                        \"d\": 2u64\n                    }\n                },\n                \"e\": { \"f\": 5u64 },\n                \"g\": 6u64\n            })\n        );\n    }\n\n    #[test]\n    fn test_parse_u64_mapping() {\n        let leaf = LeafType::U64(QuickwitNumericOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(20i64)).unwrap(),\n            TantivyValue::U64(20u64)\n        );\n    }\n\n    #[test]\n    fn test_parse_u64_coercion() {\n        let leaf = LeafType::U64(QuickwitNumericOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(\"20\")).unwrap(),\n            TantivyValue::U64(20u64)\n        );\n        assert_eq!(\n            
leaf.value_from_json(json!(\"foo\")).unwrap_err(),\n            \"failed to coerce JSON string `\\\"foo\\\"` to u64\"\n        );\n\n        let numeric_options = QuickwitNumericOptions {\n            coerce: false,\n            ..Default::default()\n        };\n        let leaf = LeafType::U64(numeric_options);\n        assert_eq!(\n            leaf.value_from_json(json!(\"20\")).unwrap_err(),\n            \"expected JSON number, got string `\\\"20\\\"`. enable coercion to u64 with the `coerce` \\\n             parameter in the field mapping\"\n        );\n    }\n\n    #[test]\n    fn test_parse_u64_negative_should_error() {\n        let leaf = LeafType::U64(QuickwitNumericOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(-20i64)).unwrap_err(),\n            \"expected u64, got inconvertible JSON number `-20`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_i64_mapping() {\n        let leaf = LeafType::I64(QuickwitNumericOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(20u64)).unwrap(),\n            TantivyValue::I64(20i64)\n        );\n    }\n\n    #[test]\n    fn test_parse_i64_from_f64_should_error() {\n        let leaf = LeafType::I64(QuickwitNumericOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(20.2f64)).unwrap_err(),\n            \"expected i64, got inconvertible JSON number `20.2`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_i64_too_large() {\n        let leaf = LeafType::I64(QuickwitNumericOptions::default());\n        let err = leaf.value_from_json(json!(u64::MAX)).err().unwrap();\n        assert_eq!(\n            err,\n            \"expected i64, got inconvertible JSON number `18446744073709551615`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_f64_from_u64() {\n        let leaf = LeafType::F64(QuickwitNumericOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(4_000u64)).unwrap(),\n            
TantivyValue::F64(4_000f64)\n        );\n    }\n\n    #[test]\n    fn test_parse_bool_mapping() {\n        let leaf = LeafType::Bool(QuickwitBoolOptions::default());\n        assert_eq!(\n            leaf.value_from_json(json!(true)).unwrap(),\n            TantivyValue::Bool(true)\n        );\n    }\n\n    #[test]\n    fn test_parse_bool_multivalued() {\n        let typ = LeafType::Bool(QuickwitBoolOptions::default());\n        let field = Field::from_field_id(10);\n        let leaf_entry = MappingLeaf {\n            field,\n            typ,\n            cardinality: Cardinality::MultiValued,\n            concatenate: Vec::new(),\n        };\n        let mut document = Document::default();\n        let mut path = Vec::new();\n        leaf_entry\n            .doc_from_json(json!([true, false, true]), &mut document, &mut path)\n            .unwrap();\n        assert_eq!(document.len(), 3);\n        let values: Vec<bool> = document\n            .get_all(field)\n            .flat_map(|val| val.as_bool())\n            .collect();\n        assert_eq!(&values, &[true, false, true])\n    }\n\n    #[test]\n    fn test_parse_ip_addr_from_str() {\n        let leaf = LeafType::IpAddr(QuickwitIpAddrOptions::default());\n        let ips = vec![\n            \"127.0.0.0\",\n            \"2605:2700:0:3::4713:93e3\",\n            \"::afff:4567:890a\",\n            \"10.10.12.123\",\n            \"192.168.0.1\",\n            \"2001:db8::1:0:0:1\",\n        ];\n        for ip_str in ips {\n            let parsed_ip_addr = leaf.value_from_json(json!(ip_str)).unwrap();\n            let expected_ip_addr =\n                TantivyValue::IpAddr(ip_str.parse::<IpAddr>().unwrap().into_ipv6_addr());\n            assert_eq!(parsed_ip_addr, expected_ip_addr);\n        }\n    }\n\n    #[test]\n    fn test_parse_ip_addr_should_error() {\n        let typ = LeafType::IpAddr(QuickwitIpAddrOptions::default());\n        let err = typ.value_from_json(json!(\"foo\")).err().unwrap();\n        
assert!(err.contains(\"failed to parse IP address `foo`\"));\n\n        let err = typ.value_from_json(json!(1200)).err().unwrap();\n        assert!(err.contains(\"expected string, got `1200`\"));\n    }\n\n    #[test]\n    fn test_parse_i64_mutivalued() {\n        let typ = LeafType::I64(QuickwitNumericOptions::default());\n        let field = Field::from_field_id(10);\n        let leaf_entry = MappingLeaf {\n            field,\n            typ,\n            cardinality: Cardinality::MultiValued,\n            concatenate: Vec::new(),\n        };\n        let mut document = Document::default();\n        let mut path = Vec::new();\n        leaf_entry\n            .doc_from_json(serde_json::json!([10u64, 20u64]), &mut document, &mut path)\n            .unwrap();\n        assert_eq!(document.len(), 2);\n        let values: Vec<i64> = document\n            .get_all(field)\n            .flat_map(|val| val.as_i64())\n            .collect();\n        assert_eq!(&values, &[10i64, 20i64]);\n    }\n\n    #[test]\n    fn test_parse_null_is_just_ignored() {\n        let typ = LeafType::I64(QuickwitNumericOptions::default());\n        let field = Field::from_field_id(10);\n        let leaf_entry = MappingLeaf {\n            field,\n            typ,\n            cardinality: Cardinality::MultiValued,\n            concatenate: Vec::new(),\n        };\n        let mut document = Document::default();\n        let mut path = Vec::new();\n        leaf_entry\n            .doc_from_json(serde_json::json!(null), &mut document, &mut path)\n            .unwrap();\n        assert_eq!(document.len(), 0);\n    }\n\n    #[test]\n    fn test_parse_i64_mutivalued_accepts_scalar() {\n        let typ = LeafType::I64(QuickwitNumericOptions::default());\n        let field = Field::from_field_id(10);\n        let leaf_entry = MappingLeaf {\n            field,\n            typ,\n            cardinality: Cardinality::MultiValued,\n            concatenate: Vec::new(),\n        };\n        let mut 
document = Document::default();\n        let mut path = Vec::new();\n        leaf_entry\n            .doc_from_json(serde_json::json!(10u64), &mut document, &mut path)\n            .unwrap();\n        assert_eq!(document.len(), 1);\n        assert_eq!(document.get_first(field).unwrap().as_i64().unwrap(), 10i64);\n    }\n\n    #[test]\n    fn test_parse_u64_mutivalued_nested_array_forbidden() {\n        let typ = LeafType::I64(QuickwitNumericOptions::default());\n        let field = Field::from_field_id(10);\n        let leaf_entry = MappingLeaf {\n            field,\n            typ,\n            cardinality: Cardinality::MultiValued,\n            concatenate: Vec::new(),\n        };\n        let mut document = Document::default();\n        let mut path = vec![\"root\".to_string(), \"my_field\".to_string()];\n        let parse_err = leaf_entry\n            .doc_from_json(\n                serde_json::json!([10u64, [1u64, 2u64]]),\n                &mut document,\n                &mut path,\n            )\n            .unwrap_err();\n        assert_eq!(\n            parse_err.to_string(),\n            \"the field `root.my_field` could not be parsed: expected JSON number or string, got \\\n             `[1,2]`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_text() {\n        let typ = LeafType::Text(QuickwitTextOptions::default());\n        let parsed_value = typ.value_from_json(json!(\"bacon and eggs\")).unwrap();\n        assert_eq!(\n            parsed_value,\n            TantivyValue::Str(\"bacon and eggs\".to_string())\n        );\n    }\n\n    #[test]\n    fn test_parse_text_number_should_error() {\n        let typ = LeafType::Text(QuickwitTextOptions::default());\n        let err = typ.value_from_json(json!(2u64)).err().unwrap();\n        assert_eq!(err, \"expected string, got `2`\");\n    }\n\n    #[test]\n    fn test_parse_date_time_str() {\n        let typ = LeafType::DateTime(QuickwitDateTimeOptions::default());\n        let value = typ\n            
.value_from_json(json!(\"2021-12-19T16:39:57-01:00\"))\n            .unwrap();\n        let date_time = datetime!(2021-12-19 17:39:57 UTC);\n        assert_eq!(value, TantivyValue::Date(DateTime::from_utc(date_time)));\n    }\n\n    #[test]\n    fn test_parse_timestamp_float() {\n        let typ = LeafType::DateTime(QuickwitDateTimeOptions::default());\n        let unix_ts_secs = OffsetDateTime::now_utc().unix_timestamp();\n        let value = typ\n            .value_from_json(json!(unix_ts_secs as f64 + 0.1))\n            .unwrap();\n        let date_time = match value {\n            TantivyValue::Date(date_time) => date_time,\n            other => panic!(\"Expected a tantivy date time, got `{other:?}`.\"),\n        };\n        assert!((date_time.into_timestamp_millis() - (unix_ts_secs * 1_000 + 100)).abs() <= 1);\n    }\n\n    #[test]\n    fn test_parse_timestamp_int() {\n        let typ = LeafType::DateTime(QuickwitDateTimeOptions::default());\n        let unix_ts_secs = OffsetDateTime::now_utc().unix_timestamp();\n        let value = typ.value_from_json(json!(unix_ts_secs)).unwrap();\n        assert_eq!(\n            value,\n            TantivyValue::Date(DateTime::from_timestamp_secs(unix_ts_secs))\n        );\n    }\n\n    #[test]\n    fn test_parse_date_number_should_error() {\n        let typ = LeafType::DateTime(QuickwitDateTimeOptions::default());\n        let err = typ.value_from_json(json!(\"foo-datetime\")).unwrap_err();\n        assert_eq!(\n            err,\n            \"failed to parse datetime `foo-datetime` using the following formats: `rfc3339`, \\\n             `unix_timestamp`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_date_array_should_error() {\n        let typ = LeafType::DateTime(QuickwitDateTimeOptions::default());\n        let err = typ.value_from_json(json!([\"foo\", \"bar\"])).err().unwrap();\n        assert_eq!(\n            err,\n            \"failed to parse datetime: expected a float, integer, or string, got \\\n        
     `[\\\"foo\\\",\\\"bar\\\"]`\"\n        );\n    }\n\n    #[test]\n    fn test_parse_bytes() {\n        let typ = LeafType::Bytes(QuickwitBytesOptions::default());\n        let value = typ\n            .value_from_json(json!(\"dGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHN0cmluZw==\"))\n            .unwrap();\n        assert_eq!(\n            (&value).as_bytes().unwrap(),\n            b\"this is a base64 encoded string\"\n        );\n    }\n\n    #[test]\n    fn test_parse_bytes_hex() {\n        let typ = LeafType::Bytes(QuickwitBytesOptions {\n            input_format: BinaryFormat::Hex,\n            ..QuickwitBytesOptions::default()\n        });\n        let value = typ\n            .value_from_json(json!(\n                \"7468697320697320612068657820656e636f64656420737472696e67\"\n            ))\n            .unwrap();\n        assert_eq!(\n            (&value).as_bytes().unwrap(),\n            b\"this is a hex encoded string\"\n        );\n    }\n\n    #[test]\n    fn test_parse_bytes_number_should_err() {\n        let typ = LeafType::Bytes(QuickwitBytesOptions::default());\n        let error = typ.value_from_json(json!(2u64)).err().unwrap();\n        assert_eq!(error, \"expected base64 string, got `2`\");\n    }\n\n    #[test]\n    fn test_parse_bytes_invalid_base64() {\n        let typ = LeafType::Bytes(QuickwitBytesOptions::default());\n        let error = typ.value_from_json(json!(\"dEwerwer#!%\")).err().unwrap();\n        assert_eq!(\n            error,\n            \"expected base64 string, got `dEwerwer#!%`: Invalid symbol 35, offset 8.\"\n        );\n    }\n\n    #[test]\n    fn test_parse_array_of_bytes() {\n        let typ = LeafType::Bytes(QuickwitBytesOptions::default());\n        let field = Field::from_field_id(10);\n        let leaf_entry = MappingLeaf {\n            field,\n            typ,\n            cardinality: Cardinality::MultiValued,\n            concatenate: Vec::new(),\n        };\n        let mut document = Document::default();\n        let 
mut path = vec![\"root\".to_string(), \"my_field\".to_string()];\n        leaf_entry\n            .doc_from_json(\n                serde_json::json!([\n                    \"dGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHN0cmluZw==\",\n                    \"dGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHN0cmluZw==\"\n                ]),\n                &mut document,\n                &mut path,\n            )\n            .unwrap();\n        assert_eq!(document.len(), 2);\n        let bytes_vec: Vec<&[u8]> = document\n            .get_all(field)\n            .flat_map(|val| val.as_bytes())\n            .collect();\n        assert_eq!(\n            &bytes_vec[..],\n            &[\n                b\"this is a base64 encoded string\",\n                b\"this is a base64 encoded string\"\n            ]\n        )\n    }\n\n    #[test]\n    fn test_field_path_for_field_name() {\n        assert_eq!(super::build_field_path_from_str(\"\"), Vec::<String>::new());\n        assert_eq!(super::build_field_path_from_str(\"hello\"), vec![\"hello\"]);\n        assert_eq!(\n            super::build_field_path_from_str(\"one.two.three\"),\n            vec![\"one\", \"two\", \"three\"]\n        );\n        assert_eq!(\n            super::build_field_path_from_str(r\"one\\.two\"),\n            vec![\"one.two\"]\n        );\n        assert_eq!(\n            super::build_field_path_from_str(r\"one\\.two.three\"),\n            vec![\"one.two\", \"three\"]\n        );\n        assert_eq!(super::build_field_path_from_str(r#\"one.\"#), vec![\"one\"]);\n        // Those are invalid field paths, but we check that it does not panic.\n        // Issue #3538 is about validating field paths before trying to build the path.\n        assert_eq!(super::build_field_path_from_str(\"\\\\.\"), vec![\".\"]);\n        assert_eq!(super::build_field_path_from_str(\"a.\"), vec![\"a\"]);\n        assert_eq!(super::build_field_path_from_str(\".a\"), vec![\"\", \"a\"]);\n    }\n\n    #[test]\n    fn test_map_or_array_iter() {\n       
 // single element\n        let single_value = MapOrArrayIter::Value(json!({\"a\": \"b\", \"c\": 4}));\n        let res: Vec<_> = single_value.collect();\n        assert_eq!(res, vec![json!({\"a\": \"b\", \"c\": 4})]);\n\n        // array of elements\n        let multiple_values =\n            MapOrArrayIter::Array(vec![json!({\"a\": \"b\", \"c\": 4}), json!(5)].into_iter());\n        let res: Vec<_> = multiple_values.collect();\n        assert_eq!(res, vec![json!({\"a\": \"b\", \"c\": 4}), json!(5)]);\n\n        // map of elements\n        let multiple_values = MapOrArrayIter::Map(\n            json!({\"a\": {\"a\": \"b\", \"c\": 4}, \"b\":5})\n                .as_object()\n                .unwrap()\n                .clone()\n                .into_iter(),\n        );\n        let res: Vec<_> = multiple_values.collect();\n        assert_eq!(res, vec![json!({\"a\": \"b\", \"c\": 4}), json!(5)]);\n    }\n\n    #[test]\n    fn test_json_value_iterator() {\n        assert_eq!(\n            JsonValueIterator::new(json!(5)).collect::<Vec<_>>(),\n            vec![json!(5)]\n        );\n        assert_eq!(\n            JsonValueIterator::new(json!([5, \"a\"])).collect::<Vec<_>>(),\n            vec![json!(5), json!(\"a\")]\n        );\n        assert_eq!(\n            JsonValueIterator::new(json!({\"a\":1, \"b\": 2})).collect::<Vec<_>>(),\n            vec![json!(1), json!(2)]\n        );\n        assert_eq!(\n            JsonValueIterator::new(json!([{\"a\":1, \"b\": 2}, \"a\"])).collect::<Vec<_>>(),\n            vec![json!(1), json!(2), json!(\"a\")]\n        );\n        assert_eq!(\n            JsonValueIterator::new(json!([{\"a\":1, \"b\": 2}, {\"a\": {\"b\": [3, 4]}}]))\n                .collect::<Vec<_>>(),\n            vec![json!(1), json!(2), json!(3), json!(4)]\n        );\n    }\n\n    #[test]\n    fn test_extract_val_from_tantivy_val() {\n        let obj = TantivyValue::Object;\n        fn array(val: impl IntoIterator<Item = impl Into<TantivyValue>>) -> 
TantivyValue {\n            TantivyValue::Array(val.into_iter().map(Into::into).collect())\n        }\n\n        let mut sample = vec![obj(vec![\n            (\n                \"some\".to_string(),\n                obj(vec![\n                    (\n                        \"path\".to_string(),\n                        obj(vec![(\"with.dots\".to_string(), 1u64.into())]),\n                    ),\n                    (\n                        \"other\".to_string(),\n                        obj(vec![(\"path\".to_string(), array([2u64, 3]))]),\n                    ),\n                ]),\n            ),\n            (\"short\".to_string(), 4u64.into()),\n        ])];\n\n        assert_eq!(\n            extract_val_from_tantivy_val(&[\"some\", \"other\"], &mut sample),\n            vec![obj(vec![(\"path\".to_string(), array([2u64, 3]))])]\n        );\n        assert_eq!(\n            extract_val_from_tantivy_val(&[\"some\", \"other\"], &mut sample),\n            Vec::new()\n        );\n        assert_eq!(\n            extract_val_from_tantivy_val(&[\"some\", \"path\", \"with.dots\"], &mut sample),\n            vec![1u64.into()]\n        );\n        assert_eq!(\n            extract_val_from_tantivy_val(&[\"some\", \"path\", \"with.dots\"], &mut sample),\n            Vec::new()\n        );\n        assert_eq!(\n            extract_val_from_tantivy_val(&[\"short\"], &mut sample),\n            vec![4u64.into()]\n        );\n        assert_eq!(\n            extract_val_from_tantivy_val(&[\"short\"], &mut sample),\n            Vec::new()\n        );\n    }\n\n    #[test]\n    fn test_add_key_to_vec_map() {\n        let obj = TantivyValue::Object;\n        fn array(val: impl IntoIterator<Item = impl Into<TantivyValue>>) -> TantivyValue {\n            TantivyValue::Array(val.into_iter().map(Into::into).collect())\n        }\n\n        let mut map = Vec::new();\n\n        add_key_to_vec_map(&mut map, \"some.path.with\\\\.dots\", vec![1u64.into()]);\n        assert_eq!(\n        
    map,\n            &[(\n                \"some\".to_string(),\n                obj(vec![(\n                    \"path\".to_string(),\n                    obj(vec![(\"with.dots\".to_string(), 1u64.into())])\n                )])\n            )]\n        );\n\n        add_key_to_vec_map(&mut map, \"some.other.path\", vec![2u64.into(), 3u64.into()]);\n        assert_eq!(\n            map,\n            &[(\n                \"some\".to_string(),\n                obj(vec![\n                    (\n                        \"path\".to_string(),\n                        obj(vec![(\"with.dots\".to_string(), 1u64.into())])\n                    ),\n                    (\n                        \"other\".to_string(),\n                        obj(vec![(\"path\".to_string(), array([2u64, 3]))])\n                    ),\n                ])\n            )]\n        );\n\n        add_key_to_vec_map(&mut map, \"short\", vec![4u64.into()]);\n        assert_eq!(\n            map,\n            &[\n                (\n                    \"some\".to_string(),\n                    obj(vec![\n                        (\n                            \"path\".to_string(),\n                            obj(vec![(\"with.dots\".to_string(), 1u64.into())])\n                        ),\n                        (\n                            \"other\".to_string(),\n                            obj(vec![(\"path\".to_string(), array([2u64, 3]))])\n                        ),\n                    ])\n                ),\n                (\"short\".to_string(), 4u64.into())\n            ]\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod date_time_type;\nmod doc_mapper_builder;\nmod doc_mapper_impl;\nmod field_mapping_entry;\nmod field_mapping_type;\nmod field_presence;\nmod mapping_tree;\nmod tantivy_val_to_json;\nmod tokenizer_entry;\n\nuse std::collections::{HashMap, HashSet};\nuse std::fmt::Debug;\nuse std::ops::Bound;\n\npub use doc_mapper_builder::DocMapperBuilder;\npub use doc_mapper_impl::DocMapper;\npub use field_mapping_entry::{\n    BinaryFormat, FastFieldOptions, FieldMappingEntry, QuickwitBytesOptions, QuickwitJsonOptions,\n    QuickwitTextNormalizer,\n};\npub(crate) use field_mapping_entry::{\n    FieldMappingEntryForSerialization, IndexRecordOptionSchema, QuickwitTextTokenizer,\n};\n#[cfg(test)]\npub(crate) use field_mapping_entry::{QuickwitNumericOptions, QuickwitTextOptions};\npub use field_mapping_type::FieldMappingType;\nuse serde_json::Value as JsonValue;\nuse tantivy::Term;\nuse tantivy::schema::{Field, FieldType};\npub(crate) use tokenizer_entry::{\n    NgramTokenizerOption, RegexTokenizerOption, TokenFilterType, TokenizerType,\n};\npub use tokenizer_entry::{TokenizerConfig, TokenizerEntry, analyze_text};\n\npub type Partition = u64;\n\n/// An alias for serde_json's object type.\npub type JsonObject = serde_json::Map<String, JsonValue>;\n\n/// A struct to wrap a tantivy field with its name.\n#[derive(Clone, Debug)]\npub struct NamedField {\n    /// 
Name of the field.\n    pub name: String,\n    /// Tantivy schema field.\n    pub field: Field,\n    /// Tantivy schema field type.\n    pub field_type: FieldType,\n}\n\n/// Bounds for a range of terms, with an optional max count of terms being matched.\n#[derive(Debug, Clone, PartialEq, Eq, Hash)]\npub struct TermRange {\n    /// Start of the range\n    pub start: Bound<Term>,\n    /// End of the range\n    pub end: Bound<Term>,\n    /// Max number of matched terms\n    pub limit: Option<u64>,\n}\n\n#[derive(Debug, Clone, PartialEq, Eq, Hash)]\n/// Supported automaton types to warmup\npub enum Automaton {\n    /// A regex in its str representation as tantivy_fst::Regex isn't PartialEq, and the path if\n    /// inside a json field\n    Regex(Option<Vec<u8>>, String),\n    // we could add termset query here, instead of downloading the whole dictionary\n}\n\n/// Description of how a fast field should be warmed up\n#[derive(Debug, Clone, PartialEq, Eq, Hash)]\npub struct FastFieldWarmupInfo {\n    /// Name of the fast field\n    pub name: String,\n    /// Whether subfields should also be loaded for warmup\n    pub with_subfields: bool,\n}\n\n/// Information about what a DocMapper thinks should be warmed up before\n/// running the query.\n#[derive(Debug, Default, Clone, PartialEq, Eq)]\npub struct WarmupInfo {\n    /// Name of fields from the term dictionary and posting list which need to\n    /// be entirely loaded\n    pub term_dict_fields: HashSet<Field>,\n    /// Fast fields which need to be loaded\n    pub fast_fields: HashSet<FastFieldWarmupInfo>,\n    /// Whether to warmup field norms. 
Used mostly for scoring.\n    pub field_norms: bool,\n    /// Terms to warmup, and whether their position is needed too.\n    pub terms_grouped_by_field: HashMap<Field, HashMap<Term, bool>>,\n    /// Term ranges to warmup, and whether their position is needed too.\n    pub term_ranges_grouped_by_field: HashMap<Field, HashMap<TermRange, bool>>,\n    /// Automatons to warmup\n    pub automatons_grouped_by_field: HashMap<Field, HashSet<Automaton>>,\n}\n\nimpl WarmupInfo {\n    /// Merge other WarmupInfo into self.\n    pub fn merge(&mut self, other: WarmupInfo) {\n        self.term_dict_fields.extend(other.term_dict_fields);\n        self.field_norms |= other.field_norms;\n\n        for fast_field_warmup_info in other.fast_fields.into_iter() {\n            // avoid overwriting with a less demanding warmup\n            if !self.fast_fields.contains(&FastFieldWarmupInfo {\n                name: fast_field_warmup_info.name.clone(),\n                with_subfields: true,\n            }) {\n                self.fast_fields.insert(fast_field_warmup_info);\n            }\n        }\n\n        for (field, term_and_pos) in other.terms_grouped_by_field.into_iter() {\n            let sub_map = self.terms_grouped_by_field.entry(field).or_default();\n\n            for (term, include_position) in term_and_pos.into_iter() {\n                *sub_map.entry(term).or_default() |= include_position;\n            }\n        }\n\n        // this merge is suboptimal in case of overlapping range with no limit.\n        for (field, term_range_and_pos) in other.term_ranges_grouped_by_field.into_iter() {\n            let sub_map = self.term_ranges_grouped_by_field.entry(field).or_default();\n\n            for (term_range, include_position) in term_range_and_pos.into_iter() {\n                *sub_map.entry(term_range).or_default() |= include_position;\n            }\n        }\n\n        for (field, automatons) in other.automatons_grouped_by_field.into_iter() {\n            let sub_map = 
self.automatons_grouped_by_field.entry(field).or_default();\n            sub_map.extend(automatons);\n        }\n    }\n\n    /// Simplify a WarmupInfo, removing some redundant tasks\n    pub fn simplify(&mut self) {\n        self.terms_grouped_by_field.retain(|field, terms| {\n            if self.term_dict_fields.contains(field) {\n                // we are already about to full-load this dictionary. We only care about terms\n                // which need additional positions\n                terms.retain(|_term, include_position| *include_position);\n            }\n            // if no term is left, remove the entry from the hashmap\n            !terms.is_empty()\n        });\n        self.term_ranges_grouped_by_field.retain(|field, terms| {\n            if self.term_dict_fields.contains(field) {\n                terms.retain(|_term, include_position| *include_position);\n            }\n            !terms.is_empty()\n        });\n        // TODO we could remove from terms_grouped_by_field for ranges with no `limit` in\n        // term_ranges_grouped_by_field\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::{HashMap, HashSet};\n    use std::ops::Bound;\n\n    use quickwit_query::BooleanOperand;\n    use quickwit_query::query_ast::{UserInputQuery, query_ast_from_user_text};\n    use tantivy::schema::{Field, FieldType, Term};\n\n    use super::*;\n    use crate::{\n        Cardinality, DYNAMIC_FIELD_NAME, DocMapper, DocMapperBuilder, DocParsingError,\n        FieldMappingEntry, TermRange, WarmupInfo,\n    };\n\n    const JSON_DEFAULT_DOC_MAPPER: &str = r#\"\n        {\n            \"type\": \"default\",\n            \"default_search_fields\": [],\n            \"tag_fields\": [],\n            \"field_mappings\": []\n        }\"#;\n\n    #[test]\n    fn test_doc_from_json_bytes() {\n        let doc_mapper = DocMapperBuilder::default().try_build().unwrap();\n        let json_doc = br#\"{\"title\": \"hello\", \"body\": \"world\"}\"#;\n        
doc_mapper.doc_from_json_bytes(json_doc).unwrap();\n\n        let DocParsingError::NotJsonObject(json_doc_sample) = doc_mapper\n            .doc_from_json_bytes(br#\"Not a JSON object\"#)\n            .unwrap_err()\n        else {\n            panic!(\"Expected `DocParsingError::NotJsonObject` error\");\n        };\n        assert_eq!(json_doc_sample, \"Not a JSON object...\");\n    }\n\n    #[test]\n    fn test_doc_from_json_str() {\n        let doc_mapper = DocMapperBuilder::default().try_build().unwrap();\n        let json_doc = r#\"{\"title\": \"hello\", \"body\": \"world\"}\"#;\n        doc_mapper.doc_from_json_str(json_doc).unwrap();\n\n        let DocParsingError::NotJsonObject(json_doc_sample) = doc_mapper\n            .doc_from_json_str(r#\"Not a JSON object\"#)\n            .unwrap_err()\n        else {\n            panic!(\"Expected `DocParsingError::NotJsonObject` error\");\n        };\n        assert_eq!(json_doc_sample, \"Not a JSON object...\");\n    }\n\n    #[test]\n    fn test_deserialize_doc_mapper() -> anyhow::Result<()> {\n        let deserialized_default_doc_mapper =\n            serde_json::from_str::<Box<DocMapper>>(JSON_DEFAULT_DOC_MAPPER)?;\n        let expected_default_doc_mapper = DocMapperBuilder::default().try_build()?;\n        assert_eq!(\n            format!(\"{deserialized_default_doc_mapper:?}\"),\n            format!(\"{expected_default_doc_mapper:?}\"),\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_deserialize_minimal_doc_mapper() -> anyhow::Result<()> {\n        let deserialized_default_doc_mapper =\n            serde_json::from_str::<Box<DocMapper>>(r#\"{\"type\": \"default\"}\"#)?;\n        let expected_default_doc_mapper = DocMapperBuilder::default().try_build()?;\n        assert_eq!(\n            format!(\"{deserialized_default_doc_mapper:?}\"),\n            format!(\"{expected_default_doc_mapper:?}\"),\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn 
test_deserialize_doc_mapper_default_dynamic_tokenizer() {\n        let doc_mapper =\n            serde_json::from_str::<Box<DocMapper>>(r#\"{\"type\": \"default\", \"mode\": \"dynamic\"}\"#)\n                .unwrap();\n        let tantivy_schema = doc_mapper.schema();\n        let dynamic_field = tantivy_schema.get_field(DYNAMIC_FIELD_NAME).unwrap();\n        if let FieldType::JsonObject(json_options) =\n            tantivy_schema.get_field_entry(dynamic_field).field_type()\n        {\n            let text_opt = json_options.get_text_indexing_options().unwrap();\n            assert_eq!(text_opt.tokenizer(), \"raw\");\n        } else {\n            panic!(\"dynamic field should be of JSON type\");\n        }\n    }\n\n    #[test]\n    fn test_doc_mapper_query_with_json_field() {\n        let mut doc_mapper_builder = DocMapperBuilder::default();\n        doc_mapper_builder\n            .doc_mapping\n            .field_mappings\n            .push(FieldMappingEntry {\n                name: \"json_field\".to_string(),\n                mapping_type: FieldMappingType::Json(\n                    QuickwitJsonOptions::default(),\n                    Cardinality::SingleValued,\n                ),\n            });\n        let doc_mapper = doc_mapper_builder.try_build().unwrap();\n        let schema = doc_mapper.schema();\n        let query_ast = UserInputQuery {\n            user_text: \"json_field.toto.titi:hello\".to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[])\n        .unwrap();\n        let (query, _) = doc_mapper.query(schema, query_ast, true, None).unwrap();\n        assert_eq!(\n            format!(\"{query:?}\"),\n            r#\"TermQuery(Term(field=2, type=Json, path=toto.titi, type=Str, \"hello\"))\"#\n        );\n    }\n\n    #[test]\n    fn test_doc_mapper_query_with_json_field_default_search_fields() {\n        let doc_mapper = 
DocMapperBuilder::default().try_build().unwrap();\n        let schema = doc_mapper.schema();\n        let query_ast = query_ast_from_user_text(\"toto.titi:hello\", None)\n            .parse_user_query(doc_mapper.default_search_fields())\n            .unwrap();\n        let (query, _) = doc_mapper.query(schema, query_ast, true, None).unwrap();\n        assert_eq!(\n            format!(\"{query:?}\"),\n            r#\"TermQuery(Term(field=1, type=Json, path=toto.titi, type=Str, \"hello\"))\"#\n        );\n    }\n\n    #[test]\n    fn test_doc_mapper_query_with_json_field_ambiguous_term() {\n        let doc_mapper = DocMapperBuilder::default().try_build().unwrap();\n        let schema = doc_mapper.schema();\n        let query_ast = query_ast_from_user_text(\"toto:5\", None)\n            .parse_user_query(&[])\n            .unwrap();\n        let (query, _) = doc_mapper.query(schema, query_ast, true, None).unwrap();\n        assert_eq!(\n            format!(\"{query:?}\"),\n            r#\"BooleanQuery { subqueries: [(Should, TermQuery(Term(field=1, type=Json, path=toto, type=I64, 5))), (Should, TermQuery(Term(field=1, type=Json, path=toto, type=Str, \"5\")))], minimum_number_should_match: 1 }\"#\n        );\n    }\n\n    #[track_caller]\n    fn test_validate_doc_aux(\n        doc_mapper: &DocMapper,\n        doc_json: &str,\n    ) -> Result<(), DocParsingError> {\n        let json_val: serde_json_borrow::Value = serde_json::from_str(doc_json).unwrap();\n        let json_obj = json_val.as_object().unwrap();\n        doc_mapper.validate_json_obj(json_obj)\n    }\n\n    #[test]\n    fn test_validate_doc() {\n        const JSON_CONFIG_VALUE: &str = r#\"{\n            \"timestamp_field\": \"timestamp\",\n            \"field_mappings\": [\n            {\n                \"name\": \"timestamp\",\n                \"type\": \"datetime\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"body\",\n                \"type\": \"text\"\n     
       },\n            {\n                \"name\": \"response_date\",\n                \"type\": \"datetime\",\n                \"input_formats\": [\"rfc3339\", \"unix_timestamp\"]\n            },\n            {\n                \"name\": \"response_time\",\n                \"type\": \"f64\"\n            },\n            {\n                \"name\": \"response_time_no_coercion\",\n                \"type\": \"f64\",\n                \"coerce\": false\n            },\n            {\n                \"name\": \"response_payload\",\n                \"type\": \"bytes\"\n            },\n            {\n                \"name\": \"is_important\",\n                \"type\": \"bool\"\n            },\n            {\n                \"name\": \"properties\",\n                \"type\": \"json\"\n            },\n            {\n                \"name\": \"attributes\",\n                \"type\": \"object\",\n                \"field_mappings\": [\n                    {\n                        \"name\": \"numbers\",\n                        \"type\": \"array<i64>\"\n                    }\n                ]\n            }]\n        }\"#;\n        let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_VALUE).unwrap();\n        {\n            assert!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .is_ok()\n            );\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"response_time\": \"toto\", \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::ValueError(_, _)\n            ));\n        }\n        {\n            assert!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"response_time\": 
\"2.3\", \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .is_ok(),\n            );\n        }\n        {\n            // coercion disabled\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{\"response_time_no_coercion\": \"2.3\", \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::ValueError(_, _)\n            ));\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{\"response_time\": [2.3], \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::MultiValuesNotSupported(_)\n            ));\n        }\n        {\n            assert!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{\"attributes\": {\"numbers\": [-2]}, \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .is_ok()\n            );\n        }\n    }\n\n    #[test]\n    fn test_validate_doc_timestamp() {\n        const JSON_CONFIG_TS_AT_ROOT: &str = r#\"{\n            \"timestamp_field\": \"timestamp\",\n            \"field_mappings\": [\n            {\n                \"name\": \"timestamp\",\n                \"type\": \"datetime\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"body\",\n                \"type\": \"text\"\n            }\n            ]\n        }\"#;\n        const JSON_CONFIG_TS_WITH_DOT: &str = r#\"{\n            \"timestamp_field\": \"timestamp\\\\.now\",\n            \"field_mappings\": [\n            {\n                \"name\": \"timestamp.now\",\n                \"type\": \"datetime\",\n                \"fast\": true\n            },\n            {\n                \"name\": \"body\",\n                
\"type\": \"text\"\n            }\n            ]\n        }\"#;\n        const JSON_CONFIG_TS_NESTED: &str = r#\"{\n            \"timestamp_field\": \"doc.timestamp\",\n            \"field_mappings\": [\n            {\n                \"name\": \"doc\",\n                \"type\": \"object\",\n                \"field_mappings\": [\n                    {\n                        \"name\": \"timestamp\",\n                        \"type\": \"datetime\",\n                        \"fast\": true\n                    }\n                ]\n            },\n            {\n                \"name\": \"body\",\n                \"type\": \"text\"\n            }\n            ]\n        }\"#;\n        let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_TS_AT_ROOT).unwrap();\n        {\n            assert!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .is_ok()\n            );\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"timestamp\": \"invalid timestamp\"}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::ValueError(_, _),\n            ));\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(&doc_mapper, r#\"{ \"body\": \"toto\", \"timestamp\": null}\"#)\n                    .unwrap_err(),\n                DocParsingError::RequiredField(_),\n            ));\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(&doc_mapper, r#\"{ \"body\": \"toto\"}\"#).unwrap_err(),\n                DocParsingError::RequiredField(_),\n            ));\n        }\n\n        let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_TS_WITH_DOT).unwrap();\n        {\n            assert!(\n      
          test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"timestamp.now\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .is_ok()\n            );\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"timestamp.now\": \"invalid timestamp\"}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::ValueError(_, _),\n            ));\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"timestamp\": {\"now\": \"2024-01-01T01:01:01Z\"}}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::RequiredField(_),\n            ));\n        }\n\n        let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_TS_NESTED).unwrap();\n        {\n            assert!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"doc\":{\"timestamp\": \"2024-01-01T01:01:01Z\"}}\"#\n                )\n                .is_ok()\n            );\n        }\n        {\n            assert!(matches!(\n                test_validate_doc_aux(\n                    &doc_mapper,\n                    r#\"{ \"body\": \"toto\", \"doc.timestamp\": \"2024-01-01T01:01:01Z\"}\"#\n                )\n                .unwrap_err(),\n                DocParsingError::RequiredField(_),\n            ));\n        }\n    }\n\n    #[test]\n    fn test_validate_doc_mode() {\n        const DOC: &str = r#\"{ \"whatever\": \"blop\" }\"#;\n        {\n            const JSON_CONFIG_VALUE: &str = r#\"{ \"mode\": \"strict\", \"field_mappings\": [] }\"#;\n            let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_VALUE).unwrap();\n            assert!(matches!(\n    
            test_validate_doc_aux(&doc_mapper, DOC).unwrap_err(),\n                DocParsingError::NoSuchFieldInSchema(_)\n            ));\n        }\n        {\n            const JSON_CONFIG_VALUE: &str = r#\"{ \"mode\": \"lenient\", \"field_mappings\": [] }\"#;\n            let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_VALUE).unwrap();\n            assert!(test_validate_doc_aux(&doc_mapper, DOC).is_ok());\n        }\n        {\n            const JSON_CONFIG_VALUE: &str = r#\"{ \"mode\": \"dynamic\", \"field_mappings\": [] }\"#;\n            let doc_mapper = serde_json::from_str::<DocMapper>(JSON_CONFIG_VALUE).unwrap();\n            assert!(test_validate_doc_aux(&doc_mapper, DOC).is_ok());\n        }\n    }\n\n    fn hashset_fast(elements: &[&str]) -> HashSet<FastFieldWarmupInfo> {\n        elements\n            .iter()\n            .map(|elem| FastFieldWarmupInfo {\n                name: elem.to_string(),\n                with_subfields: false,\n            })\n            .collect()\n    }\n\n    fn automaton_hashset(elements: &[&str]) -> HashSet<Automaton> {\n        elements\n            .iter()\n            .map(|elem| Automaton::Regex(None, elem.to_string()))\n            .collect()\n    }\n\n    fn hashset_field(elements: &[u32]) -> HashSet<Field> {\n        elements\n            .iter()\n            .map(|elem| Field::from_field_id(*elem))\n            .collect()\n    }\n\n    fn hashmap(elements: &[(u32, &str, bool)]) -> HashMap<Field, HashMap<Term, bool>> {\n        let mut result: HashMap<Field, HashMap<Term, bool>> = HashMap::new();\n        for (field, term, pos) in elements {\n            let field = Field::from_field_id(*field);\n            *result\n                .entry(field)\n                .or_default()\n                .entry(Term::from_field_text(field, term))\n                .or_default() |= pos;\n        }\n\n        result\n    }\n\n    fn hashmap_ranges(elements: &[(u32, &str, bool)]) -> HashMap<Field, 
HashMap<TermRange, bool>> {\n        let mut result: HashMap<Field, HashMap<TermRange, bool>> = HashMap::new();\n        for (field, term, pos) in elements {\n            let field = Field::from_field_id(*field);\n            let term = Term::from_field_text(field, term);\n            // this is a 1 element bound, but it's enough for testing.\n            let range = TermRange {\n                start: Bound::Included(term.clone()),\n                end: Bound::Included(term),\n                limit: None,\n            };\n            *result.entry(field).or_default().entry(range).or_default() |= pos;\n        }\n\n        result\n    }\n\n    #[test]\n    fn test_warmup_info_merge() {\n        let wi_base = WarmupInfo {\n            term_dict_fields: hashset_field(&[1, 2]),\n            fast_fields: hashset_fast(&[\"fast1\", \"fast2\"]),\n            field_norms: false,\n            terms_grouped_by_field: hashmap(&[(1, \"term1\", false), (1, \"term2\", false)]),\n            term_ranges_grouped_by_field: hashmap_ranges(&[\n                (2, \"term1\", false),\n                (2, \"term2\", false),\n            ]),\n            automatons_grouped_by_field: [(\n                Field::from_field_id(1),\n                automaton_hashset(&[\"my_reg.*ex\"]),\n            )]\n            .into_iter()\n            .collect(),\n        };\n\n        // merging with default has no impact\n        let mut wi_cloned = wi_base.clone();\n        wi_cloned.merge(WarmupInfo::default());\n        assert_eq!(wi_cloned, wi_base);\n\n        let mut wi_base = wi_base;\n        let wi_2 = WarmupInfo {\n            term_dict_fields: hashset_field(&[2, 3]),\n            fast_fields: hashset_fast(&[\"fast2\", \"fast3\"]),\n            field_norms: true,\n            terms_grouped_by_field: hashmap(&[(2, \"term1\", false), (1, \"term2\", true)]),\n            term_ranges_grouped_by_field: hashmap_ranges(&[\n                (3, \"term1\", false),\n                (2, \"term2\", 
true),\n            ]),\n            automatons_grouped_by_field: [\n                (Field::from_field_id(1), automaton_hashset(&[\"other-re.ex\"])),\n                (Field::from_field_id(2), automaton_hashset(&[\"my_reg.*ex\"])),\n            ]\n            .into_iter()\n            .collect(),\n        };\n        wi_base.merge(wi_2.clone());\n\n        assert_eq!(wi_base.term_dict_fields, hashset_field(&[1, 2, 3]));\n        assert_eq!(\n            wi_base.fast_fields,\n            hashset_fast(&[\"fast1\", \"fast2\", \"fast3\"])\n        );\n        assert!(wi_base.field_norms);\n\n        let expected_terms = [(1, \"term1\", false), (1, \"term2\", true), (2, \"term1\", false)];\n        for (field, term, pos) in expected_terms {\n            let field = Field::from_field_id(field);\n            let term = Term::from_field_text(field, term);\n\n            assert_eq!(\n                *wi_base\n                    .terms_grouped_by_field\n                    .get(&field)\n                    .unwrap()\n                    .get(&term)\n                    .unwrap(),\n                pos\n            );\n        }\n\n        let expected_ranges = [(2, \"term1\", false), (2, \"term2\", true), (3, \"term1\", false)];\n        for (field, term, pos) in expected_ranges {\n            let field = Field::from_field_id(field);\n            let term = Term::from_field_text(field, term);\n            let range = TermRange {\n                start: Bound::Included(term.clone()),\n                end: Bound::Included(term),\n                limit: None,\n            };\n\n            assert_eq!(\n                *wi_base\n                    .term_ranges_grouped_by_field\n                    .get(&field)\n                    .unwrap()\n                    .get(&range)\n                    .unwrap(),\n                pos\n            );\n        }\n\n        let expected_automatons = [(1, \"my_reg.*ex\"), (1, \"other-re.ex\"), (2, \"my_reg.*ex\")];\n        for (field, 
regex) in expected_automatons {\n            let field = Field::from_field_id(field);\n            let automaton = Automaton::Regex(None, regex.to_string());\n            assert!(\n                wi_base\n                    .automatons_grouped_by_field\n                    .get(&field)\n                    .unwrap()\n                    .contains(&automaton)\n            );\n        }\n\n        // merge is idempotent\n        let mut wi_cloned = wi_base.clone();\n        wi_cloned.merge(wi_2);\n        assert_eq!(wi_cloned, wi_base);\n    }\n\n    #[test]\n    fn test_warmup_info_simplify() {\n        let mut warmup_info = WarmupInfo {\n            term_dict_fields: hashset_field(&[1]),\n            fast_fields: hashset_fast(&[\"fast1\", \"fast2\"]),\n            field_norms: false,\n            terms_grouped_by_field: hashmap(&[\n                (1, \"term1\", false),\n                (1, \"term2\", true),\n                (2, \"term3\", false),\n            ]),\n            term_ranges_grouped_by_field: hashmap_ranges(&[\n                (1, \"term1\", false),\n                (1, \"term2\", true),\n                (2, \"term3\", false),\n            ]),\n            automatons_grouped_by_field: [\n                (Field::from_field_id(1), automaton_hashset(&[\"other-re.ex\"])),\n                (Field::from_field_id(1), automaton_hashset(&[\"other-re.ex\"])),\n                (Field::from_field_id(2), automaton_hashset(&[\"my_reg.ex\"])),\n            ]\n            .into_iter()\n            .collect(),\n        };\n        let expected = WarmupInfo {\n            term_dict_fields: hashset_field(&[1]),\n            fast_fields: hashset_fast(&[\"fast1\", \"fast2\"]),\n            field_norms: false,\n            terms_grouped_by_field: hashmap(&[(1, \"term2\", true), (2, \"term3\", false)]),\n            term_ranges_grouped_by_field: hashmap_ranges(&[\n                (1, \"term2\", true),\n                (2, \"term3\", false),\n            ]),\n            
automatons_grouped_by_field: [\n                (Field::from_field_id(1), automaton_hashset(&[\"other-re.ex\"])),\n                (Field::from_field_id(2), automaton_hashset(&[\"my_reg.ex\"])),\n            ]\n            .into_iter()\n            .collect(),\n        };\n\n        warmup_info.simplify();\n        assert_eq!(warmup_info, expected);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/tantivy_val_to_json.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde_json::Value as JsonValue;\nuse tantivy::schema::OwnedValue as TantivyValue;\n\nuse super::BinaryFormat;\nuse super::field_mapping_entry::{NumericOutputFormat, QuickwitNumericOptions};\nuse super::mapping_tree::LeafType;\n\npub(crate) trait NumToJson {\n    fn to_json(&self, output_format: NumericOutputFormat) -> Option<JsonValue>;\n}\n\nimpl NumToJson for u64 {\n    fn to_json(&self, output_format: NumericOutputFormat) -> Option<JsonValue> {\n        let json_value = match output_format {\n            NumericOutputFormat::String => JsonValue::String(self.to_string()),\n            NumericOutputFormat::Number => JsonValue::Number(serde_json::Number::from(*self)),\n        };\n        Some(json_value)\n    }\n}\n\nimpl NumToJson for i64 {\n    fn to_json(&self, output_format: NumericOutputFormat) -> Option<JsonValue> {\n        let json_value = match output_format {\n            NumericOutputFormat::String => JsonValue::String(self.to_string()),\n            NumericOutputFormat::Number => JsonValue::Number(serde_json::Number::from(*self)),\n        };\n        Some(json_value)\n    }\n}\nimpl NumToJson for f64 {\n    fn to_json(&self, output_format: NumericOutputFormat) -> Option<JsonValue> {\n        match output_format {\n            NumericOutputFormat::String => Some(JsonValue::String(self.to_string())),\n            
NumericOutputFormat::Number => {\n                serde_json::Number::from_f64(*self).map(JsonValue::Number)\n            }\n        }\n    }\n}\n\nfn value_to_string(value: TantivyValue) -> Result<JsonValue, TantivyValue> {\n    match value {\n        TantivyValue::Str(s) => return Ok(JsonValue::String(s)),\n        TantivyValue::U64(number) => Some(number.to_string()),\n        TantivyValue::I64(number) => Some(number.to_string()),\n        TantivyValue::F64(number) => Some(number.to_string()),\n        TantivyValue::Bool(b) => Some(b.to_string()),\n        TantivyValue::Date(date) => {\n            return quickwit_datetime::DateTimeOutputFormat::default()\n                .format_to_json(date)\n                .map_err(|_| value);\n        }\n        TantivyValue::IpAddr(ip) => Some(ip.to_string()),\n        _ => None,\n    }\n    .map(JsonValue::String)\n    .ok_or(value)\n}\n\nfn value_to_bool(value: TantivyValue) -> Result<JsonValue, TantivyValue> {\n    match &value {\n        TantivyValue::Str(s) => s.parse().ok(),\n        TantivyValue::U64(number) => match number {\n            0 => Some(false),\n            1 => Some(true),\n            _ => None,\n        },\n        TantivyValue::I64(number) => match number {\n            0 => Some(false),\n            1 => Some(true),\n            _ => None,\n        },\n        TantivyValue::F64(number) => match number {\n            0.0 => Some(false),\n            1.0 => Some(true),\n            _ => None,\n        },\n        TantivyValue::Bool(b) => Some(*b),\n        _ => None,\n    }\n    .map(JsonValue::Bool)\n    .ok_or(value)\n}\n\nfn value_to_ip(value: TantivyValue) -> Result<JsonValue, TantivyValue> {\n    match &value {\n        TantivyValue::Str(s) => s\n            .parse::<std::net::Ipv6Addr>()\n            .or_else(|_| {\n                s.parse::<std::net::Ipv4Addr>()\n                    .map(|ip| ip.to_ipv6_mapped())\n            })\n            .ok(),\n        TantivyValue::IpAddr(ip) => 
Some(*ip),\n        _ => None,\n    }\n    .map(|ip| {\n        serde_json::to_value(TantivyValue::IpAddr(ip))\n            .expect(\"Json serialization should never fail.\")\n    })\n    .ok_or(value)\n}\n\nfn value_to_float(\n    value: TantivyValue,\n    numeric_options: &QuickwitNumericOptions,\n) -> Result<JsonValue, TantivyValue> {\n    match &value {\n        TantivyValue::Str(s) => s.parse().ok(),\n        TantivyValue::U64(number) => Some(*number as f64),\n        TantivyValue::I64(number) => Some(*number as f64),\n        TantivyValue::F64(number) => Some(*number),\n        TantivyValue::Bool(b) => Some(if *b { 1.0 } else { 0.0 }),\n        _ => None,\n    }\n    .and_then(|f64_val| f64_val.to_json(numeric_options.output_format))\n    .ok_or(value)\n}\n\nfn value_to_u64(\n    value: TantivyValue,\n    numeric_options: &QuickwitNumericOptions,\n) -> Result<JsonValue, TantivyValue> {\n    match &value {\n        TantivyValue::Str(s) => s.parse().ok(),\n        TantivyValue::U64(number) => Some(*number),\n        TantivyValue::I64(number) => (*number).try_into().ok(),\n        TantivyValue::F64(number) => {\n            if (0.0..=(u64::MAX as f64)).contains(number) {\n                Some(*number as u64)\n            } else {\n                None\n            }\n        }\n        TantivyValue::Bool(b) => Some(*b as u64),\n        _ => None,\n    }\n    .and_then(|u64_val| u64_val.to_json(numeric_options.output_format))\n    .ok_or(value)\n}\n\nfn value_to_i64(\n    value: TantivyValue,\n    numeric_options: &QuickwitNumericOptions,\n) -> Result<JsonValue, TantivyValue> {\n    match &value {\n        TantivyValue::Str(s) => s.parse().ok(),\n        TantivyValue::U64(number) => (*number).try_into().ok(),\n        TantivyValue::I64(number) => Some(*number),\n        TantivyValue::F64(number) => {\n            if ((i64::MIN as f64)..=(i64::MAX as f64)).contains(number) {\n                Some(*number as i64)\n            } else {\n                None\n        
    }\n        }\n        TantivyValue::Bool(b) => Some(*b as i64),\n        _ => None,\n    }\n    .and_then(|i64_val| i64_val.to_json(numeric_options.output_format))\n    .ok_or(value)\n}\n\n/// Transforms a tantivy object into a serde_json one, without cloning strings.\n/// It still allocates maps.\n// TODO we should probably move this to tantivy, it has the opposite conversion already\npub fn tantivy_object_to_json_value(object: Vec<(String, TantivyValue)>) -> JsonValue {\n    JsonValue::Object(\n        object\n            .into_iter()\n            .map(|(key, value)| (key, tantivy_value_to_json(value)))\n            .collect(),\n    )\n}\n\n/// Converts a `TantivyValue` into a JSON value.\n///\n/// Uses default formatting, e.g. RFC 3339 for dates.\npub fn tantivy_value_to_json(value: TantivyValue) -> JsonValue {\n    match value {\n        TantivyValue::Null => JsonValue::Null,\n        TantivyValue::Str(s) => JsonValue::String(s),\n        TantivyValue::U64(number) => JsonValue::Number(number.into()),\n        TantivyValue::I64(number) => JsonValue::Number(number.into()),\n        TantivyValue::F64(f) => {\n            JsonValue::Number(serde_json::Number::from_f64(f).expect(\"expected finite f64\"))\n        }\n        TantivyValue::Bool(b) => JsonValue::Bool(b),\n        TantivyValue::Array(array) => {\n            JsonValue::Array(array.into_iter().map(tantivy_value_to_json).collect())\n        }\n        TantivyValue::Object(object) => tantivy_object_to_json_value(object),\n        // we shouldn't have these types inside a json field in quickwit\n        TantivyValue::PreTokStr(pretok) => JsonValue::String(pretok.text),\n        TantivyValue::Date(date) => quickwit_datetime::DateTimeOutputFormat::Rfc3339\n            .format_to_json(date)\n            .expect(\"Invalid datetime is not allowed.\"),\n        TantivyValue::Facet(facet) => JsonValue::String(facet.to_string()),\n        TantivyValue::Bytes(bytes) => BinaryFormat::Base64.format_to_json(&bytes),\n    
    TantivyValue::IpAddr(ip_v6) => {\n            let ip_str = if let Some(ip_v4) = ip_v6.to_ipv4_mapped() {\n                ip_v4.to_string()\n            } else {\n                ip_v6.to_string()\n            };\n            JsonValue::String(ip_str)\n        }\n    }\n}\n\n/// Converts TantivyValue into Json Value and formats according to the LeafType.\n///\n/// Makes sure the type and value are consistent before converting.\n/// For certain LeafType, we use the type options to format the output.\npub fn formatted_tantivy_value_to_json(\n    value: TantivyValue,\n    leaf_type: &LeafType,\n) -> Option<JsonValue> {\n    let res = match leaf_type {\n        LeafType::Text(_) => value_to_string(value),\n        LeafType::Bool(_) => value_to_bool(value),\n        LeafType::IpAddr(_) => value_to_ip(value),\n        LeafType::F64(numeric_options) => value_to_float(value, numeric_options),\n        LeafType::U64(numeric_options) => value_to_u64(value, numeric_options),\n        LeafType::I64(numeric_options) => value_to_i64(value, numeric_options),\n        LeafType::Json(_) => {\n            if let TantivyValue::Object(obj) = value {\n                // TODO do we want to allow almost everything here?\n                return Some(tantivy_object_to_json_value(obj));\n            } else {\n                Err(value)\n            }\n        }\n        LeafType::Bytes(bytes_options) => {\n            if let TantivyValue::Bytes(ref bytes) = value {\n                // TODO we could cast str to bytes\n                let json_value = bytes_options.output_format.format_to_json(bytes);\n                Ok(json_value)\n            } else {\n                Err(value)\n            }\n        }\n        LeafType::DateTime(date_time_options) => date_time_options\n            .reparse_tantivy_value(&value)\n            .map(|date_time| {\n                date_time_options\n                    .output_format\n                    .format_to_json(date_time)\n                    
.expect(\"Invalid datetime is not allowed.\")\n            })\n            .ok_or(value),\n    };\n    match res {\n        Ok(res) => Some(res),\n        Err(value) => {\n            quickwit_common::rate_limited_warn!(\n                limit_per_min = 2,\n                \"the value type `{:?}` doesn't match the requested type `{:?}`\",\n                value,\n                leaf_type\n            );\n            None\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use tantivy::schema::OwnedValue as TantivyValue;\n\n    use super::*;\n    use crate::doc_mapper::field_mapping_entry::{\n        BinaryFormat, NumericOutputFormat, QuickwitBytesOptions, QuickwitNumericOptions,\n    };\n    use crate::doc_mapper::mapping_tree::LeafType;\n\n    #[test]\n    fn test_tantivy_value_to_json_value_bytes() {\n        let bytes_options_base64 = QuickwitBytesOptions::default();\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::Bytes(vec![1, 2, 3]),\n                &LeafType::Bytes(bytes_options_base64)\n            )\n            .unwrap(),\n            serde_json::json!(\"AQID\")\n        );\n\n        let bytes_options_hex = QuickwitBytesOptions {\n            output_format: BinaryFormat::Hex,\n            ..Default::default()\n        };\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::Bytes(vec![1, 2, 3]),\n                &LeafType::Bytes(bytes_options_hex)\n            )\n            .unwrap(),\n            serde_json::json!(\"010203\")\n        );\n    }\n\n    #[test]\n    fn test_tantivy_value_to_json_value_f64() {\n        let numeric_options_number = QuickwitNumericOptions::default();\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::F64(0.1),\n                &LeafType::F64(numeric_options_number.clone())\n            )\n            .unwrap(),\n            serde_json::json!(0.1)\n        );\n        
assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::U64(1),\n                &LeafType::F64(numeric_options_number.clone())\n            )\n            .unwrap(),\n            serde_json::json!(1.0)\n        );\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::Str(\"0.1\".to_string()),\n                &LeafType::F64(numeric_options_number.clone())\n            )\n            .unwrap(),\n            serde_json::json!(0.1)\n        );\n\n        let numeric_options_str = QuickwitNumericOptions {\n            output_format: NumericOutputFormat::String,\n            ..Default::default()\n        };\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::F64(0.1),\n                &LeafType::F64(numeric_options_str)\n            )\n            .unwrap(),\n            serde_json::json!(\"0.1\")\n        );\n    }\n\n    #[test]\n    fn test_tantivy_value_to_json_value_i64() {\n        let numeric_options_number = QuickwitNumericOptions::default();\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::I64(-1),\n                &LeafType::I64(numeric_options_number.clone())\n            )\n            .unwrap(),\n            serde_json::json!(-1)\n        );\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::I64(1),\n                &LeafType::I64(numeric_options_number)\n            )\n            .unwrap(),\n            serde_json::json!(1)\n        );\n\n        let numeric_options_str = QuickwitNumericOptions {\n            output_format: NumericOutputFormat::String,\n            ..Default::default()\n        };\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::I64(-1),\n                &LeafType::I64(numeric_options_str)\n            )\n            .unwrap(),\n            serde_json::json!(\"-1\")\n        
);\n    }\n\n    #[test]\n    fn test_tantivy_value_to_json_value_u64() {\n        let numeric_options_number = QuickwitNumericOptions::default();\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::U64(1),\n                &LeafType::U64(numeric_options_number.clone())\n            )\n            .unwrap(),\n            serde_json::json!(1u64)\n        );\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::I64(1),\n                &LeafType::U64(numeric_options_number)\n            )\n            .unwrap(),\n            serde_json::json!(1u64)\n        );\n\n        let numeric_options_str = QuickwitNumericOptions {\n            output_format: NumericOutputFormat::String,\n            ..Default::default()\n        };\n        assert_eq!(\n            formatted_tantivy_value_to_json(\n                TantivyValue::U64(1),\n                &LeafType::U64(numeric_options_str)\n            )\n            .unwrap(),\n            serde_json::json!(\"1\")\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapper/tokenizer_entry.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse anyhow::Context;\nuse quickwit_query::{CodeTokenizer, DEFAULT_REMOVE_TOKEN_LENGTH};\nuse serde::{Deserialize, Serialize};\nuse tantivy::tokenizer::{\n    AsciiFoldingFilter, LowerCaser, NgramTokenizer, RegexTokenizer, RemoveLongFilter,\n    SimpleTokenizer, TextAnalyzer, Token,\n};\n\n/// A `TokenizerEntry` defines a custom tokenizer with its name and configuration.\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, Eq, Hash, utoipa::ToSchema)]\npub struct TokenizerEntry {\n    /// Tokenizer name.\n    pub name: String,\n    /// Tokenizer configuration.\n    #[serde(flatten)]\n    pub(crate) config: TokenizerConfig,\n}\n\n/// Tokenizer configuration.\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, Eq, Hash, utoipa::ToSchema)]\npub struct TokenizerConfig {\n    #[serde(flatten)]\n    pub(crate) tokenizer_type: TokenizerType,\n    #[serde(default)]\n    pub(crate) filters: Vec<TokenFilterType>,\n}\n\nimpl TokenizerConfig {\n    /// Build a `TextAnalyzer` from a `TokenizerConfig`.\n    pub fn text_analyzer(&self) -> anyhow::Result<TextAnalyzer> {\n        let mut text_analyzer_builder = match &self.tokenizer_type {\n            TokenizerType::Simple => TextAnalyzer::builder(SimpleTokenizer::default()).dynamic(),\n            TokenizerType::SourceCode => TextAnalyzer::builder(CodeTokenizer::default()).dynamic(),\n            
TokenizerType::Ngram(options) => {\n                let tokenizer =\n                    NgramTokenizer::new(options.min_gram, options.max_gram, options.prefix_only)\n                        .with_context(|| \"invalid ngram tokenizer\".to_string())?;\n                TextAnalyzer::builder(tokenizer).dynamic()\n            }\n            TokenizerType::Regex(options) => {\n                let tokenizer = RegexTokenizer::new(&options.pattern)\n                    .with_context(|| \"invalid regex tokenizer\".to_string())?;\n                TextAnalyzer::builder(tokenizer).dynamic()\n            }\n        };\n        for filter in &self.filters {\n            match filter.tantivy_token_filter_enum() {\n                TantivyTokenFilterEnum::RemoveLong(token_filter) => {\n                    text_analyzer_builder = text_analyzer_builder.filter_dynamic(token_filter);\n                }\n                TantivyTokenFilterEnum::LowerCaser(token_filter) => {\n                    text_analyzer_builder = text_analyzer_builder.filter_dynamic(token_filter);\n                }\n                TantivyTokenFilterEnum::AsciiFolding(token_filter) => {\n                    text_analyzer_builder = text_analyzer_builder.filter_dynamic(token_filter);\n                }\n            }\n        }\n        Ok(text_analyzer_builder.build())\n    }\n}\n\n/// Helper function to analyze a text with a given `TokenizerConfig`.\npub fn analyze_text(text: &str, tokenizer: &TokenizerConfig) -> anyhow::Result<Vec<Token>> {\n    let mut text_analyzer = tokenizer.text_analyzer()?;\n    let mut token_stream = text_analyzer.token_stream(text);\n    let mut tokens = Vec::new();\n    token_stream.process(&mut |token| {\n        tokens.push(token.clone());\n    });\n    Ok(tokens)\n}\n\n#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\npub enum TokenFilterType {\n    RemoveLong,\n    LowerCaser,\n    AsciiFolding,\n}\n\n/// 
Tantivy token filter enum to build\n/// a `TextAnalyzer` with dynamic token filters.\nenum TantivyTokenFilterEnum {\n    RemoveLong(RemoveLongFilter),\n    LowerCaser(LowerCaser),\n    AsciiFolding(AsciiFoldingFilter),\n}\n\nimpl TokenFilterType {\n    fn tantivy_token_filter_enum(&self) -> TantivyTokenFilterEnum {\n        match &self {\n            Self::RemoveLong => TantivyTokenFilterEnum::RemoveLong(RemoveLongFilter::limit(\n                DEFAULT_REMOVE_TOKEN_LENGTH,\n            )),\n            Self::LowerCaser => TantivyTokenFilterEnum::LowerCaser(LowerCaser),\n            Self::AsciiFolding => TantivyTokenFilterEnum::AsciiFolding(AsciiFoldingFilter),\n        }\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(tag = \"type\", rename_all = \"snake_case\")]\npub enum TokenizerType {\n    Ngram(NgramTokenizerOption),\n    Regex(RegexTokenizerOption),\n    Simple,\n    SourceCode,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, Eq, Hash, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct NgramTokenizerOption {\n    pub min_gram: usize,\n    pub max_gram: usize,\n    #[serde(default)]\n    pub prefix_only: bool,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, Eq, Hash, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct RegexTokenizerOption {\n    pub pattern: String,\n}\n\n#[cfg(test)]\nmod tests {\n    use super::{NgramTokenizerOption, TokenizerType};\n    use crate::TokenizerEntry;\n    use crate::doc_mapper::RegexTokenizerOption;\n\n    #[test]\n    fn test_deserialize_tokenizer_entry() {\n        let result: Result<TokenizerEntry, serde_json::Error> =\n            serde_json::from_str::<TokenizerEntry>(\n                r#\"\n            {\n                \"name\": \"my_tokenizer\",\n                \"type\": \"ngram\",\n                \"min_gram\": 1,\n                \"max_gram\": 3,\n                \"filters\": [\n                    
\"remove_long\",\n                    \"lower_caser\",\n                    \"ascii_folding\"\n                ]\n            }\n            \"#,\n            );\n        assert!(result.is_ok());\n        let tokenizer_config_entry = result.unwrap();\n        assert_eq!(tokenizer_config_entry.config.filters.len(), 3);\n        match tokenizer_config_entry.config.tokenizer_type {\n            TokenizerType::Ngram(options) => {\n                assert_eq!(\n                    options,\n                    NgramTokenizerOption {\n                        min_gram: 1,\n                        max_gram: 3,\n                        prefix_only: false,\n                    }\n                )\n            }\n            _ => panic!(\"Unexpected tokenizer type\"),\n        }\n    }\n\n    #[test]\n    fn test_deserialize_tokenizer_entry_failed_with_wrong_key() {\n        let result: Result<TokenizerEntry, serde_json::Error> =\n            serde_json::from_str::<TokenizerEntry>(\n                r#\"\n            {\n                \"name\": \"my_tokenizer\",\n                \"type\": \"ngram\",\n                \"min_gram\": 1,\n                \"max_gram\": 3,\n                \"filters\": [\n                    \"remove_long\",\n                    \"lower_caser\",\n                    \"ascii_folding\"\n                ],\n                \"abc\": 123\n            }\n            \"#,\n            );\n        assert!(result.is_err());\n        assert!(\n            result\n                .unwrap_err()\n                .to_string()\n                .contains(\"unknown field `abc`\")\n        );\n    }\n\n    #[test]\n    fn test_tokenizer_entry_regex() {\n        let result: Result<TokenizerEntry, serde_json::Error> =\n            serde_json::from_str::<TokenizerEntry>(\n                r#\"\n            {\n                \"name\": \"my_tokenizer\",\n                \"type\": \"regex\",\n                \"pattern\": \"(my_pattern)\"\n            }\n            \"#,\n  
          );\n        assert!(result.is_ok());\n        let tokenizer_config_entry = result.unwrap();\n        assert_eq!(tokenizer_config_entry.config.filters.len(), 0);\n        match tokenizer_config_entry.config.tokenizer_type {\n            TokenizerType::Regex(options) => {\n                assert_eq!(\n                    options,\n                    RegexTokenizerOption {\n                        pattern: \"(my_pattern)\".to_string(),\n                    }\n                )\n            }\n            _ => panic!(\"Unexpected tokenizer type\"),\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/doc_mapping.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::num::NonZeroU32;\n\nuse quickwit_proto::types::DocMappingUid;\nuse serde::{Deserialize, Serialize};\n\nuse crate::{FieldMappingEntry, QuickwitJsonOptions, TokenizerEntry};\n\n/// Defines how unmapped fields should be handled.\n#[derive(Clone, Copy, Default, Debug, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"lowercase\")]\npub enum ModeType {\n    /// Lenient mode: ignores unmapped fields.\n    Lenient,\n    /// Strict mode: returns an error when an unmapped field is encountered.\n    Strict,\n    /// Dynamic mode: captures and handles unmapped fields according to the dynamic field\n    /// configuration.\n    #[default]\n    Dynamic,\n}\n\n/// Defines how unmapped fields should be handled.\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]\npub enum Mode {\n    /// Lenient mode: ignores unmapped fields.\n    Lenient,\n    /// Strict mode: returns an error when an unmapped field is encountered.\n    Strict,\n    /// Dynamic mode: captures and handles unmapped fields according to the dynamic field\n    /// configuration.\n    Dynamic(QuickwitJsonOptions),\n}\n\nimpl Mode {\n    /// Extracts the [`ModeType`] of this [`Mode`]\n    pub fn mode_type(&self) -> ModeType {\n        match self {\n            Self::Lenient => ModeType::Lenient,\n            
Self::Strict => ModeType::Strict,\n            Self::Dynamic(_) => ModeType::Dynamic,\n        }\n    }\n\n    /// Builds a [`Mode`] from its type and optional dynamic mapping options.\n    pub fn from_parts(\n        mode: ModeType,\n        dynamic_mapping: Option<QuickwitJsonOptions>,\n    ) -> anyhow::Result<Mode> {\n        Ok(match (mode, dynamic_mapping) {\n            (ModeType::Lenient, None) => Self::Lenient,\n            (ModeType::Strict, None) => Self::Strict,\n            (ModeType::Dynamic, Some(dynamic_mapping)) => Self::Dynamic(dynamic_mapping),\n            (ModeType::Dynamic, None) => Self::default(), // Dynamic with default options\n            (_, Some(_)) => anyhow::bail!(\n                \"`dynamic_mapping` is only allowed with mode=dynamic. (here mode=`{:?}`)\",\n                mode\n            ),\n        })\n    }\n\n    /// Obtains the mode type and dynamic options from a [`Mode`].\n    pub fn into_parts(self) -> (ModeType, Option<QuickwitJsonOptions>) {\n        match self {\n            Self::Lenient => (ModeType::Lenient, None),\n            Self::Strict => (ModeType::Strict, None),\n            Self::Dynamic(json_options) => (ModeType::Dynamic, Some(json_options)),\n        }\n    }\n}\n\nimpl Default for Mode {\n    fn default() -> Self {\n        Self::Dynamic(QuickwitJsonOptions::default_dynamic())\n    }\n}\n\n/// Defines how the document of an index should be parsed, tokenized, partitioned, indexed, and\n/// stored.\n#[quickwit_macros::serde_multikey]\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct DocMapping {\n    /// Doc mapping UID.\n    ///\n    /// Splits with the same doc mapping UID share the same schema and should use the same doc\n    /// mapper during indexing and querying.\n    #[serde(default = \"DocMappingUid::random\")]\n    pub doc_mapping_uid: DocMappingUid,\n\n    /// Defines how unmapped fields should be handled.\n    
#[serde_multikey(\n        deserializer = Mode::from_parts,\n        serializer = Mode::into_parts,\n        fields = (\n            #[serde(default)]\n            mode: ModeType,\n            #[serde(skip_serializing_if = \"Option::is_none\")]\n            dynamic_mapping: Option<QuickwitJsonOptions>\n        ),\n    )]\n    pub mode: Mode,\n\n    /// Defines the schema of ingested documents and describes how each field value should be\n    /// parsed, tokenized, indexed, and stored.\n    #[serde(default)]\n    #[schema(value_type = Vec<FieldMappingEntryForSerialization>)]\n    pub field_mappings: Vec<FieldMappingEntry>,\n\n    /// Declares the field which contains the date or timestamp at which the document\n    /// was emitted.\n    #[serde(default)]\n    pub timestamp_field: Option<String>,\n\n    /// Declares the low cardinality fields for which the values are recorded directly in the\n    /// splits metadata.\n    #[schema(value_type = Vec<String>)]\n    #[serde(default)]\n    pub tag_fields: BTreeSet<String>,\n\n    /// Expresses via a \"mini-DSL\" how to route documents to split partitions.\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub partition_key: Option<String>,\n\n    /// The maximum number of partitions that an indexer can generate.\n    #[schema(value_type = u32)]\n    #[serde(default = \"DocMapping::default_max_num_partitions\")]\n    pub max_num_partitions: NonZeroU32,\n\n    /// Whether to record the presence of the fields of each indexed document to allow `exists`\n    /// queries.\n    #[serde(default)]\n    pub index_field_presence: bool,\n\n    /// Whether to record and store the size (bytes) of each ingested document in a fast field.\n    #[serde(alias = \"document_length\")]\n    #[serde(default)]\n    pub store_document_size: bool,\n\n    /// Whether to store the original source documents in the doc store.\n    #[serde(default)]\n    pub store_source: bool,\n\n    /// A set of additional 
user-defined tokenizers to be used during indexing.\n    #[serde(default)]\n    pub tokenizers: Vec<TokenizerEntry>,\n}\n\nimpl DocMapping {\n    /// Returns the default value for `max_num_partitions`.\n    pub fn default_max_num_partitions() -> NonZeroU32 {\n        NonZeroU32::new(200).unwrap()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::doc_mapper::{QuickwitNumericOptions, QuickwitTextOptions};\n    use crate::{\n        Cardinality, FieldMappingType, RegexTokenizerOption, TokenFilterType, TokenizerConfig,\n        TokenizerType,\n    };\n\n    #[test]\n    fn test_doc_mapping_serde_roundtrip() {\n        let doc_mapping = DocMapping {\n            doc_mapping_uid: DocMappingUid::random(),\n            mode: Mode::Strict,\n            field_mappings: vec![\n                FieldMappingEntry {\n                    name: \"timestamp\".to_string(),\n                    mapping_type: FieldMappingType::U64(\n                        QuickwitNumericOptions::default(),\n                        Cardinality::SingleValued,\n                    ),\n                },\n                FieldMappingEntry {\n                    name: \"message\".to_string(),\n                    mapping_type: FieldMappingType::Text(\n                        QuickwitTextOptions::default(),\n                        Cardinality::SingleValued,\n                    ),\n                },\n            ],\n            timestamp_field: Some(\"timestamp\".to_string()),\n            tag_fields: BTreeSet::from_iter([\"level\".to_string()]),\n            partition_key: Some(\"tenant_id\".to_string()),\n            max_num_partitions: NonZeroU32::new(100).unwrap(),\n            index_field_presence: true,\n            store_document_size: true,\n            store_source: true,\n            tokenizers: vec![TokenizerEntry {\n                name: \"whitespace\".to_string(),\n                config: TokenizerConfig {\n                    tokenizer_type: 
TokenizerType::Regex(RegexTokenizerOption {\n                        pattern: r\"\\s+\".to_string(),\n                    }),\n                    filters: vec![TokenFilterType::LowerCaser],\n                },\n            }],\n        };\n        let serialized = serde_json::to_string(&doc_mapping).unwrap();\n        let deserialized: DocMapping = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, doc_mapping);\n    }\n\n    #[test]\n    fn test_doc_mapping_serde_default_values() {\n        let doc_mapping: DocMapping = serde_json::from_str(\"{}\").unwrap();\n        assert_eq!(\n            doc_mapping.mode,\n            Mode::Dynamic(QuickwitJsonOptions::default_dynamic())\n        );\n        assert!(doc_mapping.field_mappings.is_empty());\n        assert_eq!(doc_mapping.timestamp_field, None);\n        assert!(doc_mapping.tag_fields.is_empty());\n        assert_eq!(doc_mapping.partition_key, None);\n        assert_eq!(\n            doc_mapping.max_num_partitions,\n            NonZeroU32::new(200).unwrap()\n        );\n        assert_eq!(doc_mapping.index_field_presence, false);\n        assert_eq!(doc_mapping.store_document_size, false);\n        assert_eq!(doc_mapping.store_source, false);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_query::InvalidQuery;\nuse tantivy::schema::DocParsingError as TantivyDocParsingError;\nuse thiserror::Error;\n\n/// Failed to parse query.\n#[derive(Error, Debug)]\n#[allow(missing_docs)]\npub enum QueryParserError {\n    #[error(\"invalid json: {0}\")]\n    InvalidJson(#[from] serde_json::Error),\n    #[error(\"invalid query: {0}\")]\n    InvalidQuery(#[from] InvalidQuery),\n    #[error(\"invalid default search field: `{field_name}` {cause}\")]\n    InvalidDefaultField {\n        cause: &'static str,\n        field_name: String,\n    },\n    #[error(\"{0}\")]\n    Other(#[from] anyhow::Error),\n}\n\n/// Error that may happen when parsing\n/// a document from JSON.\n#[derive(Debug, Error, Eq, PartialEq)]\npub enum DocParsingError {\n    /// The provided string is not a syntactically valid JSON object.\n    #[error(\"the provided string is not a syntactically valid JSON object: {0}\")]\n    NotJsonObject(String),\n    /// One of the value could not be parsed.\n    #[error(\"the field `{0}` could not be parsed: {1}\")]\n    ValueError(String, String),\n    /// The json-document contains a field that is not declared in the schema.\n    #[error(\"the document contains a field that is not declared in the schema: {0:?}\")]\n    NoSuchFieldInSchema(String),\n    /// The document contains a array of values but a single value is expected.\n 
   #[error(\"the document contains an array of values but a single value is expected: {0:?}\")]\n    MultiValuesNotSupported(String),\n    /// The document does not contain a field that is required.\n    #[error(\"the document must contain field {0:?}\")]\n    RequiredField(String),\n}\n\nimpl From<TantivyDocParsingError> for DocParsingError {\n    fn from(value: TantivyDocParsingError) -> Self {\n        match value {\n            TantivyDocParsingError::InvalidJson(text) => DocParsingError::NotJsonObject(text),\n            TantivyDocParsingError::ValueError(text, error) => {\n                DocParsingError::ValueError(text, format!(\"{error:?}\"))\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![warn(missing_docs)]\n#![allow(clippy::bool_assert_comparison)]\n#![deny(clippy::disallowed_methods)]\n\n//! Index config defines how to configure an index and especially how\n//! to convert a json like documents to a document indexable by tantivy\n//! engine, aka tantivy::Document.\n\nmod doc_mapper;\nmod doc_mapping;\nmod error;\nmod query_builder;\nmod routing_expression;\n\n/// Pruning tags manipulation.\npub mod tag_pruning;\n\npub use doc_mapper::{\n    Automaton, BinaryFormat, DocMapper, DocMapperBuilder, FastFieldWarmupInfo, FieldMappingEntry,\n    FieldMappingType, JsonObject, NamedField, QuickwitBytesOptions, QuickwitJsonOptions, TermRange,\n    TokenizerConfig, TokenizerEntry, WarmupInfo, analyze_text,\n};\nuse doc_mapper::{\n    FastFieldOptions, FieldMappingEntryForSerialization, IndexRecordOptionSchema,\n    NgramTokenizerOption, QuickwitTextNormalizer, QuickwitTextTokenizer, RegexTokenizerOption,\n    TokenFilterType, TokenizerType,\n};\npub use doc_mapping::{DocMapping, Mode, ModeType};\npub use error::{DocParsingError, QueryParserError};\nuse quickwit_common::shared_consts::FIELD_PRESENCE_FIELD_NAME;\nuse quickwit_proto::types::DocMappingUid;\npub use routing_expression::RoutingExpr;\n\n/// Field name reserved for storing the source document.\npub const SOURCE_FIELD_NAME: &str = \"_source\";\n\n/// Field name reserved for 
storing the dynamically indexed fields.\npub const DYNAMIC_FIELD_NAME: &str = \"_dynamic\";\n\n/// Field name reserved for storing the length of the source document.\npub const DOCUMENT_SIZE_FIELD_NAME: &str = \"_doc_length\";\n\n/// Quickwit reserved field names.\nconst QW_RESERVED_FIELD_NAMES: &[&str] = &[\n    DOCUMENT_SIZE_FIELD_NAME,\n    DYNAMIC_FIELD_NAME,\n    FIELD_PRESENCE_FIELD_NAME,\n    SOURCE_FIELD_NAME,\n];\n\n/// Cardinality of a field.\n#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]\npub enum Cardinality {\n    /// Single-valued field.\n    SingleValued,\n    /// Multivalued field.\n    MultiValued,\n}\n\n#[derive(utoipa::OpenApi)]\n#[openapi(components(schemas(\n    DocMappingUid,\n    FastFieldOptions,\n    FieldMappingEntryForSerialization,\n    IndexRecordOptionSchema,\n    ModeType,\n    NgramTokenizerOption,\n    QuickwitJsonOptions,\n    QuickwitTextNormalizer,\n    QuickwitTextTokenizer,\n    RegexTokenizerOption,\n    TokenFilterType,\n    TokenizerConfig,\n    TokenizerEntry,\n    TokenizerType,\n)))]\n/// Schemas used for the OpenAPI generation that are a part of this crate.\npub struct DocMapperApiSchemas;\n\n/// Returns a default `DocMapper` for unit tests.\n#[cfg(any(test, feature = \"testsuite\"))]\npub fn default_doc_mapper_for_test() -> DocMapper {\n    const JSON_CONFIG_VALUE: &str = r#\"\n        {\n            \"store_source\": true,\n            \"index_field_presence\": true,\n            \"default_search_fields\": [\n                \"body\", \"attributes.server\", \"attributes.server\\\\.status\"\n            ],\n            \"timestamp_field\": \"timestamp\",\n            \"tag_fields\": [\"owner\"],\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"datetime\",\n                    \"output_format\": \"unix_timestamp_secs\",\n                    \"fast\": true\n                },\n                {\n                   
 \"name\": \"body\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"response_date\",\n                    \"type\": \"datetime\",\n                    \"input_formats\": [\"rfc3339\", \"unix_timestamp\"],\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"response_time\",\n                    \"type\": \"f64\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"response_payload\",\n                    \"type\": \"bytes\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"owner\",\n                    \"type\": \"text\",\n                    \"tokenizer\": \"raw\"\n                },\n                {\n                    \"name\": \"isImportant\",\n                    \"type\": \"bool\"\n                },\n                {\n                    \"name\": \"properties\",\n                    \"type\": \"json\"\n                },\n                {\n                    \"name\": \"children\",\n                    \"type\": \"array<json>\"\n                },\n                {\n                    \"name\": \"attributes\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"tags\",\n                            \"type\": \"array<i64>\"\n                        },\n                        {\n                            \"name\": \"server\",\n                            \"type\": \"text\"\n                        },\n                        {\n                            \"name\": \"server.status\",\n                            \"type\": \"array<text>\"\n                        },\n                        {\n                            \"name\": \"server.payload\",\n                            
\"type\": \"array<bytes>\"\n                        }\n                    ]\n                }\n            ]\n        }\"#;\n    serde_json::from_str::<DocMapper>(JSON_CONFIG_VALUE).unwrap()\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/query_builder.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::convert::Infallible;\nuse std::ops::Bound;\nuse std::sync::Arc;\n\nuse quickwit_proto::types::SplitId;\nuse quickwit_query::query_ast::{\n    BuildTantivyAstContext, FieldPresenceQuery, FullTextQuery, PhrasePrefixQuery, QueryAst,\n    QueryAstTransformer, QueryAstVisitor, RangeQuery, RegexQuery, TermSetQuery, WildcardQuery,\n};\nuse quickwit_query::tokenizers::TokenizerManager;\nuse quickwit_query::{InvalidQuery, find_field_or_hit_dynamic};\nuse tantivy::Term;\nuse tantivy::query::Query;\nuse tantivy::schema::{Field, Schema};\nuse tracing::error;\n\nuse crate::doc_mapper::FastFieldWarmupInfo;\nuse crate::{Automaton, QueryParserError, TermRange, WarmupInfo};\n\n#[derive(Default)]\nstruct RangeQueryFields {\n    range_query_field_names: HashSet<String>,\n}\n\nimpl<'a> QueryAstVisitor<'a> for RangeQueryFields {\n    type Err = Infallible;\n\n    fn visit_range(&mut self, range_query: &'a RangeQuery) -> Result<(), Infallible> {\n        self.range_query_field_names\n            .insert(range_query.field.to_string());\n        Ok(())\n    }\n}\n\n/// Term Queries on fields which are fast but not indexed.\nstruct TermSearchOnColumnar<'f> {\n    fields: &'f mut HashSet<FastFieldWarmupInfo>,\n    schema: Schema,\n}\nimpl<'a, 'f> QueryAstVisitor<'a> for TermSearchOnColumnar<'f> {\n    type Err = 
Infallible;\n\n    fn visit_term_set(&mut self, term_set_query: &'a TermSetQuery) -> Result<(), Infallible> {\n        for field in term_set_query.terms_per_field.keys() {\n            if let Some((_field, field_entry, path)) =\n                find_field_or_hit_dynamic(field, &self.schema)\n                && field_entry.is_fast()\n                && !field_entry.is_indexed()\n            {\n                self.fields.insert(FastFieldWarmupInfo {\n                    name: if path.is_empty() {\n                        field_entry.name().to_string()\n                    } else {\n                        format!(\"{}.{}\", field_entry.name(), path)\n                    },\n                    with_subfields: false,\n                });\n            }\n        }\n        Ok(())\n    }\n\n    fn visit_term(\n        &mut self,\n        term_query: &'a quickwit_query::query_ast::TermQuery,\n    ) -> Result<(), Infallible> {\n        if let Some((_field, field_entry, path)) =\n            find_field_or_hit_dynamic(&term_query.field, &self.schema)\n            && field_entry.is_fast()\n            && !field_entry.is_indexed()\n        {\n            self.fields.insert(FastFieldWarmupInfo {\n                name: if path.is_empty() {\n                    field_entry.name().to_string()\n                } else {\n                    format!(\"{}.{}\", field_entry.name(), path)\n                },\n                with_subfields: false,\n            });\n        }\n        Ok(())\n    }\n    /// We also need to visit full text queries because they can be converted to term queries\n    /// on fast fields. 
We only care about the field being fast and not indexed AND the tokenizer\n    /// being `raw` or None.\n    fn visit_full_text(&mut self, full_text_query: &'a FullTextQuery) -> Result<(), Infallible> {\n        if let Some((_field, field_entry, path)) =\n            find_field_or_hit_dynamic(&full_text_query.field, &self.schema)\n            && field_entry.is_fast()\n            && !field_entry.is_indexed()\n            && (full_text_query.params.tokenizer.is_none()\n                || full_text_query.params.tokenizer.as_deref() == Some(\"raw\"))\n        {\n            self.fields.insert(FastFieldWarmupInfo {\n                name: if path.is_empty() {\n                    field_entry.name().to_string()\n                } else {\n                    format!(\"{}.{}\", field_entry.name(), path)\n                },\n                with_subfields: false,\n            });\n        }\n        Ok(())\n    }\n}\n\nstruct ExistsQueryFastFields<'f> {\n    fields: &'f mut HashSet<FastFieldWarmupInfo>,\n    schema: Schema,\n}\n\nimpl<'a, 'f> QueryAstVisitor<'a> for ExistsQueryFastFields<'f> {\n    type Err = Infallible;\n\n    fn visit_exists(&mut self, exists_query: &'a FieldPresenceQuery) -> Result<(), Infallible> {\n        let fields = exists_query.find_field_and_subfields(&self.schema);\n        for (_, field_entry, path) in fields {\n            if field_entry.is_fast() {\n                if field_entry.field_type().is_json() {\n                    let full_path = format!(\"{}.{}\", field_entry.name(), path);\n                    self.fields.insert(FastFieldWarmupInfo {\n                        name: full_path,\n                        with_subfields: true,\n                    });\n                } else if path.is_empty() {\n                    self.fields.insert(FastFieldWarmupInfo {\n                        name: field_entry.name().to_string(),\n                        with_subfields: false,\n                    });\n                } else {\n                    
error!(\n                        field_entry = field_entry.name(),\n                        path, \"only JSON type supports subfields\"\n                    );\n                }\n            }\n        }\n        Ok(())\n    }\n}\n\n/// Build a `Query` with field resolution & forbidding range clauses.\npub(crate) fn build_query(\n    query_ast: QueryAst,\n    context: &BuildTantivyAstContext,\n    cache_context: Option<(Arc<dyn quickwit_query::query_ast::PredicateCache>, SplitId)>,\n) -> Result<(Box<dyn Query>, WarmupInfo), QueryParserError> {\n    let mut fast_fields: HashSet<FastFieldWarmupInfo> = HashSet::new();\n\n    let query_ast = if let Some((cache, split_id)) = cache_context {\n        let Ok(query_ast) = quickwit_query::query_ast::PredicateCacheInjector { cache, split_id }\n            .transform(query_ast);\n        // this transformer isn't supposed to ever remove a node\n        query_ast.unwrap_or(QueryAst::MatchAll)\n    } else {\n        query_ast\n    };\n\n    let mut range_query_fields = RangeQueryFields::default();\n    // This cannot fail. 
The error type is Infallible.\n    let Ok(_) = range_query_fields.visit(&query_ast);\n    let range_query_fast_fields =\n        range_query_fields\n            .range_query_field_names\n            .into_iter()\n            .map(|name| FastFieldWarmupInfo {\n                name,\n                with_subfields: false,\n            });\n    fast_fields.extend(range_query_fast_fields);\n\n    let Ok(_) = TermSearchOnColumnar {\n        fields: &mut fast_fields,\n        schema: context.schema.clone(),\n    }\n    .visit(&query_ast);\n\n    let Ok(_) = ExistsQueryFastFields {\n        fields: &mut fast_fields,\n        schema: context.schema.clone(),\n    }\n    .visit(&query_ast);\n\n    let query = query_ast.build_tantivy_query(context)?;\n\n    let term_set_query_fields = extract_term_set_query_fields(&query_ast, context.schema)?;\n    let (term_ranges_grouped_by_field, automatons_grouped_by_field) =\n        extract_prefix_term_ranges_and_automaton(\n            &query_ast,\n            context.schema,\n            context.tokenizer_manager,\n        )?;\n\n    let mut terms_grouped_by_field: HashMap<Field, HashMap<_, bool>> = Default::default();\n    query.query_terms(&mut |term, need_position| {\n        let field = term.field();\n        if !context.schema.get_field_entry(field).is_indexed() {\n            return;\n        }\n        *terms_grouped_by_field\n            .entry(field)\n            .or_default()\n            .entry(term.clone())\n            .or_default() |= need_position;\n    });\n\n    let warmup_info = WarmupInfo {\n        term_dict_fields: term_set_query_fields,\n        terms_grouped_by_field,\n        term_ranges_grouped_by_field,\n        fast_fields,\n        automatons_grouped_by_field,\n        ..WarmupInfo::default()\n    };\n\n    Ok((query, warmup_info))\n}\n\nstruct ExtractTermSetFields<'a> {\n    term_dict_fields_to_warm_up: HashSet<Field>,\n    schema: &'a Schema,\n}\n\nimpl<'a> ExtractTermSetFields<'a> {\n    fn new(schema: 
&'a Schema) -> Self {\n        ExtractTermSetFields {\n            term_dict_fields_to_warm_up: HashSet::new(),\n            schema,\n        }\n    }\n}\n\nimpl<'a> QueryAstVisitor<'a> for ExtractTermSetFields<'_> {\n    type Err = anyhow::Error;\n\n    fn visit_term_set(&mut self, term_set_query: &'a TermSetQuery) -> anyhow::Result<()> {\n        for field in term_set_query.terms_per_field.keys() {\n            if let Some((field, _field_entry, _path)) =\n                find_field_or_hit_dynamic(field, self.schema)\n            {\n                self.term_dict_fields_to_warm_up.insert(field);\n            } else {\n                anyhow::bail!(\"field does not exist: {}\", field);\n            }\n        }\n        Ok(())\n    }\n}\n\nfn extract_term_set_query_fields(\n    query_ast: &QueryAst,\n    schema: &Schema,\n) -> anyhow::Result<HashSet<Field>> {\n    let mut visitor = ExtractTermSetFields::new(schema);\n    visitor.visit(query_ast)?;\n    Ok(visitor.term_dict_fields_to_warm_up)\n}\n\n/// Converts a `prefix` term into the equivalent term range.\n///\n/// The resulting range is `[prefix, next_prefix)`, that is:\n/// - start bound: `Included(prefix)`\n/// - end bound: `Excluded(next lexicographic term after the prefix)`\n///\n/// \"abc\"    -> start: \"abc\", end: \"abd\" (excluded)\n/// \"ab\\xFF\" -> start: \"ab\\xFF\", end: \"ac\" (excluded)\n/// \"\\xFF\\xFF\" -> start: \"\\xFF\\xFF\", end: Unbounded\nfn prefix_term_to_range(prefix: Term) -> (Bound<Term>, Bound<Term>) {\n    // Start from the given prefix and try to find the successor\n    let mut end_bound = prefix.clone();\n    let mut end_bound_value_bytes = prefix.serialized_value_bytes().to_vec();\n    while !end_bound_value_bytes.is_empty() {\n        let last_byte = end_bound_value_bytes.last_mut().unwrap();\n        if *last_byte != u8::MAX {\n            *last_byte += 1;\n            // The last non-`u8::MAX` byte incremented\n            // gives us the exclusive upper bound.\n            
end_bound.set_bytes(&end_bound_value_bytes);\n            return (Bound::Included(prefix), Bound::Excluded(end_bound));\n        }\n        // pop u8::MAX byte and try next\n        end_bound_value_bytes.pop();\n    }\n    // All bytes were `u8::MAX`: there is no successor, so the upper bound is unbounded.\n    (Bound::Included(prefix), Bound::Unbounded)\n}\n\ntype PositionNeeded = bool;\n\nstruct ExtractPrefixTermRanges<'a> {\n    schema: &'a Schema,\n    tokenizer_manager: &'a TokenizerManager,\n    term_ranges_to_warm_up: HashMap<Field, HashMap<TermRange, PositionNeeded>>,\n    automatons_to_warm_up: HashMap<Field, HashSet<Automaton>>,\n}\n\nimpl<'a> ExtractPrefixTermRanges<'a> {\n    fn with_schema(schema: &'a Schema, tokenizer_manager: &'a TokenizerManager) -> Self {\n        ExtractPrefixTermRanges {\n            schema,\n            tokenizer_manager,\n            term_ranges_to_warm_up: HashMap::new(),\n            automatons_to_warm_up: HashMap::new(),\n        }\n    }\n\n    fn add_prefix_term(\n        &mut self,\n        term: Term,\n        max_expansions: u32,\n        position_needed: PositionNeeded,\n    ) {\n        let field = term.field();\n        let (start, end) = prefix_term_to_range(term);\n        let term_range = TermRange {\n            start,\n            end,\n            limit: Some(max_expansions as u64),\n        };\n        *self\n            .term_ranges_to_warm_up\n            .entry(field)\n            .or_default()\n            .entry(term_range)\n            .or_default() |= position_needed;\n    }\n\n    fn add_automaton(&mut self, field: Field, automaton: Automaton) {\n        self.automatons_to_warm_up\n            .entry(field)\n            .or_default()\n            .insert(automaton);\n    }\n}\n\nimpl<'a, 'b: 'a> QueryAstVisitor<'a> for ExtractPrefixTermRanges<'b> {\n    type Err = InvalidQuery;\n\n    fn visit_full_text(&mut self, full_text_query: &'a FullTextQuery) -> Result<(), Self::Err> {\n        if let 
Some(prefix_term) =\n            full_text_query.get_prefix_term(self.schema, self.tokenizer_manager)\n        {\n            // the max_expansions parameter of a bool prefix query is used for the fuzzy part of the\n            // query, not for the expansion to a range request.\n            // see https://github.com/elastic/elasticsearch/blob/6ad48306d029e6e527c0481e2e9880bd2f06b239/docs/reference/query-dsl/match-bool-prefix-query.asciidoc#parameters\n            self.add_prefix_term(prefix_term, u32::MAX, false);\n        }\n        Ok(())\n    }\n\n    fn visit_phrase_prefix(\n        &mut self,\n        phrase_prefix: &'a PhrasePrefixQuery,\n    ) -> Result<(), Self::Err> {\n        let terms = match phrase_prefix.get_terms(self.schema, self.tokenizer_manager) {\n            Ok((_, terms)) => terms,\n            Err(InvalidQuery::SchemaError(_)) | Err(InvalidQuery::FieldDoesNotExist { .. }) => {\n                return Ok(());\n            } /* the query will be nullified when casting to a tantivy ast */\n            Err(e) => return Err(e),\n        };\n        if let Some((_, term)) = terms.last() {\n            self.add_prefix_term(term.clone(), phrase_prefix.max_expansions, terms.len() > 1);\n        }\n        Ok(())\n    }\n\n    fn visit_wildcard(&mut self, wildcard_query: &'a WildcardQuery) -> Result<(), Self::Err> {\n        let (field, path, regex) =\n            match wildcard_query.to_regex(self.schema, self.tokenizer_manager) {\n                Ok(res) => res,\n                /* the query will be nullified when casting to a tantivy ast */\n                Err(InvalidQuery::FieldDoesNotExist { .. 
}) => return Ok(()),\n                Err(e) => return Err(e),\n            };\n\n        self.add_automaton(field, Automaton::Regex(path, regex));\n        Ok(())\n    }\n\n    fn visit_regex(&mut self, regex_query: &'a RegexQuery) -> Result<(), Self::Err> {\n        let (field, path, regex) = match regex_query.to_field_and_regex(self.schema) {\n            Ok(res) => res,\n            /* the query will be nullified when casting to a tantivy ast */\n            Err(InvalidQuery::FieldDoesNotExist { .. }) => return Ok(()),\n            Err(e) => return Err(e),\n        };\n        self.add_automaton(field, Automaton::Regex(path, regex));\n        Ok(())\n    }\n}\n\ntype TermRangeWarmupInfo = HashMap<Field, HashMap<TermRange, PositionNeeded>>;\ntype AutomatonWarmupInfo = HashMap<Field, HashSet<Automaton>>;\n\nfn extract_prefix_term_ranges_and_automaton(\n    query_ast: &QueryAst,\n    schema: &Schema,\n    tokenizer_manager: &TokenizerManager,\n) -> anyhow::Result<(TermRangeWarmupInfo, AutomatonWarmupInfo)> {\n    let mut visitor = ExtractPrefixTermRanges::with_schema(schema, tokenizer_manager);\n    visitor.visit(query_ast)?;\n    Ok((\n        visitor.term_ranges_to_warm_up,\n        visitor.automatons_to_warm_up,\n    ))\n}\n\n#[cfg(test)]\nmod test {\n    use std::ops::Bound;\n\n    use quickwit_common::shared_consts::FIELD_PRESENCE_FIELD_NAME;\n    use quickwit_query::query_ast::{\n        BuildTantivyAstContext, FullTextMode, FullTextParams, PhrasePrefixQuery, QueryAstVisitor,\n        UserInputQuery, query_ast_from_user_text,\n    };\n    use quickwit_query::{\n        BooleanOperand, MatchAllOrNone, create_default_quickwit_tokenizer_manager,\n    };\n    use tantivy::Term;\n    use tantivy::schema::{DateOptions, DateTimePrecision, FAST, INDEXED, STORED, Schema, TEXT};\n\n    use super::{ExtractPrefixTermRanges, build_query};\n    use crate::{DYNAMIC_FIELD_NAME, SOURCE_FIELD_NAME, TermRange};\n\n    enum TestExpectation<'a> {\n        Err(&'a str),\n        
Ok(&'a str),\n    }\n\n    fn make_schema(dynamic_mode: bool) -> Schema {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_i64_field(FIELD_PRESENCE_FIELD_NAME, INDEXED);\n        schema_builder.add_text_field(\"title\", TEXT);\n        schema_builder.add_text_field(\"desc\", TEXT | STORED);\n        schema_builder.add_text_field(\"server.name\", TEXT | STORED);\n        schema_builder.add_text_field(\"server.mem\", TEXT);\n        schema_builder.add_bool_field(\"server.running\", FAST | STORED | INDEXED);\n        schema_builder.add_text_field(SOURCE_FIELD_NAME, TEXT);\n        schema_builder.add_ip_addr_field(\"ip\", FAST | STORED);\n        schema_builder.add_ip_addr_field(\"ips\", FAST);\n        schema_builder.add_ip_addr_field(\"ip_notff\", STORED);\n        let date_options = DateOptions::default()\n            .set_fast()\n            .set_precision(DateTimePrecision::Milliseconds);\n        schema_builder.add_date_field(\"dt\", date_options);\n        schema_builder.add_u64_field(\"u64_fast\", FAST | STORED);\n        schema_builder.add_i64_field(\"i64_fast\", FAST | STORED);\n        schema_builder.add_f64_field(\"f64_fast\", FAST | STORED);\n        schema_builder.add_json_field(\"json_fast\", FAST);\n        schema_builder.add_json_field(\"json_text\", TEXT);\n        if dynamic_mode {\n            schema_builder.add_json_field(DYNAMIC_FIELD_NAME, TEXT);\n        }\n        schema_builder.build()\n    }\n\n    #[track_caller]\n    fn check_build_query_dynamic_mode(\n        user_query: &str,\n        search_fields: Vec<String>,\n        expected: TestExpectation,\n    ) {\n        check_build_query(user_query, search_fields, expected, true, false);\n    }\n\n    #[track_caller]\n    fn check_build_query_static_mode(\n        user_query: &str,\n        search_fields: Vec<String>,\n        expected: TestExpectation,\n    ) {\n        check_build_query(user_query, search_fields, expected, false, false);\n    }\n\n    
#[track_caller]\n    fn check_build_query_static_lenient_mode(\n        user_query: &str,\n        search_fields: Vec<String>,\n        expected: TestExpectation,\n    ) {\n        check_build_query(user_query, search_fields, expected, false, true);\n    }\n\n    fn test_build_query(\n        user_query: &str,\n        search_fields: Vec<String>,\n        dynamic_mode: bool,\n        lenient: bool,\n    ) -> Result<String, String> {\n        let user_input_query = UserInputQuery {\n            user_text: user_query.to_string(),\n            default_fields: Some(search_fields),\n            default_operator: BooleanOperand::And,\n            lenient,\n        };\n        let query_ast = user_input_query\n            .parse_user_query(&[])\n            .map_err(|err| err.to_string())?;\n        let schema = make_schema(dynamic_mode);\n        let query_result = build_query(query_ast, &BuildTantivyAstContext::for_test(&schema), None);\n        query_result\n            .map(|query| format!(\"{query:?}\"))\n            .map_err(|err| err.to_string())\n    }\n\n    #[track_caller]\n    fn check_build_query(\n        user_query: &str,\n        search_fields: Vec<String>,\n        expected: TestExpectation,\n        dynamic_mode: bool,\n        lenient: bool,\n    ) {\n        let query_result = test_build_query(user_query, search_fields, dynamic_mode, lenient);\n        match (query_result, expected) {\n            (Err(query_err_msg), TestExpectation::Err(sub_str)) => {\n                assert!(\n                    query_err_msg.contains(sub_str),\n                    \"query error received is {query_err_msg}. 
it should contain {sub_str}\"\n                );\n            }\n            (Ok(query_str), TestExpectation::Ok(sub_str)) => {\n                assert!(\n                    query_str.contains(sub_str),\n                    \"error query parsing {query_str} should contain {sub_str}\"\n                );\n            }\n            (Err(error_msg), TestExpectation::Ok(expectation)) => {\n                panic!(\"Expected `{expectation}` but got an error `{error_msg}`.\");\n            }\n            (Ok(query_str), TestExpectation::Err(expected_error)) => {\n                panic!(\"Expected the error `{expected_error}`, but got a success `{query_str}`\");\n            }\n        }\n    }\n\n    #[test]\n    fn test_build_query_dynamic_field() {\n        check_build_query_dynamic_mode(\"*\", Vec::new(), TestExpectation::Ok(\"All\"));\n        check_build_query_dynamic_mode(\n            \"foo:bar\",\n            Vec::new(),\n            TestExpectation::Ok(\n                r#\"TermQuery(Term(field=16, type=Json, path=foo, type=Str, \"bar\"))\"#,\n            ),\n        );\n        check_build_query_dynamic_mode(\n            \"server.type:hpc server.mem:4GB\",\n            Vec::new(),\n            TestExpectation::Ok(\"server.type\"),\n        );\n        check_build_query_dynamic_mode(\n            \"title:[a TO b]\",\n            Vec::new(),\n            TestExpectation::Err(\n                \"range queries are only supported for fast fields. (`title` is not a fast field)\",\n            ),\n        );\n        check_build_query_dynamic_mode(\n            \"title:{a TO b} desc:foo\",\n            Vec::new(),\n            TestExpectation::Err(\n                \"range queries are only supported for fast fields. 
(`title` is not a fast field)\",\n            ),\n        );\n    }\n\n    #[test]\n    fn test_build_query_not_dynamic_mode() {\n        check_build_query_static_mode(\"*\", Vec::new(), TestExpectation::Ok(\"All\"));\n        check_build_query_static_mode(\n            \"foo:bar\",\n            Vec::new(),\n            TestExpectation::Err(\"invalid query: field does not exist: `foo`\"),\n        );\n        check_build_query_static_lenient_mode(\n            \"foo:bar\",\n            Vec::new(),\n            TestExpectation::Ok(\"EmptyQuery\"),\n        );\n        check_build_query_static_mode(\n            \"title:bar\",\n            Vec::new(),\n            TestExpectation::Ok(r#\"TermQuery(Term(field=1, type=Str, \"bar\"))\"#),\n        );\n        check_build_query_static_mode(\n            \"bar\",\n            vec![\"fieldnotinschema\".to_string()],\n            TestExpectation::Err(\"invalid query: field does not exist: `fieldnotinschema`\"),\n        );\n        check_build_query_static_lenient_mode(\n            \"bar\",\n            vec![\"fieldnotinschema\".to_string()],\n            TestExpectation::Ok(\"EmptyQuery\"),\n        );\n        check_build_query_static_mode(\n            \"title:[a TO b]\",\n            Vec::new(),\n            TestExpectation::Err(\n                \"range queries are only supported for fast fields. (`title` is not a fast field)\",\n            ),\n        );\n        check_build_query_static_mode(\n            \"title:{a TO b} desc:foo\",\n            Vec::new(),\n            TestExpectation::Err(\n                \"range queries are only supported for fast fields. (`title` is not a fast field)\",\n            ),\n        );\n        check_build_query_static_mode(\n            \"title:>foo\",\n            Vec::new(),\n            TestExpectation::Err(\n                \"range queries are only supported for fast fields. 
(`title` is not a fast field)\",\n            ),\n        );\n        check_build_query_static_mode(\n            \"title:foo desc:bar _source:baz\",\n            Vec::new(),\n            TestExpectation::Ok(\"TermQuery\"),\n        );\n        check_build_query_static_mode(\n            \"server.name:\\\".bar:\\\" server.mem:4GB\",\n            vec![\"server.name\".to_string()],\n            TestExpectation::Ok(\"TermQuery\"),\n        );\n        check_build_query_static_mode(\n            \"server.name:\\\"for.bar:b\\\" server.mem:4GB\",\n            Vec::new(),\n            TestExpectation::Ok(\"TermQuery\"),\n        );\n        check_build_query_static_mode(\n            \"foo\",\n            Vec::new(),\n            TestExpectation::Err(\"query requires a default search field and none was supplied\"),\n        );\n        check_build_query_static_mode(\n            \"bar\",\n            Vec::new(),\n            TestExpectation::Err(\"query requires a default search field and none was supplied\"),\n        );\n        check_build_query_static_mode(\n            \"title:hello AND (Jane OR desc:world)\",\n            Vec::new(),\n            TestExpectation::Err(\"query requires a default search field and none was supplied\"),\n        );\n        check_build_query_static_mode(\n            \"server.running:true\",\n            Vec::new(),\n            TestExpectation::Ok(\"TermQuery\"),\n        );\n        check_build_query_static_mode(\n            \"title: IN [hello]\",\n            Vec::new(),\n            TestExpectation::Ok(\"TermSetQuery\"),\n        );\n        check_build_query_static_mode(\n            \"IN [hello]\",\n            Vec::new(),\n            TestExpectation::Err(\"set query need to target a specific field\"),\n        );\n    }\n\n    #[test]\n    fn test_wildcard_query() {\n        check_build_query_static_mode(\"title:hello*\", Vec::new(), TestExpectation::Ok(\"Regex\"));\n        check_build_query_static_mode(\n            
\"title:\\\"hello world\\\"*\",\n            Vec::new(),\n            TestExpectation::Ok(\"PhrasePrefixQuery\"),\n        );\n        // the tokenizer removes '*' chars, making it a simple PhraseQuery (not RegexPhraseQuery)\n        check_build_query_static_mode(\n            \"title:\\\"hello* world*\\\"\",\n            Vec::new(),\n            TestExpectation::Ok(\"PhraseQuery\"),\n        );\n        check_build_query_static_mode(\n            \"foo:bar*\",\n            Vec::new(),\n            TestExpectation::Err(\"invalid query: field does not exist: `foo`\"),\n        );\n        check_build_query_static_mode(\"title:hello*yo\", Vec::new(), TestExpectation::Ok(\"Regex\"));\n    }\n\n    #[test]\n    fn test_existence_query() {\n        check_build_query_static_mode(\n            \"title:*\",\n            Vec::new(),\n            TestExpectation::Ok(\"TermQuery(Term(field=0, type=U64\"),\n        );\n\n        check_build_query_static_mode(\n            \"ip:*\",\n            Vec::new(),\n            TestExpectation::Ok(\"ExistsQuery { field_name: \\\"ip\\\", json_subpaths: true }\"),\n        );\n        check_build_query_static_mode(\n            \"json_text:*\",\n            Vec::new(),\n            TestExpectation::Ok(\"TermSetQuery\"),\n        );\n        check_build_query_static_mode(\n            \"json_fast:*\",\n            Vec::new(),\n            TestExpectation::Ok(\"ExistsQuery { field_name: \\\"json_fast\\\", json_subpaths: true }\"),\n        );\n        check_build_query_static_mode(\n            \"foo:*\",\n            Vec::new(),\n            TestExpectation::Err(\"invalid query: field does not exist: `foo`\"),\n        );\n        check_build_query_static_mode(\n            \"server:*\",\n            Vec::new(),\n            TestExpectation::Ok(\"BooleanQuery { subqueries: [(Should, TermQuery(Term\"),\n        );\n    }\n\n    #[test]\n    fn test_datetime_range_query() {\n        {\n            // Check range on datetime in millisecond, 
precision has no impact as it is in\n            // milliseconds.\n            let start_date_time_str = \"2023-01-10T08:38:51.150Z\";\n            let end_date_time_str = \"2023-01-10T08:38:51.160Z\";\n            check_build_query_static_mode(\n                &format!(\"dt:[{start_date_time_str} TO {end_date_time_str}]\"),\n                Vec::new(),\n                TestExpectation::Ok(\"2023-01-10T08:38:51.15Z\"),\n            );\n            check_build_query_static_mode(\n                &format!(\"dt:[{start_date_time_str} TO {end_date_time_str}]\"),\n                Vec::new(),\n                TestExpectation::Ok(\"RangeQuery\"),\n            );\n            check_build_query_static_mode(\n                &format!(\"dt:<{end_date_time_str}\"),\n                Vec::new(),\n                TestExpectation::Ok(\"lower_bound: Unbounded\"),\n            );\n            check_build_query_static_mode(\n                &format!(\"dt:<{end_date_time_str}\"),\n                Vec::new(),\n                TestExpectation::Ok(\"upper_bound: Excluded\"),\n            );\n            check_build_query_static_mode(\n                &format!(\"dt:<{end_date_time_str}\"),\n                Vec::new(),\n                TestExpectation::Ok(\"2023-01-10T08:38:51.16Z\"),\n            );\n        }\n\n        // Check range on datetime in microseconds and truncation to milliseconds.\n        {\n            let start_date_time_str = \"2023-01-10T08:38:51.000150Z\";\n            let end_date_time_str = \"2023-01-10T08:38:51.000151Z\";\n            check_build_query_static_mode(\n                &format!(\"dt:[{start_date_time_str} TO {end_date_time_str}]\"),\n                Vec::new(),\n                TestExpectation::Ok(\"2023-01-10T08:38:51Z\"),\n            );\n        }\n    }\n\n    #[test]\n    fn test_ip_range_query() {\n        check_build_query_static_mode(\n            \"ip:[127.0.0.1 TO 127.1.1.1]\",\n            Vec::new(),\n            TestExpectation::Ok(\n      
          \"RangeQuery { bounds: BoundsRange { lower_bound: Included(Term(field=7, \\\n                 type=IpAddr, ::ffff:127.0.0.1)), upper_bound: Included(Term(field=7, \\\n                 type=IpAddr, ::ffff:127.1.1.1)) } }\",\n            ),\n        );\n        check_build_query_static_mode(\n            \"ip:>127.0.0.1\",\n            Vec::new(),\n            TestExpectation::Ok(\n                \"RangeQuery { bounds: BoundsRange { lower_bound: Excluded(Term(field=7, \\\n                 type=IpAddr, ::ffff:127.0.0.1)), upper_bound: Unbounded } }\",\n            ),\n        );\n    }\n\n    #[test]\n    fn test_f64_range_query() {\n        check_build_query_static_mode(\n            \"f64_fast:[7.7 TO 77.7]\",\n            Vec::new(),\n            TestExpectation::Ok(\n                r#\"RangeQuery { bounds: BoundsRange { lower_bound: Included(Term(field=13, type=F64, 7.7)), upper_bound: Included(Term(field=13, type=F64, 77.7)) } }\"#,\n            ),\n        );\n        check_build_query_static_mode(\n            \"f64_fast:>7\",\n            Vec::new(),\n            TestExpectation::Ok(\n                r#\"RangeQuery { bounds: BoundsRange { lower_bound: Excluded(Term(field=13, type=F64, 7.0)), upper_bound: Unbounded } }\"#,\n            ),\n        );\n    }\n\n    #[test]\n    fn test_i64_range_query() {\n        check_build_query_static_mode(\n            \"i64_fast:[-7 TO 77]\",\n            Vec::new(),\n            TestExpectation::Ok(r#\"field=12\"#),\n        );\n        check_build_query_static_mode(\n            \"i64_fast:>7\",\n            Vec::new(),\n            TestExpectation::Ok(r#\"field=12\"#),\n        );\n    }\n\n    #[test]\n    fn test_u64_range_query() {\n        check_build_query_static_mode(\n            \"u64_fast:[7 TO 77]\",\n            Vec::new(),\n            TestExpectation::Ok(r#\"field=11,\"#),\n        );\n        check_build_query_static_mode(\n            \"u64_fast:>7\",\n            Vec::new(),\n            
TestExpectation::Ok(r#\"field=11,\"#),\n        );\n    }\n\n    #[test]\n    fn test_range_query_ip_fields_multivalued() {\n        check_build_query_static_mode(\n            \"ips:[127.0.0.1 TO 127.1.1.1]\",\n            Vec::new(),\n            TestExpectation::Ok(\n                \"RangeQuery { bounds: BoundsRange { lower_bound: Included(Term(field=8, \\\n                 type=IpAddr, ::ffff:127.0.0.1)), upper_bound: Included(Term(field=8, \\\n                 type=IpAddr, ::ffff:127.1.1.1)) } }\",\n            ),\n        );\n    }\n\n    #[test]\n    fn test_range_query_no_fast_field() {\n        check_build_query_static_mode(\n            \"ip_notff:[127.0.0.1 TO 127.1.1.1]\",\n            Vec::new(),\n            TestExpectation::Err(\"`ip_notff` is not a fast field\"),\n        );\n    }\n\n    #[test]\n    fn test_build_query_not_bool_should_fail() {\n        check_build_query_static_mode(\n            \"server.running:notabool\",\n            Vec::new(),\n            TestExpectation::Err(\"expected a `bool` search value for field `server.running`\"),\n        );\n    }\n\n    #[test]\n    fn test_build_query_warmup_info() {\n        let query_with_set = query_ast_from_user_text(\"desc: IN [hello]\", None)\n            .parse_user_query(&[])\n            .unwrap();\n        let query_without_set = query_ast_from_user_text(\"desc:hello\", None)\n            .parse_user_query(&[])\n            .unwrap();\n\n        let schema = make_schema(true);\n        let context = BuildTantivyAstContext::for_test(&schema);\n\n        let (_, warmup_info) = build_query(query_with_set, &context, None).unwrap();\n        assert_eq!(warmup_info.term_dict_fields.len(), 1);\n        assert!(\n            warmup_info\n                .term_dict_fields\n                .contains(&tantivy::schema::Field::from_field_id(2))\n        );\n\n        let (_, warmup_info) = build_query(query_without_set, &context, None).unwrap();\n        
assert!(warmup_info.term_dict_fields.is_empty());\n    }\n\n    #[test]\n    fn test_extract_phrase_prefix_position_required() {\n        let schema = make_schema(false);\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n\n        let params = FullTextParams {\n            tokenizer: None,\n            mode: FullTextMode::Phrase { slop: 0 },\n            zero_terms_query: MatchAllOrNone::MatchNone,\n        };\n        let short = PhrasePrefixQuery {\n            field: \"title\".to_string(),\n            phrase: \"short\".to_string(),\n            max_expansions: 50,\n            params: params.clone(),\n            lenient: false,\n        };\n        let long = PhrasePrefixQuery {\n            field: \"title\".to_string(),\n            phrase: \"not so short\".to_string(),\n            max_expansions: 50,\n            params: params.clone(),\n            lenient: false,\n        };\n        let mut extractor1 = ExtractPrefixTermRanges::with_schema(&schema, &tokenizer_manager);\n        extractor1.visit_phrase_prefix(&short).unwrap();\n        extractor1.visit_phrase_prefix(&long).unwrap();\n\n        let mut extractor2 = ExtractPrefixTermRanges::with_schema(&schema, &tokenizer_manager);\n        extractor2.visit_phrase_prefix(&long).unwrap();\n        extractor2.visit_phrase_prefix(&short).unwrap();\n\n        assert_eq!(\n            extractor1.term_ranges_to_warm_up,\n            extractor2.term_ranges_to_warm_up\n        );\n\n        let field = tantivy::schema::Field::from_field_id(1);\n        let mut expected_inner = std::collections::HashMap::new();\n        expected_inner.insert(\n            TermRange {\n                start: Bound::Included(Term::from_field_text(field, \"short\")),\n                end: Bound::Excluded(Term::from_field_text(field, \"shoru\")),\n                limit: Some(50),\n            },\n            true,\n        );\n        let mut expected = std::collections::HashMap::new();\n        
expected.insert(field, expected_inner);\n        assert_eq!(extractor1.term_ranges_to_warm_up, expected);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/routing_expression/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::fmt::{self, Display};\nuse std::hash::{Hash, Hasher};\nuse std::str::FromStr;\nuse std::sync::Arc;\n\npub(crate) use expression_dsl::parse_field_name;\nuse serde_json::Value as JsonValue;\nuse siphasher::sip::SipHasher;\n\npub trait RoutingExprContext {\n    fn hash_attribute<H: Hasher>(&self, attr_name: &[String], hasher: &mut H);\n}\n\n/// This is a bit overkill but this function has the merit of\n/// ensuring that the data that is sent to the hasher is unique\n/// to the value, so we do not lose injectivity there.\nfn hash_json_val<H: Hasher>(json_val: &JsonValue, hasher: &mut H) {\n    match json_val {\n        JsonValue::Null => {\n            hasher.write_u8(0u8);\n        }\n        JsonValue::Bool(bool_val) => {\n            hasher.write_u8(1u8);\n            bool_val.hash(hasher);\n        }\n        JsonValue::Number(num) => {\n            hasher.write_u8(2u8);\n            num.hash(hasher);\n        }\n        JsonValue::String(s) => {\n            hasher.write_u8(3u8);\n            hasher.write_u64(s.len() as u64);\n            hasher.write(s.as_bytes());\n        }\n        JsonValue::Array(arr) => {\n            hasher.write_u8(4u8);\n            hasher.write_u64(arr.len() as u64);\n            for el in arr {\n                hash_json_val(el, hasher);\n            }\n        }\n        
JsonValue::Object(obj) => {\n            hasher.write_u8(5u8);\n            hasher.write_u64(obj.len() as u64);\n            for (key, val) in obj.iter() {\n                hasher.write_u64(key.len() as u64);\n                hasher.write(key.as_bytes());\n                hash_json_val(val, hasher);\n            }\n        }\n    }\n}\n\nfn find_value<'a>(mut root: &'a JsonValue, keys: &[String]) -> Option<&'a JsonValue> {\n    for key in keys {\n        match root {\n            JsonValue::Object(obj) => {\n                root = obj.get(key)?;\n            }\n            _ => return None,\n        }\n    }\n    Some(root)\n}\n\nfn find_value_in_map<'a>(\n    obj: &'a serde_json::Map<String, JsonValue>,\n    keys: &[String],\n) -> Option<&'a JsonValue> {\n    // The path is never empty and this is used only for the root map, so indexing `keys[0]`\n    // cannot go out of bounds.\n    if let Some(value) = obj.get(&keys[0]) {\n        find_value(value, &keys[1..])\n    } else {\n        None\n    }\n}\n\nimpl RoutingExprContext for serde_json::Map<String, JsonValue> {\n    fn hash_attribute<H: Hasher>(&self, attr_name: &[String], hasher: &mut H) {\n        if let Some(json_val) = find_value_in_map(self, attr_name) {\n            hasher.write_u8(1u8);\n            hash_json_val(json_val, hasher);\n        } else {\n            hasher.write_u8(0u8);\n        }\n    }\n}\n\n/// Defines a routing expression.\n#[derive(Clone, Default)]\npub struct RoutingExpr {\n    inner_opt: Option<Arc<InnerRoutingExpr>>,\n    salted_hasher: SipHasher,\n}\n\nimpl RoutingExpr {\n    /// Constructs a routing expression from an expression DSL string.\n    pub fn new(expr_dsl_str: &str) -> anyhow::Result<Self> {\n        let expr_dsl_str = expr_dsl_str.trim();\n        if expr_dsl_str.is_empty() {\n            return Ok(RoutingExpr::default());\n        }\n\n        let mut salted_hasher: SipHasher = SipHasher::new();\n\n        let inner: InnerRoutingExpr = 
InnerRoutingExpr::from_str(expr_dsl_str)?;\n        // We hash the expression tree here, rather than the input string or the\n        // display of the tree, in order to make the partition id less brittle to\n        // a minor change in formatting, or a change in the DSL itself.\n        //\n        // We do not use the standard library `DefaultHasher`, to make sure we\n        // always get the same hash values.\n        inner.hash(&mut salted_hasher);\n\n        Ok(RoutingExpr {\n            inner_opt: Some(Arc::new(inner)),\n            salted_hasher,\n        })\n    }\n\n    /// Evaluates the expression applied to the given\n    /// context and returns a u64 hash.\n    ///\n    /// Obviously this function is not perfectly injective.\n    pub fn eval_hash<Ctx: RoutingExprContext>(&self, ctx: &Ctx) -> u64 {\n        if let Some(inner) = self.inner_opt.as_ref() {\n            let mut hasher: SipHasher = self.salted_hasher;\n            inner.eval_hash(ctx, &mut hasher);\n            hasher.finish()\n        } else {\n            0u64\n        }\n    }\n\n    /// Returns all field names in a vector.\n    pub fn field_names(&self) -> Vec<String> {\n        if let Some(inner) = self.inner_opt.as_ref() {\n            inner.field_names()\n        } else {\n            Vec::new()\n        }\n    }\n}\n\nimpl Display for RoutingExpr {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        if let Some(inner_expr) = self.inner_opt.as_ref() {\n            inner_expr.fmt(f)\n        } else {\n            write!(f, \"\")\n        }\n    }\n}\n\n#[derive(Clone, Debug, Eq, PartialEq)]\nenum InnerRoutingExpr {\n    Field(Vec<String>),\n    Composite(Vec<InnerRoutingExpr>),\n    Modulo(Box<InnerRoutingExpr>, u64),\n    // TODO Enrich me! 
Map / ...\n}\n\nimpl InnerRoutingExpr {\n    fn eval_hash<Ctx: RoutingExprContext, H: Hasher + Default>(&self, ctx: &Ctx, hasher: &mut H) {\n        match self {\n            InnerRoutingExpr::Field(field_name) => {\n                ExprType::Field.hash(hasher);\n                ctx.hash_attribute(field_name, hasher);\n            }\n            InnerRoutingExpr::Composite(children) => {\n                ExprType::Composite.hash(hasher);\n                for child in children {\n                    child.eval_hash(ctx, hasher);\n                }\n            }\n            InnerRoutingExpr::Modulo(inner_expr, modulo) => {\n                ExprType::Modulo.hash(hasher);\n\n                let mut sub_hasher = H::default();\n                inner_expr.eval_hash(ctx, &mut sub_hasher);\n                hasher.write_u64(sub_hasher.finish() % modulo);\n            }\n        }\n    }\n\n    // return all fields in a vector\n    fn field_names(&self) -> Vec<String> {\n        match self {\n            InnerRoutingExpr::Field(field_name) => vec![field_name.join(\".\")],\n            InnerRoutingExpr::Composite(children) => {\n                let mut fields = Vec::new();\n                for child in children {\n                    fields.extend(child.field_names());\n                }\n                fields\n            }\n            InnerRoutingExpr::Modulo(inner_expr, _) => inner_expr.field_names(),\n        }\n    }\n}\n\n// We don't rely on Derive here to make it easier to keep the\n// implementation stable.\n#[allow(clippy::derived_hash_with_manual_eq)]\nimpl Hash for InnerRoutingExpr {\n    fn hash<H: Hasher>(&self, hasher: &mut H) {\n        match self {\n            InnerRoutingExpr::Field(field_name) => {\n                ExprType::Field.hash(hasher);\n                hasher.write_u64(field_name.len() as u64);\n                for (index, field) in field_name.iter().enumerate() {\n                    if index != 0 {\n                        
hasher.write_u8(b'.');\n                    }\n                    hasher.write(field.as_bytes());\n                }\n            }\n            InnerRoutingExpr::Composite(children) => {\n                ExprType::Composite.hash(hasher);\n                for child in children {\n                    child.hash(hasher);\n                }\n            }\n            InnerRoutingExpr::Modulo(inner_expr, modulo) => {\n                ExprType::Modulo.hash(hasher);\n                inner_expr.hash(hasher);\n                hasher.write_u64(*modulo);\n            }\n        }\n    }\n}\n\nimpl Default for InnerRoutingExpr {\n    fn default() -> InnerRoutingExpr {\n        InnerRoutingExpr::Composite(Vec::new())\n    }\n}\n\nimpl FromStr for InnerRoutingExpr {\n    type Err = anyhow::Error;\n\n    fn from_str(expr_dsl_str: &str) -> anyhow::Result<Self> {\n        let ast = expression_dsl::parse_expression(expr_dsl_str)?;\n\n        convert_ast(ast)\n    }\n}\n\nfn convert_ast(ast: Vec<expression_dsl::ExpressionAst>) -> anyhow::Result<InnerRoutingExpr> {\n    use expression_dsl::{Argument, ExpressionAst};\n\n    let mut result = ast\n        .into_iter()\n        .map(|ast_elem| match ast_elem {\n            ExpressionAst::Field(field_name) => {\n                let field_path = expression_dsl::parse_field_name(&field_name)?\n                    .into_iter()\n                    .map(Cow::into_owned)\n                    .collect();\n                Ok(InnerRoutingExpr::Field(field_path))\n            }\n            ExpressionAst::Function { name, mut args } => match &*name {\n                \"hash_mod\" => {\n                    if args.len() != 2 {\n                        anyhow::bail!(\n                            \"invalid arguments for `hash_mod`: expected 2 arguments, found {}\",\n                            args.len()\n                        );\n                    }\n\n                    let Argument::Expression(fields) = args.remove(0) else {\n               
         anyhow::bail!(\"invalid 1st argument for `hash_mod`: expected expression\");\n                    };\n\n                    let Argument::Number(modulo) = args.remove(0) else {\n                        anyhow::bail!(\"invalid 2nd argument for `hash_mod`: expected number\");\n                    };\n\n                    if modulo == 0 {\n                        // A zero modulo would panic in `eval_hash` when computing `% modulo`.\n                        anyhow::bail!(\"invalid 2nd argument for `hash_mod`: must be non-zero\");\n                    }\n\n                    Ok(InnerRoutingExpr::Modulo(\n                        Box::new(convert_ast(fields)?),\n                        modulo,\n                    ))\n                }\n                _ => anyhow::bail!(\"unknown function `{}`\", name),\n            },\n        })\n        .collect::<Result<Vec<_>, _>>()?;\n    if result.is_empty() {\n        Ok(InnerRoutingExpr::default())\n    } else if result.len() == 1 {\n        Ok(result.remove(0))\n    } else {\n        Ok(InnerRoutingExpr::Composite(result))\n    }\n}\n\n// The display implementation should be consistent with `FromStr`.\nimpl Display for InnerRoutingExpr {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        match &self {\n            InnerRoutingExpr::Field(field) => {\n                for (index, part) in field.iter().enumerate() {\n                    if index != 0 {\n                        f.write_str(\".\")?;\n                    }\n                    f.write_str(&part.replace('.', r\"\\.\"))?;\n                }\n            }\n            InnerRoutingExpr::Composite(children) => {\n                if children.is_empty() {\n                    return Ok(());\n                }\n                children[0].fmt(f)?;\n                for child in &children[1..] 
{\n                    write!(f, \",{child}\")?;\n                }\n            }\n            InnerRoutingExpr::Modulo(inner_expr, modulo) => {\n                write!(f, \"hash_mod(({inner_expr}), {modulo})\")?;\n            }\n        }\n        Ok(())\n    }\n}\n\n#[derive(Hash)]\n#[repr(u8)]\nenum ExprType {\n    Field,\n    Composite,\n    Modulo,\n}\n\nmod expression_dsl {\n    use std::borrow::Cow;\n\n    use nom::bytes::complete::{escaped, tag};\n    use nom::character::complete::multispace0;\n    use nom::combinator::{eof, map, opt};\n    use nom::error::ErrorKind;\n    use nom::multi::separated_list0;\n    use nom::sequence::delimited;\n    use nom::{AsChar, Finish, IResult, Input, Parser};\n\n    // This is a RoutingSubExpr in our DSL.\n    #[derive(Debug, PartialEq, Eq, Clone)]\n    pub(crate) enum ExpressionAst {\n        Field(String),\n        Function { name: String, args: Vec<Argument> },\n    }\n\n    #[derive(Debug, PartialEq, Eq, Clone)]\n    pub(crate) enum Argument {\n        Expression(Vec<ExpressionAst>),\n        Number(u64),\n    }\n\n    pub(crate) fn parse_expression(expr_dsl_str: &str) -> anyhow::Result<Vec<ExpressionAst>> {\n        let (i, res) = routing_expr(expr_dsl_str)\n            .finish()\n            .map_err(|e| anyhow::anyhow!(\"error parsing routing expression: {e}\"))?;\n        eof::<_, ()>(i)?;\n\n        Ok(res)\n    }\n\n    // Like `tag`, but ignores leading and trailing whitespace.\n    pub fn wtag<'a, Error: nom::error::ParseError<&'a str>>(\n        t: &'a str,\n    ) -> impl Parser<&'a str, Output = &'a str, Error = Error> {\n        delimited(multispace0, tag(t), multispace0)\n    }\n\n    // DSL:\n    //\n    // RoutingExpr := RoutingSubExpr [ , RoutingExpr ]\n    // RoutingSubExpr := Identifier [ \\( Arguments \\) ]\n    // Identifier := FieldChar [ Identifier ]\n    // FieldChar := { a..z | A..Z | 0..9 | _ | - | . 
| \\ | / | @ | $ }\n    // Arguments := Argument [ , Arguments ]\n    // Argument := { \\( RoutingExpr \\) | RoutingSubExpr | DirectValue }\n    // # We may want other DirectValue in the future\n    // DirectValue := Number\n    // Number := { 0..9 } [ Number ]\n\n    /// An entire routing expression, containing comma separated routing sub-expressions\n    fn routing_expr(input: &str) -> IResult<&str, Vec<ExpressionAst>> {\n        separated_list0(wtag(\",\"), routing_sub_expr).parse(input)\n    }\n\n    /// A sub-part of a routing expression\n    fn routing_sub_expr(input: &str) -> IResult<&str, ExpressionAst> {\n        let (input, identifier) = identifier(input)?;\n        let (input, args) = opt((wtag(\"(\"), arguments, wtag(\")\"))).parse(input)?;\n        let res = if let Some((_, args, _)) = args {\n            ExpressionAst::Function {\n                name: identifier.to_owned(),\n                args,\n            }\n        } else {\n            ExpressionAst::Field(identifier.to_owned())\n        };\n        Ok((input, res))\n    }\n\n    /// An identifier, it can be either a field name, or a function name. 
It's returned as is,\n    /// without de-escaping.\n    fn identifier(input: &str) -> IResult<&str, &str> {\n        input.split_at_position1_complete(\n            |item| !(item.is_alphanum() || ['_', '-', '.', '\\\\', '/', '@', '$'].contains(&item)),\n            ErrorKind::AlphaNumeric,\n        )\n    }\n\n    /// Arguments for a function\n    fn arguments(input: &str) -> IResult<&str, Vec<Argument>> {\n        separated_list0(wtag(\",\"), argument).parse(input)\n    }\n\n    /// A single argument for a function\n    fn argument(input: &str) -> IResult<&str, Argument> {\n        if let Ok((input, number)) = number(input) {\n            Ok((input, Argument::Number(number)))\n        } else if let Ok((input, (_, arg, _))) = (wtag(\"(\"), routing_expr, wtag(\")\")).parse(input) {\n            Ok((input, Argument::Expression(arg)))\n        } else {\n            routing_sub_expr(input).map(|(input, arg)| (input, Argument::Expression(vec![arg])))\n        }\n    }\n\n    /// A number\n    fn number(input: &str) -> IResult<&str, u64> {\n        nom::character::complete::u64(input)\n    }\n\n    // functions after this are meant to parse a field into its path component, de-escaping where\n    // appropriate\n\n    /// Parse part of a path component, stop at the first . or \\\n    fn key_identifier(input: &str) -> IResult<&str, &str> {\n        input.split_at_position1_complete(\n            |item| !(item.is_alphanum() || ['_', '-', '/', '@', '$'].contains(&item)),\n            ErrorKind::Fail,\n        )\n    }\n\n    /// Parse a single path component, separated by dots. 
De-escape any escaped dot it may contain.\n    fn escaped_key(input: &str) -> IResult<&str, Cow<'_, str>> {\n        map(escaped(key_identifier, '\\\\', tag(\".\")), |s: &str| {\n            if s.contains(\"\\\\.\") {\n                Cow::Owned(s.replace(\"\\\\.\", \".\"))\n            } else {\n                Cow::Borrowed(s)\n            }\n        })\n        .parse(input)\n    }\n\n    /// Parse a field name into a path, de-escaping where appropriate.\n    pub(crate) fn parse_field_name(input: &str) -> anyhow::Result<Vec<Cow<'_, str>>> {\n        let (i, res) = separated_list0(tag(\".\"), escaped_key)\n            .parse(input)\n            .finish()\n            .map_err(|e| anyhow::anyhow!(\"error parsing key expression: {e}\"))?;\n        eof::<_, ()>(i)?;\n        Ok(res)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashSet;\n\n    use super::*;\n\n    #[track_caller]\n    fn test_ser_deser(expr: &InnerRoutingExpr) {\n        let ser = expr.to_string();\n        assert_eq!(&InnerRoutingExpr::from_str(&ser).unwrap(), expr);\n    }\n\n    #[track_caller]\n    fn deser_util(expr_dsl: &str) -> InnerRoutingExpr {\n        let expr = InnerRoutingExpr::from_str(expr_dsl).unwrap();\n        test_ser_deser(&expr);\n        expr\n    }\n\n    #[test]\n    fn test_routing_expr_empty() {\n        let routing_expr = deser_util(\"\");\n        assert!(matches!(routing_expr, InnerRoutingExpr::Composite(leaves) if leaves.is_empty()));\n    }\n\n    #[test]\n    fn test_routing_expr_empty_hashes_to_0() {\n        let expr = RoutingExpr::new(\"\").unwrap();\n        let ctx: serde_json::Map<String, JsonValue> = Default::default();\n        assert_eq!(expr.eval_hash(&ctx), 0u64);\n    }\n\n    #[test]\n    fn test_routing_expr_single_field() {\n        let routing_expr = deser_util(\"tenant_id\");\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Field(vec![\"tenant_id\".to_owned()])\n        );\n    }\n\n    #[test]\n    fn 
test_routing_expr_single_field_special_char() {\n        let routing_expr = deser_util(r\"abCD01-_/@$\\.a.bc\");\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Field(vec![r\"abCD01-_/@$.a\".to_owned(), \"bc\".to_string()])\n        );\n    }\n\n    #[test]\n    fn test_routing_expr_single_field_with_dot() {\n        let routing_expr = deser_util(\"app.id\");\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Field(vec![\"app\".to_owned(), \"id\".to_owned()])\n        );\n    }\n\n    #[test]\n    fn test_routing_expr_modulo_field() {\n        let routing_expr = deser_util(\"hash_mod(tenant_id, 4)\");\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Modulo(\n                Box::new(InnerRoutingExpr::Field(vec![\"tenant_id\".to_owned()])),\n                4\n            )\n        );\n    }\n\n    #[test]\n    fn test_routing_expr_modulo_complexe() {\n        let routing_expr = deser_util(\"hash_mod((tenant_id,hash_mod(app_id, 3)), 8),cluster_id\");\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Composite(vec![\n                InnerRoutingExpr::Modulo(\n                    Box::new(InnerRoutingExpr::Composite(vec![\n                        InnerRoutingExpr::Field(vec![\"tenant_id\".to_owned()]),\n                        InnerRoutingExpr::Modulo(\n                            Box::new(InnerRoutingExpr::Field(vec![\"app_id\".to_owned()]),),\n                            3\n                        ),\n                    ])),\n                    8\n                ),\n                InnerRoutingExpr::Field(vec![\"cluster_id\".to_owned()]),\n            ])\n        );\n    }\n\n    #[test]\n    fn test_routing_expr_multiple_field() {\n        let routing_expr = deser_util(\"tenant_id,app_id\");\n\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Composite(vec![\n                
InnerRoutingExpr::Field(vec![\"tenant_id\".to_owned()]),\n                InnerRoutingExpr::Field(vec![\"app_id\".to_owned()]),\n            ])\n        );\n    }\n\n    #[test]\n    fn test_routing_expr_multiple_field_with_dot() {\n        let routing_expr = deser_util(\"tenant.id,app.id\");\n\n        assert_eq!(\n            routing_expr,\n            InnerRoutingExpr::Composite(vec![\n                InnerRoutingExpr::Field(vec![\"tenant\".to_owned(), \"id\".to_owned()]),\n                InnerRoutingExpr::Field(vec![\"app\".to_owned(), \"id\".to_owned()]),\n            ])\n        );\n    }\n\n    #[test]\n    fn test_parse_field_name() {\n        let keys = expression_dsl::parse_field_name(\"abc\").unwrap();\n        assert_eq!(keys, vec![String::from(\"abc\")]);\n    }\n\n    #[test]\n    fn test_parse_field_name_multiple() {\n        let keys = expression_dsl::parse_field_name(\"abc.def\").unwrap();\n        assert_eq!(keys, vec![String::from(\"abc\"), String::from(\"def\")]);\n    }\n\n    #[test]\n    fn test_parse_field_name_with_escaped_dot() {\n        let keys = expression_dsl::parse_field_name(\"abc\\\\.def.hij\").unwrap();\n        assert_eq!(keys, vec![String::from(\"abc.def\"), String::from(\"hij\")]);\n    }\n\n    #[test]\n    fn test_parse_field_name_with_special_char() {\n        let keys = expression_dsl::parse_field_name(\"abCD01-_/@$\").unwrap();\n        assert_eq!(keys, vec![String::from(\"abCD01-_/@$\")]);\n    }\n\n    #[test]\n    fn test_find_value_with_escaped_dot() {\n        let ctx = serde_json::from_str(r#\"{\"tenant.id\": \"happy\", \"app\": \"happy\"}\"#).unwrap();\n        let keys: Vec<_> = expression_dsl::parse_field_name(\"tenant\\\\.id\")\n            .unwrap()\n            .into_iter()\n            .map(Cow::into_owned)\n            .collect();\n        assert_eq!(keys, vec![String::from(\"tenant.id\")]);\n        let value = find_value(&ctx, &keys).unwrap();\n        assert_eq!(value, 
&JsonValue::String(String::from(\"happy\")));\n    }\n\n    #[test]\n    fn test_find_value_with_nested_keys() {\n        let ctx = serde_json::from_str(\n            r#\"{\"tenant_id\": \"happy\", \"app\": {\"name\": \"happy\", \"id\": \"123\"}}\"#,\n        )\n        .unwrap();\n        let keys: Vec<_> = expression_dsl::parse_field_name(\"app.id\")\n            .unwrap()\n            .into_iter()\n            .map(Cow::into_owned)\n            .collect();\n        assert_eq!(keys, vec![\"app\", \"id\"]);\n        let value = find_value(&ctx, &keys).unwrap();\n        assert_eq!(value, &JsonValue::String(String::from(\"123\")));\n    }\n    // This unit test is here to ensure that the routing expr hash depends on\n    // the expression itself as well as the expression value.\n    #[test]\n    fn test_routing_expr_depends_on_both_expr_and_value() {\n        let routing_expr = RoutingExpr::new(\"tenant_id\").unwrap();\n        let routing_expr2 = RoutingExpr::new(\"app\").unwrap();\n        let ctx: serde_json::Map<String, JsonValue> =\n            serde_json::from_str(r#\"{\"tenant_id\": \"happy\", \"app\": \"happy\"}\"#).unwrap();\n        let ctx2: serde_json::Map<String, JsonValue> =\n            serde_json::from_str(r#\"{\"tenant_id\": \"happy2\"}\"#).unwrap();\n        // This assert is important.\n        assert_ne!(routing_expr.eval_hash(&ctx), routing_expr2.eval_hash(&ctx),);\n        assert_ne!(routing_expr.eval_hash(&ctx), routing_expr.eval_hash(&ctx2),);\n    }\n\n    // This unit test is here to detect a change in the hash logic.\n    // Breaking it is not catastrophic but it should not happen too often.\n    #[test]\n    fn test_routing_expr_change_detection() {\n        let routing_expr = RoutingExpr::new(\"tenant_id\").unwrap();\n        let ctx: serde_json::Map<String, JsonValue> =\n            serde_json::from_str(r#\"{\"tenant_id\": \"happy-tenant\", \"app\": \"happy\"}\"#).unwrap();\n        assert_eq!(routing_expr.eval_hash(&ctx), 
13914409176935416182);\n    }\n\n    #[test]\n    fn test_routing_expr_missing_value_does_not_panic() {\n        let routing_expr = RoutingExpr::new(\"tenant_id\").unwrap();\n        let ctx: serde_json::Map<String, JsonValue> = Default::default();\n        assert_eq!(routing_expr.eval_hash(&ctx), 12482849403534986143);\n    }\n\n    #[test]\n    fn test_routing_expr_mod() {\n        let mut seen = HashSet::new();\n        let routing_expr = RoutingExpr::new(\"hash_mod(tenant_id, 10)\").unwrap();\n\n        for i in 0..1000 {\n            let ctx: serde_json::Map<String, JsonValue> =\n                serde_json::from_str(&format!(r#\"{{\"tenant_id\": \"happy{i}\"}}\"#)).unwrap();\n            seen.insert(routing_expr.eval_hash(&ctx));\n        }\n\n        assert_eq!(seen.len(), 10);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-doc-mapper/src/tag_pruning.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::fmt::Display;\n\nuse quickwit_query::query_ast::QueryAst;\nuse serde::{Deserialize, Serialize};\nuse tantivy::query_grammar::Occur;\n\n/// Returns true if and only if tag is of form `{field_name}:any_value`.\npub fn match_tag_field_name(field_name: &str, tag: &str) -> bool {\n    tag.len() > field_name.len()\n        && tag.as_bytes()[field_name.len()] == b':'\n        && tag.starts_with(field_name)\n}\n\n/// Tags a user query and returns a TagFilterAst that\n/// represents a filtering predicate over a set of tags.\n///\n/// If the predicate evaluates to false for a given set of tags\n/// associated with a split, we are guaranteed that no documents\n/// in the split matches the query.\npub fn extract_tags_from_query(query_ast: QueryAst) -> Option<TagFilterAst> {\n    let unsimplified_tag_filter_ast = extract_unsimplified_tags_filter_ast(query_ast);\n    let term_filters_ast = simplify_ast(unsimplified_tag_filter_ast)?;\n    Some(expand_to_tag_ast(term_filters_ast))\n}\n\nfn extract_unsimplified_tags_filter_ast(query_ast: QueryAst) -> UnsimplifiedTagFilterAst {\n    match query_ast {\n        QueryAst::Bool(bool_query) => {\n            let mut clause_with_resolved_occur: Vec<(Occur, UnsimplifiedTagFilterAst)> = Vec::new();\n            for (occur, children) in [\n                (Occur::Must, 
bool_query.must),\n                (Occur::Must, bool_query.filter),\n                (Occur::Should, bool_query.should),\n                (Occur::MustNot, bool_query.must_not),\n            ] {\n                for child_ast in children {\n                    let child_unsimplified_tag_ast =\n                        extract_unsimplified_tags_filter_ast(child_ast);\n                    clause_with_resolved_occur.push((occur, child_unsimplified_tag_ast));\n                }\n            }\n            collect_tag_filters_for_clause(clause_with_resolved_occur)\n        }\n        QueryAst::Term(term_query) => UnsimplifiedTagFilterAst::Tag {\n            is_present: true,\n            field: term_query.field,\n            value: term_query.value,\n        },\n        QueryAst::MatchAll | QueryAst::MatchNone => UnsimplifiedTagFilterAst::Uninformative,\n        QueryAst::Range(_) => {\n            // We could technically support ranges over quantitative tag values (like we do\n            // for timestamps). 
This is not supported at this point.\n            UnsimplifiedTagFilterAst::Uninformative\n        }\n        QueryAst::TermSet(term_set) => {\n            let children: Vec<UnsimplifiedTagFilterAst> = term_set\n                .terms_per_field\n                .into_iter()\n                .flat_map(|(field, terms)| {\n                    terms\n                        .into_iter()\n                        .map(move |term| UnsimplifiedTagFilterAst::Tag {\n                            is_present: true,\n                            field: field.clone(),\n                            value: term,\n                        })\n                })\n                .collect();\n            UnsimplifiedTagFilterAst::Or(children)\n        }\n        QueryAst::FullText(full_text_query) => {\n            // TODO This is a bug in a sense.\n            // A phrase is supposed to go through the tokenizer.\n            UnsimplifiedTagFilterAst::Tag {\n                is_present: true,\n                field: full_text_query.field,\n                value: full_text_query.text,\n            }\n        }\n        QueryAst::PhrasePrefix(phrase_prefix_query) => {\n            // TODO same as FullText above.\n            UnsimplifiedTagFilterAst::Tag {\n                is_present: true,\n                field: phrase_prefix_query.field,\n                value: phrase_prefix_query.phrase,\n            }\n        }\n        QueryAst::Wildcard(wildcard_query) => {\n            // TODO same as FullText above.\n            UnsimplifiedTagFilterAst::Tag {\n                is_present: true,\n                field: wildcard_query.field,\n                value: wildcard_query.value,\n            }\n        }\n        QueryAst::Boost { underlying, .. 
} => extract_unsimplified_tags_filter_ast(*underlying),\n        QueryAst::UserInput(_user_text_query) => {\n            panic!(\"Extract unsimplified should only be called on AST without UserInputQuery.\");\n        }\n        QueryAst::FieldPresence(_) => UnsimplifiedTagFilterAst::Uninformative,\n        QueryAst::Regex(_) => UnsimplifiedTagFilterAst::Uninformative,\n        QueryAst::Cache(cache_node) => extract_unsimplified_tags_filter_ast(*cache_node.inner),\n    }\n}\n\n/// Intermediary AST that may contain leaves that are\n/// equivalent to the \"Uninformative\" predicate.\n#[derive(Clone, Debug, Eq, PartialEq)]\nenum UnsimplifiedTagFilterAst {\n    And(Vec<UnsimplifiedTagFilterAst>),\n    Or(Vec<UnsimplifiedTagFilterAst>),\n    Tag {\n        is_present: bool,\n        field: String,\n        value: String,\n    },\n    /// Uninformative represents a node which could be\n    /// True or False regardless of the tag values.\n    ///\n    /// Any subnode of the `UserInputAST` can be\n    /// replaced by `Uninformative` while still being correct.\n    Uninformative,\n}\n\n/// Represents a tag filter used for split pruning.\n#[derive(Debug, PartialEq, Clone)]\nenum TermFilterAst {\n    And(Vec<TermFilterAst>),\n    Or(Vec<TermFilterAst>),\n    Term { field: String, value: String },\n}\n\n/// Records terms into a set of tags.\n///\n/// A special tag `{field_name}!` is always added to the tag set.\n/// It indicates that `{field_name}` is in the list of the\n/// `DocMapper` attribute `tag_fields`.\n///\n/// See `SplitMetadata` in `quickwit_metastore` for more detail.\npub fn append_to_tag_set(field_name: &str, values: &[String], tag_set: &mut BTreeSet<String>) {\n    tag_set.insert(field_tag(field_name));\n    for value in values {\n        tag_set.insert(term_tag(field_name, value));\n    }\n}\n\n/// Represents a predicate over the set of tags associated with a given split.\n#[allow(missing_docs)]\n#[derive(Debug, PartialEq, Clone, Serialize, Deserialize)]\npub enum 
TagFilterAst {\n    And(Vec<TagFilterAst>),\n    Or(Vec<TagFilterAst>),\n    Tag {\n        /// If set to false, the predicate tests for the absence of the tag.\n        is_present: bool,\n        tag: String,\n    },\n}\n\nimpl Display for TagFilterAst {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        let (is_or, children) = match self {\n            TagFilterAst::And(children) => (false, children),\n            TagFilterAst::Or(children) => (true, children),\n            TagFilterAst::Tag { is_present, tag } => {\n                if !is_present {\n                    write!(f, \"¬\")?;\n                }\n                write!(f, \"{tag}\")?;\n                return Ok(());\n            }\n        };\n        if children.is_empty() {\n            return Ok(());\n        }\n        if children.len() == 1 {\n            write!(f, \"{}\", children[0])?;\n            return Ok(());\n        }\n        if is_or {\n            write!(f, \"(\")?;\n        }\n        let mut children_it = children.iter();\n        write!(f, \"{}\", children_it.next().unwrap())?;\n        for child in children_it {\n            if is_or {\n                write!(f, \" ∨ {child}\")?;\n            } else {\n                write!(f, \" ∧ {child}\")?;\n            }\n        }\n        if is_or {\n            write!(f, \")\")?;\n        }\n        Ok(())\n    }\n}\n\nimpl TagFilterAst {\n    /// Evaluates the tag filter predicate over a set of tags.\n    pub fn evaluate(&self, tag_set: &BTreeSet<String>) -> bool {\n        match self {\n            TagFilterAst::And(children) => {\n                children.iter().all(|child_ast| child_ast.evaluate(tag_set))\n            }\n            TagFilterAst::Or(children) => {\n                children.iter().any(|child_ast| child_ast.evaluate(tag_set))\n            }\n            TagFilterAst::Tag { is_present, tag } => tag_set.contains(tag) == *is_present,\n        }\n    }\n}\n\n// Takes a tag AST and simplifies 
it.\n//\n// The resulting AST does not contain any uninformative leaves.\n//\n// Returning None here, is to be interpreted as returning `True`.\nfn simplify_ast(ast: UnsimplifiedTagFilterAst) -> Option<TermFilterAst> {\n    match ast {\n        UnsimplifiedTagFilterAst::And(conditions) => {\n            let mut pruned_conditions: Vec<TermFilterAst> =\n                conditions.into_iter().filter_map(simplify_ast).collect();\n            match pruned_conditions.len() {\n                0 => None,\n                1 => pruned_conditions.pop().unwrap().into(),\n                _ => TermFilterAst::And(pruned_conditions).into(),\n            }\n        }\n        UnsimplifiedTagFilterAst::Or(conditions) => {\n            let mut pruned_conditions: Vec<TermFilterAst> = Vec::new();\n            for condition in conditions {\n                // If we get None as part of the condition here, we return None\n                // directly. (Remember None means True).\n                pruned_conditions.push(simplify_ast(condition)?);\n            }\n            match pruned_conditions.len() {\n                0 => None,\n                1 => pruned_conditions.pop().unwrap().into(),\n                _ => TermFilterAst::Or(pruned_conditions).into(),\n            }\n        }\n        UnsimplifiedTagFilterAst::Tag {\n            is_present,\n            field,\n            value,\n        } => {\n            if is_present {\n                Some(TermFilterAst::Term { field, value })\n            } else {\n                // we can't do tag pruning on negative filters. If `field` can be one of 1 or 2,\n                // and we search for not(1), we don't want to remove a split where\n                // tags=[1,2] (which is_present: false does). 
It's even more problematic if some\n                // documents have `field` unset, because we don't record that at all, so can't\n                // even reject a split based on it having tags=[1].\n                None\n            }\n        }\n        UnsimplifiedTagFilterAst::Uninformative => None,\n    }\n}\n\n/// Special tag to indicate that a field is listed in the\n/// `DocMapper` `tag_fields` attribute.\npub fn field_tag(field_name: &str) -> String {\n    format!(\"{field_name}!\")\n}\n\nfn term_tag(field: &str, value: &str) -> String {\n    format!(\"{field}:{value}\")\n}\n\nfn expand_to_tag_ast(terms_filter_ast: TermFilterAst) -> TagFilterAst {\n    match terms_filter_ast {\n        TermFilterAst::And(children) => {\n            TagFilterAst::And(children.into_iter().map(expand_to_tag_ast).collect())\n        }\n        TermFilterAst::Or(children) => {\n            TagFilterAst::Or(children.into_iter().map(expand_to_tag_ast).collect())\n        }\n        TermFilterAst::Term { field, value } => {\n            let field_is_tag = TagFilterAst::Tag {\n                is_present: false,\n                tag: field_tag(&field),\n            };\n            let term_tag = TagFilterAst::Tag {\n                is_present: true,\n                tag: term_tag(&field, &value),\n            };\n            TagFilterAst::Or(vec![field_is_tag, term_tag])\n        }\n    }\n}\n\nfn collect_tag_filters_for_clause(\n    clause: Vec<(Occur, UnsimplifiedTagFilterAst)>,\n) -> UnsimplifiedTagFilterAst {\n    if clause.is_empty() {\n        return UnsimplifiedTagFilterAst::Uninformative;\n    }\n    if clause.iter().any(|(occur, _)| occur == &Occur::Must) {\n        let removed_should_clause: Vec<UnsimplifiedTagFilterAst> = clause\n            .into_iter()\n            .filter_map(|(occur, ast)| match occur {\n                Occur::Must => Some(ast),\n                Occur::MustNot => Some(negate_ast(ast)),\n                Occur::Should => None,\n            })\n         
   .collect();\n        // We will handle the case where removed_should_clause.len() == 1 in the simplify\n        // phase.\n        return UnsimplifiedTagFilterAst::And(removed_should_clause);\n    }\n    let converted_not_clause = clause\n        .into_iter()\n        .map(|(occur, ast)| match occur {\n            Occur::MustNot => negate_ast(ast),\n            Occur::Should => ast,\n            Occur::Must => {\n                unreachable!(\"This should never happen due to check above.\")\n            }\n        })\n        .collect();\n    UnsimplifiedTagFilterAst::Or(converted_not_clause)\n}\n\n/// Negate the unsimplified ast, pushing the negation to the leaf\n/// using De Morgan's law\n/// - NOT( A AND B )=> NOT(A) OR NOT(B)\n/// - NOT( A OR B )=> NOT(A) AND NOT(B)\n/// - NOT( Tag ) => NotTag\n/// - NOT( NotTag ) => Tag\n/// - NOT( Uninformative ) => Uninformative.\nfn negate_ast(clause: UnsimplifiedTagFilterAst) -> UnsimplifiedTagFilterAst {\n    match clause {\n        UnsimplifiedTagFilterAst::And(leaves) => {\n            UnsimplifiedTagFilterAst::Or(leaves.into_iter().map(negate_ast).collect())\n        }\n        UnsimplifiedTagFilterAst::Or(leaves) => {\n            UnsimplifiedTagFilterAst::And(leaves.into_iter().map(negate_ast).collect())\n        }\n        UnsimplifiedTagFilterAst::Tag {\n            is_present,\n            field,\n            value,\n        } => UnsimplifiedTagFilterAst::Tag {\n            is_present: !is_present,\n            field,\n            value,\n        },\n        UnsimplifiedTagFilterAst::Uninformative => UnsimplifiedTagFilterAst::Uninformative,\n    }\n}\n\n/// Helper to build a TagFilterAst checking for the presence of a tag.\npub fn tag(tag: impl ToString) -> TagFilterAst {\n    TagFilterAst::Tag {\n        is_present: true,\n        tag: tag.to_string(),\n    }\n}\n\n/// Helper to build a TagFilterAst checking for the absence of a tag.\npub fn no_tag(tag: impl ToString) -> TagFilterAst {\n    TagFilterAst::Tag 
{\n        is_present: false,\n        tag: tag.to_string(),\n    }\n}\n#[cfg(test)]\nmod test {\n    use quickwit_query::BooleanOperand;\n    use quickwit_query::query_ast::{QueryAst, UserInputQuery};\n\n    use super::extract_tags_from_query;\n    use crate::tag_pruning::TagFilterAst;\n\n    fn extract_tags_from_query_helper(user_query: &str) -> Option<TagFilterAst> {\n        let query_ast: QueryAst = UserInputQuery {\n            user_text: user_query.to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::Or,\n            lenient: false,\n        }\n        .into();\n        let parsed_query_ast = query_ast.parse_user_query(&[]).unwrap();\n        extract_tags_from_query(parsed_query_ast)\n    }\n\n    #[test]\n    fn test_extract_tags_from_query_all() {\n        assert_eq!(extract_tags_from_query_helper(\"*\"), None);\n    }\n\n    #[test]\n    fn test_extract_tags_from_query_range_query() {\n        assert_eq!(extract_tags_from_query_helper(\"title:>foo lang:fr\"), None);\n    }\n\n    #[test]\n    fn test_extract_tags_from_query_range_query_conjunction() {\n        assert_eq!(\n            &extract_tags_from_query_helper(\"title:>foo AND lang:fr\")\n                .unwrap()\n                .to_string(),\n            \"(¬lang! ∨ lang:fr)\"\n        );\n    }\n\n    #[test]\n    fn test_extract_tags_from_query_mixed_disjunction() -> anyhow::Result<()> {\n        assert_eq!(\n            &extract_tags_from_query_helper(\"title:foo user:bart lang:fr\")\n                .unwrap()\n                .to_string(),\n            \"((¬title! ∨ title:foo) ∨ (¬user! ∨ user:bart) ∨ (¬lang! ∨ lang:fr))\"\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_extract_tags_from_query_and_or() -> anyhow::Result<()> {\n        assert_eq!(\n            &extract_tags_from_query_helper(\"title:foo AND (user:bart OR lang:fr)\")\n                .unwrap()\n                .to_string(),\n            \"(¬title! ∨ title:foo) ∧ ((¬user! 
∨ user:bart) ∨ (¬lang! ∨ lang:fr))\"\n        );\n        Ok(())\n    }\n\n    #[test]\n    fn test_conjunction_of_tags() {\n        assert_eq!(\n            &extract_tags_from_query_helper(\"(user:bart AND lang:fr)\")\n                .unwrap()\n                .to_string(),\n            \"(¬user! ∨ user:bart) ∧ (¬lang! ∨ lang:fr)\"\n        );\n    }\n\n    #[test]\n    fn test_disjunction_of_tags() {\n        assert_eq!(\n            &extract_tags_from_query_helper(\"(user:bart OR lang:fr)\")\n                .unwrap()\n                .to_string(),\n            \"((¬user! ∨ user:bart) ∨ (¬lang! ∨ lang:fr))\"\n        );\n    }\n\n    #[test]\n    fn test_disjunction_of_tag_disjunction_with_not_clause() {\n        // ORed negative tags make the result inconclusive. See simplify_ast() for details\n        assert!(extract_tags_from_query_helper(\"(user:bart -lang:fr)\").is_none());\n    }\n\n    #[test]\n    fn test_disjunction_of_tag_conjunction_with_not_clause() {\n        // negative tags are removed from AND clauses. See simplify_ast() for details\n        assert_eq!(\n            &extract_tags_from_query_helper(\"user:bart AND NOT lang:fr\")\n                .unwrap()\n                .to_string(),\n            \"(¬user! ∨ user:bart)\"\n        );\n    }\n\n    #[test]\n    fn test_disjunction_of_tag_must_should() {\n        assert_eq!(\n            &extract_tags_from_query_helper(\"(+user:bart lang:fr)\")\n                .unwrap()\n                .to_string(),\n            \"(¬user! 
∨ user:bart)\"\n        );\n    }\n\n    #[test]\n    fn test_match_tag_field_name() {\n        assert!(super::match_tag_field_name(\"tagfield\", \"tagfield:val\"));\n        assert!(super::match_tag_field_name(\"tagfield\", \"tagfield:\"));\n        assert!(!super::match_tag_field_name(\"tagfield\", \"tagfield\"));\n        assert!(!super::match_tag_field_name(\"tagfield\", \"tag:val\"));\n        assert!(!super::match_tag_field_name(\"tagfield\", \"tagfiele:val\"));\n        assert!(!super::match_tag_field_name(\"tagfield\", \"t:val\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-index-management/Cargo.toml",
    "content": "[package]\nname = \"quickwit-index-management\"\ndescription = \"Create and manage Quickwit indexes, sources, templates, etc.\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nfutures = { workspace = true }\nfutures-util = { workspace = true }\nitertools = { workspace = true }\nthiserror = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-indexing = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[dev-dependencies]\n\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n"
  },
  {
    "path": "quickwit/quickwit-index-management/src/garbage_collection.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::path::{Path, PathBuf};\nuse std::sync::{Arc, OnceLock};\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse futures::{Future, StreamExt};\nuse itertools::Itertools;\nuse quickwit_common::metrics::IntCounter;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_common::{Progress, rate_limited_info};\nuse quickwit_metastore::{\n    ListSplitsQuery, ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitInfo,\n    SplitMetadata, SplitState,\n};\nuse quickwit_proto::metastore::{\n    DeleteSplitsRequest, ListSplitsRequest, MarkSplitsForDeletionRequest, MetastoreError,\n    MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::{IndexUid, SplitId};\nuse quickwit_storage::{BulkDeleteError, Storage};\nuse thiserror::Error;\nuse time::OffsetDateTime;\nuse tracing::{error, instrument};\n\n/// The maximum number of splits that the GC should delete per attempt.\nconst DELETE_SPLITS_BATCH_SIZE: usize = 10_000;\n\npub struct GcMetrics {\n    pub deleted_splits: IntCounter,\n    pub deleted_bytes: IntCounter,\n    pub failed_splits: IntCounter,\n}\n\ntrait RecordGcMetrics {\n    fn record(&self, num_delete_splits: usize, num_deleted_bytes: u64, num_failed_splits: usize);\n}\n\nimpl RecordGcMetrics for Option<GcMetrics> {\n    fn record(&self, num_deleted_splits: usize, 
num_deleted_bytes: u64, num_failed_splits: usize) {\n        if let Some(metrics) = self {\n            metrics.deleted_splits.inc_by(num_deleted_splits as u64);\n            metrics.deleted_bytes.inc_by(num_deleted_bytes);\n            metrics.failed_splits.inc_by(num_failed_splits as u64);\n        }\n    }\n}\n\n/// [`DeleteSplitsError`] describes the errors that occurred during the deletion of splits from\n/// storage and metastore.\n#[derive(Error, Debug)]\n#[error(\"failed to delete splits from storage and/or metastore\")]\npub struct DeleteSplitsError {\n    successes: Vec<SplitInfo>,\n    storage_error: Option<BulkDeleteError>,\n    storage_failures: Vec<SplitInfo>,\n    metastore_error: Option<MetastoreError>,\n    metastore_failures: Vec<SplitInfo>,\n}\n\nasync fn protect_future<Fut, T>(progress: Option<&Progress>, future: Fut) -> T\nwhere Fut: Future<Output = T> {\n    match progress {\n        None => future.await,\n        Some(progress) => {\n            let _guard = progress.protect_zone();\n            future.await\n        }\n    }\n}\n\n/// Information on what splits have and have not been cleaned up by the GC.\n#[derive(Debug, Default)]\npub struct SplitRemovalInfo {\n    /// The set of splits that have been removed.\n    pub removed_split_entries: Vec<SplitInfo>,\n    /// The set of splits whose removal was attempted but failed.\n    pub failed_splits: Vec<SplitInfo>,\n}\n\n/// Detects all dangling splits and associated files from the indexes and removes them.\n///\n/// * `indexes` - The target index uids and their storages.\n/// * `metastore` - The metastore managing the target indexes.\n/// * `staged_grace_period` - Threshold period after which a staged split can be safely garbage\n///   collected.\n/// * `deletion_grace_period` - Threshold period after which a split marked for deletion can be\n///   safely deleted.\n/// * `dry_run` - Should this only return a list of affected 
files without performing deletion.\n/// * `progress` - For reporting progress (useful when called from within a quickwit actor).\npub async fn run_garbage_collect(\n    indexes: HashMap<IndexUid, Arc<dyn Storage>>,\n    metastore: MetastoreServiceClient,\n    staged_grace_period: Duration,\n    deletion_grace_period: Duration,\n    dry_run: bool,\n    progress_opt: Option<&Progress>,\n    metrics: Option<GcMetrics>,\n) -> anyhow::Result<SplitRemovalInfo> {\n    let grace_period_timestamp =\n        OffsetDateTime::now_utc().unix_timestamp() - staged_grace_period.as_secs() as i64;\n\n    let index_uids: Vec<IndexUid> = indexes.keys().cloned().collect();\n\n    // TODO maybe we want to do a ListSplitsQuery::for_all_indexes and post-filter ourselves here\n    let Some(list_splits_query_for_index_uids) = ListSplitsQuery::try_from_index_uids(index_uids)\n    else {\n        return Ok(SplitRemovalInfo::default());\n    };\n    let list_splits_query = list_splits_query_for_index_uids\n        .clone()\n        .with_split_state(SplitState::Staged)\n        .with_update_timestamp_lte(grace_period_timestamp);\n\n    let list_deletable_staged_request =\n        ListSplitsRequest::try_from_list_splits_query(&list_splits_query)?;\n    let deletable_staged_splits: Vec<SplitMetadata> = protect_future(\n        progress_opt,\n        metastore.list_splits(list_deletable_staged_request),\n    )\n    .await?\n    .collect_splits_metadata()\n    .await?;\n\n    if dry_run {\n        let marked_for_deletion_query =\n            list_splits_query_for_index_uids.with_split_state(SplitState::MarkedForDeletion);\n        let marked_for_deletion_request =\n            ListSplitsRequest::try_from_list_splits_query(&marked_for_deletion_query)?;\n        let mut splits_marked_for_deletion: Vec<SplitMetadata> = protect_future(\n            progress_opt,\n            metastore.list_splits(marked_for_deletion_request),\n        )\n        .await?\n        .collect_splits_metadata()\n        
.await?;\n        splits_marked_for_deletion.extend(deletable_staged_splits);\n\n        let candidate_entries: Vec<SplitInfo> = splits_marked_for_deletion\n            .into_iter()\n            .map(|split| split.as_split_info())\n            .collect();\n        return Ok(SplitRemovalInfo {\n            removed_split_entries: candidate_entries,\n            failed_splits: Vec::new(),\n        });\n    }\n\n    // Schedule all eligible staged splits for delete\n    let split_ids: HashMap<IndexUid, Vec<SplitId>> = deletable_staged_splits\n        .into_iter()\n        .map(|split| (split.index_uid, split.split_id))\n        .into_group_map();\n    for (index_uid, split_ids) in split_ids {\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid, split_ids);\n        protect_future(\n            progress_opt,\n            metastore.mark_splits_for_deletion(mark_splits_for_deletion_request),\n        )\n        .await?;\n    }\n\n    // We delete splits marked for deletion that have an update timestamp anterior\n    // to `now - deletion_grace_period`.\n    let updated_before_timestamp =\n        OffsetDateTime::now_utc().unix_timestamp() - deletion_grace_period.as_secs() as i64;\n\n    Ok(delete_splits_marked_for_deletion_several_indexes(\n        updated_before_timestamp,\n        metastore,\n        indexes,\n        progress_opt,\n        metrics,\n    )\n    .await)\n}\n\nasync fn delete_splits(\n    splits_metadata_to_delete_per_index: HashMap<IndexUid, Vec<SplitMetadata>>,\n    storages: &HashMap<IndexUid, Arc<dyn Storage>>,\n    metastore: MetastoreServiceClient,\n    progress_opt: Option<&Progress>,\n    metrics: &Option<GcMetrics>,\n    split_removal_info: &mut SplitRemovalInfo,\n) -> Result<(), ()> {\n    let mut delete_split_from_index_res_stream =\n        futures::stream::iter(splits_metadata_to_delete_per_index)\n            .map(|(index_uid, splits_metadata_to_delete)| {\n                let storage = 
storages.get(&index_uid).cloned();\n                let metastore = metastore.clone();\n                async move {\n                    if let Some(storage) = storage {\n                        delete_splits_from_storage_and_metastore(\n                            index_uid,\n                            storage,\n                            metastore,\n                            splits_metadata_to_delete,\n                            progress_opt,\n                        )\n                        .await\n                    } else {\n                        // in practice this can happen if the index was created between the start of\n                        // the run and now, and one of its splits has already expired, which likely\n                        // means a very long gc run, or if we run gc on a single index from the cli.\n                        quickwit_common::rate_limited_warn!(\n                            limit_per_min = 2,\n                            index_uid=%index_uid,\n                            \"we are trying to GC without knowing the storage\",\n                        );\n                        Ok(Vec::new())\n                    }\n                }\n            })\n            .buffer_unordered(get_index_gc_concurrency().unwrap_or(10));\n\n    let mut error_encountered = false;\n    while let Some(delete_split_result) = delete_split_from_index_res_stream.next().await {\n        match delete_split_result {\n            Ok(entries) => {\n                let deleted_bytes = entries\n                    .iter()\n                    .map(|entry| entry.file_size_bytes.as_u64())\n                    .sum::<u64>();\n                let deleted_splits_count = entries.len();\n\n                metrics.record(deleted_splits_count, deleted_bytes, 0);\n                split_removal_info.removed_split_entries.extend(entries);\n            }\n            Err(delete_split_error) => {\n                let deleted_bytes = delete_split_error\n       
             .successes\n                    .iter()\n                    .map(|entry| entry.file_size_bytes.as_u64())\n                    .sum::<u64>();\n                let deleted_splits_count = delete_split_error.successes.len();\n                let failed_splits_count = delete_split_error.storage_failures.len()\n                    + delete_split_error.metastore_failures.len();\n\n                metrics.record(deleted_splits_count, deleted_bytes, failed_splits_count);\n                split_removal_info\n                    .removed_split_entries\n                    .extend(delete_split_error.successes);\n                split_removal_info\n                    .failed_splits\n                    .extend(delete_split_error.storage_failures);\n                split_removal_info\n                    .failed_splits\n                    .extend(delete_split_error.metastore_failures);\n                error_encountered = true;\n            }\n        }\n    }\n    if error_encountered { Err(()) } else { Ok(()) }\n}\n\n/// Fetch the list metadata from the metastore and returns them as a Vec.\nasync fn list_splits_metadata(\n    metastore: &MetastoreServiceClient,\n    query: &ListSplitsQuery,\n) -> anyhow::Result<Vec<SplitMetadata>> {\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(query)\n        .context(\"failed to build list splits request\")?;\n    let splits_to_delete_stream = metastore\n        .list_splits(list_splits_request)\n        .await\n        .context(\"failed to fetch stream splits\")?;\n    let splits = splits_to_delete_stream\n        .collect_splits_metadata()\n        .await\n        .context(\"failed to collect splits\")?;\n    Ok(splits)\n}\n\n/// In order to avoid hammering the load on the metastore, we can throttle the rate of split\n/// deletion by setting this environment variable.\nfn get_maximum_split_deletion_rate_per_sec() -> Option<usize> {\n    static MAX_SPLIT_DELETION_RATE_PER_SEC: 
OnceLock<Option<usize>> = OnceLock::new();\n    *MAX_SPLIT_DELETION_RATE_PER_SEC.get_or_init(|| {\n        quickwit_common::get_from_env_opt::<usize>(\"QW_MAX_SPLIT_DELETION_RATE_PER_SEC\", false)\n    })\n}\n\n/// Number of indexes whose splits are deleted concurrently, if set through this environment\n/// variable.\nfn get_index_gc_concurrency() -> Option<usize> {\n    static INDEX_GC_CONCURRENCY: OnceLock<Option<usize>> = OnceLock::new();\n    *INDEX_GC_CONCURRENCY.get_or_init(|| {\n        quickwit_common::get_from_env_opt::<usize>(\"QW_INDEX_GC_CONCURRENCY\", false)\n    })\n}\n\n/// Removes any splits marked for deletion which haven't been\n/// updated after `updated_before_timestamp` in batches of up to `DELETE_SPLITS_BATCH_SIZE` splits.\n///\n/// Only splits from index_uids in the `storages` map will be deleted.\n///\n/// The aim of this is to spread the load out across a longer period\n/// rather than short, heavy bursts on the metastore and storage system itself.\n#[instrument(skip(storages, metastore, progress_opt, metrics), fields(num_indexes=%storages.len()))]\nasync fn delete_splits_marked_for_deletion_several_indexes(\n    updated_before_timestamp: i64,\n    metastore: MetastoreServiceClient,\n    storages: HashMap<IndexUid, Arc<dyn Storage>>,\n    progress_opt: Option<&Progress>,\n    metrics: Option<GcMetrics>,\n) -> SplitRemovalInfo {\n    let mut split_removal_info = SplitRemovalInfo::default();\n\n    // we ask for all indexes because the query is more efficient and we almost always want all\n    // indexes anyway. The exception is when garbage collecting a single index from the commandline.\n    // In this case, we will log a bunch of warn. 
i (trinity) consider it worth the more generic\n    // code which needs fewer special case while testing, but we could check index_uids len if we\n    // think it's a better idea.\n    let list_splits_query = ListSplitsQuery::for_all_indexes();\n\n    let mut list_splits_query = list_splits_query\n        .with_split_state(SplitState::MarkedForDeletion)\n        .with_update_timestamp_lte(updated_before_timestamp)\n        .with_limit(DELETE_SPLITS_BATCH_SIZE)\n        .sort_by_index_uid();\n\n    loop {\n        let sleep_duration: Duration = if let Some(maximum_split_deletion_per_sec) =\n            get_maximum_split_deletion_rate_per_sec()\n        {\n            Duration::from_secs(\n                DELETE_SPLITS_BATCH_SIZE.div_ceil(maximum_split_deletion_per_sec) as u64,\n            )\n        } else {\n            Duration::default()\n        };\n        let sleep_future = tokio::time::sleep(sleep_duration);\n\n        let splits_metadata_to_delete: Vec<SplitMetadata> = match protect_future(\n            progress_opt,\n            list_splits_metadata(&metastore, &list_splits_query),\n        )\n        .await\n        {\n            Ok(splits) => splits,\n            Err(list_splits_err) => {\n                error!(error=?list_splits_err, \"failed to list splits\");\n                break;\n            }\n        };\n\n        // We page through the list of splits to delete using a limit and a `search_after` trick.\n        // To detect if this is the last page, we check if the number of splits is less than the\n        // limit.\n        assert!(splits_metadata_to_delete.len() <= DELETE_SPLITS_BATCH_SIZE);\n        let splits_to_delete_possibly_remaining =\n            splits_metadata_to_delete.len() == DELETE_SPLITS_BATCH_SIZE;\n\n        // set split after which to search for the next loop\n        let Some(last_split_metadata) = splits_metadata_to_delete.last() else {\n            break;\n        };\n        list_splits_query = 
list_splits_query.after_split(last_split_metadata);\n\n        let mut splits_metadata_to_delete_per_index: HashMap<IndexUid, Vec<SplitMetadata>> =\n            HashMap::with_capacity(storages.len());\n\n        for meta in splits_metadata_to_delete {\n            if !storages.contains_key(&meta.index_uid) {\n                rate_limited_info!(limit_per_min=6, index_uid=?meta.index_uid, \"split not listed in storage map: skipping\");\n                continue;\n            }\n            splits_metadata_to_delete_per_index\n                .entry(meta.index_uid.clone())\n                .or_default()\n                .push(meta);\n        }\n\n        // ignore the result: we continue either way\n        let _: Result<(), ()> = delete_splits(\n            splits_metadata_to_delete_per_index,\n            &storages,\n            metastore.clone(),\n            progress_opt,\n            &metrics,\n            &mut split_removal_info,\n        )\n        .await;\n\n        if splits_to_delete_possibly_remaining {\n            sleep_future.await;\n        } else {\n            // stop the gc if this was the last batch\n            // we are guaranteed to make progress due to .after_split()\n            break;\n        }\n    }\n\n    split_removal_info\n}\n\n/// Deletes a list of splits from the storage and the metastore.\n/// It should leave the index and the metastore in a good state.\n///\n/// * `index_uid` - The target index uid.\n/// * `storage` - The storage managing the target index.\n/// * `metastore` - The metastore managing the target index.\n/// * `splits` - The list of splits to delete.\n/// * `progress_opt` - For reporting progress (useful when called from within a quickwit actor).\npub async fn delete_splits_from_storage_and_metastore(\n    index_uid: IndexUid,\n    storage: Arc<dyn Storage>,\n    metastore: MetastoreServiceClient,\n    splits: Vec<SplitMetadata>,\n    progress_opt: Option<&Progress>,\n) -> Result<Vec<SplitInfo>, DeleteSplitsError> {\n    let mut 
split_infos: HashMap<PathBuf, SplitInfo> = HashMap::with_capacity(splits.len());\n\n    for split in splits {\n        let split_info = split.as_split_info();\n        split_infos.insert(split_info.file_name.clone(), split_info);\n    }\n    let split_paths = split_infos\n        .keys()\n        .map(|split_path_buf| split_path_buf.as_path())\n        .collect::<Vec<&Path>>();\n    let delete_result = protect_future(progress_opt, storage.bulk_delete(&split_paths)).await;\n\n    if let Some(progress) = progress_opt {\n        progress.record_progress();\n    }\n    let mut successes = Vec::with_capacity(split_infos.len());\n    let mut storage_error: Option<BulkDeleteError> = None;\n    let mut storage_failures = Vec::new();\n\n    match delete_result {\n        Ok(_) => successes.extend(split_infos.into_values()),\n        Err(bulk_delete_error) => {\n            let success_split_paths: HashSet<&PathBuf> =\n                bulk_delete_error.successes.iter().collect();\n            for (split_path, split_info) in split_infos {\n                if success_split_paths.contains(&split_path) {\n                    successes.push(split_info);\n                } else {\n                    storage_failures.push(split_info);\n                }\n            }\n            let failed_split_paths = storage_failures\n                .iter()\n                .map(|split_info| split_info.file_name.as_path())\n                .collect::<Vec<_>>();\n            error!(\n                error=?bulk_delete_error.error,\n                index_id=index_uid.index_id,\n                \"failed to delete split file(s) {:?} from storage\",\n                PrettySample::new(&failed_split_paths, 5),\n            );\n            storage_error = Some(bulk_delete_error);\n        }\n    };\n    if !successes.is_empty() {\n        let split_ids: Vec<SplitId> = successes\n            .iter()\n            .map(|split_info| split_info.split_id.to_string())\n            .collect();\n        let 
delete_splits_request = DeleteSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            split_ids: split_ids.clone(),\n        };\n        let metastore_result =\n            protect_future(progress_opt, metastore.delete_splits(delete_splits_request)).await;\n\n        if let Err(metastore_error) = metastore_result {\n            error!(\n                error=?metastore_error,\n                index_id=index_uid.index_id,\n                \"failed to delete split(s) {:?} from metastore\",\n                PrettySample::new(&split_ids, 5),\n            );\n            let delete_splits_error = DeleteSplitsError {\n                successes: Vec::new(),\n                storage_error,\n                storage_failures,\n                metastore_error: Some(metastore_error),\n                metastore_failures: successes,\n            };\n            return Err(delete_splits_error);\n        }\n    }\n    if !storage_failures.is_empty() {\n        let delete_splits_error = DeleteSplitsError {\n            successes,\n            storage_error,\n            storage_failures,\n            metastore_error: None,\n            metastore_failures: Vec::new(),\n        };\n        return Err(delete_splits_error);\n    }\n    Ok(successes)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use itertools::Itertools;\n    use quickwit_common::ServiceStream;\n    use quickwit_config::IndexConfig;\n    use quickwit_metastore::{\n        CreateIndexRequestExt, ListSplitsQuery, MetastoreServiceStreamSplitsExt, SplitMetadata,\n        SplitState, StageSplitsRequestExt, metastore_for_test,\n    };\n    use quickwit_proto::metastore::{\n        CreateIndexRequest, EntityKind, MockMetastoreService, StageSplitsRequest,\n    };\n    use quickwit_proto::types::IndexUid;\n    use quickwit_storage::{\n        BulkDeleteError, DeleteFailure, MockStorage, PutPayload, storage_for_test,\n    };\n\n    use super::*;\n    use crate::run_garbage_collect;\n\n 
   fn hashmap<K: Eq + std::hash::Hash, V>(key: K, value: V) -> HashMap<K, V> {\n        let mut map = HashMap::new();\n        map.insert(key, value);\n        map\n    }\n\n    #[tokio::test]\n    async fn test_run_gc_marks_stale_staged_splits_for_deletion_after_grace_period() {\n        let storage = storage_for_test();\n        let metastore = metastore_for_test();\n\n        let index_id = \"test-run-gc--index\";\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(index_id, &index_uri);\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let split_id = \"test-run-gc--split\";\n        let split_metadata = SplitMetadata {\n            split_id: split_id.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n        assert_eq!(\n            metastore\n                .list_splits(list_splits_request)\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .len(),\n            1\n        );\n\n        // The staging grace period hasn't passed yet so the split remains staged.\n        run_garbage_collect(\n            hashmap(index_uid.clone(), storage.clone()),\n    
        metastore.clone(),\n            Duration::from_secs(30),\n            Duration::from_secs(30),\n            false,\n            None,\n            None,\n        )\n        .await\n        .unwrap();\n\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n        assert_eq!(\n            metastore\n                .list_splits(list_splits_request)\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .len(),\n            1\n        );\n\n        // The staging grace period has passed so the split is marked for deletion.\n        run_garbage_collect(\n            hashmap(index_uid.clone(), storage.clone()),\n            metastore.clone(),\n            Duration::from_secs(0),\n            Duration::from_secs(30),\n            false,\n            None,\n            None,\n        )\n        .await\n        .unwrap();\n\n        let query =\n            ListSplitsQuery::for_index(index_uid).with_split_state(SplitState::MarkedForDeletion);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n        assert_eq!(\n            metastore\n                .list_splits(list_splits_request)\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .len(),\n            1\n        );\n    }\n\n    #[tokio::test]\n    async fn test_run_gc_deletes_splits_marked_for_deletion_after_grace_period() {\n        let storage = storage_for_test();\n        let metastore = metastore_for_test();\n\n        let index_id = \"test-run-gc--index\";\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(index_id, &index_uri);\n    
    let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let split_id = \"test-run-gc--split\";\n        let split_metadata = SplitMetadata {\n            split_id: split_id.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id.to_string()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::MarkedForDeletion);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n        assert_eq!(\n            metastore\n                .list_splits(list_splits_request)\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .len(),\n            1\n        );\n\n        // The delete grace period hasn't passed yet so the split remains marked for deletion.\n        run_garbage_collect(\n            hashmap(index_uid.clone(), storage.clone()),\n            metastore.clone(),\n            Duration::from_secs(30),\n            Duration::from_secs(30),\n            false,\n            None,\n            None,\n        )\n        .await\n        .unwrap();\n\n        let query = 
ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::MarkedForDeletion);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n        assert_eq!(\n            metastore\n                .list_splits(list_splits_request)\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .len(),\n            1\n        );\n\n        // The delete grace period has passed so the split is deleted.\n        run_garbage_collect(\n            hashmap(index_uid.clone(), storage.clone()),\n            metastore.clone(),\n            Duration::from_secs(30),\n            Duration::from_secs(0),\n            false,\n            None,\n            None,\n        )\n        .await\n        .unwrap();\n\n        let query = ListSplitsQuery::for_index(index_uid);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n        assert_eq!(\n            metastore\n                .list_splits(list_splits_request)\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .len(),\n            0\n        );\n    }\n\n    #[tokio::test]\n    async fn test_run_gc_deletes_splits_with_no_split() {\n        // Test that we make only 2 calls to the metastore.\n        let storage = storage_for_test();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_splits()\n            .times(2)\n            .returning(|_| Ok(ServiceStream::empty()));\n        run_garbage_collect(\n            hashmap(\n                IndexUid::new_with_random_ulid(\"index-test-gc-deletes\"),\n                storage.clone(),\n            ),\n            MetastoreServiceClient::from_mock(mock_metastore),\n            Duration::from_secs(30),\n            
Duration::from_secs(30),\n            false,\n            None,\n            None,\n        )\n        .await\n        .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_delete_splits_from_storage_and_metastore_happy_path() {\n        let storage = storage_for_test();\n        let metastore = metastore_for_test();\n\n        let index_id = \"test-delete-splits-happy--index\";\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(index_id, &index_uri);\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let split_id = \"test-delete-splits-happy--split\";\n        let split_metadata = SplitMetadata {\n            split_id: split_id.to_string(),\n            index_uid: IndexUid::new_with_random_ulid(index_id),\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata.clone())\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id.to_string()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n\n        let split_path_str = format!(\"{split_id}.split\");\n        let split_path = Path::new(&split_path_str);\n        let payload: Box<dyn PutPayload> = Box::new(vec![0]);\n        storage.put(split_path, payload).await.unwrap();\n        assert!(storage.exists(split_path).await.unwrap());\n\n        let splits = metastore\n            
.list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 1);\n\n        let deleted_split_infos = delete_splits_from_storage_and_metastore(\n            index_uid.clone(),\n            storage.clone(),\n            metastore.clone(),\n            vec![split_metadata],\n            None,\n        )\n        .await\n        .unwrap();\n\n        assert_eq!(deleted_split_infos.len(), 1);\n        assert_eq!(deleted_split_infos[0].split_id, split_id,);\n        assert_eq!(\n            deleted_split_infos[0].file_name,\n            Path::new(&format!(\"{split_id}.split\"))\n        );\n        assert!(!storage.exists(split_path).await.unwrap());\n        assert!(\n            metastore\n                .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap()\n                .is_empty()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_delete_splits_from_storage_and_metastore_storage_error() {\n        let mut mock_storage = MockStorage::new();\n        mock_storage\n            .expect_bulk_delete()\n            .return_once(|split_paths| {\n                assert_eq!(split_paths.len(), 2);\n\n                let split_paths: Vec<PathBuf> = split_paths\n                    .iter()\n                    .map(|split_path| split_path.to_path_buf())\n                    .sorted()\n                    .collect();\n                let split_path = split_paths[0].to_path_buf();\n                let successes = vec![split_path];\n\n                let split_path = split_paths[1].to_path_buf();\n                let delete_failure = DeleteFailure {\n                    code: Some(\"AccessDenied\".to_string()),\n                    
..Default::default()\n                };\n                let failures = HashMap::from_iter([(split_path, delete_failure)]);\n                let bulk_delete_error = BulkDeleteError {\n                    successes,\n                    failures,\n                    ..Default::default()\n                };\n                Err(bulk_delete_error)\n            });\n        let storage = Arc::new(mock_storage);\n        let metastore = metastore_for_test();\n\n        let index_id = \"test-delete-splits-storage-error--index\";\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(index_id, &index_uri);\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let split_id_0 = \"test-delete-splits-storage-error--split-0\";\n        let split_metadata_0 = SplitMetadata {\n            split_id: split_id_0.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let split_id_1 = \"test-delete-splits-storage-error--split-1\";\n        let split_metadata_1 = SplitMetadata {\n            split_id: split_id_1.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            [split_metadata_0.clone(), split_metadata_1.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let mark_splits_for_deletion_request = MarkSplitsForDeletionRequest::new(\n            index_uid.clone(),\n            vec![split_id_0.to_string(), split_id_1.to_string()],\n        );\n        metastore\n            
.mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        let error = delete_splits_from_storage_and_metastore(\n            index_uid.clone(),\n            storage.clone(),\n            metastore.clone(),\n            vec![split_metadata_0, split_metadata_1],\n            None,\n        )\n        .await\n        .unwrap_err();\n\n        assert_eq!(error.successes.len(), 1);\n        assert_eq!(error.storage_failures.len(), 1);\n        assert_eq!(error.metastore_failures.len(), 0);\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 1);\n        assert_eq!(splits[0].split_id(), split_id_1);\n    }\n\n    #[tokio::test]\n    async fn test_delete_splits_from_storage_and_metastore_metastore_error() {\n        let mut mock_storage = MockStorage::new();\n        mock_storage\n            .expect_bulk_delete()\n            .return_once(|split_paths| {\n                assert_eq!(split_paths.len(), 2);\n\n                let split_path = split_paths[0].to_path_buf();\n                let successes = vec![split_path];\n\n                let split_path = split_paths[1].to_path_buf();\n                let delete_failure = DeleteFailure {\n                    code: Some(\"AccessDenied\".to_string()),\n                    ..Default::default()\n                };\n                let failures = HashMap::from_iter([(split_path, delete_failure)]);\n                let bulk_delete_error = BulkDeleteError {\n                    successes,\n                    failures,\n                    ..Default::default()\n                };\n                Err(bulk_delete_error)\n            });\n        let storage = Arc::new(mock_storage);\n\n        let index_id = \"test-delete-splits-storage-error--index\";\n      
  let index_uid = IndexUid::new_with_random_ulid(index_id);\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_delete_splits().return_once(|_| {\n            Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_id.to_string(),\n            }))\n        });\n\n        let split_id_0 = \"test-delete-splits-storage-error--split-0\";\n        let split_metadata_0 = SplitMetadata {\n            split_id: split_id_0.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let split_id_1 = \"test-delete-splits-storage-error--split-1\";\n        let split_metadata_1 = SplitMetadata {\n            split_id: split_id_1.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let error = delete_splits_from_storage_and_metastore(\n            index_uid.clone(),\n            storage.clone(),\n            MetastoreServiceClient::from_mock(mock_metastore),\n            vec![split_metadata_0, split_metadata_1],\n            None,\n        )\n        .await\n        .unwrap_err();\n\n        assert!(error.successes.is_empty());\n        assert_eq!(error.storage_failures.len(), 1);\n        assert_eq!(error.metastore_failures.len(), 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-index-management/src/index.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::path::Path;\nuse std::time::Duration;\n\nuse futures_util::StreamExt;\nuse itertools::Itertools;\nuse quickwit_common::fs::{empty_dir, get_cache_directory_path};\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_common::rate_limited_error;\nuse quickwit_config::{IndexConfig, SourceConfig, validate_identifier};\nuse quickwit_indexing::check_source_connectivity;\nuse quickwit_metastore::{\n    AddSourceRequestExt, CreateIndexResponseExt, IndexMetadata, IndexMetadataResponseExt,\n    ListIndexesMetadataResponseExt, ListSplitsQuery, ListSplitsRequestExt,\n    MetastoreServiceStreamSplitsExt, SplitInfo, SplitMetadata, SplitState, UpdateIndexRequestExt,\n    UpdateSourceRequestExt,\n};\nuse quickwit_proto::metastore::{\n    AddSourceRequest, CreateIndexRequest, DeleteIndexRequest, EntityKind, IndexMetadataRequest,\n    ListIndexesMetadataRequest, ListSplitsRequest, MarkSplitsForDeletionRequest, MetastoreError,\n    MetastoreService, MetastoreServiceClient, ResetSourceCheckpointRequest, UpdateIndexRequest,\n    UpdateSourceRequest, serde_utils,\n};\nuse quickwit_proto::types::{IndexUid, SplitId};\nuse quickwit_proto::{ServiceError, ServiceErrorCode};\nuse quickwit_storage::{StorageResolver, StorageResolverError};\nuse thiserror::Error;\nuse tracing::{error, info};\n\nuse 
crate::garbage_collection::{\n    DeleteSplitsError, SplitRemovalInfo, delete_splits_from_storage_and_metastore,\n    run_garbage_collect,\n};\n\n#[derive(Error, Debug)]\npub enum IndexServiceError {\n    #[error(\"failed to resolve the storage `{0}`\")]\n    Storage(#[from] StorageResolverError),\n    #[error(\"metastore error `{0}`\")]\n    Metastore(#[from] MetastoreError),\n    #[error(\"split deletion error `{0}`\")]\n    SplitDeletion(#[from] DeleteSplitsError),\n    #[error(\"invalid config: {0:#}\")]\n    InvalidConfig(anyhow::Error),\n    #[error(\"invalid identifier: {0}\")]\n    InvalidIdentifier(String),\n    #[error(\"operation not allowed: {0}\")]\n    OperationNotAllowed(String),\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n}\n\nimpl ServiceError for IndexServiceError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(err_msg) => {\n                rate_limited_error!(limit_per_min = 6, err_msg);\n                ServiceErrorCode::Internal\n            }\n            Self::InvalidConfig(_) => ServiceErrorCode::BadRequest,\n            Self::InvalidIdentifier(_) => ServiceErrorCode::BadRequest,\n            Self::Metastore(error) => error.error_code(),\n            Self::OperationNotAllowed(_) => ServiceErrorCode::Forbidden,\n            Self::SplitDeletion(delete_splits_error) => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"index service internal error/split deletion: {delete_splits_error:?}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::Storage(storage_error) => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"index service internal error/storage {storage_error:?}\"\n                );\n                ServiceErrorCode::Internal\n            }\n        }\n    }\n}\n\n/// Index service responsible for creating, updating 
and deleting indexes.\n#[derive(Clone)]\npub struct IndexService {\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n}\n\nimpl IndexService {\n    /// Creates an `IndexService`.\n    pub fn new(metastore: MetastoreServiceClient, storage_resolver: StorageResolver) -> Self {\n        Self {\n            metastore,\n            storage_resolver,\n        }\n    }\n\n    pub fn metastore(&self) -> MetastoreServiceClient {\n        self.metastore.clone()\n    }\n\n    /// Creates an index from `IndexConfig`.\n    pub async fn create_index(\n        &mut self,\n        index_config: IndexConfig,\n        overwrite: bool,\n    ) -> Result<IndexMetadata, IndexServiceError> {\n        validate_storage_uri(&self.storage_resolver, &index_config)\n            .await\n            .map_err(IndexServiceError::InvalidConfig)?;\n\n        // Delete existing index if it exists.\n        if overwrite {\n            match self.delete_index(&index_config.index_id, false).await {\n                Ok(_)\n                | Err(IndexServiceError::Metastore(MetastoreError::NotFound(\n                    EntityKind::Index { .. 
},\n                ))) => {\n                    // Ignore index not found error.\n                }\n                Err(error) => {\n                    return Err(error);\n                }\n            }\n        }\n        let metastore = self.metastore.clone();\n\n        let index_config_json = serde_utils::to_json_str(&index_config)?;\n\n        // Add default sources.\n        let source_configs_json = vec![\n            serde_utils::to_json_str(&SourceConfig::ingest_api_default())?,\n            serde_utils::to_json_str(&SourceConfig::ingest_v2())?,\n            serde_utils::to_json_str(&SourceConfig::cli())?,\n        ];\n        let create_index_request = CreateIndexRequest {\n            index_config_json,\n            source_configs_json,\n        };\n        let create_index_response = metastore.create_index(create_index_request).await?;\n        let index_metadata = create_index_response.deserialize_index_metadata()?;\n        Ok(index_metadata)\n    }\n\n    /// Returns the index metadata for the given index ID if it exists.\n    pub async fn index_metadata_opt(\n        &self,\n        index_metadata_request: IndexMetadataRequest,\n    ) -> Result<Option<IndexMetadata>, IndexServiceError> {\n        let index_metadata_response = self.metastore.index_metadata(index_metadata_request).await;\n        match index_metadata_response {\n            Ok(index_metadata_response) => {\n                let index_metadata = index_metadata_response.deserialize_index_metadata()?;\n                Ok(Some(index_metadata))\n            }\n            Err(MetastoreError::NotFound(_)) => Ok(None),\n            Err(error) => Err(IndexServiceError::Metastore(error)),\n        }\n    }\n\n    /// Updates an index with the given index config.\n    pub async fn update_index(\n        &self,\n        index_uid: IndexUid,\n        index_config: IndexConfig,\n    ) -> Result<IndexMetadata, IndexServiceError> {\n        let update_index_request = 
UpdateIndexRequest::try_from_updates(\n            index_uid,\n            &index_config.doc_mapping,\n            &index_config.indexing_settings,\n            &index_config.ingest_settings,\n            &index_config.search_settings,\n            &index_config.retention_policy_opt,\n        )?;\n        let update_index_response = self.metastore.update_index(update_index_request).await?;\n        let index_metadata = update_index_response.deserialize_index_metadata()?;\n        Ok(index_metadata)\n    }\n\n    /// Deletes the index specified with `index_id`.\n    /// This is equivalent to running `rm -rf <index path>` for a local index or\n    /// `aws s3 rm --recursive <index path>` for a remote Amazon S3 index.\n    ///\n    /// * `index_id` - The target index Id.\n    /// * `dry_run` - Should this only return a list of affected files without performing deletion.\n    pub async fn delete_index(\n        &mut self,\n        index_id: &str,\n        dry_run: bool,\n    ) -> Result<Vec<SplitInfo>, IndexServiceError> {\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        let index_metadata = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?;\n        let index_uid = index_metadata.index_uid.clone();\n        let index_uri = index_metadata.into_index_config().index_uri.clone();\n        let storage = self.storage_resolver.resolve(&index_uri).await?;\n\n        if dry_run {\n            let list_splits_request = ListSplitsRequest::try_from_index_uid(index_uid)?;\n            let splits_to_delete: Vec<SplitInfo> = self\n                .metastore\n                .list_splits(list_splits_request)\n                .await?\n                .collect_splits()\n                .await?\n                .into_iter()\n                .map(|split| split.split_metadata.as_split_info())\n                .collect();\n            
return Ok(splits_to_delete);\n        }\n        // Schedule staged and published splits for deletion.\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_states([SplitState::Staged, SplitState::Published]);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n        let split_ids: Vec<SplitId> = self\n            .metastore\n            .list_splits(list_splits_request)\n            .await?\n            .collect_split_ids()\n            .await?;\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), split_ids);\n        self.metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await?;\n\n        // Select the splits to delete.\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::MarkedForDeletion);\n        let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n        let splits_metadata_to_delete: Vec<SplitMetadata> = self\n            .metastore\n            .list_splits(list_splits_request)\n            .await?\n            .collect_splits_metadata()\n            .await?;\n\n        let deleted_splits = delete_splits_from_storage_and_metastore(\n            index_uid.clone(),\n            storage,\n            self.metastore.clone(),\n            splits_metadata_to_delete,\n            None,\n        )\n        .await?;\n        let delete_index_request = DeleteIndexRequest {\n            index_uid: Some(index_uid),\n        };\n        self.metastore.delete_index(delete_index_request).await?;\n\n        Ok(deleted_splits)\n    }\n\n    /// Deletes the indexes specified with `index_id_patterns`.\n    /// This is a wrapper around `delete_index` that deletes the matching indexes concurrently.\n    /// IDs containing `*` and the special `_all` ID are rejected.\n    ///\n    /// * `index_id_patterns` - The targeted index ID patterns.\n    /// * `dry_run` - Should this only return a 
list of affected files without performing deletion.\n    pub async fn delete_indexes(\n        &self,\n        index_id_patterns: Vec<String>,\n        ignore_missing: bool,\n        dry_run: bool,\n    ) -> Result<Vec<SplitInfo>, IndexServiceError> {\n        let list_indexes_metadatas_request = ListIndexesMetadataRequest {\n            index_id_patterns: index_id_patterns.to_owned(),\n        };\n        // disallow index_id patterns\n        for index_id_pattern in &index_id_patterns {\n            if index_id_pattern.contains('*') {\n                return Err(IndexServiceError::Metastore(\n                    MetastoreError::InvalidArgument {\n                        message: format!(\"index_id pattern {index_id_pattern} contains *\"),\n                    },\n                ));\n            }\n            if index_id_pattern == \"_all\" {\n                return Err(IndexServiceError::Metastore(\n                    MetastoreError::InvalidArgument {\n                        message: \"index_id pattern _all not supported\".to_string(),\n                    },\n                ));\n            }\n        }\n\n        let metastore = self.metastore.clone();\n        let indexes_metadata = metastore\n            .list_indexes_metadata(list_indexes_metadatas_request)\n            .await?\n            .deserialize_indexes_metadata()\n            .await?;\n\n        if !ignore_missing && indexes_metadata.len() != index_id_patterns.len() {\n            let found_index_ids: HashSet<&str> = indexes_metadata\n                .iter()\n                .map(|index_metadata| index_metadata.index_id())\n                .collect();\n            let missing_index_ids: Vec<String> = index_id_patterns\n                .iter()\n                .filter(|index_id| !found_index_ids.contains(index_id.as_str()))\n                .map(|index_id| index_id.to_string())\n                .collect_vec();\n            return Err(IndexServiceError::Metastore(MetastoreError::NotFound(\n       
         EntityKind::Indexes {\n                    index_ids: missing_index_ids.to_vec(),\n                },\n            )));\n        }\n        let index_ids = indexes_metadata\n            .iter()\n            .map(|index_metadata| index_metadata.index_id())\n            .collect_vec();\n        info!(index_ids = ?PrettySample::new(&index_ids, 5), \"delete indexes\");\n\n        // setup delete index tasks\n        let mut delete_index_tasks = Vec::new();\n        for index_id in index_ids {\n            let task = async move {\n                let result = self.clone().delete_index(index_id, dry_run).await;\n                (index_id, result)\n            };\n            delete_index_tasks.push(task);\n        }\n        let mut delete_responses: HashMap<String, Vec<SplitInfo>> = HashMap::new();\n        let mut delete_errors: HashMap<String, IndexServiceError> = HashMap::new();\n        let mut stream = futures::stream::iter(delete_index_tasks).buffer_unordered(5);\n        while let Some((index_id, delete_response)) = stream.next().await {\n            match delete_response {\n                Ok(split_infos) => {\n                    delete_responses.insert(index_id.to_string(), split_infos);\n                }\n                Err(error) => {\n                    delete_errors.insert(index_id.to_string(), error);\n                }\n            }\n        }\n\n        if delete_errors.is_empty() {\n            let mut concatenated_split_infos = Vec::new();\n            for (_, split_info_vec) in delete_responses.into_iter() {\n                concatenated_split_infos.extend(split_info_vec);\n            }\n            Ok(concatenated_split_infos)\n        } else {\n            Err(IndexServiceError::Metastore(MetastoreError::Internal {\n                message: format!(\"errors occurred when deleting indexes: {index_id_patterns:?}\"),\n                cause: format!(\"errors: {delete_errors:?}\\ndeleted indexes: {delete_responses:?}\"),\n            }))\n 
       }\n    }\n    /// Detects all dangling splits and associated files in the index and removes them.\n    ///\n    /// * `index_id` - The target index Id.\n    /// * `grace_period` - Threshold period after which a staged split can be garbage collected.\n    /// * `dry_run` - Should this only return a list of affected files without performing deletion.\n    pub async fn garbage_collect_index(\n        &mut self,\n        index_id: &str,\n        grace_period: Duration,\n        dry_run: bool,\n    ) -> anyhow::Result<SplitRemovalInfo> {\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        let index_metadata = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?;\n        let index_uid = index_metadata.index_uid.clone();\n        let index_config = index_metadata.into_index_config();\n        let storage = self\n            .storage_resolver\n            .resolve(&index_config.index_uri)\n            .await?;\n\n        let deleted_entries = run_garbage_collect(\n            [(index_uid, storage)].into_iter().collect(),\n            self.metastore.clone(),\n            grace_period,\n            // deletion_grace_period of zero, so that a cli call directly deletes splits after\n            // marking to be deleted.\n            Duration::ZERO,\n            dry_run,\n            None,\n            None,\n        )\n        .await?;\n\n        Ok(deleted_entries)\n    }\n\n    /// Clears the index by applying the following actions:\n    /// - mark all splits for deletion in the metastore.\n    /// - delete the files of all splits marked for deletion using garbage collection.\n    /// - delete the splits from the metastore.\n    /// - reset all the source checkpoints.\n    ///\n    /// * 
`index_id` - The target index Id.\n    pub async fn clear_index(&mut self, index_id: &str) -> Result<(), IndexServiceError> {\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        let index_metadata = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?;\n        let index_uid = index_metadata.index_uid.clone();\n        let storage = self\n            .storage_resolver\n            .resolve(index_metadata.index_uri())\n            .await?;\n        let list_splits_request = ListSplitsRequest::try_from_index_uid(index_uid.clone())?;\n        let splits_metadata: Vec<SplitMetadata> = self\n            .metastore\n            .list_splits(list_splits_request)\n            .await?\n            .collect_splits_metadata()\n            .await?;\n        let split_ids: Vec<SplitId> = splits_metadata\n            .iter()\n            .map(|split| split.split_id.to_string())\n            .collect();\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), split_ids.clone());\n        self.metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await?;\n        // FIXME: return an error.\n        if let Err(err) = delete_splits_from_storage_and_metastore(\n            index_uid.clone(),\n            storage,\n            self.metastore.clone(),\n            splits_metadata,\n            None,\n        )\n        .await\n        {\n            error!(metastore_endpoints=?self.metastore.endpoints(), index_id=%index_id, error=?err, \"failed to delete all the split files during garbage collection\");\n        }\n        for source_id in index_metadata.sources.keys() {\n            let reset_source_checkpoint_request = ResetSourceCheckpointRequest {\n                index_uid: Some(index_uid.clone()),\n    
            source_id: source_id.to_string(),\n            };\n            self.metastore\n                .reset_source_checkpoint(reset_source_checkpoint_request)\n                .await?;\n        }\n        Ok(())\n    }\n\n    /// Adds a source to an index identified by its UID.\n    pub async fn add_source(\n        &mut self,\n        index_uid: IndexUid,\n        source_config: SourceConfig,\n    ) -> Result<SourceConfig, IndexServiceError> {\n        let source_id = source_config.source_id.clone();\n        // This is a bit redundant, as SourceConfig deserialization also checks\n        // that the identifier is valid. However it authorizes the special\n        // private names internal to quickwit, so we do an extra check.\n        validate_identifier(\"source\", &source_id).map_err(|_| {\n            IndexServiceError::InvalidIdentifier(format!(\"invalid source ID: `{source_id}`\"))\n        })?;\n        check_source_connectivity(&self.storage_resolver, &source_config)\n            .await\n            .map_err(IndexServiceError::InvalidConfig)?;\n        let add_source_request =\n            AddSourceRequest::try_from_source_config(index_uid.clone(), &source_config)?;\n        self.metastore.add_source(add_source_request).await?;\n        info!(\n            \"source `{}` successfully created for index `{}`\",\n            source_id, index_uid.index_id,\n        );\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_uid.index_id);\n        let source = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?\n            .sources\n            .get(&source_id)\n            .ok_or_else(|| {\n                IndexServiceError::Internal(\n                    \"created source is not in index metadata, this should never happen\".to_string(),\n                )\n            })?\n            .clone();\n        Ok(source)\n    }\n\n    /// Updates 
a source from an index identified by its UID.\n    pub async fn update_source(\n        &mut self,\n        index_uid: IndexUid,\n        source_config: SourceConfig,\n    ) -> Result<SourceConfig, IndexServiceError> {\n        let source_id = source_config.source_id.clone();\n        check_source_connectivity(&self.storage_resolver, &source_config)\n            .await\n            .map_err(IndexServiceError::InvalidConfig)?;\n        let update_source_request =\n            UpdateSourceRequest::try_from_source_config(index_uid.clone(), &source_config)?;\n        self.metastore.update_source(update_source_request).await?;\n        info!(\n            \"source `{source_id}` successfully updated for index `{}`\",\n            index_uid.index_id\n        );\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_uid.index_id);\n        let source = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?\n            .sources\n            .get(&source_id)\n            .ok_or_else(|| {\n                IndexServiceError::Internal(\n                    \"updated source is not in index metadata, this should never happen\".to_string(),\n                )\n            })?\n            .clone();\n        Ok(source)\n    }\n\n    /// Returns the configuration of a source identified by its index ID and source ID.\n    pub async fn get_source(\n        &mut self,\n        index_id: &str,\n        source_id: &str,\n    ) -> Result<SourceConfig, IndexServiceError> {\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        let source_config = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?\n            .sources\n            .get(source_id)\n            .ok_or_else(|| {\n                IndexServiceError::Metastore(MetastoreError::NotFound(EntityKind::Source {\n                    index_id: 
index_id.to_string(),\n                    source_id: source_id.to_string(),\n                }))\n            })?\n            .clone();\n\n        Ok(source_config)\n    }\n}\n\n/// Clears the cache directory of a given source.\n///\n/// * `data_dir_path` - Path to directory where data (tmp data, splits kept for caching purpose) is\n///   persisted.\npub async fn clear_cache_directory(data_dir_path: &Path) -> anyhow::Result<()> {\n    let cache_directory_path = get_cache_directory_path(data_dir_path);\n    info!(path = %cache_directory_path.display(), \"clearing cache directory\");\n    empty_dir(&cache_directory_path).await?;\n    Ok(())\n}\n\n/// Validates the storage URI by effectively resolving it.\npub async fn validate_storage_uri(\n    storage_resolver: &StorageResolver,\n    index_config: &IndexConfig,\n) -> anyhow::Result<()> {\n    storage_resolver.resolve(&index_config.index_uri).await?;\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n\n    use quickwit_common::uri::Uri;\n    use quickwit_config::{\n        CLI_SOURCE_ID, INGEST_API_SOURCE_ID, INGEST_V2_SOURCE_ID, IndexConfig, RetentionPolicy,\n    };\n    use quickwit_metastore::{\n        MetastoreServiceExt, SplitMetadata, StageSplitsRequestExt, metastore_for_test,\n    };\n    use quickwit_proto::metastore::StageSplitsRequest;\n    use quickwit_storage::PutPayload;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_create_index() {\n        let metastore = metastore_for_test();\n        let storage_resolver = StorageResolver::for_test();\n        let mut index_service = IndexService::new(metastore.clone(), storage_resolver);\n        let index_id = \"test-index\";\n        let index_uri = \"ram://indexes/test-index\";\n        let index_config = IndexConfig::for_test(index_id, index_uri);\n        let index_metadata_0 = index_service\n            .create_index(index_config.clone(), false)\n            .await\n            .unwrap();\n        assert_eq!(index_metadata_0.index_id(), 
index_id);\n        assert_eq!(index_metadata_0.index_uri(), &index_uri);\n\n        assert_eq!(index_metadata_0.sources.len(), 3);\n        assert!(index_metadata_0.sources.contains_key(CLI_SOURCE_ID));\n        assert!(index_metadata_0.sources.contains_key(INGEST_API_SOURCE_ID));\n        assert!(index_metadata_0.sources.contains_key(INGEST_V2_SOURCE_ID));\n\n        assert!(\n            metastore\n                .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n                .await\n                .is_ok()\n        );\n\n        let error = index_service\n            .create_index(index_config.clone(), false)\n            .await\n            .unwrap_err();\n        let IndexServiceError::Metastore(inner_error) = error else {\n            panic!(\"expected `MetastoreError` variant, got {error:?}\")\n        };\n        assert!(\n            matches!(inner_error, MetastoreError::AlreadyExists(EntityKind::Index { index_id }) if index_id == index_metadata_0.index_id())\n        );\n\n        let index_metadata_1 = index_service\n            .create_index(index_config, true)\n            .await\n            .unwrap();\n        assert_eq!(index_metadata_1.index_id(), index_id);\n        assert_eq!(index_metadata_1.index_uri(), &index_uri);\n        assert!(index_metadata_0.index_uid != index_metadata_1.index_uid);\n    }\n\n    #[tokio::test]\n    async fn test_index_metadata_opt() {\n        let metastore = metastore_for_test();\n        let storage_resolver = StorageResolver::for_test();\n        let mut index_service = IndexService::new(metastore.clone(), storage_resolver);\n\n        let index_id = \"test-index\";\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        let index_metadata = index_service\n            .index_metadata_opt(index_metadata_request)\n            .await\n            .unwrap();\n        assert!(index_metadata.is_none());\n\n        let index_uri = 
\"ram://indexes/test-index\";\n        let index_config = IndexConfig::for_test(index_id, index_uri);\n        let index_uid = index_service\n            .create_index(index_config.clone(), false)\n            .await\n            .unwrap()\n            .index_uid;\n        let index_metadata_request = IndexMetadataRequest::for_index_uid(index_uid.clone());\n        let index_metadata = index_service\n            .index_metadata_opt(index_metadata_request)\n            .await\n            .unwrap()\n            .unwrap();\n        assert_eq!(index_metadata.index_uid, index_uid);\n    }\n\n    #[tokio::test]\n    async fn test_update_index() {\n        let metastore = metastore_for_test();\n        let storage_resolver = StorageResolver::for_test();\n        let mut index_service = IndexService::new(metastore.clone(), storage_resolver);\n\n        let index_id = \"test-index\";\n        let index_uri = \"ram://indexes/test-index\";\n        let mut index_config = IndexConfig::for_test(index_id, index_uri);\n        let index_uid = index_service\n            .create_index(index_config.clone(), false)\n            .await\n            .unwrap()\n            .index_uid;\n\n        let retention_policy = RetentionPolicy {\n            retention_period: \"42 hours\".to_string(),\n            evaluation_schedule: \"hourly\".to_string(),\n        };\n        index_config.retention_policy_opt = Some(retention_policy.clone());\n\n        let updated_index_metadata = index_service\n            .update_index(index_uid, index_config)\n            .await\n            .unwrap();\n        let updated_retention_policy = updated_index_metadata\n            .index_config\n            .retention_policy_opt\n            .unwrap();\n        assert_eq!(updated_retention_policy, retention_policy);\n    }\n\n    #[tokio::test]\n    async fn test_delete_index() {\n        let mut metastore = metastore_for_test();\n        let storage_resolver = StorageResolver::for_test();\n        let 
storage = storage_resolver\n            .resolve(&Uri::for_test(\"ram://indexes/test-index\"))\n            .await\n            .unwrap();\n        let mut index_service = IndexService::new(metastore.clone(), storage_resolver);\n        let index_id = \"test-index\";\n        let index_uri = \"ram://indexes/test-index\";\n        let index_config = IndexConfig::for_test(index_id, index_uri);\n        let index_uid = index_service\n            .create_index(index_config.clone(), false)\n            .await\n            .unwrap()\n            .index_uid;\n\n        let split_id = \"test-split\";\n        let split_metadata = SplitMetadata {\n            split_id: split_id.to_string(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            vec![split_metadata.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 1);\n\n        let split_path_str = format!(\"{split_id}.split\");\n        let split_path = Path::new(&split_path_str);\n        let payload: Box<dyn PutPayload> = Box::new(vec![0]);\n        storage.put(split_path, payload).await.unwrap();\n        assert!(storage.exists(split_path).await.unwrap());\n\n        let split_infos = index_service.delete_index(index_id, false).await.unwrap();\n        assert_eq!(split_infos.len(), 1);\n\n        assert!(!metastore.index_exists(index_id).await.unwrap());\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n         
   .collect_splits()\n            .await\n            .unwrap();\n        assert!(splits.is_empty());\n        assert!(!storage.exists(split_path).await.unwrap());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-index-management/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod garbage_collection;\nmod index;\n\npub use garbage_collection::{GcMetrics, run_garbage_collect};\npub use index::{IndexService, IndexServiceError, clear_cache_directory, validate_storage_uri};\n"
  },
  {
    "path": "quickwit/quickwit-indexing/Cargo.toml",
    "content": "[package]\nname = \"quickwit-indexing\"\ndescription = \"Indexing service implementation\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\narc-swap = { workspace = true }\nasync-compression = { workspace = true }\nasync-trait = { workspace = true }\naws-sdk-kinesis = { workspace = true, optional = true }\naws-sdk-sqs = { workspace = true, optional = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nfail = { workspace = true }\nflume = { workspace = true }\nfnv = { workspace = true }\nfutures = { workspace = true }\ngoogle-cloud-auth = { workspace = true, optional = true }\ngoogle-cloud-gax = { workspace = true, optional = true }\ngoogle-cloud-googleapis = { workspace = true, optional = true }\ngoogle-cloud-pubsub = { workspace = true, optional = true }\nitertools = { workspace = true }\nlibz-sys = { workspace = true, optional = true }\nonce_cell = { workspace = true }\noneshot = { workspace = true }\nopenssl = { workspace = true, optional = true }\npercent-encoding = { workspace = true }\npulsar = { workspace = true, optional = true }\nquickwit-query = { workspace = true }\nregex = { workspace = true }\nrdkafka = { workspace = true, optional = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntantivy = { workspace = true }\ntempfile = { workspace = true }\nthiserror = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\nulid = { workspace = true }\nutoipa = { workspace = true }\nvrl = { workspace = true, optional = true }\nwarp = { workspace = true, optional = true }\n\nquickwit-actors = { workspace = true }\nquickwit-aws = { workspace = true }\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { workspace 
= true }\nquickwit-directories = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-ingest = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-opentelemetry = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[features]\ngcp-pubsub = [\n  \"dep:google-cloud-auth\",\n  \"dep:google-cloud-gax\",\n  \"dep:google-cloud-googleapis\",\n  \"dep:google-cloud-pubsub\",\n]\ngcp-pubsub-emulator-tests = []\nkafka = [\"rdkafka\"]\nkafka-broker-tests = []\nkinesis = [\n  \"aws-sdk-kinesis\",\n  \"quickwit-aws/kinesis\",\n]\nkinesis-localstack-tests = []\npulsar = [\"dep:pulsar\"]\npulsar-broker-tests = []\nqueue-sources = []\nsqs = [\n  \"aws-sdk-sqs\",\n  \"queue-sources\",\n  \"quickwit-aws/sqs\",\n]\nsqs-test-helpers = [\"warp\"]\nsqs-localstack-tests = [\"sqs-test-helpers\"]\nvendored-kafka = [\n  \"kafka\",\n  \"libz-sys/static\",\n  \"openssl/vendored\",\n  \"rdkafka/gssapi-vendored\",\n]\nvendored-kafka-macos = [\"kafka\", \"libz-sys/static\", \"openssl/vendored\"]\ntestsuite = [\n  \"quickwit-actors/testsuite\",\n  \"quickwit-cluster/testsuite\",\n  \"quickwit-common/testsuite\",\n  \"quickwit-config/testsuite\",\n  \"quickwit-proto/testsuite\",\n  \"quickwit-storage/testsuite\"\n]\nvrl = [\"dep:vrl\", \"quickwit-config/vrl\"]\nci-test = []\n\n[dev-dependencies]\nbytes = { workspace = true }\ncriterion = { workspace = true, features = [\"async_tokio\"] }\nmockall = { workspace = true }\nproptest = { workspace = true }\nprost = { workspace = true }\nrand = { workspace = true }\nreqwest = { workspace = true }\ntempfile = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-cluster = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-doc-mapper = { workspace = true, features = 
[\"testsuite\"] }\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-ingest = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[[test]]\nname = \"failpoints\"\npath = \"failpoints/mod.rs\"\nrequired-features = [\"fail/failpoints\"]\n\n[[bench]]\nname = \"doc_process_vrl_bench\"\nharness = false\n\n[package.metadata.cargo-machete]\n# used to vendor/static build native dependencies\nignored = [\"libz-sys\", \"openssl\"]\n"
  },
  {
    "path": "quickwit/quickwit-indexing/README.md",
    "content": "```mermaid\nflowchart LR\n    subgraph Indexing pipeline\n        direction LR\n        publisher --inf--> source\n        source[Source] --10--> doc_processor\n        doc_processor[DocProcessor] --10--> indexer\n        indexer[Indexer] --1--> serializer\n        serializer[IndexSerializer] --1--> packager\n        packager[Packager] --0--> uploader\n        uploader[Uploader] --2--> sequencer\n        sequencer[Sequencer] --1--> publisher\n    end\n    subgraph Merge pipeline\n        direction LR\n        merge_downloader[MergeDownloader] --1--> merge_executor\n        merge_executor[MergeExecutor] --1--> merge_packager\n        merge_packager[MergePackager] --0--> merge_uploader\n        merge_uploader[MergeUploader] --inf--> merge_publisher\n    end\n    merge_planner[MergePlanner] --1--> merge_downloader\n    merge_publisher[MergePublisher] --1--> merge_planner\n    publisher[Publisher] --1--> merge_planner\n```\n"
  },
  {
    "path": "quickwit/quickwit-indexing/benches/data/bench_data.json",
    "content": "{\"id\":1,\"first_name\":\"Kearney\",\"last_name\":\"Paunsford\",\"email\":\"kpaunsford0@springer.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-07-19T21:48:45Z\"}\n{\"id\":2,\"first_name\":\"Robinia\",\"last_name\":\"Hapgood\",\"email\":\"rhapgood1@google.co.jp\",\"job\":\"Recruiter\",\"timestamp\":\"2022-05-28T01:40:07Z\"}\n{\"id\":3,\"first_name\":\"Patrizius\",\"last_name\":\"O'Henery\",\"email\":\"pohenery2@narod.ru\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-09-28T02:48:31Z\"}\n{\"id\":4,\"first_name\":\"Natalina\",\"last_name\":\"Jimeno\",\"email\":\"njimeno3@vimeo.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-10-17T07:06:14Z\"}\n{\"id\":5,\"first_name\":\"Jerrylee\",\"last_name\":\"Clemont\",\"email\":\"jclemont4@nbcnews.com\",\"job\":\"Geologist IV\",\"timestamp\":\"2022-03-16T11:22:02Z\"}\n{\"id\":6,\"first_name\":\"Alphonse\",\"last_name\":\"Andrejevic\",\"email\":\"aandrejevic5@csmonitor.com\",\"job\":\"Automation Specialist IV\",\"timestamp\":\"2022-09-25T16:06:30Z\"}\n{\"id\":7,\"first_name\":\"Jessamine\",\"last_name\":\"Sumshon\",\"email\":\"jsumshon6@buzzfeed.com\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-04-19T14:49:00Z\"}\n{\"id\":8,\"first_name\":\"Eloisa\",\"last_name\":\"Mowen\",\"email\":\"emowen7@mediafire.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-11-28T13:04:32Z\"}\n{\"id\":9,\"first_name\":\"Millie\",\"last_name\":\"Gooda\",\"email\":\"mgooda8@tinypic.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-07-24T06:56:27Z\"}\n{\"id\":10,\"first_name\":\"Tarrah\",\"last_name\":\"Crucitti\",\"email\":\"tcrucitti9@engadget.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-01-13T20:45:27Z\"}\n{\"id\":11,\"first_name\":\"Micaela\",\"last_name\":\"Giottini\",\"email\":\"mgiottinia@globo.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-10-17T12:49:19Z\"}\n{\"id\":12,\"first_name\":\"Shannah\",\"last_name\":\"Goodlet\",\"email\":\"sgoodletb@i2i.jp\",\"job\":\"Senior 
Financial Analyst\",\"timestamp\":\"2022-09-10T21:24:23Z\"}\n{\"id\":13,\"first_name\":\"Carley\",\"last_name\":\"Gloy\",\"email\":\"cgloyc@github.io\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-06-11T07:20:14Z\"}\n{\"id\":14,\"first_name\":\"Eba\",\"last_name\":\"Simionato\",\"email\":\"esimionatod@bigcartel.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-04-04T02:02:43Z\"}\n{\"id\":15,\"first_name\":\"Camey\",\"last_name\":\"Walklett\",\"email\":\"cwalklette@youku.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-11-21T10:34:51Z\"}\n{\"id\":16,\"first_name\":\"Leonore\",\"last_name\":\"Cowland\",\"email\":\"lcowlandf@arizona.edu\",\"job\":\"Administrative Officer\",\"timestamp\":\"2021-12-13T00:18:33Z\"}\n{\"id\":17,\"first_name\":\"Kit\",\"last_name\":\"Domenici\",\"email\":\"kdomenicig@icq.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-03-22T10:25:46Z\"}\n{\"id\":18,\"first_name\":\"Jewel\",\"last_name\":\"McGillacoell\",\"email\":\"jmcgillacoellh@sbwire.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-09-06T12:27:03Z\"}\n{\"id\":19,\"first_name\":\"Arabela\",\"last_name\":\"Lillicrop\",\"email\":\"alillicropi@tamu.edu\",\"job\":\"Biostatistician III\",\"timestamp\":\"2022-10-22T03:31:14Z\"}\n{\"id\":20,\"first_name\":\"Deborah\",\"last_name\":\"Ridd\",\"email\":\"driddj@issuu.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-07-26T09:31:38Z\"}\n{\"id\":21,\"first_name\":\"Cordula\",\"last_name\":\"Borthwick\",\"email\":\"cborthwickk@bloomberg.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-09-28T18:04:21Z\"}\n{\"id\":22,\"first_name\":\"Vincents\",\"last_name\":\"Fitzjohn\",\"email\":\"vfitzjohnl@a8.net\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-09-17T00:38:00Z\"}\n{\"id\":23,\"first_name\":\"Cam\",\"last_name\":\"Wadworth\",\"email\":\"cwadworthm@prweb.com\",\"job\":\"Developer 
IV\",\"timestamp\":\"2021-12-28T15:12:11Z\"}\n{\"id\":24,\"first_name\":\"Dennison\",\"last_name\":\"Hedlestone\",\"email\":\"dhedlestonen@so-net.ne.jp\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-05-18T20:46:28Z\"}\n{\"id\":25,\"first_name\":\"Ibby\",\"last_name\":\"Stetson\",\"email\":\"istetsono@aboutads.info\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-08-14T02:30:12Z\"}\n{\"id\":26,\"first_name\":\"Herc\",\"last_name\":\"Eivers\",\"email\":\"heiversp@bloglovin.com\",\"job\":\"Media Manager IV\",\"timestamp\":\"2022-03-05T13:51:08Z\"}\n{\"id\":27,\"first_name\":\"Christy\",\"last_name\":\"Brundrett\",\"email\":\"cbrundrettq@statcounter.com\",\"job\":\"Budget/Accounting Analyst II\",\"timestamp\":\"2022-03-08T03:10:55Z\"}\n{\"id\":28,\"first_name\":\"Tyler\",\"last_name\":\"Gregersen\",\"email\":\"tgregersenr@prnewswire.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-10-16T14:14:06Z\"}\n{\"id\":29,\"first_name\":\"Karole\",\"last_name\":\"Worvell\",\"email\":\"kworvells@fotki.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-06-10T13:21:11Z\"}\n{\"id\":30,\"first_name\":\"Lonnard\",\"last_name\":\"Myton\",\"email\":\"lmytont@unesco.org\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-07-25T06:32:06Z\"}\n{\"id\":31,\"first_name\":\"Elsey\",\"last_name\":\"Mingotti\",\"email\":\"emingottiu@flickr.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-09-06T23:11:13Z\"}\n{\"id\":32,\"first_name\":\"Coral\",\"last_name\":\"Roscoe\",\"email\":\"croscoev@thetimes.co.uk\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-03-17T08:08:12Z\"}\n{\"id\":33,\"first_name\":\"Clare\",\"last_name\":\"McErlaine\",\"email\":\"cmcerlainew@mozilla.org\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-11-06T13:36:06Z\"}\n{\"id\":34,\"first_name\":\"Johnny\",\"last_name\":\"Cattlemull\",\"email\":\"jcattlemullx@gravatar.com\",\"job\":\"Registered 
Nurse\",\"timestamp\":\"2022-02-22T12:56:06Z\"}\n{\"id\":35,\"first_name\":\"Hersch\",\"last_name\":\"Andreaccio\",\"email\":\"handreaccioy@hostgator.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-05-15T11:29:06Z\"}\n{\"id\":36,\"first_name\":\"Becky\",\"last_name\":\"Frentz\",\"email\":\"bfrentzz@psu.edu\",\"job\":\"Paralegal\",\"timestamp\":\"2022-04-15T18:31:28Z\"}\n{\"id\":37,\"first_name\":\"Katheryn\",\"last_name\":\"Gunbie\",\"email\":\"kgunbie10@feedburner.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-08-29T10:07:17Z\"}\n{\"id\":38,\"first_name\":\"Antonino\",\"last_name\":\"Reeder\",\"email\":\"areeder11@ted.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-02-16T14:25:49Z\"}\n{\"id\":39,\"first_name\":\"Meghan\",\"last_name\":\"Pladen\",\"email\":\"mpladen12@t.co\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-01-10T12:43:07Z\"}\n{\"id\":40,\"first_name\":\"Melloney\",\"last_name\":\"Nys\",\"email\":\"mnys13@wix.com\",\"job\":\"Professor\",\"timestamp\":\"2022-03-11T23:21:46Z\"}\n{\"id\":41,\"first_name\":\"Hilliard\",\"last_name\":\"McGilben\",\"email\":\"hmcgilben14@wunderground.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-10-28T14:13:52Z\"}\n{\"id\":42,\"first_name\":\"Olivero\",\"last_name\":\"Ladson\",\"email\":\"oladson15@aboutads.info\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-04-27T15:37:45Z\"}\n{\"id\":43,\"first_name\":\"Angelico\",\"last_name\":\"Cregeen\",\"email\":\"acregeen16@blog.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-04-24T19:18:44Z\"}\n{\"id\":44,\"first_name\":\"Redd\",\"last_name\":\"Lowseley\",\"email\":\"rlowseley17@cbsnews.com\",\"job\":\"Accounting Assistant I\",\"timestamp\":\"2022-05-06T20:20:58Z\"}\n{\"id\":45,\"first_name\":\"Ida\",\"last_name\":\"Colebrook\",\"email\":\"icolebrook18@prnewswire.com\",\"job\":\"Human Resources Assistant 
III\",\"timestamp\":\"2022-12-05T14:57:45Z\"}\n{\"id\":46,\"first_name\":\"Fritz\",\"last_name\":\"Corbert\",\"email\":\"fcorbert19@yelp.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-09-07T09:54:05Z\"}\n{\"id\":47,\"first_name\":\"Fleming\",\"last_name\":\"Woodeson\",\"email\":\"fwoodeson1a@yolasite.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-09-30T13:32:24Z\"}\n{\"id\":48,\"first_name\":\"Layney\",\"last_name\":\"Dispencer\",\"email\":\"ldispencer1b@bizjournals.com\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-09-22T19:15:54Z\"}\n{\"id\":49,\"first_name\":\"Chen\",\"last_name\":\"Glandfield\",\"email\":\"cglandfield1c@jiathis.com\",\"job\":\"Statistician III\",\"timestamp\":\"2022-09-14T17:26:49Z\"}\n{\"id\":50,\"first_name\":\"Maurise\",\"last_name\":\"Braunle\",\"email\":\"mbraunle1d@craigslist.org\",\"job\":\"Research Assistant II\",\"timestamp\":\"2022-12-01T06:05:42Z\"}\n{\"id\":51,\"first_name\":\"Nevin\",\"last_name\":\"McNeely\",\"email\":\"nmcneely1e@webnode.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2021-12-10T05:46:45Z\"}\n{\"id\":52,\"first_name\":\"Antonie\",\"last_name\":\"McBean\",\"email\":\"amcbean1f@nyu.edu\",\"job\":\"Design Engineer\",\"timestamp\":\"2021-12-18T14:27:47Z\"}\n{\"id\":53,\"first_name\":\"Alonzo\",\"last_name\":\"Jordin\",\"email\":\"ajordin1g@printfriendly.com\",\"job\":\"Administrative Assistant III\",\"timestamp\":\"2022-03-19T22:48:55Z\"}\n{\"id\":54,\"first_name\":\"Laverna\",\"last_name\":\"McCurtain\",\"email\":\"lmccurtain1h@guardian.co.uk\",\"job\":\"Software Test Engineer III\",\"timestamp\":\"2022-07-20T12:24:51Z\"}\n{\"id\":55,\"first_name\":\"Krishna\",\"last_name\":\"Gerrett\",\"email\":\"kgerrett1i@slashdot.org\",\"job\":\"Internal Auditor\",\"timestamp\":\"2021-12-19T12:20:00Z\"}\n{\"id\":56,\"first_name\":\"Jethro\",\"last_name\":\"Tomkys\",\"email\":\"jtomkys1j@auda.org.au\",\"job\":\"Automation Specialist 
III\",\"timestamp\":\"2022-06-07T13:54:51Z\"}\n{\"id\":57,\"first_name\":\"Blinni\",\"last_name\":\"Rumgay\",\"email\":\"brumgay1k@unicef.org\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-04-12T05:12:19Z\"}\n{\"id\":58,\"first_name\":\"Victoria\",\"last_name\":\"Booi\",\"email\":\"vbooi1l@forbes.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-10-14T00:31:26Z\"}\n{\"id\":59,\"first_name\":\"Hube\",\"last_name\":\"Sheers\",\"email\":\"hsheers1m@behance.net\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-05-30T04:44:02Z\"}\n{\"id\":60,\"first_name\":\"Trip\",\"last_name\":\"Twidle\",\"email\":\"ttwidle1n@npr.org\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-05-31T15:17:27Z\"}\n{\"id\":61,\"first_name\":\"Brigida\",\"last_name\":\"Simony\",\"email\":\"bsimony1o@about.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-02-07T21:30:51Z\"}\n{\"id\":62,\"first_name\":\"Raynard\",\"last_name\":\"Prati\",\"email\":\"rprati1p@digg.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-01-27T06:13:13Z\"}\n{\"id\":63,\"first_name\":\"Sylvan\",\"last_name\":\"Brookes\",\"email\":\"sbrookes1q@google.ca\",\"job\":\"Statistician II\",\"timestamp\":\"2022-10-28T04:52:16Z\"}\n{\"id\":64,\"first_name\":\"Adrienne\",\"last_name\":\"Geeve\",\"email\":\"ageeve1r@google.com.br\",\"job\":\"Social Worker\",\"timestamp\":\"2022-06-08T13:20:30Z\"}\n{\"id\":65,\"first_name\":\"Giorgia\",\"last_name\":\"Tuddenham\",\"email\":\"gtuddenham1s@nps.gov\",\"job\":\"Account Executive\",\"timestamp\":\"2022-07-11T14:53:45Z\"}\n{\"id\":66,\"first_name\":\"Flss\",\"last_name\":\"Ibel\",\"email\":\"fibel1t@bandcamp.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2021-12-13T18:56:38Z\"}\n{\"id\":67,\"first_name\":\"Rubina\",\"last_name\":\"Seymer\",\"email\":\"rseymer1u@ted.com\",\"job\":\"Senior 
Editor\",\"timestamp\":\"2022-03-25T08:18:09Z\"}\n{\"id\":68,\"first_name\":\"Torie\",\"last_name\":\"Shorton\",\"email\":\"tshorton1v@theglobeandmail.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-03-02T04:07:55Z\"}\n{\"id\":69,\"first_name\":\"Gale\",\"last_name\":\"Nealand\",\"email\":\"gnealand1w@usda.gov\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-09-22T04:08:11Z\"}\n{\"id\":70,\"first_name\":\"Berkeley\",\"last_name\":\"Riggey\",\"email\":\"briggey1x@thetimes.co.uk\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-08-28T05:05:07Z\"}\n{\"id\":71,\"first_name\":\"Noelani\",\"last_name\":\"Guilliland\",\"email\":\"nguilliland1y@ihg.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-06-02T22:38:23Z\"}\n{\"id\":72,\"first_name\":\"Patsy\",\"last_name\":\"Straniero\",\"email\":\"pstraniero1z@wisc.edu\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2021-12-27T05:29:04Z\"}\n{\"id\":73,\"first_name\":\"Trisha\",\"last_name\":\"D'Angeli\",\"email\":\"tdangeli20@networkadvertising.org\",\"job\":\"Account Executive\",\"timestamp\":\"2022-01-29T11:16:44Z\"}\n{\"id\":74,\"first_name\":\"Arlen\",\"last_name\":\"Matyja\",\"email\":\"amatyja21@51.la\",\"job\":\"Software Engineer I\",\"timestamp\":\"2022-07-25T17:48:22Z\"}\n{\"id\":75,\"first_name\":\"Garvey\",\"last_name\":\"East\",\"email\":\"geast22@github.io\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-01-31T04:47:31Z\"}\n{\"id\":76,\"first_name\":\"Pepillo\",\"last_name\":\"Stigers\",\"email\":\"pstigers23@va.gov\",\"job\":\"Web Designer II\",\"timestamp\":\"2022-04-03T01:00:18Z\"}\n{\"id\":77,\"first_name\":\"Channa\",\"last_name\":\"Johnke\",\"email\":\"cjohnke24@blogspot.com\",\"job\":\"Accounting Assistant 
II\",\"timestamp\":\"2022-05-20T23:59:39Z\"}\n{\"id\":78,\"first_name\":\"Marget\",\"last_name\":\"Hymer\",\"email\":\"mhymer25@cam.ac.uk\",\"job\":\"Paralegal\",\"timestamp\":\"2022-09-01T11:39:39Z\"}\n{\"id\":79,\"first_name\":\"Arthur\",\"last_name\":\"Leveridge\",\"email\":\"aleveridge26@bing.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2021-12-09T22:58:01Z\"}\n{\"id\":80,\"first_name\":\"Tiebout\",\"last_name\":\"Sharples\",\"email\":\"tsharples27@instagram.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-03-13T04:09:09Z\"}\n{\"id\":81,\"first_name\":\"Trixie\",\"last_name\":\"O'Mullaney\",\"email\":\"tomullaney28@redcross.org\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-09-03T22:54:46Z\"}\n{\"id\":82,\"first_name\":\"Karmen\",\"last_name\":\"Baline\",\"email\":\"kbaline29@uol.com.br\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-02-05T02:43:57Z\"}\n{\"id\":83,\"first_name\":\"Bonnee\",\"last_name\":\"Whorall\",\"email\":\"bwhorall2a@skype.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-11-11T11:49:13Z\"}\n{\"id\":84,\"first_name\":\"Tony\",\"last_name\":\"Slafford\",\"email\":\"tslafford2b@tripadvisor.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-08-28T11:45:10Z\"}\n{\"id\":85,\"first_name\":\"Janifer\",\"last_name\":\"Mixer\",\"email\":\"jmixer2c@pagesperso-orange.fr\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-07-25T14:56:19Z\"}\n{\"id\":86,\"first_name\":\"Haroun\",\"last_name\":\"Diddams\",\"email\":\"hdiddams2d@nasa.gov\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-07-13T04:39:35Z\"}\n{\"id\":87,\"first_name\":\"Kelcey\",\"last_name\":\"Fardell\",\"email\":\"kfardell2e@dropbox.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-11-18T00:20:54Z\"}\n{\"id\":88,\"first_name\":\"Ricard\",\"last_name\":\"Scotti\",\"email\":\"rscotti2f@live.com\",\"job\":\"Nurse 
Practicioner\",\"timestamp\":\"2022-06-09T19:59:33Z\"}\n{\"id\":89,\"first_name\":\"Noell\",\"last_name\":\"Tremathack\",\"email\":\"ntremathack2g@nih.gov\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-10-11T05:03:53Z\"}\n{\"id\":90,\"first_name\":\"Jorge\",\"last_name\":\"McIlhagga\",\"email\":\"jmcilhagga2h@eventbrite.com\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-04-03T20:52:05Z\"}\n{\"id\":91,\"first_name\":\"Clerc\",\"last_name\":\"Geraud\",\"email\":\"cgeraud2i@answers.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-06-11T22:00:49Z\"}\n{\"id\":92,\"first_name\":\"Waverley\",\"last_name\":\"Patnelli\",\"email\":\"wpatnelli2j@cyberchimps.com\",\"job\":\"Software Engineer III\",\"timestamp\":\"2022-08-31T13:44:36Z\"}\n{\"id\":93,\"first_name\":\"Angele\",\"last_name\":\"Kenzie\",\"email\":\"akenzie2k@vimeo.com\",\"job\":\"Database Administrator III\",\"timestamp\":\"2022-10-30T00:24:41Z\"}\n{\"id\":94,\"first_name\":\"Mikkel\",\"last_name\":\"Faveryear\",\"email\":\"mfaveryear2l@com.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-11-01T05:50:01Z\"}\n{\"id\":95,\"first_name\":\"Hadleigh\",\"last_name\":\"Selvey\",\"email\":\"hselvey2m@liveinternet.ru\",\"job\":\"Accounting Assistant IV\",\"timestamp\":\"2022-01-12T07:45:38Z\"}\n{\"id\":96,\"first_name\":\"Witty\",\"last_name\":\"Shapira\",\"email\":\"wshapira2n@behance.net\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-02-27T01:22:32Z\"}\n{\"id\":97,\"first_name\":\"Wynn\",\"last_name\":\"Tamplin\",\"email\":\"wtamplin2o@liveinternet.ru\",\"job\":\"General Manager\",\"timestamp\":\"2022-07-27T17:19:59Z\"}\n{\"id\":98,\"first_name\":\"Thacher\",\"last_name\":\"Greenhough\",\"email\":\"tgreenhough2p@pbs.org\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-03-23T20:34:04Z\"}\n{\"id\":99,\"first_name\":\"Gerladina\",\"last_name\":\"Kirby\",\"email\":\"gkirby2q@prweb.com\",\"job\":\"Web Designer 
III\",\"timestamp\":\"2022-08-12T13:15:37Z\"}\n{\"id\":100,\"first_name\":\"Leanna\",\"last_name\":\"Wallbutton\",\"email\":\"lwallbutton2r@independent.co.uk\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-11-19T12:42:02Z\"}\n{\"id\":101,\"first_name\":\"Sheela\",\"last_name\":\"Kepe\",\"email\":\"skepe2s@addtoany.com\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-08-02T10:51:10Z\"}\n{\"id\":102,\"first_name\":\"Clo\",\"last_name\":\"Ronan\",\"email\":\"cronan2t@scribd.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-09-13T21:04:16Z\"}\n{\"id\":103,\"first_name\":\"Lurlene\",\"last_name\":\"Adame\",\"email\":\"ladame2u@soup.io\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-08-19T05:41:48Z\"}\n{\"id\":104,\"first_name\":\"Pebrook\",\"last_name\":\"Balshaw\",\"email\":\"pbalshaw2v@who.int\",\"job\":\"Office Assistant IV\",\"timestamp\":\"2022-10-17T00:17:21Z\"}\n{\"id\":105,\"first_name\":\"Tadd\",\"last_name\":\"Monsey\",\"email\":\"tmonsey2w@spiegel.de\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-07-28T18:03:25Z\"}\n{\"id\":106,\"first_name\":\"Mireille\",\"last_name\":\"Milkin\",\"email\":\"mmilkin2x@theatlantic.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-09-22T05:28:48Z\"}\n{\"id\":107,\"first_name\":\"Bunni\",\"last_name\":\"Prowting\",\"email\":\"bprowting2y@t.co\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-03-15T18:58:57Z\"}\n{\"id\":108,\"first_name\":\"Tandie\",\"last_name\":\"Buddle\",\"email\":\"tbuddle2z@vk.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-05-13T10:10:46Z\"}\n{\"id\":109,\"first_name\":\"Sheryl\",\"last_name\":\"Blair\",\"email\":\"sblair30@discuz.net\",\"job\":\"Administrative Assistant 
III\",\"timestamp\":\"2022-08-16T15:44:54Z\"}\n{\"id\":110,\"first_name\":\"Daveen\",\"last_name\":\"Liebmann\",\"email\":\"dliebmann31@nationalgeographic.com\",\"job\":\"Professor\",\"timestamp\":\"2022-10-22T09:16:28Z\"}\n{\"id\":111,\"first_name\":\"Udall\",\"last_name\":\"Essel\",\"email\":\"uessel32@canalblog.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-07-09T11:08:07Z\"}\n{\"id\":112,\"first_name\":\"Nev\",\"last_name\":\"Cromarty\",\"email\":\"ncromarty33@army.mil\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-10-16T22:56:59Z\"}\n{\"id\":113,\"first_name\":\"Lara\",\"last_name\":\"Pundy\",\"email\":\"lpundy34@pbs.org\",\"job\":\"Environmental Tech\",\"timestamp\":\"2021-12-30T22:19:17Z\"}\n{\"id\":114,\"first_name\":\"Llywellyn\",\"last_name\":\"Stockman\",\"email\":\"lstockman35@senate.gov\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-07-10T22:41:56Z\"}\n{\"id\":115,\"first_name\":\"Ingrim\",\"last_name\":\"Arkow\",\"email\":\"iarkow36@jiathis.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-07-25T03:36:14Z\"}\n{\"id\":116,\"first_name\":\"Jamima\",\"last_name\":\"Hedley\",\"email\":\"jhedley37@epa.gov\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-07-08T02:00:55Z\"}\n{\"id\":117,\"first_name\":\"Kippie\",\"last_name\":\"Danilchev\",\"email\":\"kdanilchev38@reddit.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-03-26T23:52:43Z\"}\n{\"id\":118,\"first_name\":\"Dacie\",\"last_name\":\"Basnall\",\"email\":\"dbasnall39@cnn.com\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-11-04T17:52:54Z\"}\n{\"id\":119,\"first_name\":\"Frazer\",\"last_name\":\"McVeighty\",\"email\":\"fmcveighty3a@soundcloud.com\",\"job\":\"Research Assistant IV\",\"timestamp\":\"2021-12-16T11:04:03Z\"}\n{\"id\":120,\"first_name\":\"Dynah\",\"last_name\":\"Cecely\",\"email\":\"dcecely3b@icq.com\",\"job\":\"Research 
Associate\",\"timestamp\":\"2022-05-14T03:58:47Z\"}\n{\"id\":121,\"first_name\":\"Hermie\",\"last_name\":\"Conlaund\",\"email\":\"hconlaund3c@timesonline.co.uk\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-01-19T11:35:16Z\"}\n{\"id\":122,\"first_name\":\"Lindi\",\"last_name\":\"Newling\",\"email\":\"lnewling3d@gmpg.org\",\"job\":\"Programmer I\",\"timestamp\":\"2022-11-27T08:23:47Z\"}\n{\"id\":123,\"first_name\":\"Oralia\",\"last_name\":\"Ballefant\",\"email\":\"oballefant3e@wiley.com\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-11-01T06:10:30Z\"}\n{\"id\":124,\"first_name\":\"Ava\",\"last_name\":\"Monks\",\"email\":\"amonks3f@google.es\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-07-19T18:58:10Z\"}\n{\"id\":125,\"first_name\":\"Donnamarie\",\"last_name\":\"Tattersfield\",\"email\":\"dtattersfield3g@constantcontact.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-08-14T22:55:33Z\"}\n{\"id\":126,\"first_name\":\"Ketti\",\"last_name\":\"Peealess\",\"email\":\"kpeealess3h@woothemes.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-03-21T13:31:34Z\"}\n{\"id\":127,\"first_name\":\"Gerti\",\"last_name\":\"Pearlman\",\"email\":\"gpearlman3i@constantcontact.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-01-19T08:47:13Z\"}\n{\"id\":128,\"first_name\":\"Verna\",\"last_name\":\"Thynne\",\"email\":\"vthynne3j@opera.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-05-31T01:48:56Z\"}\n{\"id\":129,\"first_name\":\"Arlen\",\"last_name\":\"Coit\",\"email\":\"acoit3k@apache.org\",\"job\":\"Software Engineer IV\",\"timestamp\":\"2022-09-08T06:29:36Z\"}\n{\"id\":130,\"first_name\":\"Agna\",\"last_name\":\"Oliveti\",\"email\":\"aoliveti3l@imageshack.us\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-10-24T20:06:30Z\"}\n{\"id\":131,\"first_name\":\"Ellette\",\"last_name\":\"Aggott\",\"email\":\"eaggott3m@myspace.com\",\"job\":\"Structural 
Engineer\",\"timestamp\":\"2022-09-03T16:17:21Z\"}\n{\"id\":132,\"first_name\":\"Candice\",\"last_name\":\"Tembridge\",\"email\":\"ctembridge3n@csmonitor.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-10-24T09:30:41Z\"}\n{\"id\":133,\"first_name\":\"Vinnie\",\"last_name\":\"Duffie\",\"email\":\"vduffie3o@pcworld.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-02-04T17:03:25Z\"}\n{\"id\":134,\"first_name\":\"Fifine\",\"last_name\":\"O'Dogherty\",\"email\":\"fodogherty3p@behance.net\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-03-28T23:59:44Z\"}\n{\"id\":135,\"first_name\":\"Raine\",\"last_name\":\"Kerins\",\"email\":\"rkerins3q@economist.com\",\"job\":\"Database Administrator III\",\"timestamp\":\"2022-01-19T03:25:36Z\"}\n{\"id\":136,\"first_name\":\"Arabel\",\"last_name\":\"McUre\",\"email\":\"amcure3r@wired.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2021-12-23T21:06:54Z\"}\n{\"id\":137,\"first_name\":\"Cissy\",\"last_name\":\"Olligan\",\"email\":\"colligan3s@mysql.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2021-12-12T22:19:29Z\"}\n{\"id\":138,\"first_name\":\"Pierson\",\"last_name\":\"Kornilyev\",\"email\":\"pkornilyev3t@hp.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-01-08T23:17:36Z\"}\n{\"id\":139,\"first_name\":\"Eve\",\"last_name\":\"Bostock\",\"email\":\"ebostock3u@answers.com\",\"job\":\"Human Resources Assistant IV\",\"timestamp\":\"2022-04-09T22:00:21Z\"}\n{\"id\":140,\"first_name\":\"Franciskus\",\"last_name\":\"Bakesef\",\"email\":\"fbakesef3v@yellowpages.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-03-09T04:42:58Z\"}\n{\"id\":141,\"first_name\":\"Junette\",\"last_name\":\"Bathersby\",\"email\":\"jbathersby3w@baidu.com\",\"job\":\"Programmer Analyst I\",\"timestamp\":\"2022-05-08T16:10:27Z\"}\n{\"id\":142,\"first_name\":\"Lyda\",\"last_name\":\"Marchi\",\"email\":\"lmarchi3x@digg.com\",\"job\":\"Media Manager 
I\",\"timestamp\":\"2022-11-27T04:28:38Z\"}\n{\"id\":143,\"first_name\":\"Alasdair\",\"last_name\":\"Kahler\",\"email\":\"akahler3y@wisc.edu\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-02-07T13:51:07Z\"}\n{\"id\":144,\"first_name\":\"Nessie\",\"last_name\":\"Stockdale\",\"email\":\"nstockdale3z@vinaora.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2021-12-11T07:04:34Z\"}\n{\"id\":145,\"first_name\":\"Bastien\",\"last_name\":\"Syalvester\",\"email\":\"bsyalvester40@nationalgeographic.com\",\"job\":\"Administrative Assistant III\",\"timestamp\":\"2022-11-06T09:08:18Z\"}\n{\"id\":146,\"first_name\":\"Ranique\",\"last_name\":\"Youson\",\"email\":\"ryouson41@simplemachines.org\",\"job\":\"Nurse\",\"timestamp\":\"2022-03-05T23:16:26Z\"}\n{\"id\":147,\"first_name\":\"Ruddy\",\"last_name\":\"Koop\",\"email\":\"rkoop42@webnode.com\",\"job\":\"Programmer IV\",\"timestamp\":\"2022-06-23T11:31:40Z\"}\n{\"id\":148,\"first_name\":\"Midge\",\"last_name\":\"Trengove\",\"email\":\"mtrengove43@lulu.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-05-14T08:08:26Z\"}\n{\"id\":149,\"first_name\":\"Hally\",\"last_name\":\"Pettendrich\",\"email\":\"hpettendrich44@washington.edu\",\"job\":\"Operator\",\"timestamp\":\"2022-01-17T00:32:17Z\"}\n{\"id\":150,\"first_name\":\"Elfrieda\",\"last_name\":\"Sorey\",\"email\":\"esorey45@hc360.com\",\"job\":\"Operator\",\"timestamp\":\"2022-03-25T05:03:24Z\"}\n{\"id\":151,\"first_name\":\"Arnuad\",\"last_name\":\"Cridlin\",\"email\":\"acridlin46@yandex.ru\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-07-23T22:58:32Z\"}\n{\"id\":152,\"first_name\":\"Cati\",\"last_name\":\"Dunkersley\",\"email\":\"cdunkersley47@umich.edu\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-09-09T12:53:00Z\"}\n{\"id\":153,\"first_name\":\"Phillis\",\"last_name\":\"Hollow\",\"email\":\"phollow48@printfriendly.com\",\"job\":\"Assistant 
Manager\",\"timestamp\":\"2022-07-18T16:51:18Z\"}\n{\"id\":154,\"first_name\":\"Annadiana\",\"last_name\":\"Stovold\",\"email\":\"astovold49@uiuc.edu\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-01-29T17:31:28Z\"}\n{\"id\":155,\"first_name\":\"Shurwood\",\"last_name\":\"Jurewicz\",\"email\":\"sjurewicz4a@bloglovin.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-02-05T22:03:30Z\"}\n{\"id\":156,\"first_name\":\"Sibelle\",\"last_name\":\"Wordesworth\",\"email\":\"swordesworth4b@msn.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-05-26T04:44:42Z\"}\n{\"id\":157,\"first_name\":\"Sandy\",\"last_name\":\"Bau\",\"email\":\"sbau4c@tinyurl.com\",\"job\":\"Professor\",\"timestamp\":\"2022-07-16T21:47:35Z\"}\n{\"id\":158,\"first_name\":\"Moise\",\"last_name\":\"Habens\",\"email\":\"mhabens4d@nationalgeographic.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-03-01T13:48:25Z\"}\n{\"id\":159,\"first_name\":\"Hoyt\",\"last_name\":\"Measom\",\"email\":\"hmeasom4e@github.io\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-05-10T17:51:28Z\"}\n{\"id\":160,\"first_name\":\"Skell\",\"last_name\":\"Siene\",\"email\":\"ssiene4f@networkadvertising.org\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-10-20T12:11:04Z\"}\n{\"id\":161,\"first_name\":\"Abbey\",\"last_name\":\"Sainte Paul\",\"email\":\"asaintepaul4g@howstuffworks.com\",\"job\":\"Systems Administrator III\",\"timestamp\":\"2022-03-04T12:17:58Z\"}\n{\"id\":162,\"first_name\":\"Adriana\",\"last_name\":\"Mault\",\"email\":\"amault4h@ted.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2021-12-15T16:58:31Z\"}\n{\"id\":163,\"first_name\":\"Nerty\",\"last_name\":\"Cullin\",\"email\":\"ncullin4i@geocities.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-06-02T04:14:19Z\"}\n{\"id\":164,\"first_name\":\"Monroe\",\"last_name\":\"Conlon\",\"email\":\"mconlon4j@cnn.com\",\"job\":\"Web Designer 
I\",\"timestamp\":\"2022-07-05T16:54:18Z\"}\n{\"id\":165,\"first_name\":\"Rena\",\"last_name\":\"Penticost\",\"email\":\"rpenticost4k@spiegel.de\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2021-12-15T19:04:42Z\"}\n{\"id\":166,\"first_name\":\"Letisha\",\"last_name\":\"Kitchingman\",\"email\":\"lkitchingman4l@sphinn.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-12-05T04:33:23Z\"}\n{\"id\":167,\"first_name\":\"Denney\",\"last_name\":\"Playdon\",\"email\":\"dplaydon4m@google.cn\",\"job\":\"Systems Administrator IV\",\"timestamp\":\"2022-06-17T08:30:54Z\"}\n{\"id\":168,\"first_name\":\"Aprilette\",\"last_name\":\"Ruffles\",\"email\":\"aruffles4n@meetup.com\",\"job\":\"Budget/Accounting Analyst III\",\"timestamp\":\"2022-01-13T19:32:29Z\"}\n{\"id\":169,\"first_name\":\"Rosalie\",\"last_name\":\"Strutz\",\"email\":\"rstrutz4o@guardian.co.uk\",\"job\":\"Teacher\",\"timestamp\":\"2022-02-21T22:44:51Z\"}\n{\"id\":170,\"first_name\":\"Paxon\",\"last_name\":\"Snoden\",\"email\":\"psnoden4p@examiner.com\",\"job\":\"Biostatistician II\",\"timestamp\":\"2022-01-01T02:59:32Z\"}\n{\"id\":171,\"first_name\":\"Son\",\"last_name\":\"Clifforth\",\"email\":\"sclifforth4q@wordpress.org\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-07-01T02:10:25Z\"}\n{\"id\":172,\"first_name\":\"Pebrook\",\"last_name\":\"Rollinshaw\",\"email\":\"prollinshaw4r@gizmodo.com\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-11-29T05:24:36Z\"}\n{\"id\":173,\"first_name\":\"Harrison\",\"last_name\":\"Steade\",\"email\":\"hsteade4s@creativecommons.org\",\"job\":\"Web Developer II\",\"timestamp\":\"2022-10-18T08:29:27Z\"}\n{\"id\":174,\"first_name\":\"Ardra\",\"last_name\":\"MacConnal\",\"email\":\"amacconnal4t@telegraph.co.uk\",\"job\":\"Systems Administrator IV\",\"timestamp\":\"2022-07-04T09:04:39Z\"}\n{\"id\":175,\"first_name\":\"Donnajean\",\"last_name\":\"Carabine\",\"email\":\"dcarabine4u@de.vu\",\"job\":\"Quality Control 
Specialist\",\"timestamp\":\"2022-04-30T00:46:50Z\"}\n{\"id\":176,\"first_name\":\"Jamey\",\"last_name\":\"MacLardie\",\"email\":\"jmaclardie4v@hp.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-07-16T17:25:21Z\"}\n{\"id\":177,\"first_name\":\"Jarrad\",\"last_name\":\"Stockall\",\"email\":\"jstockall4w@google.cn\",\"job\":\"Actuary\",\"timestamp\":\"2022-08-21T17:39:00Z\"}\n{\"id\":178,\"first_name\":\"Eolande\",\"last_name\":\"Tchir\",\"email\":\"etchir4x@amazon.co.uk\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-10-31T17:01:24Z\"}\n{\"id\":179,\"first_name\":\"Carilyn\",\"last_name\":\"Bindon\",\"email\":\"cbindon4y@t.co\",\"job\":\"Engineer III\",\"timestamp\":\"2022-11-27T12:30:07Z\"}\n{\"id\":180,\"first_name\":\"Lenore\",\"last_name\":\"Davidescu\",\"email\":\"ldavidescu4z@nytimes.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-09-18T15:32:28Z\"}\n{\"id\":181,\"first_name\":\"Catherina\",\"last_name\":\"Gowdie\",\"email\":\"cgowdie50@smh.com.au\",\"job\":\"Statistician IV\",\"timestamp\":\"2022-09-23T15:13:19Z\"}\n{\"id\":182,\"first_name\":\"Reena\",\"last_name\":\"Elgram\",\"email\":\"relgram51@etsy.com\",\"job\":\"Geologist IV\",\"timestamp\":\"2022-09-10T05:24:05Z\"}\n{\"id\":183,\"first_name\":\"Vergil\",\"last_name\":\"Saice\",\"email\":\"vsaice52@domainmarket.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-04-01T12:59:25Z\"}\n{\"id\":184,\"first_name\":\"Brose\",\"last_name\":\"Titterell\",\"email\":\"btitterell53@arizona.edu\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-08-01T18:28:09Z\"}\n{\"id\":185,\"first_name\":\"Waiter\",\"last_name\":\"Slimon\",\"email\":\"wslimon54@tuttocitta.it\",\"job\":\"Research Associate\",\"timestamp\":\"2022-08-04T22:18:40Z\"}\n{\"id\":186,\"first_name\":\"Dawna\",\"last_name\":\"Avard\",\"email\":\"davard55@nhs.uk\",\"job\":\"Electrical 
Engineer\",\"timestamp\":\"2021-12-08T08:41:45Z\"}\n{\"id\":187,\"first_name\":\"Geoff\",\"last_name\":\"Erickson\",\"email\":\"gerickson56@meetup.com\",\"job\":\"Research Assistant I\",\"timestamp\":\"2022-07-29T02:19:03Z\"}\n{\"id\":188,\"first_name\":\"Orsa\",\"last_name\":\"Wapples\",\"email\":\"owapples57@unesco.org\",\"job\":\"Engineer IV\",\"timestamp\":\"2022-08-15T23:47:42Z\"}\n{\"id\":189,\"first_name\":\"Lana\",\"last_name\":\"Rawlin\",\"email\":\"lrawlin58@surveymonkey.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-02-15T21:01:49Z\"}\n{\"id\":190,\"first_name\":\"Nydia\",\"last_name\":\"Minet\",\"email\":\"nminet59@umn.edu\",\"job\":\"Engineer III\",\"timestamp\":\"2022-10-24T09:35:38Z\"}\n{\"id\":191,\"first_name\":\"Eadith\",\"last_name\":\"Kornes\",\"email\":\"ekornes5a@deliciousdays.com\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2022-03-25T15:57:40Z\"}\n{\"id\":192,\"first_name\":\"Kessia\",\"last_name\":\"Spavins\",\"email\":\"kspavins5b@mtv.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-07-29T06:08:11Z\"}\n{\"id\":193,\"first_name\":\"Lana\",\"last_name\":\"Van Hesteren\",\"email\":\"lvanhesteren5c@booking.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-03-16T16:39:52Z\"}\n{\"id\":194,\"first_name\":\"Sol\",\"last_name\":\"McRill\",\"email\":\"smcrill5d@buzzfeed.com\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-03-31T19:40:50Z\"}\n{\"id\":195,\"first_name\":\"Layla\",\"last_name\":\"Melling\",\"email\":\"lmelling5e@skyrock.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-05-19T05:55:38Z\"}\n{\"id\":196,\"first_name\":\"Erhard\",\"last_name\":\"Hendrik\",\"email\":\"ehendrik5f@smugmug.com\",\"job\":\"VP Sales\",\"timestamp\":\"2021-12-26T20:22:51Z\"}\n{\"id\":197,\"first_name\":\"Ailbert\",\"last_name\":\"Quarles\",\"email\":\"aquarles5g@seattletimes.com\",\"job\":\"Speech 
Pathologist\",\"timestamp\":\"2022-10-09T04:40:12Z\"}\n{\"id\":198,\"first_name\":\"Tabby\",\"last_name\":\"Colt\",\"email\":\"tcolt5h@theglobeandmail.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-02-01T04:06:23Z\"}\n{\"id\":199,\"first_name\":\"Andee\",\"last_name\":\"Longbone\",\"email\":\"alongbone5i@istockphoto.com\",\"job\":\"Budget/Accounting Analyst II\",\"timestamp\":\"2022-05-10T17:39:08Z\"}\n{\"id\":200,\"first_name\":\"Marlowe\",\"last_name\":\"Camsey\",\"email\":\"mcamsey5j@earthlink.net\",\"job\":\"Recruiter\",\"timestamp\":\"2022-01-12T17:12:21Z\"}\n{\"id\":201,\"first_name\":\"Gregorius\",\"last_name\":\"Mincini\",\"email\":\"gmincini5k@sakura.ne.jp\",\"job\":\"Automation Specialist II\",\"timestamp\":\"2022-08-02T17:15:49Z\"}\n{\"id\":202,\"first_name\":\"Kath\",\"last_name\":\"Minci\",\"email\":\"kminci5l@mozilla.org\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-06-14T16:37:46Z\"}\n{\"id\":203,\"first_name\":\"Harmonie\",\"last_name\":\"Dorricott\",\"email\":\"hdorricott5m@pagesperso-orange.fr\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-02-28T08:11:46Z\"}\n{\"id\":204,\"first_name\":\"Isidro\",\"last_name\":\"Hillum\",\"email\":\"ihillum5n@telegraph.co.uk\",\"job\":\"Account Representative II\",\"timestamp\":\"2022-06-01T20:52:51Z\"}\n{\"id\":205,\"first_name\":\"Gabie\",\"last_name\":\"McEniry\",\"email\":\"gmceniry5o@wikispaces.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-01-05T15:30:32Z\"}\n{\"id\":206,\"first_name\":\"Wynne\",\"last_name\":\"Amorts\",\"email\":\"wamorts5p@thetimes.co.uk\",\"job\":\"Editor\",\"timestamp\":\"2022-05-13T00:02:48Z\"}\n{\"id\":207,\"first_name\":\"Timothee\",\"last_name\":\"O'Finan\",\"email\":\"tofinan5q@tripod.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-01-08T21:30:02Z\"}\n{\"id\":208,\"first_name\":\"Alfreda\",\"last_name\":\"Kures\",\"email\":\"akures5r@hhs.gov\",\"job\":\"Automation Specialist 
II\",\"timestamp\":\"2022-04-11T12:16:06Z\"}\n{\"id\":209,\"first_name\":\"Diannne\",\"last_name\":\"Tiesman\",\"email\":\"dtiesman5s@pen.io\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-07-06T01:41:13Z\"}\n{\"id\":210,\"first_name\":\"Leisha\",\"last_name\":\"Peasee\",\"email\":\"lpeasee5t@bloomberg.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-10-22T08:35:54Z\"}\n{\"id\":211,\"first_name\":\"Demetri\",\"last_name\":\"Broom\",\"email\":\"dbroom5u@oaic.gov.au\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-12-06T07:28:25Z\"}\n{\"id\":212,\"first_name\":\"Simone\",\"last_name\":\"Chisholm\",\"email\":\"schisholm5v@go.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-09-05T09:33:00Z\"}\n{\"id\":213,\"first_name\":\"Siobhan\",\"last_name\":\"Ironmonger\",\"email\":\"sironmonger5w@mit.edu\",\"job\":\"Database Administrator I\",\"timestamp\":\"2022-08-29T19:18:51Z\"}\n{\"id\":214,\"first_name\":\"Candra\",\"last_name\":\"Tern\",\"email\":\"ctern5x@hud.gov\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2021-12-30T13:10:59Z\"}\n{\"id\":215,\"first_name\":\"Charla\",\"last_name\":\"Bearward\",\"email\":\"cbearward5y@dyndns.org\",\"job\":\"Paralegal\",\"timestamp\":\"2022-11-14T00:49:58Z\"}\n{\"id\":216,\"first_name\":\"Flynn\",\"last_name\":\"Waring\",\"email\":\"fwaring5z@disqus.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-06-12T12:06:46Z\"}\n{\"id\":217,\"first_name\":\"Laetitia\",\"last_name\":\"Haggleton\",\"email\":\"lhaggleton60@reverbnation.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2021-12-16T04:43:32Z\"}\n{\"id\":218,\"first_name\":\"Robin\",\"last_name\":\"Garritley\",\"email\":\"rgarritley61@hexun.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-02-11T00:24:01Z\"}\n{\"id\":219,\"first_name\":\"Rosie\",\"last_name\":\"Ladyman\",\"email\":\"rladyman62@mashable.com\",\"job\":\"Structural 
Engineer\",\"timestamp\":\"2022-08-07T13:55:28Z\"}\n{\"id\":220,\"first_name\":\"Asia\",\"last_name\":\"Ellerman\",\"email\":\"aellerman63@arstechnica.com\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-01-23T06:14:00Z\"}\n{\"id\":221,\"first_name\":\"Christye\",\"last_name\":\"McWhan\",\"email\":\"cmcwhan64@github.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-05-24T01:03:07Z\"}\n{\"id\":222,\"first_name\":\"Johanna\",\"last_name\":\"Khotler\",\"email\":\"jkhotler65@thetimes.co.uk\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-04-23T20:21:02Z\"}\n{\"id\":223,\"first_name\":\"Angil\",\"last_name\":\"Carress\",\"email\":\"acarress66@i2i.jp\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-03-19T06:35:10Z\"}\n{\"id\":224,\"first_name\":\"Joyce\",\"last_name\":\"Beaglehole\",\"email\":\"jbeaglehole67@sfgate.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2021-12-13T13:06:10Z\"}\n{\"id\":225,\"first_name\":\"Pip\",\"last_name\":\"Escudier\",\"email\":\"pescudier68@globo.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2021-12-16T23:35:43Z\"}\n{\"id\":226,\"first_name\":\"Isadore\",\"last_name\":\"O'Longain\",\"email\":\"iolongain69@sbwire.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-08-06T10:15:09Z\"}\n{\"id\":227,\"first_name\":\"Gilburt\",\"last_name\":\"Bowbrick\",\"email\":\"gbowbrick6a@ezinearticles.com\",\"job\":\"Safety Technician III\",\"timestamp\":\"2022-01-16T04:22:33Z\"}\n{\"id\":228,\"first_name\":\"Renault\",\"last_name\":\"Frammingham\",\"email\":\"rframmingham6b@ox.ac.uk\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-09-02T13:10:17Z\"}\n{\"id\":229,\"first_name\":\"Tam\",\"last_name\":\"Nangle\",\"email\":\"tnangle6c@t-online.de\",\"job\":\"Professor\",\"timestamp\":\"2022-04-16T21:51:58Z\"}\n{\"id\":230,\"first_name\":\"Ardelle\",\"last_name\":\"Coultous\",\"email\":\"acoultous6d@wix.com\",\"job\":\"Statistician 
I\",\"timestamp\":\"2022-11-27T05:43:01Z\"}\n{\"id\":231,\"first_name\":\"Demetra\",\"last_name\":\"Mabson\",\"email\":\"dmabson6e@sciencedirect.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-01-09T08:43:12Z\"}\n{\"id\":232,\"first_name\":\"Avis\",\"last_name\":\"Laverenz\",\"email\":\"alaverenz6f@wikia.com\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2022-03-12T02:41:08Z\"}\n{\"id\":233,\"first_name\":\"Scarface\",\"last_name\":\"Gurnett\",\"email\":\"sgurnett6g@bigcartel.com\",\"job\":\"Accounting Assistant I\",\"timestamp\":\"2022-11-30T00:57:31Z\"}\n{\"id\":234,\"first_name\":\"Hermon\",\"last_name\":\"Overil\",\"email\":\"hoveril6h@yolasite.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-09-08T19:07:46Z\"}\n{\"id\":235,\"first_name\":\"Eduard\",\"last_name\":\"Nasey\",\"email\":\"enasey6i@buzzfeed.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-03-09T00:20:19Z\"}\n{\"id\":236,\"first_name\":\"Elle\",\"last_name\":\"Golt\",\"email\":\"egolt6j@dion.ne.jp\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-08-20T16:30:40Z\"}\n{\"id\":237,\"first_name\":\"Gwyn\",\"last_name\":\"Asaaf\",\"email\":\"gasaaf6k@webnode.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-02-02T03:45:52Z\"}\n{\"id\":238,\"first_name\":\"Cullie\",\"last_name\":\"Pala\",\"email\":\"cpala6l@google.co.jp\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-10-17T10:54:32Z\"}\n{\"id\":239,\"first_name\":\"Torie\",\"last_name\":\"Drinkall\",\"email\":\"tdrinkall6m@go.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-01-07T19:42:33Z\"}\n{\"id\":240,\"first_name\":\"Petronella\",\"last_name\":\"Reimer\",\"email\":\"preimer6n@creativecommons.org\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-11-15T22:51:02Z\"}\n{\"id\":241,\"first_name\":\"Dun\",\"last_name\":\"Gamett\",\"email\":\"dgamett6o@yahoo.com\",\"job\":\"Tax 
Accountant\",\"timestamp\":\"2022-09-24T13:16:14Z\"}\n{\"id\":242,\"first_name\":\"Fritz\",\"last_name\":\"Jeannet\",\"email\":\"fjeannet6p@wunderground.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-04-23T04:52:39Z\"}\n{\"id\":243,\"first_name\":\"Joby\",\"last_name\":\"Gouny\",\"email\":\"jgouny6q@last.fm\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-04-11T10:01:24Z\"}\n{\"id\":244,\"first_name\":\"Priscilla\",\"last_name\":\"Hagard\",\"email\":\"phagard6r@blog.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-10-24T01:59:01Z\"}\n{\"id\":245,\"first_name\":\"Chadd\",\"last_name\":\"Runnett\",\"email\":\"crunnett6s@irs.gov\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-10-19T14:04:54Z\"}\n{\"id\":246,\"first_name\":\"Pansy\",\"last_name\":\"Coan\",\"email\":\"pcoan6t@4shared.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-07-26T13:19:39Z\"}\n{\"id\":247,\"first_name\":\"Bobby\",\"last_name\":\"Bothbie\",\"email\":\"bbothbie6u@youku.com\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-08-09T02:08:24Z\"}\n{\"id\":248,\"first_name\":\"Deidre\",\"last_name\":\"Guillart\",\"email\":\"dguillart6v@rakuten.co.jp\",\"job\":\"Statistician III\",\"timestamp\":\"2022-06-25T03:36:25Z\"}\n{\"id\":249,\"first_name\":\"Corine\",\"last_name\":\"Garnham\",\"email\":\"cgarnham6w@typepad.com\",\"job\":\"Editor\",\"timestamp\":\"2022-09-19T02:45:29Z\"}\n{\"id\":250,\"first_name\":\"Ag\",\"last_name\":\"Franiak\",\"email\":\"afraniak6x@alibaba.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-07-28T04:44:05Z\"}\n{\"id\":251,\"first_name\":\"Ben\",\"last_name\":\"Scramage\",\"email\":\"bscramage6y@sina.com.cn\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-09-12T04:05:39Z\"}\n{\"id\":252,\"first_name\":\"Edlin\",\"last_name\":\"Bishell\",\"email\":\"ebishell6z@de.vu\",\"job\":\"Tax 
Accountant\",\"timestamp\":\"2022-03-31T23:36:47Z\"}\n{\"id\":253,\"first_name\":\"Opaline\",\"last_name\":\"Soden\",\"email\":\"osoden70@utexas.edu\",\"job\":\"Project Manager\",\"timestamp\":\"2022-01-22T23:13:02Z\"}\n{\"id\":254,\"first_name\":\"Meredithe\",\"last_name\":\"Hiscocks\",\"email\":\"mhiscocks71@reuters.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-05-19T10:29:21Z\"}\n{\"id\":255,\"first_name\":\"Georgeanne\",\"last_name\":\"Donhardt\",\"email\":\"gdonhardt72@tamu.edu\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-06-21T07:31:28Z\"}\n{\"id\":256,\"first_name\":\"Bridgette\",\"last_name\":\"Obin\",\"email\":\"bobin73@hubpages.com\",\"job\":\"Software Test Engineer IV\",\"timestamp\":\"2022-03-04T04:51:56Z\"}\n{\"id\":257,\"first_name\":\"Livvy\",\"last_name\":\"Shorten\",\"email\":\"lshorten74@slate.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-07-02T07:44:35Z\"}\n{\"id\":258,\"first_name\":\"Isabelita\",\"last_name\":\"Hannaby\",\"email\":\"ihannaby75@exblog.jp\",\"job\":\"Human Resources Assistant III\",\"timestamp\":\"2021-12-14T20:08:16Z\"}\n{\"id\":259,\"first_name\":\"Elisabetta\",\"last_name\":\"Kisar\",\"email\":\"ekisar76@yelp.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-07-13T05:43:49Z\"}\n{\"id\":260,\"first_name\":\"Pattie\",\"last_name\":\"Skeggs\",\"email\":\"pskeggs77@smugmug.com\",\"job\":\"Developer II\",\"timestamp\":\"2022-09-21T23:30:44Z\"}\n{\"id\":261,\"first_name\":\"Von\",\"last_name\":\"Plaide\",\"email\":\"vplaide78@163.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-01-17T18:18:40Z\"}\n{\"id\":262,\"first_name\":\"Aurie\",\"last_name\":\"Fones\",\"email\":\"afones79@craigslist.org\",\"job\":\"Statistician I\",\"timestamp\":\"2022-06-25T21:03:25Z\"}\n{\"id\":263,\"first_name\":\"Sue\",\"last_name\":\"Peacocke\",\"email\":\"speacocke7a@amazonaws.com\",\"job\":\"Quality 
Engineer\",\"timestamp\":\"2022-05-21T11:51:33Z\"}\n{\"id\":264,\"first_name\":\"Lorrin\",\"last_name\":\"Dallemore\",\"email\":\"ldallemore7b@jigsy.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-06-16T21:10:49Z\"}\n{\"id\":265,\"first_name\":\"Heloise\",\"last_name\":\"Dober\",\"email\":\"hdober7c@seesaa.net\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2021-12-29T01:17:53Z\"}\n{\"id\":266,\"first_name\":\"Scotti\",\"last_name\":\"Layson\",\"email\":\"slayson7d@accuweather.com\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-04-26T19:46:46Z\"}\n{\"id\":267,\"first_name\":\"Clementia\",\"last_name\":\"Weepers\",\"email\":\"cweepers7e@list-manage.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-06-01T21:45:12Z\"}\n{\"id\":268,\"first_name\":\"Cullan\",\"last_name\":\"Liebmann\",\"email\":\"cliebmann7f@dmoz.org\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-08-03T01:34:18Z\"}\n{\"id\":269,\"first_name\":\"Althea\",\"last_name\":\"Boutell\",\"email\":\"aboutell7g@seattletimes.com\",\"job\":\"Health Coach III\",\"timestamp\":\"2022-07-24T17:30:47Z\"}\n{\"id\":270,\"first_name\":\"Karoly\",\"last_name\":\"Girdwood\",\"email\":\"kgirdwood7h@webs.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-10-17T11:54:59Z\"}\n{\"id\":271,\"first_name\":\"Farrel\",\"last_name\":\"Blackney\",\"email\":\"fblackney7i@ed.gov\",\"job\":\"Research Assistant IV\",\"timestamp\":\"2022-10-25T20:27:01Z\"}\n{\"id\":272,\"first_name\":\"Tynan\",\"last_name\":\"Bleas\",\"email\":\"tbleas7j@hhs.gov\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-04-17T03:22:37Z\"}\n{\"id\":273,\"first_name\":\"Cybill\",\"last_name\":\"Caple\",\"email\":\"ccaple7k@about.me\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-07-26T23:54:40Z\"}\n{\"id\":274,\"first_name\":\"Rasla\",\"last_name\":\"Rameau\",\"email\":\"rrameau7l@nbcnews.com\",\"job\":\"Help Desk 
Technician\",\"timestamp\":\"2022-06-29T12:22:48Z\"}\n{\"id\":275,\"first_name\":\"Harry\",\"last_name\":\"Sculpher\",\"email\":\"hsculpher7m@apple.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-07-04T15:07:45Z\"}\n{\"id\":276,\"first_name\":\"Dell\",\"last_name\":\"Higgonet\",\"email\":\"dhiggonet7n@jalbum.net\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-06-27T03:19:01Z\"}\n{\"id\":277,\"first_name\":\"Yorke\",\"last_name\":\"Newstead\",\"email\":\"ynewstead7o@uol.com.br\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-08-12T07:47:07Z\"}\n{\"id\":278,\"first_name\":\"Germaine\",\"last_name\":\"Polland\",\"email\":\"gpolland7p@go.com\",\"job\":\"Developer I\",\"timestamp\":\"2022-04-01T16:12:57Z\"}\n{\"id\":279,\"first_name\":\"Asher\",\"last_name\":\"Sollett\",\"email\":\"asollett7q@slashdot.org\",\"job\":\"VP Sales\",\"timestamp\":\"2022-01-06T23:20:17Z\"}\n{\"id\":280,\"first_name\":\"Marion\",\"last_name\":\"Armit\",\"email\":\"marmit7r@meetup.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-02-13T02:17:16Z\"}\n{\"id\":281,\"first_name\":\"Thomas\",\"last_name\":\"Clewlow\",\"email\":\"tclewlow7s@51.la\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-05-29T12:22:11Z\"}\n{\"id\":282,\"first_name\":\"Turner\",\"last_name\":\"Karchowski\",\"email\":\"tkarchowski7t@elpais.com\",\"job\":\"Automation Specialist I\",\"timestamp\":\"2022-04-06T08:23:05Z\"}\n{\"id\":283,\"first_name\":\"Ketty\",\"last_name\":\"Costain\",\"email\":\"kcostain7u@latimes.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-04-30T06:20:06Z\"}\n{\"id\":284,\"first_name\":\"Heath\",\"last_name\":\"Palser\",\"email\":\"hpalser7v@aol.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-04-05T02:10:42Z\"}\n{\"id\":285,\"first_name\":\"Gaspar\",\"last_name\":\"Van den Bosch\",\"email\":\"gvandenbosch7w@bloomberg.com\",\"job\":\"Mechanical Systems 
Engineer\",\"timestamp\":\"2021-12-22T16:35:11Z\"}\n{\"id\":286,\"first_name\":\"Adeline\",\"last_name\":\"Pacher\",\"email\":\"apacher7x@e-recht24.de\",\"job\":\"General Manager\",\"timestamp\":\"2022-01-19T03:53:17Z\"}\n{\"id\":287,\"first_name\":\"Emyle\",\"last_name\":\"Cookes\",\"email\":\"ecookes7y@statcounter.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-11-04T05:52:43Z\"}\n{\"id\":288,\"first_name\":\"Eugine\",\"last_name\":\"Vell\",\"email\":\"evell7z@webs.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-04-25T21:53:59Z\"}\n{\"id\":289,\"first_name\":\"Ogden\",\"last_name\":\"Outridge\",\"email\":\"ooutridge80@epa.gov\",\"job\":\"Web Designer II\",\"timestamp\":\"2022-02-02T01:01:16Z\"}\n{\"id\":290,\"first_name\":\"Krystalle\",\"last_name\":\"Esposi\",\"email\":\"kesposi81@github.io\",\"job\":\"Editor\",\"timestamp\":\"2022-05-21T05:46:12Z\"}\n{\"id\":291,\"first_name\":\"Fremont\",\"last_name\":\"Poge\",\"email\":\"fpoge82@addtoany.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-11-23T05:50:15Z\"}\n{\"id\":292,\"first_name\":\"Mamie\",\"last_name\":\"Aery\",\"email\":\"maery83@sciencedirect.com\",\"job\":\"Professor\",\"timestamp\":\"2022-08-21T16:38:20Z\"}\n{\"id\":293,\"first_name\":\"Andra\",\"last_name\":\"Iles\",\"email\":\"ailes84@guardian.co.uk\",\"job\":\"Biostatistician III\",\"timestamp\":\"2022-07-07T19:37:01Z\"}\n{\"id\":294,\"first_name\":\"Ardith\",\"last_name\":\"Gemnett\",\"email\":\"agemnett85@apache.org\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-11-29T09:49:25Z\"}\n{\"id\":295,\"first_name\":\"Carlie\",\"last_name\":\"Mulderrig\",\"email\":\"cmulderrig86@time.com\",\"job\":\"Developer III\",\"timestamp\":\"2022-08-30T03:38:10Z\"}\n{\"id\":296,\"first_name\":\"Eadie\",\"last_name\":\"Hain\",\"email\":\"ehain87@live.com\",\"job\":\"VP Quality 
Control\",\"timestamp\":\"2022-02-14T01:21:52Z\"}\n{\"id\":297,\"first_name\":\"Kellen\",\"last_name\":\"McFall\",\"email\":\"kmcfall88@engadget.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-05-20T11:47:09Z\"}\n{\"id\":298,\"first_name\":\"Lianna\",\"last_name\":\"Weerdenburg\",\"email\":\"lweerdenburg89@free.fr\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-04-20T06:32:55Z\"}\n{\"id\":299,\"first_name\":\"Ciro\",\"last_name\":\"Slainey\",\"email\":\"cslainey8a@angelfire.com\",\"job\":\"Office Assistant I\",\"timestamp\":\"2022-06-29T22:41:59Z\"}\n{\"id\":300,\"first_name\":\"Kenna\",\"last_name\":\"Cecchi\",\"email\":\"kcecchi8b@joomla.org\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-11-25T21:17:24Z\"}\n{\"id\":301,\"first_name\":\"Corry\",\"last_name\":\"Crean\",\"email\":\"ccrean8c@admin.ch\",\"job\":\"Biostatistician I\",\"timestamp\":\"2022-04-14T16:28:03Z\"}\n{\"id\":302,\"first_name\":\"Kylie\",\"last_name\":\"Boylund\",\"email\":\"kboylund8d@blog.com\",\"job\":\"Accounting Assistant III\",\"timestamp\":\"2022-07-16T20:44:59Z\"}\n{\"id\":303,\"first_name\":\"Venita\",\"last_name\":\"Tate\",\"email\":\"vtate8e@parallels.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-01-16T07:51:28Z\"}\n{\"id\":304,\"first_name\":\"Alain\",\"last_name\":\"Rustedge\",\"email\":\"arustedge8f@arizona.edu\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-06-12T19:47:12Z\"}\n{\"id\":305,\"first_name\":\"Emilio\",\"last_name\":\"Ellul\",\"email\":\"eellul8g@cbslocal.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-12-01T16:49:33Z\"}\n{\"id\":306,\"first_name\":\"Malissia\",\"last_name\":\"Caspell\",\"email\":\"mcaspell8h@spiegel.de\",\"job\":\"Research Assistant IV\",\"timestamp\":\"2022-08-14T21:19:45Z\"}\n{\"id\":307,\"first_name\":\"Gavin\",\"last_name\":\"Marvel\",\"email\":\"gmarvel8i@upenn.edu\",\"job\":\"Research Assistant 
II\",\"timestamp\":\"2022-06-16T01:30:19Z\"}\n{\"id\":308,\"first_name\":\"Demetri\",\"last_name\":\"Jumel\",\"email\":\"djumel8j@weibo.com\",\"job\":\"Teacher\",\"timestamp\":\"2022-07-27T03:12:46Z\"}\n{\"id\":309,\"first_name\":\"Elia\",\"last_name\":\"Stovell\",\"email\":\"estovell8k@quantcast.com\",\"job\":\"Web Designer III\",\"timestamp\":\"2022-11-21T00:23:57Z\"}\n{\"id\":310,\"first_name\":\"Mab\",\"last_name\":\"Aleksich\",\"email\":\"maleksich8l@hhs.gov\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-07-20T13:14:20Z\"}\n{\"id\":311,\"first_name\":\"Mord\",\"last_name\":\"Klawi\",\"email\":\"mklawi8m@blogspot.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-11-13T15:52:38Z\"}\n{\"id\":312,\"first_name\":\"Cale\",\"last_name\":\"Fante\",\"email\":\"cfante8n@unesco.org\",\"job\":\"Editor\",\"timestamp\":\"2022-09-06T10:13:30Z\"}\n{\"id\":313,\"first_name\":\"Samantha\",\"last_name\":\"Whistlecraft\",\"email\":\"swhistlecraft8o@dion.ne.jp\",\"job\":\"Staff Accountant III\",\"timestamp\":\"2022-04-15T15:47:01Z\"}\n{\"id\":314,\"first_name\":\"Wallache\",\"last_name\":\"Meach\",\"email\":\"wmeach8p@soundcloud.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-10-18T22:27:16Z\"}\n{\"id\":315,\"first_name\":\"York\",\"last_name\":\"MacRorie\",\"email\":\"ymacrorie8q@mysql.com\",\"job\":\"Software Engineer III\",\"timestamp\":\"2021-12-12T17:51:57Z\"}\n{\"id\":316,\"first_name\":\"Eugen\",\"last_name\":\"Claus\",\"email\":\"eclaus8r@google.it\",\"job\":\"Computer Systems Analyst IV\",\"timestamp\":\"2022-03-28T19:33:23Z\"}\n{\"id\":317,\"first_name\":\"Karlotta\",\"last_name\":\"Geck\",\"email\":\"kgeck8s@psu.edu\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-05-17T14:01:31Z\"}\n{\"id\":318,\"first_name\":\"Cherry\",\"last_name\":\"Gillyett\",\"email\":\"cgillyett8t@cornell.edu\",\"job\":\"VP Product 
Management\",\"timestamp\":\"2022-05-28T08:17:39Z\"}\n{\"id\":319,\"first_name\":\"Estrellita\",\"last_name\":\"Brient\",\"email\":\"ebrient8u@clickbank.net\",\"job\":\"Account Executive\",\"timestamp\":\"2022-01-08T01:17:56Z\"}\n{\"id\":320,\"first_name\":\"Ly\",\"last_name\":\"Svanetti\",\"email\":\"lsvanetti8v@sina.com.cn\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-05-26T17:19:42Z\"}\n{\"id\":321,\"first_name\":\"Ronica\",\"last_name\":\"Bloys\",\"email\":\"rbloys8w@elegantthemes.com\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-04-09T15:25:24Z\"}\n{\"id\":322,\"first_name\":\"Tallie\",\"last_name\":\"Wanless\",\"email\":\"twanless8x@w3.org\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-01-22T15:01:09Z\"}\n{\"id\":323,\"first_name\":\"Karola\",\"last_name\":\"Scotland\",\"email\":\"kscotland8y@wired.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-11-19T18:36:41Z\"}\n{\"id\":324,\"first_name\":\"Albrecht\",\"last_name\":\"De Bruyn\",\"email\":\"adebruyn8z@hatena.ne.jp\",\"job\":\"VP Sales\",\"timestamp\":\"2021-12-09T22:57:42Z\"}\n{\"id\":325,\"first_name\":\"Boniface\",\"last_name\":\"Lampl\",\"email\":\"blampl90@slate.com\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-05-30T23:52:01Z\"}\n{\"id\":326,\"first_name\":\"Paxton\",\"last_name\":\"Garritley\",\"email\":\"pgarritley91@imdb.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-08-17T08:47:51Z\"}\n{\"id\":327,\"first_name\":\"Em\",\"last_name\":\"Pierrepont\",\"email\":\"epierrepont92@t.co\",\"job\":\"Research Assistant II\",\"timestamp\":\"2022-06-21T06:48:47Z\"}\n{\"id\":328,\"first_name\":\"Odele\",\"last_name\":\"Weymouth\",\"email\":\"oweymouth93@dot.gov\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-07-21T18:13:46Z\"}\n{\"id\":329,\"first_name\":\"Karlik\",\"last_name\":\"Casely\",\"email\":\"kcasely94@vimeo.com\",\"job\":\"Statistician 
II\",\"timestamp\":\"2022-01-01T15:51:47Z\"}\n{\"id\":330,\"first_name\":\"Marisa\",\"last_name\":\"Christon\",\"email\":\"mchriston95@unicef.org\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-07-22T00:41:36Z\"}\n{\"id\":331,\"first_name\":\"Norrie\",\"last_name\":\"Peotz\",\"email\":\"npeotz96@ftc.gov\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-11-07T15:28:00Z\"}\n{\"id\":332,\"first_name\":\"Derby\",\"last_name\":\"Pover\",\"email\":\"dpover97@statcounter.com\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-04-29T02:37:02Z\"}\n{\"id\":333,\"first_name\":\"Miranda\",\"last_name\":\"Beartup\",\"email\":\"mbeartup98@barnesandnoble.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2021-12-13T12:54:56Z\"}\n{\"id\":334,\"first_name\":\"Euell\",\"last_name\":\"Bittlestone\",\"email\":\"ebittlestone99@google.es\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-12-03T03:42:46Z\"}\n{\"id\":335,\"first_name\":\"Hewie\",\"last_name\":\"McConnal\",\"email\":\"hmcconnal9a@globo.com\",\"job\":\"Media Manager I\",\"timestamp\":\"2022-11-12T14:58:30Z\"}\n{\"id\":336,\"first_name\":\"Maryanna\",\"last_name\":\"Blackburne\",\"email\":\"mblackburne9b@nbcnews.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-10-07T19:58:13Z\"}\n{\"id\":337,\"first_name\":\"Vicki\",\"last_name\":\"Wicks\",\"email\":\"vwicks9c@skype.com\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-10-28T11:22:21Z\"}\n{\"id\":338,\"first_name\":\"Camel\",\"last_name\":\"Slader\",\"email\":\"cslader9d@wufoo.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-01-26T15:37:10Z\"}\n{\"id\":339,\"first_name\":\"Alvan\",\"last_name\":\"Kehoe\",\"email\":\"akehoe9e@illinois.edu\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-06-01T01:34:48Z\"}\n{\"id\":340,\"first_name\":\"Daniella\",\"last_name\":\"Schapero\",\"email\":\"dschapero9f@usnews.com\",\"job\":\"Database Administrator 
II\",\"timestamp\":\"2022-04-02T23:01:40Z\"}\n{\"id\":341,\"first_name\":\"Roslyn\",\"last_name\":\"Bortoletti\",\"email\":\"rbortoletti9g@icio.us\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-12-01T16:25:31Z\"}\n{\"id\":342,\"first_name\":\"Tonya\",\"last_name\":\"Largan\",\"email\":\"tlargan9h@ft.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-03-04T06:31:36Z\"}\n{\"id\":343,\"first_name\":\"Elisabeth\",\"last_name\":\"Sudran\",\"email\":\"esudran9i@wikimedia.org\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-08-11T09:40:53Z\"}\n{\"id\":344,\"first_name\":\"Sukey\",\"last_name\":\"Stopper\",\"email\":\"sstopper9j@cdbaby.com\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-09-03T10:25:28Z\"}\n{\"id\":345,\"first_name\":\"Merwin\",\"last_name\":\"Fuentez\",\"email\":\"mfuentez9k@hp.com\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-01-13T00:48:55Z\"}\n{\"id\":346,\"first_name\":\"Alden\",\"last_name\":\"Hariot\",\"email\":\"ahariot9l@meetup.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-11-27T11:30:45Z\"}\n{\"id\":347,\"first_name\":\"Persis\",\"last_name\":\"Jasik\",\"email\":\"pjasik9m@behance.net\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-09-16T02:54:51Z\"}\n{\"id\":348,\"first_name\":\"Quinn\",\"last_name\":\"Pickavance\",\"email\":\"qpickavance9n@java.com\",\"job\":\"Human Resources Assistant I\",\"timestamp\":\"2022-09-16T17:58:06Z\"}\n{\"id\":349,\"first_name\":\"Jules\",\"last_name\":\"Le Franc\",\"email\":\"jlefranc9o@kickstarter.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2021-12-15T22:58:34Z\"}\n{\"id\":350,\"first_name\":\"Darn\",\"last_name\":\"Stoate\",\"email\":\"dstoate9p@umich.edu\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-10-03T06:16:31Z\"}\n{\"id\":351,\"first_name\":\"Cecilius\",\"last_name\":\"Deane\",\"email\":\"cdeane9q@nature.com\",\"job\":\"Developer 
I\",\"timestamp\":\"2022-04-20T02:13:21Z\"}\n{\"id\":352,\"first_name\":\"Liane\",\"last_name\":\"Meredyth\",\"email\":\"lmeredyth9r@t-online.de\",\"job\":\"Developer II\",\"timestamp\":\"2022-12-04T10:09:36Z\"}\n{\"id\":353,\"first_name\":\"Elbertina\",\"last_name\":\"Rogier\",\"email\":\"erogier9s@hp.com\",\"job\":\"Accountant IV\",\"timestamp\":\"2022-01-03T07:33:20Z\"}\n{\"id\":354,\"first_name\":\"Isaac\",\"last_name\":\"Takle\",\"email\":\"itakle9t@wikispaces.com\",\"job\":\"Safety Technician I\",\"timestamp\":\"2022-06-21T02:00:17Z\"}\n{\"id\":355,\"first_name\":\"Blondelle\",\"last_name\":\"Reiner\",\"email\":\"breiner9u@hp.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-10-14T05:55:57Z\"}\n{\"id\":356,\"first_name\":\"Hermy\",\"last_name\":\"Spraging\",\"email\":\"hspraging9v@geocities.jp\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-03-23T05:46:21Z\"}\n{\"id\":357,\"first_name\":\"Skyler\",\"last_name\":\"Bavister\",\"email\":\"sbavister9w@cyberchimps.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2021-12-20T01:19:37Z\"}\n{\"id\":358,\"first_name\":\"Ruby\",\"last_name\":\"Hebden\",\"email\":\"rhebden9x@nba.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-02-24T02:38:01Z\"}\n{\"id\":359,\"first_name\":\"Jethro\",\"last_name\":\"Lammerding\",\"email\":\"jlammerding9y@mac.com\",\"job\":\"Human Resources Assistant IV\",\"timestamp\":\"2022-03-21T04:48:37Z\"}\n{\"id\":360,\"first_name\":\"Kean\",\"last_name\":\"Whitticks\",\"email\":\"kwhitticks9z@economist.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2021-12-17T13:13:36Z\"}\n{\"id\":361,\"first_name\":\"Talia\",\"last_name\":\"Desforges\",\"email\":\"tdesforgesa0@phpbb.com\",\"job\":\"Developer IV\",\"timestamp\":\"2022-04-04T19:29:49Z\"}\n{\"id\":362,\"first_name\":\"Colin\",\"last_name\":\"Cleyburn\",\"email\":\"ccleyburna1@rakuten.co.jp\",\"job\":\"Database Administrator 
III\",\"timestamp\":\"2022-04-23T03:14:04Z\"}\n{\"id\":363,\"first_name\":\"Hube\",\"last_name\":\"Ells\",\"email\":\"hellsa2@smugmug.com\",\"job\":\"Biostatistician I\",\"timestamp\":\"2022-08-22T16:06:47Z\"}\n{\"id\":364,\"first_name\":\"Deloria\",\"last_name\":\"Coiley\",\"email\":\"dcoileya3@plala.or.jp\",\"job\":\"Teacher\",\"timestamp\":\"2022-04-21T19:30:19Z\"}\n{\"id\":365,\"first_name\":\"Lissi\",\"last_name\":\"Whiteland\",\"email\":\"lwhitelanda4@addthis.com\",\"job\":\"Engineer III\",\"timestamp\":\"2022-05-24T05:21:21Z\"}\n{\"id\":366,\"first_name\":\"Kathryn\",\"last_name\":\"Simek\",\"email\":\"ksimeka5@washingtonpost.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-03-03T01:23:04Z\"}\n{\"id\":367,\"first_name\":\"Alex\",\"last_name\":\"Lammenga\",\"email\":\"alammengaa6@symantec.com\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-04-24T09:48:47Z\"}\n{\"id\":368,\"first_name\":\"Marabel\",\"last_name\":\"Reilingen\",\"email\":\"mreilingena7@upenn.edu\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-07-01T09:26:53Z\"}\n{\"id\":369,\"first_name\":\"Zolly\",\"last_name\":\"Cooney\",\"email\":\"zcooneya8@discovery.com\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-03-01T18:21:42Z\"}\n{\"id\":370,\"first_name\":\"Ali\",\"last_name\":\"Fairlaw\",\"email\":\"afairlawa9@walmart.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-01-22T12:52:35Z\"}\n{\"id\":371,\"first_name\":\"Reilly\",\"last_name\":\"Langston\",\"email\":\"rlangstonaa@intel.com\",\"job\":\"Software Engineer II\",\"timestamp\":\"2022-10-25T07:16:41Z\"}\n{\"id\":372,\"first_name\":\"Chantal\",\"last_name\":\"Ingram\",\"email\":\"cingramab@bizjournals.com\",\"job\":\"Web Designer IV\",\"timestamp\":\"2022-02-18T23:11:52Z\"}\n{\"id\":373,\"first_name\":\"Pembroke\",\"last_name\":\"Coltart\",\"email\":\"pcoltartac@examiner.com\",\"job\":\"VP 
Accounting\",\"timestamp\":\"2022-09-13T01:48:41Z\"}\n{\"id\":374,\"first_name\":\"Irwin\",\"last_name\":\"Spain-Gower\",\"email\":\"ispaingowerad@imageshack.us\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-09-30T11:02:23Z\"}\n{\"id\":375,\"first_name\":\"Graig\",\"last_name\":\"Chastan\",\"email\":\"gchastanae@geocities.jp\",\"job\":\"Teacher\",\"timestamp\":\"2022-08-03T03:42:11Z\"}\n{\"id\":376,\"first_name\":\"Leanora\",\"last_name\":\"Quincee\",\"email\":\"lquinceeaf@tinyurl.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-04-23T16:46:27Z\"}\n{\"id\":377,\"first_name\":\"Kele\",\"last_name\":\"Cubley\",\"email\":\"kcubleyag@harvard.edu\",\"job\":\"Software Engineer III\",\"timestamp\":\"2022-04-04T14:35:24Z\"}\n{\"id\":378,\"first_name\":\"Chariot\",\"last_name\":\"Minchin\",\"email\":\"cminchinah@ucla.edu\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-09-07T07:00:13Z\"}\n{\"id\":379,\"first_name\":\"Ellyn\",\"last_name\":\"Loggie\",\"email\":\"eloggieai@washington.edu\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-05-12T12:07:36Z\"}\n{\"id\":380,\"first_name\":\"Dmitri\",\"last_name\":\"Geleman\",\"email\":\"dgelemanaj@gizmodo.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-07-26T15:41:05Z\"}\n{\"id\":381,\"first_name\":\"Steve\",\"last_name\":\"Lemmer\",\"email\":\"slemmerak@eepurl.com\",\"job\":\"Database Administrator IV\",\"timestamp\":\"2022-07-19T21:11:14Z\"}\n{\"id\":382,\"first_name\":\"Tillie\",\"last_name\":\"Dodle\",\"email\":\"tdodleal@sciencedirect.com\",\"job\":\"Geologist III\",\"timestamp\":\"2022-05-01T16:47:01Z\"}\n{\"id\":383,\"first_name\":\"Deirdre\",\"last_name\":\"Southcombe\",\"email\":\"dsouthcombeam@trellian.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-06-02T00:35:13Z\"}\n{\"id\":384,\"first_name\":\"Harman\",\"last_name\":\"Cino\",\"email\":\"hcinoan@yahoo.co.jp\",\"job\":\"Tax 
Accountant\",\"timestamp\":\"2022-08-12T09:34:14Z\"}\n{\"id\":385,\"first_name\":\"Hy\",\"last_name\":\"Chittim\",\"email\":\"hchittimao@scientificamerican.com\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2022-07-03T00:37:30Z\"}\n{\"id\":386,\"first_name\":\"Cordula\",\"last_name\":\"Pendlebury\",\"email\":\"cpendleburyap@cnbc.com\",\"job\":\"Research Assistant II\",\"timestamp\":\"2022-05-26T10:47:16Z\"}\n{\"id\":387,\"first_name\":\"Murvyn\",\"last_name\":\"Kuhnwald\",\"email\":\"mkuhnwaldaq@fotki.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-11-02T18:35:53Z\"}\n{\"id\":388,\"first_name\":\"King\",\"last_name\":\"Guilbert\",\"email\":\"kguilbertar@ycombinator.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-03-03T18:22:09Z\"}\n{\"id\":389,\"first_name\":\"Emlyn\",\"last_name\":\"Stanislaw\",\"email\":\"estanislawas@soup.io\",\"job\":\"Developer I\",\"timestamp\":\"2022-04-14T04:40:47Z\"}\n{\"id\":390,\"first_name\":\"Talyah\",\"last_name\":\"Glanester\",\"email\":\"tglanesterat@nasa.gov\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-02-27T06:00:35Z\"}\n{\"id\":391,\"first_name\":\"Lou\",\"last_name\":\"Mockler\",\"email\":\"lmocklerau@liveinternet.ru\",\"job\":\"Administrative Assistant III\",\"timestamp\":\"2022-06-14T09:55:20Z\"}\n{\"id\":392,\"first_name\":\"Faulkner\",\"last_name\":\"Kiddie\",\"email\":\"fkiddieav@last.fm\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-11-20T20:38:11Z\"}\n{\"id\":393,\"first_name\":\"Fabio\",\"last_name\":\"Brimner\",\"email\":\"fbrimneraw@g.co\",\"job\":\"Geologist I\",\"timestamp\":\"2022-05-09T23:09:20Z\"}\n{\"id\":394,\"first_name\":\"Melisa\",\"last_name\":\"Piotrowski\",\"email\":\"mpiotrowskiax@posterous.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-10-08T09:12:54Z\"}\n{\"id\":395,\"first_name\":\"Celinda\",\"last_name\":\"Blodget\",\"email\":\"cblodgetay@indiatimes.com\",\"job\":\"Dental 
Hygienist\",\"timestamp\":\"2022-07-10T03:04:56Z\"}\n{\"id\":396,\"first_name\":\"Haywood\",\"last_name\":\"Padfield\",\"email\":\"hpadfieldaz@joomla.org\",\"job\":\"Registered Nurse\",\"timestamp\":\"2021-12-09T07:07:51Z\"}\n{\"id\":397,\"first_name\":\"Jane\",\"last_name\":\"Rasor\",\"email\":\"jrasorb0@soundcloud.com\",\"job\":\"Editor\",\"timestamp\":\"2022-01-19T18:10:29Z\"}\n{\"id\":398,\"first_name\":\"Rance\",\"last_name\":\"Hambric\",\"email\":\"rhambricb1@sciencedirect.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-04-11T18:18:00Z\"}\n{\"id\":399,\"first_name\":\"Lincoln\",\"last_name\":\"Challenger\",\"email\":\"lchallengerb2@icq.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-05-24T13:24:16Z\"}\n{\"id\":400,\"first_name\":\"Elnore\",\"last_name\":\"Pickervance\",\"email\":\"epickervanceb3@opera.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-10-26T04:10:45Z\"}\n{\"id\":401,\"first_name\":\"Reidar\",\"last_name\":\"Cradock\",\"email\":\"rcradockb4@nature.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-09-20T14:35:56Z\"}\n{\"id\":402,\"first_name\":\"Tarah\",\"last_name\":\"Binford\",\"email\":\"tbinfordb5@psu.edu\",\"job\":\"VP Sales\",\"timestamp\":\"2022-11-26T12:05:40Z\"}\n{\"id\":403,\"first_name\":\"Cinda\",\"last_name\":\"Trevithick\",\"email\":\"ctrevithickb6@ustream.tv\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-02-02T13:15:47Z\"}\n{\"id\":404,\"first_name\":\"Lara\",\"last_name\":\"Lovel\",\"email\":\"llovelb7@cocolog-nifty.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-07-11T00:41:53Z\"}\n{\"id\":405,\"first_name\":\"Janina\",\"last_name\":\"Gossart\",\"email\":\"jgossartb8@cpanel.net\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-05-06T08:33:35Z\"}\n{\"id\":406,\"first_name\":\"Stefan\",\"last_name\":\"Bowdrey\",\"email\":\"sbowdreyb9@oaic.gov.au\",\"job\":\"Internal 
Auditor\",\"timestamp\":\"2022-10-20T14:21:53Z\"}\n{\"id\":407,\"first_name\":\"Meryl\",\"last_name\":\"Shorthouse\",\"email\":\"mshorthouseba@macromedia.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-11-14T11:44:59Z\"}\n{\"id\":408,\"first_name\":\"Whittaker\",\"last_name\":\"Vela\",\"email\":\"wvelabb@answers.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2021-12-13T11:34:38Z\"}\n{\"id\":409,\"first_name\":\"Cheston\",\"last_name\":\"Ruffli\",\"email\":\"crufflibc@usnews.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-04-08T14:59:45Z\"}\n{\"id\":410,\"first_name\":\"Daven\",\"last_name\":\"Mulryan\",\"email\":\"dmulryanbd@aboutads.info\",\"job\":\"Teacher\",\"timestamp\":\"2022-10-09T00:18:34Z\"}\n{\"id\":411,\"first_name\":\"Gusta\",\"last_name\":\"Goldstraw\",\"email\":\"ggoldstrawbe@alibaba.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-01-24T21:37:08Z\"}\n{\"id\":412,\"first_name\":\"Chase\",\"last_name\":\"Kenworthey\",\"email\":\"ckenwortheybf@indiatimes.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-09-17T13:33:39Z\"}\n{\"id\":413,\"first_name\":\"Lynn\",\"last_name\":\"Poluzzi\",\"email\":\"lpoluzzibg@cocolog-nifty.com\",\"job\":\"Automation Specialist II\",\"timestamp\":\"2022-03-23T12:46:38Z\"}\n{\"id\":414,\"first_name\":\"Mal\",\"last_name\":\"Snawden\",\"email\":\"msnawdenbh@netvibes.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-02-21T09:45:03Z\"}\n{\"id\":415,\"first_name\":\"Charin\",\"last_name\":\"Pennyman\",\"email\":\"cpennymanbi@bizjournals.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-03-10T18:16:19Z\"}\n{\"id\":416,\"first_name\":\"Berkeley\",\"last_name\":\"Plaster\",\"email\":\"bplasterbj@technorati.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-11-18T14:10:21Z\"}\n{\"id\":417,\"first_name\":\"Fransisco\",\"last_name\":\"Flanner\",\"email\":\"fflannerbk@cisco.com\",\"job\":\"Human Resources Assistant 
III\",\"timestamp\":\"2022-03-16T22:27:36Z\"}\n{\"id\":418,\"first_name\":\"Burt\",\"last_name\":\"Casajuana\",\"email\":\"bcasajuanabl@techcrunch.com\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-09-14T02:08:34Z\"}\n{\"id\":419,\"first_name\":\"Tulley\",\"last_name\":\"Gwinn\",\"email\":\"tgwinnbm@dell.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-11-27T16:15:10Z\"}\n{\"id\":420,\"first_name\":\"Anneliese\",\"last_name\":\"Richie\",\"email\":\"arichiebn@imageshack.us\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2021-12-17T08:26:36Z\"}\n{\"id\":421,\"first_name\":\"Mack\",\"last_name\":\"Ariss\",\"email\":\"marissbo@patch.com\",\"job\":\"Operator\",\"timestamp\":\"2022-11-26T18:35:09Z\"}\n{\"id\":422,\"first_name\":\"Carlin\",\"last_name\":\"O'Keenan\",\"email\":\"cokeenanbp@amazon.co.jp\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-05-31T11:18:22Z\"}\n{\"id\":423,\"first_name\":\"Cointon\",\"last_name\":\"Wride\",\"email\":\"cwridebq@bloomberg.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-10-12T14:19:52Z\"}\n{\"id\":424,\"first_name\":\"Quillan\",\"last_name\":\"Betun\",\"email\":\"qbetunbr@cnet.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-06-03T11:49:22Z\"}\n{\"id\":425,\"first_name\":\"Dolly\",\"last_name\":\"Loren\",\"email\":\"dlorenbs@t.co\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-06-13T09:31:29Z\"}\n{\"id\":426,\"first_name\":\"Helli\",\"last_name\":\"Whiteoak\",\"email\":\"hwhiteoakbt@bbb.org\",\"job\":\"Software Engineer IV\",\"timestamp\":\"2022-09-10T05:16:50Z\"}\n{\"id\":427,\"first_name\":\"Babb\",\"last_name\":\"Aiton\",\"email\":\"baitonbu@artisteer.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-03-25T23:13:53Z\"}\n{\"id\":428,\"first_name\":\"Ryon\",\"last_name\":\"Klimkin\",\"email\":\"rklimkinbv@istockphoto.com\",\"job\":\"Programmer Analyst 
I\",\"timestamp\":\"2022-01-31T10:29:35Z\"}\n{\"id\":429,\"first_name\":\"Ignacius\",\"last_name\":\"Wragge\",\"email\":\"iwraggebw@dot.gov\",\"job\":\"Database Administrator IV\",\"timestamp\":\"2021-12-30T17:50:24Z\"}\n{\"id\":430,\"first_name\":\"Orren\",\"last_name\":\"Janovsky\",\"email\":\"ojanovskybx@ucoz.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-08-18T17:04:35Z\"}\n{\"id\":431,\"first_name\":\"Teddie\",\"last_name\":\"Sayward\",\"email\":\"tsaywardby@springer.com\",\"job\":\"Geologist III\",\"timestamp\":\"2022-10-17T23:55:12Z\"}\n{\"id\":432,\"first_name\":\"Linc\",\"last_name\":\"Deeming\",\"email\":\"ldeemingbz@privacy.gov.au\",\"job\":\"Programmer Analyst III\",\"timestamp\":\"2022-07-05T12:33:47Z\"}\n{\"id\":433,\"first_name\":\"Arin\",\"last_name\":\"McConnulty\",\"email\":\"amcconnultyc0@example.com\",\"job\":\"General Manager\",\"timestamp\":\"2022-03-26T08:26:10Z\"}\n{\"id\":434,\"first_name\":\"Wainwright\",\"last_name\":\"Majury\",\"email\":\"wmajuryc1@nasa.gov\",\"job\":\"Nurse\",\"timestamp\":\"2022-02-16T03:48:15Z\"}\n{\"id\":435,\"first_name\":\"Rogerio\",\"last_name\":\"Siddens\",\"email\":\"rsiddensc2@bing.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-05-23T14:39:00Z\"}\n{\"id\":436,\"first_name\":\"Alyce\",\"last_name\":\"Kort\",\"email\":\"akortc3@craigslist.org\",\"job\":\"Biostatistician II\",\"timestamp\":\"2022-11-17T12:36:43Z\"}\n{\"id\":437,\"first_name\":\"Tiphany\",\"last_name\":\"Savory\",\"email\":\"tsavoryc4@sohu.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-04-28T20:22:51Z\"}\n{\"id\":438,\"first_name\":\"Colan\",\"last_name\":\"Gissop\",\"email\":\"cgissopc5@unicef.org\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-06-15T12:53:13Z\"}\n{\"id\":439,\"first_name\":\"Almira\",\"last_name\":\"MacPike\",\"email\":\"amacpikec6@multiply.com\",\"job\":\"Chemical 
Engineer\",\"timestamp\":\"2022-09-22T08:14:24Z\"}\n{\"id\":440,\"first_name\":\"Jae\",\"last_name\":\"Jelks\",\"email\":\"jjelksc7@nasa.gov\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-02-23T15:12:22Z\"}\n{\"id\":441,\"first_name\":\"Ruthe\",\"last_name\":\"Armatidge\",\"email\":\"rarmatidgec8@huffingtonpost.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-01-30T13:08:16Z\"}\n{\"id\":442,\"first_name\":\"Graehme\",\"last_name\":\"Mullin\",\"email\":\"gmullinc9@netscape.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-01-31T13:57:36Z\"}\n{\"id\":443,\"first_name\":\"Maude\",\"last_name\":\"Conlon\",\"email\":\"mconlonca@time.com\",\"job\":\"Staff Accountant I\",\"timestamp\":\"2022-04-10T08:37:13Z\"}\n{\"id\":444,\"first_name\":\"Elonore\",\"last_name\":\"Westmore\",\"email\":\"ewestmorecb@netvibes.com\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-10-17T01:52:09Z\"}\n{\"id\":445,\"first_name\":\"Yorker\",\"last_name\":\"Merrell\",\"email\":\"ymerrellcc@cornell.edu\",\"job\":\"Account Representative IV\",\"timestamp\":\"2022-04-01T17:28:42Z\"}\n{\"id\":446,\"first_name\":\"Billi\",\"last_name\":\"Sammars\",\"email\":\"bsammarscd@sogou.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-05-23T09:25:27Z\"}\n{\"id\":447,\"first_name\":\"Angel\",\"last_name\":\"Leader\",\"email\":\"aleaderce@multiply.com\",\"job\":\"Editor\",\"timestamp\":\"2022-12-06T20:41:05Z\"}\n{\"id\":448,\"first_name\":\"Juana\",\"last_name\":\"Bellward\",\"email\":\"jbellwardcf@geocities.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-06-10T21:03:40Z\"}\n{\"id\":449,\"first_name\":\"Talia\",\"last_name\":\"Adin\",\"email\":\"tadincg@angelfire.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-01-15T05:23:52Z\"}\n{\"id\":450,\"first_name\":\"Bryn\",\"last_name\":\"Ibell\",\"email\":\"bibellch@oracle.com\",\"job\":\"Dental 
Hygienist\",\"timestamp\":\"2022-01-15T21:38:04Z\"}\n{\"id\":451,\"first_name\":\"Marlowe\",\"last_name\":\"Gauge\",\"email\":\"mgaugeci@foxnews.com\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-06-21T14:29:23Z\"}\n{\"id\":452,\"first_name\":\"Roxane\",\"last_name\":\"Pernell\",\"email\":\"rpernellcj@163.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-05-25T16:10:44Z\"}\n{\"id\":453,\"first_name\":\"Renado\",\"last_name\":\"Sheekey\",\"email\":\"rsheekeyck@amazon.de\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-01-07T02:59:37Z\"}\n{\"id\":454,\"first_name\":\"Adria\",\"last_name\":\"Causer\",\"email\":\"acausercl@yale.edu\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-10-13T23:57:39Z\"}\n{\"id\":455,\"first_name\":\"Reese\",\"last_name\":\"Sclater\",\"email\":\"rsclatercm@patch.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2021-12-22T05:29:22Z\"}\n{\"id\":456,\"first_name\":\"Pail\",\"last_name\":\"Uzielli\",\"email\":\"puziellicn@hhs.gov\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-01-09T10:05:30Z\"}\n{\"id\":457,\"first_name\":\"Sadella\",\"last_name\":\"Fiander\",\"email\":\"sfianderco@webs.com\",\"job\":\"Budget/Accounting Analyst II\",\"timestamp\":\"2022-04-18T18:09:34Z\"}\n{\"id\":458,\"first_name\":\"Clint\",\"last_name\":\"Thirwell\",\"email\":\"cthirwellcp@comsenz.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-06-21T21:54:28Z\"}\n{\"id\":459,\"first_name\":\"Lamar\",\"last_name\":\"July\",\"email\":\"ljulycq@reverbnation.com\",\"job\":\"Budget/Accounting Analyst II\",\"timestamp\":\"2022-10-02T14:08:53Z\"}\n{\"id\":460,\"first_name\":\"Meg\",\"last_name\":\"Deschlein\",\"email\":\"mdeschleincr@networkadvertising.org\",\"job\":\"Web Developer I\",\"timestamp\":\"2022-04-03T11:15:05Z\"}\n{\"id\":461,\"first_name\":\"Stephine\",\"last_name\":\"Gorry\",\"email\":\"sgorrycs@naver.com\",\"job\":\"Food 
Chemist\",\"timestamp\":\"2021-12-09T11:30:53Z\"}\n{\"id\":462,\"first_name\":\"Beilul\",\"last_name\":\"Merrett\",\"email\":\"bmerrettct@soup.io\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-02-15T18:27:04Z\"}\n{\"id\":463,\"first_name\":\"Perren\",\"last_name\":\"Doni\",\"email\":\"pdonicu@google.co.uk\",\"job\":\"Accounting Assistant II\",\"timestamp\":\"2022-08-20T15:13:36Z\"}\n{\"id\":464,\"first_name\":\"Cullie\",\"last_name\":\"Skarman\",\"email\":\"cskarmancv@rambler.ru\",\"job\":\"Administrative Assistant II\",\"timestamp\":\"2022-08-03T23:13:37Z\"}\n{\"id\":465,\"first_name\":\"Matthieu\",\"last_name\":\"Simonato\",\"email\":\"msimonatocw@icq.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-09-14T13:30:09Z\"}\n{\"id\":466,\"first_name\":\"Thea\",\"last_name\":\"Deer\",\"email\":\"tdeercx@slate.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-05-22T17:37:39Z\"}\n{\"id\":467,\"first_name\":\"Nicolina\",\"last_name\":\"Deyes\",\"email\":\"ndeyescy@etsy.com\",\"job\":\"Professor\",\"timestamp\":\"2022-08-27T09:03:04Z\"}\n{\"id\":468,\"first_name\":\"Katalin\",\"last_name\":\"Bryan\",\"email\":\"kbryancz@addthis.com\",\"job\":\"Health Coach III\",\"timestamp\":\"2022-04-30T15:05:37Z\"}\n{\"id\":469,\"first_name\":\"Reggis\",\"last_name\":\"Daffern\",\"email\":\"rdaffernd0@odnoklassniki.ru\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-02-24T16:58:16Z\"}\n{\"id\":470,\"first_name\":\"Westbrook\",\"last_name\":\"Cockroft\",\"email\":\"wcockroftd1@uiuc.edu\",\"job\":\"Research Associate\",\"timestamp\":\"2022-11-27T21:04:57Z\"}\n{\"id\":471,\"first_name\":\"Tomaso\",\"last_name\":\"Bellon\",\"email\":\"tbellond2@sakura.ne.jp\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-07-25T06:37:11Z\"}\n{\"id\":472,\"first_name\":\"Jonathan\",\"last_name\":\"Marfe\",\"email\":\"jmarfed3@naver.com\",\"job\":\"Help Desk 
Operator\",\"timestamp\":\"2022-01-21T07:39:22Z\"}\n{\"id\":473,\"first_name\":\"Dane\",\"last_name\":\"Duro\",\"email\":\"ddurod4@google.com.br\",\"job\":\"Developer III\",\"timestamp\":\"2022-10-03T05:05:38Z\"}\n{\"id\":474,\"first_name\":\"Celine\",\"last_name\":\"Cartner\",\"email\":\"ccartnerd5@is.gd\",\"job\":\"Developer I\",\"timestamp\":\"2022-07-10T15:56:19Z\"}\n{\"id\":475,\"first_name\":\"Atlante\",\"last_name\":\"Leads\",\"email\":\"aleadsd6@yellowbook.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-03-13T07:18:55Z\"}\n{\"id\":476,\"first_name\":\"Pail\",\"last_name\":\"Jurgenson\",\"email\":\"pjurgensond7@newyorker.com\",\"job\":\"Recruiter\",\"timestamp\":\"2021-12-19T10:30:18Z\"}\n{\"id\":477,\"first_name\":\"Roslyn\",\"last_name\":\"Bazylets\",\"email\":\"rbazyletsd8@networksolutions.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-02-28T10:34:48Z\"}\n{\"id\":478,\"first_name\":\"Rube\",\"last_name\":\"Cona\",\"email\":\"rconad9@opensource.org\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-06-03T02:50:16Z\"}\n{\"id\":479,\"first_name\":\"Pansie\",\"last_name\":\"Waistell\",\"email\":\"pwaistellda@elegantthemes.com\",\"job\":\"Account Representative IV\",\"timestamp\":\"2022-08-03T10:25:55Z\"}\n{\"id\":480,\"first_name\":\"Uri\",\"last_name\":\"Duerden\",\"email\":\"uduerdendb@sourceforge.net\",\"job\":\"Web Developer IV\",\"timestamp\":\"2022-12-03T14:11:59Z\"}\n{\"id\":481,\"first_name\":\"Kerianne\",\"last_name\":\"Pipping\",\"email\":\"kpippingdc@arizona.edu\",\"job\":\"Operator\",\"timestamp\":\"2022-09-13T05:25:50Z\"}\n{\"id\":482,\"first_name\":\"Blaine\",\"last_name\":\"Kop\",\"email\":\"bkopdd@ucla.edu\",\"job\":\"Geologist III\",\"timestamp\":\"2022-05-23T16:22:18Z\"}\n{\"id\":483,\"first_name\":\"Ana\",\"last_name\":\"Orringe\",\"email\":\"aorringede@flickr.com\",\"job\":\"Dental 
Hygienist\",\"timestamp\":\"2022-04-28T11:21:02Z\"}\n{\"id\":484,\"first_name\":\"Carine\",\"last_name\":\"Rawsthorne\",\"email\":\"crawsthornedf@gmpg.org\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-04-21T04:20:56Z\"}\n{\"id\":485,\"first_name\":\"Wilburt\",\"last_name\":\"Liley\",\"email\":\"wlileydg@unicef.org\",\"job\":\"Health Coach IV\",\"timestamp\":\"2022-05-14T23:43:51Z\"}\n{\"id\":486,\"first_name\":\"Cory\",\"last_name\":\"Winscom\",\"email\":\"cwinscomdh@hp.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-01-08T06:38:08Z\"}\n{\"id\":487,\"first_name\":\"Pris\",\"last_name\":\"Greenley\",\"email\":\"pgreenleydi@ted.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-05-04T19:18:30Z\"}\n{\"id\":488,\"first_name\":\"Kath\",\"last_name\":\"Danet\",\"email\":\"kdanetdj@acquirethisname.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-06-17T01:19:36Z\"}\n{\"id\":489,\"first_name\":\"Cindi\",\"last_name\":\"Isac\",\"email\":\"cisacdk@patch.com\",\"job\":\"Database Administrator IV\",\"timestamp\":\"2022-09-11T21:21:36Z\"}\n{\"id\":490,\"first_name\":\"Eduardo\",\"last_name\":\"Rozzell\",\"email\":\"erozzelldl@si.edu\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-06-29T05:11:51Z\"}\n{\"id\":491,\"first_name\":\"Carmita\",\"last_name\":\"Siggins\",\"email\":\"csigginsdm@webeden.co.uk\",\"job\":\"Project Manager\",\"timestamp\":\"2022-05-05T17:20:09Z\"}\n{\"id\":492,\"first_name\":\"Friederike\",\"last_name\":\"Wileman\",\"email\":\"fwilemandn@discovery.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-07-25T18:07:09Z\"}\n{\"id\":493,\"first_name\":\"Art\",\"last_name\":\"Glas\",\"email\":\"aglasdo@yolasite.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-05-29T18:53:16Z\"}\n{\"id\":494,\"first_name\":\"Nevin\",\"last_name\":\"Twinning\",\"email\":\"ntwinningdp@g.co\",\"job\":\"Senior Financial 
Analyst\",\"timestamp\":\"2022-10-03T16:43:45Z\"}\n{\"id\":495,\"first_name\":\"Anderea\",\"last_name\":\"Soots\",\"email\":\"asootsdq@photobucket.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-04-10T06:00:14Z\"}\n{\"id\":496,\"first_name\":\"Jehanna\",\"last_name\":\"Collishaw\",\"email\":\"jcollishawdr@latimes.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-09-13T00:49:41Z\"}\n{\"id\":497,\"first_name\":\"Markos\",\"last_name\":\"Dunley\",\"email\":\"mdunleyds@over-blog.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-09-30T22:27:56Z\"}\n{\"id\":498,\"first_name\":\"Marysa\",\"last_name\":\"Lebond\",\"email\":\"mlebonddt@nationalgeographic.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-07-03T09:53:16Z\"}\n{\"id\":499,\"first_name\":\"Washington\",\"last_name\":\"Nutton\",\"email\":\"wnuttondu@godaddy.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-01-15T12:19:31Z\"}\n{\"id\":500,\"first_name\":\"Donny\",\"last_name\":\"Matteo\",\"email\":\"dmatteodv@comsenz.com\",\"job\":\"Programmer I\",\"timestamp\":\"2022-06-13T22:13:41Z\"}\n{\"id\":501,\"first_name\":\"Alan\",\"last_name\":\"Drummond\",\"email\":\"adrummonddw@google.de\",\"job\":\"Engineer II\",\"timestamp\":\"2021-12-17T20:17:04Z\"}\n{\"id\":502,\"first_name\":\"Tedmund\",\"last_name\":\"Dorricott\",\"email\":\"tdorricottdx@huffingtonpost.com\",\"job\":\"Office Assistant III\",\"timestamp\":\"2022-01-17T16:29:48Z\"}\n{\"id\":503,\"first_name\":\"Dene\",\"last_name\":\"Lammers\",\"email\":\"dlammersdy@wiley.com\",\"job\":\"Professor\",\"timestamp\":\"2022-04-16T09:41:42Z\"}\n{\"id\":504,\"first_name\":\"Town\",\"last_name\":\"Leman\",\"email\":\"tlemandz@amazon.co.uk\",\"job\":\"Computer Systems Analyst 
II\",\"timestamp\":\"2022-02-07T23:47:26Z\"}\n{\"id\":505,\"first_name\":\"Bendix\",\"last_name\":\"Applewhaite\",\"email\":\"bapplewhaitee0@unc.edu\",\"job\":\"Professor\",\"timestamp\":\"2022-01-13T02:07:26Z\"}\n{\"id\":506,\"first_name\":\"Paige\",\"last_name\":\"Mcsarry\",\"email\":\"pmcsarrye1@furl.net\",\"job\":\"Librarian\",\"timestamp\":\"2022-10-08T21:52:03Z\"}\n{\"id\":507,\"first_name\":\"Nahum\",\"last_name\":\"Sweeny\",\"email\":\"nsweenye2@nifty.com\",\"job\":\"Teacher\",\"timestamp\":\"2022-05-03T19:18:43Z\"}\n{\"id\":508,\"first_name\":\"Odelle\",\"last_name\":\"Crosson\",\"email\":\"ocrossone3@time.com\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-07-19T01:56:54Z\"}\n{\"id\":509,\"first_name\":\"Carny\",\"last_name\":\"Hunter\",\"email\":\"chuntere4@goodreads.com\",\"job\":\"Human Resources Assistant IV\",\"timestamp\":\"2022-06-01T04:40:16Z\"}\n{\"id\":510,\"first_name\":\"Jarad\",\"last_name\":\"Rogez\",\"email\":\"jrogeze5@weather.com\",\"job\":\"Media Manager IV\",\"timestamp\":\"2022-11-04T13:41:35Z\"}\n{\"id\":511,\"first_name\":\"Iggie\",\"last_name\":\"Gainsburgh\",\"email\":\"igainsburghe6@ehow.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-11-25T18:42:36Z\"}\n{\"id\":512,\"first_name\":\"Meredeth\",\"last_name\":\"Gealy\",\"email\":\"mgealye7@va.gov\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-07-06T12:34:03Z\"}\n{\"id\":513,\"first_name\":\"Arluene\",\"last_name\":\"Hallifax\",\"email\":\"ahallifaxe8@narod.ru\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-11-01T18:57:22Z\"}\n{\"id\":514,\"first_name\":\"Yehudit\",\"last_name\":\"Leyfield\",\"email\":\"yleyfielde9@clickbank.net\",\"job\":\"Recruiting 
Manager\",\"timestamp\":\"2022-07-02T07:24:27Z\"}\n{\"id\":515,\"first_name\":\"Ezra\",\"last_name\":\"Blabey\",\"email\":\"eblabeyea@google.ca\",\"job\":\"Librarian\",\"timestamp\":\"2022-06-12T02:22:15Z\"}\n{\"id\":516,\"first_name\":\"Gus\",\"last_name\":\"Leipnik\",\"email\":\"gleipnikeb@nytimes.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-01-16T10:36:47Z\"}\n{\"id\":517,\"first_name\":\"Benjamin\",\"last_name\":\"Choak\",\"email\":\"bchoakec@tumblr.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-07-25T05:51:24Z\"}\n{\"id\":518,\"first_name\":\"Reider\",\"last_name\":\"Fisby\",\"email\":\"rfisbyed@cdbaby.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-07-16T17:19:04Z\"}\n{\"id\":519,\"first_name\":\"Urbano\",\"last_name\":\"Barr\",\"email\":\"ubarree@ihg.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-08-16T09:11:02Z\"}\n{\"id\":520,\"first_name\":\"Heinrik\",\"last_name\":\"Courvert\",\"email\":\"hcourvertef@plala.or.jp\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-05-13T03:16:39Z\"}\n{\"id\":521,\"first_name\":\"Johann\",\"last_name\":\"Schlagtmans\",\"email\":\"jschlagtmanseg@over-blog.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-08-20T19:57:12Z\"}\n{\"id\":522,\"first_name\":\"Brendan\",\"last_name\":\"MacFadden\",\"email\":\"bmacfaddeneh@vkontakte.ru\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-08-01T22:39:44Z\"}\n{\"id\":523,\"first_name\":\"Brittaney\",\"last_name\":\"Kissock\",\"email\":\"bkissockei@state.gov\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-11-17T12:39:14Z\"}\n{\"id\":524,\"first_name\":\"Hy\",\"last_name\":\"Osmant\",\"email\":\"hosmantej@hibu.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-05-31T11:21:37Z\"}\n{\"id\":525,\"first_name\":\"Christean\",\"last_name\":\"Okell\",\"email\":\"cokellek@sphinn.com\",\"job\":\"VP 
Sales\",\"timestamp\":\"2022-06-11T06:54:34Z\"}\n{\"id\":526,\"first_name\":\"Catharina\",\"last_name\":\"Onians\",\"email\":\"coniansel@seesaa.net\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-12-02T18:12:53Z\"}\n{\"id\":527,\"first_name\":\"Murry\",\"last_name\":\"Gillings\",\"email\":\"mgillingsem@vinaora.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-10-29T18:22:04Z\"}\n{\"id\":528,\"first_name\":\"Eunice\",\"last_name\":\"Cottisford\",\"email\":\"ecottisforden@ebay.co.uk\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-03-05T20:38:13Z\"}\n{\"id\":529,\"first_name\":\"Brigg\",\"last_name\":\"Earie\",\"email\":\"bearieeo@printfriendly.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-09-02T02:50:14Z\"}\n{\"id\":530,\"first_name\":\"Pavla\",\"last_name\":\"Gooder\",\"email\":\"pgooderep@sphinn.com\",\"job\":\"Biostatistician II\",\"timestamp\":\"2022-08-14T11:30:27Z\"}\n{\"id\":531,\"first_name\":\"Vite\",\"last_name\":\"Hendrichs\",\"email\":\"vhendrichseq@wikia.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-10-22T09:52:12Z\"}\n{\"id\":532,\"first_name\":\"Genovera\",\"last_name\":\"Hucker\",\"email\":\"ghuckerer@diigo.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-08-31T22:16:03Z\"}\n{\"id\":533,\"first_name\":\"Kristofor\",\"last_name\":\"Gee\",\"email\":\"kgeees@unblog.fr\",\"job\":\"Software Consultant\",\"timestamp\":\"2021-12-30T22:01:09Z\"}\n{\"id\":534,\"first_name\":\"Rozanne\",\"last_name\":\"Killoran\",\"email\":\"rkilloranet@sun.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-07-10T01:32:57Z\"}\n{\"id\":535,\"first_name\":\"Maurizio\",\"last_name\":\"Whitby\",\"email\":\"mwhitbyeu@reference.com\",\"job\":\"Senior Editor\",\"timestamp\":\"2021-12-31T05:35:45Z\"}\n{\"id\":536,\"first_name\":\"Gerard\",\"last_name\":\"Yukhnev\",\"email\":\"gyukhnevev@merriam-webster.com\",\"job\":\"Environmental 
Tech\",\"timestamp\":\"2022-09-21T02:01:45Z\"}\n{\"id\":537,\"first_name\":\"Abe\",\"last_name\":\"Fleg\",\"email\":\"aflegew@acquirethisname.com\",\"job\":\"Database Administrator III\",\"timestamp\":\"2022-03-19T11:41:21Z\"}\n{\"id\":538,\"first_name\":\"Roseanna\",\"last_name\":\"Lovewell\",\"email\":\"rlovewellex@weebly.com\",\"job\":\"Account Representative II\",\"timestamp\":\"2022-04-10T19:30:59Z\"}\n{\"id\":539,\"first_name\":\"Joya\",\"last_name\":\"Makin\",\"email\":\"jmakiney@dmoz.org\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-05-22T09:39:40Z\"}\n{\"id\":540,\"first_name\":\"Otto\",\"last_name\":\"Kinneir\",\"email\":\"okinneirez@mysql.com\",\"job\":\"Biostatistician III\",\"timestamp\":\"2022-06-05T12:49:16Z\"}\n{\"id\":541,\"first_name\":\"Candy\",\"last_name\":\"Pedrol\",\"email\":\"cpedrolf0@sina.com.cn\",\"job\":\"Social Worker\",\"timestamp\":\"2022-02-08T15:10:50Z\"}\n{\"id\":542,\"first_name\":\"Margarita\",\"last_name\":\"Hembling\",\"email\":\"mhemblingf1@cpanel.net\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-07-02T03:09:59Z\"}\n{\"id\":543,\"first_name\":\"Claribel\",\"last_name\":\"Pirouet\",\"email\":\"cpirouetf2@apache.org\",\"job\":\"Software Engineer II\",\"timestamp\":\"2022-04-27T19:40:05Z\"}\n{\"id\":544,\"first_name\":\"Jerrold\",\"last_name\":\"Anglish\",\"email\":\"janglishf3@unblog.fr\",\"job\":\"Web Designer III\",\"timestamp\":\"2022-10-13T04:12:21Z\"}\n{\"id\":545,\"first_name\":\"Barr\",\"last_name\":\"Humberston\",\"email\":\"bhumberstonf4@etsy.com\",\"job\":\"Statistician II\",\"timestamp\":\"2022-05-04T01:15:05Z\"}\n{\"id\":546,\"first_name\":\"Elbertine\",\"last_name\":\"Fellnee\",\"email\":\"efellneef5@reuters.com\",\"job\":\"Health Coach II\",\"timestamp\":\"2022-05-04T21:05:03Z\"}\n{\"id\":547,\"first_name\":\"Edithe\",\"last_name\":\"Hackin\",\"email\":\"ehackinf6@uol.com.br\",\"job\":\"Senior Sales 
Associate\",\"timestamp\":\"2022-09-22T06:50:43Z\"}\n{\"id\":548,\"first_name\":\"Lyndsay\",\"last_name\":\"Bartoli\",\"email\":\"lbartolif7@slate.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2021-12-10T17:25:16Z\"}\n{\"id\":549,\"first_name\":\"Mason\",\"last_name\":\"Furney\",\"email\":\"mfurneyf8@si.edu\",\"job\":\"Web Designer II\",\"timestamp\":\"2022-11-29T08:36:41Z\"}\n{\"id\":550,\"first_name\":\"Barbey\",\"last_name\":\"Mindenhall\",\"email\":\"bmindenhallf9@dailymotion.com\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-06-23T05:29:44Z\"}\n{\"id\":551,\"first_name\":\"Jodi\",\"last_name\":\"Olekhov\",\"email\":\"jolekhovfa@oakley.com\",\"job\":\"General Manager\",\"timestamp\":\"2022-02-19T14:46:26Z\"}\n{\"id\":552,\"first_name\":\"Jillayne\",\"last_name\":\"Newis\",\"email\":\"jnewisfb@e-recht24.de\",\"job\":\"Research Associate\",\"timestamp\":\"2022-11-13T17:04:16Z\"}\n{\"id\":553,\"first_name\":\"Juliet\",\"last_name\":\"Ridsdale\",\"email\":\"jridsdalefc@loc.gov\",\"job\":\"Software Engineer III\",\"timestamp\":\"2021-12-22T08:30:46Z\"}\n{\"id\":554,\"first_name\":\"Der\",\"last_name\":\"Troth\",\"email\":\"dtrothfd@samsung.com\",\"job\":\"Software Test Engineer IV\",\"timestamp\":\"2022-12-05T05:25:54Z\"}\n{\"id\":555,\"first_name\":\"Sibley\",\"last_name\":\"Aldin\",\"email\":\"saldinfe@sogou.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-10-19T12:41:24Z\"}\n{\"id\":556,\"first_name\":\"Dorris\",\"last_name\":\"Blizard\",\"email\":\"dblizardff@umich.edu\",\"job\":\"Database Administrator I\",\"timestamp\":\"2022-04-07T23:05:04Z\"}\n{\"id\":557,\"first_name\":\"Blake\",\"last_name\":\"Whates\",\"email\":\"bwhatesfg@1688.com\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-05-23T17:32:50Z\"}\n{\"id\":558,\"first_name\":\"Codi\",\"last_name\":\"Marke\",\"email\":\"cmarkefh@wsj.com\",\"job\":\"Research Assistant 
III\",\"timestamp\":\"2022-02-02T22:23:33Z\"}\n{\"id\":559,\"first_name\":\"Stanislas\",\"last_name\":\"Brafield\",\"email\":\"sbrafieldfi@timesonline.co.uk\",\"job\":\"Assistant Manager\",\"timestamp\":\"2021-12-21T16:11:21Z\"}\n{\"id\":560,\"first_name\":\"Lottie\",\"last_name\":\"Sperring\",\"email\":\"lsperringfj@salon.com\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-06-21T13:48:13Z\"}\n{\"id\":561,\"first_name\":\"Alvina\",\"last_name\":\"Beausang\",\"email\":\"abeausangfk@blogspot.com\",\"job\":\"Biostatistician I\",\"timestamp\":\"2022-11-20T19:04:41Z\"}\n{\"id\":562,\"first_name\":\"Barton\",\"last_name\":\"Spencelayh\",\"email\":\"bspencelayhfl@histats.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-10-20T00:46:24Z\"}\n{\"id\":563,\"first_name\":\"Allayne\",\"last_name\":\"Treasure\",\"email\":\"atreasurefm@xrea.com\",\"job\":\"Accountant III\",\"timestamp\":\"2022-10-09T13:55:39Z\"}\n{\"id\":564,\"first_name\":\"Odey\",\"last_name\":\"Van der Velde\",\"email\":\"ovanderveldefn@t-online.de\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-02-10T08:44:33Z\"}\n{\"id\":565,\"first_name\":\"Eleanora\",\"last_name\":\"Luchelli\",\"email\":\"eluchellifo@livejournal.com\",\"job\":\"Engineer I\",\"timestamp\":\"2022-06-08T22:10:32Z\"}\n{\"id\":566,\"first_name\":\"Gerta\",\"last_name\":\"Iskowitz\",\"email\":\"giskowitzfp@bravesites.com\",\"job\":\"Nurse\",\"timestamp\":\"2021-12-14T18:48:43Z\"}\n{\"id\":567,\"first_name\":\"Asa\",\"last_name\":\"Gregorin\",\"email\":\"agregorinfq@addtoany.com\",\"job\":\"Professor\",\"timestamp\":\"2022-08-14T23:21:12Z\"}\n{\"id\":568,\"first_name\":\"Marion\",\"last_name\":\"MacManus\",\"email\":\"mmacmanusfr@cdc.gov\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-06-03T08:50:13Z\"}\n{\"id\":569,\"first_name\":\"Dionis\",\"last_name\":\"Klimowski\",\"email\":\"dklimowskifs@constantcontact.com\",\"job\":\"Automation Specialist 
III\",\"timestamp\":\"2022-02-14T12:33:54Z\"}\n{\"id\":570,\"first_name\":\"Katherine\",\"last_name\":\"McCarle\",\"email\":\"kmccarleft@sogou.com\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-03-31T12:05:36Z\"}\n{\"id\":571,\"first_name\":\"Evelyn\",\"last_name\":\"MacCarrane\",\"email\":\"emaccarranefu@cam.ac.uk\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-11-06T20:16:28Z\"}\n{\"id\":572,\"first_name\":\"Carolynn\",\"last_name\":\"Forsaith\",\"email\":\"cforsaithfv@blogs.com\",\"job\":\"Health Coach IV\",\"timestamp\":\"2022-02-08T14:49:50Z\"}\n{\"id\":573,\"first_name\":\"Pancho\",\"last_name\":\"Grealish\",\"email\":\"pgrealishfw@fc2.com\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-01-22T03:02:35Z\"}\n{\"id\":574,\"first_name\":\"Royall\",\"last_name\":\"Watson-Brown\",\"email\":\"rwatsonbrownfx@a8.net\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-03-15T23:38:10Z\"}\n{\"id\":575,\"first_name\":\"Derk\",\"last_name\":\"Carvill\",\"email\":\"dcarvillfy@marriott.com\",\"job\":\"Nurse\",\"timestamp\":\"2021-12-13T14:15:47Z\"}\n{\"id\":576,\"first_name\":\"Siffre\",\"last_name\":\"Poston\",\"email\":\"spostonfz@creativecommons.org\",\"job\":\"Librarian\",\"timestamp\":\"2022-07-03T08:01:40Z\"}\n{\"id\":577,\"first_name\":\"Layla\",\"last_name\":\"Monckton\",\"email\":\"lmoncktong0@constantcontact.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-08-07T22:49:41Z\"}\n{\"id\":578,\"first_name\":\"Ellyn\",\"last_name\":\"Masse\",\"email\":\"emasseg1@archive.org\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-11-11T21:31:00Z\"}\n{\"id\":579,\"first_name\":\"Gilbertina\",\"last_name\":\"Younglove\",\"email\":\"gyoungloveg2@omniture.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-05-16T22:24:11Z\"}\n{\"id\":580,\"first_name\":\"Hansiain\",\"last_name\":\"Eddisford\",\"email\":\"heddisfordg3@newsvine.com\",\"job\":\"Design 
Engineer\",\"timestamp\":\"2022-12-06T18:22:45Z\"}\n{\"id\":581,\"first_name\":\"Sydney\",\"last_name\":\"Writer\",\"email\":\"swriterg4@yahoo.co.jp\",\"job\":\"Media Manager I\",\"timestamp\":\"2022-10-06T10:57:14Z\"}\n{\"id\":582,\"first_name\":\"Stuart\",\"last_name\":\"Jimeno\",\"email\":\"sjimenog5@nifty.com\",\"job\":\"Safety Technician IV\",\"timestamp\":\"2022-01-03T22:49:27Z\"}\n{\"id\":583,\"first_name\":\"Averill\",\"last_name\":\"Leuchars\",\"email\":\"aleucharsg6@g.co\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-05-17T04:29:13Z\"}\n{\"id\":584,\"first_name\":\"Ruddie\",\"last_name\":\"Bickerdicke\",\"email\":\"rbickerdickeg7@earthlink.net\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-10-03T06:19:29Z\"}\n{\"id\":585,\"first_name\":\"Krissie\",\"last_name\":\"Walford\",\"email\":\"kwalfordg8@a8.net\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-06-26T23:22:59Z\"}\n{\"id\":586,\"first_name\":\"Zarah\",\"last_name\":\"Ingleton\",\"email\":\"zingletong9@si.edu\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-04-27T12:56:33Z\"}\n{\"id\":587,\"first_name\":\"Ira\",\"last_name\":\"Jaxon\",\"email\":\"ijaxonga@virginia.edu\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-01-24T07:33:08Z\"}\n{\"id\":588,\"first_name\":\"Gusella\",\"last_name\":\"Musla\",\"email\":\"gmuslagb@dmoz.org\",\"job\":\"Web Designer III\",\"timestamp\":\"2022-11-22T01:03:00Z\"}\n{\"id\":589,\"first_name\":\"Hildegarde\",\"last_name\":\"Breem\",\"email\":\"hbreemgc@goo.gl\",\"job\":\"Web Designer I\",\"timestamp\":\"2022-04-01T05:45:18Z\"}\n{\"id\":590,\"first_name\":\"Thaddeus\",\"last_name\":\"Scouller\",\"email\":\"tscoullergd@bloglovin.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-04-02T08:13:20Z\"}\n{\"id\":591,\"first_name\":\"Martainn\",\"last_name\":\"Fevers\",\"email\":\"mfeversge@example.com\",\"job\":\"Research 
Associate\",\"timestamp\":\"2022-08-24T03:11:38Z\"}\n{\"id\":592,\"first_name\":\"Edd\",\"last_name\":\"Veasey\",\"email\":\"eveaseygf@ameblo.jp\",\"job\":\"Administrative Assistant II\",\"timestamp\":\"2022-05-26T08:06:04Z\"}\n{\"id\":593,\"first_name\":\"Elihu\",\"last_name\":\"Redgewell\",\"email\":\"eredgewellgg@nationalgeographic.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-04-26T02:56:31Z\"}\n{\"id\":594,\"first_name\":\"Renault\",\"last_name\":\"Smye\",\"email\":\"rsmyegh@wunderground.com\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-01-30T22:12:02Z\"}\n{\"id\":595,\"first_name\":\"Leeland\",\"last_name\":\"Hendricks\",\"email\":\"lhendricksgi@foxnews.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-05-30T02:24:30Z\"}\n{\"id\":596,\"first_name\":\"Collin\",\"last_name\":\"Arent\",\"email\":\"carentgj@liveinternet.ru\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2021-12-31T21:09:33Z\"}\n{\"id\":597,\"first_name\":\"Kameko\",\"last_name\":\"Pierce\",\"email\":\"kpiercegk@china.com.cn\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-11-16T07:31:05Z\"}\n{\"id\":598,\"first_name\":\"Mei\",\"last_name\":\"Pigne\",\"email\":\"mpignegl@ihg.com\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-04-23T10:59:50Z\"}\n{\"id\":599,\"first_name\":\"Jenni\",\"last_name\":\"Skeggs\",\"email\":\"jskeggsgm@wikipedia.org\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-05-25T06:30:40Z\"}\n{\"id\":600,\"first_name\":\"Carver\",\"last_name\":\"Rivalland\",\"email\":\"crivallandgn@cornell.edu\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-01-31T17:19:48Z\"}\n{\"id\":601,\"first_name\":\"Ciro\",\"last_name\":\"MacLaverty\",\"email\":\"cmaclavertygo@usda.gov\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-05-10T19:41:08Z\"}\n{\"id\":602,\"first_name\":\"Brook\",\"last_name\":\"Stickells\",\"email\":\"bstickellsgp@prnewswire.com\",\"job\":\"Help Desk 
Technician\",\"timestamp\":\"2021-12-12T18:23:13Z\"}\n{\"id\":603,\"first_name\":\"Morty\",\"last_name\":\"Varfolomeev\",\"email\":\"mvarfolomeevgq@toplist.cz\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-01-03T17:52:59Z\"}\n{\"id\":604,\"first_name\":\"Ilario\",\"last_name\":\"Silman\",\"email\":\"isilmangr@princeton.edu\",\"job\":\"Programmer I\",\"timestamp\":\"2021-12-16T15:24:55Z\"}\n{\"id\":605,\"first_name\":\"Lemar\",\"last_name\":\"Groll\",\"email\":\"lgrollgs@dyndns.org\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-01-19T15:59:40Z\"}\n{\"id\":606,\"first_name\":\"Titos\",\"last_name\":\"Thorrington\",\"email\":\"tthorringtongt@cafepress.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-10-06T23:19:20Z\"}\n{\"id\":607,\"first_name\":\"Ivonne\",\"last_name\":\"Yakov\",\"email\":\"iyakovgu@alibaba.com\",\"job\":\"Programmer Analyst I\",\"timestamp\":\"2022-09-02T09:44:39Z\"}\n{\"id\":608,\"first_name\":\"Cherish\",\"last_name\":\"Poinsett\",\"email\":\"cpoinsettgv@latimes.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-01-23T21:57:24Z\"}\n{\"id\":609,\"first_name\":\"Conrad\",\"last_name\":\"Edmondson\",\"email\":\"cedmondsongw@bravesites.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-06-02T08:27:23Z\"}\n{\"id\":610,\"first_name\":\"Zachary\",\"last_name\":\"Debney\",\"email\":\"zdebneygx@squidoo.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-05-16T03:14:27Z\"}\n{\"id\":611,\"first_name\":\"Candy\",\"last_name\":\"Mc Harg\",\"email\":\"cmcharggy@wikimedia.org\",\"job\":\"Account Executive\",\"timestamp\":\"2022-06-24T22:49:07Z\"}\n{\"id\":612,\"first_name\":\"Stormi\",\"last_name\":\"Stockford\",\"email\":\"sstockfordgz@thetimes.co.uk\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-06-02T06:18:14Z\"}\n{\"id\":613,\"first_name\":\"Robin\",\"last_name\":\"Antalffy\",\"email\":\"rantalffyh0@blinklist.com\",\"job\":\"Design 
Engineer\",\"timestamp\":\"2022-10-30T16:07:38Z\"}\n{\"id\":614,\"first_name\":\"Elaina\",\"last_name\":\"Dunkinson\",\"email\":\"edunkinsonh1@istockphoto.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-08-12T01:26:15Z\"}\n{\"id\":615,\"first_name\":\"Merilyn\",\"last_name\":\"Annable\",\"email\":\"mannableh2@sourceforge.net\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-11-13T10:13:18Z\"}\n{\"id\":616,\"first_name\":\"Ferris\",\"last_name\":\"Swetmore\",\"email\":\"fswetmoreh3@mediafire.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-07-01T14:36:04Z\"}\n{\"id\":617,\"first_name\":\"Alf\",\"last_name\":\"Ozintsev\",\"email\":\"aozintsevh4@businessweek.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-11-19T03:10:43Z\"}\n{\"id\":618,\"first_name\":\"Franky\",\"last_name\":\"Ralton\",\"email\":\"fraltonh5@weather.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-01-30T19:17:00Z\"}\n{\"id\":619,\"first_name\":\"Hedvige\",\"last_name\":\"Rowlands\",\"email\":\"hrowlandsh6@comcast.net\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-09-09T11:04:09Z\"}\n{\"id\":620,\"first_name\":\"Tynan\",\"last_name\":\"Crippell\",\"email\":\"tcrippellh7@berkeley.edu\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-09-21T03:11:14Z\"}\n{\"id\":621,\"first_name\":\"Alexine\",\"last_name\":\"Rawlinson\",\"email\":\"arawlinsonh8@boston.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-10-18T23:38:56Z\"}\n{\"id\":622,\"first_name\":\"Yehudit\",\"last_name\":\"Couldwell\",\"email\":\"ycouldwellh9@scientificamerican.com\",\"job\":\"Safety Technician I\",\"timestamp\":\"2022-03-28T18:53:05Z\"}\n{\"id\":623,\"first_name\":\"Eleanora\",\"last_name\":\"Bromont\",\"email\":\"ebromontha@tinyurl.com\",\"job\":\"Teacher\",\"timestamp\":\"2022-05-20T03:41:08Z\"}\n{\"id\":624,\"first_name\":\"Vincenty\",\"last_name\":\"Rackham\",\"email\":\"vrackhamhb@blogtalkradio.com\",\"job\":\"Senior Financial 
Analyst\",\"timestamp\":\"2022-05-10T00:44:21Z\"}\n{\"id\":625,\"first_name\":\"Rozella\",\"last_name\":\"Stent\",\"email\":\"rstenthc@so-net.ne.jp\",\"job\":\"Actuary\",\"timestamp\":\"2022-04-16T06:16:59Z\"}\n{\"id\":626,\"first_name\":\"Kerwinn\",\"last_name\":\"Possel\",\"email\":\"kposselhd@umn.edu\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-07-30T17:35:49Z\"}\n{\"id\":627,\"first_name\":\"Griffie\",\"last_name\":\"Quibell\",\"email\":\"gquibellhe@newyorker.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-06-14T19:28:57Z\"}\n{\"id\":628,\"first_name\":\"Anatola\",\"last_name\":\"Mallion\",\"email\":\"amallionhf@upenn.edu\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-03-15T17:11:58Z\"}\n{\"id\":629,\"first_name\":\"Dalila\",\"last_name\":\"Christaeas\",\"email\":\"dchristaeashg@bandcamp.com\",\"job\":\"Automation Specialist II\",\"timestamp\":\"2022-02-03T01:42:06Z\"}\n{\"id\":630,\"first_name\":\"Gina\",\"last_name\":\"Franses\",\"email\":\"gfranseshh@hao123.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-09-02T08:10:09Z\"}\n{\"id\":631,\"first_name\":\"Clio\",\"last_name\":\"Richardt\",\"email\":\"crichardthi@joomla.org\",\"job\":\"Nurse\",\"timestamp\":\"2021-12-31T19:05:01Z\"}\n{\"id\":632,\"first_name\":\"Aryn\",\"last_name\":\"Hofler\",\"email\":\"ahoflerhj@free.fr\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2021-12-31T02:58:46Z\"}\n{\"id\":633,\"first_name\":\"Berthe\",\"last_name\":\"Pecht\",\"email\":\"bpechthk@soundcloud.com\",\"job\":\"Safety Technician III\",\"timestamp\":\"2022-11-28T13:33:39Z\"}\n{\"id\":634,\"first_name\":\"Marty\",\"last_name\":\"Crichten\",\"email\":\"mcrichtenhl@joomla.org\",\"job\":\"Technical 
Writer\",\"timestamp\":\"2022-11-20T17:22:51Z\"}\n{\"id\":635,\"first_name\":\"Costanza\",\"last_name\":\"Grigorushkin\",\"email\":\"cgrigorushkinhm@seesaa.net\",\"job\":\"Nurse\",\"timestamp\":\"2022-07-23T11:29:41Z\"}\n{\"id\":636,\"first_name\":\"Janet\",\"last_name\":\"Northidge\",\"email\":\"jnorthidgehn@elpais.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-05-29T19:03:17Z\"}\n{\"id\":637,\"first_name\":\"Charo\",\"last_name\":\"Esp\",\"email\":\"cespho@google.com.au\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-11-07T00:24:12Z\"}\n{\"id\":638,\"first_name\":\"Livvy\",\"last_name\":\"Grzelewski\",\"email\":\"lgrzelewskihp@alibaba.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-02-06T05:24:56Z\"}\n{\"id\":639,\"first_name\":\"Hernando\",\"last_name\":\"Bryde\",\"email\":\"hbrydehq@a8.net\",\"job\":\"Professor\",\"timestamp\":\"2022-11-01T19:40:15Z\"}\n{\"id\":640,\"first_name\":\"Biddy\",\"last_name\":\"Vine\",\"email\":\"bvinehr@yahoo.co.jp\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-04-10T06:19:06Z\"}\n{\"id\":641,\"first_name\":\"Base\",\"last_name\":\"Friend\",\"email\":\"bfriendhs@blogs.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-09-03T07:38:34Z\"}\n{\"id\":642,\"first_name\":\"Marian\",\"last_name\":\"Basek\",\"email\":\"mbasekht@shop-pro.jp\",\"job\":\"Teacher\",\"timestamp\":\"2022-09-08T13:13:12Z\"}\n{\"id\":643,\"first_name\":\"Ketty\",\"last_name\":\"Clowney\",\"email\":\"kclowneyhu@illinois.edu\",\"job\":\"Research Assistant I\",\"timestamp\":\"2022-02-20T17:55:16Z\"}\n{\"id\":644,\"first_name\":\"Thurston\",\"last_name\":\"Bossom\",\"email\":\"tbossomhv@netvibes.com\",\"job\":\"Statistician III\",\"timestamp\":\"2022-07-08T06:19:02Z\"}\n{\"id\":645,\"first_name\":\"Laure\",\"last_name\":\"Durrad\",\"email\":\"ldurradhw@google.pl\",\"job\":\"Staff 
Scientist\",\"timestamp\":\"2022-03-04T04:48:35Z\"}\n{\"id\":646,\"first_name\":\"Mildrid\",\"last_name\":\"Gloy\",\"email\":\"mgloyhx@themeforest.net\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-01-08T22:58:28Z\"}\n{\"id\":647,\"first_name\":\"Johannah\",\"last_name\":\"Dorward\",\"email\":\"jdorwardhy@ovh.net\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-08-31T07:35:18Z\"}\n{\"id\":648,\"first_name\":\"Breena\",\"last_name\":\"Sidebottom\",\"email\":\"bsidebottomhz@networksolutions.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2021-12-22T03:36:41Z\"}\n{\"id\":649,\"first_name\":\"Jemie\",\"last_name\":\"Bunch\",\"email\":\"jbunchi0@cnet.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-06-03T16:40:56Z\"}\n{\"id\":650,\"first_name\":\"Daphna\",\"last_name\":\"Matchett\",\"email\":\"dmatchetti1@lulu.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-09-14T16:03:23Z\"}\n{\"id\":651,\"first_name\":\"Aymer\",\"last_name\":\"Lamb-shine\",\"email\":\"alambshinei2@miibeian.gov.cn\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-05-14T06:33:03Z\"}\n{\"id\":652,\"first_name\":\"Brady\",\"last_name\":\"O'Cuddie\",\"email\":\"bocuddiei3@google.nl\",\"job\":\"Research Assistant III\",\"timestamp\":\"2022-10-28T18:25:01Z\"}\n{\"id\":653,\"first_name\":\"Orion\",\"last_name\":\"Scane\",\"email\":\"oscanei4@ed.gov\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-07-31T08:16:20Z\"}\n{\"id\":654,\"first_name\":\"Bord\",\"last_name\":\"Cundy\",\"email\":\"bcundyi5@va.gov\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-05-14T12:57:30Z\"}\n{\"id\":655,\"first_name\":\"Claudio\",\"last_name\":\"Fowls\",\"email\":\"cfowlsi6@1688.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-01-02T16:42:42Z\"}\n{\"id\":656,\"first_name\":\"Leif\",\"last_name\":\"Inkster\",\"email\":\"linksteri7@yolasite.com\",\"job\":\"Assistant 
Manager\",\"timestamp\":\"2022-10-24T14:04:24Z\"}\n{\"id\":657,\"first_name\":\"Giacinta\",\"last_name\":\"Wiley\",\"email\":\"gwileyi8@1688.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-09-26T16:26:19Z\"}\n{\"id\":658,\"first_name\":\"Karylin\",\"last_name\":\"Allcock\",\"email\":\"kallcocki9@unesco.org\",\"job\":\"Accounting Assistant II\",\"timestamp\":\"2022-05-23T22:15:24Z\"}\n{\"id\":659,\"first_name\":\"Krisha\",\"last_name\":\"Cadden\",\"email\":\"kcaddenia@purevolume.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-03-13T05:30:02Z\"}\n{\"id\":660,\"first_name\":\"Darnall\",\"last_name\":\"Grayer\",\"email\":\"dgrayerib@cargocollective.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-05-26T11:19:15Z\"}\n{\"id\":661,\"first_name\":\"Vin\",\"last_name\":\"Brinsden\",\"email\":\"vbrinsdenic@cnet.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-10-27T11:50:28Z\"}\n{\"id\":662,\"first_name\":\"Lori\",\"last_name\":\"Hartzogs\",\"email\":\"lhartzogsid@admin.ch\",\"job\":\"Senior Editor\",\"timestamp\":\"2021-12-23T05:37:47Z\"}\n{\"id\":663,\"first_name\":\"Kim\",\"last_name\":\"MacAlinden\",\"email\":\"kmacalindenie@cornell.edu\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-07-09T04:11:04Z\"}\n{\"id\":664,\"first_name\":\"Trever\",\"last_name\":\"Pirnie\",\"email\":\"tpirnieif@msu.edu\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-06-15T21:46:38Z\"}\n{\"id\":665,\"first_name\":\"Deidre\",\"last_name\":\"Kinloch\",\"email\":\"dkinlochig@salon.com\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-10-24T05:33:17Z\"}\n{\"id\":666,\"first_name\":\"Christabella\",\"last_name\":\"Vecard\",\"email\":\"cvecardih@nydailynews.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-11-13T22:26:59Z\"}\n{\"id\":667,\"first_name\":\"Bobbye\",\"last_name\":\"Kanzler\",\"email\":\"bkanzlerii@cdbaby.com\",\"job\":\"Food 
Chemist\",\"timestamp\":\"2022-09-23T19:25:44Z\"}\n{\"id\":668,\"first_name\":\"Ellen\",\"last_name\":\"O'Monahan\",\"email\":\"eomonahanij@mapquest.com\",\"job\":\"Budget/Accounting Analyst III\",\"timestamp\":\"2022-08-24T01:54:47Z\"}\n{\"id\":669,\"first_name\":\"Nickolaus\",\"last_name\":\"Bilbie\",\"email\":\"nbilbieik@ucoz.ru\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-02-04T07:50:33Z\"}\n{\"id\":670,\"first_name\":\"Brooks\",\"last_name\":\"Mableson\",\"email\":\"bmablesonil@toplist.cz\",\"job\":\"Editor\",\"timestamp\":\"2022-05-17T16:53:27Z\"}\n{\"id\":671,\"first_name\":\"Joyann\",\"last_name\":\"Tavinor\",\"email\":\"jtavinorim@tamu.edu\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-09-06T03:01:20Z\"}\n{\"id\":672,\"first_name\":\"Mathe\",\"last_name\":\"Valerius\",\"email\":\"mvaleriusin@squarespace.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-03-08T15:41:18Z\"}\n{\"id\":673,\"first_name\":\"Donalt\",\"last_name\":\"Sainz\",\"email\":\"dsainzio@biglobe.ne.jp\",\"job\":\"Nurse\",\"timestamp\":\"2022-03-19T11:31:30Z\"}\n{\"id\":674,\"first_name\":\"Tobey\",\"last_name\":\"Semeradova\",\"email\":\"tsemeradovaip@google.ru\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-07-29T04:18:03Z\"}\n{\"id\":675,\"first_name\":\"Hendrik\",\"last_name\":\"Patman\",\"email\":\"hpatmaniq@zdnet.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-02-03T16:16:17Z\"}\n{\"id\":676,\"first_name\":\"Trina\",\"last_name\":\"Whopples\",\"email\":\"twhopplesir@who.int\",\"job\":\"Librarian\",\"timestamp\":\"2022-07-07T17:52:03Z\"}\n{\"id\":677,\"first_name\":\"Merrick\",\"last_name\":\"Ussher\",\"email\":\"mussheris@oakley.com\",\"job\":\"Software Test Engineer III\",\"timestamp\":\"2022-01-21T05:56:03Z\"}\n{\"id\":678,\"first_name\":\"Jennine\",\"last_name\":\"Mielnik\",\"email\":\"jmielnikit@ocn.ne.jp\",\"job\":\"Software Test Engineer 
III\",\"timestamp\":\"2022-07-25T19:09:52Z\"}\n{\"id\":679,\"first_name\":\"Kenny\",\"last_name\":\"Greeve\",\"email\":\"kgreeveiu@ow.ly\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-10-16T20:23:43Z\"}\n{\"id\":680,\"first_name\":\"Hasheem\",\"last_name\":\"Franklyn\",\"email\":\"hfranklyniv@typepad.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-05-26T01:47:42Z\"}\n{\"id\":681,\"first_name\":\"Wendel\",\"last_name\":\"Dicken\",\"email\":\"wdickeniw@ft.com\",\"job\":\"Automation Specialist IV\",\"timestamp\":\"2022-06-23T02:50:42Z\"}\n{\"id\":682,\"first_name\":\"Harlene\",\"last_name\":\"Semaine\",\"email\":\"hsemaineix@odnoklassniki.ru\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-05-31T14:10:40Z\"}\n{\"id\":683,\"first_name\":\"Timothea\",\"last_name\":\"Kilminster\",\"email\":\"tkilminsteriy@ning.com\",\"job\":\"Operator\",\"timestamp\":\"2022-07-28T21:06:36Z\"}\n{\"id\":684,\"first_name\":\"Ram\",\"last_name\":\"Lindelof\",\"email\":\"rlindelofiz@mail.ru\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-01-11T12:40:10Z\"}\n{\"id\":685,\"first_name\":\"Marven\",\"last_name\":\"Wollen\",\"email\":\"mwollenj0@printfriendly.com\",\"job\":\"Automation Specialist I\",\"timestamp\":\"2022-05-24T11:30:18Z\"}\n{\"id\":686,\"first_name\":\"Nikoletta\",\"last_name\":\"Shimmin\",\"email\":\"nshimminj1@taobao.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-01-02T14:01:08Z\"}\n{\"id\":687,\"first_name\":\"Wheeler\",\"last_name\":\"Beincken\",\"email\":\"wbeinckenj2@symantec.com\",\"job\":\"Professor\",\"timestamp\":\"2022-12-05T15:00:24Z\"}\n{\"id\":688,\"first_name\":\"Mufi\",\"last_name\":\"Slimon\",\"email\":\"mslimonj3@arizona.edu\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-02-17T18:27:49Z\"}\n{\"id\":689,\"first_name\":\"Debee\",\"last_name\":\"Heavyside\",\"email\":\"dheavysidej4@ftc.gov\",\"job\":\"Accounting Assistant 
II\",\"timestamp\":\"2022-04-21T20:06:43Z\"}\n{\"id\":690,\"first_name\":\"Lois\",\"last_name\":\"Choules\",\"email\":\"lchoulesj5@sbwire.com\",\"job\":\"Computer Systems Analyst IV\",\"timestamp\":\"2022-04-23T05:50:57Z\"}\n{\"id\":691,\"first_name\":\"Issie\",\"last_name\":\"Rosenberg\",\"email\":\"irosenbergj6@spotify.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-05-06T21:27:41Z\"}\n{\"id\":692,\"first_name\":\"Cicely\",\"last_name\":\"Costen\",\"email\":\"ccostenj7@goo.gl\",\"job\":\"Staff Scientist\",\"timestamp\":\"2021-12-21T02:39:27Z\"}\n{\"id\":693,\"first_name\":\"Barbe\",\"last_name\":\"Kinneir\",\"email\":\"bkinneirj8@artisteer.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-06-11T10:08:55Z\"}\n{\"id\":694,\"first_name\":\"Emlyn\",\"last_name\":\"Adamski\",\"email\":\"eadamskij9@hubpages.com\",\"job\":\"Programmer I\",\"timestamp\":\"2022-10-03T15:22:54Z\"}\n{\"id\":695,\"first_name\":\"Rebeca\",\"last_name\":\"Lorenzini\",\"email\":\"rlorenzinija@auda.org.au\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-01-25T17:06:52Z\"}\n{\"id\":696,\"first_name\":\"Burke\",\"last_name\":\"Dalzell\",\"email\":\"bdalzelljb@bing.com\",\"job\":\"Programmer I\",\"timestamp\":\"2022-06-25T11:26:32Z\"}\n{\"id\":697,\"first_name\":\"Danila\",\"last_name\":\"Munton\",\"email\":\"dmuntonjc@oracle.com\",\"job\":\"Administrative Assistant III\",\"timestamp\":\"2022-02-07T19:08:39Z\"}\n{\"id\":698,\"first_name\":\"Pablo\",\"last_name\":\"Ritchman\",\"email\":\"pritchmanjd@virginia.edu\",\"job\":\"Research Associate\",\"timestamp\":\"2022-01-10T02:21:15Z\"}\n{\"id\":699,\"first_name\":\"Jillana\",\"last_name\":\"Welden\",\"email\":\"jweldenje@vistaprint.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-10-14T01:12:35Z\"}\n{\"id\":700,\"first_name\":\"Quintilla\",\"last_name\":\"McDonagh\",\"email\":\"qmcdonaghjf@biblegateway.com\",\"job\":\"Food 
Chemist\",\"timestamp\":\"2021-12-09T23:20:44Z\"}\n{\"id\":701,\"first_name\":\"Gladys\",\"last_name\":\"Schoenfisch\",\"email\":\"gschoenfischjg@google.com.br\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-01-20T10:39:38Z\"}\n{\"id\":702,\"first_name\":\"Hallie\",\"last_name\":\"Gery\",\"email\":\"hgeryjh@blinklist.com\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-07-28T15:09:46Z\"}\n{\"id\":703,\"first_name\":\"Iorgos\",\"last_name\":\"Skea\",\"email\":\"iskeaji@wikia.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-11-22T18:08:01Z\"}\n{\"id\":704,\"first_name\":\"Lennard\",\"last_name\":\"Jolliman\",\"email\":\"ljollimanjj@walmart.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2021-12-23T00:39:05Z\"}\n{\"id\":705,\"first_name\":\"Barde\",\"last_name\":\"Dixie\",\"email\":\"bdixiejk@java.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-02-03T12:21:25Z\"}\n{\"id\":706,\"first_name\":\"Catherin\",\"last_name\":\"Jain\",\"email\":\"cjainjl@flavors.me\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-09-04T12:46:16Z\"}\n{\"id\":707,\"first_name\":\"Dwight\",\"last_name\":\"Axston\",\"email\":\"daxstonjm@taobao.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2021-12-11T13:11:12Z\"}\n{\"id\":708,\"first_name\":\"Gerhardine\",\"last_name\":\"More\",\"email\":\"gmorejn@liveinternet.ru\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-05-08T03:43:14Z\"}\n{\"id\":709,\"first_name\":\"Gabe\",\"last_name\":\"Dominy\",\"email\":\"gdominyjo@nba.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-04-08T18:45:30Z\"}\n{\"id\":710,\"first_name\":\"Lena\",\"last_name\":\"Abbis\",\"email\":\"labbisjp@nps.gov\",\"job\":\"Editor\",\"timestamp\":\"2022-03-09T21:30:34Z\"}\n{\"id\":711,\"first_name\":\"Viola\",\"last_name\":\"Filgate\",\"email\":\"vfilgatejq@altervista.org\",\"job\":\"Project 
Manager\",\"timestamp\":\"2022-11-08T00:41:42Z\"}\n{\"id\":712,\"first_name\":\"Rolfe\",\"last_name\":\"Ranahan\",\"email\":\"rranahanjr@sakura.ne.jp\",\"job\":\"Software Engineer III\",\"timestamp\":\"2022-10-23T12:04:16Z\"}\n{\"id\":713,\"first_name\":\"Victoria\",\"last_name\":\"Zanni\",\"email\":\"vzannijs@google.de\",\"job\":\"Accountant IV\",\"timestamp\":\"2022-03-24T07:57:40Z\"}\n{\"id\":714,\"first_name\":\"Scarlet\",\"last_name\":\"Linay\",\"email\":\"slinayjt@admin.ch\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-06-20T04:04:22Z\"}\n{\"id\":715,\"first_name\":\"Odella\",\"last_name\":\"Nursey\",\"email\":\"onurseyju@fema.gov\",\"job\":\"Computer Systems Analyst I\",\"timestamp\":\"2022-09-06T07:57:03Z\"}\n{\"id\":716,\"first_name\":\"Gretta\",\"last_name\":\"Crasswell\",\"email\":\"gcrasswelljv@i2i.jp\",\"job\":\"Media Manager IV\",\"timestamp\":\"2022-12-06T23:09:00Z\"}\n{\"id\":717,\"first_name\":\"Lorna\",\"last_name\":\"Stanman\",\"email\":\"lstanmanjw@spiegel.de\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-04-17T12:42:27Z\"}\n{\"id\":718,\"first_name\":\"Emilio\",\"last_name\":\"Gercken\",\"email\":\"egerckenjx@addthis.com\",\"job\":\"Editor\",\"timestamp\":\"2022-08-25T06:39:00Z\"}\n{\"id\":719,\"first_name\":\"Bailey\",\"last_name\":\"Tournay\",\"email\":\"btournayjy@discuz.net\",\"job\":\"Editor\",\"timestamp\":\"2022-03-19T15:02:39Z\"}\n{\"id\":720,\"first_name\":\"Magdalen\",\"last_name\":\"Gavriel\",\"email\":\"mgavrieljz@indiatimes.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-08-05T17:45:49Z\"}\n{\"id\":721,\"first_name\":\"Gretel\",\"last_name\":\"Tinkler\",\"email\":\"gtinklerk0@51.la\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-07-27T12:40:20Z\"}\n{\"id\":722,\"first_name\":\"Rona\",\"last_name\":\"Caldecutt\",\"email\":\"rcaldecuttk1@edublogs.org\",\"job\":\"Human Resources 
Manager\",\"timestamp\":\"2022-07-23T14:54:32Z\"}\n{\"id\":723,\"first_name\":\"Lynne\",\"last_name\":\"Crinidge\",\"email\":\"lcrinidgek2@google.nl\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-05-26T17:40:55Z\"}\n{\"id\":724,\"first_name\":\"Pace\",\"last_name\":\"Ambrogio\",\"email\":\"pambrogiok3@ehow.com\",\"job\":\"Accountant II\",\"timestamp\":\"2022-02-01T23:19:50Z\"}\n{\"id\":725,\"first_name\":\"Alaine\",\"last_name\":\"Durgan\",\"email\":\"adurgank4@cmu.edu\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-06-13T08:59:43Z\"}\n{\"id\":726,\"first_name\":\"Teddie\",\"last_name\":\"Nealon\",\"email\":\"tnealonk5@qq.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-05-12T16:32:14Z\"}\n{\"id\":727,\"first_name\":\"Lorelei\",\"last_name\":\"Lindstrom\",\"email\":\"llindstromk6@nyu.edu\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-02-28T06:36:27Z\"}\n{\"id\":728,\"first_name\":\"Marena\",\"last_name\":\"Treleaven\",\"email\":\"mtreleavenk7@answers.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-07-31T18:58:00Z\"}\n{\"id\":729,\"first_name\":\"Trace\",\"last_name\":\"Mouth\",\"email\":\"tmouthk8@about.me\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-04-04T08:48:58Z\"}\n{\"id\":730,\"first_name\":\"Terry\",\"last_name\":\"Dorant\",\"email\":\"tdorantk9@vkontakte.ru\",\"job\":\"Accountant III\",\"timestamp\":\"2022-03-28T09:01:58Z\"}\n{\"id\":731,\"first_name\":\"De witt\",\"last_name\":\"Tilbury\",\"email\":\"dtilburyka@istockphoto.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-08-08T21:56:03Z\"}\n{\"id\":732,\"first_name\":\"Mel\",\"last_name\":\"Kilalea\",\"email\":\"mkilaleakb@java.com\",\"job\":\"Developer III\",\"timestamp\":\"2022-01-22T22:44:51Z\"}\n{\"id\":733,\"first_name\":\"Albertina\",\"last_name\":\"Eagles\",\"email\":\"aeagleskc@typepad.com\",\"job\":\"Accounting Assistant 
IV\",\"timestamp\":\"2022-03-23T22:17:56Z\"}\n{\"id\":734,\"first_name\":\"Berthe\",\"last_name\":\"De Hailes\",\"email\":\"bdehaileskd@biblegateway.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2021-12-13T10:46:15Z\"}\n{\"id\":735,\"first_name\":\"Muffin\",\"last_name\":\"MacCawley\",\"email\":\"mmaccawleyke@yahoo.com\",\"job\":\"General Manager\",\"timestamp\":\"2022-07-04T09:03:49Z\"}\n{\"id\":736,\"first_name\":\"Glynnis\",\"last_name\":\"Petz\",\"email\":\"gpetzkf@infoseek.co.jp\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-08-26T16:14:41Z\"}\n{\"id\":737,\"first_name\":\"Davis\",\"last_name\":\"Loyns\",\"email\":\"dloynskg@phpbb.com\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-02-26T22:11:32Z\"}\n{\"id\":738,\"first_name\":\"Rayshell\",\"last_name\":\"Whittenbury\",\"email\":\"rwhittenburykh@ucoz.ru\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-08-25T13:30:55Z\"}\n{\"id\":739,\"first_name\":\"Adrien\",\"last_name\":\"Wellfare\",\"email\":\"awellfareki@unblog.fr\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-08-11T06:20:29Z\"}\n{\"id\":740,\"first_name\":\"Atlanta\",\"last_name\":\"Piccop\",\"email\":\"apiccopkj@vistaprint.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-02-22T09:26:48Z\"}\n{\"id\":741,\"first_name\":\"Glad\",\"last_name\":\"Boolsen\",\"email\":\"gboolsenkk@dailymail.co.uk\",\"job\":\"Budget/Accounting Analyst IV\",\"timestamp\":\"2022-06-15T06:48:27Z\"}\n{\"id\":742,\"first_name\":\"Marlo\",\"last_name\":\"Schenfisch\",\"email\":\"mschenfischkl@businessinsider.com\",\"job\":\"Programmer I\",\"timestamp\":\"2022-04-21T15:15:56Z\"}\n{\"id\":743,\"first_name\":\"Nadine\",\"last_name\":\"Lomb\",\"email\":\"nlombkm@theguardian.com\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-10-28T11:10:07Z\"}\n{\"id\":744,\"first_name\":\"Hartley\",\"last_name\":\"Kemetz\",\"email\":\"hkemetzkn@histats.com\",\"job\":\"Internal 
Auditor\",\"timestamp\":\"2022-01-13T20:49:23Z\"}\n{\"id\":745,\"first_name\":\"Wayland\",\"last_name\":\"Murch\",\"email\":\"wmurchko@yellowbook.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-01-19T12:11:01Z\"}\n{\"id\":746,\"first_name\":\"Chuck\",\"last_name\":\"Shama\",\"email\":\"cshamakp@noaa.gov\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-08-17T03:48:57Z\"}\n{\"id\":747,\"first_name\":\"Maximilien\",\"last_name\":\"Hender\",\"email\":\"mhenderkq@squarespace.com\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-10-09T00:31:21Z\"}\n{\"id\":748,\"first_name\":\"Hoyt\",\"last_name\":\"Sains\",\"email\":\"hsainskr@patch.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-11-02T08:21:21Z\"}\n{\"id\":749,\"first_name\":\"Raychel\",\"last_name\":\"Marsham\",\"email\":\"rmarshamks@eepurl.com\",\"job\":\"Database Administrator III\",\"timestamp\":\"2022-08-14T01:54:32Z\"}\n{\"id\":750,\"first_name\":\"Meriel\",\"last_name\":\"Slowley\",\"email\":\"mslowleykt@mail.ru\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-07-07T00:41:12Z\"}\n{\"id\":751,\"first_name\":\"Meara\",\"last_name\":\"Rawcliff\",\"email\":\"mrawcliffku@t.co\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-08-10T04:38:07Z\"}\n{\"id\":752,\"first_name\":\"Mignon\",\"last_name\":\"Klee\",\"email\":\"mkleekv@usatoday.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-10-02T13:43:33Z\"}\n{\"id\":753,\"first_name\":\"Dulci\",\"last_name\":\"Simonou\",\"email\":\"dsimonoukw@i2i.jp\",\"job\":\"Operator\",\"timestamp\":\"2022-10-11T17:09:14Z\"}\n{\"id\":754,\"first_name\":\"Candis\",\"last_name\":\"Letford\",\"email\":\"cletfordkx@cyberchimps.com\",\"job\":\"Accountant I\",\"timestamp\":\"2022-03-30T08:46:25Z\"}\n{\"id\":755,\"first_name\":\"Carmen\",\"last_name\":\"Crighton\",\"email\":\"ccrightonky@prlog.org\",\"job\":\"Assistant 
Professor\",\"timestamp\":\"2022-09-26T06:42:14Z\"}\n{\"id\":756,\"first_name\":\"Warner\",\"last_name\":\"Sinott\",\"email\":\"wsinottkz@dropbox.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-05-20T22:40:55Z\"}\n{\"id\":757,\"first_name\":\"Viv\",\"last_name\":\"Moylan\",\"email\":\"vmoylanl0@jimdo.com\",\"job\":\"Account Representative II\",\"timestamp\":\"2021-12-08T21:43:10Z\"}\n{\"id\":758,\"first_name\":\"Heddie\",\"last_name\":\"Beynke\",\"email\":\"hbeynkel1@amazon.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2021-12-16T14:39:59Z\"}\n{\"id\":759,\"first_name\":\"Roberto\",\"last_name\":\"Bottle\",\"email\":\"rbottlel2@mlb.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2021-12-23T04:18:27Z\"}\n{\"id\":760,\"first_name\":\"Krysta\",\"last_name\":\"Malzard\",\"email\":\"kmalzardl3@hp.com\",\"job\":\"Administrative Assistant III\",\"timestamp\":\"2022-11-03T17:41:14Z\"}\n{\"id\":761,\"first_name\":\"Aurea\",\"last_name\":\"Povall\",\"email\":\"apovalll4@wikispaces.com\",\"job\":\"Editor\",\"timestamp\":\"2021-12-27T03:17:49Z\"}\n{\"id\":762,\"first_name\":\"Burt\",\"last_name\":\"Phillott\",\"email\":\"bphillottl5@opensource.org\",\"job\":\"Programmer II\",\"timestamp\":\"2022-02-11T00:22:12Z\"}\n{\"id\":763,\"first_name\":\"Elnar\",\"last_name\":\"Smorthit\",\"email\":\"esmorthitl6@timesonline.co.uk\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-01-05T14:47:09Z\"}\n{\"id\":764,\"first_name\":\"Linnell\",\"last_name\":\"Ilyushkin\",\"email\":\"lilyushkinl7@bloglovin.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-01-30T02:30:52Z\"}\n{\"id\":765,\"first_name\":\"Lee\",\"last_name\":\"Waeland\",\"email\":\"lwaelandl8@mapy.cz\",\"job\":\"Accounting Assistant I\",\"timestamp\":\"2022-04-17T05:28:49Z\"}\n{\"id\":766,\"first_name\":\"Clotilda\",\"last_name\":\"Litterick\",\"email\":\"clitterickl9@gmpg.org\",\"job\":\"Research 
Associate\",\"timestamp\":\"2022-10-13T02:50:59Z\"}\n{\"id\":767,\"first_name\":\"Shepherd\",\"last_name\":\"Furmonger\",\"email\":\"sfurmongerla@opensource.org\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-10-24T12:54:10Z\"}\n{\"id\":768,\"first_name\":\"Boycey\",\"last_name\":\"Halversen\",\"email\":\"bhalversenlb@fastcompany.com\",\"job\":\"Editor\",\"timestamp\":\"2021-12-09T13:50:42Z\"}\n{\"id\":769,\"first_name\":\"Lamar\",\"last_name\":\"Dressell\",\"email\":\"ldresselllc@instagram.com\",\"job\":\"Web Developer IV\",\"timestamp\":\"2022-04-27T08:39:18Z\"}\n{\"id\":770,\"first_name\":\"Davita\",\"last_name\":\"Jolin\",\"email\":\"djolinld@is.gd\",\"job\":\"Food Chemist\",\"timestamp\":\"2021-12-21T15:03:35Z\"}\n{\"id\":771,\"first_name\":\"Teddie\",\"last_name\":\"Heinrici\",\"email\":\"theinricile@guardian.co.uk\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-03-14T17:43:34Z\"}\n{\"id\":772,\"first_name\":\"Catherin\",\"last_name\":\"Egle of Germany\",\"email\":\"cegleofgermanylf@yahoo.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-06-13T23:32:57Z\"}\n{\"id\":773,\"first_name\":\"Birgit\",\"last_name\":\"Vasyukhin\",\"email\":\"bvasyukhinlg@freewebs.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-01-31T18:54:27Z\"}\n{\"id\":774,\"first_name\":\"Rory\",\"last_name\":\"Bohman\",\"email\":\"rbohmanlh@goo.gl\",\"job\":\"Actuary\",\"timestamp\":\"2022-01-02T22:02:28Z\"}\n{\"id\":775,\"first_name\":\"Ezechiel\",\"last_name\":\"Bransdon\",\"email\":\"ebransdonli@blogtalkradio.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-03-22T18:51:52Z\"}\n{\"id\":776,\"first_name\":\"Hillie\",\"last_name\":\"Athowe\",\"email\":\"hathowelj@google.pl\",\"job\":\"Sales Representative\",\"timestamp\":\"2021-12-16T07:47:18Z\"}\n{\"id\":777,\"first_name\":\"Edwina\",\"last_name\":\"Verry\",\"email\":\"everrylk@trellian.com\",\"job\":\"Staff Accountant 
IV\",\"timestamp\":\"2022-04-29T04:08:45Z\"}\n{\"id\":778,\"first_name\":\"Alyce\",\"last_name\":\"Pulham\",\"email\":\"apulhamll@samsung.com\",\"job\":\"Health Coach II\",\"timestamp\":\"2022-12-06T01:14:54Z\"}\n{\"id\":779,\"first_name\":\"Estele\",\"last_name\":\"Cullimore\",\"email\":\"ecullimorelm@indiatimes.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-01-21T10:00:05Z\"}\n{\"id\":780,\"first_name\":\"Iver\",\"last_name\":\"Jeannenet\",\"email\":\"ijeannenetln@nifty.com\",\"job\":\"Accountant II\",\"timestamp\":\"2022-11-16T09:38:28Z\"}\n{\"id\":781,\"first_name\":\"Olimpia\",\"last_name\":\"Coulsen\",\"email\":\"ocoulsenlo@xing.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-02-24T09:11:20Z\"}\n{\"id\":782,\"first_name\":\"Noel\",\"last_name\":\"Ludlem\",\"email\":\"nludlemlp@desdev.cn\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-04-25T11:57:38Z\"}\n{\"id\":783,\"first_name\":\"Enoch\",\"last_name\":\"Goddman\",\"email\":\"egoddmanlq@imdb.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2021-12-19T13:34:15Z\"}\n{\"id\":784,\"first_name\":\"Heinrik\",\"last_name\":\"McGee\",\"email\":\"hmcgeelr@marriott.com\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-03-25T07:45:03Z\"}\n{\"id\":785,\"first_name\":\"Rosella\",\"last_name\":\"Arent\",\"email\":\"rarentls@mysql.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-08-27T06:07:40Z\"}\n{\"id\":786,\"first_name\":\"Gerard\",\"last_name\":\"Heathfield\",\"email\":\"gheathfieldlt@bloomberg.com\",\"job\":\"Health Coach II\",\"timestamp\":\"2022-01-11T18:42:53Z\"}\n{\"id\":787,\"first_name\":\"Davie\",\"last_name\":\"Di Biaggi\",\"email\":\"ddibiaggilu@yelp.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-06-05T18:40:22Z\"}\n{\"id\":788,\"first_name\":\"Meredith\",\"last_name\":\"Hatchell\",\"email\":\"mhatchelllv@google.ca\",\"job\":\"Biostatistician 
IV\",\"timestamp\":\"2022-05-26T14:28:05Z\"}\n{\"id\":789,\"first_name\":\"Haven\",\"last_name\":\"Coppeard\",\"email\":\"hcoppeardlw@virginia.edu\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-04-02T23:43:01Z\"}\n{\"id\":790,\"first_name\":\"Marietta\",\"last_name\":\"MacTrustey\",\"email\":\"mmactrusteylx@pinterest.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-03-16T13:38:59Z\"}\n{\"id\":791,\"first_name\":\"Chrisse\",\"last_name\":\"Sargerson\",\"email\":\"csargersonly@dell.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-07-16T21:03:21Z\"}\n{\"id\":792,\"first_name\":\"Barri\",\"last_name\":\"Danilevich\",\"email\":\"bdanilevichlz@dmoz.org\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-11-12T05:56:51Z\"}\n{\"id\":793,\"first_name\":\"Eleanore\",\"last_name\":\"Dallemore\",\"email\":\"edallemorem0@globo.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-09-29T14:34:42Z\"}\n{\"id\":794,\"first_name\":\"Skye\",\"last_name\":\"Southerill\",\"email\":\"ssoutherillm1@weebly.com\",\"job\":\"Database Administrator I\",\"timestamp\":\"2022-06-08T18:20:11Z\"}\n{\"id\":795,\"first_name\":\"Trueman\",\"last_name\":\"Layfield\",\"email\":\"tlayfieldm2@live.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-02-26T21:26:01Z\"}\n{\"id\":796,\"first_name\":\"Nollie\",\"last_name\":\"Allanson\",\"email\":\"nallansonm3@un.org\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-02-26T05:40:49Z\"}\n{\"id\":797,\"first_name\":\"Shay\",\"last_name\":\"Marder\",\"email\":\"smarderm4@chronoengine.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-05-25T12:31:14Z\"}\n{\"id\":798,\"first_name\":\"Jolee\",\"last_name\":\"Danit\",\"email\":\"jdanitm5@princeton.edu\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-08-21T12:50:15Z\"}\n{\"id\":799,\"first_name\":\"Neile\",\"last_name\":\"Pottiphar\",\"email\":\"npottipharm6@nsw.gov.au\",\"job\":\"Research 
Nurse\",\"timestamp\":\"2022-06-30T15:02:03Z\"}\n{\"id\":800,\"first_name\":\"Pen\",\"last_name\":\"Garrattley\",\"email\":\"pgarrattleym7@ucoz.ru\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-12-05T16:06:07Z\"}\n{\"id\":801,\"first_name\":\"Duffie\",\"last_name\":\"Morrow\",\"email\":\"dmorrowm8@weebly.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-10-27T20:27:59Z\"}\n{\"id\":802,\"first_name\":\"Garland\",\"last_name\":\"Dunnet\",\"email\":\"gdunnetm9@microsoft.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-04-06T05:47:47Z\"}\n{\"id\":803,\"first_name\":\"Bianka\",\"last_name\":\"Escott\",\"email\":\"bescottma@netlog.com\",\"job\":\"Statistician II\",\"timestamp\":\"2022-02-05T07:13:30Z\"}\n{\"id\":804,\"first_name\":\"Ebonee\",\"last_name\":\"Bown\",\"email\":\"ebownmb@nasa.gov\",\"job\":\"Paralegal\",\"timestamp\":\"2022-07-03T08:24:49Z\"}\n{\"id\":805,\"first_name\":\"Katherina\",\"last_name\":\"Marciskewski\",\"email\":\"kmarciskewskimc@cdbaby.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-07-01T19:22:29Z\"}\n{\"id\":806,\"first_name\":\"Matti\",\"last_name\":\"Cadwaladr\",\"email\":\"mcadwaladrmd@163.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-05-06T00:33:03Z\"}\n{\"id\":807,\"first_name\":\"Kiel\",\"last_name\":\"Castellet\",\"email\":\"kcastelletme@washingtonpost.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-02-18T09:24:22Z\"}\n{\"id\":808,\"first_name\":\"Lothario\",\"last_name\":\"Gingle\",\"email\":\"lginglemf@seattletimes.com\",\"job\":\"Software Engineer IV\",\"timestamp\":\"2022-10-18T10:17:11Z\"}\n{\"id\":809,\"first_name\":\"Thadeus\",\"last_name\":\"Caine\",\"email\":\"tcainemg@google.co.jp\",\"job\":\"Programmer Analyst I\",\"timestamp\":\"2022-09-25T02:58:47Z\"}\n{\"id\":810,\"first_name\":\"Debor\",\"last_name\":\"Membry\",\"email\":\"dmembrymh@flavors.me\",\"job\":\"GIS Technical 
Architect\",\"timestamp\":\"2022-06-20T09:00:42Z\"}\n{\"id\":811,\"first_name\":\"Bronson\",\"last_name\":\"Grassi\",\"email\":\"bgrassimi@reverbnation.com\",\"job\":\"Accountant II\",\"timestamp\":\"2022-05-24T07:30:50Z\"}\n{\"id\":812,\"first_name\":\"Corey\",\"last_name\":\"Cheley\",\"email\":\"ccheleymj@cafepress.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-08-17T04:44:36Z\"}\n{\"id\":813,\"first_name\":\"Faydra\",\"last_name\":\"Wason\",\"email\":\"fwasonmk@sphinn.com\",\"job\":\"Software Engineer I\",\"timestamp\":\"2022-10-14T11:57:26Z\"}\n{\"id\":814,\"first_name\":\"Lulu\",\"last_name\":\"Kluger\",\"email\":\"lklugerml@google.cn\",\"job\":\"Accounting Assistant IV\",\"timestamp\":\"2022-02-11T06:23:54Z\"}\n{\"id\":815,\"first_name\":\"Micky\",\"last_name\":\"Urch\",\"email\":\"murchmm@yellowbook.com\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-09-12T14:35:45Z\"}\n{\"id\":816,\"first_name\":\"Hinze\",\"last_name\":\"Buglass\",\"email\":\"hbuglassmn@biglobe.ne.jp\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-09-09T07:07:58Z\"}\n{\"id\":817,\"first_name\":\"Bernette\",\"last_name\":\"Wikey\",\"email\":\"bwikeymo@issuu.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-08-06T03:02:20Z\"}\n{\"id\":818,\"first_name\":\"Gav\",\"last_name\":\"Starbucke\",\"email\":\"gstarbuckemp@ox.ac.uk\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-10-24T06:45:21Z\"}\n{\"id\":819,\"first_name\":\"Karleen\",\"last_name\":\"Taffie\",\"email\":\"ktaffiemq@acquirethisname.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-08-26T14:37:04Z\"}\n{\"id\":820,\"first_name\":\"Aldon\",\"last_name\":\"Margerison\",\"email\":\"amargerisonmr@de.vu\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-02-13T18:06:07Z\"}\n{\"id\":821,\"first_name\":\"Gerrie\",\"last_name\":\"O'Lenechan\",\"email\":\"golenechanms@nytimes.com\",\"job\":\"Sales 
Associate\",\"timestamp\":\"2022-09-06T17:38:15Z\"}\n{\"id\":822,\"first_name\":\"Ronny\",\"last_name\":\"Woodage\",\"email\":\"rwoodagemt@cbc.ca\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-01-24T01:06:27Z\"}\n{\"id\":823,\"first_name\":\"Kippie\",\"last_name\":\"Stone\",\"email\":\"kstonemu@nih.gov\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-07-31T21:58:11Z\"}\n{\"id\":824,\"first_name\":\"Alvis\",\"last_name\":\"Cranidge\",\"email\":\"acranidgemv@cmu.edu\",\"job\":\"Database Administrator III\",\"timestamp\":\"2021-12-15T21:10:22Z\"}\n{\"id\":825,\"first_name\":\"Irv\",\"last_name\":\"Mycroft\",\"email\":\"imycroftmw@walmart.com\",\"job\":\"Web Developer II\",\"timestamp\":\"2022-07-23T17:09:26Z\"}\n{\"id\":826,\"first_name\":\"Salome\",\"last_name\":\"McGourty\",\"email\":\"smcgourtymx@techcrunch.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-10-15T12:52:05Z\"}\n{\"id\":827,\"first_name\":\"Querida\",\"last_name\":\"Dall\",\"email\":\"qdallmy@home.pl\",\"job\":\"Librarian\",\"timestamp\":\"2022-04-02T23:34:41Z\"}\n{\"id\":828,\"first_name\":\"Ailee\",\"last_name\":\"Clemmensen\",\"email\":\"aclemmensenmz@webs.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-03-04T22:47:54Z\"}\n{\"id\":829,\"first_name\":\"Merwyn\",\"last_name\":\"MacVaugh\",\"email\":\"mmacvaughn0@msn.com\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-09-16T15:33:45Z\"}\n{\"id\":830,\"first_name\":\"Hilary\",\"last_name\":\"Ostridge\",\"email\":\"hostridgen1@apache.org\",\"job\":\"Librarian\",\"timestamp\":\"2022-07-01T12:51:19Z\"}\n{\"id\":831,\"first_name\":\"Jose\",\"last_name\":\"Willder\",\"email\":\"jwilldern2@hc360.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-06-16T19:41:35Z\"}\n{\"id\":832,\"first_name\":\"Rozalie\",\"last_name\":\"Crowcher\",\"email\":\"rcrowchern3@economist.com\",\"job\":\"Accountant 
IV\",\"timestamp\":\"2022-05-03T15:17:59Z\"}\n{\"id\":833,\"first_name\":\"Heidi\",\"last_name\":\"Tuny\",\"email\":\"htunyn4@timesonline.co.uk\",\"job\":\"Research Associate\",\"timestamp\":\"2022-10-19T14:35:46Z\"}\n{\"id\":834,\"first_name\":\"Inge\",\"last_name\":\"Raun\",\"email\":\"iraunn5@slideshare.net\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-10-26T19:33:59Z\"}\n{\"id\":835,\"first_name\":\"Sibelle\",\"last_name\":\"Cours\",\"email\":\"scoursn6@themeforest.net\",\"job\":\"VP Sales\",\"timestamp\":\"2022-11-13T04:22:18Z\"}\n{\"id\":836,\"first_name\":\"Arden\",\"last_name\":\"Algie\",\"email\":\"aalgien7@photobucket.com\",\"job\":\"Safety Technician III\",\"timestamp\":\"2022-09-17T06:44:05Z\"}\n{\"id\":837,\"first_name\":\"Irvin\",\"last_name\":\"Scroyton\",\"email\":\"iscroytonn8@auda.org.au\",\"job\":\"Editor\",\"timestamp\":\"2021-12-19T13:18:04Z\"}\n{\"id\":838,\"first_name\":\"Waring\",\"last_name\":\"Van Dalen\",\"email\":\"wvandalenn9@addtoany.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-01-13T10:19:40Z\"}\n{\"id\":839,\"first_name\":\"Mata\",\"last_name\":\"McAulay\",\"email\":\"mmcaulayna@ucla.edu\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-08-30T06:04:43Z\"}\n{\"id\":840,\"first_name\":\"Elsa\",\"last_name\":\"Vickery\",\"email\":\"evickerynb@si.edu\",\"job\":\"Operator\",\"timestamp\":\"2022-09-30T00:48:09Z\"}\n{\"id\":841,\"first_name\":\"Hedda\",\"last_name\":\"Erat\",\"email\":\"heratnc@amazon.de\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-06-20T14:36:55Z\"}\n{\"id\":842,\"first_name\":\"Belicia\",\"last_name\":\"Eddow\",\"email\":\"beddownd@cnn.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-01-10T05:33:16Z\"}\n{\"id\":843,\"first_name\":\"Jenn\",\"last_name\":\"Maidstone\",\"email\":\"jmaidstonene@theglobeandmail.com\",\"job\":\"Senior 
Editor\",\"timestamp\":\"2022-03-02T23:23:36Z\"}\n{\"id\":844,\"first_name\":\"Boycie\",\"last_name\":\"Cordes\",\"email\":\"bcordesnf@baidu.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-06-23T03:06:33Z\"}\n{\"id\":845,\"first_name\":\"Sanderson\",\"last_name\":\"Breffitt\",\"email\":\"sbreffittng@aboutads.info\",\"job\":\"Software Engineer II\",\"timestamp\":\"2022-10-05T00:17:08Z\"}\n{\"id\":846,\"first_name\":\"Renell\",\"last_name\":\"Eldred\",\"email\":\"reldrednh@histats.com\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-07-25T23:38:06Z\"}\n{\"id\":847,\"first_name\":\"Flory\",\"last_name\":\"Castagnier\",\"email\":\"fcastagnierni@studiopress.com\",\"job\":\"Safety Technician I\",\"timestamp\":\"2022-10-26T16:38:48Z\"}\n{\"id\":848,\"first_name\":\"Susette\",\"last_name\":\"Runnacles\",\"email\":\"srunnaclesnj@wisc.edu\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-04-17T08:23:24Z\"}\n{\"id\":849,\"first_name\":\"Camila\",\"last_name\":\"Tweedell\",\"email\":\"ctweedellnk@bizjournals.com\",\"job\":\"Administrative Assistant IV\",\"timestamp\":\"2022-02-21T08:17:01Z\"}\n{\"id\":850,\"first_name\":\"Perry\",\"last_name\":\"Obey\",\"email\":\"pobeynl@omniture.com\",\"job\":\"Biostatistician III\",\"timestamp\":\"2022-10-06T13:30:56Z\"}\n{\"id\":851,\"first_name\":\"Ertha\",\"last_name\":\"Elleray\",\"email\":\"eelleraynm@github.io\",\"job\":\"Teacher\",\"timestamp\":\"2022-08-04T23:19:13Z\"}\n{\"id\":852,\"first_name\":\"Wallie\",\"last_name\":\"Hamlett\",\"email\":\"whamlettnn@hexun.com\",\"job\":\"Media Manager III\",\"timestamp\":\"2022-08-30T06:54:56Z\"}\n{\"id\":853,\"first_name\":\"Marchelle\",\"last_name\":\"De la Yglesia\",\"email\":\"mdelayglesiano@nbcnews.com\",\"job\":\"Engineer III\",\"timestamp\":\"2022-03-30T09:10:19Z\"}\n{\"id\":854,\"first_name\":\"Eugen\",\"last_name\":\"Kirk\",\"email\":\"ekirknp@ezinearticles.com\",\"job\":\"Community Outreach 
Specialist\",\"timestamp\":\"2021-12-16T02:14:06Z\"}\n{\"id\":855,\"first_name\":\"Mable\",\"last_name\":\"Bickerton\",\"email\":\"mbickertonnq@zdnet.com\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-01-07T16:19:09Z\"}\n{\"id\":856,\"first_name\":\"Ricki\",\"last_name\":\"Lalevee\",\"email\":\"rlaleveenr@msu.edu\",\"job\":\"VP Sales\",\"timestamp\":\"2022-07-16T03:33:06Z\"}\n{\"id\":857,\"first_name\":\"Karylin\",\"last_name\":\"Allport\",\"email\":\"kallportns@time.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-04-22T13:30:34Z\"}\n{\"id\":858,\"first_name\":\"Sisile\",\"last_name\":\"Burkin\",\"email\":\"sburkinnt@google.com.br\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-08-12T00:46:45Z\"}\n{\"id\":859,\"first_name\":\"Maxi\",\"last_name\":\"Carl\",\"email\":\"mcarlnu@illinois.edu\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-04-07T15:15:14Z\"}\n{\"id\":860,\"first_name\":\"Ediva\",\"last_name\":\"McFarlan\",\"email\":\"emcfarlannv@google.cn\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-05-15T14:40:42Z\"}\n{\"id\":861,\"first_name\":\"Rosco\",\"last_name\":\"Gregoretti\",\"email\":\"rgregorettinw@bravesites.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-07-01T10:19:08Z\"}\n{\"id\":862,\"first_name\":\"Denise\",\"last_name\":\"Trimmell\",\"email\":\"dtrimmellnx@163.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-06-03T15:07:57Z\"}\n{\"id\":863,\"first_name\":\"Penny\",\"last_name\":\"Dahlman\",\"email\":\"pdahlmanny@apache.org\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-05-18T11:27:11Z\"}\n{\"id\":864,\"first_name\":\"Brant\",\"last_name\":\"Billes\",\"email\":\"bbillesnz@kickstarter.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-10-28T10:54:26Z\"}\n{\"id\":865,\"first_name\":\"Lorne\",\"last_name\":\"Stanbridge\",\"email\":\"lstanbridgeo0@vistaprint.com\",\"job\":\"Compensation 
Analyst\",\"timestamp\":\"2022-02-15T09:00:19Z\"}\n{\"id\":866,\"first_name\":\"Rodger\",\"last_name\":\"Vedeniktov\",\"email\":\"rvedeniktovo1@ycombinator.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-08-03T14:49:52Z\"}\n{\"id\":867,\"first_name\":\"Selma\",\"last_name\":\"Twitching\",\"email\":\"stwitchingo2@meetup.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-04-30T22:24:30Z\"}\n{\"id\":868,\"first_name\":\"Templeton\",\"last_name\":\"Yakebovich\",\"email\":\"tyakebovicho3@fc2.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-02-24T01:33:50Z\"}\n{\"id\":869,\"first_name\":\"Reidar\",\"last_name\":\"Dudding\",\"email\":\"rduddingo4@ucoz.com\",\"job\":\"Statistician III\",\"timestamp\":\"2022-01-08T12:26:12Z\"}\n{\"id\":870,\"first_name\":\"Lammond\",\"last_name\":\"Dunnion\",\"email\":\"ldunniono5@miibeian.gov.cn\",\"job\":\"Actuary\",\"timestamp\":\"2022-10-02T01:22:33Z\"}\n{\"id\":871,\"first_name\":\"Pren\",\"last_name\":\"Baraclough\",\"email\":\"pbaraclougho6@artisteer.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-06-27T19:19:01Z\"}\n{\"id\":872,\"first_name\":\"Stacie\",\"last_name\":\"Grunnell\",\"email\":\"sgrunnello7@aol.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-03-13T04:54:57Z\"}\n{\"id\":873,\"first_name\":\"Grata\",\"last_name\":\"Karlsen\",\"email\":\"gkarlseno8@google.com.hk\",\"job\":\"Software Test Engineer II\",\"timestamp\":\"2022-08-08T09:17:55Z\"}\n{\"id\":874,\"first_name\":\"Stirling\",\"last_name\":\"Lohan\",\"email\":\"slohano9@yale.edu\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-02-25T07:55:43Z\"}\n{\"id\":875,\"first_name\":\"Samuele\",\"last_name\":\"Evason\",\"email\":\"sevasonoa@va.gov\",\"job\":\"Software Engineer II\",\"timestamp\":\"2022-03-23T04:00:01Z\"}\n{\"id\":876,\"first_name\":\"Jerome\",\"last_name\":\"Sherlock\",\"email\":\"jsherlockob@4shared.com\",\"job\":\"Community Outreach 
Specialist\",\"timestamp\":\"2022-11-23T09:46:58Z\"}\n{\"id\":877,\"first_name\":\"Iseabal\",\"last_name\":\"Titmuss\",\"email\":\"ititmussoc@pen.io\",\"job\":\"Teacher\",\"timestamp\":\"2022-05-19T11:43:40Z\"}\n{\"id\":878,\"first_name\":\"Farr\",\"last_name\":\"Duignan\",\"email\":\"fduignanod@nbcnews.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-11-16T07:29:52Z\"}\n{\"id\":879,\"first_name\":\"Julita\",\"last_name\":\"Alster\",\"email\":\"jalsteroe@altervista.org\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-07-19T19:16:33Z\"}\n{\"id\":880,\"first_name\":\"Modestia\",\"last_name\":\"Aldrin\",\"email\":\"maldrinof@slideshare.net\",\"job\":\"Media Manager III\",\"timestamp\":\"2022-08-22T15:07:30Z\"}\n{\"id\":881,\"first_name\":\"Elset\",\"last_name\":\"Bilston\",\"email\":\"ebilstonog@gizmodo.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-01-28T19:35:10Z\"}\n{\"id\":882,\"first_name\":\"Fidole\",\"last_name\":\"Deverell\",\"email\":\"fdeverelloh@issuu.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2021-12-29T09:20:43Z\"}\n{\"id\":883,\"first_name\":\"Aloysia\",\"last_name\":\"Napier\",\"email\":\"anapieroi@rediff.com\",\"job\":\"Staff Accountant IV\",\"timestamp\":\"2022-06-07T06:02:41Z\"}\n{\"id\":884,\"first_name\":\"Rustin\",\"last_name\":\"Teliga\",\"email\":\"rteligaoj@chronoengine.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-07-05T08:34:42Z\"}\n{\"id\":885,\"first_name\":\"Debor\",\"last_name\":\"Kester\",\"email\":\"dkesterok@ca.gov\",\"job\":\"Recruiter\",\"timestamp\":\"2022-09-01T03:43:02Z\"}\n{\"id\":886,\"first_name\":\"Morna\",\"last_name\":\"Davidzon\",\"email\":\"mdavidzonol@altervista.org\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-06-23T05:24:01Z\"}\n{\"id\":887,\"first_name\":\"Joan\",\"last_name\":\"Aldcorn\",\"email\":\"jaldcornom@phpbb.com\",\"job\":\"Internal 
Auditor\",\"timestamp\":\"2022-07-06T02:23:55Z\"}\n{\"id\":888,\"first_name\":\"Luciana\",\"last_name\":\"Mousley\",\"email\":\"lmousleyon@ehow.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2021-12-17T12:48:48Z\"}\n{\"id\":889,\"first_name\":\"Laney\",\"last_name\":\"Sharman\",\"email\":\"lsharmanoo@omniture.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2021-12-07T15:02:25Z\"}\n{\"id\":890,\"first_name\":\"Chancey\",\"last_name\":\"Andover\",\"email\":\"candoverop@irs.gov\",\"job\":\"Project Manager\",\"timestamp\":\"2022-08-04T23:53:09Z\"}\n{\"id\":891,\"first_name\":\"Jilly\",\"last_name\":\"Remirez\",\"email\":\"jremirezoq@sakura.ne.jp\",\"job\":\"Programmer Analyst III\",\"timestamp\":\"2021-12-31T17:59:35Z\"}\n{\"id\":892,\"first_name\":\"Darin\",\"last_name\":\"Ivanuschka\",\"email\":\"divanuschkaor@guardian.co.uk\",\"job\":\"Accountant III\",\"timestamp\":\"2022-05-03T05:25:56Z\"}\n{\"id\":893,\"first_name\":\"Griselda\",\"last_name\":\"Cordeau]\",\"email\":\"gcordeauos@illinois.edu\",\"job\":\"Legal Assistant\",\"timestamp\":\"2021-12-16T18:09:52Z\"}\n{\"id\":894,\"first_name\":\"Peta\",\"last_name\":\"Ramsier\",\"email\":\"pramsierot@thetimes.co.uk\",\"job\":\"Nurse\",\"timestamp\":\"2022-09-26T04:28:02Z\"}\n{\"id\":895,\"first_name\":\"Ferrell\",\"last_name\":\"Quinnelly\",\"email\":\"fquinnellyou@technorati.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-02-09T15:26:30Z\"}\n{\"id\":896,\"first_name\":\"Buffy\",\"last_name\":\"Osgodby\",\"email\":\"bosgodbyov@blogger.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-02-26T01:21:35Z\"}\n{\"id\":897,\"first_name\":\"Shannen\",\"last_name\":\"Village\",\"email\":\"svillageow@cloudflare.com\",\"job\":\"Marketing 
Manager\",\"timestamp\":\"2022-07-05T14:31:29Z\"}\n{\"id\":898,\"first_name\":\"Randy\",\"last_name\":\"Wickliffe\",\"email\":\"rwickliffeox@trellian.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-02-11T06:57:45Z\"}\n{\"id\":899,\"first_name\":\"Elayne\",\"last_name\":\"Maurice\",\"email\":\"emauriceoy@economist.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-04-24T10:22:55Z\"}\n{\"id\":900,\"first_name\":\"Erda\",\"last_name\":\"Babonau\",\"email\":\"ebabonauoz@topsy.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-11-28T04:40:49Z\"}\n{\"id\":901,\"first_name\":\"Jehu\",\"last_name\":\"Mullard\",\"email\":\"jmullardp0@chicagotribune.com\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-08-02T04:25:00Z\"}\n{\"id\":902,\"first_name\":\"Chrissie\",\"last_name\":\"Clacey\",\"email\":\"cclaceyp1@domainmarket.com\",\"job\":\"Software Test Engineer IV\",\"timestamp\":\"2022-03-27T12:28:06Z\"}\n{\"id\":903,\"first_name\":\"Michaela\",\"last_name\":\"Streeting\",\"email\":\"mstreetingp2@epa.gov\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-03-08T11:02:27Z\"}\n{\"id\":904,\"first_name\":\"Stearn\",\"last_name\":\"Kiernan\",\"email\":\"skiernanp3@google.pl\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-08-23T15:42:08Z\"}\n{\"id\":905,\"first_name\":\"Cory\",\"last_name\":\"Athowe\",\"email\":\"cathowep4@unblog.fr\",\"job\":\"Actuary\",\"timestamp\":\"2022-09-13T15:53:22Z\"}\n{\"id\":906,\"first_name\":\"Vonni\",\"last_name\":\"Goby\",\"email\":\"vgobyp5@nifty.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2021-12-11T08:40:28Z\"}\n{\"id\":907,\"first_name\":\"Englebert\",\"last_name\":\"Glaister\",\"email\":\"eglaisterp6@independent.co.uk\",\"job\":\"Human Resources 
Manager\",\"timestamp\":\"2022-11-20T00:02:17Z\"}\n{\"id\":908,\"first_name\":\"Barris\",\"last_name\":\"Mosson\",\"email\":\"bmossonp7@statcounter.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-07-15T10:29:25Z\"}\n{\"id\":909,\"first_name\":\"Konstantin\",\"last_name\":\"Furphy\",\"email\":\"kfurphyp8@oakley.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-08-16T13:16:54Z\"}\n{\"id\":910,\"first_name\":\"Loria\",\"last_name\":\"Carratt\",\"email\":\"lcarrattp9@ow.ly\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-01-11T20:48:34Z\"}\n{\"id\":911,\"first_name\":\"Torrey\",\"last_name\":\"Richings\",\"email\":\"trichingspa@indiatimes.com\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-04-28T05:05:01Z\"}\n{\"id\":912,\"first_name\":\"Merissa\",\"last_name\":\"Jorioz\",\"email\":\"mjoriozpb@ebay.com\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-04-12T00:47:48Z\"}\n{\"id\":913,\"first_name\":\"Abraham\",\"last_name\":\"Fairbard\",\"email\":\"afairbardpc@intel.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-01-11T00:48:04Z\"}\n{\"id\":914,\"first_name\":\"Sidnee\",\"last_name\":\"McCreery\",\"email\":\"smccreerypd@forbes.com\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-11-23T23:12:08Z\"}\n{\"id\":915,\"first_name\":\"Elspeth\",\"last_name\":\"Kollatsch\",\"email\":\"ekollatschpe@rediff.com\",\"job\":\"Account Representative I\",\"timestamp\":\"2022-01-20T06:08:10Z\"}\n{\"id\":916,\"first_name\":\"Gherardo\",\"last_name\":\"Waitland\",\"email\":\"gwaitlandpf@businessinsider.com\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-07-04T05:08:04Z\"}\n{\"id\":917,\"first_name\":\"Alick\",\"last_name\":\"Olczak\",\"email\":\"aolczakpg@stumbleupon.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-10-22T14:53:06Z\"}\n{\"id\":918,\"first_name\":\"Eolanda\",\"last_name\":\"Scarfe\",\"email\":\"escarfeph@e-recht24.de\",\"job\":\"Cost 
Accountant\",\"timestamp\":\"2022-05-14T07:33:31Z\"}\n{\"id\":919,\"first_name\":\"Gilberto\",\"last_name\":\"Shatford\",\"email\":\"gshatfordpi@digg.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-09-04T13:51:44Z\"}\n{\"id\":920,\"first_name\":\"Vincent\",\"last_name\":\"Andreopolos\",\"email\":\"vandreopolospj@cbslocal.com\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-07-25T09:43:30Z\"}\n{\"id\":921,\"first_name\":\"Kat\",\"last_name\":\"Gaylard\",\"email\":\"kgaylardpk@nydailynews.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-01-07T13:41:26Z\"}\n{\"id\":922,\"first_name\":\"Kettie\",\"last_name\":\"Downing\",\"email\":\"kdowningpl@odnoklassniki.ru\",\"job\":\"Software Test Engineer IV\",\"timestamp\":\"2022-04-28T01:05:46Z\"}\n{\"id\":923,\"first_name\":\"Dolores\",\"last_name\":\"Ellif\",\"email\":\"dellifpm@craigslist.org\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-05-28T08:11:26Z\"}\n{\"id\":924,\"first_name\":\"Baillie\",\"last_name\":\"Aymerich\",\"email\":\"baymerichpn@posterous.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-08-19T21:42:32Z\"}\n{\"id\":925,\"first_name\":\"Fidelia\",\"last_name\":\"Latour\",\"email\":\"flatourpo@weebly.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-07-09T04:32:59Z\"}\n{\"id\":926,\"first_name\":\"Fraser\",\"last_name\":\"Hinchon\",\"email\":\"fhinchonpp@nydailynews.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-03-07T21:17:55Z\"}\n{\"id\":927,\"first_name\":\"Eryn\",\"last_name\":\"Gosnall\",\"email\":\"egosnallpq@lulu.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-07-24T19:40:39Z\"}\n{\"id\":928,\"first_name\":\"Doria\",\"last_name\":\"Coumbe\",\"email\":\"dcoumbepr@nymag.com\",\"job\":\"Cost 
Accountant\",\"timestamp\":\"2022-01-27T05:05:20Z\"}\n{\"id\":929,\"first_name\":\"Mei\",\"last_name\":\"Cusick\",\"email\":\"mcusickps@gizmodo.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-06-13T19:46:16Z\"}\n{\"id\":930,\"first_name\":\"Hernando\",\"last_name\":\"Prestie\",\"email\":\"hprestiept@disqus.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-03-28T11:48:59Z\"}\n{\"id\":931,\"first_name\":\"Stefanie\",\"last_name\":\"Wonham\",\"email\":\"swonhampu@liveinternet.ru\",\"job\":\"Accountant I\",\"timestamp\":\"2022-01-09T11:17:35Z\"}\n{\"id\":932,\"first_name\":\"Addy\",\"last_name\":\"Kemell\",\"email\":\"akemellpv@sina.com.cn\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-09-15T01:17:27Z\"}\n{\"id\":933,\"first_name\":\"Delainey\",\"last_name\":\"Laver\",\"email\":\"dlaverpw@usa.gov\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-07-13T19:59:51Z\"}\n{\"id\":934,\"first_name\":\"Ewart\",\"last_name\":\"Doe\",\"email\":\"edoepx@zdnet.com\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-07-10T20:15:41Z\"}\n{\"id\":935,\"first_name\":\"Gabriela\",\"last_name\":\"Marmyon\",\"email\":\"gmarmyonpy@blinklist.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-06-06T14:42:33Z\"}\n{\"id\":936,\"first_name\":\"Amabelle\",\"last_name\":\"Vassie\",\"email\":\"avassiepz@sitemeter.com\",\"job\":\"Statistician IV\",\"timestamp\":\"2022-10-17T15:35:37Z\"}\n{\"id\":937,\"first_name\":\"Haley\",\"last_name\":\"Paddon\",\"email\":\"hpaddonq0@google.ru\",\"job\":\"Engineer II\",\"timestamp\":\"2022-01-24T08:25:00Z\"}\n{\"id\":938,\"first_name\":\"Kurt\",\"last_name\":\"Sandaver\",\"email\":\"ksandaverq1@bluehost.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-06-29T14:55:21Z\"}\n{\"id\":939,\"first_name\":\"Almire\",\"last_name\":\"Wearne\",\"email\":\"awearneq2@tmall.com\",\"job\":\"General 
Manager\",\"timestamp\":\"2022-01-08T01:56:40Z\"}\n{\"id\":940,\"first_name\":\"Norina\",\"last_name\":\"Pacey\",\"email\":\"npaceyq3@cyberchimps.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-07-26T04:44:14Z\"}\n{\"id\":941,\"first_name\":\"Irwin\",\"last_name\":\"Barrett\",\"email\":\"ibarrettq4@icio.us\",\"job\":\"Editor\",\"timestamp\":\"2022-06-06T22:52:44Z\"}\n{\"id\":942,\"first_name\":\"Cornie\",\"last_name\":\"Pasquale\",\"email\":\"cpasqualeq5@xing.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-05-26T04:39:40Z\"}\n{\"id\":943,\"first_name\":\"Heda\",\"last_name\":\"Behling\",\"email\":\"hbehlingq6@noaa.gov\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-08-31T20:17:15Z\"}\n{\"id\":944,\"first_name\":\"Cariotta\",\"last_name\":\"Luberti\",\"email\":\"clubertiq7@sphinn.com\",\"job\":\"Accounting Assistant II\",\"timestamp\":\"2022-06-22T05:02:39Z\"}\n{\"id\":945,\"first_name\":\"Saraann\",\"last_name\":\"Clew\",\"email\":\"sclewq8@geocities.jp\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-10-09T03:37:41Z\"}\n{\"id\":946,\"first_name\":\"Reynold\",\"last_name\":\"Lean\",\"email\":\"rleanq9@facebook.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-05-01T01:08:01Z\"}\n{\"id\":947,\"first_name\":\"Dorree\",\"last_name\":\"McKevin\",\"email\":\"dmckevinqa@odnoklassniki.ru\",\"job\":\"Budget/Accounting Analyst IV\",\"timestamp\":\"2022-07-06T20:09:43Z\"}\n{\"id\":948,\"first_name\":\"Redford\",\"last_name\":\"Mancell\",\"email\":\"rmancellqb@techcrunch.com\",\"job\":\"Research Assistant I\",\"timestamp\":\"2022-07-08T21:45:19Z\"}\n{\"id\":949,\"first_name\":\"Ricky\",\"last_name\":\"Gilstoun\",\"email\":\"rgilstounqc@boston.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-03-12T08:04:21Z\"}\n{\"id\":950,\"first_name\":\"Jessamyn\",\"last_name\":\"Canlin\",\"email\":\"jcanlinqd@jimdo.com\",\"job\":\"Structural Analysis 
Engineer\",\"timestamp\":\"2022-08-17T17:20:42Z\"}\n{\"id\":951,\"first_name\":\"Donaugh\",\"last_name\":\"Goodson\",\"email\":\"dgoodsonqe@reference.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-03-10T10:29:27Z\"}\n{\"id\":952,\"first_name\":\"Yehudi\",\"last_name\":\"Truggian\",\"email\":\"ytruggianqf@fc2.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-05-25T21:02:33Z\"}\n{\"id\":953,\"first_name\":\"Alister\",\"last_name\":\"Drust\",\"email\":\"adrustqg@techcrunch.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-01-04T10:05:09Z\"}\n{\"id\":954,\"first_name\":\"Cosette\",\"last_name\":\"Fawdrie\",\"email\":\"cfawdrieqh@statcounter.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-03-27T19:08:56Z\"}\n{\"id\":955,\"first_name\":\"Jayne\",\"last_name\":\"Crosio\",\"email\":\"jcrosioqi@webs.com\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2021-12-28T22:15:22Z\"}\n{\"id\":956,\"first_name\":\"Sawyere\",\"last_name\":\"Brompton\",\"email\":\"sbromptonqj@imdb.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-05-02T22:03:03Z\"}\n{\"id\":957,\"first_name\":\"Timmie\",\"last_name\":\"Farrow\",\"email\":\"tfarrowqk@hexun.com\",\"job\":\"Safety Technician II\",\"timestamp\":\"2022-07-31T05:14:18Z\"}\n{\"id\":958,\"first_name\":\"Courtney\",\"last_name\":\"Gleave\",\"email\":\"cgleaveql@squidoo.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-10-01T22:12:55Z\"}\n{\"id\":959,\"first_name\":\"Justis\",\"last_name\":\"Mauditt\",\"email\":\"jmaudittqm@wikispaces.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-07-03T02:48:32Z\"}\n{\"id\":960,\"first_name\":\"Ambrosius\",\"last_name\":\"Taffs\",\"email\":\"ataffsqn@house.gov\",\"job\":\"Geologist II\",\"timestamp\":\"2022-10-19T20:32:22Z\"}\n{\"id\":961,\"first_name\":\"Pren\",\"last_name\":\"Bountiff\",\"email\":\"pbountiffqo@redcross.org\",\"job\":\"Electrical 
Engineer\",\"timestamp\":\"2022-07-28T14:25:59Z\"}\n{\"id\":962,\"first_name\":\"Zack\",\"last_name\":\"Kubal\",\"email\":\"zkubalqp@phoca.cz\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-03-20T13:57:34Z\"}\n{\"id\":963,\"first_name\":\"Jaquenetta\",\"last_name\":\"MacGilfoyle\",\"email\":\"jmacgilfoyleqq@weebly.com\",\"job\":\"Web Developer I\",\"timestamp\":\"2022-10-17T19:19:52Z\"}\n{\"id\":964,\"first_name\":\"Sarajane\",\"last_name\":\"Kampshell\",\"email\":\"skampshellqr@examiner.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-08-19T15:18:34Z\"}\n{\"id\":965,\"first_name\":\"Zorine\",\"last_name\":\"Franc\",\"email\":\"zfrancqs@slideshare.net\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-11-01T20:57:31Z\"}\n{\"id\":966,\"first_name\":\"Milissent\",\"last_name\":\"Tristram\",\"email\":\"mtristramqt@va.gov\",\"job\":\"Sales Representative\",\"timestamp\":\"2021-12-22T08:28:56Z\"}\n{\"id\":967,\"first_name\":\"Finley\",\"last_name\":\"Hughf\",\"email\":\"fhughfqu@cbsnews.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2021-12-26T09:12:05Z\"}\n{\"id\":968,\"first_name\":\"Fionnula\",\"last_name\":\"McSporrin\",\"email\":\"fmcsporrinqv@senate.gov\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-10-03T10:44:53Z\"}\n{\"id\":969,\"first_name\":\"Marcelline\",\"last_name\":\"Hartington\",\"email\":\"mhartingtonqw@unblog.fr\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-10-11T20:39:44Z\"}\n{\"id\":970,\"first_name\":\"Maurizio\",\"last_name\":\"MacBean\",\"email\":\"mmacbeanqx@hhs.gov\",\"job\":\"Operator\",\"timestamp\":\"2022-08-04T13:28:13Z\"}\n{\"id\":971,\"first_name\":\"Jeannie\",\"last_name\":\"Muzzall\",\"email\":\"jmuzzallqy@bbc.co.uk\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-08-23T13:31:39Z\"}\n{\"id\":972,\"first_name\":\"Fredia\",\"last_name\":\"Hitchens\",\"email\":\"fhitchensqz@chronoengine.com\",\"job\":\"General 
Manager\",\"timestamp\":\"2022-01-01T15:22:41Z\"}\n{\"id\":973,\"first_name\":\"Karim\",\"last_name\":\"Fossitt\",\"email\":\"kfossittr0@csmonitor.com\",\"job\":\"Engineer IV\",\"timestamp\":\"2022-03-05T15:37:38Z\"}\n{\"id\":974,\"first_name\":\"Heindrick\",\"last_name\":\"Bird\",\"email\":\"hbirdr1@ehow.com\",\"job\":\"Computer Systems Analyst II\",\"timestamp\":\"2022-03-13T18:22:09Z\"}\n{\"id\":975,\"first_name\":\"Carolann\",\"last_name\":\"Dunphy\",\"email\":\"cdunphyr2@mozilla.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-04-23T23:07:14Z\"}\n{\"id\":976,\"first_name\":\"Herman\",\"last_name\":\"Ciubutaro\",\"email\":\"hciubutaror3@alibaba.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-03-08T21:50:13Z\"}\n{\"id\":977,\"first_name\":\"Konrad\",\"last_name\":\"Gregon\",\"email\":\"kgregonr4@npr.org\",\"job\":\"Research Assistant IV\",\"timestamp\":\"2022-11-24T16:44:53Z\"}\n{\"id\":978,\"first_name\":\"Sansone\",\"last_name\":\"O'Regan\",\"email\":\"soreganr5@wordpress.org\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-07-26T04:39:39Z\"}\n{\"id\":979,\"first_name\":\"Edi\",\"last_name\":\"Shevelin\",\"email\":\"eshevelinr6@reverbnation.com\",\"job\":\"Programmer Analyst III\",\"timestamp\":\"2022-02-21T21:29:55Z\"}\n{\"id\":980,\"first_name\":\"Putnem\",\"last_name\":\"Muldoon\",\"email\":\"pmuldoonr7@webs.com\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-11-17T10:38:54Z\"}\n{\"id\":981,\"first_name\":\"Clair\",\"last_name\":\"Durtnell\",\"email\":\"cdurtnellr8@theglobeandmail.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-09-03T05:07:03Z\"}\n{\"id\":982,\"first_name\":\"Mellisa\",\"last_name\":\"Stillmann\",\"email\":\"mstillmannr9@yelp.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-06-01T10:13:00Z\"}\n{\"id\":983,\"first_name\":\"Alyce\",\"last_name\":\"Caron\",\"email\":\"acaronra@mysql.com\",\"job\":\"Marketing 
Assistant\",\"timestamp\":\"2022-03-08T11:48:23Z\"}\n{\"id\":984,\"first_name\":\"Elnora\",\"last_name\":\"Perell\",\"email\":\"eperellrb@com.com\",\"job\":\"Database Administrator II\",\"timestamp\":\"2022-01-23T07:00:30Z\"}\n{\"id\":985,\"first_name\":\"Ximenez\",\"last_name\":\"Soppit\",\"email\":\"xsoppitrc@marriott.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-09-04T01:28:57Z\"}\n{\"id\":986,\"first_name\":\"Wallie\",\"last_name\":\"DeSousa\",\"email\":\"wdesousard@nps.gov\",\"job\":\"VP Sales\",\"timestamp\":\"2022-01-13T10:38:52Z\"}\n{\"id\":987,\"first_name\":\"Ruddy\",\"last_name\":\"Michel\",\"email\":\"rmichelre@gravatar.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-02-06T04:05:07Z\"}\n{\"id\":988,\"first_name\":\"Mariel\",\"last_name\":\"Gooderick\",\"email\":\"mgooderickrf@joomla.org\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-04-07T06:43:49Z\"}\n{\"id\":989,\"first_name\":\"Adria\",\"last_name\":\"Kinkaid\",\"email\":\"akinkaidrg@slate.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-09-28T19:22:58Z\"}\n{\"id\":990,\"first_name\":\"Ashley\",\"last_name\":\"Easey\",\"email\":\"aeaseyrh@themeforest.net\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-03-05T02:20:09Z\"}\n{\"id\":991,\"first_name\":\"Mikkel\",\"last_name\":\"Greiswood\",\"email\":\"mgreiswoodri@hao123.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-12-04T12:31:00Z\"}\n{\"id\":992,\"first_name\":\"Nissy\",\"last_name\":\"Titmuss\",\"email\":\"ntitmussrj@si.edu\",\"job\":\"Software Engineer IV\",\"timestamp\":\"2022-02-18T04:52:53Z\"}\n{\"id\":993,\"first_name\":\"Maddi\",\"last_name\":\"Pimmocke\",\"email\":\"mpimmockerk@canalblog.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-02-11T08:34:43Z\"}\n{\"id\":994,\"first_name\":\"Rossy\",\"last_name\":\"Draco\",\"email\":\"rdracorl@goodreads.com\",\"job\":\"Marketing 
Assistant\",\"timestamp\":\"2022-07-02T04:23:06Z\"}\n{\"id\":995,\"first_name\":\"Travus\",\"last_name\":\"Babber\",\"email\":\"tbabberrm@shareasale.com\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-02-18T19:54:19Z\"}\n{\"id\":996,\"first_name\":\"Clayton\",\"last_name\":\"Nancarrow\",\"email\":\"cnancarrowrn@hao123.com\",\"job\":\"Web Developer III\",\"timestamp\":\"2022-05-05T22:18:27Z\"}\n{\"id\":997,\"first_name\":\"Cami\",\"last_name\":\"Jimmes\",\"email\":\"cjimmesro@webeden.co.uk\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-08-23T18:13:14Z\"}\n{\"id\":998,\"first_name\":\"Eirena\",\"last_name\":\"Darling\",\"email\":\"edarlingrp@altervista.org\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-07-15T22:39:00Z\"}\n{\"id\":999,\"first_name\":\"Anne-marie\",\"last_name\":\"Dober\",\"email\":\"adoberrq@nyu.edu\",\"job\":\"Software Engineer I\",\"timestamp\":\"2022-10-04T07:37:09Z\"}\n{\"id\":1000,\"first_name\":\"Calla\",\"last_name\":\"Handrock\",\"email\":\"chandrockrr@seesaa.net\",\"job\":\"Systems Administrator II\",\"timestamp\":\"2022-11-18T22:33:30Z\"}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/benches/data/bench_data_heavy_transform.json",
    "content": "{ \"body\": \"{\\\"id\\\":1,\\\"first_name\\\":\\\"Darcey\\\",\\\"email\\\":\\\"dzammett0@gizmodo.com\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":2,\\\"first_name\\\":\\\"Wilmette\\\",\\\"email\\\":\\\"wvsanelli1@yellowpages.com\\\",\\\"job\\\":\\\"Web Designer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":3,\\\"first_name\\\":\\\"Inez\\\",\\\"email\\\":\\\"igirardet2@vkontakte.ru\\\",\\\"job\\\":\\\"Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":4,\\\"first_name\\\":\\\"Nickie\\\",\\\"email\\\":\\\"nranyell3@vistaprint.com\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":5,\\\"first_name\\\":\\\"Shanon\\\",\\\"email\\\":\\\"spritchett4@buzzfeed.com\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":6,\\\"first_name\\\":\\\"Warren\\\",\\\"email\\\":\\\"wpicknett5@oaic.gov.au\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":7,\\\"first_name\\\":\\\"Nedda\\\",\\\"email\\\":\\\"nstoad6@geocities.com\\\",\\\"job\\\":\\\"Assistant Media Planner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":8,\\\"first_name\\\":\\\"Devonne\\\",\\\"email\\\":\\\"dbrisse7@cdc.gov\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":9,\\\"first_name\\\":\\\"Cassondra\\\",\\\"email\\\":\\\"cbackshall8@senate.gov\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":10,\\\"first_name\\\":\\\"Maurise\\\",\\\"email\\\":\\\"mciottoi9@vistaprint.com\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":11,\\\"first_name\\\":\\\"Alida\\\",\\\"email\\\":\\\"alathwella@godaddy.com\\\",\\\"job\\\":\\\"Human Resources Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":12,\\\"first_name\\\":\\\"Lynna\\\",\\\"email\\\":\\\"lbulstrodeb@businesswire.com\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":13,\\\"first_name\\\":\\\"Bordy\\\",\\\"email\\\":\\\"bwethersc@weebly.com\\\",\\\"job\\\":\\\"Financial 
Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":14,\\\"first_name\\\":\\\"Jilly\\\",\\\"email\\\":\\\"jscanesd@dagondesign.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":15,\\\"first_name\\\":\\\"Benedicto\\\",\\\"email\\\":\\\"bglantone@europa.eu\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":16,\\\"first_name\\\":\\\"Hedda\\\",\\\"email\\\":\\\"hcaddingf@angelfire.com\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":17,\\\"first_name\\\":\\\"Tammara\\\",\\\"email\\\":\\\"tgrigoriog@ycombinator.com\\\",\\\"job\\\":\\\"Product Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":18,\\\"first_name\\\":\\\"Lindsey\\\",\\\"email\\\":\\\"lgiraldezh@pen.io\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":19,\\\"first_name\\\":\\\"Putnam\\\",\\\"email\\\":\\\"pdunnetti@hhs.gov\\\",\\\"job\\\":\\\"Geologist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":20,\\\"first_name\\\":\\\"Dennie\\\",\\\"email\\\":\\\"dmcilvorayj@auda.org.au\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":21,\\\"first_name\\\":\\\"Ilene\\\",\\\"email\\\":\\\"iheighok@friendfeed.com\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":22,\\\"first_name\\\":\\\"Orville\\\",\\\"email\\\":\\\"olanahanl@purevolume.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":23,\\\"first_name\\\":\\\"Marcella\\\",\\\"email\\\":\\\"mfavellem@foxnews.com\\\",\\\"job\\\":\\\"Analyst Programmer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":24,\\\"first_name\\\":\\\"Liliane\\\",\\\"email\\\":\\\"lsommervillen@goo.ne.jp\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":25,\\\"first_name\\\":\\\"Ruperta\\\",\\\"email\\\":\\\"rbrightwello@webnode.com\\\",\\\"job\\\":\\\"Geologist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":26,\\\"first_name\\\":\\\"Dwight\\\",\\\"email\\\":\\\"dcraigmilep@blinklist.com\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":27,\\\"first_name\\\":\\\"Lory\\\",\\\"email\\\":\\\"ltemperleyq@imdb.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":28,\\\"first_name\\\":\\\"Abelard\\\",\\\"email\\\":\\\"amaseresr@pcworld.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":29,\\\"first_name\\\":\\\"Minetta\\\",\\\"email\\\":\\\"mcobleys@squarespace.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":30,\\\"first_name\\\":\\\"Caesar\\\",\\\"email\\\":\\\"cshadboltt@imdb.com\\\",\\\"job\\\":\\\"Account Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":31,\\\"first_name\\\":\\\"Patti\\\",\\\"email\\\":\\\"pperonu@bravesites.com\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":32,\\\"first_name\\\":\\\"Louisa\\\",\\\"email\\\":\\\"lpynerv@hubpages.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":33,\\\"first_name\\\":\\\"Cordi\\\",\\\"email\\\":\\\"cpetrowskyw@privacy.gov.au\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":34,\\\"first_name\\\":\\\"Meir\\\",\\\"email\\\":\\\"mearthfieldx@biglobe.ne.jp\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":35,\\\"first_name\\\":\\\"Lark\\\",\\\"email\\\":\\\"lcasay@craigslist.org\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":36,\\\"first_name\\\":\\\"Sayer\\\",\\\"email\\\":\\\"scrummyz@answers.com\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":37,\\\"first_name\\\":\\\"Alec\\\",\\\"email\\\":\\\"alahive10@ow.ly\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":38,\\\"first_name\\\":\\\"Helyn\\\",\\\"email\\\":\\\"hcarbry11@aol.com\\\",\\\"job\\\":\\\"Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":39,\\\"first_name\\\":\\\"Ansley\\\",\\\"email\\\":\\\"abartolozzi12@wikia.com\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":40,\\\"first_name\\\":\\\"Lucretia\\\",\\\"email\\\":\\\"lalbertson13@unc.edu\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":41,\\\"first_name\\\":\\\"Guthrie\\\",\\\"email\\\":\\\"gpencost14@amazon.co.uk\\\",\\\"job\\\":\\\"Business Systems Development Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":42,\\\"first_name\\\":\\\"Misty\\\",\\\"email\\\":\\\"mmulberry15@fotki.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":43,\\\"first_name\\\":\\\"Dante\\\",\\\"email\\\":\\\"dbellringer16@amazon.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":44,\\\"first_name\\\":\\\"Josefa\\\",\\\"email\\\":\\\"jkinane17@pinterest.com\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":45,\\\"first_name\\\":\\\"Mathilda\\\",\\\"email\\\":\\\"mgoldin18@msu.edu\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":46,\\\"first_name\\\":\\\"Dinny\\\",\\\"email\\\":\\\"dbirdwhistell19@pinterest.com\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": \"{\\\"id\\\":47,\\\"first_name\\\":\\\"Sig\\\",\\\"email\\\":\\\"srabl1a@soup.io\\\",\\\"job\\\":\\\"Legal Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":48,\\\"first_name\\\":\\\"Aggie\\\",\\\"email\\\":\\\"awychard1b@sitemeter.com\\\",\\\"job\\\":\\\"Business Systems Development Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":49,\\\"first_name\\\":\\\"Isadora\\\",\\\"email\\\":\\\"itaplow1c@issuu.com\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":50,\\\"first_name\\\":\\\"Celine\\\",\\\"email\\\":\\\"cbruneton1d@cbslocal.com\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":51,\\\"first_name\\\":\\\"Orelia\\\",\\\"email\\\":\\\"ozavattiero1e@delicious.com\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":52,\\\"first_name\\\":\\\"Orson\\\",\\\"email\\\":\\\"oarp1f@hhs.gov\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":53,\\\"first_name\\\":\\\"Cathyleen\\\",\\\"email\\\":\\\"cmcgannon1g@lycos.com\\\",\\\"job\\\":\\\"Software Engineer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":54,\\\"first_name\\\":\\\"Tabitha\\\",\\\"email\\\":\\\"teich1h@comsenz.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":55,\\\"first_name\\\":\\\"Fabe\\\",\\\"email\\\":\\\"fnewband1i@cnet.com\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":56,\\\"first_name\\\":\\\"Liesa\\\",\\\"email\\\":\\\"lkingsbury1j@hp.com\\\",\\\"job\\\":\\\"Web Developer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":57,\\\"first_name\\\":\\\"Rochette\\\",\\\"email\\\":\\\"rbenedetti1k@mysql.com\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":58,\\\"first_name\\\":\\\"Andonis\\\",\\\"email\\\":\\\"alydon1l@wp.com\\\",\\\"job\\\":\\\"Statistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":59,\\\"first_name\\\":\\\"Philis\\\",\\\"email\\\":\\\"pbaldick1m@about.com\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":60,\\\"first_name\\\":\\\"Arleen\\\",\\\"email\\\":\\\"alongmore1n@wikimedia.org\\\",\\\"job\\\":\\\"Biostatistician II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":61,\\\"first_name\\\":\\\"Hastie\\\",\\\"email\\\":\\\"htitterell1o@examiner.com\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":62,\\\"first_name\\\":\\\"Willow\\\",\\\"email\\\":\\\"wfillon1p@msu.edu\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":63,\\\"first_name\\\":\\\"Babara\\\",\\\"email\\\":\\\"bwaycot1q@opera.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":64,\\\"first_name\\\":\\\"Ibby\\\",\\\"email\\\":\\\"ihansbury1r@buzzfeed.com\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":65,\\\"first_name\\\":\\\"Rhodie\\\",\\\"email\\\":\\\"rganforth1s@angelfire.com\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":66,\\\"first_name\\\":\\\"Dorice\\\",\\\"email\\\":\\\"disack1t@example.com\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":67,\\\"first_name\\\":\\\"Rossy\\\",\\\"email\\\":\\\"rbeadle1u@nsw.gov.au\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":68,\\\"first_name\\\":\\\"Helena\\\",\\\"email\\\":\\\"hmennell1v@shinystat.com\\\",\\\"job\\\":\\\"Analyst Programmer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":69,\\\"first_name\\\":\\\"Tremayne\\\",\\\"email\\\":\\\"trosenblad1w@technorati.com\\\",\\\"job\\\":\\\"Biostatistician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":70,\\\"first_name\\\":\\\"Case\\\",\\\"email\\\":\\\"cbranston1x@fc2.com\\\",\\\"job\\\":\\\"Environmental Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":71,\\\"first_name\\\":\\\"Misti\\\",\\\"email\\\":\\\"mwiddop1y@columbia.edu\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":72,\\\"first_name\\\":\\\"Constancia\\\",\\\"email\\\":\\\"cedwinson1z@bandcamp.com\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":73,\\\"first_name\\\":\\\"John\\\",\\\"email\\\":\\\"jprobart20@google.es\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":74,\\\"first_name\\\":\\\"Ruddie\\\",\\\"email\\\":\\\"rfelton21@pagesperso-orange.fr\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":75,\\\"first_name\\\":\\\"Rasia\\\",\\\"email\\\":\\\"rlawland22@tumblr.com\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":76,\\\"first_name\\\":\\\"Ara\\\",\\\"email\\\":\\\"astatersfield23@eepurl.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":77,\\\"first_name\\\":\\\"Aurlie\\\",\\\"email\\\":\\\"abispo24@ycombinator.com\\\",\\\"job\\\":\\\"Safety Technician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":78,\\\"first_name\\\":\\\"Marsh\\\",\\\"email\\\":\\\"mniven25@dailymotion.com\\\",\\\"job\\\":\\\"Analyst Programmer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":79,\\\"first_name\\\":\\\"Lee\\\",\\\"email\\\":\\\"lsmalls26@alibaba.com\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":80,\\\"first_name\\\":\\\"Eloisa\\\",\\\"email\\\":\\\"estanney27@ovh.net\\\",\\\"job\\\":\\\"Staff Accountant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":81,\\\"first_name\\\":\\\"Michale\\\",\\\"email\\\":\\\"mdurston28@yellowbook.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":82,\\\"first_name\\\":\\\"Idette\\\",\\\"email\\\":\\\"ibenedikt29@themeforest.net\\\",\\\"job\\\":\\\"Statistician IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":83,\\\"first_name\\\":\\\"Rhianna\\\",\\\"email\\\":\\\"rbodimeade2a@google.ru\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":84,\\\"first_name\\\":\\\"Lydie\\\",\\\"email\\\":\\\"lesherwood2b@fotki.com\\\",\\\"job\\\":\\\"Developer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":85,\\\"first_name\\\":\\\"Jack\\\",\\\"email\\\":\\\"jsiddon2c@cam.ac.uk\\\",\\\"job\\\":\\\"Payment Adjustment Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":86,\\\"first_name\\\":\\\"Augie\\\",\\\"email\\\":\\\"asiggin2d@webmd.com\\\",\\\"job\\\":\\\"Automation Specialist I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":87,\\\"first_name\\\":\\\"Verina\\\",\\\"email\\\":\\\"vhurry2e@miibeian.gov.cn\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":88,\\\"first_name\\\":\\\"Layton\\\",\\\"email\\\":\\\"lvasilechko2f@shutterfly.com\\\",\\\"job\\\":\\\"Media Manager I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":89,\\\"first_name\\\":\\\"Joana\\\",\\\"email\\\":\\\"jpinilla2g@blinklist.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":90,\\\"first_name\\\":\\\"Astra\\\",\\\"email\\\":\\\"aesom2h@google.it\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":91,\\\"first_name\\\":\\\"Cassandry\\\",\\\"email\\\":\\\"cjerrolt2i@tumblr.com\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":92,\\\"first_name\\\":\\\"Dedie\\\",\\\"email\\\":\\\"dleprovest2j@chron.com\\\",\\\"job\\\":\\\"Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":93,\\\"first_name\\\":\\\"Alleen\\\",\\\"email\\\":\\\"aickovitz2k@sciencedaily.com\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":94,\\\"first_name\\\":\\\"Lorilee\\\",\\\"email\\\":\\\"lborlease2l@1und1.de\\\",\\\"job\\\":\\\"Administrative Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":95,\\\"first_name\\\":\\\"Rem\\\",\\\"email\\\":\\\"rwerlock2m@shareasale.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":96,\\\"first_name\\\":\\\"Cathe\\\",\\\"email\\\":\\\"clevecque2n@engadget.com\\\",\\\"job\\\":\\\"Office Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":97,\\\"first_name\\\":\\\"Kelsey\\\",\\\"email\\\":\\\"kpatershall2o@scientificamerican.com\\\",\\\"job\\\":\\\"Recruiting Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":98,\\\"first_name\\\":\\\"Percy\\\",\\\"email\\\":\\\"pjery2p@mac.com\\\",\\\"job\\\":\\\"Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":99,\\\"first_name\\\":\\\"Cathee\\\",\\\"email\\\":\\\"csconce2q@blinklist.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":100,\\\"first_name\\\":\\\"Michaeline\\\",\\\"email\\\":\\\"mcuel2r@ted.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":101,\\\"first_name\\\":\\\"Prescott\\\",\\\"email\\\":\\\"plivingstone2s@ask.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":102,\\\"first_name\\\":\\\"Broddy\\\",\\\"email\\\":\\\"bgiacopazzi2t@goo.ne.jp\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":103,\\\"first_name\\\":\\\"Errol\\\",\\\"email\\\":\\\"ecasino2u@surveymonkey.com\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":104,\\\"first_name\\\":\\\"Correy\\\",\\\"email\\\":\\\"cchamberlin2v@creativecommons.org\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":105,\\\"first_name\\\":\\\"Randall\\\",\\\"email\\\":\\\"rrenshell2w@seattletimes.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":106,\\\"first_name\\\":\\\"Darbie\\\",\\\"email\\\":\\\"dchantillon2x@tamu.edu\\\",\\\"job\\\":\\\"Payment Adjustment Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":107,\\\"first_name\\\":\\\"Benny\\\",\\\"email\\\":\\\"bpeert2y@arstechnica.com\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":108,\\\"first_name\\\":\\\"Leigh\\\",\\\"email\\\":\\\"lalchin2z@oaic.gov.au\\\",\\\"job\\\":\\\"Human Resources Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":109,\\\"first_name\\\":\\\"Demetri\\\",\\\"email\\\":\\\"dobin30@blog.com\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":110,\\\"first_name\\\":\\\"Gilberto\\\",\\\"email\\\":\\\"glewsie31@nps.gov\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":111,\\\"first_name\\\":\\\"Ruthi\\\",\\\"email\\\":\\\"rmacconachy32@yelp.com\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":112,\\\"first_name\\\":\\\"Gard\\\",\\\"email\\\":\\\"glancley33@comcast.net\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":113,\\\"first_name\\\":\\\"Storm\\\",\\\"email\\\":\\\"sdufray34@drupal.org\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":114,\\\"first_name\\\":\\\"Jillian\\\",\\\"email\\\":\\\"jgegg35@eepurl.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":115,\\\"first_name\\\":\\\"Valentina\\\",\\\"email\\\":\\\"vthorlby36@macromedia.com\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":116,\\\"first_name\\\":\\\"Paxon\\\",\\\"email\\\":\\\"pscrewton37@odnoklassniki.ru\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":117,\\\"first_name\\\":\\\"Cariotta\\\",\\\"email\\\":\\\"cbrik38@t.co\\\",\\\"job\\\":\\\"Quality 
Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":118,\\\"first_name\\\":\\\"Franchot\\\",\\\"email\\\":\\\"fgrzelczyk39@xrea.com\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":119,\\\"first_name\\\":\\\"Skip\\\",\\\"email\\\":\\\"shathaway3a@e-recht24.de\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":120,\\\"first_name\\\":\\\"Tripp\\\",\\\"email\\\":\\\"ttrippitt3b@rambler.ru\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":121,\\\"first_name\\\":\\\"Etienne\\\",\\\"email\\\":\\\"ecoldrick3c@huffingtonpost.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":122,\\\"first_name\\\":\\\"Adara\\\",\\\"email\\\":\\\"agurnett3d@lycos.com\\\",\\\"job\\\":\\\"Accounting Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":123,\\\"first_name\\\":\\\"Spence\\\",\\\"email\\\":\\\"sions3e@nifty.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":124,\\\"first_name\\\":\\\"Hadrian\\\",\\\"email\\\":\\\"hemlin3f@eventbrite.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":125,\\\"first_name\\\":\\\"Dulci\\\",\\\"email\\\":\\\"dletham3g@com.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":126,\\\"first_name\\\":\\\"Nolana\\\",\\\"email\\\":\\\"nwelham3h@weebly.com\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":127,\\\"first_name\\\":\\\"Barnard\\\",\\\"email\\\":\\\"bwaplinton3i@bbc.co.uk\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":128,\\\"first_name\\\":\\\"Cati\\\",\\\"email\\\":\\\"cnorthbridge3j@ebay.com\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":129,\\\"first_name\\\":\\\"Elle\\\",\\\"email\\\":\\\"elester3k@mozilla.com\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":130,\\\"first_name\\\":\\\"Dareen\\\",\\\"email\\\":\\\"dpossel3l@hc360.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":131,\\\"first_name\\\":\\\"Bertram\\\",\\\"email\\\":\\\"bphettis3m@imdb.com\\\",\\\"job\\\":\\\"Geologist I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":132,\\\"first_name\\\":\\\"Kittie\\\",\\\"email\\\":\\\"ksharville3n@shop-pro.jp\\\",\\\"job\\\":\\\"Health Coach IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":133,\\\"first_name\\\":\\\"Inesita\\\",\\\"email\\\":\\\"ihofton3o@wsj.com\\\",\\\"job\\\":\\\"Accountant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":134,\\\"first_name\\\":\\\"Charin\\\",\\\"email\\\":\\\"cbartomeu3p@wikipedia.org\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":135,\\\"first_name\\\":\\\"Kristi\\\",\\\"email\\\":\\\"kidenden3q@so-net.ne.jp\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":136,\\\"first_name\\\":\\\"Torey\\\",\\\"email\\\":\\\"ttoner3r@bandcamp.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":137,\\\"first_name\\\":\\\"Lockwood\\\",\\\"email\\\":\\\"ldunne3s@slideshare.net\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":138,\\\"first_name\\\":\\\"Filberto\\\",\\\"email\\\":\\\"fstrang3t@state.gov\\\",\\\"job\\\":\\\"Research Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":139,\\\"first_name\\\":\\\"Rhody\\\",\\\"email\\\":\\\"rbridgwater3u@typepad.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":140,\\\"first_name\\\":\\\"Kesley\\\",\\\"email\\\":\\\"kkepling3v@reference.com\\\",\\\"job\\\":\\\"Payment Adjustment Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":141,\\\"first_name\\\":\\\"Lombard\\\",\\\"email\\\":\\\"lmasterman3w@youku.com\\\",\\\"job\\\":\\\"Software Test Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":142,\\\"first_name\\\":\\\"Ferdinanda\\\",\\\"email\\\":\\\"fsandsallan3x@photobucket.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":143,\\\"first_name\\\":\\\"Grier\\\",\\\"email\\\":\\\"gandriveaux3y@mtv.com\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":144,\\\"first_name\\\":\\\"Evonne\\\",\\\"email\\\":\\\"emayho3z@google.es\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":145,\\\"first_name\\\":\\\"Jarrod\\\",\\\"email\\\":\\\"jgadault40@themeforest.net\\\",\\\"job\\\":\\\"Staff Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":146,\\\"first_name\\\":\\\"Cassondra\\\",\\\"email\\\":\\\"ccunio41@ihg.com\\\",\\\"job\\\":\\\"Computer Systems Analyst II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":147,\\\"first_name\\\":\\\"Jule\\\",\\\"email\\\":\\\"jbilam42@illinois.edu\\\",\\\"job\\\":\\\"Web Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":148,\\\"first_name\\\":\\\"Elianora\\\",\\\"email\\\":\\\"ehallede43@miibeian.gov.cn\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":149,\\\"first_name\\\":\\\"Addi\\\",\\\"email\\\":\\\"abanishevitz44@usnews.com\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":150,\\\"first_name\\\":\\\"Carin\\\",\\\"email\\\":\\\"carndell45@purevolume.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":151,\\\"first_name\\\":\\\"Buddy\\\",\\\"email\\\":\\\"btwydell46@sogou.com\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":152,\\\"first_name\\\":\\\"Parker\\\",\\\"email\\\":\\\"ppriestland47@hubpages.com\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":153,\\\"first_name\\\":\\\"Anthony\\\",\\\"email\\\":\\\"asallnow48@globo.com\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":154,\\\"first_name\\\":\\\"Barri\\\",\\\"email\\\":\\\"btollfree49@symantec.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":155,\\\"first_name\\\":\\\"Ernesta\\\",\\\"email\\\":\\\"ebeech4a@google.com\\\",\\\"job\\\":\\\"Safety Technician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":156,\\\"first_name\\\":\\\"Claudian\\\",\\\"email\\\":\\\"civushkin4b@sciencedirect.com\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":157,\\\"first_name\\\":\\\"Tova\\\",\\\"email\\\":\\\"townsworth4c@godaddy.com\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":158,\\\"first_name\\\":\\\"Stephanus\\\",\\\"email\\\":\\\"slarkkem4d@cnn.com\\\",\\\"job\\\":\\\"Computer Systems Analyst IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":159,\\\"first_name\\\":\\\"Adiana\\\",\\\"email\\\":\\\"atorregiani4e@who.int\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":160,\\\"first_name\\\":\\\"Tabor\\\",\\\"email\\\":\\\"ttrevorrow4f@uol.com.br\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":161,\\\"first_name\\\":\\\"Richmound\\\",\\\"email\\\":\\\"rfawkes4g@dropbox.com\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":162,\\\"first_name\\\":\\\"Evelyn\\\",\\\"email\\\":\\\"ebaggaley4h@google.it\\\",\\\"job\\\":\\\"Web Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":163,\\\"first_name\\\":\\\"Gypsy\\\",\\\"email\\\":\\\"gknudsen4i@domainmarket.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":164,\\\"first_name\\\":\\\"Patsy\\\",\\\"email\\\":\\\"plouthe4j@opensource.org\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":165,\\\"first_name\\\":\\\"Davita\\\",\\\"email\\\":\\\"dciotto4k@mayoclinic.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":166,\\\"first_name\\\":\\\"Jorey\\\",\\\"email\\\":\\\"jmassingberd4l@topsy.com\\\",\\\"job\\\":\\\"Software Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":167,\\\"first_name\\\":\\\"Cash\\\",\\\"email\\\":\\\"cclelland4m@columbia.edu\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":168,\\\"first_name\\\":\\\"Thorny\\\",\\\"email\\\":\\\"tlavell4n@fotki.com\\\",\\\"job\\\":\\\"Research Assistant IV\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":169,\\\"first_name\\\":\\\"Cassandra\\\",\\\"email\\\":\\\"ccapron4o@yellowpages.com\\\",\\\"job\\\":\\\"Administrative Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":170,\\\"first_name\\\":\\\"Marylynne\\\",\\\"email\\\":\\\"mredparth4p@marriott.com\\\",\\\"job\\\":\\\"Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":171,\\\"first_name\\\":\\\"Udale\\\",\\\"email\\\":\\\"ugarard4q@spiegel.de\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":172,\\\"first_name\\\":\\\"Annamarie\\\",\\\"email\\\":\\\"ahammerton4r@who.int\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":173,\\\"first_name\\\":\\\"Haskell\\\",\\\"email\\\":\\\"hstollwerck4s@comsenz.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":174,\\\"first_name\\\":\\\"Townsend\\\",\\\"email\\\":\\\"tnewnham4t@merriam-webster.com\\\",\\\"job\\\":\\\"Systems Administrator III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":175,\\\"first_name\\\":\\\"Amargo\\\",\\\"email\\\":\\\"abaish4u@netlog.com\\\",\\\"job\\\":\\\"Computer Systems Analyst II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":176,\\\"first_name\\\":\\\"Euphemia\\\",\\\"email\\\":\\\"eflaunders4v@spotify.com\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":177,\\\"first_name\\\":\\\"Addy\\\",\\\"email\\\":\\\"amuspratt4w@aol.com\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":178,\\\"first_name\\\":\\\"Mellisa\\\",\\\"email\\\":\\\"mchiddy4x@sciencedirect.com\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":179,\\\"first_name\\\":\\\"Zara\\\",\\\"email\\\":\\\"zyuill4y@gov.uk\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":180,\\\"first_name\\\":\\\"Gaven\\\",\\\"email\\\":\\\"gvaszoly4z@bravesites.com\\\",\\\"job\\\":\\\"Systems Administrator I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":181,\\\"first_name\\\":\\\"Drugi\\\",\\\"email\\\":\\\"dshowt50@liveinternet.ru\\\",\\\"job\\\":\\\"Mechanical Systems 
Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":182,\\\"first_name\\\":\\\"Frederick\\\",\\\"email\\\":\\\"fhurlston51@indiatimes.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":183,\\\"first_name\\\":\\\"Darin\\\",\\\"email\\\":\\\"dmaulin52@samsung.com\\\",\\\"job\\\":\\\"Geologist I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":184,\\\"first_name\\\":\\\"Wallis\\\",\\\"email\\\":\\\"wscurrer53@nbcnews.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":185,\\\"first_name\\\":\\\"Susann\\\",\\\"email\\\":\\\"skingsley54@yale.edu\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":186,\\\"first_name\\\":\\\"Bree\\\",\\\"email\\\":\\\"blieber55@furl.net\\\",\\\"job\\\":\\\"Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":187,\\\"first_name\\\":\\\"Nonnah\\\",\\\"email\\\":\\\"ngutteridge56@dailymail.co.uk\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":188,\\\"first_name\\\":\\\"Chrysler\\\",\\\"email\\\":\\\"cvarnham57@google.nl\\\",\\\"job\\\":\\\"Software Test Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":189,\\\"first_name\\\":\\\"Otha\\\",\\\"email\\\":\\\"odargavel58@phpbb.com\\\",\\\"job\\\":\\\"Administrative Assistant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":190,\\\"first_name\\\":\\\"Reynold\\\",\\\"email\\\":\\\"rbirdsall59@devhub.com\\\",\\\"job\\\":\\\"Safety Technician IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":191,\\\"first_name\\\":\\\"Cati\\\",\\\"email\\\":\\\"calcott5a@smh.com.au\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":192,\\\"first_name\\\":\\\"Yale\\\",\\\"email\\\":\\\"ymcguffog5b@seattletimes.com\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":193,\\\"first_name\\\":\\\"Berkly\\\",\\\"email\\\":\\\"bdutteridge5c@bbb.org\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":194,\\\"first_name\\\":\\\"Delinda\\\",\\\"email\\\":\\\"dhans5d@cbslocal.com\\\",\\\"job\\\":\\\"Chemical 
Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":195,\\\"first_name\\\":\\\"Fayre\\\",\\\"email\\\":\\\"fmeachen5e@vinaora.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":196,\\\"first_name\\\":\\\"Damaris\\\",\\\"email\\\":\\\"dlimbrick5f@biglobe.ne.jp\\\",\\\"job\\\":\\\"Automation Specialist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":197,\\\"first_name\\\":\\\"Adam\\\",\\\"email\\\":\\\"awintour5g@dyndns.org\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":198,\\\"first_name\\\":\\\"Phedra\\\",\\\"email\\\":\\\"ptuttle5h@wsj.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":199,\\\"first_name\\\":\\\"Cindie\\\",\\\"email\\\":\\\"cwenderott5i@sfgate.com\\\",\\\"job\\\":\\\"Staff Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":200,\\\"first_name\\\":\\\"Stesha\\\",\\\"email\\\":\\\"sbatrip5j@mlb.com\\\",\\\"job\\\":\\\"Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":201,\\\"first_name\\\":\\\"Gale\\\",\\\"email\\\":\\\"gcraythorn5k@wikispaces.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":202,\\\"first_name\\\":\\\"Pincas\\\",\\\"email\\\":\\\"psilvester5l@purevolume.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":203,\\\"first_name\\\":\\\"Colly\\\",\\\"email\\\":\\\"crubinow5m@behance.net\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":204,\\\"first_name\\\":\\\"Evy\\\",\\\"email\\\":\\\"ealkins5n@sogou.com\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":205,\\\"first_name\\\":\\\"Brana\\\",\\\"email\\\":\\\"bmelson5o@umn.edu\\\",\\\"job\\\":\\\"Information Systems Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":206,\\\"first_name\\\":\\\"Jobina\\\",\\\"email\\\":\\\"jshivlin5p@merriam-webster.com\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":207,\\\"first_name\\\":\\\"Lanny\\\",\\\"email\\\":\\\"lbediss5q@illinois.edu\\\",\\\"job\\\":\\\"Accounting Assistant 
I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":208,\\\"first_name\\\":\\\"Myrilla\\\",\\\"email\\\":\\\"mbuesnel5r@cisco.com\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":209,\\\"first_name\\\":\\\"Karleen\\\",\\\"email\\\":\\\"kbullerwell5s@go.com\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":210,\\\"first_name\\\":\\\"Dulcie\\\",\\\"email\\\":\\\"dniaves5t@issuu.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":211,\\\"first_name\\\":\\\"Clay\\\",\\\"email\\\":\\\"cmarguerite5u@washingtonpost.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":212,\\\"first_name\\\":\\\"Celeste\\\",\\\"email\\\":\\\"cradeliffe5v@sourceforge.net\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":213,\\\"first_name\\\":\\\"Calypso\\\",\\\"email\\\":\\\"claite5w@mashable.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":214,\\\"first_name\\\":\\\"Jessika\\\",\\\"email\\\":\\\"jmagne5x@ebay.co.uk\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":215,\\\"first_name\\\":\\\"Celka\\\",\\\"email\\\":\\\"ctomsa5y@addthis.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":216,\\\"first_name\\\":\\\"Ashla\\\",\\\"email\\\":\\\"amathouse5z@alibaba.com\\\",\\\"job\\\":\\\"Staff Accountant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":217,\\\"first_name\\\":\\\"Ameline\\\",\\\"email\\\":\\\"agibbens60@qq.com\\\",\\\"job\\\":\\\"Database Administrator III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":218,\\\"first_name\\\":\\\"Kerri\\\",\\\"email\\\":\\\"ktrowl61@google.co.jp\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":219,\\\"first_name\\\":\\\"Dill\\\",\\\"email\\\":\\\"dbrittan62@opera.com\\\",\\\"job\\\":\\\"Staff Accountant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":220,\\\"first_name\\\":\\\"Nikos\\\",\\\"email\\\":\\\"nyoull63@bloomberg.com\\\",\\\"job\\\":\\\"Geologist 
III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":221,\\\"first_name\\\":\\\"Tyler\\\",\\\"email\\\":\\\"tosbaldstone64@noaa.gov\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":222,\\\"first_name\\\":\\\"Antonetta\\\",\\\"email\\\":\\\"asinnett65@state.tx.us\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":223,\\\"first_name\\\":\\\"Ramsay\\\",\\\"email\\\":\\\"rlagneaux66@senate.gov\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":224,\\\"first_name\\\":\\\"Webb\\\",\\\"email\\\":\\\"wceney67@indiatimes.com\\\",\\\"job\\\":\\\"Web Developer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":225,\\\"first_name\\\":\\\"Karyl\\\",\\\"email\\\":\\\"knicholson68@exblog.jp\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":226,\\\"first_name\\\":\\\"Pietro\\\",\\\"email\\\":\\\"pclaybourn69@wikispaces.com\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":227,\\\"first_name\\\":\\\"Dinah\\\",\\\"email\\\":\\\"dsandal6a@trellian.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":228,\\\"first_name\\\":\\\"Britta\\\",\\\"email\\\":\\\"bferri6b@home.pl\\\",\\\"job\\\":\\\"Research Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":229,\\\"first_name\\\":\\\"Filberto\\\",\\\"email\\\":\\\"fshireff6c@163.com\\\",\\\"job\\\":\\\"Computer Systems Analyst II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":230,\\\"first_name\\\":\\\"Eberto\\\",\\\"email\\\":\\\"etunaclift6d@booking.com\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":231,\\\"first_name\\\":\\\"Martainn\\\",\\\"email\\\":\\\"mchuck6e@craigslist.org\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":232,\\\"first_name\\\":\\\"Lory\\\",\\\"email\\\":\\\"lolenichev6f@arizona.edu\\\",\\\"job\\\":\\\"Assistant Media Planner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":233,\\\"first_name\\\":\\\"Blinnie\\\",\\\"email\\\":\\\"bwhelband6g@lulu.com\\\",\\\"job\\\":\\\"VP 
Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":234,\\\"first_name\\\":\\\"Candide\\\",\\\"email\\\":\\\"cdresse6h@sciencedaily.com\\\",\\\"job\\\":\\\"Payment Adjustment Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":235,\\\"first_name\\\":\\\"Birgitta\\\",\\\"email\\\":\\\"bhue6i@dell.com\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":236,\\\"first_name\\\":\\\"Emyle\\\",\\\"email\\\":\\\"ecommander6j@dion.ne.jp\\\",\\\"job\\\":\\\"Budget/Accounting Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":237,\\\"first_name\\\":\\\"Rosanne\\\",\\\"email\\\":\\\"rkrystek6k@washington.edu\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":238,\\\"first_name\\\":\\\"Dottie\\\",\\\"email\\\":\\\"dbyas6l@scribd.com\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":239,\\\"first_name\\\":\\\"Theda\\\",\\\"email\\\":\\\"thugk6m@kickstarter.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":240,\\\"first_name\\\":\\\"Joceline\\\",\\\"email\\\":\\\"jgregoraci6n@omniture.com\\\",\\\"job\\\":\\\"Budget/Accounting Analyst IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":241,\\\"first_name\\\":\\\"Hannis\\\",\\\"email\\\":\\\"hquarrell6o@domainmarket.com\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":242,\\\"first_name\\\":\\\"Percival\\\",\\\"email\\\":\\\"psharper6p@blog.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":243,\\\"first_name\\\":\\\"Theodora\\\",\\\"email\\\":\\\"tsangwin6q@infoseek.co.jp\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":244,\\\"first_name\\\":\\\"Sherri\\\",\\\"email\\\":\\\"swilcher6r@pagesperso-orange.fr\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":245,\\\"first_name\\\":\\\"Cheryl\\\",\\\"email\\\":\\\"cedwicke6s@mlb.com\\\",\\\"job\\\":\\\"Health Coach I\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":246,\\\"first_name\\\":\\\"Floyd\\\",\\\"email\\\":\\\"fharbottle6t@tiny.cc\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":247,\\\"first_name\\\":\\\"Merrilee\\\",\\\"email\\\":\\\"mcuesta6u@so-net.ne.jp\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":248,\\\"first_name\\\":\\\"Anestassia\\\",\\\"email\\\":\\\"amorshead6v@joomla.org\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":249,\\\"first_name\\\":\\\"Coralyn\\\",\\\"email\\\":\\\"cdrynan6w@github.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":250,\\\"first_name\\\":\\\"Kory\\\",\\\"email\\\":\\\"kdevaney6x@theglobeandmail.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":251,\\\"first_name\\\":\\\"Orazio\\\",\\\"email\\\":\\\"obraddon6y@jimdo.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":252,\\\"first_name\\\":\\\"Ajay\\\",\\\"email\\\":\\\"acoushe6z@icio.us\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":253,\\\"first_name\\\":\\\"Elnore\\\",\\\"email\\\":\\\"eallsopp70@aboutads.info\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":254,\\\"first_name\\\":\\\"Laughton\\\",\\\"email\\\":\\\"lgoodee71@booking.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":255,\\\"first_name\\\":\\\"Donn\\\",\\\"email\\\":\\\"dgianni72@boston.com\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":256,\\\"first_name\\\":\\\"Millie\\\",\\\"email\\\":\\\"mcogman73@skype.com\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":257,\\\"first_name\\\":\\\"Elvira\\\",\\\"email\\\":\\\"elampet74@ibm.com\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":258,\\\"first_name\\\":\\\"Tanny\\\",\\\"email\\\":\\\"tkeilloh75@bbb.org\\\",\\\"job\\\":\\\"Operator\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":259,\\\"first_name\\\":\\\"Toiboid\\\",\\\"email\\\":\\\"tgennerich76@hibu.com\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":260,\\\"first_name\\\":\\\"Bram\\\",\\\"email\\\":\\\"blackie77@godaddy.com\\\",\\\"job\\\":\\\"Staff Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":261,\\\"first_name\\\":\\\"Rozella\\\",\\\"email\\\":\\\"rantonov78@ca.gov\\\",\\\"job\\\":\\\"Human Resources Assistant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":262,\\\"first_name\\\":\\\"Ilene\\\",\\\"email\\\":\\\"ibattie79@deviantart.com\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":263,\\\"first_name\\\":\\\"Nobe\\\",\\\"email\\\":\\\"nhayhurst7a@drupal.org\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":264,\\\"first_name\\\":\\\"Reggy\\\",\\\"email\\\":\\\"rdomican7b@archive.org\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":265,\\\"first_name\\\":\\\"Pru\\\",\\\"email\\\":\\\"patcherley7c@sina.com.cn\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":266,\\\"first_name\\\":\\\"Jeremiah\\\",\\\"email\\\":\\\"jjiranek7d@bloglovin.com\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":267,\\\"first_name\\\":\\\"Ivor\\\",\\\"email\\\":\\\"irudledge7e@businessinsider.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":268,\\\"first_name\\\":\\\"Marys\\\",\\\"email\\\":\\\"mtarbert7f@ebay.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":269,\\\"first_name\\\":\\\"Joshuah\\\",\\\"email\\\":\\\"jwitty7g@creativecommons.org\\\",\\\"job\\\":\\\"Programmer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":270,\\\"first_name\\\":\\\"Gilberte\\\",\\\"email\\\":\\\"gmccall7h@delicious.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":271,\\\"first_name\\\":\\\"Clayborne\\\",\\\"email\\\":\\\"clecointe7i@twitpic.com\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":272,\\\"first_name\\\":\\\"Karole\\\",\\\"email\\\":\\\"kteodori7j@twitter.com\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":273,\\\"first_name\\\":\\\"Kaiser\\\",\\\"email\\\":\\\"kyglesias7k@purevolume.com\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":274,\\\"first_name\\\":\\\"Lek\\\",\\\"email\\\":\\\"lmacci7l@prlog.org\\\",\\\"job\\\":\\\"Environmental Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":275,\\\"first_name\\\":\\\"Ellyn\\\",\\\"email\\\":\\\"eculpen7m@umich.edu\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":276,\\\"first_name\\\":\\\"Mavis\\\",\\\"email\\\":\\\"mfurlonge7n@yellowbook.com\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":277,\\\"first_name\\\":\\\"Arlyne\\\",\\\"email\\\":\\\"afullegar7o@howstuffworks.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":278,\\\"first_name\\\":\\\"Joy\\\",\\\"email\\\":\\\"jgristwood7p@myspace.com\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":279,\\\"first_name\\\":\\\"Genni\\\",\\\"email\\\":\\\"gvoak7q@wufoo.com\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":280,\\\"first_name\\\":\\\"Ermina\\\",\\\"email\\\":\\\"eforseith7r@indiegogo.com\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":281,\\\"first_name\\\":\\\"Ruddie\\\",\\\"email\\\":\\\"rbranson7s@ezinearticles.com\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":282,\\\"first_name\\\":\\\"Wendye\\\",\\\"email\\\":\\\"wcasiero7t@adobe.com\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":283,\\\"first_name\\\":\\\"Harman\\\",\\\"email\\\":\\\"htevlin7u@dailymail.co.uk\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":284,\\\"first_name\\\":\\\"Zeb\\\",\\\"email\\\":\\\"zpiatkow7v@tamu.edu\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":285,\\\"first_name\\\":\\\"Sidoney\\\",\\\"email\\\":\\\"sdawson7w@etsy.com\\\",\\\"job\\\":\\\"Web Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":286,\\\"first_name\\\":\\\"Maude\\\",\\\"email\\\":\\\"marnholz7x@flickr.com\\\",\\\"job\\\":\\\"Statistician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":287,\\\"first_name\\\":\\\"Ennis\\\",\\\"email\\\":\\\"epietranek7y@jimdo.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":288,\\\"first_name\\\":\\\"Rutter\\\",\\\"email\\\":\\\"rlockart7z@devhub.com\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":289,\\\"first_name\\\":\\\"Shauna\\\",\\\"email\\\":\\\"sproctor80@studiopress.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":290,\\\"first_name\\\":\\\"Ingunna\\\",\\\"email\\\":\\\"ikermon81@google.es\\\",\\\"job\\\":\\\"Account Representative II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":291,\\\"first_name\\\":\\\"Wynnie\\\",\\\"email\\\":\\\"wliddiard82@51.la\\\",\\\"job\\\":\\\"Health Coach IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":292,\\\"first_name\\\":\\\"Lydon\\\",\\\"email\\\":\\\"lkanwell83@mysql.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":293,\\\"first_name\\\":\\\"Richart\\\",\\\"email\\\":\\\"rdoone84@cisco.com\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":294,\\\"first_name\\\":\\\"Zeb\\\",\\\"email\\\":\\\"ziacovelli85@baidu.com\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":295,\\\"first_name\\\":\\\"Emlen\\\",\\\"email\\\":\\\"eroly86@goodreads.com\\\",\\\"job\\\":\\\"Business Systems Development Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":296,\\\"first_name\\\":\\\"Whitaker\\\",\\\"email\\\":\\\"wkingstne87@oracle.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":297,\\\"first_name\\\":\\\"Karlis\\\",\\\"email\\\":\\\"kworssam88@freewebs.com\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":298,\\\"first_name\\\":\\\"Shurlocke\\\",\\\"email\\\":\\\"szorzenoni89@cmu.edu\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":299,\\\"first_name\\\":\\\"Myrtle\\\",\\\"email\\\":\\\"mmccrillis8a@google.fr\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":300,\\\"first_name\\\":\\\"Marillin\\\",\\\"email\\\":\\\"msara8b@alexa.com\\\",\\\"job\\\":\\\"Accounting Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":301,\\\"first_name\\\":\\\"Nadia\\\",\\\"email\\\":\\\"nsantarelli8c@netvibes.com\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":302,\\\"first_name\\\":\\\"Kendal\\\",\\\"email\\\":\\\"kgerbl8d@un.org\\\",\\\"job\\\":\\\"Programmer Analyst IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":303,\\\"first_name\\\":\\\"Ayn\\\",\\\"email\\\":\\\"apinck8e@theglobeandmail.com\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":304,\\\"first_name\\\":\\\"Mort\\\",\\\"email\\\":\\\"mfyndon8f@nydailynews.com\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":305,\\\"first_name\\\":\\\"Pauletta\\\",\\\"email\\\":\\\"pllopis8g@blogtalkradio.com\\\",\\\"job\\\":\\\"Systems Administrator III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":306,\\\"first_name\\\":\\\"Bartel\\\",\\\"email\\\":\\\"bjosephs8h@unblog.fr\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":307,\\\"first_name\\\":\\\"Ezri\\\",\\\"email\\\":\\\"ebricksey8i@networksolutions.com\\\",\\\"job\\\":\\\"Product Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":308,\\\"first_name\\\":\\\"Roarke\\\",\\\"email\\\":\\\"rreymers8j@instagram.com\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":309,\\\"first_name\\\":\\\"Danya\\\",\\\"email\\\":\\\"dmilillo8k@ow.ly\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":310,\\\"first_name\\\":\\\"Anatol\\\",\\\"email\\\":\\\"aokey8l@latimes.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":311,\\\"first_name\\\":\\\"Kirsti\\\",\\\"email\\\":\\\"kwormell8m@noaa.gov\\\",\\\"job\\\":\\\"Research Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":312,\\\"first_name\\\":\\\"Chrisy\\\",\\\"email\\\":\\\"cupstell8n@wsj.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":313,\\\"first_name\\\":\\\"Tudor\\\",\\\"email\\\":\\\"tsunshine8o@weebly.com\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":314,\\\"first_name\\\":\\\"Gaspar\\\",\\\"email\\\":\\\"gdollimore8p@squarespace.com\\\",\\\"job\\\":\\\"Product Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":315,\\\"first_name\\\":\\\"Amata\\\",\\\"email\\\":\\\"adaville8q@ow.ly\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":316,\\\"first_name\\\":\\\"Gal\\\",\\\"email\\\":\\\"gbaltrushaitis8r@bbc.co.uk\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":317,\\\"first_name\\\":\\\"Jinny\\\",\\\"email\\\":\\\"jmccoughan8s@list-manage.com\\\",\\\"job\\\":\\\"Automation Specialist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":318,\\\"first_name\\\":\\\"Rina\\\",\\\"email\\\":\\\"rlabbey8t@cdc.gov\\\",\\\"job\\\":\\\"Environmental Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":319,\\\"first_name\\\":\\\"Quinn\\\",\\\"email\\\":\\\"qgarrold8u@dailymail.co.uk\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":320,\\\"first_name\\\":\\\"Jozef\\\",\\\"email\\\":\\\"jyanin8v@purevolume.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":321,\\\"first_name\\\":\\\"Lynea\\\",\\\"email\\\":\\\"lspitaro8w@cdc.gov\\\",\\\"job\\\":\\\"Software Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":322,\\\"first_name\\\":\\\"Page\\\",\\\"email\\\":\\\"pyeliashev8x@si.edu\\\",\\\"job\\\":\\\"Staff Accountant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":323,\\\"first_name\\\":\\\"Katerina\\\",\\\"email\\\":\\\"khuygen8y@wikispaces.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":324,\\\"first_name\\\":\\\"Conan\\\",\\\"email\\\":\\\"cjelly8z@yellowpages.com\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":325,\\\"first_name\\\":\\\"Paula\\\",\\\"email\\\":\\\"pridgers90@paypal.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":326,\\\"first_name\\\":\\\"Lilyan\\\",\\\"email\\\":\\\"ltuft91@quantcast.com\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":327,\\\"first_name\\\":\\\"Kore\\\",\\\"email\\\":\\\"kquincey92@mediafire.com\\\",\\\"job\\\":\\\"Senior Developer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":328,\\\"first_name\\\":\\\"Dorice\\\",\\\"email\\\":\\\"dfargie93@t-online.de\\\",\\\"job\\\":\\\"Media Manager IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":329,\\\"first_name\\\":\\\"Archer\\\",\\\"email\\\":\\\"apoints94@bbb.org\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":330,\\\"first_name\\\":\\\"Hyacinth\\\",\\\"email\\\":\\\"hcathcart95@taobao.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":331,\\\"first_name\\\":\\\"Gerome\\\",\\\"email\\\":\\\"gstrain96@free.fr\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":332,\\\"first_name\\\":\\\"Annie\\\",\\\"email\\\":\\\"apollastrino97@msn.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":333,\\\"first_name\\\":\\\"Catharina\\\",\\\"email\\\":\\\"ccallendar98@sbwire.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":334,\\\"first_name\\\":\\\"Lanie\\\",\\\"email\\\":\\\"lackroyd99@cdbaby.com\\\",\\\"job\\\":\\\"Recruiting Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":335,\\\"first_name\\\":\\\"Liuka\\\",\\\"email\\\":\\\"ltowns9a@wufoo.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":336,\\\"first_name\\\":\\\"Julianna\\\",\\\"email\\\":\\\"jgrassick9b@ocn.ne.jp\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":337,\\\"first_name\\\":\\\"Jaimie\\\",\\\"email\\\":\\\"jforgie9c@plala.or.jp\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":338,\\\"first_name\\\":\\\"Amye\\\",\\\"email\\\":\\\"ashortall9d@booking.com\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":339,\\\"first_name\\\":\\\"Laurie\\\",\\\"email\\\":\\\"lwiddows9e@blogspot.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":340,\\\"first_name\\\":\\\"Belva\\\",\\\"email\\\":\\\"barmour9f@msu.edu\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":341,\\\"first_name\\\":\\\"Vida\\\",\\\"email\\\":\\\"vdorgon9g@printfriendly.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":342,\\\"first_name\\\":\\\"Sherlock\\\",\\\"email\\\":\\\"silyinykh9h@accuweather.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":343,\\\"first_name\\\":\\\"Erhard\\\",\\\"email\\\":\\\"edranfield9i@pagesperso-orange.fr\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":344,\\\"first_name\\\":\\\"Roxi\\\",\\\"email\\\":\\\"rjerdein9j@google.fr\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":345,\\\"first_name\\\":\\\"Carl\\\",\\\"email\\\":\\\"ctutill9k@youtu.be\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":346,\\\"first_name\\\":\\\"Alana\\\",\\\"email\\\":\\\"astangoe9l@tamu.edu\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":347,\\\"first_name\\\":\\\"Lura\\\",\\\"email\\\":\\\"lcosgry9m@typepad.com\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":348,\\\"first_name\\\":\\\"Eve\\\",\\\"email\\\":\\\"earkin9n@usda.gov\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":349,\\\"first_name\\\":\\\"Claiborn\\\",\\\"email\\\":\\\"cmcellen9o@bbb.org\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":350,\\\"first_name\\\":\\\"Cindy\\\",\\\"email\\\":\\\"csellor9p@ow.ly\\\",\\\"job\\\":\\\"Chief Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":351,\\\"first_name\\\":\\\"Torey\\\",\\\"email\\\":\\\"tkasperski9q@bing.com\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":352,\\\"first_name\\\":\\\"Baird\\\",\\\"email\\\":\\\"bdillet9r@mac.com\\\",\\\"job\\\":\\\"Budget/Accounting Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":353,\\\"first_name\\\":\\\"Lesley\\\",\\\"email\\\":\\\"lavramow9s@wsj.com\\\",\\\"job\\\":\\\"Assistant Media Planner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":354,\\\"first_name\\\":\\\"Claire\\\",\\\"email\\\":\\\"cpattenden9t@sohu.com\\\",\\\"job\\\":\\\"Software Test Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":355,\\\"first_name\\\":\\\"Bernardina\\\",\\\"email\\\":\\\"bfazakerley9u@springer.com\\\",\\\"job\\\":\\\"Automation Specialist II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":356,\\\"first_name\\\":\\\"Humfried\\\",\\\"email\\\":\\\"harrighini9v@barnesandnoble.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":357,\\\"first_name\\\":\\\"Didi\\\",\\\"email\\\":\\\"dhullock9w@foxnews.com\\\",\\\"job\\\":\\\"Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":358,\\\"first_name\\\":\\\"Morlee\\\",\\\"email\\\":\\\"mosmint9x@goo.gl\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":359,\\\"first_name\\\":\\\"Jethro\\\",\\\"email\\\":\\\"jjessett9y@usgs.gov\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":360,\\\"first_name\\\":\\\"Aurelia\\\",\\\"email\\\":\\\"aeveriss9z@china.com.cn\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":361,\\\"first_name\\\":\\\"Cornelius\\\",\\\"email\\\":\\\"czanettinia0@xinhuanet.com\\\",\\\"job\\\":\\\"Software Test Engineer II\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":362,\\\"first_name\\\":\\\"Giovanni\\\",\\\"email\\\":\\\"ghuddlestona1@bloomberg.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":363,\\\"first_name\\\":\\\"Perla\\\",\\\"email\\\":\\\"pjirouteka2@yahoo.co.jp\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":364,\\\"first_name\\\":\\\"Hayes\\\",\\\"email\\\":\\\"hjorgensena3@ucla.edu\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":365,\\\"first_name\\\":\\\"Vitia\\\",\\\"email\\\":\\\"vpischofa4@tamu.edu\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":366,\\\"first_name\\\":\\\"Becki\\\",\\\"email\\\":\\\"bsimkissa5@drupal.org\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": \"{\\\"id\\\":367,\\\"first_name\\\":\\\"Bary\\\",\\\"email\\\":\\\"blemmensa6@gmpg.org\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":368,\\\"first_name\\\":\\\"Jolyn\\\",\\\"email\\\":\\\"jlemarquanda7@fc2.com\\\",\\\"job\\\":\\\"Engineer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":369,\\\"first_name\\\":\\\"Margette\\\",\\\"email\\\":\\\"mrentenbecka8@jugem.jp\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":370,\\\"first_name\\\":\\\"Yoshi\\\",\\\"email\\\":\\\"ybinleya9@un.org\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":371,\\\"first_name\\\":\\\"Skipton\\\",\\\"email\\\":\\\"stheodorisaa@pinterest.com\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":372,\\\"first_name\\\":\\\"Crin\\\",\\\"email\\\":\\\"cdrezzerab@tmall.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":373,\\\"first_name\\\":\\\"Kendra\\\",\\\"email\\\":\\\"ksabatheac@istockphoto.com\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":374,\\\"first_name\\\":\\\"Alica\\\",\\\"email\\\":\\\"aglaisnerad@bbc.co.uk\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":375,\\\"first_name\\\":\\\"Penelopa\\\",\\\"email\\\":\\\"pgiovannilliae@123-reg.co.uk\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":376,\\\"first_name\\\":\\\"Julie\\\",\\\"email\\\":\\\"jcuttenaf@yandex.ru\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":377,\\\"first_name\\\":\\\"Scot\\\",\\\"email\\\":\\\"smidgelyag@imdb.com\\\",\\\"job\\\":\\\"Chief Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":378,\\\"first_name\\\":\\\"Almira\\\",\\\"email\\\":\\\"ajelkah@foxnews.com\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":379,\\\"first_name\\\":\\\"Diannne\\\",\\\"email\\\":\\\"dtallonai@imgur.com\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":380,\\\"first_name\\\":\\\"Cyrus\\\",\\\"email\\\":\\\"cdunlopaj@miibeian.gov.cn\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":381,\\\"first_name\\\":\\\"Ronny\\\",\\\"email\\\":\\\"rtriggak@edublogs.org\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":382,\\\"first_name\\\":\\\"Archaimbaud\\\",\\\"email\\\":\\\"alushal@dedecms.com\\\",\\\"job\\\":\\\"Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":383,\\\"first_name\\\":\\\"Matthew\\\",\\\"email\\\":\\\"mcashinam@sfgate.com\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":384,\\\"first_name\\\":\\\"Abramo\\\",\\\"email\\\":\\\"ahentzeleran@istockphoto.com\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":385,\\\"first_name\\\":\\\"Farah\\\",\\\"email\\\":\\\"flarventao@1688.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":386,\\\"first_name\\\":\\\"Jackquelin\\\",\\\"email\\\":\\\"jdevericksap@ted.com\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":387,\\\"first_name\\\":\\\"Leonhard\\\",\\\"email\\\":\\\"lbasfordaq@odnoklassniki.ru\\\",\\\"job\\\":\\\"Engineer II\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":388,\\\"first_name\\\":\\\"Dixie\\\",\\\"email\\\":\\\"dbouchar@instagram.com\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":389,\\\"first_name\\\":\\\"Tanitansy\\\",\\\"email\\\":\\\"ttamburoas@twitter.com\\\",\\\"job\\\":\\\"Programmer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":390,\\\"first_name\\\":\\\"Olivier\\\",\\\"email\\\":\\\"ohighnamat@amazonaws.com\\\",\\\"job\\\":\\\"Systems Administrator II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":391,\\\"first_name\\\":\\\"Seumas\\\",\\\"email\\\":\\\"scalladineau@aol.com\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":392,\\\"first_name\\\":\\\"Vinni\\\",\\\"email\\\":\\\"vstidworthyav@comcast.net\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":393,\\\"first_name\\\":\\\"Lorri\\\",\\\"email\\\":\\\"lvennartaw@linkedin.com\\\",\\\"job\\\":\\\"Paralegal\\\"}\"}\n{ \"body\": \"{\\\"id\\\":394,\\\"first_name\\\":\\\"Gelya\\\",\\\"email\\\":\\\"gcotesax@nydailynews.com\\\",\\\"job\\\":\\\"Account Representative II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":395,\\\"first_name\\\":\\\"Ximenez\\\",\\\"email\\\":\\\"xkermonay@github.io\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":396,\\\"first_name\\\":\\\"Blythe\\\",\\\"email\\\":\\\"blandsmanaz@deliciousdays.com\\\",\\\"job\\\":\\\"Research Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":397,\\\"first_name\\\":\\\"Noak\\\",\\\"email\\\":\\\"nmourgeb0@purevolume.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":398,\\\"first_name\\\":\\\"Mallissa\\\",\\\"email\\\":\\\"mbradmoreb1@adobe.com\\\",\\\"job\\\":\\\"Information Systems Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":399,\\\"first_name\\\":\\\"Malissa\\\",\\\"email\\\":\\\"mdjorevicb2@cafepress.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":400,\\\"first_name\\\":\\\"Vern\\\",\\\"email\\\":\\\"vrobroeb3@squarespace.com\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":401,\\\"first_name\\\":\\\"Willette\\\",\\\"email\\\":\\\"wgawthorpb4@fotki.com\\\",\\\"job\\\":\\\"Web Designer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":402,\\\"first_name\\\":\\\"Joseito\\\",\\\"email\\\":\\\"jmuatb5@vkontakte.ru\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":403,\\\"first_name\\\":\\\"Roger\\\",\\\"email\\\":\\\"rtunstallb6@fotki.com\\\",\\\"job\\\":\\\"Software Test Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":404,\\\"first_name\\\":\\\"Welsh\\\",\\\"email\\\":\\\"wringerb7@cam.ac.uk\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":405,\\\"first_name\\\":\\\"Consolata\\\",\\\"email\\\":\\\"csmallwoodb8@springer.com\\\",\\\"job\\\":\\\"Budget/Accounting Analyst IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":406,\\\"first_name\\\":\\\"Bern\\\",\\\"email\\\":\\\"bgascarb9@networksolutions.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":407,\\\"first_name\\\":\\\"Giusto\\\",\\\"email\\\":\\\"gfoottitba@reuters.com\\\",\\\"job\\\":\\\"Geologist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":408,\\\"first_name\\\":\\\"Alfie\\\",\\\"email\\\":\\\"awingbb@uiuc.edu\\\",\\\"job\\\":\\\"Database Administrator II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":409,\\\"first_name\\\":\\\"Tilda\\\",\\\"email\\\":\\\"tchiecobc@dyndns.org\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":410,\\\"first_name\\\":\\\"Gilburt\\\",\\\"email\\\":\\\"gbacherbd@samsung.com\\\",\\\"job\\\":\\\"Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":411,\\\"first_name\\\":\\\"Carolyne\\\",\\\"email\\\":\\\"ckaretbe@arizona.edu\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":412,\\\"first_name\\\":\\\"Bastian\\\",\\\"email\\\":\\\"bpoebf@si.edu\\\",\\\"job\\\":\\\"Systems Administrator I\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":413,\\\"first_name\\\":\\\"Shem\\\",\\\"email\\\":\\\"smartlewbg@cafepress.com\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":414,\\\"first_name\\\":\\\"Lauryn\\\",\\\"email\\\":\\\"lwardropbh@nydailynews.com\\\",\\\"job\\\":\\\"Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":415,\\\"first_name\\\":\\\"Sondra\\\",\\\"email\\\":\\\"sboschmannbi@fastcompany.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":416,\\\"first_name\\\":\\\"Adham\\\",\\\"email\\\":\\\"awroughtbj@cpanel.net\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":417,\\\"first_name\\\":\\\"Cornall\\\",\\\"email\\\":\\\"cbreacherbk@businessweek.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":418,\\\"first_name\\\":\\\"Vilma\\\",\\\"email\\\":\\\"vsamsinbl@toplist.cz\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":419,\\\"first_name\\\":\\\"Ollie\\\",\\\"email\\\":\\\"omannockbm@wsj.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":420,\\\"first_name\\\":\\\"Carie\\\",\\\"email\\\":\\\"ckernaghanbn@last.fm\\\",\\\"job\\\":\\\"Environmental Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":421,\\\"first_name\\\":\\\"Wells\\\",\\\"email\\\":\\\"wkalinowskybo@discuz.net\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":422,\\\"first_name\\\":\\\"Boothe\\\",\\\"email\\\":\\\"bjoontjesbp@people.com.cn\\\",\\\"job\\\":\\\"Account Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":423,\\\"first_name\\\":\\\"Dominick\\\",\\\"email\\\":\\\"dweekesbq@networkadvertising.org\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":424,\\\"first_name\\\":\\\"Jeanie\\\",\\\"email\\\":\\\"jveldebr@ovh.net\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":425,\\\"first_name\\\":\\\"Shayne\\\",\\\"email\\\":\\\"stipplebs@amazon.co.jp\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":426,\\\"first_name\\\":\\\"Romola\\\",\\\"email\\\":\\\"rrylettbt@paginegialle.it\\\",\\\"job\\\":\\\"Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":427,\\\"first_name\\\":\\\"Onida\\\",\\\"email\\\":\\\"omenpesbu@gnu.org\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":428,\\\"first_name\\\":\\\"Margareta\\\",\\\"email\\\":\\\"mwolteringbv@list-manage.com\\\",\\\"job\\\":\\\"Web Developer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":429,\\\"first_name\\\":\\\"Erastus\\\",\\\"email\\\":\\\"eschwandbw@fastcompany.com\\\",\\\"job\\\":\\\"Budget/Accounting Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":430,\\\"first_name\\\":\\\"Michail\\\",\\\"email\\\":\\\"mlauritsenbx@webeden.co.uk\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":431,\\\"first_name\\\":\\\"Allissa\\\",\\\"email\\\":\\\"amaddocksby@de.vu\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":432,\\\"first_name\\\":\\\"Jerrylee\\\",\\\"email\\\":\\\"jwannesbz@redcross.org\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":433,\\\"first_name\\\":\\\"Nadeen\\\",\\\"email\\\":\\\"nsamplesc0@github.io\\\",\\\"job\\\":\\\"Legal Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":434,\\\"first_name\\\":\\\"Lucilia\\\",\\\"email\\\":\\\"lalveyc1@latimes.com\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":435,\\\"first_name\\\":\\\"Neille\\\",\\\"email\\\":\\\"ncoldhamc2@ebay.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":436,\\\"first_name\\\":\\\"Alick\\\",\\\"email\\\":\\\"abidgodc3@arstechnica.com\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":437,\\\"first_name\\\":\\\"Myrah\\\",\\\"email\\\":\\\"mtrailc4@wunderground.com\\\",\\\"job\\\":\\\"Software Engineer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":438,\\\"first_name\\\":\\\"Jacquelyn\\\",\\\"email\\\":\\\"jdearnleyc5@ebay.co.uk\\\",\\\"job\\\":\\\"Graphic 
Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":439,\\\"first_name\\\":\\\"Gloriane\\\",\\\"email\\\":\\\"glehemannc6@wisc.edu\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":440,\\\"first_name\\\":\\\"Genni\\\",\\\"email\\\":\\\"gbaddamc7@istockphoto.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":441,\\\"first_name\\\":\\\"Fairlie\\\",\\\"email\\\":\\\"fdepperc8@ft.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":442,\\\"first_name\\\":\\\"Honoria\\\",\\\"email\\\":\\\"hchokec9@seesaa.net\\\",\\\"job\\\":\\\"Information Systems Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":443,\\\"first_name\\\":\\\"Enos\\\",\\\"email\\\":\\\"ehoweyca@mediafire.com\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":444,\\\"first_name\\\":\\\"Corney\\\",\\\"email\\\":\\\"caguirrecb@bizjournals.com\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":445,\\\"first_name\\\":\\\"Ced\\\",\\\"email\\\":\\\"cfoulkscc@reference.com\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":446,\\\"first_name\\\":\\\"Vivianna\\\",\\\"email\\\":\\\"vplailcd@ycombinator.com\\\",\\\"job\\\":\\\"Geologist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":447,\\\"first_name\\\":\\\"Daffy\\\",\\\"email\\\":\\\"dingarfieldce@xrea.com\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":448,\\\"first_name\\\":\\\"Barrie\\\",\\\"email\\\":\\\"bdowyercf@newyorker.com\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":449,\\\"first_name\\\":\\\"Ema\\\",\\\"email\\\":\\\"ekediecg@dailymotion.com\\\",\\\"job\\\":\\\"Programmer Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":450,\\\"first_name\\\":\\\"Caty\\\",\\\"email\\\":\\\"ckordingch@theguardian.com\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": \"{\\\"id\\\":451,\\\"first_name\\\":\\\"Annmarie\\\",\\\"email\\\":\\\"ablockci@about.com\\\",\\\"job\\\":\\\"Legal Assistant\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":452,\\\"first_name\\\":\\\"Otis\\\",\\\"email\\\":\\\"ofassecj@themeforest.net\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":453,\\\"first_name\\\":\\\"Perla\\\",\\\"email\\\":\\\"pfassck@yandex.ru\\\",\\\"job\\\":\\\"Statistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":454,\\\"first_name\\\":\\\"Issy\\\",\\\"email\\\":\\\"ithatchercl@vk.com\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":455,\\\"first_name\\\":\\\"Kerby\\\",\\\"email\\\":\\\"koscanloncm@npr.org\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":456,\\\"first_name\\\":\\\"Randy\\\",\\\"email\\\":\\\"rchetwyndcn@google.com.au\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":457,\\\"first_name\\\":\\\"Findley\\\",\\\"email\\\":\\\"fwillasco@loc.gov\\\",\\\"job\\\":\\\"Web Developer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":458,\\\"first_name\\\":\\\"Cristobal\\\",\\\"email\\\":\\\"cmacgibboncp@washington.edu\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":459,\\\"first_name\\\":\\\"Neille\\\",\\\"email\\\":\\\"nscrauniagecq@nasa.gov\\\",\\\"job\\\":\\\"Health Coach IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":460,\\\"first_name\\\":\\\"Dennison\\\",\\\"email\\\":\\\"dsacazecr@barnesandnoble.com\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":461,\\\"first_name\\\":\\\"Harcourt\\\",\\\"email\\\":\\\"hhawkridgecs@ted.com\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":462,\\\"first_name\\\":\\\"Ceciley\\\",\\\"email\\\":\\\"cridouttct@xinhuanet.com\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":463,\\\"first_name\\\":\\\"Ginnifer\\\",\\\"email\\\":\\\"gbartlettcu@livejournal.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":464,\\\"first_name\\\":\\\"Livvy\\\",\\\"email\\\":\\\"lbranncv@microsoft.com\\\",\\\"job\\\":\\\"Biostatistician II\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":465,\\\"first_name\\\":\\\"Marjory\\\",\\\"email\\\":\\\"mwimpresscw@microsoft.com\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":466,\\\"first_name\\\":\\\"Manya\\\",\\\"email\\\":\\\"mcianicx@usda.gov\\\",\\\"job\\\":\\\"Health Coach III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":467,\\\"first_name\\\":\\\"Hobart\\\",\\\"email\\\":\\\"hhakecy@boston.com\\\",\\\"job\\\":\\\"Business Systems Development Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":468,\\\"first_name\\\":\\\"Drona\\\",\\\"email\\\":\\\"dstylescz@amazon.co.uk\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":469,\\\"first_name\\\":\\\"Cathryn\\\",\\\"email\\\":\\\"cpollockd0@prlog.org\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":470,\\\"first_name\\\":\\\"Arvie\\\",\\\"email\\\":\\\"apowlesd1@apache.org\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":471,\\\"first_name\\\":\\\"Elisha\\\",\\\"email\\\":\\\"erableaud2@latimes.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":472,\\\"first_name\\\":\\\"Stanton\\\",\\\"email\\\":\\\"sdinehartd3@engadget.com\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":473,\\\"first_name\\\":\\\"Athena\\\",\\\"email\\\":\\\"amignotd4@123-reg.co.uk\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":474,\\\"first_name\\\":\\\"Fredericka\\\",\\\"email\\\":\\\"fbrannod5@gov.uk\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":475,\\\"first_name\\\":\\\"Rois\\\",\\\"email\\\":\\\"rlesperd6@slideshare.net\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":476,\\\"first_name\\\":\\\"Devlin\\\",\\\"email\\\":\\\"dsaleryd7@prlog.org\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":477,\\\"first_name\\\":\\\"Chandal\\\",\\\"email\\\":\\\"clearmonthd8@jimdo.com\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":478,\\\"first_name\\\":\\\"Gordy\\\",\\\"email\\\":\\\"gzanicchellid9@nydailynews.com\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":479,\\\"first_name\\\":\\\"Kaye\\\",\\\"email\\\":\\\"kklimasda@smh.com.au\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":480,\\\"first_name\\\":\\\"Anthiathia\\\",\\\"email\\\":\\\"acorwooddb@networksolutions.com\\\",\\\"job\\\":\\\"Product Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":481,\\\"first_name\\\":\\\"Bertina\\\",\\\"email\\\":\\\"bgoddarddc@ocn.ne.jp\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":482,\\\"first_name\\\":\\\"Alexandre\\\",\\\"email\\\":\\\"aliteldd@tripadvisor.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":483,\\\"first_name\\\":\\\"Sutton\\\",\\\"email\\\":\\\"skhidrde@techcrunch.com\\\",\\\"job\\\":\\\"Research Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":484,\\\"first_name\\\":\\\"Elsinore\\\",\\\"email\\\":\\\"eairddf@taobao.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":485,\\\"first_name\\\":\\\"Branden\\\",\\\"email\\\":\\\"bgraundissondg@cam.ac.uk\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":486,\\\"first_name\\\":\\\"Mic\\\",\\\"email\\\":\\\"mcooledh@sciencedaily.com\\\",\\\"job\\\":\\\"Geologist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":487,\\\"first_name\\\":\\\"Wain\\\",\\\"email\\\":\\\"wtinklindi@deliciousdays.com\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":488,\\\"first_name\\\":\\\"Cody\\\",\\\"email\\\":\\\"cliledj@earthlink.net\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":489,\\\"first_name\\\":\\\"Clevie\\\",\\\"email\\\":\\\"cmcglauddk@virginia.edu\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":490,\\\"first_name\\\":\\\"Renato\\\",\\\"email\\\":\\\"rchildrensdl@wufoo.com\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":491,\\\"first_name\\\":\\\"Erna\\\",\\\"email\\\":\\\"ekleinbaumdm@weather.com\\\",\\\"job\\\":\\\"Accounting Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":492,\\\"first_name\\\":\\\"Kanya\\\",\\\"email\\\":\\\"kwimmsdn@vimeo.com\\\",\\\"job\\\":\\\"Biostatistician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":493,\\\"first_name\\\":\\\"Corilla\\\",\\\"email\\\":\\\"cgrobdo@infoseek.co.jp\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":494,\\\"first_name\\\":\\\"Shell\\\",\\\"email\\\":\\\"schardindp@i2i.jp\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":495,\\\"first_name\\\":\\\"Maury\\\",\\\"email\\\":\\\"msywelldq@rambler.ru\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":496,\\\"first_name\\\":\\\"Towny\\\",\\\"email\\\":\\\"tburwelldr@wix.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":497,\\\"first_name\\\":\\\"Giff\\\",\\\"email\\\":\\\"gbrenardds@fda.gov\\\",\\\"job\\\":\\\"Human Resources Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":498,\\\"first_name\\\":\\\"Maisey\\\",\\\"email\\\":\\\"mlanpheredt@slideshare.net\\\",\\\"job\\\":\\\"Assistant Media Planner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":499,\\\"first_name\\\":\\\"Ulrich\\\",\\\"email\\\":\\\"uwhiteleydu@jugem.jp\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":500,\\\"first_name\\\":\\\"Lisha\\\",\\\"email\\\":\\\"lcharvilledv@wufoo.com\\\",\\\"job\\\":\\\"Accounting Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":501,\\\"first_name\\\":\\\"Conrad\\\",\\\"email\\\":\\\"cmatteaudw@who.int\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":502,\\\"first_name\\\":\\\"Joaquin\\\",\\\"email\\\":\\\"jbrolechandx@homestead.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":503,\\\"first_name\\\":\\\"Gusti\\\",\\\"email\\\":\\\"gpinchbeckdy@gizmodo.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":504,\\\"first_name\\\":\\\"Flossi\\\",\\\"email\\\":\\\"fbrettonerdz@cargocollective.com\\\",\\\"job\\\":\\\"Administrative Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":505,\\\"first_name\\\":\\\"Rollo\\\",\\\"email\\\":\\\"rrougee0@cocolog-nifty.com\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":506,\\\"first_name\\\":\\\"Estella\\\",\\\"email\\\":\\\"ewallase1@businessinsider.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":507,\\\"first_name\\\":\\\"Karna\\\",\\\"email\\\":\\\"krobotthame2@cnn.com\\\",\\\"job\\\":\\\"Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":508,\\\"first_name\\\":\\\"Doroteya\\\",\\\"email\\\":\\\"dtinnere3@ameblo.jp\\\",\\\"job\\\":\\\"Assistant Media Planner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":509,\\\"first_name\\\":\\\"Esme\\\",\\\"email\\\":\\\"emarfelle4@google.cn\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":510,\\\"first_name\\\":\\\"Pren\\\",\\\"email\\\":\\\"ptuffelle5@artisteer.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":511,\\\"first_name\\\":\\\"Enos\\\",\\\"email\\\":\\\"ekarolyie6@cmu.edu\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":512,\\\"first_name\\\":\\\"Edvard\\\",\\\"email\\\":\\\"ebrindlee7@narod.ru\\\",\\\"job\\\":\\\"Account Representative II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":513,\\\"first_name\\\":\\\"Kimberlee\\\",\\\"email\\\":\\\"kguihene8@phoca.cz\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":514,\\\"first_name\\\":\\\"Ginevra\\\",\\\"email\\\":\\\"ghammeriche9@taobao.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":515,\\\"first_name\\\":\\\"Dolph\\\",\\\"email\\\":\\\"dmarquisea@webs.com\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":516,\\\"first_name\\\":\\\"Kienan\\\",\\\"email\\\":\\\"kgeareb@loc.gov\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":517,\\\"first_name\\\":\\\"Nonnah\\\",\\\"email\\\":\\\"nhenfreec@cpanel.net\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":518,\\\"first_name\\\":\\\"Yoko\\\",\\\"email\\\":\\\"ybarnetted@prweb.com\\\",\\\"job\\\":\\\"Account Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":519,\\\"first_name\\\":\\\"Sandor\\\",\\\"email\\\":\\\"sfaireyee@behance.net\\\",\\\"job\\\":\\\"Budget/Accounting Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":520,\\\"first_name\\\":\\\"Gretel\\\",\\\"email\\\":\\\"govendenef@xrea.com\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":521,\\\"first_name\\\":\\\"Karyn\\\",\\\"email\\\":\\\"kclaywortheg@ed.gov\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":522,\\\"first_name\\\":\\\"Darsie\\\",\\\"email\\\":\\\"ddeieh@cmu.edu\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":523,\\\"first_name\\\":\\\"Lynnell\\\",\\\"email\\\":\\\"lellerbeckei@lulu.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":524,\\\"first_name\\\":\\\"Cissiee\\\",\\\"email\\\":\\\"cbabonauej@admin.ch\\\",\\\"job\\\":\\\"Safety Technician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":525,\\\"first_name\\\":\\\"Natalina\\\",\\\"email\\\":\\\"npilipyakek@surveymonkey.com\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":526,\\\"first_name\\\":\\\"Kristien\\\",\\\"email\\\":\\\"ksangel@nsw.gov.au\\\",\\\"job\\\":\\\"Research Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":527,\\\"first_name\\\":\\\"Fred\\\",\\\"email\\\":\\\"ffealeyem@usgs.gov\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":528,\\\"first_name\\\":\\\"Tilda\\\",\\\"email\\\":\\\"tshelmerdineen@ask.com\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":529,\\\"first_name\\\":\\\"Kassandra\\\",\\\"email\\\":\\\"kburdikineo@macromedia.com\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":530,\\\"first_name\\\":\\\"Reginauld\\\",\\\"email\\\":\\\"rmittenep@samsung.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":531,\\\"first_name\\\":\\\"Elberta\\\",\\\"email\\\":\\\"egrosiereq@pen.io\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":532,\\\"first_name\\\":\\\"Ring\\\",\\\"email\\\":\\\"rtunnacliffeer@booking.com\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":533,\\\"first_name\\\":\\\"Eleanora\\\",\\\"email\\\":\\\"eflugeres@fema.gov\\\",\\\"job\\\":\\\"Database Administrator III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":534,\\\"first_name\\\":\\\"Duane\\\",\\\"email\\\":\\\"ddunmoreet@comcast.net\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":535,\\\"first_name\\\":\\\"Geoff\\\",\\\"email\\\":\\\"ggardnereu@ted.com\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":536,\\\"first_name\\\":\\\"Karrie\\\",\\\"email\\\":\\\"kquickev@xrea.com\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":537,\\\"first_name\\\":\\\"Erna\\\",\\\"email\\\":\\\"egaytonew@phoca.cz\\\",\\\"job\\\":\\\"Paralegal\\\"}\"}\n{ \"body\": \"{\\\"id\\\":538,\\\"first_name\\\":\\\"Diannne\\\",\\\"email\\\":\\\"draithbieex@chicagotribune.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":539,\\\"first_name\\\":\\\"Glenn\\\",\\\"email\\\":\\\"gfraneey@china.com.cn\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":540,\\\"first_name\\\":\\\"Claudio\\\",\\\"email\\\":\\\"chugliez@ifeng.com\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":541,\\\"first_name\\\":\\\"Addie\\\",\\\"email\\\":\\\"astanmerf0@sourceforge.net\\\",\\\"job\\\":\\\"Analyst Programmer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":542,\\\"first_name\\\":\\\"Katherina\\\",\\\"email\\\":\\\"kwenzelf1@goo.ne.jp\\\",\\\"job\\\":\\\"Administrative Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":543,\\\"first_name\\\":\\\"Shannon\\\",\\\"email\\\":\\\"sluskf2@engadget.com\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":544,\\\"first_name\\\":\\\"Hetty\\\",\\\"email\\\":\\\"hoxenburyf3@biglobe.ne.jp\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":545,\\\"first_name\\\":\\\"Jemimah\\\",\\\"email\\\":\\\"jdelgardillof4@theatlantic.com\\\",\\\"job\\\":\\\"Web Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":546,\\\"first_name\\\":\\\"Marlee\\\",\\\"email\\\":\\\"mconlaundf5@nifty.com\\\",\\\"job\\\":\\\"Account Representative I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":547,\\\"first_name\\\":\\\"Libbi\\\",\\\"email\\\":\\\"lkiftf6@etsy.com\\\",\\\"job\\\":\\\"Product Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":548,\\\"first_name\\\":\\\"Uta\\\",\\\"email\\\":\\\"usmithenf7@bluehost.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":549,\\\"first_name\\\":\\\"Barny\\\",\\\"email\\\":\\\"bibarraf8@freewebs.com\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":550,\\\"first_name\\\":\\\"Maximilianus\\\",\\\"email\\\":\\\"mconanf9@stumbleupon.com\\\",\\\"job\\\":\\\"Developer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":551,\\\"first_name\\\":\\\"Erinna\\\",\\\"email\\\":\\\"ecaskiefa@answers.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":552,\\\"first_name\\\":\\\"Brantley\\\",\\\"email\\\":\\\"barnsonfb@weebly.com\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":553,\\\"first_name\\\":\\\"Sharron\\\",\\\"email\\\":\\\"sfatherfc@hao123.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":554,\\\"first_name\\\":\\\"Andre\\\",\\\"email\\\":\\\"achapelhowfd@dagondesign.com\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":555,\\\"first_name\\\":\\\"Raimondo\\\",\\\"email\\\":\\\"rrapinfe@europa.eu\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":556,\\\"first_name\\\":\\\"Dwight\\\",\\\"email\\\":\\\"dduligallff@virginia.edu\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":557,\\\"first_name\\\":\\\"Jae\\\",\\\"email\\\":\\\"joswalfg@cdc.gov\\\",\\\"job\\\":\\\"Computer Systems Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":558,\\\"first_name\\\":\\\"Nels\\\",\\\"email\\\":\\\"nblindermannfh@sogou.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":559,\\\"first_name\\\":\\\"Natala\\\",\\\"email\\\":\\\"nseebrightfi@sciencedaily.com\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":560,\\\"first_name\\\":\\\"Ilaire\\\",\\\"email\\\":\\\"igiottofj@reverbnation.com\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":561,\\\"first_name\\\":\\\"Caron\\\",\\\"email\\\":\\\"ccarverhillfk@cdc.gov\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":562,\\\"first_name\\\":\\\"Jojo\\\",\\\"email\\\":\\\"jloomesfl@ftc.gov\\\",\\\"job\\\":\\\"Computer Systems Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":563,\\\"first_name\\\":\\\"Hollis\\\",\\\"email\\\":\\\"hcoultarfm@elegantthemes.com\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":564,\\\"first_name\\\":\\\"Charita\\\",\\\"email\\\":\\\"cracefn@sakura.ne.jp\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":565,\\\"first_name\\\":\\\"Norean\\\",\\\"email\\\":\\\"npenimanfo@last.fm\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":566,\\\"first_name\\\":\\\"Arv\\\",\\\"email\\\":\\\"asherrocksfp@ted.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":567,\\\"first_name\\\":\\\"Editha\\\",\\\"email\\\":\\\"emottersheadfq@webeden.co.uk\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":568,\\\"first_name\\\":\\\"Mallissa\\\",\\\"email\\\":\\\"mgreedyfr@nifty.com\\\",\\\"job\\\":\\\"Media Manager II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":569,\\\"first_name\\\":\\\"Sax\\\",\\\"email\\\":\\\"sbischoffs@networkadvertising.org\\\",\\\"job\\\":\\\"Database Administrator I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":570,\\\"first_name\\\":\\\"Arlin\\\",\\\"email\\\":\\\"ahowickft@yellowbook.com\\\",\\\"job\\\":\\\"Accountant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":571,\\\"first_name\\\":\\\"Corry\\\",\\\"email\\\":\\\"cportmanfu@gmpg.org\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":572,\\\"first_name\\\":\\\"Beau\\\",\\\"email\\\":\\\"bbettesworthfv@w3.org\\\",\\\"job\\\":\\\"Administrative Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":573,\\\"first_name\\\":\\\"Tersina\\\",\\\"email\\\":\\\"tandrellifw@people.com.cn\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":574,\\\"first_name\\\":\\\"Jacklyn\\\",\\\"email\\\":\\\"jpaeckmeyerfx@rambler.ru\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":575,\\\"first_name\\\":\\\"Robena\\\",\\\"email\\\":\\\"rlylefy@dropbox.com\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":576,\\\"first_name\\\":\\\"Jo\\\",\\\"email\\\":\\\"jdewfz@nasa.gov\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":577,\\\"first_name\\\":\\\"Carrol\\\",\\\"email\\\":\\\"cdrohang0@utexas.edu\\\",\\\"job\\\":\\\"Programmer Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":578,\\\"first_name\\\":\\\"Hedda\\\",\\\"email\\\":\\\"hberreclothg1@csmonitor.com\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":579,\\\"first_name\\\":\\\"Matty\\\",\\\"email\\\":\\\"mbullerg2@typepad.com\\\",\\\"job\\\":\\\"Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":580,\\\"first_name\\\":\\\"Tito\\\",\\\"email\\\":\\\"tlarkbyg3@google.com\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":581,\\\"first_name\\\":\\\"Culver\\\",\\\"email\\\":\\\"ccarloneg4@cbsnews.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":582,\\\"first_name\\\":\\\"Isidore\\\",\\\"email\\\":\\\"ipecholdg5@ebay.com\\\",\\\"job\\\":\\\"Office Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":583,\\\"first_name\\\":\\\"Babette\\\",\\\"email\\\":\\\"bspooleg6@issuu.com\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":584,\\\"first_name\\\":\\\"Aarika\\\",\\\"email\\\":\\\"ajeannenetg7@berkeley.edu\\\",\\\"job\\\":\\\"Systems Administrator IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":585,\\\"first_name\\\":\\\"Buddie\\\",\\\"email\\\":\\\"bveitchg8@cpanel.net\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":586,\\\"first_name\\\":\\\"Silvio\\\",\\\"email\\\":\\\"seslieg9@tumblr.com\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":587,\\\"first_name\\\":\\\"Alie\\\",\\\"email\\\":\\\"abehninckga@nymag.com\\\",\\\"job\\\":\\\"Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":588,\\\"first_name\\\":\\\"Fanny\\\",\\\"email\\\":\\\"flevenskygb@addthis.com\\\",\\\"job\\\":\\\"Database Administrator II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":589,\\\"first_name\\\":\\\"Rycca\\\",\\\"email\\\":\\\"rbeamesgc@live.com\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":590,\\\"first_name\\\":\\\"Oralla\\\",\\\"email\\\":\\\"ogriffithgd@adobe.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":591,\\\"first_name\\\":\\\"Cordelie\\\",\\\"email\\\":\\\"cbrattange@twitpic.com\\\",\\\"job\\\":\\\"Database Administrator I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":592,\\\"first_name\\\":\\\"Caryl\\\",\\\"email\\\":\\\"ctommasuzzigf@ow.ly\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":593,\\\"first_name\\\":\\\"Belle\\\",\\\"email\\\":\\\"bbouchergg@delicious.com\\\",\\\"job\\\":\\\"Geologist I\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":594,\\\"first_name\\\":\\\"Joshuah\\\",\\\"email\\\":\\\"jpiccopgh@nifty.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":595,\\\"first_name\\\":\\\"Dean\\\",\\\"email\\\":\\\"dmalyangi@deviantart.com\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":596,\\\"first_name\\\":\\\"Alberto\\\",\\\"email\\\":\\\"agatleygj@craigslist.org\\\",\\\"job\\\":\\\"Accountant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":597,\\\"first_name\\\":\\\"Emerson\\\",\\\"email\\\":\\\"eohalleghanegk@foxnews.com\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":598,\\\"first_name\\\":\\\"Jasmin\\\",\\\"email\\\":\\\"jcreebergl@photobucket.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":599,\\\"first_name\\\":\\\"Leslie\\\",\\\"email\\\":\\\"lyepiskovgm@simplemachines.org\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":600,\\\"first_name\\\":\\\"Kassie\\\",\\\"email\\\":\\\"kbantongn@abc.net.au\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":601,\\\"first_name\\\":\\\"Gottfried\\\",\\\"email\\\":\\\"gsummerlygo@google.com.hk\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":602,\\\"first_name\\\":\\\"Thaxter\\\",\\\"email\\\":\\\"tlandsburygp@myspace.com\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":603,\\\"first_name\\\":\\\"Celinka\\\",\\\"email\\\":\\\"churlgq@oakley.com\\\",\\\"job\\\":\\\"Statistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":604,\\\"first_name\\\":\\\"Alex\\\",\\\"email\\\":\\\"acanepegr@youtube.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":605,\\\"first_name\\\":\\\"Clerc\\\",\\\"email\\\":\\\"cwesthofergs@dion.ne.jp\\\",\\\"job\\\":\\\"Software Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":606,\\\"first_name\\\":\\\"Harwell\\\",\\\"email\\\":\\\"hlandmangt@marriott.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":607,\\\"first_name\\\":\\\"Lennard\\\",\\\"email\\\":\\\"lpillingtongu@bloglovin.com\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":608,\\\"first_name\\\":\\\"Nadia\\\",\\\"email\\\":\\\"nshewongv@marriott.com\\\",\\\"job\\\":\\\"Software Test Engineer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":609,\\\"first_name\\\":\\\"Felic\\\",\\\"email\\\":\\\"fkidsongw@photobucket.com\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":610,\\\"first_name\\\":\\\"Amandie\\\",\\\"email\\\":\\\"abarocgx@nps.gov\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":611,\\\"first_name\\\":\\\"Cosme\\\",\\\"email\\\":\\\"cfogtgy@umich.edu\\\",\\\"job\\\":\\\"Account Representative II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":612,\\\"first_name\\\":\\\"Roanne\\\",\\\"email\\\":\\\"rrobathamgz@jigsy.com\\\",\\\"job\\\":\\\"Health Coach IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":613,\\\"first_name\\\":\\\"Mar\\\",\\\"email\\\":\\\"mtilmouthh0@flavors.me\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":614,\\\"first_name\\\":\\\"Arabele\\\",\\\"email\\\":\\\"amcallisterh1@google.com.au\\\",\\\"job\\\":\\\"Staff Accountant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":615,\\\"first_name\\\":\\\"Kynthia\\\",\\\"email\\\":\\\"ktithecoteh2@princeton.edu\\\",\\\"job\\\":\\\"Engineer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":616,\\\"first_name\\\":\\\"Denys\\\",\\\"email\\\":\\\"dcarloneh3@123-reg.co.uk\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":617,\\\"first_name\\\":\\\"Christabella\\\",\\\"email\\\":\\\"csaggsh4@narod.ru\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":618,\\\"first_name\\\":\\\"Gay\\\",\\\"email\\\":\\\"ghawkeridgeh5@posterous.com\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":619,\\\"first_name\\\":\\\"Quincy\\\",\\\"email\\\":\\\"qlissimoreh6@nps.gov\\\",\\\"job\\\":\\\"Statistician IV\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":620,\\\"first_name\\\":\\\"Cody\\\",\\\"email\\\":\\\"cpontinh7@addthis.com\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":621,\\\"first_name\\\":\\\"Mildrid\\\",\\\"email\\\":\\\"mdomelowh8@edublogs.org\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":622,\\\"first_name\\\":\\\"Luciano\\\",\\\"email\\\":\\\"lzornh9@china.com.cn\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":623,\\\"first_name\\\":\\\"Carmita\\\",\\\"email\\\":\\\"criccardha@blogtalkradio.com\\\",\\\"job\\\":\\\"Systems Administrator III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":624,\\\"first_name\\\":\\\"Cally\\\",\\\"email\\\":\\\"cstainerhb@123-reg.co.uk\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":625,\\\"first_name\\\":\\\"Carmella\\\",\\\"email\\\":\\\"ckeaysellhc@amazon.co.uk\\\",\\\"job\\\":\\\"Biostatistician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":626,\\\"first_name\\\":\\\"Ernestus\\\",\\\"email\\\":\\\"erumboldhd@walmart.com\\\",\\\"job\\\":\\\"Systems Administrator II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":627,\\\"first_name\\\":\\\"Dena\\\",\\\"email\\\":\\\"dgrishechkinhe@sitemeter.com\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":628,\\\"first_name\\\":\\\"Karrie\\\",\\\"email\\\":\\\"kheldhf@chron.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":629,\\\"first_name\\\":\\\"Alfonso\\\",\\\"email\\\":\\\"aclinnickhg@i2i.jp\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":630,\\\"first_name\\\":\\\"Pren\\\",\\\"email\\\":\\\"pavannhh@digg.com\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":631,\\\"first_name\\\":\\\"Dulcy\\\",\\\"email\\\":\\\"dsallyhi@sciencedaily.com\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":632,\\\"first_name\\\":\\\"Chalmers\\\",\\\"email\\\":\\\"cjirsahj@theglobeandmail.com\\\",\\\"job\\\":\\\"Systems Administrator I\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":633,\\\"first_name\\\":\\\"Hoyt\\\",\\\"email\\\":\\\"hwhitesonhk@epa.gov\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": \"{\\\"id\\\":634,\\\"first_name\\\":\\\"Theressa\\\",\\\"email\\\":\\\"tstinchcombehl@wunderground.com\\\",\\\"job\\\":\\\"Web Developer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":635,\\\"first_name\\\":\\\"Tiertza\\\",\\\"email\\\":\\\"tcatterickhm@un.org\\\",\\\"job\\\":\\\"Safety Technician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":636,\\\"first_name\\\":\\\"Kathleen\\\",\\\"email\\\":\\\"kgreesonhn@i2i.jp\\\",\\\"job\\\":\\\"Assistant Media Planner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":637,\\\"first_name\\\":\\\"Alwin\\\",\\\"email\\\":\\\"athurbonho@fotki.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":638,\\\"first_name\\\":\\\"Linet\\\",\\\"email\\\":\\\"lbedenhamhp@so-net.ne.jp\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":639,\\\"first_name\\\":\\\"Hubie\\\",\\\"email\\\":\\\"hlivenshq@geocities.com\\\",\\\"job\\\":\\\"Chief Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":640,\\\"first_name\\\":\\\"Fergus\\\",\\\"email\\\":\\\"frablenhr@patch.com\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":641,\\\"first_name\\\":\\\"Rozelle\\\",\\\"email\\\":\\\"rcameratihs@amazon.co.jp\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":642,\\\"first_name\\\":\\\"Paco\\\",\\\"email\\\":\\\"pgaitunght@hc360.com\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":643,\\\"first_name\\\":\\\"Raymond\\\",\\\"email\\\":\\\"rkarlowiczhu@gnu.org\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":644,\\\"first_name\\\":\\\"Brody\\\",\\\"email\\\":\\\"bbaberhv@sfgate.com\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":645,\\\"first_name\\\":\\\"Forster\\\",\\\"email\\\":\\\"fhuncotehw@dmoz.org\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":646,\\\"first_name\\\":\\\"Anabelle\\\",\\\"email\\\":\\\"abenardettehx@gmpg.org\\\",\\\"job\\\":\\\"Information Systems Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":647,\\\"first_name\\\":\\\"Atlante\\\",\\\"email\\\":\\\"atabernerhy@istockphoto.com\\\",\\\"job\\\":\\\"Computer Systems Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":648,\\\"first_name\\\":\\\"Gretta\\\",\\\"email\\\":\\\"gcampeyhz@blinklist.com\\\",\\\"job\\\":\\\"Programmer Analyst II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":649,\\\"first_name\\\":\\\"Grethel\\\",\\\"email\\\":\\\"ggheeraerti0@fda.gov\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":650,\\\"first_name\\\":\\\"Meggy\\\",\\\"email\\\":\\\"mstringfellowi1@whitehouse.gov\\\",\\\"job\\\":\\\"Payment Adjustment Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":651,\\\"first_name\\\":\\\"Quintina\\\",\\\"email\\\":\\\"qondrichi2@marriott.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":652,\\\"first_name\\\":\\\"Dylan\\\",\\\"email\\\":\\\"drentenbecki3@netscape.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":653,\\\"first_name\\\":\\\"Donovan\\\",\\\"email\\\":\\\"dalcidei4@sciencedirect.com\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":654,\\\"first_name\\\":\\\"Rhiamon\\\",\\\"email\\\":\\\"rkenderi5@rediff.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":655,\\\"first_name\\\":\\\"Genovera\\\",\\\"email\\\":\\\"ggorvini6@trellian.com\\\",\\\"job\\\":\\\"Health Coach III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":656,\\\"first_name\\\":\\\"Warren\\\",\\\"email\\\":\\\"wsaccoi7@ask.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":657,\\\"first_name\\\":\\\"Marcellina\\\",\\\"email\\\":\\\"mkerini8@yale.edu\\\",\\\"job\\\":\\\"Programmer Analyst II\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":658,\\\"first_name\\\":\\\"Pennie\\\",\\\"email\\\":\\\"prolinsoni9@pen.io\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":659,\\\"first_name\\\":\\\"Padraic\\\",\\\"email\\\":\\\"pgrigoria@goo.ne.jp\\\",\\\"job\\\":\\\"Computer Systems Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":660,\\\"first_name\\\":\\\"Boonie\\\",\\\"email\\\":\\\"bboundsib@virginia.edu\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":661,\\\"first_name\\\":\\\"Sela\\\",\\\"email\\\":\\\"sdudmeshic@reverbnation.com\\\",\\\"job\\\":\\\"Media Manager IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":662,\\\"first_name\\\":\\\"Gunilla\\\",\\\"email\\\":\\\"ggissingid@china.com.cn\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":663,\\\"first_name\\\":\\\"Diannne\\\",\\\"email\\\":\\\"djirusie@epa.gov\\\",\\\"job\\\":\\\"General Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":664,\\\"first_name\\\":\\\"Cybill\\\",\\\"email\\\":\\\"cpennif@google.com.br\\\",\\\"job\\\":\\\"Paralegal\\\"}\"}\n{ \"body\": \"{\\\"id\\\":665,\\\"first_name\\\":\\\"Roland\\\",\\\"email\\\":\\\"rtrippackig@answers.com\\\",\\\"job\\\":\\\"Research Assistant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":666,\\\"first_name\\\":\\\"Van\\\",\\\"email\\\":\\\"vlyokhinih@slate.com\\\",\\\"job\\\":\\\"Automation Specialist I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":667,\\\"first_name\\\":\\\"Kevina\\\",\\\"email\\\":\\\"kprickettii@marriott.com\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":668,\\\"first_name\\\":\\\"Freemon\\\",\\\"email\\\":\\\"fmckinneyij@i2i.jp\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":669,\\\"first_name\\\":\\\"Kevin\\\",\\\"email\\\":\\\"kmaccorkellik@tripadvisor.com\\\",\\\"job\\\":\\\"Database Administrator IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":670,\\\"first_name\\\":\\\"Fayina\\\",\\\"email\\\":\\\"fbarneveldil@yahoo.co.jp\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":671,\\\"first_name\\\":\\\"Carmelle\\\",\\\"email\\\":\\\"cblakerim@bbc.co.uk\\\",\\\"job\\\":\\\"Business Systems Development Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":672,\\\"first_name\\\":\\\"Rosie\\\",\\\"email\\\":\\\"rmattheusin@tinyurl.com\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":673,\\\"first_name\\\":\\\"Sheryl\\\",\\\"email\\\":\\\"sbowdenio@home.pl\\\",\\\"job\\\":\\\"Human Resources Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":674,\\\"first_name\\\":\\\"Dannye\\\",\\\"email\\\":\\\"diddensip@cnbc.com\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":675,\\\"first_name\\\":\\\"Pasquale\\\",\\\"email\\\":\\\"pcolhouniq@vkontakte.ru\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":676,\\\"first_name\\\":\\\"Betta\\\",\\\"email\\\":\\\"bcaffreyir@prnewswire.com\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":677,\\\"first_name\\\":\\\"Carolan\\\",\\\"email\\\":\\\"cdickmanis@sciencedirect.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":678,\\\"first_name\\\":\\\"Olwen\\\",\\\"email\\\":\\\"okirkamit@mtv.com\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":679,\\\"first_name\\\":\\\"Marmaduke\\\",\\\"email\\\":\\\"myaneziu@lycos.com\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":680,\\\"first_name\\\":\\\"Elisha\\\",\\\"email\\\":\\\"edurtneliv@symantec.com\\\",\\\"job\\\":\\\"Health Coach IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":681,\\\"first_name\\\":\\\"Lind\\\",\\\"email\\\":\\\"lrediw@engadget.com\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":682,\\\"first_name\\\":\\\"Ruby\\\",\\\"email\\\":\\\"rhedgecockix@webeden.co.uk\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":683,\\\"first_name\\\":\\\"Arch\\\",\\\"email\\\":\\\"areynaultiy@networkadvertising.org\\\",\\\"job\\\":\\\"Compensation 
Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":684,\\\"first_name\\\":\\\"Garold\\\",\\\"email\\\":\\\"gcolthurstiz@cocolog-nifty.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":685,\\\"first_name\\\":\\\"Seward\\\",\\\"email\\\":\\\"sapplegatej0@miibeian.gov.cn\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":686,\\\"first_name\\\":\\\"Fionnula\\\",\\\"email\\\":\\\"fcunahj1@auda.org.au\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":687,\\\"first_name\\\":\\\"Clarke\\\",\\\"email\\\":\\\"ccambridgej2@dropbox.com\\\",\\\"job\\\":\\\"Account Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":688,\\\"first_name\\\":\\\"Therine\\\",\\\"email\\\":\\\"tjacombj3@nba.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":689,\\\"first_name\\\":\\\"Forest\\\",\\\"email\\\":\\\"fsearsj4@ed.gov\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":690,\\\"first_name\\\":\\\"Nicky\\\",\\\"email\\\":\\\"nruselinj5@tripadvisor.com\\\",\\\"job\\\":\\\"Software Test Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":691,\\\"first_name\\\":\\\"Jilly\\\",\\\"email\\\":\\\"jkegleyj6@privacy.gov.au\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":692,\\\"first_name\\\":\\\"Paulie\\\",\\\"email\\\":\\\"ppollicotej7@unicef.org\\\",\\\"job\\\":\\\"Nuclear Power Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":693,\\\"first_name\\\":\\\"Bette-ann\\\",\\\"email\\\":\\\"bselbiej8@blog.com\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":694,\\\"first_name\\\":\\\"Revkah\\\",\\\"email\\\":\\\"rgonsalvezj9@xrea.com\\\",\\\"job\\\":\\\"VP Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":695,\\\"first_name\\\":\\\"Warren\\\",\\\"email\\\":\\\"wearpeja@dot.gov\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":696,\\\"first_name\\\":\\\"Luce\\\",\\\"email\\\":\\\"ldousejb@deliciousdays.com\\\",\\\"job\\\":\\\"Staff Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":697,\\\"first_name\\\":\\\"Clem\\\",\\\"email\\\":\\\"cgablejc@hugedomains.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":698,\\\"first_name\\\":\\\"Sofie\\\",\\\"email\\\":\\\"sgoldfinchjd@gravatar.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":699,\\\"first_name\\\":\\\"Sheffy\\\",\\\"email\\\":\\\"sacklandsje@squidoo.com\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":700,\\\"first_name\\\":\\\"Aili\\\",\\\"email\\\":\\\"acastellettojf@aol.com\\\",\\\"job\\\":\\\"Accountant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":701,\\\"first_name\\\":\\\"Araldo\\\",\\\"email\\\":\\\"alippinijg@microsoft.com\\\",\\\"job\\\":\\\"Staff Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":702,\\\"first_name\\\":\\\"Devon\\\",\\\"email\\\":\\\"dsallerjh@sina.com.cn\\\",\\\"job\\\":\\\"Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":703,\\\"first_name\\\":\\\"Morgana\\\",\\\"email\\\":\\\"mharfordji@dyndns.org\\\",\\\"job\\\":\\\"Help Desk Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":704,\\\"first_name\\\":\\\"Genevra\\\",\\\"email\\\":\\\"gjubbjj@redcross.org\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":705,\\\"first_name\\\":\\\"Cam\\\",\\\"email\\\":\\\"cbewshirejk@arstechnica.com\\\",\\\"job\\\":\\\"Help Desk Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":706,\\\"first_name\\\":\\\"Aldon\\\",\\\"email\\\":\\\"akempejl@nih.gov\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":707,\\\"first_name\\\":\\\"Ferne\\\",\\\"email\\\":\\\"fdoellejm@github.io\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":708,\\\"first_name\\\":\\\"Aymer\\\",\\\"email\\\":\\\"aesbergerjn@issuu.com\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":709,\\\"first_name\\\":\\\"Jacynth\\\",\\\"email\\\":\\\"joddajo@ed.gov\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":710,\\\"first_name\\\":\\\"Gerrie\\\",\\\"email\\\":\\\"gwatmanjp@dailymail.co.uk\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":711,\\\"first_name\\\":\\\"Dagmar\\\",\\\"email\\\":\\\"doshevlinjq@webmd.com\\\",\\\"job\\\":\\\"Analyst Programmer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":712,\\\"first_name\\\":\\\"Lilia\\\",\\\"email\\\":\\\"lcastilljojr@wsj.com\\\",\\\"job\\\":\\\"Web Developer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":713,\\\"first_name\\\":\\\"Charyl\\\",\\\"email\\\":\\\"cmacilwrickjs@howstuffworks.com\\\",\\\"job\\\":\\\"Programmer Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":714,\\\"first_name\\\":\\\"Olivier\\\",\\\"email\\\":\\\"omoulsdalejt@ocn.ne.jp\\\",\\\"job\\\":\\\"Legal Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":715,\\\"first_name\\\":\\\"Sully\\\",\\\"email\\\":\\\"scourtju@shop-pro.jp\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":716,\\\"first_name\\\":\\\"Cathryn\\\",\\\"email\\\":\\\"celverstonejv@weather.com\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":717,\\\"first_name\\\":\\\"Rochette\\\",\\\"email\\\":\\\"rclemenzojw@wikipedia.org\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":718,\\\"first_name\\\":\\\"Jasen\\\",\\\"email\\\":\\\"jcorainijx@ox.ac.uk\\\",\\\"job\\\":\\\"Geologist IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":719,\\\"first_name\\\":\\\"Kalle\\\",\\\"email\\\":\\\"kgiacobazzijy@networkadvertising.org\\\",\\\"job\\\":\\\"Operator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":720,\\\"first_name\\\":\\\"Gaye\\\",\\\"email\\\":\\\"gmccalisterjz@gnu.org\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":721,\\\"first_name\\\":\\\"Hyacinthe\\\",\\\"email\\\":\\\"hcovillk0@constantcontact.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":722,\\\"first_name\\\":\\\"Vicky\\\",\\\"email\\\":\\\"vgibbingsk1@yellowbook.com\\\",\\\"job\\\":\\\"Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":723,\\\"first_name\\\":\\\"Ransell\\\",\\\"email\\\":\\\"rtreecek2@mozilla.com\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":724,\\\"first_name\\\":\\\"Arnoldo\\\",\\\"email\\\":\\\"ajerroltk3@scientificamerican.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":725,\\\"first_name\\\":\\\"Ailyn\\\",\\\"email\\\":\\\"aprendergastk4@networkadvertising.org\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":726,\\\"first_name\\\":\\\"Chrystel\\\",\\\"email\\\":\\\"chorbathk5@wix.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":727,\\\"first_name\\\":\\\"Billy\\\",\\\"email\\\":\\\"bboritk6@arstechnica.com\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":728,\\\"first_name\\\":\\\"Wendy\\\",\\\"email\\\":\\\"wborelandk7@hao123.com\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":729,\\\"first_name\\\":\\\"Verile\\\",\\\"email\\\":\\\"vcroadk8@rediff.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":730,\\\"first_name\\\":\\\"Abdul\\\",\\\"email\\\":\\\"acamblink9@paypal.com\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":731,\\\"first_name\\\":\\\"Thorsten\\\",\\\"email\\\":\\\"tturevilleka@hatena.ne.jp\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":732,\\\"first_name\\\":\\\"Gaby\\\",\\\"email\\\":\\\"gdovidaitiskb@prlog.org\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":733,\\\"first_name\\\":\\\"Harli\\\",\\\"email\\\":\\\"hcarstairskc@goo.gl\\\",\\\"job\\\":\\\"Media Manager I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":734,\\\"first_name\\\":\\\"Netta\\\",\\\"email\\\":\\\"nbrieretonkd@cornell.edu\\\",\\\"job\\\":\\\"Safety Technician II\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":735,\\\"first_name\\\":\\\"Dillie\\\",\\\"email\\\":\\\"dtrimbleke@ucla.edu\\\",\\\"job\\\":\\\"Geologist III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":736,\\\"first_name\\\":\\\"Nicoli\\\",\\\"email\\\":\\\"nbristerkf@dyndns.org\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":737,\\\"first_name\\\":\\\"Dasha\\\",\\\"email\\\":\\\"dtulleykg@t.co\\\",\\\"job\\\":\\\"Statistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":738,\\\"first_name\\\":\\\"Salomone\\\",\\\"email\\\":\\\"skindlesidekh@mashable.com\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":739,\\\"first_name\\\":\\\"Andras\\\",\\\"email\\\":\\\"acissonki@bbb.org\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":740,\\\"first_name\\\":\\\"Chaim\\\",\\\"email\\\":\\\"cbettinsonkj@lulu.com\\\",\\\"job\\\":\\\"Budget/Accounting Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":741,\\\"first_name\\\":\\\"Tammara\\\",\\\"email\\\":\\\"tallinkk@sfgate.com\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":742,\\\"first_name\\\":\\\"Leora\\\",\\\"email\\\":\\\"lkiffinkl@odnoklassniki.ru\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":743,\\\"first_name\\\":\\\"Emogene\\\",\\\"email\\\":\\\"ecoodekm@accuweather.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":744,\\\"first_name\\\":\\\"Hobart\\\",\\\"email\\\":\\\"hlarrattkn@discuz.net\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":745,\\\"first_name\\\":\\\"Daren\\\",\\\"email\\\":\\\"dnavarroko@answers.com\\\",\\\"job\\\":\\\"Biostatistician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":746,\\\"first_name\\\":\\\"Sondra\\\",\\\"email\\\":\\\"sroakekp@bigcartel.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":747,\\\"first_name\\\":\\\"Vinnie\\\",\\\"email\\\":\\\"vdullaghankq@guardian.co.uk\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":748,\\\"first_name\\\":\\\"Margie\\\",\\\"email\\\":\\\"mboomekr@nhs.uk\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":749,\\\"first_name\\\":\\\"Emma\\\",\\\"email\\\":\\\"evidgenks@meetup.com\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":750,\\\"first_name\\\":\\\"Marlo\\\",\\\"email\\\":\\\"mgrastyekt@baidu.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":751,\\\"first_name\\\":\\\"Jorry\\\",\\\"email\\\":\\\"jaldhouseku@icio.us\\\",\\\"job\\\":\\\"Research Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":752,\\\"first_name\\\":\\\"Hagen\\\",\\\"email\\\":\\\"hjorgensenkv@nbcnews.com\\\",\\\"job\\\":\\\"VP Product Management\\\"}\"}\n{ \"body\": \"{\\\"id\\\":753,\\\"first_name\\\":\\\"Jacinthe\\\",\\\"email\\\":\\\"jguwerkw@gizmodo.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":754,\\\"first_name\\\":\\\"Velvet\\\",\\\"email\\\":\\\"vwyantkx@google.co.uk\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":755,\\\"first_name\\\":\\\"Lilias\\\",\\\"email\\\":\\\"lkitleeky@bloglovin.com\\\",\\\"job\\\":\\\"Product Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":756,\\\"first_name\\\":\\\"Kacey\\\",\\\"email\\\":\\\"kdemcikkz@google.es\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":757,\\\"first_name\\\":\\\"Bearnard\\\",\\\"email\\\":\\\"bcordell0@wunderground.com\\\",\\\"job\\\":\\\"Chief Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":758,\\\"first_name\\\":\\\"Fanechka\\\",\\\"email\\\":\\\"fgosdinl1@house.gov\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":759,\\\"first_name\\\":\\\"Maryanna\\\",\\\"email\\\":\\\"mmaughanl2@dion.ne.jp\\\",\\\"job\\\":\\\"Human Resources Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":760,\\\"first_name\\\":\\\"Robinette\\\",\\\"email\\\":\\\"rhedditchl3@sohu.com\\\",\\\"job\\\":\\\"Systems Administrator II\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":761,\\\"first_name\\\":\\\"Aliza\\\",\\\"email\\\":\\\"adurwardl4@wikimedia.org\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":762,\\\"first_name\\\":\\\"Minni\\\",\\\"email\\\":\\\"mjedraszekl5@businesswire.com\\\",\\\"job\\\":\\\"Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":763,\\\"first_name\\\":\\\"Fitz\\\",\\\"email\\\":\\\"farnetl6@seesaa.net\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":764,\\\"first_name\\\":\\\"Priscella\\\",\\\"email\\\":\\\"pjaherl7@marriott.com\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":765,\\\"first_name\\\":\\\"Marja\\\",\\\"email\\\":\\\"mdingleyl8@rambler.ru\\\",\\\"job\\\":\\\"Engineer I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":766,\\\"first_name\\\":\\\"Franni\\\",\\\"email\\\":\\\"flafayettel9@wix.com\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":767,\\\"first_name\\\":\\\"Fayina\\\",\\\"email\\\":\\\"fduckla@scribd.com\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":768,\\\"first_name\\\":\\\"Maggi\\\",\\\"email\\\":\\\"mgriffittslb@deviantart.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":769,\\\"first_name\\\":\\\"Homere\\\",\\\"email\\\":\\\"hmanterfieldlc@howstuffworks.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":770,\\\"first_name\\\":\\\"Raoul\\\",\\\"email\\\":\\\"rwickeyld@google.fr\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":771,\\\"first_name\\\":\\\"Kermie\\\",\\\"email\\\":\\\"kskeermerle@tamu.edu\\\",\\\"job\\\":\\\"Accountant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":772,\\\"first_name\\\":\\\"Steffane\\\",\\\"email\\\":\\\"sbrendishlf@live.com\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":773,\\\"first_name\\\":\\\"Lisle\\\",\\\"email\\\":\\\"lhattlg@weibo.com\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":774,\\\"first_name\\\":\\\"Pearle\\\",\\\"email\\\":\\\"poakdenlh@cnbc.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":775,\\\"first_name\\\":\\\"Herold\\\",\\\"email\\\":\\\"hmusicoli@loc.gov\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":776,\\\"first_name\\\":\\\"Archambault\\\",\\\"email\\\":\\\"ahawkridgelj@sitemeter.com\\\",\\\"job\\\":\\\"Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":777,\\\"first_name\\\":\\\"Emmy\\\",\\\"email\\\":\\\"emandellk@japanpost.jp\\\",\\\"job\\\":\\\"Software Test Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":778,\\\"first_name\\\":\\\"Colas\\\",\\\"email\\\":\\\"cheavyll@sciencedirect.com\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":779,\\\"first_name\\\":\\\"Mikkel\\\",\\\"email\\\":\\\"mdrummerlm@amazon.co.jp\\\",\\\"job\\\":\\\"Systems Administrator IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":780,\\\"first_name\\\":\\\"Ewart\\\",\\\"email\\\":\\\"esurmeyerln@rambler.ru\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":781,\\\"first_name\\\":\\\"Jere\\\",\\\"email\\\":\\\"jcapslo@epa.gov\\\",\\\"job\\\":\\\"Office Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":782,\\\"first_name\\\":\\\"Ceil\\\",\\\"email\\\":\\\"cbygottlp@state.gov\\\",\\\"job\\\":\\\"Safety Technician II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":783,\\\"first_name\\\":\\\"Mabelle\\\",\\\"email\\\":\\\"mcornwalllq@xinhuanet.com\\\",\\\"job\\\":\\\"Chief Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":784,\\\"first_name\\\":\\\"Nolie\\\",\\\"email\\\":\\\"npirrilr@devhub.com\\\",\\\"job\\\":\\\"Media Manager III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":785,\\\"first_name\\\":\\\"Mara\\\",\\\"email\\\":\\\"mderwinls@spiegel.de\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":786,\\\"first_name\\\":\\\"Hill\\\",\\\"email\\\":\\\"hattwilllt@dedecms.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":787,\\\"first_name\\\":\\\"Ralina\\\",\\\"email\\\":\\\"rcloustonlu@reference.com\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":788,\\\"first_name\\\":\\\"Cy\\\",\\\"email\\\":\\\"cfehnerslv@ebay.com\\\",\\\"job\\\":\\\"Database Administrator I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":789,\\\"first_name\\\":\\\"Lind\\\",\\\"email\\\":\\\"ldargavellw@ft.com\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":790,\\\"first_name\\\":\\\"Zara\\\",\\\"email\\\":\\\"zpereslx@cmu.edu\\\",\\\"job\\\":\\\"Web Developer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":791,\\\"first_name\\\":\\\"Nonie\\\",\\\"email\\\":\\\"ntealely@independent.co.uk\\\",\\\"job\\\":\\\"Human Resources Assistant I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":792,\\\"first_name\\\":\\\"Gabriello\\\",\\\"email\\\":\\\"gcoetzeelz@archive.org\\\",\\\"job\\\":\\\"Senior Quality Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":793,\\\"first_name\\\":\\\"Brynna\\\",\\\"email\\\":\\\"bborrowsm0@flavors.me\\\",\\\"job\\\":\\\"Physical Therapy Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":794,\\\"first_name\\\":\\\"Guntar\\\",\\\"email\\\":\\\"gduerdenm1@icio.us\\\",\\\"job\\\":\\\"Account Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":795,\\\"first_name\\\":\\\"Boigie\\\",\\\"email\\\":\\\"battwaterm2@ed.gov\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":796,\\\"first_name\\\":\\\"Guinevere\\\",\\\"email\\\":\\\"gverrickm3@eventbrite.com\\\",\\\"job\\\":\\\"Speech Pathologist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":797,\\\"first_name\\\":\\\"Kimbra\\\",\\\"email\\\":\\\"krozalskim4@studiopress.com\\\",\\\"job\\\":\\\"Mechanical Systems Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":798,\\\"first_name\\\":\\\"Alisha\\\",\\\"email\\\":\\\"afeym5@wikispaces.com\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":799,\\\"first_name\\\":\\\"Tarrance\\\",\\\"email\\\":\\\"ttallboym6@skype.com\\\",\\\"job\\\":\\\"Software 
Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":800,\\\"first_name\\\":\\\"Neale\\\",\\\"email\\\":\\\"ndodingm7@auda.org.au\\\",\\\"job\\\":\\\"Geologist IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":801,\\\"first_name\\\":\\\"Jorge\\\",\\\"email\\\":\\\"jstearnsm8@princeton.edu\\\",\\\"job\\\":\\\"Database Administrator III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":802,\\\"first_name\\\":\\\"George\\\",\\\"email\\\":\\\"gdresserm9@nationalgeographic.com\\\",\\\"job\\\":\\\"Social Worker\\\"}\"}\n{ \"body\": \"{\\\"id\\\":803,\\\"first_name\\\":\\\"Susanne\\\",\\\"email\\\":\\\"sburridgema@fda.gov\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":804,\\\"first_name\\\":\\\"Giraud\\\",\\\"email\\\":\\\"gpaulsenmb@sphinn.com\\\",\\\"job\\\":\\\"Safety Technician II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":805,\\\"first_name\\\":\\\"Conni\\\",\\\"email\\\":\\\"ckeepingmc@mozilla.org\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":806,\\\"first_name\\\":\\\"Mirabella\\\",\\\"email\\\":\\\"mknapmanmd@constantcontact.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":807,\\\"first_name\\\":\\\"Joel\\\",\\\"email\\\":\\\"jmaccoleme@cdc.gov\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":808,\\\"first_name\\\":\\\"Merrily\\\",\\\"email\\\":\\\"mmussardmf@yandex.ru\\\",\\\"job\\\":\\\"Web Designer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":809,\\\"first_name\\\":\\\"Ermanno\\\",\\\"email\\\":\\\"ewinspiremg@gizmodo.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":810,\\\"first_name\\\":\\\"Fonsie\\\",\\\"email\\\":\\\"frieplmh@unc.edu\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":811,\\\"first_name\\\":\\\"Latrina\\\",\\\"email\\\":\\\"lbridgermi@house.gov\\\",\\\"job\\\":\\\"Compensation Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":812,\\\"first_name\\\":\\\"Jehanna\\\",\\\"email\\\":\\\"jmacaughtriemj@i2i.jp\\\",\\\"job\\\":\\\"Nurse 
Practicioner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":813,\\\"first_name\\\":\\\"Sancho\\\",\\\"email\\\":\\\"swhiskinmk@devhub.com\\\",\\\"job\\\":\\\"Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":814,\\\"first_name\\\":\\\"Addy\\\",\\\"email\\\":\\\"amatzenml@arstechnica.com\\\",\\\"job\\\":\\\"Paralegal\\\"}\"}\n{ \"body\": \"{\\\"id\\\":815,\\\"first_name\\\":\\\"Gav\\\",\\\"email\\\":\\\"ghowsegomm@geocities.com\\\",\\\"job\\\":\\\"Accountant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":816,\\\"first_name\\\":\\\"Thedric\\\",\\\"email\\\":\\\"tchealemn@360.cn\\\",\\\"job\\\":\\\"Business Systems Development Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":817,\\\"first_name\\\":\\\"Suellen\\\",\\\"email\\\":\\\"smoodycliffemo@tinyurl.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":818,\\\"first_name\\\":\\\"Andrei\\\",\\\"email\\\":\\\"ayarkermp@nydailynews.com\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":819,\\\"first_name\\\":\\\"Hermina\\\",\\\"email\\\":\\\"helderkinmq@howstuffworks.com\\\",\\\"job\\\":\\\"Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":820,\\\"first_name\\\":\\\"Skye\\\",\\\"email\\\":\\\"swellummr@imdb.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":821,\\\"first_name\\\":\\\"Frasquito\\\",\\\"email\\\":\\\"fdunkinsonms@washington.edu\\\",\\\"job\\\":\\\"Web Developer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":822,\\\"first_name\\\":\\\"Lin\\\",\\\"email\\\":\\\"ldarleymt@dion.ne.jp\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":823,\\\"first_name\\\":\\\"Lorettalorna\\\",\\\"email\\\":\\\"lfeyermu@quantcast.com\\\",\\\"job\\\":\\\"Chief Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":824,\\\"first_name\\\":\\\"Debera\\\",\\\"email\\\":\\\"dmellingmv@imdb.com\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":825,\\\"first_name\\\":\\\"Gardener\\\",\\\"email\\\":\\\"gmitchardmw@skyrock.com\\\",\\\"job\\\":\\\"Chemical 
Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":826,\\\"first_name\\\":\\\"Adora\\\",\\\"email\\\":\\\"abernadonmx@quantcast.com\\\",\\\"job\\\":\\\"Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":827,\\\"first_name\\\":\\\"Laural\\\",\\\"email\\\":\\\"lshilitomy@cargocollective.com\\\",\\\"job\\\":\\\"Executive Secretary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":828,\\\"first_name\\\":\\\"Daniel\\\",\\\"email\\\":\\\"dstantonmz@google.com.hk\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":829,\\\"first_name\\\":\\\"Tomasina\\\",\\\"email\\\":\\\"tmccawn0@howstuffworks.com\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":830,\\\"first_name\\\":\\\"Ferne\\\",\\\"email\\\":\\\"fgagern1@nih.gov\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":831,\\\"first_name\\\":\\\"Laural\\\",\\\"email\\\":\\\"lmaturan2@mysql.com\\\",\\\"job\\\":\\\"Media Manager III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":832,\\\"first_name\\\":\\\"Christoph\\\",\\\"email\\\":\\\"cgoldhilln3@com.com\\\",\\\"job\\\":\\\"Registered Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":833,\\\"first_name\\\":\\\"Julissa\\\",\\\"email\\\":\\\"jmcmorlandn4@simplemachines.org\\\",\\\"job\\\":\\\"Safety Technician II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":834,\\\"first_name\\\":\\\"Rosa\\\",\\\"email\\\":\\\"ryoungen5@sun.com\\\",\\\"job\\\":\\\"Geologist II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":835,\\\"first_name\\\":\\\"Paulo\\\",\\\"email\\\":\\\"pdallowayn6@unblog.fr\\\",\\\"job\\\":\\\"Office Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":836,\\\"first_name\\\":\\\"Aurlie\\\",\\\"email\\\":\\\"amoulesn7@squidoo.com\\\",\\\"job\\\":\\\"Computer Systems Analyst I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":837,\\\"first_name\\\":\\\"Rosanne\\\",\\\"email\\\":\\\"rbrixeyn8@va.gov\\\",\\\"job\\\":\\\"Legal Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":838,\\\"first_name\\\":\\\"Benedicto\\\",\\\"email\\\":\\\"bharcen9@irs.gov\\\",\\\"job\\\":\\\"Associate Professor\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":839,\\\"first_name\\\":\\\"Huntington\\\",\\\"email\\\":\\\"hshuttleworthna@answers.com\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":840,\\\"first_name\\\":\\\"Leupold\\\",\\\"email\\\":\\\"lcapounnb@bbb.org\\\",\\\"job\\\":\\\"Marketing Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":841,\\\"first_name\\\":\\\"Austine\\\",\\\"email\\\":\\\"adielhennnc@rediff.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":842,\\\"first_name\\\":\\\"Tristam\\\",\\\"email\\\":\\\"tfranceschellind@ibm.com\\\",\\\"job\\\":\\\"Clinical Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":843,\\\"first_name\\\":\\\"Marjie\\\",\\\"email\\\":\\\"mpendrene@reuters.com\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":844,\\\"first_name\\\":\\\"Barth\\\",\\\"email\\\":\\\"bskirlingnf@ezinearticles.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":845,\\\"first_name\\\":\\\"Zackariah\\\",\\\"email\\\":\\\"ztrippickng@canalblog.com\\\",\\\"job\\\":\\\"Account Representative III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":846,\\\"first_name\\\":\\\"Chandler\\\",\\\"email\\\":\\\"cdemorenanh@biglobe.ne.jp\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":847,\\\"first_name\\\":\\\"Grayce\\\",\\\"email\\\":\\\"gvidoni@taobao.com\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":848,\\\"first_name\\\":\\\"Ryann\\\",\\\"email\\\":\\\"rbrittennj@furl.net\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":849,\\\"first_name\\\":\\\"Findlay\\\",\\\"email\\\":\\\"fmabbittnk@youtube.com\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":850,\\\"first_name\\\":\\\"Remington\\\",\\\"email\\\":\\\"rdundonnl@nyu.edu\\\",\\\"job\\\":\\\"Health Coach I\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":851,\\\"first_name\\\":\\\"Raynell\\\",\\\"email\\\":\\\"rballardnm@dmoz.org\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":852,\\\"first_name\\\":\\\"Merrili\\\",\\\"email\\\":\\\"mciccettinn@quantcast.com\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":853,\\\"first_name\\\":\\\"Bogey\\\",\\\"email\\\":\\\"btargetterno@blogspot.com\\\",\\\"job\\\":\\\"VP Accounting\\\"}\"}\n{ \"body\": \"{\\\"id\\\":854,\\\"first_name\\\":\\\"Gordan\\\",\\\"email\\\":\\\"gkemmetnp@angelfire.com\\\",\\\"job\\\":\\\"Geological Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":855,\\\"first_name\\\":\\\"Kipp\\\",\\\"email\\\":\\\"kscotchbrooknq@posterous.com\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":856,\\\"first_name\\\":\\\"Brande\\\",\\\"email\\\":\\\"bparradyenr@home.pl\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":857,\\\"first_name\\\":\\\"Gabriella\\\",\\\"email\\\":\\\"gdegoixns@163.com\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":858,\\\"first_name\\\":\\\"Ruddy\\\",\\\"email\\\":\\\"rprysnt@ted.com\\\",\\\"job\\\":\\\"Sales Representative\\\"}\"}\n{ \"body\": \"{\\\"id\\\":859,\\\"first_name\\\":\\\"Fredrick\\\",\\\"email\\\":\\\"fmcnabbnu@buzzfeed.com\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":860,\\\"first_name\\\":\\\"Bertram\\\",\\\"email\\\":\\\"bnolinnv@hibu.com\\\",\\\"job\\\":\\\"Biostatistician IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":861,\\\"first_name\\\":\\\"Marylee\\\",\\\"email\\\":\\\"mlaidlernw@kickstarter.com\\\",\\\"job\\\":\\\"Office Assistant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":862,\\\"first_name\\\":\\\"Gerda\\\",\\\"email\\\":\\\"gsmiznx@live.com\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":863,\\\"first_name\\\":\\\"Theressa\\\",\\\"email\\\":\\\"tluneyny@statcounter.com\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":864,\\\"first_name\\\":\\\"Caron\\\",\\\"email\\\":\\\"ctraillnz@geocities.jp\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":865,\\\"first_name\\\":\\\"Jorie\\\",\\\"email\\\":\\\"jgreensideo0@purevolume.com\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":866,\\\"first_name\\\":\\\"Fleming\\\",\\\"email\\\":\\\"flinggoodo1@slate.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":867,\\\"first_name\\\":\\\"Kennan\\\",\\\"email\\\":\\\"kpinchino2@hubpages.com\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":868,\\\"first_name\\\":\\\"Cacilia\\\",\\\"email\\\":\\\"cfishbyo3@altervista.org\\\",\\\"job\\\":\\\"Environmental Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":869,\\\"first_name\\\":\\\"Filberte\\\",\\\"email\\\":\\\"feverwino4@weebly.com\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":870,\\\"first_name\\\":\\\"Tess\\\",\\\"email\\\":\\\"tkennewayo5@woothemes.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":871,\\\"first_name\\\":\\\"Vite\\\",\\\"email\\\":\\\"visakseno6@ovh.net\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":872,\\\"first_name\\\":\\\"Mauricio\\\",\\\"email\\\":\\\"mgrzelczako7@rediff.com\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":873,\\\"first_name\\\":\\\"Daven\\\",\\\"email\\\":\\\"dhaslewoodo8@ovh.net\\\",\\\"job\\\":\\\"Accounting Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":874,\\\"first_name\\\":\\\"Maddi\\\",\\\"email\\\":\\\"mmaskallo9@intel.com\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":875,\\\"first_name\\\":\\\"Amalea\\\",\\\"email\\\":\\\"aismayoa@apache.org\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":876,\\\"first_name\\\":\\\"Wren\\\",\\\"email\\\":\\\"wcoyob@discuz.net\\\",\\\"job\\\":\\\"Assistant Manager\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":877,\\\"first_name\\\":\\\"Kareem\\\",\\\"email\\\":\\\"kairetonoc@quantcast.com\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":878,\\\"first_name\\\":\\\"Felicdad\\\",\\\"email\\\":\\\"fbiddwellod@pinterest.com\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":879,\\\"first_name\\\":\\\"Hube\\\",\\\"email\\\":\\\"hmaniloveoe@behance.net\\\",\\\"job\\\":\\\"Senior Developer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":880,\\\"first_name\\\":\\\"Lowrance\\\",\\\"email\\\":\\\"lmabbittof@wiley.com\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":881,\\\"first_name\\\":\\\"Geraldine\\\",\\\"email\\\":\\\"gleirmonthog@jigsy.com\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":882,\\\"first_name\\\":\\\"Reese\\\",\\\"email\\\":\\\"rmathiesonoh@telegraph.co.uk\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":883,\\\"first_name\\\":\\\"Mariel\\\",\\\"email\\\":\\\"medinborooi@github.io\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":884,\\\"first_name\\\":\\\"Darnell\\\",\\\"email\\\":\\\"dgrzegoreckioj@umich.edu\\\",\\\"job\\\":\\\"Database Administrator I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":885,\\\"first_name\\\":\\\"Samson\\\",\\\"email\\\":\\\"spondeok@xing.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":886,\\\"first_name\\\":\\\"Marv\\\",\\\"email\\\":\\\"mgargettol@nature.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":887,\\\"first_name\\\":\\\"Desmond\\\",\\\"email\\\":\\\"dlazellom@goo.ne.jp\\\",\\\"job\\\":\\\"Legal Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":888,\\\"first_name\\\":\\\"Perren\\\",\\\"email\\\":\\\"preineron@list-manage.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":889,\\\"first_name\\\":\\\"Branden\\\",\\\"email\\\":\\\"blawteyoo@t.co\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":890,\\\"first_name\\\":\\\"Roy\\\",\\\"email\\\":\\\"rdiggesop@dagondesign.com\\\",\\\"job\\\":\\\"Web Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":891,\\\"first_name\\\":\\\"Stillman\\\",\\\"email\\\":\\\"sdarkinsoq@disqus.com\\\",\\\"job\\\":\\\"Nurse Practicioner\\\"}\"}\n{ \"body\": \"{\\\"id\\\":892,\\\"first_name\\\":\\\"Spense\\\",\\\"email\\\":\\\"solcotor@ezinearticles.com\\\",\\\"job\\\":\\\"Data Coordiator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":893,\\\"first_name\\\":\\\"Will\\\",\\\"email\\\":\\\"wisardos@kickstarter.com\\\",\\\"job\\\":\\\"Graphic Designer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":894,\\\"first_name\\\":\\\"Millie\\\",\\\"email\\\":\\\"mbuttwellot@prweb.com\\\",\\\"job\\\":\\\"Systems Administrator I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":895,\\\"first_name\\\":\\\"Rickie\\\",\\\"email\\\":\\\"rgogieou@paypal.com\\\",\\\"job\\\":\\\"Human Resources Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":896,\\\"first_name\\\":\\\"Ardene\\\",\\\"email\\\":\\\"aboristonov@princeton.edu\\\",\\\"job\\\":\\\"Systems Administrator II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":897,\\\"first_name\\\":\\\"Irwin\\\",\\\"email\\\":\\\"irentelllow@wix.com\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":898,\\\"first_name\\\":\\\"Padriac\\\",\\\"email\\\":\\\"pkremerox@reuters.com\\\",\\\"job\\\":\\\"Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":899,\\\"first_name\\\":\\\"Franciskus\\\",\\\"email\\\":\\\"fokelloy@ft.com\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":900,\\\"first_name\\\":\\\"Marillin\\\",\\\"email\\\":\\\"mmacclureoz@bing.com\\\",\\\"job\\\":\\\"Software Consultant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":901,\\\"first_name\\\":\\\"Eberhard\\\",\\\"email\\\":\\\"egrigoliisp0@1688.com\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":902,\\\"first_name\\\":\\\"Colin\\\",\\\"email\\\":\\\"chammerichp1@japanpost.jp\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":903,\\\"first_name\\\":\\\"Dorolice\\\",\\\"email\\\":\\\"dglovesp2@soup.io\\\",\\\"job\\\":\\\"Electrical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":904,\\\"first_name\\\":\\\"Matteo\\\",\\\"email\\\":\\\"mhickfordp3@google.com.au\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":905,\\\"first_name\\\":\\\"Abagail\\\",\\\"email\\\":\\\"asallierp4@typepad.com\\\",\\\"job\\\":\\\"Occupational Therapist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":906,\\\"first_name\\\":\\\"Ina\\\",\\\"email\\\":\\\"igeevep5@diigo.com\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":907,\\\"first_name\\\":\\\"Tommi\\\",\\\"email\\\":\\\"tbridgestockp6@nydailynews.com\\\",\\\"job\\\":\\\"Human Resources Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":908,\\\"first_name\\\":\\\"Hugo\\\",\\\"email\\\":\\\"hgregoracip7@t.co\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":909,\\\"first_name\\\":\\\"Nathanil\\\",\\\"email\\\":\\\"nmillsonp8@cocolog-nifty.com\\\",\\\"job\\\":\\\"Senior Developer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":910,\\\"first_name\\\":\\\"Inesita\\\",\\\"email\\\":\\\"isootp9@studiopress.com\\\",\\\"job\\\":\\\"Biostatistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":911,\\\"first_name\\\":\\\"Beatriz\\\",\\\"email\\\":\\\"bdmytrykpa@taobao.com\\\",\\\"job\\\":\\\"Research Assistant II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":912,\\\"first_name\\\":\\\"Brigit\\\",\\\"email\\\":\\\"btinnerpb@rediff.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":913,\\\"first_name\\\":\\\"Free\\\",\\\"email\\\":\\\"fkollachpc@narod.ru\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":914,\\\"first_name\\\":\\\"Barron\\\",\\\"email\\\":\\\"bklossmannpd@europa.eu\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":915,\\\"first_name\\\":\\\"Mommy\\\",\\\"email\\\":\\\"mskoggingspe@adobe.com\\\",\\\"job\\\":\\\"Software Engineer I\\\"}\"}\n{ 
\"body\": \"{\\\"id\\\":916,\\\"first_name\\\":\\\"Whittaker\\\",\\\"email\\\":\\\"wpanswickpf@amazon.com\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":917,\\\"first_name\\\":\\\"Clementina\\\",\\\"email\\\":\\\"cbradbornepg@live.com\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":918,\\\"first_name\\\":\\\"Feodor\\\",\\\"email\\\":\\\"fbodemeaidph@businessweek.com\\\",\\\"job\\\":\\\"Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":919,\\\"first_name\\\":\\\"Esteban\\\",\\\"email\\\":\\\"emacrurypi@mozilla.com\\\",\\\"job\\\":\\\"Human Resources Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":920,\\\"first_name\\\":\\\"Suzanne\\\",\\\"email\\\":\\\"sgotterpj@noaa.gov\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":921,\\\"first_name\\\":\\\"Sheila-kathryn\\\",\\\"email\\\":\\\"shubanpk@hhs.gov\\\",\\\"job\\\":\\\"Geologist IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":922,\\\"first_name\\\":\\\"Minette\\\",\\\"email\\\":\\\"mleakpl@nps.gov\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":923,\\\"first_name\\\":\\\"Jordana\\\",\\\"email\\\":\\\"jhousemanpm@aboutads.info\\\",\\\"job\\\":\\\"Geologist II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":924,\\\"first_name\\\":\\\"Izak\\\",\\\"email\\\":\\\"ibaloghpn@smh.com.au\\\",\\\"job\\\":\\\"Project Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":925,\\\"first_name\\\":\\\"Carita\\\",\\\"email\\\":\\\"cbeekepo@lulu.com\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":926,\\\"first_name\\\":\\\"Rowney\\\",\\\"email\\\":\\\"rgronoupp@blogtalkradio.com\\\",\\\"job\\\":\\\"Statistician III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":927,\\\"first_name\\\":\\\"Skipper\\\",\\\"email\\\":\\\"sraffonpq@prweb.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":928,\\\"first_name\\\":\\\"Bettine\\\",\\\"email\\\":\\\"briddioughpr@bloomberg.com\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":929,\\\"first_name\\\":\\\"Cherice\\\",\\\"email\\\":\\\"chovendenps@diigo.com\\\",\\\"job\\\":\\\"Desktop Support Technician\\\"}\"}\n{ \"body\": \"{\\\"id\\\":930,\\\"first_name\\\":\\\"Eb\\\",\\\"email\\\":\\\"ewoodcraftpt@jigsy.com\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":931,\\\"first_name\\\":\\\"Trixie\\\",\\\"email\\\":\\\"tscutchinpu@simplemachines.org\\\",\\\"job\\\":\\\"Recruiter\\\"}\"}\n{ \"body\": \"{\\\"id\\\":932,\\\"first_name\\\":\\\"Kattie\\\",\\\"email\\\":\\\"kaxtellpv@w3.org\\\",\\\"job\\\":\\\"Programmer Analyst III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":933,\\\"first_name\\\":\\\"Marnia\\\",\\\"email\\\":\\\"mwehnerrpw@technorati.com\\\",\\\"job\\\":\\\"Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":934,\\\"first_name\\\":\\\"Bessy\\\",\\\"email\\\":\\\"bwahnckepx@businessweek.com\\\",\\\"job\\\":\\\"Database Administrator IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":935,\\\"first_name\\\":\\\"Parry\\\",\\\"email\\\":\\\"pseyfartpy@techcrunch.com\\\",\\\"job\\\":\\\"Senior Sales Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":936,\\\"first_name\\\":\\\"Jonie\\\",\\\"email\\\":\\\"jsteptoepz@ask.com\\\",\\\"job\\\":\\\"Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":937,\\\"first_name\\\":\\\"Creight\\\",\\\"email\\\":\\\"cbutfieldq0@is.gd\\\",\\\"job\\\":\\\"Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":938,\\\"first_name\\\":\\\"Kendell\\\",\\\"email\\\":\\\"kkyrkemanq1@ted.com\\\",\\\"job\\\":\\\"Director of Sales\\\"}\"}\n{ \"body\": \"{\\\"id\\\":939,\\\"first_name\\\":\\\"Stanly\\\",\\\"email\\\":\\\"swherryq2@cdc.gov\\\",\\\"job\\\":\\\"Developer II\\\"}\"}\n{ \"body\": \"{\\\"id\\\":940,\\\"first_name\\\":\\\"Valerie\\\",\\\"email\\\":\\\"vramirezq3@ucla.edu\\\",\\\"job\\\":\\\"Technical Writer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":941,\\\"first_name\\\":\\\"Margalo\\\",\\\"email\\\":\\\"mspruceq4@nps.gov\\\",\\\"job\\\":\\\"Chemical Engineer\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":942,\\\"first_name\\\":\\\"Saundra\\\",\\\"email\\\":\\\"stretterq5@phoca.cz\\\",\\\"job\\\":\\\"Research Associate\\\"}\"}\n{ \"body\": \"{\\\"id\\\":943,\\\"first_name\\\":\\\"Jenda\\\",\\\"email\\\":\\\"jalexsandrowiczq6@hhs.gov\\\",\\\"job\\\":\\\"Senior Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":944,\\\"first_name\\\":\\\"Orazio\\\",\\\"email\\\":\\\"oelvyq7@odnoklassniki.ru\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":945,\\\"first_name\\\":\\\"Caitlin\\\",\\\"email\\\":\\\"cledgewayq8@infoseek.co.jp\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":946,\\\"first_name\\\":\\\"Hobard\\\",\\\"email\\\":\\\"htomkowiczq9@intel.com\\\",\\\"job\\\":\\\"Teacher\\\"}\"}\n{ \"body\": \"{\\\"id\\\":947,\\\"first_name\\\":\\\"Vitia\\\",\\\"email\\\":\\\"vgaviniqa@ezinearticles.com\\\",\\\"job\\\":\\\"Structural Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":948,\\\"first_name\\\":\\\"Karissa\\\",\\\"email\\\":\\\"klannonqb@studiopress.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":949,\\\"first_name\\\":\\\"Beverley\\\",\\\"email\\\":\\\"bshreveqc@go.com\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":950,\\\"first_name\\\":\\\"Lisette\\\",\\\"email\\\":\\\"lcasebourneqd@4shared.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":951,\\\"first_name\\\":\\\"Ashil\\\",\\\"email\\\":\\\"akonkeqe@admin.ch\\\",\\\"job\\\":\\\"Software Engineer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":952,\\\"first_name\\\":\\\"Lauraine\\\",\\\"email\\\":\\\"lbleakleyqf@xing.com\\\",\\\"job\\\":\\\"Statistician IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":953,\\\"first_name\\\":\\\"Timothea\\\",\\\"email\\\":\\\"tpetfordqg@icq.com\\\",\\\"job\\\":\\\"Accountant III\\\"}\"}\n{ \"body\": \"{\\\"id\\\":954,\\\"first_name\\\":\\\"Ancell\\\",\\\"email\\\":\\\"aabbittqh@craigslist.org\\\",\\\"job\\\":\\\"Programmer III\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":955,\\\"first_name\\\":\\\"Jarid\\\",\\\"email\\\":\\\"jhardwareqi@spotify.com\\\",\\\"job\\\":\\\"Financial Advisor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":956,\\\"first_name\\\":\\\"Sheff\\\",\\\"email\\\":\\\"sbwyqj@vkontakte.ru\\\",\\\"job\\\":\\\"Actuary\\\"}\"}\n{ \"body\": \"{\\\"id\\\":957,\\\"first_name\\\":\\\"Archie\\\",\\\"email\\\":\\\"abassoqk@google.com.br\\\",\\\"job\\\":\\\"Assistant Professor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":958,\\\"first_name\\\":\\\"Ber\\\",\\\"email\\\":\\\"bspargoql@thetimes.co.uk\\\",\\\"job\\\":\\\"Analog Circuit Design manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":959,\\\"first_name\\\":\\\"Josefa\\\",\\\"email\\\":\\\"jhuffyqm@blog.com\\\",\\\"job\\\":\\\"Dental Hygienist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":960,\\\"first_name\\\":\\\"Rivalee\\\",\\\"email\\\":\\\"rknowlmanqn@domainmarket.com\\\",\\\"job\\\":\\\"GIS Technical Architect\\\"}\"}\n{ \"body\": \"{\\\"id\\\":961,\\\"first_name\\\":\\\"Kean\\\",\\\"email\\\":\\\"kbegginiqo@eventbrite.com\\\",\\\"job\\\":\\\"Programmer IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":962,\\\"first_name\\\":\\\"Jacklin\\\",\\\"email\\\":\\\"jlaxtonqp@yandex.ru\\\",\\\"job\\\":\\\"Structural Analysis Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":963,\\\"first_name\\\":\\\"Lynda\\\",\\\"email\\\":\\\"ldeluzeqq@blogger.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":964,\\\"first_name\\\":\\\"Kaile\\\",\\\"email\\\":\\\"kjefferdqr@shareasale.com\\\",\\\"job\\\":\\\"Administrative Officer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":965,\\\"first_name\\\":\\\"Tamar\\\",\\\"email\\\":\\\"tjoreauqs@nature.com\\\",\\\"job\\\":\\\"Office Assistant IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":966,\\\"first_name\\\":\\\"Reg\\\",\\\"email\\\":\\\"rcorssqt@uol.com.br\\\",\\\"job\\\":\\\"Account Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":967,\\\"first_name\\\":\\\"Alastair\\\",\\\"email\\\":\\\"abranneyqu@ustream.tv\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":968,\\\"first_name\\\":\\\"Astrix\\\",\\\"email\\\":\\\"acushqv@liveinternet.ru\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":969,\\\"first_name\\\":\\\"Brendan\\\",\\\"email\\\":\\\"branceqw@oaic.gov.au\\\",\\\"job\\\":\\\"Food Chemist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":970,\\\"first_name\\\":\\\"Rosita\\\",\\\"email\\\":\\\"rminchellaqx@indiatimes.com\\\",\\\"job\\\":\\\"Junior Executive\\\"}\"}\n{ \"body\": \"{\\\"id\\\":971,\\\"first_name\\\":\\\"Alexina\\\",\\\"email\\\":\\\"acurrmqy@1und1.de\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":972,\\\"first_name\\\":\\\"Jeanna\\\",\\\"email\\\":\\\"jdawneyqz@nba.com\\\",\\\"job\\\":\\\"Community Outreach Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":973,\\\"first_name\\\":\\\"Donavon\\\",\\\"email\\\":\\\"dvogeler0@cam.ac.uk\\\",\\\"job\\\":\\\"Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":974,\\\"first_name\\\":\\\"Salim\\\",\\\"email\\\":\\\"smilberryr1@amazon.co.jp\\\",\\\"job\\\":\\\"Payment Adjustment Coordinator\\\"}\"}\n{ \"body\": \"{\\\"id\\\":975,\\\"first_name\\\":\\\"Theo\\\",\\\"email\\\":\\\"trosendorfr2@illinois.edu\\\",\\\"job\\\":\\\"Senior Developer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":976,\\\"first_name\\\":\\\"Ford\\\",\\\"email\\\":\\\"fmachansr3@deliciousdays.com\\\",\\\"job\\\":\\\"Research Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":977,\\\"first_name\\\":\\\"Ernesto\\\",\\\"email\\\":\\\"eternaultr4@hp.com\\\",\\\"job\\\":\\\"Cost Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":978,\\\"first_name\\\":\\\"Travis\\\",\\\"email\\\":\\\"tcloneyr5@jigsy.com\\\",\\\"job\\\":\\\"Recruiting Manager\\\"}\"}\n{ \"body\": \"{\\\"id\\\":979,\\\"first_name\\\":\\\"Tynan\\\",\\\"email\\\":\\\"tcreusr6@alibaba.com\\\",\\\"job\\\":\\\"Design Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":980,\\\"first_name\\\":\\\"Samuele\\\",\\\"email\\\":\\\"shumbertr7@indiatimes.com\\\",\\\"job\\\":\\\"Software Engineer II\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":981,\\\"first_name\\\":\\\"Molli\\\",\\\"email\\\":\\\"mbenboughr8@bravesites.com\\\",\\\"job\\\":\\\"Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":982,\\\"first_name\\\":\\\"Sampson\\\",\\\"email\\\":\\\"scrasswellr9@storify.com\\\",\\\"job\\\":\\\"Marketing Assistant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":983,\\\"first_name\\\":\\\"Jerad\\\",\\\"email\\\":\\\"jdacksra@bizjournals.com\\\",\\\"job\\\":\\\"Systems Administrator IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":984,\\\"first_name\\\":\\\"Marcelline\\\",\\\"email\\\":\\\"mvenmorerb@t.co\\\",\\\"job\\\":\\\"Senior Editor\\\"}\"}\n{ \"body\": \"{\\\"id\\\":985,\\\"first_name\\\":\\\"Beryle\\\",\\\"email\\\":\\\"bemersonrc@people.com.cn\\\",\\\"job\\\":\\\"Librarian\\\"}\"}\n{ \"body\": \"{\\\"id\\\":986,\\\"first_name\\\":\\\"Rosemary\\\",\\\"email\\\":\\\"rmeddickrd@apple.com\\\",\\\"job\\\":\\\"Environmental Tech\\\"}\"}\n{ \"body\": \"{\\\"id\\\":987,\\\"first_name\\\":\\\"Lars\\\",\\\"email\\\":\\\"lgillbardre@gov.uk\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":988,\\\"first_name\\\":\\\"Barnaby\\\",\\\"email\\\":\\\"bgrishukovrf@geocities.jp\\\",\\\"job\\\":\\\"Tax Accountant\\\"}\"}\n{ \"body\": \"{\\\"id\\\":989,\\\"first_name\\\":\\\"Staci\\\",\\\"email\\\":\\\"sloryrg@pcworld.com\\\",\\\"job\\\":\\\"Civil Engineer\\\"}\"}\n{ \"body\": \"{\\\"id\\\":990,\\\"first_name\\\":\\\"Vassily\\\",\\\"email\\\":\\\"vfarfullrh@51.la\\\",\\\"job\\\":\\\"VP Marketing\\\"}\"}\n{ \"body\": \"{\\\"id\\\":991,\\\"first_name\\\":\\\"Robbert\\\",\\\"email\\\":\\\"rpinckstoneri@unblog.fr\\\",\\\"job\\\":\\\"Pharmacist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":992,\\\"first_name\\\":\\\"Simeon\\\",\\\"email\\\":\\\"shrishchenkorj@gizmodo.com\\\",\\\"job\\\":\\\"Nurse\\\"}\"}\n{ \"body\": \"{\\\"id\\\":993,\\\"first_name\\\":\\\"Silvan\\\",\\\"email\\\":\\\"slinkierk@elpais.com\\\",\\\"job\\\":\\\"Engineer III\\\"}\"}\n{ \"body\": 
\"{\\\"id\\\":994,\\\"first_name\\\":\\\"Doralin\\\",\\\"email\\\":\\\"dfinbyrl@xing.com\\\",\\\"job\\\":\\\"Statistician I\\\"}\"}\n{ \"body\": \"{\\\"id\\\":995,\\\"first_name\\\":\\\"Katine\\\",\\\"email\\\":\\\"kgilmartinrm@ezinearticles.com\\\",\\\"job\\\":\\\"Systems Administrator IV\\\"}\"}\n{ \"body\": \"{\\\"id\\\":996,\\\"first_name\\\":\\\"Deanne\\\",\\\"email\\\":\\\"drentonrn@example.com\\\",\\\"job\\\":\\\"Quality Control Specialist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":997,\\\"first_name\\\":\\\"Abdul\\\",\\\"email\\\":\\\"amccreeryro@tiny.cc\\\",\\\"job\\\":\\\"VP Quality Control\\\"}\"}\n{ \"body\": \"{\\\"id\\\":998,\\\"first_name\\\":\\\"Lalo\\\",\\\"email\\\":\\\"ljeevesrp@furl.net\\\",\\\"job\\\":\\\"Senior Financial Analyst\\\"}\"}\n{ \"body\": \"{\\\"id\\\":999,\\\"first_name\\\":\\\"Randal\\\",\\\"email\\\":\\\"rhancellrq@instagram.com\\\",\\\"job\\\":\\\"Staff Scientist\\\"}\"}\n{ \"body\": \"{\\\"id\\\":1000,\\\"first_name\\\":\\\"Ramsay\\\",\\\"email\\\":\\\"rprujeanrr@whitehouse.gov\\\",\\\"job\\\":\\\"Internal Auditor\\\"}\"}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/benches/data/bench_data_light_transform.json",
    "content": "{\"id\":1,\"first_name\":\"Alia\",\"email\":\"aingleston0@twitpic.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-01-25T09:26:29Z\"}\n{\"id\":2,\"first_name\":\"Erl\",\"email\":\"ebegwell1@google.com.br\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-04-21T23:08:59Z\"}\n{\"id\":3,\"first_name\":\"Drona\",\"email\":\"dranyell2@ehow.com\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-10-20T03:43:51Z\"}\n{\"id\":4,\"first_name\":\"Jackie\",\"email\":\"jkingsley3@squidoo.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-02-02T21:54:48Z\"}\n{\"id\":5,\"first_name\":\"Ginny\",\"email\":\"glangman4@hud.gov\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-07-08T22:55:59Z\"}\n{\"id\":6,\"first_name\":\"Lorenzo\",\"email\":\"ltempleman5@pen.io\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-07-08T09:29:57Z\"}\n{\"id\":7,\"first_name\":\"Kyle\",\"email\":\"kkundt6@soup.io\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-04-04T09:20:18Z\"}\n{\"id\":8,\"first_name\":\"Miof mela\",\"email\":\"mcamelin7@github.io\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-04-10T17:52:47Z\"}\n{\"id\":9,\"first_name\":\"Shelden\",\"email\":\"ssarson8@networkadvertising.org\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2022-02-21T04:11:00Z\"}\n{\"id\":10,\"first_name\":\"Evey\",\"email\":\"estrang9@hostgator.com\",\"job\":\"Programmer II\",\"timestamp\":\"2022-09-21T02:54:44Z\"}\n{\"id\":11,\"first_name\":\"Dav\",\"email\":\"davrasina@trellian.com\",\"job\":\"Accounting Assistant I\",\"timestamp\":\"2022-11-10T13:00:59Z\"}\n{\"id\":12,\"first_name\":\"Ignacio\",\"email\":\"iduhamelb@desdev.cn\",\"job\":\"Research Associate\",\"timestamp\":\"2022-01-09T14:19:37Z\"}\n{\"id\":13,\"first_name\":\"Lottie\",\"email\":\"lfouchc@amazon.co.uk\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-01-17T07:07:32Z\"}\n{\"id\":14,\"first_name\":\"Moira\",\"email\":\"menrigod@narod.ru\",\"job\":\"Software Test Engineer 
IV\",\"timestamp\":\"2022-05-28T04:42:48Z\"}\n{\"id\":15,\"first_name\":\"Jori\",\"email\":\"jeverille@ed.gov\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-02-03T15:27:09Z\"}\n{\"id\":16,\"first_name\":\"Markos\",\"email\":\"mpostansf@4shared.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-05-15T15:57:34Z\"}\n{\"id\":17,\"first_name\":\"Bryana\",\"email\":\"bpokerg@printfriendly.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-04-21T13:20:12Z\"}\n{\"id\":18,\"first_name\":\"Reiko\",\"email\":\"rtunsleyh@arstechnica.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-10-30T21:27:31Z\"}\n{\"id\":19,\"first_name\":\"Dedie\",\"email\":\"dcouttsi@alibaba.com\",\"job\":\"Human Resources Assistant II\",\"timestamp\":\"2022-10-17T03:12:12Z\"}\n{\"id\":20,\"first_name\":\"Sigfrid\",\"email\":\"sfriattj@google.ru\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-06-06T22:56:02Z\"}\n{\"id\":21,\"first_name\":\"Sheilah\",\"email\":\"stuitek@baidu.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-08-04T05:08:19Z\"}\n{\"id\":22,\"first_name\":\"Colan\",\"email\":\"cbeardselll@drupal.org\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-07-11T05:12:49Z\"}\n{\"id\":23,\"first_name\":\"Loise\",\"email\":\"lminifiem@whitehouse.gov\",\"job\":\"Research Assistant IV\",\"timestamp\":\"2022-03-31T17:55:15Z\"}\n{\"id\":24,\"first_name\":\"Imogen\",\"email\":\"imckelveyn@hibu.com\",\"job\":\"Accountant I\",\"timestamp\":\"2022-06-19T23:39:31Z\"}\n{\"id\":25,\"first_name\":\"Richy\",\"email\":\"rcoultharto@mozilla.com\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-05-01T02:00:34Z\"}\n{\"id\":26,\"first_name\":\"Henrieta\",\"email\":\"hkermittp@huffingtonpost.com\",\"job\":\"Engineer III\",\"timestamp\":\"2022-01-24T14:18:00Z\"}\n{\"id\":27,\"first_name\":\"Matty\",\"email\":\"msawoodq@goodreads.com\",\"job\":\"Payment Adjustment 
Coordinator\",\"timestamp\":\"2022-01-03T14:34:46Z\"}\n{\"id\":28,\"first_name\":\"Lane\",\"email\":\"ltownsleyr@ustream.tv\",\"job\":\"General Manager\",\"timestamp\":\"2022-06-06T22:51:21Z\"}\n{\"id\":29,\"first_name\":\"Matias\",\"email\":\"mbangss@dagondesign.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-09-05T18:38:43Z\"}\n{\"id\":30,\"first_name\":\"Nita\",\"email\":\"nmcsheat@flickr.com\",\"job\":\"Professor\",\"timestamp\":\"2022-10-06T06:00:26Z\"}\n{\"id\":31,\"first_name\":\"Paul\",\"email\":\"pmotherwellu@google.ru\",\"job\":\"Sales Associate\",\"timestamp\":\"2021-12-13T05:11:20Z\"}\n{\"id\":32,\"first_name\":\"Hercules\",\"email\":\"hdeattav@jimdo.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-10-15T23:26:33Z\"}\n{\"id\":33,\"first_name\":\"Beckie\",\"email\":\"bcorradiniw@flickr.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-09-06T19:00:34Z\"}\n{\"id\":34,\"first_name\":\"Roldan\",\"email\":\"rvannix@ftc.gov\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-12-04T17:22:04Z\"}\n{\"id\":35,\"first_name\":\"Garwin\",\"email\":\"gprucknery@dagondesign.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-01-20T19:46:44Z\"}\n{\"id\":36,\"first_name\":\"Sarine\",\"email\":\"sfrantzenz@answers.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-10-23T21:34:29Z\"}\n{\"id\":37,\"first_name\":\"Darby\",\"email\":\"dberthot10@ocn.ne.jp\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-11-29T04:04:16Z\"}\n{\"id\":38,\"first_name\":\"Virgil\",\"email\":\"vpeltzer11@bloglovin.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-05-30T03:34:01Z\"}\n{\"id\":39,\"first_name\":\"Jennette\",\"email\":\"jrenney12@businessinsider.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-07-19T21:12:26Z\"}\n{\"id\":40,\"first_name\":\"Marylee\",\"email\":\"mbedell13@senate.gov\",\"job\":\"VP Sales\",\"timestamp\":\"2022-10-08T00:05:14Z\"}\n{\"id\":41,\"first_name\":\"Randi\",\"email\":\"racedo14@nymag.com\",\"job\":\"Statistician 
III\",\"timestamp\":\"2022-01-09T05:52:30Z\"}\n{\"id\":42,\"first_name\":\"Bertrand\",\"email\":\"bloxly15@bluehost.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-11-05T12:33:19Z\"}\n{\"id\":43,\"first_name\":\"Maddy\",\"email\":\"mscathard16@cyberchimps.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-01-25T10:18:29Z\"}\n{\"id\":44,\"first_name\":\"Gayla\",\"email\":\"glidgate17@mediafire.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2021-12-24T18:53:23Z\"}\n{\"id\":45,\"first_name\":\"Omero\",\"email\":\"omaxstead18@gravatar.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-09-16T18:24:37Z\"}\n{\"id\":46,\"first_name\":\"Jaimie\",\"email\":\"jtalby19@yahoo.co.jp\",\"job\":\"Research Associate\",\"timestamp\":\"2022-09-18T08:51:35Z\"}\n{\"id\":47,\"first_name\":\"Vonni\",\"email\":\"vpude1a@drupal.org\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-05-16T09:53:18Z\"}\n{\"id\":48,\"first_name\":\"Nikaniki\",\"email\":\"nsurmeyers1b@accuweather.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-01-22T12:22:46Z\"}\n{\"id\":49,\"first_name\":\"Colin\",\"email\":\"cphuprate1c@reference.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-06-02T09:18:14Z\"}\n{\"id\":50,\"first_name\":\"Vevay\",\"email\":\"vlipson1d@illinois.edu\",\"job\":\"Biostatistician II\",\"timestamp\":\"2022-06-02T08:34:03Z\"}\n{\"id\":51,\"first_name\":\"Maudie\",\"email\":\"mluckcock1e@behance.net\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-04-20T12:44:22Z\"}\n{\"id\":52,\"first_name\":\"Raymund\",\"email\":\"rlewnden1f@tripadvisor.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-08-13T23:17:51Z\"}\n{\"id\":53,\"first_name\":\"Leonelle\",\"email\":\"lwellum1g@buzzfeed.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-02-24T05:57:17Z\"}\n{\"id\":54,\"first_name\":\"Merrill\",\"email\":\"mdominico1h@netlog.com\",\"job\":\"Nuclear Power 
Engineer\",\"timestamp\":\"2022-02-13T16:40:32Z\"}\n{\"id\":55,\"first_name\":\"Maura\",\"email\":\"mjarman1i@ucoz.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-08-02T04:15:09Z\"}\n{\"id\":56,\"first_name\":\"Archambault\",\"email\":\"aalcorn1j@posterous.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-02-07T03:03:27Z\"}\n{\"id\":57,\"first_name\":\"Murray\",\"email\":\"mwharfe1k@bloglovin.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-11-17T11:46:01Z\"}\n{\"id\":58,\"first_name\":\"Fawne\",\"email\":\"froston1l@bloglines.com\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-03-27T06:42:55Z\"}\n{\"id\":59,\"first_name\":\"Trudey\",\"email\":\"tberinger1m@github.io\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-03-05T15:02:41Z\"}\n{\"id\":60,\"first_name\":\"Mureil\",\"email\":\"malloway1n@purevolume.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-09-05T23:29:03Z\"}\n{\"id\":61,\"first_name\":\"Norine\",\"email\":\"npennetta1o@bloglovin.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-02-08T11:56:03Z\"}\n{\"id\":62,\"first_name\":\"Lucinda\",\"email\":\"ldemetz1p@tripadvisor.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-03-07T23:09:35Z\"}\n{\"id\":63,\"first_name\":\"Hulda\",\"email\":\"hhaville1q@mashable.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-01-22T07:03:36Z\"}\n{\"id\":64,\"first_name\":\"Valenka\",\"email\":\"vtorpie1r@netscape.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-01-13T11:26:42Z\"}\n{\"id\":65,\"first_name\":\"Farleigh\",\"email\":\"fdantoni1s@mayoclinic.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-03-27T14:07:23Z\"}\n{\"id\":66,\"first_name\":\"Demetra\",\"email\":\"dtabourier1t@nytimes.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-06-13T05:56:48Z\"}\n{\"id\":67,\"first_name\":\"Austine\",\"email\":\"ableas1u@jiathis.com\",\"job\":\"Quality Control 
Specialist\",\"timestamp\":\"2022-02-07T04:51:56Z\"}\n{\"id\":68,\"first_name\":\"Amie\",\"email\":\"alanchbury1v@hc360.com\",\"job\":\"Operator\",\"timestamp\":\"2022-05-09T15:29:00Z\"}\n{\"id\":69,\"first_name\":\"Normie\",\"email\":\"nwardall1w@gnu.org\",\"job\":\"Professor\",\"timestamp\":\"2022-10-14T09:26:18Z\"}\n{\"id\":70,\"first_name\":\"Lowe\",\"email\":\"ledelman1x@mediafire.com\",\"job\":\"Operator\",\"timestamp\":\"2022-02-06T12:42:05Z\"}\n{\"id\":71,\"first_name\":\"Agretha\",\"email\":\"awelchman1y@deviantart.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-07-04T03:37:27Z\"}\n{\"id\":72,\"first_name\":\"Marleah\",\"email\":\"mwale1z@youtube.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-10-07T01:50:50Z\"}\n{\"id\":73,\"first_name\":\"Tammi\",\"email\":\"tcallow20@quantcast.com\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-06-10T20:24:59Z\"}\n{\"id\":74,\"first_name\":\"Wye\",\"email\":\"wkidner21@themeforest.net\",\"job\":\"Tax Accountant\",\"timestamp\":\"2021-12-20T08:54:36Z\"}\n{\"id\":75,\"first_name\":\"Katherine\",\"email\":\"kburnep22@histats.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-07-02T18:17:33Z\"}\n{\"id\":76,\"first_name\":\"Charita\",\"email\":\"cmuccino23@usatoday.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-10-14T20:40:12Z\"}\n{\"id\":77,\"first_name\":\"Brook\",\"email\":\"btoquet24@cmu.edu\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-02-24T07:42:26Z\"}\n{\"id\":78,\"first_name\":\"Lexine\",\"email\":\"lface25@telegraph.co.uk\",\"job\":\"Health Coach III\",\"timestamp\":\"2022-08-14T05:15:07Z\"}\n{\"id\":79,\"first_name\":\"Corri\",\"email\":\"cdavidy26@tiny.cc\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-11-16T00:04:23Z\"}\n{\"id\":80,\"first_name\":\"Kelcey\",\"email\":\"ksargeaunt27@sakura.ne.jp\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-11-24T13:06:33Z\"}\n{\"id\":81,\"first_name\":\"Tracy\",\"email\":\"tbennetto28@goo.ne.jp\",\"job\":\"VP 
Quality Control\",\"timestamp\":\"2022-11-20T11:49:26Z\"}\n{\"id\":82,\"first_name\":\"Edmon\",\"email\":\"ehuxtable29@rambler.ru\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-07-31T23:16:51Z\"}\n{\"id\":83,\"first_name\":\"Tessy\",\"email\":\"tsargeant2a@shop-pro.jp\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-05-15T12:07:04Z\"}\n{\"id\":84,\"first_name\":\"Chev\",\"email\":\"ctenbrug2b@topsy.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2021-12-09T09:41:20Z\"}\n{\"id\":85,\"first_name\":\"Olivero\",\"email\":\"oseebright2c@nba.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-09-19T23:40:16Z\"}\n{\"id\":86,\"first_name\":\"Oswald\",\"email\":\"oswash2d@fotki.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-08-06T13:17:22Z\"}\n{\"id\":87,\"first_name\":\"Laurent\",\"email\":\"lsineath2e@a8.net\",\"job\":\"Web Designer IV\",\"timestamp\":\"2022-09-29T13:15:18Z\"}\n{\"id\":88,\"first_name\":\"Mehetabel\",\"email\":\"mfendt2f@bing.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2021-12-31T08:22:47Z\"}\n{\"id\":89,\"first_name\":\"Jaime\",\"email\":\"jrichfield2g@europa.eu\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-10-30T10:50:50Z\"}\n{\"id\":90,\"first_name\":\"Grissel\",\"email\":\"ggell2h@bluehost.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-02-26T08:04:19Z\"}\n{\"id\":91,\"first_name\":\"Fanchette\",\"email\":\"fgooderham2i@123-reg.co.uk\",\"job\":\"Professor\",\"timestamp\":\"2022-02-04T06:06:20Z\"}\n{\"id\":92,\"first_name\":\"Dov\",\"email\":\"dcurston2j@jigsy.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-02-18T00:39:48Z\"}\n{\"id\":93,\"first_name\":\"Fawn\",\"email\":\"fcazin2k@mac.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-10-06T14:22:24Z\"}\n{\"id\":94,\"first_name\":\"Emilio\",\"email\":\"esaphin2l@china.com.cn\",\"job\":\"Systems Administrator 
III\",\"timestamp\":\"2022-05-18T14:52:18Z\"}\n{\"id\":95,\"first_name\":\"Lisabeth\",\"email\":\"lgarrand2m@mlb.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-08-13T12:08:41Z\"}\n{\"id\":96,\"first_name\":\"Stanwood\",\"email\":\"sschruur2n@phpbb.com\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-06-08T15:49:39Z\"}\n{\"id\":97,\"first_name\":\"Elke\",\"email\":\"eoliverpaull2o@msu.edu\",\"job\":\"Research Assistant II\",\"timestamp\":\"2022-11-26T20:49:01Z\"}\n{\"id\":98,\"first_name\":\"Daisey\",\"email\":\"dpadfield2p@chronoengine.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-09-22T13:51:47Z\"}\n{\"id\":99,\"first_name\":\"Hirsch\",\"email\":\"htrembley2q@hibu.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-06-11T18:27:56Z\"}\n{\"id\":100,\"first_name\":\"Karlee\",\"email\":\"kgates2r@vistaprint.com\",\"job\":\"Media Manager III\",\"timestamp\":\"2022-04-18T22:02:59Z\"}\n{\"id\":101,\"first_name\":\"Kylie\",\"email\":\"kklimov2s@cmu.edu\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-03-20T06:18:08Z\"}\n{\"id\":102,\"first_name\":\"Lorrie\",\"email\":\"lsmewings2t@weibo.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-11-22T01:27:34Z\"}\n{\"id\":103,\"first_name\":\"Lilli\",\"email\":\"lsanto2u@wired.com\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-02-25T04:53:03Z\"}\n{\"id\":104,\"first_name\":\"Julieta\",\"email\":\"jdyers2v@un.org\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-08-17T09:32:36Z\"}\n{\"id\":105,\"first_name\":\"Uriel\",\"email\":\"uqualtro2w@mlb.com\",\"job\":\"Budget/Accounting Analyst II\",\"timestamp\":\"2022-08-09T14:45:37Z\"}\n{\"id\":106,\"first_name\":\"Irvine\",\"email\":\"ikleinschmidt2x@weather.com\",\"job\":\"Operator\",\"timestamp\":\"2022-03-29T18:17:15Z\"}\n{\"id\":107,\"first_name\":\"Elaine\",\"email\":\"eglennon2y@jigsy.com\",\"job\":\"Speech 
Pathologist\",\"timestamp\":\"2022-09-08T11:23:44Z\"}\n{\"id\":108,\"first_name\":\"Gaspar\",\"email\":\"gmaass2z@sfgate.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-04-10T07:49:11Z\"}\n{\"id\":109,\"first_name\":\"Josy\",\"email\":\"jchick30@merriam-webster.com\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-06-22T22:33:43Z\"}\n{\"id\":110,\"first_name\":\"Dawna\",\"email\":\"ddinsale31@nydailynews.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-12-02T13:10:31Z\"}\n{\"id\":111,\"first_name\":\"Aldo\",\"email\":\"alindsell32@ow.ly\",\"job\":\"Engineer IV\",\"timestamp\":\"2022-05-03T01:50:41Z\"}\n{\"id\":112,\"first_name\":\"Wade\",\"email\":\"wparkyns33@cpanel.net\",\"job\":\"Project Manager\",\"timestamp\":\"2022-03-31T20:28:42Z\"}\n{\"id\":113,\"first_name\":\"Aundrea\",\"email\":\"ahaggith34@prnewswire.com\",\"job\":\"Engineer I\",\"timestamp\":\"2022-11-05T23:37:37Z\"}\n{\"id\":114,\"first_name\":\"Tuck\",\"email\":\"tnasi35@netvibes.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-03-16T15:50:38Z\"}\n{\"id\":115,\"first_name\":\"Kirby\",\"email\":\"kworsalls36@cargocollective.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-06-25T06:27:40Z\"}\n{\"id\":116,\"first_name\":\"Lauren\",\"email\":\"lmenghi37@rambler.ru\",\"job\":\"Software Engineer III\",\"timestamp\":\"2022-01-25T07:16:48Z\"}\n{\"id\":117,\"first_name\":\"Pearce\",\"email\":\"pgleed38@hubpages.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-05-05T18:33:52Z\"}\n{\"id\":118,\"first_name\":\"Vlad\",\"email\":\"vbensley39@prweb.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-05-12T17:41:17Z\"}\n{\"id\":119,\"first_name\":\"Uriah\",\"email\":\"ustaniford3a@timesonline.co.uk\",\"job\":\"Staff Accountant III\",\"timestamp\":\"2022-05-23T06:37:31Z\"}\n{\"id\":120,\"first_name\":\"Frederic\",\"email\":\"fchataignier3b@utexas.edu\",\"job\":\"Cost 
Accountant\",\"timestamp\":\"2022-08-21T19:45:23Z\"}\n{\"id\":121,\"first_name\":\"Nell\",\"email\":\"ngniewosz3c@cnn.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-01-04T15:59:26Z\"}\n{\"id\":122,\"first_name\":\"Baxy\",\"email\":\"bcockings3d@dmoz.org\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-12-04T03:13:12Z\"}\n{\"id\":123,\"first_name\":\"Shadow\",\"email\":\"squade3e@sciencedaily.com\",\"job\":\"Research Assistant I\",\"timestamp\":\"2021-12-14T07:33:10Z\"}\n{\"id\":124,\"first_name\":\"Selene\",\"email\":\"ssammut3f@51.la\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2021-12-23T23:00:29Z\"}\n{\"id\":125,\"first_name\":\"Wendye\",\"email\":\"wsimons3g@phpbb.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2021-12-31T22:20:21Z\"}\n{\"id\":126,\"first_name\":\"Cobby\",\"email\":\"cmanton3h@rediff.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-05-03T06:36:31Z\"}\n{\"id\":127,\"first_name\":\"Sharyl\",\"email\":\"sdowner3i@wikipedia.org\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-11-18T09:17:46Z\"}\n{\"id\":128,\"first_name\":\"Sallyanne\",\"email\":\"slinley3j@woothemes.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-03-03T20:52:39Z\"}\n{\"id\":129,\"first_name\":\"Christophe\",\"email\":\"cvelti3k@youku.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-08-18T14:15:52Z\"}\n{\"id\":130,\"first_name\":\"Dion\",\"email\":\"dcoburn3l@booking.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-10-16T14:04:46Z\"}\n{\"id\":131,\"first_name\":\"Terencio\",\"email\":\"thandmore3m@utexas.edu\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-08-15T01:47:28Z\"}\n{\"id\":132,\"first_name\":\"Tiler\",\"email\":\"tvala3n@godaddy.com\",\"job\":\"Geologist III\",\"timestamp\":\"2022-10-18T05:08:00Z\"}\n{\"id\":133,\"first_name\":\"Lelia\",\"email\":\"lleddie3o@youku.com\",\"job\":\"VP 
Accounting\",\"timestamp\":\"2022-11-08T09:30:26Z\"}\n{\"id\":134,\"first_name\":\"Dawna\",\"email\":\"dcamellini3p@hibu.com\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-06-26T10:52:54Z\"}\n{\"id\":135,\"first_name\":\"Nickolas\",\"email\":\"ngosling3q@ox.ac.uk\",\"job\":\"Director of Sales\",\"timestamp\":\"2021-12-28T16:28:31Z\"}\n{\"id\":136,\"first_name\":\"Mufinella\",\"email\":\"mkleiner3r@51.la\",\"job\":\"Automation Specialist III\",\"timestamp\":\"2022-04-26T08:05:51Z\"}\n{\"id\":137,\"first_name\":\"Hoebart\",\"email\":\"hharses3s@cbc.ca\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-05-16T14:01:59Z\"}\n{\"id\":138,\"first_name\":\"Brier\",\"email\":\"bstivey3t@hatena.ne.jp\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-08-17T04:40:43Z\"}\n{\"id\":139,\"first_name\":\"Hynda\",\"email\":\"hbloore3u@shareasale.com\",\"job\":\"Media Manager III\",\"timestamp\":\"2022-01-26T18:01:30Z\"}\n{\"id\":140,\"first_name\":\"Maure\",\"email\":\"mfrankis3v@mashable.com\",\"job\":\"Software Engineer I\",\"timestamp\":\"2022-10-12T16:15:44Z\"}\n{\"id\":141,\"first_name\":\"Kendra\",\"email\":\"kgrisenthwaite3w@yandex.ru\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-08-20T21:52:08Z\"}\n{\"id\":142,\"first_name\":\"Rand\",\"email\":\"rrowledge3x@themeforest.net\",\"job\":\"Librarian\",\"timestamp\":\"2022-05-30T11:46:14Z\"}\n{\"id\":143,\"first_name\":\"Paulie\",\"email\":\"pmerit3y@hud.gov\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-04-01T18:06:45Z\"}\n{\"id\":144,\"first_name\":\"Lynn\",\"email\":\"lcrannach3z@weibo.com\",\"job\":\"Software Test Engineer II\",\"timestamp\":\"2022-03-24T14:41:03Z\"}\n{\"id\":145,\"first_name\":\"Petronella\",\"email\":\"pjanic40@infoseek.co.jp\",\"job\":\"Chief Design 
Engineer\",\"timestamp\":\"2022-06-19T21:29:51Z\"}\n{\"id\":146,\"first_name\":\"Alberik\",\"email\":\"abodleigh41@uiuc.edu\",\"job\":\"Nurse\",\"timestamp\":\"2022-06-17T03:57:06Z\"}\n{\"id\":147,\"first_name\":\"Joshuah\",\"email\":\"jcecchi42@businessinsider.com\",\"job\":\"Budget/Accounting Analyst III\",\"timestamp\":\"2022-03-27T02:57:33Z\"}\n{\"id\":148,\"first_name\":\"Chicky\",\"email\":\"cbraxton43@utexas.edu\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-10-15T12:21:30Z\"}\n{\"id\":149,\"first_name\":\"Glyn\",\"email\":\"ggauvin44@ezinearticles.com\",\"job\":\"Software Consultant\",\"timestamp\":\"2021-12-11T23:15:12Z\"}\n{\"id\":150,\"first_name\":\"Barnabe\",\"email\":\"bemery45@bluehost.com\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-06-11T14:14:01Z\"}\n{\"id\":151,\"first_name\":\"Paulina\",\"email\":\"pchettle46@cornell.edu\",\"job\":\"Geologist III\",\"timestamp\":\"2022-01-07T14:29:30Z\"}\n{\"id\":152,\"first_name\":\"Clarisse\",\"email\":\"csharer47@indiegogo.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-01-12T21:10:23Z\"}\n{\"id\":153,\"first_name\":\"Mona\",\"email\":\"mlakeland48@globo.com\",\"job\":\"Accountant I\",\"timestamp\":\"2022-12-04T20:07:31Z\"}\n{\"id\":154,\"first_name\":\"Emogene\",\"email\":\"ewillison49@cam.ac.uk\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-06-28T15:40:59Z\"}\n{\"id\":155,\"first_name\":\"Dael\",\"email\":\"dbryns4a@nydailynews.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-09-24T00:17:27Z\"}\n{\"id\":156,\"first_name\":\"Mel\",\"email\":\"mbahl4b@tumblr.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-06-12T01:10:19Z\"}\n{\"id\":157,\"first_name\":\"Marlene\",\"email\":\"mferrao4c@seesaa.net\",\"job\":\"Systems Administrator IV\",\"timestamp\":\"2022-01-13T07:40:49Z\"}\n{\"id\":158,\"first_name\":\"Guenna\",\"email\":\"gpalethorpe4d@whitehouse.gov\",\"job\":\"Senior Sales 
Associate\",\"timestamp\":\"2022-07-21T04:39:45Z\"}\n{\"id\":159,\"first_name\":\"Keri\",\"email\":\"kgionettitti4e@bravesites.com\",\"job\":\"Operator\",\"timestamp\":\"2022-09-03T00:08:51Z\"}\n{\"id\":160,\"first_name\":\"Collen\",\"email\":\"cmacterrelly4f@theatlantic.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-09-30T06:34:38Z\"}\n{\"id\":161,\"first_name\":\"Arabelle\",\"email\":\"adoree4g@blinklist.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-04-18T00:16:37Z\"}\n{\"id\":162,\"first_name\":\"Aridatha\",\"email\":\"aalcido4h@xrea.com\",\"job\":\"Programmer Analyst IV\",\"timestamp\":\"2022-05-31T04:17:35Z\"}\n{\"id\":163,\"first_name\":\"Roxine\",\"email\":\"rwarbeys4i@paypal.com\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2022-11-17T07:13:36Z\"}\n{\"id\":164,\"first_name\":\"Elga\",\"email\":\"ewelsby4j@parallels.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-08-09T11:01:09Z\"}\n{\"id\":165,\"first_name\":\"Anna\",\"email\":\"astovine4k@wikispaces.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-04-18T20:09:47Z\"}\n{\"id\":166,\"first_name\":\"Ailyn\",\"email\":\"aquick4l@wordpress.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-10-16T06:58:38Z\"}\n{\"id\":167,\"first_name\":\"Robinet\",\"email\":\"reddington4m@dailymotion.com\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2021-12-17T13:53:29Z\"}\n{\"id\":168,\"first_name\":\"Berty\",\"email\":\"bhauxley4n@sun.com\",\"job\":\"Safety Technician III\",\"timestamp\":\"2022-09-09T17:16:53Z\"}\n{\"id\":169,\"first_name\":\"Hedwiga\",\"email\":\"hmassen4o@1und1.de\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-06-18T19:08:16Z\"}\n{\"id\":170,\"first_name\":\"Marlow\",\"email\":\"mugo4p@time.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-11-21T09:02:51Z\"}\n{\"id\":171,\"first_name\":\"Lindsay\",\"email\":\"llangthorne4q@ameblo.jp\",\"job\":\"Health Coach 
II\",\"timestamp\":\"2022-05-24T18:16:34Z\"}\n{\"id\":172,\"first_name\":\"Katie\",\"email\":\"kdorney4r@soundcloud.com\",\"job\":\"Staff Accountant I\",\"timestamp\":\"2022-01-30T18:38:41Z\"}\n{\"id\":173,\"first_name\":\"Hilary\",\"email\":\"hcattach4s@meetup.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-04-08T21:28:37Z\"}\n{\"id\":174,\"first_name\":\"Ardine\",\"email\":\"aparram4t@irs.gov\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-07-27T14:12:56Z\"}\n{\"id\":175,\"first_name\":\"Mable\",\"email\":\"mriccardo4u@aboutads.info\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-10-03T22:14:39Z\"}\n{\"id\":176,\"first_name\":\"Cairistiona\",\"email\":\"csparwell4v@instagram.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-10-02T20:22:11Z\"}\n{\"id\":177,\"first_name\":\"Gunther\",\"email\":\"gbradden4w@google.com.hk\",\"job\":\"Research Associate\",\"timestamp\":\"2022-08-07T06:19:07Z\"}\n{\"id\":178,\"first_name\":\"Filide\",\"email\":\"fkingswood4x@narod.ru\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-09-14T02:24:37Z\"}\n{\"id\":179,\"first_name\":\"Jacinda\",\"email\":\"jgribbins4y@quantcast.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-02-21T19:24:53Z\"}\n{\"id\":180,\"first_name\":\"Fay\",\"email\":\"fizakson4z@i2i.jp\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-01-06T01:57:09Z\"}\n{\"id\":181,\"first_name\":\"Trish\",\"email\":\"tgurko50@dropbox.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-08-26T15:02:51Z\"}\n{\"id\":182,\"first_name\":\"Chrotoem\",\"email\":\"claviss51@bluehost.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-10-14T17:04:29Z\"}\n{\"id\":183,\"first_name\":\"Drusilla\",\"email\":\"dvern52@upenn.edu\",\"job\":\"Web Designer 
II\",\"timestamp\":\"2022-10-13T00:54:08Z\"}\n{\"id\":184,\"first_name\":\"Kent\",\"email\":\"kleahair53@theglobeandmail.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-01-07T05:32:42Z\"}\n{\"id\":185,\"first_name\":\"Abagail\",\"email\":\"aparadin54@netlog.com\",\"job\":\"Editor\",\"timestamp\":\"2022-06-02T00:24:32Z\"}\n{\"id\":186,\"first_name\":\"Agosto\",\"email\":\"atwinberrow55@answers.com\",\"job\":\"Editor\",\"timestamp\":\"2022-01-29T09:42:02Z\"}\n{\"id\":187,\"first_name\":\"Danyette\",\"email\":\"dbecker56@jigsy.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-05-07T01:24:37Z\"}\n{\"id\":188,\"first_name\":\"Waverly\",\"email\":\"wspinelli57@umn.edu\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-09-29T06:05:36Z\"}\n{\"id\":189,\"first_name\":\"Basil\",\"email\":\"bdobel58@twitpic.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-01-21T06:37:20Z\"}\n{\"id\":190,\"first_name\":\"Catharine\",\"email\":\"cconnew59@xing.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-02-17T07:17:47Z\"}\n{\"id\":191,\"first_name\":\"Edd\",\"email\":\"edezamudio5a@intel.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-09-17T12:49:36Z\"}\n{\"id\":192,\"first_name\":\"Aura\",\"email\":\"aserris5b@google.it\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-02-24T22:25:08Z\"}\n{\"id\":193,\"first_name\":\"Tomi\",\"email\":\"tyarnton5c@g.co\",\"job\":\"Project Manager\",\"timestamp\":\"2022-09-14T23:50:00Z\"}\n{\"id\":194,\"first_name\":\"Claudianus\",\"email\":\"cskerratt5d@va.gov\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-11-11T06:48:47Z\"}\n{\"id\":195,\"first_name\":\"Christine\",\"email\":\"cmiliffe5e@fda.gov\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2021-12-30T16:41:26Z\"}\n{\"id\":196,\"first_name\":\"Neda\",\"email\":\"nlicciardello5f@ameblo.jp\",\"job\":\"Database Administrator 
III\",\"timestamp\":\"2022-11-26T02:35:02Z\"}\n{\"id\":197,\"first_name\":\"Avram\",\"email\":\"abeeston5g@acquirethisname.com\",\"job\":\"Health Coach III\",\"timestamp\":\"2022-11-20T12:15:09Z\"}\n{\"id\":198,\"first_name\":\"Murry\",\"email\":\"madamkiewicz5h@time.com\",\"job\":\"Developer II\",\"timestamp\":\"2022-04-04T04:01:09Z\"}\n{\"id\":199,\"first_name\":\"Oralia\",\"email\":\"odener5i@amazon.de\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-06-08T15:04:48Z\"}\n{\"id\":200,\"first_name\":\"Pearce\",\"email\":\"pabramovitz5j@sciencedirect.com\",\"job\":\"Assistant Professor\",\"timestamp\":\"2022-01-04T10:36:44Z\"}\n{\"id\":201,\"first_name\":\"Jesse\",\"email\":\"jseares5k@elegantthemes.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-03-28T14:24:48Z\"}\n{\"id\":202,\"first_name\":\"Jedediah\",\"email\":\"jconstantinou5l@wsj.com\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-03-05T20:32:52Z\"}\n{\"id\":203,\"first_name\":\"Prescott\",\"email\":\"pmatuska5m@miitbeian.gov.cn\",\"job\":\"Quality Engineer\",\"timestamp\":\"2021-12-24T15:53:01Z\"}\n{\"id\":204,\"first_name\":\"Germaine\",\"email\":\"ghadny5n@sakura.ne.jp\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-11-23T20:50:04Z\"}\n{\"id\":205,\"first_name\":\"Merle\",\"email\":\"mgillmore5o@nsw.gov.au\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-07-17T06:35:23Z\"}\n{\"id\":206,\"first_name\":\"Tiphanie\",\"email\":\"tjekel5p@msn.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-09-17T06:38:25Z\"}\n{\"id\":207,\"first_name\":\"Abbott\",\"email\":\"adauney5q@wsj.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-02-26T17:55:00Z\"}\n{\"id\":208,\"first_name\":\"Flor\",\"email\":\"fbuche5r@kickstarter.com\",\"job\":\"Account Representative 
II\",\"timestamp\":\"2022-06-30T12:59:37Z\"}\n{\"id\":209,\"first_name\":\"Kandace\",\"email\":\"kgavin5s@ovh.net\",\"job\":\"Professor\",\"timestamp\":\"2022-06-15T22:27:43Z\"}\n{\"id\":210,\"first_name\":\"Raimund\",\"email\":\"rmcpeck5t@weibo.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2021-12-25T09:56:17Z\"}\n{\"id\":211,\"first_name\":\"Archibold\",\"email\":\"atunmore5u@privacy.gov.au\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-01-30T06:27:36Z\"}\n{\"id\":212,\"first_name\":\"Don\",\"email\":\"docrigane5v@squidoo.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-03-28T05:37:07Z\"}\n{\"id\":213,\"first_name\":\"Zuzana\",\"email\":\"zreynard5w@google.it\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-08-09T01:13:52Z\"}\n{\"id\":214,\"first_name\":\"Grantley\",\"email\":\"glapley5x@youtube.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-05-21T22:28:10Z\"}\n{\"id\":215,\"first_name\":\"Granthem\",\"email\":\"gdrover5y@technorati.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-03-09T21:39:04Z\"}\n{\"id\":216,\"first_name\":\"Shaina\",\"email\":\"sgrinnov5z@usa.gov\",\"job\":\"Developer III\",\"timestamp\":\"2022-07-01T13:50:43Z\"}\n{\"id\":217,\"first_name\":\"Giovanna\",\"email\":\"gleeburn60@kickstarter.com\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-09-03T13:08:03Z\"}\n{\"id\":218,\"first_name\":\"Dehlia\",\"email\":\"dguinness61@hc360.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2021-12-15T19:00:10Z\"}\n{\"id\":219,\"first_name\":\"Cinda\",\"email\":\"cdunklee62@topsy.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-04-11T05:24:27Z\"}\n{\"id\":220,\"first_name\":\"Kimball\",\"email\":\"kbortolutti63@unicef.org\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-07-02T21:45:40Z\"}\n{\"id\":221,\"first_name\":\"Daffie\",\"email\":\"dlivingstone64@pbs.org\",\"job\":\"Graphic 
Designer\",\"timestamp\":\"2022-02-21T10:00:07Z\"}\n{\"id\":222,\"first_name\":\"Hermina\",\"email\":\"hmacglory65@360.cn\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-09-17T07:01:59Z\"}\n{\"id\":223,\"first_name\":\"Robinet\",\"email\":\"rcook66@fema.gov\",\"job\":\"Web Developer I\",\"timestamp\":\"2021-12-25T06:48:04Z\"}\n{\"id\":224,\"first_name\":\"Cedric\",\"email\":\"cgeroldini67@arizona.edu\",\"job\":\"Engineer III\",\"timestamp\":\"2022-10-19T01:36:02Z\"}\n{\"id\":225,\"first_name\":\"Daune\",\"email\":\"dalgar68@creativecommons.org\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-01-15T03:59:23Z\"}\n{\"id\":226,\"first_name\":\"Susanne\",\"email\":\"sgeist69@nsw.gov.au\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-02-03T18:50:39Z\"}\n{\"id\":227,\"first_name\":\"Sibelle\",\"email\":\"skenion6a@sciencedaily.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2021-12-20T23:39:47Z\"}\n{\"id\":228,\"first_name\":\"Tabb\",\"email\":\"tcubbit6b@shinystat.com\",\"job\":\"Programmer IV\",\"timestamp\":\"2022-04-06T22:19:21Z\"}\n{\"id\":229,\"first_name\":\"Shirley\",\"email\":\"shardstaff6c@furl.net\",\"job\":\"Developer II\",\"timestamp\":\"2022-09-03T18:49:43Z\"}\n{\"id\":230,\"first_name\":\"Sigvard\",\"email\":\"smaffia6d@uol.com.br\",\"job\":\"Software Test Engineer II\",\"timestamp\":\"2022-11-06T03:16:32Z\"}\n{\"id\":231,\"first_name\":\"Maryjo\",\"email\":\"mcamblin6e@symantec.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-09-27T18:02:30Z\"}\n{\"id\":232,\"first_name\":\"Jdavie\",\"email\":\"jishaki6f@mozilla.org\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-10-13T21:42:19Z\"}\n{\"id\":233,\"first_name\":\"Louie\",\"email\":\"lmoresby6g@zdnet.com\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-09-15T23:17:28Z\"}\n{\"id\":234,\"first_name\":\"Merla\",\"email\":\"mpietraszek6h@twitpic.com\",\"job\":\"Accounting Assistant 
I\",\"timestamp\":\"2022-09-04T11:29:39Z\"}\n{\"id\":235,\"first_name\":\"Nealon\",\"email\":\"ntertre6i@free.fr\",\"job\":\"Safety Technician IV\",\"timestamp\":\"2022-10-25T01:34:35Z\"}\n{\"id\":236,\"first_name\":\"Riordan\",\"email\":\"rhark6j@hhs.gov\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-08-28T09:49:55Z\"}\n{\"id\":237,\"first_name\":\"Borg\",\"email\":\"bwettern6k@fastcompany.com\",\"job\":\"Engineer I\",\"timestamp\":\"2022-11-06T21:22:23Z\"}\n{\"id\":238,\"first_name\":\"Micki\",\"email\":\"mgange6l@live.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-10-23T03:47:59Z\"}\n{\"id\":239,\"first_name\":\"Werner\",\"email\":\"wledgeway6m@vistaprint.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-06-27T08:29:47Z\"}\n{\"id\":240,\"first_name\":\"Aundrea\",\"email\":\"agirt6n@sbwire.com\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-07-23T13:19:16Z\"}\n{\"id\":241,\"first_name\":\"Benedetto\",\"email\":\"bharmon6o@google.pl\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-11-19T14:02:05Z\"}\n{\"id\":242,\"first_name\":\"Cristal\",\"email\":\"cellington6p@ask.com\",\"job\":\"Editor\",\"timestamp\":\"2022-07-01T23:57:52Z\"}\n{\"id\":243,\"first_name\":\"Ebonee\",\"email\":\"ebartolomeo6q@goo.ne.jp\",\"job\":\"Account Executive\",\"timestamp\":\"2022-01-28T09:49:08Z\"}\n{\"id\":244,\"first_name\":\"Bern\",\"email\":\"bturrell6r@topsy.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-03-06T14:31:58Z\"}\n{\"id\":245,\"first_name\":\"Kenny\",\"email\":\"kruggs6s@nba.com\",\"job\":\"Software Test Engineer III\",\"timestamp\":\"2022-05-23T21:28:39Z\"}\n{\"id\":246,\"first_name\":\"Wilhelmina\",\"email\":\"wfandrey6t@flavors.me\",\"job\":\"Actuary\",\"timestamp\":\"2022-08-29T02:19:02Z\"}\n{\"id\":247,\"first_name\":\"Aurelea\",\"email\":\"acoverdill6u@furl.net\",\"job\":\"Community Outreach 
Specialist\",\"timestamp\":\"2022-06-09T09:59:42Z\"}\n{\"id\":248,\"first_name\":\"Aaren\",\"email\":\"asautter6v@chicagotribune.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-09-06T22:52:21Z\"}\n{\"id\":249,\"first_name\":\"Alva\",\"email\":\"achillingsworth6w@salon.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-02-15T03:57:24Z\"}\n{\"id\":250,\"first_name\":\"Coretta\",\"email\":\"cdenormanville6x@nbcnews.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-11-17T23:47:28Z\"}\n{\"id\":251,\"first_name\":\"Lem\",\"email\":\"lcarlesso6y@hibu.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-08-18T16:41:37Z\"}\n{\"id\":252,\"first_name\":\"Alejandro\",\"email\":\"aloughlin6z@hibu.com\",\"job\":\"Operator\",\"timestamp\":\"2022-10-15T01:17:03Z\"}\n{\"id\":253,\"first_name\":\"Benetta\",\"email\":\"bbuttrey70@lulu.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-03-02T20:29:32Z\"}\n{\"id\":254,\"first_name\":\"Ralina\",\"email\":\"rsterte71@sun.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-11-21T02:40:25Z\"}\n{\"id\":255,\"first_name\":\"Serena\",\"email\":\"smoulds72@globo.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-07-20T12:10:53Z\"}\n{\"id\":256,\"first_name\":\"Bonnie\",\"email\":\"bthreader73@over-blog.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-04-26T20:51:46Z\"}\n{\"id\":257,\"first_name\":\"Joelynn\",\"email\":\"jlangham74@discovery.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-01-07T19:00:00Z\"}\n{\"id\":258,\"first_name\":\"Otha\",\"email\":\"ocaselli75@wikispaces.com\",\"job\":\"Actuary\",\"timestamp\":\"2022-07-04T17:16:53Z\"}\n{\"id\":259,\"first_name\":\"Maryann\",\"email\":\"mbenn76@clickbank.net\",\"job\":\"Office Assistant III\",\"timestamp\":\"2022-06-19T05:43:08Z\"}\n{\"id\":260,\"first_name\":\"Ryun\",\"email\":\"rgwalter77@about.me\",\"job\":\"Quality Control 
Specialist\",\"timestamp\":\"2022-06-18T08:19:32Z\"}\n{\"id\":261,\"first_name\":\"Letisha\",\"email\":\"linns78@tinypic.com\",\"job\":\"Operator\",\"timestamp\":\"2022-03-27T08:33:52Z\"}\n{\"id\":262,\"first_name\":\"Anni\",\"email\":\"awhale79@sciencedirect.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-03-02T13:47:31Z\"}\n{\"id\":263,\"first_name\":\"Erek\",\"email\":\"ekoppen7a@cafepress.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2021-12-29T08:26:09Z\"}\n{\"id\":264,\"first_name\":\"Guendolen\",\"email\":\"gharvatt7b@abc.net.au\",\"job\":\"Software Engineer I\",\"timestamp\":\"2022-06-05T16:59:36Z\"}\n{\"id\":265,\"first_name\":\"Byram\",\"email\":\"bfarn7c@icq.com\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-06-27T09:53:33Z\"}\n{\"id\":266,\"first_name\":\"Carine\",\"email\":\"cshallo7d@telegraph.co.uk\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-05-02T19:02:15Z\"}\n{\"id\":267,\"first_name\":\"Nina\",\"email\":\"nmiguet7e@g.co\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-06-13T09:15:28Z\"}\n{\"id\":268,\"first_name\":\"Anni\",\"email\":\"ademschke7f@acquirethisname.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-08-03T09:56:07Z\"}\n{\"id\":269,\"first_name\":\"Leilah\",\"email\":\"lhorrod7g@nps.gov\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-04-06T00:10:25Z\"}\n{\"id\":270,\"first_name\":\"Emmit\",\"email\":\"elobbe7h@liveinternet.ru\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-02-22T10:18:21Z\"}\n{\"id\":271,\"first_name\":\"Caprice\",\"email\":\"ccaldairou7i@behance.net\",\"job\":\"Web Developer III\",\"timestamp\":\"2022-11-10T18:43:09Z\"}\n{\"id\":272,\"first_name\":\"Delly\",\"email\":\"djefferys7j@comsenz.com\",\"job\":\"Accounting Assistant III\",\"timestamp\":\"2022-07-06T00:39:58Z\"}\n{\"id\":273,\"first_name\":\"Ninnetta\",\"email\":\"ngarton7k@cargocollective.com\",\"job\":\"Web Designer 
IV\",\"timestamp\":\"2022-04-29T12:06:43Z\"}\n{\"id\":274,\"first_name\":\"Gweneth\",\"email\":\"gdowell7l@timesonline.co.uk\",\"job\":\"Internal Auditor\",\"timestamp\":\"2021-12-13T07:25:58Z\"}\n{\"id\":275,\"first_name\":\"Tuckie\",\"email\":\"tpailin7m@bandcamp.com\",\"job\":\"Accounting Assistant IV\",\"timestamp\":\"2022-10-05T08:08:07Z\"}\n{\"id\":276,\"first_name\":\"Dorian\",\"email\":\"ddrews7n@marriott.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-07-13T22:58:08Z\"}\n{\"id\":277,\"first_name\":\"Sadella\",\"email\":\"stofanini7o@so-net.ne.jp\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-04-27T23:58:37Z\"}\n{\"id\":278,\"first_name\":\"Kerby\",\"email\":\"klarrett7p@slideshare.net\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-09-18T11:27:04Z\"}\n{\"id\":279,\"first_name\":\"Roberto\",\"email\":\"rbrabbs7q@nationalgeographic.com\",\"job\":\"Programmer Analyst II\",\"timestamp\":\"2022-01-31T08:57:00Z\"}\n{\"id\":280,\"first_name\":\"Avery\",\"email\":\"aweatherdon7r@soundcloud.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-09-04T01:22:07Z\"}\n{\"id\":281,\"first_name\":\"Ammamaria\",\"email\":\"awaddie7s@msu.edu\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-05-20T12:28:49Z\"}\n{\"id\":282,\"first_name\":\"Atalanta\",\"email\":\"awonter7t@miibeian.gov.cn\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-08-07T12:32:10Z\"}\n{\"id\":283,\"first_name\":\"Matilde\",\"email\":\"mgarric7u@zimbio.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-07-11T20:38:19Z\"}\n{\"id\":284,\"first_name\":\"Sibylle\",\"email\":\"starbett7v@chicagotribune.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-02-06T16:23:04Z\"}\n{\"id\":285,\"first_name\":\"Honey\",\"email\":\"hobrian7w@latimes.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-05-18T19:12:26Z\"}\n{\"id\":286,\"first_name\":\"Ulysses\",\"email\":\"uhutson7x@walmart.com\",\"job\":\"Sales 
Representative\",\"timestamp\":\"2021-12-09T23:19:28Z\"}\n{\"id\":287,\"first_name\":\"Jasper\",\"email\":\"jmacpaik7y@zdnet.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-08-02T21:42:22Z\"}\n{\"id\":288,\"first_name\":\"Bessy\",\"email\":\"bburker7z@theglobeandmail.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-03-24T08:24:07Z\"}\n{\"id\":289,\"first_name\":\"Belle\",\"email\":\"bhasnney80@japanpost.jp\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-10-18T10:42:23Z\"}\n{\"id\":290,\"first_name\":\"Elia\",\"email\":\"emcilwain81@sfgate.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-09-09T02:42:53Z\"}\n{\"id\":291,\"first_name\":\"Ed\",\"email\":\"ejorczyk82@t-online.de\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-06-13T11:22:03Z\"}\n{\"id\":292,\"first_name\":\"Sandor\",\"email\":\"smeller83@cafepress.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-06-17T00:53:01Z\"}\n{\"id\":293,\"first_name\":\"Wallie\",\"email\":\"wroe84@gravatar.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-05-28T21:53:57Z\"}\n{\"id\":294,\"first_name\":\"Ladonna\",\"email\":\"lhearst85@tiny.cc\",\"job\":\"Automation Specialist IV\",\"timestamp\":\"2022-07-25T13:50:29Z\"}\n{\"id\":295,\"first_name\":\"Michael\",\"email\":\"mgilardi86@chicagotribune.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-11-29T12:09:37Z\"}\n{\"id\":296,\"first_name\":\"Marion\",\"email\":\"mbusfield87@ifeng.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-01-12T02:10:19Z\"}\n{\"id\":297,\"first_name\":\"Ode\",\"email\":\"ocoxon88@csmonitor.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-06-28T05:16:21Z\"}\n{\"id\":298,\"first_name\":\"Bink\",\"email\":\"bcrossan89@t.co\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-03-19T19:09:19Z\"}\n{\"id\":299,\"first_name\":\"Royce\",\"email\":\"rduffie8a@de.vu\",\"job\":\"Office Assistant 
II\",\"timestamp\":\"2022-04-26T13:21:14Z\"}\n{\"id\":300,\"first_name\":\"Thain\",\"email\":\"tannakin8b@addtoany.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-09-18T22:11:03Z\"}\n{\"id\":301,\"first_name\":\"Clarine\",\"email\":\"ccheal8c@alibaba.com\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-01-08T20:37:46Z\"}\n{\"id\":302,\"first_name\":\"Garrard\",\"email\":\"ggobat8d@toplist.cz\",\"job\":\"Staff Accountant II\",\"timestamp\":\"2022-07-27T23:50:41Z\"}\n{\"id\":303,\"first_name\":\"Kare\",\"email\":\"kingliby8e@ycombinator.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-08-17T01:56:29Z\"}\n{\"id\":304,\"first_name\":\"Les\",\"email\":\"ledis8f@yahoo.co.jp\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-01-06T20:01:59Z\"}\n{\"id\":305,\"first_name\":\"Jessie\",\"email\":\"jcherrett8g@paginegialle.it\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-07-03T06:55:11Z\"}\n{\"id\":306,\"first_name\":\"Coreen\",\"email\":\"cedmund8h@ask.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2021-12-12T07:49:12Z\"}\n{\"id\":307,\"first_name\":\"Courtnay\",\"email\":\"clowre8i@lycos.com\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-10-12T09:42:23Z\"}\n{\"id\":308,\"first_name\":\"Isacco\",\"email\":\"iesslemont8j@google.co.uk\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-07-10T22:34:28Z\"}\n{\"id\":309,\"first_name\":\"Tades\",\"email\":\"tandrat8k@patch.com\",\"job\":\"Accounting Assistant II\",\"timestamp\":\"2022-04-14T12:38:29Z\"}\n{\"id\":310,\"first_name\":\"Mitchael\",\"email\":\"mlermouth8l@weather.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-03-12T21:48:44Z\"}\n{\"id\":311,\"first_name\":\"Kurt\",\"email\":\"kfleet8m@cisco.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-06-23T20:47:09Z\"}\n{\"id\":312,\"first_name\":\"Miriam\",\"email\":\"mchestney8n@un.org\",\"job\":\"Statistician 
IV\",\"timestamp\":\"2022-01-21T02:23:11Z\"}\n{\"id\":313,\"first_name\":\"Galven\",\"email\":\"gkennifick8o@4shared.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-11-18T01:30:03Z\"}\n{\"id\":314,\"first_name\":\"Robinet\",\"email\":\"restabrook8p@amazon.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-03-23T20:24:42Z\"}\n{\"id\":315,\"first_name\":\"Bren\",\"email\":\"bmaase8q@privacy.gov.au\",\"job\":\"Actuary\",\"timestamp\":\"2022-03-30T13:52:45Z\"}\n{\"id\":316,\"first_name\":\"Perl\",\"email\":\"pmcglew8r@auda.org.au\",\"job\":\"Paralegal\",\"timestamp\":\"2022-02-21T15:31:21Z\"}\n{\"id\":317,\"first_name\":\"Sada\",\"email\":\"shartas8s@foxnews.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-09-20T04:49:11Z\"}\n{\"id\":318,\"first_name\":\"Trixie\",\"email\":\"tgeydon8t@intel.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-01-23T05:52:12Z\"}\n{\"id\":319,\"first_name\":\"Sauveur\",\"email\":\"sgiscken8u@ezinearticles.com\",\"job\":\"Engineer I\",\"timestamp\":\"2022-08-10T04:16:30Z\"}\n{\"id\":320,\"first_name\":\"Megan\",\"email\":\"mlawleff8v@dyndns.org\",\"job\":\"Developer II\",\"timestamp\":\"2022-10-28T17:54:31Z\"}\n{\"id\":321,\"first_name\":\"Gail\",\"email\":\"gswalough8w@dagondesign.com\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-08-23T09:04:46Z\"}\n{\"id\":322,\"first_name\":\"Bradan\",\"email\":\"bellinor8x@gov.uk\",\"job\":\"Engineer IV\",\"timestamp\":\"2022-05-09T21:16:31Z\"}\n{\"id\":323,\"first_name\":\"Nan\",\"email\":\"nlindeboom8y@apple.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-05-22T20:57:54Z\"}\n{\"id\":324,\"first_name\":\"Horatia\",\"email\":\"hgounard8z@mediafire.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-10-07T14:31:33Z\"}\n{\"id\":325,\"first_name\":\"Thomasine\",\"email\":\"tgoodlake90@amazon.com\",\"job\":\"Budget/Accounting Analyst 
I\",\"timestamp\":\"2022-09-02T05:15:30Z\"}\n{\"id\":326,\"first_name\":\"Odetta\",\"email\":\"odoige91@java.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-11-25T13:05:15Z\"}\n{\"id\":327,\"first_name\":\"Ronda\",\"email\":\"rblandford92@wisc.edu\",\"job\":\"Biostatistician I\",\"timestamp\":\"2022-04-19T23:39:29Z\"}\n{\"id\":328,\"first_name\":\"Rhianon\",\"email\":\"rgillett93@arstechnica.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-02-20T10:38:28Z\"}\n{\"id\":329,\"first_name\":\"Cordell\",\"email\":\"cjannings94@chronoengine.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-05-23T02:44:43Z\"}\n{\"id\":330,\"first_name\":\"Puff\",\"email\":\"pmaylard95@g.co\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-01-20T13:50:09Z\"}\n{\"id\":331,\"first_name\":\"Mahalia\",\"email\":\"mmanifield96@ca.gov\",\"job\":\"Actuary\",\"timestamp\":\"2022-06-13T07:10:18Z\"}\n{\"id\":332,\"first_name\":\"Amalie\",\"email\":\"apfleger97@shareasale.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-05-03T07:14:03Z\"}\n{\"id\":333,\"first_name\":\"Hayden\",\"email\":\"hsparkes98@prlog.org\",\"job\":\"Human Resources Assistant I\",\"timestamp\":\"2022-11-25T09:10:49Z\"}\n{\"id\":334,\"first_name\":\"Penelope\",\"email\":\"pmctavish99@earthlink.net\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-10-17T00:08:59Z\"}\n{\"id\":335,\"first_name\":\"Malory\",\"email\":\"mlogsdail9a@github.io\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-10-21T00:25:47Z\"}\n{\"id\":336,\"first_name\":\"Bibbie\",\"email\":\"bcutchie9b@yahoo.co.jp\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-04-19T16:36:54Z\"}\n{\"id\":337,\"first_name\":\"Codie\",\"email\":\"ccoundley9c@instagram.com\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-08-29T03:37:02Z\"}\n{\"id\":338,\"first_name\":\"Alick\",\"email\":\"ahaggleton9d@moonfruit.com\",\"job\":\"Statistician 
III\",\"timestamp\":\"2021-12-28T21:14:48Z\"}\n{\"id\":339,\"first_name\":\"Phil\",\"email\":\"pmowatt9e@cocolog-nifty.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-05-30T14:19:42Z\"}\n{\"id\":340,\"first_name\":\"Agace\",\"email\":\"askeats9f@nationalgeographic.com\",\"job\":\"Web Designer III\",\"timestamp\":\"2022-11-19T07:35:23Z\"}\n{\"id\":341,\"first_name\":\"Maria\",\"email\":\"maleksashin9g@tmall.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-02-21T07:40:14Z\"}\n{\"id\":342,\"first_name\":\"Coreen\",\"email\":\"cchampley9h@economist.com\",\"job\":\"Biostatistician I\",\"timestamp\":\"2022-07-20T05:23:22Z\"}\n{\"id\":343,\"first_name\":\"Ariel\",\"email\":\"adolder9i@nytimes.com\",\"job\":\"Automation Specialist II\",\"timestamp\":\"2022-08-17T02:56:53Z\"}\n{\"id\":344,\"first_name\":\"Mathilde\",\"email\":\"mtheml9j@unesco.org\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-09-16T10:25:30Z\"}\n{\"id\":345,\"first_name\":\"Thorndike\",\"email\":\"twyd9k@nyu.edu\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-05-14T09:10:54Z\"}\n{\"id\":346,\"first_name\":\"Hilario\",\"email\":\"hivanchenkov9l@spiegel.de\",\"job\":\"Human Resources Assistant III\",\"timestamp\":\"2022-07-26T00:37:05Z\"}\n{\"id\":347,\"first_name\":\"Jessa\",\"email\":\"jdavidek9m@youtu.be\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-08-25T16:04:23Z\"}\n{\"id\":348,\"first_name\":\"Taylor\",\"email\":\"tzavattieri9n@nba.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-05-17T17:07:33Z\"}\n{\"id\":349,\"first_name\":\"Tobiah\",\"email\":\"tcescot9o@symantec.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-09-01T21:53:33Z\"}\n{\"id\":350,\"first_name\":\"Natalie\",\"email\":\"nspinks9p@bbc.co.uk\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-03-27T07:20:50Z\"}\n{\"id\":351,\"first_name\":\"Penny\",\"email\":\"pjiggins9q@tinyurl.com\",\"job\":\"Community Outreach 
Specialist\",\"timestamp\":\"2022-08-15T12:37:50Z\"}\n{\"id\":352,\"first_name\":\"Burnaby\",\"email\":\"bbrookton9r@shareasale.com\",\"job\":\"Biostatistician I\",\"timestamp\":\"2022-06-28T12:28:50Z\"}\n{\"id\":353,\"first_name\":\"Ted\",\"email\":\"twalhedd9s@foxnews.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-01-22T21:56:51Z\"}\n{\"id\":354,\"first_name\":\"Christie\",\"email\":\"ccrangle9t@baidu.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-09-08T07:07:25Z\"}\n{\"id\":355,\"first_name\":\"Skipper\",\"email\":\"sminett9u@cam.ac.uk\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-05-05T20:18:23Z\"}\n{\"id\":356,\"first_name\":\"Robbie\",\"email\":\"rcloake9v@rediff.com\",\"job\":\"Office Assistant I\",\"timestamp\":\"2022-08-31T05:21:10Z\"}\n{\"id\":357,\"first_name\":\"Winny\",\"email\":\"wlooney9w@gmpg.org\",\"job\":\"General Manager\",\"timestamp\":\"2022-03-26T06:45:14Z\"}\n{\"id\":358,\"first_name\":\"Meghan\",\"email\":\"mguslon9x@ca.gov\",\"job\":\"Health Coach III\",\"timestamp\":\"2022-09-21T01:54:06Z\"}\n{\"id\":359,\"first_name\":\"Ignatius\",\"email\":\"igergolet9y@gmpg.org\",\"job\":\"Research Assistant II\",\"timestamp\":\"2022-08-12T03:08:44Z\"}\n{\"id\":360,\"first_name\":\"Dalenna\",\"email\":\"dkinig9z@seesaa.net\",\"job\":\"Software Engineer II\",\"timestamp\":\"2022-07-18T12:00:37Z\"}\n{\"id\":361,\"first_name\":\"Tasha\",\"email\":\"tredmirea0@typepad.com\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-10-08T04:47:15Z\"}\n{\"id\":362,\"first_name\":\"Olenolin\",\"email\":\"ofogartya1@sbwire.com\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-02-15T14:35:15Z\"}\n{\"id\":363,\"first_name\":\"Toiboid\",\"email\":\"tjanesa2@sfgate.com\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-07-10T21:41:01Z\"}\n{\"id\":364,\"first_name\":\"Flem\",\"email\":\"fjentona3@amazon.co.jp\",\"job\":\"Desktop Support 
Technician\",\"timestamp\":\"2022-01-26T00:20:34Z\"}\n{\"id\":365,\"first_name\":\"Bab\",\"email\":\"bvaleka4@nydailynews.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-03-12T10:26:20Z\"}\n{\"id\":366,\"first_name\":\"Juli\",\"email\":\"jcuniama5@fotki.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-04-10T09:07:47Z\"}\n{\"id\":367,\"first_name\":\"Anatollo\",\"email\":\"ableakleya6@fda.gov\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2021-12-13T05:04:59Z\"}\n{\"id\":368,\"first_name\":\"Shina\",\"email\":\"smeggisona7@europa.eu\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-08-05T03:34:02Z\"}\n{\"id\":369,\"first_name\":\"Malva\",\"email\":\"mpeizera8@vimeo.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-03-26T15:22:50Z\"}\n{\"id\":370,\"first_name\":\"Filbert\",\"email\":\"fdominya9@theguardian.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2021-12-15T02:49:47Z\"}\n{\"id\":371,\"first_name\":\"Dana\",\"email\":\"dcasswellaa@studiopress.com\",\"job\":\"Web Designer I\",\"timestamp\":\"2022-10-22T09:24:45Z\"}\n{\"id\":372,\"first_name\":\"Merle\",\"email\":\"mpetersonab@sciencedirect.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-02-22T10:27:38Z\"}\n{\"id\":373,\"first_name\":\"Allan\",\"email\":\"abuntingac@dot.gov\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-08-03T08:18:11Z\"}\n{\"id\":374,\"first_name\":\"Bram\",\"email\":\"bverniad@chronoengine.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-08-02T22:06:32Z\"}\n{\"id\":375,\"first_name\":\"Leonelle\",\"email\":\"lyvensae@dailymotion.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-04-16T02:55:02Z\"}\n{\"id\":376,\"first_name\":\"Liane\",\"email\":\"lrabbittaf@house.gov\",\"job\":\"Project Manager\",\"timestamp\":\"2022-08-06T19:07:53Z\"}\n{\"id\":377,\"first_name\":\"Bartholemy\",\"email\":\"bsemarkeag@slashdot.org\",\"job\":\"Assistant 
Professor\",\"timestamp\":\"2022-01-28T21:04:03Z\"}\n{\"id\":378,\"first_name\":\"Lee\",\"email\":\"ldocwraah@diigo.com\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-02-27T02:10:23Z\"}\n{\"id\":379,\"first_name\":\"Gorden\",\"email\":\"gtalmadgeai@yolasite.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-03-30T03:03:19Z\"}\n{\"id\":380,\"first_name\":\"Brody\",\"email\":\"bmedgewickaj@soup.io\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-09-29T19:39:00Z\"}\n{\"id\":381,\"first_name\":\"Rebekah\",\"email\":\"rgossageak@mapquest.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-04-22T17:40:15Z\"}\n{\"id\":382,\"first_name\":\"Josselyn\",\"email\":\"jmilneral@timesonline.co.uk\",\"job\":\"Project Manager\",\"timestamp\":\"2022-02-07T18:32:26Z\"}\n{\"id\":383,\"first_name\":\"Padriac\",\"email\":\"psmalecombeam@bandcamp.com\",\"job\":\"Programmer Analyst I\",\"timestamp\":\"2021-12-11T22:06:01Z\"}\n{\"id\":384,\"first_name\":\"Flossy\",\"email\":\"fwhitehornean@indiegogo.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-09-09T15:01:19Z\"}\n{\"id\":385,\"first_name\":\"Chantal\",\"email\":\"cvaughanao@sciencedaily.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-01-23T20:15:13Z\"}\n{\"id\":386,\"first_name\":\"Jacquette\",\"email\":\"jlamballap@xinhuanet.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-02-13T01:40:24Z\"}\n{\"id\":387,\"first_name\":\"Jerrome\",\"email\":\"jbruckmannaq@homestead.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-05-29T18:49:31Z\"}\n{\"id\":388,\"first_name\":\"Edik\",\"email\":\"ecoughlanar@moonfruit.com\",\"job\":\"Developer III\",\"timestamp\":\"2022-12-05T11:10:08Z\"}\n{\"id\":389,\"first_name\":\"Jonis\",\"email\":\"jdallmannas@51.la\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-11-17T01:01:20Z\"}\n{\"id\":390,\"first_name\":\"Pryce\",\"email\":\"pchaliceat@yellowbook.com\",\"job\":\"Budget/Accounting Analyst 
II\",\"timestamp\":\"2021-12-12T03:37:34Z\"}\n{\"id\":391,\"first_name\":\"Katheryn\",\"email\":\"kfleoteau@nps.gov\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-10-21T03:25:23Z\"}\n{\"id\":392,\"first_name\":\"Bent\",\"email\":\"bblacklerav@merriam-webster.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-01-17T10:38:38Z\"}\n{\"id\":393,\"first_name\":\"Farly\",\"email\":\"fcowingaw@youku.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-06-12T00:06:35Z\"}\n{\"id\":394,\"first_name\":\"Ninon\",\"email\":\"nreinaax@chicagotribune.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-04-02T17:18:46Z\"}\n{\"id\":395,\"first_name\":\"Lyndsay\",\"email\":\"lbrandtsay@cargocollective.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-03-22T12:47:37Z\"}\n{\"id\":396,\"first_name\":\"Elaina\",\"email\":\"emccloughlinaz@usda.gov\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-09-14T12:34:04Z\"}\n{\"id\":397,\"first_name\":\"Hillery\",\"email\":\"hgilhoolyb0@alexa.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-03-23T21:50:41Z\"}\n{\"id\":398,\"first_name\":\"Garth\",\"email\":\"gabbotsb1@wordpress.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-09-28T23:38:07Z\"}\n{\"id\":399,\"first_name\":\"Sinclair\",\"email\":\"sboyerb2@usda.gov\",\"job\":\"Developer II\",\"timestamp\":\"2022-08-08T21:24:47Z\"}\n{\"id\":400,\"first_name\":\"Jody\",\"email\":\"jgetshamb3@vinaora.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-03-17T14:23:39Z\"}\n{\"id\":401,\"first_name\":\"Melamie\",\"email\":\"mmatulab4@amazonaws.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2021-12-14T08:21:43Z\"}\n{\"id\":402,\"first_name\":\"Adina\",\"email\":\"aapedaileb5@paginegialle.it\",\"job\":\"Environmental 
Tech\",\"timestamp\":\"2022-09-12T14:59:41Z\"}\n{\"id\":403,\"first_name\":\"Carmencita\",\"email\":\"cnolanb6@shareasale.com\",\"job\":\"Teacher\",\"timestamp\":\"2021-12-17T03:12:30Z\"}\n{\"id\":404,\"first_name\":\"Marion\",\"email\":\"mmcfetridgeb7@csmonitor.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-11-21T06:37:22Z\"}\n{\"id\":405,\"first_name\":\"Kelley\",\"email\":\"kcouchb8@smugmug.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-07-19T15:59:16Z\"}\n{\"id\":406,\"first_name\":\"Aluino\",\"email\":\"adeeryb9@jigsy.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-01-02T14:05:55Z\"}\n{\"id\":407,\"first_name\":\"Chantal\",\"email\":\"cvannsba@ehow.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-11-22T10:16:54Z\"}\n{\"id\":408,\"first_name\":\"Neville\",\"email\":\"nlacasebb@vistaprint.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-06-07T04:19:30Z\"}\n{\"id\":409,\"first_name\":\"Babette\",\"email\":\"blandebc@is.gd\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-06-30T17:27:16Z\"}\n{\"id\":410,\"first_name\":\"Zarah\",\"email\":\"zcasellabd@github.io\",\"job\":\"Social Worker\",\"timestamp\":\"2022-04-11T01:56:29Z\"}\n{\"id\":411,\"first_name\":\"Hendrick\",\"email\":\"hlawrencebe@rediff.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-05-18T23:39:22Z\"}\n{\"id\":412,\"first_name\":\"Miles\",\"email\":\"mtollbf@so-net.ne.jp\",\"job\":\"Staff Accountant III\",\"timestamp\":\"2022-10-07T11:26:47Z\"}\n{\"id\":413,\"first_name\":\"Antonino\",\"email\":\"aiskowbg@umn.edu\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-05-28T14:03:33Z\"}\n{\"id\":414,\"first_name\":\"Mehetabel\",\"email\":\"mjiroudekbh@wikispaces.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2021-12-08T12:30:46Z\"}\n{\"id\":415,\"first_name\":\"Babb\",\"email\":\"bkembrybi@cocolog-nifty.com\",\"job\":\"Mechanical Systems 
Engineer\",\"timestamp\":\"2022-10-12T06:41:10Z\"}\n{\"id\":416,\"first_name\":\"Theresina\",\"email\":\"tcastagnebj@spiegel.de\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-01-03T13:36:12Z\"}\n{\"id\":417,\"first_name\":\"Aaron\",\"email\":\"afazakerleybk@webeden.co.uk\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-10-03T06:22:19Z\"}\n{\"id\":418,\"first_name\":\"Bartie\",\"email\":\"bkettlestringesbl@pcworld.com\",\"job\":\"Budget/Accounting Analyst II\",\"timestamp\":\"2022-04-14T04:23:27Z\"}\n{\"id\":419,\"first_name\":\"Amelie\",\"email\":\"abesantiebm@bloglines.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-10-17T08:14:30Z\"}\n{\"id\":420,\"first_name\":\"Harrison\",\"email\":\"hrollesbn@twitter.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-05-18T06:25:09Z\"}\n{\"id\":421,\"first_name\":\"Siward\",\"email\":\"smarquisbo@nydailynews.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-02-01T11:52:44Z\"}\n{\"id\":422,\"first_name\":\"Millard\",\"email\":\"misonbp@issuu.com\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-08-24T03:53:59Z\"}\n{\"id\":423,\"first_name\":\"Daniella\",\"email\":\"dfaransbq@mysql.com\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-08-10T13:25:10Z\"}\n{\"id\":424,\"first_name\":\"Raimund\",\"email\":\"rsteptoebr@wordpress.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-06-07T16:45:17Z\"}\n{\"id\":425,\"first_name\":\"Ingmar\",\"email\":\"ifeldmanbs@1und1.de\",\"job\":\"Computer Systems Analyst II\",\"timestamp\":\"2022-01-05T14:01:15Z\"}\n{\"id\":426,\"first_name\":\"Zack\",\"email\":\"zbarnwellbt@msu.edu\",\"job\":\"Staff Accountant IV\",\"timestamp\":\"2022-06-28T17:14:21Z\"}\n{\"id\":427,\"first_name\":\"Mozelle\",\"email\":\"mstuchburybu@kickstarter.com\",\"job\":\"Geologist IV\",\"timestamp\":\"2022-07-13T19:58:35Z\"}\n{\"id\":428,\"first_name\":\"Wallie\",\"email\":\"wflurybv@hc360.com\",\"job\":\"Environmental 
Tech\",\"timestamp\":\"2022-03-24T07:28:36Z\"}\n{\"id\":429,\"first_name\":\"Annis\",\"email\":\"abenettolobw@cpanel.net\",\"job\":\"Recruiter\",\"timestamp\":\"2022-01-26T05:18:04Z\"}\n{\"id\":430,\"first_name\":\"Martainn\",\"email\":\"mbeedenbx@wikipedia.org\",\"job\":\"Geologist I\",\"timestamp\":\"2022-03-28T12:59:52Z\"}\n{\"id\":431,\"first_name\":\"Ara\",\"email\":\"akeachby@ebay.co.uk\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-03-24T19:33:33Z\"}\n{\"id\":432,\"first_name\":\"Sully\",\"email\":\"sbomanbz@scientificamerican.com\",\"job\":\"Accounting Assistant III\",\"timestamp\":\"2022-04-20T00:28:36Z\"}\n{\"id\":433,\"first_name\":\"Alley\",\"email\":\"aboeckec0@youtube.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-12-06T11:01:24Z\"}\n{\"id\":434,\"first_name\":\"Ignace\",\"email\":\"ibullonc1@engadget.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-07-16T19:26:35Z\"}\n{\"id\":435,\"first_name\":\"Gretta\",\"email\":\"gpavelinc2@cbslocal.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-01-13T19:58:02Z\"}\n{\"id\":436,\"first_name\":\"Godiva\",\"email\":\"gnarramorec3@barnesandnoble.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2021-12-29T16:34:57Z\"}\n{\"id\":437,\"first_name\":\"Erskine\",\"email\":\"ebillingc4@ning.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-10-09T12:03:38Z\"}\n{\"id\":438,\"first_name\":\"Otha\",\"email\":\"omcdowallc5@hugedomains.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-10-25T06:57:20Z\"}\n{\"id\":439,\"first_name\":\"Annabal\",\"email\":\"ajerrardc6@ca.gov\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-02-28T20:54:44Z\"}\n{\"id\":440,\"first_name\":\"Helenka\",\"email\":\"hwillougheyc7@51.la\",\"job\":\"Teacher\",\"timestamp\":\"2022-11-27T13:25:13Z\"}\n{\"id\":441,\"first_name\":\"Frederic\",\"email\":\"flaversc8@ox.ac.uk\",\"job\":\"Chief Design 
Engineer\",\"timestamp\":\"2022-10-22T08:43:20Z\"}\n{\"id\":442,\"first_name\":\"Shel\",\"email\":\"sjeffcoatec9@e-recht24.de\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-10-18T22:44:57Z\"}\n{\"id\":443,\"first_name\":\"Gisele\",\"email\":\"gplenderleithca@hibu.com\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-11-15T06:20:33Z\"}\n{\"id\":444,\"first_name\":\"Devland\",\"email\":\"dthamescb@ebay.co.uk\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-09-08T21:56:53Z\"}\n{\"id\":445,\"first_name\":\"Leonardo\",\"email\":\"lbalazotcc@seesaa.net\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-09-10T05:34:02Z\"}\n{\"id\":446,\"first_name\":\"Evanne\",\"email\":\"epeyntuecd@smh.com.au\",\"job\":\"Developer IV\",\"timestamp\":\"2022-06-21T05:01:58Z\"}\n{\"id\":447,\"first_name\":\"Bran\",\"email\":\"beastmeadce@dot.gov\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-05-06T08:08:38Z\"}\n{\"id\":448,\"first_name\":\"Edmon\",\"email\":\"egoldbourncf@digg.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-04-30T09:53:46Z\"}\n{\"id\":449,\"first_name\":\"Dud\",\"email\":\"dmoscropcg@hexun.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-10-20T22:52:50Z\"}\n{\"id\":450,\"first_name\":\"Emilia\",\"email\":\"ethorroldch@multiply.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2022-04-28T08:39:13Z\"}\n{\"id\":451,\"first_name\":\"Jinny\",\"email\":\"jrosenwasserci@yelp.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-08-04T03:14:57Z\"}\n{\"id\":452,\"first_name\":\"Kerry\",\"email\":\"kfaltincj@techcrunch.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-03-18T15:12:10Z\"}\n{\"id\":453,\"first_name\":\"Moore\",\"email\":\"mclellck@amazon.de\",\"job\":\"Research Assistant I\",\"timestamp\":\"2022-11-15T22:30:36Z\"}\n{\"id\":454,\"first_name\":\"Aida\",\"email\":\"acrandoncl@nationalgeographic.com\",\"job\":\"Developer 
IV\",\"timestamp\":\"2022-07-17T20:29:10Z\"}\n{\"id\":455,\"first_name\":\"Appolonia\",\"email\":\"abragancacm@elegantthemes.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-11-28T09:14:18Z\"}\n{\"id\":456,\"first_name\":\"Alberik\",\"email\":\"afountiancn@github.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-02-07T14:48:02Z\"}\n{\"id\":457,\"first_name\":\"Dew\",\"email\":\"dimpettco@unesco.org\",\"job\":\"Research Nurse\",\"timestamp\":\"2022-03-11T09:31:20Z\"}\n{\"id\":458,\"first_name\":\"Abner\",\"email\":\"amacdougalcp@example.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-07-17T03:37:51Z\"}\n{\"id\":459,\"first_name\":\"Riordan\",\"email\":\"rgeecq@ustream.tv\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-07-15T08:22:35Z\"}\n{\"id\":460,\"first_name\":\"Lutero\",\"email\":\"lembletoncr@de.vu\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-06-11T06:31:58Z\"}\n{\"id\":461,\"first_name\":\"Maia\",\"email\":\"mdelaceycs@google.es\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-01-18T03:03:17Z\"}\n{\"id\":462,\"first_name\":\"Milka\",\"email\":\"mquenellct@reuters.com\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-09-05T04:42:07Z\"}\n{\"id\":463,\"first_name\":\"Shannen\",\"email\":\"smcevoycu@surveymonkey.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-11-01T00:56:44Z\"}\n{\"id\":464,\"first_name\":\"Ruby\",\"email\":\"rkantercv@twitter.com\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-07-21T07:23:51Z\"}\n{\"id\":465,\"first_name\":\"Ewan\",\"email\":\"eshellumcw@apple.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-01-26T17:12:30Z\"}\n{\"id\":466,\"first_name\":\"Randie\",\"email\":\"rkiernancx@freewebs.com\",\"job\":\"Software Test Engineer III\",\"timestamp\":\"2022-04-06T07:32:00Z\"}\n{\"id\":467,\"first_name\":\"Gwendolyn\",\"email\":\"gtattoocy@miitbeian.gov.cn\",\"job\":\"Administrative Assistant 
IV\",\"timestamp\":\"2022-04-13T22:40:15Z\"}\n{\"id\":468,\"first_name\":\"Pierson\",\"email\":\"phussycz@java.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2021-12-16T03:01:44Z\"}\n{\"id\":469,\"first_name\":\"Clare\",\"email\":\"cgrinikhinovd0@blogspot.com\",\"job\":\"Office Assistant I\",\"timestamp\":\"2022-01-22T12:28:11Z\"}\n{\"id\":470,\"first_name\":\"Lucie\",\"email\":\"lkillbyd1@e-recht24.de\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-05-03T06:52:32Z\"}\n{\"id\":471,\"first_name\":\"Wynn\",\"email\":\"warndtd2@ucoz.ru\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-11-30T03:02:47Z\"}\n{\"id\":472,\"first_name\":\"Homerus\",\"email\":\"hclemitsd3@japanpost.jp\",\"job\":\"Research Associate\",\"timestamp\":\"2022-08-19T04:53:40Z\"}\n{\"id\":473,\"first_name\":\"Trefor\",\"email\":\"tmulberyd4@wordpress.com\",\"job\":\"Accounting Assistant III\",\"timestamp\":\"2021-12-10T12:00:52Z\"}\n{\"id\":474,\"first_name\":\"Adan\",\"email\":\"aattridged5@hc360.com\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-03-25T14:25:46Z\"}\n{\"id\":475,\"first_name\":\"Gal\",\"email\":\"gfourcaded6@washington.edu\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-01-21T23:49:44Z\"}\n{\"id\":476,\"first_name\":\"Jasun\",\"email\":\"jchaveyd7@myspace.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-06-20T22:15:39Z\"}\n{\"id\":477,\"first_name\":\"Lanita\",\"email\":\"lpithied8@home.pl\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-01-18T21:49:32Z\"}\n{\"id\":478,\"first_name\":\"Blancha\",\"email\":\"bcarswelld9@cnbc.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-02-26T13:05:07Z\"}\n{\"id\":479,\"first_name\":\"Stormie\",\"email\":\"splimmerda@earthlink.net\",\"job\":\"Programmer Analyst III\",\"timestamp\":\"2022-02-14T19:55:10Z\"}\n{\"id\":480,\"first_name\":\"Jan\",\"email\":\"jlinklaterdb@ocn.ne.jp\",\"job\":\"Occupational 
Therapist\",\"timestamp\":\"2022-12-06T21:55:37Z\"}\n{\"id\":481,\"first_name\":\"Aggy\",\"email\":\"atargetterdc@amazon.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-09-07T03:22:50Z\"}\n{\"id\":482,\"first_name\":\"Alfy\",\"email\":\"abernotdd@newsvine.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-05-18T01:35:03Z\"}\n{\"id\":483,\"first_name\":\"Orelee\",\"email\":\"oferrandde@tumblr.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-08-23T22:05:49Z\"}\n{\"id\":484,\"first_name\":\"Jeana\",\"email\":\"jkhristyukhindf@purevolume.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-10-09T17:35:26Z\"}\n{\"id\":485,\"first_name\":\"Carline\",\"email\":\"cllewelyndg@wikipedia.org\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-08-05T01:00:13Z\"}\n{\"id\":486,\"first_name\":\"Etta\",\"email\":\"educhandh@oaic.gov.au\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-10-09T11:36:54Z\"}\n{\"id\":487,\"first_name\":\"Alphonso\",\"email\":\"astockoedi@last.fm\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-08-01T13:27:06Z\"}\n{\"id\":488,\"first_name\":\"Hunfredo\",\"email\":\"hrheltondj@163.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-01-17T01:23:28Z\"}\n{\"id\":489,\"first_name\":\"Cob\",\"email\":\"cellingforddk@princeton.edu\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-05-27T15:03:10Z\"}\n{\"id\":490,\"first_name\":\"Catlee\",\"email\":\"calennikovdl@so-net.ne.jp\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-07-14T16:37:24Z\"}\n{\"id\":491,\"first_name\":\"Renato\",\"email\":\"rsauntondm@google.ru\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-04-10T15:32:28Z\"}\n{\"id\":492,\"first_name\":\"Theodor\",\"email\":\"tbatissedn@dropbox.com\",\"job\":\"Automation Specialist III\",\"timestamp\":\"2022-02-08T14:25:15Z\"}\n{\"id\":493,\"first_name\":\"Freedman\",\"email\":\"fantonopoulosdo@vk.com\",\"job\":\"Human Resources Assistant 
IV\",\"timestamp\":\"2022-09-26T10:35:50Z\"}\n{\"id\":494,\"first_name\":\"Hendrick\",\"email\":\"hhazelgreavedp@wikispaces.com\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-06-07T00:51:55Z\"}\n{\"id\":495,\"first_name\":\"Rosella\",\"email\":\"rweatheydq@oaic.gov.au\",\"job\":\"Automation Specialist III\",\"timestamp\":\"2022-08-30T15:14:52Z\"}\n{\"id\":496,\"first_name\":\"Bradly\",\"email\":\"bkylesdr@pinterest.com\",\"job\":\"VP Accounting\",\"timestamp\":\"2021-12-13T16:34:39Z\"}\n{\"id\":497,\"first_name\":\"Nannette\",\"email\":\"nsileyds@va.gov\",\"job\":\"Research Associate\",\"timestamp\":\"2022-02-02T08:57:22Z\"}\n{\"id\":498,\"first_name\":\"Virgilio\",\"email\":\"vgarforthdt@exblog.jp\",\"job\":\"Nurse\",\"timestamp\":\"2022-08-24T12:03:17Z\"}\n{\"id\":499,\"first_name\":\"Talyah\",\"email\":\"tverrechiadu@seesaa.net\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-11-28T03:41:52Z\"}\n{\"id\":500,\"first_name\":\"Melisent\",\"email\":\"mwhooleydv@rakuten.co.jp\",\"job\":\"Project Manager\",\"timestamp\":\"2022-02-08T06:19:33Z\"}\n{\"id\":501,\"first_name\":\"Langston\",\"email\":\"lingerfielddw@jugem.jp\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-01-31T18:38:45Z\"}\n{\"id\":502,\"first_name\":\"Chase\",\"email\":\"clangdx@clickbank.net\",\"job\":\"Engineer IV\",\"timestamp\":\"2022-03-10T07:05:47Z\"}\n{\"id\":503,\"first_name\":\"Tobiah\",\"email\":\"tmughaldy@domainmarket.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-07-06T18:49:01Z\"}\n{\"id\":504,\"first_name\":\"Laird\",\"email\":\"lsalladz@vistaprint.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-11-19T14:59:08Z\"}\n{\"id\":505,\"first_name\":\"Rozamond\",\"email\":\"rwalerane0@soup.io\",\"job\":\"Developer I\",\"timestamp\":\"2022-06-15T04:38:16Z\"}\n{\"id\":506,\"first_name\":\"Eugine\",\"email\":\"emaccartane1@dedecms.com\",\"job\":\"Assistant 
Manager\",\"timestamp\":\"2022-03-12T12:49:39Z\"}\n{\"id\":507,\"first_name\":\"Norrie\",\"email\":\"nfeasleye2@networkadvertising.org\",\"job\":\"Human Resources Assistant III\",\"timestamp\":\"2022-07-12T04:43:23Z\"}\n{\"id\":508,\"first_name\":\"Lanny\",\"email\":\"lsnape3@discuz.net\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-11-11T15:50:09Z\"}\n{\"id\":509,\"first_name\":\"Othilia\",\"email\":\"ochaneye4@tmall.com\",\"job\":\"Media Manager I\",\"timestamp\":\"2022-11-16T12:51:48Z\"}\n{\"id\":510,\"first_name\":\"Rosette\",\"email\":\"rbauckhame5@de.vu\",\"job\":\"Office Assistant IV\",\"timestamp\":\"2021-12-19T06:17:48Z\"}\n{\"id\":511,\"first_name\":\"Shepperd\",\"email\":\"sburehille6@ed.gov\",\"job\":\"Media Manager I\",\"timestamp\":\"2022-05-22T21:52:36Z\"}\n{\"id\":512,\"first_name\":\"Leese\",\"email\":\"lparagreene7@mashable.com\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-05-09T16:08:14Z\"}\n{\"id\":513,\"first_name\":\"Nina\",\"email\":\"njiroutkae8@fema.gov\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-10-20T10:17:19Z\"}\n{\"id\":514,\"first_name\":\"Dianne\",\"email\":\"dshrubshalle9@so-net.ne.jp\",\"job\":\"Recruiter\",\"timestamp\":\"2022-10-29T06:12:38Z\"}\n{\"id\":515,\"first_name\":\"Melissa\",\"email\":\"mcourtierea@eepurl.com\",\"job\":\"Professor\",\"timestamp\":\"2022-11-04T13:37:54Z\"}\n{\"id\":516,\"first_name\":\"Fairlie\",\"email\":\"fbargeryeb@seesaa.net\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-06-08T14:28:16Z\"}\n{\"id\":517,\"first_name\":\"Lonnie\",\"email\":\"lclaigeec@over-blog.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-10-29T19:10:59Z\"}\n{\"id\":518,\"first_name\":\"Rik\",\"email\":\"rtedmaned@ask.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-02-11T01:58:13Z\"}\n{\"id\":519,\"first_name\":\"Frankie\",\"email\":\"fgreenrodee@theguardian.com\",\"job\":\"Research Assistant 
III\",\"timestamp\":\"2022-02-14T18:52:01Z\"}\n{\"id\":520,\"first_name\":\"Gib\",\"email\":\"gdurranef@state.tx.us\",\"job\":\"Accounting Assistant III\",\"timestamp\":\"2022-07-08T08:39:29Z\"}\n{\"id\":521,\"first_name\":\"Philippa\",\"email\":\"pchattoeg@opensource.org\",\"job\":\"General Manager\",\"timestamp\":\"2022-11-25T11:36:17Z\"}\n{\"id\":522,\"first_name\":\"Eugenio\",\"email\":\"eblundoneh@shinystat.com\",\"job\":\"Programmer II\",\"timestamp\":\"2022-09-21T15:58:17Z\"}\n{\"id\":523,\"first_name\":\"Rona\",\"email\":\"rhowsanei@newyorker.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-11-15T19:29:13Z\"}\n{\"id\":524,\"first_name\":\"Elise\",\"email\":\"ealgoreej@hao123.com\",\"job\":\"Professor\",\"timestamp\":\"2022-10-11T12:57:44Z\"}\n{\"id\":525,\"first_name\":\"Ertha\",\"email\":\"ewoofek@princeton.edu\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-06-02T07:23:49Z\"}\n{\"id\":526,\"first_name\":\"Marietta\",\"email\":\"mlawrieel@jalbum.net\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-09-09T15:08:59Z\"}\n{\"id\":527,\"first_name\":\"Meryl\",\"email\":\"mzupoem@github.io\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-05-26T05:33:16Z\"}\n{\"id\":528,\"first_name\":\"Jehanna\",\"email\":\"jastburyen@hp.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-07-10T23:08:44Z\"}\n{\"id\":529,\"first_name\":\"Rock\",\"email\":\"rsnowdoneo@mashable.com\",\"job\":\"Research Associate\",\"timestamp\":\"2022-09-10T07:52:36Z\"}\n{\"id\":530,\"first_name\":\"Genovera\",\"email\":\"gdemichettiep@weather.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-11-07T19:52:34Z\"}\n{\"id\":531,\"first_name\":\"Merlina\",\"email\":\"mwillinghameq@blog.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-09-26T16:53:26Z\"}\n{\"id\":532,\"first_name\":\"Susan\",\"email\":\"shundeller@flickr.com\",\"job\":\"Staff 
Scientist\",\"timestamp\":\"2022-02-28T07:56:11Z\"}\n{\"id\":533,\"first_name\":\"Dannye\",\"email\":\"dsheivelses@plala.or.jp\",\"job\":\"VP Accounting\",\"timestamp\":\"2021-12-15T21:11:19Z\"}\n{\"id\":534,\"first_name\":\"Jenn\",\"email\":\"jellumet@buzzfeed.com\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-05-21T04:32:34Z\"}\n{\"id\":535,\"first_name\":\"Giana\",\"email\":\"gfulfordeu@imdb.com\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-07-21T04:57:09Z\"}\n{\"id\":536,\"first_name\":\"Jere\",\"email\":\"jcavnorev@cisco.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-07-08T14:36:50Z\"}\n{\"id\":537,\"first_name\":\"Emiline\",\"email\":\"ekoschkeew@bbb.org\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-08-12T06:30:27Z\"}\n{\"id\":538,\"first_name\":\"Gaylene\",\"email\":\"gdinnegesex@dagondesign.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-06-10T16:56:18Z\"}\n{\"id\":539,\"first_name\":\"Oberon\",\"email\":\"obougeney@devhub.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-10-02T23:50:51Z\"}\n{\"id\":540,\"first_name\":\"Stephine\",\"email\":\"seusticeez@boston.com\",\"job\":\"Programmer III\",\"timestamp\":\"2022-04-24T01:38:58Z\"}\n{\"id\":541,\"first_name\":\"Lulita\",\"email\":\"lmunbyf0@redcross.org\",\"job\":\"Librarian\",\"timestamp\":\"2022-07-19T05:43:05Z\"}\n{\"id\":542,\"first_name\":\"Lemmie\",\"email\":\"lmcphelimeyf1@unc.edu\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-09-22T21:24:25Z\"}\n{\"id\":543,\"first_name\":\"Kimbra\",\"email\":\"kbarthroppf2@illinois.edu\",\"job\":\"Operator\",\"timestamp\":\"2022-07-01T19:30:08Z\"}\n{\"id\":544,\"first_name\":\"Clerkclaude\",\"email\":\"ccrasswellerf3@yandex.ru\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-06-17T21:08:02Z\"}\n{\"id\":545,\"first_name\":\"Lucilia\",\"email\":\"lraratyf4@shop-pro.jp\",\"job\":\"Programmer Analyst 
III\",\"timestamp\":\"2022-03-01T05:56:13Z\"}\n{\"id\":546,\"first_name\":\"Karla\",\"email\":\"ksabbinsf5@phoca.cz\",\"job\":\"General Manager\",\"timestamp\":\"2022-10-10T04:07:44Z\"}\n{\"id\":547,\"first_name\":\"Angelica\",\"email\":\"acuninghamf6@hibu.com\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-09-17T05:20:49Z\"}\n{\"id\":548,\"first_name\":\"Verine\",\"email\":\"vpriverf7@constantcontact.com\",\"job\":\"Automation Specialist III\",\"timestamp\":\"2022-11-02T21:38:47Z\"}\n{\"id\":549,\"first_name\":\"Meridel\",\"email\":\"malesof8@g.co\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-09-28T22:57:26Z\"}\n{\"id\":550,\"first_name\":\"Giulietta\",\"email\":\"gconeybearef9@bluehost.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-04-01T17:06:32Z\"}\n{\"id\":551,\"first_name\":\"Bartlet\",\"email\":\"bvisefa@nih.gov\",\"job\":\"Computer Systems Analyst III\",\"timestamp\":\"2022-03-26T23:42:57Z\"}\n{\"id\":552,\"first_name\":\"Coriss\",\"email\":\"clorrymanfb@posterous.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2021-12-31T22:46:51Z\"}\n{\"id\":553,\"first_name\":\"Eran\",\"email\":\"ecellifc@pinterest.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-07-30T08:12:57Z\"}\n{\"id\":554,\"first_name\":\"Zacharia\",\"email\":\"zvillaronfd@godaddy.com\",\"job\":\"Systems Administrator III\",\"timestamp\":\"2022-01-23T08:52:15Z\"}\n{\"id\":555,\"first_name\":\"Nico\",\"email\":\"nbirdwistlefe@noaa.gov\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-01-07T13:13:02Z\"}\n{\"id\":556,\"first_name\":\"Yardley\",\"email\":\"ygiannasiff@addthis.com\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-08-14T09:08:06Z\"}\n{\"id\":557,\"first_name\":\"Nina\",\"email\":\"nfeirnfg@icq.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-08-30T00:25:37Z\"}\n{\"id\":558,\"first_name\":\"Kenna\",\"email\":\"krichinfh@ucoz.ru\",\"job\":\"Mechanical Systems 
Engineer\",\"timestamp\":\"2021-12-27T08:39:15Z\"}\n{\"id\":559,\"first_name\":\"Murray\",\"email\":\"masplenfi@soup.io\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-06-06T21:51:39Z\"}\n{\"id\":560,\"first_name\":\"Maribeth\",\"email\":\"mbroschkefj@drupal.org\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-04-22T14:35:45Z\"}\n{\"id\":561,\"first_name\":\"Shae\",\"email\":\"sstorerfk@berkeley.edu\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-07-29T13:17:56Z\"}\n{\"id\":562,\"first_name\":\"Edik\",\"email\":\"esawreyfl@marketwatch.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2021-12-19T16:35:17Z\"}\n{\"id\":563,\"first_name\":\"Annabela\",\"email\":\"acowenfm@home.pl\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-03-31T12:06:02Z\"}\n{\"id\":564,\"first_name\":\"Barr\",\"email\":\"bgilmartinfn@walmart.com\",\"job\":\"Biostatistician IV\",\"timestamp\":\"2022-10-03T01:25:41Z\"}\n{\"id\":565,\"first_name\":\"Marten\",\"email\":\"mbothiefo@ameblo.jp\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-04-02T20:32:33Z\"}\n{\"id\":566,\"first_name\":\"Merle\",\"email\":\"mmahoodfp@de.vu\",\"job\":\"VP Sales\",\"timestamp\":\"2022-03-20T01:29:18Z\"}\n{\"id\":567,\"first_name\":\"Vere\",\"email\":\"vlapthornefq@uiuc.edu\",\"job\":\"Media Manager III\",\"timestamp\":\"2021-12-11T03:57:21Z\"}\n{\"id\":568,\"first_name\":\"Falkner\",\"email\":\"fbrucksteinfr@mozilla.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-03-11T12:52:49Z\"}\n{\"id\":569,\"first_name\":\"Cullie\",\"email\":\"cswallowfs@de.vu\",\"job\":\"Paralegal\",\"timestamp\":\"2022-02-01T10:37:02Z\"}\n{\"id\":570,\"first_name\":\"Vaclav\",\"email\":\"vtrowlft@stanford.edu\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-04-07T18:15:57Z\"}\n{\"id\":571,\"first_name\":\"Joanne\",\"email\":\"jmaccomiskeyfu@discovery.com\",\"job\":\"Analyst 
Programmer\",\"timestamp\":\"2022-08-10T15:01:00Z\"}\n{\"id\":572,\"first_name\":\"Britt\",\"email\":\"bnairefv@utexas.edu\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-06-26T09:45:46Z\"}\n{\"id\":573,\"first_name\":\"Bunnie\",\"email\":\"bdafyddfw@tinyurl.com\",\"job\":\"Programmer I\",\"timestamp\":\"2022-04-10T01:54:26Z\"}\n{\"id\":574,\"first_name\":\"Janean\",\"email\":\"jpinkardfx@jimdo.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-11-22T06:59:53Z\"}\n{\"id\":575,\"first_name\":\"Abbey\",\"email\":\"aagnewfy@independent.co.uk\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-03-06T08:48:48Z\"}\n{\"id\":576,\"first_name\":\"Aleen\",\"email\":\"atripeanfz@reuters.com\",\"job\":\"Office Assistant IV\",\"timestamp\":\"2022-05-16T06:52:18Z\"}\n{\"id\":577,\"first_name\":\"Crissie\",\"email\":\"ctiversg0@sakura.ne.jp\",\"job\":\"Office Assistant III\",\"timestamp\":\"2022-07-08T04:07:16Z\"}\n{\"id\":578,\"first_name\":\"Audra\",\"email\":\"alisciandrig1@acquirethisname.com\",\"job\":\"Professor\",\"timestamp\":\"2022-07-19T23:58:30Z\"}\n{\"id\":579,\"first_name\":\"Jasmina\",\"email\":\"jgillowg2@mozilla.org\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-02-21T10:07:20Z\"}\n{\"id\":580,\"first_name\":\"Wendi\",\"email\":\"wtolandg3@deliciousdays.com\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-09-26T02:02:11Z\"}\n{\"id\":581,\"first_name\":\"Marilee\",\"email\":\"mlejeang4@noaa.gov\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-07-08T14:40:55Z\"}\n{\"id\":582,\"first_name\":\"Rochelle\",\"email\":\"rrubinlichtg5@topsy.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-01-25T04:10:30Z\"}\n{\"id\":583,\"first_name\":\"Carolann\",\"email\":\"ctremonteg6@mtv.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2021-12-22T06:05:34Z\"}\n{\"id\":584,\"first_name\":\"Marika\",\"email\":\"mzumfeldeg7@hhs.gov\",\"job\":\"Engineer 
II\",\"timestamp\":\"2022-08-22T07:44:12Z\"}\n{\"id\":585,\"first_name\":\"Claiborn\",\"email\":\"crasherg8@bbc.co.uk\",\"job\":\"Computer Systems Analyst II\",\"timestamp\":\"2022-09-22T17:51:53Z\"}\n{\"id\":586,\"first_name\":\"Nonie\",\"email\":\"nitzcovichg9@npr.org\",\"job\":\"Computer Systems Analyst IV\",\"timestamp\":\"2022-11-02T11:37:11Z\"}\n{\"id\":587,\"first_name\":\"Ddene\",\"email\":\"dkeighlyga@alexa.com\",\"job\":\"Human Resources Assistant II\",\"timestamp\":\"2021-12-11T03:50:29Z\"}\n{\"id\":588,\"first_name\":\"Arlyn\",\"email\":\"amaystongb@timesonline.co.uk\",\"job\":\"Librarian\",\"timestamp\":\"2022-11-14T19:00:02Z\"}\n{\"id\":589,\"first_name\":\"Aaron\",\"email\":\"agallymoregc@taobao.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-11-06T06:54:00Z\"}\n{\"id\":590,\"first_name\":\"Jermaine\",\"email\":\"jdelwatergd@phoca.cz\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-04-02T22:52:38Z\"}\n{\"id\":591,\"first_name\":\"Robinetta\",\"email\":\"rmilingtonge@bizjournals.com\",\"job\":\"Human Resources Assistant III\",\"timestamp\":\"2021-12-08T04:09:29Z\"}\n{\"id\":592,\"first_name\":\"Hedi\",\"email\":\"htapsellgf@miibeian.gov.cn\",\"job\":\"Sales Representative\",\"timestamp\":\"2022-01-24T20:10:52Z\"}\n{\"id\":593,\"first_name\":\"Cookie\",\"email\":\"ckmieciakgg@aol.com\",\"job\":\"Account Representative IV\",\"timestamp\":\"2022-01-06T02:34:26Z\"}\n{\"id\":594,\"first_name\":\"Othilie\",\"email\":\"obredeegh@unblog.fr\",\"job\":\"Teacher\",\"timestamp\":\"2022-03-22T04:28:22Z\"}\n{\"id\":595,\"first_name\":\"Temp\",\"email\":\"tbenfordgi@disqus.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-08-04T07:54:26Z\"}\n{\"id\":596,\"first_name\":\"Noreen\",\"email\":\"nhawgj@timesonline.co.uk\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-08-16T02:03:26Z\"}\n{\"id\":597,\"first_name\":\"Gaylene\",\"email\":\"gdurbingk@myspace.com\",\"job\":\"Senior Sales 
Associate\",\"timestamp\":\"2022-01-05T01:43:57Z\"}\n{\"id\":598,\"first_name\":\"Katha\",\"email\":\"kbaumbergl@whitehouse.gov\",\"job\":\"Accounting Assistant IV\",\"timestamp\":\"2022-09-19T07:13:54Z\"}\n{\"id\":599,\"first_name\":\"Sisile\",\"email\":\"sgregangm@sbwire.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-02-15T23:30:09Z\"}\n{\"id\":600,\"first_name\":\"Flynn\",\"email\":\"feyckelberggn@prnewswire.com\",\"job\":\"General Manager\",\"timestamp\":\"2022-02-15T10:12:52Z\"}\n{\"id\":601,\"first_name\":\"Erda\",\"email\":\"elattingo@people.com.cn\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-04-25T00:18:55Z\"}\n{\"id\":602,\"first_name\":\"Annabelle\",\"email\":\"amulchronegp@diigo.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-07-27T02:47:06Z\"}\n{\"id\":603,\"first_name\":\"Etienne\",\"email\":\"ealmeidagq@ocn.ne.jp\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-01-01T19:43:33Z\"}\n{\"id\":604,\"first_name\":\"Kerrie\",\"email\":\"kproudmangr@tumblr.com\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2021-12-18T07:32:44Z\"}\n{\"id\":605,\"first_name\":\"Tilda\",\"email\":\"tlandsmangs@angelfire.com\",\"job\":\"General Manager\",\"timestamp\":\"2022-01-27T10:53:51Z\"}\n{\"id\":606,\"first_name\":\"Arabella\",\"email\":\"arobinsgt@prnewswire.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-02-16T16:13:51Z\"}\n{\"id\":607,\"first_name\":\"Alaster\",\"email\":\"arosewellgu@tuttocitta.it\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-05-10T00:05:46Z\"}\n{\"id\":608,\"first_name\":\"Devin\",\"email\":\"dbannergv@parallels.com\",\"job\":\"Human Resources Assistant III\",\"timestamp\":\"2022-08-09T19:53:47Z\"}\n{\"id\":609,\"first_name\":\"Nonah\",\"email\":\"nhallfordgw@bloglines.com\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-09-02T16:04:31Z\"}\n{\"id\":610,\"first_name\":\"Alberik\",\"email\":\"amceacherngx@fda.gov\",\"job\":\"Structural 
Engineer\",\"timestamp\":\"2021-12-12T13:48:05Z\"}\n{\"id\":611,\"first_name\":\"Chadd\",\"email\":\"caarongy@alibaba.com\",\"job\":\"Developer III\",\"timestamp\":\"2022-08-31T00:09:29Z\"}\n{\"id\":612,\"first_name\":\"Jammal\",\"email\":\"jdavydochkingz@flavors.me\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-10-04T14:58:25Z\"}\n{\"id\":613,\"first_name\":\"Bridie\",\"email\":\"bdebeauchamph0@dailymail.co.uk\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-07-04T01:26:31Z\"}\n{\"id\":614,\"first_name\":\"Nona\",\"email\":\"nobeneyh1@theguardian.com\",\"job\":\"Office Assistant III\",\"timestamp\":\"2021-12-31T06:38:53Z\"}\n{\"id\":615,\"first_name\":\"Mia\",\"email\":\"mswannellh2@blog.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-05-07T10:17:45Z\"}\n{\"id\":616,\"first_name\":\"Salem\",\"email\":\"stissingtonh3@gmpg.org\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-08-21T23:43:20Z\"}\n{\"id\":617,\"first_name\":\"Harli\",\"email\":\"hlanegranh4@trellian.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-08-25T16:34:25Z\"}\n{\"id\":618,\"first_name\":\"Corliss\",\"email\":\"ceuelsh5@ed.gov\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-11-18T03:47:20Z\"}\n{\"id\":619,\"first_name\":\"Clemens\",\"email\":\"cphebeeh6@constantcontact.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-10-08T08:47:23Z\"}\n{\"id\":620,\"first_name\":\"Maren\",\"email\":\"mscarreh7@aol.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-07-22T15:02:12Z\"}\n{\"id\":621,\"first_name\":\"Cad\",\"email\":\"cdivinyh8@cbc.ca\",\"job\":\"Assistant Manager\",\"timestamp\":\"2021-12-15T05:30:37Z\"}\n{\"id\":622,\"first_name\":\"Carrissa\",\"email\":\"ceverallh9@amazon.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-11-12T21:35:59Z\"}\n{\"id\":623,\"first_name\":\"Alejandra\",\"email\":\"askamalha@mashable.com\",\"job\":\"Marketing 
Manager\",\"timestamp\":\"2022-10-09T22:22:14Z\"}\n{\"id\":624,\"first_name\":\"Kip\",\"email\":\"kconnachanhb@apache.org\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-05-03T18:43:02Z\"}\n{\"id\":625,\"first_name\":\"Orland\",\"email\":\"orowenhc@eventbrite.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-04-10T09:38:40Z\"}\n{\"id\":626,\"first_name\":\"Victor\",\"email\":\"vleadleyhd@1688.com\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-06-22T19:00:46Z\"}\n{\"id\":627,\"first_name\":\"Elfrida\",\"email\":\"ebygravehe@diigo.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-03-01T08:38:49Z\"}\n{\"id\":628,\"first_name\":\"Stanislas\",\"email\":\"srandlesomehf@istockphoto.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-10-09T04:54:10Z\"}\n{\"id\":629,\"first_name\":\"Jake\",\"email\":\"jberkleyhg@smh.com.au\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-08-29T16:34:06Z\"}\n{\"id\":630,\"first_name\":\"Lissa\",\"email\":\"llourenshh@mit.edu\",\"job\":\"Operator\",\"timestamp\":\"2022-03-31T02:56:51Z\"}\n{\"id\":631,\"first_name\":\"Fidelia\",\"email\":\"fmendoncahi@cafepress.com\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-10-10T17:58:04Z\"}\n{\"id\":632,\"first_name\":\"Magdaia\",\"email\":\"mpartingtonhj@cbsnews.com\",\"job\":\"Nurse\",\"timestamp\":\"2022-05-22T11:17:36Z\"}\n{\"id\":633,\"first_name\":\"Saunderson\",\"email\":\"sarnfieldhk@icq.com\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-03-17T04:16:50Z\"}\n{\"id\":634,\"first_name\":\"Morie\",\"email\":\"mnearyhl@shop-pro.jp\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-12-06T19:53:49Z\"}\n{\"id\":635,\"first_name\":\"Ulrick\",\"email\":\"umellhuishhm@a8.net\",\"job\":\"Software Test Engineer II\",\"timestamp\":\"2022-10-10T04:34:01Z\"}\n{\"id\":636,\"first_name\":\"Netta\",\"email\":\"nlamswoodhn@mail.ru\",\"job\":\"Human Resources Assistant 
IV\",\"timestamp\":\"2021-12-16T01:49:11Z\"}\n{\"id\":637,\"first_name\":\"Amelina\",\"email\":\"aravenscroftho@godaddy.com\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-09-17T20:31:27Z\"}\n{\"id\":638,\"first_name\":\"Marybeth\",\"email\":\"mfrenzlhp@ed.gov\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-09-19T23:47:19Z\"}\n{\"id\":639,\"first_name\":\"Viva\",\"email\":\"vdonsonhq@hubpages.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-08-24T14:49:26Z\"}\n{\"id\":640,\"first_name\":\"Zea\",\"email\":\"zbercherhr@accuweather.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-03-30T19:15:59Z\"}\n{\"id\":641,\"first_name\":\"Sonia\",\"email\":\"sjoreths@joomla.org\",\"job\":\"Paralegal\",\"timestamp\":\"2022-12-06T16:14:12Z\"}\n{\"id\":642,\"first_name\":\"Foster\",\"email\":\"fblaxlandeht@google.co.uk\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-06-30T05:18:46Z\"}\n{\"id\":643,\"first_name\":\"Ema\",\"email\":\"etorresehu@globo.com\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-08-13T07:21:00Z\"}\n{\"id\":644,\"first_name\":\"Venus\",\"email\":\"vminchellahv@woothemes.com\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-05-07T02:11:33Z\"}\n{\"id\":645,\"first_name\":\"Cherilyn\",\"email\":\"cjenesshw@so-net.ne.jp\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-12-02T11:35:03Z\"}\n{\"id\":646,\"first_name\":\"Carolee\",\"email\":\"cwyehx@miibeian.gov.cn\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-11-06T02:43:51Z\"}\n{\"id\":647,\"first_name\":\"Tybi\",\"email\":\"tburrhy@geocities.jp\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-02-13T08:57:59Z\"}\n{\"id\":648,\"first_name\":\"Karoly\",\"email\":\"kmangeonhz@github.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-11-01T03:32:14Z\"}\n{\"id\":649,\"first_name\":\"Wandie\",\"email\":\"wklimentyevi0@redcross.org\",\"job\":\"Associate 
Professor\",\"timestamp\":\"2022-09-15T12:54:45Z\"}\n{\"id\":650,\"first_name\":\"Bonny\",\"email\":\"bsneesbyi1@tinyurl.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-05-16T02:52:09Z\"}\n{\"id\":651,\"first_name\":\"Minna\",\"email\":\"mmcgluei2@meetup.com\",\"job\":\"Developer III\",\"timestamp\":\"2022-10-11T11:00:55Z\"}\n{\"id\":652,\"first_name\":\"Cleo\",\"email\":\"cbillsoni3@php.net\",\"job\":\"Developer III\",\"timestamp\":\"2022-03-23T02:20:12Z\"}\n{\"id\":653,\"first_name\":\"Glendon\",\"email\":\"gwrankmorei4@japanpost.jp\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-01-06T01:37:38Z\"}\n{\"id\":654,\"first_name\":\"Darn\",\"email\":\"ddunsirei5@businessweek.com\",\"job\":\"Programmer Analyst I\",\"timestamp\":\"2022-05-17T11:14:20Z\"}\n{\"id\":655,\"first_name\":\"Bernice\",\"email\":\"bhrachoveci6@guardian.co.uk\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-06-12T01:14:43Z\"}\n{\"id\":656,\"first_name\":\"Benoite\",\"email\":\"bgregoni7@bbc.co.uk\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-05-14T20:00:31Z\"}\n{\"id\":657,\"first_name\":\"Nicol\",\"email\":\"nogleviei8@nps.gov\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-07-30T12:09:37Z\"}\n{\"id\":658,\"first_name\":\"Desmond\",\"email\":\"desleyi9@craigslist.org\",\"job\":\"Safety Technician III\",\"timestamp\":\"2022-07-29T10:46:51Z\"}\n{\"id\":659,\"first_name\":\"Simone\",\"email\":\"sdonaghieia@yelp.com\",\"job\":\"Health Coach II\",\"timestamp\":\"2022-10-27T09:06:51Z\"}\n{\"id\":660,\"first_name\":\"Lynn\",\"email\":\"lmctrustamib@mail.ru\",\"job\":\"Software Test Engineer I\",\"timestamp\":\"2022-08-24T03:37:20Z\"}\n{\"id\":661,\"first_name\":\"Jerri\",\"email\":\"jledekeric@mail.ru\",\"job\":\"Senior Developer\",\"timestamp\":\"2022-09-06T19:35:20Z\"}\n{\"id\":662,\"first_name\":\"Cristal\",\"email\":\"cjochananyid@wp.com\",\"job\":\"Sales 
Representative\",\"timestamp\":\"2022-06-12T06:52:58Z\"}\n{\"id\":663,\"first_name\":\"Caye\",\"email\":\"cbirdseyie@amazon.de\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-06-02T00:42:50Z\"}\n{\"id\":664,\"first_name\":\"Tamma\",\"email\":\"tredheadif@friendfeed.com\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-11-27T20:41:48Z\"}\n{\"id\":665,\"first_name\":\"Delaney\",\"email\":\"dabbettig@umich.edu\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-09-05T10:44:54Z\"}\n{\"id\":666,\"first_name\":\"Henka\",\"email\":\"hvondrasekih@prweb.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-04-09T19:42:27Z\"}\n{\"id\":667,\"first_name\":\"Martie\",\"email\":\"mjandourekii@sphinn.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-11-02T22:27:58Z\"}\n{\"id\":668,\"first_name\":\"Adelle\",\"email\":\"ariddlesdenij@netlog.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-10-12T03:05:23Z\"}\n{\"id\":669,\"first_name\":\"Andee\",\"email\":\"abandeyik@guardian.co.uk\",\"job\":\"Librarian\",\"timestamp\":\"2022-08-07T13:11:22Z\"}\n{\"id\":670,\"first_name\":\"Hollis\",\"email\":\"hmacgrueril@wired.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-10-06T05:09:28Z\"}\n{\"id\":671,\"first_name\":\"Dona\",\"email\":\"dselesnickim@yandex.ru\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-07-18T09:15:16Z\"}\n{\"id\":672,\"first_name\":\"Siffre\",\"email\":\"smaliffein@hugedomains.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-07-19T13:05:02Z\"}\n{\"id\":673,\"first_name\":\"Gwenny\",\"email\":\"gfeighryio@go.com\",\"job\":\"Automation Specialist IV\",\"timestamp\":\"2022-03-07T03:15:33Z\"}\n{\"id\":674,\"first_name\":\"Paxon\",\"email\":\"pcoplandip@blogs.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-09-10T10:45:06Z\"}\n{\"id\":675,\"first_name\":\"Fredric\",\"email\":\"fohanneniq@livejournal.com\",\"job\":\"Director of 
Sales\",\"timestamp\":\"2022-10-30T18:56:02Z\"}\n{\"id\":676,\"first_name\":\"Enoch\",\"email\":\"ekenningleyir@sciencedaily.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-01-11T16:42:07Z\"}\n{\"id\":677,\"first_name\":\"Farand\",\"email\":\"ffassonis@theglobeandmail.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-08-17T01:48:24Z\"}\n{\"id\":678,\"first_name\":\"Gypsy\",\"email\":\"gbristoeit@tumblr.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-02-02T15:22:39Z\"}\n{\"id\":679,\"first_name\":\"Lewie\",\"email\":\"lskiltoniu@yolasite.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-11-02T17:38:40Z\"}\n{\"id\":680,\"first_name\":\"Elnora\",\"email\":\"evogeliv@geocities.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-05-14T16:28:29Z\"}\n{\"id\":681,\"first_name\":\"Humfrey\",\"email\":\"htethacotiw@springer.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-01-18T00:36:58Z\"}\n{\"id\":682,\"first_name\":\"Eadith\",\"email\":\"eespadasix@mit.edu\",\"job\":\"Web Designer III\",\"timestamp\":\"2021-12-19T13:39:40Z\"}\n{\"id\":683,\"first_name\":\"Winne\",\"email\":\"wdunrigeiy@zdnet.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-03-12T11:35:51Z\"}\n{\"id\":684,\"first_name\":\"Ewell\",\"email\":\"ewestbyiz@cargocollective.com\",\"job\":\"Research Assistant III\",\"timestamp\":\"2022-05-24T04:23:15Z\"}\n{\"id\":685,\"first_name\":\"Johan\",\"email\":\"jjoej0@mac.com\",\"job\":\"Developer I\",\"timestamp\":\"2022-11-14T13:44:24Z\"}\n{\"id\":686,\"first_name\":\"Palmer\",\"email\":\"phassenj1@pagesperso-orange.fr\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-04-11T09:12:42Z\"}\n{\"id\":687,\"first_name\":\"Gabriel\",\"email\":\"gwyllcocksj2@ox.ac.uk\",\"job\":\"Nurse 
Practicioner\",\"timestamp\":\"2022-02-21T15:51:39Z\"}\n{\"id\":688,\"first_name\":\"Gustavus\",\"email\":\"gwardropj3@ca.gov\",\"job\":\"Librarian\",\"timestamp\":\"2021-12-23T05:56:10Z\"}\n{\"id\":689,\"first_name\":\"Sharl\",\"email\":\"srabbagej4@artisteer.com\",\"job\":\"Marketing Manager\",\"timestamp\":\"2022-02-07T10:20:00Z\"}\n{\"id\":690,\"first_name\":\"Lexi\",\"email\":\"ldumphryj5@vistaprint.com\",\"job\":\"Accountant III\",\"timestamp\":\"2022-09-23T07:06:14Z\"}\n{\"id\":691,\"first_name\":\"Garrot\",\"email\":\"gfydoej6@ucla.edu\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-01-26T10:00:47Z\"}\n{\"id\":692,\"first_name\":\"Aida\",\"email\":\"akissackj7@delicious.com\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-03-12T19:37:00Z\"}\n{\"id\":693,\"first_name\":\"Quillan\",\"email\":\"qibesonj8@marriott.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-08-07T09:42:27Z\"}\n{\"id\":694,\"first_name\":\"Ailey\",\"email\":\"awimmersj9@dion.ne.jp\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-04-20T23:12:01Z\"}\n{\"id\":695,\"first_name\":\"Corbett\",\"email\":\"ctancockja@prlog.org\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-09-26T06:28:39Z\"}\n{\"id\":696,\"first_name\":\"Aubry\",\"email\":\"agarrudjb@sphinn.com\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-07-08T15:40:52Z\"}\n{\"id\":697,\"first_name\":\"Lyndsay\",\"email\":\"lbarrelljc@aboutads.info\",\"job\":\"Administrative Assistant III\",\"timestamp\":\"2022-11-14T23:43:47Z\"}\n{\"id\":698,\"first_name\":\"Ezekiel\",\"email\":\"espeedinjd@who.int\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-06-07T23:27:06Z\"}\n{\"id\":699,\"first_name\":\"Monah\",\"email\":\"mwittmanje@forbes.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-02-27T22:07:29Z\"}\n{\"id\":700,\"first_name\":\"Tammi\",\"email\":\"tpetticrewjf@yandex.ru\",\"job\":\"Programmer Analyst 
III\",\"timestamp\":\"2022-03-21T08:41:59Z\"}\n{\"id\":701,\"first_name\":\"Brynne\",\"email\":\"bbondleyjg@admin.ch\",\"job\":\"Analyst Programmer\",\"timestamp\":\"2022-08-08T19:25:12Z\"}\n{\"id\":702,\"first_name\":\"Mirilla\",\"email\":\"mrollinsonjh@twitter.com\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-05-23T06:08:30Z\"}\n{\"id\":703,\"first_name\":\"Archibald\",\"email\":\"aandricji@icq.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-08-01T05:10:36Z\"}\n{\"id\":704,\"first_name\":\"Mozes\",\"email\":\"mcawoodjj@flickr.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-05-01T22:55:12Z\"}\n{\"id\":705,\"first_name\":\"Delcine\",\"email\":\"dcornickjk@amazon.de\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-11-23T10:16:23Z\"}\n{\"id\":706,\"first_name\":\"Jill\",\"email\":\"jhandesjl@netscape.com\",\"job\":\"Programmer IV\",\"timestamp\":\"2022-05-04T07:01:23Z\"}\n{\"id\":707,\"first_name\":\"Eleonore\",\"email\":\"ecoxenjm@histats.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-09-17T04:17:56Z\"}\n{\"id\":708,\"first_name\":\"Gabrila\",\"email\":\"gkeilingjn@zdnet.com\",\"job\":\"Legal Assistant\",\"timestamp\":\"2021-12-12T09:11:37Z\"}\n{\"id\":709,\"first_name\":\"Nalani\",\"email\":\"nleathesjo@yandex.ru\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-07-17T15:32:26Z\"}\n{\"id\":710,\"first_name\":\"Brittany\",\"email\":\"battenbarrowjp@cocolog-nifty.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2022-07-20T13:14:17Z\"}\n{\"id\":711,\"first_name\":\"Nickola\",\"email\":\"nnormanvillejq@seesaa.net\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-02-05T08:16:47Z\"}\n{\"id\":712,\"first_name\":\"Gian\",\"email\":\"gneelyjr@de.vu\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2022-03-01T08:02:02Z\"}\n{\"id\":713,\"first_name\":\"Merill\",\"email\":\"mgaveltonejs@cbslocal.com\",\"job\":\"Quality 
Engineer\",\"timestamp\":\"2022-01-04T05:48:29Z\"}\n{\"id\":714,\"first_name\":\"Salomon\",\"email\":\"sambrogijt@exblog.jp\",\"job\":\"Geological Engineer\",\"timestamp\":\"2022-03-03T16:21:14Z\"}\n{\"id\":715,\"first_name\":\"Mamie\",\"email\":\"myaninju@clickbank.net\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-03-02T06:21:14Z\"}\n{\"id\":716,\"first_name\":\"Arel\",\"email\":\"acushejv@usnews.com\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-03-13T18:33:20Z\"}\n{\"id\":717,\"first_name\":\"Benjamin\",\"email\":\"bromanskijw@pinterest.com\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-07-08T14:39:45Z\"}\n{\"id\":718,\"first_name\":\"Woody\",\"email\":\"wfrancesconejx@friendfeed.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-02-02T21:45:59Z\"}\n{\"id\":719,\"first_name\":\"Annabal\",\"email\":\"atrewhittjy@ameblo.jp\",\"job\":\"Teacher\",\"timestamp\":\"2022-09-17T16:09:41Z\"}\n{\"id\":720,\"first_name\":\"Kasper\",\"email\":\"kweightjz@wikia.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-08-06T22:31:35Z\"}\n{\"id\":721,\"first_name\":\"Sylvan\",\"email\":\"stumasiank0@theglobeandmail.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-03-17T18:22:35Z\"}\n{\"id\":722,\"first_name\":\"Helga\",\"email\":\"hcocklek1@slate.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-10-27T22:15:09Z\"}\n{\"id\":723,\"first_name\":\"Dottie\",\"email\":\"dpiggottk2@adobe.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-05-01T14:02:18Z\"}\n{\"id\":724,\"first_name\":\"Brant\",\"email\":\"bwookeyk3@ucsd.edu\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-08-10T08:04:15Z\"}\n{\"id\":725,\"first_name\":\"Jojo\",\"email\":\"jbeurichk4@linkedin.com\",\"job\":\"Professor\",\"timestamp\":\"2022-03-29T06:28:53Z\"}\n{\"id\":726,\"first_name\":\"Daryl\",\"email\":\"dcromlyk5@spotify.com\",\"job\":\"Tax 
Accountant\",\"timestamp\":\"2022-02-11T22:39:20Z\"}\n{\"id\":727,\"first_name\":\"Eadmund\",\"email\":\"emelmethk6@google.com.au\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-04-19T04:20:44Z\"}\n{\"id\":728,\"first_name\":\"Gaston\",\"email\":\"gjaneczekk7@mashable.com\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-02-09T19:01:05Z\"}\n{\"id\":729,\"first_name\":\"Jody\",\"email\":\"jhansleyk8@edublogs.org\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-04-19T23:15:14Z\"}\n{\"id\":730,\"first_name\":\"Chelsy\",\"email\":\"cdimblebeek9@vk.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-02-18T17:49:47Z\"}\n{\"id\":731,\"first_name\":\"Lilias\",\"email\":\"lburchka@oaic.gov.au\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-05-19T21:46:19Z\"}\n{\"id\":732,\"first_name\":\"Godiva\",\"email\":\"gmolloykb@ibm.com\",\"job\":\"Systems Administrator II\",\"timestamp\":\"2022-05-17T17:39:48Z\"}\n{\"id\":733,\"first_name\":\"Faye\",\"email\":\"fbucknillkc@hc360.com\",\"job\":\"Web Developer IV\",\"timestamp\":\"2022-02-22T20:39:11Z\"}\n{\"id\":734,\"first_name\":\"Rickie\",\"email\":\"rdemezakd@tiny.cc\",\"job\":\"General Manager\",\"timestamp\":\"2022-06-09T10:15:05Z\"}\n{\"id\":735,\"first_name\":\"Jessi\",\"email\":\"jkingsnoadke@aboutads.info\",\"job\":\"Programmer II\",\"timestamp\":\"2022-08-19T22:15:39Z\"}\n{\"id\":736,\"first_name\":\"Marice\",\"email\":\"mmacdaidkf@omniture.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2021-12-10T07:04:23Z\"}\n{\"id\":737,\"first_name\":\"Sig\",\"email\":\"sbarffordkg@wufoo.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-06-20T01:48:52Z\"}\n{\"id\":738,\"first_name\":\"Ashia\",\"email\":\"ahulettkh@pcworld.com\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-07-18T11:58:34Z\"}\n{\"id\":739,\"first_name\":\"Alisa\",\"email\":\"ascandrickki@businessinsider.com\",\"job\":\"Accountant 
III\",\"timestamp\":\"2022-10-05T04:12:34Z\"}\n{\"id\":740,\"first_name\":\"Janella\",\"email\":\"jfranzelinikj@jimdo.com\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-01-21T13:47:46Z\"}\n{\"id\":741,\"first_name\":\"Bondie\",\"email\":\"bhryskiewiczkk@ucla.edu\",\"job\":\"Safety Technician II\",\"timestamp\":\"2022-10-15T19:50:53Z\"}\n{\"id\":742,\"first_name\":\"Vivyan\",\"email\":\"vcardnellkl@shareasale.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-04-18T16:31:43Z\"}\n{\"id\":743,\"first_name\":\"Mylo\",\"email\":\"miviekm@hhs.gov\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-06-08T12:03:06Z\"}\n{\"id\":744,\"first_name\":\"Thomas\",\"email\":\"tshawekn@cnbc.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2021-12-11T11:54:22Z\"}\n{\"id\":745,\"first_name\":\"Jessalin\",\"email\":\"jgarthlandko@canalblog.com\",\"job\":\"Chemical Engineer\",\"timestamp\":\"2022-11-23T08:39:58Z\"}\n{\"id\":746,\"first_name\":\"Joannes\",\"email\":\"jwhitcombkp@odnoklassniki.ru\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-01-22T22:10:33Z\"}\n{\"id\":747,\"first_name\":\"Morlee\",\"email\":\"mellisskq@virginia.edu\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-01-07T13:48:52Z\"}\n{\"id\":748,\"first_name\":\"Karalynn\",\"email\":\"koshevlinkr@japanpost.jp\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-11-28T08:30:42Z\"}\n{\"id\":749,\"first_name\":\"Phedra\",\"email\":\"pgrigoriscuks@mayoclinic.com\",\"job\":\"VP Accounting\",\"timestamp\":\"2021-12-19T22:37:59Z\"}\n{\"id\":750,\"first_name\":\"Denna\",\"email\":\"ddeluzekt@freewebs.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-09-01T10:42:32Z\"}\n{\"id\":751,\"first_name\":\"Audrye\",\"email\":\"aburghallku@behance.net\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-02-22T05:02:58Z\"}\n{\"id\":752,\"first_name\":\"Dane\",\"email\":\"dsharpekv@comcast.net\",\"job\":\"VP 
Sales\",\"timestamp\":\"2022-11-14T11:52:28Z\"}\n{\"id\":753,\"first_name\":\"Nye\",\"email\":\"nmulcasterkw@wikia.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-08-09T11:06:13Z\"}\n{\"id\":754,\"first_name\":\"Dame\",\"email\":\"dmacenellykx@technorati.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-03-16T13:16:49Z\"}\n{\"id\":755,\"first_name\":\"Nevsa\",\"email\":\"nkurdaniky@moonfruit.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-07-30T09:04:58Z\"}\n{\"id\":756,\"first_name\":\"Ronni\",\"email\":\"rstarfordkz@aboutads.info\",\"job\":\"Payment Adjustment Coordinator\",\"timestamp\":\"2022-06-21T05:54:44Z\"}\n{\"id\":757,\"first_name\":\"Hyacintha\",\"email\":\"hcallabyl0@ask.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-03-09T00:34:02Z\"}\n{\"id\":758,\"first_name\":\"Obidiah\",\"email\":\"odougherl1@eventbrite.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-05-12T12:34:37Z\"}\n{\"id\":759,\"first_name\":\"Terrill\",\"email\":\"thaddacksl2@businesswire.com\",\"job\":\"Accounting Assistant II\",\"timestamp\":\"2022-12-02T04:29:19Z\"}\n{\"id\":760,\"first_name\":\"Robby\",\"email\":\"rhurdl3@i2i.jp\",\"job\":\"Budget/Accounting Analyst IV\",\"timestamp\":\"2022-09-28T12:20:36Z\"}\n{\"id\":761,\"first_name\":\"Blayne\",\"email\":\"bbebbelll4@dailymotion.com\",\"job\":\"Administrative Assistant II\",\"timestamp\":\"2022-10-22T13:16:00Z\"}\n{\"id\":762,\"first_name\":\"Viki\",\"email\":\"vmcmylerl5@mtv.com\",\"job\":\"General Manager\",\"timestamp\":\"2022-09-24T05:32:27Z\"}\n{\"id\":763,\"first_name\":\"Erhard\",\"email\":\"epengellyl6@slate.com\",\"job\":\"Database Administrator I\",\"timestamp\":\"2022-12-02T07:39:09Z\"}\n{\"id\":764,\"first_name\":\"Weylin\",\"email\":\"wpesekl7@moonfruit.com\",\"job\":\"Accounting Assistant IV\",\"timestamp\":\"2022-08-18T01:40:24Z\"}\n{\"id\":765,\"first_name\":\"Garwin\",\"email\":\"gspikinsl8@hhs.gov\",\"job\":\"Computer Systems Analyst 
III\",\"timestamp\":\"2022-03-01T04:22:37Z\"}\n{\"id\":766,\"first_name\":\"Frederique\",\"email\":\"fpellingl9@unc.edu\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-06-24T21:08:43Z\"}\n{\"id\":767,\"first_name\":\"Terence\",\"email\":\"tbardsleyla@i2i.jp\",\"job\":\"Teacher\",\"timestamp\":\"2022-09-12T16:41:01Z\"}\n{\"id\":768,\"first_name\":\"Reynold\",\"email\":\"rgiovanninilb@fotki.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-04-12T21:15:43Z\"}\n{\"id\":769,\"first_name\":\"Ethel\",\"email\":\"ewhightmanlc@scribd.com\",\"job\":\"Tax Accountant\",\"timestamp\":\"2022-04-22T18:35:04Z\"}\n{\"id\":770,\"first_name\":\"Larry\",\"email\":\"lchavezld@amazon.co.uk\",\"job\":\"Business Systems Development Analyst\",\"timestamp\":\"2022-07-15T05:25:54Z\"}\n{\"id\":771,\"first_name\":\"Tessy\",\"email\":\"tbenle@wisc.edu\",\"job\":\"VP Sales\",\"timestamp\":\"2022-03-16T21:30:37Z\"}\n{\"id\":772,\"first_name\":\"Dane\",\"email\":\"dmatterfacelf@deliciousdays.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-11-25T10:41:02Z\"}\n{\"id\":773,\"first_name\":\"Kleon\",\"email\":\"ksurgenerlg@tinypic.com\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-02-13T20:28:16Z\"}\n{\"id\":774,\"first_name\":\"Nicolea\",\"email\":\"nnequestlh@dmoz.org\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-03-31T01:03:41Z\"}\n{\"id\":775,\"first_name\":\"Bevvy\",\"email\":\"bsavellli@ucoz.ru\",\"job\":\"Electrical Engineer\",\"timestamp\":\"2022-10-25T06:01:02Z\"}\n{\"id\":776,\"first_name\":\"Joel\",\"email\":\"jnottilj@amazonaws.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-07-15T05:05:24Z\"}\n{\"id\":777,\"first_name\":\"Mindy\",\"email\":\"mpinnockelk@squidoo.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-06-30T00:26:25Z\"}\n{\"id\":778,\"first_name\":\"Jerad\",\"email\":\"jgallierll@digg.com\",\"job\":\"Sales 
Associate\",\"timestamp\":\"2022-01-21T20:28:29Z\"}\n{\"id\":779,\"first_name\":\"Milissent\",\"email\":\"mbackslm@spotify.com\",\"job\":\"Research Assistant I\",\"timestamp\":\"2022-01-19T06:38:59Z\"}\n{\"id\":780,\"first_name\":\"Callean\",\"email\":\"cradborneln@ft.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-08-20T09:47:35Z\"}\n{\"id\":781,\"first_name\":\"Gilbertina\",\"email\":\"gzorzonilo@alibaba.com\",\"job\":\"Cost Accountant\",\"timestamp\":\"2022-01-09T19:52:20Z\"}\n{\"id\":782,\"first_name\":\"Thain\",\"email\":\"tlevenslp@zimbio.com\",\"job\":\"Media Manager II\",\"timestamp\":\"2022-05-31T00:22:07Z\"}\n{\"id\":783,\"first_name\":\"Lem\",\"email\":\"lrylattlq@yellowpages.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-05-12T17:04:30Z\"}\n{\"id\":784,\"first_name\":\"Lillian\",\"email\":\"ltanslylr@sourceforge.net\",\"job\":\"Human Resources Assistant III\",\"timestamp\":\"2022-07-02T03:01:56Z\"}\n{\"id\":785,\"first_name\":\"Smith\",\"email\":\"smeinls@biglobe.ne.jp\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-02-07T00:50:30Z\"}\n{\"id\":786,\"first_name\":\"Meris\",\"email\":\"mterbruggenlt@livejournal.com\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2022-07-27T16:40:12Z\"}\n{\"id\":787,\"first_name\":\"Leesa\",\"email\":\"lraeburnlu@tmall.com\",\"job\":\"Operator\",\"timestamp\":\"2022-02-22T06:34:25Z\"}\n{\"id\":788,\"first_name\":\"Brennen\",\"email\":\"bbowartlv@cdc.gov\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-11-08T23:11:20Z\"}\n{\"id\":789,\"first_name\":\"Fabian\",\"email\":\"fravenslw@linkedin.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-07-18T23:53:45Z\"}\n{\"id\":790,\"first_name\":\"Amelina\",\"email\":\"alandmanlx@ted.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-01-13T23:18:20Z\"}\n{\"id\":791,\"first_name\":\"Mozelle\",\"email\":\"meldertonly@ezinearticles.com\",\"job\":\"Media Manager 
II\",\"timestamp\":\"2022-11-03T14:12:26Z\"}\n{\"id\":792,\"first_name\":\"Tobit\",\"email\":\"tserjentlz@elegantthemes.com\",\"job\":\"Database Administrator I\",\"timestamp\":\"2022-07-01T12:08:46Z\"}\n{\"id\":793,\"first_name\":\"Tanner\",\"email\":\"tpauleym0@stumbleupon.com\",\"job\":\"Senior Cost Accountant\",\"timestamp\":\"2022-01-29T14:16:25Z\"}\n{\"id\":794,\"first_name\":\"Kristin\",\"email\":\"kdukerm1@sohu.com\",\"job\":\"Administrative Assistant IV\",\"timestamp\":\"2022-02-07T10:44:11Z\"}\n{\"id\":795,\"first_name\":\"Ware\",\"email\":\"wchellm2@rediff.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-02-17T13:59:09Z\"}\n{\"id\":796,\"first_name\":\"Berrie\",\"email\":\"begglem3@surveymonkey.com\",\"job\":\"Accounting Assistant I\",\"timestamp\":\"2022-06-19T18:52:37Z\"}\n{\"id\":797,\"first_name\":\"Andros\",\"email\":\"asprullsm4@ifeng.com\",\"job\":\"Account Executive\",\"timestamp\":\"2022-01-17T22:42:36Z\"}\n{\"id\":798,\"first_name\":\"Saidee\",\"email\":\"swiffillm5@blogspot.com\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-05-25T02:52:49Z\"}\n{\"id\":799,\"first_name\":\"Susanna\",\"email\":\"sbrentnallm6@vistaprint.com\",\"job\":\"VP Product Management\",\"timestamp\":\"2022-04-17T11:27:28Z\"}\n{\"id\":800,\"first_name\":\"Micheline\",\"email\":\"mwalbrookm7@nih.gov\",\"job\":\"Statistician II\",\"timestamp\":\"2022-08-07T03:20:05Z\"}\n{\"id\":801,\"first_name\":\"Putnem\",\"email\":\"pnevillm8@studiopress.com\",\"job\":\"Professor\",\"timestamp\":\"2022-03-05T15:31:28Z\"}\n{\"id\":802,\"first_name\":\"Hinze\",\"email\":\"hboocockm9@amazon.co.jp\",\"job\":\"Human Resources Assistant II\",\"timestamp\":\"2022-02-05T09:35:18Z\"}\n{\"id\":803,\"first_name\":\"Ketti\",\"email\":\"kbritcherma@instagram.com\",\"job\":\"Administrative 
Officer\",\"timestamp\":\"2022-07-17T06:02:28Z\"}\n{\"id\":804,\"first_name\":\"Nevins\",\"email\":\"nyanshinmb@surveymonkey.com\",\"job\":\"Operator\",\"timestamp\":\"2022-08-07T10:38:06Z\"}\n{\"id\":805,\"first_name\":\"Scott\",\"email\":\"sgallafantmc@amazonaws.com\",\"job\":\"Statistician III\",\"timestamp\":\"2022-08-26T17:02:44Z\"}\n{\"id\":806,\"first_name\":\"Lacie\",\"email\":\"lsplavenmd@addtoany.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-01-17T19:39:07Z\"}\n{\"id\":807,\"first_name\":\"Renado\",\"email\":\"rleijsme@discuz.net\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-04-14T21:12:21Z\"}\n{\"id\":808,\"first_name\":\"Goldia\",\"email\":\"gbumphreymf@senate.gov\",\"job\":\"Editor\",\"timestamp\":\"2022-06-27T04:37:50Z\"}\n{\"id\":809,\"first_name\":\"Corri\",\"email\":\"ctonkinsonmg@slate.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-11-25T08:45:37Z\"}\n{\"id\":810,\"first_name\":\"Barry\",\"email\":\"bondramh@sciencedaily.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-02-27T10:13:39Z\"}\n{\"id\":811,\"first_name\":\"Duffy\",\"email\":\"dvandersonmi@blinklist.com\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-04-26T20:13:47Z\"}\n{\"id\":812,\"first_name\":\"Sande\",\"email\":\"sphairmj@live.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-07-30T13:56:11Z\"}\n{\"id\":813,\"first_name\":\"Marissa\",\"email\":\"mkisbymk@hud.gov\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-11-13T13:55:16Z\"}\n{\"id\":814,\"first_name\":\"Timmy\",\"email\":\"twoloschinml@army.mil\",\"job\":\"Environmental Tech\",\"timestamp\":\"2022-04-12T01:37:40Z\"}\n{\"id\":815,\"first_name\":\"Early\",\"email\":\"ehaukeymm@tamu.edu\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-01-17T16:13:19Z\"}\n{\"id\":816,\"first_name\":\"Maynard\",\"email\":\"mmajormn@google.ru\",\"job\":\"Database Administrator 
II\",\"timestamp\":\"2022-10-30T01:30:43Z\"}\n{\"id\":817,\"first_name\":\"Geordie\",\"email\":\"gspiermo@rambler.ru\",\"job\":\"Administrative Assistant I\",\"timestamp\":\"2022-10-28T08:13:24Z\"}\n{\"id\":818,\"first_name\":\"Trula\",\"email\":\"tcustmp@drupal.org\",\"job\":\"Database Administrator II\",\"timestamp\":\"2022-01-23T22:50:14Z\"}\n{\"id\":819,\"first_name\":\"Vale\",\"email\":\"vrammemq@prlog.org\",\"job\":\"Desktop Support Technician\",\"timestamp\":\"2021-12-25T03:53:38Z\"}\n{\"id\":820,\"first_name\":\"Winny\",\"email\":\"wmcgallmr@edublogs.org\",\"job\":\"Media Manager IV\",\"timestamp\":\"2022-06-22T03:47:41Z\"}\n{\"id\":821,\"first_name\":\"Laraine\",\"email\":\"lfortems@tamu.edu\",\"job\":\"Social Worker\",\"timestamp\":\"2022-07-17T05:32:39Z\"}\n{\"id\":822,\"first_name\":\"Nathanil\",\"email\":\"nchamleymt@altervista.org\",\"job\":\"Administrative Assistant II\",\"timestamp\":\"2022-05-28T11:05:26Z\"}\n{\"id\":823,\"first_name\":\"Jarib\",\"email\":\"jspinettimu@cnbc.com\",\"job\":\"Internal Auditor\",\"timestamp\":\"2022-09-04T22:57:50Z\"}\n{\"id\":824,\"first_name\":\"Betsey\",\"email\":\"bbolgarmv@google.nl\",\"job\":\"Librarian\",\"timestamp\":\"2022-05-14T19:20:02Z\"}\n{\"id\":825,\"first_name\":\"Tadd\",\"email\":\"tnoellmw@walmart.com\",\"job\":\"Structural Engineer\",\"timestamp\":\"2022-08-24T02:35:22Z\"}\n{\"id\":826,\"first_name\":\"Puff\",\"email\":\"pgerokmx@gizmodo.com\",\"job\":\"Teacher\",\"timestamp\":\"2022-09-12T07:28:33Z\"}\n{\"id\":827,\"first_name\":\"Yasmin\",\"email\":\"yfippmy@reverbnation.com\",\"job\":\"Programmer Analyst III\",\"timestamp\":\"2022-12-02T19:47:38Z\"}\n{\"id\":828,\"first_name\":\"Bernadette\",\"email\":\"bleganmz@telegraph.co.uk\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-03-31T12:49:01Z\"}\n{\"id\":829,\"first_name\":\"Berni\",\"email\":\"bgloucestern0@gov.uk\",\"job\":\"Research Assistant 
I\",\"timestamp\":\"2022-06-25T03:32:48Z\"}\n{\"id\":830,\"first_name\":\"Karlotte\",\"email\":\"kmartignonin1@cocolog-nifty.com\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-02-20T11:58:30Z\"}\n{\"id\":831,\"first_name\":\"Bernadine\",\"email\":\"bspavonn2@liveinternet.ru\",\"job\":\"Structural Analysis Engineer\",\"timestamp\":\"2021-12-11T21:39:43Z\"}\n{\"id\":832,\"first_name\":\"Lela\",\"email\":\"lnoenn3@redcross.org\",\"job\":\"Human Resources Assistant IV\",\"timestamp\":\"2022-05-29T13:49:27Z\"}\n{\"id\":833,\"first_name\":\"Aurora\",\"email\":\"amendenhalln4@delicious.com\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-07-31T00:29:00Z\"}\n{\"id\":834,\"first_name\":\"Florian\",\"email\":\"fstrowthern5@scribd.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-10-09T13:32:16Z\"}\n{\"id\":835,\"first_name\":\"Audy\",\"email\":\"aoveralln6@dell.com\",\"job\":\"Editor\",\"timestamp\":\"2022-04-11T14:56:31Z\"}\n{\"id\":836,\"first_name\":\"Lyndsay\",\"email\":\"ldecavillen7@cdc.gov\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-05-27T07:08:15Z\"}\n{\"id\":837,\"first_name\":\"Pail\",\"email\":\"plewcockn8@github.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-01-20T04:13:27Z\"}\n{\"id\":838,\"first_name\":\"Kevan\",\"email\":\"kbarkleyn9@seesaa.net\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-08-07T06:05:54Z\"}\n{\"id\":839,\"first_name\":\"Kimmi\",\"email\":\"kmunnionna@newsvine.com\",\"job\":\"Safety Technician III\",\"timestamp\":\"2022-08-14T04:25:45Z\"}\n{\"id\":840,\"first_name\":\"Esmaria\",\"email\":\"eairenb@ox.ac.uk\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-03-10T21:54:53Z\"}\n{\"id\":841,\"first_name\":\"Klarika\",\"email\":\"kpennettinc@oakley.com\",\"job\":\"Senior Quality Engineer\",\"timestamp\":\"2022-09-17T15:56:15Z\"}\n{\"id\":842,\"first_name\":\"Minetta\",\"email\":\"mkornackind@discovery.com\",\"job\":\"GIS Technical 
Architect\",\"timestamp\":\"2022-07-11T11:26:43Z\"}\n{\"id\":843,\"first_name\":\"Emmit\",\"email\":\"eboylinne@storify.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-10-16T12:23:15Z\"}\n{\"id\":844,\"first_name\":\"Chaim\",\"email\":\"credmaynenf@wiley.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-05-14T03:30:45Z\"}\n{\"id\":845,\"first_name\":\"Orazio\",\"email\":\"owitcherleyng@amazon.de\",\"job\":\"VP Accounting\",\"timestamp\":\"2022-10-18T20:52:32Z\"}\n{\"id\":846,\"first_name\":\"Egbert\",\"email\":\"ejordannh@surveymonkey.com\",\"job\":\"Graphic Designer\",\"timestamp\":\"2021-12-28T17:52:15Z\"}\n{\"id\":847,\"first_name\":\"Clemens\",\"email\":\"cblackebyni@livejournal.com\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-03-30T10:36:40Z\"}\n{\"id\":848,\"first_name\":\"Brooke\",\"email\":\"bleedernj@godaddy.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-11-26T13:38:37Z\"}\n{\"id\":849,\"first_name\":\"Batholomew\",\"email\":\"baceynk@jigsy.com\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-11-20T08:57:31Z\"}\n{\"id\":850,\"first_name\":\"Reinhold\",\"email\":\"rcubberleynl@de.vu\",\"job\":\"Software Consultant\",\"timestamp\":\"2022-07-09T19:38:46Z\"}\n{\"id\":851,\"first_name\":\"Bel\",\"email\":\"bshavenm@miitbeian.gov.cn\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-01-28T17:24:18Z\"}\n{\"id\":852,\"first_name\":\"Glenden\",\"email\":\"ghumbienn@princeton.edu\",\"job\":\"Project Manager\",\"timestamp\":\"2022-05-16T07:45:40Z\"}\n{\"id\":853,\"first_name\":\"Lorie\",\"email\":\"ljeannotno@examiner.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-10-15T03:52:40Z\"}\n{\"id\":854,\"first_name\":\"Garrot\",\"email\":\"gjepsonnp@marriott.com\",\"job\":\"Editor\",\"timestamp\":\"2022-04-10T04:24:08Z\"}\n{\"id\":855,\"first_name\":\"Henrieta\",\"email\":\"hfarrownq@ezinearticles.com\",\"job\":\"Programmer Analyst 
III\",\"timestamp\":\"2022-01-27T23:19:22Z\"}\n{\"id\":856,\"first_name\":\"Luis\",\"email\":\"ldiboldinr@aboutads.info\",\"job\":\"Director of Sales\",\"timestamp\":\"2022-02-21T08:33:27Z\"}\n{\"id\":857,\"first_name\":\"Maurie\",\"email\":\"medgettns@linkedin.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-02-20T13:04:26Z\"}\n{\"id\":858,\"first_name\":\"Kalie\",\"email\":\"klemmennt@webs.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-02-07T04:55:05Z\"}\n{\"id\":859,\"first_name\":\"Nadeen\",\"email\":\"naldwichnu@java.com\",\"job\":\"VP Sales\",\"timestamp\":\"2022-06-06T09:30:36Z\"}\n{\"id\":860,\"first_name\":\"Codie\",\"email\":\"cdeclairmontnv@indiegogo.com\",\"job\":\"Accountant III\",\"timestamp\":\"2022-02-07T06:29:46Z\"}\n{\"id\":861,\"first_name\":\"Fanny\",\"email\":\"fdemanchenw@parallels.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-08-14T16:49:12Z\"}\n{\"id\":862,\"first_name\":\"Dev\",\"email\":\"ddongallnx@globo.com\",\"job\":\"Health Coach III\",\"timestamp\":\"2022-10-09T10:42:24Z\"}\n{\"id\":863,\"first_name\":\"Anette\",\"email\":\"alestorny@independent.co.uk\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-05-05T09:17:14Z\"}\n{\"id\":864,\"first_name\":\"Alfy\",\"email\":\"atindallnz@wordpress.com\",\"job\":\"Web Developer II\",\"timestamp\":\"2022-04-03T15:01:04Z\"}\n{\"id\":865,\"first_name\":\"Malissa\",\"email\":\"mdorseto0@umich.edu\",\"job\":\"Research Assistant II\",\"timestamp\":\"2022-01-18T08:08:14Z\"}\n{\"id\":866,\"first_name\":\"Raychel\",\"email\":\"rfolko1@hubpages.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-10-01T03:04:36Z\"}\n{\"id\":867,\"first_name\":\"Deloria\",\"email\":\"dtraharo2@freewebs.com\",\"job\":\"Paralegal\",\"timestamp\":\"2022-04-22T11:25:15Z\"}\n{\"id\":868,\"first_name\":\"Jayme\",\"email\":\"jhardbattleo3@indiatimes.com\",\"job\":\"Quality Control 
Specialist\",\"timestamp\":\"2022-07-22T11:02:53Z\"}\n{\"id\":869,\"first_name\":\"Brina\",\"email\":\"bsherryo4@marketwatch.com\",\"job\":\"Librarian\",\"timestamp\":\"2022-01-05T17:11:02Z\"}\n{\"id\":870,\"first_name\":\"Christal\",\"email\":\"cloisio5@ehow.com\",\"job\":\"Programmer II\",\"timestamp\":\"2022-10-22T01:30:55Z\"}\n{\"id\":871,\"first_name\":\"Nonna\",\"email\":\"nneumanno6@dailymotion.com\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-02-21T08:46:46Z\"}\n{\"id\":872,\"first_name\":\"Reiko\",\"email\":\"rwordeno7@techcrunch.com\",\"job\":\"Registered Nurse\",\"timestamp\":\"2022-12-06T17:34:34Z\"}\n{\"id\":873,\"first_name\":\"Michaeline\",\"email\":\"mballantineo8@goo.ne.jp\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-11-22T06:13:48Z\"}\n{\"id\":874,\"first_name\":\"Flossie\",\"email\":\"fdaylyo9@java.com\",\"job\":\"Engineer III\",\"timestamp\":\"2022-04-23T11:41:09Z\"}\n{\"id\":875,\"first_name\":\"Kathleen\",\"email\":\"kroblouoa@baidu.com\",\"job\":\"Associate Professor\",\"timestamp\":\"2021-12-26T18:03:46Z\"}\n{\"id\":876,\"first_name\":\"Renaud\",\"email\":\"rgookesob@mediafire.com\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2022-12-03T02:45:25Z\"}\n{\"id\":877,\"first_name\":\"Rollo\",\"email\":\"rdericutoc@businesswire.com\",\"job\":\"Research Assistant IV\",\"timestamp\":\"2022-10-31T18:36:20Z\"}\n{\"id\":878,\"first_name\":\"Faber\",\"email\":\"fsimioniod@alibaba.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-12-01T21:20:46Z\"}\n{\"id\":879,\"first_name\":\"Malynda\",\"email\":\"mgresoe@furl.net\",\"job\":\"Systems Administrator I\",\"timestamp\":\"2022-05-31T05:31:01Z\"}\n{\"id\":880,\"first_name\":\"Eldridge\",\"email\":\"everdonof@hexun.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-03-27T14:39:55Z\"}\n{\"id\":881,\"first_name\":\"Maxy\",\"email\":\"mmooneyog@businessweek.com\",\"job\":\"Data 
Coordiator\",\"timestamp\":\"2022-11-06T17:31:25Z\"}\n{\"id\":882,\"first_name\":\"Kevina\",\"email\":\"kpericooh@google.de\",\"job\":\"Nurse Practicioner\",\"timestamp\":\"2022-03-27T13:04:10Z\"}\n{\"id\":883,\"first_name\":\"Roberto\",\"email\":\"ralloneoi@over-blog.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-10-19T17:41:28Z\"}\n{\"id\":884,\"first_name\":\"Reeta\",\"email\":\"rmatlockoj@ow.ly\",\"job\":\"Legal Assistant\",\"timestamp\":\"2022-05-16T05:22:12Z\"}\n{\"id\":885,\"first_name\":\"Romonda\",\"email\":\"rpinckneyok@list-manage.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-02-02T20:15:20Z\"}\n{\"id\":886,\"first_name\":\"Barnett\",\"email\":\"bhedlestoneol@ifeng.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2021-12-14T08:29:47Z\"}\n{\"id\":887,\"first_name\":\"Jemie\",\"email\":\"jmatousekom@zimbio.com\",\"job\":\"Junior Executive\",\"timestamp\":\"2022-11-10T23:11:53Z\"}\n{\"id\":888,\"first_name\":\"Sianna\",\"email\":\"sriddioughon@usatoday.com\",\"job\":\"Product Engineer\",\"timestamp\":\"2021-12-25T07:59:10Z\"}\n{\"id\":889,\"first_name\":\"Mort\",\"email\":\"mhamshawoo@qq.com\",\"job\":\"Accountant III\",\"timestamp\":\"2022-10-08T06:32:44Z\"}\n{\"id\":890,\"first_name\":\"Raff\",\"email\":\"rdareyop@quantcast.com\",\"job\":\"Sales Associate\",\"timestamp\":\"2022-11-06T01:09:57Z\"}\n{\"id\":891,\"first_name\":\"Josias\",\"email\":\"jstimsonoq@drupal.org\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-01-28T21:36:02Z\"}\n{\"id\":892,\"first_name\":\"Cello\",\"email\":\"cbonyor@jugem.jp\",\"job\":\"Information Systems Manager\",\"timestamp\":\"2022-06-27T05:23:17Z\"}\n{\"id\":893,\"first_name\":\"Johann\",\"email\":\"jlampos@facebook.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-04-20T08:29:47Z\"}\n{\"id\":894,\"first_name\":\"Grazia\",\"email\":\"ggoverot@people.com.cn\",\"job\":\"Environmental 
Tech\",\"timestamp\":\"2022-08-22T23:23:20Z\"}\n{\"id\":895,\"first_name\":\"Davon\",\"email\":\"dteligaou@weibo.com\",\"job\":\"Senior Sales Associate\",\"timestamp\":\"2022-07-01T00:54:36Z\"}\n{\"id\":896,\"first_name\":\"Ichabod\",\"email\":\"ikobierraov@dion.ne.jp\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-04-03T21:08:09Z\"}\n{\"id\":897,\"first_name\":\"Sandor\",\"email\":\"scotaow@reference.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-03-27T14:52:14Z\"}\n{\"id\":898,\"first_name\":\"Doy\",\"email\":\"dstiffkinsox@toplist.cz\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-07-02T16:07:19Z\"}\n{\"id\":899,\"first_name\":\"Carla\",\"email\":\"cstorckeoy@mlb.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-08-13T18:25:56Z\"}\n{\"id\":900,\"first_name\":\"Nicolais\",\"email\":\"nharceoz@soup.io\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-05-10T18:45:14Z\"}\n{\"id\":901,\"first_name\":\"Ab\",\"email\":\"abrinep0@google.co.uk\",\"job\":\"Assistant Manager\",\"timestamp\":\"2022-05-01T05:27:10Z\"}\n{\"id\":902,\"first_name\":\"Blake\",\"email\":\"bmackinderp1@ed.gov\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-08-23T16:12:10Z\"}\n{\"id\":903,\"first_name\":\"Thurston\",\"email\":\"tcarncrossp2@wikia.com\",\"job\":\"Biostatistician III\",\"timestamp\":\"2021-12-10T09:23:49Z\"}\n{\"id\":904,\"first_name\":\"Montague\",\"email\":\"mreinp3@java.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-04-27T05:05:08Z\"}\n{\"id\":905,\"first_name\":\"Thomas\",\"email\":\"tcahernyp4@oracle.com\",\"job\":\"Geologist I\",\"timestamp\":\"2021-12-14T13:45:34Z\"}\n{\"id\":906,\"first_name\":\"Beitris\",\"email\":\"beslandp5@xing.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-10-30T12:45:54Z\"}\n{\"id\":907,\"first_name\":\"Tristam\",\"email\":\"tbyardp6@army.mil\",\"job\":\"Senior 
Developer\",\"timestamp\":\"2022-07-11T22:57:40Z\"}\n{\"id\":908,\"first_name\":\"Romona\",\"email\":\"rrashleighp7@cnet.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-07-22T21:54:19Z\"}\n{\"id\":909,\"first_name\":\"Gardener\",\"email\":\"gcalowp8@furl.net\",\"job\":\"Recruiting Manager\",\"timestamp\":\"2022-02-21T20:44:18Z\"}\n{\"id\":910,\"first_name\":\"Westleigh\",\"email\":\"wlegerwoodp9@moonfruit.com\",\"job\":\"Web Developer III\",\"timestamp\":\"2022-04-17T17:12:19Z\"}\n{\"id\":911,\"first_name\":\"Korrie\",\"email\":\"kwightpa@tamu.edu\",\"job\":\"Office Assistant II\",\"timestamp\":\"2022-08-02T16:12:44Z\"}\n{\"id\":912,\"first_name\":\"Elliot\",\"email\":\"epeschetpb@fastcompany.com\",\"job\":\"Budget/Accounting Analyst IV\",\"timestamp\":\"2022-09-01T10:23:33Z\"}\n{\"id\":913,\"first_name\":\"Fax\",\"email\":\"feichpc@delicious.com\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-07-25T01:56:13Z\"}\n{\"id\":914,\"first_name\":\"Erastus\",\"email\":\"eblaypd@paypal.com\",\"job\":\"Professor\",\"timestamp\":\"2022-09-18T23:35:46Z\"}\n{\"id\":915,\"first_name\":\"Oralla\",\"email\":\"omccorleype@umich.edu\",\"job\":\"VP Marketing\",\"timestamp\":\"2022-09-09T03:24:55Z\"}\n{\"id\":916,\"first_name\":\"Urbano\",\"email\":\"ukingstnepf@narod.ru\",\"job\":\"Engineer III\",\"timestamp\":\"2022-07-13T10:14:55Z\"}\n{\"id\":917,\"first_name\":\"Ethel\",\"email\":\"elonerganpg@discovery.com\",\"job\":\"Operator\",\"timestamp\":\"2022-02-28T13:40:14Z\"}\n{\"id\":918,\"first_name\":\"Grove\",\"email\":\"gathridgeph@springer.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-05-19T00:14:03Z\"}\n{\"id\":919,\"first_name\":\"Junette\",\"email\":\"jbaupi@amazon.co.jp\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-06-05T15:40:44Z\"}\n{\"id\":920,\"first_name\":\"Elbert\",\"email\":\"elernerpj@examiner.com\",\"job\":\"Environmental 
Tech\",\"timestamp\":\"2022-02-23T02:05:59Z\"}\n{\"id\":921,\"first_name\":\"Juditha\",\"email\":\"jternouthpk@people.com.cn\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-11-17T11:21:10Z\"}\n{\"id\":922,\"first_name\":\"Madeleine\",\"email\":\"mskallypl@pen.io\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-07-18T05:12:34Z\"}\n{\"id\":923,\"first_name\":\"Merill\",\"email\":\"mmerrienpm@1und1.de\",\"job\":\"Help Desk Operator\",\"timestamp\":\"2022-05-21T02:24:56Z\"}\n{\"id\":924,\"first_name\":\"Lois\",\"email\":\"lcadoganpn@dyndns.org\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-10-24T01:25:17Z\"}\n{\"id\":925,\"first_name\":\"Bendix\",\"email\":\"bchenepo@dailymotion.com\",\"job\":\"Administrative Officer\",\"timestamp\":\"2022-03-24T15:17:47Z\"}\n{\"id\":926,\"first_name\":\"Isis\",\"email\":\"iseekspp@usgs.gov\",\"job\":\"Actuary\",\"timestamp\":\"2022-10-12T05:56:55Z\"}\n{\"id\":927,\"first_name\":\"Eda\",\"email\":\"edartpq@godaddy.com\",\"job\":\"Mechanical Systems Engineer\",\"timestamp\":\"2022-02-26T05:06:30Z\"}\n{\"id\":928,\"first_name\":\"Rhys\",\"email\":\"rszachpr@instagram.com\",\"job\":\"Recruiter\",\"timestamp\":\"2022-02-18T06:27:48Z\"}\n{\"id\":929,\"first_name\":\"Gerianna\",\"email\":\"ggladbeckps@nifty.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-06-11T13:07:20Z\"}\n{\"id\":930,\"first_name\":\"Merrill\",\"email\":\"mclutterhampt@tripod.com\",\"job\":\"Engineer III\",\"timestamp\":\"2022-02-24T21:42:32Z\"}\n{\"id\":931,\"first_name\":\"Sheree\",\"email\":\"sdeamayapu@bravesites.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-10-17T23:12:59Z\"}\n{\"id\":932,\"first_name\":\"Dane\",\"email\":\"djarmanpv@constantcontact.com\",\"job\":\"Financial Analyst\",\"timestamp\":\"2022-07-22T11:17:02Z\"}\n{\"id\":933,\"first_name\":\"Corney\",\"email\":\"cmccroriepw@va.gov\",\"job\":\"Structural Analysis 
Engineer\",\"timestamp\":\"2022-05-24T01:47:23Z\"}\n{\"id\":934,\"first_name\":\"Candace\",\"email\":\"cflacknellpx@php.net\",\"job\":\"VP Quality Control\",\"timestamp\":\"2022-09-07T11:36:27Z\"}\n{\"id\":935,\"first_name\":\"Corrinne\",\"email\":\"ccardenaspy@wunderground.com\",\"job\":\"Compensation Analyst\",\"timestamp\":\"2022-01-31T11:37:46Z\"}\n{\"id\":936,\"first_name\":\"Meade\",\"email\":\"mpetrashkovpz@booking.com\",\"job\":\"Account Representative III\",\"timestamp\":\"2022-01-18T15:57:19Z\"}\n{\"id\":937,\"first_name\":\"Sauveur\",\"email\":\"sfinnanq0@indiatimes.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-09-23T10:05:56Z\"}\n{\"id\":938,\"first_name\":\"Tammy\",\"email\":\"tgrasq1@blogger.com\",\"job\":\"Engineer I\",\"timestamp\":\"2022-04-15T01:39:44Z\"}\n{\"id\":939,\"first_name\":\"Teodoor\",\"email\":\"tmacgraithq2@sfgate.com\",\"job\":\"Engineer IV\",\"timestamp\":\"2022-12-03T22:35:35Z\"}\n{\"id\":940,\"first_name\":\"Fae\",\"email\":\"fgalgeyq3@google.fr\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-08-21T19:01:24Z\"}\n{\"id\":941,\"first_name\":\"Joell\",\"email\":\"jkochlq4@lulu.com\",\"job\":\"Chief Design Engineer\",\"timestamp\":\"2022-01-11T09:35:21Z\"}\n{\"id\":942,\"first_name\":\"Aldwin\",\"email\":\"arosenbloomq5@msn.com\",\"job\":\"Community Outreach Specialist\",\"timestamp\":\"2022-06-19T10:51:49Z\"}\n{\"id\":943,\"first_name\":\"Tracee\",\"email\":\"tlymbourneq6@bing.com\",\"job\":\"Staff Accountant II\",\"timestamp\":\"2022-03-09T09:48:29Z\"}\n{\"id\":944,\"first_name\":\"Peyton\",\"email\":\"phardsonq7@nifty.com\",\"job\":\"Civil Engineer\",\"timestamp\":\"2022-06-15T02:15:23Z\"}\n{\"id\":945,\"first_name\":\"Larina\",\"email\":\"lleckieq8@weather.com\",\"job\":\"Web Designer III\",\"timestamp\":\"2022-11-24T04:15:27Z\"}\n{\"id\":946,\"first_name\":\"Emelia\",\"email\":\"ejarnellq9@nyu.edu\",\"job\":\"Geologist 
III\",\"timestamp\":\"2022-05-09T02:19:47Z\"}\n{\"id\":947,\"first_name\":\"Andrus\",\"email\":\"amarquisqa@seesaa.net\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-08-27T04:02:30Z\"}\n{\"id\":948,\"first_name\":\"Vernen\",\"email\":\"vlockeqb@ucsd.edu\",\"job\":\"Developer IV\",\"timestamp\":\"2022-03-05T01:12:55Z\"}\n{\"id\":949,\"first_name\":\"Wyndham\",\"email\":\"wbroadwayqc@github.io\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-05-11T07:26:10Z\"}\n{\"id\":950,\"first_name\":\"Amye\",\"email\":\"ahellinqd@stumbleupon.com\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-05-27T04:52:14Z\"}\n{\"id\":951,\"first_name\":\"Valera\",\"email\":\"vreemeqe@wordpress.com\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-08-14T18:45:26Z\"}\n{\"id\":952,\"first_name\":\"Kipp\",\"email\":\"kgulstonqf@guardian.co.uk\",\"job\":\"Help Desk Technician\",\"timestamp\":\"2021-12-28T08:31:34Z\"}\n{\"id\":953,\"first_name\":\"Sayer\",\"email\":\"sstifeqg@squidoo.com\",\"job\":\"Quality Engineer\",\"timestamp\":\"2022-08-16T09:11:02Z\"}\n{\"id\":954,\"first_name\":\"Yancey\",\"email\":\"yculliganqh@mediafire.com\",\"job\":\"Data Coordiator\",\"timestamp\":\"2022-05-28T22:17:43Z\"}\n{\"id\":955,\"first_name\":\"Yoshi\",\"email\":\"yprofferqi@mapquest.com\",\"job\":\"Speech Pathologist\",\"timestamp\":\"2022-05-04T02:45:51Z\"}\n{\"id\":956,\"first_name\":\"Madison\",\"email\":\"mimortsqj@discovery.com\",\"job\":\"Staff Scientist\",\"timestamp\":\"2022-04-08T00:37:08Z\"}\n{\"id\":957,\"first_name\":\"Eziechiele\",\"email\":\"efollinqk@weebly.com\",\"job\":\"Physical Therapy Assistant\",\"timestamp\":\"2022-05-02T21:05:48Z\"}\n{\"id\":958,\"first_name\":\"Barrie\",\"email\":\"bwalaronql@newyorker.com\",\"job\":\"Technical Writer\",\"timestamp\":\"2022-09-09T10:56:49Z\"}\n{\"id\":959,\"first_name\":\"Reggie\",\"email\":\"rcahnqm@google.com.hk\",\"job\":\"Graphic 
Designer\",\"timestamp\":\"2022-02-04T05:10:00Z\"}\n{\"id\":960,\"first_name\":\"Conny\",\"email\":\"celleswortheqn@jalbum.net\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-11-08T10:58:05Z\"}\n{\"id\":961,\"first_name\":\"Roselia\",\"email\":\"rhurranqo@eventbrite.com\",\"job\":\"Environmental Specialist\",\"timestamp\":\"2022-03-31T15:47:33Z\"}\n{\"id\":962,\"first_name\":\"Vera\",\"email\":\"vgowlandqp@homestead.com\",\"job\":\"Geologist IV\",\"timestamp\":\"2022-09-24T08:20:45Z\"}\n{\"id\":963,\"first_name\":\"Sheeree\",\"email\":\"smundowqq@artisteer.com\",\"job\":\"Health Coach I\",\"timestamp\":\"2022-09-07T03:54:28Z\"}\n{\"id\":964,\"first_name\":\"Becky\",\"email\":\"bspurriarqr@diigo.com\",\"job\":\"Social Worker\",\"timestamp\":\"2022-11-09T01:46:54Z\"}\n{\"id\":965,\"first_name\":\"Tan\",\"email\":\"tbatyqs@wikimedia.org\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-10-25T09:43:54Z\"}\n{\"id\":966,\"first_name\":\"Kalila\",\"email\":\"kdowersqt@cyberchimps.com\",\"job\":\"GIS Technical Architect\",\"timestamp\":\"2022-02-04T19:00:56Z\"}\n{\"id\":967,\"first_name\":\"Morrie\",\"email\":\"mjesticoqu@army.mil\",\"job\":\"Associate Professor\",\"timestamp\":\"2022-12-04T22:25:48Z\"}\n{\"id\":968,\"first_name\":\"Abelard\",\"email\":\"asmewinqv@arizona.edu\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-06-24T19:19:09Z\"}\n{\"id\":969,\"first_name\":\"Shelby\",\"email\":\"sropckeqw@census.gov\",\"job\":\"Account Coordinator\",\"timestamp\":\"2022-05-30T09:49:34Z\"}\n{\"id\":970,\"first_name\":\"Jaynell\",\"email\":\"jmarvelleyqx@princeton.edu\",\"job\":\"Food Chemist\",\"timestamp\":\"2022-08-18T20:04:49Z\"}\n{\"id\":971,\"first_name\":\"Jori\",\"email\":\"jdibsdaleqy@last.fm\",\"job\":\"Database Administrator III\",\"timestamp\":\"2022-05-12T19:28:48Z\"}\n{\"id\":972,\"first_name\":\"Cari\",\"email\":\"credittqz@addthis.com\",\"job\":\"Safety Technician 
I\",\"timestamp\":\"2022-10-08T19:43:07Z\"}\n{\"id\":973,\"first_name\":\"Edee\",\"email\":\"ezylberdikr0@ning.com\",\"job\":\"Staff Accountant IV\",\"timestamp\":\"2022-05-23T09:04:15Z\"}\n{\"id\":974,\"first_name\":\"Kaiser\",\"email\":\"kbaggallayr1@slashdot.org\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-01-07T20:11:41Z\"}\n{\"id\":975,\"first_name\":\"Christalle\",\"email\":\"cbuzekr2@istockphoto.com\",\"job\":\"Senior Financial Analyst\",\"timestamp\":\"2022-04-10T22:36:11Z\"}\n{\"id\":976,\"first_name\":\"Otha\",\"email\":\"oaluardr3@deliciousdays.com\",\"job\":\"Project Manager\",\"timestamp\":\"2022-07-11T21:01:49Z\"}\n{\"id\":977,\"first_name\":\"Imogene\",\"email\":\"iharwoodr4@geocities.jp\",\"job\":\"Executive Secretary\",\"timestamp\":\"2021-12-29T14:21:12Z\"}\n{\"id\":978,\"first_name\":\"Valentijn\",\"email\":\"vsouterr5@mtv.com\",\"job\":\"Senior Editor\",\"timestamp\":\"2022-01-13T07:30:20Z\"}\n{\"id\":979,\"first_name\":\"Danielle\",\"email\":\"dharrowayr6@hugedomains.com\",\"job\":\"Biostatistician II\",\"timestamp\":\"2022-08-03T09:35:36Z\"}\n{\"id\":980,\"first_name\":\"Griff\",\"email\":\"gdoyleyr7@qq.com\",\"job\":\"Marketing Assistant\",\"timestamp\":\"2022-07-02T10:23:43Z\"}\n{\"id\":981,\"first_name\":\"Claudelle\",\"email\":\"cthompsonr8@wikispaces.com\",\"job\":\"Financial Advisor\",\"timestamp\":\"2022-08-22T17:13:02Z\"}\n{\"id\":982,\"first_name\":\"Marla\",\"email\":\"mcaulketr9@usnews.com\",\"job\":\"Quality Control Specialist\",\"timestamp\":\"2022-05-15T23:29:39Z\"}\n{\"id\":983,\"first_name\":\"Lorne\",\"email\":\"llabellra@g.co\",\"job\":\"Pharmacist\",\"timestamp\":\"2022-03-01T00:37:57Z\"}\n{\"id\":984,\"first_name\":\"Mar\",\"email\":\"mrispinrb@networkadvertising.org\",\"job\":\"Design Engineer\",\"timestamp\":\"2022-07-09T01:55:58Z\"}\n{\"id\":985,\"first_name\":\"Townie\",\"email\":\"tbusrc@guardian.co.uk\",\"job\":\"Director of 
Sales\",\"timestamp\":\"2022-10-06T16:48:30Z\"}\n{\"id\":986,\"first_name\":\"Darcey\",\"email\":\"dwillerstonerd@diigo.com\",\"job\":\"Assistant Media Planner\",\"timestamp\":\"2022-08-04T00:05:16Z\"}\n{\"id\":987,\"first_name\":\"Joann\",\"email\":\"jschwandnerre@theglobeandmail.com\",\"job\":\"Account Representative I\",\"timestamp\":\"2022-11-08T05:48:40Z\"}\n{\"id\":988,\"first_name\":\"Katerine\",\"email\":\"kuttleyrf@japanpost.jp\",\"job\":\"Nuclear Power Engineer\",\"timestamp\":\"2022-05-13T12:35:36Z\"}\n{\"id\":989,\"first_name\":\"Audie\",\"email\":\"akeemsrg@chron.com\",\"job\":\"Analog Circuit Design manager\",\"timestamp\":\"2022-10-22T02:14:29Z\"}\n{\"id\":990,\"first_name\":\"Findlay\",\"email\":\"fjaggersrh@time.com\",\"job\":\"Human Resources Manager\",\"timestamp\":\"2022-05-12T08:41:24Z\"}\n{\"id\":991,\"first_name\":\"Jonis\",\"email\":\"jjedrasikri@google.co.uk\",\"job\":\"Clinical Specialist\",\"timestamp\":\"2022-06-05T11:49:21Z\"}\n{\"id\":992,\"first_name\":\"Erhart\",\"email\":\"eszymonowiczrj@washingtonpost.com\",\"job\":\"Software Engineer IV\",\"timestamp\":\"2022-03-16T03:49:43Z\"}\n{\"id\":993,\"first_name\":\"Ulysses\",\"email\":\"umadenrk@walmart.com\",\"job\":\"Research Assistant III\",\"timestamp\":\"2022-11-17T14:27:14Z\"}\n{\"id\":994,\"first_name\":\"Vannie\",\"email\":\"vallsoprl@github.com\",\"job\":\"Executive Secretary\",\"timestamp\":\"2022-08-19T15:25:28Z\"}\n{\"id\":995,\"first_name\":\"Rory\",\"email\":\"rballstonrm@oaic.gov.au\",\"job\":\"Project Manager\",\"timestamp\":\"2022-05-23T10:42:53Z\"}\n{\"id\":996,\"first_name\":\"Korrie\",\"email\":\"kbeneditrn@constantcontact.com\",\"job\":\"Safety Technician II\",\"timestamp\":\"2021-12-30T08:56:55Z\"}\n{\"id\":997,\"first_name\":\"Vlad\",\"email\":\"vendlero@storify.com\",\"job\":\"Dental Hygienist\",\"timestamp\":\"2022-01-15T17:01:19Z\"}\n{\"id\":998,\"first_name\":\"Jenelle\",\"email\":\"jsteinerrp@technorati.com\",\"job\":\"Safety Technician 
III\",\"timestamp\":\"2022-02-23T20:35:11Z\"}\n{\"id\":999,\"first_name\":\"Elwood\",\"email\":\"eengehamrq@fda.gov\",\"job\":\"Occupational Therapist\",\"timestamp\":\"2022-04-27T20:29:45Z\"}\n{\"id\":1000,\"first_name\":\"Donnie\",\"email\":\"dshiptonrr@slideshare.net\",\"job\":\"Developer II\",\"timestamp\":\"2022-03-10T21:39:22Z\"}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/benches/doc_process_vrl_bench.rs",
    "content": "use std::sync::Arc;\n\nuse bytes::Bytes;\nuse criterion::{BenchmarkId, Criterion, criterion_group, criterion_main};\nuse quickwit_actors::{ActorHandle, Mailbox, Universe};\nuse quickwit_config::{SourceInputFormat, TransformConfig};\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_indexing::actors::DocProcessor;\nuse quickwit_indexing::models::RawDocBatch;\nuse quickwit_metastore::checkpoint::SourceCheckpointDelta;\n\nconst JSON_NORMAL: &str = include_str!(\"data/bench_data.json\");\nconst JSON_LIGHT_TRANSFORM: &str = include_str!(\"data/bench_data_light_transform.json\");\nconst JSON_HEAVY_TRANSFORM: &str = include_str!(\"data/bench_data_heavy_transform.json\");\n\nmacro_rules! bench_func {\n    ($input:expr, $group:expr, $name:expr, $param:expr, $func:expr) => {{\n        let lines: Vec<&str> = $input.lines().map(|line| line.trim()).collect();\n        $group.throughput(criterion::Throughput::Bytes($input.len() as u64));\n\n        let runtime = tokio::runtime::Runtime::new().unwrap();\n        let checkpoint_delta = SourceCheckpointDelta::from_range(0..$input.len() as u64);\n\n        $group.bench_function(BenchmarkId::new($name, $param), |b| {\n            b.to_async(&runtime).iter_batched(\n                || {\n                    lines\n                        .iter()\n                        .map(|line| Bytes::from(*line))\n                        .collect::<Vec<_>>()\n                },\n                |docs| async {\n                    let (mailbox, handle, universe) = $func;\n                    mailbox\n                        .send_message(RawDocBatch::new(docs, checkpoint_delta.clone(), false))\n                        .await\n                        .unwrap();\n\n                    universe.send_exit_with_success(&mailbox).await.unwrap();\n                    handle.join().await;\n                },\n                criterion::BatchSize::SmallInput,\n            )\n        });\n    }};\n}\n\npub fn default_doc_mapper_for_bench() 
-> DocMapper {\n    const JSON_CONFIG_VALUE: &str = r#\"\n        {\n            \"store_source\": true,\n            \"default_search_fields\": [],\n            \"timestamp_field\": \"timestamp\",\n            \"tag_fields\": [\"id\"],\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"datetime\",\n                    \"output_format\": \"unix_timestamp_secs\",\n                    \"fast\": true,\n                    \"input_formats\": [\"iso8601\"]\n                },\n                {\n                    \"name\": \"first_name\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"last_name\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"id\",\n                    \"type\": \"u64\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"email\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"job\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                }\n            ]\n        }\"#;\n    serde_json::from_str::<DocMapper>(JSON_CONFIG_VALUE).unwrap()\n}\n\nfn doc_processor_no_transform() -> (Mailbox<DocProcessor>, ActorHandle<DocProcessor>, Universe) {\n    create_doc_processor(None)\n}\n\nfn doc_processor_light_transform() -> (Mailbox<DocProcessor>, ActorHandle<DocProcessor>, Universe) {\n    let vrl_script = r#\"\n        .last_name = \"Doe\"\n        .job = upcase(string!(.job))\n    \"#;\n    let transform_config = TransformConfig::for_test(vrl_script);\n    create_doc_processor(Some(transform_config))\n}\n\nfn doc_processor_heavy_transform() -> (Mailbox<DocProcessor>, 
ActorHandle<DocProcessor>, Universe) {\n    let vrl_script = r#\"\n        . = parse_json!(.body)\n        .last_name = \"Doe\"\n        .job = upcase(string!(.job))\n        .timestamp = to_string(to_timestamp(now()))\n    \"#;\n    let transform_config = TransformConfig::for_test(vrl_script);\n    create_doc_processor(Some(transform_config))\n}\n\nfn create_doc_processor(\n    transform_config_opt: Option<TransformConfig>,\n) -> (Mailbox<DocProcessor>, ActorHandle<DocProcessor>, Universe) {\n    let index_id = \"my-index\".to_string();\n    let source_id = \"my-source\".to_string();\n    let doc_mapper = Arc::new(default_doc_mapper_for_bench());\n    let universe = Universe::new();\n    let (indexer_mailbox, _) = universe.create_test_mailbox();\n    let doc_processor = DocProcessor::try_new(\n        index_id,\n        source_id,\n        doc_mapper,\n        indexer_mailbox,\n        transform_config_opt,\n        SourceInputFormat::Json,\n    )\n    .unwrap();\n    let (mailbox, handle) = universe.spawn_builder().spawn(doc_processor);\n    (mailbox, handle, universe)\n}\n\nfn bench_simple_json(c: &mut Criterion) {\n    let mut group = c.benchmark_group(\"Simple Json\");\n    bench_func!(\n        JSON_NORMAL,\n        group,\n        \"No VRL\",\n        \"Simple JSON\",\n        doc_processor_no_transform()\n    );\n    bench_func!(\n        JSON_NORMAL,\n        group,\n        \"Light VRL\",\n        \"Simple JSON\",\n        doc_processor_light_transform()\n    );\n}\n\nfn bench_light_json(c: &mut Criterion) {\n    let mut group = c.benchmark_group(\"Simple/Light Json\");\n    bench_func!(\n        JSON_NORMAL,\n        group,\n        \"No VRL\",\n        \"Simple JSON\",\n        doc_processor_no_transform()\n    );\n    bench_func!(\n        JSON_LIGHT_TRANSFORM,\n        group,\n        \"Light VRL\",\n        \"Light JSON\",\n        doc_processor_light_transform()\n    );\n}\n\nfn bench_heavy_json(c: &mut Criterion) {\n    let mut group = 
c.benchmark_group(\"Simple/Heavy Json\");\n    bench_func!(\n        JSON_NORMAL,\n        group,\n        \"No VRL\",\n        \"Simple JSON\",\n        doc_processor_no_transform()\n    );\n    bench_func!(\n        JSON_HEAVY_TRANSFORM,\n        group,\n        \"Heavy VRL\",\n        \"Heavy JSON\",\n        doc_processor_heavy_transform()\n    );\n}\n\ncriterion_group!(\n    benches,\n    bench_simple_json,\n    bench_light_json,\n    bench_heavy_json\n);\ncriterion_main!(benches);\n"
  },
  {
    "path": "quickwit/quickwit-indexing/data/test_corpus.json",
    "content": "{\"timestamp\":1375457457,\"body\":\"hello\",\"response_date\":141436123,\"response_time\":141436123,\"response_payload\":\"data\",\"owner\":\"\",\"properties\":{},\"children\":[],\"attributes\":{\"tags\":[12,34],\"server\":\"foo\",\"server.status\":[\"down\",\"up\"],\"server.payload\":\"data\"}}\n{\"timestamp\":1375457457,\"body\":\"happy\",\"response_date\":141436123,\"response_time\":141436123,\"response_payload\":\"data\",\"owner\":\"\",\"properties\":{},\"children\":{},\"attributes\":{\"tags\":[12,34],\"server\":\"foo\",\"server.status\":[\"down\",\"up\"],\"server.payload\":\"data\"}}\n{\"timestamp\":1375457457,\"body\":\"tax\",\"response_date\":141436123,\"response_time\":141436123,\"response_payload\":\"data\",\"owner\":\"\",\"properties\":{},\"children\":{},\"attributes\":{\"tags\":[12,34],\"server\":\"foo\",\"server.status\":[\"down\",\"up\"],\"server.payload\":\"data\"}}\n{\"timestamp\":1375457457,\"body\":\"payer\",\"response_date\":141436123,\"response_time\":141436123,\"response_payload\":\"data\",\"owner\":\"\",\"properties\":{},\"children\":[],\"attributes\":{\"tags\":[12,34],\"server\":\"foo\",\"server.status\":[\"down\",\"up\"],\"server.payload\":\"data\"}}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/failpoints/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Fail points are a form of code instrumentation that allows errors and other behaviors\n//! to be injected dynamically at runtime, primarily for testing purposes. Fail\n//! points are flexible and can be configured to exhibit a variety of behaviors,\n//! including panics, early returns, and sleeps. They can be controlled both\n//! programmatically and via the environment, and can be triggered conditionally\n//! and probabilistically.\n//!\n//! They rely on a global variable, which requires them to be executed in a single\n//! thread.\n//! For this reason, we isolate them from the other unit tests and define an\n//! independent binary target.\n//!\n//! They are not executed by default.\n//! They are executed in CI and can be executed locally with:\n//! `cargo test --features fail/failpoints test_failpoint -- --test-threads 1`\n//!\n//! 
Below we test panics at different steps in the indexing pipeline.\n\nuse std::path::Path;\nuse std::sync::{Arc, Barrier, Mutex};\nuse std::time::Duration;\n\nuse fail::FailScenario;\nuse quickwit_common::io::IoControls;\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_common::split_file;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_indexing::actors::MergeExecutor;\nuse quickwit_indexing::merge_policy::{MergeOperation, MergeTask};\nuse quickwit_indexing::models::MergeScratch;\nuse quickwit_indexing::{TestSandbox, get_tantivy_directory_from_split_bundle};\nuse quickwit_metastore::{\n    ListSplitsQuery, ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitMetadata,\n    SplitState,\n};\nuse quickwit_proto::indexing::MergePipelineId;\nuse quickwit_proto::metastore::{ListSplitsRequest, MetastoreService};\nuse quickwit_proto::types::{IndexUid, NodeId};\nuse serde_json::Value as JsonValue;\nuse tantivy::Directory;\n\n#[tokio::test]\nasync fn test_failpoint_no_failure() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\nfn deterministic_panic_sequence(mut panics: Vec<bool>) -> impl Fn() + Send + Sync {\n    panics.reverse();\n    let panics = Mutex::new(panics);\n    move || {\n        let should_panic = panics.lock().unwrap().pop().unwrap_or(false);\n        if should_panic {\n            panic!(\"panicked\");\n        }\n    }\n}\n\n#[tokio::test]\nasync fn test_failpoint_packager_panics_right_away() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\"packager:before\", deterministic_panic_sequence(vec![true])).unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_packager_panics_after_one_success() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\n        \"packager:before\",\n      
  deterministic_panic_sequence(vec![false, true]),\n    )\n    .unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_publisher_panics_after_one_success() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\n        \"publisher:before\",\n        deterministic_panic_sequence(vec![false, true]),\n    )\n    .unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_publisher_panics_right_away() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\"publisher:before\", deterministic_panic_sequence(vec![true])).unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_publisher_after_panics_right_away() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\"publisher:after\", deterministic_panic_sequence(vec![true])).unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_uploader_panics_after_one_success() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\n        \"uploader:before\",\n        deterministic_panic_sequence(vec![false, true]),\n    )\n    .unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_uploader_panics_right_away() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    fail::cfg_callback(\"uploader:before\", deterministic_panic_sequence(vec![true])).unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_failpoint_uploader_after_panics_right_away() -> anyhow::Result<()> {\n    let scenario = FailScenario::setup();\n    
fail::cfg_callback(\"uploader:after\", deterministic_panic_sequence(vec![true])).unwrap();\n    aux_test_failpoints().await?;\n    scenario.teardown();\n    Ok(())\n}\n\nasync fn aux_test_failpoints() -> anyhow::Result<()> {\n    let doc_mapper_yaml = r#\"\n        field_mappings:\n          - name: body\n            type: text\n          - name: ts\n            type: datetime\n            fast: true\n        timestamp_field: ts\n        \"#;\n    let search_fields = [\"body\"];\n    let index_id = append_random_suffix(\"test-index\");\n    let test_index_builder =\n        TestSandbox::create(&index_id, doc_mapper_yaml, \"\", &search_fields).await?;\n    let batch_1: Vec<JsonValue> = vec![\n        serde_json::json!({\"body \": \"1\", \"ts\": 1629889530 }),\n        serde_json::json!({\"body \": \"2\", \"ts\": 1629889531 }),\n    ];\n    let batch_2: Vec<JsonValue> = vec![\n        serde_json::json!({\"body \": \"3\", \"ts\": 1629889532 }),\n        serde_json::json!({\"body \": \"4\", \"ts\": 1629889533 }),\n    ];\n    test_index_builder.add_documents(batch_1).await?;\n    test_index_builder.add_documents(batch_2).await?;\n    let query = ListSplitsQuery::for_index(test_index_builder.index_uid())\n        .with_split_state(SplitState::Published);\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query).unwrap();\n    let mut splits = test_index_builder\n        .metastore()\n        .list_splits(list_splits_request)\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    splits.sort_by_key(|split| *split.split_metadata.time_range.clone().unwrap().start());\n    assert_eq!(splits.len(), 2);\n    assert_eq!(\n        splits[0].split_metadata.time_range.clone().unwrap(),\n        1629889530..=1629889531\n    );\n    assert_eq!(\n        splits[1].split_metadata.time_range.clone().unwrap(),\n        1629889532..=1629889533\n    );\n    test_index_builder.universe().quit().await;\n    
Ok(())\n}\n\nconst TEST_TEXT: &str = r#\"His sole child, my lord, and bequeathed to my\noverlooking. I have those hopes of her good that\nher education promises; her dispositions she\ninherits, which makes fair gifts fairer; for where\nan unclean mind carries virtuous qualities, there\ncommendations go with pity; they are virtues and\ntraitors too; in her they are the better for their\nsimpleness; she derives her honesty and achieves her goodness.\"#;\n\n#[tokio::test]\nasync fn test_merge_executor_controlled_directory_kill_switch() -> anyhow::Result<()> {\n    // This test checks that if a merger is killed in the middle of\n    // a merge, then the controlled directory makes it possible to\n    // abort the merging operation and return quickly.\n    // NOTE(fmassot): This test is working, but not exactly as we would want.\n    // Ideally we want the actor to stop while merging, which is a long task, and we\n    // don't want to wait until it's finished. But... the merging phase is\n    // currently in a protected zone and thus no kill switch will be activated\n    // during this period. We added the protected zone because without it we observe from\n    // time to time a kill switch activation because the ControlledDirectory did not\n    // do any write during a HEARTBEAT... Before removing the protected zone, we need\n    // to investigate this instability. 
Then this test will finally be really helpful.\n    quickwit_common::setup_logging_for_tests();\n    let doc_mapper_yaml = r#\"\n        field_mappings:\n          - name: body\n            type: text\n          - name: ts\n            type: datetime\n            fast: true\n        timestamp_field: ts\n        \"#;\n    let indexing_setting_yaml = r#\"\n        split_num_docs_target: 1000\n        merge_policy:\n          type: \"no_merge\"\n    \"#;\n    let search_fields = [\"body\"];\n    let index_id = \"test-index-merge-executory-kill-switch\";\n    let test_index_builder = TestSandbox::create(\n        index_id,\n        doc_mapper_yaml,\n        indexing_setting_yaml,\n        &search_fields,\n    )\n    .await?;\n\n    let doc_mapper = test_index_builder.doc_mapper();\n    let batch: Vec<JsonValue> =\n        std::iter::repeat_with(|| serde_json::json!({\"body \": TEST_TEXT, \"ts\": 1631072713 }))\n            .take(500)\n            .collect();\n    for _ in 0..2 {\n        test_index_builder.add_documents(batch.clone()).await?;\n    }\n    tokio::time::sleep(Duration::from_millis(10)).await;\n\n    let metastore = test_index_builder.metastore();\n    let split_metadatas: Vec<SplitMetadata> = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(test_index_builder.index_uid()).unwrap())\n        .await?\n        .collect_splits_metadata()\n        .await\n        .unwrap();\n    let merge_scratch_directory = TempDirectory::for_test();\n\n    let downloaded_splits_directory =\n        merge_scratch_directory.named_temp_child(\"downloaded-splits-\")?;\n    let storage = test_index_builder.storage();\n    let mut tantivy_dirs: Vec<Box<dyn Directory>> = Vec::new();\n    for split in &split_metadatas {\n        let split_filename = split_file(split.split_id());\n        let dest_filepath = downloaded_splits_directory.path().join(&split_filename);\n        storage\n            .copy_to_file(Path::new(&split_filename), &dest_filepath)\n           
 .await?;\n\n        tantivy_dirs.push(get_tantivy_directory_from_split_bundle(&dest_filepath).unwrap());\n    }\n    let merge_operation = MergeOperation::new_merge_operation(split_metadatas);\n    let merge_task = MergeTask::from_merge_operation_for_test(merge_operation);\n    let merge_scratch = MergeScratch {\n        merge_task,\n        merge_scratch_directory,\n        downloaded_splits_directory,\n        tantivy_dirs,\n    };\n    let pipeline_id = MergePipelineId {\n        node_id: NodeId::from(\"test-node\"),\n        index_uid: IndexUid::new_with_random_ulid(index_id),\n        source_id: \"test-source\".to_string(),\n    };\n\n    let universe = test_index_builder.universe();\n    let (merge_packager_mailbox, _merge_packager_inbox) = universe.create_test_mailbox();\n    let io_controls = IoControls::default();\n    let merge_executor = MergeExecutor::new(\n        pipeline_id,\n        metastore,\n        doc_mapper,\n        io_controls,\n        merge_packager_mailbox,\n    );\n\n    let (merge_executor_mailbox, _merge_executor_handle) =\n        universe.spawn_builder().spawn(merge_executor);\n\n    // We want to make sure that the processing of the message gets\n    // aborted not by the actor framework, before the message is being processed.\n    //\n    // To do so, we\n    // - set two barrier so the actor pauses right upon entering the process_merge function\n    // - send the merge message\n    // - wait on the first barrier to ensure that the actor has reached the process_merge function\n    // - kill the universe\n    // - wait and release the second barrier so the actor can continue processing the merge message\n    //\n    // Before the controlled directory, the merge operation would have continued until it\n    // finished, taking hundreds of millisecs to terminate.\n    let before_universe_kill = Arc::new(Barrier::new(2));\n    let after_universe_kill = Arc::new(Barrier::new(2));\n    let before_universe_kill_clone = 
before_universe_kill.clone();\n    let after_universe_kill_clone = after_universe_kill.clone();\n    fail::cfg_callback(\"before-merge-split\", move || {\n        before_universe_kill_clone.wait();\n        after_universe_kill_clone.wait();\n    })\n    .unwrap();\n    fail::cfg(\n        \"after-merge-split\",\n        \"panic(merge should be failed by directory kill switch)\",\n    )\n    .unwrap();\n    merge_executor_mailbox.send_message(merge_scratch).await?;\n    before_universe_kill.wait();\n    universe.kill();\n    after_universe_kill.wait();\n    universe.quit().await;\n\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/cooperative_indexing.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::hash::{DefaultHasher, Hash, Hasher};\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse once_cell::sync::Lazy;\nuse quickwit_proto::indexing::{CpuCapacity, PIPELINE_FULL_CAPACITY, PipelineMetrics};\nuse tokio::sync::{OwnedSemaphorePermit, Semaphore};\nuse tokio::time::Instant;\n\n/// We allow ourselves to adjust the sleep time by at most `NUDGE_TOLERANCE`\n/// in order to steer a pipeline to its phase.\nconst NUDGE_TOLERANCE: Duration = Duration::from_secs(5);\n\n// Origin of time. It is used to compute the phase of the pipeline.\nstatic ORIGIN_OF_TIME: Lazy<Instant> = Lazy::new(Instant::now);\n\n/// Cooperative indexing is a mechanism to deal with a large number of pipelines.\n///\n/// Instead of having all pipelines index concurrently, cooperative indexing:\n/// - has them take turns, making sure that at most N pipelines are indexing at the same time.\n///   This has the benefit of reducing RAM usage (by having a limited number of `IndexWriter`s at\n///   the same time) and of reducing context switching.\n/// - keeps the different pipelines' work uniformly spread in time. 
If the system is not at\n///   capacity, we prefer to have the indexing pipeline as desynchronized as possible to make sure\n///   they don't all use the same resources (disk/cpu/network) at the same time.\n///\n/// It works by:\n/// - a semaphore is used to restrict the number of pipelines indexing at the same time.\n/// - in the indexer when `on_drain` is called, the indexer will cut a split and \"go to sleep\" for a\n///   given amount of time.\n///\n/// The key logic is in the computation of that sleep time.\n///\n/// We want to set it in order to steer the pipeline toward an ideal cycle with a period\n/// of `commit_timeout`.\n///\n/// A period in this ideal cycle should, for some k,\n/// - start at `t0 + k * commit_timeout + target_phase`\n/// - end at `t0 + (k+1)*commit_timeout + target_phase`.\n///\n/// `target_phase` is computed using a hash over the pipeline id, and meant to follow\n/// a uniform distribution over the interval [0, commit_timeout).\n///\n/// Each period of this cycle is divided into three phases.\n/// - waking [t_wake..t_work_start): acquisition of the period guard (this is instantaneous), then\n///   acquisition of the semaphore\n/// - working [t_work_start..t_work_end)\n/// - sleeping [t_work_end..t_sleep_end)\n///\n/// The idea is to first pick the sleep time to create a cycle of period\n/// `commit_timeout`.\n///   sleep_time := max(0, commit_timeout - (t_work_end - t_wake))\n///\n/// If the work phase is too long, the regular commit timeout mechanism\n/// kicks in and the pipeline will create a split without waiting for the\n/// mailbox to be drained.\n///\n/// We then allow ourselves to tweak the sleep time one way or another by at\n/// most `NUDGE_TOLERANCE` to eventually nudge the system toward the desired phase.\npub(crate) struct CooperativeIndexingCycle {\n    target_phase: Duration,\n    commit_timeout: Duration,\n    indexing_permits: Arc<Semaphore>,\n}\n\nimpl CooperativeIndexingCycle {\n    /// Creates a new cooperative indexing cycle 
object.\n    /// `phase_id` is hashed to compute the target phase.\n    pub fn new(\n        phase_id: &(impl Hash + ?Sized),\n        commit_timeout: Duration,\n        indexing_permits: Arc<Semaphore>,\n    ) -> CooperativeIndexingCycle {\n        assert!(commit_timeout.as_millis() > 0);\n        let mut hasher = DefaultHasher::new();\n        phase_id.hash(&mut hasher);\n        let target_phase_millis: u64 = hasher.finish() % commit_timeout.as_millis() as u64;\n        Self::new_with_phase(\n            Duration::from_millis(target_phase_millis),\n            commit_timeout,\n            indexing_permits,\n        )\n    }\n\n    fn new_with_phase(\n        target_phase: Duration,\n        commit_timeout: Duration,\n        indexing_permits: Arc<Semaphore>,\n    ) -> CooperativeIndexingCycle {\n        // Force the initialization of the origin of time.\n        let _t0 = *ORIGIN_OF_TIME;\n        CooperativeIndexingCycle {\n            target_phase,\n            commit_timeout,\n            indexing_permits,\n        }\n    }\n\n    pub fn initial_sleep_duration(&self) -> Duration {\n        let t0 = *ORIGIN_OF_TIME;\n        let commit_timeout_millis = self.commit_timeout.as_millis() as u64;\n        let current_phase_millis: u64 = t0.elapsed().as_millis() as u64 % commit_timeout_millis;\n        let target_phase_millis: u64 = self.target_phase.as_millis() as u64 % commit_timeout_millis;\n        let initial_sleep_millis: u64 = (commit_timeout_millis + target_phase_millis\n            - current_phase_millis)\n            % commit_timeout_millis;\n        if initial_sleep_millis + 2 * NUDGE_TOLERANCE.as_millis() as u64 > commit_timeout_millis {\n            // We are reasonably close to the target phase. No need to sleep. 
The nudge\n            // will be enough.\n            return Duration::default();\n        }\n        Duration::from_millis(initial_sleep_millis)\n    }\n\n    pub async fn cooperative_indexing_period(&self) -> CooperativeIndexingPeriod {\n        let t_wake = Instant::now();\n        let permit = Semaphore::acquire_owned(self.indexing_permits.clone())\n            .await\n            .unwrap();\n        let t_work_start = Instant::now();\n        CooperativeIndexingPeriod {\n            t_wake,\n            t_work_start,\n            commit_timeout: self.commit_timeout,\n            target_phase: self.target_phase,\n            _permit: permit,\n        }\n    }\n}\n\npub(crate) struct CooperativeIndexingPeriod {\n    // measured right before the acquisition of the indexing semaphore\n    t_wake: Instant,\n    // measured after the acquisition of the semaphore.\n    t_work_start: Instant,\n    commit_timeout: Duration,\n    target_phase: Duration,\n    _permit: OwnedSemaphorePermit,\n}\n\nimpl CooperativeIndexingPeriod {\n    fn compute_pipeline_metrics(\n        &self,\n        end: Instant,\n        uncompressed_num_bytes: u64,\n    ) -> PipelineMetrics {\n        let elapsed = end - self.t_work_start;\n        let throughput_mb_per_sec: u64 =\n            uncompressed_num_bytes / (1u64 + elapsed.as_micros() as u64);\n        let commit_timeout = self.commit_timeout;\n        let pipeline_throughput_fraction =\n            (elapsed.as_micros() as f32 / commit_timeout.as_micros() as f32).min(1.0f32);\n        let cpu_load: CpuCapacity = PIPELINE_FULL_CAPACITY * pipeline_throughput_fraction;\n        PipelineMetrics {\n            cpu_load,\n            throughput_mb_per_sec: throughput_mb_per_sec as u16,\n        }\n    }\n\n    fn compute_sleep_duration(&self, t_work_end: Instant) -> Duration {\n        let commit_timeout_millis = self.commit_timeout.as_millis() as u64;\n        let phase_millis: u64 =\n            ((t_work_end - *ORIGIN_OF_TIME).as_millis() as 
u64) % commit_timeout_millis;\n        let delta_phase: i64 = phase_millis as i64 - self.target_phase.as_millis() as i64;\n        // delta phase is within (-commit_timeout_millis, commit_timeout_millis)\n        // We fold it back to [-commit_timeout_millis/2, commit_timeout_millis/2)\n        let half_commit_timeout_millis = commit_timeout_millis as i64 / 2;\n        let delta_phase = if delta_phase >= half_commit_timeout_millis {\n            delta_phase - commit_timeout_millis as i64\n        } else if delta_phase < -half_commit_timeout_millis {\n            delta_phase + commit_timeout_millis as i64\n        } else {\n            delta_phase\n        };\n        let nudge_tolerance_millis = NUDGE_TOLERANCE.as_millis() as i64;\n        let nudge_millis: i64 = delta_phase.clamp(-nudge_tolerance_millis, nudge_tolerance_millis);\n        let sleep_duration_millis = self.commit_timeout.as_millis() as i64\n            - (t_work_end - self.t_wake).as_millis() as i64\n            - nudge_millis;\n        if sleep_duration_millis > 0 {\n            Duration::from_millis(sleep_duration_millis as u64)\n        } else {\n            Duration::ZERO\n        }\n    }\n\n    /// This drops the indexing permit, allowing another indexer to start indexing.\n    /// This function also returns the amount of time to sleep until the next period.\n    pub fn end_of_work(self, uncompressed_num_bytes: u64) -> (Duration, PipelineMetrics) {\n        let end = Instant::now();\n        let sleep_duration = self.compute_sleep_duration(end);\n        let metrics = self.compute_pipeline_metrics(end, uncompressed_num_bytes);\n        (sleep_duration, metrics)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[track_caller]\n    fn assert_approx_equal_sleep_time(left: Duration, right: Duration) {\n        let delta = (left.as_millis() as i128 - right.as_millis() as i128).unsigned_abs();\n        if delta >= NUDGE_TOLERANCE.mul_f32(1.1).as_millis() {\n            
panic!(\"{left:?} and {right:?} are not approximately equal.\");\n        }\n    }\n\n    #[track_caller]\n    fn assert_approx_equal(left: u32, right: u32) {\n        assert!(\n            left.abs_diff(right) * 10 <= (left + right),\n            \"inequal values {left} != {right}\"\n        );\n    }\n\n    #[track_caller]\n    fn assert_approx_metrics(left_metrics: &PipelineMetrics, right_metrics: &PipelineMetrics) {\n        assert_approx_equal(\n            left_metrics.throughput_mb_per_sec as u32,\n            right_metrics.throughput_mb_per_sec as u32,\n        );\n        assert_approx_equal(\n            left_metrics.cpu_load.cpu_millis(),\n            right_metrics.cpu_load.cpu_millis(),\n        );\n    }\n\n    #[tokio::test]\n    async fn test_initial_sleep_time() {\n        tokio::time::pause();\n        let t0 = *ORIGIN_OF_TIME;\n        for target_phase_secs in [0, 1, 2, 5, 10, 15, 20, 25, 29, 30, 1_000] {\n            for start_time_secs in [0, 1, 2, 5, 10, 15, 20, 25, 29, 30] {\n                let target_phase = Duration::from_secs(target_phase_secs);\n                let semaphore = Arc::new(Semaphore::new(1));\n                tokio::time::sleep(Duration::from_secs(start_time_secs)).await;\n                let cooperative_indexing = CooperativeIndexingCycle::new_with_phase(\n                    target_phase,\n                    Duration::from_secs(30),\n                    semaphore.clone(),\n                );\n                let initial_sleep_duration: Duration =\n                    cooperative_indexing.initial_sleep_duration();\n                tokio::time::sleep(initial_sleep_duration).await;\n                let target_phase_millis = cooperative_indexing.target_phase.as_millis() as i64;\n                let commit_timeout_ms = cooperative_indexing.commit_timeout.as_millis() as i64;\n                let phase_millis =\n                    (t0.elapsed().as_millis() as i64 - target_phase_millis) % commit_timeout_ms;\n                
assert!(phase_millis >= -100, \"{phase_millis}\");\n                assert!(phase_millis <= (NUDGE_TOLERANCE.as_millis() as i64) * 2 + 100);\n            }\n        }\n    }\n\n    #[tokio::test]\n    async fn test_cooperative_indexing_simple() {\n        tokio::time::pause();\n        let semaphore = Arc::new(Semaphore::new(1));\n        let cooperative_indexing =\n            CooperativeIndexingCycle::new(\"id\", Duration::from_secs(30), semaphore.clone());\n        let guard = cooperative_indexing.cooperative_indexing_period().await;\n        tokio::time::advance(Duration::from_secs(10)).await;\n        let (sleep_time, metrics) = guard.end_of_work(100_000_000);\n        assert_approx_equal_sleep_time(sleep_time, Duration::from_secs(20));\n        let expected_metrics = PipelineMetrics {\n            cpu_load: CpuCapacity::from_cpu_millis(PIPELINE_FULL_CAPACITY.cpu_millis() * 10 / 30),\n            throughput_mb_per_sec: 10u16,\n        };\n        assert_approx_metrics(&metrics, &expected_metrics)\n    }\n\n    fn drop_after<T: Send + 'static>(guard: T, duration: Duration) {\n        tokio::task::spawn(async move {\n            tokio::time::sleep(duration).await;\n            drop(guard);\n        });\n    }\n\n    #[tokio::test]\n    async fn test_cooperative_indexing_maximum_throughput() {\n        tokio::time::pause();\n        let semaphore = Arc::new(Semaphore::new(1));\n        let cooperative_indexing =\n            CooperativeIndexingCycle::new(\"id\", Duration::from_secs(30), semaphore.clone());\n        let semaphore_guard = Semaphore::acquire_owned(semaphore).await;\n        drop_after(semaphore_guard, Duration::from_secs(30));\n        let cycle_guard = cooperative_indexing.cooperative_indexing_period().await;\n        tokio::time::advance(Duration::from_secs(15)).await;\n        let (sleep_time, metrics) = cycle_guard.end_of_work(30_000_000);\n        let expected_metrics = PipelineMetrics {\n            cpu_load: 
CpuCapacity::from_cpu_millis(PIPELINE_FULL_CAPACITY.cpu_millis() * 15 / 30),\n            throughput_mb_per_sec: 1u16,\n        };\n        assert_approx_metrics(&metrics, &expected_metrics);\n        assert!(sleep_time.is_zero());\n    }\n\n    #[tokio::test]\n    async fn test_cooperative_indexing_simple_contention() {\n        tokio::time::pause();\n        let semaphore = Arc::new(Semaphore::new(1));\n        let cooperative_indexing =\n            CooperativeIndexingCycle::new(\"id\", Duration::from_secs(30), semaphore.clone());\n        let semaphore_guard = Semaphore::acquire_owned(semaphore).await;\n        drop_after(semaphore_guard, Duration::from_secs(10));\n        let cycle_guard = cooperative_indexing.cooperative_indexing_period().await;\n        tokio::time::advance(Duration::from_secs(10)).await;\n        let (sleep_time, metrics) = cycle_guard.end_of_work(100_000_000);\n        assert_approx_equal_sleep_time(sleep_time, Duration::from_secs(10));\n        let expected_metrics = PipelineMetrics {\n            cpu_load: CpuCapacity::from_cpu_millis(PIPELINE_FULL_CAPACITY.cpu_millis() * 10 / 30),\n            throughput_mb_per_sec: 10u16,\n        };\n        assert_approx_metrics(&metrics, &expected_metrics);\n    }\n\n    #[tokio::test]\n    async fn test_cooperative_indexing_nudge_to_phase() {\n        tokio::time::pause();\n        let num_threads = 10;\n        let num_pipelines = 100;\n        let num_steps = 15;\n        let semaphore = Arc::new(Semaphore::new(num_threads));\n        let commit_timeout = Duration::from_secs(30);\n        let t0 = Instant::now();\n        let mut handles = Vec::new();\n        for i in 0..num_pipelines {\n            let target_phase =\n                Duration::from_millis(commit_timeout.as_millis() as u64 * i / num_pipelines);\n            let cooperative_indexing = CooperativeIndexingCycle::new_with_phase(\n                target_phase,\n                commit_timeout,\n                semaphore.clone(),\n     
       );\n            let join_handle = tokio::task::spawn(async move {\n                let mut last_phase = 0;\n                for _ in 0..num_steps {\n                    let cycle_guard = cooperative_indexing.cooperative_indexing_period().await;\n                    let work_time = Duration::from_millis(10);\n                    tokio::time::sleep(work_time).await;\n                    last_phase =\n                        t0.elapsed().as_millis() as u64 % commit_timeout.as_millis() as u64;\n                    let (sleep_time, _) = cycle_guard.end_of_work(1_000_000);\n                    tokio::time::sleep(sleep_time).await;\n                }\n                last_phase\n            });\n            handles.push(join_handle);\n        }\n        for (i, phase_handle) in handles.into_iter().enumerate() {\n            let phase = phase_handle.await.unwrap() as u32;\n            let expected_phase_millis: u32 =\n                commit_timeout.as_millis() as u32 * i as u32 / num_pipelines as u32;\n            assert!(phase.abs_diff(expected_phase_millis) < 3);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/doc_processor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::string::FromUtf8Error;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicU64, Ordering};\n\nuse anyhow::{Context, bail};\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::metrics::IntCounter;\nuse quickwit_common::rate_limited_tracing::rate_limited_warn;\nuse quickwit_common::runtimes::RuntimeType;\nuse quickwit_config::{SourceInputFormat, TransformConfig};\nuse quickwit_doc_mapper::{DocMapper, DocParsingError, JsonObject};\nuse quickwit_opentelemetry::otlp::{\n    JsonLogIterator, JsonSpanIterator, OtlpLogsError, OtlpTracesError, parse_otlp_logs_json,\n    parse_otlp_logs_protobuf, parse_otlp_spans_json, parse_otlp_spans_protobuf,\n};\nuse quickwit_proto::types::{IndexId, SourceId};\nuse serde::Serialize;\nuse serde_json::Value as JsonValue;\nuse tantivy::schema::{Field, Value};\nuse tantivy::{DateTime, TantivyDocument};\nuse thiserror::Error;\nuse tokio::runtime::Handle;\n\n#[cfg(feature = \"vrl\")]\nuse super::vrl_processing::*;\nuse crate::actors::Indexer;\nuse crate::models::{\n    NewPublishLock, NewPublishToken, ProcessedDoc, ProcessedDocBatch, PublishLock, RawDocBatch,\n};\n\nconst PLAIN_TEXT: &str = \"plain_text\";\n\npub(super) struct JsonDoc {\n    json_obj: JsonObject,\n    num_bytes: usize,\n}\n\nimpl 
JsonDoc {\n    pub fn new(json_obj: JsonObject, num_bytes: usize) -> Self {\n        Self {\n            json_obj,\n            num_bytes,\n        }\n    }\n\n    pub fn try_from_json_value(\n        json_value: JsonValue,\n        num_bytes: usize,\n    ) -> Result<Self, DocProcessorError> {\n        match json_value {\n            JsonValue::Object(json_obj) => Ok(Self::new(json_obj, num_bytes)),\n            _ => Err(DocProcessorError::JsonParsing(\n                \"document is not an object\".to_string(),\n            )),\n        }\n    }\n\n    #[cfg(feature = \"vrl\")]\n    pub fn try_from_vrl_doc(vrl_doc: VrlDoc) -> Result<Self, DocProcessorError> {\n        let json_value = serde_json::to_value(vrl_doc.vrl_value)?;\n        Self::try_from_json_value(json_value, vrl_doc.num_bytes)\n    }\n}\n\n#[allow(clippy::enum_variant_names)]\n#[derive(Error, Debug)]\npub enum DocProcessorError {\n    #[error(\"doc mapper parse error: {0}\")]\n    DocMapperParsing(DocParsingError),\n    #[error(\"JSON parse error: {0}\")]\n    JsonParsing(String),\n    #[error(\"OTLP log records parse error: {0}\")]\n    OltpLogsParsing(OtlpLogsError),\n    #[error(\"OTLP traces parse error: {0}\")]\n    OltpTracesParsing(OtlpTracesError),\n    #[cfg(feature = \"vrl\")]\n    #[error(\"VRL transform error: {0}\")]\n    Transform(VrlTerminate),\n}\n\nimpl From<OtlpLogsError> for DocProcessorError {\n    fn from(error: OtlpLogsError) -> Self {\n        Self::OltpLogsParsing(error)\n    }\n}\n\nimpl From<OtlpTracesError> for DocProcessorError {\n    fn from(error: OtlpTracesError) -> Self {\n        Self::OltpTracesParsing(error)\n    }\n}\n\nimpl From<DocParsingError> for DocProcessorError {\n    fn from(error: DocParsingError) -> Self {\n        Self::DocMapperParsing(error)\n    }\n}\n\nimpl From<serde_json::Error> for DocProcessorError {\n    fn from(error: serde_json::Error) -> Self {\n        Self::JsonParsing(error.to_string())\n    }\n}\n\nimpl From<FromUtf8Error> for 
DocProcessorError {\n    fn from(error: FromUtf8Error) -> Self {\n        Self::JsonParsing(error.to_string())\n    }\n}\n\n#[cfg(feature = \"vrl\")]\nfn try_into_vrl_doc(\n    input_format: SourceInputFormat,\n    raw_doc: Bytes,\n    num_bytes: usize,\n) -> Result<VrlDoc, DocProcessorError> {\n    let vrl_value = match input_format {\n        SourceInputFormat::Json => serde_json::from_slice::<VrlValue>(&raw_doc)?,\n        SourceInputFormat::PlainText => {\n            let mut map = std::collections::BTreeMap::new();\n            let key = vrl::value::KeyString::from(PLAIN_TEXT);\n            let value = VrlValue::Bytes(raw_doc);\n            map.insert(key, value);\n            VrlValue::Object(map)\n        }\n        SourceInputFormat::OtlpLogsJson\n        | SourceInputFormat::OtlpLogsProtobuf\n        | SourceInputFormat::OtlpTracesJson\n        | SourceInputFormat::OtlpTracesProtobuf => {\n            panic!(\"OTLP logs or traces do not support VRL transforms\")\n        }\n    };\n    let vrl_doc = VrlDoc::new(vrl_value, num_bytes);\n    Ok(vrl_doc)\n}\n\nfn try_into_json_docs(\n    input_format: SourceInputFormat,\n    raw_doc: Bytes,\n    num_bytes: usize,\n) -> JsonDocIterator {\n    match input_format {\n        SourceInputFormat::Json => {\n            let json_doc_result = serde_json::from_slice::<JsonObject>(&raw_doc)\n                .map(|json_obj| JsonDoc::new(json_obj, num_bytes));\n            JsonDocIterator::from(json_doc_result)\n        }\n        SourceInputFormat::OtlpLogsJson => {\n            let logs = parse_otlp_logs_json(&raw_doc);\n            JsonDocIterator::from(logs)\n        }\n        SourceInputFormat::OtlpLogsProtobuf => {\n            let logs = parse_otlp_logs_protobuf(&raw_doc);\n            JsonDocIterator::from(logs)\n        }\n        SourceInputFormat::OtlpTracesJson => {\n            let spans = parse_otlp_spans_json(&raw_doc);\n            JsonDocIterator::from(spans)\n        }\n        
SourceInputFormat::OtlpTracesProtobuf => {\n            let spans = parse_otlp_spans_protobuf(&raw_doc);\n            JsonDocIterator::from(spans)\n        }\n        SourceInputFormat::PlainText => {\n            let json_doc_result = String::from_utf8(raw_doc.to_vec()).map(|value| {\n                let mut json_obj = serde_json::Map::with_capacity(1);\n                let key = PLAIN_TEXT.to_string();\n                json_obj.insert(key, JsonValue::String(value));\n                JsonDoc::new(json_obj, num_bytes)\n            });\n            JsonDocIterator::from(json_doc_result)\n        }\n    }\n}\n\n#[cfg(feature = \"vrl\")]\nfn parse_raw_doc(\n    input_format: SourceInputFormat,\n    raw_doc: Bytes,\n    num_bytes: usize,\n    vrl_program_opt: Option<&mut VrlProgram>,\n) -> JsonDocIterator {\n    let Some(vrl_program) = vrl_program_opt else {\n        return try_into_json_docs(input_format, raw_doc, num_bytes);\n    };\n    let json_doc_result = try_into_vrl_doc(input_format, raw_doc, num_bytes)\n        .and_then(|vrl_doc| vrl_program.transform_doc(vrl_doc))\n        .and_then(JsonDoc::try_from_vrl_doc);\n\n    JsonDocIterator::from(json_doc_result)\n}\n\n#[cfg(not(feature = \"vrl\"))]\nfn parse_raw_doc(\n    input_format: SourceInputFormat,\n    raw_doc: Bytes,\n    num_bytes: usize,\n    _vrl_program_opt: Option<&mut VrlProgram>,\n) -> JsonDocIterator {\n    try_into_json_docs(input_format, raw_doc, num_bytes)\n}\n\nenum JsonDocIterator {\n    One(Option<Result<JsonDoc, DocProcessorError>>),\n    Logs(JsonLogIterator),\n    Spans(JsonSpanIterator),\n}\n\nimpl Iterator for JsonDocIterator {\n    type Item = Result<JsonDoc, DocProcessorError>;\n\n    fn next(&mut self) -> Option<Self::Item> {\n        match self {\n            Self::One(opt) => opt.take(),\n            Self::Logs(logs) => logs\n                .next()\n                .map(|(json_value, num_bytes)| JsonDoc::try_from_json_value(json_value, num_bytes)),\n            Self::Spans(spans) => 
spans\n                .next()\n                .map(|(json_value, num_bytes)| JsonDoc::try_from_json_value(json_value, num_bytes)),\n        }\n    }\n}\n\nimpl<E> From<Result<JsonDoc, E>> for JsonDocIterator\nwhere E: Into<DocProcessorError>\n{\n    fn from(result: Result<JsonDoc, E>) -> Self {\n        match result {\n            Ok(json_doc) => Self::One(Some(Ok(json_doc))),\n            Err(error) => Self::One(Some(Err(error.into()))),\n        }\n    }\n}\n\nimpl From<Result<JsonLogIterator, OtlpLogsError>> for JsonDocIterator {\n    fn from(result: Result<JsonLogIterator, OtlpLogsError>) -> Self {\n        match result {\n            Ok(logs) => Self::Logs(logs),\n            Err(error) => Self::One(Some(Err(DocProcessorError::from(error)))),\n        }\n    }\n}\n\nimpl From<Result<JsonSpanIterator, OtlpTracesError>> for JsonDocIterator {\n    fn from(result: Result<JsonSpanIterator, OtlpTracesError>) -> Self {\n        match result {\n            Ok(spans) => Self::Spans(spans),\n            Err(error) => Self::One(Some(Err(DocProcessorError::from(error)))),\n        }\n    }\n}\n\n#[derive(Debug)]\npub struct DocProcessorCounter {\n    pub num_docs: AtomicU64,\n    pub num_docs_metric: IntCounter,\n    pub num_bytes_metric: IntCounter,\n}\n\nimpl Serialize for DocProcessorCounter {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: serde::Serializer {\n        serializer.serialize_u64(self.get_num_docs())\n    }\n}\n\nimpl DocProcessorCounter {\n    fn for_index_and_doc_processor_outcome(index: &str, outcome: &str) -> DocProcessorCounter {\n        let index_label = quickwit_common::metrics::index_label(index);\n        let labels = [index_label, outcome];\n        DocProcessorCounter {\n            num_docs: Default::default(),\n            num_docs_metric: crate::metrics::INDEXER_METRICS\n                .processed_docs_total\n                .with_label_values(labels),\n            num_bytes_metric: 
crate::metrics::INDEXER_METRICS\n                .processed_bytes\n                .with_label_values(labels),\n        }\n    }\n\n    #[inline(always)]\n    fn get_num_docs(&self) -> u64 {\n        self.num_docs.load(Ordering::Relaxed)\n    }\n\n    fn record_doc(&self, num_bytes: u64) {\n        self.num_docs.fetch_add(1, Ordering::Relaxed);\n        self.num_docs_metric.inc();\n        self.num_bytes_metric.inc_by(num_bytes);\n    }\n}\n\n#[derive(Debug, Serialize)]\npub struct DocProcessorCounters {\n    index_id: IndexId,\n    source_id: SourceId,\n\n    /// Overall number of documents received, partitioned\n    /// into 5 categories:\n    /// - number of valid docs.\n    /// - number of docs for which the doc mapper returned an error.\n    /// - number of docs that could not be transformed.\n    /// - number of docs that were not valid JSON.\n    /// - number of docs that could not be parsed as OTLP.\n    pub valid: DocProcessorCounter,\n    pub doc_mapper_errors: DocProcessorCounter,\n    pub transform_errors: DocProcessorCounter,\n    pub json_parse_errors: DocProcessorCounter,\n    pub otlp_parse_errors: DocProcessorCounter,\n\n    /// Number of bytes that went through the indexer\n    /// during its entire lifetime.\n    ///\n    /// Includes both valid and invalid documents.\n    pub num_bytes_total: AtomicU64,\n}\n\nimpl DocProcessorCounters {\n    pub fn new(index_id: IndexId, source_id: SourceId) -> Self {\n        let valid_docs =\n            DocProcessorCounter::for_index_and_doc_processor_outcome(&index_id, \"valid\");\n        let doc_mapper_errors =\n            DocProcessorCounter::for_index_and_doc_processor_outcome(&index_id, \"doc_mapper_error\");\n        let transform_errors =\n            DocProcessorCounter::for_index_and_doc_processor_outcome(&index_id, \"transform_error\");\n        let json_parse_errors =\n            DocProcessorCounter::for_index_and_doc_processor_outcome(&index_id, \"json_parse_error\");\n       
 let otlp_parse_errors =\n            DocProcessorCounter::for_index_and_doc_processor_outcome(&index_id, \"otlp_parse_error\");\n        DocProcessorCounters {\n            index_id,\n            source_id,\n\n            valid: valid_docs,\n            doc_mapper_errors,\n            transform_errors,\n            json_parse_errors,\n            otlp_parse_errors,\n            num_bytes_total: Default::default(),\n        }\n    }\n\n    /// Returns the overall number of docs that went through the indexer (valid or not).\n    pub fn num_processed_docs(&self) -> u64 {\n        self.valid.get_num_docs()\n            + self.doc_mapper_errors.get_num_docs()\n            + self.json_parse_errors.get_num_docs()\n            + self.otlp_parse_errors.get_num_docs()\n            + self.transform_errors.get_num_docs()\n    }\n\n    /// Returns the overall number of docs that were sent to the indexer but were invalid.\n    /// (For instance, because they were missing a required field or because\n    /// their format was invalid.)\n    pub fn num_invalid_docs(&self) -> u64 {\n        self.doc_mapper_errors.get_num_docs()\n            + self.json_parse_errors.get_num_docs()\n            + self.otlp_parse_errors.get_num_docs()\n            + self.transform_errors.get_num_docs()\n    }\n\n    pub fn record_valid(&self, num_bytes: u64) {\n        self.num_bytes_total.fetch_add(num_bytes, Ordering::Relaxed);\n        self.valid.record_doc(num_bytes);\n    }\n\n    pub fn record_error(&self, error: DocProcessorError, num_bytes: u64) {\n        self.num_bytes_total.fetch_add(num_bytes, Ordering::Relaxed);\n        match error {\n            DocProcessorError::DocMapperParsing(_) => {\n                self.doc_mapper_errors.record_doc(num_bytes);\n            }\n            DocProcessorError::JsonParsing(_) => {\n                self.json_parse_errors.record_doc(num_bytes);\n            }\n            DocProcessorError::OltpLogsParsing(_) | 
DocProcessorError::OltpTracesParsing(_) => {\n                self.otlp_parse_errors.record_doc(num_bytes);\n            }\n            #[cfg(feature = \"vrl\")]\n            DocProcessorError::Transform(_) => {\n                self.transform_errors.record_doc(num_bytes);\n            }\n        };\n    }\n}\n\npub struct DocProcessor {\n    doc_mapper: Arc<DocMapper>,\n    indexer_mailbox: Mailbox<Indexer>,\n    timestamp_field_opt: Option<Field>,\n    counters: Arc<DocProcessorCounters>,\n    publish_lock: PublishLock,\n    #[cfg(feature = \"vrl\")]\n    transform_opt: Option<VrlProgram>,\n    input_format: SourceInputFormat,\n}\n\nimpl DocProcessor {\n    pub fn try_new(\n        index_id: IndexId,\n        source_id: SourceId,\n        doc_mapper: Arc<DocMapper>,\n        indexer_mailbox: Mailbox<Indexer>,\n        transform_config_opt: Option<TransformConfig>,\n        input_format: SourceInputFormat,\n    ) -> anyhow::Result<Self> {\n        let timestamp_field_opt = extract_timestamp_field(&doc_mapper)?;\n        if cfg!(not(feature = \"vrl\")) && transform_config_opt.is_some() {\n            bail!(\"VRL is not enabled: please recompile with the `vrl` feature\")\n        }\n        Ok(DocProcessor {\n            doc_mapper,\n            indexer_mailbox,\n            timestamp_field_opt,\n            counters: Arc::new(DocProcessorCounters::new(index_id, source_id)),\n            publish_lock: PublishLock::default(),\n            #[cfg(feature = \"vrl\")]\n            transform_opt: transform_config_opt\n                .map(VrlProgram::try_from_transform_config)\n                .transpose()?,\n            input_format,\n        })\n    }\n\n    // Extracts a timestamp from a tantivy document.\n    //\n    // If a timestamp field is configured in the doc mapper and the document has\n    // no value for it, returns a `DocParsingError::RequiredField` error.\n    fn extract_timestamp(\n        &self,\n        doc: &TantivyDocument,\n    ) -> Result<Option<DateTime>, 
DocProcessorError> {\n        let Some(timestamp_field) = self.timestamp_field_opt else {\n            return Ok(None);\n        };\n        let timestamp = doc\n            .get_first(timestamp_field)\n            .and_then(|val| val.as_datetime())\n            .ok_or(DocProcessorError::from(DocParsingError::RequiredField(\n                \"timestamp field is required\".to_string(),\n            )))?;\n        Ok(Some(timestamp))\n    }\n\n    fn process_raw_doc(&mut self, raw_doc: Bytes, processed_docs: &mut Vec<ProcessedDoc>) {\n        let num_bytes = raw_doc.len();\n\n        #[cfg(feature = \"vrl\")]\n        let transform_opt = self.transform_opt.as_mut();\n        #[cfg(not(feature = \"vrl\"))]\n        let transform_opt: Option<&mut VrlProgram> = None;\n\n        for json_doc_result in parse_raw_doc(self.input_format, raw_doc, num_bytes, transform_opt) {\n            let processed_doc_result =\n                json_doc_result.and_then(|json_doc| self.process_json_doc(json_doc));\n\n            match processed_doc_result {\n                Ok(processed_doc) => {\n                    self.counters.record_valid(processed_doc.num_bytes as u64);\n                    processed_docs.push(processed_doc);\n                }\n                Err(error) => {\n                    rate_limited_warn!(\n                        limit_per_min = 10,\n                        index_id = self.counters.index_id,\n                        source_id = self.counters.source_id,\n                        \"{error}\",\n                    );\n                    self.counters.record_error(error, num_bytes as u64);\n                }\n            }\n        }\n    }\n\n    fn process_json_doc(&self, json_doc: JsonDoc) -> Result<ProcessedDoc, DocProcessorError> {\n        let num_bytes = json_doc.num_bytes;\n\n        let (partition, doc) = self\n            .doc_mapper\n            .doc_from_json_obj(json_doc.json_obj, json_doc.num_bytes as u64)?;\n        let timestamp_opt = 
self.extract_timestamp(&doc)?;\n        Ok(ProcessedDoc {\n            doc,\n            timestamp_opt,\n            partition,\n            num_bytes,\n        })\n    }\n}\n\nfn extract_timestamp_field(doc_mapper: &DocMapper) -> anyhow::Result<Option<Field>> {\n    let schema = doc_mapper.schema();\n    let Some(timestamp_field_name) = doc_mapper.timestamp_field_name() else {\n        return Ok(None);\n    };\n    let timestamp_field = schema\n        .get_field(timestamp_field_name)\n        .context(\"failed to find timestamp field in schema\")?;\n    Ok(Some(timestamp_field))\n}\n\n#[cfg(not(feature = \"vrl\"))]\nstruct VrlProgram {}\n\n#[async_trait]\nimpl Actor for DocProcessor {\n    type ObservableState = Arc<DocProcessorCounters>;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(10)\n    }\n\n    fn runtime_handle(&self) -> Handle {\n        RuntimeType::Blocking.get_runtime_handle()\n    }\n\n    #[inline]\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    async fn finalize(\n        &mut self,\n        exit_status: &ActorExitStatus,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        match exit_status {\n            ActorExitStatus::DownstreamClosed\n            | ActorExitStatus::Killed\n            | ActorExitStatus::Failure(_)\n            | ActorExitStatus::Panicked => return Ok(()),\n            ActorExitStatus::Quit | ActorExitStatus::Success => {\n                let _ = ctx.send_exit_with_success(&self.indexer_mailbox).await;\n            }\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<RawDocBatch> for DocProcessor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        raw_doc_batch: RawDocBatch,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if self.publish_lock.is_dead() {\n         
   return Ok(());\n        }\n        let mut processed_docs: Vec<ProcessedDoc> = Vec::with_capacity(raw_doc_batch.docs.len());\n\n        for raw_doc in raw_doc_batch.docs {\n            let _protected_zone_guard = ctx.protect_zone();\n            self.process_raw_doc(raw_doc, &mut processed_docs);\n            ctx.record_progress();\n        }\n        let processed_doc_batch = ProcessedDocBatch::new(\n            processed_docs,\n            raw_doc_batch.checkpoint_delta,\n            raw_doc_batch.force_commit,\n        );\n        ctx.send_message(&self.indexer_mailbox, processed_doc_batch)\n            .await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<NewPublishLock> for DocProcessor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: NewPublishLock,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let NewPublishLock(publish_lock) = &message;\n        self.publish_lock = publish_lock.clone();\n        ctx.send_message(&self.indexer_mailbox, message).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<NewPublishToken> for DocProcessor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: NewPublishToken,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        ctx.send_message(&self.indexer_mailbox, message).await?;\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n\n    use prost::Message;\n    use quickwit_actors::Universe;\n    use quickwit_common::uri::Uri;\n    use quickwit_config::{SearchSettings, build_doc_mapper};\n    use quickwit_doc_mapper::{DocMapper, default_doc_mapper_for_test};\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n    use quickwit_opentelemetry::otlp::{OtlpGrpcLogsService, OtlpGrpcTracesService};\n    use quickwit_proto::opentelemetry::proto::collector::logs::v1::ExportLogsServiceRequest;\n    use 
quickwit_proto::opentelemetry::proto::collector::trace::v1::ExportTraceServiceRequest;\n    use quickwit_proto::opentelemetry::proto::common::v1::AnyValue as OtlpAnyValue;\n    use quickwit_proto::opentelemetry::proto::common::v1::any_value::Value as OtlpAnyValueValue;\n    use quickwit_proto::opentelemetry::proto::logs::v1::{LogRecord, ResourceLogs, ScopeLogs};\n    use quickwit_proto::opentelemetry::proto::trace::v1::{ResourceSpans, ScopeSpans, Span};\n    use serde_json::Value as JsonValue;\n    use tantivy::Document;\n    use tantivy::schema::NamedFieldDocument;\n\n    use super::*;\n    use crate::models::{PublishLock, RawDocBatch};\n\n    #[tokio::test]\n    async fn test_doc_processor_simple() {\n        let index_id = \"my-index\";\n        let source_id = \"my-source\";\n        let universe = Universe::with_accelerated_time();\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            index_id.to_string(),\n            source_id.to_string(),\n            doc_mapper.clone(),\n            indexer_mailbox,\n            None,\n            SourceInputFormat::Json,\n        )\n        .unwrap();\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n        let checkpoint_delta = SourceCheckpointDelta::from_range(0..4);\n        doc_processor_mailbox\n            .send_message(RawDocBatch::for_test(\n                &[\n                    br#\"{\"body\": \"happy\", \"response_date\": \"2021-12-19T16:39:57+00:00\", \"response_time\": 12, \"response_payload\": \"YWJj\"}\"#, // missing timestamp\n                    br#\"{\"body\": \"happy\", \"timestamp\": 1628837062, \"response_date\": \"2021-12-19T16:39:59+00:00\", \"response_time\": 2, \"response_payload\": \"YWJj\"}\"#, // ok\n                    br#\"{\"body\": \"happy2\", \"timestamp\": 
1628837062, \"response_date\": \"2021-12-19T16:40:57+00:00\", \"response_time\": 13, \"response_payload\": \"YWJj\"}\"#, // ok\n                    b\"{\", // invalid json\n                ],\n                0..4,\n            ))\n            .await.unwrap();\n\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.index_id, index_id);\n        assert_eq!(counters.source_id, source_id);\n        assert_eq!(counters.doc_mapper_errors.get_num_docs(), 1);\n        assert_eq!(counters.json_parse_errors.get_num_docs(), 1);\n        assert_eq!(counters.transform_errors.get_num_docs(), 0);\n        assert_eq!(counters.otlp_parse_errors.get_num_docs(), 0);\n        assert_eq!(counters.valid.get_num_docs(), 2);\n        assert_eq!(counters.num_bytes_total.load(Ordering::Relaxed), 387);\n\n        let output_messages = indexer_inbox.drain_for_test();\n        assert_eq!(output_messages.len(), 1);\n        let batch = *(output_messages\n            .into_iter()\n            .next()\n            .unwrap()\n            .downcast::<ProcessedDocBatch>()\n            .unwrap());\n        assert_eq!(batch.docs.len(), 2);\n        assert_eq!(batch.checkpoint_delta, checkpoint_delta);\n\n        let schema = doc_mapper.schema();\n        let NamedFieldDocument(named_field_doc_map) = batch.docs[0].doc.to_named_doc(&schema);\n        let doc_json = JsonValue::Object(doc_mapper.doc_to_json(named_field_doc_map).unwrap());\n        assert_eq!(\n            doc_json,\n            serde_json::json!({\n                \"_source\": {\n                    \"body\": \"happy\",\n                    \"response_date\": \"2021-12-19T16:39:59Z\",\n                    \"response_payload\": \"YWJj\",\n                    \"response_time\": 2,\n                    \"timestamp\": 1628837062\n                },\n                \"body\": \"happy\",\n                \"response_date\": 
\"2021-12-19T16:39:59Z\",\n                \"response_payload\": \"YWJj\",\n                \"response_time\": 2.0,\n                \"timestamp\": 1628837062\n            })\n        );\n        universe.assert_quit().await;\n    }\n\n    const DOCMAPPER_WITH_PARTITION_JSON: &str = r#\"\n        {\n            \"tag_fields\": [\"tenant\"],\n            \"partition_key\": \"tenant\",\n            \"field_mappings\": [\n                { \"name\": \"tenant\", \"type\": \"text\", \"tokenizer\": \"raw\", \"indexed\": true },\n                { \"name\": \"body\", \"type\": \"text\" }\n            ]\n        }\"#;\n\n    #[tokio::test]\n    async fn test_doc_processor_partitioning() {\n        let doc_mapper: Arc<DocMapper> =\n            Arc::new(serde_json::from_str::<DocMapper>(DOCMAPPER_WITH_PARTITION_JSON).unwrap());\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::Json,\n        )\n        .unwrap();\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n        doc_processor_mailbox\n            .send_message(RawDocBatch::for_test(\n                &[\n                    br#\"{\"tenant\": \"tenant_1\", \"body\": \"first doc for tenant 1\"}\"#,\n                    br#\"{\"tenant\": \"tenant_2\", \"body\": \"first doc for tenant 2\"}\"#,\n                    br#\"{\"tenant\": \"tenant_1\", \"body\": \"second doc for tenant 1\"}\"#,\n                    br#\"{\"tenant\": \"tenant_2\", \"body\": \"second doc for tenant 2\"}\"#,\n                ],\n                0..2,\n            ))\n            .await\n            .unwrap();\n\n        universe\n            
.send_exit_with_success(&doc_processor_mailbox)\n            .await\n            .unwrap();\n        let (exit_status, _) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        let processed_doc_batches: Vec<ProcessedDocBatch> = indexer_inbox.drain_for_test_typed();\n        assert_eq!(processed_doc_batches.len(), 1);\n        let partition_ids: Vec<u64> = processed_doc_batches[0]\n            .docs\n            .iter()\n            .map(|doc| doc.partition)\n            .collect();\n        assert_eq!(partition_ids[0], partition_ids[2]);\n        assert_eq!(partition_ids[1], partition_ids[3]);\n        assert_ne!(partition_ids[0], partition_ids[1]);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_doc_processor_forward_publish_lock() {\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::Json,\n        )\n        .unwrap();\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n        let publish_lock = PublishLock::default();\n        doc_processor_mailbox\n            .send_message(NewPublishLock(publish_lock.clone()))\n            .await\n            .unwrap();\n        universe\n            .send_exit_with_success(&doc_processor_mailbox)\n            .await\n            .unwrap();\n        let (exit_status, _) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        let publish_locks: Vec<NewPublishLock> = indexer_inbox.drain_for_test_typed();\n        
assert_eq!(&publish_locks, &[NewPublishLock(publish_lock)]);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_doc_processor_ignores_messages_when_publish_lock_is_dead() {\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::Json,\n        )\n        .unwrap();\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n        let publish_lock = PublishLock::default();\n        doc_processor_mailbox\n            .send_message(NewPublishLock(publish_lock.clone()))\n            .await\n            .unwrap();\n        doc_processor_handle.process_pending_and_observe().await;\n        publish_lock.kill().await;\n        doc_processor_mailbox\n            .send_message(RawDocBatch::for_test(\n                &[\n                    br#\"{\"body\": \"happy\", \"timestamp\": 1628837062, \"response_date\": \"2021-12-19T16:39:59+00:00\", \"response_time\": 2, \"response_payload\": \"YWJj\"}\"#,\n                ],\n                0..1,\n            ))\n            .await.unwrap();\n        universe\n            .send_exit_with_success(&doc_processor_mailbox)\n            .await\n            .unwrap();\n        let (exit_status, _indexer_counters) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        let indexer_messages: Vec<ProcessedDocBatch> = indexer_inbox.drain_for_test_typed();\n        assert!(indexer_messages.is_empty());\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn 
test_doc_processor_otlp_logs_json() {\n        let root_uri = Uri::for_test(\"ram:///indexes\");\n        let index_config = OtlpGrpcLogsService::index_config(&root_uri).unwrap();\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, &SearchSettings::default()).unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::OtlpLogsJson,\n        )\n        .unwrap();\n\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n\n        let scope_logs = vec![ScopeLogs {\n            log_records: vec![\n                LogRecord {\n                    time_unix_nano: 1_000_000_000,\n                    body: Some(OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\n                            \"foo log message\".to_string(),\n                        )),\n                    }),\n                    ..Default::default()\n                },\n                LogRecord {\n                    time_unix_nano: 1_000_000_001,\n                    body: Some(OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\n                            \"bar log message\".to_string(),\n                        )),\n                    }),\n                    ..Default::default()\n                },\n            ],\n            ..Default::default()\n        }];\n        let resource_logs = vec![ResourceLogs {\n            scope_logs,\n            ..Default::default()\n        }];\n        let request = ExportLogsServiceRequest { resource_logs };\n        let raw_doc_json = serde_json::to_vec(&request).unwrap();\n       
 let raw_doc_batch = RawDocBatch::for_test(&[&raw_doc_json], 0..2);\n        doc_processor_mailbox\n            .send_message(raw_doc_batch)\n            .await\n            .unwrap();\n\n        universe\n            .send_exit_with_success(&doc_processor_mailbox)\n            .await\n            .unwrap();\n\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.valid.get_num_docs(), 2);\n\n        let batch = indexer_inbox.drain_for_test_typed::<ProcessedDocBatch>();\n        assert_eq!(batch.len(), 1);\n        assert_eq!(batch[0].docs.len(), 2);\n\n        let (exit_status, _) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_doc_processor_otlp_logs_proto() {\n        let root_uri = Uri::for_test(\"ram:///indexes\");\n        let index_config = OtlpGrpcLogsService::index_config(&root_uri).unwrap();\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, &SearchSettings::default()).unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::OtlpLogsProtobuf,\n        )\n        .unwrap();\n\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n\n        let scope_logs = vec![ScopeLogs {\n            log_records: vec![\n                LogRecord {\n                    time_unix_nano: 1_000_000_000,\n                    body: Some(OtlpAnyValue {\n                        value: 
Some(OtlpAnyValueValue::StringValue(\n                            \"foo log message\".to_string(),\n                        )),\n                    }),\n                    ..Default::default()\n                },\n                LogRecord {\n                    time_unix_nano: 1_000_000_001,\n                    body: Some(OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\n                            \"bar log message\".to_string(),\n                        )),\n                    }),\n                    ..Default::default()\n                },\n            ],\n            ..Default::default()\n        }];\n        let resource_logs = vec![ResourceLogs {\n            scope_logs,\n            ..Default::default()\n        }];\n        let request = ExportLogsServiceRequest { resource_logs };\n        let mut raw_doc_buffer = Vec::new();\n        request.encode(&mut raw_doc_buffer).unwrap();\n\n        let raw_doc_batch = RawDocBatch::for_test(&[&raw_doc_buffer], 0..2);\n        doc_processor_mailbox\n            .send_message(raw_doc_batch)\n            .await\n            .unwrap();\n\n        universe\n            .send_exit_with_success(&doc_processor_mailbox)\n            .await\n            .unwrap();\n\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.valid.get_num_docs(), 2);\n\n        let batch = indexer_inbox.drain_for_test_typed::<ProcessedDocBatch>();\n        assert_eq!(batch.len(), 1);\n        assert_eq!(batch[0].docs.len(), 2);\n\n        let (exit_status, _) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_doc_processor_otlp_traces_json() {\n        let root_uri = Uri::for_test(\"ram:///indexes\");\n        let index_config = 
OtlpGrpcTracesService::index_config(&root_uri).unwrap();\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, &SearchSettings::default()).unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::OtlpTracesJson,\n        )\n        .unwrap();\n\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n\n        let scope_spans = vec![ScopeSpans {\n            spans: vec![\n                Span {\n                    trace_id: vec![1; 16],\n                    span_id: vec![2; 8],\n                    start_time_unix_nano: 1_000_000_001,\n                    end_time_unix_nano: 1_000_000_002,\n                    ..Default::default()\n                },\n                Span {\n                    trace_id: vec![3; 16],\n                    span_id: vec![4; 8],\n                    start_time_unix_nano: 2_000_000_001,\n                    end_time_unix_nano: 2_000_000_002,\n                    ..Default::default()\n                },\n            ],\n            ..Default::default()\n        }];\n        let resource_spans = vec![ResourceSpans {\n            scope_spans,\n            ..Default::default()\n        }];\n        let request = ExportTraceServiceRequest { resource_spans };\n        let raw_doc_json = serde_json::to_vec(&request).unwrap();\n        let raw_doc_batch = RawDocBatch::for_test(&[&raw_doc_json], 0..2);\n        doc_processor_mailbox\n            .send_message(raw_doc_batch)\n            .await\n            .unwrap();\n\n        universe\n            .send_exit_with_success(&doc_processor_mailbox)\n            .await\n           
 .unwrap();\n\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.valid.get_num_docs(), 2);\n\n        let batch = indexer_inbox.drain_for_test_typed::<ProcessedDocBatch>();\n        assert_eq!(batch.len(), 1);\n        assert_eq!(batch[0].docs.len(), 2);\n\n        let (exit_status, _) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_doc_processor_otlp_traces_proto() {\n        let root_uri = Uri::for_test(\"ram:///indexes\");\n        let index_config = OtlpGrpcTracesService::index_config(&root_uri).unwrap();\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, &SearchSettings::default()).unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_processor = DocProcessor::try_new(\n            \"my-index\".to_string(),\n            \"my-source\".to_string(),\n            doc_mapper,\n            indexer_mailbox,\n            None,\n            SourceInputFormat::OtlpTracesProtobuf,\n        )\n        .unwrap();\n\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n\n        let scope_spans = vec![ScopeSpans {\n            spans: vec![\n                Span {\n                    trace_id: vec![1; 16],\n                    span_id: vec![2; 8],\n                    start_time_unix_nano: 1_000_000_001,\n                    end_time_unix_nano: 1_000_000_002,\n                    ..Default::default()\n                },\n                Span {\n                    trace_id: vec![3; 16],\n                    span_id: vec![4; 8],\n                    start_time_unix_nano: 2_000_000_001,\n                    
end_time_unix_nano: 2_000_000_002,\n                    ..Default::default()\n                },\n            ],\n            ..Default::default()\n        }];\n        let resource_spans = vec![ResourceSpans {\n            scope_spans,\n            ..Default::default()\n        }];\n        let request = ExportTraceServiceRequest { resource_spans };\n        let mut raw_doc_buffer = Vec::new();\n        request.encode(&mut raw_doc_buffer).unwrap();\n\n        let raw_doc_batch = RawDocBatch::for_test(&[&raw_doc_buffer], 0..2);\n        doc_processor_mailbox\n            .send_message(raw_doc_batch)\n            .await\n            .unwrap();\n\n        universe\n            .send_exit_with_success(&doc_processor_mailbox)\n            .await\n            .unwrap();\n\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.valid.get_num_docs(), 2);\n\n        let batch = indexer_inbox.drain_for_test_typed::<ProcessedDocBatch>();\n        assert_eq!(batch.len(), 1);\n        assert_eq!(batch[0].docs.len(), 2);\n\n        let (exit_status, _) = doc_processor_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        universe.assert_quit().await;\n    }\n}\n\n#[cfg(feature = \"vrl\")]\n#[cfg(test)]\nmod tests_vrl {\n    use quickwit_actors::Universe;\n    use quickwit_doc_mapper::default_doc_mapper_for_test;\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n    use tantivy::Document;\n    use tantivy::schema::NamedFieldDocument;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_doc_processor_simple_vrl() -> anyhow::Result<()> {\n        let index_id = \"my-index\";\n        let source_id = \"my-source\";\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_mapper = 
Arc::new(default_doc_mapper_for_test());\n        let transform_config = TransformConfig::for_test(\".body = upcase(string!(.body))\");\n        let doc_processor = DocProcessor::try_new(\n            index_id.to_string(),\n            source_id.to_string(),\n            doc_mapper.clone(),\n            indexer_mailbox,\n            Some(transform_config),\n            SourceInputFormat::Json,\n        )\n        .unwrap();\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n        doc_processor_mailbox\n            .send_message(RawDocBatch::for_test(\n                &[\n                    br#\"{\"body\": \"happy\", \"response_date\": \"2021-12-19T16:39:57+00:00\", \"response_time\": 12, \"response_payload\": \"YWJj\"}\"#, // missing timestamp\n                    br#\"{\"body\": \"happy using VRL\", \"timestamp\": 1628837062, \"response_date\": \"2021-12-19T16:39:59+00:00\", \"response_time\": 2, \"response_payload\": \"YWJj\"}\"#, // ok\n                    br#\"{\"body\": \"happy2\", \"timestamp\": 1628837062, \"response_date\": \"2021-12-19T16:40:57+00:00\", \"response_time\": 13, \"response_payload\": \"YWJj\"}\"#, // ok\n                    b\"{\", // invalid json\n                ],\n                0..4,\n            ))\n            .await?;\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.index_id, index_id.to_string());\n        assert_eq!(counters.source_id, source_id.to_string());\n        assert_eq!(counters.doc_mapper_errors.get_num_docs(), 1);\n        assert_eq!(counters.json_parse_errors.get_num_docs(), 1);\n        assert_eq!(counters.transform_errors.get_num_docs(), 0);\n        assert_eq!(counters.otlp_parse_errors.get_num_docs(), 0);\n        assert_eq!(counters.valid.get_num_docs(), 2);\n        assert_eq!(counters.num_bytes_total.load(Ordering::Relaxed), 
397);\n\n        let output_messages = indexer_inbox.drain_for_test();\n        assert_eq!(output_messages.len(), 1);\n        let batch = *(output_messages\n            .into_iter()\n            .next()\n            .unwrap()\n            .downcast::<ProcessedDocBatch>()\n            .unwrap());\n        assert_eq!(batch.docs.len(), 2);\n        assert_eq!(\n            batch.checkpoint_delta,\n            SourceCheckpointDelta::from_range(0..4)\n        );\n\n        let schema = doc_mapper.schema();\n        let NamedFieldDocument(named_field_doc_map) = batch.docs[0].doc.to_named_doc(&schema);\n        let doc_json = JsonValue::Object(doc_mapper.doc_to_json(named_field_doc_map)?);\n        assert_eq!(\n            doc_json,\n            serde_json::json!({\n                \"_source\": {\n                    \"body\": \"HAPPY USING VRL\",\n                    \"response_date\": \"2021-12-19T16:39:59Z\",\n                    \"response_payload\": \"YWJj\",\n                    \"response_time\": 2,\n                    \"timestamp\": 1628837062\n                },\n                \"body\": \"HAPPY USING VRL\",\n                \"response_date\": \"2021-12-19T16:39:59Z\",\n                 \"response_payload\": \"YWJj\",\n                 \"response_time\": 2.0,\n                 \"timestamp\": 1628837062\n            })\n        );\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_doc_processor_with_plain_text_input() {\n        let index_id = \"my-index\";\n        let source_id = \"my-source\";\n        let universe = Universe::with_accelerated_time();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let vrl_script = r#\"\n            values = parse_csv!(.plain_text)\n            .body = upcase(string!(values[0]))\n            .timestamp = to_int!(values[1])\n            .response_date = values[2]\n     
       .response_time = to_int!(values[3])\n            .response_payload = values[4]\n            del(.plain_text)\n        \"#;\n\n        let transform_config = TransformConfig::for_test(vrl_script);\n        let doc_processor = DocProcessor::try_new(\n            index_id.to_string(),\n            source_id.to_string(),\n            doc_mapper.clone(),\n            indexer_mailbox,\n            Some(transform_config),\n            SourceInputFormat::PlainText,\n        )\n        .unwrap();\n        let (doc_processor_mailbox, doc_processor_handle) =\n            universe.spawn_builder().spawn(doc_processor);\n        doc_processor_mailbox\n            .send_message(RawDocBatch::for_test(\n                &[\n                    // body,timestamp,response_date,response_time,response_payload\n                    br#\"\"happy using VRL\",1628837062,\"2021-12-19T16:39:59+00:00\",2,\"YWJj\"\"#,\n                    br#\"\"happy2\",1628837062,\"2021-12-19T16:40:57+00:00\",13,\"YWJj\"\"#,\n                    br#\"\"happy2\",1628837062,\"2021-12-19T16:40:57+00:00\",\"invalid-response_time\",\"YWJj\"\"#,\n                ],\n                0..4,\n            ))\n            .await.unwrap();\n        let counters = doc_processor_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(counters.index_id, index_id);\n        assert_eq!(counters.source_id, source_id);\n        assert_eq!(counters.doc_mapper_errors.get_num_docs(), 0,);\n        assert_eq!(counters.transform_errors.get_num_docs(), 1,);\n        assert_eq!(counters.otlp_parse_errors.get_num_docs(), 0,);\n        assert_eq!(counters.valid.get_num_docs(), 2,);\n        assert_eq!(counters.num_bytes_total.load(Ordering::Relaxed), 200,);\n\n        let output_messages = indexer_inbox.drain_for_test();\n        assert_eq!(output_messages.len(), 1);\n        let batch = *(output_messages\n            .into_iter()\n            .next()\n            .unwrap()\n  
          .downcast::<ProcessedDocBatch>()\n            .unwrap());\n        assert_eq!(batch.docs.len(), 2);\n        assert_eq!(\n            batch.checkpoint_delta,\n            SourceCheckpointDelta::from_range(0..4)\n        );\n\n        let schema = doc_mapper.schema();\n        let NamedFieldDocument(named_field_doc_map) = batch.docs[0].doc.to_named_doc(&schema);\n        let doc_json = JsonValue::Object(doc_mapper.doc_to_json(named_field_doc_map).unwrap());\n        assert_eq!(\n            doc_json,\n            serde_json::json!({\n                \"_source\": {\n                    \"body\": \"HAPPY USING VRL\",\n                    \"response_date\": \"2021-12-19T16:39:59Z\",\n                    \"response_payload\": \"YWJj\",\n                    \"response_time\": 2,\n                    \"timestamp\": 1628837062\n                },\n                \"body\": \"HAPPY USING VRL\",\n                \"response_date\": \"2021-12-19T16:39:59Z\",\n                \"response_payload\": \"YWJj\",\n                \"response_time\": 2.0,\n                \"timestamp\": 1628837062\n            })\n        );\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/index_serializer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::io::IoControls;\nuse quickwit_common::runtimes::RuntimeType;\nuse tokio::runtime::Handle;\nuse tracing::instrument;\n\nuse crate::actors::Packager;\nuse crate::models::{EmptySplit, IndexedSplit, IndexedSplitBatch, IndexedSplitBatchBuilder};\n\n/// The index serializer takes a non-serialized split,\n/// and serializes it before passing it to the packager.\n///\n/// This is usually a CPU heavy operation.\n///\n/// Depending on the data\n/// (terms cardinality) and the index settings (sorted or not)\n/// it can range from medium IO to IO heavy.\npub struct IndexSerializer {\n    packager_mailbox: Mailbox<Packager>,\n}\n\nimpl IndexSerializer {\n    pub fn new(packager_mailbox: Mailbox<Packager>) -> Self {\n        Self { packager_mailbox }\n    }\n}\n\n#[async_trait]\nimpl Actor for IndexSerializer {\n    type ObservableState = ();\n\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(0)\n    }\n\n    fn runtime_handle(&self) -> Handle {\n        RuntimeType::Blocking.get_runtime_handle()\n    }\n}\n\n#[async_trait]\nimpl Handler<IndexedSplitBatchBuilder> for IndexSerializer {\n    type Reply = ();\n\n    
#[instrument(\n        name=\"serialize_split_batch\"\n        parent=batch_builder.batch_parent_span.id(),\n        skip_all,\n    )]\n    async fn handle(\n        &mut self,\n        batch_builder: IndexedSplitBatchBuilder,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let mut splits: Vec<IndexedSplit> = Vec::with_capacity(batch_builder.splits.len());\n        for split_builder in batch_builder.splits {\n            // TODO Consider & test removing this protect guard.\n            //\n            // In theory the controlled directory should be sufficient.\n            let _protect_guard = ctx.protect_zone();\n            if let Some(controlled_directory) = &split_builder.controlled_directory_opt {\n                let io_controls = IoControls::default()\n                    .set_progress(ctx.progress().clone())\n                    .set_kill_switch(ctx.kill_switch().clone())\n                    .set_component(\"index_serializer\");\n                controlled_directory.set_io_controls(io_controls);\n            }\n            let split = split_builder.finalize()?;\n            splits.push(split);\n        }\n        let indexed_split_batch = IndexedSplitBatch {\n            splits,\n            checkpoint_delta_opt: batch_builder.checkpoint_delta_opt,\n            publish_lock: batch_builder.publish_lock,\n            publish_token_opt: batch_builder.publish_token_opt,\n            merge_task_opt: None,\n            batch_parent_span: batch_builder.batch_parent_span,\n        };\n        ctx.send_message(&self.packager_mailbox, indexed_split_batch)\n            .await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<EmptySplit> for IndexSerializer {\n    type Reply = ();\n\n    #[instrument(\n        name=\"serialize_empty_split\"\n        parent=empty_split.batch_parent_span.id(),\n        skip_all,\n    )]\n    async fn handle(\n        &mut self,\n        empty_split: EmptySplit,\n        ctx: 
&ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        ctx.send_message(&self.packager_mailbox, empty_split)\n            .await?;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/indexer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::hash_map::Entry;\nuse std::num::NonZeroU32;\nuse std::ops::RangeInclusive;\nuse std::sync::Arc;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse bytesize::ByteSize;\nuse fail::fail_point;\nuse fnv::FnvHashMap;\nuse itertools::Itertools;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, Command, Handler, Mailbox, QueueCapacity,\n};\nuse quickwit_common::io::IoControls;\nuse quickwit_common::metrics::GaugeGuard;\nuse quickwit_common::runtimes::RuntimeType;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_config::IndexingSettings;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_metastore::checkpoint::{IndexCheckpointDelta, SourceCheckpointDelta};\nuse quickwit_proto::indexing::{IndexingPipelineId, PipelineMetrics};\nuse quickwit_proto::metastore::{\n    LastDeleteOpstampRequest, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::{DocMappingUid, PublishToken};\nuse quickwit_query::get_quickwit_fastfield_normalizer_manager;\nuse serde::Serialize;\nuse tantivy::schema::Schema;\nuse tantivy::store::{Compressor, ZstdCompressor};\nuse tantivy::tokenizer::TokenizerManager;\nuse tantivy::{DateTime, IndexBuilder, IndexSettings};\nuse tokio::runtime::Handle;\nuse tokio::sync::Semaphore;\nuse tracing::{Span, info, info_span, warn};\nuse ulid::Ulid;\n\nuse 
crate::actors::IndexSerializer;\nuse crate::actors::cooperative_indexing::{CooperativeIndexingCycle, CooperativeIndexingPeriod};\nuse crate::models::{\n    CommitTrigger, EmptySplit, IndexedSplitBatchBuilder, IndexedSplitBuilder, NewPublishLock,\n    NewPublishToken, ProcessedDoc, ProcessedDocBatch, PublishLock,\n};\n\n// Random partition ID used to gather partitions exceeding the maximum number of partitions.\nconst OTHER_PARTITION_ID: u64 = 3264326757911759461u64;\n\n#[derive(Debug)]\nstruct CommitTimeout {\n    workbench_id: Ulid,\n}\n\n#[derive(Clone, Debug, Default, Eq, PartialEq, Serialize)]\npub struct IndexerCounters {\n    /// Number of splits that were emitted by the indexer.\n    pub num_splits_emitted: u64,\n\n    /// Number of split batches that were emitted by the indexer.\n    pub num_split_batches_emitted: u64,\n\n    /// Number of (valid) documents in the current workbench.\n    /// This value is used to trigger commit and for observation.\n    pub num_docs_in_workbench: u64,\n\n    /// Number of `ProcessedDocBatch` messages received by the indexer to\n    /// build this split.\n    pub num_doc_batches_in_workbench: u64,\n\n    /// Metrics describing the load and indexing performance of the\n    /// pipeline. 
This is only updated for cooperative indexers.\n    pub pipeline_metrics_opt: Option<PipelineMetrics>,\n}\n\nstruct IndexerState {\n    pipeline_id: IndexingPipelineId,\n    metastore: MetastoreServiceClient,\n    indexing_directory: TempDirectory,\n    indexing_settings: IndexingSettings,\n    publish_lock: PublishLock,\n    publish_token_opt: Option<PublishToken>,\n    schema: Schema,\n    doc_mapping_uid: DocMappingUid,\n    tokenizer_manager: TokenizerManager,\n    max_num_partitions: NonZeroU32,\n    index_settings: IndexSettings,\n    cooperative_indexing_opt: Option<CooperativeIndexingCycle>,\n}\n\nimpl IndexerState {\n    fn create_indexed_split_builder(\n        &self,\n        partition_id: u64,\n        last_delete_opstamp: u64,\n        ctx: &ActorContext<Indexer>,\n    ) -> anyhow::Result<IndexedSplitBuilder> {\n        let index_builder = IndexBuilder::new()\n            .settings(self.index_settings.clone())\n            .schema(self.schema.clone())\n            .tokenizers(self.tokenizer_manager.clone())\n            .fast_field_tokenizers(\n                get_quickwit_fastfield_normalizer_manager()\n                    .tantivy_manager()\n                    .clone(),\n            );\n\n        let io_controls = IoControls::default()\n            .set_progress(ctx.progress().clone())\n            .set_kill_switch(ctx.kill_switch().clone())\n            .set_component(\"indexer\");\n\n        let indexed_split = IndexedSplitBuilder::new_in_dir(\n            self.pipeline_id.clone(),\n            partition_id,\n            last_delete_opstamp,\n            self.doc_mapping_uid,\n            self.indexing_directory.clone(),\n            index_builder,\n            io_controls,\n        )?;\n        info!(\n            split_id=%indexed_split.split_id(),\n            partition_id=%partition_id,\n            \"new-split\"\n        );\n        Ok(indexed_split)\n    }\n\n    fn get_or_create_indexed_split<'a>(\n        &self,\n        partition_id: 
u64,\n        last_delete_opstamp: u64,\n        splits: &'a mut FnvHashMap<u64, IndexedSplitBuilder>,\n        other_split_opt: &'a mut Option<IndexedSplitBuilder>,\n        counter: &'a mut IndexerCounters,\n        ctx: &ActorContext<Indexer>,\n    ) -> anyhow::Result<(&'a mut IndexedSplitBuilder, bool)> {\n        let num_splits = splits.len();\n        match splits.entry(partition_id) {\n            Entry::Occupied(indexed_split) => Ok((indexed_split.into_mut(), false)),\n            Entry::Vacant(vacant_entry) => {\n                if num_splits as u32 >= self.max_num_partitions.get() {\n                    // In order to avoid exceeding max_num_partitions, we map the document to the\n                    // `OTHER` special partition.\n                    if other_split_opt.is_none() {\n                        warn!(\n                            num_docs_in_workbench = counter.num_docs_in_workbench,\n                            max_num_partition = self.max_num_partitions.get(),\n                            \"Exceeding max_num_partition\"\n                        );\n                        let new_other_split = self.create_indexed_split_builder(\n                            OTHER_PARTITION_ID,\n                            last_delete_opstamp,\n                            ctx,\n                        )?;\n                        *other_split_opt = Some(new_other_split);\n                    }\n                    Ok((other_split_opt.as_mut().unwrap(), true))\n                } else {\n                    let indexed_split =\n                        self.create_indexed_split_builder(partition_id, last_delete_opstamp, ctx)?;\n                    Ok((vacant_entry.insert(indexed_split), true))\n                }\n            }\n        }\n    }\n\n    async fn create_workbench(\n        &self,\n        ctx: &ActorContext<Indexer>,\n    ) -> anyhow::Result<IndexingWorkbench> {\n        let workbench_id = Ulid::new();\n        let batch_parent_span = 
info_span!(target: \"quickwit-indexing\", \"index-doc-batches\",\n            index_id=%self.pipeline_id.index_uid.index_id,\n            source_id=%self.pipeline_id.source_id,\n            pipeline_uid=%self.pipeline_id.pipeline_uid,\n            workbench_id=%workbench_id,\n        );\n        let indexing_span = info_span!(parent: batch_parent_span.id(), \"indexer\");\n        let cooperative_indexing_period =\n            if let Some(cooperative_indexing) = &self.cooperative_indexing_opt {\n                Some(\n                    ctx.protect_future(cooperative_indexing.cooperative_indexing_period())\n                        .await,\n                )\n            } else {\n                None\n            };\n\n        let last_delete_opstamp_request = LastDeleteOpstampRequest {\n            index_uid: Some(self.pipeline_id.index_uid.clone()),\n        };\n        let last_delete_opstamp_response = ctx\n            .protect_future(\n                self.metastore\n                    .clone()\n                    .last_delete_opstamp(last_delete_opstamp_request),\n            )\n            .await?;\n        let last_delete_opstamp = last_delete_opstamp_response.last_delete_opstamp;\n\n        let checkpoint_delta = IndexCheckpointDelta {\n            source_id: self.pipeline_id.source_id.clone(),\n            source_delta: SourceCheckpointDelta::default(),\n        };\n        let publish_lock = self.publish_lock.clone();\n        let publish_token_opt = self.publish_token_opt.clone();\n\n        let mut split_builders_guard =\n            GaugeGuard::from_gauge(&crate::metrics::INDEXER_METRICS.split_builders);\n        split_builders_guard.add(1);\n\n        let workbench = IndexingWorkbench {\n            workbench_id,\n            batch_parent_span,\n            _indexing_span: indexing_span,\n            indexed_splits: FnvHashMap::with_capacity_and_hasher(250, Default::default()),\n            other_indexed_split_opt: None,\n            
checkpoint_delta,\n            publish_lock,\n            publish_token_opt,\n            last_delete_opstamp,\n            memory_usage: GaugeGuard::from_gauge(\n                &quickwit_common::metrics::MEMORY_METRICS\n                    .in_flight\n                    .index_writer,\n            ),\n            cooperative_indexing_period,\n            split_builders_guard,\n        };\n        Ok(workbench)\n    }\n\n    /// Returns the current_indexed_split. If this is the first message, then\n    /// the indexed_split does not exist yet.\n    ///\n    /// This function will then create it, and can hence return an Error.\n    async fn get_or_create_workbench<'a>(\n        &'a self,\n        indexing_workbench_opt: &'a mut Option<IndexingWorkbench>,\n        ctx: &'a ActorContext<Indexer>,\n    ) -> anyhow::Result<&'a mut IndexingWorkbench> {\n        if indexing_workbench_opt.is_none() {\n            let indexing_workbench = self.create_workbench(ctx).await?;\n            let commit_timeout_message = CommitTimeout {\n                workbench_id: indexing_workbench.workbench_id,\n            };\n            ctx.schedule_self_msg(\n                self.indexing_settings.commit_timeout(),\n                commit_timeout_message,\n            );\n            *indexing_workbench_opt = Some(indexing_workbench);\n        }\n        let current_indexing_workbench = indexing_workbench_opt.as_mut().context(\n            \"no index writer available. this should never happen! 
please, report on https://github.com/quickwit-oss/quickwit/issues\"\n        )?;\n        Ok(current_indexing_workbench)\n    }\n\n    async fn index_batch(\n        &self,\n        batch: ProcessedDocBatch,\n        indexing_workbench_opt: &mut Option<IndexingWorkbench>,\n        counters: &mut IndexerCounters,\n        ctx: &ActorContext<Indexer>,\n    ) -> Result<(), ActorExitStatus> {\n        let IndexingWorkbench {\n            checkpoint_delta,\n            indexed_splits,\n            other_indexed_split_opt,\n            publish_lock,\n            last_delete_opstamp,\n            memory_usage,\n            ..\n        } = self\n            .get_or_create_workbench(indexing_workbench_opt, ctx)\n            .await?;\n        if publish_lock.is_dead() {\n            // Release indexing permit early.\n            indexing_workbench_opt.take();\n            return Ok(());\n        }\n        checkpoint_delta\n            .source_delta\n            .extend(batch.checkpoint_delta)\n            .context(\"batch delta does not follow indexer checkpoint\")?;\n        let mut memory_usage_delta: i64 = 0;\n        counters.num_doc_batches_in_workbench += 1;\n        for doc in batch.docs {\n            let ProcessedDoc {\n                doc,\n                timestamp_opt,\n                partition,\n                num_bytes,\n            } = doc;\n            counters.num_docs_in_workbench += 1;\n            let (indexed_split, split_created) = self.get_or_create_indexed_split(\n                partition,\n                *last_delete_opstamp,\n                indexed_splits,\n                other_indexed_split_opt,\n                counters,\n                ctx,\n            )?;\n            let mem_usage_before = indexed_split.index_writer.mem_usage() as u64;\n            if split_created {\n                // The split was just created. 
We need to account for the initial index writer's\n                // memory usage.\n                memory_usage_delta += mem_usage_before as i64;\n            }\n            indexed_split.split_attrs.uncompressed_docs_size_in_bytes += num_bytes as u64;\n            indexed_split.split_attrs.num_docs += 1;\n            if let Some(timestamp) = timestamp_opt {\n                record_timestamp(timestamp, &mut indexed_split.split_attrs.time_range);\n            }\n            let _protect_guard = ctx.protect_zone();\n            indexed_split\n                .index_writer\n                .add_document(doc)\n                .context(\"failed to add document\")?;\n            let mem_usage_after = indexed_split.index_writer.mem_usage() as u64;\n            memory_usage_delta += mem_usage_after as i64 - mem_usage_before as i64;\n            ctx.record_progress();\n        }\n        memory_usage.add(memory_usage_delta);\n        Ok(())\n    }\n}\n\n/// A workbench hosts the set of `IndexedSplit` that are being built.\nstruct IndexingWorkbench {\n    workbench_id: Ulid,\n    // This span is meant to be passed through the pipeline.\n    batch_parent_span: Span,\n    // Span for the in-memory indexing (done in the Indexer actor).\n    _indexing_span: Span,\n\n    indexed_splits: FnvHashMap<u64, IndexedSplitBuilder>,\n    other_indexed_split_opt: Option<IndexedSplitBuilder>,\n\n    checkpoint_delta: IndexCheckpointDelta,\n    publish_lock: PublishLock,\n    publish_token_opt: Option<PublishToken>,\n    // On workbench creation, we fetch from the metastore the last delete task opstamp.\n    // We use this value to set the `delete_opstamp` of the workbench splits.\n    last_delete_opstamp: u64,\n    // Number of bytes declared as used by tantivy.\n    memory_usage: GaugeGuard<'static>,\n    split_builders_guard: GaugeGuard<'static>,\n    cooperative_indexing_period: Option<CooperativeIndexingPeriod>,\n}\n\npub struct Indexer {\n    indexer_state: IndexerState,\n    
index_serializer_mailbox: Mailbox<IndexSerializer>,\n    indexing_workbench_opt: Option<IndexingWorkbench>,\n    counters: IndexerCounters,\n}\n\n#[async_trait]\nimpl Actor for Indexer {\n    type ObservableState = IndexerCounters;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(10)\n    }\n\n    fn name(&self) -> String {\n        \"Indexer\".to_string()\n    }\n\n    fn runtime_handle(&self) -> Handle {\n        RuntimeType::Blocking.get_runtime_handle()\n    }\n\n    #[inline]\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        if let Some(cooperative_indexing_cycle) = &self.indexer_state.cooperative_indexing_opt {\n            let initial_sleep_duration = cooperative_indexing_cycle.initial_sleep_duration();\n            ctx.pause();\n            ctx.schedule_self_msg(initial_sleep_duration, Command::Resume);\n        }\n        Ok(())\n    }\n\n    async fn on_drained_messages(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let Some(indexing_workbench) = &mut self.indexing_workbench_opt else {\n            return Ok(());\n        };\n\n        let Some(cooperative_indexing_period) =\n            indexing_workbench.cooperative_indexing_period.take()\n        else {\n            return Ok(());\n        };\n\n        let uncompressed_num_bytes = indexing_workbench\n            .indexed_splits\n            .values()\n            .map(|split| split.split_attrs.uncompressed_docs_size_in_bytes)\n            .sum::<u64>();\n\n        // This also drops the indexing permit.\n        let (sleep_duration, pipeline_metrics) =\n            cooperative_indexing_period.end_of_work(uncompressed_num_bytes);\n\n        self.counters.pipeline_metrics_opt = 
Some(pipeline_metrics);\n\n        self.send_to_serializer(CommitTrigger::Drained, ctx).await?;\n\n        if !sleep_duration.is_zero() {\n            ctx.pause();\n            ctx.schedule_self_msg(sleep_duration, Command::Resume);\n        }\n\n        Ok(())\n    }\n\n    async fn finalize(\n        &mut self,\n        exit_status: &ActorExitStatus,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        match exit_status {\n            ActorExitStatus::DownstreamClosed\n            | ActorExitStatus::Killed\n            | ActorExitStatus::Failure(_)\n            | ActorExitStatus::Panicked => return Ok(()),\n            ActorExitStatus::Quit | ActorExitStatus::Success => {\n                let _ = self\n                    .send_to_serializer(CommitTrigger::NoMoreDocs, ctx)\n                    .await;\n            }\n        }\n        Ok(())\n    }\n}\n\nfn record_timestamp(timestamp: DateTime, time_range: &mut Option<RangeInclusive<DateTime>>) {\n    let new_timestamp_range = match time_range {\n        Some(range) => timestamp.min(*range.start())..=timestamp.max(*range.end()),\n        None => timestamp..=timestamp,\n    };\n    *time_range = Some(new_timestamp_range);\n}\n\n#[async_trait]\nimpl Handler<CommitTimeout> for Indexer {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        commit_timeout: CommitTimeout,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if let Some(indexing_workbench) = &self.indexing_workbench_opt {\n            // If this is a timeout for a different workbench, we must ignore it.\n            if indexing_workbench.workbench_id != commit_timeout.workbench_id {\n                return Ok(());\n            }\n        }\n        self.send_to_serializer(CommitTrigger::Timeout, ctx).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<ProcessedDocBatch> for Indexer {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        
doc_batch: ProcessedDocBatch,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.index_batch(doc_batch, ctx).await\n    }\n}\n\n#[async_trait]\nimpl Handler<NewPublishLock> for Indexer {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: NewPublishLock,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let NewPublishLock(publish_lock) = message;\n        self.indexing_workbench_opt = None;\n        self.indexer_state.publish_lock = publish_lock;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<NewPublishToken> for Indexer {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: NewPublishToken,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let NewPublishToken(publish_token) = message;\n        self.indexer_state.publish_token_opt = Some(publish_token);\n        Ok(())\n    }\n}\n\nimpl Indexer {\n    pub fn new(\n        pipeline_id: IndexingPipelineId,\n        doc_mapper: Arc<DocMapper>,\n        metastore: MetastoreServiceClient,\n        indexing_directory: TempDirectory,\n        indexing_settings: IndexingSettings,\n        cooperative_indexing_permits_opt: Option<Arc<Semaphore>>,\n        index_serializer_mailbox: Mailbox<IndexSerializer>,\n    ) -> Self {\n        let schema = doc_mapper.schema();\n        let tokenizer_manager = doc_mapper.tokenizer_manager().clone();\n        let docstore_compression = Compressor::Zstd(ZstdCompressor {\n            compression_level: Some(indexing_settings.docstore_compression_level),\n        });\n        let index_settings = IndexSettings {\n            docstore_blocksize: indexing_settings.docstore_blocksize,\n            docstore_compression,\n            docstore_compress_dedicated_thread: true,\n        };\n        let cooperative_indexing_opt: Option<CooperativeIndexingCycle> =\n            
cooperative_indexing_permits_opt.map(|cooperative_indexing_permits| {\n                CooperativeIndexingCycle::new(\n                    &pipeline_id,\n                    indexing_settings.commit_timeout(),\n                    cooperative_indexing_permits,\n                )\n            });\n        Self {\n            indexer_state: IndexerState {\n                pipeline_id,\n                metastore: metastore.clone(),\n                indexing_directory,\n                indexing_settings,\n                publish_lock: PublishLock::default(),\n                publish_token_opt: None,\n                schema,\n                doc_mapping_uid: doc_mapper.doc_mapping_uid(),\n                tokenizer_manager: tokenizer_manager.tantivy_manager().clone(),\n                index_settings,\n                max_num_partitions: doc_mapper.max_num_partitions(),\n                cooperative_indexing_opt,\n            },\n            index_serializer_mailbox,\n            indexing_workbench_opt: None,\n            counters: IndexerCounters::default(),\n        }\n    }\n\n    fn memory_usage(&self) -> ByteSize {\n        if let Some(workbench) = &self.indexing_workbench_opt {\n            ByteSize(workbench.memory_usage.get() as u64)\n        } else {\n            ByteSize(0u64)\n        }\n    }\n\n    async fn index_batch(\n        &mut self,\n        batch: ProcessedDocBatch,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        fail_point!(\"indexer:batch:before\");\n        let force_commit = batch.force_commit;\n        self.indexer_state\n            .index_batch(\n                batch,\n                &mut self.indexing_workbench_opt,\n                &mut self.counters,\n                ctx,\n            )\n            .await?;\n        let memory_usage = self.memory_usage();\n        if memory_usage >= self.indexer_state.indexing_settings.resources.heap_size {\n            self.send_to_serializer(CommitTrigger::MemoryLimit, 
ctx)\n                .await?;\n        }\n        if self.counters.num_docs_in_workbench\n            >= self.indexer_state.indexing_settings.split_num_docs_target as u64\n        {\n            self.send_to_serializer(CommitTrigger::NumDocsLimit, ctx)\n                .await?;\n        }\n        if force_commit {\n            self.send_to_serializer(CommitTrigger::ForceCommit, ctx)\n                .await?;\n        }\n        fail_point!(\"indexer:batch:after\");\n        Ok(())\n    }\n\n    /// Extract the indexed split and send it to the IndexSerializer.\n    async fn send_to_serializer(\n        &mut self,\n        commit_trigger: CommitTrigger,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        let Some(IndexingWorkbench {\n            indexed_splits,\n            other_indexed_split_opt,\n            checkpoint_delta,\n            publish_lock,\n            publish_token_opt,\n            batch_parent_span,\n            memory_usage,\n            split_builders_guard,\n            ..\n        }) = self.indexing_workbench_opt.take()\n        else {\n            return Ok(());\n        };\n\n        let mut splits: Vec<IndexedSplitBuilder> = indexed_splits.into_values().collect();\n\n        if let Some(other_split) = other_indexed_split_opt {\n            splits.push(other_split)\n        }\n\n        // Avoid producing empty split, but still update the checkpoint if it is not empty to avoid\n        // reprocessing the same faulty documents.\n        if splits.is_empty() {\n            if !checkpoint_delta.is_empty() {\n                ctx.send_message(\n                    &self.index_serializer_mailbox,\n                    EmptySplit {\n                        index_uid: self.indexer_state.pipeline_id.index_uid.clone(),\n                        checkpoint_delta,\n                        publish_lock,\n                        publish_token_opt,\n                        batch_parent_span,\n                    },\n                
)\n                .await?;\n            }\n            return Ok(());\n        }\n        let num_splits = splits.len() as u64;\n        let split_ids = splits.iter().map(|split| split.split_id()).join(\",\");\n        info!(\n            index=%self.indexer_state.pipeline_id.index_uid,\n            source=self.indexer_state.pipeline_id.source_id.as_str(),\n            pipeline_uid=%self.indexer_state.pipeline_id.pipeline_uid,\n            commit_trigger=?commit_trigger,\n            num_batches=%self.counters.num_doc_batches_in_workbench,\n            split_ids=%split_ids,\n            num_docs=self.counters.num_docs_in_workbench, \"send-to-index-serializer\");\n        ctx.send_message(\n            &self.index_serializer_mailbox,\n            IndexedSplitBatchBuilder {\n                splits,\n                checkpoint_delta_opt: Some(checkpoint_delta),\n                publish_lock,\n                publish_token_opt,\n                commit_trigger,\n                batch_parent_span,\n                memory_usage,\n                _split_builders_guard: split_builders_guard,\n            },\n        )\n        .await?;\n        self.counters.num_docs_in_workbench = 0;\n        self.counters.num_doc_batches_in_workbench = 0;\n        self.counters.num_splits_emitted += num_splits;\n        self.counters.num_split_batches_emitted += 1;\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::fmt::Write;\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use quickwit_actors::Universe;\n    use quickwit_doc_mapper::{DocMapper, default_doc_mapper_for_test};\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n    use quickwit_proto::metastore::{\n        EmptyResponse, LastDeleteOpstampResponse, MockMetastoreService,\n    };\n    use quickwit_proto::types::{IndexUid, NodeId, PipelineUid};\n    use tantivy::{DateTime, doc};\n\n    use super::*;\n    use crate::actors::indexer::{IndexerCounters, record_timestamp};\n\n    
#[test]\n    fn test_record_timestamp() {\n        let mut time_range = None;\n        record_timestamp(DateTime::from_timestamp_secs(1628664679), &mut time_range);\n        assert_eq!(\n            time_range,\n            Some(\n                DateTime::from_timestamp_secs(1628664679)\n                    ..=DateTime::from_timestamp_secs(1628664679)\n            )\n        );\n        record_timestamp(DateTime::from_timestamp_secs(1628664112), &mut time_range);\n        assert_eq!(\n            time_range,\n            Some(\n                DateTime::from_timestamp_secs(1628664112)\n                    ..=DateTime::from_timestamp_secs(1628664679)\n            )\n        );\n        record_timestamp(DateTime::from_timestamp_secs(1628665112), &mut time_range);\n        assert_eq!(\n            time_range,\n            Some(\n                DateTime::from_timestamp_secs(1628664112)\n                    ..=DateTime::from_timestamp_secs(1628665112)\n            )\n        )\n    }\n\n    #[tokio::test]\n    async fn test_indexer_triggers_commit_on_target_num_docs() -> anyhow::Result<()> {\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let pipeline_id = IndexingPipelineId {\n            index_uid: index_uid.clone(),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let last_delete_opstamp = 10;\n        let schema = doc_mapper.schema();\n        let body_field = schema.get_field(\"body\").unwrap();\n        let timestamp_field = schema.get_field(\"timestamp\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.split_num_docs_target = 3;\n        let universe = Universe::with_accelerated_time();\n        let 
(index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .times(2)\n            .returning(move |delete_opstamp_request| {\n                assert_eq!(delete_opstamp_request.index_uid(), &index_uid);\n                Ok(LastDeleteOpstampResponse::new(last_delete_opstamp))\n            });\n        mock_metastore.expect_publish_splits().never();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![\n                    ProcessedDoc {\n                        doc: doc!(\n                            body_field=>\"this is a test document\",\n                            timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435)\n                        ),\n                        timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                        partition: 1,\n                        num_bytes: 30,\n                    },\n                    ProcessedDoc {\n                        doc: doc!(\n                            body_field=>\"this is a test document 2\",\n                            timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435)\n                        ),\n                        timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                        partition: 1,\n                        num_bytes: 30,\n                    },\n                ],\n              
  SourceCheckpointDelta::from_range(4..6),\n                false,\n            ))\n            .await?;\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![\n                    ProcessedDoc {\n                        doc: doc!(\n                            body_field=>\"this is a test document 3\",\n                            timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435i64)\n                        ),\n                        timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435i64)),\n                        partition: 1,\n                        num_bytes: 30,\n                    },\n                    ProcessedDoc {\n                        doc: doc!(\n                            body_field=>\"this is a test document 4\",\n                            timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435)\n                        ),\n                        timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                        partition: 1,\n                        num_bytes: 30,\n                    },\n                ],\n                SourceCheckpointDelta::from_range(6..8),\n                false,\n            ))\n            .await?;\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![ProcessedDoc {\n                    doc: doc!(\n                        body_field=>\"this is a test document 5\",\n                        timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435)\n                    ),\n                    timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                    partition: 1,\n                    num_bytes: 30,\n                }],\n                SourceCheckpointDelta::from_range(8..9),\n                false,\n            ))\n            .await?;\n        let indexer_counters = indexer_handle.process_pending_and_observe().await.state;\n        
assert_eq!(\n            indexer_counters,\n            IndexerCounters {\n                num_splits_emitted: 1,\n                num_split_batches_emitted: 1,\n                num_docs_in_workbench: 1, //< the num docs in workbench counter has been reset.\n                num_doc_batches_in_workbench: 1, //< the num doc batches in workbench counter has been reset.\n                pipeline_metrics_opt: None,\n            }\n        );\n        let messages: Vec<IndexedSplitBatchBuilder> = index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(messages.len(), 1);\n        let batch = messages.into_iter().next().unwrap();\n        assert_eq!(batch.commit_trigger, CommitTrigger::NumDocsLimit);\n        assert_eq!(batch.splits[0].split_attrs.num_docs, 4);\n        for split in batch.splits.iter() {\n            assert_eq!(split.split_attrs.delete_opstamp, last_delete_opstamp);\n        }\n        let index_checkpoint = batch.checkpoint_delta_opt.unwrap();\n        assert_eq!(index_checkpoint.source_id, \"test-source\");\n        assert_eq!(\n            index_checkpoint.source_delta,\n            SourceCheckpointDelta::from_range(4..8)\n        );\n        batch.splits.into_iter().next().unwrap().finalize()?;\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexer_triggers_commit_on_memory_limit() -> anyhow::Result<()> {\n        let universe = Universe::new();\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let pipeline_id = IndexingPipelineId {\n            index_uid: index_uid.clone(),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let last_delete_opstamp = 10;\n        let schema = doc_mapper.schema();\n        let body_field = schema.get_field(\"body\").unwrap();\n        let 
indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.resources.heap_size = ByteSize::mb(16);\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .times(1..=2)\n            .returning(move |last_delete_opstamp_request| {\n                assert_eq!(last_delete_opstamp_request.index_uid(), &index_uid);\n                Ok(LastDeleteOpstampResponse::new(last_delete_opstamp))\n            });\n        mock_metastore.expect_publish_splits().never();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, _indexer_handle) = universe.spawn_builder().spawn(indexer);\n\n        let make_doc = |i: u64| {\n            let mut body = String::new();\n            for val in 100 * i..100 * (i + 1) {\n                write!(&mut body, \"{val} \").unwrap();\n            }\n            let num_bytes = body.len() * 2;\n            ProcessedDoc {\n                doc: doc!(body_field=>body),\n                timestamp_opt: None,\n                partition: 0,\n                num_bytes,\n            }\n        };\n        for i in 0..10_000 {\n            indexer_mailbox\n                .send_message(ProcessedDocBatch::new(\n                    vec![make_doc(i)],\n                    SourceCheckpointDelta::from_range(i..i + 1),\n                    false,\n                ))\n                .await?;\n            let output_messages: Vec<IndexedSplitBatchBuilder> =\n                
index_serializer_inbox.drain_for_test_typed();\n            if !output_messages.is_empty() {\n                assert_eq!(output_messages.len(), 1);\n                assert_eq!(\n                    output_messages[0].commit_trigger,\n                    CommitTrigger::MemoryLimit\n                );\n                // The following assert is not a strict one. It should help detect large\n                // regression in memory usage.\n                assert!((500..3_000).contains(&i));\n                break;\n            }\n        }\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexer_triggers_commit_on_timeout() -> anyhow::Result<()> {\n        let universe = Universe::new();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let last_delete_opstamp = 10;\n        let schema = doc_mapper.schema();\n        let body_field = schema.get_field(\"body\").unwrap();\n        let timestamp_field = schema.get_field(\"timestamp\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.commit_timeout_secs = 1;\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        mock_metastore.expect_last_delete_opstamp().returning(\n            move |_last_delete_opstamp_request| {\n                Ok(LastDeleteOpstampResponse::new(last_delete_opstamp))\n            },\n        );\n        mock_metastore.expect_publish_splits().never();\n        
let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        tokio::task::spawn({\n            let indexer_mailbox = indexer_mailbox.clone();\n            async move {\n                let mut position = 0;\n                while indexer_mailbox\n                    .send_message(ProcessedDocBatch::new(\n                        vec![ProcessedDoc {\n                            doc: doc!(\n                                body_field=>\"this is a test document\",\n                                timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435)\n                            ),\n                            timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                            partition: 1,\n                            num_bytes: 30,\n                        }],\n                        SourceCheckpointDelta::from_range(position..position + 1),\n                        false,\n                    ))\n                    .await\n                    .is_ok()\n                {\n                    position += 1;\n                }\n            }\n        });\n        universe.sleep(Duration::from_secs(3)).await;\n\n        let indexer_counters = indexer_handle.process_pending_and_observe().await.state;\n        assert!(indexer_counters.num_splits_emitted > 0);\n        assert!(indexer_counters.num_split_batches_emitted > 0);\n\n        let indexed_serializer_messages: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert!(!indexed_serializer_messages.is_empty());\n        assert_eq!(\n            indexed_serializer_messages[0].commit_trigger,\n            
CommitTrigger::Timeout\n        );\n        assert!(\n            indexed_serializer_messages[0].splits[0]\n                .split_attrs\n                .num_docs\n                > 0\n        );\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexer_triggers_commit_on_drained_mailbox() -> anyhow::Result<()> {\n        let universe = Universe::new();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let last_delete_opstamp = 10;\n        let schema = doc_mapper.schema();\n        let body_field = schema.get_field(\"body\").unwrap();\n        let timestamp_field = schema.get_field(\"timestamp\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let indexing_settings = IndexingSettings::for_test();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        mock_metastore.expect_last_delete_opstamp().returning(\n            move |_last_delete_opstamp_request| {\n                Ok(LastDeleteOpstampResponse::new(last_delete_opstamp))\n            },\n        );\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            Some(Arc::new(Semaphore::new(1))),\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        indexer_mailbox\n            
.send_message(ProcessedDocBatch::new(\n                vec![ProcessedDoc {\n                    doc: doc!(\n                        body_field=>\"this is a test document 5\",\n                        timestamp_field=>DateTime::from_timestamp_secs(1_662_529_435)\n                    ),\n                    timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                    partition: 1,\n                    num_bytes: 30,\n                }],\n                SourceCheckpointDelta::from_range(8..9),\n                false,\n            ))\n            .await\n            .unwrap();\n        let mut indexer_counters: IndexerCounters = Default::default();\n        for _ in 0..100 {\n            // When a lot of unit tests are running concurrently we have a race condition here.\n            // It is very difficult to assess when drain will actually be called.\n            //\n            // Therefore we check that it happens \"eventually\".\n            universe.sleep(Duration::from_secs(1)).await;\n            tokio::task::yield_now().await;\n            indexer_counters = indexer_handle.observe().await.state;\n            indexer_counters.pipeline_metrics_opt = None;\n            // drain was called at least once.\n            if indexer_counters.num_splits_emitted > 0 {\n                break;\n            }\n        }\n\n        assert_eq!(\n            &indexer_counters,\n            &IndexerCounters {\n                num_splits_emitted: 1,\n                num_split_batches_emitted: 1,\n                num_docs_in_workbench: 0,\n                num_doc_batches_in_workbench: 0,\n                pipeline_metrics_opt: None,\n            }\n        );\n        let indexed_split_batches: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(indexed_split_batches.len(), 1);\n        assert_eq!(\n            indexed_split_batches[0].commit_trigger,\n            CommitTrigger::Drained\n        
);\n        assert_eq!(indexed_split_batches[0].splits[0].split_attrs.num_docs, 1);\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexer_triggers_commit_on_quit() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let schema = doc_mapper.schema();\n        let body_field = schema.get_field(\"body\").unwrap();\n        let timestamp_field = schema.get_field(\"timestamp\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let indexing_settings = IndexingSettings::for_test();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .once()\n            .returning(move |_last_delete_opstamp_request| Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore.expect_publish_splits().never();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![ProcessedDoc {\n                    doc: doc!(\n                        body_field=>\"this is a 
test document 5\",\n                        timestamp_field=> DateTime::from_timestamp_secs(1_662_529_435)\n                    ),\n                    timestamp_opt: Some(DateTime::from_timestamp_secs(1_662_529_435)),\n                    partition: 1,\n                    num_bytes: 30,\n                }],\n                SourceCheckpointDelta::from_range(8..9),\n                false,\n            ))\n            .await\n            .unwrap();\n        universe.send_exit_with_success(&indexer_mailbox).await?;\n        let (exit_status, indexer_counters) = indexer_handle.join().await;\n        assert!(exit_status.is_success());\n        assert_eq!(\n            indexer_counters,\n            IndexerCounters {\n                num_splits_emitted: 1,\n                num_split_batches_emitted: 1,\n                num_docs_in_workbench: 0,\n                num_doc_batches_in_workbench: 0,\n                pipeline_metrics_opt: None,\n            }\n        );\n        let output_messages: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(output_messages.len(), 1);\n        assert_eq!(output_messages[0].commit_trigger, CommitTrigger::NoMoreDocs);\n        assert_eq!(output_messages[0].splits[0].split_attrs.num_docs, 1);\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    const DOCMAPPER_WITH_PARTITION_JSON: &str = r#\"{\n        \"tag_fields\": [\"tenant\"],\n        \"partition_key\": \"tenant\",\n        \"field_mappings\": [\n            { \"name\": \"tenant\", \"type\": \"text\", \"tokenizer\": \"raw\", \"indexed\": true },\n            { \"name\": \"body\", \"type\": \"text\" }\n        ]\n    }\"#;\n\n    #[tokio::test]\n    async fn test_indexer_partitioning() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            
source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper: Arc<DocMapper> =\n            Arc::new(serde_json::from_str::<DocMapper>(DOCMAPPER_WITH_PARTITION_JSON).unwrap());\n        let schema = doc_mapper.schema();\n        let tenant_field = schema.get_field(\"tenant\").unwrap();\n        let body_field = schema.get_field(\"body\").unwrap();\n\n        let indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.resources.heap_size = ByteSize::mb(100);\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .once()\n            .returning(move |_last_delete_opstamp_request| Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore.expect_publish_splits().never();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![\n                    ProcessedDoc {\n                        doc: doc!(\n                            body_field=>\"doc 2\",\n                            tenant_field=>\"tenant_1\",\n                        ),\n                        timestamp_opt: None,\n                        partition: 1,\n                        num_bytes: 30,\n                    },\n                    ProcessedDoc 
{\n                        doc: doc!(\n                            body_field=>\"doc 2\",\n                            tenant_field=>\"tenant_2\",\n                        ),\n                        timestamp_opt: None,\n                        partition: 3,\n                        num_bytes: 30,\n                    },\n                ],\n                SourceCheckpointDelta::from_range(8..9),\n                false,\n            ))\n            .await?;\n\n        let indexer_counters = indexer_handle.process_pending_and_observe().await.state;\n        assert_eq!(\n            indexer_counters,\n            IndexerCounters {\n                num_docs_in_workbench: 2,\n                num_doc_batches_in_workbench: 1,\n                num_splits_emitted: 0,\n                num_split_batches_emitted: 0,\n                pipeline_metrics_opt: None,\n            }\n        );\n        universe.send_exit_with_success(&indexer_mailbox).await?;\n        let (exit_status, indexer_counters) = indexer_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        assert_eq!(\n            indexer_counters,\n            IndexerCounters {\n                num_docs_in_workbench: 0,\n                num_doc_batches_in_workbench: 0,\n                num_splits_emitted: 2,\n                num_split_batches_emitted: 1,\n                pipeline_metrics_opt: None,\n            }\n        );\n        let split_batches: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(split_batches.len(), 1);\n        assert_eq!(split_batches[0].splits.len(), 2);\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    const DOCMAPPER_SIMPLE_JSON: &str = r#\"{\n        \"field_mappings\": [{\"name\": \"body\", \"type\": \"text\"}],\n        \"max_num_partitions\": 10\n    }\"#;\n\n    #[tokio::test]\n    async fn test_indexer_exceeding_max_num_partitions() {\n        let universe = 
Universe::with_accelerated_time();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper: Arc<DocMapper> =\n            Arc::new(serde_json::from_str::<DocMapper>(DOCMAPPER_SIMPLE_JSON).unwrap());\n        let body_field = doc_mapper.schema().get_field(\"body\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.resources.heap_size = ByteSize::gb(5);\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .times(1)\n            .returning(move |_last_delete_opstamp_request| Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore.expect_publish_splits().never();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n\n        for partition in 0..100 {\n            indexer_mailbox\n                .send_message(ProcessedDocBatch::new(\n                    vec![ProcessedDoc {\n                        doc: doc!(body_field=>\"doc {i}\"),\n                        timestamp_opt: None,\n                        partition,\n                        num_bytes: 30,\n                    }],\n                    SourceCheckpointDelta::from_range(partition..partition + 1),\n                    false,\n         
       ))\n                .await\n                .unwrap();\n        }\n        universe\n            .send_exit_with_success(&indexer_mailbox)\n            .await\n            .unwrap();\n\n        let (exit_status, _indexer_counters) = indexer_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n\n        let index_serializer_msgs: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(index_serializer_msgs.len(), 1);\n        let msg = index_serializer_msgs.into_iter().next().unwrap();\n        assert_eq!(msg.splits.len(), 11);\n        for split in msg.splits {\n            if split.split_attrs.partition_id == OTHER_PARTITION_ID {\n                assert_eq!(split.split_attrs.num_docs, 90);\n            } else {\n                assert_eq!(split.split_attrs.num_docs, 1);\n            }\n        }\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexer_propagates_publish_lock() {\n        let universe = Universe::with_accelerated_time();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper: Arc<DocMapper> =\n            Arc::new(serde_json::from_str::<DocMapper>(DOCMAPPER_SIMPLE_JSON).unwrap());\n        let body_field = doc_mapper.schema().get_field(\"body\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.split_num_docs_target = 1;\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .times(2)\n            .returning(move |_last_delete_opstamp_request| 
Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore.expect_publish_splits().never();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n\n        let first_lock = PublishLock::default();\n        let second_lock = PublishLock::default();\n\n        for lock in [&first_lock, &second_lock] {\n            indexer_mailbox\n                .send_message(NewPublishLock(lock.clone()))\n                .await\n                .unwrap();\n            indexer_mailbox\n                .send_message(ProcessedDocBatch::new(\n                    vec![ProcessedDoc {\n                        doc: doc!(body_field=>\"doc 1\"),\n                        timestamp_opt: None,\n                        partition: 0,\n                        num_bytes: 30,\n                    }],\n                    SourceCheckpointDelta::from_range(0..1),\n                    false,\n                ))\n                .await\n                .unwrap();\n        }\n        universe\n            .send_exit_with_success(&indexer_mailbox)\n            .await\n            .unwrap();\n        let (exit_status, _indexer_counters) = indexer_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n\n        let index_serializer_messages: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(index_serializer_messages.len(), 2);\n        assert_eq!(index_serializer_messages[0].splits.len(), 1);\n        assert_eq!(index_serializer_messages[0].publish_lock, first_lock);\n        
assert_eq!(index_serializer_messages[1].splits.len(), 1);\n        assert_eq!(index_serializer_messages[1].publish_lock, second_lock);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexer_ignores_messages_when_publish_lock_is_dead() {\n        let universe = Universe::with_accelerated_time();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper: Arc<DocMapper> =\n            Arc::new(serde_json::from_str::<DocMapper>(DOCMAPPER_SIMPLE_JSON).unwrap());\n        let body_field = doc_mapper.schema().get_field(\"body\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let mut indexing_settings = IndexingSettings::for_test();\n        indexing_settings.split_num_docs_target = 1;\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .times(1)\n            .returning(move |_last_delete_opstamp_request| Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore.expect_publish_splits().never();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n\n        let publish_lock = PublishLock::default();\n        indexer_mailbox\n            .send_message(NewPublishLock(publish_lock.clone()))\n            .await\n            .unwrap();\n        
indexer_handle.process_pending_and_observe().await;\n        publish_lock.kill().await;\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![ProcessedDoc {\n                    doc: doc!(body_field=>\"doc 1\"),\n                    timestamp_opt: None,\n                    partition: 0,\n                    num_bytes: 30,\n                }],\n                SourceCheckpointDelta::from_range(0..1),\n                false,\n            ))\n            .await\n            .unwrap();\n        universe\n            .send_exit_with_success(&indexer_mailbox)\n            .await\n            .unwrap();\n        let (exit_status, _indexer_counters) = indexer_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n\n        let index_serializer_messages = index_serializer_inbox.drain_for_test();\n        assert!(index_serializer_messages.is_empty());\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexer_honors_batch_commit_request() {\n        let universe = Universe::with_accelerated_time();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let doc_mapper: Arc<DocMapper> =\n            Arc::new(serde_json::from_str::<DocMapper>(DOCMAPPER_SIMPLE_JSON).unwrap());\n        let body_field = doc_mapper.schema().get_field(\"body\").unwrap();\n        let indexing_directory = TempDirectory::for_test();\n        let indexing_settings = IndexingSettings::for_test();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .times(1)\n            .returning(move |_last_delete_opstamp_request| Ok(LastDeleteOpstampResponse::new(10)));\n       
 mock_metastore.expect_publish_splits().never();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                vec![ProcessedDoc {\n                    doc: doc!(body_field=>\"doc 1\"),\n                    timestamp_opt: None,\n                    partition: 0,\n                    num_bytes: 30,\n                }],\n                SourceCheckpointDelta::from_range(0..1),\n                true,\n            ))\n            .await\n            .unwrap();\n        universe\n            .send_exit_with_success(&indexer_mailbox)\n            .await\n            .unwrap();\n        let (exit_status, _indexer_counters) = indexer_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        let output_messages: Vec<IndexedSplitBatchBuilder> =\n            index_serializer_inbox.drain_for_test_typed();\n\n        assert_eq!(output_messages.len(), 1);\n        assert_eq!(\n            output_messages[0].commit_trigger,\n            CommitTrigger::ForceCommit\n        );\n        assert_eq!(output_messages[0].splits[0].split_attrs.num_docs, 1);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexer_checkpoint_on_all_failed_docs() -> anyhow::Result<()> {\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: 
PipelineUid::default(),\n        };\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let last_delete_opstamp = 10;\n        let indexing_directory = TempDirectory::for_test();\n        let indexing_settings = IndexingSettings::for_test();\n        let commit_timeout = indexing_settings.commit_timeout();\n        let universe = Universe::with_accelerated_time();\n        let (index_serializer_mailbox, index_serializer_inbox) = universe.create_test_mailbox();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_publish_splits()\n            .returning(move |publish_splits_request| {\n                assert!(publish_splits_request.replaced_split_ids.is_empty());\n                Ok(EmptyResponse {})\n            });\n        mock_metastore.expect_last_delete_opstamp().returning(\n            move |_last_delete_opstamp_request| {\n                Ok(LastDeleteOpstampResponse::new(last_delete_opstamp))\n            },\n        );\n        let indexer = Indexer::new(\n            pipeline_id,\n            doc_mapper,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            indexing_directory,\n            indexing_settings,\n            None,\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = universe.spawn_builder().spawn(indexer);\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                Vec::new(),\n                SourceCheckpointDelta::from_range(4..6),\n                false,\n            ))\n            .await?;\n        indexer_mailbox\n            .send_message(ProcessedDocBatch::new(\n                Vec::new(),\n                SourceCheckpointDelta::from_range(6..8),\n                false,\n            ))\n            .await?;\n        universe\n            .sleep(commit_timeout + Duration::from_secs(2))\n            .await;\n        let indexer_counters = 
indexer_handle.process_pending_and_observe().await.state;\n        assert_eq!(\n            indexer_counters,\n            IndexerCounters {\n                num_splits_emitted: 0,\n                num_split_batches_emitted: 0,\n                num_docs_in_workbench: 0, //< the num docs in split counter has been reset.\n                num_doc_batches_in_workbench: 2, //< the two empty batches are still counted.\n                pipeline_metrics_opt: None,\n            }\n        );\n\n        let index_serializer_messages: Vec<EmptySplit> =\n            index_serializer_inbox.drain_for_test_typed();\n        assert_eq!(index_serializer_messages.len(), 1);\n        let update = index_serializer_messages.into_iter().next().unwrap();\n        assert_eq!(update.index_uid.index_id, \"test-index\");\n        assert_eq!(\n            update.checkpoint_delta,\n            IndexCheckpointDelta::for_test(\"test-source\", 4..8)\n        );\n\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/indexing_pipeline.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::path::PathBuf;\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse async_trait::async_trait;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, HEARTBEAT, Handler, Health, Mailbox,\n    QueueCapacity, Supervisable,\n};\nuse quickwit_common::KillSwitch;\nuse quickwit_common::metrics::OwnedGaugeGuard;\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_config::{IndexingSettings, RetentionPolicy, SourceConfig};\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_ingest::IngesterPool;\nuse quickwit_proto::indexing::IndexingPipelineId;\nuse quickwit_proto::metastore::{MetastoreError, MetastoreServiceClient};\nuse quickwit_proto::types::ShardId;\nuse quickwit_storage::{Storage, StorageResolver};\nuse tokio::sync::Semaphore;\nuse tracing::{debug, error, info, instrument};\n\nuse super::MergePlanner;\nuse crate::SplitsUpdateMailbox;\nuse crate::actors::doc_processor::DocProcessor;\nuse crate::actors::index_serializer::IndexSerializer;\nuse crate::actors::publisher::PublisherType;\nuse crate::actors::sequencer::Sequencer;\nuse crate::actors::uploader::UploaderType;\nuse crate::actors::{Indexer, Packager, Publisher, Uploader};\nuse crate::merge_policy::MergePolicy;\nuse crate::models::IndexingStatistics;\nuse 
crate::source::{\n    AssignShards, Assignment, SourceActor, SourceRuntime, quickwit_supported_sources,\n};\nuse crate::split_store::IndexingSplitStore;\n\nconst SUPERVISE_INTERVAL: Duration = Duration::from_secs(1);\n\nconst MAX_RETRY_DELAY: Duration = Duration::from_secs(600); // 10 min.\n\n#[derive(Debug)]\nstruct SuperviseLoop;\n\n/// Calculates the wait time based on retry count.\n// retry_count, wait_time\n// 0   1s\n// 1   2s\n// 2   4s\n// 3   8s\n// ...\n// >=10   10min (capped at `MAX_RETRY_DELAY`)\npub(crate) fn wait_duration_before_retry(retry_count: usize) -> Duration {\n    // Protect against a `retry_count` that will lead to an overflow.\n    let max_power = (retry_count as u32).min(31);\n    Duration::from_secs(2u64.pow(max_power)).min(MAX_RETRY_DELAY)\n}\n\n/// Spawning an indexing pipeline puts a lot of pressure on the file system, metastore, etc. so\n/// we rely on this semaphore to limit the number of indexing pipelines that can be spawned\n/// concurrently.\n/// See also <https://github.com/quickwit-oss/quickwit/issues/1638>.\nstatic SPAWN_PIPELINE_SEMAPHORE: Semaphore = Semaphore::const_new(10);\n\nstruct IndexingPipelineHandles {\n    source_mailbox: Mailbox<SourceActor>,\n    source_handle: ActorHandle<SourceActor>,\n    doc_processor: ActorHandle<DocProcessor>,\n    indexer: ActorHandle<Indexer>,\n    index_serializer: ActorHandle<IndexSerializer>,\n    packager: ActorHandle<Packager>,\n    uploader: ActorHandle<Uploader>,\n    sequencer: ActorHandle<Sequencer<Publisher>>,\n    publisher: ActorHandle<Publisher>,\n    next_check_for_progress: Instant,\n}\n\nimpl IndexingPipelineHandles {\n    fn should_check_for_progress(&mut self) -> bool {\n        let now = Instant::now();\n        let check_for_progress = now > self.next_check_for_progress;\n        if check_for_progress {\n            self.next_check_for_progress = now + *HEARTBEAT;\n        }\n        check_for_progress\n    }\n}\n\n// Messages\n\n#[derive(Clone, Copy, Debug, Default)]\npub struct Spawn {\n    
retry_count: usize,\n}\n\npub struct IndexingPipeline {\n    params: IndexingPipelineParams,\n    previous_generations_statistics: IndexingStatistics,\n    statistics: IndexingStatistics,\n    handles_opt: Option<IndexingPipelineHandles>,\n    // Killswitch used for the actors in the pipeline. This is not the supervisor killswitch.\n    kill_switch: KillSwitch,\n\n    // The set of shard is something that can change dynamically without necessarily\n    // requiring a respawn of the pipeline.\n    // We keep the list of shards here however, to reassign them after a respawn.\n    shard_ids: BTreeSet<ShardId>,\n    _indexing_pipelines_gauge_guard: OwnedGaugeGuard,\n}\n\n#[async_trait]\nimpl Actor for IndexingPipeline {\n    type ObservableState = IndexingStatistics;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.statistics.clone()\n    }\n\n    fn name(&self) -> String {\n        \"IndexingPipeline\".to_string()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.handle(Spawn::default(), ctx).await?;\n        self.handle(SuperviseLoop, ctx).await?;\n        Ok(())\n    }\n\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        // We update the observation to ensure our last \"black box\" observation\n        // is up to date.\n        self.perform_observe(ctx);\n        Ok(())\n    }\n}\n\nimpl IndexingPipeline {\n    pub fn new(params: IndexingPipelineParams) -> Self {\n        let indexing_pipelines_gauge = crate::metrics::INDEXER_METRICS\n            .indexing_pipelines\n            .with_label_values([&params.pipeline_id.index_uid.index_id]);\n        let indexing_pipelines_gauge_guard = OwnedGaugeGuard::from_gauge(indexing_pipelines_gauge);\n        let params_fingerprint = params.params_fingerprint;\n        IndexingPipeline {\n            params,\n            
previous_generations_statistics: Default::default(),\n            handles_opt: None,\n            kill_switch: KillSwitch::default(),\n            statistics: IndexingStatistics {\n                params_fingerprint,\n                ..Default::default()\n            },\n            shard_ids: Default::default(),\n            _indexing_pipelines_gauge_guard: indexing_pipelines_gauge_guard,\n        }\n    }\n\n    fn supervisables(&self) -> Vec<&dyn Supervisable> {\n        if let Some(handles) = &self.handles_opt {\n            let supervisables: Vec<&dyn Supervisable> = vec![\n                &handles.source_handle,\n                &handles.doc_processor,\n                &handles.indexer,\n                &handles.index_serializer,\n                &handles.packager,\n                &handles.uploader,\n                &handles.sequencer,\n                &handles.publisher,\n            ];\n            supervisables\n        } else {\n            Vec::new()\n        }\n    }\n\n    /// Performs healthcheck on all of the actors in the pipeline,\n    /// and consolidates the result.\n    fn healthcheck(&self, check_for_progress: bool) -> Health {\n        let mut healthy_actors: Vec<&str> = Default::default();\n        let mut failure_or_unhealthy_actors: Vec<&str> = Default::default();\n        let mut success_actors: Vec<&str> = Default::default();\n        for supervisable in self.supervisables() {\n            match supervisable.check_health(check_for_progress) {\n                Health::Healthy => {\n                    // At least one other actor is running.\n                    healthy_actors.push(supervisable.name());\n                }\n                Health::FailureOrUnhealthy => {\n                    failure_or_unhealthy_actors.push(supervisable.name());\n                }\n                Health::Success => {\n                    success_actors.push(supervisable.name());\n                }\n            }\n        }\n\n        if 
!failure_or_unhealthy_actors.is_empty() {\n            error!(\n                pipeline_id=?self.params.pipeline_id,\n                generation=self.generation(),\n                healthy_actors=?healthy_actors,\n                failed_or_unhealthy_actors=?failure_or_unhealthy_actors,\n                success_actors=?success_actors,\n                \"Indexing pipeline failure.\"\n            );\n            return Health::FailureOrUnhealthy;\n        }\n        if healthy_actors.is_empty() {\n            // All the actors finished successfully.\n            info!(\n                pipeline_id=?self.params.pipeline_id,\n                generation=self.generation(),\n                \"Indexing pipeline success.\"\n            );\n            return Health::Success;\n        }\n        // No error at this point and there are still some actors running.\n        debug!(\n            pipeline_id=?self.params.pipeline_id,\n            generation=self.generation(),\n            healthy_actors=?healthy_actors,\n            failed_or_unhealthy_actors=?failure_or_unhealthy_actors,\n            success_actors=?success_actors,\n            \"Indexing pipeline running.\"\n        );\n        Health::Healthy\n    }\n\n    fn generation(&self) -> usize {\n        self.statistics.generation\n    }\n\n    fn perform_observe(&mut self, ctx: &ActorContext<Self>) {\n        let Some(handles) = &self.handles_opt else {\n            return;\n        };\n        handles.doc_processor.refresh_observe();\n        handles.indexer.refresh_observe();\n        handles.uploader.refresh_observe();\n        handles.publisher.refresh_observe();\n        self.statistics = self\n            .previous_generations_statistics\n            .clone()\n            .add_actor_counters(\n                &handles.doc_processor.last_observation(),\n                &handles.indexer.last_observation(),\n                &handles.uploader.last_observation(),\n                
&handles.publisher.last_observation(),\n            )\n            .set_generation(self.statistics.generation)\n            .set_num_spawn_attempts(self.statistics.num_spawn_attempts);\n        let pipeline_metrics_opt = handles.indexer.last_observation().pipeline_metrics_opt;\n        self.statistics.pipeline_metrics_opt = pipeline_metrics_opt;\n        self.statistics.params_fingerprint = self.params.params_fingerprint;\n        self.statistics.shard_ids.clone_from(&self.shard_ids);\n        ctx.observe(self);\n    }\n\n    /// Checks if some actors have terminated.\n    async fn perform_health_check(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let Some(handles) = self.handles_opt.as_mut() else {\n            return Ok(());\n        };\n\n        // While we check if the actor has terminated or not, we do not check for progress\n        // at every single loop. Instead, we wait for the `HEARTBEAT` duration to have elapsed,\n        // since our last check.\n        let check_for_progress = handles.should_check_for_progress();\n        let health = self.healthcheck(check_for_progress);\n        match health {\n            Health::Healthy => {}\n            Health::FailureOrUnhealthy => {\n                self.terminate().await;\n                let first_retry_delay = wait_duration_before_retry(0);\n                ctx.schedule_self_msg(first_retry_delay, Spawn { retry_count: 0 });\n            }\n            Health::Success => {\n                return Err(ActorExitStatus::Success);\n            }\n        }\n        Ok(())\n    }\n\n    // TODO this should return an error saying whether we can retry or not.\n    #[instrument(\n        name=\"spawn_pipeline\",\n        level=\"info\",\n        skip_all,\n        fields(\n            index=%self.params.pipeline_id.index_uid.index_id,\n            r#gen=self.generation()\n        ))]\n    async fn spawn_pipeline(&mut self, ctx: &ActorContext<Self>) -> 
anyhow::Result<()> {\n        let _spawn_pipeline_permit = ctx\n            .protect_future(SPAWN_PIPELINE_SEMAPHORE.acquire())\n            .await\n            .expect(\"semaphore should not be closed\");\n\n        self.statistics.num_spawn_attempts += 1;\n        self.kill_switch = ctx.kill_switch().child();\n\n        let index_id = &self.params.pipeline_id.index_uid.index_id;\n        let source_id = &self.params.pipeline_id.source_id;\n\n        info!(\n            index_id,\n            source_id,\n            pipeline_uid=%self.params.pipeline_id.pipeline_uid,\n            root_dir=%self.params.indexing_directory.path().display(),\n            \"spawning indexing pipeline\",\n        );\n        let (source_mailbox, source_inbox) = ctx\n            .spawn_ctx()\n            .create_mailbox::<SourceActor>(\"SourceActor\", QueueCapacity::Unbounded);\n\n        // Publisher\n        let publisher = Publisher::new(\n            PublisherType::MainPublisher,\n            self.params.metastore.clone(),\n            Some(self.params.merge_planner_mailbox.clone()),\n            Some(source_mailbox.clone()),\n        );\n        let (publisher_mailbox, publisher_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"publisher\"]),\n            )\n            .spawn(publisher);\n\n        let sequencer = Sequencer::new(publisher_mailbox);\n        let (sequencer_mailbox, sequencer_handle) = ctx\n            .spawn_actor()\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"sequencer\"]),\n            )\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(sequencer);\n\n        // Uploader\n        
let uploader = Uploader::new(\n            UploaderType::IndexUploader,\n            self.params.metastore.clone(),\n            self.params.merge_policy.clone(),\n            self.params.retention_policy.clone(),\n            self.params.split_store.clone(),\n            SplitsUpdateMailbox::Sequencer(sequencer_mailbox),\n            self.params.max_concurrent_split_uploads_index,\n            self.params.event_broker.clone(),\n        );\n        let (uploader_mailbox, uploader_handle) = ctx\n            .spawn_actor()\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"uploader\"]),\n            )\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(uploader);\n\n        // Packager\n        let tag_fields = self.params.doc_mapper.tag_named_fields()?;\n        let packager = Packager::new(\"Packager\", tag_fields, uploader_mailbox);\n        let (packager_mailbox, packager_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(packager);\n\n        // Index Serializer\n        let index_serializer = IndexSerializer::new(packager_mailbox);\n        let (index_serializer_mailbox, index_serializer_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(index_serializer);\n\n        // Indexer\n        let indexer = Indexer::new(\n            self.params.pipeline_id.clone(),\n            self.params.doc_mapper.clone(),\n            self.params.metastore.clone(),\n            self.params.indexing_directory.clone(),\n            self.params.indexing_settings.clone(),\n            self.params.cooperative_indexing_permits.clone(),\n            index_serializer_mailbox,\n        );\n        let (indexer_mailbox, indexer_handle) = ctx\n            .spawn_actor()\n            
.set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"indexer\"]),\n            )\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(indexer);\n\n        let doc_processor = DocProcessor::try_new(\n            index_id.to_string(),\n            source_id.to_string(),\n            self.params.doc_mapper.clone(),\n            indexer_mailbox,\n            self.params.source_config.transform_config.clone(),\n            self.params.source_config.input_format,\n        )?;\n        let (doc_processor_mailbox, doc_processor_handle) = ctx\n            .spawn_actor()\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"doc_processor\"]),\n            )\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(doc_processor);\n        let source_runtime = SourceRuntime {\n            pipeline_id: self.params.pipeline_id.clone(),\n            source_config: self.params.source_config.clone(),\n            metastore: self.params.metastore.clone(),\n            ingester_pool: self.params.ingester_pool.clone(),\n            queues_dir_path: self.params.queues_dir_path.clone(),\n            storage_resolver: self.params.source_storage_resolver.clone(),\n            event_broker: self.params.event_broker.clone(),\n            indexing_setting: self.params.indexing_settings.clone(),\n        };\n        let source = ctx\n            .protect_future(quickwit_supported_sources().load_source(source_runtime))\n            .await?;\n        let actor_source = SourceActor {\n            source,\n            doc_processor_mailbox,\n        };\n        let (source_mailbox, source_handle) = ctx\n            .spawn_actor()\n            .set_mailboxes(source_mailbox, source_inbox)\n            
.set_kill_switch(self.kill_switch.clone())\n            .spawn(actor_source);\n        let assign_shards_message = AssignShards(Assignment {\n            shard_ids: self.shard_ids.clone(),\n        });\n        source_mailbox.send_message(assign_shards_message).await?;\n\n        // Increment generation once we are sure there will be no spawning error.\n        self.previous_generations_statistics = self.statistics.clone();\n        self.statistics.generation += 1;\n        self.handles_opt = Some(IndexingPipelineHandles {\n            source_mailbox,\n            source_handle,\n            doc_processor: doc_processor_handle,\n            indexer: indexer_handle,\n            index_serializer: index_serializer_handle,\n            packager: packager_handle,\n            uploader: uploader_handle,\n            sequencer: sequencer_handle,\n            publisher: publisher_handle,\n            next_check_for_progress: Instant::now() + *HEARTBEAT,\n        });\n        Ok(())\n    }\n\n    async fn terminate(&mut self) {\n        self.kill_switch.kill();\n        if let Some(handles) = self.handles_opt.take() {\n            tokio::join!(\n                handles.source_handle.kill(),\n                handles.indexer.kill(),\n                handles.packager.kill(),\n                handles.uploader.kill(),\n                handles.publisher.kill(),\n            );\n        }\n    }\n}\n\n#[async_trait]\nimpl Handler<SuperviseLoop> for IndexingPipeline {\n    type Reply = ();\n    async fn handle(\n        &mut self,\n        supervise_loop_token: SuperviseLoop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.perform_observe(ctx);\n        self.perform_health_check(ctx).await?;\n        ctx.schedule_self_msg(SUPERVISE_INTERVAL, supervise_loop_token);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Spawn> for IndexingPipeline {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        spawn: Spawn,\n  
      ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if self.handles_opt.is_some() {\n            return Ok(());\n        }\n        self.previous_generations_statistics.num_spawn_attempts = 1 + spawn.retry_count;\n        if let Err(spawn_error) = self.spawn_pipeline(ctx).await {\n            if let Some(MetastoreError::NotFound { .. }) =\n                spawn_error.downcast_ref::<MetastoreError>()\n            {\n                info!(error = ?spawn_error, \"could not spawn pipeline, index might have been deleted\");\n                return Err(ActorExitStatus::Success);\n            }\n            let retry_delay = wait_duration_before_retry(spawn.retry_count + 1);\n            error!(error = ?spawn_error, retry_count = spawn.retry_count, retry_delay = ?retry_delay, \"error while spawning indexing pipeline, retrying after some time\");\n            ctx.schedule_self_msg(\n                retry_delay,\n                Spawn {\n                    retry_count: spawn.retry_count + 1,\n                },\n            );\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<AssignShards> for IndexingPipeline {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        assign_shards_message: AssignShards,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.shard_ids\n            .clone_from(&assign_shards_message.0.shard_ids);\n        // If the pipeline is running, we forward the message to its source.\n        // If it is not, it will be respawned soon, and the shards will be assigned afterward.\n        if let Some(handles) = &mut self.handles_opt {\n            info!(\n                shard_ids=?assign_shards_message.0.shard_ids,\n                \"assigning shards to indexing pipeline\"\n            );\n            handles\n                .source_mailbox\n                .send_message(assign_shards_message)\n                .await?;\n        }\n        // We perform 
observe to make sure the set of shard ids is up to date.\n        self.perform_observe(ctx);\n        Ok(())\n    }\n}\n\npub struct IndexingPipelineParams {\n    pub pipeline_id: IndexingPipelineId,\n    pub metastore: MetastoreServiceClient,\n    pub storage: Arc<dyn Storage>,\n\n    // Indexing-related parameters\n    pub doc_mapper: Arc<DocMapper>,\n    pub indexing_directory: TempDirectory,\n    pub indexing_settings: IndexingSettings,\n    pub split_store: IndexingSplitStore,\n    pub max_concurrent_split_uploads_index: usize,\n    pub cooperative_indexing_permits: Option<Arc<Semaphore>>,\n\n    // Merge-related parameters\n    pub merge_policy: Arc<dyn MergePolicy>,\n    pub retention_policy: Option<RetentionPolicy>,\n    pub merge_planner_mailbox: Mailbox<MergePlanner>,\n    pub max_concurrent_split_uploads_merge: usize,\n\n    // Source-related parameters\n    pub source_config: SourceConfig,\n    pub source_storage_resolver: StorageResolver,\n    pub ingester_pool: IngesterPool,\n    pub queues_dir_path: PathBuf,\n    pub params_fingerprint: u64,\n\n    pub event_broker: EventBroker,\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n    use std::path::PathBuf;\n    use std::sync::Arc;\n\n    use quickwit_actors::{Command, Universe};\n    use quickwit_common::ServiceStream;\n    use quickwit_config::{IndexingSettings, SourceInputFormat, SourceParams};\n    use quickwit_doc_mapper::{DocMapper, default_doc_mapper_for_test};\n    use quickwit_metastore::checkpoint::IndexCheckpointDelta;\n    use quickwit_metastore::{IndexMetadata, IndexMetadataResponseExt, PublishSplitsRequestExt};\n    use quickwit_proto::metastore::{\n        EmptyResponse, IndexMetadataResponse, LastDeleteOpstampResponse, MetastoreError,\n        MockMetastoreService,\n    };\n    use quickwit_proto::types::{IndexUid, NodeId, PipelineUid};\n    use quickwit_storage::RamStorage;\n\n    use super::{IndexingPipeline, *};\n    use crate::actors::merge_pipeline::{MergePipeline, 
MergePipelineParams};\n    use crate::merge_policy::default_merge_policy;\n\n    #[test]\n    fn test_wait_duration() {\n        assert_eq!(wait_duration_before_retry(0), Duration::from_secs(1));\n        assert_eq!(wait_duration_before_retry(1), Duration::from_secs(2));\n        assert_eq!(wait_duration_before_retry(2), Duration::from_secs(4));\n        assert_eq!(wait_duration_before_retry(3), Duration::from_secs(8));\n        assert_eq!(wait_duration_before_retry(9), Duration::from_secs(512));\n        assert_eq!(wait_duration_before_retry(10), MAX_RETRY_DELAY);\n    }\n\n    async fn test_indexing_pipeline_num_fails_before_success(\n        mut num_fails: usize,\n        test_file: &str,\n    ) -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::for_test(\"test-index\", 2);\n        let pipeline_id = IndexingPipelineId {\n            node_id,\n            index_uid,\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::for_test(0u128),\n        };\n        let source_config = SourceConfig {\n            source_id: \"test-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::file_from_str(test_file).unwrap(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_config_clone = source_config.clone();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .withf(|index_metadata_request| {\n                index_metadata_request.index_uid.as_ref().unwrap() == &(\"test-index\", 2)\n            })\n            .returning(move |_| {\n                if num_fails == 0 {\n                    let mut index_metadata =\n                        IndexMetadata::for_test(\"test-index\", \"ram:///indexes/test-index\");\n                    
index_metadata\n                        .add_source(source_config_clone.clone())\n                        .unwrap();\n                    let response =\n                        IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap();\n                    return Ok(response);\n                }\n                num_fails -= 1;\n                Err(MetastoreError::Timeout(\"timeout error\".to_string()))\n            });\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .returning(move |_last_delete_opstamp_request| Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .returning(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_stage_splits()\n            .withf(|stage_splits_request| -> bool {\n                stage_splits_request.index_uid() == &(\"test-index\", 2)\n            })\n            .returning(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_publish_splits()\n            .withf(|publish_splits_request| -> bool {\n                let checkpoint_delta: IndexCheckpointDelta = publish_splits_request\n                    .deserialize_index_checkpoint()\n                    .unwrap()\n                    .unwrap();\n                publish_splits_request.index_uid() == &(\"test-index\", 2)\n                    && checkpoint_delta.source_id == \"test-source\"\n                    && publish_splits_request.staged_split_ids.len() == 1\n                    && publish_splits_request.replaced_split_ids.is_empty()\n                    && format!(\"{:?}\", checkpoint_delta.source_delta)\n                        .ends_with(\":(00000000000000000000..~00000000000000001030])\")\n            })\n            .returning(|_| Ok(EmptyResponse {}));\n\n        let universe = Universe::new();\n        let (merge_planner_mailbox, _) = universe.create_test_mailbox();\n        let storage = 
Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::create_without_local_store_for_test(storage.clone());\n        let pipeline_params = IndexingPipelineParams {\n            pipeline_id,\n            doc_mapper: Arc::new(default_doc_mapper_for_test()),\n            source_config,\n            source_storage_resolver: StorageResolver::for_test(),\n            indexing_directory: TempDirectory::for_test(),\n            indexing_settings: IndexingSettings::for_test(),\n            ingester_pool: IngesterPool::default(),\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            storage,\n            split_store,\n            merge_policy: default_merge_policy(),\n            retention_policy: None,\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            max_concurrent_split_uploads_index: 4,\n            max_concurrent_split_uploads_merge: 5,\n            cooperative_indexing_permits: None,\n            merge_planner_mailbox,\n            event_broker: EventBroker::default(),\n            params_fingerprint: 42u64,\n        };\n        let pipeline = IndexingPipeline::new(pipeline_params);\n        let (_pipeline_mailbox, pipeline_handle) = universe.spawn_builder().spawn(pipeline);\n        let (pipeline_exit_status, pipeline_statistics) = pipeline_handle.join().await;\n        assert_eq!(\n            pipeline_statistics.generation, 1,\n            \"generation is {}, expected 1\",\n            pipeline_statistics.generation\n        );\n        assert_eq!(\n            pipeline_statistics.num_spawn_attempts,\n            1 + num_fails,\n            \"num spawn attempts is {}, expected 1 + {}\",\n            pipeline_statistics.num_spawn_attempts,\n            1 + num_fails\n        );\n        assert!(pipeline_exit_status.is_success());\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_retry_0() -> anyhow::Result<()> {\n        
test_indexing_pipeline_num_fails_before_success(0, \"data/test_corpus.json\").await\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_retry_1() -> anyhow::Result<()> {\n        test_indexing_pipeline_num_fails_before_success(1, \"data/test_corpus.json\").await\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_retry_0_gz() -> anyhow::Result<()> {\n        test_indexing_pipeline_num_fails_before_success(0, \"data/test_corpus.json.gz\").await\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_retry_1_gz() -> anyhow::Result<()> {\n        test_indexing_pipeline_num_fails_before_success(1, \"data/test_corpus.json.gz\").await\n    }\n\n    async fn indexing_pipeline_simple(test_file: &str) -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 1);\n        let pipeline_id = IndexingPipelineId {\n            node_id,\n            index_uid: index_uid.clone(),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::for_test(0u128),\n        };\n        let source_config = SourceConfig {\n            source_id: \"test-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::file_from_str(test_file).unwrap(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_config_clone = source_config.clone();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .withf(|index_metadata_request| {\n                index_metadata_request.index_uid.as_ref().unwrap() == &(\"test-index\", 1)\n            })\n            .returning(move |_| {\n                let mut index_metadata =\n                    IndexMetadata::for_test(\"test-index\", \"ram:///indexes/test-index\");\n              
  index_metadata\n                    .add_source(source_config_clone.clone())\n                    .unwrap();\n                Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .withf(move |last_delete_opstamp| last_delete_opstamp.index_uid() == &index_uid_clone)\n            .returning(move |_| Ok(LastDeleteOpstampResponse::new(10)));\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_stage_splits()\n            .withf(move |stage_splits_request| stage_splits_request.index_uid() == &index_uid_clone)\n            .returning(|_| Ok(EmptyResponse {}));\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_publish_splits()\n            .withf(move |publish_splits_request| -> bool {\n                let checkpoint_delta: IndexCheckpointDelta = publish_splits_request\n                    .deserialize_index_checkpoint()\n                    .unwrap()\n                    .unwrap();\n                publish_splits_request.index_uid() == &index_uid_clone\n                    && publish_splits_request.staged_split_ids.len() == 1\n                    && publish_splits_request.replaced_split_ids.is_empty()\n                    && checkpoint_delta.source_id == \"test-source\"\n                    && format!(\"{:?}\", checkpoint_delta.source_delta)\n                        .ends_with(\":(00000000000000000000..~00000000000000001030])\")\n            })\n            .returning(|_| Ok(EmptyResponse {}));\n\n        let universe = Universe::new();\n        let storage = Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::create_without_local_store_for_test(storage.clone());\n        let (merge_planner_mailbox, _) = universe.create_test_mailbox();\n        let pipeline_params = IndexingPipelineParams 
{\n            pipeline_id,\n            doc_mapper: Arc::new(default_doc_mapper_for_test()),\n            source_config,\n            source_storage_resolver: StorageResolver::for_test(),\n            indexing_directory: TempDirectory::for_test(),\n            indexing_settings: IndexingSettings::for_test(),\n            ingester_pool: IngesterPool::default(),\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage,\n            split_store,\n            merge_policy: default_merge_policy(),\n            retention_policy: None,\n            max_concurrent_split_uploads_index: 4,\n            max_concurrent_split_uploads_merge: 5,\n            cooperative_indexing_permits: None,\n            merge_planner_mailbox,\n            event_broker: Default::default(),\n            params_fingerprint: 42u64,\n        };\n        let pipeline = IndexingPipeline::new(pipeline_params);\n        let (_pipeline_mailbox, pipeline_handler) = universe.spawn_builder().spawn(pipeline);\n        let (pipeline_exit_status, pipeline_statistics) = pipeline_handler.join().await;\n        assert!(pipeline_exit_status.is_success());\n        assert_eq!(pipeline_statistics.generation, 1);\n        assert_eq!(pipeline_statistics.num_spawn_attempts, 1);\n        assert_eq!(pipeline_statistics.num_published_splits, 1);\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_simple() -> anyhow::Result<()> {\n        indexing_pipeline_simple(\"data/test_corpus.json\").await\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_simple_gz() -> anyhow::Result<()> {\n        indexing_pipeline_simple(\"data/test_corpus.json.gz\").await\n    }\n\n    #[tokio::test]\n    async fn test_merge_pipeline_does_not_stop_on_indexing_pipeline_failure() {\n        let node_id = NodeId::from(\"test-node\");\n        let pipeline_id = 
IndexingPipelineId {\n            node_id,\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::for_test(0u128),\n        };\n        let source_config = SourceConfig {\n            source_id: \"test-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_config_clone = source_config.clone();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .withf(|index_metadata_request| {\n                index_metadata_request.index_uid.as_ref().unwrap() == &(\"test-index\", 2)\n            })\n            .returning(move |_| {\n                let mut index_metadata =\n                    IndexMetadata::for_test(\"test-index\", \"ram:///indexes/test-index\");\n                index_metadata\n                    .add_source(source_config_clone.clone())\n                    .unwrap();\n                Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(|_| Ok(ServiceStream::empty()));\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n\n        let universe = Universe::with_accelerated_time();\n        let doc_mapper = Arc::new(default_doc_mapper_for_test());\n        let storage = Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::create_without_local_store_for_test(storage.clone());\n        let merge_pipeline_params = MergePipelineParams {\n            pipeline_id: pipeline_id.merge_pipeline_id(),\n            doc_mapper: doc_mapper.clone(),\n            indexing_directory: 
TempDirectory::for_test(),\n            metastore: metastore.clone(),\n            split_store: split_store.clone(),\n            merge_policy: default_merge_policy(),\n            retention_policy: None,\n            max_concurrent_split_uploads: 2,\n            merge_io_throughput_limiter_opt: None,\n            merge_scheduler_service: universe.get_or_spawn_one(),\n            event_broker: Default::default(),\n        };\n        let merge_pipeline = MergePipeline::new(merge_pipeline_params, None, universe.spawn_ctx());\n        let merge_planner_mailbox = merge_pipeline.merge_planner_mailbox().clone();\n        let (_merge_pipeline_mailbox, merge_pipeline_handler) =\n            universe.spawn_builder().spawn(merge_pipeline);\n        let indexing_pipeline_params = IndexingPipelineParams {\n            pipeline_id,\n            doc_mapper,\n            source_config,\n            source_storage_resolver: StorageResolver::for_test(),\n            indexing_directory: TempDirectory::for_test(),\n            indexing_settings: IndexingSettings::for_test(),\n            ingester_pool: IngesterPool::default(),\n            metastore,\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage,\n            split_store,\n            merge_policy: default_merge_policy(),\n            retention_policy: None,\n            max_concurrent_split_uploads_index: 4,\n            max_concurrent_split_uploads_merge: 5,\n            cooperative_indexing_permits: None,\n            merge_planner_mailbox: merge_planner_mailbox.clone(),\n            event_broker: Default::default(),\n            params_fingerprint: 42u64,\n        };\n        let indexing_pipeline = IndexingPipeline::new(indexing_pipeline_params);\n        let (_indexing_pipeline_mailbox, indexing_pipeline_handler) =\n            universe.spawn_builder().spawn(indexing_pipeline);\n        let obs = indexing_pipeline_handler\n            .process_pending_and_observe()\n            .await;\n       
 assert_eq!(obs.generation, 1);\n        // Let's shutdown the indexer, this will trigger the indexing pipeline failure and the\n        // restart.\n        let indexer = universe.get::<Indexer>().into_iter().next().unwrap();\n        let _ = indexer.ask(Command::Quit).await;\n        for _ in 0..10 {\n            universe.sleep(*quickwit_actors::HEARTBEAT).await;\n            // Check indexing pipeline has restarted.\n            let obs = indexing_pipeline_handler\n                .process_pending_and_observe()\n                .await;\n            if obs.generation == 2 {\n                assert_eq!(merge_pipeline_handler.check_health(true), Health::Healthy);\n                universe.quit().await;\n                return;\n            }\n        }\n        panic!(\"Pipeline was apparently not restarted.\");\n    }\n\n    async fn indexing_pipeline_all_failures_handling(test_file: &str) -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 2);\n        let pipeline_id = IndexingPipelineId {\n            node_id,\n            index_uid: index_uid.clone(),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::for_test(0u128),\n        };\n        let source_config = SourceConfig {\n            source_id: \"test-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::file_from_str(test_file).unwrap(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_config_clone = source_config.clone();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .withf(|index_metadata_request| {\n                index_metadata_request.index_uid.as_ref().unwrap() == &(\"test-index\", 2)\n            })\n            
.returning(move |_| {\n                let mut index_metadata =\n                    IndexMetadata::for_test(\"test-index\", \"ram:///indexes/test-index\");\n                index_metadata\n                    .add_source(source_config_clone.clone())\n                    .unwrap();\n\n                Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_last_delete_opstamp()\n            .withf(move |last_delete_opstamp| last_delete_opstamp.index_uid() == &index_uid_clone)\n            .returning(move |_| Ok(LastDeleteOpstampResponse::new(10)));\n        mock_metastore\n            .expect_stage_splits()\n            .never()\n            .returning(|_| Ok(EmptyResponse {}));\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_publish_splits()\n            .withf(move |publish_splits_request| -> bool {\n                let checkpoint_delta: IndexCheckpointDelta = publish_splits_request\n                    .deserialize_index_checkpoint()\n                    .unwrap()\n                    .unwrap();\n                publish_splits_request.index_uid() == &index_uid_clone\n                    && publish_splits_request.staged_split_ids.is_empty()\n                    && publish_splits_request.replaced_split_ids.is_empty()\n                    && checkpoint_delta.source_id == \"test-source\"\n                    && format!(\"{:?}\", checkpoint_delta.source_delta)\n                        .ends_with(\":(00000000000000000000..~00000000000000001030])\")\n            })\n            .returning(|_| Ok(EmptyResponse {}));\n        let universe = Universe::new();\n        let storage = Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::create_without_local_store_for_test(storage.clone());\n        let (merge_planner_mailbox, _) = universe.create_test_mailbox();\n        // 
Create a minimal mapper with wrong date format to ensure that all documents will fail\n        let broken_mapper = serde_json::from_str::<DocMapper>(\n            r#\"\n                {\n                    \"store_source\": true,\n                    \"timestamp_field\": \"timestamp\",\n                    \"field_mappings\": [\n                        {\n                            \"name\": \"timestamp\",\n                            \"type\": \"datetime\",\n                            \"input_formats\": [\"iso8601\"],\n                            \"fast\": true\n                        }\n                    ]\n                }\"#,\n        )\n        .unwrap();\n\n        let pipeline_params = IndexingPipelineParams {\n            pipeline_id,\n            doc_mapper: Arc::new(broken_mapper),\n            source_config,\n            source_storage_resolver: StorageResolver::for_test(),\n            indexing_directory: TempDirectory::for_test(),\n            indexing_settings: IndexingSettings::for_test(),\n            ingester_pool: IngesterPool::default(),\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage,\n            split_store,\n            merge_policy: default_merge_policy(),\n            retention_policy: None,\n            max_concurrent_split_uploads_index: 4,\n            max_concurrent_split_uploads_merge: 5,\n            cooperative_indexing_permits: None,\n            merge_planner_mailbox,\n            params_fingerprint: 42u64,\n            event_broker: Default::default(),\n        };\n        let pipeline = IndexingPipeline::new(pipeline_params);\n        let (_pipeline_mailbox, pipeline_handler) = universe.spawn_builder().spawn(pipeline);\n        let (pipeline_exit_status, pipeline_statistics) = pipeline_handler.join().await;\n        assert!(pipeline_exit_status.is_success());\n        // flaky. 
Sometimes the generation is 2.\n        assert_eq!(pipeline_statistics.generation, 1);\n        assert_eq!(pipeline_statistics.num_spawn_attempts, 1);\n        assert_eq!(pipeline_statistics.num_published_splits, 0);\n        assert_eq!(pipeline_statistics.num_empty_splits, 1);\n        assert_eq!(\n            pipeline_statistics.num_docs,\n            pipeline_statistics.num_invalid_docs\n        );\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_all_failures_handling() -> anyhow::Result<()> {\n        indexing_pipeline_all_failures_handling(\"data/test_corpus.json\").await\n    }\n\n    #[tokio::test]\n    async fn test_indexing_pipeline_all_failures_handling_gz() -> anyhow::Result<()> {\n        indexing_pipeline_all_failures_handling(\"data/test_corpus.json.gz\").await\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/indexing_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::fmt::{Debug, Formatter};\nuse std::path::PathBuf;\nuse std::sync::Arc;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse futures::TryStreamExt;\nuse itertools::Itertools;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, ActorState, Handler, Healthz, Mailbox,\n    Observation,\n};\nuse quickwit_cluster::Cluster;\nuse quickwit_common::fs::get_cache_directory_path;\nuse quickwit_common::io::Limiter;\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::{io, temp_dir};\nuse quickwit_config::{\n    INGEST_API_SOURCE_ID, IndexConfig, IndexerConfig, SourceConfig, build_doc_mapper,\n    indexing_pipeline_params_fingerprint,\n};\nuse quickwit_ingest::{\n    DropQueueRequest, GetPartitionId, IngestApiService, IngesterPool, ListQueuesRequest,\n    QUEUES_DIR_NAME,\n};\nuse quickwit_metastore::{\n    IndexMetadata, IndexMetadataResponseExt, IndexesMetadataResponseExt,\n    ListIndexesMetadataResponseExt, ListSplitsQuery, ListSplitsRequestExt, ListSplitsResponseExt,\n    SplitMetadata, SplitState,\n};\nuse quickwit_proto::indexing::{\n    ApplyIndexingPlanRequest, ApplyIndexingPlanResponse, IndexingError, IndexingPipelineId,\n    IndexingTask, MergePipelineId, PipelineMetrics,\n};\nuse quickwit_proto::metastore::{\n    IndexMetadataRequest, 
IndexMetadataSubrequest, IndexesMetadataRequest,\n    ListIndexesMetadataRequest, ListSplitsRequest, MetastoreResult, MetastoreService,\n    MetastoreServiceClient,\n};\nuse quickwit_proto::types::{IndexId, IndexUid, NodeId, PipelineUid, ShardId};\nuse quickwit_storage::StorageResolver;\nuse serde::{Deserialize, Serialize};\nuse time::OffsetDateTime;\nuse tokio::sync::Semaphore;\nuse tracing::{debug, error, info, warn};\n\nuse super::merge_pipeline::{MergePipeline, MergePipelineParams};\nuse super::{MergePlanner, MergeSchedulerService};\nuse crate::actors::merge_pipeline::FinishPendingMergesAndShutdownPipeline;\nuse crate::models::{DetachIndexingPipeline, DetachMergePipeline, ObservePipeline, SpawnPipeline};\nuse crate::source::{AssignShards, Assignment};\nuse crate::split_store::{IndexingSplitCache, SplitStoreQuota};\nuse crate::{IndexingPipeline, IndexingPipelineParams, IndexingSplitStore, IndexingStatistics};\n\n/// Name of the indexing directory, usually located at `<data_dir_path>/indexing`.\npub const INDEXING_DIR_NAME: &str = \"indexing\";\n\n#[derive(Clone, Debug, Default, Eq, PartialEq, Serialize, Deserialize)]\npub struct IndexingServiceCounters {\n    pub num_running_pipelines: usize,\n    pub num_successful_pipelines: usize,\n    pub num_failed_pipelines: usize,\n    pub num_running_merge_pipelines: usize,\n    pub num_deleted_queues: usize,\n    pub num_delete_queue_failures: usize,\n}\n\nstruct MergePipelineHandle {\n    mailbox: Mailbox<MergePlanner>,\n    handle: ActorHandle<MergePipeline>,\n}\n\nstruct PipelineHandle {\n    mailbox: Mailbox<IndexingPipeline>,\n    handle: ActorHandle<IndexingPipeline>,\n    indexing_pipeline_id: IndexingPipelineId,\n}\n\n/// The indexing service is (single) actor service running on indexer and in charge\n/// of executing the indexing plans received from the control plane.\n///\n/// Concretely this means receiving new plans, comparing the current situation\n/// with the target situation, and spawning/shutting down 
the  indexing pipelines that\n/// are respectively missing or extranumerous.\npub struct IndexingService {\n    node_id: NodeId,\n    indexing_root_directory: PathBuf,\n    queue_dir_path: PathBuf,\n    cluster: Cluster,\n    metastore: MetastoreServiceClient,\n    ingest_api_service_opt: Option<Mailbox<IngestApiService>>,\n    merge_scheduler_service: Mailbox<MergeSchedulerService>,\n    ingester_pool: IngesterPool,\n    storage_resolver: StorageResolver,\n    indexing_pipelines: HashMap<PipelineUid, PipelineHandle>,\n    counters: IndexingServiceCounters,\n    local_split_store: Arc<IndexingSplitCache>,\n    max_concurrent_split_uploads: usize,\n    merge_pipeline_handles: HashMap<MergePipelineId, MergePipelineHandle>,\n    cooperative_indexing_permits: Option<Arc<Semaphore>>,\n    merge_io_throughput_limiter_opt: Option<Limiter>,\n    event_broker: EventBroker,\n}\n\nimpl Debug for IndexingService {\n    fn fmt(&self, formatter: &mut Formatter) -> std::fmt::Result {\n        formatter\n            .debug_struct(\"IndexingService\")\n            .field(\"cluster_id\", &self.cluster.cluster_id())\n            .field(\"self_node_id\", &self.node_id)\n            .field(\"indexing_root_directory\", &self.indexing_root_directory)\n            .finish()\n    }\n}\n\nimpl IndexingService {\n    #[allow(clippy::too_many_arguments)]\n    pub async fn new(\n        node_id: NodeId,\n        data_dir_path: PathBuf,\n        indexer_config: IndexerConfig,\n        num_blocking_threads: usize,\n        cluster: Cluster,\n        metastore: MetastoreServiceClient,\n        ingest_api_service_opt: Option<Mailbox<IngestApiService>>,\n        merge_scheduler_service: Mailbox<MergeSchedulerService>,\n        ingester_pool: IngesterPool,\n        storage_resolver: StorageResolver,\n        event_broker: EventBroker,\n    ) -> anyhow::Result<IndexingService> {\n        let split_store_space_quota = SplitStoreQuota::try_new(\n            indexer_config.split_store_max_num_splits,\n  
          indexer_config.split_store_max_num_bytes,\n        )?;\n        let merge_io_throughput_limiter_opt =\n            indexer_config.max_merge_write_throughput.map(io::limiter);\n        let split_cache_dir_path = get_cache_directory_path(&data_dir_path);\n        let local_split_store =\n            IndexingSplitCache::open(split_cache_dir_path, split_store_space_quota).await?;\n        let indexing_root_directory =\n            temp_dir::create_or_purge_directory(&data_dir_path.join(INDEXING_DIR_NAME)).await?;\n        let queue_dir_path = data_dir_path.join(QUEUES_DIR_NAME);\n        let cooperative_indexing_permits = if indexer_config.enable_cooperative_indexing {\n            Some(Arc::new(Semaphore::new(num_blocking_threads)))\n        } else {\n            None\n        };\n        Ok(IndexingService {\n            node_id,\n            indexing_root_directory,\n            queue_dir_path,\n            cluster,\n            metastore,\n            ingest_api_service_opt,\n            merge_scheduler_service,\n            ingester_pool,\n            storage_resolver,\n            local_split_store: Arc::new(local_split_store),\n            indexing_pipelines: Default::default(),\n            counters: Default::default(),\n            max_concurrent_split_uploads: indexer_config.max_concurrent_split_uploads,\n            merge_pipeline_handles: HashMap::new(),\n            merge_io_throughput_limiter_opt,\n            cooperative_indexing_permits,\n            event_broker,\n        })\n    }\n\n    async fn detach_indexing_pipeline(\n        &mut self,\n        pipeline_uid: &PipelineUid,\n    ) -> Result<ActorHandle<IndexingPipeline>, IndexingError> {\n        let pipeline_handle = self\n            .indexing_pipelines\n            .remove(pipeline_uid)\n            .ok_or_else(|| {\n                let message = format!(\"could not find indexing pipeline `{pipeline_uid}`\");\n                IndexingError::Internal(message)\n            })?;\n        
self.counters.num_running_pipelines -= 1;\n        Ok(pipeline_handle.handle)\n    }\n\n    async fn detach_merge_pipeline(\n        &mut self,\n        pipeline_id: &MergePipelineId,\n    ) -> Result<ActorHandle<MergePipeline>, IndexingError> {\n        let pipeline_handle = self\n            .merge_pipeline_handles\n            .remove(pipeline_id)\n            .ok_or_else(|| {\n                let message = format!(\"could not find merge pipeline `{pipeline_id}`\");\n                IndexingError::Internal(message)\n            })?;\n        self.counters.num_running_merge_pipelines -= 1;\n        Ok(pipeline_handle.handle)\n    }\n\n    async fn observe_pipeline(\n        &mut self,\n        pipeline_uid: &PipelineUid,\n    ) -> Result<Observation<IndexingStatistics>, IndexingError> {\n        let pipeline_handle = &self\n            .indexing_pipelines\n            .get(pipeline_uid)\n            .ok_or_else(|| {\n                let message = format!(\"could not find indexing pipeline `{pipeline_uid}`\");\n                IndexingError::Internal(message)\n            })?\n            .handle;\n        let observation = pipeline_handle.observe().await;\n        Ok(observation)\n    }\n\n    async fn spawn_pipeline(\n        &mut self,\n        ctx: &ActorContext<Self>,\n        index_id: IndexId,\n        source_config: SourceConfig,\n        pipeline_uid: PipelineUid,\n    ) -> Result<IndexingPipelineId, IndexingError> {\n        let index_metadata = self.index_metadata(ctx, &index_id).await?;\n        let pipeline_id = IndexingPipelineId {\n            index_uid: index_metadata.index_uid.clone(),\n            source_id: source_config.source_id.clone(),\n            node_id: self.node_id.clone(),\n            pipeline_uid,\n        };\n        let index_config = index_metadata.into_index_config();\n        self.spawn_pipeline_inner(\n            ctx,\n            pipeline_id.clone(),\n            index_config,\n            source_config,\n            None,\n  
          None,\n        )\n        .await?;\n        Ok(pipeline_id)\n    }\n\n    async fn spawn_pipeline_inner(\n        &mut self,\n        ctx: &ActorContext<Self>,\n        indexing_pipeline_id: IndexingPipelineId,\n        index_config: IndexConfig,\n        source_config: SourceConfig,\n        immature_splits_opt: Option<Vec<SplitMetadata>>,\n        expected_params_fingerprint: Option<u64>,\n    ) -> Result<(), IndexingError> {\n        if self\n            .indexing_pipelines\n            .contains_key(&indexing_pipeline_id.pipeline_uid)\n        {\n            let message = format!(\"pipeline `{indexing_pipeline_id}` already exists\");\n            return Err(IndexingError::Internal(message));\n        }\n        let pipeline_uid_str = indexing_pipeline_id.pipeline_uid.to_string();\n        let indexing_directory = temp_dir::Builder::default()\n            .join(&indexing_pipeline_id.index_uid.index_id)\n            .join(&indexing_pipeline_id.index_uid.incarnation_id.to_string())\n            .join(&indexing_pipeline_id.source_id)\n            .join(&pipeline_uid_str)\n            .tempdir_in(&self.indexing_root_directory)\n            .map_err(|error| {\n                let message = format!(\"failed to create indexing directory: {error}\");\n                IndexingError::Internal(message)\n            })?;\n        let storage = self\n            .storage_resolver\n            .resolve(&index_config.index_uri)\n            .await\n            .map_err(|error| {\n                let message = format!(\"failed to spawn indexing pipeline: {error}\");\n                IndexingError::Internal(message)\n            })?;\n        let merge_policy =\n            crate::merge_policy::merge_policy_from_settings(&index_config.indexing_settings);\n        let retention_policy = index_config.retention_policy_opt.clone();\n        let split_store = IndexingSplitStore::new(storage.clone(), self.local_split_store.clone());\n\n        let doc_mapper = 
build_doc_mapper(&index_config.doc_mapping, &index_config.search_settings)\n            .map_err(|error| IndexingError::Internal(error.to_string()))?;\n\n        let merge_pipeline_id = indexing_pipeline_id.merge_pipeline_id();\n        let merge_pipeline_params = MergePipelineParams {\n            pipeline_id: merge_pipeline_id.clone(),\n            doc_mapper: doc_mapper.clone(),\n            indexing_directory: indexing_directory.clone(),\n            metastore: self.metastore.clone(),\n            split_store: split_store.clone(),\n            merge_scheduler_service: self.merge_scheduler_service.clone(),\n            merge_policy: merge_policy.clone(),\n            retention_policy: retention_policy.clone(),\n            merge_io_throughput_limiter_opt: self.merge_io_throughput_limiter_opt.clone(),\n            max_concurrent_split_uploads: self.max_concurrent_split_uploads,\n            event_broker: self.event_broker.clone(),\n        };\n        let merge_planner_mailbox =\n            self.get_or_create_merge_pipeline(merge_pipeline_params, immature_splits_opt, ctx)?;\n        // The concurrent uploads budget is split in 2: 1/2 for the indexing pipeline, 1/2 for the\n        // merge pipeline.\n        let max_concurrent_split_uploads_index = (self.max_concurrent_split_uploads / 2).max(1);\n        let max_concurrent_split_uploads_merge =\n            (self.max_concurrent_split_uploads - max_concurrent_split_uploads_index).max(1);\n\n        let params_fingerprint =\n            indexing_pipeline_params_fingerprint(&index_config, &source_config);\n        if let Some(expected_params_fingerprint) = expected_params_fingerprint {\n            // If the fingerprint of the config freshly fetched from the\n            // metastore is different from that received from the control plane,\n            // it means that the config changed again since the last indexing\n            // plan was built. 
In this case, postpone the pipeline creation.\n            if params_fingerprint != expected_params_fingerprint {\n                info!(\n                    index_id = indexing_pipeline_id.index_uid.index_id,\n                    source_id = indexing_pipeline_id.source_id,\n                    expected = expected_params_fingerprint,\n                    actual = params_fingerprint,\n                    \"pipeline fingerprint mismatch, postponing pipeline creation\"\n                );\n                return Ok(());\n            }\n        }\n        let pipeline_params = IndexingPipelineParams {\n            pipeline_id: indexing_pipeline_id.clone(),\n            metastore: self.metastore.clone(),\n            storage,\n\n            // Indexing-related parameters\n            doc_mapper,\n            indexing_directory,\n            indexing_settings: index_config.indexing_settings.clone(),\n            split_store,\n            max_concurrent_split_uploads_index,\n            cooperative_indexing_permits: self.cooperative_indexing_permits.clone(),\n\n            // Merge-related parameters\n            merge_policy,\n            retention_policy,\n            max_concurrent_split_uploads_merge,\n            merge_planner_mailbox,\n\n            // Source-related parameters\n            source_config,\n            ingester_pool: self.ingester_pool.clone(),\n            queues_dir_path: self.queue_dir_path.clone(),\n            source_storage_resolver: self.storage_resolver.clone(),\n            params_fingerprint,\n\n            event_broker: self.event_broker.clone(),\n        };\n        let pipeline = IndexingPipeline::new(pipeline_params);\n        let (pipeline_mailbox, pipeline_handle) = ctx.spawn_actor().spawn(pipeline);\n        let pipeline_handle = PipelineHandle {\n            mailbox: pipeline_mailbox,\n            handle: pipeline_handle,\n            indexing_pipeline_id: indexing_pipeline_id.clone(),\n        };\n        self.indexing_pipelines\n 
           .insert(indexing_pipeline_id.pipeline_uid, pipeline_handle);\n        self.counters.num_running_pipelines += 1;\n        Ok(())\n    }\n\n    async fn index_metadata(\n        &mut self,\n        ctx: &ActorContext<Self>,\n        index_id: &str,\n    ) -> Result<IndexMetadata, IndexingError> {\n        let _protected_zone_guard = ctx.protect_zone();\n        let index_metadata_response = self\n            .metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await?;\n        let index_metadata = index_metadata_response.deserialize_index_metadata()?;\n        Ok(index_metadata)\n    }\n\n    async fn indexes_metadata(\n        &mut self,\n        ctx: &ActorContext<Self>,\n        indexing_pipeline_ids: &[IndexingPipelineId],\n    ) -> Result<Vec<IndexMetadata>, IndexingError> {\n        let index_metadata_subrequests: Vec<IndexMetadataSubrequest> = indexing_pipeline_ids\n            .iter()\n            // Remove duplicate subrequests\n            .unique_by(|pipeline_id| &pipeline_id.index_uid)\n            .map(|pipeline_id| IndexMetadataSubrequest {\n                index_id: None,\n                index_uid: Some(pipeline_id.index_uid.clone()),\n            })\n            .collect();\n        let indexes_metadata_request = IndexesMetadataRequest {\n            subrequests: index_metadata_subrequests,\n        };\n        let _protected_zone_guard = ctx.protect_zone();\n\n        let indexes_metadata_response = self\n            .metastore\n            .indexes_metadata(indexes_metadata_request)\n            .await?;\n        let indexes_metadata = indexes_metadata_response\n            .deserialize_indexes_metadata()\n            .await?;\n        Ok(indexes_metadata)\n    }\n\n    /// Fetches the immature splits candidates for merge for all the indexing pipelines for which a\n    /// merge pipeline is not running.\n    async fn fetch_immature_splits_for_new_merge_pipelines(\n        &mut 
self,\n        indexing_pipeline_ids: &[IndexingPipelineId],\n        ctx: &ActorContext<Self>,\n    ) -> MetastoreResult<HashMap<MergePipelineId, Vec<SplitMetadata>>> {\n        let mut index_uids = Vec::new();\n\n        for indexing_pipeline_id in indexing_pipeline_ids {\n            let merge_pipeline_id = indexing_pipeline_id.merge_pipeline_id();\n\n            if !self.merge_pipeline_handles.contains_key(&merge_pipeline_id) {\n                index_uids.push(merge_pipeline_id.index_uid);\n            }\n        }\n        if index_uids.is_empty() {\n            return Ok(Default::default());\n        }\n        index_uids.sort_unstable();\n        index_uids.dedup();\n\n        let list_splits_query = ListSplitsQuery::try_from_index_uids(index_uids)\n            .expect(\"`index_uids` should not be empty\")\n            .with_node_id(self.node_id.clone())\n            .with_split_state(SplitState::Published)\n            .retain_immature(OffsetDateTime::now_utc());\n        let list_splits_request =\n            ListSplitsRequest::try_from_list_splits_query(&list_splits_query)?;\n\n        let mut immature_splits_stream = ctx\n            .protect_future(self.metastore.list_splits(list_splits_request))\n            .await?;\n\n        let mut per_merge_pipeline_immature_splits: HashMap<MergePipelineId, Vec<SplitMetadata>> =\n            indexing_pipeline_ids\n                .iter()\n                .map(|indexing_pipeline_id| (indexing_pipeline_id.merge_pipeline_id(), Vec::new()))\n                .collect();\n\n        let mut num_immature_splits = 0usize;\n\n        while let Some(list_splits_response) = immature_splits_stream.try_next().await? {\n            for split_metadata in list_splits_response.deserialize_splits_metadata().await? 
{\n                num_immature_splits += 1;\n\n                let merge_pipeline_id = MergePipelineId {\n                    node_id: self.node_id.clone(),\n                    index_uid: split_metadata.index_uid.clone(),\n                    source_id: split_metadata.source_id.clone(),\n                };\n                per_merge_pipeline_immature_splits\n                    .entry(merge_pipeline_id)\n                    .or_default()\n                    .push(split_metadata);\n            }\n        }\n        info!(\"fetched {num_immature_splits} splits candidates for merge\");\n        Ok(per_merge_pipeline_immature_splits)\n    }\n\n    async fn handle_supervise(&mut self) -> Result<(), ActorExitStatus> {\n        self.indexing_pipelines\n            .retain(|pipeline_uid, pipeline_handle| {\n                match pipeline_handle.handle.state() {\n                    ActorState::Paused | ActorState::Running => true,\n                    ActorState::Success => {\n                        info!(\n                            pipeline_uid=%pipeline_uid,\n                            \"indexing pipeline exited successfully\"\n                        );\n                        self.counters.num_successful_pipelines += 1;\n                        self.counters.num_running_pipelines -= 1;\n                        false\n                    }\n                    ActorState::Failure => {\n                        // This should never happen: Indexing Pipelines are not supposed to fail,\n                        // and are themselves in charge of supervising the pipeline actors.\n                        error!(\n                            pipeline_uid=%pipeline_uid,\n                            \"indexing pipeline exited with failure: this should never happen, please report\"\n                        );\n                        self.counters.num_failed_pipelines += 1;\n                        self.counters.num_running_pipelines -= 1;\n                        false\n  
                  }\n                }\n            });\n        let merge_pipelines_to_retain: HashSet<MergePipelineId> = self\n            .indexing_pipelines\n            .values()\n            .map(|pipeline_handle| pipeline_handle.indexing_pipeline_id.merge_pipeline_id())\n            .collect();\n\n        let merge_pipelines_to_shutdown: Vec<MergePipelineId> = self\n            .merge_pipeline_handles\n            .keys()\n            .filter(|running_merge_pipeline_id| {\n                !merge_pipelines_to_retain.contains(running_merge_pipeline_id)\n            })\n            .cloned()\n            .collect();\n\n        for merge_pipeline_to_shutdown in merge_pipelines_to_shutdown {\n            if let Some((_, merge_pipeline_handle)) = self\n                .merge_pipeline_handles\n                .remove_entry(&merge_pipeline_to_shutdown)\n            {\n                // We gracefully shutdown the merge pipeline, so we can complete the in-flight\n                // merges.\n                info!(\n                    index_uid=%merge_pipeline_to_shutdown.index_uid,\n                    source_id=%merge_pipeline_to_shutdown.source_id,\n                    \"shutting down orphan merge pipeline\"\n                );\n                // The queue capacity of the merge pipeline is unbounded, so `.send_message(...)`\n                // should not block.\n                // We avoid using `.quit()` here because it waits for the actor to exit.\n                merge_pipeline_handle\n                    .handle\n                    .mailbox()\n                    .send_message(FinishPendingMergesAndShutdownPipeline)\n                    .await\n                    .expect(\"merge pipeline mailbox should not be full\");\n            }\n        }\n        // Finally, we remove the completed or failed merge pipelines.\n        self.merge_pipeline_handles\n            .retain(|_, merge_pipeline_handle| merge_pipeline_handle.handle.state().is_running());\n        
self.counters.num_running_merge_pipelines = self.merge_pipeline_handles.len();\n        self.update_chitchat_running_plan().await;\n\n        let pipeline_metrics: HashMap<&IndexingPipelineId, PipelineMetrics> = self\n            .indexing_pipelines\n            .values()\n            .filter_map(|pipeline_handle| {\n                let indexing_statistics = pipeline_handle.handle.last_observation();\n                let pipeline_metrics = indexing_statistics.pipeline_metrics_opt?;\n                Some((&pipeline_handle.indexing_pipeline_id, pipeline_metrics))\n            })\n            .collect();\n        self.cluster\n            .update_self_node_pipeline_metrics(&pipeline_metrics)\n            .await;\n        Ok(())\n    }\n\n    fn get_or_create_merge_pipeline(\n        &mut self,\n        merge_pipeline_params: MergePipelineParams,\n        immature_splits_opt: Option<Vec<SplitMetadata>>,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Mailbox<MergePlanner>, IndexingError> {\n        if let Some(merge_pipeline_handle) = self\n            .merge_pipeline_handles\n            .get(&merge_pipeline_params.pipeline_id)\n        {\n            return Ok(merge_pipeline_handle.mailbox.clone());\n        }\n        let merge_pipeline_id = merge_pipeline_params.pipeline_id.clone();\n        let merge_pipeline =\n            MergePipeline::new(merge_pipeline_params, immature_splits_opt, ctx.spawn_ctx());\n        let merge_planner_mailbox = merge_pipeline.merge_planner_mailbox().clone();\n        let (_pipeline_mailbox, pipeline_handle) = ctx.spawn_actor().spawn(merge_pipeline);\n        let merge_pipeline_handle = MergePipelineHandle {\n            mailbox: merge_planner_mailbox.clone(),\n            handle: pipeline_handle,\n        };\n        self.merge_pipeline_handles\n            .insert(merge_pipeline_id, merge_pipeline_handle);\n        self.counters.num_running_merge_pipelines += 1;\n        Ok(merge_planner_mailbox)\n    }\n\n    /// For all Ingest 
V2 pipelines, assigns the set of shards they should be working on.\n    /// This is done regardless of whether there has been a change in their shard list\n    /// or not.\n    ///\n    /// If a pipeline actor has failed, this function just logs an error.\n    async fn assign_shards_to_pipelines(&mut self, tasks: &[IndexingTask]) {\n        for task in tasks {\n            if task.shard_ids.is_empty() {\n                continue;\n            }\n            let pipeline_uid = task.pipeline_uid();\n            let Some(pipeline_handle) = self.indexing_pipelines.get(&pipeline_uid) else {\n                continue;\n            };\n            let assignment = Assignment {\n                shard_ids: task.shard_ids.iter().cloned().collect(),\n            };\n            let message = AssignShards(assignment);\n\n            if let Err(error) = pipeline_handle.mailbox.send_message(message).await {\n                error!(%error, \"failed to assign shards to indexing pipeline\");\n            }\n        }\n    }\n\n    /// Applies the indexing plan by:\n    /// - Stopping the running pipelines not present in the provided plan.\n    /// - Starting the pipelines that are not running.\n    async fn apply_indexing_plan(\n        &mut self,\n        tasks: &[IndexingTask],\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), IndexingError> {\n        let pipeline_diff = self.compute_pipeline_diff(tasks);\n\n        if !pipeline_diff.pipelines_to_shutdown.is_empty() {\n            self.shutdown_pipelines(&pipeline_diff.pipelines_to_shutdown)\n                .await;\n        }\n        let mut spawn_pipeline_failures: Vec<IndexingPipelineId> = Vec::new();\n\n        if !pipeline_diff.pipelines_to_spawn.is_empty() {\n            spawn_pipeline_failures = self\n                .spawn_pipelines(&pipeline_diff.pipelines_to_spawn, ctx)\n                .await?;\n        }\n        self.assign_shards_to_pipelines(tasks).await;\n        
self.update_chitchat_running_plan().await;\n\n        if !spawn_pipeline_failures.is_empty() {\n            let message =\n                format!(\"failed to spawn indexing pipelines: {spawn_pipeline_failures:?}\");\n            return Err(IndexingError::Internal(message));\n        }\n        Ok(())\n    }\n\n    /// Identifies the pipelines to spawn and shutdown by comparing the scheduled plan with the\n    /// current running plan.\n    fn compute_pipeline_diff(&self, tasks: &[IndexingTask]) -> IndexingPipelineDiff {\n        let mut pipelines_to_spawn: Vec<IndexingTask> = Vec::new();\n        let mut scheduled_pipeline_uids: HashSet<PipelineUid> = HashSet::with_capacity(tasks.len());\n\n        for task in tasks {\n            let pipeline_uid = task.pipeline_uid();\n\n            if !self.indexing_pipelines.contains_key(&pipeline_uid) {\n                pipelines_to_spawn.push(task.clone());\n            }\n            scheduled_pipeline_uids.insert(pipeline_uid);\n        }\n        let pipelines_to_shutdown: Vec<PipelineUid> = self\n            .indexing_pipelines\n            .keys()\n            .filter(|pipeline_uid| !scheduled_pipeline_uids.contains(pipeline_uid))\n            .copied()\n            .collect();\n\n        IndexingPipelineDiff {\n            pipelines_to_shutdown,\n            pipelines_to_spawn,\n        }\n    }\n\n    /// Spawns the pipelines with supplied ids and returns a list of failed pipelines.\n    async fn spawn_pipelines(\n        &mut self,\n        pipelines_to_spawn: &[IndexingTask],\n        ctx: &ActorContext<Self>,\n    ) -> Result<Vec<IndexingPipelineId>, IndexingError> {\n        let pipelines_to_spawn_ids: Vec<_> = pipelines_to_spawn\n            .iter()\n            .map(|task| IndexingPipelineId {\n                node_id: self.node_id.clone(),\n                index_uid: task.index_uid().clone(),\n                pipeline_uid: task.pipeline_uid(),\n                source_id: task.source_id.clone(),\n            
})\n            .collect();\n        let indexes_metadata = self.indexes_metadata(ctx, &pipelines_to_spawn_ids).await?;\n\n        let per_index_uid_indexes_metadata: HashMap<IndexUid, IndexMetadata> = indexes_metadata\n            .into_iter()\n            .map(|index_metadata| (index_metadata.index_uid.clone(), index_metadata))\n            .collect();\n\n        let mut per_merge_pipeline_immature_splits: HashMap<MergePipelineId, Vec<SplitMetadata>> =\n            self.fetch_immature_splits_for_new_merge_pipelines(&pipelines_to_spawn_ids, ctx)\n                .await?;\n\n        let mut spawn_pipeline_failures: Vec<IndexingPipelineId> = Vec::new();\n\n        for (task_to_spawn, id_to_spawn) in pipelines_to_spawn.iter().zip(pipelines_to_spawn_ids) {\n            if let Some(index_metadata) =\n                per_index_uid_indexes_metadata.get(task_to_spawn.index_uid())\n            {\n                if let Some(source_config) = index_metadata.sources.get(&task_to_spawn.source_id) {\n                    let merge_pipeline_id = id_to_spawn.merge_pipeline_id();\n                    let immature_splits_opt =\n                        per_merge_pipeline_immature_splits.remove(&merge_pipeline_id);\n\n                    if let Err(error) = self\n                        .spawn_pipeline_inner(\n                            ctx,\n                            id_to_spawn.clone(),\n                            index_metadata.index_config.clone(),\n                            source_config.clone(),\n                            immature_splits_opt,\n                            Some(task_to_spawn.params_fingerprint),\n                        )\n                        .await\n                    {\n                        error!(pipeline_id=?id_to_spawn, %error, \"failed to spawn pipeline\");\n                        spawn_pipeline_failures.push(id_to_spawn.clone());\n                    }\n                } else {\n                    error!(pipeline_id=?id_to_spawn, \"failed 
to spawn pipeline: source not found\");\n                    spawn_pipeline_failures.push(id_to_spawn.clone());\n                }\n            } else {\n                error!(\n                    \"failed to spawn pipeline: index `{}` no longer exists\",\n                    id_to_spawn.index_uid\n                );\n                spawn_pipeline_failures.push(id_to_spawn.clone());\n            }\n        }\n        Ok(spawn_pipeline_failures)\n    }\n\n    /// Shuts down the pipelines with supplied ids and performs necessary cleanup.\n    async fn shutdown_pipelines(&mut self, pipelines_to_shutdown: &[PipelineUid]) {\n        info!(\n            pipeline_uids=?pipelines_to_shutdown,\n            \"shutdown indexing pipelines\"\n        );\n        let should_gc_ingest_api_queues = pipelines_to_shutdown\n            .iter()\n            .flat_map(|pipeline_uid| self.indexing_pipelines.get(pipeline_uid))\n            .any(|pipeline_handle| {\n                pipeline_handle.indexing_pipeline_id.source_id == INGEST_API_SOURCE_ID\n            });\n\n        for pipeline_to_shutdown in pipelines_to_shutdown {\n            match self.detach_indexing_pipeline(pipeline_to_shutdown).await {\n                Ok(pipeline_handle) => {\n                    // Killing the pipeline ensures that all the pipeline actors will stop.\n                    pipeline_handle.kill().await;\n                }\n                Err(error) => {\n                    // Just log the detach error, it can only come from a missing pipeline in the\n                    // `indexing_pipeline_handles`.\n                    error!(\n                        pipeline_uid=%pipeline_to_shutdown,\n                        ?error,\n                        \"failed to detach indexing pipeline\",\n                    );\n                }\n            }\n        }\n        // If at least one ingest source has been removed, the related index has possibly been\n        // deleted. 
Thus we run a garbage collection to remove queues of potentially deleted\n        // indexes.\n        if should_gc_ingest_api_queues && let Err(error) = self.run_ingest_api_queues_gc().await {\n            warn!(\n                %error,\n                \"failed to garbage collect ingest API queues\",\n            );\n        }\n    }\n\n    /// Broadcasts the current running plan via chitchat.\n    async fn update_chitchat_running_plan(&self) {\n        let mut indexing_tasks: Vec<IndexingTask> = self\n            .indexing_pipelines\n            .values()\n            .map(|pipeline_handle| {\n                let assignment = pipeline_handle.handle.last_observation();\n                let shard_ids: Vec<ShardId> = assignment.shard_ids.iter().cloned().collect();\n                IndexingTask {\n                    index_uid: Some(pipeline_handle.indexing_pipeline_id.index_uid.clone()),\n                    source_id: pipeline_handle.indexing_pipeline_id.source_id.clone(),\n                    pipeline_uid: Some(pipeline_handle.indexing_pipeline_id.pipeline_uid),\n                    shard_ids,\n                    params_fingerprint: assignment.params_fingerprint,\n                }\n            })\n            .collect();\n\n        // TODO: Does anybody know why we sort the indexing tasks by pipeline_uid here?\n        indexing_tasks.sort_unstable_by_key(|task| task.pipeline_uid);\n\n        self.cluster\n            .update_self_node_indexing_tasks(&indexing_tasks)\n            .await;\n    }\n\n    /// Garbage collects ingest API queues of deleted indexes.\n    async fn run_ingest_api_queues_gc(&mut self) -> anyhow::Result<()> {\n        let Some(ingest_api_service) = &self.ingest_api_service_opt else {\n            return Ok(());\n        };\n        let queues: HashSet<String> = ingest_api_service\n            .ask_for_res(ListQueuesRequest {})\n            .await\n            .context(\"failed to list queues\")?\n            .queues\n            .into_iter()\n    
        .collect();\n        debug!(queues=?queues, \"list ingest API queues\");\n\n        if queues.is_empty() {\n            return Ok(());\n        }\n        let indexes_metadata = self\n            .metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await?\n            .deserialize_indexes_metadata()\n            .await?;\n        let index_ids: HashSet<String> = indexes_metadata\n            .into_iter()\n            .map(|index_metadata| index_metadata.index_id().to_string())\n            .collect();\n        debug!(index_ids=?index_ids, \"list indexes\");\n\n        let partition_id = ingest_api_service.ask(GetPartitionId).await?;\n        let queue_ids_to_delete = queues.difference(&index_ids);\n\n        for queue_id in queue_ids_to_delete {\n            let delete_queue_res = ingest_api_service\n                .ask_for_res(DropQueueRequest {\n                    queue_id: queue_id.to_string(),\n                })\n                .await;\n            if let Err(delete_queue_error) = delete_queue_res {\n                error!(\n                    index_id = %queue_id,\n                    partition_id,\n                    error = %delete_queue_error,\n                    \"failed to delete queue\"\n                );\n                self.counters.num_delete_queue_failures += 1;\n            } else {\n                info!(\n                    index_id = %queue_id,\n                    partition_id,\n                    \"deleted queue successfully\"\n                );\n                self.counters.num_deleted_queues += 1;\n            }\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<ObservePipeline> for IndexingService {\n    type Reply = Result<Observation<IndexingStatistics>, IndexingError>;\n\n    async fn handle(\n        &mut self,\n        msg: ObservePipeline,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let pipeline_uid = 
msg.pipeline_id.pipeline_uid;\n        let observation = self.observe_pipeline(&pipeline_uid).await;\n        Ok(observation)\n    }\n}\n\n#[async_trait]\nimpl Handler<DetachIndexingPipeline> for IndexingService {\n    type Reply = Result<ActorHandle<IndexingPipeline>, IndexingError>;\n\n    async fn handle(\n        &mut self,\n        msg: DetachIndexingPipeline,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let pipeline_uid = msg.pipeline_id.pipeline_uid;\n        let detach_pipeline_result = self.detach_indexing_pipeline(&pipeline_uid).await;\n        Ok(detach_pipeline_result)\n    }\n}\n\n#[async_trait]\nimpl Handler<DetachMergePipeline> for IndexingService {\n    type Reply = Result<ActorHandle<MergePipeline>, IndexingError>;\n\n    async fn handle(\n        &mut self,\n        msg: DetachMergePipeline,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.detach_merge_pipeline(&msg.pipeline_id).await)\n    }\n}\n\n#[derive(Debug)]\nstruct SuperviseLoop;\n\n#[async_trait]\nimpl Handler<SuperviseLoop> for IndexingService {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: SuperviseLoop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.handle_supervise().await?;\n        ctx.schedule_self_msg(*quickwit_actors::HEARTBEAT, SuperviseLoop);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Actor for IndexingService {\n    type ObservableState = IndexingServiceCounters;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.run_ingest_api_queues_gc().await?;\n        self.handle(SuperviseLoop, ctx).await\n    }\n}\n\n#[async_trait]\nimpl Handler<SpawnPipeline> for IndexingService {\n    type Reply = Result<IndexingPipelineId, IndexingError>;\n   
 async fn handle(\n        &mut self,\n        message: SpawnPipeline,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Result<IndexingPipelineId, IndexingError>, ActorExitStatus> {\n        Ok(self\n            .spawn_pipeline(\n                ctx,\n                message.index_id,\n                message.source_config,\n                message.pipeline_uid,\n            )\n            .await)\n    }\n}\n\n#[async_trait]\nimpl Handler<ApplyIndexingPlanRequest> for IndexingService {\n    type Reply = Result<ApplyIndexingPlanResponse, IndexingError>;\n\n    async fn handle(\n        &mut self,\n        plan_request: ApplyIndexingPlanRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self\n            .apply_indexing_plan(&plan_request.indexing_tasks, ctx)\n            .await\n            .map(|_| ApplyIndexingPlanResponse {}))\n    }\n}\n\n#[async_trait]\nimpl Handler<Healthz> for IndexingService {\n    type Reply = bool;\n\n    async fn handle(\n        &mut self,\n        _msg: Healthz,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<bool, ActorExitStatus> {\n        // In the future, check metrics such as available disk space.\n        Ok(true)\n    }\n}\n\n#[derive(Debug)]\nstruct IndexingPipelineDiff {\n    pipelines_to_shutdown: Vec<PipelineUid>,\n    pipelines_to_spawn: Vec<IndexingTask>,\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n    use std::path::Path;\n    use std::time::Duration;\n\n    use quickwit_actors::{HEARTBEAT, Health, ObservationType, Supervisable, Universe};\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_common::ServiceStream;\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_config::{\n        IngestApiConfig, KafkaSourceParams, SourceConfig, SourceInputFormat, SourceParams,\n        VecSourceParams,\n    };\n    use quickwit_ingest::{CreateQueueIfNotExistsRequest, init_ingest_api};\n    use 
quickwit_metastore::{\n        AddSourceRequestExt, CreateIndexRequestExt, ListIndexesMetadataResponseExt, Split,\n        metastore_for_test,\n    };\n    use quickwit_proto::indexing::IndexingTask;\n    use quickwit_proto::metastore::{\n        AddSourceRequest, CreateIndexRequest, DeleteIndexRequest, IndexMetadataResponse,\n        IndexesMetadataResponse, ListIndexesMetadataResponse, ListSplitsResponse,\n        MockMetastoreService,\n    };\n\n    use super::*;\n    use crate::actors::merge_pipeline::SUPERVISE_LOOP_INTERVAL;\n\n    async fn spawn_indexing_service_for_test(\n        data_dir_path: &Path,\n        universe: &Universe,\n        metastore: MetastoreServiceClient,\n        cluster: Cluster,\n    ) -> (Mailbox<IndexingService>, ActorHandle<IndexingService>) {\n        let indexer_config = IndexerConfig::for_test().unwrap();\n        let num_blocking_threads = 1;\n        let storage_resolver = StorageResolver::unconfigured();\n        let queues_dir_path = data_dir_path.join(QUEUES_DIR_NAME);\n        let ingest_api_service =\n            init_ingest_api(universe, &queues_dir_path, &IngestApiConfig::default())\n                .await\n                .unwrap();\n        let merge_scheduler_mailbox: Mailbox<MergeSchedulerService> = universe.get_or_spawn_one();\n        let indexing_server = IndexingService::new(\n            NodeId::from(\"test-node\"),\n            data_dir_path.to_path_buf(),\n            indexer_config,\n            num_blocking_threads,\n            cluster,\n            metastore,\n            Some(ingest_api_service),\n            merge_scheduler_mailbox,\n            IngesterPool::default(),\n            storage_resolver.clone(),\n            EventBroker::default(),\n        )\n        .await\n        .unwrap();\n        universe.spawn_builder().spawn(indexing_server)\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_spawn_observe_detach() {\n        quickwit_common::setup_logging_for_tests();\n        let 
transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let metastore = metastore_for_test();\n\n        let index_id = append_random_suffix(\"test-indexing-service\");\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n        let create_source_request = AddSourceRequest::try_from_source_config(\n            index_uid.clone(),\n            &SourceConfig::ingest_api_default(),\n        )\n        .unwrap();\n        metastore.add_source(create_source_request).await.unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let (indexing_service, indexing_service_handle) =\n            spawn_indexing_service_for_test(temp_dir.path(), &universe, metastore, cluster).await;\n        let observation = indexing_service_handle.observe().await;\n        assert_eq!(observation.num_running_pipelines, 0);\n        assert_eq!(observation.num_failed_pipelines, 0);\n        assert_eq!(observation.num_successful_pipelines, 0);\n\n        // Test `spawn_pipeline`.\n        let source_config_0 = SourceConfig {\n            source_id: \"test-indexing-service--source-0\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let spawn_pipeline_msg = SpawnPipeline {\n            index_id: index_id.clone(),\n 
           pipeline_uid: PipelineUid::for_test(1111u128),\n            source_config: source_config_0.clone(),\n        };\n        let pipeline_id: IndexingPipelineId = indexing_service\n            .ask_for_res(spawn_pipeline_msg.clone())\n            .await\n            .unwrap();\n        indexing_service\n            .ask_for_res(spawn_pipeline_msg)\n            .await\n            .unwrap_err();\n        assert_eq!(pipeline_id.index_uid.index_id, index_id);\n        assert_eq!(pipeline_id.source_id, source_config_0.source_id);\n        assert_eq!(pipeline_id.node_id, \"test-node\");\n        assert_eq!(pipeline_id.pipeline_uid, PipelineUid::for_test(1111u128));\n        assert_eq!(\n            indexing_service_handle\n                .observe()\n                .await\n                .num_running_pipelines,\n            1\n        );\n\n        // Test `observe_pipeline`.\n        let observation = indexing_service\n            .ask_for_res(ObservePipeline {\n                pipeline_id: pipeline_id.clone(),\n            })\n            .await\n            .unwrap();\n        assert_eq!(observation.obs_type, ObservationType::Alive);\n        assert_eq!(observation.generation, 1);\n        assert_eq!(observation.num_spawn_attempts, 1);\n\n        // Test detach.\n        let pipeline_handle = indexing_service\n            .ask_for_res(DetachIndexingPipeline {\n                pipeline_id: pipeline_id.clone(),\n            })\n            .await\n            .unwrap();\n        pipeline_handle.kill().await;\n        let _merge_pipeline = indexing_service\n            .ask_for_res(DetachMergePipeline {\n                pipeline_id: pipeline_id.merge_pipeline_id(),\n            })\n            .await\n            .unwrap();\n        let observation = indexing_service_handle.process_pending_and_observe().await;\n        assert_eq!(observation.num_running_pipelines, 0);\n        assert_eq!(observation.num_running_merge_pipelines, 0);\n        
universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_supervise_pipelines() {\n        quickwit_common::setup_logging_for_tests();\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let metastore = metastore_for_test();\n\n        let index_id = append_random_suffix(\"test-indexing-service\");\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n        let source_config = SourceConfig {\n            source_id: \"test-indexing-service--source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Vec(VecSourceParams {\n                docs: Vec::new(),\n                batch_num_docs: 10,\n                partition: \"0\".to_string(),\n            }),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let create_index_request = CreateIndexRequest::try_from_index_and_source_configs(\n            &index_config,\n            std::slice::from_ref(&source_config),\n        )\n        .unwrap();\n        metastore.create_index(create_index_request).await.unwrap();\n\n        let universe = Universe::new();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let (indexing_service, indexing_server_handle) =\n            spawn_indexing_service_for_test(temp_dir.path(), &universe, metastore, cluster).await;\n\n        indexing_service\n            .ask_for_res(SpawnPipeline {\n                index_id: index_id.clone(),\n                source_config,\n                pipeline_uid: PipelineUid::default(),\n            })\n            .await\n            .unwrap();\n        for _ in 0..2000 {\n            let obs = indexing_server_handle.observe().await;\n   
         if obs.num_successful_pipelines == 1 {\n                // It may or may not panic\n                universe.quit().await;\n                return;\n            }\n            universe.sleep(Duration::from_millis(100)).await;\n        }\n        panic!(\"Pipeline not exited successfully.\");\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_apply_plan() {\n        const PARAMS_FINGERPRINT_INGEST_API: u64 = 1637744865450232394;\n        const PARAMS_FINGERPRINT_SOURCE_1: u64 = 1705211905504908791;\n        const PARAMS_FINGERPRINT_SOURCE_2: u64 = 8706667372658059428;\n\n        quickwit_common::setup_logging_for_tests();\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let metastore = metastore_for_test();\n\n        let index_id = append_random_suffix(\"test-indexing-service\");\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n        let add_source_request = AddSourceRequest::try_from_source_config(\n            index_uid.clone(),\n            &SourceConfig::ingest_api_default(),\n        )\n        .unwrap();\n        metastore.add_source(add_source_request).await.unwrap();\n        let universe = Universe::new();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let (indexing_service, indexing_service_handle) = spawn_indexing_service_for_test(\n            temp_dir.path(),\n            &universe,\n            metastore.clone(),\n            cluster.clone(),\n        )\n        .await;\n     
   let metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.clone()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n\n        let source_config_1 = SourceConfig {\n            source_id: \"test-indexing-service--source-1\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        {\n            // Assign 2 indexing tasks\n            // -> total: 1 source * 2 pipelines\n            let add_source_request =\n                AddSourceRequest::try_from_source_config(index_uid.clone(), &source_config_1)\n                    .unwrap();\n            metastore.add_source(add_source_request).await.unwrap();\n            let indexing_tasks = vec![\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_1.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(0u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_1,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_1.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_1,\n                },\n            ];\n            indexing_service\n                .ask_for_res(ApplyIndexingPlanRequest { indexing_tasks })\n                .await\n                .unwrap();\n            assert_eq!(\n                indexing_service_handle\n                    .observe()\n                    .await\n                 
   .num_running_pipelines,\n                2\n            );\n        }\n        let kafka_params = KafkaSourceParams {\n            topic: \"my-topic\".to_string(),\n            client_log_level: None,\n            client_params: serde_json::Value::Null,\n            enable_backfill_mode: false,\n        };\n        let source_config_2 = SourceConfig {\n            source_id: \"test-indexing-service--source-2\".to_string(),\n            num_pipelines: NonZeroUsize::new(2).unwrap(),\n            enabled: true,\n            source_params: SourceParams::Kafka(kafka_params),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        {\n            // Assign 2 more indexing tasks (1 new source + activate ingest API source)\n            // -> total: 2 source * 1 pipeline + 1 source * 2 pipelines\n            let add_source_request_2 =\n                AddSourceRequest::try_from_source_config(index_uid.clone(), &source_config_2)\n                    .unwrap();\n            metastore.add_source(add_source_request_2).await.unwrap();\n\n            let indexing_tasks = vec![\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: INGEST_API_SOURCE_ID.to_string(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(3u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_INGEST_API,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_1.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_1,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    
source_id: source_config_1.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(2u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_1,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_2.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(4u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_2,\n                },\n            ];\n            indexing_service\n                .ask_for_res(ApplyIndexingPlanRequest {\n                    indexing_tasks: indexing_tasks.clone(),\n                })\n                .await\n                .unwrap();\n            assert_eq!(\n                indexing_service_handle\n                    .observe()\n                    .await\n                    .num_running_pipelines,\n                4\n            );\n            cluster\n                .wait_for_ready_members(\n                    |members| {\n                        members\n                            .iter()\n                            .any(|member| member.indexing_tasks.len() == indexing_tasks.len())\n                    },\n                    Duration::from_secs(5),\n                )\n                .await\n                .unwrap();\n            let self_member = &cluster.ready_members().await[0];\n            assert_eq!(\n                HashSet::<_>::from_iter(self_member.indexing_tasks.iter()),\n                HashSet::from_iter(indexing_tasks.iter())\n            );\n        }\n        {\n            // Remove 1 task (source_1 runs only 1 pipeline)\n            // -> total = 3 sources x 1 pipeline each\n            let indexing_tasks = vec![\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n              
      source_id: INGEST_API_SOURCE_ID.to_string(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(3u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_INGEST_API,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_1.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(1u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_1,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_2.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(4u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_2,\n                },\n            ];\n            indexing_service\n                .ask_for_res(ApplyIndexingPlanRequest {\n                    indexing_tasks: indexing_tasks.clone(),\n                })\n                .await\n                .unwrap();\n            let indexing_service_obs = indexing_service_handle.observe().await;\n            assert_eq!(indexing_service_obs.num_running_pipelines, 3);\n            assert_eq!(indexing_service_obs.num_deleted_queues, 0);\n            assert_eq!(indexing_service_obs.num_delete_queue_failures, 0);\n\n            indexing_service_handle.process_pending_and_observe().await;\n\n            cluster\n                .wait_for_ready_members(\n                    |members| {\n                        members\n                            .iter()\n                            .any(|member| member.indexing_tasks.len() == indexing_tasks.len())\n                    },\n                    Duration::from_secs(5),\n                )\n                .await\n               
 .unwrap();\n\n            let self_member = &cluster.ready_members().await[0];\n\n            assert_eq!(\n                HashSet::<_>::from_iter(self_member.indexing_tasks.iter()),\n                HashSet::from_iter(indexing_tasks.iter())\n            );\n        }\n        {\n            // Rescheduling a task (source_1) with an unexpected fingerprint\n            // removes the existing pipeline but doesn't start a new one.\n            // -> total: 2 sources x 1 pipeline\n            let indexing_tasks = vec![\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: INGEST_API_SOURCE_ID.to_string(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(3u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_INGEST_API,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_1.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(7u128)),\n                    params_fingerprint: 42,\n                },\n                IndexingTask {\n                    index_uid: Some(metadata.index_uid.clone()),\n                    source_id: source_config_2.source_id.clone(),\n                    shard_ids: Vec::new(),\n                    pipeline_uid: Some(PipelineUid::for_test(4u128)),\n                    params_fingerprint: PARAMS_FINGERPRINT_SOURCE_2,\n                },\n            ];\n            indexing_service\n                .ask_for_res(ApplyIndexingPlanRequest {\n                    indexing_tasks: indexing_tasks.clone(),\n                })\n                .await\n                .unwrap();\n            let indexing_service_obs = indexing_service_handle.observe().await;\n            assert_eq!(indexing_service_obs.num_running_pipelines, 2);\n 
           assert_eq!(indexing_service_obs.num_deleted_queues, 0);\n            assert_eq!(indexing_service_obs.num_delete_queue_failures, 0);\n        }\n\n        // Delete index and apply empty plan\n        metastore\n            .delete_index(DeleteIndexRequest {\n                index_uid: Some(index_uid.clone()),\n            })\n            .await\n            .unwrap();\n        indexing_service\n            .ask_for_res(ApplyIndexingPlanRequest {\n                indexing_tasks: Vec::new(),\n            })\n            .await\n            .unwrap();\n        let indexing_service_obs = indexing_service_handle.observe().await;\n        assert_eq!(indexing_service_obs.num_running_pipelines, 0);\n        assert_eq!(indexing_service_obs.num_deleted_queues, 1);\n        assert_eq!(indexing_service_obs.num_delete_queue_failures, 0);\n        indexing_service_handle.quit().await;\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_shutdown_merge_pipeline_when_no_indexing_pipeline() {\n        quickwit_common::setup_logging_for_tests();\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let metastore = metastore_for_test();\n\n        let index_id = append_random_suffix(\"test-indexing-service\");\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n        let source_config = SourceConfig {\n            source_id: \"test-indexing-service--source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let create_index_request =\n            
CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n        let add_source_request =\n            AddSourceRequest::try_from_source_config(index_uid.clone(), &source_config).unwrap();\n        metastore.add_source(add_source_request).await.unwrap();\n\n        // Test `IndexingService::new`.\n        let temp_dir = tempfile::tempdir().unwrap();\n        let data_dir_path = temp_dir.path().to_path_buf();\n        let indexer_config = IndexerConfig::for_test().unwrap();\n        let num_blocking_threads = 1;\n        let storage_resolver = StorageResolver::unconfigured();\n        let universe = Universe::with_accelerated_time();\n        let queues_dir_path = data_dir_path.join(QUEUES_DIR_NAME);\n        let ingest_api_service =\n            init_ingest_api(&universe, &queues_dir_path, &IngestApiConfig::default())\n                .await\n                .unwrap();\n        let merge_scheduler_service = universe.get_or_spawn_one();\n        let indexing_server = IndexingService::new(\n            NodeId::from(\"test-node\"),\n            data_dir_path,\n            indexer_config,\n            num_blocking_threads,\n            cluster.clone(),\n            metastore.clone(),\n            Some(ingest_api_service),\n            merge_scheduler_service,\n            IngesterPool::default(),\n            storage_resolver.clone(),\n            EventBroker::default(),\n        )\n        .await\n        .unwrap();\n        let (indexing_server_mailbox, indexing_server_handle) =\n            universe.spawn_builder().spawn(indexing_server);\n        let pipeline_id = indexing_server_mailbox\n            .ask_for_res(SpawnPipeline {\n                index_id: index_id.clone(),\n                source_config,\n                pipeline_uid: PipelineUid::default(),\n            
})\n            .await\n            .unwrap();\n        let observation = indexing_server_handle.observe().await;\n        assert_eq!(observation.num_running_pipelines, 1);\n        assert_eq!(observation.num_failed_pipelines, 0);\n        assert_eq!(observation.num_successful_pipelines, 0);\n        assert_eq!(observation.num_running_merge_pipelines, 1);\n\n        // Test `shutdown_pipeline`\n        let pipeline = indexing_server_mailbox\n            .ask_for_res(DetachIndexingPipeline { pipeline_id })\n            .await\n            .unwrap();\n        pipeline.quit().await;\n\n        // Let the service clean up the merge pipelines.\n        universe.sleep(*HEARTBEAT).await;\n\n        let observation = indexing_server_handle.process_pending_and_observe().await;\n        assert_eq!(observation.num_running_pipelines, 0);\n        assert_eq!(observation.num_running_merge_pipelines, 0);\n        universe.sleep(SUPERVISE_LOOP_INTERVAL).await;\n        // Check that the merge pipeline is also shut down as there are no more indexing pipelines\n        // on the index.\n        assert!(universe.get_one::<MergePipeline>().is_none());\n        // It may or may not panic\n        universe.quit().await;\n    }\n\n    #[derive(Debug)]\n    struct FreezePipeline;\n    #[async_trait]\n    impl Handler<FreezePipeline> for IndexingPipeline {\n        type Reply = ();\n        async fn handle(\n            &mut self,\n            _: FreezePipeline,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<Self::Reply, ActorExitStatus> {\n            tokio::time::sleep(*HEARTBEAT * 5).await;\n            Ok(())\n        }\n    }\n\n    #[derive(Debug)]\n    struct ObservePipelineHealth(IndexingPipelineId);\n    #[async_trait]\n    impl Handler<ObservePipelineHealth> for IndexingService {\n        type Reply = Health;\n        async fn handle(\n            &mut self,\n            message: ObservePipelineHealth,\n            _ctx: &ActorContext<Self>,\n        ) -> 
Result<Self::Reply, ActorExitStatus> {\n            Ok(self\n                .indexing_pipelines\n                .get(&message.0.pipeline_uid)\n                .unwrap()\n                .handle\n                .check_health(true))\n        }\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_does_not_shutdown_pipelines_on_indexing_pipeline_freeze() {\n        quickwit_common::setup_logging_for_tests();\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let index_id = append_random_suffix(\"test-indexing-service-indexing-pipeline-timeout\");\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let mut index_metadata = IndexMetadata::for_test(&index_id, &index_uri);\n        let source_config = SourceConfig {\n            source_id: \"test-indexing-service--source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        index_metadata\n            .sources\n            .insert(source_config.source_id.clone(), source_config.clone());\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata_clone = index_metadata.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(move |_request| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata_clone,\n                ]))\n            });\n        mock_metastore.expect_index_metadata().returning(move |_| {\n            Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n        });\n        mock_metastore\n            .expect_list_splits()\n            .returning(|_| Ok(ServiceStream::empty()));\n     
   let universe = Universe::new();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let (indexing_service, indexing_service_handle) = spawn_indexing_service_for_test(\n            temp_dir.path(),\n            &universe,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            cluster,\n        )\n        .await;\n        let _pipeline_id = indexing_service\n            .ask_for_res(SpawnPipeline {\n                index_id: index_id.clone(),\n                source_config,\n                pipeline_uid: PipelineUid::default(),\n            })\n            .await\n            .unwrap();\n        let observation = indexing_service_handle.observe().await;\n        assert_eq!(observation.num_running_pipelines, 1);\n        assert_eq!(observation.num_failed_pipelines, 0);\n        assert_eq!(observation.num_successful_pipelines, 0);\n\n        let indexing_pipeline = universe.get_one::<IndexingPipeline>().unwrap();\n\n        // Freeze pipeline during 5 heartbeats.\n        indexing_pipeline\n            .send_message(FreezePipeline)\n            .await\n            .unwrap();\n        universe.sleep(*HEARTBEAT * 5).await;\n        // Check that indexing and merge pipelines are still running.\n        let observation = indexing_service_handle.observe().await;\n        assert_eq!(observation.num_running_pipelines, 1);\n        assert_eq!(observation.num_failed_pipelines, 0);\n        assert_eq!(observation.num_running_merge_pipelines, 1);\n        // Might generate panics\n        universe.quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_ingest_api_gc() {\n        let index_id = \"test-ingest-api-gc-index\".to_string();\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n      
      .await\n            .unwrap();\n        let metastore = metastore_for_test();\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        // Setup ingest api objects\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let queues_dir_path = temp_dir.path().join(QUEUES_DIR_NAME);\n        let ingest_api_service =\n            init_ingest_api(&universe, &queues_dir_path, &IngestApiConfig::default())\n                .await\n                .unwrap();\n        let create_queue_req = CreateQueueIfNotExistsRequest {\n            queue_id: index_id.clone(),\n        };\n        ingest_api_service\n            .ask_for_res(create_queue_req)\n            .await\n            .unwrap();\n\n        // Setup `IndexingService`\n        let data_dir_path = temp_dir.path().to_path_buf();\n        let indexer_config = IndexerConfig::for_test().unwrap();\n        let num_blocking_threads = 1;\n        let storage_resolver = StorageResolver::unconfigured();\n        let merge_scheduler_service: Mailbox<MergeSchedulerService> = universe.get_or_spawn_one();\n        let mut indexing_server = IndexingService::new(\n            NodeId::from(\"test-ingest-api-gc-node\"),\n            data_dir_path,\n            indexer_config,\n            num_blocking_threads,\n            cluster.clone(),\n            metastore.clone(),\n            Some(ingest_api_service.clone()),\n            merge_scheduler_service,\n            IngesterPool::default(),\n            storage_resolver.clone(),\n            EventBroker::default(),\n        )\n        .await\n        .unwrap();\n\n        indexing_server.run_ingest_api_queues_gc().await.unwrap();\n        
assert_eq!(indexing_server.counters.num_deleted_queues, 0);\n\n        metastore\n            .delete_index(DeleteIndexRequest {\n                index_uid: Some(index_uid.clone()),\n            })\n            .await\n            .unwrap();\n\n        indexing_server.run_ingest_api_queues_gc().await.unwrap();\n        assert_eq!(indexing_server.counters.num_deleted_queues, 1);\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_indexing_service_apply_indexing_plan_batches_metastore_calls() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let universe = Universe::new();\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .withf(|request| request.index_id.as_ref().unwrap() == \"test-index-0\")\n            .return_once(|_request| {\n                let index_metadata_0 =\n                    IndexMetadata::for_test(\"test-index-0\", \"ram:///indexes/test-index-0\");\n                let response =\n                    IndexMetadataResponse::try_from_index_metadata(&index_metadata_0).unwrap();\n                Ok(response)\n            });\n        mock_metastore\n            .expect_indexes_metadata()\n            .withf(|request| {\n                let index_uids: Vec<&IndexUid> = request\n                    .subrequests\n                    .iter()\n                    .flat_map(|subrequest| &subrequest.index_uid)\n                    .sorted()\n                    .collect();\n\n                index_uids == [&(\"test-index-1\", 0), &(\"test-index-2\", 0)]\n            })\n            .return_once(|_request| {\n                let source_config = SourceConfig::for_test(\"test-source\", SourceParams::void());\n\n                let mut index_metadata_1 =\n                    IndexMetadata::for_test(\"test-index-1\", \"ram:///indexes/test-index-1\");\n                
index_metadata_1.add_source(source_config.clone()).unwrap();\n\n                let mut index_metadata_2 =\n                    IndexMetadata::for_test(\"test-index-2\", \"ram:///indexes/test-index-2\");\n                index_metadata_2.add_source(source_config).unwrap();\n\n                let indexes_metadata = vec![index_metadata_1, index_metadata_2];\n                let failures = Vec::new();\n                let response = IndexesMetadataResponse::for_test(indexes_metadata, failures);\n                Ok(response)\n            });\n        mock_metastore\n            .expect_list_splits()\n            .withf(|request| {\n                let list_splits_query = request.deserialize_list_splits_query().unwrap();\n                list_splits_query.index_uids.unwrap() == [(\"test-index-0\", 0)]\n            })\n            .return_once(|_request| Ok(ServiceStream::empty()));\n        mock_metastore\n            .expect_list_splits()\n            .withf(|request| {\n                let list_splits_query = request.deserialize_list_splits_query().unwrap();\n                list_splits_query.index_uids.unwrap() == [(\"test-index-1\", 0), (\"test-index-2\", 0)]\n            })\n            .return_once(|_request| {\n                let splits = vec![Split {\n                    split_metadata: SplitMetadata::for_test(\"test-split\".to_string()),\n                    split_state: SplitState::Published,\n                    update_timestamp: 0,\n                    publish_timestamp: Some(0),\n                }];\n                let list_splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                let response = ServiceStream::from(vec![Ok(list_splits_response)]);\n                Ok(response)\n            });\n\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let (indexing_service, 
_indexing_service_handle) = spawn_indexing_service_for_test(\n            temp_dir.path(),\n            &universe,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            cluster,\n        )\n        .await;\n\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::void());\n\n        indexing_service\n            .ask_for_res(SpawnPipeline {\n                index_id: \"test-index-0\".to_string(),\n                source_config,\n                pipeline_uid: PipelineUid::for_test(0),\n            })\n            .await\n            .unwrap();\n\n        indexing_service\n            .ask_for_res(ApplyIndexingPlanRequest {\n                indexing_tasks: vec![\n                    IndexingTask {\n                        index_uid: Some(IndexUid::for_test(\"test-index-0\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_ids: Vec::new(),\n                        pipeline_uid: Some(PipelineUid::for_test(0)),\n                        params_fingerprint: 0,\n                    },\n                    IndexingTask {\n                        index_uid: Some(IndexUid::for_test(\"test-index-1\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_ids: Vec::new(),\n                        pipeline_uid: Some(PipelineUid::for_test(1)),\n                        params_fingerprint: 0,\n                    },\n                    IndexingTask {\n                        index_uid: Some(IndexUid::for_test(\"test-index-2\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_ids: Vec::new(),\n                        pipeline_uid: Some(PipelineUid::for_test(2)),\n                        params_fingerprint: 0,\n                    },\n                ],\n            })\n            .await\n            .unwrap();\n\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/merge_executor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::ops::RangeInclusive;\nuse std::path::Path;\nuse std::sync::Arc;\nuse std::time::Instant;\n\nuse anyhow::{Context, anyhow};\nuse async_trait::async_trait;\nuse fail::fail_point;\nuse itertools::Itertools;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::io::IoControls;\nuse quickwit_common::runtimes::RuntimeType;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_directories::UnionDirectory;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_metastore::SplitMetadata;\nuse quickwit_proto::indexing::MergePipelineId;\nuse quickwit_proto::metastore::{\n    DeleteTask, ListDeleteTasksRequest, MarkSplitsForDeletionRequest, MetastoreService,\n    MetastoreServiceClient,\n};\nuse quickwit_proto::types::{NodeId, SplitId};\nuse quickwit_query::get_quickwit_fastfield_normalizer_manager;\nuse quickwit_query::query_ast::QueryAst;\nuse tantivy::directory::{Advice, DirectoryClone, MmapDirectory, RamDirectory};\nuse tantivy::index::SegmentId;\nuse tantivy::tokenizer::TokenizerManager;\nuse tantivy::{DateTime, Directory, Index, IndexMeta, IndexWriter, SegmentReader};\nuse tokio::runtime::Handle;\nuse tracing::{debug, error, info, instrument, warn};\n\nuse crate::actors::Packager;\nuse crate::controlled_directory::ControlledDirectory;\nuse 
crate::merge_policy::MergeOperationType;\nuse crate::models::{IndexedSplit, IndexedSplitBatch, MergeScratch, PublishLock, SplitAttrs};\n\n#[derive(Clone)]\npub struct MergeExecutor {\n    pipeline_id: MergePipelineId,\n    metastore: MetastoreServiceClient,\n    doc_mapper: Arc<DocMapper>,\n    io_controls: IoControls,\n    merge_packager_mailbox: Mailbox<Packager>,\n}\n\n#[async_trait]\nimpl Actor for MergeExecutor {\n    type ObservableState = ();\n\n    fn runtime_handle(&self) -> Handle {\n        RuntimeType::Blocking.get_runtime_handle()\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(1)\n    }\n\n    fn name(&self) -> String {\n        \"MergeExecutor\".to_string()\n    }\n}\n\n#[async_trait]\nimpl Handler<MergeScratch> for MergeExecutor {\n    type Reply = ();\n\n    #[instrument(level = \"info\", name = \"merge_executor\", parent = merge_scratch.merge_task.merge_parent_span.id(), skip_all)]\n    async fn handle(\n        &mut self,\n        merge_scratch: MergeScratch,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let start = Instant::now();\n        let merge_task = merge_scratch.merge_task;\n        let indexed_split_opt: Option<IndexedSplit> = match merge_task.operation_type {\n            MergeOperationType::Merge => {\n                let merge_res = self\n                    .process_merge(\n                        merge_task.merge_split_id.clone(),\n                        merge_task.splits.clone(),\n                        merge_scratch.tantivy_dirs,\n                        merge_scratch.merge_scratch_directory,\n                        ctx,\n                    )\n                    .await;\n                match merge_res {\n                    Ok(indexed_split) => Some(indexed_split),\n                    Err(err) => {\n                        // A failure in a merge is a bit special.\n                   
     //\n                        // Instead of failing the pipeline, we just log it.\n                        // The idea is to limit the risk associated with a potential split of death.\n                        //\n                        // Such a split is now not tracked by the merge planner and won't undergo a\n                        // merge until the merge pipeline is restarted.\n                        //\n                        // With a merge policy that marks splits as mature after a day or so, this\n                        // limits the noise associated to those failed\n                        // merges.\n                        error!(task=?merge_task, err=?err, \"failed to merge splits\");\n                        return Ok(());\n                    }\n                }\n            }\n            MergeOperationType::DeleteAndMerge => {\n                assert_eq!(\n                    merge_task.splits.len(),\n                    1,\n                    \"Delete tasks can be applied only on one split.\"\n                );\n                assert_eq!(merge_scratch.tantivy_dirs.len(), 1);\n                let split_with_docs_to_delete = merge_task.splits[0].clone();\n                self.process_delete_and_merge(\n                    merge_task.merge_split_id.clone(),\n                    split_with_docs_to_delete,\n                    merge_scratch.tantivy_dirs,\n                    merge_scratch.merge_scratch_directory,\n                    ctx,\n                )\n                .await?\n            }\n        };\n        if let Some(indexed_split) = indexed_split_opt {\n            info!(\n                merged_num_docs = %indexed_split.split_attrs.num_docs,\n                elapsed_secs = %start.elapsed().as_secs_f32(),\n                operation_type = %merge_task.operation_type,\n                \"merge-operation-success\"\n            );\n            ctx.send_message(\n                &self.merge_packager_mailbox,\n                
IndexedSplitBatch {\n                    splits: vec![indexed_split],\n                    checkpoint_delta_opt: Default::default(),\n                    publish_lock: PublishLock::default(),\n                    publish_token_opt: None,\n                    batch_parent_span: merge_task.merge_parent_span.clone(),\n                    merge_task_opt: Some(merge_task),\n                },\n            )\n            .await?;\n        } else {\n            info!(\"no-splits-merged\");\n        }\n        Ok(())\n    }\n}\n\nfn combine_index_meta(mut index_metas: Vec<IndexMeta>) -> anyhow::Result<IndexMeta> {\n    let mut union_index_meta = index_metas.pop().with_context(|| \"only one IndexMeta\")?;\n    for index_meta in index_metas {\n        union_index_meta.segments.extend(index_meta.segments);\n    }\n    Ok(union_index_meta)\n}\n\nfn open_split_directories(\n    // Directories containing the splits to merge\n    tantivy_dirs: &[Box<dyn Directory>],\n    tokenizer_manager: &TokenizerManager,\n) -> anyhow::Result<(IndexMeta, Vec<Box<dyn Directory>>)> {\n    let mut directories: Vec<Box<dyn Directory>> = Vec::new();\n    let mut index_metas = Vec::new();\n    for tantivy_dir in tantivy_dirs {\n        directories.push(tantivy_dir.clone());\n\n        let index_meta = open_index(tantivy_dir.clone(), tokenizer_manager)?.load_metas()?;\n        index_metas.push(index_meta);\n    }\n    let union_index_meta = combine_index_meta(index_metas)?;\n    Ok((union_index_meta, directories))\n}\n\n/// Creates a directory with a single `meta.json` file described by `index_meta`.\nfn create_shadowing_meta_json_directory(index_meta: IndexMeta) -> anyhow::Result<RamDirectory> {\n    let union_index_meta_json = serde_json::to_string_pretty(&index_meta)?;\n    let ram_directory = RamDirectory::default();\n    ram_directory.atomic_write(Path::new(\"meta.json\"), union_index_meta_json.as_bytes())?;\n    Ok(ram_directory)\n}\n\nfn merge_time_range(splits: &[SplitMetadata]) -> 
Option<RangeInclusive<DateTime>> {\n    splits\n        .iter()\n        .flat_map(|split| split.time_range.clone())\n        .flat_map(|time_range| vec![*time_range.start(), *time_range.end()].into_iter())\n        .minmax()\n        .into_option()\n        .map(|(min_timestamp, max_timestamp)| {\n            DateTime::from_timestamp_secs(min_timestamp)\n                ..=DateTime::from_timestamp_secs(max_timestamp)\n        })\n}\n\nfn sum_doc_sizes_in_bytes(splits: &[SplitMetadata]) -> u64 {\n    splits\n        .iter()\n        .map(|split| split.uncompressed_docs_size_in_bytes)\n        .sum::<u64>()\n}\n\nfn sum_num_docs(splits: &[SplitMetadata]) -> u64 {\n    splits.iter().map(|split| split.num_docs as u64).sum()\n}\n\n/// Following Boost's hash_combine.\nfn combine_two_hashes(lhs: u64, rhs: u64) -> u64 {\n    let update_to_xor = rhs\n        .wrapping_add(0x9e3779b9)\n        .wrapping_add(lhs << 6)\n        .wrapping_add(lhs >> 2);\n    lhs ^ update_to_xor\n}\n\nfn combine_partition_ids_aux(partition_ids: impl IntoIterator<Item = u64>) -> u64 {\n    let sorted_unique_partition_ids: BTreeSet<u64> = partition_ids.into_iter().collect();\n    let mut sorted_unique_partition_ids_it = sorted_unique_partition_ids.into_iter();\n    if let Some(partition_id) = sorted_unique_partition_ids_it.next() {\n        sorted_unique_partition_ids_it.fold(partition_id, |acc, partition_id| {\n            combine_two_hashes(acc, partition_id)\n        })\n    } else {\n        // This is not forbidden but this should never happen.\n        0u64\n    }\n}\n\npub fn combine_partition_ids(splits: &[SplitMetadata]) -> u64 {\n    combine_partition_ids_aux(splits.iter().map(|split| split.partition_id))\n}\n\npub fn merge_split_attrs(\n    pipeline_id: MergePipelineId,\n    merge_split_id: SplitId,\n    splits: &[SplitMetadata],\n) -> anyhow::Result<SplitAttrs> {\n    let partition_id = combine_partition_ids_aux(splits.iter().map(|split| split.partition_id));\n    let time_range: 
Option<RangeInclusive<DateTime>> = merge_time_range(splits);\n    let uncompressed_docs_size_in_bytes = sum_doc_sizes_in_bytes(splits);\n    let num_docs = sum_num_docs(splits);\n    let replaced_split_ids: Vec<SplitId> = splits\n        .iter()\n        .map(|split| split.split_id().to_string())\n        .collect();\n    let delete_opstamp = splits\n        .iter()\n        .map(|split| split.delete_opstamp)\n        .min()\n        .unwrap_or(0);\n    let doc_mapping_uid = splits\n        .first()\n        .ok_or_else(|| anyhow::anyhow!(\"attempted to merge zero splits\"))?\n        .doc_mapping_uid;\n    if splits\n        .iter()\n        .any(|split| split.doc_mapping_uid != doc_mapping_uid)\n    {\n        anyhow::bail!(\"attempted to merge splits with different doc mapping uid\");\n    }\n    Ok(SplitAttrs {\n        node_id: pipeline_id.node_id.clone(),\n        index_uid: pipeline_id.index_uid.clone(),\n        source_id: pipeline_id.source_id.clone(),\n        doc_mapping_uid,\n        split_id: merge_split_id,\n        partition_id,\n        replaced_split_ids,\n        time_range,\n        num_docs,\n        uncompressed_docs_size_in_bytes,\n        delete_opstamp,\n        num_merge_ops: max_merge_ops(splits) + 1,\n    })\n}\n\nfn max_merge_ops(splits: &[SplitMetadata]) -> usize {\n    splits\n        .iter()\n        .map(|split| split.num_merge_ops)\n        .max()\n        .unwrap_or(0)\n}\n\nimpl MergeExecutor {\n    pub fn new(\n        pipeline_id: MergePipelineId,\n        metastore: MetastoreServiceClient,\n        doc_mapper: Arc<DocMapper>,\n        io_controls: IoControls,\n        merge_packager_mailbox: Mailbox<Packager>,\n    ) -> Self {\n        MergeExecutor {\n            pipeline_id,\n            metastore,\n            doc_mapper,\n            io_controls,\n            merge_packager_mailbox,\n        }\n    }\n\n    async fn process_merge(\n        &mut self,\n        merge_split_id: SplitId,\n        splits: Vec<SplitMetadata>,\n   
     tantivy_dirs: Vec<Box<dyn Directory>>,\n        merge_scratch_directory: TempDirectory,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<IndexedSplit> {\n        let (union_index_meta, split_directories) = open_split_directories(\n            &tantivy_dirs,\n            self.doc_mapper.tokenizer_manager().tantivy_manager(),\n        )?;\n        // TODO it would be nice if tantivy could let us run the merge in the current thread.\n        fail_point!(\"before-merge-split\");\n        let controlled_directory = self\n            .merge_split_directories(\n                union_index_meta,\n                split_directories,\n                Vec::new(),\n                None,\n                merge_scratch_directory.path(),\n                ctx,\n            )\n            .await?;\n        fail_point!(\"after-merge-split\");\n\n        // This will have the side effect of deleting the directory containing the downloaded\n        // splits.\n        let merged_index = open_index(\n            controlled_directory.clone(),\n            self.doc_mapper.tokenizer_manager().tantivy_manager(),\n        )?;\n        ctx.record_progress();\n\n        let split_attrs = merge_split_attrs(self.pipeline_id.clone(), merge_split_id, &splits)?;\n        Ok(IndexedSplit {\n            split_attrs,\n            index: merged_index,\n            split_scratch_directory: merge_scratch_directory,\n            controlled_directory_opt: Some(controlled_directory),\n        })\n    }\n\n    async fn process_delete_and_merge(\n        &mut self,\n        merge_split_id: SplitId,\n        split: SplitMetadata,\n        tantivy_dirs: Vec<Box<dyn Directory>>,\n        merge_scratch_directory: TempDirectory,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<Option<IndexedSplit>> {\n        let list_delete_tasks_request =\n            ListDeleteTasksRequest::new(split.index_uid.clone(), split.delete_opstamp);\n        let delete_tasks = ctx\n            
.protect_future(self.metastore.list_delete_tasks(list_delete_tasks_request))\n            .await?\n            .delete_tasks;\n        if delete_tasks.is_empty() {\n            warn!(\n                \"No delete task found for split `{}` with `delete_opstamp` = `{}`.\",\n                split.split_id(),\n                split.delete_opstamp\n            );\n            return Ok(None);\n        }\n\n        let last_delete_opstamp = delete_tasks\n            .iter()\n            .map(|delete_task| delete_task.opstamp)\n            .max()\n            .expect(\"There is at least one delete task.\");\n        info!(\n            delete_opstamp_start = split.delete_opstamp,\n            num_delete_tasks = delete_tasks.len()\n        );\n\n        let (union_index_meta, split_directories) = open_split_directories(\n            &tantivy_dirs,\n            self.doc_mapper.tokenizer_manager().tantivy_manager(),\n        )?;\n        let controlled_directory = self\n            .merge_split_directories(\n                union_index_meta,\n                split_directories,\n                delete_tasks,\n                Some(self.doc_mapper.clone()),\n                merge_scratch_directory.path(),\n                ctx,\n            )\n            .await?;\n\n        // This will have the side effect of deleting the directory containing the downloaded split.\n        let mut merged_index = Index::open(controlled_directory.clone())?;\n        ctx.record_progress();\n        merged_index.set_tokenizers(\n            self.doc_mapper\n                .tokenizer_manager()\n                .tantivy_manager()\n                .clone(),\n        );\n        merged_index.set_fast_field_tokenizers(\n            get_quickwit_fastfield_normalizer_manager()\n                .tantivy_manager()\n                .clone(),\n        );\n\n        ctx.record_progress();\n\n        // Compute merged split attributes.\n        let merged_segment =\n            if let Some(segment) = 
merged_index.searchable_segments()?.into_iter().next() {\n                segment\n            } else {\n                info!(\n                    \"All documents from split `{}` were deleted.\",\n                    split.split_id()\n                );\n                let mark_splits_for_deletion_request = MarkSplitsForDeletionRequest::new(\n                    split.index_uid.clone(),\n                    vec![split.split_id.clone()],\n                );\n                self.metastore\n                    .mark_splits_for_deletion(mark_splits_for_deletion_request)\n                    .await?;\n                return Ok(None);\n            };\n\n        let merged_segment_reader = SegmentReader::open(&merged_segment)?;\n        let num_docs = merged_segment_reader.num_docs() as u64;\n        let uncompressed_docs_size_in_bytes = (num_docs as f32\n            * split.uncompressed_docs_size_in_bytes as f32\n            / split.num_docs as f32) as u64;\n        let time_range = if let Some(timestamp_field_name) = self.doc_mapper.timestamp_field_name()\n        {\n            let reader = merged_segment_reader\n                .fast_fields()\n                .date(timestamp_field_name)?;\n            Some(reader.min_value()..=reader.max_value())\n        } else {\n            None\n        };\n        let indexed_split = IndexedSplit {\n            split_attrs: SplitAttrs {\n                node_id: NodeId::new(split.node_id),\n                index_uid: split.index_uid,\n                source_id: split.source_id,\n                doc_mapping_uid: split.doc_mapping_uid,\n                split_id: merge_split_id,\n                partition_id: split.partition_id,\n                replaced_split_ids: vec![split.split_id.clone()],\n                time_range,\n                num_docs,\n                uncompressed_docs_size_in_bytes,\n                delete_opstamp: last_delete_opstamp,\n                num_merge_ops: split.num_merge_ops,\n            },\n         
   index: merged_index,\n            split_scratch_directory: merge_scratch_directory,\n            controlled_directory_opt: Some(controlled_directory),\n        };\n        Ok(Some(indexed_split))\n    }\n\n    async fn merge_split_directories(\n        &self,\n        union_index_meta: IndexMeta,\n        split_directories: Vec<Box<dyn Directory>>,\n        delete_tasks: Vec<DeleteTask>,\n        doc_mapper_opt: Option<Arc<DocMapper>>,\n        output_path: &Path,\n        ctx: &ActorContext<MergeExecutor>,\n    ) -> anyhow::Result<ControlledDirectory> {\n        let shadowing_meta_json_directory = create_shadowing_meta_json_directory(union_index_meta)?;\n\n        // This directory is here to receive the merged split, as well as the final meta.json file.\n        let output_directory = ControlledDirectory::new(\n            Box::new(MmapDirectory::open_with_madvice(\n                output_path,\n                Advice::Sequential,\n            )?),\n            self.io_controls\n                .clone()\n                .set_kill_switch(ctx.kill_switch().clone())\n                .set_progress(ctx.progress().clone()),\n        );\n        let mut directory_stack: Vec<Box<dyn Directory>> = vec![\n            output_directory.box_clone(),\n            Box::new(shadowing_meta_json_directory),\n        ];\n        directory_stack.extend(split_directories.into_iter());\n        let union_directory = UnionDirectory::union_of(directory_stack);\n        let union_index = open_index(\n            union_directory,\n            self.doc_mapper.tokenizer_manager().tantivy_manager(),\n        )?;\n\n        ctx.record_progress();\n        let _protect_guard = ctx.protect_zone();\n\n        let mut index_writer: IndexWriter = union_index.writer_with_num_threads(1, 15_000_000)?;\n        let num_delete_tasks = delete_tasks.len();\n        if num_delete_tasks > 0 {\n            let doc_mapper = doc_mapper_opt\n                .ok_or_else(|| anyhow!(\"doc mapper must be 
present if there are delete tasks\"))?;\n            for delete_task in delete_tasks {\n                let delete_query = delete_task\n                    .delete_query\n                    .expect(\"A delete task must have a delete query.\");\n                let query_ast: QueryAst = serde_json::from_str(&delete_query.query_ast)\n                    .context(\"invalid query_ast json\")?;\n                // We ignore the docmapper default fields when we consider delete query.\n                // We reparse the query here defensively, but actually, it should already have been\n                // done in the delete task rest handler.\n                let parsed_query_ast = query_ast.parse_user_query(&[]).context(\"invalid query\")?;\n                debug!(\n                    \"Delete all documents matched by query `{:?}`\",\n                    parsed_query_ast\n                );\n                let (query, _) =\n                    doc_mapper.query(union_index.schema(), parsed_query_ast, false, None)?;\n                index_writer.delete_query(query)?;\n            }\n            debug!(\"commit-delete-operations\");\n            index_writer.commit()?;\n        }\n\n        let segment_ids: Vec<SegmentId> = union_index\n            .searchable_segment_metas()?\n            .into_iter()\n            .map(|segment_meta| segment_meta.id())\n            .collect();\n\n        // A merge is useless if there is no delete and only one segment.\n        if num_delete_tasks == 0 && segment_ids.len() <= 1 {\n            return Ok(output_directory);\n        }\n\n        // If after deletion there is no longer any document, don't try to merge.\n        if num_delete_tasks != 0 && segment_ids.is_empty() {\n            return Ok(output_directory);\n        }\n\n        debug!(segment_ids=?segment_ids,\"merging-segments\");\n        // TODO it would be nice if tantivy could let us run the merge in the current thread.\n        index_writer.merge(&segment_ids).await?;\n\n 
       Ok(output_directory)\n    }\n}\n\nfn open_index<T: Into<Box<dyn Directory>>>(\n    directory: T,\n    tokenizer_manager: &TokenizerManager,\n) -> tantivy::Result<Index> {\n    let mut index = Index::open(directory)?;\n    index.set_tokenizers(tokenizer_manager.clone());\n    index.set_fast_field_tokenizers(\n        get_quickwit_fastfield_normalizer_manager()\n            .tantivy_manager()\n            .clone(),\n    );\n    Ok(index)\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_actors::Universe;\n    use quickwit_common::split_file;\n    use quickwit_metastore::{\n        ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitMetadata, StageSplitsRequestExt,\n    };\n    use quickwit_proto::metastore::{\n        DeleteQuery, ListSplitsRequest, PublishSplitsRequest, StageSplitsRequest,\n    };\n    use serde_json::Value as JsonValue;\n    use tantivy::{Document, ReloadPolicy, TantivyDocument};\n\n    use super::*;\n    use crate::merge_policy::{MergeOperation, MergeTask};\n    use crate::{TestSandbox, get_tantivy_directory_from_split_bundle, new_split_id};\n\n    #[tokio::test]\n    async fn test_merge_executor() -> anyhow::Result<()> {\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: datetime\n                input_formats:\n                - unix_timestamp\n                fast: true\n            timestamp_field: ts\n        \"#;\n        let test_sandbox =\n            TestSandbox::create(\"test-index\", doc_mapping_yaml, \"\", &[\"body\"]).await?;\n        for split_id in 0..4 {\n            let single_doc = std::iter::once(\n                serde_json::json!({\"body \": format!(\"split{split_id}\"), \"ts\": 1631072713u64 + split_id }),\n            );\n            test_sandbox.add_documents(single_doc).await?;\n        }\n        let metastore = test_sandbox.metastore();\n        let index_uid = test_sandbox.index_uid();\n 
       let list_splits_request = ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap();\n        let split_metas: Vec<SplitMetadata> = metastore\n            .list_splits(list_splits_request)\n            .await\n            .unwrap()\n            .collect_splits_metadata()\n            .await\n            .unwrap();\n        assert_eq!(split_metas.len(), 4);\n        let merge_scratch_directory = TempDirectory::for_test();\n        let downloaded_splits_directory =\n            merge_scratch_directory.named_temp_child(\"downloaded-splits-\")?;\n        let mut tantivy_dirs: Vec<Box<dyn Directory>> = Vec::new();\n        for split_meta in &split_metas {\n            let split_filename = split_file(split_meta.split_id());\n            let dest_filepath = downloaded_splits_directory.path().join(&split_filename);\n            test_sandbox\n                .storage()\n                .copy_to_file(Path::new(&split_filename), &dest_filepath)\n                .await?;\n            tantivy_dirs.push(get_tantivy_directory_from_split_bundle(&dest_filepath).unwrap())\n        }\n        let merge_operation = MergeOperation::new_merge_operation(split_metas);\n        let merge_task = MergeTask::from_merge_operation_for_test(merge_operation);\n        let merge_scratch = MergeScratch {\n            merge_task,\n            tantivy_dirs,\n            merge_scratch_directory,\n            downloaded_splits_directory,\n        };\n        let pipeline_id = MergePipelineId {\n            node_id: test_sandbox.node_id(),\n            index_uid,\n            source_id: test_sandbox.source_id(),\n        };\n        let (merge_packager_mailbox, merge_packager_inbox) =\n            test_sandbox.universe().create_test_mailbox();\n        let merge_executor = MergeExecutor::new(\n            pipeline_id,\n            test_sandbox.metastore(),\n            test_sandbox.doc_mapper(),\n            IoControls::default(),\n            merge_packager_mailbox,\n        );\n        
let (merge_executor_mailbox, merge_executor_handle) = test_sandbox\n            .universe()\n            .spawn_builder()\n            .spawn(merge_executor);\n        merge_executor_mailbox.send_message(merge_scratch).await?;\n        merge_executor_handle.process_pending_and_observe().await;\n        let packager_msgs: Vec<IndexedSplitBatch> = merge_packager_inbox.drain_for_test_typed();\n        assert_eq!(packager_msgs.len(), 1);\n        let split_attrs_after_merge = &packager_msgs[0].splits[0].split_attrs;\n        assert_eq!(split_attrs_after_merge.num_docs, 4);\n        assert_eq!(split_attrs_after_merge.uncompressed_docs_size_in_bytes, 136);\n        assert_eq!(split_attrs_after_merge.num_merge_ops, 1);\n        let reader = packager_msgs[0].splits[0]\n            .index\n            .reader_builder()\n            .reload_policy(ReloadPolicy::Manual)\n            .try_into()?;\n        let searcher = reader.searcher();\n        assert_eq!(searcher.segment_readers().len(), 1);\n        test_sandbox.assert_quit().await;\n        Ok(())\n    }\n\n    #[test]\n    fn test_combine_partition_ids_singleton_unchanged() {\n        assert_eq!(combine_partition_ids_aux([17]), 17);\n    }\n\n    #[test]\n    fn test_combine_partition_ids_zero_has_an_impact() {\n        assert_ne!(\n            combine_partition_ids_aux([12u64, 0u64]),\n            combine_partition_ids_aux([12u64])\n        );\n    }\n\n    #[test]\n    fn test_combine_partition_ids_depends_on_partition_id_set() {\n        assert_eq!(\n            combine_partition_ids_aux([12, 16, 12, 13]),\n            combine_partition_ids_aux([12, 16, 13])\n        );\n    }\n\n    #[test]\n    fn test_combine_partition_ids_order_does_not_matter() {\n        assert_eq!(\n            combine_partition_ids_aux([7, 12, 13]),\n            combine_partition_ids_aux([12, 13, 7])\n        );\n    }\n\n    async fn aux_test_delete_and_merge_executor(\n        index_id: &str,\n        docs: Vec<JsonValue>,\n        
delete_query: &str,\n        result_docs: Vec<JsonValue>,\n    ) -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: datetime\n                input_formats:\n                - unix_timestamp\n                fast: true\n            timestamp_field: ts\n        \"#;\n        let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"\", &[\"body\"]).await?;\n        test_sandbox.add_documents(docs).await?;\n        let metastore = test_sandbox.metastore();\n        let index_uid = test_sandbox.index_uid();\n        metastore\n            .create_delete_task(DeleteQuery {\n                index_uid: Some(index_uid.clone()),\n                start_timestamp: None,\n                end_timestamp: None,\n                query_ast: quickwit_query::query_ast::qast_json_helper(delete_query, &[\"body\"]),\n            })\n            .await?;\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n\n        // We want to test a delete on a split with num_merge_ops > 0.\n        let mut new_split_metadata = splits[0].split_metadata.clone();\n        new_split_metadata.split_id = new_split_id();\n        new_split_metadata.num_merge_ops = 1;\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &new_split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![new_split_metadata.split_id.to_string()],\n            replaced_split_ids: 
vec![splits[0].split_metadata.split_id.to_string()],\n            index_checkpoint_delta_json_opt: None,\n            publish_token_opt: None,\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n        let expected_uncompressed_docs_size_in_bytes =\n            (new_split_metadata.uncompressed_docs_size_in_bytes as f32 / 2_f32) as u64;\n        let merge_scratch_directory = TempDirectory::for_test();\n        let downloaded_splits_directory =\n            merge_scratch_directory.named_temp_child(\"downloaded-splits-\")?;\n        let split_filename = split_file(splits[0].split_metadata.split_id());\n        let new_split_filename = split_file(new_split_metadata.split_id());\n        let dest_filepath = downloaded_splits_directory.path().join(&new_split_filename);\n        test_sandbox\n            .storage()\n            .copy_to_file(Path::new(&split_filename), &dest_filepath)\n            .await?;\n        let tantivy_dir = get_tantivy_directory_from_split_bundle(&dest_filepath).unwrap();\n        let merge_operation = MergeOperation::new_delete_and_merge_operation(new_split_metadata);\n        let merge_task = MergeTask::from_merge_operation_for_test(merge_operation);\n        let merge_scratch = MergeScratch {\n            merge_task,\n            tantivy_dirs: vec![tantivy_dir],\n            merge_scratch_directory,\n            downloaded_splits_directory,\n        };\n        let pipeline_id = MergePipelineId {\n            node_id: test_sandbox.node_id(),\n            index_uid: test_sandbox.index_uid(),\n            source_id: test_sandbox.source_id(),\n        };\n        let universe = Universe::with_accelerated_time();\n        let (merge_packager_mailbox, merge_packager_inbox) = universe.create_test_mailbox();\n        let delete_task_executor = MergeExecutor::new(\n            pipeline_id,\n            metastore,\n            test_sandbox.doc_mapper(),\n            
IoControls::default(),\n            merge_packager_mailbox,\n        );\n        let (delete_task_executor_mailbox, delete_task_executor_handle) =\n            universe.spawn_builder().spawn(delete_task_executor);\n        delete_task_executor_mailbox\n            .send_message(merge_scratch)\n            .await?;\n        delete_task_executor_handle\n            .process_pending_and_observe()\n            .await;\n\n        let packager_msgs: Vec<IndexedSplitBatch> = merge_packager_inbox.drain_for_test_typed();\n        if !result_docs.is_empty() {\n            assert_eq!(packager_msgs.len(), 1);\n            let split = &packager_msgs[0].splits[0];\n            assert_eq!(split.split_attrs.num_docs, result_docs.len() as u64);\n            assert_eq!(split.split_attrs.delete_opstamp, 1);\n            // Delete operations do not update the num_merge_ops value.\n            assert_eq!(split.split_attrs.num_merge_ops, 1);\n            assert_eq!(\n                split.split_attrs.uncompressed_docs_size_in_bytes,\n                expected_uncompressed_docs_size_in_bytes,\n            );\n            let reader = split\n                .index\n                .reader_builder()\n                .reload_policy(ReloadPolicy::Manual)\n                .try_into()?;\n            let searcher = reader.searcher();\n            assert_eq!(searcher.segment_readers().len(), 1);\n\n            let documents_left = searcher\n                .search(\n                    &tantivy::query::AllQuery,\n                    &tantivy::collector::TopDocs::with_limit(result_docs.len() + 1)\n                        .order_by_score(),\n                )?\n                .into_iter()\n                .map(|(_, doc_address)| {\n                    let doc: TantivyDocument = searcher.doc(doc_address).unwrap();\n                    let doc_json = doc.to_json(searcher.schema());\n                    serde_json::from_str(&doc_json).unwrap()\n                })\n                
.collect::<Vec<JsonValue>>();\n\n            assert_eq!(documents_left.len(), result_docs.len());\n            for doc in &documents_left {\n                assert!(result_docs.contains(doc));\n            }\n            for doc in &result_docs {\n                assert!(documents_left.contains(doc));\n            }\n        } else {\n            assert!(packager_msgs.is_empty());\n            let metastore = test_sandbox.metastore();\n            let index_uid = test_sandbox.index_uid();\n            let splits = metastore\n                .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap();\n            assert!(splits.iter().all(\n                |split| split.split_state == quickwit_metastore::SplitState::MarkedForDeletion\n            ));\n        }\n        test_sandbox.assert_quit().await;\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_delete_and_merge_executor() -> anyhow::Result<()> {\n        aux_test_delete_and_merge_executor(\n            \"test-delete-and-merge-index\",\n            vec![\n                serde_json::json!({\"body\": \"info\", \"ts\": 1624928208 }),\n                serde_json::json!({\"body\": \"delete\", \"ts\": 1634928208 }),\n            ],\n            \"body:delete\",\n            vec![serde_json::json!({\"body\": [\"info\"], \"ts\": [\"2021-06-29T00:56:48Z\"] })],\n        )\n        .await\n    }\n\n    #[tokio::test]\n    async fn test_delete_termset_and_merge_executor() -> anyhow::Result<()> {\n        aux_test_delete_and_merge_executor(\n            \"test-delete-termset-and-merge-executor\",\n            vec![\n                serde_json::json!({\"body\": \"info\", \"ts\": 1624928208 }),\n                serde_json::json!({\"body\": \"info\", \"ts\": 1624928209 }),\n                serde_json::json!({\"body\": 
\"delete\", \"ts\": 1634928208 }),\n                serde_json::json!({\"body\": \"delete\", \"ts\": 1634928209 }),\n            ],\n            \"body: IN [delete]\",\n            vec![\n                serde_json::json!({\"body\": [\"info\"], \"ts\": [\"2021-06-29T00:56:48Z\"] }),\n                serde_json::json!({\"body\": [\"info\"], \"ts\": [\"2021-06-29T00:56:49Z\"] }),\n            ],\n        )\n        .await\n    }\n\n    #[tokio::test]\n    async fn test_delete_all() -> anyhow::Result<()> {\n        aux_test_delete_and_merge_executor(\n            \"test-delete-all\",\n            vec![\n                serde_json::json!({\"body\": \"delete\", \"ts\": 1634928208 }),\n                serde_json::json!({\"body\": \"delete\", \"ts\": 1634928209 }),\n            ],\n            \"body:delete\",\n            Vec::new(),\n        )\n        .await\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/merge_pipeline.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse async_trait::async_trait;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, HEARTBEAT, Handler, Health, Inbox, Mailbox,\n    SpawnContext, Supervisable,\n};\nuse quickwit_common::KillSwitch;\nuse quickwit_common::io::{IoControls, Limiter};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_config::RetentionPolicy;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_metastore::{\n    ListSplitsQuery, ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitMetadata,\n    SplitState,\n};\nuse quickwit_proto::indexing::MergePipelineId;\nuse quickwit_proto::metastore::{\n    ListSplitsRequest, MetastoreError, MetastoreResult, MetastoreService, MetastoreServiceClient,\n};\nuse time::OffsetDateTime;\nuse tokio::sync::Semaphore;\nuse tracing::{debug, error, info, instrument};\n\nuse super::publisher::DisconnectMergePlanner;\nuse super::{MergeSchedulerService, RunFinalizeMergePolicyAndQuit};\nuse crate::actors::indexing_pipeline::wait_duration_before_retry;\nuse crate::actors::merge_split_downloader::MergeSplitDownloader;\nuse crate::actors::publisher::PublisherType;\nuse crate::actors::{MergeExecutor, MergePlanner, Packager, Publisher, Uploader, UploaderType};\nuse crate::merge_policy::MergePolicy;\nuse 
crate::models::MergeStatistics;\nuse crate::split_store::IndexingSplitStore;\n\n/// Spawning a merge pipeline puts a lot of pressure on the metastore so\n/// we rely on this semaphore to limit the number of merge pipelines that can be spawned\n/// concurrently.\nstatic SPAWN_PIPELINE_SEMAPHORE: Semaphore = Semaphore::const_new(10);\n\n/// Instructs the merge pipeline that it should stop itself.\n/// Merges that have already been scheduled are not aborted.\n///\n/// In addition, the finalizer merge policy will be executed to schedule a few\n/// additional merges.\n///\n/// After receiving `FinishPendingMergesAndShutdownPipeline`, the merge pipeline loop will\n/// be disconnected. In other words, the connection from the merge publisher to\n/// the merge planner will be cut, so that the merge pipeline will terminate naturally.\n///\n/// Supervision will still take place. However, it will not restart the pipeline\n/// in case of failure; it will just kill all of the merge pipeline actors (for\n/// instance, if one of the actors is stuck).\n#[derive(Debug, Clone, Copy)]\npub struct FinishPendingMergesAndShutdownPipeline;\n\npub const SUPERVISE_LOOP_INTERVAL: Duration = Duration::from_secs(1);\n\nstruct MergePipelineHandles {\n    merge_planner: ActorHandle<MergePlanner>,\n    merge_split_downloader: ActorHandle<MergeSplitDownloader>,\n    merge_executor: ActorHandle<MergeExecutor>,\n    merge_packager: ActorHandle<Packager>,\n    merge_uploader: ActorHandle<Uploader>,\n    merge_publisher: ActorHandle<Publisher>,\n    next_check_for_progress: Instant,\n}\n\nimpl MergePipelineHandles {\n    fn should_check_for_progress(&mut self) -> bool {\n        let now = Instant::now();\n        let check_for_progress = now > self.next_check_for_progress;\n        if check_for_progress {\n            self.next_check_for_progress = now + *HEARTBEAT;\n        }\n        check_for_progress\n    }\n}\n\n// Messages\n#[derive(Debug)]\nstruct SuperviseLoop;\n\n#[derive(Clone, Copy, Debug, 
Default)]\nstruct Spawn {\n    retry_count: usize,\n}\n\npub struct MergePipeline {\n    params: MergePipelineParams,\n    merge_planner_mailbox: Mailbox<MergePlanner>,\n    merge_planner_inbox: Inbox<MergePlanner>,\n    previous_generations_statistics: MergeStatistics,\n    statistics: MergeStatistics,\n    handles_opt: Option<MergePipelineHandles>,\n    kill_switch: KillSwitch,\n    /// Immature splits passed to the merge planner the first time the pipeline is spawned.\n    initial_immature_splits_opt: Option<Vec<SplitMetadata>>,\n    // After it is set to true, we don't respawn pipeline actors if they fail.\n    shutdown_initiated: bool,\n}\n\n#[async_trait]\nimpl Actor for MergePipeline {\n    type ObservableState = MergeStatistics;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.statistics.clone()\n    }\n\n    fn name(&self) -> String {\n        \"MergePipeline\".to_string()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.handle(Spawn::default(), ctx).await?;\n        self.handle(SuperviseLoop, ctx).await?;\n        Ok(())\n    }\n}\n\nimpl MergePipeline {\n    /// Creates a new merge pipeline. `initial_immature_splits_opt` is typically \"seeded\" by the\n    /// indexing service who fetches the immature splits from the metastore for all the merge\n    /// pipelines it is about to spawn. By issuing a single metastore query instead of one per merge\n    /// pipeline, we reduce the load on the metastore. If the merge pipeline crashes and is\n    /// respawned by the supervisor, the immature splits are fetched directly from the metastore.\n    pub fn new(\n        params: MergePipelineParams,\n        initial_immature_splits_opt: Option<Vec<SplitMetadata>>,\n        spawn_ctx: &SpawnContext,\n    ) -> Self {\n        // TODO improve API. 
Maybe it could take a spawnbuilder as argument, hence removing the need\n        // for a public create_mailbox / MessageCount.\n        let (merge_planner_mailbox, merge_planner_inbox) = spawn_ctx\n            .create_mailbox::<MergePlanner>(\"MergePlanner\", MergePlanner::queue_capacity());\n        Self {\n            params,\n            previous_generations_statistics: Default::default(),\n            handles_opt: None,\n            kill_switch: KillSwitch::default(),\n            statistics: MergeStatistics::default(),\n            merge_planner_inbox,\n            merge_planner_mailbox,\n            initial_immature_splits_opt,\n            shutdown_initiated: false,\n        }\n    }\n\n    pub fn merge_planner_mailbox(&self) -> &Mailbox<MergePlanner> {\n        &self.merge_planner_mailbox\n    }\n\n    fn supervisables(&self) -> Vec<&dyn Supervisable> {\n        if let Some(handles) = &self.handles_opt {\n            let supervisables: Vec<&dyn Supervisable> = vec![\n                &handles.merge_planner,\n                &handles.merge_split_downloader,\n                &handles.merge_executor,\n                &handles.merge_packager,\n                &handles.merge_uploader,\n                &handles.merge_publisher,\n            ];\n            supervisables\n        } else {\n            Vec::new()\n        }\n    }\n\n    /// Performs healthcheck on all of the actors in the pipeline,\n    /// and consolidates the result.\n    fn healthcheck(&self, check_for_progress: bool) -> Health {\n        let mut healthy_actors: Vec<&str> = Default::default();\n        let mut failure_or_unhealthy_actors: Vec<&str> = Default::default();\n        let mut success_actors: Vec<&str> = Default::default();\n\n        for supervisable in self.supervisables() {\n            match supervisable.check_health(check_for_progress) {\n                Health::Healthy => {\n                    // At least one other actor is running.\n                    
healthy_actors.push(supervisable.name());\n                }\n                Health::FailureOrUnhealthy => {\n                    failure_or_unhealthy_actors.push(supervisable.name());\n                }\n                Health::Success => {\n                    success_actors.push(supervisable.name());\n                }\n            }\n        }\n        if !failure_or_unhealthy_actors.is_empty() {\n            error!(\n                index_uid=%self.params.pipeline_id.index_uid,\n                source_id=%self.params.pipeline_id.source_id,\n                generation=self.generation(),\n                healthy_actors=?healthy_actors,\n                failed_or_unhealthy_actors=?failure_or_unhealthy_actors,\n                success_actors=?success_actors,\n                \"merge pipeline failed\"\n            );\n            return Health::FailureOrUnhealthy;\n        }\n        if healthy_actors.is_empty() {\n            // All the actors finished successfully.\n            info!(\n                index_uid=%self.params.pipeline_id.index_uid,\n                source_id=%self.params.pipeline_id.source_id,\n                generation=self.generation(),\n                \"merge pipeline completed successfully\"\n            );\n            return Health::Success;\n        }\n        // No error at this point and there are still some actors running.\n        debug!(\n            index_uid=%self.params.pipeline_id.index_uid,\n            source_id=%self.params.pipeline_id.source_id,\n            generation=self.generation(),\n            healthy_actors=?healthy_actors,\n            failed_or_unhealthy_actors=?failure_or_unhealthy_actors,\n            success_actors=?success_actors,\n            \"merge pipeline is running and healthy\"\n        );\n        Health::Healthy\n    }\n\n    fn generation(&self) -> usize {\n        self.statistics.generation\n    }\n\n    // TODO: Should return an error saying whether we can retry or not.\n    
#[instrument(name=\"spawn_merge_pipeline\", level=\"info\", skip_all, fields(index_uid=%self.params.pipeline_id.index_uid, generation=self.generation()))]\n    async fn spawn_pipeline(&mut self, ctx: &ActorContext<Self>) -> anyhow::Result<()> {\n        let _spawn_pipeline_permit = ctx\n            .protect_future(SPAWN_PIPELINE_SEMAPHORE.acquire())\n            .await\n            .expect(\"semaphore should not be closed\");\n\n        self.statistics.num_spawn_attempts += 1;\n        self.kill_switch = ctx.kill_switch().child();\n\n        info!(\n            index_uid=%self.params.pipeline_id.index_uid,\n            source_id=%self.params.pipeline_id.source_id,\n            root_dir=%self.params.indexing_directory.path().display(),\n            merge_policy=?self.params.merge_policy,\n            \"spawning merge pipeline\",\n        );\n        let immature_splits = self.fetch_immature_splits(ctx).await?;\n\n        // Merge publisher\n        let merge_publisher = Publisher::new(\n            PublisherType::MergePublisher,\n            self.params.metastore.clone(),\n            Some(self.merge_planner_mailbox.clone()),\n            None,\n        );\n        let (merge_publisher_mailbox, merge_publisher_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"merge_publisher\"]),\n            )\n            .spawn(merge_publisher);\n\n        // Merge uploader\n        let merge_uploader = Uploader::new(\n            UploaderType::MergeUploader,\n            self.params.metastore.clone(),\n            self.params.merge_policy.clone(),\n            self.params.retention_policy.clone(),\n            self.params.split_store.clone(),\n            merge_publisher_mailbox.into(),\n            self.params.max_concurrent_split_uploads,\n            
self.params.event_broker.clone(),\n        );\n        let (merge_uploader_mailbox, merge_uploader_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(merge_uploader);\n\n        // Merge Packager\n        let tag_fields = self.params.doc_mapper.tag_named_fields()?;\n        let merge_packager = Packager::new(\"MergePackager\", tag_fields, merge_uploader_mailbox);\n        let (merge_packager_mailbox, merge_packager_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .spawn(merge_packager);\n\n        let split_downloader_io_controls = IoControls::default()\n            .set_throughput_limiter_opt(self.params.merge_io_throughput_limiter_opt.clone())\n            .set_component(\"split_downloader_merge\");\n\n        // The merge and split download share the same throughput limiter.\n        // This is how cloning the `IoControls` works.\n        let merge_executor_io_controls =\n            split_downloader_io_controls.clone().set_component(\"merger\");\n\n        let merge_executor = MergeExecutor::new(\n            self.params.pipeline_id.clone(),\n            self.params.metastore.clone(),\n            self.params.doc_mapper.clone(),\n            merge_executor_io_controls,\n            merge_packager_mailbox,\n        );\n        let (merge_executor_mailbox, merge_executor_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"merge_executor\"]),\n            )\n            .spawn(merge_executor);\n\n        let merge_split_downloader = MergeSplitDownloader {\n            scratch_directory: self.params.indexing_directory.clone(),\n            split_store: self.params.split_store.clone(),\n            executor_mailbox: 
merge_executor_mailbox,\n            io_controls: split_downloader_io_controls,\n        };\n        let (merge_split_downloader_mailbox, merge_split_downloader_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .set_backpressure_micros_counter(\n                crate::metrics::INDEXER_METRICS\n                    .backpressure_micros\n                    .with_label_values([\"merge_split_downloader\"]),\n            )\n            .spawn(merge_split_downloader);\n\n        // Merge planner\n        let merge_planner = MergePlanner::new(\n            &self.params.pipeline_id,\n            immature_splits,\n            self.params.merge_policy.clone(),\n            merge_split_downloader_mailbox,\n            self.params.merge_scheduler_service.clone(),\n        );\n        let (_, merge_planner_handle) = ctx\n            .spawn_actor()\n            .set_kill_switch(self.kill_switch.clone())\n            .set_mailboxes(\n                self.merge_planner_mailbox.clone(),\n                self.merge_planner_inbox.clone(),\n            )\n            .spawn(merge_planner);\n\n        self.previous_generations_statistics = self.statistics.clone();\n        self.statistics.generation += 1;\n        self.handles_opt = Some(MergePipelineHandles {\n            merge_planner: merge_planner_handle,\n            merge_split_downloader: merge_split_downloader_handle,\n            merge_executor: merge_executor_handle,\n            merge_packager: merge_packager_handle,\n            merge_uploader: merge_uploader_handle,\n            merge_publisher: merge_publisher_handle,\n            next_check_for_progress: Instant::now() + *HEARTBEAT,\n        });\n        Ok(())\n    }\n\n    async fn terminate(&mut self) {\n        self.kill_switch.kill();\n        if let Some(handles) = self.handles_opt.take() {\n            tokio::join!(\n                handles.merge_planner.kill(),\n                
handles.merge_split_downloader.kill(),\n                handles.merge_executor.kill(),\n                handles.merge_packager.kill(),\n                handles.merge_uploader.kill(),\n                handles.merge_publisher.kill(),\n            );\n        }\n    }\n\n    async fn perform_observe(&mut self) {\n        let Some(handles) = &self.handles_opt else {\n            return;\n        };\n        handles.merge_planner.refresh_observe();\n        handles.merge_uploader.refresh_observe();\n        handles.merge_publisher.refresh_observe();\n        let num_ongoing_merges = crate::metrics::INDEXER_METRICS\n            .ongoing_merge_operations\n            .get();\n        self.statistics = self\n            .previous_generations_statistics\n            .clone()\n            .add_actor_counters(\n                &handles.merge_uploader.last_observation(),\n                &handles.merge_publisher.last_observation(),\n            )\n            .set_generation(self.statistics.generation)\n            .set_num_spawn_attempts(self.statistics.num_spawn_attempts)\n            .set_ongoing_merges(usize::try_from(num_ongoing_merges).unwrap_or(0));\n    }\n\n    async fn perform_health_check(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let Some(handles) = self.handles_opt.as_mut() else {\n            return Ok(());\n        };\n        // While we check if the actor has terminated or not, we do not check for progress\n        // at every single loop. 
Instead, we wait for the `HEARTBEAT` duration to have elapsed,\n        // since our last check.\n        let check_for_progress = handles.should_check_for_progress();\n        let health = self.healthcheck(check_for_progress);\n        match health {\n            Health::Healthy => {}\n            Health::FailureOrUnhealthy => {\n                self.terminate().await;\n                ctx.schedule_self_msg(*quickwit_actors::HEARTBEAT, Spawn { retry_count: 0 });\n            }\n            Health::Success => {\n                info!(index_uid=%self.params.pipeline_id.index_uid, \"merge pipeline success, shutting down\");\n                return Err(ActorExitStatus::Success);\n            }\n        }\n        Ok(())\n    }\n\n    async fn fetch_immature_splits(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> MetastoreResult<Vec<quickwit_metastore::SplitMetadata>> {\n        // We consume the initial immature splits provided by the indexing service on the first\n        // spawn.\n        if let Some(immature_splits) = self.initial_immature_splits_opt.take() {\n            return Ok(immature_splits);\n        }\n        // On subsequent spawns, we fetch the immature splits directly from the metastore.\n        let index_uid = self.params.pipeline_id.index_uid.clone();\n        let node_id = self.params.pipeline_id.node_id.clone();\n        let list_splits_query = ListSplitsQuery::for_index(index_uid)\n            .with_node_id(node_id)\n            .with_split_state(SplitState::Published)\n            .retain_immature(OffsetDateTime::now_utc());\n        let list_splits_request =\n            ListSplitsRequest::try_from_list_splits_query(&list_splits_query)?;\n        let immature_splits_stream = ctx\n            .protect_future(self.params.metastore.list_splits(list_splits_request))\n            .await?;\n        let immature_splits = ctx\n            .protect_future(immature_splits_stream.collect_splits_metadata())\n            .await?;\n        
info!(\n            index_uid=%self.params.pipeline_id.index_uid,\n            source_id=%self.params.pipeline_id.source_id,\n            \"fetched {} splits candidates for merge\",\n            immature_splits.len()\n        );\n        Ok(immature_splits)\n    }\n}\n\n#[async_trait]\nimpl Handler<SuperviseLoop> for MergePipeline {\n    type Reply = ();\n    async fn handle(\n        &mut self,\n        supervise_loop_token: SuperviseLoop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.perform_observe().await;\n        self.perform_health_check(ctx).await?;\n        ctx.schedule_self_msg(SUPERVISE_LOOP_INTERVAL, supervise_loop_token);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<FinishPendingMergesAndShutdownPipeline> for MergePipeline {\n    type Reply = ();\n    async fn handle(\n        &mut self,\n        _: FinishPendingMergesAndShutdownPipeline,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        info!(index_uid=%self.params.pipeline_id.index_uid, \"shutdown merge pipeline initiated\");\n        // From now on, we will not respawn the pipeline if it fails.\n        self.shutdown_initiated = true;\n        if let Some(handles) = &self.handles_opt {\n            // This disconnects the merge planner from the merge publisher,\n            // breaking the merge planner pipeline loop.\n            //\n            // As a result, the pipeline will naturally terminate\n            // once all of the pending / ongoing merge operations are completed.\n            let _ = handles\n                .merge_publisher\n                .mailbox()\n                .send_message(DisconnectMergePlanner)\n                .await;\n\n            // We also initiate the merge planner finalization routine.\n            // Depending on the merge policy, it may emit a few more merge\n            // operations.\n            let _ = handles\n                .merge_planner\n                
.mailbox()\n                .send_message(RunFinalizeMergePolicyAndQuit)\n                .await;\n        } else {\n            // we won't respawn the pipeline in the future, so there is nothing\n            // to do here.\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Spawn> for MergePipeline {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        spawn: Spawn,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if self.shutdown_initiated {\n            return Ok(());\n        }\n        if self.handles_opt.is_some() {\n            return Ok(());\n        }\n        self.previous_generations_statistics.num_spawn_attempts = 1 + spawn.retry_count;\n        if let Err(spawn_error) = self.spawn_pipeline(ctx).await {\n            if let Some(MetastoreError::NotFound { .. }) =\n                spawn_error.downcast_ref::<MetastoreError>()\n            {\n                info!(error = ?spawn_error, \"could not spawn pipeline, index might have been deleted\");\n                return Err(ActorExitStatus::Success);\n            }\n            let retry_delay = wait_duration_before_retry(spawn.retry_count);\n            error!(error = ?spawn_error, retry_count = spawn.retry_count, retry_delay = ?retry_delay, \"error while spawning indexing pipeline, retrying after some time\");\n            ctx.schedule_self_msg(\n                retry_delay,\n                Spawn {\n                    retry_count: spawn.retry_count + 1,\n                },\n            );\n        }\n        Ok(())\n    }\n}\n\n#[derive(Clone)]\npub struct MergePipelineParams {\n    pub pipeline_id: MergePipelineId,\n    pub doc_mapper: Arc<DocMapper>,\n    pub indexing_directory: TempDirectory,\n    pub metastore: MetastoreServiceClient,\n    pub merge_scheduler_service: Mailbox<MergeSchedulerService>,\n    pub split_store: IndexingSplitStore,\n    pub merge_policy: Arc<dyn MergePolicy>,\n    pub retention_policy: 
Option<RetentionPolicy>,\n    pub max_concurrent_split_uploads: usize, //< TODO share with the indexing pipeline.\n    pub merge_io_throughput_limiter_opt: Option<Limiter>,\n    pub event_broker: EventBroker,\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::Bound;\n    use std::sync::Arc;\n\n    use quickwit_actors::{ActorExitStatus, Universe};\n    use quickwit_common::ServiceStream;\n    use quickwit_common::temp_dir::TempDirectory;\n    use quickwit_doc_mapper::default_doc_mapper_for_test;\n    use quickwit_metastore::ListSplitsRequestExt;\n    use quickwit_proto::indexing::MergePipelineId;\n    use quickwit_proto::metastore::{MetastoreServiceClient, MockMetastoreService};\n    use quickwit_proto::types::{IndexUid, NodeId};\n    use quickwit_storage::RamStorage;\n\n    use crate::IndexingSplitStore;\n    use crate::actors::merge_pipeline::{MergePipeline, MergePipelineParams};\n    use crate::actors::{MergePlanner, Publisher};\n    use crate::merge_policy::default_merge_policy;\n\n    #[tokio::test]\n    async fn test_merge_pipeline_simple() -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let pipeline_id = MergePipelineId {\n            index_uid: index_uid.clone(),\n            source_id,\n            node_id,\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_splits()\n            .times(1)\n            .withf(move |list_splits_request| {\n                let list_split_query = list_splits_request.deserialize_list_splits_query().unwrap();\n                assert_eq!(list_split_query.index_uids, Some(vec![index_uid.clone()]));\n                assert_eq!(\n                    list_split_query.split_states,\n                    vec![quickwit_metastore::SplitState::Published]\n                );\n                let Bound::Excluded(_) = 
list_split_query.mature else {\n                    panic!(\"expected `Bound::Excluded`\");\n                };\n                true\n            })\n            .returning(|_| Ok(ServiceStream::empty()));\n        let universe = Universe::with_accelerated_time();\n        let storage = Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::create_without_local_store_for_test(storage.clone());\n        let pipeline_params = MergePipelineParams {\n            pipeline_id,\n            doc_mapper: Arc::new(default_doc_mapper_for_test()),\n            indexing_directory: TempDirectory::for_test(),\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            merge_scheduler_service: universe.get_or_spawn_one(),\n            split_store,\n            merge_policy: default_merge_policy(),\n            retention_policy: None,\n            max_concurrent_split_uploads: 2,\n            merge_io_throughput_limiter_opt: None,\n            event_broker: Default::default(),\n        };\n        let pipeline = MergePipeline::new(pipeline_params, None, universe.spawn_ctx());\n        let _merge_planner_mailbox = pipeline.merge_planner_mailbox().clone();\n        let (pipeline_mailbox, pipeline_handle) = universe.spawn_builder().spawn(pipeline);\n        pipeline_mailbox\n            .ask(super::FinishPendingMergesAndShutdownPipeline)\n            .await\n            .unwrap();\n\n        let (pipeline_exit_status, pipeline_statistics) = pipeline_handle.join().await;\n        assert_eq!(pipeline_statistics.generation, 1);\n        assert_eq!(pipeline_statistics.num_spawn_attempts, 1);\n        assert_eq!(pipeline_statistics.num_published_splits, 0);\n        assert!(matches!(pipeline_exit_status, ActorExitStatus::Success));\n\n        // Checking that the merge pipeline actors have been properly cleaned up.\n        assert!(universe.get_one::<MergePlanner>().is_none());\n        
assert!(universe.get_one::<Publisher>().is_none());\n        assert!(universe.get_one::<MergePipeline>().is_none());\n\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/merge_planner.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::sync::Arc;\nuse std::time::Instant;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_metastore::{SplitMaturity, SplitMetadata};\nuse quickwit_proto::indexing::MergePipelineId;\nuse quickwit_proto::types::DocMappingUid;\nuse tantivy::Inventory;\nuse time::OffsetDateTime;\nuse tracing::{info, warn};\n\nuse super::MergeSchedulerService;\nuse crate::MergePolicy;\nuse crate::actors::MergeSplitDownloader;\nuse crate::actors::merge_scheduler_service::schedule_merge;\nuse crate::merge_policy::MergeOperation;\nuse crate::models::NewSplits;\n\n#[derive(Debug)]\npub(crate) struct RunFinalizeMergePolicyAndQuit;\n\n#[derive(Debug, Clone, PartialEq, Eq, Hash)]\nstruct MergePartition {\n    partition_id: u64,\n    doc_mapping_uid: DocMappingUid,\n}\n\nimpl MergePartition {\n    fn from_split_meta(split_meta: &SplitMetadata) -> MergePartition {\n        MergePartition {\n            partition_id: split_meta.partition_id,\n            doc_mapping_uid: split_meta.doc_mapping_uid,\n        }\n    }\n}\n\n/// The merge planner decides when to start a merge task.\npub struct MergePlanner {\n    /// A young split is a split that has not reached maturity\n    /// yet and can be candidate to merge operations.\n    
partitioned_young_splits: HashMap<MergePartition, Vec<SplitMetadata>>,\n\n    /// This set contains all of the split ids that we \"acknowledged\".\n    /// The point of this set is to rapidly dismiss redundant `NewSplits` messages.\n    ///\n    /// Complex scenarios that can result in the reemission of\n    /// such messages are described in #3627.\n    ///\n    /// At any given point in time, the set must contain at least:\n    /// - young splits (non-mature)\n    /// - splits that are currently in merge.\n    ///\n    /// It also contains other splits, for instance splits that have gone through\n    /// a successful merge and have been deleted.\n    ///\n    /// We incrementally build this set, by adding new splits to it.\n    /// When it becomes too large, we entirely rebuild it.\n    known_split_ids: HashSet<String>,\n    known_split_ids_recompute_attempt_id: usize,\n\n    merge_policy: Arc<dyn MergePolicy>,\n\n    merge_split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n    merge_scheduler_service: Mailbox<MergeSchedulerService>,\n\n    /// Inventory of ongoing merge operations. 
If everything goes well,\n    /// a merge operation is dropped after the publish of the merged split.\n    ///\n    /// It is used to GC the known_split_ids set.\n    ongoing_merge_operations_inventory: Inventory<MergeOperation>,\n\n    /// We use the actor start_time as a way to identify incarnations.\n    ///\n    /// Since we recycle the mailbox of the merge planner, this incarnation\n    /// makes it possible to ignore messages that were emitted from the previous\n    /// instantiation.\n    incarnation_started_at: Instant,\n}\n\n#[async_trait]\nimpl Actor for MergePlanner {\n    type ObservableState = ();\n\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    fn name(&self) -> String {\n        \"MergePlanner\".to_string()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        MergePlanner::queue_capacity()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        // We do not call the handle method directly and instead queue the message in order to drain\n        // the recycled mailbox and get a consolidated view of the set of published\n        // splits, before scheduling any merge operation. 
See #3847 for more details.\n\n        // If the mailbox is full, this send message might fail (the capacity is very low).\n        // This is however not much of a problem: it probably contains a `NewSplits` message.\n        // If it does not, we will be losing an opportunity to plan merge right away, but it will\n        // happen on the next split publication.\n        let _ = ctx.try_send_self_message(PlanMerge {\n            incarnation_started_at: self.incarnation_started_at,\n        });\n\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<RunFinalizeMergePolicyAndQuit> for MergePlanner {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _plan_merge: RunFinalizeMergePolicyAndQuit,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        // Unlike `PlanMerge`, this message carries no incarnation marker,\n        // so it is always processed.\n        self.send_merge_ops(true, ctx).await?;\n        Err(ActorExitStatus::Success)\n    }\n}\n\n#[async_trait]\nimpl Handler<PlanMerge> for MergePlanner {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        plan_merge: PlanMerge,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if plan_merge.incarnation_started_at == self.incarnation_started_at {\n            // Note we ignore messages that could be coming from a different incarnation.\n            // (See comment on `Self::incarnation_started_at`.)\n            self.send_merge_ops(false, ctx).await?;\n        }\n        self.recompute_known_splits_if_necessary();\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<NewSplits> for MergePlanner {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        new_splits: NewSplits,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.record_splits_if_necessary(new_splits.new_splits);\n        self.send_merge_ops(false, 
ctx).await?;\n        self.recompute_known_splits_if_necessary();\n        Ok(())\n    }\n}\n\nimpl MergePlanner {\n    pub fn queue_capacity() -> QueueCapacity {\n        // We cannot have a Queue capacity of 0 here because `try_send_self`\n        // would never succeed.\n        QueueCapacity::Bounded(1)\n    }\n\n    pub fn new(\n        pipeline_id: &MergePipelineId,\n        immature_splits: Vec<SplitMetadata>,\n        merge_policy: Arc<dyn MergePolicy>,\n        merge_split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n        merge_scheduler_service: Mailbox<MergeSchedulerService>,\n    ) -> MergePlanner {\n        let immature_splits: Vec<SplitMetadata> = immature_splits\n            .into_iter()\n            .filter(|split_metadata| belongs_to_pipeline(pipeline_id, split_metadata))\n            .collect();\n        let mut merge_planner = MergePlanner {\n            known_split_ids: Default::default(),\n            known_split_ids_recompute_attempt_id: 0,\n            partitioned_young_splits: Default::default(),\n            merge_policy,\n            merge_split_downloader_mailbox,\n            merge_scheduler_service,\n            ongoing_merge_operations_inventory: Inventory::default(),\n\n            incarnation_started_at: Instant::now(),\n        };\n        merge_planner.record_splits_if_necessary(immature_splits);\n        merge_planner\n    }\n\n    fn rebuild_known_split_ids(&self) -> HashSet<String> {\n        let mut known_split_ids: HashSet<String> = HashSet::default();\n        // Add splits that in `partitioned_young_splits`.\n        for young_split_partition in self.partitioned_young_splits.values() {\n            for split in young_split_partition {\n                known_split_ids.insert(split.split_id().to_string());\n            }\n        }\n        let ongoing_merge_operations = self.ongoing_merge_operations_inventory.list();\n        // Add splits that are known as in merge.\n        for merge_op in ongoing_merge_operations 
{\n            for split in &merge_op.splits {\n                known_split_ids.insert(split.split_id().to_string());\n            }\n        }\n        if known_split_ids.len() * 2 >= self.known_split_ids.len() {\n            warn!(\n                known_split_ids_len_after = known_split_ids.len(),\n                known_split_ids_len_before = self.known_split_ids.len(),\n                \"Rebuilding the known split ids set ended up not halving its size. \\\n                 This is likely a bug, please report.\"\n            );\n        }\n        known_split_ids\n    }\n\n    /// Updates `known_split_ids` and returns `true` if the split was not\n    /// previously known and should be recorded.\n    fn acknownledge_split(&mut self, split_id: &str) -> bool {\n        if self.known_split_ids.contains(split_id) {\n            return false;\n        }\n        self.known_split_ids.insert(split_id.to_string());\n        true\n    }\n\n    // No need to rebuild every time, we do it once every 100 calls.\n    fn recompute_known_splits_if_necessary(&mut self) {\n        self.known_split_ids_recompute_attempt_id += 1;\n        if self\n            .known_split_ids_recompute_attempt_id\n            .is_multiple_of(100)\n        {\n            self.known_split_ids = self.rebuild_known_split_ids();\n            self.known_split_ids_recompute_attempt_id = 0;\n        }\n    }\n\n    // Record a split. 
This function does NOT check if the split is mature or not, or if the split\n    // is known or not.\n    fn record_split(&mut self, new_split: SplitMetadata) {\n        let splits_for_partition: &mut Vec<SplitMetadata> = self\n            .partitioned_young_splits\n            .entry(MergePartition::from_split_meta(&new_split))\n            .or_default();\n        splits_for_partition.push(new_split);\n    }\n\n    // Records a list of splits.\n    //\n    // Internally this function will detect and avoid adding the split\n    // that are:\n    // - already known\n    // - mature\n    // - do not belong to the current timeline.\n    fn record_splits_if_necessary(&mut self, split_metadatas: Vec<SplitMetadata>) {\n        for new_split in split_metadatas {\n            if let SplitMaturity::Mature = self\n                .merge_policy\n                .split_maturity(new_split.num_docs, new_split.num_merge_ops)\n            {\n                // This can happen if the merge policy changed (e.g, decreased\n                // split_num_docs_target).\n                continue;\n            }\n            if new_split.is_mature(OffsetDateTime::now_utc()) {\n                continue;\n            }\n            // Due to the recycling of the mailbox of the merge planner, it is possible for\n            // a split already in store to be received.\n            //\n            // See `known_split_ids`.\n            if !self.acknownledge_split(new_split.split_id()) {\n                continue;\n            }\n            self.record_split(new_split);\n        }\n    }\n    async fn compute_merge_ops(\n        &mut self,\n        is_finalize: bool,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Vec<MergeOperation>, ActorExitStatus> {\n        let mut merge_operations = Vec::new();\n        for young_splits in self.partitioned_young_splits.values_mut() {\n            if !young_splits.is_empty() {\n                let operations = if is_finalize {\n                    
self.merge_policy.finalize_operations(young_splits)\n                } else {\n                    self.merge_policy.operations(young_splits)\n                };\n                merge_operations.extend(operations);\n            }\n            ctx.record_progress();\n            ctx.yield_now().await;\n        }\n        self.partitioned_young_splits\n            .retain(|_, splits| !splits.is_empty());\n        // We recompute the number of young splits.\n        Ok(merge_operations)\n    }\n\n    async fn send_merge_ops(\n        &mut self,\n        is_finalize: bool,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        // We identify all of the merge operations we want to run and leave it\n        // to the merge scheduler to decide in which order these should be scheduled.\n        //\n        // The merge scheduler has the merit of knowing about merge operations from other\n        // index as well.\n        let merge_ops = self.compute_merge_ops(is_finalize, ctx).await?;\n        for merge_operation in merge_ops {\n            info!(merge_operation=?merge_operation, \"schedule merge operation\");\n            let tracked_merge_operation = self\n                .ongoing_merge_operations_inventory\n                .track(merge_operation);\n            schedule_merge(\n                &self.merge_scheduler_service,\n                tracked_merge_operation,\n                self.merge_split_downloader_mailbox.clone(),\n            )\n            .await?\n        }\n        Ok(())\n    }\n}\n\n/// We can only merge splits with the same (node_id, index_id, source_id).\nfn belongs_to_pipeline(pipeline_id: &MergePipelineId, split: &SplitMetadata) -> bool {\n    pipeline_id.node_id == split.node_id\n        && pipeline_id.index_uid == split.index_uid\n        && pipeline_id.source_id == split.source_id\n}\n\n#[derive(Debug)]\nstruct PlanMerge {\n    incarnation_started_at: Instant,\n}\n\n#[cfg(test)]\nmod tests {\n    use 
std::sync::Arc;\n    use std::time::Duration;\n\n    use itertools::Itertools;\n    use quickwit_actors::{ActorExitStatus, Command, QueueCapacity, Universe};\n    use quickwit_config::IndexingSettings;\n    use quickwit_config::merge_policy_config::{\n        ConstWriteAmplificationMergePolicyConfig, MergePolicyConfig, StableLogMergePolicyConfig,\n    };\n    use quickwit_metastore::{SplitMaturity, SplitMetadata};\n    use quickwit_proto::indexing::MergePipelineId;\n    use quickwit_proto::types::{DocMappingUid, IndexUid, NodeId};\n    use time::OffsetDateTime;\n\n    use crate::actors::MergePlanner;\n    use crate::merge_policy::{\n        MergePolicy, MergeTask, StableLogMergePolicy, merge_policy_from_settings,\n    };\n    use crate::models::NewSplits;\n\n    fn split_metadata_for_test(\n        index_uid: &IndexUid,\n        split_id: &str,\n        partition_id: u64,\n        doc_mapping_uid: DocMappingUid,\n        num_docs: usize,\n        num_merge_ops: usize,\n    ) -> SplitMetadata {\n        SplitMetadata {\n            split_id: split_id.to_string(),\n            index_uid: index_uid.clone(),\n            source_id: \"test-source\".to_string(),\n            node_id: \"test-node\".to_string(),\n            num_docs,\n            partition_id,\n            num_merge_ops,\n            create_timestamp: OffsetDateTime::now_utc().unix_timestamp(),\n            maturity: SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(3600),\n            },\n            doc_mapping_uid,\n            ..Default::default()\n        }\n    }\n\n    #[tokio::test]\n    async fn test_merge_planner_with_stable_custom_merge_policy() -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n        let [doc_mapping_uid1, doc_mapping_uid2] = {\n            let mut doc_mappings = [DocMappingUid::random(), 
DocMappingUid::random()];\n            doc_mappings.sort();\n            doc_mappings\n        };\n        let pipeline_id = MergePipelineId {\n            node_id,\n            index_uid: index_uid.clone(),\n            source_id,\n        };\n        let merge_policy = Arc::new(StableLogMergePolicy::new(\n            StableLogMergePolicyConfig {\n                min_level_num_docs: 10_000,\n                merge_factor: 3,\n                max_merge_factor: 5,\n                maturation_period: Duration::from_secs(3600),\n            },\n            50_000,\n        ));\n        let universe = Universe::with_accelerated_time();\n        let (merge_split_downloader_mailbox, merge_split_downloader_inbox) =\n            universe.create_test_mailbox();\n\n        let merge_planner = MergePlanner::new(\n            &pipeline_id,\n            Vec::new(),\n            merge_policy,\n            merge_split_downloader_mailbox,\n            universe.get_or_spawn_one(),\n        );\n        let (merge_planner_mailbox, merge_planner_handle) =\n            universe.spawn_builder().spawn(merge_planner);\n        {\n            // send one split\n            let message = NewSplits {\n                new_splits: vec![\n                    split_metadata_for_test(&index_uid, \"1_1\", 1, doc_mapping_uid1, 2500, 0),\n                    split_metadata_for_test(&index_uid, \"1v2_1\", 1, doc_mapping_uid2, 2500, 0),\n                    split_metadata_for_test(&index_uid, \"1_2\", 2, doc_mapping_uid1, 3000, 0),\n                ],\n            };\n            merge_planner_mailbox.send_message(message).await?;\n            let merge_ops = merge_split_downloader_inbox.drain_for_test();\n            assert_eq!(merge_ops.len(), 0);\n        }\n        {\n            // send two splits with a duplicate\n            let message = NewSplits {\n                new_splits: vec![\n                    split_metadata_for_test(&index_uid, \"2_1\", 1, doc_mapping_uid1, 2000, 0),\n               
     split_metadata_for_test(&index_uid, \"2v2_1\", 1, doc_mapping_uid2, 2500, 0),\n                    split_metadata_for_test(&index_uid, \"1_2\", 2, doc_mapping_uid1, 3000, 0),\n                ],\n            };\n            merge_planner_mailbox.send_message(message).await?;\n            let merge_ops = merge_split_downloader_inbox.drain_for_test();\n            assert_eq!(merge_ops.len(), 0);\n        }\n        {\n            // send four more splits to generate merge\n            let message = NewSplits {\n                new_splits: vec![\n                    split_metadata_for_test(&index_uid, \"3_1\", 1, doc_mapping_uid1, 1500, 0),\n                    split_metadata_for_test(&index_uid, \"4_1\", 1, doc_mapping_uid1, 1000, 0),\n                    split_metadata_for_test(&index_uid, \"3v2_1\", 1, doc_mapping_uid2, 1500, 0),\n                    split_metadata_for_test(&index_uid, \"2_2\", 2, doc_mapping_uid1, 2000, 0),\n                    split_metadata_for_test(&index_uid, \"3_2\", 2, doc_mapping_uid1, 4000, 0),\n                ],\n            };\n            merge_planner_mailbox.send_message(message).await?;\n            merge_planner_handle.process_pending_and_observe().await;\n            let operations = merge_split_downloader_inbox.drain_for_test_typed::<MergeTask>();\n            assert_eq!(operations.len(), 3);\n            let mut merge_operations = operations\n                .into_iter()\n                .sorted_by_key(|op| (op.splits[0].partition_id, op.splits[0].doc_mapping_uid));\n\n            let first_merge_operation = merge_operations.next().unwrap();\n            assert_eq!(first_merge_operation.splits.len(), 4);\n            assert!(\n                first_merge_operation\n                    .splits\n                    .iter()\n                    .all(|split| split.partition_id == 1\n                        && split.doc_mapping_uid == doc_mapping_uid1)\n            );\n\n            let second_merge_operation = 
merge_operations.next().unwrap();\n            assert_eq!(second_merge_operation.splits.len(), 3);\n            assert!(\n                second_merge_operation\n                    .splits\n                    .iter()\n                    .all(|split| split.partition_id == 1\n                        && split.doc_mapping_uid == doc_mapping_uid2)\n            );\n\n            let third_merge_operation = merge_operations.next().unwrap();\n            assert_eq!(third_merge_operation.splits.len(), 3);\n            assert!(\n                third_merge_operation\n                    .splits\n                    .iter()\n                    .all(|split| split.partition_id == 2\n                        && split.doc_mapping_uid == doc_mapping_uid1)\n            );\n        }\n        universe.assert_quit().await;\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_merge_planner_spawns_merge_over_existing_splits_on_startup() -> anyhow::Result<()>\n    {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n        let doc_mapping_uid = DocMappingUid::random();\n        let pipeline_id = MergePipelineId {\n            node_id,\n            index_uid: index_uid.clone(),\n            source_id,\n        };\n        let universe = Universe::with_accelerated_time();\n        let (merge_split_downloader_mailbox, merge_split_downloader_inbox) = universe\n            .spawn_ctx()\n            .create_mailbox(\"MergeSplitDownloader\", QueueCapacity::Bounded(2));\n        let merge_policy_config = ConstWriteAmplificationMergePolicyConfig {\n            merge_factor: 2,\n            max_merge_factor: 2,\n            max_merge_ops: 3,\n            ..Default::default()\n        };\n        let indexing_settings = IndexingSettings {\n            merge_policy: MergePolicyConfig::ConstWriteAmplification(merge_policy_config),\n            
..Default::default()\n        };\n        let immature_splits = vec![\n            split_metadata_for_test(\n                &index_uid,\n                \"a_small\",\n                0, // partition_id\n                doc_mapping_uid,\n                1_000_000,\n                2,\n            ),\n            split_metadata_for_test(\n                &index_uid,\n                \"b_small\",\n                0, // partition_id\n                doc_mapping_uid,\n                1_000_000,\n                2,\n            ),\n        ];\n        let merge_policy: Arc<dyn MergePolicy> = merge_policy_from_settings(&indexing_settings);\n        let merge_planner = MergePlanner::new(\n            &pipeline_id,\n            immature_splits.clone(),\n            merge_policy,\n            merge_split_downloader_mailbox,\n            universe.get_or_spawn_one(),\n        );\n        let (merge_planner_mailbox, merge_planner_handle) =\n            universe.spawn_builder().spawn(merge_planner);\n\n        // We wait for the first merge ops. 
If we sent the Quit message right away, it would have\n        // been queued before the first `PlanMerge` message.\n        let merge_task_res = merge_split_downloader_inbox\n            .recv_typed_message::<MergeTask>()\n            .await;\n        assert!(merge_task_res.is_ok());\n\n        // We make sure that the known splits filtering set filters out splits that are currently\n        // in a merge.\n        merge_planner_mailbox\n            .ask(NewSplits {\n                new_splits: immature_splits,\n            })\n            .await?;\n\n        let _ = merge_planner_handle.process_pending_and_observe().await;\n\n        let merge_ops = merge_split_downloader_inbox.drain_for_test_typed::<MergeTask>();\n\n        assert!(merge_ops.is_empty());\n\n        merge_planner_mailbox.send_message(Command::Quit).await?;\n\n        let (exit_status, _last_state) = merge_planner_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Quit));\n        let merge_ops = merge_split_downloader_inbox.drain_for_test_typed::<MergeTask>();\n        assert!(merge_ops.is_empty());\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_merge_planner_dismiss_splits_from_different_pipeline_id() -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n        let doc_mapping_uid = DocMappingUid::random();\n        let pipeline_id = MergePipelineId {\n            node_id,\n            index_uid,\n            source_id,\n        };\n        // This test makes sure that the merge planner ignores the splits that do not belong\n        // to the same pipeline.\n        let universe = Universe::with_accelerated_time();\n        let (merge_split_downloader_mailbox, merge_split_downloader_inbox) = universe\n            .spawn_ctx()\n            .create_mailbox(\"MergeSplitDownloader\", 
QueueCapacity::Bounded(2));\n\n        let merge_policy_config = ConstWriteAmplificationMergePolicyConfig {\n            merge_factor: 2,\n            max_merge_factor: 2,\n            max_merge_ops: 3,\n            ..Default::default()\n        };\n        let indexing_settings = IndexingSettings {\n            merge_policy: MergePolicyConfig::ConstWriteAmplification(merge_policy_config),\n            ..Default::default()\n        };\n\n        // It is different from the index_uid because the index uid has a unique suffix.\n        let other_index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n\n        let immature_splits = vec![\n            split_metadata_for_test(\n                &other_index_uid,\n                \"a_small\",\n                0, // partition_id\n                doc_mapping_uid,\n                1_000_000,\n                2,\n            ),\n            split_metadata_for_test(\n                &other_index_uid,\n                \"b_small\",\n                0, // partition_id\n                doc_mapping_uid,\n                1_000_000,\n                2,\n            ),\n        ];\n        let merge_policy: Arc<dyn MergePolicy> = merge_policy_from_settings(&indexing_settings);\n        let merge_planner = MergePlanner::new(\n            &pipeline_id,\n            immature_splits.clone(),\n            merge_policy,\n            merge_split_downloader_mailbox,\n            universe.get_or_spawn_one(),\n        );\n        let (merge_planner_mailbox, merge_planner_handle) =\n            universe.spawn_builder().spawn(merge_planner);\n        universe.sleep(Duration::from_secs(10)).await;\n        merge_planner_mailbox.send_message(Command::Quit).await?;\n        let (exit_status, _last_state) = merge_planner_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Quit));\n        let merge_tasks = merge_split_downloader_inbox.drain_for_test_typed::<MergeTask>();\n\n        assert!(merge_tasks.is_empty());\n     
   universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_merge_planner_inherit_mailbox_with_splits_bug_3847() -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n        let doc_mapping_uid = DocMappingUid::random();\n        let pipeline_id = MergePipelineId {\n            node_id,\n            index_uid: index_uid.clone(),\n            source_id,\n        };\n        let universe = Universe::with_accelerated_time();\n        let (merge_split_downloader_mailbox, merge_split_downloader_inbox) = universe\n            .spawn_ctx()\n            .create_mailbox(\"MergeSplitDownloader\", QueueCapacity::Bounded(2));\n\n        let merge_policy_config = ConstWriteAmplificationMergePolicyConfig {\n            merge_factor: 2,\n            max_merge_factor: 2,\n            max_merge_ops: 3,\n            ..Default::default()\n        };\n        let indexing_settings = IndexingSettings {\n            merge_policy: MergePolicyConfig::ConstWriteAmplification(merge_policy_config),\n            ..Default::default()\n        };\n        let immature_splits = vec![\n            split_metadata_for_test(\n                &index_uid,\n                \"a_small\",\n                0, // partition_id\n                doc_mapping_uid,\n                1_000_000,\n                2,\n            ),\n            split_metadata_for_test(\n                &index_uid,\n                \"b_small\",\n                0, // partition_id\n                doc_mapping_uid,\n                1_000_000,\n                2,\n            ),\n        ];\n        let merge_policy: Arc<dyn MergePolicy> = merge_policy_from_settings(&indexing_settings);\n        let merge_planner = MergePlanner::new(\n            &pipeline_id,\n            immature_splits.clone(),\n            merge_policy,\n            
merge_split_downloader_mailbox,\n            universe.get_or_spawn_one(),\n        );\n        // We create a fake old mailbox that contains two new splits and a PlanMerge message from an\n        // old incarnation. This could happen in real life if the merge pipeline failed\n        // right after a `PlanMerge` was pushed to the pipeline. Note that #3847 did not\n        // even require the `PlanMerge` to be in the pipeline\n        let (merge_planner_mailbox, merge_planner_inbox) =\n            universe.create_test_mailbox::<MergePlanner>();\n\n        // We spawn our merge planner with this recycled mailbox.\n        let (merge_planner_mailbox, merge_planner_handle) = universe\n            .spawn_builder()\n            .set_mailboxes(merge_planner_mailbox, merge_planner_inbox)\n            .spawn(merge_planner);\n\n        // The low capacity of the queue of the merge planner prevents us from sending a Command in\n        // the low priority queue. It would take the single slot and prevent the message\n        // sent in the initialize method.\n\n        // Instead, we wait for the first merge ops.\n        let merge_task_res = merge_split_downloader_inbox\n            .recv_typed_message::<MergeTask>()\n            .await;\n        assert!(merge_task_res.is_ok());\n\n        // At this point, our merge has been initialized.\n        merge_planner_mailbox.send_message(Command::Quit).await?;\n        let (exit_status, _last_state) = merge_planner_handle.join().await;\n\n        assert!(matches!(exit_status, ActorExitStatus::Quit));\n        let merge_tasks = merge_split_downloader_inbox.drain_for_test_typed::<MergeTask>();\n        assert!(merge_tasks.is_empty());\n\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/merge_scheduler_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Reverse;\nuse std::collections::BinaryHeap;\nuse std::collections::binary_heap::PeekMut;\nuse std::sync::Arc;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox};\nuse tantivy::TrackedObject;\nuse tokio::sync::{OwnedSemaphorePermit, Semaphore};\nuse tracing::error;\n\nuse super::MergeSplitDownloader;\nuse crate::merge_policy::{MergeOperation, MergeTask};\n\npub struct MergePermit {\n    _semaphore_permit: Option<OwnedSemaphorePermit>,\n    merge_scheduler_mailbox: Option<Mailbox<MergeSchedulerService>>,\n}\n\nimpl MergePermit {\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> MergePermit {\n        MergePermit {\n            _semaphore_permit: None,\n            merge_scheduler_mailbox: None,\n        }\n    }\n}\n\nimpl Drop for MergePermit {\n    fn drop(&mut self) {\n        let Some(merge_scheduler_mailbox) = self.merge_scheduler_mailbox.take() else {\n            return;\n        };\n        if merge_scheduler_mailbox\n            .send_message_with_high_priority(PermitReleased)\n            .is_err()\n        {\n            error!(\"merge scheduler service is dead\");\n        }\n    }\n}\n\npub async fn schedule_merge(\n    merge_scheduler_service: &Mailbox<MergeSchedulerService>,\n    merge_operation: 
TrackedObject<MergeOperation>,\n    merge_split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n) -> anyhow::Result<()> {\n    let schedule_merge = ScheduleMerge::new(merge_operation, merge_split_downloader_mailbox);\n    // TODO add backpressure.\n    merge_scheduler_service\n        .ask(schedule_merge)\n        .await\n        .context(\"failed to acquire permit\")?;\n    Ok(())\n}\n\nstruct ScheduledMerge {\n    score: u64,\n    id: u64, //< just for total ordering.\n    merge_operation: TrackedObject<MergeOperation>,\n    split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n}\n\nimpl ScheduledMerge {\n    fn order_key(&self) -> (u64, Reverse<u64>) {\n        (self.score, std::cmp::Reverse(self.id))\n    }\n}\n\nimpl Eq for ScheduledMerge {}\n\nimpl PartialEq for ScheduledMerge {\n    fn eq(&self, other: &Self) -> bool {\n        self.cmp(other).is_eq()\n    }\n}\n\nimpl PartialOrd for ScheduledMerge {\n    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl Ord for ScheduledMerge {\n    fn cmp(&self, other: &Self) -> std::cmp::Ordering {\n        self.order_key().cmp(&other.order_key())\n    }\n}\n\n/// The merge scheduler service is in charge of keeping track of all scheduled merge operations,\n/// and schedule them in the best possible order, respecting the `merge_concurrency` limit.\n///\n/// This actor is not supervised and should stay as simple as possible.\n/// In particular,\n/// - the `ScheduleMerge` handler should reply in microseconds.\n/// - the task should never be dropped before reaching its `split_downloader_mailbox` destination as\n///   it would break the consistency of `MergePlanner` with the metastore (ie: several splits will\n///   never be merged).\npub struct MergeSchedulerService {\n    merge_semaphore: Arc<Semaphore>,\n    merge_concurrency: usize,\n    pending_merge_queue: BinaryHeap<ScheduledMerge>,\n    next_merge_id: u64,\n    pending_merge_bytes: 
u64,\n}\n\nimpl Default for MergeSchedulerService {\n    fn default() -> MergeSchedulerService {\n        MergeSchedulerService::new(3)\n    }\n}\n\nimpl MergeSchedulerService {\n    pub fn new(merge_concurrency: usize) -> MergeSchedulerService {\n        let merge_semaphore = Arc::new(Semaphore::new(merge_concurrency));\n        MergeSchedulerService {\n            merge_semaphore,\n            merge_concurrency,\n            pending_merge_queue: BinaryHeap::default(),\n            next_merge_id: 0,\n            pending_merge_bytes: 0,\n        }\n    }\n\n    fn schedule_pending_merges(&mut self, ctx: &ActorContext<Self>) {\n        // We schedule as many pending merges as we can,\n        // until there are no permits available or merges to schedule.\n        loop {\n            let merge_semaphore = self.merge_semaphore.clone();\n            let Some(next_merge) = self.pending_merge_queue.peek_mut() else {\n                // No merge to schedule.\n                break;\n            };\n            let Ok(semaphore_permit) = Semaphore::try_acquire_owned(merge_semaphore) else {\n                // No permit available right away.\n                break;\n            };\n            let merge_permit = MergePermit {\n                _semaphore_permit: Some(semaphore_permit),\n                merge_scheduler_mailbox: Some(ctx.mailbox().clone()),\n            };\n            let ScheduledMerge {\n                merge_operation,\n                split_downloader_mailbox,\n                ..\n            } = PeekMut::pop(next_merge);\n            let merge_task = MergeTask {\n                merge_operation,\n                _merge_permit: merge_permit,\n            };\n            self.pending_merge_bytes -= merge_task.merge_operation.total_num_bytes();\n            crate::metrics::INDEXER_METRICS\n                .pending_merge_operations\n                .set(self.pending_merge_queue.len() as i64);\n            crate::metrics::INDEXER_METRICS\n                
.pending_merge_bytes\n                .set(self.pending_merge_bytes as i64);\n            match split_downloader_mailbox.try_send_message(merge_task) {\n                Ok(_) => {}\n                Err(quickwit_actors::TrySendError::Full(_)) => {\n                    // The split downloader mailbox has an unbounded queue capacity,\n                    // so this should not happen.\n                    error!(\"split downloader queue is full: please report\");\n                }\n                Err(quickwit_actors::TrySendError::Disconnected) => {\n                    // It means the split downloader is dead.\n                    // This is fine, the merge pipeline has probably been restarted.\n                }\n            }\n        }\n        let num_merges =\n            self.merge_concurrency as i64 - self.merge_semaphore.available_permits() as i64;\n        crate::metrics::INDEXER_METRICS\n            .ongoing_merge_operations\n            .set(num_merges);\n    }\n}\n\n#[async_trait]\nimpl Actor for MergeSchedulerService {\n    type ObservableState = ();\n\n    fn observable_state(&self) {}\n\n    async fn initialize(&mut self, _ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        Ok(())\n    }\n}\n\n#[derive(Debug)]\nstruct ScheduleMerge {\n    score: u64,\n    merge_operation: TrackedObject<MergeOperation>,\n    split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n}\n\n/// The higher the score, the sooner we will execute the merge operation.\n/// A good merge operation\n/// - strongly reduces the number of splits\n/// - is light.\nfn score_merge_operation(merge_operation: &MergeOperation) -> u64 {\n    let total_num_bytes: u64 = merge_operation.total_num_bytes();\n    if total_num_bytes == 0 {\n        // Silly corner case that should never happen.\n        return u64::MAX;\n    }\n    // We will remove `splits.len()` splits and add 1 merged split.\n    let delta_num_splits = (merge_operation.splits.len() - 1) as u64;\n    // We use integer arithmetic to avoid `f64 are not ordered` silliness.\n    
(delta_num_splits << 48)\n        .checked_div(total_num_bytes)\n        .unwrap_or(1u64)\n}\n\nimpl ScheduleMerge {\n    pub fn new(\n        merge_operation: TrackedObject<MergeOperation>,\n        split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n    ) -> ScheduleMerge {\n        let score = score_merge_operation(&merge_operation);\n        ScheduleMerge {\n            score,\n            merge_operation,\n            split_downloader_mailbox,\n        }\n    }\n}\n\n#[async_trait]\nimpl Handler<ScheduleMerge> for MergeSchedulerService {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        schedule_merge: ScheduleMerge,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let ScheduleMerge {\n            score,\n            merge_operation,\n            split_downloader_mailbox,\n        } = schedule_merge;\n        let merge_id = self.next_merge_id;\n        self.next_merge_id += 1;\n        let scheduled_merge = ScheduledMerge {\n            score,\n            id: merge_id,\n            merge_operation,\n            split_downloader_mailbox,\n        };\n        self.pending_merge_bytes += scheduled_merge.merge_operation.total_num_bytes();\n        self.pending_merge_queue.push(scheduled_merge);\n        crate::metrics::INDEXER_METRICS\n            .pending_merge_operations\n            .set(self.pending_merge_queue.len() as i64);\n        crate::metrics::INDEXER_METRICS\n            .pending_merge_bytes\n            .set(self.pending_merge_bytes as i64);\n        self.schedule_pending_merges(ctx);\n        Ok(())\n    }\n}\n\n#[derive(Debug)]\nstruct PermitReleased;\n\n#[async_trait]\nimpl Handler<PermitReleased> for MergeSchedulerService {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: PermitReleased,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.schedule_pending_merges(ctx);\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod 
tests {\n    use std::time::Duration;\n\n    use quickwit_actors::Universe;\n    use quickwit_metastore::SplitMetadata;\n    use tantivy::Inventory;\n    use tokio::time::timeout;\n\n    use super::*;\n    use crate::merge_policy::{MergeOperation, MergeTask};\n\n    fn build_merge_operation(num_splits: usize, num_bytes_per_split: u64) -> MergeOperation {\n        let splits: Vec<SplitMetadata> = std::iter::repeat_with(|| SplitMetadata {\n            footer_offsets: num_bytes_per_split..num_bytes_per_split,\n            ..Default::default()\n        })\n        .take(num_splits)\n        .collect();\n        MergeOperation::new_merge_operation(splits)\n    }\n\n    #[test]\n    fn test_score_merge_operation() {\n        let score_merge_operation_aux = |num_splits, num_bytes_per_split| {\n            let merge_operation = build_merge_operation(num_splits, num_bytes_per_split);\n            score_merge_operation(&merge_operation)\n        };\n        assert!(score_merge_operation_aux(10, 10_000_000) < score_merge_operation_aux(10, 999_999));\n        assert!(\n            score_merge_operation_aux(10, 10_000_000) > score_merge_operation_aux(9, 10_000_000)\n        );\n        assert_eq!(\n            // 9 - 1 = 8 splits removed.\n            score_merge_operation_aux(9, 10_000_000),\n            // 5 - 1  = 4 splits removed.\n            score_merge_operation_aux(5, 10_000_000 * 9 / 10)\n        );\n    }\n\n    #[tokio::test]\n    async fn test_merge_schedule_service_prioritize() {\n        let universe = Universe::new();\n        let (merge_scheduler_service, _) = universe\n            .spawn_builder()\n            .spawn(MergeSchedulerService::new(2));\n        let inventory = Inventory::new();\n\n        let (merge_split_downloader_mailbox, merge_split_downloader_inbox) =\n            universe.create_test_mailbox();\n        {\n            let large_merge_operation = build_merge_operation(10, 4_000_000);\n            let tracked_large_merge_operation = 
inventory.track(large_merge_operation);\n            schedule_merge(\n                &merge_scheduler_service,\n                tracked_large_merge_operation,\n                merge_split_downloader_mailbox.clone(),\n            )\n            .await\n            .unwrap();\n        }\n        {\n            let large_merge_operation2 = build_merge_operation(10, 3_000_000);\n            let tracked_large_merge_operation2 = inventory.track(large_merge_operation2);\n            schedule_merge(\n                &merge_scheduler_service,\n                tracked_large_merge_operation2,\n                merge_split_downloader_mailbox.clone(),\n            )\n            .await\n            .unwrap();\n        }\n        {\n            let large_merge_operation2 = build_merge_operation(10, 5_000_000);\n            let tracked_large_merge_operation2 = inventory.track(large_merge_operation2);\n            schedule_merge(\n                &merge_scheduler_service,\n                tracked_large_merge_operation2,\n                merge_split_downloader_mailbox.clone(),\n            )\n            .await\n            .unwrap();\n        }\n        {\n            let large_merge_operation2 = build_merge_operation(10, 2_000_000);\n            let tracked_large_merge_operation2 = inventory.track(large_merge_operation2);\n            schedule_merge(\n                &merge_scheduler_service,\n                tracked_large_merge_operation2,\n                merge_split_downloader_mailbox.clone(),\n            )\n            .await\n            .unwrap();\n        }\n        {\n            let large_merge_operation2 = build_merge_operation(10, 1_000_000);\n            let tracked_large_merge_operation2 = inventory.track(large_merge_operation2);\n            schedule_merge(\n                &merge_scheduler_service,\n                tracked_large_merge_operation2,\n                merge_split_downloader_mailbox.clone(),\n            )\n            .await\n            .unwrap();\n   
     }\n        {\n            let merge_task: MergeTask = merge_split_downloader_inbox\n                .recv_typed_message::<MergeTask>()\n                .await\n                .unwrap();\n            assert_eq!(\n                merge_task.merge_operation.splits[0].footer_offsets.end,\n                4_000_000\n            );\n            let merge_task2: MergeTask = merge_split_downloader_inbox\n                .recv_typed_message::<MergeTask>()\n                .await\n                .unwrap();\n            assert_eq!(\n                merge_task2.merge_operation.splits[0].footer_offsets.end,\n                3_000_000\n            );\n            assert!(\n                timeout(\n                    Duration::from_millis(200),\n                    merge_split_downloader_inbox.recv_typed_message::<MergeTask>()\n                )\n                .await\n                .is_err()\n            );\n        }\n        {\n            let merge_task: MergeTask = merge_split_downloader_inbox\n                .recv_typed_message::<MergeTask>()\n                .await\n                .unwrap();\n            assert_eq!(\n                merge_task.merge_operation.splits[0].footer_offsets.end,\n                1_000_000\n            );\n        }\n        {\n            let merge_task: MergeTask = merge_split_downloader_inbox\n                .recv_typed_message::<MergeTask>()\n                .await\n                .unwrap();\n            assert_eq!(\n                merge_task.merge_operation.splits[0].footer_offsets.end,\n                2_000_000\n            );\n        }\n        {\n            let merge_task: MergeTask = merge_split_downloader_inbox\n                .recv_typed_message::<MergeTask>()\n                .await\n                .unwrap();\n            assert_eq!(\n                merge_task.merge_operation.splits[0].footer_offsets.end,\n                5_000_000\n            );\n        }\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/merge_split_downloader.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::Path;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::io::IoControls;\nuse quickwit_common::temp_dir::{self, TempDirectory};\nuse quickwit_metastore::SplitMetadata;\nuse tantivy::Directory;\nuse tracing::{debug, info, instrument};\n\nuse super::MergeExecutor;\nuse crate::merge_policy::MergeTask;\nuse crate::models::MergeScratch;\nuse crate::split_store::IndexingSplitStore;\n\n#[derive(Clone)]\npub struct MergeSplitDownloader {\n    pub scratch_directory: TempDirectory,\n    pub split_store: IndexingSplitStore,\n    pub executor_mailbox: Mailbox<MergeExecutor>,\n    pub io_controls: IoControls,\n}\n\nimpl Actor for MergeSplitDownloader {\n    type ObservableState = ();\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Unbounded\n    }\n\n    fn name(&self) -> String {\n        \"MergeSplitDownloader\".to_string()\n    }\n}\n\n#[async_trait]\nimpl Handler<MergeTask> for MergeSplitDownloader {\n    type Reply = ();\n\n    #[instrument(\n        name = \"merge_split_downloader\",\n        parent = merge_task.merge_parent_span.id(),\n        skip_all,\n    )]\n    async fn handle(\n        &mut self,\n        merge_task: MergeTask,\n        ctx: 
&ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        let merge_scratch_directory = temp_dir::Builder::default()\n            .join(\"merge\")\n            .tempdir_in(self.scratch_directory.path())\n            .map_err(|error| anyhow::anyhow!(error))?;\n        info!(dir=%merge_scratch_directory.path().display(), \"download-merge-splits\");\n        let downloaded_splits_directory = temp_dir::Builder::default()\n            .join(\"downloaded-splits\")\n            .tempdir_in(merge_scratch_directory.path())\n            .map_err(|error| anyhow::anyhow!(error))?;\n        let tantivy_dirs = self\n            .download_splits(\n                merge_task.splits_as_slice(),\n                downloaded_splits_directory.path(),\n                ctx,\n            )\n            .await?;\n        let msg = MergeScratch {\n            merge_task,\n            merge_scratch_directory,\n            downloaded_splits_directory,\n            tantivy_dirs,\n        };\n        ctx.send_message(&self.executor_mailbox, msg).await?;\n        Ok(())\n    }\n}\n\nimpl MergeSplitDownloader {\n    async fn download_splits(\n        &self,\n        splits: &[SplitMetadata],\n        download_directory: &Path,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Vec<Box<dyn Directory>>, quickwit_actors::ActorExitStatus> {\n        // we download all of the split files in the scratch directory.\n        let mut tantivy_dirs = Vec::new();\n        for split in splits {\n            if ctx.kill_switch().is_dead() {\n                debug!(\n                    split_id = split.split_id(),\n                    \"Kill switch was activated. 
Cancelling download.\"\n                );\n                return Err(ActorExitStatus::Killed);\n            }\n            let io_controls = self\n                .io_controls\n                .clone()\n                .set_progress(ctx.progress().clone())\n                .set_kill_switch(ctx.kill_switch().clone());\n            let _protect_guard = ctx.protect_zone();\n            let tantivy_dir = self\n                .split_store\n                .fetch_and_open_split(split.split_id(), download_directory, &io_controls)\n                .await\n                .map_err(|error| {\n                    let split_id = split.split_id();\n                    anyhow::anyhow!(error).context(format!(\"failed to download split `{split_id}`\"))\n                })?;\n            tantivy_dirs.push(tantivy_dir);\n        }\n        Ok(tantivy_dirs)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::iter;\n    use std::sync::Arc;\n\n    use quickwit_actors::Universe;\n    use quickwit_common::split_file;\n    use quickwit_storage::{PutPayload, RamStorageBuilder, SplitPayloadBuilder};\n\n    use super::*;\n    use crate::merge_policy::MergeOperation;\n    use crate::new_split_id;\n\n    #[tokio::test]\n    async fn test_merge_split_downloader() -> anyhow::Result<()> {\n        let scratch_directory = TempDirectory::for_test();\n        let splits_to_merge: Vec<SplitMetadata> = iter::repeat_with(|| {\n            let split_id = new_split_id();\n            SplitMetadata {\n                split_id,\n                ..Default::default()\n            }\n        })\n        .take(10)\n        .collect();\n\n        let split_store = {\n            let mut storage_builder = RamStorageBuilder::default();\n            for split in &splits_to_merge {\n                let buffer = SplitPayloadBuilder::get_split_payload(&[], &[], &[1, 2, 3])?\n                    .read_all()\n                    .await?;\n                storage_builder = 
storage_builder.put(&split_file(split.split_id()), &buffer);\n            }\n            let ram_storage = storage_builder.build();\n            IndexingSplitStore::create_without_local_store_for_test(Arc::new(ram_storage))\n        };\n\n        let universe = Universe::with_accelerated_time();\n        let (merge_executor_mailbox, merge_executor_inbox) = universe.create_test_mailbox();\n        let merge_split_downloader = MergeSplitDownloader {\n            scratch_directory,\n            split_store,\n            executor_mailbox: merge_executor_mailbox,\n            io_controls: IoControls::default(),\n        };\n        let (merge_split_downloader_mailbox, merge_split_downloader_handler) =\n            universe.spawn_builder().spawn(merge_split_downloader);\n        let merge_operation: MergeOperation = MergeOperation::new_merge_operation(splits_to_merge);\n        let merge_task = MergeTask::from_merge_operation_for_test(merge_operation);\n        merge_split_downloader_mailbox\n            .send_message(merge_task)\n            .await?;\n        merge_split_downloader_handler\n            .process_pending_and_observe()\n            .await;\n        let merge_scratches = merge_executor_inbox.drain_for_test();\n        assert_eq!(merge_scratches.len(), 1);\n        let merge_scratch = merge_scratches\n            .into_iter()\n            .next()\n            .unwrap()\n            .downcast::<MergeScratch>()\n            .unwrap();\n        assert_eq!(merge_scratch.merge_task.splits_as_slice().len(), 10);\n        for split in merge_scratch.merge_task.splits_as_slice() {\n            let split_filename = split_file(split.split_id());\n            let split_filepath = merge_scratch\n                .downloaded_splits_directory\n                .path()\n                .join(split_filename);\n            assert!(split_filepath.try_exists().unwrap());\n        }\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod cooperative_indexing;\nmod doc_processor;\nmod index_serializer;\nmod indexer;\nmod indexing_pipeline;\nmod indexing_service;\nmod merge_executor;\nmod merge_pipeline;\nmod merge_planner;\nmod merge_scheduler_service;\nmod merge_split_downloader;\nmod packager;\nmod publisher;\nmod sequencer;\nmod uploader;\n#[cfg(feature = \"vrl\")]\nmod vrl_processing;\n\npub use doc_processor::{DocProcessor, DocProcessorCounters};\npub use index_serializer::IndexSerializer;\npub use indexer::{Indexer, IndexerCounters};\npub use indexing_pipeline::{IndexingPipeline, IndexingPipelineParams};\npub use indexing_service::{INDEXING_DIR_NAME, IndexingService, IndexingServiceCounters};\npub use merge_executor::{MergeExecutor, combine_partition_ids, merge_split_attrs};\npub use merge_pipeline::{FinishPendingMergesAndShutdownPipeline, MergePipeline};\npub(crate) use merge_planner::{MergePlanner, RunFinalizeMergePolicyAndQuit};\npub use merge_scheduler_service::{MergePermit, MergeSchedulerService, schedule_merge};\npub use merge_split_downloader::MergeSplitDownloader;\npub use packager::Packager;\npub use publisher::{Publisher, PublisherCounters, PublisherType};\npub use quickwit_proto::indexing::IndexingError;\npub use sequencer::Sequencer;\npub use uploader::{SplitsUpdateMailbox, Uploader, UploaderCounters, UploaderType};\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/packager.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::io;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse anyhow::{Context, bail};\nuse async_trait::async_trait;\nuse fail::fail_point;\nuse itertools::Itertools;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::runtimes::RuntimeType;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_directories::write_hotcache;\nuse quickwit_doc_mapper::NamedField;\nuse quickwit_doc_mapper::tag_pruning::append_to_tag_set;\nuse quickwit_proto::search::{\n    ListFieldType, ListFields, ListFieldsEntryResponse, serialize_split_fields,\n};\nuse tantivy::index::FieldMetadata;\nuse tantivy::schema::{FieldType, Type};\nuse tantivy::{InvertedIndexReader, ReloadPolicy, SegmentMeta};\nuse tokio::runtime::Handle;\nuse tracing::{debug, info, instrument, warn};\n\n/// Maximum distinct values allowed for a tag field within a split.\nconst MAX_VALUES_PER_TAG_FIELD: usize = if cfg!(any(test, feature = \"testsuite\")) {\n    6\n} else {\n    1000\n};\n\nuse crate::actors::Uploader;\nuse crate::models::{\n    EmptySplit, IndexedSplit, IndexedSplitBatch, PackagedSplit, PackagedSplitBatch,\n};\n\n/// The role of the packager is to get an index writer and\n/// produce a split file.\n///\n/// This includes the following steps:\n/// - commit: this step is 
CPU heavy\n/// - identifying the list of tags for the splits, and labelling it accordingly\n/// - creating a bundle file\n/// - computing the hotcache\n/// - appending it to the split file.\n///\n/// The split format is described in `internals/split-format.md`\n#[derive(Clone)]\npub struct Packager {\n    actor_name: &'static str,\n    uploader_mailbox: Mailbox<Uploader>,\n    /// List of tag fields ([`Vec<NamedField>`]) defined in the index config.\n    tag_fields: Vec<NamedField>,\n}\n\nimpl Packager {\n    pub fn new(\n        actor_name: &'static str,\n        tag_fields: Vec<NamedField>,\n        uploader_mailbox: Mailbox<Uploader>,\n    ) -> Packager {\n        Packager {\n            actor_name,\n            uploader_mailbox,\n            tag_fields,\n        }\n    }\n\n    pub async fn process_indexed_split(\n        &self,\n        split: IndexedSplit,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<PackagedSplit> {\n        let segment_metas = split.index.searchable_segment_metas()?;\n        assert_eq!(segment_metas.len(), 1);\n        let packaged_split =\n            create_packaged_split(&segment_metas[..], split, &self.tag_fields, ctx)?;\n        Ok(packaged_split)\n    }\n}\n\n#[async_trait]\nimpl Actor for Packager {\n    type ObservableState = ();\n\n    #[allow(clippy::unused_unit)]\n    fn observable_state(&self) -> Self::ObservableState {\n        ()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(1)\n    }\n\n    fn name(&self) -> String {\n        self.actor_name.to_string()\n    }\n\n    fn runtime_handle(&self) -> Handle {\n        RuntimeType::Blocking.get_runtime_handle()\n    }\n}\n\n#[async_trait]\nimpl Handler<IndexedSplitBatch> for Packager {\n    type Reply = ();\n\n    #[instrument(level = \"info\", name = \"packager\", parent=batch.batch_parent_span.id(), skip_all)]\n    async fn handle(\n        &mut self,\n        batch: IndexedSplitBatch,\n        ctx: &ActorContext<Self>,\n 
   ) -> Result<(), ActorExitStatus> {\n        let split_ids: Vec<String> = batch\n            .splits\n            .iter()\n            .map(|split| split.split_id().to_string())\n            .collect_vec();\n        debug!(\n            split_ids=?split_ids,\n            \"start-packaging-splits\"\n        );\n        fail_point!(\"packager:before\");\n        let mut packaged_splits = Vec::with_capacity(batch.splits.len());\n        for split in batch.splits {\n            if batch.publish_lock.is_dead() {\n                // TODO: Remove the junk right away?\n                info!(\n                    split_ids=?split_ids,\n                    \"Splits' publish lock is dead.\"\n                );\n                return Ok(());\n            }\n            let packaged_split = self.process_indexed_split(split, ctx).await?;\n            packaged_splits.push(packaged_split);\n        }\n        ctx.send_message(\n            &self.uploader_mailbox,\n            PackagedSplitBatch::new(\n                packaged_splits,\n                batch.checkpoint_delta_opt,\n                batch.publish_lock,\n                batch.publish_token_opt,\n                batch.merge_task_opt,\n                batch.batch_parent_span,\n            ),\n        )\n        .await?;\n        fail_point!(\"packager:after\");\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<EmptySplit> for Packager {\n    type Reply = ();\n\n    #[instrument(\n        name=\"package_empty_batch\",\n        parent=empty_split.batch_parent_span.id(),\n        skip_all,\n    )]\n    async fn handle(\n        &mut self,\n        empty_split: EmptySplit,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        ctx.send_message(&self.uploader_mailbox, empty_split)\n            .await?;\n        Ok(())\n    }\n}\n\nfn list_split_files(\n    segment_metas: &[SegmentMeta],\n    scratch_directory: &TempDirectory,\n) -> io::Result<Vec<PathBuf>> {\n    let mut index_files = 
vec![scratch_directory.path().join(\"meta.json\")];\n\n    // list the segment files\n    for segment_meta in segment_metas {\n        for relative_path in segment_meta.list_files() {\n            let filepath = scratch_directory.path().join(relative_path);\n            if filepath.try_exists()? {\n                // If the file is missing, this is fine.\n                // segment_meta.list_files() may actually return files that\n                // do not exist.\n                index_files.push(filepath);\n            }\n        }\n    }\n    index_files.sort();\n    Ok(index_files)\n}\n\nfn build_hotcache<W: io::Write>(split_path: &Path, out: &mut W) -> anyhow::Result<()> {\n    let mmap_directory = tantivy::directory::MmapDirectory::open(split_path)?;\n    write_hotcache(mmap_directory, out)?;\n    Ok(())\n}\n\n/// Attempts to exhaustively extract the list of terms in a\n/// field term dictionary.\n///\n/// Returns an error if:\n/// - the number of terms exceeds `max_terms`,\n/// - some of the terms are not valid UTF-8,\n/// - the field is a `bytes` field,\n/// - any other error occurs.\n///\n/// Returning an error may hurt split pruning and affect performance,\n/// but it does not affect the validity of Quickwit's results.\nfn try_extract_terms(\n    named_field: &NamedField,\n    inv_indexes: &[Arc<InvertedIndexReader>],\n    max_terms: usize,\n) -> anyhow::Result<Vec<String>> {\n    let num_terms = inv_indexes\n        .iter()\n        .map(|inv_index| inv_index.terms().num_terms())\n        .sum::<usize>();\n    if num_terms > max_terms {\n        bail!(\n            \"number of unique terms for tag field {} > {}\",\n            named_field.name,\n            max_terms\n        );\n    }\n    let mut terms = Vec::with_capacity(num_terms);\n    for inv_index in inv_indexes {\n        let mut terms_streamer = inv_index.terms().stream()?;\n        while let Some((term_data, _)) = terms_streamer.next() {\n            let term = match named_field.field_type {\n                FieldType::U64(_) => 
u64_from_term_data(term_data)?.to_string(),\n                FieldType::I64(_) => {\n                    tantivy::u64_to_i64(u64_from_term_data(term_data)?).to_string()\n                }\n                FieldType::F64(_) => {\n                    tantivy::u64_to_f64(u64_from_term_data(term_data)?).to_string()\n                }\n                FieldType::Bool(_) => match u64_from_term_data(term_data)? {\n                    0 => false,\n                    1 => true,\n                    _ => bail!(\"invalid boolean value\"),\n                }\n                .to_string(),\n                FieldType::Bytes(_) => {\n                    bail!(\"tags collection is not allowed on `bytes` fields\")\n                }\n                _ => std::str::from_utf8(term_data)?.to_string(),\n            };\n            terms.push(term);\n        }\n    }\n    Ok(terms)\n}\n\nfn create_packaged_split(\n    segment_metas: &[SegmentMeta],\n    split: IndexedSplit,\n    tag_fields: &[NamedField],\n    ctx: &ActorContext<Packager>,\n) -> anyhow::Result<PackagedSplit> {\n    debug!(split_id = split.split_id(), \"create-packaged-split\");\n    let split_files = list_split_files(segment_metas, &split.split_scratch_directory)?;\n\n    // Extracts tag values from inverted indexes only when a field cardinality is less\n    // than `MAX_VALUES_PER_TAG_FIELD`.\n    debug!(split_id = split.split_id(), tag_fields =? 
tag_fields, \"extract-tags-values\");\n    let index_reader = split\n        .index\n        .reader_builder()\n        .reload_policy(ReloadPolicy::Manual)\n        .try_into()?;\n\n    let fields_metadata = split.index.fields_metadata()?;\n\n    let mut tags = BTreeSet::default();\n    for named_field in tag_fields {\n        let inverted_indexes = index_reader\n            .searcher()\n            .segment_readers()\n            .iter()\n            .map(|segment| segment.inverted_index(named_field.field))\n            .collect::<Result<Vec<_>, _>>()?;\n\n        match try_extract_terms(named_field, &inverted_indexes, MAX_VALUES_PER_TAG_FIELD) {\n            Ok(terms) => {\n                append_to_tag_set(&named_field.name, &terms, &mut tags);\n            }\n            Err(tag_extraction_error) => {\n                warn!(err=?tag_extraction_error,  \"no field values will be registered in the split metadata\");\n            }\n        }\n    }\n\n    ctx.record_progress();\n\n    debug!(split_id = split.split_id(), \"build-hotcache\");\n    let mut hotcache_bytes = Vec::new();\n    build_hotcache(split.split_scratch_directory.path(), &mut hotcache_bytes)?;\n    ctx.record_progress();\n\n    let serialized_split_fields = serialize_field_metadata(&fields_metadata);\n\n    let packaged_split = PackagedSplit {\n        serialized_split_fields,\n        split_attrs: split.split_attrs,\n        split_scratch_directory: split.split_scratch_directory,\n        tags,\n        split_files,\n        hotcache_bytes,\n    };\n    Ok(packaged_split)\n}\n\n/// Serializes the Split fields.\n///\n/// `fields_metadata` has to be sorted.\nfn serialize_field_metadata(fields_metadata: &[FieldMetadata]) -> Vec<u8> {\n    let fields = fields_metadata\n        .iter()\n        .map(field_metadata_to_list_field_serialized)\n        .collect::<Vec<_>>();\n\n    serialize_split_fields(ListFields { fields })\n}\n\nfn tantivy_type_to_list_field_type(typ: Type) -> ListFieldType {\n    
match typ {\n        Type::Str => ListFieldType::Str,\n        Type::U64 => ListFieldType::U64,\n        Type::I64 => ListFieldType::I64,\n        Type::F64 => ListFieldType::F64,\n        Type::Bool => ListFieldType::Bool,\n        Type::Date => ListFieldType::Date,\n        Type::Facet => ListFieldType::Facet,\n        Type::Bytes => ListFieldType::Bytes,\n        Type::Json => ListFieldType::Json,\n        Type::IpAddr => ListFieldType::IpAddr,\n    }\n}\n\nfn field_metadata_to_list_field_serialized(\n    field_metadata: &FieldMetadata,\n) -> ListFieldsEntryResponse {\n    ListFieldsEntryResponse {\n        field_name: field_metadata.field_name.to_string(),\n        field_type: tantivy_type_to_list_field_type(field_metadata.typ) as i32,\n        searchable: field_metadata.is_indexed(),\n        aggregatable: field_metadata.is_fast(),\n        index_ids: Vec::new(),\n        non_searchable_index_ids: Vec::new(),\n        non_aggregatable_index_ids: Vec::new(),\n    }\n}\n\n/// Reads u64 from stored term data.\nfn u64_from_term_data(data: &[u8]) -> anyhow::Result<u64> {\n    let u64_bytes: [u8; 8] = data[0..8]\n        .try_into()\n        .context(\"could not interpret term bytes as u64\")?;\n    Ok(u64::from_be_bytes(u64_bytes))\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::RangeInclusive;\n\n    use quickwit_actors::{ObservationType, Universe};\n    use quickwit_metastore::checkpoint::IndexCheckpointDelta;\n    use quickwit_proto::search::{ListFieldsEntryResponse, deserialize_split_fields};\n    use quickwit_proto::types::{DocMappingUid, IndexUid, NodeId};\n    use tantivy::directory::MmapDirectory;\n    use tantivy::schema::{FAST, NumericOptions, STRING, Schema, TEXT, Type};\n    use tantivy::{DateTime, IndexBuilder, IndexSettings, doc};\n    use tracing::Span;\n\n    use super::*;\n    use crate::models::{PublishLock, SplitAttrs};\n\n    #[test]\n    fn serialize_field_metadata_test() {\n        let fields_metadata = vec![\n            FieldMetadata {\n  
              field_name: \"test\".to_string(),\n                typ: Type::Str,\n                stored: true,\n                fast_size: Some(10u64.into()),\n                term_dictionary_size: Some(10u64.into()),\n                postings_size: Some(10u64.into()),\n                positions_size: Some(10u64.into()),\n            },\n            FieldMetadata {\n                field_name: \"test2\".to_string(),\n                typ: Type::Str,\n                stored: false,\n                fast_size: None,\n                term_dictionary_size: Some(10u64.into()),\n                postings_size: Some(10u64.into()),\n                positions_size: Some(10u64.into()),\n            },\n            FieldMetadata {\n                field_name: \"test3\".to_string(),\n                typ: Type::U64,\n                stored: false,\n                fast_size: Some(10u64.into()),\n                term_dictionary_size: Some(10u64.into()),\n                postings_size: Some(10u64.into()),\n                positions_size: Some(10u64.into()),\n            },\n        ];\n\n        let out = serialize_field_metadata(&fields_metadata);\n\n        let deserialized: Vec<ListFieldsEntryResponse> =\n            deserialize_split_fields(&mut &out[..]).unwrap().fields;\n\n        assert_eq!(fields_metadata.len(), deserialized.len());\n        assert_eq!(deserialized[0].field_name, \"test\");\n        assert_eq!(deserialized[0].field_type, ListFieldType::Str as i32);\n        assert!(deserialized[0].searchable);\n        assert!(deserialized[0].aggregatable);\n\n        assert_eq!(deserialized[1].field_name, \"test2\");\n        assert_eq!(deserialized[1].field_type, ListFieldType::Str as i32);\n        assert!(deserialized[1].searchable);\n        assert!(!deserialized[1].aggregatable);\n\n        assert_eq!(deserialized[2].field_name, \"test3\");\n        assert_eq!(deserialized[2].field_type, ListFieldType::U64 as i32);\n        assert!(deserialized[2].searchable);\n      
  assert!(deserialized[2].aggregatable);\n    }\n\n    fn make_indexed_split_for_test(\n        segment_timestamps: &[DateTime],\n    ) -> anyhow::Result<IndexedSplit> {\n        let split_scratch_directory = TempDirectory::for_test();\n        let mut schema_builder = Schema::builder();\n        let text_field = schema_builder.add_text_field(\"text\", TEXT);\n        let timestamp_field = schema_builder.add_u64_field(\"timestamp\", FAST);\n        let tag_str = schema_builder.add_text_field(\"tag_str\", STRING);\n        let tag_many = schema_builder.add_text_field(\"tag_many\", STRING);\n        let tag_u64 =\n            schema_builder.add_u64_field(\"tag_u64\", NumericOptions::default().set_indexed());\n        let tag_i64 =\n            schema_builder.add_i64_field(\"tag_i64\", NumericOptions::default().set_indexed());\n        let tag_f64 =\n            schema_builder.add_f64_field(\"tag_f64\", NumericOptions::default().set_indexed());\n        let tag_bool =\n            schema_builder.add_bool_field(\"tag_bool\", NumericOptions::default().set_indexed());\n        let schema = schema_builder.build();\n        let index_builder = IndexBuilder::new()\n            .settings(IndexSettings::default())\n            .schema(schema)\n            .tokenizers(\n                quickwit_query::create_default_quickwit_tokenizer_manager()\n                    .tantivy_manager()\n                    .clone(),\n            )\n            .fast_field_tokenizers(\n                quickwit_query::get_quickwit_fastfield_normalizer_manager()\n                    .tantivy_manager()\n                    .clone(),\n            );\n        let index_directory = MmapDirectory::open(split_scratch_directory.path())?;\n        let mut index_writer =\n            index_builder.single_segment_index_writer(index_directory, 100_000_000)?;\n        let mut timerange_opt: Option<RangeInclusive<DateTime>> = None;\n        let mut num_docs = 0;\n        for &timestamp in segment_timestamps {\n 
           for num in 1..10 {\n                let doc = doc!(\n                    text_field => format!(\"timestamp is {timestamp:?}\"),\n                    timestamp_field => timestamp,\n                    tag_str => \"value\",\n                    tag_many => format!(\"many-{num}\"),\n                    tag_u64 => 42u64,\n                    tag_i64 => -42i64,\n                    tag_f64 => -42.02f64,\n                    tag_bool => true,\n                );\n                index_writer.add_document(doc)?;\n                num_docs += 1;\n                timerange_opt = Some(\n                    timerange_opt\n                        .map(|timestamp_range| {\n                            let start = timestamp.min(*timestamp_range.start());\n                            let end = timestamp.max(*timestamp_range.end());\n                            RangeInclusive::new(start, end)\n                        })\n                        .unwrap_or_else(|| RangeInclusive::new(timestamp, timestamp)),\n                )\n            }\n        }\n        let index = index_writer.finalize()?;\n\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n\n        // TODO: In the future we would like that kind of segment flush to emit a new split,\n        // but this will require work on tantivy.\n        let indexed_split = IndexedSplit {\n            split_attrs: SplitAttrs {\n                node_id,\n                index_uid,\n                source_id,\n                doc_mapping_uid: DocMappingUid::default(),\n                split_id: \"test-split\".to_string(),\n                partition_id: 17u64,\n                num_docs,\n                uncompressed_docs_size_in_bytes: num_docs * 15,\n                time_range: timerange_opt,\n                replaced_split_ids: Vec::new(),\n                delete_opstamp: 0,\n                
num_merge_ops: 0,\n            },\n            index,\n            split_scratch_directory,\n            controlled_directory_opt: None,\n        };\n        Ok(indexed_split)\n    }\n\n    fn get_tag_fields(schema: Schema, field_names: &[&str]) -> Vec<NamedField> {\n        field_names\n            .iter()\n            .map(|field_name| {\n                let field = schema.get_field(field_name).unwrap();\n                let field_type = schema.get_field_entry(field).field_type().clone();\n                NamedField {\n                    name: field_name.to_string(),\n                    field,\n                    field_type,\n                }\n            })\n            .collect()\n    }\n\n    #[tokio::test]\n    async fn test_packager_simple() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, inbox) = universe.create_test_mailbox();\n        let indexed_split = make_indexed_split_for_test(&[\n            DateTime::from_timestamp_secs(1628203589),\n            DateTime::from_timestamp_secs(1628203640),\n        ])?;\n        let tag_fields = get_tag_fields(\n            indexed_split.index.schema(),\n            &[\n                \"tag_str\", \"tag_many\", \"tag_u64\", \"tag_i64\", \"tag_f64\", \"tag_bool\",\n            ],\n        );\n        let packager = Packager::new(\"TestPackager\", tag_fields, mailbox);\n        let (packager_mailbox, packager_handle) = universe.spawn_builder().spawn(packager);\n        packager_mailbox\n            .send_message(IndexedSplitBatch {\n                splits: vec![indexed_split],\n                checkpoint_delta_opt: IndexCheckpointDelta::for_test(\"source_id\", 10..20).into(),\n                publish_lock: PublishLock::default(),\n                publish_token_opt: None,\n                merge_task_opt: None,\n                batch_parent_span: Span::none(),\n            })\n            .await?;\n        
assert_eq!(\n            packager_handle.process_pending_and_observe().await.obs_type,\n            ObservationType::Alive\n        );\n        let packaged_splits = inbox.drain_for_test();\n        assert_eq!(packaged_splits.len(), 1);\n        let packaged_split = packaged_splits[0]\n            .downcast_ref::<PackagedSplitBatch>()\n            .unwrap();\n        let split = &packaged_split.splits[0];\n        assert_eq!(\n            &split.tags.iter().map(|s| s.as_str()).collect::<Vec<&str>>(),\n            &[\n                \"tag_bool!\",\n                \"tag_bool:true\",\n                \"tag_f64!\",\n                \"tag_f64:-42.02\",\n                \"tag_i64!\",\n                \"tag_i64:-42\",\n                \"tag_str!\",\n                \"tag_str:value\",\n                \"tag_u64!\",\n                \"tag_u64:42\"\n            ]\n        );\n        assert_eq!(\n            split.split_attrs.time_range,\n            Some(\n                DateTime::from_timestamp_secs(1628203589)\n                    ..=DateTime::from_timestamp_secs(1628203640)\n            )\n        );\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/publisher.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse fail::fail_point;\nuse quickwit_actors::{Actor, ActorContext, Handler, Mailbox, QueueCapacity};\nuse quickwit_proto::metastore::{MetastoreService, MetastoreServiceClient, PublishSplitsRequest};\nuse serde::Serialize;\nuse tracing::{info, instrument, warn};\n\nuse crate::actors::MergePlanner;\nuse crate::models::{NewSplits, SplitsUpdate};\nuse crate::source::{SourceActor, SuggestTruncate};\n\n#[derive(Clone, Debug, Default, Serialize)]\npub struct PublisherCounters {\n    pub num_published_splits: u64,\n    pub num_replace_operations: u64,\n    pub num_empty_splits: u64,\n}\n\n#[derive(Clone, Copy, Debug)]\npub enum PublisherType {\n    MainPublisher,\n    MergePublisher,\n}\n\nimpl PublisherType {\n    pub fn actor_name(&self) -> &'static str {\n        match self {\n            PublisherType::MainPublisher => \"Publisher\",\n            PublisherType::MergePublisher => \"MergePublisher\",\n        }\n    }\n}\n\n/// Disconnect the merge planner loop back.\n/// This message is used to cut the merge pipeline loop, and let it terminate.\n#[derive(Debug)]\npub(crate) struct DisconnectMergePlanner;\n\n#[derive(Clone)]\npub struct Publisher {\n    publisher_type: PublisherType,\n    metastore: MetastoreServiceClient,\n    merge_planner_mailbox_opt: Option<Mailbox<MergePlanner>>,\n    
source_mailbox_opt: Option<Mailbox<SourceActor>>,\n    counters: PublisherCounters,\n}\n\nimpl Publisher {\n    pub fn new(\n        publisher_type: PublisherType,\n        metastore: MetastoreServiceClient,\n        merge_planner_mailbox_opt: Option<Mailbox<MergePlanner>>,\n        source_mailbox_opt: Option<Mailbox<SourceActor>>,\n    ) -> Publisher {\n        Publisher {\n            publisher_type,\n            metastore,\n            merge_planner_mailbox_opt,\n            source_mailbox_opt,\n            counters: PublisherCounters::default(),\n        }\n    }\n}\n\n#[async_trait]\nimpl Actor for Publisher {\n    type ObservableState = PublisherCounters;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    fn name(&self) -> String {\n        self.publisher_type.actor_name().to_string()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        match self.publisher_type {\n            PublisherType::MainPublisher => QueueCapacity::Bounded(1),\n            PublisherType::MergePublisher => QueueCapacity::Unbounded,\n        }\n    }\n}\n\n#[async_trait]\nimpl Handler<DisconnectMergePlanner> for Publisher {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: DisconnectMergePlanner,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        info!(\"disconnecting merge planner mailbox\");\n        self.merge_planner_mailbox_opt = None;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<SplitsUpdate> for Publisher {\n    type Reply = ();\n\n    #[instrument(name=\"publisher\", parent=split_update.parent_span.id(),  skip(self, ctx))]\n    async fn handle(\n        &mut self,\n        split_update: SplitsUpdate,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        fail_point!(\"publisher:before\");\n\n        let SplitsUpdate {\n            index_uid,\n            new_splits,\n         
   replaced_split_ids,\n            checkpoint_delta_opt,\n            publish_lock,\n            publish_token_opt,\n            ..\n        } = split_update;\n\n        let index_checkpoint_delta_json_opt = checkpoint_delta_opt\n            .as_ref()\n            .map(serde_json::to_string)\n            .transpose()\n            .context(\"failed to serialize `IndexCheckpointDelta`\")?;\n        let split_ids: Vec<String> = new_splits\n            .iter()\n            .map(|split| split.split_id.clone())\n            .collect();\n        if let Some(_guard) = publish_lock.acquire().await {\n            let publish_splits_request = PublishSplitsRequest {\n                index_uid: Some(index_uid),\n                staged_split_ids: split_ids.clone(),\n                replaced_split_ids: replaced_split_ids.clone(),\n                index_checkpoint_delta_json_opt,\n                publish_token_opt: publish_token_opt.clone(),\n            };\n            ctx.protect_future(self.metastore.publish_splits(publish_splits_request))\n                .await\n                .context(\"failed to publish splits\")?;\n        } else {\n            // TODO: Remove the junk right away?\n            info!(\n                split_ids=?split_ids,\n                \"Splits' publish lock is dead.\"\n            );\n            return Ok(());\n        }\n        info!(\"publish-new-splits\");\n        if let Some(source_mailbox) = self.source_mailbox_opt.as_ref()\n            && let Some(checkpoint) = checkpoint_delta_opt\n        {\n            // We voluntarily do not log anything here.\n            //\n            // Not being able to send the truncation message is a common event and should not be\n            // considered an error. 
For instance, if the source is a\n            // FileSource, it will terminate upon EOF and drop its\n            // mailbox.\n            let suggest_truncate_res = ctx\n                .send_message(\n                    source_mailbox,\n                    SuggestTruncate(checkpoint.source_delta.get_source_checkpoint()),\n                )\n                .await;\n            if let Err(send_truncate_err) = suggest_truncate_res {\n                warn!(error=?send_truncate_err, \"failed to send truncate message from publisher to source\");\n            }\n        }\n\n        if !new_splits.is_empty() {\n            // The merge planner is not necessarily awake and this is not an error.\n            // For instance, when a source reaches its end, and the last \"new\" split\n            // has been packaged, the packager finalizer sends a message to the merge\n            // planner in order to stop it.\n            if let Some(merge_planner_mailbox) = self.merge_planner_mailbox_opt.as_ref() {\n                let _ = ctx\n                    .send_message(merge_planner_mailbox, NewSplits { new_splits })\n                    .await;\n            }\n\n            if replaced_split_ids.is_empty() {\n                self.counters.num_published_splits += 1;\n            } else {\n                self.counters.num_replace_operations += 1;\n            }\n        } else {\n            self.counters.num_empty_splits += 1;\n        }\n        fail_point!(\"publisher:after\");\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_actors::Universe;\n    use quickwit_metastore::checkpoint::{\n        IndexCheckpointDelta, PartitionId, SourceCheckpoint, SourceCheckpointDelta,\n    };\n    use quickwit_metastore::{PublishSplitsRequestExt, SplitMetadata};\n    use quickwit_proto::metastore::{EmptyResponse, MockMetastoreService};\n    use quickwit_proto::types::{IndexUid, Position};\n    use tracing::Span;\n\n    use super::*;\n    use 
crate::models::PublishLock;\n\n    #[tokio::test]\n    async fn test_publisher_publish_operation() {\n        let universe = Universe::with_accelerated_time();\n        let ref_index_uid: IndexUid = IndexUid::for_test(\"index\", 1);\n        let mut mock_metastore = MockMetastoreService::new();\n        let ref_index_uid_clone = ref_index_uid.clone();\n        mock_metastore\n            .expect_publish_splits()\n            .withf(move |publish_splits_request| {\n                let checkpoint_delta: IndexCheckpointDelta = publish_splits_request\n                    .deserialize_index_checkpoint()\n                    .unwrap()\n                    .unwrap();\n                publish_splits_request.index_uid() == &ref_index_uid_clone\n                    && checkpoint_delta.source_id == \"source\"\n                    && publish_splits_request.staged_split_ids[..] == [\"split\"]\n                    && publish_splits_request.replaced_split_ids.is_empty()\n                    && checkpoint_delta.source_delta == SourceCheckpointDelta::from_range(1..3)\n            })\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let (merge_planner_mailbox, merge_planner_inbox) = universe.create_test_mailbox();\n\n        let (source_mailbox, source_inbox) = universe.create_test_mailbox();\n\n        let publisher = Publisher::new(\n            PublisherType::MainPublisher,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            Some(merge_planner_mailbox),\n            Some(source_mailbox),\n        );\n        let (publisher_mailbox, publisher_handle) = universe.spawn_builder().spawn(publisher);\n\n        assert!(\n            publisher_mailbox\n                .send_message(SplitsUpdate {\n                    index_uid: ref_index_uid.clone(),\n                    new_splits: vec![SplitMetadata {\n                        split_id: \"split\".to_string(),\n                        ..Default::default()\n                    
}],\n                    replaced_split_ids: Vec::new(),\n                    checkpoint_delta_opt: Some(IndexCheckpointDelta {\n                        source_id: \"source\".to_string(),\n                        source_delta: SourceCheckpointDelta::from_range(1..3),\n                    }),\n                    publish_lock: PublishLock::default(),\n                    publish_token_opt: None,\n                    merge_task: None,\n                    parent_span: tracing::Span::none(),\n                })\n                .await\n                .is_ok()\n        );\n\n        let publisher_observation = publisher_handle.process_pending_and_observe().await.state;\n        assert_eq!(publisher_observation.num_published_splits, 1);\n\n        let suggest_truncate_checkpoints: Vec<SourceCheckpoint> = source_inbox\n            .drain_for_test_typed::<SuggestTruncate>()\n            .into_iter()\n            .map(|msg| msg.0)\n            .collect();\n\n        assert_eq!(suggest_truncate_checkpoints.len(), 1);\n        assert_eq!(\n            suggest_truncate_checkpoints[0]\n                .position_for_partition(&PartitionId::default())\n                .unwrap(),\n            &Position::offset(2u64)\n        );\n\n        let merger_msgs: Vec<NewSplits> = merge_planner_inbox.drain_for_test_typed::<NewSplits>();\n        assert_eq!(merger_msgs.len(), 1);\n        assert_eq!(merger_msgs[0].new_splits.len(), 1);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_publisher_publish_operation_with_empty_splits() {\n        let universe = Universe::with_accelerated_time();\n        let ref_index_uid: IndexUid = IndexUid::for_test(\"index\", 1);\n        let mut mock_metastore = MockMetastoreService::new();\n        let ref_index_uid_clone = ref_index_uid.clone();\n        mock_metastore\n            .expect_publish_splits()\n            .withf(move |publish_splits_request| {\n                let checkpoint_delta: IndexCheckpointDelta 
= publish_splits_request\n                    .deserialize_index_checkpoint()\n                    .unwrap()\n                    .unwrap();\n                publish_splits_request.index_uid() == &ref_index_uid_clone\n                    && checkpoint_delta.source_id == \"source\"\n                    && publish_splits_request.staged_split_ids.is_empty()\n                    && publish_splits_request.replaced_split_ids.is_empty()\n                    && checkpoint_delta.source_delta == SourceCheckpointDelta::from_range(1..3)\n            })\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let (merge_planner_mailbox, merge_planner_inbox) = universe.create_test_mailbox();\n\n        let (source_mailbox, source_inbox) = universe.create_test_mailbox();\n\n        let publisher = Publisher::new(\n            PublisherType::MainPublisher,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            Some(merge_planner_mailbox),\n            Some(source_mailbox),\n        );\n        let (publisher_mailbox, publisher_handle) = universe.spawn_builder().spawn(publisher);\n\n        assert!(\n            publisher_mailbox\n                .send_message(SplitsUpdate {\n                    index_uid: ref_index_uid.clone(),\n                    new_splits: Vec::new(),\n                    replaced_split_ids: Vec::new(),\n                    checkpoint_delta_opt: Some(IndexCheckpointDelta {\n                        source_id: \"source\".to_string(),\n                        source_delta: SourceCheckpointDelta::from_range(1..3),\n                    }),\n                    publish_lock: PublishLock::default(),\n                    publish_token_opt: None,\n                    merge_task: None,\n                    parent_span: tracing::Span::none(),\n                })\n                .await\n                .is_ok()\n        );\n\n        let publisher_observation = publisher_handle.process_pending_and_observe().await.state;\n   
     assert_eq!(publisher_observation.num_published_splits, 0);\n        assert_eq!(publisher_observation.num_replace_operations, 0);\n        assert_eq!(publisher_observation.num_empty_splits, 1);\n\n        let suggest_truncate_checkpoints: Vec<SourceCheckpoint> = source_inbox\n            .drain_for_test_typed::<SuggestTruncate>()\n            .into_iter()\n            .map(|msg| msg.0)\n            .collect();\n\n        assert_eq!(suggest_truncate_checkpoints.len(), 1);\n        assert_eq!(\n            suggest_truncate_checkpoints[0]\n                .position_for_partition(&PartitionId::default())\n                .unwrap(),\n            &Position::offset(2u64)\n        );\n\n        let merger_msgs: Vec<NewSplits> = merge_planner_inbox.drain_for_test_typed::<NewSplits>();\n        assert_eq!(merger_msgs.len(), 0);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_publisher_replace_operation() {\n        let universe = Universe::with_accelerated_time();\n        let mut mock_metastore = MockMetastoreService::new();\n        let ref_index_uid: IndexUid = IndexUid::for_test(\"index\", 1);\n        let ref_index_uid_clone = ref_index_uid.clone();\n        mock_metastore\n            .expect_publish_splits()\n            .withf(move |publish_splits_requests| {\n                publish_splits_requests.index_uid() == &ref_index_uid_clone\n                    && publish_splits_requests.staged_split_ids[..] == [\"split3\"]\n                    && publish_splits_requests.replaced_split_ids[..] 
== [\"split1\", \"split2\"]\n                    && publish_splits_requests\n                        .index_checkpoint_delta_json_opt()\n                        .is_empty()\n            })\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let (merge_planner_mailbox, merge_planner_inbox) = universe.create_test_mailbox();\n        let publisher = Publisher::new(\n            PublisherType::MainPublisher,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            Some(merge_planner_mailbox),\n            None,\n        );\n        let (publisher_mailbox, publisher_handle) = universe.spawn_builder().spawn(publisher);\n        let publisher_message = SplitsUpdate {\n            index_uid: ref_index_uid.clone(),\n            new_splits: vec![SplitMetadata {\n                split_id: \"split3\".to_string(),\n                ..Default::default()\n            }],\n            replaced_split_ids: vec![\"split1\".to_string(), \"split2\".to_string()],\n            checkpoint_delta_opt: None,\n            publish_lock: PublishLock::default(),\n            publish_token_opt: None,\n            merge_task: None,\n            parent_span: Span::none(),\n        };\n        assert!(\n            publisher_mailbox\n                .send_message(publisher_message)\n                .await\n                .is_ok()\n        );\n        let publisher_observation = publisher_handle.process_pending_and_observe().await.state;\n        assert_eq!(publisher_observation.num_published_splits, 0);\n        assert_eq!(publisher_observation.num_replace_operations, 1);\n        let merge_planner_msgs = merge_planner_inbox.drain_for_test_typed::<NewSplits>();\n        assert_eq!(merge_planner_msgs.len(), 1);\n        assert_eq!(merge_planner_msgs[0].new_splits.len(), 1);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn publisher_acquires_publish_lock() {\n        let universe = Universe::with_accelerated_time();\n  
      let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_publish_splits().never();\n        let (merge_planner_mailbox, merge_planner_inbox) = universe.create_test_mailbox();\n\n        let publisher = Publisher::new(\n            PublisherType::MainPublisher,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            Some(merge_planner_mailbox),\n            None,\n        );\n        let (publisher_mailbox, publisher_handle) = universe.spawn_builder().spawn(publisher);\n\n        let publish_lock = PublishLock::default();\n        publish_lock.kill().await;\n\n        publisher_mailbox\n            .send_message(SplitsUpdate {\n                index_uid: IndexUid::new_with_random_ulid(\"index\"),\n                new_splits: vec![SplitMetadata::for_test(\"test-split\".to_string())],\n                replaced_split_ids: Vec::new(),\n                checkpoint_delta_opt: None,\n                publish_lock,\n                publish_token_opt: None,\n                merge_task: None,\n                parent_span: Span::none(),\n            })\n            .await\n            .unwrap();\n\n        let publisher_observation = publisher_handle.process_pending_and_observe().await.state;\n        assert_eq!(publisher_observation.num_published_splits, 0);\n\n        let merger_messages = merge_planner_inbox.drain_for_test();\n        assert!(merger_messages.is_empty());\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/sequencer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Debug;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse tokio::sync::oneshot;\n\n/// The sequencer serves as a proxy to another actor,\n/// delivering message in a specific order.\n///\n/// Producers of message first \"reserve\" a position in the\n/// queue of message by sending `oneshot::Receiver<Message>` to the `Sequencer`.\n///\n/// The Sequencer then simply resolves these messages and forwards them to the\n/// targeted actor.\n///\n/// It is used by the uploader actor, to run uploads concurrently and yet\n/// ensures that publish message are send in the right order.\npub struct Sequencer<A: Actor> {\n    mailbox: Mailbox<A>,\n}\n\nimpl<A: Actor> Sequencer<A> {\n    pub fn new(mailbox: Mailbox<A>) -> Self {\n        Sequencer { mailbox }\n    }\n}\n\n#[async_trait]\nimpl<A: Actor> Actor for Sequencer<A> {\n    type ObservableState = ();\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(2)\n    }\n\n    fn observable_state(&self) {}\n}\n\n#[derive(Debug)]\npub enum SequencerCommand<T: Debug> {\n    /// Discard position in the sequence.\n    Discard,\n    /// Proceed with the enclosed value.\n    Proceed(T),\n}\n\n#[async_trait]\nimpl<A, M> Handler<oneshot::Receiver<SequencerCommand<M>>> for 
Sequencer<A>\nwhere\n    A: Actor,\n    A: Handler<M>,\n    M: Send + Sync + 'static + std::fmt::Debug,\n{\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: oneshot::Receiver<SequencerCommand<M>>,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let command = ctx\n            .protect_future(message)\n            .await\n            .context(\"failed to receive command from uploader\")?;\n        if let SequencerCommand::Proceed(msg) = command {\n            ctx.send_message(&self.mailbox, msg)\n                .await\n                .context(\"failed to send message to publisher\")?;\n        }\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_actors::Universe;\n\n    use super::*;\n\n    #[derive(Default)]\n    struct SequencerTestActor {\n        messages: Vec<usize>,\n    }\n\n    impl Actor for SequencerTestActor {\n        type ObservableState = Vec<usize>;\n\n        fn observable_state(&self) -> Self::ObservableState {\n            self.messages.clone()\n        }\n    }\n\n    #[async_trait]\n    impl Handler<usize> for SequencerTestActor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            message: usize,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            self.messages.push(message);\n            Ok(())\n        }\n    }\n\n    #[tokio::test]\n    async fn test_sequencer() {\n        let universe = Universe::with_accelerated_time();\n        let test_actor = SequencerTestActor::default();\n        let (test_mailbox, test_handle) = universe.spawn_builder().spawn(test_actor);\n        let sequencer = Sequencer::new(test_mailbox);\n        let (sequencer_mailbox, sequencer_handle) = universe.spawn_builder().spawn(sequencer);\n        // The sequencer has a capacity of 2.\n        // This is the maximum we can do without provoking a deadlock.\n        let (fut_tx_1, fut_rx_1) = 
oneshot::channel();\n        let (fut_tx_2, fut_rx_2) = oneshot::channel();\n        let (fut_tx_3, fut_rx_3) = oneshot::channel();\n        sequencer_mailbox.send_message(fut_rx_1).await.unwrap();\n        sequencer_mailbox.send_message(fut_rx_2).await.unwrap();\n        fut_tx_3.send(SequencerCommand::<usize>::Discard).unwrap();\n        sequencer_mailbox.send_message(fut_rx_3).await.unwrap();\n        fut_tx_2.send(SequencerCommand::Proceed(2)).unwrap();\n        fut_tx_1.send(SequencerCommand::Proceed(1)).unwrap();\n        std::mem::drop(sequencer_mailbox);\n        let (exit_status, last_state) = test_handle.join().await;\n        assert!(matches!(exit_status, ActorExitStatus::Success));\n        assert_eq!(&last_state, &[1, 2]);\n        let (sequencer_exit_status, _) = sequencer_handle.join().await;\n        assert!(matches!(sequencer_exit_status, ActorExitStatus::Success));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/uploader.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::iter::FromIterator;\nuse std::mem;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicU64, Ordering};\n\nuse anyhow::{Context, bail};\nuse async_trait::async_trait;\nuse fail::fail_point;\nuse itertools::Itertools;\nuse once_cell::sync::OnceCell;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::spawn_named_task;\nuse quickwit_config::RetentionPolicy;\nuse quickwit_metastore::checkpoint::IndexCheckpointDelta;\nuse quickwit_metastore::{SplitMetadata, StageSplitsRequestExt};\nuse quickwit_proto::metastore::{MetastoreService, MetastoreServiceClient, StageSplitsRequest};\nuse quickwit_proto::search::{ReportSplit, ReportSplitsRequest};\nuse quickwit_proto::types::{IndexUid, PublishToken};\nuse quickwit_storage::SplitPayloadBuilder;\nuse serde::Serialize;\nuse tokio::sync::oneshot::Sender;\nuse tokio::sync::{Semaphore, SemaphorePermit, oneshot};\nuse tracing::{Instrument, Span, debug, info, instrument, warn};\n\nuse crate::actors::Publisher;\nuse crate::actors::sequencer::{Sequencer, SequencerCommand};\nuse crate::merge_policy::{MergePolicy, MergeTask};\nuse crate::metrics::INDEXER_METRICS;\nuse crate::models::{\n    EmptySplit, PackagedSplit, PackagedSplitBatch, PublishLock, SplitsUpdate, 
create_split_metadata,\n};\nuse crate::split_store::IndexingSplitStore;\n\n/// The following two semaphores ensure that at most `max_concurrent_split_uploads` split uploads\n/// can happen at the same time, as configured in the `IndexerConfig`.\n///\n/// This \"budget\" is actually split into two semaphores: one for the indexing pipeline and one for\n/// the merge pipeline. The idea is that the merge pipeline is by nature a bit irregular, and we\n/// don't want it to stall the indexing pipeline, decreasing its throughput.\nstatic CONCURRENT_UPLOAD_PERMITS_INDEX: OnceCell<Semaphore> = OnceCell::new();\nstatic CONCURRENT_UPLOAD_PERMITS_MERGE: OnceCell<Semaphore> = OnceCell::new();\n\n#[derive(Clone, Copy, Debug)]\npub enum UploaderType {\n    IndexUploader,\n    MergeUploader,\n    DeleteUploader,\n}\n\n/// [`SplitsUpdateMailbox`] wraps either a [`Mailbox<Sequencer>`] or [`Mailbox<Publisher>`].\n///\n/// It makes it possible to send a [`SplitsUpdate`] either to the [`Sequencer`] or directly\n/// to [`Publisher`]. It is used in combination with `SplitsUpdateSender` that will do the send.\n///\n/// This is useful as we have different requirements between the indexing pipeline and\n/// the merge/delete task pipelines.\n/// 1. In the indexing pipeline, we want to publish splits in the same order as they are produced by\n///    the indexer/packager to ensure we are publishing splits without \"holes\" in checkpoints. We\n///    thus send [`SplitsUpdate`] to the [`Sequencer`] to keep the right ordering.\n/// 2. In the merge pipeline and the delete task pipeline, we are merging splits and in this\n///    case, publishing order does not matter. 
In this case, we can just send [`SplitsUpdate`]\n///    directly to the [`Publisher`].\n#[derive(Clone, Debug)]\npub enum SplitsUpdateMailbox {\n    Sequencer(Mailbox<Sequencer<Publisher>>),\n    Publisher(Mailbox<Publisher>),\n}\n\nimpl From<Mailbox<Publisher>> for SplitsUpdateMailbox {\n    fn from(publisher_mailbox: Mailbox<Publisher>) -> Self {\n        SplitsUpdateMailbox::Publisher(publisher_mailbox)\n    }\n}\n\nimpl From<Mailbox<Sequencer<Publisher>>> for SplitsUpdateMailbox {\n    fn from(publisher_sequencer_mailbox: Mailbox<Sequencer<Publisher>>) -> Self {\n        SplitsUpdateMailbox::Sequencer(publisher_sequencer_mailbox)\n    }\n}\n\nimpl SplitsUpdateMailbox {\n    async fn get_split_update_sender(\n        &self,\n        ctx: &ActorContext<Uploader>,\n    ) -> anyhow::Result<SplitsUpdateSender> {\n        match self {\n            SplitsUpdateMailbox::Sequencer(sequencer_mailbox) => {\n                // We send the future to the sequencer right away.\n                // The sequencer will then resolve the future in their arrival order and ensure that\n                // the publisher publishes splits in order.\n                let (split_uploaded_tx, split_uploaded_rx) =\n                    oneshot::channel::<SequencerCommand<SplitsUpdate>>();\n                ctx.send_message(sequencer_mailbox, split_uploaded_rx)\n                    .await?;\n                Ok(SplitsUpdateSender::Sequencer(split_uploaded_tx))\n            }\n            SplitsUpdateMailbox::Publisher(publisher_mailbox) => {\n                // We just need the publisher mailbox to send the split in this case.\n                Ok(SplitsUpdateSender::Publisher(publisher_mailbox.clone()))\n            }\n        }\n    }\n}\n\nenum SplitsUpdateSender {\n    Sequencer(Sender<SequencerCommand<SplitsUpdate>>),\n    Publisher(Mailbox<Publisher>),\n}\n\nimpl SplitsUpdateSender {\n    fn discard(self) -> anyhow::Result<()> {\n        if let 
SplitsUpdateSender::Sequencer(split_uploader_tx) = self\n            && split_uploader_tx.send(SequencerCommand::Discard).is_err()\n        {\n            bail!(\"failed to send cancel command to sequencer. the sequencer is probably dead\");\n        }\n        Ok(())\n    }\n\n    async fn send(\n        self,\n        split_update: SplitsUpdate,\n        ctx: &ActorContext<Uploader>,\n    ) -> anyhow::Result<()> {\n        match self {\n            SplitsUpdateSender::Sequencer(split_uploaded_tx) => {\n                if let Err(publisher_message) =\n                    split_uploaded_tx.send(SequencerCommand::Proceed(split_update))\n                {\n                    bail!(\n                        \"failed to send upload split `{:?}`. the publisher is probably dead\",\n                        &publisher_message\n                    );\n                }\n            }\n            SplitsUpdateSender::Publisher(publisher_mailbox) => {\n                ctx.send_message(&publisher_mailbox, split_update).await?;\n            }\n        }\n        Ok(())\n    }\n}\n\n#[derive(Clone)]\npub struct Uploader {\n    uploader_type: UploaderType,\n    metastore: MetastoreServiceClient,\n    merge_policy: Arc<dyn MergePolicy>,\n    retention_policy: Option<RetentionPolicy>,\n    split_store: IndexingSplitStore,\n    split_update_mailbox: SplitsUpdateMailbox,\n    max_concurrent_split_uploads: usize,\n    counters: UploaderCounters,\n    event_broker: EventBroker,\n}\n\nimpl Uploader {\n    #[allow(clippy::too_many_arguments)]\n    pub fn new(\n        uploader_type: UploaderType,\n        metastore: MetastoreServiceClient,\n        merge_policy: Arc<dyn MergePolicy>,\n        retention_policy: Option<RetentionPolicy>,\n        split_store: IndexingSplitStore,\n        split_update_mailbox: SplitsUpdateMailbox,\n        max_concurrent_split_uploads: usize,\n        event_broker: EventBroker,\n    ) -> Uploader {\n        Uploader {\n            uploader_type,\n           
 metastore,\n            merge_policy,\n            retention_policy,\n            split_store,\n            split_update_mailbox,\n            max_concurrent_split_uploads,\n            counters: Default::default(),\n            event_broker,\n        }\n    }\n    async fn acquire_semaphore(\n        &self,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<SemaphorePermit<'static>> {\n        let _guard = ctx.protect_zone();\n        let (concurrent_upload_permits_once_cell, concurrent_upload_permits_gauge) =\n            match self.uploader_type {\n                UploaderType::IndexUploader => (\n                    &CONCURRENT_UPLOAD_PERMITS_INDEX,\n                    INDEXER_METRICS\n                        .available_concurrent_upload_permits\n                        .with_label_values([\"indexer\"]),\n                ),\n                UploaderType::MergeUploader => (\n                    &CONCURRENT_UPLOAD_PERMITS_MERGE,\n                    INDEXER_METRICS\n                        .available_concurrent_upload_permits\n                        .with_label_values([\"merger\"]),\n                ),\n                UploaderType::DeleteUploader => (\n                    &CONCURRENT_UPLOAD_PERMITS_MERGE,\n                    INDEXER_METRICS\n                        .available_concurrent_upload_permits\n                        .with_label_values([\"merger\"]),\n                ),\n            };\n        let concurrent_upload_permits = concurrent_upload_permits_once_cell\n            .get_or_init(|| Semaphore::const_new(self.max_concurrent_split_uploads));\n        concurrent_upload_permits_gauge.set(concurrent_upload_permits.available_permits() as i64);\n        concurrent_upload_permits\n            .acquire()\n            .await\n            .context(\"the uploader semaphore is closed. 
(this should never happen)\")\n    }\n}\n\n#[derive(Clone, Debug, Default, Serialize)]\npub struct UploaderCounters {\n    pub num_staged_splits: Arc<AtomicU64>,\n    pub num_uploaded_splits: Arc<AtomicU64>,\n}\n\n#[async_trait]\nimpl Actor for Uploader {\n    type ObservableState = UploaderCounters;\n\n    #[allow(clippy::unused_unit)]\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        // We do not need a large capacity here...\n        // The uploader just spawns tasks that are uploading,\n        // so that in a sense, the CONCURRENT_UPLOAD_PERMITS semaphore also acts as\n        // a queue capacity.\n        //\n        // Having a large queue is costly too, because each message is a handle over\n        // a split directory. We DO need aggressive backpressure here.\n        QueueCapacity::Bounded(0)\n    }\n\n    fn name(&self) -> String {\n        format!(\"{:?}\", self.uploader_type)\n    }\n}\n\n#[async_trait]\nimpl Handler<PackagedSplitBatch> for Uploader {\n    type Reply = ();\n\n    #[instrument(name = \"uploader\",\n        parent=batch.batch_parent_span.id(),\n        skip_all)]\n    async fn handle(\n        &mut self,\n        batch: PackagedSplitBatch,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        fail_point!(\"uploader:before\");\n        let split_update_sender = self\n            .split_update_mailbox\n            .get_split_update_sender(ctx)\n            .await?;\n\n        // The permit will be added back manually to the semaphore after the task is finished.\n        // This is not a valid usage of a protected zone.\n        //\n        // Protected zones are supposed to be used when the cause for blocking is\n        // outside of the responsibility of the current actor.\n        // For instance, when sending a message to a downstream actor with a saturated\n        // mailbox.\n        // This 
is meant to be fixed with ParallelActors.\n        let permit_guard = self.acquire_semaphore(ctx).await?;\n        let kill_switch = ctx.kill_switch().clone();\n        let split_ids = batch.split_ids();\n        if kill_switch.is_dead() {\n            warn!(split_ids=?split_ids,\"kill switch was activated, cancelling upload\");\n            return Err(ActorExitStatus::Killed);\n        }\n        let metastore = self.metastore.clone();\n        let split_store = self.split_store.clone();\n        let counters = self.counters.clone();\n        let index_uid = batch.index_uid();\n        let ctx_clone = ctx.clone();\n        let merge_policy = self.merge_policy.clone();\n        let retention_policy = self.retention_policy.clone();\n        debug!(split_ids=?split_ids, \"start-stage-and-store-splits\");\n        let event_broker = self.event_broker.clone();\n        spawn_named_task(\n            async move {\n                fail_point!(\"uploader:intask:before\");\n\n                let mut split_metadata_list = Vec::with_capacity(batch.splits.len());\n                let mut report_splits: Vec<ReportSplit> = Vec::with_capacity(batch.splits.len());\n\n                for packaged_split in batch.splits.iter() {\n                    if batch.publish_lock.is_dead() {\n                        // TODO: Remove the junk right away?\n                        info!(\"splits' publish lock is dead\");\n                        if let Err(e) = split_update_sender.discard() {\n                            warn!(cause=?e, \"could not discard split\");\n                        }\n                        return;\n                    }\n\n                    let split_streamer = match SplitPayloadBuilder::get_split_payload(\n                        &packaged_split.split_files,\n                        &packaged_split.serialized_split_fields,\n                        &packaged_split.hotcache_bytes,\n                    ) {\n                        Ok(split_streamer) => 
split_streamer,\n                        Err(e) => {\n                            warn!(cause=?e, split_id=packaged_split.split_id(), \"could not create split streamer\");\n                            return;\n                        }\n                    };\n                    let split_metadata = create_split_metadata(\n                        &merge_policy,\n                        retention_policy.as_ref(),\n                        &packaged_split.split_attrs,\n                        packaged_split.tags.clone(),\n                        split_streamer.footer_range.start..split_streamer.footer_range.end,\n                    );\n\n                    report_splits.push(ReportSplit {\n                        storage_uri: split_store.remote_uri().to_string(),\n                        split_id: packaged_split.split_id().to_string(),\n                    });\n\n                    split_metadata_list.push(split_metadata);\n\n                }\n\n                let stage_splits_request = match StageSplitsRequest::try_from_splits_metadata(index_uid.clone(), split_metadata_list.clone()) {\n                    Ok(stage_splits_request) => stage_splits_request,\n                    Err(e) => {\n                        warn!(cause=?e, \"could not create stage splits request\");\n                        return;\n                    }\n                };\n                if let Err(e) = metastore\n                    .clone()\n                    .stage_splits(stage_splits_request)\n                    .await\n                {\n                    warn!(cause=?e, \"failed to stage splits\");\n                    return;\n                };\n\n                counters.num_staged_splits.fetch_add(split_metadata_list.len() as u64, Ordering::SeqCst);\n\n                let mut packaged_splits_and_metadata = Vec::with_capacity(batch.splits.len());\n\n                event_broker.publish(ReportSplitsRequest { report_splits });\n\n                for (packaged_split, metadata) 
in batch.splits.into_iter().zip(split_metadata_list) {\n                    let upload_result = upload_split(\n                        &packaged_split,\n                        &metadata,\n                        &split_store,\n                        counters.clone(),\n                    )\n                    .await;\n\n                    if let Err(cause) = upload_result {\n                        warn!(cause=?cause, split_id=packaged_split.split_id(), \"Failed to upload split. Killing!\");\n                        kill_switch.kill();\n                        return;\n                    }\n\n                    packaged_splits_and_metadata.push((packaged_split, metadata));\n                }\n\n                let splits_update = make_publish_operation(\n                    index_uid,\n                    packaged_splits_and_metadata,\n                    batch.checkpoint_delta_opt,\n                    batch.publish_lock,\n                    batch.publish_token_opt,\n                    batch.merge_task_opt,\n                    batch.batch_parent_span,\n                );\n\n                let target = match &split_update_sender {\n                    SplitsUpdateSender::Sequencer(_) => \"sequencer\",\n                    SplitsUpdateSender::Publisher(_) => \"publisher\",\n                };\n                if let Err(e) = split_update_sender.send(splits_update, &ctx_clone).await {\n                    warn!(cause=?e, target, \"failed to send uploaded split\");\n                    return;\n                }\n                // We explicitly drop it in order to force move the permit guard into the async\n                // task.\n                mem::drop(permit_guard);\n            }\n            .instrument(Span::current()),\n            \"upload_single_task\"\n        );\n        fail_point!(\"uploader:intask:after\");\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<EmptySplit> for Uploader {\n    type Reply = ();\n\n    #[instrument(\n      
  name=\"upload_empty_split\",\n        parent=empty_split.batch_parent_span.id(),\n        skip_all,\n    )]\n    async fn handle(\n        &mut self,\n        empty_split: EmptySplit,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let split_update_sender = self\n            .split_update_mailbox\n            .get_split_update_sender(ctx)\n            .await?;\n        let splits_update = SplitsUpdate {\n            index_uid: empty_split.index_uid,\n            new_splits: Vec::new(),\n            replaced_split_ids: Vec::new(),\n            checkpoint_delta_opt: Some(empty_split.checkpoint_delta),\n            publish_lock: empty_split.publish_lock,\n            publish_token_opt: empty_split.publish_token_opt,\n            merge_task: None,\n            parent_span: empty_split.batch_parent_span,\n        };\n\n        split_update_sender.send(splits_update, ctx).await?;\n        Ok(())\n    }\n}\n\nfn make_publish_operation(\n    index_uid: IndexUid,\n    packaged_splits_and_metadatas: Vec<(PackagedSplit, SplitMetadata)>,\n    checkpoint_delta_opt: Option<IndexCheckpointDelta>,\n    publish_lock: PublishLock,\n    publish_token_opt: Option<PublishToken>,\n    merge_task: Option<MergeTask>,\n    parent_span: Span,\n) -> SplitsUpdate {\n    assert!(!packaged_splits_and_metadatas.is_empty());\n    let replaced_split_ids = packaged_splits_and_metadatas\n        .iter()\n        .flat_map(|(split, _)| split.split_attrs.replaced_split_ids.clone())\n        .collect::<HashSet<_>>();\n    SplitsUpdate {\n        index_uid,\n        new_splits: packaged_splits_and_metadatas\n            .into_iter()\n            .map(|split_and_meta| split_and_meta.1)\n            .collect_vec(),\n        replaced_split_ids: Vec::from_iter(replaced_split_ids),\n        checkpoint_delta_opt,\n        publish_lock,\n        publish_token_opt,\n        merge_task,\n        parent_span,\n    }\n}\n\n#[instrument(\n    level = \"info\",\n    name = 
\"upload\",\n    fields(split = %packaged_split.split_attrs.split_id),\n    skip_all\n)]\nasync fn upload_split(\n    packaged_split: &PackagedSplit,\n    split_metadata: &SplitMetadata,\n    split_store: &IndexingSplitStore,\n    counters: UploaderCounters,\n) -> anyhow::Result<()> {\n    let split_streamer = SplitPayloadBuilder::get_split_payload(\n        &packaged_split.split_files,\n        &packaged_split.serialized_split_fields,\n        &packaged_split.hotcache_bytes,\n    )?;\n\n    split_store\n        .store_split(\n            split_metadata,\n            packaged_split.split_scratch_directory.path(),\n            Box::new(split_streamer),\n        )\n        .await?;\n    counters.num_uploaded_splits.fetch_add(1, Ordering::SeqCst);\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n    use std::path::PathBuf;\n    use std::time::Duration;\n\n    use quickwit_actors::{ObservationType, Universe};\n    use quickwit_common::pubsub::EventSubscriber;\n    use quickwit_common::temp_dir::TempDirectory;\n    use quickwit_metastore::checkpoint::{IndexCheckpointDelta, SourceCheckpointDelta};\n    use quickwit_proto::metastore::{EmptyResponse, MockMetastoreService};\n    use quickwit_proto::types::{DocMappingUid, NodeId};\n    use quickwit_storage::RamStorage;\n    use tantivy::DateTime;\n    use tokio::sync::oneshot;\n\n    use super::*;\n    use crate::merge_policy::{NopMergePolicy, default_merge_policy};\n    use crate::models::{SplitAttrs, SplitsUpdate};\n\n    #[tokio::test]\n    async fn test_uploader_with_sequencer() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n\n        let event_broker = EventBroker::default();\n        let universe = Universe::new();\n        let (sequencer_mailbox, sequencer_inbox) =\n            
universe.create_test_mailbox::<Sequencer<Publisher>>();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_stage_splits()\n            .withf(move |stage_splits_request| -> bool {\n                let splits_metadata = stage_splits_request.deserialize_splits_metadata().unwrap();\n                let split_metadata = &splits_metadata[0];\n                let index_uid: IndexUid = stage_splits_request.index_uid().clone();\n                index_uid.index_id == \"test-index\"\n                    && split_metadata.split_id() == \"test-split\"\n                    && split_metadata.time_range == Some(1628203589..=1628203640)\n            })\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let ram_storage = RamStorage::default();\n        let split_store =\n            IndexingSplitStore::create_without_local_store_for_test(Arc::new(ram_storage.clone()));\n        let merge_policy = Arc::new(NopMergePolicy);\n        let uploader = Uploader::new(\n            UploaderType::IndexUploader,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            merge_policy,\n            None,\n            split_store,\n            SplitsUpdateMailbox::Sequencer(sequencer_mailbox),\n            4,\n            event_broker,\n        );\n        let (uploader_mailbox, uploader_handle) = universe.spawn_builder().spawn(uploader);\n        let split_scratch_directory = TempDirectory::for_test();\n        let checkpoint_delta_opt: Option<IndexCheckpointDelta> = Some(IndexCheckpointDelta {\n            source_id: \"test-source\".to_string(),\n            source_delta: SourceCheckpointDelta::from_range(3..15),\n        });\n        uploader_mailbox\n            .send_message(PackagedSplitBatch::new(\n                vec![PackagedSplit {\n                    split_attrs: SplitAttrs {\n                        node_id,\n                        index_uid,\n                        
source_id,\n                        doc_mapping_uid: DocMappingUid::default(),\n                        partition_id: 3u64,\n                        time_range: Some(\n                            DateTime::from_timestamp_secs(1_628_203_589)\n                                ..=DateTime::from_timestamp_secs(1_628_203_640),\n                        ),\n                        uncompressed_docs_size_in_bytes: 1_000,\n                        num_docs: 10,\n                        replaced_split_ids: Vec::new(),\n                        split_id: \"test-split\".to_string(),\n                        delete_opstamp: 10,\n                        num_merge_ops: 0,\n                    },\n                    serialized_split_fields: Vec::new(),\n                    split_scratch_directory,\n                    tags: Default::default(),\n                    hotcache_bytes: Vec::new(),\n                    split_files: Vec::new(),\n                }],\n                checkpoint_delta_opt,\n                PublishLock::default(),\n                None,\n                None,\n                Span::none(),\n            ))\n            .await?;\n        assert_eq!(\n            uploader_handle.process_pending_and_observe().await.obs_type,\n            ObservationType::Alive\n        );\n        let mut publish_futures: Vec<oneshot::Receiver<SequencerCommand<SplitsUpdate>>> =\n            sequencer_inbox.drain_for_test_typed();\n        assert_eq!(publish_futures.len(), 1);\n\n        let publisher_message = match publish_futures.pop().unwrap().await? 
{\n            SequencerCommand::Discard => panic!(\n                \"expected `SequencerCommand::Proceed(SplitsUpdate)`, got \\\n                 `SequencerCommand::Discard`\"\n            ),\n            SequencerCommand::Proceed(publisher_message) => publisher_message,\n        };\n        let SplitsUpdate {\n            index_uid,\n            new_splits,\n            checkpoint_delta_opt,\n            replaced_split_ids,\n            ..\n        } = publisher_message;\n\n        assert_eq!(index_uid.index_id, \"test-index\");\n        assert_eq!(new_splits.len(), 1);\n        assert_eq!(new_splits[0].split_id(), \"test-split\");\n        let checkpoint_delta = checkpoint_delta_opt.unwrap();\n        assert_eq!(checkpoint_delta.source_id, \"test-source\");\n        assert_eq!(\n            checkpoint_delta.source_delta,\n            SourceCheckpointDelta::from_range(3..15)\n        );\n        assert!(replaced_split_ids.is_empty());\n        let mut files = ram_storage.list_files().await;\n        files.sort();\n        assert_eq!(&files, &[PathBuf::from(\"test-split.split\")]);\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_uploader_with_sequencer_emits_replace() -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n\n        let universe = Universe::new();\n        let (sequencer_mailbox, sequencer_inbox) =\n            universe.create_test_mailbox::<Sequencer<Publisher>>();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_stage_splits()\n            .withf(move |stage_splits_request| -> bool {\n                let splits_metadata = stage_splits_request.deserialize_splits_metadata().unwrap();\n                let is_metadata_valid = splits_metadata.iter().all(|metadata| {\n                    
[\"test-split-1\", \"test-split-2\"].contains(&metadata.split_id())\n                        && metadata.time_range == Some(1628203589..=1628203640)\n                });\n                let index_uid: IndexUid = stage_splits_request.index_uid().clone();\n                index_uid.index_id == \"test-index\" && is_metadata_valid\n            })\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let ram_storage = RamStorage::default();\n        let split_store =\n            IndexingSplitStore::create_without_local_store_for_test(Arc::new(ram_storage.clone()));\n        let merge_policy = Arc::new(NopMergePolicy);\n        let uploader = Uploader::new(\n            UploaderType::IndexUploader,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            merge_policy,\n            None,\n            split_store,\n            SplitsUpdateMailbox::Sequencer(sequencer_mailbox),\n            4,\n            EventBroker::default(),\n        );\n        let (uploader_mailbox, uploader_handle) = universe.spawn_builder().spawn(uploader);\n        let split_scratch_directory_1 = TempDirectory::for_test();\n        let split_scratch_directory_2 = TempDirectory::for_test();\n        let packaged_split_1 = PackagedSplit {\n            split_attrs: SplitAttrs {\n                node_id: node_id.clone(),\n                index_uid: index_uid.clone(),\n                source_id: source_id.clone(),\n                doc_mapping_uid: DocMappingUid::default(),\n                split_id: \"test-split-1\".to_string(),\n                partition_id: 3u64,\n                num_docs: 10,\n                uncompressed_docs_size_in_bytes: 1_000,\n                time_range: Some(\n                    DateTime::from_timestamp_secs(1_628_203_589)\n                        ..=DateTime::from_timestamp_secs(1_628_203_640),\n                ),\n                replaced_split_ids: vec![\n                    \"replaced-split-1\".to_string(),\n             
       \"replaced-split-2\".to_string(),\n                ],\n                delete_opstamp: 0,\n                num_merge_ops: 0,\n            },\n            serialized_split_fields: Vec::new(),\n            split_scratch_directory: split_scratch_directory_1,\n            tags: Default::default(),\n            split_files: Vec::new(),\n            hotcache_bytes: Vec::new(),\n        };\n        let package_split_2 = PackagedSplit {\n            split_attrs: SplitAttrs {\n                node_id,\n                index_uid,\n                source_id,\n                doc_mapping_uid: DocMappingUid::default(),\n                split_id: \"test-split-2\".to_string(),\n                partition_id: 3u64,\n                num_docs: 10,\n                uncompressed_docs_size_in_bytes: 1_000,\n                time_range: Some(\n                    DateTime::from_timestamp_secs(1_628_203_589)\n                        ..=DateTime::from_timestamp_secs(1_628_203_640),\n                ),\n                replaced_split_ids: vec![\n                    \"replaced-split-1\".to_string(),\n                    \"replaced-split-2\".to_string(),\n                ],\n                delete_opstamp: 0,\n                num_merge_ops: 0,\n            },\n            serialized_split_fields: Vec::new(),\n            split_scratch_directory: split_scratch_directory_2,\n            tags: Default::default(),\n            split_files: Vec::new(),\n            hotcache_bytes: Vec::new(),\n        };\n        uploader_mailbox\n            .send_message(PackagedSplitBatch::new(\n                vec![packaged_split_1, package_split_2],\n                None,\n                PublishLock::default(),\n                None,\n                None,\n                Span::none(),\n            ))\n            .await?;\n        assert_eq!(\n            uploader_handle.process_pending_and_observe().await.obs_type,\n            ObservationType::Alive\n        );\n        let mut publish_futures: 
Vec<oneshot::Receiver<SequencerCommand<SplitsUpdate>>> =\n            sequencer_inbox.drain_for_test_typed();\n        assert_eq!(publish_futures.len(), 1);\n\n        let publisher_message = match publish_futures.pop().unwrap().await? {\n            SequencerCommand::Discard => panic!(\n                \"Expected `SequencerCommand::Proceed(SplitsUpdate)`, got \\\n                 `SequencerCommand::Discard`.\"\n            ),\n            SequencerCommand::Proceed(publisher_message) => publisher_message,\n        };\n        let SplitsUpdate {\n            index_uid,\n            new_splits,\n            mut replaced_split_ids,\n            checkpoint_delta_opt,\n            ..\n        } = publisher_message;\n        assert_eq!(index_uid.index_id, \"test-index\");\n        // Sort first to avoid test failing.\n        replaced_split_ids.sort();\n        assert_eq!(new_splits.len(), 2);\n        assert_eq!(new_splits[0].split_id(), \"test-split-1\");\n        assert_eq!(new_splits[1].split_id(), \"test-split-2\");\n        assert_eq!(\n            &replaced_split_ids,\n            &[\n                \"replaced-split-1\".to_string(),\n                \"replaced-split-2\".to_string()\n            ]\n        );\n        assert!(checkpoint_delta_opt.is_none());\n\n        let mut files = ram_storage.list_files().await;\n        files.sort();\n        assert_eq!(\n            &files,\n            &[\n                PathBuf::from(\"test-split-1.split\"),\n                PathBuf::from(\"test-split-2.split\")\n            ]\n        );\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_uploader_without_sequencer() -> anyhow::Result<()> {\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid_clone = index_uid.clone();\n        let source_id = \"test-source\".to_string();\n\n        let universe = Universe::new();\n        let 
(publisher_mailbox, publisher_inbox) = universe.create_test_mailbox::<Publisher>();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_stage_splits()\n            .withf(move |stage_splits_request| -> bool {\n                stage_splits_request.index_uid() == &index_uid_clone\n            })\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let ram_storage = RamStorage::default();\n        let split_store =\n            IndexingSplitStore::create_without_local_store_for_test(Arc::new(ram_storage.clone()));\n        let merge_policy = Arc::new(NopMergePolicy);\n        let uploader = Uploader::new(\n            UploaderType::IndexUploader,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            merge_policy,\n            None,\n            split_store,\n            SplitsUpdateMailbox::Publisher(publisher_mailbox),\n            4,\n            EventBroker::default(),\n        );\n        let (uploader_mailbox, uploader_handle) = universe.spawn_builder().spawn(uploader);\n        let split_scratch_directory = TempDirectory::for_test();\n        let checkpoint_delta_opt: Option<IndexCheckpointDelta> = Some(IndexCheckpointDelta {\n            source_id: \"test-source\".to_string(),\n            source_delta: SourceCheckpointDelta::from_range(3..15),\n        });\n        uploader_mailbox\n            .send_message(PackagedSplitBatch::new(\n                vec![PackagedSplit {\n                    split_attrs: SplitAttrs {\n                        node_id,\n                        index_uid,\n                        source_id,\n                        doc_mapping_uid: DocMappingUid::default(),\n                        split_id: \"test-split\".to_string(),\n                        partition_id: 3u64,\n                        time_range: None,\n                        uncompressed_docs_size_in_bytes: 1_000,\n                        num_docs: 10,\n                
        replaced_split_ids: Vec::new(),\n                        delete_opstamp: 10,\n                        num_merge_ops: 0,\n                    },\n                    serialized_split_fields: Vec::new(),\n                    split_scratch_directory,\n                    tags: Default::default(),\n                    hotcache_bytes: Vec::new(),\n                    split_files: Vec::new(),\n                }],\n                checkpoint_delta_opt,\n                PublishLock::default(),\n                None,\n                None,\n                Span::none(),\n            ))\n            .await?;\n        assert_eq!(\n            uploader_handle.process_pending_and_observe().await.obs_type,\n            ObservationType::Alive\n        );\n        let SplitsUpdate {\n            index_uid,\n            new_splits,\n            replaced_split_ids,\n            ..\n        } = publisher_inbox.recv_typed_message().await.unwrap();\n\n        assert_eq!(index_uid.index_id, \"test-index\");\n        assert_eq!(new_splits.len(), 1);\n        assert!(replaced_split_ids.is_empty());\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_uploader_with_empty_splits() -> anyhow::Result<()> {\n        let universe = Universe::new();\n        let (sequencer_mailbox, sequencer_inbox) =\n            universe.create_test_mailbox::<Sequencer<Publisher>>();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_stage_splits().never();\n        let ram_storage = RamStorage::default();\n        let split_store =\n            IndexingSplitStore::create_without_local_store_for_test(Arc::new(ram_storage.clone()));\n        let uploader = Uploader::new(\n            UploaderType::IndexUploader,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            default_merge_policy(),\n            None,\n            split_store,\n            
SplitsUpdateMailbox::Sequencer(sequencer_mailbox),\n            4,\n            EventBroker::default(),\n        );\n        let (uploader_mailbox, uploader_handle) = universe.spawn_builder().spawn(uploader);\n        let checkpoint_delta = IndexCheckpointDelta {\n            source_id: \"test-source\".to_string(),\n            source_delta: SourceCheckpointDelta::from_range(3..15),\n        };\n        uploader_mailbox\n            .send_message(EmptySplit {\n                index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n                checkpoint_delta,\n                publish_lock: PublishLock::default(),\n                publish_token_opt: None,\n                batch_parent_span: Span::none(),\n            })\n            .await?;\n        assert_eq!(\n            uploader_handle.process_pending_and_observe().await.obs_type,\n            ObservationType::Alive\n        );\n        let mut publish_futures: Vec<oneshot::Receiver<SequencerCommand<SplitsUpdate>>> =\n            sequencer_inbox.drain_for_test_typed();\n        assert_eq!(publish_futures.len(), 1);\n\n        let publisher_message = match publish_futures.pop().unwrap().await? 
{\n            SequencerCommand::Discard => panic!(\n                \"Expected `SequencerCommand::Proceed(SplitsUpdate)`, got \\\n                 `SequencerCommand::Discard`.\"\n            ),\n            SequencerCommand::Proceed(publisher_message) => publisher_message,\n        };\n        let SplitsUpdate {\n            index_uid,\n            new_splits,\n            checkpoint_delta_opt,\n            replaced_split_ids,\n            ..\n        } = publisher_message;\n\n        assert_eq!(index_uid.index_id, \"test-index\");\n        assert_eq!(new_splits.len(), 0);\n        let checkpoint_delta = checkpoint_delta_opt.unwrap();\n        assert_eq!(checkpoint_delta.source_id, \"test-source\");\n        assert_eq!(\n            checkpoint_delta.source_delta,\n            SourceCheckpointDelta::from_range(3..15)\n        );\n        assert!(replaced_split_ids.is_empty());\n        let files = ram_storage.list_files().await;\n        assert!(files.is_empty());\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    struct ReportSplitListener {\n        report_splits_tx: flume::Sender<ReportSplitsRequest>,\n    }\n\n    impl std::fmt::Debug for ReportSplitListener {\n        fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n            f.debug_struct(\"ReportSplitListener\").finish()\n        }\n    }\n\n    #[async_trait]\n    impl EventSubscriber<ReportSplitsRequest> for ReportSplitListener {\n        async fn handle_event(&mut self, event: ReportSplitsRequest) {\n            self.report_splits_tx.send(event).unwrap();\n        }\n    }\n\n    #[tokio::test]\n    async fn test_uploader_notifies_event_broker() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        const SPLIT_ULID_STR: &str = \"01HAV29D4XY3D462FS3D8K5Q2H\";\n        let event_broker = EventBroker::default();\n        let (report_splits_tx, report_splits_rx) = flume::unbounded();\n        let report_splits_listener = ReportSplitListener 
{ report_splits_tx };\n\n        // we need to keep the handle alive.\n        let _subscribe_handle = event_broker.subscribe(report_splits_listener);\n\n        let node_id = NodeId::from(\"test-node\");\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_id = \"test-source\".to_string();\n\n        let universe = Universe::new();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_stage_splits()\n            .times(1)\n            .returning(|_| Ok(EmptyResponse {}));\n        let ram_storage = RamStorage::default();\n        let split_store =\n            IndexingSplitStore::create_without_local_store_for_test(Arc::new(ram_storage.clone()));\n        let merge_policy = Arc::new(NopMergePolicy);\n        let (publisher_mailbox, _publisher_inbox) = universe.create_test_mailbox();\n        let uploader = Uploader::new(\n            UploaderType::IndexUploader,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            merge_policy,\n            None,\n            split_store,\n            SplitsUpdateMailbox::Publisher(publisher_mailbox),\n            4,\n            event_broker,\n        );\n        let (uploader_mailbox, uploader_handle) = universe.spawn_builder().spawn(uploader);\n        let split_scratch_directory = TempDirectory::for_test();\n        let checkpoint_delta_opt: Option<IndexCheckpointDelta> = Some(IndexCheckpointDelta {\n            source_id: \"test-source\".to_string(),\n            source_delta: SourceCheckpointDelta::from_range(3..15),\n        });\n        uploader_mailbox\n            .send_message(PackagedSplitBatch::new(\n                vec![PackagedSplit {\n                    split_attrs: SplitAttrs {\n                        node_id,\n                        index_uid,\n                        source_id,\n                        doc_mapping_uid: DocMappingUid::default(),\n                        partition_id: 3u64,\n    
                    time_range: Some(\n                            DateTime::from_timestamp_secs(1_628_203_589)\n                                ..=DateTime::from_timestamp_secs(1_628_203_640),\n                        ),\n                        uncompressed_docs_size_in_bytes: 1_000,\n                        num_docs: 10,\n                        replaced_split_ids: Vec::new(),\n                        split_id: SPLIT_ULID_STR.to_string(),\n                        delete_opstamp: 10,\n                        num_merge_ops: 0,\n                    },\n                    serialized_split_fields: Vec::new(),\n                    split_scratch_directory,\n                    tags: Default::default(),\n                    hotcache_bytes: Vec::new(),\n                    split_files: Vec::new(),\n                }],\n                checkpoint_delta_opt,\n                PublishLock::default(),\n                None,\n                None,\n                Span::none(),\n            ))\n            .await?;\n        assert_eq!(\n            uploader_handle.process_pending_and_observe().await.obs_type,\n            ObservationType::Alive\n        );\n        mem::drop(uploader_mailbox);\n        let report_splits: ReportSplitsRequest = report_splits_rx\n            .recv_timeout(Duration::from_secs(1))\n            .unwrap();\n        assert_eq!(report_splits.report_splits.len(), 1);\n        let split = &report_splits.report_splits[0];\n        assert_eq!(split.storage_uri, \"ram:///\");\n        assert_eq!(split.split_id, SPLIT_ULID_STR);\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/actors/vrl_processing.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\n\nuse quickwit_config::TransformConfig;\nuse tracing::warn;\nuse vrl::compiler::runtime::Runtime;\npub use vrl::compiler::runtime::Terminate as VrlTerminate;\nuse vrl::compiler::state::RuntimeState;\nuse vrl::compiler::{Program, TargetValueRef, TimeZone};\npub use vrl::value::{Secrets as VrlSecrets, Value as VrlValue};\n\nuse super::doc_processor::DocProcessorError;\n\npub(super) struct VrlDoc {\n    pub vrl_value: VrlValue,\n    pub num_bytes: usize,\n}\n\nimpl VrlDoc {\n    pub fn new(vrl_value: VrlValue, num_bytes: usize) -> Self {\n        Self {\n            vrl_value,\n            num_bytes,\n        }\n    }\n}\n\npub(super) struct VrlProgram {\n    program: Program,\n    timezone: TimeZone,\n    runtime: Runtime,\n    metadata: VrlValue,\n    secrets: VrlSecrets,\n}\n\nimpl VrlProgram {\n    pub fn transform_doc(&mut self, vrl_doc: VrlDoc) -> Result<VrlDoc, DocProcessorError> {\n        let VrlDoc {\n            mut vrl_value,\n            num_bytes,\n        } = vrl_doc;\n\n        let mut target = TargetValueRef {\n            value: &mut vrl_value,\n            metadata: &mut self.metadata,\n            secrets: &mut self.secrets,\n        };\n        let runtime_res = self\n            .runtime\n            .resolve(&mut target, &self.program, &self.timezone)\n            .map_err(|transform_error| 
{\n                warn!(transform_error=?transform_error);\n                DocProcessorError::Transform(transform_error)\n            });\n\n        if let VrlValue::Object(metadata) = target.metadata {\n            metadata.clear();\n        }\n        self.runtime.clear();\n\n        runtime_res.map(|vrl_value| VrlDoc::new(vrl_value, num_bytes))\n    }\n\n    pub fn try_from_transform_config(transform_config: TransformConfig) -> anyhow::Result<Self> {\n        let (program, timezone) = transform_config.compile_vrl_script()?;\n        let state = RuntimeState::default();\n        let runtime = Runtime::new(state);\n\n        Ok(VrlProgram {\n            program,\n            runtime,\n            timezone,\n            metadata: VrlValue::Object(BTreeMap::new()),\n            secrets: VrlSecrets::default(),\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/controlled_directory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io::{BufWriter, IntoInnerError};\nuse std::ops::Deref;\nuse std::path::Path;\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse arc_swap::ArcSwap;\nuse quickwit_common::ProtectedZoneGuard;\nuse quickwit_common::io::{ControlledWrite, IoControls, IoControlsAccess};\nuse tantivy::Directory;\nuse tantivy::directory::error::{DeleteError, OpenReadError, OpenWriteError};\nuse tantivy::directory::{\n    AntiCallToken, FileHandle, TerminatingWrite, WatchCallback, WatchHandle, WritePtr,\n};\n\n/// Buffer capacity.\n///\n/// This is the current default for the BufWriter, but considering this constant\n/// will have a direct impact on health check, we'd better fix it.\nconst BUFFER_NUM_BYTES: usize = 8_192;\n\n/// The `ControlledDirectory` wraps another directory and enhances it\n/// with functionalities such as\n/// - records progress everytime a write (Note there is however a buffer writer above it)\n/// - if the killswitch is activated, returns an error on the first write happening after it\n/// - in the future, record a writing speed, possibly introduce some throttling, etc.\n#[derive(Clone)]\npub struct ControlledDirectory {\n    underlying: Arc<dyn Directory>,\n    io_controls: HotswappableIoControls,\n}\n\nimpl ControlledDirectory {\n    pub fn new(directory: Box<dyn Directory>, io_controls: IoControls) -> ControlledDirectory {\n        
ControlledDirectory {\n            underlying: directory.into(),\n            io_controls: HotswappableIoControls::new(io_controls),\n        }\n    }\n\n    pub fn check_if_alive(&self) -> io::Result<ProtectedZoneGuard> {\n        self.io_controls.load().check_if_alive()\n    }\n\n    pub fn set_io_controls(&self, io_controls: IoControls) {\n        self.io_controls.store(Arc::new(io_controls));\n    }\n}\n\nimpl fmt::Debug for ControlledDirectory {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"ControlledDirectory\").finish()\n    }\n}\n\nimpl Directory for ControlledDirectory {\n    fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn FileHandle>, OpenReadError> {\n        self.check_if_alive()\n            .map_err(|io_err| OpenReadError::wrap_io_error(io_err, path.to_path_buf()))?;\n        self.underlying.get_file_handle(path)\n    }\n\n    fn delete(&self, path: &Path) -> Result<(), DeleteError> {\n        self.check_if_alive()\n            .map_err(|io_error| DeleteError::IoError {\n                io_error: Arc::new(io_error),\n                filepath: path.to_path_buf(),\n            })?;\n        self.underlying.delete(path)\n    }\n\n    fn exists(&self, path: &Path) -> Result<bool, OpenReadError> {\n        self.check_if_alive()\n            .map_err(|io_err| OpenReadError::wrap_io_error(io_err, path.to_path_buf()))?;\n        self.underlying.exists(path)\n    }\n\n    fn open_write(&self, path: &Path) -> Result<WritePtr, OpenWriteError> {\n        self.check_if_alive()\n            .map_err(|io_err| OpenWriteError::wrap_io_error(io_err, path.to_path_buf()))?;\n\n        let underlying_wrt: Box<dyn TerminatingWrite> = self\n            .underlying\n            .open_write(path)?\n            .into_inner()\n            .map_err(IntoInnerError::into_error)\n            .map_err(|io_err| OpenWriteError::wrap_io_error(io_err, path.to_path_buf()))?;\n        let controlled_wrt = 
self.io_controls.clone().wrap_write(underlying_wrt);\n        Ok(BufWriter::with_capacity(\n            BUFFER_NUM_BYTES,\n            Box::new(AdoptedControlledWrite(controlled_wrt)),\n        ))\n    }\n\n    fn atomic_read(&self, path: &Path) -> Result<Vec<u8>, OpenReadError> {\n        self.check_if_alive()\n            .map_err(|io_err| OpenReadError::wrap_io_error(io_err, path.to_path_buf()))?;\n        self.underlying.atomic_read(path)\n    }\n\n    fn atomic_write(&self, path: &Path, data: &[u8]) -> io::Result<()> {\n        self.check_if_alive()?;\n        self.underlying.atomic_write(path, data)\n    }\n\n    fn watch(&self, watch_callback: WatchCallback) -> tantivy::Result<WatchHandle> {\n        self.check_if_alive()?;\n        self.underlying.watch(watch_callback)\n    }\n\n    fn sync_directory(&self) -> io::Result<()> {\n        self.check_if_alive()?;\n        self.underlying.sync_directory()\n    }\n}\n\n#[derive(Clone)]\nstruct HotswappableIoControls(Arc<ArcSwap<IoControls>>);\n\nimpl Deref for HotswappableIoControls {\n    type Target = ArcSwap<IoControls>;\n\n    fn deref(&self) -> &Self::Target {\n        &self.0\n    }\n}\n\nimpl HotswappableIoControls {\n    pub fn new(io_controls: IoControls) -> Self {\n        Self(Arc::new(ArcSwap::new(Arc::new(io_controls))))\n    }\n}\n\nimpl IoControlsAccess for HotswappableIoControls {\n    fn apply<F, R>(&self, f: F) -> R\n    where F: Fn(&IoControls) -> R {\n        let guard = self.0.load();\n        f(&guard)\n    }\n}\n\n// Wrapper to work around the orphan rule. 
(hence the word \"Adopted\").\nstruct AdoptedControlledWrite(ControlledWrite<HotswappableIoControls, Box<dyn TerminatingWrite>>);\n\nimpl io::Write for AdoptedControlledWrite {\n    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {\n        self.0.write(buf)\n    }\n\n    fn flush(&mut self) -> io::Result<()> {\n        self.0.flush()\n    }\n}\n\nimpl TerminatingWrite for AdoptedControlledWrite {\n    #[inline]\n    fn terminate_ref(&mut self, token: AntiCallToken) -> io::Result<()> {\n        let underlying_wrt = self.0.underlying_wrt();\n        underlying_wrt.flush()?;\n        underlying_wrt.terminate_ref(token)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::io::Write;\n\n    use tantivy::directory::RamDirectory;\n\n    use super::*;\n\n    #[test]\n    fn test_records_progress_on_write() -> anyhow::Result<()> {\n        let directory = RamDirectory::default();\n        let io_controls = IoControls::default();\n        let controlled_directory =\n            ControlledDirectory::new(Box::new(directory), io_controls.clone());\n        let progress = io_controls.progress().clone();\n        assert!(progress.registered_activity_since_last_call());\n        assert!(!progress.registered_activity_since_last_call());\n        let mut wrt = controlled_directory.open_write(Path::new(\"test\"))?;\n        assert!(progress.registered_activity_since_last_call());\n        // We use a large buffer to force the buf writer to flush at least once.\n        let large_buffer = vec![0u8; wrt.capacity() + 1];\n        assert_eq!(io_controls.num_bytes(), 0u64);\n        wrt.write_all(&large_buffer)?;\n        assert_eq!(io_controls.num_bytes(), 8_193u64);\n        assert!(progress.registered_activity_since_last_call());\n        wrt.write_all(b\"small payload\")?;\n        // The buffering makes it so that this last write does not\n        // get actually written right away.\n        assert_eq!(io_controls.num_bytes(), 8_193u64);\n        // Here we check that the 
progress is only\n        // triggered when the BufWriter flushes.\n        assert!(!progress.registered_activity_since_last_call());\n        wrt.write_all(&large_buffer)?;\n        assert_eq!(io_controls.num_bytes(), 16_399);\n        assert!(progress.registered_activity_since_last_call());\n        assert!(!progress.registered_activity_since_last_call());\n        wrt.write_all(&b\"aa\"[..])?;\n        assert_eq!(io_controls.num_bytes(), 16_399u64);\n        wrt.terminate()?;\n        // Flush works as expected and makes sure all data buffered goes through\n        assert_eq!(io_controls.num_bytes(), 16_401u64);\n        assert!(progress.registered_activity_since_last_call());\n        Ok(())\n    }\n\n    #[test]\n    fn test_records_kill_switch_triggers_io_error() -> anyhow::Result<()> {\n        let directory = RamDirectory::default();\n        let io_controls = IoControls::default();\n        let controlled_directory =\n            ControlledDirectory::new(Box::new(directory), io_controls.clone());\n        let mut wrt = controlled_directory.open_write(Path::new(\"test\"))?;\n        // We use a large buffer to force the buf writer to flush at least once.\n        let large_buffer = vec![0u8; wrt.capacity() + 1];\n        wrt.write_all(&large_buffer)?;\n        io_controls.kill();\n        let err = wrt.write_all(&large_buffer).err().unwrap();\n        assert_eq!(err.kind(), io::ErrorKind::Other);\n        wrt.terminate()?;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nuse quickwit_actors::{Mailbox, Universe};\nuse quickwit_cluster::Cluster;\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_config::NodeConfig;\nuse quickwit_ingest::{IngestApiService, IngesterPool};\nuse quickwit_proto::indexing::PipelineMetrics;\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse quickwit_storage::StorageResolver;\nuse tracing::info;\n\nuse crate::actors::MergeSchedulerService;\npub use crate::actors::{\n    FinishPendingMergesAndShutdownPipeline, IndexingError, IndexingPipeline,\n    IndexingPipelineParams, IndexingService, PublisherType, Sequencer, SplitsUpdateMailbox,\n};\npub use crate::controlled_directory::ControlledDirectory;\nuse crate::models::IndexingStatistics;\npub use crate::split_store::{IndexingSplitStore, get_tantivy_directory_from_split_bundle};\n\npub mod actors;\nmod controlled_directory;\npub mod merge_policy;\nmod metrics;\npub mod models;\npub mod source;\nmod split_store;\n#[cfg(any(test, feature = \"testsuite\"))]\nmod test_utils;\n\nuse quickwit_proto::indexing::CpuCapacity;\n#[cfg(any(test, feature = \"testsuite\"))]\npub use test_utils::{MockSplitBuilder, TestSandbox, mock_split, mock_split_meta};\n\nuse self::merge_policy::MergePolicy;\npub use 
self::source::check_source_connectivity;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(components(schemas(IndexingStatistics, PipelineMetrics, CpuCapacity)))]\n/// Schemas used for the OpenAPI generation that are part of this crate.\npub struct IndexingApiSchemas;\n\npub fn new_split_id() -> String {\n    ulid::Ulid::new().to_string()\n}\n\n#[allow(clippy::too_many_arguments)]\npub async fn start_indexing_service(\n    universe: &Universe,\n    config: &NodeConfig,\n    num_blocking_threads: usize,\n    cluster: Cluster,\n    metastore: MetastoreServiceClient,\n    ingester_pool: IngesterPool,\n    storage_resolver: StorageResolver,\n    event_broker: EventBroker,\n) -> anyhow::Result<Mailbox<IndexingService>> {\n    info!(\"starting indexer service\");\n    let ingest_api_service_mailbox = universe.get_one::<IngestApiService>();\n    let (merge_scheduler_mailbox, _) = universe.spawn_builder().spawn(MergeSchedulerService::new(\n        config.indexer_config.merge_concurrency.get(),\n    ));\n    // Spawn indexing service.\n    let indexing_service = IndexingService::new(\n        config.node_id.clone(),\n        config.data_dir_path.to_path_buf(),\n        config.indexer_config.clone(),\n        num_blocking_threads,\n        cluster,\n        metastore.clone(),\n        ingest_api_service_mailbox,\n        merge_scheduler_mailbox,\n        ingester_pool,\n        storage_resolver,\n        event_broker,\n    )\n    .await?;\n    let (indexing_service, _) = universe.spawn_builder().spawn(indexing_service);\n    Ok(indexing_service)\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/merge_policy/const_write_amplification.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::ops::RangeInclusive;\n\nuse quickwit_config::IndexingSettings;\nuse quickwit_config::merge_policy_config::ConstWriteAmplificationMergePolicyConfig;\nuse quickwit_metastore::{SplitMaturity, SplitMetadata};\nuse time::OffsetDateTime;\nuse tracing::info;\n\nuse super::MergeOperation;\nuse crate::merge_policy::MergePolicy;\n\n// Smallest number of splits in a finalize merge.\nconst FINALIZE_MIN_MERGE_FACTOR: usize = 3;\n\n/// The `ConstWriteAmplificationMergePolicy` has been designed for a use\n/// case where there are a several index partitions with different sizes,\n/// and partitions tend to be searched separately. (e.g. partitioning by tenant.)\n///\n/// In that case, the StableLogMergePolicy would tend to target the same number\n/// of docs for all tenants. Assuming a merge factor of 10 and a target num docs of 10 millions,\n/// The write amplification observed for a small tenant, emitting splits of 1\n/// document would be 7.\n///\n/// These extra merges have the benefit of making less splits, but really we are\n/// over-trading write amplification for read amplification here.\n///\n/// The `ConstWriteAmplificationMergePolicy` is very simple. 
It targets a number\n/// of merges instead, and stops once this number of merges is reached.\n///\n/// Only splits with the same number of merge operations are merged together,\n/// and for a given merge operation, we build the split in a greedy way.\n/// After sorting the splits per creation date, we append splits one after the\n/// other until we either reach `max_merge_factor` or we exceed the\n/// targeted `split_num_docs`.\n#[derive(Debug, Clone)]\npub struct ConstWriteAmplificationMergePolicy {\n    config: ConstWriteAmplificationMergePolicyConfig,\n    split_num_docs_target: usize,\n}\n\nimpl Default for ConstWriteAmplificationMergePolicy {\n    fn default() -> Self {\n        ConstWriteAmplificationMergePolicy {\n            config: Default::default(),\n            split_num_docs_target: IndexingSettings::default_split_num_docs_target(),\n        }\n    }\n}\n\nimpl ConstWriteAmplificationMergePolicy {\n    pub fn new(\n        config: ConstWriteAmplificationMergePolicyConfig,\n        split_num_docs_target: usize,\n    ) -> Self {\n        ConstWriteAmplificationMergePolicy {\n            config,\n            split_num_docs_target,\n        }\n    }\n\n    #[cfg(test)]\n    fn for_test() -> ConstWriteAmplificationMergePolicy {\n        use std::time::Duration;\n\n        let config = ConstWriteAmplificationMergePolicyConfig {\n            max_merge_ops: 3,\n            merge_factor: 3,\n            max_merge_factor: 5,\n            maturation_period: Duration::from_secs(3600),\n            max_finalize_merge_operations: 0,\n            max_finalize_split_num_docs: None,\n        };\n        Self::new(config, 10_000_000)\n    }\n\n    /// Returns a merge operation within one `num_merge_ops` level if one can be built from the\n    /// given splits. 
This method assumes that the splits are sorted by reverse creation date\n    /// and have all the same `num_merge_ops`.\n    fn single_merge_operation_within_num_merge_op_level(\n        &self,\n        splits: &mut Vec<SplitMetadata>,\n        merge_factor_range: RangeInclusive<usize>,\n    ) -> Option<MergeOperation> {\n        let mut num_splits_in_merge = 0;\n        let mut num_docs_in_merge = 0;\n        for split in splits.iter().take(*merge_factor_range.end()) {\n            num_docs_in_merge += split.num_docs;\n            num_splits_in_merge += 1;\n            if num_docs_in_merge >= self.split_num_docs_target {\n                break;\n            }\n        }\n        if (num_docs_in_merge < self.split_num_docs_target)\n            && (num_splits_in_merge < *merge_factor_range.start())\n        {\n            return None;\n        }\n        assert!(num_splits_in_merge >= 2);\n        let splits_in_merge = splits.drain(0..num_splits_in_merge).collect();\n        let merge_operation = MergeOperation::new_merge_operation(splits_in_merge);\n        Some(merge_operation)\n    }\n\n    fn merge_operations_within_num_merge_op_level(\n        &self,\n        splits: &mut Vec<SplitMetadata>,\n    ) -> Vec<MergeOperation> {\n        splits.sort_by(|left, right| {\n            left.create_timestamp\n                .cmp(&right.create_timestamp)\n                .then_with(|| left.split_id().cmp(right.split_id()))\n        });\n        let mut merge_operations = Vec::new();\n        while let Some(merge_op) =\n            self.single_merge_operation_within_num_merge_op_level(splits, self.merge_factor_range())\n        {\n            merge_operations.push(merge_op);\n        }\n        merge_operations\n    }\n\n    fn merge_factor_range(&self) -> RangeInclusive<usize> {\n        self.config.merge_factor..=self.config.max_merge_factor\n    }\n}\n\nimpl MergePolicy for ConstWriteAmplificationMergePolicy {\n    fn operations(&self, splits: &mut Vec<SplitMetadata>) -> 
Vec<MergeOperation> {\n        let mut group_by_num_merge_ops: HashMap<usize, Vec<SplitMetadata>> = HashMap::default();\n        let mut mature_splits = Vec::new();\n        let now = OffsetDateTime::now_utc();\n        for split in splits.drain(..) {\n            if split.is_mature(now) {\n                mature_splits.push(split);\n            } else {\n                group_by_num_merge_ops\n                    .entry(split.num_merge_ops)\n                    .or_default()\n                    .push(split);\n            }\n        }\n        splits.extend(mature_splits);\n        let mut merge_operations = Vec::new();\n        for splits_in_group in group_by_num_merge_ops.values_mut() {\n            let merge_ops = self.merge_operations_within_num_merge_op_level(splits_in_group);\n            merge_operations.extend(merge_ops);\n            // we readd the splits that are not used in a merge operation into the splits vector.\n            splits.append(splits_in_group);\n        }\n        merge_operations\n    }\n\n    fn finalize_operations(&self, splits: &mut Vec<SplitMetadata>) -> Vec<MergeOperation> {\n        if self.config.max_finalize_merge_operations == 0 {\n            return Vec::new();\n        }\n\n        let now = OffsetDateTime::now_utc();\n\n        // We first isolate mature splits. 
Let's not touch them.\n        let (mature_splits, mut young_splits): (Vec<SplitMetadata>, Vec<SplitMetadata>) =\n            splits.drain(..).partition(|split: &SplitMetadata| {\n                if let Some(max_finalize_split_num_docs) = self.config.max_finalize_split_num_docs\n                    && split.num_docs > max_finalize_split_num_docs\n                {\n                    return true;\n                }\n                split.is_mature(now)\n            });\n        splits.extend(mature_splits);\n\n        // We then sort the split by reverse creation date and split id.\n        // You may notice that reverse is the opposite of the rest of the policy.\n        //\n        // This is because these are the youngest splits. If we limit ourselves in the number of\n        // merge we will operate, we might as well focus on the young == smaller ones for that\n        // last merge.\n        young_splits.sort_by(|left, right| {\n            left.create_timestamp\n                .cmp(&right.create_timestamp)\n                .reverse()\n                .then_with(|| left.split_id().cmp(right.split_id()))\n        });\n        let mut merge_operations = Vec::new();\n        while merge_operations.len() < self.config.max_finalize_merge_operations {\n            let min_merge_factor = FINALIZE_MIN_MERGE_FACTOR.min(self.config.max_merge_factor);\n            let merge_factor_range = min_merge_factor..=self.config.max_merge_factor;\n            if let Some(merge_op) = self.single_merge_operation_within_num_merge_op_level(\n                &mut young_splits,\n                merge_factor_range,\n            ) {\n                merge_operations.push(merge_op);\n            } else {\n                break;\n            }\n        }\n\n        // We readd the young splits that are not used in any merge operation.\n        splits.extend(young_splits);\n\n        assert!(merge_operations.len() <= self.config.max_finalize_merge_operations);\n\n        let 
num_splits_per_merge_op: Vec<usize> =\n            merge_operations.iter().map(|op| op.splits.len()).collect();\n        let num_docs_per_merge_op: Vec<usize> = merge_operations\n            .iter()\n            .map(|op| op.splits.iter().map(|split| split.num_docs).sum::<usize>())\n            .collect();\n        info!(\n            num_splits_per_merge_op=?num_splits_per_merge_op,\n            num_docs_per_merge_op=?num_docs_per_merge_op,\n            \"finalize merge operation\");\n        merge_operations\n    }\n\n    fn split_maturity(&self, split_num_docs: usize, split_num_merge_ops: usize) -> SplitMaturity {\n        if split_num_merge_ops >= self.config.max_merge_ops {\n            return SplitMaturity::Mature;\n        }\n        if split_num_docs >= self.split_num_docs_target {\n            return SplitMaturity::Mature;\n        }\n        SplitMaturity::Immature {\n            maturation_period: self.config.maturation_period,\n        }\n    }\n\n    #[cfg(test)]\n    fn check_is_valid(&self, merge_op: &MergeOperation, _remaining_splits: &[SplitMetadata]) {\n        use std::collections::HashSet;\n        assert!(merge_op.splits_as_slice().len() <= self.config.max_merge_factor);\n        if merge_op.splits_as_slice().len() < self.config.merge_factor {\n            let num_docs: usize = merge_op\n                .splits_as_slice()\n                .iter()\n                .map(|split| split.num_docs)\n                .sum();\n            let last_split_num_docs = merge_op.splits_as_slice().last().unwrap().num_docs;\n            assert!(num_docs >= self.split_num_docs_target);\n            assert!(num_docs - last_split_num_docs < self.split_num_docs_target);\n        }\n        let num_merge_ops: HashSet<usize> = merge_op\n            .splits_as_slice()\n            .iter()\n            .map(|merge_op| merge_op.num_merge_ops)\n            .collect();\n        assert_eq!(num_merge_ops.len(), 1);\n        assert!(num_merge_ops.into_iter().next().unwrap() < 
self.config.max_merge_ops);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashMap;\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use quickwit_metastore::{SplitMaturity, SplitMetadata};\n    use rand::seq::SliceRandom;\n    use time::OffsetDateTime;\n\n    use super::ConstWriteAmplificationMergePolicy;\n    use crate::MergePolicy;\n    use crate::merge_policy::MergeOperation;\n    use crate::merge_policy::tests::create_splits;\n\n    #[test]\n    fn test_split_is_mature() {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        let split = create_splits(&merge_policy, vec![9_000_000])\n            .into_iter()\n            .next()\n            .unwrap();\n        // Split under split_num_docs_target, num_merge_ops < max_merge_ops and created before now()\n        // - maturation_period is not mature.\n        assert_eq!(\n            merge_policy.split_maturity(split.num_docs, split.num_merge_ops),\n            SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(3600)\n            }\n        );\n        // Split with docs > split_num_docs_target is mature.\n        assert_eq!(\n            merge_policy\n                .split_maturity(merge_policy.split_num_docs_target + 1, split.num_merge_ops),\n            SplitMaturity::Mature\n        );\n\n        // Split with num_merge_ops >= max_merge_ops is mature\n        assert_eq!(\n            merge_policy.split_maturity(split.num_docs, merge_policy.config.max_merge_ops),\n            SplitMaturity::Mature\n        );\n    }\n\n    #[test]\n    fn test_const_write_amplification_merge_policy_empty() {\n        let mut splits = Vec::new();\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        assert!(merge_policy.operations(&mut splits).is_empty());\n    }\n\n    #[test]\n    fn test_const_write_merge_policy_single_split() {\n        let merge_policy = 
ConstWriteAmplificationMergePolicy::for_test();\n        let mut splits = vec![SplitMetadata {\n            split_id: \"01GE1R0KBFQHJ76030RYRAS8QA\".to_string(),\n            num_docs: 1,\n            create_timestamp: 1665000000,\n            maturity: merge_policy.split_maturity(1, 0),\n            num_merge_ops: 4,\n            ..Default::default()\n        }];\n        let operations: Vec<MergeOperation> = merge_policy.operations(&mut splits);\n        assert!(operations.is_empty());\n        assert_eq!(splits.len(), 1);\n    }\n\n    #[test]\n    fn test_const_write_merge_policy_simple() {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        let create_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n        let mut splits = (0..merge_policy.config.merge_factor)\n            .map(|i| SplitMetadata {\n                split_id: format!(\"split-{i}\"),\n                num_docs: 1_000,\n                num_merge_ops: 1,\n                create_timestamp,\n                maturity: merge_policy.split_maturity(1_000, 1),\n                ..Default::default()\n            })\n            .collect();\n        let operations: Vec<MergeOperation> = merge_policy.operations(&mut splits);\n        assert_eq!(operations.len(), 1);\n        assert_eq!(\n            operations[0].splits_as_slice().len(),\n            merge_policy.config.merge_factor\n        );\n    }\n\n    #[test]\n    fn test_const_write_merge_policy_merge_factor_max() {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        let time_to_maturity = merge_policy.split_maturity(1_000, 1);\n        let create_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n        let mut splits =\n            (0..merge_policy.config.max_merge_factor + merge_policy.config.merge_factor - 1)\n                .map(|i| SplitMetadata {\n                    split_id: format!(\"split-{i}\"),\n                    num_docs: 1_000,\n                    
num_merge_ops: 1,\n                    create_timestamp,\n                    maturity: time_to_maturity,\n                    ..Default::default()\n                })\n                .collect();\n        let operations: Vec<MergeOperation> = merge_policy.operations(&mut splits);\n        assert_eq!(operations.len(), 1);\n        assert_eq!(\n            operations[0].splits_as_slice().len(),\n            merge_policy.config.max_merge_factor\n        );\n    }\n\n    #[test]\n    fn test_const_write_merge_policy_older_first() {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        let time_to_maturity = merge_policy.split_maturity(1_000, 1);\n        let now_timestamp: i64 = OffsetDateTime::now_utc().unix_timestamp();\n        let mut splits: Vec<SplitMetadata> = (0..merge_policy.config.max_merge_factor)\n            .map(|i| SplitMetadata {\n                split_id: format!(\"split-{i}\"),\n                num_docs: 1_000,\n                num_merge_ops: 1,\n                create_timestamp: now_timestamp + i as i64,\n                maturity: time_to_maturity,\n                ..Default::default()\n            })\n            .collect();\n        splits.shuffle(&mut rand::rng());\n        let operations: Vec<MergeOperation> = merge_policy.operations(&mut splits);\n        assert_eq!(operations.len(), 1);\n        assert_eq!(\n            operations[0].splits_as_slice().len(),\n            merge_policy.config.max_merge_factor\n        );\n        let split_ids: Vec<&str> = operations[0]\n            .splits_as_slice()\n            .iter()\n            .map(|split| split.split_id())\n            .collect();\n        assert_eq!(\n            &split_ids[..],\n            &[\"split-0\", \"split-1\", \"split-2\", \"split-3\", \"split-4\"]\n        );\n    }\n\n    #[test]\n    fn test_const_write_merge_policy_target_num_docs() {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        let create_timestamp = 
OffsetDateTime::now_utc().unix_timestamp();\n        let mut splits = (0..4)\n            .map(|i| {\n                let num_docs = merge_policy.split_num_docs_target.div_ceil(3);\n                let time_to_maturity = merge_policy.split_maturity(num_docs, 1);\n                SplitMetadata {\n                    split_id: format!(\"split-{i}\"),\n                    num_docs,\n                    num_merge_ops: 1,\n                    create_timestamp,\n                    maturity: time_to_maturity,\n                    ..Default::default()\n                }\n            })\n            .collect();\n        let operations: Vec<MergeOperation> = merge_policy.operations(&mut splits);\n        assert_eq!(operations.len(), 1);\n        assert_eq!(operations[0].splits_as_slice().len(), 3);\n    }\n\n    #[test]\n    fn test_const_write_amp_merge_policy_proptest() {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        crate::merge_policy::tests::proptest_merge_policy(&merge_policy);\n    }\n\n    #[tokio::test]\n    async fn test_simulate_const_write_amplification_merge_policy() -> anyhow::Result<()> {\n        let merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        let vals = vec![1; 1_211]; //< 1_211 splits with a single doc each.\n        let final_splits = crate::merge_policy::tests::aux_test_simulate_merge_planner_num_docs(\n            Arc::new(merge_policy.clone()),\n            &vals[..],\n            &|splits| {\n                let mut num_merge_ops_counts: HashMap<usize, usize> = HashMap::default();\n                for split in splits {\n                    *num_merge_ops_counts.entry(split.num_merge_ops).or_default() += 1;\n                }\n                for split in splits {\n                    assert!(split.num_merge_ops <= merge_policy.config.max_merge_ops);\n                }\n                for i in 0..merge_policy.config.max_merge_ops {\n                    assert!(\n                        
num_merge_ops_counts.get(&i).copied().unwrap_or(0)\n                            < merge_policy.config.merge_factor\n                    );\n                }\n            },\n        )\n        .await?;\n        assert_eq!(final_splits.len(), 49);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_simulate_const_write_amplification_merge_policy_with_finalize() {\n        let mut merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        merge_policy.config.max_merge_factor = 10;\n        merge_policy.config.merge_factor = 10;\n        merge_policy.split_num_docs_target = 10_000_000;\n\n        let vals: Vec<usize> = vec![1; 9 + 90 + 900]; //< 999 splits with a single doc each.\n\n        let num_final_splits_given_max_finalize_merge_operations =\n            |split_num_docs: Vec<usize>, max_finalize_merge_operations: usize| {\n                let mut merge_policy_clone = merge_policy.clone();\n                merge_policy_clone.config.max_finalize_merge_operations =\n                    max_finalize_merge_operations;\n                async move {\n                    crate::merge_policy::tests::aux_test_simulate_merge_planner_num_docs(\n                        Arc::new(merge_policy_clone),\n                        &split_num_docs[..],\n                        &|_splits| {},\n                    )\n                    .await\n                    .unwrap()\n                }\n            };\n\n        assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vals.clone(), 0)\n                .await\n                .len(),\n            27\n        );\n        assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vals.clone(), 1)\n                .await\n                .len(),\n            18\n        );\n        assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vals.clone(), 2)\n                .await\n                .len(),\n            9\n        );\n        
assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vals.clone(), 3)\n                .await\n                .len(),\n            3\n        );\n        assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vec![1; 6], 1)\n                .await\n                .len(),\n            1\n        );\n        assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vec![1; 3], 1)\n                .await\n                .len(),\n            1\n        );\n        assert_eq!(\n            num_final_splits_given_max_finalize_merge_operations(vec![1; 2], 1)\n                .await\n                .len(),\n            2\n        );\n\n        // We check that the youngest splits are merged in priority.\n        let final_splits = num_final_splits_given_max_finalize_merge_operations(\n            vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],\n            1,\n        )\n        .await;\n        assert_eq!(final_splits.len(), 2);\n\n        let mut split_num_docs: Vec<usize> = final_splits\n            .iter()\n            .map(|split| split.num_docs)\n            .collect::<Vec<_>>();\n        split_num_docs.sort();\n        assert_eq!(split_num_docs[0], 11);\n        assert_eq!(split_num_docs[1], 55);\n    }\n\n    #[tokio::test]\n    async fn test_simulate_const_write_amplification_merge_policy_with_finalize_max_num_docs() {\n        let mut merge_policy = ConstWriteAmplificationMergePolicy::for_test();\n        merge_policy.config.max_merge_factor = 10;\n        merge_policy.config.merge_factor = 10;\n        merge_policy.split_num_docs_target = 10_000_000;\n        merge_policy.config.max_finalize_split_num_docs = Some(999_999);\n        merge_policy.config.max_finalize_merge_operations = 3;\n\n        let split_num_docs: Vec<usize> = vec![999_999, 1_000_000, 999_999, 999_999];\n\n        let final_splits = crate::merge_policy::tests::aux_test_simulate_merge_planner_num_docs(\n            
Arc::new(merge_policy),\n            &split_num_docs[..],\n            &|_splits| {},\n        )\n        .await\n        .unwrap();\n\n        assert_eq!(final_splits.len(), 2);\n        let mut split_num_docs: Vec<usize> =\n            final_splits.iter().map(|split| split.num_docs).collect();\n        split_num_docs.sort();\n        assert_eq!(split_num_docs[0], 1_000_000);\n        assert_eq!(split_num_docs[1], 999_999 * 3);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/merge_policy/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod const_write_amplification;\nmod nop_merge_policy;\nmod stable_log_merge_policy;\n\nuse std::fmt;\nuse std::ops::Deref;\nuse std::sync::Arc;\n\npub(crate) use const_write_amplification::ConstWriteAmplificationMergePolicy;\nuse itertools::Itertools;\npub use nop_merge_policy::NopMergePolicy;\nuse quickwit_config::IndexingSettings;\nuse quickwit_config::merge_policy_config::MergePolicyConfig;\nuse quickwit_metastore::{SplitMaturity, SplitMetadata};\nuse quickwit_proto::types::SplitId;\nuse serde::Serialize;\npub(crate) use stable_log_merge_policy::StableLogMergePolicy;\nuse tantivy::TrackedObject;\nuse tracing::{Span, info_span};\n\nuse crate::actors::MergePermit;\nuse crate::new_split_id;\n\n#[derive(Clone, Debug, PartialEq, Eq, Serialize)]\npub enum MergeOperationType {\n    Merge,\n    DeleteAndMerge,\n}\n\nimpl fmt::Display for MergeOperationType {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"{self:?}\")\n    }\n}\n\npub struct MergeTask {\n    pub merge_operation: TrackedObject<MergeOperation>,\n    pub(crate) _merge_permit: MergePermit,\n}\n\nimpl MergeTask {\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_merge_operation_for_test(merge_operation: MergeOperation) -> MergeTask {\n        let inventory = tantivy::Inventory::default();\n        let tracked_merge_operation = 
inventory.track(merge_operation);\n        MergeTask {\n            merge_operation: tracked_merge_operation,\n            _merge_permit: MergePermit::for_test(),\n        }\n    }\n}\n\nimpl fmt::Debug for MergeTask {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        self.merge_operation.as_ref().fmt(f)\n    }\n}\n\nimpl Deref for MergeTask {\n    type Target = MergeOperation;\n\n    fn deref(&self) -> &Self::Target {\n        self.merge_operation.as_ref()\n    }\n}\n\n#[derive(Clone, Serialize)]\npub struct MergeOperation {\n    #[serde(skip_serializing)]\n    pub merge_parent_span: Span,\n    pub merge_split_id: SplitId,\n    pub splits: Vec<SplitMetadata>,\n    pub operation_type: MergeOperationType,\n}\n\nimpl MergeOperation {\n    pub fn new_merge_operation(splits: Vec<SplitMetadata>) -> Self {\n        let merge_split_id = new_split_id();\n        let split_ids = splits.iter().map(|split| split.split_id()).collect_vec();\n        let merge_parent_span = info_span!(\"merge\", merge_split_id=%merge_split_id, split_ids=?split_ids, typ=%MergeOperationType::Merge);\n        Self {\n            merge_parent_span,\n            merge_split_id,\n            splits,\n            operation_type: MergeOperationType::Merge,\n        }\n    }\n\n    pub fn total_num_bytes(&self) -> u64 {\n        self.splits\n            .iter()\n            .map(|split: &SplitMetadata| split.footer_offsets.end)\n            .sum()\n    }\n\n    pub fn new_delete_and_merge_operation(split: SplitMetadata) -> Self {\n        let merge_split_id = new_split_id();\n        let merge_parent_span = info_span!(\"delete\", merge_split_id=%merge_split_id, split_ids=?split.split_id(), typ=%MergeOperationType::DeleteAndMerge);\n        Self {\n            merge_parent_span,\n            merge_split_id,\n            splits: vec![split],\n            operation_type: MergeOperationType::DeleteAndMerge,\n        }\n    }\n\n    pub fn splits_as_slice(&self) -> &[SplitMetadata] {\n      
  self.splits.as_slice()\n    }\n}\n\nimpl fmt::Debug for MergeOperation {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            f,\n            \"Merge(operation_type={}, merged_split_id={},splits=[\",\n            self.operation_type, self.merge_split_id\n        )?;\n        for split in &self.splits {\n            write!(f, \"{},\", split.split_id())?;\n        }\n        write!(f, \"])\")?;\n        Ok(())\n    }\n}\n\n/// A merge policy wraps the logic that decides what should be merged.\n/// The SplitMetadata must be extracted from the splits `Vec`.\n///\n/// It is called by the merge planner whenever a new split is added.\npub trait MergePolicy: Send + Sync + fmt::Debug {\n    /// Returns the list of merge operations that should be performed.\n    fn operations(&self, splits: &mut Vec<SplitMetadata>) -> Vec<MergeOperation>;\n\n    /// After the last indexing pipeline has been shutdown, quickwit\n    /// finishes the ongoing merge operations, and eventually needs to shut it down.\n    ///\n    /// This method makes it possible to offer a last list of merge operations before\n    /// really shutting down the merge policy.\n    ///\n    /// This is especially useful for users relying on a one-index-per-day scheme.\n    fn finalize_operations(&self, _splits: &mut Vec<SplitMetadata>) -> Vec<MergeOperation> {\n        Vec::new()\n    }\n\n    /// Returns split maturity.\n    /// A split is either:\n    /// - `Mature` if it does not undergo new merge operations.\n    /// - or `Immature` with a `maturation_period` after which it becomes mature.\n    fn split_maturity(&self, split_num_docs: usize, split_num_merge_ops: usize) -> SplitMaturity;\n\n    /// Checks a bunch of properties specific to the given merge policy.\n    /// This method is used in proptesting.\n    ///\n    /// - `merge_op` is a merge operation emitted by this merge policy.\n    /// - `remaining_splits` is the list of remaining splits.\n    #[cfg(test)]\n    fn 
check_is_valid(&self, _merge_op: &MergeOperation, _remaining_splits: &[SplitMetadata]) {}\n}\n\npub fn merge_policy_from_settings(settings: &IndexingSettings) -> Arc<dyn MergePolicy> {\n    match settings.merge_policy.clone() {\n        MergePolicyConfig::Nop => Arc::new(NopMergePolicy),\n        MergePolicyConfig::ConstWriteAmplification(config) => {\n            let merge_policy =\n                ConstWriteAmplificationMergePolicy::new(config, settings.split_num_docs_target);\n            Arc::new(merge_policy)\n        }\n        MergePolicyConfig::StableLog(config) => {\n            let merge_policy = StableLogMergePolicy::new(config, settings.split_num_docs_target);\n            Arc::new(merge_policy)\n        }\n    }\n}\n\npub fn default_merge_policy() -> Arc<dyn MergePolicy> {\n    let indexing_settings = IndexingSettings::default();\n    merge_policy_from_settings(&indexing_settings)\n}\n\npub fn nop_merge_policy() -> Arc<dyn MergePolicy> {\n    Arc::new(NopMergePolicy)\n}\n\nstruct SplitShortDebug<'a>(&'a SplitMetadata);\n\nimpl fmt::Debug for SplitShortDebug<'_> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"Split\")\n            .field(\"split_id\", &self.0.split_id())\n            .field(\"num_docs\", &self.0.num_docs)\n            .finish()\n    }\n}\n\nfn splits_short_debug(splits: &[SplitMetadata]) -> Vec<SplitShortDebug<'_>> {\n    splits.iter().map(SplitShortDebug).collect()\n}\n\n#[cfg(test)]\npub mod tests {\n\n    use std::collections::hash_map::DefaultHasher;\n    use std::collections::{BTreeSet, HashMap};\n    use std::hash::Hasher;\n    use std::ops::RangeInclusive;\n\n    use proptest::prelude::*;\n    use quickwit_actors::Universe;\n    use quickwit_proto::indexing::{IndexingPipelineId, MergePipelineId};\n    use quickwit_proto::types::{IndexUid, NodeId, PipelineUid};\n    use rand::seq::SliceRandom;\n    use time::OffsetDateTime;\n\n    use super::*;\n    use crate::actors::{\n        
MergePlanner, MergeSchedulerService, MergeSplitDownloader, RunFinalizeMergePolicyAndQuit,\n        merge_split_attrs,\n    };\n    use crate::models::{NewSplits, create_split_metadata};\n\n    fn pow_of_10(n: usize) -> usize {\n        10usize.pow(n as u32)\n    }\n\n    prop_compose! {\n        fn num_docs_around_power_of_ten()(\n            pow_ten in 1usize..5usize,\n            diff in -2isize..2isize\n        ) -> usize {\n            (pow_of_10(pow_ten) as isize + diff).max(1isize) as usize\n        }\n    }\n\n    fn num_docs_strategy() -> impl Strategy<Value = usize> {\n        prop_oneof![1usize..10_000_000usize, num_docs_around_power_of_ten()]\n    }\n\n    prop_compose! {\n      fn split_strategy()\n        (num_merge_ops in 0usize..5usize, start_timestamp in 1_664_000_000i64..1_665_000_000i64, average_time_delta in 100i64..120i64, delta_creation_date in 0u64..100_000u64, num_docs in num_docs_strategy()) -> SplitMetadata {\n        let split_id = crate::new_split_id();\n        let end_timestamp = start_timestamp + average_time_delta * pow_of_10(num_merge_ops) as i64;\n        let create_timestamp: i64 = (end_timestamp as u64 + delta_creation_date) as i64;\n        SplitMetadata {\n            split_id,\n            time_range: Some(start_timestamp..=end_timestamp),\n            num_docs,\n            create_timestamp,\n            num_merge_ops,\n            .. 
Default::default()\n        }\n      }\n    }\n\n    pub(crate) fn create_splits(\n        merge_policy: &dyn MergePolicy,\n        num_docs_vec: Vec<usize>,\n    ) -> Vec<SplitMetadata> {\n        let num_docs_with_timestamp = num_docs_vec\n            .into_iter()\n            // we give the same timestamp to all of them and rely on stable sort to keep the split\n            // order.\n            .map(|num_docs| (num_docs, (1630563067..=1630564067)))\n            .collect();\n        create_splits_with_timestamps(merge_policy, num_docs_with_timestamp)\n    }\n\n    fn create_splits_with_timestamps(\n        merge_policy: &dyn MergePolicy,\n        num_docs_vec: Vec<(usize, RangeInclusive<i64>)>,\n    ) -> Vec<SplitMetadata> {\n        num_docs_vec\n            .into_iter()\n            .enumerate()\n            .map(|(split_ord, (num_docs, time_range))| {\n                let create_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n                let time_to_maturity = merge_policy.split_maturity(num_docs, 0);\n                SplitMetadata {\n                    split_id: format!(\"split_{split_ord:02}\"),\n                    num_docs,\n                    time_range: Some(time_range),\n                    create_timestamp,\n                    maturity: time_to_maturity,\n                    ..Default::default()\n                }\n            })\n            .collect()\n    }\n\n    // Creates a checksum for a given merge operation.\n    // This does not take into account the merge split id,\n    // and is split order independent.\n    fn compute_checksum_op(op: &MergeOperation) -> u64 {\n        let mut checksum = 0u64;\n        for split in op.splits_as_slice() {\n            let mut hasher = DefaultHasher::default();\n            hasher.write(split.split_id.as_bytes());\n            checksum ^= hasher.finish();\n        }\n        checksum\n    }\n\n    // Creates a checksum for a set of operations.\n    // This checksum does not depend on the order 
of the merge operations,\n    // nor the merge split ids.\n    fn compute_checksum_ops(ops: &[MergeOperation]) -> u64 {\n        let mut checksum = 0u64;\n        for op in ops {\n            let op_checksum = compute_checksum_op(op);\n            let mut hasher = DefaultHasher::default();\n            hasher.write_u64(op_checksum);\n            checksum ^= hasher.finish();\n        }\n        checksum\n    }\n\n    fn compare_merge_operations(left_ops: &[MergeOperation], right_ops: &[MergeOperation]) -> bool {\n        compute_checksum_ops(left_ops) == compute_checksum_ops(right_ops)\n    }\n\n    pub(crate) fn proptest_merge_policy(merge_policy: &dyn MergePolicy) {\n        proptest!(|(mut splits in prop::collection::vec(split_strategy(), 0..100))| {\n            let mut cloned_splits = splits.clone();\n            cloned_splits.shuffle(&mut rand::rng());\n\n            let original_num_splits = splits.len();\n\n            let mut operations: Vec<MergeOperation> = merge_policy.operations(&mut splits);\n            let operations_after_shuffle = merge_policy.operations(&mut cloned_splits);\n            assert!(compare_merge_operations(&operations[..],\n                &operations_after_shuffle[..]),\n                \"Merge policy result should be independent from the original order.\");\n\n            let num_splits_in_merge: usize = operations.iter().map(|op| op.splits_as_slice().len()).sum();\n\n            assert_eq!(\n                num_splits_in_merge + splits.len(), original_num_splits,\n                \"Splits should not be lost.\"\n            );\n\n            // This property is not uninteresting but is currently not observed\n            // in the stable log merge policy.\n            // assert!(\n            //     merge_policy.operations(&mut splits).is_empty(),\n            //     \"Merge policy are expected to return all available merge operations.\"\n            // );\n            let now_utc = OffsetDateTime::now_utc();\n            for 
merge_op in &mut operations {\n                assert_eq!(merge_op.operation_type, MergeOperationType::Merge,\n                    \"A merge policy should only emit Merge operations.\"\n                );\n                assert!(merge_op.splits_as_slice().len() >= 2,\n            \"Merge policies should not suggest merging a single split.\");\n                for split in merge_op.splits_as_slice() {\n                    assert!(!split.is_mature(now_utc), \"Merges should not contain mature splits.\");\n                }\n                merge_policy.check_is_valid(merge_op, &splits[..]);\n            }\n        });\n    }\n\n    fn merge_tags(splits: &[SplitMetadata]) -> BTreeSet<String> {\n        splits\n            .iter()\n            .flat_map(|split| split.tags.iter().cloned())\n            .collect()\n    }\n\n    fn fake_merge(merge_policy: &Arc<dyn MergePolicy>, splits: &[SplitMetadata]) -> SplitMetadata {\n        assert!(!splits.is_empty(), \"Split list should not be empty.\");\n        let merged_split_id = new_split_id();\n        let tags = merge_tags(splits);\n        let pipeline_id = MergePipelineId {\n            node_id: NodeId::from(\"test_node\"),\n            index_uid: IndexUid::new_with_random_ulid(\"test_index\"),\n            source_id: \"test_source\".to_string(),\n        };\n        let split_attrs = merge_split_attrs(pipeline_id, merged_split_id, splits).unwrap();\n        create_split_metadata(merge_policy, None, &split_attrs, tags, 0..0)\n    }\n\n    fn apply_merge(\n        merge_policy: &Arc<dyn MergePolicy>,\n        split_index: &mut HashMap<String, SplitMetadata>,\n        merge_op: &MergeOperation,\n    ) -> SplitMetadata {\n        for split in merge_op.splits_as_slice() {\n            assert!(split_index.remove(split.split_id()).is_some());\n        }\n        let merged_split = fake_merge(merge_policy, merge_op.splits_as_slice());\n        split_index.insert(merged_split.split_id().to_string(), merged_split.clone());\n     
   merged_split\n    }\n\n    async fn aux_test_simulate_merge_planner(\n        merge_policy: Arc<dyn MergePolicy>,\n        incoming_splits: Vec<SplitMetadata>,\n        check_final_configuration: &dyn Fn(&[SplitMetadata]),\n    ) -> anyhow::Result<Vec<SplitMetadata>> {\n        let universe = Universe::new();\n        let (merge_task_mailbox, merge_task_inbox) =\n            universe.create_test_mailbox::<MergeSplitDownloader>();\n        let pipeline_id = IndexingPipelineId {\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: NodeId::from(\"test-node\"),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let merge_planner = MergePlanner::new(\n            &pipeline_id.merge_pipeline_id(),\n            Vec::new(),\n            merge_policy.clone(),\n            merge_task_mailbox,\n            universe.get_or_spawn_one::<MergeSchedulerService>(),\n        );\n        let mut split_index: HashMap<String, SplitMetadata> = HashMap::default();\n        let (merge_planner_mailbox, merge_planner_handler) =\n            universe.spawn_builder().spawn(merge_planner);\n\n        for split in incoming_splits {\n            split_index.insert(split.split_id().to_string(), split.clone());\n            merge_planner_mailbox\n                .send_message(NewSplits {\n                    new_splits: vec![split],\n                })\n                .await?;\n            loop {\n                let obs = merge_planner_handler.process_pending_and_observe().await;\n                assert_eq!(obs.obs_type, quickwit_actors::ObservationType::Alive);\n                let merge_tasks = merge_task_inbox.drain_for_test_typed::<MergeTask>();\n                if merge_tasks.is_empty() {\n                    break;\n                }\n                let new_splits: Vec<SplitMetadata> = merge_tasks\n                    .into_iter()\n                    .map(|merge_op| 
apply_merge(&merge_policy, &mut split_index, &merge_op))\n                    .collect();\n                merge_planner_mailbox\n                    .send_message(NewSplits { new_splits })\n                    .await?;\n            }\n            let split_metadatas: Vec<SplitMetadata> = split_index.values().cloned().collect();\n            check_final_configuration(&split_metadatas);\n        }\n\n        merge_planner_mailbox\n            .send_message(RunFinalizeMergePolicyAndQuit)\n            .await\n            .unwrap();\n\n        let obs = merge_planner_handler.process_pending_and_observe().await;\n        assert_eq!(obs.obs_type, quickwit_actors::ObservationType::PostMortem);\n\n        let merge_tasks = merge_task_inbox.drain_for_test_typed::<MergeTask>();\n        for merge_task in merge_tasks {\n            apply_merge(&merge_policy, &mut split_index, &merge_task);\n        }\n\n        let split_metadatas: Vec<SplitMetadata> = split_index.values().cloned().collect();\n\n        universe.assert_quit().await;\n        Ok(split_metadatas)\n    }\n\n    /// Mock split meta helper.\n    fn mock_split_meta_from_num_docs(\n        time_range: RangeInclusive<i64>,\n        num_docs: u64,\n        maturity: SplitMaturity,\n    ) -> SplitMetadata {\n        SplitMetadata {\n            split_id: crate::new_split_id(),\n            partition_id: 3u64,\n            num_docs: num_docs as usize,\n            uncompressed_docs_size_in_bytes: 256u64 * num_docs,\n            time_range: Some(time_range),\n            create_timestamp: OffsetDateTime::now_utc().unix_timestamp(),\n            maturity,\n            tags: BTreeSet::from_iter(vec![\"tenant_id:1\".to_string(), \"tenant_id:2\".to_string()]),\n            footer_offsets: 0..100,\n            index_uid: IndexUid::new_with_random_ulid(\"test-index\"),\n            source_id: \"test-source\".to_string(),\n            node_id: \"test-node\".to_string(),\n            ..Default::default()\n        }\n    }\n\n    
pub async fn aux_test_simulate_merge_planner_num_docs(\n        merge_policy: Arc<dyn MergePolicy>,\n        batch_num_docs: &[usize],\n        check_final_configuration: &dyn Fn(&[SplitMetadata]),\n    ) -> anyhow::Result<Vec<SplitMetadata>> {\n        let split_metadatas: Vec<SplitMetadata> = batch_num_docs\n            .iter()\n            .cloned()\n            .enumerate()\n            .map(|(split_ord, num_docs)| {\n                let time_first = split_ord as i64 * 1_000;\n                let time_last = time_first + 999;\n                let time_range = time_first..=time_last;\n                let time_to_maturity = merge_policy.split_maturity(num_docs, 0);\n                mock_split_meta_from_num_docs(time_range, num_docs as u64, time_to_maturity)\n            })\n            .collect();\n        aux_test_simulate_merge_planner(merge_policy, split_metadatas, check_final_configuration)\n            .await\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/merge_policy/nop_merge_policy.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse quickwit_metastore::SplitMaturity;\n\nuse crate::merge_policy::MergePolicy;\n\n/// The NopMergePolicy, as the name suggests, is no-op and does not perform any merges.\n/// <https://en.wikipedia.org/wiki/NOP_(code)>\n#[derive(Debug)]\npub struct NopMergePolicy;\n\nimpl fmt::Display for NopMergePolicy {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        write!(f, \"{self:?}\")\n    }\n}\n\nimpl MergePolicy for NopMergePolicy {\n    fn operations(\n        &self,\n        _splits: &mut Vec<quickwit_metastore::SplitMetadata>,\n    ) -> Vec<super::MergeOperation> {\n        Vec::new()\n    }\n\n    fn split_maturity(&self, _split_num_docs: usize, _split_num_merge_ops: usize) -> SplitMaturity {\n        // With the no merge policy, all splits are mature immediately as they will never undergo\n        // any merge.\n        SplitMaturity::Mature\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use quickwit_metastore::SplitMaturity;\n\n    use crate::merge_policy::{MergePolicy, NopMergePolicy};\n\n    #[test]\n    pub fn test_no_merge_policy_maturity_timestamp() {\n        // All splits are always mature for `NopMergePolicy`.\n        assert_eq!(NopMergePolicy.split_maturity(10, 0), SplitMaturity::Mature);\n    }\n\n    #[test]\n    pub fn test_no_merge_policy_operations() {\n        let mut splits = 
super::super::tests::create_splits(&NopMergePolicy, vec![1; 100]);\n        assert!(NopMergePolicy.operations(&mut splits).is_empty());\n        assert_eq!(splits.len(), 100);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/merge_policy/stable_log_merge_policy.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Ordering;\nuse std::ops::Range;\n\nuse quickwit_config::IndexingSettings;\nuse quickwit_config::merge_policy_config::StableLogMergePolicyConfig;\nuse quickwit_metastore::{SplitMaturity, SplitMetadata};\nuse time::OffsetDateTime;\nuse tracing::debug;\n\nuse crate::merge_policy::{MergeOperation, MergePolicy, splits_short_debug};\n\n/// `StableLogMergePolicy` is a rather naive implementation optimized\n/// for splits produced by a rather stable stream of splits,\n/// with incoming documents ordered more or less as expected time, so that splits are\n/// time pruning is efficient out of the box.\n///\n/// The logic goes as follows.\n/// Each splits has\n/// - a number of documents\n/// - an end time\n///\n/// The policy first builds the merge operations\n///\n/// ### Build merge operations\n/// We start by sorting the splits by reverse date so that the most recent splits are\n/// coming first.\n/// We iterate through the splits and assign them to increasing levels.\n/// Level 0 will receive `{split_i}` for i within `[0..l_0)`\n/// ...\n/// Level k will receive `{split_i}` for i within `[l_{k-1}..l_k)`\n///\n/// The limit at which we change level is simply defined as\n/// `l_0 = 3 x self.min_level_num_docs`.\n///\n/// Assuming level N-1 has been built, level N is given by\n/// `l_N = min(num_docs(split_l_{N_1})` * 3, 
self.split_num_docs_target)`.\n/// We stop once l_N = self.split_num_docs_target is reached.\n///\n/// As a result, each level interval is at least 3 times larger than the previous one,\n/// forming a logscale over the number of documents.\n///\n/// We stop adding splits to a merge once it reaches `split_num_docs_target`\n/// documents.\n#[derive(Debug, Clone)]\npub struct StableLogMergePolicy {\n    config: StableLogMergePolicyConfig,\n    split_num_docs_target: usize,\n}\n\nimpl Default for StableLogMergePolicy {\n    fn default() -> Self {\n        StableLogMergePolicy {\n            config: Default::default(),\n            split_num_docs_target: IndexingSettings::default_split_num_docs_target(),\n        }\n    }\n}\n\nfn remove_matching_items<T, Pred: Fn(&T) -> bool>(items: &mut Vec<T>, predicate: Pred) -> Vec<T> {\n    let mut matching_items = Vec::new();\n    let mut i = 0;\n    while i < items.len() {\n        if predicate(&items[i]) {\n            let matching_item = items.remove(i);\n            matching_items.push(matching_item);\n        } else {\n            i += 1;\n        }\n    }\n    matching_items\n}\n\nimpl StableLogMergePolicy {\n    pub fn new(\n        config: StableLogMergePolicyConfig,\n        split_num_docs_target: usize,\n    ) -> StableLogMergePolicy {\n        StableLogMergePolicy {\n            config,\n            split_num_docs_target,\n        }\n    }\n}\n\nimpl MergePolicy for StableLogMergePolicy {\n    fn operations(&self, splits: &mut Vec<SplitMetadata>) -> Vec<MergeOperation> {\n        let original_num_splits = splits.len();\n        let operations = self.merge_operations(splits);\n        debug_assert_eq!(\n            original_num_splits,\n            operations\n                .iter()\n                .map(|op| op.splits_as_slice().len())\n                .sum::<usize>()\n                + splits.len(),\n            \"The merge policy is supposed to keep the number of splits.\"\n        
);\n        operations\n    }\n\n    /// A mature split for merge is a split that won't undergo any merge operation in the future.\n    fn split_maturity(&self, split_num_docs: usize, _split_num_merge_ops: usize) -> SplitMaturity {\n        if split_num_docs >= self.split_num_docs_target {\n            return SplitMaturity::Mature;\n        }\n        SplitMaturity::Immature {\n            maturation_period: self.config.maturation_period,\n        }\n    }\n\n    #[cfg(test)]\n    fn check_is_valid(&self, merge_op: &MergeOperation, _remaining_splits: &[SplitMetadata]) {\n        assert!(merge_op.splits_as_slice().len() <= self.config.max_merge_factor);\n        if merge_op.splits_as_slice().len() < self.config.merge_factor {\n            let num_docs: usize = merge_op\n                .splits_as_slice()\n                .iter()\n                .map(|split| split.num_docs)\n                .sum();\n            let last_split_num_docs = merge_op\n                .splits_as_slice()\n                .iter()\n                .min_by(|&left, &right| cmp_splits_by_reverse_time_end(left, right))\n                .unwrap()\n                .num_docs;\n            assert!(num_docs >= self.split_num_docs_target);\n            assert!(num_docs - last_split_num_docs < self.split_num_docs_target);\n        }\n    }\n}\n\n#[derive(Clone, Copy, Eq, PartialEq)]\nenum MergeCandidateSize {\n    /// The split candidate is too small to be considered for execution.\n    TooSmall,\n    /// The split candidate is good to go.\n    ValidSplit,\n    /// We should not add an extra split in this candidate.\n    /// This can happen for either of the two following reasons:\n    /// - the number of splits involved already reached `max_merge_factor`.\n    /// - the overall number of docs that will end up in the merged segment already exceeds\n    ///   `split_num_docs_target`.\n    OneMoreSplitWouldBeTooBig,\n}\n\nfn extract_time_end(split: &SplitMetadata) -> Option<i64> {\n    let end_timestamp = 
split.time_range.as_ref()?.end();\n    Some(*end_timestamp)\n}\n\n// Total ordering by\n// - reverse time end.\n// - number of docs\n// - split ids <- this one is just to make the result of the policy  invariant when shuffling the\n//   input splits.\nfn cmp_splits_by_reverse_time_end(left: &SplitMetadata, right: &SplitMetadata) -> Ordering {\n    extract_time_end(left)\n        .cmp(&extract_time_end(right))\n        .reverse()\n        .then_with(|| left.num_docs.cmp(&right.num_docs))\n        .then_with(|| {\n            left.split_id().cmp(right.split_id()) //< for determinism.\n        })\n}\n\nimpl StableLogMergePolicy {\n    fn merge_operations(&self, splits: &mut Vec<SplitMetadata>) -> Vec<MergeOperation> {\n        if splits.len() < 2 {\n            return Vec::new();\n        }\n        // First we isolate splits that are mature.\n        let splits_not_for_merge =\n            remove_matching_items(splits, |split| split.is_mature(OffsetDateTime::now_utc()));\n\n        let mut merge_operations: Vec<MergeOperation> = Vec::new();\n        splits.sort_unstable_by(cmp_splits_by_reverse_time_end);\n        debug!(splits=?splits_short_debug(&splits[..]), \"merge-policy-run\");\n\n        // Splits should naturally have an increasing num_merge\n        let split_levels = self.build_split_levels(splits);\n        for split_range in split_levels.into_iter().rev() {\n            debug!(splits=?splits_short_debug(&splits[split_range.clone()]));\n            if let Some(merge_range) = self.merge_candidate_from_level(splits, split_range) {\n                debug!(merge_range=?merge_range, \"merge-candidate\");\n                let splits_in_merge: Vec<SplitMetadata> = splits.drain(merge_range).collect();\n                let merge_operation = MergeOperation::new_merge_operation(splits_in_merge);\n                merge_operations.push(merge_operation);\n            } else {\n                debug!(\"no-merge\");\n            }\n        }\n        
splits.extend(splits_not_for_merge);\n        merge_operations\n    }\n\n    /// This function groups splits in levels.\n    ///\n    /// It assumes that splits are almost sorted by their increasing size,\n    /// but should behave decently (not create too many levels) if they are not.\n    ///\n    /// All splits are required to have a number of documents lower than\n    /// `self.split_num_docs_target`\n    pub(crate) fn build_split_levels(&self, splits: &[SplitMetadata]) -> Vec<Range<usize>> {\n        assert!(\n            splits\n                .iter()\n                .all(|split| split.num_docs < self.split_num_docs_target),\n            \"All splits are expected to be smaller than `split_num_docs_target`.\"\n        );\n        if splits.is_empty() {\n            return Vec::new();\n        }\n\n        let mut split_levels: Vec<Range<usize>> = Vec::new();\n        let mut current_level_start_ord = 0;\n        let mut current_level_max_docs =\n            (splits[0].num_docs * 3).max(self.config.min_level_num_docs);\n\n        #[allow(clippy::single_range_in_vec_init)]\n        let mut levels = vec![(0..current_level_max_docs)]; // for logging only\n        for (split_ord, split) in splits.iter().enumerate() {\n            if split.num_docs >= current_level_max_docs {\n                split_levels.push(current_level_start_ord..split_ord);\n                current_level_start_ord = split_ord;\n                current_level_max_docs = 3 * split.num_docs;\n                levels.push(split.num_docs..current_level_max_docs)\n            }\n        }\n        debug!(levels=?levels);\n        split_levels.push(current_level_start_ord..splits.len());\n        split_levels\n    }\n\n    /// Given splits tries to select a subrange of level_range that would be a good merge candidate.\n    fn merge_candidate_from_level(\n        &self,\n        splits: &[SplitMetadata],\n        level_range: Range<usize>,\n    ) -> Option<Range<usize>> {\n        let 
merge_candidate_end = level_range.end;\n        let mut merge_candidate_start = merge_candidate_end;\n        for split_ord in level_range.rev() {\n            if self.merge_candidate_size(&splits[merge_candidate_start..merge_candidate_end])\n                == MergeCandidateSize::OneMoreSplitWouldBeTooBig\n            {\n                break;\n            }\n            merge_candidate_start = split_ord;\n        }\n        if self.merge_candidate_size(&splits[merge_candidate_start..merge_candidate_end])\n            == MergeCandidateSize::TooSmall\n        {\n            return None;\n        }\n        Some(merge_candidate_start..merge_candidate_end)\n    }\n\n    /// Classifies a merge candidate: too small to be merged, valid as is, or already large\n    /// enough that no extra split should be added to it.\n    fn merge_candidate_size(&self, splits: &[SplitMetadata]) -> MergeCandidateSize {\n        // We don't perform merge with a single segment. We\n        // may relax this in the future in order to compact deletes.\n        if splits.len() <= 1 {\n            return MergeCandidateSize::TooSmall;\n        }\n\n        // There are already enough splits in this merge.\n        if splits.len() >= self.config.max_merge_factor {\n            return MergeCandidateSize::OneMoreSplitWouldBeTooBig;\n        }\n        let num_docs_in_merge: usize = splits.iter().map(|split| split.num_docs).sum();\n\n        // The resulting split will reach or exceed `split_num_docs_target`.\n        if num_docs_in_merge >= self.split_num_docs_target {\n            return MergeCandidateSize::OneMoreSplitWouldBeTooBig;\n        }\n\n        if splits.len() < self.config.merge_factor {\n            return MergeCandidateSize::TooSmall;\n        }\n\n        MergeCandidateSize::ValidSplit\n    }\n}\n\n// Helpers which expose some internal properties of\n// the stable log merge policy to be tested in unit tests.\n#[cfg(test)]\nimpl StableLogMergePolicy {\n    fn case_levels_given_growth_factor(&self, growth_factor: usize) -> 
Vec<usize> {\n        assert!(self.config.min_level_num_docs > 0);\n        assert!(self.config.merge_factor > 1);\n        assert!(self.config.max_merge_factor >= self.config.merge_factor);\n        assert!(self.split_num_docs_target > self.config.min_level_num_docs);\n        let mut levels_start_num_docs = vec![1];\n        let mut level_end_doc = self.config.min_level_num_docs;\n        while level_end_doc < self.split_num_docs_target {\n            levels_start_num_docs.push(level_end_doc);\n            level_end_doc *= growth_factor;\n        }\n        levels_start_num_docs.push(self.split_num_docs_target);\n        levels_start_num_docs\n    }\n\n    pub fn max_num_splits_ideal_case(&self, num_docs: u64) -> usize {\n        let levels = self.case_levels_given_growth_factor(self.config.merge_factor);\n        self.max_num_splits_knowning_levels(num_docs, &levels, true)\n    }\n\n    pub fn max_num_splits_worst_case(&self, num_docs: u64) -> usize {\n        let levels = self.case_levels_given_growth_factor(3);\n        self.max_num_splits_knowning_levels(num_docs, &levels, false)\n    }\n\n    fn max_num_splits_knowning_levels(\n        &self,\n        mut num_docs: u64,\n        levels: &[usize],\n        sorted: bool,\n    ) -> usize {\n        assert!(levels.is_sorted());\n\n        if num_docs == 0 {\n            return 0;\n        }\n        let (&head, tail) = levels.split_first().unwrap();\n        if num_docs < head as u64 {\n            return 0;\n        }\n        let first_level_min_saturation_docs = if sorted {\n            head * (self.config.merge_factor - 1)\n        } else {\n            head + (self.config.merge_factor - 2)\n        };\n        if tail.is_empty() || num_docs <= first_level_min_saturation_docs as u64 {\n            return (num_docs as usize).div_ceil(head);\n        }\n        num_docs -= first_level_min_saturation_docs as u64;\n        self.config.merge_factor - 1 + self.max_num_splits_knowning_levels(num_docs, tail, 
sorted)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use super::*;\n    use crate::merge_policy::tests::{aux_test_simulate_merge_planner_num_docs, create_splits};\n\n    #[test]\n    fn test_split_is_mature() {\n        let merge_policy = StableLogMergePolicy::default();\n        // Split under split_num_docs_target and created before now() - maturation_period is not\n        // mature.\n        assert_eq!(\n            merge_policy.split_maturity(9_000_000, 0),\n            SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(3600 * 48)\n            }\n        );\n        assert_eq!(\n            merge_policy.split_maturity(&merge_policy.split_num_docs_target + 1, 0),\n            SplitMaturity::Mature\n        );\n        // Split under split_num_docs_target but with create_timestamp >= now + maturity duration is\n        // mature.\n        assert_eq!(\n            merge_policy.split_maturity(9_000_000, 0),\n            SplitMaturity::Immature {\n                maturation_period: merge_policy.config.maturation_period\n            }\n        );\n    }\n\n    #[test]\n    fn test_build_split_levels() {\n        let merge_policy = StableLogMergePolicy::default();\n        let splits = Vec::new();\n        let split_groups = merge_policy.build_split_levels(&splits);\n        assert!(split_groups.is_empty());\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_build_split_simple() {\n        let merge_policy: StableLogMergePolicy = StableLogMergePolicy::default();\n        let splits = create_splits(\n            &merge_policy,\n            vec![100_000, 100_000, 100_000, 800_000, 900_000],\n        );\n        let split_groups = merge_policy.build_split_levels(&splits);\n        assert_eq!(&split_groups, &[0..3, 3..5]);\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_build_split_perfect_world() {\n        let merge_policy = StableLogMergePolicy::default();\n     
   let splits = create_splits(\n            &merge_policy,\n            vec![\n                100_000, 100_000, 100_000, 100_000, 100_000, 100_000, 100_000, 100_000, 800_000,\n                1_600_000,\n            ],\n        );\n        let split_groups = merge_policy.build_split_levels(&splits);\n        assert_eq!(&split_groups, &[0..8, 8..10]);\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_build_split_decreasing() {\n        let merge_policy = StableLogMergePolicy::default();\n        let splits = create_splits(\n            &merge_policy,\n            vec![\n                100_000, 100_000, 100_000, 100_000, 100_000, 100_000, 100_000, 100_000, 800_000,\n                100_000, 1_600_000,\n            ],\n        );\n        let split_groups = merge_policy.build_split_levels(&splits);\n        assert_eq!(&split_groups, &[0..8, 8..11]);\n    }\n\n    #[test]\n    #[should_panic(expected = \"All splits are expected to be smaller than `split_num_docs_target`.\")]\n    fn test_stable_log_merge_policy_build_split_panics_if_exceeding_split_num_docs_target() {\n        let merge_policy = StableLogMergePolicy::default();\n        let splits = create_splits(&merge_policy, vec![11_000_000]);\n        merge_policy.build_split_levels(&splits);\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_not_enough_splits() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, vec![100; 7]);\n        assert_eq!(splits.len(), 7);\n        assert!(merge_policy.operations(&mut splits).is_empty());\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_just_enough_splits_for_a_merge() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, vec![100; 10]);\n        let mut merge_ops = merge_policy.operations(&mut splits);\n        assert!(splits.is_empty());\n        assert_eq!(merge_ops.len(), 1);\n        let merge_op = 
merge_ops.pop().unwrap();\n        let mut merge_segment_ids: Vec<String> = merge_op\n            .splits_as_slice()\n            .iter()\n            .map(|split| split.split_id().to_string())\n            .collect();\n        merge_segment_ids.sort();\n        assert_eq!(\n            merge_segment_ids,\n            &[\n                \"split_00\", \"split_01\", \"split_02\", \"split_03\", \"split_04\", \"split_05\", \"split_06\",\n                \"split_07\", \"split_08\", \"split_09\"\n            ]\n        );\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_many_splits_on_same_level() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, vec![100; 13]);\n        let mut merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 1);\n        assert_eq!(splits[0].split_id(), \"split_00\");\n        assert_eq!(merge_ops.len(), 1);\n        let merge_op = merge_ops.pop().unwrap();\n        let mut merge_split_ids: Vec<String> = merge_op\n            .splits_as_slice()\n            .iter()\n            .map(|split| split.split_id().to_string())\n            .collect();\n        merge_split_ids.sort();\n        assert_eq!(\n            merge_split_ids,\n            &[\n                \"split_01\", \"split_02\", \"split_03\", \"split_04\", \"split_05\", \"split_06\", \"split_07\",\n                \"split_08\", \"split_09\", \"split_10\", \"split_11\", \"split_12\"\n            ]\n        );\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_splits_below_min_level() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(\n            &merge_policy,\n            vec![\n                100, 1000, 10_000, 10_000, 10_000, 10_000, 10_000, 40_000, 40_000, 40_000,\n            ],\n        );\n        let mut merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 0);\n        
assert_eq!(merge_ops.len(), 1);\n        let merge_op = merge_ops.pop().unwrap();\n        let mut merge_split_ids: Vec<String> = merge_op\n            .splits_as_slice()\n            .iter()\n            .map(|split| split.split_id().to_string())\n            .collect();\n        merge_split_ids.sort();\n        assert_eq!(\n            merge_split_ids,\n            &[\n                \"split_00\", \"split_01\", \"split_02\", \"split_03\", \"split_04\", \"split_05\", \"split_06\",\n                \"split_07\", \"split_08\", \"split_09\"\n            ]\n        );\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_splits_above_min_level() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(\n            &merge_policy,\n            vec![\n                100_000, 1_000_000, 1_000_000, 1_000_000, 1_000_000, 1_000_000, 1_000_000,\n                1_000_000,\n            ],\n        );\n        let merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 8);\n        assert_eq!(merge_ops.len(), 0);\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_above_split_num_docs_target_is_ignored() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(\n            &merge_policy,\n            vec![\n                100_000, 100_000, 100_000, 100_000, 100_000,\n                10_000_000, // this split should not interfere with the merging of other splits\n                100_000, 100_000, 100_000, 100_000, 100_000,\n            ],\n        );\n        let merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 1);\n        assert_eq!(splits[0].num_docs, 10_000_000);\n        assert_eq!(merge_ops.len(), 1);\n    }\n\n    #[test]\n    fn test_merge_policy_splits_too_large_are_ignored() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, 
vec![9_999_999, 10_000_000]);\n        for split in splits.iter_mut() {\n            let time_to_maturity = merge_policy.split_maturity(split.num_docs, split.num_merge_ops);\n            split.maturity = time_to_maturity;\n        }\n        let merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 2);\n        assert_eq!(splits[0].num_docs, 9_999_999);\n        assert_eq!(splits[1].num_docs, 10_000_000);\n        assert!(merge_ops.is_empty());\n    }\n\n    #[test]\n    fn test_merge_policy_splits_entire_level_reach_merge_max_doc() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, vec![5_000_000, 5_000_000]);\n        let merge_ops = merge_policy.operations(&mut splits);\n        assert!(splits.is_empty());\n        assert_eq!(merge_ops.len(), 1);\n        assert_eq!(merge_ops[0].splits_as_slice().len(), 2);\n    }\n\n    #[test]\n    fn test_merge_policy_last_merge_can_have_a_lower_merge_factor() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, vec![9_999_997, 9_999_998, 9_999_999]);\n        let merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 1);\n        assert_eq!(splits[0].num_docs, 9_999_997);\n        assert_eq!(merge_ops.len(), 1);\n        assert_eq!(merge_ops[0].splits_as_slice().len(), 2);\n    }\n\n    #[test]\n    fn test_merge_policy_no_merge_with_only_one_split() {\n        let merge_policy = StableLogMergePolicy::default();\n        let mut splits = create_splits(&merge_policy, vec![9_999_999]);\n        let merge_ops = merge_policy.operations(&mut splits);\n        assert_eq!(splits.len(), 1);\n        assert_eq!(splits[0].num_docs, 9_999_999);\n        assert!(merge_ops.is_empty());\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_max_num_splits_worst_case() {\n        let merge_policy = StableLogMergePolicy::default();\n        
assert_eq!(merge_policy.max_num_splits_worst_case(99), 9);\n        assert_eq!(merge_policy.max_num_splits_worst_case(1_000_000), 27);\n        assert_eq!(merge_policy.max_num_splits_worst_case(2_000_000), 36);\n        assert_eq!(merge_policy.max_num_splits_worst_case(3_000_000), 36);\n        assert_eq!(merge_policy.max_num_splits_worst_case(4_000_000), 36);\n        assert_eq!(merge_policy.max_num_splits_worst_case(5_000_000), 45);\n        assert_eq!(merge_policy.max_num_splits_worst_case(7_000_000), 45);\n        assert_eq!(merge_policy.max_num_splits_worst_case(10_000_000), 45);\n        assert_eq!(merge_policy.max_num_splits_worst_case(20_000_000), 54);\n        assert_eq!(merge_policy.max_num_splits_worst_case(100_000_000), 63);\n        assert_eq!(merge_policy.max_num_splits_worst_case(1_000_000_000), 153);\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_max_num_splits_ideal_case() {\n        let merge_policy = StableLogMergePolicy::default();\n        assert_eq!(merge_policy.max_num_splits_ideal_case(1_000_000), 18);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(99), 9);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(2_000_000), 20);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(3_000_000), 21);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(4_000_000), 22);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(5_000_000), 23);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(7_000_000), 25);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(10_000_000), 27);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(100_000_000), 37);\n        assert_eq!(merge_policy.max_num_splits_ideal_case(1_000_000_000), 127);\n    }\n\n    #[test]\n    fn test_stable_log_merge_policy_proptest() {\n        let config = StableLogMergePolicyConfig {\n            min_level_num_docs: 100_000,\n            merge_factor: 4,\n            max_merge_factor: 6,\n            maturation_period: 
Duration::from_secs(3600),\n        };\n        let merge_policy = StableLogMergePolicy::new(config, 10_000_000);\n        crate::merge_policy::tests::proptest_merge_policy(&merge_policy);\n    }\n\n    #[tokio::test]\n    #[cfg_attr(not(feature = \"ci-test\"), ignore)]\n    async fn test_simulate_stable_log_merge_policy_constant_case() -> anyhow::Result<()> {\n        let merge_policy = StableLogMergePolicy::default();\n        aux_test_simulate_merge_planner_num_docs(\n            Arc::new(merge_policy.clone()),\n            &vec![10_000; 100_000],\n            &|splits| {\n                let num_docs = splits.iter().map(|split| split.num_docs as u64).sum();\n                assert!(splits.len() <= merge_policy.max_num_splits_ideal_case(num_docs))\n            },\n        )\n        .await?;\n        Ok(())\n    }\n\n    use proptest::prelude::*;\n    use proptest::sample::select;\n    use tokio::runtime::Runtime;\n\n    fn proptest_config() -> ProptestConfig {\n        let mut proptest_config = ProptestConfig::with_cases(20);\n        proptest_config.max_shrink_iters = 600;\n        proptest_config\n    }\n\n    proptest! 
{\n        #![proptest_config(proptest_config())]\n        #[test]\n        fn test_proptest_simulate_stable_log_merge_planner_adversarial(batch_num_docs in proptest::collection::vec(select(&[11, 1_990, 10_000, 50_000, 310_000][..]), 1..1_000)) {\n            let merge_policy = StableLogMergePolicy::default();\n            let rt = Runtime::new().unwrap();\n            rt.block_on(\n            aux_test_simulate_merge_planner_num_docs(\n                Arc::new(merge_policy.clone()),\n                &batch_num_docs,\n                &|splits| {\n                    let num_docs = splits.iter().map(|split| split.num_docs as u64).sum();\n                    assert!(splits.len() <= merge_policy.max_num_splits_worst_case(num_docs));\n                },\n            )).unwrap();\n        }\n    }\n\n    #[tokio::test]\n    async fn test_simulate_stable_log_merge_planner_edge_case() {\n        let merge_policy = StableLogMergePolicy::default();\n        let batch_num_docs = vec![\n            11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,\n        ];\n        aux_test_simulate_merge_planner_num_docs(\n            Arc::new(merge_policy.clone()),\n            &batch_num_docs,\n            &|splits| {\n                let num_docs = splits.iter().map(|split| split.num_docs as u64).sum();\n                assert!(splits.len() <= merge_policy.max_num_splits_worst_case(num_docs));\n            },\n        )\n        .await\n        .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_simulate_stable_log_merge_planner_ideal_case() -> anyhow::Result<()> {\n        let merge_policy = StableLogMergePolicy::default();\n        aux_test_simulate_merge_planner_num_docs(\n            Arc::new(merge_policy.clone()),\n            &vec![10_000; 1_000],\n            &|splits| {\n                let num_docs = splits.iter().map(|split| split.num_docs as u64).sum();\n                assert!(splits.len() <= 
merge_policy.max_num_splits_ideal_case(num_docs));\n            },\n        )\n        .await?;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_simulate_stable_log_merge_planner_bug() -> anyhow::Result<()> {\n        let merge_policy = StableLogMergePolicy::default();\n        let vals = &[11, 11, 11, 11, 11, 11, 310000, 11, 11, 11, 11, 11, 11, 11];\n        aux_test_simulate_merge_planner_num_docs(\n            Arc::new(merge_policy.clone()),\n            &vals[..],\n            &|splits| {\n                let num_docs = splits.iter().map(|split| split.num_docs as u64).sum();\n                assert!(splits.len() <= merge_policy.max_num_splits_worst_case(num_docs));\n            },\n        )\n        .await?;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    IntCounter, IntCounterVec, IntGauge, IntGaugeVec, new_counter, new_counter_vec, new_gauge,\n    new_gauge_vec,\n};\n\npub struct IndexerMetrics {\n    pub processed_docs_total: IntCounterVec<2>,\n    pub processed_bytes: IntCounterVec<2>,\n    pub indexing_pipelines: IntGaugeVec<1>,\n    pub backpressure_micros: IntCounterVec<1>,\n    pub available_concurrent_upload_permits: IntGaugeVec<1>,\n    pub split_builders: IntGauge,\n    pub ongoing_merge_operations: IntGauge,\n    pub pending_merge_operations: IntGauge,\n    pub pending_merge_bytes: IntGauge,\n    // We use a lazy counter, as most users do not use Kafka.\n    #[cfg_attr(not(feature = \"kafka\"), allow(dead_code))]\n    pub kafka_rebalance_total: Lazy<IntCounter>,\n}\n\nimpl Default for IndexerMetrics {\n    fn default() -> Self {\n        IndexerMetrics {\n            processed_docs_total: new_counter_vec(\n                \"processed_docs_total\",\n                \"Number of processed docs by index, source and processed status in [valid, \\\n                 schema_error, parse_error, transform_error]\",\n                \"indexing\",\n                &[],\n                [\"index\", \"docs_processed_status\"],\n            ),\n            processed_bytes: new_counter_vec(\n                \"processed_bytes\",\n       
         \"Number of bytes of processed documents by index, source and processed status in \\\n                 [valid, schema_error, parse_error, transform_error]\",\n                \"indexing\",\n                &[],\n                [\"index\", \"docs_processed_status\"],\n            ),\n            indexing_pipelines: new_gauge_vec(\n                \"indexing_pipelines\",\n                \"Number of running indexing pipelines\",\n                \"indexing\",\n                &[],\n                [\"index\"],\n            ),\n            backpressure_micros: new_counter_vec(\n                \"backpressure_micros\",\n                \"Amount of time spent in backpressure (in micros). This time only includes the \\\n                 amount of time spent waiting for a place in the queue of another actor.\",\n                \"indexing\",\n                &[],\n                [\"actor_name\"],\n            ),\n            available_concurrent_upload_permits: new_gauge_vec(\n                \"concurrent_upload_available_permits_num\",\n                \"Number of available concurrent upload permits by component in [merger, indexer]\",\n                \"indexing\",\n                &[],\n                [\"component\"],\n            ),\n            split_builders: new_gauge(\n                \"split_builders\",\n                \"Number of existing index writer instances.\",\n                \"indexing\",\n                &[],\n            ),\n            ongoing_merge_operations: new_gauge(\n                \"ongoing_merge_operations\",\n                \"Number of ongoing merge operations\",\n                \"indexing\",\n                &[],\n            ),\n            pending_merge_operations: new_gauge(\n                \"pending_merge_operations\",\n                \"Number of pending merge operations\",\n                \"indexing\",\n                &[],\n            ),\n            pending_merge_bytes: new_gauge(\n                
\"pending_merge_bytes\",\n                \"Number of pending merge bytes\",\n                \"indexing\",\n                &[],\n            ),\n            kafka_rebalance_total: Lazy::new(|| {\n                new_counter(\n                    \"kafka_rebalance_total\",\n                    \"Number of kafka rebalances\",\n                    \"indexing\",\n                    &[],\n                )\n            }),\n        }\n    }\n}\n\n/// `INDEXER_METRICS` exposes indexing related metrics through a prometheus\n/// endpoint.\npub static INDEXER_METRICS: Lazy<IndexerMetrics> = Lazy::new(IndexerMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/indexed_split.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::path::Path;\n\nuse quickwit_common::io::IoControls;\nuse quickwit_common::metrics::GaugeGuard;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_metastore::checkpoint::IndexCheckpointDelta;\nuse quickwit_proto::indexing::IndexingPipelineId;\nuse quickwit_proto::types::{DocMappingUid, IndexUid, PublishToken};\nuse tantivy::IndexBuilder;\nuse tantivy::directory::MmapDirectory;\nuse tracing::{Span, instrument};\n\nuse crate::controlled_directory::ControlledDirectory;\nuse crate::merge_policy::MergeTask;\nuse crate::models::{PublishLock, SplitAttrs};\nuse crate::new_split_id;\n\npub struct IndexedSplitBuilder {\n    pub split_attrs: SplitAttrs,\n    pub index_writer: tantivy::SingleSegmentIndexWriter,\n    pub split_scratch_directory: TempDirectory,\n    pub controlled_directory_opt: Option<ControlledDirectory>,\n}\n\npub struct IndexedSplit {\n    pub split_attrs: SplitAttrs,\n    pub index: tantivy::Index,\n    pub split_scratch_directory: TempDirectory,\n    pub controlled_directory_opt: Option<ControlledDirectory>,\n}\n\nimpl IndexedSplit {\n    pub fn split_id(&self) -> &str {\n        &self.split_attrs.split_id\n    }\n}\n\nimpl fmt::Debug for IndexedSplit {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"IndexedSplit\")\n            
.field(\"split_id\", &self.split_attrs.split_id)\n            .field(\"dir\", &self.split_scratch_directory.path())\n            .field(\"num_docs\", &self.split_attrs.num_docs)\n            .finish()\n    }\n}\n\nimpl fmt::Debug for IndexedSplitBuilder {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"IndexedSplitBuilder\")\n            .field(\"split_id\", &self.split_attrs.split_id)\n            .field(\"dir\", &self.split_scratch_directory.path())\n            .field(\"num_docs\", &self.split_attrs.num_docs)\n            .finish()\n    }\n}\n\nimpl IndexedSplitBuilder {\n    pub fn new_in_dir(\n        pipeline_id: IndexingPipelineId,\n        partition_id: u64,\n        last_delete_opstamp: u64,\n        doc_mapping_uid: DocMappingUid,\n        scratch_directory: TempDirectory,\n        index_builder: IndexBuilder,\n        io_controls: IoControls,\n    ) -> anyhow::Result<Self> {\n        // We avoid intermediary merge, and instead merge all segments in the packager.\n        // The benefit is that we don't have to wait for potentially existing merges,\n        // and avoid possible race conditions.\n        let split_id = new_split_id();\n        let split_scratch_directory_prefix = format!(\"split-{split_id}-\");\n        let split_scratch_directory =\n            scratch_directory.named_temp_child(&split_scratch_directory_prefix)?;\n        let mmap_directory = MmapDirectory::open(split_scratch_directory.path())?;\n        let box_mmap_directory = Box::new(mmap_directory);\n\n        let controlled_directory = ControlledDirectory::new(box_mmap_directory, io_controls);\n\n        let index_writer =\n            index_builder.single_segment_index_writer(controlled_directory.clone(), 15_000_000)?;\n        Ok(Self {\n            split_attrs: SplitAttrs {\n                node_id: pipeline_id.node_id,\n                index_uid: pipeline_id.index_uid,\n                source_id: 
pipeline_id.source_id,\n                doc_mapping_uid,\n                partition_id,\n                split_id,\n                num_docs: 0,\n                replaced_split_ids: Vec::new(),\n                uncompressed_docs_size_in_bytes: 0,\n                time_range: None,\n                delete_opstamp: last_delete_opstamp,\n                num_merge_ops: 0,\n            },\n            index_writer,\n            split_scratch_directory,\n            controlled_directory_opt: Some(controlled_directory),\n        })\n    }\n\n    #[instrument(name=\"serialize_split\",\n        skip_all,\n        fields(\n            node_id=%self.split_attrs.node_id,\n            index_uid=%self.split_attrs.index_uid,\n            source_id=%self.split_attrs.source_id,\n            split_id=%self.split_attrs.split_id,\n            partition_id=%self.split_attrs.partition_id,\n            num_docs=%self.split_attrs.num_docs,\n            uncompressed_docs_size_in_bytes=%self.split_attrs.uncompressed_docs_size_in_bytes,\n            delete_opstamp=%self.split_attrs.delete_opstamp,\n            num_merge_ops=%self.split_attrs.num_merge_ops,\n        )\n    )]\n    pub fn finalize(self) -> anyhow::Result<IndexedSplit> {\n        let index = self.index_writer.finalize()?;\n        Ok(IndexedSplit {\n            split_attrs: self.split_attrs,\n            index,\n            split_scratch_directory: self.split_scratch_directory,\n            controlled_directory_opt: self.controlled_directory_opt,\n        })\n    }\n\n    pub fn path(&self) -> &Path {\n        self.split_scratch_directory.path()\n    }\n\n    pub fn split_id(&self) -> &str {\n        &self.split_attrs.split_id\n    }\n}\n\n#[derive(Debug)]\npub struct IndexedSplitBatch {\n    pub splits: Vec<IndexedSplit>,\n    pub checkpoint_delta_opt: Option<IndexCheckpointDelta>,\n    pub publish_lock: PublishLock,\n    pub publish_token_opt: Option<PublishToken>,\n    /// A [`MergeTask`] tracked by either the `MergePlanner` 
or the `DeleteTaskPlanner`\n    /// in the `MergePipeline` or `DeleteTaskPipeline`.\n    /// See planners docs to understand the usage.\n    /// If `None`, the split batch was built in the `IndexingPipeline`.\n    pub merge_task_opt: Option<MergeTask>,\n    pub batch_parent_span: Span,\n}\n\n#[derive(Clone, Copy, Debug, Eq, PartialEq)]\npub enum CommitTrigger {\n    Drained,\n    ForceCommit,\n    MemoryLimit,\n    NoMoreDocs,\n    NumDocsLimit,\n    Timeout,\n}\n\n#[derive(Debug)]\npub struct IndexedSplitBatchBuilder {\n    pub splits: Vec<IndexedSplitBuilder>,\n    pub checkpoint_delta_opt: Option<IndexCheckpointDelta>,\n    pub publish_lock: PublishLock,\n    pub publish_token_opt: Option<PublishToken>,\n    pub commit_trigger: CommitTrigger,\n    pub batch_parent_span: Span,\n    pub memory_usage: GaugeGuard<'static>,\n    pub _split_builders_guard: GaugeGuard<'static>,\n}\n\n/// Sends notifications to the Publisher that the last batch of splits was empty.\n#[derive(Debug)]\npub struct EmptySplit {\n    pub index_uid: IndexUid,\n    pub checkpoint_delta: IndexCheckpointDelta,\n    pub publish_lock: PublishLock,\n    pub publish_token_opt: Option<PublishToken>,\n    pub batch_parent_span: Span,\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/indexing_service_message.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::SourceConfig;\nuse quickwit_proto::indexing::{IndexingPipelineId, MergePipelineId};\nuse quickwit_proto::types::{IndexId, PipelineUid};\n\n#[derive(Clone, Debug)]\npub struct SpawnPipeline {\n    pub index_id: IndexId,\n    pub source_config: SourceConfig,\n    pub pipeline_uid: PipelineUid,\n}\n\n/// Detaches a pipeline from the indexing service. The pipeline is no longer managed by the\n/// server. This is mostly useful for ad-hoc indexing pipelines launched with `quickwit index\n/// ingest ..` and testing.\n#[derive(Debug)]\npub struct DetachIndexingPipeline {\n    pub pipeline_id: IndexingPipelineId,\n}\n\n/// Detaches a merge pipeline from the indexing service. The pipeline is no longer managed by the\n/// server. This is mostly useful for preventing the server from killing an existing merge\n/// pipeline if an indexing pipeline is detached.\n#[derive(Debug)]\npub struct DetachMergePipeline {\n    pub pipeline_id: MergePipelineId,\n}\n\n#[derive(Debug)]\npub struct ObservePipeline {\n    pub pipeline_id: IndexingPipelineId,\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/indexing_statistics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::sync::atomic::Ordering;\n\nuse quickwit_proto::indexing::PipelineMetrics;\nuse quickwit_proto::types::ShardId;\nuse serde::Serialize;\n\nuse crate::actors::{DocProcessorCounters, IndexerCounters, PublisherCounters, UploaderCounters};\n\n/// A struct that holds all statistical data about indexing.\n#[derive(Clone, Debug, Default, Serialize, utoipa::ToSchema)]\npub struct IndexingStatistics {\n    /// Number of documents processed (valid or not)\n    pub num_docs: u64,\n    /// Number of documents with parse errors or missing timestamps\n    pub num_invalid_docs: u64,\n    /// Number of created splits\n    pub num_local_splits: u64,\n    /// Number of staged splits\n    pub num_staged_splits: u64,\n    /// Number of uploaded splits\n    pub num_uploaded_splits: u64,\n    /// Number of published splits\n    pub num_published_splits: u64,\n    /// Number of empty batches\n    pub num_empty_splits: u64,\n    /// Size in bytes of documents processed\n    pub total_bytes_processed: u64,\n    /// Size in bytes of resulting splits\n    pub total_size_splits: u64,\n    /// Pipeline generation.\n    pub generation: usize,\n    /// Number of successive pipeline spawn attempts.\n    pub num_spawn_attempts: usize,\n    // Pipeline metrics.\n    pub pipeline_metrics_opt: Option<PipelineMetrics>,\n    // List of shard ids.\n    
#[schema(value_type = Vec<u64>)]\n    pub shard_ids: BTreeSet<ShardId>,\n    pub params_fingerprint: u64,\n}\n\nimpl IndexingStatistics {\n    pub fn add_actor_counters(\n        mut self,\n        doc_processor_counters: &DocProcessorCounters,\n        indexer_counters: &IndexerCounters,\n        uploader_counters: &UploaderCounters,\n        publisher_counters: &PublisherCounters,\n    ) -> Self {\n        self.num_docs += doc_processor_counters.num_processed_docs();\n        self.num_invalid_docs += doc_processor_counters.num_invalid_docs();\n        self.num_local_splits += indexer_counters.num_splits_emitted;\n        self.total_bytes_processed += doc_processor_counters\n            .num_bytes_total\n            .load(Ordering::Relaxed);\n        self.num_staged_splits += uploader_counters.num_staged_splits.load(Ordering::Relaxed);\n        self.num_uploaded_splits += uploader_counters\n            .num_uploaded_splits\n            .load(Ordering::Relaxed);\n        self.num_published_splits += publisher_counters.num_published_splits;\n        self.num_empty_splits += publisher_counters.num_empty_splits;\n        self\n    }\n\n    pub fn set_num_spawn_attempts(mut self, num_spawn_attempts: usize) -> Self {\n        self.num_spawn_attempts = num_spawn_attempts;\n        self\n    }\n\n    pub fn set_generation(mut self, generation: usize) -> Self {\n        self.generation = generation;\n        self\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/merge_planner_message.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_metastore::SplitMetadata;\n\n#[derive(Clone, Debug)]\npub struct NewSplits {\n    pub new_splits: Vec<SplitMetadata>,\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/merge_scratch.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::temp_dir::TempDirectory;\nuse tantivy::Directory;\n\nuse crate::merge_policy::MergeTask;\n\n#[derive(Debug)]\npub struct MergeScratch {\n    /// A [`MergeTask`] tracked by either the `MergePlanner` or the `DeleteTaskPlanner`\n    /// See planners docs to understand the usage.\n    pub merge_task: MergeTask,\n    /// Scratch directory for computing the merge.\n    pub merge_scratch_directory: TempDirectory,\n    pub downloaded_splits_directory: TempDirectory,\n    pub tantivy_dirs: Vec<Box<dyn Directory>>,\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/merge_statistics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::atomic::Ordering;\n\nuse serde::Serialize;\n\nuse crate::actors::{PublisherCounters, UploaderCounters};\n\n/// A Struct to hold all merge statistical data.\n#[derive(Clone, Debug, Default, Serialize)]\npub struct MergeStatistics {\n    /// Number of uploaded splits\n    pub num_uploaded_splits: u64,\n    /// Number of published splits\n    pub num_published_splits: u64,\n    /// Pipeline generation.\n    pub generation: usize,\n    /// Number of successive pipeline spawn attempts.\n    pub num_spawn_attempts: usize,\n    /// Number of merges currently in progress.\n    pub num_ongoing_merges: usize,\n}\n\nimpl MergeStatistics {\n    pub fn add_actor_counters(\n        mut self,\n        uploader_counters: &UploaderCounters,\n        publisher_counters: &PublisherCounters,\n    ) -> Self {\n        self.num_uploaded_splits += uploader_counters.num_uploaded_splits.load(Ordering::SeqCst);\n        self.num_published_splits += publisher_counters.num_published_splits;\n        self\n    }\n\n    pub fn set_num_spawn_attempts(mut self, num_spawn_attempts: usize) -> Self {\n        self.num_spawn_attempts = num_spawn_attempts;\n        self\n    }\n\n    pub fn set_generation(mut self, generation: usize) -> Self {\n        self.generation = generation;\n        self\n    }\n\n    pub fn set_ongoing_merges(mut self, n: usize) -> Self {\n 
       self.num_ongoing_merges = n;\n        self\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![allow(rustdoc::invalid_html_tags)]\n\nmod indexed_split;\nmod indexing_service_message;\nmod indexing_statistics;\nmod merge_planner_message;\nmod merge_scratch;\nmod merge_statistics;\nmod packaged_split;\nmod processed_doc;\nmod publish_lock;\nmod publisher_message;\nmod raw_doc_batch;\nmod shard_positions;\nmod split_attrs;\n\npub use indexed_split::{\n    CommitTrigger, EmptySplit, IndexedSplit, IndexedSplitBatch, IndexedSplitBatchBuilder,\n    IndexedSplitBuilder,\n};\npub use indexing_service_message::{\n    DetachIndexingPipeline, DetachMergePipeline, ObservePipeline, SpawnPipeline,\n};\npub use indexing_statistics::IndexingStatistics;\npub use merge_planner_message::NewSplits;\npub use merge_scratch::MergeScratch;\npub use merge_statistics::MergeStatistics;\npub use packaged_split::{PackagedSplit, PackagedSplitBatch};\npub use processed_doc::{ProcessedDoc, ProcessedDocBatch};\npub use publish_lock::{NewPublishLock, PublishLock};\npub use publisher_message::SplitsUpdate;\nuse quickwit_proto::types::PublishToken;\npub use raw_doc_batch::RawDocBatch;\npub(crate) use shard_positions::LocalShardPositionsUpdate;\npub use shard_positions::ShardPositionsService;\npub use split_attrs::{SplitAttrs, create_split_metadata};\n\n#[derive(Debug)]\npub struct NewPublishToken(pub PublishToken);\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/packaged_split.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::fmt;\nuse std::path::PathBuf;\n\nuse itertools::Itertools;\nuse quickwit_common::temp_dir::TempDirectory;\nuse quickwit_metastore::checkpoint::IndexCheckpointDelta;\nuse quickwit_proto::types::{IndexUid, PublishToken, SplitId};\nuse tracing::Span;\n\nuse crate::merge_policy::MergeTask;\nuse crate::models::{PublishLock, SplitAttrs};\n\npub struct PackagedSplit {\n    pub serialized_split_fields: Vec<u8>,\n    pub split_attrs: SplitAttrs,\n    pub split_scratch_directory: TempDirectory,\n    pub tags: BTreeSet<String>,\n    pub split_files: Vec<PathBuf>,\n    pub hotcache_bytes: Vec<u8>,\n}\n\nimpl PackagedSplit {\n    pub fn index_uid(&self) -> &IndexUid {\n        &self.split_attrs.index_uid\n    }\n\n    pub fn split_id(&self) -> &str {\n        &self.split_attrs.split_id\n    }\n}\n\nimpl fmt::Debug for PackagedSplit {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"PackagedSplit\")\n            .field(\"split_attrs\", &self.split_attrs)\n            .field(\"split_scratch_directory\", &self.split_scratch_directory)\n            .field(\"tags\", &self.tags)\n            .field(\"split_files\", &self.split_files)\n            .finish()\n    }\n}\n\n#[derive(Debug)]\npub struct PackagedSplitBatch {\n    pub splits: Vec<PackagedSplit>,\n    pub checkpoint_delta_opt: 
Option<IndexCheckpointDelta>,\n    pub publish_lock: PublishLock,\n    pub publish_token_opt: Option<PublishToken>,\n    /// A [`MergeTask`] tracked by either the `MergePlanner` or the `DeleteTaskPlanner`\n    /// in the `MergePipeline` or `DeleteTaskPipeline`.\n    /// See planners docs to understand the usage.\n    /// If `None`, the split batch was built in the `IndexingPipeline`.\n    pub merge_task_opt: Option<MergeTask>,\n    pub batch_parent_span: Span,\n}\n\nimpl PackagedSplitBatch {\n    /// Instantiate a consistent [`PackagedSplitBatch`] that\n    /// satisfies two constraints:\n    /// - a batch must have at least one split\n    /// - all splits must belong to the same `index_uid`.\n    pub fn new(\n        splits: Vec<PackagedSplit>,\n        checkpoint_delta_opt: Option<IndexCheckpointDelta>,\n        publish_lock: PublishLock,\n        publish_token_opt: Option<PublishToken>,\n        merge_task_opt: Option<MergeTask>,\n        batch_parent_span: Span,\n    ) -> Self {\n        assert!(!splits.is_empty());\n        assert!(\n            splits\n                .iter()\n                .tuple_windows()\n                .all(|(left_split, right_split)| left_split.index_uid() == right_split.index_uid()),\n            \"All splits must belong to the same `index_uid`.\"\n        );\n        Self {\n            splits,\n            checkpoint_delta_opt,\n            publish_lock,\n            publish_token_opt,\n            merge_task_opt,\n            batch_parent_span,\n        }\n    }\n\n    pub fn index_uid(&self) -> IndexUid {\n        self.splits[0].split_attrs.index_uid.clone()\n    }\n\n    pub fn split_ids(&self) -> Vec<SplitId> {\n        self.splits\n            .iter()\n            .map(|split| split.split_attrs.split_id.clone())\n            .collect::<Vec<_>>()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/processed_doc.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_metastore::checkpoint::SourceCheckpointDelta;\nuse tantivy::{DateTime, TantivyDocument};\n\npub struct ProcessedDoc {\n    pub doc: TantivyDocument,\n    pub timestamp_opt: Option<DateTime>,\n    pub partition: u64,\n    pub num_bytes: usize,\n}\n\nimpl fmt::Debug for ProcessedDoc {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"ProcessedDoc\")\n            .field(\"timestamp_opt\", &self.timestamp_opt)\n            .field(\"partition\", &self.partition)\n            .field(\"num_bytes\", &self.num_bytes)\n            .finish()\n    }\n}\n\npub struct ProcessedDocBatch {\n    // Do not directly append documents to this vector; otherwise, in-flight metrics will be\n    // incorrect.\n    pub docs: Vec<ProcessedDoc>,\n    pub checkpoint_delta: SourceCheckpointDelta,\n    pub force_commit: bool,\n    _gauge_guard: GaugeGuard<'static>,\n}\n\nimpl ProcessedDocBatch {\n    pub fn new(\n        docs: Vec<ProcessedDoc>,\n        checkpoint_delta: SourceCheckpointDelta,\n        force_commit: bool,\n    ) -> Self {\n        let delta = docs.iter().map(|doc| doc.num_bytes as i64).sum::<i64>();\n        let mut gauge_guard = GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.indexer_mailbox);\n        
gauge_guard.add(delta);\n        Self {\n            docs,\n            checkpoint_delta,\n            force_commit,\n            _gauge_guard: gauge_guard,\n        }\n    }\n}\n\nimpl fmt::Debug for ProcessedDocBatch {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"ProcessedDocBatch\")\n            .field(\"num_docs\", &self.docs.len())\n            .field(\"checkpoint_delta\", &self.checkpoint_delta)\n            .field(\"force_commit\", &self.force_commit)\n            .finish()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/publish_lock.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Debug;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicBool, Ordering};\n\nuse tokio::sync::{Mutex, MutexGuard};\n\n// Publisher locks have two clients: publishers and sources.\n//\n// Publishers must acquire the lock and ensure that the lock is alive before publishing.\n//\n// When a partition reassignment occurs, sources must (i) acquire, then (ii) kill, and finally (iii)\n// release the lock before propagating a new lock via message passing to the downstream consumers.\n#[derive(Clone, Default)]\npub struct PublishLock {\n    inner: Arc<PublishLockInner>,\n}\n\nimpl PartialEq for PublishLock {\n    fn eq(&self, other: &Self) -> bool {\n        std::ptr::eq(self.inner.as_ref(), other.inner.as_ref())\n    }\n}\n\nimpl Debug for PublishLock {\n    fn fmt(&self, fmt: &mut std::fmt::Formatter) -> std::fmt::Result {\n        fmt.debug_struct(\"PublishLock\")\n            .field(\"is_alive\", &self.is_alive())\n            .finish()\n    }\n}\n\nstruct PublishLockInner {\n    alive: AtomicBool,\n    mutex: Mutex<()>,\n}\n\nimpl Default for PublishLockInner {\n    fn default() -> Self {\n        Self {\n            alive: AtomicBool::new(true),\n            mutex: Mutex::default(),\n        }\n    }\n}\n\nimpl PublishLock {\n    pub fn dead() -> Self {\n        PublishLock {\n            inner: Arc::new(PublishLockInner {\n         
       alive: AtomicBool::new(false),\n                mutex: Mutex::default(),\n            }),\n        }\n    }\n    pub async fn acquire(&self) -> Option<MutexGuard<'_, ()>> {\n        let guard = self.inner.mutex.lock().await;\n        if self.is_dead() {\n            return None;\n        }\n        Some(guard)\n    }\n\n    pub fn is_alive(&self) -> bool {\n        self.inner.alive.load(Ordering::Relaxed)\n    }\n\n    pub fn is_dead(&self) -> bool {\n        !self.is_alive()\n    }\n\n    pub async fn kill(&self) {\n        let _guard = self.inner.mutex.lock().await;\n        self.inner.alive.store(false, Ordering::Relaxed);\n    }\n}\n\n#[derive(Debug, PartialEq)]\npub struct NewPublishLock(pub PublishLock);\n\n#[cfg(test)]\nmod tests {\n\n    use std::time::Duration;\n\n    use tokio::time::timeout;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_publish_lock() {\n        let lock = PublishLock::default();\n        assert!(lock.is_alive());\n\n        let guard = lock.acquire().await.unwrap();\n        assert!(\n            timeout(Duration::from_millis(50), lock.kill())\n                .await\n                .is_err()\n        );\n        drop(guard);\n\n        lock.kill().await;\n        assert!(lock.is_dead());\n        assert!(lock.acquire().await.is_none());\n    }\n\n    #[test]\n    fn test_publish_lock_dead() {\n        let publish_lock = PublishLock::dead();\n        assert!(publish_lock.is_dead());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/publisher_message.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse itertools::Itertools;\nuse quickwit_metastore::SplitMetadata;\nuse quickwit_metastore::checkpoint::IndexCheckpointDelta;\nuse quickwit_proto::types::{IndexUid, PublishToken};\nuse tracing::Span;\n\nuse crate::merge_policy::MergeTask;\nuse crate::models::PublishLock;\n\npub struct SplitsUpdate {\n    pub index_uid: IndexUid,\n    pub new_splits: Vec<SplitMetadata>,\n    pub replaced_split_ids: Vec<String>,\n    pub checkpoint_delta_opt: Option<IndexCheckpointDelta>,\n    pub publish_lock: PublishLock,\n    pub publish_token_opt: Option<PublishToken>,\n    /// A [`MergeTask`] tracked by either the `MergePlanner` or the `DeleteTaskPlanner`\n    /// in the `MergePipeline` or `DeleteTaskPipeline`.\n    /// See planners docs to understand the usage.\n    /// If `None`, the split batch was built in the `IndexingPipeline`.\n    pub merge_task: Option<MergeTask>,\n    pub parent_span: Span,\n}\n\nimpl fmt::Debug for SplitsUpdate {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        let new_split_ids: String = self\n            .new_splits\n            .iter()\n            .map(|split| split.split_id())\n            .join(\",\");\n        f.debug_struct(\"SplitsUpdate\")\n            .field(\"index_id\", &self.index_uid.index_id)\n            .field(\"new_splits\", &new_split_ids)\n            
.field(\"checkpoint_delta\", &self.checkpoint_delta_opt)\n            .finish()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/raw_doc_batch.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse bytes::Bytes;\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_metastore::checkpoint::SourceCheckpointDelta;\n\npub struct RawDocBatch {\n    // Do not directly append documents to this vector; otherwise, in-flight metrics will be\n    // incorrect.\n    pub docs: Vec<Bytes>,\n    pub checkpoint_delta: SourceCheckpointDelta,\n    pub force_commit: bool,\n    _gauge_guard: GaugeGuard<'static>,\n}\n\nimpl RawDocBatch {\n    pub fn new(\n        docs: Vec<Bytes>,\n        checkpoint_delta: SourceCheckpointDelta,\n        force_commit: bool,\n    ) -> Self {\n        let delta = docs.iter().map(|doc| doc.len() as i64).sum::<i64>();\n        let mut gauge_guard =\n            GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.doc_processor_mailbox);\n        gauge_guard.add(delta);\n\n        Self {\n            docs,\n            checkpoint_delta,\n            force_commit,\n            _gauge_guard: gauge_guard,\n        }\n    }\n\n    #[cfg(test)]\n    pub fn for_test(docs: &[&[u8]], range: std::ops::Range<u64>) -> Self {\n        let docs = docs.iter().map(|doc| Bytes::from(doc.to_vec())).collect();\n        let checkpoint_delta = SourceCheckpointDelta::from_range(range);\n        Self::new(docs, checkpoint_delta, false)\n    }\n}\n\nimpl fmt::Debug for RawDocBatch {\n    fn fmt(&self, formatter: 
&mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"RawDocBatch\")\n            .field(\"num_docs\", &self.docs.len())\n            .field(\"checkpoint_delta\", &self.checkpoint_delta)\n            .field(\"force_commit\", &self.force_commit)\n            .finish()\n    }\n}\n\nimpl Default for RawDocBatch {\n    fn default() -> Self {\n        let _gauge_guard = GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.doc_processor_mailbox);\n        Self {\n            docs: Vec::new(),\n            checkpoint_delta: SourceCheckpointDelta::default(),\n            force_commit: false,\n            _gauge_guard,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/shard_positions.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::fmt::Debug;\nuse std::time::{Duration, Instant};\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse fnv::FnvHashMap;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, SpawnContext};\nuse quickwit_cluster::{Cluster, ListenerHandle};\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_common::pubsub::{Event, EventBroker};\nuse quickwit_proto::indexing::ShardPositionsUpdate;\nuse quickwit_proto::types::{Position, ShardId, SourceUid};\nuse tracing::{debug, error, info, warn};\n\n/// Prefix used in chitchat to publish the shard positions.\nconst SHARD_POSITIONS_PREFIX: &str = \"indexer.shard_positions:\";\n\n/// This event means that a pipeline running in the current node (hence \"local\")\n/// performed a publish on an ingest pipeline, and hence the position of a shard has been updated.\n///\n/// This event is meant to be built by the `IngestSource`, upon reception of suggest truncate\n/// event. 
It should only be consumed by the `ShardPositionsService`.\n///\n/// (This is why its members are private.)\n///\n/// The new position is to be exposed to the entire cluster via chitchat.\n///\n/// Consumers of such events should listen to the `ShardPositionsUpdate` event instead.\n/// That event is broadcast via the cluster event broker, and will include both local\n/// changes and changes from other nodes.\n#[derive(Debug, Clone, PartialEq, Eq)]\npub(crate) struct LocalShardPositionsUpdate {\n    source_uid: SourceUid,\n    // This list can be partial: not all shards for the source need to be listed here.\n    shard_positions: Vec<(ShardId, Position)>,\n}\n\nimpl LocalShardPositionsUpdate {\n    pub fn new(source_uid: SourceUid, shard_positions: Vec<(ShardId, Position)>) -> Self {\n        LocalShardPositionsUpdate {\n            source_uid,\n            shard_positions,\n        }\n    }\n}\n\n/// This event is an internal detail of the `ShardPositionsService`.\n///\n/// When a shard position change in the cluster is detected, a `ClusterShardPositionsUpdate`\n/// message is queued into the `ShardPositionsService`.\n#[derive(Debug)]\nstruct ClusterShardPositionsUpdate {\n    pub source_uid: SourceUid,\n    pub shard_id: ShardId,\n    pub position: Position,\n}\n\nimpl Event for LocalShardPositionsUpdate {}\n\n/// The published shard positions is a model unique to the indexer service instance that\n/// keeps track of the latest (known) published position for the shards of all managed sources.\n///\n/// It receives updates through the event broker, and only keeps the maximum published position\n/// for each shard.\npub struct ShardPositionsService {\n    shard_positions_per_source: FnvHashMap<SourceUid, BTreeMap<ShardId, Position>>,\n    cluster: Cluster,\n    event_broker: EventBroker,\n    cluster_listener_handle_opt: Option<ListenerHandle>,\n}\n\nfn parse_shard_positions_from_kv(\n    key: &str,\n    value: &str,\n) -> 
anyhow::Result<ClusterShardPositionsUpdate> {\n    let (source_uid_str, shard_id_str) = key.rsplit_once(':').context(\"invalid key\")?;\n    let shard_id = ShardId::from(shard_id_str);\n    let (index_uid_str, source_id) = source_uid_str.rsplit_once(':').context(\"invalid key\")?;\n    let index_uid = index_uid_str.parse()?;\n    let source_uid = SourceUid {\n        index_uid,\n        source_id: source_id.to_string(),\n    };\n    let position = Position::from(value.to_string());\n    Ok(ClusterShardPositionsUpdate {\n        source_uid,\n        shard_id,\n        position,\n    })\n}\n\nfn push_position_update(\n    shard_positions_service_mailbox: &Mailbox<ShardPositionsService>,\n    key: &str,\n    value: &str,\n) {\n    let shard_positions = match parse_shard_positions_from_kv(key, value) {\n        Ok(shard_positions) => shard_positions,\n        Err(error) => {\n            error!(key=key, value=value, error=%error, \"failed to parse shard positions from cluster kv\");\n            return;\n        }\n    };\n    if shard_positions_service_mailbox\n        .try_send_message(shard_positions)\n        .is_err()\n    {\n        error!(\"failed to send shard positions to the shard positions service\");\n    }\n}\n\n#[async_trait]\nimpl Actor for ShardPositionsService {\n    type ObservableState = ();\n    fn observable_state(&self) {}\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        let mailbox = ctx.mailbox().clone();\n\n        self.cluster_listener_handle_opt = Some(\n            self.cluster\n                .subscribe(SHARD_POSITIONS_PREFIX, move |event| {\n                    push_position_update(&mailbox, event.key, event.value);\n                })\n                .await,\n        );\n\n        // We are now listening to new updates. 
However, the cluster has been started earlier.\n        // It might have already received shard updates from other nodes.\n        //\n        // Let's also sync our `ShardPositionsService` with the current state of the cluster.\n        // Shard position updates are trivially idempotent, so we can replay all the events,\n        // without worrying about duplicates.\n\n        let now = Instant::now();\n        let chitchat = self.cluster.chitchat().await;\n        let chitchat_lock = chitchat.lock().await;\n        let mut num_keys = 0;\n        for node_state in chitchat_lock.node_states().values() {\n            for (key, versioned_value) in node_state.iter_prefix(SHARD_POSITIONS_PREFIX) {\n                let key_stripped = key.strip_prefix(SHARD_POSITIONS_PREFIX).unwrap();\n                push_position_update(ctx.mailbox(), key_stripped, &versioned_value.value);\n                num_keys += 1;\n            }\n            // It is tempting to yield here, but we are holding the chitchat lock.\n            // Let's just log the amount of time it takes for the moment.\n        }\n        let elapsed = now.elapsed();\n        if elapsed > Duration::from_millis(300) {\n            warn!(\n                \"initializing shard positions took longer than expected: {} ({num_keys} keys)\",\n                elapsed.pretty_display(),\n            );\n        } else {\n            info!(\n                \"initialized shard positions in {} ({num_keys} keys)\",\n                elapsed.pretty_display(),\n            );\n        }\n        Ok(())\n    }\n}\n\nimpl ShardPositionsService {\n    pub fn spawn(spawn_ctx: &SpawnContext, event_broker: EventBroker, cluster: Cluster) {\n        let shard_positions_service = ShardPositionsService::new(event_broker.clone(), cluster);\n        let (shard_positions_service_mailbox, _) =\n            spawn_ctx.spawn_builder().spawn(shard_positions_service);\n        // This subscription is in charge of updating the shard positions 
model.\n        event_broker\n            .subscribe_without_timeout::<LocalShardPositionsUpdate>(move |update| {\n                if shard_positions_service_mailbox\n                    .try_send_message(update)\n                    .is_err()\n                {\n                    error!(\"failed to send update to shard positions service\");\n                }\n            })\n            .forever();\n    }\n\n    fn new(event_broker: EventBroker, cluster: Cluster) -> ShardPositionsService {\n        ShardPositionsService {\n            shard_positions_per_source: Default::default(),\n            cluster,\n            event_broker,\n            cluster_listener_handle_opt: None,\n        }\n    }\n}\n\n#[async_trait]\nimpl Handler<ClusterShardPositionsUpdate> for ShardPositionsService {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        update: ClusterShardPositionsUpdate,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        let ClusterShardPositionsUpdate {\n            source_uid,\n            shard_id,\n            position,\n        } = update;\n        let updated_shard_positions = self.apply_update(&source_uid, vec![(shard_id, position)]);\n        debug!(updated_shard_positions=?updated_shard_positions, \"cluster position update\");\n        if !updated_shard_positions.is_empty() {\n            self.publish_shard_updates_to_event_broker(source_uid, updated_shard_positions);\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<LocalShardPositionsUpdate> for ShardPositionsService {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        update: LocalShardPositionsUpdate,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let LocalShardPositionsUpdate {\n            source_uid,\n            shard_positions,\n        } = update;\n        let updated_shard_positions: Vec<(ShardId, Position)> =\n            
self.apply_update(&source_uid, shard_positions);\n        if updated_shard_positions.is_empty() {\n            return Ok(());\n        }\n        self.publish_positions_into_chitchat(&source_uid, &updated_shard_positions)\n            .await;\n        self.publish_shard_updates_to_event_broker(source_uid, updated_shard_positions);\n        Ok(())\n    }\n}\n\nimpl ShardPositionsService {\n    async fn publish_positions_into_chitchat(\n        &self,\n        source_uid: &SourceUid,\n        shard_positions: &[(ShardId, Position)],\n    ) {\n        let SourceUid {\n            index_uid,\n            source_id,\n        } = &source_uid;\n        for (shard_id, position) in shard_positions {\n            let key = format!(\"{SHARD_POSITIONS_PREFIX}{index_uid}:{source_id}:{shard_id}\");\n            self.cluster\n                .set_self_key_value_delete_after_ttl(key, position)\n                .await;\n        }\n    }\n\n    fn publish_shard_updates_to_event_broker(\n        &self,\n        source_uid: SourceUid,\n        shard_positions: Vec<(ShardId, Position)>,\n    ) {\n        debug!(shard_positions=?shard_positions, \"shard positions updates\");\n        self.event_broker.publish(ShardPositionsUpdate {\n            source_uid,\n            updated_shard_positions: shard_positions,\n        });\n    }\n\n    /// Updates the internal model holding the last position per shard, and\n    /// returns the list of shards that were updated.\n    fn apply_update(\n        &mut self,\n        source_uid: &SourceUid,\n        published_positions_per_shard: Vec<(ShardId, Position)>,\n    ) -> Vec<(ShardId, Position)> {\n        if published_positions_per_shard.is_empty() {\n            warn!(\"received an empty publish shard positions update\");\n            return Vec::new();\n        }\n        let current_shard_positions = self\n            .shard_positions_per_source\n            .entry(source_uid.clone())\n            .or_default();\n\n        let 
updated_positions_per_shard = published_positions_per_shard\n            .into_iter()\n            .filter(|(shard, new_position)| {\n                let Some(position) = current_shard_positions.get(shard) else {\n                    return true;\n                };\n                new_position > position\n            })\n            .collect::<Vec<_>>();\n\n        for (shard, position) in updated_positions_per_shard.iter() {\n            current_shard_positions.insert(shard.clone(), position.clone());\n        }\n\n        updated_positions_per_shard\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use quickwit_actors::Universe;\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_common::pubsub::EventBroker;\n    use quickwit_proto::types::IndexUid;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_shard_positions_from_cluster() {\n        quickwit_common::setup_logging_for_tests();\n\n        let transport = ChannelTransport::default();\n\n        let universe1 = Universe::with_accelerated_time();\n        let universe2 = Universe::with_accelerated_time();\n\n        let event_broker1 = EventBroker::default();\n        let event_broker2 = EventBroker::default();\n\n        let (tx1, mut rx1) = tokio::sync::mpsc::unbounded_channel::<ShardPositionsUpdate>();\n        let (tx2, mut rx2) = tokio::sync::mpsc::unbounded_channel::<ShardPositionsUpdate>();\n\n        event_broker1\n            .subscribe(move |update: ShardPositionsUpdate| {\n                tx1.send(update).unwrap();\n            })\n            .forever();\n\n        event_broker2\n            .subscribe(move |update: ShardPositionsUpdate| {\n                tx2.send(update).unwrap();\n            })\n            .forever();\n\n        let index_uid = IndexUid::new_with_random_ulid(\"index-test\");\n        let source_id = \"test-source\".to_string();\n        let source_uid = SourceUid {\n            index_uid,\n            
source_id,\n        };\n\n        let cluster1 =\n            create_cluster_for_test(Vec::new(), &[\"indexer\", \"metastore\"], &transport, true)\n                .await\n                .unwrap();\n        ShardPositionsService::spawn(\n            universe1.spawn_ctx(),\n            event_broker1.clone(),\n            cluster1.clone(),\n        );\n\n        // One of the events is published before cluster formation.\n        event_broker1.publish(LocalShardPositionsUpdate::new(\n            source_uid.clone(),\n            vec![(ShardId::from(20), Position::offset(100u64))],\n        ));\n\n        let cluster2 = create_cluster_for_test(\n            vec![cluster1.gossip_listen_addr.to_string()],\n            &[\"indexer\"],\n            &transport,\n            true,\n        )\n        .await\n        .unwrap();\n\n        cluster1\n            .wait_for_ready_members(|members| members.len() == 2, Duration::from_secs(5))\n            .await\n            .unwrap();\n        cluster2\n            .wait_for_ready_members(|members| members.len() == 2, Duration::from_secs(5))\n            .await\n            .unwrap();\n\n        ShardPositionsService::spawn(\n            universe2.spawn_ctx(),\n            event_broker2.clone(),\n            cluster2.clone(),\n        );\n\n        // ----------------------\n        // One of the nodes publishes a given shard position update.\n        // This is done using a `LocalShardPositionsUpdate`.\n\n        event_broker1.publish(LocalShardPositionsUpdate::new(\n            source_uid.clone(),\n            vec![(ShardId::from(2), Position::offset(10u64))],\n        ));\n        event_broker1.publish(LocalShardPositionsUpdate::new(\n            source_uid.clone(),\n            vec![(ShardId::from(1), Position::offset(10u64))],\n        ));\n        event_broker2.publish(LocalShardPositionsUpdate::new(\n            source_uid.clone(),\n            vec![(ShardId::from(2), Position::offset(10u64))],\n        ));\n        
event_broker2.publish(LocalShardPositionsUpdate::new(\n            source_uid.clone(),\n            vec![(ShardId::from(2), Position::offset(12u64))],\n        ));\n        event_broker2.publish(LocalShardPositionsUpdate::new(\n            source_uid.clone(),\n            vec![\n                (ShardId::from(1), Position::Beginning),\n                (ShardId::from(2), Position::offset(12u64)),\n            ],\n        ));\n\n        let mut updates1: Vec<Vec<(ShardId, Position)>> = Vec::new();\n        for _ in 0..4 {\n            let update = rx1.recv().await.unwrap();\n            assert_eq!(update.source_uid, source_uid);\n            updates1.push(update.updated_shard_positions);\n        }\n\n        // The updates as seen from the first node.\n        assert_eq!(\n            updates1,\n            vec![\n                vec![(ShardId::from(20), Position::offset(100u64))],\n                vec![(ShardId::from(2u64), Position::offset(10u64))],\n                vec![(ShardId::from(1u64), Position::offset(10u64)),],\n                vec![(ShardId::from(2u64), Position::offset(12u64)),],\n            ]\n        );\n\n        // The updates as seen from the second.\n        let mut updates2: Vec<Vec<(ShardId, Position)>> = Vec::new();\n        for _ in 0..5 {\n            let update = rx2.recv().await.unwrap();\n            assert_eq!(update.source_uid, source_uid);\n            updates2.push(update.updated_shard_positions);\n        }\n        assert_eq!(\n            updates2,\n            vec![\n                vec![(ShardId::from(20u64), Position::offset(100u64))],\n                vec![(ShardId::from(2u64), Position::offset(10u64))],\n                vec![(ShardId::from(2u64), Position::offset(12u64))],\n                vec![(ShardId::from(1u64), Position::Beginning)],\n                vec![(ShardId::from(1u64), Position::offset(10u64))]\n            ]\n        );\n\n        universe1.assert_quit().await;\n        universe2.assert_quit().await;\n    }\n\n   
 #[tokio::test]\n    async fn test_shard_positions_local_updates_publish_to_cluster() {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let transport = ChannelTransport::default();\n\n        let cluster: Cluster = create_cluster_for_test(Vec::new(), &[], &transport, true)\n            .await\n            .unwrap();\n        let event_broker = EventBroker::default();\n\n        ShardPositionsService::spawn(universe.spawn_ctx(), event_broker.clone(), cluster.clone());\n\n        let index_uid = IndexUid::new_with_random_ulid(\"index-test\");\n        let source_id = \"test-source\".to_string();\n        let key_prefix = format!(\"{SHARD_POSITIONS_PREFIX}{index_uid}:{source_id}\");\n        let source_uid = SourceUid {\n            index_uid,\n            source_id,\n        };\n\n        let shard_id1 = ShardId::from(1);\n        let shard_id2 = ShardId::from(2);\n        let shard_id3 = ShardId::from(3);\n\n        {\n            event_broker.publish(LocalShardPositionsUpdate::new(\n                source_uid.clone(),\n                vec![(ShardId::from(1), Position::Beginning)],\n            ));\n            tokio::time::sleep(Duration::from_secs(1)).await;\n            let key = format!(\"{key_prefix}:{shard_id1}\");\n            let value = cluster.get_self_key_value(&key).await.unwrap();\n            assert_eq!(&value, \"\");\n        }\n        {\n            event_broker.publish(LocalShardPositionsUpdate::new(\n                source_uid.clone(),\n                vec![\n                    (shard_id1.clone(), Position::offset(1_000u64)),\n                    (shard_id2.clone(), Position::offset(2_000u64)),\n                ],\n            ));\n            tokio::time::sleep(Duration::from_secs(1)).await;\n            let value1 = cluster\n                .get_self_key_value(&format!(\"{key_prefix}:{shard_id1}\"))\n                .await\n                .unwrap();\n            
assert_eq!(&value1, \"00000000000000001000\");\n            let value2 = cluster\n                .get_self_key_value(&format!(\"{key_prefix}:{shard_id2}\"))\n                .await\n                .unwrap();\n            assert_eq!(&value2, \"00000000000000002000\");\n        }\n        {\n            event_broker.publish(LocalShardPositionsUpdate::new(\n                source_uid.clone(),\n                vec![\n                    (shard_id1.clone(), Position::offset(999u64)),\n                    (shard_id3.clone(), Position::offset(3_000u64)),\n                ],\n            ));\n            tokio::time::sleep(Duration::from_secs(1)).await;\n            let value1 = cluster\n                .get_self_key_value(&format!(\"{key_prefix}:{shard_id1}\"))\n                .await\n                .unwrap();\n            // We do not update the position that got lower, nor the position that disappeared\n            assert_eq!(&value1, \"00000000000000001000\");\n            let value2 = cluster\n                .get_self_key_value(&format!(\"{key_prefix}:{shard_id2}\"))\n                .await\n                .unwrap();\n            assert_eq!(&value2, \"00000000000000002000\");\n            let value3 = cluster\n                .get_self_key_value(&format!(\"{key_prefix}:{shard_id3}\"))\n                .await\n                .unwrap();\n            assert_eq!(&value3, \"00000000000000003000\");\n        }\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/models/split_attrs.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::fmt;\nuse std::ops::{Range, RangeInclusive};\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse quickwit_metastore::{SplitMaturity, SplitMetadata};\nuse quickwit_proto::types::{DocMappingUid, IndexUid, NodeId, SourceId, SplitId};\nuse tantivy::DateTime;\nuse time::OffsetDateTime;\n\nuse crate::merge_policy::MergePolicy;\n\npub struct SplitAttrs {\n    /// ID of the node that produced the split.\n    pub node_id: NodeId,\n    // Index UID to which the split belongs.\n    pub index_uid: IndexUid,\n    /// Source ID to which the split belongs.\n    pub source_id: SourceId,\n\n    /// Doc mapping UID used to produce this split.\n    pub doc_mapping_uid: DocMappingUid,\n\n    /// Split ID. Joined with the index URI (<index URI>/<split ID>), this ID\n    /// should be enough to uniquely identify a split.\n    /// In reality, some information may be implicitly configured\n    /// in the storage resolver: for instance, the Amazon S3 region.\n    pub split_id: SplitId,\n\n    /// Partition to which the split belongs.\n    ///\n    /// Partitions are usually meant to isolate documents based on some field like\n    /// `tenant_id`. For this reason, ideally splits with a different `partition_id`\n    /// should not be merged together. 
Merging two splits with different `partition_id`\n    /// does not hurt correctness however.\n    pub partition_id: u64,\n\n    /// Number of valid documents in the split.\n    pub num_docs: u64,\n\n    // Sum of the sizes of the documents that were sent to the indexer.\n    // This includes both valid and invalid documents.\n    pub uncompressed_docs_size_in_bytes: u64,\n\n    pub time_range: Option<RangeInclusive<DateTime>>,\n\n    pub replaced_split_ids: Vec<String>,\n\n    /// Delete opstamp.\n    pub delete_opstamp: u64,\n\n    // Number of merge operations the split has been through so far.\n    pub num_merge_ops: usize,\n}\n\nimpl fmt::Debug for SplitAttrs {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"SplitAttrs\")\n            .field(\"split_id\", &self.split_id)\n            .field(\"partition_id\", &self.partition_id)\n            .field(\"replaced_split_ids\", &self.replaced_split_ids)\n            .field(\"time_range\", &self.time_range)\n            .field(\n                \"uncompressed_docs_size_in_bytes\",\n                &self.uncompressed_docs_size_in_bytes,\n            )\n            .field(\"num_docs\", &self.num_docs)\n            .field(\"num_merge_ops\", &self.num_merge_ops)\n            .finish()\n    }\n}\n\npub fn create_split_metadata(\n    merge_policy: &Arc<dyn MergePolicy>,\n    retention_policy: Option<&quickwit_config::RetentionPolicy>,\n    split_attrs: &SplitAttrs,\n    tags: BTreeSet<String>,\n    footer_offsets: Range<u64>,\n) -> SplitMetadata {\n    let create_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n    let time_range = split_attrs\n        .time_range\n        .as_ref()\n        .map(|range| range.start().into_timestamp_secs()..=range.end().into_timestamp_secs());\n\n    let mut maturity =\n        merge_policy.split_maturity(split_attrs.num_docs as usize, split_attrs.num_merge_ops);\n    if let Some(max_maturity) = 
max_maturity_before_end_of_retention(\n        retention_policy,\n        create_timestamp,\n        time_range.as_ref().map(|time_range| *time_range.end()),\n    ) {\n        maturity = maturity.min(max_maturity);\n    }\n    SplitMetadata {\n        node_id: split_attrs.node_id.to_string(),\n        index_uid: split_attrs.index_uid.clone(),\n        source_id: split_attrs.source_id.clone(),\n        doc_mapping_uid: split_attrs.doc_mapping_uid,\n        split_id: split_attrs.split_id.clone(),\n        partition_id: split_attrs.partition_id,\n        num_docs: split_attrs.num_docs as usize,\n        time_range,\n        uncompressed_docs_size_in_bytes: split_attrs.uncompressed_docs_size_in_bytes,\n        create_timestamp,\n        maturity,\n        tags,\n        footer_offsets,\n        delete_opstamp: split_attrs.delete_opstamp,\n        num_merge_ops: split_attrs.num_merge_ops,\n    }\n}\n\n/// reduce the maturity period of a split based on retention policy, so that it doesn't get merged\n/// after it expires.\nfn max_maturity_before_end_of_retention(\n    retention_policy: Option<&quickwit_config::RetentionPolicy>,\n    create_timestamp: i64,\n    time_range_end: Option<i64>,\n) -> Option<SplitMaturity> {\n    let time_range_end = time_range_end? as u64;\n    let retention_period_s = retention_policy?.retention_period().ok()?.as_secs();\n\n    let maturity = if let Some(maturation_period_s) =\n        (time_range_end + retention_period_s).checked_sub(create_timestamp as u64)\n    {\n        SplitMaturity::Immature {\n            maturation_period: Duration::from_secs(maturation_period_s),\n        }\n    } else {\n        // this split could be deleted as soon as it is created. 
Ideally we would\n        // handle that sooner.\n        SplitMaturity::Mature\n    };\n    Some(maturity)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use quickwit_metastore::SplitMaturity;\n\n    use super::max_maturity_before_end_of_retention;\n\n    #[test]\n    fn test_max_maturity_before_end_of_retention() {\n        let retention_policy = quickwit_config::RetentionPolicy {\n            evaluation_schedule: \"daily\".to_string(),\n            retention_period: \"300 sec\".to_string(),\n        };\n        let create_timestamp = 1000;\n\n        // this should be deleted asap, not subject to merge\n        assert_eq!(\n            max_maturity_before_end_of_retention(\n                Some(&retention_policy),\n                create_timestamp,\n                Some(200),\n            ),\n            Some(SplitMaturity::Mature)\n        );\n\n        // retention ends at 750 + 300 = 1050, which is 50s from now\n        assert_eq!(\n            max_maturity_before_end_of_retention(\n                Some(&retention_policy),\n                create_timestamp,\n                Some(750),\n            ),\n            Some(SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(50)\n            })\n        );\n\n        // no retention policy\n        assert_eq!(\n            max_maturity_before_end_of_retention(None, create_timestamp, Some(850),),\n            None,\n        );\n\n        // no timestamp_range.end but a retention policy, that's odd, don't change anything about\n        // the maturity period\n        assert_eq!(\n            max_maturity_before_end_of_retention(Some(&retention_policy), create_timestamp, None,),\n            None,\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/doc_file_reader.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\nuse std::path::Path;\n\nuse anyhow::Context;\nuse async_compression::tokio::bufread::GzipDecoder;\nuse bytes::Bytes;\nuse quickwit_common::Progress;\nuse quickwit_common::uri::Uri;\nuse quickwit_metastore::checkpoint::PartitionId;\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::Position;\nuse quickwit_storage::StorageResolver;\nuse tokio::io::{AsyncBufReadExt, AsyncRead, AsyncReadExt, BufReader};\n\nuse super::{BATCH_NUM_BYTES_LIMIT, BatchBuilder};\n\npub struct FileRecord {\n    pub next_offset: u64,\n    pub doc: Bytes,\n    pub is_last: bool,\n}\n\n/// A helper wrapper that lets you skip bytes in compressed files where you\n/// cannot seek (e.g. 
gzip files).\nstruct SkipReader {\n    reader: BufReader<Box<dyn AsyncRead + Send + Unpin>>,\n    num_bytes_to_skip: usize,\n}\n\nimpl SkipReader {\n    fn new(reader: Box<dyn AsyncRead + Send + Unpin>, num_bytes_to_skip: usize) -> Self {\n        Self {\n            reader: BufReader::new(reader),\n            num_bytes_to_skip,\n        }\n    }\n\n    async fn skip(&mut self) -> io::Result<()> {\n        // allocate on the heap to avoid stack overflows\n        let mut buf = vec![0u8; 64_000];\n        while self.num_bytes_to_skip > 0 {\n            let num_bytes_to_read = self.num_bytes_to_skip.min(buf.len());\n            let num_bytes_read = self\n                .reader\n                .read_exact(&mut buf[..num_bytes_to_read])\n                .await?;\n            self.num_bytes_to_skip -= num_bytes_read;\n        }\n        Ok(())\n    }\n\n    /// Reads a line and peeks into the reader's buffer. Returns the number of\n    /// bytes read and `true` if the end of the file is reached.\n    async fn read_line_and_peek(&mut self, buf: &mut String) -> io::Result<(usize, bool)> {\n        if self.num_bytes_to_skip > 0 {\n            self.skip().await?;\n        }\n        let line_size = self.reader.read_line(buf).await?;\n        if line_size == 0 {\n            return Ok((0, true));\n        }\n        let next_bytes = self.reader.fill_buf().await?;\n        Ok((line_size, next_bytes.is_empty()))\n    }\n}\n\npub struct DocFileReader {\n    reader: SkipReader,\n    next_offset: u64,\n}\n\nimpl DocFileReader {\n    pub fn empty() -> Self {\n        DocFileReader {\n            reader: SkipReader::new(Box::new(tokio::io::empty()), 0),\n            next_offset: 0,\n        }\n    }\n\n    pub async fn from_uri(\n        storage_resolver: &StorageResolver,\n        uri: &Uri,\n        offset: usize,\n    ) -> anyhow::Result<Self> {\n        let (dir_uri, file_name) = dir_and_filename(uri)?;\n        let storage = storage_resolver.resolve(&dir_uri).await?;\n        let 
file_size = storage.file_num_bytes(file_name).await?.try_into().unwrap();\n        if file_size == 0 {\n            return Ok(DocFileReader::empty());\n        }\n        // If it's a gzip file, we can't seek to a specific offset. `SkipReader`\n        // starts from the beginning of the file, decompresses and skips the\n        // first `offset` bytes.\n        let reader = if uri.extension() == Some(\"gz\") {\n            let stream = storage.get_slice_stream(file_name, 0..file_size).await?;\n            let decompressed_stream = Box::new(GzipDecoder::new(BufReader::new(stream)));\n            DocFileReader {\n                reader: SkipReader::new(decompressed_stream, offset),\n                next_offset: offset as u64,\n            }\n        } else {\n            let stream = storage\n                .get_slice_stream(file_name, offset..file_size)\n                .await?;\n            DocFileReader {\n                reader: SkipReader::new(stream, 0),\n                next_offset: offset as u64,\n            }\n        };\n        Ok(reader)\n    }\n\n    /// Reads the next record from the underlying file. 
Returns `None` when EOF\n    /// is reached.\n    pub async fn next_record(&mut self) -> anyhow::Result<Option<FileRecord>> {\n        let mut buf = String::new();\n        // TODO retry if stream is broken (#5243)\n        let (bytes_read, is_last) = self.reader.read_line_and_peek(&mut buf).await?;\n        if bytes_read == 0 {\n            Ok(None)\n        } else {\n            self.next_offset += bytes_read as u64;\n            Ok(Some(FileRecord {\n                next_offset: self.next_offset,\n                doc: Bytes::from(buf),\n                is_last,\n            }))\n        }\n    }\n}\n\npub struct ObjectUriBatchReader {\n    partition_id: PartitionId,\n    reader: DocFileReader,\n    current_offset: usize,\n    is_eof: bool,\n}\n\nimpl ObjectUriBatchReader {\n    pub async fn try_new(\n        storage_resolver: &StorageResolver,\n        partition_id: PartitionId,\n        uri: &Uri,\n        position: Position,\n    ) -> anyhow::Result<Self> {\n        let current_offset = match position {\n            Position::Beginning => 0,\n            Position::Offset(offset) => offset\n                .as_usize()\n                .context(\"file offset should be stored as usize\")?,\n            Position::Eof(_) => {\n                return Ok(ObjectUriBatchReader {\n                    partition_id,\n                    reader: DocFileReader::empty(),\n                    current_offset: 0,\n                    is_eof: true,\n                });\n            }\n        };\n        let reader = DocFileReader::from_uri(storage_resolver, uri, current_offset).await?;\n        Ok(ObjectUriBatchReader {\n            partition_id,\n            reader,\n            current_offset,\n            is_eof: false,\n        })\n    }\n\n    pub async fn read_batch(\n        &mut self,\n        source_progress: &Progress,\n        source_type: SourceType,\n    ) -> anyhow::Result<BatchBuilder> {\n        let mut batch_builder = BatchBuilder::new(source_type);\n        if 
self.is_eof {\n            return Ok(batch_builder);\n        }\n        let limit_num_bytes = self.current_offset + BATCH_NUM_BYTES_LIMIT as usize;\n        let mut new_offset = self.current_offset;\n        while new_offset < limit_num_bytes {\n            if let Some(record) = source_progress\n                .protect_future(self.reader.next_record())\n                .await?\n            {\n                new_offset = record.next_offset as usize;\n                batch_builder.add_doc(record.doc);\n                if record.is_last {\n                    self.is_eof = true;\n                    break;\n                }\n            } else {\n                self.is_eof = true;\n                break;\n            }\n        }\n        let to_position = if self.is_eof {\n            Position::eof(new_offset)\n        } else {\n            Position::offset(new_offset)\n        };\n        batch_builder.checkpoint_delta.record_partition_delta(\n            self.partition_id.clone(),\n            Position::offset(self.current_offset),\n            to_position,\n        )?;\n        self.current_offset = new_offset;\n        Ok(batch_builder)\n    }\n\n    pub fn is_eof(&self) -> bool {\n        self.is_eof\n    }\n}\n\npub(crate) fn dir_and_filename(filepath: &Uri) -> anyhow::Result<(Uri, &Path)> {\n    let dir_uri: Uri = filepath\n        .parent()\n        .context(\"Parent directory could not be resolved\")?;\n    let file_name = filepath\n        .file_name()\n        .context(\"Path does not appear to be a file\")?;\n    Ok((dir_uri, file_name))\n}\n\n#[cfg(test)]\npub mod file_test_helpers {\n    use std::io::Write;\n\n    use async_compression::tokio::write::GzipEncoder;\n    use tempfile::NamedTempFile;\n\n    pub const DUMMY_DOC: &[u8] = r#\"{\"body\": \"hello happy tax payer!\"}\"#.as_bytes();\n\n    async fn gzip_bytes(bytes: &[u8]) -> Vec<u8> {\n        let mut gzip_documents = Vec::new();\n        let mut encoder = GzipEncoder::new(&mut 
gzip_documents);\n        tokio::io::AsyncWriteExt::write_all(&mut encoder, bytes)\n            .await\n            .unwrap();\n        // flush is not sufficient here: reading the file would raise an unexpected end of\n        // file error.\n        tokio::io::AsyncWriteExt::shutdown(&mut encoder)\n            .await\n            .unwrap();\n        gzip_documents\n    }\n\n    async fn write_to_tmp(data: Vec<u8>, gzip: bool) -> NamedTempFile {\n        let mut temp_file: tempfile::NamedTempFile = if gzip {\n            tempfile::Builder::new().suffix(\".gz\").tempfile().unwrap()\n        } else {\n            tempfile::NamedTempFile::new().unwrap()\n        };\n        if gzip {\n            let gzip_documents = gzip_bytes(&data).await;\n            temp_file.write_all(&gzip_documents).unwrap();\n        } else {\n            temp_file.write_all(&data).unwrap();\n        }\n        temp_file.flush().unwrap();\n        temp_file\n    }\n\n    pub async fn generate_dummy_doc_file(gzip: bool, lines: usize) -> (NamedTempFile, usize) {\n        let mut documents_bytes = Vec::with_capacity(DUMMY_DOC.len() * lines);\n        for _ in 0..lines {\n            documents_bytes.write_all(DUMMY_DOC).unwrap();\n            documents_bytes.write_all(\"\\n\".as_bytes()).unwrap();\n        }\n        let size = documents_bytes.len();\n        let file = write_to_tmp(documents_bytes, gzip).await;\n        (file, size)\n    }\n\n    /// Generates a file with increasing padded numbers. 
Each line is 8 bytes\n    /// including the newline char.\n    ///\n    /// 0000000\\n0000001\\n0000002\\n...\n    pub async fn generate_index_doc_file(gzip: bool, lines: usize) -> NamedTempFile {\n        assert!(lines < 9999999, \"each line is 7 digits + newline\");\n        let mut documents_bytes = Vec::new();\n        for i in 0..lines {\n            documents_bytes\n                .write_all(format!(\"{i:0>7}\\n\").as_bytes())\n                .unwrap();\n        }\n        write_to_tmp(documents_bytes, gzip).await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::io::Cursor;\n    use std::str::FromStr;\n\n    use file_test_helpers::generate_index_doc_file;\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_skip_reader() {\n        {\n            // Skip 0 bytes.\n            let mut reader = SkipReader::new(Box::new(\"hello\".as_bytes()), 0);\n            let mut buf = String::new();\n            let (bytes_read, eof) = reader.read_line_and_peek(&mut buf).await.unwrap();\n            assert_eq!(buf, \"hello\");\n            assert!(eof);\n            assert_eq!(bytes_read, 5)\n        }\n        {\n            // Skip 2 bytes.\n            let mut reader = SkipReader::new(Box::new(\"hello\".as_bytes()), 2);\n            let mut buf = String::new();\n            let (bytes_read, eof) = reader.read_line_and_peek(&mut buf).await.unwrap();\n            assert_eq!(buf, \"llo\");\n            assert!(eof);\n            assert_eq!(bytes_read, 3)\n        }\n        {\n            let input = \"hello\";\n            let cursor = Cursor::new(input);\n            let mut reader = SkipReader::new(Box::new(cursor), 5);\n            let mut buf = String::new();\n            let (bytes_read, eof) = reader.read_line_and_peek(&mut buf).await.unwrap();\n            assert!(eof);\n            assert_eq!(bytes_read, 0)\n        }\n        {\n            let input = \"hello\";\n            let cursor 
= Cursor::new(input);\n            let mut reader = SkipReader::new(Box::new(cursor), 10);\n            let mut buf = String::new();\n            assert!(reader.read_line_and_peek(&mut buf).await.is_err());\n        }\n        {\n            let input = \"hello world\".repeat(10000);\n            let cursor = Cursor::new(input.clone());\n            let mut reader = SkipReader::new(Box::new(cursor), 64000);\n            let mut buf = String::new();\n            reader.read_line_and_peek(&mut buf).await.unwrap();\n            assert_eq!(buf, input[64000..]);\n        }\n        {\n            let input = \"hello world\".repeat(10000);\n            let cursor = Cursor::new(input.clone());\n            let mut reader = SkipReader::new(Box::new(cursor), 64001);\n            let mut buf = String::new();\n            reader.read_line_and_peek(&mut buf).await.unwrap();\n            assert_eq!(buf, input[64001..]);\n        }\n    }\n\n    async fn aux_test_full_read_record(file: impl AsRef<str>, expected_lines: usize) {\n        let storage_resolver = StorageResolver::for_test();\n        let uri = Uri::from_str(file.as_ref()).unwrap();\n        let mut doc_reader = DocFileReader::from_uri(&storage_resolver, &uri, 0)\n            .await\n            .unwrap();\n        let mut parsed_lines = 0;\n        while doc_reader.next_record().await.unwrap().is_some() {\n            parsed_lines += 1;\n        }\n        assert_eq!(parsed_lines, expected_lines);\n    }\n\n    #[tokio::test]\n    async fn test_full_read_record() {\n        aux_test_full_read_record(\"data/test_corpus.json\", 4).await;\n    }\n\n    #[tokio::test]\n    async fn test_full_read_record_gz() {\n        aux_test_full_read_record(\"data/test_corpus.json.gz\", 4).await;\n    }\n\n    #[tokio::test]\n    async fn test_empty_file() {\n        let empty_file = tempfile::NamedTempFile::new().unwrap();\n        let empty_file_uri = empty_file.path().to_str().unwrap();\n        
aux_test_full_read_record(empty_file_uri, 0).await;\n    }\n\n    async fn aux_test_resumed_read_record(\n        file: impl AsRef<str>,\n        expected_lines: usize,\n        stop_at_line: usize,\n    ) {\n        let storage_resolver = StorageResolver::for_test();\n        let uri = Uri::from_str(file.as_ref()).unwrap();\n        // read the first part of the file\n        let mut first_part_reader = DocFileReader::from_uri(&storage_resolver, &uri, 0)\n            .await\n            .unwrap();\n        let mut resume_offset = 0;\n        let mut parsed_lines = 0;\n        for _ in 0..stop_at_line {\n            let rec = first_part_reader\n                .next_record()\n                .await\n                .unwrap()\n                .expect(\"EOF happened before stop_at_line\");\n            resume_offset = rec.next_offset as usize;\n            assert_eq!(Bytes::from(format!(\"{parsed_lines:0>7}\\n\")), rec.doc);\n            parsed_lines += 1;\n        }\n        // read the second part of the file\n        let mut second_part_reader =\n            DocFileReader::from_uri(&storage_resolver, &uri, resume_offset)\n                .await\n                .unwrap();\n        while let Some(rec) = second_part_reader.next_record().await.unwrap() {\n            assert_eq!(Bytes::from(format!(\"{parsed_lines:0>7}\\n\")), rec.doc);\n            parsed_lines += 1;\n        }\n        assert_eq!(parsed_lines, expected_lines);\n    }\n\n    #[tokio::test]\n    async fn test_resumed_read_record() {\n        let dummy_doc_file = generate_index_doc_file(false, 1000).await;\n        let dummy_doc_file_uri = dummy_doc_file.path().to_str().unwrap();\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 1).await;\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 40).await;\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 999).await;\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 1000).await;\n    }\n\n    
#[tokio::test]\n    async fn test_resumed_read_record_gz() {\n        let dummy_doc_file = generate_index_doc_file(true, 1000).await;\n        let dummy_doc_file_uri = dummy_doc_file.path().to_str().unwrap();\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 1).await;\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 40).await;\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 999).await;\n        aux_test_resumed_read_record(dummy_doc_file_uri, 1000, 1000).await;\n    }\n\n    async fn aux_test_full_read_batch(\n        file: impl AsRef<str>,\n        expected_lines: usize,\n        expected_batches: usize,\n        file_size: usize,\n        from: Position,\n    ) {\n        let progress = Progress::default();\n        let storage_resolver = StorageResolver::for_test();\n        let uri = Uri::from_str(file.as_ref()).unwrap();\n        let partition = PartitionId::from(\"test\");\n        let mut batch_reader =\n            ObjectUriBatchReader::try_new(&storage_resolver, partition.clone(), &uri, from)\n                .await\n                .unwrap();\n\n        let mut parsed_lines = 0;\n        let mut parsed_batches = 0;\n        let mut checkpoint_delta = SourceCheckpointDelta::default();\n        while !batch_reader.is_eof() {\n            let batch = batch_reader\n                .read_batch(&progress, SourceType::Unspecified)\n                .await\n                .unwrap();\n            parsed_lines += batch.docs.len();\n            parsed_batches += 1;\n            checkpoint_delta.extend(batch.checkpoint_delta).unwrap();\n        }\n        assert_eq!(parsed_lines, expected_lines);\n        assert_eq!(parsed_batches, expected_batches);\n        let position = checkpoint_delta\n            .get_source_checkpoint()\n            .position_for_partition(&partition)\n            .unwrap()\n            .clone();\n        assert_eq!(position, Position::eof(file_size))\n    }\n\n    #[tokio::test]\n    async fn 
test_read_batch_empty_file() {\n        let empty_file = tempfile::NamedTempFile::new().unwrap();\n        let empty_file_uri = empty_file.path().to_str().unwrap();\n        aux_test_full_read_batch(empty_file_uri, 0, 1, 0, Position::Beginning).await;\n    }\n\n    #[tokio::test]\n    async fn test_full_read_single_batch() {\n        let num_lines = 10;\n        let dummy_doc_file = generate_index_doc_file(false, num_lines).await;\n        let dummy_doc_file_uri = dummy_doc_file.path().to_str().unwrap();\n        aux_test_full_read_batch(\n            dummy_doc_file_uri,\n            num_lines,\n            1,\n            num_lines * 8,\n            Position::Beginning,\n        )\n        .await;\n    }\n\n    #[tokio::test]\n    async fn test_full_read_single_batch_max_size() {\n        let num_lines = BATCH_NUM_BYTES_LIMIT as usize / 8;\n        let dummy_doc_file = generate_index_doc_file(false, num_lines).await;\n        let dummy_doc_file_uri = dummy_doc_file.path().to_str().unwrap();\n        aux_test_full_read_batch(\n            dummy_doc_file_uri,\n            num_lines,\n            1,\n            num_lines * 8,\n            Position::Beginning,\n        )\n        .await;\n    }\n\n    #[tokio::test]\n    async fn test_full_read_two_batches() {\n        let num_lines = BATCH_NUM_BYTES_LIMIT as usize / 8 + 10;\n        let dummy_doc_file = generate_index_doc_file(false, num_lines).await;\n        let dummy_doc_file_uri = dummy_doc_file.path().to_str().unwrap();\n        aux_test_full_read_batch(\n            dummy_doc_file_uri,\n            num_lines,\n            2,\n            num_lines * 8,\n            Position::Beginning,\n        )\n        .await;\n    }\n\n    #[tokio::test]\n    async fn test_resume_read_batches() {\n        let total_num_lines = BATCH_NUM_BYTES_LIMIT as usize / 8 * 3;\n        let resume_after_lines = total_num_lines / 2;\n        let dummy_doc_file = generate_index_doc_file(false, total_num_lines).await;\n        let 
dummy_doc_file_uri = dummy_doc_file.path().to_str().unwrap();\n        aux_test_full_read_batch(\n            dummy_doc_file_uri,\n            total_num_lines - resume_after_lines,\n            2,\n            total_num_lines * 8,\n            Position::offset(resume_after_lines * 8),\n        )\n        .await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/file_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_config::FileSourceParams;\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::SourceId;\n\nuse super::doc_file_reader::ObjectUriBatchReader;\n#[cfg(feature = \"queue-sources\")]\nuse super::queue_sources::coordinator::QueueCoordinator;\nuse crate::actors::DocProcessor;\nuse crate::source::{Source, SourceContext, SourceRuntime, TypedSourceFactory};\n\nenum FileSourceState {\n    #[cfg(feature = \"queue-sources\")]\n    Notification(Box<QueueCoordinator>),\n    Filepath {\n        batch_reader: ObjectUriBatchReader,\n        num_bytes_processed: u64,\n        num_lines_processed: u64,\n    },\n}\n\npub struct FileSource {\n    source_id: SourceId,\n    state: FileSourceState,\n    source_type: SourceType,\n}\n\nimpl fmt::Debug for FileSource {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"FileSource {{ source_id: {} }}\", self.source_id)\n    }\n}\n\n#[async_trait]\nimpl Source for FileSource {\n    #[allow(unused_variables)]\n    async fn initialize(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<(), 
ActorExitStatus> {\n        match &mut self.state {\n            #[cfg(feature = \"queue-sources\")]\n            FileSourceState::Notification(coordinator) => {\n                coordinator.initialize(doc_processor_mailbox, ctx).await\n            }\n            FileSourceState::Filepath { .. } => Ok(()),\n        }\n    }\n\n    #[allow(unused_variables)]\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        match &mut self.state {\n            #[cfg(feature = \"queue-sources\")]\n            FileSourceState::Notification(coordinator) => {\n                coordinator.emit_batches(doc_processor_mailbox, ctx).await?;\n            }\n            FileSourceState::Filepath {\n                batch_reader,\n                num_bytes_processed,\n                num_lines_processed,\n            } => {\n                let batch_builder = batch_reader\n                    .read_batch(ctx.progress(), self.source_type)\n                    .await?;\n                *num_bytes_processed += batch_builder.num_bytes;\n                *num_lines_processed += batch_builder.docs.len() as u64;\n                doc_processor_mailbox\n                    .send_message(batch_builder.build())\n                    .await?;\n                if batch_reader.is_eof() {\n                    ctx.send_exit_with_success(doc_processor_mailbox).await?;\n                    return Err(ActorExitStatus::Success);\n                }\n            }\n        }\n        Ok(Duration::ZERO)\n    }\n\n    fn name(&self) -> String {\n        format!(\"{self:?}\")\n    }\n\n    #[allow(unused_variables)]\n    async fn suggest_truncate(\n        &mut self,\n        checkpoint: SourceCheckpoint,\n        ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        match &mut self.state {\n            #[cfg(feature = \"queue-sources\")]\n            
FileSourceState::Notification(coordinator) => {\n                coordinator.suggest_truncate(checkpoint, ctx).await\n            }\n            FileSourceState::Filepath { .. } => Ok(()),\n        }\n    }\n\n    fn observable_state(&self) -> serde_json::Value {\n        match &self.state {\n            #[cfg(feature = \"queue-sources\")]\n            FileSourceState::Notification(coordinator) => {\n                serde_json::to_value(coordinator.observable_state()).unwrap()\n            }\n            FileSourceState::Filepath {\n                num_bytes_processed,\n                num_lines_processed,\n                ..\n            } => {\n                serde_json::json!({\n                    \"num_bytes_processed\": num_bytes_processed,\n                    \"num_lines_processed\": num_lines_processed,\n                })\n            }\n        }\n    }\n}\n\npub struct FileSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for FileSourceFactory {\n    type Source = FileSource;\n    type Params = FileSourceParams;\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        params: FileSourceParams,\n    ) -> anyhow::Result<FileSource> {\n        let source_id = source_runtime.source_config.source_id.clone();\n        let source_type = source_runtime.source_config.source_type();\n        let state = match params {\n            FileSourceParams::Filepath(file_uri) => {\n                let partition_id = PartitionId::from(file_uri.as_str());\n                let position = source_runtime\n                    .fetch_checkpoint()\n                    .await?\n                    .position_for_partition(&partition_id)\n                    .cloned()\n                    .unwrap_or_default();\n                let batch_reader = ObjectUriBatchReader::try_new(\n                    &source_runtime.storage_resolver,\n                    partition_id,\n                    &file_uri,\n                    position,\n                
)\n                .await?;\n                FileSourceState::Filepath {\n                    batch_reader,\n                    num_bytes_processed: 0,\n                    num_lines_processed: 0,\n                }\n            }\n            #[cfg(feature = \"sqs\")]\n            FileSourceParams::Notifications(quickwit_config::FileSourceNotification::Sqs(\n                sqs_config,\n            )) => {\n                let coordinator =\n                    QueueCoordinator::try_from_sqs_config(sqs_config, source_runtime).await?;\n                FileSourceState::Notification(Box::new(coordinator))\n            }\n            #[cfg(not(feature = \"sqs\"))]\n            FileSourceParams::Notifications(quickwit_config::FileSourceNotification::Sqs(_)) => {\n                anyhow::bail!(\"Quickwit was compiled without the `sqs` feature\")\n            }\n        };\n\n        Ok(FileSource {\n            state,\n            source_id,\n            source_type,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n    use std::str::FromStr;\n\n    use bytes::Bytes;\n    use quickwit_actors::{Command, Universe};\n    use quickwit_common::uri::Uri;\n    use quickwit_config::{SourceConfig, SourceInputFormat, SourceParams};\n    use quickwit_metastore::checkpoint::{PartitionId, SourceCheckpointDelta};\n    use quickwit_proto::types::{IndexUid, Position};\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::doc_file_reader::file_test_helpers::{\n        DUMMY_DOC, generate_dummy_doc_file, generate_index_doc_file,\n    };\n    use crate::source::tests::SourceRuntimeBuilder;\n    use crate::source::{BATCH_NUM_BYTES_LIMIT, SourceActor};\n\n    #[tokio::test]\n    async fn test_file_source() {\n        aux_test_file_source(false).await;\n        aux_test_file_source(true).await;\n    }\n\n    async fn aux_test_file_source(gzip: bool) {\n        let universe = Universe::with_accelerated_time();\n        let 
(doc_processor_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let params = if gzip {\n            FileSourceParams::from_filepath(\"data/test_corpus.json.gz\").unwrap()\n        } else {\n            FileSourceParams::from_filepath(\"data/test_corpus.json\").unwrap()\n        };\n        let source_config = SourceConfig {\n            source_id: \"test-file-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::File(params.clone()),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let file_source = FileSourceFactory::typed_create_source(source_runtime, params)\n            .await\n            .unwrap();\n        let file_source_actor = SourceActor {\n            source: Box::new(file_source),\n            doc_processor_mailbox,\n        };\n        let (_file_source_mailbox, file_source_handle) =\n            universe.spawn_builder().spawn(file_source_actor);\n        let (actor_termination, counters) = file_source_handle.join().await;\n        assert!(actor_termination.is_success());\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"num_bytes_processed\": 1030u64,\n                \"num_lines_processed\": 4u32\n            })\n        );\n        let batch = indexer_inbox.drain_for_test();\n        assert_eq!(batch.len(), 2);\n        batch[0].downcast_ref::<RawDocBatch>().unwrap();\n        assert!(matches!(\n            batch[1].downcast_ref::<Command>().unwrap(),\n            Command::ExitWithSuccess\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_file_source_several_batch() {\n        aux_test_file_source_several_batch(false).await;\n        
aux_test_file_source_several_batch(true).await;\n    }\n\n    async fn aux_test_file_source_several_batch(gzip: bool) {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let lines = BATCH_NUM_BYTES_LIMIT as usize / DUMMY_DOC.len() + 1;\n        let (temp_file, temp_file_size) = generate_dummy_doc_file(gzip, lines).await;\n        let filepath = temp_file.path().to_str().unwrap();\n        let uri = Uri::from_str(filepath).unwrap();\n        let params = FileSourceParams::Filepath(uri.clone());\n        let source_config = SourceConfig {\n            source_id: \"test-file-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::File(params.clone()),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let file_source = FileSourceFactory::typed_create_source(source_runtime, params)\n            .await\n            .unwrap();\n        let file_source_actor = SourceActor {\n            source: Box::new(file_source),\n            doc_processor_mailbox,\n        };\n        let (_file_source_mailbox, file_source_handle) =\n            universe.spawn_builder().spawn(file_source_actor);\n        let (actor_termination, counters) = file_source_handle.join().await;\n        assert!(actor_termination.is_success());\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"num_lines_processed\": lines,\n                \"num_bytes_processed\": temp_file_size,\n            })\n        );\n        let indexer_msgs = doc_processor_inbox.drain_for_test();\n        
assert_eq!(indexer_msgs.len(), 3);\n        let batch1 = indexer_msgs[0].downcast_ref::<RawDocBatch>().unwrap();\n        let batch2 = indexer_msgs[1].downcast_ref::<RawDocBatch>().unwrap();\n        let command = indexer_msgs[2].downcast_ref::<Command>().unwrap();\n        assert_eq!(\n            format!(\"{:?}\", &batch1.checkpoint_delta),\n            format!(\n                \"∆({}:{})\",\n                uri, \"(00000000000000000000..00000000000005242895]\"\n            )\n        );\n        assert_eq!(\n            format!(\"{:?}\", &batch2.checkpoint_delta),\n            format!(\n                \"∆({}:{})\",\n                uri, \"(00000000000005242895..~00000000000005397105]\"\n            )\n        );\n        assert!(matches!(command, &Command::ExitWithSuccess));\n    }\n\n    #[tokio::test]\n    async fn test_file_source_resume_from_checkpoint() {\n        aux_test_file_source_resume_from_checkpoint(false).await;\n        aux_test_file_source_resume_from_checkpoint(true).await;\n    }\n\n    async fn aux_test_file_source_resume_from_checkpoint(gzip: bool) {\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::with_accelerated_time();\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let temp_file = generate_index_doc_file(gzip, 100).await;\n        let temp_file_path = temp_file.path().to_str().unwrap();\n        let uri = Uri::from_str(temp_file_path).unwrap();\n        let params = FileSourceParams::Filepath(uri.clone());\n        let source_config = SourceConfig {\n            source_id: \"test-file-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::File(params.clone()),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let partition_id = PartitionId::from(uri.as_str());\n        let source_checkpoint_delta = 
SourceCheckpointDelta::from_partition_delta(\n            partition_id,\n            Position::Beginning,\n            Position::offset(16usize),\n        )\n        .unwrap();\n\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_mock_metastore(Some(source_checkpoint_delta))\n            .with_queues_dir(temp_file_path)\n            .build();\n\n        let file_source = FileSourceFactory::typed_create_source(source_runtime, params)\n            .await\n            .unwrap();\n        let file_source_actor = SourceActor {\n            source: Box::new(file_source),\n            doc_processor_mailbox,\n        };\n        let (_file_source_mailbox, file_source_handle) =\n            universe.spawn_builder().spawn(file_source_actor);\n        let (actor_termination, counters) = file_source_handle.join().await;\n        assert!(actor_termination.is_success());\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"num_bytes_processed\": (800-16) as u64,\n                \"num_lines_processed\": (100-2) as u64,\n            })\n        );\n        let indexer_messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(\n            indexer_messages[0].docs[0],\n            Bytes::from_static(b\"0000002\\n\")\n        );\n    }\n}\n\n#[cfg(all(test, feature = \"sqs-localstack-tests\"))]\nmod localstack_tests {\n    use std::str::FromStr;\n\n    use quickwit_actors::Universe;\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_common::uri::Uri;\n    use quickwit_config::{\n        FileSourceMessageType, FileSourceNotification, FileSourceSqs, SourceConfig, SourceParams,\n    };\n    use quickwit_metastore::metastore_for_test;\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::SourceActor;\n    use 
crate::source::doc_file_reader::file_test_helpers::generate_dummy_doc_file;\n    use crate::source::queue_sources::sqs_queue::test_helpers::{\n        create_queue, get_localstack_sqs_client, send_message,\n    };\n    use crate::source::test_setup_helper::setup_index;\n    use crate::source::tests::SourceRuntimeBuilder;\n\n    #[tokio::test]\n    async fn test_file_source_sqs_notifications() {\n        // queue setup\n        let sqs_client = get_localstack_sqs_client().await.unwrap();\n        let queue_url = create_queue(&sqs_client, \"file-source-sqs-notifications\").await;\n        let (dummy_doc_file, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri = Uri::from_str(dummy_doc_file.path().to_str().unwrap()).unwrap();\n        send_message(&sqs_client, &queue_url, test_uri.as_str()).await;\n\n        // source setup\n        let source_params =\n            FileSourceParams::Notifications(FileSourceNotification::Sqs(FileSourceSqs {\n                queue_url,\n                message_type: FileSourceMessageType::RawUri,\n                deduplication_window_duration_secs: 100,\n                deduplication_window_max_messages: 100,\n                deduplication_cleanup_interval_secs: 60,\n            }));\n        let source_config = SourceConfig::for_test(\n            \"test-file-source-sqs-notifications\",\n            SourceParams::File(source_params.clone()),\n        );\n        let metastore = metastore_for_test();\n        let index_id = append_random_suffix(\"test-sqs-index\");\n        let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_metastore(metastore)\n            .build();\n        let sqs_source = FileSourceFactory::typed_create_source(source_runtime, source_params)\n            .await\n            .unwrap();\n\n        // actor setup\n        let universe = 
Universe::with_accelerated_time();\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        {\n            let actor = SourceActor {\n                source: Box::new(sqs_source),\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_mailbox, handle) = universe.spawn_builder().spawn(actor);\n\n            // run the source actor for a while\n            tokio::time::timeout(Duration::from_millis(500), handle.join())\n                .await\n                .unwrap_err();\n\n            let next_message = doc_processor_inbox\n                .drain_for_test()\n                .into_iter()\n                .flat_map(|box_any| box_any.downcast::<RawDocBatch>().ok())\n                .map(|box_raw_doc_batch| *box_raw_doc_batch)\n                .next()\n                .unwrap();\n            assert_eq!(next_message.docs.len(), 10);\n        }\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/gcp_pubsub_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::{Duration, Instant};\nuse std::{fmt, mem};\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse google_cloud_auth::credentials::CredentialsFile;\nuse google_cloud_gax::retry::RetrySetting;\nuse google_cloud_pubsub::client::{Client, ClientConfig};\nuse google_cloud_pubsub::subscription::Subscription;\nuse quickwit_actors::{ActorContext, ActorExitStatus, Mailbox};\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::PubSubSourceParams;\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::Position;\nuse serde_json::{Value as JsonValue, json};\nuse tokio::time;\nuse tracing::{debug, info, warn};\n\nuse super::{BATCH_NUM_BYTES_LIMIT, EMIT_BATCHES_TIMEOUT, SourceActor};\nuse crate::actors::DocProcessor;\nuse crate::source::{BatchBuilder, Source, SourceContext, SourceRuntime, TypedSourceFactory};\n\nconst DEFAULT_MAX_MESSAGES_PER_PULL: i32 = 1_000;\n\npub struct GcpPubSubSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for GcpPubSubSourceFactory {\n    type Source = GcpPubSubSource;\n    type Params = PubSubSourceParams;\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        source_params: PubSubSourceParams,\n    ) -> anyhow::Result<Self::Source> {\n        
GcpPubSubSource::try_new(source_runtime, source_params).await\n    }\n}\n\n#[derive(Default)]\npub struct GcpPubSubSourceState {\n    /// Number of bytes processed by the source.\n    num_bytes_processed: u64,\n    /// Number of messages processed by the source.\n    num_messages_processed: u64,\n    /// Current position of the source, i.e. the position of the last message processed.\n    current_position: Position,\n    // Number of invalid messages, i.e., that were empty or could not be parsed.\n    num_invalid_messages: u64,\n    /// Number of time we looped without getting a single message\n    num_consecutive_empty_batches: u64,\n}\n\npub struct GcpPubSubSource {\n    source_runtime: SourceRuntime,\n    subscription_name: String,\n    subscription: Subscription,\n    state: GcpPubSubSourceState,\n    backfill_mode_enabled: bool,\n    partition_id: PartitionId,\n    max_messages_per_pull: i32,\n}\n\nimpl fmt::Debug for GcpPubSubSource {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"GcpPubSubSource\")\n            .field(\"index_id\", &self.source_runtime.index_id())\n            .field(\"source_id\", &self.source_runtime.source_id())\n            .field(\"subscription\", &self.subscription)\n            .finish()\n    }\n}\n\nimpl GcpPubSubSource {\n    pub async fn try_new(\n        source_runtime: SourceRuntime,\n        source_params: PubSubSourceParams,\n    ) -> anyhow::Result<Self> {\n        let subscription_name = source_params.subscription;\n        let backfill_mode_enabled = source_params.enable_backfill_mode;\n        let max_messages_per_pull = source_params\n            .max_messages_per_pull\n            .unwrap_or(DEFAULT_MAX_MESSAGES_PER_PULL);\n\n        let mut client_config: ClientConfig = match source_params.credentials_file {\n            Some(credentials_file) => {\n                let credentials = CredentialsFile::new_from_file(credentials_file.clone())\n              
      .await\n                    .with_context(|| {\n                        format!(\n                            \"failed to load GCP PubSub credentials file from `{credentials_file}`\"\n                        )\n                    })?;\n                ClientConfig::default().with_credentials(credentials).await\n            }\n            _ => ClientConfig::default().with_auth().await,\n        }\n        .context(\"failed to create GCP PubSub client config\")?;\n\n        if source_params.project_id.is_some() {\n            client_config.project_id = source_params.project_id\n        }\n\n        let client = Client::new(client_config)\n            .await\n            .context(\"failed to create GCP PubSub client\")?;\n        let subscription = client.subscription(&subscription_name);\n        // TODO: replace with \"<node_id>/<index_id>/<source_id>/<pipeline_ord>\"\n        let partition_id = append_random_suffix(&format!(\"gpc-pubsub-{subscription_name}\"));\n        let partition_id = PartitionId::from(partition_id);\n\n        info!(\n            index_id=%source_runtime.index_id(),\n            source_id=%source_runtime.source_id(),\n            subscription=%subscription_name,\n            max_messages_per_pull=%max_messages_per_pull,\n            \"starting GCP PubSub source\"\n        );\n        if !subscription.exists(Some(RetrySetting::default())).await? 
{\n            anyhow::bail!(\"GCP PubSub subscription `{subscription_name}` does not exist\");\n        }\n        Ok(Self {\n            source_runtime,\n            subscription_name,\n            subscription,\n            state: GcpPubSubSourceState::default(),\n            backfill_mode_enabled,\n            partition_id,\n            max_messages_per_pull,\n        })\n    }\n\n    fn should_exit(&self) -> bool {\n        self.backfill_mode_enabled && self.state.num_consecutive_empty_batches >= 10\n    }\n}\n\n#[async_trait]\nimpl Source for GcpPubSubSource {\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let now = Instant::now();\n        let mut batch_builder = BatchBuilder::new(SourceType::PubSub);\n        let deadline = time::sleep(*EMIT_BATCHES_TIMEOUT);\n        tokio::pin!(deadline);\n        // TODO: ensure we ACK messages only after they are committed (at-least-once)\n        // TODO: ensure we increase_ack_deadline for the items\n        loop {\n            tokio::select! 
{\n                resp = self.pull_message_batch(&mut batch_builder) => {\n                    if let Err(err) = resp {\n                        warn!(\"failed to pull messages from subscription `{}`: {:?}\", self.subscription_name, err);\n                    }\n                    if batch_builder.num_bytes >= BATCH_NUM_BYTES_LIMIT {\n                        break;\n                    }\n                }\n                _ = &mut deadline => {\n                    break;\n                }\n            }\n            ctx.record_progress();\n        }\n\n        if batch_builder.num_bytes > 0 {\n            self.state.num_consecutive_empty_batches = 0\n        } else {\n            self.state.num_consecutive_empty_batches += 1\n        }\n\n        // TODO: need to wait for all the id to be ack for at_least_once\n        if self.should_exit() {\n            info!(subscription=%self.subscription_name, \"reached end of subscription\");\n            ctx.send_exit_with_success(doc_processor_mailbox).await?;\n            return Err(ActorExitStatus::Success);\n        }\n        if !batch_builder.checkpoint_delta.is_empty() {\n            debug!(\n                num_bytes=%batch_builder.num_bytes,\n                num_docs=%batch_builder.docs.len(),\n                num_millis=%now.elapsed().as_millis(),\n                \"Sending doc batch to indexer.\");\n            let message = batch_builder.build();\n            ctx.send_message(doc_processor_mailbox, message).await?;\n        }\n        Ok(Duration::default())\n    }\n\n    async fn suggest_truncate(\n        &mut self,\n        _checkpoint: SourceCheckpoint,\n        _ctx: &ActorContext<SourceActor>,\n    ) -> anyhow::Result<()> {\n        // TODO: add ack of ids\n        Ok(())\n    }\n\n    fn name(&self) -> String {\n        format!(\"{self:?}\")\n    }\n\n    fn observable_state(&self) -> JsonValue {\n        json!({\n            \"index_id\": self.source_runtime.index_id(),\n            \"source_id\": 
self.source_runtime.source_id(),\n            \"subscription\": self.subscription_name,\n            \"num_bytes_processed\": self.state.num_bytes_processed,\n            \"num_messages_processed\": self.state.num_messages_processed,\n            \"num_invalid_messages\": self.state.num_invalid_messages,\n            \"num_consecutive_empty_batches\": self.state.num_consecutive_empty_batches,\n        })\n    }\n}\n\nimpl GcpPubSubSource {\n    async fn pull_message_batch(&mut self, batch: &mut BatchBuilder) -> anyhow::Result<()> {\n        let messages = self\n            .subscription\n            .pull(self.max_messages_per_pull, None)\n            .await\n            .context(\"failed to pull messages from subscription\")?;\n\n        let Some(last_message) = messages.last() else {\n            return Ok(());\n        };\n        let message_id = last_message.message.message_id.clone();\n        let publish_timestamp_millis = last_message\n            .message\n            .publish_time\n            .as_ref()\n            .map(|timestamp| timestamp.seconds * 1_000 + (timestamp.nanos as i64 / 1_000_000))\n            .unwrap_or(0); // TODO: Replace with now UTC millis.\n\n        for message in messages {\n            message.ack().await?; // TODO: remove ACK here when doing at least once\n            self.state.num_messages_processed += 1;\n            self.state.num_bytes_processed += message.message.data.len() as u64;\n            let doc: Bytes = Bytes::from(message.message.data);\n            if doc.is_empty() {\n                self.state.num_invalid_messages += 1;\n            } else {\n                batch.add_doc(doc);\n            }\n        }\n        let to_position = Position::from(format!(\n            \"{}:{message_id}:{publish_timestamp_millis}\",\n            self.state.num_messages_processed\n        ));\n        let from_position = mem::replace(&mut self.state.current_position, to_position.clone());\n\n        batch\n            
.checkpoint_delta\n            .record_partition_delta(self.partition_id.clone(), from_position, to_position)\n            .context(\"failed to record partition delta\")?;\n        Ok(())\n    }\n}\n\n// TODO: first implementation of the test\n// After we need to ensure at_least_once and concurrent pipeline\n#[cfg(all(test, feature = \"gcp-pubsub-emulator-tests\"))]\nmod gcp_pubsub_emulator_tests {\n    use std::env::var;\n    use std::num::NonZeroUsize;\n\n    use google_cloud_googleapis::pubsub::v1::PubsubMessage;\n    use google_cloud_pubsub::publisher::Publisher;\n    use google_cloud_pubsub::subscription::SubscriptionConfig;\n    use quickwit_actors::Universe;\n    use quickwit_config::{SourceConfig, SourceInputFormat, SourceParams};\n    use quickwit_proto::types::{IndexId, IndexUid};\n    use serde_json::json;\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::quickwit_supported_sources;\n    use crate::source::tests::SourceRuntimeBuilder;\n\n    static GCP_TEST_PROJECT: &str = \"quickwit-emulator\";\n\n    fn get_source_config(subscription: &str) -> SourceConfig {\n        var(\"PUBSUB_EMULATOR_HOST\").expect(\n            \"environment variable `PUBSUB_EMULATOR_HOST` should be set when running GCP PubSub \\\n             source tests\",\n        );\n        let source_id = append_random_suffix(\"test-gcp-pubsub-source--source\");\n        SourceConfig {\n            source_id,\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::PubSub(PubSubSourceParams {\n                project_id: Some(GCP_TEST_PROJECT.to_string()),\n                enable_backfill_mode: true,\n                subscription: subscription.to_string(),\n                credentials_file: None,\n                max_messages_per_pull: None,\n            }),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }\n    }\n\n    async fn 
create_topic_and_subscription(topic: &str, subscription: &str) -> Publisher {\n        let client_config = google_cloud_pubsub::client::ClientConfig {\n            project_id: Some(GCP_TEST_PROJECT.to_string()),\n            ..Default::default()\n        };\n        let client = Client::new(client_config.with_auth().await.unwrap())\n            .await\n            .unwrap();\n        let subscription_config = SubscriptionConfig::default();\n\n        let created_topic = client.create_topic(topic, None, None).await.unwrap();\n        client\n            .create_subscription(subscription, topic, subscription_config, None)\n            .await\n            .unwrap();\n        created_topic.new_publisher(None)\n    }\n\n    #[tokio::test]\n    async fn test_gcp_pubsub_source_invalid_subscription() {\n        let subscription =\n            append_random_suffix(\"test-gcp-pubsub-source--invalid-subscription--subscription\");\n        let source_config = get_source_config(&subscription);\n\n        let index_id = append_random_suffix(\"test-gcp-pubsub-source--invalid-subscription--index\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let SourceParams::PubSub(params) = source_config.clone().source_params else {\n            panic!(\n                \"Expected `SourceParams::GcpPubSub` source params, got {:?}\",\n                source_config.source_params\n            );\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        GcpPubSubSource::try_new(source_runtime, params)\n            .await\n            .unwrap_err();\n    }\n\n    #[tokio::test]\n    async fn test_gcp_pubsub_source() {\n        let universe = Universe::with_accelerated_time();\n\n        let topic = append_random_suffix(\"test-gcp-pubsub-source--topic\");\n        let subscription = append_random_suffix(\"test-gcp-pubsub-source--subscription\");\n        let publisher = create_topic_and_subscription(&topic, 
&subscription).await;\n\n        let source_config = get_source_config(&subscription);\n        let source_id = source_config.source_id.clone();\n\n        let source_loader = quickwit_supported_sources();\n        let index_id: IndexId = append_random_suffix(\"test-gcp-pubsub-source--index\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n\n        let mut pubsub_messages = Vec::with_capacity(6);\n        for i in 0..6 {\n            let pubsub_message = PubsubMessage {\n                data: format!(\"Message {i}\").into(),\n                ..Default::default()\n            };\n            pubsub_messages.push(pubsub_message);\n        }\n        let awaiters = publisher.publish_bulk(pubsub_messages).await;\n        for awaiter in awaiters {\n            awaiter.get().await.unwrap();\n        }\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let source = source_loader.load_source(source_runtime).await.unwrap();\n\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let source_actor = SourceActor {\n            source,\n            doc_processor_mailbox: doc_processor_mailbox.clone(),\n        };\n        let (_source_mailbox, source_handle) = universe.spawn_builder().spawn(source_actor);\n        let (exit_status, exit_state) = source_handle.join().await;\n        assert!(exit_status.is_success());\n\n        let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(messages.len(), 1);\n        let expected_docs = vec![\n            \"Message 0\",\n            \"Message 1\",\n            \"Message 2\",\n            \"Message 3\",\n            \"Message 4\",\n            \"Message 5\",\n        ];\n        assert_eq!(messages[0].docs, expected_docs);\n        let expected_exit_state = json!({\n            \"index_id\": index_id,\n            \"source_id\": source_id,\n            \"subscription\": 
subscription,\n            \"num_bytes_processed\": 54,\n            \"num_messages_processed\": 6,\n            \"num_invalid_messages\": 0,\n            \"num_consecutive_empty_batches\": 10,\n        });\n        assert_eq!(exit_state, expected_exit_state);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/ingest/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::fmt;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse fnv::FnvHashMap;\nuse itertools::Itertools;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::retry::RetryParams;\nuse quickwit_ingest::{\n    FetchStreamError, IngesterPool, MRecord, MultiFetchStream, decoded_mrecords,\n};\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::ingest::IngestV2Error;\nuse quickwit_proto::ingest::ingester::{\n    FetchEof, FetchPayload, IngesterService, TruncateShardsRequest, TruncateShardsSubrequest,\n    fetch_message,\n};\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AcquireShardsResponse, MetastoreService, MetastoreServiceClient,\n    SourceType,\n};\nuse quickwit_proto::types::{\n    NodeId, PipelineUid, Position, PublishToken, ShardId, SourceId, SourceUid,\n};\nuse serde::Serialize;\nuse serde_json::json;\nuse tokio::time;\nuse tracing::{debug, error, info, warn};\nuse ulid::Ulid;\n\nuse super::{\n    BATCH_NUM_BYTES_LIMIT, BatchBuilder, EMIT_BATCHES_TIMEOUT, Source, SourceContext,\n    SourceRuntime, TypedSourceFactory,\n};\nuse crate::actors::DocProcessor;\nuse crate::models::{LocalShardPositionsUpdate, NewPublishLock, NewPublishToken, PublishLock};\n\npub 
struct IngestSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for IngestSourceFactory {\n    type Source = IngestSource;\n    type Params = ();\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        _params: Self::Params,\n    ) -> anyhow::Result<Self::Source> {\n        // Retry parameters for the fetch stream: retry indefinitely until the shard is complete or\n        // unassigned.\n        let retry_params = RetryParams {\n            max_attempts: usize::MAX,\n            base_delay: Duration::from_secs(5),\n            max_delay: Duration::from_secs(10 * 60), // 10 minutes\n        };\n        IngestSource::try_new(source_runtime, retry_params).await\n    }\n}\n\n/// The [`ClientId`] is a unique identifier for a client of the ingest service and allows to\n/// distinguish which indexers are streaming documents from a shard. It is also used to form a\n/// publish token.\n#[derive(Debug, Clone)]\nstruct ClientId {\n    node_id: NodeId,\n    source_uid: SourceUid,\n    pipeline_uid: PipelineUid,\n}\n\nimpl fmt::Display for ClientId {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            formatter,\n            \"indexer/{}/{}/{}/{}\",\n            self.node_id, self.source_uid.index_uid, self.source_uid.source_id, self.pipeline_uid\n        )\n    }\n}\n\nimpl ClientId {\n    fn new(node_id: NodeId, source_uid: SourceUid, pipeline_uid: PipelineUid) -> Self {\n        ClientId {\n            node_id,\n            source_uid,\n            pipeline_uid,\n        }\n    }\n\n    fn new_publish_token(&self) -> String {\n        let ulid = if cfg!(test) { Ulid::nil() } else { Ulid::new() };\n        format!(\"{self}/{ulid}\")\n    }\n}\n\n#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize)]\n#[serde(rename_all = \"snake_case\")]\nenum IndexingStatus {\n    #[default]\n    // Indexing is in progress.\n    Active,\n    // All documents have been indexed AND published.\n    
Complete,\n    Error,\n    // The shard no longer exists.\n    NotFound,\n    // We have received all documents from the stream. Note that they\n    // are not necessarily published yet.\n    ReachedEof,\n}\n\n#[derive(Debug, Eq, PartialEq)]\nstruct AssignedShard {\n    leader_id: NodeId,\n    follower_id_opt: Option<NodeId>,\n    // This is just the shard id converted to a partition id object.\n    partition_id: PartitionId,\n    current_position_inclusive: Position,\n    status: IndexingStatus,\n}\n\n/// Streams documents from a set of shards.\npub struct IngestSource {\n    client_id: ClientId,\n    metastore: MetastoreServiceClient,\n    ingester_pool: IngesterPool,\n    assigned_shards: FnvHashMap<ShardId, AssignedShard>,\n    fetch_stream: MultiFetchStream,\n    publish_lock: PublishLock,\n    publish_token: PublishToken,\n    event_broker: EventBroker,\n}\n\nimpl fmt::Debug for IngestSource {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter.debug_struct(\"IngestSource\").finish()\n    }\n}\n\nimpl IngestSource {\n    pub async fn try_new(\n        source_runtime: SourceRuntime,\n        retry_params: RetryParams,\n    ) -> anyhow::Result<IngestSource> {\n        let self_node_id: NodeId = source_runtime.node_id().into();\n        let client_id = ClientId::new(\n            self_node_id.clone(),\n            SourceUid {\n                index_uid: source_runtime.index_uid().clone(),\n                source_id: source_runtime.source_id().to_string(),\n            },\n            source_runtime.pipeline_uid(),\n        );\n        let metastore = source_runtime.metastore.clone();\n        let ingester_pool = source_runtime.ingester_pool.clone();\n        let assigned_shards = FnvHashMap::default();\n        let fetch_stream = MultiFetchStream::new(\n            self_node_id,\n            client_id.to_string(),\n            ingester_pool.clone(),\n            retry_params,\n        );\n        // We start as dead. 
The first reset with a non-empty list of shards will create an alive\n        // publish lock.\n        let publish_lock = PublishLock::dead();\n        let publish_token = client_id.new_publish_token();\n\n        Ok(IngestSource {\n            client_id,\n            metastore,\n            ingester_pool,\n            assigned_shards,\n            fetch_stream,\n            publish_lock,\n            publish_token,\n            event_broker: source_runtime.event_broker.clone(),\n        })\n    }\n\n    fn process_fetch_payload(\n        &mut self,\n        batch_builder: &mut BatchBuilder,\n        fetch_payload: FetchPayload,\n    ) -> anyhow::Result<()> {\n        let mrecord_batch = match &fetch_payload.mrecord_batch {\n            Some(mrecord_batch) if !mrecord_batch.is_empty() => mrecord_batch,\n            _ => {\n                warn!(\"received empty mrecord batch\");\n                return Ok(());\n            }\n        };\n        let assigned_shard = self\n            .assigned_shards\n            .get_mut(fetch_payload.shard_id())\n            .expect(\"shard should be assigned\");\n\n        assigned_shard.status = IndexingStatus::Active;\n\n        let partition_id = assigned_shard.partition_id.clone();\n        let from_position_exclusive = fetch_payload.from_position_exclusive();\n        let to_position_inclusive = fetch_payload.to_position_inclusive();\n\n        for mrecord in decoded_mrecords(mrecord_batch) {\n            match mrecord {\n                MRecord::Doc(doc) => {\n                    batch_builder.add_doc(doc);\n                }\n                MRecord::Commit => {\n                    batch_builder.force_commit();\n                }\n            }\n        }\n        batch_builder\n            .checkpoint_delta\n            .record_partition_delta(\n                partition_id,\n                from_position_exclusive,\n                to_position_inclusive.clone(),\n            )\n            .context(\"failed to record 
partition delta\")?;\n        assigned_shard.current_position_inclusive = to_position_inclusive;\n        Ok(())\n    }\n\n    fn process_fetch_eof(\n        &mut self,\n        batch_builder: &mut BatchBuilder,\n        fetch_eof: FetchEof,\n    ) -> anyhow::Result<()> {\n        let assigned_shard = self\n            .assigned_shards\n            .get_mut(fetch_eof.shard_id())\n            .expect(\"shard should be assigned\");\n\n        assigned_shard.status = IndexingStatus::ReachedEof;\n\n        let partition_id = assigned_shard.partition_id.clone();\n        let from_position_exclusive = assigned_shard.current_position_inclusive.clone();\n        let to_position_inclusive = fetch_eof.eof_position();\n\n        batch_builder\n            .checkpoint_delta\n            .record_partition_delta(\n                partition_id,\n                from_position_exclusive,\n                to_position_inclusive.clone(),\n            )\n            .context(\"failed to record partition delta\")?;\n        assigned_shard.current_position_inclusive = to_position_inclusive;\n        Ok(())\n    }\n\n    fn process_fetch_stream_error(\n        &mut self,\n        batch_builder: &mut BatchBuilder,\n        fetch_stream_error: FetchStreamError,\n    ) -> anyhow::Result<()> {\n        let Some(assigned_shard) = self.assigned_shards.get_mut(&fetch_stream_error.shard_id)\n        else {\n            return Ok(());\n        };\n        if assigned_shard.status == IndexingStatus::Complete {\n            return Ok(());\n        }\n        if let IngestV2Error::ShardNotFound { .. 
} = fetch_stream_error.ingest_error {\n            batch_builder.checkpoint_delta.record_partition_delta(\n                assigned_shard.partition_id.clone(),\n                assigned_shard.current_position_inclusive.clone(),\n                assigned_shard.current_position_inclusive.as_eof(),\n            )?;\n            assigned_shard.current_position_inclusive.to_eof();\n            assigned_shard.status = IndexingStatus::NotFound;\n        } else if assigned_shard.status != IndexingStatus::ReachedEof {\n            assigned_shard.status = IndexingStatus::Error;\n        }\n        Ok(())\n    }\n\n    async fn truncate(&mut self, truncate_up_to_positions: Vec<(ShardId, Position)>) {\n        if truncate_up_to_positions.is_empty() {\n            return;\n        }\n        let shard_positions_update = LocalShardPositionsUpdate::new(\n            self.client_id.source_uid.clone(),\n            truncate_up_to_positions.clone(),\n        );\n        // Let's record all shards that have reached Eof as complete.\n        for (shard, truncate_up_to_position_inclusive) in &truncate_up_to_positions {\n            if truncate_up_to_position_inclusive.is_eof()\n                && let Some(assigned_shard) = self.assigned_shards.get_mut(shard)\n            {\n                assigned_shard.status = IndexingStatus::Complete;\n            }\n        }\n\n        // We publish the event to the event broker.\n        self.event_broker.publish(shard_positions_update);\n\n        // Finally, we push the information to ingesters in a best effort manner.\n        // If the request fails, we just log an error.\n        let mut per_ingester_truncate_subrequests: FnvHashMap<\n            &NodeId,\n            Vec<TruncateShardsSubrequest>,\n        > = FnvHashMap::default();\n\n        for (shard_id, truncate_up_to_position_inclusive) in truncate_up_to_positions {\n            if truncate_up_to_position_inclusive.is_beginning() {\n                continue;\n            }\n          
  let Some(shard) = self.assigned_shards.get(&shard_id) else {\n                warn!(\"failed to truncate shard `{shard_id}`: shard is no longer assigned\");\n                continue;\n            };\n            let truncate_shards_subrequest = TruncateShardsSubrequest {\n                index_uid: self.client_id.source_uid.index_uid.clone().into(),\n                source_id: self.client_id.source_uid.source_id.clone(),\n                shard_id: Some(shard_id),\n                truncate_up_to_position_inclusive: Some(truncate_up_to_position_inclusive),\n            };\n            if let Some(follower_id) = &shard.follower_id_opt {\n                per_ingester_truncate_subrequests\n                    .entry(follower_id)\n                    .or_default()\n                    .push(truncate_shards_subrequest.clone());\n            }\n            per_ingester_truncate_subrequests\n                .entry(&shard.leader_id)\n                .or_default()\n                .push(truncate_shards_subrequest);\n        }\n        for (ingester_id, truncate_subrequests) in per_ingester_truncate_subrequests {\n            let Some(ingester) = self.ingester_pool.get(ingester_id) else {\n                warn!(\"failed to truncate shard(s): ingester `{ingester_id}` is unavailable\");\n                continue;\n            };\n            let truncate_shards_request = TruncateShardsRequest {\n                ingester_id: ingester_id.clone().into(),\n                subrequests: truncate_subrequests,\n            };\n            let truncate_future = async move {\n                let retry_params = RetryParams {\n                    base_delay: Duration::from_secs(1),\n                    max_delay: Duration::from_secs(10),\n                    max_attempts: 5,\n                };\n                for num_attempts in 1..=retry_params.max_attempts {\n                    let Err(error) = ingester\n                        .client\n                        
.truncate_shards(truncate_shards_request.clone())\n                        .await\n                    else {\n                        return;\n                    };\n                    let delay = retry_params.compute_delay(num_attempts);\n                    time::sleep(delay).await;\n\n                    if num_attempts == retry_params.max_attempts {\n                        warn!(\n                            ingester_id=%truncate_shards_request.ingester_id,\n                            \"failed to truncate shard(s): {error}\"\n                        );\n                    }\n                }\n            };\n            // Truncation is best-effort, so fire and forget.\n            tokio::spawn(truncate_future);\n        }\n    }\n\n    /// If the new assignment removes a shard that we were in the middle of indexing (i.e. it has\n    /// not reached `IndexingStatus::Complete` status yet), we need to reset the pipeline:\n    ///\n    /// Ongoing work and splits traveling through the pipeline will be dropped.\n    ///\n    /// After this method has returned, the following postconditions are guaranteed:\n    /// - an alive publish lock / non-empty publish token\n    /// - all currently assigned shards included in the `new_assigned_shard_ids` set.\n    async fn reset_if_needed(\n        &mut self,\n        new_assigned_shard_ids: &BTreeSet<ShardId>,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        // No need to do anything if the lists of shards before and after are both empty.\n        if new_assigned_shard_ids.is_empty() && self.assigned_shards.is_empty() {\n            return Ok(());\n        }\n        // There are two reasons why we might want to reset the pipeline.\n        // 1) it has never been initialized in the first place. This happens typically on the first\n        // call to `assign_shards` with a non-empty list of shards. 
We check that by looking at\n        // whether the publish lock is dead or not.\n        // 2) we are removing a shard that has not reached the complete status yet.\n        let reset_needed: bool = self.publish_lock.is_dead()\n            || self\n                .assigned_shards\n                .keys()\n                .filter(|&shard_id| !new_assigned_shard_ids.contains(shard_id))\n                .any(|removed_shard_id| {\n                    let Some(assigned_shard) = self.assigned_shards.get(removed_shard_id) else {\n                        return false;\n                    };\n                    assigned_shard.status != IndexingStatus::Complete\n                });\n\n        if !reset_needed {\n            // No need to reset the fetch streams; we can just remove the shards that have been\n            // completely indexed.\n            self.assigned_shards.retain(|shard_id, assignment| {\n                if new_assigned_shard_ids.contains(shard_id) {\n                    true\n                } else {\n                    assert_eq!(assignment.status, IndexingStatus::Complete);\n                    false\n                }\n            });\n            return Ok(());\n        }\n        info!(\n            index_uid=%self.client_id.source_uid.index_uid,\n            pipeline_uid=%self.client_id.pipeline_uid,\n            \"resetting indexing pipeline\"\n        );\n        self.assigned_shards.clear();\n        self.fetch_stream.reset();\n        self.publish_lock.kill().await;\n        self.publish_lock = PublishLock::default();\n        self.publish_token = self.client_id.new_publish_token();\n        ctx.send_message(\n            doc_processor_mailbox,\n            NewPublishLock(self.publish_lock.clone()),\n        )\n        .await?;\n        ctx.send_message(\n            doc_processor_mailbox,\n            NewPublishToken(self.publish_token.clone()),\n        )\n        .await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Source for 
IngestSource {\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let mut batch_builder = BatchBuilder::new(SourceType::IngestV2);\n\n        let now = time::Instant::now();\n        let deadline = now + *EMIT_BATCHES_TIMEOUT;\n        loop {\n            match time::timeout_at(deadline, self.fetch_stream.next()).await {\n                Ok(Ok(fetch_message)) => match fetch_message.message {\n                    Some(fetch_message::Message::Payload(fetch_payload)) => {\n                        self.process_fetch_payload(&mut batch_builder, fetch_payload)?;\n\n                        if batch_builder.num_bytes >= BATCH_NUM_BYTES_LIMIT {\n                            break;\n                        }\n                    }\n                    Some(fetch_message::Message::Eof(fetch_eof)) => {\n                        self.process_fetch_eof(&mut batch_builder, fetch_eof)?;\n                    }\n                    None => {\n                        warn!(\"received empty fetch message\");\n                        continue;\n                    }\n                },\n                Ok(Err(fetch_stream_error)) => {\n                    self.process_fetch_stream_error(&mut batch_builder, fetch_stream_error)?;\n                }\n                Err(_) => {\n                    // The deadline has elapsed.\n                    break;\n                }\n            }\n            ctx.record_progress();\n        }\n        if !batch_builder.checkpoint_delta.is_empty() {\n            debug!(\n                num_docs=%batch_builder.docs.len(),\n                num_bytes=%batch_builder.num_bytes,\n                num_millis=%now.elapsed().as_millis(),\n                \"Sending doc batch to indexer.\"\n            );\n            let message = batch_builder.build();\n            ctx.send_message(doc_processor_mailbox, message).await?;\n  
      }\n        Ok(Duration::default())\n    }\n\n    async fn assign_shards(\n        &mut self,\n        new_assigned_shard_ids: BTreeSet<ShardId>,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        self.reset_if_needed(&new_assigned_shard_ids, doc_processor_mailbox, ctx)\n            .await?;\n\n        // As enforced by `reset_if_needed`, at this point, all currently assigned shards should be\n        // included in `new_assigned_shard_ids`.\n        debug_assert!(\n            self.assigned_shards\n                .keys()\n                .all(|shard_id| new_assigned_shard_ids.contains(shard_id))\n        );\n\n        if self.assigned_shards.len() == new_assigned_shard_ids.len() {\n            // Nothing to do.\n            // The set of shards is unchanged.\n            return Ok(());\n        }\n\n        let added_shard_ids: Vec<ShardId> = new_assigned_shard_ids\n            .into_iter()\n            .filter(|shard_id| !self.assigned_shards.contains_key(shard_id))\n            .collect();\n\n        assert!(!added_shard_ids.is_empty());\n        info!(added_shards=?added_shard_ids, \"adding shards assignment\");\n\n        let acquire_shards_request = AcquireShardsRequest {\n            index_uid: Some(self.client_id.source_uid.index_uid.clone()),\n            source_id: self.client_id.source_uid.source_id.clone(),\n            shard_ids: added_shard_ids.clone(),\n            publish_token: self.publish_token.clone(),\n        };\n        let acquire_shards_response: AcquireShardsResponse = ctx\n            .protect_future(self.metastore.acquire_shards(acquire_shards_request))\n            .await\n            .context(\"failed to acquire shards\")?;\n\n        if acquire_shards_response.acquired_shards.len() != added_shard_ids.len() {\n            let missing_shards = added_shard_ids\n                .iter()\n                .filter(|shard_id| {\n                    !acquire_shards_response\n   
                     .acquired_shards\n                        .iter()\n                        .any(|acquired_shard| acquired_shard.shard_id() == *shard_id)\n                })\n                .collect::<Vec<_>>();\n            // This can happen if the shards have been deleted by the control plane, after building\n            // the plan and before the apply terminated. See #4888.\n            info!(missing_shards=?missing_shards, \"failed to acquire all assigned shards\");\n        }\n\n        let mut truncate_up_to_positions =\n            Vec::with_capacity(acquire_shards_response.acquired_shards.len());\n\n        for acquired_shard in acquire_shards_response.acquired_shards {\n            let index_uid = acquired_shard.index_uid().clone();\n            let shard_id = acquired_shard.shard_id().clone();\n            let mut current_position_inclusive = acquired_shard.publish_position_inclusive();\n            let leader_id: NodeId = acquired_shard.leader_id.into();\n            let follower_id_opt: Option<NodeId> = acquired_shard.follower_id.map(Into::into);\n            let source_id: SourceId = acquired_shard.source_id;\n            let partition_id = PartitionId::from(shard_id.as_str());\n            let from_position_exclusive = current_position_inclusive.clone();\n\n            let status = if from_position_exclusive.is_eof() {\n                IndexingStatus::Complete\n            } else if let Err(error) = ctx\n                .protect_future(self.fetch_stream.subscribe(\n                    leader_id.clone(),\n                    follower_id_opt.clone(),\n                    index_uid,\n                    source_id,\n                    shard_id.clone(),\n                    from_position_exclusive,\n                ))\n                .await\n            {\n                if let IngestV2Error::ShardNotFound { .. 
} = error {\n                    error!(\"failed to subscribe to shard `{shard_id}`: shard not found\");\n                    current_position_inclusive.to_eof();\n                    IndexingStatus::NotFound\n                } else {\n                    error!(%error, \"failed to subscribe to shard `{shard_id}`\");\n                    IndexingStatus::Error\n                }\n            } else {\n                IndexingStatus::Active\n            };\n            truncate_up_to_positions.push((shard_id.clone(), current_position_inclusive.clone()));\n\n            let assigned_shard = AssignedShard {\n                leader_id,\n                follower_id_opt,\n                partition_id,\n                current_position_inclusive,\n                status,\n            };\n            self.assigned_shards.insert(shard_id, assigned_shard);\n        }\n\n        self.truncate(truncate_up_to_positions).await;\n\n        Ok(())\n    }\n\n    async fn suggest_truncate(\n        &mut self,\n        checkpoint: SourceCheckpoint,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        let truncate_up_to_positions: Vec<(ShardId, Position)> = checkpoint\n            .iter()\n            .map(|(partition_id, position)| {\n                let shard_id = ShardId::from(partition_id.as_str());\n                (shard_id, position)\n            })\n            .collect();\n        self.truncate(truncate_up_to_positions).await;\n        Ok(())\n    }\n\n    fn name(&self) -> String {\n        \"IngestSource\".to_string()\n    }\n\n    fn observable_state(&self) -> serde_json::Value {\n        let assigned_shards: Vec<serde_json::Value> = self\n            .assigned_shards\n            .iter()\n            .sorted_by(|(left_shard_id, _), (right_shard_id, _)| left_shard_id.cmp(right_shard_id))\n            .map(|(shard_id, assigned_shard)| {\n                json!({\n                    \"shard_id\": *shard_id,\n                    \"current_position\": 
assigned_shard.current_position_inclusive,\n                    \"status\": assigned_shard.status,\n                })\n            })\n            .collect();\n        json!({\n            \"client_id\": self.client_id.to_string(),\n            \"assigned_shards\": assigned_shards,\n            \"publish_token\": self.publish_token,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::iter::once;\n    use std::path::PathBuf;\n    use std::sync::Arc;\n    use std::sync::atomic::AtomicBool;\n\n    use bytesize::ByteSize;\n    use itertools::Itertools;\n    use quickwit_actors::{ActorContext, Universe};\n    use quickwit_common::ServiceStream;\n    use quickwit_common::metrics::MEMORY_METRICS;\n    use quickwit_common::stream_utils::InFlightValue;\n    use quickwit_config::{IndexingSettings, SourceConfig, SourceParams};\n    use quickwit_ingest::IngesterPoolEntry;\n    use quickwit_proto::indexing::IndexingPipelineId;\n    use quickwit_proto::ingest::ingester::{\n        FetchMessage, IngesterServiceClient, MockIngesterService, TruncateShardsResponse,\n    };\n    use quickwit_proto::ingest::{IngestV2Error, MRecordBatch, Shard, ShardState};\n    use quickwit_proto::metastore::{AcquireShardsResponse, MockMetastoreService};\n    use quickwit_proto::types::{DocMappingUid, IndexUid, PipelineUid};\n    use quickwit_storage::StorageResolver;\n    use tokio::sync::mpsc::error::TryRecvError;\n    use tokio::sync::watch;\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::SourceActor;\n\n    // In this test, we simulate a source to which we sequentially assign the following sets of\n    // shards, starting from []:\n    // [0] (triggers a reset, and the creation of a publish lock)\n    // [0,1]\n    // [1,2] (which triggers a reset)\n    #[tokio::test]\n    async fn test_ingest_source_assign_shards() {\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: 
IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::Ingest);\n        let publish_token = \"indexer/test-node/test-index:0/test-source/\\\n                             00000000000000000000000000/00000000000000000000000000\";\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_acquire_shards()\n            .withf(|request| request.shard_ids == [ShardId::from(0)])\n            .once()\n            .returning(|request| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                let response = AcquireShardsResponse {\n                    acquired_shards: vec![Shard {\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        leader_id: \"test-ingester-0\".to_string(),\n                        follower_id: None,\n                        shard_id: Some(ShardId::from(0)),\n                        shard_state: ShardState::Open as i32,\n                        doc_mapping_uid: Some(DocMappingUid::default()),\n                        publish_position_inclusive: Some(Position::offset(10u64)),\n                        publish_token: Some(publish_token.to_string()),\n                        update_timestamp: 1724158996,\n                    }],\n                };\n                Ok(response)\n            });\n        mock_metastore\n            .expect_acquire_shards()\n            .once()\n            .withf(|request| request.shard_ids == [ShardId::from(1)])\n            .returning(|request| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n\n 
               let response = AcquireShardsResponse {\n                    acquired_shards: vec![Shard {\n                        leader_id: \"test-ingester-0\".to_string(),\n                        follower_id: None,\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        doc_mapping_uid: Some(DocMappingUid::default()),\n                        publish_position_inclusive: Some(Position::offset(11u64)),\n                        publish_token: Some(publish_token.to_string()),\n                        update_timestamp: 1724158996,\n                    }],\n                };\n                Ok(response)\n            });\n        mock_metastore\n            .expect_acquire_shards()\n            .withf(|request| request.shard_ids == [ShardId::from(1), ShardId::from(2)])\n            .once()\n            .returning(|request| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n\n                let response = AcquireShardsResponse {\n                    acquired_shards: vec![\n                        Shard {\n                            leader_id: \"test-ingester-0\".to_string(),\n                            follower_id: None,\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: Some(Position::offset(11u64)),\n                            publish_token: Some(publish_token.to_string()),\n       
                     update_timestamp: 1724158996,\n                        },\n                        Shard {\n                            leader_id: \"test-ingester-0\".to_string(),\n                            follower_id: None,\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n                            shard_id: Some(ShardId::from(2)),\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: Some(Position::offset(12u64)),\n                            publish_token: Some(publish_token.to_string()),\n                            update_timestamp: 1724158996,\n                        },\n                    ],\n                };\n                Ok(response)\n            });\n        let ingester_pool = IngesterPool::default();\n\n        // This sequence is used to remove the race condition by waiting for the fetch stream\n        // request.\n        let (sequence_tx, mut sequence_rx) = tokio::sync::mpsc::unbounded_channel::<usize>();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        let sequence_tx_clone1 = sequence_tx.clone();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .withf(|request| {\n                request.from_position_exclusive() == Position::offset(10u64)\n                    && request.shard_id() == ShardId::from(0)\n            })\n            .once()\n            .returning(move |request| {\n                sequence_tx_clone1.send(1).unwrap();\n                assert_eq!(\n                    request.client_id,\n                    \"indexer/test-node/test-index:00000000000000000000000000/test-source/\\\n                     00000000000000000000000000\"\n                );\n                assert_eq!(request.index_uid(), &(\"test-index\", 
0));\n                assert_eq!(request.source_id, \"test-source\");\n\n                let (_service_stream_tx, service_stream) = ServiceStream::new_bounded(1);\n                Ok(service_stream)\n            });\n        let sequence_tx_clone2 = sequence_tx.clone();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .withf(|request| {\n                request.from_position_exclusive() == Position::offset(11u64)\n                    && request.shard_id() == ShardId::from(1)\n            })\n            .times(2)\n            .returning(move |request| {\n                sequence_tx_clone2.send(2).unwrap();\n                assert_eq!(\n                    request.client_id,\n                    \"indexer/test-node/test-index:00000000000000000000000000/test-source/\\\n                     00000000000000000000000000\"\n                );\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n\n                let (_service_stream_tx, service_stream) = ServiceStream::new_bounded(1);\n                Ok(service_stream)\n            });\n        let sequence_tx_clone3 = sequence_tx.clone();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .withf(|request| {\n                request.from_position_exclusive() == Position::offset(12u64)\n                    && request.shard_id() == ShardId::from(2)\n            })\n            .once()\n            .returning(move |request| {\n                sequence_tx_clone3.send(3).unwrap();\n                assert_eq!(\n                    request.client_id,\n                    \"indexer/test-node/test-index:00000000000000000000000000/test-source/\\\n                     00000000000000000000000000\"\n                );\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n\n                let 
(_service_stream_tx, service_stream) = ServiceStream::new_bounded(1);\n                Ok(service_stream)\n            });\n        mock_ingester_0\n            .expect_truncate_shards()\n            .withf(|truncate_req| truncate_req.subrequests[0].shard_id() == ShardId::from(0))\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-0\");\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(\n                    subrequest.truncate_up_to_position_inclusive(),\n                    Position::offset(10u64)\n                );\n\n                let response = TruncateShardsResponse {};\n                Ok(response)\n            });\n\n        mock_ingester_0\n            .expect_truncate_shards()\n            .withf(|truncate_req| truncate_req.subrequests[0].shard_id() == ShardId::from(1))\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-0\");\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(\n                    subrequest.truncate_up_to_position_inclusive(),\n                    Position::offset(11u64)\n                );\n\n                Ok(TruncateShardsResponse {})\n            });\n        mock_ingester_0\n            .expect_truncate_shards()\n            .withf(|truncate_req| {\n                truncate_req.subrequests.len() == 2\n                    && truncate_req.subrequests[0].shard_id() == ShardId::from(1)\n                    && 
truncate_req.subrequests[1].shard_id() == ShardId::from(2)\n            })\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-0\");\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(\n                    subrequest.truncate_up_to_position_inclusive(),\n                    Position::offset(11u64)\n                );\n\n                let subrequest = &request.subrequests[1];\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(\n                    subrequest.truncate_up_to_position_inclusive(),\n                    Position::offset(12u64)\n                );\n\n                let response = TruncateShardsResponse {};\n                Ok(response)\n            });\n\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0.clone());\n\n        let event_broker = EventBroker::default();\n\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool: ingester_pool.clone(),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker,\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::no_retries();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n       
 let (source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        let (doc_processor_mailbox, doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        // We assign [0] (previously []).\n        // The initial publish lock is dead, so this triggers a reset and creates a new\n        // publish lock.\n        let shard_ids: BTreeSet<ShardId> = once(0).map(ShardId::from).collect();\n        let publish_lock = source.publish_lock.clone();\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n        assert_eq!(sequence_rx.recv().await.unwrap(), 1);\n        assert!(!publish_lock.is_alive());\n\n        assert!(source.publish_lock.is_alive());\n        assert!(!source.publish_token.is_empty());\n\n        // We assign [0,1] (previously [0]). This should just add the shard 1.\n        // The stream does not need to be reset.\n        let shard_ids: BTreeSet<ShardId> = (0..2).map(ShardId::from).collect();\n        let publish_lock = source.publish_lock.clone();\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n        assert_eq!(sequence_rx.recv().await.unwrap(), 2);\n        assert!(publish_lock.is_alive());\n        assert_eq!(publish_lock, source.publish_lock);\n\n        // We assign [1,2]. 
(previously [0,1]) This should reset the stream\n        // because the shard 0 has to be removed.\n        // The publish lock should be killed and a new one should be created.\n        let shard_ids: BTreeSet<ShardId> = (1..3).map(ShardId::from).collect();\n        let publish_lock = source.publish_lock.clone();\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        assert_eq!(sequence_rx.recv().await.unwrap(), 2);\n        assert_eq!(sequence_rx.recv().await.unwrap(), 3);\n        assert!(!publish_lock.is_alive());\n        assert!(source.publish_lock.is_alive());\n        assert_ne!(publish_lock, source.publish_lock);\n\n        let NewPublishLock(publish_lock) = doc_processor_inbox\n            .recv_typed_message::<NewPublishLock>()\n            .await\n            .unwrap();\n        assert_ne!(&source.publish_lock, &publish_lock);\n\n        // assert!(publish_token != source.publish_token);\n\n        let NewPublishToken(publish_token) = doc_processor_inbox\n            .recv_typed_message::<NewPublishToken>()\n            .await\n            .unwrap();\n        assert_eq!(source.publish_token, publish_token);\n\n        assert_eq!(source.assigned_shards.len(), 2);\n\n        let assigned_shard = source.assigned_shards.get(&ShardId::from(1)).unwrap();\n        let expected_assigned_shard = AssignedShard {\n            leader_id: \"test-ingester-0\".into(),\n            follower_id_opt: None,\n            partition_id: 1u64.into(),\n            current_position_inclusive: Position::offset(11u64),\n            status: IndexingStatus::Active,\n        };\n        assert_eq!(assigned_shard, &expected_assigned_shard);\n\n        let assigned_shard = source.assigned_shards.get(&ShardId::from(2)).unwrap();\n        let expected_assigned_shard = AssignedShard {\n            leader_id: \"test-ingester-0\".into(),\n            follower_id_opt: None,\n            partition_id: 
2u64.into(),\n            current_position_inclusive: Position::offset(12u64),\n            status: IndexingStatus::Active,\n        };\n        assert_eq!(assigned_shard, &expected_assigned_shard);\n\n        // Wait for the truncate future to complete.\n        time::sleep(Duration::from_millis(1)).await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_source_assign_shards_all_eof() {\n        // In this test, we check that if all assigned shards are originally marked as EOF in the\n        // metastore, we observe the following:\n        // - emission of a suggest truncate\n        // - no stream request is emitted\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::Ingest);\n        let publish_token = \"indexer/test-node/test-index:0/test-source/\\\n                             00000000000000000000000000/00000000000000000000000000\";\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_acquire_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_ids, [ShardId::from(1), ShardId::from(2)]);\n\n                let response = AcquireShardsResponse {\n                    acquired_shards: vec![\n                        Shard {\n                            leader_id: \"test-ingester-0\".to_string(),\n                            follower_id: None,\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n               
             shard_id: Some(ShardId::from(1)),\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: Some(Position::eof(11u64)),\n                            publish_token: Some(publish_token.to_string()),\n                            update_timestamp: 1724158996,\n                        },\n                        Shard {\n                            leader_id: \"test-ingester-0\".to_string(),\n                            follower_id: None,\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n                            shard_id: Some(ShardId::from(2)),\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: Some(Position::Beginning.as_eof()),\n                            publish_token: Some(publish_token.to_string()),\n                            update_timestamp: 1724158996,\n                        },\n                    ],\n                };\n                Ok(response)\n            });\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_truncate_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-0\");\n                assert_eq!(request.subrequests.len(), 2);\n\n                let subrequest_0 = &request.subrequests[0];\n                assert_eq!(subrequest_0.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest_0.source_id, \"test-source\");\n                assert_eq!(subrequest_0.shard_id(), ShardId::from(1));\n                assert_eq!(\n             
       subrequest_0.truncate_up_to_position_inclusive(),\n                    Position::eof(11u64)\n                );\n\n                let subrequest_1 = &request.subrequests[1];\n                assert_eq!(subrequest_1.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest_1.source_id, \"test-source\");\n                assert_eq!(subrequest_1.shard_id(), ShardId::from(2));\n                assert_eq!(\n                    subrequest_1.truncate_up_to_position_inclusive(),\n                    Position::Beginning.as_eof()\n                );\n\n                let response = TruncateShardsResponse {};\n                Ok(response)\n            });\n\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0.clone());\n\n        let event_broker = EventBroker::default();\n        let (shard_positions_update_tx, mut shard_positions_update_rx) =\n            tokio::sync::mpsc::unbounded_channel::<LocalShardPositionsUpdate>();\n        event_broker\n            .subscribe::<LocalShardPositionsUpdate>(move |update| {\n                shard_positions_update_tx.send(update).unwrap();\n            })\n            .forever();\n\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool: ingester_pool.clone(),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker,\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::for_test();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let 
(source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        let (doc_processor_mailbox, _doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        // In this scenario, the indexer will be able to acquire shards 1 and 2.\n        let shard_ids: BTreeSet<ShardId> =\n            BTreeSet::from_iter([ShardId::from(1), ShardId::from(2)]);\n\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        let expected_local_update = LocalShardPositionsUpdate::new(\n            SourceUid {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".to_string(),\n            },\n            vec![\n                (ShardId::from(1), Position::eof(11u64)),\n                (ShardId::from(2), Position::Beginning.as_eof()),\n            ],\n        );\n        let local_update = shard_positions_update_rx.recv().await.unwrap();\n        assert_eq!(local_update, expected_local_update);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_source_assign_shards_some_eof() {\n        // In this test, we check that if some shards are originally marked as EOF in the\n        // metastore, we observe the following:\n        // - emission of a suggest truncate\n        // - the stream request emitted does not include the EOF shards\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", 
SourceParams::Ingest);\n        let publish_token = \"indexer/test-node/test-index:0/test-source/\\\n                             00000000000000000000000000/00000000000000000000000000\";\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_acquire_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_ids, [ShardId::from(1), ShardId::from(2)]);\n\n                let response = AcquireShardsResponse {\n                    acquired_shards: vec![\n                        Shard {\n                            leader_id: \"test-ingester-0\".to_string(),\n                            follower_id: None,\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            shard_state: ShardState::Open as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                            publish_position_inclusive: Some(Position::offset(11u64)),\n                            publish_token: Some(publish_token.to_string()),\n                            update_timestamp: 1724158996,\n                        },\n                        Shard {\n                            leader_id: \"test-ingester-0\".to_string(),\n                            follower_id: None,\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n                            shard_id: Some(ShardId::from(2)),\n                            shard_state: ShardState::Closed as i32,\n                            doc_mapping_uid: Some(DocMappingUid::default()),\n                 
           publish_position_inclusive: Some(Position::eof(22u64)),\n                            publish_token: Some(publish_token.to_string()),\n                            update_timestamp: 1724158996,\n                        },\n                    ],\n                };\n                Ok(response)\n            });\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .once()\n            .returning(|request| {\n                assert_eq!(\n                    request.client_id,\n                    \"indexer/test-node/test-index:00000000000000000000000000/test-source/\\\n                     00000000000000000000000000\"\n                );\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(11u64));\n\n                let (_service_stream_tx, service_stream) = ServiceStream::new_bounded(1);\n                Ok(service_stream)\n            });\n        mock_ingester_0\n            .expect_truncate_shards()\n            .once()\n            .returning(|mut request| {\n                assert_eq!(request.ingester_id, \"test-ingester-0\");\n                assert_eq!(request.subrequests.len(), 2);\n                request\n                    .subrequests\n                    .sort_unstable_by(|left, right| left.shard_id.cmp(&right.shard_id));\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(subrequest.shard_id(), ShardId::from(1));\n                assert_eq!(\n                    
subrequest.truncate_up_to_position_inclusive(),\n                    Position::offset(11u64)\n                );\n\n                let subrequest = &request.subrequests[1];\n                assert_eq!(subrequest.index_uid(), &(\"test-index\", 0));\n                assert_eq!(subrequest.source_id, \"test-source\");\n                assert_eq!(subrequest.shard_id(), ShardId::from(2));\n                assert_eq!(\n                    subrequest.truncate_up_to_position_inclusive(),\n                    Position::eof(22u64)\n                );\n\n                let response = TruncateShardsResponse {};\n                Ok(response)\n            });\n\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0.clone());\n\n        let event_broker = EventBroker::default();\n        let (shard_positions_update_tx, mut shard_positions_update_rx) =\n            tokio::sync::mpsc::unbounded_channel::<LocalShardPositionsUpdate>();\n        event_broker\n            .subscribe::<LocalShardPositionsUpdate>(move |update| {\n                shard_positions_update_tx.send(update).unwrap();\n            })\n            .forever();\n\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool: ingester_pool.clone(),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker,\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::for_test();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) 
= universe.create_test_mailbox::<SourceActor>();\n        let (doc_processor_mailbox, _doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        // In this scenario, the indexer will only be able to acquire shard 1.\n        let shard_ids: BTreeSet<ShardId> = (1..3).map(ShardId::from).collect();\n        assert_eq!(\n            shard_positions_update_rx.try_recv().unwrap_err(),\n            TryRecvError::Empty\n        );\n\n        // In this scenario, the indexer will only be able to acquire shard 1.\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        let local_shard_positions_update = shard_positions_update_rx.recv().await.unwrap();\n        let expected_local_shard_positions_update = LocalShardPositionsUpdate::new(\n            SourceUid {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".to_string(),\n            },\n            vec![\n                (ShardId::from(1), Position::offset(11u64)),\n                (ShardId::from(2), Position::eof(22u64)),\n            ],\n        );\n        assert_eq!(\n            local_shard_positions_update,\n            expected_local_shard_positions_update,\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingest_source_emit_batches() {\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::Ingest);\n        let 
mock_metastore = MockMetastoreService::new();\n        let ingester_pool = IngesterPool::default();\n        let event_broker = EventBroker::default();\n\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool: ingester_pool.clone(),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker,\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::for_test();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        let (doc_processor_mailbox, doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        // In this scenario, the ingester receives fetch responses from shard 1 and 2.\n        source.assigned_shards.insert(\n            ShardId::from(1),\n            AssignedShard {\n                leader_id: \"test-ingester-0\".into(),\n                follower_id_opt: None,\n                partition_id: 1u64.into(),\n                current_position_inclusive: Position::offset(11u64),\n                status: IndexingStatus::Active,\n            },\n        );\n        source.assigned_shards.insert(\n            ShardId::from(2),\n            AssignedShard {\n                leader_id: \"test-ingester-1\".into(),\n                follower_id_opt: None,\n                partition_id: 2u64.into(),\n              
  current_position_inclusive: Position::offset(22u64),\n                status: IndexingStatus::Active,\n            },\n        );\n        let fetch_message_tx = source.fetch_stream.fetch_message_tx();\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n            source_id: \"test-source\".into(),\n            shard_id: Some(ShardId::from(1)),\n            mrecord_batch: MRecordBatch::for_test([\n                \"\\0\\0test-doc-foo\",\n                \"\\0\\0test-doc-bar\",\n                \"\\0\\x01\",\n            ]),\n            from_position_exclusive: Some(Position::offset(11u64)),\n            to_position_inclusive: Some(Position::offset(14u64)),\n        };\n        let batch_size = fetch_payload.estimate_size();\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        let in_flight_value = InFlightValue::new(\n            fetch_message,\n            batch_size,\n            &MEMORY_METRICS.in_flight.fetch_stream,\n        );\n        fetch_message_tx.send(Ok(in_flight_value)).await.unwrap();\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n            source_id: \"test-source\".into(),\n            shard_id: Some(ShardId::from(2)),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-qux\"]),\n            from_position_exclusive: Some(Position::offset(22u64)),\n            to_position_inclusive: Some(Position::offset(23u64)),\n        };\n        let batch_size = fetch_payload.estimate_size();\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        let in_flight_value = InFlightValue::new(\n            fetch_message,\n            batch_size,\n            &MEMORY_METRICS.in_flight.fetch_stream,\n        );\n        fetch_message_tx.send(Ok(in_flight_value)).await.unwrap();\n\n        let fetch_eof = FetchEof {\n            index_uid: 
Some(IndexUid::for_test(\"test-index\", 0)),\n            source_id: \"test-source\".into(),\n            shard_id: Some(ShardId::from(2)),\n            eof_position: Some(Position::eof(23u64)),\n        };\n        let fetch_message = FetchMessage::new_eof(fetch_eof);\n        let in_flight_value = InFlightValue::new(\n            fetch_message,\n            ByteSize(0),\n            &MEMORY_METRICS.in_flight.fetch_stream,\n        );\n        fetch_message_tx.send(Ok(in_flight_value)).await.unwrap();\n\n        source\n            .emit_batches(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n        let doc_batch = doc_processor_inbox\n            .recv_typed_message::<RawDocBatch>()\n            .await\n            .unwrap();\n        assert_eq!(doc_batch.docs.len(), 3);\n        assert_eq!(doc_batch.docs[0], \"test-doc-foo\");\n        assert_eq!(doc_batch.docs[1], \"test-doc-bar\");\n        assert_eq!(doc_batch.docs[2], \"test-doc-qux\");\n        assert!(doc_batch.force_commit);\n\n        let partition_deltas = doc_batch\n            .checkpoint_delta\n            .iter()\n            .sorted_by(|left, right| left.0.cmp(&right.0))\n            .collect::<Vec<_>>();\n\n        assert_eq!(partition_deltas.len(), 2);\n        assert_eq!(partition_deltas[0].0, 1u64.into());\n        assert_eq!(partition_deltas[0].1.from, Position::offset(11u64));\n        assert_eq!(partition_deltas[0].1.to, Position::offset(14u64));\n\n        assert_eq!(partition_deltas[1].0, 2u64.into());\n        assert_eq!(partition_deltas[1].1.from, Position::offset(22u64));\n        assert_eq!(partition_deltas[1].1.to, Position::eof(23u64));\n\n        source\n            .emit_batches(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n        let shard = source.assigned_shards.get(&ShardId::from(2)).unwrap();\n        assert_eq!(shard.status, IndexingStatus::ReachedEof);\n\n        fetch_message_tx\n            
.send(Err(FetchStreamError {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".into(),\n                shard_id: ShardId::from(1),\n                ingest_error: IngestV2Error::Internal(\"test-error\".to_string()),\n            }))\n            .await\n            .unwrap();\n\n        source\n            .emit_batches(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n        let shard = source.assigned_shards.get(&ShardId::from(1)).unwrap();\n        assert_eq!(shard.status, IndexingStatus::Error);\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n            source_id: \"test-source\".into(),\n            shard_id: Some(ShardId::from(1)),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-baz\"]),\n            from_position_exclusive: Some(Position::offset(14u64)),\n            to_position_inclusive: Some(Position::offset(15u64)),\n        };\n        let batch_size = fetch_payload.estimate_size();\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        let in_flight_value = InFlightValue::new(\n            fetch_message,\n            batch_size,\n            &MEMORY_METRICS.in_flight.fetch_stream,\n        );\n        fetch_message_tx.send(Ok(in_flight_value)).await.unwrap();\n\n        source\n            .emit_batches(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n        let shard = source.assigned_shards.get(&ShardId::from(1)).unwrap();\n        assert_eq!(shard.status, IndexingStatus::Active);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_source_emit_batches_shard_not_found() {\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: 
PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::Ingest);\n        let publish_token = \"indexer/test-node/test-index:0/test-source/\\\n                             00000000000000000000000000/00000000000000000000000000\";\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_acquire_shards()\n            .once()\n            .returning(|request: AcquireShardsRequest| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_ids, [ShardId::from(1)]);\n\n                let response = AcquireShardsResponse {\n                    acquired_shards: vec![Shard {\n                        leader_id: \"test-ingester-0\".to_string(),\n                        follower_id: None,\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        doc_mapping_uid: Some(DocMappingUid::default()),\n                        publish_position_inclusive: Some(Position::Beginning),\n                        publish_token: Some(publish_token.to_string()),\n                        update_timestamp: 1724158996,\n                    }],\n                };\n                Ok(response)\n            });\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), 
ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::Beginning);\n\n                Err(IngestV2Error::ShardNotFound {\n                    shard_id: ShardId::from(1),\n                })\n            });\n\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0.clone());\n\n        let event_broker = EventBroker::default();\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool,\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker,\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::for_test();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        let (doc_processor_mailbox, doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        let shard_ids: BTreeSet<ShardId> = BTreeSet::from_iter([ShardId::from(1)]);\n\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        source\n            .emit_batches(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        let shard = 
source.assigned_shards.get(&ShardId::from(1)).unwrap();\n        assert_eq!(shard.status, IndexingStatus::NotFound);\n        assert_eq!(\n            shard.current_position_inclusive,\n            Position::Beginning.as_eof()\n        );\n        let raw_doc_batch = doc_processor_inbox\n            .recv_typed_message::<RawDocBatch>()\n            .await\n            .unwrap();\n\n        let (partition_id, position) = raw_doc_batch.checkpoint_delta.iter().next().unwrap();\n        assert_eq!(partition_id, PartitionId::from(1u64));\n        assert_eq!(position.from, Position::Beginning);\n        assert_eq!(position.to, Position::Beginning.as_eof());\n    }\n\n    #[tokio::test]\n    async fn test_ingest_source_suggest_truncate() {\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::Ingest);\n        let mock_metastore = MockMetastoreService::new();\n\n        let ingester_pool = IngesterPool::default();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_truncate_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-0\");\n                assert_eq!(request.subrequests.len(), 3);\n\n                let subrequest_0 = &request.subrequests[0];\n                assert_eq!(subrequest_0.shard_id(), ShardId::from(1));\n                assert_eq!(\n                    subrequest_0.truncate_up_to_position_inclusive(),\n                    Position::offset(11u64)\n                );\n\n                let subrequest_1 = &request.subrequests[1];\n                assert_eq!(subrequest_1.shard_id(), ShardId::from(2));\n                
assert_eq!(\n                    subrequest_1.truncate_up_to_position_inclusive(),\n                    Position::offset(22u64)\n                );\n\n                let subrequest_2 = &request.subrequests[2];\n                assert_eq!(subrequest_2.shard_id(), ShardId::from(3));\n                assert_eq!(\n                    subrequest_2.truncate_up_to_position_inclusive(),\n                    Position::eof(33u64)\n                );\n\n                Ok(TruncateShardsResponse {})\n            });\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0.clone());\n\n        let mut mock_ingester_1 = MockIngesterService::new();\n        mock_ingester_1\n            .expect_truncate_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-1\");\n                assert_eq!(request.subrequests.len(), 2);\n\n                let subrequest_0 = &request.subrequests[0];\n                assert_eq!(subrequest_0.shard_id(), ShardId::from(2));\n                assert_eq!(\n                    subrequest_0.truncate_up_to_position_inclusive(),\n                    Position::offset(22u64)\n                );\n\n                let subrequest_1 = &request.subrequests[1];\n                assert_eq!(subrequest_1.shard_id(), ShardId::from(3));\n                assert_eq!(\n                    subrequest_1.truncate_up_to_position_inclusive(),\n                    Position::eof(33u64)\n                );\n\n                Ok(TruncateShardsResponse {})\n            });\n        let ingester_1 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1));\n        ingester_pool.insert(\"test-ingester-1\".into(), ingester_1.clone());\n\n        let mut mock_ingester_3 = MockIngesterService::new();\n        mock_ingester_3\n     
       .expect_truncate_shards()\n            .once()\n            .returning(|request| {\n                assert_eq!(request.ingester_id, \"test-ingester-3\");\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest_0 = &request.subrequests[0];\n                assert_eq!(subrequest_0.shard_id(), ShardId::from(4));\n                assert_eq!(\n                    subrequest_0.truncate_up_to_position_inclusive(),\n                    Position::offset(44u64)\n                );\n\n                Ok(TruncateShardsResponse {})\n            });\n        let ingester_3 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_3));\n        ingester_pool.insert(\"test-ingester-3\".into(), ingester_3.clone());\n\n        let event_broker = EventBroker::default();\n        let (shard_positions_update_tx, mut shard_positions_update_rx) =\n            tokio::sync::mpsc::unbounded_channel::<LocalShardPositionsUpdate>();\n        event_broker\n            .subscribe::<LocalShardPositionsUpdate>(move |update| {\n                shard_positions_update_tx.send(update).unwrap();\n            })\n            .forever();\n\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool: ingester_pool.clone(),\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker,\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::for_test();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        let 
(observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        // In this scenario, the ingester 2 is not available and the shard 6 is no longer assigned.\n        source.assigned_shards.insert(\n            ShardId::from(1),\n            AssignedShard {\n                leader_id: \"test-ingester-0\".into(),\n                follower_id_opt: None,\n                partition_id: 1u64.into(),\n                current_position_inclusive: Position::offset(11u64),\n                status: IndexingStatus::Active,\n            },\n        );\n        source.assigned_shards.insert(\n            ShardId::from(2),\n            AssignedShard {\n                leader_id: \"test-ingester-0\".into(),\n                follower_id_opt: Some(\"test-ingester-1\".into()),\n                partition_id: 2u64.into(),\n                current_position_inclusive: Position::offset(22u64),\n                status: IndexingStatus::Active,\n            },\n        );\n        source.assigned_shards.insert(\n            ShardId::from(3),\n            AssignedShard {\n                leader_id: \"test-ingester-1\".into(),\n                follower_id_opt: Some(\"test-ingester-0\".into()),\n                partition_id: 3u64.into(),\n                current_position_inclusive: Position::offset(33u64),\n                status: IndexingStatus::Active,\n            },\n        );\n        source.assigned_shards.insert(\n            ShardId::from(4),\n            AssignedShard {\n                leader_id: \"test-ingester-2\".into(),\n                follower_id_opt: Some(\"test-ingester-3\".into()),\n                partition_id: 4u64.into(),\n                current_position_inclusive: Position::offset(44u64),\n                status: IndexingStatus::Active,\n            },\n        );\n        source.assigned_shards.insert(\n            
ShardId::from(5),\n            AssignedShard {\n                leader_id: \"test-ingester-2\".into(),\n                follower_id_opt: Some(\"test-ingester-3\".into()),\n                partition_id: 5u64.into(),\n                current_position_inclusive: Position::Beginning,\n                status: IndexingStatus::Active,\n            },\n        );\n\n        let checkpoint = SourceCheckpoint::from_iter(vec![\n            (1u64.into(), Position::offset(11u64)),\n            (2u64.into(), Position::offset(22u64)),\n            (3u64.into(), Position::eof(33u64)),\n            (4u64.into(), Position::offset(44u64)),\n            (5u64.into(), Position::Beginning),\n            (6u64.into(), Position::offset(66u64)),\n        ]);\n        source.suggest_truncate(checkpoint, &ctx).await.unwrap();\n\n        let local_shards_update = shard_positions_update_rx.recv().await.unwrap();\n        let expected_local_shards_update = LocalShardPositionsUpdate::new(\n            SourceUid {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".to_string(),\n            },\n            vec![\n                (ShardId::from(1u64), Position::offset(11u64)),\n                (ShardId::from(2u64), Position::offset(22u64)),\n                (ShardId::from(3u64), Position::eof(33u64)),\n                (ShardId::from(4u64), Position::offset(44u64)),\n                (ShardId::from(5u64), Position::Beginning),\n                (ShardId::from(6u64), Position::offset(66u64)),\n            ],\n        );\n        assert_eq!(local_shards_update, expected_local_shards_update);\n    }\n\n    // Motivated by #4888\n    #[tokio::test]\n    async fn test_assigned_deleted_shards() {\n        // It is possible for the control plane to assign a shard to an indexer and delete it right\n        // away. 
In that case, the ingester should just ignore the assigned shard, as\n        // opposed to fail as the metastore does not let it `acquire` the shard.\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from(\"test-node\"),\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: \"test-source\".to_string(),\n            pipeline_uid: PipelineUid::default(),\n        };\n        let source_config = SourceConfig::for_test(\"test-source\", SourceParams::Ingest);\n\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_acquire_shards()\n            .once()\n            .returning(|request: AcquireShardsRequest| {\n                assert_eq!(request.index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_ids, [ShardId::from(1)]);\n\n                let response = AcquireShardsResponse {\n                    acquired_shards: Vec::new(),\n                };\n                Ok(response)\n            });\n        let ingester_pool = IngesterPool::default();\n\n        let event_broker = EventBroker::default();\n        let source_runtime = SourceRuntime {\n            pipeline_id,\n            source_config,\n            metastore: MetastoreServiceClient::from_mock(mock_metastore),\n            ingester_pool,\n            queues_dir_path: PathBuf::from(\"./queues\"),\n            storage_resolver: StorageResolver::for_test(),\n            event_broker: event_broker.clone(),\n            indexing_setting: IndexingSettings::default(),\n        };\n        let retry_params = RetryParams::for_test();\n        let mut source = IngestSource::try_new(source_runtime, retry_params)\n            .await\n            .unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        
let (doc_processor_mailbox, _doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        let shard_ids: BTreeSet<ShardId> = BTreeSet::from_iter([ShardId::from(1)]);\n\n        let truncation_happened = Arc::new(AtomicBool::new(false));\n        let truncation_happened_clone = truncation_happened.clone();\n\n        let _subscription_guard = event_broker.subscribe(move |_: LocalShardPositionsUpdate| {\n            truncation_happened_clone.store(true, std::sync::atomic::Ordering::Relaxed);\n        });\n\n        source\n            .assign_shards(shard_ids, &doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        tokio::time::sleep(Duration::from_millis(100)).await;\n        assert!(!truncation_happened.load(std::sync::atomic::Ordering::Relaxed));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/ingest_api_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\nuse anyhow::bail;\nuse async_trait::async_trait;\nuse quickwit_actors::{ActorContext, ActorExitStatus, Mailbox};\nuse quickwit_ingest::{\n    CreateQueueIfNotExistsRequest, DocCommand, FetchRequest, FetchResponse, GetPartitionId,\n    IngestApiService, SuggestTruncateRequest, get_ingest_api_service,\n};\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::{Position, SourceId};\nuse serde::Serialize;\nuse serde_json::Value as JsonValue;\nuse tracing::{error, info};\n\nuse super::{BatchBuilder, Source, SourceActor, SourceContext, TypedSourceFactory};\nuse crate::actors::DocProcessor;\nuse crate::source::SourceRuntime;\n\n/// Wait time for SourceActor before pooling for new documents.\n/// TODO: Think of better way, maybe increment this (i.e wait longer) as time\n/// goes on without receiving docs.\nconst INGEST_API_POLLING_COOL_DOWN: Duration = Duration::from_secs(1);\n\n#[derive(Default, Clone, Debug, Eq, PartialEq, Serialize)]\npub struct IngestApiSourceCounters {\n    /// Maintains the value of where we stopped in queue from\n    /// a previous call on `emit_batch` and allows\n    /// setting the lower-bound of the checkpoint delta.\n    /// It has the same value as `current_offset` at the end of emit_batch.\n    
pub previous_offset: Option<u64>,\n    /// Maintains the value of where we are in queue and allows\n    /// setting the upper-bound of the checkpoint delta.\n    pub current_offset: Option<u64>,\n    pub num_docs_processed: u64,\n}\n\npub struct IngestApiSource {\n    source_runtime: SourceRuntime,\n    source_id: SourceId,\n    partition_id: PartitionId,\n    ingest_api_service: Mailbox<IngestApiService>,\n    counters: IngestApiSourceCounters,\n}\n\nimpl fmt::Debug for IngestApiSource {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"IngestApiSource {{ source_id: {} }}\", self.source_id)\n    }\n}\n\nimpl IngestApiSource {\n    pub async fn try_new(source_runtime: SourceRuntime) -> anyhow::Result<Self> {\n        let source_id = source_runtime.source_id().to_string();\n        let queues_dir_path = source_runtime.queues_dir_path.as_path();\n        let ingest_api_service = get_ingest_api_service(queues_dir_path).await?;\n        let partition_id: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n\n        // Ensure a queue for this index exists.\n        let create_queue_req = CreateQueueIfNotExistsRequest {\n            queue_id: source_runtime.index_id().to_string(),\n        };\n        match ingest_api_service.ask_for_res(create_queue_req).await {\n            Ok(response) if response.created => {\n                info!(\n                    index_id = source_runtime.index_id(),\n                    %partition_id,\n                    \"created queue successfully\"\n                );\n            }\n            Ok(_) => {}\n            Err(error) => {\n                error!(\n                    index_id = source_runtime.index_id(),\n                    %partition_id,\n                    %error,\n                    \"failed to create queue\"\n                );\n                bail!(error);\n            }\n        }\n        let checkpoint = source_runtime.fetch_checkpoint().await?;\n        let 
previous_offset: Option<u64> = checkpoint\n            .position_for_partition(&partition_id)\n            .map(|position| position.as_u64().expect(\"offset should be stored as u64\"));\n        let current_offset = previous_offset;\n        let ingest_api_source = IngestApiSource {\n            source_runtime,\n            source_id,\n            partition_id,\n            ingest_api_service,\n            counters: IngestApiSourceCounters {\n                previous_offset,\n                current_offset,\n                num_docs_processed: 0,\n            },\n        };\n        Ok(ingest_api_source)\n    }\n\n    async fn send_suggest_truncate_to_ingest_service(\n        &self,\n        up_to_position_included: u64,\n        ctx: &ActorContext<SourceActor>,\n    ) -> anyhow::Result<()> {\n        let suggest_truncate_req = SuggestTruncateRequest {\n            index_id: self.source_runtime.index_id().to_string(),\n            up_to_position_included,\n        };\n        ctx.ask_for_res(&self.ingest_api_service, suggest_truncate_req)\n            .await?;\n        Ok(())\n    }\n\n    fn update_counters(&mut self, current_offset: u64, num_docs: u64) {\n        self.counters.num_docs_processed += num_docs;\n        self.counters.current_offset = Some(current_offset);\n        self.counters.previous_offset = Some(current_offset);\n    }\n}\n\n#[async_trait]\nimpl Source for IngestApiSource {\n    async fn initialize(\n        &mut self,\n        _: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        if let Some(position) = self.counters.previous_offset {\n            self.send_suggest_truncate_to_ingest_service(position, ctx)\n                .await?;\n        }\n        Ok(())\n    }\n\n    async fn emit_batches(\n        &mut self,\n        batch_sink: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let fetch_req = FetchRequest {\n            
index_id: self.source_runtime.index_id().to_string(),\n            start_after: self.counters.current_offset,\n            num_bytes_limit: None,\n        };\n        let FetchResponse {\n            first_position: first_position_opt,\n            doc_batch: doc_batch_opt,\n        } = ctx\n            .ask_for_res(&self.ingest_api_service, fetch_req)\n            .await\n            .map_err(anyhow::Error::from)?;\n\n        // The `first_position_opt` being `None` means the doc_batch is empty and there are\n        // no more documents available, at least for the time being.\n        // That is, we have consumed all pending docs in the queue and need to\n        // make the client wait a bit before polling again.\n        let (first_position, doc_batch) = if let Some(first_position) = first_position_opt {\n            (first_position, doc_batch_opt.unwrap())\n        } else {\n            return Ok(INGEST_API_POLLING_COOL_DOWN);\n        };\n\n        let batch_num_docs = doc_batch.num_docs();\n        // TODO use a timestamp (in the raw doc batch) recorded at ingest time to be more accurate.\n        let mut batch_builder =\n            BatchBuilder::with_capacity(doc_batch.num_docs(), SourceType::IngestV1);\n        for doc in doc_batch.into_iter() {\n            match doc {\n                DocCommand::Ingest { payload } => batch_builder.add_doc(payload),\n                DocCommand::Commit => batch_builder.force_commit(),\n            }\n        }\n        let current_offset = first_position + batch_num_docs as u64 - 1;\n        let partition_id = self.partition_id.clone();\n        batch_builder\n            .checkpoint_delta\n            .record_partition_delta(\n                partition_id,\n                self.counters\n                    .previous_offset\n                    .map(Position::offset)\n                    .unwrap_or_default(),\n                Position::offset(current_offset),\n            )\n            .map_err(anyhow::Error::from)?;\n\n   
     self.update_counters(current_offset, batch_builder.docs.len() as u64);\n        ctx.send_message(batch_sink, batch_builder.build()).await?;\n        Ok(Duration::default())\n    }\n\n    async fn suggest_truncate(\n        &mut self,\n        checkpoint: SourceCheckpoint,\n        ctx: &ActorContext<SourceActor>,\n    ) -> anyhow::Result<()> {\n        if let Some(Position::Offset(offset)) =\n            checkpoint.position_for_partition(&self.partition_id)\n        {\n            let up_to_position_included = offset.as_u64().expect(\"offset should be stored as u64\");\n            self.send_suggest_truncate_to_ingest_service(up_to_position_included, ctx)\n                .await?;\n        }\n        Ok(())\n    }\n\n    fn name(&self) -> String {\n        \"IngestApiSource\".to_string()\n    }\n\n    fn observable_state(&self) -> JsonValue {\n        serde_json::to_value(&self.counters).unwrap()\n    }\n}\n\npub struct IngestApiSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for IngestApiSourceFactory {\n    type Source = IngestApiSource;\n    type Params = ();\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        _: (),\n    ) -> anyhow::Result<Self::Source> {\n        IngestApiSource::try_new(source_runtime).await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n    use std::time::Duration;\n\n    use quickwit_actors::Command::Nudge;\n    use quickwit_actors::Universe;\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_config::{\n        INGEST_API_SOURCE_ID, IngestApiConfig, SourceConfig, SourceInputFormat, SourceParams,\n    };\n    use quickwit_ingest::{CommitType, DocBatchBuilder, IngestRequest, init_ingest_api};\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n    use quickwit_proto::types::{IndexId, IndexUid};\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::SourceActor;\n    use 
crate::source::tests::SourceRuntimeBuilder;\n\n    fn make_ingest_request(\n        index_id: IndexId,\n        num_batch: u64,\n        batch_size: usize,\n        commit_type: CommitType,\n    ) -> IngestRequest {\n        let mut doc_batches = Vec::new();\n        let mut doc_id = 0usize;\n        for _ in 0..num_batch {\n            let mut doc_batch_builder = DocBatchBuilder::new(index_id.clone());\n            for _ in 0..batch_size {\n                doc_batch_builder.ingest_doc(\n                    format!(\"{doc_id:0>6} - The quick brown fox jumps over the lazy dog\")\n                        .as_bytes(),\n                );\n                doc_id += 1;\n            }\n            doc_batches.push(doc_batch_builder.build());\n        }\n        IngestRequest {\n            doc_batches,\n            commit: commit_type.into(),\n        }\n    }\n\n    fn make_source_config() -> SourceConfig {\n        SourceConfig {\n            source_id: INGEST_API_SOURCE_ID.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::IngestApi,\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        }\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_source() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_id = append_random_suffix(\"test-ingest-api-source\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let source_config = make_source_config();\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            
.with_queues_dir(queues_dir_path)\n            .build();\n        let ingest_api_source = IngestApiSource::try_new(source_runtime).await?;\n        let ingest_api_source_actor = SourceActor {\n            source: Box::new(ingest_api_source),\n            doc_processor_mailbox,\n        };\n        let (_ingest_api_source_mailbox, ingest_api_source_handle) =\n            universe.spawn_builder().spawn(ingest_api_source_actor);\n\n        let ingest_req = make_ingest_request(index_id.clone(), 2, 20_000, CommitType::Auto);\n        ingest_api_service\n            .ask_for_res(ingest_req)\n            .await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        universe.sleep(Duration::from_secs(2)).await;\n        let counters = ingest_api_source_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"previous_offset\": 39999u64,\n                \"current_offset\": 39999u64,\n                \"num_docs_processed\": 40000u64\n            })\n        );\n        let doc_batches: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(doc_batches.len(), 2);\n        assert!(&doc_batches[1].docs[0].starts_with(b\"037736\"));\n        // TODO: Source deadlocks and test hangs occasionally if we don't quit source first.\n        ingest_api_source_handle.quit().await;\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    /// See #2310\n    #[tokio::test]\n    async fn test_ingest_api_source_partition_id_changes() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let partition_id_before_lost_queue_dir = {\n            let temp_dir = tempfile::tempdir()?;\n            let queues_dir_path = temp_dir.path();\n            let ingest_api_service =\n                init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n      
      let partition_id: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n            let partition_id2: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n            assert_eq!(partition_id, partition_id2);\n            drop(ingest_api_service);\n            let ingest_api_service =\n                init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n            let partition_id3: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n            assert_eq!(partition_id, partition_id3);\n            partition_id\n        };\n        let partition_id_after_lost_queue_dir = {\n            let temp_dir = tempfile::tempdir()?;\n            let queues_dir_path = temp_dir.path();\n            let ingest_api_service =\n                init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n            let partition_id: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n            partition_id\n        };\n        assert_ne!(\n            partition_id_before_lost_queue_dir,\n            partition_id_after_lost_queue_dir\n        );\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_source_resume_from_checkpoint() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_id = append_random_suffix(\"test-ingest-api-source\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n        let create_queue_req = CreateQueueIfNotExistsRequest {\n            queue_id: index_id.clone(),\n        };\n        ingest_api_service\n            .ask_for_res(create_queue_req)\n            .await\n            .unwrap();\n\n 
       let ingest_req = make_ingest_request(index_id.clone(), 4, 1000, CommitType::Auto);\n        ingest_api_service\n            .ask_for_res(ingest_req)\n            .await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let partition_id: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n        let checkpoint_delta = SourceCheckpointDelta::from_partition_delta(\n            partition_id.clone(),\n            Position::Beginning,\n            Position::offset(1200u64),\n        )\n        .unwrap();\n\n        let source_config = make_source_config();\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_mock_metastore(Some(checkpoint_delta))\n            .with_queues_dir(queues_dir_path)\n            .build();\n\n        let ingest_api_source = IngestApiSource::try_new(source_runtime).await?;\n        let ingest_api_source_actor = SourceActor {\n            source: Box::new(ingest_api_source),\n            doc_processor_mailbox,\n        };\n        let (_ingest_api_source_mailbox, ingest_api_source_handle) =\n            universe.spawn_builder().spawn(ingest_api_source_actor);\n\n        universe.sleep(Duration::from_secs(2)).await;\n        let counters = ingest_api_source_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"previous_offset\": 3999u64,\n                \"current_offset\": 3999u64,\n                \"num_docs_processed\": 2799u64\n            })\n        );\n        let doc_batches: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(doc_batches.len(), 1);\n        assert!(&doc_batches[0].docs[0].starts_with(b\"001201\"));\n        assert_eq!(doc_batches[0].checkpoint_delta.num_partitions(), 1);\n   
     assert_eq!(\n            doc_batches[0].checkpoint_delta.partitions().next().unwrap(),\n            &partition_id\n        );\n        // TODO: Source deadlocks and test hangs occasionally if we don't quit source first.\n        ingest_api_source_handle.quit().await;\n        universe.assert_quit().await;\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_source_with_one_doc() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_id = append_random_suffix(\"test-ingest-api-source\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let source_config = make_source_config();\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_queues_dir(queues_dir_path)\n            .build();\n\n        let ingest_api_source = IngestApiSource::try_new(source_runtime).await?;\n        let ingest_api_source_actor = SourceActor {\n            source: Box::new(ingest_api_source),\n            doc_processor_mailbox,\n        };\n        let (_ingest_api_source_mailbox, ingest_api_source_handle) =\n            universe.spawn_builder().spawn(ingest_api_source_actor);\n\n        let ingest_req = make_ingest_request(index_id.clone(), 1, 1, CommitType::Auto);\n        ingest_api_service\n            .ask_for_res(ingest_req)\n            .await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        universe.sleep(Duration::from_secs(2)).await;\n        let counters = ingest_api_source_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(\n            
counters,\n            serde_json::json!({\n                \"previous_offset\": 0u64,\n                \"current_offset\": 0u64,\n                \"num_docs_processed\": 1u64\n            })\n        );\n        let doc_batches: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(doc_batches.len(), 1);\n        assert!(&doc_batches[0].docs[0].starts_with(b\"000000\"));\n        // TODO: Source deadlocks and test hangs occasionally if we don't quit source first.\n        ingest_api_source_handle.quit().await;\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_source_with_force_commit() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_id = append_random_suffix(\"test-ingest-api-source\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let source_config = make_source_config();\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_queues_dir(queues_dir_path)\n            .build();\n\n        let ingest_api_source = IngestApiSource::try_new(source_runtime).await?;\n        let ingest_api_source_actor = SourceActor {\n            source: Box::new(ingest_api_source),\n            doc_processor_mailbox,\n        };\n        let (_ingest_api_source_mailbox, ingest_api_source_handle) =\n            universe.spawn_builder().spawn(ingest_api_source_actor);\n\n        let ingest_req = make_ingest_request(index_id.clone(), 2, 20_000, CommitType::Force);\n        let ingest_res = ingest_api_service\n            .send_message(ingest_req)\n            
.await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        universe.sleep(Duration::from_secs(2)).await;\n        let counters = ingest_api_source_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"previous_offset\": 40001u64,\n                \"current_offset\": 40001u64,\n                \"num_docs_processed\": 40000u64\n            })\n        );\n        let doc_batches: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(doc_batches.len(), 2);\n        assert!(doc_batches[1].docs[0].starts_with(b\"037736\"));\n        assert!(doc_batches[0].force_commit);\n        assert!(doc_batches[1].force_commit);\n        ingest_api_service\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: index_id.clone(),\n                up_to_position_included: 40001,\n            })\n            .await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        let res = ingest_res\n            .await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        assert_eq!(res.num_docs_for_processing, 40_000);\n        ingest_api_source_handle.quit().await;\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_source_with_wait() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_id = append_random_suffix(\"test-ingest-api-source\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n        let (doc_processor_mailbox, doc_processor_inbox) 
= universe.create_test_mailbox();\n        let source_config = make_source_config();\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_queues_dir(queues_dir_path)\n            .build();\n\n        let ingest_api_source = IngestApiSource::try_new(source_runtime).await?;\n        let ingest_api_source_actor = SourceActor {\n            source: Box::new(ingest_api_source),\n            doc_processor_mailbox,\n        };\n        let (_ingest_api_source_mailbox, ingest_api_source_handle) =\n            universe.spawn_builder().spawn(ingest_api_source_actor);\n        let ingest_req = make_ingest_request(index_id.clone(), 2, 20_000, CommitType::WaitFor);\n        let ingest_res = ingest_api_service\n            .send_message(ingest_req)\n            .await\n            .map_err(|err| anyhow::anyhow!(err.to_string()))?;\n        universe.sleep(Duration::from_secs(2)).await;\n        let counters = ingest_api_source_handle\n            .process_pending_and_observe()\n            .await\n            .state;\n        assert_eq!(\n            counters,\n            serde_json::json!({\n                \"previous_offset\": 39999u64,\n                \"current_offset\": 39999u64,\n                \"num_docs_processed\": 40000u64\n            })\n        );\n        let doc_batches: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert_eq!(doc_batches.len(), 2);\n        assert!(doc_batches[1].docs[0].starts_with(b\"037736\"));\n        assert!(!doc_batches[0].force_commit);\n        assert!(!doc_batches[1].force_commit);\n        ingest_api_service\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: index_id.clone(),\n                up_to_position_included: 39999,\n            })\n            .await\n            .unwrap();\n        let res = ingest_res.await.unwrap().unwrap();\n        assert_eq!(res.num_docs_for_processing, 40_000);\n        
ingest_api_source_handle.quit().await;\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_source_truncate_on_initialize() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_id = append_random_suffix(\"test-ingest-api-source\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n        let (doc_processor_mailbox, _doc_processor_inbox) = universe.create_test_mailbox();\n        let source_config = make_source_config();\n        let _source_runtime = SourceRuntimeBuilder::new(index_uid.clone(), source_config.clone())\n            .with_queues_dir(queues_dir_path)\n            .build();\n\n        let create_queue_req = CreateQueueIfNotExistsRequest {\n            queue_id: index_id.clone(),\n        };\n        ingest_api_service\n            .ask_for_res(create_queue_req)\n            .await\n            .unwrap();\n\n        let ingest_req = make_ingest_request(index_id.clone(), 2, 20_000, CommitType::Auto);\n        ingest_api_service.ask(ingest_req).await.unwrap().unwrap();\n\n        let fetch_request = FetchRequest {\n            index_id: index_id.clone(),\n            start_after: None,\n            num_bytes_limit: None,\n        };\n        let FetchResponse { first_position, .. 
} = ingest_api_service\n            .ask(fetch_request.clone())\n            .await\n            .unwrap()\n            .unwrap();\n        assert_eq!(first_position, Some(0));\n\n        let partition_id: PartitionId = ingest_api_service.ask(GetPartitionId).await?.into();\n        let checkpoint_delta = SourceCheckpointDelta::from_partition_delta(\n            partition_id.clone(),\n            Position::Beginning,\n            Position::offset(10u64),\n        )\n        .unwrap();\n\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_mock_metastore(Some(checkpoint_delta))\n            .with_queues_dir(queues_dir_path)\n            .build();\n\n        let ingest_api_source = IngestApiSource::try_new(source_runtime).await?;\n        let ingest_api_source_actor = SourceActor {\n            source: Box::new(ingest_api_source),\n            doc_processor_mailbox,\n        };\n        let (ingest_api_source_mailbox, ingest_api_source_handle) =\n            universe.spawn_builder().spawn(ingest_api_source_actor);\n\n        ingest_api_source_mailbox.ask(Nudge).await.unwrap();\n        let FetchResponse { first_position, .. } = ingest_api_service\n            .ask(fetch_request.clone())\n            .await\n            .unwrap()\n            .unwrap();\n        // We should have truncated to keep only messages strictly after the source checkpoint.\n        assert_eq!(first_position, Some(11u64));\n\n        ingest_api_source_handle.quit().await;\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/kafka_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::time::{Duration, Instant};\n\nuse anyhow::{Context, anyhow, bail};\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse itertools::Itertools;\nuse oneshot;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_config::KafkaSourceParams;\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::{IndexUid, Position};\nuse rdkafka::config::{ClientConfig, RDKafkaLogLevel};\nuse rdkafka::consumer::{\n    BaseConsumer, CommitMode, Consumer, ConsumerContext, DefaultConsumerContext, Rebalance,\n};\nuse rdkafka::error::KafkaError;\nuse rdkafka::message::BorrowedMessage;\nuse rdkafka::util::Timeout;\nuse rdkafka::{ClientContext, Message, Offset, TopicPartitionList};\nuse serde_json::{Value as JsonValue, json};\nuse tokio::sync::{mpsc, watch};\nuse tokio::task::{JoinHandle, spawn_blocking};\nuse tokio::time;\nuse tracing::{debug, info, warn};\n\nuse crate::actors::DocProcessor;\nuse crate::models::{NewPublishLock, PublishLock};\nuse crate::source::{\n    BATCH_NUM_BYTES_LIMIT, BatchBuilder, EMIT_BATCHES_TIMEOUT, Source, SourceContext,\n    SourceRuntime, TypedSourceFactory,\n};\n\ntype GroupId = String;\n\n/// Factory for instantiating a `KafkaSource`.\npub struct 
KafkaSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for KafkaSourceFactory {\n    type Source = KafkaSource;\n    type Params = KafkaSourceParams;\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        params: KafkaSourceParams,\n    ) -> anyhow::Result<Self::Source> {\n        KafkaSource::try_new(source_runtime, params).await\n    }\n}\n\n#[derive(Debug)]\nenum KafkaEvent {\n    Message(KafkaMessage),\n    AssignPartitions {\n        partitions: Vec<i32>,\n        assignment_tx: oneshot::Sender<Vec<(i32, Offset)>>,\n    },\n    RevokePartitions {\n        ack_tx: oneshot::Sender<()>,\n    },\n    PartitionEOF(i32),\n    Error(anyhow::Error),\n}\n\n#[derive(Debug)]\nstruct KafkaMessage {\n    doc_opt: Option<Bytes>,\n    payload_len: u64,\n    partition: i32,\n    offset: i64,\n}\n\nimpl From<BorrowedMessage<'_>> for KafkaMessage {\n    fn from(message: BorrowedMessage<'_>) -> Self {\n        Self {\n            doc_opt: message_payload_to_doc(&message),\n            payload_len: message.payload_len() as u64,\n            partition: message.partition(),\n            offset: message.offset(),\n        }\n    }\n}\n\nstruct RdKafkaContext {\n    topic: String,\n    events_tx: mpsc::Sender<KafkaEvent>,\n}\n\nimpl ClientContext for RdKafkaContext {}\n\nmacro_rules! 
return_if_err {\n    ($expression:expr, $lit: literal) => {\n        match $expression {\n            Ok(v) => v,\n            Err(_) => {\n                debug!(concat!($lit, \": the source was dropped\"));\n                return;\n            }\n        }\n    };\n}\n\n/// The rebalance protocol at a very high level:\n/// - A consumer joins or leaves a consumer group.\n/// - Consumers receive a revoke partitions notification, which gives them the opportunity to commit\n///   the work in progress.\n/// - Broker waits for ALL the consumers to ack the revoke notification (synchronization barrier).\n/// - Consumers receive new partition assignments.\n///\n/// The API of the rebalance callback is better explained in the docs of `librdkafka`:\n/// <https://docs.confluent.io/2.0.0/clients/librdkafka/classRdKafka_1_1RebalanceCb.html>\nimpl ConsumerContext for RdKafkaContext {\n    fn pre_rebalance(&self, _consumer: &BaseConsumer<Self>, rebalance: &Rebalance) {\n        crate::metrics::INDEXER_METRICS.kafka_rebalance_total.inc();\n        quickwit_common::rate_limited_info!(limit_per_min = 3, topic = self.topic, \"rebalance\");\n        if let Rebalance::Revoke(tpl) = rebalance {\n            let partitions = collect_partitions(tpl, &self.topic);\n            debug!(partitions=?partitions, \"revoke partitions\");\n\n            let (ack_tx, ack_rx) = oneshot::channel();\n            return_if_err!(\n                self.events_tx\n                    .blocking_send(KafkaEvent::RevokePartitions { ack_tx }),\n                \"failed to send revoke message to source\"\n            );\n            return_if_err!(ack_rx.recv(), \"failed to receive revoke ack from source\");\n        }\n        if let Rebalance::Assign(tpl) = rebalance {\n            let partitions = collect_partitions(tpl, &self.topic);\n            debug!(partitions=?partitions, \"assign partitions\");\n\n            let (assignment_tx, assignment_rx) = oneshot::channel();\n            return_if_err!(\n     
           self.events_tx.blocking_send(KafkaEvent::AssignPartitions {\n                    partitions,\n                    assignment_tx,\n                }),\n                \"failed to send assign message to source\"\n            );\n            let assignment = return_if_err!(\n                assignment_rx.recv(),\n                \"failed to receive assignment from source\"\n            );\n            for (partition_id, offset) in assignment {\n                let Some(mut partition) = tpl.find_partition(&self.topic, partition_id) else {\n                    warn!(\"partition `{partition_id}` not found in assignment\");\n                    continue;\n                };\n                if let Err(error) = partition.set_offset(offset) {\n                    warn!(\n                        \"failed to set offset to `{offset:?}` for partition `{partition_id}`: \\\n                         {error}\"\n                    );\n                }\n            }\n        }\n    }\n}\n\nfn collect_partitions(tpl: &TopicPartitionList, topic: &str) -> Vec<i32> {\n    tpl.elements()\n        .iter()\n        .map(|tple| {\n            assert_eq!(tple.topic(), topic);\n            tple.partition()\n        })\n        .collect()\n}\n\ntype RdKafkaConsumer = BaseConsumer<RdKafkaContext>;\n\n#[derive(Default)]\npub struct KafkaSourceState {\n    /// Partition IDs assigned to the source.\n    pub assigned_partitions: HashMap<i32, PartitionId>,\n    /// Offset for each partition of the last message received.\n    pub current_positions: HashMap<i32, Position>,\n    /// Number of inactive partitions, i.e., that have reached EOF.\n    pub num_inactive_partitions: usize,\n    /// Number of bytes processed by the source.\n    pub num_bytes_processed: u64,\n    /// Number of messages processed by the source (including invalid messages).\n    pub num_messages_processed: u64,\n    /// Number of invalid messages, i.e., that were empty or could not be parsed.\n    pub 
num_invalid_messages: u64,\n    /// Number of rebalances the consumer went through.\n    pub num_rebalances: usize,\n}\n\n/// A `KafkaSource` consumes a topic and forwards its messages to an `Indexer`.\npub struct KafkaSource {\n    source_runtime: SourceRuntime,\n    topic: String,\n    group_id: GroupId,\n    state: KafkaSourceState,\n    backfill_mode_enabled: bool,\n    events_rx: mpsc::Receiver<KafkaEvent>,\n    truncate_tx: watch::Sender<SourceCheckpoint>,\n    poll_loop_jh: JoinHandle<()>,\n    publish_lock: PublishLock,\n}\n\nimpl fmt::Debug for KafkaSource {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"KafkaSource\")\n            .field(\"index_uid\", self.source_runtime.index_uid())\n            .field(\"source_id\", &self.source_runtime.source_id())\n            .field(\"topic\", &self.topic)\n            .finish()\n    }\n}\n\nimpl KafkaSource {\n    /// Instantiates a new `KafkaSource`.\n    pub async fn try_new(\n        source_runtime: SourceRuntime,\n        source_params: KafkaSourceParams,\n    ) -> anyhow::Result<Self> {\n        let topic = source_params.topic.clone();\n        let backfill_mode_enabled = source_params.enable_backfill_mode;\n\n        let (events_tx, events_rx) = mpsc::channel(100);\n        let (truncate_tx, truncate_rx) = watch::channel(SourceCheckpoint::default());\n        let (client_config, consumer, group_id) = create_consumer(\n            source_runtime.index_uid(),\n            source_runtime.source_id(),\n            source_params,\n            events_tx.clone(),\n        )?;\n        let native_client_config = client_config.create_native_config()?;\n        let session_timeout_ms = native_client_config\n            .get(\"session.timeout.ms\")?\n            .parse::<u64>()?;\n        let max_poll_interval_ms = native_client_config\n            .get(\"max.poll.interval.ms\")?\n            .parse::<u64>()?;\n\n        let poll_loop_jh =\n            
spawn_consumer_poll_loop(consumer, topic.clone(), events_tx, truncate_rx);\n        let publish_lock = PublishLock::default();\n\n        info!(\n            index_uid=%source_runtime.index_uid(),\n            source_id=%source_runtime.source_id(),\n            topic,\n            group_id,\n            max_poll_interval_ms,\n            session_timeout_ms,\n            \"starting Kafka source\"\n        );\n        if max_poll_interval_ms <= 60_000 {\n            warn!(\n                \"`max.poll.interval.ms` is set to a short duration that may cause the source to \\\n                 crash when back pressure from the indexer occurs. The recommended value is \\\n                 `300000` (5 minutes).\"\n            );\n        }\n        Ok(KafkaSource {\n            source_runtime,\n            topic,\n            group_id,\n            state: KafkaSourceState::default(),\n            backfill_mode_enabled,\n            events_rx,\n            truncate_tx,\n            poll_loop_jh,\n            publish_lock,\n        })\n    }\n\n    async fn process_message(\n        &mut self,\n        message: KafkaMessage,\n        batch: &mut BatchBuilder,\n    ) -> anyhow::Result<()> {\n        let KafkaMessage {\n            doc_opt,\n            payload_len,\n            partition,\n            offset,\n            ..\n        } = message;\n\n        if let Some(doc) = doc_opt {\n            batch.add_doc(doc);\n        } else {\n            self.state.num_invalid_messages += 1;\n        }\n        self.state.num_bytes_processed += payload_len;\n        self.state.num_messages_processed += 1;\n\n        let partition_id = self\n            .state\n            .assigned_partitions\n            .get(&partition)\n            .ok_or_else(|| {\n                anyhow::anyhow!(\n                    \"received message from unassigned partition `{}`. 
Assigned partitions: \\\n                     `{{{}}}`\",\n                    partition,\n                    self.state.assigned_partitions.keys().join(\", \"),\n                )\n            })?\n            .clone();\n        let current_position = Position::offset(offset);\n        let previous_position = self\n            .state\n            .current_positions\n            .insert(partition, current_position.clone())\n            .unwrap_or_else(|| previous_position_for_offset(offset));\n        batch\n            .checkpoint_delta\n            .record_partition_delta(partition_id, previous_position, current_position)\n            .context(\"failed to record partition delta\")?;\n        Ok(())\n    }\n\n    async fn process_assign_partitions(\n        &mut self,\n        ctx: &SourceContext,\n        partitions: &[i32],\n        assignment_tx: oneshot::Sender<Vec<(i32, Offset)>>,\n    ) -> anyhow::Result<()> {\n        let checkpoint = ctx\n            .protect_future(self.source_runtime.fetch_checkpoint())\n            .await?;\n\n        self.state.assigned_partitions.clear();\n        self.state.current_positions.clear();\n        self.state.num_inactive_partitions = 0;\n\n        let mut next_offsets: Vec<(i32, Offset)> = Vec::with_capacity(partitions.len());\n\n        for &partition in partitions {\n            let partition_id = PartitionId::from(partition as i64);\n\n            self.state\n                .assigned_partitions\n                .insert(partition, partition_id.clone());\n\n            let Some(current_position) = checkpoint.position_for_partition(&partition_id).cloned()\n            else {\n                continue;\n            };\n            let next_offset = match &current_position {\n                Position::Beginning => Offset::Beginning,\n                Position::Offset(offset) => {\n                    let offset = offset\n                        .as_i64()\n                        .expect(\"Kafka offset should be stored as 
i64\");\n                    Offset::Offset(offset + 1)\n                }\n                Position::Eof(_) => {\n                    panic!(\"position of a Kafka partition should never be EOF\")\n                }\n            };\n            self.state\n                .current_positions\n                .insert(partition, current_position);\n            next_offsets.push((partition, next_offset));\n        }\n        info!(\n            index_id=%self.source_runtime.index_id(),\n            source_id=%self.source_runtime.source_id(),\n            topic=%self.topic,\n            group_id=%self.group_id,\n            partitions=?partitions,\n            \"new partition assignment after rebalance\",\n        );\n        assignment_tx\n            .send(next_offsets)\n            .context(\"Kafka consumer context was dropped\")?;\n        Ok(())\n    }\n\n    async fn process_revoke_partitions(\n        &mut self,\n        ctx: &SourceContext,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        batch: &mut BatchBuilder,\n        ack_tx: oneshot::Sender<()>,\n    ) -> anyhow::Result<()> {\n        ctx.protect_future(self.publish_lock.kill()).await;\n        ack_tx\n            .send(())\n            .context(\"Kafka consumer context was dropped\")?;\n\n        batch.clear();\n        self.publish_lock = PublishLock::default();\n        self.state.num_rebalances += 1;\n        ctx.send_message(\n            doc_processor_mailbox,\n            NewPublishLock(self.publish_lock.clone()),\n        )\n        .await?;\n        Ok(())\n    }\n\n    fn process_partition_eof(&mut self, partition: i32) {\n        self.state.num_inactive_partitions += 1;\n\n        info!(\n            topic=%self.topic,\n            partition=%partition,\n            num_inactive_partitions=?self.state.num_inactive_partitions,\n            \"reached end of partition\"\n        );\n    }\n\n    fn should_exit(&self) -> bool {\n        self.backfill_mode_enabled\n            // This 
check ensures that we don't shutdown the source before the first partition assignment.\n            && self.state.num_inactive_partitions > 0\n            && self.state.num_inactive_partitions == self.state.assigned_partitions.len()\n    }\n\n    fn truncate(&self, checkpoint: SourceCheckpoint) -> anyhow::Result<()> {\n        self.truncate_tx\n            .send(checkpoint)\n            .context(\"Kafka consumer was dropped\")?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Source for KafkaSource {\n    async fn initialize(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        let publish_lock = self.publish_lock.clone();\n        ctx.send_message(doc_processor_mailbox, NewPublishLock(publish_lock))\n            .await?;\n        Ok(())\n    }\n\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let now = Instant::now();\n        let mut batch_builder = BatchBuilder::new(SourceType::Kafka);\n        let deadline = time::sleep(*EMIT_BATCHES_TIMEOUT);\n        tokio::pin!(deadline);\n\n        loop {\n            tokio::select! 
{\n                event_opt = self.events_rx.recv() => {\n                    let event = event_opt.ok_or_else(|| ActorExitStatus::from(anyhow!(\"consumer was dropped\")))?;\n                    match event {\n                        KafkaEvent::Message(message) => self.process_message(message, &mut batch_builder).await?,\n                        KafkaEvent::AssignPartitions { partitions, assignment_tx} => self.process_assign_partitions(ctx, &partitions, assignment_tx).await?,\n                        KafkaEvent::RevokePartitions { ack_tx } => self.process_revoke_partitions(ctx, doc_processor_mailbox, &mut batch_builder, ack_tx).await?,\n                        KafkaEvent::PartitionEOF(partition) => self.process_partition_eof(partition),\n                        KafkaEvent::Error(error) => Err(ActorExitStatus::from(error))?,\n                    }\n                    if batch_builder.num_bytes >= BATCH_NUM_BYTES_LIMIT {\n                        break;\n                    }\n                }\n                _ = &mut deadline => {\n                    break;\n                }\n            }\n            ctx.record_progress();\n        }\n        if !batch_builder.checkpoint_delta.is_empty() {\n            debug!(\n                num_docs=%batch_builder.docs.len(),\n                num_bytes=%batch_builder.num_bytes,\n                num_millis=%now.elapsed().as_millis(),\n                \"sending doc batch to indexer\"\n            );\n            let message = batch_builder.build();\n            ctx.send_message(doc_processor_mailbox, message).await?;\n        }\n        if self.should_exit() {\n            info!(topic = %self.topic, \"reached end of topic\");\n            ctx.send_exit_with_success(doc_processor_mailbox).await?;\n            return Err(ActorExitStatus::Success);\n        }\n        Ok(Duration::default())\n    }\n\n    async fn suggest_truncate(\n        &mut self,\n        checkpoint: SourceCheckpoint,\n        _ctx: &SourceContext,\n    ) 
-> anyhow::Result<()> {\n        self.truncate(checkpoint)?;\n        Ok(())\n    }\n\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        self.poll_loop_jh.abort();\n        Ok(())\n    }\n\n    fn name(&self) -> String {\n        format!(\"{self:?}\")\n    }\n\n    fn observable_state(&self) -> JsonValue {\n        let assigned_partitions: Vec<&i32> =\n            self.state.assigned_partitions.keys().sorted().collect();\n        let current_positions: Vec<(&i32, &Position)> =\n            self.state.current_positions.iter().sorted().collect();\n        json!({\n            \"index_id\": self.source_runtime.index_id(),\n            \"source_id\": self.source_runtime.source_id(),\n            \"topic\": self.topic,\n            \"assigned_partitions\": assigned_partitions,\n            \"current_positions\": current_positions,\n            \"num_inactive_partitions\": self.state.num_inactive_partitions,\n            \"num_bytes_processed\": self.state.num_bytes_processed,\n            \"num_messages_processed\": self.state.num_messages_processed,\n            \"num_invalid_messages\": self.state.num_invalid_messages,\n            \"num_rebalances\": self.state.num_rebalances,\n        })\n    }\n}\n\n// `rust-rdkafka` provides an async API via `StreamConsumer` for consuming topics asynchronously,\n// BUT the async calls to `recv()` end up being sync when a rebalance occurs because the rebalance\n// callback is sync. 
Until `rust-rdkafka` offers a fully asynchronous API, we poll the consumer in a\n// blocking tokio task and handle the rebalance events via message passing between the rebalance\n// callback and the source.\nfn spawn_consumer_poll_loop(\n    consumer: RdKafkaConsumer,\n    topic: String,\n    events_tx: mpsc::Sender<KafkaEvent>,\n    mut truncate_rx: watch::Receiver<SourceCheckpoint>,\n) -> JoinHandle<()> {\n    spawn_blocking(move || {\n        // `subscribe()` returns immediately but triggers the execution of synchronous code (e.g.\n        // rebalance callback) so it must be called in a blocking task.\n        //\n        // From the librdkafka docs:\n        // `subscribe()` is an asynchronous method which returns immediately: background threads\n        // will (re)join the group, wait for group rebalance, issue any registered rebalance_cb,\n        // assign() the assigned partitions, and then start fetching messages.\n        if let Err(error) = consumer.subscribe(&[&topic]) {\n            let _ = events_tx.blocking_send(KafkaEvent::Error(anyhow!(error)));\n            return;\n        }\n        while !events_tx.is_closed() {\n            if let Some(message_res) = consumer.poll(Some(Duration::from_secs(1))) {\n                let event = match message_res {\n                    Ok(message) => KafkaEvent::Message(message.into()),\n                    Err(KafkaError::PartitionEOF(partition)) => KafkaEvent::PartitionEOF(partition),\n                    Err(error) => KafkaEvent::Error(anyhow!(error)),\n                };\n                // When the source experiences backpressure, this channel becomes full and the\n                // consumer might not call `poll()` for a duration that exceeds\n                // `max.poll.interval.ms`. When that happens the consumer is kicked out of the group\n                // and the source fails. This should not happen in practice with a\n                // sufficiently large value for `max.poll.interval.ms`. 
The default value is 5\n                // minutes.\n                if events_tx.blocking_send(event).is_err() {\n                    break;\n                }\n            }\n            if let Ok(true) = truncate_rx.has_changed() {\n                let checkpoint = truncate_rx.borrow_and_update();\n\n                let mut tpl = TopicPartitionList::new();\n                for (partition_id, position) in checkpoint.iter() {\n                    let partition = partition_id\n                        .as_i64()\n                        .expect(\"Kafka partition should be stored as i64.\")\n                        as i32;\n                    // Quickwit positions are inclusive whereas Kafka offsets are exclusive, hence\n                    // the increment by 1.\n                    let next_position = position\n                        .as_i64()\n                        .expect(\"Kafka offset should be stored as i64.\")\n                        + 1;\n                    let offset = Offset::Offset(next_position);\n                    tpl.add_partition_offset(&topic, partition, offset)\n                        .expect(\"The offset should be valid.\");\n                }\n                if let Err(error) = consumer.commit(&tpl, CommitMode::Async) {\n                    warn!(error=?error, \"failed to commit offsets\");\n                }\n            }\n        }\n        debug!(\"exiting consumer poll loop\");\n        consumer.unsubscribe();\n    })\n}\n\n/// Returns the preceding `Position` for the offset.\nfn previous_position_for_offset(offset: i64) -> Position {\n    if offset == 0 {\n        Position::Beginning\n    } else {\n        Position::offset(offset - 1)\n    }\n}\n\n/// Checks whether we can establish a connection to the Kafka broker.\npub(super) async fn check_connectivity(params: KafkaSourceParams) -> anyhow::Result<()> {\n    let mut client_config = parse_client_params(params.client_params)?;\n\n    let consumer: 
BaseConsumer<DefaultConsumerContext> = client_config\n        .set(\"group.id\", \"quickwit-connectivity-check\".to_string())\n        .set_log_level(RDKafkaLogLevel::Error)\n        .create()?;\n\n    let topic = params.topic.clone();\n    let timeout = Timeout::After(Duration::from_secs(5));\n    let cluster_metadata = spawn_blocking(move || {\n        consumer\n            .fetch_metadata(Some(&topic), timeout)\n            .with_context(|| format!(\"failed to fetch metadata for topic `{topic}`\"))\n    })\n    .await??;\n\n    if cluster_metadata.topics().is_empty() {\n        bail!(\"topic `{}` does not exist\", params.topic);\n    }\n    let topic_metadata = &cluster_metadata.topics()[0];\n    assert_eq!(topic_metadata.name(), params.topic); // Belt and suspenders.\n\n    if topic_metadata.partitions().is_empty() {\n        bail!(\"topic `{}` has no partitions\", params.topic);\n    }\n    Ok(())\n}\n\n/// Creates a new `RdKafkaConsumer`.\nfn create_consumer(\n    index_uid: &IndexUid,\n    source_id: &str,\n    params: KafkaSourceParams,\n    events_tx: mpsc::Sender<KafkaEvent>,\n) -> anyhow::Result<(ClientConfig, RdKafkaConsumer, GroupId)> {\n    // Group ID is limited to 255 characters.\n    let mut group_id = match &params.client_params[\"group.id\"] {\n        JsonValue::String(group_id) => group_id.clone(),\n        _ => format!(\"quickwit-{index_uid}-{source_id}\"),\n    };\n    group_id.truncate(255);\n\n    let mut client_config = parse_client_params(params.client_params)?;\n\n    let log_level = parse_client_log_level(params.client_log_level)?;\n    let consumer: RdKafkaConsumer = client_config\n        .set(\"enable.auto.commit\", \"false\") // We manage offsets ourselves: we always want to set this value to `false`.\n        .set(\n            \"enable.partition.eof\",\n            params.enable_backfill_mode.to_string(),\n        )\n        .set(\"group.id\", &group_id)\n        .set_log_level(log_level)\n        
.create_with_context(RdKafkaContext {\n            topic: params.topic,\n            events_tx,\n        })\n        .context(\"failed to create Kafka consumer\")?;\n\n    Ok((client_config, consumer, group_id))\n}\n\nfn parse_client_log_level(client_log_level: Option<String>) -> anyhow::Result<RDKafkaLogLevel> {\n    let log_level = match client_log_level\n        .map(|log_level| log_level.to_lowercase())\n        .as_deref()\n    {\n        Some(\"debug\") => RDKafkaLogLevel::Debug,\n        Some(\"info\") | None => RDKafkaLogLevel::Info,\n        Some(\"warn\") | Some(\"warning\") => RDKafkaLogLevel::Warning,\n        Some(\"error\") => RDKafkaLogLevel::Error,\n        Some(\"critical\") => RDKafkaLogLevel::Critical,\n        Some(\"alert\") => RDKafkaLogLevel::Alert,\n        Some(\"emerg\") => RDKafkaLogLevel::Emerg,\n        Some(level) => bail!(\n            \"failed to parse Kafka client log level. value `{}` is not supported\",\n            level\n        ),\n    };\n    Ok(log_level)\n}\n\nfn parse_client_params(client_params: JsonValue) -> anyhow::Result<ClientConfig> {\n    let params = if let JsonValue::Object(params) = client_params {\n        params\n    } else {\n        bail!(\"failed to parse Kafka client parameters. `client_params` must be a JSON object\");\n    };\n    let mut client_config = ClientConfig::new();\n    for (key, value_json) in params {\n        let value = match value_json {\n            JsonValue::Bool(value_bool) => value_bool.to_string(),\n            JsonValue::Number(value_number) => value_number.to_string(),\n            JsonValue::String(value_string) => value_string,\n            JsonValue::Null => continue,\n            JsonValue::Array(_) | JsonValue::Object(_) => bail!(\n                \"failed to parse Kafka client parameters. 
`client_params.{}` must be a boolean, \\\n                 number, or string\",\n                key\n            ),\n        };\n        client_config.set(key, value);\n    }\n    Ok(client_config)\n}\n\n/// Returns the message payload as a `Bytes` object if it exists and is not empty.\nfn message_payload_to_doc(message: &BorrowedMessage) -> Option<Bytes> {\n    match message.payload() {\n        Some(payload) if !payload.is_empty() => {\n            let doc = Bytes::from(payload.to_vec());\n            return Some(doc);\n        }\n        Some(_) => debug!(\n            topic=%message.topic(),\n            partition=%message.partition(),\n            offset=%message.offset(),\n            timestamp=?message.timestamp(),\n            \"Document is empty.\"\n        ),\n        None => debug!(\n            topic=%message.topic(),\n            partition=%message.partition(),\n            offset=%message.offset(),\n            timestamp=?message.timestamp(),\n            \"Message payload is empty.\"\n        ),\n    }\n    None\n}\n\n#[cfg(all(test, feature = \"kafka-broker-tests\"))]\nmod kafka_broker_tests {\n    use std::num::NonZeroUsize;\n\n    use quickwit_actors::{ActorContext, Universe};\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_config::{SourceConfig, SourceInputFormat, SourceParams};\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n    use quickwit_metastore::metastore_for_test;\n    use quickwit_proto::types::IndexUid;\n    use rdkafka::admin::{AdminClient, AdminOptions, NewTopic, TopicReplication};\n    use rdkafka::client::DefaultClientContext;\n    use rdkafka::message::ToBytes;\n    use rdkafka::producer::{FutureProducer, FutureRecord};\n    use tokio::sync::watch;\n\n    use super::*;\n    use crate::source::test_setup_helper::setup_index;\n    use crate::source::tests::SourceRuntimeBuilder;\n    use crate::source::{RawDocBatch, SourceActor, quickwit_supported_sources};\n\n    fn 
create_base_consumer(group_id: &str) -> BaseConsumer {\n        ClientConfig::new()\n            .set(\"bootstrap.servers\", \"localhost:9092\")\n            .set(\"group.id\", group_id)\n            .create()\n            .unwrap()\n    }\n\n    fn create_admin_client() -> AdminClient<DefaultClientContext> {\n        ClientConfig::new()\n            .set(\"bootstrap.servers\", \"localhost:9092\")\n            .create()\n            .unwrap()\n    }\n\n    async fn create_topic(\n        admin_client: &AdminClient<DefaultClientContext>,\n        topic: &str,\n        num_partitions: i32,\n    ) -> anyhow::Result<()> {\n        admin_client\n            .create_topics(\n                &[NewTopic::new(\n                    topic,\n                    num_partitions,\n                    TopicReplication::Fixed(1),\n                )],\n                &AdminOptions::new().operation_timeout(Some(Duration::from_secs(5))),\n            )\n            .await?\n            .into_iter()\n            .collect::<Result<Vec<_>, _>>()\n            .map_err(|(topic, err_code)| {\n                anyhow::anyhow!(\n                    \"failed to create topic `{}`. 
error code: `{}`\",\n                    topic,\n                    err_code\n                )\n            })?;\n        Ok(())\n    }\n\n    async fn populate_topic<K, M, J, Q>(\n        topic: &str,\n        num_messages: i32,\n        key_fn: &K,\n        message_fn: &M,\n        partition: Option<i32>,\n        timestamp: Option<i64>,\n    ) -> anyhow::Result<HashMap<(i32, i64), i32>>\n    where\n        K: Fn(i32) -> J,\n        M: Fn(i32) -> Q,\n        J: ToBytes,\n        Q: ToBytes,\n    {\n        let producer: &FutureProducer = &ClientConfig::new()\n            .set(\"bootstrap.servers\", \"localhost:9092\")\n            .set(\"statistics.interval.ms\", \"500\")\n            .set(\"api.version.request\", \"true\")\n            .set(\"debug\", \"all\")\n            .set(\"message.timeout.ms\", \"30000\")\n            .create()?;\n        let tasks = (0..num_messages).map(|id| async move {\n            producer\n                .send(\n                    FutureRecord {\n                        topic,\n                        partition,\n                        timestamp,\n                        key: Some(&key_fn(id)),\n                        payload: Some(&message_fn(id)),\n                        headers: None,\n                    },\n                    Duration::from_secs(1),\n                )\n                .await\n                .map(|delivery| (id, delivery.partition, delivery.offset))\n                .map_err(|(err, _)| err)\n        });\n        let message_map = futures::future::try_join_all(tasks)\n            .await?\n            .into_iter()\n            .fold(HashMap::new(), |mut acc, (id, partition, offset)| {\n                acc.insert((partition, offset), id);\n                acc\n            });\n        Ok(message_map)\n    }\n\n    fn key_fn(id: i32) -> String {\n        format!(\"Key {id}\")\n    }\n\n    fn get_source_config(topic: &str, auto_offset_reset: &str) -> (String, SourceConfig) {\n        let source_id = 
append_random_suffix(\"test-kafka-source--source\");\n        // Setting explicitly ip v4 with `broker.address.family` is required\n        // because of https://github.com/fede1024/rust-rdkafka/issues/809\n        let source_config = SourceConfig {\n            source_id: source_id.clone(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Kafka(KafkaSourceParams {\n                topic: topic.to_string(),\n                client_log_level: None,\n                client_params: json!({\n                    \"auto.offset.reset\": auto_offset_reset,\n                    \"bootstrap.servers\": \"localhost:9092\",\n                    \"broker.address.family\": \"v4\",\n                }),\n                enable_backfill_mode: true,\n            }),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        (source_id, source_config)\n    }\n\n    fn merge_doc_batches(batches: Vec<RawDocBatch>) -> anyhow::Result<RawDocBatch> {\n        let mut merged_batch = RawDocBatch::default();\n        for batch in batches {\n            merged_batch.docs.extend(batch.docs);\n            merged_batch\n                .checkpoint_delta\n                .extend(batch.checkpoint_delta)?;\n        }\n        merged_batch.docs.sort();\n        Ok(merged_batch)\n    }\n\n    #[tokio::test]\n    async fn test_kafka_source_process_message() {\n        let admin_client = create_admin_client();\n        let topic = append_random_suffix(\"test-kafka-source--process-message--topic\");\n        create_topic(&admin_client, &topic, 2).await.unwrap();\n\n        let index_id = append_random_suffix(\"test-kafka-source--process-message--index\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let (_source_id, source_config) = get_source_config(&topic, \"earliest\");\n        let SourceParams::Kafka(params) = source_config.clone().source_params else 
{\n            panic!(\n                \"Expected Kafka source params, got {:?}.\",\n                source_config.source_params\n            );\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let mut kafka_source = KafkaSource::try_new(source_runtime, params).await.unwrap();\n\n        let partition_id_1 = PartitionId::from(1u64);\n        let partition_id_2 = PartitionId::from(2u64);\n\n        kafka_source.state.assigned_partitions =\n            HashMap::from_iter([(1, partition_id_1.clone()), (2, partition_id_2.clone())]);\n\n        assert_eq!(kafka_source.state.num_messages_processed, 0);\n        assert_eq!(kafka_source.state.num_invalid_messages, 0);\n\n        let mut batch_builder = BatchBuilder::new(SourceType::Kafka);\n\n        let message = KafkaMessage {\n            doc_opt: None,\n            payload_len: 7,\n            partition: 1,\n            offset: 0,\n        };\n        kafka_source\n            .process_message(message, &mut batch_builder)\n            .await\n            .unwrap();\n\n        assert_eq!(batch_builder.docs.len(), 0);\n        assert_eq!(batch_builder.num_bytes, 0);\n        assert_eq!(\n            kafka_source.state.current_positions.get(&1).unwrap(),\n            &Position::offset(0u64)\n        );\n        assert_eq!(kafka_source.state.num_bytes_processed, 7);\n        assert_eq!(kafka_source.state.num_messages_processed, 1);\n        assert_eq!(kafka_source.state.num_invalid_messages, 1);\n\n        let message = KafkaMessage {\n            doc_opt: Some(Bytes::from_static(b\"test-doc\")),\n            payload_len: 8,\n            partition: 1,\n            offset: 1,\n        };\n        kafka_source\n            .process_message(message, &mut batch_builder)\n            .await\n            .unwrap();\n\n        assert_eq!(batch_builder.docs.len(), 1);\n        assert_eq!(batch_builder.docs[0], \"test-doc\");\n        
assert_eq!(batch_builder.num_bytes, 8);\n        assert_eq!(\n            kafka_source.state.current_positions.get(&1).unwrap(),\n            &Position::offset(1u64)\n        );\n        assert_eq!(kafka_source.state.num_bytes_processed, 15);\n        assert_eq!(kafka_source.state.num_messages_processed, 2);\n        assert_eq!(kafka_source.state.num_invalid_messages, 1);\n\n        let message = KafkaMessage {\n            doc_opt: Some(Bytes::from_static(b\"test-doc\")),\n            payload_len: 8,\n            partition: 2,\n            offset: 42,\n        };\n        kafka_source\n            .process_message(message, &mut batch_builder)\n            .await\n            .unwrap();\n\n        assert_eq!(batch_builder.docs.len(), 2);\n        assert_eq!(batch_builder.docs[1], \"test-doc\");\n        assert_eq!(batch_builder.num_bytes, 16);\n        assert_eq!(\n            kafka_source.state.current_positions.get(&2).unwrap(),\n            &Position::offset(42u64)\n        );\n        assert_eq!(kafka_source.state.num_bytes_processed, 23);\n        assert_eq!(kafka_source.state.num_messages_processed, 3);\n        assert_eq!(kafka_source.state.num_invalid_messages, 1);\n\n        let mut expected_checkpoint_delta = SourceCheckpointDelta::default();\n        expected_checkpoint_delta\n            .record_partition_delta(partition_id_1, Position::Beginning, Position::offset(1u64))\n            .unwrap();\n        expected_checkpoint_delta\n            .record_partition_delta(\n                partition_id_2,\n                Position::offset(41u64),\n                Position::offset(42u64),\n            )\n            .unwrap();\n        assert_eq!(batch_builder.checkpoint_delta, expected_checkpoint_delta);\n\n        // Message from unassigned partition\n        let message = KafkaMessage {\n            doc_opt: Some(Bytes::from_static(b\"test-doc\")),\n            payload_len: 8,\n            partition: 3,\n            offset: 42,\n        };\n        
kafka_source\n            .process_message(message, &mut batch_builder)\n            .await\n            .unwrap_err();\n    }\n\n    #[tokio::test]\n    async fn test_kafka_source_process_assign_partitions() {\n        let admin_client = create_admin_client();\n        let topic = append_random_suffix(\"test-kafka-source--process-assign-partitions--topic\");\n        create_topic(&admin_client, &topic, 2).await.unwrap();\n\n        let metastore = metastore_for_test();\n        let index_id = append_random_suffix(\"test-kafka-source--process-assign-partitions--index\");\n        let (_source_id, source_config) = get_source_config(&topic, \"earliest\");\n\n        let index_uid = setup_index(\n            metastore.clone(),\n            &index_id,\n            &source_config,\n            &[(\n                PartitionId::from(2u64),\n                Position::Beginning,\n                Position::offset(42u64),\n            )],\n        )\n        .await;\n\n        let SourceParams::Kafka(params) = source_config.clone().source_params else {\n            panic!(\n                \"Expected Kafka source params, got {:?}.\",\n                source_config.source_params\n            );\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_metastore(metastore)\n            .build();\n        let mut kafka_source = KafkaSource::try_new(source_runtime, params).await.unwrap();\n        kafka_source.state.num_inactive_partitions = 1;\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(json!({}));\n        let ctx: ActorContext<SourceActor> =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n        let (assignment_tx, assignment_rx) = oneshot::channel();\n\n        kafka_source\n            
.process_assign_partitions(&ctx, &[1, 2], assignment_tx)\n            .await\n            .unwrap();\n\n        assert_eq!(kafka_source.state.num_inactive_partitions, 0);\n\n        let expected_assigned_partitions =\n            HashMap::from_iter([(1, PartitionId::from(1u64)), (2, PartitionId::from(2u64))]);\n        assert_eq!(\n            kafka_source.state.assigned_partitions,\n            expected_assigned_partitions\n        );\n        let expected_current_positions = HashMap::from_iter([(2, Position::offset(42u64))]);\n        assert_eq!(\n            kafka_source.state.current_positions,\n            expected_current_positions\n        );\n\n        let assignment = assignment_rx.await.unwrap();\n        assert_eq!(assignment, &[(2, Offset::Offset(43))])\n    }\n\n    #[tokio::test]\n    async fn test_kafka_source_process_revoke_partitions() {\n        let admin_client = create_admin_client();\n        let topic = append_random_suffix(\"test-kafka-source--process-revoke-partitions--topic\");\n        create_topic(&admin_client, &topic, 1).await.unwrap();\n\n        let index_id = append_random_suffix(\"test-kafka-source--process-revoke--partitions--index\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let (_source_id, source_config) = get_source_config(&topic, \"earliest\");\n        let SourceParams::Kafka(params) = source_config.clone().source_params else {\n            panic!(\n                \"Expected Kafka source params, got {:?}.\",\n                source_config.source_params\n            );\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let mut kafka_source = KafkaSource::try_new(source_runtime, params).await.unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox();\n        let (indexer_mailbox, indexer_inbox) = universe.create_test_mailbox();\n        let 
(observable_state_tx, _observable_state_rx) = watch::channel(json!({}));\n        let ctx: ActorContext<SourceActor> =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n        let (ack_tx, ack_rx) = oneshot::channel();\n\n        let mut batch_builder = BatchBuilder::new(SourceType::Kafka);\n        batch_builder.add_doc(Bytes::from_static(b\"test-doc\"));\n\n        let publish_lock = kafka_source.publish_lock.clone();\n        assert!(publish_lock.is_alive());\n        assert_eq!(kafka_source.state.num_rebalances, 0);\n\n        kafka_source\n            .process_revoke_partitions(&ctx, &indexer_mailbox, &mut batch_builder, ack_tx)\n            .await\n            .unwrap();\n\n        ack_rx.await.unwrap();\n        assert!(batch_builder.docs.is_empty());\n        assert!(publish_lock.is_dead());\n\n        assert_eq!(kafka_source.state.num_rebalances, 1);\n\n        let indexer_messages: Vec<NewPublishLock> = indexer_inbox.drain_for_test_typed();\n        assert_eq!(indexer_messages.len(), 1);\n        assert!(indexer_messages[0].0.is_alive());\n    }\n\n    #[tokio::test]\n    async fn test_kafka_source_process_partition_eof() {\n        let admin_client = create_admin_client();\n        let topic = append_random_suffix(\"test-kafka-source--process-partition-eof--topic\");\n        create_topic(&admin_client, &topic, 1).await.unwrap();\n\n        let index_id = append_random_suffix(\"test-kafka-source--process-partition-eof--index\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n        let (_source_id, source_config) = get_source_config(&topic, \"earliest\");\n        let SourceParams::Kafka(params) = source_config.clone().source_params else {\n            panic!(\n                \"Expected Kafka source params, got {:?}.\",\n                source_config.source_params\n            );\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let 
mut kafka_source = KafkaSource::try_new(source_runtime, params).await.unwrap();\n        let partition_id_1 = PartitionId::from(1u64);\n        kafka_source.state.assigned_partitions = HashMap::from_iter([(1, partition_id_1)]);\n\n        assert!(!kafka_source.should_exit());\n\n        kafka_source.process_partition_eof(1);\n        assert_eq!(kafka_source.state.num_inactive_partitions, 1);\n        assert!(kafka_source.should_exit());\n\n        kafka_source.backfill_mode_enabled = false;\n        assert!(!kafka_source.should_exit());\n    }\n\n    #[tokio::test]\n    async fn test_kafka_source_suggest_truncate() {\n        let admin_client = create_admin_client();\n        let topic = append_random_suffix(\"test-kafka-source--suggest-truncate--topic\");\n        create_topic(&admin_client, &topic, 2).await.unwrap();\n\n        let metastore = metastore_for_test();\n        let index_id = append_random_suffix(\"test-kafka-source--suggest-truncate--index\");\n        let (_source_id, source_config) = get_source_config(&topic, \"earliest\");\n        let index_uid = setup_index(\n            metastore.clone(),\n            &index_id,\n            &source_config,\n            &[(\n                PartitionId::from(2u64),\n                Position::Beginning,\n                Position::offset(42u64),\n            )],\n        )\n        .await;\n\n        let SourceParams::Kafka(params) = source_config.clone().source_params else {\n            panic!(\n                \"Expected Kafka source params, got {:?}.\",\n                source_config.source_params\n            );\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_metastore(metastore)\n            .build();\n        let mut kafka_source = KafkaSource::try_new(source_runtime, params).await.unwrap();\n\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox();\n        let 
(observable_state_tx, _observable_state_rx) = watch::channel(json!({}));\n        let ctx: ActorContext<SourceActor> =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        let KafkaEvent::AssignPartitions {\n            partitions,\n            assignment_tx,\n        } = kafka_source.events_rx.recv().await.unwrap()\n        else {\n            panic!(\"Expected `AssignPartitions` event.\");\n        };\n        kafka_source\n            .process_assign_partitions(&ctx, &partitions, assignment_tx)\n            .await\n            .unwrap();\n\n        let checkpoint: SourceCheckpoint = [(0u64, 1u64), (1u64, 2u64)]\n            .into_iter()\n            .map(|(partition_id, offset)| {\n                (PartitionId::from(partition_id), Position::offset(offset))\n            })\n            .collect();\n        kafka_source.truncate(checkpoint).unwrap();\n\n        tokio::time::sleep(Duration::from_secs(1)).await;\n\n        let mut tpl = TopicPartitionList::new();\n        tpl.add_partition(&topic, 0);\n        tpl.add_partition(&topic, 1);\n\n        let consumer = create_base_consumer(&kafka_source.group_id);\n        let committed_offsets = consumer\n            .committed_offsets(tpl.clone(), Duration::from_secs(10))\n            .unwrap();\n\n        assert_eq!(\n            committed_offsets\n                .find_partition(&topic, 0)\n                .unwrap()\n                .offset(),\n            Offset::Offset(2)\n        );\n        assert_eq!(\n            committed_offsets\n                .find_partition(&topic, 1)\n                .unwrap()\n                .offset(),\n            Offset::Offset(3)\n        );\n    }\n\n    #[tokio::test]\n    async fn test_kafka_source() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let admin_client = create_admin_client();\n        let topic = append_random_suffix(\"test-kafka-source--topic\");\n        
create_topic(&admin_client, &topic, 3).await?;\n\n        let source_loader = quickwit_supported_sources();\n        {\n            // Test Kafka source with empty topic.\n            let metastore = metastore_for_test();\n            let index_id = append_random_suffix(\"test-kafka-source--index\");\n            let (source_id, source_config) = get_source_config(&topic, \"earliest\");\n            let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n            let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n                .with_metastore(metastore)\n                .build();\n            let source = source_loader.load_source(source_runtime).await?;\n            let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n            let source_actor = SourceActor {\n                source,\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_source_mailbox, source_handle) = universe.spawn_builder().spawn(source_actor);\n            let (exit_status, exit_state) = source_handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n            assert!(messages.is_empty());\n\n            let expected_state = json!({\n                \"index_id\": index_id,\n                \"source_id\": source_id,\n                \"topic\":  topic,\n                \"assigned_partitions\": vec![0, 1, 2],\n                \"current_positions\": json!([]),\n                \"num_inactive_partitions\": 3,\n                \"num_bytes_processed\": 0,\n                \"num_messages_processed\": 0,\n                \"num_invalid_messages\": 0,\n                \"num_rebalances\": 0,\n            });\n            assert_eq!(exit_state, expected_state);\n        }\n        for partition_id in 0..3 {\n            populate_topic(\n                
&topic,\n                3,\n                &key_fn,\n                &|message_id| {\n                    if message_id == 1 {\n                        \"\".to_string()\n                    } else {\n                        format!(\"Message #{:0>3}\", partition_id * 100 + message_id)\n                    }\n                },\n                Some(partition_id),\n                None,\n            )\n            .await?;\n        }\n        {\n            let metastore = metastore_for_test();\n            let index_id = append_random_suffix(\"test-kafka-source--index\");\n            let (source_id, source_config) = get_source_config(&topic, \"earliest\");\n            let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n            let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n                .with_metastore(metastore)\n                .build();\n            let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n            let source = source_loader.load_source(source_runtime).await?;\n            let source_actor = SourceActor {\n                source,\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_source_mailbox, source_handle) = universe.spawn_builder().spawn(source_actor);\n            let (exit_status, exit_state) = source_handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n            assert!(!messages.is_empty());\n\n            let batch = merge_doc_batches(messages)?;\n            let expected_docs = vec![\n                \"Message #000\",\n                \"Message #002\",\n                \"Message #100\",\n                \"Message #102\",\n                \"Message #200\",\n                \"Message #202\",\n            ];\n            assert_eq!(batch.docs, expected_docs);\n\n            
let mut expected_checkpoint_delta = SourceCheckpointDelta::default();\n            for partition in 0u64..3u64 {\n                expected_checkpoint_delta.record_partition_delta(\n                    PartitionId::from(partition),\n                    Position::Beginning,\n                    Position::offset(2u64),\n                )?;\n            }\n            assert_eq!(batch.checkpoint_delta, expected_checkpoint_delta);\n\n            let expected_state = json!({\n                \"index_id\": index_id,\n                \"source_id\": source_id,\n                \"topic\":  topic,\n                \"assigned_partitions\": vec![0, 1, 2],\n                \"current_positions\":  vec![(0, \"00000000000000000002\"), (1, \"00000000000000000002\"), (2, \"00000000000000000002\")],\n                \"num_inactive_partitions\": 3,\n                \"num_bytes_processed\": 72,\n                \"num_messages_processed\": 9,\n                \"num_invalid_messages\": 3,\n                \"num_rebalances\": 0,\n            });\n            assert_eq!(exit_state, expected_state);\n        }\n        {\n            // Test Kafka source with `earliest` offset reset.\n            let metastore = metastore_for_test();\n            let index_id = append_random_suffix(\"test-kafka-source--index\");\n            let (source_id, source_config) = get_source_config(&topic, \"earliest\");\n            let index_uid = setup_index(\n                metastore.clone(),\n                &index_id,\n                &source_config,\n                &[\n                    (\n                        PartitionId::from(0u64),\n                        Position::Beginning,\n                        Position::offset(0u64),\n                    ),\n                    (\n                        PartitionId::from(1u64),\n                        Position::Beginning,\n                        Position::offset(2u64),\n                    ),\n                ],\n            )\n            .await;\n      
      let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n                .with_metastore(metastore)\n                .build();\n            let source = source_loader.load_source(source_runtime).await?;\n            let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n            let source_actor = SourceActor {\n                source,\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_source_mailbox, source_handle) = universe.spawn_builder().spawn(source_actor);\n            let (exit_status, exit_state) = source_handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n            assert!(!messages.is_empty());\n\n            let batch = merge_doc_batches(messages)?;\n            let expected_docs = vec![\"Message #002\", \"Message #200\", \"Message #202\"];\n            assert_eq!(batch.docs, expected_docs);\n\n            let mut expected_checkpoint_delta = SourceCheckpointDelta::default();\n            expected_checkpoint_delta.record_partition_delta(\n                PartitionId::from(0u64),\n                Position::offset(0u64),\n                Position::offset(2u64),\n            )?;\n            expected_checkpoint_delta.record_partition_delta(\n                PartitionId::from(2u64),\n                Position::Beginning,\n                Position::offset(2u64),\n            )?;\n            assert_eq!(batch.checkpoint_delta, expected_checkpoint_delta,);\n\n            let expected_exit_state = json!({\n                \"index_id\": index_id,\n                \"source_id\": source_id,\n                \"topic\":  topic,\n                \"assigned_partitions\": vec![0, 1, 2],\n                \"current_positions\":  vec![(0, \"00000000000000000002\"), (1, \"00000000000000000002\"), (2, \"00000000000000000002\")],\n                
\"num_inactive_partitions\": 3,\n                \"num_bytes_processed\": 36,\n                \"num_messages_processed\": 5,\n                \"num_invalid_messages\": 2,\n                \"num_rebalances\": 0,\n            });\n            assert_eq!(exit_state, expected_exit_state);\n        }\n        {\n            // Test Kafka source with `latest` offset reset.\n            let metastore = metastore_for_test();\n            let index_id = append_random_suffix(\"test-kafka-source--index\");\n            let (source_id, source_config) = get_source_config(&topic, \"latest\");\n            let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n            let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n                .with_metastore(metastore)\n                .build();\n            let source = source_loader.load_source(source_runtime).await?;\n            let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n            let source_actor = SourceActor {\n                source,\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_source_mailbox, source_handle) = universe.spawn_builder().spawn(source_actor);\n            let (exit_status, exit_state) = source_handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n            assert!(messages.is_empty());\n\n            let expected_state = json!({\n                \"index_id\": index_id,\n                \"source_id\": source_id,\n                \"topic\":  topic,\n                \"assigned_partitions\": vec![0, 1, 2],\n                \"current_positions\": json!([]),\n                \"num_inactive_partitions\": 3,\n                \"num_bytes_processed\": 0,\n                \"num_messages_processed\": 0,\n                \"num_invalid_messages\": 0,\n            
    \"num_rebalances\": 0,\n            });\n            assert_eq!(exit_state, expected_state);\n        }\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_kafka_connectivity() {\n        let bootstrap_servers = \"localhost:9092\".to_string();\n        let topic = append_random_suffix(\"test-kafka-connectivity-topic\");\n\n        let admin_client = create_admin_client();\n        create_topic(&admin_client, &topic, 1).await.unwrap();\n\n        // Check valid connectivity\n        check_connectivity(KafkaSourceParams {\n            topic: topic.clone(),\n            client_log_level: None,\n            client_params: json!({ \"bootstrap.servers\": bootstrap_servers }),\n            enable_backfill_mode: true,\n        })\n        .await\n        .unwrap();\n\n        // TODO: these tests should be checking the specific errors.\n        // Non existent topic should throw an error.\n        check_connectivity(KafkaSourceParams {\n            topic: \"non-existent-topic\".to_string(),\n            client_log_level: None,\n            client_params: json!({ \"bootstrap.servers\": bootstrap_servers }),\n            enable_backfill_mode: true,\n        })\n        .await\n        .unwrap_err();\n\n        // Invalid brokers should throw an error\n        let _result = check_connectivity(KafkaSourceParams {\n            topic: topic.clone(),\n            client_log_level: None,\n            client_params: json!({\n                \"bootstrap.servers\": \"192.0.2.10:9092\"\n            }),\n            enable_backfill_mode: true,\n        })\n        .await\n        .unwrap_err();\n    }\n\n    #[test]\n    fn test_client_config_default_max_poll_interval() {\n        // If the client config does not specify `max.poll.interval.ms`, then the default value\n        // provided by the native config will be used.\n        //\n        // This unit test will warn us if the current default value of 5 minutes changes.\n        let config = ClientConfig::new();\n    
    let native_config = config.create_native_config().unwrap();\n        let default_max_poll_interval_ms = native_config.get(\"max.poll.interval.ms\").unwrap();\n        assert_eq!(default_max_poll_interval_ms, \"300000\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/kinesis/api.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse aws_sdk_kinesis::Client as KinesisClient;\nuse aws_sdk_kinesis::operation::get_records::GetRecordsOutput;\nuse aws_sdk_kinesis::types::{Shard, ShardIteratorType};\nuse quickwit_aws::retry::aws_retry;\nuse quickwit_common::retry::RetryParams;\n\n/// Gets records from a Kinesis data stream's shard.\n/// <https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html>\npub(crate) async fn get_records(\n    kinesis_client: &KinesisClient,\n    retry_params: &RetryParams,\n    shard_iterator: String,\n) -> anyhow::Result<GetRecordsOutput> {\n    // TODO: Return an error other than `anyhow::Error` so that expired shard iterators can be\n    // handled properly.\n    let response = aws_retry(retry_params, || async {\n        kinesis_client\n            .get_records()\n            .shard_iterator(shard_iterator.clone())\n            .send()\n            .await\n    })\n    .await?;\n\n    Ok(response)\n}\n\n/// Gets a Kinesis shard iterator. A shard iterator expires 5 minutes after it is returned\n/// to the requester.\n/// <https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetShardIterator.html>\n///\n/// The returned shard iterator points to the record positioned right after\n/// `from_sequence_number_exclusive` if a value is provided. 
Otherwise, it points to the first\n/// (oldest) record in the shard.\npub(crate) async fn get_shard_iterator(\n    kinesis_client: &KinesisClient,\n    retry_params: &RetryParams,\n    stream_name: &str,\n    shard_id: &str,\n    from_sequence_number_exclusive: Option<String>,\n) -> anyhow::Result<Option<String>> {\n    let shard_iterator_type = if from_sequence_number_exclusive.is_some() {\n        ShardIteratorType::AfterSequenceNumber\n    } else {\n        ShardIteratorType::TrimHorizon\n    };\n\n    let response = aws_retry(retry_params, || async {\n        kinesis_client\n            .get_shard_iterator()\n            .stream_name(stream_name)\n            .shard_id(shard_id.to_string())\n            .shard_iterator_type(shard_iterator_type.clone())\n            .set_starting_sequence_number(from_sequence_number_exclusive.clone())\n            .send()\n            .await\n    })\n    .await?;\n    Ok(response.shard_iterator)\n}\n\n/// Lists the shards in a stream and provides information about each shard. 
This operation has a\n/// limit of 1000 transactions per second per data stream.\n/// <https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListShards.html>\npub(crate) async fn list_shards(\n    kinesis_client: &KinesisClient,\n    retry_params: &RetryParams,\n    stream_name: &str,\n    limit_per_request: Option<usize>,\n) -> anyhow::Result<Vec<Shard>> {\n    let mut shards = Vec::new();\n    let mut next_token = None;\n\n    loop {\n        // `stream_name` and `next_token` cannot be set simultaneously.\n        let stream_name = if next_token.is_none() {\n            Some(stream_name.to_string())\n        } else {\n            None\n        };\n        let limit_per_request = limit_per_request.map(|limit| limit as i32);\n        let response = aws_retry(retry_params, || async {\n            kinesis_client\n                .list_shards()\n                .set_stream_name(stream_name.clone())\n                .set_next_token(next_token.clone())\n                .set_max_results(limit_per_request)\n                .send()\n                .await\n        })\n        .await?;\n\n        if let Some(shrds) = response.shards {\n            shards.extend_from_slice(&shrds);\n        }\n        if response.next_token.is_none() {\n            return Ok(shards);\n        }\n        next_token = response.next_token;\n    }\n}\n\n#[cfg(all(test, feature = \"kinesis-localstack-tests\"))]\npub(crate) mod tests {\n    use std::collections::BTreeSet;\n    use std::time::Duration;\n\n    use anyhow::{Context, anyhow};\n    use aws_sdk_kinesis::types::StreamDescription;\n\n    use super::*;\n    use crate::source::kinesis::helpers::tests::DEFAULT_RETRY_PARAMS;\n\n    /// Creates a Kinesis data stream.\n    /// https://docs.aws.amazon.com/kinesis/latest/APIReference/API_CreateStream.html\n    pub(crate) async fn create_stream(\n        kinesis_client: &KinesisClient,\n        stream_name: &str,\n        num_shards: usize,\n    ) -> anyhow::Result<()> {\n        
aws_retry(&DEFAULT_RETRY_PARAMS, || async {\n            kinesis_client\n                .create_stream()\n                .stream_name(stream_name)\n                .shard_count(num_shards as i32)\n                .send()\n                .await\n        })\n        .await\n        .with_context(|| format!(\"failed to create Kinesis data stream `{stream_name}`\"))?;\n        Ok(())\n    }\n\n    /// Deletes a Kinesis data stream. Only streams in `ACTIVE` state can be deleted.\n    /// https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DeleteStream.html\n    pub(crate) async fn delete_stream(\n        kinesis_client: &KinesisClient,\n        stream_name: &str,\n    ) -> anyhow::Result<()> {\n        aws_retry(&DEFAULT_RETRY_PARAMS, || async {\n            kinesis_client\n                .delete_stream()\n                .stream_name(stream_name.to_string())\n                .send()\n                .await\n        })\n        .await\n        .with_context(|| format!(\"failed to delete Kinesis data stream `{stream_name}`\"))?;\n        Ok(())\n    }\n\n    /// Describes the specified Kinesis data stream, including its current status and its\n    /// shards. 
https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStream.html\n    pub(crate) async fn describe_stream(\n        kinesis_client: &KinesisClient,\n        stream_name: &str,\n    ) -> anyhow::Result<StreamDescription> {\n        let response = aws_retry(&DEFAULT_RETRY_PARAMS, || async {\n            kinesis_client\n                .describe_stream()\n                .stream_name(stream_name.to_string())\n                .send()\n                .await\n        })\n        .await?;\n\n        response\n            .stream_description\n            .ok_or_else(|| anyhow!(\"no stream description was returned from AWS\"))\n    }\n\n    /// Lists the Kinesis data streams.\n    /// https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListStreams.html\n    pub(crate) async fn list_streams(\n        kinesis_client: &KinesisClient,\n        mut exclusive_start_stream_name: Option<String>,\n        limit_per_request: Option<usize>,\n    ) -> anyhow::Result<BTreeSet<String>> {\n        let mut stream_names = BTreeSet::new();\n        let mut has_more_streams = true;\n        let limit_per_request = limit_per_request.map(|limit| limit as i32);\n        while has_more_streams {\n            let response = aws_retry(&DEFAULT_RETRY_PARAMS, || async {\n                kinesis_client\n                    .list_streams()\n                    .set_exclusive_start_stream_name(exclusive_start_stream_name.clone())\n                    .set_limit(limit_per_request)\n                    .send()\n                    .await\n            })\n            .await?;\n            exclusive_start_stream_name = response.stream_names.last().cloned();\n            has_more_streams = response.has_more_streams;\n            stream_names.extend(response.stream_names);\n        }\n        Ok(stream_names)\n    }\n\n    /// Merges two adjacent shards in a Kinesis data stream and combines them into a single shard.\n    /// 
https://docs.aws.amazon.com/kinesis/latest/APIReference/API_MergeShards.html\n    #[cfg(test)]\n    pub(crate) async fn merge_shards(\n        kinesis_client: &KinesisClient,\n        stream_name: &str,\n        shard_id: &str,\n        adjacent_shard_id: &str,\n    ) -> anyhow::Result<()> {\n        aws_retry(&DEFAULT_RETRY_PARAMS, || async {\n            kinesis_client\n                .merge_shards()\n                .stream_name(stream_name)\n                .shard_to_merge(shard_id)\n                .adjacent_shard_to_merge(adjacent_shard_id)\n                .send()\n                .await\n        })\n        .await?;\n        Ok(())\n    }\n\n    /// Splits a shard into two new shards in the Kinesis data stream.\n    /// https://docs.aws.amazon.com/kinesis/latest/APIReference/API_SplitShard.html\n    #[cfg(test)]\n    pub(crate) async fn split_shard(\n        kinesis_client: &KinesisClient,\n        stream_name: &str,\n        shard_id: &str,\n        starting_hash_key: &str,\n    ) -> anyhow::Result<()> {\n        aws_retry(&DEFAULT_RETRY_PARAMS, || async {\n            kinesis_client\n                .split_shard()\n                .stream_name(stream_name)\n                .shard_to_split(shard_id)\n                .new_starting_hash_key(starting_hash_key)\n                .send()\n                .await\n        })\n        .await?;\n        Ok(())\n    }\n\n    /// Waits for a Kinesis data stream's status to satisfy the specified predicate. This is done\n    /// by periodically polling the [`describe_stream`] API for the stream. 
Returns an error\n    /// after the specified `timeout` duration has passed.\n    #[cfg(test)]\n    pub(crate) async fn wait_for_stream_status<P>(\n        kinesis_client: &KinesisClient,\n        stream_name: &str,\n        stream_status_predicate: P,\n        timeout: Duration,\n    ) -> Result<anyhow::Result<()>, tokio::time::error::Elapsed>\n    where\n        P: Fn(aws_sdk_kinesis::types::StreamStatus) -> bool,\n    {\n        tokio::time::timeout(timeout, async {\n            let period = Duration::from_millis(if cfg!(test) { 100 } else { 5000 });\n            let mut interval = tokio::time::interval(period);\n            loop {\n                interval.tick().await;\n                let stream_status = describe_stream(kinesis_client, stream_name)\n                    .await?\n                    .stream_status;\n\n                if stream_status_predicate(stream_status) {\n                    return Ok(());\n                }\n            }\n        })\n        .await\n    }\n}\n\n#[cfg(all(test, feature = \"kinesis-localstack-tests\"))]\nmod kinesis_localstack_tests {\n    use std::time::Duration;\n\n    use aws_sdk_kinesis::types::StreamStatus;\n    use quickwit_common::rand::append_random_suffix;\n\n    use super::*;\n    use crate::source::kinesis::api::tests::{\n        create_stream, delete_stream, describe_stream, list_streams, wait_for_stream_status,\n    };\n    use crate::source::kinesis::helpers::tests::{\n        DEFAULT_RETRY_PARAMS, get_localstack_client, make_shard_id, put_records_into_shards, setup,\n        teardown, wait_for_active_stream,\n    };\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_create_stream() -> anyhow::Result<()> {\n        let stream_name = append_random_suffix(\"test-create-stream\");\n        let kinesis_client = get_localstack_client().await?;\n        create_stream(&kinesis_client, &stream_name, 1).await?;\n        wait_for_active_stream(&kinesis_client, &stream_name).await??;\n        let 
description_summary = describe_stream(&kinesis_client, &stream_name).await?;\n        assert_eq!(description_summary.stream_name, stream_name);\n        assert_eq!(description_summary.stream_status, StreamStatus::Active,);\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_delete_stream() -> anyhow::Result<()> {\n        let (kinesis_client, stream_name) = setup(\"test-delete-stream\", 1).await?;\n        delete_stream(&kinesis_client, &stream_name).await?;\n        let _ = wait_for_stream_status(\n            &kinesis_client,\n            &stream_name,\n            |stream_status| stream_status != StreamStatus::Deleting,\n            Duration::from_secs(1),\n        )\n        .await;\n        assert!(\n            !list_streams(&kinesis_client, None, None,)\n                .await?\n                .contains(&stream_name)\n        );\n        Ok(())\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_get_records() -> anyhow::Result<()> {\n        let (kinesis_client, stream_name) = setup(\"test-get-records\", 2).await?;\n        let _sequence_numbers = put_records_into_shards(\n            &kinesis_client,\n            &stream_name,\n            [(0, \"Record #00\"), (0, \"Record #01\"), (1, \"Record #10\")],\n        )\n        .await?;\n        let shard_id = make_shard_id(0);\n        let shard_iterator = get_shard_iterator(\n            &kinesis_client,\n            &DEFAULT_RETRY_PARAMS,\n            &stream_name,\n            &shard_id,\n            None,\n        )\n        .await?;\n\n        let get_records_output = get_records(\n            &kinesis_client,\n            &DEFAULT_RETRY_PARAMS,\n            shard_iterator.unwrap(),\n        )\n        .await?;\n\n        let records = get_records_output.records;\n        assert_eq!(records.len(), 2);\n        assert_eq!(std::str::from_utf8(records[0].data.as_ref())?, \"Record #00\");\n        
assert_eq!(std::str::from_utf8(records[1].data.as_ref())?, \"Record #01\");\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    // Ignoring this test because the localstack implementation of Kinesis is bogus.\n    #[ignore]\n    #[tokio::test]\n    async fn test_get_shard_iterator() -> anyhow::Result<()> {\n        let (kinesis_client, stream_name) = setup(\"test-get-shard-iterator\", 2).await?;\n        let sequence_numbers = put_records_into_shards(\n            &kinesis_client,\n            &stream_name,\n            [(0, \"Record #00\"), (1, \"Record #10\")],\n        )\n        .await?;\n        let shard_id = make_shard_id(0);\n        {\n            let shard_iterator = get_shard_iterator(\n                &kinesis_client,\n                &DEFAULT_RETRY_PARAMS,\n                &stream_name,\n                &shard_id,\n                None,\n            )\n            .await?;\n            assert!(shard_iterator.is_some());\n\n            let get_records_output = get_records(\n                &kinesis_client,\n                &DEFAULT_RETRY_PARAMS,\n                shard_iterator.unwrap(),\n            )\n            .await?;\n            assert_eq!(get_records_output.records.len(), 1);\n        }\n        {\n            let starting_sequence_number = sequence_numbers.get(&0).unwrap().first().cloned();\n            let shard_iterator = get_shard_iterator(\n                &kinesis_client,\n                &DEFAULT_RETRY_PARAMS,\n                &stream_name,\n                &shard_id,\n                starting_sequence_number,\n            )\n            .await?;\n            assert!(shard_iterator.is_some());\n\n            let get_records_output = get_records(\n                &kinesis_client,\n                &DEFAULT_RETRY_PARAMS,\n                shard_iterator.unwrap(),\n            )\n            .await?;\n            assert_eq!(get_records_output.records.len(), 0)\n        }\n        teardown(&kinesis_client, 
&stream_name).await;\n        Ok(())\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_list_shards() -> anyhow::Result<()> {\n        let (kinesis_client, stream_name) = setup(\"test-list-shards\", 2).await?;\n        let shards = list_shards(\n            &kinesis_client,\n            &DEFAULT_RETRY_PARAMS,\n            &stream_name,\n            Some(1),\n        )\n        .await?;\n        assert_eq!(shards.len(), 2);\n        assert_eq!(shards[0].shard_id, make_shard_id(0));\n        assert_eq!(shards[1].shard_id, make_shard_id(1));\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    // Ignoring this test because the localstack implementation of Kinesis is bogus.\n    #[ignore]\n    #[tokio::test]\n    async fn test_list_streams() -> anyhow::Result<()> {\n        let kinesis_client = get_localstack_client().await?;\n        let mut stream_names = Vec::new();\n\n        for stream_name_suffix in [\"foo\", \"bar\"] {\n            let (_kinesis_client, stream_name) =\n                setup(format!(\"test-list-streams-{stream_name_suffix}\"), 1).await?;\n            stream_names.push(stream_name);\n        }\n        {\n            let streams = list_streams(&kinesis_client, None, Some(1)).await?;\n            assert!(streams.contains(&stream_names[0]));\n            assert!(streams.contains(&stream_names[1]));\n        }\n        {\n            let streams = list_streams(\n                &kinesis_client,\n                Some(\"test-list-streams-foo\".to_string()),\n                Some(1),\n            )\n            .await?;\n            assert!(streams.contains(&stream_names[0]));\n            assert!(!streams.contains(&stream_names[1]));\n        }\n        for stream_name in stream_names {\n            teardown(&kinesis_client, &stream_name).await;\n        }\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/kinesis/helpers.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse aws_sdk_kinesis::config::{Region, SharedAsyncSleep};\nuse aws_sdk_kinesis::{Client, Config};\nuse quickwit_aws::{DEFAULT_AWS_REGION, aws_behavior_version, get_aws_config};\nuse quickwit_config::RegionOrEndpoint;\n\npub async fn get_kinesis_client(region_or_endpoint: RegionOrEndpoint) -> anyhow::Result<Client> {\n    let aws_config = get_aws_config().await;\n\n    let mut kinesis_config = Config::builder().behavior_version(aws_behavior_version());\n    kinesis_config.set_retry_config(aws_config.retry_config().cloned());\n    kinesis_config.set_credentials_provider(aws_config.credentials_provider());\n    kinesis_config.set_http_client(aws_config.http_client());\n    kinesis_config.set_timeout_config(aws_config.timeout_config().cloned());\n    if let Some(identity_cache) = aws_config.identity_cache() {\n        kinesis_config.set_identity_cache(identity_cache);\n    }\n    kinesis_config.set_sleep_impl(Some(SharedAsyncSleep::new(\n        quickwit_aws::TokioSleep::default(),\n    )));\n\n    match region_or_endpoint {\n        RegionOrEndpoint::Region(region) => {\n            kinesis_config = kinesis_config.region(Some(Region::new(region)));\n        }\n        RegionOrEndpoint::Endpoint(endpoint) => {\n            kinesis_config = kinesis_config.endpoint_url(endpoint);\n            kinesis_config = 
kinesis_config.region(Some(DEFAULT_AWS_REGION));\n        }\n    }\n\n    Ok(Client::from_conf(kinesis_config.build()))\n}\n\n#[cfg(all(test, feature = \"kinesis-localstack-tests\"))]\npub(crate) mod tests {\n    use std::collections::HashMap;\n    use std::time::Duration;\n\n    use anyhow::bail;\n    use aws_sdk_kinesis::Client as KinesisClient;\n    use aws_sdk_kinesis::primitives::Blob;\n    use aws_sdk_kinesis::types::{PutRecordsRequestEntry, StreamStatus};\n    use once_cell::sync::Lazy;\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_common::retry::RetryParams;\n    use quickwit_config::RegionOrEndpoint;\n    use tracing::error;\n\n    use crate::source::kinesis::api::list_shards;\n    use crate::source::kinesis::api::tests::{\n        create_stream, delete_stream, wait_for_stream_status,\n    };\n    use crate::source::kinesis::helpers::get_kinesis_client;\n\n    pub static DEFAULT_RETRY_PARAMS: Lazy<RetryParams> = Lazy::new(RetryParams::standard);\n\n    pub async fn get_localstack_client() -> anyhow::Result<KinesisClient> {\n        let endpoint = RegionOrEndpoint::Endpoint(\"http://localhost:4566\".to_string());\n        get_kinesis_client(endpoint).await\n    }\n\n    pub fn make_shard_id(id: usize) -> String {\n        format!(\"shardId-{id:0>12}\")\n    }\n\n    pub fn parse_shard_id<S: AsRef<str>>(shard_id: S) -> Option<usize> {\n        shard_id\n            .as_ref()\n            .strip_prefix(\"shardId-\")\n            .and_then(|shard_id| shard_id.parse::<usize>().ok())\n    }\n\n    pub async fn put_records_into_shards<I>(\n        kinesis_client: &aws_sdk_kinesis::Client,\n        stream_name: &str,\n        records: I,\n    ) -> anyhow::Result<HashMap<usize, Vec<String>>>\n    where\n        I: IntoIterator<Item = (usize, &'static str)>,\n    {\n        let shard_hash_keys: HashMap<usize, String> =\n            list_shards(kinesis_client, &DEFAULT_RETRY_PARAMS, stream_name, None)\n                .await?\n               
 .into_iter()\n                .flat_map(|shard| {\n                    let starting_hash_key = shard.hash_key_range?.starting_hash_key;\n                    parse_shard_id(shard.shard_id).map(|shard_id| (shard_id, starting_hash_key))\n                })\n                .collect();\n\n        let put_records_request_entries = records\n            .into_iter()\n            .map(|(shard_id, record)| {\n                PutRecordsRequestEntry::builder()\n                    .set_explicit_hash_key(shard_hash_keys.get(&shard_id).cloned())\n                    .partition_key(\"Overridden by hash key\".to_string())\n                    .data(Blob::new(record.as_bytes()))\n                    .build()\n            })\n            .collect::<Result<Vec<_>, _>>()?;\n\n        let response = kinesis_client\n            .put_records()\n            .stream_name(stream_name.to_string())\n            .set_records(Some(put_records_request_entries))\n            .send()\n            .await?;\n\n        let mut sequence_numbers = HashMap::new();\n        for record in response.records {\n            if let Some(sequence_number) = record.sequence_number {\n                sequence_numbers\n                    .entry(record.shard_id.and_then(parse_shard_id).unwrap())\n                    .or_insert_with(Vec::new)\n                    .push(sequence_number);\n            } else {\n                bail!(\"sequence number is missing from record\");\n            }\n        }\n        Ok(sequence_numbers)\n    }\n\n    pub async fn setup<S: AsRef<str>>(\n        test_name: S,\n        num_shards: usize,\n    ) -> anyhow::Result<(aws_sdk_kinesis::Client, String)> {\n        let stream_name = append_random_suffix(test_name.as_ref());\n        let kinesis_client = get_localstack_client().await?;\n        create_stream(&kinesis_client, &stream_name, num_shards).await?;\n        wait_for_active_stream(&kinesis_client, &stream_name).await??;\n        Ok((kinesis_client, stream_name))\n    }\n\n  
  pub async fn teardown(kinesis_client: &aws_sdk_kinesis::Client, stream_name: &str) {\n        if let Err(error) = delete_stream(kinesis_client, stream_name).await {\n            error!(stream_name = %stream_name, error = ?error, \"Failed to delete stream.\")\n        }\n    }\n\n    pub async fn wait_for_active_stream(\n        kinesis_client: &aws_sdk_kinesis::Client,\n        stream_name: &str,\n    ) -> Result<anyhow::Result<()>, tokio::time::error::Elapsed> {\n        wait_for_stream_status(\n            kinesis_client,\n            stream_name,\n            |stream_status| stream_status == StreamStatus::Active,\n            Duration::from_secs(30),\n        )\n        .await\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/kinesis/kinesis_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::time::Duration;\n\nuse anyhow::{Context, bail};\nuse async_trait::async_trait;\nuse aws_sdk_kinesis::Client as KinesisClient;\nuse bytes::Bytes;\nuse itertools::Itertools;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_aws::get_aws_config;\nuse quickwit_common::retry::RetryParams;\nuse quickwit_config::{KinesisSourceParams, RegionOrEndpoint};\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::Position;\nuse serde_json::{Value as JsonValue, json};\nuse tokio::sync::mpsc;\nuse tokio::time;\nuse tracing::{info, warn};\n\nuse super::api::list_shards;\nuse super::shard_consumer::{ShardConsumer, ShardConsumerHandle, ShardConsumerMessage};\nuse crate::actors::DocProcessor;\nuse crate::source::kinesis::helpers::get_kinesis_client;\nuse crate::source::{\n    BATCH_NUM_BYTES_LIMIT, BatchBuilder, EMIT_BATCHES_TIMEOUT, Source, SourceContext,\n    SourceRuntime, TypedSourceFactory,\n};\n\ntype ShardId = String;\n\n/// Factory for instantiating a `KafkaSource`.\npub struct KinesisSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for KinesisSourceFactory {\n    type Source = KinesisSource;\n    type Params = KinesisSourceParams;\n\n    async fn typed_create_source(\n        source_runtime: 
SourceRuntime,\n        source_params: KinesisSourceParams,\n    ) -> anyhow::Result<Self::Source> {\n        KinesisSource::try_new(source_runtime, source_params).await\n    }\n}\n\nstruct ShardConsumerState {\n    partition_id: PartitionId,\n    current_position: Position,\n    lag_millis: Option<i64>,\n    _shard_consumer_handle: ShardConsumerHandle,\n}\n\n#[derive(Default)]\npub struct KinesisSourceState {\n    /// Pool of [`ShardConsumer`] managed by the source.\n    shard_consumers: HashMap<ShardId, ShardConsumerState>,\n    /// Number of bytes processed by the source.\n    pub num_bytes_processed: u64,\n    /// Number of records processed by the source (including invalid messages).\n    pub num_records_processed: u64,\n    /// Number of invalid records, i.e., that were empty or could not be parsed.\n    pub num_invalid_records: u64,\n}\n\npub struct KinesisSource {\n    // Runtime arguments.\n    source_runtime: SourceRuntime,\n    // Target stream to consume.\n    stream_name: String,\n    kinesis_client: KinesisClient,\n    // Retry parameters (max attempts, max delay, ...).\n    retry_params: RetryParams,\n    // Sender for the communication channel between the source and the shard consumers.\n    shard_consumers_tx: mpsc::Sender<ShardConsumerMessage>,\n    // Receiver for the communication channel between the source and the shard consumers.\n    shard_consumers_rx: mpsc::Receiver<ShardConsumerMessage>,\n    state: KinesisSourceState,\n    backfill_mode_enabled: bool,\n}\n\nimpl fmt::Debug for KinesisSource {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"KinesisSource\")\n            .field(\"index_uid\", self.source_runtime.index_uid())\n            .field(\"source_id\", &self.source_runtime.source_id())\n            .field(\"stream_name\", &self.stream_name)\n            .finish()\n    }\n}\n\nimpl KinesisSource {\n    /// Instantiates a new `KinesisSource`.\n    pub async fn try_new(\n        source_runtime: 
SourceRuntime,\n        source_params: KinesisSourceParams,\n    ) -> anyhow::Result<Self> {\n        let stream_name = source_params.stream_name;\n        let backfill_mode_enabled = source_params.enable_backfill_mode;\n        let region = get_region(source_params.region_or_endpoint).await?;\n        let kinesis_client = get_kinesis_client(region).await?;\n        let (shard_consumers_tx, shard_consumers_rx) = mpsc::channel(1_000);\n        let state = KinesisSourceState::default();\n        let retry_params = RetryParams::aggressive();\n        let kinesis_source = KinesisSource {\n            source_runtime,\n            stream_name,\n            kinesis_client,\n            shard_consumers_tx,\n            shard_consumers_rx,\n            state,\n            backfill_mode_enabled,\n            retry_params,\n        };\n        Ok(kinesis_source)\n    }\n\n    fn spawn_shard_consumer(\n        &mut self,\n        ctx: &SourceContext,\n        shard_id: ShardId,\n        checkpoint: &SourceCheckpoint,\n    ) {\n        if self.state.shard_consumers.contains_key(&shard_id) {\n            info!(\n                stream_name = %self.stream_name,\n                shard_id = %shard_id,\n                \"Shard consumer already exists, skipping creation.\"\n            );\n            return;\n        }\n\n        let partition_id = PartitionId::from(shard_id.as_str());\n        let from_position = checkpoint\n            .position_for_partition(&partition_id)\n            .cloned()\n            .unwrap_or(Position::Beginning);\n        let from_sequence_number_exclusive = match &from_position {\n            Position::Beginning => None,\n            Position::Offset(offset) => Some(offset.to_string()),\n            Position::Eof(_) => panic!(\"position of a Kinesis shard should never be EOF\"),\n        };\n        info!(\n            stream_name = %self.stream_name,\n            shard_id = %shard_id,\n            start_position = ?from_position,\n            
\"Spawning new shard consumer\"\n        );\n        let shard_consumer = ShardConsumer::new(\n            self.stream_name.clone(),\n            shard_id.clone(),\n            from_sequence_number_exclusive,\n            self.backfill_mode_enabled,\n            self.kinesis_client.clone(),\n            self.shard_consumers_tx.clone(),\n            self.retry_params,\n        );\n        let _shard_consumer_handle = shard_consumer.spawn(ctx);\n        let shard_consumer_state = ShardConsumerState {\n            partition_id,\n            current_position: from_position,\n            lag_millis: None,\n            _shard_consumer_handle,\n        };\n        self.state\n            .shard_consumers\n            .insert(shard_id, shard_consumer_state);\n    }\n}\n\n#[async_trait]\nimpl Source for KinesisSource {\n    async fn initialize(\n        &mut self,\n        _doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        let shards = ctx\n            .protect_future(list_shards(\n                &self.kinesis_client,\n                &self.retry_params,\n                &self.stream_name,\n                None,\n            ))\n            .await?;\n        let checkpoint = self\n            .source_runtime\n            .fetch_checkpoint()\n            .await\n            .context(\"failed to fetch checkpoint\")?;\n\n        for shard in shards {\n            self.spawn_shard_consumer(ctx, shard.shard_id, &checkpoint);\n        }\n        info!(\n            stream_name = %self.stream_name,\n            assigned_shards = %self.state.shard_consumers.keys().sorted().join(\", \"),\n            \"Starting Kinesis source.\"\n        );\n        Ok(())\n    }\n\n    async fn emit_batches(\n        &mut self,\n        indexer_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let mut batch_builder = 
BatchBuilder::new(SourceType::Kinesis);\n        let deadline = time::sleep(*EMIT_BATCHES_TIMEOUT);\n        tokio::pin!(deadline);\n\n        loop {\n            tokio::select! {\n                message_opt = self.shard_consumers_rx.recv() => {\n                    // The source always carries a sender for this channel.\n                    match message_opt.expect(\"Channel unexpectedly closed.\") {\n                        ShardConsumerMessage::ChildShards(shard_ids) => {\n                            let checkpoint = self.source_runtime.fetch_checkpoint().await.context(\"failed to fetch checkpoint\")?;\n\n                            for shard_id in shard_ids {\n                                self.spawn_shard_consumer(ctx, shard_id, &checkpoint);\n                            }\n                        }\n                        ShardConsumerMessage::Records { shard_id, records, lag_millis } => {\n                            let num_records = records.len();\n\n                            for (i, record) in records.into_iter().enumerate() {\n                                let record_data = record.data.into_inner();\n\n                                if record_data.is_empty() {\n                                    warn!(\n                                        stream_name=%self.stream_name,\n                                        shard_id=%shard_id,\n                                        sequence_number=%record.sequence_number,\n                                        \"record is empty\"\n                                    );\n                                    self.state.num_invalid_records += 1;\n                                    continue;\n                                }\n                                batch_builder.add_doc(Bytes::from(record_data));\n\n                                if i == num_records - 1 {\n                                    let shard_consumer_state = self\n                                        .state\n                        
                .shard_consumers\n                                        .get_mut(&shard_id)\n                                        .ok_or_else(|| {\n                                            anyhow::anyhow!(\n                                                \"received record from unassigned shard `{}`\", shard_id,\n                                            )\n                                        })?;\n                                    shard_consumer_state.lag_millis = lag_millis;\n\n                                    let partition_id = shard_consumer_state.partition_id.clone();\n                                    let current_position = Position::from(record.sequence_number);\n                                    let previous_position = std::mem::replace(&mut shard_consumer_state.current_position, current_position.clone());\n\n                                    batch_builder.checkpoint_delta.record_partition_delta(\n                                        partition_id,\n                                        previous_position,\n                                        current_position,\n                                    ).context(\"failed to record partition delta\")?;\n                                }\n                            }\n                            if batch_builder.num_bytes >= BATCH_NUM_BYTES_LIMIT {\n                                break;\n                            }\n                        }\n                        ShardConsumerMessage::ShardClosed(shard_id) => {\n                            info!(\n                                stream_name = %self.stream_name,\n                                shard_id = %shard_id,\n                                num_active_shards = %self.state.shard_consumers.len(),\n                                \"Shard is closed.\"\n                            );\n                            self.state.shard_consumers.remove(&shard_id);\n\n                        }\n                        
ShardConsumerMessage::ShardEOF(shard_id) => {\n                            info!(\n                                stream_name = %self.stream_name,\n                                shard_id = %shard_id,\n                                num_active_shards = %self.state.shard_consumers.len(),\n                                \"Reached end of shard.\"\n                            );\n                            self.state.shard_consumers.remove(&shard_id);\n                        }\n                    }\n                    ctx.record_progress();\n                }\n                _ = &mut deadline => {\n                    break;\n                }\n            }\n        }\n        self.state.num_bytes_processed += batch_builder.num_bytes;\n        self.state.num_records_processed += batch_builder.docs.len() as u64;\n\n        if !batch_builder.checkpoint_delta.is_empty() {\n            ctx.send_message(indexer_mailbox, batch_builder.build())\n                .await?;\n        }\n        if self.state.shard_consumers.is_empty() {\n            info!(stream_name = %self.stream_name, \"reached end of stream\");\n            ctx.send_exit_with_success(indexer_mailbox).await?;\n            return Err(ActorExitStatus::Success);\n        }\n        Ok(Duration::default())\n    }\n\n    fn name(&self) -> String {\n        format!(\"{self:?}\")\n    }\n\n    fn observable_state(&self) -> JsonValue {\n        let shard_consumer_positions: Vec<(&ShardId, &Position)> = self\n            .state\n            .shard_consumers\n            .iter()\n            .map(|(shard_id, shard_consumer_state)| {\n                (shard_id, &shard_consumer_state.current_position)\n            })\n            .sorted()\n            .collect();\n        json!({\n            \"stream_name\": self.stream_name,\n            \"shard_consumer_positions\": shard_consumer_positions,\n            \"num_bytes_processed\": self.state.num_bytes_processed,\n            \"num_records_processed\": 
self.state.num_records_processed,\n            \"num_invalid_records\": self.state.num_invalid_records,\n        })\n    }\n}\n\npub(super) async fn get_region(\n    region_or_endpoint_opt: Option<RegionOrEndpoint>,\n) -> anyhow::Result<RegionOrEndpoint> {\n    if let Some(region_or_endpoint) = region_or_endpoint_opt {\n        return Ok(region_or_endpoint);\n    }\n    // Fall back to the AWS SDK config if `region_or_endpoint` is `None`.\n    let sdk_config = get_aws_config().await;\n\n    if let Some(region) = sdk_config.region() {\n        return Ok(RegionOrEndpoint::Region(region.to_string()));\n    }\n    if let Some(endpoint) = sdk_config.endpoint_url() {\n        return Ok(RegionOrEndpoint::Endpoint(endpoint.to_string()));\n    }\n    bail!(\"unable to sniff region from environment\")\n}\n\n#[cfg(all(test, feature = \"kinesis-localstack-tests\"))]\nmod tests {\n\n    use quickwit_actors::Universe;\n    use quickwit_config::{SourceConfig, SourceParams};\n    use quickwit_metastore::checkpoint::SourceCheckpointDelta;\n    use quickwit_proto::types::IndexUid;\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::SourceActor;\n    use crate::source::kinesis::helpers::tests::{\n        make_shard_id, put_records_into_shards, setup, teardown,\n    };\n    use crate::source::tests::SourceRuntimeBuilder;\n\n    // Sequence number\n    type SeqNo = String;\n\n    fn merge_doc_batches(batches: Vec<RawDocBatch>) -> anyhow::Result<RawDocBatch> {\n        let mut merged_batch = RawDocBatch::default();\n        for batch in batches {\n            merged_batch.docs.extend(batch.docs);\n            merged_batch\n                .checkpoint_delta\n                .extend(batch.checkpoint_delta)?;\n        }\n        merged_batch.docs.sort();\n        Ok(merged_batch)\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_kinesis_source_handles_resharding_with_split() {\n        use crate::source::kinesis::api::tests::split_shard;\n        use 
crate::source::kinesis::helpers::tests::wait_for_active_stream;\n\n        let universe = Universe::with_accelerated_time();\n        let (doc_processor_mailbox, _doc_processor_inbox) = universe.create_test_mailbox();\n        let (kinesis_client, stream_name) = setup(\"test-resharding-split\", 1).await.unwrap();\n        let index_id = \"test-kinesis-resharding-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n\n        // Split the shard (1 -> 2 shards)\n        let shard_id_0 = make_shard_id(0);\n        split_shard(\n            &kinesis_client,\n            &stream_name,\n            &shard_id_0,\n            \"85070591730234615865843651857942052864\",\n        )\n        .await\n        .unwrap();\n\n        // Wait for stream to be active after split\n        let _ = wait_for_active_stream(&kinesis_client, &stream_name)\n            .await\n            .unwrap();\n\n        // Initialize source after split\n        let kinesis_params = KinesisSourceParams {\n            stream_name: stream_name.clone(),\n            region_or_endpoint: Some(RegionOrEndpoint::Endpoint(\n                \"http://localhost:4566\".to_string(),\n            )),\n            enable_backfill_mode: true,\n        };\n        let source_params = SourceParams::Kinesis(kinesis_params.clone());\n        let source_config = SourceConfig::for_test(\"test-kinesis-resharding\", source_params);\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n\n        let kinesis_source = KinesisSource::try_new(source_runtime, kinesis_params)\n            .await\n            .unwrap();\n\n        let actor = SourceActor {\n            source: Box::new(kinesis_source),\n            doc_processor_mailbox: doc_processor_mailbox.clone(),\n        };\n        let (_mailbox, handle) = universe.spawn_builder().spawn(actor);\n        let (exit_status, _exit_state) = handle.join().await;\n        assert!(exit_status.is_success());\n\n        
teardown(&kinesis_client, &stream_name).await;\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_kinesis_source() {\n        let universe = Universe::with_accelerated_time();\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let (kinesis_client, stream_name) = setup(\"test-kinesis-source\", 3).await.unwrap();\n        let index_id = \"test-kinesis-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let kinesis_params = KinesisSourceParams {\n            stream_name: stream_name.clone(),\n            region_or_endpoint: Some(RegionOrEndpoint::Endpoint(\n                \"http://localhost:4566\".to_string(),\n            )),\n            enable_backfill_mode: true,\n        };\n        let source_params = SourceParams::Kinesis(kinesis_params.clone());\n        let source_config = SourceConfig::for_test(\"test-kinesis-source\", source_params);\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        {\n            let kinesis_source =\n                KinesisSource::try_new(source_runtime.clone(), kinesis_params.clone())\n                    .await\n                    .unwrap();\n            let actor = SourceActor {\n                source: Box::new(kinesis_source),\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_mailbox, handle) = universe.spawn_builder().spawn(actor);\n            let (exit_status, exit_state) = handle.join().await;\n            assert!(exit_status.is_success());\n\n            let next_message = doc_processor_inbox\n                .drain_for_test()\n                .into_iter()\n                .flat_map(|box_any| box_any.downcast::<RawDocBatch>().ok())\n                .map(|box_raw_doc_batch| *box_raw_doc_batch)\n                .next();\n            assert!(next_message.is_none());\n\n            let expected_shard_consumer_positions: Vec<(ShardId, 
SeqNo)> = Vec::new();\n            let expected_state = json!({\n                \"stream_name\":  stream_name,\n                \"shard_consumer_positions\": expected_shard_consumer_positions,\n                \"num_bytes_processed\": 0,\n                \"num_records_processed\": 0,\n                \"num_invalid_records\": 0,\n            });\n            assert_eq!(exit_state, expected_state);\n        }\n        let sequence_numbers = put_records_into_shards(\n            &kinesis_client,\n            &stream_name,\n            [\n                (0, \"Record #00\"),\n                (0, \"Record #01\"),\n                (1, \"Record #10\"),\n                (1, \"Record #11\"),\n                (2, \"Record #20\"),\n                (2, \"Record #21\"),\n            ],\n        )\n        .await\n        .unwrap();\n        let shard_sequence_numbers: HashMap<usize, SeqNo> = sequence_numbers\n            .iter()\n            .map(|(shard_id, records)| (*shard_id, records.last().unwrap().clone()))\n            .collect();\n        let shard_positions: HashMap<usize, Position> = shard_sequence_numbers\n            .iter()\n            .map(|(shard_id, seqno)| (*shard_id, Position::from(seqno.clone())))\n            .collect();\n        {\n            let kinesis_source =\n                KinesisSource::try_new(source_runtime.clone(), kinesis_params.clone())\n                    .await\n                    .unwrap();\n            let actor = SourceActor {\n                source: Box::new(kinesis_source),\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_mailbox, handle) = universe.spawn_builder().spawn(actor);\n            let (exit_status, exit_state) = handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages: Vec<RawDocBatch> = doc_processor_inbox\n                .drain_for_test()\n                .into_iter()\n                .flat_map(|box_any| 
box_any.downcast::<RawDocBatch>().ok())\n                .map(|box_raw_doc_batch| *box_raw_doc_batch)\n                .collect();\n            assert!(!messages.is_empty());\n\n            let batch = merge_doc_batches(messages).unwrap();\n            let expected_docs = vec![\n                \"Record #00\",\n                \"Record #01\",\n                \"Record #10\",\n                \"Record #11\",\n                \"Record #20\",\n                \"Record #21\",\n            ];\n            assert_eq!(batch.docs, expected_docs);\n\n            let mut expected_checkpoint_delta = SourceCheckpointDelta::default();\n            for shard_id in 0..3 {\n                expected_checkpoint_delta\n                    .record_partition_delta(\n                        PartitionId::from(make_shard_id(shard_id)),\n                        Position::Beginning,\n                        shard_positions.get(&shard_id).unwrap().clone(),\n                    )\n                    .unwrap();\n            }\n            assert_eq!(batch.checkpoint_delta, expected_checkpoint_delta);\n\n            let expected_shard_consumer_positions: Vec<(ShardId, SeqNo)> = Vec::new();\n            let expected_state = json!({\n                \"stream_name\":  stream_name,\n                \"shard_consumer_positions\": expected_shard_consumer_positions,\n                \"num_bytes_processed\": 60,\n                \"num_records_processed\": 6,\n                \"num_invalid_records\": 0,\n            });\n            assert_eq!(exit_state, expected_state);\n        }\n        {\n            let from_sequence_number_exclusive_shard_1 =\n                sequence_numbers.get(&1).unwrap().first().unwrap().clone();\n            let from_sequence_number_exclusive_shard_2 =\n                sequence_numbers.get(&2).unwrap().last().unwrap().clone();\n            let _checkpoint: SourceCheckpoint = vec![\n                (\n                    make_shard_id(1),\n                    
from_sequence_number_exclusive_shard_1.clone(),\n                ),\n                (\n                    make_shard_id(2),\n                    from_sequence_number_exclusive_shard_2.clone(),\n                ),\n            ]\n            .into_iter()\n            .map(|(partition_id, offset)| (PartitionId::from(partition_id), Position::from(offset)))\n            .collect();\n            let kinesis_source = KinesisSource::try_new(source_runtime, kinesis_params)\n                .await\n                .unwrap();\n            let actor = SourceActor {\n                source: Box::new(kinesis_source),\n                doc_processor_mailbox: doc_processor_mailbox.clone(),\n            };\n            let (_mailbox, handle) = universe.spawn_builder().spawn(actor);\n            let (exit_status, exit_state) = handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages: Vec<RawDocBatch> = doc_processor_inbox\n                .drain_for_test()\n                .into_iter()\n                .flat_map(|box_any| box_any.downcast::<RawDocBatch>().ok())\n                .map(|box_raw_doc_batch| *box_raw_doc_batch)\n                .collect();\n            assert!(!messages.is_empty());\n\n            let batch = merge_doc_batches(messages).unwrap();\n            let expected_docs = vec![\"Record #00\", \"Record #01\", \"Record #11\"];\n            assert_eq!(batch.docs, expected_docs);\n\n            let mut expected_checkpoint_delta = SourceCheckpointDelta::default();\n            for (shard_id, from_position) in [\n                Position::Beginning,\n                Position::from(from_sequence_number_exclusive_shard_1),\n            ]\n            .into_iter()\n            .enumerate()\n            {\n                expected_checkpoint_delta\n                    .record_partition_delta(\n                        PartitionId::from(make_shard_id(shard_id)),\n                        from_position,\n                        
shard_positions.get(&shard_id).unwrap().clone(),\n                    )\n                    .unwrap();\n            }\n            assert_eq!(batch.checkpoint_delta, expected_checkpoint_delta);\n\n            let expected_shard_consumer_positions: Vec<(ShardId, SeqNo)> = Vec::new();\n            let expected_state = json!({\n                \"stream_name\":  stream_name,\n                \"shard_consumer_positions\": expected_shard_consumer_positions,\n                \"num_bytes_processed\": 30,\n                \"num_records_processed\": 3,\n                \"num_invalid_records\": 0,\n            });\n            assert_eq!(exit_state, expected_state);\n        }\n        teardown(&kinesis_client, &stream_name).await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/kinesis/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod api;\nmod helpers;\npub mod kinesis_source;\nmod shard_consumer;\n\nuse quickwit_common::retry::RetryParams;\nuse quickwit_config::KinesisSourceParams;\n\nuse crate::source::kinesis::api::{get_records, get_shard_iterator, list_shards};\nuse crate::source::kinesis::helpers::get_kinesis_client;\nuse crate::source::kinesis::kinesis_source::get_region;\n\n/// Checks whether we can establish a connection to the Kinesis service and read some records.\npub(super) async fn check_connectivity(params: KinesisSourceParams) -> anyhow::Result<()> {\n    let region = get_region(params.region_or_endpoint).await?;\n    let kinesis_client = get_kinesis_client(region).await?;\n    let retry_params = RetryParams::standard();\n    let shards = list_shards(&kinesis_client, &retry_params, &params.stream_name, Some(1)).await?;\n\n    if let Some(shard_id) = shards.first().map(|s| s.shard_id()) {\n        let shard_iterator_opt = get_shard_iterator(\n            &kinesis_client,\n            &retry_params,\n            &params.stream_name,\n            shard_id,\n            None,\n        )\n        .await?;\n\n        if let Some(shard_iterator) = shard_iterator_opt {\n            get_records(&kinesis_client, &retry_params, shard_iterator).await?;\n        }\n    }\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/kinesis/shard_consumer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse aws_sdk_kinesis::types::Record;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, ActorHandle, Handler, Mailbox};\nuse quickwit_common::retry::RetryParams;\nuse serde_json::{Value as JsonValue, json};\nuse tokio::sync::mpsc;\n\nuse crate::source::SourceContext;\nuse crate::source::kinesis::api::{get_records, get_shard_iterator};\n\n#[derive(Debug)]\npub(super) enum ShardConsumerMessage {\n    /// The shard was the subject of a merge or a split and points to one (merge) or two (split)\n    /// children.\n    ChildShards(Vec<String>),\n    Records {\n        shard_id: String,\n        records: Vec<Record>,\n        lag_millis: Option<i64>,\n    },\n    /// The shard is closed after a merge or a split. 
There are no new records available.\n    ShardClosed(String),\n    /// The consumer has reached the latest record in the shard and stops if\n    /// `shutdown_at_shard_eof` is set to true.\n    ShardEOF(String),\n}\n\n#[derive(Default)]\npub(super) struct ShardConsumerState {\n    /// The sequence number of the last record processed.\n    current_sequence_number: Option<String>,\n    /// The number of milliseconds the last `GetRecords` response is from the tip of the stream.\n    lag_millis: Option<i64>,\n    /// Number of bytes processed by the consumer.\n    num_bytes_processed: u64,\n    /// Number of records processed by the consumer.\n    num_records_processed: u64,\n    /// The shard iterator value that will be used for the next call to `GetRecords`.\n    next_shard_iterator: Option<String>,\n}\n\npub(super) struct ShardConsumer {\n    stream_name: String,\n    shard_id: String,\n    /// Sequence number of the last record processed. Consumption of the shard is resumed right\n    /// after this sequence number.\n    from_sequence_number_exclusive: Option<String>,\n    /// When this value is set to true, the consumer shuts down after reaching the last (most\n    /// recent) record in the shard.\n    shutdown_at_shard_eof: bool,\n    state: ShardConsumerState,\n    kinesis_client: aws_sdk_kinesis::Client,\n    sink: mpsc::Sender<ShardConsumerMessage>,\n    retry_params: RetryParams,\n}\n\nimpl fmt::Debug for ShardConsumer {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            f,\n            \"KinesisShardConsumer {{ stream_name: {}, shard_id: {} }}\",\n            self.stream_name, self.shard_id\n        )\n    }\n}\n\nimpl ShardConsumer {\n    pub fn new(\n        stream_name: String,\n        shard_id: String,\n        from_sequence_number_exclusive: Option<String>,\n        shutdown_at_shard_eof: bool,\n        kinesis_client: aws_sdk_kinesis::Client,\n        sink: mpsc::Sender<ShardConsumerMessage>,\n        
retry_params: RetryParams,\n    ) -> Self {\n        Self {\n            stream_name,\n            shard_id,\n            from_sequence_number_exclusive,\n            state: Default::default(),\n            shutdown_at_shard_eof,\n            kinesis_client,\n            sink,\n            retry_params,\n        }\n    }\n\n    pub fn spawn(self, ctx: &SourceContext) -> ShardConsumerHandle {\n        let (_mailbox, _actor_handle) = ctx.spawn_actor().spawn(self);\n        ShardConsumerHandle {\n            _mailbox,\n            _actor_handle,\n        }\n    }\n\n    async fn send_message(\n        &self,\n        ctx: &ActorContext<Self>,\n        message: ShardConsumerMessage,\n    ) -> anyhow::Result<()> {\n        let _guard = ctx.protect_zone();\n        self.sink.send(message).await?;\n        Ok(())\n    }\n}\n\npub(super) struct ShardConsumerHandle {\n    _mailbox: Mailbox<ShardConsumer>,\n    _actor_handle: ActorHandle<ShardConsumer>,\n}\n\n#[derive(Debug)]\npub(super) struct Loop;\n\n#[async_trait]\nimpl Actor for ShardConsumer {\n    type ObservableState = JsonValue;\n\n    fn name(&self) -> String {\n        \"KinesisShardConsumer\".to_string()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.state.next_shard_iterator = ctx\n            .protect_future(get_shard_iterator(\n                &self.kinesis_client,\n                &self.retry_params,\n                &self.stream_name,\n                &self.shard_id,\n                self.from_sequence_number_exclusive.clone(),\n            ))\n            .await?;\n        ctx.send_self_message(Loop).await?;\n        Ok(())\n    }\n\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        json!({\n            \"stream_name\": self.stream_name,\n            \"shard_id\": self.shard_id,\n            \"current_sequence_number\": 
self.state.current_sequence_number,\n            \"lag_millis\": self.state.lag_millis,\n            \"num_bytes_processed\": self.state.num_bytes_processed,\n            \"num_records_processed\": self.state.num_records_processed,\n        })\n    }\n}\n\n#[async_trait]\nimpl Handler<Loop> for ShardConsumer {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: Loop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if let Some(shard_iterator) = self.state.next_shard_iterator.take() {\n            let response = ctx\n                .protect_future(get_records(\n                    &self.kinesis_client,\n                    &self.retry_params,\n                    shard_iterator,\n                ))\n                .await?;\n            self.state.lag_millis = response.millis_behind_latest;\n            self.state.next_shard_iterator = response.next_shard_iterator;\n\n            if !response.records.is_empty() {\n                self.state.current_sequence_number = response\n                    .records\n                    .last()\n                    .map(|record| record.sequence_number.clone());\n                self.state.num_bytes_processed += response\n                    .records\n                    .iter()\n                    .map(|record| record.data().as_ref().len() as u64)\n                    .sum::<u64>();\n                self.state.num_records_processed += response.records.len() as u64;\n\n                let message = ShardConsumerMessage::Records {\n                    shard_id: self.shard_id.clone(),\n                    records: response.records,\n                    lag_millis: response.millis_behind_latest,\n                };\n                self.send_message(ctx, message).await?;\n            }\n            if let Some(children) = response.child_shards {\n                let shard_ids: Vec<String> = children\n                    .into_iter()\n                    // Filter 
out duplicate message when two shards are merged.\n                    .filter(|child| child.parent_shards().first() == Some(&self.shard_id))\n                    .map(|child| child.shard_id)\n                    .collect();\n                if !shard_ids.is_empty() {\n                    let message = ShardConsumerMessage::ChildShards(shard_ids);\n                    self.send_message(ctx, message).await?;\n                }\n            }\n            if self.shutdown_at_shard_eof && response.millis_behind_latest == Some(0) {\n                let message = ShardConsumerMessage::ShardEOF(self.shard_id.clone());\n                self.send_message(ctx, message).await?;\n                return Err(ActorExitStatus::Success);\n            };\n            // The `GetRecords` API has a limit of 5 transactions per second. 1s / 5 + ε = 210ms.\n            let interval = Duration::from_millis(210);\n            ctx.schedule_self_msg(interval, Loop);\n            return Ok(());\n        }\n        let message = ShardConsumerMessage::ShardClosed(self.shard_id.clone());\n        self.send_message(ctx, message).await?;\n        Err(ActorExitStatus::Success)\n    }\n}\n\n#[cfg(all(test, feature = \"kinesis-localstack-tests\"))]\nmod tests {\n    use quickwit_actors::Universe;\n    use serde_json::Value as JsonValue;\n\n    use super::*;\n    use crate::source::kinesis::api::tests::{merge_shards, split_shard};\n    use crate::source::kinesis::helpers::tests::{\n        DEFAULT_RETRY_PARAMS, make_shard_id, put_records_into_shards, setup, teardown,\n    };\n\n    async fn drain_messages(\n        sink_rx: &mut mpsc::Receiver<ShardConsumerMessage>,\n    ) -> Vec<ShardConsumerMessage> {\n        let mut messages = Vec::new();\n        while let Ok(message) = sink_rx.try_recv() {\n            messages.push(message);\n        }\n        messages\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_shard_eof() -> anyhow::Result<()> {\n        let universe = 
Universe::with_accelerated_time();\n        let (sink_tx, mut sink_rx) = mpsc::channel(100);\n        let (kinesis_client, stream_name) = setup(\"test-shard-eof\", 1).await?;\n        let shard_id_0 = make_shard_id(0);\n        let shard_consumer = ShardConsumer::new(\n            stream_name.clone(),\n            shard_id_0.clone(),\n            None,\n            true,\n            kinesis_client.clone(),\n            sink_tx,\n            *DEFAULT_RETRY_PARAMS,\n        );\n        let (_mailbox, handle) = universe.spawn_builder().spawn(shard_consumer);\n        let (exit_status, exit_state) = handle.join().await;\n        assert!(exit_status.is_success());\n\n        let messages = drain_messages(&mut sink_rx).await;\n        assert_eq!(messages.len(), 1);\n\n        assert!(matches!(\n            &messages[0],\n            ShardConsumerMessage::ShardEOF(shard_id) if *shard_id == shard_id_0\n        ));\n        let expected_state = json!({\n            \"stream_name\": stream_name,\n            \"shard_id\": shard_id_0,\n            \"current_sequence_number\": JsonValue::Null,\n            \"lag_millis\": 0,\n            \"num_bytes_processed\": 0,\n            \"num_records_processed\": 0,\n        });\n        assert_eq!(exit_state, expected_state);\n\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_start_at_horizon() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (sink_tx, mut sink_rx) = mpsc::channel(100);\n        let (kinesis_client, stream_name) = setup(\"test-start-at-horizon\", 1).await?;\n        let sequence_numbers = put_records_into_shards(\n            &kinesis_client,\n            &stream_name,\n            [(0, \"Record #00\"), (0, \"Record #01\")],\n        )\n        .await?;\n        let shard_id_0 = make_shard_id(0);\n        let shard_consumer = ShardConsumer::new(\n            stream_name.clone(),\n       
     shard_id_0.clone(),\n            None,\n            true,\n            kinesis_client.clone(),\n            sink_tx,\n            *DEFAULT_RETRY_PARAMS,\n        );\n        let (_mailbox, handle) = universe.spawn_builder().spawn(shard_consumer);\n        let (exit_status, exit_state) = handle.join().await;\n        assert!(exit_status.is_success());\n\n        let messages = drain_messages(&mut sink_rx).await;\n        assert_eq!(messages.len(), 2);\n\n        assert!(matches!(\n            &messages[0],\n            ShardConsumerMessage::Records { shard_id, records, lag_millis: _ } if *shard_id == shard_id_0 && records.len() == 2\n        ));\n        assert!(matches!(\n            &messages[1],\n            ShardConsumerMessage::ShardEOF(shard_id) if *shard_id == shard_id_0\n        ));\n        let current_sequence_number = sequence_numbers\n            .get(&0)\n            .and_then(|per_shard_sequence_numbers| per_shard_sequence_numbers.last())\n            .cloned();\n        let expected_state = json!({\n            \"stream_name\": stream_name,\n            \"shard_id\": shard_id_0,\n            \"current_sequence_number\": current_sequence_number,\n            \"lag_millis\": 0,\n            \"num_bytes_processed\": 20,\n            \"num_records_processed\": 2,\n        });\n        assert_eq!(exit_state, expected_state);\n\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    // Ignoring this test because the localstack implementation of Kinesis is bogus.\n    #[ignore]\n    #[tokio::test]\n    async fn test_start_after_sequence_number() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (sink_tx, mut sink_rx) = mpsc::channel(100);\n        let (kinesis_client, stream_name) = setup(\"test-start-after-sequence-number\", 1).await?;\n        let sequence_numbers = put_records_into_shards(\n            &kinesis_client,\n            &stream_name,\n            [(0, \"Record 
#00\"), (0, \"Record #01\")],\n        )\n        .await?;\n        let shard_id_0 = make_shard_id(0);\n        let from_sequence_number_exclusive = sequence_numbers\n            .get(&0)\n            .and_then(|sequence_numbers| sequence_numbers.first())\n            .cloned();\n        let shard_consumer = ShardConsumer::new(\n            stream_name.clone(),\n            shard_id_0.clone(),\n            from_sequence_number_exclusive,\n            true,\n            kinesis_client.clone(),\n            sink_tx,\n            *DEFAULT_RETRY_PARAMS,\n        );\n        let (_mailbox, handle) = universe.spawn_builder().spawn(shard_consumer);\n        let (exit_status, exit_state) = handle.join().await;\n        assert!(exit_status.is_success());\n\n        let messages = drain_messages(&mut sink_rx).await;\n        assert_eq!(messages.len(), 2);\n\n        assert!(matches!(\n            &messages[0],\n            ShardConsumerMessage::Records { shard_id, records, lag_millis: _ } if *shard_id == shard_id_0 && records.len() == 1\n        ));\n        assert!(matches!(\n            &messages[1],\n            ShardConsumerMessage::ShardEOF(shard_id) if *shard_id == shard_id_0\n        ));\n        let current_sequence_number = sequence_numbers\n            .get(&0)\n            .and_then(|per_shard_sequence_numbers| per_shard_sequence_numbers.last())\n            .cloned();\n        let expected_state = json!({\n            \"stream_name\": stream_name,\n            \"shard_id\": shard_id_0,\n            \"current_sequence_number\": current_sequence_number,\n            \"lag_millis\": 0,\n            \"num_bytes_processed\": 10,\n            \"num_records_processed\": 1,\n        });\n        assert_eq!(exit_state, expected_state);\n\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    // Ignoring this test because the localstack implementation of Kinesis is bogus.\n    #[ignore]\n    #[tokio::test]\n    async fn test_merge_shards() -> 
anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (sink_tx, mut sink_rx) = mpsc::channel(100);\n        let (kinesis_client, stream_name) = setup(\"test-merge-shards\", 2).await?;\n        let shard_id_0 = make_shard_id(0);\n        let shard_id_1 = make_shard_id(1);\n        merge_shards(&kinesis_client, &stream_name, &shard_id_0, &shard_id_1).await?;\n        {\n            let shard_consumer_0 = ShardConsumer::new(\n                stream_name.clone(),\n                shard_id_0.clone(),\n                None,\n                false,\n                kinesis_client.clone(),\n                sink_tx.clone(),\n                *DEFAULT_RETRY_PARAMS,\n            );\n            let (_mailbox, handle) = universe.spawn_builder().spawn(shard_consumer_0);\n            let (exit_status, _exit_state) = handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages = drain_messages(&mut sink_rx).await;\n            assert_eq!(messages.len(), 2);\n\n            assert!(matches!(\n                &messages[0],\n                ShardConsumerMessage::ChildShards(shard_ids) if *shard_ids == vec![make_shard_id(2)]\n            ));\n            assert!(matches!(\n                &messages[1],\n                ShardConsumerMessage::ShardClosed(shard_id) if *shard_id == shard_id_0\n            ));\n        }\n        {\n            let shard_consumer_1 = ShardConsumer::new(\n                stream_name.clone(),\n                shard_id_1.clone(),\n                None,\n                false,\n                kinesis_client.clone(),\n                sink_tx,\n                *DEFAULT_RETRY_PARAMS,\n            );\n            let (_mailbox, handle) = universe.spawn_builder().spawn(shard_consumer_1);\n            let (exit_status, _exit_state) = handle.join().await;\n            assert!(exit_status.is_success());\n\n            let messages = drain_messages(&mut sink_rx).await;\n            
assert_eq!(messages.len(), 1);\n\n            assert!(matches!(\n                &messages[0],\n                ShardConsumerMessage::ShardClosed(shard_id) if *shard_id == shard_id_1\n            ));\n        }\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n\n    // Ignoring this test because the localstack implementation of Kinesis is bogus.\n    #[ignore]\n    #[tokio::test]\n    async fn test_split_shard() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (sink_tx, mut sink_rx) = mpsc::channel(100);\n        let (kinesis_client, stream_name) = setup(\"test-split-shard\", 1).await?;\n        let shard_id_0 = make_shard_id(0);\n        split_shard(&kinesis_client, &stream_name, &shard_id_0, \"42\").await?;\n\n        let shard_consumer = ShardConsumer::new(\n            stream_name.clone(),\n            shard_id_0.clone(),\n            None,\n            false,\n            kinesis_client.clone(),\n            sink_tx,\n            *DEFAULT_RETRY_PARAMS,\n        );\n        let (_mailbox, handle) = universe.spawn_builder().spawn(shard_consumer);\n        let (exit_status, _exit_state) = handle.join().await;\n        assert!(exit_status.is_success());\n\n        let messages = drain_messages(&mut sink_rx).await;\n        assert_eq!(messages.len(), 2);\n\n        assert!(matches!(\n            &messages[0],\n            ShardConsumerMessage::ChildShards(shard_ids) if *shard_ids == vec![make_shard_id(1), make_shard_id(2)]\n        ));\n        assert!(matches!(\n            &messages[1],\n            ShardConsumerMessage::ShardClosed(shard_id) if *shard_id == shard_id_0\n        ));\n        teardown(&kinesis_client, &stream_name).await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! # Sources\n//!\n//! Quickwit gets its data from so-called `Sources`.\n//!\n//! The role of a source is to push messages to an indexer mailbox.\n//! Implementers need to focus on the implementation of the [`Source`] trait\n//! and in particular its `emit_batches` method.\n//! In addition, they need to implement a source factory.\n//!\n//! The source trait will be executed in an actor.\n//!\n//! # Checkpoints and exactly-once semantics\n//!\n//! Quickwit is designed to offer exactly-once semantics whenever possible using the following\n//! strategy, using checkpoints.\n//!\n//! Messages are split into partitions, and within a partition messages are totally ordered: they\n//! are marked by a unique position within this partition.\n//!\n//! Sources are required to emit messages in a way that respects this partial order.\n//! If two messages belong to 2 different partitions, they can be emitted in any order.\n//! If two messages belong to the same partition, they are required to be emitted in the order of\n//! their position.\n//!\n//! The set of documents processed by a source can then be expressed entirely as Checkpoint, that is\n//! simply a mapping `(PartitionId -> Position)`.\n//!\n//! This checkpoint is used in Quickwit to implement exactly-once semantics.\n//! 
When a new split is published, it is atomically published with an update of the last indexed\n//! checkpoint.\n//!\n//! If the indexing pipeline is restarted, the source will simply be recreated with that checkpoint.\n//!\n//! # Example sources\n//!\n//! For instance, two of the sources implemented in Quickwit are:\n//! - the file source: the partition is a filepath, and the position is a byte offset within\n//!   that file.\n//! - the Kafka source: the partition id is a Kafka topic partition id, and the position is a Kafka\n//!   offset.\nmod doc_file_reader;\nmod file_source;\n#[cfg(feature = \"gcp-pubsub\")]\nmod gcp_pubsub_source;\nmod ingest;\nmod ingest_api_source;\n#[cfg(feature = \"kafka\")]\nmod kafka_source;\n#[cfg(feature = \"kinesis\")]\nmod kinesis;\n#[cfg(feature = \"pulsar\")]\nmod pulsar_source;\n#[cfg(feature = \"queue-sources\")]\nmod queue_sources;\nmod source_factory;\nmod stdin_source;\nmod vec_source;\nmod void_source;\n\nuse std::collections::BTreeSet;\nuse std::path::PathBuf;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse bytesize::ByteSize;\npub use file_source::{FileSource, FileSourceFactory};\n#[cfg(feature = \"gcp-pubsub\")]\npub use gcp_pubsub_source::{GcpPubSubSource, GcpPubSubSourceFactory};\n#[cfg(feature = \"kafka\")]\npub use kafka_source::{KafkaSource, KafkaSourceFactory};\n#[cfg(feature = \"kinesis\")]\npub use kinesis::kinesis_source::{KinesisSource, KinesisSourceFactory};\nuse once_cell::sync::{Lazy, OnceCell};\n#[cfg(feature = \"pulsar\")]\npub use pulsar_source::{PulsarSource, PulsarSourceFactory};\n#[cfg(feature = \"sqs\")]\npub use queue_sources::sqs_queue;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox};\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::runtimes::RuntimeType;\nuse quickwit_config::{\n    FileSourceNotification, FileSourceParams, IndexingSettings, SourceConfig, 
SourceParams,\n};\nuse quickwit_ingest::IngesterPool;\nuse quickwit_metastore::IndexMetadataResponseExt;\nuse quickwit_metastore::checkpoint::{SourceCheckpoint, SourceCheckpointDelta};\nuse quickwit_proto::indexing::IndexingPipelineId;\nuse quickwit_proto::metastore::{\n    IndexMetadataRequest, MetastoreError, MetastoreResult, MetastoreService,\n    MetastoreServiceClient, SourceType,\n};\nuse quickwit_proto::types::{IndexUid, NodeIdRef, PipelineUid, ShardId};\nuse quickwit_storage::StorageResolver;\nuse serde_json::Value as JsonValue;\npub use source_factory::{SourceFactory, SourceLoader, TypedSourceFactory};\nuse tokio::runtime::Handle;\nuse tracing::error;\npub use vec_source::{VecSource, VecSourceFactory};\npub use void_source::{VoidSource, VoidSourceFactory};\n\nuse self::doc_file_reader::dir_and_filename;\nuse self::stdin_source::StdinSourceFactory;\nuse crate::actors::DocProcessor;\nuse crate::models::RawDocBatch;\nuse crate::source::ingest::IngestSourceFactory;\nuse crate::source::ingest_api_source::IngestApiSourceFactory;\n\n/// Number of bytes after which we cut a new batch.\n///\n/// We try to emit chewable batches for the indexer.\n/// One batch = one message to the indexer actor.\n///\n/// If batches are too large:\n/// - we might not be able to observe the state of the indexer for 5 seconds.\n/// - we will be needlessly occupying resident memory in the mailbox.\n/// - we will not have a precise control of the timeout before commit.\n///\n/// 5MB seems like a good one size fits all value.\nconst BATCH_NUM_BYTES_LIMIT: u64 = ByteSize::mib(5).as_u64();\n\nstatic EMIT_BATCHES_TIMEOUT: Lazy<Duration> = Lazy::new(|| {\n    if cfg!(any(test, feature = \"testsuite\")) {\n        let timeout = Duration::from_millis(100);\n        assert!(timeout < *quickwit_actors::HEARTBEAT);\n        timeout\n    } else {\n        let timeout = Duration::from_millis(1_000);\n        if *quickwit_actors::HEARTBEAT < timeout {\n            error!(\"QW_ACTOR_HEARTBEAT_SECS 
smaller than batch timeout\");\n        }\n        timeout\n    }\n});\n\n/// Runtime configuration used during execution of a source actor.\n#[derive(Clone)]\npub struct SourceRuntime {\n    pub pipeline_id: IndexingPipelineId,\n    pub source_config: SourceConfig,\n    pub metastore: MetastoreServiceClient,\n    pub ingester_pool: IngesterPool,\n    // Ingest API queues directory path.\n    pub queues_dir_path: PathBuf,\n    pub storage_resolver: StorageResolver,\n    pub event_broker: EventBroker,\n    pub indexing_setting: IndexingSettings,\n}\n\nimpl SourceRuntime {\n    pub fn node_id(&self) -> &NodeIdRef {\n        &self.pipeline_id.node_id\n    }\n\n    pub fn index_uid(&self) -> &IndexUid {\n        &self.pipeline_id.index_uid\n    }\n\n    pub fn index_id(&self) -> &str {\n        &self.pipeline_id.index_uid.index_id\n    }\n\n    pub fn source_id(&self) -> &str {\n        &self.pipeline_id.source_id\n    }\n\n    pub fn pipeline_uid(&self) -> PipelineUid {\n        self.pipeline_id.pipeline_uid\n    }\n\n    pub async fn fetch_checkpoint(&self) -> MetastoreResult<SourceCheckpoint> {\n        let index_uid = self.index_uid().clone();\n        let request = IndexMetadataRequest::for_index_uid(index_uid);\n        let response = self.metastore.clone().index_metadata(request).await?;\n        let index_metadata = response.deserialize_index_metadata()?;\n\n        if let Some(checkpoint) = index_metadata\n            .checkpoint\n            .source_checkpoint(self.source_id())\n            .cloned()\n        {\n            return Ok(checkpoint);\n        }\n        Err(MetastoreError::Internal {\n            message: format!(\n                \"could not find checkpoint for index `{}` and source `{}`\",\n                self.index_uid(),\n                self.source_id()\n            ),\n            cause: \"\".to_string(),\n        })\n    }\n}\n\npub type SourceContext = ActorContext<SourceActor>;\n\n/// A Source is a trait that is mounted in a light 
wrapping Actor called `SourceActor`.\n///\n/// For this reason, its methods mimic those of `Actor`.\n/// One key difference is the absence of messages.\n///\n/// The `SourceActor` loops until `emit_batches` returns an\n/// `ActorExitStatus`.\n///\n/// Conceptually, a source execution works as if it were a simple loop\n/// as follows:\n///\n/// ```ignore\n/// fn whatever() -> anyhow::Result<()> {\n///     source.initialize(ctx)?;\n///     let exit_status = loop {\n///         if let Err(exit_status) = source.emit_batches() {\n///             break exit_status;\n///         }\n///     };\n///     source.finalize(exit_status)?;\n///     Ok(())\n/// }\n/// ```\n#[async_trait]\npub trait Source: Send + 'static {\n    /// This method will be called before any calls to `emit_batches`.\n    async fn initialize(\n        &mut self,\n        _doc_processor_mailbox: &Mailbox<DocProcessor>,\n        _ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        Ok(())\n    }\n\n    /// Main part of the source implementation. `emit_batches` can emit 0..n batches.\n    ///\n    /// The `doc_processor_mailbox` has a bounded capacity.\n    /// When it is full, sending a batch to it will block.\n    ///\n    /// It returns a duration specifying how long the batch requester\n    /// should wait before polling again. A zero duration means polling again immediately.\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus>;\n\n    /// Assign shards is called when the source is assigned a new set of shards by the control\n    /// plane.\n    async fn assign_shards(\n        &mut self,\n        _shard_ids: BTreeSet<ShardId>,\n        _doc_processor_mailbox: &Mailbox<DocProcessor>,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        Ok(())\n    }\n\n    /// After publication of a split, `suggest_truncate` is called.\n    /// This makes it possible for the implementation of a 
source to\n    /// release some resources associated with the data that was just published.\n    ///\n    /// This method is for instance useful for the ingest API, as it is possible\n    /// to delete all messages prior to the checkpoint in the ingest API queue.\n    ///\n    /// It is perfectly fine for an implementation to ignore this function.\n    /// For instance, message queues like Kafka are meant to be shared by different\n    /// clients, and rely on a retention strategy to delete messages.\n    ///\n    /// Returning an error has no effect on the source actor itself or the\n    /// indexing pipeline, as truncation is just \"a suggestion\".\n    /// The error will however be logged.\n    async fn suggest_truncate(\n        &mut self,\n        _checkpoint: SourceCheckpoint,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        Ok(())\n    }\n\n    /// Finalize is called once after the actor terminates.\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        Ok(())\n    }\n\n    /// A name identifying the type of source.\n    fn name(&self) -> String;\n\n    /// Returns the observable state of the actor.\n    ///\n    /// This object is simply a JSON object, and its content may vary depending on the\n    /// source.\n    fn observable_state(&self) -> JsonValue;\n}\n\n/// The `SourceActor` acts as a thin wrapper that executes a source trait object.\n///\n/// It mostly takes care of running a loop calling `emit_batches(...)`.\npub struct SourceActor {\n    pub source: Box<dyn Source>,\n    pub doc_processor_mailbox: Mailbox<DocProcessor>,\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n#[derive(Debug)]\npub struct Assignment {\n    pub shard_ids: BTreeSet<ShardId>,\n}\n\n#[derive(Debug)]\npub struct AssignShards(pub Assignment);\n\n#[async_trait]\nimpl Actor for SourceActor {\n    type ObservableState = JsonValue;\n\n    fn name(&self) -> String {\n        
self.source.name()\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.source.observable_state()\n    }\n\n    fn runtime_handle(&self) -> Handle {\n        RuntimeType::NonBlocking.get_runtime_handle()\n    }\n\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    async fn initialize(&mut self, ctx: &SourceContext) -> Result<(), ActorExitStatus> {\n        self.source\n            .initialize(&self.doc_processor_mailbox, ctx)\n            .await?;\n        self.handle(Loop, ctx).await?;\n        Ok(())\n    }\n\n    async fn finalize(\n        &mut self,\n        exit_status: &ActorExitStatus,\n        ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        self.source.finalize(exit_status, ctx).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Loop> for SourceActor {\n    type Reply = ();\n\n    async fn handle(&mut self, _message: Loop, ctx: &SourceContext) -> Result<(), ActorExitStatus> {\n        let wait_for = self\n            .source\n            .emit_batches(&self.doc_processor_mailbox, ctx)\n            .await?;\n        if wait_for.is_zero() {\n            ctx.send_self_message(Loop).await?;\n            return Ok(());\n        }\n        ctx.schedule_self_msg(wait_for, Loop);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<AssignShards> for SourceActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        assign_shards_message: AssignShards,\n        ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        let AssignShards(Assignment { shard_ids }) = assign_shards_message;\n        self.source\n            .assign_shards(shard_ids, &self.doc_processor_mailbox, ctx)\n            .await?;\n        Ok(())\n    }\n}\n\n// TODO: Use `SourceType` instead of `&str``.\npub fn quickwit_supported_sources() -> &'static SourceLoader {\n    static SOURCE_LOADER: OnceCell<SourceLoader> = OnceCell::new();\n    SOURCE_LOADER.get_or_init(|| {\n      
  let mut source_factory = SourceLoader::default();\n        source_factory.add_source(SourceType::File, FileSourceFactory);\n        #[cfg(feature = \"gcp-pubsub\")]\n        source_factory.add_source(SourceType::PubSub, GcpPubSubSourceFactory);\n        source_factory.add_source(SourceType::IngestV1, IngestApiSourceFactory);\n        source_factory.add_source(SourceType::IngestV2, IngestSourceFactory);\n        #[cfg(feature = \"kafka\")]\n        source_factory.add_source(SourceType::Kafka, KafkaSourceFactory);\n        #[cfg(feature = \"kinesis\")]\n        source_factory.add_source(SourceType::Kinesis, KinesisSourceFactory);\n        #[cfg(feature = \"pulsar\")]\n        source_factory.add_source(SourceType::Pulsar, PulsarSourceFactory);\n        source_factory.add_source(SourceType::Stdin, StdinSourceFactory);\n        source_factory.add_source(SourceType::Vec, VecSourceFactory);\n        source_factory.add_source(SourceType::Void, VoidSourceFactory);\n        source_factory\n    })\n}\n\npub async fn check_source_connectivity(\n    storage_resolver: &StorageResolver,\n    source_config: &SourceConfig,\n) -> anyhow::Result<()> {\n    match &source_config.source_params {\n        SourceParams::File(FileSourceParams::Filepath(file_uri)) => {\n            let (dir_uri, file_name) = dir_and_filename(file_uri)?;\n            let storage = storage_resolver.resolve(&dir_uri).await?;\n            storage.file_num_bytes(file_name).await?;\n            Ok(())\n        }\n        #[allow(unused_variables)]\n        SourceParams::File(FileSourceParams::Notifications(FileSourceNotification::Sqs(\n            sqs_config,\n        ))) => {\n            #[cfg(not(feature = \"sqs\"))]\n            anyhow::bail!(\"Quickwit was compiled without the `sqs` feature\");\n\n            #[cfg(feature = \"sqs\")]\n            {\n                queue_sources::sqs_queue::check_connectivity(&sqs_config.queue_url).await?;\n                Ok(())\n            }\n        }\n        
#[allow(unused_variables)]\n        SourceParams::Kafka(params) => {\n            #[cfg(not(feature = \"kafka\"))]\n            anyhow::bail!(\"Quickwit was compiled without the `kafka` feature\");\n\n            #[cfg(feature = \"kafka\")]\n            {\n                kafka_source::check_connectivity(params.clone()).await?;\n                Ok(())\n            }\n        }\n        #[allow(unused_variables)]\n        SourceParams::Kinesis(params) => {\n            #[cfg(not(feature = \"kinesis\"))]\n            anyhow::bail!(\"Quickwit was compiled without the `kinesis` feature\");\n\n            #[cfg(feature = \"kinesis\")]\n            {\n                kinesis::check_connectivity(params.clone()).await?;\n                Ok(())\n            }\n        }\n        #[allow(unused_variables)]\n        SourceParams::Pulsar(params) => {\n            #[cfg(not(feature = \"pulsar\"))]\n            anyhow::bail!(\"Quickwit was compiled without the `pulsar` feature\");\n\n            #[cfg(feature = \"pulsar\")]\n            {\n                pulsar_source::check_connectivity(params).await?;\n                Ok(())\n            }\n        }\n        _ => Ok(()),\n    }\n}\n\n#[derive(Debug)]\npub struct SuggestTruncate(pub SourceCheckpoint);\n\n#[async_trait]\nimpl Handler<SuggestTruncate> for SourceActor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        suggest_truncate: SuggestTruncate,\n        ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        let SuggestTruncate(checkpoint) = suggest_truncate;\n\n        if let Err(error) = self.source.suggest_truncate(checkpoint, ctx).await {\n            // Failing to process suggest truncate does not\n            // kill the source nor the indexing pipeline, but we log the error.\n            error!(%error, \"failed to process suggest truncate\");\n        }\n        Ok(())\n    }\n}\n\npub(super) struct BatchBuilder {\n    // Do not directly append documents to this vector; 
otherwise, in-flight metrics will be\n    // incorrect. Use `add_doc` instead.\n    docs: Vec<Bytes>,\n    num_bytes: u64,\n    checkpoint_delta: SourceCheckpointDelta,\n    force_commit: bool,\n    gauge_guard: GaugeGuard<'static>,\n}\n\nimpl BatchBuilder {\n    pub fn new(source_type: SourceType) -> Self {\n        Self::with_capacity(0, source_type)\n    }\n\n    pub fn with_capacity(capacity: usize, source_type: SourceType) -> Self {\n        let gauge = match source_type {\n            SourceType::File => MEMORY_METRICS.in_flight.file(),\n            SourceType::IngestV2 => MEMORY_METRICS.in_flight.ingest(),\n            SourceType::Kafka => MEMORY_METRICS.in_flight.kafka(),\n            SourceType::Kinesis => MEMORY_METRICS.in_flight.kinesis(),\n            SourceType::PubSub => MEMORY_METRICS.in_flight.pubsub(),\n            SourceType::Pulsar => MEMORY_METRICS.in_flight.pulsar(),\n            _ => MEMORY_METRICS.in_flight.other(),\n        };\n        let gauge_guard = GaugeGuard::from_gauge(gauge);\n\n        Self {\n            docs: Vec::with_capacity(capacity),\n            num_bytes: 0,\n            checkpoint_delta: SourceCheckpointDelta::default(),\n            force_commit: false,\n            gauge_guard,\n        }\n    }\n\n    pub fn add_doc(&mut self, doc: Bytes) {\n        let num_bytes = doc.len();\n        self.docs.push(doc);\n        self.gauge_guard.add(num_bytes as i64);\n        self.num_bytes += num_bytes as u64;\n    }\n\n    pub fn force_commit(&mut self) {\n        self.force_commit = true;\n    }\n\n    pub fn build(self) -> RawDocBatch {\n        RawDocBatch::new(self.docs, self.checkpoint_delta, self.force_commit)\n    }\n\n    #[cfg(feature = \"kafka\")]\n    pub fn clear(&mut self) {\n        self.docs.clear();\n        self.checkpoint_delta = SourceCheckpointDelta::default();\n        self.gauge_guard.sub(self.num_bytes as i64);\n        self.num_bytes = 0;\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use 
std::num::NonZeroUsize;\n\n    use quickwit_config::{SourceInputFormat, VecSourceParams};\n    use quickwit_metastore::IndexMetadata;\n    use quickwit_metastore::checkpoint::IndexCheckpointDelta;\n    use quickwit_proto::metastore::{IndexMetadataResponse, MockMetastoreService};\n    use quickwit_proto::types::NodeId;\n\n    use super::*;\n\n    pub struct SourceRuntimeBuilder {\n        index_uid: IndexUid,\n        source_config: SourceConfig,\n        metastore_opt: Option<MetastoreServiceClient>,\n        queues_dir_path_opt: Option<PathBuf>,\n    }\n\n    impl SourceRuntimeBuilder {\n        pub fn new(index_uid: IndexUid, source_config: SourceConfig) -> Self {\n            SourceRuntimeBuilder {\n                index_uid,\n                source_config,\n                metastore_opt: None,\n                queues_dir_path_opt: None,\n            }\n        }\n\n        pub fn build(mut self) -> SourceRuntime {\n            let metastore = self\n                .metastore_opt\n                .take()\n                .unwrap_or_else(|| self.setup_mock_metastore(None));\n\n            let queues_dir_path = self\n                .queues_dir_path_opt\n                .unwrap_or_else(|| PathBuf::from(\"./queues\"));\n\n            SourceRuntime {\n                pipeline_id: IndexingPipelineId {\n                    node_id: NodeId::from(\"test-node\"),\n                    index_uid: self.index_uid,\n                    source_id: self.source_config.source_id.clone(),\n                    pipeline_uid: PipelineUid::for_test(0u128),\n                },\n                metastore,\n                ingester_pool: IngesterPool::default(),\n                queues_dir_path,\n                source_config: self.source_config,\n                storage_resolver: StorageResolver::for_test(),\n                event_broker: EventBroker::default(),\n                indexing_setting: IndexingSettings::default(),\n            }\n        }\n\n        #[cfg(all(\n            
test,\n            any(feature = \"kafka-broker-tests\", feature = \"sqs-localstack-tests\")\n        ))]\n        pub fn with_metastore(mut self, metastore: MetastoreServiceClient) -> Self {\n            self.metastore_opt = Some(metastore);\n            self\n        }\n\n        pub fn with_mock_metastore(\n            mut self,\n            source_checkpoint_delta_opt: Option<SourceCheckpointDelta>,\n        ) -> Self {\n            self.metastore_opt = Some(self.setup_mock_metastore(source_checkpoint_delta_opt));\n            self\n        }\n\n        pub fn with_queues_dir(mut self, queues_dir_path: impl Into<PathBuf>) -> Self {\n            self.queues_dir_path_opt = Some(queues_dir_path.into());\n            self\n        }\n\n        fn setup_mock_metastore(\n            &self,\n            source_checkpoint_delta_opt: Option<SourceCheckpointDelta>,\n        ) -> MetastoreServiceClient {\n            let index_uid = self.index_uid.clone();\n            let source_config = self.source_config.clone();\n\n            let mut mock_metastore = MockMetastoreService::new();\n            mock_metastore\n                .expect_index_metadata()\n                .returning(move |_request| {\n                    let index_uri = format!(\"ram:///indexes/{}\", index_uid.index_id);\n                    let mut index_metadata =\n                        IndexMetadata::for_test(&index_uid.index_id, &index_uri);\n                    index_metadata.index_uid = index_uid.clone();\n\n                    let source_id = source_config.source_id.clone();\n                    index_metadata.add_source(source_config.clone()).unwrap();\n\n                    if let Some(source_delta) = source_checkpoint_delta_opt.clone() {\n                        let delta = IndexCheckpointDelta {\n                            source_id,\n                            source_delta,\n                        };\n                        index_metadata.checkpoint.try_apply_delta(delta).unwrap();\n        
            }\n                    let response =\n                        IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap();\n                    Ok(response)\n                });\n            MetastoreServiceClient::from_mock(mock_metastore)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_check_source_connectivity() -> anyhow::Result<()> {\n        {\n            let source_config = SourceConfig {\n                source_id: \"void\".to_string(),\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::void(),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            };\n            check_source_connectivity(&StorageResolver::for_test(), &source_config).await?;\n        }\n        {\n            let source_config = SourceConfig {\n                source_id: \"vec\".to_string(),\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::Vec(VecSourceParams::default()),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            };\n            check_source_connectivity(&StorageResolver::for_test(), &source_config).await?;\n        }\n        {\n            let source_config = SourceConfig {\n                source_id: \"file\".to_string(),\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::file_from_str(\"file-does-not-exist.json\").unwrap(),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            };\n            assert!(\n                check_source_connectivity(&StorageResolver::for_test(), &source_config)\n                    .await\n                    .is_err()\n            );\n        }\n        {\n            let source_config = SourceConfig {\n          
      source_id: \"file\".to_string(),\n                num_pipelines: NonZeroUsize::MIN,\n                enabled: true,\n                source_params: SourceParams::file_from_str(\"data/test_corpus.json\").unwrap(),\n                transform_config: None,\n                input_format: SourceInputFormat::Json,\n            };\n            assert!(\n                check_source_connectivity(&StorageResolver::for_test(), &source_config)\n                    .await\n                    .is_ok()\n            );\n        }\n        Ok(())\n    }\n}\n\n#[cfg(all(\n    test,\n    any(\n        feature = \"sqs-localstack-tests\",\n        feature = \"kafka-broker-tests\",\n        feature = \"pulsar-broker-tests\"\n    )\n))]\nmod test_setup_helper {\n\n    use quickwit_config::IndexConfig;\n    use quickwit_metastore::checkpoint::{IndexCheckpointDelta, PartitionId};\n    use quickwit_metastore::{CreateIndexRequestExt, SplitMetadata, StageSplitsRequestExt};\n    use quickwit_proto::metastore::{CreateIndexRequest, PublishSplitsRequest, StageSplitsRequest};\n    use quickwit_proto::types::Position;\n\n    use super::*;\n    use crate::new_split_id;\n\n    pub async fn setup_index(\n        metastore: MetastoreServiceClient,\n        index_id: &str,\n        source_config: &SourceConfig,\n        partition_deltas: &[(PartitionId, Position, Position)],\n    ) -> IndexUid {\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(index_id, &index_uri);\n        let create_index_request = CreateIndexRequest::try_from_index_and_source_configs(\n            &index_config,\n            std::slice::from_ref(source_config),\n        )\n        .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        if partition_deltas.is_empty() {\n            return index_uid;\n        }\n 
       let split_id = new_split_id();\n        let split_metadata = SplitMetadata::for_test(split_id.clone());\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let mut source_delta = SourceCheckpointDelta::default();\n        for (partition_id, from_position, to_position) in partition_deltas.iter().cloned() {\n            source_delta\n                .record_partition_delta(partition_id, from_position, to_position)\n                .unwrap();\n        }\n        let checkpoint_delta = IndexCheckpointDelta {\n            source_id: source_config.source_id.to_string(),\n            source_delta,\n        };\n        let checkpoint_delta_json = serde_json::to_string(&checkpoint_delta).unwrap();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            index_checkpoint_delta_json_opt: Some(checkpoint_delta_json),\n            staged_split_ids: vec![split_id.clone()],\n            replaced_split_ids: Vec::new(),\n            publish_token_opt: None,\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n        index_uid\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/pulsar_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::fmt;\nuse std::time::{Duration, Instant};\n\nuse anyhow::{Context, anyhow};\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse futures::StreamExt;\nuse pulsar::authentication::oauth2::{OAuth2Authentication, OAuth2Params};\nuse pulsar::consumer::Message;\nuse pulsar::message::proto::MessageIdData;\nuse pulsar::{\n    Authentication, Consumer, DeserializeMessage, Payload, Pulsar, SubType, TokioExecutor,\n};\nuse quickwit_actors::{ActorContext, ActorExitStatus, Mailbox};\nuse quickwit_config::{PulsarSourceAuth, PulsarSourceParams};\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpoint};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::{IndexUid, Position};\nuse serde_json::{Value as JsonValue, json};\nuse tokio::time;\nuse tracing::{debug, info, warn};\n\nuse crate::actors::DocProcessor;\nuse crate::source::{\n    BATCH_NUM_BYTES_LIMIT, BatchBuilder, EMIT_BATCHES_TIMEOUT, Source, SourceActor, SourceContext,\n    SourceRuntime, TypedSourceFactory,\n};\n\ntype PulsarConsumer = Consumer<PulsarMessage, TokioExecutor>;\n\npub struct PulsarSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for PulsarSourceFactory {\n    type Source = PulsarSource;\n    type Params = PulsarSourceParams;\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n   
     source_params: PulsarSourceParams,\n    ) -> anyhow::Result<Self::Source> {\n        PulsarSource::try_new(source_runtime, source_params).await\n    }\n}\n\n#[derive(Default, Debug)]\npub struct PulsarSourceState {\n    /// Number of bytes processed by the source.\n    pub num_bytes_processed: u64,\n    /// Number of messages processed by the source (including invalid messages).\n    pub num_messages_processed: u64,\n    /// Number of invalid messages, i.e., that were empty or could not be parsed.\n    pub num_invalid_messages: u64,\n    /// The number of messages that were skipped due to the message being older\n    /// than the current checkpoint position\n    pub num_skipped_messages: u64,\n}\n\npub struct PulsarSource {\n    source_runtime: SourceRuntime,\n    source_params: PulsarSourceParams,\n    pulsar_consumer: PulsarConsumer,\n    subscription_name: String,\n    current_positions: BTreeMap<PartitionId, Position>,\n    state: PulsarSourceState,\n}\n\nimpl fmt::Debug for PulsarSource {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"PulsarSource\")\n            .field(\"index_uid\", self.source_runtime.index_uid())\n            .field(\"source_id\", &self.source_runtime.source_id())\n            .field(\"subscription_name\", &self.subscription_name)\n            .field(\"topics\", &self.source_params.topics.join(\", \"))\n            .finish()\n    }\n}\n\nimpl PulsarSource {\n    pub async fn try_new(\n        source_runtime: SourceRuntime,\n        source_params: PulsarSourceParams,\n    ) -> anyhow::Result<Self> {\n        let subscription_name =\n            subscription_name(source_runtime.index_uid(), source_runtime.source_id());\n        info!(\n            index_id=%source_runtime.index_id(),\n            source_id=%source_runtime.source_id(),\n            topics=?source_params.topics,\n            subscription_name=%subscription_name,\n            \"Create Pulsar source.\"\n        );\n        let pulsar = 
connect_pulsar(&source_params).await?;\n        let checkpoint = source_runtime.fetch_checkpoint().await?;\n\n        // Current positions are built mapping the topic ID to the last-saved\n        // message ID, pulsar ensures these topics (and topic partitions) are\n        // unique so that we don't inadvertently clash.\n        let mut current_positions = BTreeMap::new();\n        for topic in source_params.topics.iter() {\n            let partitions = pulsar.lookup_partitioned_topic(topic).await?;\n\n            for (partition, _) in partitions {\n                let partition_id = PartitionId::from(partition);\n                let position_opt = checkpoint.position_for_partition(&partition_id).cloned();\n\n                if let Some(position) = position_opt {\n                    current_positions.insert(partition_id, position);\n                }\n            }\n        }\n        let pulsar_consumer = create_pulsar_consumer(\n            subscription_name.clone(),\n            source_params.clone(),\n            pulsar,\n            current_positions.clone(),\n        )\n        .await?;\n\n        Ok(Self {\n            source_runtime,\n            source_params,\n            pulsar_consumer,\n            subscription_name,\n            current_positions,\n            state: PulsarSourceState::default(),\n        })\n    }\n\n    fn process_message(\n        &mut self,\n        message: Message<PulsarMessage>,\n        batch: &mut BatchBuilder,\n    ) -> anyhow::Result<()> {\n        let current_position = msg_id_to_position(message.message_id());\n        let doc = message.deserialize();\n        self.add_doc_to_batch(&message.topic, current_position, doc, batch)\n    }\n\n    fn add_doc_to_batch(\n        &mut self,\n        topic: &str,\n        msg_position: Position,\n        doc: Bytes,\n        batch: &mut BatchBuilder,\n    ) -> anyhow::Result<()> {\n        if doc.is_empty() {\n            warn!(\"message received from queue was empty\");\n        
    self.state.num_invalid_messages += 1;\n            return Ok(());\n        }\n\n        let partition = PartitionId::from(topic);\n        let num_bytes = doc.len() as u64;\n\n        if let Some(current_position) = self.current_positions.get(&partition) {\n            // We skip messages older or equal to the current recorded position.\n            // This is because Pulsar may replay messages which have not yet been acknowledged but\n            // are in the process of being published, this can occur in situations like pulsar\n            // re-balancing topic partitions if a node leaves, node failure, etc...\n            if &msg_position <= current_position {\n                self.state.num_skipped_messages += 1;\n                return Ok(());\n            }\n        }\n\n        let current_position = self\n            .current_positions\n            .insert(partition.clone(), msg_position.clone())\n            .unwrap_or(Position::Beginning);\n\n        batch\n            .checkpoint_delta\n            .record_partition_delta(partition, current_position, msg_position)\n            .context(\"failed to record partition delta\")?;\n        batch.add_doc(doc);\n\n        self.state.num_bytes_processed += num_bytes;\n        self.state.num_messages_processed += 1;\n\n        Ok(())\n    }\n\n    async fn try_ack_messages(&mut self, checkpoint: SourceCheckpoint) -> anyhow::Result<()> {\n        debug!(ckpt = ?checkpoint, \"truncating message queue\");\n        for (partition, position) in checkpoint.iter() {\n            if let Some(msg_id) = msg_id_from_position(&position) {\n                self.pulsar_consumer\n                    .cumulative_ack_with_id(partition.0.as_ref(), msg_id)\n                    .await?;\n            }\n        }\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Source for PulsarSource {\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> 
Result<Duration, ActorExitStatus> {\n        let now = Instant::now();\n        let mut batch_builder = BatchBuilder::new(SourceType::Pulsar);\n        let deadline = time::sleep(*EMIT_BATCHES_TIMEOUT);\n        tokio::pin!(deadline);\n\n        loop {\n            tokio::select! {\n                // This does not actually acquire the lock of the mutex internally\n                // we're using the mutex in order to convince the Rust compiler\n                // that we can use the consumer within this Sync context.\n                message = self.pulsar_consumer.next() => {\n                    let message = message\n                        .ok_or_else(|| ActorExitStatus::from(anyhow!(\"consumer was dropped\")))?\n                        .map_err(|e| ActorExitStatus::from(anyhow!(\"failed to get message from consumer: {:?}\", e)))?;\n\n                    self.process_message(message, &mut batch_builder).map_err(ActorExitStatus::from)?;\n\n                    if batch_builder.num_bytes >= BATCH_NUM_BYTES_LIMIT {\n                        break;\n                    }\n                }\n                _ = &mut deadline => {\n                    break;\n                }\n            }\n            ctx.record_progress();\n        }\n\n        if !batch_builder.checkpoint_delta.is_empty() {\n            debug!(\n                num_docs=%batch_builder.docs.len(),\n                num_bytes=%batch_builder.num_bytes,\n                num_millis=%now.elapsed().as_millis(),\n                \"sending doc batch to indexer\"\n            );\n            let message = batch_builder.build();\n            ctx.send_message(doc_processor_mailbox, message).await?;\n        }\n        Ok(Duration::default())\n    }\n\n    async fn suggest_truncate(\n        &mut self,\n        checkpoint: SourceCheckpoint,\n        _ctx: &ActorContext<SourceActor>,\n    ) -> anyhow::Result<()> {\n        self.try_ack_messages(checkpoint).await\n    }\n\n    fn name(&self) -> String {\n        
format!(\"{self:?}\")\n    }\n\n    async fn finalize(\n        &mut self,\n        _exit_status: &ActorExitStatus,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        self.pulsar_consumer.close().await?;\n        Ok(())\n    }\n\n    fn observable_state(&self) -> JsonValue {\n        json!({\n            \"index_id\": self.source_runtime.index_id(),\n            \"source_id\": self.source_runtime.source_id(),\n            \"topics\": self.source_params.topics,\n            \"subscription_name\": self.subscription_name,\n            \"consumer_name\": self.source_params.consumer_name,\n            \"num_bytes_processed\": self.state.num_bytes_processed,\n            \"num_messages_processed\": self.state.num_messages_processed,\n            \"num_invalid_messages\": self.state.num_invalid_messages,\n        })\n    }\n}\n\n#[derive(Debug)]\nstruct PulsarMessage;\n\nimpl DeserializeMessage for PulsarMessage {\n    type Output = Bytes;\n\n    fn deserialize_message(payload: &Payload) -> Self::Output {\n        Bytes::from(payload.data.clone())\n    }\n}\n\n#[tracing::instrument(name = \"pulsar-consumer\", skip(pulsar))]\n/// Creates a new pulsar consumer\nasync fn create_pulsar_consumer(\n    subscription_name: String,\n    params: PulsarSourceParams,\n    pulsar: Pulsar<TokioExecutor>,\n    current_positions: BTreeMap<PartitionId, Position>,\n) -> anyhow::Result<PulsarConsumer> {\n    let mut consumer: Consumer<PulsarMessage, _> = pulsar\n        .consumer()\n        .with_topics(&params.topics)\n        .with_consumer_name(&params.consumer_name)\n        .with_subscription(subscription_name)\n        .with_subscription_type(SubType::Failover)\n        .build()\n        .await?;\n\n    let consumer_ids = consumer\n        .consumer_id()\n        .into_iter()\n        .map(|id| id.to_string())\n        .collect::<Vec<_>>();\n    info!(positions = ?current_positions, \"seeking to last checkpoint positions\");\n    for (_, position) in 
current_positions {\n        let seek_to = msg_id_from_position(&position);\n\n        if seek_to.is_some() {\n            consumer\n                .seek(Some(consumer_ids.clone()), seek_to, None, pulsar.clone())\n                .await?;\n        }\n    }\n    Ok(consumer)\n}\n\nfn msg_id_to_position(msg: &MessageIdData) -> Position {\n    // The order of these fields is important as it affects the sorting\n    // of the checkpoint positions.\n    //\n    // The key parts of the ID used for ordering are:\n    // - The ledger ID, which is a sequentially increasing ID.\n    // - The entry ID, which is the unique ID of the message within the ledger.\n    // - The batch position for the current chunk of messages.\n    //\n    // The remaining keys are not required for sorting but are required\n    // to reconstruct the message ID that is sent back to Pulsar.\n    // The ledger_id, entry_id and the batch_index form a unique composite key which will\n    // prevent the remaining parts of the ID from interfering with the sorting.\n    let position_str = format!(\n        \"{:0>20},{:0>20},{},{},{}\",\n        msg.ledger_id,\n        msg.entry_id,\n        msg.batch_index\n            .map(|v| format!(\"{v:010}\"))\n            .unwrap_or_default(),\n        msg.partition\n            .and_then(|v| if v < 0 {\n                None\n            } else {\n                Some(format!(\"{v:010}\"))\n            })\n            .unwrap_or_default(),\n        msg.batch_size\n            .map(|v| format!(\"{v:010}\"))\n            .unwrap_or_default(),\n    );\n\n    Position::from(position_str)\n}\n\nfn msg_id_from_position(position: &Position) -> Option<MessageIdData> {\n    let Position::Offset(offset) = position else {\n        return None;\n    };\n    let mut parts = offset.as_str().split(',');\n\n    let ledger_id = parts.next()?.parse::<u64>().ok()?;\n    let entry_id = parts.next()?.parse::<u64>().ok()?;\n    let batch_index = parts.next()?.parse::<i32>().ok();\n 
   let partition = parts.next()?.parse::<i32>().unwrap_or(-1);\n    let batch_size = parts.next()?.parse::<i32>().ok();\n\n    Some(MessageIdData {\n        ledger_id,\n        entry_id,\n        batch_index,\n        batch_size,\n        partition: Some(partition),\n        ack_set: Vec::new(),\n        first_chunk_message_id: None,\n    })\n}\n\nasync fn connect_pulsar(params: &PulsarSourceParams) -> anyhow::Result<Pulsar<TokioExecutor>> {\n    let mut builder = Pulsar::builder(&params.address, TokioExecutor);\n\n    match params.authentication.clone() {\n        None => {}\n        Some(PulsarSourceAuth::Token(token)) => {\n            let auth = Authentication {\n                name: \"token\".to_string(),\n                data: token.as_bytes().to_vec(),\n            };\n\n            builder = builder.with_auth(auth);\n        }\n        Some(PulsarSourceAuth::Oauth2 {\n            issuer_url,\n            credentials_url,\n            audience,\n            scope,\n        }) => {\n            let auth = OAuth2Params {\n                issuer_url,\n                credentials_url,\n                audience,\n                scope,\n            };\n            builder = builder.with_auth_provider(OAuth2Authentication::client_credentials(auth));\n        }\n    }\n    let pulsar: Pulsar<_> = builder.build().await?;\n    Ok(pulsar)\n}\n\n/// Checks whether we can establish a connection to the pulsar broker.\npub(crate) async fn check_connectivity(params: &PulsarSourceParams) -> anyhow::Result<()> {\n    connect_pulsar(params).await?;\n    Ok(())\n}\n\nfn subscription_name(index_uid: &IndexUid, source_id: &str) -> String {\n    format!(\"quickwit-{index_uid}-{source_id}\")\n}\n\n#[cfg(all(test, feature = \"pulsar-broker-tests\"))]\nmod pulsar_broker_tests {\n    use std::collections::HashSet;\n    use std::num::NonZeroUsize;\n    use std::ops::Range;\n\n    use futures::future::join_all;\n    use quickwit_actors::{ActorHandle, HEARTBEAT, Inbox, Universe};\n    
use quickwit_common::rand::append_random_suffix;\n    use quickwit_config::{SourceConfig, SourceInputFormat, SourceParams};\n    use quickwit_metastore::checkpoint::{PartitionId, SourceCheckpointDelta};\n    use quickwit_metastore::metastore_for_test;\n    use quickwit_proto::metastore::MetastoreServiceClient;\n    use reqwest::StatusCode;\n\n    use super::*;\n    use crate::source::pulsar_source::{msg_id_from_position, msg_id_to_position};\n    use crate::source::test_setup_helper::setup_index;\n    use crate::source::tests::SourceRuntimeBuilder;\n    use crate::source::{RawDocBatch, SuggestTruncate, quickwit_supported_sources};\n\n    static PULSAR_URI: &str = \"pulsar://localhost:6650\";\n    static PULSAR_ADMIN_URI: &str = \"http://localhost:8081\";\n    static CLIENT_NAME: &str = \"quickwit-tester\";\n\n    macro_rules! positions {\n        ($($partition:expr => $position:expr $(,)?)*) => {{\n            let mut positions = BTreeMap::new();\n            $(\n                positions.insert(PartitionId::from($partition), Position::offset($position));\n            )*\n            positions\n        }};\n    }\n\n    macro_rules! 
checkpoints {\n        ($($partition:expr => $position:expr $(,)?)*) => {{\n            let mut checkpoint = SourceCheckpointDelta::default();\n            $(\n                checkpoint.record_partition_delta(\n                    PartitionId::from($partition),\n                    Position::Beginning,\n                    $position,\n                ).unwrap();\n            )*\n            checkpoint\n        }};\n    }\n\n    fn get_source_config<S: AsRef<str>>(\n        topics: impl IntoIterator<Item = S>,\n    ) -> (String, SourceConfig) {\n        let source_id = append_random_suffix(\"test-pulsar-source--source\");\n        let source_config = SourceConfig {\n            source_id: source_id.clone(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Pulsar(PulsarSourceParams {\n                topics: topics.into_iter().map(|v| v.as_ref().to_string()).collect(),\n                address: PULSAR_URI.to_string(),\n                consumer_name: CLIENT_NAME.to_string(),\n                authentication: None,\n            }),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        (source_id, source_config)\n    }\n\n    fn merge_doc_batches(batches: Vec<RawDocBatch>) -> RawDocBatch {\n        let mut merged_batch = RawDocBatch::default();\n        for batch in batches {\n            merged_batch.docs.extend(batch.docs);\n            merged_batch\n                .checkpoint_delta\n                .extend(batch.checkpoint_delta)\n                .unwrap();\n        }\n        merged_batch.docs.sort();\n        merged_batch\n    }\n\n    struct TopicData {\n        messages: Vec<String>,\n        expected_position: Position,\n    }\n\n    impl TopicData {\n        fn num_bytes(&self) -> usize {\n            self.messages.iter().map(|v| v.len()).sum::<usize>()\n        }\n\n        fn len(&self) -> usize {\n            self.messages.len()\n      
  }\n    }\n\n    /// Populates a given set of topics with messages produced by closure `M`.\n    ///\n    /// A set of messages and its expected last checkpoint position are returned\n    /// for each topic provided.\n    async fn populate_topic<'a, S: AsRef<str> + 'a, M>(\n        topics: impl IntoIterator<Item = S>,\n        range_message_ids: Range<usize>,\n        message_fn: M,\n    ) -> anyhow::Result<Vec<TopicData>>\n    where\n        M: Fn(&str, usize) -> JsonValue,\n    {\n        let client = Pulsar::builder(PULSAR_URI, TokioExecutor).build().await?;\n\n        let mut pending_messages = Vec::new();\n        for topic in topics {\n            let mut topic_messages = Vec::with_capacity(range_message_ids.len());\n            let mut producer = client\n                .producer()\n                .with_name(append_random_suffix(CLIENT_NAME))\n                .with_topic(topic.as_ref())\n                .build()\n                .await?;\n\n            for id in range_message_ids.clone() {\n                let msg = (message_fn)(topic.as_ref(), id).to_string();\n                topic_messages.push(msg);\n            }\n\n            let futures = producer.send_all(topic_messages.clone()).await?;\n            let receipts = join_all(futures).await;\n\n            let mut last_expected_position = Position::Beginning;\n            for result in receipts {\n                let msg_id = result?.message_id.unwrap();\n                last_expected_position = msg_id_to_position(&msg_id);\n            }\n\n            topic_messages.sort();\n            pending_messages.push(TopicData {\n                messages: topic_messages,\n                expected_position: last_expected_position,\n            });\n            producer.close().await.expect(\"Close connection.\");\n        }\n\n        Ok(pending_messages)\n    }\n\n    async fn wait_for_completion(\n        source_handle: ActorHandle<SourceActor>,\n        num_expected: usize,\n        partition: 
PartitionId,\n        truncate_to: Position,\n    ) -> JsonValue {\n        loop {\n            let observation = source_handle.observe().await;\n            let value = observation.state;\n            let num_messages_processed = value\n                .get(\"num_messages_processed\")\n                .unwrap()\n                .as_u64()\n                .unwrap();\n            if num_messages_processed >= num_expected as u64 {\n                break;\n            }\n            tokio::time::sleep(Duration::from_secs(1)).await;\n        }\n\n        let mut checkpoint = SourceCheckpoint::default();\n        checkpoint\n            .try_apply_delta(checkpoints!(partition => truncate_to))\n            .expect(\"Create checkpoint\");\n        let truncate = SuggestTruncate(checkpoint);\n        source_handle\n            .mailbox()\n            .send_message(truncate)\n            .await\n            .expect(\"Truncate\");\n\n        let (_exit_status, exit_state) = source_handle.quit().await;\n        exit_state\n    }\n\n    async fn create_partitioned_topic(topic: &str, num_partitions: usize) {\n        let client = reqwest::Client::new();\n        let res = client\n            .put(format!(\n                \"{PULSAR_ADMIN_URI}/admin/v2/persistent/public/default/{topic}/partitions\"\n            ))\n            .body(num_partitions.to_string())\n            .header(\"content-type\", b\"application/json\".as_ref())\n            .send()\n            .await\n            .expect(\"Send admin request\");\n\n        assert_eq!(\n            res.status(),\n            StatusCode::NO_CONTENT,\n            \"Expect 204 status code.\"\n        );\n    }\n\n    async fn create_source(\n        universe: &Universe,\n        _metastore: MetastoreServiceClient,\n        index_uid: IndexUid,\n        source_config: SourceConfig,\n        _start_checkpoint: SourceCheckpoint,\n    ) -> anyhow::Result<(ActorHandle<SourceActor>, Inbox<DocProcessor>)> {\n        let source_loader = 
quickwit_supported_sources();\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let source = source_loader.load_source(source_runtime).await?;\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let source_actor = SourceActor {\n            source,\n            doc_processor_mailbox,\n        };\n        let (_source_mailbox, source_handle) = universe.spawn_builder().spawn(source_actor);\n\n        Ok((source_handle, doc_processor_inbox))\n    }\n\n    fn message_generator(topic: &str, id: usize) -> JsonValue {\n        json!({\n            \"id\": id.to_string(),\n            \"topic\": topic,\n            \"timestamp\": 1674515715,\n            \"body\": \"Hello, world! This is some test data.\",\n        })\n    }\n\n    fn count_unique_messages_in_batches(batches: &[RawDocBatch]) -> usize {\n        let message_ids_topic: HashSet<String> = batches\n            .iter()\n            .flat_map(|batch| &batch.docs)\n            .map(|doc| {\n                let json_doc = serde_json::from_slice::<serde_json::Value>(doc).unwrap();\n                let id: &str = json_doc.get(\"id\").unwrap().as_str().unwrap();\n                let topic: &str = json_doc.get(\"topic\").unwrap().as_str().unwrap();\n                format!(\"{id}-{topic}\")\n            })\n            .collect();\n        message_ids_topic.len()\n    }\n\n    #[test]\n    fn test_position_serialization() {\n        let populated_id = MessageIdData {\n            ledger_id: 1,\n            entry_id: 134,\n            batch_index: Some(3),\n            partition: Some(-1),\n            batch_size: Some(6),\n\n            // We never serialize these fields.\n            ack_set: Vec::new(),\n            first_chunk_message_id: None,\n        };\n\n        let position = msg_id_to_position(&populated_id);\n        assert_eq!(\n            position.to_string(),\n            
format!(\"{:0>20},{:0>20},{:010},,{:010}\", 1, 134, 3, 6)\n        );\n        let retrieved_id = msg_id_from_position(&position)\n            .expect(\"Successfully deserialize message ID from position.\");\n        assert_eq!(retrieved_id, populated_id);\n\n        let partitioned_id = MessageIdData {\n            ledger_id: 1,\n            entry_id: 134,\n            batch_index: Some(3),\n            partition: Some(5),\n            batch_size: Some(6),\n\n            // We never serialize these fields.\n            ack_set: Vec::new(),\n            first_chunk_message_id: None,\n        };\n\n        let position = msg_id_to_position(&partitioned_id);\n        assert_eq!(\n            position.to_string(),\n            format!(\"{:0>20},{:0>20},{:010},{:010},{:010}\", 1, 134, 3, 5, 6)\n        );\n        let retrieved_id = msg_id_from_position(&position)\n            .expect(\"Successfully deserialize message ID from position.\");\n        assert_eq!(retrieved_id, partitioned_id);\n\n        let sparse_id = MessageIdData {\n            ledger_id: 1,\n            entry_id: 4,\n            batch_index: None,\n            partition: Some(-1),\n            batch_size: Some(0),\n\n            // We never serialize these fields.\n            ack_set: Vec::new(),\n            first_chunk_message_id: None,\n        };\n\n        let position = msg_id_to_position(&sparse_id);\n        assert_eq!(\n            position.to_string(),\n            format!(\"{:0>20},{:0>20},,,{:010}\", 1, 4, 0)\n        );\n        let retrieved_id = msg_id_from_position(&position)\n            .expect(\"Successfully deserialize message ID from position.\");\n        assert_eq!(retrieved_id, sparse_id);\n    }\n\n    #[tokio::test]\n    async fn test_doc_batching_logic() {\n        let topic = append_random_suffix(\"test-pulsar-source-topic\");\n\n        let index_id = append_random_suffix(\"test-pulsar-source-index\");\n        let index_uid = IndexUid::new_with_random_ulid(&index_id);\n 
       let (_source_id, source_config) = get_source_config([&topic]);\n        let params = if let SourceParams::Pulsar(params) = source_config.clone().source_params {\n            params\n        } else {\n            unreachable!()\n        };\n\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let mut pulsar_source = PulsarSource::try_new(source_runtime, params)\n            .await\n            .expect(\"Setup pulsar source\");\n\n        let position = Position::Beginning;\n        let mut batch = BatchBuilder::new(SourceType::Pulsar);\n        pulsar_source\n            .add_doc_to_batch(&topic, position, Bytes::from_static(b\"\"), &mut batch)\n            .expect(\"Add batch should not error on empty doc.\");\n        assert_eq!(pulsar_source.state.num_invalid_messages, 1);\n        assert_eq!(pulsar_source.state.num_messages_processed, 0);\n        assert_eq!(pulsar_source.state.num_bytes_processed, 0);\n        assert!(pulsar_source.current_positions.is_empty());\n        assert_eq!(batch.num_bytes, 0);\n        assert!(batch.docs.is_empty());\n\n        let position = Position::offset(1u64); // Used for testing simplicity.\n        let mut batch = BatchBuilder::new(SourceType::Pulsar);\n        let doc = Bytes::from_static(b\"some-demo-data\");\n        pulsar_source\n            .add_doc_to_batch(&topic, position, doc, &mut batch)\n            .expect(\"Add batch should not error on a non-empty doc.\");\n\n        assert_eq!(pulsar_source.state.num_invalid_messages, 1);\n        assert_eq!(pulsar_source.state.num_messages_processed, 1);\n        assert_eq!(pulsar_source.state.num_bytes_processed, 14);\n        assert_eq!(\n            pulsar_source.current_positions,\n            positions!(topic.as_str() => 1u64)\n        );\n        assert_eq!(batch.num_bytes, 14);\n        assert_eq!(batch.docs.len(), 1);\n\n        let position = Position::offset(4u64); // Used for testing simplicity.\n        let mut batch 
= BatchBuilder::new(SourceType::Pulsar);\n        let doc = Bytes::from_static(b\"some-demo-data-2\");\n        pulsar_source\n            .add_doc_to_batch(&topic, position, doc, &mut batch)\n            .expect(\"Add batch should not error on a non-empty doc.\");\n        assert_eq!(pulsar_source.state.num_invalid_messages, 1);\n        assert_eq!(pulsar_source.state.num_messages_processed, 2);\n        assert_eq!(pulsar_source.state.num_bytes_processed, 30);\n        assert_eq!(\n            pulsar_source.current_positions,\n            positions!(topic.as_str() => 4u64)\n        );\n        assert_eq!(batch.num_bytes, 16);\n        assert_eq!(batch.docs.len(), 1);\n\n        let mut expected_checkpoint_delta = SourceCheckpointDelta::default();\n        expected_checkpoint_delta\n            .record_partition_delta(\n                PartitionId::from(topic.as_str()),\n                Position::offset(1u64),\n                Position::offset(4u64),\n            )\n            .unwrap();\n        assert_eq!(batch.checkpoint_delta, expected_checkpoint_delta);\n    }\n\n    #[tokio::test]\n    async fn test_topic_ingestion() {\n        let universe = Universe::with_accelerated_time();\n        let metastore = metastore_for_test();\n        let topic = append_random_suffix(\"test-pulsar-source--topic-ingestion--topic\");\n\n        let index_id = append_random_suffix(\"test-pulsar-source--topic-ingestion--index\");\n        let (source_id, source_config) = get_source_config([&topic]);\n\n        let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n\n        let (source_handle, doc_processor_inbox) = create_source(\n            &universe,\n            metastore,\n            index_uid.clone(),\n            source_config,\n            SourceCheckpoint::default(),\n        )\n        .await\n        .expect(\"Create source\");\n\n        let expected_docs = populate_topic([&topic], 0..10, message_generator)\n            .await\n            
.unwrap();\n\n        let exit_state = wait_for_completion(\n            source_handle,\n            expected_docs[0].len(),\n            PartitionId::from(topic.clone()),\n            expected_docs[0].expected_position.clone(),\n        )\n        .await;\n        let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert!(!messages.is_empty());\n\n        let batch = merge_doc_batches(messages);\n        assert_eq!(batch.docs, expected_docs[0].messages);\n        assert_eq!(\n            batch.checkpoint_delta,\n            checkpoints!(topic.as_str() => expected_docs[0].expected_position.clone())\n        );\n\n        let num_bytes = expected_docs[0].num_bytes();\n        let expected_state = json!({\n            \"index_id\": index_id,\n            \"source_id\": source_id,\n            \"topics\": vec![topic],\n            \"subscription_name\": subscription_name(&index_uid, &source_id),\n            \"consumer_name\": CLIENT_NAME,\n            \"num_bytes_processed\": num_bytes,\n            \"num_messages_processed\": 10,\n            \"num_invalid_messages\": 0,\n        });\n        assert_eq!(exit_state, expected_state);\n    }\n\n    #[tokio::test]\n    async fn test_multi_topic_ingestion() {\n        let universe = Universe::with_accelerated_time();\n        let metastore = metastore_for_test();\n        let topic1 = append_random_suffix(\"test-pulsar-source--topic-ingestion--topic\");\n        let topic2 = append_random_suffix(\"test-pulsar-source--topic-ingestion--topic\");\n\n        let index_id = append_random_suffix(\"test-pulsar-source--topic-ingestion--index\");\n        let (source_id, source_config) = get_source_config([&topic1, &topic2]);\n\n        let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n\n        let (source_handle, doc_processor_inbox) = create_source(\n            &universe,\n            metastore,\n            index_uid.clone(),\n            
source_config,\n            SourceCheckpoint::default(),\n        )\n        .await\n        .expect(\"Create source\");\n\n        let expected_docs = populate_topic([&topic1, &topic2], 0..10, message_generator)\n            .await\n            .unwrap();\n\n        let mut combined_messages = expected_docs\n            .iter()\n            .flat_map(|v| &v.messages)\n            .cloned()\n            .collect::<Vec<_>>();\n        combined_messages.sort();\n\n        let exit_state = wait_for_completion(\n            source_handle,\n            combined_messages.len(),\n            PartitionId::from(topic1.clone()),\n            expected_docs[0].expected_position.clone(),\n        )\n        .await;\n        let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert!(!messages.is_empty());\n\n        let batch = merge_doc_batches(messages);\n        assert_eq!(batch.docs, combined_messages);\n        assert_eq!(\n            batch.checkpoint_delta,\n            checkpoints! 
{\n                topic1.as_str() => expected_docs[0].expected_position.clone(),\n                topic2.as_str() => expected_docs[1].expected_position.clone(),\n            }\n        );\n\n        let num_bytes = expected_docs[0].num_bytes() + expected_docs[1].num_bytes();\n        let expected_state = json!({\n            \"index_id\": index_id,\n            \"source_id\": source_id,\n            \"topics\": vec![topic1, topic2],\n            \"subscription_name\": subscription_name(&index_uid, &source_id),\n            \"consumer_name\": CLIENT_NAME,\n            \"num_bytes_processed\": num_bytes,\n            \"num_messages_processed\": 20,\n            \"num_invalid_messages\": 0,\n        });\n        assert_eq!(exit_state, expected_state);\n    }\n\n    #[tokio::test]\n    async fn test_partitioned_topic_single_consumer_ingestion() {\n        let universe = Universe::with_accelerated_time();\n        let metastore = metastore_for_test();\n        let topic = append_random_suffix(\"test-pulsar-source--partitioned-single-consumer--topic\");\n\n        let index_id =\n            append_random_suffix(\"test-pulsar-source--partitioned-single-consumer--index\");\n        let (source_id, source_config) = get_source_config([&topic]);\n\n        create_partitioned_topic(&topic, 2).await;\n        let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n\n        let (source_handle, doc_processor_inbox) = create_source(\n            &universe,\n            metastore,\n            index_uid.clone(),\n            source_config,\n            SourceCheckpoint::default(),\n        )\n        .await\n        .expect(\"Create source\");\n\n        let expected_docs = populate_topic([&topic], 0..10, message_generator)\n            .await\n            .unwrap();\n\n        let exit_state = wait_for_completion(\n            source_handle,\n            expected_docs.len(),\n            PartitionId::from(topic.clone()),\n            
expected_docs[0].expected_position.clone(),\n        )\n        .await;\n        let messages: Vec<RawDocBatch> = doc_processor_inbox.drain_for_test_typed();\n        assert!(!messages.is_empty());\n\n        let batch = merge_doc_batches(messages);\n        assert_eq!(batch.docs, expected_docs[0].messages);\n\n        let num_bytes = expected_docs[0].num_bytes();\n        let expected_state = json!({\n            \"index_id\": index_id,\n            \"source_id\": source_id,\n            \"topics\": vec![topic],\n            \"subscription_name\": subscription_name(&index_uid, &source_id),\n            \"consumer_name\": CLIENT_NAME,\n            \"num_bytes_processed\": num_bytes,\n            \"num_messages_processed\": 10,\n            \"num_invalid_messages\": 0,\n        });\n        assert_eq!(exit_state, expected_state);\n    }\n\n    #[tokio::test]\n    async fn test_partitioned_topic_multi_consumer_ingestion() {\n        let universe = Universe::with_accelerated_time();\n        let metastore = metastore_for_test();\n        let topic = append_random_suffix(\"test-pulsar-source--partitioned-multi-consumer--topic\");\n\n        let index_id =\n            append_random_suffix(\"test-pulsar-source--partitioned-multi-consumer--index\");\n        let (source_id, source_config) = get_source_config([&topic]);\n\n        create_partitioned_topic(&topic, 2).await;\n        let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n\n        let topic_partition_1 = format!(\"{topic}-partition-0\");\n        let topic_partition_2 = format!(\"{topic}-partition-1\");\n\n        let (source_handle1, doc_processor_inbox1) = create_source(\n            &universe,\n            metastore.clone(),\n            index_uid.clone(),\n            source_config.clone(),\n            SourceCheckpoint::default(),\n        )\n        .await\n        .expect(\"Create source\");\n\n        let (source_handle2, doc_processor_inbox2) = create_source(\n       
     &universe,\n            metastore,\n            index_uid.clone(),\n            source_config,\n            SourceCheckpoint::default(),\n        )\n        .await\n        .expect(\"Create source\");\n\n        let expected_docs = populate_topic(\n            [&topic_partition_1, &topic_partition_2],\n            0..10,\n            message_generator,\n        )\n        .await\n        .unwrap();\n\n        let exit_state1 = wait_for_completion(\n            source_handle1,\n            10,\n            PartitionId::from(topic_partition_1.clone()),\n            expected_docs[0].expected_position.clone(),\n        )\n        .await;\n        let exit_state2 = wait_for_completion(\n            source_handle2,\n            10,\n            PartitionId::from(topic_partition_2.clone()),\n            expected_docs[1].expected_position.clone(),\n        )\n        .await;\n        let messages1: Vec<RawDocBatch> = doc_processor_inbox1.drain_for_test_typed();\n        assert!(!messages1.is_empty());\n        let messages2: Vec<RawDocBatch> = doc_processor_inbox2.drain_for_test_typed();\n        assert!(!messages2.is_empty());\n\n        let batch1 = merge_doc_batches(messages1);\n        assert_eq!(batch1.docs, expected_docs[0].messages);\n\n        let batch2 = merge_doc_batches(messages2);\n        assert_eq!(batch2.docs, expected_docs[1].messages);\n\n        let num_bytes = expected_docs[1].num_bytes();\n        let expected_state = json!({\n            \"index_id\": index_id,\n            \"source_id\": source_id,\n            \"topics\": vec![topic],\n            \"subscription_name\": subscription_name(&index_uid, &source_id),\n            \"consumer_name\": CLIENT_NAME,\n            \"num_bytes_processed\": num_bytes,\n            \"num_messages_processed\": 10,\n            \"num_invalid_messages\": 0,\n        });\n        assert_eq!(exit_state1, expected_state);\n        assert_eq!(exit_state2, expected_state);\n    }\n\n    #[tokio::test]\n    async fn 
test_partitioned_topic_multi_consumer_ingestion_with_failover() {\n        // We test successive failures of one source and observe pulsar failover mechanism.\n        quickwit_common::setup_logging_for_tests();\n        let universe = Universe::new();\n        let metastore = metastore_for_test();\n        let topic =\n            append_random_suffix(\"test-pulsar-source--partitioned-multi-consumer-failure--topic\");\n\n        let index_id =\n            append_random_suffix(\"test-pulsar-source--partitioned-multi-consumer-failure--index\");\n        let (_, source_config) = get_source_config([&topic]);\n\n        create_partitioned_topic(&topic, 2).await;\n        let index_uid = setup_index(metastore.clone(), &index_id, &source_config, &[]).await;\n\n        let topic_partition_1 = format!(\"{topic}-partition-0\");\n        let topic_partition_2 = format!(\"{topic}-partition-1\");\n\n        let (_source_handle1, doc_processor_inbox1) = create_source(\n            &universe,\n            metastore.clone(),\n            index_uid.clone(),\n            source_config.clone(),\n            SourceCheckpoint::default(),\n        )\n        .await\n        .expect(\"Create source\");\n\n        // Send 10 messages on each topic and kill the source 5 times.\n        for idx in 0..5 {\n            let (source_handle2, _) = create_source(\n                &universe,\n                metastore.clone(),\n                index_uid.clone(),\n                source_config.clone(),\n                SourceCheckpoint::default(),\n            )\n            .await\n            .expect(\"Create source\");\n            populate_topic(\n                [&topic_partition_1, &topic_partition_2],\n                idx * 10..(idx + 1) * 10,\n                message_generator,\n            )\n            .await\n            .unwrap();\n            tokio::time::sleep(*HEARTBEAT * 5).await;\n            source_handle2.kill().await;\n        }\n\n        let messages1: Vec<RawDocBatch> = 
doc_processor_inbox1.drain_for_test_typed();\n        assert!(!messages1.is_empty());\n        let num_docs_sent_to_doc_processor: usize =\n            messages1.iter().map(|batch| batch.docs.len()).sum();\n        assert_eq!(100, num_docs_sent_to_doc_processor);\n        // Check that we have received all the messages without duplicates.\n        assert_eq!(100, count_unique_messages_in_batches(&messages1));\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/coordinator.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse itertools::Itertools;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_common::rate_limited_error;\nuse quickwit_config::{FileSourceMessageType, FileSourceSqs};\nuse quickwit_metastore::checkpoint::SourceCheckpoint;\nuse quickwit_proto::indexing::IndexingPipelineId;\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::SourceUid;\nuse quickwit_storage::StorageResolver;\nuse serde::Serialize;\nuse ulid::Ulid;\n\nuse super::Queue;\nuse super::helpers::QueueReceiver;\nuse super::local_state::QueueLocalState;\nuse super::message::{MessageType, PreProcessingError, ReadyMessage};\nuse super::shared_state::{QueueSharedState, checkpoint_messages};\nuse super::visibility::{VisibilitySettings, spawn_visibility_task};\nuse crate::actors::DocProcessor;\nuse crate::models::{NewPublishLock, NewPublishToken, PublishLock};\nuse crate::source::{SourceContext, SourceRuntime};\n\n/// Maximum duration that the `emit_batches()` callback can wait for\n/// `queue.receive()` calls. If too small, the actor loop will spin\n/// un-necessarily. 
If too large, the actor loop will be slow to react to new\n/// messages (or shutdown).\npub const RECEIVE_POLL_TIMEOUT: Duration = Duration::from_millis(500);\n\n#[derive(Default, Serialize)]\npub struct QueueCoordinatorObservableState {\n    /// Number of bytes processed by the source.\n    pub num_bytes_processed: u64,\n    /// Number of lines processed by the source.\n    pub num_lines_processed: u64,\n    /// Number of messages processed by the source.\n    pub num_messages_processed: u64,\n    /// Number of messages that could not be pre-processed.\n    pub num_messages_failed_preprocessing: u64,\n    /// Number of messages that could not be moved to in-progress.\n    pub num_messages_failed_opening: u64,\n}\n\n/// The `QueueCoordinator` fetches messages from a queue, converts them into\n/// record batches, and tracks the messages' state until their entire content is\n/// published. Its API closely resembles the [`crate::source::Source`] trait,\n/// making the implementation of queue sources straightforward.\npub struct QueueCoordinator {\n    storage_resolver: StorageResolver,\n    pipeline_id: IndexingPipelineId,\n    source_type: SourceType,\n    queue: Arc<dyn Queue>,\n    queue_receiver: QueueReceiver,\n    observable_state: QueueCoordinatorObservableState,\n    message_type: MessageType,\n    publish_lock: PublishLock,\n    shared_state: QueueSharedState,\n    local_state: QueueLocalState,\n    publish_token: String,\n    visibility_settings: VisibilitySettings,\n}\n\nimpl fmt::Debug for QueueCoordinator {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"QueueTracker\")\n            .field(\"index_id\", &self.pipeline_id.index_uid.index_id)\n            .field(\"queue\", &self.queue)\n            .finish()\n    }\n}\n\nimpl QueueCoordinator {\n    pub fn new(\n        source_runtime: SourceRuntime,\n        queue: Arc<dyn Queue>,\n        message_type: MessageType,\n        shard_max_age: 
Option<Duration>,\n        shard_max_count: Option<u32>,\n        shard_pruning_interval: Duration,\n    ) -> Self {\n        Self {\n            shared_state: QueueSharedState::new(\n                source_runtime.metastore,\n                SourceUid {\n                    index_uid: source_runtime.pipeline_id.index_uid.clone(),\n                    source_id: source_runtime.pipeline_id.source_id.clone(),\n                },\n                Duration::from_secs(2 * source_runtime.indexing_setting.commit_timeout_secs as u64),\n                shard_max_age,\n                shard_max_count,\n                shard_pruning_interval,\n            ),\n            local_state: QueueLocalState::default(),\n            pipeline_id: source_runtime.pipeline_id,\n            source_type: source_runtime.source_config.source_type(),\n            storage_resolver: source_runtime.storage_resolver,\n            queue_receiver: QueueReceiver::new(queue.clone(), RECEIVE_POLL_TIMEOUT),\n            queue,\n            observable_state: QueueCoordinatorObservableState::default(),\n            message_type,\n            publish_lock: PublishLock::default(),\n            publish_token: Ulid::new().to_string(),\n            visibility_settings: VisibilitySettings::from_commit_timeout(\n                source_runtime.indexing_setting.commit_timeout_secs,\n            ),\n        }\n    }\n\n    #[cfg(feature = \"sqs\")]\n    pub async fn try_from_sqs_config(\n        config: FileSourceSqs,\n        source_runtime: SourceRuntime,\n    ) -> anyhow::Result<Self> {\n        use super::sqs_queue::SqsQueue;\n        let queue = SqsQueue::try_new(config.queue_url).await?;\n        let message_type = match config.message_type {\n            FileSourceMessageType::S3Notification => MessageType::S3Notification,\n            FileSourceMessageType::RawUri => MessageType::RawUri,\n        };\n        let shard_max_age = Duration::from_secs(config.deduplication_window_duration_secs as u64);\n        
Ok(QueueCoordinator::new(\n            source_runtime,\n            Arc::new(queue),\n            message_type,\n            Some(shard_max_age),\n            Some(config.deduplication_window_max_messages),\n            Duration::from_secs(config.deduplication_cleanup_interval_secs as u64),\n        ))\n    }\n\n    pub async fn initialize(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<(), ActorExitStatus> {\n        let publish_lock = self.publish_lock.clone();\n        ctx.send_message(doc_processor_mailbox, NewPublishLock(publish_lock))\n            .await?;\n        ctx.send_message(\n            doc_processor_mailbox,\n            NewPublishToken(self.publish_token.clone()),\n        )\n        .await?;\n        Ok(())\n    }\n\n    /// Polls messages from the queue and prepares them for processing\n    async fn poll_messages(&mut self, ctx: &SourceContext) -> Result<(), ActorExitStatus> {\n        let raw_messages = self\n            .queue_receiver\n            .receive(1, self.visibility_settings.deadline_for_receive)\n            .await?;\n\n        let mut format_errors = Vec::new();\n        let mut discardable_ack_ids = Vec::new();\n        let mut preprocessed_messages = Vec::new();\n        for message in raw_messages {\n            match message.pre_process(self.message_type) {\n                Ok(preprocessed_message) => preprocessed_messages.push(preprocessed_message),\n                Err(PreProcessingError::UnexpectedFormat(err)) => format_errors.push(err),\n                Err(PreProcessingError::Discardable { ack_id }) => discardable_ack_ids.push(ack_id),\n            }\n        }\n        if !format_errors.is_empty() {\n            self.observable_state.num_messages_failed_preprocessing += format_errors.len() as u64;\n            rate_limited_error!(\n                limit_per_min = 10,\n                count = format_errors.len(),\n                last_err = 
?format_errors.last().unwrap(),\n                \"invalid messages not processed, use a dead letter queue to limit retries\"\n            );\n        }\n        if preprocessed_messages.is_empty() {\n            self.queue.acknowledge(&discardable_ack_ids).await?;\n            return Ok(());\n        }\n\n        // in rare situations, there might be duplicates within a batch\n        let deduplicated_messages = preprocessed_messages\n            .into_iter()\n            .unique_by(|x| x.partition_id());\n\n        let mut untracked_locally = Vec::new();\n        let mut already_completed = Vec::new();\n        for message in deduplicated_messages {\n            let partition_id = message.partition_id();\n            if self.local_state.is_completed(&partition_id) {\n                already_completed.push(message);\n            } else if !self.local_state.is_tracked(&partition_id) {\n                untracked_locally.push(message);\n            }\n        }\n\n        let checkpointed_messages = checkpoint_messages(\n            &mut self.shared_state,\n            &self.publish_token,\n            untracked_locally,\n        )\n        .await?;\n\n        let mut ready_messages = Vec::new();\n        for (message, position) in checkpointed_messages {\n            if position.is_eof() {\n                self.local_state.mark_completed(message.partition_id());\n                already_completed.push(message);\n            } else {\n                ready_messages.push(ReadyMessage {\n                    visibility_handle: spawn_visibility_task(\n                        ctx,\n                        self.queue.clone(),\n                        message.metadata.ack_id.clone(),\n                        message.metadata.initial_deadline,\n                        self.visibility_settings.clone(),\n                    ),\n                    content: message,\n                    position,\n                })\n            }\n        }\n\n        
self.local_state.set_ready_for_read(ready_messages);\n\n        // Acknowledge messages that already have been processed\n        let mut ack_ids = already_completed\n            .iter()\n            .map(|msg| msg.metadata.ack_id.clone())\n            .collect::<Vec<_>>();\n        ack_ids.append(&mut discardable_ack_ids);\n        self.queue.acknowledge(&ack_ids).await?;\n\n        Ok(())\n    }\n\n    pub async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        if let Some(in_progress_ref) = self.local_state.read_in_progress_mut() {\n            // TODO: should we kill the publish lock if the message visibility extension failed?\n            let batch_builder = in_progress_ref\n                .batch_reader\n                .read_batch(ctx.progress(), self.source_type)\n                .await?;\n            self.observable_state.num_lines_processed += batch_builder.docs.len() as u64;\n            self.observable_state.num_bytes_processed += batch_builder.num_bytes;\n            doc_processor_mailbox\n                .send_message(batch_builder.build())\n                .await?;\n            if in_progress_ref.batch_reader.is_eof() {\n                self.local_state.drop_currently_read().await?;\n                self.observable_state.num_messages_processed += 1;\n            }\n        } else if let Some(ready_message) = self.local_state.get_ready_for_read() {\n            match ready_message.start_processing(&self.storage_resolver).await {\n                Ok(new_in_progress) => {\n                    self.local_state.set_currently_read(new_in_progress)?;\n                }\n                Err(err) => {\n                    self.observable_state.num_messages_failed_opening += 1;\n                    rate_limited_error!(\n                        limit_per_min = 5,\n                        err = ?err,\n                        \"failed to 
start message processing\"\n                    );\n                }\n            }\n        } else {\n            self.poll_messages(ctx).await?;\n        }\n\n        Ok(Duration::ZERO)\n    }\n\n    pub async fn suggest_truncate(\n        &mut self,\n        checkpoint: SourceCheckpoint,\n        _ctx: &SourceContext,\n    ) -> anyhow::Result<()> {\n        let committed_partition_ids = checkpoint\n            .iter()\n            .filter(|(_, pos)| pos.is_eof())\n            .map(|(pid, _)| pid)\n            .collect::<Vec<_>>();\n        let mut completed = Vec::new();\n        for partition_id in committed_partition_ids {\n            let ack_id_opt = self.local_state.mark_completed(partition_id);\n            if let Some(ack_id) = ack_id_opt {\n                completed.push(ack_id);\n            }\n        }\n        self.queue.acknowledge(&completed).await\n    }\n\n    pub fn observable_state(&self) -> &QueueCoordinatorObservableState {\n        &self.observable_state\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::str::FromStr;\n\n    use quickwit_actors::{ActorContext, Universe};\n    use quickwit_common::uri::Uri;\n    use quickwit_proto::types::{NodeId, PipelineUid, Position};\n    use tokio::sync::watch;\n    use ulid::Ulid;\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::doc_file_reader::file_test_helpers::{DUMMY_DOC, generate_dummy_doc_file};\n    use crate::source::queue_sources::memory_queue::MemoryQueueForTests;\n    use crate::source::queue_sources::message::PreProcessedPayload;\n    use crate::source::queue_sources::shared_state::shared_state_for_tests::init_state;\n    use crate::source::{BATCH_NUM_BYTES_LIMIT, SourceActor};\n\n    fn setup_coordinator(\n        queue: Arc<MemoryQueueForTests>,\n        shared_state: QueueSharedState,\n    ) -> QueueCoordinator {\n        let pipeline_id = IndexingPipelineId {\n            node_id: NodeId::from_str(\"test-node\").unwrap(),\n            index_uid: 
shared_state.source_uid.index_uid.clone(),\n            source_id: shared_state.source_uid.source_id.clone(),\n            pipeline_uid: PipelineUid::random(),\n        };\n\n        QueueCoordinator {\n            local_state: QueueLocalState::default(),\n            shared_state,\n            pipeline_id,\n            observable_state: QueueCoordinatorObservableState::default(),\n            publish_lock: PublishLock::default(),\n            // set a very high chunking timeout to make it possible to count the\n            // number of iterations required to process messages\n            queue_receiver: QueueReceiver::new(queue.clone(), Duration::from_secs(10)),\n            queue,\n            message_type: MessageType::RawUri,\n            source_type: SourceType::Unspecified,\n            storage_resolver: StorageResolver::for_test(),\n            publish_token: Ulid::new().to_string(),\n            visibility_settings: VisibilitySettings::from_commit_timeout(5),\n        }\n    }\n\n    async fn process_messages(\n        coordinator: &mut QueueCoordinator,\n        queue: Arc<MemoryQueueForTests>,\n        messages: &[(&Uri, &str)],\n    ) -> Vec<RawDocBatch> {\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox::<SourceActor>();\n        let (doc_processor_mailbox, doc_processor_inbox) =\n            universe.create_test_mailbox::<DocProcessor>();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n\n        coordinator\n            .initialize(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        coordinator\n            .emit_batches(&doc_processor_mailbox, &ctx)\n            .await\n            .unwrap();\n\n        for (uri, ack_id) in messages {\n            
queue.send_message(uri.to_string(), ack_id);\n        }\n\n        // Need 3 iterations for each msg to emit the first batch (receive,\n        // start, emit), assuming the `QueueReceiver` doesn't chunk the receive\n        // future.\n        for _ in 0..(messages.len() * 4) {\n            coordinator\n                .emit_batches(&doc_processor_mailbox, &ctx)\n                .await\n                .unwrap();\n        }\n\n        let batches = doc_processor_inbox\n            .drain_for_test()\n            .into_iter()\n            .flat_map(|box_any| box_any.downcast::<RawDocBatch>().ok())\n            .map(|box_raw_doc_batch| *box_raw_doc_batch)\n            .collect::<Vec<_>>();\n        universe.assert_quit().await;\n        batches\n    }\n\n    #[tokio::test]\n    async fn test_process_empty_queue() {\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\"test-index\", Default::default());\n        let mut coordinator = setup_coordinator(queue.clone(), shared_state);\n        let batches = process_messages(&mut coordinator, queue, &[]).await;\n        assert_eq!(batches.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_process_one_small_message() {\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\"test-index\", Default::default());\n        let mut coordinator = setup_coordinator(queue.clone(), shared_state.clone());\n        let (dummy_doc_file, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri = Uri::from_str(dummy_doc_file.path().to_str().unwrap()).unwrap();\n        let partition_id = PreProcessedPayload::ObjectUri(test_uri.clone()).partition_id();\n        let batches = process_messages(&mut coordinator, queue, &[(&test_uri, \"ack-id\")]).await;\n        assert_eq!(batches.len(), 1);\n        assert_eq!(batches[0].docs.len(), 10);\n        assert!(coordinator.local_state.is_awaiting_commit(&partition_id));\n    }\n\n    
#[tokio::test]\n    async fn test_process_one_big_message() {\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\"test-index\", Default::default());\n        let mut coordinator = setup_coordinator(queue.clone(), shared_state);\n        let lines = BATCH_NUM_BYTES_LIMIT as usize / DUMMY_DOC.len() + 1;\n        let (dummy_doc_file, _) = generate_dummy_doc_file(true, lines).await;\n        let test_uri = Uri::from_str(dummy_doc_file.path().to_str().unwrap()).unwrap();\n        let batches = process_messages(&mut coordinator, queue, &[(&test_uri, \"ack-id\")]).await;\n        assert_eq!(batches.len(), 2);\n        assert_eq!(batches.iter().map(|b| b.docs.len()).sum::<usize>(), lines);\n    }\n\n    #[tokio::test]\n    async fn test_process_two_messages_different_compression() {\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\"test-index\", Default::default());\n        let mut coordinator = setup_coordinator(queue.clone(), shared_state);\n        let (dummy_doc_file_1, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri_1 = Uri::from_str(dummy_doc_file_1.path().to_str().unwrap()).unwrap();\n        let (dummy_doc_file_2, _) = generate_dummy_doc_file(true, 10).await;\n        let test_uri_2 = Uri::from_str(dummy_doc_file_2.path().to_str().unwrap()).unwrap();\n        let batches = process_messages(\n            &mut coordinator,\n            queue,\n            &[(&test_uri_1, \"ack-id-1\"), (&test_uri_2, \"ack-id-2\")],\n        )\n        .await;\n        // could be generated in 1 or 2 batches, it doesn't matter\n        assert_eq!(batches.iter().map(|b| b.docs.len()).sum::<usize>(), 20);\n    }\n\n    #[tokio::test]\n    async fn test_process_local_duplicate_message() {\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\"test-index\", Default::default());\n        let mut coordinator = 
setup_coordinator(queue.clone(), shared_state);\n        let (dummy_doc_file, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri = Uri::from_str(dummy_doc_file.path().to_str().unwrap()).unwrap();\n        let batches = process_messages(\n            &mut coordinator,\n            queue,\n            &[(&test_uri, \"ack-id-1\"), (&test_uri, \"ack-id-2\")],\n        )\n        .await;\n        assert_eq!(batches.len(), 1);\n        assert_eq!(batches.iter().map(|b| b.docs.len()).sum::<usize>(), 10);\n    }\n\n    #[tokio::test]\n    async fn test_process_shared_complete_message() {\n        let (dummy_doc_file, file_size) = generate_dummy_doc_file(false, 10).await;\n        let test_uri = Uri::from_str(dummy_doc_file.path().to_str().unwrap()).unwrap();\n        let partition_id = PreProcessedPayload::ObjectUri(test_uri.clone()).partition_id();\n\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\n            \"test-index\",\n            &[(\n                partition_id.clone(),\n                (\n                    \"existing_token\".to_string(),\n                    Position::eof(file_size),\n                    false,\n                ),\n            )],\n        );\n        let mut coordinator = setup_coordinator(queue.clone(), shared_state.clone());\n\n        assert!(!coordinator.local_state.is_tracked(&partition_id));\n        let batches = process_messages(&mut coordinator, queue, &[(&test_uri, \"ack-id-1\")]).await;\n        assert_eq!(batches.len(), 0);\n        assert!(coordinator.local_state.is_completed(&partition_id));\n    }\n\n    #[tokio::test]\n    async fn test_process_existing_messages() {\n        let (dummy_doc_file_1, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri_1 = Uri::from_str(dummy_doc_file_1.path().to_str().unwrap()).unwrap();\n        let partition_id_1 = PreProcessedPayload::ObjectUri(test_uri_1.clone()).partition_id();\n\n        let 
(dummy_doc_file_2, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri_2 = Uri::from_str(dummy_doc_file_2.path().to_str().unwrap()).unwrap();\n        let partition_id_2 = PreProcessedPayload::ObjectUri(test_uri_2.clone()).partition_id();\n\n        let (dummy_doc_file_3, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri_3 = Uri::from_str(dummy_doc_file_3.path().to_str().unwrap()).unwrap();\n        let partition_id_3 = PreProcessedPayload::ObjectUri(test_uri_3.clone()).partition_id();\n\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\n            \"test-index\",\n            &[\n                (\n                    partition_id_1.clone(),\n                    (\"existing_token_1\".to_string(), Position::Beginning, true),\n                ),\n                (\n                    partition_id_2.clone(),\n                    (\n                        \"existing_token_2\".to_string(),\n                        Position::offset((DUMMY_DOC.len() + 1) * 2),\n                        true,\n                    ),\n                ),\n                (\n                    partition_id_3.clone(),\n                    (\n                        \"existing_token_3\".to_string(),\n                        Position::offset((DUMMY_DOC.len() + 1) * 6),\n                        false, // should not be processed because not stale yet\n                    ),\n                ),\n            ],\n        );\n        let mut coordinator = setup_coordinator(queue.clone(), shared_state.clone());\n        let batches = process_messages(\n            &mut coordinator,\n            queue,\n            &[\n                (&test_uri_1, \"ack-id-1\"),\n                (&test_uri_2, \"ack-id-2\"),\n                (&test_uri_3, \"ack-id-3\"),\n            ],\n        )\n        .await;\n        assert_eq!(batches.len(), 2);\n        assert_eq!(batches.iter().map(|b| b.docs.len()).sum::<usize>(), 18);\n    
    assert!(coordinator.local_state.is_awaiting_commit(&partition_id_1));\n        assert!(coordinator.local_state.is_awaiting_commit(&partition_id_2));\n    }\n\n    #[tokio::test]\n    async fn test_process_multiple_coordinator() {\n        let queue = Arc::new(MemoryQueueForTests::new());\n        let shared_state = init_state(\"test-index\", Default::default());\n        let mut coord_1 = setup_coordinator(queue.clone(), shared_state.clone());\n        let mut coord_2 = setup_coordinator(queue.clone(), shared_state.clone());\n        let (dummy_doc_file, _) = generate_dummy_doc_file(false, 10).await;\n        let test_uri = Uri::from_str(dummy_doc_file.path().to_str().unwrap()).unwrap();\n        let partition_id = PreProcessedPayload::ObjectUri(test_uri.clone()).partition_id();\n\n        let batches_1 = process_messages(&mut coord_1, queue.clone(), &[(&test_uri, \"ack1\")]).await;\n        let batches_2 = process_messages(&mut coord_2, queue, &[(&test_uri, \"ack2\")]).await;\n\n        assert_eq!(batches_1.len(), 1);\n        assert_eq!(batches_1[0].docs.len(), 10);\n        assert!(coord_1.local_state.is_awaiting_commit(&partition_id));\n        // proc_2 learns from shared state that the message is likely still\n        // being processed and skips it\n        assert_eq!(batches_2.len(), 0);\n        assert!(!coord_2.local_state.is_tracked(&partition_id));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/design.md",
    "content": "# Queue source design\n\n## Exactly once\n\nBesides the usual failures that can happen during indexing, most queues are also subject to duplicates. To ensure that all object files are indexed exactly once, we track the progress of their indexing using the shard table:\n- each file object is tracked as a shard, the file URI is the shard ID\n- progress made on the indexing of a given shard is committed in the shard table in a common transaction with the split publishing\n- after some time (called the deduplication window) shards are garbage collected to keep the size of the shard table small\n\n## Visibility extension task\n\nTo keep messages invisible to other pipelines while they are being processed, each received message spawns a visibility extension task. This task is responsible for extending the visibility timeout each time the visibility deadline approaches. When the last batch is read for the message and sent to the indexing pipeline:\n- a last visibility extension is requested to give time for the indexing to complete (typically twice the commit timeout)\n- the visibility extension task is stopped\n\n## Cleanup of old shards\n\nGarbage collection is owned by the queue-based sources. Each pipeline with a queue source will spawn a garbage collection task. To avoid having an increased load on the metastore as the number of pipelines scales, garbage collection calls are debounced by the control plane.\n\n## Onboarding new queues\n\nThis module is meant to be generic enough to:\n- use other queue implementations, such as GCP Pub/Sub\n- source the data from other sources than object storage, e.g. directly from the message\n\nNote that because every single message is tracked by the metastore, this design will not behave well with high message rates. For instance, it is not meant to be efficient with a data stream where every message contains a single event. 
As a rule of thumb, to protect the metastore, processing more than 50 messages per second with this design is discouraged. This means that high throughput can only be achieved with larger contents for each message (e.g. larger files when using the file source with queue notifications).\n\n## Implementation\n\nThe `QueueCoordinator` is a concrete implementation of the machinery necessary to consume data from a queue, from the message reception to its acknowledgment after indexing. The `QueueCoordinator` interacts with 3 main components.\n\n### The `Queue`\n\nThe `Queue` is an abstract interface that can represent any queue implementation (AWS SQS, Google Pub/Sub...). It is sufficient that the queue guarantees at-least-once delivery of its messages. The abstraction reduces the actual queue's API surface to 3 main functions:\n- receive messages that are ready to be processed\n- extend their visibility timeout, i.e. delay the time at which a message is visible again to other consumers\n- acknowledge messages, i.e. delete them definitively from the queue after successful indexing\n\n### The `QueueLocalState`\n\nThe local state is an in-memory data structure that keeps track of the knowledge that the current source has of recently received messages. It manages the transitions of messages between 4 states:\n- ready for read\n- read in progress\n- awaiting commit\n- completed\n\n### The `QueueSharedState`\n\nThe shared state is a client of the Shard API, a metastore API that was mainly designed to serve ingest V2. The Shard API improves on the previous checkpoint API, which was stored as a blob in one of the fields of the index model. The flow is as follows:\n\nThe queue source opens a shard, using an ID that uniquely identifies the content of the message as the shard ID. For the file source, the shard ID is the file URI. Each source has a unique publish token that is provided in the `OpenShards` metastore request. 
The response to the `OpenShards` request returns the token of the caller that called the API first. Either:\n- The returned token matches the current pipeline's token. This means that we have the ownership of this message content and can proceed with its indexing\n- The returned token does not match the current pipeline's token. This means that another pipeline has the ownership. In that case, we look at the content of the shard:\n  - if it's already completely processed (EOF), we acknowledge the message and drop it\n  - if its last update timestamp is old (e.g. twice the commit timeout), we assume the processing of the content to be stale (e.g. the owning pipeline failed). We perform an `AcquireShards` call to update the shard's token in the metastore with the local one. This signals to subsequent attempts to process the shard that this pipeline now owns it. Note, though, that this is subject to a race condition: 2 pipelines might acquire the shard concurrently. In that case, both pipelines will assume that they own the shard, and one of them will fail at commit time.\n  - if its last update timestamp is recent, we assume that the processing of the content is still in progress in another pipeline. We just drop the message (without any acknowledgment) and let it be re-processed once its visibility timeout expires.\n\nThe `QueueSharedState` also owns the background task that will periodically initiate a call to `PruneShards` to garbage collect old shards.\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/helpers.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse futures::future::BoxFuture;\n\nuse super::Queue;\nuse super::message::RawMessage;\n\ntype ReceiveResult = anyhow::Result<Vec<RawMessage>>;\n\n/// A stateful wrapper around a `Queue` that chunks the slow `receive()` call\n/// into shorter iterations. This enables yielding back to the actor system\n/// without compromising on queue poll durations. Without this, an actor that\n/// tries to receive messages from a `Queue` will be blocked for multiple seconds\n/// before being able to process new mailbox messages (or shutting down).\npub struct QueueReceiver {\n    queue: Arc<dyn Queue>,\n    receive: Option<BoxFuture<'static, ReceiveResult>>,\n    iteration: Duration,\n}\n\nimpl QueueReceiver {\n    pub fn new(queue: Arc<dyn Queue>, iteration: Duration) -> Self {\n        Self {\n            queue,\n            receive: None,\n            iteration,\n        }\n    }\n\n    pub async fn receive(\n        &mut self,\n        max_messages: usize,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Vec<RawMessage>> {\n        if self.receive.is_none() {\n            self.receive = Some(self.queue.clone().receive(max_messages, suggested_deadline));\n        }\n        tokio::select! 
{\n            res = self.receive.as_mut().unwrap() => {\n                self.receive = None;\n                res\n            }\n            _ = tokio::time::sleep(self.iteration) => {\n                Ok(Vec::new())\n            }\n\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::{Duration, Instant};\n\n    use anyhow::bail;\n    use async_trait::async_trait;\n\n    use super::*;\n\n    #[derive(Clone, Debug)]\n    struct SleepyQueue {\n        receive_sleep: Duration,\n    }\n\n    #[async_trait]\n    impl Queue for SleepyQueue {\n        async fn receive(\n            self: Arc<Self>,\n            _max_messages: usize,\n            _suggested_deadline: Duration,\n        ) -> anyhow::Result<Vec<RawMessage>> {\n            tokio::time::sleep(self.receive_sleep).await;\n            bail!(\"Waking up from my nap\")\n        }\n\n        async fn acknowledge(&self, _ack_ids: &[String]) -> anyhow::Result<()> {\n            unimplemented!()\n        }\n\n        async fn modify_deadlines(\n            &self,\n            _ack_id: &str,\n            _suggested_deadline: Duration,\n        ) -> anyhow::Result<Instant> {\n            unimplemented!()\n        }\n    }\n\n    #[tokio::test]\n    async fn test_queue_receiver_slow_receive() {\n        let queue = Arc::new(SleepyQueue {\n            receive_sleep: Duration::from_millis(100),\n        });\n        let mut receiver = QueueReceiver::new(queue, Duration::from_millis(20));\n        let mut iterations = 0;\n        while receiver.receive(1, Duration::from_secs(1)).await.is_ok() {\n            iterations += 1;\n        }\n        assert!(iterations >= 4);\n    }\n\n    #[tokio::test]\n    async fn test_queue_receiver_fast_receive() {\n        let queue = Arc::new(SleepyQueue {\n            receive_sleep: Duration::from_millis(10),\n        });\n        let mut receiver = QueueReceiver::new(queue, Duration::from_millis(50));\n        assert!(receiver.receive(1, 
Duration::from_secs(1)).await.is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/local_state.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, BTreeSet, VecDeque};\n\nuse anyhow::bail;\nuse quickwit_metastore::checkpoint::PartitionId;\n\nuse super::message::{InProgressMessage, ReadyMessage};\n\n/// Tracks the state of the queue messages that are known to the owning indexing\n/// pipeline.\n///\n/// Messages first land in the `ready_for_read` queue. They are then moved to\n/// `read_in_progress` to track the reader's progress. Once the reader reaches\n/// EOF, the message is transitioned as `awaiting_commit`. 
Once the message is\n/// known to be fully indexed and committed (e.g after receiving the\n/// `suggest_truncate` call), it is moved to `completed`.\n#[derive(Default)]\npub struct QueueLocalState {\n    /// Messages that were received from the queue and are ready to be read\n    ready_for_read: VecDeque<ReadyMessage>,\n    /// Message that is currently being read and sent to the `DocProcessor`\n    read_in_progress: Option<InProgressMessage>,\n    /// Partitions that were read and are still being indexed, with their\n    /// associated ack_id\n    awaiting_commit: BTreeMap<PartitionId, String>,\n    /// Partitions that were fully indexed and committed\n    completed: BTreeSet<PartitionId>,\n}\n\nimpl QueueLocalState {\n    pub fn is_ready_for_read(&self, partition_id: &PartitionId) -> bool {\n        self.ready_for_read\n            .iter()\n            .any(|msg| &msg.partition_id() == partition_id)\n    }\n\n    pub fn is_read_in_progress(&self, partition_id: &PartitionId) -> bool {\n        self.read_in_progress\n            .as_ref()\n            .map(|msg| &msg.partition_id == partition_id)\n            .unwrap_or(false)\n    }\n\n    pub fn is_awaiting_commit(&self, partition_id: &PartitionId) -> bool {\n        self.awaiting_commit.contains_key(partition_id)\n    }\n\n    pub fn is_completed(&self, partition_id: &PartitionId) -> bool {\n        self.completed.contains(partition_id)\n    }\n\n    pub fn is_tracked(&self, partition_id: &PartitionId) -> bool {\n        self.is_ready_for_read(partition_id)\n            || self.is_read_in_progress(partition_id)\n            || self.is_awaiting_commit(partition_id)\n            || self.is_completed(partition_id)\n    }\n\n    pub fn set_ready_for_read(&mut self, ready_messages: Vec<ReadyMessage>) {\n        for message in ready_messages {\n            self.ready_for_read.push_back(message)\n        }\n    }\n\n    pub fn get_ready_for_read(&mut self) -> Option<ReadyMessage> {\n        while let Some(msg) = 
self.ready_for_read.pop_front() {\n            // don't return messages for which we didn't manage to extend the\n            // visibility, they will pop up in the queue again anyway\n            if !msg.visibility_handle.extension_failed() {\n                return Some(msg);\n            }\n        }\n        None\n    }\n\n    pub fn read_in_progress_mut(&mut self) -> Option<&mut InProgressMessage> {\n        self.read_in_progress.as_mut()\n    }\n\n    pub async fn drop_currently_read(&mut self) -> anyhow::Result<()> {\n        if let Some(in_progress) = self.read_in_progress.take() {\n            self.awaiting_commit.insert(\n                in_progress.partition_id.clone(),\n                in_progress.visibility_handle.ack_id().to_string(),\n            );\n            in_progress\n                .visibility_handle\n                .request_last_extension()\n                .await?;\n        }\n        Ok(())\n    }\n\n    /// Tries to set the message that is currently being read. Returns an error\n    /// if there is already a message being read.\n    pub fn set_currently_read(\n        &mut self,\n        in_progress: Option<InProgressMessage>,\n    ) -> anyhow::Result<()> {\n        if self.read_in_progress.is_some() {\n            bail!(\"trying to replace in progress message\");\n        }\n        self.read_in_progress = in_progress;\n        Ok(())\n    }\n\n    /// Returns the ack_id if that message was awaiting_commit\n    pub fn mark_completed(&mut self, partition_id: PartitionId) -> Option<String> {\n        let ack_id_opt = self.awaiting_commit.remove(&partition_id);\n        self.completed.insert(partition_id);\n        ack_id_opt\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/memory_queue.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, VecDeque};\nuse std::fmt;\nuse std::sync::{Arc, Mutex};\nuse std::time::{Duration, Instant};\n\nuse anyhow::bail;\nuse async_trait::async_trait;\nuse quickwit_storage::OwnedBytes;\nuse ulid::Ulid;\n\nuse super::Queue;\nuse super::message::{MessageMetadata, RawMessage};\n\n#[derive(Default)]\nstruct InnerState {\n    in_queue: VecDeque<RawMessage>,\n    in_flight: BTreeMap<String, RawMessage>,\n    acked: Vec<RawMessage>,\n}\n\nimpl fmt::Debug for InnerState {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"Queue\")\n            .field(\"in_queue_count\", &self.in_queue.len())\n            .field(\"in_flight_count\", &self.in_flight.len())\n            .field(\"acked_count\", &self.acked.len())\n            .finish()\n    }\n}\n\n/// A simple in-memory queue\n#[derive(Clone, Debug)]\npub struct MemoryQueueForTests {\n    inner_state: Arc<Mutex<InnerState>>,\n    receive_sleep: Duration,\n}\n\nimpl MemoryQueueForTests {\n    pub fn new() -> Self {\n        let inner_state = Arc::new(Mutex::new(InnerState::default()));\n        let inner_weak = Arc::downgrade(&inner_state);\n        tokio::spawn(async move {\n            loop {\n                if let Some(inner_state) = inner_weak.upgrade() {\n                    let mut inner_state = inner_state.lock().unwrap();\n         
           let mut expired = Vec::new();\n                    for (ack_id, msg) in inner_state.in_flight.iter() {\n                        if msg.metadata.initial_deadline < Instant::now() {\n                            expired.push(ack_id.clone());\n                        }\n                    }\n                    for ack_id in expired {\n                        let msg = inner_state.in_flight.remove(&ack_id).unwrap();\n                        inner_state.in_queue.push_back(msg);\n                    }\n                } else {\n                    break;\n                }\n                tokio::time::sleep(Duration::from_millis(50)).await;\n            }\n        });\n        MemoryQueueForTests {\n            inner_state,\n            receive_sleep: Duration::from_millis(50),\n        }\n    }\n\n    pub fn send_message(&self, payload: String, ack_id: &str) {\n        let message = RawMessage {\n            payload: OwnedBytes::new(payload.into_bytes()),\n            metadata: MessageMetadata {\n                ack_id: ack_id.to_string(),\n                delivery_attempts: 0,\n                initial_deadline: Instant::now(),\n                message_id: Ulid::new().to_string(),\n            },\n        };\n        self.inner_state.lock().unwrap().in_queue.push_back(message);\n    }\n\n    /// Returns the next visibility deadline for the message if it is in flight\n    pub fn next_visibility_deadline(&self, ack_id: &str) -> Option<Instant> {\n        let inner_state = self.inner_state.lock().unwrap();\n        inner_state\n            .in_flight\n            .get(ack_id)\n            .map(|msg| msg.metadata.initial_deadline)\n    }\n}\n\n#[async_trait]\nimpl Queue for MemoryQueueForTests {\n    async fn receive(\n        self: Arc<Self>,\n        max_messages: usize,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Vec<RawMessage>> {\n        {\n            let mut inner_state = 
self.inner_state.lock().unwrap();\n            let mut response = Vec::new();\n            while let Some(mut msg) = inner_state.in_queue.pop_front() {\n                msg.metadata.delivery_attempts += 1;\n                msg.metadata.initial_deadline = Instant::now() + suggested_deadline;\n                let msg_cloned = RawMessage {\n                    payload: msg.payload.clone(),\n                    metadata: msg.metadata.clone(),\n                };\n                inner_state\n                    .in_flight\n                    .insert(msg.metadata.ack_id.clone(), msg_cloned);\n                response.push(msg);\n                if response.len() >= max_messages {\n                    break;\n                }\n            }\n            if !response.is_empty() {\n                return Ok(response);\n            }\n        }\n        // `sleep` to avoid using all the CPU when called in a loop\n        tokio::time::sleep(self.receive_sleep).await;\n\n        Ok(vec![])\n    }\n\n    async fn acknowledge(&self, ack_ids: &[String]) -> anyhow::Result<()> {\n        let mut inner_state = self.inner_state.lock().unwrap();\n        for ack_id in ack_ids {\n            if let Some(msg) = inner_state.in_flight.remove(ack_id) {\n                inner_state.acked.push(msg);\n            }\n        }\n        Ok(())\n    }\n\n    async fn modify_deadlines(\n        &self,\n        ack_id: &str,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Instant> {\n        let mut inner_state = self.inner_state.lock().unwrap();\n        let in_flight = inner_state.in_flight.get_mut(ack_id);\n        if let Some(msg) = in_flight {\n            msg.metadata.initial_deadline = Instant::now() + suggested_deadline;\n        } else {\n            bail!(\"ack_id {} not found in in-flight\", ack_id);\n        }\n        return Ok(Instant::now() + suggested_deadline);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    fn prefilled_queue(nb_message: usize) 
-> Arc<MemoryQueueForTests> {\n        let memory_queue = MemoryQueueForTests::new();\n        for i in 0..nb_message {\n            let payload = format!(\"Test message {i}\");\n            let ack_id = i.to_string();\n            memory_queue.send_message(payload.clone(), &ack_id);\n        }\n        Arc::new(memory_queue)\n    }\n\n    #[tokio::test]\n    async fn test_receive_1_by_1() {\n        let memory_queue = prefilled_queue(2);\n        for i in 0..2 {\n            let messages = memory_queue\n                .clone()\n                .receive(1, Duration::from_secs(5))\n                .await\n                .unwrap();\n            assert_eq!(messages.len(), 1);\n            let message = &messages[0];\n            let exp_payload = format!(\"Test message {i}\");\n            let exp_ack_id = i.to_string();\n            assert_eq!(message.payload.as_ref(), exp_payload.as_bytes());\n            assert_eq!(message.metadata.ack_id, exp_ack_id);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_receive_2_by_2() {\n        let memory_queue = prefilled_queue(2);\n        let messages = memory_queue\n            .receive(2, Duration::from_secs(5))\n            .await\n            .unwrap();\n        assert_eq!(messages.len(), 2);\n        for (i, message) in messages.iter().enumerate() {\n            let exp_payload = format!(\"Test message {i}\");\n            let exp_ack_id = i.to_string();\n            assert_eq!(message.payload.as_ref(), exp_payload.as_bytes());\n            assert_eq!(message.metadata.ack_id, exp_ack_id);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_receive_early_if_only_1() {\n        let memory_queue = prefilled_queue(1);\n        let messages = memory_queue\n            .receive(2, Duration::from_secs(5))\n            .await\n            .unwrap();\n        assert_eq!(messages.len(), 1);\n        let message = &messages[0];\n        let exp_payload = \"Test message 0\".to_string();\n        let exp_ack_id = 
\"0\";\n        assert_eq!(message.payload.as_ref(), exp_payload.as_bytes());\n        assert_eq!(message.metadata.ack_id, exp_ack_id);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/message.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse core::fmt;\nuse std::io::read_to_string;\nuse std::str::FromStr;\nuse std::time::Instant;\n\nuse anyhow::Context;\nuse quickwit_common::rate_limited_warn;\nuse quickwit_common::uri::Uri;\nuse quickwit_metastore::checkpoint::PartitionId;\nuse quickwit_proto::types::Position;\nuse quickwit_storage::{OwnedBytes, StorageResolver};\nuse serde_json::Value;\nuse thiserror::Error;\nuse tracing::info;\n\nuse super::visibility::VisibilityTaskHandle;\nuse crate::source::doc_file_reader::ObjectUriBatchReader;\n\n#[derive(Debug, Clone, Copy)]\npub enum MessageType {\n    S3Notification,\n    // GcsNotification,\n    RawUri,\n    // RawData,\n}\n\n#[derive(Debug, Clone, PartialEq, Eq)]\npub struct MessageMetadata {\n    /// The handle that should be used to acknowledge the message or change its visibility deadline\n    pub ack_id: String,\n\n    /// The unique message id assigned by the queue\n    pub message_id: String,\n\n    /// The approximate number of times the message was delivered. 1 means it is\n    /// the first time this message is being delivered.\n    pub delivery_attempts: usize,\n\n    /// The first deadline when the message is received. 
It can be extended later using the ack_id.\n    pub initial_deadline: Instant,\n}\n\n/// The raw messages as received from the queue abstraction\npub struct RawMessage {\n    pub metadata: MessageMetadata,\n    pub payload: OwnedBytes,\n}\n\nimpl fmt::Debug for RawMessage {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"RawMessage\")\n            .field(\"metadata\", &self.metadata)\n            .field(\"payload\", &\"<bytes>\")\n            .finish()\n    }\n}\n\n#[derive(Error, Debug)]\npub enum PreProcessingError {\n    #[error(\"message can be acknowledged without processing\")]\n    Discardable { ack_id: String },\n    #[error(\"unexpected message format: {0}\")]\n    UnexpectedFormat(#[from] anyhow::Error),\n}\n\nimpl RawMessage {\n    pub fn pre_process(\n        self,\n        message_type: MessageType,\n    ) -> Result<PreProcessedMessage, PreProcessingError> {\n        let payload = match message_type {\n            MessageType::S3Notification => PreProcessedPayload::ObjectUri(\n                uri_from_s3_notification(&self.payload, &self.metadata.ack_id)?,\n            ),\n            MessageType::RawUri => {\n                let payload_str = read_to_string(self.payload).context(\"failed to read payload\")?;\n                PreProcessedPayload::ObjectUri(Uri::from_str(&payload_str)?)\n            }\n        };\n        Ok(PreProcessedMessage {\n            metadata: self.metadata,\n            payload,\n        })\n    }\n}\n\n#[derive(Debug, PartialEq, Eq)]\npub enum PreProcessedPayload {\n    /// The message contains an object URI\n    ObjectUri(Uri),\n    // /// The message contains the raw JSON data\n    // RawData(OwnedBytes),\n}\n\nimpl PreProcessedPayload {\n    pub fn partition_id(&self) -> PartitionId {\n        match &self {\n            Self::ObjectUri(uri) => PartitionId::from(uri.as_str()),\n        }\n    }\n}\n\n/// A message that went through the minimal transformation to discover its\n/// 
partition id. Indeed, the message might be discarded if the partition was\n/// already processed, so it's better to avoid doing unnecessary work at this\n/// stage.\n#[derive(Debug, PartialEq, Eq)]\npub struct PreProcessedMessage {\n    pub metadata: MessageMetadata,\n    pub payload: PreProcessedPayload,\n}\n\nimpl PreProcessedMessage {\n    pub fn partition_id(&self) -> PartitionId {\n        self.payload.partition_id()\n    }\n}\n\nfn uri_from_s3_notification(message: &[u8], ack_id: &str) -> Result<Uri, PreProcessingError> {\n    let value: Value = serde_json::from_slice(message).context(\"invalid JSON message\")?;\n    if matches!(value[\"Event\"].as_str(), Some(\"s3:TestEvent\")) {\n        info!(\"discarding S3 test event\");\n        return Err(PreProcessingError::Discardable {\n            ack_id: ack_id.to_string(),\n        });\n    }\n    let event_name = value[\"Records\"][0][\"eventName\"]\n        .as_str()\n        .context(\"invalid S3 notification: Records[0].eventName not found\")?;\n    if !event_name.starts_with(\"ObjectCreated:\") {\n        rate_limited_warn!(\n            limit_per_min = 5,\n            event = event_name,\n            \"only s3:ObjectCreated:* events are supported\"\n        );\n        return Err(PreProcessingError::Discardable {\n            ack_id: ack_id.to_string(),\n        });\n    }\n    let key = value[\"Records\"][0][\"s3\"][\"object\"][\"key\"]\n        .as_str()\n        .context(\"invalid S3 notification: Records[0].s3.object.key not found\")?;\n    let bucket = value[\"Records\"][0][\"s3\"][\"bucket\"][\"name\"]\n        .as_str()\n        .context(\"invalid S3 notification: Records[0].s3.bucket.name not found\")?;\n    let encoded_key = percent_encoding::percent_decode(key.as_bytes())\n        .decode_utf8()\n        .context(\"invalid S3 notification: Records[0].s3.object.key could not be url decoded\")?;\n    Uri::from_str(&format!(\"s3://{bucket}/{encoded_key}\")).map_err(|e| e.into())\n}\n\n/// A message 
for which we know as much of the global processing status as\n/// possible and that is now ready to be processed.\npub struct ReadyMessage {\n    pub position: Position,\n    pub content: PreProcessedMessage,\n    pub visibility_handle: VisibilityTaskHandle,\n}\n\nimpl ReadyMessage {\n    pub async fn start_processing(\n        self,\n        storage_resolver: &StorageResolver,\n    ) -> anyhow::Result<Option<InProgressMessage>> {\n        let partition_id = self.partition_id();\n        match self.content.payload {\n            PreProcessedPayload::ObjectUri(uri) => {\n                let batch_reader = ObjectUriBatchReader::try_new(\n                    storage_resolver,\n                    partition_id.clone(),\n                    &uri,\n                    self.position,\n                )\n                .await?;\n                if batch_reader.is_eof() {\n                    Ok(None)\n                } else {\n                    Ok(Some(InProgressMessage {\n                        batch_reader,\n                        partition_id,\n                        visibility_handle: self.visibility_handle,\n                    }))\n                }\n            }\n        }\n    }\n\n    pub fn partition_id(&self) -> PartitionId {\n        self.content.partition_id()\n    }\n}\n\n/// A message that is actively being read\npub struct InProgressMessage {\n    pub partition_id: PartitionId,\n    pub visibility_handle: VisibilityTaskHandle,\n    pub batch_reader: ObjectUriBatchReader,\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_uri_from_s3_notification_valid() {\n        let test_message = r#\"\n        {\n            \"Records\": [\n                {\n                \"eventVersion\": \"2.1\",\n                \"eventSource\": \"aws:s3\",\n                \"awsRegion\": \"us-west-2\",\n                \"eventTime\": \"2021-05-22T09:22:41.789Z\",\n                \"eventName\": \"ObjectCreated:Put\",\n                
\"userIdentity\": {\n                    \"principalId\": \"AWS:AIDAJDPLRKLG7UEXAMPLE\"\n                },\n                \"requestParameters\": {\n                    \"sourceIPAddress\": \"127.0.0.1\"\n                },\n                \"responseElements\": {\n                    \"x-amz-request-id\": \"C3D13FE58DE4C810\",\n                    \"x-amz-id-2\": \"FMyUVURIx7Zv2cPi/IZb9Fk1/U4QfTaVK5fahHPj/\"\n                },\n                \"s3\": {\n                    \"s3SchemaVersion\": \"1.0\",\n                    \"configurationId\": \"testConfigRule\",\n                    \"bucket\": {\n                        \"name\": \"mybucket\",\n                        \"ownerIdentity\": {\n                            \"principalId\": \"A3NL1KOZZKExample\"\n                        },\n                        \"arn\": \"arn:aws:s3:::mybucket\"\n                    },\n                    \"object\": {\n                        \"key\": \"logs.json\",\n                        \"size\": 1024,\n                        \"eTag\": \"d41d8cd98f00b204e9800998ecf8427e\",\n                        \"versionId\": \"096fKKXTRTtl3on89fVO.nfljtsv6qko\",\n                        \"sequencer\": \"0055AED6DCD90281E5\"\n                    }\n                }\n                }\n            ]\n        }\"#;\n        let actual_uri = uri_from_s3_notification(test_message.as_bytes(), \"myackid\").unwrap();\n        let expected_uri = Uri::from_str(\"s3://mybucket/logs.json\").unwrap();\n        assert_eq!(actual_uri, expected_uri);\n    }\n\n    #[test]\n    fn test_uri_from_s3_notification_invalid() {\n        let invalid_message = r#\"{\n            \"Records\": [\n                {\n                    \"s3\": {\n                        \"object\": {\n                            \"key\": \"test_key\"\n                        }\n                    }\n                }\n            ]\n        }\"#;\n        let result =\n            
uri_from_s3_notification(&OwnedBytes::new(invalid_message.as_bytes()), \"myackid\");\n        assert!(matches!(\n            result,\n            Err(PreProcessingError::UnexpectedFormat(_))\n        ));\n    }\n\n    #[test]\n    fn test_uri_from_s3_bad_event_type() {\n        let invalid_message = r#\"{\n            \"Records\": [\n                {\n                    \"eventVersion\": \"2.1\",\n                    \"eventSource\": \"aws:s3\",\n                    \"awsRegion\": \"us-east-1\",\n                    \"eventTime\": \"2024-07-29T12:47:14.577Z\",\n                    \"eventName\": \"ObjectRemoved:Delete\",\n                    \"userIdentity\": {\n                        \"principalId\": \"AWS:ARGHGOHSDGOKGHOGHMCC4:user\"\n                    },\n                    \"requestParameters\": {\n                        \"sourceIPAddress\": \"1.1.1.1\"\n                    },\n                    \"responseElements\": {\n                        \"x-amz-request-id\": \"GHGSH\",\n                        \"x-amz-id-2\": \"gndflghndflhmnrflsh+gLLKU6X0PvD6ANdVY1+/hspflhjladgfkelagfkndl\"\n                    },\n                    \"s3\": {\n                        \"s3SchemaVersion\": \"1.0\",\n                        \"configurationId\": \"hello\",\n                        \"bucket\": {\n                            \"name\": \"mybucket\",\n                            \"ownerIdentity\": {\n                                \"principalId\": \"KMGP12GHKKH\"\n                            },\n                            \"arn\": \"arn:aws:s3:::mybucket\"\n                        },\n                        \"object\": {\n                            \"key\": \"my_deleted_file\",\n                            \"sequencer\": \"GKHOFLGKHSALFK0\"\n                        }\n                    }\n                }\n            ]\n        }\"#;\n        let result =\n            uri_from_s3_notification(&OwnedBytes::new(invalid_message.as_bytes()), \"myackid\");\n       
 assert!(matches!(\n            result,\n            Err(PreProcessingError::Discardable { .. })\n        ));\n    }\n\n    #[test]\n    fn test_uri_from_s3_notification_discardable() {\n        let invalid_message = r#\"{\n            \"Service\":\"Amazon S3\",\n            \"Event\":\"s3:TestEvent\",\n            \"Time\":\"2014-10-13T15:57:02.089Z\",\n            \"Bucket\":\"bucketname\",\n            \"RequestId\":\"5582815E1AEA5ADF\",\n            \"HostId\":\"8cLeGAmw098X5cv4Zkwcmo8vvZa3eH3eKxsPzbB9wrR+YstdA6Knx4Ip8EXAMPLE\"\n        }\"#;\n        let result =\n            uri_from_s3_notification(&OwnedBytes::new(invalid_message.as_bytes()), \"myackid\");\n        if let Err(PreProcessingError::Discardable { ack_id }) = result {\n            assert_eq!(ack_id, \"myackid\");\n        } else {\n            panic!(\"Expected skippable error\");\n        }\n    }\n\n    #[test]\n    fn test_uri_from_s3_notification_url_decode() {\n        let test_message = r#\"\n        {\n            \"Records\": [\n                {\n                \"eventVersion\": \"2.1\",\n                \"eventSource\": \"aws:s3\",\n                \"awsRegion\": \"us-west-2\",\n                \"eventTime\": \"2021-05-22T09:22:41.789Z\",\n                \"eventName\": \"ObjectCreated:Put\",\n                \"userIdentity\": {\n                    \"principalId\": \"AWS:AIDAJDPLRKLG7UEXAMPLE\"\n                },\n                \"requestParameters\": {\n                    \"sourceIPAddress\": \"127.0.0.1\"\n                },\n                \"responseElements\": {\n                    \"x-amz-request-id\": \"C3D13FE58DE4C810\",\n                    \"x-amz-id-2\": \"FMyUVURIx7Zv2cPi/IZb9Fk1/U4QfTaVK5fahHPj/\"\n                },\n                \"s3\": {\n                    \"s3SchemaVersion\": \"1.0\",\n                    \"configurationId\": \"testConfigRule\",\n                    \"bucket\": {\n                        \"name\": \"mybucket\",\n                        
\"ownerIdentity\": {\n                            \"principalId\": \"A3NL1KOZZKExample\"\n                        },\n                        \"arn\": \"arn:aws:s3:::mybucket\"\n                    },\n                    \"object\": {\n                        \"key\": \"hello%3A%3Aworld%3A%3Alogs.json\",\n                        \"size\": 1024,\n                        \"eTag\": \"d41d8cd98f00b204e9800998ecf8427e\",\n                        \"versionId\": \"096fKKXTRTtl3on89fVO.nfljtsv6qko\",\n                        \"sequencer\": \"0055AED6DCD90281E5\"\n                    }\n                }\n                }\n            ]\n        }\"#;\n        let actual_uri = uri_from_s3_notification(test_message.as_bytes(), \"myackid\").unwrap();\n        let expected_uri = Uri::from_str(\"s3://mybucket/hello::world::logs.json\").unwrap();\n        assert_eq!(actual_uri, expected_uri);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub mod coordinator;\nmod helpers;\nmod local_state;\n#[cfg(test)]\nmod memory_queue;\nmod message;\nmod shared_state;\n#[cfg(feature = \"sqs\")]\npub mod sqs_queue;\nmod visibility;\n\nuse std::fmt;\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse async_trait::async_trait;\nuse message::RawMessage;\n\n/// The queue abstraction is based on the AWS SQS and Google Pubsub APIs. The\n/// only requirement of the underlying implementation is that messages exposed\n/// to a given consumer are hidden to other consumers for a configurable period\n/// of time. Retries are handled by the implementation because queues might\n/// behave differently (throttling, deduplication...).\n#[async_trait]\npub trait Queue: fmt::Debug + Send + Sync + 'static {\n    /// Polls the queue to receive messages.\n    ///\n    /// The implementation is in charge of choosing the wait strategy when there\n    /// are no messages in the queue. It will typically use long polling to do\n    /// this efficiently. On the other hand, when there is a message available\n    /// in the queue, it should be returned as quickly as possible, regardless\n    /// of the `max_messages` parameter. 
The `max_messages` parameter should\n    /// always be clamped by the implementation to not violate the maximum value\n    /// supported by the backing queue (e.g. 10 messages for AWS SQS).\n    ///\n    /// As soon as the message is received, the caller is responsible for\n    /// maintaining the message visibility in a timely fashion. Failing to do so\n    /// implies that duplicates will be received by other indexing pipelines,\n    /// thus increasing competition for the commit lock.\n    async fn receive(\n        // `Arc` to make the resulting future `'static` and thus easily\n        // wrappable by the `QueueReceiver`\n        self: Arc<Self>,\n        max_messages: usize,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Vec<RawMessage>>;\n\n    /// Tries to acknowledge (delete) the messages.\n    ///\n    /// The call returns `Ok(())` if at the message level:\n    /// - the acknowledgement failed due to a transient failure\n    /// - the message was already acknowledged\n    /// - the message was not acknowledged in time and is back in the queue\n    ///\n    /// If an empty list of ack_ids is provided, the call should be a no-op.\n    async fn acknowledge(&self, ack_ids: &[String]) -> anyhow::Result<()>;\n\n    /// Modifies the visibility deadline of the messages.\n    ///\n    /// We try to set the initial visibility large enough to avoid having to\n    /// call this too often. The implementation can retry as long as desired;\n    /// it's the caller's responsibility to cancel the future if the deadline is\n    /// getting too close to the expiration. The returned `Instant` is a\n    /// conservative estimate of the new deadline expiration time.\n    async fn modify_deadlines(\n        &self,\n        ack_id: &str,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Instant>;\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/shared_state.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::sync::{Arc, Weak};\nuse std::time::Duration;\n\nuse anyhow::{Context, bail};\nuse quickwit_metastore::checkpoint::PartitionId;\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, MetastoreService, MetastoreServiceClient, OpenShardSubrequest,\n    OpenShardsRequest, PruneShardsRequest,\n};\nuse quickwit_proto::types::{DocMappingUid, Position, ShardId, SourceUid};\nuse time::OffsetDateTime;\nuse tracing::{error, info};\n\nuse super::message::PreProcessedMessage;\n\n#[derive(Clone)]\npub struct QueueSharedState {\n    metastore: MetastoreServiceClient,\n    pub source_uid: SourceUid,\n    /// Duration after which the processing of a shard is considered stale and\n    /// should be reacquired\n    reacquire_grace_period: Duration,\n    _cleanup_handle: Arc<()>,\n}\n\nimpl QueueSharedState {\n    /// Create a shared state service and runs a cleanup task that prunes shards\n    /// in the background\n    pub fn new(\n        metastore: MetastoreServiceClient,\n        source_uid: SourceUid,\n        reacquire_grace_period: Duration,\n        max_age: Option<Duration>,\n        max_count: Option<u32>,\n        pruning_interval: Duration,\n    ) -> Self {\n        let cleanup_handle = Arc::new(());\n        tokio::spawn(Self::run_cleanup_task(\n            metastore.clone(),\n            
source_uid.clone(),\n            max_age,\n            max_count,\n            pruning_interval,\n            Arc::downgrade(&cleanup_handle),\n        ));\n        Self {\n            metastore,\n            source_uid,\n            reacquire_grace_period,\n            _cleanup_handle: cleanup_handle,\n        }\n    }\n\n    async fn run_cleanup_task(\n        metastore: MetastoreServiceClient,\n        source_uid: SourceUid,\n        max_age: Option<Duration>,\n        max_count: Option<u32>,\n        pruning_interval: Duration,\n        owner_handle: Weak<()>,\n    ) {\n        if max_count.is_none() && max_age.is_none() {\n            return;\n        }\n        let max_age_secs = max_age.map(|duration| duration.as_secs() as u32);\n        let SourceUid {\n            index_uid,\n            source_id,\n        } = source_uid;\n        tokio::spawn(async move {\n            let mut interval = tokio::time::interval(pruning_interval);\n            loop {\n                interval.tick().await;\n                if owner_handle.upgrade().is_none() {\n                    break;\n                }\n                let result: Result<_, _> = metastore\n                    .prune_shards(PruneShardsRequest {\n                        index_uid: Some(index_uid.clone()),\n                        source_id: source_id.clone(),\n                        max_age_secs,\n                        max_count,\n                        interval_secs: Some(pruning_interval.as_secs() as u32),\n                    })\n                    .await;\n                if let Err(err) = result {\n                    error!(error = ?err, \"failed to prune shards\");\n                }\n            }\n        });\n    }\n\n    /// Tries to acquire the ownership for the provided messages from the global\n    /// shared context. 
For each partition id, if the ownership was successfully\n    /// acquired or the partition was already successfully indexed, the position\n    /// is returned along with the partition id, otherwise the partition id is\n    /// dropped.\n    async fn acquire_partitions(\n        &mut self,\n        publish_token: &str,\n        partitions: Vec<PartitionId>,\n    ) -> anyhow::Result<Vec<(PartitionId, Position)>> {\n        let open_shard_subrequests = partitions\n            .iter()\n            .enumerate()\n            .map(|(idx, partition_id)| OpenShardSubrequest {\n                subrequest_id: idx as u32,\n                index_uid: Some(self.source_uid.index_uid.clone()),\n                source_id: self.source_uid.source_id.clone(),\n                leader_id: String::new(),\n                follower_id: None,\n                shard_id: Some(ShardId::from(partition_id.as_str())),\n                doc_mapping_uid: Some(DocMappingUid::default()),\n                publish_token: Some(publish_token.to_string()),\n            })\n            .collect();\n\n        let open_shard_resp = self\n            .metastore\n            .open_shards(OpenShardsRequest {\n                subrequests: open_shard_subrequests,\n            })\n            .await?;\n\n        let mut shards = Vec::new();\n        let mut re_acquired_shards = Vec::new();\n        for sub in open_shard_resp.subresponses {\n            // we could also just cast the shard_id back to a partition_id\n            let partition_id = partitions[sub.subrequest_id as usize].clone();\n            let shard = sub.open_shard();\n            let position = shard.publish_position_inclusive.clone().unwrap_or_default();\n            let is_owned = sub.open_shard().publish_token.as_deref() == Some(publish_token);\n            let update_datetime = OffsetDateTime::from_unix_timestamp(shard.update_timestamp)\n                .context(\"Invalid shard update timestamp\")?;\n            let is_stale =\n               
 OffsetDateTime::now_utc() - update_datetime > self.reacquire_grace_period;\n            if position.is_eof() || (is_owned && position.is_beginning()) {\n                shards.push((partition_id, position));\n            } else if !is_owned && is_stale {\n                info!(previous_token = shard.publish_token, \"shard re-acquired\");\n                re_acquired_shards.push(shard.shard_id().clone());\n            } else if is_owned && !position.is_beginning() {\n                bail!(\"Partition is owned by this indexing pipeline but is not at the beginning. This should never happen! Please, report on https://github.com/quickwit-oss/quickwit/issues.\")\n            }\n        }\n\n        if re_acquired_shards.is_empty() {\n            return Ok(shards);\n        }\n\n        // re-acquire shards that have a token that is not the local token\n        let acquire_shard_resp = self\n            .metastore\n            .acquire_shards(AcquireShardsRequest {\n                index_uid: Some(self.source_uid.index_uid.clone()),\n                source_id: self.source_uid.source_id.clone(),\n                shard_ids: re_acquired_shards,\n                publish_token: publish_token.to_string(),\n            })\n            .await?;\n        for shard in acquire_shard_resp.acquired_shards {\n            let partition_id = PartitionId::from(shard.shard_id().as_str());\n            let position = shard.publish_position_inclusive.unwrap_or_default();\n            shards.push((partition_id, position));\n        }\n\n        Ok(shards)\n    }\n}\n\n/// Acquires shards from the shared state for the provided list of messages\n/// using [`QueueSharedState::acquire_partitions`], then maps resulting\n/// positions back to that original list. 
Messages that don't require any\n/// further processing are dropped.\npub async fn checkpoint_messages(\n    shared_state: &mut QueueSharedState,\n    publish_token: &str,\n    messages: Vec<PreProcessedMessage>,\n) -> anyhow::Result<Vec<(PreProcessedMessage, Position)>> {\n    let mut message_map: BTreeMap<PartitionId, PreProcessedMessage> = messages\n        .into_iter()\n        .map(|msg| (msg.partition_id(), msg))\n        .collect();\n    let partition_ids = message_map.keys().cloned().collect();\n\n    let partition_positions = shared_state\n        .acquire_partitions(publish_token, partition_ids)\n        .await?;\n\n    let mut result = Vec::with_capacity(partition_positions.len());\n    for (partition_id, position) in partition_positions {\n        let content = message_map.remove(&partition_id).context(\"Unexpected partition ID. This should never happen! Please, report on https://github.com/quickwit-oss/quickwit/issues.\")?;\n        result.push((content, position));\n    }\n\n    Ok(result)\n}\n\n#[cfg(test)]\npub mod shared_state_for_tests {\n    use std::sync::{Arc, Mutex};\n\n    use itertools::Itertools;\n    use quickwit_proto::ingest::{Shard, ShardState};\n    use quickwit_proto::metastore::{\n        AcquireShardsResponse, MockMetastoreService, OpenShardSubresponse, OpenShardsResponse,\n    };\n    use quickwit_proto::types::IndexUid;\n\n    use super::*;\n\n    /// Creates a metastore that mocks the behavior of the Shard API on the open\n    /// and acquire methods using a simplified in-memory state.\n    pub(super) fn mock_metastore(\n        // Shards (token, position, update_timestamp) in the initial state\n        initial_state: &[(PartitionId, (String, Position, i64))],\n        // Times open_shards is expected to be called (None <=> no expectation)\n        open_shard_times: Option<usize>,\n        // Times acquire_shards is expected to be called (None <=> no expectation)\n        acquire_times: Option<usize>,\n    ) -> 
MetastoreServiceClient {\n        let mut mock_metastore = MockMetastoreService::new();\n        let inner_state = Arc::new(Mutex::new(BTreeMap::from_iter(\n            initial_state.iter().cloned(),\n        )));\n        let inner_state_ref = Arc::clone(&inner_state);\n        let open_shards_expectation =\n            mock_metastore\n                .expect_open_shards()\n                .returning(move |request| {\n                    let mut subresponses = Vec::with_capacity(request.subrequests.len());\n                    for sub_req in request.subrequests {\n                        let partition_id: PartitionId = sub_req.shard_id().to_string().into();\n                        let req_token = sub_req.publish_token();\n                        let (token, position, update_timestamp) = inner_state_ref\n                            .lock()\n                            .unwrap()\n                            .get(&partition_id)\n                            .cloned()\n                            .unwrap_or((\n                                req_token.to_string(),\n                                Position::Beginning,\n                                OffsetDateTime::now_utc().unix_timestamp(),\n                            ));\n\n                        inner_state_ref.lock().unwrap().insert(\n                            partition_id,\n                            (token.clone(), position.clone(), update_timestamp),\n                        );\n                        subresponses.push(OpenShardSubresponse {\n                            subrequest_id: sub_req.subrequest_id,\n                            open_shard: Some(Shard {\n                                shard_id: sub_req.shard_id,\n                                source_id: sub_req.source_id,\n                                publish_token: Some(token),\n                                index_uid: sub_req.index_uid,\n                                follower_id: sub_req.follower_id,\n                                
leader_id: sub_req.leader_id,\n                                doc_mapping_uid: sub_req.doc_mapping_uid,\n                                publish_position_inclusive: Some(position),\n                                shard_state: ShardState::Open as i32,\n                                update_timestamp,\n                            }),\n                        });\n                    }\n                    Ok(OpenShardsResponse { subresponses })\n                });\n        if let Some(times) = open_shard_times {\n            open_shards_expectation.times(times);\n        }\n        let acquire_shards_expectation =\n            mock_metastore\n                .expect_acquire_shards()\n                .returning(move |request| {\n                    let mut acquired_shards = Vec::with_capacity(request.shard_ids.len());\n                    for shard_id in request.shard_ids {\n                        let partition_id: PartitionId = shard_id.to_string().into();\n                        let (existing_token, position, update_timestamp) = inner_state\n                            .lock()\n                            .unwrap()\n                            .get(&partition_id)\n                            .cloned()\n                            .expect(\"we should never try to acquire a shard that doesn't exist\");\n                        inner_state.lock().unwrap().insert(\n                            partition_id,\n                            (\n                                request.publish_token.clone(),\n                                position.clone(),\n                                update_timestamp,\n                            ),\n                        );\n                        assert_ne!(existing_token, request.publish_token);\n                        acquired_shards.push(Shard {\n                            shard_id: Some(shard_id),\n                            source_id: \"dummy\".to_string(),\n                            publish_token: 
Some(request.publish_token.clone()),\n                            index_uid: None,\n                            follower_id: None,\n                            leader_id: \"dummy\".to_string(),\n                            doc_mapping_uid: None,\n                            publish_position_inclusive: Some(position),\n                            shard_state: ShardState::Open as i32,\n                            update_timestamp,\n                        });\n                    }\n                    Ok(AcquireShardsResponse { acquired_shards })\n                });\n        if let Some(times) = acquire_times {\n            acquire_shards_expectation.times(times);\n        }\n        MetastoreServiceClient::from_mock(mock_metastore)\n    }\n\n    pub fn init_state(\n        index_id: &str,\n        // Shards (token, position, is_stale) in the initial state\n        initial_state: &[(PartitionId, (String, Position, bool))],\n    ) -> QueueSharedState {\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let metastore_state = initial_state\n            .iter()\n            .map(|(pid, (token, pos, is_stale))| {\n                let update_timestamp = if *is_stale {\n                    OffsetDateTime::now_utc().unix_timestamp() - 100\n                } else {\n                    OffsetDateTime::now_utc().unix_timestamp()\n                };\n                (pid.clone(), (token.clone(), pos.clone(), update_timestamp))\n            })\n            .collect_vec();\n        let metastore = mock_metastore(&metastore_state, None, None);\n        QueueSharedState {\n            metastore,\n            source_uid: SourceUid {\n                index_uid,\n                source_id: \"test-queue-src\".to_string(),\n            },\n            reacquire_grace_period: Duration::from_secs(10),\n            _cleanup_handle: Arc::new(()),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::str::FromStr;\n    use std::time::Instant;\n    use 
std::vec;\n\n    use quickwit_common::uri::Uri;\n    use quickwit_proto::types::IndexUid;\n    use shared_state_for_tests::mock_metastore;\n\n    use super::*;\n    use crate::source::queue_sources::message::{MessageMetadata, PreProcessedPayload};\n\n    fn test_messages(message_number: usize) -> Vec<PreProcessedMessage> {\n        (0..message_number)\n            .map(|i| PreProcessedMessage {\n                metadata: MessageMetadata {\n                    ack_id: format!(\"ackid{i}\"),\n                    delivery_attempts: 0,\n                    initial_deadline: Instant::now(),\n                    message_id: format!(\"mid{i}\"),\n                },\n                payload: PreProcessedPayload::ObjectUri(\n                    Uri::from_str(&format!(\"s3://bucket/key{i}\")).unwrap(),\n                ),\n            })\n            .collect()\n    }\n\n    #[tokio::test]\n    async fn test_acquire_shards_with_completed() {\n        let index_id = \"test-sqs-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let init_state = &[(\n            \"p1\".into(),\n            (\n                \"token2\".to_string(),\n                Position::eof(100usize),\n                OffsetDateTime::now_utc().unix_timestamp(),\n            ),\n        )];\n        let metastore = mock_metastore(init_state, Some(1), Some(0));\n\n        let mut shared_state = QueueSharedState {\n            metastore,\n            source_uid: SourceUid {\n                index_uid,\n                source_id: \"test-sqs-source\".to_string(),\n            },\n            reacquire_grace_period: Duration::from_secs(10),\n            _cleanup_handle: Arc::new(()),\n        };\n\n        let aquired = shared_state\n            .acquire_partitions(\"token1\", vec![\"p1\".into(), \"p2\".into()])\n            .await\n            .unwrap();\n        assert!(aquired.contains(&(\"p1\".into(), Position::eof(100usize))));\n        
assert!(aquired.contains(&(\"p2\".into(), Position::Beginning)));\n    }\n\n    #[tokio::test]\n    async fn test_re_acquire_shards_within_grace_period() {\n        let index_id = \"test-sqs-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let init_state = &[(\n            \"p1\".into(),\n            (\n                \"token2\".to_string(),\n                Position::offset(100usize),\n                OffsetDateTime::now_utc().unix_timestamp(),\n            ),\n        )];\n        let metastore = mock_metastore(init_state, Some(1), Some(0));\n\n        let mut shared_state = QueueSharedState {\n            metastore,\n            source_uid: SourceUid {\n                index_uid,\n                source_id: \"test-sqs-source\".to_string(),\n            },\n            reacquire_grace_period: Duration::from_secs(10),\n            _cleanup_handle: Arc::new(()),\n        };\n\n        let acquired = shared_state\n            .acquire_partitions(\"token1\", vec![\"p1\".into(), \"p2\".into()])\n            .await\n            .unwrap();\n        assert_eq!(acquired.len(), 1);\n        assert!(acquired.contains(&(\"p2\".into(), Position::Beginning)));\n    }\n\n    #[tokio::test]\n    async fn test_re_acquire_shards_after_grace_period() {\n        let index_id = \"test-sqs-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let init_state = &[(\n            \"p1\".into(),\n            (\n                \"token2\".to_string(),\n                Position::offset(100usize),\n                OffsetDateTime::now_utc().unix_timestamp() - 100,\n            ),\n        )];\n        let metastore = mock_metastore(init_state, Some(1), Some(1));\n\n        let mut shared_state = QueueSharedState {\n            metastore,\n            source_uid: SourceUid {\n                index_uid,\n                source_id: \"test-sqs-source\".to_string(),\n            },\n            reacquire_grace_period: 
Duration::from_secs(10),\n            _cleanup_handle: Arc::new(()),\n        };\n\n        let aquired = shared_state\n            .acquire_partitions(\"token1\", vec![\"p1\".into(), \"p2\".into()])\n            .await\n            .unwrap();\n        assert!(aquired.contains(&(\"p1\".into(), Position::offset(100usize))));\n        assert!(aquired.contains(&(\"p2\".into(), Position::Beginning)));\n    }\n\n    #[tokio::test]\n    async fn test_checkpoint_with_completed() {\n        let index_id = \"test-sqs-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n\n        let source_messages = test_messages(2);\n        let completed_partition_id = source_messages[0].partition_id();\n        let new_partition_id = source_messages[1].partition_id();\n\n        let init_state = &[(\n            completed_partition_id.clone(),\n            (\n                \"token2\".to_string(),\n                Position::eof(100usize),\n                OffsetDateTime::now_utc().unix_timestamp(),\n            ),\n        )];\n        let metastore = mock_metastore(init_state, Some(1), Some(0));\n        let mut shared_state = QueueSharedState {\n            metastore,\n            source_uid: SourceUid {\n                index_uid,\n                source_id: \"test-sqs-source\".to_string(),\n            },\n            reacquire_grace_period: Duration::from_secs(10),\n            _cleanup_handle: Arc::new(()),\n        };\n\n        let checkpointed_msg = checkpoint_messages(&mut shared_state, \"token1\", source_messages)\n            .await\n            .unwrap();\n        assert_eq!(checkpointed_msg.len(), 2);\n        let completed_msg = checkpointed_msg\n            .iter()\n            .find(|(msg, _)| msg.partition_id() == completed_partition_id)\n            .unwrap();\n        assert_eq!(completed_msg.1, Position::eof(100usize));\n        let new_msg = checkpointed_msg\n            .iter()\n            .find(|(msg, _)| msg.partition_id() == 
new_partition_id)\n            .unwrap();\n        assert_eq!(new_msg.1, Position::Beginning);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/sqs_queue.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse anyhow::{Context, bail};\nuse async_trait::async_trait;\nuse aws_sdk_sqs::config::{Builder, Region, SharedAsyncSleep};\nuse aws_sdk_sqs::types::{DeleteMessageBatchRequestEntry, MessageSystemAttributeName};\nuse aws_sdk_sqs::{Client, Config};\nuse itertools::Itertools;\nuse quickwit_aws::retry::{AwsRetryable, aws_retry};\nuse quickwit_aws::{DEFAULT_AWS_REGION, aws_behavior_version, get_aws_config};\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::retry::RetryParams;\nuse quickwit_storage::OwnedBytes;\nuse regex::Regex;\n\nuse super::message::MessageMetadata;\nuse super::{Queue, RawMessage};\n\n#[derive(Debug)]\npub struct SqsQueue {\n    sqs_client: Client,\n    queue_url: String,\n    receive_retries: RetryParams,\n    acknowledge_retries: RetryParams,\n    modify_deadline_retries: RetryParams,\n}\n\nimpl SqsQueue {\n    pub async fn try_new(queue_url: String) -> anyhow::Result<Self> {\n        let sqs_client = get_sqs_client(&queue_url).await?;\n        Ok(SqsQueue {\n            sqs_client,\n            queue_url,\n            receive_retries: RetryParams::standard(),\n            // Acknowledgment is retried when the message is received again\n            acknowledge_retries: RetryParams::no_retries(),\n            // Retry aggressively to avoid loosing the 
ownership of the message\n            modify_deadline_retries: RetryParams::aggressive(),\n        })\n    }\n}\n\n#[async_trait]\nimpl Queue for SqsQueue {\n    async fn receive(\n        self: Arc<Self>,\n        max_messages: usize,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Vec<RawMessage>> {\n        // TODO: We estimate the message deadline using the start of the\n        // ReceiveMessage request. This might be overly pessimistic: the docs\n        // state that it starts when the message is returned.\n        let initial_deadline = Instant::now() + suggested_deadline;\n        let clamped_max_messages = std::cmp::min(max_messages, 10) as i32;\n        let receive_output = aws_retry(&self.receive_retries, || async {\n            self.sqs_client\n                .receive_message()\n                .queue_url(&self.queue_url)\n                .message_system_attribute_names(MessageSystemAttributeName::ApproximateReceiveCount)\n                .wait_time_seconds(20)\n                .set_max_number_of_messages(Some(clamped_max_messages))\n                .visibility_timeout(suggested_deadline.as_secs() as i32)\n                .send()\n                .await\n        })\n        .await?;\n\n        let received_messages = receive_output.messages.unwrap_or_default();\n        let mut resulting_raw_messages = Vec::with_capacity(received_messages.len());\n        for received_message in received_messages {\n            let delivery_attempts: usize = received_message\n                .attributes\n                .as_ref()\n                .and_then(|attrs| attrs.get(&MessageSystemAttributeName::ApproximateReceiveCount))\n                .and_then(|s| s.parse().ok())\n                .unwrap_or(0);\n            let ack_id = received_message\n                .receipt_handle\n                .context(\"missing receipt_handle in received message\")?;\n            let message_id = received_message\n                .message_id\n                
.context(\"missing message_id in received message\")?;\n            let raw_message = RawMessage {\n                metadata: MessageMetadata {\n                    ack_id,\n                    message_id,\n                    initial_deadline,\n                    delivery_attempts,\n                },\n                payload: OwnedBytes::new(received_message.body.unwrap_or_default().into_bytes()),\n            };\n            resulting_raw_messages.push(raw_message);\n        }\n        Ok(resulting_raw_messages)\n    }\n\n    async fn acknowledge(&self, ack_ids: &[String]) -> anyhow::Result<()> {\n        if ack_ids.is_empty() {\n            return Ok(());\n        }\n        let entry_batches: Vec<Vec<_>> = ack_ids\n            .iter()\n            .dedup()\n            .enumerate()\n            .map(|(i, id)| {\n                DeleteMessageBatchRequestEntry::builder()\n                    .id(i.to_string())\n                    .receipt_handle(id.to_string())\n                    .build()\n                    .unwrap()\n            })\n            .chunks(10)\n            .into_iter()\n            .map(|chunk| chunk.collect())\n            .collect();\n\n        // TODO: parallelization\n        let mut batch_errors = Vec::new();\n        let mut message_errors = Vec::new();\n        for batch in entry_batches {\n            let res = aws_retry(&self.acknowledge_retries, || {\n                self.sqs_client\n                    .delete_message_batch()\n                    .queue_url(&self.queue_url)\n                    .set_entries(Some(batch.clone()))\n                    .send()\n            })\n            .await;\n            match res {\n                Ok(res) => {\n                    message_errors.extend(res.failed.into_iter());\n                }\n                Err(err) => {\n                    batch_errors.push(err);\n                }\n            }\n        }\n        if batch_errors.iter().any(|err| !err.is_retryable()) {\n            let 
fatal_error = batch_errors\n                .into_iter()\n                .find(|err| !err.is_retryable())\n                .unwrap();\n            bail!(fatal_error);\n        } else if !batch_errors.is_empty() {\n            rate_limited_error!(\n                limit_per_min = 10,\n                count = batch_errors.len(),\n                first_err = ?batch_errors.into_iter().next().unwrap(),\n                \"failed to acknowledge some message batches\",\n            );\n        }\n        // The documentation is unclear about these partial failures. We assume\n        // it is either:\n        // - a transient failure\n        // - the message is already acknowledged\n        // - the message is expired\n        if !message_errors.is_empty() {\n            rate_limited_error!(\n                limit_per_min = 10,\n                count = message_errors.len(),\n                first_err = ?message_errors.into_iter().next().unwrap(),\n                \"failed to acknowledge individual messages\",\n            );\n        }\n        Ok(())\n    }\n\n    async fn modify_deadlines(\n        &self,\n        ack_id: &str,\n        suggested_deadline: Duration,\n    ) -> anyhow::Result<Instant> {\n        // SQS caps the visibility timeout at 12 hours (43200 seconds). Clamp before\n        // casting so that large durations cannot wrap, and derive the returned\n        // deadline from the value that is actually applied.\n        let visibility_timeout_secs = suggested_deadline.as_secs().min(43200);\n        let new_deadline = Instant::now() + Duration::from_secs(visibility_timeout_secs);\n        aws_retry(&self.modify_deadline_retries, || {\n            self.sqs_client\n                .change_message_visibility()\n                .queue_url(&self.queue_url)\n                .visibility_timeout(visibility_timeout_secs as i32)\n                .receipt_handle(ack_id)\n                .send()\n        })\n        .await?;\n        Ok(new_deadline)\n    }\n}\n\nasync fn preconfigured_builder() -> anyhow::Result<Builder> {\n    let aws_config = get_aws_config().await;\n\n    let mut sqs_config = Config::builder().behavior_version(aws_behavior_version());\n    
sqs_config.set_retry_config(aws_config.retry_config().cloned());\n    sqs_config.set_credentials_provider(aws_config.credentials_provider());\n    sqs_config.set_http_client(aws_config.http_client());\n    sqs_config.set_timeout_config(aws_config.timeout_config().cloned());\n\n    if let Some(identity_cache) = aws_config.identity_cache() {\n        sqs_config.set_identity_cache(identity_cache);\n    }\n    sqs_config.set_sleep_impl(Some(SharedAsyncSleep::new(\n        quickwit_aws::TokioSleep::default(),\n    )));\n\n    Ok(sqs_config)\n}\n\nfn queue_url_region(queue_url: &str) -> Option<Region> {\n    let re = Regex::new(r\"^https?://sqs\\.(.*?)\\.amazonaws\\.com\").unwrap();\n    let caps = re.captures(queue_url)?;\n    let region_str = caps.get(1)?.as_str();\n    Some(Region::new(region_str.to_string()))\n}\n\nfn queue_url_endpoint(queue_url: &str) -> anyhow::Result<String> {\n    let re = Regex::new(r\"(^https?://[^/]+)\").unwrap();\n    let caps = re.captures(queue_url).context(\"Invalid queue URL\")?;\n    let endpoint_str = caps.get(1).context(\"Invalid queue URL\")?.as_str();\n    Ok(endpoint_str.to_string())\n}\n\npub async fn get_sqs_client(queue_url: &str) -> anyhow::Result<Client> {\n    let mut sqs_config = preconfigured_builder().await?;\n    // region is required by the SDK to work\n    let inferred_region = queue_url_region(queue_url).unwrap_or(DEFAULT_AWS_REGION);\n    let inferred_endpoint = queue_url_endpoint(queue_url)?;\n    sqs_config.set_region(Some(inferred_region));\n    sqs_config.set_endpoint_url(Some(inferred_endpoint));\n    Ok(Client::from_conf(sqs_config.build()))\n}\n\n/// Checks whether we can establish a connection to the SQS service and we can\n/// access the provided queue_url\npub(crate) async fn check_connectivity(queue_url: &str) -> anyhow::Result<()> {\n    let client = get_sqs_client(queue_url).await?;\n    client\n        .get_queue_attributes()\n        .queue_url(queue_url)\n        .send()\n        .await?;\n\n    
Ok(())\n}\n\n#[cfg(feature = \"sqs-test-helpers\")]\npub mod test_helpers {\n    use aws_sdk_sqs::types::QueueAttributeName;\n    use ulid::Ulid;\n    use warp::Filter;\n\n    use super::*;\n\n    pub async fn get_localstack_sqs_client() -> anyhow::Result<Client> {\n        let mut sqs_config = preconfigured_builder().await?;\n        sqs_config.set_endpoint_url(Some(\"http://localhost:4566\".to_string()));\n        sqs_config.set_region(Some(DEFAULT_AWS_REGION));\n        Ok(Client::from_conf(sqs_config.build()))\n    }\n\n    pub async fn create_queue(sqs_client: &Client, queue_name_prefix: &str) -> String {\n        let queue_name = format!(\"{}-{}\", queue_name_prefix, Ulid::new());\n        sqs_client\n            .create_queue()\n            .queue_name(queue_name)\n            .send()\n            .await\n            .unwrap()\n            .queue_url\n            .unwrap()\n    }\n\n    pub async fn send_message(sqs_client: &Client, queue_url: &str, payload: &str) {\n        sqs_client\n            .send_message()\n            .queue_url(queue_url)\n            .message_body(payload.to_string())\n            .send()\n            .await\n            .unwrap();\n    }\n\n    pub async fn get_queue_attribute(\n        sqs_client: &Client,\n        queue_url: &str,\n        attribute: QueueAttributeName,\n    ) -> String {\n        let queue_attributes = sqs_client\n            .get_queue_attributes()\n            .queue_url(queue_url)\n            .attribute_names(attribute.clone())\n            .send()\n            .await\n            .unwrap();\n        queue_attributes\n            .attributes\n            .unwrap()\n            .get(&attribute)\n            .unwrap()\n            .to_string()\n    }\n\n    /// Runs a mock SQS GetQueueAttributes endpoint to enable creating SQS\n    /// sources that pass the connectivity check\n    ///\n    /// Returns the queue URL to use for the source and a guard for the\n    /// temporary mock server\n    pub async fn 
start_mock_sqs_get_queue_attributes_endpoint() -> (String, oneshot::Sender<()>) {\n        let hello = warp::path!().map(|| \"{}\");\n        let (tx, rx) = oneshot::channel();\n        let listener = tokio::net::TcpListener::bind(\"127.0.0.1:0\")\n            .await\n            .expect(\"listener should bind\");\n        let addr = listener.local_addr().unwrap();\n\n        let server = warp::serve(hello).incoming(listener).graceful(async {\n            rx.await.ok();\n        });\n        tokio::spawn(server.run());\n\n        let queue_url = format!(\"http://{}:{}/\", addr.ip(), addr.port());\n        (queue_url, tx)\n    }\n\n    #[tokio::test]\n    async fn test_mock_sqs_get_queue_attributes_endpoint() {\n        let (queue_url, _shutdown) = start_mock_sqs_get_queue_attributes_endpoint().await;\n        check_connectivity(&queue_url).await.unwrap();\n        drop(_shutdown);\n        check_connectivity(&queue_url).await.unwrap_err();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_queue_url_region() {\n        let url = \"https://sqs.eu-west-2.amazonaws.com/12345678910/test\";\n        let region = queue_url_region(url);\n        assert_eq!(region, Some(Region::from_static(\"eu-west-2\")));\n\n        let url = \"https://sqs.ap-south-1.amazonaws.com/12345678910/test\";\n        let region = queue_url_region(url);\n        assert_eq!(region, Some(Region::from_static(\"ap-south-1\")));\n\n        let url = \"http://localhost:4566/000000000000/test-queue\";\n        let region = queue_url_region(url);\n        assert_eq!(region, None);\n    }\n\n    #[test]\n    fn test_queue_url_endpoint() {\n        let url = \"https://sqs.eu-west-2.amazonaws.com/12345678910/test\";\n        let endpoint = queue_url_endpoint(url).unwrap();\n        assert_eq!(endpoint, \"https://sqs.eu-west-2.amazonaws.com\");\n\n        let url = \"https://sqs.ap-south-1.amazonaws.com/12345678910/test\";\n        let endpoint = 
queue_url_endpoint(url).unwrap();\n        assert_eq!(endpoint, \"https://sqs.ap-south-1.amazonaws.com\");\n\n        let url = \"http://localhost:4566/000000000000/test-queue\";\n        let endpoint = queue_url_endpoint(url).unwrap();\n        assert_eq!(endpoint, \"http://localhost:4566\");\n\n        let url = \"http://localhost:4566/000000000000/test-queue\";\n        let endpoint = queue_url_endpoint(url).unwrap();\n        assert_eq!(endpoint, \"http://localhost:4566\");\n    }\n}\n\n#[cfg(all(test, feature = \"sqs-localstack-tests\"))]\nmod localstack_tests {\n    use aws_sdk_sqs::types::QueueAttributeName;\n\n    use super::*;\n    use crate::source::queue_sources::helpers::QueueReceiver;\n    use crate::source::queue_sources::sqs_queue::test_helpers::{\n        create_queue, get_localstack_sqs_client,\n    };\n\n    #[tokio::test]\n    async fn test_check_connectivity() {\n        let sqs_client = get_localstack_sqs_client().await.unwrap();\n        let queue_url = create_queue(&sqs_client, \"check-connectivity\").await;\n        check_connectivity(&queue_url).await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_receive_existing_msg_quickly() {\n        let client = test_helpers::get_localstack_sqs_client().await.unwrap();\n        let queue_url = test_helpers::create_queue(&client, \"test-receive-existing-msg\").await;\n        let message = \"hello world\";\n        test_helpers::send_message(&client, &queue_url, message).await;\n\n        let queue = Arc::new(SqsQueue::try_new(queue_url).await.unwrap());\n        let messages = tokio::time::timeout(\n            Duration::from_millis(500),\n            queue.clone().receive(5, Duration::from_secs(60)),\n        )\n        .await\n        .unwrap()\n        .unwrap();\n        assert_eq!(messages.len(), 1);\n        assert_eq!(messages[0].payload.as_slice(), message.as_bytes());\n\n        // just assess that there are no errors for now\n        queue\n            
.modify_deadlines(&messages[0].metadata.ack_id, Duration::from_secs(10))\n            .await\n            .unwrap();\n        queue\n            .acknowledge(&[messages[0].metadata.ack_id.clone()])\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_acknowledge_larger_batch() {\n        let client = test_helpers::get_localstack_sqs_client().await.unwrap();\n        let queue_url = test_helpers::create_queue(&client, \"test-ack-large\").await;\n        let message = \"hello world\";\n        for _ in 0..20 {\n            test_helpers::send_message(&client, &queue_url, message).await;\n        }\n\n        let queue: Arc<SqsQueue> = Arc::new(SqsQueue::try_new(queue_url.clone()).await.unwrap());\n        let mut queue_receiver = QueueReceiver::new(queue.clone(), Duration::from_millis(200));\n        let mut messages = Vec::new();\n        for _ in 0..5 {\n            let new_messages = queue_receiver\n                .receive(20, Duration::from_secs(60))\n                .await\n                .unwrap();\n            messages.extend(new_messages.into_iter());\n        }\n        assert_eq!(messages.len(), 20);\n        let in_flight_count: usize = test_helpers::get_queue_attribute(\n            &client,\n            &queue_url,\n            QueueAttributeName::ApproximateNumberOfMessagesNotVisible,\n        )\n        .await\n        .parse()\n        .unwrap();\n        assert_eq!(in_flight_count, 20);\n\n        let ack_ids = messages\n            .iter()\n            .map(|msg| msg.metadata.ack_id.clone())\n            .collect::<Vec<_>>();\n\n        queue.acknowledge(&ack_ids).await.unwrap();\n\n        let in_flight_count: usize = test_helpers::get_queue_attribute(\n            &client,\n            &queue_url,\n            QueueAttributeName::ApproximateNumberOfMessagesNotVisible,\n        )\n        .await\n        .parse()\n        .unwrap();\n        assert_eq!(in_flight_count, 0);\n    }\n\n    #[tokio::test]\n    
async fn test_receive_wrong_queue() {\n        let client = test_helpers::get_localstack_sqs_client().await.unwrap();\n        let queue_url = test_helpers::create_queue(&client, \"test-receive-existing-msg\").await;\n        let bad_queue_url = format!(\"{queue_url}wrong\");\n        let queue = Arc::new(SqsQueue::try_new(bad_queue_url).await.unwrap());\n        tokio::time::timeout(\n            Duration::from_millis(500),\n            queue.clone().receive(5, Duration::from_secs(60)),\n        )\n        .await\n        .unwrap()\n        .unwrap_err();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/queue_sources/visibility.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::{Arc, Weak};\nuse std::time::{Duration, Instant};\n\nuse anyhow::{Context, anyhow};\nuse async_trait::async_trait;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, ActorState, Handler, Mailbox,\n};\nuse serde_json::{Value as JsonValue, json};\n\nuse super::Queue;\nuse crate::source::SourceContext;\n\n#[derive(Debug, Clone)]\npub(super) struct VisibilitySettings {\n    /// The original deadline asked from the queue when polling the messages\n    pub deadline_for_receive: Duration,\n    /// The last deadline extension when the message reading is completed\n    pub deadline_for_last_extension: Duration,\n    /// The extension applied by the VisibilityTask to maintain the message visibility\n    pub deadline_for_default_extension: Duration,\n    /// The timeout for the visibility extension request\n    pub request_timeout: Duration,\n    /// An extra margin that is subtracted from the expected deadline when\n    /// asserting whether we are still in time to extend the visibility\n    pub request_margin: Duration,\n}\n\nimpl VisibilitySettings {\n    /// The commit timeout gives us a first estimate on how long the processing\n    /// will take for the messages. 
We could include other factors such as the\n    /// message size.\n    pub(super) fn from_commit_timeout(commit_timeout_secs: usize) -> Self {\n        let commit_timeout = Duration::from_secs(commit_timeout_secs as u64);\n        Self {\n            deadline_for_receive: Duration::from_secs(120) + commit_timeout,\n            deadline_for_last_extension: 2 * commit_timeout,\n            deadline_for_default_extension: Duration::from_secs(60),\n            request_timeout: Duration::from_secs(3),\n            request_margin: Duration::from_secs(1),\n        }\n    }\n}\n\n#[derive(Debug)]\nstruct VisibilityTask {\n    queue: Arc<dyn Queue>,\n    ack_id: String,\n    extension_count: u64,\n    current_deadline: Instant,\n    last_extension_requested: bool,\n    visibility_settings: VisibilitySettings,\n    ref_count: Weak<()>,\n}\n\n// A handle to the visibility actor. When dropped, the actor exits and the\n// visibility isn't maintained anymore.\npub(super) struct VisibilityTaskHandle {\n    mailbox: Mailbox<VisibilityTask>,\n    actor_handle: ActorHandle<VisibilityTask>,\n    ack_id: String,\n    _ref_count: Arc<()>,\n}\n\n/// Spawns actor that ensures that the visibility of a given message\n/// (represented by its ack_id) is extended when required. 
We prefer applying\n/// ample margins in the extension process to avoid missing deadlines while also\n/// keeping the number of extension requests (and associated cost) small.\npub(super) fn spawn_visibility_task(\n    ctx: &SourceContext,\n    queue: Arc<dyn Queue>,\n    ack_id: String,\n    current_deadline: Instant,\n    visibility_settings: VisibilitySettings,\n) -> VisibilityTaskHandle {\n    let ref_count = Arc::new(());\n    let weak_ref = Arc::downgrade(&ref_count);\n    let task = VisibilityTask {\n        queue,\n        ack_id: ack_id.clone(),\n        extension_count: 0,\n        current_deadline,\n        last_extension_requested: false,\n        visibility_settings,\n        ref_count: weak_ref,\n    };\n    let (mailbox, actor_handle) = ctx.spawn_actor().spawn(task);\n    VisibilityTaskHandle {\n        mailbox,\n        actor_handle,\n        ack_id,\n        _ref_count: ref_count,\n    }\n}\n\nimpl VisibilityTask {\n    async fn extend_visibility(\n        &mut self,\n        ctx: &ActorContext<Self>,\n        extension: Duration,\n    ) -> anyhow::Result<()> {\n        let _zone = ctx.protect_zone();\n        self.current_deadline = tokio::time::timeout(\n            self.visibility_settings.request_timeout,\n            self.queue.modify_deadlines(&self.ack_id, extension),\n        )\n        .await\n        .context(\"deadline extension timed out\")??;\n        self.extension_count += 1;\n        Ok(())\n    }\n\n    fn next_extension(&self) -> Duration {\n        // `Duration - Duration` panics on underflow, so use saturating arithmetic:\n        // a deadline closer than timeout + margin yields a zero delay instead.\n        self.current_deadline\n            .saturating_duration_since(Instant::now())\n            .saturating_sub(self.visibility_settings.request_timeout)\n            .saturating_sub(self.visibility_settings.request_margin)\n    }\n}\n\nimpl VisibilityTaskHandle {\n    pub fn extension_failed(&self) -> bool {\n        self.actor_handle.state() == ActorState::Failure\n    }\n\n    pub fn ack_id(&self) -> &str {\n        &self.ack_id\n    }\n\n    pub async fn request_last_extension(self) -> anyhow::Result<()> {\n        self.mailbox\n            
.ask_for_res(RequestLastExtension)\n            .await\n            .map_err(|e| anyhow!(e))?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Actor for VisibilityTask {\n    type ObservableState = JsonValue;\n\n    fn name(&self) -> String {\n        \"QueueVisibilityTask\".to_string()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        let first_extension = self.next_extension();\n        if first_extension.is_zero() {\n            return Err(anyhow!(\"initial visibility deadline insufficient\").into());\n        }\n        ctx.schedule_self_msg(first_extension, Loop);\n        Ok(())\n    }\n\n    fn yield_after_each_message(&self) -> bool {\n        false\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        json!({\n            \"ack_id\": self.ack_id,\n            \"extension_count\": self.extension_count,\n        })\n    }\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n#[async_trait]\nimpl Handler<Loop> for VisibilityTask {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _message: Loop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if self.ref_count.strong_count() == 0 {\n            return Ok(());\n        }\n        if self.last_extension_requested {\n            return Ok(());\n        }\n        self.extend_visibility(ctx, self.visibility_settings.deadline_for_default_extension)\n            .await?;\n        ctx.schedule_self_msg(self.next_extension(), Loop);\n        Ok(())\n    }\n}\n\n/// Ensures that the visibility of the message is extended using\n/// deadline_for_last_extension and then stops the extension loop.\n#[derive(Debug)]\nstruct RequestLastExtension;\n\n#[async_trait]\nimpl Handler<RequestLastExtension> for VisibilityTask {\n    type Reply = anyhow::Result<()>;\n\n    async fn handle(\n        &mut self,\n        _message: RequestLastExtension,\n        ctx: &ActorContext<Self>,\n    ) -> 
Result<Self::Reply, ActorExitStatus> {\n        let deadline_for_last_extension = self.visibility_settings.deadline_for_last_extension;\n        let last_deadline = Instant::now() + deadline_for_last_extension;\n        self.last_extension_requested = true;\n        if last_deadline > self.current_deadline {\n            Ok(self\n                .extend_visibility(ctx, deadline_for_last_extension)\n                .await)\n        } else {\n            Ok(Ok(()))\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_actors::Universe;\n    use tokio::sync::watch;\n\n    use super::*;\n    use crate::source::queue_sources::memory_queue::MemoryQueueForTests;\n\n    #[tokio::test]\n    async fn test_visibility_task_request_last_extension() {\n        // actor context\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n        // queue with test message\n        let ack_id = \"ack_id\".to_string();\n        let queue = Arc::new(MemoryQueueForTests::new());\n        queue.send_message(\"test message\".to_string(), &ack_id);\n        let initial_deadline = queue\n            .clone()\n            .receive(1, Duration::from_secs(1))\n            .await\n            .unwrap()[0]\n            .metadata\n            .initial_deadline;\n        // spawn task\n        let visibility_settings = VisibilitySettings {\n            deadline_for_default_extension: Duration::from_secs(1),\n            deadline_for_last_extension: Duration::from_secs(5),\n            deadline_for_receive: Duration::from_secs(1),\n            request_timeout: Duration::from_millis(100),\n            request_margin: Duration::from_millis(100),\n        };\n        let handle = 
spawn_visibility_task(\n            &ctx,\n            queue.clone(),\n            ack_id.clone(),\n            initial_deadline,\n            visibility_settings.clone(),\n        );\n        // assert that the background task performs extensions\n        assert!(!handle.extension_failed());\n        tokio::time::sleep_until(initial_deadline.into()).await;\n        let next_deadline = queue.next_visibility_deadline(&ack_id).unwrap();\n        assert!(initial_deadline < next_deadline);\n        assert!(!handle.extension_failed());\n        handle.request_last_extension().await.unwrap();\n        assert!(\n            Instant::now() + Duration::from_secs(4)\n                < queue.next_visibility_deadline(&ack_id).unwrap()\n        );\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_visibility_task_stop_on_drop() {\n        // actor context\n        let universe = Universe::with_accelerated_time();\n        let (source_mailbox, _source_inbox) = universe.create_test_mailbox();\n        let (observable_state_tx, _observable_state_rx) = watch::channel(serde_json::Value::Null);\n        let ctx: SourceContext =\n            ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n        // queue with test message\n        let ack_id = \"ack_id\".to_string();\n        let queue = Arc::new(MemoryQueueForTests::new());\n        queue.send_message(\"test message\".to_string(), &ack_id);\n        let initial_deadline = queue\n            .clone()\n            .receive(1, Duration::from_secs(1))\n            .await\n            .unwrap()[0]\n            .metadata\n            .initial_deadline;\n        // spawn task\n        let visibility_settings = VisibilitySettings {\n            deadline_for_default_extension: Duration::from_secs(1),\n            deadline_for_last_extension: Duration::from_secs(20),\n            deadline_for_receive: Duration::from_secs(1),\n            request_timeout: 
Duration::from_millis(100),\n            request_margin: Duration::from_millis(100),\n        };\n        let handle = spawn_visibility_task(\n            &ctx,\n            queue.clone(),\n            ack_id.clone(),\n            initial_deadline,\n            visibility_settings.clone(),\n        );\n        // assert that visibility is not extended after drop\n        drop(handle);\n        tokio::time::sleep_until(initial_deadline.into()).await;\n        // the message is either already expired or about to expire\n        if let Some(next_deadline) = queue.next_visibility_deadline(&ack_id) {\n            assert_eq!(next_deadline, initial_deadline);\n        }\n        // assert_eq!(q, None);\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/source_factory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse async_trait::async_trait;\nuse itertools::Itertools;\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::SourceId;\nuse thiserror::Error;\n\nuse super::Source;\nuse crate::source::SourceRuntime;\n\n#[async_trait]\npub trait SourceFactory: Send + Sync + 'static {\n    async fn create_source(&self, source_runtime: SourceRuntime)\n    -> anyhow::Result<Box<dyn Source>>;\n}\n\n#[async_trait]\npub trait TypedSourceFactory: Send + Sync + 'static {\n    type Source: Source;\n    type Params: serde::de::DeserializeOwned + Send + Sync + 'static;\n\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        source_params: Self::Params,\n    ) -> anyhow::Result<Self::Source>;\n}\n\n#[async_trait]\nimpl<T: TypedSourceFactory> SourceFactory for T {\n    async fn create_source(\n        &self,\n        source_runtime: SourceRuntime,\n    ) -> anyhow::Result<Box<dyn Source>> {\n        let typed_params: T::Params =\n            serde_json::from_value(source_runtime.source_config.params())?;\n        let source = Self::typed_create_source(source_runtime, typed_params).await?;\n        Ok(Box::new(source))\n    }\n}\n\n#[derive(Default)]\npub struct SourceLoader {\n    type_to_factory: HashMap<SourceType, Box<dyn SourceFactory>>,\n}\n\n#[derive(Error, Debug)]\npub enum 
SourceLoaderError {\n    #[error(\n        \"unknown source type `{requested_source_type}` (available source types are \\\n         {available_source_types})\"\n    )]\n    UnknownSourceType {\n        requested_source_type: SourceType,\n        available_source_types: String, //< a comma separated list with the available source_type.\n    },\n    #[error(\"failed to create source `{source_id}` of type `{source_type}`. Cause: {error:?}\")]\n    FailedToCreateSource {\n        source_id: SourceId,\n        source_type: SourceType,\n        #[source]\n        error: anyhow::Error,\n    },\n}\n\nimpl SourceLoader {\n    pub fn add_source<F: SourceFactory>(&mut self, source_type: SourceType, source_factory: F) {\n        self.type_to_factory\n            .insert(source_type, Box::new(source_factory));\n    }\n\n    pub async fn load_source(\n        &self,\n        source_runtime: SourceRuntime,\n    ) -> Result<Box<dyn Source>, SourceLoaderError> {\n        let source_type = source_runtime.source_config.source_type();\n        let source_id = source_runtime.source_id().to_string();\n        let source_factory = self.type_to_factory.get(&source_type).ok_or_else(|| {\n            SourceLoaderError::UnknownSourceType {\n                requested_source_type: source_type,\n                available_source_types: self.type_to_factory.keys().join(\", \"),\n            }\n        })?;\n        source_factory\n            .create_source(source_runtime)\n            .await\n            .map_err(|error| SourceLoaderError::FailedToCreateSource {\n                source_type,\n                source_id,\n                error,\n            })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::num::NonZeroUsize;\n\n    use quickwit_config::{SourceConfig, SourceInputFormat, SourceParams};\n    use quickwit_proto::types::IndexUid;\n\n    use crate::source::quickwit_supported_sources;\n    use crate::source::tests::SourceRuntimeBuilder;\n\n    #[tokio::test]\n    async fn 
test_source_loader_success() -> anyhow::Result<()> {\n        let source_loader = quickwit_supported_sources();\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_config = SourceConfig {\n            source_id: \"test-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        source_loader.load_source(source_runtime).await?;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/stdin_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_common::Progress;\nuse quickwit_proto::metastore::SourceType;\nuse tokio::io::{AsyncBufReadExt, BufReader};\n\nuse super::{BATCH_NUM_BYTES_LIMIT, BatchBuilder};\nuse crate::actors::DocProcessor;\nuse crate::source::{Source, SourceContext, SourceRuntime, TypedSourceFactory};\n\npub struct StdinBatchReader {\n    reader: BufReader<tokio::io::Stdin>,\n    is_eof: bool,\n}\n\nimpl StdinBatchReader {\n    pub fn new() -> Self {\n        Self {\n            reader: BufReader::new(tokio::io::stdin()),\n            is_eof: false,\n        }\n    }\n\n    async fn read_batch(&mut self, source_progress: &Progress) -> anyhow::Result<BatchBuilder> {\n        let mut batch_builder = BatchBuilder::new(SourceType::Stdin);\n        while batch_builder.num_bytes < BATCH_NUM_BYTES_LIMIT {\n            let mut buf = String::new();\n            // stdin might be slow because it's depending on external\n            // input (e.g. 
user typing on a keyboard)\n            let bytes_read = source_progress\n                .protect_future(self.reader.read_line(&mut buf))\n                .await?;\n            if bytes_read > 0 {\n                batch_builder.add_doc(buf.into());\n            } else {\n                self.is_eof = true;\n                break;\n            }\n        }\n\n        Ok(batch_builder)\n    }\n\n    fn is_eof(&self) -> bool {\n        self.is_eof\n    }\n}\n\npub struct StdinSource {\n    reader: StdinBatchReader,\n    num_bytes_processed: u64,\n    num_lines_processed: u64,\n}\n\nimpl fmt::Debug for StdinSource {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"StdinSource\")\n    }\n}\n\n#[async_trait]\nimpl Source for StdinSource {\n    async fn emit_batches(\n        &mut self,\n        doc_processor_mailbox: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let batch_builder = self.reader.read_batch(ctx.progress()).await?;\n        self.num_bytes_processed += batch_builder.num_bytes;\n        self.num_lines_processed += batch_builder.docs.len() as u64;\n        doc_processor_mailbox\n            .send_message(batch_builder.build())\n            .await?;\n        if self.reader.is_eof() {\n            ctx.send_exit_with_success(doc_processor_mailbox).await?;\n            return Err(ActorExitStatus::Success);\n        }\n\n        Ok(Duration::ZERO)\n    }\n\n    fn name(&self) -> String {\n        format!(\"{self:?}\")\n    }\n\n    fn observable_state(&self) -> serde_json::Value {\n        serde_json::json!({\n            \"num_bytes_processed\": self.num_bytes_processed,\n            \"num_lines_processed\": self.num_lines_processed,\n        })\n    }\n}\n\npub struct StdinSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for StdinSourceFactory {\n    type Source = StdinSource;\n    type Params = ();\n\n    async fn typed_create_source(\n        _source_runtime: 
SourceRuntime,\n        _params: (),\n    ) -> anyhow::Result<StdinSource> {\n        Ok(StdinSource {\n            reader: StdinBatchReader::new(),\n            num_bytes_processed: 0,\n            num_lines_processed: 0,\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/vec_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{ActorExitStatus, Mailbox};\nuse quickwit_config::VecSourceParams;\nuse quickwit_metastore::checkpoint::{PartitionId, SourceCheckpointDelta};\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::{Position, SourceId};\nuse serde_json::Value as JsonValue;\nuse tracing::info;\n\nuse super::BatchBuilder;\nuse crate::actors::DocProcessor;\nuse crate::source::{Source, SourceContext, SourceRuntime, TypedSourceFactory};\n\npub struct VecSource {\n    source_id: SourceId,\n    source_params: VecSourceParams,\n    next_item_idx: usize,\n    partition: PartitionId,\n}\n\nimpl fmt::Debug for VecSource {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"VecSource\")\n            .field(\"source_id\", &self.source_id)\n            .finish()\n    }\n}\n\npub struct VecSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for VecSourceFactory {\n    type Source = VecSource;\n    type Params = VecSourceParams;\n    async fn typed_create_source(\n        source_runtime: SourceRuntime,\n        source_params: VecSourceParams,\n    ) -> anyhow::Result<Self::Source> {\n        let checkpoint = source_runtime.fetch_checkpoint().await?;\n        let partition = PartitionId::from(source_params.partition.as_str());\n  
      let next_item_idx = checkpoint\n            .position_for_partition(&partition)\n            .map(|position| {\n                position\n                    .as_usize()\n                    .expect(\"offset should be stored as usize\")\n                    + 1\n            })\n            .unwrap_or(0);\n        Ok(VecSource {\n            source_id: source_runtime.pipeline_id.source_id,\n            source_params,\n            partition,\n            next_item_idx,\n        })\n    }\n}\n\nfn position_from_offset(offset: usize) -> Position {\n    if offset == 0 {\n        return Position::Beginning;\n    }\n    Position::offset(offset - 1)\n}\n\n#[async_trait]\nimpl Source for VecSource {\n    async fn emit_batches(\n        &mut self,\n        batch_sink: &Mailbox<DocProcessor>,\n        ctx: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        let mut batch_builder = BatchBuilder::new(SourceType::Vec);\n\n        for doc in self.source_params.docs[self.next_item_idx..]\n            .iter()\n            .take(self.source_params.batch_num_docs)\n            .cloned()\n        {\n            batch_builder.add_doc(doc);\n        }\n        if batch_builder.docs.is_empty() {\n            info!(\"reached end of source\");\n            ctx.send_exit_with_success(batch_sink).await?;\n            return Err(ActorExitStatus::Success);\n        }\n        let from_item_idx = self.next_item_idx;\n        self.next_item_idx += batch_builder.docs.len();\n        let to_item_idx = self.next_item_idx;\n\n        batch_builder.checkpoint_delta = SourceCheckpointDelta::from_partition_delta(\n            self.partition.clone(),\n            position_from_offset(from_item_idx),\n            position_from_offset(to_item_idx),\n        )\n        .unwrap();\n        ctx.send_message(batch_sink, batch_builder.build()).await?;\n\n        Ok(Duration::default())\n    }\n\n    fn name(&self) -> String {\n        format!(\"{self:?}\")\n    }\n\n    fn 
observable_state(&self) -> JsonValue {\n        serde_json::json!({\n            \"next_item_idx\": self.next_item_idx,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroUsize;\n\n    use bytes::Bytes;\n    use quickwit_actors::{Actor, Command, Universe};\n    use quickwit_config::{SourceConfig, SourceInputFormat, SourceParams};\n    use quickwit_proto::types::IndexUid;\n    use serde_json::json;\n\n    use super::*;\n    use crate::models::RawDocBatch;\n    use crate::source::SourceActor;\n    use crate::source::tests::SourceRuntimeBuilder;\n\n    #[tokio::test]\n    async fn test_vec_source() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let docs = std::iter::repeat_with(|| Bytes::from_static(b\"{}\"))\n            .take(100)\n            .collect();\n        let params = VecSourceParams {\n            docs,\n            batch_num_docs: 3,\n            partition: \"partition\".to_string(),\n        };\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_config = SourceConfig {\n            source_id: \"test-vec-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Vec(params.clone()),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let vec_source = VecSourceFactory::typed_create_source(source_runtime, params).await?;\n        let vec_source_actor = SourceActor {\n            source: Box::new(vec_source),\n            doc_processor_mailbox,\n        };\n        assert_eq!(\n            vec_source_actor.name(),\n            r#\"VecSource { source_id: \"test-vec-source\" }\"#\n        );\n        let (_vec_source_mailbox, 
vec_source_handle) =\n            universe.spawn_builder().spawn(vec_source_actor);\n        let (actor_termination, last_observation) = vec_source_handle.join().await;\n        assert!(actor_termination.is_success());\n        assert_eq!(last_observation, json!({\"next_item_idx\": 100u64}));\n        let batches = doc_processor_inbox.drain_for_test();\n        assert_eq!(batches.len(), 35);\n        let raw_batch = batches[1].downcast_ref::<RawDocBatch>().unwrap();\n        assert_eq!(\n            format!(\"{:?}\", raw_batch.checkpoint_delta),\n            \"∆(partition:(00000000000000000002..00000000000000000005])\"\n        );\n        assert!(matches!(\n            &batches[34].downcast_ref::<Command>().unwrap(),\n            &Command::ExitWithSuccess\n        ));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_vec_source_from_checkpoint() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let (doc_processor_mailbox, doc_processor_inbox) = universe.create_test_mailbox();\n        let docs = (0..10).map(|i| Bytes::from(format!(\"{i}\"))).collect();\n        let params = VecSourceParams {\n            docs,\n            batch_num_docs: 3,\n            partition: \"\".to_string(),\n        };\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_config = SourceConfig {\n            source_id: \"test-vec-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Vec(params.clone()),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_delta = SourceCheckpointDelta::from_range(0u64..2u64);\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config)\n            .with_mock_metastore(Some(source_delta))\n            .build();\n        let vec_source = VecSourceFactory::typed_create_source(source_runtime, 
params).await?;\n        let vec_source_actor = SourceActor {\n            source: Box::new(vec_source),\n            doc_processor_mailbox,\n        };\n        let (_vec_source_mailbox, vec_source_handle) =\n            universe.spawn_builder().spawn(vec_source_actor);\n        let (actor_termination, last_observation) = vec_source_handle.join().await;\n        assert!(actor_termination.is_success());\n        assert_eq!(last_observation, json!({\"next_item_idx\": 10}));\n        let messages = doc_processor_inbox.drain_for_test();\n        let batch = messages[0].downcast_ref::<RawDocBatch>().unwrap();\n        assert_eq!(&batch.docs[0], \"2\");\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/source/void_source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{ActorExitStatus, HEARTBEAT, Mailbox};\nuse quickwit_config::VoidSourceParams;\nuse serde_json::Value as JsonValue;\n\nuse crate::actors::DocProcessor;\nuse crate::source::{Source, SourceContext, SourceRuntime, TypedSourceFactory};\n\npub struct VoidSource;\n\n#[async_trait]\nimpl Source for VoidSource {\n    async fn emit_batches(\n        &mut self,\n        _: &Mailbox<DocProcessor>,\n        _: &SourceContext,\n    ) -> Result<Duration, ActorExitStatus> {\n        tokio::time::sleep(*HEARTBEAT / 2).await;\n        Ok(Duration::default())\n    }\n\n    fn name(&self) -> String {\n        \"VoidSource\".to_string()\n    }\n\n    fn observable_state(&self) -> JsonValue {\n        JsonValue::Object(Default::default())\n    }\n}\n\npub struct VoidSourceFactory;\n\n#[async_trait]\nimpl TypedSourceFactory for VoidSourceFactory {\n    type Source = VoidSource;\n\n    type Params = VoidSourceParams;\n\n    async fn typed_create_source(\n        _source_runtime: SourceRuntime,\n        _params: VoidSourceParams,\n    ) -> anyhow::Result<VoidSource> {\n        Ok(VoidSource)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::num::NonZeroUsize;\n\n    use quickwit_actors::{Health, Supervisable, Universe};\n    use quickwit_config::{SourceInputFormat, SourceParams};\n    
use quickwit_proto::types::IndexUid;\n    use serde_json::json;\n\n    use super::*;\n    use crate::source::tests::SourceRuntimeBuilder;\n    use crate::source::{SourceActor, SourceConfig, quickwit_supported_sources};\n\n    #[tokio::test]\n    async fn test_void_source_loading() {\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_config = SourceConfig {\n            source_id: \"test-void-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let source = quickwit_supported_sources()\n            .load_source(source_runtime)\n            .await\n            .unwrap();\n        assert_eq!(source.name(), \"VoidSource\");\n    }\n\n    #[tokio::test]\n    async fn test_void_source_running() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let source_config = SourceConfig {\n            source_id: \"test-void-source\".to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let source_runtime = SourceRuntimeBuilder::new(index_uid, source_config).build();\n        let void_source =\n            VoidSourceFactory::typed_create_source(source_runtime, VoidSourceParams).await?;\n        let (doc_processor_mailbox, _) = universe.create_test_mailbox();\n        let void_source_actor = SourceActor {\n            source: Box::new(void_source),\n            doc_processor_mailbox,\n        };\n        let (_, void_source_handle) = 
universe.spawn_builder().spawn(void_source_actor);\n        assert!(matches!(void_source_handle.check_health(true), Health::Healthy));\n        let (actor_termination, observed_state) = void_source_handle.quit().await;\n        assert_eq!(observed_state, json!({}));\n        assert!(matches!(actor_termination, ActorExitStatus::Quit));\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/split_store/indexing_split_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\nuse std::collections::btree_map::Entry;\nuse std::io;\nuse std::path::{Path, PathBuf};\nuse std::str::FromStr;\nuse std::time::{Duration, SystemTime};\n\nuse anyhow::Context;\nuse bytesize::ByteSize;\nuse quickwit_common::split_file;\nuse quickwit_directories::BundleDirectory;\nuse quickwit_storage::StorageResult;\nuse tantivy::Directory;\nuse tantivy::directory::MmapDirectory;\nuse tokio::sync::Mutex;\nuse tracing::{debug, error, warn};\nuse ulid::Ulid;\n\nuse super::SplitStoreQuota;\n\n// TODO Make this configurable.\nconst SPLIT_MAX_AGE: Duration = Duration::from_secs(2 * 24 * 3_600); // 2 days\n\npub fn get_tantivy_directory_from_split_bundle(\n    split_file: &Path,\n) -> StorageResult<Box<dyn Directory>> {\n    let mmap_directory = MmapDirectory::open(split_file.parent().ok_or_else(|| {\n        io::Error::new(\n            io::ErrorKind::NotFound,\n            format!(\"couldn't find parent for {}\", split_file.display()),\n        )\n    })?)?;\n    let split_fileslice = mmap_directory.open_read(Path::new(&split_file))?;\n    Ok(Box::new(BundleDirectory::open_split(split_fileslice)?))\n}\n\n/// Returns the number of bytes held in a given directory.\nasync fn num_bytes_in_folder(directory_path: &Path) -> io::Result<ByteSize> {\n    let mut total_bytes = 0;\n    let mut read_dir = 
tokio::fs::read_dir(directory_path).await?;\n    while let Some(dir_entry) = read_dir.next_entry().await? {\n        let metadata = dir_entry.metadata().await?;\n        if metadata.is_file() {\n            total_bytes += metadata.len();\n        } else {\n            warn!(\n                \"Unexpected directory found in split cache. {:?}\",\n                dir_entry.path()\n            );\n        }\n    }\n    Ok(ByteSize(total_bytes))\n}\n\n/// The local split store is a cache for freshly indexed splits.\n///\n/// In order to save the cost of an extra write, we store splits in the form\n/// of a directory and the split bundles are built upon upload.\n#[derive(Debug, Copy, Clone)]\nstruct SplitFolder {\n    split_id: Ulid,\n    num_bytes: ByteSize,\n}\n\nimpl SplitFolder {\n    /// Creates a new `SplitFolder`.\n    ///\n    /// There are no specific constraints on `path`.\n    pub async fn create(split_id: &str, path: &Path) -> io::Result<Self> {\n        let split_id = Ulid::from_str(split_id).map_err(|_err| {\n            let error_msg = format!(\"split Id should be an ulid: got `{split_id}`\");\n            io::Error::new(io::ErrorKind::InvalidInput, error_msg)\n        })?;\n        let num_bytes = num_bytes_in_folder(path).await?;\n        Ok(SplitFolder {\n            split_id,\n            num_bytes,\n        })\n    }\n\n    /// Returns the creation time as encoded in the split id ULID.\n    fn creation_time(&self) -> SystemTime {\n        self.split_id.datetime()\n    }\n}\n\nfn split_id_from_split_folder(dir_path: &Path) -> Option<&str> {\n    dir_path.file_name()?.to_str()?.strip_suffix(\".split\")\n}\n\n/// The [`IndexingSplitCache`] is a local cache used to improve the performance of indexing nodes.\n/// Its purpose is simple: when a new split is freshly created, it is usually merged\n/// very rapidly after.\n///\n/// In order to prevent this merge from forcing its download, we store it in the\n/// [`IndexingSplitCache`]. 
This store is just a cache: a cache miss is acceptable and\n/// just means that the split will be downloaded again.\n///\n/// The eviction policy of the indexing split cache, however, is rather uncommon.\n/// On our happy path, a split is stored into the cache, and is then used only once\n/// to undergo a merge.\n///\n/// For this reason, we simply offer a way to `move splits into the cache`,\n/// and `move splits out of the cache`. A split is removed from the split cache\n/// after its first access.\n///\n/// Of course a failed merge could require accessing a given split more than once. In that\n/// case the split will be downloaded again.\n///\n/// The cache size is limited by 3 things:\n/// - a maximum number of splits as defined in the `SplitStoreQuota`.\n/// - a maximum number of bytes as defined in the `SplitStoreQuota`.\n/// - finally, we evict older splits to make sure that the newest split and the oldest split only\n///   differ by at most `SPLIT_MAX_AGE`.\n///\n/// The point of this final rule is to make sure that the disk space will be\n/// released if the cache is NOT under pressure but some splits are actually useless.\n///\n/// When adding a new split into the cache, if adding the split would break one of the above\n/// limits, we simply remove splits one by one, starting with the oldest, until the split\n/// can be added.\npub struct IndexingSplitCache {\n    inner: Mutex<InnerSplitCache>,\n}\n\nstruct InnerSplitCache {\n    split_registry: SplitFolderRegistry,\n    split_store_folder: PathBuf,\n}\n\nstruct SplitFolderRegistry {\n    /// Splits owned by the local split store, which reside in the `split_store_folder`.\n    ///\n    /// Split ids are generated using ULID, so that they are sorted\n    /// according to their creation date.\n    ///\n    /// We evict the oldest split first. 
Note that this is not an LRU strategy:\n    /// we do not care about the last access time, only about\n    /// the creation time.\n    split_folders: BTreeMap<Ulid, ByteSize>,\n    /// The split store quota shared among all indexing split stores.\n    split_store_quota: SplitStoreQuota,\n}\n\nimpl SplitFolderRegistry {\n    pub fn with_quota(split_store_quota: SplitStoreQuota) -> SplitFolderRegistry {\n        SplitFolderRegistry {\n            split_folders: BTreeMap::default(),\n            split_store_quota,\n        }\n    }\n\n    /// Returns an iterator over the split folders sorted by ULID.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    fn iter(&self) -> impl Iterator<Item = SplitFolder> + '_ {\n        self.split_folders\n            .iter()\n            .map(|(&split_id, &num_bytes)| SplitFolder {\n                split_id,\n                num_bytes,\n            })\n    }\n\n    /// Returns `true` if the split was inserted, `false` if it was already present.\n    fn insert(&mut self, split_folder: SplitFolder) -> bool {\n        if let Entry::Vacant(entry) = self.split_folders.entry(split_folder.split_id) {\n            entry.insert(split_folder.num_bytes);\n            self.split_store_quota.add_split(split_folder.num_bytes);\n            true\n        } else {\n            false\n        }\n    }\n\n    /// Removes the split from the registry and returns it, or returns `None` if it was\n    /// not present.\n    fn remove(&mut self, split_id: Ulid) -> Option<SplitFolder> {\n        let num_bytes = self.split_folders.remove(&split_id)?;\n        self.split_store_quota.remove_split(num_bytes);\n        Some(SplitFolder {\n            num_bytes,\n            split_id,\n        })\n    }\n\n    /// Returns the oldest split (oldest in the sense of the ULID = creation time).\n    fn oldest_split(&self) -> Option<Ulid> {\n        let (split_id, _) = self.split_folders.first_key_value()?;\n        Some(*split_id)\n    }\n\n    /// Removes the oldest split.\n    fn pop_oldest(&mut self) -> 
Option<SplitFolder> {\n        let oldest_split_id = self.oldest_split()?;\n        self.remove(oldest_split_id)\n    }\n\n    fn quota(&self) -> &SplitStoreQuota {\n        &self.split_store_quota\n    }\n}\n\nimpl InnerSplitCache {\n    /// Moves a split within the store to an external folder.\n    ///\n    /// Returns `None` if the split is not available in the cache.\n    async fn move_out(\n        &mut self,\n        split_id: Ulid,\n        to_folder: &Path,\n    ) -> StorageResult<Option<PathBuf>> {\n        let Some(split_folder) = self.split_registry.remove(split_id) else {\n            // The split is simply not in cache.\n            return Ok(None);\n        };\n        let from_path = self.split_path(split_id);\n        let to_full_path = to_folder.join(from_path.file_name().unwrap());\n\n        // We voluntarily use a non async operation: A rename is supposed to be\n        // quick, and we want to keep this operation as transactional as\n        // possible. In particular, we don't want our task to be cancelled in the\n        // middle of an inconsistent state.\n        if let Err(io_err) = std::fs::rename(&from_path, &to_full_path) {\n            // We do not simply rely on the `io::ErrorKind::NotFound` here\n            // because it could be about the destination and not the origin.\n            return match from_path.try_exists() {\n                Ok(false) => {\n                    // This could happen if some files have been manually\n                    // deleted from the FS for instance.\n                    warn!(from_path=%from_path.display(), to_full_path=%to_full_path.display(), error=%io_err, \"cached split missing from local split directory\");\n                    Ok(None)\n                }\n                Ok(true) => {\n                    // The file couldn't be copied out but is still in the\n                    // cache, we put it back to the registry to keep the\n                    // statistics accurate\n                  
  warn!(from_path=%from_path.display(), to_full_path=%to_full_path.display(), error=%io_err, \"split stuck in local split cache\");\n                    self.split_registry.insert(split_folder);\n                    Ok(None)\n                }\n                Err(_) => {\n                    // At this point, we are probably in an inconsistent state.\n                    // The split has been removed from our registry but we don't\n                    // know whether the files are still in the cache directory.\n                    error!(from_path=%from_path.display(), to_full_path=%to_full_path.display(), error=%io_err, \"failed to move split directory out of cache\");\n                    Err(From::from(io_err))\n                }\n            };\n        }\n        Ok(Some(to_full_path))\n    }\n\n    /// Returns the directory filepath of a split in cache.\n    fn split_path(&self, split_id: Ulid) -> PathBuf {\n        let split_file = split_file(split_id);\n        self.split_store_folder.join(split_file)\n    }\n\n    /// Remove one split from the cache to make some room.\n    ///\n    /// # Panics\n    /// Panics if there are no remaining splits.\n    async fn evict_one_split(&mut self) -> io::Result<()> {\n        let evicted_split = self\n            .split_registry\n            .pop_oldest()\n            .expect(\"split cache should not be empty\");\n        let result = tokio::fs::remove_dir_all(&self.split_path(evicted_split.split_id)).await;\n        if let Err(io_err) = result {\n            if io_err.kind() == io::ErrorKind::NotFound {\n                // This could happen if some files have been manually deleted\n                // from the FS for instance.\n                warn!(split_id=%evicted_split.split_id, \"cached split missing from local split directory\");\n                return Ok(());\n            } else {\n                return Err(io_err);\n            }\n        }\n        Ok(())\n    }\n\n    /// Tries to move a `split_folder` file 
into the cache.\n    ///\n    /// Move is not an image here. We are literally moving the directory.\n    ///\n    /// If the cache capacity does not allow it returns Ok(false).\n    ///\n    /// Ok(true) means the file was effectively accepted.\n    async fn move_into_cache(&mut self, split_id_str: &str, split_path: &Path) -> io::Result<bool> {\n        let split_folder = SplitFolder::create(split_id_str, split_path).await?;\n        let split_id = split_folder.split_id;\n        let should_move_split = self.make_room_and_record_split(split_folder).await?;\n        if !should_move_split {\n            return Ok(false);\n        }\n        let to_full_path = self.split_path(split_id);\n        if let Err(io_err) = tokio::fs::rename(split_path, &to_full_path).await {\n            // keep the registry stats accurate\n            self.split_registry.remove(split_id);\n            return Err(io_err);\n        }\n        Ok(true)\n    }\n\n    /// Removes all splits that have a creation date older than `limit`.\n    async fn remove_splits_older_than_limit(&mut self, limit: SystemTime) -> io::Result<()> {\n        while let Some(split_id) = self.split_registry.oldest_split() {\n            if split_id.datetime() >= limit {\n                break;\n            }\n            self.evict_one_split().await?;\n        }\n        Ok(())\n    }\n\n    /// Ensures that there is room to store the split:\n    /// - return false if the split should not be added to the cache.\n    /// - return true and record the split if the split should be moved into the cache.\n    async fn make_room_and_record_split(&mut self, split_folder: SplitFolder) -> io::Result<bool> {\n        // We don't accept splits that are too large.\n        if split_folder.num_bytes > self.split_registry.quota().max_num_bytes() {\n            return Ok(false);\n        }\n\n        while !self\n            .split_registry\n            .quota()\n            .can_fit_split(split_folder.num_bytes)\n        {\n          
  self.evict_one_split().await?;\n        }\n\n        if let Some(creation_time_limit) = split_folder.creation_time().checked_sub(SPLIT_MAX_AGE) {\n            self.remove_splits_older_than_limit(creation_time_limit)\n                .await?;\n        };\n        Ok(self.split_registry.insert(split_folder))\n    }\n}\n\nimpl IndexingSplitCache {\n    pub fn no_caching() -> IndexingSplitCache {\n        let split_store_space_quota = SplitStoreQuota::no_caching();\n        let inner = Mutex::new(InnerSplitCache {\n            split_registry: SplitFolderRegistry::with_quota(split_store_space_quota),\n            split_store_folder: PathBuf::from(\"no_caching\"),\n        });\n        IndexingSplitCache { inner }\n    }\n\n    /// Tries to open an existing local split store directory.\n    ///\n    /// If the directory does not exist, it will be created.\n    ///\n    /// The directory is expected to only contain directories\n    /// whose names follow the pattern `<ULID>.split`.\n    ///\n    /// The different pre-existing splits are recorded into\n    /// the split store in their creation order. If the split store\n    /// quota has been modified, the store will undergo the same\n    /// eviction logic.\n    pub async fn open(\n        split_store_folder: PathBuf,\n        space_quota: SplitStoreQuota,\n    ) -> anyhow::Result<IndexingSplitCache> {\n        tokio::fs::create_dir_all(&split_store_folder)\n            .await\n            .context(\"failed to create the split cache directory\")?;\n\n        let mut split_folders: Vec<SplitFolder> = Vec::new();\n\n        let mut read_dir = tokio::fs::read_dir(&split_store_folder).await?;\n        while let Some(dir_entry) = read_dir.next_entry().await? 
{\n            let metadata = dir_entry.metadata().await?;\n            let dir_path: PathBuf = dir_entry.path();\n\n            if metadata.is_file() {\n                warn!(\n                    \"unexpected file found in split cache directory: `{}`\",\n                    dir_path.display()\n                );\n                continue;\n            }\n\n            let split_id = split_id_from_split_folder(&dir_path).ok_or_else(|| {\n                let error_msg = format!(\n                    \"split folder name should match the format `<split_id>.split`: got `{}`\",\n                    dir_path.display()\n                );\n                io::Error::new(io::ErrorKind::InvalidInput, error_msg)\n            })?;\n\n            let split_folder = SplitFolder::create(split_id, &dir_entry.path()).await?;\n            split_folders.push(split_folder);\n        }\n\n        let mut inner_local_split_store = InnerSplitCache {\n            split_store_folder: split_store_folder.clone(),\n            split_registry: SplitFolderRegistry::with_quota(space_quota),\n        };\n\n        split_folders.sort_by_key(SplitFolder::creation_time);\n\n        // We record all `split_folder`, sorted by `creation_time`.\n        for split_folder in split_folders {\n            let split_id = split_folder.split_id;\n            if !inner_local_split_store\n                .make_room_and_record_split(split_folder)\n                .await?\n            {\n                let split_dir = inner_local_split_store.split_path(split_id);\n                tokio::fs::remove_dir_all(&split_dir).await?;\n            }\n        }\n\n        Ok(IndexingSplitCache {\n            inner: Mutex::new(inner_local_split_store),\n        })\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn inspect_registry(&self) -> std::collections::HashMap<String, ByteSize> {\n        self.inner\n            .lock()\n            .await\n            .split_registry\n            .iter()\n    
        .map(|split_folder| (split_folder.split_id.to_string(), split_folder.num_bytes))\n            .collect()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn inspect_quota(&self) -> SplitStoreQuota {\n        self.inner\n            .lock()\n            .await\n            .split_registry\n            .split_store_quota\n            .clone()\n    }\n\n    /// Returns a cached split to perform a merge operation.\n    ///\n    /// For simplicity, this method optimistically assumes that the merge operation will be\n    /// successful and removes the split from the cache.\n    ///\n    /// If the merge operation is a failure and needs to be re-executed, we will\n    /// experience a cache miss, and the split will be downloaded from the\n    /// storage.\n    pub(super) async fn get_cached_split(\n        &self,\n        split_id: &str,\n        output_dir_path: &Path,\n    ) -> StorageResult<Option<PathBuf>> {\n        let mut split_store_lock = self.inner.lock().await;\n        let split_ulid = if let Ok(split_ulid) = Ulid::from_str(split_id) {\n            split_ulid\n        } else {\n            return Ok(None);\n        };\n        let split_file_opt: Option<PathBuf> = split_store_lock\n            .move_out(split_ulid, output_dir_path)\n            .await?;\n        if split_file_opt.is_none() {\n            debug!(split_id, \"split folder not in cache\");\n        }\n        Ok(split_file_opt)\n    }\n\n    /// Tries to move a `split_folder` file into the cache.\n    ///\n    /// Move is not an image here. 
We are literally moving the directory.\n    ///\n    /// If the cache capacity does not allow it, this function\n    /// just logs a warning and returns Ok(false).\n    ///\n    /// Ok(true) means the file was effectively accepted.\n    pub(super) async fn move_into_cache(\n        &self,\n        split_id: &str,\n        split_path: &Path,\n    ) -> io::Result<bool> {\n        assert!(split_path.is_dir());\n        let mut inner = self.inner.lock().await;\n        inner.move_into_cache(split_id, split_path).await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::fs::File;\n    use std::io;\n    use std::io::Write;\n    use std::path::Path;\n    use std::time::Duration;\n\n    use bytesize::ByteSize;\n    use quickwit_directories::BundleDirectory;\n    use quickwit_storage::{PutPayload, SplitPayloadBuilder};\n    use tantivy::Directory;\n    use tantivy::directory::FileSlice;\n    use tempfile::tempdir;\n    use tokio::fs;\n    use ulid::Ulid;\n\n    use super::SPLIT_MAX_AGE;\n    use crate::split_store::{IndexingSplitCache, SplitStoreQuota};\n\n    async fn create_fake_split(\n        split_cache_path: &Path,\n        split_id: &str,\n        len: usize,\n    ) -> io::Result<()> {\n        let split_path = split_cache_path.join(format!(\"{split_id}.split\"));\n        fs::create_dir(&split_path).await?;\n        fs::write(split_path.join(\"splitdata\"), &vec![0u8; len]).await?;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_local_split_store_load_existing_splits() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let split_id1 = \"01GF5449X7DA53TK9F9W2ZJST2\";\n        let split_id2 = \"01GF545472A06WY07SEHGCJF9P\";\n        create_fake_split(temp_dir.path(), split_id1, 15).await?;\n        create_fake_split(temp_dir.path(), split_id2, 13).await?;\n        let split_store_space_quota = SplitStoreQuota::default();\n        let split_store =\n            IndexingSplitCache::open(temp_dir.path().to_path_buf(), 
split_store_space_quota)\n                .await?;\n        let cache_content = split_store.inspect_registry().await;\n        assert_eq!(cache_content.len(), 2);\n        assert_eq!(cache_content.get(split_id1).cloned(), Some(ByteSize(15)));\n        assert_eq!(cache_content.get(split_id2).cloned(), Some(ByteSize(13)));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_create_with_too_many_files() {\n        let dir = tempdir().unwrap();\n        create_fake_split(dir.path(), \"01GF5215TMV48JT7GZ543BV193\", 12)\n            .await\n            .unwrap(); // 1\n        create_fake_split(dir.path(), \"01GF520MTTRNCCTQZE264BBYWM\", 23)\n            .await\n            .unwrap(); // 0\n        create_fake_split(dir.path(), \"01GF521M316V9AEHZWTHN76F2V\", 5)\n            .await\n            .unwrap(); // 3\n        create_fake_split(dir.path(), \"01GF521CZC1260V8QPA81T46X7\", 45)\n            .await\n            .unwrap(); // 2\n        let split_store_space_quota = SplitStoreQuota::try_new(2, ByteSize::kb(1)).unwrap();\n        let local_split_store =\n            IndexingSplitCache::open(dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n        assert_eq!(local_split_store.inspect_registry().await.len(), 2);\n        let quota = local_split_store.inspect_quota().await;\n        assert_eq!(quota.used_num_bytes(), ByteSize(50));\n    }\n\n    #[tokio::test]\n    async fn test_create_with_exceeds_num_bytes() {\n        let dir = tempdir().unwrap();\n        create_fake_split(dir.path(), \"01GF5215TMV48JT7GZ543BV193\", 12)\n            .await\n            .unwrap(); // 1\n        create_fake_split(dir.path(), \"01GF520MTTRNCCTQZE264BBYWM\", 23)\n            .await\n            .unwrap(); // 0\n        create_fake_split(dir.path(), \"01GF521M316V9AEHZWTHN76F2V\", 5)\n            .await\n            .unwrap(); // 3\n        create_fake_split(dir.path(), \"01GF521CZC1260V8QPA81T46X7\", 45)\n            
.await\n            .unwrap(); // 2\n        let split_store_space_quota = SplitStoreQuota::try_new(6, ByteSize(61)).unwrap();\n        let local_split_store =\n            IndexingSplitCache::open(dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n        let cache_content = local_split_store.inspect_registry().await;\n        assert_eq!(cache_content.len(), 2);\n        assert_eq!(cache_content.values().map(|v| v.as_u64()).sum::<u64>(), 50);\n        let quota = local_split_store.inspect_quota().await;\n        assert_eq!(quota.used_num_bytes(), ByteSize(50));\n    }\n\n    #[tokio::test]\n    async fn test_big_split_evicts_all() {\n        let dir = tempdir().unwrap();\n        create_fake_split(dir.path(), \"01GF5215TMV48JT7GZ543BV193\", 100)\n            .await\n            .unwrap(); // 1\n        create_fake_split(dir.path(), \"01GF520MTTRNCCTQZE264BBYWM\", 100)\n            .await\n            .unwrap(); // 0\n        create_fake_split(dir.path(), \"01GF521M316V9AEHZWTHN76F2V\", 100)\n            .await\n            .unwrap(); // 3\n        create_fake_split(dir.path(), \"01GF521CZC1260V8QPA81T46X7\", 100)\n            .await\n            .unwrap(); // 2\n        let split_store_space_quota = SplitStoreQuota::try_new(6, ByteSize::b(401)).unwrap();\n        let local_split_store =\n            IndexingSplitCache::open(dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n        assert_eq!(local_split_store.inspect_registry().await.len(), 4);\n\n        let extra_split = tempdir().unwrap();\n        fs::write(extra_split.path().join(\"splitdata\"), &vec![0u8; 400])\n            .await\n            .unwrap();\n        local_split_store\n            .move_into_cache(\"01GFCZJBMBMEPMAQSFD09VTST2\", extra_split.path())\n            .await\n            .unwrap();\n        assert_eq!(local_split_store.inspect_registry().await.len(), 1);\n        let quota = 
local_split_store.inspect_quota().await;\n        assert_eq!(quota.used_num_bytes(), ByteSize(400));\n    }\n\n    #[tokio::test]\n    async fn test_remove_splits_out_of_age() {\n        let dir = tempdir().unwrap();\n        // 2022-10-13T06:12:37.643Z\n        create_fake_split(dir.path(), \"01GF7ZJBMBMEPMAQSFD09VTST2\", 1)\n            .await\n            .unwrap();\n        // 2022-10-12T20:53:23.211Z\n        create_fake_split(dir.path(), \"01GF6ZJBMBMEPMAQSFD09VTST2\", 1)\n            .await\n            .unwrap();\n        // 2022-10-12T02:14:54.347Z\n        create_fake_split(dir.path(), \"01GF4ZJBMBMEPMAQSFD09VTST2\", 1)\n            .await\n            .unwrap();\n        // 2022-10-10T22:17:11.051Z\n        create_fake_split(dir.path(), \"01GF1ZJBMBMEPMAQSFD09VTST2\", 1)\n            .await\n            .unwrap();\n        let split_store_space_quota = SplitStoreQuota::try_new(6, ByteSize(100)).unwrap();\n        let local_split_store =\n            IndexingSplitCache::open(dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n        let cache_content = local_split_store.inspect_registry().await;\n        assert_eq!(cache_content.len(), 3);\n\n        // adding a split with a large time gap only keeps splits younger than SPLIT_MAX_AGE\n        assert_eq!(\n            SPLIT_MAX_AGE,\n            Duration::from_secs(2 * 24 * 3_600),\n            \"update this test if SPLIT_MAX_AGE changes\"\n        );\n        {\n            let extra_split = tempdir().unwrap();\n            local_split_store\n                // 2022-10-15T4:48:49.803Z\n                .move_into_cache(\"01GFCZJBMBMEPMAQSFD09VTST2\", extra_split.path())\n                .await\n                .unwrap();\n            let cache_content = local_split_store.inspect_registry().await;\n            assert_eq!(cache_content.len(), 2);\n            let quota = local_split_store.inspect_quota().await;\n            
assert_eq!(quota.used_num_bytes(), ByteSize(1));\n        }\n        {\n            // adding a split with a huge time gap should empty the cache entirely first\n            let extra_split = tempdir().unwrap();\n            let was_accepted = local_split_store\n                // 2025-01-13T14:28:17.364Z\n                .move_into_cache(\"01JHG11FAM8F2XPWHY24R3HF6M\", extra_split.path())\n                .await\n                .unwrap();\n            assert!(was_accepted);\n            let cache_content = local_split_store.inspect_registry().await;\n            assert_eq!(cache_content.len(), 1);\n            let quota = local_split_store.inspect_quota().await;\n            assert_eq!(quota.used_num_bytes(), ByteSize(0));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_stream_split_to_bundle_and_open() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n        let mut file1 = File::create(&test_filepath1).unwrap();\n        file1.write_all(b\"ab\").unwrap();\n        let mut file2 = File::create(&test_filepath2).unwrap();\n        file2.write_all(b\"def\").unwrap();\n        let split_streamer = SplitPayloadBuilder::get_split_payload(\n            &[test_filepath1, test_filepath2],\n            &[],\n            b\"hotcache\",\n        )\n        .unwrap();\n        let data = split_streamer.read_all().await.unwrap();\n        let bundle_dir = BundleDirectory::open_split(FileSlice::from(data.to_vec())).unwrap();\n        let f1_data = bundle_dir.atomic_read(Path::new(\"f1\")).unwrap();\n        assert_eq!(&*f1_data, b\"ab\");\n        let f2_data = bundle_dir.atomic_read(Path::new(\"f2\")).unwrap();\n        assert_eq!(&f2_data[..], b\"def\");\n    }\n\n    #[tokio::test]\n    async fn test_store_and_fetch() {\n        let temp_dir_in = tempfile::tempdir().unwrap();\n        let split_id = Ulid::default().to_string();\n        
let cache_dir = tempfile::tempdir().unwrap();\n        let quota = SplitStoreQuota::default();\n        let local_store = IndexingSplitCache::open(cache_dir.path().to_path_buf(), quota)\n            .await\n            .unwrap();\n        {\n            let split_dir = temp_dir_in.path().join(format!(\"scratch_{split_id}\"));\n            tokio::fs::create_dir(&split_dir).await.unwrap();\n            assert!(\n                local_store\n                    .move_into_cache(&split_id, &split_dir)\n                    .await\n                    .unwrap()\n            );\n            assert!(!split_dir.try_exists().unwrap());\n        }\n        {\n            let split_path = local_store\n                .get_cached_split(&split_id, temp_dir_in.path())\n                .await\n                .unwrap()\n                .unwrap();\n            assert!(split_path.try_exists().unwrap());\n            assert_eq!(split_path.parent().unwrap(), temp_dir_in.path());\n        }\n        {\n            // cache miss because the previous get_cached_split removed the split from the cache\n            let split_path_opt = local_store\n                .get_cached_split(&split_id, temp_dir_in.path())\n                .await\n                .unwrap();\n            assert_eq!(split_path_opt, None);\n        }\n    }\n\n    async fn clear_dir_manually(dir: &Path) {\n        let mut entries = fs::read_dir(dir).await.unwrap();\n        while let Some(entry) = entries.next_entry().await.unwrap() {\n            let path = entry.path();\n            if path.is_dir() {\n                fs::remove_dir_all(&path).await.unwrap();\n            } else {\n                fs::remove_file(&path).await.unwrap();\n            }\n        }\n    }\n\n    #[tokio::test]\n    async fn test_fetch_manually_deleted_split() {\n        let dir = tempdir().unwrap();\n        create_fake_split(dir.path(), \"01GF5215TMV48JT7GZ543BV193\", 100)\n            .await\n            .unwrap();\n        let 
split_store_space_quota = SplitStoreQuota::try_new(6, ByteSize::b(401)).unwrap();\n        let local_split_store =\n            IndexingSplitCache::open(dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n        assert_eq!(local_split_store.inspect_registry().await.len(), 1);\n\n        clear_dir_manually(dir.path()).await;\n\n        let target_dir = tempdir().unwrap();\n        let path_opt = local_split_store\n            .get_cached_split(\"01GF5215TMV48JT7GZ543BV193\", target_dir.path())\n            .await\n            .unwrap();\n        assert_eq!(path_opt, None);\n        assert_eq!(local_split_store.inspect_registry().await.len(), 0);\n        let quota = local_split_store.inspect_quota().await;\n        assert_eq!(quota.used_num_bytes(), ByteSize(0));\n    }\n\n    #[tokio::test]\n    async fn test_evict_manually_deleted_split() {\n        let dir = tempdir().unwrap();\n        // // 2022-10-12T20:53:23.211Z\n        create_fake_split(dir.path(), \"01GF6ZJBMBMEPMAQSFD09VTST2\", 100)\n            .await\n            .unwrap();\n        let split_store_space_quota = SplitStoreQuota::try_new(1, ByteSize::b(401)).unwrap();\n        let local_split_store =\n            IndexingSplitCache::open(dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n        assert_eq!(local_split_store.inspect_registry().await.len(), 1);\n\n        clear_dir_manually(dir.path()).await;\n\n        let extra_split = tempdir().unwrap();\n        let was_accepted = local_split_store\n            // 2022-10-12T02:14:54.347Z\n            .move_into_cache(\"01GF4ZJBMBMEPMAQSFD09VTST2\", extra_split.path())\n            .await\n            .unwrap();\n        assert!(was_accepted);\n        assert_eq!(local_split_store.inspect_registry().await.len(), 1);\n        let quota = local_split_store.inspect_quota().await;\n        assert_eq!(quota.used_num_bytes(), ByteSize(0));\n    }\n\n    
#[tokio::test]\n    async fn test_load_same_split_twice() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let split_id = \"01GF5449X7DA53TK9F9W2ZJST2\";\n        create_fake_split(temp_dir.path(), split_id, 15)\n            .await\n            .unwrap();\n        let split_store_space_quota = SplitStoreQuota::default();\n        let split_store =\n            IndexingSplitCache::open(temp_dir.path().to_path_buf(), split_store_space_quota)\n                .await\n                .unwrap();\n\n        let extra_split = tempdir().unwrap();\n        let extra_split_filepath = temp_dir.path().join(\"splitfile\");\n        let mut extra_split_file = File::create(&extra_split_filepath).unwrap();\n        extra_split_file.write_all(&[0u8; 15]).unwrap();\n\n        let was_accepted = split_store\n            .move_into_cache(split_id, extra_split.path())\n            .await\n            .unwrap();\n        assert!(!was_accepted);\n        let quota = split_store.inspect_quota().await;\n        assert_eq!(quota.used_num_bytes(), ByteSize(15));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/split_store/indexing_split_store.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#[cfg(any(test, feature = \"testsuite\"))]\nuse std::collections::HashMap;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::time::Instant;\n\nuse anyhow::Context;\n#[cfg(any(test, feature = \"testsuite\"))]\nuse bytesize::ByteSize;\nuse quickwit_common::io::{IoControls, IoControlsAccess};\nuse quickwit_common::uri::Uri;\nuse quickwit_metastore::SplitMetadata;\nuse quickwit_storage::{PutPayload, Storage, StorageResult};\nuse tantivy::Directory;\nuse tantivy::directory::{Advice, MmapDirectory};\nuse time::OffsetDateTime;\nuse tracing::{Instrument, debug, info_span, instrument};\n\nuse super::IndexingSplitCache;\nuse crate::get_tantivy_directory_from_split_bundle;\n\n/// IndexingSplitStore is a wrapper around a regular `Storage` to upload and\n/// download splits while allowing for efficient caching.\n///\n/// We typically index with a limited amount of RAM or some constraints on the\n/// expected time-to-search.\n/// Because of these constraints, the indexer produces splits that are smaller\n/// than optimal and need to be merged.\n///\n/// A split therefore typically undergoes a few merges relatively shortly after\n/// its creation.\n///\n/// In order to alleviate the disk IO as well as the network bandwidth,\n/// we save new splits into a split store.\n///\n/// The role of the `IndexingSplitStore` is to combine a cache and a 
storage\n/// to avoid unnecessary download of fresh splits. Its behavior is, however, very different\n/// from a usual cache as we have a strong knowledge of the split lifecycle.\n///\n/// The splits are stored on the local filesystem in the `IndexingSplitCache`.\n#[derive(Clone)]\npub struct IndexingSplitStore {\n    inner: Arc<InnerIndexingSplitStore>,\n}\n\nstruct InnerIndexingSplitStore {\n    /// The remote storage.\n    remote_storage: Arc<dyn Storage>,\n    split_cache: Arc<IndexingSplitCache>,\n}\n\nimpl IndexingSplitStore {\n    /// Creates an instance of [`IndexingSplitStore`].\n    ///\n    /// It needs the remote storage to work with.\n    pub fn new(remote_storage: Arc<dyn Storage>, split_cache: Arc<IndexingSplitCache>) -> Self {\n        let inner = InnerIndexingSplitStore {\n            remote_storage,\n            split_cache,\n        };\n        Self {\n            inner: Arc::new(inner),\n        }\n    }\n\n    /// Helper function to create an indexing split store for tests.\n    /// The resulting store does not have any local cache.\n    pub fn create_without_local_store_for_test(remote_storage: Arc<dyn Storage>) -> Self {\n        let inner = InnerIndexingSplitStore {\n            remote_storage,\n            split_cache: Arc::new(IndexingSplitCache::no_caching()),\n        };\n        IndexingSplitStore {\n            inner: Arc::new(inner),\n        }\n    }\n\n    pub fn remote_uri(&self) -> &Uri {\n        self.inner.remote_storage.uri()\n    }\n\n    fn split_path(&self, split_id: &str) -> PathBuf {\n        PathBuf::from(quickwit_common::split_file(split_id))\n    }\n\n    /// Stores a split.\n    ///\n    /// If a split is identified as mature by the merge policy,\n    /// it will not be cached into the local storage.\n    ///\n    /// In order to limit the write IO, the file might be moved (and not copied into\n    /// the store).\n    /// In other words, after calling this function the file will not be available\n    /// at 
`split_folder` anymore.\n    #[instrument(\"store_split\", skip_all)]\n    pub async fn store_split(\n        &self,\n        split: &SplitMetadata,\n        split_folder_path: &Path,\n        put_payload: Box<dyn PutPayload>,\n    ) -> anyhow::Result<()> {\n        let start = Instant::now();\n        let split_num_bytes = put_payload.len();\n\n        let key = self.split_path(split.split_id());\n        let is_mature = split.is_mature(OffsetDateTime::now_utc());\n        self.inner\n            .remote_storage\n            .put(&key, put_payload)\n            .instrument(info_span!(\"store_split_in_remote_storage\", split=?split.split_id(), is_mature=is_mature, num_bytes=split_num_bytes))\n            .await\n            .with_context(|| {\n                format!(\n                    \"failed uploading key {} in bucket {}\",\n                    key.display(),\n                    self.inner.remote_storage.uri()\n                )\n            })?;\n\n        let elapsed_secs = start.elapsed().as_secs_f32();\n        let split_size_in_megabytes = split_num_bytes as f32 / 1_000_000f32;\n        let throughput_mb_s = split_size_in_megabytes / elapsed_secs;\n\n        debug!(\n            split_size_in_megabytes = %split_size_in_megabytes,\n            num_docs = %split.num_docs,\n            elapsed_secs = %elapsed_secs,\n            throughput_mb_s = %throughput_mb_s,\n            is_mature = is_mature,\n            \"store-split-remote-success\"\n        );\n\n        if !is_mature {\n            debug!(\"store-in-cache\");\n            if self\n                .inner\n                .split_cache\n                .move_into_cache(split.split_id(), split_folder_path)\n                .await?\n            {\n                return Ok(());\n            }\n        }\n        tokio::fs::remove_dir_all(split_folder_path).await?;\n        Ok(())\n    }\n\n    /// Gets a split from the split store, and makes it available to the given `output_path`.\n    /// If the 
split is available in the local disk cache, then it will be moved\n    /// from the cache to the `output_dir_path`.\n    ///\n    /// The `output_dir_path` is expected to be a directory path.\n    ///\n    /// If the split is not in the cache, it will be fetched from the remote `Storage`.\n    ///\n    /// # Implementation detail:\n    ///\n    /// Depending on whether the split was obtained from the `Storage`\n    /// or the cache, it could consist of a directory or a proper split file.\n    /// This method takes care of opening the split correctly.\n    ///\n    /// As we fetch the split, we optimistically assume that this is for a merge\n    /// operation that will be successful and we remove the split from the cache.\n    #[instrument(skip(self, output_dir_path, io_controls), fields(cache_hit))]\n    pub async fn fetch_and_open_split(\n        &self,\n        split_id: &str,\n        output_dir_path: &Path,\n        io_controls: &IoControls,\n    ) -> StorageResult<Box<dyn Directory>> {\n        let path = PathBuf::from(quickwit_common::split_file(split_id));\n        if let Some(split_path) = self\n            .inner\n            .split_cache\n            .get_cached_split(split_id, output_dir_path)\n            .await?\n        {\n            tracing::Span::current().record(\"cache_hit\", true);\n            let mmap_directory: Box<dyn Directory> = Box::new(MmapDirectory::open_with_madvice(\n                split_path,\n                Advice::Sequential,\n            )?);\n            return Ok(mmap_directory);\n        } else {\n            tracing::Span::current().record(\"cache_hit\", false);\n        }\n        let dest_filepath = output_dir_path.join(&path);\n        let dest_file = tokio::fs::File::create(&dest_filepath).await?;\n        let mut dest_file_with_write_limit = io_controls.clone().wrap_write(dest_file);\n        self.inner\n            .remote_storage\n            .copy_to(&path, &mut dest_file_with_write_limit)\n            
.instrument(info_span!(\"fetch_split_from_remote_storage\", path=?path))\n            .await?;\n        get_tantivy_directory_from_split_bundle(&dest_filepath)\n    }\n\n    /// Takes a snapshot of the cache view (only used for testing).\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn inspect_split_cache(&self) -> HashMap<String, ByteSize> {\n        self.inner.split_cache.inspect_registry().await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use bytesize::ByteSize;\n    use quickwit_common::io::IoControls;\n    use quickwit_metastore::{SplitMaturity, SplitMetadata};\n    use quickwit_storage::{PutPayload, RamStorage, SplitPayloadBuilder};\n    use tempfile::tempdir;\n    use time::OffsetDateTime;\n    use tokio::fs;\n    use ulid::Ulid;\n\n    use super::IndexingSplitStore;\n    use crate::split_store::{IndexingSplitCache, SplitStoreQuota};\n\n    fn create_test_split_metadata(split_id: &str) -> SplitMetadata {\n        SplitMetadata {\n            split_id: split_id.to_string(),\n            create_timestamp: OffsetDateTime::now_utc().unix_timestamp(),\n            maturity: SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(3600),\n            },\n            ..Default::default()\n        }\n    }\n\n    #[tokio::test]\n    async fn test_local_store_cache_in_and_out() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let split_cache_dir = tempdir()?;\n\n        let split_cache = IndexingSplitCache::open(\n            split_cache_dir.path().to_path_buf(),\n            SplitStoreQuota::default(),\n        )\n        .await?;\n        let remote_storage = Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::new(remote_storage, Arc::new(split_cache));\n\n        let split_id1 = Ulid::new().to_string();\n        let split_id2 = Ulid::new().to_string();\n\n        {\n            let split1_dir = 
temp_dir.path().join(&split_id1);\n            fs::create_dir_all(&split1_dir).await?;\n            let split_metadata1 = create_test_split_metadata(&split_id1);\n            fs::write(split1_dir.join(\"splitfile\"), b\"1234\").await?;\n            split_store\n                .store_split(&split_metadata1, &split1_dir, Box::new(b\"1234\".to_vec()))\n                .await?;\n            assert!(!split1_dir.try_exists()?);\n            assert!(\n                split_cache_dir\n                    .path()\n                    .join(format!(\"{split_id1}.split\"))\n                    .try_exists()?\n            );\n            let local_store_stats = split_store.inspect_split_cache().await;\n            assert_eq!(local_store_stats.len(), 1);\n            assert_eq!(\n                local_store_stats.get(&split_id1).cloned(),\n                Some(ByteSize(4))\n            );\n        }\n        {\n            let split2_dir = temp_dir.path().join(&split_id2);\n            fs::create_dir_all(&split2_dir).await?;\n            fs::write(split2_dir.join(\"splitfile\"), b\"567\").await?;\n            let split_metadata2 = create_test_split_metadata(&split_id2);\n            split_store\n                .store_split(&split_metadata2, &split2_dir, Box::new(b\"567\".to_vec()))\n                .await?;\n            assert!(!split2_dir.try_exists()?);\n            assert!(\n                split_cache_dir\n                    .path()\n                    .join(format!(\"{split_id2}.split\"))\n                    .try_exists()?\n            );\n        }\n\n        let local_store_stats = split_store.inspect_split_cache().await;\n        assert_eq!(local_store_stats.len(), 2);\n        assert_eq!(\n            local_store_stats.get(&split_id1).cloned(),\n            Some(ByteSize(4))\n        );\n        assert_eq!(\n            local_store_stats.get(&split_id2).cloned(),\n            Some(ByteSize(3))\n        );\n\n        let io_controls = IoControls::default();\n       
 {\n            let output = tempfile::tempdir()?;\n            let split1 = split_store\n                .fetch_and_open_split(&split_id1, output.path(), &io_controls)\n                .await?;\n            let local_store_stats = split_store.inspect_split_cache().await;\n            assert_eq!(local_store_stats.len(), 1);\n            assert!(split1.exists(std::path::Path::new(\"splitfile\")).unwrap());\n        }\n        {\n            let output = tempfile::tempdir()?;\n            let split2 = split_store\n                .fetch_and_open_split(&split_id2, output.path(), &io_controls)\n                .await?;\n            let local_store_stats = split_store.inspect_split_cache().await;\n            assert_eq!(local_store_stats.len(), 0);\n            assert!(split2.exists(std::path::Path::new(\"splitfile\")).unwrap());\n        }\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_eviction_and_fallback_to_remote() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n\n        let split_cache_dir = tempdir()?;\n        let split_cache = IndexingSplitCache::open(\n            split_cache_dir.path().to_path_buf(),\n            SplitStoreQuota::try_new(1, ByteSize::mb(1)).unwrap(),\n        )\n        .await?;\n\n        let remote_storage = Arc::new(RamStorage::default());\n        let split_store = IndexingSplitStore::new(remote_storage, Arc::new(split_cache));\n\n        let split_id1 = Ulid::new().to_string();\n        let split_payload1 = SplitPayloadBuilder::get_split_payload(&[], &[], &[5, 5, 5])?;\n        let split_id2 = Ulid::new().to_string();\n        let split_payload2 = SplitPayloadBuilder::get_split_payload(&[], &[], &[5, 5, 5, 5])?;\n\n        {\n            let split_path = temp_dir.path().join(&split_id1);\n            fs::create_dir_all(&split_path).await?;\n            fs::write(split_path.join(\"splitdatafile\"), b\"hello-world\").await?;\n            let split_metadata1 = 
create_test_split_metadata(&split_id1);\n            split_store\n                .store_split(\n                    &split_metadata1,\n                    &split_path,\n                    Box::new(split_payload1.clone()),\n                )\n                .await?;\n            assert!(!split_path.try_exists()?);\n            assert!(\n                split_cache_dir\n                    .path()\n                    .join(format!(\"{split_id1}.split\"))\n                    .try_exists()?\n            );\n            let split_cache_stats = split_store.inspect_split_cache().await;\n            assert_eq!(split_cache_stats.len(), 1);\n            assert_eq!(\n                split_cache_stats.get(&split_id1).cloned(),\n                Some(ByteSize(11))\n            );\n        }\n        {\n            let split_path = temp_dir.path().join(&split_id2);\n            fs::create_dir_all(&split_path).await?;\n            fs::write(split_path.join(\"splitdatafile2\"), b\"hello-world2\").await?;\n            let split_metadata2 = create_test_split_metadata(&split_id2);\n\n            split_store\n                .store_split(\n                    &split_metadata2,\n                    &split_path,\n                    Box::new(split_payload2.clone()),\n                )\n                .await?;\n            assert!(!split_path.try_exists()?);\n            assert!(\n                split_cache_dir\n                    .path()\n                    .join(format!(\"{split_id2}.split\"))\n                    .try_exists()?\n            );\n            let split_cache_stats = split_store.inspect_split_cache().await;\n            assert_eq!(split_cache_stats.len(), 1);\n            assert_eq!(\n                split_cache_stats.get(&split_id2).cloned(),\n                Some(ByteSize(12))\n            );\n        }\n        let io_controls = IoControls::default();\n        {\n            // get from remote storage because split_id1 was evicted by split_id2\n            let 
output = tempfile::tempdir()?;\n            let _split1 = split_store\n                .fetch_and_open_split(&split_id1, output.path(), &io_controls)\n                .await?;\n            assert_eq!(io_controls.num_bytes(), split_payload1.len());\n        }\n        {\n            // get from cache\n            let output = tempfile::tempdir()?;\n            let _split2 = split_store\n                .fetch_and_open_split(&split_id2, output.path(), &io_controls)\n                .await?;\n            // the number of downloaded bytes didn't change (still the size of split_payload1)\n            assert_eq!(io_controls.num_bytes(), split_payload1.len());\n        }\n        {\n            // get from remote because getting from cache removes the split from the cache\n            let output = tempfile::tempdir()?;\n            let _split2 = split_store\n                .fetch_and_open_split(&split_id2, output.path(), &io_controls)\n                .await?;\n            assert_eq!(\n                io_controls.num_bytes(),\n                split_payload1.len() + split_payload2.len()\n            );\n        }\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/split_store/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod indexing_split_cache;\nmod indexing_split_store;\nmod split_store_quota;\n\npub use indexing_split_cache::{IndexingSplitCache, get_tantivy_directory_from_split_bundle};\npub use indexing_split_store::IndexingSplitStore;\npub use split_store_quota::SplitStoreQuota;\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/split_store/split_store_quota.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytesize::ByteSize;\nuse quickwit_config::IndexerConfig;\n\n/// A struct for keeping in check multiple SplitStore.\n#[derive(Debug, Clone)]\npub struct SplitStoreQuota {\n    /// Current number of splits in the cache.\n    num_splits_in_cache: usize,\n    /// Current size in bytes of splits in the cache.\n    size_in_bytes_in_cache: ByteSize,\n    /// Maximum number of files allowed in the cache.\n    max_num_splits: usize,\n    /// Maximum size in bytes allowed in the cache. 
0 if max_num_splits=0.\n    max_num_bytes: ByteSize,\n}\n\nimpl Default for SplitStoreQuota {\n    fn default() -> Self {\n        Self {\n            num_splits_in_cache: 0,\n            size_in_bytes_in_cache: ByteSize::default(),\n            max_num_bytes: IndexerConfig::default_split_store_max_num_bytes(),\n            max_num_splits: IndexerConfig::default_split_store_max_num_splits(),\n        }\n    }\n}\n\nimpl SplitStoreQuota {\n    pub fn try_new(max_num_splits: usize, max_num_bytes: ByteSize) -> anyhow::Result<Self> {\n        if max_num_splits == 0 && max_num_bytes.as_u64() > 0 {\n            anyhow::bail!(\"max_num_bytes cannot be > 0 if max_num_splits is 0\");\n        }\n        Ok(Self {\n            max_num_splits,\n            max_num_bytes,\n            ..Default::default()\n        })\n    }\n\n    /// Space quota that prevents any caching.\n    pub fn no_caching() -> Self {\n        Self::try_new(0, ByteSize::default()).unwrap()\n    }\n\n    pub fn can_fit_split(&self, split_size_in_bytes: ByteSize) -> bool {\n        if self.num_splits_in_cache >= self.max_num_splits {\n            return false;\n        }\n        if self.size_in_bytes_in_cache.as_u64() + split_size_in_bytes.as_u64()\n            > self.max_num_bytes.as_u64()\n        {\n            return false;\n        }\n        true\n    }\n\n    pub fn add_split(&mut self, split_size_in_bytes: ByteSize) {\n        self.num_splits_in_cache += 1;\n        self.size_in_bytes_in_cache =\n            ByteSize(self.size_in_bytes_in_cache.as_u64() + split_size_in_bytes.as_u64());\n    }\n\n    pub fn remove_split(&mut self, split_size_in_bytes: ByteSize) {\n        self.size_in_bytes_in_cache =\n            ByteSize(self.size_in_bytes_in_cache.as_u64() - split_size_in_bytes.as_u64());\n        self.num_splits_in_cache -= 1;\n    }\n\n    pub fn max_num_bytes(&self) -> ByteSize {\n        self.max_num_bytes\n    }\n\n    pub fn used_num_bytes(&self) -> ByteSize {\n        
self.size_in_bytes_in_cache\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytesize::ByteSize;\n\n    use crate::split_store::SplitStoreQuota;\n\n    #[test]\n    fn test_invalid_quota() {\n        SplitStoreQuota::try_new(0, ByteSize(100)).unwrap_err();\n    }\n\n    #[test]\n    fn test_split_store_quota_max_bytes_accepted() {\n        let split_store_quota = SplitStoreQuota::try_new(3, ByteSize(100)).unwrap();\n        assert!(split_store_quota.can_fit_split(ByteSize(100)));\n    }\n\n    #[test]\n    fn test_split_store_quota_exceeding_bytes() {\n        let split_store_quota = SplitStoreQuota::try_new(3, ByteSize(100)).unwrap();\n        assert!(!split_store_quota.can_fit_split(ByteSize(101)));\n    }\n\n    #[test]\n    fn test_split_store_quota_max_num_files_accepted() {\n        let mut split_store_quota = SplitStoreQuota::try_new(2, ByteSize(100)).unwrap();\n        split_store_quota.add_split(ByteSize(1));\n        assert!(split_store_quota.can_fit_split(ByteSize(1)));\n    }\n\n    #[test]\n    fn test_split_store_quota_exceeding_max_num_files() {\n        let mut split_store_quota = SplitStoreQuota::try_new(2, ByteSize(100)).unwrap();\n        split_store_quota.add_split(ByteSize(1));\n        split_store_quota.add_split(ByteSize(1));\n        assert!(!split_store_quota.can_fit_split(ByteSize(1)));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-indexing/src/test_utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\nuse std::str::FromStr;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicUsize, Ordering};\n\nuse bytes::Bytes;\nuse quickwit_actors::{Mailbox, Universe};\nuse quickwit_cluster::{ChannelTransport, create_cluster_for_test};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{\n    ConfigFormat, INGEST_API_SOURCE_ID, IndexConfig, IndexerConfig, IngestApiConfig,\n    MetastoreConfigs, SourceConfig, SourceInputFormat, SourceParams, VecSourceParams,\n    build_doc_mapper,\n};\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_ingest::{IngesterPool, QUEUES_DIR_NAME, init_ingest_api};\nuse quickwit_metastore::{\n    CreateIndexRequestExt, MetastoreResolver, Split, SplitMetadata, SplitState,\n};\nuse quickwit_proto::metastore::{CreateIndexRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_proto::types::{IndexUid, NodeId, PipelineUid, SourceId};\nuse quickwit_storage::{Storage, StorageResolver};\nuse serde_json::Value as JsonValue;\n\nuse crate::actors::IndexingService;\nuse crate::models::{DetachIndexingPipeline, IndexingStatistics, SpawnPipeline};\n\n/// Creates a Test environment.\n///\n/// It makes it easy to create a test index, perfect for unit testing.\n/// The test index content is entirely in RAM and 
isolated,\n/// but the construction of the index involves a temporary directory.\npub struct TestSandbox {\n    node_id: NodeId,\n    index_uid: IndexUid,\n    source_id: SourceId,\n    indexing_service: Mailbox<IndexingService>,\n    doc_mapper: Arc<DocMapper>,\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n    storage: Arc<dyn Storage>,\n    add_docs_id: AtomicUsize,\n    universe: Universe,\n    _temp_dir: tempfile::TempDir,\n}\n\nconst METASTORE_URI: &str = \"ram://quickwit-test-indexes\";\n\nfn index_uri(index_id: &str) -> Uri {\n    Uri::from_str(&format!(\"{METASTORE_URI}/{index_id}\")).unwrap()\n}\n\nimpl TestSandbox {\n    /// Creates a new test environment.\n    pub async fn create(\n        index_id: &str,\n        doc_mapping_yaml: &str,\n        indexing_settings_yaml: &str,\n        search_fields: &[&str],\n    ) -> anyhow::Result<TestSandbox> {\n        let node_id = NodeId::new(append_random_suffix(\"test-node\"));\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let index_uri = index_uri(index_id);\n        let mut index_config = IndexConfig::for_test(index_id, index_uri.as_str());\n        index_config.doc_mapping = ConfigFormat::Yaml.parse(doc_mapping_yaml.as_bytes())?;\n        index_config.indexing_settings =\n            ConfigFormat::Yaml.parse(indexing_settings_yaml.as_bytes())?;\n        index_config.search_settings.default_search_fields = search_fields\n            .iter()\n            .map(|search_field| search_field.to_string())\n            .collect();\n        let source_config = SourceConfig::ingest_api_default();\n        let storage_resolver = StorageResolver::for_test();\n        let metastore_resolver =\n            MetastoreResolver::configured(storage_resolver.clone(), &MetastoreConfigs::default());\n        let metastore = metastore_resolver\n 
           .resolve(&Uri::for_test(METASTORE_URI))\n            .await?;\n        let create_index_request = CreateIndexRequest::try_from_index_and_source_configs(\n            &index_config,\n            std::slice::from_ref(&source_config),\n        )?;\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await?\n            .index_uid()\n            .clone();\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, &index_config.search_settings)?;\n        let temp_dir = tempfile::tempdir()?;\n        let indexer_config = IndexerConfig::for_test()?;\n        let num_blocking_threads = 1;\n        let storage = storage_resolver.resolve(&index_uri).await?;\n        let universe = Universe::with_accelerated_time();\n        let merge_scheduler_mailbox = universe.get_or_spawn_one();\n        let queues_dir_path = temp_dir.path().join(QUEUES_DIR_NAME);\n        let ingest_api_service =\n            init_ingest_api(&universe, &queues_dir_path, &IngestApiConfig::default()).await?;\n        let indexing_service_actor = IndexingService::new(\n            node_id.clone(),\n            temp_dir.path().to_path_buf(),\n            indexer_config,\n            num_blocking_threads,\n            cluster,\n            metastore.clone(),\n            Some(ingest_api_service),\n            merge_scheduler_mailbox,\n            IngesterPool::default(),\n            storage_resolver.clone(),\n            EventBroker::default(),\n        )\n        .await?;\n        let (indexing_service, _indexing_service_handle) =\n            universe.spawn_builder().spawn(indexing_service_actor);\n        Ok(TestSandbox {\n            node_id,\n            index_uid,\n            source_id: INGEST_API_SOURCE_ID.to_string(),\n            indexing_service,\n            doc_mapper,\n            metastore,\n            storage_resolver,\n            storage,\n            add_docs_id: AtomicUsize::default(),\n            
universe,\n            _temp_dir: temp_dir,\n        })\n    }\n\n    /// Adds documents and waits for them to be indexed (creating a separate split).\n    ///\n    /// The documents are expected to be `JsonValue`.\n    /// They can be created using the `serde_json::json!` macro.\n    pub async fn add_documents<I>(&self, json_docs: I) -> anyhow::Result<IndexingStatistics>\n    where\n        I: IntoIterator<Item = JsonValue> + 'static,\n        I::IntoIter: Send,\n    {\n        let docs: Vec<Bytes> = json_docs\n            .into_iter()\n            .map(|json_doc| Bytes::from(json_doc.to_string()))\n            .collect();\n        let add_docs_id = self.add_docs_id.fetch_add(1, Ordering::SeqCst);\n        let source_config = SourceConfig {\n            source_id: INGEST_API_SOURCE_ID.to_string(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::Vec(VecSourceParams {\n                docs,\n                batch_num_docs: 10,\n                partition: format!(\"add-docs-{add_docs_id}\"),\n            }),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        let pipeline_id = self\n            .indexing_service\n            .ask_for_res(SpawnPipeline {\n                index_id: self.index_uid.index_id.to_string(),\n                source_config,\n                pipeline_uid: PipelineUid::for_test(0u128),\n            })\n            .await?;\n        let pipeline_handle = self\n            .indexing_service\n            .ask_for_res(DetachIndexingPipeline {\n                pipeline_id: pipeline_id.clone(),\n            })\n            .await?;\n        let (_pipeline_exit_status, pipeline_statistics) = pipeline_handle.join().await;\n        Ok(pipeline_statistics)\n    }\n\n    /// Returns the metastore of the TestSandbox.\n    ///\n    /// The metastore is a file-backed metastore.\n    /// Its data can be found via the `storage` in\n    
/// the `ram://quickwit-test-indexes` directory.\n    pub fn metastore(&self) -> MetastoreServiceClient {\n        self.metastore.clone()\n    }\n\n    /// Returns the storage of the TestSandbox.\n    pub fn storage(&self) -> Arc<dyn Storage> {\n        self.storage.clone()\n    }\n\n    /// Returns the storage resolver of the TestSandbox.\n    pub fn storage_resolver(&self) -> StorageResolver {\n        self.storage_resolver.clone()\n    }\n\n    /// Returns the doc mapper of the TestSandbox.\n    pub fn doc_mapper(&self) -> Arc<DocMapper> {\n        self.doc_mapper.clone()\n    }\n\n    /// Returns the node ID.\n    pub fn node_id(&self) -> NodeId {\n        self.node_id.clone()\n    }\n\n    /// Returns the index UID.\n    pub fn index_uid(&self) -> IndexUid {\n        self.index_uid.clone()\n    }\n\n    /// Returns the source ID.\n    pub fn source_id(&self) -> SourceId {\n        self.source_id.clone()\n    }\n\n    /// Returns the underlying universe.\n    pub fn universe(&self) -> &Universe {\n        &self.universe\n    }\n\n    /// Gracefully quits all registered actors in the underlying universe and asserts that none of\n    /// them panicked.\n    ///\n    /// This is useful for testing purposes to detect failed asserts in actors\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub async fn assert_quit(self) {\n        self.universe.assert_quit().await\n    }\n}\n\n/// Mock split builder.\npub struct MockSplitBuilder {\n    split_metadata: SplitMetadata,\n}\n\nimpl MockSplitBuilder {\n    pub fn new(split_id: &str) -> Self {\n        Self {\n            split_metadata: mock_split_meta(split_id, &IndexUid::for_test(\"test-index\", 0)),\n        }\n    }\n\n    pub fn with_index_uid(mut self, index_uid: &IndexUid) -> Self {\n        self.split_metadata.index_uid = index_uid.clone();\n        self\n    }\n\n    pub fn build(self) -> Split {\n        Split {\n            split_state: SplitState::Published,\n            split_metadata: 
self.split_metadata,\n            update_timestamp: 0,\n            publish_timestamp: None,\n        }\n    }\n}\n\n/// Mock split helper.\npub fn mock_split(split_id: &str) -> Split {\n    MockSplitBuilder::new(split_id).build()\n}\n\n/// Mock split meta helper.\npub fn mock_split_meta(split_id: &str, index_uid: &IndexUid) -> SplitMetadata {\n    SplitMetadata {\n        index_uid: index_uid.clone(),\n        split_id: split_id.to_string(),\n        partition_id: 13u64,\n        num_docs: if split_id == \"split1\" { 1_000_000 } else { 10 },\n        uncompressed_docs_size_in_bytes: 256,\n        time_range: Some(121000..=130198),\n        create_timestamp: 0,\n        footer_offsets: 700..800,\n        ..Default::default()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_metastore::{ListSplitsRequestExt, MetastoreServiceStreamSplitsExt};\n    use quickwit_proto::metastore::{ListSplitsRequest, MetastoreService};\n\n    use super::TestSandbox;\n\n    #[tokio::test]\n    async fn test_test_sandbox() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n              - name: url\n                type: text\n        \"#;\n        let test_sandbox =\n            TestSandbox::create(\"test_index\", doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n        let statistics = test_sandbox.add_documents(vec![\n            serde_json::json!({\"title\": \"Hurricane Fay\", \"body\": \"...\", \"url\": \"http://hurricane-fay\"}),\n            serde_json::json!({\"title\": \"Ganimede\", \"body\": \"...\", \"url\": \"http://ganimede\"}),\n        ]).await?;\n        assert_eq!(statistics.num_uploaded_splits, 1);\n        let metastore = test_sandbox.metastore();\n        {\n            let splits = metastore\n                .list_splits(\n                    
ListSplitsRequest::try_from_index_uid(test_sandbox.index_uid()).unwrap(),\n                )\n                .await?\n                .collect_splits()\n                .await?;\n            assert_eq!(splits.len(), 1);\n            test_sandbox.add_documents(vec![\n            serde_json::json!({\"title\": \"Byzantine-Ottoman wars\", \"body\": \"...\", \"url\": \"http://biz-ottoman\"}),\n        ]).await?;\n        }\n        {\n            let splits = metastore\n                .list_splits(\n                    ListSplitsRequest::try_from_index_uid(test_sandbox.index_uid()).unwrap(),\n                )\n                .await?\n                .collect_splits()\n                .await?;\n            assert_eq!(splits.len(), 2);\n        }\n        test_sandbox.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/Cargo.toml",
    "content": "[package]\nname = \"quickwit-ingest\"\ndescription = \"Native distributed and replicated ingestion engine\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nfail = { workspace = true, optional = true }\nfutures = { workspace = true }\nhttp = { workspace = true }\nitertools = { workspace = true }\nmockall = { workspace = true, optional = true }\nmrecordlog = { workspace = true }\nonce_cell = { workspace = true }\nprost = { workspace = true }\nrand = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nserde_json_borrow = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntonic = { workspace = true }\ntonic-prost = { workspace = true }\ntower = { workspace = true }\ntracing = { workspace = true }\nulid = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true }\nquickwit-doc-mapper = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\nmockall = { workspace = true }\nrand = { workspace = true }\nrand_distr = { workspace = true }\ntempfile = { workspace = true }\ntokio = { workspace = true, features = [\"test-util\"]}\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-cluster = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\n\n[build-dependencies]\nquickwit-codegen = { workspace = true 
}\n\n[features]\nfailpoints = [\"fail/failpoints\"]\nno-failpoints = []\ntestsuite = [\"mockall\"]\n"
  },
  {
    "path": "quickwit/quickwit-ingest/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_codegen::{Codegen, ProstConfig};\n\nfn main() {\n    // Legacy ingest codegen\n    let mut prost_config = ProstConfig::default();\n    prost_config.bytes([\"DocBatch.doc_buffer\"]);\n\n    Codegen::builder()\n        .with_protos(&[\"src/ingest_service.proto\"])\n        .with_output_dir(\"src/codegen/\")\n        .with_result_type_path(\"crate::Result\")\n        .with_error_type_path(\"crate::IngestServiceError\")\n        .with_prost_config(prost_config)\n        .generate_rpc_name_impls()\n        .run()\n        .unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/codegen/ingest_service.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct QueueExistsRequest {\n    #[prost(string, tag = \"1\")]\n    pub queue_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CreateQueueRequest {\n    #[prost(string, tag = \"1\")]\n    pub queue_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CreateQueueIfNotExistsRequest {\n    #[prost(string, tag = \"1\")]\n    pub queue_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CreateQueueIfNotExistsResponse {\n    #[prost(string, tag = \"1\")]\n    pub queue_id: ::prost::alloc::string::String,\n    #[prost(bool, tag = \"2\")]\n    pub created: bool,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DropQueueRequest {\n    #[prost(string, tag = \"1\")]\n    pub queue_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IngestRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub doc_batches: ::prost::alloc::vec::Vec<DocBatch>,\n    #[prost(enumeration = \"CommitType\", tag = \"2\")]\n    pub commit: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IngestResponse {\n    #[prost(uint64, tag = \"1\")]\n    pub num_docs_for_processing: u64,\n}\n/// Fetch messages with position strictly after 
`start_after`.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FetchRequest {\n    #[prost(string, tag = \"1\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(uint64, optional, tag = \"2\")]\n    pub start_after: ::core::option::Option<u64>,\n    #[prost(uint64, optional, tag = \"3\")]\n    pub num_bytes_limit: ::core::option::Option<u64>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FetchResponse {\n    #[prost(uint64, optional, tag = \"1\")]\n    pub first_position: ::core::option::Option<u64>,\n    #[prost(message, optional, tag = \"2\")]\n    pub doc_batch: ::core::option::Option<DocBatch>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DocBatch {\n    #[prost(string, tag = \"1\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(bytes = \"bytes\", tag = \"2\")]\n    #[schema(value_type = String, format = Binary)]\n    pub doc_buffer: ::prost::bytes::Bytes,\n    #[prost(uint32, repeated, tag = \"3\")]\n    pub doc_lengths: ::prost::alloc::vec::Vec<u32>,\n}\n/// Suggest to truncate the queue.\n///\n/// This function allows the queue to remove all records up to and\n/// including `up_to_offset_included`.\n///\n/// The role of this truncation is to release memory and disk space.\n///\n/// There are no guarantees that the record will effectively be removed.\n/// Nothing might happen, or the truncation might be partial.\n///\n/// In other words, truncating from a position, and fetching records starting\n/// earlier than this position can yield undefined result:\n/// the truncated records may or may not be returned.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct 
SuggestTruncateRequest {\n    #[prost(string, tag = \"1\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(uint64, tag = \"2\")]\n    pub up_to_position_included: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct TailRequest {\n    #[prost(string, tag = \"1\")]\n    pub index_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListQueuesRequest {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListQueuesResponse {\n    #[prost(string, repeated, tag = \"1\")]\n    pub queues: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n/// Specifies if the ingest request should block waiting for the records to be committed.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum CommitType {\n    /// The request doesn't wait for commit\n    Auto = 0,\n    /// The request waits for the next scheduled commit to finish.\n    WaitFor = 1,\n    /// The request forces an immediate commit after the last document in the batch and waits for\n    /// it to finish.\n    Force = 2,\n}\nimpl CommitType {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Auto => \"Auto\",\n            Self::WaitFor => \"WaitFor\",\n            Self::Force => \"Force\",\n        }\n    }\n    /// Creates an enum from field 
names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"Auto\" => Some(Self::Auto),\n            \"WaitFor\" => Some(Self::WaitFor),\n            \"Force\" => Some(Self::Force),\n            _ => None,\n        }\n    }\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for IngestRequest {\n    fn rpc_name() -> &'static str {\n        \"ingest\"\n    }\n}\nimpl RpcName for FetchRequest {\n    fn rpc_name() -> &'static str {\n        \"fetch\"\n    }\n}\nimpl RpcName for TailRequest {\n    fn rpc_name() -> &'static str {\n        \"tail\"\n    }\n}\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait IngestService: std::fmt::Debug + Send + Sync + 'static {\n    ///Ingests document in a given queue.\n    ///\n    ///Upon any kind of error, the client should\n    ///\n    ///* retry to get at least once delivery.\n    ///* not retry to get at most once delivery.\n    ///\n    ///Exactly once delivery is not supported yet.\n    async fn ingest(&self, request: IngestRequest) -> crate::Result<IngestResponse>;\n    ///Fetches record from a given queue.\n    ///\n    ///Records are returned in order.\n    ///\n    ///The returned `FetchResponse` object is meant to be read with the\n    ///`crate::iter_records` function.\n    ///\n    ///Fetching does not necessarily return all of the available records.\n    ///If returning all records would exceed `FETCH_PAYLOAD_LIMIT` (2MB),\n    ///the response will be partial.\n    async fn fetch(&self, request: FetchRequest) -> crate::Result<FetchResponse>;\n    ///Returns a batch containing the last records.\n    ///\n    ///It returns the last documents, from the newest\n    ///to the oldest, and stops as soon as `FETCH_PAYLOAD_LIMIT` (2MB)\n    ///is exceeded.\n    
async fn tail(&self, request: TailRequest) -> crate::Result<FetchResponse>;\n}\n#[derive(Debug, Clone)]\npub struct IngestServiceClient {\n    inner: InnerIngestServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerIngestServiceClient(std::sync::Arc<dyn IngestService>);\nimpl IngestServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IngestService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: < MockIngestService\n            > (),\n            \"`MockIngestService` must be wrapped in a `MockIngestServiceWrapper`: use `IngestServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerIngestServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> ingest_service_grpc_server::IngestServiceGrpcServer<\n        IngestServiceGrpcServerAdapter,\n    > {\n        let adapter = IngestServiceGrpcServerAdapter::new(self.clone());\n        ingest_service_grpc_server::IngestServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            
std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = ingest_service_grpc_client::IngestServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IngestServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngestServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = ingest_service_grpc_client::IngestServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IngestServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IngestServiceMailbox<A>: IngestService,\n    {\n        
IngestServiceClient::new(IngestServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> IngestServiceTowerLayerStack {\n        IngestServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockIngestService) -> Self {\n        let mock_wrapper = mock_ingest_service::MockIngestServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockIngestService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl IngestService for IngestServiceClient {\n    async fn ingest(&self, request: IngestRequest) -> crate::Result<IngestResponse> {\n        self.inner.0.ingest(request).await\n    }\n    async fn fetch(&self, request: FetchRequest) -> crate::Result<FetchResponse> {\n        self.inner.0.fetch(request).await\n    }\n    async fn tail(&self, request: TailRequest) -> crate::Result<FetchResponse> {\n        self.inner.0.tail(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_ingest_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockIngestServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockIngestService>,\n    }\n    #[async_trait::async_trait]\n    impl IngestService for MockIngestServiceWrapper {\n        async fn ingest(\n            &self,\n            request: super::IngestRequest,\n        ) -> crate::Result<super::IngestResponse> {\n            self.inner.lock().await.ingest(request).await\n        }\n        async fn fetch(\n            &self,\n            request: super::FetchRequest,\n        ) -> crate::Result<super::FetchResponse> {\n            self.inner.lock().await.fetch(request).await\n        }\n        async fn tail(\n            &self,\n            request: super::TailRequest,\n        ) -> crate::Result<super::FetchResponse> {\n            
self.inner.lock().await.tail(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<IngestRequest> for InnerIngestServiceClient {\n    type Response = IngestResponse;\n    type Error = crate::IngestServiceError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: IngestRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.ingest(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<FetchRequest> for InnerIngestServiceClient {\n    type Response = FetchResponse;\n    type Error = crate::IngestServiceError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: FetchRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.fetch(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<TailRequest> for InnerIngestServiceClient {\n    type Response = FetchResponse;\n    type Error = crate::IngestServiceError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: TailRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.tail(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower 
services.\n#[derive(Debug)]\nstruct IngestServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerIngestServiceClient,\n    ingest_svc: quickwit_common::tower::BoxService<\n        IngestRequest,\n        IngestResponse,\n        crate::IngestServiceError,\n    >,\n    fetch_svc: quickwit_common::tower::BoxService<\n        FetchRequest,\n        FetchResponse,\n        crate::IngestServiceError,\n    >,\n    tail_svc: quickwit_common::tower::BoxService<\n        TailRequest,\n        FetchResponse,\n        crate::IngestServiceError,\n    >,\n}\n#[async_trait::async_trait]\nimpl IngestService for IngestServiceTowerServiceStack {\n    async fn ingest(&self, request: IngestRequest) -> crate::Result<IngestResponse> {\n        self.ingest_svc.clone().ready().await?.call(request).await\n    }\n    async fn fetch(&self, request: FetchRequest) -> crate::Result<FetchResponse> {\n        self.fetch_svc.clone().ready().await?.call(request).await\n    }\n    async fn tail(&self, request: TailRequest) -> crate::Result<FetchResponse> {\n        self.tail_svc.clone().ready().await?.call(request).await\n    }\n}\ntype IngestLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        IngestRequest,\n        IngestResponse,\n        crate::IngestServiceError,\n    >,\n    IngestRequest,\n    IngestResponse,\n    crate::IngestServiceError,\n>;\ntype FetchLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        FetchRequest,\n        FetchResponse,\n        crate::IngestServiceError,\n    >,\n    FetchRequest,\n    FetchResponse,\n    crate::IngestServiceError,\n>;\ntype TailLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        TailRequest,\n        FetchResponse,\n        crate::IngestServiceError,\n    >,\n    TailRequest,\n    FetchResponse,\n    crate::IngestServiceError,\n>;\n#[derive(Debug, Default)]\npub struct IngestServiceTowerLayerStack {\n    
ingest_layers: Vec<IngestLayer>,\n    fetch_layers: Vec<FetchLayer>,\n    tail_layers: Vec<TailLayer>,\n}\nimpl IngestServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IngestRequest,\n                    IngestResponse,\n                    crate::IngestServiceError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IngestRequest,\n                IngestResponse,\n                crate::IngestServiceError,\n            >,\n        >>::Service: tower::Service<\n                IngestRequest,\n                Response = IngestResponse,\n                Error = crate::IngestServiceError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IngestRequest,\n                IngestResponse,\n                crate::IngestServiceError,\n            >,\n        >>::Service as tower::Service<IngestRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    FetchRequest,\n                    FetchResponse,\n                    crate::IngestServiceError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                FetchRequest,\n                FetchResponse,\n                crate::IngestServiceError,\n            >,\n        >>::Service: tower::Service<\n                FetchRequest,\n                Response = FetchResponse,\n                Error = crate::IngestServiceError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                FetchRequest,\n                
FetchResponse,\n                crate::IngestServiceError,\n            >,\n        >>::Service as tower::Service<FetchRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    TailRequest,\n                    FetchResponse,\n                    crate::IngestServiceError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                TailRequest,\n                FetchResponse,\n                crate::IngestServiceError,\n            >,\n        >>::Service: tower::Service<\n                TailRequest,\n                Response = FetchResponse,\n                Error = crate::IngestServiceError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                TailRequest,\n                FetchResponse,\n                crate::IngestServiceError,\n            >,\n        >>::Service as tower::Service<TailRequest>>::Future: Send + 'static,\n    {\n        self.ingest_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.fetch_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.tail_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_ingest_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IngestRequest,\n                    IngestResponse,\n                    crate::IngestServiceError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                IngestRequest,\n                Response = IngestResponse,\n                Error = crate::IngestServiceError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as 
tower::Service<IngestRequest>>::Future: Send + 'static,\n    {\n        self.ingest_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_fetch_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    FetchRequest,\n                    FetchResponse,\n                    crate::IngestServiceError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                FetchRequest,\n                Response = FetchResponse,\n                Error = crate::IngestServiceError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<FetchRequest>>::Future: Send + 'static,\n    {\n        self.fetch_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_tail_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    TailRequest,\n                    FetchResponse,\n                    crate::IngestServiceError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                TailRequest,\n                Response = FetchResponse,\n                Error = crate::IngestServiceError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<TailRequest>>::Future: Send + 'static,\n    {\n        self.tail_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> IngestServiceClient\n    where\n        T: IngestService,\n    {\n        let inner_client = InnerIngestServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n     
   max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngestServiceClient {\n        let client = IngestServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngestServiceClient {\n        let client = IngestServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> IngestServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IngestServiceMailbox<A>: IngestService,\n    {\n        let inner_client = InnerIngestServiceClient(\n            std::sync::Arc::new(IngestServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockIngestService) -> IngestServiceClient {\n        let client = IngestServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerIngestServiceClient,\n    ) -> IngestServiceClient {\n        let ingest_svc = self\n            .ingest_layers\n            .into_iter()\n            .rev()\n  
          .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let fetch_svc = self\n            .fetch_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tail_svc = self\n            .tail_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = IngestServiceTowerServiceStack {\n            inner: inner_client,\n            ingest_svc,\n            fetch_svc,\n            tail_svc,\n        };\n        IngestServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct IngestServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::IngestServiceError>,\n}\nimpl<A: quickwit_actors::Actor> IngestServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for IngestServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { 
inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for IngestServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::IngestServiceError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::IngestServiceError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> IngestService for IngestServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    IngestServiceMailbox<\n        A,\n    >: tower::Service<\n            IngestRequest,\n            Response = IngestResponse,\n            Error = crate::IngestServiceError,\n            Future = BoxFuture<IngestResponse, crate::IngestServiceError>,\n        >\n        + tower::Service<\n            FetchRequest,\n            Response = FetchResponse,\n            Error = crate::IngestServiceError,\n            Future = BoxFuture<FetchResponse, crate::IngestServiceError>,\n        >\n        + tower::Service<\n            TailRequest,\n            Response = FetchResponse,\n            Error = crate::IngestServiceError,\n            
Future = BoxFuture<FetchResponse, crate::IngestServiceError>,\n        >,\n{\n    async fn ingest(&self, request: IngestRequest) -> crate::Result<IngestResponse> {\n        self.clone().call(request).await\n    }\n    async fn fetch(&self, request: FetchRequest) -> crate::Result<FetchResponse> {\n        self.clone().call(request).await\n    }\n    async fn tail(&self, request: TailRequest) -> crate::Result<FetchResponse> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct IngestServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> IngestServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> IngestService\nfor IngestServiceGrpcClientAdapter<\n    ingest_service_grpc_client::IngestServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn ingest(&self, request: IngestRequest) -> crate::Result<IngestResponse> {\n        self.inner\n            .clone()\n            .ingest(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                IngestRequest::rpc_name(),\n            ))\n    }\n    async fn fetch(&self, request: FetchRequest) -> 
crate::Result<FetchResponse> {\n        self.inner\n            .clone()\n            .fetch(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                FetchRequest::rpc_name(),\n            ))\n    }\n    async fn tail(&self, request: TailRequest) -> crate::Result<FetchResponse> {\n        self.inner\n            .clone()\n            .tail(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                TailRequest::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct IngestServiceGrpcServerAdapter {\n    inner: InnerIngestServiceClient,\n}\nimpl IngestServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IngestService,\n    {\n        Self {\n            inner: InnerIngestServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl ingest_service_grpc_server::IngestServiceGrpc for IngestServiceGrpcServerAdapter {\n    async fn ingest(\n        &self,\n        request: tonic::Request<IngestRequest>,\n    ) -> Result<tonic::Response<IngestResponse>, tonic::Status> {\n        self.inner\n            .0\n            .ingest(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn fetch(\n        &self,\n        request: tonic::Request<FetchRequest>,\n    ) -> Result<tonic::Response<FetchResponse>, tonic::Status> {\n        self.inner\n            .0\n            .fetch(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn tail(\n        &self,\n        request: tonic::Request<TailRequest>,\n  
  ) -> Result<tonic::Response<FetchResponse>, tonic::Status> {\n        self.inner\n            .0\n            .tail(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod ingest_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct IngestServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl IngestServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> IngestServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> IngestServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: 
tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            IngestServiceGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Ingests document in a given queue.\n        ///\n        /// Upon any kind of error, the client should\n        ///\n        /// * retry to get at least 
once delivery.\n        /// * not retry to get at most once delivery.\n        ///\n        /// Exactly once delivery is not supported yet.\n        pub async fn ingest(\n            &mut self,\n            request: impl tonic::IntoRequest<super::IngestRequest>,\n        ) -> std::result::Result<tonic::Response<super::IngestResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/ingest_service.IngestService/Ingest\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"ingest_service.IngestService\", \"Ingest\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Fetches record from a given queue.\n        ///\n        /// Records are returned in order.\n        ///\n        /// The returned `FetchResponse` object is meant to be read with the\n        /// `crate::iter_records` function.\n        ///\n        /// Fetching does not necessarily return all of the available records.\n        /// If returning all records would exceed `FETCH_PAYLOAD_LIMIT` (2MB),\n        /// the response will be partial.\n        pub async fn fetch(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FetchRequest>,\n        ) -> std::result::Result<tonic::Response<super::FetchResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                
})?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/ingest_service.IngestService/Fetch\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"ingest_service.IngestService\", \"Fetch\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Returns a batch containing the last records.\n        ///\n        /// It returns the last documents, from the newest\n        /// to the oldest, and stops as soon as `FETCH_PAYLOAD_LIMIT` (2MB)\n        /// is exceeded.\n        pub async fn tail(\n            &mut self,\n            request: impl tonic::IntoRequest<super::TailRequest>,\n        ) -> std::result::Result<tonic::Response<super::FetchResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/ingest_service.IngestService/Tail\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"ingest_service.IngestService\", \"Tail\"));\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod ingest_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with IngestServiceGrpcServer.\n    #[async_trait]\n  
  pub trait IngestServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Ingests document in a given queue.\n        ///\n        /// Upon any kind of error, the client should\n        ///\n        /// * retry to get at least once delivery.\n        /// * not retry to get at most once delivery.\n        ///\n        /// Exactly once delivery is not supported yet.\n        async fn ingest(\n            &self,\n            request: tonic::Request<super::IngestRequest>,\n        ) -> std::result::Result<tonic::Response<super::IngestResponse>, tonic::Status>;\n        /// Fetches record from a given queue.\n        ///\n        /// Records are returned in order.\n        ///\n        /// The returned `FetchResponse` object is meant to be read with the\n        /// `crate::iter_records` function.\n        ///\n        /// Fetching does not necessarily return all of the available records.\n        /// If returning all records would exceed `FETCH_PAYLOAD_LIMIT` (2MB),\n        /// the response will be partial.\n        async fn fetch(\n            &self,\n            request: tonic::Request<super::FetchRequest>,\n        ) -> std::result::Result<tonic::Response<super::FetchResponse>, tonic::Status>;\n        /// Returns a batch containing the last records.\n        ///\n        /// It returns the last documents, from the newest\n        /// to the oldest, and stops as soon as `FETCH_PAYLOAD_LIMIT` (2MB)\n        /// is exceeded.\n        async fn tail(\n            &self,\n            request: tonic::Request<super::TailRequest>,\n        ) -> std::result::Result<tonic::Response<super::FetchResponse>, tonic::Status>;\n    }\n    #[derive(Debug)]\n    pub struct IngestServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    
}\n    impl<T> IngestServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for IngestServiceGrpcServer<T>\n    
where\n        T: IngestServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/ingest_service.IngestService/Ingest\" => {\n                    #[allow(non_camel_case_types)]\n                    struct IngestSvc<T: IngestServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngestServiceGrpc,\n                    > tonic::server::UnaryService<super::IngestRequest>\n                    for IngestSvc<T> {\n                        type Response = super::IngestResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::IngestRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngestServiceGrpc>::ingest(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = 
self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = IngestSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/ingest_service.IngestService/Fetch\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FetchSvc<T: IngestServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngestServiceGrpc,\n                    > tonic::server::UnaryService<super::FetchRequest> for FetchSvc<T> {\n                        type Response = super::FetchResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FetchRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngestServiceGrpc>::fetch(&inner, 
request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FetchSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/ingest_service.IngestService/Tail\" => {\n                    #[allow(non_camel_case_types)]\n                    struct TailSvc<T: IngestServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngestServiceGrpc,\n                    > tonic::server::UnaryService<super::TailRequest> for TailSvc<T> {\n                        type Response = super::FetchResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                    
    fn call(\n                            &mut self,\n                            request: tonic::Request<super::TailRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngestServiceGrpc>::tail(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = TailSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers 
= response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for IngestServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"ingest_service.IngestService\";\n    impl<T> tonic::server::NamedService for IngestServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/doc_batch.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytes::buf::Writer;\nuse bytes::{Buf, BufMut, Bytes, BytesMut};\nuse quickwit_proto::types::IndexId;\nuse serde::Serialize;\n\nuse crate::DocBatch;\n\n#[derive(Debug)]\n/// Represents a command that can be stored in a [`DocBatch`].\npub enum DocCommand<T>\nwhere T: Buf\n{\n    Ingest { payload: T },\n    Commit,\n    // ... more to come?\n}\n\n/// We can use this byte to track both commands and their version changes\n/// If serialization protocol changes, we can just use the next number\n#[derive(Debug)]\n#[repr(u8)]\npub enum DocCommandCode {\n    IngestV1 = 0,\n    CommitV1 = 1,\n}\n\nimpl From<u8> for DocCommandCode {\n    fn from(value: u8) -> Self {\n        match value {\n            0 => DocCommandCode::IngestV1,\n            1 => DocCommandCode::CommitV1,\n            other => panic!(\"Encountered unknown command: code {other}\"),\n        }\n    }\n}\n\nimpl<T> DocCommand<T>\nwhere T: Buf + Default\n{\n    /// Returns the binary serialization code for the current version of this command.\n    pub fn code(&self) -> DocCommandCode {\n        match self {\n            DocCommand::Ingest { payload: _ } => DocCommandCode::IngestV1,\n            DocCommand::Commit => DocCommandCode::CommitV1,\n        }\n    }\n\n    /// Builds a command for bytes::Buf\n    pub fn read(mut buf: T) -> Self {\n        match buf.get_u8().into() {\n          
  DocCommandCode::IngestV1 => DocCommand::Ingest { payload: buf },\n            DocCommandCode::CommitV1 => DocCommand::Commit,\n        }\n    }\n\n    /// Copies the command to the end of bytes::BufMut while returning the number of bytes copied\n    pub fn write(self, mut buf: impl BufMut) -> usize {\n        let self_buf = self.into_buf();\n        let len = self_buf.remaining();\n        buf.put(self_buf);\n        len\n    }\n\n    pub fn into_buf(self) -> impl Buf {\n        self.code_chunk().chain(match self {\n            DocCommand::Ingest { payload } => payload,\n            DocCommand::Commit => T::default(),\n        })\n    }\n\n    fn code_chunk(&self) -> &'static [u8; 1] {\n        match self {\n            DocCommand::Ingest { payload: _ } => &[DocCommandCode::IngestV1 as u8],\n            DocCommand::Commit => &[DocCommandCode::CommitV1 as u8],\n        }\n    }\n}\n\n/// Builds DocBatch from individual commands\npub struct DocBatchBuilder {\n    index_id: IndexId,\n    doc_buffer: BytesMut,\n    doc_lengths: Vec<u32>,\n}\n\nimpl DocBatchBuilder {\n    /// Creates a new batch builder for the given index name.\n    pub fn new(index_id: IndexId) -> Self {\n        Self {\n            index_id,\n            doc_buffer: BytesMut::new(),\n            doc_lengths: Vec::new(),\n        }\n    }\n\n    /// Creates a new batch builder for the given index name with some pre-allocated capacity for\n    /// the internal doc buffer.\n    pub fn with_capacity(index_id: IndexId, capacity: usize) -> Self {\n        Self {\n            index_id,\n            doc_buffer: BytesMut::with_capacity(capacity),\n            doc_lengths: Vec::new(),\n        }\n    }\n\n    /// Adds an ingest command to the batch\n    pub fn ingest_doc(&mut self, payload: impl Buf + Default) -> usize {\n        let command = DocCommand::Ingest { payload };\n        self.command(command)\n    }\n\n    /// Adds a commit command to the batch\n    pub fn commit(&mut self) -> usize {\n        
let command: DocCommand<Bytes> = DocCommand::Commit;\n        self.command(command)\n    }\n\n    /// Adds a parsed command to the batch\n    pub fn command<T>(&mut self, command: DocCommand<T>) -> usize\n    where T: Buf + Default {\n        let len = command.write(&mut self.doc_buffer);\n        self.doc_lengths.push(len as u32);\n        len\n    }\n\n    /// Adds a list of bytes representing a command to the batch\n    pub fn command_from_buf(&mut self, raw: impl Buf) -> usize {\n        let len = raw.remaining();\n        self.doc_buffer.put(raw);\n        self.doc_lengths.push(len as u32);\n        len\n    }\n\n    /// Creates another batch builder capable of processing `Serialize` structs instead of commands\n    pub fn json_writer(self) -> JsonDocBatchBuilder {\n        JsonDocBatchBuilder {\n            index_id: self.index_id,\n            doc_buffer: self.doc_buffer.writer(),\n            doc_lengths: self.doc_lengths,\n        }\n    }\n\n    /// Builds the batch\n    pub fn build(self) -> DocBatch {\n        DocBatch {\n            index_id: self.index_id,\n            doc_buffer: self.doc_buffer.freeze(),\n            doc_lengths: self.doc_lengths,\n        }\n    }\n}\n\n/// A wrapper around the batch builder that can add `Serialize` structs\npub struct JsonDocBatchBuilder {\n    index_id: IndexId,\n    doc_buffer: Writer<BytesMut>,\n    doc_lengths: Vec<u32>,\n}\n\nimpl JsonDocBatchBuilder {\n    /// Adds an ingest command to the batch for a Serialize struct\n    pub fn ingest_doc(&mut self, payload: impl Serialize) -> serde_json::Result<usize> {\n        let old_len = self.doc_buffer.get_ref().len();\n        self.doc_buffer\n            .get_mut()\n            .put_u8(DocCommandCode::IngestV1 as u8);\n        let res = serde_json::to_writer(&mut self.doc_buffer, &payload);\n        let new_len = self.doc_buffer.get_ref().len();\n        if let Err(err) = res {\n            Err(err)\n        } else {\n            let len = new_len - old_len;\n         
   self.doc_lengths.push(len as u32);\n            Ok(len)\n        }\n    }\n\n    /// Returns the underlying batch builder\n    pub fn into_inner(self) -> DocBatchBuilder {\n        DocBatchBuilder {\n            index_id: self.index_id,\n            doc_buffer: self.doc_buffer.into_inner(),\n            doc_lengths: self.doc_lengths,\n        }\n    }\n\n    /// Builds the batch\n    pub fn build(self) -> DocBatch {\n        self.into_inner().build()\n    }\n}\n\nimpl DocBatch {\n    /// Returns an iterator over the parsed commands within a doc_batch.\n    #[allow(clippy::should_implement_trait)]\n    pub fn into_iter(self) -> impl Iterator<Item = DocCommand<Bytes>> {\n        self.into_iter_raw().map(DocCommand::read)\n    }\n\n    /// Returns an iterator over the raw serialized commands within a doc_batch, each starting\n    /// with its command code byte.\n    pub fn into_iter_raw(self) -> impl Iterator<Item = Bytes> {\n        let DocBatch {\n            doc_buffer,\n            doc_lengths,\n            ..\n        } = self;\n        doc_lengths\n            .into_iter()\n            .scan(0, move |current_offset, doc_num_bytes| {\n                let start = *current_offset;\n                let end = start + doc_num_bytes as usize;\n                *current_offset = end;\n                Some(doc_buffer.slice(start..end))\n            })\n    }\n\n    /// Returns true if the batch is empty.\n    pub fn is_empty(&self) -> bool {\n        self.doc_lengths.is_empty()\n    }\n\n    /// Returns the total number of bytes in the batch.\n    pub fn num_bytes(&self) -> usize {\n        self.doc_buffer.len()\n    }\n\n    /// Returns the number of documents in the batch.\n    pub fn num_docs(&self) -> usize {\n        self.doc_lengths.len()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use serde_json::json;\n\n    use super::*;\n\n    fn commands_eq<L, R>(l: DocCommand<L>, r: DocCommand<R>) -> bool\n    where\n        L: Buf,\n        R: Buf,\n    {\n        match (l, r) {\n            (\n                
DocCommand::Ingest {\n                    payload: mut l_payload,\n                },\n                DocCommand::Ingest {\n                    payload: mut r_payload,\n                },\n            ) => {\n                l_payload.copy_to_bytes(l_payload.remaining())\n                    == r_payload.copy_to_bytes(r_payload.remaining())\n            }\n            (DocCommand::Commit, DocCommand::Commit) => true,\n            _ => false,\n        }\n    }\n\n    macro_rules! test_command_roundtrip {\n        ($command:expr) => {\n            let original = $command;\n            let expected = $command;\n            let mut buf = BytesMut::new();\n            let size = original.write(&mut buf);\n            assert!(size > 0);\n            let copy = DocCommand::read(buf);\n            assert!(commands_eq(expected, copy));\n        };\n    }\n\n    #[test]\n    fn test_commands_eq() {\n        assert!(commands_eq(\n            DocCommand::Ingest {\n                payload: &b\"hello\"[..]\n            },\n            DocCommand::Ingest {\n                payload: Bytes::from(\"hello\")\n            }\n        ));\n        assert!(commands_eq(\n            DocCommand::Commit::<Bytes>,\n            DocCommand::Commit::<&[u8]>\n        ));\n        assert!(!commands_eq(\n            DocCommand::Ingest {\n                payload: Bytes::from(\"hello\")\n            },\n            DocCommand::Ingest {\n                payload: Bytes::from(\"world\")\n            }\n        ));\n        assert!(!commands_eq(\n            DocCommand::Ingest {\n                payload: Bytes::from(\"hello\")\n            },\n            DocCommand::Commit::<Bytes>\n        ));\n    }\n\n    #[test]\n    fn test_commands_roundtrip() {\n        test_command_roundtrip!(DocCommand::Ingest {\n            payload: &b\"hello\"[..]\n        });\n        test_command_roundtrip!(DocCommand::Ingest {\n            payload: Bytes::from(\"hello\")\n        });\n        
test_command_roundtrip!(DocCommand::Commit::<Bytes>);\n        test_command_roundtrip!(DocCommand::Commit::<&[u8]>);\n    }\n\n    #[test]\n    fn test_batch_builder() {\n        let mut batch = DocBatchBuilder::new(\"test\".to_string());\n        batch.ingest_doc(&b\"hello\"[..]);\n        batch.ingest_doc(&b\" \"[..]);\n        batch.command(DocCommand::Ingest {\n            payload: Bytes::from(\"world\"),\n        });\n        batch.commit();\n\n        let batch = batch.build();\n        assert_eq!(batch.index_id, \"test\");\n        assert_eq!(batch.num_docs(), 4);\n        assert_eq!(batch.num_bytes(), 5 + 1 + 5 + 4);\n\n        let mut iter = batch.clone().into_iter();\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Ingest {\n                payload: Bytes::from(\"hello\")\n            }\n        ));\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Ingest {\n                payload: Bytes::from(\" \")\n            }\n        ));\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Ingest {\n                payload: Bytes::from(\"world\")\n            }\n        ));\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Commit::<Bytes>\n        ));\n        assert!(iter.next().is_none());\n\n        let mut copied_batch = DocBatchBuilder::new(\"test\".to_string());\n        for raw_buf in batch.clone().into_iter_raw() {\n            copied_batch.command_from_buf(raw_buf);\n        }\n        let copied_batch = copied_batch.build();\n\n        assert_eq!(batch, copied_batch);\n    }\n\n    #[test]\n    fn test_json_batch_builder() {\n        let mut batch = DocBatchBuilder::new(\"test\".to_string()).json_writer();\n        batch.ingest_doc(json!({\"test\":\"a\"})).unwrap();\n        batch.ingest_doc(json!({\"test\":\"b\"})).unwrap();\n\n        let mut batch = batch.into_inner();\n        
batch.commit();\n\n        let batch = batch.build();\n        assert_eq!(batch.index_id, \"test\");\n        assert_eq!(batch.num_docs(), 3);\n        assert_eq!(batch.num_bytes(), 12 + 12 + 3);\n\n        let mut iter = batch.into_iter();\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Ingest {\n                payload: Bytes::from(json!({\"test\": \"a\"}).to_string())\n            }\n        ));\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Ingest {\n                payload: Bytes::from(json!({\"test\": \"b\"}).to_string())\n            }\n        ));\n        assert!(commands_eq(\n            iter.next().unwrap(),\n            DocCommand::Commit::<Bytes>\n        ));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\n\nuse mrecordlog::error::*;\nuse quickwit_actors::AskError;\nuse quickwit_common::rate_limited_error;\npub(crate) use quickwit_proto::error::{grpc_error_to_grpc_status, grpc_status_to_service_error};\nuse quickwit_proto::ingest::router::{IngestFailure, IngestFailureReason};\nuse quickwit_proto::ingest::{IngestV2Error, RateLimitingCause};\nuse quickwit_proto::types::IndexId;\nuse quickwit_proto::{GrpcServiceError, ServiceError, ServiceErrorCode, tonic};\nuse serde::{Deserialize, Serialize};\n\n#[derive(Debug, Clone, thiserror::Error, Serialize, Deserialize)]\npub enum IngestServiceError {\n    #[error(\"data corruption: {0}\")]\n    Corruption(String),\n    #[error(\"index `{index_id}` already exists\")]\n    IndexAlreadyExists { index_id: IndexId },\n    #[error(\"index `{index_id}` not found\")]\n    IndexNotFound { index_id: IndexId },\n    #[error(\"an internal error occurred: {0}\")]\n    Internal(String),\n    #[error(\"invalid position: {0}\")]\n    InvalidPosition(String),\n    #[error(\"io error {0}\")]\n    IoError(String),\n    #[error(\"rate limited {0}\")]\n    RateLimited(RateLimitingCause),\n    #[error(\"ingest service is unavailable ({0})\")]\n    Unavailable(String),\n    #[error(\"bad request ({0})\")]\n    BadRequest(String),\n}\n\nimpl From<AskError<IngestServiceError>> for IngestServiceError {\n    fn 
from(error: AskError<IngestServiceError>) -> Self {\n        match error {\n            AskError::ErrorReply(error) => error,\n            AskError::MessageNotDelivered => {\n                IngestServiceError::Unavailable(\"actor not running\".to_string())\n            }\n            AskError::ProcessMessageError => IngestServiceError::Internal(error.to_string()),\n        }\n    }\n}\n\nimpl From<quickwit_common::tower::BufferError> for IngestServiceError {\n    fn from(error: quickwit_common::tower::BufferError) -> Self {\n        use quickwit_common::tower::BufferError;\n        match error {\n            BufferError::Closed => IngestServiceError::Unavailable(error.to_string()),\n            BufferError::Unknown => IngestServiceError::Internal(error.to_string()),\n        }\n    }\n}\n\nimpl From<io::Error> for IngestServiceError {\n    fn from(io_error: io::Error) -> Self {\n        IngestServiceError::IoError(io_error.to_string())\n    }\n}\n\nimpl From<IngestV2Error> for IngestServiceError {\n    fn from(error: IngestV2Error) -> Self {\n        match error {\n            IngestV2Error::Timeout(error_msg) => {\n                IngestServiceError::Unavailable(format!(\"timeout {error_msg}\"))\n            }\n            IngestV2Error::Unavailable(error_msg) => {\n                IngestServiceError::Unavailable(format!(\"unavailable: {error_msg}\"))\n            }\n            IngestV2Error::Internal(message) => IngestServiceError::Internal(message),\n            IngestV2Error::ShardNotFound { .. 
} => {\n                IngestServiceError::Internal(\"shard not found\".to_string())\n            }\n            IngestV2Error::TooManyRequests(rate_limiting_cause) => {\n                IngestServiceError::RateLimited(rate_limiting_cause)\n            }\n        }\n    }\n}\n\nimpl From<IngestFailure> for IngestServiceError {\n    fn from(ingest_failure: IngestFailure) -> Self {\n        match ingest_failure.reason() {\n            IngestFailureReason::Unspecified => {\n                IngestServiceError::Internal(\"unknown error\".to_string())\n            }\n            IngestFailureReason::IndexNotFound => IngestServiceError::IndexNotFound {\n                index_id: ingest_failure.index_id,\n            },\n            IngestFailureReason::SourceNotFound => IngestServiceError::Internal(format!(\n                \"Ingest v2 source not found for index {}\",\n                ingest_failure.index_id\n            )),\n            IngestFailureReason::Internal => {\n                IngestServiceError::Internal(\"internal error\".to_string())\n            }\n            IngestFailureReason::NoShardsAvailable => {\n                IngestServiceError::Unavailable(\"no shards available\".to_string())\n            }\n            IngestFailureReason::ShardRateLimited => {\n                IngestServiceError::RateLimited(RateLimitingCause::ShardRateLimiting)\n            }\n            IngestFailureReason::WalFull => {\n                IngestServiceError::RateLimited(RateLimitingCause::WalFull)\n            }\n            IngestFailureReason::Timeout => {\n                IngestServiceError::Internal(\"request timed out\".to_string())\n            }\n            IngestFailureReason::RouterLoadShedding => {\n                IngestServiceError::RateLimited(RateLimitingCause::RouterLoadShedding)\n            }\n            IngestFailureReason::LoadShedding => {\n                IngestServiceError::RateLimited(RateLimitingCause::LoadShedding)\n            }\n            
IngestFailureReason::CircuitBreaker => {\n                IngestServiceError::RateLimited(RateLimitingCause::CircuitBreaker)\n            }\n        }\n    }\n}\n\nimpl ServiceError for IngestServiceError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Corruption(err_msg) => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"ingest/corruption internal error: {err_msg}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::IndexAlreadyExists { .. } => ServiceErrorCode::AlreadyExists,\n            Self::IndexNotFound { .. } => ServiceErrorCode::NotFound,\n            Self::Internal(err_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"ingest internal error: {err_msg}\");\n                ServiceErrorCode::Internal\n            }\n            Self::InvalidPosition(_) => ServiceErrorCode::BadRequest,\n            Self::IoError(io_err) => {\n                rate_limited_error!(limit_per_min = 6, \"ingest/io internal error: {io_err}\");\n                ServiceErrorCode::Internal\n            }\n            Self::RateLimited(_) => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n            Self::BadRequest(_) => ServiceErrorCode::BadRequest,\n        }\n    }\n}\n\nimpl GrpcServiceError for IngestServiceError {\n    fn new_internal(message: String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::RateLimited(RateLimitingCause::Unknown)\n    }\n\n    fn new_unavailable(error_msg: String) -> Self {\n        Self::Unavailable(error_msg)\n    }\n}\n\n#[derive(Debug, thiserror::Error)]\n#[error(\"key should contain 16 bytes, got {0}\")]\npub struct CorruptedKey(pub usize);\n\nimpl From<CorruptedKey> for 
IngestServiceError {\n    fn from(error: CorruptedKey) -> Self {\n        IngestServiceError::Corruption(format!(\"corrupted key: {error:?}\"))\n    }\n}\n\nimpl From<IngestServiceError> for tonic::Status {\n    fn from(error: IngestServiceError) -> tonic::Status {\n        let code = match &error {\n            IngestServiceError::Corruption { .. } => tonic::Code::DataLoss,\n            IngestServiceError::IndexAlreadyExists { .. } => tonic::Code::AlreadyExists,\n            IngestServiceError::IndexNotFound { .. } => tonic::Code::NotFound,\n            IngestServiceError::Internal(_) => tonic::Code::Internal,\n            IngestServiceError::InvalidPosition(_) => tonic::Code::InvalidArgument,\n            IngestServiceError::IoError { .. } => tonic::Code::Internal,\n            IngestServiceError::RateLimited(_) => tonic::Code::ResourceExhausted,\n            IngestServiceError::Unavailable(_) => tonic::Code::Unavailable,\n            IngestServiceError::BadRequest(_) => tonic::Code::InvalidArgument,\n        };\n        let message = error.to_string();\n        tonic::Status::new(code, message)\n    }\n}\n\nimpl From<ReadRecordError> for IngestServiceError {\n    fn from(error: ReadRecordError) -> IngestServiceError {\n        match error {\n            ReadRecordError::IoError(io_error) => io_error.into(),\n            ReadRecordError::Corruption => {\n                IngestServiceError::Corruption(\"failed to read record\".to_string())\n            }\n        }\n    }\n}\n\nimpl From<AppendError> for IngestServiceError {\n    fn from(err: AppendError) -> IngestServiceError {\n        match err {\n            AppendError::IoError(io_error) => io_error.into(),\n            AppendError::MissingQueue(index_id) => IngestServiceError::IndexNotFound { index_id },\n            // these errors can't be reached right now\n            AppendError::Past => IngestServiceError::InvalidPosition(\n                \"attempted to append a record in the past\".to_string(),\n     
       ),\n        }\n    }\n}\n\nimpl From<DeleteQueueError> for IngestServiceError {\n    fn from(err: DeleteQueueError) -> IngestServiceError {\n        match err {\n            DeleteQueueError::IoError(io_error) => io_error.into(),\n            DeleteQueueError::MissingQueue(index_id) => {\n                IngestServiceError::IndexNotFound { index_id }\n            }\n        }\n    }\n}\n\nimpl From<TruncateError> for IngestServiceError {\n    fn from(err: TruncateError) -> IngestServiceError {\n        match err {\n            TruncateError::IoError(io_error) => io_error.into(),\n            TruncateError::MissingQueue(index_id) => IngestServiceError::IndexNotFound { index_id },\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_api_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::Path;\nuse std::{fmt, iter};\n\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, DeferableReplyHandler, Handler, QueueCapacity,\n};\nuse quickwit_common::runtimes::RuntimeType;\nuse quickwit_common::tower::Cost;\nuse quickwit_proto::ingest::RateLimitingCause;\nuse tracing::{error, info};\nuse ulid::Ulid;\n\nuse crate::metrics::INGEST_METRICS;\nuse crate::notifications::Notifications;\nuse crate::{\n    CommitType, CreateQueueIfNotExistsRequest, CreateQueueIfNotExistsResponse, CreateQueueRequest,\n    DocCommand, DropQueueRequest, FetchRequest, FetchResponse, IngestRequest, IngestResponse,\n    IngestServiceError, ListQueuesRequest, ListQueuesResponse, MemoryCapacity, Queues,\n    SuggestTruncateRequest, TailRequest,\n};\n\nimpl Cost for IngestRequest {\n    fn cost(&self) -> u64 {\n        self.doc_batches\n            .iter()\n            .map(|doc_batch| doc_batch.num_bytes())\n            .sum::<usize>() as u64\n    }\n}\n\npub struct IngestApiService {\n    partition_id: String,\n    queues: Queues,\n    memory_limit: usize,\n    disk_limit: usize,\n    memory_capacity: MemoryCapacity,\n    notifications: Notifications,\n}\n\nimpl fmt::Debug for IngestApiService {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        
f.debug_struct(\"IngestApiService\")\n            .field(\"partition_id\", &self.partition_id)\n            .field(\"memory_limit\", &self.memory_limit)\n            .field(\"disk_limit\", &self.disk_limit)\n            .finish()\n    }\n}\n\n/// When we create our queue storage, we also generate and store\n/// a random partition id associated to it.\n///\n/// That partition_id is used in the source checkpoint.\n///\n/// The idea is to make sure that if the entire queue storage is lost,\n/// the old source checkpoint (stored in the metastore) do not apply.\n/// (See #2310)\nconst PARTITION_ID_PATH: &str = \"partition_id\";\n\nasync fn get_or_initialize_partition_id(dir_path: &Path) -> crate::Result<String> {\n    let partition_id_path = dir_path.join(PARTITION_ID_PATH);\n    if let Ok(partition_id_bytes) = tokio::fs::read(&partition_id_path).await {\n        let partition_id: &str = std::str::from_utf8(&partition_id_bytes).map_err(|_| {\n            let msg = format!(\"partition key ({partition_id_bytes:?}) is not utf8\");\n            IngestServiceError::Corruption(msg)\n        })?;\n        return Ok(partition_id.to_string());\n    }\n    // We add a prefix here to make sure we don't mistake it for a split id when reading logs.\n    let partition_id = format!(\"ingest_partition_{}\", Ulid::new());\n    tokio::fs::write(partition_id_path, partition_id.as_bytes()).await?;\n    Ok(partition_id)\n}\n\nimpl IngestApiService {\n    pub async fn with_queues_dir(\n        queues_dir_path: &Path,\n        memory_limit: usize,\n        disk_limit: usize,\n    ) -> crate::Result<Self> {\n        let queues = Queues::open(queues_dir_path).await?;\n        let partition_id = get_or_initialize_partition_id(queues_dir_path).await?;\n        let memory_capacity = MemoryCapacity::new(memory_limit);\n        let notifications = Notifications::new();\n        info!(ingest_partition_id=%partition_id, \"Ingest API partition id\");\n        Ok(Self {\n            partition_id,\n      
      queues,\n            memory_limit,\n            disk_limit,\n            memory_capacity,\n            notifications,\n        })\n    }\n\n    async fn ingest(\n        &mut self,\n        request: IngestRequest,\n        reply: impl FnOnce(crate::Result<IngestResponse>) + Send + Sync + 'static,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let notification = self.ingest_inner(request, ctx).await;\n        match notification {\n            Ok((response, index_positions)) => {\n                if index_positions.is_empty() {\n                    reply(Ok(response));\n                } else {\n                    self.notifications\n                        .register(index_positions, move || {\n                            reply(Ok(response));\n                        })\n                        .await;\n                }\n                Ok(())\n            }\n            Err(err) => {\n                reply(Err(err));\n                Ok(())\n            }\n        }\n    }\n\n    async fn ingest_inner(\n        &mut self,\n        request: IngestRequest,\n        ctx: &ActorContext<Self>,\n    ) -> crate::Result<(IngestResponse, Vec<(String, u64)>)> {\n        // Check all indexes exist assuming existing queues always have a corresponding index.\n        let first_non_existing_queue_opt = request\n            .doc_batches\n            .iter()\n            .map(|batch| batch.index_id.as_str())\n            .find(|index_id| !self.queues.queue_exists(index_id));\n\n        if let Some(index_id) = first_non_existing_queue_opt {\n            error!(\n                index_id,\n                partition_id = self.partition_id,\n                \"could not find index\"\n            );\n            return Err(IngestServiceError::IndexNotFound {\n                index_id: index_id.to_string(),\n            });\n        }\n        let disk_used = self.queues.resource_usage().disk_used_bytes;\n\n        if disk_used > 
self.disk_limit {\n            info!(\"ingestion rejected due to disk limit\");\n            return Err(IngestServiceError::RateLimited(RateLimitingCause::WalFull));\n        }\n\n        if self\n            .memory_capacity\n            .reserve_capacity(request.cost() as usize)\n            .is_err()\n        {\n            info!(\"ingest request rejected due to memory limit\");\n            return Err(IngestServiceError::RateLimited(RateLimitingCause::WalFull));\n        }\n        let mut num_docs = 0usize;\n        let mut notifications = Vec::new();\n        let commit = request.commit();\n        for doc_batch in request.doc_batches {\n            // TODO better error handling.\n            // If there is an error, we probably want a transactional behavior.\n\n            let batch_num_docs = doc_batch.num_docs();\n            let batch_num_bytes = doc_batch.num_bytes();\n            let index_id = doc_batch.index_id.clone();\n            let records_it = doc_batch.into_iter_raw();\n            let max_position = self.queues.append_batch(&index_id, records_it, ctx).await?;\n            if let Some(max_position) = max_position\n                && commit != CommitType::Auto\n            {\n                if commit == CommitType::Force {\n                    self.queues\n                        .append_batch(\n                            &index_id,\n                            iter::once(DocCommand::Commit::<Bytes>.into_buf()),\n                            ctx,\n                        )\n                        .await?;\n                }\n                notifications.push((index_id.clone(), max_position));\n            }\n\n            num_docs += batch_num_docs;\n            INGEST_METRICS\n                .ingested_docs_bytes_valid\n                .inc_by(batch_num_bytes as u64);\n            INGEST_METRICS\n                .ingested_docs_valid\n                .inc_by(batch_num_docs as u64);\n        }\n        // TODO we could fsync here and disable 
autosync to have better i/o perfs.\n        Ok((\n            IngestResponse {\n                num_docs_for_processing: num_docs as u64,\n            },\n            notifications,\n        ))\n    }\n\n    fn fetch(&mut self, fetch_req: FetchRequest) -> crate::Result<FetchResponse> {\n        let num_bytes_limit_opt: Option<usize> = fetch_req\n            .num_bytes_limit\n            .map(|num_bytes_limit| num_bytes_limit as usize);\n        self.queues.fetch(\n            &fetch_req.index_id,\n            fetch_req.start_after,\n            num_bytes_limit_opt,\n        )\n    }\n\n    async fn suggest_truncate(\n        &mut self,\n        request: SuggestTruncateRequest,\n        ctx: &ActorContext<Self>,\n    ) -> crate::Result<()> {\n        self.notifications\n            .notify(&request.index_id, request.up_to_position_included)\n            .await;\n        self.queues\n            .suggest_truncate(&request.index_id, request.up_to_position_included, ctx)\n            .await?;\n\n        let memory_used = self.queues.resource_usage().memory_used_bytes;\n        let new_capacity = self.memory_limit - memory_used;\n        self.memory_capacity.reset_capacity(new_capacity);\n\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Actor for IngestApiService {\n    type ObservableState = ();\n\n    fn observable_state(&self) -> Self::ObservableState {}\n\n    fn runtime_handle(&self) -> tokio::runtime::Handle {\n        RuntimeType::NonBlocking.get_runtime_handle()\n    }\n\n    /// The Actor's incoming mailbox queue capacity. 
It is set when the actor is spawned.\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(3)\n    }\n}\n\n#[derive(Debug)]\npub struct GetPartitionId;\n\n#[async_trait]\nimpl Handler<GetPartitionId> for IngestApiService {\n    type Reply = String;\n\n    async fn handle(\n        &mut self,\n        _request: GetPartitionId,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.partition_id.clone())\n    }\n}\n\n#[derive(Debug)]\npub struct GetMemoryCapacity;\n\n#[async_trait]\nimpl Handler<GetMemoryCapacity> for IngestApiService {\n    type Reply = MemoryCapacity;\n\n    async fn handle(\n        &mut self,\n        _request: GetMemoryCapacity,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.memory_capacity.clone())\n    }\n}\n\n#[async_trait]\nimpl Handler<CreateQueueRequest> for IngestApiService {\n    type Reply = crate::Result<()>;\n    async fn handle(\n        &mut self,\n        create_queue_req: CreateQueueRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self\n            .queues\n            .create_queue(&create_queue_req.queue_id, ctx)\n            .await)\n    }\n}\n\n#[async_trait]\nimpl Handler<CreateQueueIfNotExistsRequest> for IngestApiService {\n    type Reply = crate::Result<CreateQueueIfNotExistsResponse>;\n    async fn handle(\n        &mut self,\n        create_queue_inf_req: CreateQueueIfNotExistsRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        if self.queues.queue_exists(&create_queue_inf_req.queue_id) {\n            let response = CreateQueueIfNotExistsResponse {\n                queue_id: create_queue_inf_req.queue_id,\n                created: false,\n            };\n            return Ok(Ok(response));\n        }\n        Ok(self\n            .queues\n            
.create_queue(&create_queue_inf_req.queue_id, ctx)\n            .await\n            .map(|_| CreateQueueIfNotExistsResponse {\n                queue_id: create_queue_inf_req.queue_id,\n                created: true,\n            }))\n    }\n}\n\n#[async_trait]\nimpl Handler<DropQueueRequest> for IngestApiService {\n    type Reply = crate::Result<()>;\n    async fn handle(\n        &mut self,\n        drop_queue_req: DropQueueRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.queues.drop_queue(&drop_queue_req.queue_id, ctx).await)\n    }\n}\n\n#[async_trait]\nimpl DeferableReplyHandler<IngestRequest> for IngestApiService {\n    type Reply = crate::Result<IngestResponse>;\n    async fn handle_message(\n        &mut self,\n        ingest_req: IngestRequest,\n        reply: impl FnOnce(Self::Reply) + Send + Sync + 'static,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.ingest(ingest_req, reply, ctx).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<FetchRequest> for IngestApiService {\n    type Reply = crate::Result<FetchResponse>;\n    async fn handle(\n        &mut self,\n        request: FetchRequest,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.fetch(request))\n    }\n}\n\n#[async_trait]\nimpl Handler<TailRequest> for IngestApiService {\n    type Reply = crate::Result<FetchResponse>;\n    async fn handle(\n        &mut self,\n        request: TailRequest,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.queues.tail(&request.index_id))\n    }\n}\n\n#[async_trait]\nimpl Handler<SuggestTruncateRequest> for IngestApiService {\n    type Reply = crate::Result<()>;\n    async fn handle(\n        &mut self,\n        request: SuggestTruncateRequest,\n        ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        
Ok(self.suggest_truncate(request, ctx).await)\n    }\n}\n\n#[async_trait]\nimpl Handler<ListQueuesRequest> for IngestApiService {\n    type Reply = crate::Result<ListQueuesResponse>;\n    async fn handle(\n        &mut self,\n        _list_queue_req: ListQueuesRequest,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.queues.list_queues())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use bytes::Bytes;\n    use quickwit_actors::Universe;\n    use quickwit_config::IngestApiConfig;\n\n    use super::*;\n    use crate::{DocBatch, DocBatchBuilder, init_ingest_api};\n\n    #[test]\n    fn test_ingest_request_cost() {\n        let ingest_request = IngestRequest {\n            doc_batches: vec![\n                DocBatch {\n                    index_id: \"index-1\".to_string(),\n                    doc_buffer: Bytes::from_static(&[0, 1, 2]),\n                    doc_lengths: vec![1, 2],\n                },\n                DocBatch {\n                    index_id: \"index-2\".to_string(),\n                    doc_buffer: Bytes::from_static(&[3, 4, 5, 6, 7, 8]),\n                    doc_lengths: vec![1, 3, 2],\n                },\n            ],\n            commit: CommitType::Auto.into(),\n        };\n        assert_eq!(ingest_request.cost(), 9);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_service_with_commit() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, &IngestApiConfig::default()).await?;\n\n        // Ensure a queue for this index exists.\n        let create_queue_req = CreateQueueIfNotExistsRequest {\n            queue_id: \"index-1\".to_string(),\n        };\n\n        ingest_api_service.ask_for_res(create_queue_req).await?;\n\n        let mut batch 
= DocBatchBuilder::new(\"index-1\".to_string());\n        batch.ingest_doc(Bytes::from_static(b\"Test1\"));\n        batch.ingest_doc(Bytes::from_static(b\"Test2\"));\n        batch.ingest_doc(Bytes::from_static(b\"Test3\"));\n        batch.ingest_doc(Bytes::from_static(b\"Test4\"));\n\n        let ingest_request = IngestRequest {\n            doc_batches: vec![batch.build()],\n            commit: CommitType::Force.into(),\n        };\n        let ingest_response = ingest_api_service\n            .send_message(ingest_request)\n            .await\n            .unwrap();\n        universe.sleep(Duration::from_secs(2)).await;\n        let fetch_request = FetchRequest {\n            index_id: \"index-1\".to_string(),\n            start_after: None,\n            num_bytes_limit: None,\n        };\n        let fetch_response = ingest_api_service.ask_for_res(fetch_request).await.unwrap();\n        let doc_batch = fetch_response.doc_batch.unwrap();\n        let position = doc_batch.num_docs() as u64;\n        assert_eq!(doc_batch.num_docs(), 5);\n        assert!(matches!(\n            doc_batch.into_iter().nth(4),\n            Some(DocCommand::Commit::<Bytes>)\n        ));\n        ingest_api_service\n            .send_message(SuggestTruncateRequest {\n                index_id: \"index-1\".to_string(),\n                up_to_position_included: position,\n            })\n            .await\n            .unwrap();\n\n        let ingest_response = ingest_response.await.unwrap().unwrap();\n        assert_eq!(ingest_response.num_docs_for_processing, 4);\n\n        universe.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_service_with_wait() -> anyhow::Result<()> {\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir()?;\n        let queues_dir_path = temp_dir.path();\n\n        let ingest_api_service =\n            init_ingest_api(&universe, queues_dir_path, 
&IngestApiConfig::default()).await?;\n\n        // Ensure a queue for this index exists.\n        let create_queue_req = CreateQueueIfNotExistsRequest {\n            queue_id: \"index-1\".to_string(),\n        };\n\n        ingest_api_service.ask_for_res(create_queue_req).await?;\n\n        let mut batch = DocBatchBuilder::new(\"index-1\".to_string());\n        batch.ingest_doc(Bytes::from_static(b\"Test1\"));\n        batch.ingest_doc(Bytes::from_static(b\"Test2\"));\n        batch.ingest_doc(Bytes::from_static(b\"Test3\"));\n        batch.ingest_doc(Bytes::from_static(b\"Test4\"));\n\n        let ingest_request = IngestRequest {\n            doc_batches: vec![batch.build()],\n            commit: CommitType::WaitFor.into(),\n        };\n        let ingest_response = ingest_api_service\n            .send_message(ingest_request)\n            .await\n            .unwrap();\n        universe.sleep(Duration::from_secs(2)).await;\n        let fetch_request = FetchRequest {\n            index_id: \"index-1\".to_string(),\n            start_after: None,\n            num_bytes_limit: None,\n        };\n        let fetch_response = ingest_api_service.ask_for_res(fetch_request).await.unwrap();\n        let doc_batch = fetch_response.doc_batch.unwrap();\n        let position = doc_batch.num_docs() as u64;\n        assert_eq!(doc_batch.num_docs(), 4);\n        ingest_api_service\n            .send_message(SuggestTruncateRequest {\n                index_id: \"index-1\".to_string(),\n                up_to_position_included: position,\n            })\n            .await\n            .unwrap();\n\n        let ingest_response = ingest_response.await.unwrap().unwrap();\n        assert_eq!(ingest_response.num_docs_for_processing, 4);\n\n        universe.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_service.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n\nsyntax = \"proto3\";\n\npackage ingest_service;\n\nservice IngestService {\n  // Ingests document in a given queue.\n  //\n  // Upon any kind of error, the client should\n  // - retry to get at least once delivery.\n  // - not retry to get at most once delivery.\n  //\n  // Exactly once delivery is not supported yet.\n  rpc Ingest(IngestRequest) returns (IngestResponse);\n\n  // Fetches record from a given queue.\n  //\n  // Records are returned in order.\n  //\n  // The returned `FetchResponse` object is meant to be read with the\n  // `crate::iter_records` function.\n  //\n  // Fetching does not necessarily return all of the available records.\n  // If returning all records would exceed `FETCH_PAYLOAD_LIMIT` (2MB),\n  // the response will be partial.\n  rpc Fetch(FetchRequest) returns (FetchResponse);\n\n  // Returns a batch containing the last records.\n  //\n  // It returns the last documents, from the newest\n  // to the oldest, and stops as soon as `FETCH_PAYLOAD_LIMIT` (2MB)\n  // is exceeded.\n  rpc Tail(TailRequest) returns (FetchResponse);\n}\n\nmessage QueueExistsRequest {\n    string queue_id = 1;\n}\n\nmessage CreateQueueRequest {\n    string queue_id = 1;\n}\n\nmessage CreateQueueIfNotExistsRequest {\n    string queue_id = 1;\n}\n\nmessage CreateQueueIfNotExistsResponse {\n    string queue_id = 1;\n    bool created = 
2;\n}\n\nmessage DropQueueRequest {\n    string queue_id = 1;\n}\n\n// Specifies if the ingest request should block waiting for the records to be committed.\nenum CommitType {\n    // The request doesn't wait for commit\n    Auto = 0;\n    // The request waits for the next scheduled commit to finish.\n    WaitFor = 1;\n    // The request forces an immediate commit after the last document in the batch and waits for\n    // it to finish.\n    Force = 2;\n}\n\nmessage IngestRequest {\n    repeated DocBatch doc_batches = 1;\n    CommitType commit = 2;\n}\n\nmessage IngestResponse {\n    uint64 num_docs_for_processing = 1;\n}\n\n// Fetch messages with position strictly after `start_after`.\nmessage FetchRequest {\n    string index_id = 1;\n    optional uint64 start_after = 2;\n    optional uint64 num_bytes_limit = 3;\n}\n\nmessage FetchResponse {\n    optional uint64 first_position = 1;\n    DocBatch doc_batch = 2;\n}\n\nmessage DocBatch {\n    string index_id = 1;\n    bytes doc_buffer = 2;\n    repeated uint32 doc_lengths = 3;\n}\n\n// Suggest to truncate the queue.\n//\n// This request allows the queue to remove all records up to and\n// including `up_to_position_included`.\n//\n// The role of this truncation is to release memory and disk space.\n//\n// There are no guarantees that the records will actually be removed.\n// Nothing might happen, or the truncation might be partial.\n//\n// In other words, truncating from a position, and fetching records starting\n// earlier than this position can yield undefined results:\n// the truncated records may or may not be returned.\nmessage SuggestTruncateRequest {\n    string index_id = 1;\n    uint64 up_to_position_included = 2;\n}\n\nmessage TailRequest {\n    string index_id = 1;\n}\n\nmessage ListQueuesRequest {\n}\n\nmessage ListQueuesResponse {\n    repeated string queues = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/broadcast/capacity_score.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\n\nuse anyhow::{Context, Result};\nuse bytesize::ByteSize;\nuse quickwit_cluster::{Cluster, ListenerHandle};\nuse quickwit_common::pubsub::{Event, EventBroker};\nuse quickwit_common::shared_consts::INGESTER_CAPACITY_SCORE_PREFIX;\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse quickwit_proto::types::{NodeId, SourceUid};\nuse serde::{Deserialize, Serialize};\nuse tokio::task::JoinHandle;\nuse tracing::{info, warn};\n\nuse super::{BROADCAST_INTERVAL_PERIOD, make_key, parse_key};\nuse crate::OpenShardCounts;\nuse crate::ingest_v2::state::WeakIngesterState;\n\n#[derive(Debug, Clone, Default, Serialize, Deserialize)]\npub struct IngesterCapacityScore {\n    pub capacity_score: usize,\n    pub open_shard_count: usize,\n}\n\n/// Periodically snapshots the ingester's WAL memory usage and open shard counts, computes\n/// a capacity score, and broadcasts it to other nodes via Chitchat.\npub struct BroadcastIngesterCapacityScoreTask {\n    cluster: Cluster,\n    weak_state: WeakIngesterState,\n}\n\nimpl BroadcastIngesterCapacityScoreTask {\n    pub fn spawn(cluster: Cluster, weak_state: WeakIngesterState) -> JoinHandle<()> {\n        let mut broadcaster = Self {\n            cluster,\n            weak_state,\n        };\n        tokio::spawn(async move { broadcaster.run().await })\n    }\n\n    async fn 
snapshot(&self) -> Result<Option<(usize, OpenShardCounts)>> {\n        let state = self\n            .weak_state\n            .upgrade()\n            .context(\"ingester state has been dropped\")?;\n\n        // lock fully asserts that the ingester is ready. There's a likelihood that this task runs\n        // before the WAL is loaded, so we make sure that the ingester is ready just in case.\n        if *state.status_rx.borrow() != IngesterStatus::Ready {\n            return Ok(None);\n        }\n\n        let mut guard = state\n            .lock_fully()\n            .await\n            .context(\"failed to acquire ingester state lock\")?;\n\n        let usage = guard.mrecordlog.resource_usage();\n        let disk_used = ByteSize::b(usage.disk_used_bytes as u64);\n        let memory_used = ByteSize::b(usage.memory_used_bytes as u64);\n        let capacity_score = guard\n            .wal_capacity_tracker\n            .record_and_score(disk_used, memory_used);\n        let (open_shard_counts, _) = guard.get_shard_snapshot();\n\n        Ok(Some((capacity_score, open_shard_counts)))\n    }\n\n    async fn run(&mut self) {\n        let mut interval = tokio::time::interval(BROADCAST_INTERVAL_PERIOD);\n        let mut previous_sources: BTreeSet<SourceUid> = BTreeSet::new();\n\n        loop {\n            interval.tick().await;\n\n            let (capacity_score, open_shard_counts) = match self.snapshot().await {\n                Ok(Some(snapshot)) => snapshot,\n                Ok(None) => continue,\n                Err(error) => {\n                    info!(\"stopping ingester capacity broadcast: {error}\");\n                    return;\n                }\n            };\n\n            previous_sources = self\n                .broadcast_capacity(capacity_score, &open_shard_counts, &previous_sources)\n                .await;\n        }\n    }\n\n    async fn broadcast_capacity(\n        &self,\n        capacity_score: usize,\n        open_shard_counts: &OpenShardCounts,\n  
      previous_sources: &BTreeSet<SourceUid>,\n    ) -> BTreeSet<SourceUid> {\n        let mut current_sources = BTreeSet::new();\n\n        for (index_uid, source_id, open_shard_count) in open_shard_counts {\n            let source_uid = SourceUid {\n                index_uid: index_uid.clone(),\n                source_id: source_id.clone(),\n            };\n            let key = make_key(INGESTER_CAPACITY_SCORE_PREFIX, &source_uid);\n            let capacity = IngesterCapacityScore {\n                capacity_score,\n                open_shard_count: *open_shard_count,\n            };\n            let value = serde_json::to_string(&capacity)\n                .expect(\"`IngesterCapacityScore` should be JSON serializable\");\n            self.cluster.set_self_key_value(key, value).await;\n            current_sources.insert(source_uid);\n        }\n\n        for removed_source in previous_sources.difference(&current_sources) {\n            let key = make_key(INGESTER_CAPACITY_SCORE_PREFIX, removed_source);\n            self.cluster.remove_self_key(&key).await;\n        }\n\n        current_sources\n    }\n}\n\n#[derive(Debug, Clone)]\npub struct IngesterCapacityScoreUpdate {\n    pub node_id: NodeId,\n    pub source_uid: SourceUid,\n    pub capacity_score: usize,\n    pub open_shard_count: usize,\n}\n\nimpl Event for IngesterCapacityScoreUpdate {}\n\npub async fn setup_ingester_capacity_update_listener(\n    cluster: Cluster,\n    event_broker: EventBroker,\n) -> ListenerHandle {\n    cluster\n        .subscribe(INGESTER_CAPACITY_SCORE_PREFIX, move |event| {\n            let Some(source_uid) = parse_key(event.key) else {\n                warn!(\"failed to parse source UID from key `{}`\", event.key);\n                return;\n            };\n            let Ok(ingester_capacity) = serde_json::from_str::<IngesterCapacityScore>(event.value)\n            else {\n                warn!(\"failed to parse ingester capacity `{}`\", event.value);\n                return;\n   
         };\n            let node_id: NodeId = event.node.node_id.clone().into();\n            event_broker.publish(IngesterCapacityScoreUpdate {\n                node_id,\n                source_uid,\n                capacity_score: ingester_capacity.capacity_score,\n                open_shard_count: ingester_capacity.open_shard_count,\n            });\n        })\n        .await\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_proto::types::{IndexUid, ShardId, SourceId};\n\n    use super::*;\n    use crate::ingest_v2::models::IngesterShard;\n    use crate::ingest_v2::state::IngesterState;\n\n    #[tokio::test]\n    async fn test_snapshot_state_dropped() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster.clone()).await;\n        let weak_state = state.weak();\n        drop(state);\n\n        let task = BroadcastIngesterCapacityScoreTask {\n            cluster,\n            weak_state,\n        };\n        assert!(task.snapshot().await.is_err());\n    }\n\n    #[tokio::test]\n    async fn test_broadcast_ingester_capacity() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let event_broker = EventBroker::default();\n\n        // Use 1000 bytes disk capacity so 500 used => 50% remaining, 0 delta => score = 6\n        let (_temp_dir, state) =\n            IngesterState::for_test_with_disk_capacity(cluster.clone(), ByteSize::b(1000)).await;\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let mut state_guard = 
state.lock_partially().await.unwrap();\n        let shard = IngesterShard::new_solo(\n            index_uid.clone(),\n            SourceId::from(\"test-source\"),\n            ShardId::from(0),\n        )\n        .advertisable()\n        .build();\n        state_guard.shards.insert(shard.queue_id(), shard);\n        let (open_shard_counts, _) = state_guard.get_shard_snapshot();\n        let capacity_score = state_guard\n            .wal_capacity_tracker\n            .record_and_score(ByteSize::b(500), ByteSize::b(0));\n        drop(state_guard);\n\n        assert_eq!(capacity_score, 6);\n\n        let task = BroadcastIngesterCapacityScoreTask {\n            cluster: cluster.clone(),\n            weak_state: state.weak(),\n        };\n\n        let update_counter = Arc::new(AtomicUsize::new(0));\n        let update_counter_clone = update_counter.clone();\n        let index_uid_clone = index_uid.clone();\n        let _sub = event_broker.subscribe(move |event: IngesterCapacityScoreUpdate| {\n            update_counter_clone.fetch_add(1, Ordering::Release);\n            assert_eq!(event.source_uid.index_uid, index_uid_clone);\n            assert_eq!(event.source_uid.source_id, \"test-source\");\n            assert_eq!(event.capacity_score, 6);\n            assert_eq!(event.open_shard_count, 1);\n        });\n\n        let _listener =\n            setup_ingester_capacity_update_listener(cluster.clone(), event_broker).await;\n\n        let previous_sources = BTreeSet::new();\n        task.broadcast_capacity(capacity_score, &open_shard_counts, &previous_sources)\n            .await;\n        tokio::time::sleep(BROADCAST_INTERVAL_PERIOD * 2).await;\n\n        assert_eq!(update_counter.load(Ordering::Acquire), 1);\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: SourceId::from(\"test-source\"),\n        };\n        let key = make_key(INGESTER_CAPACITY_SCORE_PREFIX, &source_uid);\n        let value = 
cluster.get_self_key_value(&key).await.unwrap();\n        let deserialized: IngesterCapacityScore = serde_json::from_str(&value).unwrap();\n        assert_eq!(deserialized.capacity_score, 6);\n        assert_eq!(deserialized.open_shard_count, 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/broadcast/local_shards.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, BTreeSet, HashMap};\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\nuse quickwit_cluster::{Cluster, ListenerHandle};\nuse quickwit_common::pubsub::{Event, EventBroker};\nuse quickwit_common::ring_buffer::RingBuffer;\nuse quickwit_common::shared_consts::INGESTER_PRIMARY_SHARDS_PREFIX;\nuse quickwit_common::sorted_iter::{KeyDiff, SortedByKeyIterator};\nuse quickwit_common::tower::{ConstantRate, Rate};\nuse quickwit_proto::ingest::ShardState;\nuse quickwit_proto::types::{NodeId, ShardId, SourceUid};\nuse serde::{Deserialize, Serialize, Serializer};\nuse tokio::task::JoinHandle;\nuse tracing::{debug, warn};\n\nuse super::{BROADCAST_INTERVAL_PERIOD, make_key, parse_key};\nuse crate::RateMibPerSec;\nuse crate::ingest_v2::metrics::INGEST_V2_METRICS;\nuse crate::ingest_v2::state::WeakIngesterState;\n\nconst ONE_MIB: ByteSize = ByteSize::mib(1);\n\n/// Broadcasted information about a primary shard.\n#[derive(Debug, Clone, Eq, PartialEq, Ord, PartialOrd)]\npub struct ShardInfo {\n    pub shard_id: ShardId,\n    pub shard_state: ShardState,\n    /// Shard ingestion rate in MiB/s.\n    /// Short term ingestion rate. It is measured over a short period of time.\n    pub short_term_ingestion_rate: RateMibPerSec,\n    /// Long term ingestion rate. 
It is measured over a larger period of time.\n    pub long_term_ingestion_rate: RateMibPerSec,\n}\n\nimpl Serialize for ShardInfo {\n    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n        serializer.serialize_str(&format!(\n            \"{}:{}:{}:{}\",\n            self.shard_id,\n            self.shard_state.as_json_str_name(),\n            self.short_term_ingestion_rate.0,\n            self.long_term_ingestion_rate.0,\n        ))\n    }\n}\n\nimpl<'de> Deserialize<'de> for ShardInfo {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: serde::Deserializer<'de> {\n        let value = String::deserialize(deserializer)?;\n        let mut parts = value.split(':');\n\n        let shard_id: ShardId = parts\n            .next()\n            .ok_or_else(|| serde::de::Error::custom(\"invalid shard info\"))?\n            .into();\n\n        let shard_state_str = parts\n            .next()\n            .ok_or_else(|| serde::de::Error::custom(\"invalid shard info\"))?;\n        let shard_state = ShardState::from_json_str_name(shard_state_str)\n            .ok_or_else(|| serde::de::Error::custom(\"invalid shard state\"))?;\n\n        let short_term_ingestion_rate = parts\n            .next()\n            .ok_or_else(|| serde::de::Error::custom(\"invalid shard info\"))?\n            .parse::<u16>()\n            .map(RateMibPerSec)\n            .map_err(|_| serde::de::Error::custom(\"invalid shard ingestion rate\"))?;\n\n        let long_term_ingestion_rate = parts\n            .next()\n            .ok_or_else(|| serde::de::Error::custom(\"invalid shard info\"))?\n            .parse::<u16>()\n            .map(RateMibPerSec)\n            .map_err(|_| serde::de::Error::custom(\"invalid shard ingestion rate\"))?;\n\n        Ok(Self {\n            shard_id,\n            shard_state,\n            short_term_ingestion_rate,\n            long_term_ingestion_rate,\n        })\n    }\n}\n\n/// A set of primary shards 
belonging to the same source.\npub type ShardInfos = BTreeSet<ShardInfo>;\n\n/// Lists ALL the primary shards hosted by a SINGLE ingester, grouped by source.\n#[derive(Debug, Default, Eq, PartialEq)]\nstruct LocalShardsSnapshot {\n    per_source_shard_infos: BTreeMap<SourceUid, ShardInfos>,\n}\n\n#[derive(Debug)]\nenum ShardInfosChange<'a> {\n    Updated {\n        source_uid: &'a SourceUid,\n        shard_infos: &'a ShardInfos,\n    },\n    Removed {\n        source_uid: &'a SourceUid,\n    },\n}\n\nimpl LocalShardsSnapshot {\n    pub fn diff<'a>(&'a self, other: &'a Self) -> impl Iterator<Item = ShardInfosChange<'a>> + 'a {\n        self.per_source_shard_infos\n            .iter()\n            .diff_by_key(other.per_source_shard_infos.iter())\n            .filter_map(|key_diff| match key_diff {\n                KeyDiff::Added(source_uid, shard_infos) => Some(ShardInfosChange::Updated {\n                    source_uid,\n                    shard_infos,\n                }),\n                KeyDiff::Unchanged(source_uid, previous_shard_infos, new_shard_infos) => {\n                    if previous_shard_infos != new_shard_infos {\n                        Some(ShardInfosChange::Updated {\n                            source_uid,\n                            shard_infos: new_shard_infos,\n                        })\n                    } else {\n                        None\n                    }\n                }\n                KeyDiff::Removed(source_uid, _shard_infos) => {\n                    Some(ShardInfosChange::Removed { source_uid })\n                }\n            })\n    }\n}\n\n/// Takes a snapshot of the primary shards hosted by the ingester at regular intervals and\n/// broadcasts it to other nodes via Chitchat.\npub struct BroadcastLocalShardsTask {\n    cluster: Cluster,\n    weak_state: WeakIngesterState,\n    shard_throughput_time_series_map: ShardThroughputTimeSeriesMap,\n}\n\nconst SHARD_THROUGHPUT_LONG_TERM_WINDOW_LEN: usize = 
12;\n\n#[derive(Default)]\nstruct ShardThroughputTimeSeriesMap {\n    shard_time_series: HashMap<(SourceUid, ShardId), ShardThroughputTimeSeries>,\n}\n\nimpl ShardThroughputTimeSeriesMap {\n    // Records a list of shard throughputs.\n    //\n    // A new time series is created for each new shard_ids.\n    // If a shard_id had a time series, and it is not present in the\n    // `shard_throughput`, the time series will be removed.\n    #[allow(clippy::mutable_key_type)]\n    pub fn record_shard_throughputs(\n        &mut self,\n        shard_throughputs: HashMap<(SourceUid, ShardId), (ShardState, ConstantRate)>,\n    ) {\n        self.shard_time_series\n            .retain(|key, _| shard_throughputs.contains_key(key));\n        for ((source_uid, shard_id), (shard_state, throughput)) in shard_throughputs {\n            let throughput_measurement = throughput.rescale(Duration::from_secs(1)).work_bytes();\n            let shard_time_series = self\n                .shard_time_series\n                .entry((source_uid.clone(), shard_id.clone()))\n                .or_default();\n            shard_time_series.shard_state = shard_state;\n            shard_time_series.record(throughput_measurement);\n        }\n    }\n\n    pub fn get_per_source_shard_infos(&self) -> BTreeMap<SourceUid, ShardInfos> {\n        let mut per_source_shard_infos: BTreeMap<SourceUid, ShardInfos> = BTreeMap::new();\n        for ((source_uid, shard_id), shard_time_series) in self.shard_time_series.iter() {\n            let shard_state = shard_time_series.shard_state;\n            let short_term_ingestion_rate_mib_per_sec_u64: u64 =\n                shard_time_series.last().as_u64().div_ceil(ONE_MIB.as_u64());\n            let long_term_ingestion_rate_mib_per_sec_u64: u64 = shard_time_series\n                .average()\n                .as_u64()\n                .div_ceil(ONE_MIB.as_u64());\n            INGEST_V2_METRICS\n                .shard_st_throughput_mib\n                
.observe(short_term_ingestion_rate_mib_per_sec_u64 as f64);\n            INGEST_V2_METRICS\n                .shard_lt_throughput_mib\n                .observe(long_term_ingestion_rate_mib_per_sec_u64 as f64);\n\n            let short_term_ingestion_rate =\n                RateMibPerSec(short_term_ingestion_rate_mib_per_sec_u64 as u16);\n            let long_term_ingestion_rate =\n                RateMibPerSec(long_term_ingestion_rate_mib_per_sec_u64 as u16);\n            let shard_info = ShardInfo {\n                shard_id: shard_id.clone(),\n                shard_state,\n                short_term_ingestion_rate,\n                long_term_ingestion_rate,\n            };\n\n            per_source_shard_infos\n                .entry(source_uid.clone())\n                .or_default()\n                .insert(shard_info);\n        }\n        per_source_shard_infos\n    }\n}\n\n#[derive(Default)]\nstruct ShardThroughputTimeSeries {\n    shard_state: ShardState,\n    throughput: RingBuffer<ByteSize, SHARD_THROUGHPUT_LONG_TERM_WINDOW_LEN>,\n}\n\nimpl ShardThroughputTimeSeries {\n    fn last(&self) -> ByteSize {\n        self.throughput.last().unwrap_or_default()\n    }\n\n    fn average(&self) -> ByteSize {\n        if self.throughput.is_empty() {\n            return ByteSize::default();\n        }\n        let sum = self.throughput.iter().map(ByteSize::as_u64).sum::<u64>();\n        ByteSize::b(sum / self.throughput.len() as u64)\n    }\n\n    fn record(&mut self, new_throughput_measurement: ByteSize) {\n        self.throughput.push_back(new_throughput_measurement);\n    }\n}\n\nimpl BroadcastLocalShardsTask {\n    pub fn spawn(cluster: Cluster, weak_state: WeakIngesterState) -> JoinHandle<()> {\n        let mut broadcaster = Self {\n            cluster,\n            weak_state,\n            shard_throughput_time_series_map: Default::default(),\n        };\n        tokio::spawn(async move { broadcaster.run().await })\n    }\n\n    async fn snapshot_local_shards(&mut 
self) -> Option<LocalShardsSnapshot> {\n        let state = self.weak_state.upgrade()?;\n\n        let Ok(mut state_guard) = state.lock_partially().await else {\n            return Some(LocalShardsSnapshot::default());\n        };\n        #[allow(clippy::mutable_key_type)]\n        let ingestion_rates: HashMap<(SourceUid, ShardId), (ShardState, ConstantRate)> =\n            state_guard\n                .shards\n                .values_mut()\n                .filter(|shard| shard.is_advertisable && !shard.is_replica())\n                .map(|shard| {\n                    let source_uid = SourceUid {\n                        index_uid: shard.index_uid.clone(),\n                        source_id: shard.source_id.clone(),\n                    };\n                    let shard_id = shard.shard_id.clone();\n                    let shard_state = shard.shard_state;\n                    let rate_meter = &mut shard.rate_meter;\n\n                    ((source_uid, shard_id), (shard_state, rate_meter.harvest()))\n                })\n                .collect();\n\n        self.shard_throughput_time_series_map\n            .record_shard_throughputs(ingestion_rates);\n\n        let per_source_shard_infos = self\n            .shard_throughput_time_series_map\n            .get_per_source_shard_infos();\n\n        let mut num_open_shards = 0;\n        let mut num_closed_shards = 0;\n\n        for shard_infos in per_source_shard_infos.values() {\n            for shard_info in shard_infos {\n                match shard_info.shard_state {\n                    ShardState::Open => num_open_shards += 1,\n                    ShardState::Closed => num_closed_shards += 1,\n                    ShardState::Unavailable | ShardState::Unspecified => {}\n                }\n            }\n        }\n        INGEST_V2_METRICS.open_shards.set(num_open_shards as i64);\n        INGEST_V2_METRICS\n            .closed_shards\n            .set(num_closed_shards as i64);\n\n        let snapshot = 
LocalShardsSnapshot {\n            per_source_shard_infos,\n        };\n        Some(snapshot)\n    }\n\n    async fn broadcast_local_shards(\n        &self,\n        previous_snapshot: &LocalShardsSnapshot,\n        new_snapshot: &LocalShardsSnapshot,\n    ) {\n        for change in previous_snapshot.diff(new_snapshot) {\n            match change {\n                ShardInfosChange::Updated {\n                    source_uid,\n                    shard_infos,\n                } => {\n                    let key = make_key(INGESTER_PRIMARY_SHARDS_PREFIX, source_uid);\n                    let value = serde_json::to_string(&shard_infos)\n                        .expect(\"`ShardInfos` should be JSON serializable\");\n                    self.cluster.set_self_key_value(key, value).await;\n                }\n                ShardInfosChange::Removed { source_uid } => {\n                    let key = make_key(INGESTER_PRIMARY_SHARDS_PREFIX, source_uid);\n                    self.cluster.remove_self_key(&key).await;\n                }\n            }\n        }\n    }\n\n    async fn run(&mut self) {\n        let mut interval = tokio::time::interval(BROADCAST_INTERVAL_PERIOD);\n        let mut previous_snapshot = LocalShardsSnapshot::default();\n\n        loop {\n            interval.tick().await;\n\n            let Some(new_snapshot) = self.snapshot_local_shards().await else {\n                // The state has been dropped, we can stop the task.\n                debug!(\"stopping local shards broadcast task\");\n                return;\n            };\n            self.broadcast_local_shards(&previous_snapshot, &new_snapshot)\n                .await;\n\n            previous_snapshot = new_snapshot;\n        }\n    }\n}\n\n#[derive(Debug, Clone)]\npub struct LocalShardsUpdate {\n    pub leader_id: NodeId,\n    pub source_uid: SourceUid,\n    pub shard_infos: ShardInfos,\n}\n\nimpl Event for LocalShardsUpdate {}\n\npub async fn setup_local_shards_update_listener(\n    
cluster: Cluster,\n    event_broker: EventBroker,\n) -> ListenerHandle {\n    cluster\n        .subscribe(INGESTER_PRIMARY_SHARDS_PREFIX, move |event| {\n            let Some(source_uid) = parse_key(event.key) else {\n                warn!(\"failed to parse source UID `{}`\", event.key);\n                return;\n            };\n            let Ok(shard_infos) = serde_json::from_str::<ShardInfos>(event.value) else {\n                warn!(\"failed to parse shard infos `{}`\", event.value);\n                return;\n            };\n            let leader_id: NodeId = event.node.node_id.clone().into();\n\n            let local_shards_update = LocalShardsUpdate {\n                leader_id,\n                source_uid,\n                shard_infos,\n            };\n            event_broker.publish(local_shards_update);\n        })\n        .await\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicUsize, Ordering};\n\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_common::shared_consts::INGESTER_PRIMARY_SHARDS_PREFIX;\n    use quickwit_proto::ingest::ShardState;\n    use quickwit_proto::types::{IndexUid, NodeId, ShardId, SourceId, SourceUid};\n\n    use super::*;\n    use crate::RateMibPerSec;\n    use crate::ingest_v2::models::IngesterShard;\n    use crate::ingest_v2::state::IngesterState;\n\n    #[test]\n    fn test_shard_info_serde() {\n        let shard_info = ShardInfo {\n            shard_id: ShardId::from(1),\n            shard_state: ShardState::Open,\n            short_term_ingestion_rate: RateMibPerSec(42),\n            long_term_ingestion_rate: RateMibPerSec(40),\n        };\n        let serialized = serde_json::to_string(&shard_info).unwrap();\n        assert_eq!(serialized, r#\"\"00000000000000000001:open:42:40\"\"#);\n\n        let deserialized = serde_json::from_str::<ShardInfo>(&serialized).unwrap();\n        assert_eq!(deserialized, shard_info);\n    }\n\n    #[test]\n 
   fn test_local_shards_snapshot_diff() {\n        let previous_snapshot = LocalShardsSnapshot::default();\n        let current_snapshot = LocalShardsSnapshot::default();\n        let num_changes = previous_snapshot.diff(&current_snapshot).count();\n        assert_eq!(num_changes, 0);\n\n        let previous_snapshot = LocalShardsSnapshot::default();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let current_snapshot = LocalShardsSnapshot {\n            per_source_shard_infos: vec![(\n                SourceUid {\n                    index_uid: index_uid.clone(),\n                    source_id: SourceId::from(\"test-source\"),\n                },\n                vec![ShardInfo {\n                    shard_id: ShardId::from(1),\n                    shard_state: ShardState::Open,\n                    short_term_ingestion_rate: RateMibPerSec(42),\n                    long_term_ingestion_rate: RateMibPerSec(42),\n                }]\n                .into_iter()\n                .collect(),\n            )]\n            .into_iter()\n            .collect(),\n        };\n        let changes = previous_snapshot\n            .diff(&current_snapshot)\n            .collect::<Vec<_>>();\n        assert_eq!(changes.len(), 1);\n\n        let ShardInfosChange::Updated {\n            source_uid,\n            shard_infos,\n        } = &changes[0]\n        else {\n            panic!(\n                \"expected `ShardInfosChange::Updated` variant, got {:?}\",\n                changes[0]\n            );\n        };\n        assert_eq!(source_uid.index_uid, index_uid);\n        assert_eq!(source_uid.source_id, \"test-source\");\n        assert_eq!(shard_infos.len(), 1);\n\n        let num_changes = current_snapshot.diff(&current_snapshot).count();\n        assert_eq!(num_changes, 0);\n\n        let previous_snapshot = current_snapshot;\n        let current_snapshot = LocalShardsSnapshot {\n            per_source_shard_infos: vec![(\n                SourceUid 
{\n                    index_uid: index_uid.clone(),\n                    source_id: SourceId::from(\"test-source\"),\n                },\n                vec![ShardInfo {\n                    shard_id: ShardId::from(1),\n                    shard_state: ShardState::Closed,\n                    short_term_ingestion_rate: RateMibPerSec(42),\n                    long_term_ingestion_rate: RateMibPerSec(42),\n                }]\n                .into_iter()\n                .collect(),\n            )]\n            .into_iter()\n            .collect(),\n        };\n        let changes = previous_snapshot\n            .diff(&current_snapshot)\n            .collect::<Vec<_>>();\n        assert_eq!(changes.len(), 1);\n\n        let ShardInfosChange::Updated {\n            source_uid,\n            shard_infos,\n        } = &changes[0]\n        else {\n            panic!(\n                \"expected `ShardInfosChange::Updated` variant, got {:?}\",\n                changes[0]\n            );\n        };\n        assert_eq!(source_uid.index_uid, index_uid);\n        assert_eq!(source_uid.source_id, \"test-source\");\n        assert_eq!(shard_infos.len(), 1);\n\n        let previous_snapshot = current_snapshot;\n        let current_snapshot = LocalShardsSnapshot::default();\n\n        let changes = previous_snapshot\n            .diff(&current_snapshot)\n            .collect::<Vec<_>>();\n        assert_eq!(changes.len(), 1);\n\n        let ShardInfosChange::Removed { source_uid } = &changes[0] else {\n            panic!(\n                \"expected `ShardInfosChange::Removed` variant, got {:?}\",\n                changes[0]\n            );\n        };\n        assert_eq!(source_uid.index_uid, index_uid);\n        assert_eq!(source_uid.source_id, \"test-source\");\n    }\n\n    #[tokio::test]\n    async fn test_broadcast_local_shards_task() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], 
&transport, true)\n            .await\n            .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster.clone()).await;\n        let weak_state = state.weak();\n        let mut task = BroadcastLocalShardsTask {\n            cluster,\n            weak_state,\n            shard_throughput_time_series_map: Default::default(),\n        };\n        let previous_snapshot = task.snapshot_local_shards().await.unwrap();\n        assert!(previous_snapshot.per_source_shard_infos.is_empty());\n\n        let mut state_guard = state.lock_partially().await.unwrap();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let shard_00 = IngesterShard::new_solo(\n            index_uid.clone(),\n            SourceId::from(\"test-source\"),\n            ShardId::from(0),\n        )\n        .build();\n        state_guard.shards.insert(shard_00.queue_id(), shard_00);\n\n        let shard_01 = IngesterShard::new_solo(\n            index_uid.clone(),\n            SourceId::from(\"test-source\"),\n            ShardId::from(1),\n        )\n        .advertisable()\n        .build();\n        state_guard.shards.insert(shard_01.queue_id(), shard_01);\n\n        let shard_02 = IngesterShard::new_replica(\n            index_uid.clone(),\n            SourceId::from(\"test-source\"),\n            ShardId::from(2),\n            NodeId::from(\"test-leader\"),\n        )\n        .advertisable()\n        .build();\n        state_guard.shards.insert(shard_02.queue_id(), shard_02);\n        drop(state_guard);\n\n        let new_snapshot = task.snapshot_local_shards().await.unwrap();\n        assert_eq!(new_snapshot.per_source_shard_infos.len(), 1);\n\n        task.broadcast_local_shards(&previous_snapshot, &new_snapshot)\n            .await;\n\n        tokio::time::sleep(Duration::from_millis(100)).await;\n\n        let key = format!(\n            \"{INGESTER_PRIMARY_SHARDS_PREFIX}{}:{}\",\n            index_uid, \"test-source\"\n        );\n        
task.cluster.get_self_key_value(&key).await.unwrap();\n\n        task.broadcast_local_shards(&new_snapshot, &previous_snapshot)\n            .await;\n\n        tokio::time::sleep(Duration::from_millis(100)).await;\n\n        let value_opt = task.cluster.get_self_key_value(&key).await;\n        assert!(value_opt.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_local_shards_update_listener() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[\"indexer\"], &transport, true)\n            .await\n            .unwrap();\n        let event_broker = EventBroker::default();\n\n        let local_shards_update_counter = Arc::new(AtomicUsize::new(0));\n        let local_shards_update_counter_clone = local_shards_update_counter.clone();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n\n        let index_uid_clone = index_uid.clone();\n        event_broker\n            .subscribe(move |event: LocalShardsUpdate| {\n                local_shards_update_counter_clone.fetch_add(1, Ordering::Release);\n\n                assert_eq!(event.source_uid.index_uid, index_uid_clone);\n                assert_eq!(event.source_uid.source_id, \"test-source\");\n                assert_eq!(event.shard_infos.len(), 1);\n\n                let shard_info = event.shard_infos.iter().next().unwrap();\n                assert_eq!(shard_info.shard_id, ShardId::from(1));\n                assert_eq!(shard_info.shard_state, ShardState::Open);\n                assert_eq!(shard_info.short_term_ingestion_rate, 42u16);\n            })\n            .forever();\n\n        setup_local_shards_update_listener(cluster.clone(), event_broker.clone())\n            .await\n            .forever();\n\n        let source_uid = SourceUid {\n            index_uid: index_uid.clone(),\n            source_id: SourceId::from(\"test-source\"),\n        };\n        let key = make_key(INGESTER_PRIMARY_SHARDS_PREFIX, &source_uid);\n        let 
value = serde_json::to_string(&vec![ShardInfo {\n            shard_id: ShardId::from(1),\n            shard_state: ShardState::Open,\n            short_term_ingestion_rate: RateMibPerSec(42),\n            long_term_ingestion_rate: RateMibPerSec(42),\n        }])\n        .unwrap();\n\n        cluster.set_self_key_value(key, value).await;\n        tokio::time::sleep(Duration::from_millis(50)).await;\n\n        assert_eq!(local_shards_update_counter.load(Ordering::Acquire), 1);\n    }\n\n    #[test]\n    fn test_shard_throughput_time_series() {\n        let mut time_series = ShardThroughputTimeSeries::default();\n        assert_eq!(time_series.last(), ByteSize::mb(0));\n        assert_eq!(time_series.average(), ByteSize::mb(0));\n\n        time_series.record(ByteSize::mb(2));\n        assert_eq!(time_series.last(), ByteSize::mb(2));\n        assert_eq!(time_series.average(), ByteSize::mb(2));\n\n        time_series.record(ByteSize::mb(1));\n        assert_eq!(time_series.last(), ByteSize::mb(1));\n        assert_eq!(time_series.average(), ByteSize::kb(1500));\n\n        time_series.record(ByteSize::mb(3));\n        assert_eq!(time_series.last(), ByteSize::mb(3));\n        assert_eq!(time_series.average(), ByteSize::mb(2));\n\n        for _ in 0..SHARD_THROUGHPUT_LONG_TERM_WINDOW_LEN {\n            time_series.record(ByteSize::mb(4));\n            assert_eq!(time_series.last(), ByteSize::mb(4));\n        }\n        assert_eq!(time_series.last(), ByteSize::mb(4));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/broadcast/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#[allow(dead_code)]\nmod capacity_score;\nmod local_shards;\n\nuse std::time::Duration;\n\nuse quickwit_proto::types::SourceUid;\n\npub(in crate::ingest_v2) const BROADCAST_INTERVAL_PERIOD: Duration = if cfg!(test) {\n    Duration::from_millis(50)\n} else {\n    Duration::from_secs(5)\n};\n\npub use capacity_score::{\n    BroadcastIngesterCapacityScoreTask, IngesterCapacityScoreUpdate,\n    setup_ingester_capacity_update_listener,\n};\npub use local_shards::{\n    BroadcastLocalShardsTask, LocalShardsUpdate, ShardInfo, ShardInfos,\n    setup_local_shards_update_listener,\n};\n\nfn make_key(prefix: &str, source_uid: &SourceUid) -> String {\n    format!(\"{prefix}{}:{}\", source_uid.index_uid, source_uid.source_id)\n}\n\nfn parse_key(key: &str) -> Option<SourceUid> {\n    let (index_uid_str, source_id_str) = key.rsplit_once(':')?;\n    Some(SourceUid {\n        index_uid: index_uid_str.parse().ok()?,\n        source_id: source_id_str.to_string(),\n    })\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_common::shared_consts::INGESTER_PRIMARY_SHARDS_PREFIX;\n    use quickwit_proto::types::{IndexUid, SourceId, SourceUid};\n\n    use super::*;\n\n    #[test]\n    fn test_make_key() {\n        let source_uid = SourceUid {\n            index_uid: IndexUid::for_test(\"test-index\", 0),\n            source_id: SourceId::from(\"test-source\"),\n        
};\n        let key = make_key(INGESTER_PRIMARY_SHARDS_PREFIX, &source_uid);\n        assert_eq!(\n            key,\n            \"ingester.primary_shards:test-index:00000000000000000000000000:test-source\"\n        );\n    }\n\n    #[test]\n    fn test_parse_key() {\n        let key = \"test-index:00000000000000000000000000:test-source\";\n        let source_uid = parse_key(key).unwrap();\n        assert_eq!(\n            &source_uid.index_uid.to_string(),\n            \"test-index:00000000000000000000000000\"\n        );\n        assert_eq!(source_uid.source_id, \"test-source\".to_string());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/debouncing.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::sync::Arc;\n\nuse quickwit_proto::control_plane::{\n    GetOrCreateOpenShardsRequest, GetOrCreateOpenShardsSubrequest,\n};\nuse quickwit_proto::ingest::ShardIds;\nuse quickwit_proto::types::{IndexId, SourceId};\nuse tokio::sync::{OwnedRwLockWriteGuard, RwLock};\n\n#[derive(Default)]\nstruct Debouncer(Arc<RwLock<()>>);\n\nimpl Debouncer {\n    fn acquire(&self) -> Result<PermitGuard, BarrierGuard> {\n        if let Ok(permit) = self.0.clone().try_write_owned() {\n            Ok(PermitGuard(permit))\n        } else {\n            let barrier = self.0.clone();\n            Err(BarrierGuard(barrier))\n        }\n    }\n}\n\n#[derive(Debug)]\npub(super) struct PermitGuard(#[allow(dead_code)] OwnedRwLockWriteGuard<()>);\n\n#[derive(Debug)]\npub(super) struct BarrierGuard(Arc<RwLock<()>>);\n\nimpl BarrierGuard {\n    pub async fn wait(self) {\n        let _ = self.0.read().await;\n    }\n}\n\n/// Debounces [`GetOrCreateOpenShardsRequest`] requests by index and source IDs. 
It gives away a\n/// permit to the first request and a barrier to subsequent requests.\n#[derive(Default)]\npub(super) struct GetOrCreateOpenShardsRequestDebouncer {\n    debouncers: HashMap<(IndexId, SourceId), Debouncer>,\n}\n\nimpl GetOrCreateOpenShardsRequestDebouncer {\n    pub fn acquire(\n        &mut self,\n        index_id: &str,\n        source_id: &str,\n    ) -> Result<PermitGuard, BarrierGuard> {\n        let key = (index_id.to_string(), source_id.to_string());\n        self.debouncers.entry(key).or_default().acquire()\n    }\n}\n\n#[derive(Default)]\npub(super) struct DebouncedGetOrCreateOpenShardsRequest {\n    subrequests: Vec<GetOrCreateOpenShardsSubrequest>,\n    pub closed_shards: Vec<ShardIds>,\n    pub unavailable_leaders: Vec<String>,\n    rendezvous: Rendezvous,\n}\n\nimpl DebouncedGetOrCreateOpenShardsRequest {\n    pub fn is_empty(&self) -> bool {\n        self.subrequests.is_empty()\n    }\n\n    pub fn take(self) -> (Option<GetOrCreateOpenShardsRequest>, Rendezvous) {\n        if self.is_empty() {\n            return (None, self.rendezvous);\n        }\n        let request = GetOrCreateOpenShardsRequest {\n            subrequests: self.subrequests,\n            closed_shards: self.closed_shards,\n            unavailable_leaders: self.unavailable_leaders,\n        };\n        (Some(request), self.rendezvous)\n    }\n\n    pub fn push_subrequest(\n        &mut self,\n        subrequest: GetOrCreateOpenShardsSubrequest,\n        permit: PermitGuard,\n    ) {\n        self.subrequests.push(subrequest);\n        self.rendezvous.permits.push(permit);\n    }\n\n    pub fn push_barrier(&mut self, barrier: BarrierGuard) {\n        self.rendezvous.barriers.push(barrier);\n    }\n}\n\n#[derive(Default)]\npub(super) struct Rendezvous {\n    permits: Vec<PermitGuard>,\n    barriers: Vec<BarrierGuard>,\n}\n\nimpl Rendezvous {\n    /// Releases the permits and waits for the barriers to be lifted.\n    pub async fn wait(mut self) {\n        // Releasing 
the permits before waiting for the barriers is necessary to avoid\n        // deadlocks.\n        self.permits.clear();\n\n        for barrier in self.barriers {\n            barrier.wait().await;\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::atomic::{AtomicUsize, Ordering};\n    use std::time::Duration;\n\n    use super::*;\n\n    impl Rendezvous {\n        pub fn is_empty(&self) -> bool {\n            self.permits.is_empty() && self.barriers.is_empty()\n        }\n\n        pub fn num_permits(&self) -> usize {\n            self.permits.len()\n        }\n\n        pub fn num_barriers(&self) -> usize {\n            self.barriers.len()\n        }\n    }\n\n    #[tokio::test]\n    async fn test_debouncer() {\n        let debouncer = Debouncer::default();\n\n        let permit = debouncer.acquire().unwrap();\n        let barrier = debouncer.acquire().unwrap_err();\n        drop(permit);\n        barrier.wait().await;\n\n        let permit = debouncer.acquire().unwrap();\n        let barrier = debouncer.acquire().unwrap_err();\n        let flag = Arc::new(AtomicUsize::new(0));\n\n        let flag_clone = flag.clone();\n        tokio::spawn(async move {\n            tokio::time::sleep(Duration::from_millis(100)).await;\n            flag_clone.store(1, Ordering::Release);\n            drop(permit);\n        });\n        let flag_clone = flag.clone();\n        tokio::spawn(async move {\n            let _ = barrier.wait().await;\n            flag_clone.store(2, Ordering::Release);\n        });\n        tokio::time::sleep(Duration::from_millis(200)).await;\n        assert_eq!(flag.load(Ordering::Acquire), 2);\n    }\n\n    #[test]\n    fn test_get_or_create_open_shards_request_debouncer() {\n        let mut debouncer = GetOrCreateOpenShardsRequestDebouncer::default();\n\n        let _permit_foo: PermitGuard = debouncer.acquire(\"test-index\", \"test-source-foo\").unwrap();\n\n        let _barrier = debouncer\n            .acquire(\"test-index\", 
\"test-source-foo\")\n            .unwrap_err();\n\n        let _permit_bar: PermitGuard = debouncer.acquire(\"test-index\", \"test-source-bar\").unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_debounced_get_or_create_open_shards_request() {\n        let debounced_request = DebouncedGetOrCreateOpenShardsRequest::default();\n        assert!(debounced_request.is_empty());\n\n        let (request_opt, rendezvous) = debounced_request.take();\n        assert!(request_opt.is_none());\n        assert!(rendezvous.is_empty());\n\n        let mut debouncer = GetOrCreateOpenShardsRequestDebouncer::default();\n        let mut debounced_request = DebouncedGetOrCreateOpenShardsRequest::default();\n\n        let permit = debouncer.acquire(\"test-index\", \"test-source-foo\").unwrap();\n        debounced_request.push_subrequest(\n            GetOrCreateOpenShardsSubrequest {\n                index_id: \"test-index\".to_string(),\n                source_id: \"test-source-foo\".to_string(),\n                ..Default::default()\n            },\n            permit,\n        );\n\n        let barrier = debouncer\n            .acquire(\"test-index\", \"test-source-foo\")\n            .unwrap_err();\n        debounced_request.push_barrier(barrier);\n\n        let (request_opt, rendezvous) = debounced_request.take();\n        let request = request_opt.unwrap();\n\n        assert_eq!(request.subrequests.len(), 1);\n        assert_eq!(rendezvous.num_permits(), 1);\n        assert_eq!(rendezvous.num_barriers(), 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/doc_mapper.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::hash_map::Entry;\nuse std::collections::{HashMap, HashSet};\nuse std::sync::{Arc, Weak};\n\nuse once_cell::sync::OnceCell;\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::thread_pool::run_cpu_intensive;\nuse quickwit_config::{DocMapping, SearchSettings, build_doc_mapper};\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::ingest::{\n    DocBatchV2, IngestV2Error, IngestV2Result, ParseFailure, ParseFailureReason,\n};\nuse quickwit_proto::types::{DocMappingUid, DocUid};\nuse serde_json_borrow::Value as JsonValue;\nuse tracing::info;\n\nuse crate::DocBatchV2Builder;\n\n/// Attempts to get the doc mapper identified by the given doc mapping UID `doc_mapping_uid` from\n/// the `doc_mappers` cache. 
If it is not found, it is built from the specified JSON doc mapping\n/// `doc_mapping_json` and inserted into the cache before being returned.\npub(super) fn get_or_try_build_doc_mapper(\n    doc_mappers: &mut HashMap<DocMappingUid, Weak<DocMapper>>,\n    doc_mapping_uid: DocMappingUid,\n    doc_mapping_json: &str,\n) -> IngestV2Result<Arc<DocMapper>> {\n    if let Entry::Occupied(occupied) = doc_mappers.entry(doc_mapping_uid) {\n        if let Some(doc_mapper) = occupied.get().upgrade() {\n            return Ok(doc_mapper);\n        }\n        occupied.remove();\n    }\n    let doc_mapper = try_build_doc_mapper(doc_mapping_json)?;\n\n    if doc_mapper.doc_mapping_uid() != doc_mapping_uid {\n        let message = format!(\n            \"doc mapping UID mismatch: expected `{doc_mapping_uid}`, got `{}`\",\n            doc_mapper.doc_mapping_uid()\n        );\n        return Err(IngestV2Error::Internal(message));\n    }\n    doc_mappers.insert(doc_mapping_uid, Arc::downgrade(&doc_mapper));\n    info!(\"inserted doc mapper `{doc_mapping_uid}` into cache\");\n\n    Ok(doc_mapper)\n}\n\n/// Attempts to build a doc mapper from the specified JSON doc mapping `doc_mapping_json`.\npub(super) fn try_build_doc_mapper(doc_mapping_json: &str) -> IngestV2Result<Arc<DocMapper>> {\n    let doc_mapping: DocMapping = serde_json::from_str(doc_mapping_json).map_err(|error| {\n        IngestV2Error::Internal(format!(\"failed to parse doc mapping: {error}\"))\n    })?;\n    let search_settings = SearchSettings::default();\n    let doc_mapper = build_doc_mapper(&doc_mapping, &search_settings)\n        .map_err(|error| IngestV2Error::Internal(format!(\"failed to build doc mapper: {error}\")))?;\n    Ok(doc_mapper)\n}\n\nfn validate_document(\n    doc_mapper: &DocMapper,\n    doc_bytes: &[u8],\n) -> Result<(), (ParseFailureReason, String)> {\n    let Ok(json_doc) = serde_json::from_slice::<serde_json_borrow::Value>(doc_bytes) else {\n        return Err((\n            
ParseFailureReason::InvalidJson,\n            \"failed to parse JSON document\".to_string(),\n        ));\n    };\n    let JsonValue::Object(json_obj) = json_doc else {\n        return Err((\n            ParseFailureReason::InvalidJson,\n            \"JSON document is not an object\".to_string(),\n        ));\n    };\n    if let Err(error) = doc_mapper.validate_json_obj(&json_obj) {\n        rate_limited_error!(\n            limit_per_min = 6,\n            \"failed to validate JSON document: {}\",\n            error\n        );\n        return Err((ParseFailureReason::InvalidSchema, error.to_string()));\n    }\n    Ok(())\n}\n\n/// Validates a batch of docs.\n///\n/// Returns a batch of valid docs and the list of errors.\nfn validate_doc_batch_impl(\n    doc_batch: DocBatchV2,\n    doc_mapper: &DocMapper,\n) -> (DocBatchV2, Vec<ParseFailure>) {\n    let mut parse_failures: Vec<ParseFailure> = Vec::new();\n    let mut invalid_doc_ids: HashSet<DocUid> = HashSet::default();\n    for (doc_uid, doc_bytes) in doc_batch.docs() {\n        if let Err((reason, message)) = validate_document(doc_mapper, &doc_bytes) {\n            let parse_failure = ParseFailure {\n                doc_uid: Some(doc_uid),\n                reason: reason as i32,\n                message,\n            };\n            invalid_doc_ids.insert(doc_uid);\n            parse_failures.push(parse_failure);\n        }\n    }\n    if invalid_doc_ids.is_empty() {\n        // All docs are valid! 
We don't need to build a valid doc batch.\n        return (doc_batch, parse_failures);\n    }\n    let mut valid_doc_batch_builder = DocBatchV2Builder::default();\n    for (doc_uid, doc_bytes) in doc_batch.docs() {\n        if !invalid_doc_ids.contains(&doc_uid) {\n            valid_doc_batch_builder.add_doc(doc_uid, &doc_bytes);\n        }\n    }\n    let valid_doc_batch: DocBatchV2 = valid_doc_batch_builder.build().unwrap_or_default();\n    assert_eq!(\n        valid_doc_batch.num_docs() + parse_failures.len(),\n        doc_batch.num_docs()\n    );\n    (valid_doc_batch, parse_failures)\n}\n\nfn is_document_validation_enabled() -> bool {\n    static IS_DOCUMENT_VALIDATION_ENABLED: OnceCell<bool> = OnceCell::new();\n    *IS_DOCUMENT_VALIDATION_ENABLED.get_or_init(|| {\n        !quickwit_common::get_bool_from_env(\"QW_DISABLE_DOCUMENT_VALIDATION\", false)\n    })\n}\n\n/// Parses the JSON documents contained in the batch and applies the doc mapper. Returns the\n/// batch of valid docs and a list of parse failures. If document validation is disabled, the\n/// original batch is returned as is with no parse failures.\npub(super) async fn validate_doc_batch(\n    doc_batch: DocBatchV2,\n    doc_mapper: Arc<DocMapper>,\n) -> IngestV2Result<(DocBatchV2, Vec<ParseFailure>)> {\n    if is_document_validation_enabled() {\n        run_cpu_intensive(move || validate_doc_batch_impl(doc_batch, &doc_mapper))\n            .await\n            .map_err(|error| {\n                let message = format!(\"failed to validate documents: {error}\");\n                IngestV2Error::Internal(message)\n            })\n    } else {\n        Ok((doc_batch, Vec::new()))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::types::DocUid;\n\n    use super::*;\n\n    #[test]\n    fn test_get_or_try_build_doc_mapper() {\n        let mut doc_mappers: HashMap<DocMappingUid, Weak<DocMapper>> = HashMap::new();\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = r#\"{\n            \"field_mappings\": [{\n                \"name\": \"message\",\n                
\"type\": \"text\"\n            }]\n        }\"#;\n        let error =\n            get_or_try_build_doc_mapper(&mut doc_mappers, doc_mapping_uid, doc_mapping_json)\n                .unwrap_err();\n        assert!(\n            matches!(error, IngestV2Error::Internal(message) if message.contains(\"doc mapping UID mismatch\"))\n        );\n\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"field_mappings\": [{{\n                        \"name\": \"message\",\n                        \"type\": \"text\"\n                }}]\n            }}\"#\n        );\n        let doc_mapper =\n            get_or_try_build_doc_mapper(&mut doc_mappers, doc_mapping_uid, &doc_mapping_json)\n                .unwrap();\n        assert_eq!(doc_mappers.len(), 1);\n        assert_eq!(doc_mapper.doc_mapping_uid(), doc_mapping_uid);\n        assert_eq!(Arc::strong_count(&doc_mapper), 1);\n\n        drop(doc_mapper);\n        assert!(\n            doc_mappers\n                .get(&doc_mapping_uid)\n                .unwrap()\n                .upgrade()\n                .is_none()\n        );\n\n        let error = get_or_try_build_doc_mapper(&mut doc_mappers, doc_mapping_uid, \"\").unwrap_err();\n        assert!(\n            matches!(error, IngestV2Error::Internal(message) if message.contains(\"parse doc mapping\"))\n        );\n        assert_eq!(doc_mappers.len(), 0);\n    }\n\n    #[test]\n    fn test_try_build_doc_mapper() {\n        let error = try_build_doc_mapper(\"\").unwrap_err();\n        assert!(\n            matches!(error, IngestV2Error::Internal(message) if message.contains(\"parse doc mapping\"))\n        );\n\n        let error = try_build_doc_mapper(r#\"{\"timestamp_field\": \".timestamp\"}\"#).unwrap_err();\n        assert!(\n            matches!(error, IngestV2Error::Internal(message) if message.contains(\"build doc mapper\"))\n        );\n\n        let doc_mapping_json = r#\"{\n     
       \"mode\": \"strict\",\n            \"field_mappings\": [{\n                \"name\": \"message\",\n                \"type\": \"text\"\n        }]}\"#;\n        let doc_mapper = try_build_doc_mapper(doc_mapping_json).unwrap();\n        let schema = doc_mapper.schema();\n        assert_eq!(schema.num_fields(), 2);\n\n        let contains_message_field = schema\n            .fields()\n            .map(|(_field, entry)| entry.name())\n            .any(|field_name| field_name == \"message\");\n        assert!(contains_message_field);\n    }\n\n    #[test]\n    fn test_validate_doc_batch() {\n        let doc_mapping_json = r#\"{\n            \"mode\": \"strict\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"doc\",\n                    \"type\": \"text\"\n                }\n            ]\n        }\"#;\n        let doc_mapper = try_build_doc_mapper(doc_mapping_json).unwrap();\n        let doc_batch = DocBatchV2::default();\n\n        let (_, parse_failures) = validate_doc_batch_impl(doc_batch, &doc_mapper);\n        assert_eq!(parse_failures.len(), 0);\n\n        let doc_batch =\n            DocBatchV2::for_test([\"\", \"[]\", r#\"{\"foo\": \"bar\"}\"#, r#\"{\"doc\": \"test-doc-000\"}\"#]);\n        let (doc_batch, parse_failures) = validate_doc_batch_impl(doc_batch, &doc_mapper);\n        assert_eq!(parse_failures.len(), 3);\n\n        let parse_failure_0 = &parse_failures[0];\n        assert_eq!(parse_failure_0.doc_uid(), DocUid::for_test(0));\n        assert_eq!(parse_failure_0.reason(), ParseFailureReason::InvalidJson);\n        assert!(parse_failure_0.message.contains(\"parse JSON document\"));\n\n        let parse_failure_1 = &parse_failures[1];\n        assert_eq!(parse_failure_1.doc_uid(), DocUid::for_test(1));\n        assert_eq!(parse_failure_1.reason(), ParseFailureReason::InvalidJson);\n        assert!(parse_failure_1.message.contains(\"not an object\"));\n\n        let parse_failure_2 = &parse_failures[2];\n    
    assert_eq!(parse_failure_2.doc_uid(), DocUid::for_test(2));\n        assert_eq!(parse_failure_2.reason(), ParseFailureReason::InvalidSchema);\n        assert!(parse_failure_2.message.contains(\"not declared\"));\n\n        assert_eq!(doc_batch.num_docs(), 1);\n        assert_eq!(doc_batch.doc_uids[0], DocUid::for_test(3));\n        let (valid_doc_uid, valid_doc_bytes) = doc_batch.docs().next().unwrap();\n        assert_eq!(valid_doc_uid, DocUid::for_test(3));\n        assert_eq!(&valid_doc_bytes, r#\"{\"doc\": \"test-doc-000\"}\"#.as_bytes());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/fetch.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Borrow;\nuse std::collections::HashMap;\nuse std::collections::hash_map::Entry;\nuse std::fmt;\nuse std::sync::Arc;\n\nuse bytes::{BufMut, BytesMut};\nuse bytesize::ByteSize;\nuse futures::StreamExt;\nuse mrecordlog::Record;\nuse quickwit_common::metrics::MEMORY_METRICS;\nuse quickwit_common::retry::RetryParams;\nuse quickwit_common::stream_utils::{InFlightValue, TrackedSender};\nuse quickwit_common::{ServiceStream, spawn_named_task};\nuse quickwit_proto::ingest::ingester::{\n    FetchEof, FetchMessage, FetchPayload, IngesterService, OpenFetchStreamRequest, fetch_message,\n};\nuse quickwit_proto::ingest::{IngestV2Error, IngestV2Result, MRecordBatch};\nuse quickwit_proto::types::{IndexUid, NodeId, Position, QueueId, ShardId, SourceId, queue_id};\nuse tokio::sync::{RwLock, mpsc, watch};\nuse tokio::task::JoinHandle;\nuse tracing::{debug, error, warn};\n\nuse super::models::ShardStatus;\nuse crate::mrecordlog_async::MultiRecordLogAsync;\nuse crate::{ClientId, IngesterPool, with_lock_metrics};\n\n/// A fetch stream task is responsible for waiting and pushing new records written to a shard's\n/// record log into a channel named `fetch_message_tx`.\npub(super) struct FetchStreamTask {\n    /// Uniquely identifies the consumer of the fetch task for logging and debugging purposes.\n    client_id: ClientId,\n    index_uid: IndexUid,\n   
 source_id: SourceId,\n    shard_id: ShardId,\n    queue_id: QueueId,\n    /// The position of the next record fetched.\n    from_position_inclusive: u64,\n    mrecordlog: Arc<RwLock<Option<MultiRecordLogAsync>>>,\n    fetch_message_tx: TrackedSender<IngestV2Result<FetchMessage>>,\n    /// This channel notifies the fetch task when new records are available. This way the fetch\n    /// task does not need to grab the lock and poll the mrecordlog queue unnecessarily.\n    shard_status_rx: watch::Receiver<ShardStatus>,\n    batch_num_bytes: usize,\n}\n\nimpl fmt::Debug for FetchStreamTask {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"FetchStreamTask\")\n            .field(\"client_id\", &self.client_id)\n            .field(\"index_uid\", &self.index_uid)\n            .field(\"source_id\", &self.source_id)\n            .field(\"shard_id\", &self.shard_id)\n            .finish()\n    }\n}\n\nimpl FetchStreamTask {\n    pub fn spawn(\n        open_fetch_stream_request: OpenFetchStreamRequest,\n        mrecordlog: Arc<RwLock<Option<MultiRecordLogAsync>>>,\n        shard_status_rx: watch::Receiver<ShardStatus>,\n        batch_num_bytes: usize,\n    ) -> (ServiceStream<IngestV2Result<FetchMessage>>, JoinHandle<()>) {\n        let from_position_inclusive = open_fetch_stream_request\n            .from_position_exclusive()\n            .as_u64()\n            .map(|offset| offset + 1)\n            .unwrap_or_default();\n        let (fetch_message_tx, fetch_stream) =\n            ServiceStream::new_bounded_with_gauge(3, &MEMORY_METRICS.in_flight.fetch_stream);\n        let mut fetch_task = Self {\n            shard_id: open_fetch_stream_request.shard_id().clone(),\n            queue_id: open_fetch_stream_request.queue_id(),\n            index_uid: open_fetch_stream_request.index_uid().clone(),\n            client_id: open_fetch_stream_request.client_id,\n            source_id: open_fetch_stream_request.source_id,\n            
from_position_inclusive,\n            mrecordlog,\n            fetch_message_tx,\n            shard_status_rx,\n            batch_num_bytes,\n        };\n        let future = async move { fetch_task.run().await };\n        let fetch_task_handle: JoinHandle<()> = spawn_named_task(future, \"fetch_task\");\n        (fetch_stream, fetch_task_handle)\n    }\n\n    /// Runs the fetch task. It waits for new records in the log and pushes them into the fetch\n    /// response channel until it reaches the end of the shard marked by an EOF record.\n    async fn run(&mut self) {\n        debug!(\n            client_id=%self.client_id,\n            index_uid=%self.index_uid,\n            source_id=%self.source_id,\n            shard_id=%self.shard_id,\n            from_position_inclusive=%self.from_position_inclusive,\n            \"spawning fetch task\"\n        );\n        let mut has_drained_queue = false;\n        let mut to_position_inclusive = if self.from_position_inclusive == 0 {\n            Position::Beginning\n        } else {\n            Position::offset(self.from_position_inclusive - 1)\n        };\n\n        loop {\n            if has_drained_queue && self.shard_status_rx.changed().await.is_err() {\n                // The shard was dropped.\n                break;\n            }\n            has_drained_queue = true;\n\n            let mut mrecord_buffer = BytesMut::with_capacity(self.batch_num_bytes);\n            let mut mrecord_lengths = Vec::new();\n\n            let mrecordlog_guard =\n                with_lock_metrics!(self.mrecordlog.read().await, \"fetch\", \"read\");\n\n            let Ok(mrecords) = mrecordlog_guard\n                .as_ref()\n                .expect(\"mrecordlog should be initialized\")\n                .range(&self.queue_id, self.from_position_inclusive..)\n            else {\n                // The queue was dropped.\n                break;\n            };\n            for Record { payload, .. 
} in mrecords {\n                // Accept at least one message\n                if !mrecord_buffer.is_empty()\n                    && (mrecord_buffer.len() + payload.len() > mrecord_buffer.capacity())\n                {\n                    has_drained_queue = false;\n                    break;\n                }\n                mrecord_buffer.put(payload.borrow());\n                mrecord_lengths.push(payload.len() as u32);\n            }\n            // Drop the lock while we send the message.\n            drop(mrecordlog_guard);\n\n            if !mrecord_lengths.is_empty() {\n                let from_position_exclusive = if self.from_position_inclusive == 0 {\n                    Position::Beginning\n                } else {\n                    Position::offset(self.from_position_inclusive - 1)\n                };\n                self.from_position_inclusive += mrecord_lengths.len() as u64;\n\n                to_position_inclusive = Position::offset(self.from_position_inclusive - 1);\n\n                let mrecord_batch = MRecordBatch {\n                    mrecord_buffer: mrecord_buffer.freeze(),\n                    mrecord_lengths,\n                };\n                let batch_size = mrecord_batch.estimate_size();\n                let fetch_payload = FetchPayload {\n                    index_uid: Some(self.index_uid.clone()),\n                    source_id: self.source_id.clone(),\n                    shard_id: Some(self.shard_id.clone()),\n                    mrecord_batch: Some(mrecord_batch),\n                    from_position_exclusive: Some(from_position_exclusive),\n                    to_position_inclusive: Some(to_position_inclusive.clone()),\n                };\n                let fetch_message = FetchMessage::new_payload(fetch_payload);\n\n                if self\n                    .fetch_message_tx\n                    .send(Ok(fetch_message), batch_size)\n                    .await\n                    .is_err()\n                {\n      
              // The consumer was dropped.\n                    return;\n                }\n            }\n            if has_drained_queue {\n                let has_reached_eof = {\n                    let shard_status = self.shard_status_rx.borrow();\n                    let shard_state = &shard_status.0;\n                    let replication_position = &shard_status.1;\n                    shard_state.is_closed() && to_position_inclusive >= *replication_position\n                };\n                if has_reached_eof {\n                    debug!(\n                        client_id=%self.client_id,\n                        index_uid=%self.index_uid,\n                        source_id=%self.source_id,\n                        shard_id=%self.shard_id,\n                        to_position_inclusive=%self.from_position_inclusive - 1,\n                        \"fetch stream reached end of shard\"\n                    );\n                    let eof_position = to_position_inclusive.as_eof();\n\n                    let fetch_eof = FetchEof {\n                        index_uid: Some(self.index_uid.clone()),\n                        source_id: self.source_id.clone(),\n                        shard_id: Some(self.shard_id.clone()),\n                        eof_position: Some(eof_position),\n                    };\n                    let fetch_message = FetchMessage::new_eof(fetch_eof);\n                    let _ = self\n                        .fetch_message_tx\n                        .send(Ok(fetch_message), ByteSize(0))\n                        .await;\n                    return;\n                }\n            }\n        }\n        if !to_position_inclusive.is_eof() {\n            // This can happen if we delete the associated source or index.\n            warn!(\n                client_id=%self.client_id,\n                index_uid=%self.index_uid,\n                source_id=%self.source_id,\n                shard_id=%self.shard_id,\n                \"fetch stream 
ended before reaching end of shard\"\n            );\n            let _ = self\n                .fetch_message_tx\n                .send(\n                    Err(IngestV2Error::Internal(\n                        \"fetch stream ended before reaching end of shard\".to_string(),\n                    )),\n                    ByteSize(0),\n                )\n                .await;\n        }\n    }\n}\n\n#[derive(Debug)]\npub struct FetchStreamError {\n    pub index_uid: IndexUid,\n    pub source_id: SourceId,\n    pub shard_id: ShardId,\n    pub ingest_error: IngestV2Error,\n}\n\n/// Combines multiple fetch streams originating from different ingesters into a single stream. It\n/// tolerates the failure of ingesters and automatically fails over to replica shards.\npub struct MultiFetchStream {\n    self_node_id: NodeId,\n    client_id: ClientId,\n    ingester_pool: IngesterPool,\n    retry_params: RetryParams,\n    fetch_task_handles: HashMap<QueueId, JoinHandle<()>>,\n    fetch_message_rx: mpsc::Receiver<Result<InFlightValue<FetchMessage>, FetchStreamError>>,\n    fetch_message_tx: mpsc::Sender<Result<InFlightValue<FetchMessage>, FetchStreamError>>,\n}\n\nimpl MultiFetchStream {\n    pub fn new(\n        self_node_id: NodeId,\n        client_id: ClientId,\n        ingester_pool: IngesterPool,\n        retry_params: RetryParams,\n    ) -> Self {\n        let (fetch_message_tx, fetch_message_rx) = mpsc::channel(3);\n        Self {\n            self_node_id,\n            client_id,\n            ingester_pool,\n            retry_params,\n            fetch_task_handles: HashMap::new(),\n            fetch_message_rx,\n            fetch_message_tx,\n        }\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn fetch_message_tx(\n        &self,\n    ) -> mpsc::Sender<Result<InFlightValue<FetchMessage>, FetchStreamError>> {\n        self.fetch_message_tx.clone()\n    }\n\n    /// Subscribes to a shard and fails over to the replica if an error occurs.\n    
#[allow(clippy::too_many_arguments)]\n    pub async fn subscribe(\n        &mut self,\n        leader_id: NodeId,\n        follower_id_opt: Option<NodeId>,\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shard_id: ShardId,\n        from_position_exclusive: Position,\n    ) -> IngestV2Result<()> {\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n        let entry = self.fetch_task_handles.entry(queue_id.clone());\n\n        if let Entry::Occupied(_) = entry {\n            return Err(IngestV2Error::Internal(format!(\n                \"stream has already subscribed to shard `{queue_id}`\"\n            )));\n        }\n        let (preferred_ingester_id, failover_ingester_id_opt) =\n            select_preferred_and_failover_ingesters(&self.self_node_id, leader_id, follower_id_opt);\n\n        let mut ingester_ids = Vec::with_capacity(1 + failover_ingester_id_opt.is_some() as usize);\n        ingester_ids.push(preferred_ingester_id);\n\n        if let Some(failover_ingester_id) = failover_ingester_id_opt {\n            ingester_ids.push(failover_ingester_id);\n        }\n        let fetch_stream_future = retrying_fetch_stream(\n            self.client_id.clone(),\n            index_uid,\n            source_id,\n            shard_id,\n            from_position_exclusive,\n            ingester_ids,\n            self.ingester_pool.clone(),\n            self.retry_params,\n            self.fetch_message_tx.clone(),\n        );\n        let fetch_task_handle = spawn_named_task(fetch_stream_future, \"fetch_stream\");\n        self.fetch_task_handles.insert(queue_id, fetch_task_handle);\n        Ok(())\n    }\n\n    pub fn unsubscribe(\n        &mut self,\n        index_uid: &IndexUid,\n        source_id: &str,\n        shard_id: ShardId,\n    ) -> IngestV2Result<()> {\n        let queue_id = queue_id(index_uid, source_id, &shard_id);\n\n        if let Some(fetch_stream_handle) = self.fetch_task_handles.remove(&queue_id) {\n            
fetch_stream_handle.abort();\n        }\n        Ok(())\n    }\n\n    /// Returns the next fetch response. This method blocks until a response is available.\n    ///\n    /// # Cancel safety\n    ///\n    /// This method is cancel safe.\n    pub async fn next(&mut self) -> Result<FetchMessage, FetchStreamError> {\n        // Because we always hold a sender and never call `close()` on the receiver, the channel is\n        // always open.\n        self.fetch_message_rx\n            .recv()\n            .await\n            .expect(\"channel should be open\")\n            .map(|value: InFlightValue<FetchMessage>| value.into_inner())\n    }\n\n    /// Resets the stream by aborting all the active fetch tasks and dropping all queued responses.\n    ///\n    /// The borrow checker guarantees that both `next()` and `reset()` cannot be called\n    /// simultaneously because they are both `&mut self` methods.\n    pub fn reset(&mut self) {\n        for (_queue_id, fetch_stream_handle) in self.fetch_task_handles.drain() {\n            fetch_stream_handle.abort();\n        }\n        let (fetch_message_tx, fetch_message_rx) = mpsc::channel(3);\n        self.fetch_message_tx = fetch_message_tx;\n        self.fetch_message_rx = fetch_message_rx;\n    }\n}\n\nimpl Drop for MultiFetchStream {\n    fn drop(&mut self) {\n        self.reset();\n    }\n}\n\n/// Chooses the ingester to stream records from, preferring \"local\" ingesters.\nfn select_preferred_and_failover_ingesters(\n    self_node_id: &NodeId,\n    leader_id: NodeId,\n    follower_id_opt: Option<NodeId>,\n) -> (NodeId, Option<NodeId>) {\n    // The replication factor is 1 and there is no follower.\n    let Some(follower_id) = follower_id_opt else {\n        return (leader_id, None);\n    };\n    if &leader_id == self_node_id {\n        (leader_id, Some(follower_id))\n    } else if &follower_id == self_node_id {\n        (follower_id, Some(leader_id))\n    } else if rand::random::<bool>() {\n        (leader_id, 
Some(follower_id))\n    } else {\n        (follower_id, Some(leader_id))\n    }\n}\n\n/// Performs multiple fault-tolerant fetch stream attempts until the stream reaches\n/// the end of the shard.\n#[allow(clippy::too_many_arguments)]\nasync fn retrying_fetch_stream(\n    client_id: String,\n    index_uid: IndexUid,\n    source_id: SourceId,\n    shard_id: ShardId,\n    mut from_position_exclusive: Position,\n    ingester_ids: Vec<NodeId>,\n    ingester_pool: IngesterPool,\n    retry_params: RetryParams,\n    fetch_message_tx: mpsc::Sender<Result<InFlightValue<FetchMessage>, FetchStreamError>>,\n) {\n    for num_attempts in 1..=retry_params.max_attempts {\n        fault_tolerant_fetch_stream(\n            client_id.clone(),\n            index_uid.clone(),\n            source_id.clone(),\n            shard_id.clone(),\n            &mut from_position_exclusive,\n            &ingester_ids,\n            ingester_pool.clone(),\n            fetch_message_tx.clone(),\n        )\n        .await;\n\n        if from_position_exclusive.is_eof() {\n            break;\n        }\n        let delay = retry_params.compute_delay(num_attempts);\n        tokio::time::sleep(delay).await;\n    }\n}\n\n/// Streams records from the preferred ingester and fails over to the other ingester if an error\n/// occurs.\n#[allow(clippy::too_many_arguments)]\nasync fn fault_tolerant_fetch_stream(\n    client_id: String,\n    index_uid: IndexUid,\n    source_id: SourceId,\n    shard_id: ShardId,\n    from_position_exclusive: &mut Position,\n    ingester_ids: &[NodeId],\n    ingester_pool: IngesterPool,\n    fetch_message_tx: mpsc::Sender<Result<InFlightValue<FetchMessage>, FetchStreamError>>,\n) {\n    // TODO: We can probably simplify this code by breaking it into smaller functions.\n    'outer: for (ingester_idx, ingester_id) in ingester_ids.iter().enumerate() {\n        let failover_ingester_id_opt = ingester_ids.get(ingester_idx + 1);\n\n        let Some(ingester) = 
ingester_pool.get(ingester_id) else {\n            if let Some(failover_ingester_id) = failover_ingester_id_opt {\n                warn!(\n                    client_id=%client_id,\n                    index_uid=%index_uid,\n                    source_id=%source_id,\n                    shard_id=%shard_id,\n                    \"ingester `{ingester_id}` is unavailable: failing over to ingester `{failover_ingester_id}`\"\n                );\n            } else {\n                error!(\n                    client_id=%client_id,\n                    index_uid=%index_uid,\n                    source_id=%source_id,\n                    shard_id=%shard_id,\n                    \"ingester `{ingester_id}` is unavailable: closing fetch stream\"\n                );\n                let message =\n                    format!(\"ingester `{ingester_id}` is unavailable: closing fetch stream\");\n                let ingest_error = IngestV2Error::Unavailable(message);\n                // Attempt to send the error to the consumer in a best-effort manner before\n                // returning.\n                let fetch_stream_error = FetchStreamError {\n                    index_uid,\n                    source_id,\n                    shard_id,\n                    ingest_error,\n                };\n                let _ = fetch_message_tx.send(Err(fetch_stream_error)).await;\n                return;\n            }\n            continue;\n        };\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(from_position_exclusive.clone()),\n        };\n        let mut fetch_stream = match ingester\n            .client\n            .open_fetch_stream(open_fetch_stream_request)\n            .await\n        {\n            Ok(fetch_stream) => 
fetch_stream,\n            Err(not_found_error @ IngestV2Error::ShardNotFound { .. }) => {\n                error!(\n                    client_id=%client_id,\n                    index_uid=%index_uid,\n                    source_id=%source_id,\n                    shard_id=%shard_id,\n                    \"failed to open fetch stream from ingester `{ingester_id}`: shard not found\"\n                );\n                let fetch_stream_error = FetchStreamError {\n                    index_uid,\n                    source_id,\n                    shard_id,\n                    ingest_error: not_found_error,\n                };\n                let _ = fetch_message_tx.send(Err(fetch_stream_error)).await;\n                from_position_exclusive.to_eof();\n                return;\n            }\n            Err(other_ingest_error) => {\n                if let Some(failover_ingester_id) = failover_ingester_id_opt {\n                    warn!(\n                        client_id=%client_id,\n                        index_uid=%index_uid,\n                        source_id=%source_id,\n                        shard_id=%shard_id,\n                        error=%other_ingest_error,\n                        \"failed to open fetch stream from ingester `{ingester_id}`: failing over to ingester `{failover_ingester_id}`\"\n                    );\n                } else {\n                    error!(\n                        client_id=%client_id,\n                        index_uid=%index_uid,\n                        source_id=%source_id,\n                        shard_id=%shard_id,\n                        error=%other_ingest_error,\n                        \"failed to open fetch stream from ingester `{ingester_id}`: closing fetch stream\"\n                    );\n                    let fetch_stream_error = FetchStreamError {\n                        index_uid,\n                        source_id,\n                        shard_id,\n                        ingest_error: 
other_ingest_error,\n                    };\n                    let _ = fetch_message_tx.send(Err(fetch_stream_error)).await;\n                    return;\n                }\n                continue;\n            }\n        };\n        while let Some(fetch_message_result) = fetch_stream.next().await {\n            match fetch_message_result {\n                Ok(fetch_message) => match &fetch_message.message {\n                    Some(fetch_message::Message::Payload(fetch_payload)) => {\n                        let batch_size = fetch_payload.estimate_size();\n                        let to_position_inclusive = fetch_payload.to_position_inclusive();\n                        let in_flight_value = InFlightValue::new(\n                            fetch_message,\n                            batch_size,\n                            &MEMORY_METRICS.in_flight.multi_fetch_stream,\n                        );\n                        if fetch_message_tx.send(Ok(in_flight_value)).await.is_err() {\n                            // The consumer was dropped.\n                            return;\n                        }\n                        *from_position_exclusive = to_position_inclusive;\n                    }\n                    Some(fetch_message::Message::Eof(fetch_eof)) => {\n                        let eof_position = fetch_eof.eof_position();\n                        let in_flight_value = InFlightValue::new(\n                            fetch_message,\n                            ByteSize(0),\n                            &MEMORY_METRICS.in_flight.multi_fetch_stream,\n                        );\n                        // We ignore the send error if the consumer was dropped because we're going\n                        // to return anyway.\n                        let _ = fetch_message_tx.send(Ok(in_flight_value)).await;\n\n                        *from_position_exclusive = eof_position;\n                        return;\n                    }\n                    None 
=> {\n                        warn!(\"received empty fetch message\");\n                        continue;\n                    }\n                },\n                Err(ingest_error) => {\n                    if let Some(failover_ingester_id) = failover_ingester_id_opt {\n                        warn!(\n                            client_id=%client_id,\n                            index_uid=%index_uid,\n                            source_id=%source_id,\n                            shard_id=%shard_id,\n                            error=%ingest_error,\n                            \"failed to fetch records from ingester `{ingester_id}`: failing over to ingester `{failover_ingester_id}`\"\n                        );\n                    } else {\n                        error!(\n                            client_id=%client_id,\n                            index_uid=%index_uid,\n                            source_id=%source_id,\n                            shard_id=%shard_id,\n                            error=%ingest_error,\n                            \"failed to fetch records from ingester `{ingester_id}`: closing fetch stream\"\n                        );\n                        let fetch_stream_error = FetchStreamError {\n                            index_uid,\n                            source_id,\n                            shard_id,\n                            ingest_error,\n                        };\n                        let _ = fetch_message_tx.send(Err(fetch_stream_error)).await;\n                        return;\n                    }\n                    continue 'outer;\n                }\n            }\n        }\n    }\n}\n\n#[cfg(test)]\npub(super) mod tests {\n    use std::time::Duration;\n\n    use bytes::Bytes;\n    use quickwit_proto::ingest::ShardState;\n    use quickwit_proto::ingest::ingester::{IngesterServiceClient, MockIngesterService};\n    use quickwit_proto::types::queue_id;\n    use tokio::time::timeout;\n\n    use super::*;\n    
use crate::{IngesterPoolEntry, MRecord};\n\n    pub fn into_fetch_payload(fetch_message: FetchMessage) -> FetchPayload {\n        match fetch_message.message.unwrap() {\n            fetch_message::Message::Payload(fetch_payload) => fetch_payload,\n            other => panic!(\"expected fetch payload, got `{other:?}`\"),\n        }\n    }\n\n    pub fn into_fetch_eof(fetch_message: FetchMessage) -> FetchEof {\n        match fetch_message.message.unwrap() {\n            fetch_message::Message::Eof(fetch_eof) => fetch_eof,\n            other => panic!(\"expected fetch EOF, got `{other:?}`\"),\n        }\n    }\n\n    #[tokio::test]\n    async fn test_fetch_task_happy_path() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let (shard_status_tx, shard_status_rx) = watch::channel(ShardStatus::default());\n        let (mut fetch_stream, fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n            1024,\n        );\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .create_queue(&queue_id)\n            .await\n           
 .unwrap();\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(\n                &queue_id,\n                None,\n                std::iter::once(MRecord::new_doc(\"test-doc-foo\").encode()),\n            )\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(fetch_payload.index_uid(), &index_uid);\n        assert_eq!(fetch_payload.source_id, source_id);\n        assert_eq!(fetch_payload.shard_id(), shard_id);\n        assert_eq!(fetch_payload.from_position_exclusive(), Position::Beginning);\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [14]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"\\0\\0test-doc-foo\"\n        );\n\n        timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap_err();\n\n        // Trigger a spurious notification.\n        let shard_status = (ShardState::Open, Position::offset(0u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap_err();\n\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(\n                &queue_id,\n                None,\n                
std::iter::once(MRecord::new_doc(\"test-doc-bar\").encode()),\n            )\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Open, Position::offset(1u64));\n        shard_status_tx.send(shard_status.clone()).unwrap();\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [14]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"\\0\\0test-doc-bar\"\n        );\n\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        let mrecords = [\n            MRecord::new_doc(\"test-doc-baz\").encode(),\n            MRecord::new_doc(\"test-doc-qux\").encode(),\n        ]\n        .into_iter();\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(&queue_id, None, mrecords)\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Open, Position::offset(3u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            
fetch_payload.from_position_exclusive(),\n            Position::offset(1u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(3u64)\n        );\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [14, 14]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"\\0\\0test-doc-baz\\0\\0test-doc-qux\"\n        );\n\n        let shard_status = (ShardState::Closed, Position::offset(3u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_eof = into_fetch_eof(fetch_message);\n\n        assert_eq!(fetch_eof.index_uid(), &index_uid);\n        assert_eq!(fetch_eof.source_id, source_id);\n        assert_eq!(fetch_eof.shard_id(), shard_id);\n        assert_eq!(fetch_eof.eof_position, Some(Position::eof(3u64)));\n\n        fetch_task_handle.await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_fetch_task_signals_eof() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .create_queue(&queue_id)\n            .await\n            
.unwrap();\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(\n                &queue_id,\n                None,\n                std::iter::once(MRecord::new_doc(\"test-doc-foo\").encode()),\n            )\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::offset(0u64)),\n        };\n        let shard_status = (ShardState::Closed, Position::offset(0u64));\n        let (_shard_status_tx, shard_status_rx) = watch::channel(shard_status);\n\n        let (mut fetch_stream, fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n            1024,\n        );\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_eof = into_fetch_eof(fetch_message);\n\n        assert_eq!(fetch_eof.index_uid(), &index_uid);\n        assert_eq!(fetch_eof.source_id, source_id);\n        assert_eq!(fetch_eof.shard_id(), shard_id);\n        assert_eq!(fetch_eof.eof_position, Some(Position::eof(0u64).as_eof()));\n\n        fetch_task_handle.await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_fetch_task_signals_eof_at_beginning() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = 
\"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let (shard_status_tx, shard_status_rx) = watch::channel(ShardStatus::default());\n        let (mut fetch_stream, fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n            1024,\n        );\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .create_queue(&queue_id)\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Closed, Position::Beginning);\n        shard_status_tx.send(shard_status).unwrap();\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_eof = into_fetch_eof(fetch_message);\n\n        assert_eq!(fetch_eof.index_uid(), &index_uid);\n        assert_eq!(fetch_eof.source_id, source_id);\n        assert_eq!(fetch_eof.shard_id(), shard_id);\n        assert_eq!(fetch_eof.eof_position, Some(Position::Beginning.as_eof()));\n\n        fetch_task_handle.await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_fetch_task_from_position_exclusive() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = 
\"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::offset(0u64)),\n        };\n        let (shard_status_tx, shard_status_rx) = watch::channel(ShardStatus::default());\n        let (mut fetch_stream, _fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n            1024,\n        );\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .create_queue(&queue_id)\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap_err();\n\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(\n                &queue_id,\n                None,\n                std::iter::once(MRecord::new_doc(\"test-doc-foo\").encode()),\n            )\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Open, Position::offset(0u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap_err();\n\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n      
      .as_mut()\n            .unwrap()\n            .append_records(\n                &queue_id,\n                None,\n                std::iter::once(MRecord::new_doc(\"test-doc-bar\").encode()),\n            )\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Open, Position::offset(1u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(fetch_payload.index_uid(), &index_uid);\n        assert_eq!(fetch_payload.source_id, source_id);\n        assert_eq!(fetch_payload.shard_id(), shard_id);\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [14]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"\\0\\0test-doc-bar\"\n        );\n    }\n\n    #[tokio::test]\n    async fn test_fetch_task_error() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: 
client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let (_shard_status_tx, shard_status_rx) = watch::channel(ShardStatus::default());\n        let (mut fetch_stream, fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n            1024,\n        );\n        let ingest_error = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap_err();\n        assert!(matches!(ingest_error, IngestV2Error::Internal(_)));\n\n        fetch_task_handle.await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_fetch_task_batch_num_bytes() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let (shard_status_tx, shard_status_rx) = watch::channel(ShardStatus::default());\n        let (mut fetch_stream, _fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n       
     30,\n        );\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .create_queue(&queue_id)\n            .await\n            .unwrap();\n\n        let records = [\n            Bytes::from_static(b\"test-doc-foo\"),\n            Bytes::from_static(b\"test-doc-bar\"),\n            Bytes::from_static(b\"test-doc-baz\"),\n        ]\n        .into_iter();\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(&queue_id, None, records)\n            .await\n            .unwrap();\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Open, Position::offset(2u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [12, 12]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"test-doc-footest-doc-bar\"\n        );\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [12]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"test-doc-baz\"\n        );\n    }\n\n    #[tokio::test]\n  
  async fn test_fetch_task_batch_num_bytes_less_than_record_payload() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = Arc::new(RwLock::new(Some(\n            MultiRecordLogAsync::open(tempdir.path()).await.unwrap(),\n        )));\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let shard_id = ShardId::from(1);\n        let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: client_id.clone(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let (shard_status_tx, shard_status_rx) = watch::channel(ShardStatus::default());\n        let (mut fetch_stream, _fetch_task_handle) = FetchStreamTask::spawn(\n            open_fetch_stream_request,\n            mrecordlog.clone(),\n            shard_status_rx,\n            10, // Batch size limit of 10 bytes, smaller than the record appended below.\n        );\n\n        let mut mrecordlog_guard = mrecordlog.write().await;\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .create_queue(&queue_id)\n            .await\n            .unwrap();\n\n        mrecordlog_guard\n            .as_mut()\n            .unwrap()\n            .append_records(\n                &queue_id,\n                None,\n                // This doc is longer than 10 bytes.\n                std::iter::once(MRecord::new_doc(\"test-doc-foo\").encode()),\n            )\n            .await\n            .unwrap();\n\n        drop(mrecordlog_guard);\n\n        let shard_status = (ShardState::Open, Position::offset(1u64));\n        shard_status_tx.send(shard_status).unwrap();\n\n        let fetch_message = 
timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload\n                .mrecord_batch\n                .as_ref()\n                .unwrap()\n                .mrecord_lengths,\n            [14]\n        );\n        assert_eq!(\n            fetch_payload.mrecord_batch.as_ref().unwrap().mrecord_buffer,\n            \"\\0\\0test-doc-foo\"\n        );\n    }\n\n    #[test]\n    fn test_select_preferred_and_failover_ingesters() {\n        let self_node_id: NodeId = \"test-ingester-0\".into();\n\n        let (preferred, failover) =\n            select_preferred_and_failover_ingesters(&self_node_id, \"test-ingester-0\".into(), None);\n        assert_eq!(preferred, \"test-ingester-0\");\n        assert!(failover.is_none());\n\n        let (preferred, failover) = select_preferred_and_failover_ingesters(\n            &self_node_id,\n            \"test-ingester-0\".into(),\n            Some(\"test-ingester-1\".into()),\n        );\n        assert_eq!(preferred, \"test-ingester-0\");\n        assert_eq!(failover.unwrap(), \"test-ingester-1\");\n\n        let (preferred, failover) = select_preferred_and_failover_ingesters(\n            &self_node_id,\n            \"test-ingester-1\".into(),\n            Some(\"test-ingester-0\".into()),\n        );\n        assert_eq!(preferred, \"test-ingester-0\");\n        assert_eq!(failover.unwrap(), \"test-ingester-1\");\n    }\n\n    #[tokio::test]\n    async fn test_fault_tolerant_fetch_stream_ingester_unavailable_failover() {\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".into();\n        let shard_id = ShardId::from(1);\n        let mut from_position_exclusive = Position::offset(0u64);\n\n        let 
ingester_ids: Vec<NodeId> = vec![\"test-ingester-0\".into(), \"test-ingester-1\".into()];\n        let ingester_pool = IngesterPool::default();\n\n        let (fetch_message_tx, mut fetch_stream) = ServiceStream::new_bounded(5);\n        let (service_stream_tx_1, service_stream_1) = ServiceStream::new_unbounded();\n\n        let mut mock_ingester_1 = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_1\n            .expect_open_fetch_stream()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Ok(service_stream_1)\n            });\n        let ingester_1 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1));\n        ingester_pool.insert(\"test-ingester-1\".into(), ingester_1);\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-foo\"]),\n            from_position_exclusive: Some(Position::offset(0u64)),\n            to_position_inclusive: Some(Position::offset(1u64)),\n        };\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        service_stream_tx_1.send(Ok(fetch_message)).unwrap();\n\n        let fetch_eof = FetchEof {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            eof_position: Some(Position::eof(1u64)),\n        };\n        let fetch_message = FetchMessage::new_eof(fetch_eof);\n 
       service_stream_tx_1.send(Ok(fetch_message)).unwrap();\n\n        fault_tolerant_fetch_stream(\n            client_id,\n            index_uid,\n            source_id,\n            shard_id,\n            &mut from_position_exclusive,\n            &ingester_ids,\n            ingester_pool,\n            fetch_message_tx,\n        )\n        .await;\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_eof = into_fetch_eof(fetch_message);\n\n        assert_eq!(fetch_eof.eof_position(), Position::eof(1u64));\n\n        assert!(\n            timeout(Duration::from_millis(100), fetch_stream.next())\n                .await\n                .unwrap()\n                .is_none()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_fault_tolerant_fetch_stream_open_fetch_stream_error_failover() {\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".into();\n        let shard_id = ShardId::from(1);\n        let mut from_position_exclusive = Position::offset(0u64);\n\n        let ingester_ids: Vec<NodeId> = vec![\"test-ingester-0\".into(), \"test-ingester-1\".into()];\n        let ingester_pool = IngesterPool::default();\n\n        let (fetch_message_tx, mut 
fetch_stream) = ServiceStream::new_bounded(5);\n        let (service_stream_tx_1, service_stream_1) = ServiceStream::new_unbounded();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Err(IngestV2Error::Internal(\n                    \"open fetch stream error\".to_string(),\n                ))\n            });\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n\n        let mut mock_ingester_1 = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_1\n            .expect_open_fetch_stream()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Ok(service_stream_1)\n            });\n        let ingester_1 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1));\n\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0);\n        ingester_pool.insert(\"test-ingester-1\".into(), ingester_1);\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(index_uid.clone()),\n            
source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-foo\"]),\n            from_position_exclusive: Some(Position::offset(0u64)),\n            to_position_inclusive: Some(Position::offset(1u64)),\n        };\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        service_stream_tx_1.send(Ok(fetch_message)).unwrap();\n\n        let fetch_eof = FetchEof {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            eof_position: Some(Position::eof(1u64)),\n        };\n        let fetch_message = FetchMessage::new_eof(fetch_eof);\n        service_stream_tx_1.send(Ok(fetch_message)).unwrap();\n\n        fault_tolerant_fetch_stream(\n            client_id,\n            index_uid,\n            source_id,\n            shard_id,\n            &mut from_position_exclusive,\n            &ingester_ids,\n            ingester_pool,\n            fetch_message_tx,\n        )\n        .await;\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_eof = into_fetch_eof(fetch_message);\n\n        assert_eq!(fetch_eof.eof_position(), Position::eof(1u64));\n\n        assert!(\n            
timeout(Duration::from_millis(100), fetch_stream.next())\n                .await\n                .unwrap()\n                .is_none()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_fault_tolerant_fetch_stream_error_failover() {\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".into();\n        let shard_id = ShardId::from(1);\n        let mut from_position_exclusive = Position::offset(0u64);\n\n        let ingester_ids: Vec<NodeId> = vec![\"test-ingester-0\".into(), \"test-ingester-1\".into()];\n        let ingester_pool = IngesterPool::default();\n\n        let (fetch_message_tx, mut fetch_stream) = ServiceStream::new_bounded(5);\n        let (service_stream_tx_0, service_stream_0) = ServiceStream::new_unbounded();\n        let (service_stream_tx_1, service_stream_1) = ServiceStream::new_unbounded();\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Ok(service_stream_0)\n            });\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n\n        let mut mock_ingester_1 = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_1\n            .expect_open_fetch_stream()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, 
\"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(1u64));\n\n                Ok(service_stream_1)\n            });\n        let ingester_1 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_1));\n\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0);\n        ingester_pool.insert(\"test-ingester-1\".into(), ingester_1);\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-foo\"]),\n            from_position_exclusive: Some(Position::offset(0u64)),\n            to_position_inclusive: Some(Position::offset(1u64)),\n        };\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        service_stream_tx_0.send(Ok(fetch_message)).unwrap();\n\n        let ingest_error = IngestV2Error::Internal(\"fetch stream error\".into());\n        service_stream_tx_0.send(Err(ingest_error)).unwrap();\n\n        let fetch_eof = FetchEof {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            eof_position: Some(Position::eof(1u64)),\n        };\n        let fetch_message = FetchMessage::new_eof(fetch_eof);\n        service_stream_tx_1.send(Ok(fetch_message)).unwrap();\n\n        fault_tolerant_fetch_stream(\n            client_id,\n            index_uid,\n            source_id,\n            shard_id,\n            &mut from_position_exclusive,\n            &ingester_ids,\n            ingester_pool,\n            fetch_message_tx,\n        )\n        .await;\n\n        
let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_eof = into_fetch_eof(fetch_message);\n\n        assert_eq!(fetch_eof.eof_position(), Position::eof(1u64));\n\n        assert!(\n            timeout(Duration::from_millis(100), fetch_stream.next())\n                .await\n                .unwrap()\n                .is_none()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_fault_tolerant_fetch_stream_shard_not_found() {\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".into();\n        let shard_id = ShardId::from(1);\n        let mut from_position_exclusive = Position::offset(0u64);\n\n        let ingester_ids: Vec<NodeId> = vec![\"test-ingester-0\".into(), \"test-ingester-1\".into()];\n        let ingester_pool = IngesterPool::default();\n\n        let (fetch_message_tx, mut fetch_stream) = ServiceStream::new_bounded(5);\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_0\n            .expect_open_fetch_stream()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), 
&index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Err(IngestV2Error::ShardNotFound {\n                    shard_id: ShardId::from(1),\n                })\n            });\n        let ingester_0 =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester_0));\n        ingester_pool.insert(\"test-ingester-0\".into(), ingester_0);\n\n        fault_tolerant_fetch_stream(\n            client_id,\n            index_uid,\n            source_id,\n            shard_id,\n            &mut from_position_exclusive,\n            &ingester_ids,\n            ingester_pool,\n            fetch_message_tx,\n        )\n        .await;\n\n        let fetch_stream_error = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap_err();\n\n        assert!(matches!(\n            fetch_stream_error.ingest_error,\n            IngestV2Error::ShardNotFound { shard_id } if shard_id == ShardId::from(1)\n        ));\n        assert!(from_position_exclusive.is_eof());\n    }\n\n    #[tokio::test]\n    async fn test_retrying_fetch_stream() {\n        let client_id = \"test-client\".to_string();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id: SourceId = \"test-source\".into();\n        let shard_id = ShardId::from(1);\n        let from_position_exclusive = Position::offset(0u64);\n\n        let ingester_ids: Vec<NodeId> = vec![\"test-ingester\".into()];\n        let ingester_pool = IngesterPool::default();\n\n        let (fetch_message_tx, mut fetch_stream) = ServiceStream::new_bounded(5);\n        let (service_stream_tx_1, service_stream_1) = ServiceStream::new_unbounded();\n        let (service_stream_tx_2, 
service_stream_2) = ServiceStream::new_unbounded();\n\n        let mut retry_params = RetryParams::for_test();\n        retry_params.max_attempts = 3;\n\n        let mut mock_ingester = MockIngesterService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_open_fetch_stream()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Err(IngestV2Error::Internal(\n                    \"open fetch stream error\".to_string(),\n                ))\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_open_fetch_stream()\n            .once()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), Position::offset(0u64));\n\n                Ok(service_stream_1)\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_ingester\n            .expect_open_fetch_stream()\n            .once()\n            .return_once(move |request| {\n                assert_eq!(request.client_id, \"test-client\");\n                assert_eq!(request.index_uid(), &index_uid_clone);\n                assert_eq!(request.source_id, \"test-source\");\n                assert_eq!(request.shard_id(), ShardId::from(1));\n                assert_eq!(request.from_position_exclusive(), 
Position::offset(1u64));\n\n                Ok(service_stream_2)\n            });\n        let ingester =\n            IngesterPoolEntry::ready_with_client(IngesterServiceClient::from_mock(mock_ingester));\n\n        ingester_pool.insert(\"test-ingester\".into(), ingester);\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-foo\"]),\n            from_position_exclusive: Some(Position::offset(0u64)),\n            to_position_inclusive: Some(Position::offset(1u64)),\n        };\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        service_stream_tx_1.send(Ok(fetch_message)).unwrap();\n\n        let ingest_error = IngestV2Error::Internal(\"fetch stream error #1\".into());\n        service_stream_tx_1.send(Err(ingest_error)).unwrap();\n\n        let fetch_payload = FetchPayload {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n            mrecord_batch: MRecordBatch::for_test([\"\\0\\0test-doc-bar\"]),\n            from_position_exclusive: Some(Position::offset(1u64)),\n            to_position_inclusive: Some(Position::offset(2u64)),\n        };\n        let fetch_message = FetchMessage::new_payload(fetch_payload);\n        service_stream_tx_2.send(Ok(fetch_message)).unwrap();\n\n        let ingest_error = IngestV2Error::Internal(\"fetch stream error #2\".into());\n        service_stream_tx_2.send(Err(ingest_error)).unwrap();\n\n        retrying_fetch_stream(\n            client_id,\n            index_uid,\n            source_id,\n            shard_id,\n            from_position_exclusive,\n            ingester_ids,\n            ingester_pool,\n            retry_params,\n            fetch_message_tx,\n        )\n        .await;\n\n        let 
ingest_error = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap_err()\n            .ingest_error;\n        assert!(\n            matches!(ingest_error, IngestV2Error::Internal(message) if message == \"open fetch stream error\")\n        );\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let fetch_stream_error = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap_err();\n        assert!(\n            matches!(fetch_stream_error.ingest_error, IngestV2Error::Internal(message) if message == \"fetch stream error #1\")\n        );\n\n        let fetch_message = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap()\n            .into_inner();\n        let fetch_payload = into_fetch_payload(fetch_message);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(1u64)\n        );\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(2u64)\n        );\n\n        let fetch_stream_error = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap_err();\n        assert!(\n            matches!(fetch_stream_error.ingest_error, 
IngestV2Error::Internal(message) if message == \"fetch stream error #2\")\n        );\n\n        assert!(\n            timeout(Duration::from_millis(100), fetch_stream.next())\n                .await\n                .unwrap()\n                .is_none()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_multi_fetch_stream() {\n        let self_node_id: NodeId = \"test-node\".into();\n        let client_id = \"test-client\".to_string();\n        let ingester_pool = IngesterPool::default();\n        let retry_params = RetryParams::for_test();\n        let _multi_fetch_stream =\n            MultiFetchStream::new(self_node_id, client_id, ingester_pool, retry_params);\n        // TODO: Backport from original branch.\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/helpers.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::{Duration, Instant};\n\nuse anyhow::{Context, anyhow, bail};\nuse futures::StreamExt;\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_proto::ingest::ingester::{\n    DecommissionRequest, IngesterService, IngesterStatus, OpenObservationStreamRequest,\n};\nuse tracing::info;\n\n/// Tries to get the current status of an ingester by opening an observation stream\n/// and reading the first message.\n///\n/// # Errors\n///\n/// Returns an error if:\n/// - The observation stream fails to open\n/// - The stream ends without producing a message\n/// - The stream ends after returning an error\npub async fn try_get_ingester_status(\n    ingester: &impl IngesterService,\n) -> anyhow::Result<IngesterStatus> {\n    let mut observation_stream = ingester\n        .open_observation_stream(OpenObservationStreamRequest {})\n        .await\n        .context(\"failed to open observation stream\")?;\n\n    let next_observation_message = observation_stream\n        .next()\n        .await\n        .context(\"observation stream ended\")?\n        .context(\"observation stream failed\")?;\n\n    Ok(next_observation_message.status())\n}\n\n/// Waits for an ingester to reach a specific status by monitoring its observation stream.\n///\n/// This function continuously polls the observation stream until the ingester reaches\n/// the desired 
status.\n///\n/// # Errors\n///\n/// Returns an error if:\n/// - The observation stream fails to open\n/// - The stream ends without producing a message\n/// - The stream ends after returning an error\n/// - The timeout is exceeded\npub async fn wait_for_ingester_status(\n    ingester: &impl IngesterService,\n    status: IngesterStatus,\n    timeout_after: Duration,\n) -> anyhow::Result<()> {\n    debug_assert!(\n        timeout_after > Duration::ZERO,\n        \"timeout_after should be greater than zero\"\n    );\n    tokio::time::timeout(\n        timeout_after,\n        wait_for_ingester_status_inner(ingester, status),\n    )\n    .await\n    .with_context(|| {\n        format!(\n            \"timed out waiting for ingester to transition to status {status} after {} seconds\",\n            timeout_after.as_secs(),\n        )\n    })?\n}\n\nasync fn wait_for_ingester_status_inner(\n    ingester: &impl IngesterService,\n    status: IngesterStatus,\n) -> anyhow::Result<()> {\n    let mut observation_stream = ingester\n        .open_observation_stream(OpenObservationStreamRequest {})\n        .await\n        .context(\"failed to open observation stream\")?;\n\n    loop {\n        match observation_stream.next().await {\n            Some(Ok(observation_message)) => {\n                if observation_message.status() == status {\n                    return Ok(());\n                }\n            }\n            Some(Err(error)) => {\n                return Err(anyhow!(error).context(\"observation stream failed\"));\n            }\n            None => {\n                bail!(\"observation stream ended\");\n            }\n        }\n    }\n}\n\n/// Initiates decommission of an ingester and waits for it to complete.\n///\n/// This function sends a decommission request to the ingester and then waits\n/// for it to reach the `Decommissioned` status.\n///\n/// # Errors\n///\n/// Returns an error if:\n/// - The decommission request fails\n/// - The observation stream fails to 
open\n/// - The stream ends without producing a message\n/// - The stream ends after returning an error\n/// - The timeout is exceeded\npub async fn wait_for_ingester_decommission(\n    ingester: &impl IngesterService,\n    timeout_after: Duration,\n) -> anyhow::Result<()> {\n    let now = Instant::now();\n\n    ingester\n        .decommission(DecommissionRequest {})\n        .await\n        .context(\"failed to initiate ingester decommission\")?;\n\n    wait_for_ingester_status(\n        ingester,\n        IngesterStatus::Decommissioned,\n        timeout_after.saturating_sub(now.elapsed()),\n    )\n    .await?;\n\n    info!(\n        \"successfully decommissioned ingester in {}\",\n        now.elapsed().pretty_display()\n    );\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::time::Duration;\n\n    use quickwit_common::ServiceStream;\n    use quickwit_proto::ingest::ingester::{\n        DecommissionResponse, IngesterServiceClient, MockIngesterService, ObservationMessage,\n    };\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_try_get_ingester_status() {\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_open_observation_stream()\n            .once()\n            .returning(|_| {\n                let (service_stream_tx, service_stream) = ServiceStream::new_bounded(1);\n                let message = ObservationMessage {\n                    node_id: \"test-ingester\".to_string(),\n                    status: IngesterStatus::Initializing as i32,\n                };\n                service_stream_tx.try_send(Ok(message)).unwrap();\n                Ok(service_stream)\n            });\n        let ingester = IngesterServiceClient::from_mock(mock_ingester);\n        let status = try_get_ingester_status(&ingester).await.unwrap();\n        assert_eq!(status, IngesterStatus::Initializing);\n    }\n\n    #[tokio::test]\n    async fn test_wait_for_ingester_status() {\n        let mut mock_ingester 
= MockIngesterService::new();\n        mock_ingester\n            .expect_open_observation_stream()\n            .once()\n            .returning(|_| {\n                let (service_stream_tx, service_stream) = ServiceStream::new_bounded(2);\n                let message = ObservationMessage {\n                    node_id: \"test-ingester\".to_string(),\n                    status: IngesterStatus::Initializing as i32,\n                };\n                service_stream_tx.try_send(Ok(message)).unwrap();\n\n                let message = ObservationMessage {\n                    node_id: \"test-ingester\".to_string(),\n                    status: IngesterStatus::Ready as i32,\n                };\n                service_stream_tx.try_send(Ok(message)).unwrap();\n                Ok(service_stream)\n            });\n        let ingester = IngesterServiceClient::from_mock(mock_ingester);\n        wait_for_ingester_status(&ingester, IngesterStatus::Ready, Duration::from_secs(1))\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_wait_for_ingester_decommission_elapsed_timeout_not_zero() {\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_open_observation_stream()\n            .once()\n            .returning(|_| {\n                let (service_stream_tx, service_stream) = ServiceStream::new_bounded(1);\n                // Simulate the ingester transitioning to Decommissioned after 50ms.\n                tokio::spawn(async move {\n                    tokio::time::sleep(Duration::from_millis(50)).await;\n                    let message = ObservationMessage {\n                        node_id: \"test-ingester\".to_string(),\n                        status: IngesterStatus::Decommissioned as i32,\n                    };\n                    service_stream_tx.try_send(Ok(message)).unwrap();\n                });\n                Ok(service_stream)\n            });\n        mock_ingester\n 
           .expect_decommission()\n            .once()\n            .returning(|_| Ok(DecommissionResponse {}));\n        let ingester = IngesterServiceClient::from_mock(mock_ingester);\n        wait_for_ingester_decommission(&ingester, Duration::from_secs(1))\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_wait_for_ingester_decommission() {\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_open_observation_stream()\n            .once()\n            .returning(|_| {\n                let (service_stream_tx, service_stream) = ServiceStream::new_bounded(3);\n                let message = ObservationMessage {\n                    node_id: \"test-ingester\".to_string(),\n                    status: IngesterStatus::Ready as i32,\n                };\n                service_stream_tx.try_send(Ok(message)).unwrap();\n\n                let message = ObservationMessage {\n                    node_id: \"test-ingester\".to_string(),\n                    status: IngesterStatus::Decommissioning as i32,\n                };\n                service_stream_tx.try_send(Ok(message)).unwrap();\n\n                let message = ObservationMessage {\n                    node_id: \"test-ingester\".to_string(),\n                    status: IngesterStatus::Decommissioned as i32,\n                };\n                service_stream_tx.try_send(Ok(message)).unwrap();\n                Ok(service_stream)\n            });\n        mock_ingester\n            .expect_decommission()\n            .once()\n            .returning(|_| Ok(DecommissionResponse {}));\n        let ingester = IngesterServiceClient::from_mock(mock_ingester);\n        wait_for_ingester_decommission(&ingester, Duration::from_secs(1))\n            .await\n            .unwrap();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/idle.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::{Duration, Instant};\n\nuse tokio::task::JoinHandle;\nuse tracing::info;\n\nuse super::state::WeakIngesterState;\nuse crate::with_lock_metrics;\n\nconst RUN_INTERVAL_PERIOD: Duration = if cfg!(test) {\n    Duration::from_millis(50)\n} else {\n    Duration::from_secs(60)\n};\n\n/// Periodically closes idle shards.\npub(super) struct CloseIdleShardsTask {\n    weak_state: WeakIngesterState,\n    idle_shard_timeout: Duration,\n}\n\nimpl CloseIdleShardsTask {\n    pub fn spawn(weak_state: WeakIngesterState, idle_shard_timeout: Duration) -> JoinHandle<()> {\n        let task = Self {\n            weak_state,\n            idle_shard_timeout,\n        };\n        tokio::spawn(async move {\n            let Some(mut state) = task.weak_state.upgrade() else {\n                return;\n            };\n            state.wait_for_ready().await;\n            drop(state);\n\n            task.run().await\n        })\n    }\n\n    async fn run(&self) {\n        let mut interval = tokio::time::interval(RUN_INTERVAL_PERIOD);\n\n        loop {\n            interval.tick().await;\n\n            let Some(state) = self.weak_state.upgrade() else {\n                return;\n            };\n            let Ok(mut state_guard) =\n                with_lock_metrics!(state.lock_partially(), \"close_idle_shards\", \"write\").await\n            else {\n          
      return;\n            };\n\n            let now = Instant::now();\n\n            for (queue_id, shard) in &mut state_guard.shards {\n                if shard.is_open() && shard.is_idle(now, self.idle_shard_timeout) {\n                    shard.close();\n                    info!(\"closed idle shard `{queue_id}`\");\n                }\n            }\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_config::service::QuickwitService;\n    use quickwit_proto::types::{IndexUid, ShardId};\n\n    use super::*;\n    use crate::ingest_v2::models::IngesterShard;\n    use crate::ingest_v2::state::IngesterState;\n\n    #[tokio::test]\n    async fn test_close_idle_shards_run() {\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let weak_state = state.weak();\n        let idle_shard_timeout = RUN_INTERVAL_PERIOD * 4;\n        let join_handle = CloseIdleShardsTask::spawn(weak_state, idle_shard_timeout);\n\n        let mut state_guard = state.lock_partially().await.unwrap();\n        let now = Instant::now();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let shard_01 = IngesterShard::new_solo(\n            index_uid.clone(),\n            \"test-source\".to_string(),\n            ShardId::from(1),\n        )\n        .with_last_write(now - idle_shard_timeout)\n        .build();\n        let queue_id_01 = shard_01.queue_id();\n        state_guard.shards.insert(queue_id_01.clone(), shard_01);\n\n        let shard_02 = IngesterShard::new_solo(\n            index_uid.clone(),\n            \"test-source\".to_string(),\n            ShardId::from(2),\n        )\n        .build();\n        let queue_id_02 = 
shard_02.queue_id();\n        state_guard.shards.insert(queue_id_02.clone(), shard_02);\n        drop(state_guard);\n\n        tokio::time::sleep(RUN_INTERVAL_PERIOD * 2).await;\n\n        let state_guard = state.lock_partially().await.unwrap();\n        state_guard\n            .shards\n            .get(&queue_id_01)\n            .unwrap()\n            .assert_is_closed();\n        state_guard\n            .shards\n            .get(&queue_id_02)\n            .unwrap()\n            .assert_is_open();\n        drop(state_guard);\n\n        tokio::time::sleep(idle_shard_timeout).await;\n\n        let state_guard = state.lock_partially().await.unwrap();\n        state_guard\n            .shards\n            .get(&queue_id_02)\n            .unwrap()\n            .assert_is_closed();\n        drop(state_guard);\n        drop(state);\n\n        tokio::time::timeout(Duration::from_secs(1), join_handle)\n            .await\n            .unwrap()\n            .unwrap();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/ingest.md",
    "content": "## Replication\n\n### Settings\n\n- ingest request timeout (35s), `Itimeout`\n- persist request timeout (6s), `Ptimeout`\n- replicate request timeout (3s), `Rtimeout`\n- number of persist attempts (5), `k`\n\nKnowing that persist requests issue replicate requests, and ingest requests issue persist requests, we must have approximately:\n- `Ptimeout` >= 2 * `Rtimeout`\n- `Itimeout` >= `k` * `Ptimeout`\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/ingester.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::hash_map::Entry;\nuse std::collections::{BTreeMap, HashMap, HashSet};\nuse std::fmt;\nuse std::path::Path;\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse async_trait::async_trait;\nuse bytesize::ByteSize;\nuse futures::StreamExt;\nuse futures::stream::FuturesUnordered;\nuse mrecordlog::error::CreateQueueError;\nuse once_cell::sync::OnceCell;\nuse quickwit_cluster::Cluster;\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_common::pubsub::{EventBroker, EventSubscriber};\nuse quickwit_common::rate_limiter::{RateLimiter, RateLimiterSettings};\nuse quickwit_common::{ServiceStream, rate_limited_error, rate_limited_warn};\nuse quickwit_proto::control_plane::{\n    AdviseResetShardsRequest, ControlPlaneService, ControlPlaneServiceClient,\n};\nuse quickwit_proto::indexing::ShardPositionsUpdate;\nuse quickwit_proto::ingest::ingester::*;\nuse quickwit_proto::ingest::{\n    CommitTypeV2, DocBatchV2, IngestV2Error, IngestV2Result, ParseFailure, Shard, ShardIds,\n};\nuse quickwit_proto::types::{\n    IndexUid, NodeId, Position, QueueId, ShardId, SourceId, SubrequestId, queue_id, split_queue_id,\n};\nuse serde_json::{Value as JsonValue, json};\nuse tokio::sync::Semaphore;\nuse tokio::time::{sleep, timeout};\nuse tracing::{debug, error, info, 
warn};\n\nuse super::IngesterPool;\nuse super::broadcast::{BroadcastIngesterCapacityScoreTask, BroadcastLocalShardsTask};\nuse super::doc_mapper::validate_doc_batch;\nuse super::fetch::FetchStreamTask;\nuse super::idle::CloseIdleShardsTask;\nuse super::metrics::INGEST_V2_METRICS;\nuse super::models::IngesterShard;\nuse super::mrecordlog_utils::{\n    AppendDocBatchError, append_non_empty_doc_batch, check_enough_capacity,\n};\nuse super::rate_meter::RateMeter;\nuse super::replication::{\n    ReplicationClient, ReplicationStreamTask, ReplicationStreamTaskHandle, ReplicationTask,\n    SYN_REPLICATION_STREAM_CAPACITY,\n};\nuse super::state::{IngesterState, InnerIngesterState, WeakIngesterState};\nuse crate::ingest_v2::doc_mapper::get_or_try_build_doc_mapper;\nuse crate::ingest_v2::metrics::report_wal_usage;\nuse crate::ingest_v2::models::IngesterShardType;\nuse crate::mrecordlog_async::MultiRecordLogAsync;\nuse crate::{FollowerId, estimate_size, with_lock_metrics};\n\n/// Minimum interval between two reset shards operations.\nconst MIN_RESET_SHARDS_INTERVAL: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::ZERO\n} else {\n    Duration::from_secs(60)\n};\n\n/// Duration after which persist requests time out with\n/// [`quickwit_proto::ingest::IngestV2Error::Timeout`].\npub(super) const PERSIST_REQUEST_TIMEOUT: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(500)\n} else {\n    Duration::from_secs(6)\n};\n\nconst DEFAULT_BATCH_NUM_BYTES: usize = 1024 * 1024; // 1 MiB\n\nfn get_batch_num_bytes() -> usize {\n    static BATCH_NUM_BYTES_CELL: OnceCell<usize> = OnceCell::new();\n    *BATCH_NUM_BYTES_CELL.get_or_init(|| {\n        quickwit_common::get_from_env(\"QW_INGEST_BATCH_NUM_BYTES\", DEFAULT_BATCH_NUM_BYTES, false)\n    })\n}\n\n#[derive(Clone)]\npub struct Ingester {\n    self_node_id: NodeId,\n    control_plane: ControlPlaneServiceClient,\n    ingester_pool: IngesterPool,\n    state: IngesterState,\n    
disk_capacity: ByteSize,\n    memory_capacity: ByteSize,\n    rate_limiter_settings: RateLimiterSettings,\n    replication_factor: usize,\n    // This semaphore ensures that the ingester does not run two reset shards operations\n    // concurrently.\n    reset_shards_permits: Arc<Semaphore>,\n}\n\nimpl fmt::Debug for Ingester {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"Ingester\")\n            .field(\"replication_factor\", &self.replication_factor)\n            .finish()\n    }\n}\n\nimpl Ingester {\n    #[allow(clippy::too_many_arguments)]\n    pub async fn try_new(\n        cluster: Cluster,\n        control_plane: ControlPlaneServiceClient,\n        ingester_pool: IngesterPool,\n        wal_dir_path: &Path,\n        disk_capacity: ByteSize,\n        memory_capacity: ByteSize,\n        rate_limiter_settings: RateLimiterSettings,\n        replication_factor: usize,\n        idle_shard_timeout: Duration,\n    ) -> IngestV2Result<Self> {\n        let self_node_id: NodeId = cluster.self_node_id().into();\n        let state = IngesterState::load(\n            cluster.clone(),\n            wal_dir_path,\n            disk_capacity,\n            memory_capacity,\n            rate_limiter_settings,\n        )\n        .await;\n\n        let weak_state = state.weak();\n        BroadcastLocalShardsTask::spawn(cluster.clone(), weak_state.clone());\n        BroadcastIngesterCapacityScoreTask::spawn(cluster, weak_state.clone());\n        CloseIdleShardsTask::spawn(weak_state, idle_shard_timeout);\n\n        let ingester = Self {\n            self_node_id,\n            control_plane,\n            ingester_pool,\n            state,\n            disk_capacity,\n            memory_capacity,\n            rate_limiter_settings,\n            replication_factor,\n            reset_shards_permits: Arc::new(Semaphore::new(1)),\n        };\n        ingester.background_reset_shards();\n\n        Ok(ingester)\n    }\n\n    /// Checks whether the 
ingester is fully decommissioned and updates its status accordingly.\n    async fn check_decommissioning_status(&self, state: &mut InnerIngesterState) {\n        if state.status() != IngesterStatus::Decommissioning {\n            return;\n        }\n        if state.shards.values().all(|shard| shard.is_indexed()) {\n            state.set_status(IngesterStatus::Decommissioned).await;\n        }\n    }\n\n    /// Initializes a primary shard by creating a queue in the write-ahead log and inserting a new\n    /// [`IngesterShard`] into the ingester state. If replication is enabled, this method will\n    /// also:\n    /// - open a replication stream between the leader and the follower if one does not already\n    ///   exist.\n    /// - initialize the replica shard.\n    async fn init_primary_shard(\n        &self,\n        state: &mut InnerIngesterState,\n        mrecordlog: &mut MultiRecordLogAsync,\n        shard: Shard,\n        doc_mapping_json: &str,\n        now: Instant,\n        validate_docs: bool,\n    ) -> IngestV2Result<()> {\n        let queue_id = shard.queue_id();\n        info!(\n            index_uid=%shard.index_uid(),\n            source_id=shard.source_id,\n            shard_id=%shard.shard_id(),\n            \"init primary shard\"\n        );\n        let Entry::Vacant(entry) = state.shards.entry(queue_id.clone()) else {\n            return Ok(());\n        };\n        let doc_mapper = get_or_try_build_doc_mapper(\n            &mut state.doc_mappers,\n            shard.doc_mapping_uid(),\n            doc_mapping_json,\n        )?;\n        match mrecordlog.create_queue(&queue_id).await {\n            Ok(_) => {}\n            Err(CreateQueueError::AlreadyExists) => {\n                error!(\"WAL queue `{queue_id}` already exists\");\n                let message = format!(\"WAL queue `{queue_id}` already exists\");\n                return Err(IngestV2Error::Internal(message));\n            }\n            Err(CreateQueueError::IoError(io_error)) => 
{\n                error!(\"failed to create WAL queue `{queue_id}`: {io_error}\",);\n                let message = format!(\"failed to create WAL queue `{queue_id}`: {io_error}\");\n                return Err(IngestV2Error::Internal(message));\n            }\n        };\n        let index_uid = shard.index_uid().clone();\n        let source_id = shard.source_id.clone();\n        let shard_id = shard.shard_id().clone();\n        let rate_limiter = RateLimiter::from_settings(self.rate_limiter_settings);\n        let rate_meter = RateMeter::default();\n\n        let primary_shard = if let Some(follower_id) = &shard.follower_id {\n            let leader_id: NodeId = shard.leader_id.clone().into();\n            let follower_id: NodeId = follower_id.clone().into();\n\n            let replication_client = self\n                .init_replication_stream(\n                    &mut state.replication_streams,\n                    leader_id,\n                    follower_id.clone(),\n                )\n                .await?;\n\n            if let Err(error) = replication_client.init_replica(shard).await {\n                // TODO: Remove dangling queue from the WAL.\n                error!(\"failed to initialize replica shard: {error}\");\n                let message = format!(\"failed to initialize replica shard: {error}\");\n                return Err(IngestV2Error::Internal(message));\n            }\n            IngesterShard::new_primary(index_uid, source_id, shard_id, follower_id)\n                .with_rate_limiter(rate_limiter)\n                .with_rate_meter(rate_meter)\n                .with_doc_mapper(doc_mapper)\n                .with_validate_docs(validate_docs)\n                .with_last_write(now)\n                .build()\n        } else {\n            IngesterShard::new_solo(index_uid, source_id, shard_id)\n                .with_rate_limiter(rate_limiter)\n                .with_rate_meter(rate_meter)\n                .with_doc_mapper(doc_mapper)\n          
      .with_validate_docs(validate_docs)\n                .with_last_write(now)\n                .build()\n        };\n        entry.insert(primary_shard);\n        Ok(())\n    }\n\n    /// Resets the local shards in a separate background task.\n    fn background_reset_shards(&self) {\n        let mut ingester = self.clone();\n\n        let future = async move {\n            ingester.reset_shards().await;\n        };\n        tokio::spawn(future);\n    }\n\n    /// Resets the local shards at most once per minute by querying the control plane for the shards\n    /// that should be deleted or truncated and then performing the requested operations.\n    ///\n    /// This operation should be triggered very rarely when the ingester has not been able to delete\n    /// or truncate its shards by other means (RPCs from indexers, gossip, etc.).\n    async fn reset_shards(&mut self) {\n        let Ok(_permit) = self.reset_shards_permits.try_acquire() else {\n            return;\n        };\n        self.state.wait_for_ready().await;\n\n        info!(\"resetting shards\");\n        let now = Instant::now();\n\n        let mut per_source_shard_ids: HashMap<(IndexUid, SourceId), Vec<ShardId>> = HashMap::new();\n\n        let state_guard = with_lock_metrics!(self.state.lock_fully().await, \"reset_shards\", \"read\")\n            .expect(\"ingester should be ready\");\n\n        for queue_id in state_guard.mrecordlog.list_queues() {\n            let Some((index_uid, source_id, shard_id)) = split_queue_id(queue_id) else {\n                // `split_queue_id` already logs an error.\n                continue;\n            };\n            per_source_shard_ids\n                .entry((index_uid, source_id))\n                .or_default()\n                .push(shard_id);\n        }\n        drop(state_guard);\n\n        let shard_ids = per_source_shard_ids\n            .into_iter()\n            .map(|((index_uid, source_id), shard_ids)| ShardIds {\n                index_uid: 
Some(index_uid),\n                source_id,\n                shard_ids,\n            })\n            .collect();\n\n        let advise_reset_shards_request = AdviseResetShardsRequest {\n            ingester_id: self.self_node_id.to_string(),\n            shard_ids,\n        };\n        let advise_reset_shards_future = self\n            .control_plane\n            .advise_reset_shards(advise_reset_shards_request);\n        let advise_reset_shards_result =\n            timeout(Duration::from_secs(30), advise_reset_shards_future).await;\n\n        match advise_reset_shards_result {\n            Ok(Ok(advise_reset_shards_response)) => {\n                let mut state_guard =\n                    with_lock_metrics!(self.state.lock_fully().await, \"reset_shards\", \"write\")\n                        .expect(\"ingester should be ready\");\n\n                state_guard\n                    .reset_shards(&advise_reset_shards_response)\n                    .await;\n\n                info!(\n                    \"deleted {} and truncated {} shard(s) in {}\",\n                    advise_reset_shards_response.shards_to_delete.len(),\n                    advise_reset_shards_response.shards_to_truncate.len(),\n                    now.elapsed().pretty_display()\n                );\n                INGEST_V2_METRICS\n                    .reset_shards_operations_total\n                    .with_label_values([\"success\"])\n                    .inc();\n\n                let wal_usage = state_guard.mrecordlog.resource_usage();\n                report_wal_usage(wal_usage);\n            }\n            Ok(Err(error)) => {\n                warn!(\"advise reset shards request failed: {error}\");\n\n                INGEST_V2_METRICS\n                    .reset_shards_operations_total\n                    .with_label_values([\"error\"])\n                    .inc();\n            }\n            Err(_) => {\n                warn!(\"advise reset shards request timed out\");\n\n                
INGEST_V2_METRICS\n                    .reset_shards_operations_total\n                    .with_label_values([\"timeout\"])\n                    .inc();\n            }\n        };\n        // We still hold the permit while sleeping so we effectively rate limit the reset shards\n        // operation to once per [`MIN_RESET_SHARDS_INTERVAL`].\n        if let Some(sleep_for) = MIN_RESET_SHARDS_INTERVAL.checked_sub(now.elapsed()) {\n            sleep(sleep_for).await;\n        }\n    }\n\n    async fn init_replication_stream(\n        &self,\n        replication_streams: &mut HashMap<FollowerId, ReplicationStreamTaskHandle>,\n        leader_id: NodeId,\n        follower_id: NodeId,\n    ) -> IngestV2Result<ReplicationClient> {\n        let entry = match replication_streams.entry(follower_id.clone()) {\n            Entry::Occupied(entry) => {\n                // A replication stream with this follower is already opened.\n                return Ok(entry.get().replication_client());\n            }\n            Entry::Vacant(entry) => entry,\n        };\n        let open_request = OpenReplicationStreamRequest {\n            leader_id: leader_id.clone().into(),\n            follower_id: follower_id.clone().into(),\n            replication_seqno: 0,\n        };\n        let open_message = SynReplicationMessage::new_open_request(open_request);\n        let (syn_replication_stream_tx, syn_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        syn_replication_stream_tx\n            .try_send(open_message)\n            .expect(\"channel should be open and have capacity\");\n\n        let ingester = self.ingester_pool.get(&follower_id).ok_or_else(|| {\n            let message = format!(\"ingester `{follower_id}` is unavailable\");\n            IngestV2Error::Unavailable(message)\n        })?;\n        let mut ack_replication_stream = ingester\n            .client\n            .open_replication_stream(syn_replication_stream)\n     
       .await?;\n        ack_replication_stream\n            .next()\n            .await\n            .expect(\"TODO\")\n            .expect(\"TODO\")\n            .into_open_response()\n            .expect(\"first message should be an open response\");\n\n        let replication_stream_task_handle = ReplicationStreamTask::spawn(\n            leader_id.clone(),\n            follower_id.clone(),\n            syn_replication_stream_tx,\n            ack_replication_stream,\n        );\n        let replication_client = replication_stream_task_handle.replication_client();\n        entry.insert(replication_stream_task_handle);\n        Ok(replication_client)\n    }\n\n    pub fn subscribe(&self, event_broker: &EventBroker) {\n        let weak_ingester_state = self.state.weak();\n        // This subscription is the one in charge of truncating the mrecordlog.\n        info!(\"subscribing ingester to shard positions updates\");\n        event_broker\n            .subscribe_without_timeout::<ShardPositionsUpdate>(weak_ingester_state)\n            .forever();\n    }\n\n    async fn persist_inner(\n        &self,\n        persist_request: PersistRequest,\n    ) -> IngestV2Result<PersistResponse> {\n        if persist_request.leader_id != self.self_node_id {\n            return Err(IngestV2Error::Internal(format!(\n                \"routing error: expected leader ID `{}`, got `{}`\",\n                self.self_node_id, persist_request.leader_id,\n            )));\n        }\n        let mut persist_successes = Vec::with_capacity(persist_request.subrequests.len());\n        let mut persist_failures = Vec::new();\n        let mut per_follower_replicate_subrequests: HashMap<NodeId, Vec<ReplicateSubrequest>> =\n            HashMap::new();\n        let mut pending_persist_subrequests: HashMap<SubrequestId, PendingPersistSubrequest> =\n            HashMap::with_capacity(persist_request.subrequests.len());\n\n        // Keep track of the shards that need to be closed following an IO 
error.\n        let mut shards_to_close: HashSet<QueueId> = HashSet::new();\n\n        // Keep track of dangling shards, i.e., shards for which there is no longer a corresponding\n        // queue in the WAL and should be deleted.\n        let mut shards_to_delete: HashSet<QueueId> = HashSet::new();\n\n        let commit_type = persist_request.commit_type();\n        let force_commit = commit_type == CommitTypeV2::Force;\n\n        let mut state_guard =\n            with_lock_metrics!(self.state.lock_fully().await, \"persist\", \"write\")?;\n        let status = state_guard.status();\n\n        if !status.accepts_write_requests() {\n            persist_failures.reserve_exact(persist_request.subrequests.len());\n\n            for subrequest in persist_request.subrequests {\n                let persist_failure = PersistFailure {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    reason: PersistFailureReason::NodeUnavailable as i32,\n                };\n                persist_failures.push(persist_failure);\n            }\n            let persist_response = PersistResponse {\n                leader_id: persist_request.leader_id,\n                successes: Vec::new(),\n                failures: persist_failures,\n                routing_update: None,\n            };\n            return Ok(persist_response);\n        }\n        // first verify if we would locally accept each subrequest\n        {\n            let mut total_requested_capacity = ByteSize::b(0);\n\n            for subrequest in persist_request.subrequests {\n                let Some(shard) = state_guard\n                    .inner\n                    .find_most_capacity_shard_mut(subrequest.index_uid(), &subrequest.source_id)\n                else {\n                    warn!(\n                        index_uid=%subrequest.index_uid(),\n                        
source_id=%subrequest.source_id,\n                        \"no open shard found on ingester\"\n                    );\n                    let persist_failure = PersistFailure {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid,\n                        source_id: subrequest.source_id,\n                        reason: PersistFailureReason::NoShardsAvailable as i32,\n                    };\n                    persist_failures.push(persist_failure);\n                    continue;\n                };\n                let shard_id = shard.shard_id.clone();\n\n                // A router can only know about a newly opened shard if it has been informed by the\n                // control plane, which confirms that the shard was correctly opened in the\n                // metastore.\n                shard.is_advertisable = true;\n                let doc_mapper = shard.doc_mapper_opt.clone().expect(\"shard should be open\");\n                let validate_docs = shard.validate_docs;\n                let follower_id_opt = shard.follower_id_opt().cloned();\n                let from_position_exclusive = shard.replication_position_inclusive.clone();\n\n                let doc_batch = match subrequest.doc_batch {\n                    Some(doc_batch) if !doc_batch.is_empty() => doc_batch,\n                    _ => {\n                        warn!(\"received empty persist request\");\n                        DocBatchV2::default()\n                    }\n                };\n                let requested_capacity = estimate_size(&doc_batch);\n\n                if let Err(error) = check_enough_capacity(\n                    &state_guard.mrecordlog,\n                    self.disk_capacity,\n                    self.memory_capacity,\n                    requested_capacity + total_requested_capacity,\n                ) {\n                    rate_limited_warn!(\n                        limit_per_min = 10,\n         
               \"failed to persist records to ingester `{}`: {error}\",\n                        self.self_node_id\n                    );\n                    let persist_failure = PersistFailure {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid,\n                        source_id: subrequest.source_id,\n                        reason: PersistFailureReason::WalFull as i32,\n                    };\n                    persist_failures.push(persist_failure);\n                    continue;\n                };\n                // Because we return the shard with the most available capacity, if this hits, it\n                // means that no shard can receive this request, and it should be retried.\n                if !shard.rate_limiter.acquire_bytes(requested_capacity) {\n                    debug!(\n                        \"failed to persist records to shard `{}`: rate limited\",\n                        shard.queue_id()\n                    );\n\n                    let persist_failure = PersistFailure {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid,\n                        source_id: subrequest.source_id,\n                        reason: PersistFailureReason::NoShardsAvailable as i32,\n                    };\n                    persist_failures.push(persist_failure);\n                    continue;\n                }\n\n                // Total number of bytes (valid and invalid documents)\n                let original_batch_num_bytes = doc_batch.num_bytes() as u64;\n\n                let (valid_doc_batch, parse_failures) = if validate_docs {\n                    validate_doc_batch(doc_batch, doc_mapper).await?\n                } else {\n                    (doc_batch, Vec::new())\n                };\n\n                if valid_doc_batch.is_empty() {\n                    crate::metrics::INGEST_METRICS\n           
             .ingested_docs_invalid\n                        .inc_by(parse_failures.len() as u64);\n                    crate::metrics::INGEST_METRICS\n                        .ingested_docs_bytes_invalid\n                        .inc_by(original_batch_num_bytes);\n                    let persist_success = PersistSuccess {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid,\n                        source_id: subrequest.source_id,\n                        shard_id: Some(shard_id),\n                        replication_position_inclusive: Some(from_position_exclusive),\n                        num_persisted_docs: 0,\n                        parse_failures,\n                    };\n                    persist_successes.push(persist_success);\n                    continue;\n                };\n\n                crate::metrics::INGEST_METRICS\n                    .ingested_docs_valid\n                    .inc_by(valid_doc_batch.num_docs() as u64);\n                crate::metrics::INGEST_METRICS\n                    .ingested_docs_bytes_valid\n                    .inc_by(valid_doc_batch.num_bytes() as u64);\n                if !parse_failures.is_empty() {\n                    crate::metrics::INGEST_METRICS\n                        .ingested_docs_invalid\n                        .inc_by(parse_failures.len() as u64);\n                    crate::metrics::INGEST_METRICS\n                        .ingested_docs_bytes_invalid\n                        .inc_by(original_batch_num_bytes - valid_doc_batch.num_bytes() as u64);\n                }\n                let valid_batch_num_bytes = valid_doc_batch.num_bytes() as u64;\n                shard.rate_meter.update(valid_batch_num_bytes);\n                total_requested_capacity += requested_capacity;\n\n                let mut successfully_replicated = true;\n\n                if let Some(follower_id) = follower_id_opt {\n                    
successfully_replicated = false;\n\n                    let replicate_subrequest = ReplicateSubrequest {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid.clone(),\n                        source_id: subrequest.source_id.clone(),\n                        shard_id: Some(shard_id.clone()),\n                        from_position_exclusive: Some(from_position_exclusive),\n                        doc_batch: Some(valid_doc_batch.clone()),\n                    };\n                    per_follower_replicate_subrequests\n                        .entry(follower_id)\n                        .or_default()\n                        .push(replicate_subrequest);\n                }\n                let pending_persist_subrequest = PendingPersistSubrequest {\n                    queue_id: shard.queue_id(),\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    shard_id: Some(shard_id),\n                    doc_batch: valid_doc_batch,\n                    parse_failures,\n                    expected_position_inclusive: None,\n                    successfully_replicated,\n                };\n                pending_persist_subrequests.insert(\n                    pending_persist_subrequest.subrequest_id,\n                    pending_persist_subrequest,\n                );\n            }\n        }\n        // replicate to the follower\n        {\n            let mut replicate_futures = FuturesUnordered::new();\n\n            for (follower_id, replicate_subrequests) in per_follower_replicate_subrequests {\n                let replication_client = state_guard\n                    .replication_streams\n                    .get(&follower_id)\n                    .expect(\"replication stream should be initialized\")\n                    .replication_client();\n                let 
leader_id = self.self_node_id.clone();\n\n                let replicate_future = replication_client.replicate(\n                    leader_id,\n                    follower_id,\n                    replicate_subrequests,\n                    commit_type,\n                );\n                replicate_futures.push(replicate_future);\n            }\n            while let Some(replication_result) = replicate_futures.next().await {\n                let replicate_response = match replication_result {\n                    Ok(replicate_response) => replicate_response,\n                    Err(_) => {\n                        // TODO: Handle replication error:\n                        // 1. Close and evict all the shards hosted by the follower.\n                        // 2. Close and evict the replication client.\n                        // 3. Return `PersistFailureReason::NodeUnavailable` to router.\n                        continue;\n                    }\n                };\n                for replicate_success in replicate_response.successes {\n                    let pending_persist_subrequest = pending_persist_subrequests\n                        .get_mut(&replicate_success.subrequest_id)\n                        .expect(\"persist subrequest should exist\");\n\n                    pending_persist_subrequest.successfully_replicated = true;\n                    pending_persist_subrequest.expected_position_inclusive =\n                        replicate_success.replication_position_inclusive;\n                }\n                for replicate_failure in replicate_response.failures {\n                    // TODO: If the replica shard is closed, close the primary shard if it is not\n                    // already.\n                    let persist_failure_reason: PersistFailureReason =\n                        replicate_failure.reason().into();\n                    let persist_failure = PersistFailure {\n                        subrequest_id: 
replicate_failure.subrequest_id,\n                        index_uid: replicate_failure.index_uid,\n                        source_id: replicate_failure.source_id,\n                        reason: persist_failure_reason as i32,\n                    };\n                    persist_failures.push(persist_failure);\n                }\n            }\n        }\n        // finally write locally\n        {\n            let now = Instant::now();\n            for subrequest in pending_persist_subrequests.into_values() {\n                if !subrequest.successfully_replicated {\n                    continue;\n                }\n                let queue_id = subrequest.queue_id;\n\n                let batch_num_docs = subrequest.doc_batch.num_docs() as u64;\n\n                let append_result = append_non_empty_doc_batch(\n                    &mut state_guard.mrecordlog,\n                    &queue_id,\n                    subrequest.doc_batch,\n                    force_commit,\n                )\n                .await;\n\n                let current_position_inclusive = match append_result {\n                    Ok(current_position_inclusive) => current_position_inclusive,\n                    Err(append_error) => {\n                        let reason = match &append_error {\n                            AppendDocBatchError::Io(io_error) => {\n                                error!(\n                                    \"failed to persist records to shard `{queue_id}`: {io_error}\"\n                                );\n                                shards_to_close.insert(queue_id);\n                                PersistFailureReason::NodeUnavailable\n                            }\n                            AppendDocBatchError::QueueNotFound(_) => {\n                                error!(\n                                    \"failed to persist records to shard `{queue_id}`: WAL queue \\\n                                     not found\"\n                               
 );\n                                shards_to_delete.insert(queue_id);\n                                PersistFailureReason::NodeUnavailable\n                            }\n                        };\n                        let persist_failure = PersistFailure {\n                            subrequest_id: subrequest.subrequest_id,\n                            index_uid: subrequest.index_uid,\n                            source_id: subrequest.source_id,\n                            reason: reason as i32,\n                        };\n                        persist_failures.push(persist_failure);\n                        continue;\n                    }\n                };\n\n                if let Some(expected_position_inclusive) = subrequest.expected_position_inclusive\n                    && expected_position_inclusive != current_position_inclusive\n                {\n                    return Err(IngestV2Error::Internal(format!(\n                        \"bad replica position: expected {expected_position_inclusive:?}, got \\\n                         {current_position_inclusive:?}\"\n                    )));\n                }\n                state_guard\n                    .shards\n                    .get_mut(&queue_id)\n                    .expect(\"primary shard should exist\")\n                    .set_replication_position_inclusive(current_position_inclusive.clone(), now);\n\n                let persist_success = PersistSuccess {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    shard_id: subrequest.shard_id,\n                    replication_position_inclusive: Some(current_position_inclusive),\n                    num_persisted_docs: batch_num_docs as u32,\n                    parse_failures: subrequest.parse_failures,\n                };\n                persist_successes.push(persist_success);\n            
}\n        }\n        if !shards_to_close.is_empty() {\n            for queue_id in &shards_to_close {\n                let shard = state_guard\n                    .shards\n                    .get_mut(queue_id)\n                    .expect(\"shard should exist\");\n\n                shard.close();\n                warn!(\"closed shard `{queue_id}` following IO error\");\n            }\n        }\n        if !shards_to_delete.is_empty() {\n            for queue_id in &shards_to_delete {\n                state_guard.shards.remove(queue_id);\n                warn!(\"deleted dangling shard `{queue_id}`\");\n            }\n        }\n        let wal_usage = state_guard.mrecordlog.resource_usage();\n        let disk_used = wal_usage.disk_used_bytes as u64;\n        let memory_used = wal_usage.memory_used_bytes as u64;\n        let (open_shard_counts, closed_shards) = state_guard.get_shard_snapshot();\n        let capacity_score = state_guard\n            .wal_capacity_tracker\n            .score(ByteSize::b(disk_used), ByteSize::b(memory_used))\n            as u32;\n        drop(state_guard);\n\n        if disk_used >= self.disk_capacity.as_u64() * 90 / 100 {\n            self.background_reset_shards();\n        }\n        report_wal_usage(wal_usage);\n\n        let source_shard_updates = open_shard_counts\n            .into_iter()\n            .map(|(index_uid, source_id, count)| SourceShardUpdate {\n                index_uid: Some(index_uid),\n                source_id,\n                open_shard_count: count as u32,\n            })\n            .collect();\n\n        let routing_update = RoutingUpdate {\n            capacity_score,\n            source_shard_updates,\n            closed_shards,\n        };\n\n        #[cfg(test)]\n        {\n            persist_successes.sort_by_key(|success| success.subrequest_id);\n            persist_failures.sort_by_key(|failure| failure.subrequest_id);\n        }\n        let leader_id = self.self_node_id.to_string();\n        
let persist_response = PersistResponse {\n            leader_id,\n            successes: persist_successes,\n            failures: persist_failures,\n            routing_update: Some(routing_update),\n        };\n        Ok(persist_response)\n    }\n\n    /// Opens a replication stream, which is a bi-directional gRPC stream. The client-side stream\n    async fn open_replication_stream_inner(\n        &self,\n        mut syn_replication_stream: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> IngestV2Result<IngesterServiceStream<AckReplicationMessage>> {\n        let open_replication_stream_request = syn_replication_stream\n            .next()\n            .await\n            .ok_or_else(|| IngestV2Error::Internal(\"syn replication stream aborted\".to_string()))?\n            .into_open_request()\n            .expect(\"first message should be an open replication stream request\");\n\n        if open_replication_stream_request.follower_id != self.self_node_id {\n            return Err(IngestV2Error::Internal(\"routing error\".to_string()));\n        }\n        let leader_id: NodeId = open_replication_stream_request.leader_id.into();\n        let follower_id: NodeId = open_replication_stream_request.follower_id.into();\n\n        let mut state_guard = self.state.lock_partially().await?;\n        let status = state_guard.status();\n\n        if !status.accepts_write_requests() {\n            let error = IngestV2Error::Unavailable(format!(\n                \"ingester {follower_id} is not ready: {status}\",\n            ));\n            return Err(error);\n        }\n        let Entry::Vacant(entry) = state_guard.replication_tasks.entry(leader_id.clone()) else {\n            return Err(IngestV2Error::Internal(format!(\n                \"a replication stream between {leader_id} and {follower_id} is already opened\"\n            )));\n        };\n        // Channel capacity: there is no need to bound the capacity of the channel here because it\n        // 
is already virtually bounded by the capacity of the SYN replication stream.\n        let (ack_replication_stream_tx, ack_replication_stream) = ServiceStream::new_unbounded();\n        let open_response = OpenReplicationStreamResponse {\n            replication_seqno: 0,\n        };\n        let ack_replication_message = AckReplicationMessage::new_open_response(open_response);\n        ack_replication_stream_tx\n            .send(Ok(ack_replication_message))\n            .expect(\"channel should be open\");\n\n        let replication_task_handle = ReplicationTask::spawn(\n            leader_id,\n            follower_id,\n            self.state.clone(),\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            self.disk_capacity,\n            self.memory_capacity,\n        );\n        entry.insert(replication_task_handle);\n        Ok(ack_replication_stream)\n    }\n\n    async fn open_fetch_stream_inner(\n        &self,\n        open_fetch_stream_request: OpenFetchStreamRequest,\n    ) -> IngestV2Result<ServiceStream<IngestV2Result<FetchMessage>>> {\n        let queue_id = open_fetch_stream_request.queue_id();\n\n        let mut state_guard = self.state.lock_partially().await?;\n\n        let shard = state_guard.shards.get_mut(&queue_id).ok_or_else(|| {\n            rate_limited_error!(limit_per_min=6, queue_id=%queue_id, \"shard not found\");\n            IngestV2Error::ShardNotFound {\n                shard_id: open_fetch_stream_request.shard_id().clone(),\n            }\n        })?;\n        // An indexer can only know about a newly opened shard if it has been scheduled by the\n        // control plane, which confirms that the shard was correctly opened in the\n        // metastore.\n        shard.is_advertisable = true;\n\n        let shard_status_rx = shard.shard_status_rx.clone();\n        let mrecordlog = self.state.mrecordlog();\n        let (service_stream, _fetch_task_handle) = FetchStreamTask::spawn(\n            
open_fetch_stream_request,\n            mrecordlog,\n            shard_status_rx,\n            get_batch_num_bytes(),\n        );\n        Ok(service_stream)\n    }\n\n    async fn open_observation_stream_inner(\n        &self,\n        _open_observation_stream_request: OpenObservationStreamRequest,\n    ) -> IngestV2Result<IngesterServiceStream<ObservationMessage>> {\n        let status_stream = ServiceStream::from(self.state.status_rx.clone());\n        let self_node_id = self.self_node_id.clone();\n        let observation_stream = status_stream.map(move |status| {\n            let observation_message = ObservationMessage {\n                node_id: self_node_id.clone().into(),\n                status: status as i32,\n            };\n            Ok(observation_message)\n        });\n        Ok(observation_stream)\n    }\n\n    async fn init_shards_inner(\n        &self,\n        init_shards_request: InitShardsRequest,\n    ) -> IngestV2Result<InitShardsResponse> {\n        let mut state_guard =\n            with_lock_metrics!(self.state.lock_fully().await, \"init_shards\", \"write\")?;\n        let status = state_guard.status();\n\n        if !status.accepts_write_requests() {\n            let error = IngestV2Error::Unavailable(format!(\n                \"ingester {} is not ready: {status}\",\n                self.self_node_id\n            ));\n            return Err(error);\n        }\n        let mut successes = Vec::with_capacity(init_shards_request.subrequests.len());\n        let mut failures = Vec::new();\n        let now = Instant::now();\n\n        for subrequest in init_shards_request.subrequests {\n            let init_primary_shard_result = self\n                .init_primary_shard(\n                    &mut state_guard.inner,\n                    &mut state_guard.mrecordlog,\n                    subrequest.shard().clone(),\n                    &subrequest.doc_mapping_json,\n                    now,\n                    subrequest.validate_docs,\n      
          )\n                .await;\n            if init_primary_shard_result.is_ok() {\n                let success = InitShardSuccess {\n                    subrequest_id: subrequest.subrequest_id,\n                    shard: subrequest.shard,\n                };\n                successes.push(success);\n            } else {\n                let shard = subrequest.shard();\n                let failure = InitShardFailure {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: shard.index_uid.clone(),\n                    source_id: shard.source_id.clone(),\n                    shard_id: shard.shard_id.clone(),\n                };\n                failures.push(failure);\n            }\n        }\n        let response = InitShardsResponse {\n            successes,\n            failures,\n        };\n        Ok(response)\n    }\n\n    async fn truncate_shards_inner(\n        &self,\n        truncate_shards_request: TruncateShardsRequest,\n    ) -> IngestV2Result<TruncateShardsResponse> {\n        if truncate_shards_request.ingester_id != self.self_node_id {\n            return Err(IngestV2Error::Internal(format!(\n                \"routing error: expected ingester `{}`, got `{}`\",\n                self.self_node_id, truncate_shards_request.ingester_id,\n            )));\n        }\n        let mut state_guard =\n            with_lock_metrics!(self.state.lock_fully().await, \"truncate_shards\", \"write\")?;\n\n        for subrequest in truncate_shards_request.subrequests {\n            let queue_id = subrequest.queue_id();\n            let truncate_up_to_position_inclusive = subrequest.truncate_up_to_position_inclusive();\n\n            if truncate_up_to_position_inclusive.is_eof() {\n                state_guard.delete_shard(&queue_id, \"indexer-rpc\").await;\n            } else {\n                state_guard\n                    .truncate_shard(&queue_id, truncate_up_to_position_inclusive, \"indexer-rpc\")\n                 
   .await;\n            }\n        }\n        let wal_usage = state_guard.mrecordlog.resource_usage();\n        report_wal_usage(wal_usage);\n\n        self.check_decommissioning_status(&mut state_guard).await;\n        let truncate_response = TruncateShardsResponse {};\n        Ok(truncate_response)\n    }\n\n    async fn close_shards_inner(\n        &self,\n        close_shards_request: CloseShardsRequest,\n    ) -> IngestV2Result<CloseShardsResponse> {\n        let mut state_guard =\n            with_lock_metrics!(self.state.lock_partially().await, \"close_shards\", \"write\")?;\n\n        let mut successes = Vec::with_capacity(close_shards_request.shard_pkeys.len());\n\n        for shard_pkey in close_shards_request.shard_pkeys {\n            let queue_id = shard_pkey.queue_id();\n\n            if let Some(shard) = state_guard.shards.get_mut(&queue_id) {\n                shard.close();\n                successes.push(shard_pkey);\n            }\n        }\n        info!(\"closed {} shards\", successes.len());\n        let response = CloseShardsResponse { successes };\n        Ok(response)\n    }\n\n    pub async fn debug_info(&self) -> JsonValue {\n        let state_guard = match self.state.lock_fully().await {\n            Ok(state_guard) => state_guard,\n            Err(_) => {\n                return json!({\n                    \"status\": \"initializing\",\n                    \"shards\": {},\n                    \"mrecordlog\": {},\n                });\n            }\n        };\n        let mut per_index_shards_json: BTreeMap<IndexUid, Vec<JsonValue>> = BTreeMap::new();\n\n        for (queue_id, shard) in &state_guard.shards {\n            let Some((index_uid, source_id, shard_id)) = split_queue_id(queue_id) else {\n                // `split_queue_id` already logs an error.\n                continue;\n            };\n            let mut shard_json = json!({\n                \"index_uid\": index_uid,\n                \"source_id\": source_id,\n            
    \"shard_id\": shard_id,\n                \"state\": shard.shard_state.as_json_str_name(),\n                \"replication_position_inclusive\": shard.replication_position_inclusive,\n                \"truncation_position_inclusive\": shard.truncation_position_inclusive,\n            });\n            match &shard.shard_type {\n                IngesterShardType::Primary { follower_id, .. } => {\n                    shard_json[\"type\"] = json!(\"primary\");\n                    shard_json[\"leader_id\"] = json!(self.self_node_id.to_string());\n                    shard_json[\"follower_id\"] = json!(follower_id.to_string());\n                }\n                IngesterShardType::Replica { leader_id } => {\n                    shard_json[\"type\"] = json!(\"replica\");\n                    shard_json[\"leader_id\"] = json!(leader_id.to_string());\n                    shard_json[\"follower_id\"] = json!(self.self_node_id.to_string());\n                }\n                IngesterShardType::Solo => {\n                    shard_json[\"type\"] = json!(\"solo\");\n                    shard_json[\"leader_id\"] = json!(self.self_node_id.to_string());\n                }\n            };\n            per_index_shards_json\n                .entry(index_uid.clone())\n                .or_default()\n                .push(shard_json);\n        }\n        json!({\n            \"status\": state_guard.status().as_json_str_name(),\n            \"shards\": per_index_shards_json,\n            \"mrecordlog\":  state_guard.mrecordlog.summary(),\n        })\n    }\n}\n\n#[async_trait]\nimpl IngesterService for Ingester {\n    async fn persist(&self, persist_request: PersistRequest) -> IngestV2Result<PersistResponse> {\n        // If the request is local, the amount of memory it occupies is already\n        // accounted for in the router.\n        let request_size_bytes = persist_request\n            .subrequests\n            .iter()\n            .flat_map(|subrequest| match 
&subrequest.doc_batch {\n                Some(doc_batch) if doc_batch.doc_buffer.is_unique() => Some(doc_batch.num_bytes()),\n                _ => None,\n            })\n            .sum::<usize>();\n        let mut gauge_guard = GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.ingester_persist);\n        gauge_guard.add(request_size_bytes as i64);\n\n        self.persist_inner(persist_request).await\n    }\n\n    async fn open_replication_stream(\n        &self,\n        syn_replication_stream: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> IngestV2Result<IngesterServiceStream<AckReplicationMessage>> {\n        self.open_replication_stream_inner(syn_replication_stream)\n            .await\n    }\n\n    async fn open_fetch_stream(\n        &self,\n        open_fetch_stream_request: OpenFetchStreamRequest,\n    ) -> IngestV2Result<ServiceStream<IngestV2Result<FetchMessage>>> {\n        self.open_fetch_stream_inner(open_fetch_stream_request)\n            .await\n    }\n\n    async fn open_observation_stream(\n        &self,\n        open_observation_stream_request: OpenObservationStreamRequest,\n    ) -> IngestV2Result<IngesterServiceStream<ObservationMessage>> {\n        self.open_observation_stream_inner(open_observation_stream_request)\n            .await\n    }\n\n    async fn init_shards(\n        &self,\n        init_shards_request: InitShardsRequest,\n    ) -> IngestV2Result<InitShardsResponse> {\n        self.init_shards_inner(init_shards_request).await\n    }\n\n    async fn retain_shards(\n        &self,\n        request: RetainShardsRequest,\n    ) -> IngestV2Result<RetainShardsResponse> {\n        let retain_queue_ids: HashSet<QueueId> = request\n            .retain_shards_for_sources\n            .into_iter()\n            .flat_map(|retain_shards_for_source: RetainShardsForSource| {\n                let index_uid = retain_shards_for_source.index_uid().clone();\n                retain_shards_for_source\n                    .shard_ids\n   
                 .into_iter()\n                    .map(move |shard_id| {\n                        queue_id(&index_uid, &retain_shards_for_source.source_id, &shard_id)\n                    })\n            })\n            .collect();\n        let mut state_guard =\n            with_lock_metrics!(self.state.lock_fully(), \"retain_shards\", \"write\").await?;\n        let remove_queue_ids: HashSet<QueueId> = state_guard\n            .shards\n            .keys()\n            .filter(move |shard_id| !retain_queue_ids.contains(*shard_id))\n            .map(ToString::to_string)\n            .collect();\n        info!(queues=?remove_queue_ids, \"removing queues\");\n        for queue_id in remove_queue_ids {\n            state_guard\n                .delete_shard(&queue_id, \"control-plane-retain-shards-rpc\")\n                .await;\n        }\n        self.check_decommissioning_status(&mut state_guard).await;\n        Ok(RetainShardsResponse {})\n    }\n\n    async fn truncate_shards(\n        &self,\n        truncate_shards_request: TruncateShardsRequest,\n    ) -> IngestV2Result<TruncateShardsResponse> {\n        self.truncate_shards_inner(truncate_shards_request).await\n    }\n\n    async fn close_shards(\n        &self,\n        close_shards_request: CloseShardsRequest,\n    ) -> IngestV2Result<CloseShardsResponse> {\n        self.close_shards_inner(close_shards_request).await\n    }\n\n    async fn decommission(\n        &self,\n        _decommission_request: DecommissionRequest,\n    ) -> IngestV2Result<DecommissionResponse> {\n        // Retire the ingester immediately by setting its status to `Retiring`.\n        info!(\"retiring ingester\");\n        let mut state_guard = self.state.lock_partially().await?;\n        state_guard.set_status(IngesterStatus::Retiring).await;\n        drop(state_guard); // Dropping explicitly for readability.\n\n        // Drain write requests by scheduling the decommissioning of the ingester after a delay\n        // allowing the 
propagation of the `Retiring` status to other nodes.\n        let self_clone = self.clone();\n        tokio::spawn(async move {\n            const DECOMMISSION_DELAY: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n                Duration::from_millis(100)\n            } else {\n                // Having to wait for 10s is not great but we can live with it. During this time, we\n                // still make progress towards decommissioning because we gradually receive fewer\n                // write requests and indexing is still ongoing. However, it sets a floor on the\n                // amount of time with which we can fully decommission an ingester. This will be\n                // most noticeable when using Quickwit locally.\n                Duration::from_secs(10)\n            };\n            tokio::time::sleep(DECOMMISSION_DELAY).await;\n\n            info!(\"decommissioning ingester\");\n            let mut state_guard = match self_clone.state.lock_partially().await {\n                Ok(state_guard) => state_guard,\n                Err(error) => {\n                    error!(%error, \"failed to decommission ingester\");\n                    return;\n                }\n            };\n            state_guard\n                .set_status(IngesterStatus::Decommissioning)\n                .await;\n\n            for shard in state_guard.shards.values_mut() {\n                shard.close();\n            }\n            self_clone\n                .check_decommissioning_status(&mut state_guard)\n                .await;\n        });\n        Ok(DecommissionResponse {})\n    }\n}\n\n#[async_trait]\nimpl EventSubscriber<ShardPositionsUpdate> for WeakIngesterState {\n    async fn handle_event(&mut self, shard_positions_update: ShardPositionsUpdate) {\n        let Some(state) = self.upgrade() else {\n            warn!(\"ingester state update failed\");\n            return;\n        };\n        let Ok(mut state_guard) =\n            
with_lock_metrics!(state.lock_fully().await, \"gc_shards\", \"write\")\n        else {\n            error!(\"failed to lock the ingester state\");\n            return;\n        };\n        let index_uid = shard_positions_update.source_uid.index_uid;\n        let source_id = shard_positions_update.source_uid.source_id;\n\n        for (shard_id, shard_position) in shard_positions_update.updated_shard_positions {\n            let queue_id = queue_id(&index_uid, &source_id, &shard_id);\n            if shard_position.is_eof() {\n                state_guard.delete_shard(&queue_id, \"indexer-gossip\").await;\n            } else if !shard_position.is_beginning() {\n                state_guard\n                    .truncate_shard(&queue_id, shard_position, \"indexer-gossip\")\n                    .await;\n            }\n        }\n    }\n}\n\nstruct PendingPersistSubrequest {\n    queue_id: QueueId,\n    subrequest_id: u32,\n    index_uid: Option<IndexUid>,\n    source_id: SourceId,\n    shard_id: Option<ShardId>,\n    doc_batch: DocBatchV2,\n    parse_failures: Vec<ParseFailure>,\n    expected_position_inclusive: Option<Position>,\n    successfully_replicated: bool,\n}\n\n#[cfg(test)]\nmod tests {\n    #![allow(clippy::mutable_key_type)]\n\n    use std::collections::HashSet;\n    use std::net::SocketAddr;\n    use std::sync::atomic::{AtomicU16, Ordering};\n\n    use bytes::Bytes;\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test_with_id};\n    use quickwit_common::shared_consts::INGESTER_PRIMARY_SHARDS_PREFIX;\n    use quickwit_common::tower::ConstantRate;\n    use quickwit_config::service::QuickwitService;\n    use quickwit_proto::control_plane::{AdviseResetShardsResponse, MockControlPlaneService};\n    use quickwit_proto::ingest::ingester::{\n        IngesterServiceClient, IngesterServiceGrpcServer, IngesterServiceGrpcServerAdapter,\n        IngesterStatus, InitShardSubrequest, PersistSubrequest, TruncateShardsSubrequest,\n    };\n    use 
quickwit_proto::ingest::{\n        DocBatchV2, ParseFailureReason, ShardIdPosition, ShardIdPositions, ShardIds, ShardPKey,\n        ShardState,\n    };\n    use quickwit_proto::types::{DocMappingUid, DocUid, ShardId, SourceUid, queue_id};\n    use tokio::task::yield_now;\n    use tokio::time::timeout;\n    use tonic::transport::{Endpoint, Server};\n\n    use super::*;\n    use crate::ingest_v2::DEFAULT_IDLE_SHARD_TIMEOUT;\n    use crate::ingest_v2::broadcast::ShardInfos;\n    use crate::ingest_v2::doc_mapper::try_build_doc_mapper;\n    use crate::ingest_v2::fetch::tests::{into_fetch_eof, into_fetch_payload};\n    use crate::ingest_v2::helpers::wait_for_ingester_status;\n    use crate::{IngesterPoolEntry, MRecord};\n\n    const MAX_GRPC_MESSAGE_SIZE: ByteSize = ByteSize::mib(1);\n\n    pub(super) struct IngesterForTest {\n        node_id: NodeId,\n        control_plane: ControlPlaneServiceClient,\n        ingester_pool: IngesterPool,\n        disk_capacity: ByteSize,\n        memory_capacity: ByteSize,\n        rate_limiter_settings: RateLimiterSettings,\n        replication_factor: usize,\n        idle_shard_timeout: Duration,\n    }\n\n    impl Default for IngesterForTest {\n        fn default() -> Self {\n            let mut mock_control_plane = MockControlPlaneService::new();\n            mock_control_plane\n                .expect_advise_reset_shards()\n                .returning(|_| Ok(AdviseResetShardsResponse::default()));\n            let control_plane = ControlPlaneServiceClient::from_mock(mock_control_plane);\n\n            Self {\n                node_id: \"test-ingester\".into(),\n                control_plane,\n                ingester_pool: IngesterPool::default(),\n                disk_capacity: ByteSize::mb(256),\n                memory_capacity: ByteSize::mb(1),\n                rate_limiter_settings: RateLimiterSettings::default(),\n                replication_factor: 1,\n                idle_shard_timeout: DEFAULT_IDLE_SHARD_TIMEOUT,\n            
}\n        }\n    }\n\n    impl IngesterForTest {\n        pub fn with_node_id(mut self, node_id: &str) -> Self {\n            self.node_id = node_id.into();\n            self\n        }\n\n        pub fn with_control_plane(mut self, control_plane: ControlPlaneServiceClient) -> Self {\n            self.control_plane = control_plane;\n            self\n        }\n\n        pub fn with_ingester_pool(mut self, ingester_pool: &IngesterPool) -> Self {\n            self.ingester_pool = ingester_pool.clone();\n            self\n        }\n\n        pub fn with_disk_capacity(mut self, disk_capacity: ByteSize) -> Self {\n            self.disk_capacity = disk_capacity;\n            self\n        }\n\n        pub fn with_memory_capacity(mut self, memory_capacity: ByteSize) -> Self {\n            self.memory_capacity = memory_capacity;\n            self\n        }\n\n        pub fn with_rate_limiter_settings(\n            mut self,\n            rate_limiter_settings: RateLimiterSettings,\n        ) -> Self {\n            self.rate_limiter_settings = rate_limiter_settings;\n            self\n        }\n\n        pub fn with_replication(mut self) -> Self {\n            self.replication_factor = 2;\n            self\n        }\n\n        pub fn with_idle_shard_timeout(mut self, idle_shard_timeout: Duration) -> Self {\n            self.idle_shard_timeout = idle_shard_timeout;\n            self\n        }\n\n        pub async fn build(self) -> (IngesterContext, Ingester) {\n            static GOSSIP_ADVERTISE_PORT_SEQUENCE: AtomicU16 = AtomicU16::new(1u16);\n\n            let tempdir = tempfile::tempdir().unwrap();\n            let wal_dir_path = tempdir.path();\n            let transport = ChannelTransport::default();\n\n            let gossip_advertise_port =\n                GOSSIP_ADVERTISE_PORT_SEQUENCE.fetch_add(1, Ordering::Relaxed);\n\n            let cluster = create_cluster_for_test_with_id(\n                self.node_id.clone(),\n                gossip_advertise_port,\n  
              \"test-cluster\".to_string(),\n                Vec::new(),\n                &HashSet::from_iter([QuickwitService::Indexer]),\n                &transport,\n                true,\n            )\n            .await\n            .unwrap();\n\n            let ingester = Ingester::try_new(\n                cluster.clone(),\n                self.control_plane.clone(),\n                self.ingester_pool.clone(),\n                wal_dir_path,\n                self.disk_capacity,\n                self.memory_capacity,\n                self.rate_limiter_settings,\n                self.replication_factor,\n                self.idle_shard_timeout,\n            )\n            .await\n            .unwrap();\n\n            wait_for_ingester_status(&ingester, IngesterStatus::Ready, Duration::from_secs(1))\n                .await\n                .unwrap();\n\n            let ingester_env = IngesterContext {\n                tempdir,\n                _transport: transport,\n                node_id: self.node_id,\n                cluster,\n                ingester_pool: self.ingester_pool,\n            };\n            (ingester_env, ingester)\n        }\n    }\n\n    pub struct IngesterContext {\n        tempdir: tempfile::TempDir,\n        _transport: ChannelTransport,\n        node_id: NodeId,\n        cluster: Cluster,\n        ingester_pool: IngesterPool,\n    }\n\n    #[tokio::test]\n    async fn test_ingester_init() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let queue_id_02 = queue_id(&index_uid, &source_id, &ShardId::from(2));\n        let queue_id_03 = queue_id(&index_uid, &source_id, &ShardId::from(3));\n\n        state_guard\n          
  .mrecordlog\n            .create_queue(&queue_id_01)\n            .await\n            .unwrap();\n\n        let records = [MRecord::new_doc(\"test-doc-foo\").encode()].into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id_01, None, records)\n            .await\n            .unwrap();\n\n        state_guard\n            .mrecordlog\n            .truncate(&queue_id_01, 0)\n            .await\n            .unwrap();\n\n        state_guard\n            .mrecordlog\n            .create_queue(&queue_id_02)\n            .await\n            .unwrap();\n\n        let records = [\n            MRecord::new_doc(\"test-doc-foo\").encode(),\n            MRecord::new_doc(\"test-doc-bar\").encode(),\n        ]\n        .into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id_02, None, records)\n            .await\n            .unwrap();\n\n        state_guard\n            .mrecordlog\n            .truncate(&queue_id_02, 0)\n            .await\n            .unwrap();\n\n        state_guard\n            .mrecordlog\n            .create_queue(&queue_id_03)\n            .await\n            .unwrap();\n\n        state_guard.set_status(IngesterStatus::Initializing).await;\n\n        drop(state_guard);\n\n        ingester\n            .state\n            .init(ingester_ctx.tempdir.path(), RateLimiterSettings::default())\n            .await;\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 1);\n\n        let solo_shard_02 = state_guard.shards.get(&queue_id_02).unwrap();\n        solo_shard_02.assert_is_solo();\n        solo_shard_02.assert_is_closed();\n        solo_shard_02.assert_replication_position(Position::offset(1u64));\n        solo_shard_02.assert_truncation_position(Position::offset(0u64));\n        assert!(solo_shard_02.is_advertisable);\n\n        state_guard\n            .mrecordlog\n            
.assert_records_eq(&queue_id_02, .., &[(1, [0, 0], \"test-doc-bar\")]);\n\n        assert_eq!(state_guard.status(), IngesterStatus::Ready);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_broadcasts_local_shards() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let shard_00 =\n            IngesterShard::new_solo(index_uid.clone(), source_id.clone(), ShardId::from(0)).build();\n        state_guard.shards.insert(shard_00.queue_id(), shard_00);\n\n        let shard_01 = IngesterShard::new_solo(index_uid.clone(), source_id, ShardId::from(1))\n            .advertisable()\n            .build();\n        let queue_id_01 = shard_01.queue_id();\n        state_guard.shards.insert(queue_id_01.clone(), shard_01);\n        drop(state_guard);\n\n        tokio::time::sleep(Duration::from_millis(100)).await;\n\n        let key = format!(\n            \"{INGESTER_PRIMARY_SHARDS_PREFIX}{}:{}\",\n            index_uid, \"test-source\"\n        );\n        let value = ingester_ctx.cluster.get_self_key_value(&key).await.unwrap();\n\n        let shard_infos: ShardInfos = serde_json::from_str(&value).unwrap();\n        assert_eq!(shard_infos.len(), 1);\n\n        let shard_info = shard_infos.iter().next().unwrap();\n        assert_eq!(shard_info.shard_id, ShardId::from(1));\n        assert_eq!(shard_info.shard_state, ShardState::Open);\n        assert_eq!(shard_info.short_term_ingestion_rate, 0);\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        state_guard\n            .shards\n            .get_mut(&queue_id_01)\n            .unwrap()\n            .shard_state = ShardState::Closed;\n        drop(state_guard);\n\n        tokio::time::sleep(Duration::from_millis(100)).await;\n\n        let value = 
ingester_ctx.cluster.get_self_key_value(&key).await.unwrap();\n\n        let shard_infos: ShardInfos = serde_json::from_str(&value).unwrap();\n        assert_eq!(shard_infos.len(), 1);\n\n        let shard_info = shard_infos.iter().next().unwrap();\n        assert_eq!(shard_info.shard_state, ShardState::Closed);\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        state_guard.shards.remove(&queue_id_01).unwrap();\n        drop(state_guard);\n\n        tokio::time::sleep(Duration::from_millis(100)).await;\n\n        let value_opt = ingester_ctx.cluster.get_self_key_value(&key).await;\n        assert!(value_opt.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_ingester_init_primary_shard() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"field_mappings\": [{{\n                        \"name\": \"message\",\n                        \"type\": \"text\"\n                }}]\n            }}\"#\n        );\n        let primary_shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: ingester_ctx.node_id.to_string(),\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                primary_shard,\n                &doc_mapping_json,\n                
Instant::now(),\n                true,\n            )\n            .await\n            .unwrap();\n\n        let queue_id = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let shard = state_guard.shards.get(&queue_id).unwrap();\n        shard.assert_is_solo();\n        shard.assert_is_open();\n        shard.assert_replication_position(Position::Beginning);\n        shard.assert_truncation_position(Position::Beginning);\n        assert!(shard.doc_mapper_opt.is_some());\n    }\n\n    #[tokio::test]\n    async fn test_ingester_init_shards() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: ingester_ctx.node_id.to_string(),\n            follower_id: None,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            publish_position_inclusive: None,\n            publish_token: None,\n            update_timestamp: 1724158996,\n        };\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![InitShardSubrequest {\n                subrequest_id: 0,\n                shard: Some(shard.clone()),\n                doc_mapping_json,\n                validate_docs: true,\n            }],\n        };\n        let response = ingester.init_shards(init_shards_request).await.unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n\n        let init_shard_success = 
&response.successes[0];\n        assert_eq!(init_shard_success.subrequest_id, 0);\n        assert_eq!(init_shard_success.shard, Some(shard));\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n\n        let queue_id = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let shard = state_guard.shards.get(&queue_id).unwrap();\n        shard.assert_is_solo();\n        shard.assert_is_open();\n        shard.assert_replication_position(Position::Beginning);\n        shard.assert_truncation_position(Position::Beginning);\n\n        assert!(state_guard.mrecordlog.queue_exists(&queue_id));\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid_0 = IndexUid::for_test(\"test-index\", 0);\n        let index_uid_1 = IndexUid::for_test(\"test-index\", 1);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![\n                InitShardSubrequest {\n                    subrequest_id: 0,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid_0.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: ingester_ctx.node_id.to_string(),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json: doc_mapping_json.clone(),\n                    validate_docs: true,\n                },\n                InitShardSubrequest {\n      
              subrequest_id: 1,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid_1.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: ingester_ctx.node_id.to_string(),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json,\n                    validate_docs: true,\n                },\n            ],\n        };\n        ingester.init_shards(init_shards_request).await.unwrap();\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![\n                PersistSubrequest {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid_0.clone()),\n                    source_id: source_id.clone(),\n                    doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n                },\n                PersistSubrequest {\n                    subrequest_id: 1,\n                    index_uid: Some(index_uid_1.clone()),\n                    source_id: source_id.clone(),\n                    doc_batch: Some(DocBatchV2::for_test([\n                        r#\"{\"doc\": \"test-doc-110\"}\"#,\n                        r#\"{\"doc\": \"test-doc-111\"}\"#,\n                    ])),\n                },\n            ],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 2);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_success_0 = &persist_response.successes[0];\n        
assert_eq!(persist_success_0.subrequest_id, 0);\n        assert_eq!(persist_success_0.index_uid(), &index_uid_0);\n        assert_eq!(persist_success_0.source_id, \"test-source\");\n        assert_eq!(\n            persist_success_0.replication_position_inclusive,\n            Some(Position::offset(1u64))\n        );\n\n        let persist_success_1 = &persist_response.successes[1];\n        assert_eq!(persist_success_1.subrequest_id, 1);\n        assert_eq!(persist_success_1.index_uid(), &index_uid_1);\n        assert_eq!(persist_success_1.source_id, \"test-source\");\n        assert_eq!(\n            persist_success_1.replication_position_inclusive,\n            Some(Position::offset(2u64))\n        );\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 2);\n\n        let queue_id_01 = queue_id(&index_uid_0, &source_id, &ShardId::from(1));\n        let solo_shard_01 = state_guard.shards.get(&queue_id_01).unwrap();\n        solo_shard_01.assert_is_solo();\n        solo_shard_01.assert_is_open();\n        solo_shard_01.assert_replication_position(Position::offset(1u64));\n\n        state_guard.mrecordlog.assert_records_eq(\n            &queue_id_01,\n            ..,\n            &[(0, [0, 0], r#\"{\"doc\": \"test-doc-010\"}\"#), (1, [0, 1], \"\")],\n        );\n\n        let queue_id_11 = queue_id(&index_uid_1, &source_id, &ShardId::from(1));\n        let solo_shard_11 = state_guard.shards.get(&queue_id_11).unwrap();\n        solo_shard_11.assert_is_solo();\n        solo_shard_11.assert_is_open();\n        solo_shard_11.assert_replication_position(Position::offset(2u64));\n\n        state_guard.mrecordlog.assert_records_eq(\n            &queue_id_11,\n            ..,\n            &[\n                (0, [0, 0], r#\"{\"doc\": \"test-doc-110\"}\"#),\n                (1, [0, 0], r#\"{\"doc\": \"test-doc-111\"}\"#),\n                (2, [0, 1], \"\"),\n            ],\n        );\n    }\n\n    
#[tokio::test]\n    async fn test_ingester_persist_empty() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![InitShardSubrequest {\n                subrequest_id: 0,\n                shard: Some(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(0)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: ingester_ctx.node_id.to_string(),\n                    doc_mapping_uid: Some(doc_mapping_uid),\n                    ..Default::default()\n                }),\n                doc_mapping_json,\n                validate_docs: true,\n            }],\n        };\n        let response = ingester.init_shards(init_shards_request).await.unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: Vec::new(),\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_request = PersistRequest {\n            leader_id: \"test-ingester\".to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n           
 subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: None,\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 1);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_success = &persist_response.successes[0];\n        assert_eq!(persist_success.subrequest_id, 0);\n        assert_eq!(persist_success.index_uid(), &index_uid);\n        assert_eq!(persist_success.source_id, \"test-source\");\n        assert_eq!(\n            persist_success.replication_position_inclusive,\n            Some(Position::Beginning)\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_validates_docs() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"mode\": \"strict\",\n                \"field_mappings\": [{{\"name\": \"doc\", \"type\": \"text\"}}]\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![InitShardSubrequest {\n                subrequest_id: 0,\n                shard: Some(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(0)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: ingester_ctx.node_id.to_string(),\n                  
  doc_mapping_uid: Some(doc_mapping_uid),\n                    ..Default::default()\n                }),\n                doc_mapping_json,\n                validate_docs: true,\n            }],\n        };\n        let response = ingester.init_shards(init_shards_request).await.unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([\n                    \"\",                           // invalid\n                    \"[]\",                         // invalid\n                    r#\"{\"foo\": \"bar\"}\"#,          // invalid\n                    r#\"{\"doc\": \"test-doc-000\"}\"#, // valid\n                ])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 1);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_success = &persist_response.successes[0];\n        assert_eq!(persist_success.num_persisted_docs, 1);\n        assert_eq!(persist_success.parse_failures.len(), 3);\n\n        let parse_failure_0 = &persist_success.parse_failures[0];\n        assert_eq!(parse_failure_0.doc_uid(), DocUid::for_test(0));\n        assert_eq!(parse_failure_0.reason(), ParseFailureReason::InvalidJson);\n        assert!(parse_failure_0.message.contains(\"parse JSON document\"));\n\n        let parse_failure_1 = &persist_success.parse_failures[1];\n        assert_eq!(parse_failure_1.doc_uid(), 
DocUid::for_test(1));\n        assert_eq!(parse_failure_1.reason(), ParseFailureReason::InvalidJson);\n        assert!(parse_failure_1.message.contains(\"not an object\"));\n\n        let parse_failure_2 = &persist_success.parse_failures[2];\n        assert_eq!(parse_failure_2.doc_uid(), DocUid::for_test(2));\n        assert_eq!(parse_failure_2.reason(), ParseFailureReason::InvalidSchema);\n        assert!(parse_failure_2.message.contains(\"not declared\"));\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_doesnt_validates_docs_when_requested() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"mode\": \"strict\",\n                \"field_mappings\": [{{\"name\": \"doc\", \"type\": \"text\"}}]\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![InitShardSubrequest {\n                subrequest_id: 0,\n                shard: Some(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(0)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: ingester_ctx.node_id.to_string(),\n                    doc_mapping_uid: Some(doc_mapping_uid),\n                    ..Default::default()\n                }),\n                doc_mapping_json,\n                validate_docs: false,\n            }],\n        };\n        let response = ingester.init_shards(init_shards_request).await.unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n\n        let 
persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([\n                    \"\",                           // invalid\n                    \"[]\",                         // invalid\n                    r#\"{\"foo\": \"bar\"}\"#,          // invalid\n                    r#\"{\"doc\": \"test-doc-000\"}\"#, // valid\n                ])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 1);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_success = &persist_response.successes[0];\n        assert_eq!(persist_success.num_persisted_docs, 4);\n        assert_eq!(persist_success.parse_failures.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_checks_capacity_before_validating_docs() {\n        let (ingester_ctx, ingester) = IngesterForTest::default()\n            .with_memory_capacity(ByteSize(0))\n            .build()\n            .await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"mode\": \"strict\",\n                \"field_mappings\": [{{\"name\": \"doc\", \"type\": \"text\"}}]\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![InitShardSubrequest {\n      
          subrequest_id: 0,\n                shard: Some(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(0)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: ingester_ctx.node_id.to_string(),\n                    doc_mapping_uid: Some(doc_mapping_uid),\n                    ..Default::default()\n                }),\n                doc_mapping_json,\n                validate_docs: true,\n            }],\n        };\n        let response = ingester.init_shards(init_shards_request).await.unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([\"\", \"[]\", r#\"{\"foo\": \"bar\"}\"#])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(persist_failure.reason(), PersistFailureReason::WalFull);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_applies_rate_limiting_before_validating_docs() {\n        let (ingester_ctx, ingester) = IngesterForTest::default()\n            .with_rate_limiter_settings(RateLimiterSettings {\n                burst_limit: 0,\n                rate_limit: 
ConstantRate::bytes_per_sec(ByteSize(0)),\n                refill_period: Duration::from_secs(1),\n            })\n            .build()\n            .await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\",\n                \"mode\": \"strict\",\n                \"field_mappings\": [{{\"name\": \"doc\", \"type\": \"text\"}}]\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![InitShardSubrequest {\n                subrequest_id: 0,\n                shard: Some(Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(0)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: ingester_ctx.node_id.to_string(),\n                    doc_mapping_uid: Some(doc_mapping_uid),\n                    ..Default::default()\n                }),\n                doc_mapping_json,\n                validate_docs: true,\n            }],\n        };\n        let response = ingester.init_shards(init_shards_request).await.unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([\"\", \"[]\", r#\"{\"foo\": \"bar\"}\"#])),\n            }],\n        };\n        let 
persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(\n            persist_failure.reason(),\n            PersistFailureReason::NoShardsAvailable\n        );\n    }\n\n    // This test should be run manually and independently of other tests with the `failpoints`\n    // feature enabled:\n    // ```sh\n    // cargo test --manifest-path quickwit/Cargo.toml -p quickwit-ingest --features failpoints -- test_ingester_persist_closes_shard_on_io_error\n    // ```\n    #[cfg(all(feature = \"failpoints\", not(feature = \"no-failpoints\")))]\n    #[tokio::test]\n    async fn test_ingester_persist_closes_shard_on_io_error() {\n        let scenario = fail::FailScenario::setup();\n        fail::cfg(\"ingester:append_records\", \"return\").unwrap();\n\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let solo_shard =\n            IngesterShard::new_solo(index_uid.clone(), source_id, ShardId::from(1)).build();\n        let queue_id = solo_shard.queue_id();\n        state_guard.shards.insert(queue_id.clone(), solo_shard);\n\n        state_guard\n            .mrecordlog\n            .create_queue(&queue_id)\n            .await\n            .unwrap();\n\n        let rate_limiter = RateLimiter::from_settings(RateLimiterSettings::default());\n        let rate_meter = RateMeter::default();\n        state_guard\n            .rate_trackers\n            .insert(queue_id.clone(), (rate_limiter, rate_meter));\n\n        drop(state_guard);\n\n        let persist_request = 
PersistRequest {\n            leader_id: \"test-ingester\".to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([r#\"test-doc-foo\"#])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(persist_failure.subrequest_id, 0);\n        assert_eq!(persist_failure.index_uid(), &index_uid);\n        assert_eq!(persist_failure.source_id, \"test-source\");\n        assert_eq!(\n            persist_failure.reason(),\n            PersistFailureReason::NodeUnavailable,\n        );\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        let shard = state_guard.shards.get(&queue_id).unwrap();\n        shard.assert_is_closed();\n\n        scenario.teardown();\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_deletes_dangling_shard() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapper = try_build_doc_mapper(\"{}\").unwrap();\n\n        // Insert a dangling shard, i.e. 
a shard without a corresponding queue.\n        let solo_shard =\n            IngesterShard::new_solo(index_uid.clone(), source_id.clone(), ShardId::from(1))\n                .with_doc_mapper(doc_mapper)\n                .build();\n        state_guard.shards.insert(solo_shard.queue_id(), solo_shard);\n        drop(state_guard);\n\n        let persist_request = PersistRequest {\n            leader_id: \"test-ingester\".to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-foo\"}\"#])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(persist_failure.subrequest_id, 0);\n        assert_eq!(persist_failure.index_uid(), &index_uid);\n        assert_eq!(persist_failure.source_id, \"test-source\");\n        assert_eq!(\n            persist_failure.reason(),\n            PersistFailureReason::NodeUnavailable\n        );\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_replicate() {\n        let (leader_ctx, leader) = IngesterForTest::default()\n            .with_node_id(\"test-leader\")\n            .with_replication()\n            .build()\n            .await;\n\n        let (follower_ctx, follower) = IngesterForTest::default()\n            .with_node_id(\"test-follower\")\n            
.with_ingester_pool(&leader_ctx.ingester_pool)\n            .with_replication()\n            .build()\n            .await;\n\n        let ingester_pool_entry = IngesterPoolEntry {\n            client: IngesterServiceClient::new(follower.clone()),\n            status: IngesterStatus::Ready,\n            availability_zone: None,\n        };\n\n        leader_ctx\n            .ingester_pool\n            .insert(follower_ctx.node_id.clone(), ingester_pool_entry);\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid2 = IndexUid::for_test(\"test-index\", 1);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![\n                InitShardSubrequest {\n                    subrequest_id: 0,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: leader_ctx.node_id.to_string(),\n                        follower_id: Some(follower_ctx.node_id.to_string()),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json: doc_mapping_json.clone(),\n                    validate_docs: true,\n                },\n                InitShardSubrequest {\n                    subrequest_id: 1,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid2.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: 
Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: leader_ctx.node_id.to_string(),\n                        follower_id: Some(follower_ctx.node_id.to_string()),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json,\n                    validate_docs: true,\n                },\n            ],\n        };\n        leader.init_shards(init_shards_request).await.unwrap();\n\n        let persist_request = PersistRequest {\n            leader_id: \"test-leader\".to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![\n                PersistSubrequest {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n                },\n                PersistSubrequest {\n                    subrequest_id: 1,\n                    index_uid: Some(index_uid2.clone()),\n                    source_id: \"test-source\".to_string(),\n                    doc_batch: Some(DocBatchV2::for_test([\n                        r#\"{\"doc\": \"test-doc-110\"}\"#,\n                        r#\"{\"doc\": \"test-doc-111\"}\"#,\n                    ])),\n                },\n            ],\n        };\n        let persist_response = leader.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-leader\");\n        assert_eq!(persist_response.successes.len(), 2);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_success_0 = &persist_response.successes[0];\n        assert_eq!(persist_success_0.subrequest_id, 0);\n        assert_eq!(persist_success_0.index_uid(), &index_uid);\n        
assert_eq!(persist_success_0.source_id, \"test-source\");\n        assert_eq!(persist_success_0.shard_id(), ShardId::from(1));\n        assert_eq!(\n            persist_success_0.replication_position_inclusive,\n            Some(Position::offset(1u64))\n        );\n\n        let persist_success_1 = &persist_response.successes[1];\n        assert_eq!(persist_success_1.subrequest_id, 1);\n        assert_eq!(persist_success_1.index_uid(), &index_uid2);\n        assert_eq!(persist_success_1.source_id, \"test-source\");\n        assert_eq!(persist_success_1.shard_id(), ShardId::from(1));\n        assert_eq!(\n            persist_success_1.replication_position_inclusive,\n            Some(Position::offset(2u64))\n        );\n\n        let leader_state_guard = leader.state.lock_fully().await.unwrap();\n        assert_eq!(leader_state_guard.shards.len(), 2);\n\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let primary_shard_01 = leader_state_guard.shards.get(&queue_id_01).unwrap();\n        primary_shard_01.assert_is_primary();\n        primary_shard_01.assert_is_open();\n        primary_shard_01.assert_replication_position(Position::offset(1u64));\n\n        leader_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_01,\n            ..,\n            &[(0, [0, 0], r#\"{\"doc\": \"test-doc-010\"}\"#), (1, [0, 1], \"\")],\n        );\n\n        let queue_id_11 = queue_id(&index_uid2, &source_id, &ShardId::from(1));\n        let primary_shard_11 = leader_state_guard.shards.get(&queue_id_11).unwrap();\n        primary_shard_11.assert_is_primary();\n        primary_shard_11.assert_is_open();\n        primary_shard_11.assert_replication_position(Position::offset(2u64));\n\n        leader_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_11,\n            ..,\n            &[\n                (0, [0, 0], r#\"{\"doc\": \"test-doc-110\"}\"#),\n                (1, [0, 0], r#\"{\"doc\": \"test-doc-111\"}\"#),\n       
         (2, [0, 1], \"\"),\n            ],\n        );\n\n        let follower_state_guard = follower.state.lock_fully().await.unwrap();\n        assert_eq!(follower_state_guard.shards.len(), 2);\n\n        let replica_shard_01 = follower_state_guard.shards.get(&queue_id_01).unwrap();\n        replica_shard_01.assert_is_replica();\n        replica_shard_01.assert_is_open();\n        replica_shard_01.assert_replication_position(Position::offset(1u64));\n\n        follower_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_01,\n            ..,\n            &[(0, [0, 0], r#\"{\"doc\": \"test-doc-010\"}\"#), (1, [0, 1], \"\")],\n        );\n\n        let replica_shard_11 = follower_state_guard.shards.get(&queue_id_11).unwrap();\n        replica_shard_11.assert_is_replica();\n        replica_shard_11.assert_is_open();\n        replica_shard_11.assert_replication_position(Position::offset(2u64));\n\n        follower_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_11,\n            ..,\n            &[\n                (0, [0, 0], r#\"{\"doc\": \"test-doc-110\"}\"#),\n                (1, [0, 0], r#\"{\"doc\": \"test-doc-111\"}\"#),\n                (2, [0, 1], \"\"),\n            ],\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_replicate_grpc() {\n        let (leader_ctx, leader) = IngesterForTest::default()\n            .with_node_id(\"test-leader\")\n            .with_replication()\n            .build()\n            .await;\n\n        let leader_grpc_server_adapter = IngesterServiceGrpcServerAdapter::new(leader.clone());\n        let leader_grpc_server = IngesterServiceGrpcServer::new(leader_grpc_server_adapter);\n        let leader_socket_addr: SocketAddr = \"127.0.0.1:6666\".parse().unwrap();\n\n        tokio::spawn({\n            async move {\n                Server::builder()\n                    .add_service(leader_grpc_server)\n                    .serve(leader_socket_addr)\n                    .await\n  
                  .unwrap();\n            }\n        });\n\n        let (follower_ctx, follower) = IngesterForTest::default()\n            .with_node_id(\"test-follower\")\n            .with_ingester_pool(&leader_ctx.ingester_pool)\n            .with_replication()\n            .build()\n            .await;\n\n        let follower_grpc_server_adapter = IngesterServiceGrpcServerAdapter::new(follower.clone());\n        let follower_grpc_server = IngesterServiceGrpcServer::new(follower_grpc_server_adapter);\n        let follower_socket_addr: SocketAddr = \"127.0.0.1:7777\".parse().unwrap();\n\n        tokio::spawn({\n            async move {\n                Server::builder()\n                    .add_service(follower_grpc_server)\n                    .serve(follower_socket_addr)\n                    .await\n                    .unwrap();\n            }\n        });\n        let follower_channel = Endpoint::from_static(\"http://127.0.0.1:7777\").connect_lazy();\n        let follower_client = IngesterServiceClient::from_channel(\n            \"127.0.0.1:7777\".parse().unwrap(),\n            follower_channel,\n            MAX_GRPC_MESSAGE_SIZE,\n            None,\n        );\n\n        let ingester_pool_entry = IngesterPoolEntry {\n            client: follower_client,\n            status: IngesterStatus::Ready,\n            availability_zone: None,\n        };\n\n        leader_ctx\n            .ingester_pool\n            .insert(follower_ctx.node_id.clone(), ingester_pool_entry);\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid2 = IndexUid::for_test(\"test-index\", 1);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![\n             
   InitShardSubrequest {\n                    subrequest_id: 0,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: leader_ctx.node_id.to_string(),\n                        follower_id: Some(follower_ctx.node_id.to_string()),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json: doc_mapping_json.clone(),\n                    validate_docs: true,\n                },\n                InitShardSubrequest {\n                    subrequest_id: 1,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid2.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: leader_ctx.node_id.to_string(),\n                        follower_id: Some(follower_ctx.node_id.to_string()),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json,\n                    validate_docs: true,\n                },\n            ],\n        };\n        leader.init_shards(init_shards_request).await.unwrap();\n\n        let persist_request = PersistRequest {\n            leader_id: \"test-leader\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![\n                PersistSubrequest {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    doc_batch: 
Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n                },\n                PersistSubrequest {\n                    subrequest_id: 1,\n                    index_uid: Some(index_uid2.clone()),\n                    source_id: \"test-source\".to_string(),\n                    doc_batch: Some(DocBatchV2::for_test([\n                        r#\"{\"doc\": \"test-doc-110\"}\"#,\n                        r#\"{\"doc\": \"test-doc-111\"}\"#,\n                    ])),\n                },\n            ],\n        };\n        let persist_response = leader.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-leader\");\n        assert_eq!(persist_response.successes.len(), 2);\n        assert_eq!(persist_response.failures.len(), 0);\n\n        let persist_success_0 = &persist_response.successes[0];\n        assert_eq!(persist_success_0.subrequest_id, 0);\n        assert_eq!(persist_success_0.index_uid(), &index_uid);\n        assert_eq!(persist_success_0.source_id, \"test-source\");\n        assert_eq!(persist_success_0.shard_id(), ShardId::from(1));\n        assert_eq!(\n            persist_success_0.replication_position_inclusive,\n            Some(Position::offset(0u64))\n        );\n\n        let persist_success_1 = &persist_response.successes[1];\n        assert_eq!(persist_success_1.subrequest_id, 1);\n        assert_eq!(persist_success_1.index_uid(), &index_uid2);\n        assert_eq!(persist_success_1.source_id, \"test-source\");\n        assert_eq!(persist_success_1.shard_id(), ShardId::from(1));\n        assert_eq!(\n            persist_success_1.replication_position_inclusive,\n            Some(Position::offset(1u64))\n        );\n\n        let leader_state_guard = leader.state.lock_fully().await.unwrap();\n        assert_eq!(leader_state_guard.shards.len(), 2);\n\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let primary_shard_01 = 
leader_state_guard.shards.get(&queue_id_01).unwrap();\n        primary_shard_01.assert_is_primary();\n        primary_shard_01.assert_is_open();\n        primary_shard_01.assert_replication_position(Position::offset(0u64));\n\n        leader_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_01,\n            ..,\n            &[(0, [0, 0], r#\"{\"doc\": \"test-doc-010\"}\"#)],\n        );\n\n        let queue_id_11 = queue_id(&index_uid2, &source_id, &ShardId::from(1));\n        let primary_shard_11 = leader_state_guard.shards.get(&queue_id_11).unwrap();\n        primary_shard_11.assert_is_primary();\n        primary_shard_11.assert_is_open();\n        primary_shard_11.assert_replication_position(Position::offset(1u64));\n\n        leader_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_11,\n            ..,\n            &[\n                (0, [0, 0], r#\"{\"doc\": \"test-doc-110\"}\"#),\n                (1, [0, 0], r#\"{\"doc\": \"test-doc-111\"}\"#),\n            ],\n        );\n\n        let follower_state_guard = follower.state.lock_fully().await.unwrap();\n        assert_eq!(follower_state_guard.shards.len(), 2);\n\n        let replica_shard_01 = follower_state_guard.shards.get(&queue_id_01).unwrap();\n        replica_shard_01.assert_is_replica();\n        replica_shard_01.assert_is_open();\n        replica_shard_01.assert_replication_position(Position::offset(0u64));\n\n        follower_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_01,\n            ..,\n            &[(0, [0, 0], r#\"{\"doc\": \"test-doc-010\"}\"#)],\n        );\n\n        let replica_shard_11 = follower_state_guard.shards.get(&queue_id_11).unwrap();\n        replica_shard_11.assert_is_replica();\n        replica_shard_11.assert_is_open();\n        replica_shard_11.assert_replication_position(Position::offset(1u64));\n\n        follower_state_guard.mrecordlog.assert_records_eq(\n            &queue_id_11,\n            ..,\n            &[\n         
       (0, [0, 0], r#\"{\"doc\": \"test-doc-110\"}\"#),\n                (1, [0, 0], r#\"{\"doc\": \"test-doc-111\"}\"#),\n            ],\n        );\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_no_available_shards() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let solo_shard =\n            IngesterShard::new_solo(index_uid.clone(), source_id.clone(), ShardId::from(1))\n                .with_state(ShardState::Closed)\n                .build();\n        let queue_id = solo_shard.queue_id();\n        ingester\n            .state\n            .lock_fully()\n            .await\n            .unwrap()\n            .shards\n            .insert(queue_id.clone(), solo_shard);\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(persist_failure.subrequest_id, 0);\n        assert_eq!(persist_failure.index_uid(), &index_uid);\n        assert_eq!(persist_failure.source_id, \"test-source\");\n        assert_eq!(\n            persist_failure.reason(),\n            PersistFailureReason::NoShardsAvailable\n        );\n\n        let 
state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 1);\n\n        let solo_shard = state_guard.shards.get(&queue_id).unwrap();\n        solo_shard.assert_is_solo();\n        solo_shard.assert_is_closed();\n        solo_shard.assert_replication_position(Position::Beginning);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_rate_limited() {\n        let (ingester_ctx, ingester) = IngesterForTest::default()\n            .with_rate_limiter_settings(RateLimiterSettings {\n                burst_limit: 0,\n                rate_limit: ConstantRate::bytes_per_sec(ByteSize(0)),\n                refill_period: Duration::from_millis(100),\n            })\n            .build()\n            .await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let primary_shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: ingester_ctx.node_id.to_string(),\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                primary_shard,\n                &doc_mapping_json,\n                Instant::now(),\n                true,\n            )\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        let persist_request = PersistRequest {\n            leader_id: 
ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(persist_failure.subrequest_id, 0);\n        assert_eq!(persist_failure.index_uid(), &index_uid);\n        assert_eq!(persist_failure.source_id, \"test-source\");\n        assert_eq!(\n            persist_failure.reason(),\n            PersistFailureReason::NoShardsAvailable\n        );\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 1);\n\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n\n        let solo_shard_01 = state_guard.shards.get(&queue_id_01).unwrap();\n        solo_shard_01.assert_is_solo();\n        solo_shard_01.assert_is_open();\n        solo_shard_01.assert_replication_position(Position::Beginning);\n\n        state_guard\n            .mrecordlog\n            .assert_records_eq(&queue_id_01, .., &[]);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_resource_exhausted() {\n        let (ingester_ctx, ingester) = IngesterForTest::default()\n            .with_disk_capacity(ByteSize(0))\n            .build()\n            .await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = 
DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let primary_shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: ingester_ctx.node_id.to_string(),\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                primary_shard,\n                &doc_mapping_json,\n                Instant::now(),\n                true,\n            )\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![PersistSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n            }],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.leader_id, \"test-ingester\");\n        assert_eq!(persist_response.successes.len(), 0);\n        assert_eq!(persist_response.failures.len(), 1);\n\n        let persist_failure = &persist_response.failures[0];\n        assert_eq!(persist_failure.subrequest_id, 0);\n        assert_eq!(persist_failure.index_uid(), &index_uid);\n        assert_eq!(persist_failure.source_id, \"test-source\");\n        
assert_eq!(persist_failure.reason(), PersistFailureReason::WalFull);\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 1);\n\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let solo_shard_01 = state_guard.shards.get(&queue_id_01).unwrap();\n        solo_shard_01.assert_is_solo();\n        solo_shard_01.assert_is_open();\n        solo_shard_01.assert_replication_position(Position::Beginning);\n\n        state_guard\n            .mrecordlog\n            .assert_records_eq(&queue_id_01, .., &[]);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_persist_returns_routing_update() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid_0 = IndexUid::for_test(\"test-index-0\", 0);\n        let index_uid_1 = IndexUid::for_test(\"test-index-1\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let init_shards_request = InitShardsRequest {\n            subrequests: vec![\n                InitShardSubrequest {\n                    subrequest_id: 0,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid_0.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: ingester_ctx.node_id.to_string(),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json: doc_mapping_json.clone(),\n                    validate_docs: false,\n                },\n                
InitShardSubrequest {\n                    subrequest_id: 1,\n                    shard: Some(Shard {\n                        index_uid: Some(index_uid_1.clone()),\n                        source_id: source_id.clone(),\n                        shard_id: Some(ShardId::from(1)),\n                        shard_state: ShardState::Open as i32,\n                        leader_id: ingester_ctx.node_id.to_string(),\n                        doc_mapping_uid: Some(doc_mapping_uid),\n                        ..Default::default()\n                    }),\n                    doc_mapping_json,\n                    validate_docs: false,\n                },\n            ],\n        };\n        ingester.init_shards(init_shards_request).await.unwrap();\n\n        let persist_request = PersistRequest {\n            leader_id: ingester_ctx.node_id.to_string(),\n            commit_type: CommitTypeV2::Force as i32,\n            subrequests: vec![\n                PersistSubrequest {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid_0.clone()),\n                    source_id: source_id.clone(),\n                    doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-010\"}\"#])),\n                },\n                PersistSubrequest {\n                    subrequest_id: 1,\n                    index_uid: Some(index_uid_1.clone()),\n                    source_id: source_id.clone(),\n                    doc_batch: Some(DocBatchV2::for_test([r#\"{\"doc\": \"test-doc-110\"}\"#])),\n                },\n            ],\n        };\n        let persist_response = ingester.persist(persist_request).await.unwrap();\n        assert_eq!(persist_response.successes.len(), 2);\n\n        let routing_update = persist_response\n            .routing_update\n            .expect(\"routing update should be present\");\n\n        assert!(\n            routing_update.capacity_score > 0,\n            \"capacity score should be non-zero after a small persist\"\n     
   );\n\n        let mut source_shard_updates = routing_update.source_shard_updates;\n        source_shard_updates.sort_by(|a, b| a.index_uid().cmp(b.index_uid()));\n\n        assert_eq!(source_shard_updates.len(), 2);\n        assert_eq!(source_shard_updates[0].index_uid(), &index_uid_0);\n        assert_eq!(source_shard_updates[0].source_id, source_id.as_str());\n        assert_eq!(source_shard_updates[0].open_shard_count, 1);\n        assert_eq!(source_shard_updates[1].index_uid(), &index_uid_1);\n        assert_eq!(source_shard_updates[1].source_id, source_id.as_str());\n        assert_eq!(source_shard_updates[1].open_shard_count, 1);\n\n        assert!(routing_update.closed_shards.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_ingester_open_replication_stream() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default()\n            .with_node_id(\"test-follower\")\n            .build()\n            .await;\n\n        let (syn_replication_stream_tx, syn_replication_stream) = ServiceStream::new_bounded(5);\n        let open_stream_request = OpenReplicationStreamRequest {\n            leader_id: \"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            replication_seqno: 0,\n        };\n        let syn_replication_message = SynReplicationMessage::new_open_request(open_stream_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let mut ack_replication_stream = ingester\n            .open_replication_stream(syn_replication_stream)\n            .await\n            .unwrap();\n        ack_replication_stream\n            .next()\n            .await\n            .unwrap()\n            .unwrap()\n            .into_open_response()\n            .unwrap();\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert!(state_guard.replication_tasks.contains_key(\"test-leader\"));\n    }\n\n    
#[tokio::test]\n    async fn test_ingester_open_fetch_stream() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: \"test-client\".to_string(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1337)),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let error = ingester\n            .open_fetch_stream(open_fetch_stream_request)\n            .await\n            .unwrap_err();\n        assert!(\n            matches!(error, IngestV2Error::ShardNotFound { shard_id } if shard_id == ShardId::from(1337))\n        );\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let queue_id = queue_id(&index_uid, &source_id, &ShardId::from(1));\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard,\n                &doc_mapping_json,\n                Instant::now(),\n                true,\n            )\n            .await\n            .unwrap();\n\n        let records = [MRecord::new_doc(\"test-doc-foo\").encode()].into_iter();\n\n        state_guard\n          
  .mrecordlog\n            .append_records(&queue_id, None, records)\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: \"test-client\".to_string(),\n            index_uid: Some(index_uid.clone()),\n            source_id,\n            shard_id: Some(ShardId::from(1)),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let mut fetch_stream = ingester\n            .open_fetch_stream(open_fetch_stream_request)\n            .await\n            .unwrap();\n\n        let fetch_response = fetch_stream.next().await.unwrap().unwrap();\n        let fetch_payload = into_fetch_payload(fetch_response);\n\n        assert_eq!(fetch_payload.from_position_exclusive(), Position::Beginning);\n        assert_eq!(\n            fetch_payload.to_position_inclusive(),\n            Position::offset(0u64)\n        );\n\n        let mrecord_batch = fetch_payload.mrecord_batch.unwrap();\n        assert_eq!(\n            mrecord_batch.mrecord_buffer,\n            Bytes::from_static(b\"\\0\\0test-doc-foo\")\n        );\n        assert_eq!(mrecord_batch.mrecord_lengths, [14]);\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        let records = [MRecord::new_doc(\"test-doc-bar\").encode()].into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id, None, records)\n            .await\n            .unwrap();\n\n        let shard = state_guard.shards.get(&queue_id).unwrap();\n        assert!(shard.is_advertisable);\n        shard.notify_shard_status();\n        drop(state_guard);\n\n        let fetch_response = fetch_stream.next().await.unwrap().unwrap();\n        let fetch_payload = into_fetch_payload(fetch_response);\n\n        assert_eq!(\n            fetch_payload.from_position_exclusive(),\n            Position::offset(0u64)\n        );\n        assert_eq!(\n            
fetch_payload.to_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let mrecord_batch = fetch_payload.mrecord_batch.unwrap();\n        assert_eq!(\n            mrecord_batch.mrecord_buffer,\n            Bytes::from_static(b\"\\0\\0test-doc-bar\")\n        );\n        assert_eq!(mrecord_batch.mrecord_lengths, [14]);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_truncate_shards() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n        let queue_id_02 = queue_id(&index_uid, &source_id, &ShardId::from(2));\n\n        let doc_mapping_uid_01 = DocMappingUid::random();\n        let doc_mapping_json_01 = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid_01}\"\n            }}\"#\n        );\n        let shard_01 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid_01),\n            ..Default::default()\n        };\n\n        let doc_mapping_uid_02 = DocMappingUid::random();\n        let doc_mapping_json_02 = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid_02}\"\n            }}\"#\n        );\n        let shard_02 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(doc_mapping_uid_02),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let now = Instant::now();\n\n        
ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_01,\n                &doc_mapping_json_01,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_02,\n                &doc_mapping_json_02,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n\n        assert_eq!(state_guard.shards.len(), 2);\n        assert_eq!(state_guard.doc_mappers.len(), 2);\n\n        let records = [\n            MRecord::new_doc(\"test-doc-foo\").encode(),\n            MRecord::new_doc(\"test-doc-bar\").encode(),\n        ]\n        .into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id_01, None, records)\n            .await\n            .unwrap();\n\n        let records = [MRecord::new_doc(\"test-doc-baz\").encode()].into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id_02, None, records)\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        let truncate_shards_request = TruncateShardsRequest {\n            ingester_id: ingester_ctx.node_id.to_string(),\n            subrequests: vec![\n                TruncateShardsSubrequest {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(1)),\n                    truncate_up_to_position_inclusive: Some(Position::offset(0u64)),\n                },\n                TruncateShardsSubrequest {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(2)),\n               
     truncate_up_to_position_inclusive: Some(Position::eof(0u64)),\n                },\n                TruncateShardsSubrequest {\n                    index_uid: Some(IndexUid::for_test(\"test-index\", 1337)),\n                    source_id,\n                    shard_id: Some(ShardId::from(1337)),\n                    truncate_up_to_position_inclusive: Some(Position::offset(1337u64)),\n                },\n            ],\n        };\n        ingester\n            .truncate_shards(truncate_shards_request.clone())\n            .await\n            .unwrap();\n\n        // Verify idempotency.\n        ingester\n            .truncate_shards(truncate_shards_request)\n            .await\n            .unwrap();\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n\n        assert_eq!(state_guard.shards.len(), 1);\n        assert_eq!(state_guard.doc_mappers.len(), 1);\n\n        assert!(state_guard.shards.contains_key(&queue_id_01));\n        assert!(state_guard.doc_mappers.contains_key(&doc_mapping_uid_01));\n\n        state_guard\n            .mrecordlog\n            .assert_records_eq(&queue_id_01, .., &[(1, [0, 0], \"test-doc-bar\")]);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_truncate_shards_deletes_dangling_shards() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let solo_shard =\n            IngesterShard::new_solo(index_uid.clone(), source_id.clone(), ShardId::from(1)).build();\n        state_guard.shards.insert(solo_shard.queue_id(), solo_shard);\n        drop(state_guard);\n\n        let truncate_shards_request = TruncateShardsRequest {\n            ingester_id: ingester_ctx.node_id.to_string(),\n            subrequests: vec![TruncateShardsSubrequest {\n                index_uid: 
Some(index_uid.clone()),\n                source_id,\n                shard_id: Some(ShardId::from(1)),\n                truncate_up_to_position_inclusive: Some(Position::offset(0u64)),\n            }],\n        };\n        ingester\n            .truncate_shards(truncate_shards_request.clone())\n            .await\n            .unwrap();\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_reset_shards() {\n        let mut mock_control_plane = MockControlPlaneService::new();\n        mock_control_plane\n            .expect_advise_reset_shards()\n            .once()\n            .returning(|_| Ok(AdviseResetShardsResponse::default()));\n\n        mock_control_plane\n            .expect_advise_reset_shards()\n            .once()\n            .returning(|mut request| {\n                assert_eq!(request.ingester_id, \"test-ingester\");\n                assert_eq!(request.shard_ids.len(), 1);\n                assert_eq!(request.shard_ids[0].index_uid(), &(\"test-index\", 0));\n                assert_eq!(request.shard_ids[0].source_id, \"test-source\");\n                request.shard_ids[0].shard_ids.sort_unstable();\n                assert_eq!(\n                    request.shard_ids[0].shard_ids,\n                    [ShardId::from(1), ShardId::from(2)]\n                );\n                let response = AdviseResetShardsResponse {\n                    shards_to_delete: vec![ShardIds {\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        shard_ids: vec![ShardId::from(1)],\n                    }],\n                    shards_to_truncate: vec![ShardIdPositions {\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        
shard_positions: vec![ShardIdPosition {\n                            shard_id: Some(ShardId::from(2)),\n                            publish_position_inclusive: Some(Position::offset(1u64)),\n                        }],\n                    }],\n                };\n                Ok(response)\n            });\n        let control_plane = ControlPlaneServiceClient::from_mock(mock_control_plane);\n\n        let (_ingester_ctx, mut ingester) = IngesterForTest::default()\n            .with_control_plane(control_plane)\n            .build()\n            .await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard_01 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let shard_02 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let queue_id_02 = queue_id(&index_uid, &source_id, &ShardId::from(2));\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let now = Instant::now();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_01,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n    
        .await\n            .unwrap();\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_02,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n\n        let records = [\n            MRecord::new_doc(\"test-doc-foo\").encode(),\n            MRecord::new_doc(\"test-doc-bar\").encode(),\n        ]\n        .into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id_02, None, records)\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        ingester.reset_shards().await;\n\n        let state_guard = ingester.state.lock_partially().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 1);\n\n        let shard_02 = state_guard.shards.get(&queue_id_02).unwrap();\n        shard_02.assert_truncation_position(Position::offset(1u64));\n    }\n\n    #[tokio::test]\n    async fn test_ingester_retain_shards() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard_17 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(17)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n\n        let shard_18 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: 
Some(ShardId::from(18)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let queue_id_17 = queue_id(\n            shard_17.index_uid(),\n            &shard_17.source_id,\n            shard_17.shard_id(),\n        );\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let now = Instant::now();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_17,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_18,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        {\n            let state_guard = ingester.state.lock_fully().await.unwrap();\n            assert_eq!(state_guard.shards.len(), 2);\n        }\n\n        let retain_shards_request = RetainShardsRequest {\n            retain_shards_for_sources: vec![RetainShardsForSource {\n                index_uid: Some(index_uid.clone()),\n                source_id,\n                shard_ids: vec![ShardId::from(17u64)],\n            }],\n        };\n        ingester.retain_shards(retain_shards_request).await.unwrap();\n\n        {\n            let state_guard = ingester.state.lock_fully().await.unwrap();\n            assert_eq!(state_guard.shards.len(), 1);\n            assert!(state_guard.shards.contains_key(&queue_id_17));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_ingester_close_shards() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        
let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let queue_id = queue_id(&index_uid, &source_id, &ShardId::from(1));\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            publish_position_inclusive: Some(Position::Beginning),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard,\n                &doc_mapping_json,\n                Instant::now(),\n                true,\n            )\n            .await\n            .unwrap();\n        drop(state_guard);\n\n        let open_fetch_stream_request = OpenFetchStreamRequest {\n            client_id: \"test-client\".to_string(),\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            from_position_exclusive: Some(Position::Beginning),\n        };\n        let mut fetch_stream = ingester\n            .open_fetch_stream(open_fetch_stream_request)\n            .await\n            .unwrap();\n\n        let close_shards_request = CloseShardsRequest {\n            shard_pkeys: vec![\n                ShardPKey {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: source_id.clone(),\n                    shard_id: Some(ShardId::from(1)),\n                },\n            
    ShardPKey {\n                    index_uid: Some(index_uid.clone()),\n                    source_id,\n                    shard_id: Some(ShardId::from(1337)),\n                },\n            ],\n        };\n        let closed_shards_response = ingester\n            .close_shards(close_shards_request.clone())\n            .await\n            .unwrap();\n        assert_eq!(closed_shards_response.successes.len(), 1);\n\n        let close_shard_success = &closed_shards_response.successes[0];\n        assert_eq!(close_shard_success.index_uid(), &index_uid);\n        assert_eq!(close_shard_success.source_id, \"test-source\");\n        assert_eq!(close_shard_success.shard_id(), ShardId::from(1));\n\n        // Verify idempotency.\n        ingester\n            .close_shards(close_shards_request.clone())\n            .await\n            .unwrap();\n\n        let state_guard = ingester.state.lock_partially().await.unwrap();\n        let shard = state_guard.shards.get(&queue_id).unwrap();\n        shard.assert_is_closed();\n\n        let fetch_response = timeout(Duration::from_millis(100), fetch_stream.next())\n            .await\n            .unwrap()\n            .unwrap()\n            .unwrap();\n        let fetch_eof = into_fetch_eof(fetch_response);\n\n        assert_eq!(fetch_eof.eof_position(), Position::Beginning.as_eof());\n    }\n\n    #[tokio::test]\n    async fn test_ingester_open_observation_stream() {\n        let (ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let mut observation_stream = ingester\n            .open_observation_stream(OpenObservationStreamRequest {})\n            .await\n            .unwrap();\n        let observation = observation_stream.next().await.unwrap().unwrap();\n        assert_eq!(observation.node_id, ingester_ctx.node_id);\n        assert_eq!(observation.status(), IngesterStatus::Ready);\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        state_guard\n            
.set_status(IngesterStatus::Decommissioning)\n            .await;\n        drop(state_guard);\n\n        let observation = observation_stream.next().await.unwrap().unwrap();\n        assert_eq!(observation.node_id, ingester_ctx.node_id);\n        assert_eq!(observation.status(), IngesterStatus::Decommissioning);\n\n        drop(ingester);\n\n        let observation_opt = observation_stream.next().await;\n        assert!(observation_opt.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_ingester_decommission() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let shard = IngesterShard::new_solo(index_uid, source_id, ShardId::from(1)).build();\n        let queue_id = shard.queue_id();\n\n        state_guard.shards.insert(queue_id.clone(), shard);\n        drop(state_guard);\n\n        let mut observation_stream = ingester\n            .open_observation_stream(OpenObservationStreamRequest {})\n            .await\n            .unwrap();\n\n        ingester.decommission(DecommissionRequest {}).await.unwrap();\n\n        let next_observation = observation_stream.next().await.unwrap().unwrap();\n        let next_status = next_observation.status();\n        assert_eq!(next_status, IngesterStatus::Retiring);\n\n        wait_for_ingester_status(\n            &ingester,\n            IngesterStatus::Decommissioning,\n            Duration::from_secs(1),\n        )\n        .await\n        .unwrap();\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        let shard = state_guard.shards.get(&queue_id).unwrap();\n        shard.assert_is_closed();\n    }\n\n    #[tokio::test]\n    async fn test_check_decommissioning_status() {\n        let (_ingester_ctx, ingester) = 
IngesterForTest::default().build().await;\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n\n        ingester\n            .check_decommissioning_status(&mut state_guard)\n            .await;\n        assert_eq!(state_guard.status(), IngesterStatus::Ready);\n\n        state_guard\n            .set_status(IngesterStatus::Decommissioning)\n            .await;\n        ingester\n            .check_decommissioning_status(&mut state_guard)\n            .await;\n        assert_eq!(state_guard.status(), IngesterStatus::Decommissioned);\n\n        state_guard\n            .set_status(IngesterStatus::Decommissioning)\n            .await;\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let solo_shard = IngesterShard::new_solo(index_uid.clone(), source_id, ShardId::from(1))\n            .with_state(ShardState::Closed)\n            .with_replication_position_inclusive(Position::offset(12u64))\n            .build();\n        let queue_id = solo_shard.queue_id();\n\n        state_guard.shards.insert(queue_id.clone(), solo_shard);\n        ingester\n            .check_decommissioning_status(&mut state_guard)\n            .await;\n        assert_eq!(state_guard.status(), IngesterStatus::Decommissioning);\n\n        let shard = state_guard.shards.get_mut(&queue_id).unwrap();\n        shard.truncation_position_inclusive = Position::Beginning.as_eof();\n\n        ingester\n            .check_decommissioning_status(&mut state_guard)\n            .await;\n        assert_eq!(state_guard.status(), IngesterStatus::Decommissioned);\n    }\n\n    #[tokio::test]\n    async fn test_ingester_truncate_on_shard_positions_update() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n        let event_broker = EventBroker::default();\n        ingester.subscribe(&event_broker);\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        
let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard_01 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n\n        let shard_02 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let queue_id_02 = queue_id(&index_uid, &source_id, &ShardId::from(2));\n\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let now = Instant::now();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_01,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_02,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n\n        let records = [\n            MRecord::new_doc(\"test-doc-foo\").encode(),\n            MRecord::new_doc(\"test-doc-bar\").encode(),\n        ]\n        .into_iter();\n\n        state_guard\n        
    .mrecordlog\n            .append_records(&queue_id_01, None, records)\n            .await\n            .unwrap();\n\n        let records = [MRecord::new_doc(\"test-doc-baz\").encode()].into_iter();\n\n        state_guard\n            .mrecordlog\n            .append_records(&queue_id_02, None, records)\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        let shard_position_update = ShardPositionsUpdate {\n            source_uid: SourceUid {\n                index_uid: index_uid.clone(),\n                source_id,\n            },\n            updated_shard_positions: vec![\n                (ShardId::from(1), Position::offset(0u64)),\n                (ShardId::from(2), Position::eof(0u64)),\n                (ShardId::from(1337), Position::offset(1337u64)),\n            ],\n        };\n        event_broker.publish(shard_position_update.clone());\n\n        // Verify idempotency.\n        event_broker.publish(shard_position_update);\n\n        // Yield so that the event is processed.\n        yield_now().await;\n\n        let state_guard = ingester.state.lock_fully().await.unwrap();\n        assert_eq!(state_guard.shards.len(), 1);\n\n        assert!(state_guard.shards.contains_key(&queue_id_01));\n\n        state_guard\n            .mrecordlog\n            .assert_records_eq(&queue_id_01, .., &[(1, [0, 0], \"test-doc-bar\")]);\n\n        assert!(!state_guard.shards.contains_key(&queue_id_02));\n        assert!(!state_guard.mrecordlog.queue_exists(&queue_id_02));\n    }\n\n    #[tokio::test]\n    async fn test_ingester_closes_idle_shards() {\n        // The `CloseIdleShardsTask` task is already unit tested, so this test ensures the task is\n        // correctly spawned upon starting an ingester.\n        let idle_shard_timeout = Duration::from_millis(200);\n        let (_ingester_ctx, ingester) = IngesterForTest::default()\n            .with_idle_shard_timeout(idle_shard_timeout)\n            .build()\n            .await;\n\n        
let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n        let queue_id_01 = queue_id(&index_uid, &source_id, &ShardId::from(1));\n\n        let doc_mapping_uid = DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard_01 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id,\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let now = Instant::now();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_01,\n                &doc_mapping_json,\n                now - idle_shard_timeout,\n                true,\n            )\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        for _ in 0..10 {\n            tokio::time::sleep(Duration::from_millis(100)).await;\n\n            let state_guard = ingester.state.lock_partially().await.unwrap();\n            let shard = state_guard.shards.get(&queue_id_01).unwrap();\n\n            if shard.is_closed() {\n                return;\n            }\n            drop(state_guard);\n        }\n        panic!(\"idle shard was not closed\");\n    }\n\n    #[tokio::test]\n    async fn test_ingester_debug_info() {\n        let (_ingester_ctx, ingester) = IngesterForTest::default().build().await;\n\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        let index_uid_1: IndexUid = IndexUid::for_test(\"test-index-1\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        let doc_mapping_uid = 
DocMappingUid::random();\n        let doc_mapping_json = format!(\n            r#\"{{\n                \"doc_mapping_uid\": \"{doc_mapping_uid}\"\n            }}\"#\n        );\n        let shard_01 = Shard {\n            index_uid: Some(index_uid_0.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let shard_02 = Shard {\n            index_uid: Some(index_uid_0.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let shard_03 = Shard {\n            index_uid: Some(index_uid_1.clone()),\n            source_id,\n            shard_id: Some(ShardId::from(3)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(doc_mapping_uid),\n            ..Default::default()\n        };\n        let mut state_guard = ingester.state.lock_fully().await.unwrap();\n        let now = Instant::now();\n\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_01,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n                shard_02,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n        ingester\n            .init_primary_shard(\n                &mut state_guard.inner,\n                &mut state_guard.mrecordlog,\n     
           shard_03,\n                &doc_mapping_json,\n                now,\n                true,\n            )\n            .await\n            .unwrap();\n        drop(state_guard);\n\n        let debug_info = ingester.debug_info().await;\n        assert_eq!(debug_info[\"status\"], \"ready\");\n\n        let shards = &debug_info[\"shards\"];\n        assert_eq!(shards.as_object().unwrap().len(), 2);\n\n        assert_eq!(\n            shards[\"test-index-0:00000000000000000000000000\"]\n                .as_array()\n                .unwrap()\n                .len(),\n            2\n        );\n        assert_eq!(\n            shards[\"test-index-1:00000000000000000000000000\"]\n                .as_array()\n                .unwrap()\n                .len(),\n            1\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse mrecordlog::ResourceUsage;\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    Histogram, HistogramVec, IntCounter, IntCounterVec, IntGauge, IntGaugeVec, exponential_buckets,\n    linear_buckets, new_counter_vec, new_gauge, new_gauge_vec, new_histogram, new_histogram_vec,\n};\n\n// Counter vec counting the different outcomes of ingest requests as\n// measure at the end of the router work.\n//\n// The counter are counting persist subrequests.\npub(crate) struct IngestResultMetrics {\n    pub success: IntCounter,\n    pub circuit_breaker: IntCounter,\n    pub unspecified: IntCounter,\n    pub index_not_found: IntCounter,\n    pub source_not_found: IntCounter,\n    pub internal: IntCounter,\n    pub no_shards_available: IntCounter,\n    pub shard_rate_limited: IntCounter,\n    pub wal_full: IntCounter,\n    pub timeout: IntCounter,\n    pub router_timeout: IntCounter,\n    pub router_load_shedding: IntCounter,\n    pub load_shedding: IntCounter,\n    pub shard_not_found: IntCounter,\n    pub unavailable: IntCounter,\n}\n\nimpl Default for IngestResultMetrics {\n    fn default() -> Self {\n        let ingest_result_total_vec = new_counter_vec::<1>(\n            \"ingest_result_total\",\n            \"Number of ingest requests by result\",\n            \"ingest\",\n            &[],\n            [\"result\"],\n        );\n        
Self {\n            success: ingest_result_total_vec.with_label_values([\"success\"]),\n            circuit_breaker: ingest_result_total_vec.with_label_values([\"circuit_breaker\"]),\n            unspecified: ingest_result_total_vec.with_label_values([\"unspecified\"]),\n            index_not_found: ingest_result_total_vec.with_label_values([\"index_not_found\"]),\n            source_not_found: ingest_result_total_vec.with_label_values([\"source_not_found\"]),\n            internal: ingest_result_total_vec.with_label_values([\"internal\"]),\n            no_shards_available: ingest_result_total_vec.with_label_values([\"no_shards_available\"]),\n            shard_rate_limited: ingest_result_total_vec.with_label_values([\"shard_rate_limited\"]),\n            wal_full: ingest_result_total_vec.with_label_values([\"wal_full\"]),\n            timeout: ingest_result_total_vec.with_label_values([\"timeout\"]),\n            router_timeout: ingest_result_total_vec.with_label_values([\"router_timeout\"]),\n            router_load_shedding: ingest_result_total_vec\n                .with_label_values([\"router_load_shedding\"]),\n            load_shedding: ingest_result_total_vec.with_label_values([\"load_shedding\"]),\n            unavailable: ingest_result_total_vec.with_label_values([\"unavailable\"]),\n            shard_not_found: ingest_result_total_vec.with_label_values([\"shard_not_found\"]),\n        }\n    }\n}\n\npub(super) struct IngestV2Metrics {\n    pub reset_shards_operations_total: IntCounterVec<1>,\n    pub open_shards: IntGauge,\n    pub closed_shards: IntGauge,\n    pub shard_lt_throughput_mib: Histogram,\n    pub shard_st_throughput_mib: Histogram,\n    pub wal_acquire_lock_requests_in_flight: IntGaugeVec<2>,\n    pub wal_acquire_lock_request_duration_secs: HistogramVec<2>,\n    pub wal_disk_used_bytes: IntGauge,\n    pub wal_memory_used_bytes: IntGauge,\n    pub ingest_results: IngestResultMetrics,\n    pub ingest_attempts: IntCounterVec<1>,\n}\n\nimpl 
Default for IngestV2Metrics {\n    fn default() -> Self {\n        Self {\n            ingest_results: IngestResultMetrics::default(),\n            ingest_attempts: new_counter_vec::<1>(\n                \"ingest_attempts\",\n                \"Number of routing attempts by AZ locality\",\n                \"ingest\",\n                &[],\n                [\"az_routing\"],\n            ),\n            reset_shards_operations_total: new_counter_vec(\n                \"reset_shards_operations_total\",\n                \"Total number of reset shards operations performed.\",\n                \"ingest\",\n                &[],\n                [\"status\"],\n            ),\n            open_shards: new_gauge(\n                \"shards\",\n                \"Number of shards hosted by the ingester.\",\n                \"ingest\",\n                &[(\"state\", \"open\")],\n            ),\n            closed_shards: new_gauge(\n                \"shards\",\n                \"Number of shards hosted by the ingester.\",\n                \"ingest\",\n                &[(\"state\", \"closed\")],\n            ),\n            shard_lt_throughput_mib: new_histogram(\n                \"shard_lt_throughput_mib\",\n                \"Shard long term throughput as reported through chitchat\",\n                \"ingest\",\n                linear_buckets(0.0f64, 1.0f64, 15).unwrap(),\n            ),\n            shard_st_throughput_mib: new_histogram(\n                \"shard_st_throughput_mib\",\n                \"Shard short term throughput as reported through chitchat\",\n                \"ingest\",\n                linear_buckets(0.0f64, 1.0f64, 15).unwrap(),\n            ),\n            wal_acquire_lock_requests_in_flight: new_gauge_vec(\n                \"wal_acquire_lock_requests_in_flight\",\n                \"Number of acquire lock requests in-flight.\",\n                \"ingest\",\n                &[],\n                [\"operation\", \"type\"],\n            ),\n            
wal_acquire_lock_request_duration_secs: new_histogram_vec(\n                \"wal_acquire_lock_request_duration_secs\",\n                \"Duration of acquire lock requests in seconds.\",\n                \"ingest\",\n                &[],\n                [\"operation\", \"type\"],\n                exponential_buckets(0.001, 2.0, 12).unwrap(),\n            ),\n            wal_disk_used_bytes: new_gauge(\n                \"wal_disk_used_bytes\",\n                \"WAL disk space used in bytes.\",\n                \"ingest\",\n                &[],\n            ),\n            wal_memory_used_bytes: new_gauge(\n                \"wal_memory_used_bytes\",\n                \"WAL memory used in bytes.\",\n                \"ingest\",\n                &[],\n            ),\n        }\n    }\n}\n\npub(super) fn report_wal_usage(wal_usage: ResourceUsage) {\n    INGEST_V2_METRICS\n        .wal_disk_used_bytes\n        .set(wal_usage.disk_used_bytes as i64);\n    quickwit_common::metrics::MEMORY_METRICS\n        .in_flight\n        .wal\n        .set(wal_usage.memory_allocated_bytes as i64);\n    INGEST_V2_METRICS\n        .wal_memory_used_bytes\n        .set(wal_usage.memory_used_bytes as i64);\n}\n\npub(super) static INGEST_V2_METRICS: Lazy<IngestV2Metrics> = Lazy::new(IngestV2Metrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod broadcast;\nmod debouncing;\nmod doc_mapper;\nmod fetch;\nmod helpers;\nmod idle;\nmod ingester;\nmod metrics;\nmod models;\nmod mrecord;\nmod mrecordlog_utils;\nmod publish_tracker;\nmod rate_meter;\nmod replication;\nmod router;\nmod routing_table;\nmod state;\nmod wal_capacity_tracker;\nmod workbench;\n\nuse std::collections::HashMap;\nuse std::collections::hash_map::Entry;\nuse std::ops::{Add, AddAssign};\nuse std::time::Duration;\nuse std::{env, fmt};\n\npub use broadcast::{\n    LocalShardsUpdate, ShardInfo, ShardInfos, setup_ingester_capacity_update_listener,\n    setup_local_shards_update_listener,\n};\nuse bytes::buf::Writer;\nuse bytes::{BufMut, BytesMut};\nuse bytesize::ByteSize;\nuse quickwit_common::tower::Pool;\nuse quickwit_proto::ingest::ingester::{IngesterServiceClient, IngesterStatus};\nuse quickwit_proto::ingest::router::{IngestRequestV2, IngestSubrequest};\nuse quickwit_proto::ingest::{CommitTypeV2, DocBatchV2};\nuse quickwit_proto::types::{\n    DocUid, DocUidGenerator, IndexId, IndexUid, NodeId, SourceId, SubrequestId,\n};\nuse serde::Serialize;\nuse tracing::{error, info};\nuse workbench::pending_subrequests;\n\npub use self::fetch::{FetchStreamError, MultiFetchStream};\npub use self::helpers::{\n    try_get_ingester_status, wait_for_ingester_decommission, wait_for_ingester_status,\n};\npub use 
self::ingester::Ingester;\nuse self::mrecord::MRECORD_HEADER_LEN;\npub use self::mrecord::{MRecord, decoded_mrecords};\npub use self::router::IngestRouter;\n\n/// An ingester as represented in the pool, bundling the gRPC client with node metadata.\n#[derive(Debug, Clone)]\npub struct IngesterPoolEntry {\n    pub client: IngesterServiceClient,\n    pub status: IngesterStatus,\n    pub availability_zone: Option<String>,\n}\n\nimpl IngesterPoolEntry {\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn ready_with_client(client: IngesterServiceClient) -> Self {\n        IngesterPoolEntry {\n            client,\n            status: IngesterStatus::Ready,\n            availability_zone: None,\n        }\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked_ingester() -> Self {\n        IngesterPoolEntry {\n            client: IngesterServiceClient::mocked(),\n            status: IngesterStatus::Ready,\n            availability_zone: None,\n        }\n    }\n}\n\npub type IngesterPool = Pool<NodeId, IngesterPoolEntry>;\n\n/// Identifies an ingester client, typically a source, for logging and debugging purposes.\npub type ClientId = String;\n\npub type LeaderId = NodeId;\n\npub type FollowerId = NodeId;\n\npub type OpenShardCounts = Vec<(IndexUid, SourceId, usize)>;\n\nconst IDLE_SHARD_TIMEOUT_ENV_KEY: &str = \"QW_IDLE_SHARD_TIMEOUT_SECS\";\n\nconst DEFAULT_IDLE_SHARD_TIMEOUT: Duration = Duration::from_secs(15 * 60); // 15 minutes\n\npub fn get_idle_shard_timeout() -> Duration {\n    env::var(IDLE_SHARD_TIMEOUT_ENV_KEY)\n        .ok()\n        .and_then(|idle_shard_timeout_str| {\n            if let Ok(idle_shard_timeout_secs) = idle_shard_timeout_str.parse::<u64>() {\n                info!(\"overriding idle shard timeout to {idle_shard_timeout_secs} seconds\");\n                Some(idle_shard_timeout_secs)\n            } else {\n                error!(\n                    \"failed to parse environment variable \\\n                     
`{IDLE_SHARD_TIMEOUT_ENV_KEY}={idle_shard_timeout_str}`\"\n                );\n                None\n            }\n        })\n        .map(Duration::from_secs)\n        .unwrap_or(DEFAULT_IDLE_SHARD_TIMEOUT)\n}\n\nconst INGEST_ROUTER_BUFFER_SIZE_ENV_KEY: &str = \"QW_INGEST_ROUTER_BUFFER_SIZE_BYTES\";\n\nconst DEFAULT_INGEST_ROUTER_BUFFER_SIZE: ByteSize = ByteSize::mib(if cfg!(test) { 8 } else { 256 }); // 256 MiB\n\npub(crate) fn get_ingest_router_buffer_size() -> ByteSize {\n    env::var(INGEST_ROUTER_BUFFER_SIZE_ENV_KEY)\n        .ok()\n        .and_then(|buffer_size_bytes_str| {\n            if let Ok(buffer_size) = buffer_size_bytes_str.parse::<ByteSize>() {\n                info!(\"overriding ingest router buffer size to {buffer_size}\");\n                Some(buffer_size)\n            } else {\n                error!(\n                    \"failed to parse environment variable \\\n                     `{INGEST_ROUTER_BUFFER_SIZE_ENV_KEY}={buffer_size_bytes_str}`\"\n                );\n                None\n            }\n        })\n        .unwrap_or(DEFAULT_INGEST_ROUTER_BUFFER_SIZE)\n}\n\n/// Helper struct to build a [`DocBatchV2`].\n#[derive(Debug, Default)]\npub struct DocBatchV2Builder {\n    doc_uids: Vec<DocUid>,\n    doc_buffer: BytesMut,\n    doc_lengths: Vec<u32>,\n}\n\nimpl DocBatchV2Builder {\n    /// Adds a document to the batch.\n    pub fn add_doc(&mut self, doc_uid: DocUid, doc: &[u8]) {\n        self.doc_uids.push(doc_uid);\n        self.doc_buffer.put(doc);\n        self.doc_lengths.push(doc.len() as u32);\n    }\n\n    /// Builds the [`DocBatchV2`], returning `None` if the batch is empty.\n    pub fn build(self) -> Option<DocBatchV2> {\n        if self.doc_uids.is_empty() {\n            return None;\n        }\n        let doc_batch = DocBatchV2 {\n            doc_uids: self.doc_uids,\n            doc_buffer: self.doc_buffer.freeze(),\n            doc_lengths: self.doc_lengths,\n        };\n        Some(doc_batch)\n    }\n}\n\n/// Batch 
builder that can append [`Serialize`] structs without an extra copy\npub struct JsonDocBatchV2Builder {\n    doc_uids: Vec<DocUid>,\n    doc_buffer: Writer<BytesMut>,\n    doc_lengths: Vec<u32>,\n}\n\nimpl Default for JsonDocBatchV2Builder {\n    fn default() -> Self {\n        Self {\n            doc_uids: Vec::new(),\n            doc_buffer: BytesMut::new().writer(),\n            doc_lengths: Vec::new(),\n        }\n    }\n}\n\nimpl JsonDocBatchV2Builder {\n    pub fn add_doc(&mut self, doc_uid: DocUid, payload: impl Serialize) -> serde_json::Result<()> {\n        let old_len = self.doc_buffer.get_ref().len();\n        serde_json::to_writer(&mut self.doc_buffer, &payload)?;\n        let new_len = self.doc_buffer.get_ref().len();\n        let written_len = new_len - old_len;\n        self.doc_uids.push(doc_uid);\n        self.doc_lengths.push(written_len as u32);\n        Ok(())\n    }\n\n    pub fn build(self) -> DocBatchV2 {\n        DocBatchV2 {\n            doc_uids: self.doc_uids,\n            doc_buffer: self.doc_buffer.into_inner().freeze(),\n            doc_lengths: self.doc_lengths,\n        }\n    }\n\n    pub fn with_num_docs(num_docs: usize) -> Self {\n        Self {\n            doc_uids: Vec::with_capacity(num_docs),\n            doc_lengths: Vec::with_capacity(num_docs),\n            ..Default::default()\n        }\n    }\n}\n\n/// Helper struct to build an [`IngestRequestV2`].\n#[derive(Debug, Default)]\npub struct IngestRequestV2Builder {\n    per_index_id_doc_batch_builders: HashMap<IndexId, (SubrequestId, DocBatchV2Builder)>,\n    subrequest_id_sequence: SubrequestId,\n    doc_uid_generator: DocUidGenerator,\n}\n\nimpl IngestRequestV2Builder {\n    /// Adds a document to the request, returning the ID of the subrequest to which it was added and\n    /// its newly assigned [`DocUid`].\n    pub fn add_doc(&mut self, index_id: IndexId, doc: &[u8]) -> (SubrequestId, DocUid) {\n        match self.per_index_id_doc_batch_builders.entry(index_id) {\n     
       Entry::Occupied(mut entry) => {\n                let (subrequest_id, doc_batch_builder) = entry.get_mut();\n                let doc_uid = self.doc_uid_generator.next_doc_uid();\n                doc_batch_builder.add_doc(doc_uid, doc);\n                (*subrequest_id, doc_uid)\n            }\n            Entry::Vacant(entry) => {\n                let subrequest_id = self.subrequest_id_sequence;\n                self.subrequest_id_sequence += 1;\n                let mut doc_batch_builder = DocBatchV2Builder::default();\n                let doc_uid = self.doc_uid_generator.next_doc_uid();\n                doc_batch_builder.add_doc(doc_uid, doc);\n                entry.insert((subrequest_id, doc_batch_builder));\n                (subrequest_id, doc_uid)\n            }\n        }\n    }\n\n    /// Builds the [`IngestRequestV2`], returning `None` if the request is empty.\n    pub fn build(self, source_id: &str, commit_type: CommitTypeV2) -> Option<IngestRequestV2> {\n        let subrequests: Vec<IngestSubrequest> = self\n            .per_index_id_doc_batch_builders\n            .into_iter()\n            .flat_map(|(index_id, (subrequest_id, doc_batch_builder))| {\n                let doc_batch = doc_batch_builder.build()?;\n                let ingest_subrequest = IngestSubrequest {\n                    subrequest_id,\n                    index_id,\n                    source_id: source_id.to_string(),\n                    doc_batch: Some(doc_batch),\n                };\n                Some(ingest_subrequest)\n            })\n            .collect();\n\n        if subrequests.is_empty() {\n            return None;\n        }\n        let ingest_request = IngestRequestV2 {\n            subrequests,\n            commit_type: commit_type as i32,\n        };\n        Some(ingest_request)\n    }\n}\n\npub(super) fn estimate_size(doc_batch: &DocBatchV2) -> ByteSize {\n    let estimate = doc_batch.num_bytes() + doc_batch.num_docs() * MRECORD_HEADER_LEN;\n    
ByteSize(estimate as u64)\n}\n\n#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Ord, PartialOrd)]\npub struct RateMibPerSec(pub u16);\n\nimpl fmt::Display for RateMibPerSec {\n    fn fmt(&self, f: &mut fmt::Formatter) -> std::fmt::Result {\n        write!(f, \"{}MiB/s\", self.0)\n    }\n}\n\nimpl PartialEq<u16> for RateMibPerSec {\n    fn eq(&self, other: &u16) -> bool {\n        self.0 == *other\n    }\n}\n\nimpl Add<RateMibPerSec> for RateMibPerSec {\n    type Output = RateMibPerSec;\n\n    #[inline(always)]\n    fn add(self, rhs: RateMibPerSec) -> Self::Output {\n        RateMibPerSec(self.0 + rhs.0)\n    }\n}\n\nimpl AddAssign<RateMibPerSec> for RateMibPerSec {\n    #[inline(always)]\n    fn add_assign(&mut self, rhs: RateMibPerSec) {\n        self.0 += rhs.0;\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytes::Bytes;\n\n    use super::*;\n\n    #[test]\n    fn test_doc_batch_builder() {\n        let doc_batch_builder = DocBatchV2Builder::default();\n        let doc_batch_opt = doc_batch_builder.build();\n        assert!(doc_batch_opt.is_none());\n\n        let mut doc_batch_builder = DocBatchV2Builder::default();\n        let mut doc_uid_generator = DocUidGenerator::default();\n        doc_batch_builder.add_doc(doc_uid_generator.next_doc_uid(), b\"Hello, \");\n        doc_batch_builder.add_doc(doc_uid_generator.next_doc_uid(), b\"World!\");\n        let doc_batch = doc_batch_builder.build().unwrap();\n\n        assert_eq!(doc_batch.num_docs(), 2);\n        assert_eq!(doc_batch.num_bytes(), 21);\n        assert_eq!(doc_batch.doc_lengths, [7, 6]);\n        assert_eq!(doc_batch.doc_buffer, Bytes::from(&b\"Hello, World!\"[..]));\n    }\n\n    #[test]\n    fn test_ingest_request_builder() {\n        let ingest_request_builder = IngestRequestV2Builder::default();\n        let ingest_request_opt = ingest_request_builder.build(\"test-source\", CommitTypeV2::Auto);\n        assert!(ingest_request_opt.is_none());\n\n        let mut ingest_request_builder = 
IngestRequestV2Builder::default();\n\n        let (subrequest_id, hello_doc_uid) =\n            ingest_request_builder.add_doc(\"test-index-foo\".to_string(), b\"Hello, \");\n        assert_eq!(subrequest_id, 0);\n\n        let (subrequest_id, world_doc_uid) =\n            ingest_request_builder.add_doc(\"test-index-foo\".to_string(), b\"World!\");\n        assert_eq!(subrequest_id, 0);\n        assert!(hello_doc_uid < world_doc_uid);\n\n        let (subrequest_id, hola_doc_uid) =\n            ingest_request_builder.add_doc(\"test-index-bar\".to_string(), b\"Hola, \");\n        assert_eq!(subrequest_id, 1);\n        assert!(world_doc_uid < hola_doc_uid);\n\n        let (subrequest_id, mundo_doc_uid) =\n            ingest_request_builder.add_doc(\"test-index-bar\".to_string(), b\"Mundo!\");\n        assert_eq!(subrequest_id, 1);\n        assert!(hola_doc_uid < mundo_doc_uid);\n\n        let mut ingest_request = ingest_request_builder\n            .build(\"test-source\", CommitTypeV2::Auto)\n            .unwrap();\n\n        ingest_request\n            .subrequests\n            .sort_by(|left, right| left.index_id.cmp(&right.index_id).reverse());\n\n        assert_eq!(ingest_request.subrequests.len(), 2);\n        assert_eq!(ingest_request.subrequests[0].index_id, \"test-index-foo\");\n        assert_eq!(ingest_request.subrequests[0].source_id, \"test-source\");\n        assert_eq!(\n            ingest_request.subrequests[0]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .num_docs(),\n            2\n        );\n        assert_eq!(\n            ingest_request.subrequests[0]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .num_bytes(),\n            21\n        );\n        assert_eq!(\n            ingest_request.subrequests[0]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .doc_lengths,\n            [7, 6]\n        
);\n        assert_eq!(\n            ingest_request.subrequests[0]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .doc_buffer,\n            Bytes::from(&b\"Hello, World!\"[..])\n        );\n        assert_eq!(\n            ingest_request.subrequests[0]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .doc_uids,\n            [hello_doc_uid, world_doc_uid]\n        );\n\n        assert_eq!(ingest_request.subrequests[1].index_id, \"test-index-bar\");\n        assert_eq!(ingest_request.subrequests[1].source_id, \"test-source\");\n        assert_eq!(\n            ingest_request.subrequests[1]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .num_docs(),\n            2\n        );\n        assert_eq!(\n            ingest_request.subrequests[1]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .num_bytes(),\n            20\n        );\n        assert_eq!(\n            ingest_request.subrequests[1]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .doc_lengths,\n            [6, 6]\n        );\n        assert_eq!(\n            ingest_request.subrequests[1]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .doc_buffer,\n            Bytes::from(&b\"Hola, Mundo!\"[..])\n        );\n        assert_eq!(\n            ingest_request.subrequests[1]\n                .doc_batch\n                .as_ref()\n                .unwrap()\n                .doc_uids,\n            [hola_doc_uid, mundo_doc_uid]\n        );\n    }\n\n    #[test]\n    fn test_estimate_size() {\n        let doc_batch = DocBatchV2 {\n            doc_buffer: Vec::new().into(),\n            doc_lengths: Vec::new(),\n            doc_uids: Vec::new(),\n        };\n        assert_eq!(estimate_size(&doc_batch), 
ByteSize(0));\n\n        let doc_batch = DocBatchV2 {\n            doc_buffer: vec![0u8; 100].into(),\n            doc_lengths: vec![10, 20, 30],\n            doc_uids: Vec::new(),\n        };\n        assert_eq!(estimate_size(&doc_batch), ByteSize(118));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/models.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse quickwit_common::rate_limiter::RateLimiter;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::ingest::ShardState;\nuse quickwit_proto::types::{IndexUid, NodeId, Position, QueueId, ShardId, SourceId, queue_id};\nuse tokio::sync::watch;\n\nuse crate::ingest_v2::rate_meter::RateMeter;\n\n#[derive(Debug, Clone)]\npub(super) enum IngesterShardType {\n    /// A primary shard hosted on a leader and replicated on a follower.\n    Primary { follower_id: NodeId },\n    /// A replica shard hosted on a follower.\n    Replica { leader_id: NodeId },\n    /// A shard hosted on a single node when the replication factor is set to 1.\n    Solo,\n}\n\n/// Status of a shard: state + position of the last record written.\npub(super) type ShardStatus = (ShardState, Position);\n\n#[derive(Debug)]\npub(super) struct IngesterShard {\n    pub index_uid: IndexUid,\n    pub source_id: SourceId,\n    pub shard_id: ShardId,\n    pub shard_type: IngesterShardType,\n    pub shard_state: ShardState,\n    /// Position of the last record written in the shard's mrecordlog queue.\n    pub replication_position_inclusive: Position,\n    /// Position up to which the shard has been truncated.\n    pub truncation_position_inclusive: Position,\n    // Rate limiter for the shard. 
Unused for replica shards.\n    pub rate_limiter: RateLimiter,\n    // Rate meter for the shard. Unused for replica shards.\n    pub rate_meter: RateMeter,\n    /// Whether the shard should be advertised to other nodes (routers) via gossip.\n    ///\n    /// Because shards are created in multiple steps (e.g., init shard on leader, create shard in\n    /// metastore), we must receive a \"signal\" from the control plane confirming that a shard\n    /// was successfully opened before advertising it. Currently, this confirmation comes in the\n    /// form of `PersistRequest` or `FetchRequest`.\n    pub is_advertisable: bool,\n    /// Document mapper for the shard. Replica shards and closed solo shards do not have one.\n    pub doc_mapper_opt: Option<Arc<DocMapper>>,\n    /// Whether to validate documents in this shard. True if no preprocessing (VRL) will happen\n    /// before indexing.\n    pub validate_docs: bool,\n    pub shard_status_tx: watch::Sender<ShardStatus>,\n    pub shard_status_rx: watch::Receiver<ShardStatus>,\n    /// Instant at which the shard was last written to.\n    pub last_write_instant: Instant,\n}\n\n/// Builder for `IngesterShard`. By default, the shard is open, is empty (i.e. the replication and\n/// truncation positions are at the beginning), uses the default rate limiter and rate meter, has no\n/// doc mapper, does not validate documents, and is not advertisable.\npub(super) struct IngesterShardBuilder {\n    index_uid: IndexUid,\n    source_id: SourceId,\n    shard_id: ShardId,\n    shard_type: IngesterShardType,\n    shard_state: ShardState,\n    replication_position_inclusive: Position,\n    truncation_position_inclusive: Position,\n    rate_limiter: RateLimiter,\n    rate_meter: RateMeter,\n    doc_mapper_opt: Option<Arc<DocMapper>>,\n    validate_docs: bool,\n    is_advertisable: bool,\n    last_write_instant: Option<Instant>,\n}\n\nimpl IngesterShardBuilder {\n    /// Sets the shard state. 
Defaults to `ShardState::Open`.\n    pub fn with_state(mut self, shard_state: ShardState) -> Self {\n        self.shard_state = shard_state;\n        self\n    }\n\n    /// Sets the rate limiter. Defaults to `RateLimiter::default()`.\n    pub fn with_rate_limiter(mut self, rate_limiter: RateLimiter) -> Self {\n        self.rate_limiter = rate_limiter;\n        self\n    }\n\n    /// Sets the rate meter. Defaults to `RateMeter::default()`.\n    pub fn with_rate_meter(mut self, rate_meter: RateMeter) -> Self {\n        self.rate_meter = rate_meter;\n        self\n    }\n\n    /// Sets the doc mapper.\n    pub fn with_doc_mapper(mut self, doc_mapper: Arc<DocMapper>) -> Self {\n        self.doc_mapper_opt = Some(doc_mapper);\n        self\n    }\n\n    /// Sets the replication position. Defaults to `Position::Beginning`.\n    pub fn with_replication_position_inclusive(mut self, position: Position) -> Self {\n        self.replication_position_inclusive = position;\n        self\n    }\n\n    /// Sets the truncation position. Defaults to `Position::Beginning`.\n    pub fn with_truncation_position_inclusive(mut self, position: Position) -> Self {\n        self.truncation_position_inclusive = position;\n        self\n    }\n\n    /// Sets whether to validate documents. Defaults to `false`.\n    pub fn with_validate_docs(mut self, validate_docs: bool) -> Self {\n        self.validate_docs = validate_docs;\n        self\n    }\n\n    /// Marks the shard as advertisable to other nodes via gossip. Shards are not advertisable by\n    /// default.\n    pub fn advertisable(mut self) -> Self {\n        self.is_advertisable = true;\n        self\n    }\n\n    /// Sets the last write instant. Defaults to `Instant::now()`.\n    pub fn with_last_write(mut self, last_write_instant: Instant) -> Self {\n        self.last_write_instant = Some(last_write_instant);\n        self\n    }\n\n    /// Builds the `IngesterShard`. 
Uses `Instant::now()` for last write time if not specified.\n    pub fn build(self) -> IngesterShard {\n        let shard_status = (\n            self.shard_state,\n            self.replication_position_inclusive.clone(),\n        );\n        let (shard_status_tx, shard_status_rx) = watch::channel(shard_status);\n        IngesterShard {\n            index_uid: self.index_uid,\n            source_id: self.source_id,\n            shard_id: self.shard_id,\n            shard_type: self.shard_type,\n            shard_state: self.shard_state,\n            replication_position_inclusive: self.replication_position_inclusive,\n            truncation_position_inclusive: self.truncation_position_inclusive,\n            rate_limiter: self.rate_limiter,\n            rate_meter: self.rate_meter,\n            is_advertisable: self.is_advertisable,\n            doc_mapper_opt: self.doc_mapper_opt,\n            validate_docs: self.validate_docs,\n            shard_status_tx,\n            shard_status_rx,\n            last_write_instant: self.last_write_instant.unwrap_or_else(Instant::now),\n        }\n    }\n}\n\nimpl IngesterShard {\n    /// Creates a builder for a primary shard hosted on a leader and replicated on a follower.\n    pub fn new_primary(\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shard_id: ShardId,\n        follower_id: NodeId,\n    ) -> IngesterShardBuilder {\n        IngesterShardBuilder {\n            index_uid,\n            source_id,\n            shard_id,\n            shard_type: IngesterShardType::Primary { follower_id },\n            shard_state: ShardState::Open,\n            replication_position_inclusive: Position::Beginning,\n            truncation_position_inclusive: Position::Beginning,\n            rate_limiter: RateLimiter::default(),\n            rate_meter: RateMeter::default(),\n            doc_mapper_opt: None,\n            validate_docs: false,\n            is_advertisable: false,\n            last_write_instant: None,\n 
       }\n    }\n\n    /// Creates a builder for a replica shard hosted on a follower.\n    pub fn new_replica(\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shard_id: ShardId,\n        leader_id: NodeId,\n    ) -> IngesterShardBuilder {\n        IngesterShardBuilder {\n            index_uid,\n            source_id,\n            shard_id,\n            shard_type: IngesterShardType::Replica { leader_id },\n            shard_state: ShardState::Open,\n            replication_position_inclusive: Position::Beginning,\n            truncation_position_inclusive: Position::Beginning,\n            rate_limiter: RateLimiter::default(),\n            rate_meter: RateMeter::default(),\n            doc_mapper_opt: None,\n            validate_docs: false,\n            is_advertisable: false,\n            last_write_instant: None,\n        }\n    }\n\n    /// Creates a builder for a solo shard hosted on a single node (i.e. replication factor = 1).\n    pub fn new_solo(\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shard_id: ShardId,\n    ) -> IngesterShardBuilder {\n        IngesterShardBuilder {\n            index_uid,\n            source_id,\n            shard_id,\n            shard_type: IngesterShardType::Solo,\n            shard_state: ShardState::Open,\n            replication_position_inclusive: Position::Beginning,\n            truncation_position_inclusive: Position::Beginning,\n            rate_limiter: RateLimiter::default(),\n            rate_meter: RateMeter::default(),\n            doc_mapper_opt: None,\n            validate_docs: false,\n            is_advertisable: false,\n            last_write_instant: None,\n        }\n    }\n\n    pub fn follower_id_opt(&self) -> Option<&NodeId> {\n        match &self.shard_type {\n            IngesterShardType::Primary { follower_id, .. } => Some(follower_id),\n            IngesterShardType::Replica { .. 
} => None,\n            IngesterShardType::Solo => None,\n        }\n    }\n\n    pub fn close(&mut self) {\n        self.shard_state = ShardState::Closed;\n        self.notify_shard_status();\n    }\n\n    pub fn is_closed(&self) -> bool {\n        self.shard_state.is_closed()\n    }\n\n    pub fn is_open(&self) -> bool {\n        self.shard_state.is_open()\n    }\n\n    pub fn is_idle(&self, now: Instant, idle_timeout: Duration) -> bool {\n        now.duration_since(self.last_write_instant) >= idle_timeout\n    }\n\n    pub fn is_indexed(&self) -> bool {\n        self.shard_state.is_closed() && self.truncation_position_inclusive.is_eof()\n    }\n\n    pub fn is_replica(&self) -> bool {\n        matches!(self.shard_type, IngesterShardType::Replica { .. })\n    }\n\n    pub fn notify_shard_status(&self) {\n        let shard_status = (\n            self.shard_state,\n            self.replication_position_inclusive.clone(),\n        );\n        // `shard_status_tx` is guaranteed to be open because `self` also holds a receiver.\n        self.shard_status_tx\n            .send(shard_status)\n            .expect(\"channel should be open\");\n    }\n\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(&self.index_uid, &self.source_id, &self.shard_id)\n    }\n\n    pub fn set_replication_position_inclusive(\n        &mut self,\n        replication_position_inclusive: Position,\n        now: Instant,\n    ) {\n        if self.replication_position_inclusive == replication_position_inclusive {\n            return;\n        }\n        self.replication_position_inclusive = replication_position_inclusive;\n        self.last_write_instant = now;\n        self.notify_shard_status();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_config::{DocMapping, SearchSettings, build_doc_mapper};\n\n    use super::*;\n\n    impl IngesterShard {\n        #[track_caller]\n        pub fn assert_is_solo(&self) {\n            assert!(matches!(self.shard_type, 
IngesterShardType::Solo))\n        }\n\n        #[track_caller]\n        pub fn assert_is_primary(&self) {\n            assert!(matches!(self.shard_type, IngesterShardType::Primary { .. }))\n        }\n\n        #[track_caller]\n        pub fn assert_is_replica(&self) {\n            assert!(matches!(self.shard_type, IngesterShardType::Replica { .. }))\n        }\n\n        #[track_caller]\n        pub fn assert_is_open(&self) {\n            assert!(self.shard_state.is_open())\n        }\n\n        #[track_caller]\n        pub fn assert_is_closed(&self) {\n            assert!(self.shard_state.is_closed())\n        }\n\n        #[track_caller]\n        pub fn assert_replication_position(&self, expected_replication_position: Position) {\n            assert_eq!(\n                self.replication_position_inclusive, expected_replication_position,\n                \"expected replication position at `{:?}`, got `{:?}`\",\n                expected_replication_position, self.replication_position_inclusive\n            );\n        }\n\n        #[track_caller]\n        pub fn assert_truncation_position(&self, expected_truncation_position: Position) {\n            assert_eq!(\n                self.truncation_position_inclusive, expected_truncation_position,\n                \"expected truncation position at `{:?}`, got `{:?}`\",\n                expected_truncation_position, self.truncation_position_inclusive\n            );\n        }\n    }\n\n    #[test]\n    fn test_new_primary_shard() {\n        let doc_mapping: DocMapping = serde_json::from_str(\"{}\").unwrap();\n        let search_settings = SearchSettings::default();\n        let doc_mapper = build_doc_mapper(&doc_mapping, &search_settings).unwrap();\n\n        let primary_shard = IngesterShard::new_primary(\n            IndexUid::for_test(\"test-index\", 0),\n            SourceId::from(\"test-source\"),\n            ShardId::from(1),\n            NodeId::from(\"test-follower\"),\n        )\n        
.with_state(ShardState::Closed)\n        .with_replication_position_inclusive(Position::offset(42u64))\n        .with_doc_mapper(doc_mapper)\n        .with_validate_docs(true)\n        .build();\n\n        assert!(matches!(\n            &primary_shard.shard_type,\n            IngesterShardType::Primary { follower_id, .. } if *follower_id == \"test-follower\"\n        ));\n        assert!(!primary_shard.is_replica());\n        assert_eq!(primary_shard.shard_state, ShardState::Closed);\n        assert_eq!(\n            primary_shard.replication_position_inclusive,\n            Position::offset(42u64)\n        );\n        assert_eq!(\n            primary_shard.truncation_position_inclusive,\n            Position::Beginning\n        );\n        assert!(!primary_shard.is_advertisable);\n    }\n\n    #[test]\n    fn test_new_replica_shard() {\n        let replica_shard = IngesterShard::new_replica(\n            IndexUid::for_test(\"test-index\", 0),\n            SourceId::from(\"test-source\"),\n            ShardId::from(1),\n            NodeId::from(\"test-leader\"),\n        )\n        .with_state(ShardState::Closed)\n        .with_replication_position_inclusive(Position::offset(42u64))\n        .build();\n\n        assert!(matches!(\n            &replica_shard.shard_type,\n            IngesterShardType::Replica { leader_id } if *leader_id == \"test-leader\"\n        ));\n        assert!(replica_shard.is_replica());\n        assert_eq!(replica_shard.shard_state, ShardState::Closed);\n        assert_eq!(\n            replica_shard.replication_position_inclusive,\n            Position::offset(42u64)\n        );\n        assert_eq!(\n            replica_shard.truncation_position_inclusive,\n            Position::Beginning\n        );\n        assert!(!replica_shard.is_advertisable);\n    }\n\n    #[test]\n    fn test_new_solo_shard() {\n        let solo_shard = IngesterShard::new_solo(\n            IndexUid::for_test(\"test-index\", 0),\n            
SourceId::from(\"test-source\"),\n            ShardId::from(1),\n        )\n        .with_state(ShardState::Closed)\n        .with_replication_position_inclusive(Position::offset(42u64))\n        .build();\n\n        solo_shard.assert_is_solo();\n        assert!(!solo_shard.is_replica());\n        assert_eq!(solo_shard.shard_state, ShardState::Closed);\n        assert_eq!(\n            solo_shard.replication_position_inclusive,\n            Position::offset(42u64)\n        );\n        assert_eq!(\n            solo_shard.truncation_position_inclusive,\n            Position::Beginning\n        );\n        assert!(!solo_shard.is_advertisable);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/mrecord.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytes::{Buf, Bytes};\nuse quickwit_proto::ingest::MRecordBatch;\nuse tracing::warn;\n\n/// The first byte of a [`MRecord`] is the version of the record header.\n#[derive(Debug)]\n#[repr(u8)]\npub enum HeaderVersion {\n    /// Version 0, introduced in Quickwit 0.7.0, it uses one byte to encode the record type.\n    V0 = 0,\n}\n\n/// Length of the header of a [`MRecord`] in bytes.\npub(super) const MRECORD_HEADER_LEN: usize = 2;\n\n/// `Doc` header v0 composed of the header version and the `Doc = 0` record type.\nconst DOC_HEADER_V0: &[u8; MRECORD_HEADER_LEN] = &[HeaderVersion::V0 as u8, 0];\n\n/// `Commit` header v0 composed of the header version and the `Commit = 1` record type.\nconst COMMIT_HEADER_V0: &[u8; MRECORD_HEADER_LEN] = &[HeaderVersion::V0 as u8, 1];\n\n#[derive(Debug, Clone, Eq, PartialEq)]\npub enum MRecord {\n    Doc(Bytes),\n    Commit,\n}\n\nimpl MRecord {\n    pub fn encode(&self) -> impl Buf + use<> {\n        match &self {\n            Self::Doc(doc) => DOC_HEADER_V0.chain(doc.clone()),\n            Self::Commit => COMMIT_HEADER_V0.chain(Bytes::new()),\n        }\n    }\n\n    pub fn decode(mut buf: impl Buf) -> Option<Self> {\n        if buf.remaining() < 2 {\n            return None;\n        }\n\n        let header_version = buf.get_u8();\n\n        if header_version != HeaderVersion::V0 as u8 {\n            
warn!(\"unknown mrecord header version `{header_version}`\");\n            return None;\n        }\n\n        let mrecord = match buf.get_u8() {\n            0 => {\n                let doc = buf.copy_to_bytes(buf.remaining());\n                Self::Doc(doc)\n            }\n            1 => Self::Commit,\n            other => {\n                warn!(\"unknown mrecord type `{other}`\");\n                return None;\n            }\n        };\n        Some(mrecord)\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn new_doc(doc: impl Into<Bytes>) -> Self {\n        Self::Doc(doc.into())\n    }\n}\n\npub fn decoded_mrecords(mrecord_batch: &MRecordBatch) -> impl Iterator<Item = MRecord> + '_ {\n    mrecord_batch.encoded_mrecords().flat_map(MRecord::decode)\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_parse_invalid_mrecord() {\n        assert!(MRecord::decode(&b\"\"[..]).is_none());\n        assert!(MRecord::decode(&b\"a\"[..]).is_none());\n        assert!(MRecord::decode(&[HeaderVersion::V0 as u8][..]).is_none());\n        assert!(MRecord::decode(&[HeaderVersion::V0 as u8, 19u8][..]).is_none());\n    }\n\n    #[test]\n    fn test_mrecord_doc_roundtrip() {\n        let record = MRecord::new_doc(\"hello\");\n        let encoded_record = record.encode();\n        let decoded_record = MRecord::decode(encoded_record).unwrap();\n        assert_eq!(record, decoded_record);\n    }\n\n    #[test]\n    fn test_mrecord_commit_roundtrip() {\n        let record = MRecord::Commit;\n        let encoded_record = record.encode();\n        let decoded_record = MRecord::decode(encoded_record).unwrap();\n        assert_eq!(record, decoded_record);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/mrecordlog_utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\nuse std::iter::once;\nuse std::ops::RangeInclusive;\n\nuse bytesize::ByteSize;\n#[cfg(feature = \"failpoints\")]\nuse fail::fail_point;\nuse mrecordlog::error::{AppendError, DeleteQueueError};\nuse quickwit_proto::ingest::DocBatchV2;\nuse quickwit_proto::types::{Position, QueueId};\n\nuse crate::MRecord;\nuse crate::mrecordlog_async::MultiRecordLogAsync;\n\n#[derive(Debug, thiserror::Error)]\npub(super) enum AppendDocBatchError {\n    #[error(\"IO error: {0}\")]\n    Io(#[from] io::Error),\n    #[error(\"WAL queue `{0}` not found\")]\n    QueueNotFound(QueueId),\n}\n\n/// Appends a non-empty document batch to the WAL queue `queue_id`.\n///\n/// # Panics\n///\n/// Panics if `doc_batch` is empty.\npub(super) async fn append_non_empty_doc_batch(\n    mrecordlog: &mut MultiRecordLogAsync,\n    queue_id: &QueueId,\n    doc_batch: DocBatchV2,\n    force_commit: bool,\n) -> Result<Position, AppendDocBatchError> {\n    let append_result = if force_commit {\n        let encoded_mrecords = doc_batch\n            .into_docs()\n            .map(|(_doc_uid, doc)| MRecord::Doc(doc).encode())\n            .chain(once(MRecord::Commit.encode()));\n\n        #[cfg(feature = \"failpoints\")]\n        fail_point!(\"ingester:append_records\", |_| {\n            let io_error = io::Error::from(io::ErrorKind::PermissionDenied);\n            
Err(AppendDocBatchError::Io(io_error))\n        });\n\n        mrecordlog\n            .append_records(queue_id, None, encoded_mrecords)\n            .await\n    } else {\n        let encoded_mrecords = doc_batch\n            .into_docs()\n            .map(|(_doc_uid, doc)| MRecord::Doc(doc).encode());\n\n        #[cfg(feature = \"failpoints\")]\n        fail_point!(\"ingester:append_records\", |_| {\n            let io_error = io::Error::from(io::ErrorKind::PermissionDenied);\n            Err(AppendDocBatchError::Io(io_error))\n        });\n\n        mrecordlog\n            .append_records(queue_id, None, encoded_mrecords)\n            .await\n    };\n    match append_result {\n        Ok(Some(offset)) => Ok(Position::offset(offset)),\n        Ok(None) => panic!(\"`doc_batch` should not be empty\"),\n        Err(AppendError::IoError(io_error)) => Err(AppendDocBatchError::Io(io_error)),\n        Err(AppendError::MissingQueue(queue_id)) => {\n            Err(AppendDocBatchError::QueueNotFound(queue_id))\n        }\n        Err(AppendError::Past) => {\n            panic!(\"`append_records` should be called with `position_opt: None`\")\n        }\n    }\n}\n\n/// Error returned when the mrecordlog does not have enough capacity to store some records.\n#[derive(Debug, Clone, Copy, thiserror::Error)]\npub(super) enum NotEnoughCapacityError {\n    #[error(\n        \"write-ahead log is full, capacity: {capacity}, usage: {usage}, requested: {requested}\"\n    )]\n    Disk {\n        usage: ByteSize,\n        capacity: ByteSize,\n        requested: ByteSize,\n    },\n    #[error(\n        \"write-ahead log memory buffer is full: capacity: {capacity}, usage: {usage}, requested: \\\n         {requested}\"\n    )]\n    Memory {\n        usage: ByteSize,\n        capacity: ByteSize,\n        requested: ByteSize,\n    },\n}\n\n/// Checks whether the log has enough capacity to store some records.\npub(super) fn check_enough_capacity(\n    mrecordlog: &MultiRecordLogAsync,\n    
disk_capacity: ByteSize,\n    memory_capacity: ByteSize,\n    requested_capacity: ByteSize,\n) -> Result<(), NotEnoughCapacityError> {\n    let wal_usage = mrecordlog.resource_usage();\n    let disk_used = ByteSize(wal_usage.disk_used_bytes as u64);\n\n    if disk_used + requested_capacity > disk_capacity {\n        return Err(NotEnoughCapacityError::Disk {\n            usage: disk_used,\n            capacity: disk_capacity,\n            requested: requested_capacity,\n        });\n    }\n    let memory_used = ByteSize(wal_usage.memory_used_bytes as u64);\n\n    if memory_used + requested_capacity > memory_capacity {\n        return Err(NotEnoughCapacityError::Memory {\n            usage: memory_used,\n            capacity: memory_capacity,\n            requested: requested_capacity,\n        });\n    }\n    Ok(())\n}\n\n/// Deletes a queue from the WAL. Returns without error if the queue does not exist.\npub async fn force_delete_queue(\n    mrecordlog: &mut MultiRecordLogAsync,\n    queue_id: &QueueId,\n) -> io::Result<()> {\n    match mrecordlog.delete_queue(queue_id).await {\n        Ok(_) | Err(DeleteQueueError::MissingQueue(_)) => Ok(()),\n        Err(DeleteQueueError::IoError(error)) => Err(error),\n    }\n}\n\n/// Returns the first and last position of the records currently stored in the queue. 
Returns `None`\n/// if the queue does not exist or is empty.\npub(super) fn queue_position_range(\n    mrecordlog: &MultiRecordLogAsync,\n    queue_id: &QueueId,\n) -> Option<RangeInclusive<u64>> {\n    let first_position = mrecordlog\n        .range(queue_id, ..)\n        .ok()?\n        .next()\n        .map(|record| record.position)?;\n\n    let last_position = mrecordlog\n        .last_record(queue_id)\n        .ok()?\n        .map(|record| record.position)?;\n\n    Some(first_position..=last_position)\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[cfg(not(feature = \"failpoints\"))]\n    #[tokio::test]\n    async fn test_append_non_empty_doc_batch() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mut mrecordlog = MultiRecordLogAsync::open(tempdir.path()).await.unwrap();\n\n        let queue_id = \"test-queue\".to_string();\n        let doc_batch = DocBatchV2::for_test([\"test-doc-foo\"]);\n\n        let append_error =\n            append_non_empty_doc_batch(&mut mrecordlog, &queue_id, doc_batch.clone(), false)\n                .await\n                .unwrap_err();\n\n        assert!(matches!(\n            append_error,\n            AppendDocBatchError::QueueNotFound(..)\n        ));\n\n        mrecordlog.create_queue(&queue_id).await.unwrap();\n\n        let position =\n            append_non_empty_doc_batch(&mut mrecordlog, &queue_id, doc_batch.clone(), false)\n                .await\n                .unwrap();\n        assert_eq!(position, Position::offset(0u64));\n\n        let position =\n            append_non_empty_doc_batch(&mut mrecordlog, &queue_id, doc_batch.clone(), true)\n                .await\n                .unwrap();\n        assert_eq!(position, Position::offset(2u64));\n    }\n\n    // This test should be run manually and independently of other tests with the `failpoints`\n    // feature enabled:\n    // ```sh\n    // cargo test --manifest-path quickwit/Cargo.toml -p quickwit-ingest --features failpoints -- 
test_append_non_empty_doc_batch_io_error\n    // ```\n    #[cfg(feature = \"failpoints\")]\n    #[tokio::test]\n    async fn test_append_non_empty_doc_batch_io_error() {\n        let scenario = fail::FailScenario::setup();\n        fail::cfg(\"ingester:append_records\", \"return\").unwrap();\n\n        let tempdir = tempfile::tempdir().unwrap();\n        let mut mrecordlog = MultiRecordLogAsync::open(tempdir.path()).await.unwrap();\n\n        let queue_id = \"test-queue\".to_string();\n        mrecordlog.create_queue(&queue_id).await.unwrap();\n\n        let doc_batch = DocBatchV2::for_test([\"test-doc-foo\"]);\n        let append_error = append_non_empty_doc_batch(&mut mrecordlog, &queue_id, doc_batch, false)\n            .await\n            .unwrap_err();\n\n        assert!(matches!(append_error, AppendDocBatchError::Io(..)));\n\n        scenario.teardown();\n    }\n\n    #[tokio::test]\n    async fn test_check_enough_capacity() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mrecordlog = MultiRecordLogAsync::open(tempdir.path()).await.unwrap();\n\n        let disk_error =\n            check_enough_capacity(&mrecordlog, ByteSize(0), ByteSize(0), ByteSize(12)).unwrap_err();\n\n        assert!(matches!(disk_error, NotEnoughCapacityError::Disk { .. }));\n\n        let memory_error =\n            check_enough_capacity(&mrecordlog, ByteSize::mb(256), ByteSize(11), ByteSize(12))\n                .unwrap_err();\n\n        assert!(matches!(\n            memory_error,\n            NotEnoughCapacityError::Memory { .. 
}\n        ));\n\n        check_enough_capacity(&mrecordlog, ByteSize::mb(256), ByteSize(12), ByteSize(12)).unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_append_queue_position_range() {\n        let tempdir = tempfile::tempdir().unwrap();\n        let mut mrecordlog = MultiRecordLogAsync::open(tempdir.path()).await.unwrap();\n\n        assert!(queue_position_range(&mrecordlog, &\"queue-not-found\".to_string()).is_none());\n\n        mrecordlog.create_queue(\"test-queue\").await.unwrap();\n        assert!(queue_position_range(&mrecordlog, &\"test-queue\".to_string()).is_none());\n\n        mrecordlog\n            .append_records(\"test-queue\", None, std::iter::once(&b\"test-doc-foo\"[..]))\n            .await\n            .unwrap();\n        let position_range = queue_position_range(&mrecordlog, &\"test-queue\".to_string()).unwrap();\n        assert_eq!(position_range, 0..=0);\n\n        mrecordlog\n            .append_records(\"test-queue\", None, std::iter::once(&b\"test-doc-bar\"[..]))\n            .await\n            .unwrap();\n        let position_range = queue_position_range(&mrecordlog, &\"test-queue\".to_string()).unwrap();\n        assert_eq!(position_range, 0..=1);\n\n        mrecordlog.truncate(\"test-queue\", 0).await.unwrap();\n        let position_range = queue_position_range(&mrecordlog, &\"test-queue\".to_string()).unwrap();\n        assert_eq!(position_range, 1..=1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/publish_tracker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::sync::{Arc, Mutex};\n\nuse quickwit_common::pubsub::{EventBroker, EventSubscriptionHandle};\nuse quickwit_proto::indexing::ShardPositionsUpdate;\nuse quickwit_proto::types::{Position, ShardId};\nuse tokio::sync::Notify;\nuse tracing::error;\n\n/// A helper for awaiting shard publish events when running in `wait_for` and\n/// `force` commit mode.\n///\n/// Registers a set of shard positions and listens to [`ShardPositionsUpdate`]\n/// events to assert when all the persisted events have been published. 
To\n/// ensure that no events are missed:\n/// - create the tracker before any persist request is sent\n/// - call `register_requested_shards` before each persist request to ensure that the associated\n///   publish events are recorded\n/// - call `track_persisted_shard_position` after each successful persist subrequest\npub struct PublishTracker {\n    state: Arc<Mutex<ShardPublishStates>>,\n    // sync::notify instead of sync::oneshot because we don't want to store the permit\n    publish_complete: Arc<Notify>,\n    _publish_listen_handle: EventSubscriptionHandle,\n}\n\nimpl PublishTracker {\n    pub fn new(event_tracker: EventBroker) -> Self {\n        let state = Arc::new(Mutex::new(ShardPublishStates::default()));\n        let state_clone = state.clone();\n        let publish_complete = Arc::new(Notify::new());\n        let publish_complete_notifier = publish_complete.clone();\n        let _publish_listen_handle =\n            event_tracker.subscribe(move |update: ShardPositionsUpdate| {\n                let mut publish_states = state_clone.lock().unwrap();\n                for (updated_shard_id, updated_position) in &update.updated_shard_positions {\n                    publish_states.position_published(\n                        updated_shard_id,\n                        updated_position,\n                        &publish_complete_notifier,\n                    );\n                }\n            });\n        Self {\n            state,\n            _publish_listen_handle,\n            publish_complete,\n        }\n    }\n\n    pub fn track_persisted_shard_position(&self, shard_id: ShardId, new_position: Position) {\n        let mut publish_states = self.state.lock().unwrap();\n        publish_states.position_persisted(&shard_id, &new_position)\n    }\n\n    pub async fn wait_publish_complete(self) {\n        // correctness: `awaiting_count` cannot be increased after this point\n        // because `self` is consumed. 
By subscribing to `publish_complete`\n        // before checking `awaiting_count`, we make sure we don't miss the\n        // moment when it becomes 0.\n        let notified = self.publish_complete.notified();\n        if self.state.lock().unwrap().awaiting_count == 0 {\n            return;\n        }\n        notified.await;\n    }\n}\n\nenum PublishState {\n    /// The persist request for this shard success response has been received\n    /// but the position has not yet been published\n    AwaitingPublish(Position),\n    ///  The shard has been published up to this position (might happen before\n    ///  the persist success is received)\n    Published(Position),\n}\n\n#[derive(Default)]\nstruct ShardPublishStates {\n    states: HashMap<ShardId, PublishState>,\n    awaiting_count: usize,\n}\n\nimpl ShardPublishStates {\n    fn position_published(\n        &mut self,\n        shard_id: &ShardId,\n        new_position: &Position,\n        publish_complete_notifier: &Notify,\n    ) {\n        let Some(publish_state) = self.states.get_mut(shard_id) else {\n            return;\n        };\n\n        match publish_state {\n            PublishState::AwaitingPublish(shard_position) if new_position >= shard_position => {\n                *publish_state = PublishState::Published(new_position.clone());\n                self.awaiting_count -= 1;\n                if self.awaiting_count == 0 {\n                    publish_complete_notifier.notify_waiters();\n                }\n            }\n            PublishState::Published(current_position) if new_position > current_position => {\n                *current_position = new_position.clone();\n            }\n            PublishState::Published(_) | PublishState::AwaitingPublish(_) => {\n                // duplicate/out-of-order or not enough progress yet\n            }\n        }\n    }\n\n    fn position_persisted(&mut self, shard_id: &ShardId, new_position: &Position) {\n        if self.states.contains_key(shard_id) {\n       
     error!(%shard_id, \"shard persisted positions should not be tracked multiple times\");\n            return;\n        }\n\n        self.states.insert(\n            shard_id.clone(),\n            PublishState::AwaitingPublish(new_position.clone()),\n        );\n        self.awaiting_count += 1;\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use quickwit_proto::types::{IndexUid, ShardId, SourceUid};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_shard_publish_states() {\n        let mut shard_publish_states = ShardPublishStates::default();\n        let notifier = Arc::new(Notify::new());\n\n        let shard_id_1 = ShardId::from(\"test-shard-1\");\n        let shard_id_2 = ShardId::from(\"test-shard-2\");\n        let shard_id_3 = ShardId::from(\"test-shard-3\"); // not tracked\n\n        let notifier_receiver = notifier.clone();\n        let notified_subscription = notifier_receiver.notified();\n\n        shard_publish_states.position_persisted(&shard_id_1, &Position::offset(10usize));\n        assert_eq!(shard_publish_states.awaiting_count, 1);\n        shard_publish_states.position_persisted(&shard_id_2, &Position::offset(20usize));\n        assert_eq!(shard_publish_states.awaiting_count, 2);\n        shard_publish_states.position_published(&shard_id_1, &Position::offset(15usize), &notifier);\n        assert_eq!(shard_publish_states.awaiting_count, 1);\n        shard_publish_states.position_published(&shard_id_2, &Position::offset(20usize), &notifier);\n        assert_eq!(shard_publish_states.awaiting_count, 0);\n\n        // check that only the notification that was subscribed before holds a permit\n        tokio::time::timeout(Duration::from_millis(100), notifier.notified())\n            .await\n            .unwrap_err();\n        tokio::time::timeout(Duration::from_millis(100), notified_subscription)\n            .await\n            .unwrap();\n\n        // shard 3 is not tracked\n        
shard_publish_states.position_published(&shard_id_3, &Position::offset(10usize), &notifier);\n        assert_eq!(shard_publish_states.awaiting_count, 0);\n        assert!(!shard_publish_states.states.contains_key(&shard_id_3));\n    }\n\n    #[tokio::test]\n    async fn test_publish_tracker() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        let event_broker = EventBroker::default();\n        let tracker = PublishTracker::new(event_broker.clone());\n        let shard_id_1 = ShardId::from(\"test-shard-1\");\n        let shard_id_2 = ShardId::from(\"test-shard-2\");\n        let shard_id_3 = ShardId::from(\"test-shard-3\");\n        let shard_id_4 = ShardId::from(\"test-shard-4\");\n        let shard_id_5 = ShardId::from(\"test-shard-5\"); // not tracked\n\n        tracker.track_persisted_shard_position(shard_id_1.clone(), Position::offset(42usize));\n        tracker.track_persisted_shard_position(shard_id_2.clone(), Position::offset(42usize));\n        tracker.track_persisted_shard_position(shard_id_3.clone(), Position::offset(42usize));\n\n        event_broker.publish(ShardPositionsUpdate {\n            source_uid: SourceUid {\n                index_uid: index_uid.clone(),\n                source_id: \"test-source\".to_string(),\n            },\n            updated_shard_positions: vec![\n                (shard_id_1.clone(), Position::offset(42usize)),\n                (shard_id_2.clone(), Position::offset(666usize)),\n                (shard_id_5.clone(), Position::offset(888usize)),\n            ]\n            .into_iter()\n            .collect(),\n        });\n\n        event_broker.publish(ShardPositionsUpdate {\n            source_uid: SourceUid {\n                index_uid: index_uid.clone(),\n                source_id: \"test-source\".to_string(),\n            },\n            updated_shard_positions: vec![\n                (shard_id_3.clone(), Position::eof(42usize)),\n                (shard_id_4.clone(), 
Position::offset(42usize)),\n            ]\n            .into_iter()\n            .collect(),\n        });\n\n        // persist response received after the publish event\n        tracker.track_persisted_shard_position(shard_id_4.clone(), Position::offset(42usize));\n\n        tokio::time::timeout(Duration::from_millis(200), tracker.wait_publish_complete())\n            .await\n            .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_publish_tracker_waits() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        let shard_id_1 = ShardId::from(\"test-shard-1\");\n        let shard_id_2 = ShardId::from(\"test-shard-2\");\n        let position = Position::offset(42usize);\n\n        {\n            let event_broker = EventBroker::default();\n            let tracker = PublishTracker::new(event_broker.clone());\n            tracker.track_persisted_shard_position(shard_id_1.clone(), position.clone());\n            tracker.track_persisted_shard_position(shard_id_2.clone(), position.clone());\n\n            event_broker.publish(ShardPositionsUpdate {\n                source_uid: SourceUid {\n                    index_uid: index_uid.clone(),\n                    source_id: \"test-source\".to_string(),\n                },\n                updated_shard_positions: vec![(shard_id_1.clone(), position.clone())]\n                    .into_iter()\n                    .collect(),\n            });\n\n            tokio::time::timeout(Duration::from_millis(200), tracker.wait_publish_complete())\n                .await\n                .unwrap_err();\n        }\n        {\n            let event_broker = EventBroker::default();\n            let tracker = PublishTracker::new(event_broker.clone());\n            tracker.track_persisted_shard_position(shard_id_1.clone(), position.clone());\n            event_broker.publish(ShardPositionsUpdate {\n                source_uid: SourceUid {\n                    index_uid: index_uid.clone(),\n           
         source_id: \"test-source\".to_string(),\n                },\n                updated_shard_positions: vec![(shard_id_1.clone(), position.clone())]\n                    .into_iter()\n                    .collect(),\n            });\n            // sleep to make sure the event is processed\n            tokio::time::sleep(Duration::from_millis(50)).await;\n            tracker.track_persisted_shard_position(shard_id_2.clone(), position.clone());\n\n            tokio::time::timeout(Duration::from_millis(200), tracker.wait_publish_complete())\n                .await\n                .unwrap_err();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/rate_meter.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::tower::ConstantRate;\nuse tokio::time::Instant;\n\n/// A naive rate meter that tracks how much work was performed during a period of time defined by\n/// two successive calls to `harvest`.\n#[derive(Debug, Clone)]\npub(super) struct RateMeter {\n    total_work: u64,\n    harvested_at: Instant,\n}\n\nimpl Default for RateMeter {\n    fn default() -> Self {\n        Self {\n            total_work: 0,\n            harvested_at: Instant::now(),\n        }\n    }\n}\n\nimpl RateMeter {\n    /// Increments the amount of work performed since the last call to `harvest`.\n    pub fn update(&mut self, work: u64) {\n        self.total_work += work;\n    }\n\n    /// Returns the average work rate since the last call to this method and resets the internal\n    /// state.\n    pub fn harvest(&mut self) -> ConstantRate {\n        let now = Instant::now();\n        let elapsed = now.duration_since(self.harvested_at);\n        let rate = ConstantRate::new(self.total_work, elapsed);\n        self.total_work = 0;\n        self.harvested_at = now;\n        rate\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use quickwit_common::tower::Rate;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_rate_meter() {\n        tokio::time::pause();\n        let mut rate_meter = RateMeter::default();\n\n        let 
rate = rate_meter.harvest();\n        assert_eq!(rate.work(), 0);\n        assert!(rate.period().is_zero());\n\n        tokio::time::advance(Duration::from_millis(100)).await;\n\n        let rate = rate_meter.harvest();\n        assert_eq!(rate.work(), 0);\n        assert_eq!(rate.period(), Duration::from_millis(100));\n\n        rate_meter.update(1);\n        tokio::time::advance(Duration::from_millis(100)).await;\n\n        let rate = rate_meter.harvest();\n        assert_eq!(rate.work(), 1);\n        assert_eq!(rate.period(), Duration::from_millis(100));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/replication.md",
    "content": "## Replication\n\n### Sync replication\nFor each shard, leaders replicate the state of their local mrecordlog queues and associated metadata (positions) by sending replication requests to their followers. Then, they wait for followers to acknowledge the replication requests before returning success or failure responses to routers.\n\n### Replication stream\nTwo gRPC streams carry the independent flows of requests and responses between each leader-follower pair: the SYN replication stream (requests) and the ACK replication stream (responses). gRPC streams guarantee that the streamed messages are delivered in the order they are sent. However, gRPC bidirectional streaming does not guarantee that requests and responses match. Most of the logic implemented in `replication.rs` aims to \"zip\" the two streams together to fix this issue.\n\n### Life of a happy persist request\n1. Leader receives a persist request pre-assigned to a shard from a router.\n\n1. Leader forwards a replicate request to the follower of the shard via the SYN replication stream.\n\n1. Follower receives the replicate request, writes the data to its replica queue, and records the new position of the queue, called `replica_position`.\n\n1. Follower returns a replicate response to the leader via the ACK replication stream.\n\n1. Leader records the new position of the replica queue.\n\n1. Leader writes the data to its local mrecordlog queue and records the new position of the queue, called `primary_position`. It should match the `replica_position`.\n\n1. Leader returns a success persist response to the router.\n\n### Replication stream errors\n\n- When a replication request fails, the leader and follower close the shard(s) targeted by the request.\n\n- When a replication stream fails (transport error, timeout), the leader and follower close the shard(s) targeted by the stream. Then, the leader opens a new stream if necessary.\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/replication.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::time::{Duration, Instant};\n\nuse bytesize::ByteSize;\nuse futures::{Future, StreamExt};\nuse mrecordlog::error::CreateQueueError;\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_common::{ServiceStream, rate_limited_warn};\nuse quickwit_proto::ingest::ingester::{\n    AckReplicationMessage, IngesterStatus, InitReplicaRequest, InitReplicaResponse,\n    ReplicateFailure, ReplicateFailureReason, ReplicateRequest, ReplicateResponse,\n    ReplicateSubrequest, ReplicateSuccess, SynReplicationMessage, ack_replication_message,\n    syn_replication_message,\n};\nuse quickwit_proto::ingest::{CommitTypeV2, IngestV2Error, IngestV2Result, Shard, ShardState};\nuse quickwit_proto::types::{NodeId, QueueId};\nuse tokio::sync::mpsc::error::TryRecvError;\nuse tokio::sync::{mpsc, oneshot};\nuse tokio::task::JoinHandle;\nuse tracing::{error, warn};\n\nuse super::metrics::report_wal_usage;\nuse super::models::IngesterShard;\nuse super::mrecordlog_utils::check_enough_capacity;\nuse super::state::IngesterState;\nuse crate::ingest_v2::mrecordlog_utils::{AppendDocBatchError, append_non_empty_doc_batch};\nuse crate::metrics::INGEST_METRICS;\nuse crate::{estimate_size, with_lock_metrics};\n\npub(super) const SYN_REPLICATION_STREAM_CAPACITY: usize = 5;\n\n/// Duration after which replication requests time 
out with [`ReplicationError::Timeout`].\nconst REPLICATION_REQUEST_TIMEOUT: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(250)\n} else {\n    Duration::from_secs(3)\n};\n\n/// A replication request is sent by the leader to its follower to update the state of a replica\n/// shard.\n#[derive(Debug)]\npub(super) enum ReplicationRequest {\n    Init(InitReplicaRequest),\n    Replicate(ReplicateRequest),\n}\n\nimpl ReplicationRequest {\n    fn replication_seqno(&self) -> ReplicationSeqNo {\n        match self {\n            ReplicationRequest::Init(init_replica_request) => {\n                init_replica_request.replication_seqno\n            }\n            ReplicationRequest::Replicate(replicate_request) => replicate_request.replication_seqno,\n        }\n    }\n\n    fn set_replication_seqno(&mut self, replication_seqno: ReplicationSeqNo) {\n        match self {\n            ReplicationRequest::Init(init_replica_request) => {\n                init_replica_request.replication_seqno = replication_seqno\n            }\n            ReplicationRequest::Replicate(replicate_request) => {\n                replicate_request.replication_seqno = replication_seqno\n            }\n        }\n    }\n\n    fn into_syn_replication_message(self) -> SynReplicationMessage {\n        match self {\n            ReplicationRequest::Init(init_replica_request) => {\n                SynReplicationMessage::new_init_replica_request(init_replica_request)\n            }\n            ReplicationRequest::Replicate(replicate_request) => {\n                SynReplicationMessage::new_replicate_request(replicate_request)\n            }\n        }\n    }\n}\n\n#[derive(Debug)]\npub(super) enum ReplicationResponse {\n    Init(InitReplicaResponse),\n    Replicate(ReplicateResponse),\n}\n\nimpl ReplicationResponse {\n    fn replication_seqno(&self) -> ReplicationSeqNo {\n        match self {\n            ReplicationResponse::Init(init_replica_response) => {\n             
   init_replica_response.replication_seqno\n            }\n            ReplicationResponse::Replicate(replicate_response) => {\n                replicate_response.replication_seqno\n            }\n        }\n    }\n}\n\ntype OneShotReplicationRequest = (ReplicationRequest, oneshot::Sender<ReplicationResponse>);\n\n/// Replication sequence number.\ntype ReplicationSeqNo = u64;\n\n/// Task that \"powers\" the replication stream between a leader and a follower.\npub(super) struct ReplicationStreamTask {\n    leader_id: NodeId,\n    follower_id: NodeId,\n    replication_request_rx: mpsc::Receiver<OneShotReplicationRequest>,\n    syn_replication_stream_tx: mpsc::Sender<SynReplicationMessage>,\n    ack_replication_stream: ServiceStream<IngestV2Result<AckReplicationMessage>>,\n}\n\nimpl ReplicationStreamTask {\n    /// Spawns a [`ReplicationStreamTask`].\n    pub fn spawn(\n        leader_id: NodeId,\n        follower_id: NodeId,\n        syn_replication_stream_tx: mpsc::Sender<SynReplicationMessage>,\n        ack_replication_stream: ServiceStream<IngestV2Result<AckReplicationMessage>>,\n    ) -> ReplicationStreamTaskHandle {\n        let (replication_request_tx, replication_request_rx) =\n            mpsc::channel::<OneShotReplicationRequest>(3);\n\n        let replication_stream_task = Self {\n            leader_id,\n            follower_id,\n            replication_request_rx,\n            syn_replication_stream_tx,\n            ack_replication_stream,\n        };\n        let (enqueue_syn_requests_join_handle, dequeue_ack_responses_join_handle) =\n            replication_stream_task.run();\n\n        ReplicationStreamTaskHandle {\n            replication_request_tx,\n            enqueue_syn_requests_join_handle,\n            dequeue_ack_responses_join_handle,\n        }\n    }\n\n    /// Executes the request processing loop. 
It enqueues requests into the SYN replication stream\n    /// going to the follower then dequeues the responses returned from the ACK replication\n    /// stream. Additionally (and crucially), it ensures that requests and responses are\n    /// processed and returned in the same order. Conceptually, it is akin to \"zipping\" the SYN and\n    /// ACK replication streams together.\n    fn run(mut self) -> (JoinHandle<()>, JoinHandle<()>) {\n        // Response sequencer channel. It ensures that requests and responses are processed and\n        // returned in the same order.\n        //\n        // Channel capacity: there is no need to bound the capacity of the channel here\n        // because it is already virtually bounded by the capacity of the SYN replication\n        // stream.\n        let (response_sequencer_tx, mut response_sequencer_rx) = mpsc::unbounded_channel();\n\n        // This loop enqueues SYN replication requests into the SYN replication stream and passes\n        // the one-shot response sender to the \"dequeue\" loop via the sequencer channel.\n        let enqueue_syn_requests_fut = async move {\n            let mut replication_seqno = ReplicationSeqNo::default();\n            while let Some((mut replication_request, oneshot_replication_response_tx)) =\n                self.replication_request_rx.recv().await\n            {\n                replication_request.set_replication_seqno(replication_seqno);\n                replication_seqno += 1;\n\n                if response_sequencer_tx\n                    .send((\n                        replication_request.replication_seqno(),\n                        oneshot_replication_response_tx,\n                    ))\n                    .is_err()\n                {\n                    // The response sequencer receiver was dropped.\n                    return;\n                }\n                let syn_replication_message = replication_request.into_syn_replication_message();\n\n                if self\n  
                  .syn_replication_stream_tx\n                    .send(syn_replication_message)\n                    .await\n                    .is_err()\n                {\n                    // The SYN replication stream was closed.\n                    return;\n                }\n            }\n            // The replication client was dropped.\n        };\n        // This loop dequeues ACK replication responses from the ACK replication stream and forwards\n        // them to their respective clients using the one-shot sender associated with each response.\n        let dequeue_ack_responses_fut = async move {\n            while let Some(ack_replication_message_res) = self.ack_replication_stream.next().await {\n                let ack_replication_message = match ack_replication_message_res {\n                    Ok(ack_replication_message) => ack_replication_message,\n                    Err(_) => {\n                        return;\n                    }\n                };\n                let replication_response = match ack_replication_message.message {\n                    Some(ack_replication_message::Message::InitResponse(init_replica_response)) => {\n                        ReplicationResponse::Init(init_replica_response)\n                    }\n                    Some(ack_replication_message::Message::ReplicateResponse(\n                        replicate_response,\n                    )) => ReplicationResponse::Replicate(replicate_response),\n                    Some(ack_replication_message::Message::OpenResponse(_)) => {\n                        warn!(\"received unexpected ACK replication message\");\n                        continue;\n                    }\n                    None => {\n                        warn!(\"received empty ACK replication message\");\n                        continue;\n                    }\n                };\n                let oneshot_replication_response_tx = match response_sequencer_rx.try_recv() {\n                 
   Ok((replication_seqno, oneshot_replication_response_tx)) => {\n                        if replication_response.replication_seqno() != replication_seqno {\n                            error!(\n                                \"received out-of-order replication response: expected replication \\\n                                 seqno `{}`, got `{}`; closing replication stream from leader \\\n                                 `{}` to follower `{}`\",\n                                replication_seqno,\n                                replication_response.replication_seqno(),\n                                self.leader_id,\n                                self.follower_id,\n                            );\n                            return;\n                        }\n                        oneshot_replication_response_tx\n                    }\n                    Err(TryRecvError::Empty) => {\n                        panic!(\"response sequencer should not be empty\");\n                    }\n                    Err(TryRecvError::Disconnected) => {\n                        // The response sequencer sender was dropped.\n                        return;\n                    }\n                };\n                // We intentionally ignore the error here. 
It is the responsibility of the\n                // `replicate` method to surface it.\n                let _ = oneshot_replication_response_tx.send(replication_response);\n            }\n            // The ACK replication stream was closed.\n        };\n        (\n            tokio::spawn(enqueue_syn_requests_fut),\n            tokio::spawn(dequeue_ack_responses_fut),\n        )\n    }\n}\n\npub(super) struct ReplicationStreamTaskHandle {\n    replication_request_tx: mpsc::Sender<OneShotReplicationRequest>,\n    enqueue_syn_requests_join_handle: JoinHandle<()>,\n    dequeue_ack_responses_join_handle: JoinHandle<()>,\n}\n\nimpl ReplicationStreamTaskHandle {\n    /// Returns a [`ReplicationClient`] that can be used to enqueue replication requests\n    /// into the replication stream.\n    pub fn replication_client(&self) -> ReplicationClient {\n        ReplicationClient {\n            replication_request_tx: self.replication_request_tx.clone(),\n        }\n    }\n}\n\nimpl Drop for ReplicationStreamTaskHandle {\n    fn drop(&mut self) {\n        self.enqueue_syn_requests_join_handle.abort();\n        self.dequeue_ack_responses_join_handle.abort();\n    }\n}\n\n/// Error returned by the [`ReplicationClient`].\n#[derive(Debug, Clone, Copy, thiserror::Error)]\n#[error(\"failed to replicate records from leader to follower\")]\npub(super) enum ReplicationError {\n    /// The replication stream was closed.\n    #[error(\"replication stream was closed\")]\n    Closed,\n    /// The replication request timed out.\n    #[error(\"replication request timed out\")]\n    Timeout,\n}\n\n// DO NOT derive or implement `Clone` for this object.\n#[derive(Debug)]\npub(super) struct ReplicationClient {\n    replication_request_tx: mpsc::Sender<OneShotReplicationRequest>,\n}\n\n/// Single-use client that enqueues replication requests into the replication stream.\n///\n/// The `init_replica`, `replicate`, and `submit` methods take `self` instead of `&self`\n/// to produce 'static futures 
and enforce single-use semantics.\nimpl ReplicationClient {\n    /// Enqueues an init replica request into the replication stream and waits for the response.\n    /// Times out after [`REPLICATION_REQUEST_TIMEOUT`].\n    pub fn init_replica(\n        self,\n        replica_shard: Shard,\n    ) -> impl Future<Output = Result<InitReplicaResponse, ReplicationError>> + Send + 'static {\n        let init_replica_request = InitReplicaRequest {\n            replica_shard: Some(replica_shard),\n            replication_seqno: 0, // replication seqnos are assigned further down\n        };\n        let replication_request = ReplicationRequest::Init(init_replica_request);\n\n        async {\n            self.submit(replication_request)\n                .await\n                .map(|replication_response| {\n                    if let ReplicationResponse::Init(init_replica_response) = replication_response {\n                        init_replica_response\n                    } else {\n                        panic!(\"response should be an init replica response\")\n                    }\n                })\n        }\n    }\n\n    /// Enqueues a replicate request into the replication stream and waits for the response. 
Times\n    /// out after [`REPLICATION_REQUEST_TIMEOUT`].\n    pub fn replicate(\n        self,\n        leader_id: NodeId,\n        follower_id: NodeId,\n        subrequests: Vec<ReplicateSubrequest>,\n        commit_type: CommitTypeV2,\n    ) -> impl Future<Output = Result<ReplicateResponse, ReplicationError>> + Send + 'static {\n        let replicate_request = ReplicateRequest {\n            leader_id: leader_id.into(),\n            follower_id: follower_id.into(),\n            subrequests,\n            commit_type: commit_type as i32,\n            replication_seqno: 0, // replication seqnos are assigned further down\n        };\n        let replication_request = ReplicationRequest::Replicate(replicate_request);\n\n        async {\n            self.submit(replication_request)\n                .await\n                .map(|replication_response| {\n                    if let ReplicationResponse::Replicate(replicate_response) = replication_response\n                    {\n                        replicate_response\n                    } else {\n                        panic!(\"response should be a replicate response\")\n                    }\n                })\n        }\n    }\n\n    /// Submits a replication request to the replication stream and waits for the response.\n    fn submit(\n        self,\n        replication_request: ReplicationRequest,\n    ) -> impl Future<Output = Result<ReplicationResponse, ReplicationError>> + Send + 'static {\n        let (oneshot_replication_response_tx, oneshot_replication_response_rx) = oneshot::channel();\n\n        let send_recv_fut = async move {\n            self.replication_request_tx\n                .send((replication_request, oneshot_replication_response_tx))\n                .await\n                .map_err(|_| ReplicationError::Closed)?;\n            let replicate_response = oneshot_replication_response_rx\n                .await\n                .map_err(|_| ReplicationError::Closed)?;\n            
Ok(replicate_response)\n        };\n        async {\n            tokio::time::timeout(REPLICATION_REQUEST_TIMEOUT, send_recv_fut)\n                .await\n                .map_err(|_| ReplicationError::Timeout)?\n        }\n    }\n}\n\n/// Replication task executed for each replication stream.\npub(super) struct ReplicationTask {\n    leader_id: NodeId,\n    follower_id: NodeId,\n    state: IngesterState,\n    syn_replication_stream: ServiceStream<SynReplicationMessage>,\n    ack_replication_stream_tx: mpsc::UnboundedSender<IngestV2Result<AckReplicationMessage>>,\n    current_replication_seqno: ReplicationSeqNo,\n    disk_capacity: ByteSize,\n    memory_capacity: ByteSize,\n}\n\nimpl ReplicationTask {\n    pub fn spawn(\n        leader_id: NodeId,\n        follower_id: NodeId,\n        state: IngesterState,\n        syn_replication_stream: ServiceStream<SynReplicationMessage>,\n        ack_replication_stream_tx: mpsc::UnboundedSender<IngestV2Result<AckReplicationMessage>>,\n        disk_capacity: ByteSize,\n        memory_capacity: ByteSize,\n    ) -> ReplicationTaskHandle {\n        let mut replication_task = Self {\n            leader_id,\n            follower_id,\n            state,\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            current_replication_seqno: 0,\n            disk_capacity,\n            memory_capacity,\n        };\n        let join_handle = tokio::spawn(async move { replication_task.run().await });\n        ReplicationTaskHandle { join_handle }\n    }\n\n    async fn init_replica(\n        &mut self,\n        init_replica_request: InitReplicaRequest,\n    ) -> IngestV2Result<InitReplicaResponse> {\n        if init_replica_request.replication_seqno != self.current_replication_seqno {\n            return Err(IngestV2Error::Internal(format!(\n                \"received out-of-order replication request: expected replication seqno `{}`, got \\\n                 `{}`\",\n                
self.current_replication_seqno, init_replica_request.replication_seqno\n            )));\n        }\n        self.current_replication_seqno += 1;\n\n        let Some(replica_shard) = init_replica_request.replica_shard else {\n            warn!(\"received empty init replica request\");\n\n            return Err(IngestV2Error::Internal(\n                \"init replica request is empty\".to_string(),\n            ));\n        };\n        let queue_id = replica_shard.queue_id();\n\n        let mut state_guard =\n            with_lock_metrics!(self.state.lock_fully(), \"init_replica\", \"write\").await?;\n\n        match state_guard.mrecordlog.create_queue(&queue_id).await {\n            Ok(_) => {}\n            Err(CreateQueueError::AlreadyExists) => {\n                error!(\"WAL queue `{queue_id}` already exists\");\n                let message = format!(\"WAL queue `{queue_id}` already exists\");\n                return Err(IngestV2Error::Internal(message));\n            }\n            Err(CreateQueueError::IoError(io_error)) => {\n                error!(\"failed to create WAL queue `{queue_id}`: {io_error}\",);\n                let message = format!(\"failed to create WAL queue `{queue_id}`: {io_error}\");\n                return Err(IngestV2Error::Internal(message));\n            }\n        };\n        let index_uid = replica_shard.index_uid().clone();\n        let shard_id = replica_shard.shard_id().clone();\n        let source_id = replica_shard.source_id;\n        let leader_id = NodeId::from(replica_shard.leader_id);\n\n        let replica_shard =\n            IngesterShard::new_replica(index_uid, source_id, shard_id, leader_id).build();\n        state_guard.shards.insert(queue_id, replica_shard);\n\n        let init_replica_response = InitReplicaResponse {\n            replication_seqno: init_replica_request.replication_seqno,\n        };\n        Ok(init_replica_response)\n    }\n\n    async fn replicate(\n        &mut self,\n        replicate_request: 
ReplicateRequest,\n    ) -> IngestV2Result<ReplicateResponse> {\n        if replicate_request.leader_id != self.leader_id {\n            return Err(IngestV2Error::Internal(format!(\n                \"routing error: expected leader ID `{}`, got `{}`\",\n                self.leader_id, replicate_request.leader_id\n            )));\n        }\n        if replicate_request.follower_id != self.follower_id {\n            return Err(IngestV2Error::Internal(format!(\n                \"routing error: expected follower ID `{}`, got `{}`\",\n                self.follower_id, replicate_request.follower_id\n            )));\n        }\n        if replicate_request.replication_seqno != self.current_replication_seqno {\n            return Err(IngestV2Error::Internal(format!(\n                \"received out-of-order replication request: expected replication seqno `{}`, got \\\n                 `{}`\",\n                self.current_replication_seqno, replicate_request.replication_seqno\n            )));\n        }\n        let request_size_bytes = replicate_request.num_bytes();\n        let mut gauge_guard = GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.ingester_replicate);\n        gauge_guard.add(request_size_bytes as i64);\n\n        self.current_replication_seqno += 1;\n\n        let commit_type = replicate_request.commit_type();\n        let force_commit = commit_type == CommitTypeV2::Force;\n\n        let mut replicate_successes = Vec::with_capacity(replicate_request.subrequests.len());\n        let mut replicate_failures = Vec::new();\n\n        // Keep track of the shards that need to be closed following an IO error.\n        let mut shards_to_close: HashSet<QueueId> = HashSet::new();\n\n        // Keep track of dangling shards, i.e., shards for which there is no longer a corresponding\n        // queue in the WAL and should be deleted.\n        let mut shards_to_delete: HashSet<QueueId> = HashSet::new();\n\n        let mut state_guard =\n            
with_lock_metrics!(self.state.lock_fully(), \"replicate\", \"write\").await?;\n\n        if state_guard.status() != IngesterStatus::Ready {\n            replicate_failures.reserve_exact(replicate_request.subrequests.len());\n\n            for subrequest in replicate_request.subrequests {\n                let replicate_failure = ReplicateFailure {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    shard_id: subrequest.shard_id,\n                    reason: ReplicateFailureReason::ShardClosed as i32,\n                };\n                replicate_failures.push(replicate_failure);\n            }\n            let replicate_response = ReplicateResponse {\n                follower_id: replicate_request.follower_id,\n                successes: Vec::new(),\n                failures: replicate_failures,\n                replication_seqno: replicate_request.replication_seqno,\n            };\n            return Ok(replicate_response);\n        }\n        let now = Instant::now();\n\n        for subrequest in replicate_request.subrequests {\n            let queue_id = subrequest.queue_id();\n            let from_position_exclusive = subrequest.from_position_exclusive();\n\n            let Some(shard) = state_guard.shards.get(&queue_id) else {\n                let replicate_failure = ReplicateFailure {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    shard_id: subrequest.shard_id,\n                    reason: ReplicateFailureReason::ShardNotFound as i32,\n                };\n                replicate_failures.push(replicate_failure);\n                continue;\n            };\n            assert!(shard.is_replica());\n\n            if shard.is_closed() {\n                let replicate_failure = 
ReplicateFailure {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    shard_id: subrequest.shard_id,\n                    reason: ReplicateFailureReason::ShardClosed as i32,\n                };\n                replicate_failures.push(replicate_failure);\n                continue;\n            }\n            if shard.replication_position_inclusive != from_position_exclusive {\n                // TODO\n            }\n            let doc_batch = match subrequest.doc_batch {\n                Some(doc_batch) if !doc_batch.is_empty() => doc_batch,\n                _ => {\n                    warn!(\"received empty replicate request\");\n\n                    let replicate_success = ReplicateSuccess {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid,\n                        source_id: subrequest.source_id,\n                        shard_id: subrequest.shard_id,\n                        replication_position_inclusive: Some(\n                            shard.replication_position_inclusive.clone(),\n                        ),\n                    };\n                    replicate_successes.push(replicate_success);\n                    continue;\n                }\n            };\n\n            let batch_num_bytes = doc_batch.num_bytes() as u64;\n            let batch_num_docs = doc_batch.num_docs() as u64;\n\n            let requested_capacity = estimate_size(&doc_batch);\n\n            if let Err(error) = check_enough_capacity(\n                &state_guard.mrecordlog,\n                self.disk_capacity,\n                self.memory_capacity,\n                requested_capacity,\n            ) {\n                rate_limited_warn!(\n                    limit_per_min = 10,\n                    \"failed to replicate records to ingester `{}`: {error}\",\n         
           self.follower_id,\n                );\n                let replicate_failure = ReplicateFailure {\n                    subrequest_id: subrequest.subrequest_id,\n                    index_uid: subrequest.index_uid,\n                    source_id: subrequest.source_id,\n                    shard_id: subrequest.shard_id,\n                    reason: ReplicateFailureReason::WalFull as i32,\n                };\n                replicate_failures.push(replicate_failure);\n                continue;\n            };\n            let append_result = append_non_empty_doc_batch(\n                &mut state_guard.mrecordlog,\n                &queue_id,\n                doc_batch,\n                force_commit,\n            )\n            .await;\n\n            let current_position_inclusive = match append_result {\n                Ok(current_position_inclusive) => current_position_inclusive,\n                Err(append_error) => {\n                    let reason = match &append_error {\n                        AppendDocBatchError::Io(io_error) => {\n                            error!(\"failed to replicate records to shard `{queue_id}`: {io_error}\");\n                            shards_to_close.insert(queue_id);\n                            ReplicateFailureReason::ShardClosed\n                        }\n                        AppendDocBatchError::QueueNotFound(_) => {\n                            error!(\n                                \"failed to replicate records to shard `{queue_id}`: WAL queue not \\\n                                 found\"\n                            );\n                            shards_to_delete.insert(queue_id);\n                            ReplicateFailureReason::ShardNotFound\n                        }\n                    };\n                    let replicate_failure = ReplicateFailure {\n                        subrequest_id: subrequest.subrequest_id,\n                        index_uid: subrequest.index_uid,\n                        
source_id: subrequest.source_id,\n                        shard_id: subrequest.shard_id,\n                        reason: reason as i32,\n                    };\n                    replicate_failures.push(replicate_failure);\n                    continue;\n                }\n            };\n            state_guard\n                .shards\n                .get_mut(&queue_id)\n                .expect(\"replica shard should be initialized\")\n                .set_replication_position_inclusive(current_position_inclusive.clone(), now);\n\n            INGEST_METRICS\n                .replicated_num_bytes_total\n                .inc_by(batch_num_bytes);\n            INGEST_METRICS\n                .replicated_num_docs_total\n                .inc_by(batch_num_docs);\n\n            let replicate_success = ReplicateSuccess {\n                subrequest_id: subrequest.subrequest_id,\n                index_uid: subrequest.index_uid,\n                source_id: subrequest.source_id,\n                shard_id: subrequest.shard_id,\n                replication_position_inclusive: Some(current_position_inclusive),\n            };\n            replicate_successes.push(replicate_success);\n        }\n        if !shards_to_close.is_empty() {\n            for queue_id in &shards_to_close {\n                let shard = state_guard\n                    .shards\n                    .get_mut(queue_id)\n                    .expect(\"shard should exist\");\n\n                shard.shard_state = ShardState::Closed;\n                shard.notify_shard_status();\n                warn!(\"closed shard `{queue_id}` following IO error\");\n            }\n        }\n        if !shards_to_delete.is_empty() {\n            for queue_id in &shards_to_delete {\n                state_guard.shards.remove(queue_id);\n                warn!(\"deleted dangling shard `{queue_id}`\");\n            }\n        }\n        let wal_usage = state_guard.mrecordlog.resource_usage();\n        drop(state_guard);\n\n   
     report_wal_usage(wal_usage);\n\n        let follower_id = self.follower_id.clone().into();\n\n        let replicate_response = ReplicateResponse {\n            follower_id,\n            successes: replicate_successes,\n            failures: replicate_failures,\n            replication_seqno: replicate_request.replication_seqno,\n        };\n        Ok(replicate_response)\n    }\n\n    async fn run(&mut self) -> IngestV2Result<()> {\n        while let Some(syn_replication_message) = self.syn_replication_stream.next().await {\n            let ack_replication_message = match syn_replication_message.message {\n                Some(syn_replication_message::Message::OpenRequest(_)) => {\n                    panic!(\"TODO: this should not happen, internal error\");\n                }\n                Some(syn_replication_message::Message::InitRequest(init_replica_request)) => self\n                    .init_replica(init_replica_request)\n                    .await\n                    .map(AckReplicationMessage::new_init_replica_response),\n                Some(syn_replication_message::Message::ReplicateRequest(replicate_request)) => self\n                    .replicate(replicate_request)\n                    .await\n                    .map(AckReplicationMessage::new_replicate_response),\n                None => {\n                    warn!(\"received empty SYN replication message\");\n                    continue;\n                }\n            };\n            if self\n                .ack_replication_stream_tx\n                .send(ack_replication_message)\n                .is_err()\n            {\n                break;\n            }\n        }\n        Ok(())\n    }\n}\n\npub(super) struct ReplicationTaskHandle {\n    join_handle: JoinHandle<IngestV2Result<()>>,\n}\n\nimpl Drop for ReplicationTaskHandle {\n    fn drop(&mut self) {\n        self.join_handle.abort();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use quickwit_cluster::{ChannelTransport, 
create_cluster_for_test};\n    use quickwit_config::service::QuickwitService;\n    use quickwit_proto::ingest::ingester::{ReplicateSubrequest, ReplicateSuccess};\n    use quickwit_proto::ingest::{DocBatchV2, Shard};\n    use quickwit_proto::types::{IndexUid, Position, ShardId, queue_id};\n\n    use super::*;\n\n    fn into_init_replica_request(\n        syn_replication_message: SynReplicationMessage,\n    ) -> InitReplicaRequest {\n        let Some(syn_replication_message::Message::InitRequest(init_replica_request)) =\n            syn_replication_message.message\n        else {\n            panic!(\n                \"expected init replica SYN message, got `{:?}`\",\n                syn_replication_message.message\n            );\n        };\n        init_replica_request\n    }\n\n    fn into_replicate_request(syn_replication_message: SynReplicationMessage) -> ReplicateRequest {\n        let Some(syn_replication_message::Message::ReplicateRequest(replicate_request)) =\n            syn_replication_message.message\n        else {\n            panic!(\n                \"expected replicate SYN message, got `{:?}`\",\n                syn_replication_message.message\n            );\n        };\n        replicate_request\n    }\n\n    fn into_init_replica_response(\n        ack_replication_message: AckReplicationMessage,\n    ) -> InitReplicaResponse {\n        let Some(ack_replication_message::Message::InitResponse(init_replica_response)) =\n            ack_replication_message.message\n        else {\n            panic!(\n                \"expected init replica ACK message, got `{:?}`\",\n                ack_replication_message.message\n            );\n        };\n        init_replica_response\n    }\n\n    fn into_replicate_response(\n        ack_replication_message: AckReplicationMessage,\n    ) -> ReplicateResponse {\n        let Some(ack_replication_message::Message::ReplicateResponse(replicate_response)) =\n            ack_replication_message.message\n        else 
{\n            panic!(\n                \"expected replicate ACK message, got `{:?}`\",\n                ack_replication_message.message\n            );\n        };\n        replicate_response\n    }\n\n    #[tokio::test]\n    async fn test_replication_stream_task_init() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let (syn_replication_stream_tx, mut syn_replication_stream_rx) = mpsc::channel(5);\n        let (ack_replication_stream_tx, ack_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let replication_stream_task_handle = ReplicationStreamTask::spawn(\n            leader_id,\n            follower_id,\n            syn_replication_stream_tx,\n            ack_replication_stream,\n        );\n        let dummy_replication_task_future = async move {\n            while let Some(syn_replication_message) = syn_replication_stream_rx.recv().await {\n                let init_replica_request = into_init_replica_request(syn_replication_message);\n                let init_replica_response = InitReplicaResponse {\n                    replication_seqno: init_replica_request.replication_seqno,\n                };\n                let ack_replication_message =\n                    AckReplicationMessage::new_init_replica_response(init_replica_response);\n                ack_replication_stream_tx\n                    .send(Ok(ack_replication_message))\n                    .await\n                    .unwrap();\n            }\n        };\n        tokio::spawn(dummy_replication_task_future);\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let replica_shard = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: \"test-source\".to_string(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"test-leader\".to_string(),\n     
       follower_id: Some(\"test-follower\".to_string()),\n            ..Default::default()\n        };\n        let init_replica_response = replication_stream_task_handle\n            .replication_client()\n            .init_replica(replica_shard)\n            .await\n            .unwrap();\n        assert_eq!(init_replica_response.replication_seqno, 0);\n    }\n\n    #[tokio::test]\n    async fn test_replication_stream_task_replicate() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let (syn_replication_stream_tx, mut syn_replication_stream_rx) = mpsc::channel(5);\n        let (ack_replication_stream_tx, ack_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let replication_stream_task_handle = ReplicationStreamTask::spawn(\n            leader_id.clone(),\n            follower_id.clone(),\n            syn_replication_stream_tx,\n            ack_replication_stream,\n        );\n        let dummy_replication_task_future = async move {\n            while let Some(syn_replication_message) = syn_replication_stream_rx.recv().await {\n                let replicate_request = into_replicate_request(syn_replication_message);\n                let replicate_successes = replicate_request\n                    .subrequests\n                    .iter()\n                    .map(|subrequest| {\n                        let batch_len = subrequest.doc_batch.as_ref().unwrap().num_docs();\n                        let replication_position_inclusive = subrequest\n                            .from_position_exclusive()\n                            .as_usize()\n                            .map(|pos| pos + batch_len)\n                            .unwrap_or(batch_len - 1);\n                        ReplicateSuccess {\n                            subrequest_id: subrequest.subrequest_id,\n                            index_uid: subrequest.index_uid.clone(),\n      
                      source_id: subrequest.source_id.clone(),\n                            shard_id: subrequest.shard_id.clone(),\n                            replication_position_inclusive: Some(Position::offset(\n                                replication_position_inclusive,\n                            )),\n                        }\n                    })\n                    .collect::<Vec<_>>();\n\n                let replicate_response = ReplicateResponse {\n                    follower_id: replicate_request.follower_id,\n                    successes: replicate_successes,\n                    failures: Vec::new(),\n                    replication_seqno: replicate_request.replication_seqno,\n                };\n                let ack_replication_message =\n                    AckReplicationMessage::new_replicate_response(replicate_response);\n                ack_replication_stream_tx\n                    .send(Ok(ack_replication_message))\n                    .await\n                    .unwrap();\n            }\n        };\n        tokio::spawn(dummy_replication_task_future);\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid2: IndexUid = IndexUid::for_test(\"test-index\", 1);\n\n        let subrequests = vec![\n            ReplicateSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                from_position_exclusive: Some(Position::Beginning),\n            },\n            ReplicateSubrequest {\n                subrequest_id: 1,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(2)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-bar\", \"test-doc-baz\"])),\n  
              from_position_exclusive: Some(Position::Beginning),\n            },\n            ReplicateSubrequest {\n                subrequest_id: 2,\n                index_uid: Some(index_uid2.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-qux\", \"test-doc-tux\"])),\n                from_position_exclusive: Some(Position::offset(0u64)),\n            },\n        ];\n        let replicate_response = replication_stream_task_handle\n            .replication_client()\n            .replicate(\n                leader_id.clone(),\n                follower_id.clone(),\n                subrequests,\n                CommitTypeV2::Auto,\n            )\n            .await\n            .unwrap();\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 3);\n        assert_eq!(replicate_response.failures.len(), 0);\n        assert_eq!(replicate_response.replication_seqno, 0);\n\n        let replicate_success_0 = &replicate_response.successes[0];\n        assert_eq!(replicate_success_0.index_uid(), &index_uid);\n        assert_eq!(replicate_success_0.source_id, \"test-source\");\n        assert_eq!(replicate_success_0.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_success_0.replication_position_inclusive(),\n            Position::offset(0u64)\n        );\n\n        let replicate_success_1 = &replicate_response.successes[1];\n        assert_eq!(replicate_success_1.index_uid(), &index_uid);\n        assert_eq!(replicate_success_1.source_id, \"test-source\");\n        assert_eq!(replicate_success_1.shard_id(), ShardId::from(2));\n        assert_eq!(\n            replicate_success_1.replication_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let replicate_success_2 = &replicate_response.successes[2];\n        
assert_eq!(replicate_success_2.index_uid(), &index_uid2);\n        assert_eq!(replicate_success_2.source_id, \"test-source\");\n        assert_eq!(replicate_success_2.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_success_2.replication_position_inclusive(),\n            Position::offset(2u64)\n        );\n    }\n\n    #[tokio::test]\n    async fn test_replication_stream_replicate_errors() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let (syn_replication_stream_tx, _syn_replication_stream_rx) = mpsc::channel(5);\n        let (_ack_replication_stream_tx, ack_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let replication_stream_task_handle = ReplicationStreamTask::spawn(\n            leader_id.clone(),\n            follower_id.clone(),\n            syn_replication_stream_tx,\n            ack_replication_stream,\n        );\n        let timeout_error = replication_stream_task_handle\n            .replication_client()\n            .replicate(\n                leader_id.clone(),\n                follower_id.clone(),\n                Vec::new(),\n                CommitTypeV2::Auto,\n            )\n            .await\n            .unwrap_err();\n        assert!(matches!(timeout_error, ReplicationError::Timeout));\n\n        replication_stream_task_handle\n            .enqueue_syn_requests_join_handle\n            .abort();\n\n        let closed_error = replication_stream_task_handle\n            .replication_client()\n            .replicate(leader_id, follower_id, Vec::new(), CommitTypeV2::Auto)\n            .await\n            .unwrap_err();\n\n        assert!(matches!(closed_error, ReplicationError::Closed));\n    }\n\n    #[tokio::test]\n    async fn test_replication_task_happy_path() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = 
\"test-follower\".into();\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let (syn_replication_stream_tx, syn_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let (ack_replication_stream_tx, mut ack_replication_stream) =\n            ServiceStream::new_unbounded();\n\n        let disk_capacity = ByteSize::mb(256);\n        let memory_capacity = ByteSize::mb(1);\n\n        let _replication_task_handle = ReplicationTask::spawn(\n            leader_id,\n            follower_id,\n            state.clone(),\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            disk_capacity,\n            memory_capacity,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let index_uid2: IndexUid = IndexUid::for_test(\"test-index\", 1);\n\n        // Init shard 01.\n        let init_replica_request = InitReplicaRequest {\n            replica_shard: Some(Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-leader\".to_string(),\n                follower_id: Some(\"test-follower\".to_string()),\n                ..Default::default()\n            }),\n            replication_seqno: 0,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_init_replica_request(init_replica_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = 
ack_replication_stream.next().await.unwrap().unwrap();\n        let init_replica_response = into_init_replica_response(ack_replication_message);\n        assert_eq!(init_replica_response.replication_seqno, 0);\n\n        // Init shard 02.\n        let init_replica_request = InitReplicaRequest {\n            replica_shard: Some(Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(2)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-leader\".to_string(),\n                follower_id: Some(\"test-follower\".to_string()),\n                ..Default::default()\n            }),\n            replication_seqno: 1,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_init_replica_request(init_replica_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let init_replica_response = into_init_replica_response(ack_replication_message);\n        assert_eq!(init_replica_response.replication_seqno, 1);\n\n        // Init shard 11.\n        let init_replica_request = InitReplicaRequest {\n            replica_shard: Some(Shard {\n                index_uid: Some(index_uid2.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-leader\".to_string(),\n                follower_id: Some(\"test-follower\".to_string()),\n                ..Default::default()\n            }),\n            replication_seqno: 2,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_init_replica_request(init_replica_request);\n        
syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let init_replica_response = into_init_replica_response(ack_replication_message);\n        assert_eq!(init_replica_response.replication_seqno, 2);\n\n        let state_guard = state.lock_fully().await.unwrap();\n\n        let queue_id_01 = queue_id(&index_uid, \"test-source\", &ShardId::from(1));\n\n        let replica_shard_01 = state_guard.shards.get(&queue_id_01).unwrap();\n        replica_shard_01.assert_is_replica();\n        replica_shard_01.assert_is_open();\n        replica_shard_01.assert_replication_position(Position::Beginning);\n        replica_shard_01.assert_truncation_position(Position::Beginning);\n\n        assert!(state_guard.mrecordlog.queue_exists(&queue_id_01));\n\n        let queue_id_02 = queue_id(&index_uid, \"test-source\", &ShardId::from(2));\n\n        let replica_shard_02 = state_guard.shards.get(&queue_id_02).unwrap();\n        replica_shard_02.assert_is_replica();\n        replica_shard_02.assert_is_open();\n        replica_shard_02.assert_replication_position(Position::Beginning);\n        replica_shard_02.assert_truncation_position(Position::Beginning);\n\n        let queue_id_11 = queue_id(&index_uid2, \"test-source\", &ShardId::from(1));\n\n        let replica_shard_11 = state_guard.shards.get(&queue_id_11).unwrap();\n        replica_shard_11.assert_is_replica();\n        replica_shard_11.assert_is_open();\n        replica_shard_11.assert_replication_position(Position::Beginning);\n        replica_shard_11.assert_truncation_position(Position::Beginning);\n\n        drop(state_guard);\n\n        let replicate_request = ReplicateRequest {\n            leader_id: \"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            
subrequests: vec![\n                ReplicateSubrequest {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                    from_position_exclusive: Some(Position::Beginning),\n                },\n                ReplicateSubrequest {\n                    subrequest_id: 1,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(2)),\n                    doc_batch: Some(DocBatchV2::for_test([\"test-doc-bar\", \"test-doc-baz\"])),\n                    from_position_exclusive: Some(Position::Beginning),\n                },\n                ReplicateSubrequest {\n                    subrequest_id: 2,\n                    index_uid: Some(index_uid2.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    doc_batch: Some(DocBatchV2::for_test([\"test-doc-qux\", \"test-doc-tux\"])),\n                    from_position_exclusive: Some(Position::Beginning),\n                },\n            ],\n            replication_seqno: 3,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_replicate_request(replicate_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let replicate_response = into_replicate_response(ack_replication_message);\n\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 3);\n        assert_eq!(replicate_response.failures.len(), 
0);\n        assert_eq!(replicate_response.replication_seqno, 3);\n\n        let replicate_success_0 = &replicate_response.successes[0];\n        assert_eq!(replicate_success_0.index_uid(), &index_uid);\n        assert_eq!(replicate_success_0.source_id, \"test-source\");\n        assert_eq!(replicate_success_0.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_success_0.replication_position_inclusive(),\n            Position::offset(0u64)\n        );\n\n        let replicate_success_1 = &replicate_response.successes[1];\n        assert_eq!(replicate_success_1.index_uid(), &index_uid);\n        assert_eq!(replicate_success_1.source_id, \"test-source\");\n        assert_eq!(replicate_success_1.shard_id(), ShardId::from(2));\n        assert_eq!(\n            replicate_success_1.replication_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let replicate_success_2 = &replicate_response.successes[2];\n        assert_eq!(replicate_success_2.index_uid(), &index_uid2);\n        assert_eq!(replicate_success_2.source_id, \"test-source\");\n        assert_eq!(replicate_success_2.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_success_2.replication_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let state_guard = state.lock_fully().await.unwrap();\n\n        state_guard\n            .mrecordlog\n            .assert_records_eq(&queue_id_01, .., &[(0, [0, 0], \"test-doc-foo\")]);\n\n        state_guard.mrecordlog.assert_records_eq(\n            &queue_id_02,\n            ..,\n            &[(0, [0, 0], \"test-doc-bar\"), (1, [0, 0], \"test-doc-baz\")],\n        );\n\n        state_guard.mrecordlog.assert_records_eq(\n            &queue_id_11,\n            ..,\n            &[(0, [0, 0], \"test-doc-qux\"), (1, [0, 0], \"test-doc-tux\")],\n        );\n        drop(state_guard);\n\n        let replicate_request = ReplicateRequest {\n            leader_id: 
\"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![ReplicateSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-moo\"])),\n                from_position_exclusive: Some(Position::offset(0u64)),\n            }],\n            replication_seqno: 4,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_replicate_request(replicate_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let replicate_response = into_replicate_response(ack_replication_message);\n\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 1);\n        assert_eq!(replicate_response.failures.len(), 0);\n        assert_eq!(replicate_response.replication_seqno, 4);\n\n        let replicate_success_0 = &replicate_response.successes[0];\n        assert_eq!(replicate_success_0.index_uid(), &index_uid);\n        assert_eq!(replicate_success_0.source_id, \"test-source\");\n        assert_eq!(replicate_success_0.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_success_0.replication_position_inclusive(),\n            Position::offset(1u64)\n        );\n\n        let state_guard = state.lock_fully().await.unwrap();\n\n        state_guard.mrecordlog.assert_records_eq(\n            &queue_id_01,\n            ..,\n            &[(0, [0, 0], \"test-doc-foo\"), (1, [0, 0], \"test-doc-moo\")],\n        );\n    }\n\n    #[tokio::test]\n    async fn 
test_replication_task_shard_closed() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let (syn_replication_stream_tx, syn_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let (ack_replication_stream_tx, mut ack_replication_stream) =\n            ServiceStream::new_unbounded();\n\n        let disk_capacity = ByteSize::mb(256);\n        let memory_capacity = ByteSize::mb(1);\n\n        let _replication_task_handle = ReplicationTask::spawn(\n            leader_id.clone(),\n            follower_id,\n            state.clone(),\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            disk_capacity,\n            memory_capacity,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let replica_shard = IngesterShard::new_replica(\n            index_uid.clone(),\n            \"test-source\".to_string(),\n            ShardId::from(1),\n            leader_id,\n        )\n        .with_state(ShardState::Closed)\n        .build();\n        state\n            .lock_fully()\n            .await\n            .unwrap()\n            .shards\n            .insert(replica_shard.queue_id(), replica_shard);\n\n        let replicate_request = ReplicateRequest {\n            leader_id: \"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![ReplicateSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: 
\"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                from_position_exclusive: Position::offset(0u64).into(),\n            }],\n            replication_seqno: 0,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_replicate_request(replicate_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let replicate_response = into_replicate_response(ack_replication_message);\n\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 0);\n        assert_eq!(replicate_response.failures.len(), 1);\n\n        let replicate_failure = &replicate_response.failures[0];\n        assert_eq!(replicate_failure.index_uid(), &index_uid);\n        assert_eq!(replicate_failure.source_id, \"test-source\");\n        assert_eq!(replicate_failure.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_failure.reason(),\n            ReplicateFailureReason::ShardClosed\n        );\n    }\n\n    #[cfg(not(feature = \"failpoints\"))]\n    #[tokio::test]\n    async fn test_replication_task_deletes_dangling_shard() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let (syn_replication_stream_tx, syn_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n   
     let (ack_replication_stream_tx, mut ack_replication_stream) =\n            ServiceStream::new_unbounded();\n\n        let disk_capacity = ByteSize::mb(256);\n        let memory_capacity = ByteSize::mb(1);\n\n        let _replication_task_handle = ReplicationTask::spawn(\n            leader_id.clone(),\n            follower_id,\n            state.clone(),\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            disk_capacity,\n            memory_capacity,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let replica_shard = IngesterShard::new_replica(\n            index_uid.clone(),\n            \"test-source\".to_string(),\n            ShardId::from(1),\n            leader_id,\n        )\n        .build();\n        let queue_id_01 = replica_shard.queue_id();\n        state\n            .lock_fully()\n            .await\n            .unwrap()\n            .shards\n            .insert(queue_id_01.clone(), replica_shard);\n\n        let replicate_request = ReplicateRequest {\n            leader_id: \"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![ReplicateSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                from_position_exclusive: Position::offset(0u64).into(),\n            }],\n            replication_seqno: 0,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_replicate_request(replicate_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = 
ack_replication_stream.next().await.unwrap().unwrap();\n        let replicate_response = into_replicate_response(ack_replication_message);\n\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 0);\n        assert_eq!(replicate_response.failures.len(), 1);\n\n        let replicate_failure = &replicate_response.failures[0];\n        assert_eq!(replicate_failure.index_uid(), &index_uid);\n        assert_eq!(replicate_failure.source_id, \"test-source\");\n        assert_eq!(replicate_failure.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_failure.reason(),\n            ReplicateFailureReason::ShardNotFound\n        );\n\n        let state_guard = state.lock_partially().await.unwrap();\n        assert!(!state_guard.shards.contains_key(&queue_id_01));\n    }\n\n    // This test should be run manually and independently of other tests with the `failpoints`\n    // feature enabled:\n    // ```sh\n    // cargo test --manifest-path quickwit/Cargo.toml -p quickwit-ingest --features failpoints -- test_replication_task_closes_shard_on_io_error\n    // ```\n    #[cfg(feature = \"failpoints\")]\n    #[tokio::test]\n    async fn test_replication_task_closes_shard_on_io_error() {\n        let scenario = fail::FailScenario::setup();\n        fail::cfg(\"ingester:append_records\", \"return\").unwrap();\n\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let (syn_replication_stream_tx, syn_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let 
(ack_replication_stream_tx, mut ack_replication_stream) =\n            ServiceStream::new_unbounded();\n\n        let disk_capacity = ByteSize::mb(256);\n        let memory_capacity = ByteSize::mb(1);\n\n        let _replication_task_handle = ReplicationTask::spawn(\n            leader_id.clone(),\n            follower_id,\n            state.clone(),\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            disk_capacity,\n            memory_capacity,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let queue_id_01 = queue_id(&index_uid, \"test-source\", &ShardId::from(1));\n        let replica_shard = IngesterShard::new_replica(\n            index_uid.clone(),\n            \"test-source\".to_string(),\n            ShardId::from(1),\n            leader_id,\n        )\n        .build();\n        let mut state_guard = state.lock_fully().await.unwrap();\n\n        state_guard\n            .shards\n            .insert(queue_id_01.clone(), replica_shard);\n\n        state_guard\n            .mrecordlog\n            .create_queue(&queue_id_01)\n            .await\n            .unwrap();\n\n        drop(state_guard);\n\n        let replicate_request = ReplicateRequest {\n            leader_id: \"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![ReplicateSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                from_position_exclusive: Position::offset(0u64).into(),\n            }],\n            replication_seqno: 0,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_replicate_request(replicate_request);\n     
   syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let replicate_response = into_replicate_response(ack_replication_message);\n\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 0);\n        assert_eq!(replicate_response.failures.len(), 1);\n\n        let replicate_failure = &replicate_response.failures[0];\n        assert_eq!(replicate_failure.index_uid(), &index_uid);\n        assert_eq!(replicate_failure.source_id, \"test-source\");\n        assert_eq!(replicate_failure.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_failure.reason(),\n            ReplicateFailureReason::ShardClosed\n        );\n\n        let state_guard = state.lock_partially().await.unwrap();\n        let replica_shard = state_guard.shards.get(&queue_id_01).unwrap();\n        replica_shard.assert_is_closed();\n\n        scenario.teardown();\n    }\n\n    #[tokio::test]\n    async fn test_replication_task_resource_exhausted() {\n        let leader_id: NodeId = \"test-leader\".into();\n        let follower_id: NodeId = \"test-follower\".into();\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let (syn_replication_stream_tx, syn_replication_stream) =\n            ServiceStream::new_bounded(SYN_REPLICATION_STREAM_CAPACITY);\n        let (ack_replication_stream_tx, mut ack_replication_stream) =\n            ServiceStream::new_unbounded();\n\n        let disk_capacity = ByteSize(0);\n        let memory_capacity = ByteSize(0);\n\n        let _replication_task_handle = 
ReplicationTask::spawn(\n            leader_id.clone(),\n            follower_id,\n            state.clone(),\n            syn_replication_stream,\n            ack_replication_stream_tx,\n            disk_capacity,\n            memory_capacity,\n        );\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let replica_shard = IngesterShard::new_replica(\n            index_uid.clone(),\n            \"test-source\".to_string(),\n            ShardId::from(1),\n            leader_id,\n        )\n        .build();\n        let queue_id_01 = replica_shard.queue_id();\n        state\n            .lock_fully()\n            .await\n            .unwrap()\n            .shards\n            .insert(queue_id_01.clone(), replica_shard);\n\n        let replicate_request = ReplicateRequest {\n            leader_id: \"test-leader\".to_string(),\n            follower_id: \"test-follower\".to_string(),\n            commit_type: CommitTypeV2::Auto as i32,\n            subrequests: vec![ReplicateSubrequest {\n                subrequest_id: 0,\n                index_uid: Some(index_uid.clone()),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(1)),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                from_position_exclusive: Some(Position::Beginning),\n            }],\n            replication_seqno: 0,\n        };\n        let syn_replication_message =\n            SynReplicationMessage::new_replicate_request(replicate_request);\n        syn_replication_stream_tx\n            .send(syn_replication_message)\n            .await\n            .unwrap();\n        let ack_replication_message = ack_replication_stream.next().await.unwrap().unwrap();\n        let replicate_response = into_replicate_response(ack_replication_message);\n\n        assert_eq!(replicate_response.follower_id, \"test-follower\");\n        assert_eq!(replicate_response.successes.len(), 0);\n        
assert_eq!(replicate_response.failures.len(), 1);\n\n        let replicate_failure_0 = &replicate_response.failures[0];\n        assert_eq!(replicate_failure_0.index_uid(), &index_uid);\n        assert_eq!(replicate_failure_0.source_id, \"test-source\");\n        assert_eq!(replicate_failure_0.shard_id(), ShardId::from(1));\n        assert_eq!(\n            replicate_failure_0.reason(),\n            ReplicateFailureReason::WalFull\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/router.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::fmt;\nuse std::sync::{Arc, OnceLock, Weak};\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse futures::stream::FuturesUnordered;\nuse futures::{Future, StreamExt};\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_common::pubsub::{EventBroker, EventSubscriber};\nuse quickwit_common::{rate_limited_error, rate_limited_warn};\nuse quickwit_proto::control_plane::{\n    ControlPlaneService, ControlPlaneServiceClient, GetOrCreateOpenShardsRequest,\n    GetOrCreateOpenShardsSubrequest,\n};\nuse quickwit_proto::ingest::ingester::{\n    IngesterService, PersistFailureReason, PersistRequest, PersistResponse, PersistSubrequest,\n};\nuse quickwit_proto::ingest::router::{\n    IngestFailureReason, IngestRequestV2, IngestResponseV2, IngestRouterService,\n};\nuse quickwit_proto::ingest::{CommitTypeV2, IngestV2Error, IngestV2Result, RateLimitingCause};\nuse quickwit_proto::types::{NodeId, SubrequestId};\nuse serde_json::{Value as JsonValue, json};\nuse tokio::sync::{Mutex, Semaphore};\nuse tokio::time::error::Elapsed;\nuse tracing::{error, info};\n\nuse super::broadcast::IngesterCapacityScoreUpdate;\nuse super::debouncing::{\n    DebouncedGetOrCreateOpenShardsRequest, GetOrCreateOpenShardsRequestDebouncer,\n};\nuse super::ingester::PERSIST_REQUEST_TIMEOUT;\nuse 
super::metrics::IngestResultMetrics;\nuse super::routing_table::RoutingTable;\nuse super::workbench::IngestWorkbench;\nuse super::{IngesterPool, pending_subrequests};\nuse crate::get_ingest_router_buffer_size;\nuse crate::ingest_v2::metrics::INGEST_V2_METRICS;\n\n/// Duration after which ingest requests time out with [`IngestV2Error::Timeout`].\nfn ingest_request_timeout() -> Duration {\n    const DEFAULT_INGEST_REQUEST_TIMEOUT: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n        Duration::from_millis(10)\n    } else {\n        Duration::from_secs(35)\n    };\n    static TIMEOUT: OnceLock<Duration> = OnceLock::new();\n    *TIMEOUT.get_or_init(|| {\n        let duration_ms = quickwit_common::get_from_env(\n            \"QW_INGEST_REQUEST_TIMEOUT_MS\",\n            DEFAULT_INGEST_REQUEST_TIMEOUT.as_millis() as u64,\n            false,\n        );\n        let minimum_ingest_request_timeout: Duration =\n            PERSIST_REQUEST_TIMEOUT * (MAX_PERSIST_ATTEMPTS as u32) + Duration::from_secs(5);\n        let requested_ingest_request_timeout = Duration::from_millis(duration_ms);\n        if requested_ingest_request_timeout < minimum_ingest_request_timeout {\n            error!(\n                \"ingest request timeout too short {}ms, setting to {}ms\",\n                requested_ingest_request_timeout.as_millis(),\n                minimum_ingest_request_timeout.as_millis()\n            );\n            minimum_ingest_request_timeout\n        } else {\n            requested_ingest_request_timeout\n        }\n    })\n}\n\nconst MAX_PERSIST_ATTEMPTS: usize = 5;\n\ntype PersistResult = (PersistRequestSummary, IngestV2Result<PersistResponse>);\n\n#[derive(Clone)]\npub struct IngestRouter {\n    self_node_id: NodeId,\n    control_plane: ControlPlaneServiceClient,\n    ingester_pool: IngesterPool,\n    state: Arc<Mutex<RouterState>>,\n    replication_factor: usize,\n    // Limits the number of ingest requests in-flight to some capacity in bytes.\n    
ingest_semaphore: Arc<Semaphore>,\n    event_broker: EventBroker,\n}\n\nstruct RouterState {\n    // Debounces `GetOrCreateOpenShardsRequest` requests to the control plane.\n    debouncer: GetOrCreateOpenShardsRequestDebouncer,\n    // Routing table of nodes, their WAL capacity, and the number of open shards per source.\n    routing_table: RoutingTable,\n}\n\nimpl fmt::Debug for IngestRouter {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"IngestRouter\")\n            .field(\"self_node_id\", &self.self_node_id)\n            .field(\"replication_factor\", &self.replication_factor)\n            .finish()\n    }\n}\n\nimpl IngestRouter {\n    pub fn new(\n        self_node_id: NodeId,\n        control_plane: ControlPlaneServiceClient,\n        ingester_pool: IngesterPool,\n        replication_factor: usize,\n        event_broker: EventBroker,\n        self_availability_zone: Option<String>,\n    ) -> Self {\n        let state = Arc::new(Mutex::new(RouterState {\n            debouncer: GetOrCreateOpenShardsRequestDebouncer::default(),\n            routing_table: RoutingTable::new(self_availability_zone),\n        }));\n        let ingest_semaphore_permits = get_ingest_router_buffer_size().as_u64() as usize;\n        let ingest_semaphore = Arc::new(Semaphore::new(ingest_semaphore_permits));\n\n        Self {\n            self_node_id,\n            control_plane,\n            ingester_pool,\n            state,\n            replication_factor,\n            ingest_semaphore,\n            event_broker,\n        }\n    }\n\n    pub fn subscribe(&self) {\n        let weak_router_state = WeakRouterState(Arc::downgrade(&self.state));\n        self.event_broker\n            .subscribe::<IngesterCapacityScoreUpdate>(weak_router_state)\n            .forever();\n    }\n\n    /// Inspects the shard table for each subrequest and returns the appropriate\n    /// [`GetOrCreateOpenShardsRequest`] request if open shards do not exist for all of 
them.\n    async fn make_get_or_create_open_shard_request(\n        &self,\n        workbench: &mut IngestWorkbench,\n        ingester_pool: &IngesterPool,\n    ) -> DebouncedGetOrCreateOpenShardsRequest {\n        let mut debounced_request = DebouncedGetOrCreateOpenShardsRequest::default();\n        let unavailable_leaders: &HashSet<NodeId> = &workbench.unavailable_leaders;\n\n        let mut state_guard = self.state.lock().await;\n\n        for subrequest in pending_subrequests(&workbench.subworkbenches) {\n            if !state_guard.routing_table.has_open_nodes(\n                &subrequest.index_id,\n                &subrequest.source_id,\n                ingester_pool,\n                unavailable_leaders,\n            ) {\n                // No known nodes with open shards for this source. Ask the control\n                // plane to create shards so we have somewhere to route to.\n                let acquire_result = state_guard\n                    .debouncer\n                    .acquire(&subrequest.index_id, &subrequest.source_id);\n\n                match acquire_result {\n                    Ok(permit) => {\n                        let subrequest = GetOrCreateOpenShardsSubrequest {\n                            subrequest_id: subrequest.subrequest_id,\n                            index_id: subrequest.index_id.clone(),\n                            source_id: subrequest.source_id.clone(),\n                        };\n                        debounced_request.push_subrequest(subrequest, permit);\n                    }\n                    Err(barrier) => {\n                        debounced_request.push_barrier(barrier);\n                    }\n                }\n            }\n        }\n        drop(state_guard);\n\n        if !debounced_request.is_empty() && !workbench.closed_shards.is_empty() {\n            info!(closed_shards=?workbench.closed_shards, \"reporting closed shard(s) to control plane\");\n            debounced_request\n                
.closed_shards\n                .append(&mut workbench.closed_shards);\n        }\n        if !debounced_request.is_empty() && !unavailable_leaders.is_empty() {\n            info!(unavailable_leaders=?unavailable_leaders, \"reporting unavailable leader(s) to control plane\");\n\n            for unavailable_leader in unavailable_leaders.iter() {\n                debounced_request\n                    .unavailable_leaders\n                    .push(unavailable_leader.to_string());\n            }\n        }\n        debounced_request\n    }\n\n    async fn populate_routing_table_debounced(\n        &self,\n        workbench: &mut IngestWorkbench,\n        debounced_request: DebouncedGetOrCreateOpenShardsRequest,\n    ) {\n        let (request_opt, rendezvous) = debounced_request.take();\n\n        if let Some(request) = request_opt {\n            self.populate_routing_table(workbench, request).await;\n        }\n        rendezvous.wait().await;\n    }\n\n    /// Issues a [`GetOrCreateOpenShardsRequest`] request to the control plane and populates the\n    /// shard table according to the response received.\n    async fn populate_routing_table(\n        &self,\n        workbench: &mut IngestWorkbench,\n        request: GetOrCreateOpenShardsRequest,\n    ) {\n        if request.subrequests.is_empty() {\n            return;\n        }\n        let response_result = self.control_plane.get_or_create_open_shards(request).await;\n        let response = match response_result {\n            Ok(response) => response,\n            Err(control_plane_error) => {\n                if workbench.is_last_attempt() {\n                    rate_limited_error!(\n                        limit_per_min = 10,\n                        \"failed to get open shards from control plane: {control_plane_error}\"\n                    );\n                } else {\n                    rate_limited_warn!(\n                        limit_per_min = 10,\n                        \"failed to get open shards from 
control plane: {control_plane_error}\"\n                    );\n                };\n                return;\n            }\n        };\n        let mut state_guard = self.state.lock().await;\n\n        for success in response.successes {\n            state_guard.routing_table.merge_from_shards(\n                success.index_uid().clone(),\n                success.source_id,\n                success.open_shards,\n            );\n        }\n        drop(state_guard);\n\n        for failure in response.failures {\n            workbench.record_get_or_create_open_shards_failure(failure);\n        }\n    }\n\n    async fn process_persist_results(\n        &self,\n        workbench: &mut IngestWorkbench,\n        mut persist_futures: FuturesUnordered<impl Future<Output = PersistResult>>,\n    ) {\n        let mut unavailable_leaders: HashSet<NodeId> = HashSet::new();\n\n        while let Some((persist_summary, persist_result)) = persist_futures.next().await {\n            match persist_result {\n                Ok(persist_response) => {\n                    let leader_id = NodeId::from(persist_response.leader_id.clone());\n\n                    for persist_success in persist_response.successes {\n                        workbench.record_persist_success(persist_success);\n                    }\n                    for persist_failure in persist_response.failures {\n                        workbench.record_persist_failure(&persist_failure);\n\n                        match persist_failure.reason() {\n                            PersistFailureReason::NoShardsAvailable => {\n                                // For non-critical failures, we don't mark the nodes unavailable;\n                                // a routing update is piggybacked on PersistResponses, so shard\n                                // counts and capacity scores will be fresh on the next try.\n                            }\n                            PersistFailureReason::NodeUnavailable\n                  
          | PersistFailureReason::WalFull\n                            | PersistFailureReason::Timeout => {\n                                unavailable_leaders.insert(leader_id.clone());\n                            }\n                            _ => {}\n                        }\n                    }\n\n                    if let Some(routing_update) = persist_response.routing_update {\n                        // Since we just talked to the node, we take advantage and use the\n                        // opportunity to get a fresh routing update.\n                        let mut state_guard = self.state.lock().await;\n                        for shard_update in routing_update.source_shard_updates {\n                            state_guard.routing_table.apply_capacity_update(\n                                leader_id.clone(),\n                                shard_update.index_uid().clone(),\n                                shard_update.source_id,\n                                routing_update.capacity_score as usize,\n                                shard_update.open_shard_count as usize,\n                            );\n                        }\n                        drop(state_guard);\n\n                        workbench.closed_shards.extend(routing_update.closed_shards);\n                    }\n                }\n                Err(persist_error) => {\n                    if workbench.is_last_attempt() {\n                        rate_limited_error!(\n                            limit_per_min = 10,\n                            \"failed to persist records on ingester `{}`: {persist_error}\",\n                            persist_summary.leader_id\n                        );\n                    } else {\n                        rate_limited_warn!(\n                            limit_per_min = 10,\n                            \"failed to persist records on ingester `{}`: {persist_error}\",\n                            persist_summary.leader_id\n              
          );\n                    }\n                    workbench.record_persist_error(persist_error, persist_summary);\n                }\n            };\n        }\n        workbench.unavailable_leaders.extend(unavailable_leaders);\n    }\n\n    async fn batch_persist(&self, workbench: &mut IngestWorkbench, commit_type: CommitTypeV2) {\n        // Let's first create the shards that might be missing.\n        let debounced_request = self\n            .make_get_or_create_open_shard_request(workbench, &self.ingester_pool)\n            .await;\n\n        self.populate_routing_table_debounced(workbench, debounced_request)\n            .await;\n\n        let unavailable_leaders = &workbench.unavailable_leaders;\n        let mut no_shards_available_subrequest_ids: Vec<SubrequestId> = Vec::new();\n        let mut per_leader_persist_subrequests: HashMap<&NodeId, Vec<PersistSubrequest>> =\n            HashMap::new();\n\n        let state_guard = self.state.lock().await;\n\n        for subrequest in pending_subrequests(&workbench.subworkbenches) {\n            let ingester_node = state_guard.routing_table.pick_node(\n                &subrequest.index_id,\n                &subrequest.source_id,\n                &self.ingester_pool,\n                unavailable_leaders,\n            );\n\n            let ingester_node = match ingester_node {\n                Some(node) => node,\n                None => {\n                    no_shards_available_subrequest_ids.push(subrequest.subrequest_id);\n                    continue;\n                }\n            };\n            let az_locality = state_guard\n                .routing_table\n                .classify_az_locality(&ingester_node.node_id, &self.ingester_pool);\n            INGEST_V2_METRICS\n                .ingest_attempts\n                .with_label_values([az_locality])\n                .inc();\n            let persist_subrequest = PersistSubrequest {\n                subrequest_id: subrequest.subrequest_id,\n          
      index_uid: Some(ingester_node.index_uid.clone()),\n                source_id: subrequest.source_id.clone(),\n                doc_batch: subrequest.doc_batch.clone(),\n            };\n            per_leader_persist_subrequests\n                .entry(&ingester_node.node_id)\n                .or_default()\n                .push(persist_subrequest);\n        }\n        let persist_futures = FuturesUnordered::new();\n\n        for (leader_id, subrequests) in per_leader_persist_subrequests {\n            let leader_id: NodeId = leader_id.clone();\n            let subrequest_ids: Vec<SubrequestId> = subrequests\n                .iter()\n                .map(|subrequest| subrequest.subrequest_id)\n                .collect();\n            let Some(ingester) = self.ingester_pool.get(&leader_id).map(|h| h.client) else {\n                no_shards_available_subrequest_ids.extend(subrequest_ids);\n                continue;\n            };\n            let persist_summary = PersistRequestSummary {\n                leader_id: leader_id.clone(),\n                subrequest_ids,\n            };\n            let persist_request = PersistRequest {\n                leader_id: leader_id.into(),\n                subrequests,\n                commit_type: commit_type as i32,\n            };\n\n            let persist_future = async move {\n                let persist_result = tokio::time::timeout(\n                    PERSIST_REQUEST_TIMEOUT,\n                    ingester.persist(persist_request),\n                )\n                .await\n                .unwrap_or_else(|_| {\n                    let message = format!(\n                        \"persist request timed out after {} seconds\",\n                        PERSIST_REQUEST_TIMEOUT.as_secs()\n                    );\n                    Err(IngestV2Error::Timeout(message))\n                });\n                (persist_summary, persist_result)\n            };\n            persist_futures.push(persist_future);\n        }\n  
      drop(state_guard);\n\n        for subrequest_id in no_shards_available_subrequest_ids {\n            workbench.record_no_shards_available(subrequest_id);\n        }\n        self.process_persist_results(workbench, persist_futures)\n            .await;\n    }\n\n    async fn retry_batch_persist(\n        &self,\n        ingest_request: IngestRequestV2,\n        max_num_attempts: usize,\n    ) -> IngestResponseV2 {\n        let commit_type = ingest_request.commit_type();\n        let mut workbench = if matches!(commit_type, CommitTypeV2::Force | CommitTypeV2::WaitFor) {\n            IngestWorkbench::new_with_publish_tracking(\n                ingest_request.subrequests,\n                max_num_attempts,\n                self.event_broker.clone(),\n            )\n        } else {\n            IngestWorkbench::new(ingest_request.subrequests, max_num_attempts)\n        };\n        while !workbench.is_complete() {\n            workbench.new_attempt();\n            self.batch_persist(&mut workbench, commit_type).await;\n        }\n        workbench.into_ingest_result().await\n    }\n\n    async fn ingest_timeout(\n        &self,\n        ingest_request: IngestRequestV2,\n        timeout_duration: Duration,\n    ) -> IngestV2Result<IngestResponseV2> {\n        tokio::time::timeout(\n            timeout_duration,\n            self.retry_batch_persist(ingest_request, MAX_PERSIST_ATTEMPTS),\n        )\n        .await\n        .map_err(|_elapsed: Elapsed| {\n            let message = format!(\n                \"ingest request timed out after {} millis\",\n                timeout_duration.as_millis()\n            );\n            error!(\n                \"ingest request should not timeout as there is a timeout on independent ingest \\\n                 requests too. 
timeout after {}\",\n                timeout_duration.as_millis()\n            );\n            IngestV2Error::Timeout(message)\n        })\n    }\n\n    pub async fn debug_info(&self) -> JsonValue {\n        let state_guard = self.state.lock().await;\n        let routing_table_json = state_guard.routing_table.debug_info(&self.ingester_pool);\n\n        json!({\n            \"routing_table\": routing_table_json,\n        })\n    }\n}\n\nfn update_ingest_metrics(ingest_result: &IngestV2Result<IngestResponseV2>, num_subrequests: usize) {\n    let num_subrequests = num_subrequests as u64;\n    let ingest_results_metrics: &IngestResultMetrics = &INGEST_V2_METRICS.ingest_results;\n    match ingest_result {\n        Ok(ingest_response) => {\n            ingest_results_metrics\n                .success\n                .inc_by(ingest_response.successes.len() as u64);\n            for ingest_failure in &ingest_response.failures {\n                match ingest_failure.reason() {\n                    IngestFailureReason::CircuitBreaker => {\n                        ingest_results_metrics.circuit_breaker.inc();\n                    }\n                    IngestFailureReason::Unspecified => ingest_results_metrics.unspecified.inc(),\n                    IngestFailureReason::IndexNotFound => {\n                        ingest_results_metrics.index_not_found.inc()\n                    }\n                    IngestFailureReason::SourceNotFound => {\n                        ingest_results_metrics.source_not_found.inc()\n                    }\n                    IngestFailureReason::Internal => ingest_results_metrics.internal.inc(),\n                    IngestFailureReason::NoShardsAvailable => {\n                        ingest_results_metrics.no_shards_available.inc()\n                    }\n                    IngestFailureReason::ShardRateLimited => {\n                        ingest_results_metrics.shard_rate_limited.inc()\n                    }\n                    
IngestFailureReason::WalFull => ingest_results_metrics.wal_full.inc(),\n                    IngestFailureReason::Timeout => ingest_results_metrics.timeout.inc(),\n                    IngestFailureReason::RouterLoadShedding => {\n                        ingest_results_metrics.router_load_shedding.inc()\n                    }\n                    IngestFailureReason::LoadShedding => ingest_results_metrics.load_shedding.inc(),\n                }\n            }\n        }\n        Err(ingest_error) => match ingest_error {\n            IngestV2Error::TooManyRequests(rate_limiting_cause) => match rate_limiting_cause {\n                RateLimitingCause::RouterLoadShedding => {\n                    ingest_results_metrics\n                        .router_load_shedding\n                        .inc_by(num_subrequests);\n                }\n                RateLimitingCause::LoadShedding => {\n                    ingest_results_metrics.load_shedding.inc_by(num_subrequests)\n                }\n                RateLimitingCause::WalFull => {\n                    ingest_results_metrics.wal_full.inc_by(num_subrequests);\n                }\n                RateLimitingCause::CircuitBreaker => {\n                    ingest_results_metrics\n                        .circuit_breaker\n                        .inc_by(num_subrequests);\n                }\n                RateLimitingCause::ShardRateLimiting => {\n                    ingest_results_metrics\n                        .shard_rate_limited\n                        .inc_by(num_subrequests);\n                }\n                RateLimitingCause::Unknown => {\n                    ingest_results_metrics.unspecified.inc_by(num_subrequests);\n                }\n            },\n            IngestV2Error::Timeout(_) => {\n                ingest_results_metrics\n                    .router_timeout\n                    .inc_by(num_subrequests);\n            }\n            IngestV2Error::ShardNotFound { .. 
} => {\n                ingest_results_metrics\n                    .shard_not_found\n                    .inc_by(num_subrequests);\n            }\n            IngestV2Error::Unavailable(_) => {\n                ingest_results_metrics.unavailable.inc_by(num_subrequests);\n            }\n            IngestV2Error::Internal(_) => {\n                ingest_results_metrics.internal.inc_by(num_subrequests);\n            }\n        },\n    }\n}\n\n#[async_trait]\nimpl IngestRouterService for IngestRouter {\n    async fn ingest(&self, ingest_request: IngestRequestV2) -> IngestV2Result<IngestResponseV2> {\n        let request_size_bytes = ingest_request.num_bytes();\n\n        let mut gauge_guard = GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.ingest_router);\n        gauge_guard.add(request_size_bytes as i64);\n        let num_subrequests = ingest_request.subrequests.len();\n\n        let _permit = self\n            .ingest_semaphore\n            .clone()\n            .try_acquire_many_owned(request_size_bytes as u32)\n            .map_err(|_| IngestV2Error::TooManyRequests(RateLimitingCause::RouterLoadShedding))?;\n\n        let ingest_res = if ingest_request.commit_type() == CommitTypeV2::Auto {\n            self.ingest_timeout(ingest_request, ingest_request_timeout())\n                .await\n        } else {\n            Ok(self\n                .retry_batch_persist(ingest_request, MAX_PERSIST_ATTEMPTS)\n                .await)\n        };\n        update_ingest_metrics(&ingest_res, num_subrequests);\n\n        ingest_res\n    }\n}\n\n#[derive(Clone)]\nstruct WeakRouterState(Weak<Mutex<RouterState>>);\n\n#[async_trait]\nimpl EventSubscriber<IngesterCapacityScoreUpdate> for WeakRouterState {\n    async fn handle_event(&mut self, update: IngesterCapacityScoreUpdate) {\n        let Some(state) = self.0.upgrade() else {\n            return;\n        };\n        let mut state_guard = state.lock().await;\n        state_guard.routing_table.apply_capacity_update(\n         
   update.node_id,\n            update.source_uid.index_uid,\n            update.source_uid.source_id,\n            update.capacity_score,\n            update.open_shard_count,\n        );\n    }\n}\n\npub(super) struct PersistRequestSummary {\n    pub leader_id: NodeId,\n    pub subrequest_ids: Vec<SubrequestId>,\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::control_plane::{\n        GetOrCreateOpenShardsFailure, GetOrCreateOpenShardsFailureReason,\n        GetOrCreateOpenShardsResponse, GetOrCreateOpenShardsSuccess, MockControlPlaneService,\n    };\n    use quickwit_proto::ingest::ingester::{\n        IngesterServiceClient, IngesterStatus, MockIngesterService, PersistFailure,\n        PersistResponse, PersistSuccess, RoutingUpdate, SourceShardUpdate,\n    };\n    use quickwit_proto::ingest::router::IngestSubrequest;\n    use quickwit_proto::ingest::{\n        CommitTypeV2, DocBatchV2, ParseFailure, ParseFailureReason, Shard, ShardState,\n    };\n    use quickwit_proto::types::{DocUid, IndexUid, Position, ShardId, SourceUid};\n\n    use super::*;\n    use crate::IngesterPoolEntry;\n    use crate::ingest_v2::workbench::SubworkbenchFailure;\n\n    #[tokio::test]\n    async fn test_router_make_get_or_create_open_shard_request() {\n        let self_node_id = \"test-router\".into();\n        let control_plane: ControlPlaneServiceClient =\n            ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let mut workbench = IngestWorkbench::default();\n        let (get_or_create_open_shard_request_opt, rendezvous) = router\n            .make_get_or_create_open_shard_request(&mut workbench, 
&ingester_pool)\n            .await\n            .take();\n        assert!(get_or_create_open_shard_request_opt.is_none());\n        assert!(rendezvous.is_empty());\n\n        {\n            let mut state_guard = router.state.lock().await;\n            state_guard.routing_table.apply_capacity_update(\n                \"test-ingester-0\".into(),\n                IndexUid::for_test(\"test-index-0\", 0),\n                \"test-source\".to_string(),\n                8,\n                1,\n            );\n        }\n\n        let ingest_subrequests: Vec<IngestSubrequest> = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index-0\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                index_id: \"test-index-1\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n        ];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests.clone(), 3);\n        let (get_or_create_open_shard_request_opt, rendezvous_1) = router\n            .make_get_or_create_open_shard_request(&mut workbench, &ingester_pool)\n            .await\n            .take();\n\n        let get_or_create_open_shard_request = get_or_create_open_shard_request_opt.unwrap();\n        assert_eq!(get_or_create_open_shard_request.subrequests.len(), 2);\n\n        assert_eq!(rendezvous_1.num_permits(), 2);\n        assert_eq!(rendezvous_1.num_barriers(), 0);\n\n        let subrequest = &get_or_create_open_shard_request.subrequests[0];\n        assert_eq!(subrequest.index_id, \"test-index-0\");\n        assert_eq!(subrequest.source_id, \"test-source\");\n\n        let subrequest = &get_or_create_open_shard_request.subrequests[1];\n        assert_eq!(subrequest.index_id, \"test-index-1\");\n        
assert_eq!(subrequest.source_id, \"test-source\");\n\n        assert!(\n            get_or_create_open_shard_request\n                .unavailable_leaders\n                .is_empty()\n        );\n        assert!(workbench.unavailable_leaders.is_empty());\n\n        let (get_or_create_open_shard_request_opt, rendezvous_2) = router\n            .make_get_or_create_open_shard_request(&mut workbench, &ingester_pool)\n            .await\n            .take();\n\n        assert!(get_or_create_open_shard_request_opt.is_none());\n\n        assert_eq!(rendezvous_2.num_permits(), 0);\n        assert_eq!(rendezvous_2.num_barriers(), 2);\n\n        drop(rendezvous_1);\n        drop(rendezvous_2);\n\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry::mocked_ingester(),\n        );\n        {\n            // Ingester-0 is in pool and in table, but marked unavailable on the workbench\n            // (simulating a prior transport error). has_open_nodes returns false → both\n            // subrequests trigger CP request.\n            workbench\n                .unavailable_leaders\n                .insert(\"test-ingester-0\".into());\n            let (get_or_create_open_shard_request_opt, _rendezvous) = router\n                .make_get_or_create_open_shard_request(&mut workbench, &ingester_pool)\n                .await\n                .take();\n            let get_or_create_open_shard_request = get_or_create_open_shard_request_opt.unwrap();\n            assert_eq!(get_or_create_open_shard_request.subrequests.len(), 2);\n            assert_eq!(\n                get_or_create_open_shard_request.unavailable_leaders.len(),\n                1\n            );\n        }\n        {\n            // Fresh workbench: ingester-0 is in pool, in table, and NOT unavailable.\n            // has_open_nodes returns true for index-0 → only index-1 triggers request.\n            let mut workbench = IngestWorkbench::new(ingest_subrequests, 3);\n    
        let (get_or_create_open_shard_request_opt, _rendezvous) = router\n                .make_get_or_create_open_shard_request(&mut workbench, &ingester_pool)\n                .await\n                .take();\n            let get_or_create_open_shard_request = get_or_create_open_shard_request_opt.unwrap();\n            assert_eq!(get_or_create_open_shard_request.subrequests.len(), 1);\n\n            let subrequest = &get_or_create_open_shard_request.subrequests[0];\n            assert_eq!(subrequest.index_id, \"test-index-1\");\n            assert_eq!(subrequest.source_id, \"test-source\");\n\n            assert!(\n                get_or_create_open_shard_request\n                    .unavailable_leaders\n                    .is_empty()\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_router_populate_routing_table() {\n        let self_node_id = \"test-router\".into();\n\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        let index_uid2: IndexUid = IndexUid::for_test(\"test-index-1\", 0);\n        let mut mock_control_plane = MockControlPlaneService::new();\n        mock_control_plane\n            .expect_get_or_create_open_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 4);\n\n                let subrequest_0 = &request.subrequests[0];\n                assert_eq!(subrequest_0.index_id, \"test-index-0\");\n                assert_eq!(subrequest_0.source_id, \"test-source\");\n\n                let subrequest_1 = &request.subrequests[1];\n                assert_eq!(subrequest_1.index_id, \"test-index-1\");\n                assert_eq!(subrequest_1.source_id, \"test-source\");\n\n                let subrequest_2 = &request.subrequests[2];\n                assert_eq!(subrequest_2.index_id, \"index-not-found\");\n                assert_eq!(subrequest_2.source_id, \"test-source\");\n\n                let subrequest_3 = 
&request.subrequests[3];\n                assert_eq!(subrequest_3.index_id, \"test-index-0\");\n                assert_eq!(subrequest_3.source_id, \"source-not-found\");\n\n                let response = GetOrCreateOpenShardsResponse {\n                    successes: vec![\n                        GetOrCreateOpenShardsSuccess {\n                            subrequest_id: 0,\n                            index_uid: Some(index_uid.clone()),\n                            source_id: \"test-source\".to_string(),\n                            open_shards: vec![Shard {\n                                index_uid: Some(index_uid.clone()),\n                                source_id: \"test-source\".to_string(),\n                                shard_id: Some(ShardId::from(1)),\n                                shard_state: ShardState::Open as i32,\n                                leader_id: \"test-ingester-0\".to_string(),\n                                ..Default::default()\n                            }],\n                        },\n                        GetOrCreateOpenShardsSuccess {\n                            subrequest_id: 1,\n                            index_uid: Some(index_uid2.clone()),\n                            source_id: \"test-source\".to_string(),\n                            open_shards: vec![\n                                Shard {\n                                    index_uid: Some(index_uid2.clone()),\n                                    source_id: \"test-source\".to_string(),\n                                    shard_id: Some(ShardId::from(1)),\n                                    shard_state: ShardState::Open as i32,\n                                    leader_id: \"test-ingester-1\".to_string(),\n                                    ..Default::default()\n                                },\n                                Shard {\n                                    index_uid: Some(index_uid2.clone()),\n                                    source_id: 
\"test-source\".to_string(),\n                                    shard_id: Some(ShardId::from(2)),\n                                    shard_state: ShardState::Open as i32,\n                                    leader_id: \"test-ingester-1\".to_string(),\n                                    ..Default::default()\n                                },\n                            ],\n                        },\n                    ],\n                    failures: vec![\n                        GetOrCreateOpenShardsFailure {\n                            subrequest_id: 2,\n                            index_id: \"index-not-found\".to_string(),\n                            source_id: \"test-source\".to_string(),\n                            reason: GetOrCreateOpenShardsFailureReason::IndexNotFound as i32,\n                        },\n                        GetOrCreateOpenShardsFailure {\n                            subrequest_id: 3,\n                            index_id: \"test-index-0\".to_string(),\n                            source_id: \"source-not-found\".to_string(),\n                            reason: GetOrCreateOpenShardsFailureReason::SourceNotFound as i32,\n                        },\n                    ],\n                };\n                Ok(response)\n            });\n        let control_plane = ControlPlaneServiceClient::from_mock(mock_control_plane);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index-0\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n     
       },\n            IngestSubrequest {\n                subrequest_id: 1,\n                index_id: \"test-index-1\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 2,\n                index_id: \"index-not-found\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 3,\n                index_id: \"test-index-0\".to_string(),\n                source_id: \"source-not-found\".to_string(),\n                ..Default::default()\n            },\n        ];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n\n        let get_or_create_open_shards_request = GetOrCreateOpenShardsRequest {\n            subrequests: vec![\n                GetOrCreateOpenShardsSubrequest {\n                    subrequest_id: 0,\n                    index_id: \"test-index-0\".to_string(),\n                    source_id: \"test-source\".to_string(),\n                },\n                GetOrCreateOpenShardsSubrequest {\n                    subrequest_id: 1,\n                    index_id: \"test-index-1\".to_string(),\n                    source_id: \"test-source\".to_string(),\n                },\n                GetOrCreateOpenShardsSubrequest {\n                    subrequest_id: 2,\n                    index_id: \"index-not-found\".to_string(),\n                    source_id: \"test-source\".to_string(),\n                },\n                GetOrCreateOpenShardsSubrequest {\n                    subrequest_id: 3,\n                    index_id: \"test-index-0\".to_string(),\n                    source_id: \"source-not-found\".to_string(),\n                },\n            ],\n            closed_shards: Vec::new(),\n            unavailable_leaders: Vec::new(),\n        };\n        router\n           
 .populate_routing_table(&mut workbench, get_or_create_open_shards_request)\n            .await;\n\n        let subworkbench = workbench.subworkbenches.get(&2).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::IndexNotFound)\n        ));\n\n        let subworkbench = workbench.subworkbenches.get(&3).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::SourceNotFound)\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_router_batch_persist_records_no_shards_available_empty_routing_table() {\n        let self_node_id = \"test-router\".into();\n        let mut mock_control_plane = MockControlPlaneService::new();\n        mock_control_plane\n            .expect_get_or_create_open_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_id, \"test-index\");\n                assert_eq!(subrequest.source_id, \"test-source\");\n\n                let response = GetOrCreateOpenShardsResponse::default();\n                Ok(response)\n            });\n        let control_plane = ControlPlaneServiceClient::from_mock(mock_control_plane);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            index_id: \"test-index\".to_string(),\n            source_id: \"test-source\".to_string(),\n            ..Default::default()\n        }];\n        let mut workbench = 
IngestWorkbench::new(ingest_subrequests, 2);\n        let commit_type = CommitTypeV2::Auto;\n        router.batch_persist(&mut workbench, commit_type).await;\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::NoShardsAvailable)\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_router_batch_persist_records_no_shards_available_unavailable_ingester() {\n        let self_node_id = \"test-router\".into();\n        let mut mock_control_plane = MockControlPlaneService::new();\n        mock_control_plane\n            .expect_get_or_create_open_shards()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.subrequests.len(), 1);\n\n                let subrequest = &request.subrequests[0];\n                assert_eq!(subrequest.index_id, \"test-index\");\n                assert_eq!(subrequest.source_id, \"test-source\");\n\n                let response = GetOrCreateOpenShardsResponse {\n                    successes: vec![GetOrCreateOpenShardsSuccess {\n                        subrequest_id: 0,\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        open_shards: vec![Shard {\n                            index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                            source_id: \"test-source\".to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            shard_state: ShardState::Open as i32,\n                            leader_id: \"test-ingester\".into(),\n                            ..Default::default()\n                        }],\n                    }],\n                    ..Default::default()\n                };\n                Ok(response)\n            });\n        let control_plane = 
ControlPlaneServiceClient::from_mock(mock_control_plane);\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            index_id: \"test-index\".to_string(),\n            source_id: \"test-source\".to_string(),\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n        let commit_type = CommitTypeV2::Auto;\n        router.batch_persist(&mut workbench, commit_type).await;\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::NoShardsAvailable)\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_router_process_persist_results_record_persist_successes() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            index_id: \"test-index-0\".to_string(),\n            source_id: \"test-source\".to_string(),\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n        let 
persist_futures = FuturesUnordered::new();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n\n        persist_futures.push(async move {\n            let persist_summary = PersistRequestSummary {\n                leader_id: \"test-ingester-0\".into(),\n                subrequest_ids: vec![0],\n            };\n            let persist_result = Ok::<_, IngestV2Error>(PersistResponse {\n                leader_id: \"test-ingester-0\".to_string(),\n                successes: vec![PersistSuccess {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    ..Default::default()\n                }],\n                failures: Vec::new(),\n                routing_update: Some(RoutingUpdate {\n                    capacity_score: 6,\n                    source_shard_updates: Vec::new(),\n                    ..Default::default()\n                }),\n            });\n            (persist_summary, persist_result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.persist_success_opt,\n            Some(PersistSuccess { .. 
})\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_router_process_persist_results_record_persist_failures() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            index_id: \"test-index-0\".to_string(),\n            source_id: \"test-source\".to_string(),\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n        let persist_futures = FuturesUnordered::new();\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n\n        persist_futures.push(async move {\n            let persist_summary = PersistRequestSummary {\n                leader_id: \"test-ingester-0\".into(),\n                subrequest_ids: vec![0],\n            };\n            let persist_result = Ok::<_, IngestV2Error>(PersistResponse {\n                leader_id: \"test-ingester-0\".to_string(),\n                successes: Vec::new(),\n                failures: vec![PersistFailure {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    reason: PersistFailureReason::NoShardsAvailable as i32,\n                }],\n                routing_update: Some(RoutingUpdate {\n                    capacity_score: 6,\n                    source_shard_updates: Vec::new(),\n                    ..Default::default()\n                }),\n            });\n 
           (persist_summary, persist_result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Persist { .. })\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_router_process_persist_results_does_not_remove_unavailable_leaders() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n\n        let ingester_pool = IngesterPool::default();\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry::mocked_ingester(),\n        );\n        ingester_pool.insert(\n            \"test-ingester-1\".into(),\n            IngesterPoolEntry::mocked_ingester(),\n        );\n\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index-0\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                index_id: \"test-index-1\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n        ];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n        let persist_futures = FuturesUnordered::new();\n\n        persist_futures.push(async {\n            let persist_summary = 
PersistRequestSummary {\n                leader_id: \"test-ingester-0\".into(),\n                subrequest_ids: vec![0],\n            };\n            let persist_result =\n                Err::<_, IngestV2Error>(IngestV2Error::Internal(\"internal error\".to_string()));\n            (persist_summary, persist_result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            &subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Internal)\n        ));\n\n        assert!(\n            !workbench\n                .unavailable_leaders\n                .contains(&NodeId::from(\"test-ingester-1\"))\n        );\n        let persist_futures = FuturesUnordered::new();\n        persist_futures.push(async {\n            let persist_summary = PersistRequestSummary {\n                leader_id: \"test-ingester-1\".into(),\n                subrequest_ids: vec![1],\n            };\n            let persist_result =\n                Err::<_, IngestV2Error>(IngestV2Error::Unavailable(\"connection error\".to_string()));\n            (persist_summary, persist_result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n\n        // We do not remove the leader from the pool.\n        assert!(!ingester_pool.is_empty());\n        // ... 
but we mark it as unavailable.\n        assert!(\n            workbench\n                .unavailable_leaders\n                .contains(&NodeId::from(\"test-ingester-1\"))\n        );\n\n        let subworkbench = workbench.subworkbenches.get(&1).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Unavailable)\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_router_ingest() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            1,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        let index_uid_1: IndexUid = IndexUid::for_test(\"test-index-1\", 0);\n        {\n            let mut state_guard = router.state.lock().await;\n            state_guard.routing_table.merge_from_shards(\n                index_uid_0.clone(),\n                \"test-source\".to_string(),\n                vec![Shard {\n                    index_uid: Some(index_uid_0.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: \"test-ingester-0\".to_string(),\n                    ..Default::default()\n                }],\n            );\n            state_guard.routing_table.merge_from_shards(\n                index_uid_1.clone(),\n                \"test-source\".to_string(),\n                vec![Shard {\n                    index_uid: Some(index_uid_1.clone()),\n                    source_id: \"test-source\".to_string(),\n                    
shard_id: Some(ShardId::from(1)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: \"test-ingester-1\".to_string(),\n                    ..Default::default()\n                }],\n            );\n        }\n\n        let index_uid_0_clone = index_uid_0.clone();\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0\n            .expect_persist()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.leader_id, \"test-ingester-0\");\n                assert_eq!(request.subrequests.len(), 1);\n\n                Ok(PersistResponse {\n                    leader_id: request.leader_id,\n                    successes: vec![PersistSuccess {\n                        subrequest_id: 0,\n                        index_uid: Some(index_uid_0_clone.clone()),\n                        source_id: \"test-source\".to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        replication_position_inclusive: Some(Position::offset(1u64)),\n                        num_persisted_docs: 2,\n                        parse_failures: vec![ParseFailure {\n                            doc_uid: Some(DocUid::for_test(0)),\n                            reason: ParseFailureReason::InvalidJson as i32,\n                            message: \"invalid JSON\".to_string(),\n                        }],\n                    }],\n                    failures: Vec::new(),\n                    routing_update: Some(RoutingUpdate {\n                        capacity_score: 6,\n                        source_shard_updates: Vec::new(),\n                        ..Default::default()\n                    }),\n                })\n            });\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry {\n                client: IngesterServiceClient::from_mock(mock_ingester_0),\n                status: IngesterStatus::Ready,\n    
            availability_zone: None,\n            },\n        );\n\n        let mut mock_ingester_1 = MockIngesterService::new();\n        mock_ingester_1\n            .expect_persist()\n            .once()\n            .returning(move |request| {\n                assert_eq!(request.leader_id, \"test-ingester-1\");\n                assert_eq!(request.subrequests.len(), 1);\n\n                Ok(PersistResponse {\n                    leader_id: request.leader_id,\n                    successes: vec![PersistSuccess {\n                        subrequest_id: 1,\n                        index_uid: Some(index_uid_1.clone()),\n                        source_id: \"test-source\".to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        replication_position_inclusive: Some(Position::offset(0u64)),\n                        num_persisted_docs: 1,\n                        parse_failures: Vec::new(),\n                    }],\n                    failures: Vec::new(),\n                    routing_update: Some(RoutingUpdate {\n                        capacity_score: 6,\n                        source_shard_updates: Vec::new(),\n                        ..Default::default()\n                    }),\n                })\n            });\n        ingester_pool.insert(\n            \"test-ingester-1\".into(),\n            IngesterPoolEntry {\n                client: IngesterServiceClient::from_mock(mock_ingester_1),\n                availability_zone: None,\n                status: IngesterStatus::Ready,\n            },\n        );\n\n        let response = router\n            .ingest(IngestRequestV2 {\n                subrequests: vec![\n                    IngestSubrequest {\n                        subrequest_id: 0,\n                        index_id: \"test-index-0\".to_string(),\n                        source_id: \"test-source\".to_string(),\n                        doc_batch: Some(DocBatchV2::for_test([\"\", \"test-doc-foo\", 
\"test-doc-bar\"])),\n                    },\n                    IngestSubrequest {\n                        subrequest_id: 1,\n                        index_id: \"test-index-1\".to_string(),\n                        source_id: \"test-source\".to_string(),\n                        doc_batch: Some(DocBatchV2::for_test([\"test-doc-qux\"])),\n                    },\n                ],\n                commit_type: CommitTypeV2::Auto as i32,\n            })\n            .await\n            .unwrap();\n\n        assert_eq!(response.successes.len(), 2);\n        assert_eq!(response.failures.len(), 0);\n\n        let parse_failures = &response.successes[0].parse_failures;\n        assert_eq!(parse_failures.len(), 1);\n        assert_eq!(parse_failures[0].doc_uid(), DocUid::for_test(0));\n        assert_eq!(parse_failures[0].reason(), ParseFailureReason::InvalidJson);\n    }\n\n    #[tokio::test]\n    async fn test_router_ingest_retry() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            1,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        {\n            let mut state_guard = router.state.lock().await;\n            state_guard.routing_table.merge_from_shards(\n                index_uid.clone(),\n                \"test-source\".to_string(),\n                vec![Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: 
\"test-ingester-0\".to_string(),\n                    ..Default::default()\n                }],\n            );\n        }\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        // First attempt: returns NoShardsAvailable (transient, doesn't mark leader unavailable).\n        // The response still reports capacity_score=6 and 1 open shard so the node stays routable.\n        let index_uid_clone = index_uid.clone();\n        mock_ingester_0\n            .expect_persist()\n            .once()\n            .returning(move |request| {\n                Ok(PersistResponse {\n                    leader_id: request.leader_id,\n                    successes: Vec::new(),\n                    failures: vec![PersistFailure {\n                        subrequest_id: 0,\n                        index_uid: Some(index_uid_clone.clone()),\n                        source_id: \"test-source\".to_string(),\n                        reason: PersistFailureReason::NoShardsAvailable as i32,\n                    }],\n                    routing_update: Some(RoutingUpdate {\n                        capacity_score: 6,\n                        source_shard_updates: vec![SourceShardUpdate {\n                            index_uid: Some(index_uid_clone.clone()),\n                            source_id: \"test-source\".to_string(),\n                            open_shard_count: 1,\n                        }],\n                        ..Default::default()\n                    }),\n                })\n            });\n        // Second attempt: succeeds.\n        mock_ingester_0\n            .expect_persist()\n            .once()\n            .returning(move |request| {\n                Ok(PersistResponse {\n                    leader_id: request.leader_id,\n                    successes: vec![PersistSuccess {\n                        subrequest_id: 0,\n                        index_uid: Some(index_uid.clone()),\n                        source_id: \"test-source\".to_string(),\n       
                 shard_id: Some(ShardId::from(1)),\n                        replication_position_inclusive: Some(Position::offset(0u64)),\n                        num_persisted_docs: 1,\n                        parse_failures: Vec::new(),\n                    }],\n                    failures: Vec::new(),\n                    routing_update: Some(RoutingUpdate {\n                        capacity_score: 6,\n                        source_shard_updates: Vec::new(),\n                        ..Default::default()\n                    }),\n                })\n            });\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry {\n                client: IngesterServiceClient::from_mock(mock_ingester_0),\n                status: IngesterStatus::Ready,\n                availability_zone: None,\n            },\n        );\n\n        let response = router\n            .ingest(IngestRequestV2 {\n                subrequests: vec![IngestSubrequest {\n                    subrequest_id: 0,\n                    index_id: \"test-index-0\".to_string(),\n                    source_id: \"test-source\".to_string(),\n                    doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n                }],\n                commit_type: CommitTypeV2::Auto as i32,\n            })\n            .await\n            .unwrap();\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.failures.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_router_debug_info() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n      
      Some(\"test-az\".to_string()),\n        );\n        let index_uid_0: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        let index_uid_1: IndexUid = IndexUid::for_test(\"test-index-1\", 0);\n\n        {\n            let mut state_guard = router.state.lock().await;\n            state_guard.routing_table.merge_from_shards(\n                index_uid_0.clone(),\n                \"test-source\".to_string(),\n                vec![Shard {\n                    index_uid: Some(index_uid_0.clone()),\n                    shard_id: Some(ShardId::from(1)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: \"test-ingester-0\".to_string(),\n                    ..Default::default()\n                }],\n            );\n            state_guard.routing_table.merge_from_shards(\n                index_uid_1.clone(),\n                \"test-source\".to_string(),\n                vec![Shard {\n                    index_uid: Some(index_uid_1.clone()),\n                    shard_id: Some(ShardId::from(2)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: \"test-ingester-1\".to_string(),\n                    ..Default::default()\n                }],\n            );\n        }\n\n        let debug_info = router.debug_info().await;\n        let routing_table = &debug_info[\"routing_table\"];\n        assert_eq!(routing_table.as_object().unwrap().len(), 2);\n\n        let index_0_entries = routing_table[\"test-index-0\"].as_array().unwrap();\n        assert_eq!(index_0_entries.len(), 1);\n        assert_eq!(index_0_entries[0][\"node_id\"], \"test-ingester-0\");\n        assert_eq!(index_0_entries[0][\"capacity_score\"], 5);\n\n        let index_1_entries = routing_table[\"test-index-1\"].as_array().unwrap();\n        assert_eq!(index_1_entries.len(), 1);\n        assert_eq!(index_1_entries[0][\"node_id\"], \"test-ingester-1\");\n    }\n\n    #[tokio::test]\n    async fn 
test_router_returns_rate_limited_failure() {\n        let self_node_id = \"test-router\".into();\n        let control_plane = ControlPlaneServiceClient::from_mock(MockControlPlaneService::new());\n        let ingester_pool = IngesterPool::default();\n        let replication_factor = 1;\n        let router = IngestRouter::new(\n            self_node_id,\n            control_plane,\n            ingester_pool.clone(),\n            replication_factor,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index-0\", 0);\n        {\n            let mut state_guard = router.state.lock().await;\n            state_guard.routing_table.merge_from_shards(\n                index_uid.clone(),\n                \"test-source\".to_string(),\n                vec![Shard {\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    shard_id: Some(ShardId::from(1)),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: \"test-ingester-0\".to_string(),\n                    ..Default::default()\n                }],\n            );\n        }\n\n        let mut mock_ingester_0 = MockIngesterService::new();\n        mock_ingester_0.expect_persist().returning(move |request| {\n            assert_eq!(request.leader_id, \"test-ingester-0\");\n            assert_eq!(request.commit_type(), CommitTypeV2::Auto);\n            assert_eq!(request.subrequests.len(), 1);\n            let subrequest = &request.subrequests[0];\n            assert_eq!(subrequest.subrequest_id, 0);\n            let index_uid = subrequest.index_uid().clone();\n            assert_eq!(subrequest.source_id, \"test-source\");\n            assert_eq!(\n                subrequest.doc_batch,\n                Some(DocBatchV2::for_test([\"test-doc-foo\"]))\n            );\n\n            let response = PersistResponse 
{\n                leader_id: request.leader_id,\n                successes: Vec::new(),\n                failures: vec![PersistFailure {\n                    subrequest_id: 0,\n                    index_uid: Some(index_uid.clone()),\n                    source_id: \"test-source\".to_string(),\n                    reason: PersistFailureReason::NoShardsAvailable as i32,\n                }],\n                routing_update: Some(RoutingUpdate {\n                    capacity_score: 6,\n                    source_shard_updates: vec![SourceShardUpdate {\n                        index_uid: Some(index_uid),\n                        source_id: \"test-source\".to_string(),\n                        open_shard_count: 1,\n                    }],\n                    ..Default::default()\n                }),\n            };\n            Ok(response)\n        });\n        let ingester_0 = IngesterServiceClient::from_mock(mock_ingester_0);\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry {\n                client: ingester_0.clone(),\n                availability_zone: None,\n                status: IngesterStatus::Ready,\n            },\n        );\n\n        let ingest_request = IngestRequestV2 {\n            subrequests: vec![IngestSubrequest {\n                subrequest_id: 0,\n                index_id: \"test-index-0\".to_string(),\n                source_id: \"test-source\".to_string(),\n                doc_batch: Some(DocBatchV2::for_test([\"test-doc-foo\"])),\n            }],\n            commit_type: CommitTypeV2::Auto as i32,\n        };\n        let ingest_response = router.ingest(ingest_request).await.unwrap();\n        assert_eq!(ingest_response.successes.len(), 0);\n        assert_eq!(ingest_response.failures.len(), 1);\n        assert_eq!(\n            ingest_response.failures[0].reason(),\n            IngestFailureReason::NoShardsAvailable\n        );\n    }\n\n    #[tokio::test]\n    async fn 
test_router_updates_node_routing_table_on_capacity_update() {\n        let event_broker = EventBroker::default();\n        let ingester_pool = IngesterPool::default();\n        let router = IngestRouter::new(\n            \"test-router\".into(),\n            ControlPlaneServiceClient::from_mock(MockControlPlaneService::new()),\n            ingester_pool.clone(),\n            1,\n            event_broker.clone(),\n            Some(\"test-az\".to_string()),\n        );\n        router.subscribe();\n\n        event_broker.publish(IngesterCapacityScoreUpdate {\n            node_id: \"test-ingester-0\".into(),\n            source_uid: SourceUid {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".to_string(),\n            },\n            capacity_score: 7,\n            open_shard_count: 3,\n        });\n        // Give the async subscriber a moment to process.\n        tokio::time::sleep(Duration::from_millis(10)).await;\n\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry::mocked_ingester(),\n        );\n        let state_guard = router.state.lock().await;\n        let node = state_guard\n            .routing_table\n            .pick_node(\"test-index\", \"test-source\", &ingester_pool, &HashSet::new())\n            .unwrap();\n        assert_eq!(node.node_id, NodeId::from(\"test-ingester-0\"));\n    }\n\n    #[tokio::test]\n    async fn test_router_process_persist_results_marks_unavailable_on_persist_failure() {\n        let router = IngestRouter::new(\n            \"test-router\".into(),\n            ControlPlaneServiceClient::from_mock(MockControlPlaneService::new()),\n            IngesterPool::default(),\n            1,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                index_id: 
\"test-index-0\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                index_id: \"test-index-1\".to_string(),\n                source_id: \"test-source\".to_string(),\n                ..Default::default()\n            },\n        ];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n\n        // NoShardsAvailable does NOT mark the leader as unavailable.\n        let persist_futures = FuturesUnordered::new();\n        persist_futures.push(async {\n            let summary = PersistRequestSummary {\n                leader_id: \"test-ingester-0\".into(),\n                subrequest_ids: vec![0],\n            };\n            let result = Ok::<_, IngestV2Error>(PersistResponse {\n                leader_id: \"test-ingester-0\".to_string(),\n                successes: Vec::new(),\n                failures: vec![PersistFailure {\n                    subrequest_id: 0,\n                    index_uid: Some(IndexUid::for_test(\"test-index-0\", 0)),\n                    source_id: \"test-source\".to_string(),\n                    reason: PersistFailureReason::NoShardsAvailable as i32,\n                }],\n                routing_update: Some(RoutingUpdate {\n                    capacity_score: 6,\n                    source_shard_updates: Vec::new(),\n                    ..Default::default()\n                }),\n            });\n            (summary, result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n        assert!(\n            !workbench\n                .unavailable_leaders\n                .contains(&NodeId::from(\"test-ingester-0\"))\n        );\n\n        // NodeUnavailable DOES mark the leader as unavailable.\n        let persist_futures = FuturesUnordered::new();\n        persist_futures.push(async {\n            
let summary = PersistRequestSummary {\n                leader_id: \"test-ingester-1\".into(),\n                subrequest_ids: vec![1],\n            };\n            let result = Ok::<_, IngestV2Error>(PersistResponse {\n                leader_id: \"test-ingester-1\".to_string(),\n                successes: Vec::new(),\n                failures: vec![PersistFailure {\n                    subrequest_id: 1,\n                    index_uid: Some(IndexUid::for_test(\"test-index-1\", 0)),\n                    source_id: \"test-source\".to_string(),\n                    reason: PersistFailureReason::NodeUnavailable as i32,\n                }],\n                routing_update: Some(RoutingUpdate {\n                    capacity_score: 6,\n                    source_shard_updates: Vec::new(),\n                    ..Default::default()\n                }),\n            });\n            (summary, result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n        assert!(\n            workbench\n                .unavailable_leaders\n                .contains(&NodeId::from(\"test-ingester-1\"))\n        );\n    }\n\n    #[tokio::test]\n    async fn test_router_process_persist_results_applies_piggybacked_routing_updates() {\n        let ingester_pool = IngesterPool::default();\n        let router = IngestRouter::new(\n            \"test-router\".into(),\n            ControlPlaneServiceClient::from_mock(MockControlPlaneService::new()),\n            ingester_pool.clone(),\n            1,\n            EventBroker::default(),\n            Some(\"test-az\".to_string()),\n        );\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            index_id: \"test-index\".to_string(),\n            source_id: \"test-source\".to_string(),\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 2);\n\n        let 
persist_futures = FuturesUnordered::new();\n        persist_futures.push(async {\n            let summary = PersistRequestSummary {\n                leader_id: \"test-ingester-0\".into(),\n                subrequest_ids: vec![0],\n            };\n            let result = Ok::<_, IngestV2Error>(PersistResponse {\n                leader_id: \"test-ingester-0\".to_string(),\n                successes: Vec::new(),\n                failures: Vec::new(),\n                routing_update: Some(RoutingUpdate {\n                    capacity_score: 3,\n                    source_shard_updates: vec![SourceShardUpdate {\n                        index_uid: Some(IndexUid::for_test(\"test-index\", 0)),\n                        source_id: \"test-source\".to_string(),\n                        open_shard_count: 2,\n                    }],\n                    ..Default::default()\n                }),\n            });\n            (summary, result)\n        });\n        router\n            .process_persist_results(&mut workbench, persist_futures)\n            .await;\n\n        ingester_pool.insert(\n            \"test-ingester-0\".into(),\n            IngesterPoolEntry::mocked_ingester(),\n        );\n        let state_guard = router.state.lock().await;\n        let node = state_guard\n            .routing_table\n            .pick_node(\"test-index\", \"test-source\", &ingester_pool, &HashSet::new())\n            .unwrap();\n        assert_eq!(node.node_id, NodeId::from(\"test-ingester-0\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/routing_table.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\n\nuse itertools::Itertools;\nuse quickwit_proto::ingest::Shard;\nuse quickwit_proto::types::{IndexId, IndexUid, NodeId, SourceId};\nuse rand::rng;\nuse rand::seq::IndexedRandom;\n\nuse crate::IngesterPool;\n\n/// A single ingester node's routing-relevant data for a specific (index, source) pair.\n/// Each entry is self-describing: it carries its own node_id, index_uid, and source_id\n/// so it can always be attributed back to a specific source on a specific node.\n#[derive(Debug, Clone)]\npub(super) struct IngesterNode {\n    pub node_id: NodeId,\n    pub index_uid: IndexUid,\n    #[allow(unused)]\n    pub source_id: SourceId,\n    /// Score from 0-10. Higher means more available capacity.\n    pub capacity_score: usize,\n    /// Number of open shards on this node for this (index, source) pair. 
Tiebreaker for power of\n    /// two choices comparison - we favor a node with more open shards.\n    pub open_shard_count: usize,\n}\n\n#[derive(Debug, Default)]\npub(super) struct RoutingEntry {\n    pub nodes: HashMap<NodeId, IngesterNode>,\n}\n\n/// Given a slice of candidates, picks the better of two random choices.\n/// Higher capacity_score wins; tiebreak on more open_shard_count (more landing spots).\nfn power_of_two_choices<'a>(candidates: &[&'a IngesterNode]) -> &'a IngesterNode {\n    debug_assert!(candidates.len() >= 2);\n    let mut iter = candidates.choose_multiple(&mut rng(), 2);\n    let (&a, &b) = (iter.next().unwrap(), iter.next().unwrap());\n\n    if (a.capacity_score, a.open_shard_count) >= (b.capacity_score, b.open_shard_count) {\n        a\n    } else {\n        b\n    }\n}\n\nfn pick_from(candidates: Vec<&IngesterNode>) -> Option<&IngesterNode> {\n    match candidates.len() {\n        0 => None,\n        1 => Some(candidates[0]),\n        _ => Some(power_of_two_choices(&candidates)),\n    }\n}\n\nimpl RoutingEntry {\n    /// Pick an ingester node to persist the request to. Uses power of two choices based on reported\n    /// ingester capacity, if more than one eligible node exists. 
Prefers nodes in the same\n    /// availability zone, falling back to remote nodes.\n    fn pick_node(\n        &self,\n        ingester_pool: &IngesterPool,\n        unavailable_leaders: &HashSet<NodeId>,\n        self_availability_zone: &Option<String>,\n    ) -> Option<&IngesterNode> {\n        let (local_ingesters, remote_ingesters): (Vec<&IngesterNode>, Vec<&IngesterNode>) = self\n            .nodes\n            .values()\n            .filter(|node| {\n                node.capacity_score > 0\n                    && node.open_shard_count > 0\n                    && ingester_pool\n                        .get(&node.node_id)\n                        .map(|entry| entry.status.is_ready())\n                        .unwrap_or(false)\n                    && !unavailable_leaders.contains(&node.node_id)\n            })\n            .partition(|node| {\n                let node_az = ingester_pool\n                    .get(&node.node_id)\n                    .and_then(|h| h.availability_zone);\n                node_az == *self_availability_zone\n            });\n\n        pick_from(local_ingesters).or_else(|| pick_from(remote_ingesters))\n    }\n}\n\n#[derive(Debug, Default)]\npub(super) struct RoutingTable {\n    table: HashMap<(IndexId, SourceId), RoutingEntry>,\n    self_availability_zone: Option<String>,\n}\n\nimpl RoutingTable {\n    pub fn new(self_availability_zone: Option<String>) -> Self {\n        Self {\n            self_availability_zone,\n            ..Default::default()\n        }\n    }\n\n    pub fn pick_node(\n        &self,\n        index_id: &str,\n        source_id: &str,\n        ingester_pool: &IngesterPool,\n        unavailable_leaders: &HashSet<NodeId>,\n    ) -> Option<&IngesterNode> {\n        let key = (index_id.to_string(), source_id.to_string());\n        let entry = self.table.get(&key)?;\n        entry.pick_node(\n            ingester_pool,\n            unavailable_leaders,\n            &self.self_availability_zone,\n        )\n    }\n\n    
pub fn classify_az_locality(\n        &self,\n        target_node_id: &NodeId,\n        ingester_pool: &IngesterPool,\n    ) -> &'static str {\n        let Some(self_az) = &self.self_availability_zone else {\n            return \"az_unaware\";\n        };\n        let target_az = ingester_pool\n            .get(target_node_id)\n            .and_then(|entry| entry.availability_zone);\n        match target_az {\n            Some(ref az) if az == self_az => \"same_az\",\n            Some(_) => \"cross_az\",\n            None => \"az_unaware\",\n        }\n    }\n\n    pub fn debug_info(\n        &self,\n        ingester_pool: &IngesterPool,\n    ) -> HashMap<IndexId, Vec<serde_json::Value>> {\n        let mut per_index: HashMap<IndexId, Vec<serde_json::Value>> = HashMap::new();\n        for ((index_id, source_id), entry) in &self.table {\n            for (node_id, node) in &entry.nodes {\n                let az = ingester_pool.get(node_id).and_then(|h| h.availability_zone);\n                per_index\n                    .entry(index_id.clone())\n                    .or_default()\n                    .push(serde_json::json!({\n                        \"source_id\": source_id,\n                        \"node_id\": node_id,\n                        \"capacity_score\": node.capacity_score,\n                        \"open_shard_count\": node.open_shard_count,\n                        \"availability_zone\": az,\n                    }));\n            }\n        }\n        per_index\n    }\n\n    pub fn has_open_nodes(\n        &self,\n        index_id: &str,\n        source_id: &str,\n        ingester_pool: &IngesterPool,\n        unavailable_leaders: &HashSet<NodeId>,\n    ) -> bool {\n        let key = (index_id.to_string(), source_id.to_string());\n        let Some(entry) = self.table.get(&key) else {\n            return false;\n        };\n        entry.nodes.values().any(|node| {\n            node.capacity_score > 0\n                && node.open_shard_count > 0\n       
         && ingester_pool\n                    .get(&node.node_id)\n                    .map(|entry| entry.status.is_ready())\n                    .unwrap_or(false)\n                && !unavailable_leaders.contains(&node.node_id)\n        })\n    }\n\n    /// Applies a capacity update from the IngesterCapacityScoreUpdate broadcast. This is the\n    /// primary way the table learns about node availability and capacity.\n    pub fn apply_capacity_update(\n        &mut self,\n        node_id: NodeId,\n        index_uid: IndexUid,\n        source_id: SourceId,\n        capacity_score: usize,\n        open_shard_count: usize,\n    ) {\n        let key = (index_uid.index_id.to_string(), source_id.clone());\n\n        let entry = self.table.entry(key).or_default();\n        let ingester_node = IngesterNode {\n            node_id: node_id.clone(),\n            index_uid,\n            source_id,\n            capacity_score,\n            open_shard_count,\n        };\n        entry.nodes.insert(node_id, ingester_node);\n    }\n\n    /// Merges routing updates from a GetOrCreateOpenShards control plane response into the\n    /// table. 
For existing nodes, updates their open shard count, including if the count is 0, from\n    /// the CP response while preserving capacity scores if they already exist.\n    /// New nodes get a default capacity_score of 5.\n    pub fn merge_from_shards(\n        &mut self,\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shards: Vec<Shard>,\n    ) {\n        let per_leader_count: HashMap<NodeId, usize> = shards\n            .iter()\n            .map(|shard| {\n                let num_open_shards = shard.is_open() as usize;\n                let leader_id = NodeId::from(shard.leader_id.clone());\n                (leader_id, num_open_shards)\n            })\n            .into_grouping_map()\n            .sum();\n\n        let key = (index_uid.index_id.to_string(), source_id.clone());\n        let entry = self.table.entry(key).or_default();\n\n        for (node_id, open_shard_count) in per_leader_count {\n            entry\n                .nodes\n                .entry(node_id.clone())\n                .and_modify(|node| node.open_shard_count = open_shard_count)\n                .or_insert_with(|| IngesterNode {\n                    node_id,\n                    index_uid: index_uid.clone(),\n                    source_id: source_id.clone(),\n                    capacity_score: 5,\n                    open_shard_count,\n                });\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::ingest::ShardState;\n    use quickwit_proto::ingest::ingester::{IngesterServiceClient, IngesterStatus};\n    use quickwit_proto::types::ShardId;\n\n    use super::*;\n    use crate::IngesterPoolEntry;\n\n    fn mocked_ingester(availability_zone: Option<&str>) -> IngesterPoolEntry {\n        IngesterPoolEntry {\n            client: IngesterServiceClient::mocked(),\n            status: IngesterStatus::Ready,\n            availability_zone: availability_zone.map(|s| s.to_string()),\n        }\n    }\n\n    #[test]\n    fn 
test_apply_capacity_update() {\n        let mut table = RoutingTable::default();\n        let key = (\"test-index\".to_string(), \"test-source\".into());\n\n        // Insert first node.\n        table.apply_capacity_update(\n            \"node-1\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            8,\n            3,\n        );\n        let entry = table.table.get(&key).unwrap();\n        assert_eq!(entry.nodes.len(), 1);\n        assert_eq!(entry.nodes.get(\"node-1\").unwrap().capacity_score, 8);\n\n        // Update existing node.\n        table.apply_capacity_update(\n            \"node-1\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            4,\n            5,\n        );\n        let node = table.table.get(&key).unwrap().nodes.get(\"node-1\").unwrap();\n        assert_eq!(node.capacity_score, 4);\n        assert_eq!(node.open_shard_count, 5);\n\n        // Add second node.\n        table.apply_capacity_update(\n            \"node-2\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            6,\n            2,\n        );\n        assert_eq!(table.table.get(&key).unwrap().nodes.len(), 2);\n\n        // Zero shards: node stays in table but becomes ineligible for routing.\n        table.apply_capacity_update(\n            \"node-1\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            0,\n            0,\n        );\n        let entry = table.table.get(&key).unwrap();\n        assert_eq!(entry.nodes.len(), 2);\n        assert_eq!(entry.nodes.get(\"node-1\").unwrap().open_shard_count, 0);\n        assert_eq!(entry.nodes.get(\"node-1\").unwrap().capacity_score, 0);\n    }\n\n    #[test]\n    fn test_has_open_nodes() {\n        let mut table = RoutingTable::default();\n        let pool = IngesterPool::default();\n\n        // Empty table.\n      
  assert!(!table.has_open_nodes(\"test-index\", \"test-source\", &pool, &HashSet::new()));\n\n        // Node exists but is not in pool.\n        table.apply_capacity_update(\n            \"node-1\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            8,\n            3,\n        );\n        assert!(!table.has_open_nodes(\"test-index\", \"test-source\", &pool, &HashSet::new()));\n\n        // Node is in pool → true.\n        pool.insert(\"node-1\".into(), mocked_ingester(None));\n        assert!(table.has_open_nodes(\"test-index\", \"test-source\", &pool, &HashSet::new()));\n\n        // Node is unavailable → false.\n        let unavailable: HashSet<NodeId> = HashSet::from([\"node-1\".into()]);\n        assert!(!table.has_open_nodes(\"test-index\", \"test-source\", &pool, &unavailable));\n\n        // Second node available → true despite first being unavailable.\n        table.apply_capacity_update(\n            \"node-2\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            6,\n            2,\n        );\n        pool.insert(\"node-2\".into(), mocked_ingester(None));\n        assert!(table.has_open_nodes(\"test-index\", \"test-source\", &pool, &unavailable));\n\n        // Node with capacity_score=0 is not eligible.\n        table.apply_capacity_update(\n            \"node-2\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            0,\n            2,\n        );\n        assert!(!table.has_open_nodes(\"test-index\", \"test-source\", &pool, &unavailable));\n    }\n\n    #[test]\n    fn test_pick_node_prefers_same_az() {\n        let mut table = RoutingTable::new(Some(\"az-1\".to_string()));\n        let pool = IngesterPool::default();\n\n        table.apply_capacity_update(\n            \"node-1\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n        
    5,\n            1,\n        );\n        table.apply_capacity_update(\n            \"node-2\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            5,\n            1,\n        );\n        pool.insert(\"node-1\".into(), mocked_ingester(Some(\"az-1\")));\n        pool.insert(\"node-2\".into(), mocked_ingester(Some(\"az-2\")));\n\n        let picked = table\n            .pick_node(\"test-index\", \"test-source\", &pool, &HashSet::new())\n            .unwrap();\n        assert_eq!(picked.node_id, NodeId::from(\"node-1\"));\n    }\n\n    #[test]\n    fn test_pick_node_falls_back_to_cross_az() {\n        let mut table = RoutingTable::new(Some(\"az-1\".to_string()));\n        let pool = IngesterPool::default();\n\n        table.apply_capacity_update(\n            \"node-2\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            5,\n            1,\n        );\n        pool.insert(\"node-2\".into(), mocked_ingester(Some(\"az-2\")));\n\n        let picked = table\n            .pick_node(\"test-index\", \"test-source\", &pool, &HashSet::new())\n            .unwrap();\n        assert_eq!(picked.node_id, NodeId::from(\"node-2\"));\n    }\n\n    #[test]\n    fn test_pick_node_no_az_awareness() {\n        let mut table = RoutingTable::default();\n        let pool = IngesterPool::default();\n\n        table.apply_capacity_update(\n            \"node-1\".into(),\n            IndexUid::for_test(\"test-index\", 0),\n            \"test-source\".into(),\n            5,\n            1,\n        );\n        pool.insert(\"node-1\".into(), mocked_ingester(Some(\"az-1\")));\n\n        let picked = table\n            .pick_node(\"test-index\", \"test-source\", &pool, &HashSet::new())\n            .unwrap();\n        assert_eq!(picked.node_id, NodeId::from(\"node-1\"));\n    }\n\n    #[test]\n    fn test_pick_node_missing_entry() {\n        let table = 
RoutingTable::new(Some(\"az-1\".to_string()));\n        let pool = IngesterPool::default();\n\n        assert!(\n            table\n                .pick_node(\"nonexistent\", \"source\", &pool, &HashSet::new())\n                .is_none()\n        );\n    }\n\n    #[test]\n    fn test_power_of_two_choices() {\n        // 3 candidates: best appears in the random pair 2/3 of the time and always\n        // wins when it does, so it should win ~67% of 1000 runs. Asserting > 550\n        // is ~7.5 standard deviations from the mean — effectively impossible to flake.\n        let high = IngesterNode {\n            node_id: \"high\".into(),\n            index_uid: IndexUid::for_test(\"idx\", 0),\n            source_id: \"src\".into(),\n            capacity_score: 9,\n            open_shard_count: 2,\n        };\n        let mid = IngesterNode {\n            node_id: \"mid\".into(),\n            index_uid: IndexUid::for_test(\"idx\", 0),\n            source_id: \"src\".into(),\n            capacity_score: 5,\n            open_shard_count: 2,\n        };\n        let low = IngesterNode {\n            node_id: \"low\".into(),\n            index_uid: IndexUid::for_test(\"idx\", 0),\n            source_id: \"src\".into(),\n            capacity_score: 1,\n            open_shard_count: 2,\n        };\n        let candidates: Vec<&IngesterNode> = vec![&high, &mid, &low];\n\n        let mut high_wins = 0;\n        for _ in 0..1000 {\n            if power_of_two_choices(&candidates).node_id == \"high\" {\n                high_wins += 1;\n            }\n        }\n        assert!(high_wins > 550, \"high won only {high_wins}/1000 times\");\n    }\n\n    #[test]\n    fn test_merge_from_shards() {\n        let mut table = RoutingTable::default();\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let key = (\"test-index\".to_string(), \"test-source\".to_string());\n\n        let make_shard = |id: u64, leader: &str, open: bool| Shard {\n            index_uid: 
Some(index_uid.clone()),\n            source_id: \"test-source\".to_string(),\n            shard_id: Some(ShardId::from(id)),\n            shard_state: if open {\n                ShardState::Open as i32\n            } else {\n                ShardState::Closed as i32\n            },\n            leader_id: leader.to_string(),\n            ..Default::default()\n        };\n\n        // Two open shards on node-1, one open + one closed on node-2, only closed on node-3.\n        let shards = vec![\n            make_shard(1, \"node-1\", true),\n            make_shard(2, \"node-1\", true),\n            make_shard(3, \"node-2\", true),\n            make_shard(4, \"node-2\", false),\n            make_shard(5, \"node-3\", false),\n        ];\n        table.merge_from_shards(index_uid.clone(), \"test-source\".into(), shards);\n\n        let entry = table.table.get(&key).unwrap();\n        assert_eq!(entry.nodes.len(), 3);\n\n        let n1 = entry.nodes.get(\"node-1\").unwrap();\n        assert_eq!(n1.open_shard_count, 2);\n        assert_eq!(n1.capacity_score, 5);\n\n        let n2 = entry.nodes.get(\"node-2\").unwrap();\n        assert_eq!(n2.open_shard_count, 1);\n\n        let n3 = entry.nodes.get(\"node-3\").unwrap();\n        assert_eq!(n3.open_shard_count, 0);\n\n        // Merging again adds new nodes but preserves existing ones.\n        let shards = vec![make_shard(10, \"node-4\", true)];\n        table.merge_from_shards(index_uid, \"test-source\".into(), shards);\n\n        let entry = table.table.get(&key).unwrap();\n        assert_eq!(entry.nodes.len(), 4);\n        assert!(entry.nodes.contains_key(\"node-1\"));\n        assert!(entry.nodes.contains_key(\"node-2\"));\n        assert!(entry.nodes.contains_key(\"node-3\"));\n        assert!(entry.nodes.contains_key(\"node-4\"));\n    }\n\n    #[test]\n    fn test_classify_az_locality() {\n        let table = RoutingTable::new(Some(\"az-1\".to_string()));\n        let pool = IngesterPool::default();\n        
pool.insert(\"node-local\".into(), mocked_ingester(Some(\"az-1\")));\n        pool.insert(\"node-remote\".into(), mocked_ingester(Some(\"az-2\")));\n        pool.insert(\"node-no-az\".into(), mocked_ingester(None));\n\n        assert_eq!(\n            table.classify_az_locality(&\"node-local\".into(), &pool),\n            \"same_az\"\n        );\n        assert_eq!(\n            table.classify_az_locality(&\"node-remote\".into(), &pool),\n            \"cross_az\"\n        );\n        assert_eq!(\n            table.classify_az_locality(&\"node-no-az\".into(), &pool),\n            \"az_unaware\"\n        );\n\n        let table_no_az = RoutingTable::default();\n        assert_eq!(\n            table_no_az.classify_az_locality(&\"node-local\".into(), &pool),\n            \"az_unaware\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/state.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::ops::{Deref, DerefMut};\nuse std::path::Path;\nuse std::sync::{Arc, Weak};\nuse std::time::{Duration, Instant};\n\nuse bytesize::ByteSize;\nuse itertools::Itertools;\nuse mrecordlog::error::{DeleteQueueError, TruncateError};\nuse quickwit_cluster::Cluster;\nuse quickwit_common::pretty::PrettyDisplay;\nuse quickwit_common::rate_limiter::{RateLimiter, RateLimiterSettings};\nuse quickwit_common::shared_consts::INGESTER_STATUS_KEY;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::control_plane::AdviseResetShardsResponse;\nuse quickwit_proto::ingest::ingester::IngesterStatus;\nuse quickwit_proto::ingest::{IngestV2Error, IngestV2Result, ShardIds, ShardState};\nuse quickwit_proto::types::{DocMappingUid, IndexUid, Position, QueueId, SourceId, split_queue_id};\nuse tokio::sync::{Mutex, MutexGuard, RwLock, RwLockMappedWriteGuard, RwLockWriteGuard, watch};\nuse tracing::{error, info};\n\nuse super::models::IngesterShard;\nuse super::rate_meter::RateMeter;\nuse super::replication::{ReplicationStreamTaskHandle, ReplicationTaskHandle};\nuse super::wal_capacity_tracker::WalCapacityTracker;\nuse crate::ingest_v2::mrecordlog_utils::{force_delete_queue, queue_position_range};\nuse crate::mrecordlog_async::MultiRecordLogAsync;\nuse crate::{FollowerId, LeaderId, OpenShardCounts};\n\n/// Stores the 
state of the ingester and attempts to prevent deadlocks by exposing an API that\n/// guarantees that the internal data structures are always locked in the same order.\n///\n/// `lock_partially` locks `inner` only, while `lock_fully` locks both `inner` and `mrecordlog`. Use\n/// the former when you only need to access the in-memory state of the ingester and the latter when\n/// you need to access both the in-memory state AND the WAL.\n#[derive(Clone)]\npub(super) struct IngesterState {\n    // `inner` is a mutex because it's almost always accessed mutably.\n    inner: Arc<Mutex<InnerIngesterState>>,\n    mrecordlog: Arc<RwLock<Option<MultiRecordLogAsync>>>,\n    pub status_rx: watch::Receiver<IngesterStatus>,\n}\n\npub(super) struct InnerIngesterState {\n    pub shards: HashMap<QueueId, IngesterShard>,\n    pub doc_mappers: HashMap<DocMappingUid, Weak<DocMapper>>,\n    // Replication stream opened with followers.\n    pub replication_streams: HashMap<FollowerId, ReplicationStreamTaskHandle>,\n    // Replication tasks running for each replication stream opened with leaders.\n    pub replication_tasks: HashMap<LeaderId, ReplicationTaskHandle>,\n    cluster: Cluster,\n    pub wal_capacity_tracker: WalCapacityTracker,\n    status: IngesterStatus,\n    status_tx: watch::Sender<IngesterStatus>,\n}\n\nimpl InnerIngesterState {\n    pub fn status(&self) -> IngesterStatus {\n        self.status\n    }\n\n    pub async fn set_status(&mut self, status: IngesterStatus) {\n        self.status = status;\n        self.status_tx.send(status).expect(\"channel should be open\");\n        self.cluster\n            .set_self_key_value(INGESTER_STATUS_KEY, status.as_json_str_name())\n            .await;\n    }\n\n    /// Returns the shard with the most available permits for this index and source.\n    pub fn find_most_capacity_shard_mut(\n        &mut self,\n        index_uid: &IndexUid,\n        source_id: &SourceId,\n    ) -> Option<&mut IngesterShard> {\n        self.shards\n         
   .values_mut()\n            .filter(|shard| {\n                shard.is_open() && shard.index_uid == *index_uid && shard.source_id == *source_id\n            })\n            .map(|shard| (shard.rate_limiter.available_permits(), shard))\n            .max_by_key(|(available_permits, _)| *available_permits)\n            .map(|(_, shard)| shard)\n    }\n\n    /// Returns per-source open shard counts and closed shard IDs for all advertisable,\n    /// non-replica shards.\n    pub fn get_shard_snapshot(&self) -> (OpenShardCounts, Vec<ShardIds>) {\n        let grouped = self\n            .shards\n            .values()\n            .filter(|shard| shard.is_advertisable && !shard.is_replica())\n            .map(|shard| ((shard.index_uid.clone(), shard.source_id.clone()), shard))\n            .into_group_map();\n\n        let mut open_counts = Vec::new();\n        let mut closed_shards = Vec::new();\n\n        for ((index_uid, source_id), shards) in grouped {\n            let mut open_count = 0;\n            let mut closed_ids = Vec::new();\n\n            for shard in shards {\n                if shard.is_open() {\n                    open_count += 1;\n                } else if shard.is_closed() {\n                    closed_ids.push(shard.shard_id.clone());\n                }\n            }\n            open_counts.push((index_uid.clone(), source_id.clone(), open_count));\n            if !closed_ids.is_empty() {\n                closed_shards.push(ShardIds {\n                    index_uid: Some(index_uid),\n                    source_id,\n                    shard_ids: closed_ids,\n                });\n            }\n        }\n        (open_counts, closed_shards)\n    }\n}\n\nimpl IngesterState {\n    async fn create(cluster: Cluster, disk_capacity: ByteSize, memory_capacity: ByteSize) -> Self {\n        let status = IngesterStatus::Initializing;\n        let (status_tx, status_rx) = watch::channel(status);\n        let mut inner = InnerIngesterState {\n            
shards: Default::default(),\n            doc_mappers: Default::default(),\n            replication_streams: Default::default(),\n            replication_tasks: Default::default(),\n            cluster,\n            wal_capacity_tracker: WalCapacityTracker::new(disk_capacity, memory_capacity),\n            status,\n            status_tx,\n        };\n        // We call `set_status` here instead of setting it directly because it also updates the\n        // ingester status in chitchat.\n        inner.set_status(IngesterStatus::Initializing).await;\n\n        let inner = Arc::new(Mutex::new(inner));\n        let mrecordlog = Arc::new(RwLock::new(None));\n\n        Self {\n            inner,\n            mrecordlog,\n            status_rx,\n        }\n    }\n\n    pub async fn load(\n        cluster: Cluster,\n        wal_dir_path: &Path,\n        disk_capacity: ByteSize,\n        memory_capacity: ByteSize,\n        rate_limiter_settings: RateLimiterSettings,\n    ) -> Self {\n        let state = Self::create(cluster, disk_capacity, memory_capacity).await;\n        let state_clone = state.clone();\n        let wal_dir_path = wal_dir_path.to_path_buf();\n\n        let init_future = async move {\n            state_clone.init(&wal_dir_path, rate_limiter_settings).await;\n        };\n        tokio::spawn(init_future);\n\n        state\n    }\n\n    #[cfg(test)]\n    pub async fn for_test(cluster: Cluster) -> (tempfile::TempDir, Self) {\n        Self::for_test_with_disk_capacity(cluster, ByteSize::mb(256)).await\n    }\n\n    #[cfg(test)]\n    pub async fn for_test_with_disk_capacity(\n        cluster: Cluster,\n        disk_capacity: ByteSize,\n    ) -> (tempfile::TempDir, Self) {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let mut state = IngesterState::load(\n            cluster,\n            temp_dir.path(),\n            disk_capacity,\n            ByteSize::mb(256),\n            RateLimiterSettings::default(),\n        )\n        .await;\n\n        
state.wait_for_ready().await;\n\n        (temp_dir, state)\n    }\n\n    /// Initializes the internal state of the ingester. It loads the local WAL, then lists all its\n    /// queues. Empty queues are deleted, while non-empty queues are recovered. However, the\n    /// corresponding shards are closed and become read-only.\n    pub async fn init(&self, wal_dir_path: &Path, rate_limiter_settings: RateLimiterSettings) {\n        // Acquire locks in the same order as `lock_fully` (mrecordlog first, then inner) to\n        // prevent ABBA deadlocks with the broadcast capacity task.\n        let mut mrecordlog_guard = self.mrecordlog.write().await;\n        let mut inner_guard = self.inner.lock().await;\n\n        let now = Instant::now();\n\n        info!(\"opening WAL located at `{}`\", wal_dir_path.display());\n        let open_result = MultiRecordLogAsync::open_with_prefs(\n            wal_dir_path,\n            mrecordlog::PersistPolicy::OnDelay {\n                interval: Duration::from_secs(5),\n                // TODO maybe we want to fsync too?\n                action: mrecordlog::PersistAction::Flush,\n            },\n        )\n        .await;\n\n        let mut mrecordlog = match open_result {\n            Ok(mrecordlog) => {\n                info!(\n                    \"opened WAL successfully in {}\",\n                    now.elapsed().pretty_display()\n                );\n                mrecordlog\n            }\n            Err(error) => {\n                error!(\"failed to open WAL: {error}\");\n                inner_guard.set_status(IngesterStatus::Failed).await;\n                return;\n            }\n        };\n        let queue_ids: Vec<QueueId> = mrecordlog\n            .list_queues()\n            .map(|queue_id| queue_id.to_string())\n            .collect();\n\n        if !queue_ids.is_empty() {\n            info!(\"recovering {} shard(s)\", queue_ids.len());\n        }\n        let now = Instant::now();\n        let mut num_closed_shards = 
0;\n        let mut num_deleted_shards = 0;\n\n        for queue_id in queue_ids {\n            if let Some(position_range) = queue_position_range(&mrecordlog, &queue_id) {\n                let Some((index_uid, source_id, shard_id)) = split_queue_id(&queue_id) else {\n                    // `split_queue_id` already logs an error.\n                    continue;\n                };\n                // The queue is not empty: recover it.\n                let replication_position_inclusive = Position::offset(*position_range.end());\n                let truncation_position_inclusive = if *position_range.start() == 0 {\n                    Position::Beginning\n                } else {\n                    Position::offset(*position_range.start() - 1)\n                };\n                let rate_limiter = RateLimiter::from_settings(rate_limiter_settings);\n                let rate_meter = RateMeter::default();\n                // We want to advertise the shard as read-only right away.\n                let solo_shard =\n                    IngesterShard::new_solo(index_uid.clone(), source_id.clone(), shard_id.clone())\n                        .with_state(ShardState::Closed)\n                        .with_replication_position_inclusive(replication_position_inclusive)\n                        .with_truncation_position_inclusive(truncation_position_inclusive)\n                        .with_rate_limiter(rate_limiter)\n                        .with_rate_meter(rate_meter)\n                        .with_last_write(now)\n                        .advertisable()\n                        .build();\n                inner_guard.shards.insert(queue_id.clone(), solo_shard);\n\n                num_closed_shards += 1;\n            } else {\n                // The queue is empty: delete it.\n                if let Err(io_error) = force_delete_queue(&mut mrecordlog, &queue_id).await {\n                    error!(\"failed to delete 
shard `{queue_id}`: {io_error}\");\n                    continue;\n                }\n                num_deleted_shards += 1;\n            }\n        }\n        if num_closed_shards > 0 {\n            info!(\"recovered and closed {num_closed_shards} shard(s)\");\n        }\n        if num_deleted_shards > 0 {\n            info!(\"deleted {num_deleted_shards} empty shard(s)\");\n        }\n        mrecordlog_guard.replace(mrecordlog);\n        inner_guard.set_status(IngesterStatus::Ready).await;\n    }\n\n    pub async fn wait_for_ready(&mut self) {\n        self.status_rx\n            .wait_for(|status| *status == IngesterStatus::Ready)\n            .await\n            .expect(\"channel should be open\");\n    }\n\n    pub async fn lock_partially(&self) -> IngestV2Result<PartiallyLockedIngesterState<'_>> {\n        if *self.status_rx.borrow() == IngesterStatus::Initializing {\n            return Err(IngestV2Error::Internal(\n                \"ingester is initializing\".to_string(),\n            ));\n        }\n        let inner_guard = self.inner.lock().await;\n\n        if inner_guard.status() == IngesterStatus::Failed {\n            return Err(IngestV2Error::Internal(\n                \"failed to initialize ingester\".to_string(),\n            ));\n        }\n        let partial_lock = PartiallyLockedIngesterState { inner: inner_guard };\n        Ok(partial_lock)\n    }\n\n    pub async fn lock_fully(&self) -> IngestV2Result<FullyLockedIngesterState<'_>> {\n        if *self.status_rx.borrow() == IngesterStatus::Initializing {\n            return Err(IngestV2Error::Internal(\n                \"ingester is initializing\".to_string(),\n            ));\n        }\n        // We assume that the mrecordlog lock is the most \"expensive\" one to acquire, so we acquire\n        // it first.\n        let mrecordlog_opt_guard = self.mrecordlog.write().await;\n        let inner_guard = self.inner.lock().await;\n\n        if inner_guard.status() == IngesterStatus::Failed {\n 
           return Err(IngestV2Error::Internal(\n                \"failed to initialize ingester\".to_string(),\n            ));\n        }\n        let mrecordlog_guard = RwLockWriteGuard::map(mrecordlog_opt_guard, |mrecordlog_opt| {\n            mrecordlog_opt\n                .as_mut()\n                .expect(\"mrecordlog should be initialized\")\n        });\n        let full_lock = FullyLockedIngesterState {\n            inner: inner_guard,\n            mrecordlog: mrecordlog_guard,\n        };\n        Ok(full_lock)\n    }\n\n    // Leaks the mrecordlog lock for use in fetch tasks. It's safe to do so because fetch tasks\n    // never attempt to lock the inner state.\n    pub fn mrecordlog(&self) -> Arc<RwLock<Option<MultiRecordLogAsync>>> {\n        self.mrecordlog.clone()\n    }\n\n    pub fn weak(&self) -> WeakIngesterState {\n        WeakIngesterState {\n            inner: Arc::downgrade(&self.inner),\n            mrecordlog: Arc::downgrade(&self.mrecordlog),\n            status_rx: self.status_rx.clone(),\n        }\n    }\n}\n\npub(super) struct PartiallyLockedIngesterState<'a> {\n    pub inner: MutexGuard<'a, InnerIngesterState>,\n}\n\nimpl fmt::Debug for PartiallyLockedIngesterState<'_> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"PartiallyLockedIngesterState\").finish()\n    }\n}\n\nimpl Deref for PartiallyLockedIngesterState<'_> {\n    type Target = InnerIngesterState;\n\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n\nimpl DerefMut for PartiallyLockedIngesterState<'_> {\n    fn deref_mut(&mut self) -> &mut Self::Target {\n        &mut self.inner\n    }\n}\n\npub(super) struct FullyLockedIngesterState<'a> {\n    pub inner: MutexGuard<'a, InnerIngesterState>,\n    pub mrecordlog: RwLockMappedWriteGuard<'a, MultiRecordLogAsync>,\n}\n\nimpl fmt::Debug for FullyLockedIngesterState<'_> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        
f.debug_struct(\"FullyLockedIngesterState\").finish()\n    }\n}\n\nimpl Deref for FullyLockedIngesterState<'_> {\n    type Target = InnerIngesterState;\n\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n\nimpl DerefMut for FullyLockedIngesterState<'_> {\n    fn deref_mut(&mut self) -> &mut Self::Target {\n        &mut self.inner\n    }\n}\n\nimpl FullyLockedIngesterState<'_> {\n    /// Deletes the shard identified by `queue_id` from the ingester state. It removes the\n    /// mrecordlog queue first and then removes the associated in-memory shard and rate trackers.\n    pub async fn delete_shard(&mut self, queue_id: &QueueId, initiator: &'static str) {\n        match self.mrecordlog.delete_queue(queue_id).await {\n            Ok(_) | Err(DeleteQueueError::MissingQueue(_)) => {\n                // Log only if the shard was actually removed.\n                if let Some(shard) = self.shards.remove(queue_id) {\n                    info!(\"deleted shard `{queue_id}` initiated via `{initiator}`\");\n\n                    if let Some(doc_mapper) = shard.doc_mapper_opt {\n                        // At this point, we hold the lock so we can safely check the strong count.\n                        // The other locations where the doc mapper is cloned also require holding\n                        // the lock.\n                        if Arc::strong_count(&doc_mapper) == 1 {\n                            let doc_mapping_uid = doc_mapper.doc_mapping_uid();\n\n                            if self.doc_mappers.remove(&doc_mapping_uid).is_some() {\n                                info!(\"evicted doc mapper `{doc_mapping_uid}` from cache\");\n                            }\n                        }\n                    }\n                }\n            }\n            Err(DeleteQueueError::IoError(io_error)) => {\n                error!(\"failed to delete shard `{queue_id}`: {io_error}\");\n            }\n        };\n    }\n\n    /// Truncates the shard 
identified by `queue_id` up to `truncate_up_to_position_inclusive` only\n    /// if the current truncation position of the shard is smaller.\n    pub async fn truncate_shard(\n        &mut self,\n        queue_id: &QueueId,\n        truncate_up_to_position_inclusive: Position,\n        initiator: &'static str,\n    ) {\n        if let Some(truncate_up_to_offset_inclusive) = truncate_up_to_position_inclusive.as_u64()\n            && let Some(shard) = self.inner.shards.get_mut(queue_id)\n            && shard.truncation_position_inclusive < truncate_up_to_position_inclusive\n        {\n            match self\n                .mrecordlog\n                .truncate(queue_id, truncate_up_to_offset_inclusive)\n                .await\n            {\n                Ok(_) => {\n                    info!(\n                        \"truncated shard `{queue_id}` at {truncate_up_to_position_inclusive} \\\n                         initiated via `{initiator}`\"\n                    );\n                    shard.truncation_position_inclusive = truncate_up_to_position_inclusive;\n                }\n                Err(TruncateError::MissingQueue(_)) => {\n                    error!(\"failed to truncate shard `{queue_id}`: WAL queue not found\");\n                    self.shards.remove(queue_id);\n                    info!(\"deleted dangling shard `{queue_id}`\");\n                }\n                Err(TruncateError::IoError(io_error)) => {\n                    error!(\"failed to truncate shard `{queue_id}`: {io_error}\");\n                }\n            };\n        }\n    }\n\n    /// Deletes and truncates the shards as directed by the `advise_reset_shards_response` returned\n    /// by the control plane.\n    pub async fn reset_shards(&mut self, advise_reset_shards_response: &AdviseResetShardsResponse) {\n        info!(\"resetting shards\");\n        for shard_ids in &advise_reset_shards_response.shards_to_delete {\n            for queue_id in shard_ids.queue_ids() {\n            
    self.delete_shard(&queue_id, \"control-plane-reset-shards-rpc\")\n                    .await;\n            }\n        }\n        for shard_id_positions in &advise_reset_shards_response.shards_to_truncate {\n            for (queue_id, publish_position) in shard_id_positions.queue_id_positions() {\n                self.truncate_shard(\n                    &queue_id,\n                    publish_position,\n                    \"control-plane-reset-shards-rpc\",\n                )\n                .await;\n            }\n        }\n    }\n}\n\n#[derive(Clone)]\npub(super) struct WeakIngesterState {\n    inner: Weak<Mutex<InnerIngesterState>>,\n    mrecordlog: Weak<RwLock<Option<MultiRecordLogAsync>>>,\n    status_rx: watch::Receiver<IngesterStatus>,\n}\n\nimpl WeakIngesterState {\n    pub fn upgrade(&self) -> Option<IngesterState> {\n        let inner = self.inner.upgrade()?;\n        let mrecordlog = self.mrecordlog.upgrade()?;\n        let status_rx = self.status_rx.clone();\n        let state = IngesterState {\n            inner,\n            mrecordlog,\n            status_rx,\n        };\n        Some(state)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytesize::ByteSize;\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_config::service::QuickwitService;\n    use quickwit_proto::types::{NodeId, ShardId, SourceId};\n    use tokio::time::timeout;\n\n    use super::*;\n\n    async fn test_cluster() -> Cluster {\n        create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap()\n    }\n\n    #[tokio::test]\n    async fn test_ingester_state_does_not_lock_while_initializing() {\n        let cluster = test_cluster().await;\n        let state = IngesterState::create(cluster, ByteSize::mb(256), ByteSize::mb(256)).await;\n        let inner_guard = state.inner.lock().await;\n\n    
    assert_eq!(inner_guard.status(), IngesterStatus::Initializing);\n        assert_eq!(*state.status_rx.borrow(), IngesterStatus::Initializing);\n\n        let error = state.lock_partially().await.unwrap_err().to_string();\n        assert!(error.contains(\"ingester is initializing\"));\n\n        let error = state.lock_fully().await.unwrap_err().to_string();\n        assert!(error.contains(\"ingester is initializing\"));\n    }\n\n    #[tokio::test]\n    async fn test_ingester_state_failed() {\n        let cluster = test_cluster().await;\n        let state = IngesterState::create(cluster, ByteSize::mb(256), ByteSize::mb(256)).await;\n\n        state\n            .inner\n            .lock()\n            .await\n            .set_status(IngesterStatus::Failed)\n            .await;\n\n        let error = state.lock_partially().await.unwrap_err().to_string();\n        assert!(error.ends_with(\"failed to initialize ingester\"));\n\n        let error = state.lock_fully().await.unwrap_err().to_string();\n        assert!(error.contains(\"failed to initialize ingester\"));\n    }\n\n    #[tokio::test]\n    async fn test_ingester_state_init() {\n        let cluster = test_cluster().await;\n        let mut state = IngesterState::create(cluster, ByteSize::mb(256), ByteSize::mb(256)).await;\n        let temp_dir = tempfile::tempdir().unwrap();\n\n        state\n            .init(temp_dir.path(), RateLimiterSettings::default())\n            .await;\n\n        timeout(Duration::from_millis(100), state.wait_for_ready())\n            .await\n            .unwrap();\n\n        state.lock_partially().await.unwrap();\n\n        let locked_state = state.lock_fully().await.unwrap();\n        assert_eq!(locked_state.status(), IngesterStatus::Ready);\n        assert_eq!(*locked_state.status_tx.borrow(), IngesterStatus::Ready);\n    }\n\n    fn insert_shard_with_used_capacity(\n        state: &mut InnerIngesterState,\n        index_uid: IndexUid,\n        source_id: SourceId,\n  
      shard_id: ShardId,\n        shard_state: ShardState,\n        used_capacity: ByteSize,\n    ) {\n        let mut shard = IngesterShard::new_solo(index_uid, source_id, shard_id)\n            .with_state(shard_state)\n            .build();\n        shard.rate_limiter.acquire_bytes(used_capacity);\n\n        let queue_id = shard.queue_id();\n        state.shards.insert(queue_id, shard);\n    }\n\n    #[tokio::test]\n    async fn test_find_most_capacity_shard_returns_shard_with_least_used_capacity() {\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let mut state_guard = state.lock_partially().await.unwrap();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        // Shard 1: 1KB used (most available capacity)\n        // Shard 2: 2KB used\n        // ...\n        // Shard 5: 5KB used (least available capacity)\n        for i in 1..=5u64 {\n            insert_shard_with_used_capacity(\n                &mut state_guard,\n                index_uid.clone(),\n                source_id.clone(),\n                ShardId::from(i),\n                ShardState::Open,\n                ByteSize::kb(i),\n            );\n        }\n\n        let shard = state_guard\n            .find_most_capacity_shard_mut(&index_uid, &source_id)\n            .unwrap();\n\n        assert_eq!(shard.shard_id, ShardId::from(1));\n        assert_eq!(shard.shard_state, ShardState::Open);\n\n        let expected_available_permits =\n            RateLimiterSettings::default().burst_limit - ByteSize::kb(1).as_u64();\n        assert_eq!(\n            shard.rate_limiter.available_permits(),\n            expected_available_permits\n        );\n    }\n\n    
#[tokio::test]\n    async fn test_find_most_capacity_shard_skips_closed_shards() {\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let mut locked_state = state.lock_partially().await.unwrap();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        insert_shard_with_used_capacity(\n            &mut locked_state,\n            index_uid.clone(),\n            source_id.clone(),\n            ShardId::from(1),\n            ShardState::Open,\n            ByteSize::kb(1),\n        );\n        insert_shard_with_used_capacity(\n            &mut locked_state,\n            index_uid.clone(),\n            source_id.clone(),\n            ShardId::from(2),\n            ShardState::Open,\n            ByteSize::kb(2),\n        );\n\n        insert_shard_with_used_capacity(\n            &mut locked_state,\n            index_uid.clone(),\n            source_id.clone(),\n            ShardId::from(3),\n            ShardState::Closed,\n            ByteSize::kb(0),\n        );\n\n        let shard = locked_state\n            .find_most_capacity_shard_mut(&index_uid, &source_id)\n            .unwrap();\n\n        // Should pick shard 1 (most capacity among open shards), not shard 3 (closed)\n        assert_eq!(shard.shard_id, ShardId::from(1));\n    }\n\n    #[tokio::test]\n    async fn test_find_most_capacity_shard_returns_none_for_unknown_index_or_source() {\n        let cluster = create_cluster_for_test(\n            Vec::new(),\n            &[QuickwitService::Indexer.as_str()],\n            &ChannelTransport::default(),\n            true,\n        )\n        .await\n        .unwrap();\n        let (_temp_dir, state) = 
IngesterState::for_test(cluster).await;\n        let mut locked_state = state.lock_partially().await.unwrap();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = SourceId::from(\"test-source\");\n\n        insert_shard_with_used_capacity(\n            &mut locked_state,\n            index_uid.clone(),\n            source_id.clone(),\n            ShardId::from(1),\n            ShardState::Open,\n            ByteSize::kb(0),\n        );\n\n        let shard_opt = locked_state\n            .find_most_capacity_shard_mut(&IndexUid::for_test(\"other-index\", 0), &source_id);\n        assert!(shard_opt.is_none());\n\n        let shard_opt =\n            locked_state.find_most_capacity_shard_mut(&index_uid, &SourceId::from(\"other-source\"));\n        assert!(shard_opt.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_ingester_state_set_status() {\n        let cluster = test_cluster().await;\n        let state =\n            IngesterState::create(cluster.clone(), ByteSize::mb(256), ByteSize::mb(256)).await;\n        let temp_dir = tempfile::tempdir().unwrap();\n\n        state\n            .init(temp_dir.path(), RateLimiterSettings::default())\n            .await;\n\n        let mut state_guard = state.lock_fully().await.unwrap();\n        state_guard.set_status(IngesterStatus::Failed).await;\n        assert_eq!(state_guard.status(), IngesterStatus::Failed);\n        assert_eq!(*state.status_rx.borrow(), IngesterStatus::Failed);\n\n        let status_json_str = cluster\n            .get_self_key_value(INGESTER_STATUS_KEY)\n            .await\n            .unwrap();\n        let status = IngesterStatus::from_json_str_name(&status_json_str).unwrap();\n        assert_eq!(status, IngesterStatus::Failed);\n    }\n\n    fn open_shard(\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shard_id: ShardId,\n        is_replica: bool,\n    ) -> IngesterShard {\n        let builder = if is_replica {\n            
IngesterShard::new_replica(index_uid, source_id, shard_id, NodeId::from(\"test-leader\"))\n        } else {\n            IngesterShard::new_solo(index_uid, source_id, shard_id)\n        };\n        builder.advertisable().build()\n    }\n\n    #[tokio::test]\n    async fn test_get_shard_snapshot() {\n        let cluster = test_cluster().await;\n        let (_temp_dir, state) = IngesterState::for_test(cluster).await;\n        let mut state_guard = state.lock_partially().await.unwrap();\n\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n\n        // source-a: 2 open shards + 1 closed shard + 1 replica (ignored).\n        let s = open_shard(\n            index_uid.clone(),\n            \"source-a\".into(),\n            ShardId::from(1),\n            false,\n        );\n        state_guard.shards.insert(s.queue_id(), s);\n        let s = open_shard(\n            index_uid.clone(),\n            \"source-a\".into(),\n            ShardId::from(2),\n            false,\n        );\n        state_guard.shards.insert(s.queue_id(), s);\n        let s = IngesterShard::new_solo(index_uid.clone(), \"source-a\".into(), ShardId::from(3))\n            .with_state(ShardState::Closed)\n            .advertisable()\n            .build();\n        state_guard.shards.insert(s.queue_id(), s);\n        let s = open_shard(index_uid.clone(), \"source-a\".into(), ShardId::from(4), true);\n        state_guard.shards.insert(s.queue_id(), s);\n\n        // source-b: 2 closed shards, no open shards.\n        let s = IngesterShard::new_solo(index_uid.clone(), \"source-b\".into(), ShardId::from(5))\n            .with_state(ShardState::Closed)\n            .advertisable()\n            .build();\n        state_guard.shards.insert(s.queue_id(), s);\n        let s = IngesterShard::new_solo(index_uid.clone(), \"source-b\".into(), ShardId::from(6))\n            .with_state(ShardState::Closed)\n            .advertisable()\n            .build();\n        
state_guard.shards.insert(s.queue_id(), s);\n\n        let (mut open_counts, mut closed_shards) = state_guard.get_shard_snapshot();\n\n        // Open counts: source-a has 2, source-b has 0.\n        open_counts.sort_by(|a, b| a.1.cmp(&b.1));\n        assert_eq!(open_counts.len(), 2);\n        assert_eq!(\n            open_counts[0],\n            (index_uid.clone(), SourceId::from(\"source-a\"), 2)\n        );\n        assert_eq!(\n            open_counts[1],\n            (index_uid.clone(), SourceId::from(\"source-b\"), 0)\n        );\n\n        // Closed shards: source-a has shard 3, source-b has shards 5 and 6.\n        closed_shards.sort_by(|a, b| a.source_id.cmp(&b.source_id));\n        assert_eq!(closed_shards.len(), 2);\n\n        assert_eq!(closed_shards[0].source_id, \"source-a\");\n        assert_eq!(closed_shards[0].shard_ids, vec![ShardId::from(3)]);\n\n        assert_eq!(closed_shards[1].source_id, \"source-b\");\n        let mut source_b_ids = closed_shards[1].shard_ids.clone();\n        source_b_ids.sort();\n        assert_eq!(source_b_ids, vec![ShardId::from(5), ShardId::from(6)]);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/wal_capacity_tracker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytesize::ByteSize;\nuse quickwit_common::ring_buffer::RingBuffer;\n\n/// The lookback window length is meant to capture readings far enough back in time to give\n/// a rough rate of change estimate. At size 6, with broadcast interval of 5 seconds, this would be\n/// 30 seconds of readings.\nconst WAL_CAPACITY_LOOKBACK_WINDOW_LEN: usize = 6;\n\n/// The ring buffer stores one extra element so that `delta()` can compare the newest reading\n/// with the one that is exactly `WAL_CAPACITY_LOOKBACK_WINDOW_LEN` steps ago. 
Otherwise, that\n/// reading would be discarded when the next reading is inserted.\nconst WAL_CAPACITY_READINGS_LEN: usize = WAL_CAPACITY_LOOKBACK_WINDOW_LEN + 1;\n\nstruct WalCapacityTimeSeries {\n    capacity: ByteSize,\n    readings: RingBuffer<f64, WAL_CAPACITY_READINGS_LEN>,\n}\n\nimpl WalCapacityTimeSeries {\n    fn new(capacity: ByteSize) -> Self {\n        #[cfg(not(test))]\n        assert!(capacity.as_u64() > 0);\n        Self {\n            capacity,\n            readings: RingBuffer::default(),\n        }\n    }\n\n    fn record_and_score(&mut self, used: ByteSize) -> usize {\n        self.record(used);\n        let remaining = self.current().unwrap_or(1.0);\n        let delta = self.delta().unwrap_or(0.0);\n        compute_capacity_score(remaining, delta)\n    }\n\n    fn score(&self, used: ByteSize) -> usize {\n        let remaining = 1.0 - (used.as_u64() as f64 / self.capacity.as_u64() as f64);\n        let delta = self.delta().unwrap_or(0.0);\n        compute_capacity_score(remaining, delta)\n    }\n\n    fn record(&mut self, used: ByteSize) {\n        let remaining = 1.0 - (used.as_u64() as f64 / self.capacity.as_u64() as f64);\n        self.readings.push_back(remaining.clamp(0.0, 1.0));\n    }\n\n    fn current(&self) -> Option<f64> {\n        self.readings.last()\n    }\n\n    fn delta(&self) -> Option<f64> {\n        let current = self.readings.last()?;\n        let oldest = self.readings.front()?;\n        Some(current - oldest)\n    }\n}\n\npub struct WalCapacityTracker {\n    disk: WalCapacityTimeSeries,\n    memory: WalCapacityTimeSeries,\n}\n\nimpl WalCapacityTracker {\n    pub fn new(disk_capacity: ByteSize, memory_capacity: ByteSize) -> Self {\n        Self {\n            disk: WalCapacityTimeSeries::new(disk_capacity),\n            memory: WalCapacityTimeSeries::new(memory_capacity),\n        }\n    }\n\n    /// Records disk and memory usage readings and returns the resulting capacity score.\n    /// The score is the minimum of the 
individual disk and memory scores.\n    pub fn record_and_score(&mut self, disk_used: ByteSize, memory_used: ByteSize) -> usize {\n        let disk_score = self.disk.record_and_score(disk_used);\n        let memory_score = self.memory.record_and_score(memory_used);\n        disk_score.min(memory_score)\n    }\n\n    /// Computes a capacity score for the given usage without recording it.\n    pub fn score(&self, disk_used: ByteSize, memory_used: ByteSize) -> usize {\n        let disk_score = self.disk.score(disk_used);\n        let memory_score = self.memory.score(memory_used);\n        disk_score.min(memory_score)\n    }\n}\n\n/// Computes a capacity score from 0 to 10 using a PD controller.\n///\n/// The score has two components:\n///\n/// - **P (proportional):** How much WAL capacity remains right now. An ingester with 100% free\n///   capacity gets `PROPORTIONAL_WEIGHT` points; 50% gets half; and so on. If remaining capacity\n///   drops to `MIN_PERMISSIBLE_CAPACITY` or below, the score is immediately 0.\n///\n/// - **D (derivative):** Up to `DERIVATIVE_WEIGHT` bonus points based on how fast remaining\n///   capacity is changing over the lookback window. A higher drain rate is worse, so we invert it:\n///   `drain / MAX_DRAIN_RATE` normalizes the drain to a 0–1 penalty, and subtracting from 1\n///   converts it into a 0–1 bonus. Multiplied by `DERIVATIVE_WEIGHT`, a stable node gets the full\n///   bonus and a node draining at `MAX_DRAIN_RATE` or faster gets nothing.\n///\n/// Putting it together: a completely idle ingester scores 10 (8 + 2).\n/// One that is nearly full but stable scores ~2. 
One that is draining rapidly scores less.\n/// A score of 0 means the ingester is at or below minimum permissible capacity.\n///\n/// Below this remaining capacity fraction, the score is immediately 0.\nconst MIN_PERMISSIBLE_CAPACITY: f64 = 0.05;\n/// Weight of the proportional term (max points from P).\nconst PROPORTIONAL_WEIGHT: f64 = 8.0;\n/// Weight of the derivative term (max points from D).\nconst DERIVATIVE_WEIGHT: f64 = 2.0;\n/// The drain rate (as a fraction of total capacity over the lookback window) at which the\n/// derivative penalty is fully applied. Drain rates beyond this are clamped.\nconst MAX_DRAIN_RATE: f64 = 0.10;\n\nfn compute_capacity_score(remaining_capacity: f64, capacity_delta: f64) -> usize {\n    if remaining_capacity <= MIN_PERMISSIBLE_CAPACITY {\n        return 0;\n    }\n    let p = PROPORTIONAL_WEIGHT * remaining_capacity;\n    let drain = (-capacity_delta).clamp(0.0, MAX_DRAIN_RATE);\n    let d = DERIVATIVE_WEIGHT * (1.0 - drain / MAX_DRAIN_RATE);\n    (p + d).clamp(0.0, 10.0) as usize\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    fn ts() -> WalCapacityTimeSeries {\n        WalCapacityTimeSeries::new(ByteSize::b(100))\n    }\n\n    /// Helper: record a reading with `used` bytes against the series' fixed capacity.\n    fn record(series: &mut WalCapacityTimeSeries, used: u64) {\n        series.record(ByteSize::b(used));\n    }\n\n    #[test]\n    fn test_wal_disk_capacity_current_after_record() {\n        let mut series = WalCapacityTimeSeries::new(ByteSize::b(256));\n        // 192 of 256 used => 25% remaining\n        series.record(ByteSize::b(192));\n        assert_eq!(series.current(), Some(0.25));\n\n        // 16 of 256 used => 93.75% remaining\n        series.record(ByteSize::b(16));\n        assert_eq!(series.current(), Some(0.9375));\n    }\n\n    #[test]\n    fn test_wal_disk_capacity_record_saturates_at_zero() {\n        let mut series = ts();\n        // 200 used out of 100 capacity => clamped to 0.0\n        
record(&mut series, 200);\n        assert_eq!(series.current(), Some(0.0));\n    }\n\n    #[test]\n    fn test_wal_disk_capacity_delta_growing() {\n        let mut series = ts();\n        // oldest: 60 of 100 used => 40% remaining\n        record(&mut series, 60);\n        // current: 20 of 100 used => 80% remaining\n        record(&mut series, 20);\n        // delta = 0.80 - 0.40 = 0.40\n        assert_eq!(series.delta(), Some(0.40));\n    }\n\n    #[test]\n    fn test_wal_disk_capacity_delta_shrinking() {\n        let mut series = ts();\n        // oldest: 20 of 100 used => 80% remaining\n        record(&mut series, 20);\n        // current: 60 of 100 used => 40% remaining\n        record(&mut series, 60);\n        // delta = 0.40 - 0.80 = -0.40\n        assert_eq!(series.delta(), Some(-0.40));\n    }\n\n    #[test]\n    fn test_capacity_score_draining_vs_stable() {\n        // Node A: capacity draining — usage increases 10, 20, ..., 70 over 7 ticks.\n        let mut node_a = ts();\n        for used in (10..=70).step_by(10) {\n            record(&mut node_a, used);\n        }\n        let a_remaining = node_a.current().unwrap();\n        let a_delta = node_a.delta().unwrap();\n        let a_score = compute_capacity_score(a_remaining, a_delta);\n\n        // Node B: steady at 50% usage over 7 ticks.\n        let mut node_b = ts();\n        for _ in 0..7 {\n            record(&mut node_b, 50);\n        }\n        let b_remaining = node_b.current().unwrap();\n        let b_delta = node_b.delta().unwrap();\n        let b_score = compute_capacity_score(b_remaining, b_delta);\n\n        // p=2.4, d=0 (max drain) => 2\n        assert_eq!(a_score, 2);\n        // p=4, d=2 (stable) => 6\n        assert_eq!(b_score, 6);\n        assert!(b_score > a_score);\n    }\n\n    #[test]\n    fn test_wal_disk_capacity_delta_spans_lookback_window() {\n        let mut series = ts();\n\n        // Fill to exactly the lookback window length (6 readings), all same value.\n        for _ 
in 0..WAL_CAPACITY_LOOKBACK_WINDOW_LEN {\n            record(&mut series, 50);\n        }\n        assert_eq!(series.delta(), Some(0.0));\n\n        // 7th reading fills the ring buffer. Delta spans 6 intervals.\n        record(&mut series, 0);\n        assert_eq!(series.delta(), Some(0.50));\n\n        // 8th reading evicts the oldest 50-remaining. Delta still spans 6 intervals.\n        record(&mut series, 0);\n        assert_eq!(series.delta(), Some(0.50));\n    }\n\n    #[test]\n    fn test_wal_capacity_tracker_returns_min() {\n        let mut tracker = WalCapacityTracker::new(ByteSize::b(100), ByteSize::b(100));\n        // Disk 10% used (score 9), memory 90% used (score 2) → returns 2.\n        assert_eq!(\n            tracker.record_and_score(ByteSize::b(10), ByteSize::b(90)),\n            2\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/ingest_v2/workbench.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, HashSet};\n\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::rate_limited_error;\nuse quickwit_proto::control_plane::{\n    GetOrCreateOpenShardsFailure, GetOrCreateOpenShardsFailureReason,\n};\nuse quickwit_proto::ingest::ingester::{PersistFailure, PersistFailureReason, PersistSuccess};\nuse quickwit_proto::ingest::router::{\n    IngestFailure, IngestFailureReason, IngestResponseV2, IngestSubrequest, IngestSuccess,\n};\nuse quickwit_proto::ingest::{IngestV2Error, RateLimitingCause, ShardIds};\nuse quickwit_proto::types::{NodeId, SubrequestId};\nuse tracing::warn;\n\nuse super::publish_tracker::PublishTracker;\nuse super::router::PersistRequestSummary;\n\n/// A helper struct for managing the state of the subrequests of an ingest request during multiple\n/// persist attempts.\n#[derive(Default)]\npub(super) struct IngestWorkbench {\n    pub subworkbenches: BTreeMap<SubrequestId, IngestSubworkbench>,\n    pub num_successes: usize,\n    /// The number of batch persist attempts. 
This is not the sum of the number of attempts for\n    /// each subrequest.\n    pub num_attempts: usize,\n    pub max_num_attempts: usize,\n    /// List of leaders that have been marked as temporarily unavailable.\n    /// These leaders have encountered a transport error during an attempt and will be treated as\n    /// if they were out of the pool for subsequent attempts.\n    ///\n    /// (The point here is to make sure we do not wait for the failure detection to kick the node\n    /// out of the ingester pool.)\n    pub unavailable_leaders: HashSet<NodeId>,\n    pub closed_shards: Vec<ShardIds>,\n    publish_tracker: Option<PublishTracker>,\n}\n\n/// Returns an iterator over the pending subrequests, sorted by subrequest ID.\npub(super) fn pending_subrequests(\n    subworkbenches: &BTreeMap<SubrequestId, IngestSubworkbench>,\n) -> impl Iterator<Item = &IngestSubrequest> {\n    subworkbenches.values().filter_map(|subworkbench| {\n        if subworkbench.is_pending() {\n            Some(&subworkbench.subrequest)\n        } else {\n            None\n        }\n    })\n}\n\nimpl IngestWorkbench {\n    fn new_inner(\n        ingest_subrequests: Vec<IngestSubrequest>,\n        max_num_attempts: usize,\n        publish_tracker: Option<PublishTracker>,\n    ) -> Self {\n        let subworkbenches: BTreeMap<SubrequestId, IngestSubworkbench> = ingest_subrequests\n            .into_iter()\n            .map(|subrequest| {\n                (\n                    subrequest.subrequest_id,\n                    IngestSubworkbench::new(subrequest),\n                )\n            })\n            .collect();\n\n        Self {\n            subworkbenches,\n            max_num_attempts,\n            publish_tracker,\n            ..Default::default()\n        }\n    }\n\n    pub fn new(ingest_subrequests: Vec<IngestSubrequest>, max_num_attempts: usize) -> Self {\n        Self::new_inner(ingest_subrequests, max_num_attempts, None)\n    }\n\n    pub fn new_with_publish_tracking(\n        
ingest_subrequests: Vec<IngestSubrequest>,\n        max_num_attempts: usize,\n        event_broker: EventBroker,\n    ) -> Self {\n        Self::new_inner(\n            ingest_subrequests,\n            max_num_attempts,\n            Some(PublishTracker::new(event_broker)),\n        )\n    }\n\n    pub fn new_attempt(&mut self) {\n        self.num_attempts += 1;\n    }\n\n    /// Returns true if all subrequests were successfully persisted, if the\n    /// number of attempts has been exhausted, or if no subrequest is still\n    /// pending (every remaining failure is non-transient).\n    pub fn is_complete(&self) -> bool {\n        self.num_successes >= self.subworkbenches.len()\n            || self.num_attempts >= self.max_num_attempts\n            || self.has_no_pending_subrequests()\n    }\n\n    pub fn is_last_attempt(&self) -> bool {\n        self.num_attempts >= self.max_num_attempts\n    }\n\n    fn has_no_pending_subrequests(&self) -> bool {\n        self.subworkbenches\n            .values()\n            .all(|subworkbench| !subworkbench.is_pending())\n    }\n\n    pub fn record_get_or_create_open_shards_failure(\n        &mut self,\n        open_shards_failure: GetOrCreateOpenShardsFailure,\n    ) {\n        let last_failure = match open_shards_failure.reason() {\n            GetOrCreateOpenShardsFailureReason::IndexNotFound => SubworkbenchFailure::IndexNotFound,\n            GetOrCreateOpenShardsFailureReason::SourceNotFound => {\n                SubworkbenchFailure::SourceNotFound\n            }\n            GetOrCreateOpenShardsFailureReason::NoIngestersAvailable => {\n                SubworkbenchFailure::NoShardsAvailable\n            }\n            GetOrCreateOpenShardsFailureReason::Unspecified => {\n                warn!(\n                    \"failure reason for subrequest `{}` is unspecified\",\n                    open_shards_failure.subrequest_id\n                );\n                SubworkbenchFailure::Internal\n            }\n        };\n        self.record_failure(open_shards_failure.subrequest_id, last_failure);\n    }\n\n    pub 
fn record_persist_success(&mut self, persist_success: PersistSuccess) {\n        let Some(subworkbench) = self.subworkbenches.get_mut(&persist_success.subrequest_id) else {\n            warn!(\n                \"could not find subrequest `{}` in workbench\",\n                persist_success.subrequest_id\n            );\n            return;\n        };\n        if let Some(publish_tracker) = &mut self.publish_tracker\n            && let Some(position) = &persist_success.replication_position_inclusive\n        {\n            publish_tracker.track_persisted_shard_position(\n                persist_success.shard_id().clone(),\n                position.clone(),\n            );\n        }\n        self.num_successes += 1;\n        subworkbench.num_attempts += 1;\n        subworkbench.persist_success_opt = Some(persist_success);\n    }\n\n    pub fn record_persist_error(\n        &mut self,\n        persist_error: IngestV2Error,\n        persist_summary: PersistRequestSummary,\n    ) {\n        // Persist responses use dedicated failure reasons for `ShardNotFound` and\n        // `TooManyRequests`: in reality, we should never have to handle these cases here.\n        match persist_error {\n            IngestV2Error::Timeout(_) => {\n                for subrequest_id in persist_summary.subrequest_ids {\n                    let failure = SubworkbenchFailure::Persist(PersistFailureReason::Timeout);\n                    self.record_failure(subrequest_id, failure);\n                }\n            }\n            IngestV2Error::Unavailable(_) => {\n                self.unavailable_leaders.insert(persist_summary.leader_id);\n                for subrequest_id in persist_summary.subrequest_ids {\n                    self.record_ingester_unavailable(subrequest_id);\n                }\n            }\n            IngestV2Error::Internal(internal_err_msg) => {\n                rate_limited_error!(limit_per_min=6, err_msg=%internal_err_msg, \"persist error: internal error during 
persist\");\n                for subrequest_id in persist_summary.subrequest_ids {\n                    self.record_internal_error(subrequest_id);\n                }\n            }\n            IngestV2Error::ShardNotFound { shard_id } => {\n                rate_limited_error!(limit_per_min=6, shard_id=%shard_id, \"persist error: shard not found\");\n                for subrequest_id in persist_summary.subrequest_ids {\n                    self.record_internal_error(subrequest_id);\n                }\n            }\n            IngestV2Error::TooManyRequests(rate_limiting_cause) => {\n                for subrequest_id in persist_summary.subrequest_ids {\n                    self.record_too_many_requests(subrequest_id, rate_limiting_cause);\n                }\n            }\n        }\n    }\n\n    pub fn record_persist_failure(&mut self, persist_failure: &PersistFailure) {\n        let failure = SubworkbenchFailure::Persist(persist_failure.reason());\n        self.record_failure(persist_failure.subrequest_id, failure);\n    }\n\n    fn record_failure(&mut self, subrequest_id: SubrequestId, failure: SubworkbenchFailure) {\n        let Some(subworkbench) = self.subworkbenches.get_mut(&subrequest_id) else {\n            warn!(\"could not find subrequest `{}` in workbench\", subrequest_id);\n            return;\n        };\n        subworkbench.num_attempts += 1;\n        subworkbench.last_failure_opt = Some(failure);\n    }\n\n    pub fn record_no_shards_available(&mut self, subrequest_id: SubrequestId) {\n        self.record_failure(subrequest_id, SubworkbenchFailure::NoShardsAvailable);\n    }\n\n    /// Marks a node as unavailable for the span of the workbench.\n    ///\n    /// Remaining attempts will treat the node as if it was not in the ingester pool.\n    pub fn record_ingester_unavailable(&mut self, subrequest_id: SubrequestId) {\n        self.record_failure(subrequest_id, SubworkbenchFailure::Unavailable);\n    }\n\n    fn record_internal_error(&mut self, 
subrequest_id: SubrequestId) {\n        self.record_failure(subrequest_id, SubworkbenchFailure::Internal);\n    }\n\n    fn record_too_many_requests(\n        &mut self,\n        subrequest_id: SubrequestId,\n        rate_limiting_cause: RateLimitingCause,\n    ) {\n        self.record_failure(\n            subrequest_id,\n            SubworkbenchFailure::RateLimited(rate_limiting_cause),\n        );\n    }\n\n    pub async fn into_ingest_result(self) -> IngestResponseV2 {\n        let num_subworkbenches = self.subworkbenches.len();\n        let mut successes = Vec::with_capacity(self.num_successes);\n        let mut failures = Vec::with_capacity(num_subworkbenches - self.num_successes);\n\n        // We consider the last retry outcome as the actual outcome.\n        for subworkbench in self.subworkbenches.into_values() {\n            if let Some(persist_success) = subworkbench.persist_success_opt {\n                let success = IngestSuccess {\n                    subrequest_id: persist_success.subrequest_id,\n                    index_uid: persist_success.index_uid,\n                    source_id: persist_success.source_id,\n                    shard_id: persist_success.shard_id,\n                    replication_position_inclusive: persist_success.replication_position_inclusive,\n                    num_ingested_docs: persist_success.num_persisted_docs,\n                    parse_failures: persist_success.parse_failures,\n                };\n                successes.push(success);\n            } else if let Some(failure) = subworkbench.last_failure_opt {\n                let failure = IngestFailure {\n                    subrequest_id: subworkbench.subrequest.subrequest_id,\n                    index_id: subworkbench.subrequest.index_id,\n                    source_id: subworkbench.subrequest.source_id,\n                    reason: failure.reason() as i32,\n                };\n                failures.push(failure);\n            }\n        }\n        let 
num_successes = successes.len();\n        let num_failures = failures.len();\n        assert_eq!(num_successes + num_failures, num_subworkbenches);\n\n        if let Some(publish_tracker) = self.publish_tracker {\n            publish_tracker.wait_publish_complete().await;\n        }\n\n        // For tests, we sort the successes and failures by subrequest_id\n        #[cfg(test)]\n        {\n            for success in &mut successes {\n                success\n                    .parse_failures\n                    .sort_by_key(|parse_failure| parse_failure.doc_uid());\n            }\n            successes.sort_by_key(|success| success.subrequest_id);\n            failures.sort_by_key(|failure| failure.subrequest_id);\n        }\n\n        IngestResponseV2 {\n            successes,\n            failures,\n        }\n    }\n}\n\n#[derive(Debug)]\npub(super) enum SubworkbenchFailure {\n    // There is no entry in the routing table for this index.\n    IndexNotFound,\n    // There is no entry in the routing table for this source.\n    SourceNotFound,\n    // The routing table entry for this source is empty, shards are all closed, or their leaders\n    // are unavailable.\n    NoShardsAvailable,\n    // This is an error returned by the ingester: e.g. 
shard not found, shard closed, rate\n    // limited, resource exhausted, etc.\n    Persist(PersistFailureReason),\n    Internal,\n    // The ingester is no longer in the pool or a transport error occurred.\n    Unavailable,\n    // The ingester is rate limited.\n    RateLimited(RateLimitingCause),\n}\n\nimpl SubworkbenchFailure {\n    /// Returns the final `IngestFailureReason` returned to the client.\n    fn reason(&self) -> IngestFailureReason {\n        match self {\n            Self::IndexNotFound => IngestFailureReason::IndexNotFound,\n            Self::SourceNotFound => IngestFailureReason::SourceNotFound,\n            Self::Internal => IngestFailureReason::Internal,\n            Self::NoShardsAvailable => IngestFailureReason::NoShardsAvailable,\n            // In our last attempt, we did not manage to reach the ingester.\n            // We can consider that as a no shards available.\n            Self::Unavailable => IngestFailureReason::NoShardsAvailable,\n            Self::RateLimited(rate_limiting_cause) => match rate_limiting_cause {\n                RateLimitingCause::RouterLoadShedding => IngestFailureReason::RouterLoadShedding,\n                RateLimitingCause::LoadShedding => IngestFailureReason::RouterLoadShedding,\n                RateLimitingCause::WalFull => IngestFailureReason::WalFull,\n                RateLimitingCause::CircuitBreaker => IngestFailureReason::CircuitBreaker,\n                RateLimitingCause::ShardRateLimiting => IngestFailureReason::ShardRateLimited,\n                RateLimitingCause::Unknown => IngestFailureReason::Unspecified,\n            },\n            Self::Persist(persist_failure_reason) => (*persist_failure_reason).into(),\n        }\n    }\n}\n\n#[derive(Debug, Default)]\npub(super) struct IngestSubworkbench {\n    pub subrequest: IngestSubrequest,\n    pub persist_success_opt: Option<PersistSuccess>,\n    pub last_failure_opt: Option<SubworkbenchFailure>,\n    /// The number of persist attempts for this 
subrequest.\n    pub num_attempts: usize,\n}\n\nimpl IngestSubworkbench {\n    pub fn new(subrequest: IngestSubrequest) -> Self {\n        Self {\n            subrequest,\n            ..Default::default()\n        }\n    }\n\n    pub fn is_pending(&self) -> bool {\n        self.persist_success_opt.is_none() && self.last_failure_is_transient()\n    }\n\n    /// Returns `false` if and only if the last attempt suggests retrying (on any node) will fail.\n    /// e.g.:\n    /// - the index does not exist\n    /// - the source does not exist.\n    fn last_failure_is_transient(&self) -> bool {\n        match self.last_failure_opt {\n            Some(SubworkbenchFailure::IndexNotFound) => false,\n            Some(SubworkbenchFailure::SourceNotFound) => false,\n            Some(SubworkbenchFailure::Internal) => true,\n            Some(SubworkbenchFailure::NoShardsAvailable) => true,\n            Some(SubworkbenchFailure::Persist(_)) => true,\n            Some(SubworkbenchFailure::Unavailable) => true,\n            Some(SubworkbenchFailure::RateLimited(_)) => true,\n            None => true,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use quickwit_proto::indexing::ShardPositionsUpdate;\n    use quickwit_proto::ingest::ingester::PersistFailureReason;\n    use quickwit_proto::types::{IndexUid, Position, ShardId, SourceUid};\n\n    use super::*;\n\n    #[test]\n    fn test_ingest_subworkbench() {\n        let subrequest = IngestSubrequest {\n            ..Default::default()\n        };\n        let mut subworkbench = IngestSubworkbench::new(subrequest);\n        assert!(subworkbench.is_pending());\n        assert!(subworkbench.last_failure_is_transient());\n\n        subworkbench.last_failure_opt = Some(SubworkbenchFailure::Unavailable);\n        assert!(subworkbench.is_pending());\n        assert!(subworkbench.last_failure_is_transient());\n\n        subworkbench.last_failure_opt = Some(SubworkbenchFailure::Internal);\n        
assert!(subworkbench.is_pending());\n        assert!(subworkbench.last_failure_is_transient());\n\n        subworkbench.last_failure_opt = Some(SubworkbenchFailure::NoShardsAvailable);\n        assert!(subworkbench.is_pending());\n        assert!(subworkbench.last_failure_is_transient());\n\n        subworkbench.last_failure_opt = Some(SubworkbenchFailure::IndexNotFound);\n        assert!(!subworkbench.is_pending());\n        assert!(!subworkbench.last_failure_is_transient());\n        subworkbench.last_failure_opt = Some(SubworkbenchFailure::SourceNotFound);\n        assert!(!subworkbench.is_pending());\n        assert!(!subworkbench.last_failure_is_transient());\n\n        subworkbench.last_failure_opt = Some(SubworkbenchFailure::Persist(\n            PersistFailureReason::NoShardsAvailable,\n        ));\n        assert!(subworkbench.is_pending());\n        assert!(subworkbench.last_failure_is_transient());\n\n        let persist_success = PersistSuccess {\n            ..Default::default()\n        };\n        subworkbench.persist_success_opt = Some(persist_success);\n        assert!(!subworkbench.is_pending());\n    }\n\n    #[test]\n    fn test_ingest_workbench() {\n        let workbench = IngestWorkbench::new(Vec::new(), 1);\n        assert!(workbench.is_complete());\n\n        let ingest_subrequests = vec![IngestSubrequest {\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n        assert!(!workbench.is_last_attempt());\n        assert!(!workbench.is_complete());\n\n        workbench.new_attempt();\n        assert!(workbench.is_last_attempt());\n        assert!(workbench.is_complete());\n\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                ..Default::default()\n            },\n        ];\n        
let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 2);\n        assert!(!workbench.is_complete());\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 0,\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        assert_eq!(workbench.num_successes, 1);\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 1);\n        assert_eq!(\n            pending_subrequests(&workbench.subworkbenches)\n                .next()\n                .unwrap()\n                .subrequest_id,\n            1\n        );\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert_eq!(subworkbench.num_attempts, 1);\n        assert!(!subworkbench.is_pending());\n\n        let persist_failure = PersistFailure {\n            subrequest_id: 1,\n            ..Default::default()\n        };\n        workbench.record_persist_failure(&persist_failure);\n\n        assert_eq!(workbench.num_successes, 1);\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 1);\n        assert_eq!(\n            pending_subrequests(&workbench.subworkbenches)\n                .next()\n                .unwrap()\n                .subrequest_id,\n            1\n        );\n\n        let subworkbench = workbench.subworkbenches.get(&1).unwrap();\n        assert_eq!(subworkbench.num_attempts, 1);\n        assert!(subworkbench.last_failure_opt.is_some());\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 1,\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        assert!(workbench.is_complete());\n        assert_eq!(workbench.num_successes, 2);\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 0);\n    }\n\n    #[tokio::test]\n    async fn 
test_workbench_publish_tracking_empty() {\n        let workbench =\n            IngestWorkbench::new_with_publish_tracking(Vec::new(), 1, EventBroker::default());\n        assert!(workbench.is_complete());\n        assert_eq!(\n            workbench.into_ingest_result().await,\n            IngestResponseV2::default()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_workbench_publish_tracking_happy_path() {\n        let event_broker = EventBroker::default();\n        let shard_id_1 = ShardId::from(\"test-shard-1\");\n        let shard_id_2 = ShardId::from(\"test-shard-2\");\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                ..Default::default()\n            },\n        ];\n        let mut workbench =\n            IngestWorkbench::new_with_publish_tracking(ingest_subrequests, 1, event_broker.clone());\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 2);\n        assert!(!workbench.is_complete());\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 0,\n            shard_id: Some(shard_id_1.clone()),\n            replication_position_inclusive: Some(Position::offset(42usize)),\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        let persist_failure = PersistFailure {\n            subrequest_id: 1,\n            ..Default::default()\n        };\n        workbench.record_persist_failure(&persist_failure);\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 1,\n            shard_id: Some(shard_id_2.clone()),\n            replication_position_inclusive: Some(Position::offset(66usize)),\n            ..Default::default()\n        };\n\n        workbench.record_persist_success(persist_success);\n\n        
assert!(workbench.is_complete());\n        assert_eq!(workbench.num_successes, 2);\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 0);\n\n        event_broker.publish(ShardPositionsUpdate {\n            source_uid: SourceUid {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".to_string(),\n            },\n            updated_shard_positions: vec![\n                (shard_id_1, Position::offset(42usize)),\n                (shard_id_2, Position::offset(66usize)),\n            ]\n            .into_iter()\n            .collect(),\n        });\n\n        let ingest_response = workbench.into_ingest_result().await;\n        assert_eq!(ingest_response.successes.len(), 2);\n        assert_eq!(ingest_response.failures.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_workbench_publish_tracking_waits() {\n        let event_broker = EventBroker::default();\n        let shard_id_1 = ShardId::from(\"test-shard-1\");\n        let shard_id_2 = ShardId::from(\"test-shard-2\");\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                ..Default::default()\n            },\n        ];\n        let mut workbench =\n            IngestWorkbench::new_with_publish_tracking(ingest_subrequests, 1, event_broker.clone());\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 0,\n            shard_id: Some(shard_id_1.clone()),\n            replication_position_inclusive: Some(Position::offset(42usize)),\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 1,\n            shard_id: Some(shard_id_2.clone()),\n            replication_position_inclusive: 
Some(Position::offset(66usize)),\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        assert!(workbench.is_complete());\n        assert_eq!(workbench.num_successes, 2);\n        assert_eq!(pending_subrequests(&workbench.subworkbenches).count(), 0);\n\n        event_broker.publish(ShardPositionsUpdate {\n            source_uid: SourceUid {\n                index_uid: IndexUid::for_test(\"test-index\", 0),\n                source_id: \"test-source\".to_string(),\n            },\n            updated_shard_positions: vec![(shard_id_2, Position::offset(66usize))]\n                .into_iter()\n                .collect(),\n        });\n        // still waits for shard 1 to be published\n        tokio::time::timeout(Duration::from_millis(200), workbench.into_ingest_result())\n            .await\n            .unwrap_err();\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_get_or_create_open_shards_failure() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        let get_or_create_open_shards_failure = GetOrCreateOpenShardsFailure {\n            subrequest_id: 42,\n            reason: GetOrCreateOpenShardsFailureReason::IndexNotFound as i32,\n            ..Default::default()\n        };\n        workbench.record_get_or_create_open_shards_failure(get_or_create_open_shards_failure);\n\n        let get_or_create_open_shards_failure = GetOrCreateOpenShardsFailure {\n            subrequest_id: 0,\n            reason: GetOrCreateOpenShardsFailureReason::SourceNotFound as i32,\n            ..Default::default()\n        };\n        workbench.record_get_or_create_open_shards_failure(get_or_create_open_shards_failure);\n\n        assert_eq!(workbench.num_successes, 0);\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n       
 assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::SourceNotFound)\n        ));\n        assert_eq!(subworkbench.num_attempts, 1);\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_persist_success() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 42,\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        let persist_success = PersistSuccess {\n            subrequest_id: 0,\n            ..Default::default()\n        };\n        workbench.record_persist_success(persist_success);\n\n        assert_eq!(workbench.num_successes, 1);\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.persist_success_opt,\n            Some(PersistSuccess { .. 
})\n        ));\n        assert_eq!(subworkbench.num_attempts, 1);\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_persist_error_timeout() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        let persist_error = IngestV2Error::Timeout(\"request timed out\".to_string());\n        let leader_id = NodeId::from(\"test-leader\");\n        let persist_summary = PersistRequestSummary {\n            leader_id: leader_id.clone(),\n            subrequest_ids: vec![0],\n        };\n        workbench.record_persist_error(persist_error, persist_summary);\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert_eq!(subworkbench.num_attempts, 1);\n\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Persist(PersistFailureReason::Timeout))\n        ));\n        assert!(subworkbench.persist_success_opt.is_none());\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_persist_error_unavailable() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        let persist_error = IngestV2Error::Unavailable(\"connection error\".to_string());\n        let leader_id = NodeId::from(\"test-leader\");\n        let persist_summary = PersistRequestSummary {\n            leader_id: leader_id.clone(),\n            subrequest_ids: vec![0],\n        };\n        workbench.record_persist_error(persist_error, persist_summary);\n\n        assert!(workbench.unavailable_leaders.contains(&leader_id));\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert_eq!(subworkbench.num_attempts, 1);\n\n        assert!(matches!(\n            
subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Unavailable)\n        ));\n        assert!(subworkbench.persist_success_opt.is_none());\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_persist_error_internal() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        let persist_error = IngestV2Error::Internal(\"IO error\".to_string());\n        let persist_summary = PersistRequestSummary {\n            leader_id: NodeId::from(\"test-leader\"),\n            subrequest_ids: vec![0],\n        };\n        workbench.record_persist_error(persist_error, persist_summary);\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert_eq!(subworkbench.num_attempts, 1);\n\n        assert!(matches!(\n            &subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Internal)\n        ));\n        assert!(subworkbench.persist_success_opt.is_none());\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_persist_failure() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        let persist_failure = PersistFailure {\n            subrequest_id: 42,\n            reason: PersistFailureReason::NoShardsAvailable as i32,\n            ..Default::default()\n        };\n        workbench.record_persist_failure(&persist_failure);\n\n        let persist_failure = PersistFailure {\n            subrequest_id: 0,\n            reason: PersistFailureReason::WalFull as i32,\n            ..Default::default()\n        };\n        workbench.record_persist_failure(&persist_failure);\n\n        assert_eq!(workbench.num_successes, 0);\n\n        let subworkbench = 
workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::Persist(reason)) if reason == PersistFailureReason::WalFull\n        ));\n        assert_eq!(subworkbench.num_attempts, 1);\n    }\n\n    #[test]\n    fn test_ingest_workbench_record_no_shards_available() {\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n\n        workbench.record_no_shards_available(42);\n        workbench.record_no_shards_available(0);\n\n        assert_eq!(workbench.num_successes, 0);\n\n        let subworkbench = workbench.subworkbenches.get(&0).unwrap();\n        assert!(matches!(\n            subworkbench.last_failure_opt,\n            Some(SubworkbenchFailure::NoShardsAvailable)\n        ));\n        assert_eq!(subworkbench.num_attempts, 1);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_workbench_into_ingest_result() {\n        let workbench = IngestWorkbench::new(Vec::new(), 0);\n        let response = workbench.into_ingest_result().await;\n        assert!(response.successes.is_empty());\n        assert!(response.failures.is_empty());\n\n        let ingest_subrequests = vec![\n            IngestSubrequest {\n                subrequest_id: 0,\n                ..Default::default()\n            },\n            IngestSubrequest {\n                subrequest_id: 1,\n                ..Default::default()\n            },\n        ];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n        let persist_success = PersistSuccess {\n            ..Default::default()\n        };\n        let subworkbench = workbench.subworkbenches.get_mut(&0).unwrap();\n        subworkbench.persist_success_opt = Some(persist_success);\n\n        workbench.record_no_shards_available(1);\n\n        let response = 
workbench.into_ingest_result().await;\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.successes[0].subrequest_id, 0);\n\n        assert_eq!(response.failures.len(), 1);\n        assert_eq!(response.failures[0].subrequest_id, 1);\n        assert_eq!(\n            response.failures[0].reason(),\n            IngestFailureReason::NoShardsAvailable\n        );\n\n        let ingest_subrequests = vec![IngestSubrequest {\n            subrequest_id: 0,\n            ..Default::default()\n        }];\n        let mut workbench = IngestWorkbench::new(ingest_subrequests, 1);\n        let failure = SubworkbenchFailure::Persist(PersistFailureReason::Timeout);\n        workbench.record_failure(0, failure);\n\n        let ingest_response = workbench.into_ingest_result().await;\n        assert_eq!(ingest_response.successes.len(), 0);\n        assert_eq!(\n            ingest_response.failures[0].reason(),\n            IngestFailureReason::Timeout\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nmod doc_batch;\npub mod error;\nmod ingest_api_service;\n#[path = \"codegen/ingest_service.rs\"]\n#[allow(clippy::disallowed_methods)]\nmod ingest_service;\nmod ingest_v2;\nmod memory_capacity;\nmod metrics;\nmod mrecordlog_async;\nmod notifications;\nmod position;\nmod queue;\n\nuse std::collections::HashMap;\nuse std::path::{Path, PathBuf};\n\nuse anyhow::{Context, bail};\npub use doc_batch::*;\npub use error::IngestServiceError;\npub use ingest_api_service::{GetMemoryCapacity, GetPartitionId, IngestApiService};\npub use ingest_service::*;\npub use ingest_v2::*;\npub use memory_capacity::MemoryCapacity;\nuse once_cell::sync::OnceCell;\npub use position::Position;\npub use queue::Queues;\nuse quickwit_actors::{Mailbox, Universe};\nuse quickwit_config::IngestApiConfig;\nuse tokio::sync::Mutex;\n\npub const QUEUES_DIR_NAME: &str = \"queues\";\n\npub type Result<T> = std::result::Result<T, IngestServiceError>;\n\ntype IngestApiServiceMailboxes = HashMap<PathBuf, Mailbox<IngestApiService>>;\n\npub static INGEST_API_SERVICE_MAILBOXES: OnceCell<Mutex<IngestApiServiceMailboxes>> =\n    OnceCell::new();\n\n/// Initializes an [`IngestApiService`] consuming the queues located in `queues_dir_path`, or\n/// returns the existing mailbox if one was already initialized for that directory.\npub async fn init_ingest_api(\n    universe: &Universe,\n    queues_dir_path: &Path,\n    config: &IngestApiConfig,\n) -> 
anyhow::Result<Mailbox<IngestApiService>> {\n    let mut guard = INGEST_API_SERVICE_MAILBOXES\n        .get_or_init(|| Mutex::new(HashMap::new()))\n        .lock()\n        .await;\n    if let Some(mailbox) = guard.get(queues_dir_path) {\n        return Ok(mailbox.clone());\n    }\n    let ingest_api_actor = IngestApiService::with_queues_dir(\n        queues_dir_path,\n        config.max_queue_memory_usage.as_u64() as usize,\n        config.max_queue_disk_usage.as_u64() as usize,\n    )\n    .await\n    .with_context(|| {\n        format!(\n            \"failed to open the ingest API record log located at `{}`\",\n            queues_dir_path.display()\n        )\n    })?;\n    let (ingest_api_service, _ingest_api_handle) = universe.spawn_builder().spawn(ingest_api_actor);\n    guard.insert(queues_dir_path.to_path_buf(), ingest_api_service.clone());\n    Ok(ingest_api_service)\n}\n\n/// Returns the instance of the single IngestApiService via a copy of its Mailbox.\npub async fn get_ingest_api_service(\n    queues_dir_path: &Path,\n) -> anyhow::Result<Mailbox<IngestApiService>> {\n    let guard = INGEST_API_SERVICE_MAILBOXES\n        .get_or_init(|| Mutex::new(HashMap::new()))\n        .lock()\n        .await;\n    if let Some(mailbox) = guard.get(queues_dir_path) {\n        return Ok(mailbox.clone());\n    }\n    bail!(\n        \"ingest API service with queues directory located at `{}` is not initialized\",\n        queues_dir_path.display()\n    )\n}\n\n/// Starts an [`IngestApiService`] instance at `<data_dir_path>/queues`.\npub async fn start_ingest_api_service(\n    universe: &Universe,\n    data_dir_path: &Path,\n    config: &IngestApiConfig,\n) -> anyhow::Result<Mailbox<IngestApiService>> {\n    let queues_dir_path = data_dir_path.join(QUEUES_DIR_NAME);\n    init_ingest_api(universe, &queues_dir_path, config).await\n}\n\n#[macro_export]\nmacro_rules! 
with_lock_metrics {\n    ($future:expr, $($label:tt),*) => {\n        {\n            $crate::ingest_v2::metrics::INGEST_V2_METRICS\n                .wal_acquire_lock_requests_in_flight\n                .with_label_values([$($label),*])\n                .inc();\n\n            let now = std::time::Instant::now();\n            let guard = $future;\n\n            let elapsed = now.elapsed();\n            if elapsed > std::time::Duration::from_secs(1) {\n                quickwit_common::rate_limited_warn!(\n                    limit_per_min=6,\n                    \"lock acquisition took {}ms\", elapsed.as_millis()\n                );\n            }\n            $crate::ingest_v2::metrics::INGEST_V2_METRICS\n                .wal_acquire_lock_requests_in_flight\n                .with_label_values([$($label),*])\n                .dec();\n            $crate::ingest_v2::metrics::INGEST_V2_METRICS\n                .wal_acquire_lock_request_duration_secs\n                .with_label_values([$($label),*])\n                .observe(elapsed.as_secs_f64());\n\n            guard\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use quickwit_actors::AskError;\n    use quickwit_proto::ingest::RateLimitingCause;\n\n    use super::*;\n    use crate::{CreateQueueRequest, IngestRequest, SuggestTruncateRequest};\n\n    #[tokio::test]\n    async fn test_get_ingest_api_service() {\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir().unwrap();\n\n        let queues_0_dir_path = temp_dir.path().join(\"queues-0\");\n        get_ingest_api_service(&queues_0_dir_path)\n            .await\n            .unwrap_err();\n        init_ingest_api(&universe, &queues_0_dir_path, &IngestApiConfig::default())\n            .await\n            .unwrap();\n        let ingest_api_service_0 = get_ingest_api_service(&queues_0_dir_path).await.unwrap();\n        ingest_api_service_0\n            .ask_for_res(CreateQueueRequest {\n                queue_id: 
\"test-queue\".to_string(),\n            })\n            .await\n            .unwrap();\n\n        let queues_1_dir_path = temp_dir.path().join(\"queues-1\");\n        init_ingest_api(&universe, &queues_1_dir_path, &IngestApiConfig::default())\n            .await\n            .unwrap();\n        let ingest_api_service_1 = get_ingest_api_service(&queues_1_dir_path).await.unwrap();\n        ingest_api_service_1\n            .ask_for_res(CreateQueueRequest {\n                queue_id: \"test-queue\".to_string(),\n            })\n            .await\n            .unwrap();\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_get_ingest_multiple_index_api_service() {\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir().unwrap();\n\n        let queues_0_dir_path = temp_dir.path().join(\"queues-0\");\n        let ingest_api_service =\n            init_ingest_api(&universe, &queues_0_dir_path, &IngestApiConfig::default())\n                .await\n                .unwrap();\n        ingest_api_service\n            .ask_for_res(CreateQueueRequest {\n                queue_id: \"index-1\".to_string(),\n            })\n            .await\n            .unwrap();\n        let ingest_request = IngestRequest {\n            doc_batches: vec![\n                DocBatch {\n                    index_id: \"index-1\".to_string(),\n                    doc_buffer: vec![10, 11, 12].into(),\n                    doc_lengths: vec![2],\n                },\n                DocBatch {\n                    index_id: \"index-2\".to_string(),\n                    doc_buffer: vec![10, 11, 12].into(),\n                    doc_lengths: vec![2],\n                },\n            ],\n            commit: CommitType::Auto.into(),\n        };\n        let ingest_result = ingest_api_service.ask_for_res(ingest_request).await;\n        assert!(ingest_result.is_err());\n        match ingest_result.unwrap_err() {\n            
AskError::ErrorReply(ingest_error) => {\n                assert!(ingest_error.to_string().contains(\"index-2\"));\n            }\n            _ => panic!(\"wrong error type\"),\n        }\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_queue_limit() {\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir().unwrap();\n\n        let queues_dir_path = temp_dir.path().join(\"queues-0\");\n        get_ingest_api_service(&queues_dir_path).await.unwrap_err();\n\n        let ingest_api_config = serde_json::from_str(\n            r#\"{\n            \"max_queue_memory_usage\": \"1200b\",\n            \"max_queue_disk_usage\": \"256mb\"\n        }\"#,\n        )\n        .unwrap();\n        init_ingest_api(&universe, &queues_dir_path, &ingest_api_config)\n            .await\n            .unwrap();\n        let ingest_api_service = get_ingest_api_service(&queues_dir_path).await.unwrap();\n\n        ingest_api_service\n            .ask_for_res(CreateQueueRequest {\n                queue_id: \"test-queue\".to_string(),\n            })\n            .await\n            .unwrap();\n\n        let ingest_request = IngestRequest {\n            doc_batches: vec![DocBatch {\n                index_id: \"test-queue\".to_string(),\n                doc_buffer: vec![1; 600].into(),\n                doc_lengths: vec![30; 20],\n            }],\n            commit: CommitType::Auto.into(),\n        };\n\n        ingest_api_service\n            .ask_for_res(ingest_request.clone())\n            .await\n            .unwrap();\n\n        ingest_api_service\n            .ask_for_res(ingest_request.clone())\n            .await\n            .unwrap();\n\n        // we have too much in memory\n        assert!(matches!(\n            ingest_api_service\n                .ask_for_res(ingest_request.clone())\n                .await\n                .unwrap_err(),\n            
AskError::ErrorReply(IngestServiceError::RateLimited(RateLimitingCause::WalFull))\n        ));\n\n        // delete the first batch\n        ingest_api_service\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"test-queue\".to_string(),\n                up_to_position_included: 29,\n            })\n            .await\n            .unwrap();\n\n        // now we should be okay\n        ingest_api_service\n            .ask_for_res(ingest_request)\n            .await\n            .unwrap();\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/memory_capacity.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicUsize, Ordering};\n\n#[derive(Debug, Clone, Copy, thiserror::Error)]\n#[error(\"failed to reserve requested memory capacity. current capacity: {0}\")]\npub struct ReserveCapacityError(usize);\n\n#[derive(Clone)]\npub struct MemoryCapacity {\n    inner: Arc<InnerMemoryCapacity>,\n}\n\nimpl fmt::Debug for MemoryCapacity {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"MemoryCapacity\")\n            .field(\"capacity\", &self.capacity())\n            .field(\"max_capacity\", &self.max_capacity())\n            .finish()\n    }\n}\n\nimpl MemoryCapacity {\n    /// Creates a new [`MemoryCapacity`] object with a capacity of `max_capacity` bytes.\n    ///\n    /// # Panics\n    ///\n    /// This constructor panics if `max_capacity` is 0.\n    pub fn new(max_capacity: usize) -> Self {\n        assert!(\n            max_capacity > 0,\n            \"The memory capacity is required to be > 0.\"\n        );\n\n        Self {\n            inner: Arc::new(InnerMemoryCapacity {\n                max_capacity,\n                capacity: AtomicUsize::new(max_capacity),\n            }),\n        }\n    }\n\n    /// Attempts to reserve `num_bytes` of capacity. 
Returns an error if there is not enough\n    /// capacity available.\n    pub fn reserve_capacity(&self, num_bytes: usize) -> Result<(), ReserveCapacityError> {\n        loop {\n            let current_capacity = self.inner.capacity.load(Ordering::Acquire);\n\n            if current_capacity < num_bytes {\n                return Err(ReserveCapacityError(current_capacity));\n            }\n            let new_capacity = current_capacity - num_bytes;\n\n            if self\n                .inner\n                .capacity\n                .compare_exchange(\n                    current_capacity,\n                    new_capacity,\n                    Ordering::AcqRel,\n                    Ordering::Acquire,\n                )\n                .is_ok()\n            {\n                return Ok(());\n            }\n        }\n    }\n\n    /// Resets the capacity to `new_capacity`.\n    pub fn reset_capacity(&self, new_capacity: usize) {\n        self.inner.capacity.store(new_capacity, Ordering::Release);\n    }\n\n    /// Returns the maximum capacity.\n    pub fn max_capacity(&self) -> usize {\n        self.inner.max_capacity\n    }\n\n    /// Returns the remaining (unreserved) capacity.\n    pub fn capacity(&self) -> usize {\n        self.inner\n            .capacity\n            .load(std::sync::atomic::Ordering::Relaxed)\n    }\n\n    /// Returns the ratio of used capacity to maximum capacity.\n    pub fn usage_ratio(&self) -> f64 {\n        1.0 - (self.capacity() as f64 / self.max_capacity() as f64)\n    }\n}\n\nstruct InnerMemoryCapacity {\n    /// The maximum number of bytes that can be stored in memory.\n    max_capacity: usize,\n    /// The number of bytes still available for reservation.\n    capacity: AtomicUsize,\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Barrier;\n    use std::thread;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_memory_capacity() {\n        let memory_capacity = MemoryCapacity::new(10);\n        assert_eq!(memory_capacity.max_capacity(), 10);\n        
assert_eq!(memory_capacity.capacity(), 10);\n        assert_eq!(memory_capacity.usage_ratio(), 0.0);\n\n        memory_capacity.reserve_capacity(6).unwrap();\n        assert_eq!(memory_capacity.max_capacity(), 10);\n        assert_eq!(memory_capacity.capacity(), 4);\n        assert_eq!(memory_capacity.usage_ratio(), 0.6);\n\n        memory_capacity.reserve_capacity(3).unwrap();\n        assert_eq!(memory_capacity.max_capacity(), 10);\n        assert_eq!(memory_capacity.capacity(), 1);\n        assert_eq!(memory_capacity.usage_ratio(), 0.9);\n\n        memory_capacity.reserve_capacity(1).unwrap();\n        assert_eq!(memory_capacity.max_capacity(), 10);\n        assert_eq!(memory_capacity.capacity(), 0);\n        assert_eq!(memory_capacity.usage_ratio(), 1.0);\n\n        memory_capacity.reserve_capacity(1).unwrap_err();\n\n        let mut handles = Vec::with_capacity(100);\n        let barrier = Arc::new(Barrier::new(100));\n        let memory_capacity = MemoryCapacity::new(100);\n\n        for _ in 0..100 {\n            let barrier = barrier.clone();\n            let memory_capacity = memory_capacity.clone();\n\n            handles.push(thread::spawn(move || {\n                barrier.wait();\n                memory_capacity.reserve_capacity(1).unwrap();\n            }));\n        }\n        for handle in handles {\n            handle.join().unwrap();\n        }\n        assert_eq!(memory_capacity.capacity(), 0)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{IntCounter, IntGauge, new_counter, new_counter_vec, new_gauge};\n\npub struct IngestMetrics {\n    pub ingested_docs_bytes_valid: IntCounter,\n    pub ingested_docs_bytes_invalid: IntCounter,\n    pub ingested_docs_invalid: IntCounter,\n    pub ingested_docs_valid: IntCounter,\n\n    pub replicated_num_bytes_total: IntCounter,\n    pub replicated_num_docs_total: IntCounter,\n    #[allow(dead_code)] // this really shouldn't be dead, it needs to be used somewhere\n    pub queue_count: IntGauge,\n}\n\nimpl Default for IngestMetrics {\n    fn default() -> Self {\n        let ingest_docs_bytes_total = new_counter_vec(\n            \"docs_bytes_total\",\n            \"Total size of the docs ingested, measured in ingester's leader, after validation and \\\n             before persistence/replication\",\n            \"ingest\",\n            &[],\n            [\"validity\"],\n        );\n        let ingested_docs_bytes_valid = ingest_docs_bytes_total.with_label_values([\"valid\"]);\n        let ingested_docs_bytes_invalid = ingest_docs_bytes_total.with_label_values([\"invalid\"]);\n\n        let ingest_docs_total = new_counter_vec(\n            \"docs_total\",\n            \"Total number of the docs ingested, measured in ingester's leader, after validation \\\n             and before 
persistence/replication\",\n            \"ingest\",\n            &[],\n            [\"validity\"],\n        );\n        let ingested_docs_valid = ingest_docs_total.with_label_values([\"valid\"]);\n        let ingested_docs_invalid = ingest_docs_total.with_label_values([\"invalid\"]);\n\n        IngestMetrics {\n            ingested_docs_bytes_valid,\n            ingested_docs_bytes_invalid,\n            ingested_docs_valid,\n            ingested_docs_invalid,\n            replicated_num_bytes_total: new_counter(\n                \"replicated_num_bytes_total\",\n                \"Total size in bytes of the replicated docs.\",\n                \"ingest\",\n                &[],\n            ),\n            replicated_num_docs_total: new_counter(\n                \"replicated_num_docs_total\",\n                \"Total number of docs replicated.\",\n                \"ingest\",\n                &[],\n            ),\n            queue_count: new_gauge(\n                \"queue_count\",\n                \"Number of queues currently active\",\n                \"ingest\",\n                &[],\n            ),\n        }\n    }\n}\n\npub static INGEST_METRICS: Lazy<IngestMetrics> = Lazy::new(IngestMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/mrecordlog_async.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\nuse std::ops::RangeBounds;\nuse std::path::Path;\n\nuse bytes::Buf;\nuse mrecordlog::error::*;\nuse mrecordlog::{MultiRecordLog, PersistAction, PersistPolicy, Record, ResourceUsage};\nuse tokio::task::JoinError;\nuse tracing::error;\n\n/// A light wrapper to allow async operation in mrecordlog.\npub struct MultiRecordLogAsync {\n    mrecordlog_opt: Option<MultiRecordLog>,\n}\n\nimpl MultiRecordLogAsync {\n    fn take(&mut self) -> MultiRecordLog {\n        let Some(mrecordlog) = self.mrecordlog_opt.take() else {\n            error!(\"wal is poisoned (on write), aborting process\");\n            std::process::abort();\n        };\n        mrecordlog\n    }\n\n    fn mrecordlog_ref(&self) -> &MultiRecordLog {\n        let Some(mrecordlog) = &self.mrecordlog_opt else {\n            error!(\"wal is poisoned (on read), aborting process\");\n            std::process::abort();\n        };\n        mrecordlog\n    }\n\n    pub async fn open(directory_path: &Path) -> Result<Self, ReadRecordError> {\n        Self::open_with_prefs(directory_path, PersistPolicy::Always(PersistAction::Flush)).await\n    }\n\n    pub async fn open_with_prefs(\n        directory_path: &Path,\n        persist_policy: PersistPolicy,\n    ) -> Result<Self, ReadRecordError> {\n        let directory_path = directory_path.to_path_buf();\n        let mrecordlog = 
tokio::task::spawn(async move {\n            MultiRecordLog::open_with_prefs(&directory_path, persist_policy)\n        })\n        .await\n        .map_err(|join_err| {\n            error!(error=?join_err, \"failed to load WAL\");\n            ReadRecordError::IoError(io::Error::other(\"loading wal from directory failed\"))\n        })??;\n        Ok(Self {\n            mrecordlog_opt: Some(mrecordlog),\n        })\n    }\n\n    async fn run_operation<F, T>(&mut self, operation: F) -> T\n    where\n        F: FnOnce(&mut MultiRecordLog) -> T + Send + 'static,\n        T: Send + 'static,\n    {\n        let mut mrecordlog = self.take();\n        let join_res: Result<(T, MultiRecordLog), JoinError> =\n            tokio::task::spawn_blocking(move || {\n                let res = operation(&mut mrecordlog);\n                (res, mrecordlog)\n            })\n            .await;\n        match join_res {\n            Ok((operation_result, mrecordlog)) => {\n                self.mrecordlog_opt = Some(mrecordlog);\n                operation_result\n            }\n            Err(join_error) => {\n                // This could be caused by a panic\n                error!(error=?join_error, \"failed to run mrecordlog operation\");\n                panic!(\"failed to run mrecordlog operation\");\n            }\n        }\n    }\n\n    pub async fn create_queue(&mut self, queue: &str) -> Result<(), CreateQueueError> {\n        let queue = queue.to_string();\n        self.run_operation(move |mrecordlog| mrecordlog.create_queue(&queue))\n            .await\n    }\n\n    pub async fn delete_queue(&mut self, queue: &str) -> Result<(), DeleteQueueError> {\n        let queue = queue.to_string();\n        self.run_operation(move |mrecordlog| mrecordlog.delete_queue(&queue))\n            .await\n    }\n\n    pub async fn append_records<T: Iterator<Item = impl Buf> + Send + 'static>(\n        &mut self,\n        queue: &str,\n        position_opt: Option<u64>,\n        payloads: T,\n   
 ) -> Result<Option<u64>, AppendError> {\n        let queue = queue.to_string();\n        self.run_operation(move |mrecordlog| {\n            mrecordlog.append_records(&queue, position_opt, payloads)\n        })\n        .await\n    }\n\n    #[track_caller]\n    #[cfg(test)]\n    pub fn assert_records_eq<R>(\n        &self,\n        queue_id: &str,\n        range: R,\n        expected_records: &[(u64, [u8; 2], &str)],\n    ) where\n        R: RangeBounds<u64> + 'static,\n    {\n        let records = self\n            .range(queue_id, range)\n            .unwrap()\n            .map(|Record { position, payload }| {\n                let header: [u8; 2] = payload[..2].try_into().unwrap();\n                let payload = String::from_utf8(payload[2..].to_vec()).unwrap();\n                (position, header, payload)\n            })\n            .collect::<Vec<_>>();\n        assert_eq!(\n            records.len(),\n            expected_records.len(),\n            \"expected {} records, got {}\",\n            expected_records.len(),\n            records.len()\n        );\n        for ((position, header, payload), (expected_position, expected_header, expected_payload)) in\n            records.iter().zip(expected_records.iter())\n        {\n            assert_eq!(\n                position, expected_position,\n                \"expected record at position `{expected_position}`, got `{position}`\",\n            );\n            assert_eq!(\n                header, expected_header,\n                \"expected record header, `{expected_header:?}`, got `{header:?}`\",\n            );\n            assert_eq!(\n                payload, expected_payload,\n                \"expected record payload, `{expected_payload}`, got `{payload}`\",\n            );\n        }\n    }\n\n    pub async fn truncate(&mut self, queue: &str, position: u64) -> Result<usize, TruncateError> {\n        let queue = queue.to_string();\n        self.run_operation(move |mrecordlog| mrecordlog.truncate(&queue, 
position))\n            .await\n    }\n\n    pub fn range<R>(\n        &self,\n        queue: &str,\n        range: R,\n    ) -> Result<impl Iterator<Item = Record<'_>> + '_, MissingQueue>\n    where\n        R: RangeBounds<u64> + 'static,\n    {\n        self.mrecordlog_ref().range(queue, range)\n    }\n\n    pub fn queue_exists(&self, queue: &str) -> bool {\n        self.mrecordlog_ref().queue_exists(queue)\n    }\n\n    pub fn list_queues(&self) -> impl Iterator<Item = &str> {\n        self.mrecordlog_ref().list_queues()\n    }\n\n    pub fn last_record(&self, queue: &str) -> Result<Option<Record<'_>>, MissingQueue> {\n        self.mrecordlog_ref().last_record(queue)\n    }\n\n    pub fn resource_usage(&self) -> ResourceUsage {\n        self.mrecordlog_ref().resource_usage()\n    }\n\n    pub fn summary(&self) -> mrecordlog::QueuesSummary {\n        self.mrecordlog_ref().summary()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/notifications.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, VecDeque};\nuse std::sync::Arc;\n\nuse tokio::sync::Mutex;\n\n/// Registry for the index positions that are waiting to be notified when index commit occurs.\n#[derive(Clone, Default)]\npub struct Notifications {\n    notifications: Arc<Mutex<HashMap<String, VecDeque<Position>>>>,\n}\n\nimpl Notifications {\n    /// Create a new notification registry\n    pub fn new() -> Self {\n        Self {\n            notifications: Arc::new(Mutex::new(HashMap::new())),\n        }\n    }\n\n    /// Register index positions\n    pub async fn register(\n        &self,\n        index_positions: Vec<(String, u64)>,\n        notify: impl FnOnce() + Send + Sync + 'static,\n    ) {\n        let mut guard = self.notifications.lock().await;\n        let notification = Arc::new(Notification::new(notify));\n        for index_position in index_positions {\n            let positions = guard\n                .entry(index_position.0.clone())\n                .or_insert_with(VecDeque::new);\n            positions.push_back(Position {\n                position: index_position.1,\n                notification: notification.clone(),\n            });\n        }\n    }\n\n    /// Notify positions\n    pub async fn notify(&self, index: &String, max_position: u64) {\n        let mut map = self.notifications.lock().await;\n        if let 
Some(positions) = map.get_mut(index) {\n            while let Some(position) = positions.front() {\n                if position.position <= max_position {\n                    positions\n                        .pop_front()\n                        .unwrap()\n                        .decrement_count_and_notify_if_last();\n                } else {\n                    break;\n                }\n            }\n            if positions.is_empty() {\n                map.remove(index);\n            }\n        }\n    }\n}\n\nimpl Notification {\n    fn new(notify: impl FnOnce() + Send + Sync + 'static) -> Self {\n        Self {\n            notify: Box::new(notify),\n        }\n    }\n}\n\nstruct Position {\n    position: u64,\n    notification: Arc<Notification>,\n}\n\nimpl Position {\n    /// Reduces the notification's Arc count and notifies if self holds the only pointer.\n    fn decrement_count_and_notify_if_last(self) {\n        // Errors are allowed here: it simply means there are still some positions that\n        // were not notified\n        let _ = Arc::try_unwrap(self.notification).map(|notification| notification.notify());\n    }\n}\n\nstruct Notification {\n    notify: Box<dyn FnOnce() + Send + Sync + 'static>,\n}\n\nimpl Notification {\n    fn notify(self) {\n        (self.notify)();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::sync::atomic::AtomicUsize;\n\n    use crate::notifications::Notifications;\n\n    #[tokio::test]\n    async fn test_notifications() {\n        let notifications = Notifications::new();\n        let cleared = Arc::new(AtomicUsize::default());\n        let cleared_clone = cleared.clone();\n        notifications\n            .register(vec![(\"index1\".to_string(), 10)], move || {\n                assert_eq!(\n                    cleared_clone.fetch_add(1, std::sync::atomic::Ordering::Relaxed),\n                    0\n                );\n            })\n            .await;\n        let cleared_clone = 
cleared.clone();\n        notifications\n            .register(vec![(\"index2\".to_string(), 10)], move || {\n                assert_eq!(\n                    cleared_clone.fetch_add(1, std::sync::atomic::Ordering::Relaxed),\n                    1\n                );\n            })\n            .await;\n        let cleared_clone = cleared.clone();\n        notifications\n            .register(\n                vec![(\"index1\".to_string(), 20), (\"index1\".to_string(), 30)],\n                move || {\n                    assert_eq!(\n                        cleared_clone.fetch_add(1, std::sync::atomic::Ordering::Relaxed),\n                        2\n                    );\n                },\n            )\n            .await;\n        assert_eq!(cleared.load(std::sync::atomic::Ordering::Relaxed), 0);\n        notifications.notify(&\"index1\".to_string(), 20).await;\n        assert_eq!(cleared.load(std::sync::atomic::Ordering::Relaxed), 1);\n        notifications.notify(&\"index2\".to_string(), 100).await;\n        assert_eq!(cleared.load(std::sync::atomic::Ordering::Relaxed), 2);\n        notifications.notify(&\"index1\".to_string(), 100).await;\n        assert_eq!(cleared.load(std::sync::atomic::Ordering::Relaxed), 3);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/position.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse crate::error::CorruptedKey;\n\n#[derive(Clone, Copy, Default, Ord, PartialOrd, Eq, PartialEq)]\npub struct Position([u8; 8]);\n\nimpl TryFrom<&[u8]> for Position {\n    type Error = CorruptedKey;\n\n    fn try_from(bytes: &[u8]) -> Result<Self, CorruptedKey> {\n        let bytes: [u8; 8] = bytes.try_into().map_err(|_| CorruptedKey(bytes.len()))?;\n        Ok(Position(bytes))\n    }\n}\n\nimpl From<u64> for Position {\n    fn from(num: u64) -> Self {\n        Position(num.to_be_bytes())\n    }\n}\n\nimpl From<Position> for u64 {\n    fn from(pos: Position) -> u64 {\n        pos.pos_val()\n    }\n}\n\nimpl Position {\n    fn pos_val(self) -> u64 {\n        u64::from_be_bytes(self.0)\n    }\n\n    pub fn inc(&self) -> Position {\n        let new_val: u64 = self.pos_val() + 1u64;\n        Position::from(new_val)\n    }\n}\n\nimpl fmt::Debug for Position {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_tuple(\"Position\").field(&self.pos_val()).finish()\n    }\n}\n\nimpl fmt::Display for Position {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"#{:_>20}\", self.pos_val())\n    }\n}\n\nimpl AsRef<[u8]> for Position {\n    fn as_ref(&self) -> &[u8] {\n        &self.0\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::cmp::Ordering;\n\n    use crate::Position;\n\n    
#[test]\n    fn test_position_ordering_is_matching_natural_order() {\n        for (lesser, greater) in (0..1_000).zip(1..1_001) {\n            let lesser_pos = Position::from(lesser);\n            let greater_pos = Position::from(greater);\n            assert_eq!(lesser_pos.cmp(&greater_pos), Ordering::Less);\n        }\n    }\n\n    #[test]\n    fn test_from_to_u64() {\n        let test_n = 20_220_303u64;\n        let position = Position::from(test_n);\n        let position_val: u64 = position.into();\n        assert_eq!(test_n, position_val);\n    }\n\n    #[test]\n    fn test_position_debug() {\n        let test_n = 20_220_303u64;\n        let position = Position::from(test_n);\n        let position_str = format!(\"{position:?}\");\n        assert_eq!(position_str, \"Position(20220303)\");\n    }\n\n    #[test]\n    fn test_position_display() {\n        let test_n = 20_220_303u64;\n        let position_str = Position::from(test_n).to_string();\n        assert_eq!(position_str, \"#____________20220303\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ingest/src/queue.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Bound;\nuse std::path::Path;\n\nuse bytes::Buf;\nuse mrecordlog::error::CreateQueueError;\nuse mrecordlog::{Record, ResourceUsage};\nuse quickwit_actors::ActorContext;\n\nuse crate::mrecordlog_async::MultiRecordLogAsync;\nuse crate::{\n    DocBatchBuilder, FetchResponse, IngestApiService, IngestServiceError, ListQueuesResponse,\n};\n\nconst FETCH_PAYLOAD_LIMIT: usize = 2_000_000; // 2MB\n\n// TODO do we need to keep this?\nconst QUICKWIT_CF_PREFIX: &str = \".queue_\";\n\npub struct Queues {\n    record_log: MultiRecordLogAsync,\n}\n\nimpl Queues {\n    pub async fn open(queues_dir_path: &Path) -> crate::Result<Queues> {\n        tokio::fs::create_dir_all(queues_dir_path)\n            .await\n            .map_err(|error| {\n                IngestServiceError::IoError(format!(\n                    \"failed to create WAL directory `{}`: {}\",\n                    queues_dir_path.display(),\n                    error\n                ))\n            })?;\n        let record_log = MultiRecordLogAsync::open(queues_dir_path).await?;\n        Ok(Queues { record_log })\n    }\n\n    pub fn queue_exists(&self, queue_id: &str) -> bool {\n        let real_queue_id = format!(\"{QUICKWIT_CF_PREFIX}{queue_id}\");\n        self.record_log.queue_exists(&real_queue_id)\n    }\n\n    pub async fn create_queue(\n        &mut self,\n        queue_id: 
&str,\n        ctx: &ActorContext<IngestApiService>,\n    ) -> crate::Result<()> {\n        if self.queue_exists(queue_id) {\n            return Err(crate::IngestServiceError::IndexAlreadyExists {\n                index_id: queue_id.to_string(),\n            });\n        }\n        let real_queue_id = format!(\"{QUICKWIT_CF_PREFIX}{queue_id}\");\n        ctx.protect_future(self.record_log.create_queue(&real_queue_id))\n            .await\n            .map_err(|e| match e {\n                CreateQueueError::AlreadyExists => IngestServiceError::IndexAlreadyExists {\n                    index_id: queue_id.to_owned(),\n                },\n                CreateQueueError::IoError(ioe) => ioe.into(),\n            })?;\n        Ok(())\n    }\n\n    pub async fn drop_queue(\n        &mut self,\n        queue_id: &str,\n        ctx: &ActorContext<IngestApiService>,\n    ) -> crate::Result<()> {\n        let real_queue_id = format!(\"{QUICKWIT_CF_PREFIX}{queue_id}\");\n        ctx.protect_future(self.record_log.delete_queue(&real_queue_id))\n            .await?;\n        Ok(())\n    }\n\n    /// Suggest to truncate the queue.\n    ///\n    /// This function allows the queue to remove all records up to and\n    /// including `up_to_offset_included`.\n    ///\n    /// The role of this truncation is to release memory and disk space.\n    ///\n    /// There are no guarantees that the record will effectively be removed.\n    /// Nothing might happen, or the truncation might be partial.\n    ///\n    /// In other words, truncating from a position, and fetching records starting\n    /// earlier than this position can yield undefined result:\n    /// the truncated records may or may not be returned.\n    pub async fn suggest_truncate(\n        &mut self,\n        queue_id: &str,\n        up_to_offset_included: u64,\n        ctx: &ActorContext<IngestApiService>,\n    ) -> crate::Result<()> {\n        let real_queue_id = format!(\"{QUICKWIT_CF_PREFIX}{queue_id}\");\n\n        
ctx.protect_future(\n            self.record_log\n                .truncate(&real_queue_id, up_to_offset_included),\n        )\n        .await?;\n\n        Ok(())\n    }\n\n    // Append a single record to a target queue.\n    #[cfg(test)]\n    async fn append(\n        &mut self,\n        queue_id: &str,\n        record: &[u8],\n        ctx: &ActorContext<IngestApiService>,\n    ) -> crate::Result<Option<u64>> {\n        use bytes::Bytes;\n\n        self.append_batch(queue_id, std::iter::once(Bytes::from(record.to_vec())), ctx)\n            .await\n    }\n\n    // Append a batch of records to a target queue.\n    //\n    // This operation is atomic: the batch of records is either entirely added or not.\n    pub async fn append_batch(\n        &mut self,\n        queue_id: &str,\n        records_it: impl Iterator<Item = impl Buf> + Send + 'static,\n        ctx: &ActorContext<IngestApiService>,\n    ) -> crate::Result<Option<u64>> {\n        let real_queue_id = format!(\"{QUICKWIT_CF_PREFIX}{queue_id}\");\n\n        // TODO None means we don't have idempotent inserts\n        let max_position = ctx\n            .protect_future(\n                self.record_log\n                    .append_records(&real_queue_id, None, records_it),\n            )\n            .await?;\n\n        Ok(max_position)\n    }\n\n    // Streams messages in `]start_after, +∞[`.\n    //\n    // If `start_after` is set to None, then fetch from the start of the stream.\n    pub fn fetch(\n        &self,\n        queue_id: &str,\n        start_after: Option<u64>,\n        num_bytes_limit: Option<usize>,\n    ) -> crate::Result<FetchResponse> {\n        let real_queue_id = format!(\"{QUICKWIT_CF_PREFIX}{queue_id}\");\n\n        let starting_bound = match start_after {\n            Some(pos) => Bound::Excluded(pos),\n            None => Bound::Unbounded,\n        };\n        let records = self\n            .record_log\n            .range(&real_queue_id, (starting_bound, 
Bound::Unbounded))\n            .map_err(|_| crate::IngestServiceError::IndexNotFound {\n                // we want to return the queue_id, not the real_queue_id, so we can't just\n                // implement From<MissingQueue>\n                index_id: queue_id.to_string(),\n            })?;\n\n        let size_limit = num_bytes_limit.unwrap_or(FETCH_PAYLOAD_LIMIT);\n        let mut doc_batch = DocBatchBuilder::new(queue_id.to_string());\n        let mut num_bytes = 0;\n        let mut first_key_opt = None;\n\n        for Record { position, payload } in records {\n            if first_key_opt.is_none() {\n                first_key_opt = Some(position);\n            }\n            num_bytes += doc_batch.command_from_buf(payload.as_ref());\n            if num_bytes > size_limit {\n                break;\n            }\n        }\n\n        Ok(FetchResponse {\n            first_position: first_key_opt,\n            doc_batch: Some(doc_batch.build()),\n        })\n    }\n\n    // Streams messages from the start of the Stream.\n    pub fn tail(&self, queue_id: &str) -> crate::Result<FetchResponse> {\n        self.fetch(queue_id, None, None)\n    }\n\n    pub fn list_queues(&self) -> crate::Result<ListQueuesResponse> {\n        Ok(ListQueuesResponse {\n            queues: self\n                .record_log\n                .list_queues()\n                .flat_map(|real_queue_id| real_queue_id.strip_prefix(QUICKWIT_CF_PREFIX))\n                .map(|queue| queue.to_string())\n                .collect(),\n        })\n    }\n\n    pub(crate) fn resource_usage(&self) -> ResourceUsage {\n        self.record_log.resource_usage()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashSet;\n    use std::ops::{Deref, DerefMut};\n\n    use bytes::Bytes;\n    use quickwit_actors::{ActorContext, Universe};\n    use tokio::sync::watch;\n\n    use super::Queues;\n    use crate::IngestApiService;\n    use crate::error::IngestServiceError;\n\n    const TEST_QUEUE_ID: 
&str = \"my-queue\";\n    const TEST_QUEUE_ID2: &str = \"my-queue2\";\n\n    struct QueuesForTest {\n        queues: Option<Queues>,\n        temp_dir: tempfile::TempDir,\n    }\n\n    impl QueuesForTest {\n        async fn new() -> (Self, ActorContext<IngestApiService>) {\n            let temp_dir = tempfile::tempdir().unwrap();\n            let mut queues_for_test = QueuesForTest {\n                temp_dir,\n                queues: None,\n            };\n            queues_for_test.reload().await;\n\n            let universe = Universe::with_accelerated_time();\n            let (source_mailbox, _source_inbox) = universe.create_test_mailbox();\n            let (observable_state_tx, _observable_state_rx) = watch::channel(());\n            let ctx = ActorContext::for_test(&universe, source_mailbox, observable_state_tx);\n            (queues_for_test, ctx)\n        }\n    }\n\n    impl QueuesForTest {\n        async fn reload(&mut self) {\n            std::mem::drop(self.queues.take());\n            self.queues = Some(Queues::open(self.temp_dir.path()).await.unwrap());\n        }\n\n        #[track_caller]\n        fn fetch_test(\n            &mut self,\n            queue_id: &str,\n            start_after: Option<u64>,\n            expected_first_pos_opt: Option<u64>,\n            expected: &[&[u8]],\n        ) {\n            let fetch_resp = self.fetch(queue_id, start_after, None).unwrap();\n            assert_eq!(fetch_resp.first_position, expected_first_pos_opt);\n            let doc_batch = fetch_resp.doc_batch.unwrap();\n            let records: Vec<Bytes> = doc_batch.clone().into_iter_raw().collect();\n            assert_eq!(&records, expected);\n        }\n    }\n\n    impl Deref for QueuesForTest {\n        type Target = Queues;\n\n        fn deref(&self) -> &Self::Target {\n            self.queues.as_ref().unwrap()\n        }\n    }\n\n    impl DerefMut for QueuesForTest {\n        fn deref_mut(&mut self) -> &mut Self::Target {\n            
self.queues.as_mut().unwrap()\n        }\n    }\n\n    impl Drop for QueuesForTest {\n        fn drop(&mut self) {\n            std::mem::drop(self.queues.take().unwrap());\n        }\n    }\n\n    #[tokio::test]\n    async fn test_access_queue_twice() {\n        let (mut queues, ctx) = QueuesForTest::new().await;\n        queues.create_queue(TEST_QUEUE_ID, &ctx).await.unwrap();\n        let queue_err = queues\n            .create_queue(TEST_QUEUE_ID, &ctx)\n            .await\n            .err()\n            .unwrap();\n        assert!(matches!(\n            queue_err,\n            IngestServiceError::IndexAlreadyExists { .. }\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_list_queues() {\n        let queue_ids = vec![\"foo\".to_string(), \"bar\".to_string(), \"baz\".to_string()];\n        let (mut queues, ctx) = QueuesForTest::new().await;\n        for queue_id in queue_ids.iter() {\n            queues.create_queue(queue_id, &ctx).await.unwrap();\n        }\n        assert_eq!(\n            HashSet::<String>::from_iter(queue_ids),\n            HashSet::from_iter(queues.list_queues().unwrap().queues)\n        );\n\n        queues.drop_queue(\"foo\", &ctx).await.unwrap();\n        assert_eq!(\n            HashSet::<String>::from_iter(vec![\"bar\".to_string(), \"baz\".to_string()]),\n            HashSet::from_iter(queues.list_queues().unwrap().queues)\n        );\n    }\n\n    #[tokio::test]\n    async fn test_simple() {\n        let (mut queues, ctx) = QueuesForTest::new().await;\n\n        queues.create_queue(TEST_QUEUE_ID, &ctx).await.unwrap();\n        queues\n            .append_batch(\n                TEST_QUEUE_ID,\n                [b\"hello\", b\"happy\"].iter().map(|bytes| bytes.as_slice()),\n                &ctx,\n            )\n            .await\n            .unwrap();\n\n        queues.reload().await;\n        queues.fetch_test(\n            TEST_QUEUE_ID,\n            None,\n            Some(0),\n            &[&b\"hello\"[..], 
&b\"happy\"[..]],\n        );\n\n        queues.reload().await;\n        queues.fetch_test(\n            TEST_QUEUE_ID,\n            None,\n            Some(0),\n            &[&b\"hello\"[..], &b\"happy\"[..]],\n        );\n    }\n\n    #[tokio::test]\n    async fn test_distinct_queues() {\n        let (mut queues, ctx) = QueuesForTest::new().await;\n\n        queues.create_queue(TEST_QUEUE_ID, &ctx).await.unwrap();\n        queues.create_queue(TEST_QUEUE_ID2, &ctx).await.unwrap();\n        queues.append(TEST_QUEUE_ID, b\"hello\", &ctx).await.unwrap();\n        queues\n            .append(TEST_QUEUE_ID2, b\"hello2\", &ctx)\n            .await\n            .unwrap();\n\n        queues.fetch_test(TEST_QUEUE_ID, None, Some(0), &[&b\"hello\"[..]]);\n        queues.fetch_test(TEST_QUEUE_ID2, None, Some(0), &[&b\"hello2\"[..]]);\n    }\n\n    #[tokio::test]\n    async fn test_create_reopen() {\n        let (mut queues, ctx) = QueuesForTest::new().await;\n        queues.create_queue(TEST_QUEUE_ID, &ctx).await.unwrap();\n\n        queues.reload().await;\n        queues.append(TEST_QUEUE_ID, b\"hello\", &ctx).await.unwrap();\n\n        queues.reload().await;\n        queues.append(TEST_QUEUE_ID, b\"happy\", &ctx).await.unwrap();\n\n        queues.fetch_test(\n            TEST_QUEUE_ID,\n            None,\n            Some(0),\n            &[&b\"hello\"[..], &b\"happy\"[..]],\n        );\n    }\n\n    // Note this test is specific to the current implementation of truncate.\n    //\n    // The truncate contract is actually not as accurate as what we are testing here.\n    #[tokio::test]\n    async fn test_truncation() {\n        let (mut queues, ctx) = QueuesForTest::new().await;\n        queues.create_queue(TEST_QUEUE_ID, &ctx).await.unwrap();\n        queues.append(TEST_QUEUE_ID, b\"hello\", &ctx).await.unwrap();\n        queues.append(TEST_QUEUE_ID, b\"happy\", &ctx).await.unwrap();\n        queues\n            .suggest_truncate(TEST_QUEUE_ID, 0, &ctx)\n            
.await\n            .unwrap();\n        queues.fetch_test(TEST_QUEUE_ID, None, Some(1), &[&b\"happy\"[..]]);\n    }\n\n    #[tokio::test]\n    async fn test_truncation_and_reload() {\n        // This test makes sure that we don't reset the position counter when we truncate an entire\n        // queue.\n        let (mut queues, ctx) = QueuesForTest::new().await;\n        queues.create_queue(TEST_QUEUE_ID, &ctx).await.unwrap();\n        queues.append(TEST_QUEUE_ID, b\"hello\", &ctx).await.unwrap();\n        queues.append(TEST_QUEUE_ID, b\"happy\", &ctx).await.unwrap();\n        queues.reload().await;\n        queues\n            .suggest_truncate(TEST_QUEUE_ID, 1, &ctx)\n            .await\n            .unwrap();\n        queues.reload().await;\n        queues.append(TEST_QUEUE_ID, b\"tax\", &ctx).await.unwrap();\n        queues.fetch_test(TEST_QUEUE_ID, Some(1), Some(2), &[&b\"tax\"[..]]);\n    }\n\n    struct Record {\n        queue_id: String,\n        payload: Vec<u8>,\n    }\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_create_multiple_queue() {\n        use std::iter::repeat_with;\n\n        use rand::rngs::StdRng;\n        use rand::{Rng, SeedableRng};\n        use rand_distr::weighted::WeightedIndex;\n        use rand_distr::{Distribution, LogNormal};\n\n        const NUM_QUEUES: usize = 100;\n        const NUM_RECORDS: usize = 1_000_000;\n\n        let (_, ctx) = QueuesForTest::new().await;\n\n        // Log-normal weights: the underlying normal has mean 10 and standard deviation 3.\n        let log_normal = LogNormal::new(10.0f32, 3.0f32).unwrap();\n        let mut rng = StdRng::seed_from_u64(4u64);\n        let queue_weights: Vec<f32> = repeat_with(|| log_normal.sample(&mut rng))\n            .take(NUM_QUEUES)\n            .collect();\n\n        let dist = WeightedIndex::new(&queue_weights).unwrap();\n        let record_queue_ids: Vec<usize> = repeat_with(|| dist.sample(&mut rng))\n            .take(NUM_RECORDS)\n            .collect();\n\n        let records: Vec<Record> = record_queue_ids\n     
       .into_iter()\n            .map(|queue_id| {\n                let num_bytes: usize = rng.random_range(80..800);\n                let payload: Vec<u8> = repeat_with(rand::random::<u8>).take(num_bytes).collect();\n                Record {\n                    queue_id: queue_id.to_string(),\n                    payload,\n                }\n            })\n            .collect();\n\n        let tmpdir = tempfile::tempdir_in(\".\").unwrap();\n        let mut queues = Queues::open(tmpdir.path()).await.unwrap();\n        for queue_id in 0..NUM_QUEUES {\n            queues\n                .create_queue(&queue_id.to_string(), &ctx)\n                .await\n                .unwrap();\n        }\n        let start = std::time::Instant::now();\n        let mut num_bytes = 0;\n        for record in records.iter() {\n            queues\n                .append(&record.queue_id, &record.payload, &ctx)\n                .await\n                .unwrap();\n            num_bytes += record.payload.len();\n        }\n        let elapsed = start.elapsed();\n        println!(\"{elapsed:?}\");\n        println!(\"{num_bytes}\");\n        let throughput = num_bytes as f64 / (elapsed.as_micros() as f64);\n        println!(\"Throughput: {throughput}\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/Cargo.toml",
    "content": "[package]\nname = \"quickwit-integration-tests\"\ndescription = \"Integration tests runner and repository\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[features]\nsqs-localstack-tests = [\n    \"quickwit-indexing/sqs\",\n    \"quickwit-indexing/sqs-localstack-tests\",\n]\n\n[dependencies]\n\n[dev-dependencies]\nanyhow = { workspace = true }\naws-sdk-sqs = { workspace = true }\nfutures-util = { workspace = true }\nhyper = { workspace = true }\nhyper-util = { workspace = true }\nitertools = { workspace = true }\nrand = { workspace = true }\nreqwest = { workspace = true }\nrustls = { workspace = true }\nserde_json = { workspace = true }\ntempfile = { workspace = true }\ntokio = { workspace = true }\ntonic = { workspace = true }\ntracing = { workspace = true }\ntracing-subscriber = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-cli = { workspace = true }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-opentelemetry = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-rest-client = { workspace = true }\nquickwit-serve = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![recursion_limit = \"256\"]\n\n#[cfg(test)]\nmod test_utils;\n#[cfg(test)]\nmod tests;\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/test_utils/cluster_sandbox.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::io::Write;\nuse std::net::SocketAddr;\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse futures_util::future;\nuse itertools::Itertools;\nuse quickwit_actors::ActorExitStatus;\nuse quickwit_cli::tool::{LocalIngestDocsArgs, local_ingest_docs_cli};\nuse quickwit_common::new_coolid;\nuse quickwit_common::runtimes::RuntimesConfig;\nuse quickwit_common::test_utils::wait_until_predicate;\nuse quickwit_common::uri::Uri as QuickwitUri;\nuse quickwit_config::NodeConfig;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_metastore::{MetastoreResolver, SplitState};\nuse quickwit_proto::jaeger::storage::v1::span_reader_plugin_client::SpanReaderPluginClient;\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::logs_service_client::LogsServiceClient;\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::trace_service_client::TraceServiceClient;\nuse quickwit_proto::types::NodeId;\nuse quickwit_rest_client::models::IngestSource;\nuse quickwit_rest_client::rest_client::{\n    CommitType, DEFAULT_BASE_URL, QuickwitClient, QuickwitClientBuilder,\n};\nuse quickwit_serve::tcp_listener::for_tests::TestTcpListenerResolver;\nuse quickwit_serve::{\n    ListSplitsQueryParams, RestIngestResponse, SearchRequestQueryString, serve_quickwit,\n};\nuse 
quickwit_storage::StorageResolver;\nuse reqwest::Url;\nuse serde_json::Value;\nuse tempfile::TempDir;\nuse tokio::net::TcpListener;\nuse tracing::debug;\n\nuse super::shutdown::NodeShutdownHandle;\n\npub struct TestNodeConfig {\n    pub services: HashSet<QuickwitService>,\n    pub enable_otlp: bool,\n}\n\nimpl TestNodeConfig {\n    async fn build_node_config(\n        &self,\n        node_idx: usize,\n        cluster_id: String,\n        temp_dir: &TempDir,\n        unique_dir_name: String,\n        tcp_listener_resolver: &TestTcpListenerResolver,\n    ) -> NodeConfig {\n        let socket: SocketAddr = ([127, 0, 0, 1], 0u16).into();\n        let rest_tcp_listener = TcpListener::bind(socket).await.unwrap();\n        let grpc_tcp_listener = TcpListener::bind(socket).await.unwrap();\n        let mut config = NodeConfig::for_test_from_ports(\n            rest_tcp_listener.local_addr().unwrap().port(),\n            grpc_tcp_listener.local_addr().unwrap().port(),\n        );\n        tcp_listener_resolver.add_listener(rest_tcp_listener).await;\n        tcp_listener_resolver.add_listener(grpc_tcp_listener).await;\n        config.indexer_config.enable_otlp_endpoint = self.enable_otlp;\n        config.enabled_services.clone_from(&self.services);\n        config.jaeger_config.enable_endpoint = true;\n        config.cluster_id.clone_from(&cluster_id);\n        config.node_id = NodeId::new(format!(\"test-node-{node_idx}\"));\n        let root_data_dir = temp_dir.path().to_path_buf();\n        config.data_dir_path = root_data_dir.join(config.node_id.as_str());\n        config.metastore_uri =\n            QuickwitUri::from_str(&format!(\"ram:///{unique_dir_name}/metastore\")).unwrap();\n        config.default_index_root_uri =\n            QuickwitUri::from_str(&format!(\"ram:///{unique_dir_name}/indexes\")).unwrap();\n        config\n    }\n}\n\npub struct ClusterSandboxBuilder {\n    temp_dir: TempDir,\n    node_configs: Vec<TestNodeConfig>,\n    use_legacy_ingest: 
bool,\n}\n\nimpl Default for ClusterSandboxBuilder {\n    fn default() -> Self {\n        Self {\n            temp_dir: tempfile::tempdir().unwrap(),\n            node_configs: Vec::new(),\n            use_legacy_ingest: false,\n        }\n    }\n}\n\nimpl ClusterSandboxBuilder {\n    pub fn add_node(mut self, services: impl IntoIterator<Item = QuickwitService>) -> Self {\n        self.node_configs.push(TestNodeConfig {\n            services: HashSet::from_iter(services),\n            enable_otlp: false,\n        });\n        self\n    }\n\n    pub fn add_node_with_otlp(\n        mut self,\n        services: impl IntoIterator<Item = QuickwitService>,\n    ) -> Self {\n        self.node_configs.push(TestNodeConfig {\n            services: HashSet::from_iter(services),\n            enable_otlp: true,\n        });\n        self\n    }\n\n    pub fn use_legacy_ingest(mut self) -> Self {\n        self.use_legacy_ingest = true;\n        self\n    }\n\n    /// Builds a list of [`NodeConfig`] from the node definitions added to\n    /// the builder. For each node, a [`NodeConfig`] is built with the right\n    /// parameters such that we will be able to run `quickwit_serve` on them and\n    /// form a Quickwit cluster. 
For each node, we set:\n    /// - `data_dir_path` defined by `root_data_dir/node_id`.\n    /// - `metastore_uri` defined by `ram:///{unique_dir_name}/metastore`.\n    /// - `default_index_root_uri` defined by `ram:///{unique_dir_name}/indexes`.\n    /// - `peer_seeds` defined by the other nodes' `gossip_advertise_addr`.\n    pub async fn build_config(self) -> ResolvedClusterConfig {\n        let cluster_id = new_coolid(\"test-cluster\");\n        let mut resolved_node_configs = Vec::new();\n        let mut peers: Vec<String> = Vec::new();\n        let unique_dir_name = new_coolid(\"test-dir\");\n        let tcp_listener_resolver = TestTcpListenerResolver::default();\n        for (node_idx, node_builder) in self.node_configs.iter().enumerate() {\n            let config = node_builder\n                .build_node_config(\n                    node_idx,\n                    cluster_id.clone(),\n                    &self.temp_dir,\n                    unique_dir_name.clone(),\n                    &tcp_listener_resolver,\n                )\n                .await;\n            peers.push(config.gossip_advertise_addr.to_string());\n            resolved_node_configs.push((config, node_builder.services.clone()));\n        }\n        for node_config in resolved_node_configs.iter_mut() {\n            node_config.0.peer_seeds = peers\n                .clone()\n                .into_iter()\n                .filter(|seed| *seed != node_config.0.gossip_advertise_addr.to_string())\n                .collect_vec();\n        }\n        ResolvedClusterConfig {\n            cluster_id,\n            temp_dir: self.temp_dir,\n            unique_dir_name,\n            node_configs: resolved_node_configs,\n            tcp_listener_resolver,\n        }\n    }\n\n    /// Builds the cluster config, starts the nodes and waits for them to be ready\n    pub async fn build_and_start(self) -> ClusterSandbox {\n        self.build_config().await.start().await\n    }\n\n    pub async fn build_and_start_standalone() -> 
ClusterSandbox {\n        ClusterSandboxBuilder::default()\n            .add_node(QuickwitService::supported_services())\n            .build_config()\n            .await\n            .start()\n            .await\n    }\n}\n\n/// Intermediate state where the ports of all the test cluster nodes have\n/// been reserved and the configurations have been generated.\npub struct ResolvedClusterConfig {\n    cluster_id: String,\n    temp_dir: TempDir,\n    unique_dir_name: String,\n    pub node_configs: Vec<(NodeConfig, HashSet<QuickwitService>)>,\n    tcp_listener_resolver: TestTcpListenerResolver,\n}\n\nimpl ResolvedClusterConfig {\n    /// Starts a cluster using this config and waits for the nodes to be ready\n    pub async fn start(self) -> ClusterSandbox {\n        quickwit_cli::install_default_crypto_ring_provider();\n\n        let mut sandbox = ClusterSandbox {\n            cluster_id: self.cluster_id,\n            node_configs: Vec::new(),\n            temp_dir: self.temp_dir,\n            unique_dir_name: self.unique_dir_name,\n            node_shutdown_handles: Vec::new(),\n            tcp_listener_resolver: self.tcp_listener_resolver,\n            storage_resolver: StorageResolver::unconfigured(),\n            metastore_resolver: MetastoreResolver::unconfigured(),\n        };\n        for (config, services) in &self.node_configs {\n            sandbox.spawn_node(config.clone(), services.clone());\n        }\n        sandbox.node_configs = self.node_configs;\n        sandbox\n            .wait_for_cluster_num_ready_nodes(sandbox.node_configs.len())\n            .await\n            .unwrap();\n        sandbox\n    }\n}\n\nfn transport_url(addr: SocketAddr, tls: bool) -> Url {\n    let mut url = Url::parse(DEFAULT_BASE_URL).unwrap();\n    url.set_ip_host(addr.ip()).unwrap();\n    url.set_port(Some(addr.port())).unwrap();\n    if tls {\n        url.set_scheme(\"https\").unwrap();\n    }\n    url\n}\n\n#[macro_export]\nmacro_rules! 
ingest_json {\n    ($($json:tt)+) => {\n        quickwit_rest_client::models::IngestSource::Str(json!($($json)+).to_string())\n    };\n}\n\npub(crate) async fn ingest(\n    client: &QuickwitClient,\n    index_id: &str,\n    ingest_source: IngestSource,\n    commit_type: CommitType,\n) -> anyhow::Result<RestIngestResponse> {\n    let resp = client\n        .ingest(index_id, ingest_source, None, None, commit_type)\n        .await?;\n    Ok(resp)\n}\n\n/// A test environment where you can start a Quickwit cluster and use the gRPC\n/// or REST clients to test it.\npub struct ClusterSandbox {\n    cluster_id: String,\n    pub node_configs: Vec<(NodeConfig, HashSet<QuickwitService>)>,\n    unique_dir_name: String,\n    temp_dir: TempDir,\n    node_shutdown_handles: Vec<NodeShutdownHandle>,\n    tcp_listener_resolver: TestTcpListenerResolver,\n    storage_resolver: StorageResolver,\n    metastore_resolver: MetastoreResolver,\n}\n\nimpl ClusterSandbox {\n    fn spawn_node(&mut self, config: NodeConfig, services: HashSet<QuickwitService>) {\n        let mut shutdown_handle = NodeShutdownHandle::new(config.node_id.clone(), services.clone());\n        let shutdown_signal = shutdown_handle.shutdown_signal();\n        let runtimes_config = RuntimesConfig::light_for_tests();\n        let join_handle = tokio::spawn({\n            let node_id = config.node_id.clone();\n            let metastore_resolver = self.metastore_resolver.clone();\n            let storage_resolver = self.storage_resolver.clone();\n            let tcp_listener_resolver = self.tcp_listener_resolver.clone();\n            async move {\n                let result = serve_quickwit(\n                    config,\n                    runtimes_config,\n                    metastore_resolver,\n                    storage_resolver,\n                    tcp_listener_resolver,\n                    shutdown_signal,\n                    quickwit_serve::do_nothing_env_filter_reload_fn(),\n                )\n                
.await?;\n                debug!(\"{node_id} stopped successfully ({services:?})\");\n                Result::<_, anyhow::Error>::Ok(result)\n            }\n        });\n        shutdown_handle.set_node_join_handle(join_handle);\n        self.node_shutdown_handles.push(shutdown_handle);\n    }\n\n    /// Dynamically adds a node to the cluster. Does not wait for readiness.\n    pub async fn add_node(&mut self, services: impl IntoIterator<Item = QuickwitService>) {\n        self.add_node_inner(TestNodeConfig {\n            services: HashSet::from_iter(services),\n            enable_otlp: false,\n        })\n        .await;\n    }\n\n    async fn add_node_inner(&mut self, config_builder: TestNodeConfig) {\n        let mut config = config_builder\n            .build_node_config(\n                self.node_configs.len() + 1,\n                self.cluster_id.clone(),\n                &self.temp_dir,\n                self.unique_dir_name.clone(),\n                &self.tcp_listener_resolver,\n            )\n            .await;\n        config.peer_seeds = self\n            .node_configs\n            .iter()\n            .map(|config| config.0.gossip_advertise_addr.to_string())\n            .collect_vec();\n        self.spawn_node(config.clone(), config_builder.services.clone());\n        self.node_configs\n            .push((config, config_builder.services.clone()));\n    }\n\n    fn find_node_for_service(&self, service: QuickwitService) -> NodeConfig {\n        self.node_configs\n            .iter()\n            .find(|config| config.1.contains(&service))\n            .unwrap_or_else(|| panic!(\"No {service:?} node\"))\n            .0\n            .clone()\n    }\n\n    fn channel(&self, service: QuickwitService) -> tonic::transport::Channel {\n        let node_config = self.find_node_for_service(service);\n        let endpoint = format!(\"http://{}\", node_config.grpc_listen_addr);\n        tonic::transport::Channel::from_shared(endpoint)\n            .unwrap()\n        
    .connect_lazy()\n    }\n\n    /// Returns a client to one of the nodes that runs the specified service\n    pub fn rest_client(&self, service: QuickwitService) -> QuickwitClient {\n        let node_config = self.find_node_for_service(service);\n\n        let certificate = if let Some(tls_conf) = &node_config.rest_config.tls {\n            let cert_bytes = std::fs::read(&tls_conf.ca_path).unwrap();\n            Some(reqwest::tls::Certificate::from_pem(&cert_bytes).unwrap())\n        } else {\n            None\n        };\n\n        QuickwitClientBuilder::new(transport_url(\n            node_config.rest_config.listen_addr,\n            certificate.is_some(),\n        ))\n        .set_tls_ca(certificate)\n        .build()\n    }\n\n    /// A client configured to ingest documents and return detailed parse failures.\n    pub fn detailed_ingest_client(&self) -> QuickwitClient {\n        let node_config = self.find_node_for_service(QuickwitService::Indexer);\n\n        let certificate = if let Some(tls_conf) = &node_config.rest_config.tls {\n            let cert_bytes = std::fs::read(&tls_conf.ca_path).unwrap();\n            Some(reqwest::tls::Certificate::from_pem(&cert_bytes).unwrap())\n        } else {\n            None\n        };\n\n        QuickwitClientBuilder::new(transport_url(\n            node_config.rest_config.listen_addr,\n            certificate.is_some(),\n        ))\n        .set_tls_ca(certificate)\n        .detailed_response(true)\n        .build()\n    }\n\n    // TODO(#5604)\n    pub fn rest_client_legacy_indexer(&self) -> QuickwitClient {\n        let node_config = self.find_node_for_service(QuickwitService::Indexer);\n\n        let certificate = if let Some(tls_conf) = &node_config.rest_config.tls {\n            let cert_bytes = std::fs::read(&tls_conf.ca_path).unwrap();\n            Some(reqwest::tls::Certificate::from_pem(&cert_bytes).unwrap())\n        } else {\n            None\n        };\n\n        
QuickwitClientBuilder::new(transport_url(\n            node_config.rest_config.listen_addr,\n            certificate.is_some(),\n        ))\n        .set_tls_ca(certificate)\n        .use_legacy_ingest(true)\n        .build()\n    }\n\n    pub fn jaeger_client(&self) -> SpanReaderPluginClient<tonic::transport::Channel> {\n        SpanReaderPluginClient::new(self.channel(QuickwitService::Searcher))\n    }\n\n    pub fn logs_client(&self) -> LogsServiceClient<tonic::transport::Channel> {\n        LogsServiceClient::new(self.channel(QuickwitService::Indexer))\n    }\n\n    pub fn trace_client(&self) -> TraceServiceClient<tonic::transport::Channel> {\n        TraceServiceClient::new(self.channel(QuickwitService::Indexer))\n    }\n\n    pub async fn wait_for_cluster_num_ready_nodes(\n        &self,\n        expected_num_ready_nodes: usize,\n    ) -> anyhow::Result<()> {\n        wait_until_predicate(\n            || async move {\n                match self\n                    .rest_client(QuickwitService::Metastore)\n                    .cluster()\n                    .snapshot()\n                    .await\n                {\n                    Ok(result) => {\n                        if result.ready_nodes.len() != expected_num_ready_nodes {\n                            debug!(\n                                \"wait_for_cluster_num_ready_nodes expected {} ready nodes, got {}\",\n                                expected_num_ready_nodes,\n                                result.ready_nodes.len()\n                            );\n                            false\n                        } else {\n                            true\n                        }\n                    }\n                    Err(err) => {\n                        debug!(\"wait_for_cluster_num_ready_nodes error {err}\");\n                        false\n                    }\n                }\n            },\n            Duration::from_secs(10),\n            Duration::from_millis(100),\n        )\n 
       .await?;\n        Ok(())\n    }\n\n    /// Waits for the expected number of indexing pipelines to start.\n    ///\n    /// WARNING: does not work if multiple indexers are running.\n    pub async fn wait_for_indexing_pipelines(\n        &self,\n        required_pipeline_num: usize,\n    ) -> anyhow::Result<()> {\n        wait_until_predicate(\n            || async move {\n                match self\n                    .rest_client(QuickwitService::Indexer)\n                    .node_stats()\n                    .indexing()\n                    .await\n                {\n                    Ok(result) => {\n                        if result.num_running_pipelines != required_pipeline_num {\n                            debug!(\n                                \"wait_for_indexing_pipelines expected {} pipelines, got {}\",\n                                required_pipeline_num, result.num_running_pipelines\n                            );\n                            false\n                        } else {\n                            true\n                        }\n                    }\n                    Err(err) => {\n                        debug!(\"wait_for_indexing_pipelines error {err}\");\n                        false\n                    }\n                }\n            },\n            Duration::from_secs(10),\n            Duration::from_millis(100),\n        )\n        .await?;\n        Ok(())\n    }\n\n    /// Waits until the index has the expected number of splits matching the state filter.\n    pub async fn wait_for_splits(\n        &self,\n        index_id: &str,\n        split_states_filter: Option<Vec<SplitState>>,\n        required_splits_num: usize,\n    ) -> anyhow::Result<()> {\n        wait_until_predicate(\n            || {\n                let splits_query_params = ListSplitsQueryParams {\n                    split_states: split_states_filter.clone(),\n                    ..Default::default()\n                };\n                async move {\n                  
  match self\n                        .rest_client(QuickwitService::Metastore)\n                        .splits(index_id)\n                        .list(splits_query_params)\n                        .await\n                    {\n                        Ok(result) => {\n                            if result.len() != required_splits_num {\n                                debug!(\n                                    \"wait_for_splits expected {} splits, got {}\",\n                                    required_splits_num,\n                                    result.len()\n                                );\n                                false\n                            } else {\n                                true\n                            }\n                        }\n                        Err(err) => {\n                            debug!(\"wait_for_splits error {err}\");\n                            false\n                        }\n                    }\n                }\n            },\n            Duration::from_secs(15),\n            Duration::from_millis(500),\n        )\n        .await?;\n        Ok(())\n    }\n\n    pub async fn local_ingest(&self, index_id: &str, json_data: &[Value]) -> anyhow::Result<()> {\n        let test_conf = self\n            .node_configs\n            .iter()\n            .find(|config| config.1.contains(&QuickwitService::Indexer))\n            .ok_or(anyhow::anyhow!(\"No indexer node found\"))?;\n        // NodeConfig cannot be serialized, we write our own simplified config\n        let mut tmp_config_file = tempfile::Builder::new().suffix(\".yaml\").tempfile().unwrap();\n        // we suffix data_dir with a random slug to save us from multiple local ingestion trying to\n        // concurrently do something, and cleanup the directory to start a new ingestion.\n        let data_dir = test_conf\n            .0\n            .data_dir_path\n            .join(rand::random::<u64>().to_string());\n        
tokio::fs::create_dir(&data_dir).await?;\n        let node_config = format!(\n            r#\"\n                version: 0.8\n                metastore_uri: {}\n                data_dir: {:?}\n                \"#,\n            test_conf.0.metastore_uri, data_dir\n        );\n        tmp_config_file.write_all(node_config.as_bytes())?;\n        tmp_config_file.flush()?;\n\n        let mut tmp_data_file = tempfile::NamedTempFile::new().unwrap();\n        for line in json_data {\n            serde_json::to_writer(&mut tmp_data_file, line)?;\n            tmp_data_file.write_all(b\"\\n\")?;\n        }\n        tmp_data_file.flush()?;\n\n        local_ingest_docs_cli(LocalIngestDocsArgs {\n            clear_cache: false,\n            config_uri: QuickwitUri::from_str(tmp_config_file.path().to_str().unwrap())?,\n            index_id: index_id.to_string(),\n            input_format: quickwit_config::SourceInputFormat::Json,\n            overwrite: false,\n            vrl_script: None,\n            input_path_opt: Some(QuickwitUri::from_str(\n                tmp_data_file\n                    .path()\n                    .to_str()\n                    .context(\"temp path could not be converted to URI\")?,\n            )?),\n        })\n        .await?;\n        Ok(())\n    }\n\n    pub async fn assert_hit_count(&self, index_id: &str, query: &str, expected_num_hits: u64) {\n        let search_response = self\n            .rest_client(QuickwitService::Searcher)\n            .search(\n                index_id,\n                SearchRequestQueryString {\n                    query: query.to_string(),\n                    max_hits: 10,\n                    ..Default::default()\n                },\n            )\n            .await\n            .unwrap();\n        debug!(\n            \"search response for query {} on index {index_id}: {:?}\",\n            query, search_response\n        );\n        assert_eq!(\n            search_response.num_hits, expected_num_hits,\n           
 \"unexpected num_hits for query {query}\"\n        );\n    }\n\n    /// Shuts down the nodes that only provide the specified services\n    pub async fn shutdown_services(\n        &mut self,\n        shutdown_services: impl IntoIterator<Item = QuickwitService>,\n    ) -> Result<Vec<HashMap<String, ActorExitStatus>>, anyhow::Error> {\n        // We need to drop rest clients first because reqwest can hold connections open\n        // preventing rest server's graceful shutdown.\n        let mut indexer_shutdown_futures = Vec::new();\n        let mut other_shutdown_futures = Vec::new();\n        let mut shutdown_nodes = HashMap::new();\n        let mut i = 0;\n        let shutdown_services_map = HashSet::from_iter(shutdown_services);\n        while i < self.node_shutdown_handles.len() {\n            let handler_services = &self.node_shutdown_handles[i].node_services;\n            if !handler_services.is_subset(&shutdown_services_map) {\n                i += 1;\n                continue;\n            }\n            let handler_to_shutdown = self.node_shutdown_handles.remove(i);\n            shutdown_nodes.insert(\n                handler_to_shutdown.node_id.clone(),\n                handler_to_shutdown.node_services.clone(),\n            );\n            if handler_to_shutdown\n                .node_services\n                .contains(&QuickwitService::Indexer)\n            {\n                indexer_shutdown_futures.push(handler_to_shutdown.shutdown());\n            } else {\n                other_shutdown_futures.push(handler_to_shutdown.shutdown());\n            }\n        }\n        debug!(\"shutting down {:?}\", shutdown_nodes);\n        // We must decommission the indexer nodes first and independently from the other nodes.\n        let indexer_shutdown_results = future::join_all(indexer_shutdown_futures).await;\n        let other_shutdown_results = future::join_all(other_shutdown_futures).await;\n        let exit_statuses = indexer_shutdown_results\n            
.into_iter()\n            .chain(other_shutdown_results)\n            .collect::<Result<Vec<_>, _>>()?;\n        Ok(exit_statuses)\n    }\n\n    pub async fn shutdown(\n        mut self,\n    ) -> Result<Vec<HashMap<String, ActorExitStatus>>, anyhow::Error> {\n        self.shutdown_services(QuickwitService::supported_services())\n            .await\n    }\n\n    /// Remove a node from the sandbox and return its shutdown handle.\n    /// After this call, `rest_client` and other lookup methods skip the removed\n    /// node, so callers can trigger shutdown concurrently with other sandbox\n    /// operations.\n    pub fn remove_node_with_service(&mut self, service: QuickwitService) -> NodeShutdownHandle {\n        let idx = self\n            .node_shutdown_handles\n            .iter()\n            .position(|h| h.node_services.contains(&service))\n            .unwrap_or_else(|| panic!(\"no node with service {service:?}\"));\n        self.node_configs.remove(idx);\n        self.node_shutdown_handles.remove(idx)\n    }\n}\n\n/// We don't usually test the tests, but the complexity of the sandbox setup code justifies it here.\n#[tokio::test]\nasync fn test_sandbox_happy_path() {\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::ControlPlane, QuickwitService::Metastore])\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Indexer])\n        .build_and_start()\n        .await;\n\n    sandbox.wait_for_cluster_num_ready_nodes(3).await.unwrap();\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_sandbox_add_node_dynamically() {\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::ControlPlane, QuickwitService::Metastore])\n        .add_node([QuickwitService::Searcher])\n        .build_and_start()\n        .await;\n    sandbox.wait_for_cluster_num_ready_nodes(2).await.unwrap();\n\n    // Later, add an indexer node to the running cluster\n    
sandbox.add_node([QuickwitService::Indexer]).await;\n\n    sandbox.wait_for_cluster_num_ready_nodes(3).await.unwrap();\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/test_utils/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod cluster_sandbox;\nmod shutdown;\n\npub(crate) use cluster_sandbox::{ClusterSandbox, ClusterSandboxBuilder, ingest};\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/test_utils/shutdown.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\n\nuse quickwit_actors::ActorExitStatus;\nuse quickwit_common::tower::BoxFutureInfaillible;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_proto::types::NodeId;\nuse tokio::sync::watch::{self, Receiver, Sender};\nuse tokio::task::JoinHandle;\n\ntype NodeJoinHandle = JoinHandle<Result<HashMap<String, ActorExitStatus>, anyhow::Error>>;\n\npub(crate) struct NodeShutdownHandle {\n    sender: Sender<()>,\n    receiver: Receiver<()>,\n    pub node_services: HashSet<QuickwitService>,\n    pub node_id: NodeId,\n    join_handle_opt: Option<NodeJoinHandle>,\n}\n\nimpl NodeShutdownHandle {\n    pub(crate) fn new(node_id: NodeId, node_services: HashSet<QuickwitService>) -> Self {\n        let (sender, receiver) = watch::channel(());\n        Self {\n            sender,\n            receiver,\n            node_id,\n            node_services,\n            join_handle_opt: None,\n        }\n    }\n\n    pub(crate) fn shutdown_signal(&self) -> BoxFutureInfaillible<()> {\n        let receiver = self.receiver.clone();\n        Box::pin(async move {\n            receiver.clone().changed().await.unwrap();\n        })\n    }\n\n    pub(crate) fn set_node_join_handle(&mut self, join_handle: NodeJoinHandle) {\n        self.join_handle_opt = Some(join_handle);\n    }\n\n    /// Initiate node shutdown and wait 
for it to complete.\n    pub(crate) async fn shutdown(self) -> anyhow::Result<HashMap<String, ActorExitStatus>> {\n        self.sender.send(()).unwrap();\n        self.join_handle_opt\n            .expect(\"node join handle was not set before shutdown\")\n            .await\n            .unwrap()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/basic_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse hyper::{Method, Request, StatusCode};\nuse hyper_util::rt::TokioExecutor;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_serve::SearchRequestQueryString;\n\nuse crate::test_utils::ClusterSandboxBuilder;\n\n#[tokio::test]\nasync fn test_ui_redirect_on_get() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let node_config = sandbox.node_configs.first().unwrap();\n    let client = hyper_util::client::legacy::Client::builder(TokioExecutor::new())\n        .pool_idle_timeout(Duration::from_secs(30))\n        .http2_only(true)\n        .build_http();\n    let root_uri = format!(\"http://{}/\", node_config.0.rest_config.listen_addr)\n        .parse::<hyper::Uri>()\n        .unwrap();\n    let response = client.get(root_uri.clone()).await.unwrap();\n    assert_eq!(response.status(), StatusCode::MOVED_PERMANENTLY);\n    let post_request = Request::builder()\n        .uri(root_uri)\n        .method(Method::POST)\n        .body(\"{}\".to_string())\n        .unwrap();\n    let response = client.request(post_request).await.unwrap();\n    assert_eq!(response.status(), StatusCode::METHOD_NOT_ALLOWED);\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_standalone_server() {\n    
quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    {\n        // The indexing service should be running.\n        let counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(counters.num_running_pipelines, 0);\n    }\n\n    {\n        // Create a dynamic index.\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .indexes()\n            .create(\n                r#\"\n                version: 0.8\n                index_id: my-new-index\n                doc_mapping:\n                  field_mappings:\n                  - name: body\n                    type: text\n                \"#,\n                quickwit_config::ConfigFormat::Yaml,\n                false,\n            )\n            .await\n            .unwrap();\n\n        // Index should be searchable\n        assert_eq!(\n            sandbox\n                .rest_client(QuickwitService::Indexer)\n                .search(\n                    \"my-new-index\",\n                    SearchRequestQueryString {\n                        query: \"body:test\".to_string(),\n                        max_hits: 10,\n                        ..Default::default()\n                    },\n                )\n                .await\n                .unwrap()\n                .num_hits,\n            0\n        );\n        sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n    }\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_multi_nodes_cluster() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        
.add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            r#\"\n            version: 0.8\n            index_id: my-new-multi-node-index\n            doc_mapping:\n              field_mappings:\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Assert that at least 1 indexing pipeline is successfully started\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    // Check that search is working\n    let search_response_empty = sandbox\n        .rest_client(QuickwitService::Searcher)\n        .search(\n            \"my-new-multi-node-index\",\n            SearchRequestQueryString {\n                query: \"body:bar\".to_string(),\n                ..Default::default()\n            },\n        )\n        .await\n        .unwrap();\n    assert_eq!(search_response_empty.num_hits, 0);\n\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/ingest_v1_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::ConfigFormat;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_metastore::SplitState;\nuse quickwit_rest_client::rest_client::CommitType;\nuse serde_json::json;\n\nuse crate::ingest_json;\nuse crate::test_utils::{ClusterSandboxBuilder, ingest};\n\n// TODO(#5604)\n\n/// This test checks our happy path for ingesting one doc.\n#[tokio::test]\nasync fn test_ingest_v1_happy_path() {\n    let sandbox = ClusterSandboxBuilder::default()\n        .use_legacy_ingest()\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::Searcher])\n        .add_node([\n            QuickwitService::ControlPlane,\n            QuickwitService::Janitor,\n            QuickwitService::Metastore,\n        ])\n        .build_and_start()\n        .await;\n\n    let index_id = \"test-ingest-v1-happy-path\";\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        doc_mapping:\n            field_mappings:\n            - name: body\n              type: text\n        indexing_settings:\n            commit_timeout_secs: 1\n        \"#\n    );\n    let indexer_client = sandbox.rest_client_legacy_indexer();\n    indexer_client\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    ingest(\n        
&indexer_client,\n        index_id,\n        ingest_json!({\"body\": \"my-doc\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"*\", 1).await;\n\n    // Delete the index to avoid potential hanging on shutdown #5068\n    indexer_client\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/ingest_v2_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse futures_util::FutureExt;\nuse itertools::Itertools;\nuse quickwit_common::test_utils::wait_until_predicate;\nuse quickwit_config::ConfigFormat;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_indexing::actors::INDEXING_DIR_NAME;\nuse quickwit_metastore::SplitState;\nuse quickwit_proto::ingest::ParseFailureReason;\nuse quickwit_rest_client::error::{ApiError, Error};\nuse quickwit_rest_client::models::IngestSource;\nuse quickwit_rest_client::rest_client::CommitType;\nuse quickwit_serve::{ListSplitsQueryParams, RestIngestResponse, RestParseFailure};\nuse serde_json::json;\n\nuse crate::ingest_json;\nuse crate::test_utils::{ClusterSandboxBuilder, ingest};\n\n/// Ingesting on a freshly re-created index sometimes fails, see #5430\n#[tokio::test]\n#[ignore]\nasync fn test_ingest_recreated_index() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test-ingest-recreated-index\";\n    let index_config = format!(\n        r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n                field_mappings:\n                - name: body\n                  type: text\n            indexing_settings:\n                commit_timeout_secs: 1\n                merge_policy:\n                    type: stable_log\n                   
 merge_factor: 3\n                    max_merge_factor: 3\n            \"#\n    );\n    let current_index_metadata = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config.clone(), ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"first record\"}),\n        CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    // Recreate the index and start ingesting into it again\n\n    let new_index_metadata = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    assert_ne!(\n        current_index_metadata.index_uid.incarnation_id,\n        new_index_metadata.index_uid.incarnation_id\n    );\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"second record\"}),\n        CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"third record\"}),\n        CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 2)\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"fourth record\"}),\n      
  CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 3)\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"body:record\", 3).await;\n\n    // Wait for splits to merge, since we created 3 splits and merge factor is 3,\n    // we should get 1 published split with no staged splits eventually.\n    sandbox\n        .wait_for_splits(\n            index_id,\n            Some(vec![SplitState::Published, SplitState::Staged]),\n            1,\n        )\n        .await\n        .unwrap();\n\n    // Delete the index to avoid potential hanging on shutdown #5068\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n\n/// Indexing directory is not cleaned up after deleting an index, see #5436\n#[tokio::test]\n#[ignore]\nasync fn test_indexing_directory_cleanup() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test-ingest-directory-cleanup\";\n    let index_config = format!(\n        r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n                field_mappings:\n                - name: body\n                  type: text\n            indexing_settings:\n                commit_timeout_secs: 1\n                merge_policy:\n                    type: stable_log\n                    merge_factor: 3\n                    max_merge_factor: 3\n            \"#\n    );\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config.clone(), ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"first record\"}),\n        CommitType::Force,\n    )\n    
.await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    // The index is deleted so the `indexing` directory should be cleaned up\n    let data_dir_path = &sandbox.node_configs.first().unwrap().0.data_dir_path;\n    let indexing_dir_path = data_dir_path.join(INDEXING_DIR_NAME);\n    wait_until_predicate(\n        || async {\n            let indexing_dir_entries = indexing_dir_path.read_dir().unwrap().collect_vec();\n            indexing_dir_entries.is_empty()\n        },\n        Duration::from_secs(100),\n        Duration::from_millis(500),\n    )\n    .await\n    .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n\n/// This tests checks what happens when we try to ingest into a non-existing index.\n#[tokio::test]\nasync fn test_ingest_v2_index_not_found() {\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Indexer, QuickwitService::Janitor])\n        .add_node([QuickwitService::Indexer, QuickwitService::Janitor])\n        .add_node([\n            QuickwitService::ControlPlane,\n            QuickwitService::Metastore,\n            QuickwitService::Searcher,\n        ])\n        .build_and_start()\n        .await;\n    let missing_index_err: Error = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            \"missing_index\",\n            ingest_json!({\"body\": \"doc1\"}),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap_err();\n    let Error::Api(ApiError { message, code }) = missing_index_err else {\n        panic!(\"Expected an API error.\");\n    };\n    assert_eq!(code, 404u16);\n    let error_message = message.unwrap();\n    assert_eq!(error_message, \"index `missing_index` not 
found\");\n    sandbox.shutdown().await.unwrap();\n}\n\n/// This tests checks our happy path for ingesting one doc.\n#[tokio::test]\nasync fn test_ingest_v2_happy_path() {\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Indexer, QuickwitService::Janitor])\n        .add_node([QuickwitService::Indexer, QuickwitService::Janitor])\n        .add_node([\n            QuickwitService::ControlPlane,\n            QuickwitService::Metastore,\n            QuickwitService::Searcher,\n        ])\n        .build_and_start()\n        .await;\n    let index_id = \"test_happy_path\";\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        doc_mapping:\n            field_mappings:\n            - name: body\n              type: text\n        indexing_settings:\n            commit_timeout_secs: 1\n        \"#\n    );\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    let ingest_resp = ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"doc1\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n    assert_eq!(\n        ingest_resp,\n        RestIngestResponse {\n            num_docs_for_processing: 1,\n            num_ingested_docs: Some(1),\n            num_rejected_docs: Some(0),\n            parse_failures: None,\n        },\n    );\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"*\", 1).await;\n\n    // Delete the index to avoid potential hanging on shutdown #5068\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    
sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_commit_force() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test_commit_force\";\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        doc_mapping:\n            field_mappings:\n            - name: body\n              type: text\n        indexing_settings:\n            commit_timeout_secs: 60\n        \"#\n    );\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    // commit_timeout_secs is set to a large value, so this would timeout if\n    // the commit isn't forced\n    let ingest_resp = tokio::time::timeout(\n        Duration::from_secs(20),\n        ingest(\n            &sandbox.rest_client(QuickwitService::Indexer),\n            index_id,\n            ingest_json!({\"body\": \"force\"}),\n            CommitType::Force,\n        ),\n    )\n    .await\n    .unwrap()\n    .unwrap();\n    assert_eq!(\n        ingest_resp,\n        RestIngestResponse {\n            num_docs_for_processing: 1,\n            num_ingested_docs: Some(1),\n            num_rejected_docs: Some(0),\n            parse_failures: None,\n        },\n    );\n\n    sandbox.assert_hit_count(index_id, \"body:force\", 1).await;\n\n    // Delete the index to avoid waiting for the commit timeout on shutdown #5068\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_commit_wait_for() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test_commit_wait_for\";\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        
doc_mapping:\n            field_mappings:\n            - name: body \n              type: text\n        indexing_settings:\n            commit_timeout_secs: 3\n        \"#\n    );\n\n    // Create index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    // run 2 ingest requests at the same time on the same index\n    // wait_for shouldn't force the commit so expect only 1 published split\n    let client = sandbox.rest_client(QuickwitService::Indexer);\n    let ingest_1_fut = client\n        .ingest(\n            index_id,\n            ingest_json!({\"body\": \"wait for\"}),\n            None,\n            None,\n            CommitType::WaitFor,\n        )\n        .then(|res| async {\n            let ingest_resp = res.unwrap();\n            sandbox.assert_hit_count(index_id, \"body:for\", 1).await;\n            ingest_resp\n        });\n\n    let ingest_2_fut = client\n        .ingest(\n            index_id,\n            ingest_json!({\"body\": \"wait again\"}),\n            None,\n            None,\n            CommitType::WaitFor,\n        )\n        .then(|res| async {\n            let ingest_resp = res.unwrap();\n            sandbox.assert_hit_count(index_id, \"body:again\", 1).await;\n            ingest_resp\n        });\n\n    let (ingest_resp_1, ingest_resp_2) = tokio::join!(ingest_1_fut, ingest_2_fut);\n    assert_eq!(\n        ingest_resp_1,\n        RestIngestResponse {\n            num_docs_for_processing: 1,\n            num_ingested_docs: Some(1),\n            num_rejected_docs: Some(0),\n            parse_failures: None,\n        },\n    );\n    assert_eq!(\n        ingest_resp_2,\n        RestIngestResponse {\n            num_docs_for_processing: 1,\n            num_ingested_docs: Some(1),\n            num_rejected_docs: Some(0),\n            parse_failures: None,\n        },\n    );\n\n    
sandbox.assert_hit_count(index_id, \"body:wait\", 2).await;\n\n    let splits_query_params = ListSplitsQueryParams {\n        split_states: Some(vec![SplitState::Published]),\n        ..Default::default()\n    };\n    let published_splits = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .splits(index_id)\n        .list(splits_query_params)\n        .await\n        .unwrap();\n    assert_eq!(published_splits.len(), 1);\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_commit_auto() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test_commit_auto\";\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        doc_mapping:\n            field_mappings:\n            - name: body\n              type: text\n        indexing_settings:\n            commit_timeout_secs: 2\n        \"#\n    );\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    let ingest_resp = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            ingest_json!({\"body\": \"auto\"}),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n    assert_eq!(\n        ingest_resp,\n        RestIngestResponse {\n            num_docs_for_processing: 1,\n            num_ingested_docs: Some(1),\n            num_rejected_docs: Some(0),\n            parse_failures: None,\n        },\n    );\n\n    sandbox.assert_hit_count(index_id, \"body:auto\", 0).await;\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"body:auto\", 1).await;\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn 
test_detailed_ingest_response() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test_detailed_ingest_response\";\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        doc_mapping:\n            field_mappings:\n            - name: body\n              type: text\n        indexing_settings:\n            commit_timeout_secs: 1\n        \"#\n    );\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    let ingest_resp = ingest(\n        &sandbox.detailed_ingest_client(),\n        index_id,\n        IngestSource::Str(\"{\\\"body\\\":\\\"hello\\\"}\\naouch!\".to_string()),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    assert_eq!(\n        ingest_resp,\n        RestIngestResponse {\n            num_docs_for_processing: 2,\n            num_ingested_docs: Some(1),\n            num_rejected_docs: Some(1),\n            parse_failures: Some(vec![RestParseFailure {\n                document: \"aouch!\".to_string(),\n                message: \"failed to parse JSON document\".to_string(),\n                reason: ParseFailureReason::InvalidJson,\n            }]),\n        },\n    );\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_very_large_index_name() {\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    let acceptable_index_id = \"its_very_very_very_very_very_very_very_very_very_very_very_\\\n    very_very_very_very_very_very_very_very_very_very_very_very_very_very_very_\\\n    
very_very_very_very_very_very_very_very_very_very_very_very_very_very_very_\\\n    very_very_very_very_very_very_index_large_name\";\n    assert_eq!(acceptable_index_id.len(), 255);\n    let oversized_index_id = format!(\"{acceptable_index_id}1\");\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            format!(\n                r#\"\n                version: 0.8\n                index_id: {acceptable_index_id}\n                doc_mapping:\n                  field_mappings:\n                    - name: body\n                      type: text\n                indexing_settings:\n                    commit_timeout_secs: 1\n                \"#,\n            ),\n            ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        acceptable_index_id,\n        ingest_json!({\"body\": \"not too long\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .wait_for_splits(acceptable_index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox\n        .assert_hit_count(acceptable_index_id, \"body:long\", 1)\n        .await;\n\n    // Delete the index to avoid potential hanging on shutdown #5068\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(acceptable_index_id, false)\n        .await\n        .unwrap();\n\n    let error = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            format!(\n                r#\"\n                version: 0.8\n                index_id: {oversized_index_id}\n                doc_mapping:\n                    field_mappings:\n                    - name: body\n                      type: text\n                indexing_settings:\n                    commit_timeout_secs: 1\n                \"#,\n            ),\n  
          ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap_err();\n\n    assert!(error.to_string().ends_with(\n        \"is invalid: identifiers must match the following regular expression: \\\n         `^[a-zA-Z][a-zA-Z0-9-_\\\\.]{2,254}$`)\"\n    ));\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_shutdown_single_node() {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test_shutdown_single_node\";\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            format!(\n                r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n              field_mappings:\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            \"#\n            ),\n            ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"one\"}),\n        CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            ingest_json!({\"body\": \"two\"}),\n            None,\n            None,\n            CommitType::Force,\n        )\n        .await\n        .unwrap();\n\n    tokio::time::timeout(Duration::from_secs(10), sandbox.shutdown())\n        .await\n        .unwrap()\n        .unwrap();\n}\n\n#[tokio::test]\nasync fn test_shutdown_control_plane_first() {\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Indexer])\n        .add_node([\n            QuickwitService::ControlPlane,\n            QuickwitService::Searcher,\n            QuickwitService::Metastore,\n            QuickwitService::Janitor,\n      
  ])\n        .build_and_start()\n        .await;\n    let index_id = \"test_shutdown_control_plane_first\";\n\n    // Create index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            format!(\n                r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n              field_mappings:\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            \"#\n            ),\n            ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"one\"}),\n        CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .shutdown_services([\n            QuickwitService::ControlPlane,\n            QuickwitService::Searcher,\n            QuickwitService::Metastore,\n            QuickwitService::Janitor,\n        ])\n        .await\n        .unwrap();\n\n    // The indexer hangs on shutdown because it cannot commit the shard EOF\n    tokio::time::timeout(Duration::from_secs(5), sandbox.shutdown())\n        .await\n        .unwrap_err();\n}\n\n#[tokio::test]\nasync fn test_shutdown_indexer_first() {\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Indexer])\n        .add_node([\n            QuickwitService::ControlPlane,\n            QuickwitService::Searcher,\n            QuickwitService::Metastore,\n            QuickwitService::Janitor,\n        ])\n        .build_and_start()\n        .await;\n    let index_id = \"test_shutdown_indexer_first\";\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            format!(\n                r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n            
  field_mappings:\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            \"#\n            ),\n            ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"one\"}),\n        CommitType::Force,\n    )\n    .await\n    .unwrap();\n\n    sandbox\n        .shutdown_services([QuickwitService::Indexer])\n        .await\n        .unwrap();\n\n    tokio::time::timeout(Duration::from_secs(5), sandbox.shutdown())\n        .await\n        .unwrap()\n        .unwrap();\n}\n\n/// Tests that the graceful shutdown sequence works correctly in a multi-indexer\n/// cluster: shutting down one indexer does NOT cause 500 errors or data loss,\n/// and the cluster eventually rebalances. see #6158\n///\n/// We start with a single indexer so the shard for this index is guaranteed to\n/// live on it. After ingesting, we dynamically add a second indexer, then shut\n/// down the first one. This proves the decommission sequence correctly drains\n/// in-flight data even when the shard owner is the node being removed.\n#[tokio::test]\nasync fn test_graceful_shutdown_no_data_loss() {\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Indexer])\n        .add_node([\n            QuickwitService::ControlPlane,\n            QuickwitService::Searcher,\n            QuickwitService::Metastore,\n            QuickwitService::Janitor,\n        ])\n        .build_and_start()\n        .await;\n    let index_id = \"test_graceful_shutdown_no_data_loss\";\n\n    // Create index with a long commit timeout so documents stay uncommitted\n    // in the ingesters' WAL. 
The decommission sequence should commit\n    // them before the indexer quits.\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            format!(\n                r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n              field_mappings:\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 5\n            \"#\n            ),\n            ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    // Ingest docs with auto-commit. With a 5s commit timeout, these documents\n    // sit uncommitted in the ingesters' WAL - exactly the in-flight state we\n    // want to exercise during draining.\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"before-shutdown-1\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        index_id,\n        ingest_json!({\"body\": \"before-shutdown-2\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    // Add a second indexer after the shard has been created on the first one.\n    sandbox.add_node([QuickwitService::Indexer]).await;\n    sandbox.wait_for_cluster_num_ready_nodes(3).await.unwrap();\n\n    // Remove the first indexer (the shard owner) from the sandbox and get its\n    // shutdown handle. After this call, rest_client(Indexer) returns the\n    // second (surviving) indexer.\n    let shutdown_handle = sandbox.remove_node_with_service(QuickwitService::Indexer);\n\n    // Concurrently: shut down the removed indexer AND ingest more data via the\n    // surviving indexer. This verifies the cluster stays operational and the\n    // router on the surviving node does not return 500 errors while one indexer\n    // is decommissioning. 
The control plane excludes the decommissioning\n    // ingester from shard allocation, so new shards go to the surviving one.\n    let ingest_client = sandbox.rest_client(QuickwitService::Indexer);\n    let (shutdown_result, ingest_result) = tokio::join!(\n        async {\n            tokio::time::timeout(Duration::from_secs(30), shutdown_handle.shutdown())\n                .await\n                .expect(\"indexer shutdown timed out — decommission may be stuck\")\n        },\n        async {\n            // Small delay so the decommission sequence has started before we ingest.\n            tokio::time::sleep(Duration::from_millis(200)).await;\n            ingest(\n                &ingest_client,\n                index_id,\n                ingest_json!({\"body\": \"during-shutdown\"}),\n                CommitType::Auto,\n            )\n            .await\n        },\n    );\n    shutdown_result.expect(\"indexer shutdown failed\");\n    ingest_result.expect(\"ingest during shutdown should succeed (no 500 errors)\");\n\n    // All 3 documents should eventually be searchable. Documents 1 & 2 were\n    // in-flight on the decommissioning indexer and should have been committed during\n    // the decommission step. 
Document 3 was ingested to the surviving indexer.\n    wait_until_predicate(\n        || async {\n            match sandbox\n                .rest_client(QuickwitService::Searcher)\n                .search(\n                    index_id,\n                    quickwit_serve::SearchRequestQueryString {\n                        query: \"*\".to_string(),\n                        max_hits: 10,\n                        ..Default::default()\n                    },\n                )\n                .await\n            {\n                Ok(resp) => resp.num_hits == 3,\n                Err(_) => false,\n            }\n        },\n        Duration::from_secs(30),\n        Duration::from_millis(500),\n    )\n    .await\n    .expect(\"expected 3 documents after decommission shutdown, some data may have been lost\");\n\n    // Verify the cluster sees 2 ready nodes (the surviving indexer + the\n    // control-plane/searcher/metastore/janitor node).\n    sandbox\n        .wait_for_cluster_num_ready_nodes(2)\n        .await\n        .expect(\"cluster should see 2 ready nodes after indexer shutdown\");\n\n    // Clean shutdown of the remaining nodes.\n    tokio::time::timeout(Duration::from_secs(30), sandbox.shutdown())\n        .await\n        .unwrap()\n        .unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod basic_tests;\nmod ingest_v1_tests;\nmod ingest_v2_tests;\nmod no_cp_tests;\nmod otlp_tests;\n#[cfg(feature = \"sqs-localstack-tests\")]\nmod sqs_tests;\nmod tls_tests;\nmod update_tests;\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/no_cp_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Tests for cluster configurations without a control plane.\n\nuse quickwit_config::ConfigFormat;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_rest_client::error::{ApiError, Error as RestClientError};\nuse quickwit_serve::SearchRequestQueryString;\n\nuse crate::test_utils::ClusterSandboxBuilder;\n\nfn initialize_tests() {\n    // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1.\n    // As this is only a test, and it would be extremely inconvenient to run it in a different way,\n    // we are keeping it that way.\n\n    quickwit_common::setup_logging_for_tests();\n    unsafe { std::env::set_var(\"QW_ENABLE_INGEST_V2\", \"true\") };\n}\n\n#[tokio::test]\nasync fn test_search_after_control_plane_shutdown() {\n    initialize_tests();\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Searcher])\n        .build_and_start()\n        .await;\n    let index_id = \"test-search-after-control-plane-shutdown\";\n    let index_config = format!(\n        r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n                field_mappings:\n                - name: body\n                  type: text\n            
indexing_settings:\n                commit_timeout_secs: 1\n            \"#\n    );\n\n    sandbox\n        .rest_client(QuickwitService::Metastore)\n        .indexes()\n        .create(index_config.clone(), ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    sandbox\n        .shutdown_services([QuickwitService::ControlPlane])\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"\", 0).await;\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_searcher_and_metastore_without_control_plane() {\n    initialize_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Searcher])\n        .build_and_start()\n        .await;\n\n    // we cannot create an actual index without control plane\n\n    let search_error = sandbox\n        .rest_client(QuickwitService::Searcher)\n        .search(\n            \"does-not-exist\",\n            SearchRequestQueryString {\n                query: String::new(),\n                max_hits: 10,\n                ..Default::default()\n            },\n        )\n        .await\n        .unwrap_err();\n\n    if let RestClientError::Api(ApiError { message, code }) = search_error {\n        assert_eq!(\n            message.unwrap(),\n            \"could not find indexes matching the IDs `[\\\"does-not-exist\\\"]`\"\n        );\n        assert_eq!(code.as_u16(), 404);\n    } else {\n        panic!(\"unexpected error: {search_error:?}\");\n    }\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\n#[should_panic]\nasync fn test_indexer_fails_without_control_plane() {\n    initialize_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer, QuickwitService::Searcher])\n        .build_and_start()\n        .await;\n\n    let _ = sandbox.shutdown().await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/otlp_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse futures_util::StreamExt;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_metastore::SplitState;\nuse quickwit_opentelemetry::otlp::{\n    OTEL_LOGS_INDEX_ID, OTEL_TRACES_INDEX_ID, make_resource_spans_for_test,\n};\nuse quickwit_proto::jaeger::storage::v1::{\n    FindTraceIDsRequest, GetOperationsRequest, GetServicesRequest, GetTraceRequest, Operation,\n    SpansResponseChunk, TraceQueryParameters,\n};\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::ExportLogsServiceRequest;\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::ExportTraceServiceRequest;\nuse quickwit_proto::opentelemetry::proto::common::v1::AnyValue;\nuse quickwit_proto::opentelemetry::proto::common::v1::any_value::Value;\nuse quickwit_proto::opentelemetry::proto::logs::v1::{LogRecord, ResourceLogs, ScopeLogs};\nuse quickwit_proto::opentelemetry::proto::trace::v1::{ResourceSpans, ScopeSpans, Span};\nuse tonic::codec::CompressionEncoding;\n\nuse crate::test_utils::ClusterSandboxBuilder;\n\nfn initialize_tests() {\n    // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1.\n    // As this is only a test, and it would be extremely inconvenient to run it in a different way,\n    // we are keeping it that way.\n\n    quickwit_common::setup_logging_for_tests();\n    unsafe 
{ std::env::set_var(\"QW_ENABLE_INGEST_V2\", \"true\") };\n}\n\n#[tokio::test]\nasync fn test_ingest_traces_with_otlp_grpc_api() {\n    initialize_tests();\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node_with_otlp([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n    // Wait for the pipelines to start (one for logs and one for traces)\n    sandbox.wait_for_indexing_pipelines(2).await.unwrap();\n\n    fn build_span(span_name: String) -> Vec<ResourceSpans> {\n        let scope_spans = vec![ScopeSpans {\n            spans: vec![Span {\n                name: span_name,\n                trace_id: vec![1; 16],\n                span_id: vec![2; 8],\n                start_time_unix_nano: 1724060143000000001,\n                end_time_unix_nano: 1724060144000000000,\n                ..Default::default()\n            }],\n            ..Default::default()\n        }];\n        vec![ResourceSpans {\n            scope_spans,\n            ..Default::default()\n        }]\n    }\n\n    // Send the spans on the default index\n    let tested_clients = vec![\n        sandbox.trace_client().clone(),\n        sandbox\n            .trace_client()\n            .clone()\n            .send_compressed(CompressionEncoding::Gzip),\n    ];\n    for (idx, mut tested_client) in tested_clients.into_iter().enumerate() {\n        let body = format!(\"hello{idx}\");\n        let request = ExportTraceServiceRequest {\n            resource_spans: build_span(body.clone()),\n        };\n        let response = tested_client.export(request).await.unwrap();\n        assert_eq!(\n            response\n                .into_inner()\n                .partial_success\n                .unwrap()\n                .rejected_spans,\n            0\n        );\n        
sandbox\n            .wait_for_splits(\n                OTEL_TRACES_INDEX_ID,\n                Some(vec![SplitState::Published]),\n                idx + 1,\n            )\n            .await\n            .unwrap();\n        sandbox\n            .assert_hit_count(OTEL_TRACES_INDEX_ID, &format!(\"span_name:{body}\"), 1)\n            .await;\n    }\n\n    // Send the spans to a non-existent index; this should return an error.\n    {\n        let request = ExportTraceServiceRequest {\n            resource_spans: build_span(\"hello\".to_string()),\n        };\n        let mut tonic_request = tonic::Request::new(request);\n        tonic_request.metadata_mut().insert(\n            \"qw-otel-traces-index\",\n            tonic::metadata::MetadataValue::try_from(\"non-existing-index\").unwrap(),\n        );\n        let status = sandbox\n            .trace_client()\n            .clone()\n            .export(tonic_request)\n            .await\n            .unwrap_err();\n        assert_eq!(status.code(), tonic::Code::NotFound);\n    }\n\n    sandbox\n        .shutdown_services([QuickwitService::Indexer])\n        .await\n        .unwrap();\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_ingest_logs_with_otlp_grpc_api() {\n    initialize_tests();\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node_with_otlp([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n    // Wait for the pipelines to start (one for logs and one for traces)\n    sandbox.wait_for_indexing_pipelines(2).await.unwrap();\n\n    fn build_log(body: String) -> Vec<ResourceLogs> {\n        let log_record = LogRecord {\n            time_unix_nano: 1724060143000000001,\n            body: Some(AnyValue {\n                value: 
Some(Value::StringValue(body)),\n            }),\n            ..Default::default()\n        };\n        let scope_logs = ScopeLogs {\n            log_records: vec![log_record],\n            ..Default::default()\n        };\n        vec![ResourceLogs {\n            scope_logs: vec![scope_logs],\n            ..Default::default()\n        }]\n    }\n\n    // Send the logs on the default index\n    let tested_clients = vec![\n        sandbox.logs_client().clone(),\n        sandbox\n            .logs_client()\n            .clone()\n            .send_compressed(CompressionEncoding::Gzip),\n    ];\n    for (idx, mut tested_client) in tested_clients.into_iter().enumerate() {\n        let body: String = format!(\"hello{idx}\");\n        let request = ExportLogsServiceRequest {\n            resource_logs: build_log(body.clone()),\n        };\n        let response = tested_client.export(request).await.unwrap();\n        assert_eq!(\n            response\n                .into_inner()\n                .partial_success\n                .unwrap()\n                .rejected_log_records,\n            0\n        );\n        sandbox\n            .wait_for_splits(\n                OTEL_LOGS_INDEX_ID,\n                Some(vec![SplitState::Published]),\n                idx + 1,\n            )\n            .await\n            .unwrap();\n        sandbox\n            .assert_hit_count(OTEL_LOGS_INDEX_ID, &format!(\"body.message:{body}\"), 1)\n            .await;\n    }\n\n    sandbox\n        .shutdown_services([QuickwitService::Indexer])\n        .await\n        .unwrap();\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_jaeger_api() {\n    initialize_tests();\n    let mut sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node_with_otlp([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        
.add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n    // Wait for the pipelines to start (one for logs and one for traces)\n    sandbox.wait_for_indexing_pipelines(2).await.unwrap();\n\n    let export_trace_request = ExportTraceServiceRequest {\n        resource_spans: make_resource_spans_for_test(),\n    };\n    sandbox\n        .trace_client()\n        .export(export_trace_request)\n        .await\n        .unwrap();\n\n    sandbox\n        .wait_for_splits(OTEL_TRACES_INDEX_ID, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox\n        .shutdown_services([QuickwitService::Indexer])\n        .await\n        .unwrap();\n\n    {\n        // Test `GetServices`\n        let get_services_request = GetServicesRequest {};\n        let get_services_response = sandbox\n            .jaeger_client()\n            .get_services(tonic::Request::new(get_services_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(get_services_response.services, &[\"quickwit\"]);\n    }\n    {\n        // Test `GetOperations`\n        let get_operations_request = GetOperationsRequest {\n            service: \"quickwit\".to_string(),\n            span_kind: \"\".to_string(),\n        };\n        let get_operations_response = sandbox\n            .jaeger_client()\n            .get_operations(tonic::Request::new(get_operations_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(get_operations_response.operations.len(), 4);\n        assert_eq!(\n            get_operations_response.operations,\n            vec![\n                Operation {\n                    name: \"delete_splits\".to_string(),\n                    span_kind: \"client\".to_string(),\n                },\n                Operation {\n                    name: \"list_splits\".to_string(),\n                    span_kind: \"client\".to_string(),\n                },\n   
             Operation {\n                    name: \"publish_splits\".to_string(),\n                    span_kind: \"server\".to_string(),\n                },\n                Operation {\n                    name: \"stage_splits\".to_string(),\n                    span_kind: \"internal\".to_string(),\n                }\n            ]\n        );\n\n        let get_operations_request = GetOperationsRequest {\n            service: \"quickwit\".to_string(),\n            span_kind: \"server\".to_string(),\n        };\n        let get_operations_response = sandbox\n            .jaeger_client()\n            .get_operations(tonic::Request::new(get_operations_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(get_operations_response.operations.len(), 1);\n        assert_eq!(\n            get_operations_response.operations,\n            vec![Operation {\n                name: \"publish_splits\".to_string(),\n                span_kind: \"server\".to_string(),\n            },]\n        );\n    }\n    {\n        // Test `FindTraceIds`\n        // TODO: Increase comprehensiveness of this test.\n        // Search by service and operation name.\n        let query = TraceQueryParameters {\n            service_name: \"quickwit\".to_string(),\n            operation_name: \"stage_splits\".to_string(),\n            tags: HashMap::new(),\n            start_time_min: None,\n            start_time_max: None,\n            duration_min: None,\n            duration_max: None,\n            num_traces: 10,\n        };\n        let find_trace_ids_request = FindTraceIDsRequest { query: Some(query) };\n        let find_trace_ids_response = sandbox\n            .jaeger_client()\n            .find_trace_i_ds(tonic::Request::new(find_trace_ids_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(find_trace_ids_response.trace_ids.len(), 1);\n        assert_eq!(find_trace_ids_response.trace_ids[0], [1; 
16]);\n\n        // Search by service name, operation name, and span attribute.\n        let query = TraceQueryParameters {\n            service_name: \"quickwit\".to_string(),\n            operation_name: \"list_splits\".to_string(),\n            tags: HashMap::from([(\"span_key\".to_string(), \"span_value\".to_string())]),\n            start_time_min: None,\n            start_time_max: None,\n            duration_min: None,\n            duration_max: None,\n            num_traces: 10,\n        };\n        let find_trace_ids_request = FindTraceIDsRequest { query: Some(query) };\n        let find_trace_ids_response = sandbox\n            .jaeger_client()\n            .find_trace_i_ds(tonic::Request::new(find_trace_ids_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(find_trace_ids_response.trace_ids.len(), 1);\n        assert_eq!(find_trace_ids_response.trace_ids[0], [3; 16]);\n\n        // Search by service name, operation name, and event attribute.\n        let query = TraceQueryParameters {\n            service_name: \"quickwit\".to_string(),\n            operation_name: \"delete_splits\".to_string(),\n            tags: HashMap::from([(\"event_key\".to_string(), \"event_value\".to_string())]),\n            start_time_min: None,\n            start_time_max: None,\n            duration_min: None,\n            duration_max: None,\n            num_traces: 10,\n        };\n        let find_trace_ids_request = FindTraceIDsRequest { query: Some(query) };\n        let find_trace_ids_response = sandbox\n            .jaeger_client()\n            .find_trace_i_ds(tonic::Request::new(find_trace_ids_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(find_trace_ids_response.trace_ids.len(), 1);\n        assert_eq!(find_trace_ids_response.trace_ids[0], [5; 16]);\n\n        // Search traces with an error.\n        let query = TraceQueryParameters {\n            service_name: 
\"quickwit\".to_string(),\n            operation_name: \"list_splits\".to_string(),\n            tags: HashMap::from([(\"error\".to_string(), \"true\".to_string())]),\n            start_time_min: None,\n            start_time_max: None,\n            duration_min: None,\n            duration_max: None,\n            num_traces: 10,\n        };\n        let find_trace_ids_request = FindTraceIDsRequest { query: Some(query) };\n        let find_trace_ids_response = sandbox\n            .jaeger_client()\n            .find_trace_i_ds(tonic::Request::new(find_trace_ids_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(find_trace_ids_response.trace_ids.len(), 1);\n        assert_eq!(find_trace_ids_response.trace_ids[0], [4; 16]);\n\n        // Search traces without an error.\n        let query = TraceQueryParameters {\n            service_name: \"quickwit\".to_string(),\n            operation_name: \"list_splits\".to_string(),\n            tags: HashMap::from([(\"error\".to_string(), \"false\".to_string())]),\n            start_time_min: None,\n            start_time_max: None,\n            duration_min: None,\n            duration_max: None,\n            num_traces: 10,\n        };\n        let find_trace_ids_request = FindTraceIDsRequest { query: Some(query) };\n        let find_trace_ids_response = sandbox\n            .jaeger_client()\n            .find_trace_i_ds(tonic::Request::new(find_trace_ids_request))\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(find_trace_ids_response.trace_ids.len(), 1);\n        assert_eq!(find_trace_ids_response.trace_ids[0], [3; 16]);\n    }\n    {\n        // Test `GetTrace`\n        let get_trace_request = GetTraceRequest {\n            trace_id: [1; 16].to_vec(),\n        };\n        let mut span_stream = sandbox\n            .jaeger_client()\n            .get_trace(tonic::Request::new(get_trace_request))\n            .await\n            
.unwrap()\n            .into_inner();\n        let SpansResponseChunk { spans } = span_stream.next().await.unwrap().unwrap();\n        assert_eq!(spans.len(), 1);\n\n        let span: &quickwit_proto::jaeger::api_v2::Span = &spans[0];\n        assert_eq!(span.operation_name, \"stage_splits\");\n\n        let process = span.process.as_ref().unwrap();\n        assert_eq!(process.tags.len(), 1);\n        assert_eq!(process.tags[0].key, \"tags\");\n        assert_eq!(process.tags[0].v_str, r#\"[\"foo\"]\"#);\n    }\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/sqs_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io::Write;\nuse std::iter;\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse aws_sdk_sqs::types::QueueAttributeName;\nuse quickwit_common::test_utils::wait_until_predicate;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::ConfigFormat;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_indexing::source::sqs_queue::test_helpers as sqs_test_helpers;\nuse quickwit_metastore::SplitState;\nuse tempfile::NamedTempFile;\n\nuse crate::test_utils::ClusterSandboxBuilder;\n\nfn create_mock_data_file(num_lines: usize) -> (NamedTempFile, Uri) {\n    let mut temp_file = tempfile::NamedTempFile::new().unwrap();\n    for i in 0..num_lines {\n        writeln!(temp_file, \"{{\\\"body\\\": \\\"hello {i}\\\"}}\").unwrap()\n    }\n    temp_file.flush().unwrap();\n    let path = temp_file.path().to_str().unwrap();\n    let uri = Uri::from_str(path).unwrap();\n    (temp_file, uri)\n}\n\n#[tokio::test]\nasync fn test_sqs_with_duplicates() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test-sqs-source-duplicates\";\n    let index_config = format!(\n        r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n                field_mappings:\n                - name: body\n                  type: 
text\n            indexing_settings:\n                commit_timeout_secs: 3\n            \"#\n    );\n\n    let sqs_client = sqs_test_helpers::get_localstack_sqs_client().await.unwrap();\n    let queue_url = sqs_test_helpers::create_queue(&sqs_client, \"test-single-node-cluster\").await;\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config.clone(), ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    let source_id: &str = \"test-sqs-single-node-cluster\";\n    let source_config_input = format!(\n        r#\"\n            version: 0.7\n            source_id: {source_id}\n            desired_num_pipelines: 1\n            max_num_pipelines_per_indexer: 1\n            source_type: file\n            params:\n                notifications:\n                  - type: sqs\n                    queue_url: {queue_url}\n                    message_type: raw_uri\n            input_format: plain_text\n        \"#\n    );\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .sources(index_id)\n        .create(source_config_input, ConfigFormat::Yaml)\n        .await\n        .unwrap();\n\n    // Send messages with duplicates\n    let tmp_mock_data_files: Vec<_> = iter::repeat_with(|| create_mock_data_file(1000))\n        .take(10)\n        .collect();\n    for (_, uri) in &tmp_mock_data_files {\n        sqs_test_helpers::send_message(&sqs_client, &queue_url, uri.as_str()).await;\n    }\n    sqs_test_helpers::send_message(&sqs_client, &queue_url, tmp_mock_data_files[0].1.as_str())\n        .await;\n    sqs_test_helpers::send_message(&sqs_client, &queue_url, tmp_mock_data_files[5].1.as_str())\n        .await;\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"\", 10 * 1000).await;\n\n    // The two duplicates could not be acknowledged when they were received\n    // 
because at that point the relevant data was not yet committed. Now it is\n    // committed, but their visibility timeout will still take a while to be\n    // reached.\n    wait_until_predicate(\n        || async {\n            let in_flight_count: usize = sqs_test_helpers::get_queue_attribute(\n                &sqs_client,\n                &queue_url,\n                QueueAttributeName::ApproximateNumberOfMessagesNotVisible,\n            )\n            .await\n            .parse()\n            .unwrap();\n            in_flight_count == 2\n        },\n        Duration::from_secs(5),\n        Duration::from_millis(100),\n    )\n    .await\n    .expect(\"number of in-flight messages didn't reach 2 within the timeout\");\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_sqs_garbage_collect() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n    let index_id = \"test-sqs-source-garbage-collect\";\n    let index_config = format!(\n        r#\"\n            version: 0.8\n            index_id: {index_id}\n            doc_mapping:\n                field_mappings:\n                - name: body\n                  type: text\n            indexing_settings:\n                commit_timeout_secs: 1\n            \"#\n    );\n\n    let sqs_client = sqs_test_helpers::get_localstack_sqs_client().await.unwrap();\n    let queue_url = sqs_test_helpers::create_queue(&sqs_client, \"test-single-node-cluster\").await;\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config.clone(), ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    let source_id: &str = \"test-sqs-single-node-cluster\";\n    let source_config_input = format!(\n        r#\"\n            
version: 0.7\n            source_id: {source_id}\n            desired_num_pipelines: 1\n            max_num_pipelines_per_indexer: 1\n            source_type: file\n            params:\n                notifications:\n                  - type: sqs\n                    queue_url: {queue_url}\n                    message_type: raw_uri\n                    deduplication_window_max_messages: 5\n                    deduplication_cleanup_interval_secs: 3\n            input_format: plain_text\n        \"#\n    );\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .sources(index_id)\n        .create(source_config_input, ConfigFormat::Yaml)\n        .await\n        .unwrap();\n\n    let tmp_mock_data_files: Vec<_> = iter::repeat_with(|| create_mock_data_file(1000))\n        .take(10)\n        .collect();\n    for (_, uri) in &tmp_mock_data_files {\n        sqs_test_helpers::send_message(&sqs_client, &queue_url, uri.as_str()).await;\n    }\n\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    sandbox.assert_hit_count(index_id, \"\", 10 * 1000).await;\n\n    wait_until_predicate(\n        || async {\n            let shard_count = sandbox\n                .rest_client(QuickwitService::Indexer)\n                .sources(index_id)\n                .get_shards(source_id)\n                .await\n                .unwrap()\n                .len();\n            tracing::info!(\"shard_count: {}\", shard_count);\n            shard_count == 5\n        },\n        Duration::from_secs(6),\n        Duration::from_millis(200),\n    )\n    .await\n    .expect(\"shards were not pruned within the timeout\");\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .delete(index_id, false)\n        .await\n        .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n\n// this source update test is done here because SQS is the only long running\n// 
configurable source for which we have integration tests set up.\n#[tokio::test]\nasync fn test_update_source_multi_node_cluster() {\n    quickwit_common::setup_logging_for_tests();\n    let index_id = \"test-update-source-cluster\";\n    let sqs_client = sqs_test_helpers::get_localstack_sqs_client().await.unwrap();\n    let queue_url = sqs_test_helpers::create_queue(&sqs_client, \"test-update-source-cluster\").await;\n\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    // Create an index\n    let index_config = format!(\n        r#\"\n        version: 0.8\n        index_id: {index_id}\n        doc_mapping:\n            field_mappings:\n            - name: body\n              type: text\n        indexing_settings:\n            commit_timeout_secs: 1\n        \"#\n    );\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(index_config, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    // Wait until indexing pipelines are started\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    // create an SQS source with 1 pipeline\n    let source_id: &str = \"test-update-source-cluster\";\n    let source_config_input = format!(\n        r#\"\n            version: 0.7\n 
           source_id: {source_id}\n            desired_num_pipelines: 1\n            max_num_pipelines_per_indexer: 1\n            source_type: file\n            params:\n                notifications:\n                  - type: sqs\n                    queue_url: {queue_url}\n                    message_type: raw_uri\n                    deduplication_window_max_messages: 5\n                    deduplication_cleanup_interval_secs: 3\n            input_format: plain_text\n        \"#\n    );\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .sources(index_id)\n        .create(source_config_input, ConfigFormat::Yaml)\n        .await\n        .unwrap();\n\n    // Wait until the SQS indexing pipeline is also started\n    sandbox.wait_for_indexing_pipelines(2).await.unwrap();\n\n    // increase the number of pipelines to 3\n    let source_config_input = format!(\n        r#\"\n            version: 0.7\n            source_id: {source_id}\n            desired_num_pipelines: 3\n            max_num_pipelines_per_indexer: 3\n            source_type: file\n            params:\n                notifications:\n                  - type: sqs\n                    queue_url: {queue_url}\n                    message_type: raw_uri\n                    deduplication_window_max_messages: 5\n                    deduplication_cleanup_interval_secs: 3\n            input_format: plain_text\n        \"#\n    );\n    sandbox\n        .rest_client(QuickwitService::Metastore)\n        .sources(index_id)\n        .update(source_id, source_config_input, ConfigFormat::Yaml, false)\n        .await\n        .unwrap();\n\n    // Wait until the SQS indexing pipeline is also started\n    sandbox.wait_for_indexing_pipelines(4).await.unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/tls_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse hyper_util::rt::TokioExecutor;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_serve::SearchRequestQueryString;\n\nuse crate::test_utils::ClusterSandboxBuilder;\n\n#[tokio::test]\nasync fn test_tls_rest() {\n    quickwit_common::setup_logging_for_tests();\n    let mut sandbox_config = ClusterSandboxBuilder::default()\n        .add_node(QuickwitService::supported_services())\n        .build_config()\n        .await;\n    sandbox_config.node_configs[0].0.rest_config.tls = Some(quickwit_config::TlsConfig {\n        cert_path: concat!(env!(\"CARGO_MANIFEST_DIR\"), \"/test_data/server.crt\").to_string(),\n        key_path: concat!(env!(\"CARGO_MANIFEST_DIR\"), \"/test_data/server.key\").to_string(),\n        ca_path: concat!(env!(\"CARGO_MANIFEST_DIR\"), \"/test_data/ca.crt\").to_string(),\n        expected_name: None,\n        validate_client: false,\n    });\n    let sandbox = sandbox_config.start().await;\n    let node_config = sandbox.node_configs.first().unwrap();\n    let client = hyper_util::client::legacy::Client::builder(TokioExecutor::new())\n        .pool_idle_timeout(Duration::from_secs(30))\n        .http2_only(true)\n        .build_http::<String>();\n    let root_uri = format!(\"http://{}/\", node_config.0.rest_config.listen_addr)\n        .parse::<hyper::Uri>()\n        .unwrap();\n    
client\n        .get(root_uri.clone())\n        .await\n        .expect_err(\"non tls connection should fail\");\n\n    assert_eq!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .indexes()\n            .list()\n            .await\n            .unwrap()\n            .len(),\n        0\n    );\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_tls_grpc() {\n    quickwit_common::setup_logging_for_tests();\n    let mut sandbox_config = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_config()\n        .await;\n\n    for node in &mut sandbox_config.node_configs {\n        node.0.rest_config.tls = Some(quickwit_config::TlsConfig {\n            cert_path: concat!(env!(\"CARGO_MANIFEST_DIR\"), \"/test_data/server.crt\").to_string(),\n            key_path: concat!(env!(\"CARGO_MANIFEST_DIR\"), \"/test_data/server.key\").to_string(),\n            ca_path: concat!(env!(\"CARGO_MANIFEST_DIR\"), \"/test_data/ca.crt\").to_string(),\n            expected_name: Some(\"quickwit.local\".to_string()),\n            validate_client: false,\n        });\n    }\n\n    let sandbox = sandbox_config.start().await;\n\n    // TODO connect to grpc port and verify it refuses non-tls connection\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            r#\"\n            version: 0.8\n            index_id: my-new-multi-node-index\n            doc_mapping:\n              field_mappings:\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        
.unwrap();\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Assert that at least 1 indexing pipeline is successfully started\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    // Check that search is working\n    let search_response_empty = sandbox\n        .rest_client(QuickwitService::Searcher)\n        .search(\n            \"my-new-multi-node-index\",\n            SearchRequestQueryString {\n                query: \"body:bar\".to_string(),\n                ..Default::default()\n            },\n        )\n        .await\n        .unwrap();\n    assert_eq!(search_response_empty.num_hits, 0);\n\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/update_tests/create_on_update.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse quickwit_config::service::QuickwitService;\nuse quickwit_rest_client::rest_client::CommitType;\nuse serde_json::json;\n\nuse super::assert_hits_unordered;\nuse crate::ingest_json;\nuse crate::test_utils::{ClusterSandboxBuilder, ingest};\n\n#[tokio::test]\nasync fn test_update_missing_no_create() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    
let status_code = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .update(\n            \"my-updatable-index\",\n            r#\"\n            version: 0.8\n            index_id: my-updatable-index\n            doc_mapping:\n              field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            search_settings:\n              default_search_fields: [title, body]\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap_err()\n        .status_code()\n        .unwrap();\n    assert_eq!(status_code, 404);\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_update_missing_create() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        
.update(\n            \"my-updatable-index\",\n            r#\"\n            version: 0.8\n            index_id: my-updatable-index\n            doc_mapping:\n              field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            search_settings:\n              default_search_fields: [title, body]\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            true,\n        )\n        .await\n        .unwrap();\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_update_create_existing_doesnt_clear() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    // Create an index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            r#\"\n            version: 0.8\n            index_id: my-updatable-index\n            doc_mapping:\n              field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n    
        search_settings:\n              default_search_fields: [title]\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Wait until indexing pipelines are started\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        \"my-updatable-index\",\n        ingest_json!({\"title\": \"first\", \"body\": \"first record\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    // Wait until split is committed\n    tokio::time::sleep(Duration::from_secs(4)).await;\n\n    // No hit because `default_search_fields` only covers the `title` field\n    assert_hits_unordered(&sandbox, \"my-updatable-index\", \"record\", Ok(&[])).await;\n\n    // Update the index to also search `body` by default, the same search should\n    // now have 1 hit\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .update(\n            \"my-updatable-index\",\n            r#\"\n            version: 0.8\n            index_id: my-updatable-index\n            doc_mapping:\n              field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            search_settings:\n              default_search_fields: [title, body]\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            true,\n        )\n        .await\n        .unwrap();\n\n    assert_hits_unordered(\n        &sandbox,\n        \"my-updatable-index\",\n        \"record\",\n        Ok(&[json!({\"title\": \"first\", \"body\": \"first record\"})]),\n    )\n    .await;\n\n    
sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/update_tests/doc_mapping_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Write;\nuse std::time::Duration;\n\nuse quickwit_config::service::QuickwitService;\nuse quickwit_rest_client::models::IngestSource;\nuse quickwit_rest_client::rest_client::CommitType;\nuse serde_json::{Value, json};\n\nuse super::assert_hits_unordered;\nuse crate::test_utils::ClusterSandboxBuilder;\n\n/// Update the doc mapping between 2 calls to local-ingest (forces separate indexing pipelines) and\n/// assert the number of hits for the given query\nasync fn validate_search_across_doc_mapping_updates(\n    index_id: &str,\n    original_doc_mapping: Value,\n    ingest_before_update: &[Value],\n    updated_doc_mapping: Value,\n    ingest_after_update: &[Value],\n    query_and_expect: &[(&str, Result<&[Value], ()>)],\n) {\n    let sandbox = ClusterSandboxBuilder::build_and_start_standalone().await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    // Create index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        
.indexes()\n        .create(\n            json!({\n                \"version\": \"0.8\",\n                \"index_id\": index_id,\n                \"doc_mapping\": original_doc_mapping,\n                \"indexing_settings\": {\n                    \"commit_timeout_secs\": 1\n                },\n            })\n            .to_string(),\n            quickwit_config::ConfigFormat::Json,\n            false,\n        )\n        .await\n        .unwrap();\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Wait until indexing pipelines are started.\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    // We use local ingest to always pick up the latest doc mapping\n    sandbox\n        .local_ingest(index_id, ingest_before_update)\n        .await\n        .unwrap();\n\n    // Update index to also search \"body\" by default, search should now have 1 hit\n    sandbox\n        .rest_client(QuickwitService::Searcher)\n        .indexes()\n        .update(\n            index_id,\n            json!({\n                \"version\": \"0.8\",\n                \"index_id\": index_id,\n                \"doc_mapping\": updated_doc_mapping,\n                \"indexing_settings\": {\n                    \"commit_timeout_secs\": 1,\n                },\n            })\n            .to_string(),\n            quickwit_config::ConfigFormat::Json,\n            false,\n        )\n        .await\n        .unwrap();\n\n    sandbox\n        .local_ingest(index_id, ingest_after_update)\n        .await\n        .unwrap();\n\n    for (query, expected_hits) in query_and_expect.iter().copied() {\n        assert_hits_unordered(&sandbox, index_id, query, expected_hits).await;\n    }\n\n    sandbox.shutdown().await.unwrap();\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_text_to_u64() {\n    let index_id = \"update-text-to-u64\";\n    let 
original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\"}\n        ]\n    });\n    let ingest_before_update = &[json!({\"body\": \"14\"}), json!({\"body\": \"15\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"u64\"}\n        ]\n    });\n    let ingest_after_update = &[json!({\"body\": 16}), json!({\"body\": 17})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:14\", Ok(&[json!({\"body\": 14})])),\n            (\"body:16\", Ok(&[json!({\"body\": 16})])),\n            // error expected because the validation is performed\n            // by latest doc mapping\n            (\"body:hello\", Err(())),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_u64_to_text() {\n    let index_id = \"update-u64-to-text\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"u64\"}\n        ],\n        \"mode\": \"strict\",\n    });\n    let ingest_before_update = &[json!({\"body\": 14}), json!({\"body\": 15})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\"},\n        ],\n        \"mode\": \"strict\",\n    });\n    let ingest_after_update = &[json!({\"body\": \"16\"}), json!({\"body\": \"hello world\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:14\", Ok(&[json!({\"body\": \"14\"})])),\n            (\"body:16\", Ok(&[json!({\"body\": \"16\"})])),\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello world\"})])),\n  
      ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_json_to_text() {\n    let index_id = \"update-json-to-text\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"json\"}\n        ]\n    });\n    let ingest_before_update = &[\n        json!({\"body\": {\"field1\": \"hello\"}}),\n        json!({\"body\": {\"field2\": \"world\"}}),\n    ];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\"}\n        ]\n    });\n    let ingest_after_update = &[json!({\"body\": \"hello world\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello world\"})])),\n            // error expected because the validation is performed\n            // by latest doc mapping\n            (\"body.field1:hello\", Err(())),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_json_to_object() {\n    let index_id = \"update-json-to-object\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"json\"}\n        ]\n    });\n    let ingest_before_update = &[\n        json!({\"body\": {\"field1\": \"hello\"}}),\n        json!({\"body\": {\"field2\": \"world\"}}),\n    ];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\n                \"name\": \"body\",\n                \"type\": \"object\",\n                \"field_mappings\": [\n                    {\"name\": \"field1\", \"type\": \"text\"},\n                    {\"name\": \"field2\", \"type\": \"text\"},\n                ]\n            }\n        ]\n    });\n    let ingest_after_update = &[\n        json!({\"body\": {\"field1\": 
\"hola\"}}),\n        json!({\"body\": {\"field2\": \"mundo\"}}),\n    ];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\n                \"body.field1:hello\",\n                Ok(&[json!({\"body\": {\"field1\": \"hello\"}})]),\n            ),\n            (\n                \"body.field1:hola\",\n                Ok(&[json!({\"body\": {\"field1\": \"hola\"}})]),\n            ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_object_to_json() {\n    let index_id = \"update-object-to-json\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\n                \"name\": \"body\",\n                \"type\": \"object\",\n                \"field_mappings\": [\n                    {\"name\": \"field1\", \"type\": \"text\"},\n                    {\"name\": \"field2\", \"type\": \"text\"},\n                ]\n            }\n        ]\n    });\n    let ingest_before_update = &[\n        json!({\"body\": {\"field1\": \"hello\"}}),\n        json!({\"body\": {\"field2\": \"world\"}}),\n    ];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"json\"}\n        ]\n    });\n    let ingest_after_update = &[\n        json!({\"body\": {\"field1\": \"hola\"}}),\n        json!({\"body\": {\"field2\": \"mundo\"}}),\n    ];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\n                \"body.field1:hello\",\n                Ok(&[json!({\"body\": {\"field1\": \"hello\"}})]),\n            ),\n            (\n                \"body.field1:hola\",\n                Ok(&[json!({\"body\": {\"field1\": \"hola\"}})]),\n 
           ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_tokenizer_default_to_raw() {\n    let index_id = \"update-tokenizer-default-to-raw\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"default\"}\n        ]\n    });\n    let ingest_before_update = &[json!({\"body\": \"hello-world\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"raw\"}\n        ]\n    });\n    let ingest_after_update = &[json!({\"body\": \"bonjour-monde\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello-world\"})])),\n            (\"body:world\", Ok(&[json!({\"body\": \"hello-world\"})])),\n            // phrases queries won't apply to older splits that didn't support them\n            (\"body:\\\"hello world\\\"\", Ok(&[])),\n            (\"body:\\\"hello-world\\\"\", Ok(&[])),\n            (\"body:\\\"hello-worl\\\"*\", Ok(&[])),\n            (\"body:bonjour\", Ok(&[])),\n            (\"body:monde\", Ok(&[])),\n            // the raw tokenizer only returns exact matches\n            (\"body:\\\"bonjour monde\\\"\", Ok(&[])),\n            (\n                \"body:\\\"bonjour-monde\\\"\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n            (\n                \"body:\\\"bonjour-mond\\\"*\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_tokenizer_add_position() {\n    let index_id = \"update-tokenizer-add-position\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n   
         {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"default\"}\n        ]\n    });\n    let ingest_before_update = &[json!({\"body\": \"hello-world\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"default\", \"record\": \"position\"}\n        ]\n    });\n    let ingest_after_update = &[json!({\"body\": \"bonjour-monde\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello-world\"})])),\n            (\"body:world\", Ok(&[json!({\"body\": \"hello-world\"})])),\n            // phrases queries don't apply to older splits that didn't support them\n            (\"body:\\\"hello-world\\\"\", Ok(&[])),\n            (\"body:\\\"hello world\\\"\", Ok(&[])),\n            (\"body:\\\"hello-worl\\\"*\", Ok(&[])),\n            (\"body:bonjour\", Ok(&[json!({\"body\": \"bonjour-monde\"})])),\n            (\"body:monde\", Ok(&[json!({\"body\": \"bonjour-monde\"})])),\n            (\n                \"body:\\\"bonjour-monde\\\"\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n            (\n                \"body:\\\"bonjour monde\\\"\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n            (\n                \"body:\\\"bonjour-mond\\\"*\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_tokenizer_raw_to_phrase() {\n    let index_id = \"update-tokenizer-raw-to-phrase\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"raw\"}\n        ]\n    });\n    let ingest_before_update = 
&[json!({\"body\": \"hello-world\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"default\", \"record\": \"position\"}\n        ]\n    });\n    let ingest_after_update = &[json!({\"body\": \"bonjour-monde\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[])),\n            (\"body:world\", Ok(&[])),\n            // raw tokenizer used here, only exact matches returned\n            (\n                \"body:\\\"hello-world\\\"\",\n                Ok(&[json!({\"body\": \"hello-world\"})]),\n            ),\n            (\"body:\\\"hello world\\\"\", Ok(&[])),\n            (\"body:bonjour\", Ok(&[json!({\"body\": \"bonjour-monde\"})])),\n            (\"body:monde\", Ok(&[json!({\"body\": \"bonjour-monde\"})])),\n            (\n                \"body:\\\"bonjour-monde\\\"\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n            (\n                \"body:\\\"bonjour monde\\\"\",\n                Ok(&[json!({\"body\": \"bonjour-monde\"})]),\n            ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_unindexed_to_indexed() {\n    let index_id = \"update-not-indexed-to-indexed\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"indexed\": false}\n        ]\n    });\n    let ingest_before_update = &[json!({\"body\": \"hello\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\", \"tokenizer\": \"raw\"}\n        ]\n    });\n    let ingest_after_update = &[json!({\"body\": \"bonjour\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n     
   original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            // term query won't apply to older splits that weren't indexed\n            (\"body:hello\", Ok(&[])),\n            (\"body:IN [hello]\", Ok(&[])),\n            // works on newer data\n            (\"body:bonjour\", Ok(&[json!({\"body\": \"bonjour\"})])),\n            (\"body:IN [bonjour]\", Ok(&[json!({\"body\": \"bonjour\"})])),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_strict_to_dynamic() {\n    let index_id = \"update-strict-to-dynamic\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\"}\n        ],\n        \"mode\": \"strict\",\n    });\n    let ingest_before_update = &[json!({\"body\": \"hello\"})];\n    let updated_doc_mappings = json!({\n        \"mode\": \"dynamic\",\n    });\n    let ingest_after_update = &[json!({\"body\": \"world\", \"title\": \"salutations\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello\"})])),\n            (\n                \"body:world\",\n                Ok(&[json!({\"body\": \"world\", \"title\": \"salutations\"})]),\n            ),\n            (\n                \"title:salutations\",\n                Ok(&[json!({\"body\": \"world\", \"title\": \"salutations\"})]),\n            ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_dynamic_to_strict() {\n    let index_id = \"update-dynamic-to-strict\";\n    let original_doc_mappings = json!({\n        \"mode\": \"dynamic\",\n    });\n    let ingest_before_update = &[json!({\"body\": \"hello\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": 
[\n            {\"name\": \"body\", \"type\": \"text\"}\n        ],\n        \"mode\": \"strict\",\n    });\n    let ingest_after_update = &[json!({\"body\": \"world\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello\"})])),\n            (\"body:world\", Ok(&[json!({\"body\": \"world\"})])),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\nasync fn test_update_doc_mapping_add_field_on_strict() {\n    let index_id = \"update-add-field-on-strict\";\n    let original_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\"},\n        ],\n        \"mode\": \"strict\",\n    });\n    let ingest_before_update = &[json!({\"body\": \"hello\"})];\n    let updated_doc_mappings = json!({\n        \"field_mappings\": [\n            {\"name\": \"body\", \"type\": \"text\"},\n            {\"name\": \"title\", \"type\": \"text\"},\n        ],\n        \"mode\": \"strict\",\n    });\n    let ingest_after_update = &[json!({\"body\": \"world\", \"title\": \"salutations\"})];\n    validate_search_across_doc_mapping_updates(\n        index_id,\n        original_doc_mappings,\n        ingest_before_update,\n        updated_doc_mappings,\n        ingest_after_update,\n        &[\n            (\"body:hello\", Ok(&[json!({\"body\": \"hello\"})])),\n            (\n                \"body:world\",\n                Ok(&[json!({\"body\": \"world\", \"title\": \"salutations\"})]),\n            ),\n            (\n                \"title:salutations\",\n                Ok(&[json!({\"body\": \"world\", \"title\": \"salutations\"})]),\n            ),\n        ],\n    )\n    .await;\n}\n\n#[tokio::test]\n#[ignore]\n// TODO(#5738)\nasync fn test_update_doc_validation() {\n    quickwit_common::setup_logging_for_tests();\n    let 
index_id = \"update-doc-validation\";\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([\n            QuickwitService::Searcher,\n            QuickwitService::Metastore,\n            QuickwitService::Indexer,\n            QuickwitService::ControlPlane,\n            QuickwitService::Janitor,\n        ])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    // Create index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            json!({\n                \"version\": \"0.8\",\n                \"index_id\": index_id,\n                \"doc_mapping\": {\n                    \"field_mappings\": [\n                        {\"name\": \"body\", \"type\": \"u64\"}\n                    ]\n                },\n                \"indexing_settings\": {\n                    \"commit_timeout_secs\": 1\n                },\n            })\n            .to_string(),\n            quickwit_config::ConfigFormat::Json,\n            false,\n        )\n        .await\n        .unwrap();\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Wait until indexing pipelines are started.\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    let unsigned_payload = (0..20).fold(String::new(), |mut buffer, id| {\n        writeln!(&mut buffer, \"{{\\\"body\\\": {id}}}\").unwrap();\n        buffer\n    });\n\n    let 
unsigned_response = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            IngestSource::Str(unsigned_payload.clone()),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n\n    assert_eq!(unsigned_response.num_rejected_docs.unwrap(), 0);\n\n    sandbox\n        .rest_client(QuickwitService::Searcher)\n        .indexes()\n        .update(\n            index_id,\n            json!({\n                \"version\": \"0.8\",\n                \"index_id\": index_id,\n                \"doc_mapping\": {\n                    \"field_mappings\": [\n                        {\"name\": \"body\", \"type\": \"i64\"}\n                    ]\n                },\n                \"indexing_settings\": {\n                    \"commit_timeout_secs\": 1,\n                },\n            })\n            .to_string(),\n            quickwit_config::ConfigFormat::Json,\n            false,\n        )\n        .await\n        .unwrap();\n\n    let signed_payload = (-20..0).fold(String::new(), |mut buffer, id| {\n        writeln!(&mut buffer, \"{{\\\"body\\\": {id}}}\").unwrap();\n        buffer\n    });\n\n    let signed_response = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            IngestSource::Str(signed_payload.clone()),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n\n    assert_eq!(signed_response.num_rejected_docs.unwrap(), 0);\n\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/update_tests/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::service::QuickwitService;\nuse quickwit_serve::SearchRequestQueryString;\nuse serde_json::Value;\n\nuse crate::test_utils::ClusterSandbox;\n\n/// Checks that the result of the given query matches the expected values\nasync fn assert_hits_unordered(\n    sandbox: &ClusterSandbox,\n    index_id: &str,\n    query: &str,\n    expected_result: Result<&[Value], ()>,\n) {\n    let search_res = sandbox\n        .rest_client(QuickwitService::Searcher)\n        .search(\n            index_id,\n            SearchRequestQueryString {\n                query: query.to_string(),\n                max_hits: expected_result.map(|hits| hits.len() as u64).unwrap_or(1),\n                ..Default::default()\n            },\n        )\n        .await;\n    if let Ok(expected_hits) = expected_result {\n        let resp = search_res.unwrap_or_else(|err| panic!(\"query: {query}, error: {err}\"));\n        assert_eq!(resp.errors.len(), 0, \"query: {query}\");\n        assert_eq!(resp.num_hits, expected_hits.len() as u64, \"query: {query}\");\n        for expected_hit in expected_hits {\n            assert!(\n                resp.hits.contains(expected_hit),\n                \"query: {} -> expected hits: {:?}, got: {:?}\",\n                query,\n                expected_hits,\n                resp.hits\n            );\n        }\n    } else if let 
Ok(search_response) = search_res {\n        assert!(!search_response.errors.is_empty(), \"query: {query}\");\n    }\n}\n\nmod create_on_update;\nmod doc_mapping_tests;\nmod restart_indexer_tests;\nmod search_settings_tests;\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/update_tests/restart_indexer_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Write;\nuse std::time::Duration;\n\nuse quickwit_config::service::QuickwitService;\nuse quickwit_metastore::SplitState;\nuse quickwit_proto::types::DocMappingUid;\nuse quickwit_rest_client::models::IngestSource;\nuse quickwit_rest_client::rest_client::CommitType;\nuse quickwit_serve::ListSplitsQueryParams;\nuse serde_json::json;\n\nuse crate::test_utils::ClusterSandboxBuilder;\n\n#[tokio::test]\nasync fn test_update_doc_mapping_restart_indexing_pipeline() {\n    let index_id = \"update-restart-ingest\";\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([\n            QuickwitService::Searcher,\n            QuickwitService::Metastore,\n            QuickwitService::Indexer,\n            QuickwitService::ControlPlane,\n            QuickwitService::Janitor,\n        ])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    // 
usually these are chosen by quickwit, but the client can also specify them,\n    // which we do here to simplify the test\n    let initial_mapping_uid = DocMappingUid::for_test(1);\n    let final_mapping_uid = DocMappingUid::for_test(2);\n\n    // Create index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            json!({\n                \"version\": \"0.8\",\n                \"index_id\": index_id,\n                \"doc_mapping\": {\n                    \"doc_mapping_uid\": initial_mapping_uid,\n                    \"field_mappings\": [\n                        {\"name\": \"body\", \"type\": \"u64\"}\n                    ]\n                },\n                \"indexing_settings\": {\n                    \"commit_timeout_secs\": 1\n                },\n            })\n            .to_string(),\n            quickwit_config::ConfigFormat::Json,\n            false,\n        )\n        .await\n        .unwrap();\n\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Wait until indexing pipelines are started.\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    let payload = (0..1000).fold(String::new(), |mut buffer, id| {\n        writeln!(&mut buffer, \"{{\\\"body\\\": {id}}}\").unwrap();\n        buffer\n    });\n\n    // ingest some documents with old doc mapping.\n    // we *don't* use local ingest to use a normal indexing pipeline\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            IngestSource::Str(payload.clone()),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n\n    // we wait for a new split. 
We don't want to force commits to let the pipeline behave as if in\n    // a steady state.\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 1)\n        .await\n        .unwrap();\n\n    // we ingest again, this might end up with the new or old doc mapping depending on how quickly\n    // the pipeline gets killed and restarted (in practice as this cluster is very lightly loaded,\n    // it will almost always kill the pipeline before these documents are committed)\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            IngestSource::Str(payload.clone()),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n\n    // Update index\n    sandbox\n        .rest_client(QuickwitService::Searcher)\n        .indexes()\n        .update(\n            index_id,\n            json!({\n                \"version\": \"0.8\",\n                \"index_id\": index_id,\n                \"doc_mapping\": {\n                    \"doc_mapping_uid\": final_mapping_uid,\n                    \"field_mappings\": [\n                        {\"name\": \"body\", \"type\": \"i64\"}\n                    ]\n                },\n                \"indexing_settings\": {\n                    \"commit_timeout_secs\": 1,\n                },\n            })\n            .to_string(),\n            quickwit_config::ConfigFormat::Json,\n            false,\n        )\n        .await\n        .unwrap();\n\n    // we ingest again, this might end up with the new or old doc mapping depending on how quickly\n    // the pipeline gets killed and restarted. 
In practice this will almost always use the new\n    // mapping on a lightly loaded cluster.\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            IngestSource::Str(payload.clone()),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n\n    // we wait for a 2nd split, though it might still be there if it contains only batch 2 and not\n    // batch 3.\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 2)\n        .await\n        .unwrap();\n\n    // we ingest again, definitely with the up to date doc mapper this time\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .ingest(\n            index_id,\n            IngestSource::Str(payload.clone()),\n            None,\n            None,\n            CommitType::Auto,\n        )\n        .await\n        .unwrap();\n\n    // wait for a last commit\n    sandbox\n        .wait_for_splits(index_id, Some(vec![SplitState::Published]), 3)\n        .await\n        .unwrap();\n\n    let splits = sandbox\n        .rest_client(QuickwitService::Indexer)\n        .splits(index_id)\n        .list(ListSplitsQueryParams::default())\n        .await\n        .unwrap();\n\n    // we expect 3 splits, with all docs, and at least one split under old mapping and one under\n    // new mapping\n    assert_eq!(splits.len(), 3);\n    assert!(\n        splits\n            .iter()\n            .filter(|split| split.split_metadata.doc_mapping_uid == initial_mapping_uid)\n            .count()\n            > 0\n    );\n    assert!(\n        splits\n            .iter()\n            .filter(|split| split.split_metadata.doc_mapping_uid == final_mapping_uid)\n            .count()\n            > 0\n    );\n    assert_eq!(\n        splits\n            .iter()\n            .map(|split| split.split_metadata.num_docs)\n            .sum::<usize>(),\n        4000\n    );\n\n    
sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/src/tests/update_tests/search_settings_tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse quickwit_config::service::QuickwitService;\nuse quickwit_rest_client::rest_client::CommitType;\nuse serde_json::json;\n\nuse super::assert_hits_unordered;\nuse crate::ingest_json;\nuse crate::test_utils::{ClusterSandboxBuilder, ingest};\n\n#[tokio::test]\nasync fn test_update_search_settings_on_multi_nodes_cluster() {\n    quickwit_common::setup_logging_for_tests();\n    let sandbox = ClusterSandboxBuilder::default()\n        .add_node([QuickwitService::Searcher])\n        .add_node([QuickwitService::Metastore])\n        .add_node([QuickwitService::Indexer])\n        .add_node([QuickwitService::ControlPlane])\n        .add_node([QuickwitService::Janitor])\n        .build_and_start()\n        .await;\n\n    {\n        // Wait for indexer to fully start.\n        // The starting time is a bit long for a cluster.\n        tokio::time::sleep(Duration::from_secs(3)).await;\n        let indexing_service_counters = sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_stats()\n            .indexing()\n            .await\n            .unwrap();\n        assert_eq!(indexing_service_counters.num_running_pipelines, 0);\n    }\n\n    // Create an index\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .create(\n            r#\"\n            version: 0.8\n       
     index_id: my-updatable-index\n            doc_mapping:\n              field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            search_settings:\n              default_search_fields: [title]\n            \"#,\n            quickwit_config::ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n    assert!(\n        sandbox\n            .rest_client(QuickwitService::Indexer)\n            .node_health()\n            .is_live()\n            .await\n            .unwrap()\n    );\n\n    // Wait until indexing pipelines are started\n    sandbox.wait_for_indexing_pipelines(1).await.unwrap();\n\n    ingest(\n        &sandbox.rest_client(QuickwitService::Indexer),\n        \"my-updatable-index\",\n        ingest_json!({\"title\": \"first\", \"body\": \"first record\"}),\n        CommitType::Auto,\n    )\n    .await\n    .unwrap();\n\n    // Wait until split is committed\n    tokio::time::sleep(Duration::from_secs(4)).await;\n\n    // No hit because `default_search_fields`` only covers the `title` field\n    assert_hits_unordered(&sandbox, \"my-updatable-index\", \"record\", Ok(&[])).await;\n\n    // Update the index to also search `body` by default, the same search should\n    // now have 1 hit\n    sandbox\n        .rest_client(QuickwitService::Indexer)\n        .indexes()\n        .update(\n            \"my-updatable-index\",\n            r#\"\n            version: 0.8\n            index_id: my-updatable-index\n            doc_mapping:\n              field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n            indexing_settings:\n              commit_timeout_secs: 1\n            search_settings:\n              default_search_fields: [title, body]\n            \"#,\n            
quickwit_config::ConfigFormat::Yaml,\n            false,\n        )\n        .await\n        .unwrap();\n\n    assert_hits_unordered(\n        &sandbox,\n        \"my-updatable-index\",\n        \"record\",\n        Ok(&[json!({\"title\": \"first\", \"body\": \"first record\"})]),\n    )\n    .await;\n\n    sandbox.shutdown().await.unwrap();\n}\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/README.md",
    "content": "some of the assets containing in this directory are test certificates and corresponding private key.\nIt's not unusual for automatic scanners to pick them up and warn about leaked private keys. These\nkeys are not meant to be private, so if that happen, feel free to ignore the messages.\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/ca.crt",
    "content": "-----BEGIN CERTIFICATE-----\nMIIFCzCCAvOgAwIBAgIUI70aP6K+1YrhuCysdEmmY9LcdXEwDQYJKoZIhvcNAQEL\nBQAwFTETMBEGA1UEAwwKcXcgdGVzdCBDQTAeFw0yNDEyMDkxMTM4MDVaFw0zNDEy\nMTAxMTM4MDVaMBUxEzARBgNVBAMMCnF3IHRlc3QgQ0EwggIiMA0GCSqGSIb3DQEB\nAQUAA4ICDwAwggIKAoICAQDWZLa7Tzkm+hs9itOG4dGsGAOP6FmY7zQtpq2cvi62\nGfzQZMrNyhNB49zRPKPmwZ7dTrqP65OHZt6JPn/Pw12reWRGIFHFnw87ybdqCi3W\nM9bz6UYdvn5r3g+VrIVDJ0NlBKjv2ZGvv/lpzP3GbbT8hX1UThee8k8SdMMuEQzg\nrepZnOlby5sfIAZ+40pZagjln+w+wUvzupQpgHGRRPokWVrA6ej4Gx/6mRVFTT4S\ndQABfAI6mAeQfkiK0AoN2DuujvDM6noqelePtQCD7X4XCj8QTw+qM8meDLNY92BK\n2Y3zMnbBpnLfX1zA6hVWhBeBAfjFuCTxgh54RWGZm7vSFefwtA0c9xV4OXIMlZ95\n3OB2W7pB+xGPIaJfH4YOFnAPWrDMBf1HXX4HWPgxahpntgJLwhOpR5FfeKaq8liK\nPmFu2nr7PYcQ4yWDyFOy5hWSdsYejwo5IJiQryC+sitPaBIZrXl0Q29vGPqLUn5s\nemP0XTsWPrZsm+RYz25Ux+DHEHzWCnRq16Qlye7cBrjJECCuD4dYCR/t8pfWeX4+\nvICqaq46QZk5C5hFjPPjrP5ZtmjfScp//TuxyBYuY5D6ZzfUnsm/RfEiWicjOp4/\ngWkqsDE1SwOzw0YZFVeZ+c3TBG8A/XxbZ7csFrzzmoGMnF/1bjSpGOAB4M/CubaN\ndwIDAQABo1MwUTAdBgNVHQ4EFgQUKihfkQ1qc1VhyRkgnQfvHksi5iswHwYDVR0j\nBBgwFoAUKihfkQ1qc1VhyRkgnQfvHksi5iswDwYDVR0TAQH/BAUwAwEB/zANBgkq\nhkiG9w0BAQsFAAOCAgEA0jJTDB7hgwJnvWPvc4kKBQGnsIYt4dQFn377GLqPIB8Y\nSLdBVVCRQCyssvqEMZe/+38SX8u3UVnBnnVlxuGqBTZJo7KN4geRkruXFVdnSOHF\nlHslvPVp2KsYcpjqRspt8NRIkgvx5YGd02q2NUIIyY8JeEkt1B3QurPaUIpTlzWS\n6dxVgo1Byyoxoe9BJ/M9bH+73mMTJvSjVKWGIJFvCdNV6twjyoWJeXkL696zquhu\n0YdHYxS65AU0NAflBCxgxREr5dql0TruxwLG/6THAJcqmMLUwxyDYIP4P+1k4xtV\nKEGKqESdZTUp61AhAOAdsUuK5Iz4HUTb0uB1fzZMVjxA+03VC0/pcum6qP15M+xg\nchnQ8fNo0iZCDr3LD4kwsAqoWOQ4oV+ZLBthh6gVmHtuTS3ehMTH99r6T2II+8R1\n5rPtE14uelgZJVcxD7z+xWDfODJIhxlWjwxJkBcCkvaGMQU3+kTUe5U6EJL52Z5q\nG/Mh4uNceLx1uSuh0/R68AXj5LLmShjSYO7rV7KlFGS0B7iglsgwgs74GUQzt0nq\n1Qe7QGsVoSVIabq0Es2gUjiEeRJYg6X7Gy9tAESY//zYW3Qh26/sqGgkYypa9Hy9\nw0T+IrQAj2K+LkOEoFwBOdx5qyRlRdd1l7xioJjhjHA+u+e3HBf6fk/VVuFXgBc=\n-----END CERTIFICATE-----\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/ca.key",
    "content": "-----BEGIN PRIVATE KEY-----\nMIIJRAIBADANBgkqhkiG9w0BAQEFAASCCS4wggkqAgEAAoICAQDWZLa7Tzkm+hs9\nitOG4dGsGAOP6FmY7zQtpq2cvi62GfzQZMrNyhNB49zRPKPmwZ7dTrqP65OHZt6J\nPn/Pw12reWRGIFHFnw87ybdqCi3WM9bz6UYdvn5r3g+VrIVDJ0NlBKjv2ZGvv/lp\nzP3GbbT8hX1UThee8k8SdMMuEQzgrepZnOlby5sfIAZ+40pZagjln+w+wUvzupQp\ngHGRRPokWVrA6ej4Gx/6mRVFTT4SdQABfAI6mAeQfkiK0AoN2DuujvDM6noqeleP\ntQCD7X4XCj8QTw+qM8meDLNY92BK2Y3zMnbBpnLfX1zA6hVWhBeBAfjFuCTxgh54\nRWGZm7vSFefwtA0c9xV4OXIMlZ953OB2W7pB+xGPIaJfH4YOFnAPWrDMBf1HXX4H\nWPgxahpntgJLwhOpR5FfeKaq8liKPmFu2nr7PYcQ4yWDyFOy5hWSdsYejwo5IJiQ\nryC+sitPaBIZrXl0Q29vGPqLUn5semP0XTsWPrZsm+RYz25Ux+DHEHzWCnRq16Ql\nye7cBrjJECCuD4dYCR/t8pfWeX4+vICqaq46QZk5C5hFjPPjrP5ZtmjfScp//Tux\nyBYuY5D6ZzfUnsm/RfEiWicjOp4/gWkqsDE1SwOzw0YZFVeZ+c3TBG8A/XxbZ7cs\nFrzzmoGMnF/1bjSpGOAB4M/CubaNdwIDAQABAoICAAdp94ZmCJ7tb+W37aAGInRm\nEYbMzSm5vpSEOrq8j5b33RiZHl5GylWxUIs3zUnckObW+Hf4+I1qeZrwu1tu8R1N\n7FwfH/91LwICnsIe/1N0pXlNLCyOetwLaNSPNmiKbLfglQr+UMhtcMLqNi3CJiEZ\naKpyrORv6AYbsugz6hMW+zoFcBh0OQq21ngAL4EEvAayJL74JkKTxYM4kDTVgs+g\n+2odF1OvycemZZLo8eg80uGPl/Ajc8Iua8nJGfU9RgsU9wTPEvxxu0zFQJ2kp9ia\nH8YrmxS9XQ/cdpNGsrYZQeG9cVfkWVKaKfX4jQEEk7TVYZ9MwpJLd/bxcfJC7W1t\nKjHStiIFuvp5pqfKYgGZPnVdEe8T43naQopvB/IJ8TYp3Jy7Pxl+FP7sHN2F4sgn\ntHDdvuWTSQT9DvInvuEzMkFxI0SN1kt/GEA7xg4bd/TUCZgkIS92xxxKG8c/3a2O\nBbXc5RpaGreDZn9cYtZ4nihJj7lsqouHAGcVz2oljZaWlJIkOzE/rDHDl2hQtvUg\noTQvsbsFZjm3tf3ZPoDjM5GCM8Z+BG61KAyIuUlZ9QgYANgUMd+XoM+NMpGOmbhZ\npLaCUzSKpVKt58rqepW3uFh+cQFi0kDkT5c9sNHrZtv9xZy0MzuzcErda/fJhASQ\nm9xX9dbqsgIoJWsv5NnJAoIBAQD//JOQsTsZSGGPptaV6ctieat2jkemdVGdtGuO\n9XjmC26htA0Unv+NBzeBfQ2Lf3nsap9ksw2/abnLAi8Bq6S/utQCVjoh5Kwg3m2d\nj/rBDSd+GSPNH31yPki7tLsiLgWZGEo5sYu5bdaHGTXXD6SwY9xAZUT2CN9U83nR\n+2OvOvd+oZvKok4DCstPKIYI5oDYgDrmII/4kKjaBfXU3qxWJ7jr4aKOCpUM1Xtd\ndTCaDwKWOV4DyNGzn4VjNhLh/QHL1v+EbtZHJp2vqj8X+cLJDCMLJMqTPa/+2FvX\nzj4orbh5Yo5APGs0QjT/67CeWbjHoczcZwTaJLY67zdoLGBLAoIBAQDWZ5TC+SZL\n71REYW68rGIqa8T2Md9kLHD6h0UOPnReYFLPpE7nmQKp4+25wmuDTQVg2db3Zzxc\n+be+JEpi8nJreMjsnyj6x9ufLST/2Hq2Ns9amZsdC
XdFPK2zeuvrTh5ahieQ9ywQ\nflg1VDOIXN+46Q+/dJmgfLtz5/AFyf/cGDiZ+KpmrZxWYAjV1kYDU1XbQ+IFwP3b\nBweULbIw1ugEeXhkj5ecoqCnIhX33VmCMVbu+wuuFW9QOqIFEtpsYs1qf4CH6zW/\ngyrgWmJHTBgnPj5XIGRpa+gkUB6dqCCQdpW/F3bdVj1jY6+ef6UXNlCKe0FT2BXR\n9jNrfP2iVoQFAoIBAQCsAwcNpWo29QJJyyxKlE2MoIFtKvJOkmsDc+cKqzxQKMJw\nelKH1seV9pF/u45MfJ5rFMKCoibMxriIB7Gah8Iu69XmtBZgDA72D0DNLaCr9LDi\n9PWvskdTazLonutYbmBonX/TANEJCxuqsHATUXmy5Yds5h/Oy+t2ZB0p0qkLaK5C\nM0pCgYm2VZyEVpCqjmlqEdCCLsNPnbU4u+SS5AYd5pdGOdpHZCj/9Lvu8v5zpz5v\nv6DDHkB7WOgC4KUTojWAybntPaVTLkmrbtTywWv4OOWbaV+OTVdkAfLFMttl7kYV\nmvpHg9HtzcdbaP7HiTa3PqwwNjF2fFDPjUtH/vm9AoIBAQCeGh2ptN4XqrFCB3MI\nMDnnPDcusNIhZWAebfdvLIDVQ0Rtl6UTxVIgg1I+4+4yEW7A34JUR81MZlynGs27\nrzrOo2/OhQNMAmqiM0EQZMsAaOR408J9JAjOhpM0QZWMm7toV3r/vDTDKNfU43Vo\nvcu/6CTTsqDCppf4PXVSX4WMAFRkveix9J3PV9vMC8jvFNm/6YvXYFwR0lo1W4kF\n2MOY4RX1WamcOJQtCsaWU6R4i/emHHudcHL3/3SQNznYKPd+6+yUzc6BnbDVZfEI\n0EUTUyPXTaydzJOPi4E4Es2ImdmM4zmkt75m9xB+2XOc7VFw/LjMohBdFqcOQUor\nFo8dAoIBAQCsBBs7zNwOhVLBiq1BKMCSQF22snz/0jGgA+pNjNkbZspbVNrRQh3X\nZeBW1akN3z1uueAydMJvkhJjVp2Ub1e3xlQZK71NjT1HscttuZ7FhuGm04r7rWcZ\nYtKtakOAuHmrTAaDR0akgPJvdblyhDrB9xjJHZAWyky/WqOwtrVQHE9VOWp5JOBd\nMa++rCEV0CSs/c+Mi5i0EMw3gJrE+x1veWIEbcDT4O6z4oN89GvswvMZxjEIUPlZ\nIgnt0ylL665EKdoNfSDBt83XmixaHpNSkIRL/QePMUrSfMVeGwbuqb7v3BBbrGxq\n7wK4WKdoN0S5t/D1Sv5mBCaf1RGPP8XQ\n-----END PRIVATE KEY-----\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/ca.srl",
    "content": "083453D85C20FF26BD4F878E64C2F3C62316E7A5\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/regenerate-certs.sh",
    "content": "#!/usr/bin/env bash\n\n# this script regenerate cryptographic material used in tests. These are valid for 10y, but better\n# keep how to regenerate them than get stuck with failing tests eventually\n\nrm ca.{crt,key,srl} server.{csr,key,v3.ext,crt}\nopenssl genrsa -out ca.key 4096\nopenssl req -x509 -new -nodes -key ca.key -sha256 -days 3653 -out ca.crt -subj '/CN=qw test CA'\nopenssl req -new -nodes -out server.csr -newkey rsa:4096 -keyout server.key -subj '/CN=qw test certificate'\n\ncat > server.v3.ext << EOF\nauthorityKeyIdentifier=keyid,issuer\nbasicConstraints=CA:FALSE\nkeyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment\nsubjectAltName = @alt_names\n[alt_names]\nDNS.1 = quickwit.local\nIP.1 = 127.0.0.1\nEOF\n\nopenssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 3653 -sha256 -extfile server.v3.ext\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/server.crt",
    "content": "-----BEGIN CERTIFICATE-----\nMIIFPDCCAySgAwIBAgIUCDRT2Fwg/ya9T4eOZMLzxiMW56UwDQYJKoZIhvcNAQEL\nBQAwFTETMBEGA1UEAwwKcXcgdGVzdCBDQTAeFw0yNDEyMDkxMTM4MDVaFw0zNDEy\nMTAxMTM4MDVaMB4xHDAaBgNVBAMME3F3IHRlc3QgY2VydGlmaWNhdGUwggIiMA0G\nCSqGSIb3DQEBAQUAA4ICDwAwggIKAoICAQCJTizLx2PE0AyHhBBtctCwNw+3JR1Z\njTYmcgcdg4QEN/QE5VvUr3Cf8GZWZLjkzO2zQHExsmofcjGhnfayYnKTecWUryD8\nWfC1fGwUGw+X7oef16f/hi2iQokk56WNBfB/rsi5tOH4cLZRszPiuPTP1hLJPpAN\nPhmAXqjors2HptMCjwvoD+J6Rjd+H8tflyztSV8GbIRj2Xlkbj9kDclP6Ou9/Ie8\n+omaSDXYPefv5jYIDF+w914wfqVn5bCrNSOtVQUOqazBZIKfqmmQxY7uggfdsk8p\nu5eCGvA5Ql6PGOCrjSsmw3wuKpxJHS9j+Sg8n65fyB5v4DQxc98RQi9ryGynVF1+\nYyuSnQN8A7CGPT8n+dwtApZMishrh8PMCJb9HYHrp1fnzmYcq8hLr0D3BxYPF7vl\nwjhq4D5LiZgHLtKtWc30CY2+wjhqQC+FiPq35+XiSvjnNPWD/8oq0sIGJaTnBxVF\naGDhXH+i/HkIlbgDTaPAr9Kt8ZsUmc76LujuBcEyqQKFrQt9kce8xiJHcEz6F5sD\n+VGiNrbsme1FE+W16M5CggkT5bjPtiXbcxBPTcByl2qbt2/Hh1fGjiihoC3Gl/0x\nUjWJ4QvrG1fshW7Ylzf146genDg987QD9bw6ulCNJ/aT6yGZiBNh3zYIar8DX7sn\nnTg2405jfIqiWQIDAQABo3sweTAfBgNVHSMEGDAWgBQqKF+RDWpzVWHJGSCdB+8e\nSyLmKzAJBgNVHRMEAjAAMAsGA1UdDwQEAwIE8DAfBgNVHREEGDAWgg5xdWlja3dp\ndC5sb2NhbIcEfwAAATAdBgNVHQ4EFgQULWzIQbbWgJ3ZgAx2wNy9Bl5LROIwDQYJ\nKoZIhvcNAQELBQADggIBAJgIrxW3tEKsK/gaiSi5Lpng/LvSv2I9/Q7bnDGTKiLN\nS6qcdyoiByu/88T1mK+kXyFzY2JSFSaLQgXxDip5kaPY+J8ySRkCii2NuMfMhfTP\n/E2t+UEoXW8X87FRvAzGy2jbIPcFkkJE784TYsQhD6bjcKKTXvnAB7pCgu3zz1Xv\nzmpN7vmcYwpkWbM2mYlzDYvhNs6XRGTKc0u4ho7VMyqYYBdXyT8KknWxzDLCXkhA\nFahUPtm+63WyYDumm4dCuLIk3QjC/kYTcexhZTTlpHv6cnL2YOxcqUEpjCQHCisu\n76f8mw9nA4Hm9SAHll2P9lT2cMs8edwhPfAKEk1xlLNvIz2QaG65YbifBPQAOO7S\nDhn32Vm7TIUPbgiP+TrTNjhHICcIALfsz32UuTM5r4VwvkODX64+1hN7GAHlBQPf\nIQXogRlP2Dv6Gecrnr4HzP4kftgdDvGq+ULGPYzMszI1mVgrJLgO/1KTqVfN0eLu\nByqq14OodZKd+9RPEMrom+iSNmfgBffmL1zrBmGnUiHgMuzHrc2Eo3bOCguWIwRO\nGrJZXmC+ldpy4XXzzulnzX40sgp7LXg7oVgQojWPZ4fEXEdszfb3yAdPqSjg8D73\n8T/Z6edpFuVq86y4EM99xpd1I+THLryto6ebSYHOXlj+1fbMbam1kGZo8HE5+oHa\n-----END CERTIFICATE-----\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/server.csr",
    "content": "-----BEGIN CERTIFICATE REQUEST-----\nMIIEYzCCAksCAQAwHjEcMBoGA1UEAwwTcXcgdGVzdCBjZXJ0aWZpY2F0ZTCCAiIw\nDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAIlOLMvHY8TQDIeEEG1y0LA3D7cl\nHVmNNiZyBx2DhAQ39ATlW9SvcJ/wZlZkuOTM7bNAcTGyah9yMaGd9rJicpN5xZSv\nIPxZ8LV8bBQbD5fuh5/Xp/+GLaJCiSTnpY0F8H+uyLm04fhwtlGzM+K49M/WEsk+\nkA0+GYBeqOiuzYem0wKPC+gP4npGN34fy1+XLO1JXwZshGPZeWRuP2QNyU/o6738\nh7z6iZpINdg95+/mNggMX7D3XjB+pWflsKs1I61VBQ6prMFkgp+qaZDFju6CB92y\nTym7l4Ia8DlCXo8Y4KuNKybDfC4qnEkdL2P5KDyfrl/IHm/gNDFz3xFCL2vIbKdU\nXX5jK5KdA3wDsIY9Pyf53C0ClkyKyGuHw8wIlv0dgeunV+fOZhyryEuvQPcHFg8X\nu+XCOGrgPkuJmAcu0q1ZzfQJjb7COGpAL4WI+rfn5eJK+Oc09YP/yirSwgYlpOcH\nFUVoYOFcf6L8eQiVuANNo8Cv0q3xmxSZzvou6O4FwTKpAoWtC32Rx7zGIkdwTPoX\nmwP5UaI2tuyZ7UUT5bXozkKCCRPluM+2JdtzEE9NwHKXapu3b8eHV8aOKKGgLcaX\n/TFSNYnhC+sbV+yFbtiXN/XjqB6cOD3ztAP1vDq6UI0n9pPrIZmIE2HfNghqvwNf\nuyedODbjTmN8iqJZAgMBAAGgADANBgkqhkiG9w0BAQsFAAOCAgEAfHtBglg3NED4\n1MqoZmS8Q2DEAMgiIiq+CwpPW6yHV2BNBzfxfDezkrzH5b2oNg//IZ2ftCKV86jh\nf9cszzhcbW3hre2tIg9CIC61qhp2MoPXeijneNYpGfXAvdXxs8VQB064ZdBj6ZbM\nZfIUdi+C9eb+kcgHBJ6fv4TSuik72f9bY0K7Kem0YoP4aXQ4aeZUkZFKZe2kw3hR\n6jhSejHOrkNslXQhQIjjP8t3bx0vU7BdRniz/J0Dq+L/96v/KkGXNk39z5VDF6Ce\nwl11KgeezZVNgczk2Xed7tQ/Uf05ptE9re+hYc3tRLW61VIpk93sZAOikm+69su3\nIfLPo4Vq6gT1bxIErgwP5kjTb1gQWb+r1g5Yqyrvpiue3ReJ5OpZU6JQic61cOBH\nlY50X34DeTzCMeGXbVYMpGybRGRQrMeK4RvULHM5s/FjN56HqpdDqDxH0UoVCF8w\nvjJDqqvKHOFlBCUCu5CXBFHDjb6pev6kLP9G9JqnavmDzKdH4+pqp2kH4aOVrYl5\nolXiz7l5JWEpwIXvkZa3ve6/cyXwDafbDzcVoq+dgLk8cMqT5uIqJdPjJXO2BCa3\n0VzdC8CN8XR9hc3XKbD59zfZJRtw7qD57/EbV3kOTpE+ywCxRRMcjIBgz1ueOwfg\npHiu1g/7+yuiUEmZLOJLeODe6y5Ij3Q=\n-----END CERTIFICATE REQUEST-----\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/server.key",
    "content": "-----BEGIN PRIVATE KEY-----\nMIIJQQIBADANBgkqhkiG9w0BAQEFAASCCSswggknAgEAAoICAQCJTizLx2PE0AyH\nhBBtctCwNw+3JR1ZjTYmcgcdg4QEN/QE5VvUr3Cf8GZWZLjkzO2zQHExsmofcjGh\nnfayYnKTecWUryD8WfC1fGwUGw+X7oef16f/hi2iQokk56WNBfB/rsi5tOH4cLZR\nszPiuPTP1hLJPpANPhmAXqjors2HptMCjwvoD+J6Rjd+H8tflyztSV8GbIRj2Xlk\nbj9kDclP6Ou9/Ie8+omaSDXYPefv5jYIDF+w914wfqVn5bCrNSOtVQUOqazBZIKf\nqmmQxY7uggfdsk8pu5eCGvA5Ql6PGOCrjSsmw3wuKpxJHS9j+Sg8n65fyB5v4DQx\nc98RQi9ryGynVF1+YyuSnQN8A7CGPT8n+dwtApZMishrh8PMCJb9HYHrp1fnzmYc\nq8hLr0D3BxYPF7vlwjhq4D5LiZgHLtKtWc30CY2+wjhqQC+FiPq35+XiSvjnNPWD\n/8oq0sIGJaTnBxVFaGDhXH+i/HkIlbgDTaPAr9Kt8ZsUmc76LujuBcEyqQKFrQt9\nkce8xiJHcEz6F5sD+VGiNrbsme1FE+W16M5CggkT5bjPtiXbcxBPTcByl2qbt2/H\nh1fGjiihoC3Gl/0xUjWJ4QvrG1fshW7Ylzf146genDg987QD9bw6ulCNJ/aT6yGZ\niBNh3zYIar8DX7snnTg2405jfIqiWQIDAQABAoICAAuOuYWTrIm6C3vyuWFFG1zh\n3aBIbv6FPOwiiAfLdLbYMOGinsa518yWsz6NbuTPfSisAXsx7e3kslnMzqG4WLWi\nN+fqj+e9+F2GciCFIJxk1yS0xe7jz/6LBDlHiG9k8xgEUeAT4juD3UzmPTVV5UGD\nXwRykDSVKBdeoHApmqJTAAsEmHd5stIhC/XBXmCW/JCiru1+/+vZt/aksxBMesgP\nrpxI3/If0qpabrHbkCTo515pEDr4r4R8fJbQxjy7Fdw2vly8Go8S0032TbmCb6QS\n7O+T3UcBg+DPPK5NMGFyMiMumFXEebT3ID2lR8JuDB2CQW7NjQZUxH/vJXUx8YTN\nA2Ui7bYulT7g8z0P2UJeJCVXaOYoLgSQxnBJWMi3r/fuZmhC5fGYoBUF4/dwtaVd\nryMqDtaF9O9PilOQ4DsANSWqdBNkCkUObuWab9NDNksXLSDK+1nIeagLnG6U3qgy\nDjSZ+LWHIGgTXpbP8KtxxlGeYznFcCH5Jdm/ubK/QyruLmhvrhdm6wXeFJwxt3M/\n5CSy8WmM1B3kAZ3a/+RmJGAQwJkGN7gyPdTmLw9yhAiFTrZd/bhMFT08cGTJoGa9\nd0yveZ5lFIxFeq12/u0gySx8Xqk2miIKq8OV8xh4M/9kOwUpNl9qMVFaK4G+nSjr\nF1Git/7wE384CVMdb90BAoIBAQC8bzFIvkxCRh7ATl9faDcI/vPkwPos3w5fA7hC\nHUELI5zdnBlYbd0X0HfUVRi5Lt2KY5AmeCIPLSMDfGMH7IE7LiAn2Jyp7pzVvoIk\nEFAdeZ90o3ggj20IRuWNVNCQ0IOSdhSHCeAPsFtQst4lGzp8tFj8VAJl7nzY4moa\nJD5sgOaB8WqXNKqWcdmxRGGK5VmXLKSW4ELRq3YNH+AGNquJgqu10hEkhvlpn3Qn\nxVJ9KLPAhRHc81+Q8aVA0w73R3wrTmWaFmtDOKDeefB5zVbUYMDXrkEIHys9NXma\n0I4ZhXvYjW6lL/ZNgHK3IP/kC0msycFl5H5l6+USJ+vIM96ZAoIBAQC6ib1++LtE\nuauKYJsGaq+14s/LxfJ9KiIgxReE/eZRn3rG4LWXgdwwEVsmV1EYCTuCwMvIN+F0\nj10YJqyoAtGwiFG4jGac0vDLFs3xx4+tp13aJ7WYC
kM25O+9CMa32b89wOzN5lct\nTYGrzQn2RhjKh3bGH4sGfaNb4R0kS6X5N1ErKPgGvCXlCrIYgNZyr88uVUeB4m6i\nJ2FobHpShjPoyaJn24KPwT4+GVnig3MZhbw0s4P25+VbTSMmHvAQmCHdv7MYSuDB\nj9DmlvJ0mhoUKi3b1sKa1B9B4HAKVmmSaABfqpk4qdxyXjtUDQwN62YqAQLcmLOP\nroxDm/YOdfnBAoIBABfAmGDIBArSleu9tU3scAuFP68VGDPxxfj6Gg7TazCBQ7O3\nioZYCueGkqREOcKWAr0AAdqnh/uLv/8ffcgw6rVQAiOjrVPKTSCwS+1J1R9yBkSI\nmorYKXFCporjJwsqDXu3wKyo9QJlQ41vjor03LF9dj4QROEeZ8Ra/e7fpLK+qM+2\nY649qEcggMVUjksYz+s7aF/QUvvk9hN/chi2aXcC7qwTl6+YB/ZlcBnXSKeYKthY\nrcDBOMmnfCIouJk4/JDk5++9ZbXqfHSuwD5KQOiybXyCbZYdf7DOfc6i/VaAOfU5\nFrphylVInK0yzq6rMZVDNUqnu7sTOiPIvnLU/vkCggEARSx1ABPe5jJwIYWHl46S\nkEGGy0shjDbGpx5PhXreISCh2ARWctOuQoj9Iy+4G9C4p9k0+I94ZNARNraIylkZ\nR3yVyXkPSFKVBsrzHhjh+ASbsh2Nos8Tc9Tb7l7FykHOQGk9p3EmnN8kGgCUFCaU\nZO5tJjVmScbngFfvhZkj+FICIJ41s9Grv88CkkGcxLTbgJQRS2Iborg10BKCHf40\nW7wCJL9rIEIKAd9GzM/wK+PDEkwLwNDn5b6qLSXF4nF4BZJkKLsDs+PQFOKfEIxg\n5V9q2B5A1keZO8Wt5rd6uNcmZFOQNEoRPLwjBh08fiDwJt1vITzjQYH588xvJ5eq\nQQKCAQBX59K+aTukXUBC5clWMspLYUH4Ok8K35cKGbWtuy31EhNUlI+1Bc0ClV2l\nuh3+XVv1Uicg+jymYEO229CIZz1XODyYfBSXP9QqJgE5T/W7dy+LOioZe2CDvZPp\n2+ftXBaHlCk65EXt+a5LxtuIpc56Djo4yP8yFJRdG0QOSvOqYXK7dODvbTcdwEMa\ncR6PUy8hXDsBbqylwqC/ZGK9YDN1J+RCwMWrqR1kDRfVaQolamLk2u+sVTySgSWk\n8xyad6Vtj31jVCHhgA3WJgpied1QlRc4S1NXllIN47zHo5iUodj57FR9Ic1FSxT6\ntyXdXUmgzPrm/d9zKsxJjionOhie\n-----END PRIVATE KEY-----\n"
  },
  {
    "path": "quickwit/quickwit-integration-tests/test_data/server.v3.ext",
    "content": "authorityKeyIdentifier=keyid,issuer\nbasicConstraints=CA:FALSE\nkeyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment\nsubjectAltName = @alt_names\n[alt_names]\nDNS.1 = quickwit.local\nIP.1 = 127.0.0.1\n"
  },
  {
    "path": "quickwit/quickwit-jaeger/Cargo.toml",
    "content": "[package]\nname = \"quickwit-jaeger\"\ndescription = \"Jaeger storage backend\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nasync-trait = { workspace = true }\nitertools = { workspace = true }\nonce_cell = { workspace = true }\npostcard = { workspace = true }\nprost = { workspace = true }\nprost-types = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntantivy = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntokio-stream = { workspace = true }\ntonic = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-opentelemetry = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-query = { workspace = true }\nquickwit-search = { workspace = true }\n\n[dev-dependencies]\ntempfile = { workspace = true }\ntime = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-ingest = { workspace = true }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-opentelemetry = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-search = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[features]\ntestsuite = []\n"
  },
  {
    "path": "quickwit/quickwit-jaeger/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::mem;\nuse std::ops::{Bound, RangeInclusive};\nuse std::sync::Arc;\nuse std::time::Instant;\n\nuse itertools::{Either, Itertools};\nuse prost::Message;\nuse prost_types::{Duration as WellKnownDuration, Timestamp as WellKnownTimestamp};\nuse quickwit_config::JaegerConfig;\nuse quickwit_opentelemetry::otlp::{\n    Event as QwEvent, Link as QwLink, OTEL_TRACES_INDEX_ID, Span as QwSpan, SpanFingerprint,\n    SpanId, SpanKind as QwSpanKind, SpanStatus as QwSpanStatus, TraceId,\n};\nuse quickwit_proto::jaeger::api_v2::{\n    KeyValue as JaegerKeyValue, Log as JaegerLog, Process as JaegerProcess, Span as JaegerSpan,\n    SpanRef as JaegerSpanRef, SpanRefType as JaegerSpanRefType, ValueType,\n};\nuse quickwit_proto::jaeger::storage::v1::{\n    FindTraceIDsRequest, FindTraceIDsResponse, FindTracesRequest, GetOperationsRequest,\n    GetOperationsResponse, GetServicesRequest, GetServicesResponse, GetTraceRequest, Operation,\n    SpansResponseChunk, TraceQueryParameters,\n};\nuse quickwit_proto::opentelemetry::proto::trace::v1::status::StatusCode as OtlpStatusCode;\nuse quickwit_proto::search::{CountHits, ListTermsRequest, SearchRequest};\nuse quickwit_query::BooleanOperand;\nuse quickwit_query::query_ast::{BoolQuery, QueryAst, RangeQuery, TermQuery, UserInputQuery};\nuse quickwit_search::{FindTraceIdsCollector, 
SearchService};\nuse serde::Deserialize;\nuse serde_json::Value as JsonValue;\nuse tantivy::collector::Collector;\nuse time::OffsetDateTime;\nuse time::format_description::well_known::Rfc3339;\nuse tokio::sync::mpsc;\nuse tokio_stream::wrappers::ReceiverStream;\nuse tonic::Status;\nuse tracing::field::Empty;\nuse tracing::{Span as RuntimeSpan, debug, error, instrument, warn};\n\npub(crate) use crate::metrics::JAEGER_SERVICE_METRICS;\n\nmod metrics;\nmod v1;\nmod v2;\n\n// OpenTelemetry to Jaeger Transformation\n// <https://opentelemetry.io/docs/reference/specification/trace/sdk_exporters/jaeger/>\n\ntype TimeIntervalSecs = RangeInclusive<i64>;\n\npub(crate) type JaegerResult<T> = Result<T, Status>;\n\npub(crate) type SpanStream = ReceiverStream<Result<SpansResponseChunk, Status>>;\n\npub(crate) type TracesDataStream =\n    ReceiverStream<Result<quickwit_proto::opentelemetry::proto::trace::v1::TracesData, Status>>;\n\n#[derive(Clone)]\npub struct JaegerService {\n    search_service: Arc<dyn SearchService>,\n    lookback_period_secs: i64,\n    max_trace_duration_secs: i64,\n    max_fetch_spans: u64,\n}\n\nimpl JaegerService {\n    pub fn new(config: JaegerConfig, search_service: Arc<dyn SearchService>) -> Self {\n        Self {\n            search_service,\n            lookback_period_secs: config.lookback_period().as_secs() as i64,\n            max_trace_duration_secs: config.max_trace_duration().as_secs() as i64,\n            max_fetch_spans: config.max_fetch_spans.get(),\n        }\n    }\n\n    #[instrument(\"get_services\", skip_all)]\n    pub async fn get_services_for_indexes(\n        &self,\n        _request: GetServicesRequest,\n        index_id_patterns: Vec<String>,\n    ) -> JaegerResult<GetServicesResponse> {\n        let services = get_services_impl(\n            self.search_service.clone(),\n            self.lookback_period_secs,\n            index_id_patterns,\n        )\n        .await?;\n        Ok(GetServicesResponse { services })\n    }\n\n    
#[instrument(\"get_operations\", skip_all, fields(service=%request.service, span_kind=%request.span_kind))]\n    pub async fn get_operations_for_indexes(\n        &self,\n        request: GetOperationsRequest,\n        index_id_patterns: Vec<String>,\n    ) -> JaegerResult<GetOperationsResponse> {\n        let operations = get_operations_impl(\n            self.search_service.clone(),\n            self.lookback_period_secs,\n            request.service,\n            request.span_kind,\n            index_id_patterns,\n        )\n        .await?;\n        Ok(GetOperationsResponse {\n            operations,\n            operation_names: Vec::new(), // `operation_names` is deprecated.\n        })\n    }\n\n    // Instrumentation happens in `find_trace_ids`.\n    pub async fn find_trace_ids_for_indexes(\n        &self,\n        request: FindTraceIDsRequest,\n        index_id_patterns: Vec<String>,\n    ) -> JaegerResult<FindTraceIDsResponse> {\n        debug!(request=?request, index_ids=?index_id_patterns, \"`find_trace_ids` request\");\n\n        let trace_query = request\n            .query\n            .ok_or_else(|| Status::invalid_argument(\"Query is empty.\"))?;\n\n        let (trace_ids, _) = self.find_trace_ids(trace_query, index_id_patterns).await?;\n        let trace_ids = trace_ids\n            .into_iter()\n            .map(|trace_id| trace_id.to_vec())\n            .collect();\n        debug!(trace_ids=?trace_ids, \"`find_trace_ids` response\");\n        let response = FindTraceIDsResponse { trace_ids };\n        Ok(response)\n    }\n\n    #[instrument(\"find_traces\", skip_all)]\n    pub async fn find_traces_for_indexes(\n        &self,\n        request: FindTracesRequest,\n        operation_name: &'static str,\n        request_start: Instant,\n        index_id_patterns: Vec<String>,\n        root_only: bool,\n    ) -> JaegerResult<SpanStream> {\n        debug!(request=?request, \"`find_traces` request\");\n\n        let trace_query = request\n            
.query\n            .ok_or_else(|| Status::invalid_argument(\"Trace query is empty.\"))?;\n        let (trace_ids, span_timestamps_range) = self\n            .find_trace_ids(trace_query, index_id_patterns.clone())\n            .await?;\n        let start = span_timestamps_range.start() - self.max_trace_duration_secs;\n        let end = span_timestamps_range.end() + self.max_trace_duration_secs;\n        let search_window = start..=end;\n        let response = self\n            .stream_spans(\n                &trace_ids,\n                search_window,\n                operation_name,\n                request_start,\n                index_id_patterns,\n                root_only,\n            )\n            .await?;\n        Ok(response)\n    }\n\n    #[instrument(\"get_trace\", skip_all)]\n    pub async fn get_trace_for_indexes(\n        &self,\n        request: GetTraceRequest,\n        operation_name: &'static str,\n        request_start: Instant,\n        index_id_patterns: Vec<String>,\n    ) -> JaegerResult<SpanStream> {\n        debug!(request=?request, \"`get_trace` request\");\n        debug_assert_eq!(request.trace_id.len(), 16);\n        let trace_id = TraceId::try_from(request.trace_id)\n            .map_err(|error| Status::invalid_argument(error.to_string()))?;\n        let end = OffsetDateTime::now_utc().unix_timestamp();\n        let start = end - self.lookback_period_secs;\n        let search_window = start..=end;\n        let response = self\n            .stream_spans(\n                &[trace_id],\n                search_window,\n                operation_name,\n                request_start,\n                index_id_patterns,\n                false,\n            )\n            .await?;\n        Ok(response)\n    }\n\n    #[instrument(\"find_trace_ids\", skip_all fields(service_name=%trace_query.service_name, operation_name=%trace_query.operation_name))]\n    async fn find_trace_ids(\n        &self,\n        trace_query: TraceQueryParameters,\n     
   index_id_patterns: Vec<String>,\n    ) -> Result<(Vec<TraceId>, TimeIntervalSecs), Status> {\n        let min_start_secs = trace_query.start_time_min.map(|ts| ts.seconds);\n        let max_start_secs = trace_query.start_time_max.map(|ts| ts.seconds);\n        let min_duration_millis = trace_query\n            .duration_min\n            .and_then(|d| to_duration_millis(&d));\n        let max_duration_millis = trace_query\n            .duration_max\n            .and_then(|d| to_duration_millis(&d));\n\n        find_trace_ids_common(\n            self.search_service.clone(),\n            &trace_query.service_name,\n            &trace_query.operation_name,\n            trace_query.tags,\n            min_start_secs,\n            max_start_secs,\n            min_duration_millis,\n            max_duration_millis,\n            trace_query.num_traces as usize,\n            index_id_patterns,\n        )\n        .await\n    }\n}\n\n#[instrument(\"find_trace_ids_common\", skip_all)]\n#[allow(clippy::too_many_arguments)]\npub(crate) async fn find_trace_ids_common(\n    search_service: Arc<dyn SearchService>,\n    service_name: &str,\n    operation_name: &str,\n    tags: HashMap<String, String>,\n    min_start_secs: Option<i64>,\n    max_start_secs: Option<i64>,\n    min_duration_millis: Option<i64>,\n    max_duration_millis: Option<i64>,\n    num_traces: usize,\n    index_id_patterns: Vec<String>,\n) -> Result<(Vec<TraceId>, TimeIntervalSecs), Status> {\n    let query_ast = build_search_query(\n        service_name,\n        None,\n        operation_name,\n        tags,\n        min_start_secs,\n        max_start_secs,\n        min_duration_millis,\n        max_duration_millis,\n    );\n\n    let search_request = SearchRequest {\n        index_id_patterns,\n        query_ast: serde_json::to_string(&query_ast)\n            .map_err(|err| Status::internal(err.to_string()))?,\n        aggregation_request: Some(build_aggregations_query(num_traces)),\n        max_hits: 0,\n      
  start_timestamp: min_start_secs,\n        end_timestamp: max_start_secs,\n        count_hits: CountHits::Underestimate.into(),\n        ..Default::default()\n    };\n\n    let search_response = search_service.root_search(search_request).await?;\n\n    let Some(agg_result_postcard) = search_response.aggregation_postcard else {\n        debug!(\"the query matched no traces\");\n        return Ok((Vec::new(), 0..=0));\n    };\n\n    let trace_ids = collect_trace_ids(&agg_result_postcard)?;\n    debug!(\"the query matched {} traces.\", trace_ids.0.len());\n    Ok(trace_ids)\n}\n\nimpl JaegerService {\n    #[instrument(\"stream_spans\", skip_all, fields(num_traces=%trace_ids.len(), num_spans=Empty, num_bytes=Empty))]\n    async fn stream_spans(\n        &self,\n        trace_ids: &[TraceId],\n        search_window: TimeIntervalSecs,\n        operation_name: &'static str,\n        request_start: Instant,\n        index_id_patterns: Vec<String>,\n        root_only: bool,\n    ) -> Result<SpanStream, Status> {\n        if trace_ids.is_empty() {\n            let (_tx, rx) = mpsc::channel(1);\n            return Ok(ReceiverStream::new(rx));\n        }\n        let num_traces = trace_ids.len() as u64;\n        let mut query = BoolQuery::default();\n\n        for trace_id in trace_ids {\n            let value = trace_id.hex_display();\n            let term_query = TermQuery {\n                field: \"trace_id\".to_string(),\n                value,\n            };\n            query.should.push(term_query.into());\n        }\n        if root_only {\n            // we do this so we don't error on old indexes, and instead return both root and non\n            // root spans\n            let is_root = UserInputQuery {\n                user_text: \"NOT is_root:false\".to_string(),\n                default_fields: None,\n                default_operator: BooleanOperand::And,\n                lenient: true,\n            };\n            let mut new_query = BoolQuery::default();\n    
        new_query.must.push(query.into());\n            new_query.must.push(is_root.into());\n            query = new_query;\n        }\n\n        let query_ast: QueryAst = query.into();\n        let query_ast =\n            serde_json::to_string(&query_ast).map_err(|err| Status::internal(err.to_string()))?;\n\n        let search_request = SearchRequest {\n            index_id_patterns,\n            query_ast,\n            start_timestamp: Some(*search_window.start()),\n            end_timestamp: Some(*search_window.end()),\n            max_hits: self.max_fetch_spans,\n            count_hits: CountHits::Underestimate.into(),\n            ..Default::default()\n        };\n        let search_response = match self.search_service.root_search(search_request).await {\n            Ok(search_response) => search_response,\n            Err(search_error) => {\n                error!(search_error=?search_error, \"failed to fetch spans\");\n                record_error(operation_name, request_start);\n                return Err(Status::internal(\"Failed to fetch spans.\"));\n            }\n        };\n        let mut spans: Vec<JaegerSpan> = Vec::with_capacity(search_response.hits.len());\n\n        for hit in search_response.hits {\n            match qw_span_to_jaeger_span(&hit.json) {\n                Ok(span) => {\n                    spans.push(span);\n                }\n                Err(status) => {\n                    record_error(operation_name, request_start);\n                    return Err(status);\n                }\n            };\n        }\n        if trace_ids.len() > 1 {\n            spans.sort_unstable_by(|left, right| left.trace_id.cmp(&right.trace_id));\n        }\n        let (tx, rx) = mpsc::channel(2);\n        let current_span = RuntimeSpan::current();\n\n        tokio::task::spawn(async move {\n            const MAX_CHUNK_LEN: usize = 1_000;\n            const MAX_CHUNK_NUM_BYTES: usize = 4 * 1024 * 1024 - 10 * 1024; // 4 MiB, the default max size of 
gRPC messages, minus some headroom.\n\n            let chunk_len = MAX_CHUNK_LEN.min(spans.len());\n            let mut chunk = Vec::with_capacity(chunk_len);\n            let mut chunk_num_bytes = 0;\n            let mut num_spans_total = 0;\n            let mut num_bytes_total = 0;\n\n            while let Some(span) = spans.pop() {\n                let span_num_bytes = span.encoded_len();\n\n                if chunk.len() == MAX_CHUNK_LEN\n                    || chunk_num_bytes + span_num_bytes > MAX_CHUNK_NUM_BYTES\n                {\n                    let num_spans = chunk.len();\n                    num_spans_total += num_spans;\n                    num_bytes_total += chunk_num_bytes;\n\n                    // + 1 to account for the span we just popped from `spans` but haven't yet\n                    // appended to `chunk`.\n                    let chunk_len = MAX_CHUNK_LEN.min(spans.len() + 1);\n                    let chunk = mem::replace(&mut chunk, Vec::with_capacity(chunk_len));\n                    if let Err(send_error) = tx.send(Ok(SpansResponseChunk { spans: chunk })).await\n                    {\n                        debug!(send_error=?send_error, \"client disconnected\");\n                        return;\n                    }\n                    record_send(operation_name, num_spans, chunk_num_bytes);\n                    chunk_num_bytes = 0;\n                }\n                chunk_num_bytes += span_num_bytes;\n                chunk.push(span);\n            }\n            if !chunk.is_empty() {\n                let num_spans = chunk.len();\n                num_spans_total += num_spans;\n                num_bytes_total += chunk_num_bytes;\n\n                if let Err(send_error) = tx.send(Ok(SpansResponseChunk { spans: chunk })).await {\n                    debug!(error=?send_error, \"client disconnected\");\n                    return;\n                }\n                record_send(operation_name, num_spans, chunk_num_bytes);\n          
  }\n            current_span.record(\"num_spans\", num_spans_total);\n            current_span.record(\"num_bytes\", num_bytes_total);\n\n            JAEGER_SERVICE_METRICS\n                .fetched_traces_total\n                .with_label_values([operation_name, OTEL_TRACES_INDEX_ID])\n                .inc_by(num_traces);\n\n            let elapsed = request_start.elapsed().as_secs_f64();\n            JAEGER_SERVICE_METRICS\n                .request_duration_seconds\n                .with_label_values([operation_name, OTEL_TRACES_INDEX_ID, \"false\"])\n                .observe(elapsed);\n        });\n        Ok(ReceiverStream::new(rx))\n    }\n}\n\npub(crate) fn record_error(operation_name: &'static str, request_start: Instant) {\n    JAEGER_SERVICE_METRICS\n        .request_errors_total\n        .with_label_values([operation_name, OTEL_TRACES_INDEX_ID])\n        .inc();\n\n    let elapsed = request_start.elapsed().as_secs_f64();\n    JAEGER_SERVICE_METRICS\n        .request_duration_seconds\n        .with_label_values([operation_name, OTEL_TRACES_INDEX_ID, \"true\"])\n        .observe(elapsed);\n}\n\npub(crate) fn record_send(operation_name: &'static str, num_spans: usize, num_bytes: usize) {\n    JAEGER_SERVICE_METRICS\n        .fetched_spans_total\n        .with_label_values([operation_name, OTEL_TRACES_INDEX_ID])\n        .inc_by(num_spans as u64);\n    JAEGER_SERVICE_METRICS\n        .transferred_bytes_total\n        .with_label_values([operation_name, OTEL_TRACES_INDEX_ID])\n        .inc_by(num_bytes as u64);\n}\n\n#[allow(deprecated)]\nfn extract_term(term_bytes: &[u8]) -> String {\n    tantivy::Term::wrap(term_bytes)\n        .value()\n        .as_str()\n        .expect(\"Term should be a valid UTF-8 string.\")\n        .to_string()\n}\n\nfn extract_operation(term_bytes: &[u8]) -> Operation {\n    let term = extract_term(term_bytes);\n    let fingerprint = SpanFingerprint::from_string(term);\n    let span_name = fingerprint\n        .span_name()\n        
.expect(\"The span fingerprint should be properly formed.\")\n        .to_string();\n    let span_kind = fingerprint\n        .span_kind()\n        .map(|span_kind| span_kind.as_jaeger())\n        .expect(\"The span fingerprint should be properly formed.\")\n        .to_string();\n    Operation {\n        name: span_name,\n        span_kind,\n    }\n}\n\n#[instrument(\"get_services\", skip_all)]\npub(crate) async fn get_services_impl(\n    search_service: Arc<dyn SearchService>,\n    lookback_period_secs: i64,\n    index_id_patterns: Vec<String>,\n) -> Result<Vec<String>, Status> {\n    debug!(index_ids=?index_id_patterns, \"`get_services` request\");\n\n    let max_hits = Some(1_000);\n    let start_timestamp = Some(OffsetDateTime::now_utc().unix_timestamp() - lookback_period_secs);\n\n    let search_request = ListTermsRequest {\n        index_id_patterns,\n        field: \"service_name\".to_string(),\n        max_hits,\n        start_timestamp,\n        end_timestamp: None,\n        start_key: None,\n        end_key: None,\n    };\n    let search_response = search_service.root_list_terms(search_request).await?;\n    let services: Vec<String> = search_response\n        .terms\n        .into_iter()\n        .map(|term_bytes| extract_term(&term_bytes))\n        .sorted()\n        .collect();\n    debug!(services=?services, \"`get_services` response\");\n    Ok(services)\n}\n\n#[instrument(\"get_operations\", skip_all, fields(service=%service, span_kind=%span_kind))]\npub(crate) async fn get_operations_impl(\n    search_service: Arc<dyn SearchService>,\n    lookback_period_secs: i64,\n    service: String,\n    span_kind: String,\n    index_id_patterns: Vec<String>,\n) -> Result<Vec<Operation>, Status> {\n    debug!(service=%service, span_kind=%span_kind, index_ids=?index_id_patterns, \"`get_operations` request\");\n\n    let max_hits = Some(1_000);\n    let start_timestamp = Some(OffsetDateTime::now_utc().unix_timestamp() - lookback_period_secs);\n\n    let 
span_kind_opt = span_kind.parse().ok();\n    let start_key = SpanFingerprint::start_key(&service, span_kind_opt.clone());\n    let end_key = SpanFingerprint::end_key(&service, span_kind_opt);\n\n    let search_request = ListTermsRequest {\n        index_id_patterns,\n        field: \"span_fingerprint\".to_string(),\n        max_hits,\n        start_timestamp,\n        end_timestamp: None,\n        start_key,\n        end_key,\n    };\n    let search_response = search_service.root_list_terms(search_request).await?;\n    let operations: Vec<Operation> = search_response\n        .terms\n        .into_iter()\n        .map(|term_json| extract_operation(&term_json))\n        .sorted()\n        .collect();\n    debug!(operations=?operations, \"`get_operations` response\");\n    Ok(operations)\n}\n\n// TODO: builder pattern\n#[allow(clippy::too_many_arguments)]\npub(crate) fn build_search_query(\n    service_name: &str,\n    span_kind_opt: Option<QwSpanKind>,\n    span_name: &str,\n    mut tags: HashMap<String, String>,\n    min_span_start_timestamp_secs_opt: Option<i64>,\n    max_span_start_timestamp_secs_opt: Option<i64>,\n    min_span_duration_millis_opt: Option<i64>,\n    max_span_duration_millis_opt: Option<i64>,\n) -> QueryAst {\n    // TODO disable based on some feature?\n    if let Some(qw_query) = tags.remove(\"_qw_query\") {\n        return quickwit_query::query_ast::query_ast_from_user_text(&qw_query, None);\n    }\n    // TODO should we use filter instead of must? Does it changes anything? 
Less scoring?\n    let mut query_ast = BoolQuery::default();\n\n    if !service_name.is_empty() {\n        query_ast.must.push(\n            TermQuery {\n                field: \"service_name\".to_string(),\n                value: service_name.to_string(),\n            }\n            .into(),\n        );\n    }\n    if let Some(span_kind) = span_kind_opt {\n        query_ast.must.push(\n            TermQuery {\n                field: \"span_kind\".to_string(),\n                value: span_kind.as_char().to_string(),\n            }\n            .into(),\n        )\n    }\n    if !span_name.is_empty() {\n        query_ast.must.push(\n            TermQuery {\n                field: \"span_name\".to_string(),\n                value: span_name.to_string(),\n            }\n            .into(),\n        )\n    }\n    if !tags.is_empty() {\n        // Sort the tags for deterministic tests.\n        for (key, value) in tags.into_iter().sorted() {\n            // In Jaeger land, `event` is a regular event attribute whereas in OpenTelemetry land,\n            // it is an event top-level field named `name`. 
In Quickwit, it is stored as\n            // `event_name` to distinguish it from the span top-level field `name`.\n            if key == \"event\" {\n                query_ast.must.push(\n                    TermQuery {\n                        field: \"events.event_name\".to_string(),\n                        value,\n                    }\n                    .into(),\n                )\n            } else if key == \"error\" && value == \"true\" {\n                query_ast.must.push(\n                    TermQuery {\n                        field: \"span_status.code\".to_string(),\n                        value: \"error\".to_string(),\n                    }\n                    .into(),\n                )\n            } else if key == \"error\" && value == \"false\" {\n                query_ast.must_not.push(\n                    TermQuery {\n                        field: \"span_status.code\".to_string(),\n                        value: \"error\".to_string(),\n                    }\n                    .into(),\n                )\n            } else {\n                let mut sub_query = BoolQuery::default();\n\n                sub_query.should.push(\n                    TermQuery {\n                        field: format!(\"resource_attributes.{key}\"),\n                        value: value.clone(),\n                    }\n                    .into(),\n                );\n                sub_query.should.push(\n                    TermQuery {\n                        field: format!(\"span_attributes.{key}\"),\n                        value: value.clone(),\n                    }\n                    .into(),\n                );\n                sub_query.should.push(\n                    TermQuery {\n                        field: format!(\"events.event_attributes.{key}\"),\n                        value,\n                    }\n                    .into(),\n                );\n                query_ast.must.push(sub_query.into())\n            }\n        }\n    
}\n    if min_span_start_timestamp_secs_opt.is_some() || max_span_start_timestamp_secs_opt.is_some() {\n        let mut start_range = RangeQuery {\n            field: \"span_start_timestamp_nanos\".to_string(),\n            lower_bound: Bound::Unbounded,\n            upper_bound: Bound::Unbounded,\n        };\n\n        if let Some(min_span_start_timestamp_secs) = min_span_start_timestamp_secs_opt {\n            let min_span_start_datetime =\n                OffsetDateTime::from_unix_timestamp(min_span_start_timestamp_secs)\n                    .expect(\"Timestamp should fall within the [Date::MIN, Date::MAX] interval.\");\n            let min_span_start_datetime_rfc3339 = min_span_start_datetime\n                .format(&Rfc3339)\n                .expect(\"Datetime should be formattable to RFC 3339.\");\n            start_range.lower_bound = Bound::Included(min_span_start_datetime_rfc3339.into());\n        }\n\n        if let Some(max_span_start_timestamp_secs) = max_span_start_timestamp_secs_opt {\n            let max_span_start_datetime =\n                OffsetDateTime::from_unix_timestamp(max_span_start_timestamp_secs)\n                    .expect(\"Timestamp should fall within the [Date::MIN, Date::MAX] interval.\");\n            let max_span_start_datetime_rfc3339 = max_span_start_datetime\n                .format(&Rfc3339)\n                .expect(\"Datetime should be formattable to RFC 3339.\");\n            start_range.upper_bound = Bound::Included(max_span_start_datetime_rfc3339.into());\n        }\n\n        query_ast.must.push(start_range.into());\n    }\n    if min_span_duration_millis_opt.is_some() || max_span_duration_millis_opt.is_some() {\n        let mut duration_range = RangeQuery {\n            field: \"span_duration_millis\".to_string(),\n            lower_bound: Bound::Unbounded,\n            upper_bound: Bound::Unbounded,\n        };\n\n        if let Some(min_span_duration_millis) = min_span_duration_millis_opt {\n            
duration_range.lower_bound = Bound::Included(min_span_duration_millis.into());\n        }\n\n        if let Some(max_span_duration_millis) = max_span_duration_millis_opt {\n            duration_range.upper_bound = Bound::Included(max_span_duration_millis.into());\n        }\n\n        query_ast.must.push(duration_range.into());\n    }\n    if !query_ast.must.is_empty() || !query_ast.must_not.is_empty() {\n        query_ast.into()\n    } else {\n        QueryAst::MatchAll\n    }\n}\n\npub(crate) fn build_aggregations_query(num_traces: usize) -> String {\n    let query = serde_json::to_string(&FindTraceIdsCollector {\n        num_traces,\n        trace_id_field_name: \"trace_id\".to_string(),\n        span_timestamp_field_name: \"span_start_timestamp_nanos\".to_string(),\n    })\n    .expect(\"The collector should be JSON serializable.\");\n    debug!(query=%query, \"Aggregations query\");\n    query\n}\n\n#[allow(clippy::result_large_err)]\nfn qw_span_to_jaeger_span(qw_span_json: &str) -> Result<JaegerSpan, Status> {\n    let mut qw_span: QwSpan = json_deserialize(qw_span_json, \"span\")?;\n\n    let start_time = Some(to_well_known_timestamp(qw_span.span_start_timestamp_nanos));\n    let duration = Some(to_well_known_duration(\n        qw_span.span_start_timestamp_nanos,\n        qw_span.span_end_timestamp_nanos,\n    ));\n    qw_span.resource_attributes.remove(\"service.name\");\n    let process = Some(JaegerProcess {\n        service_name: qw_span.service_name,\n        tags: otlp_attributes_to_jaeger_tags(qw_span.resource_attributes),\n    });\n    let logs: Vec<JaegerLog> = qw_span\n        .events\n        .into_iter()\n        .map(qw_event_to_jaeger_log)\n        .collect::<Result<_, _>>()?;\n\n    let mut tags = otlp_attributes_to_jaeger_tags(qw_span.span_attributes);\n    inject_dropped_count_tags(\n        &mut tags,\n        qw_span.span_dropped_attributes_count,\n        qw_span.span_dropped_events_count,\n        qw_span.span_dropped_links_count,\n    
);\n    inject_span_kind_tag(&mut tags, qw_span.span_kind);\n    inject_span_status_tags(&mut tags, qw_span.span_status);\n\n    let references =\n        otlp_links_to_jaeger_references(&qw_span.trace_id, qw_span.parent_span_id, qw_span.links)?;\n\n    let span = JaegerSpan {\n        trace_id: qw_span.trace_id.to_vec(),\n        span_id: qw_span.span_id.to_vec(),\n        operation_name: qw_span.span_name,\n        references,\n        flags: 0, // TODO\n        start_time,\n        duration,\n        tags,\n        logs,\n        process,\n        process_id: \"\".to_string(), // TODO\n        warnings: Vec::new(),       // TODO\n    };\n    Ok(span)\n}\n\npub(crate) fn to_duration_millis(duration: &WellKnownDuration) -> Option<i64> {\n    let duration_millis = duration.seconds * 1_000 + (duration.nanos as i64) / 1_000_000;\n    if duration_millis == 0 {\n        None\n    } else {\n        Some(duration_millis)\n    }\n}\n\nfn to_well_known_timestamp(timestamp_nanos: u64) -> WellKnownTimestamp {\n    let seconds = (timestamp_nanos / 1_000_000_000) as i64;\n    let nanos = (timestamp_nanos % 1_000_000_000) as i32;\n    WellKnownTimestamp { seconds, nanos }\n}\n\nfn to_well_known_duration(\n    start_timestamp_nanos: u64,\n    end_timestamp_nanos: u64,\n) -> WellKnownDuration {\n    let duration_nanos = end_timestamp_nanos - start_timestamp_nanos;\n    let seconds = (duration_nanos / 1_000_000_000) as i64;\n    let nanos = (duration_nanos % 1_000_000_000) as i32;\n    WellKnownDuration { seconds, nanos }\n}\n\nfn inject_dropped_count_tags(\n    tags: &mut Vec<JaegerKeyValue>,\n    dropped_attributes_count: u32,\n    dropped_events_count: u32,\n    dropped_links_count: u32,\n) {\n    for (dropped_count, key) in [\n        (dropped_attributes_count, \"otel.dropped_attributes_count\"),\n        (dropped_events_count, \"otel.dropped_events_count\"),\n        (dropped_links_count, \"otel.dropped_links_count\"),\n    ] {\n        if dropped_count > 0 {\n            
tags.push(JaegerKeyValue {\n                key: key.to_string(),\n                v_type: ValueType::Int64 as i32,\n                v_str: String::new(),\n                v_bool: false,\n                v_int64: dropped_count as i64,\n                v_float64: 0.0,\n                v_binary: Vec::new(),\n            });\n        }\n    }\n}\n\n/// Injects span kind tag.\n/// <https://opentelemetry.io/docs/specs/otel/trace/sdk_exporters/jaeger/#spankind>\nfn inject_span_kind_tag(tags: &mut Vec<JaegerKeyValue>, span_kind_id: u32) {\n    // OpenTelemetry SpanKind field MUST be encoded as span.kind tag in Jaeger span, except for\n    // SpanKind.INTERNAL, which SHOULD NOT be translated to a tag.\n    let span_kind = match span_kind_id {\n        0 | 1 => return,\n        2 => \"server\",\n        3 => \"client\",\n        4 => \"producer\",\n        5 => \"consumer\",\n        _ => {\n            warn!(span_kind_id=%span_kind_id, \"unknown span kind ID\");\n            return;\n        }\n    };\n    tags.push(JaegerKeyValue {\n        key: \"span.kind\".to_string(),\n        v_type: ValueType::String as i32,\n        v_str: span_kind.to_string(),\n        v_bool: false,\n        v_int64: 0,\n        v_float64: 0.0,\n        v_binary: Vec::new(),\n    });\n}\n\n/// Injects span status tags.\n/// <https://opentelemetry.io/docs/specs/otel/common/mapping-to-non-otlp/#span-status>\nfn inject_span_status_tags(tags: &mut Vec<JaegerKeyValue>, span_status: QwSpanStatus) {\n    // \"Span Status MUST be reported as key-value pairs associated with the Span, unless the Status\n    // is UNSET. In the latter case it MUST NOT be reported.\"\n    match span_status.code {\n        OtlpStatusCode::Unset => {}\n        OtlpStatusCode::Ok => {\n            // \"Name of the code, either OK or ERROR. 
MUST NOT be set if the code is UNSET.\"\n            tags.push(JaegerKeyValue {\n                key: \"otel.status_code\".to_string(),\n                v_type: ValueType::String as i32,\n                v_str: \"OK\".to_string(),\n                v_bool: false,\n                v_int64: 0,\n                v_float64: 0.0,\n                v_binary: Vec::new(),\n            });\n        }\n        OtlpStatusCode::Error => {\n            // \"Name of the code, either OK or ERROR. MUST NOT be set if the code is UNSET.\"\n            tags.push(JaegerKeyValue {\n                key: \"otel.status_code\".to_string(),\n                v_type: ValueType::String as i32,\n                v_str: \"ERROR\".to_string(),\n                v_bool: false,\n                v_int64: 0,\n                v_float64: 0.0,\n                v_binary: Vec::new(),\n            });\n            // \"Description of the Status if it has a value otherwise not set.\"\n            if let Some(message) = span_status.message {\n                tags.push(JaegerKeyValue {\n                    key: \"otel.status_description\".to_string(),\n                    v_type: ValueType::String as i32,\n                    v_str: message,\n                    v_bool: false,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new(),\n                });\n            }\n            // \"When Span Status is set to ERROR, an error span tag MUST be added with the Boolean\n            // value of true. The added error tag MAY override any previous value.\"\n            tags.push(JaegerKeyValue {\n                key: \"error\".to_string(),\n                v_type: ValueType::Bool as i32,\n                v_str: String::new(),\n                v_bool: true,\n                v_int64: 0,\n                v_float64: 0.0,\n                v_binary: Vec::new(),\n            });\n        }\n    };\n}\n\n/// Converts OpenTelemetry attributes to Jaeger tags. 
Objects are flattened with\n/// their keys prefixed with the parent keys delimited by a dot.\n///\n/// <https://opentelemetry.io/docs/specs/otel/trace/sdk_exporters/jaeger/#attributes>\nfn otlp_attributes_to_jaeger_tags(\n    attributes: impl IntoIterator<Item = (String, JsonValue)>,\n) -> Vec<JaegerKeyValue> {\n    otlp_attributes_to_jaeger_tags_inner(attributes, None)\n}\n\n/// Inner helper for `otlp_attributes_to_jaeger_tags` recursive call\n///\n/// PERF: as long as `attributes` IntoIterator implementation correctly sets the\n/// lower bound then collect should allocate efficiently. Note that the flat map\n/// may cause more allocations as we cannot predict the number of elements in the\n/// iterator.\nfn otlp_attributes_to_jaeger_tags_inner(\n    attributes: impl IntoIterator<Item = (String, JsonValue)>,\n    parent_key: Option<&str>,\n) -> Vec<JaegerKeyValue> {\n    attributes\n        .into_iter()\n        .map(|(key, value)| {\n            let key = parent_key\n                .map(|parent_key| format!(\"{parent_key}.{key}\"))\n                .unwrap_or(key);\n            match value {\n                JsonValue::Array(values) => {\n                    Either::Left(Some(JaegerKeyValue {\n                        key,\n                        v_type: ValueType::String as i32,\n                        // Array values MUST be serialized to string like a JSON list.\n                        v_str: serde_json::to_string(&values).expect(\n                            \"A vec of `serde_json::Value` values should be JSON serializable.\",\n                        ),\n                        ..Default::default()\n                    }))\n                }\n                JsonValue::Bool(v_bool) => Either::Left(Some(JaegerKeyValue {\n                    key,\n                    v_type: ValueType::Bool as i32,\n                    v_bool,\n                    ..Default::default()\n                })),\n                JsonValue::Number(number) => {\n                    
let value = if let Some(v_int64) = number.as_i64() {\n                        Some(JaegerKeyValue {\n                            key,\n                            v_type: ValueType::Int64 as i32,\n                            v_int64,\n                            ..Default::default()\n                        })\n                    } else if let Some(v_float64) = number.as_f64() {\n                        Some(JaegerKeyValue {\n                            key,\n                            v_type: ValueType::Float64 as i32,\n                            v_float64,\n                            ..Default::default()\n                        })\n                    } else {\n                        // Print some error rather than silently ignoring the value.\n                        warn!(\"ignoring unrepresentable number value: {number:?}\");\n                        None\n                    };\n\n                    Either::Left(value)\n                }\n                JsonValue::String(v_str) => Either::Left(Some(JaegerKeyValue {\n                    key,\n                    v_type: ValueType::String as i32,\n                    v_str,\n                    ..Default::default()\n                })),\n                JsonValue::Null => {\n                    // No use including null values in the tags, so ignore\n                    Either::Left(None)\n                }\n                JsonValue::Object(value) => {\n                    Either::Right(otlp_attributes_to_jaeger_tags_inner(value, Some(&key)))\n                }\n            }\n        })\n        .flat_map(|e| e.into_iter())\n        .collect()\n}\n\n/// Converts OpenTelemetry links to Jaeger span references.\n/// <https://opentelemetry.io/docs/specs/otel/trace/sdk_exporters/jaeger/#links>\n#[allow(clippy::result_large_err)]\nfn otlp_links_to_jaeger_references(\n    trace_id: &TraceId,\n    parent_span_id_opt: Option<SpanId>,\n    links: Vec<QwLink>,\n) -> Result<Vec<JaegerSpanRef>, Status> {\n    let 
mut references = Vec::with_capacity(parent_span_id_opt.is_some() as usize + links.len());\n\n    // <https://opentelemetry.io/docs/specs/otel/trace/sdk_exporters/jaeger/#parent-id>\n    if let Some(parent_span_id) = parent_span_id_opt {\n        let reference = JaegerSpanRef {\n            trace_id: trace_id.to_vec(),\n            span_id: parent_span_id.to_vec(),\n            ref_type: JaegerSpanRefType::ChildOf as i32,\n        };\n        references.push(reference);\n    }\n    // \"Span references generated from Link(s) MUST be added after the span reference generated from\n    // Parent ID, if any.\"\n    for link in links {\n        let trace_id = link.link_trace_id.to_vec();\n        let span_id = link.link_span_id.to_vec();\n        let reference = JaegerSpanRef {\n            trace_id,\n            span_id,\n            ref_type: JaegerSpanRefType::FollowsFrom as i32,\n        };\n        references.push(reference);\n    }\n    Ok(references)\n}\n\n#[allow(clippy::result_large_err)]\nfn qw_event_to_jaeger_log(event: QwEvent) -> Result<JaegerLog, Status> {\n    let timestamp = to_well_known_timestamp(event.event_timestamp_nanos);\n    // \"OpenTelemetry Event’s name field should be added to Jaeger Log’s fields map as follows: name\n    // -> event. 
If OpenTelemetry Event contains an attribute with the key event, it should take\n    // precedence over Event’s name field.\"\n    let insert_event_name =\n        !event.event_name.is_empty() && !event.event_attributes.contains_key(\"event\");\n\n    let mut fields = otlp_attributes_to_jaeger_tags(event.event_attributes);\n\n    if insert_event_name {\n        fields.push(JaegerKeyValue {\n            key: \"event\".to_string(),\n            v_type: ValueType::String as i32,\n            v_str: event.event_name,\n            v_bool: false,\n            v_int64: 0,\n            v_float64: 0.0,\n            v_binary: Vec::new(),\n        });\n    }\n    inject_dropped_count_tags(&mut fields, event.event_dropped_attributes_count, 0, 0);\n    let log = JaegerLog {\n        timestamp: Some(timestamp),\n        fields,\n    };\n    Ok(log)\n}\n\n#[allow(clippy::result_large_err)]\nfn collect_trace_ids(\n    trace_ids_postcard: &[u8],\n) -> Result<(Vec<TraceId>, TimeIntervalSecs), Status> {\n    let collector_fruit: <FindTraceIdsCollector as Collector>::Fruit =\n        postcard_deserialize(trace_ids_postcard, \"trace IDs aggregation\")?;\n    if collector_fruit.is_empty() {\n        return Ok((Vec::new(), 0..=0));\n    }\n    let mut trace_ids = Vec::with_capacity(collector_fruit.len());\n    let mut start = i64::MAX;\n    let mut end = i64::MIN;\n\n    for trace_id in collector_fruit {\n        trace_ids.push(trace_id.trace_id);\n        start = start.min(trace_id.span_timestamp.into_timestamp_secs());\n        end = end.max(trace_id.span_timestamp.into_timestamp_secs());\n    }\n    Ok((trace_ids, start..=end))\n}\n\n#[allow(clippy::result_large_err)]\nfn json_deserialize<'a, T>(json: &'a str, label: &'static str) -> Result<T, Status>\nwhere T: Deserialize<'a> {\n    match serde_json::from_str(json) {\n        Ok(deserialized) => Ok(deserialized),\n        Err(error) => {\n            error!(\"failed to deserialize {label}: {error:?}\");\n            
Err(Status::internal(format!(\n                \"Failed to deserialize {label}: {error:?}.\"\n            )))\n        }\n    }\n}\n\n#[allow(clippy::result_large_err)]\nfn postcard_deserialize<'a, T>(json: &'a [u8], label: &'static str) -> Result<T, Status>\nwhere T: Deserialize<'a> {\n    match postcard::from_bytes(json) {\n        Ok(deserialized) => Ok(deserialized),\n        Err(error) => {\n            error!(\"failed to deserialize {label}: {error:?}\");\n            Err(Status::internal(format!(\n                \"Failed to deserialize {label}: {error:?}.\"\n            )))\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_opentelemetry::otlp::{OTEL_TRACES_INDEX_ID_PATTERN, OtelSignal};\n    use quickwit_proto::jaeger::api_v2::ValueType;\n    use quickwit_proto::jaeger::storage::v1::span_reader_plugin_server::SpanReaderPlugin;\n    use quickwit_search::{MockSearchService, QuickwitAggregations, encode_term_for_test};\n    use serde_json::json;\n\n    use super::*;\n\n    #[track_caller]\n    fn get_must(ast: QueryAst) -> Vec<QueryAst> {\n        match ast {\n            QueryAst::Bool(boolean_query) => boolean_query.must,\n            _ => panic!(\"expected `QueryAst::Bool`, got `{ast:?}`\"),\n        }\n    }\n\n    #[track_caller]\n    fn get_must_not(ast: QueryAst) -> Vec<QueryAst> {\n        match ast {\n            QueryAst::Bool(boolean_query) => boolean_query.must_not,\n            _ => panic!(\"expected `QueryAst::Bool`, got `{ast:?}`\"),\n        }\n    }\n\n    #[test]\n    fn test_build_query() {\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                build_search_query(\n              
      service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                ),\n                QueryAst::MatchAll,\n            );\n        }\n        {\n            let service_name = \"quickwit search\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"service_name\".to_string(),\n                        value: service_name.to_string(),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"quickwit\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::from_iter([(\"_qw_query\".to_string(), \"query\".to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                build_search_query(\n                    service_name,\n                  
  span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                ),\n                quickwit_query::query_ast::UserInputQuery {\n                    user_text: \"query\".to_string(),\n                    default_fields: None,\n                    default_operator: quickwit_query::BooleanOperand::And,\n                    lenient: false,\n                }\n                .into()\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = \"client\".parse().ok();\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"span_kind\".to_string(),\n                        value: \"3\".to_string(),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"GET /config\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let 
min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"span_name\".to_string(),\n                        value: span_name.to_string(),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::from_iter([(\"error\".to_string(), \"true\".to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"span_status.code\".to_string(),\n                        value: \"error\".to_string(),\n                    }\n                    .into(),\n                ],\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let 
tags = HashMap::from_iter([(\"error\".to_string(), \"false\".to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must_not(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"span_status.code\".to_string(),\n                        value: \"error\".to_string(),\n                    }\n                    .into(),\n                ],\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tag_value = \"bar baz\";\n            let tags = HashMap::from_iter([(\"foo\".to_string(), tag_value.to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                  
              field: \"resource_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let event_name = \"Failed to ...\";\n            let tags = HashMap::from_iter([(\"event\".to_string(), event_name.to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"events.event_name\".to_string(),\n                        value: event_name.to_string(),\n                    }\n                    .into()\n              
  ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tag_value = \"bar\";\n            let event_name = \"Failed to ...\";\n            let tags = HashMap::from_iter([\n                (\"event\".to_string(), event_name.to_string()),\n                (\"foo\".to_string(), tag_value.to_string()),\n            ]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"events.event_name\".to_string(),\n                        value: event_name.to_string(),\n                    }\n                    .into(),\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                    
            value: tag_value.to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::from_iter([\n                (\"baz\".to_string(), \"qux\".to_string()),\n                (\"foo\".to_string(), \"bar\".to_string()),\n            ]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.baz\".to_string(),\n                                value: \"qux\".to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.baz\".to_string(),\n                                value: \"qux\".to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.baz\".to_string(),\n                                value: \"qux\".to_string(),\n              
              }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into(),\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.foo\".to_string(),\n                                value: \"bar\".to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.foo\".to_string(),\n                                value: \"bar\".to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                                value: \"bar\".to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = Some(3);\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                
    RangeQuery {\n                        field: \"span_start_timestamp_nanos\".to_string(),\n                        lower_bound: Bound::Included(\"1970-01-01T00:00:03Z\".to_string().into()),\n                        upper_bound: Bound::Unbounded\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = Some(33);\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    RangeQuery {\n                        field: \"span_start_timestamp_nanos\".to_string(),\n                        lower_bound: Bound::Unbounded,\n                        upper_bound: Bound::Included(\"1970-01-01T00:00:33Z\".to_string().into()),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = Some(3);\n            let max_span_start_timestamp_secs = Some(33);\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n        
            span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    RangeQuery {\n                        field: \"span_start_timestamp_nanos\".to_string(),\n                        lower_bound: Bound::Included(\"1970-01-01T00:00:03Z\".to_string().into()),\n                        upper_bound: Bound::Included(\"1970-01-01T00:00:33Z\".to_string().into()),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = Some(7);\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    RangeQuery {\n                        field: \"span_duration_millis\".to_string(),\n                        lower_bound: Bound::Included(7u64.into()),\n                        upper_bound: Bound::Unbounded\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = 
HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = Some(77);\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    RangeQuery {\n                        field: \"span_duration_millis\".to_string(),\n                        lower_bound: Bound::Unbounded,\n                        upper_bound: Bound::Included(77u64.into()),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tags = HashMap::new();\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = Some(7);\n            let max_span_duration_secs = Some(77);\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    RangeQuery {\n                        field: \"span_duration_millis\".to_string(),\n                        lower_bound: Bound::Included(7u64.into()),\n                        
upper_bound: Bound::Included(77u64.into()),\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"quickwit\";\n            let span_kind = None;\n            let span_name = \"\";\n            let tag_value = \"bar\";\n            let tags = HashMap::from_iter([(\"foo\".to_string(), tag_value.to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"service_name\".to_string(),\n                        value: service_name.to_string(),\n                    }\n                    .into(),\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                                value: 
tag_value.to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"quickwit\";\n            let span_kind = \"client\".parse().ok();\n            let span_name = \"\";\n            let tag_value = \"bar\";\n            let tags = HashMap::from_iter([(\"foo\".to_string(), tag_value.to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"service_name\".to_string(),\n                        value: service_name.to_string(),\n                    }\n                    .into(),\n                    TermQuery {\n                        field: \"span_kind\".to_string(),\n                        value: \"3\".to_string()\n                    }\n                    .into(),\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: 
\"span_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"quickwit\";\n            let span_kind = \"client\".parse().ok();\n            let span_name = \"leaf_search\";\n            let tag_value = \"bar\";\n            let tags = HashMap::from_iter([(\"foo\".to_string(), tag_value.to_string())]);\n            let min_span_start_timestamp_secs = None;\n            let max_span_start_timestamp_secs = None;\n            let min_span_duration_secs = None;\n            let max_span_duration_secs = None;\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n                    max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"service_name\".to_string(),\n                        value: service_name.to_string(),\n                    }\n                    .into(),\n                    TermQuery {\n                        field: \"span_kind\".to_string(),\n                        value: \"3\".to_string()\n                    }\n                    .into(),\n                    TermQuery {\n                        field: 
\"span_name\".to_string(),\n                        value: span_name.to_string(),\n                    }\n                    .into(),\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into()\n                ]\n            );\n        }\n        {\n            let service_name = \"quickwit\";\n            let span_kind = \"client\".parse().ok();\n            let span_name = \"leaf_search\";\n            let tag_value = \"bar\";\n            let tags = HashMap::from_iter([(\"foo\".to_string(), tag_value.to_string())]);\n            let min_span_start_timestamp_secs = Some(3);\n            let max_span_start_timestamp_secs = Some(33);\n            let min_span_duration_secs = Some(7);\n            let max_span_duration_secs = Some(77);\n            assert_eq!(\n                get_must(build_search_query(\n                    service_name,\n                    span_kind,\n                    span_name,\n                    tags,\n                    min_span_start_timestamp_secs,\n                    max_span_start_timestamp_secs,\n                    min_span_duration_secs,\n     
               max_span_duration_secs\n                )),\n                vec![\n                    TermQuery {\n                        field: \"service_name\".to_string(),\n                        value: service_name.to_string(),\n                    }\n                    .into(),\n                    TermQuery {\n                        field: \"span_kind\".to_string(),\n                        value: \"3\".to_string()\n                    }\n                    .into(),\n                    TermQuery {\n                        field: \"span_name\".to_string(),\n                        value: span_name.to_string(),\n                    }\n                    .into(),\n                    BoolQuery {\n                        should: vec![\n                            TermQuery {\n                                field: \"resource_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"span_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                            TermQuery {\n                                field: \"events.event_attributes.foo\".to_string(),\n                                value: tag_value.to_string(),\n                            }\n                            .into(),\n                        ],\n                        ..Default::default()\n                    }\n                    .into(),\n                    RangeQuery {\n                        field: \"span_start_timestamp_nanos\".to_string(),\n                        lower_bound: Bound::Included(\"1970-01-01T00:00:03Z\".to_string().into()),\n                        upper_bound: Bound::Included(\"1970-01-01T00:00:33Z\".to_string().into()),\n                    }\n                    
.into(),\n                    RangeQuery {\n                        field: \"span_duration_millis\".to_string(),\n                        lower_bound: Bound::Included(7u64.into()),\n                        upper_bound: Bound::Included(77u64.into()),\n                    }\n                    .into(),\n                ]\n            );\n        }\n    }\n\n    #[test]\n    fn test_build_aggregations_query() {\n        let aggregations_query = build_aggregations_query(77);\n        let aggregations: QuickwitAggregations = serde_json::from_str(&aggregations_query).unwrap();\n        let QuickwitAggregations::FindTraceIdsAggregation(collector) = aggregations else {\n            panic!(\"Expected find trace IDs aggregation!\");\n        };\n        assert_eq!(collector.num_traces, 77);\n        assert_eq!(collector.trace_id_field_name, \"trace_id\");\n        assert_eq!(\n            collector.span_timestamp_field_name,\n            \"span_start_timestamp_nanos\"\n        );\n    }\n\n    #[test]\n    fn test_to_duration_millis() {\n        {\n            let duration = WellKnownDuration {\n                seconds: 0,\n                nanos: 1,\n            };\n            let duration_millis = to_duration_millis(&duration);\n            assert!(duration_millis.is_none())\n        }\n        {\n            let duration = WellKnownDuration {\n                seconds: 1,\n                nanos: 1_000_000,\n            };\n            let duration_millis = to_duration_millis(&duration).unwrap();\n            assert_eq!(duration_millis, 1001)\n        }\n    }\n\n    #[test]\n    fn test_to_well_known_duration() {\n        let duration = to_well_known_duration(1_000_000_001, 2_000_000_002);\n        assert_eq!(duration.seconds, 1);\n        assert_eq!(duration.nanos, 1);\n    }\n\n    #[test]\n    fn test_to_well_known_timestamp() {\n        let timestamp = to_well_known_timestamp(1_000_000_001);\n        assert_eq!(timestamp.seconds, 1);\n        
assert_eq!(timestamp.nanos, 1);\n    }\n\n    #[test]\n    fn test_otlp_attributes_to_jaeger_tags() {\n        let mut tags = otlp_attributes_to_jaeger_tags([\n            (\"array_int\".to_string(), json!([1, 2])),\n            (\"array_str\".to_string(), json!([\"foo\", \"bar\"])),\n            (\"bool\".to_string(), json!(true)),\n            (\"float\".to_string(), json!(1.0)),\n            (\"integer\".to_string(), json!(1)),\n            (\"string\".to_string(), json!(\"foo\")),\n            (\n                \"object\".to_string(),\n                json!({\n                    \"array_int\": [1,2],\n                    \"array_str\": [\"foo\", \"bar\"],\n                    \"bool\": true,\n                    \"float\": 1.0,\n                    \"integer\": 1,\n                    \"string\": \"foo\",\n                }),\n            ),\n        ]);\n        tags.sort_by(|left, right| left.key.cmp(&right.key));\n\n        // a tag for the 6 keys in the root, plus 6 more for the nested keys\n        assert_eq!(tags.len(), 12);\n\n        assert_eq!(tags[0].key, \"array_int\");\n        assert_eq!(tags[0].v_type(), ValueType::String);\n        assert_eq!(tags[0].v_str, \"[1,2]\");\n\n        assert_eq!(tags[1].key, \"array_str\");\n        assert_eq!(tags[1].v_type(), ValueType::String);\n        assert_eq!(tags[1].v_str, r#\"[\"foo\",\"bar\"]\"#);\n\n        assert_eq!(tags[2].key, \"bool\");\n        assert_eq!(tags[2].v_type(), ValueType::Bool);\n        assert!(tags[2].v_bool);\n\n        assert_eq!(tags[3].key, \"float\");\n        assert_eq!(tags[3].v_type(), ValueType::Float64);\n        assert_eq!(tags[3].v_float64, 1.0);\n\n        assert_eq!(tags[4].key, \"integer\");\n        assert_eq!(tags[4].v_type(), ValueType::Int64);\n        assert_eq!(tags[4].v_int64, 1);\n\n        assert_eq!(tags[5].key, \"object.array_int\");\n        assert_eq!(tags[5].v_type(), ValueType::String);\n        assert_eq!(tags[5].v_str, \"[1,2]\");\n\n        
assert_eq!(tags[6].key, \"object.array_str\");\n        assert_eq!(tags[6].v_type(), ValueType::String);\n        assert_eq!(tags[6].v_str, r#\"[\"foo\",\"bar\"]\"#);\n\n        assert_eq!(tags[7].key, \"object.bool\");\n        assert_eq!(tags[7].v_type(), ValueType::Bool);\n        assert!(tags[7].v_bool);\n\n        assert_eq!(tags[8].key, \"object.float\");\n        assert_eq!(tags[8].v_type(), ValueType::Float64);\n        assert_eq!(tags[8].v_float64, 1.0);\n\n        assert_eq!(tags[9].key, \"object.integer\");\n        assert_eq!(tags[9].v_type(), ValueType::Int64);\n        assert_eq!(tags[9].v_int64, 1);\n\n        assert_eq!(tags[10].key, \"object.string\");\n        assert_eq!(tags[10].v_type(), ValueType::String);\n        assert_eq!(tags[10].v_str, \"foo\");\n\n        assert_eq!(tags[11].key, \"string\");\n        assert_eq!(tags[11].v_type(), ValueType::String);\n        assert_eq!(tags[11].v_str, \"foo\");\n    }\n\n    #[test]\n    fn test_inject_dropped_attribute_tag() {\n        let mut tags = Vec::new();\n\n        inject_dropped_count_tags(&mut tags, 0, 0, 0);\n        assert!(tags.is_empty());\n\n        inject_dropped_count_tags(&mut tags, 1, 2, 3);\n        assert_eq!(tags.len(), 3);\n\n        assert_eq!(tags[0].key, \"otel.dropped_attributes_count\");\n        assert_eq!(tags[0].v_type(), ValueType::Int64);\n        assert_eq!(tags[0].v_int64, 1);\n\n        assert_eq!(tags[1].key, \"otel.dropped_events_count\");\n        assert_eq!(tags[1].v_type(), ValueType::Int64);\n        assert_eq!(tags[1].v_int64, 2);\n\n        assert_eq!(tags[2].key, \"otel.dropped_links_count\");\n        assert_eq!(tags[2].v_type(), ValueType::Int64);\n        assert_eq!(tags[2].v_int64, 3);\n    }\n\n    #[test]\n    fn test_inject_span_kind_tag() {\n        {\n            let mut tags = Vec::new();\n            inject_span_kind_tag(&mut tags, 0);\n            assert!(tags.is_empty());\n        }\n        {\n            let mut tags = Vec::new();\n            
inject_span_kind_tag(&mut tags, 1);\n            assert!(tags.is_empty());\n        }\n        {\n            for (expected_span_kind, span_kind_id) in [\"server\", \"client\", \"producer\", \"consumer\"]\n                .iter()\n                .zip(2..6)\n            {\n                let mut tags = Vec::new();\n                inject_span_kind_tag(&mut tags, span_kind_id);\n                assert_eq!(tags.len(), 1);\n\n                assert_eq!(tags[0].key, \"span.kind\");\n                assert_eq!(tags[0].v_type(), ValueType::String);\n                assert_eq!(tags[0].v_str, *expected_span_kind);\n            }\n        }\n    }\n\n    #[test]\n    fn test_inject_status_code_tag() {\n        {\n            let mut tags = Vec::new();\n            let span_status = QwSpanStatus {\n                code: OtlpStatusCode::Unset,\n                message: None,\n            };\n            inject_span_status_tags(&mut tags, span_status);\n            assert!(tags.is_empty());\n        }\n        {\n            let mut tags = Vec::new();\n            let span_status = QwSpanStatus {\n                code: OtlpStatusCode::Ok,\n                message: None,\n            };\n            inject_span_status_tags(&mut tags, span_status);\n            assert_eq!(tags.len(), 1);\n            assert_eq!(tags[0].key, \"otel.status_code\");\n            assert_eq!(tags[0].v_type(), ValueType::String);\n            assert_eq!(tags[0].v_str, \"OK\");\n        }\n        {\n            let mut tags = Vec::new();\n            let span_status = QwSpanStatus {\n                code: OtlpStatusCode::Error,\n                message: Some(\"An error occurred.\".to_string()),\n            };\n            inject_span_status_tags(&mut tags, span_status);\n            assert_eq!(tags.len(), 3);\n\n            assert_eq!(tags[0].key, \"otel.status_code\");\n            assert_eq!(tags[0].v_type(), ValueType::String);\n            assert_eq!(tags[0].v_str, \"ERROR\");\n\n            
assert_eq!(tags[1].key, \"otel.status_description\");\n            assert_eq!(tags[1].v_type(), ValueType::String);\n            assert_eq!(tags[1].v_str, \"An error occurred.\");\n\n            assert_eq!(tags[2].key, \"error\");\n            assert_eq!(tags[2].v_type(), ValueType::Bool);\n            assert!(tags[2].v_bool);\n        }\n    }\n\n    #[test]\n    fn test_qw_event_to_jaeger_logs() {\n        {\n            let event = QwEvent {\n                event_timestamp_nanos: 1_000_000_001,\n                event_name: \"\".to_string(),\n                event_attributes: HashMap::from_iter([(\"foo\".to_string(), json!(\"bar\"))]),\n                event_dropped_attributes_count: 0,\n            };\n            let log = qw_event_to_jaeger_log(event).unwrap();\n            assert_eq!(\n                log.timestamp.unwrap(),\n                to_well_known_timestamp(1_000_000_001)\n            );\n            assert_eq!(log.fields.len(), 1);\n\n            assert_eq!(log.fields[0].key, \"foo\");\n            assert_eq!(log.fields[0].v_type(), ValueType::String);\n            assert_eq!(log.fields[0].v_str, \"bar\");\n        }\n        {\n            let event = QwEvent {\n                event_timestamp_nanos: 1_000_000_001,\n                event_name: \"Failed to ...\".to_string(),\n                event_attributes: HashMap::from_iter([(\"foo\".to_string(), json!(\"bar\"))]),\n                event_dropped_attributes_count: 1,\n            };\n            let log = qw_event_to_jaeger_log(event).unwrap();\n            assert_eq!(log.fields.len(), 3);\n\n            assert_eq!(log.fields[0].key, \"foo\");\n            assert_eq!(log.fields[0].v_type(), ValueType::String);\n            assert_eq!(log.fields[0].v_str, \"bar\");\n\n            assert_eq!(log.fields[1].key, \"event\");\n            assert_eq!(log.fields[1].v_type(), ValueType::String);\n            assert_eq!(log.fields[1].v_str, \"Failed to ...\");\n\n            assert_eq!(log.fields[2].key, 
\"otel.dropped_attributes_count\");\n            assert_eq!(log.fields[2].v_type(), ValueType::Int64);\n            assert_eq!(log.fields[2].v_int64, 1);\n        }\n        {\n            let event = QwEvent {\n                event_timestamp_nanos: 1_000_000_001,\n                event_name: \"Failed to ...\".to_string(),\n                event_attributes: HashMap::from_iter([(\"event\".to_string(), json!(\"foo\"))]),\n                event_dropped_attributes_count: 0,\n            };\n            let log = qw_event_to_jaeger_log(event).unwrap();\n            assert_eq!(log.fields.len(), 1);\n            assert_eq!(log.fields[0].key, \"event\");\n            assert_eq!(log.fields[0].v_type(), ValueType::String);\n            assert_eq!(log.fields[0].v_str, \"foo\");\n        }\n    }\n\n    #[test]\n    fn test_qw_span_to_jaeger_span() {\n        let qw_span = QwSpan {\n            trace_id: TraceId::new([1; 16]),\n            trace_state: Some(\"key1=value1,key2=value2\".to_string()),\n            service_name: \"quickwit\".to_string(),\n            resource_attributes: HashMap::from_iter([(\n                \"resource_key\".to_string(),\n                json!(\"resource_value\"),\n            )]),\n            resource_dropped_attributes_count: 1,\n            scope_name: Some(\"vector.dev\".to_string()),\n            scope_version: Some(\"1.0.0\".to_string()),\n            scope_attributes: HashMap::from_iter([(\"scope_key\".to_string(), json!(\"scope_value\"))]),\n            scope_dropped_attributes_count: 2,\n            span_id: SpanId::new([2; 8]),\n            span_kind: 2,\n            span_name: \"publish_split\".to_string(),\n            span_fingerprint: Some(SpanFingerprint::new(\"quickwit\", 2.into(), \"publish_split\")),\n            span_start_timestamp_nanos: 1_000_000_001,\n            span_end_timestamp_nanos: 2_000_000_002,\n            span_duration_millis: Some(1_001),\n            span_attributes: 
HashMap::from_iter([(\"span_key\".to_string(), json!(\"span_value\"))]),\n            span_dropped_attributes_count: 3,\n            span_dropped_events_count: 4,\n            span_dropped_links_count: 5,\n            span_status: QwSpanStatus {\n                code: OtlpStatusCode::Error,\n                message: Some(\"An error occurred.\".to_string()),\n            },\n            parent_span_id: Some(SpanId::new([3; 8])),\n            is_root: Some(false),\n            events: vec![QwEvent {\n                event_timestamp_nanos: 1000500003,\n                event_name: \"event_name\".to_string(),\n                event_attributes: HashMap::from_iter([(\n                    \"event_key\".to_string(),\n                    json!(\"event_value\"),\n                )]),\n                event_dropped_attributes_count: 6,\n            }],\n            event_names: vec![\"event_name\".to_string()],\n            links: vec![QwLink {\n                link_trace_id: TraceId::new([4; 16]),\n                link_trace_state: Some(\"link_key1=link_value1,link_key2=link_value2\".to_string()),\n                link_span_id: SpanId::new([5; 8]),\n                link_attributes: HashMap::from_iter([(\n                    \"link_key\".to_string(),\n                    json!(\"link_value\"),\n                )]),\n                link_dropped_attributes_count: 7,\n            }],\n        };\n        let qw_span_json = serde_json::to_string(&qw_span).unwrap();\n        let jaeger_span = qw_span_to_jaeger_span(&qw_span_json).unwrap();\n        assert_eq!(jaeger_span.trace_id, [1; 16]);\n        assert_eq!(jaeger_span.span_id, [2; 8]);\n        assert_eq!(jaeger_span.operation_name, \"publish_split\");\n        assert_eq!(\n            jaeger_span.references,\n            vec![\n                JaegerSpanRef {\n                    trace_id: vec![1; 16],\n                    span_id: vec![3; 8],\n                    ref_type: 0,\n                },\n                
JaegerSpanRef {\n                    trace_id: vec![4; 16],\n                    span_id: vec![5; 8],\n                    ref_type: 1,\n                }\n            ]\n        );\n        assert_eq!(jaeger_span.flags, 0);\n        assert_eq!(\n            jaeger_span.start_time.unwrap(),\n            WellKnownTimestamp {\n                seconds: 1,\n                nanos: 1,\n            }\n        );\n        assert_eq!(\n            jaeger_span.duration.unwrap(),\n            WellKnownDuration {\n                seconds: 1,\n                nanos: 1,\n            }\n        );\n        assert_eq!(\n            jaeger_span.tags,\n            vec![\n                JaegerKeyValue {\n                    key: \"span_key\".to_string(),\n                    v_type: 0,\n                    v_str: \"span_value\".to_string(),\n                    v_bool: false,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"otel.dropped_attributes_count\".to_string(),\n                    v_type: 2,\n                    v_str: String::new(),\n                    v_bool: false,\n                    v_int64: 3,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"otel.dropped_events_count\".to_string(),\n                    v_type: 2,\n                    v_str: String::new(),\n                    v_bool: false,\n                    v_int64: 4,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"otel.dropped_links_count\".to_string(),\n                    v_type: 2,\n                    v_str: String::new(),\n                    v_bool: false,\n                    v_int64: 5,\n                    v_float64: 0.0,\n                 
   v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"span.kind\".to_string(),\n                    v_type: 0,\n                    v_str: \"server\".to_string(),\n                    v_bool: false,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"otel.status_code\".to_string(),\n                    v_type: 0,\n                    v_str: \"ERROR\".to_string(),\n                    v_bool: false,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"otel.status_description\".to_string(),\n                    v_type: 0,\n                    v_str: \"An error occurred.\".to_string(),\n                    v_bool: false,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n                JaegerKeyValue {\n                    key: \"error\".to_string(),\n                    v_type: 1,\n                    v_str: String::new(),\n                    v_bool: true,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                },\n            ]\n        );\n        assert_eq!(\n            jaeger_span.logs,\n            vec![JaegerLog {\n                timestamp: Some(WellKnownTimestamp {\n                    seconds: 1,\n                    nanos: 500003,\n                }),\n                fields: vec![\n                    JaegerKeyValue {\n                        key: \"event_key\".to_string(),\n                        v_type: 0,\n                        v_str: \"event_value\".to_string(),\n                        v_bool: false,\n                        v_int64: 0,\n                        
v_float64: 0.0,\n                        v_binary: Vec::new()\n                    },\n                    JaegerKeyValue {\n                        key: \"event\".to_string(),\n                        v_type: 0,\n                        v_str: \"event_name\".to_string(),\n                        v_bool: false,\n                        v_int64: 0,\n                        v_float64: 0.0,\n                        v_binary: Vec::new()\n                    },\n                    JaegerKeyValue {\n                        key: \"otel.dropped_attributes_count\".to_string(),\n                        v_type: 2,\n                        v_str: String::new(),\n                        v_bool: false,\n                        v_int64: 6,\n                        v_float64: 0.0,\n                        v_binary: Vec::new()\n                    },\n                ],\n            }]\n        );\n        assert_eq!(\n            jaeger_span.process.unwrap(),\n            JaegerProcess {\n                service_name: \"quickwit\".to_string(),\n                tags: vec![JaegerKeyValue {\n                    key: \"resource_key\".to_string(),\n                    v_type: 0,\n                    v_str: \"resource_value\".to_string(),\n                    v_bool: false,\n                    v_int64: 0,\n                    v_float64: 0.0,\n                    v_binary: Vec::new()\n                }]\n            }\n        );\n        assert!(jaeger_span.warnings.is_empty());\n    }\n\n    #[test]\n    fn test_otlp_links_to_jaeger_references() {\n        let trace_id = TraceId::new([1; 16]);\n        let parent_span_id = SpanId::new([3; 8]);\n        let links = vec![QwLink {\n            link_trace_id: TraceId::new([4; 16]),\n            link_trace_state: Some(\"link_key1=link_value1,link_key2=link_value2\".to_string()),\n            link_span_id: SpanId::new([5; 8]),\n            link_attributes: HashMap::from_iter([(\"link_key\".to_string(), json!(\"link_value\"))]),\n           
 link_dropped_attributes_count: 7,\n        }];\n        let jaeger_references =\n            otlp_links_to_jaeger_references(&trace_id, Some(parent_span_id), links).unwrap();\n        assert_eq!(\n            jaeger_references,\n            vec![\n                JaegerSpanRef {\n                    trace_id: vec![1; 16],\n                    span_id: vec![3; 8],\n                    ref_type: 0,\n                },\n                JaegerSpanRef {\n                    trace_id: vec![4; 16],\n                    span_id: vec![5; 8],\n                    ref_type: 1,\n                }\n            ]\n        );\n    }\n\n    #[test]\n    fn test_collect_trace_ids() {\n        use quickwit_opentelemetry::otlp::TraceId;\n        use quickwit_search::Span;\n        use tantivy::DateTime;\n        {\n            let agg_result: Vec<Span> = Vec::new();\n            let agg_result_postcard = postcard::to_stdvec(&agg_result).unwrap();\n            let (trace_ids, _span_timestamps_range) =\n                collect_trace_ids(&agg_result_postcard).unwrap();\n            assert!(trace_ids.is_empty());\n        }\n        {\n            let agg_result = vec![Span {\n                trace_id: TraceId::new([\n                    0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,\n                    0x01, 0x01, 0x01,\n                ]),\n                span_timestamp: DateTime::from_timestamp_nanos(1684857492783747000),\n            }];\n            let agg_result_postcard = postcard::to_stdvec(&agg_result).unwrap();\n            let (trace_ids, span_timestamps_range) =\n                collect_trace_ids(&agg_result_postcard).unwrap();\n            assert_eq!(trace_ids.len(), 1);\n            assert_eq!(span_timestamps_range, 1684857492..=1684857492);\n        }\n        {\n            let agg_result = vec![\n                Span {\n                    trace_id: TraceId::new([\n                        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 
0x09, 0x0a, 0x0b, 0x0c,\n                        0x0d, 0x0e, 0x0f, 0x10,\n                    ]),\n                    span_timestamp: DateTime::from_timestamp_nanos(1684857492783747000),\n                },\n                Span {\n                    trace_id: TraceId::new([\n                        0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,\n                        0x02, 0x02, 0x02, 0x02,\n                    ]),\n                    span_timestamp: DateTime::from_timestamp_nanos(1684857826019627000),\n                },\n            ];\n            let agg_result_postcard = postcard::to_stdvec(&agg_result).unwrap();\n            let (trace_ids, span_timestamps_range) =\n                collect_trace_ids(&agg_result_postcard).unwrap();\n            assert_eq!(trace_ids.len(), 2);\n            assert_eq!(span_timestamps_range, 1684857492..=1684857826);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_get_services() {\n        let mut service = MockSearchService::new();\n        service\n            .expect_root_list_terms()\n            .withf(|req| {\n                req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID_PATTERN]\n                    && req.field == \"service_name\"\n                    && req.start_timestamp.is_some()\n            })\n            .return_once(|_| {\n                Ok(quickwit_proto::search::ListTermsResponse {\n                    num_hits: 3,\n                    terms: vec![\n                        encode_term_for_test!(\"service1\"),\n                        encode_term_for_test!(\"service2\"),\n                        encode_term_for_test!(\"service3\"),\n                    ],\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                })\n            });\n\n        let service = Arc::new(service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), service);\n\n        let request = tonic::Request::new(GetServicesRequest {});\n   
     let response = jaeger.get_services(request).await.unwrap().into_inner();\n        assert_eq!(response.services, &[\"service1\", \"service2\", \"service3\"]);\n    }\n\n    #[tokio::test]\n    async fn test_get_services_on_custom_indexes() {\n        let mut service = MockSearchService::new();\n        service\n            .expect_root_list_terms()\n            .withf(|req| {\n                req.index_id_patterns == vec![\"index-1\", \"index-3*\"]\n                    && req.field == \"service_name\"\n                    && req.start_timestamp.is_some()\n            })\n            .return_once(|_| {\n                Ok(quickwit_proto::search::ListTermsResponse {\n                    num_hits: 3,\n                    terms: vec![\n                        encode_term_for_test!(\"service1\"),\n                        encode_term_for_test!(\"service2\"),\n                        encode_term_for_test!(\"service3\"),\n                    ],\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                })\n            });\n\n        let service = Arc::new(service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), service);\n\n        let mut request = tonic::Request::new(GetServicesRequest {});\n        request.metadata_mut().insert(\n            OtelSignal::Traces.header_name(),\n            \"index-1,index-3*\".parse().unwrap(),\n        );\n        let response = jaeger.get_services(request).await.unwrap().into_inner();\n        assert_eq!(response.services, &[\"service1\", \"service2\", \"service3\"]);\n    }\n\n    #[tokio::test]\n    async fn test_v2_get_services() {\n        let mut service = MockSearchService::new();\n        service\n            .expect_root_list_terms()\n            .withf(|req| {\n                req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID_PATTERN]\n                    && req.field == \"service_name\"\n                    && req.start_timestamp.is_some()\n            })\n    
        .return_once(|_| {\n                Ok(quickwit_proto::search::ListTermsResponse {\n                    num_hits: 3,\n                    terms: vec![\n                        encode_term_for_test!(\"service1\"),\n                        encode_term_for_test!(\"service2\"),\n                        encode_term_for_test!(\"service3\"),\n                    ],\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                })\n            });\n\n        let service = Arc::new(service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), service);\n\n        let request =\n            tonic::Request::new(quickwit_proto::jaeger::storage::v2::GetServicesRequest {});\n        let response =\n            quickwit_proto::jaeger::storage::v2::trace_reader_server::TraceReader::get_services(\n                &jaeger, request,\n            )\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(response.services, &[\"service1\", \"service2\", \"service3\"]);\n    }\n\n    #[tokio::test]\n    async fn test_v2_get_operations() {\n        let mut service = MockSearchService::new();\n        service\n            .expect_root_list_terms()\n            .withf(|req| {\n                req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID_PATTERN]\n                    && req.field == \"span_fingerprint\"\n                    && req.start_timestamp.is_some()\n            })\n            .return_once(|_| {\n                let fingerprint1 =\n                    SpanFingerprint::new(\"test-service\", QwSpanKind::from(2), \"GET /api\");\n                let fingerprint2 =\n                    SpanFingerprint::new(\"test-service\", QwSpanKind::from(3), \"POST /data\");\n\n                Ok(quickwit_proto::search::ListTermsResponse {\n                    num_hits: 2,\n                    terms: vec![\n                        encode_term_for_test!(fingerprint1.as_str()),\n                        
encode_term_for_test!(fingerprint2.as_str()),\n                    ],\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                })\n            });\n\n        let service = Arc::new(service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), service);\n\n        let request =\n            tonic::Request::new(quickwit_proto::jaeger::storage::v2::GetOperationsRequest {\n                service: \"test-service\".to_string(),\n                span_kind: String::new(),\n            });\n        let response =\n            quickwit_proto::jaeger::storage::v2::trace_reader_server::TraceReader::get_operations(\n                &jaeger, request,\n            )\n            .await\n            .unwrap()\n            .into_inner();\n        assert_eq!(response.operations.len(), 2);\n        assert_eq!(response.operations[0].name, \"GET /api\");\n        assert_eq!(response.operations[0].span_kind, \"server\");\n        assert_eq!(response.operations[1].name, \"POST /data\");\n        assert_eq!(response.operations[1].span_kind, \"client\");\n    }\n\n    #[tokio::test]\n    async fn test_v2_find_trace_ids() {\n        let mut service = MockSearchService::new();\n        service\n            .expect_root_search()\n            .withf(|req| {\n                req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID_PATTERN]\n                    && req.start_timestamp.is_some()\n                    && req.end_timestamp.is_some()\n            })\n            .return_once(|_| {\n                use quickwit_search::Span as TraceSpan;\n                use tantivy::DateTime;\n\n                let trace_id_1 =\n                    TraceId::new([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]);\n                let trace_id_2 = TraceId::new([\n                    17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,\n                ]);\n\n                let spans = vec![\n                    TraceSpan {\n        
                trace_id: trace_id_1,\n                        span_timestamp: DateTime::from_timestamp_secs(1500),\n                    },\n                    TraceSpan {\n                        trace_id: trace_id_2,\n                        span_timestamp: DateTime::from_timestamp_secs(1600),\n                    },\n                ];\n\n                let aggregation_postcard = postcard::to_allocvec(&spans).unwrap();\n\n                Ok(quickwit_proto::search::SearchResponse {\n                    num_hits: 2,\n                    hits: vec![],\n                    elapsed_time_micros: 100,\n                    errors: Vec::new(),\n                    aggregation_postcard: Some(aggregation_postcard),\n                    scroll_id: None,\n                    failed_splits: Vec::new(),\n                    num_successful_splits: 1,\n                })\n            });\n\n        let service = Arc::new(service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), service);\n\n        let request = tonic::Request::new(quickwit_proto::jaeger::storage::v2::FindTracesRequest {\n            query: Some(quickwit_proto::jaeger::storage::v2::TraceQueryParameters {\n                service_name: \"test-service\".to_string(),\n                operation_name: String::new(),\n                attributes: vec![],\n                start_time_min: Some(prost_types::Timestamp {\n                    seconds: 1000,\n                    nanos: 0,\n                }),\n                start_time_max: Some(prost_types::Timestamp {\n                    seconds: 2000,\n                    nanos: 0,\n                }),\n                duration_min: None,\n                duration_max: None,\n                search_depth: 10,\n            }),\n        });\n        let response =\n            quickwit_proto::jaeger::storage::v2::trace_reader_server::TraceReader::find_trace_i_ds(\n                &jaeger, request,\n            )\n            .await\n            
.unwrap()\n            .into_inner();\n        assert_eq!(response.trace_ids.len(), 2);\n        assert_eq!(response.trace_ids[0].trace_id.len(), 16);\n        assert_eq!(response.trace_ids[1].trace_id.len(), 16);\n    }\n\n    #[test]\n    fn test_convert_v2_attributes_to_v1_tags() {\n        let attributes = vec![\n            quickwit_proto::jaeger::storage::v2::KeyValue {\n                key: \"http.method\".to_string(),\n                value: Some(quickwit_proto::jaeger::storage::v2::AnyValue {\n                    value: Some(\n                        quickwit_proto::jaeger::storage::v2::any_value::Value::StringValue(\n                            \"GET\".to_string(),\n                        ),\n                    ),\n                }),\n            },\n            quickwit_proto::jaeger::storage::v2::KeyValue {\n                key: \"http.status_code\".to_string(),\n                value: Some(quickwit_proto::jaeger::storage::v2::AnyValue {\n                    value: Some(\n                        quickwit_proto::jaeger::storage::v2::any_value::Value::IntValue(200),\n                    ),\n                }),\n            },\n            quickwit_proto::jaeger::storage::v2::KeyValue {\n                key: \"error\".to_string(),\n                value: Some(quickwit_proto::jaeger::storage::v2::AnyValue {\n                    value: Some(\n                        quickwit_proto::jaeger::storage::v2::any_value::Value::BoolValue(false),\n                    ),\n                }),\n            },\n        ];\n\n        let tags = crate::v2::convert_v2_attributes_to_v1_tags(attributes);\n        assert_eq!(tags.len(), 3);\n        assert_eq!(tags.get(\"http.method\"), Some(&\"GET\".to_string()));\n        assert_eq!(tags.get(\"http.status_code\"), Some(&\"200\".to_string()));\n        assert_eq!(tags.get(\"error\"), Some(&\"false\".to_string()));\n    }\n\n    // Note: test_spans_to_otel_traces_data was removed as v2 now works directly with\n    // native 
OpenTelemetry format (QwSpan) instead of converting from Jaeger v1 format\n}\n"
  },
  {
    "path": "quickwit/quickwit-jaeger/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    HistogramVec, IntCounterVec, exponential_buckets, new_counter_vec, new_histogram_vec,\n};\n\npub struct JaegerServiceMetrics {\n    pub requests_total: IntCounterVec<2>,\n    pub request_errors_total: IntCounterVec<2>,\n    pub request_duration_seconds: HistogramVec<3>,\n    pub fetched_traces_total: IntCounterVec<2>,\n    pub fetched_spans_total: IntCounterVec<2>,\n    pub transferred_bytes_total: IntCounterVec<2>,\n}\n\nimpl Default for JaegerServiceMetrics {\n    fn default() -> Self {\n        Self {\n            requests_total: new_counter_vec(\n                \"requests_total\",\n                \"Number of requests\",\n                \"jaeger\",\n                &[],\n                [\"operation\", \"index\"],\n            ),\n            request_errors_total: new_counter_vec(\n                \"request_errors_total\",\n                \"Number of failed requests\",\n                \"jaeger\",\n                &[],\n                [\"operation\", \"index\"],\n            ),\n            request_duration_seconds: new_histogram_vec(\n                \"request_duration_seconds\",\n                \"Duration of requests\",\n                \"jaeger\",\n                &[],\n                [\"operation\", \"index\", \"error\"],\n                exponential_buckets(0.02, 
2.0, 8).unwrap(),\n            ),\n            fetched_traces_total: new_counter_vec(\n                \"fetched_traces_total\",\n                \"Number of traces retrieved from storage\",\n                \"jaeger\",\n                &[],\n                [\"operation\", \"index\"],\n            ),\n            fetched_spans_total: new_counter_vec(\n                \"fetched_spans_total\",\n                \"Number of spans retrieved from storage\",\n                \"jaeger\",\n                &[],\n                [\"operation\", \"index\"],\n            ),\n            transferred_bytes_total: new_counter_vec(\n                \"transferred_bytes_total\",\n                \"Number of bytes transferred\",\n                \"jaeger\",\n                &[],\n                [\"operation\", \"index\"],\n            ),\n        }\n    }\n}\n\npub static JAEGER_SERVICE_METRICS: Lazy<JaegerServiceMetrics> =\n    Lazy::new(JaegerServiceMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-jaeger/src/v1.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Jaeger v1 API implementation (SpanReaderPlugin)\n\nuse std::time::Instant;\n\nuse async_trait::async_trait;\nuse quickwit_opentelemetry::otlp::{\n    OTEL_TRACES_INDEX_ID, extract_otel_traces_index_id_patterns_from_metadata,\n};\nuse quickwit_proto::jaeger::storage::v1::span_reader_plugin_server::SpanReaderPlugin;\nuse quickwit_proto::jaeger::storage::v1::{\n    FindTraceIDsRequest, FindTraceIDsResponse, FindTracesRequest, GetOperationsRequest,\n    GetOperationsResponse, GetServicesRequest, GetServicesResponse, GetTraceRequest,\n};\nuse tonic::{Request, Response, Status};\n\nuse crate::metrics::JAEGER_SERVICE_METRICS;\nuse crate::{JaegerService, SpanStream};\n\nmacro_rules! 
metrics {\n    ($expr:expr, [$operation:ident, $($label:expr),*]) => {\n        let start = std::time::Instant::now();\n        let labels = [stringify!($operation), $($label,)*];\n        JAEGER_SERVICE_METRICS.requests_total.with_label_values(labels).inc();\n        let (res, is_error) = match $expr {\n            ok @ Ok(_) => {\n                (ok, \"false\")\n            },\n            err @ Err(_) => {\n                JAEGER_SERVICE_METRICS.request_errors_total.with_label_values(labels).inc();\n                (err, \"true\")\n            },\n        };\n        let elapsed = start.elapsed().as_secs_f64();\n        let labels = [stringify!($operation), $($label,)* is_error];\n        JAEGER_SERVICE_METRICS.request_duration_seconds.with_label_values(labels).observe(elapsed);\n\n        return res.map(Response::new);\n    };\n}\n\n#[async_trait]\nimpl SpanReaderPlugin for JaegerService {\n    type GetTraceStream = SpanStream;\n\n    type FindTracesStream = SpanStream;\n\n    async fn get_services(\n        &self,\n        request: Request<GetServicesRequest>,\n    ) -> Result<Response<GetServicesResponse>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n        metrics!(\n            self.get_services_for_indexes(request.into_inner(), index_id_patterns)\n                .await,\n            [get_services, OTEL_TRACES_INDEX_ID]\n        );\n    }\n\n    async fn get_operations(\n        &self,\n        request: Request<GetOperationsRequest>,\n    ) -> Result<Response<GetOperationsResponse>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n        metrics!(\n            self.get_operations_for_indexes(request.into_inner(), index_id_patterns)\n                .await,\n            [get_operations, OTEL_TRACES_INDEX_ID]\n        );\n    }\n\n    async fn find_trace_i_ds(\n        &self,\n        
request: Request<FindTraceIDsRequest>,\n    ) -> Result<Response<FindTraceIDsResponse>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n        metrics!(\n            self.find_trace_ids_for_indexes(request.into_inner(), index_id_patterns)\n                .await,\n            [find_trace_ids, OTEL_TRACES_INDEX_ID]\n        );\n    }\n\n    async fn find_traces(\n        &self,\n        request: Request<FindTracesRequest>,\n    ) -> Result<Response<Self::FindTracesStream>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n        self.find_traces_for_indexes(\n            request.into_inner(),\n            \"find_traces\",\n            Instant::now(),\n            index_id_patterns,\n            false, /* if we use true, Jaeger will display \"1 Span\", and display an empty trace\n                    * when clicking on the ui (but display the full trace after reloading the\n                    * page) */\n        )\n        .await\n        .map(Response::new)\n    }\n\n    async fn get_trace(\n        &self,\n        request: Request<GetTraceRequest>,\n    ) -> Result<Response<Self::GetTraceStream>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n        self.get_trace_for_indexes(\n            request.into_inner(),\n            \"get_trace\",\n            Instant::now(),\n            index_id_patterns,\n        )\n        .await\n        .map(Response::new)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-jaeger/src/v2.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// Jaeger v2 API implementation (TraceReader)\nuse std::collections::HashMap;\nuse std::sync::Arc;\nuse std::time::Instant;\n\nuse async_trait::async_trait;\nuse prost_types::Timestamp as WellKnownTimestamp;\nuse quickwit_opentelemetry::otlp::{\n    OTEL_TRACES_INDEX_ID, Span as QwSpan, TraceId,\n    extract_otel_traces_index_id_patterns_from_metadata,\n};\nuse quickwit_proto::jaeger::storage::v2::trace_reader_server::TraceReader;\nuse quickwit_proto::jaeger::storage::v2::{\n    FindTracesRequest, FoundTraceId, GetOperationsRequest, GetOperationsResponse,\n    GetServicesRequest, GetServicesResponse, GetTracesRequest, Operation,\n};\nuse quickwit_proto::opentelemetry::proto::common::v1::any_value::Value as OtelValue;\nuse quickwit_proto::opentelemetry::proto::common::v1::{\n    AnyValue as OtelAnyValue, InstrumentationScope, KeyValue as OtelKeyValue,\n};\nuse quickwit_proto::opentelemetry::proto::resource::v1::Resource as OtelResource;\nuse quickwit_proto::opentelemetry::proto::trace::v1 as otel_trace;\nuse quickwit_proto::opentelemetry::proto::trace::v1::status::StatusCode as OtelStatusCode;\nuse quickwit_proto::opentelemetry::proto::trace::v1::{\n    ResourceSpans, ScopeSpans, Span as OtelSpan, Status as OtelStatus,\n};\nuse quickwit_proto::search::{CountHits, SearchRequest};\nuse quickwit_query::BooleanOperand;\nuse 
quickwit_query::query_ast::{BoolQuery, QueryAst, TermQuery, UserInputQuery};\nuse quickwit_search::SearchService;\nuse serde_json::Value as JsonValue;\nuse time::OffsetDateTime;\nuse tokio::sync::mpsc;\nuse tokio_stream::wrappers::ReceiverStream;\nuse tonic::{Request, Response, Status};\nuse tracing::field::Empty;\nuse tracing::{Span as RuntimeSpan, debug, error, instrument};\n\nuse crate::metrics::JAEGER_SERVICE_METRICS;\nuse crate::{\n    JaegerService, TimeIntervalSecs, TracesDataStream, get_operations_impl, get_services_impl,\n    json_deserialize, record_error, record_send, to_duration_millis,\n};\n\nmacro_rules! metrics {\n    ($expr:expr, [$operation:ident, $($label:expr),*]) => {\n        let start = std::time::Instant::now();\n        let labels = [stringify!($operation), $($label,)*];\n        JAEGER_SERVICE_METRICS.requests_total.with_label_values(labels).inc();\n        let (res, is_error) = match $expr {\n            ok @ Ok(_) => {\n                (ok, \"false\")\n            },\n            err @ Err(_) => {\n                JAEGER_SERVICE_METRICS.request_errors_total.with_label_values(labels).inc();\n                (err, \"true\")\n            },\n        };\n        let elapsed = start.elapsed().as_secs_f64();\n        let labels = [stringify!($operation), $($label,)* is_error];\n        JAEGER_SERVICE_METRICS.request_duration_seconds.with_label_values(labels).observe(elapsed);\n\n        return res.map(Response::new);\n    };\n}\n\n#[async_trait]\nimpl TraceReader for JaegerService {\n    async fn get_services(\n        &self,\n        request: Request<GetServicesRequest>,\n    ) -> Result<Response<GetServicesResponse>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n\n        let services = get_services_impl(\n            self.search_service.clone(),\n            self.lookback_period_secs,\n            index_id_patterns,\n        )\n        .await?;\n\n        
let response = GetServicesResponse { services };\n        metrics!(Ok(response), [get_services_v2, OTEL_TRACES_INDEX_ID]);\n    }\n\n    async fn get_operations(\n        &self,\n        request: Request<GetOperationsRequest>,\n    ) -> Result<Response<GetOperationsResponse>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n\n        let req = request.into_inner();\n\n        let operations = get_operations_impl(\n            self.search_service.clone(),\n            self.lookback_period_secs,\n            req.service,\n            req.span_kind,\n            index_id_patterns,\n        )\n        .await?\n        .into_iter()\n        .map(|op| Operation {\n            name: op.name,\n            span_kind: op.span_kind,\n        })\n        .collect();\n\n        let response = GetOperationsResponse { operations };\n        metrics!(Ok(response), [get_operations_v2, OTEL_TRACES_INDEX_ID]);\n    }\n\n    type GetTracesStream = TracesDataStream;\n\n    async fn get_traces(\n        &self,\n        request: Request<GetTracesRequest>,\n    ) -> Result<Response<Self::GetTracesStream>, Status> {\n        let request_start = Instant::now();\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n\n        let (tx, rx) = mpsc::channel(2);\n        let search_service = self.search_service.clone();\n        let max_fetch_spans = self.max_fetch_spans;\n        let lookback_period_secs = self.lookback_period_secs;\n        let query_list = request.into_inner().query;\n\n        tokio::task::spawn(async move {\n            for query_params in query_list {\n                let trace_id = match TraceId::try_from(query_params.trace_id) {\n                    Ok(id) => id,\n                    Err(error) => {\n                        let _ = tx\n                            .send(Err(Status::invalid_argument(error.to_string())))\n       
                     .await;\n                        return;\n                    }\n                };\n\n                let end = OffsetDateTime::now_utc().unix_timestamp();\n                let search_window = (end - lookback_period_secs)..=end;\n\n                let otel_spans = match stream_otel_spans_impl(\n                    search_service.clone(),\n                    max_fetch_spans,\n                    &[trace_id],\n                    search_window,\n                    \"get_traces_v2\",\n                    request_start,\n                    index_id_patterns.clone(),\n                    false,\n                )\n                .await\n                {\n                    Ok(spans) => spans,\n                    Err(e) => {\n                        let _ = tx.send(Err(e)).await;\n                        return;\n                    }\n                };\n\n                if tx\n                    .send(Ok(qw_spans_to_otel_traces_data(otel_spans)))\n                    .await\n                    .is_err()\n                {\n                    return;\n                }\n            }\n        });\n\n        Ok(Response::new(ReceiverStream::new(rx)))\n    }\n\n    type FindTracesStream = TracesDataStream;\n\n    async fn find_traces(\n        &self,\n        request: Request<FindTracesRequest>,\n    ) -> Result<Response<Self::FindTracesStream>, Status> {\n        let request_start = Instant::now();\n\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n\n        let query = request\n            .into_inner()\n            .query\n            .ok_or_else(|| Status::invalid_argument(\"Query is empty.\"))?;\n\n        let (trace_ids, span_timestamps_range) = find_trace_ids_impl(\n            self.search_service.clone(),\n            self.max_trace_duration_secs,\n            query,\n            index_id_patterns.clone(),\n        )\n        .await?;\n\n        let 
search_window = (span_timestamps_range.start() - self.max_trace_duration_secs)\n            ..=(span_timestamps_range.end() + self.max_trace_duration_secs);\n\n        let (tx, rx) = mpsc::channel(2);\n        let search_service = self.search_service.clone();\n        let max_fetch_spans = self.max_fetch_spans;\n\n        tokio::task::spawn(async move {\n            let all_spans = match stream_otel_spans_impl(\n                search_service,\n                max_fetch_spans,\n                &trace_ids,\n                search_window,\n                \"find_traces_v2\",\n                request_start,\n                index_id_patterns,\n                false,\n            )\n            .await\n            {\n                Ok(spans) => spans,\n                Err(e) => {\n                    let _ = tx.send(Err(e)).await;\n                    return;\n                }\n            };\n\n            // Group by trace_id and send each trace\n            let mut spans_by_trace: HashMap<Vec<u8>, Vec<QwSpan>> = HashMap::new();\n            for span in all_spans {\n                spans_by_trace\n                    .entry(span.trace_id.to_vec())\n                    .or_default()\n                    .push(span);\n            }\n\n            for spans in spans_by_trace.into_values() {\n                if tx\n                    .send(Ok(qw_spans_to_otel_traces_data(spans)))\n                    .await\n                    .is_err()\n                {\n                    return;\n                }\n            }\n        });\n\n        Ok(Response::new(ReceiverStream::new(rx)))\n    }\n\n    async fn find_trace_i_ds(\n        &self,\n        request: Request<quickwit_proto::jaeger::storage::v2::FindTracesRequest>,\n    ) -> Result<Response<quickwit_proto::jaeger::storage::v2::FindTraceIDsResponse>, Status> {\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(request.metadata())?;\n\n        let query = request\n     
       .into_inner()\n            .query\n            .ok_or_else(|| Status::invalid_argument(\"Query is empty.\"))?;\n\n        let (trace_ids, time_range) = find_trace_ids_impl(\n            self.search_service.clone(),\n            self.max_trace_duration_secs,\n            query,\n            index_id_patterns,\n        )\n        .await?;\n\n        let trace_ids = trace_ids\n            .into_iter()\n            .map(|trace_id| FoundTraceId {\n                trace_id: trace_id.to_vec(),\n                start: Some(WellKnownTimestamp {\n                    seconds: *time_range.start(),\n                    nanos: 0,\n                }),\n                end: Some(WellKnownTimestamp {\n                    seconds: *time_range.end(),\n                    nanos: 0,\n                }),\n            })\n            .collect();\n\n        let response = quickwit_proto::jaeger::storage::v2::FindTraceIDsResponse { trace_ids };\n        metrics!(Ok(response), [find_trace_ids_v2, OTEL_TRACES_INDEX_ID]);\n    }\n}\n\n// === Helper functions ===\n#[instrument(\"find_trace_ids\", skip_all)]\nasync fn find_trace_ids_impl(\n    search_service: Arc<dyn SearchService>,\n    _max_trace_duration_secs: i64,\n    query: quickwit_proto::jaeger::storage::v2::TraceQueryParameters,\n    index_id_patterns: Vec<String>,\n) -> Result<(Vec<TraceId>, TimeIntervalSecs), Status> {\n    debug!(service_name=%query.service_name, operation_name=%query.operation_name, \"`find_trace_ids` request\");\n\n    let min_start_secs = query.start_time_min.as_ref().map(|ts| ts.seconds);\n    let max_start_secs = query.start_time_max.as_ref().map(|ts| ts.seconds);\n    let min_duration_millis = query.duration_min.as_ref().and_then(to_duration_millis);\n    let max_duration_millis = query.duration_max.as_ref().and_then(to_duration_millis);\n    let tags = convert_v2_attributes_to_v1_tags(query.attributes);\n\n    crate::find_trace_ids_common(\n        search_service,\n        &query.service_name,\n        
&query.operation_name,\n        tags,\n        min_start_secs,\n        max_start_secs,\n        min_duration_millis,\n        max_duration_millis,\n        query.search_depth as usize,\n        index_id_patterns,\n    )\n    .await\n}\n\n#[instrument(\"stream_otel_spans\", skip_all, fields(num_traces=%trace_ids.len(), num_spans=Empty, num_bytes=Empty))]\n#[allow(clippy::too_many_arguments)]\nasync fn stream_otel_spans_impl(\n    search_service: Arc<dyn SearchService>,\n    max_fetch_spans: u64,\n    trace_ids: &[TraceId],\n    search_window: TimeIntervalSecs,\n    operation_name: &'static str,\n    request_start: Instant,\n    index_id_patterns: Vec<String>,\n    root_only: bool,\n) -> Result<Vec<QwSpan>, Status> {\n    if trace_ids.is_empty() {\n        return Ok(Vec::new());\n    }\n\n    let mut query = BoolQuery::default();\n\n    for trace_id in trace_ids {\n        let value = trace_id.hex_display();\n        let term_query = TermQuery {\n            field: \"trace_id\".to_string(),\n            value,\n        };\n        query.should.push(term_query.into());\n    }\n\n    if root_only {\n        let is_root = UserInputQuery {\n            user_text: \"NOT is_root:false\".to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::And,\n            lenient: true,\n        };\n        let mut new_query = BoolQuery::default();\n        new_query.must.push(query.into());\n        new_query.must.push(is_root.into());\n        query = new_query;\n    }\n\n    let query_ast: QueryAst = query.into();\n    let query_ast =\n        serde_json::to_string(&query_ast).map_err(|err| Status::internal(err.to_string()))?;\n\n    let search_request = SearchRequest {\n        index_id_patterns,\n        query_ast,\n        start_timestamp: Some(*search_window.start()),\n        end_timestamp: Some(*search_window.end()),\n        max_hits: max_fetch_spans,\n        count_hits: CountHits::Underestimate.into(),\n        ..Default::default()\n  
  };\n\n    let search_response = match search_service.root_search(search_request).await {\n        Ok(search_response) => search_response,\n        Err(search_error) => {\n            error!(search_error=?search_error, \"failed to fetch spans\");\n            record_error(operation_name, request_start);\n            return Err(Status::internal(\"Failed to fetch spans.\"));\n        }\n    };\n\n    let mut qw_spans: Vec<QwSpan> = Vec::with_capacity(search_response.hits.len());\n\n    for hit in search_response.hits {\n        match qw_span_from_json(&hit.json) {\n            Ok(span) => {\n                qw_spans.push(span);\n            }\n            Err(status) => {\n                record_error(operation_name, request_start);\n                return Err(status);\n            }\n        };\n    }\n\n    if trace_ids.len() > 1 {\n        qw_spans.sort_unstable_by(|left, right| left.trace_id.cmp(&right.trace_id));\n    }\n\n    let num_spans = qw_spans.len();\n    let num_bytes = qw_spans\n        .iter()\n        .map(|span| serde_json::to_string(span).unwrap_or_default().len())\n        .sum::<usize>();\n\n    RuntimeSpan::current().record(\"num_spans\", num_spans);\n    RuntimeSpan::current().record(\"num_bytes\", num_bytes);\n\n    record_send(operation_name, num_spans, num_bytes);\n\n    JAEGER_SERVICE_METRICS\n        .fetched_traces_total\n        .with_label_values([operation_name, OTEL_TRACES_INDEX_ID])\n        .inc_by(trace_ids.len() as u64);\n\n    let elapsed = request_start.elapsed().as_secs_f64();\n    JAEGER_SERVICE_METRICS\n        .request_duration_seconds\n        .with_label_values([operation_name, OTEL_TRACES_INDEX_ID, \"false\"])\n        .observe(elapsed);\n\n    Ok(qw_spans)\n}\n\n// === Conversion functions ===\n// Note: record_error and record_send are now shared in lib.rs\n\n/// Direct conversion from Quickwit's native OpenTelemetry span to Jaeger v2's OpenTelemetry format\nfn qw_spans_to_otel_traces_data(\n    qw_spans: 
Vec<QwSpan>,\n) -> quickwit_proto::opentelemetry::proto::trace::v1::TracesData {\n    // Group spans by service\n    let mut spans_by_service: HashMap<String, Vec<QwSpan>> = HashMap::new();\n    for span in qw_spans {\n        spans_by_service\n            .entry(span.service_name.clone())\n            .or_default()\n            .push(span);\n    }\n\n    let resource_spans = spans_by_service\n        .into_iter()\n        .map(|(service_name, spans)| {\n            // Get resource attributes from first span before grouping\n            let first_span_attrs = spans\n                .first()\n                .map(|span| span.resource_attributes.clone())\n                .unwrap_or_default();\n\n            // Group by scope\n            let mut spans_by_scope: HashMap<(Option<String>, Option<String>), Vec<QwSpan>> =\n                HashMap::new();\n            for span in spans {\n                let key = (span.scope_name.clone(), span.scope_version.clone());\n                spans_by_scope.entry(key).or_default().push(span);\n            }\n\n            let scope_spans = spans_by_scope\n                .into_iter()\n                .map(|((scope_name, scope_version), spans)| {\n                    let otel_spans = spans.into_iter().map(qw_span_to_otel_span).collect();\n\n                    ScopeSpans {\n                        scope: Some(InstrumentationScope {\n                            name: scope_name.unwrap_or_default(),\n                            version: scope_version.unwrap_or_default(),\n                            attributes: vec![],\n                            dropped_attributes_count: 0,\n                        }),\n                        spans: otel_spans,\n                        schema_url: String::new(),\n                    }\n                })\n                .collect();\n\n            let mut resource_attrs = vec![OtelKeyValue {\n                key: \"service.name\".to_string(),\n                value: Some(OtelAnyValue {\n           
         value: Some(OtelValue::StringValue(service_name)),\n                }),\n            }];\n\n            // Add other resource attributes\n            for (key, value) in first_span_attrs {\n                resource_attrs.push(json_value_to_otel_kv(key, value));\n            }\n\n            ResourceSpans {\n                resource: Some(OtelResource {\n                    attributes: resource_attrs,\n                    dropped_attributes_count: 0,\n                }),\n                scope_spans,\n                schema_url: String::new(),\n            }\n        })\n        .collect();\n\n    quickwit_proto::opentelemetry::proto::trace::v1::TracesData { resource_spans }\n}\n\n/// Convert a Quickwit span (native OTEL format) to Jaeger v2 OTEL span\nfn qw_span_to_otel_span(qw_span: QwSpan) -> OtelSpan {\n    OtelSpan {\n        trace_id: qw_span.trace_id.to_vec(),\n        span_id: qw_span.span_id.to_vec(),\n        trace_state: qw_span.trace_state.unwrap_or_default(),\n        parent_span_id: qw_span\n            .parent_span_id\n            .map(|id| id.to_vec())\n            .unwrap_or_default(),\n        name: qw_span.span_name,\n        kind: qw_span.span_kind as i32,\n        start_time_unix_nano: qw_span.span_start_timestamp_nanos,\n        end_time_unix_nano: qw_span.span_end_timestamp_nanos,\n        attributes: qw_span\n            .span_attributes\n            .into_iter()\n            .map(|(k, v)| json_value_to_otel_kv(k, v))\n            .collect(),\n        dropped_attributes_count: qw_span.span_dropped_attributes_count,\n        events: qw_span\n            .events\n            .into_iter()\n            .map(|event| otel_trace::span::Event {\n                time_unix_nano: event.event_timestamp_nanos,\n                name: event.event_name,\n                attributes: event\n                    .event_attributes\n                    .into_iter()\n                    .map(|(k, v)| json_value_to_otel_kv(k, v))\n                    
.collect(),\n                dropped_attributes_count: event.event_dropped_attributes_count,\n            })\n            .collect(),\n        dropped_events_count: qw_span.span_dropped_events_count,\n        links: qw_span\n            .links\n            .into_iter()\n            .map(|link| otel_trace::span::Link {\n                trace_id: link.link_trace_id.to_vec(),\n                span_id: link.link_span_id.to_vec(),\n                trace_state: link.link_trace_state.unwrap_or_default(),\n                attributes: link\n                    .link_attributes\n                    .into_iter()\n                    .map(|(k, v)| json_value_to_otel_kv(k, v))\n                    .collect(),\n                dropped_attributes_count: link.link_dropped_attributes_count,\n            })\n            .collect(),\n        dropped_links_count: qw_span.span_dropped_links_count,\n        status: Some(OtelStatus {\n            message: qw_span.span_status.message.unwrap_or_default(),\n            code: match qw_span.span_status.code {\n                quickwit_proto::opentelemetry::proto::trace::v1::status::StatusCode::Unset => {\n                    OtelStatusCode::Unset as i32\n                }\n                quickwit_proto::opentelemetry::proto::trace::v1::status::StatusCode::Ok => {\n                    OtelStatusCode::Ok as i32\n                }\n                quickwit_proto::opentelemetry::proto::trace::v1::status::StatusCode::Error => {\n                    OtelStatusCode::Error as i32\n                }\n            },\n        }),\n    }\n}\n\nfn json_value_to_otel_kv(key: String, value: JsonValue) -> OtelKeyValue {\n    let otel_value = match value {\n        JsonValue::String(s) => OtelValue::StringValue(s),\n        JsonValue::Number(n) => {\n            if let Some(i) = n.as_i64() {\n                OtelValue::IntValue(i)\n            } else if let Some(f) = n.as_f64() {\n                OtelValue::DoubleValue(f)\n            } else {\n              
  OtelValue::StringValue(n.to_string())\n            }\n        }\n        JsonValue::Bool(b) => OtelValue::BoolValue(b),\n        JsonValue::Array(_) | JsonValue::Object(_) => OtelValue::StringValue(value.to_string()),\n        JsonValue::Null => OtelValue::StringValue(String::new()),\n    };\n\n    OtelKeyValue {\n        key,\n        value: Some(OtelAnyValue {\n            value: Some(otel_value),\n        }),\n    }\n}\n\n#[allow(clippy::result_large_err)]\nfn qw_span_from_json(qw_span_json: &str) -> Result<QwSpan, Status> {\n    json_deserialize(qw_span_json, \"span\")\n}\n\npub(crate) fn convert_v2_attributes_to_v1_tags(\n    attributes: Vec<quickwit_proto::jaeger::storage::v2::KeyValue>,\n) -> HashMap<String, String> {\n    attributes\n        .into_iter()\n        .filter_map(|kv| {\n            let value = kv.value?.value?;\n            let string_value = match value {\n                quickwit_proto::jaeger::storage::v2::any_value::Value::StringValue(s) => s,\n                quickwit_proto::jaeger::storage::v2::any_value::Value::IntValue(i) => i.to_string(),\n                quickwit_proto::jaeger::storage::v2::any_value::Value::DoubleValue(d) => {\n                    d.to_string()\n                }\n                quickwit_proto::jaeger::storage::v2::any_value::Value::BoolValue(b) => {\n                    b.to_string()\n                }\n                _ => return None,\n            };\n            Some((kv.key, string_value))\n        })\n        .collect()\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/Cargo.toml",
    "content": "[package]\nname = \"quickwit-janitor\"\ndescription = \"Janitor service implementation\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nfutures = { workspace = true }\nitertools = { workspace = true }\nonce_cell = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntantivy = { workspace = true }\nthiserror = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-index-management = { workspace = true }\nquickwit-indexing = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-query = { workspace = true }\nquickwit-search = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[features]\ntestsuite = []\n\n[dev-dependencies]\nmockall = { workspace = true }\ntempfile = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-search = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/actors/delete_task_pipeline.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::PathBuf;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, Handler, Mailbox, Supervisor,\n    SupervisorState,\n};\nuse quickwit_common::io::IoControls;\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::temp_dir::{self};\nuse quickwit_common::uri::Uri;\nuse quickwit_config::build_doc_mapper;\nuse quickwit_indexing::actors::{\n    MergeExecutor, MergeSchedulerService, MergeSplitDownloader, Packager, Publisher,\n    PublisherCounters, Uploader, UploaderCounters, UploaderType,\n};\nuse quickwit_indexing::merge_policy::merge_policy_from_settings;\nuse quickwit_indexing::{IndexingSplitStore, PublisherType, SplitsUpdateMailbox};\nuse quickwit_metastore::IndexMetadataResponseExt;\nuse quickwit_proto::indexing::MergePipelineId;\nuse quickwit_proto::metastore::{IndexMetadataRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_proto::types::{IndexUid, NodeId};\nuse quickwit_search::SearchJobPlacer;\nuse quickwit_storage::Storage;\nuse serde::Serialize;\nuse tokio::join;\nuse tracing::info;\n\nuse super::delete_task_planner::DeleteTaskPlanner;\nuse crate::actors::delete_task_planner::DeleteTaskPlannerState;\n\nconst OBSERVE_PIPELINE_INTERVAL: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    
Duration::from_millis(500)\n} else {\n    // 1 minute.\n    // This is only for observation purpose, not supervision.\n    Duration::from_secs(60)\n};\n\nstruct DeletePipelineHandle {\n    pub delete_task_planner: ActorHandle<Supervisor<DeleteTaskPlanner>>,\n    pub downloader: ActorHandle<Supervisor<MergeSplitDownloader>>,\n    pub delete_task_executor: ActorHandle<Supervisor<MergeExecutor>>,\n    pub packager: ActorHandle<Supervisor<Packager>>,\n    pub uploader: ActorHandle<Supervisor<Uploader>>,\n    pub publisher: ActorHandle<Supervisor<Publisher>>,\n}\n\n/// A Struct to hold all statistical data about deletes.\n#[derive(Clone, Debug, Default, Serialize)]\npub struct DeleteTaskPipelineState {\n    pub delete_task_planner: SupervisorState<DeleteTaskPlannerState>,\n    pub downloader: SupervisorState<()>,\n    pub delete_task_executor: SupervisorState<()>,\n    pub packager: SupervisorState<()>,\n    pub uploader: SupervisorState<UploaderCounters>,\n    pub publisher: SupervisorState<PublisherCounters>,\n}\n\npub struct DeleteTaskPipeline {\n    index_uid: IndexUid,\n    metastore: MetastoreServiceClient,\n    search_job_placer: SearchJobPlacer,\n    index_storage: Arc<dyn Storage>,\n    delete_service_task_dir: PathBuf,\n    handles: Option<DeletePipelineHandle>,\n    max_concurrent_split_uploads: usize,\n    state: DeleteTaskPipelineState,\n    merge_scheduler_service: Mailbox<MergeSchedulerService>,\n    event_broker: EventBroker,\n}\n\n#[async_trait]\nimpl Actor for DeleteTaskPipeline {\n    type ObservableState = DeleteTaskPipelineState;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.state.clone()\n    }\n\n    fn name(&self) -> String {\n        \"DeleteTaskPipeline\".to_string()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.spawn_pipeline(ctx).await?;\n        self.handle(Observe, ctx).await?;\n        Ok(())\n    }\n\n    async fn finalize(\n        &mut 
self,\n        _exit_status: &ActorExitStatus,\n        _ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        if let Some(handles) = self.handles.take() {\n            join!(\n                handles.delete_task_planner.quit(),\n                handles.downloader.quit(),\n                handles.delete_task_executor.quit(),\n                handles.packager.quit(),\n                handles.uploader.quit(),\n                handles.publisher.quit(),\n            );\n        };\n        Ok(())\n    }\n}\n\nimpl DeleteTaskPipeline {\n    #[allow(clippy::too_many_arguments)]\n    pub fn new(\n        index_uid: IndexUid,\n        metastore: MetastoreServiceClient,\n        search_job_placer: SearchJobPlacer,\n        index_storage: Arc<dyn Storage>,\n        delete_service_task_dir: PathBuf,\n        max_concurrent_split_uploads: usize,\n        merge_scheduler_service: Mailbox<MergeSchedulerService>,\n        event_broker: EventBroker,\n    ) -> Self {\n        Self {\n            index_uid,\n            metastore,\n            search_job_placer,\n            index_storage,\n            delete_service_task_dir,\n            handles: Default::default(),\n            max_concurrent_split_uploads,\n            state: DeleteTaskPipelineState::default(),\n            merge_scheduler_service,\n            event_broker,\n        }\n    }\n\n    pub async fn spawn_pipeline(&mut self, ctx: &ActorContext<Self>) -> anyhow::Result<()> {\n        info!(\n            index_uid=%self.index_uid,\n            root_dir=%self.delete_service_task_dir.to_str().unwrap(),\n            \"spawning delete tasks pipeline\",\n        );\n        let index_config = self\n            .metastore\n            .index_metadata(IndexMetadataRequest::for_index_uid(self.index_uid.clone()))\n            .await?\n            .deserialize_index_metadata()?\n            .into_index_config();\n        let publisher = Publisher::new(\n            PublisherType::MergePublisher,\n            
self.metastore.clone(),\n            None,\n            None,\n        );\n        let (publisher_mailbox, publisher_supervisor_handler) =\n            ctx.spawn_actor().supervise(publisher);\n        let split_store =\n            IndexingSplitStore::create_without_local_store_for_test(self.index_storage.clone());\n        let merge_policy = merge_policy_from_settings(&index_config.indexing_settings);\n        let uploader = Uploader::new(\n            UploaderType::DeleteUploader,\n            self.metastore.clone(),\n            merge_policy,\n            index_config.retention_policy_opt.clone(),\n            split_store.clone(),\n            SplitsUpdateMailbox::Publisher(publisher_mailbox),\n            self.max_concurrent_split_uploads,\n            self.event_broker.clone(),\n        );\n        let (uploader_mailbox, uploader_supervisor_handler) = ctx.spawn_actor().supervise(uploader);\n\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, &index_config.search_settings)?;\n        let tag_fields = doc_mapper.tag_named_fields()?;\n        let packager = Packager::new(\"MergePackager\", tag_fields, uploader_mailbox);\n        let (packager_mailbox, packager_supervisor_handler) = ctx.spawn_actor().supervise(packager);\n        let pipeline_id = MergePipelineId {\n            node_id: NodeId::from(\"unknown\"),\n            index_uid: self.index_uid.clone(),\n            source_id: \"unknown\".to_string(),\n        };\n\n        let delete_executor_io_controls = IoControls::default().set_component(\"deleter\");\n\n        let split_download_io_controls = delete_executor_io_controls\n            .clone()\n            .set_component(\"split_downloader_delete\");\n        let delete_executor = MergeExecutor::new(\n            pipeline_id,\n            self.metastore.clone(),\n            doc_mapper.clone(),\n            delete_executor_io_controls,\n            packager_mailbox,\n        );\n        let (delete_executor_mailbox, 
task_executor_supervisor_handler) =\n            ctx.spawn_actor().supervise(delete_executor);\n        let scratch_directory = temp_dir::Builder::default()\n            .join(&self.index_uid.index_id)\n            .join(&self.index_uid.incarnation_id.to_string())\n            .tempdir_in(&self.delete_service_task_dir)?;\n        let merge_split_downloader = MergeSplitDownloader {\n            scratch_directory,\n            split_store,\n            executor_mailbox: delete_executor_mailbox,\n            io_controls: split_download_io_controls,\n        };\n        let (downloader_mailbox, downloader_supervisor_handler) =\n            ctx.spawn_actor().supervise(merge_split_downloader);\n        let doc_mapper_str = serde_json::to_string(&doc_mapper)?;\n        let index_uri: &Uri = &index_config.index_uri;\n        let task_planner = DeleteTaskPlanner::new(\n            self.index_uid.clone(),\n            index_uri.clone(),\n            doc_mapper_str,\n            self.metastore.clone(),\n            self.search_job_placer.clone(),\n            downloader_mailbox,\n            self.merge_scheduler_service.clone(),\n        );\n        let (_, task_planner_supervisor_handler) = ctx.spawn_actor().supervise(task_planner);\n        self.handles = Some(DeletePipelineHandle {\n            delete_task_planner: task_planner_supervisor_handler,\n            downloader: downloader_supervisor_handler,\n            delete_task_executor: task_executor_supervisor_handler,\n            packager: packager_supervisor_handler,\n            uploader: uploader_supervisor_handler,\n            publisher: publisher_supervisor_handler,\n        });\n        Ok(())\n    }\n}\n\n#[derive(Debug)]\nstruct Observe;\n\n#[async_trait]\nimpl Handler<Observe> for DeleteTaskPipeline {\n    type Reply = ();\n    async fn handle(\n        &mut self,\n        _: Observe,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        if let Some(handles) = &self.handles {\n    
        handles.delete_task_planner.refresh_observe();\n            handles.downloader.refresh_observe();\n            handles.delete_task_executor.refresh_observe();\n            handles.packager.refresh_observe();\n            handles.uploader.refresh_observe();\n            handles.publisher.refresh_observe();\n            self.state = DeleteTaskPipelineState {\n                delete_task_planner: handles.delete_task_planner.last_observation().clone(),\n                downloader: handles.downloader.last_observation().clone(),\n                delete_task_executor: handles.delete_task_executor.last_observation().clone(),\n                packager: handles.packager.last_observation().clone(),\n                uploader: handles.uploader.last_observation().clone(),\n                publisher: handles.publisher.last_observation().clone(),\n            }\n        }\n        ctx.schedule_self_msg(OBSERVE_PIPELINE_INTERVAL, Observe);\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use async_trait::async_trait;\n    use quickwit_actors::{Handler, Universe};\n    use quickwit_common::pubsub::EventBroker;\n    use quickwit_common::temp_dir::TempDirectory;\n    use quickwit_indexing::TestSandbox;\n    use quickwit_indexing::actors::MergeSchedulerService;\n    use quickwit_metastore::{ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitState};\n    use quickwit_proto::metastore::{DeleteQuery, ListSplitsRequest, MetastoreService};\n    use quickwit_proto::search::{LeafSearchRequest, LeafSearchResponse};\n    use quickwit_search::{\n        MockSearchService, SearchError, SearchJobPlacer, searcher_pool_for_test,\n    };\n\n    use super::{ActorContext, ActorExitStatus, DeleteTaskPipeline, OBSERVE_PIPELINE_INTERVAL};\n\n    #[derive(Debug)]\n    struct GracefulShutdown;\n\n    #[async_trait]\n    impl Handler<GracefulShutdown> for DeleteTaskPipeline {\n        type Reply = ();\n        async fn handle(\n            &mut self,\n            _: 
GracefulShutdown,\n            _: &ActorContext<Self>,\n        ) -> Result<(), ActorExitStatus> {\n            if let Some(handles) = self.handles.take() {\n                handles.delete_task_planner.quit().await;\n                handles.publisher.join().await;\n            }\n            // Nothing to do.\n            Err(ActorExitStatus::Success)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_delete_pipeline_simple() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let index_id = \"test-delete-pipeline-simple\";\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: i64\n                fast: true\n        \"#;\n        let indexing_settings_yaml = r#\"\n            merge_policy:\n                type: no_merge\n        \"#;\n        let test_sandbox = TestSandbox::create(\n            index_id,\n            doc_mapping_yaml,\n            indexing_settings_yaml,\n            &[\"body\"],\n        )\n        .await\n        .unwrap();\n        let universe: &Universe = test_sandbox.universe();\n        let merge_scheduler_service = universe.get_or_spawn_one::<MergeSchedulerService>();\n        let index_uid = test_sandbox.index_uid();\n        let docs = vec![\n            serde_json::json!({\"body\": \"info\", \"ts\": 0 }),\n            serde_json::json!({\"body\": \"info\", \"ts\": 0 }),\n            serde_json::json!({\"body\": \"delete\", \"ts\": 0 }),\n        ];\n        test_sandbox.add_documents(docs).await?;\n        let metastore = test_sandbox.metastore();\n        metastore\n            .create_delete_task(DeleteQuery {\n                index_uid: Some(index_uid.clone()),\n                start_timestamp: None,\n                end_timestamp: None,\n                query_ast: quickwit_query::query_ast::qast_json_helper(\"body:delete\", &[]),\n            })\n            .await\n          
  .unwrap();\n        let mut mock_search_service = MockSearchService::new();\n        let mut leaf_search_num_failures = 1;\n        mock_search_service\n            .expect_leaf_search()\n            .withf(|leaf_request| -> bool {\n                leaf_request\n                    .search_request\n                    .as_ref()\n                    .unwrap()\n                    .index_id_patterns\n                    == vec![\"test-delete-pipeline-simple\".to_string()]\n            })\n            .times(2)\n            .returning(move |_: LeafSearchRequest| {\n                if leaf_search_num_failures > 0 {\n                    leaf_search_num_failures -= 1;\n                    return Err(SearchError::Internal(\"leaf search error\".to_string()));\n                }\n                Ok(LeafSearchResponse {\n                    num_hits: 1,\n                    ..Default::default()\n                })\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let delete_service_task_dir = TempDirectory::for_test();\n        let pipeline = DeleteTaskPipeline::new(\n            test_sandbox.index_uid(),\n            metastore.clone(),\n            search_job_placer,\n            test_sandbox.storage(),\n            delete_service_task_dir.path().into(),\n            4,\n            merge_scheduler_service,\n            EventBroker::default(),\n        );\n\n        let (pipeline_mailbox, pipeline_handler) = universe.spawn_builder().spawn(pipeline);\n        // Ensure that the message sent by initialize method is processed.\n        let _ = pipeline_handler.process_pending_and_observe().await.state;\n        // Pipeline will first fail and we need to wait a OBSERVE_PIPELINE_INTERVAL * some number\n        // for the pipeline state to be updated.\n        universe.sleep(OBSERVE_PIPELINE_INTERVAL * 5).await;\n        let 
pipeline_state = pipeline_handler.process_pending_and_observe().await.state;\n        assert_eq!(pipeline_state.delete_task_planner.metrics.num_errors, 1);\n        assert_eq!(pipeline_state.downloader.metrics.num_errors, 0);\n        assert_eq!(pipeline_state.delete_task_executor.metrics.num_errors, 0);\n        assert_eq!(pipeline_state.packager.metrics.num_errors, 0);\n        assert_eq!(pipeline_state.uploader.metrics.num_errors, 0);\n        assert_eq!(pipeline_state.publisher.metrics.num_errors, 0);\n        let _ = pipeline_mailbox.ask(GracefulShutdown).await;\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 2);\n        let published_split = splits\n            .iter()\n            .find(|split| split.split_state == SplitState::Published)\n            .unwrap();\n        assert_eq!(published_split.split_metadata.delete_opstamp, 1);\n        test_sandbox.assert_quit().await;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_delete_pipeline_shut_down() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let index_id = \"test-delete-pipeline-shut-down\";\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: i64\n                fast: true\n        \"#;\n        let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"])\n            .await\n            .unwrap();\n        let universe: &Universe = test_sandbox.universe();\n        let merge_scheduler_mailbox = universe.get_or_spawn_one::<MergeSchedulerService>();\n        let metastore = test_sandbox.metastore();\n        let mut mock_search_service = MockSearchService::new();\n        
mock_search_service\n            .expect_leaf_search()\n            .withf(|leaf_request| -> bool {\n                leaf_request\n                    .search_request\n                    .as_ref()\n                    .unwrap()\n                    .index_id_patterns\n                    == vec![\"test-delete-pipeline-shut-down\".to_string()]\n            })\n            .returning(move |_: LeafSearchRequest| {\n                Ok(LeafSearchResponse {\n                    num_hits: 0,\n                    ..Default::default()\n                })\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let delete_service_task_dir = TempDirectory::for_test();\n        let pipeline = DeleteTaskPipeline::new(\n            test_sandbox.index_uid(),\n            metastore.clone(),\n            search_job_placer,\n            test_sandbox.storage(),\n            delete_service_task_dir.path().into(),\n            4,\n            merge_scheduler_mailbox,\n            EventBroker::default(),\n        );\n\n        let (_pipeline_mailbox, pipeline_handler) = universe.spawn_builder().spawn(pipeline);\n        pipeline_handler.quit().await;\n        let observations = universe.observe(OBSERVE_PIPELINE_INTERVAL).await;\n        assert!(observations.into_iter().all(\n            |observation| observation.type_name != std::any::type_name::<DeleteTaskPipeline>()\n        ));\n        test_sandbox.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/actors/delete_task_planner.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse itertools::Itertools;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, Handler, Mailbox, QueueCapacity};\nuse quickwit_common::extract_time_range;\nuse quickwit_common::uri::Uri;\nuse quickwit_doc_mapper::tag_pruning::extract_tags_from_query;\nuse quickwit_indexing::actors::{MergeSchedulerService, MergeSplitDownloader, schedule_merge};\nuse quickwit_indexing::merge_policy::MergeOperation;\nuse quickwit_metastore::{ListSplitsResponseExt, Split, split_tag_filter, split_time_range_filter};\nuse quickwit_proto::metastore::{\n    DeleteTask, LastDeleteOpstampRequest, ListDeleteTasksRequest, ListStaleSplitsRequest,\n    MetastoreResult, MetastoreService, MetastoreServiceClient, UpdateSplitsDeleteOpstampRequest,\n};\nuse quickwit_proto::search::SearchRequest;\nuse quickwit_proto::types::IndexUid;\nuse quickwit_search::{IndexMetasForLeafSearch, SearchJob, SearchJobPlacer, jobs_to_leaf_request};\nuse serde::Serialize;\nuse tantivy::Inventory;\nuse tracing::{debug, info};\n\nuse crate::metrics::JANITOR_METRICS;\n\nconst PLANNER_REFRESH_INTERVAL: Duration = Duration::from_secs(60);\nconst NUM_STALE_SPLITS_TO_FETCH: usize = 1000;\n\n/// The `DeleteTaskPlanner` plans delete operations on splits 
for a given index.\n/// For each split, the planner checks whether there are documents to delete:\n/// - If this is the case, it sends a [`MergeOperation`] to the `MergeExecutor`.\n/// - If there is no document to delete, it updates the split `delete_opstamp` to the latest delete\n///   task opstamp.\n///\n/// Pseudo-algorithm for a given index:\n/// 1. Fetches the delete tasks and deduces the last `opstamp`.\n/// 2. Fetches the last `N` stale splits ordered by their `delete_opstamp`. A stale split is a split\n///    with a `delete_opstamp` lower than the last `opstamp`. In theory, this works, but there is\n///    one difficulty:\n///    - Delete operations do not run on immature splits and they are excluded after fetching stale\n///      splits from the metastore as the metastore has no knowledge about the merge policy. If\n///      there are more than `N` immature stale splits, the planner will plan no operations.\n///      However, this is mitigated by the fact that a merge policy should consider \"old\" splits as\n///      mature and an index should not have many immature splits.\n///      See tracked issue <https://github.com/quickwit-oss/quickwit/issues/2147>.\n/// 3. If there are no stale splits, stop.\n/// 4. If there are stale splits, for each split, do:\n///    - Get the list of delete queries to apply to this split.\n///    - Keep only delete queries that match the split metadata (time range and tags).\n///    - If no delete query remains, then update the split `delete_opstamp` to the latest\n///      `opstamp`.\n///    - If there are delete queries that match the metadata, do: + Execute delete queries\n///      (`leaf_request`) one by one to check if there is a match. + As soon as a hit is returned\n///      for a given query, the split is sent to the `MergeExecutor`. 
+ If no delete queries match\n///      documents, update the split `delete_opstamp` to the last `opstamp`.\n#[derive(Clone)]\npub struct DeleteTaskPlanner {\n    index_uid: IndexUid,\n    index_uri: Uri,\n    doc_mapper_str: String,\n    metastore: MetastoreServiceClient,\n    search_job_placer: SearchJobPlacer,\n    merge_split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n    merge_scheduler_service: Mailbox<MergeSchedulerService>,\n    /// Inventory of ongoing delete operations. If everything goes well,\n    /// a merge operation is dropped after the publish of the split that underwent\n    /// the delete operation.\n    /// The inventory is used to avoid sending twice the same delete operation.\n    ongoing_delete_operations_inventory: Inventory<MergeOperation>,\n}\n\n#[async_trait]\nimpl Actor for DeleteTaskPlanner {\n    type ObservableState = DeleteTaskPlannerState;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        let ongoing_delete_operations = self\n            .ongoing_delete_operations_inventory\n            .list()\n            .iter()\n            .map(|tracked_operation| tracked_operation.as_ref().clone())\n            .collect_vec();\n        DeleteTaskPlannerState {\n            ongoing_delete_operations,\n        }\n    }\n\n    fn name(&self) -> String {\n        \"DeleteTaskPlanner\".to_string()\n    }\n\n    fn queue_capacity(&self) -> QueueCapacity {\n        QueueCapacity::Bounded(0)\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.handle(PlanDeleteLoop, ctx).await\n    }\n}\n\nimpl DeleteTaskPlanner {\n    pub fn new(\n        index_uid: IndexUid,\n        index_uri: Uri,\n        doc_mapper_str: String,\n        metastore: MetastoreServiceClient,\n        search_job_placer: SearchJobPlacer,\n        merge_split_downloader_mailbox: Mailbox<MergeSplitDownloader>,\n        merge_scheduler_service: Mailbox<MergeSchedulerService>,\n    ) -> Self {\n    
    Self {\n            index_uid,\n            index_uri,\n            doc_mapper_str,\n            metastore,\n            search_job_placer,\n            merge_split_downloader_mailbox,\n            merge_scheduler_service,\n            ongoing_delete_operations_inventory: Inventory::new(),\n        }\n    }\n\n    /// Send delete operations for a given `index_id`.\n    async fn send_delete_operations(&mut self, ctx: &ActorContext<Self>) -> anyhow::Result<()> {\n        // Loop until there is no more stale splits.\n        loop {\n            let last_delete_opstamp_request = LastDeleteOpstampRequest {\n                index_uid: Some(self.index_uid.clone()),\n            };\n            let last_delete_opstamp = self\n                .metastore\n                .last_delete_opstamp(last_delete_opstamp_request)\n                .await?\n                .last_delete_opstamp;\n            let stale_splits = self\n                .get_relevant_stale_splits(self.index_uid.clone(), last_delete_opstamp, ctx)\n                .await?;\n            ctx.record_progress();\n            debug!(\n                index_id = self.index_uid.index_id,\n                last_delete_opstamp = last_delete_opstamp,\n                num_stale_splits = stale_splits.len()\n            );\n\n            if stale_splits.is_empty() {\n                break;\n            }\n\n            let (splits_with_deletes, splits_without_deletes) =\n                self.partition_splits_by_deletes(&stale_splits, ctx).await?;\n\n            info!(\n                \"{} splits with deletes, {} splits without deletes.\",\n                splits_with_deletes.len(),\n                splits_without_deletes.len()\n            );\n            ctx.record_progress();\n\n            // Updates `delete_opstamp` of splits that won't undergo delete operations.\n            let split_ids_without_delete = splits_without_deletes\n                .iter()\n                .map(|split| split.split_id().to_string())\n   
             .collect_vec();\n            let update_splits_delete_opstamp_request = UpdateSplitsDeleteOpstampRequest {\n                index_uid: Some(self.index_uid.clone()),\n                split_ids: split_ids_without_delete.clone(),\n                delete_opstamp: last_delete_opstamp,\n            };\n            ctx.protect_future(\n                self.metastore\n                    .update_splits_delete_opstamp(update_splits_delete_opstamp_request),\n            )\n            .await?;\n\n            // Sends delete operations.\n            for split_with_deletes in splits_with_deletes {\n                let delete_operation = MergeOperation::new_delete_and_merge_operation(\n                    split_with_deletes.split_metadata,\n                );\n                info!(delete_operation=?delete_operation, \"planned delete operation\");\n                let tracked_delete_operation = self\n                    .ongoing_delete_operations_inventory\n                    .track(delete_operation);\n                schedule_merge(\n                    &self.merge_scheduler_service,\n                    tracked_delete_operation,\n                    self.merge_split_downloader_mailbox.clone(),\n                )\n                .await?;\n                let index_label =\n                    quickwit_common::metrics::index_label(self.index_uid.index_id.as_str());\n                JANITOR_METRICS\n                    .ongoing_num_delete_operations_total\n                    .with_label_values([index_label])\n                    .set(self.ongoing_delete_operations_inventory.list().len() as i64);\n            }\n        }\n\n        Ok(())\n    }\n\n    /// Identifies splits that contain documents to delete and\n    /// splits that do not and returns the two groups.\n    async fn partition_splits_by_deletes(\n        &mut self,\n        stale_splits: &[Split],\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<(Vec<Split>, Vec<Split>)> {\n        let mut 
splits_without_deletes: Vec<Split> = Vec::new();\n        let mut splits_with_deletes: Vec<Split> = Vec::new();\n\n        for stale_split in stale_splits {\n            let list_delete_tasks_request = ListDeleteTasksRequest::new(\n                self.index_uid.clone(),\n                stale_split.split_metadata.delete_opstamp,\n            );\n            let pending_tasks = ctx\n                .protect_future(self.metastore.list_delete_tasks(list_delete_tasks_request))\n                .await?\n                .delete_tasks;\n\n            // Keep only delete tasks that matches the split metadata.\n            let pending_and_matching_metadata_tasks = pending_tasks\n                .into_iter()\n                .filter(|delete_task| {\n                    let delete_query = delete_task\n                        .delete_query\n                        .as_ref()\n                        .expect(\"Delete task must have a delete query.\");\n                    let time_range = extract_time_range(\n                        delete_query.start_timestamp,\n                        delete_query.end_timestamp,\n                    );\n                    // TODO: validate the query at the beginning and return an appropriate error.\n                    let delete_query_ast = serde_json::from_str(&delete_query.query_ast)\n                        .expect(\"Failed to deserialize query_ast json\");\n                    let tags_filter = extract_tags_from_query(delete_query_ast);\n                    split_time_range_filter(&stale_split.split_metadata, time_range.as_ref())\n                        && split_tag_filter(&stale_split.split_metadata, tags_filter.as_ref())\n                })\n                .collect_vec();\n\n            // If there is no matching delete tasks,\n            // there is no document to delete on this split.\n            if pending_and_matching_metadata_tasks.is_empty() {\n                splits_without_deletes.push(stale_split.clone());\n               
 continue;\n            }\n\n            let has_split_docs_to_delete = self\n                .has_split_docs_to_delete(\n                    stale_split,\n                    &pending_and_matching_metadata_tasks,\n                    &self.doc_mapper_str,\n                    self.index_uri.as_str(),\n                    ctx,\n                )\n                .await?;\n            ctx.record_progress();\n\n            if has_split_docs_to_delete {\n                splits_with_deletes.push(stale_split.clone());\n            } else {\n                splits_without_deletes.push(stale_split.clone());\n            }\n        }\n\n        Ok((splits_with_deletes, splits_without_deletes))\n    }\n\n    /// Executes a `LeafSearchRequest` on the split and returns true\n    /// if it matches documents.\n    async fn has_split_docs_to_delete(\n        &self,\n        stale_split: &Split,\n        delete_tasks: &[DeleteTask],\n        doc_mapper_str: &str,\n        index_uri: &str,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<bool> {\n        let search_job = SearchJob::from(&stale_split.split_metadata);\n        let mut search_client = self\n            .search_job_placer\n            .assign_job(search_job.clone(), &HashSet::new())\n            .await?;\n        for delete_task in delete_tasks {\n            let delete_query = delete_task\n                .delete_query\n                .as_ref()\n                .expect(\"Delete task must have a delete query.\");\n            // TODO: resolve with the default fields.\n            let search_request = SearchRequest {\n                index_id_patterns: vec![delete_query.index_uid().index_id.to_string()],\n                query_ast: delete_query.query_ast.clone(),\n                start_timestamp: delete_query.start_timestamp,\n                end_timestamp: delete_query.end_timestamp,\n                ..Default::default()\n            };\n            let mut search_indexes_metas = HashMap::new();\n          
  let index_uri = Uri::from_str(index_uri).context(\"invalid index URI\")?;\n            search_indexes_metas.insert(\n                delete_query.index_uid().clone(),\n                IndexMetasForLeafSearch {\n                    doc_mapper_str: doc_mapper_str.to_string(),\n                    index_uri,\n                },\n            );\n            let leaf_search_request = jobs_to_leaf_request(\n                &search_request,\n                &search_indexes_metas,\n                vec![search_job.clone()],\n            )?;\n            let response = search_client.leaf_search(leaf_search_request).await?;\n            ctx.record_progress();\n            if response.num_hits > 0 {\n                return Ok(true);\n            }\n        }\n        Ok(false)\n    }\n\n    /// Fetches stale splits from [`quickwit_metastore::Metastore`] and excludes immature splits and\n    /// split already among ongoing delete operations.\n    async fn get_relevant_stale_splits(\n        &mut self,\n        index_uid: IndexUid,\n        last_delete_opstamp: u64,\n        ctx: &ActorContext<Self>,\n    ) -> MetastoreResult<Vec<Split>> {\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            delete_opstamp: last_delete_opstamp,\n            num_splits: NUM_STALE_SPLITS_TO_FETCH as u64,\n        };\n        let stale_splits = ctx\n            .protect_future(self.metastore.list_stale_splits(list_stale_splits_request))\n            .await?\n            .deserialize_splits()\n            .await?;\n        debug!(\n            index_id = index_uid.index_id,\n            last_delete_opstamp = last_delete_opstamp,\n            num_stale_splits_from_metastore = stale_splits.len()\n        );\n        let ongoing_delete_operations = self.ongoing_delete_operations_inventory.list();\n        let filtered_splits = stale_splits\n            .into_iter()\n            .filter(|stale_split| {\n                
!ongoing_delete_operations.iter().any(|operation| {\n                    operation\n                        .splits\n                        .first()\n                        .unwrap() // <- This is safe as we know for sure that an operation is on one split.\n                        .split_id()\n                        == stale_split.split_id()\n                })\n            })\n            .collect_vec();\n        Ok(filtered_splits)\n    }\n}\n\n#[derive(Clone, Debug, Serialize)]\npub struct DeleteTaskPlannerState {\n    ongoing_delete_operations: Vec<MergeOperation>,\n}\n\n#[derive(Debug)]\nstruct PlanDeleteOperations;\n\n#[async_trait]\nimpl Handler<PlanDeleteOperations> for DeleteTaskPlanner {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: PlanDeleteOperations,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.send_delete_operations(ctx).await?;\n        Ok(())\n    }\n}\n\n#[derive(Debug)]\nstruct PlanDeleteLoop;\n\n#[async_trait]\nimpl Handler<PlanDeleteLoop> for DeleteTaskPlanner {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: PlanDeleteLoop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        self.handle(PlanDeleteOperations, ctx).await?;\n        ctx.schedule_self_msg(PLANNER_REFRESH_INTERVAL, PlanDeleteLoop);\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_config::build_doc_mapper;\n    use quickwit_indexing::TestSandbox;\n    use quickwit_indexing::merge_policy::MergeTask;\n    use quickwit_metastore::{\n        IndexMetadataResponseExt, ListSplitsRequestExt, MetastoreServiceStreamSplitsExt,\n        SplitMetadata,\n    };\n    use quickwit_proto::metastore::{DeleteQuery, IndexMetadataRequest, ListSplitsRequest};\n    use quickwit_proto::search::{LeafSearchRequest, LeafSearchResponse};\n    use quickwit_search::{MockSearchService, searcher_pool_for_test};\n\n    use super::*;\n\n    
#[tokio::test]\n    async fn test_delete_task_planner() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let index_id = \"test-delete-task-planner\";\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: i64\n                fast: true\n        \"#;\n        let indexing_settings_yaml = r#\"\n            merge_policy:\n                type: no_merge\n        \"#;\n        let test_sandbox = TestSandbox::create(\n            index_id,\n            doc_mapping_yaml,\n            indexing_settings_yaml,\n            &[\"body\"],\n        )\n        .await?;\n        let universe = test_sandbox.universe();\n        let docs = [\n            serde_json::json!({\"body\": \"info\", \"ts\": 0 }),\n            serde_json::json!({\"body\": \"info\", \"ts\": 0 }),\n            serde_json::json!({\"body\": \"delete\", \"ts\": 0 }),\n        ];\n        // Creates 3 splits\n        for doc in docs {\n            test_sandbox.add_documents(vec![doc]).await?;\n        }\n        let metastore = test_sandbox.metastore();\n        let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        let index_metadata = metastore\n            .index_metadata(index_metadata_request)\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        let index_uid = index_metadata.index_uid.clone();\n        let index_config = index_metadata.into_index_config();\n        let split_metas: Vec<SplitMetadata> = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits_metadata()\n            .await\n            .unwrap();\n        assert_eq!(split_metas.len(), 3);\n        let doc_mapper =\n            build_doc_mapper(&index_config.doc_mapping, 
&index_config.search_settings)?;\n        let doc_mapper_str = serde_json::to_string(&doc_mapper)?;\n\n        // Creates 2 delete tasks, one that will match 1 document,\n        // the other that will match no document.\n\n        let body_delete_ast = quickwit_query::query_ast::qast_json_helper(\"body:delete\", &[]);\n        let match_nothing_ast =\n            quickwit_query::query_ast::qast_json_helper(\"body:matchnothing\", &[]);\n        metastore\n            .create_delete_task(DeleteQuery {\n                index_uid: Some(index_uid.clone()),\n                start_timestamp: None,\n                end_timestamp: None,\n                query_ast: body_delete_ast.clone(),\n            })\n            .await?;\n        metastore\n            .create_delete_task(DeleteQuery {\n                index_uid: Some(index_uid.clone()),\n                start_timestamp: None,\n                end_timestamp: None,\n                query_ast: match_nothing_ast,\n            })\n            .await?;\n        let mut mock_search_service = MockSearchService::new();\n\n        // We have 2 delete tasks. Each one will trigger a leaf request for each\n        // of the 3 splits. 
This makes 6 requests.\n        let split_id_with_doc_to_delete = split_metas[2].split_id().to_string();\n        mock_search_service.expect_leaf_search().times(6).returning(\n            move |request: LeafSearchRequest| {\n                // Search on body:delete should return one hit only on the last split\n                // that should contains the doc.\n                if request.leaf_requests[0].split_offsets[0].split_id == split_id_with_doc_to_delete\n                    && request.search_request.as_ref().unwrap().query_ast == body_delete_ast\n                {\n                    return Ok(LeafSearchResponse {\n                        num_hits: 1,\n                        ..Default::default()\n                    });\n                }\n                Ok(LeafSearchResponse {\n                    num_hits: 0,\n                    ..Default::default()\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1000\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let merge_scheduler_mailbox = universe.get_or_spawn_one();\n        let (merge_split_downloader_mailbox, merge_split_downloader_inbox) =\n            universe.create_test_mailbox();\n        let delete_planner = DeleteTaskPlanner::new(\n            index_uid.clone(),\n            index_config.index_uri.clone(),\n            doc_mapper_str,\n            metastore.clone(),\n            search_job_placer,\n            merge_split_downloader_mailbox,\n            merge_scheduler_mailbox,\n        );\n        let (delete_planner_mailbox, delete_planner_handle) = test_sandbox\n            .universe()\n            .spawn_builder()\n            .spawn(delete_planner);\n        delete_planner_handle.process_pending_and_observe().await;\n        let downloader_msgs: Vec<MergeTask> = merge_split_downloader_inbox.drain_for_test_typed();\n        assert_eq!(downloader_msgs.len(), 1);\n        // The 
last split will undergo a delete operation.\n        assert_eq!(\n            downloader_msgs[0].splits[0].split_id(),\n            split_metas[2].split_id()\n        );\n        // Check that the planner state is consistent.\n        let delete_planner_state = delete_planner_handle.observe().await;\n        assert_eq!(\n            delete_planner_state.ongoing_delete_operations[0].splits[0].split_id(),\n            split_metas[2].split_id()\n        );\n        // Trigger a new plan evaluation and check that no new merge operation is planned.\n        delete_planner_mailbox\n            .ask(PlanDeleteOperations)\n            .await\n            .unwrap();\n        assert!(merge_split_downloader_inbox.drain_for_test().is_empty());\n        // Now drop the current merge operation and check that the planner will plan a new\n        // operation.\n        drop(downloader_msgs.into_iter().next().unwrap());\n        // Check that the planner state is consistent.\n        assert!(\n            delete_planner_handle\n                .observe()\n                .await\n                .ongoing_delete_operations\n                .is_empty()\n        );\n\n        // Trigger operations planning.\n        delete_planner_mailbox\n            .ask(PlanDeleteOperations)\n            .await\n            .unwrap();\n        let downloader_last_msgs = merge_split_downloader_inbox.drain_for_test_typed::<MergeTask>();\n        assert_eq!(downloader_last_msgs.len(), 1);\n        assert_eq!(\n            downloader_last_msgs[0].splits[0].split_id(),\n            split_metas[2].split_id()\n        );\n        // The other splits only have their delete opstamps updated to the last opstamp,\n        // which is 2 as there are 2 delete 
tasks.\n        let all_splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n            .await\n            .unwrap()\n            .collect_splits_metadata()\n            .await\n            .unwrap();\n        assert_eq!(all_splits[0].delete_opstamp, 2);\n        assert_eq!(all_splits[1].delete_opstamp, 2);\n        // The last split has not had its delete opstamp updated yet.\n        assert_eq!(all_splits[2].delete_opstamp, 0);\n        test_sandbox.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/actors/delete_task_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::path::PathBuf;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse quickwit_actors::{Actor, ActorContext, ActorExitStatus, ActorHandle, Handler, Mailbox};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_common::temp_dir::{self};\nuse quickwit_config::IndexConfig;\nuse quickwit_indexing::actors::MergeSchedulerService;\nuse quickwit_metastore::{IndexMetadataResponseExt, ListIndexesMetadataResponseExt};\nuse quickwit_proto::metastore::{\n    IndexMetadataRequest, ListIndexesMetadataRequest, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::IndexUid;\nuse quickwit_search::SearchJobPlacer;\nuse quickwit_storage::StorageResolver;\nuse serde::Serialize;\nuse tracing::{error, info, warn};\n\nuse super::delete_task_pipeline::DeleteTaskPipeline;\n\npub const DELETE_SERVICE_TASK_DIR_NAME: &str = \"delete_task_service\";\n\nconst UPDATE_PIPELINES_INTERVAL: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(200)\n} else {\n    // Each update triggers a call to the metastore. 
Deletes are infrequent operations and\n    // it's fine to wait a bit before updating the pipelines.\n    Duration::from_secs(30)\n};\n\n#[derive(Debug, Clone, Serialize)]\npub struct DeleteTaskServiceState {\n    pub num_running_pipelines: usize,\n}\n\npub struct DeleteTaskService {\n    metastore: MetastoreServiceClient,\n    search_job_placer: SearchJobPlacer,\n    storage_resolver: StorageResolver,\n    delete_service_task_dir: PathBuf,\n    pipeline_handles_by_index_uid: HashMap<IndexUid, ActorHandle<DeleteTaskPipeline>>,\n    max_concurrent_split_uploads: usize,\n    event_broker: EventBroker,\n    merge_scheduler_service: Mailbox<MergeSchedulerService>,\n}\n\nimpl DeleteTaskService {\n    pub async fn new(\n        metastore: MetastoreServiceClient,\n        search_job_placer: SearchJobPlacer,\n        storage_resolver: StorageResolver,\n        data_dir_path: PathBuf,\n        max_concurrent_split_uploads: usize,\n        merge_scheduler_service: Mailbox<MergeSchedulerService>,\n        event_broker: EventBroker,\n    ) -> anyhow::Result<Self> {\n        let delete_service_task_path = data_dir_path.join(DELETE_SERVICE_TASK_DIR_NAME);\n        let delete_service_task_dir =\n            temp_dir::create_or_purge_directory(delete_service_task_path.as_path()).await?;\n        Ok(Self {\n            metastore,\n            search_job_placer,\n            storage_resolver,\n            delete_service_task_dir,\n            pipeline_handles_by_index_uid: Default::default(),\n            max_concurrent_split_uploads,\n            merge_scheduler_service,\n            event_broker,\n        })\n    }\n}\n\n#[async_trait]\nimpl Actor for DeleteTaskService {\n    type ObservableState = DeleteTaskServiceState;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        DeleteTaskServiceState {\n            num_running_pipelines: self.pipeline_handles_by_index_uid.len(),\n        }\n    }\n\n    fn name(&self) -> String {\n        
\"DeleteTaskService\".to_string()\n    }\n\n    async fn initialize(&mut self, ctx: &ActorContext<Self>) -> Result<(), ActorExitStatus> {\n        self.handle(UpdatePipelines, ctx).await?;\n        Ok(())\n    }\n}\n\nimpl DeleteTaskService {\n    pub async fn update_pipeline_handles(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        let mut index_config_by_index_id: HashMap<IndexUid, IndexConfig> = self\n            .metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await?\n            .deserialize_indexes_metadata()\n            .await?\n            .into_iter()\n            .map(|index_metadata| {\n                (\n                    index_metadata.index_uid.clone(),\n                    index_metadata.into_index_config(),\n                )\n            })\n            .collect();\n        let index_uids: HashSet<IndexUid> = index_config_by_index_id.keys().cloned().collect();\n        let pipeline_index_uids: HashSet<IndexUid> =\n            self.pipeline_handles_by_index_uid.keys().cloned().collect();\n\n        // Remove pipelines on deleted indexes.\n        for deleted_index_uid in pipeline_index_uids.difference(&index_uids) {\n            info!(\n                deleted_index_id = deleted_index_uid.index_id,\n                \"Remove deleted index from delete task pipelines.\"\n            );\n            let pipeline_handle = self\n                .pipeline_handles_by_index_uid\n                .remove(deleted_index_uid)\n                .expect(\"Handle must be present.\");\n            // Kill the pipeline, this avoids to wait a long time for a delete operation to finish.\n            pipeline_handle.kill().await;\n        }\n\n        // Start new pipelines and add them to the handles hashmap.\n        for index_uid in index_uids.difference(&pipeline_index_uids) {\n            let index_config = index_config_by_index_id\n                .remove(index_uid)\n     
           .expect(\"index metadata should be present\");\n            if self.spawn_pipeline(index_config, ctx).await.is_err() {\n                warn!(\"failed to spawn delete pipeline for {}\", index_uid.index_id);\n            }\n        }\n\n        Ok(())\n    }\n\n    pub async fn spawn_pipeline(\n        &mut self,\n        index_config: IndexConfig,\n        ctx: &ActorContext<Self>,\n    ) -> anyhow::Result<()> {\n        let index_uri = index_config.index_uri.clone();\n        let index_storage = self.storage_resolver.resolve(&index_uri).await?;\n        let index_metadata_request =\n            IndexMetadataRequest::for_index_id(index_config.index_id.to_string());\n        let index_metadata = self\n            .metastore\n            .index_metadata(index_metadata_request)\n            .await?\n            .deserialize_index_metadata()?;\n        let pipeline = DeleteTaskPipeline::new(\n            index_metadata.index_uid.clone(),\n            self.metastore.clone(),\n            self.search_job_placer.clone(),\n            index_storage,\n            self.delete_service_task_dir.clone(),\n            self.max_concurrent_split_uploads,\n            self.merge_scheduler_service.clone(),\n            self.event_broker.clone(),\n        );\n        let (_pipeline_mailbox, pipeline_handler) = ctx.spawn_actor().spawn(pipeline);\n        self.pipeline_handles_by_index_uid\n            .insert(index_metadata.index_uid, pipeline_handler);\n        Ok(())\n    }\n}\n\n#[derive(Debug)]\nstruct UpdatePipelines;\n\n#[async_trait]\nimpl Handler<UpdatePipelines> for DeleteTaskService {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: UpdatePipelines,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), ActorExitStatus> {\n        let result = self.update_pipeline_handles(ctx).await;\n        if let Err(error) = result {\n            error!(error=%error, \"delete task pipelines update failed\");\n        }\n        
ctx.schedule_self_msg(UPDATE_PIPELINES_INTERVAL, UpdatePipelines);\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_actors::Universe;\n    use quickwit_common::pubsub::EventBroker;\n    use quickwit_indexing::TestSandbox;\n    use quickwit_proto::metastore::{\n        DeleteIndexRequest, DeleteQuery, ListDeleteTasksRequest, MetastoreService,\n    };\n    use quickwit_search::{MockSearchService, SearchJobPlacer, searcher_pool_for_test};\n    use quickwit_storage::StorageResolver;\n\n    use super::{DeleteTaskService, UPDATE_PIPELINES_INTERVAL};\n\n    #[tokio::test]\n    async fn test_delete_task_service() -> anyhow::Result<()> {\n        quickwit_common::setup_logging_for_tests();\n        let index_id = \"test-delete-task-service-index\";\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: i64\n                fast: true\n        \"#;\n        let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n        let index_uid = test_sandbox.index_uid();\n        let metastore = test_sandbox.metastore();\n        let mock_search_service = MockSearchService::new();\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1000\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let temp_dir = tempfile::tempdir().unwrap();\n        let data_dir_path = temp_dir.path().to_path_buf();\n        let universe: &Universe = test_sandbox.universe();\n        let delete_task_service = DeleteTaskService::new(\n            metastore.clone(),\n            search_job_placer,\n            StorageResolver::unconfigured(),\n            data_dir_path,\n            4,\n            universe.get_or_spawn_one(),\n            EventBroker::default(),\n        )\n        .await\n        .unwrap();\n        let (_delete_task_service_mailbox, 
delete_task_service_handler) =\n            universe.spawn_builder().spawn(delete_task_service);\n        let state = delete_task_service_handler\n            .process_pending_and_observe()\n            .await;\n        assert_eq!(state.num_running_pipelines, 1);\n        let delete_query = DeleteQuery {\n            index_uid: Some(index_uid.clone()),\n            start_timestamp: None,\n            end_timestamp: None,\n            query_ast: r#\"{\"type\": \"MatchAll\"}\"#.to_string(),\n        };\n        metastore.create_delete_task(delete_query).await.unwrap();\n        // Just test creation of delete query.\n        assert_eq!(\n            metastore\n                .list_delete_tasks(ListDeleteTasksRequest::new(index_uid.clone(), 0))\n                .await\n                .unwrap()\n                .delete_tasks\n                .len(),\n            1\n        );\n        metastore\n            .delete_index(DeleteIndexRequest {\n                index_uid: Some(index_uid.clone()),\n            })\n            .await\n            .unwrap();\n        universe.sleep(UPDATE_PIPELINES_INTERVAL * 2).await;\n        let state_after_deletion = delete_task_service_handler\n            .process_pending_and_observe()\n            .await;\n        assert_eq!(state_after_deletion.num_running_pipelines, 0);\n        assert!(universe.get_one::<DeleteTaskService>().is_some());\n        let actors_observations = universe.observe(UPDATE_PIPELINES_INTERVAL).await;\n        assert!(\n            actors_observations\n                .into_iter()\n                .any(|observation| observation.type_name\n                    == std::any::type_name::<DeleteTaskService>())\n        );\n        assert!(universe.get_one::<DeleteTaskService>().is_some());\n        test_sandbox.assert_quit().await;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/actors/garbage_collector.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::path::Path;\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse async_trait::async_trait;\nuse futures::{StreamExt, stream};\nuse quickwit_actors::{Actor, ActorContext, Handler};\nuse quickwit_common::shared_consts::split_deletion_grace_period;\nuse quickwit_index_management::{GcMetrics, run_garbage_collect};\nuse quickwit_metastore::ListIndexesMetadataResponseExt;\nuse quickwit_proto::metastore::{\n    ListIndexesMetadataRequest, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::IndexUid;\nuse quickwit_storage::{Storage, StorageResolver};\nuse serde::Serialize;\nuse tracing::{debug, error, info};\n\nuse crate::metrics::JANITOR_METRICS;\n\nconst RUN_INTERVAL: Duration = Duration::from_secs(10 * 60); // 10 minutes\n\n/// Staged files needs to be deleted if there was a failure.\n/// TODO ideally we want clean up all staged splits every time we restart the indexing pipeline, but\n/// the grace period strategy should do the job for the moment.\nconst STAGED_GRACE_PERIOD: Duration = Duration::from_secs(60 * 60 * 24); // 24 hours\n\n#[derive(Clone, Debug, Default, Serialize)]\npub struct GarbageCollectorCounters {\n    /// The number of passes the garbage collector has performed.\n    pub num_passes: usize,\n    /// The number of deleted files.\n    pub 
num_deleted_files: usize,\n    /// The number of bytes deleted.\n    pub num_deleted_bytes: usize,\n    /// The number of failed garbage collection runs.\n    pub num_failed_gc_run: usize,\n    /// The number of successful garbage collection runs.\n    pub num_successful_gc_run: usize,\n    /// The number of failed storage resolutions.\n    pub num_failed_storage_resolution: usize,\n    /// The number of splits that could not be removed.\n    pub num_failed_splits: usize,\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n/// An actor that periodically collects garbage from all indexes.\npub struct GarbageCollector {\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n    counters: GarbageCollectorCounters,\n}\n\nimpl GarbageCollector {\n    pub fn new(metastore: MetastoreServiceClient, storage_resolver: StorageResolver) -> Self {\n        Self {\n            metastore,\n            storage_resolver,\n            counters: GarbageCollectorCounters::default(),\n        }\n    }\n\n    /// GC loop handler logic.\n    /// Must not return an error, to prevent the actor from crashing.\n    async fn handle_inner(&mut self, ctx: &ActorContext<Self>) {\n        debug!(\"loading indexes from the metastore\");\n        self.counters.num_passes += 1;\n\n        let start = Instant::now();\n\n        let response = match self\n            .metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n        {\n            Ok(response) => response,\n            Err(error) => {\n                error!(%error, \"failed to list indexes from the metastore\");\n                return;\n            }\n        };\n        let indexes = match response.deserialize_indexes_metadata().await {\n            Ok(indexes) => indexes,\n            Err(error) => {\n                error!(%error, \"failed to deserialize indexes metadata\");\n                return;\n            }\n        };\n        info!(\"loaded {} indexes from the 
metastore\", indexes.len());\n\n        let expected_count = indexes.len();\n        let index_storages: HashMap<IndexUid, Arc<dyn Storage>> = stream::iter(indexes).filter_map(|index| {\n            let storage_resolver = self.storage_resolver.clone();\n            async move {\n                let index_uid = index.index_uid.clone();\n                let index_uri = index.index_uri();\n                let storage = match storage_resolver.resolve(index_uri).await {\n                    Ok(storage) => storage,\n                    Err(error) => {\n                        error!(index=%index.index_id(), error=?error, \"failed to resolve the index storage Uri\");\n                        return None;\n                    }\n                };\n                Some((index_uid, storage))\n            }}).collect()\n            .await;\n\n        let storage_got_count = index_storages.len();\n        self.counters.num_failed_storage_resolution += expected_count - storage_got_count;\n\n        if index_storages.is_empty() {\n            return;\n        }\n\n        let gc_res = run_garbage_collect(\n            index_storages,\n            self.metastore.clone(),\n            STAGED_GRACE_PERIOD,\n            split_deletion_grace_period(),\n            false,\n            Some(ctx.progress()),\n            Some(GcMetrics {\n                deleted_splits: JANITOR_METRICS\n                    .gc_deleted_splits\n                    .with_label_values([\"success\"])\n                    .clone(),\n                deleted_bytes: JANITOR_METRICS.gc_deleted_bytes.clone(),\n                failed_splits: JANITOR_METRICS\n                    .gc_deleted_splits\n                    .with_label_values([\"error\"])\n                    .clone(),\n            }),\n        )\n        .await;\n\n        let run_duration = start.elapsed().as_secs();\n        JANITOR_METRICS.gc_seconds_total.inc_by(run_duration);\n\n        let deleted_file_entries = match gc_res {\n            
Ok(removal_info) => {\n                self.counters.num_successful_gc_run += 1;\n                JANITOR_METRICS.gc_runs.with_label_values([\"success\"]).inc();\n                self.counters.num_failed_splits += removal_info.failed_splits.len();\n                removal_info.removed_split_entries\n            }\n            Err(error) => {\n                self.counters.num_failed_gc_run += 1;\n                JANITOR_METRICS.gc_runs.with_label_values([\"error\"]).inc();\n                error!(error=?error, \"failed to run garbage collection\");\n                return;\n            }\n        };\n        if !deleted_file_entries.is_empty() {\n            let num_deleted_splits = deleted_file_entries.len();\n            let num_deleted_bytes = deleted_file_entries\n                .iter()\n                .map(|entry| entry.file_size_bytes.as_u64() as usize)\n                .sum::<usize>();\n            let deleted_files: HashSet<&Path> = deleted_file_entries\n                .iter()\n                .map(|deleted_entry| deleted_entry.file_name.as_path())\n                .take(5)\n                .collect();\n            info!(\n                num_deleted_splits = num_deleted_splits,\n                \"Janitor deleted {} splits, including {:?}.\", num_deleted_splits, deleted_files,\n            );\n            self.counters.num_deleted_files += num_deleted_splits;\n            self.counters.num_deleted_bytes += num_deleted_bytes;\n        }\n    }\n}\n\n#[async_trait]\nimpl Actor for GarbageCollector {\n    type ObservableState = GarbageCollectorCounters;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    fn name(&self) -> String {\n        \"GarbageCollector\".to_string()\n    }\n\n    async fn initialize(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        self.handle(Loop, ctx).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl 
Handler<Loop> for GarbageCollector {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: Loop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        self.handle_inner(ctx).await;\n        ctx.schedule_self_msg(RUN_INTERVAL, Loop);\n        Ok(())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::Bound;\n    use std::path::Path;\n    use std::sync::Arc;\n\n    use quickwit_actors::Universe;\n    use quickwit_common::ServiceStream;\n    use quickwit_common::shared_consts::split_deletion_grace_period;\n    use quickwit_metastore::{\n        IndexMetadata, ListSplitsRequestExt, ListSplitsResponseExt, Split, SplitMetadata,\n        SplitState,\n    };\n    use quickwit_proto::metastore::{\n        EmptyResponse, ListIndexesMetadataResponse, ListSplitsResponse, MetastoreError,\n        MockMetastoreService,\n    };\n    use quickwit_proto::types::IndexUid;\n    use quickwit_storage::MockStorage;\n    use time::OffsetDateTime;\n\n    use super::*;\n\n    fn hashmap<K: Eq + std::hash::Hash, V>(key: K, value: V) -> HashMap<K, V> {\n        let mut map = HashMap::new();\n        map.insert(key, value);\n        map\n    }\n\n    fn make_splits(index_id: &str, split_ids: &[&str], split_state: SplitState) -> Vec<Split> {\n        split_ids\n            .iter()\n            .map(|split_id| Split {\n                split_metadata: SplitMetadata {\n                    split_id: split_id.to_string(),\n                    index_uid: IndexUid::for_test(index_id, 0),\n                    footer_offsets: 5..20,\n                    ..Default::default()\n                },\n                split_state,\n                update_timestamp: 0i64,\n                publish_timestamp: None,\n            })\n            .collect()\n    }\n\n    #[tokio::test]\n    async fn test_run_garbage_collect_calls_dependencies_appropriately() {\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let mut 
mock_storage = MockStorage::default();\n        mock_storage\n            .expect_bulk_delete()\n            .times(1)\n            .returning(|paths: &[&Path]| {\n                let actual: HashSet<&Path> = HashSet::from_iter(paths.iter().copied());\n                let expected: HashSet<&Path> = HashSet::from_iter([\n                    Path::new(\"a.split\"),\n                    Path::new(\"b.split\"),\n                    Path::new(\"c.split\"),\n                ]);\n\n                assert_eq!(actual, expected);\n\n                Ok(())\n            });\n\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_list_splits()\n            .times(2)\n            .returning(move |list_splits_request| {\n                let query = list_splits_request.deserialize_list_splits_query().unwrap();\n                let splits = match query.split_states[0] {\n                    SplitState::Staged => {\n                        assert_eq!(query.index_uids.unwrap()[0], index_uid_clone);\n                        make_splits(\"test-index\", &[\"a\"], SplitState::Staged)\n                    }\n                    SplitState::MarkedForDeletion => {\n                        assert!(query.index_uids.is_none());\n                        let expected_deletion_timestamp = OffsetDateTime::now_utc()\n                            .unix_timestamp()\n                            - split_deletion_grace_period().as_secs() as i64;\n                        assert_eq!(\n                            query.update_timestamp.end,\n                            Bound::Included(expected_deletion_timestamp),\n                            \"Expected splits query to only select splits which have not been \\\n                             updated since the expected deletion timestamp.\",\n                        );\n                        assert_eq!(\n                            
query.update_timestamp.start,\n                            Bound::Unbounded,\n                            \"Expected the lower bound to be unbounded when filtering splits.\",\n                        );\n\n                        make_splits(\n                            \"test-index\",\n                            &[\"a\", \"b\", \"c\"],\n                            SplitState::MarkedForDeletion,\n                        )\n                    }\n                    _ => panic!(\"only Staged and MarkedForDeletion expected.\"),\n                };\n                let splits = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits)]))\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .times(1)\n            .returning(move |mark_splits_for_deletion_request| {\n                assert_eq!(\n                    mark_splits_for_deletion_request.index_uid(),\n                    &index_uid_clone\n                );\n                assert_eq!(mark_splits_for_deletion_request.split_ids, vec![\"a\"]);\n                Ok(EmptyResponse {})\n            });\n        let index_uid_clone = index_uid.clone();\n        mock_metastore\n            .expect_delete_splits()\n            .times(1)\n            .returning(move |delete_splits_request| {\n                assert_eq!(delete_splits_request.index_uid(), &index_uid_clone);\n                let split_ids = HashSet::<&str>::from_iter(\n                    delete_splits_request\n                        .split_ids\n                        .iter()\n                        .map(|split_id| split_id.as_str()),\n                );\n                let expected_split_ids = HashSet::<&str>::from_iter([\"a\", \"b\", \"c\"]);\n                assert_eq!(split_ids, expected_split_ids);\n\n                Ok(EmptyResponse {})\n            });\n\n        let result = 
run_garbage_collect(\n            hashmap(index_uid, Arc::new(mock_storage)),\n            MetastoreServiceClient::from_mock(mock_metastore),\n            STAGED_GRACE_PERIOD,\n            split_deletion_grace_period(),\n            false,\n            None,\n            None,\n        )\n        .await;\n        assert!(result.is_ok());\n    }\n\n    #[tokio::test]\n    async fn test_garbage_collect_calls_dependencies_appropriately() {\n        let storage_resolver = StorageResolver::unconfigured();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            .returning(|_list_indexes_request| {\n                let indexes_metadata = vec![IndexMetadata::for_test(\n                    \"test-index\",\n                    \"ram://indexes/test-index\",\n                )];\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .times(2)\n            .returning(|list_splits_request| {\n                let query = list_splits_request.deserialize_list_splits_query().unwrap();\n                let splits = match query.split_states[0] {\n                    SplitState::Staged => {\n                        assert_eq!(&query.index_uids.unwrap()[0].index_id, \"test-index\");\n                        make_splits(\"test-index\", &[\"a\"], SplitState::Staged)\n                    }\n                    SplitState::MarkedForDeletion => {\n                        assert!(query.index_uids.is_none());\n                        make_splits(\n                            \"test-index\",\n                            &[\"a\", \"b\", \"c\"],\n                            SplitState::MarkedForDeletion,\n                        )\n                    }\n                    _ => panic!(\"only Staged and MarkedForDeletion expected.\"),\n                };\n                let 
splits = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits)]))\n            });\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .times(1)\n            .returning(|mark_splits_for_deletion_request| {\n                let index_uid: IndexUid = mark_splits_for_deletion_request.index_uid().clone();\n                assert_eq!(&index_uid.index_id, \"test-index\");\n                assert_eq!(mark_splits_for_deletion_request.split_ids, vec![\"a\"]);\n                Ok(EmptyResponse {})\n            });\n        mock_metastore\n            .expect_delete_splits()\n            .times(1)\n            .returning(|delete_splits_request| {\n                let index_uid: IndexUid = delete_splits_request.index_uid().clone();\n                assert_eq!(&index_uid.index_id, \"test-index\");\n\n                let split_ids = HashSet::<&str>::from_iter(\n                    delete_splits_request\n                        .split_ids\n                        .iter()\n                        .map(|split_id| split_id.as_str()),\n                );\n                let expected_split_ids = HashSet::<&str>::from_iter([\"a\", \"b\", \"c\"]);\n\n                assert_eq!(split_ids, expected_split_ids);\n                Ok(EmptyResponse {})\n            });\n\n        let garbage_collect_actor = GarbageCollector::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            storage_resolver,\n        );\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handler) = universe.spawn_builder().spawn(garbage_collect_actor);\n\n        let state_after_initialization = handler.process_pending_and_observe().await.state;\n        assert_eq!(state_after_initialization.num_passes, 1);\n        assert_eq!(state_after_initialization.num_deleted_files, 3);\n        assert_eq!(state_after_initialization.num_deleted_bytes, 60);\n        
assert_eq!(state_after_initialization.num_failed_splits, 0);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_garbage_collect_get_calls_repeatedly() {\n        let storage_resolver = StorageResolver::unconfigured();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(3)\n            .returning(|_list_indexes_metadata| {\n                let indexes_metadata = vec![IndexMetadata::for_test(\n                    \"test-index\",\n                    \"ram://indexes/test-index\",\n                )];\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .times(6)\n            .returning(|list_splits_request| {\n                let query = list_splits_request.deserialize_list_splits_query().unwrap();\n                let splits = match query.split_states[0] {\n                    SplitState::Staged => {\n                        assert_eq!(&query.index_uids.unwrap()[0].index_id, \"test-index\");\n                        make_splits(\"test-index\", &[\"a\"], SplitState::Staged)\n                    }\n                    SplitState::MarkedForDeletion => {\n                        assert!(&query.index_uids.is_none());\n                        make_splits(\"test-index\", &[\"a\", \"b\"], SplitState::MarkedForDeletion)\n                    }\n                    _ => panic!(\"only Staged and MarkedForDeletion expected.\"),\n                };\n                let splits = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits)]))\n            });\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .times(3)\n            .returning(|mark_splits_for_deletion_request| {\n                let index_uid: IndexUid = 
mark_splits_for_deletion_request.index_uid().clone();\n                assert_eq!(&index_uid.index_id, \"test-index\");\n                assert_eq!(mark_splits_for_deletion_request.split_ids, vec![\"a\"]);\n                Ok(EmptyResponse {})\n            });\n        mock_metastore\n            .expect_delete_splits()\n            .times(3)\n            .returning(|delete_splits_request| {\n                let index_uid: IndexUid = delete_splits_request.index_uid().clone();\n                assert_eq!(&index_uid.index_id, \"test-index\");\n\n                let split_ids = HashSet::<&str>::from_iter(\n                    delete_splits_request\n                        .split_ids\n                        .iter()\n                        .map(|split_id| split_id.as_str()),\n                );\n                let expected_split_ids = HashSet::<&str>::from_iter([\"a\", \"b\"]);\n\n                assert_eq!(split_ids, expected_split_ids);\n                Ok(EmptyResponse {})\n            });\n\n        let garbage_collect_actor = GarbageCollector::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            storage_resolver,\n        );\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handle) = universe.spawn_builder().spawn(garbage_collect_actor);\n\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 1);\n        assert_eq!(counters.num_deleted_files, 2);\n        assert_eq!(counters.num_deleted_bytes, 40);\n        assert_eq!(counters.num_successful_gc_run, 1);\n        assert_eq!(counters.num_failed_storage_resolution, 0);\n        assert_eq!(counters.num_failed_gc_run, 0);\n        assert_eq!(counters.num_failed_splits, 0);\n\n        // 30 secs later\n        universe.sleep(Duration::from_secs(30)).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 1);\n        
assert_eq!(counters.num_deleted_files, 2);\n        assert_eq!(counters.num_deleted_bytes, 40);\n        assert_eq!(counters.num_successful_gc_run, 1);\n        assert_eq!(counters.num_failed_storage_resolution, 0);\n        assert_eq!(counters.num_failed_gc_run, 0);\n        assert_eq!(counters.num_failed_splits, 0);\n\n        // 60 secs later\n        universe.sleep(RUN_INTERVAL).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 2);\n        assert_eq!(counters.num_deleted_files, 4);\n        assert_eq!(counters.num_deleted_bytes, 80);\n        assert_eq!(counters.num_successful_gc_run, 2);\n        assert_eq!(counters.num_failed_storage_resolution, 0);\n        assert_eq!(counters.num_failed_gc_run, 0);\n        assert_eq!(counters.num_failed_splits, 0);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_garbage_collect_get_called_repeatedly_on_failure() {\n        let storage_resolver = StorageResolver::unconfigured();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(4)\n            .returning(|_list_indexes_request| {\n                Err(MetastoreError::Db {\n                    message: \"fail to list indexes\".to_string(),\n                })\n            });\n\n        let garbage_collect_actor = GarbageCollector::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            storage_resolver,\n        );\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handle) = universe.spawn_builder().spawn(garbage_collect_actor);\n\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 1);\n\n        universe.sleep(RUN_INTERVAL).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 
2);\n\n        universe.sleep(RUN_INTERVAL).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 3);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_garbage_collect_fails_to_resolve_storage() {\n        let storage_resolver = StorageResolver::unconfigured();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            .returning(move |_list_indexes_request| {\n                let indexes_metadata = vec![IndexMetadata::for_test(\n                    \"test-index\",\n                    \"postgresql://indexes/test-index\",\n                )];\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n\n        let garbage_collect_actor = GarbageCollector::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            storage_resolver,\n        );\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handle) = universe.spawn_builder().spawn(garbage_collect_actor);\n\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 1);\n        assert_eq!(counters.num_deleted_files, 0);\n        assert_eq!(counters.num_deleted_bytes, 0);\n        assert_eq!(counters.num_successful_gc_run, 0);\n        assert_eq!(counters.num_failed_storage_resolution, 1);\n        assert_eq!(counters.num_failed_gc_run, 0);\n        assert_eq!(counters.num_failed_splits, 0);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_garbage_collect_fails_to_run_delete_on_one_index() {\n        let storage_resolver = StorageResolver::unconfigured();\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            
.returning(|_list_indexes_request| {\n                let indexes_metadata = vec![\n                    IndexMetadata::for_test(\"test-index-1\", \"ram://indexes/test-index-1\"),\n                    IndexMetadata::for_test(\"test-index-2\", \"ram://indexes/test-index-2\"),\n                ];\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .times(3)\n            .returning(|list_splits_request| {\n                let query = list_splits_request.deserialize_list_splits_query().unwrap();\n                let splits_ids_string: Vec<String> =\n                    (0..8000).map(|seq| format!(\"split-{seq:04}\")).collect();\n                let splits_ids: Vec<&str> = splits_ids_string\n                    .iter()\n                    .map(|string| string.as_str())\n                    .collect();\n                let mut splits = match query.split_states[0] {\n                    SplitState::Staged => {\n                        let index_uids = query.index_uids.unwrap();\n                        assert_eq!(index_uids.len(), 2);\n                        assert!(\n                            [\"test-index-1\", \"test-index-2\"]\n                                .contains(&index_uids[0].index_id.as_ref())\n                        );\n                        assert!(\n                            [\"test-index-1\", \"test-index-2\"]\n                                .contains(&index_uids[1].index_id.as_ref())\n                        );\n                        let mut splits = make_splits(\"test-index-1\", &[\"a\"], SplitState::Staged);\n                        splits.append(&mut make_splits(\"test-index-2\", &[\"a\"], SplitState::Staged));\n                        splits\n                    }\n                    SplitState::MarkedForDeletion => {\n                        assert!(query.index_uids.is_none());\n                        assert_eq!(query.limit, 
Some(10_000));\n                        let mut splits =\n                            make_splits(\"test-index-1\", &splits_ids, SplitState::MarkedForDeletion);\n                        splits.append(&mut make_splits(\n                            \"test-index-2\",\n                            &splits_ids,\n                            SplitState::MarkedForDeletion,\n                        ));\n                        splits\n                    }\n                    _ => panic!(\"only Staged and MarkedForDeletion expected.\"),\n                };\n                if let Some((index_uid, split_id)) = query.after_split {\n                    splits.retain(|split| {\n                        (\n                            &split.split_metadata.index_uid,\n                            &split.split_metadata.split_id,\n                        ) > (&index_uid, &split_id)\n                    });\n                }\n                splits.truncate(10_000);\n                let splits = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits)]))\n            });\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .times(2)\n            .returning(|mark_splits_for_deletion_request| {\n                let index_uid: IndexUid = mark_splits_for_deletion_request.index_uid().clone();\n                assert!([\"test-index-1\", \"test-index-2\"].contains(&index_uid.index_id.as_ref()));\n                assert_eq!(mark_splits_for_deletion_request.split_ids, vec![\"a\"]);\n                Ok(EmptyResponse {})\n            });\n        mock_metastore\n            .expect_delete_splits()\n            .times(3)\n            .returning(|delete_splits_request| {\n                let index_uid: IndexUid = delete_splits_request.index_uid().clone();\n                let split_ids = HashSet::<&str>::from_iter(\n                    delete_splits_request\n                        .split_ids\n                    
    .iter()\n                        .map(|split_id| split_id.as_str()),\n                );\n                if index_uid.index_id == \"test-index-1\" {\n                    assert_eq!(split_ids.len(), 8000);\n                    for seq in 0..8000 {\n                        let split_id = format!(\"split-{seq:04}\");\n                        assert!(split_ids.contains(&*split_id));\n                    }\n                } else if split_ids.len() == 2000 {\n                    for seq in 0..2000 {\n                        let split_id = format!(\"split-{seq:04}\");\n                        assert!(split_ids.contains(&*split_id));\n                    }\n                } else if split_ids.len() == 6000 {\n                    for seq in 2000..8000 {\n                        let split_id = format!(\"split-{seq:04}\");\n                        assert!(split_ids.contains(&*split_id));\n                    }\n                } else {\n                    panic!();\n                }\n\n                // This should not cause the whole run to fail and return an error,\n                // instead this should simply get logged and return the list of splits\n                // which have successfully been deleted.\n                if index_uid.index_id == \"test-index-2\" && split_ids.len() == 2000 {\n                    Err(MetastoreError::Db {\n                        message: \"fail to delete\".to_string(),\n                    })\n                } else {\n                    Ok(EmptyResponse {})\n                }\n            });\n\n        let garbage_collect_actor = GarbageCollector::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            storage_resolver,\n        );\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handle) = universe.spawn_builder().spawn(garbage_collect_actor);\n\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_passes, 1);\n     
   assert_eq!(counters.num_deleted_files, 14000);\n        assert_eq!(counters.num_deleted_bytes, 20 * 14000);\n        assert_eq!(counters.num_successful_gc_run, 1);\n        assert_eq!(counters.num_failed_storage_resolution, 0);\n        assert_eq!(counters.num_failed_gc_run, 0);\n        assert_eq!(counters.num_failed_splits, 2000);\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/actors/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod delete_task_pipeline;\nmod delete_task_planner;\nmod delete_task_service;\nmod garbage_collector;\nmod retention_policy_executor;\n\npub use delete_task_service::{DELETE_SERVICE_TASK_DIR_NAME, DeleteTaskService};\npub use garbage_collector::GarbageCollector;\npub use retention_policy_executor::RetentionPolicyExecutor;\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/actors/retention_policy_executor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse itertools::Itertools;\nuse quickwit_actors::{Actor, ActorContext, Handler};\nuse quickwit_config::IndexConfig;\nuse quickwit_metastore::ListIndexesMetadataResponseExt;\nuse quickwit_proto::metastore::{\n    ListIndexesMetadataRequest, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::IndexUid;\nuse serde::Serialize;\nuse tracing::{debug, error, info};\n\nuse crate::retention_policy_execution::run_execute_retention_policy;\n\nconst RUN_INTERVAL: Duration = Duration::from_secs(60 * 60); // 1 hour\n\n#[derive(Clone, Debug, Default, Serialize)]\npub struct RetentionPolicyExecutorCounters {\n    /// The number of config refresh passes.\n    pub num_refresh_passes: usize,\n\n    /// The number of execution passes.\n    pub num_execution_passes: usize,\n\n    /// The number of expired splits.\n    pub num_expired_splits: usize,\n}\n\n#[derive(Debug)]\nstruct Loop;\n\n#[derive(Debug)]\nstruct Execute {\n    index_uid: IndexUid,\n}\n\n/// An actor for scheduling retention policy execution on all indexes.\n/// It keeps a list of indexes that have a retention policy configured\n/// in a cache and periodically updates this list.\npub struct RetentionPolicyExecutor {\n    metastore: MetastoreServiceClient,\n    /// A map of 
index_id to index config for the indexes managed by this executor.\n    /// This acts as a local cache that is periodically updated while taking into\n    /// account deleted indexes and updated or removed retention policies.\n    index_configs: HashMap<String, IndexConfig>,\n    counters: RetentionPolicyExecutorCounters,\n}\n\nimpl RetentionPolicyExecutor {\n    pub fn new(metastore: MetastoreServiceClient) -> Self {\n        Self {\n            metastore,\n            index_configs: HashMap::new(),\n            counters: RetentionPolicyExecutorCounters::default(),\n        }\n    }\n\n    /// Indexes refresh Loop handler logic.\n    /// Should not return an error to prevent the actor from crashing.\n    async fn handle_refresh_loop(&mut self, ctx: &ActorContext<Self>) {\n        debug!(\"loading indexes from the metastore\");\n        self.counters.num_refresh_passes += 1;\n\n        let response = match self\n            .metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n        {\n            Ok(response) => response,\n            Err(error) => {\n                error!(%error, \"failed to list indexes from the metastore\");\n                return;\n            }\n        };\n        let indexes = match response.deserialize_indexes_metadata().await {\n            Ok(indexes) => indexes,\n            Err(error) => {\n                error!(%error, \"failed to deserialize indexes metadata\");\n                return;\n            }\n        };\n        info!(\"loaded {} indexes from the metastore\", indexes.len());\n\n        let deleted_indexes = compute_deleted_indexes(\n            self.index_configs.keys().map(String::as_str),\n            indexes\n                .iter()\n                .map(|index_metadata| index_metadata.index_id()),\n        );\n        if !deleted_indexes.is_empty() {\n            debug!(index_ids=%deleted_indexes.iter().join(\", \"), \"deleting indexes from cache\");\n            for 
index_id in deleted_indexes {\n                self.index_configs.remove(&index_id);\n            }\n        }\n        for index_metadata in indexes {\n            let index_uid = index_metadata.index_uid.clone();\n            let index_config = index_metadata.into_index_config();\n            // We only care about indexes with a retention policy configured.\n            let retention_policy = match &index_config.retention_policy_opt {\n                Some(policy) => policy,\n                None => {\n                    // Remove the index from the cache if it exists.\n                    // If the retention policy was removed, this index might have\n                    // been inserted in the cache during a previous iteration.\n                    self.index_configs.remove(&index_config.index_id);\n                    continue;\n                }\n            };\n\n            // Insert or update the index in the cache.\n            if let Some(value) = self.index_configs.get_mut(&index_config.index_id) {\n                // Update the cache index entry in case the retention policy was updated.\n                *value = index_config;\n                continue;\n            }\n\n            if let Ok(next_interval) = retention_policy.duration_until_next_evaluation() {\n                let message = Execute { index_uid };\n                info!(index_id=?index_config.index_id, scheduled_in=?next_interval, \"retention-policy-schedule-operation\");\n                // Insert the index and schedule its first retention policy execution.\n                self.index_configs\n                    .insert(index_config.index_id.clone(), index_config);\n                ctx.schedule_self_msg(next_interval, message);\n            } else {\n                error!(index_id=%index_config.index_id, \"Couldn't extract the index next schedule time.\")\n            }\n        }\n    }\n}\n\n#[async_trait]\nimpl Actor for RetentionPolicyExecutor {\n    type ObservableState = 
RetentionPolicyExecutorCounters;\n\n    fn observable_state(&self) -> Self::ObservableState {\n        self.counters.clone()\n    }\n\n    fn name(&self) -> String {\n        \"RetentionPolicyExecutor\".to_string()\n    }\n\n    async fn initialize(\n        &mut self,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        self.handle(Loop, ctx).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Loop> for RetentionPolicyExecutor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        _: Loop,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        self.handle_refresh_loop(ctx).await;\n        ctx.schedule_self_msg(RUN_INTERVAL, Loop);\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Handler<Execute> for RetentionPolicyExecutor {\n    type Reply = ();\n\n    async fn handle(\n        &mut self,\n        message: Execute,\n        ctx: &ActorContext<Self>,\n    ) -> Result<(), quickwit_actors::ActorExitStatus> {\n        info!(index_id=%message.index_uid.index_id, \"retention-policy-execute-operation\");\n        self.counters.num_execution_passes += 1;\n\n        let index_config = match self.index_configs.get(&message.index_uid.index_id) {\n            Some(config) => config,\n            None => {\n                debug!(index_id=%message.index_uid.index_id, \"the index might have been deleted\");\n                return Ok(());\n            }\n        };\n\n        let retention_policy = index_config\n            .retention_policy_opt\n            .as_ref()\n            .expect(\"Expected index to have retention policy configured.\");\n\n        let execution_result = run_execute_retention_policy(\n            message.index_uid.clone(),\n            self.metastore.clone(),\n            retention_policy,\n            ctx,\n        )\n        .await;\n        match execution_result {\n            Ok(splits) => 
self.counters.num_expired_splits += splits.len(),\n            Err(error) => {\n                error!(index_id=%message.index_uid.index_id, error=?error, \"Failed to execute the retention policy on the index.\")\n            }\n        }\n\n        if let Ok(next_interval) = retention_policy.duration_until_next_evaluation() {\n            info!(index_id=?index_config.index_id, scheduled_in=?next_interval, \"retention-policy-schedule-operation\");\n            ctx.schedule_self_msg(next_interval, message);\n        } else {\n            // Since we have failed to schedule next execution for this index,\n            // we remove it from the cache for it to be retried next time it gets\n            // added back by the RetentionPolicyExecutor cache refresh loop.\n            self.index_configs.remove(&message.index_uid.index_id);\n            error!(index_id=%message.index_uid.index_id, \"couldn't extract the index next schedule interval\");\n        }\n        Ok(())\n    }\n}\n\n/// Extract the list of deleted indexes.\nfn compute_deleted_indexes<'a>(\n    cached_indexes: impl Iterator<Item = &'a str>,\n    indexes: impl Iterator<Item = &'a str>,\n) -> HashSet<String> {\n    let cached_set: HashSet<_> = cached_indexes.collect();\n    let indexes_set: HashSet<_> = indexes.collect();\n    (&cached_set - &indexes_set)\n        .into_iter()\n        .map(ToString::to_string)\n        .collect()\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::RangeInclusive;\n\n    use mockall::Sequence;\n    use quickwit_actors::Universe;\n    use quickwit_common::ServiceStream;\n    use quickwit_config::RetentionPolicy;\n    use quickwit_metastore::{\n        IndexMetadata, ListSplitsRequestExt, ListSplitsResponseExt, Split, SplitMetadata,\n        SplitState,\n    };\n    use quickwit_proto::metastore::{\n        EmptyResponse, ListIndexesMetadataResponse, ListSplitsResponse, MockMetastoreService,\n    };\n\n    use super::*;\n\n    #[derive(Debug)]\n    struct 
AssertState(Vec<(&'static str, Option<&'static str>)>);\n\n    #[async_trait]\n    impl Handler<AssertState> for RetentionPolicyExecutor {\n        type Reply = ();\n\n        async fn handle(\n            &mut self,\n            message: AssertState,\n            _ctx: &ActorContext<Self>,\n        ) -> Result<Self::Reply, quickwit_actors::ActorExitStatus> {\n            let indexes_set: HashSet<_> = self\n                .index_configs\n                .values()\n                .map(|im| (&im.index_id, &im.retention_policy_opt))\n                .collect();\n\n            let expected_indexes: Vec<IndexConfig> = make_indexes(&message.0)\n                .into_iter()\n                .map(IndexMetadata::into_index_config)\n                .collect();\n            let expected_indexes_set: HashSet<_> = expected_indexes\n                .iter()\n                .map(|im| (&im.index_id, &im.retention_policy_opt))\n                .collect();\n            assert_eq!(\n                indexes_set, expected_indexes_set,\n                \"Mismatch set of indexes.\"\n            );\n            Ok(())\n        }\n    }\n\n    const EVALUATION_SCHEDULE: &str = \"hourly\";\n\n    fn make_index(index_id: &str, retention_period_opt: Option<&str>) -> IndexConfig {\n        let mut index = IndexConfig::for_test(index_id, &format!(\"ram://indexes/{index_id}\"));\n        if let Some(retention_period) = retention_period_opt {\n            index.retention_policy_opt = Some(RetentionPolicy {\n                retention_period: retention_period.to_string(),\n                evaluation_schedule: EVALUATION_SCHEDULE.to_string(),\n            })\n        }\n        index\n    }\n\n    fn make_indexes(index_ids: &[(&str, Option<&str>)]) -> Vec<IndexMetadata> {\n        index_ids\n            .iter()\n            .map(|(index_id, retention_period_opt)| make_index(index_id, *retention_period_opt))\n            .map(IndexMetadata::new)\n            .collect()\n    }\n\n    fn 
make_split(split_id: &str, time_range: Option<RangeInclusive<i64>>) -> Split {\n        Split {\n            split_metadata: SplitMetadata {\n                split_id: split_id.to_string(),\n                footer_offsets: 5..20,\n                time_range,\n                ..Default::default()\n            },\n            split_state: SplitState::Published,\n            update_timestamp: 0,\n            publish_timestamp: Some(100),\n        }\n    }\n\n    // Uses the retention policy scheduler to calculate\n    // how much time to advance for the execution to take place.\n    fn shift_time_by() -> Duration {\n        let scheduler = RetentionPolicy {\n            retention_period: \"\".to_string(),\n            evaluation_schedule: EVALUATION_SCHEDULE.to_string(),\n        };\n\n        scheduler.duration_until_next_evaluation().unwrap() + Duration::from_secs(1)\n    }\n\n    #[tokio::test]\n    async fn test_retention_executor_refresh() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n\n        let mut sequence = Sequence::new();\n        mock_metastore\n            .expect_list_splits()\n            .times(..)\n            .returning(|_| Ok(ServiceStream::empty()));\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            .in_sequence(&mut sequence)\n            .returning(|_list_indexes_request| {\n                let indexes_metadata = make_indexes(&[\n                    (\"index-1\", Some(\"1 hour\")),\n                    (\"index-2\", Some(\"1 hour\")),\n                    (\"index-3\", None),\n                ]);\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            .in_sequence(&mut sequence)\n            .returning(|_list_indexes_request| {\n                let indexes_metadata = make_indexes(&[\n                    
(\"index-1\", Some(\"1 hour\")),\n                    (\"index-2\", Some(\"2 hour\")),\n                    (\"index-3\", Some(\"1 hour\")),\n                ]);\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(1)\n            .in_sequence(&mut sequence)\n            .returning(|_list_indexes_request| {\n                let indexes_metadata = make_indexes(&[\n                    (\"index-2\", Some(\"1 hour\")),\n                    (\"index-4\", Some(\"1 hour\")),\n                    (\"index-5\", None),\n                ]);\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n\n        let retention_policy_executor =\n            RetentionPolicyExecutor::new(MetastoreServiceClient::from_mock(mock_metastore));\n        let universe = Universe::with_accelerated_time();\n        let (mailbox, handle) = universe.spawn_builder().spawn(retention_policy_executor);\n\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_refresh_passes, 1);\n        mailbox\n            .ask(AssertState(vec![\n                (\"index-1\", Some(\"1 hour\")),\n                (\"index-2\", Some(\"1 hour\")),\n            ]))\n            .await?;\n\n        universe.sleep(RUN_INTERVAL + Duration::from_secs(5)).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_refresh_passes, 2);\n        mailbox\n            .ask(AssertState(vec![\n                (\"index-1\", Some(\"1 hour\")),\n                (\"index-2\", Some(\"2 hour\")),\n                (\"index-3\", Some(\"1 hour\")),\n            ]))\n            .await?;\n\n        universe.sleep(RUN_INTERVAL + Duration::from_secs(5)).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        
assert_eq!(counters.num_refresh_passes, 3);\n        mailbox\n            .ask(AssertState(vec![\n                (\"index-2\", Some(\"1 hour\")),\n                (\"index-4\", Some(\"1 hour\")),\n            ]))\n            .await?;\n        universe.assert_quit().await;\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_retention_policy_execution_calls_dependencies() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .times(..)\n            .returning(|_list_indexes_request| {\n                let indexes_metadata = make_indexes(&[\n                    (\"index-1\", Some(\"2 hour\")),\n                    (\"index-2\", Some(\"1 hour\")),\n                    (\"index-3\", None),\n                ]);\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n\n        mock_metastore\n            .expect_list_splits()\n            .times(2..=4)\n            .returning(|list_splits_request| {\n                let query = list_splits_request.deserialize_list_splits_query().unwrap();\n                assert_eq!(query.split_states, &[SplitState::Published]);\n                let splits = match query.index_uids.unwrap()[0].index_id.as_ref() {\n                    \"index-1\" => {\n                        vec![\n                            make_split(\"split-1\", Some(1000..=5000)),\n                            make_split(\"split-2\", Some(2000..=6000)),\n                            make_split(\"split-3\", None),\n                        ]\n                    }\n                    \"index-2\" => Vec::new(),\n                    unknown => panic!(\"Unknown index: `{unknown}`.\"),\n                };\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n\n        mock_metastore\n  
          .expect_mark_splits_for_deletion()\n            .times(1..=3)\n            .returning(|mark_splits_for_deletion_request| {\n                let index_uid: IndexUid = mark_splits_for_deletion_request.index_uid().clone();\n                assert_eq!(index_uid.index_id, \"index-1\");\n                assert_eq!(\n                    mark_splits_for_deletion_request.split_ids,\n                    [\"split-1\", \"split-2\"]\n                );\n                Ok(EmptyResponse {})\n            });\n\n        let retention_policy_executor =\n            RetentionPolicyExecutor::new(MetastoreServiceClient::from_mock(mock_metastore));\n        let universe = Universe::with_accelerated_time();\n        let (_mailbox, handle) = universe.spawn_builder().spawn(retention_policy_executor);\n\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_execution_passes, 0);\n        assert_eq!(counters.num_expired_splits, 0);\n\n        universe.sleep(shift_time_by()).await;\n        let counters = handle.process_pending_and_observe().await.state;\n        assert_eq!(counters.num_execution_passes, 2);\n        assert_eq!(counters.num_expired_splits, 2);\n        universe.assert_quit().await;\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::rate_limited_error;\nuse quickwit_proto::metastore::MetastoreError;\nuse quickwit_proto::{ServiceError, ServiceErrorCode};\nuse serde::{Deserialize, Serialize};\nuse thiserror::Error;\n\n/// Janitor errors.\n#[allow(missing_docs)]\n#[derive(Error, Debug, Serialize, Deserialize)]\npub enum JanitorError {\n    #[error(\"internal error: `{0}`\")]\n    Internal(String),\n    #[error(\"invalid delete query: `{0}`\")]\n    InvalidDeleteQuery(String),\n    #[error(\"metastore error: `{0}`\")]\n    Metastore(#[from] MetastoreError),\n}\n\nimpl ServiceError for JanitorError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(err_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"janitor internal error {err_msg}\");\n                ServiceErrorCode::Internal\n            }\n            Self::InvalidDeleteQuery(_) => ServiceErrorCode::BadRequest,\n            Self::Metastore(metastore_error) => metastore_error.error_code(),\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/janitor_service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\nuse quickwit_actors::{\n    Actor, ActorContext, ActorExitStatus, ActorHandle, ActorState, Handler, Healthz,\n};\nuse serde_json::{Value as JsonValue, json};\n\nuse crate::actors::{DeleteTaskService, GarbageCollector, RetentionPolicyExecutor};\n\npub struct JanitorService {\n    delete_task_service_handle: Option<ActorHandle<DeleteTaskService>>,\n    garbage_collector_handle: ActorHandle<GarbageCollector>,\n    retention_policy_executor_handle: ActorHandle<RetentionPolicyExecutor>,\n}\n\nimpl JanitorService {\n    pub fn new(\n        delete_task_service_handle: Option<ActorHandle<DeleteTaskService>>,\n        garbage_collector_handle: ActorHandle<GarbageCollector>,\n        retention_policy_executor_handle: ActorHandle<RetentionPolicyExecutor>,\n    ) -> Self {\n        Self {\n            delete_task_service_handle,\n            garbage_collector_handle,\n            retention_policy_executor_handle,\n        }\n    }\n\n    fn is_healthy(&self) -> bool {\n        let delete_task_is_not_failure: bool =\n            if let Some(delete_task_service_handle) = &self.delete_task_service_handle {\n                delete_task_service_handle.state() != ActorState::Failure\n            } else {\n                true\n            };\n        delete_task_is_not_failure\n            && self.garbage_collector_handle.state() 
!= ActorState::Failure\n            && self.retention_policy_executor_handle.state() != ActorState::Failure\n    }\n}\n\n#[async_trait]\nimpl Actor for JanitorService {\n    type ObservableState = JsonValue;\n\n    fn name(&self) -> String {\n        \"JanitorService\".to_string()\n    }\n\n    fn observable_state(&self) -> Self::ObservableState {\n        json!({})\n    }\n}\n\n#[async_trait]\nimpl Handler<Healthz> for JanitorService {\n    type Reply = bool;\n\n    async fn handle(\n        &mut self,\n        _message: Healthz,\n        _ctx: &ActorContext<Self>,\n    ) -> Result<Self::Reply, ActorExitStatus> {\n        Ok(self.is_healthy())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nuse quickwit_actors::{Mailbox, Universe};\nuse quickwit_common::pubsub::EventBroker;\nuse quickwit_config::NodeConfig;\nuse quickwit_indexing::actors::MergeSchedulerService;\nuse quickwit_metastore::SplitInfo;\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse quickwit_search::SearchJobPlacer;\nuse quickwit_storage::StorageResolver;\nuse tracing::info;\n\npub mod actors;\npub mod error;\nmod janitor_service;\nmod metrics;\nmod retention_policy_execution;\n\npub use janitor_service::JanitorService;\n\nuse crate::actors::{DeleteTaskService, GarbageCollector, RetentionPolicyExecutor};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(components(schemas(SplitInfo)))]\n/// Schemas used for the OpenAPI generation that are part of this crate.\npub struct JanitorApiSchemas;\n\npub async fn start_janitor_service(\n    universe: &Universe,\n    config: &NodeConfig,\n    metastore: MetastoreServiceClient,\n    search_job_placer: SearchJobPlacer,\n    storage_resolver: StorageResolver,\n    event_broker: EventBroker,\n    run_delete_task_service: bool,\n) -> anyhow::Result<Mailbox<JanitorService>> {\n    info!(\"starting janitor service\");\n    let garbage_collector = GarbageCollector::new(metastore.clone(), storage_resolver.clone());\n    let (_, garbage_collector_handle) = 
universe.spawn_builder().spawn(garbage_collector);\n\n    let retention_policy_executor = RetentionPolicyExecutor::new(metastore.clone());\n    let (_, retention_policy_executor_handle) =\n        universe.spawn_builder().spawn(retention_policy_executor);\n    let delete_task_service_handle = if run_delete_task_service {\n        let delete_task_service = DeleteTaskService::new(\n            metastore,\n            search_job_placer,\n            storage_resolver,\n            config.data_dir_path.clone(),\n            config.indexer_config.max_concurrent_split_uploads,\n            universe.get_or_spawn_one::<MergeSchedulerService>(),\n            event_broker,\n        )\n        .await?;\n        let (_, delete_task_service_handle) = universe.spawn_builder().spawn(delete_task_service);\n        Some(delete_task_service_handle)\n    } else {\n        tracing::warn!(\"delete task service is disabled: delete queries will not be processed\");\n        None\n    };\n\n    let janitor_service = JanitorService::new(\n        delete_task_service_handle,\n        garbage_collector_handle,\n        retention_policy_executor_handle,\n    );\n    let (janitor_service_mailbox, _janitor_service_handle) =\n        universe.spawn_builder().spawn(janitor_service);\n    Ok(janitor_service_mailbox)\n}\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    IntCounter, IntCounterVec, IntGaugeVec, new_counter, new_counter_vec, new_gauge_vec,\n};\n\npub struct JanitorMetrics {\n    pub ongoing_num_delete_operations_total: IntGaugeVec<1>,\n    pub gc_deleted_splits: IntCounterVec<1>,\n    pub gc_deleted_bytes: IntCounter,\n    pub gc_runs: IntCounterVec<1>,\n    pub gc_seconds_total: IntCounter,\n    // TODO having a current run duration which is 0|undefined out of run, and returns `now -\n    // start_time` during a run would be nice\n}\n\nimpl Default for JanitorMetrics {\n    fn default() -> Self {\n        JanitorMetrics {\n            ongoing_num_delete_operations_total: new_gauge_vec(\n                \"ongoing_num_delete_operations_total\",\n                \"Num of ongoing delete operations (per index).\",\n                \"quickwit_janitor\",\n                &[],\n                [\"index\"],\n            ),\n            gc_deleted_splits: new_counter_vec(\n                \"gc_deleted_splits_total\",\n                \"Total number of splits deleted by the garbage collector.\",\n                \"quickwit_janitor\",\n                &[],\n                [\"result\"],\n            ),\n            gc_deleted_bytes: new_counter(\n                \"gc_deleted_bytes_total\",\n                \"Total number of bytes deleted by 
the garbage collector.\",\n                \"quickwit_janitor\",\n                &[],\n            ),\n            gc_runs: new_counter_vec(\n                \"gc_runs_total\",\n                \"Total number of garbage collector executions.\",\n                \"quickwit_janitor\",\n                &[],\n                [\"result\"],\n            ),\n            gc_seconds_total: new_counter(\n                \"gc_seconds_total\",\n                \"Total time spent running the garbage collector\",\n                \"quickwit_janitor\",\n                &[],\n            ),\n        }\n    }\n}\n\n/// `JANITOR_METRICS` exposes a bunch of related metrics through a Prometheus\n/// endpoint.\npub static JANITOR_METRICS: Lazy<JanitorMetrics> = Lazy::new(JanitorMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-janitor/src/retention_policy_execution.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_actors::ActorContext;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_config::RetentionPolicy;\nuse quickwit_metastore::{\n    ListSplitsQuery, ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitMetadata,\n    SplitState,\n};\nuse quickwit_proto::metastore::{\n    ListSplitsRequest, MarkSplitsForDeletionRequest, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::{IndexUid, SplitId};\nuse time::OffsetDateTime;\nuse tracing::{info, warn};\n\nuse crate::actors::RetentionPolicyExecutor;\n\n/// Detects all expired splits based on a retention policy and\n/// only marks them as `MarkedForDeletion`. 
Actual split deletion\n/// is taken care of by the garbage collector.\n///\n/// * `index_uid` - The UID of the target index.\n/// * `metastore` - The metastore managing the target index.\n/// * `retention_policy` - The retention policy used to evaluate the splits.\n/// * `ctx` - The actor context used for reporting progress.\npub async fn run_execute_retention_policy(\n    index_uid: IndexUid,\n    metastore: MetastoreServiceClient,\n    retention_policy: &RetentionPolicy,\n    ctx: &ActorContext<RetentionPolicyExecutor>,\n) -> anyhow::Result<Vec<SplitMetadata>> {\n    // Select splits that are published and older than the retention period.\n    let retention_period = retention_policy.retention_period()?;\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    let max_retention_timestamp = current_timestamp - retention_period.as_secs() as i64;\n    let query = ListSplitsQuery::for_index(index_uid.clone())\n        .with_split_state(SplitState::Published)\n        .with_max_time_range_end(max_retention_timestamp);\n\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n    let (expired_splits, ignored_splits): (Vec<SplitMetadata>, Vec<SplitMetadata>) = ctx\n        .protect_future(metastore.list_splits(list_splits_request))\n        .await?\n        .collect_splits_metadata()\n        .await?\n        .into_iter()\n        .partition(|split_metadata| split_metadata.time_range.is_some());\n\n    if !ignored_splits.is_empty() {\n        let ignored_split_ids: Vec<String> = ignored_splits\n            .into_iter()\n            .map(|split_metadata| split_metadata.split_id)\n            .collect();\n        warn!(\n            index_id=%index_uid.index_id,\n            split_ids=?PrettySample::new(&ignored_split_ids, 5),\n            \"Retention policy could not be applied to {} splits because they lack a timestamp range.\",\n            ignored_split_ids.len()\n        );\n    }\n    if 
expired_splits.is_empty() {\n        return Ok(expired_splits);\n    }\n    // Mark the expired splits for deletion.\n    let expired_split_ids: Vec<SplitId> = expired_splits\n        .iter()\n        .map(|split_metadata| split_metadata.split_id.to_string())\n        .collect();\n    info!(\n        index_id=%index_uid.index_id,\n        split_ids=?PrettySample::new(&expired_split_ids, 5),\n        \"Marking {} splits for deletion based on retention policy.\",\n        expired_split_ids.len()\n    );\n    let mark_splits_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid, expired_split_ids);\n    ctx.protect_future(metastore.mark_splits_for_deletion(mark_splits_for_deletion_request))\n        .await?;\n    Ok(expired_splits)\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/Cargo.toml",
    "content": "[package]\nname = \"quickwit-lambda-client\"\ndescription = \"AWS Lambda client for Quickwit leaf search invocation and deployment\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\naws-config = { workspace = true }\naws-sdk-lambda = { workspace = true }\nbase64 = { workspace = true }\nmd5 = { workspace = true }\nprost = { workspace = true }\nserde_json = { workspace = true }\nonce_cell = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-lambda-server = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-search = { workspace = true }\n\n[dev-dependencies]\naws-smithy-mocks = { workspace = true }\naws-sdk-lambda = { workspace = true, features = [\"test-util\"] }\nbytesize = { workspace = true }\ntokio = { workspace = true, features = [\"test-util\", \"macros\"] }\n\n# Required for complicated reasons. quickwit-storage checks that we\n# do use preserve order with serde. aws forces that feature. We disable\n# the check by switching on its testsuite feature.\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[build-dependencies]\nsha2 = { workspace = true }\nureq = { workspace = true }\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/README.md",
    "content": "# Quickwit Lambda\n\nQuickwit supports offloading leaf search to AWS Lambda for horizontal scaling.\nThe Lambda function is built separately and embedded into Quickwit's binary,\nallowing Quickwit to auto-deploy the function at startup.\n\n## Architecture\n\n- **quickwit-lambda-server**: The Lambda function binary that executes leaf searches\n- **quickwit-lambda-client**: The client that invokes Lambda and embeds the Lambda zip for auto-deployment\n\n## Release Process\n\n### 1. Tag the release\n\nPush a tag with the `lambda-` prefix:\n\n```bash\ngit tag lambda-v0.8.0\ngit push origin lambda-v0.8.0\n```\n\nThis triggers the `publish_lambda.yaml` GitHub Action which:\n- Cross-compiles the Lambda binary for ARM64\n- Creates a zip file named `quickwit-aws-lambda-v0.8.0-aarch64.zip`\n- Uploads it as a **draft** GitHub release\n\n### 2. Publish the release\n\nGo to GitHub releases and manually publish the draft release to make the\nartifact URL publicly accessible.\n\n### 3. Update the embedded Lambda URL\n\nUpdate `LAMBDA_ZIP_URL` in `quickwit-lambda-client/build.rs` to point to the\nnew release:\n\n```rust\nconst LAMBDA_ZIP_URL: &str = \"https://github.com/quickwit-oss/quickwit/releases/download/lambda-v0.8.0/quickwit-aws-lambda-v0.8.0-aarch64.zip\";\n```\n\nAlso update `LAMBDA_ZIP_SHA256` in the same file to the SHA256 hash of the new zip.\nThe build script panics if the downloaded artifact does not match this hash.\n\n### 4. Versioning\n\nThe Lambda client uses content-based versioning:\n- The SHA256 hash of the Lambda zip is pinned in `build.rs` and verified at build time\n- The first 8 hex characters of this hash are embedded in the description of the published Lambda version, which starts with `quickwit_{version}_{hash_short}` (dots in the version are replaced by underscores)\n- When Quickwit starts, it checks if a matching version exists before deploying\n\nThis ensures that:\n- Different Quickwit builds with the same Lambda binary share the same Lambda version\n- Updating the Lambda binary automatically triggers a new deployment\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Build script for quickwit-lambda-client.\n//!\n//! This script downloads the pre-built Lambda zip from a GitHub release\n//! and places it in OUT_DIR for embedding via include_bytes!\n//!\n//! The Lambda binary is built separately in CI and published as a GitHub release.\n\nuse std::env;\nuse std::path::{Path, PathBuf};\n\nuse sha2::{Digest, Sha256};\n\n/// URL to download the pre-built Lambda zip from GitHub releases.\n/// This should be updated when a new Lambda binary is released.\nconst LAMBDA_ZIP_URL: &str = \"https://github.com/quickwit-oss/quickwit/releases/download/lambda-ff6fdfa5/quickwit-aws-lambda--aarch64.zip\";\n\n/// Expected SHA256 hash of the Lambda zip artifact.\n/// Must be updated alongside LAMBDA_ZIP_URL when a new Lambda binary is released.\nconst LAMBDA_ZIP_SHA256: &str = \"fa940f44178e28460c21e44bb2610b776542b9b97db66a53bc65b10cad653b90\";\n\n/// AWS Lambda direct upload limit is 50MB.\n/// Larger artifacts must be uploaded via S3.\nconst MAX_LAMBDA_ZIP_SIZE: usize = 50 * 1024 * 1024;\n\nfn main() {\n    println!(\"cargo:rerun-if-changed=build.rs\");\n\n    let out_dir = PathBuf::from(env::var(\"OUT_DIR\").expect(\"OUT_DIR not set\"));\n    let zip_path = out_dir.join(\"lambda_bootstrap.zip\");\n\n    fetch_lambda_zip(&zip_path);\n\n    // Export first 8 hex chars of the SHA256 as environment variable.\n    // This is 
used to create a unique qualifier for Lambda versioning.\n    let hash_short = &LAMBDA_ZIP_SHA256[..8];\n    println!(\"cargo:rustc-env=LAMBDA_BINARY_HASH={}\", hash_short);\n    println!(\"lambda binary hash (short): {}\", hash_short);\n}\n\n/// Fetch the Lambda zip and save it to `local_cache_path`.\n///\n/// If a cached file already exists with the correct SHA256, this is a no-op.\n/// If the hash doesn't match (stale artifact), the file is deleted and re-downloaded.\n/// If no cached file exists, the artifact is downloaded fresh.\n///\n/// This function panics if a problem happens.\nfn fetch_lambda_zip(local_cache_path: &Path) {\n    // Try the cache first.\n    if let Ok(data) = std::fs::read(local_cache_path) {\n        let actual_hash = sha256_hex(&data);\n        if actual_hash == LAMBDA_ZIP_SHA256 {\n            println!(\"using cached Lambda zip from {:?}\", local_cache_path);\n            return;\n        }\n        println!(\"cargo:warning=cached Lambda zip has wrong SHA256, re-downloading\");\n        std::fs::remove_file(local_cache_path).expect(\"failed to delete stale cached zip\");\n    }\n\n    // Download from the remote URL.\n    println!(\n        \"cargo:warning=downloading Lambda zip from: {}\",\n        LAMBDA_ZIP_URL\n    );\n    let data = download_lambda_zip(LAMBDA_ZIP_URL).expect(\"failed to download Lambda zip\");\n\n    // Verify SHA256 BEFORE writing to disk.\n    let actual_hash = sha256_hex(&data);\n    if actual_hash != LAMBDA_ZIP_SHA256 {\n        panic!(\n            \"SHA256 mismatch for Lambda zip!\\n  expected: {LAMBDA_ZIP_SHA256}\\n  actual:   \\\n             {actual_hash}\\nThe artifact at {LAMBDA_ZIP_URL} may have been tampered with.\"\n        );\n    }\n\n    std::fs::write(local_cache_path, &data).expect(\"failed to write zip file\");\n    println!(\n        \"cargo:warning=downloaded Lambda zip to {:?} ({} bytes)\",\n        local_cache_path,\n        data.len()\n    );\n}\n\nfn sha256_hex(data: &[u8]) -> String {\n    
format!(\"{:x}\", Sha256::digest(data))\n}\n\nfn download_lambda_zip(url: &str) -> Result<Vec<u8>, String> {\n    let response = ureq::get(url)\n        .call()\n        .map_err(|err| format!(\"HTTP request failed: {err}\"))?;\n    // Set limit higher than MAX_LAMBDA_ZIP_SIZE so we can detect oversized artifacts.\n    let data = response\n        .into_body()\n        .with_config()\n        .limit(MAX_LAMBDA_ZIP_SIZE as u64 + 1)\n        .read_to_vec()\n        .map_err(|err| format!(\"failed to read response body: {err}\"))?;\n    if data.len() > MAX_LAMBDA_ZIP_SIZE {\n        return Err(format!(\n            \"Lambda zip is too large ({} bytes, max {} bytes).\\nAWS Lambda does not support \\\n             direct upload of binaries larger than 50MB.\\nWorkaround: upload the Lambda zip to S3 \\\n             and deploy from there instead.\",\n            data.len(),\n            MAX_LAMBDA_ZIP_SIZE\n        ));\n    }\n    Ok(data)\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/src/deploy.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Lambda function deployment for auto-deploy feature.\n//!\n//! This module provides functionality to automatically deploy or update\n//! the Lambda function used for leaf search operations.\n//!\n//! # Versioning Strategy\n//!\n//! We use AWS Lambda published versions with description-based identification:\n//! - Each published version has a description like `quickwit:0_8_0-fa752891`\n//! - We list versions to find one matching our qualifier\n//! - We invoke the specific version number (not $LATEST)\n//! 
- Old versions are garbage collected (keep current + top 5 most recent)\n\nuse std::collections::HashMap;\nuse std::sync::OnceLock;\n\nuse anyhow::{Context, anyhow};\nuse aws_sdk_lambda::Client as LambdaClient;\nuse aws_sdk_lambda::error::SdkError;\nuse aws_sdk_lambda::primitives::Blob;\nuse aws_sdk_lambda::types::{\n    Architecture, Environment, FunctionCode, LastUpdateStatus, Runtime, State,\n};\nuse quickwit_config::{LambdaConfig, LambdaDeployConfig};\nuse quickwit_search::LambdaLeafSearchInvoker;\nuse tracing::{debug, info};\n\nuse crate::invoker::create_lambda_invoker_for_version;\n\n/// Embedded Lambda binary (arm64, compressed).\n/// This is included at compile time.\nconst LAMBDA_BINARY: &[u8] = include_bytes!(concat!(env!(\"OUT_DIR\"), \"/lambda_bootstrap.zip\"));\n\n/// Prefix for version descriptions to identify Quickwit-managed versions.\nconst VERSION_DESCRIPTION_PREFIX: &str = \"quickwit\";\n\n/// Number of recent versions to keep during garbage collection (in addition to current).\nconst GC_KEEP_RECENT_VERSIONS: usize = 5;\n\n/// Returns the Lambda qualifier combining version and binary hash.\n/// Format: \"{quickwit_version}_{hash_short}\" with dots replaced by underscores.\n/// Example: \"0_8_0_fa752891\"\nfn lambda_qualifier() -> &'static str {\n    static LAMBDA_QUALIFIER: OnceLock<String> = OnceLock::new();\n    LAMBDA_QUALIFIER\n        .get_or_init(|| {\n            format!(\n                \"{}_{}\",\n                env!(\"CARGO_PKG_VERSION\").replace('.', \"_\"),\n                env!(\"LAMBDA_BINARY_HASH\")\n            )\n        })\n        .as_str()\n}\n\n/// Returns the version description for our qualifier.\n///\n/// We also pass the deploy config, as we want the function to be redeployed\n/// if the deploy config is changed.\nfn version_description(deploy_config_opt: Option<&LambdaDeployConfig>) -> String {\n    if let Some(deploy_config) = deploy_config_opt {\n        let memory_size_mib = deploy_config.memory_size.as_mib() as 
u64;\n        let execution_role_arn_digest: String = format!(\n            \"{:x}\",\n            md5::compute(deploy_config.execution_role_arn.as_bytes())\n        );\n        format!(\n            \"{}_{}_{}_{}s_{}\",\n            VERSION_DESCRIPTION_PREFIX,\n            lambda_qualifier(),\n            memory_size_mib,\n            deploy_config.invocation_timeout_secs,\n            &execution_role_arn_digest[..5]\n        )\n    } else {\n        format!(\n            \"{}_{}_nodeploy\",\n            VERSION_DESCRIPTION_PREFIX,\n            lambda_qualifier()\n        )\n    }\n}\n\n/// Get or deploy the Lambda function and return an invoker.\n///\n/// This function:\n/// 1. Lists existing Lambda versions to find one matching our description\n/// 2. If not found, (and if a deploy config is provided) attempt to deploy the embedded Lambda\n///    binary\n/// 3. Garbage collects old versions (keeps current + 5 most recent)\n/// 4. Returns an invoker configured to call the specific version\n///\n/// The qualifier is computed from the Quickwit version and Lambda binary hash,\n/// ensuring the deployed Lambda matches the embedded binary.\npub async fn try_get_or_deploy_invoker(\n    lambda_config: &LambdaConfig,\n) -> anyhow::Result<impl LambdaLeafSearchInvoker> {\n    let aws_config = aws_config::load_defaults(aws_config::BehaviorVersion::latest()).await;\n    let client = LambdaClient::new(&aws_config);\n    let function_name = &lambda_config.function_name;\n    let target_description = version_description(lambda_config.auto_deploy.as_ref());\n\n    info!(\n        function_name = %function_name,\n        qualifier = %lambda_qualifier(),\n        \"looking for Lambda function version\"\n    );\n\n    let version = find_or_deploy_version(\n        &client,\n        function_name,\n        &target_description,\n        lambda_config.auto_deploy.as_ref(),\n    )\n    .await?;\n\n    // Spawn background garbage collection (best effort, non-blocking)\n    let gc_client 
= client.clone();\n    let gc_function_name = function_name.clone();\n    let gc_version = version.clone();\n    tokio::spawn(async move {\n        if let Err(e) =\n            garbage_collect_old_versions(&gc_client, &gc_function_name, &gc_version).await\n        {\n            info!(error = %e, \"failed to garbage collect old Lambda versions\");\n        }\n    });\n\n    // Create and return the invoker\n    let invoker = create_lambda_invoker_for_version(function_name.clone(), version)\n        .await\n        .context(\"failed to create Lambda invoker\")?;\n\n    info!(\"created the lambda invoker\");\n\n    Ok(invoker)\n}\n\n/// Find a Lambda version with a description matching our qualifier.\n///\n/// If none is found and a deploy config is provided, attempt to deploy a new version.\n///\n/// Returns the version number as a string (because it is a string on AWS side, e.g.: \"7\") if found.\nasync fn find_or_deploy_version(\n    client: &LambdaClient,\n    function_name: &str,\n    target_description: &str,\n    deploy_config: Option<&LambdaDeployConfig>,\n) -> anyhow::Result<String> {\n    if let Some(version) = find_matching_version(client, function_name, target_description).await? {\n        info!(\n            function_name = %function_name,\n            version = %version,\n            \"found existing Lambda version\"\n        );\n        return Ok(version);\n    }\n\n    let deploy_config = deploy_config.with_context(|| {\n        format!(\n            \"no Lambda version found with description '{}' and auto_deploy is not configured. 
\\\n             Either deploy the Lambda function manually or enable auto_deploy.\",\n            target_description\n        )\n    })?;\n\n    info!(\n        function_name = %function_name,\n        \"no matching version found, deploying Lambda function\"\n    );\n\n    deploy_lambda_function(client, function_name, deploy_config).await\n}\n\nasync fn find_matching_version(\n    client: &LambdaClient,\n    function_name: &str,\n    target_description: &str,\n) -> anyhow::Result<Option<String>> {\n    let mut marker: Option<String> = None;\n\n    loop {\n        let mut request = client\n            .list_versions_by_function()\n            .function_name(function_name);\n\n        if let Some(m) = marker {\n            request = request.marker(m);\n        }\n\n        let response = match request.send().await {\n            Ok(resp) => resp,\n            Err(SdkError::ServiceError(err)) if err.err().is_resource_not_found_exception() => {\n                info!(\n                    function_name = %function_name,\n                    \"lambda function does not exist yet\"\n                );\n                return Ok(None);\n            }\n            Err(e) => {\n                return Err(anyhow!(\n                    \"failed to list Lambda versions for '{}': {}\",\n                    function_name,\n                    e\n                ));\n            }\n        };\n\n        for version in response.versions() {\n            if let Some(description) = version.description()\n                && description == target_description\n                && let Some(ver) = version.version()\n                && ver != \"$LATEST\"\n            {\n                return Ok(Some(ver.to_string()));\n            }\n        }\n\n        marker = response.next_marker().map(|s| s.to_string());\n        if marker.is_none() {\n            break;\n        }\n    }\n\n    Ok(None)\n}\n\n/// Deploy the Lambda function and publish a new version.\n/// AWS's API is pretty 
terrible.\n///\n/// Lambda versions are integers generated by AWS (we don't have control over them).\n/// To publish a new version, we need to implement two paths:\n/// - If the function doesn't exist yet, `create_function(publish=true)` atomically creates it and\n///   publishes a version in one call.\n/// - If the function already exists, we first update the code. We do not publish because strangely\n///   the API call does not make it possible to change the description. Updating the code has the\n///   effect of creating or updating the $LATEST version.\n/// - We publish the version $LATEST. That's the moment AWS assigns a version number. That call\n///   allows us to change the description. We pass the sha256 hash of the code to ensure that\n///   $LATEST has not been overwritten by another concurrent update.\nasync fn deploy_lambda_function(\n    client: &LambdaClient,\n    function_name: &str,\n    deploy_config: &LambdaDeployConfig,\n) -> anyhow::Result<String> {\n    // This looks overly complicated but this is not AI slop.\n    // The AWS API forces us to jump through a bunch of hoops to update our function\n    // in a safe manner.\n    let description = version_description(Some(deploy_config));\n\n    // Fast path: if the function does not exist, we can create and publish the function atomically.\n    if let Some(version) =\n        try_create_function(client, function_name, deploy_config, &description).await?\n    {\n        return Ok(version);\n    }\n\n    // Function already exists — we need to update the code.\n    // This will create or update a version called \"$LATEST\" (that's the actual string)\n    //\n    // We cannot directly publish here, because updating the function code does not allow\n    // us to pass a different description.\n    let code_sha256 = update_function_code(client, function_name).await?;\n\n    // We can now publish that newly uploaded version.\n    // We pass the code_sha256 guard to make sure a race condition does not cause\n    // us to 
publish a different version.\n    //\n    // Publishing will create an actual version (a number as a string) and return it.\n    publish_version(client, function_name, &code_sha256, &description).await\n}\n\n/// Try to create the Lambda function with `publish=true`.\n///\n/// Returns `Some(version)` if the function was created and published.\n/// Returns `None` if the function already exists (`ResourceConflictException`).\nasync fn try_create_function(\n    client: &LambdaClient,\n    function_name: &str,\n    deploy_config: &LambdaDeployConfig,\n    description: &str,\n) -> anyhow::Result<Option<String>> {\n    let memory_size_mb = deploy_config\n        .memory_size\n        .as_u64()\n        .div_ceil(1024u64 * 1024u64) as i32;\n    let timeout_secs = deploy_config.invocation_timeout_secs as i32;\n\n    info!(\n        function_name = %function_name,\n        memory_size_mb = memory_size_mb,\n        timeout_secs = timeout_secs,\n        \"attempting to create Lambda function\"\n    );\n\n    let function_code = FunctionCode::builder()\n        .zip_file(Blob::new(LAMBDA_BINARY))\n        .build();\n\n    let create_result = client\n        .create_function()\n        .function_name(function_name)\n        .runtime(Runtime::Providedal2023)\n        .role(&deploy_config.execution_role_arn)\n        .handler(\"bootstrap\")\n        .description(description)\n        .code(function_code)\n        .architectures(Architecture::Arm64)\n        .memory_size(memory_size_mb)\n        .timeout(timeout_secs)\n        .environment(build_environment())\n        .set_tags(Some(build_tags()))\n        .publish(true)\n        .send()\n        .await;\n\n    match create_result {\n        Ok(output) => {\n            let version = output\n                .version()\n                .ok_or_else(|| anyhow!(\"created function has no version number\"))?\n                .to_string();\n            info!(\n                function_name = %function_name,\n                version = 
%version,\n                \"lambda function created and published\"\n            );\n            Ok(Some(version))\n        }\n        Err(SdkError::ServiceError(err)) if err.err().is_resource_conflict_exception() => {\n            debug!(\n                function_name = %function_name,\n                \"lambda function already exists\"\n            );\n            Ok(None)\n        }\n        Err(e) => Err(anyhow!(\n            \"failed to create Lambda function '{}': {}\",\n            function_name,\n            e\n        )),\n    }\n}\n\n/// Update `$LATEST` to our embedded binary.\n///\n/// Returns the `code_sha256` of the uploaded code, to be used as a guard\n/// when publishing the version (detects if another process overwrote `$LATEST`\n/// between our update and publish).\nasync fn update_function_code(\n    client: &LambdaClient,\n    function_name: &str,\n) -> anyhow::Result<String> {\n    info!(\n        function_name = %function_name,\n        \"updating Lambda function code to current binary\"\n    );\n\n    let response = client\n        .update_function_code()\n        .function_name(function_name)\n        .zip_file(Blob::new(LAMBDA_BINARY))\n        .architectures(Architecture::Arm64)\n        .send()\n        .await\n        .context(\"failed to update Lambda function code\")?;\n\n    let code_sha256 = response\n        .code_sha256()\n        .ok_or_else(|| anyhow!(\"update_function_code response missing code_sha256\"))?\n        .to_string();\n\n    wait_for_function_ready(client, function_name).await?;\n\n    Ok(code_sha256)\n}\n\n/// Publish a new immutable version from `$LATEST` with our description.\n///\n/// The `code_sha256` parameter guards against races: if another process\n/// overwrote `$LATEST` since our `update_function_code` call, AWS will\n/// reject the publish.\n///\n/// Returns the version number (e.g., \"8\").\nasync fn publish_version(\n    client: &LambdaClient,\n    function_name: &str,\n    code_sha256: &str,\n    
description: &str,\n) -> anyhow::Result<String> {\n    info!(\n        function_name = %function_name,\n        description = %description,\n        \"publishing new Lambda version\"\n    );\n\n    let publish_response = client\n        .publish_version()\n        .function_name(function_name)\n        .description(description)\n        .code_sha256(code_sha256)\n        .send()\n        .await\n        .context(\n            \"failed to publish Lambda version (code_sha256 mismatch means a concurrent deploy \\\n             race)\",\n        )?;\n\n    let version = publish_response\n        .version()\n        .context(\"published version has no version number\")?\n        .to_string();\n\n    info!(\n        function_name = %function_name,\n        version = %version,\n        \"lambda version published successfully\"\n    );\n\n    Ok(version)\n}\n\n/// Wait for the Lambda function to be ready.\n///\n/// \"Ready\" means `State == Active` and no update is in progress\n/// (`LastUpdateStatus` is absent or `Successful`).\n///\n/// This matters because:\n/// - After `create_function`: `State` transitions `Pending → Active`\n/// - After `update_function_code`: `State` stays `Active` but `LastUpdateStatus` transitions\n///   `InProgress → Successful`\nasync fn wait_for_function_ready(client: &LambdaClient, function_name: &str) -> anyhow::Result<()> {\n    const MAX_WAIT_ATTEMPTS: u32 = 30;\n    const WAIT_INTERVAL: tokio::time::Duration = tokio::time::Duration::from_secs(1);\n\n    let mut interval = tokio::time::interval(WAIT_INTERVAL);\n\n    for attempt in 0..MAX_WAIT_ATTEMPTS {\n        interval.tick().await;\n\n        let response = client\n            .get_function()\n            .function_name(function_name)\n            .send()\n            .await\n            .context(\"failed to get function status\")?;\n\n        let Some(config) = response.configuration() else {\n            continue;\n        };\n\n        // Check for terminal failure states.\n        
if config.state() == Some(&State::Failed) {\n            let reason = config.state_reason().unwrap_or(\"unknown reason\");\n            anyhow::bail!(\n                \"lambda function '{}' is in Failed state: {}\",\n                function_name,\n                reason\n            );\n        }\n\n        let last_update_status: &LastUpdateStatus = config\n            .last_update_status()\n            .unwrap_or(&LastUpdateStatus::Successful);\n\n        if last_update_status == &LastUpdateStatus::Failed {\n            let reason = config\n                .last_update_status_reason()\n                .unwrap_or(\"unknown reason\");\n            anyhow::bail!(\n                \"lambda function '{}' last update failed: {}\",\n                function_name,\n                reason\n            );\n        }\n\n        // Ready = Active state with no update in progress.\n        let is_active = config.state() == Some(&State::Active);\n        if is_active && last_update_status == &LastUpdateStatus::Successful {\n            info!(\n                function_name = %function_name,\n                attempts = attempt + 1,\n                \"lambda function is ready\"\n            );\n            return Ok(());\n        }\n\n        info!(\n            function_name = %function_name,\n            state = ?config.state(),\n            last_update_status = ?config.last_update_status(),\n            attempt = attempt + 1,\n            \"waiting for Lambda function to be ready\"\n        );\n    }\n\n    anyhow::bail!(\n        \"lambda function '{}' did not become ready within {} seconds\",\n        function_name,\n        MAX_WAIT_ATTEMPTS as u64 * WAIT_INTERVAL.as_secs()\n    )\n}\n\n/// Garbage collect old Lambda versions, keeping the current + 5 most recent.\nasync fn garbage_collect_old_versions(\n    client: &LambdaClient,\n    function_name: &str,\n    current_version: &str,\n) -> anyhow::Result<()> {\n    let mut quickwit_lambda_versions: Vec<(u64, String)> = 
Vec::new();\n    let mut marker: Option<String> = None;\n\n    // Collect all Quickwit-managed versions\n    loop {\n        let mut request = client\n            .list_versions_by_function()\n            .function_name(function_name);\n\n        if let Some(m) = marker {\n            request = request.marker(m);\n        }\n\n        let response = request\n            .send()\n            .await\n            .context(\"failed to list Lambda versions for garbage collection\")?;\n\n        for version in response.versions() {\n            let Some(version_str) = version.version() else {\n                continue;\n            };\n            if version_str == \"$LATEST\" {\n                continue;\n            }\n            // Only consider Quickwit-managed versions\n            let Some(description) = version.description() else {\n                continue;\n            };\n            if description.starts_with(VERSION_DESCRIPTION_PREFIX)\n                && let Ok(version_num) = version_str.parse::<u64>()\n            {\n                quickwit_lambda_versions.push((version_num, version_str.to_string()));\n            }\n        }\n\n        marker = response.next_marker().map(ToString::to_string);\n        if marker.is_none() {\n            break;\n        }\n    }\n\n    // Sort by version number ascending (oldest first)\n    quickwit_lambda_versions.sort();\n\n    // We keep the last 5 versions.\n    quickwit_lambda_versions.truncate(\n        quickwit_lambda_versions\n            .len()\n            .saturating_sub(GC_KEEP_RECENT_VERSIONS),\n    );\n\n    if let Some(pos) = quickwit_lambda_versions\n        .iter()\n        .position(|(_version, version_str)| version_str == current_version)\n    {\n        quickwit_lambda_versions.swap_remove(pos);\n    }\n\n    // Delete old versions\n    for (version, version_str) in quickwit_lambda_versions {\n        info!(\n            function_name = %function_name,\n            version = %version_str,\n            
\"deleting old Lambda version\"\n        );\n\n        if let Err(e) = client\n            .delete_function()\n            .function_name(function_name)\n            .qualifier(&version_str)\n            .send()\n            .await\n        {\n            info!(\n                function_name = %function_name,\n                version = %version,\n                error = %e,\n                \"failed to delete old Lambda version\"\n            );\n        }\n    }\n\n    Ok(())\n}\n\n/// Build environment variables for the Lambda function.\nfn build_environment() -> Environment {\n    let mut env_vars = HashMap::new();\n    env_vars.insert(\"RUST_LOG\".to_string(), \"info\".to_string());\n    env_vars.insert(\"RUST_BACKTRACE\".to_string(), \"1\".to_string());\n    Environment::builder().set_variables(Some(env_vars)).build()\n}\n\n/// Build tags for the Lambda function.\nfn build_tags() -> HashMap<String, String> {\n    let mut tags = HashMap::new();\n    tags.insert(\"managed_by\".to_string(), \"quickwit\".to_string());\n    tags\n}\n\n#[cfg(test)]\nmod tests {\n    use aws_sdk_lambda::operation::create_function::{CreateFunctionError, CreateFunctionOutput};\n    use aws_sdk_lambda::operation::delete_function::DeleteFunctionOutput;\n    use aws_sdk_lambda::operation::get_function::GetFunctionOutput;\n    use aws_sdk_lambda::operation::list_versions_by_function::{\n        ListVersionsByFunctionError, ListVersionsByFunctionOutput,\n    };\n    use aws_sdk_lambda::operation::publish_version::PublishVersionOutput;\n    use aws_sdk_lambda::operation::update_function_code::UpdateFunctionCodeOutput;\n    use aws_sdk_lambda::types::FunctionConfiguration;\n    use aws_sdk_lambda::types::error::{ResourceConflictException, ResourceNotFoundException};\n    use aws_smithy_mocks::{RuleMode, mock, mock_client};\n    use bytesize::ByteSize;\n\n    use super::*;\n\n    fn make_version(version: &str, description: &str) -> FunctionConfiguration {\n        
FunctionConfiguration::builder()\n            .version(version)\n            .description(description)\n            .build()\n    }\n\n    fn test_deploy_config() -> LambdaDeployConfig {\n        LambdaDeployConfig {\n            execution_role_arn: \"arn:aws:iam::123456789:role/test-role\".to_string(),\n            memory_size: ByteSize::gib(5),\n            invocation_timeout_secs: 60,\n        }\n    }\n\n    fn test_description() -> String {\n        version_description(None)\n    }\n\n    #[test]\n    fn test_version_description() {\n        let lambda_deploy_config = test_deploy_config();\n        let description = version_description(Some(&lambda_deploy_config));\n        assert!(description.ends_with(\"_60s_6c3b2\"));\n        let description = version_description(None);\n        assert!(description.ends_with(\"_nodeploy\"));\n    }\n\n    // --- find_matching_version tests ---\n\n    #[tokio::test]\n    async fn test_find_matching_version_found() {\n        let target = \"quickwit:test_version\";\n        let rule = mock!(aws_sdk_lambda::Client::list_versions_by_function).then_output(|| {\n            ListVersionsByFunctionOutput::builder()\n                .versions(make_version(\"$LATEST\", \"\"))\n                .versions(make_version(\"1\", \"quickwit:old_version\"))\n                .versions(make_version(\"7\", \"quickwit:test_version\"))\n                .build()\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        let matching_version_opt = find_matching_version(&client, \"my-fn\", target)\n            .await\n            .unwrap();\n        assert_eq!(matching_version_opt, Some(\"7\".to_string()));\n    }\n\n    #[tokio::test]\n    async fn test_find_matching_version_not_found() {\n        let rule = mock!(aws_sdk_lambda::Client::list_versions_by_function).then_output(|| {\n            ListVersionsByFunctionOutput::builder()\n                .versions(make_version(\"$LATEST\", \"\"))\n                
.versions(make_version(\"1\", \"quickwit:other\"))\n                .build()\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        let result = find_matching_version(&client, \"my-fn\", \"quickwit:no_match\")\n            .await\n            .unwrap();\n        assert_eq!(result, None);\n    }\n\n    #[tokio::test]\n    async fn test_find_matching_version_function_does_not_exist() {\n        let rule = mock!(aws_sdk_lambda::Client::list_versions_by_function).then_error(|| {\n            ListVersionsByFunctionError::ResourceNotFoundException(\n                ResourceNotFoundException::builder().build(),\n            )\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        let result = find_matching_version(&client, \"no-such-fn\", \"quickwit:x\")\n            .await\n            .unwrap();\n        assert_eq!(result, None);\n    }\n\n    #[tokio::test]\n    async fn test_find_matching_version_skips_latest_even_if_description_matches() {\n        let rule = mock!(aws_sdk_lambda::Client::list_versions_by_function).then_output(|| {\n            ListVersionsByFunctionOutput::builder()\n                .versions(make_version(\"$LATEST\", \"quickwit:match\"))\n                .build()\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        let result = find_matching_version(&client, \"my-fn\", \"quickwit:match\")\n            .await\n            .unwrap();\n        assert_eq!(result, None);\n    }\n\n    // --- try_create_function tests ---\n\n    #[tokio::test]\n    async fn test_try_create_function_success() {\n        let rule = mock!(aws_sdk_lambda::Client::create_function).then_output(|| {\n            CreateFunctionOutput::builder()\n                .version(\"1\")\n                .function_name(\"my-fn\")\n                .build()\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n        let config = test_deploy_config();\n\n        let result = 
try_create_function(&client, \"my-fn\", &config, &test_description())\n            .await\n            .unwrap();\n        assert_eq!(result, Some(\"1\".to_string()));\n    }\n\n    #[tokio::test]\n    async fn test_try_create_function_already_exists() {\n        let rule = mock!(aws_sdk_lambda::Client::create_function).then_error(|| {\n            CreateFunctionError::ResourceConflictException(\n                ResourceConflictException::builder().build(),\n            )\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n        let config = test_deploy_config();\n\n        let result = try_create_function(&client, \"my-fn\", &config, &test_description())\n            .await\n            .unwrap();\n        assert_eq!(result, None);\n    }\n\n    // --- deploy (update path) tests ---\n\n    #[tokio::test]\n    async fn test_deploy_update_path() {\n        // create_function → conflict (function exists)\n        let create_rule = mock!(aws_sdk_lambda::Client::create_function).then_error(|| {\n            CreateFunctionError::ResourceConflictException(\n                ResourceConflictException::builder().build(),\n            )\n        });\n        // update_function_code → success with code_sha256\n        let update_rule = mock!(aws_sdk_lambda::Client::update_function_code).then_output(|| {\n            UpdateFunctionCodeOutput::builder()\n                .code_sha256(\"abc123hash\")\n                .build()\n        });\n        // get_function → active and ready (for wait_for_function_ready)\n        let get_rule = mock!(aws_sdk_lambda::Client::get_function).then_output(|| {\n            GetFunctionOutput::builder()\n                .configuration(\n                    FunctionConfiguration::builder()\n                        .state(State::Active)\n                        .last_update_status(LastUpdateStatus::Successful)\n                        .build(),\n                )\n                .build()\n        });\n        // 
publish_version → success\n        let publish_rule = mock!(aws_sdk_lambda::Client::publish_version)\n            .then_output(|| PublishVersionOutput::builder().version(\"8\").build());\n\n        let client = mock_client!(\n            aws_sdk_lambda,\n            RuleMode::MatchAny,\n            [&create_rule, &update_rule, &get_rule, &publish_rule]\n        );\n        let config = test_deploy_config();\n\n        tokio::time::pause();\n        let version = deploy_lambda_function(&client, \"my-fn\", &config)\n            .await\n            .unwrap();\n        assert_eq!(version, \"8\");\n    }\n\n    // --- wait_for_function_ready tests ---\n\n    #[tokio::test]\n    async fn test_wait_for_function_ready_immediate() {\n        let rule = mock!(aws_sdk_lambda::Client::get_function).then_output(|| {\n            GetFunctionOutput::builder()\n                .configuration(\n                    FunctionConfiguration::builder()\n                        .state(State::Active)\n                        .last_update_status(LastUpdateStatus::Successful)\n                        .build(),\n                )\n                .build()\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        tokio::time::pause();\n        wait_for_function_ready(&client, \"my-fn\").await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_wait_for_function_ready_after_update_in_progress() {\n        let rule = mock!(aws_sdk_lambda::Client::get_function)\n            .sequence()\n            .output(|| {\n                GetFunctionOutput::builder()\n                    .configuration(\n                        FunctionConfiguration::builder()\n                            .state(State::Active)\n                            .last_update_status(LastUpdateStatus::InProgress)\n                            .build(),\n                    )\n                    .build()\n            })\n            .output(|| {\n                GetFunctionOutput::builder()\n          
          .configuration(\n                        FunctionConfiguration::builder()\n                            .state(State::Active)\n                            .last_update_status(LastUpdateStatus::Successful)\n                            .build(),\n                    )\n                    .build()\n            })\n            .build();\n        let client = mock_client!(aws_sdk_lambda, RuleMode::Sequential, [&rule]);\n\n        tokio::time::pause();\n        wait_for_function_ready(&client, \"my-fn\").await.unwrap();\n        assert_eq!(rule.num_calls(), 2);\n    }\n\n    #[tokio::test]\n    async fn test_wait_for_function_ready_fails_on_failed_state() {\n        let rule = mock!(aws_sdk_lambda::Client::get_function).then_output(|| {\n            GetFunctionOutput::builder()\n                .configuration(\n                    FunctionConfiguration::builder()\n                        .state(State::Failed)\n                        .state_reason(\"Something broke\")\n                        .build(),\n                )\n                .build()\n        });\n        let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        tokio::time::pause();\n        let err = wait_for_function_ready(&client, \"my-fn\").await.unwrap_err();\n        assert!(\n            err.to_string().contains(\"Failed state\"),\n            \"unexpected error: {}\",\n            err\n        );\n    }\n\n    #[tokio::test]\n    async fn test_wait_for_function_ready_fails_on_last_update_failed() {\n        let rule = mock!(aws_sdk_lambda::Client::get_function).then_output(|| {\n            GetFunctionOutput::builder()\n                .configuration(\n                    FunctionConfiguration::builder()\n                        .state(State::Active)\n                        .last_update_status(LastUpdateStatus::Failed)\n                        .last_update_status_reason(\"Update broke\")\n                        .build(),\n                )\n                .build()\n        });\n    
    let client = mock_client!(aws_sdk_lambda, [&rule]);\n\n        tokio::time::pause();\n        let err = wait_for_function_ready(&client, \"my-fn\").await.unwrap_err();\n        assert!(\n            err.to_string().contains(\"last update failed\"),\n            \"unexpected error: {}\",\n            err\n        );\n    }\n\n    // --- garbage_collect_old_versions tests ---\n\n    #[tokio::test]\n    async fn test_gc_deletes_old_versions_keeps_recent() {\n        // 8 quickwit versions (1..=8) + $LATEST + one non-quickwit version\n        let list_rule =\n            mock!(aws_sdk_lambda::Client::list_versions_by_function).then_output(|| {\n                let mut builder = ListVersionsByFunctionOutput::builder()\n                    .versions(make_version(\"$LATEST\", \"\"))\n                    .versions(make_version(\"99\", \"not-quickwit\"));\n                for i in 1..=8 {\n                    builder = builder\n                        .versions(make_version(&i.to_string(), &format!(\"quickwit:ver_{}\", i)));\n                }\n                builder.build()\n            });\n\n        let delete_rule = mock!(aws_sdk_lambda::Client::delete_function)\n            .then_output(|| DeleteFunctionOutput::builder().build());\n\n        let client = mock_client!(\n            aws_sdk_lambda,\n            RuleMode::MatchAny,\n            [&list_rule, &delete_rule]\n        );\n\n        // Current version is \"7\", so keep 7 + the 5 most recent (4,5,6,7,8).\n        // Should delete versions 1, 2, 3.\n        garbage_collect_old_versions(&client, \"my-fn\", \"7\")\n            .await\n            .unwrap();\n\n        assert_eq!(delete_rule.num_calls(), 3);\n    }\n\n    #[tokio::test]\n    async fn test_gc_nothing_to_delete() {\n        // Only 3 quickwit versions — below the GC_KEEP_RECENT_VERSIONS threshold.\n        let list_rule =\n            mock!(aws_sdk_lambda::Client::list_versions_by_function).then_output(|| {\n                
ListVersionsByFunctionOutput::builder()\n                    .versions(make_version(\"$LATEST\", \"\"))\n                    .versions(make_version(\"1\", \"quickwit:v1\"))\n                    .versions(make_version(\"2\", \"quickwit:v2\"))\n                    .versions(make_version(\"3\", \"quickwit:v3\"))\n                    .build()\n            });\n\n        let delete_rule = mock!(aws_sdk_lambda::Client::delete_function)\n            .then_output(|| DeleteFunctionOutput::builder().build());\n\n        let client = mock_client!(\n            aws_sdk_lambda,\n            RuleMode::MatchAny,\n            [&list_rule, &delete_rule]\n        );\n\n        garbage_collect_old_versions(&client, \"my-fn\", \"3\")\n            .await\n            .unwrap();\n\n        assert_eq!(delete_rule.num_calls(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_gc_does_not_delete_current_version() {\n        // 7 quickwit versions, current is \"1\" (the oldest).\n        // Without the current-version guard, version 1 would be deleted.\n        let list_rule =\n            mock!(aws_sdk_lambda::Client::list_versions_by_function).then_output(|| {\n                let mut builder =\n                    ListVersionsByFunctionOutput::builder().versions(make_version(\"$LATEST\", \"\"));\n                for i in 1..=7 {\n                    builder = builder\n                        .versions(make_version(&i.to_string(), &format!(\"quickwit:ver_{}\", i)));\n                }\n                builder.build()\n            });\n\n        let delete_rule = mock!(aws_sdk_lambda::Client::delete_function)\n            .then_output(|| DeleteFunctionOutput::builder().build());\n\n        let client = mock_client!(\n            aws_sdk_lambda,\n            RuleMode::MatchAny,\n            [&list_rule, &delete_rule]\n        );\n\n        // Current version is \"1\". Without guard: would delete 1,2. 
With guard: only deletes 2.\n        garbage_collect_old_versions(&client, \"my-fn\", \"1\")\n            .await\n            .unwrap();\n\n        assert_eq!(delete_rule.num_calls(), 1);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/src/invoker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse anyhow::Context as _;\nuse async_trait::async_trait;\nuse aws_sdk_lambda::Client as LambdaClient;\nuse aws_sdk_lambda::error::{DisplayErrorContext, SdkError};\nuse aws_sdk_lambda::operation::invoke::InvokeError;\nuse aws_sdk_lambda::primitives::Blob;\nuse aws_sdk_lambda::types::InvocationType;\nuse base64::prelude::*;\nuse prost::Message;\nuse quickwit_common::retry::RetryParams;\nuse quickwit_lambda_server::{LambdaSearchRequestPayload, LambdaSearchResponsePayload};\nuse quickwit_proto::search::{LambdaSearchResponses, LambdaSingleSplitResult, LeafSearchRequest};\nuse quickwit_search::{LambdaLeafSearchInvoker, SearchError};\nuse tracing::{debug, info, instrument, warn};\n\nuse crate::metrics::LAMBDA_METRICS;\n\n/// Upper bound on the retry-after hint we will honor from Lambda rate-limit responses.\nconst MAX_RETRY_AFTER: Duration = Duration::from_secs(10);\n\n/// Richer error type used internally by the invoker so that rate-limit retry-after hints\n/// are not lost before the retry loop can consume them.\nenum LambdaInvokeError {\n    /// Lambda returned a throttling error. 
The optional duration is the `Retry-After` hint\n    /// provided by Lambda; `None` means no hint was present.\n    RateLimited(Option<Duration>),\n    /// The invocation timed out.\n    Timeout(String),\n    /// A non-retryable error.\n    Permanent(SearchError),\n}\n\nimpl LambdaInvokeError {\n    fn into_search_error(self) -> SearchError {\n        match self {\n            Self::RateLimited(_) => SearchError::TooManyRequests,\n            Self::Timeout(msg) => SearchError::Timeout(msg),\n            Self::Permanent(err) => err,\n        }\n    }\n}\n\nimpl From<SearchError> for LambdaInvokeError {\n    fn from(err: SearchError) -> Self {\n        LambdaInvokeError::Permanent(err)\n    }\n}\n\nfn invoke_error_to_lambda_error(error: SdkError<InvokeError>) -> LambdaInvokeError {\n    if let SdkError::ServiceError(ref service_error) = error {\n        match service_error.err() {\n            InvokeError::TooManyRequestsException(exc) => {\n                let retry_after = exc\n                    .retry_after_seconds()\n                    .and_then(|raw| raw.parse::<f64>().ok())\n                    .filter(|secs| secs.is_finite() && *secs > 0.0)\n                    .map(|secs| Duration::from_secs_f64(secs).min(MAX_RETRY_AFTER));\n                return LambdaInvokeError::RateLimited(retry_after);\n            }\n            InvokeError::EniLimitReachedException(_)\n            | InvokeError::SubnetIpAddressLimitReachedException(_)\n            | InvokeError::Ec2ThrottledException(_)\n            | InvokeError::ResourceConflictException(_) => {\n                return LambdaInvokeError::RateLimited(None);\n            }\n            _ => {}\n        }\n    }\n\n    let is_timeout = match &error {\n        SdkError::TimeoutError(_) => true,\n        SdkError::DispatchFailure(failure) => failure.is_timeout(),\n        SdkError::ServiceError(service_error) => matches!(\n            service_error.err(),\n            InvokeError::EfsMountTimeoutException(_) | 
InvokeError::SnapStartTimeoutException(_)\n        ),\n        _ => false,\n    };\n\n    let error_msg = format!(\"lambda invocation failed: {}\", DisplayErrorContext(&error));\n\n    if is_timeout {\n        LambdaInvokeError::Timeout(error_msg)\n    } else {\n        LambdaInvokeError::Permanent(SearchError::Internal(error_msg))\n    }\n}\n\n/// Create a Lambda invoker for a specific version.\n///\n/// The version number is used as the qualifier when invoking, ensuring we call\n/// the exact published version (not $LATEST).\npub(crate) async fn create_lambda_invoker_for_version(\n    function_name: String,\n    version: String,\n) -> anyhow::Result<AwsLambdaInvoker> {\n    let aws_config = aws_config::load_defaults(aws_config::BehaviorVersion::latest()).await;\n    let client = LambdaClient::new(&aws_config);\n    let invoker = AwsLambdaInvoker {\n        client,\n        function_name,\n        version,\n    };\n    invoker.validate().await?;\n    Ok(invoker)\n}\n\n/// AWS Lambda implementation of RemoteFunctionInvoker.\npub(crate) struct AwsLambdaInvoker {\n    client: LambdaClient,\n    function_name: String,\n    /// The version number to invoke (e.g., \"7\", \"12\").\n    version: String,\n}\n\nimpl AwsLambdaInvoker {\n    /// Validate that the Lambda function version exists and is invocable.\n    /// Uses DryRun invocation type - validates without executing.\n    async fn validate(&self) -> anyhow::Result<()> {\n        info!(\"lambda invoker dry run\");\n        let request = self\n            .client\n            .invoke()\n            .function_name(&self.function_name)\n            .qualifier(&self.version)\n            .invocation_type(InvocationType::DryRun);\n\n        request.send().await.with_context(|| {\n            format!(\n                \"failed to validate Lambda function '{}:{}'\",\n                self.function_name, self.version\n            )\n        })?;\n\n        info!(\"the lambda invoker dry run was successful\");\n        
Ok(())\n    }\n}\n\n/// Retry parameters used for exponential backoff when no `Retry-After` hint is available.\nconst LAMBDA_RETRY_PARAMS: RetryParams = RetryParams {\n    base_delay: Duration::from_secs(1),\n    max_delay: Duration::from_secs(10),\n    max_attempts: 3,\n};\n\n#[async_trait]\nimpl LambdaLeafSearchInvoker for AwsLambdaInvoker {\n    #[instrument(skip(self, request), fields(function_name = %self.function_name, version = %self.version))]\n    async fn invoke_leaf_search(\n        &self,\n        request: LeafSearchRequest,\n    ) -> Result<Vec<LambdaSingleSplitResult>, SearchError> {\n        let start = std::time::Instant::now();\n        let result = self.invoke_leaf_search_with_retry(request).await;\n        let elapsed = start.elapsed().as_secs_f64();\n        let status = if result.is_ok() { \"success\" } else { \"error\" };\n        LAMBDA_METRICS\n            .leaf_search_requests_total\n            .with_label_values([status])\n            .inc();\n        LAMBDA_METRICS\n            .leaf_search_duration_seconds\n            .with_label_values([status])\n            .observe(elapsed);\n        result\n    }\n}\n\nimpl AwsLambdaInvoker {\n    async fn invoke_leaf_search_with_retry(\n        &self,\n        request: LeafSearchRequest,\n    ) -> Result<Vec<LambdaSingleSplitResult>, SearchError> {\n        let mut error = match self.invoke_leaf_search_once(request.clone()).await {\n            Ok(results) => return Ok(results),\n            Err(error) => error,\n        };\n\n        for num_attempts in 1..LAMBDA_RETRY_PARAMS.max_attempts {\n            // Determine whether to retry and how long to wait.\n            let delay = match &error {\n                LambdaInvokeError::RateLimited(retry_after) => {\n                    retry_after.unwrap_or_else(|| LAMBDA_RETRY_PARAMS.compute_delay(num_attempts))\n                }\n                LambdaInvokeError::Timeout(_) => LAMBDA_RETRY_PARAMS.compute_delay(num_attempts),\n                
LambdaInvokeError::Permanent(_) => return Err(error.into_search_error()),\n            };\n\n            warn!(\n                num_attempts = num_attempts,\n                delay_ms = delay.as_millis(),\n                \"lambda invocation failed, retrying\"\n            );\n            tokio::time::sleep(delay).await;\n\n            match self.invoke_leaf_search_once(request.clone()).await {\n                Ok(results) => return Ok(results),\n                Err(e) => error = e,\n            };\n        }\n\n        Err(error.into_search_error())\n    }\n\n    async fn invoke_leaf_search_once(\n        &self,\n        request: LeafSearchRequest,\n    ) -> Result<Vec<LambdaSingleSplitResult>, LambdaInvokeError> {\n        // Serialize request to protobuf bytes, then base64 encode\n        let request_bytes = request.encode_to_vec();\n        let payload = LambdaSearchRequestPayload {\n            payload: BASE64_STANDARD.encode(&request_bytes),\n        };\n\n        let payload_json = serde_json::to_vec(&payload)\n            .map_err(|e| SearchError::Internal(format!(\"JSON serialization error: {}\", e)))?;\n\n        LAMBDA_METRICS\n            .leaf_search_request_payload_size_bytes\n            .observe(payload_json.len() as f64);\n\n        debug!(\n            payload_size = payload_json.len(),\n            version = %self.version,\n            \"invoking Lambda function\"\n        );\n\n        // Invoke the specific version\n        let invoke_builder = self\n            .client\n            .invoke()\n            .function_name(&self.function_name)\n            .qualifier(&self.version)\n            .invocation_type(InvocationType::RequestResponse)\n            .payload(Blob::new(payload_json));\n\n        let response = invoke_builder\n            .send()\n            .await\n            .map_err(invoke_error_to_lambda_error)?;\n\n        // Check for function error\n        if let Some(error) = response.function_error() {\n            let 
error_payload = response\n                .payload()\n                .map(|b| String::from_utf8_lossy(b.as_ref()).to_string())\n                .unwrap_or_default();\n            return Err(SearchError::Internal(format!(\n                \"lambda function error: {}: {}\",\n                error, error_payload\n            ))\n            .into());\n        }\n\n        // Deserialize response\n        let response_payload = response\n            .payload()\n            .ok_or_else(|| SearchError::Internal(\"no response payload from Lambda\".into()))?;\n\n        LAMBDA_METRICS\n            .leaf_search_response_payload_size_bytes\n            .observe(response_payload.as_ref().len() as f64);\n\n        let lambda_response: LambdaSearchResponsePayload =\n            serde_json::from_slice(response_payload.as_ref())\n                .map_err(|e| SearchError::Internal(format!(\"json deserialization error: {}\", e)))?;\n\n        let response_bytes = BASE64_STANDARD\n            .decode(&lambda_response.payload)\n            .map_err(|e| SearchError::Internal(format!(\"base64 decode error: {}\", e)))?;\n\n        let leaf_responses = LambdaSearchResponses::decode(&response_bytes[..])\n            .map_err(|e| SearchError::Internal(format!(\"protobuf decode error: {}\", e)))?;\n\n        debug!(\n            num_results = leaf_responses.split_results.len(),\n            \"lambda invocation completed\"\n        );\n\n        Ok(leaf_responses.split_results)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! AWS Lambda client for Quickwit leaf search operations.\n//!\n//! This crate provides:\n//! - An AWS Lambda implementation of the `LambdaLeafSearchInvoker` trait used by `quickwit-search`\n//! - Auto-deployment functionality for Lambda functions\n//!\n//! # Usage\n//!\n//! Use `try_get_or_deploy_invoker` to get an invoker that will automatically deploy\n//! the Lambda function if needed:\n//!\n//! ```ignore\n//! let invoker = try_get_or_deploy_invoker(&function_name, &deploy_config).await?;\n//! ```\n\nmod deploy;\nmod invoker;\nmod metrics;\n\npub use deploy::try_get_or_deploy_invoker;\npub use metrics::LAMBDA_METRICS;\n// Re-export payload types from server crate for convenience\npub use quickwit_lambda_server::{LambdaSearchRequestPayload, LambdaSearchResponsePayload};\n"
  },
  {
    "path": "quickwit/quickwit-lambda-client/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// See https://prometheus.io/docs/practices/naming/\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    Histogram, HistogramVec, IntCounterVec, exponential_buckets, new_counter_vec, new_histogram,\n    new_histogram_vec,\n};\n\n/// From 100ms to ~73s (13 buckets, each sqrt(3) times the previous one).\nfn duration_buckets() -> Vec<f64> {\n    exponential_buckets(0.100, 3f64.sqrt(), 13).unwrap()\n}\n\n/// From 1 KiB to 16 MiB (8 buckets, each 4 times the previous one).\nfn payload_size_buckets() -> Vec<f64> {\n    exponential_buckets(1024.0, 4.0, 8).unwrap()\n}\n\npub struct LambdaMetrics {\n    pub leaf_search_requests_total: IntCounterVec<1>,\n    pub leaf_search_duration_seconds: HistogramVec<1>,\n    pub leaf_search_request_payload_size_bytes: Histogram,\n    pub leaf_search_response_payload_size_bytes: Histogram,\n}\n\nimpl Default for LambdaMetrics {\n    fn default() -> Self {\n        LambdaMetrics {\n            leaf_search_requests_total: new_counter_vec(\n                \"leaf_search_requests_total\",\n                \"Total number of Lambda leaf search invocations.\",\n                \"lambda\",\n                &[],\n                [\"status\"],\n            ),\n            leaf_search_duration_seconds: new_histogram_vec(\n                \"leaf_search_duration_seconds\",\n                \"Duration of Lambda leaf search invocations in seconds.\",\n                \"lambda\",\n                &[],\n      
          [\"status\"],\n                duration_buckets(),\n            ),\n            leaf_search_request_payload_size_bytes: new_histogram(\n                \"leaf_search_request_payload_size_bytes\",\n                \"Size of the request payload sent to Lambda in bytes.\",\n                \"lambda\",\n                payload_size_buckets(),\n            ),\n            leaf_search_response_payload_size_bytes: new_histogram(\n                \"leaf_search_response_payload_size_bytes\",\n                \"Size of the response payload received from Lambda in bytes.\",\n                \"lambda\",\n                payload_size_buckets(),\n            ),\n        }\n    }\n}\n\npub static LAMBDA_METRICS: Lazy<LambdaMetrics> = Lazy::new(LambdaMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-lambda-server/Cargo.toml",
    "content": "[package]\nname = \"quickwit-lambda-server\"\ndescription = \"AWS Lambda handler for Quickwit leaf search\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[package.metadata.cargo-machete]\n# openssl is listed here even though it is not used directly, in order to\n# enable its \"vendored\" feature, which allows the cross-build.\nignored = [\"openssl\"]\n\n[dependencies]\nanyhow = { workspace = true }\nbase64 = { workspace = true }\nbytesize = { workspace = true }\nlambda_runtime = { workspace = true }\nprost = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\ntracing-subscriber = { workspace = true, features = [\"env-filter\", \"json\"] }\n\nopenssl = { workspace = true, optional = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-search = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[[bin]]\nname = \"quickwit-aws-lambda-leaf-search\"\npath = \"src/bin/leaf_search.rs\"\n\n[features]\ndefault = []\ntestsuite = []\n\n# Keep this in sync with quickwit-cli!\nlambda-release = [\n    # The vendored OpenSSL will be compiled from source during the\n    # build, avoiding the pkg-config dependency issue during\n    # cross-compilation.\n    \"openssl/vendored\",\n]\n"
  },
  {
    "path": "quickwit/quickwit-lambda-server/src/bin/leaf_search.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! AWS Lambda binary entry point for Quickwit leaf search.\n\nuse std::sync::Arc;\n\nuse lambda_runtime::{Error, LambdaEvent, service_fn};\nuse quickwit_lambda_server::{\n    LambdaSearchRequestPayload, LambdaSearcherContext, handle_leaf_search,\n};\nuse tracing::info;\nuse tracing_subscriber::EnvFilter;\n\n#[tokio::main]\nasync fn main() -> Result<(), Error> {\n    // Initialize tracing with JSON output for CloudWatch\n    tracing_subscriber::fmt()\n        .with_env_filter(EnvFilter::from_default_env())\n        .json()\n        .init();\n\n    // Initialize context on cold start (wrapped in Arc for sharing across invocations)\n    let context = Arc::new(LambdaSearcherContext::try_from_env()?);\n\n    info!(\"lambda context initialized, starting handler loop\");\n\n    // Run the Lambda handler\n    lambda_runtime::run(service_fn(\n        |event: LambdaEvent<LambdaSearchRequestPayload>| {\n            let ctx = Arc::clone(&context);\n            async move {\n                let (payload, _event_ctx) = event.into_parts();\n                handle_leaf_search(payload, &ctx)\n                    .await\n                    .map_err(|e| lambda_runtime::Error::from(e.to_string()))\n            }\n        },\n    ))\n    .await\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-server/src/context.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse anyhow::Context as _;\nuse bytesize::ByteSize;\nuse quickwit_config::{CacheConfig, SearcherConfig};\nuse quickwit_search::SearcherContext;\nuse quickwit_storage::StorageResolver;\nuse tracing::info;\n\n/// Lambda-specific searcher context that holds resources for search execution.\npub struct LambdaSearcherContext {\n    pub searcher_context: Arc<SearcherContext>,\n    pub storage_resolver: StorageResolver,\n}\n\nimpl LambdaSearcherContext {\n    /// Create a new Lambda searcher context from environment variables.\n    pub fn try_from_env() -> anyhow::Result<Self> {\n        info!(\"initializing lambda searcher context\");\n\n        let searcher_config = try_searcher_config_from_env()?;\n        let searcher_context =\n            Arc::new(SearcherContext::new_without_invoker(searcher_config, None));\n        let storage_resolver = StorageResolver::configured(&Default::default());\n\n        Ok(Self {\n            searcher_context,\n            storage_resolver,\n        })\n    }\n}\n\n/// Create a Lambda-optimized searcher config based on the `AWS_LAMBDA_FUNCTION_MEMORY_SIZE`\n/// environment variable.\nfn try_searcher_config_from_env() -> anyhow::Result<SearcherConfig> {\n    let lambda_memory_mib: u64 = quickwit_common::get_from_env_opt(\n        \"AWS_LAMBDA_FUNCTION_MEMORY_SIZE\",\n        /* sensitive */ 
false,\n    )\n    .context(\"could not get aws lambda function memory size from ENV\")?;\n    let lambda_memory = ByteSize::mib(lambda_memory_mib);\n    anyhow::ensure!(\n        lambda_memory >= ByteSize::gib(1u64),\n        \"lambda memory must be at least 1GB\"\n    );\n    let warmup_memory_budget = ByteSize::b(lambda_memory.as_u64() - ByteSize::mib(500).as_u64());\n\n    let mut searcher_config = SearcherConfig::default();\n    searcher_config.max_num_concurrent_split_searches = 20;\n    searcher_config.warmup_memory_budget = warmup_memory_budget;\n    searcher_config.fast_field_cache = CacheConfig::no_cache();\n    searcher_config.split_footer_cache = CacheConfig::no_cache();\n    searcher_config.predicate_cache = CacheConfig::no_cache();\n    searcher_config.partial_request_cache = CacheConfig::no_cache();\n    Ok(searcher_config)\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-server/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_search::SearchError;\nuse thiserror::Error;\n\n/// Result type for Lambda operations.\npub type LambdaResult<T> = Result<T, LambdaError>;\n\n/// Errors that can occur during Lambda handler operations.\n#[derive(Debug, Error)]\npub enum LambdaError {\n    /// Error serializing/deserializing protobuf.\n    #[error(\"serialization error: {0}\")]\n    Serialization(String),\n    /// Error from the search operation.\n    #[error(\"search error: {0}\")]\n    Search(#[from] SearchError),\n    /// Internal error.\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n    /// Task was cancelled.\n    #[error(\"cancelled\")]\n    Cancelled,\n}\n\nimpl From<prost::DecodeError> for LambdaError {\n    fn from(err: prost::DecodeError) -> Self {\n        LambdaError::Serialization(format!(\"protobuf decode error: {}\", err))\n    }\n}\n\nimpl From<prost::EncodeError> for LambdaError {\n    fn from(err: prost::EncodeError) -> Self {\n        LambdaError::Serialization(format!(\"protobuf encode error: {}\", err))\n    }\n}\n\nimpl From<base64::DecodeError> for LambdaError {\n    fn from(err: base64::DecodeError) -> Self {\n        LambdaError::Serialization(format!(\"base64 decode error: {}\", err))\n    }\n}\n\nimpl From<serde_json::Error> for LambdaError {\n    fn from(err: serde_json::Error) -> Self {\n        
LambdaError::Serialization(format!(\"json error: {}\", err))\n    }\n}\n\nimpl From<LambdaError> for SearchError {\n    fn from(err: LambdaError) -> Self {\n        match err {\n            LambdaError::Search(search_err) => search_err,\n            other => SearchError::Internal(other.to_string()),\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-server/src/handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::str::FromStr;\nuse std::sync::Arc;\n\nuse base64::prelude::*;\nuse prost::Message;\nuse quickwit_common::uri::Uri;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::search::lambda_single_split_result::Outcome;\nuse quickwit_proto::search::{\n    LambdaSearchResponses, LambdaSingleSplitResult, LeafRequestRef, LeafSearchRequest,\n    SearchRequest,\n};\nuse quickwit_search::leaf::single_doc_mapping_leaf_search;\nuse quickwit_storage::Storage;\nuse serde::{Deserialize, Serialize};\nuse tracing::{error, info, instrument, warn};\n\nuse crate::context::LambdaSearcherContext;\nuse crate::error::{LambdaError, LambdaResult};\n\n/// Payload for leaf search Lambda invocation.\n#[derive(Debug, Serialize, Deserialize)]\npub struct LambdaSearchRequestPayload {\n    /// Base64-encoded serialized LeafSearchRequest protobuf.\n    pub payload: String,\n}\n\n/// Response from leaf search Lambda invocation.\n#[derive(Debug, Serialize, Deserialize)]\npub struct LambdaSearchResponsePayload {\n    /// Base64-encoded serialized `LambdaSearchResponses` protobuf (one per split).\n    pub payload: String,\n}\n\n/// Handle a leaf search request in Lambda.\n///\n/// Returns one `LambdaSingleSplitResult` per split, each tagged with its\n/// split_id. 
Individual split failures are reported per-split rather than\n/// failing the entire invocation, so the caller can retry only failed splits.\n#[instrument(skip(ctx), fields(request_id))]\npub async fn handle_leaf_search(\n    event: LambdaSearchRequestPayload,\n    ctx: &LambdaSearcherContext,\n) -> LambdaResult<LambdaSearchResponsePayload> {\n    // Decode base64 payload\n    let request_bytes: Vec<u8> = BASE64_STANDARD\n        .decode(&event.payload)\n        .map_err(|err| LambdaError::Serialization(format!(\"base64 decode error: {}\", err)))?;\n\n    // Deserialize LeafSearchRequest\n    let leaf_search_request = LeafSearchRequest::decode(&request_bytes[..])?;\n\n    // Unpack the shared fields once instead of cloning per split.\n    let search_request: Arc<SearchRequest> = leaf_search_request\n        .search_request\n        .ok_or_else(|| LambdaError::Internal(\"no search request\".to_string()))?\n        .into();\n\n    let doc_mappers: Vec<Arc<DocMapper>> = leaf_search_request\n        .doc_mappers\n        .iter()\n        .map(String::as_str)\n        .map(serde_json::from_str::<Arc<DocMapper>>)\n        .collect::<Result<Vec<_>, _>>()\n        .map_err(|err| {\n            LambdaError::Internal(format!(\"failed to deserialize doc mapper: `{err}`\"))\n        })?;\n\n    // Resolve storage for every index URI upfront.\n    let mut storages: Vec<Arc<dyn quickwit_storage::Storage>> =\n        Vec::with_capacity(leaf_search_request.index_uris.len());\n    for uri_str in &leaf_search_request.index_uris {\n        let uri = Uri::from_str(uri_str)\n            .map_err(|err| LambdaError::Internal(format!(\"invalid index uri: {err}\")))?;\n        let storage =\n            ctx.storage_resolver.resolve(&uri).await.map_err(|err| {\n                LambdaError::Internal(format!(\"failed to resolve storage: {err}\"))\n            })?;\n        storages.push(storage);\n    }\n\n    let split_results: Vec<LambdaSingleSplitResult> = lambda_leaf_search(\n        
search_request,\n        leaf_search_request.leaf_requests,\n        &doc_mappers[..],\n        &storages[..],\n        ctx,\n    )\n    .await?;\n    let wrapper = LambdaSearchResponses { split_results };\n    let response_bytes = wrapper.encode_to_vec();\n    let payload = BASE64_STANDARD.encode(&response_bytes);\n\n    Ok(LambdaSearchResponsePayload { payload })\n}\n\n/// Lambda leaf search returns individual split results.\nasync fn lambda_leaf_search(\n    search_request: Arc<SearchRequest>,\n    leaf_req_ref: Vec<LeafRequestRef>,\n    doc_mappers: &[Arc<DocMapper>],\n    storages: &[Arc<dyn Storage>],\n    ctx: &LambdaSearcherContext,\n) -> LambdaResult<Vec<LambdaSingleSplitResult>> {\n    // Flatten leaf_requests into per-split tasks using pre-resolved Arc references.\n    let mut split_search_joinset: tokio::task::JoinSet<(String, Result<_, String>)> =\n        tokio::task::JoinSet::new();\n\n    for leaf_req in leaf_req_ref {\n        let doc_mapper = doc_mappers\n            .get(leaf_req.doc_mapper_ord as usize)\n            .ok_or_else(|| {\n                LambdaError::Internal(format!(\n                    \"doc_mapper_ord out of bounds: {}\",\n                    leaf_req.doc_mapper_ord\n                ))\n            })?\n            .clone();\n        // Bounds-check the storage ordinal too, so that a malformed request\n        // yields an error instead of a panic.\n        let storage = storages\n            .get(leaf_req.index_uri_ord as usize)\n            .ok_or_else(|| {\n                LambdaError::Internal(format!(\n                    \"index_uri_ord out of bounds: {}\",\n                    leaf_req.index_uri_ord\n                ))\n            })?\n            .clone();\n\n        for split_id_and_footer_offsets in leaf_req.split_offsets {\n            let split_id = split_id_and_footer_offsets.split_id.clone();\n            let searcher_context = ctx.searcher_context.clone();\n            let search_request = search_request.clone();\n            let doc_mapper = doc_mapper.clone();\n            let storage = storage.clone();\n            let split = split_id_and_footer_offsets.clone();\n            split_search_joinset.spawn(async move {\n                let result = single_doc_mapping_leaf_search(\n                    searcher_context,\n                    search_request,\n                    
storage,\n                    vec![split],\n                    doc_mapper,\n                )\n                .await\n                .map_err(|err| format!(\"{err}\"));\n                (split_id, result)\n            });\n        }\n    }\n\n    let num_splits = split_search_joinset.len();\n    info!(num_splits, \"processing leaf search request (per-split)\");\n\n    // Collect results. Order is irrelevant: each result is tagged with its split_id.\n    let mut split_results: Vec<LambdaSingleSplitResult> = Vec::with_capacity(num_splits);\n    let mut num_successes: usize = 0;\n    let mut num_failures: usize = 0;\n    while let Some(join_result) = split_search_joinset.join_next().await {\n        match join_result {\n            Ok((split_id, Ok(response))) => {\n                num_successes += 1;\n                split_results.push(LambdaSingleSplitResult {\n                    split_id,\n                    outcome: Some(Outcome::Response(response)),\n                });\n            }\n            Ok((split_id, Err(error_msg))) => {\n                num_failures += 1;\n                warn!(split_id = %split_id, error = %error_msg, \"split search failed\");\n                split_results.push(LambdaSingleSplitResult {\n                    split_id,\n                    outcome: Some(Outcome::Error(error_msg)),\n                });\n            }\n            Err(join_error) if join_error.is_cancelled() => {\n                warn!(\"search task was cancelled\");\n                return Err(LambdaError::Cancelled);\n            }\n            Err(join_error) => {\n                // Panics lose the captured split_id, so we fail the entire invocation.\n                error!(error = %join_error, \"search task panicked\");\n                return Err(LambdaError::Internal(format!(\n                    \"search task panicked: {join_error}\"\n                )));\n            }\n        }\n    }\n    info!(\n        num_successes,\n        num_failures, \"leaf 
search completed (per-split)\"\n    );\n\n    Ok(split_results)\n}\n"
  },
  {
    "path": "quickwit/quickwit-lambda-server/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! AWS Lambda handler for Quickwit leaf search operations.\n//!\n//! This crate provides the Lambda handler that executes leaf search requests.\n//! It is designed to be deployed as an AWS Lambda function.\n\nmod context;\nmod error;\nmod handler;\n\npub use context::LambdaSearcherContext;\npub use error::{LambdaError, LambdaResult};\npub use handler::{LambdaSearchRequestPayload, LambdaSearchResponsePayload, handle_leaf_search};\n"
  },
  {
    "path": "quickwit/quickwit-macros/Cargo.toml",
    "content": "[package]\nname = \"quickwit-macros\"\ndescription =  \"Proc macro definitions\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[lib]\nproc-macro = true\n\n[dependencies]\nproc-macro2 = { workspace = true }\nquote = { workspace = true }\nsyn = { workspace = true }\n"
  },
  {
    "path": "quickwit/quickwit-macros/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::mem;\n\nuse proc_macro::TokenStream;\nuse proc_macro2::{Span, TokenStream as TokenStream2};\nuse quote::quote;\nuse syn::parse::{Parse, ParseStream, Parser};\nuse syn::punctuated::Punctuated;\nuse syn::{\n    Attribute, Error, Field, Fields, FieldsNamed, Ident, ItemStruct, Meta, Path, Token, Visibility,\n    parenthesized,\n};\n\n#[proc_macro_attribute]\npub fn serde_multikey(attr: TokenStream, item: TokenStream) -> TokenStream {\n    match serde_multikey_inner(attr, item) {\n        Ok(ts) => ts,\n        Err(e) => e.to_compile_error().into(),\n    }\n}\n\nfn serde_multikey_inner(_attr: TokenStream, item: TokenStream) -> Result<TokenStream, Error> {\n    let Ok(input) = syn::parse::<ItemStruct>(item) else {\n        return Err(Error::new(\n            Span::call_site(),\n            \"the attribute can only be applied to struct\",\n        ));\n    };\n\n    let main_struct = generate_main_struct(input.clone())?;\n\n    let proxy_struct = generate_proxy_struct(input)?;\n\n    Ok(quote!(\n    #main_struct\n    #proxy_struct\n    )\n    .into())\n}\n\n/// Generate the main struct. 
It's a copy of the original struct, but with most\n/// ser/de attributes removed, and serde try_from/into `__MultiKey{}` added.\nfn generate_main_struct(mut input: ItemStruct) -> Result<TokenStream2, Error> {\n    let (serialize, deserialize) = get_ser_de(&input.attrs)?;\n    let has_utoipa_schema = get_and_remove_utoipa_schema(&mut input.attrs)?;\n\n    if !deserialize && !serialize {\n        return Err(Error::new(\n            Span::call_site(),\n            \"`serde_multikey` was applied to a non Serialize/Deserialize struct\",\n        ));\n    }\n\n    // remove serde and utoipa attributes from fields\n    for field in input.fields.iter_mut() {\n        let attrs = mem::take(&mut field.attrs);\n        field.attrs = attrs\n            .into_iter()\n            .filter(|attr| {\n                !(attr.path().is_ident(\"serde_multikey\")\n                    || attr.path().is_ident(\"serde\")\n                    || attr.path().is_ident(\"serde_as\")\n                    || attr.path().is_ident(\"schema\"))\n            })\n            .collect();\n    }\n\n    // remove serde attributes from struct\n    let attrs = mem::take(&mut input.attrs);\n    input.attrs = attrs\n        .into_iter()\n        .filter(|attr| !(attr.path().is_ident(\"serde\") || attr.path().is_ident(\"serde_as\")))\n        .collect();\n\n    if deserialize {\n        let mut attr = Attribute::parse_outer\n            .parse_str(&format!(\n                r#\"#[serde(try_from = \"__MultiKey{}\")]\"#,\n                input.ident\n            ))\n            .unwrap();\n        input.attrs.append(&mut attr);\n    }\n\n    if serialize {\n        let mut attr = Attribute::parse_outer\n            .parse_str(&format!(r#\"#[serde(into = \"__MultiKey{}\")]\"#, input.ident))\n            .unwrap();\n        input.attrs.append(&mut attr);\n    }\n\n    let utoipa = if has_utoipa_schema {\n        let main_ident = input.ident.clone();\n        let main_ident_str = main_ident.to_string();\n        
let proxy_ident = Ident::new(&format!(\"__MultiKey{}\", input.ident), input.ident.span());\n\n        Some(quote!(\n            impl<'__s> utoipa::ToSchema<'__s> for #main_ident {\n                fn schema() -> (\n                    &'__s str,\n                    utoipa::openapi::RefOr<utoipa::openapi::schema::Schema>,\n                ) {\n                    (\n                        #main_ident_str,\n                        <#proxy_ident as utoipa::ToSchema>::schema().1,\n                    )\n                }\n            }\n        ))\n    } else {\n        None\n    };\n\n    Ok(quote!(\n        #input\n\n        #utoipa\n    ))\n}\n\n/// Generate the proxy struct. It is a copy of the original struct, but fields marked\n/// with `serde_multikey` have been replaced with the fields the correspond to.\n/// Also generate TryFrom/Into as required.\nfn generate_proxy_struct(mut input: ItemStruct) -> Result<TokenStream2, Error> {\n    let main_ident = input.ident.clone();\n    let proxy_ident = Ident::new(&format!(\"__MultiKey{}\", input.ident), input.ident.span());\n\n    input.ident = proxy_ident.clone();\n    input.vis = Visibility::Inherited;\n    // TODO wait for https://github.com/juhaku/utoipa/issues/704 to re-enable\n    // input.attrs.append(&mut Attribute::parse_outer\n    // .parse_str(&\"#[doc(hidden)]\")\n    // .unwrap());\n\n    let (ser, de) = get_ser_de(&input.attrs)?;\n\n    let mut pass_through = Vec::<Ident>::new();\n    let mut final_fields = Punctuated::<Field, Token![,]>::new();\n    let mut try_from_conv = Vec::<TokenStream2>::new();\n    let mut into_pre_conv = Vec::<TokenStream2>::new();\n    let mut into_in_conv = Vec::<TokenStream2>::new();\n\n    let Fields::Named(FieldsNamed { brace_token, named }) = input.fields else {\n        return Err(Error::new(\n            Span::call_site(),\n            \"`serde_multikey` was applied to a tuple-struct or an empty struct\",\n        ));\n    };\n    for pair in named.into_pairs() {\n       
 let (mut field, ponct) = pair.into_tuple();\n        // we are in a \"normal\" struct, not a tuple-struct, unwrap is fine.\n        let field_name = field.ident.clone().unwrap();\n\n        let (field_config, attrs) = parse_attributes(field.attrs, &field_name)?;\n        field.attrs = attrs;\n\n        if let Some(field_config) = field_config {\n            let value = Ident::new(\"value\", Span::call_site());\n            for field in &field_config.proxy_fields {\n                final_fields.push(field.clone());\n            }\n            match (ser, field_config.get_into(&value)) {\n                (true, Some((pre_conv, in_conv))) => {\n                    into_pre_conv.push(pre_conv);\n                    into_in_conv.push(in_conv);\n                }\n                (false, None) => (),\n                (true, None) => {\n                    return Err(Error::new(\n                        field_name.span(),\n                        \"structure implement serialize but no serializer defined\",\n                    ));\n                }\n                (false, Some(_)) => {\n                    return Err(Error::new(\n                        field_name.span(),\n                        \"structure doesn't implement serialize but a serializer is defined\",\n                    ));\n                }\n            }\n            match (de, field_config.get_try_from(&value)) {\n                (true, Some(conv)) => {\n                    try_from_conv.push(conv);\n                }\n                (false, None) => (),\n                (true, None) => {\n                    return Err(Error::new(\n                        field_name.span(),\n                        \"structure implement deserialize but no deserializer defined\",\n                    ));\n                }\n                (false, Some(_)) => {\n                    return Err(Error::new(\n                        field_name.span(),\n                        \"structure doesn't implement deserialize 
but a deserializer is defined\",\n                    ));\n                }\n            }\n        } else {\n            pass_through.push(field_name);\n            final_fields.push(field);\n            if let Some(ponct) = ponct {\n                final_fields.push_punct(ponct);\n            }\n        }\n    }\n    input.fields = Fields::Named(FieldsNamed {\n        brace_token,\n        named: final_fields,\n    });\n\n    let into = if ser {\n        Some(quote!(\n            impl From<#main_ident> for #proxy_ident {\n                fn from(value: #main_ident) -> #proxy_ident {\n                    #(#into_pre_conv)*\n                    #proxy_ident {\n                        #(#pass_through: value.#pass_through,)*\n                        #(#into_in_conv)*\n                    }\n                }\n            }\n        ))\n    } else {\n        None\n    };\n    let try_from = if de {\n        Some(quote!(\n            impl TryFrom<#proxy_ident> for #main_ident {\n                type Error = String;\n\n                fn try_from(value: #proxy_ident) -> Result<Self, Self::Error> {\n                    Ok(#main_ident {\n                        #(#pass_through: value.#pass_through,)*\n                        #(#try_from_conv)*\n                    })\n                }\n            }\n        ))\n    } else {\n        None\n    };\n    Ok(quote!(\n        #input\n\n        #into\n        #try_from\n    ))\n}\n\nfn get_ser_de(attributes: &[Attribute]) -> Result<(bool, bool), Error> {\n    let mut ser = false;\n    let mut de = false;\n\n    for attr in attributes {\n        if !attr.path().is_ident(\"derive\") {\n            continue;\n        }\n        let Meta::List(ref derives) = attr.meta else {\n            continue;\n        };\n        let derives =\n            Punctuated::<Path, Token![,]>::parse_terminated.parse2(derives.tokens.clone())?;\n\n        for path in derives.iter() {\n            ser |= path_equiv(path, &[\"serde\", 
\"Serialize\"]);\n            de |= path_equiv(path, &[\"serde\", \"Deserialize\"]);\n        }\n    }\n    Ok((ser, de))\n}\n\nfn get_and_remove_utoipa_schema(attributes: &mut [Attribute]) -> Result<bool, Error> {\n    let mut has_schema = false;\n    for attr in attributes {\n        if !attr.path().is_ident(\"derive\") {\n            continue;\n        }\n        let Meta::List(ref mut derives) = attr.meta else {\n            continue;\n        };\n\n        let derive_list =\n            Punctuated::<Path, Token![,]>::parse_terminated.parse2(derives.tokens.clone())?;\n        let mut new_derives = Punctuated::<Path, Token![,]>::new();\n        for path in derive_list {\n            if path_equiv(&path, &[\"utoipa\", \"ToSchema\"]) {\n                has_schema = true;\n            } else {\n                new_derives.push(path);\n            }\n        }\n        derives.tokens = quote!(#new_derives);\n    }\n\n    Ok(has_schema)\n}\n\nfn path_equiv(path: &Path, reference: &[&str]) -> bool {\n    if path.segments.is_empty() || reference.is_empty() {\n        return false;\n    }\n\n    path.segments\n        .iter()\n        .rev()\n        .zip(reference.iter().rev())\n        .fold(true, |equal, (path_part, ref_part)| {\n            equal && path_part.ident == ref_part\n        })\n}\n\n#[derive(Debug)]\nstruct MultiKeyOptions {\n    main_field_name: Ident,\n    deserializer: Option<Path>,\n    serializer: Option<Path>,\n    proxy_fields: Vec<Field>,\n}\n\nimpl MultiKeyOptions {\n    fn get_into(&self, this: &Ident) -> Option<(TokenStream2, TokenStream2)> {\n        if let Some(ref serializer) = self.serializer {\n            let field_names: Vec<_> = self\n                .proxy_fields\n                .iter()\n                .map(|field| field.ident.clone().unwrap())\n                .collect();\n            let main_field_name = &self.main_field_name;\n\n            let pre = quote!(\n                let (#(#field_names,)*) = 
#serializer(#this.#main_field_name);\n            );\n            let in_struct = quote!(\n                #(\n                    #field_names,\n                )*\n            );\n            Some((pre, in_struct))\n        } else {\n            None\n        }\n    }\n\n    fn get_try_from(&self, this: &Ident) -> Option<TokenStream2> {\n        if let Some(ref deserializer) = self.deserializer {\n            let field_names: Vec<_> = self\n                .proxy_fields\n                .iter()\n                .map(|field| field.ident.clone().unwrap())\n                .collect();\n            let main_field_name = &self.main_field_name;\n\n            Some(quote!(\n                #main_field_name: match #deserializer( #(#this.#field_names,)* ) {\n                    Ok(val) => val,\n                    Err(e) => return Err(e.to_string()),\n                },\n            ))\n        } else {\n            None\n        }\n    }\n}\n\nenum MultiKeyOption {\n    Deserializer(Path),\n    Serializer(Path),\n    Fields(Vec<Field>),\n}\n\nimpl Parse for MultiKeyOption {\n    fn parse(input: ParseStream) -> Result<Self, Error> {\n        let ident: Ident = input.parse()?;\n        match ident.to_string().as_str() {\n            \"serializer\" => {\n                input.parse::<Token![=]>()?;\n                Ok(MultiKeyOption::Serializer(input.parse::<Path>()?))\n            }\n            \"deserializer\" => {\n                input.parse::<Token![=]>()?;\n                Ok(MultiKeyOption::Deserializer(input.parse::<Path>()?))\n            }\n            \"fields\" => {\n                input.parse::<Token![=]>()?;\n                let content;\n                parenthesized!(content in input);\n                let fields = content.parse_terminated(Field::parse_named, Token![,])?;\n                Ok(MultiKeyOption::Fields(fields.into_iter().collect()))\n            }\n            _ => Err(Error::new(ident.span(), \"unknown field\")),\n        }\n    }\n}\n\nimpl 
Parse for MultiKeyOptions {\n    fn parse(input: ParseStream) -> Result<Self, Error> {\n        let mut res = MultiKeyOptions {\n            main_field_name: Ident::new(\"tmp_name\", Span::call_site()),\n            deserializer: None,\n            serializer: None,\n            proxy_fields: Vec::new(),\n        };\n\n        let options = Punctuated::<MultiKeyOption, Token![,]>::parse_terminated(input)?;\n        for option in options {\n            match option {\n                MultiKeyOption::Deserializer(path) => {\n                    if res.deserializer.is_none() {\n                        res.deserializer = Some(path);\n                    } else {\n                        todo!(\"throw error\");\n                    }\n                }\n                MultiKeyOption::Serializer(path) => {\n                    if res.serializer.is_none() {\n                        res.serializer = Some(path);\n                    } else {\n                        todo!(\"throw error\");\n                    }\n                }\n                MultiKeyOption::Fields(fields) => {\n                    if res.proxy_fields.is_empty() {\n                        res.proxy_fields = fields;\n                    } else {\n                        todo!(\"throw error\");\n                    }\n                }\n            }\n        }\n\n        if res.proxy_fields.is_empty() {\n            todo!(\"throw error\")\n        }\n\n        Ok(res)\n    }\n}\n\nfn parse_attributes(\n    attributes: Vec<Attribute>,\n    field_name: &Ident,\n) -> Result<(Option<MultiKeyOptions>, Vec<Attribute>), Error> {\n    let (mut multikey_attributes, normal_attributes): (Vec<_>, _) = attributes\n        .into_iter()\n        .partition(|attr| attr.path().is_ident(\"serde_multikey\"));\n\n    if multikey_attributes.len() > 1 {\n        let last = multikey_attributes.last().unwrap();\n        return Err(Error::new(\n            last.pound_token.spans[0],\n            \"`serde_multikey` was applied 
multiple time to the same field\",\n        ));\n    }\n    let options = if let Some(multikey_attribute) = multikey_attributes.pop() {\n        let Meta::List(meta_list) = multikey_attribute.meta else {\n            return Err(Error::new(\n                multikey_attribute.pound_token.spans[0],\n                \"`serde_multikey` require list-style arguments\",\n            ));\n        };\n        let mut options: MultiKeyOptions = syn::parse2(meta_list.tokens)?;\n        options.main_field_name = field_name.clone();\n        Some(options)\n    } else {\n        None\n    };\n\n    Ok((options, normal_attributes))\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/Cargo.toml",
    "content": "[package]\nname = \"quickwit-metastore\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nfutures = { workspace = true }\nhttp = { workspace = true }\nitertools = { workspace = true }\nmockall = { workspace = true, optional = true }\nonce_cell = { workspace = true }\nouroboros = { workspace = true }\nrand = { workspace = true }\nregex = { workspace = true }\nregex-syntax = { workspace = true }\nsea-query = { workspace = true, optional = true }\nsea-query-binder = { workspace = true, optional = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nserde_with = { workspace = true }\nsqlx = { workspace = true, optional = true }\ntempfile = { workspace = true, optional = true }\nthiserror = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntokio-stream = { workspace = true }\ntower = { workspace = true }\ntracing = { workspace = true }\nulid = { workspace = true, features = [\"serde\"] }\nuuid = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-query = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[dev-dependencies]\ndotenvy = { workspace = true }\nfutures = { workspace = true }\nhyper-util = { workspace = true }\nmd5 = { workspace = true }\nmockall = { workspace = true }\nrand = { workspace = true }\nserial_test = { workspace = true }\ntempfile = { workspace = true }\ntracing-subscriber = { workspace = true }\n\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, 
features = [\"testsuite\"] }\nquickwit-doc-mapper = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[features]\nci-test = []\npostgres = [\"quickwit-proto/postgres\", \"sea-query\", \"sea-query-binder\", \"sqlx\"]\ntestsuite = [\"mockall\", \"tempfile\", \"quickwit-config/testsuite\"]\n"
  },
  {
    "path": "quickwit/quickwit-metastore/README.md",
    "content": "# quickwit-metastore\n\n## Starting postgres\n\nThe following command starts a PostgreSQL server\nlocally to test the postgres metastore implementation.\n\n`docker-compose up postgres`\n\nIts data is saved in the tmp directory, and\nis not necessarily cleaned up between two runs.\n\nYou can execute `make rm-postgres` to remove the\ndata of this PostgreSQL database.\n\n## Testing quickwit-metastore\n\nTo test FileBackedMetastore only, use the following command.\n\n```\n$ cargo test\n```\n\nTo include PostgresqlMetastore in the tests, you need to start PostgreSQL beforehand.\nStart PostgreSQL for testing with the following command from the `quickwit` project root.\n\n```\n$ make docker-compose-up DOCKER_SERVICES=postgres\n```\n\nOnce PostgreSQL is up and running, you can run tests including PostgresqlMetastore with the following command.\n\n```\n$ cargo test --features=postgres\n```\n\nYou can stop PostgreSQL with the following command.\n\n```\n$ docker-compose down\n```\n\n## Sqlx-cli and migrations\n\nsqlx-cli can be useful (but is not required) for working with migrations.\n\n```\ncargo install sqlx-cli\n```\n\nYou can then use the following commands to apply/revert your PostgreSQL migrations.\n```\nsqlx migrate run --database-url postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev --source migrations/postgresql\nsqlx migrate revert --database-url postgres://quickwit-dev:quickwit-dev@localhost:5432/quickwit-metastore-dev --source migrations/postgresql\n```\n"
  },
  {
    "path": "quickwit/quickwit-metastore/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nfn main() {\n    println!(\"cargo:rerun-if-changed=migrations/postgresql\");\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/10_add-split-incarnation-id.down.sql",
    "content": "ALTER TABLE indexes ALTER COLUMN index_uid TYPE VARCHAR(64);\nALTER TABLE indexes ALTER COLUMN index_id TYPE VARCHAR(50);\nALTER TABLE splits ALTER COLUMN index_uid TYPE VARCHAR(64);\nALTER TABLE delete_tasks ALTER COLUMN index_uid TYPE VARCHAR(64);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/10_add-split-incarnation-id.up.sql",
    "content": "ALTER TABLE indexes ALTER COLUMN index_uid TYPE VARCHAR(282);\nALTER TABLE indexes ALTER COLUMN index_id TYPE VARCHAR(255);\nALTER TABLE splits ALTER COLUMN index_uid TYPE VARCHAR(282);\nALTER TABLE delete_tasks ALTER COLUMN index_uid TYPE VARCHAR(282);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/11_add-split-maturity-timestamp-field.down.sql",
    "content": "ALTER TABLE splits\n  DROP COLUMN maturity_timestamp;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/11_add-split-maturity-timestamp-field.up.sql",
    "content": "ALTER TABLE splits\n    ADD COLUMN maturity_timestamp TIMESTAMP DEFAULT TO_TIMESTAMP(0);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/12_create-shards.down.sql",
    "content": "DROP TABLE shards;\n\nDROP TYPE IF EXISTS SHARD_STATE;"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/12_create-shards.up.sql",
    "content": "CREATE TYPE SHARD_STATE AS ENUM ('unspecified', 'open', 'unavailable', 'closed');\n\nCREATE TABLE IF NOT EXISTS shards (\n    index_uid VARCHAR(282) NOT NULL,\n    source_id VARCHAR(255) NOT NULL,\n    shard_id BIGSERIAL,\n    leader_id VARCHAR(255) NOT NULL,\n    follower_id VARCHAR(255),\n    shard_state SHARD_STATE NOT NULL,\n    publish_position_inclusive VARCHAR(255) NOT NULL,\n    publish_token VARCHAR(255),\n    PRIMARY KEY (index_uid, source_id, shard_id),\n    FOREIGN KEY (index_uid) REFERENCES indexes (index_uid)\n);"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/13_migrate-otel-indexes-v0_6.down.sql",
    "content": "UPDATE \n   indexes\nSET \n   index_metadata_json = REPLACE(\n      REPLACE(index_metadata_json, '\"output_format\":\"hex\"', '\"output_format\":\"base64\"'),\n      '\"input_format\":\"hex\"', '\"input_format\":\"base64\"'\n   )\nWHERE \n    index_id in ('otel-logs-v0_6', 'otel-traces-v0_6');\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/13_migrate-otel-indexes-v0_6.up.sql",
    "content": "UPDATE \n   indexes\nSET \n   index_metadata_json = REPLACE(\n      REPLACE(index_metadata_json, '\"output_format\":\"base64\"', '\"output_format\":\"hex\"'),\n      '\"input_format\":\"base64\"', '\"input_format\":\"hex\"'\n   )\nWHERE \n    index_id in ('otel-logs-v0_6', 'otel-traces-v0_6');\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/14_update-shard-id.down.sql",
    "content": "ALTER TABLE shards\n    ALTER COLUMN shard_id TYPE BIGSERIAL,\n    ALTER COLUMN shard_id DROP NOT NULL,\n    ALTER COLUMN shard_state DROP DEFAULT,\n    ALTER COLUMN publish_position_inclusive DROP DEFAULT,\n    DROP CONSTRAINT shards_index_uid_fkey,\n    ADD CONSTRAINT shards_index_uid_fkey FOREIGN KEY (index_uid) REFERENCES indexes(index_uid)\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/14_update-shard-id.up.sql",
    "content": "ALTER TABLE shards\n    ALTER COLUMN shard_id TYPE VARCHAR(255),\n    ALTER COLUMN shard_id SET NOT NULL,\n    ALTER COLUMN shard_state SET DEFAULT 'open',\n    ALTER COLUMN publish_position_inclusive SET DEFAULT '',\n    DROP CONSTRAINT shards_index_uid_fkey,\n    ADD CONSTRAINT shards_index_uid_fkey FOREIGN KEY (index_uid) REFERENCES indexes(index_uid) ON DELETE CASCADE\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/15_create-templates.down.sql",
    "content": "DROP TABLE index_templates;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/15_create-templates.up.sql",
    "content": "CREATE TABLE IF NOT EXISTS index_templates (\n    template_id VARCHAR(255) NOT NULL,\n    positive_index_id_patterns VARCHAR(255)[] NOT NULL,\n    negative_index_id_patterns VARCHAR(255)[] NOT NULL,\n    priority INTEGER NOT NULL DEFAULT 0,\n    index_template_json TEXT NOT NULL,\n    PRIMARY KEY (template_id)\n);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/16_create-index-split-uid.down.sql",
    "content": "DROP INDEX IF EXISTS splits_index_uid_idx;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/16_create-index-split-uid.up.sql",
    "content": "CREATE INDEX IF NOT EXISTS splits_index_uid_idx ON splits USING HASH(index_uid);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/17_create-index-split-timestamp.down.sql",
    "content": "DROP INDEX IF EXISTS splits_time_range_start_idx;\nDROP INDEX IF EXISTS splits_time_range_end_idx;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/17_create-index-split-timestamp.up.sql",
    "content": "CREATE INDEX IF NOT EXISTS splits_time_range_start_idx ON splits (time_range_start);\nCREATE INDEX IF NOT EXISTS splits_time_range_end_idx ON splits (time_range_end);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/18_create-index-shard-index-uid.down.sql",
    "content": "CREATE INDEX IF NOT EXISTS shards_index_uid_idx ON shards USING HASH(index_uid);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/18_create-index-shard-index-uid.up.sql",
    "content": "DROP INDEX IF EXISTS shards_index_uid_idx;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/19_add-split-node-id-field.down.sql",
    "content": "DROP INDEX IF EXISTS splits_node_id_idx;\n\nALTER TABLE splits\n    DROP COLUMN IF EXISTS node_id;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/19_add-split-node-id-field.up.sql",
    "content": "ALTER TABLE splits\n    ADD COLUMN node_id VARCHAR(253);\n\n-- Split metadata has been stable for quite a while, so we allow ourselves to do this,\n-- but please, reader of the future, do not reapply this pattern without careful consideration.\nUPDATE\n    splits\nSET\n    node_id = splits.split_metadata_json::json ->> 'node_id';\n\nALTER TABLE splits\n    ALTER COLUMN node_id SET NOT NULL;\n\nCREATE INDEX IF NOT EXISTS splits_node_id_idx ON splits USING HASH (node_id);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/1_create-indexes.down.sql",
    "content": "DROP TABLE indexes;\n\nDROP FUNCTION IF EXISTS quickwit_manage_update_timestamp(_tbl regclass);\nDROP FUNCTION IF EXISTS quickwit_set_update_timestamp();\n\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/1_create-indexes.up.sql",
    "content": "DO $$\nBEGIN\n    IF EXISTS (SELECT * FROM pg_tables WHERE tablename  = '__diesel_schema_migrations')\n\tTHEN\n\t    -- We are migrating from a diesel table.\n\t    -- That's ok, but let's make sure we are at the last version.\n\t    --\n\t    -- If you hit this Assert, the workaround is to download Quickwit 0.3.1\n\t    -- and run the missing migrations.\n\t    ASSERT EXISTS (\n\t\t    SELECT FROM __diesel_schema_migrations\n\t\t    WHERE version = '20211217102648'\n\t\t);\n\t\tDROP TABLE __diesel_schema_migrations;\n\tEND IF;\nEND $$;\n\n\nCREATE TABLE IF NOT EXISTS indexes (\n    index_id VARCHAR(50) PRIMARY KEY,\n    index_metadata_json TEXT NOT NULL,\n    create_timestamp TIMESTAMP NOT NULL DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),\n    update_timestamp TIMESTAMP NOT NULL DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')\n);\n\nCREATE OR REPLACE FUNCTION quickwit_manage_update_timestamp(_tbl regclass) RETURNS VOID AS $$\nBEGIN\n    EXECUTE format('DROP TRIGGER IF EXISTS set_update_timestamp ON %s CASCADE', _tbl);\n    EXECUTE format('CREATE TRIGGER set_update_timestamp BEFORE UPDATE ON %s\n                    FOR EACH ROW EXECUTE PROCEDURE quickwit_set_update_timestamp()', _tbl);\nEND;\n$$ LANGUAGE plpgsql;\n\nCREATE OR REPLACE FUNCTION quickwit_set_update_timestamp() RETURNS trigger AS $$\nBEGIN\n    IF (\n        NEW IS DISTINCT FROM OLD AND\n        NEW.update_timestamp IS NOT DISTINCT FROM OLD.update_timestamp\n    ) THEN\n        NEW.update_timestamp := (CURRENT_TIMESTAMP AT TIME ZONE 'UTC');\n    END IF;\n    RETURN NEW;\nEND;\n$$ LANGUAGE plpgsql;\n\n-- Apply the `update_timestamp` trigger to the `indexes` table\nSELECT quickwit_manage_update_timestamp('indexes');\n\n-- We also want to update an index `update_timestamp` field whenever a related split\n-- is modified.\nCREATE OR REPLACE FUNCTION set_index_update_timestamp_for_split() RETURNS trigger AS $$\nBEGIN\n    IF (TG_OP = 'INSERT' OR TG_OP = 'UPDATE') THEN\n        UPDATE 
indexes SET update_timestamp = NEW.update_timestamp\n        WHERE indexes.index_id = NEW.index_id;\n    ELSIF (TG_OP = 'DELETE') THEN\n        UPDATE indexes SET update_timestamp = (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')\n        WHERE indexes.index_id = OLD.index_id;\n    END IF;\n    RETURN NULL;\nEND;\n$$ LANGUAGE plpgsql;\n\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/20_add-shard-doc-mapping-uid-field.down.sql",
    "content": "ALTER TABLE shards\n    DROP COLUMN IF EXISTS doc_mapping_uid;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/20_add-shard-doc-mapping-uid-field.up.sql",
    "content": "ALTER TABLE shards\n    ADD COLUMN IF NOT EXISTS doc_mapping_uid VARCHAR(26);\n\n-- Index metadata has been stable for quite a while, so we allow ourselves to do this,\n-- but please, reader of the future, do not reapply this pattern without careful consideration.\nUPDATE\n    shards\nSET\n    doc_mapping_uid = '00000000000000000000000000';\n\nALTER TABLE shards\n    ALTER COLUMN doc_mapping_uid SET NOT NULL;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/21_add-shard-update-timestamp-field.down.sql",
    "content": "ALTER TABLE shards\n    DROP IF EXISTS update_timestamp;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/21_add-shard-update-timestamp-field.up.sql",
    "content": "ALTER TABLE shards\n    -- We prefer a fixed value here because it makes tests simpler.\n    -- Very few users use the shard API in versions <0.9 anyway.\n    ADD COLUMN IF NOT EXISTS update_timestamp TIMESTAMP NOT NULL DEFAULT '2024-01-01 00:00:00+00';\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/22_change-splits-pkey.down.sql",
    "content": "CREATE INDEX IF NOT EXISTS splits_index_uid_idx ON splits USING HASH(index_uid);\nALTER TABLE splits DROP CONSTRAINT splits_pkey, ADD PRIMARY KEY (split_id);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/22_change-splits-pkey.up.sql",
    "content": "ALTER TABLE splits DROP CONSTRAINT splits_pkey, ADD PRIMARY KEY (index_uid, split_id);\nDROP INDEX IF EXISTS splits_index_uid_idx;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/23_change-indexes-unique-index.down.sql",
    "content": "DROP INDEX IF EXISTS indexes_index_id_unique;\nALTER TABLE indexes ADD CONSTRAINT indexes_index_id_unique UNIQUE (index_id);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/23_change-indexes-unique-index.up.sql",
    "content": "ALTER TABLE indexes DROP CONSTRAINT IF EXISTS indexes_index_id_unique;\n\nCREATE UNIQUE INDEX IF NOT EXISTS indexes_index_id_unique\n  ON indexes USING btree (\"index_id\" varchar_pattern_ops);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/24_add-arbitrary-kv.down.sql",
    "content": "DROP TABLE kv;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/24_add-arbitrary-kv.up.sql",
    "content": "CREATE TABLE IF NOT EXISTS kv (\n    key VARCHAR(50) PRIMARY KEY,\n    value TEXT NOT NULL\n);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/25_add-split-size.down.sql",
    "content": "DROP INDEX IF EXISTS idx_splits_stats;\n\nALTER TABLE splits DROP COLUMN IF EXISTS split_size_bytes;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/25_add-split-size.up.sql",
    "content": "ALTER TABLE splits ADD COLUMN IF NOT EXISTS split_size_bytes BIGINT NOT NULL GENERATED ALWAYS AS ((split_metadata_json::json->'footer_offsets'->>'end')::bigint) STORED;\n\nCREATE INDEX IF NOT EXISTS idx_splits_stats ON splits (index_uid, split_state) INCLUDE (split_size_bytes);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/2_create-splits.down.sql",
    "content": "DROP TABLE splits;\n\nDROP FUNCTION IF EXISTS set_index_update_timestamp_for_split();\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/2_create-splits.up.sql",
    "content": "CREATE TABLE IF NOT EXISTS splits (\n    split_id VARCHAR(50) PRIMARY KEY,\n    split_state VARCHAR(30) NOT NULL,\n    time_range_start BIGINT,\n    time_range_end BIGINT,\n    tags TEXT[] NOT NULL,\n    split_metadata_json TEXT NOT NULL,\n    index_id VARCHAR(50) NOT NULL,\n    create_timestamp TIMESTAMP NOT NULL DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),\n    update_timestamp TIMESTAMP NOT NULL DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),\n\n    FOREIGN KEY(index_id) REFERENCES indexes(index_id)\n);\n\nDROP TRIGGER IF EXISTS quickwit_set_index_update_timestamp_on_split_change ON splits CASCADE;\nCREATE TRIGGER quickwit_set_index_update_timestamp_on_split_change\n    AFTER INSERT OR DELETE OR UPDATE ON splits\n    FOR EACH ROW\n    EXECUTE PROCEDURE set_index_update_timestamp_for_split();\n\n-- We also want to update an index `update_timestamp` field whenever a related split\n-- is modified.\nCREATE OR REPLACE FUNCTION set_index_update_timestamp_for_split() RETURNS trigger AS $$\nBEGIN\n    IF (TG_OP = 'INSERT' OR TG_OP = 'UPDATE') THEN\n        UPDATE indexes SET update_timestamp = NEW.update_timestamp\n        WHERE indexes.index_id = NEW.index_id;\n    ELSIF (TG_OP = 'DELETE') THEN\n        UPDATE indexes SET update_timestamp = (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')\n        WHERE indexes.index_id = OLD.index_id;\n    END IF;\n    RETURN NULL;\nEND;\n\n$$ LANGUAGE plpgsql;\n\n\n-- apply the trigger to the `splits` table\nSELECT quickwit_manage_update_timestamp('splits');\n\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/3_add-split-publish-timestamp-field.down.sql",
    "content": "ALTER TABLE splits\n  DROP COLUMN publish_timestamp;\n\n\nDROP FUNCTION IF EXISTS set_split_publish_timestamp_on_split_publish();\nDROP TRIGGER IF EXISTS set_split_publish_timestamp_on_split_publish ON splits CASCADE;\nDROP FUNCTION IF EXISTS set_split_publish_timestamp_for_split(); \n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/3_add-split-publish-timestamp-field.up.sql",
    "content": "ALTER TABLE splits\n    ADD COLUMN publish_timestamp TIMESTAMP DEFAULT NULL;\n\n-- We want to update the split `publish_timestamp` field whenever the split\n-- being is published.\nCREATE OR REPLACE FUNCTION set_split_publish_timestamp_for_split() RETURNS trigger AS $$\nBEGIN\n    IF (TG_OP = 'UPDATE') AND (NEW.split_state = 'Published') AND (OLD.split_state = 'Staged') THEN\n        NEW.publish_timestamp := (CURRENT_TIMESTAMP AT TIME ZONE 'UTC');\n    END IF;\n    RETURN NEW;\nEND;\n\n$$ LANGUAGE plpgsql;\n\nDROP TRIGGER IF EXISTS set_split_publish_timestamp_on_split_publish ON splits CASCADE;\nCREATE TRIGGER set_split_publish_timestamp_on_split_publish\n    BEFORE UPDATE ON splits\n    FOR EACH ROW\n    EXECUTE PROCEDURE set_split_publish_timestamp_for_split();\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/4_create-delete_tasks.down.sql",
    "content": "DROP TABLE delete_tasks;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/4_create-delete_tasks.up.sql",
    "content": "CREATE TABLE IF NOT EXISTS delete_tasks (\n    create_timestamp TIMESTAMP NOT NULL DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),\n    opstamp BIGSERIAL PRIMARY KEY,\n    index_id VARCHAR(50) NOT NULL,\n    delete_query_json TEXT NOT NULL,\n\n    FOREIGN KEY(index_id) REFERENCES indexes(index_id) ON DELETE CASCADE\n);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/5_add-delete-opstamp-splits.down.sql",
    "content": "ALTER TABLE splits DROP COLUMN delete_opstamp;"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/5_add-delete-opstamp-splits.up.sql",
    "content": "ALTER TABLE splits ADD COLUMN delete_opstamp BIGINT CHECK (delete_opstamp >= 0) DEFAULT 0;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/6_delete-update-index-update-timestamp-on-split-update-trigger.up.sql",
    "content": "DROP TRIGGER IF EXISTS quickwit_set_index_update_timestamp_on_split_change ON splits CASCADE;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/7_delete-split-table-triggers.up.sql",
    "content": "DROP TRIGGER IF EXISTS set_split_publish_timestamp_on_split_publish ON splits CASCADE;\nDROP TRIGGER IF EXISTS set_update_timestamp ON splits CASCADE;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/8_delete-update-timestamp-on-indexes-table.up.sql",
    "content": "ALTER TABLE indexes DROP COLUMN IF EXISTS update_timestamp;\nDROP TRIGGER IF EXISTS set_update_timestamp ON indexes;"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/9_add-split-incarnation-id.down.sql",
    "content": "\nALTER TABLE delete_tasks DROP CONSTRAINT IF EXISTS delete_tasks_index_uid_fkey;\nUPDATE delete_tasks set index_uid = split_part(index_uid,':',1);\nALTER TABLE delete_tasks ALTER COLUMN index_uid TYPE VARCHAR(50);\nALTER TABLE delete_tasks RENAME COLUMN index_uid TO index_id;\n\nALTER TABLE splits DROP CONSTRAINT IF EXISTS splits_index_uid_fkey;\nALTER TABLE splits ADD COLUMN incarnation_id VARCHAR(26) NOT NULL DEFAULT '00000000000000000000000000';\nUPDATE splits set index_uid = split_part(index_uid,':',1);\nALTER TABLE splits ALTER COLUMN index_uid TYPE VARCHAR(50);\nALTER TABLE splits RENAME COLUMN index_uid TO index_id;\n\nALTER TABLE indexes DROP COLUMN index_id;\nUPDATE indexes set index_uid = split_part(index_uid,':',1);\nALTER TABLE indexes ALTER COLUMN index_uid TYPE VARCHAR(50);\nALTER TABLE indexes RENAME COLUMN index_uid TO index_id;\n\nALTER TABLE delete_tasks ADD CONSTRAINT delete_tasks_index_id_fkey FOREIGN KEY (index_id) REFERENCES indexes(index_id) ON DELETE CASCADE;\nALTER TABLE splits ADD CONSTRAINT splits_index_id_fkey FOREIGN KEY (index_id) REFERENCES indexes(index_id) ON DELETE CASCADE;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/migrations/postgresql/9_add-split-incarnation-id.up.sql",
    "content": "ALTER TABLE indexes RENAME COLUMN index_id TO index_uid;\nALTER TABLE indexes ADD COLUMN index_id VARCHAR(50) NOT NULL DEFAULT '';\nUPDATE indexes set index_id = index_uid;\nALTER TABLE indexes ADD CONSTRAINT indexes_index_id_unique UNIQUE (index_id);\nALTER TABLE indexes ALTER COLUMN index_uid TYPE VARCHAR(64);\n\nALTER TABLE splits DROP CONSTRAINT IF EXISTS splits_index_id_fkey;\nALTER TABLE splits RENAME COLUMN index_id TO index_uid;\nALTER TABLE splits ALTER COLUMN index_uid TYPE VARCHAR(64);\nALTER TABLE splits ADD CONSTRAINT splits_index_uid_fkey FOREIGN KEY (index_uid) REFERENCES indexes(index_uid) ON DELETE CASCADE;\n\nALTER TABLE delete_tasks DROP CONSTRAINT IF EXISTS delete_tasks_index_id_fkey;\nALTER TABLE delete_tasks RENAME COLUMN index_id TO index_uid;\nALTER TABLE delete_tasks ALTER COLUMN index_uid TYPE VARCHAR(64);\nALTER TABLE delete_tasks ADD CONSTRAINT delete_tasks_index_uid_fkey FOREIGN KEY (index_uid) REFERENCES indexes(index_uid) ON DELETE CASCADE;\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/backward_compatibility_tests/README.md",
    "content": "See docs/internals/backward-compatibility.md.\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/backward_compatibility_tests/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fs;\nuse std::path::{Path, PathBuf};\n\nuse anyhow::{Context, bail};\nuse quickwit_config::{IndexConfig, IndexTemplate, SourceConfig, TestableForRegression};\nuse serde::{Deserialize, Serialize};\nuse serde_json::Value as JsonValue;\n\nuse crate::file_backed::file_backed_index::FileBackedIndex;\nuse crate::file_backed::manifest::Manifest;\nuse crate::{IndexMetadata, SplitMetadata};\n\n/// In order to avoid confusion, we need to make sure that the\n/// resource versions is the same for all resources.\n///\n/// We don't want to confuse quickwit users with different source_config /\n/// index_config versions.\n///\n/// If you bump this version, makes sure to update all resources.\n/// Of course some resource may not have any config change.\n///\n/// You can just reuse the same versioned object in that case.\n/// ```\n/// enum MyResource {\n///     #[serde(rename=\"0.1\")]\n///     V0_1(MyResourceV1),\n///     #[serde(rename=\"0.2\")]\n///     V0_2(MyResourceV1) //< there was no change in this version.\n/// }\nconst GLOBAL_QUICKWIT_RESOURCE_VERSION: &str = \"0.9\";\n\n/// This test makes sure that the resource is using the current `GLOBAL_QUICKWIT_RESOURCE_VERSION`.\nfn test_global_version<T: Serialize>(serializable: &T) -> anyhow::Result<()> {\n    let json = serde_json::to_value(serializable).unwrap();\n    let version_value = 
json.get(\"version\").context(\"no version tag\")?;\n    let version_str = version_value.as_str().context(\"version should be a str\")?;\n    if version_str != GLOBAL_QUICKWIT_RESOURCE_VERSION {\n        bail!(\n            \"version `{version_str}` is not the global quickwit resource version \\\n             ({GLOBAL_QUICKWIT_RESOURCE_VERSION})\"\n        );\n    }\n    Ok(())\n}\n\nfn deserialize_json_file<T>(path: &Path) -> anyhow::Result<T>\nwhere for<'a> T: Deserialize<'a> {\n    let payload = std::fs::read(path)?;\n    let deserialized: T = serde_json::from_slice(&payload)?;\n    Ok(deserialized)\n}\n\nfn test_backward_compatibility_single_case<T>(path: &Path) -> anyhow::Result<()>\nwhere T: TestableForRegression + std::fmt::Debug {\n    println!(\"---\\nTest deserialization of {}\", path.display());\n    let deserialized: T = deserialize_json_file(path)?;\n    let expected_path = path.to_string_lossy().replace(\".json\", \".expected.json\");\n    let expected: T = deserialize_json_file(Path::new(&expected_path))?;\n    println!(\"---\\nTest equality of {expected:?}\");\n    println!(\"---\\nwith {deserialized:?}\");\n    deserialized.assert_equality(&expected);\n    Ok(())\n}\n\n/// For each pair of `x.json` and `x.expected.json` in `test_dir`, assert that the deserialized\n/// versions are equal according to `T::assert_equality`.\nfn test_backward_compatibility<T>(test_dir: &Path) -> anyhow::Result<()>\nwhere T: TestableForRegression + std::fmt::Debug {\n    for entry in\n        fs::read_dir(test_dir).with_context(|| format!(\"failed to read {}\", test_dir.display()))?\n    {\n        let entry = entry?;\n        let path = entry.path();\n        if path.to_string_lossy().ends_with(\".expected.json\")\n            || path.to_string_lossy().ends_with(\".modified.json\")\n        {\n            continue;\n        }\n        test_backward_compatibility_single_case::<T>(&path)\n            .with_context(|| format!(\"test path {}\", path.display()))?;\n    }\n  
  Ok(())\n}\n\nfn test_and_update_expected_files_single_case<T>(expected_path: &Path) -> anyhow::Result<bool>\nwhere for<'a> T: std::fmt::Debug + Serialize + Deserialize<'a> {\n    let expected: T = deserialize_json_file(Path::new(&expected_path))?;\n    let expected_old_json_value: JsonValue = deserialize_json_file(Path::new(&expected_path))?;\n    let expected_new_json_value: JsonValue = serde_json::to_value(&expected)?;\n    // We compare json Value, so we don't detect format change like a change in the field order.\n    if expected_old_json_value == expected_new_json_value {\n        // No modification\n        return Ok(false);\n    }\n    println!(\"---\\nTest deserialization of {}\", expected_path.display());\n    println!(\"---\\nexpected {expected:?}\");\n    println!(\"---\\nwith {expected_new_json_value:?}\");\n    let mut expected_new_json = serde_json::to_string_pretty(&expected_new_json_value)?;\n    expected_new_json.push('\\n');\n    std::fs::write(\n        expected_path.with_extension(\"modified.json\"),\n        expected_new_json.as_bytes(),\n    )?;\n    Ok(true)\n}\n\n/// For versions different (older) than the current [GLOBAL_QUICKWIT_RESOURCE_VERSION],\n/// assert whether the expected.json files need to be changed.\n///\n/// Returns the proposed updated files (xxx.expected.modified.json).\nfn test_and_update_old_expected_files<T>(test_dir: &Path) -> anyhow::Result<Vec<PathBuf>>\nwhere for<'a> T: std::fmt::Debug + Deserialize<'a> + Serialize {\n    let mut updated_expected_files = Vec::new();\n    for entry in fs::read_dir(test_dir)? 
{\n        let entry = entry?;\n        let path = entry.path();\n        if !path.to_string_lossy().ends_with(\".expected.json\") {\n            continue;\n        }\n        if path.to_string_lossy().ends_with(&format!(\n            \"v{GLOBAL_QUICKWIT_RESOURCE_VERSION}.expected.json\"\n        )) {\n            continue;\n        }\n        if test_and_update_expected_files_single_case::<T>(&path)\n            .with_context(|| format!(\"test filepath {}\", path.display()))?\n        {\n            updated_expected_files.push(path.with_extension(\"modified.json\"));\n        }\n    }\n    Ok(updated_expected_files)\n}\n\n/// Asserts whether the serialized version of the `sample` is the same as the existing\n/// `v{GLOBAL_QUICKWIT_RESOURCE_VERSION}.json`.\n///\n/// Returns the created serialized files if they didn't exist (x.json and x.expected.json) or the\n/// proposed updated files (.modified.json) if they changed.\n///\n/// Both generated files have identical contents.\nfn test_and_create_new_test<T>(test_dir: &Path, sample: T) -> anyhow::Result<Vec<PathBuf>>\nwhere for<'a> T: Serialize {\n    let sample_json_value = serde_json::to_value(&sample)?;\n    let mut sample_json = serde_json::to_string_pretty(&sample_json_value)?;\n    sample_json.push('\\n');\n\n    let file_regression_test_path_str = format!(\n        \"{}/v{GLOBAL_QUICKWIT_RESOURCE_VERSION}.json\",\n        test_dir.display()\n    );\n    let mut file_regression_test_path = PathBuf::from(file_regression_test_path_str);\n\n    let (changes_detected, file_created) = if file_regression_test_path.try_exists()? 
{\n        let expected_old_json_value: JsonValue = deserialize_json_file(&file_regression_test_path)?;\n        let expected_new_json_value: JsonValue = serde_json::from_str(&sample_json)?;\n        (expected_old_json_value != expected_new_json_value, false)\n    } else {\n        (false, true)\n    };\n\n    let mut file_regression_expected_path =\n        file_regression_test_path.with_extension(\"expected.json\");\n\n    if !file_created {\n        file_regression_test_path = file_regression_test_path.with_extension(\"modified.json\");\n        file_regression_expected_path =\n            file_regression_expected_path.with_extension(\"modified.json\")\n    }\n\n    if changes_detected || file_created {\n        std::fs::write(&file_regression_test_path, sample_json.as_bytes())?;\n        std::fs::write(&file_regression_expected_path, sample_json.as_bytes())?;\n        Ok(vec![\n            file_regression_test_path,\n            file_regression_expected_path,\n        ])\n    } else {\n        Ok(vec![])\n    }\n}\n\n/// This helper function scans the `test-data/{test_name}` directory\n/// for JSON deserialization regression tests and runs them sequentially.\n///\n/// - `test_name` is just the subdirectory name for the type being tested.\npub(crate) fn test_json_backward_compatibility_helper<T>(test_name: &str) -> anyhow::Result<()>\nwhere T: TestableForRegression + std::fmt::Debug {\n    let sample_instance: T = T::sample_for_regression();\n    test_global_version(&sample_instance).unwrap();\n\n    let test_dir = Path::new(\"test-data\").join(test_name);\n    test_backward_compatibility::<T>(&test_dir).context(\"backward-compatibility\")?;\n    let updated_files =\n        test_and_update_old_expected_files::<T>(&test_dir).context(\"test-and-update\")?;\n\n    let mut updated_or_new_files = test_and_create_new_test::<T>(&test_dir, sample_instance)\n        .context(\"test-and-create-new-test\")?;\n\n    updated_or_new_files.extend(updated_files);\n\n    if 
!updated_or_new_files.is_empty() {\n        panic!(\n            \"Some files have been updated or created. Please check the diff and replace their \\\n             counterparts when appropriate: {updated_or_new_files:?}\"\n        );\n    }\n\n    Ok(())\n}\n\n#[test]\nfn test_split_metadata_backward_compatibility() {\n    test_json_backward_compatibility_helper::<SplitMetadata>(\"split-metadata\").unwrap();\n}\n\n#[test]\nfn test_index_metadata_backward_compatibility() {\n    test_json_backward_compatibility_helper::<IndexMetadata>(\"index-metadata\").unwrap();\n}\n\n#[test]\nfn test_index_config_global_version() {\n    let sample_instance = IndexConfig::sample_for_regression();\n    test_global_version(&sample_instance).unwrap();\n}\n\n#[test]\nfn test_source_config_global_version() {\n    let sample_instance = SourceConfig::sample_for_regression();\n    test_global_version(&sample_instance).unwrap();\n}\n\n#[test]\nfn test_file_backed_index_backward_compatibility() {\n    test_json_backward_compatibility_helper::<FileBackedIndex>(\"file-backed-index\").unwrap();\n}\n\n#[test]\nfn test_file_backed_metastore_manifest_backward_compatibility() {\n    test_json_backward_compatibility_helper::<Manifest>(\"manifest\").unwrap();\n}\n\n#[test]\nfn test_index_template_global_version() {\n    let sample_instance = IndexTemplate::sample_for_regression();\n    test_global_version(&sample_instance).unwrap();\n}\n\n/// Testing the tests\n///\n/// A simplified example that helps to understand the backward compatibility tests.\n#[cfg(test)]\nmod tests {\n    use std::panic::catch_unwind;\n\n    use serde_json::json;\n\n    use super::*;\n\n    #[derive(Serialize, Deserialize, Debug, Clone)]\n    #[serde(into = \"VersionedTestEntity\")]\n    #[serde(from = \"VersionedTestEntity\")]\n    struct TestEntity {\n        field_already_in_0_7: u16,\n        field_added_in_0_8: u16,\n    }\n\n    #[derive(Serialize, Deserialize, Debug, Clone)]\n    struct TestEntityV0_8 {\n        
field_already_in_0_7: u16,\n        field_added_in_0_8: u16,\n    }\n\n    #[derive(Deserialize, Debug, Clone)]\n    struct TestEntityV0_7 {\n        field_already_in_0_7: u16,\n    }\n\n    #[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n    #[serde(tag = \"version\")]\n    enum VersionedTestEntity {\n        #[serde(rename = \"0.9\")]\n        #[serde(alias = \"0.8\")]\n        V0_8(TestEntityV0_8),\n        #[serde(alias = \"0.7\", skip_serializing)]\n        V0_7(TestEntityV0_7),\n    }\n\n    impl From<VersionedTestEntity> for TestEntity {\n        fn from(versioned_test_entity: VersionedTestEntity) -> Self {\n            match versioned_test_entity {\n                VersionedTestEntity::V0_8(v0_9) => TestEntity {\n                    field_added_in_0_8: v0_9.field_added_in_0_8,\n                    field_already_in_0_7: v0_9.field_already_in_0_7,\n                },\n                VersionedTestEntity::V0_7(v0_7) => TestEntity {\n                    field_already_in_0_7: v0_7.field_already_in_0_7,\n                    field_added_in_0_8: 1,\n                },\n            }\n        }\n    }\n\n    impl From<TestEntity> for VersionedTestEntity {\n        fn from(test_entity: TestEntity) -> Self {\n            VersionedTestEntity::V0_8(TestEntityV0_8 {\n                field_added_in_0_8: test_entity.field_added_in_0_8,\n                field_already_in_0_7: test_entity.field_already_in_0_7,\n            })\n        }\n    }\n\n    impl TestableForRegression for TestEntity {\n        fn sample_for_regression() -> Self {\n            TestEntity {\n                field_added_in_0_8: 43,\n                field_already_in_0_7: 42,\n            }\n        }\n\n        fn assert_equality(&self, other: &Self) {\n            assert_eq!(self.field_added_in_0_8, other.field_added_in_0_8);\n            assert_eq!(self.field_already_in_0_7, other.field_already_in_0_7);\n        }\n    }\n\n    #[test]\n    fn 
test_test_json_backward_compatibility_helper_create() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let temp_path = temp_dir.path();\n\n        let json_sample_0_7 = json!({\"version\": \"0.7\", \"field_already_in_0_7\": 42});\n        let json_sample_0_8 = json!({\"version\": GLOBAL_QUICKWIT_RESOURCE_VERSION,\n\"field_already_in_0_7\": 42, \"field_added_in_0_8\": 43});\n\n        let json_sample_0_7_str = serde_json::to_string_pretty(&json_sample_0_7).unwrap();\n        let json_sample_0_8_str = serde_json::to_string_pretty(&json_sample_0_8).unwrap();\n\n        std::fs::write(temp_path.join(\"v0.7.json\"), json_sample_0_7_str.as_bytes()).unwrap();\n        std::fs::write(\n            temp_path.join(\"v0.7.expected.json\"),\n            json_sample_0_7_str.as_bytes(),\n        )\n        .unwrap();\n        std::fs::write(temp_path.join(\"v0.8.json\"), json_sample_0_8_str.as_bytes()).unwrap();\n        std::fs::write(\n            temp_path.join(\"v0.8.expected.json\"),\n            json_sample_0_8_str.as_bytes(),\n        )\n        .unwrap();\n\n        let test_panic = catch_unwind(|| {\n            test_json_backward_compatibility_helper::<TestEntity>(&temp_path.to_string_lossy())\n                .unwrap();\n        });\n        let test_panic_msg = format!(\n            \"{:?}\",\n            test_panic.unwrap_err().downcast::<String>().unwrap()\n        );\n        let latest_version_filename = format!(\"v{GLOBAL_QUICKWIT_RESOURCE_VERSION}.json\");\n        let latest_version_expected_filename =\n            format!(\"v{GLOBAL_QUICKWIT_RESOURCE_VERSION}.expected.json\");\n        assert!(test_panic_msg.contains(&latest_version_filename));\n        assert!(test_panic_msg.contains(&latest_version_expected_filename));\n        assert!(test_panic_msg.contains(\"v0.7.expected.modified.json\"));\n\n        // assert on the directory\n        let nb_files = fs::read_dir(temp_path).unwrap().count();\n        assert_eq!(nb_files, 4 + 3);\n        
let created_last_version =\n            deserialize_json_file::<JsonValue>(&temp_path.join(latest_version_filename)).unwrap();\n        assert_eq!(created_last_version, json_sample_0_8);\n        let created_expected_last_version =\n            deserialize_json_file::<JsonValue>(&temp_path.join(latest_version_expected_filename))\n                .unwrap();\n        assert_eq!(created_expected_last_version, json_sample_0_8);\n        let created_expected_modified_0_7 =\n            deserialize_json_file::<JsonValue>(&temp_path.join(\"v0.7.expected.modified.json\"))\n                .unwrap();\n        assert_eq!(\n            created_expected_modified_0_7,\n            json!({\n                \"version\": GLOBAL_QUICKWIT_RESOURCE_VERSION,\n                \"field_already_in_0_7\": 42,\n                // use TestEntity::From<VersionedTestEntity>\n                \"field_added_in_0_8\": 1,\n            })\n        );\n\n        // assert idempotency\n        let test_panic = catch_unwind(|| {\n            test_json_backward_compatibility_helper::<TestEntity>(&temp_path.to_string_lossy())\n                .unwrap();\n        });\n        test_panic.unwrap_err();\n        let nb_files = fs::read_dir(temp_path).unwrap().count();\n        assert_eq!(nb_files, 4 + 3);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/checkpoint.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Ordering;\nuse std::collections::BTreeMap;\nuse std::collections::btree_map::Entry;\nuse std::fmt;\nuse std::iter::FromIterator;\nuse std::ops::Range;\nuse std::sync::Arc;\n\nuse quickwit_proto::types::{Position, SourceId};\nuse serde::ser::SerializeMap;\nuse serde::{Deserialize, Serialize};\n/// Updates running indexing tasks in chitchat cluster state.\nuse thiserror::Error;\nuse tracing::{debug, warn};\n\n/// A `PartitionId` uniquely identifies a partition for a given source.\n#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Serialize, Deserialize, Hash)]\npub struct PartitionId(pub Arc<String>);\n\nimpl PartitionId {\n    /// Returns the partition ID as a `i64`.\n    pub fn as_i64(&self) -> Option<i64> {\n        self.0.parse::<i64>().ok()\n    }\n\n    /// Returns the partition ID as a `u64`.\n    pub fn as_u64(&self) -> Option<u64> {\n        self.0.parse().ok()\n    }\n\n    pub fn as_str(&self) -> &str {\n        &self.0\n    }\n}\n\nimpl fmt::Display for PartitionId {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"{}\", &self.0)\n    }\n}\n\nimpl From<String> for PartitionId {\n    fn from(partition_id_str: String) -> Self {\n        PartitionId(Arc::new(partition_id_str))\n    }\n}\n\nimpl From<&str> for PartitionId {\n    fn from(partition_id_str: &str) -> Self {\n        
PartitionId(Arc::new(partition_id_str.to_string()))\n    }\n}\n\nimpl From<u64> for PartitionId {\n    fn from(partition_id: u64) -> Self {\n        let partition_id_str = format!(\"{partition_id:0>20}\");\n        PartitionId(Arc::new(partition_id_str))\n    }\n}\n\nimpl From<i64> for PartitionId {\n    fn from(partition_id: i64) -> Self {\n        let partition_id_str = format!(\"{partition_id:0>20}\");\n        PartitionId(Arc::new(partition_id_str))\n    }\n}\n\n/// A partition delta represents an interval (from, to] over a partition of a source.\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize)]\npub struct PartitionDelta {\n    pub from: Position,\n    pub to: Position,\n}\n\n#[derive(Default, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct IndexCheckpoint {\n    #[serde(flatten)]\n    per_source: BTreeMap<SourceId, SourceCheckpoint>,\n}\n\nimpl fmt::Debug for IndexCheckpoint {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        let json = serde_json::to_string_pretty(&self).map_err(|_| fmt::Error)?;\n        write!(f, \"{json}\")?;\n        Ok(())\n    }\n}\n\nimpl From<BTreeMap<SourceId, SourceCheckpoint>> for IndexCheckpoint {\n    fn from(per_source: BTreeMap<SourceId, SourceCheckpoint>) -> Self {\n        Self { per_source }\n    }\n}\n\nimpl IndexCheckpoint {\n    /// Updates a checkpoint in place. 
Returns whether the checkpoint was modified.\n    ///\n    /// If the checkpoint delta is not compatible with the\n    /// current checkpoint, an error is returned, and the\n    /// checkpoint remains unchanged.\n    ///\n    /// See [`SourceCheckpoint::try_apply_delta`] for more details.\n    pub fn try_apply_delta(\n        &mut self,\n        delta: IndexCheckpointDelta,\n    ) -> Result<bool, IncompatibleCheckpointDelta> {\n        if delta.is_empty() {\n            return Ok(false);\n        }\n        self.per_source\n            .entry(delta.source_id)\n            .or_default()\n            .try_apply_delta(delta.source_delta)?;\n        Ok(true)\n    }\n\n    /// Resets the checkpoint of the source identified by `source_id`. Returns whether a mutation\n    /// occurred.\n    pub(crate) fn reset_source(&mut self, source_id: &str) -> bool {\n        self.per_source.remove(source_id).is_some()\n    }\n\n    /// Returns the checkpoint associated with a given source.\n    ///\n    /// All registered sources have an associated checkpoint (that is possibly empty).\n    ///\n    /// Some non-registered sources may also have a checkpoint (due to backward compatibility\n    /// and the ingest command).\n    pub fn source_checkpoint(&self, source_id: &str) -> Option<&SourceCheckpoint> {\n        self.per_source.get(source_id)\n    }\n\n    /// Adds a new source. 
If the source was already here, this\n    /// method returns successfully and does not override the existing checkpoint.\n    pub fn add_source(&mut self, source_id: &str) {\n        self.per_source.entry(source_id.to_string()).or_default();\n    }\n\n    /// Removes a source.\n    /// Returns successfully regardless of whether the source was present or not.\n    pub fn remove_source(&mut self, source_id: &str) {\n        self.per_source.remove(source_id);\n    }\n\n    /// Returns [`true`] if the checkpoint is empty.\n    pub fn is_empty(&self) -> bool {\n        self.per_source.is_empty()\n    }\n}\n\n/// A source checkpoint is a map of the last processed position for every partition.\n///\n/// If a partition is missing, it implicitly means that none of its messages\n/// have been processed.\n#[derive(Default, Clone, Eq, PartialEq)]\npub struct SourceCheckpoint {\n    per_partition: BTreeMap<PartitionId, Position>,\n}\nimpl SourceCheckpoint {\n    /// Adds a partition to the checkpoint.\n    pub fn add_partition(&mut self, partition_id: PartitionId, position: Position) {\n        self.per_partition.insert(partition_id, position);\n    }\n\n    /// Returns the number of partitions covered by the checkpoint.\n    pub fn num_partitions(&self) -> usize {\n        self.per_partition.len()\n    }\n\n    /// Returns [`true`] if the checkpoint is empty.\n    pub fn is_empty(&self) -> bool {\n        self.per_partition.is_empty()\n    }\n}\n\n/// Creates a checkpoint from an iterator of `(PartitionId, Position)` tuples.\n/// ```\n/// use quickwit_metastore::checkpoint::{SourceCheckpoint, PartitionId};\n/// use quickwit_proto::types::Position;\n///\n/// let checkpoint: SourceCheckpoint = [(0u64, 0u64), (1u64, 2u64)]\n///     .into_iter()\n///     .map(|(partition_id, offset)| {\n///         (PartitionId::from(partition_id), Position::offset(offset))\n///     })\n///     .collect();\n/// ```\nimpl FromIterator<(PartitionId, Position)> for SourceCheckpoint {\n    fn 
from_iter<I>(iter: I) -> SourceCheckpoint\n    where I: IntoIterator<Item = (PartitionId, Position)> {\n        SourceCheckpoint {\n            per_partition: iter.into_iter().collect(),\n        }\n    }\n}\n\nimpl Serialize for SourceCheckpoint {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: serde::Serializer {\n        let mut map = serializer.serialize_map(Some(self.per_partition.len()))?;\n        for (partition, position) in &self.per_partition {\n            map.serialize_entry(&*partition.0, position)?;\n        }\n        map.end()\n    }\n}\n\nimpl<'de> Deserialize<'de> for SourceCheckpoint {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: serde::Deserializer<'de> {\n        let string_to_string_map: BTreeMap<String, String> = BTreeMap::deserialize(deserializer)?;\n        let per_partition: BTreeMap<PartitionId, Position> = string_to_string_map\n            .into_iter()\n            .map(|(partition_id, position)| {\n                (PartitionId::from(partition_id), Position::from(position))\n            })\n            .collect();\n        Ok(SourceCheckpoint { per_partition })\n    }\n}\n\n/// Error returned when trying to apply a checkpoint delta to a checkpoint that is not\n/// compatible. 
That is, the checkpoint delta starts from a point before\n/// the checkpoint.\n#[derive(Clone, Debug, Error, Eq, PartialEq, Serialize, Deserialize)]\n#[error(\n    \"incompatible checkpoint delta at partition `{partition_id}`: end position is \\\n     `{partition_position:?}` (inclusive), whereas delta starts at `{delta_from_position:?}` \\\n     (exclusive)\"\n)]\npub struct IncompatibleCheckpointDelta {\n    /// The partition ID for which the incompatibility has been detected.\n    pub partition_id: PartitionId,\n    /// The current position (inclusive) within this partition.\n    pub partition_position: Position,\n    /// The start position (exclusive) for the delta.\n    pub delta_from_position: Position,\n}\n\n#[derive(Clone, Debug, Error, Serialize, Deserialize, PartialEq, Eq)]\npub enum PartitionDeltaError {\n    #[error(transparent)]\n    IncompatibleCheckpointDelta(#[from] IncompatibleCheckpointDelta),\n    #[error(\n        \"empty or negative delta at partition `{partition_id}`: {from_position:?} >= \\\n         {to_position:?}\"\n    )]\n    EmptyOrNegativeDelta {\n        /// A partition ID for which the empty or negative delta has been detected.\n        partition_id: PartitionId,\n        /// Delta from position.\n        from_position: Position,\n        /// Delta to position.\n        to_position: Position,\n    },\n}\n\nimpl SourceCheckpoint {\n    /// Returns the position reached for a given partition.\n    pub fn position_for_partition(&self, partition_id: &PartitionId) -> Option<&Position> {\n        self.per_partition.get(partition_id)\n    }\n\n    /// Returns an iterator with the reached position for each partition.\n    pub fn iter(&self) -> impl Iterator<Item = (PartitionId, Position)> + '_ {\n        self.per_partition\n            .iter()\n            .map(|(partition_id, position)| (partition_id.clone(), position.clone()))\n    }\n\n    pub fn check_compatibility(\n        &self,\n        delta: &SourceCheckpointDelta,\n    ) -> Result<(), 
IncompatibleCheckpointDelta> {\n        for (delta_partition, delta_position) in &delta.per_partition {\n            let Some(position) = self.per_partition.get(delta_partition) else {\n                continue;\n            };\n            match position.cmp(&delta_position.from) {\n                Ordering::Equal => {}\n                Ordering::Less => {\n                    warn!(cur_pos=?position, delta_pos_from=?delta_position.from, partition=?delta_partition, \"some positions were skipped\");\n                }\n                Ordering::Greater => {\n                    return Err(IncompatibleCheckpointDelta {\n                        partition_id: delta_partition.clone(),\n                        partition_position: position.clone(),\n                        delta_from_position: delta_position.from.clone(),\n                    });\n                }\n            }\n        }\n        Ok(())\n    }\n\n    /// Tries to apply a delta.\n    ///\n    /// We accept a delta as long as it comes after the current checkpoint,\n    /// for all partitions.\n    ///\n    /// We accept a delta that is not perfectly chained after a checkpoint,\n    /// as gaps may happen. 
For instance, with a Kafka source, a gap appears if the indexing\n    /// pipeline is down for longer than the retention period.\n    ///\n    ///   |    Checkpoint & Delta        | Outcome                     |\n    ///   |------------------------------|-----------------------------|\n    ///   |  (..a] (b..c] with a = b     | Compatible                  |\n    ///   |  (..a] (b..c] with b > a     | Compatible                  |\n    ///   |  (..a] (b..c] with b < a     | Incompatible                |\n    ///\n    /// If the delta is incompatible, returns an error without modifying the original checkpoint.\n    pub fn try_apply_delta(\n        &mut self,\n        delta: SourceCheckpointDelta,\n    ) -> Result<(), IncompatibleCheckpointDelta> {\n        self.check_compatibility(&delta)?;\n        debug!(delta=?delta, checkpoint=?self, \"applying delta to checkpoint\");\n        for (partition_id, partition_position) in delta.per_partition {\n            self.per_partition\n                .insert(partition_id, partition_position.to);\n        }\n        Ok(())\n    }\n}\n\nimpl fmt::Debug for SourceCheckpoint {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.write_str(\"Ckpt(\")?;\n        for (i, (partition_id, position)) in self.per_partition.iter().enumerate() {\n            f.write_str(&partition_id.0)?;\n            f.write_str(\":\")?;\n            write!(f, \"{position}\")?;\n            let is_last = i == self.per_partition.len() - 1;\n            if !is_last {\n                f.write_str(\" \")?;\n            }\n        }\n        f.write_str(\")\")?;\n        Ok(())\n    }\n}\n\n/// A checkpoint delta represents a checkpoint update.\n///\n/// It is shipped as part of a split to convey the update\n/// that should be applied to the index checkpoint once the split\n/// is published.\n///\n/// The `CheckpointDelta` ships, for each partition,\n/// not only a new position, but also an expected\n/// `from` position. 
This makes it possible to defensively check that\n/// we are not trying to add documents to the index that were already indexed.\n#[derive(Clone, PartialEq, Eq, Serialize, Deserialize)]\npub struct IndexCheckpointDelta {\n    pub source_id: SourceId,\n    pub source_delta: SourceCheckpointDelta,\n}\n\nimpl IndexCheckpointDelta {\n    pub fn is_empty(&self) -> bool {\n        self.source_delta.is_empty()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(source_id: &str, pos_range: Range<u64>) -> Self {\n        Self {\n            source_id: source_id.to_string(),\n            source_delta: SourceCheckpointDelta::from_range(pos_range),\n        }\n    }\n}\n\nimpl fmt::Debug for IndexCheckpointDelta {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"{}:{:?}\", &self.source_id, self.source_delta)?;\n        Ok(())\n    }\n}\n\n#[derive(Default, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct SourceCheckpointDelta {\n    per_partition: BTreeMap<PartitionId, PartitionDelta>,\n}\n\nimpl fmt::Debug for SourceCheckpointDelta {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.write_str(\"∆(\")?;\n        for (i, (partition_id, partition_delta)) in self.per_partition.iter().enumerate() {\n            write!(\n                f,\n                \"{}:({}..{}]\",\n                partition_id.0, partition_delta.from, partition_delta.to,\n            )?;\n            if i != self.per_partition.len() - 1 {\n                f.write_str(\" \")?;\n            }\n        }\n        f.write_str(\")\")?;\n        Ok(())\n    }\n}\n\nimpl TryFrom<Range<u64>> for SourceCheckpointDelta {\n    type Error = PartitionDeltaError;\n\n    fn try_from(range: Range<u64>) -> Result<Self, Self::Error> {\n        // Checkpoint deltas are expressed as `(from, to]` intervals, while ranges\n        // are `[start, end)` intervals.\n        let from_position = if range.start == 0 {\n            Position::Beginning\n  
      } else {\n            Position::offset(range.start - 1)\n        };\n        let to_position = if range.end == 0 {\n            Position::Beginning\n        } else {\n            Position::offset(range.end - 1)\n        };\n        SourceCheckpointDelta::from_partition_delta(\n            PartitionId::default(),\n            from_position,\n            to_position,\n        )\n    }\n}\n\nimpl SourceCheckpointDelta {\n    /// Used for tests only.\n    /// Panics if the range is not strictly increasing.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_range(range: Range<u64>) -> Self {\n        SourceCheckpointDelta::try_from(range).expect(\"Invalid position range\")\n    }\n\n    /// Creates a new checkpoint delta initialized with a single partition delta.\n    pub fn from_partition_delta(\n        partition_id: PartitionId,\n        from_position: Position,\n        to_position: Position,\n    ) -> Result<Self, PartitionDeltaError> {\n        let mut delta = SourceCheckpointDelta::default();\n        delta.record_partition_delta(partition_id, from_position, to_position)?;\n        Ok(delta)\n    }\n\n    /// Returns the checkpoint associated with the endpoint of the delta.\n    pub fn get_source_checkpoint(&self) -> SourceCheckpoint {\n        let mut source_checkpoint = SourceCheckpoint::default();\n        source_checkpoint.try_apply_delta(self.clone()).unwrap();\n        source_checkpoint\n    }\n\n    /// Returns an iterator of partition IDs and associated deltas.\n    pub fn iter(&self) -> impl Iterator<Item = (PartitionId, PartitionDelta)> + '_ {\n        self.per_partition\n            .iter()\n            .map(|(partition_id, partition_delta)| (partition_id.clone(), partition_delta.clone()))\n    }\n\n    /// Records a `(from, to]` partition delta for a given partition.\n    pub fn record_partition_delta(\n        &mut self,\n        partition_id: PartitionId,\n        from_position: Position,\n        to_position: Position,\n    ) -> 
Result<(), PartitionDeltaError> {\n        // `from_position == to_position` means delta is empty.\n        if from_position >= to_position {\n            return Err(PartitionDeltaError::EmptyOrNegativeDelta {\n                partition_id,\n                from_position,\n                to_position,\n            });\n        }\n        let entry = self.per_partition.entry(partition_id);\n        match entry {\n            Entry::Occupied(mut occupied_entry) => {\n                if occupied_entry.get().to == from_position {\n                    occupied_entry.get_mut().to = to_position;\n                } else {\n                    return Err(PartitionDeltaError::from(IncompatibleCheckpointDelta {\n                        partition_id: occupied_entry.key().clone(),\n                        partition_position: occupied_entry.get().to.clone(),\n                        delta_from_position: from_position,\n                    }));\n                }\n            }\n            Entry::Vacant(vacant_entry) => {\n                let partition_delta = PartitionDelta {\n                    from: from_position,\n                    to: to_position,\n                };\n                vacant_entry.insert(partition_delta);\n            }\n        }\n        Ok(())\n    }\n\n    /// Extends the current checkpoint delta in-place with the provided checkpoint delta.\n    ///\n    /// Contrary to checkpoint update, the two deltas here need to chain perfectly.\n    pub fn extend(&mut self, delta: SourceCheckpointDelta) -> Result<(), PartitionDeltaError> {\n        for (partition_id, partition_delta) in delta.per_partition {\n            self.record_partition_delta(partition_id, partition_delta.from, partition_delta.to)?;\n        }\n        Ok(())\n    }\n\n    /// Returns the number of partitions covered by the checkpoint delta.\n    pub fn num_partitions(&self) -> usize {\n        self.per_partition.len()\n    }\n\n    /// Returns an iterator over the partition_ids.\n    pub 
fn partitions(&self) -> impl Iterator<Item = &PartitionId> {\n        self.per_partition.keys()\n    }\n\n    /// Returns `true` if the checkpoint delta is empty.\n    pub fn is_empty(&self) -> bool {\n        self.per_partition.is_empty()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_delta_from_range() {\n        let checkpoint_delta = SourceCheckpointDelta::from_range(0..3);\n        assert_eq!(\n            format!(\"{checkpoint_delta:?}\"),\n            \"∆(:(..00000000000000000002])\"\n        );\n        let checkpoint_delta = SourceCheckpointDelta::from_range(1..4);\n        assert_eq!(\n            format!(\"{checkpoint_delta:?}\"),\n            \"∆(:(00000000000000000000..00000000000000000003])\"\n        );\n    }\n\n    #[test]\n    fn test_checkpoint_simple() {\n        let mut checkpoint = SourceCheckpoint::default();\n        assert_eq!(format!(\"{checkpoint:?}\"), \"Ckpt()\");\n\n        let delta = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(123u64),\n                Position::offset(128u64),\n            )\n            .unwrap();\n            delta\n                .record_partition_delta(\n                    PartitionId::from(\"b\"),\n                    Position::offset(60002u64),\n                    Position::offset(60187u64),\n                )\n                .unwrap();\n            delta\n        };\n        checkpoint.try_apply_delta(delta.clone()).unwrap();\n        assert_eq!(\n            format!(\"{checkpoint:?}\"),\n            \"Ckpt(a:00000000000000000128 b:00000000000000060187)\"\n        );\n        // `try_apply_delta` is not idempotent.\n        checkpoint.try_apply_delta(delta).unwrap_err();\n        assert_eq!(\n            format!(\"{checkpoint:?}\"),\n            \"Ckpt(a:00000000000000000128 b:00000000000000060187)\"\n        );\n    }\n\n    #[test]\n    fn 
test_partially_incompatible_does_not_update() -> anyhow::Result<()> {\n        let mut checkpoint = SourceCheckpoint::default();\n        let delta1 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(\"00123\"),\n                Position::offset(\"00128\"),\n            )\n            .unwrap();\n            delta.record_partition_delta(\n                PartitionId::from(\"b\"),\n                Position::offset(\"60002\"),\n                Position::offset(\"60187\"),\n            )?;\n            delta\n        };\n        assert!(checkpoint.try_apply_delta(delta1).is_ok());\n        let delta2 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(\"00128\"),\n                Position::offset(\"00129\"),\n            )\n            .unwrap();\n            delta.record_partition_delta(\n                PartitionId::from(\"b\"),\n                Position::offset(\"50099\"),\n                Position::offset(\"60002\"),\n            )?;\n            delta\n        };\n        assert!(matches!(\n            checkpoint.try_apply_delta(delta2),\n            Err(IncompatibleCheckpointDelta { .. 
})\n        ));\n        // checkpoint was unchanged\n        assert_eq!(format!(\"{checkpoint:?}\"), \"Ckpt(a:00128 b:60187)\");\n        Ok(())\n    }\n\n    #[test]\n    fn test_adding_new_partition() -> anyhow::Result<()> {\n        let mut checkpoint = SourceCheckpoint::default();\n        let delta1 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(\"00123\"),\n                Position::offset(\"00128\"),\n            )\n            .unwrap();\n            delta.record_partition_delta(\n                PartitionId::from(\"b\"),\n                Position::offset(\"60002\"),\n                Position::offset(\"60187\"),\n            )?;\n            delta\n        };\n        assert!(checkpoint.try_apply_delta(delta1).is_ok());\n        let delta3 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"b\"),\n                Position::offset(\"60187\"),\n                Position::offset(\"60190\"),\n            )\n            .unwrap();\n            delta.record_partition_delta(\n                PartitionId::from(\"c\"),\n                Position::offset(\"20001\"),\n                Position::offset(\"20008\"),\n            )?;\n            delta\n        };\n        assert!(checkpoint.try_apply_delta(delta3).is_ok());\n        assert_eq!(format!(\"{checkpoint:?}\"), \"Ckpt(a:00128 b:60190 c:20008)\");\n        Ok(())\n    }\n\n    #[test]\n    fn test_extend_checkpoint_delta() {\n        let mut delta1 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(\"00123\"),\n                Position::offset(\"00128\"),\n            )\n            .unwrap();\n            delta\n                .record_partition_delta(\n                    PartitionId::from(\"b\"),\n                    
Position::offset(\"60002\"),\n                    Position::offset(\"60187\"),\n                )\n                .unwrap();\n            delta\n        };\n        let delta2 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"b\"),\n                Position::offset(\"60187\"),\n                Position::offset(\"60348\"),\n            )\n            .unwrap();\n            delta\n                .record_partition_delta(\n                    PartitionId::from(\"c\"),\n                    Position::offset(\"20001\"),\n                    Position::offset(\"20008\"),\n                )\n                .unwrap();\n            delta\n        };\n        let delta3 = {\n            let mut delta = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(\"00123\"),\n                Position::offset(\"00128\"),\n            )\n            .unwrap();\n            delta\n                .record_partition_delta(\n                    PartitionId::from(\"b\"),\n                    Position::offset(\"60002\"),\n                    Position::offset(\"60348\"),\n                )\n                .unwrap();\n            delta\n                .record_partition_delta(\n                    PartitionId::from(\"c\"),\n                    Position::offset(\"20001\"),\n                    Position::offset(\"20008\"),\n                )\n                .unwrap();\n            delta\n        };\n        delta1.extend(delta2).unwrap();\n        assert_eq!(delta1, delta3);\n\n        let delta4 = SourceCheckpointDelta::from_partition_delta(\n            PartitionId::from(\"a\"),\n            Position::offset(\"00130\"),\n            Position::offset(\"00142\"),\n        )\n        .unwrap();\n        let result = delta1.extend(delta4);\n        assert_eq!(\n            result,\n            Err(PartitionDeltaError::from(IncompatibleCheckpointDelta {\n          
      partition_id: PartitionId::from(\"a\"),\n                partition_position: Position::offset(\"00128\"),\n                delta_from_position: Position::offset(\"00130\")\n            }))\n        );\n    }\n\n    #[test]\n    fn test_record_negative_partition_delta_is_failing() {\n        {\n            let delta_error = SourceCheckpointDelta::from_partition_delta(\n                PartitionId::from(\"a\"),\n                Position::offset(\"20\"),\n                Position::offset(\"20\"),\n            )\n            .unwrap_err();\n            // `matches!` alone only yields a bool; wrap it in `assert!` so the test checks it.\n            assert!(matches!(\n                delta_error,\n                PartitionDeltaError::EmptyOrNegativeDelta { .. }\n            ));\n        }\n        {\n            let mut delta = SourceCheckpointDelta::from_range(10..20);\n            let delta_error = delta\n                .record_partition_delta(\n                    PartitionId::from(\"a\"),\n                    Position::offset(\"20\"),\n                    Position::offset(\"10\"),\n                )\n                .unwrap_err();\n            assert!(matches!(\n                delta_error,\n                PartitionDeltaError::EmptyOrNegativeDelta { .. 
}\n            ));\n        }\n    }\n\n    #[test]\n    fn test_index_checkpoint() {\n        let mut index_checkpoint = IndexCheckpoint::default();\n        assert!(\n            index_checkpoint\n                .source_checkpoint(\"missing_source\")\n                .is_none()\n        );\n        index_checkpoint.add_source(\"existing_source_with_empty_checkpoint\");\n        assert!(\n            index_checkpoint\n                .source_checkpoint(\"existing_source_with_empty_checkpoint\")\n                .is_some()\n        );\n        index_checkpoint.remove_source(\"missing_source\"); //< we just check this does not fail\n        assert!(\n            index_checkpoint\n                .source_checkpoint(\"missing_source\")\n                .is_none()\n        );\n        assert!(\n            index_checkpoint\n                .source_checkpoint(\"existing_source_with_empty_checkpoint\")\n                .is_some()\n        );\n        index_checkpoint.remove_source(\"existing_source_with_empty_checkpoint\"); //< we just check this does not fail\n        assert!(\n            index_checkpoint\n                .source_checkpoint(\"existing_source_with_empty_checkpoint\")\n                .is_none()\n        );\n    }\n\n    #[test]\n    fn test_get_source_checkpoint() {\n        let partition = PartitionId::from(\"a\");\n        let delta = SourceCheckpointDelta::from_partition_delta(\n            partition.clone(),\n            Position::offset(42u64),\n            Position::offset(43u64),\n        )\n        .unwrap();\n        let checkpoint: SourceCheckpoint = delta.get_source_checkpoint();\n        assert_eq!(\n            checkpoint.position_for_partition(&partition).unwrap(),\n            &Position::offset(43u64)\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_proto::metastore::MetastoreError;\n\n/// Generic Storage Resolver error.\n#[derive(Debug, thiserror::Error)]\npub enum MetastoreResolverError {\n    /// The metastore config is invalid.\n    #[error(\"invalid metastore config: `{0}`\")]\n    InvalidConfig(String),\n\n    /// The URI does not contain sufficient information to connect to the metastore.\n    #[error(\"invalid metastore URI: `{0}`\")]\n    InvalidUri(String),\n\n    /// The requested backend is unsupported or unavailable.\n    #[error(\"unsupported metastore backend: `{0}`\")]\n    UnsupportedBackend(String),\n\n    /// The config and URI are valid, and are meant to be handled by this resolver, but the\n    /// resolver failed to actually connect to the backend. e.g. connection error, credentials\n    /// error, incompatible version, internal error in a third party, etc.\n    #[error(\"failed to connect to metastore: `{0}`\")]\n    Initialization(#[from] MetastoreError),\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![warn(missing_docs)]\n#![allow(clippy::bool_assert_comparison)]\n#![deny(clippy::disallowed_methods)]\n#![allow(rustdoc::invalid_html_tags)]\n\n//! `quickwit-metastore` is the abstraction used in Quickwit to interface itself to different\n//! metastore:\n//! - file-backed metastore\n//! - PostgreSQL metastore\n//! - etc.\n\n#[allow(missing_docs)]\npub mod checkpoint;\nmod error;\nmod metastore;\nmod metastore_factory;\nmod metastore_resolver;\nmod split_metadata;\nmod split_metadata_version;\n#[cfg(test)]\npub(crate) mod tests;\n\nuse std::ops::Range;\n\npub use error::MetastoreResolverError;\npub use metastore::control_plane_metastore::ControlPlaneMetastore;\npub use metastore::file_backed::FileBackedMetastore;\npub(crate) use metastore::index_metadata::serialize::{IndexMetadataV0_8, VersionedIndexMetadata};\n#[cfg(feature = \"postgres\")]\npub use metastore::postgres::PostgresqlMetastore;\npub use metastore::{\n    AddSourceRequestExt, CreateIndexRequestExt, CreateIndexResponseExt, IndexMetadata,\n    IndexMetadataResponseExt, IndexesMetadataResponseExt, ListIndexesMetadataResponseExt,\n    ListSplitsQuery, ListSplitsRequestExt, ListSplitsResponseExt, MetastoreServiceExt,\n    MetastoreServiceStreamSplitsExt, PublishSplitsRequestExt, StageSplitsRequestExt,\n    UpdateIndexRequestExt, UpdateSourceRequestExt, file_backed,\n};\npub use 
metastore_factory::{MetastoreFactory, UnsupportedMetastore};\npub use metastore_resolver::MetastoreResolver;\nuse quickwit_common::is_disjoint;\nuse quickwit_doc_mapper::tag_pruning::TagFilterAst;\npub use split_metadata::{Split, SplitInfo, SplitMaturity, SplitMetadata, SplitState};\npub(crate) use split_metadata_version::{SplitMetadataV0_8, VersionedSplitMetadata};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(components(schemas(\n    IndexMetadataV0_8,\n    Split,\n    SplitMetadataV0_8,\n    SplitState,\n    VersionedIndexMetadata,\n    VersionedSplitMetadata,\n)))]\n/// Schemas used for OpenAPI generation that are part of this crate.\npub struct MetastoreApiSchemas;\n\n/// Returns `true` if the split time range overlaps `time_range_opt`.\n/// Always returns `true` if `time_range_opt` is `None` or the split has no time range.\npub fn split_time_range_filter(\n    split_metadata: &SplitMetadata,\n    time_range_opt: Option<&Range<i64>>,\n) -> bool {\n    match (time_range_opt, split_metadata.time_range.as_ref()) {\n        (Some(filter_time_range), Some(split_time_range)) => {\n            !is_disjoint(filter_time_range, split_time_range)\n        }\n        _ => true, // Return `true` if `time_range` is omitted or the split has no time range.\n    }\n}\n/// Returns `true` if the tags filter evaluates to true for the split tags.\n/// Always returns `true` if `tags_filter_opt` is `None`.\npub fn split_tag_filter(\n    split_metadata: &SplitMetadata,\n    tags_filter_opt: Option<&TagFilterAst>,\n) -> bool {\n    tags_filter_opt\n        .map(|tags_filter_ast| tags_filter_ast.evaluate(&split_metadata.tags))\n        .unwrap_or(true)\n}\n\n#[cfg(test)]\nmod backward_compatibility_tests;\n\n#[cfg(any(test, feature = \"testsuite\"))]\n/// Returns a metastore backed by an \"in-memory file\" for testing.\npub fn metastore_for_test() -> quickwit_proto::metastore::MetastoreServiceClient {\n    quickwit_proto::metastore::MetastoreServiceClient::new(FileBackedMetastore::for_test(\n        
std::sync::Arc::new(quickwit_storage::RamStorage::default()),\n    ))\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/control_plane_metastore.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::control_plane::{ControlPlaneService, ControlPlaneServiceClient};\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AcquireShardsResponse, AddSourceRequest, CreateIndexRequest,\n    CreateIndexResponse, CreateIndexTemplateRequest, DeleteIndexRequest,\n    DeleteIndexTemplatesRequest, DeleteQuery, DeleteShardsRequest, DeleteShardsResponse,\n    DeleteSourceRequest, DeleteSplitsRequest, DeleteTask, EmptyResponse,\n    FindIndexTemplateMatchesRequest, FindIndexTemplateMatchesResponse, GetClusterIdentityRequest,\n    GetClusterIdentityResponse, GetIndexTemplateRequest, GetIndexTemplateResponse,\n    IndexMetadataRequest, IndexMetadataResponse, IndexesMetadataRequest, IndexesMetadataResponse,\n    LastDeleteOpstampRequest, LastDeleteOpstampResponse, ListDeleteTasksRequest,\n    ListDeleteTasksResponse, ListIndexStatsRequest, ListIndexStatsResponse,\n    ListIndexTemplatesRequest, ListIndexTemplatesResponse, ListIndexesMetadataRequest,\n    ListIndexesMetadataResponse, ListShardsRequest, ListShardsResponse, ListSplitsRequest,\n    ListSplitsResponse, ListStaleSplitsRequest, MarkSplitsForDeletionRequest, MetastoreResult,\n    MetastoreService, MetastoreServiceClient, MetastoreServiceStream, OpenShardsRequest,\n    OpenShardsResponse, 
PruneShardsRequest, PublishSplitsRequest, ResetSourceCheckpointRequest,\n    StageSplitsRequest, ToggleSourceRequest, UpdateIndexRequest, UpdateSourceRequest,\n    UpdateSplitsDeleteOpstampRequest, UpdateSplitsDeleteOpstampResponse,\n};\n\n/// A [`MetastoreService`] implementation that proxies some requests to the control plane so it can\n/// track the state of the metastore accurately and react to events in real-time.\n#[derive(Clone)]\npub struct ControlPlaneMetastore {\n    control_plane: ControlPlaneServiceClient,\n    metastore: MetastoreServiceClient,\n}\n\nimpl fmt::Debug for ControlPlaneMetastore {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"ControlPlaneMetastore\").finish()\n    }\n}\n\nimpl ControlPlaneMetastore {\n    /// Creates a new [`ControlPlaneMetastore`].\n    pub fn new(\n        control_plane: ControlPlaneServiceClient,\n        metastore: MetastoreServiceClient,\n    ) -> Self {\n        Self {\n            control_plane,\n            metastore,\n        }\n    }\n}\n\n#[async_trait]\nimpl MetastoreService for ControlPlaneMetastore {\n    fn endpoints(&self) -> Vec<Uri> {\n        self.metastore.endpoints()\n    }\n\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.metastore.check_connectivity().await\n    }\n\n    // Proxied metastore API calls.\n\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> MetastoreResult<CreateIndexResponse> {\n        let response = self.control_plane.create_index(request).await?;\n        Ok(response)\n    }\n\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> MetastoreResult<IndexMetadataResponse> {\n        let response = self.control_plane.update_index(request).await?;\n        Ok(response)\n    }\n\n    async fn delete_index(&self, request: DeleteIndexRequest) -> MetastoreResult<EmptyResponse> {\n        let response = 
self.control_plane.delete_index(request).await?;\n        Ok(response)\n    }\n\n    async fn add_source(&self, request: AddSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let response = self.control_plane.add_source(request).await?;\n        Ok(response)\n    }\n\n    async fn update_source(&self, request: UpdateSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let response = self.control_plane.update_source(request).await?;\n        Ok(response)\n    }\n\n    async fn toggle_source(&self, request: ToggleSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let response = self.control_plane.toggle_source(request).await?;\n        Ok(response)\n    }\n\n    async fn delete_source(&self, request: DeleteSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let response = self.control_plane.delete_source(request).await?;\n        Ok(response)\n    }\n\n    // Proxy through the control plane to debounce queries\n    async fn prune_shards(&self, request: PruneShardsRequest) -> MetastoreResult<EmptyResponse> {\n        self.control_plane.prune_shards(request).await?;\n        Ok(EmptyResponse {})\n    }\n\n    // Other metastore API calls.\n\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> MetastoreResult<IndexMetadataResponse> {\n        self.metastore.index_metadata(request).await\n    }\n\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> MetastoreResult<IndexesMetadataResponse> {\n        self.metastore.indexes_metadata(request).await\n    }\n\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> MetastoreResult<ListIndexesMetadataResponse> {\n        self.metastore.list_indexes_metadata(request).await\n    }\n\n    async fn stage_splits(&self, request: StageSplitsRequest) -> MetastoreResult<EmptyResponse> {\n        self.metastore.stage_splits(request).await\n    }\n\n    async fn 
publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        self.metastore.publish_splits(request).await\n    }\n\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        self.metastore.list_splits(request).await\n    }\n\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> MetastoreResult<ListIndexStatsResponse> {\n        self.metastore.list_index_stats(request).await\n    }\n\n    async fn list_stale_splits(\n        &self,\n        request: ListStaleSplitsRequest,\n    ) -> MetastoreResult<ListSplitsResponse> {\n        self.metastore.list_stale_splits(request).await\n    }\n\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        self.metastore.mark_splits_for_deletion(request).await\n    }\n\n    async fn delete_splits(&self, request: DeleteSplitsRequest) -> MetastoreResult<EmptyResponse> {\n        self.metastore.delete_splits(request).await\n    }\n\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        self.metastore.reset_source_checkpoint(request).await\n    }\n\n    // Delete tasks API\n\n    async fn create_delete_task(&self, delete_query: DeleteQuery) -> MetastoreResult<DeleteTask> {\n        self.metastore.create_delete_task(delete_query).await\n    }\n\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> MetastoreResult<LastDeleteOpstampResponse> {\n        self.metastore.last_delete_opstamp(request).await\n    }\n\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n     
   self.metastore.update_splits_delete_opstamp(request).await\n    }\n\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> MetastoreResult<ListDeleteTasksResponse> {\n        self.metastore.list_delete_tasks(request).await\n    }\n\n    // Shard API\n\n    async fn open_shards(&self, request: OpenShardsRequest) -> MetastoreResult<OpenShardsResponse> {\n        self.metastore.open_shards(request).await\n    }\n\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> MetastoreResult<AcquireShardsResponse> {\n        self.metastore.acquire_shards(request).await\n    }\n\n    async fn list_shards(&self, request: ListShardsRequest) -> MetastoreResult<ListShardsResponse> {\n        self.metastore.list_shards(request).await\n    }\n\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> MetastoreResult<DeleteShardsResponse> {\n        self.metastore.delete_shards(request).await\n    }\n\n    // Index Template API\n\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        self.metastore.create_index_template(request).await\n    }\n\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> MetastoreResult<GetIndexTemplateResponse> {\n        self.metastore.get_index_template(request).await\n    }\n\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> MetastoreResult<FindIndexTemplateMatchesResponse> {\n        self.metastore.find_index_template_matches(request).await\n    }\n\n    async fn list_index_templates(\n        &self,\n        request: ListIndexTemplatesRequest,\n    ) -> MetastoreResult<ListIndexTemplatesResponse> {\n        self.metastore.list_index_templates(request).await\n    }\n\n    async fn delete_index_templates(\n        
&self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        self.metastore.delete_index_templates(request).await\n    }\n\n    async fn get_cluster_identity(\n        &self,\n        request: GetClusterIdentityRequest,\n    ) -> MetastoreResult<GetClusterIdentityResponse> {\n        self.metastore.get_cluster_identity(request).await\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/file_backed_index/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! [`FileBackedIndex`] module. It is public so that the crate `quickwit-backward-compat` can\n//! import [`FileBackedIndex`] and run backward-compatibility tests. You should not have to import\n//! anything from here directly.\n\nmod serialize;\nmod shards;\n\nuse std::collections::HashMap;\nuse std::fmt::Debug;\nuse std::ops::Bound;\n\nuse itertools::Itertools;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_config::{\n    DocMapping, IndexingSettings, IngestSettings, RetentionPolicy, SearchSettings, SourceConfig,\n};\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AcquireShardsResponse, DeleteQuery, DeleteShardsRequest,\n    DeleteShardsResponse, DeleteTask, EntityKind, IndexStats, ListShardsSubrequest,\n    ListShardsSubresponse, MetastoreError, MetastoreResult, OpenShardSubrequest,\n    OpenShardSubresponse, PruneShardsRequest, SplitStats,\n};\nuse quickwit_proto::types::{IndexUid, PublishToken, SourceId, SplitId};\nuse serde::{Deserialize, Serialize};\nuse serialize::VersionedFileBackedIndex;\nuse shards::Shards;\nuse time::OffsetDateTime;\nuse tracing::{info, warn};\n\nuse super::MutationOccurred;\nuse crate::checkpoint::IndexCheckpointDelta;\nuse crate::metastore::{SortBy, use_shard_api};\nuse crate::{IndexMetadata, ListSplitsQuery, Split, SplitMetadata, SplitState, split_tag_filter};\n\n/// A `FileBackedIndex` 
object carries an index metadata and its split metadata.\n// This struct is meant to be used only within the [`FileBackedMetastore`].\n#[derive(Clone, Debug, Serialize, Deserialize)]\n#[serde(into = \"VersionedFileBackedIndex\")]\n#[serde(from = \"VersionedFileBackedIndex\")]\npub(crate) struct FileBackedIndex {\n    /// Metadata specific to the index.\n    metadata: IndexMetadata,\n    /// List of splits belonging to the index.\n    splits: HashMap<SplitId, Split>,\n    /// Shards of each source.\n    per_source_shards: HashMap<SourceId, Shards>,\n    /// Delete tasks.\n    delete_tasks: Vec<DeleteTask>,\n    /// Stamper.\n    stamper: Stamper,\n    /// Flag used to avoid polling the metastore if\n    /// the process is actually writing the metastore.\n    ///\n    /// The logic is \"soft\". We avoid the polling step\n    /// if the metastore wrote some value since the last\n    /// polling loop.\n    recently_modified: bool,\n    /// Has been discarded. This field exists to make\n    /// it possible to discard this entry if there is an error\n    /// while mutating the Index.\n    pub discarded: bool,\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl quickwit_config::TestableForRegression for FileBackedIndex {\n    fn sample_for_regression() -> Self {\n        use quickwit_config::INGEST_V2_SOURCE_ID;\n        use quickwit_proto::ingest::{Shard, ShardState};\n        use quickwit_proto::types::{DocMappingUid, Position, ShardId};\n\n        let index_metadata = IndexMetadata::sample_for_regression();\n        let index_uid = index_metadata.index_uid.clone();\n        let source_id = INGEST_V2_SOURCE_ID.to_string();\n\n        let split_metadata = SplitMetadata::sample_for_regression();\n        let split = Split {\n            split_state: SplitState::Published,\n            split_metadata,\n            update_timestamp: 1789,\n            publish_timestamp: Some(1789),\n        };\n        let splits = vec![split];\n\n        let 
shard = Shard {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"leader-ingester\".to_string(),\n            follower_id: Some(\"follower-ingester\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::for_test(1)),\n            publish_position_inclusive: Some(Position::Beginning),\n            update_timestamp: 1724240908,\n            ..Default::default()\n        };\n        let shards = Shards::from_shards_vec(index_uid.clone(), source_id.clone(), vec![shard]);\n        let per_source_shards = HashMap::from_iter([(source_id, shards)]);\n\n        let delete_task = DeleteTask {\n            create_timestamp: 0,\n            opstamp: 10,\n            delete_query: Some(DeleteQuery {\n                index_uid: index_uid.into(),\n                start_timestamp: None,\n                end_timestamp: None,\n                query_ast: quickwit_query::query_ast::qast_json_helper(\"Harry Potter\", &[\"body\"]),\n            }),\n        };\n        let delete_tasks = vec![delete_task];\n        FileBackedIndex::new(index_metadata, splits, per_source_shards, delete_tasks)\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        self.metadata().assert_equality(other.metadata());\n        assert_eq!(self.splits, other.splits);\n        assert_eq!(self.per_source_shards, other.per_source_shards);\n        assert_eq!(self.delete_tasks, other.delete_tasks);\n    }\n}\n\nimpl From<IndexMetadata> for FileBackedIndex {\n    fn from(index_metadata: IndexMetadata) -> Self {\n        let per_source_shards = index_metadata\n            .sources\n            .keys()\n            .map(|source_id| {\n                let shards = Shards::empty(index_metadata.index_uid.clone(), source_id.clone());\n                (source_id.clone(), shards)\n            })\n            .collect();\n\n        Self {\n 
           metadata: index_metadata,\n            splits: Default::default(),\n            per_source_shards,\n            delete_tasks: Default::default(),\n            stamper: Default::default(),\n            recently_modified: false,\n            discarded: false,\n        }\n    }\n}\n\nenum DeleteSplitOutcome {\n    Success,\n    SplitNotFound,\n    // The split is in another state than marked for deletion.\n    Forbidden,\n}\n\nimpl FileBackedIndex {\n    /// Constructor.\n    pub fn new(\n        metadata: IndexMetadata,\n        splits: Vec<Split>,\n        per_source_shards: HashMap<SourceId, Shards>,\n        delete_tasks: Vec<DeleteTask>,\n    ) -> Self {\n        let last_opstamp = delete_tasks\n            .iter()\n            .map(|delete_task| delete_task.opstamp)\n            .max()\n            .unwrap_or(0) as usize;\n        let splits = splits\n            .into_iter()\n            .map(|split| (split.split_id().to_string(), split))\n            .collect();\n        Self {\n            metadata,\n            splits,\n            per_source_shards,\n            delete_tasks,\n            stamper: Stamper::new(last_opstamp),\n            recently_modified: false,\n            discarded: false,\n        }\n    }\n\n    /// Sets the `recently_modified` flag to false and returns the previous value.\n    pub fn flip_recently_modified_down(&mut self) -> bool {\n        std::mem::replace(&mut self.recently_modified, false)\n    }\n\n    /// Marks the file as `recently_modified`.\n    pub fn set_recently_modified(&mut self) {\n        self.recently_modified = true;\n    }\n\n    /// Index ID accessor.\n    pub fn index_id(&self) -> &str {\n        self.metadata.index_id()\n    }\n\n    /// Index UID accessor.\n    pub fn index_uid(&self) -> &IndexUid {\n        &self.metadata.index_uid\n    }\n\n    /// Index metadata accessor.\n    pub fn metadata(&self) -> &IndexMetadata {\n        &self.metadata\n    }\n\n    pub fn update_index_config(\n        &mut 
self,\n        doc_mapping: DocMapping,\n        indexing_settings: IndexingSettings,\n        ingest_settings: IngestSettings,\n        search_settings: SearchSettings,\n        retention_policy_opt: Option<RetentionPolicy>,\n    ) -> MetastoreResult<bool> {\n        self.metadata.update_index_config(\n            doc_mapping,\n            indexing_settings,\n            ingest_settings,\n            search_settings,\n            retention_policy_opt,\n        )\n    }\n\n    /// Stages a single split.\n    ///\n    /// If a split already exists and is in the [SplitState::Staged] state,\n    /// it is simply updated/overwritten.\n    ///\n    /// If a split already exists and is *not* in the [SplitState::Staged] state, a\n    /// [MetastoreError::FailedPrecondition] error carrying the split ID is returned.\n    pub(crate) fn stage_split(\n        &mut self,\n        split_metadata: SplitMetadata,\n    ) -> Result<(), MetastoreError> {\n        // Check whether the split exists.\n        // If the split exists, we check what state it is in. 
If it's anything other than `Staged`\n        // something has gone very wrong and we should abort the operation.\n        if let Some(split) = self.splits.get(split_metadata.split_id())\n            && split.split_state != SplitState::Staged\n        {\n            let entity = EntityKind::Split {\n                split_id: split.split_id().to_string(),\n            };\n            let message = \"split is not staged\".to_string();\n            return Err(MetastoreError::FailedPrecondition { entity, message });\n        }\n        let now_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n        let split = Split {\n            split_state: SplitState::Staged,\n            update_timestamp: now_timestamp,\n            publish_timestamp: None,\n            split_metadata,\n        };\n        self.splits.insert(split.split_id().to_string(), split);\n        Ok(())\n    }\n\n    /// Marks the splits for deletion. Returns whether a mutation occurred.\n    pub(crate) fn mark_splits_for_deletion(\n        &mut self,\n        split_ids: impl IntoIterator<Item = impl AsRef<str>>,\n        deletable_split_states: &[SplitState],\n        return_error_on_splits_not_found: bool,\n    ) -> MetastoreResult<bool> {\n        let mut mutation_occurred = false;\n        let mut split_not_found_ids = Vec::new();\n        let mut non_deletable_split_ids = Vec::new();\n        let now_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n        for split_id in split_ids {\n            let split_id_ref = split_id.as_ref();\n            // Check for the existence of split.\n            let metadata = match self.splits.get_mut(split_id_ref) {\n                Some(metadata) => metadata,\n                None => {\n                    split_not_found_ids.push(split_id_ref.to_string());\n                    continue;\n                }\n            };\n            if !deletable_split_states.contains(&metadata.split_state) {\n                
non_deletable_split_ids.push(split_id_ref.to_string());\n                continue;\n            };\n            if metadata.split_state == SplitState::MarkedForDeletion {\n                // If the split is already marked for deletion, this is fine, we just skip it.\n                continue;\n            }\n            metadata.split_state = SplitState::MarkedForDeletion;\n            metadata.update_timestamp = now_timestamp;\n            mutation_occurred = true;\n        }\n        if !split_not_found_ids.is_empty() {\n            if return_error_on_splits_not_found {\n                return Err(MetastoreError::NotFound(EntityKind::Splits {\n                    split_ids: split_not_found_ids,\n                }));\n            } else {\n                warn!(\n                    index_id=%self.index_id(),\n                    split_ids=?PrettySample::new(&split_not_found_ids, 5),\n                    \"{} splits were not found and could not be marked for deletion.\",\n                    split_not_found_ids.len()\n                );\n            }\n        }\n        if !non_deletable_split_ids.is_empty() {\n            let entity = EntityKind::Splits {\n                split_ids: non_deletable_split_ids,\n            };\n            let message = \"splits are not deletable\".to_string();\n            return Err(MetastoreError::FailedPrecondition { entity, message });\n        }\n        Ok(mutation_occurred)\n    }\n\n    /// Helper to mark a list of splits as published.\n    /// This function however does not update the checkpoint.\n    fn mark_splits_as_published_helper(\n        &mut self,\n        staged_split_ids: impl IntoIterator<Item = impl AsRef<str>>,\n    ) -> MetastoreResult<()> {\n        let mut split_not_found_ids = Vec::new();\n        let mut split_not_staged_ids = Vec::new();\n\n        let now_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n        for staged_split_id in staged_split_ids {\n            let staged_split_id_ref = 
staged_split_id.as_ref();\n            // Check for the existence of split.\n            let Some(metadata) = self.splits.get_mut(staged_split_id_ref) else {\n                split_not_found_ids.push(staged_split_id_ref.to_string());\n                continue;\n            };\n            if metadata.split_state == SplitState::Staged {\n                metadata.split_state = SplitState::Published;\n                metadata.update_timestamp = now_timestamp;\n                metadata.publish_timestamp = Some(now_timestamp);\n            } else {\n                split_not_staged_ids.push(staged_split_id_ref.to_string());\n            }\n        }\n        if !split_not_found_ids.is_empty() {\n            return Err(MetastoreError::NotFound(EntityKind::Splits {\n                split_ids: split_not_found_ids,\n            }));\n        }\n        if !split_not_staged_ids.is_empty() {\n            let entity = EntityKind::Splits {\n                split_ids: split_not_staged_ids,\n            };\n            let message = \"splits are not staged\".to_string();\n            return Err(MetastoreError::FailedPrecondition { entity, message });\n        }\n        Ok(())\n    }\n\n    /// Publishes splits.\n    pub(crate) fn publish_splits(\n        &mut self,\n        staged_split_ids: impl IntoIterator<Item = impl AsRef<str>>,\n        replaced_split_ids: impl IntoIterator<Item = impl AsRef<str>>,\n        checkpoint_delta_opt: Option<IndexCheckpointDelta>,\n        publish_token_opt: Option<PublishToken>,\n    ) -> MetastoreResult<()> {\n        if let Some(checkpoint_delta) = checkpoint_delta_opt {\n            let source_id = checkpoint_delta.source_id.clone();\n            let source = self.metadata.sources.get(&source_id).ok_or_else(|| {\n                MetastoreError::NotFound(EntityKind::Source {\n                    index_id: self.index_id().to_string(),\n                    source_id: source_id.clone(),\n                })\n            })?;\n\n            if 
use_shard_api(&source.source_params) {\n                let publish_token = publish_token_opt.ok_or_else(|| {\n                    let message = format!(\n                        \"publish token is required for publishing splits for source `{source_id}`\"\n                    );\n                    MetastoreError::InvalidArgument { message }\n                })?;\n                self.try_apply_delta_v2(checkpoint_delta, publish_token)?;\n            } else {\n                self.metadata\n                    .checkpoint\n                    .try_apply_delta(checkpoint_delta)\n                    .map_err(|error| {\n                        quickwit_common::rate_limited_error!(\n                            limit_per_min = 6,\n                            index = self.index_id(),\n                            \"failed to apply checkpoint delta\"\n                        );\n                        let entity = EntityKind::CheckpointDelta {\n                            index_id: self.index_id().to_string(),\n                            source_id,\n                        };\n                        let message = error.to_string();\n                        MetastoreError::FailedPrecondition { entity, message }\n                    })?;\n            }\n        }\n        self.mark_splits_as_published_helper(staged_split_ids)?;\n        self.mark_splits_for_deletion(replaced_split_ids, &[SplitState::Published], true)?;\n        Ok(())\n    }\n\n    /// Lists splits.\n    pub(crate) fn list_splits(&self, query: &ListSplitsQuery) -> MetastoreResult<Vec<Split>> {\n        let limit = query\n            .limit\n            .map(|limit| limit + query.offset.unwrap_or_default())\n            .unwrap_or(usize::MAX);\n        // skip is done at a higher layer in case other indexes give splits that would go before\n        // ours\n\n        let results = if query.sort_by == SortBy::None {\n            // internally sorted_unstable_by collects everything to an intermediary vec. 
When not\n            // sorting at all, skip that.\n            self.splits\n                .values()\n                .filter(|split| split_query_predicate(split, query))\n                .take(limit)\n                .cloned()\n                .collect()\n        } else {\n            self.splits\n                .values()\n                .filter(|split| split_query_predicate(split, query))\n                .sorted_unstable_by(|lhs, rhs| query.sort_by.compare(lhs, rhs))\n                .take(limit)\n                .cloned()\n                .collect()\n        };\n        Ok(results)\n    }\n\n    /// Deletes a split.\n    fn delete_split(&mut self, split_id: &str) -> DeleteSplitOutcome {\n        match self.splits.get(split_id).map(|split| split.split_state) {\n            Some(SplitState::MarkedForDeletion) => {\n                self.splits.remove(split_id);\n                DeleteSplitOutcome::Success\n            }\n            Some(SplitState::Staged | SplitState::Published) => DeleteSplitOutcome::Forbidden,\n            None => DeleteSplitOutcome::SplitNotFound,\n        }\n    }\n\n    /// Deletes multiple splits.\n    pub(crate) fn delete_splits(\n        &mut self,\n        split_ids: impl IntoIterator<Item = impl AsRef<str>>,\n    ) -> MetastoreResult<()> {\n        let mut num_deleted_splits = 0;\n        let mut split_not_found_ids = Vec::new();\n        let mut split_not_deletable_ids = Vec::new();\n\n        for split_id in split_ids {\n            let split_id_ref = split_id.as_ref();\n            match self.delete_split(split_id_ref) {\n                DeleteSplitOutcome::Success => {\n                    num_deleted_splits += 1;\n                }\n                DeleteSplitOutcome::SplitNotFound => {\n                    split_not_found_ids.push(split_id_ref.to_string());\n                }\n                DeleteSplitOutcome::Forbidden => {\n                    split_not_deletable_ids.push(split_id_ref.to_string());\n                }\n            }\n        }\n        if 
!split_not_deletable_ids.is_empty() {\n            let entity = EntityKind::Splits {\n                split_ids: split_not_deletable_ids,\n            };\n            let message = \"splits are not deletable\".to_string();\n            return Err(MetastoreError::FailedPrecondition { entity, message });\n        }\n        info!(index_id=%self.index_id(), \"deleted {num_deleted_splits} splits from index\");\n\n        if !split_not_found_ids.is_empty() {\n            warn!(\n                index_id=self.index_id().to_string(),\n                split_ids=?PrettySample::new(&split_not_found_ids, 5),\n                \"{} splits were not found and could not be deleted\",\n                split_not_found_ids.len()\n            );\n        }\n        Ok(())\n    }\n\n    /// Gets IndexStats for this index\n    pub(crate) fn get_stats(&self) -> MetastoreResult<IndexStats> {\n        let mut staged_stats = SplitStats::default();\n        let mut published_stats = SplitStats::default();\n        let mut marked_for_deletion_stats = SplitStats::default();\n\n        for split in self.splits.values() {\n            match split.split_state {\n                SplitState::Staged => {\n                    staged_stats.add_split(split.split_metadata.footer_offsets.end)\n                }\n                SplitState::Published => {\n                    published_stats.add_split(split.split_metadata.footer_offsets.end)\n                }\n                SplitState::MarkedForDeletion => {\n                    marked_for_deletion_stats.add_split(split.split_metadata.footer_offsets.end)\n                }\n            }\n        }\n\n        Ok(IndexStats {\n            index_uid: Some(self.index_uid().clone()),\n            staged: Some(staged_stats),\n            published: Some(published_stats),\n            marked_for_deletion: Some(marked_for_deletion_stats),\n        })\n    }\n\n    /// Adds a source.\n    pub(crate) fn add_source(&mut self, source_config: SourceConfig) -> 
MetastoreResult<()> {\n        let index_uid = self.index_uid().clone();\n        let source_id = source_config.source_id.clone();\n\n        self.metadata.add_source(source_config)?;\n\n        let shards = Shards::empty(index_uid, source_id.clone());\n        self.per_source_shards.insert(source_id, shards);\n        Ok(())\n    }\n\n    /// Updates a source. Returns whether a mutation occurred.\n    pub(crate) fn update_source(&mut self, source_config: SourceConfig) -> MetastoreResult<bool> {\n        self.metadata.update_source(source_config)\n    }\n\n    /// Enables or disables a source. Returns whether a mutation occurred.\n    pub(crate) fn toggle_source(&mut self, source_id: &str, enable: bool) -> MetastoreResult<bool> {\n        self.metadata.toggle_source(source_id, enable)\n    }\n\n    /// Deletes the source.\n    pub(crate) fn delete_source(&mut self, source_id: &str) -> MetastoreResult<()> {\n        self.metadata.delete_source(source_id)\n    }\n\n    /// Resets the checkpoint of a source. 
Returns whether a mutation occurred.\n    pub(crate) fn reset_source_checkpoint(&mut self, source_id: &str) -> MetastoreResult<bool> {\n        Ok(self.metadata.checkpoint.reset_source(source_id))\n    }\n\n    /// Creates [`DeleteTask`] from a [`DeleteQuery`].\n    pub(crate) fn create_delete_task(\n        &mut self,\n        delete_query: DeleteQuery,\n    ) -> MetastoreResult<DeleteTask> {\n        let now_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n        let delete_task = DeleteTask {\n            create_timestamp: now_timestamp,\n            opstamp: self.stamper.stamp() as u64,\n            delete_query: Some(delete_query),\n        };\n        self.delete_tasks.push(delete_task.clone());\n        Ok(delete_task)\n    }\n\n    /// Returns index last delete opstamp.\n    pub(crate) fn last_delete_opstamp(&self) -> u64 {\n        self.delete_tasks\n            .iter()\n            .map(|delete_task| delete_task.opstamp)\n            .max()\n            .unwrap_or(0)\n    }\n\n    /// Updates splits delete opstamp. 
Returns that a mutation occurred (true).\n    pub(crate) fn update_splits_delete_opstamp(\n        &mut self,\n        split_ids: &[&str],\n        delete_opstamp: u64,\n    ) -> MetastoreResult<bool> {\n        for split_id in split_ids {\n            let split = self.splits.get_mut(*split_id).ok_or_else(|| {\n                MetastoreError::NotFound(EntityKind::Splits {\n                    split_ids: vec![split_id.to_string()],\n                })\n            })?;\n            split.split_metadata.delete_opstamp = delete_opstamp;\n        }\n        Ok(true)\n    }\n\n    /// Lists delete tasks with opstamp > `opstamp_start`.\n    pub(crate) fn list_delete_tasks(&self, opstamp_start: u64) -> MetastoreResult<Vec<DeleteTask>> {\n        let delete_tasks = self\n            .delete_tasks\n            .iter()\n            .filter(|delete_task| delete_task.opstamp > opstamp_start)\n            .cloned()\n            .collect();\n        Ok(delete_tasks)\n    }\n\n    // Shard API\n\n    fn get_shards_for_source(&self, source_id: &str) -> MetastoreResult<&Shards> {\n        self.per_source_shards.get(source_id).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Source {\n                index_id: self.index_id().to_string(),\n                source_id: source_id.to_string(),\n            })\n        })\n    }\n\n    fn get_shards_for_source_mut(&mut self, source_id: &str) -> MetastoreResult<&mut Shards> {\n        self.per_source_shards.get_mut(source_id).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Source {\n                index_id: self.metadata.index_id().to_string(),\n                source_id: source_id.to_string(),\n            })\n        })\n    }\n\n    pub(crate) fn open_shards(\n        &mut self,\n        subrequests: Vec<OpenShardSubrequest>,\n    ) -> MetastoreResult<MutationOccurred<Vec<OpenShardSubresponse>>> {\n        let mut mutation_occurred = false;\n        let mut subresponses = 
Vec::with_capacity(subrequests.len());\n\n        for subrequest in subrequests {\n            let subresponse = match self\n                .get_shards_for_source_mut(&subrequest.source_id)?\n                .open_shard(subrequest)?\n            {\n                MutationOccurred::Yes(subresponse) => {\n                    mutation_occurred = true;\n                    subresponse\n                }\n                MutationOccurred::No(subresponse) => subresponse,\n            };\n            subresponses.push(subresponse);\n        }\n        if mutation_occurred {\n            Ok(MutationOccurred::Yes(subresponses))\n        } else {\n            Ok(MutationOccurred::No(subresponses))\n        }\n    }\n\n    pub(crate) fn acquire_shards(\n        &mut self,\n        request: AcquireShardsRequest,\n    ) -> MetastoreResult<MutationOccurred<AcquireShardsResponse>> {\n        self.get_shards_for_source_mut(&request.source_id)?\n            .acquire_shards(request)\n    }\n\n    pub(crate) fn delete_shards(\n        &mut self,\n        request: DeleteShardsRequest,\n    ) -> MetastoreResult<MutationOccurred<DeleteShardsResponse>> {\n        self.get_shards_for_source_mut(&request.source_id)?\n            .delete_shards(request)\n    }\n\n    pub(crate) fn prune_shards(\n        &mut self,\n        request: PruneShardsRequest,\n    ) -> MetastoreResult<MutationOccurred<()>> {\n        self.get_shards_for_source_mut(&request.source_id)?\n            .prune_shards(request)\n    }\n\n    pub(crate) fn list_shards(\n        &self,\n        subrequest: ListShardsSubrequest,\n    ) -> MetastoreResult<ListShardsSubresponse> {\n        self.get_shards_for_source(&subrequest.source_id)?\n            .list_shards(subrequest)\n    }\n\n    pub(crate) fn try_apply_delta_v2(\n        &mut self,\n        checkpoint_delta: IndexCheckpointDelta,\n        publish_token: PublishToken,\n    ) -> MetastoreResult<MutationOccurred<()>> {\n        
self.get_shards_for_source_mut(&checkpoint_delta.source_id)?\n            .try_apply_delta(checkpoint_delta.source_delta, publish_token)\n    }\n}\n\n/// Stamper provides opstamps, which are auto-incrementing IDs used to label\n/// delete operations.\n#[derive(Clone, Default)]\nstruct Stamper(usize);\n\nimpl Stamper {\n    /// Creates a new [`Stamper`].\n    pub fn new(initial_opstamp: usize) -> Self {\n        Self(initial_opstamp)\n    }\n\n    /// Increments the stamper by 1 and returns the incremented value.\n    pub fn stamp(&mut self) -> usize {\n        self.0 += 1;\n        self.0\n    }\n}\n\nimpl Debug for Stamper {\n    fn fmt(&self, fmt: &mut std::fmt::Formatter) -> std::fmt::Result {\n        fmt.debug_struct(\"Stamper\").field(\"stamp\", &self.0).finish()\n    }\n}\n\nfn split_query_predicate(split: &&Split, query: &ListSplitsQuery) -> bool {\n    if !split_tag_filter(&split.split_metadata, query.tags.as_ref()) {\n        return false;\n    }\n\n    if !query.split_states.is_empty() && !query.split_states.contains(&split.split_state) {\n        return false;\n    }\n\n    if !query\n        .delete_opstamp\n        .contains(&split.split_metadata.delete_opstamp)\n    {\n        return false;\n    }\n\n    if !query.update_timestamp.contains(&split.update_timestamp) {\n        return false;\n    }\n\n    if !query\n        .create_timestamp\n        .contains(&split.split_metadata.create_timestamp)\n    {\n        return false;\n    }\n\n    match &query.mature {\n        Bound::Included(evaluation_datetime) => {\n            return split.split_metadata.is_mature(*evaluation_datetime);\n        }\n        Bound::Excluded(evaluation_datetime) => {\n            return !split.split_metadata.is_mature(*evaluation_datetime);\n        }\n        Bound::Unbounded => {}\n    }\n\n    if let Some(range) = &split.split_metadata.time_range {\n        if !query.time_range.overlaps_with(range.clone()) {\n            return false;\n        }\n        if let Some(v) = 
query.max_time_range_end\n            && range.end() > &v\n        {\n            return false;\n        }\n    }\n\n    if let Some(node_id) = &query.node_id\n        && split.split_metadata.node_id != *node_id\n    {\n        return false;\n    }\n\n    if let Some((index_uid, split_id)) = &query.after_split {\n        if *index_uid > split.split_metadata.index_uid {\n            return false;\n        }\n        if *index_uid == split.split_metadata.index_uid\n            && *split_id >= split.split_metadata.split_id\n        {\n            return false;\n        }\n    }\n\n    true\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::{BTreeSet, HashMap};\n\n    use quickwit_doc_mapper::tag_pruning::TagFilterAst;\n    use quickwit_proto::ingest::Shard;\n    use quickwit_proto::metastore::{ListShardsSubrequest, SplitStats};\n    use quickwit_proto::types::{IndexUid, SourceId};\n\n    use super::FileBackedIndex;\n    use crate::file_backed::file_backed_index::split_query_predicate;\n    use crate::{IndexMetadata, ListSplitsQuery, Split, SplitMetadata, SplitState};\n\n    impl FileBackedIndex {\n        pub(crate) fn insert_shards(&mut self, source_id: &SourceId, shards: Vec<Shard>) {\n            self.per_source_shards\n                .get_mut(source_id)\n                .unwrap()\n                .insert_shards(shards)\n        }\n\n        pub(crate) fn list_all_shards(&self, source_id: &SourceId) -> Vec<Shard> {\n            self.per_source_shards\n                .get(source_id)\n                .unwrap()\n                .list_shards(ListShardsSubrequest {\n                    ..Default::default()\n                })\n                .unwrap()\n                .shards\n        }\n    }\n\n    fn make_splits() -> [Split; 3] {\n        [\n            Split {\n                split_metadata: SplitMetadata {\n                    split_id: \"split-1\".to_string(),\n                    delete_opstamp: 9,\n                    time_range: Some(32..=40),\n     
               tags: BTreeSet::from([\"tag-1\".to_string()]),\n                    create_timestamp: 12,\n                    footer_offsets: 0..2048,\n                    ..Default::default()\n                },\n                split_state: SplitState::Staged,\n                update_timestamp: 70i64,\n                publish_timestamp: None,\n            },\n            Split {\n                split_metadata: SplitMetadata {\n                    split_id: \"split-2\".to_string(),\n                    delete_opstamp: 4,\n                    time_range: None,\n                    tags: BTreeSet::from([\"tag-2\".to_string(), \"tag-3\".to_string()]),\n                    create_timestamp: 5,\n                    footer_offsets: 0..1024,\n                    ..Default::default()\n                },\n                split_state: SplitState::MarkedForDeletion,\n                update_timestamp: 50i64,\n                publish_timestamp: None,\n            },\n            Split {\n                split_metadata: SplitMetadata {\n                    split_id: \"split-3\".to_string(),\n                    delete_opstamp: 0,\n                    time_range: Some(0..=90),\n                    tags: BTreeSet::from([\"tag-2\".to_string(), \"tag-4\".to_string()]),\n                    create_timestamp: 64,\n                    footer_offsets: 0..512,\n                    ..Default::default()\n                },\n                split_state: SplitState::Published,\n                update_timestamp: 0i64,\n                publish_timestamp: Some(10i64),\n            },\n        ]\n    }\n\n    #[test]\n    fn test_single_filter_behaviour() {\n        let [split_1, split_2, split_3] = make_splits();\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_split_state(SplitState::Staged);\n        assert!(split_query_predicate(&&split_1, &query));\n\n        let query = 
ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_split_state(SplitState::Published);\n        assert!(!split_query_predicate(&&split_2, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_split_states([SplitState::Published, SplitState::MarkedForDeletion]);\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_update_timestamp_lt(51);\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_create_timestamp_gte(51);\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(!split_query_predicate(&&split_2, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_delete_opstamp_gte(4);\n        assert!(split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(!split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_time_range_start_gt(45);\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_time_range_end_lt(45);\n        assert!(split_query_predicate(&&split_1, &query));\n        
assert!(split_query_predicate(&&split_2, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_tags_filter(TagFilterAst::Tag {\n                is_present: false,\n                tag: \"tag-2\".to_string(),\n            });\n        assert!(split_query_predicate(&&split_1, &query));\n        assert!(!split_query_predicate(&&split_2, &query));\n        assert!(!split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_max_time_range_end(50);\n        assert!(split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(!split_query_predicate(&&split_3, &query));\n    }\n\n    #[test]\n    fn test_combination_filter() {\n        let [split_1, split_2, split_3] = make_splits();\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_time_range_start_gt(0)\n            .with_time_range_end_lt(40);\n        assert!(split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_time_range_start_gt(45)\n            .with_delete_opstamp_gt(0);\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(!split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_update_timestamp_lt(51)\n            .with_split_states([SplitState::Published, SplitState::MarkedForDeletion]);\n        assert!(!split_query_predicate(&&split_1, &query));\n    
    assert!(split_query_predicate(&&split_2, &query));\n        assert!(split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_update_timestamp_lt(51)\n            .with_create_timestamp_lte(63);\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(split_query_predicate(&&split_2, &query));\n        assert!(!split_query_predicate(&&split_3, &query));\n\n        let query = ListSplitsQuery::for_index(IndexUid::new_with_random_ulid(\"test-index\"))\n            .with_time_range_start_gt(90)\n            .with_tags_filter(TagFilterAst::Tag {\n                is_present: true,\n                tag: \"tag-1\".to_string(),\n            });\n        assert!(!split_query_predicate(&&split_1, &query));\n        assert!(!split_query_predicate(&&split_2, &query));\n        assert!(!split_query_predicate(&&split_3, &query));\n    }\n\n    #[test]\n    fn test_get_stats() {\n        let index_id = \"test-index\";\n        let index_metadata = IndexMetadata::for_test(index_id, \"file:///qwdata/indexes/test-index\");\n        let index =\n            FileBackedIndex::new(index_metadata, make_splits().into(), HashMap::new(), vec![]);\n\n        let expected_staged = Some(SplitStats {\n            num_splits: 1,\n            total_size_bytes: 2048,\n        });\n        let expected_published = Some(SplitStats {\n            num_splits: 1,\n            total_size_bytes: 512,\n        });\n        let expected_marked_for_deletion = Some(SplitStats {\n            num_splits: 1,\n            total_size_bytes: 1024,\n        });\n        let stats = index.get_stats().unwrap();\n\n        assert_eq!(stats.staged, expected_staged);\n        assert_eq!(stats.published, expected_published);\n        assert_eq!(stats.marked_for_deletion, expected_marked_for_deletion);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/file_backed_index/serialize.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse itertools::Itertools;\nuse quickwit_proto::ingest::Shard;\nuse quickwit_proto::metastore::SourceType;\nuse quickwit_proto::types::{DocMappingUid, SourceId};\nuse serde::{Deserialize, Serialize};\n\nuse super::shards::Shards;\nuse crate::file_backed::file_backed_index::FileBackedIndex;\nuse crate::metastore::DeleteTask;\nuse crate::{IndexMetadata, Split};\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\n#[serde(tag = \"version\")]\npub(crate) enum VersionedFileBackedIndex {\n    #[serde(rename = \"0.9\")]\n    V0_9(FileBackedIndexV0_8),\n    // Retro compatibility.\n    #[serde(alias = \"0.8\")]\n    #[serde(alias = \"0.7\")]\n    V0_8(FileBackedIndexV0_8),\n}\n\nimpl From<FileBackedIndex> for VersionedFileBackedIndex {\n    fn from(index: FileBackedIndex) -> Self {\n        VersionedFileBackedIndex::V0_9(index.into())\n    }\n}\n\nimpl From<VersionedFileBackedIndex> for FileBackedIndex {\n    fn from(index: VersionedFileBackedIndex) -> Self {\n        match index {\n            VersionedFileBackedIndex::V0_8(mut v0_8) => {\n                for shards in v0_8.shards.values_mut() {\n                    for shard in shards {\n                        shard.doc_mapping_uid = Some(DocMappingUid::default());\n                    }\n                }\n                v0_8.into()\n            }\n            
VersionedFileBackedIndex::V0_9(v0_8) => v0_8.into(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub(crate) struct FileBackedIndexV0_8 {\n    #[serde(rename = \"index\")]\n    metadata: IndexMetadata,\n    splits: Vec<Split>,\n    // TODO: Remove `skip_serializing_if` when we release ingest v2.\n    #[serde(default, skip_serializing_if = \"HashMap::is_empty\")]\n    shards: HashMap<SourceId, Vec<Shard>>,\n    #[serde(default)]\n    delete_tasks: Vec<DeleteTask>,\n}\n\nimpl From<FileBackedIndex> for FileBackedIndexV0_8 {\n    fn from(index: FileBackedIndex) -> Self {\n        let splits = index\n            .splits\n            .into_values()\n            .sorted_by_key(|split| split.update_timestamp)\n            .collect();\n        let shards = index\n            .per_source_shards\n            .into_iter()\n            .filter_map(|(source_id, shards)| {\n                // TODO: Remove this filter when we release ingest v2.\n                // Skip serializing empty shards since the feature is hidden and disabled by\n                // default. 
This way, we can still modify the serialization format without worrying\n                // about backward compatibility post `0.7`.\n                if !shards.is_empty() {\n                    Some((source_id, shards.into_shards_vec()))\n                } else {\n                    None\n                }\n            })\n            .collect();\n        let delete_tasks = index\n            .delete_tasks\n            .into_iter()\n            .sorted_by_key(|delete_task| delete_task.opstamp)\n            .collect();\n        Self {\n            metadata: index.metadata,\n            splits,\n            shards,\n            delete_tasks,\n        }\n    }\n}\n\nimpl From<FileBackedIndexV0_8> for FileBackedIndex {\n    fn from(index: FileBackedIndexV0_8) -> Self {\n        let mut per_source_shards: HashMap<SourceId, Shards> = index\n            .shards\n            .into_iter()\n            .map(|(source_id, shards_vec)| {\n                let index_uid = index.metadata.index_uid.clone();\n                (\n                    source_id.clone(),\n                    Shards::from_shards_vec(index_uid, source_id, shards_vec),\n                )\n            })\n            .collect();\n        // TODO: Remove this when we release ingest v2.\n        for source in index.metadata.sources.values() {\n            if source.source_type() == SourceType::IngestV2\n                && !per_source_shards.contains_key(&source.source_id)\n            {\n                let index_uid = index.metadata.index_uid.clone();\n                let source_id = source.source_id.clone();\n                per_source_shards.insert(source_id.clone(), Shards::empty(index_uid, source_id));\n            }\n        }\n        Self::new(\n            index.metadata,\n            index.splits,\n            per_source_shards,\n            index.delete_tasks,\n        )\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/file_backed_index/shards.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::collections::hash_map::Entry;\nuse std::fmt;\n\nuse itertools::Itertools;\nuse quickwit_proto::ingest::{Shard, ShardState};\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AcquireShardsResponse, DeleteShardsRequest, DeleteShardsResponse,\n    EntityKind, ListShardsSubrequest, ListShardsSubresponse, MetastoreError, MetastoreResult,\n    OpenShardSubrequest, OpenShardSubresponse, PruneShardsRequest,\n};\nuse quickwit_proto::types::{IndexUid, Position, PublishToken, ShardId, SourceId, queue_id};\nuse time::OffsetDateTime;\nuse tracing::{info, warn};\n\nuse crate::checkpoint::{PartitionId, SourceCheckpoint, SourceCheckpointDelta};\nuse crate::file_backed::MutationOccurred;\n\n// TODO: Rename `SourceShards`\n/// Manages the shards of a source.\n#[derive(Clone, Eq, PartialEq)]\npub(crate) struct Shards {\n    index_uid: IndexUid,\n    source_id: SourceId,\n    checkpoint: SourceCheckpoint,\n    shards: HashMap<ShardId, Shard>,\n}\n\nimpl fmt::Debug for Shards {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"Shards\")\n            .field(\"index_uid\", &self.index_uid)\n            .field(\"source_id\", &self.source_id)\n            .field(\"num_shards\", &self.shards.len())\n            .field(\"shards\", &self.shards)\n            .finish()\n    
}\n}\n\nimpl Shards {\n    pub(super) fn empty(index_uid: IndexUid, source_id: SourceId) -> Self {\n        Self {\n            index_uid,\n            source_id,\n            checkpoint: SourceCheckpoint::default(),\n            shards: HashMap::new(),\n        }\n    }\n\n    pub(super) fn from_shards_vec(\n        index_uid: IndexUid,\n        source_id: SourceId,\n        shards_vec: Vec<Shard>,\n    ) -> Self {\n        let mut shards: HashMap<ShardId, Shard> = HashMap::with_capacity(shards_vec.len());\n        let mut checkpoint = SourceCheckpoint::default();\n\n        for shard in shards_vec {\n            let shard_id = shard.shard_id().clone();\n            let partition_id = PartitionId::from(shard_id.as_str());\n            let position = shard.publish_position_inclusive();\n            checkpoint.add_partition(partition_id, position);\n            shards.insert(shard_id, shard);\n        }\n\n        Self {\n            index_uid,\n            source_id,\n            checkpoint,\n            shards,\n        }\n    }\n\n    pub fn into_shards_vec(self) -> Vec<Shard> {\n        self.shards.into_values().collect()\n    }\n\n    pub fn is_empty(&self) -> bool {\n        self.shards.is_empty()\n    }\n\n    fn get_shard(&self, shard_id: &ShardId) -> MetastoreResult<&Shard> {\n        self.shards.get(shard_id).ok_or_else(|| {\n            let queue_id = queue_id(&self.index_uid, &self.source_id, shard_id);\n            MetastoreError::NotFound(EntityKind::Shard { queue_id })\n        })\n    }\n\n    fn get_shard_mut(&mut self, shard_id: &ShardId) -> MetastoreResult<&mut Shard> {\n        self.shards.get_mut(shard_id).ok_or_else(|| {\n            let queue_id = queue_id(&self.index_uid, &self.source_id, shard_id);\n            MetastoreError::NotFound(EntityKind::Shard { queue_id })\n        })\n    }\n\n    pub(super) fn open_shard(\n        &mut self,\n        subrequest: OpenShardSubrequest,\n    ) -> 
MetastoreResult<MutationOccurred<OpenShardSubresponse>> {\n        let mut mutation_occurred = false;\n\n        let shard_id = subrequest.shard_id().clone();\n        let entry = self.shards.entry(shard_id.clone());\n        let shard = match entry {\n            Entry::Occupied(entry) => entry.get().clone(),\n            Entry::Vacant(entry) => {\n                let shard = Shard {\n                    index_uid: Some(self.index_uid.clone()),\n                    source_id: self.source_id.clone(),\n                    shard_id: Some(shard_id.clone()),\n                    shard_state: ShardState::Open as i32,\n                    leader_id: subrequest.leader_id,\n                    follower_id: subrequest.follower_id,\n                    doc_mapping_uid: subrequest.doc_mapping_uid,\n                    publish_position_inclusive: Some(Position::Beginning),\n                    publish_token: subrequest.publish_token.clone(),\n                    update_timestamp: OffsetDateTime::now_utc().unix_timestamp(),\n                };\n                mutation_occurred = true;\n                entry.insert(shard.clone());\n\n                info!(\n                    index_uid=%self.index_uid,\n                    source_id=%self.source_id,\n                    %shard_id,\n                    leader_id=%shard.leader_id,\n                    follower_id=?shard.follower_id,\n                    \"opened shard\"\n                );\n                shard\n            }\n        };\n        let response = OpenShardSubresponse {\n            subrequest_id: subrequest.subrequest_id,\n            open_shard: Some(shard),\n        };\n        if mutation_occurred {\n            Ok(MutationOccurred::Yes(response))\n        } else {\n            Ok(MutationOccurred::No(response))\n        }\n    }\n\n    pub(super) fn acquire_shards(\n        &mut self,\n        request: AcquireShardsRequest,\n    ) -> MetastoreResult<MutationOccurred<AcquireShardsResponse>> {\n        let mut 
mutation_occurred = false;\n        let mut acquired_shards = Vec::with_capacity(request.shard_ids.len());\n\n        for shard_id in &request.shard_ids {\n            if let Some(shard) = self.shards.get_mut(shard_id) {\n                if shard.publish_token() != request.publish_token {\n                    shard.publish_token = Some(request.publish_token.clone());\n                    mutation_occurred = true;\n                }\n                acquired_shards.push(shard.clone());\n            } else {\n                warn!(\n                    index_uid=%self.index_uid,\n                    source_id=%self.source_id,\n                    %shard_id,\n                    \"shard not found\"\n                );\n            }\n        }\n        let response = AcquireShardsResponse { acquired_shards };\n\n        if mutation_occurred {\n            Ok(MutationOccurred::Yes(response))\n        } else {\n            Ok(MutationOccurred::No(response))\n        }\n    }\n\n    pub(super) fn delete_shards(\n        &mut self,\n        request: DeleteShardsRequest,\n    ) -> MetastoreResult<MutationOccurred<DeleteShardsResponse>> {\n        let mut successes = Vec::with_capacity(request.shard_ids.len());\n        let mut failures = Vec::new();\n        let mut mutation_occurred = false;\n\n        for shard_id in request.shard_ids {\n            if let Entry::Occupied(entry) = self.shards.entry(shard_id.clone()) {\n                let shard = entry.get();\n                if !request.force && !shard.publish_position_inclusive().is_eof() {\n                    failures.push(shard_id);\n                    continue;\n                }\n                info!(\n                    index_uid=%self.index_uid,\n                    source_id=%self.source_id,\n                    %shard_id,\n                    \"deleted shard\",\n                );\n                entry.remove();\n                mutation_occurred = true;\n            }\n            
successes.push(shard_id);\n        }\n        if !failures.is_empty() {\n            warn!(\n                index_uid=%self.index_uid,\n                source_id=%self.source_id,\n                \"failed to delete shards `{}`: shards are not fully indexed\",\n                failures.iter().join(\", \")\n            );\n        }\n        let response = DeleteShardsResponse {\n            index_uid: request.index_uid,\n            source_id: request.source_id,\n            successes,\n            failures,\n        };\n        if mutation_occurred {\n            Ok(MutationOccurred::Yes(response))\n        } else {\n            Ok(MutationOccurred::No(response))\n        }\n    }\n\n    pub(super) fn prune_shards(\n        &mut self,\n        request: PruneShardsRequest,\n    ) -> MetastoreResult<MutationOccurred<()>> {\n        let initial_shard_count = self.shards.len();\n\n        if let Some(max_age_secs) = request.max_age_secs {\n            self.shards.retain(|_, shard| {\n                let gc_deadline = shard.update_timestamp + max_age_secs as i64;\n                let now = OffsetDateTime::now_utc().unix_timestamp();\n                gc_deadline >= now\n            });\n        };\n        if let Some(max_count) = request.max_count {\n            let max_count = max_count as usize;\n            if max_count < self.shards.len() {\n                let num_to_remove = self.shards.len() - max_count;\n                let shard_ids_to_delete = self\n                    .shards\n                    .values()\n                    .sorted_by_key(|shard| shard.update_timestamp)\n                    .take(num_to_remove)\n                    .map(|shard| shard.shard_id().clone())\n                    .collect_vec();\n                for shard_id in shard_ids_to_delete {\n                    self.shards.remove(&shard_id);\n                }\n            }\n        }\n        if initial_shard_count > self.shards.len() {\n            Ok(MutationOccurred::Yes(()))\n    
    } else {\n            Ok(MutationOccurred::No(()))\n        }\n    }\n\n    pub(super) fn list_shards(\n        &self,\n        subrequest: ListShardsSubrequest,\n    ) -> MetastoreResult<ListShardsSubresponse> {\n        let shards = self.list_shards_inner(subrequest.shard_state);\n        let response = ListShardsSubresponse {\n            index_uid: subrequest.index_uid,\n            source_id: subrequest.source_id,\n            shards,\n        };\n        Ok(response)\n    }\n\n    pub(super) fn try_apply_delta(\n        &mut self,\n        checkpoint_delta: SourceCheckpointDelta,\n        publish_token: PublishToken,\n    ) -> MetastoreResult<MutationOccurred<()>> {\n        if checkpoint_delta.is_empty() {\n            return Ok(MutationOccurred::No(()));\n        }\n        self.checkpoint\n            .check_compatibility(&checkpoint_delta)\n            .map_err(|error| MetastoreError::InvalidArgument {\n                message: error.to_string(),\n            })?;\n\n        let mut shard_ids = Vec::with_capacity(checkpoint_delta.num_partitions());\n\n        for (partition_id, partition_delta) in checkpoint_delta.iter() {\n            let shard_id = ShardId::from(partition_id.as_str());\n            let shard = self.get_shard(&shard_id)?;\n\n            if shard.publish_token() != publish_token {\n                let message = \"failed to apply checkpoint delta: invalid publish token\".to_string();\n                return Err(MetastoreError::InvalidArgument { message });\n            }\n            let publish_position_inclusive = partition_delta.to;\n            shard_ids.push((shard_id, publish_position_inclusive))\n        }\n        self.checkpoint\n            .try_apply_delta(checkpoint_delta)\n            .expect(\"delta compatibility should have been checked\");\n\n        for (shard_id, publish_position_inclusive) in shard_ids {\n            let shard = self.get_shard_mut(&shard_id).expect(\"shard should exist\");\n\n            if 
publish_position_inclusive.is_eof() {\n                shard.shard_state = ShardState::Closed as i32;\n            }\n            shard.publish_position_inclusive = Some(publish_position_inclusive);\n            shard.update_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n        }\n        Ok(MutationOccurred::Yes(()))\n    }\n\n    fn list_shards_inner(&self, shard_state: Option<i32>) -> Vec<Shard> {\n        if let Some(shard_state) = shard_state {\n            self.shards\n                .values()\n                .filter(|shard| shard.shard_state == shard_state)\n                .cloned()\n                .collect()\n        } else {\n            self.shards.values().cloned().collect()\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::ingest::ShardState;\n    use quickwit_proto::types::DocMappingUid;\n\n    use super::*;\n\n    impl Shards {\n        pub(crate) fn insert_shards(&mut self, shards: Vec<Shard>) {\n            for shard in shards {\n                let shard_id = shard.shard_id().clone();\n                self.shards.insert(shard_id, shard);\n            }\n        }\n    }\n\n    #[test]\n    fn test_open_shards() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let mut shards = Shards::empty(index_uid.clone(), source_id.clone());\n\n        let subrequest = OpenShardSubrequest {\n            subrequest_id: 0,\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"leader_id\".to_string(),\n            follower_id: None,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_token: None,\n        };\n        let MutationOccurred::Yes(subresponse) = shards.open_shard(subrequest.clone()).unwrap()\n        else {\n            panic!(\"expected `MutationOccurred::Yes`\");\n        };\n        
assert_eq!(subresponse.subrequest_id, 0);\n\n        let shard = subresponse.open_shard();\n        assert_eq!(shard.index_uid(), &index_uid);\n        assert_eq!(shard.source_id, source_id);\n        assert_eq!(shard.shard_id(), ShardId::from(1));\n        assert_eq!(shard.shard_state(), ShardState::Open);\n        assert_eq!(shard.leader_id, \"leader_id\");\n        assert_eq!(shard.follower_id, None);\n        assert_eq!(shard.publish_token, None);\n        assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n\n        let MutationOccurred::No(subresponse) = shards.open_shard(subrequest).unwrap() else {\n            panic!(\"Expected `MutationOccurred::No`\");\n        };\n        assert_eq!(subresponse.subrequest_id, 0);\n\n        let shard = subresponse.open_shard();\n        assert_eq!(shards.shards.get(&ShardId::from(1)).unwrap(), shard);\n\n        let subrequest = OpenShardSubrequest {\n            subrequest_id: 0,\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            leader_id: \"leader_id\".to_string(),\n            follower_id: Some(\"follower_id\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_token: Some(\"publish_token\".to_string()),\n        };\n        let MutationOccurred::Yes(subresponse) = shards.open_shard(subrequest).unwrap() else {\n            panic!(\"expected `MutationOccurred::Yes`\");\n        };\n        assert_eq!(subresponse.subrequest_id, 0);\n\n        let shard = subresponse.open_shard();\n        assert_eq!(shard.index_uid(), &index_uid);\n        assert_eq!(shard.source_id, source_id);\n        assert_eq!(shard.shard_id(), ShardId::from(2));\n        assert_eq!(shard.shard_state(), ShardState::Open);\n        assert_eq!(shard.leader_id, \"leader_id\");\n        assert_eq!(shard.follower_id.as_ref().unwrap(), \"follower_id\");\n        
assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n\n        assert_eq!(shards.shards.get(&ShardId::from(2)).unwrap(), shard);\n    }\n\n    #[test]\n    fn test_list_shards() {\n        let index_uid: IndexUid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let mut shards = Shards::empty(index_uid.clone(), source_id.clone());\n\n        let subrequest = ListShardsSubrequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_state: None,\n        };\n        let subresponse = shards.list_shards(subrequest).unwrap();\n        assert_eq!(subresponse.index_uid(), &index_uid);\n        assert_eq!(subresponse.source_id, source_id);\n        assert_eq!(subresponse.shards.len(), 0);\n\n        let shard_0 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(0)),\n            shard_state: ShardState::Open as i32,\n            ..Default::default()\n        };\n        let shard_1 = Shard {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Closed as i32,\n            ..Default::default()\n        };\n        shards.shards.insert(ShardId::from(0), shard_0);\n        shards.shards.insert(ShardId::from(1), shard_1);\n\n        let subrequest = ListShardsSubrequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_state: None,\n        };\n        let mut subresponse = shards.list_shards(subrequest).unwrap();\n        subresponse\n            .shards\n            .sort_unstable_by(|left, right| left.shard_id.cmp(&right.shard_id));\n        assert_eq!(subresponse.shards.len(), 2);\n        assert_eq!(subresponse.shards[0].shard_id(), ShardId::from(0));\n        
assert_eq!(subresponse.shards[1].shard_id(), ShardId::from(1));\n\n        let subrequest = ListShardsSubrequest {\n            index_uid: index_uid.into(),\n            source_id,\n            shard_state: Some(ShardState::Closed as i32),\n        };\n        let subresponse = shards.list_shards(subrequest).unwrap();\n        assert_eq!(subresponse.shards.len(), 1);\n        assert_eq!(subresponse.shards[0].shard_id(), ShardId::from(1));\n    }\n\n    #[test]\n    fn test_acquire_shards() {\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let mut shards = Shards::empty(index_uid.clone(), source_id.clone());\n\n        let request = AcquireShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_ids: Vec::new(),\n            publish_token: \"test-publish-token\".to_string(),\n        };\n        let MutationOccurred::No(response) = shards.acquire_shards(request).unwrap() else {\n            panic!(\"Expected `MutationOccurred::No`\");\n        };\n        assert!(response.acquired_shards.is_empty());\n\n        let request = AcquireShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_ids: vec![ShardId::from(0), ShardId::from(1)],\n            publish_token: \"test-publish-token\".to_string(),\n        };\n        let MutationOccurred::No(response) = shards.acquire_shards(request.clone()).unwrap() else {\n            panic!(\"Expected `MutationOccurred::No`\");\n        };\n        assert!(response.acquired_shards.is_empty());\n\n        shards.shards.insert(\n            ShardId::from(0),\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(0)),\n                shard_state: ShardState::Open as i32,\n                
publish_position_inclusive: Some(Position::eof(0u64)),\n                ..Default::default()\n            },\n        );\n        let MutationOccurred::Yes(response) = shards.acquire_shards(request.clone()).unwrap()\n        else {\n            panic!(\"expected `MutationOccurred::Yes`\");\n        };\n        assert_eq!(response.acquired_shards.len(), 1);\n        let acquired_shard = &response.acquired_shards[0];\n        assert_eq!(acquired_shard.shard_id(), ShardId::from(0));\n\n        assert_eq!(\n            shards\n                .shards\n                .get(&ShardId::from(0))\n                .unwrap()\n                .publish_token(),\n            \"test-publish-token\"\n        );\n    }\n\n    #[test]\n    fn test_delete_shards() {\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let mut shards = Shards::empty(index_uid.clone(), source_id.clone());\n\n        let request = DeleteShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_ids: Vec::new(),\n            force: false,\n        };\n        let MutationOccurred::No(response) = shards.delete_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::No`\");\n        };\n        assert_eq!(response.index_uid(), &index_uid);\n        assert_eq!(response.source_id, source_id);\n        assert!(response.successes.is_empty());\n        assert!(response.failures.is_empty());\n\n        let request = DeleteShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_ids: vec![ShardId::from(0)],\n            force: false,\n        };\n        let MutationOccurred::No(response) = shards.delete_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::No`\");\n        };\n        assert_eq!(response.index_uid(), &index_uid);\n        
assert_eq!(response.source_id, source_id);\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.successes[0], ShardId::from(0));\n        assert!(response.failures.is_empty());\n\n        shards.shards.insert(\n            ShardId::from(0),\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(0)),\n                shard_state: ShardState::Open as i32,\n                publish_position_inclusive: Some(Position::eof(0u64)),\n                ..Default::default()\n            },\n        );\n        shards.shards.insert(\n            ShardId::from(1),\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                publish_position_inclusive: Some(Position::offset(0u64)),\n                ..Default::default()\n            },\n        );\n        let request = DeleteShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_ids: vec![ShardId::from(0), ShardId::from(1)],\n            force: false,\n        };\n        let MutationOccurred::Yes(response) = shards.delete_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::Yes`\");\n        };\n        assert_eq!(response.index_uid(), &index_uid);\n        assert_eq!(response.source_id, source_id);\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.successes[0], ShardId::from(0));\n        assert_eq!(response.failures.len(), 1);\n        assert_eq!(response.failures[0], ShardId::from(1));\n\n        let request = DeleteShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            shard_ids: vec![ShardId::from(1)],\n            
force: true,\n        };\n        let MutationOccurred::Yes(response) = shards.delete_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::Yes`\");\n        };\n        assert_eq!(response.index_uid(), &index_uid);\n        assert_eq!(response.source_id, source_id);\n        assert_eq!(response.successes.len(), 1);\n        assert_eq!(response.successes[0], ShardId::from(1));\n        assert!(response.failures.is_empty());\n\n        assert!(shards.shards.is_empty());\n    }\n\n    #[test]\n    fn test_prune_shards() {\n        let index_uid = IndexUid::for_test(\"test-index\", 0);\n        let source_id = \"test-source\".to_string();\n        let mut shards = Shards::empty(index_uid.clone(), source_id.clone());\n\n        let request = PruneShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            max_age_secs: None,\n            max_count: None,\n            interval_secs: None,\n        };\n        let MutationOccurred::No(()) = shards.prune_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::No`\");\n        };\n\n        let request = PruneShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            max_age_secs: Some(50),\n            max_count: None,\n            interval_secs: None,\n        };\n        let MutationOccurred::No(()) = shards.prune_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::No`\");\n        };\n\n        let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n        shards.shards.insert(\n            ShardId::from(0),\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(0)),\n                shard_state: ShardState::Open as i32,\n                publish_position_inclusive: 
Some(Position::eof(0u64)),\n                update_timestamp: current_timestamp - 200,\n                ..Default::default()\n            },\n        );\n        shards.shards.insert(\n            ShardId::from(1),\n            Shard {\n                index_uid: Some(index_uid.clone()),\n                source_id: source_id.clone(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                publish_position_inclusive: Some(Position::offset(0u64)),\n                update_timestamp: current_timestamp - 100,\n                ..Default::default()\n            },\n        );\n\n        let request = PruneShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            max_age_secs: Some(150),\n            max_count: None,\n            interval_secs: None,\n        };\n        let MutationOccurred::Yes(()) = shards.prune_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::Yes`\");\n        };\n\n        let request = PruneShardsRequest {\n            index_uid: Some(index_uid.clone()),\n            source_id: source_id.clone(),\n            max_age_secs: Some(150),\n            max_count: None,\n            interval_secs: None,\n        };\n        let MutationOccurred::No(()) = shards.prune_shards(request).unwrap() else {\n            panic!(\"expected `MutationOccurred::No`\");\n        };\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/file_backed_metastore_factory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::str::FromStr;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse once_cell::sync::OnceCell;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{MetastoreBackend, MetastoreConfig};\nuse quickwit_proto::metastore::{MetastoreError, MetastoreServiceClient};\nuse quickwit_storage::{StorageResolver, StorageResolverError};\nuse regex::Regex;\nuse tokio::sync::Mutex;\nuse tracing::debug;\n\nuse crate::{FileBackedMetastore, MetastoreFactory, MetastoreResolverError};\n\n/// A file-backed metastore factory.\n///\n/// The implementation ensures that there is only\n/// one living instance of `FileBackedMetastore` per metastore URI.\n/// As a result, within the same process, as long as we keep a single\n/// `FileBackedMetastoreFactory`, it is safe to use the file-backed\n/// metastore, even from different threads.\n#[derive(Clone)]\npub struct FileBackedMetastoreFactory {\n    storage_resolver: StorageResolver,\n    // We never garbage collect unused metastore client instances. 
This should not be a problem\n    // because during normal use this cache will hold at most a single instance.\n    cache: Arc<Mutex<HashMap<Uri, MetastoreServiceClient>>>,\n}\n\nfn extract_polling_interval_from_uri(uri: &str) -> (String, Option<Duration>) {\n    static URI_FRAGMENT_PATTERN: OnceCell<Regex> = OnceCell::new();\n    if let Some(captures) = URI_FRAGMENT_PATTERN\n        .get_or_init(|| Regex::new(\"(.*)#polling_interval=([1-9][0-9]{0,8})s\").unwrap())\n        .captures(uri)\n    {\n        let uri_without_fragment = captures.get(1).unwrap().as_str().to_string();\n        let polling_interval_in_secs: u64 =\n            captures.get(2).unwrap().as_str().parse::<u64>().unwrap();\n        (\n            uri_without_fragment,\n            Some(Duration::from_secs(polling_interval_in_secs)),\n        )\n    } else {\n        (uri.to_string(), None)\n    }\n}\n\nimpl FileBackedMetastoreFactory {\n    /// Creates a new [`FileBackedMetastoreFactory`].\n    pub fn new(storage_resolver: StorageResolver) -> Self {\n        Self {\n            storage_resolver,\n            cache: Default::default(),\n        }\n    }\n\n    async fn get_from_cache(&self, uri: &Uri) -> Option<MetastoreServiceClient> {\n        self.cache.lock().await.get(uri).cloned()\n    }\n\n    /// If there is a valid entry in the cache to begin with, we ignore the new\n    /// metastore and return the old one.\n    ///\n    /// This way we make sure that we keep only one instance associated\n    /// to the key `uri` outside of this struct.\n    async fn cache_metastore(\n        &self,\n        uri: Uri,\n        metastore: MetastoreServiceClient,\n    ) -> MetastoreServiceClient {\n        self.cache\n            .lock()\n            .await\n            .entry(uri)\n            .or_insert(metastore)\n            .clone()\n    }\n}\n\n#[async_trait]\nimpl MetastoreFactory for FileBackedMetastoreFactory {\n    fn backend(&self) -> MetastoreBackend {\n        MetastoreBackend::File\n    }\n\n 
   async fn resolve(\n        &self,\n        _metastore_config: &MetastoreConfig,\n        uri: &Uri,\n    ) -> Result<MetastoreServiceClient, MetastoreResolverError> {\n        let (uri_stripped, polling_interval_opt) = extract_polling_interval_from_uri(uri.as_str());\n        let uri = Uri::from_str(&uri_stripped).map_err(|_| {\n            MetastoreResolverError::InvalidConfig(format!(\"invalid URI: `{uri_stripped}`\"))\n        })?;\n        if let Some(metastore) = self.get_from_cache(&uri).await {\n            debug!(\"using metastore from cache\");\n            return Ok(metastore);\n        }\n        debug!(\"metastore not found in cache\");\n        let storage = self\n            .storage_resolver\n            .resolve(&uri)\n            .await\n            .map_err(|err| match err {\n                StorageResolverError::InvalidConfig(message) => {\n                    MetastoreResolverError::InvalidConfig(message)\n                }\n                StorageResolverError::InvalidUri(message) => {\n                    MetastoreResolverError::InvalidUri(message)\n                }\n                StorageResolverError::UnsupportedBackend(message) => {\n                    MetastoreResolverError::UnsupportedBackend(message)\n                }\n                StorageResolverError::FailedToOpenStorage { kind, message } => {\n                    MetastoreResolverError::Initialization(MetastoreError::Internal {\n                        message: format!(\"failed to open metastore file `{uri}`\"),\n                        cause: format!(\"StorageError {kind:?}: {message}\"),\n                    })\n                }\n            })?;\n        let file_backed_metastore = FileBackedMetastore::try_new(storage, polling_interval_opt)\n            .await\n            .map(MetastoreServiceClient::new)\n            .map_err(MetastoreResolverError::Initialization)?;\n        let unique_metastore_for_uri = self.cache_metastore(uri, file_backed_metastore).await;\n       
 Ok(unique_metastore_for_uri)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use crate::metastore::file_backed::file_backed_metastore_factory::extract_polling_interval_from_uri;\n\n    #[test]\n    fn test_extract_polling_interval_from_uri() {\n        assert_eq!(\n            extract_polling_interval_from_uri(\"file://some-uri#polling_interval=23s\"),\n            (\"file://some-uri\".to_string(), Some(Duration::from_secs(23)))\n        );\n        assert_eq!(\n            extract_polling_interval_from_uri(\n                \"file://some-uri#polling_interval=18446744073709551616s\"\n            ),\n            (\n                \"file://some-uri#polling_interval=18446744073709551616s\".to_string(),\n                None\n            )\n        );\n        assert_eq!(\n            extract_polling_interval_from_uri(\"file://some-uri#polling_interval=0s\"),\n            (\"file://some-uri#polling_interval=0s\".to_string(), None)\n        );\n        assert_eq!(\n            extract_polling_interval_from_uri(\"file://some-uri#otherfragment#polling_interval=10s\"),\n            (\n                \"file://some-uri#otherfragment\".to_string(),\n                Some(Duration::from_secs(10))\n            )\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/index_id_matcher.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::validate_index_id_pattern;\nuse quickwit_proto::metastore::{MetastoreError, MetastoreResult};\nuse regex::RegexSet;\nuse regex_syntax::escape_into;\n\npub(super) type IndexIdPattern = String;\n\n#[derive(Debug)]\npub(super) struct IndexIdMatcher {\n    positive_matcher: RegexSet,\n    negative_matcher: RegexSet,\n}\n\nimpl IndexIdMatcher {\n    /// Builds an [`IndexIdMatcher`] from a set of index ID patterns using the following rules:\n    /// - If the given pattern does not contain a `*` char, it matches the exact pattern.\n    /// - If the given pattern contains one or more `*`, it matches the regex built from the pattern\n    ///   where `*` is replaced by `.*`. 
All other regular expression meta characters are escaped.\n    pub fn try_from_index_id_patterns(\n        index_id_patterns: &[IndexIdPattern],\n    ) -> MetastoreResult<Self> {\n        let mut positive_patterns: Vec<&str> = Vec::new();\n        let mut negative_patterns: Vec<&str> = Vec::new();\n\n        for pattern in index_id_patterns {\n            if let Some(negative_pattern) = pattern.strip_prefix('-') {\n                negative_patterns.push(negative_pattern);\n            } else {\n                positive_patterns.push(pattern);\n            }\n        }\n        if positive_patterns.is_empty() {\n            let message = \"failed to build index ID matcher: at least one positive index ID \\\n                           pattern must be provided\"\n                .to_string();\n            return Err(MetastoreError::InvalidArgument { message });\n        }\n        let positive_matcher = build_regex_set(&positive_patterns)?;\n        let negative_matcher = build_regex_set(&negative_patterns)?;\n\n        let matcher = IndexIdMatcher {\n            positive_matcher,\n            negative_matcher,\n        };\n        Ok(matcher)\n    }\n\n    pub fn is_match(&self, index_id: &str) -> bool {\n        self.positive_matcher.is_match(index_id) && !self.negative_matcher.is_match(index_id)\n    }\n}\n\nfn build_regex_set(patterns: &[&str]) -> MetastoreResult<RegexSet> {\n    for pattern in patterns {\n        if *pattern == \"*\" {\n            let regex_set = RegexSet::new([\".*\"]).expect(\"regular expression set should compile\");\n            return Ok(regex_set);\n        }\n        validate_index_id_pattern(pattern, false).map_err(|error| {\n            let message = format!(\"failed to build index ID matcher: {error}\");\n            MetastoreError::InvalidArgument { message }\n        })?;\n    }\n    let regexes = patterns.iter().map(|pattern| build_regex(pattern));\n\n    let regex_set = RegexSet::new(regexes).map_err(|error| {\n        let message 
= format!(\"failed to build index ID matcher: {error}\");\n        MetastoreError::InvalidArgument { message }\n    })?;\n    Ok(regex_set)\n}\n\nfn build_regex(pattern: &str) -> String {\n    let mut regex = String::new();\n    regex.push('^');\n\n    for (idx, part) in pattern.split('*').enumerate() {\n        if idx > 0 {\n            regex.push_str(\".*\");\n        }\n        escape_into(part, &mut regex);\n    }\n    regex.push('$');\n    regex\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_build_regex() {\n        let regex = build_regex(\"\");\n        assert_eq!(regex, r\"^$\");\n\n        let regex = build_regex(\"*\");\n        assert_eq!(regex, r\"^.*$\");\n\n        let regex = build_regex(\"index-1\");\n        assert_eq!(regex, r\"^index\\-1$\");\n\n        let regex = build_regex(\"*-index-*-1\");\n        assert_eq!(regex, r\"^.*\\-index\\-.*\\-1$\");\n\n        let regex = build_regex(\"INDEX.2*-1\");\n        assert_eq!(regex, r\"^INDEX\\.2.*\\-1$\");\n    }\n\n    #[test]\n    fn test_build_regex_set() {\n        let error = build_regex_set(&[\"_index-1\"]).unwrap_err();\n        assert!(matches!(error, MetastoreError::InvalidArgument { .. 
}));\n\n        let regex_set = build_regex_set(&[\"index-1\"]).unwrap();\n        assert!(regex_set.is_match(\"index-1\"));\n        assert!(!regex_set.is_match(\"index-2\"));\n\n        let regex_set = build_regex_set(&[\"index-1\", \"index-2\"]).unwrap();\n        assert!(regex_set.is_match(\"index-1\"));\n        assert!(regex_set.is_match(\"index-2\"));\n        assert!(!regex_set.is_match(\"index-3\"));\n\n        let regex_set = build_regex_set(&[\"index-1*\"]).unwrap();\n        assert!(regex_set.is_match(\"index-1\"));\n        assert!(regex_set.is_match(\"index-10\"));\n        assert!(!regex_set.is_match(\"index-2\"));\n    }\n\n    #[test]\n    fn test_index_id_matcher() {\n        let error = IndexIdMatcher::try_from_index_id_patterns(&[]).unwrap_err();\n        assert!(matches!(error, MetastoreError::InvalidArgument { .. }));\n\n        let matcher = IndexIdMatcher::try_from_index_id_patterns(&[\n            \"index-foo*\".to_string(),\n            \"-index-foobar\".to_string(),\n        ])\n        .unwrap();\n        assert!(matcher.is_match(\"index-foo\"));\n        assert!(matcher.is_match(\"index-fooo\"));\n        assert!(!matcher.is_match(\"index-foobar\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/index_template_matcher.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::{IndexTemplate, IndexTemplateId};\nuse quickwit_proto::metastore::MetastoreResult;\n\nuse super::index_id_matcher::IndexIdMatcher;\n\nstruct InnerMatcher {\n    template_id: IndexTemplateId,\n    priority: usize,\n    matcher: IndexIdMatcher,\n}\n\nimpl InnerMatcher {\n    /// Compares two matchers by (-<priority>, <template ID>)\n    fn cmp_by_priority_desc(&self, other: &Self) -> std::cmp::Ordering {\n        self.priority\n            .cmp(&other.priority)\n            .reverse()\n            .then_with(|| self.template_id.cmp(&other.template_id))\n    }\n\n    fn is_match(&self, index_id: &str) -> bool {\n        self.matcher.is_match(index_id)\n    }\n}\n\n/// Finds the best matching index template for a given index ID. The matching algorithm is naive and\n/// should be improved to support a large number of templates, should the need arise. 
It maintains a\n/// list of index template matchers sorted by priority and performs a linear search returning the\n/// first match.\n#[derive(Default)]\npub(super) struct IndexTemplateMatcher {\n    inner_matchers: Vec<InnerMatcher>,\n}\n\nimpl IndexTemplateMatcher {\n    pub fn try_from_index_templates<'a>(\n        templates: impl Iterator<Item = &'a IndexTemplate> + 'a,\n    ) -> MetastoreResult<Self> {\n        let mut inner_matchers = Vec::new();\n\n        for template in templates {\n            let matcher = IndexIdMatcher::try_from_index_id_patterns(&template.index_id_patterns)?;\n            let inner_matcher = InnerMatcher {\n                template_id: template.template_id.clone(),\n                priority: template.priority,\n                matcher,\n            };\n            inner_matchers.push(inner_matcher);\n        }\n        let mut matcher = Self { inner_matchers };\n        matcher.sort_by_priority_desc();\n\n        Ok(matcher)\n    }\n\n    pub fn insert(&mut self, template: &IndexTemplate) -> MetastoreResult<()> {\n        let matcher = IndexIdMatcher::try_from_index_id_patterns(&template.index_id_patterns)?;\n        let inner_matcher = InnerMatcher {\n            template_id: template.template_id.clone(),\n            priority: template.priority,\n            matcher,\n        };\n        self.inner_matchers.push(inner_matcher);\n        self.sort_by_priority_desc();\n\n        Ok(())\n    }\n\n    pub fn remove(&mut self, template_id: &str) {\n        self.inner_matchers\n            .retain(|matcher| matcher.template_id != *template_id);\n    }\n\n    pub fn find_match(&self, index_id: &str) -> Option<IndexTemplateId> {\n        self.inner_matchers\n            .iter()\n            .find(|inner_matcher| inner_matcher.is_match(index_id))\n            .map(|inner_matcher| inner_matcher.template_id.clone())\n    }\n\n    fn sort_by_priority_desc(&mut self) {\n        self.inner_matchers\n            
.sort_unstable_by(InnerMatcher::cmp_by_priority_desc)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_index_template_matcher() {\n        let index_template_bar =\n            IndexTemplate::for_test(\"test-template-bar\", &[\"test-index-bar*\"], 0);\n        let index_template_foo =\n            IndexTemplate::for_test(\"test-template-foo\", &[\"test-index-foo*\"], 100);\n        let index_template_foobar =\n            IndexTemplate::for_test(\"test-template-foobar\", &[\"test-index-foobar*\"], 200);\n\n        let mut matcher = IndexTemplateMatcher::default();\n        matcher.insert(&index_template_foo).unwrap();\n        matcher.insert(&index_template_bar).unwrap();\n\n        assert_eq!(\n            matcher.find_match(\"test-index-bar-1\").unwrap(),\n            \"test-template-bar\"\n        );\n        assert_eq!(\n            matcher.find_match(\"test-index-foobar\").unwrap(),\n            \"test-template-foo\"\n        );\n        assert_eq!(\n            matcher.find_match(\"test-index-foo\").unwrap(),\n            \"test-template-foo\"\n        );\n\n        matcher.insert(&index_template_foobar).unwrap();\n        assert_eq!(\n            matcher.find_match(\"test-index-foobar\").unwrap(),\n            \"test-template-foobar\"\n        );\n\n        matcher.remove(\"test-template-foobar\");\n        assert_eq!(\n            matcher.find_match(\"test-index-foobar\").unwrap(),\n            \"test-template-foo\"\n        );\n\n        matcher.remove(\"test-template-foo\");\n        assert!(matcher.find_match(\"test-index-foobar\").is_none())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/lazy_file_backed_index.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::{Arc, Weak};\nuse std::time::Duration;\n\nuse quickwit_proto::metastore::{EntityKind, MetastoreError, MetastoreResult};\nuse quickwit_proto::types::IndexId;\nuse quickwit_storage::Storage;\nuse tokio::sync::{Mutex, OnceCell};\nuse tracing::error;\n\nuse super::file_backed_index::FileBackedIndex;\nuse super::store_operations::{METASTORE_FILE_NAME, load_index};\n\n/// Lazy [`FileBackedIndex`]. It loads a `FileBackedIndex` on demand. 
When the index is first\n/// loaded, it optionally spawns a task to periodically poll the storage and update the index.\npub(crate) struct LazyFileBackedIndex {\n    index_id: IndexId,\n    storage: Arc<dyn Storage>,\n    polling_interval_opt: Option<Duration>,\n    lazy_index: OnceCell<Arc<Mutex<FileBackedIndex>>>,\n}\n\nimpl LazyFileBackedIndex {\n    /// Creates a `LazyFileBackedIndex`.\n    pub fn new(\n        storage: Arc<dyn Storage>,\n        index_id: IndexId,\n        polling_interval_opt: Option<Duration>,\n        file_backed_index: Option<FileBackedIndex>,\n    ) -> Self {\n        let index_mutex_opt = file_backed_index.map(|index| Arc::new(Mutex::new(index)));\n        // If the polling interval is configured and the index is already loaded,\n        // spawn the polling task immediately.\n        if let Some(index_mutex) = &index_mutex_opt\n            && let Some(polling_interval) = polling_interval_opt\n        {\n            spawn_index_metadata_polling_task(\n                storage.clone(),\n                index_id.clone(),\n                Arc::downgrade(index_mutex),\n                polling_interval,\n            );\n        }\n        Self {\n            index_id,\n            storage,\n            polling_interval_opt,\n            lazy_index: OnceCell::new_with(index_mutex_opt),\n        }\n    }\n\n    /// Gets a synchronized `FileBackedIndex`. 
If the index wasn't provided on creation, we load it\n    /// lazily on the first call of this method.\n    pub async fn get(&self) -> MetastoreResult<Arc<Mutex<FileBackedIndex>>> {\n        self.lazy_index\n            .get_or_try_init(|| async move {\n                let index = load_index(&*self.storage, &self.index_id).await?;\n                let index_mutex = Arc::new(Mutex::new(index));\n                // When the index is loaded lazily, the polling task is not started in the\n                // constructor so we do it here when the index is actually loaded.\n                if let Some(polling_interval) = self.polling_interval_opt {\n                    spawn_index_metadata_polling_task(\n                        self.storage.clone(),\n                        self.index_id.clone(),\n                        Arc::downgrade(&index_mutex),\n                        polling_interval,\n                    );\n                }\n                Ok(index_mutex)\n            })\n            .await\n            .cloned()\n    }\n}\n\nasync fn poll_index_metadata_once(\n    storage: &dyn Storage,\n    index_id: &str,\n    index_mutex: &Mutex<FileBackedIndex>,\n) {\n    let mut locked_index = index_mutex.lock().await;\n    if locked_index.flip_recently_modified_down() {\n        return;\n    }\n    let load_index_result = load_index(storage, index_id).await;\n\n    match load_index_result {\n        Ok(index) => {\n            *locked_index = index;\n        }\n        Err(MetastoreError::NotFound(EntityKind::Index { .. })) => {\n            // The index has been deleted by the file-backed metastore holding a reference to this\n            // index. When it removes an index, it does so without holding the lock on the target\n            // index. 
As a result, the associated polling task may run for one\n            // more iteration before exiting, and `load_index` returns a `NotFound` error.\n        }\n        Err(metastore_error) => {\n            error!(\n                error=%metastore_error,\n                \"failed to load index metadata from metastore file located at `{}/{index_id}/{METASTORE_FILE_NAME}`\",\n                storage.uri()\n            );\n        }\n    }\n}\n\nfn spawn_index_metadata_polling_task(\n    storage: Arc<dyn Storage>,\n    index_id: IndexId,\n    metastore_weak: Weak<Mutex<FileBackedIndex>>,\n    polling_interval: Duration,\n) {\n    tokio::task::spawn(async move {\n        let mut interval = tokio::time::interval(polling_interval);\n        interval.set_missed_tick_behavior(tokio::time::MissedTickBehavior::Delay);\n        // The first tick completes immediately. Consume it here to avoid fetching the index\n        // right after the data is first populated.\n        interval.tick().await;\n\n        while let Some(metadata_mutex) = metastore_weak.upgrade() {\n            interval.tick().await;\n            poll_index_metadata_once(&*storage, &index_id, &metadata_mutex).await;\n        }\n    });\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/manifest.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, HashMap};\nuse std::path::Path;\n\nuse itertools::Itertools;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{IndexTemplate, IndexTemplateId};\nuse quickwit_proto::metastore::{MetastoreError, MetastoreResult, serde_utils};\nuse quickwit_proto::types::{DocMappingUid, IndexId};\nuse quickwit_storage::{OwnedBytes, Storage, StorageError, StorageErrorKind, StorageResult};\nuse serde::{Deserialize, Serialize};\nuse tracing::error;\nuse uuid::Uuid;\n\npub(super) const MANIFEST_FILE_NAME: &str = \"manifest.json\";\n\n// The legacy manifest file was deprecated in 0.8.0, we can drop support for it in 0.10.0 or 0.11.0.\nconst LEGACY_MANIFEST_FILE_NAME: &str = \"indexes_states.json\";\n\n#[derive(Clone, Debug, Deserialize)]\nstruct LegacyManifest {\n    #[serde(default, flatten)]\n    indexes: BTreeMap<IndexId, IndexStatus>,\n}\n\nimpl LegacyManifest {\n    fn into_manifest(self) -> Manifest {\n        Manifest {\n            indexes: self.indexes,\n            templates: HashMap::new(),\n            identity: Uuid::nil(),\n        }\n    }\n}\n\n// TODO: Remove the aliases once we drop support for the legacy manifest file.\n#[derive(Clone, Copy, Debug, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub(crate) enum IndexStatus {\n    #[serde(alias = \"Creating\")]\n    Creating,\n    
#[serde(alias = \"Alive\")]\n    Active,\n    #[serde(alias = \"Deleting\")]\n    Deleting,\n}\n\n#[derive(Clone, Debug, Default, PartialEq, Serialize, Deserialize)]\n#[serde(into = \"VersionedManifest\")]\n#[serde(from = \"VersionedManifest\")]\npub(crate) struct Manifest {\n    pub indexes: BTreeMap<IndexId, IndexStatus>,\n    // The templates are serialized as a sorted `Vec<IndexTemplate>` so the btree map is\n    // unnecessary here and we can pass the hash map as is to the `MetastoreState`\n    pub templates: HashMap<IndexTemplateId, IndexTemplate>,\n    pub identity: Uuid,\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\n#[serde(tag = \"version\")]\nenum VersionedManifest {\n    // The two versions use the same format but for v0.8 and below, we need to set the\n    // `doc_mapping_uid` to the nil value upon deserialization.\n    #[serde(rename = \"0.9\")]\n    V0_9(ManifestV0_8),\n    #[serde(alias = \"0.8\")]\n    #[serde(alias = \"0.7\")]\n    V0_8(ManifestV0_8),\n}\n\nimpl From<Manifest> for VersionedManifest {\n    fn from(manifest: Manifest) -> Self {\n        VersionedManifest::V0_9(manifest.into())\n    }\n}\n\nimpl From<VersionedManifest> for Manifest {\n    fn from(versioned_manifest: VersionedManifest) -> Self {\n        match versioned_manifest {\n            VersionedManifest::V0_8(mut manifest) => {\n                for template in &mut manifest.templates {\n                    // Override the randomly generated doc mapping UID with the nil value.\n                    template.doc_mapping.doc_mapping_uid = DocMappingUid::default();\n                }\n                manifest.into()\n            }\n            VersionedManifest::V0_9(manifest) => manifest.into(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\nstruct ManifestV0_8 {\n    indexes: BTreeMap<IndexId, IndexStatus>,\n    templates: Vec<IndexTemplate>,\n    #[serde(default, skip_serializing_if = \"Uuid::is_nil\")]\n    identity: Uuid,\n}\n\nimpl 
From<Manifest> for ManifestV0_8 {\n    fn from(manifest: Manifest) -> Self {\n        let templates = manifest\n            .templates\n            .into_values()\n            .sorted_unstable_by(|left, right| left.template_id.cmp(&right.template_id))\n            .collect();\n        ManifestV0_8 {\n            indexes: manifest.indexes,\n            templates,\n            identity: manifest.identity,\n        }\n    }\n}\n\nimpl From<ManifestV0_8> for Manifest {\n    fn from(manifest: ManifestV0_8) -> Self {\n        let indexes = manifest.indexes.into_iter().collect();\n        let templates = manifest\n            .templates\n            .into_iter()\n            .map(|template| (template.template_id.clone(), template))\n            .collect();\n        Manifest {\n            indexes,\n            templates,\n            identity: manifest.identity,\n        }\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl quickwit_config::TestableForRegression for Manifest {\n    fn sample_for_regression() -> Self {\n        let mut indexes = BTreeMap::new();\n        indexes.insert(\"test-index-1\".to_string(), IndexStatus::Creating);\n        indexes.insert(\"test-index-2\".to_string(), IndexStatus::Active);\n        indexes.insert(\"test-index-3\".to_string(), IndexStatus::Deleting);\n\n        let mut templates = HashMap::new();\n        templates.insert(\n            \"test-template-1\".to_string(),\n            IndexTemplate::sample_for_regression(),\n        );\n        Manifest {\n            indexes,\n            templates,\n            identity: Uuid::nil(),\n        }\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        assert_eq!(self.indexes, other.indexes);\n        assert_eq!(self.templates, other.templates);\n    }\n}\n\npub(super) async fn load_or_create_manifest(storage: &dyn Storage) -> MetastoreResult<Manifest> {\n    if file_exists(storage, MANIFEST_FILE_NAME).await? 
{\n        let manifest_json = get_bytes(storage, MANIFEST_FILE_NAME).await?;\n        let manifest: Manifest = serde_utils::from_json_bytes(&manifest_json)?;\n        return Ok(manifest);\n    }\n    if file_exists(storage, LEGACY_MANIFEST_FILE_NAME).await? {\n        let legacy_manifest_json = get_bytes(storage, LEGACY_MANIFEST_FILE_NAME).await?;\n        let legacy_manifest: LegacyManifest = serde_utils::from_json_bytes(&legacy_manifest_json)?;\n        let manifest = legacy_manifest.into_manifest();\n        save_manifest(storage, &manifest).await?;\n\n        if let Err(storage_error) = delete_file(storage, LEGACY_MANIFEST_FILE_NAME).await {\n            error!(\n                error=%storage_error,\n                \"failed to delete legacy manifest file located at `{}/{LEGACY_MANIFEST_FILE_NAME}`\", storage.uri()\n            );\n        }\n        return Ok(manifest);\n    }\n    let manifest = Manifest::default();\n    save_manifest(storage, &manifest).await?;\n    Ok(manifest)\n}\n\npub(super) async fn save_manifest(\n    storage: &dyn Storage,\n    manifest: &Manifest,\n) -> MetastoreResult<()> {\n    let manifest_json_bytes = serde_utils::to_json_bytes_pretty(manifest)?;\n    put_bytes(storage, MANIFEST_FILE_NAME, manifest_json_bytes).await?;\n    Ok(())\n}\n\nasync fn delete_file(storage: &dyn Storage, path: &str) -> StorageResult<()> {\n    storage.delete(Path::new(path)).await?;\n    Ok(())\n}\n\nasync fn file_exists(storage: &dyn Storage, path_str: &str) -> MetastoreResult<bool> {\n    let path = Path::new(path_str);\n    let exists = storage.exists(path).await.map_err(|storage_error| {\n        into_metastore_error(storage_error, storage.uri(), path, \"list\")\n    })?;\n    Ok(exists)\n}\n\nasync fn get_bytes(storage: &dyn Storage, path_str: &str) -> MetastoreResult<OwnedBytes> {\n    let path = Path::new(path_str);\n    let bytes = storage.get_all(path).await.map_err(|storage_error| {\n        into_metastore_error(storage_error, storage.uri(), 
path, \"load\")\n    })?;\n    Ok(bytes)\n}\n\nasync fn put_bytes(storage: &dyn Storage, path_str: &str, content: Vec<u8>) -> MetastoreResult<()> {\n    let path = Path::new(path_str);\n    storage\n        .put(path, Box::new(content))\n        .await\n        .map_err(|storage_error| {\n            into_metastore_error(storage_error, storage.uri(), path, \"save\")\n        })?;\n    Ok(())\n}\n\nfn into_metastore_error(\n    storage_error: StorageError,\n    uri: &Uri,\n    path: &Path,\n    operation_name: &str,\n) -> MetastoreError {\n    match storage_error.kind() {\n        StorageErrorKind::Unauthorized => MetastoreError::Forbidden {\n            message: format!(\n                \"failed to access manifest file located at `{uri}/{}`: unauthorized\",\n                path.display()\n            ),\n        },\n        _ => MetastoreError::Internal {\n            message: format!(\n                \"failed to {operation_name} manifest file located at `{uri}/{}`\",\n                path.display()\n            ),\n            cause: storage_error.to_string(),\n        },\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use serde_json::json;\n\n    use super::*;\n\n    #[test]\n    fn test_legacy_manifest_deserialization() {\n        let legacy_manifest_json = r#\"{\n            \"test-index-1\": \"Creating\",\n            \"test-index-2\": \"Alive\",\n            \"test-index-3\": \"Deleting\"\n        }\n        \"#;\n        let legacy_manifest: LegacyManifest = serde_json::from_str(legacy_manifest_json).unwrap();\n        assert_eq!(legacy_manifest.indexes.len(), 3);\n\n        assert_eq!(\n            legacy_manifest.indexes.get(\"test-index-1\").unwrap(),\n            &IndexStatus::Creating\n        );\n        assert_eq!(\n            legacy_manifest.indexes.get(\"test-index-2\").unwrap(),\n            &IndexStatus::Active\n        );\n        assert_eq!(\n            legacy_manifest.indexes.get(\"test-index-3\").unwrap(),\n            
&IndexStatus::Deleting\n        );\n    }\n\n    #[test]\n    fn test_legacy_manifest_into_manifest() {\n        let legacy_manifest = LegacyManifest {\n            indexes: vec![\n                (\"test-index-1\".to_string(), IndexStatus::Creating),\n                (\"test-index-2\".to_string(), IndexStatus::Active),\n                (\"test-index-3\".to_string(), IndexStatus::Deleting),\n            ]\n            .into_iter()\n            .collect(),\n        };\n        let manifest = legacy_manifest.into_manifest();\n\n        assert_eq!(manifest.indexes.len(), 3);\n        assert_eq!(manifest.templates.len(), 0);\n\n        assert_eq!(\n            manifest.indexes.get(\"test-index-1\").unwrap(),\n            &IndexStatus::Creating\n        );\n        assert_eq!(\n            manifest.indexes.get(\"test-index-2\").unwrap(),\n            &IndexStatus::Active\n        );\n        assert_eq!(\n            manifest.indexes.get(\"test-index-3\").unwrap(),\n            &IndexStatus::Deleting\n        );\n    }\n\n    #[test]\n    fn test_manifest_serde() {\n        let indexes = BTreeMap::from_iter([\n            (\"test-index-1\".to_string(), IndexStatus::Creating),\n            (\"test-index-2\".to_string(), IndexStatus::Active),\n            (\"test-index-3\".to_string(), IndexStatus::Deleting),\n        ]);\n        let templates = HashMap::from_iter([\n            (\n                \"test-template-1\".to_string(),\n                IndexTemplate::for_test(\"test-template-1\", &[\"test-index-foo*\"], 100),\n            ),\n            (\n                \"test-template-2\".to_string(),\n                IndexTemplate::for_test(\"test-template-2\", &[\"test-index-bar*\"], 200),\n            ),\n        ]);\n        let manifest = Manifest {\n            indexes,\n            templates,\n            identity: Uuid::nil(),\n        };\n        let manifest_json = serde_json::to_string_pretty(&manifest).unwrap();\n        let manifest_deserialized: Manifest = 
serde_json::from_str(&manifest_json).unwrap();\n        assert_eq!(manifest, manifest_deserialized);\n    }\n\n    #[tokio::test]\n    async fn test_create_mutate_save_load_manifest() {\n        let storage = quickwit_storage::storage_for_test();\n        let mut manifest = load_or_create_manifest(&*storage).await.unwrap();\n\n        assert_eq!(manifest.indexes.len(), 0);\n        assert_eq!(manifest.templates.len(), 0);\n\n        let empty_manifest_size = storage\n            .get_all(Path::new(MANIFEST_FILE_NAME))\n            .await\n            .unwrap()\n            .len();\n        assert!(empty_manifest_size > 0);\n\n        manifest\n            .indexes\n            .insert(\"test-index\".to_string(), IndexStatus::Creating);\n        manifest.templates.insert(\n            \"test-template\".to_string(),\n            IndexTemplate::for_test(\"test-template\", &[\"test-index-*\"], 100),\n        );\n\n        save_manifest(&*storage, &manifest).await.unwrap();\n\n        let populated_manifest_size = storage\n            .get_all(Path::new(MANIFEST_FILE_NAME))\n            .await\n            .unwrap()\n            .len();\n        assert!(populated_manifest_size > empty_manifest_size);\n\n        let manifest = load_or_create_manifest(&*storage).await.unwrap();\n        assert_eq!(manifest.indexes.len(), 1);\n        assert_eq!(\n            manifest.indexes.get(\"test-index\").unwrap(),\n            &IndexStatus::Creating\n        );\n\n        assert_eq!(manifest.templates.len(), 1);\n\n        let template = manifest.templates.get(\"test-template\").unwrap();\n        assert_eq!(template.template_id, \"test-template\");\n        assert_eq!(template.index_id_patterns, [\"test-index-*\"]);\n        assert_eq!(template.priority, 100);\n    }\n\n    #[tokio::test]\n    async fn test_legacy_manifest_migration() {\n        let storage = quickwit_storage::storage_for_test();\n        let legacy_manifest_json = json!(\n            {\n                
\"test-index-1\": \"Creating\",\n                \"test-index-2\": \"Alive\",\n                \"test-index-3\": \"Deleting\"\n            }\n        );\n        let legacy_manifest_json_bytes = serde_json::to_vec(&legacy_manifest_json).unwrap();\n\n        put_bytes(\n            &*storage,\n            LEGACY_MANIFEST_FILE_NAME,\n            legacy_manifest_json_bytes,\n        )\n        .await\n        .unwrap();\n\n        let manifest = load_or_create_manifest(&*storage).await.unwrap();\n        assert_eq!(manifest.indexes.len(), 3);\n        assert_eq!(manifest.templates.len(), 0);\n\n        assert_eq!(\n            manifest.indexes.get(\"test-index-1\").unwrap(),\n            &IndexStatus::Creating\n        );\n        assert_eq!(\n            manifest.indexes.get(\"test-index-2\").unwrap(),\n            &IndexStatus::Active\n        );\n        assert_eq!(\n            manifest.indexes.get(\"test-index-3\").unwrap(),\n            &IndexStatus::Deleting\n        );\n\n        let legacy_manifest_exists = file_exists(&*storage, LEGACY_MANIFEST_FILE_NAME)\n            .await\n            .unwrap();\n        assert!(!legacy_manifest_exists);\n\n        let manifest_exists = file_exists(&*storage, MANIFEST_FILE_NAME).await.unwrap();\n        assert!(manifest_exists);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Module for [`FileBackedMetastore`]. It is public so that the crate `quickwit-backward-compat`\n//! can import [`FileBackedIndex`] and run backward-compatibility tests. You should not have to\n//! import anything from here directly.\n\npub mod file_backed_index;\nmod file_backed_metastore_factory;\nmod index_id_matcher;\nmod index_template_matcher;\nmod lazy_file_backed_index;\npub(crate) mod manifest;\nmod state;\nmod store_operations;\n\nuse core::fmt;\nuse std::collections::HashMap;\nuse std::collections::hash_map::Entry;\nuse std::path::Path;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse futures::StreamExt;\nuse futures::future::try_join_all;\nuse futures::stream::FuturesUnordered;\nuse itertools::Itertools;\nuse quickwit_common::ServiceStream;\nuse quickwit_config::IndexTemplate;\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AcquireShardsResponse, AddSourceRequest, CreateIndexRequest,\n    CreateIndexResponse, CreateIndexTemplateRequest, DeleteIndexRequest,\n    DeleteIndexTemplatesRequest, DeleteQuery, DeleteShardsRequest, DeleteShardsResponse,\n    DeleteSourceRequest, DeleteSplitsRequest, DeleteTask, EmptyResponse, EntityKind,\n    FindIndexTemplateMatchesRequest, FindIndexTemplateMatchesResponse, GetClusterIdentityRequest,\n    GetClusterIdentityResponse, GetIndexTemplateRequest, 
GetIndexTemplateResponse,\n    IndexMetadataFailure, IndexMetadataFailureReason, IndexMetadataRequest, IndexMetadataResponse,\n    IndexTemplateMatch, IndexesMetadataRequest, IndexesMetadataResponse, LastDeleteOpstampRequest,\n    LastDeleteOpstampResponse, ListDeleteTasksRequest, ListDeleteTasksResponse,\n    ListIndexStatsRequest, ListIndexStatsResponse, ListIndexTemplatesRequest,\n    ListIndexTemplatesResponse, ListIndexesMetadataRequest, ListIndexesMetadataResponse,\n    ListShardsRequest, ListShardsResponse, ListSplitsRequest, ListSplitsResponse,\n    ListStaleSplitsRequest, MarkSplitsForDeletionRequest, MetastoreError, MetastoreResult,\n    MetastoreService, MetastoreServiceStream, OpenShardSubrequest, OpenShardsRequest,\n    OpenShardsResponse, PruneShardsRequest, PublishSplitsRequest, ResetSourceCheckpointRequest,\n    StageSplitsRequest, ToggleSourceRequest, UpdateIndexRequest, UpdateSourceRequest,\n    UpdateSplitsDeleteOpstampRequest, UpdateSplitsDeleteOpstampResponse, serde_utils,\n};\nuse quickwit_proto::types::{IndexId, IndexUid};\nuse quickwit_storage::Storage;\nuse time::OffsetDateTime;\nuse tokio::sync::{Mutex, OwnedMutexGuard, RwLock};\nuse ulid::Ulid;\nuse uuid::Uuid;\n\nuse self::file_backed_index::FileBackedIndex;\npub use self::file_backed_metastore_factory::FileBackedMetastoreFactory;\nuse self::index_id_matcher::IndexIdMatcher;\nuse self::lazy_file_backed_index::LazyFileBackedIndex;\nuse self::manifest::{MANIFEST_FILE_NAME, load_or_create_manifest, save_manifest};\nuse self::state::MetastoreState;\nuse self::store_operations::{delete_index, index_exists, load_index, put_index};\nuse super::{\n    AddSourceRequestExt, CreateIndexRequestExt, IndexMetadataResponseExt,\n    IndexesMetadataResponseExt, ListIndexesMetadataResponseExt, ListSplitsRequestExt,\n    ListSplitsResponseExt, PublishSplitsRequestExt, STREAM_SPLITS_CHUNK_SIZE,\n    StageSplitsRequestExt, UpdateIndexRequestExt, UpdateSourceRequestExt,\n};\nuse 
crate::checkpoint::IndexCheckpointDelta;\nuse crate::{IndexMetadata, ListSplitsQuery, MetastoreServiceExt, Split, SplitState};\n\n/// Status of an index tracked by the metastore.\npub(crate) enum LazyIndexStatus {\n    /// The index is being created but its metadata have yet to be written on the storage.\n    Creating,\n    /// The index is created and available.\n    Active(LazyFileBackedIndex),\n    /// The index is being deleted but its index metadata file has not yet been removed from\n    /// storage.\n    Deleting,\n}\n\n#[derive(Debug)]\npub(crate) enum MutationOccurred<T> {\n    Yes(T),\n    No(T),\n}\n\nimpl From<bool> for MutationOccurred<()> {\n    fn from(mutation_occurred: bool) -> Self {\n        if mutation_occurred {\n            Self::Yes(())\n        } else {\n            Self::No(())\n        }\n    }\n}\n\n/// A metastore implementation that stores the metadata of each index in its own file\n/// and stores a map of indexes\n/// (index_id, index_status) in a dedicated file `manifest.json`.\n///\n/// A [`LazyIndexStatus`] describes the lifecycle of an index: [`LazyIndexStatus::Creating`] and\n/// [`LazyIndexStatus::Deleting`] are transitioning states that indicate that the index is not\n/// yet available. On the contrary, the [`LazyIndexStatus::Active`] status indicates the index is\n/// ready to be fetched and updated.\n///\n/// Transitioning states are useful to track inconsistencies between the in-memory and on-disk data\n/// structures when error(s) occur during index creations and deletions:\n/// - `Creating` indicates that the metastore updated the manifest file with this state but not yet\n///   the index metadata file;\n/// - `Deleting` indicates that the metastore updated the manifest file with this state but the\n///   index metadata file is not yet deleted.\n///\n/// !!! 
Important note: the indexes map in the manifest does not\n/// guarantee exhaustiveness: an index metadata file can be on the storage\n/// but not present in the states map. As the map is incomplete, the metastore\n/// does not rely on it to check index existence. This leads to the following\n/// behaviors:\n/// - on creation, the metastore always checks if an index metadata file is already present on the\n///   storage even if the index is not in the indexes map;\n/// - on get/update of an index, same story, the metastore checks if the index is on the storage\n///   and, if present, the index is loaded in the map and returned/modified;\n/// - on deletion, same story, the metastore deletes an index metadata file present on the storage\n///   even if the index is not in the map.\n///\n/// !!! Important note 2: it is strongly advised to restrict the `FileBackedMetastore`\n/// usage to the following use cases:\n/// - testing;\n/// - single-node environment;\n/// - multi-node environment with only one writer and readers. 
In this case, you must be very\n///   cautious and ensure that your readers are really readers.\n#[derive(Clone)]\npub struct FileBackedMetastore {\n    state: Arc<RwLock<MetastoreState>>,\n    storage: Arc<dyn Storage>,\n    polling_interval_opt: Option<Duration>,\n}\n\nimpl fmt::Debug for FileBackedMetastore {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"FileBackedMetastore\")\n            .field(\"storage_uri\", self.storage.uri())\n            .field(\"polling_interval_opt\", &self.polling_interval_opt)\n            .finish()\n    }\n}\n\nimpl FileBackedMetastore {\n    /// Creates a [`FileBackedMetastore`] for tests.\n    #[doc(hidden)]\n    pub fn for_test(storage: Arc<dyn Storage>) -> Self {\n        Self {\n            state: Default::default(),\n            storage,\n            polling_interval_opt: None,\n        }\n    }\n\n    /// Sets the polling interval.\n    ///\n    /// Only newly accessed indexes will be affected by the change of this setting.\n    pub fn set_polling_interval(&mut self, polling_interval_opt: Option<Duration>) {\n        self.polling_interval_opt = polling_interval_opt;\n    }\n\n    /// Returns the underlying storage.\n    ///\n    /// This is only built in tests to verify the metastore did indeed store what it should.\n    /// It shouldn't be relied upon elsewhere, so as not to break abstractions.\n    #[cfg(test)]\n    pub fn storage(&self) -> Arc<dyn Storage> {\n        self.storage.clone()\n    }\n\n    /// Creates a [`FileBackedMetastore`] for a specified storage, immediately loading the manifest\n    /// file.\n    pub async fn try_new(\n        storage: Arc<dyn Storage>,\n        polling_interval_opt: Option<Duration>,\n    ) -> MetastoreResult<Self> {\n        let manifest = load_or_create_manifest(&*storage).await?;\n        let state =\n            MetastoreState::try_from_manifest(storage.clone(), manifest, polling_interval_opt)?;\n        let metastore = Self {\n            
state: Arc::new(RwLock::new(state)),\n            storage,\n            polling_interval_opt,\n        };\n        Ok(metastore)\n    }\n\n    async fn mutate<T>(\n        &self,\n        index_uid: &IndexUid,\n        mutate_fn: impl FnOnce(&mut FileBackedIndex) -> MetastoreResult<MutationOccurred<T>>,\n    ) -> MetastoreResult<T> {\n        let index_id = &index_uid.index_id;\n        let mut locked_index = self.get_locked_index(index_id).await?;\n        if locked_index.index_uid() != index_uid {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_id.to_string(),\n            }));\n        }\n        let mut index = locked_index.clone();\n\n        let value = match mutate_fn(&mut index)? {\n            MutationOccurred::Yes(value) => value,\n            MutationOccurred::No(value) => {\n                return Ok(value);\n            }\n        };\n        locked_index.set_recently_modified();\n\n        let put_result = put_index(&*self.storage, &index).await;\n        match put_result {\n            Ok(()) => {\n                *locked_index = index;\n                Ok(value)\n            }\n            Err(error) => {\n                // For some of the error type here, we cannot know for sure\n                // whether the content was written or not.\n                //\n                // Just to be sure, let's discard the cache.\n                let mut state_wlock_guard = self.state.write().await;\n\n                // At this point, we hold both locks.\n                state_wlock_guard.indexes.insert(\n                    index_id.to_string(),\n                    LazyIndexStatus::Active(LazyFileBackedIndex::new(\n                        self.storage.clone(),\n                        index_id.to_string(),\n                        self.polling_interval_opt,\n                        None,\n                    )),\n                );\n                locked_index.discarded = true;\n                
Err(error)\n            }\n        }\n    }\n\n    async fn read<T, F>(&self, index_uid: &IndexUid, view: F) -> MetastoreResult<T>\n    where F: FnOnce(&FileBackedIndex) -> MetastoreResult<T> {\n        self.read_any(\n            index_uid.index_id.as_str(),\n            Some(index_uid.incarnation_id),\n            view,\n        )\n        .await\n    }\n\n    /// Reads the index metadata given an `index_id`. The difference with `read` is that\n    /// this function does not necessarily take an incarnation id, so it is less strict.\n    async fn read_any<T>(\n        &self,\n        index_id: &str,\n        incarnation_id_opt: Option<Ulid>,\n        view: impl FnOnce(&FileBackedIndex) -> MetastoreResult<T>,\n    ) -> MetastoreResult<T> {\n        let locked_index = self.get_locked_index(index_id).await?;\n        if let Some(incarnation_id) = incarnation_id_opt\n            && locked_index.index_uid().incarnation_id != incarnation_id\n        {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_id.to_string(),\n            }));\n        }\n        view(&locked_index)\n    }\n\n    /// Returns a valid locked index.\n    ///\n    /// This function guarantees that it has not been\n    /// marked as discarded.\n    async fn get_locked_index(\n        &self,\n        index_id: &str,\n    ) -> MetastoreResult<OwnedMutexGuard<FileBackedIndex>> {\n        loop {\n            let index = self.index(index_id).await?;\n            let locked_index = index.lock_owned().await;\n\n            if !locked_index.discarded {\n                return Ok(locked_index);\n            }\n        }\n    }\n\n    /// Returns a FileBackedIndex for the given index_id.\n    ///\n    /// If `index_id` is in a transitioning state `Creating` or `Deleting`, it will\n    /// trigger an error.\n    /// If `index_id` is not yet in the `indexes` map of the metastore state,\n    /// a fetch to the storage will be initiated and might trigger an error.\n    ///\n    
/// For a given index_id, only copies of the same index_view are returned.\n    async fn index(&self, index_id: &str) -> MetastoreResult<Arc<Mutex<FileBackedIndex>>> {\n        {\n            // Happy path!\n            // If the object is already in our cache then we just return a copy\n            let inner_rlock_guard = self.state.read().await;\n            if let Some(index_state) = inner_rlock_guard.indexes.get(index_id) {\n                return get_index_mutex(index_id, index_state).await;\n            }\n        }\n        // At this point we do not hold our mutex, so we need to do a little dance\n        // to make sure we return the same instance.\n        //\n        // If there is an error here, note we do not return right away.\n        // That's because we want to observe the property that after one success\n        // all subsequent calls will succeed.\n        let index_result = load_index(&*self.storage, index_id).await;\n\n        // Here we retake the lock, still no I/O ongoing.\n        let mut state_wlock_guard = self.state.write().await;\n\n        // At this point, some other client might have added another instance of the index in\n        // the map. 
We want to avoid two copies to exist in the application, so we keep only\n        // one.\n        if let Some(index_state) = state_wlock_guard.indexes.get(index_id) {\n            return get_index_mutex(index_id, index_state).await;\n        }\n\n        // We need to instantiate a `LazyFileBackedIndex` that will hold the mutex\n        // and take care of spawning the polling if needed.\n        let index = index_result?;\n        let lazy_index = LazyFileBackedIndex::new(\n            self.storage.clone(),\n            index_id.to_string(),\n            self.polling_interval_opt,\n            Some(index),\n        );\n        let index_mutex = lazy_index.get().await?;\n        state_wlock_guard\n            .indexes\n            .insert(index_id.to_string(), LazyIndexStatus::Active(lazy_index));\n        Ok(index_mutex)\n    }\n\n    async fn index_metadata_inner(\n        &self,\n        index_id_opt: Option<IndexId>,\n        index_uid_opt: Option<IndexUid>,\n    ) -> Result<IndexMetadata, (MetastoreError, Option<IndexId>, Option<IndexUid>)> {\n        let index_id = if let Some(index_id) = &index_id_opt {\n            index_id\n        } else if let Some(index_uid) = &index_uid_opt {\n            &index_uid.index_id\n        } else {\n            let message = \"invalid request: neither `index_id` nor `index_uid` is set\".to_string();\n            let metastore_error = MetastoreError::Internal {\n                message,\n                cause: \"\".to_string(),\n            };\n            return Err((metastore_error, index_id_opt, index_uid_opt));\n        };\n        let index_metadata = match self\n            .read_any(index_id, None, |index| Ok(index.metadata().clone()))\n            .await\n        {\n            Ok(index_metadata) => index_metadata,\n            Err(metastore_error) => {\n                return Err((metastore_error, index_id_opt, index_uid_opt));\n            }\n        };\n        if let Some(index_uid) = &index_uid_opt\n            
&& index_metadata.index_uid != *index_uid\n        {\n            let metastore_error = MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_id.to_string(),\n            });\n            return Err((metastore_error, index_id_opt, index_uid_opt));\n        }\n        Ok(index_metadata)\n    }\n\n    async fn list_splits_aux(\n        &self,\n        index_id_with_incarnation_id_opts: &[(IndexId, Option<Ulid>)],\n        list_splits_query: ListSplitsQuery,\n    ) -> MetastoreResult<Vec<Split>> {\n        let mut splits_per_index = Vec::with_capacity(index_id_with_incarnation_id_opts.len());\n        for (index_id, incarnation_id_opt) in index_id_with_incarnation_id_opts {\n            match self\n                .read_any(index_id, *incarnation_id_opt, |index| {\n                    index.list_splits(&list_splits_query)\n                })\n                .await\n            {\n                Ok(splits) => {\n                    splits_per_index.push(splits);\n                }\n                Err(MetastoreError::NotFound(_)) => {\n                    // If the index does not exist, we just skip it.\n                    continue;\n                }\n                Err(error) => return Err(error),\n            }\n        }\n\n        let limit = list_splits_query.limit.unwrap_or(usize::MAX);\n        let offset = list_splits_query.offset.unwrap_or_default();\n\n        let merged_results = splits_per_index\n            .into_iter()\n            .kmerge_by(|lhs, rhs| list_splits_query.sort_by.compare(lhs, rhs).is_lt())\n            .skip(offset)\n            .take(limit)\n            .collect();\n\n        Ok(merged_results)\n    }\n\n    /// Returns the list of splits for the given request.\n    /// No error is returned if any of the requested `index_uid` does not exist.\n    async fn list_splits_inner(&self, request: ListSplitsRequest) -> MetastoreResult<Vec<Split>> {\n        let mut list_splits_query = 
request.deserialize_list_splits_query()?;\n\n        let index_id_incarnation_id_opts: Vec<(IndexId, Option<Ulid>)> =\n            if let Some(index_uids) = list_splits_query.index_uids.take() {\n                index_uids\n                    .into_iter()\n                    .map(|index_uid| (index_uid.index_id, Some(index_uid.incarnation_id)))\n                    .collect()\n            } else {\n                // We do not have an explicit list of index_uids with the query, so we search for\n                // all indexes.\n                let inner_rlock_guard = self.state.read().await;\n                inner_rlock_guard\n                    .indexes\n                    .iter()\n                    .filter_map(|(index_id, index_state)| match index_state {\n                        LazyIndexStatus::Active(_) => Some(index_id),\n                        _ => None,\n                    })\n                    .map(|index_id| (index_id.clone(), None))\n                    .collect()\n            };\n\n        self.list_splits_aux(&index_id_incarnation_id_opts, list_splits_query)\n            .await\n    }\n\n    /// Helper used for testing to obtain the data associated with the given index.\n    #[cfg(test)]\n    async fn get_index(&self, index_uid: &IndexUid) -> MetastoreResult<FileBackedIndex> {\n        self.read(index_uid, |index| Ok(index.clone())).await\n    }\n}\n\n#[async_trait]\nimpl MetastoreService for FileBackedMetastore {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.storage.exists(Path::new(MANIFEST_FILE_NAME)).await?;\n        Ok(())\n    }\n\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        vec![self.storage.uri().clone()]\n    }\n\n    // -------------------------------------------------------------------------------\n    // Mutations over the high-level index.\n\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> MetastoreResult<CreateIndexResponse> {\n   
     let index_config = request.deserialize_index_config()?;\n        let source_configs = request.deserialize_source_configs()?;\n\n        let mut index_metadata = IndexMetadata::new(index_config);\n\n        for source_config in source_configs {\n            index_metadata.add_source(source_config)?;\n        }\n        let index_uid = index_metadata.index_uid.clone();\n        let index_id = &index_uid.index_id;\n\n        let index_metadata_json = serde_utils::to_json_str(&index_metadata)?;\n        let index = FileBackedIndex::from(index_metadata);\n\n        let mut state_wlock_guard = self.state.write().await;\n\n        // Checking if index already exists is a bit tedious:\n        // - first we check the index state: if it's `Active`, return `IndexAlreadyExists` error,\n        //   and if it's `Creating` or `Deleting`, it's ok to override them as these are\n        //   transitioning states.\n        // - if the index is not in the index states map, we still need to check the storage as we\n        //   don't want to override an existing metadata file.\n        if let Some(index_status) = state_wlock_guard.indexes.get(index_id) {\n            if let LazyIndexStatus::Active(_) = index_status {\n                return Err(MetastoreError::AlreadyExists(EntityKind::Index {\n                    index_id: index_id.to_string(),\n                }));\n            }\n        } else if index_exists(&*self.storage, index_id).await? 
{\n            return Err(MetastoreError::Internal {\n                message: format!(\"index {index_id} cannot be created\"),\n                cause: format!(\n                    \"index {index_id} is not present in the manifest file but its file \\\n                     `{index_id}/metastore.json` is on the storage\"\n                ),\n            });\n        }\n        // Set state to `Creating` and rollback on metastore error.\n        state_wlock_guard\n            .indexes\n            .insert(index_id.clone(), LazyIndexStatus::Creating);\n\n        let manifest = state_wlock_guard.as_manifest();\n\n        if let Err(error) = save_manifest(&*self.storage, &manifest).await {\n            state_wlock_guard.indexes.remove(index_id);\n            return Err(error);\n        }\n        put_index(&*self.storage, &index).await?;\n\n        state_wlock_guard.indexes.insert(\n            index_id.clone(),\n            LazyIndexStatus::Active(LazyFileBackedIndex::new(\n                self.storage.clone(),\n                index_id.clone(),\n                self.polling_interval_opt,\n                Some(index),\n            )),\n        );\n        // Set state to `Active` and rollback on metastore error.\n        let manifest = state_wlock_guard.as_manifest();\n\n        if let Err(error) = save_manifest(&*self.storage, &manifest).await {\n            state_wlock_guard\n                .indexes\n                .insert(index_id.clone(), LazyIndexStatus::Creating);\n            return Err(error);\n        }\n\n        let response = CreateIndexResponse {\n            index_uid: index_uid.into(),\n            index_metadata_json,\n        };\n        Ok(response)\n    }\n\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> MetastoreResult<IndexMetadataResponse> {\n        let index_uid = request.index_uid();\n        let doc_mapping = request.deserialize_doc_mapping()?;\n        let indexing_settings = 
request.deserialize_indexing_settings()?;\n        let ingest_settings = request.deserialize_ingest_settings()?;\n        let search_settings = request.deserialize_search_settings()?;\n        let retention_policy_opt = request.deserialize_retention_policy()?;\n\n        let index_metadata = self\n            .mutate(index_uid, |index| {\n                let mutation_occurred = index.update_index_config(\n                    doc_mapping,\n                    indexing_settings,\n                    ingest_settings,\n                    search_settings,\n                    retention_policy_opt,\n                )?;\n                let index_metadata = index.metadata().clone();\n\n                if mutation_occurred {\n                    Ok(MutationOccurred::Yes(index_metadata))\n                } else {\n                    Ok(MutationOccurred::No(index_metadata))\n                }\n            })\n            .await?;\n        IndexMetadataResponse::try_from_index_metadata(&index_metadata)\n    }\n\n    async fn delete_index(&self, request: DeleteIndexRequest) -> MetastoreResult<EmptyResponse> {\n        // We take the outer lock here, so that we enter a critical section.\n        let mut state_wlock_guard = self.state.write().await;\n\n        let index_id = &request.index_uid().index_id;\n        // If the index is neither in `state_wlock_guard.indexes` nor on the storage, it does not\n        // exist.\n        if !state_wlock_guard.indexes.contains_key(index_id)\n            && !index_exists(&*self.storage, index_id).await?\n        {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_id.to_string(),\n            }));\n        }\n        // Set state to `Deleting` and keep the previous state in memory so that we can restore it\n        // if an error occurs.\n        let index_state_opt = state_wlock_guard\n            .indexes\n            .insert(index_id.to_string(), LazyIndexStatus::Deleting);\n        let 
manifest = state_wlock_guard.as_manifest();\n        // On a put error, reinsert the previous state if any.\n        if let Err(error) = save_manifest(&*self.storage, &manifest).await {\n            if let Some(index_state) = index_state_opt {\n                state_wlock_guard\n                    .indexes\n                    .insert(index_id.to_string(), index_state);\n            } else {\n                state_wlock_guard.indexes.remove(index_id);\n            }\n            return Err(error);\n        }\n\n        let delete_result = delete_index(&*self.storage, index_id).await;\n\n        if matches!(\n            &delete_result,\n            Ok(()) | Err(MetastoreError::NotFound(EntityKind::Index { .. }))\n        ) {\n            state_wlock_guard.indexes.remove(index_id);\n            let manifest = state_wlock_guard.as_manifest();\n\n            if let Err(error) = save_manifest(&*self.storage, &manifest).await {\n                state_wlock_guard\n                    .indexes\n                    .insert(index_id.to_string(), LazyIndexStatus::Deleting);\n                return Err(error);\n            }\n        }\n        delete_result.map(|_| EmptyResponse {})\n    }\n\n    // -------------------------------------------------------------------------------\n    // Mutations over a single index\n\n    async fn stage_splits(&self, request: StageSplitsRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid().clone();\n        let splits_metadata = request.deserialize_splits_metadata()?;\n\n        self.mutate(&index_uid, |index| {\n            let mut failed_split_ids = Vec::new();\n\n            for split_metadata in splits_metadata {\n                match index.stage_split(split_metadata) {\n                    Ok(()) => {}\n                    Err(MetastoreError::FailedPrecondition {\n                        entity: EntityKind::Split { split_id },\n                        ..\n                    }) => {\n                
        failed_split_ids.push(split_id);\n                    }\n                    Err(error) => return Err(error),\n                };\n            }\n            if !failed_split_ids.is_empty() {\n                let entity = EntityKind::Splits {\n                    split_ids: failed_split_ids,\n                };\n                let message = \"splits are not staged\".to_string();\n                Err(MetastoreError::FailedPrecondition { entity, message })\n            } else {\n                Ok(MutationOccurred::Yes(()))\n            }\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let index_checkpoint_delta: Option<IndexCheckpointDelta> =\n            request.deserialize_index_checkpoint()?;\n        let index_uid = request.index_uid().clone();\n        self.mutate(&index_uid, |index| {\n            index.publish_splits(\n                request.staged_split_ids,\n                request.replaced_split_ids,\n                index_checkpoint_delta,\n                request.publish_token_opt,\n            )?;\n            Ok(MutationOccurred::Yes(()))\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid().clone();\n\n        self.mutate(&index_uid, |index| {\n            index\n                .mark_splits_for_deletion(\n                    request.split_ids,\n                    &[\n                        SplitState::Staged,\n                        SplitState::Published,\n                        SplitState::MarkedForDeletion,\n                    ],\n                    false,\n                )\n                .map(MutationOccurred::from)\n        })\n        .await?;\n        
Ok(EmptyResponse {})\n    }\n\n    async fn delete_splits(&self, request: DeleteSplitsRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid().clone();\n\n        self.mutate(&index_uid, |index| {\n            index.delete_splits(request.split_ids)?;\n            Ok(MutationOccurred::Yes(EmptyResponse {}))\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn add_source(&self, request: AddSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let source_config = request.deserialize_source_config()?;\n        let index_uid = request.index_uid();\n\n        self.mutate(index_uid, |index| {\n            index.add_source(source_config)?;\n            Ok(MutationOccurred::Yes(()))\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn update_source(&self, request: UpdateSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let source_config = request.deserialize_source_config()?;\n        let index_uid = request.index_uid();\n\n        self.mutate(index_uid, |index| {\n            let mutation_occurred = index.update_source(source_config)?;\n            Ok(MutationOccurred::from(mutation_occurred))\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn toggle_source(&self, request: ToggleSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid();\n\n        self.mutate(index_uid, |index| {\n            index\n                .toggle_source(&request.source_id, request.enable)\n                .map(MutationOccurred::from)\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn delete_source(&self, request: DeleteSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid();\n\n        self.mutate(index_uid, |index| {\n            index.delete_source(&request.source_id)?;\n            Ok(MutationOccurred::Yes(()))\n        })\n        .await?;\n      
  Ok(EmptyResponse {})\n    }\n\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid();\n\n        self.mutate(index_uid, |index| {\n            index\n                .reset_source_checkpoint(&request.source_id)\n                .map(MutationOccurred::from)\n        })\n        .await?;\n        Ok(EmptyResponse {})\n    }\n\n    // -------------------------------------------------------------------------------\n    // Read-only accessors\n\n    /// Streams of splits for the given request.\n    /// No error is returned if any of the requested `index_uid` does not exist.\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        let splits = self.list_splits_inner(request).await?;\n        let splits_responses: Vec<MetastoreResult<ListSplitsResponse>> = splits\n            .chunks(STREAM_SPLITS_CHUNK_SIZE)\n            .map(|chunk| ListSplitsResponse::try_from_splits(chunk.to_vec()))\n            .collect();\n        let splits_responses_stream = Box::pin(futures::stream::iter(splits_responses));\n        Ok(ServiceStream::new(splits_responses_stream))\n    }\n\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> MetastoreResult<ListIndexStatsResponse> {\n        let index_id_matcher =\n            IndexIdMatcher::try_from_index_id_patterns(&request.index_id_patterns)?;\n        let index_ids: Vec<IndexId> = {\n            let inner_rlock_guard = self.state.read().await;\n            inner_rlock_guard\n                .indexes\n                .iter()\n                .filter_map(|(index_id, index_state)| match index_state {\n                    LazyIndexStatus::Active(_) if index_id_matcher.is_match(index_id) => {\n                        Some(index_id)\n                    }\n         
           _ => None,\n                })\n                .cloned()\n                .collect()\n        };\n\n        let mut index_read_futures = FuturesUnordered::new();\n        for index_id in index_ids {\n            let index_read_future = async move {\n                self.read_any(&index_id, None, |index| index.get_stats())\n                    .await\n            };\n            index_read_futures.push(index_read_future);\n        }\n\n        let mut index_stats = Vec::new();\n        while let Some(index_read_result) = index_read_futures.next().await {\n            match index_read_result {\n                Ok(stats) => index_stats.push(stats),\n                Err(MetastoreError::NotFound(_)) => {\n                    // If the index does not exist, we just skip it.\n                    continue;\n                }\n                Err(error) => return Err(error),\n            }\n        }\n\n        Ok(ListIndexStatsResponse { index_stats })\n    }\n\n    async fn list_stale_splits(\n        &self,\n        request: ListStaleSplitsRequest,\n    ) -> MetastoreResult<ListSplitsResponse> {\n        let list_splits_query = ListSplitsQuery::for_index(request.index_uid().clone())\n            .with_delete_opstamp_lt(request.delete_opstamp)\n            .with_split_state(SplitState::Published)\n            .retain_mature(OffsetDateTime::now_utc())\n            .sort_by_staleness()\n            .with_limit(request.num_splits as usize);\n        let list_splits_request =\n            ListSplitsRequest::try_from_list_splits_query(&list_splits_query)?;\n        let splits = self.list_splits_inner(list_splits_request).await?;\n        ListSplitsResponse::try_from_splits(splits)\n    }\n\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> MetastoreResult<IndexMetadataResponse> {\n        let index_metadata = self\n            .index_metadata_inner(request.index_id, request.index_uid)\n            .await\n            
.map_err(|(metastore_error, _index_id_opt, _index_uid_opt)| metastore_error)?;\n        let response = IndexMetadataResponse::try_from_index_metadata(&index_metadata)?;\n        Ok(response)\n    }\n\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> MetastoreResult<IndexesMetadataResponse> {\n        let mut indexes_metadata: Vec<IndexMetadata> =\n            Vec::with_capacity(request.subrequests.len());\n        let mut failures: Vec<IndexMetadataFailure> = Vec::new();\n\n        let mut index_metadata_futures = FuturesUnordered::new();\n\n        for subrequest in request.subrequests {\n            let metastore = self.clone();\n            let index_metadata_future = async move {\n                metastore\n                    .index_metadata_inner(subrequest.index_id, subrequest.index_uid)\n                    .await\n            };\n            index_metadata_futures.push(index_metadata_future);\n        }\n        while let Some(index_metadata_result) = index_metadata_futures.next().await {\n            match index_metadata_result {\n                Ok(index_metadata) => indexes_metadata.push(index_metadata),\n                Err((MetastoreError::NotFound(_), index_id, index_uid)) => {\n                    let failure = IndexMetadataFailure {\n                        index_id,\n                        index_uid,\n                        reason: IndexMetadataFailureReason::NotFound as i32,\n                    };\n                    failures.push(failure)\n                }\n                // All other errors are considered internal errors.\n                Err((_metastore_error, index_id, index_uid)) => {\n                    let failure = IndexMetadataFailure {\n                        index_id,\n                        index_uid,\n                        reason: IndexMetadataFailureReason::Internal as i32,\n                    };\n                    failures.push(failure)\n                }\n            
}\n        }\n        let response =\n            IndexesMetadataResponse::try_from_indexes_metadata(indexes_metadata, failures).await?;\n        Ok(response)\n    }\n\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> MetastoreResult<ListIndexesMetadataResponse> {\n        // Done in two steps:\n        // 1) Get the index IDs and release the read lock on `state`.\n        // 2) Get the metadata of each index. Note that each get takes a read lock on `state`.\n        // The lock is released in 1) to let a concurrent task/thread take a write lock on\n        // `state`.\n        let index_id_matcher =\n            IndexIdMatcher::try_from_index_id_patterns(&request.index_id_patterns)?;\n        let inner_rlock_guard = self.state.read().await;\n        let index_ids: Vec<IndexId> = inner_rlock_guard\n            .indexes\n            .iter()\n            .filter_map(|(index_id, index_state)| match index_state {\n                LazyIndexStatus::Active(_) if index_id_matcher.is_match(index_id) => Some(index_id),\n                _ => None,\n            })\n            .cloned()\n            .collect();\n        drop(inner_rlock_guard);\n\n        let metastore = self.clone();\n        let indexes_metadata: Vec<IndexMetadata> = try_join_all(\n            index_ids\n                .into_iter()\n                .map(|index_id| get_index_metadata(metastore.clone(), index_id)),\n        )\n        .await?\n        .into_iter()\n        .flatten()\n        .collect();\n        let response =\n            ListIndexesMetadataResponse::try_from_indexes_metadata(indexes_metadata).await?;\n        Ok(response)\n    }\n\n    // Shard API\n\n    async fn open_shards(&self, request: OpenShardsRequest) -> MetastoreResult<OpenShardsResponse> {\n        let mut response = OpenShardsResponse {\n            subresponses: Vec::with_capacity(request.subrequests.len()),\n        };\n        // We 
must group the subrequests by `index_uid` to mutate each index only once, since each\n        // mutation triggers an IO.\n        let per_index_uid_subrequests: HashMap<IndexUid, Vec<OpenShardSubrequest>> = request\n            .subrequests\n            .into_iter()\n            .into_group_map_by(|subrequest| subrequest.index_uid().clone());\n\n        for (index_uid, subrequests) in per_index_uid_subrequests {\n            let subresponses = self\n                .mutate(&index_uid, |index| index.open_shards(subrequests))\n                .await?;\n            response.subresponses.extend(subresponses);\n        }\n        Ok(response)\n    }\n\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> MetastoreResult<AcquireShardsResponse> {\n        let index_uid = request.index_uid().clone();\n        let response = self\n            .mutate(&index_uid, |index| index.acquire_shards(request))\n            .await?;\n        Ok(response)\n    }\n\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> MetastoreResult<DeleteShardsResponse> {\n        let index_uid = request.index_uid().clone();\n        let response = self\n            .mutate(&index_uid, |index| index.delete_shards(request))\n            .await?;\n        Ok(response)\n    }\n\n    async fn prune_shards(&self, request: PruneShardsRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid = request.index_uid().clone();\n        self.mutate(&index_uid, |index| index.prune_shards(request))\n            .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn list_shards(&self, request: ListShardsRequest) -> MetastoreResult<ListShardsResponse> {\n        let mut subresponses = Vec::with_capacity(request.subrequests.len());\n\n        for subrequest in request.subrequests {\n            let index_uid = subrequest.index_uid().clone();\n            let subresponse = self\n                .read(&index_uid, |index| 
index.list_shards(subrequest))\n                .await?;\n            subresponses.push(subresponse);\n        }\n        let response = ListShardsResponse { subresponses };\n        Ok(response)\n    }\n\n    // -------------------------------------------------------------------------------\n    // Delete tasks\n\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> MetastoreResult<LastDeleteOpstampResponse> {\n        let last_delete_opstamp = self\n            .read(request.index_uid(), |index| Ok(index.last_delete_opstamp()))\n            .await?;\n        Ok(LastDeleteOpstampResponse::new(last_delete_opstamp))\n    }\n\n    async fn create_delete_task(&self, delete_query: DeleteQuery) -> MetastoreResult<DeleteTask> {\n        let index_uid = delete_query.index_uid().clone();\n        let delete_task = self\n            .mutate(&index_uid, |index| {\n                index\n                    .create_delete_task(delete_query)\n                    .map(MutationOccurred::Yes)\n            })\n            .await?;\n        Ok(delete_task)\n    }\n\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n        let index_uid = request.index_uid();\n\n        self.mutate(index_uid, |index| {\n            let split_ids_str = request\n                .split_ids\n                .iter()\n                .map(|split_id| split_id.as_str())\n                .collect::<Vec<_>>();\n            index\n                .update_splits_delete_opstamp(&split_ids_str, request.delete_opstamp)\n                .map(MutationOccurred::from)\n        })\n        .await?;\n        Ok(UpdateSplitsDeleteOpstampResponse {})\n    }\n\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> MetastoreResult<ListDeleteTasksResponse> {\n        let index_uid = 
request.index_uid();\n\n        let delete_tasks = self\n            .read(index_uid, |index| {\n                Ok(index.list_delete_tasks(request.opstamp_start))\n            })\n            .await??;\n        let response = ListDeleteTasksResponse { delete_tasks };\n        Ok(response)\n    }\n\n    // Index Template API\n\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let index_template: IndexTemplate =\n            serde_utils::from_json_str(&request.index_template_json)?;\n        let template_id = index_template.template_id.clone();\n\n        let mut state_wlock_guard = self.state.write().await;\n\n        let evicted_template_opt = match state_wlock_guard.templates.entry(template_id.clone()) {\n            Entry::Vacant(entry) => {\n                entry.insert(index_template.clone());\n                None\n            }\n            Entry::Occupied(mut entry) if request.overwrite => {\n                let evicted_template = entry.insert(index_template.clone());\n                Some(evicted_template)\n            }\n            Entry::Occupied(_) => {\n                return Err(MetastoreError::AlreadyExists(EntityKind::IndexTemplate {\n                    template_id,\n                }));\n            }\n        };\n        if let Err(error) = state_wlock_guard.template_matcher.insert(&index_template) {\n            if let Some(evicted_template) = evicted_template_opt {\n                state_wlock_guard\n                    .templates\n                    .insert(evicted_template.template_id.clone(), evicted_template);\n            } else {\n                state_wlock_guard.templates.remove(&template_id);\n            }\n            return Err(error);\n        }\n        let manifest = state_wlock_guard.as_manifest();\n        let save_result = save_manifest(&*self.storage, &manifest).await;\n\n        // Rollback on error.\n        if let 
Err(error) = save_result {\n            if let Some(evicted_template) = evicted_template_opt {\n                state_wlock_guard\n                    .template_matcher\n                    .insert(&evicted_template)\n                    .expect(\"evicted template should be valid\");\n                state_wlock_guard\n                    .templates\n                    .insert(evicted_template.template_id.clone(), evicted_template);\n            } else {\n                state_wlock_guard.templates.remove(&template_id);\n                state_wlock_guard.template_matcher.remove(&template_id);\n            }\n            return Err(error);\n        }\n        Ok(EmptyResponse {})\n    }\n\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> MetastoreResult<GetIndexTemplateResponse> {\n        let inner_rlock_guard = self.state.read().await;\n        let index_template = inner_rlock_guard\n            .templates\n            .get(&request.template_id)\n            .ok_or({\n                MetastoreError::NotFound(EntityKind::IndexTemplate {\n                    template_id: request.template_id,\n                })\n            })?;\n        let index_template_json = serde_utils::to_json_str(index_template)?;\n        let response = GetIndexTemplateResponse {\n            index_template_json,\n        };\n        Ok(response)\n    }\n\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> MetastoreResult<FindIndexTemplateMatchesResponse> {\n        let inner_rlock_guard = self.state.read().await;\n\n        let mut matches = Vec::new();\n\n        for index_id in request.index_ids {\n            if let Some(template_id) = inner_rlock_guard\n                .template_matcher\n                .find_match(&index_id)\n                .clone()\n            {\n                let index_template = inner_rlock_guard\n                    .templates\n             
       .get(&template_id)\n                    .expect(\"template should exist\");\n                let index_template_json = serde_utils::to_json_str(index_template)?;\n                let index_template_match = IndexTemplateMatch {\n                    index_id,\n                    template_id,\n                    index_template_json,\n                };\n                matches.push(index_template_match);\n            };\n        }\n        let response = FindIndexTemplateMatchesResponse { matches };\n        Ok(response)\n    }\n\n    async fn list_index_templates(\n        &self,\n        _request: ListIndexTemplatesRequest,\n    ) -> MetastoreResult<ListIndexTemplatesResponse> {\n        let inner_rlock_guard = self.state.read().await;\n\n        let index_templates_json: Vec<String> = inner_rlock_guard\n            .templates\n            .values()\n            .map(serde_utils::to_json_str)\n            .collect::<MetastoreResult<_>>()?;\n        let response = ListIndexTemplatesResponse {\n            index_templates_json,\n        };\n        Ok(response)\n    }\n\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let mut evicted_templates = Vec::with_capacity(request.template_ids.len());\n        let mut state_wlock_guard = self.state.write().await;\n\n        for template_id in &request.template_ids {\n            if let Some(evicted_template) = state_wlock_guard.templates.remove(template_id) {\n                evicted_templates.push(evicted_template);\n                state_wlock_guard.template_matcher.remove(template_id);\n            }\n        }\n        let manifest = state_wlock_guard.as_manifest();\n        let save_result = save_manifest(&*self.storage, &manifest).await;\n\n        // Rollback on error.\n        if let Err(error) = save_result {\n            for evicted_template in evicted_templates {\n                state_wlock_guard\n       
             .template_matcher\n                    .insert(&evicted_template)\n                    .expect(\"evicted template should be valid\");\n                state_wlock_guard\n                    .templates\n                    .insert(evicted_template.template_id.clone(), evicted_template);\n            }\n            return Err(error);\n        }\n        Ok(EmptyResponse {})\n    }\n\n    // Get cluster identity API\n\n    // Returns a constant UUID. On the first call, generates that UUID if it does not already\n    // exist.\n    async fn get_cluster_identity(\n        &self,\n        _: GetClusterIdentityRequest,\n    ) -> MetastoreResult<GetClusterIdentityResponse> {\n        let mut state_wlock_guard = self.state.write().await;\n\n        if state_wlock_guard.identity.is_nil() {\n            state_wlock_guard.identity = Uuid::new_v4();\n\n            let manifest = state_wlock_guard.as_manifest();\n\n            if let Err(error) = save_manifest(&*self.storage, &manifest).await {\n                state_wlock_guard.identity = Uuid::nil();\n                return Err(error);\n            }\n        }\n\n        Ok(GetClusterIdentityResponse {\n            uuid: state_wlock_guard.identity.hyphenated().to_string(),\n        })\n    }\n}\n\nimpl MetastoreServiceExt for FileBackedMetastore {}\n\nasync fn get_index_mutex(\n    index_id: &str,\n    lazy_index_status: &LazyIndexStatus,\n) -> MetastoreResult<Arc<Mutex<FileBackedIndex>>> {\n    match lazy_index_status {\n        LazyIndexStatus::Active(lazy_index) => lazy_index.get().await,\n        LazyIndexStatus::Creating => Err(MetastoreError::Internal {\n            message: format!(\"index `{index_id}` cannot be retrieved\"),\n            cause: format!(\n                \"index `{index_id}` is in transitioning state `creating` and this should not \\\n                 happen.
 either recreate or delete it\"\n            ),\n        }),\n        LazyIndexStatus::Deleting => Err(MetastoreError::Internal {\n            message: format!(\"index `{index_id}` cannot be retrieved\"),\n            cause: format!(\n                \"index `{index_id}` is in transitioning state `deleting` and this should not \\\n                 happen. try to delete it again\"\n            ),\n        }),\n    }\n}\n\nasync fn get_index_metadata(\n    metastore: FileBackedMetastore,\n    index_id: IndexId,\n) -> MetastoreResult<Option<IndexMetadata>> {\n    let request = IndexMetadataRequest::for_index_id(index_id);\n    let index_metadata_result = metastore\n        .index_metadata(request)\n        .await\n        .and_then(|response| response.deserialize_index_metadata());\n    match index_metadata_result {\n        Ok(index_metadata) => Ok(Some(index_metadata)),\n        Err(MetastoreError::NotFound { .. }) => Ok(None),\n        Err(MetastoreError::Internal { message, cause }) => {\n            // Indexes can be in a transitioning state (`Creating` or `Deleting`).\n            // It is fine to ignore those errors.\n            if cause.contains(\"is in transitioning state\") {\n                Ok(None)\n            } else {\n                Err(MetastoreError::Internal { message, cause })\n            }\n        }\n        Err(error) => Err(error),\n    }\n}\n\n#[cfg(test)]\n#[async_trait]\nimpl crate::tests::DefaultForTest for FileBackedMetastore {\n    async fn default_for_test() -> Self {\n        use quickwit_storage::RamStorage;\n        FileBackedMetastore::try_new(Arc::new(RamStorage::default()), None)\n            .await\n            .unwrap()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::ops::RangeInclusive;\n    use std::path::Path;\n    use std::sync::Arc;\n\n    use futures::executor::block_on;\n    use quickwit_common::uri::{Protocol, Uri};\n    use quickwit_config::IndexConfig;\n    use quickwit_proto::ingest::Shard;\n    use 
quickwit_proto::metastore::{DeleteQuery, MetastoreError};\n    use quickwit_proto::types::SourceId;\n    use quickwit_query::query_ast::qast_helper;\n    use quickwit_storage::{MockStorage, RamStorage, Storage, StorageErrorKind};\n    use rand::Rng;\n    use tests::manifest::{IndexStatus, Manifest};\n    use time::OffsetDateTime;\n    use tokio::time::Duration;\n\n    use super::store_operations::{metastore_filepath, put_index_given_index_id};\n    use super::*;\n    use crate::metastore::MetastoreServiceStreamSplitsExt;\n    use crate::tests::DefaultForTest;\n    use crate::tests::shard::ReadWriteShardsForTest;\n    use crate::{IndexMetadata, ListSplitsQuery, SplitMetadata, SplitState, metastore_test_suite};\n\n    #[async_trait]\n    impl ReadWriteShardsForTest for FileBackedMetastore {\n        async fn insert_shards(\n            &self,\n            index_uid: &IndexUid,\n            source_id: &SourceId,\n            shards: Vec<Shard>,\n        ) {\n            self.mutate(index_uid, |index| {\n                index.insert_shards(source_id, shards);\n                Ok(MutationOccurred::Yes(()))\n            })\n            .await\n            .unwrap();\n        }\n\n        async fn list_all_shards(&self, index_uid: &IndexUid, source_id: &SourceId) -> Vec<Shard> {\n            self.read(index_uid, |index| {\n                let shards = index.list_all_shards(source_id);\n                Ok(shards)\n            })\n            .await\n            .unwrap()\n        }\n    }\n\n    metastore_test_suite!(crate::FileBackedMetastore);\n\n    #[tokio::test]\n    async fn test_metastore_connectivity_and_endpoints() {\n        let metastore = FileBackedMetastore::default_for_test().await;\n        metastore.check_connectivity().await.unwrap();\n        assert_eq!(metastore.endpoints()[0].protocol(), Protocol::Ram);\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_connectivity_fails_if_states_file_does_not_exist() {\n        let mut mock_storage 
= MockStorage::default();\n        let ram_storage = RamStorage::default();\n        let ram_storage_clone = ram_storage.clone();\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_exists()\n            .times(3)\n            .returning(|_| Ok(false));\n        mock_storage\n            .expect_put()\n            .times(1)\n            .returning(move |path, put_payload| {\n                assert!(path == Path::new(\"manifest.json\"));\n                block_on(ram_storage_clone.put(path, put_payload))\n            });\n        let metastore = FileBackedMetastore::try_new(Arc::new(mock_storage), None)\n            .await\n            .unwrap();\n\n        metastore.check_connectivity().await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_index_exists() {\n        let index_id = \"test-index\";\n        let mut metastore = FileBackedMetastore::default_for_test().await;\n        assert!(!metastore.index_exists(index_id).await.unwrap());\n\n        let index_config = IndexConfig::for_test(index_id, \"ram:///indexes/test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        metastore.create_index(create_index_request).await.unwrap();\n\n        assert!(metastore.index_exists(index_id).await.unwrap());\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_get_index() {\n        let metastore = FileBackedMetastore::default_for_test().await;\n\n        // Create index\n        let index_id = \"test-index\";\n        let index_config = IndexConfig::for_test(index_id, \"ram:///indexes/test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            
.clone();\n\n        // Open index and check its metadata.\n        let created_index = metastore.get_index(&index_uid).await.unwrap();\n        assert_eq!(created_index.index_id(), index_config.index_id);\n        assert_eq!(\n            created_index.metadata().index_uri(),\n            &index_config.index_uri\n        );\n\n        // Check index is returned by list indexes.\n        let indexes_metadata = metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n            .unwrap()\n            .deserialize_indexes_metadata()\n            .await\n            .unwrap();\n        assert_eq!(indexes_metadata.len(), 1);\n\n        // Open a non-existent index.\n        let metastore_error = metastore\n            .get_index(&IndexUid::new_with_random_ulid(\"index-does-not-exist\"))\n            .await\n            .unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::NotFound { .. }));\n\n        // Open an index with a different incarnation_id.\n        let metastore_error = metastore\n            .get_index(&IndexUid::new_with_random_ulid(index_id))\n            .await\n            .unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::NotFound { .. 
}));\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_storage_failing() {\n        // The file-backed metastore should not update its internal state if the storage fails.\n        let mut mock_storage = MockStorage::default();\n\n        let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n        let ram_storage = RamStorage::default();\n        let ram_storage_clone = ram_storage.clone();\n\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_exists()\n            .returning(|_| Ok(false));\n        mock_storage\n            .expect_put()\n            .times(4)\n            .returning(move |path, put_payload| {\n                assert!(\n                    path == Path::new(\"manifest.json\") || path == metastore_filepath(\"test-index\")\n                );\n                block_on(ram_storage_clone.put(path, put_payload))\n            });\n        mock_storage\n            .expect_get_all()\n            .times(1)\n            .returning(move |path| block_on(ram_storage.get_all(path)));\n        mock_storage.expect_put().times(1).returning(|_uri, _| {\n            Err(StorageErrorKind::Io\n                .with_error(anyhow::anyhow!(\"Oops. 
Some network problem maybe?\")))\n        });\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///indexes/test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let split_id = \"split-one\";\n        let split_metadata = SplitMetadata {\n            footer_offsets: 1000..2000,\n            split_id: split_id.to_string(),\n            num_docs: 1,\n            uncompressed_docs_size_in_bytes: 2,\n            time_range: Some(RangeInclusive::new(0, 99)),\n            create_timestamp: current_timestamp,\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        // publish split fails\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id.to_string()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n\n        let list_splits_query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Published);\n        let list_splits_request =\n            ListSplitsRequest::try_from_list_splits_query(&list_splits_query).unwrap();\n        let splits = metastore\n            .list_splits(list_splits_request)\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        
assert!(splits.is_empty());\n\n        let list_splits_query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n        let list_splits_request =\n            ListSplitsRequest::try_from_list_splits_query(&list_splits_query).unwrap();\n        let splits = metastore\n            .list_splits(list_splits_request)\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert!(!splits.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_get_index_checks_for_inconsistent_index_id()\n    -> MetastoreResult<()> {\n        let storage = Arc::new(RamStorage::default());\n        let index_id = \"test-index\";\n        let index_metadata =\n            IndexMetadata::for_test(\"my-inconsistent-index\", \"ram:///indexes/test-index\");\n\n        // Put inconsistent index and manifest into storage.\n        let index = FileBackedIndex::from(index_metadata);\n        put_index_given_index_id(&*storage, &index, index_id).await?;\n        let mut manifest = Manifest::default();\n        manifest\n            .indexes\n            .insert(index_id.to_string(), IndexStatus::Active);\n        save_manifest(&*storage, &manifest).await.unwrap();\n\n        let metastore = FileBackedMetastore::try_new(storage.clone(), None)\n            .await\n            .unwrap();\n\n        // Getting index with inconsistent index ID should raise an error.\n        let metastore_error = metastore\n            .get_index(&IndexUid::new_with_random_ulid(index_id))\n            .await\n            .unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::Internal { .. 
}));\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_write_directly_visible() -> MetastoreResult<()> {\n        let metastore = FileBackedMetastore::default_for_test().await;\n\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///indexes/test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = metastore.create_index(create_index_request).await.unwrap();\n        let index_uid: IndexUid = create_index_response.index_uid().clone();\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert!(splits.is_empty());\n\n        let split_metadata = SplitMetadata {\n            footer_offsets: 1000..2000,\n            split_id: \"split1\".to_string(),\n            num_docs: 1,\n            uncompressed_docs_size_in_bytes: 2,\n            time_range: Some(0..=99),\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await?;\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 1);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_polling() -> MetastoreResult<()> {\n        let storage = Arc::new(RamStorage::default());\n\n        let metastore_write = FileBackedMetastore::try_new(storage.clone(), None)\n            .await\n            .unwrap();\n        let 
polling_interval = Duration::from_millis(20);\n        let metastore_read = FileBackedMetastore::try_new(storage, Some(polling_interval))\n            .await\n            .unwrap();\n\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///indexes/test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = metastore_write\n            .create_index(create_index_request)\n            .await\n            .unwrap();\n        let index_uid: IndexUid = create_index_response.index_uid().clone();\n\n        let splits = metastore_write\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert!(splits.is_empty());\n\n        let splits = metastore_read\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert!(splits.is_empty());\n\n        let split_metadata = SplitMetadata {\n            footer_offsets: 1000..2000,\n            split_id: \"split1\".to_string(),\n            num_docs: 1,\n            uncompressed_docs_size_in_bytes: 2,\n            time_range: Some(0..=99),\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore_write.stage_splits(stage_splits_request).await?;\n\n        let splits = metastore_read\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        
assert!(splits.is_empty());\n\n        for _ in 0..10 {\n            tokio::time::sleep(polling_interval).await;\n\n            let splits = metastore_read\n                .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n                .await\n                .unwrap()\n                .collect_splits()\n                .await\n                .unwrap();\n            if !splits.is_empty() {\n                return Ok(());\n            }\n        }\n        panic!(\"The metastore should have been updated.\");\n    }\n\n    #[tokio::test(flavor = \"multi_thread\", worker_threads = 3)]\n    async fn test_file_backed_metastore_race_condition() {\n        let metastore = FileBackedMetastore::default_for_test().await;\n\n        let index_config = IndexConfig::for_test(\"test-index\", \"ram:///indexes/test-index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = metastore.create_index(create_index_request).await.unwrap();\n        let index_uid: IndexUid = create_index_response.index_uid().clone();\n\n        // Stage splits in multiple threads\n        let mut handles = Vec::new();\n        let mut random_generator = rand::rng();\n        for i in 1..=20 {\n            let sleep_duration = Duration::from_millis(random_generator.random_range(0..=200));\n            let metastore = metastore.clone();\n            let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n            let handle = tokio::spawn({\n                let index_uid = index_uid.clone();\n                async move {\n                    let split_metadata = SplitMetadata {\n                        footer_offsets: 1000..2000,\n                        split_id: format!(\"split-{i}\"),\n                        num_docs: 1,\n                        uncompressed_docs_size_in_bytes: 2,\n                        time_range: Some(RangeInclusive::new(0, 
99)),\n                        create_timestamp: current_timestamp,\n                        ..Default::default()\n                    };\n                    // stage split\n                    let stage_splits_request = StageSplitsRequest::try_from_split_metadata(\n                        index_uid.clone(),\n                        &split_metadata,\n                    )\n                    .unwrap();\n                    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n                    tokio::time::sleep(sleep_duration).await;\n\n                    // publish split\n                    let split_id = format!(\"split-{i}\");\n                    let publish_splits_request = PublishSplitsRequest {\n                        index_uid: Some(index_uid.clone()),\n                        staged_split_ids: vec![split_id.to_string()],\n                        ..Default::default()\n                    };\n                    metastore\n                        .publish_splits(publish_splits_request)\n                        .await\n                        .unwrap();\n                }\n            });\n            handles.push(handle);\n        }\n\n        futures::future::try_join_all(handles).await.unwrap();\n\n        let list_splits_query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Published);\n        let list_splits_request =\n            ListSplitsRequest::try_from_list_splits_query(&list_splits_query).unwrap();\n        let splits = metastore\n            .list_splits(list_splits_request)\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n\n        // Make sure that all 20 splits are in `Published` state.\n        assert_eq!(splits.len(), 20);\n    }\n\n    #[tokio::test(flavor = \"multi_thread\", worker_threads = 3)]\n    async fn test_file_backed_metastore_list_indexes_race_condition() {\n        let metastore = 
FileBackedMetastore::default_for_test().await;\n        let mut index_uids = Vec::new();\n        for idx in 0..10 {\n            let index_uid = IndexUid::new_with_random_ulid(&format!(\"test-index-{idx}\"));\n            let index_config =\n                IndexConfig::for_test(&index_uid.index_id, \"ram:///indexes/test-index\");\n            let create_index_request =\n                CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n            let index_uid: IndexUid = metastore\n                .create_index(create_index_request)\n                .await\n                .unwrap()\n                .index_uid()\n                .clone();\n            index_uids.push(index_uid);\n        }\n        // Delete indexes + call to list_indexes_metadata.\n        let mut handles = Vec::new();\n        for index_uid in index_uids {\n            let delete_request = DeleteIndexRequest {\n                index_uid: Some(index_uid.clone()),\n            };\n            {\n                let metastore = metastore.clone();\n                let handle = tokio::spawn(async move {\n                    metastore\n                        .list_indexes_metadata(ListIndexesMetadataRequest::all())\n                        .await\n                        .unwrap();\n                });\n                handles.push(handle);\n            }\n            {\n                let metastore = metastore.clone();\n                let handle = tokio::spawn(async move {\n                    metastore.delete_index(delete_request).await.unwrap();\n                });\n                handles.push(handle);\n            }\n        }\n        tokio::time::timeout(\n            Duration::from_secs(2),\n            futures::future::try_join_all(handles),\n        )\n        .await\n        .unwrap()\n        .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_create_index_when_storage_failing_on_indexes_states_put() {\n        let mut mock_storage = 
MockStorage::default();\n        let ram_storage = RamStorage::default();\n        let index_id = \"test-index\";\n\n        mock_storage\n            .expect_uri()\n            .return_const(Uri::for_test(\"ram:///indexes\"));\n        mock_storage.expect_exists().returning(|_| Ok(false));\n        mock_storage\n            .expect_put()\n            .times(1)\n            .returning(move |path, _| {\n                assert!(path == Path::new(\"manifest.json\"));\n                Err(StorageErrorKind::Io\n                    .with_error(anyhow::anyhow!(\"Oops. Some network problem maybe?\")))\n            });\n        mock_storage\n            .expect_get_all()\n            .times(1)\n            .returning(move |path| block_on(ram_storage.get_all(path)));\n\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n        let index_config = IndexConfig::for_test(index_id, \"ram:///indexes/test-index\");\n\n        // Create index.\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let metastore_error = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::Internal { .. }));\n        // Try to fetch the index that was not created.\n        let created_index_error = metastore\n            .get_index(&IndexUid::new_with_random_ulid(index_id))\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            created_index_error,\n            MetastoreError::NotFound { .. 
}\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_create_index_when_storage_failing_before_metadata_put() {\n        let mut mock_storage = MockStorage::default();\n        let ram_storage = RamStorage::default();\n        let ram_storage_clone = ram_storage.clone();\n        let ram_storage_clone_2 = ram_storage.clone();\n        let index_id = \"test-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_exists()\n            .returning(|_| Ok(false));\n        mock_storage\n            .expect_put()\n            .times(4)\n            .returning(move |path, put_payload| {\n                assert!(\n                    path == Path::new(\"manifest.json\") || path == metastore_filepath(\"test-index\")\n                );\n                if path == Path::new(\"manifest.json\") {\n                    return block_on(ram_storage_clone.put(path, put_payload));\n                }\n                Err(StorageErrorKind::Io\n                    .with_error(anyhow::anyhow!(\"Oops. Some network problem maybe?\")))\n            });\n        mock_storage\n            .expect_get_all()\n            .times(1)\n            .returning(move |path| block_on(ram_storage.get_all(path)));\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n        let index_config = IndexConfig::for_test(index_id, \"ram:///indexes/test-index\");\n\n        // Create index\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let metastore_error = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::Internal { .. 
}));\n        // Fetch the index: we expect an internal error because the index is in the `Creating`\n        // state.\n        let created_index_error = metastore.get_index(&index_uid.clone()).await.unwrap_err();\n        assert!(matches!(\n            created_index_error,\n            MetastoreError::Internal { .. }\n        ));\n        // Check that the index is in the `Creating` state in the manifest file.\n        let storage = Arc::new(ram_storage_clone_2.clone());\n        let manifest = load_or_create_manifest(&*storage).await.unwrap();\n        assert!(matches!(\n            *manifest.indexes.get(index_id).unwrap(),\n            IndexStatus::Creating\n        ));\n        // Delete the index to clean up its state.\n        let delete_request = DeleteIndexRequest {\n            index_uid: Some(index_uid.clone()),\n        };\n        let deleted_index_error = metastore.delete_index(delete_request).await.unwrap_err();\n        assert!(matches!(\n            deleted_index_error,\n            MetastoreError::NotFound { .. }\n        ));\n        let manifest = load_or_create_manifest(&*storage).await.unwrap();\n        assert!(!manifest.indexes.contains_key(index_id));\n        // Now we can expect a `NotFound` error.\n        let created_index_error = metastore.get_index(&index_uid).await.unwrap_err();\n        assert!(matches!(\n            created_index_error,\n            MetastoreError::NotFound { .. 
}\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_create_index_when_storage_failing_before_last_indexes_states_put()\n     {\n        let mut mock_storage = MockStorage::default();\n        let ram_storage = RamStorage::default();\n        let ram_storage_clone = ram_storage.clone();\n        let index_id = \"test-index\";\n        let mut indexes_json_valid_put = 1;\n\n        mock_storage\n            .expect_uri()\n            .return_const(Uri::for_test(\"ram:///indexes\"));\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_exists()\n            .returning(|_| Ok(false));\n        mock_storage\n            .expect_put()\n            .times(3)\n            .returning(move |path, put_payload| {\n                assert!(\n                    path == Path::new(\"manifest.json\") || path == metastore_filepath(\"test-index\")\n                );\n                if path == Path::new(\"manifest.json\") {\n                    if indexes_json_valid_put == 0 {\n                        return Err(StorageErrorKind::Io.with_error(anyhow::anyhow!(\n                            \"oops. perhaps there are some network problems\"\n                        )));\n                    }\n                    indexes_json_valid_put -= 1;\n                }\n                block_on(ram_storage_clone.put(path, put_payload))\n            });\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n        let index_config = IndexConfig::for_test(index_id, \"ram:///indexes/test-index\");\n\n        // Create index\n        let metastore_error = metastore\n            .create_index(CreateIndexRequest::try_from_index_config(&index_config).unwrap())\n            .await\n            .unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::Internal { .. 
}));\n        // Let's fetch the index, we expect an internal error as the index state is in `Creating`\n        // state.\n        let created_index_error = metastore\n            .get_index(&IndexUid::new_with_random_ulid(index_id))\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            created_index_error,\n            MetastoreError::Internal { .. }\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_delete_index_when_storage_failing_before_metadata_delete() {\n        let mut mock_storage = MockStorage::default();\n        let ram_storage = RamStorage::default();\n        let ram_storage_clone = ram_storage.clone();\n        let index_id = \"test-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let index_metadata =\n            IndexMetadata::for_test(&index_uid.index_id, \"ram:///indexes/test-index\");\n        let index = FileBackedIndex::from(index_metadata);\n        put_index_given_index_id(&ram_storage, &index, &index_uid.index_id)\n            .await\n            .unwrap();\n\n        mock_storage\n            .expect_uri()\n            .return_const(Uri::for_test(\"ram:///indexes\"));\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_exists()\n            .returning(|_| Ok(true));\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_delete()\n            .returning(|_| {\n                Err(StorageErrorKind::Io\n                    .with_error(anyhow::anyhow!(\"Oops. 
Some network problem maybe?\")))\n            });\n        mock_storage\n            .expect_put()\n            .times(1)\n            .returning(move |path, put_payload| block_on(ram_storage_clone.put(path, put_payload)));\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n\n        // Delete index\n        let delete_request = DeleteIndexRequest {\n            index_uid: Some(index_uid.clone()),\n        };\n        let metastore_error = metastore.delete_index(delete_request).await.unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::Internal { .. }));\n        // Let's fetch the index, we expect an internal error as the index state is in `Deleting`\n        // state.\n        let created_index_error = metastore.get_index(&index_uid).await.unwrap_err();\n        assert!(matches!(\n            created_index_error,\n            MetastoreError::Internal { .. }\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_delete_index_storage_failing_before_last_indexes_states_put()\n     {\n        let mut mock_storage = MockStorage::default();\n        let ram_storage = RamStorage::default();\n        let ram_storage_clone = ram_storage.clone();\n        let index_id = \"test-index\";\n        let index_uid = IndexUid::new_with_random_ulid(index_id);\n        let index_metadata =\n            IndexMetadata::for_test(&index_uid.index_id, \"ram:///indexes/test-index\");\n        let index = FileBackedIndex::from(index_metadata);\n        put_index_given_index_id(&ram_storage, &index, &index_uid.index_id)\n            .await\n            .unwrap();\n        let mut indexes_json_valid_put = 1;\n        mock_storage\n            .expect_uri()\n            .return_const(Uri::for_test(\"ram:///indexes\"));\n        mock_storage // remove this if we end up changing the semantics of create.\n            .expect_exists()\n            .returning(|_| Ok(true));\n        mock_storage // remove this if we end 
up changing the semantics of create.\n            .expect_delete()\n            .returning(|_| Ok(()));\n        mock_storage\n            .expect_put()\n            .times(2)\n            .returning(move |path, put_payload| {\n                assert!(path == Path::new(\"manifest.json\"));\n                if path == Path::new(\"manifest.json\") {\n                    if indexes_json_valid_put == 0 {\n                        return Err(StorageErrorKind::Io.with_error(anyhow::anyhow!(\n                            \"oops. perhaps there are some network problems\"\n                        )));\n                    }\n                    indexes_json_valid_put -= 1;\n                }\n                block_on(ram_storage_clone.put(path, put_payload))\n            });\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n\n        // Delete index\n        let delete_request = DeleteIndexRequest {\n            index_uid: Some(index_uid.clone()),\n        };\n        let metastore_error = metastore.delete_index(delete_request).await.unwrap_err();\n        assert!(matches!(metastore_error, MetastoreError::Internal { .. }));\n        // Let's fetch the index, we expect an internal error as the index state is in `Deleting`\n        // state.\n        let created_index_error = metastore.get_index(&index_uid).await.unwrap_err();\n        assert!(matches!(\n            created_index_error,\n            MetastoreError::Internal { .. 
}\n        ));\n    }\n\n    #[tokio::test]\n    async fn test_file_backed_metastore_get_list_indexes() -> MetastoreResult<()> {\n        let index_id_creating = \"test-index--creating\";\n        let index_id_alive = \"testing-index--alive\";\n        let index_id_unregistered = \"test-index--unregistered\";\n        let index_id_deleting = \"test-index--deleting\";\n\n        let index_metadata_alive =\n            IndexMetadata::for_test(index_id_alive, \"ram:///indexes/test-index--alive\");\n        let index_metadata_unregistered = IndexMetadata::for_test(\n            index_id_unregistered,\n            \"ram:///indexes/test-index--unregistered\",\n        );\n\n        // Put index states into storage.\n        let ram_storage = Arc::new(RamStorage::default());\n        let mut manifest = Manifest::default();\n        manifest\n            .indexes\n            .insert(index_id_creating.to_string(), IndexStatus::Creating);\n        manifest\n            .indexes\n            .insert(index_id_alive.to_string(), IndexStatus::Active);\n        manifest\n            .indexes\n            .insert(index_id_deleting.to_string(), IndexStatus::Deleting);\n        save_manifest(&*ram_storage, &manifest).await.unwrap();\n\n        let index_alive = FileBackedIndex::from(index_metadata_alive);\n        let index_alive_unregistered = FileBackedIndex::from(index_metadata_unregistered);\n        let index_uid_alive = index_alive.index_uid();\n        let index_uid_unregistered = index_alive_unregistered.index_uid();\n\n        // Put indexes metadatas.\n        put_index_given_index_id(&*ram_storage, &index_alive, index_id_alive).await?;\n        put_index_given_index_id(\n            &*ram_storage,\n            &index_alive_unregistered,\n            index_id_unregistered,\n        )\n        .await?;\n\n        // Fetch alive indexes metadatas.\n        let metastore = FileBackedMetastore::try_new(ram_storage.clone(), None)\n            .await\n            .unwrap();\n   
     let indexes_metadata = metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n            .unwrap()\n            .deserialize_indexes_metadata()\n            .await\n            .unwrap();\n        assert_eq!(indexes_metadata.len(), 1);\n\n        // Fetch the index metadata not registered in index states json.\n        metastore\n            .get_index(&index_uid_unregistered.clone())\n            .await\n            .unwrap();\n\n        // Now list indexes return 2 indexes metadatas as the metastore is now aware of\n        // 2 alive indexes.\n        let indexes_metadata = metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n            .unwrap()\n            .deserialize_indexes_metadata()\n            .await\n            .unwrap();\n        assert_eq!(indexes_metadata.len(), 2);\n\n        // Let's delete indexes.\n        let delete_request = DeleteIndexRequest {\n            index_uid: Some(index_uid_alive.clone()),\n        };\n        metastore.delete_index(delete_request).await.unwrap();\n\n        let delete_request = DeleteIndexRequest {\n            index_uid: Some(index_uid_unregistered.clone()),\n        };\n        metastore.delete_index(delete_request).await.unwrap();\n        let indexes_metadata = metastore\n            .list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n            .unwrap()\n            .deserialize_indexes_metadata()\n            .await\n            .unwrap();\n        assert!(indexes_metadata.is_empty());\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_monotically_increasing_stamps_by_index() {\n        let storage = RamStorage::default();\n        let metastore = FileBackedMetastore::try_new(Arc::new(storage.clone()), None)\n            .await\n            .unwrap();\n        let index_id = \"test-index-increasing-stamps-by-index\";\n        let index_config = 
IndexConfig::for_test(\n            index_id,\n            \"ram:///indexes/test-index-increasing-stamps-by-index\",\n        );\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = metastore.create_index(create_index_request).await.unwrap();\n        let index_uid = create_index_response.index_uid;\n\n        let delete_query = DeleteQuery {\n            start_timestamp: None,\n            end_timestamp: None,\n            index_uid,\n            query_ast: serde_json::to_string(&qast_helper(\"harry potter\", &[\"body\"])).unwrap(),\n        };\n\n        let delete_task_1 = metastore\n            .create_delete_task(delete_query.clone())\n            .await\n            .unwrap();\n        assert_eq!(delete_task_1.opstamp, 1);\n        let delete_task_2 = metastore\n            .create_delete_task(delete_query.clone())\n            .await\n            .unwrap();\n        assert_eq!(delete_task_2.opstamp, 2);\n\n        // Create metastore with data already in the storage.\n        let new_metastore = FileBackedMetastore::try_new(Arc::new(storage), None)\n            .await\n            .unwrap();\n        let delete_task_3 = new_metastore\n            .create_delete_task(delete_query.clone())\n            .await\n            .unwrap();\n        assert_eq!(delete_task_3.opstamp, 3);\n\n        // Create delete tasks on new index.\n        let index_id_2 = \"test-index-increasing-stamps-by-index-2\";\n        let index_config = IndexConfig::for_test(\n            index_id_2,\n            \"ram:///indexes/test-index-increasing-stamps-by-index-2\",\n        );\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = metastore.create_index(create_index_request).await.unwrap();\n        let index_uid = create_index_response.index_uid;\n\n        let delete_query = DeleteQuery 
{\n            start_timestamp: None,\n            end_timestamp: None,\n            index_uid,\n            query_ast: serde_json::to_string(&qast_helper(\"harry potter\", &[\"body\"])).unwrap(),\n        };\n        let delete_task_4 = metastore.create_delete_task(delete_query).await.unwrap();\n        assert_eq!(delete_task_4.opstamp, 1);\n    }\n\n    #[tokio::test]\n    async fn test_create_index_template_rollback() {\n        let mut mock_storage = MockStorage::default();\n\n        mock_storage\n            .expect_uri()\n            .return_const(Uri::for_test(\"ram:///indexes\"));\n\n        mock_storage\n            .expect_put()\n            .once()\n            .returning(|path, _payload| {\n                assert_eq!(path, Path::new(MANIFEST_FILE_NAME));\n                Ok(())\n            });\n\n        mock_storage\n            .expect_put()\n            .once()\n            .returning(|path, _payload| {\n                assert_eq!(path, Path::new(MANIFEST_FILE_NAME));\n                let io_error = StorageErrorKind::Io.with_error(anyhow::anyhow!(\"IO error\"));\n                Err(io_error)\n            });\n\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n\n        let index_template = IndexTemplate::for_test(\"test-template\", &[\"test-index-foo*\"], 100);\n        let index_template_json = serde_json::to_string(&index_template).unwrap();\n        let create_index_template_request = CreateIndexTemplateRequest {\n            index_template_json,\n            overwrite: false,\n        };\n        metastore\n            .create_index_template(create_index_template_request)\n            .await\n            .unwrap();\n        {\n            let state = metastore.state.read().await;\n            assert_eq!(state.templates.len(), 1);\n            state.template_matcher.find_match(\"test-index-foo\").unwrap();\n        }\n        let index_template = IndexTemplate::for_test(\"test-template\", &[\"test-index-bar*\"], 
100);\n        let index_template_json = serde_json::to_string(&index_template).unwrap();\n        let create_index_template_request = CreateIndexTemplateRequest {\n            index_template_json,\n            overwrite: true,\n        };\n        metastore\n            .create_index_template(create_index_template_request)\n            .await\n            .unwrap_err();\n        {\n            let state = metastore.state.read().await;\n            assert_eq!(state.templates.len(), 1);\n            state.template_matcher.find_match(\"test-index-foo\").unwrap();\n        }\n    }\n\n    #[tokio::test]\n    async fn test_delete_index_templates_rollback() {\n        let mut mock_storage = MockStorage::default();\n\n        mock_storage\n            .expect_uri()\n            .return_const(Uri::for_test(\"ram:///indexes\"));\n\n        mock_storage\n            .expect_put()\n            .once()\n            .returning(|path, _payload| {\n                assert_eq!(path, Path::new(MANIFEST_FILE_NAME));\n                Ok(())\n            });\n\n        mock_storage\n            .expect_put()\n            .once()\n            .returning(|path, _payload| {\n                assert_eq!(path, Path::new(MANIFEST_FILE_NAME));\n                let io_error = StorageErrorKind::Io.with_error(anyhow::anyhow!(\"IO error\"));\n                Err(io_error)\n            });\n\n        let metastore = FileBackedMetastore::for_test(Arc::new(mock_storage));\n\n        let index_template = IndexTemplate::for_test(\"test-template\", &[\"test-index-foo*\"], 100);\n        let index_template_json = serde_json::to_string(&index_template).unwrap();\n        let create_index_template_request = CreateIndexTemplateRequest {\n            index_template_json,\n            overwrite: false,\n        };\n        metastore\n            .create_index_template(create_index_template_request)\n            .await\n            .unwrap();\n        {\n            let state = metastore.state.read().await;\n 
           assert_eq!(state.templates.len(), 1);\n            state.template_matcher.find_match(\"test-index-foo\").unwrap();\n        }\n        let delete_index_templates_request = DeleteIndexTemplatesRequest {\n            template_ids: vec![index_template.template_id],\n        };\n        metastore\n            .delete_index_templates(delete_index_templates_request)\n            .await\n            .unwrap_err();\n        {\n            let state = metastore.state.read().await;\n            assert_eq!(state.templates.len(), 1);\n            state.template_matcher.find_match(\"test-index-foo\").unwrap();\n\n            assert!(\n                state\n                    .template_matcher\n                    .find_match(\"test-index-bar\")\n                    .is_none()\n            );\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/state.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse quickwit_config::{IndexTemplate, IndexTemplateId};\nuse quickwit_proto::metastore::MetastoreResult;\nuse quickwit_proto::types::IndexId;\nuse quickwit_storage::Storage;\nuse uuid::Uuid;\n\nuse super::LazyIndexStatus;\nuse super::index_template_matcher::IndexTemplateMatcher;\nuse super::lazy_file_backed_index::LazyFileBackedIndex;\nuse super::manifest::{IndexStatus, Manifest};\n\n#[derive(Default)]\npub(super) struct MetastoreState {\n    pub indexes: HashMap<IndexId, LazyIndexStatus>,\n    pub templates: HashMap<IndexTemplateId, IndexTemplate>,\n    pub template_matcher: IndexTemplateMatcher,\n    pub identity: Uuid,\n}\n\nimpl MetastoreState {\n    pub fn try_from_manifest(\n        storage: Arc<dyn Storage>,\n        manifest: Manifest,\n        polling_interval_opt: Option<Duration>,\n    ) -> MetastoreResult<Self> {\n        let indexes = manifest\n            .indexes\n            .into_iter()\n            .map(|(index_id, index_status)| match index_status {\n                IndexStatus::Creating => (index_id, LazyIndexStatus::Creating),\n                IndexStatus::Deleting => (index_id, LazyIndexStatus::Deleting),\n                IndexStatus::Active => {\n                    let lazy_index = LazyFileBackedIndex::new(\n                        
storage.clone(),\n                        index_id.clone(),\n                        polling_interval_opt,\n                        None,\n                    );\n                    (index_id, LazyIndexStatus::Active(lazy_index))\n                }\n            })\n            .collect();\n\n        let template_matcher =\n            IndexTemplateMatcher::try_from_index_templates(manifest.templates.values())?;\n\n        let state = Self {\n            indexes,\n            templates: manifest.templates,\n            template_matcher,\n            identity: manifest.identity,\n        };\n        Ok(state)\n    }\n\n    pub fn as_manifest(&self) -> Manifest {\n        let indexes = self\n            .indexes\n            .iter()\n            .map(|(index_id, index_state)| {\n                let index_status = match index_state {\n                    LazyIndexStatus::Creating => IndexStatus::Creating,\n                    LazyIndexStatus::Active(_) => IndexStatus::Active,\n                    LazyIndexStatus::Deleting => IndexStatus::Deleting,\n                };\n                (index_id.clone(), index_status)\n            })\n            .collect();\n        let templates = self.templates.clone();\n        Manifest {\n            indexes,\n            templates,\n            identity: self.identity,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/file_backed/store_operations.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::{Path, PathBuf};\n\nuse quickwit_proto::metastore::{EntityKind, MetastoreError, MetastoreResult, serde_utils};\nuse quickwit_storage::{Storage, StorageError, StorageErrorKind};\n\nuse crate::metastore::file_backed::file_backed_index::FileBackedIndex;\n\n/// Index metastore file managed by [`FileBackedMetastore`](crate::FileBackedMetastore).\npub(super) const METASTORE_FILE_NAME: &str = \"metastore.json\";\n\n/// Path to the metadata file from the given index ID.\npub(super) fn metastore_filepath(index_id: &str) -> PathBuf {\n    Path::new(index_id).join(METASTORE_FILE_NAME)\n}\n\nfn convert_error(index_id: &str, storage_error: StorageError) -> MetastoreError {\n    match storage_error.kind() {\n        StorageErrorKind::NotFound => MetastoreError::NotFound(EntityKind::Index {\n            index_id: index_id.to_string(),\n        }),\n        StorageErrorKind::Unauthorized => MetastoreError::Forbidden {\n            message: \"the request credentials do not allow for this operation\".to_string(),\n        },\n        _ => MetastoreError::Internal {\n            message: \"failed to get index files\".to_string(),\n            cause: storage_error.to_string(),\n        },\n    }\n}\n\npub(super) async fn load_index(\n    storage: &dyn Storage,\n    index_id: &str,\n) -> MetastoreResult<FileBackedIndex> {\n    let metastore_filepath 
= metastore_filepath(index_id);\n\n    let content = storage\n        .get_all(&metastore_filepath)\n        .await\n        .map_err(|storage_err| convert_error(index_id, storage_err))?;\n\n    let index: FileBackedIndex = serde_utils::from_json_bytes(&content)?;\n\n    if index.index_id() != index_id {\n        return Err(MetastoreError::Internal {\n            message: \"inconsistent manifest: index_id mismatch\".to_string(),\n            cause: format!(\n                \"expected index_id `{}`, but found `{}`\",\n                index_id,\n                index.index_id()\n            ),\n        });\n    }\n    Ok(index)\n}\n\npub(super) async fn index_exists(storage: &dyn Storage, index_id: &str) -> MetastoreResult<bool> {\n    let metastore_filepath = metastore_filepath(index_id);\n    let exists = storage\n        .exists(&metastore_filepath)\n        .await\n        .map_err(|storage_error| convert_error(index_id, storage_error))?;\n    Ok(exists)\n}\n\n/// Serializes the `Index` object and stores the data on the storage.\n///\n/// Do not call this method. 
Instead, call `put_index`.\n/// The point of having two methods here is just to make it usable in a unit test.\npub(super) async fn put_index_given_index_id(\n    storage: &dyn Storage,\n    index: &FileBackedIndex,\n    index_id: &str,\n) -> MetastoreResult<()> {\n    // Serialize Index.\n    let content: Vec<u8> = serde_utils::to_json_bytes_pretty(index)?;\n    let metastore_filepath = metastore_filepath(index_id);\n    // Put data back into storage.\n    storage\n        .put(&metastore_filepath, Box::new(content))\n        .await\n        .map_err(|storage_err| convert_error(index_id, storage_err))?;\n    Ok(())\n}\n\n/// Serializes the `Index` object and stores the data on the storage.\npub(super) async fn put_index(\n    storage: &dyn Storage,\n    index: &FileBackedIndex,\n) -> MetastoreResult<()> {\n    put_index_given_index_id(storage, index, index.index_id()).await\n}\n\n/// Deletes the metastore file of the given index from the storage.\n///\n/// Returns a `NotFound` error if the file does not exist.\npub(super) async fn delete_index(storage: &dyn Storage, index_id: &str) -> MetastoreResult<()> {\n    let metastore_filepath = metastore_filepath(index_id);\n\n    let file_exists = storage\n        .exists(&metastore_filepath)\n        .await\n        .map_err(|storage_err| convert_error(index_id, storage_err))?;\n\n    if !file_exists {\n        return Err(MetastoreError::NotFound(EntityKind::Index {\n            index_id: index_id.to_string(),\n        }));\n    }\n    // Delete the metastore file from storage.\n    storage\n        .delete(&metastore_filepath)\n        .await\n        .map_err(|storage_error| match storage_error.kind() {\n            StorageErrorKind::Unauthorized => MetastoreError::Forbidden {\n                message: \"the request credentials do not allow for this operation\".to_string(),\n            },\n            _ => MetastoreError::Internal {\n                message: format!(\n                    \"failed to delete metastore file located at `{}/{}`\",\n                    storage.uri(),\n                    
metastore_filepath.display()\n                ),\n                cause: storage_error.to_string(),\n            },\n        })?;\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/index_metadata/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub(crate) mod serialize;\n\nuse std::collections::HashMap;\nuse std::collections::hash_map::Entry;\n\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{\n    DocMapping, IndexConfig, IndexingSettings, IngestSettings, RetentionPolicy, SearchSettings,\n    SourceConfig, prepare_doc_mapping_update,\n};\nuse quickwit_proto::metastore::{EntityKind, MetastoreError, MetastoreResult};\nuse quickwit_proto::types::{IndexUid, SourceId};\nuse serde::{Deserialize, Serialize};\nuse serialize::VersionedIndexMetadata;\nuse time::OffsetDateTime;\n\nuse crate::checkpoint::IndexCheckpoint;\n\n/// An index metadata carries all meta data about an index.\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]\n#[serde(into = \"VersionedIndexMetadata\")]\n#[serde(try_from = \"VersionedIndexMetadata\")]\npub struct IndexMetadata {\n    /// Index incarnation id\n    pub index_uid: IndexUid,\n    /// Index configuration\n    pub index_config: IndexConfig,\n    /// Per-source map of checkpoint for the given index.\n    pub checkpoint: IndexCheckpoint,\n    /// Time at which the index was created.\n    pub create_timestamp: i64,\n    /// Sources\n    pub sources: HashMap<SourceId, SourceConfig>,\n}\n\nimpl IndexMetadata {\n    /// Panics if `index_config` is missing `index_uri`.\n    pub fn new(index_config: IndexConfig) -> Self {\n        let index_uid = 
IndexUid::new_with_random_ulid(&index_config.index_id);\n        IndexMetadata::new_with_index_uid(index_uid, index_config)\n    }\n\n    /// Panics if `index_config` is missing `index_uri`.\n    pub fn new_with_index_uid(index_uid: IndexUid, index_config: IndexConfig) -> Self {\n        IndexMetadata {\n            index_uid,\n            index_config,\n            checkpoint: Default::default(),\n            create_timestamp: OffsetDateTime::now_utc().unix_timestamp(),\n            sources: HashMap::default(),\n        }\n    }\n\n    /// Returns an [`IndexMetadata`] object with multiple hard-coded values for tests.\n    ///\n    /// An incarnation id of `0` is used to complete the index id into an index UID.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(index_id: &str, index_uri: &str) -> Self {\n        let index_uid = IndexUid::for_test(index_id, 0);\n        let mut index_metadata = IndexMetadata::new(IndexConfig::for_test(index_id, index_uri));\n        index_metadata.index_uid = index_uid;\n        index_metadata\n    }\n\n    /// Extracts the index config from the index metadata object.\n    pub fn into_index_config(self) -> IndexConfig {\n        self.index_config\n    }\n\n    /// Accessor to the index config.\n    pub fn index_config(&self) -> &IndexConfig {\n        &self.index_config\n    }\n\n    /// Accessor to the index config's index id for convenience.\n    pub fn index_id(&self) -> &str {\n        &self.index_config.index_id\n    }\n\n    /// Accessor to the index config's index uri for convenience.\n    pub fn index_uri(&self) -> &Uri {\n        &self.index_config().index_uri\n    }\n\n    /// Updates the index config.\n    ///\n    /// Returns whether a mutation occurred.\n    pub fn update_index_config(\n        &mut self,\n        doc_mapping: DocMapping,\n        indexing_settings: IndexingSettings,\n        ingest_settings: IngestSettings,\n        search_settings: SearchSettings,\n        retention_policy_opt: 
Option<RetentionPolicy>,\n    ) -> MetastoreResult<bool> {\n        let (updated_doc_mapping, mut mutation_occurred) = prepare_doc_mapping_update(\n            doc_mapping,\n            &self.index_config.doc_mapping,\n            &search_settings,\n        )\n        .map_err(|error| MetastoreError::InvalidArgument {\n            message: error.to_string(),\n        })?;\n        self.index_config.doc_mapping = updated_doc_mapping;\n        if indexing_settings != self.index_config.indexing_settings {\n            self.index_config.indexing_settings = indexing_settings;\n            mutation_occurred = true;\n        }\n        if ingest_settings != self.index_config.ingest_settings {\n            self.index_config.ingest_settings = ingest_settings;\n            mutation_occurred = true;\n        }\n        if search_settings != self.index_config.search_settings {\n            self.index_config.search_settings = search_settings;\n            mutation_occurred = true;\n        }\n        if retention_policy_opt != self.index_config.retention_policy_opt {\n            self.index_config.retention_policy_opt = retention_policy_opt;\n            mutation_occurred = true;\n        }\n        Ok(mutation_occurred)\n    }\n\n    /// Adds a source to the index. Returns an error if the source already exists.\n    pub fn add_source(&mut self, source_config: SourceConfig) -> MetastoreResult<()> {\n        match self.sources.entry(source_config.source_id.clone()) {\n            Entry::Occupied(_) => Err(MetastoreError::AlreadyExists(EntityKind::Source {\n                index_id: self.index_id().to_string(),\n                source_id: source_config.source_id,\n            })),\n            Entry::Vacant(entry) => {\n                self.checkpoint.add_source(&source_config.source_id);\n                entry.insert(source_config);\n                Ok(())\n            }\n        }\n    }\n\n    /// Updates a source of the index. 
Returns whether a mutation occurred and an\n    /// error if the source doesn't exist.\n    pub fn update_source(&mut self, source_config: SourceConfig) -> MetastoreResult<bool> {\n        match self.sources.entry(source_config.source_id.clone()) {\n            Entry::Occupied(mut entry) => {\n                if entry.get() == &source_config {\n                    return Ok(false);\n                }\n                entry.insert(source_config);\n                Ok(true)\n            }\n            Entry::Vacant(_) => Err(MetastoreError::NotFound(EntityKind::Source {\n                index_id: self.index_id().to_string(),\n                source_id: source_config.source_id,\n            })),\n        }\n    }\n\n    pub(crate) fn toggle_source(&mut self, source_id: &str, enable: bool) -> MetastoreResult<bool> {\n        let Some(source_config) = self.sources.get_mut(source_id) else {\n            return Err(MetastoreError::NotFound(EntityKind::Source {\n                index_id: self.index_id().to_string(),\n                source_id: source_id.to_string(),\n            }));\n        };\n        let mutation_occurred = source_config.enabled != enable;\n        source_config.enabled = enable;\n        Ok(mutation_occurred)\n    }\n\n    /// Deletes a source from the index.\n    pub(crate) fn delete_source(&mut self, source_id: &str) -> MetastoreResult<()> {\n        self.sources.remove(source_id).ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Source {\n                index_id: self.index_id().to_string(),\n                source_id: source_id.to_string(),\n            })\n        })?;\n        self.checkpoint.remove_source(source_id);\n        Ok(())\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl quickwit_config::TestableForRegression for IndexMetadata {\n    fn sample_for_regression() -> IndexMetadata {\n        use std::collections::BTreeMap;\n\n        use quickwit_proto::types::Position;\n\n        use 
crate::checkpoint::{PartitionId, SourceCheckpoint, SourceCheckpointDelta};\n\n        let index_config = IndexConfig::sample_for_regression();\n\n        let mut source_checkpoint = SourceCheckpoint::default();\n        let delta = SourceCheckpointDelta::from_partition_delta(\n            PartitionId::from(0i64),\n            Position::Beginning,\n            Position::offset(42u64),\n        )\n        .unwrap();\n        source_checkpoint.try_apply_delta(delta).unwrap();\n\n        let per_source_checkpoint: BTreeMap<String, SourceCheckpoint> =\n            BTreeMap::from_iter([(\"kafka-source\".to_string(), source_checkpoint)]);\n        let checkpoint = IndexCheckpoint::from(per_source_checkpoint);\n\n        let mut index_metadata = IndexMetadata {\n            index_uid: IndexUid::for_test(&index_config.index_id, 1),\n            index_config,\n            checkpoint,\n            create_timestamp: 1789,\n            sources: Default::default(),\n        };\n        index_metadata\n            .add_source(SourceConfig::sample_for_regression())\n            .unwrap();\n        index_metadata\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        self.index_config().assert_equality(other.index_config());\n        assert_eq!(self.checkpoint, other.checkpoint);\n        assert_eq!(self.create_timestamp, other.create_timestamp);\n        assert_eq!(self.sources, other.sources);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_doc_mapper::Mode;\n    use quickwit_proto::types::DocMappingUid;\n\n    use super::*;\n\n    #[test]\n    fn test_update_index_config() {\n        let current_index_config = IndexConfig::for_test(\"test-index\", \"s3://test-index\");\n        let mut current_index_metadata = IndexMetadata::new(current_index_config.clone());\n\n        let mutation_occurred = current_index_metadata\n            .update_index_config(\n                current_index_config.doc_mapping.clone(),\n                
current_index_config.indexing_settings.clone(),\n                current_index_config.ingest_settings.clone(),\n                current_index_config.search_settings.clone(),\n                current_index_config.retention_policy_opt.clone(),\n            )\n            .unwrap();\n        assert!(!mutation_occurred);\n\n        let new_search_settings = SearchSettings {\n            default_search_fields: vec![\"message\".to_string(), \"status\".to_string()],\n        };\n        let mutation_occurred = current_index_metadata\n            .update_index_config(\n                current_index_config.doc_mapping.clone(),\n                current_index_config.indexing_settings.clone(),\n                current_index_config.ingest_settings.clone(),\n                new_search_settings,\n                current_index_config.retention_policy_opt.clone(),\n            )\n            .unwrap();\n        assert!(mutation_occurred);\n        assert_eq!(\n            current_index_metadata\n                .index_config()\n                .search_settings\n                .default_search_fields,\n            [\"message\", \"status\"]\n        );\n    }\n\n    #[test]\n    fn test_update_doc_mapping() {\n        let current_index_config = IndexConfig::for_test(\"test-index\", \"s3://test-index\");\n        let mut current_index_metadata = IndexMetadata::new(current_index_config.clone());\n\n        let mut new_doc_mapping = current_index_config.doc_mapping.clone();\n        new_doc_mapping.doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.timestamp_field = Some(\"ts\".to_string()); // This is set to `timestamp` for the current doc mapping.\n\n        current_index_metadata\n            .update_index_config(\n                new_doc_mapping,\n                current_index_config.indexing_settings.clone(),\n                current_index_config.ingest_settings.clone(),\n                current_index_config.search_settings.clone(),\n                
current_index_config.retention_policy_opt.clone(),\n            )\n            .unwrap_err();\n\n        let mut new_doc_mapping = current_index_config.doc_mapping.clone();\n        let new_doc_mapping_uid = DocMappingUid::random();\n        new_doc_mapping.doc_mapping_uid = new_doc_mapping_uid;\n        new_doc_mapping.mode = Mode::Strict;\n\n        let mutation_occurred = current_index_metadata\n            .update_index_config(\n                new_doc_mapping,\n                current_index_config.indexing_settings,\n                current_index_config.ingest_settings,\n                current_index_config.search_settings,\n                current_index_config.retention_policy_opt,\n            )\n            .unwrap();\n        assert!(mutation_occurred);\n        assert_eq!(\n            current_index_metadata\n                .index_config()\n                .doc_mapping\n                .doc_mapping_uid,\n            new_doc_mapping_uid\n        );\n        assert_eq!(\n            current_index_metadata.index_config().doc_mapping.mode,\n            Mode::Strict\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/index_metadata/serialize.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse quickwit_config::{IndexConfig, SourceConfig};\nuse quickwit_proto::types::IndexUid;\nuse serde::{self, Deserialize, Serialize};\n\nuse crate::IndexMetadata;\nuse crate::checkpoint::IndexCheckpoint;\nuse crate::split_metadata::utc_now_timestamp;\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(tag = \"version\")]\npub(crate) enum VersionedIndexMetadata {\n    #[serde(rename = \"0.9\")]\n    // Retro compatibility.\n    #[serde(alias = \"0.8\")]\n    #[serde(alias = \"0.7\")]\n    V0_8(IndexMetadataV0_8),\n}\n\nimpl From<IndexMetadata> for VersionedIndexMetadata {\n    fn from(index_metadata: IndexMetadata) -> Self {\n        VersionedIndexMetadata::V0_8(index_metadata.into())\n    }\n}\n\nimpl TryFrom<VersionedIndexMetadata> for IndexMetadata {\n    type Error = anyhow::Error;\n\n    fn try_from(index_metadata: VersionedIndexMetadata) -> anyhow::Result<Self> {\n        match index_metadata {\n            // When we have more than one version, you should chain version conversion.\n            // ie. 
Implement conversion from V_k -> V_{k+1}\n            VersionedIndexMetadata::V0_8(v8) => v8.try_into(),\n        }\n    }\n}\n\nimpl From<IndexMetadata> for IndexMetadataV0_8 {\n    fn from(index_metadata: IndexMetadata) -> Self {\n        let sources: Vec<SourceConfig> = index_metadata.sources.values().cloned().collect();\n        Self {\n            index_uid: index_metadata.index_uid,\n            index_config: index_metadata.index_config,\n            checkpoint: index_metadata.checkpoint,\n            create_timestamp: index_metadata.create_timestamp,\n            sources,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, utoipa::ToSchema)]\npub(crate) struct IndexMetadataV0_8 {\n    #[schema(value_type = String)]\n    pub index_uid: IndexUid,\n    #[schema(value_type = VersionedIndexConfig)]\n    pub index_config: IndexConfig,\n    #[schema(value_type = Object)]\n    pub checkpoint: IndexCheckpoint,\n    #[serde(default = \"utc_now_timestamp\")]\n    pub create_timestamp: i64,\n    #[schema(value_type = Vec<VersionedSourceConfig>)]\n    pub sources: Vec<SourceConfig>,\n}\n\nimpl TryFrom<IndexMetadataV0_8> for IndexMetadata {\n    type Error = anyhow::Error;\n\n    fn try_from(v0_8: IndexMetadataV0_8) -> anyhow::Result<Self> {\n        let mut sources: HashMap<String, SourceConfig> = Default::default();\n        for source in v0_8.sources {\n            if sources.contains_key(&source.source_id) {\n                anyhow::bail!(\"source `{}` is defined more than once\", source.source_id);\n            }\n            sources.insert(source.source_id.clone(), source);\n        }\n        Ok(Self {\n            index_uid: v0_8.index_uid,\n            index_config: v0_8.index_config,\n            checkpoint: v0_8.checkpoint,\n            create_timestamp: v0_8.create_timestamp,\n            sources,\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub mod file_backed;\npub(crate) mod index_metadata;\n#[cfg(feature = \"postgres\")]\npub mod postgres;\n\npub mod control_plane_metastore;\n\nuse std::cmp::Ordering;\nuse std::ops::{Bound, RangeInclusive};\n\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse futures::TryStreamExt;\npub use index_metadata::IndexMetadata;\nuse itertools::Itertools;\nuse quickwit_common::thread_pool::run_cpu_intensive;\nuse quickwit_config::{\n    DocMapping, FileSourceParams, IndexConfig, IndexingSettings, IngestSettings, RetentionPolicy,\n    SearchSettings, SourceConfig, SourceParams,\n};\nuse quickwit_doc_mapper::tag_pruning::TagFilterAst;\nuse quickwit_proto::metastore::{\n    AddSourceRequest, CreateIndexRequest, CreateIndexResponse, DeleteTask, IndexMetadataFailure,\n    IndexMetadataRequest, IndexMetadataResponse, IndexesMetadataResponse,\n    ListIndexesMetadataResponse, ListSplitsRequest, ListSplitsResponse, MetastoreError,\n    MetastoreResult, MetastoreService, MetastoreServiceClient, MetastoreServiceStream,\n    PublishSplitsRequest, StageSplitsRequest, UpdateIndexRequest, UpdateSourceRequest, serde_utils,\n};\nuse quickwit_proto::types::{IndexUid, NodeId, SplitId};\nuse time::OffsetDateTime;\n\nuse crate::checkpoint::IndexCheckpointDelta;\nuse crate::{Split, SplitMetadata, SplitState};\n\n/// Splits batch size returned by the stream splits 
API\npub(crate) const STREAM_SPLITS_CHUNK_SIZE: usize = 100;\n\n/// An extended trait for [`MetastoreService`].\n#[async_trait]\npub trait MetastoreServiceExt: MetastoreService {\n    /// Returns whether the index `index_id` exists in the metastore.\n    async fn index_exists(&mut self, index_id: &str) -> MetastoreResult<bool> {\n        let request = IndexMetadataRequest::for_index_id(index_id.to_string());\n        match self.index_metadata(request).await {\n            Ok(_) => Ok(true),\n            Err(MetastoreError::NotFound { .. }) => Ok(false),\n            Err(error) => Err(error),\n        }\n    }\n}\n\nimpl MetastoreServiceExt for MetastoreServiceClient {}\n\n/// Helper trait to collect splits from a [`MetastoreServiceStream<ListSplitsResponse>`].\n#[async_trait]\npub trait MetastoreServiceStreamSplitsExt {\n    /// Collects all splits from a [`MetastoreServiceStream<ListSplitsResponse>`].\n    async fn collect_splits(mut self) -> MetastoreResult<Vec<Split>>;\n\n    /// Collects all splits metadata from a [`MetastoreServiceStream<ListSplitsResponse>`].\n    async fn collect_splits_metadata(mut self) -> MetastoreResult<Vec<SplitMetadata>>;\n\n    /// Collects all splits IDs from a [`MetastoreServiceStream<ListSplitsResponse>`].\n    async fn collect_split_ids(mut self) -> MetastoreResult<Vec<SplitId>>;\n}\n\n#[async_trait]\nimpl MetastoreServiceStreamSplitsExt for MetastoreServiceStream<ListSplitsResponse> {\n    async fn collect_splits(mut self) -> MetastoreResult<Vec<Split>> {\n        let mut all_splits = Vec::new();\n        while let Some(list_splits_response) = self.try_next().await? 
{\n            let splits = list_splits_response.deserialize_splits().await?;\n            all_splits.extend(splits);\n        }\n        Ok(all_splits)\n    }\n\n    async fn collect_splits_metadata(mut self) -> MetastoreResult<Vec<SplitMetadata>> {\n        let mut all_splits_metadata = Vec::new();\n        while let Some(list_splits_response) = self.try_next().await? {\n            let splits_metadata = list_splits_response.deserialize_splits_metadata().await?;\n            all_splits_metadata.extend(splits_metadata);\n        }\n        Ok(all_splits_metadata)\n    }\n\n    async fn collect_split_ids(mut self) -> MetastoreResult<Vec<SplitId>> {\n        let mut all_splits = Vec::new();\n        while let Some(list_splits_response) = self.try_next().await? {\n            let splits = list_splits_response.deserialize_split_ids().await?;\n            all_splits.extend(splits);\n        }\n        Ok(all_splits)\n    }\n}\n\n/// Helper trait to build a [`CreateIndexRequest`] and deserialize its payload.\npub trait CreateIndexRequestExt {\n    /// Creates a new [`CreateIndexRequest`] from an [`IndexConfig`].\n    fn try_from_index_config(index_config: &IndexConfig) -> MetastoreResult<CreateIndexRequest>;\n\n    /// Creates a new [`CreateIndexRequest`] from an [`IndexConfig`] and a list of [`SourceConfig`].\n    fn try_from_index_and_source_configs(\n        index_config: &IndexConfig,\n        source_configs: &[SourceConfig],\n    ) -> MetastoreResult<CreateIndexRequest>;\n\n    /// Deserializes the `index_config_json` field of a [`CreateIndexRequest`] into an\n    /// [`IndexConfig`].\n    fn deserialize_index_config(&self) -> MetastoreResult<IndexConfig>;\n\n    /// Deserializes the `source_configs_json` field of a [`CreateIndexRequest`] into an\n    /// `Vec` of [`SourceConfig`].\n    fn deserialize_source_configs(&self) -> MetastoreResult<Vec<SourceConfig>>;\n}\n\nimpl CreateIndexRequestExt for CreateIndexRequest {\n    fn try_from_index_config(index_config: 
&IndexConfig) -> MetastoreResult<CreateIndexRequest> {\n        let index_config_json = serde_utils::to_json_str(index_config)?;\n        let source_configs_json = Vec::new();\n        let request = Self {\n            index_config_json,\n            source_configs_json,\n        };\n        Ok(request)\n    }\n\n    fn try_from_index_and_source_configs(\n        index_config: &IndexConfig,\n        source_configs: &[SourceConfig],\n    ) -> MetastoreResult<CreateIndexRequest> {\n        let index_config_json = serde_utils::to_json_str(index_config)?;\n        let source_configs_json: Vec<String> = source_configs\n            .iter()\n            .map(serde_utils::to_json_str)\n            .collect::<MetastoreResult<_>>()?;\n        let request = Self {\n            index_config_json,\n            source_configs_json,\n        };\n        Ok(request)\n    }\n\n    fn deserialize_index_config(&self) -> MetastoreResult<IndexConfig> {\n        serde_utils::from_json_str(&self.index_config_json)\n    }\n\n    fn deserialize_source_configs(&self) -> MetastoreResult<Vec<SourceConfig>> {\n        self.source_configs_json\n            .iter()\n            .map(|source_config_json| serde_utils::from_json_str(source_config_json))\n            .collect()\n    }\n}\n\n/// Helper trait to deserialize the payload of a [`CreateIndexResponse`].\npub trait CreateIndexResponseExt {\n    /// Deserializes the `index_metadata_json` field of a [`CreateIndexResponse`] into an\n    /// [`IndexMetadata`].\n    fn deserialize_index_metadata(&self) -> MetastoreResult<IndexMetadata>;\n}\n\nimpl CreateIndexResponseExt for CreateIndexResponse {\n    fn deserialize_index_metadata(&self) -> MetastoreResult<IndexMetadata> {\n        serde_utils::from_json_str(&self.index_metadata_json)\n    }\n}\n\n/// Helper trait to build a [`UpdateIndexRequest`] and deserialize its payload.\npub trait UpdateIndexRequestExt {\n    /// Creates a new [`UpdateIndexRequest`] from the different updated fields.\n    
fn try_from_updates(\n        index_uid: impl Into<IndexUid>,\n        doc_mapping: &DocMapping,\n        indexing_settings: &IndexingSettings,\n        ingest_settings: &IngestSettings,\n        search_settings: &SearchSettings,\n        retention_policy_opt: &Option<RetentionPolicy>,\n    ) -> MetastoreResult<UpdateIndexRequest>;\n\n    /// Deserializes the `doc_mapping_json` field of an [`UpdateIndexRequest`] into a\n    /// [`DocMapping`] object.\n    fn deserialize_doc_mapping(&self) -> MetastoreResult<DocMapping>;\n\n    /// Deserializes the `indexing_settings_json` field of an [`UpdateIndexRequest`] into an\n    /// [`IndexingSettings`] object.\n    fn deserialize_indexing_settings(&self) -> MetastoreResult<IndexingSettings>;\n\n    /// Deserializes the `ingest_settings_json` field of an [`UpdateIndexRequest`] into an\n    /// [`IngestSettings`] object.\n    fn deserialize_ingest_settings(&self) -> MetastoreResult<IngestSettings>;\n\n    /// Deserializes the `search_settings_json` field of an [`UpdateIndexRequest`] into a\n    /// [`SearchSettings`] object.\n    fn deserialize_search_settings(&self) -> MetastoreResult<SearchSettings>;\n\n    /// Deserializes the `retention_policy_json_opt` field of an [`UpdateIndexRequest`] into an\n    /// optional [`RetentionPolicy`] object.\n    fn deserialize_retention_policy(&self) -> MetastoreResult<Option<RetentionPolicy>>;\n}\n\nimpl UpdateIndexRequestExt for UpdateIndexRequest {\n    fn try_from_updates(\n        index_uid: impl Into<IndexUid>,\n        doc_mapping: &DocMapping,\n        indexing_settings: &IndexingSettings,\n        ingest_settings: &IngestSettings,\n        search_settings: &SearchSettings,\n        retention_policy_opt: &Option<RetentionPolicy>,\n    ) -> MetastoreResult<UpdateIndexRequest> {\n        let doc_mapping_json = serde_utils::to_json_str(doc_mapping)?;\n        let indexing_settings_json = serde_utils::to_json_str(indexing_settings)?;\n        let ingest_settings_json = 
serde_utils::to_json_str(ingest_settings)?;\n        let search_settings_json = serde_utils::to_json_str(search_settings)?;\n        let retention_policy_json_opt = retention_policy_opt\n            .as_ref()\n            .map(serde_utils::to_json_str)\n            .transpose()?;\n\n        let update_request = UpdateIndexRequest {\n            index_uid: Some(index_uid.into()),\n            doc_mapping_json,\n            indexing_settings_json,\n            ingest_settings_json,\n            search_settings_json,\n            retention_policy_json_opt,\n        };\n        Ok(update_request)\n    }\n    fn deserialize_doc_mapping(&self) -> MetastoreResult<DocMapping> {\n        serde_utils::from_json_str(&self.doc_mapping_json)\n    }\n\n    fn deserialize_indexing_settings(&self) -> MetastoreResult<IndexingSettings> {\n        serde_utils::from_json_str(&self.indexing_settings_json)\n    }\n\n    fn deserialize_ingest_settings(&self) -> MetastoreResult<IngestSettings> {\n        serde_utils::from_json_str(&self.ingest_settings_json)\n    }\n\n    fn deserialize_search_settings(&self) -> MetastoreResult<SearchSettings> {\n        serde_utils::from_json_str(&self.search_settings_json)\n    }\n\n    fn deserialize_retention_policy(&self) -> MetastoreResult<Option<RetentionPolicy>> {\n        self.retention_policy_json_opt\n            .as_ref()\n            .map(|policy_json| serde_utils::from_json_str(policy_json))\n            .transpose()\n    }\n}\n\n/// Helper trait to build a [`IndexMetadataResponse`] and deserialize its payload.\npub trait IndexMetadataResponseExt {\n    /// Creates a new [`IndexMetadataResponse`] from an [`IndexMetadata`].\n    fn try_from_index_metadata(\n        index_metadata: &IndexMetadata,\n    ) -> MetastoreResult<IndexMetadataResponse>;\n\n    /// Deserializes the `index_metadata_serialized_json` field of a [`IndexMetadataResponse`] into\n    /// an [`IndexMetadata`].\n    fn deserialize_index_metadata(&self) -> 
MetastoreResult<IndexMetadata>;\n}\n\nimpl IndexMetadataResponseExt for IndexMetadataResponse {\n    fn try_from_index_metadata(index_metadata: &IndexMetadata) -> MetastoreResult<Self> {\n        let index_metadata_serialized_json = serde_utils::to_json_str(index_metadata)?;\n        let response = Self {\n            index_metadata_serialized_json,\n        };\n        Ok(response)\n    }\n\n    fn deserialize_index_metadata(&self) -> MetastoreResult<IndexMetadata> {\n        serde_utils::from_json_str(&self.index_metadata_serialized_json)\n    }\n}\n\n/// Helper trait to build an [`IndexesMetadataResponse`] and deserialize its payload.\n#[async_trait]\npub trait IndexesMetadataResponseExt {\n    /// Creates a new `IndexesMetadataResponse` from a `Vec` of [`IndexMetadata`].\n    async fn try_from_indexes_metadata(\n        indexes_metadata: Vec<IndexMetadata>,\n        failures: Vec<IndexMetadataFailure>,\n    ) -> MetastoreResult<IndexesMetadataResponse>;\n\n    /// Deserializes the payload of an `IndexesMetadataResponse` into a `Vec` of [`IndexMetadata`].\n    async fn deserialize_indexes_metadata(self) -> MetastoreResult<Vec<IndexMetadata>>;\n\n    /// Creates a new `IndexesMetadataResponse` from a `Vec` of [`IndexMetadata`] synchronously.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    fn for_test(\n        indexes_metadata: Vec<IndexMetadata>,\n        failures: Vec<IndexMetadataFailure>,\n    ) -> IndexesMetadataResponse {\n        use futures::executor;\n\n        executor::block_on(Self::try_from_indexes_metadata(indexes_metadata, failures)).unwrap()\n    }\n}\n\n#[async_trait]\nimpl IndexesMetadataResponseExt for IndexesMetadataResponse {\n    async fn try_from_indexes_metadata(\n        indexes_metadata: Vec<IndexMetadata>,\n        failures: Vec<IndexMetadataFailure>,\n    ) -> MetastoreResult<Self> {\n        let indexes_metadata_json_zstd = run_cpu_intensive(move || {\n            serde_utils::to_json_zstd(&indexes_metadata, 0).map(Bytes::from)\n 
       })\n        .await\n        .map_err(|join_error| MetastoreError::Internal {\n            message: \"failed to serialize indexes metadata\".to_string(),\n            cause: join_error.to_string(),\n        })??;\n        let response = Self {\n            indexes_metadata_json_zstd,\n            failures,\n        };\n        Ok(response)\n    }\n\n    async fn deserialize_indexes_metadata(self) -> MetastoreResult<Vec<IndexMetadata>> {\n        run_cpu_intensive(move || serde_utils::from_json_zstd(&self.indexes_metadata_json_zstd))\n            .await\n            .map_err(|join_error| MetastoreError::Internal {\n                message: \"failed to deserialize indexes metadata\".to_string(),\n                cause: join_error.to_string(),\n            })?\n    }\n}\n\n/// Helper trait to build a `ListIndexesMetadataResponse` and deserialize its payload.\n#[async_trait]\npub trait ListIndexesMetadataResponseExt {\n    /// Creates a new `ListIndexesMetadataResponse` from a `Vec` of [`IndexMetadata`].\n    async fn try_from_indexes_metadata(\n        indexes_metadata: Vec<IndexMetadata>,\n    ) -> MetastoreResult<ListIndexesMetadataResponse>;\n\n    /// Deserializes the payload of a `ListIndexesMetadataResponse` into a `Vec` of\n    /// [`IndexMetadata`].\n    async fn deserialize_indexes_metadata(self) -> MetastoreResult<Vec<IndexMetadata>>;\n\n    /// Creates a new `ListIndexesMetadataResponse` from a `Vec` of [`IndexMetadata`] synchronously.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    fn for_test(indexes_metadata: Vec<IndexMetadata>) -> ListIndexesMetadataResponse {\n        use futures::executor;\n\n        executor::block_on(Self::try_from_indexes_metadata(indexes_metadata)).unwrap()\n    }\n}\n\n#[async_trait]\nimpl ListIndexesMetadataResponseExt for ListIndexesMetadataResponse {\n    async fn try_from_indexes_metadata(\n        indexes_metadata: Vec<IndexMetadata>,\n    ) -> MetastoreResult<Self> {\n        let indexes_metadata_json_zstd = run_cpu_intensive(move || 
{\n            serde_utils::to_json_zstd(&indexes_metadata, 0).map(Bytes::from)\n        })\n        .await\n        .map_err(|join_error| MetastoreError::Internal {\n            message: \"failed to serialize indexes metadata\".to_string(),\n            cause: join_error.to_string(),\n        })??;\n        let response = Self {\n            indexes_metadata_json_zstd,\n            indexes_metadata_json_opt: None,\n        };\n        Ok(response)\n    }\n\n    async fn deserialize_indexes_metadata(self) -> MetastoreResult<Vec<IndexMetadata>> {\n        run_cpu_intensive(move || {\n            if let Some(indexes_metadata_json) = &self.indexes_metadata_json_opt {\n                return serde_utils::from_json_str(indexes_metadata_json);\n            };\n            serde_utils::from_json_zstd(&self.indexes_metadata_json_zstd)\n        })\n        .await\n        .map_err(|join_error| MetastoreError::Internal {\n            message: \"failed to deserialize indexes metadata\".to_string(),\n            cause: join_error.to_string(),\n        })?\n    }\n}\n\n/// Helper trait to build a [`AddSourceRequest`] and deserialize its payload.\npub trait AddSourceRequestExt {\n    /// Creates a new [`AddSourceRequest`] from a [`SourceConfig`].\n    fn try_from_source_config(\n        index_uid: impl Into<IndexUid>,\n        source_config: &SourceConfig,\n    ) -> MetastoreResult<AddSourceRequest>;\n\n    /// Deserializes the `source_config_json` field of a [`AddSourceRequest`] into a\n    /// [`SourceConfig`].\n    fn deserialize_source_config(&self) -> MetastoreResult<SourceConfig>;\n}\n\nimpl AddSourceRequestExt for AddSourceRequest {\n    fn try_from_source_config(\n        index_uid: impl Into<IndexUid>,\n        source_config: &SourceConfig,\n    ) -> MetastoreResult<AddSourceRequest> {\n        let source_config_json = serde_utils::to_json_str(&source_config)?;\n        let request = Self {\n            index_uid: Some(index_uid.into()),\n            
source_config_json,\n        };\n        Ok(request)\n    }\n\n    fn deserialize_source_config(&self) -> MetastoreResult<SourceConfig> {\n        serde_utils::from_json_str(&self.source_config_json)\n    }\n}\n\n/// Helper trait to build an [`UpdateSourceRequest`] and deserialize its payload.\npub trait UpdateSourceRequestExt {\n    /// Creates a new [`UpdateSourceRequest`] from a [`SourceConfig`].\n    fn try_from_source_config(\n        index_uid: impl Into<IndexUid>,\n        source_config: &SourceConfig,\n    ) -> MetastoreResult<UpdateSourceRequest>;\n\n    /// Deserializes the `source_config_json` field of an [`UpdateSourceRequest`] into a\n    /// [`SourceConfig`].\n    fn deserialize_source_config(&self) -> MetastoreResult<SourceConfig>;\n}\n\nimpl UpdateSourceRequestExt for UpdateSourceRequest {\n    fn try_from_source_config(\n        index_uid: impl Into<IndexUid>,\n        source_config: &SourceConfig,\n    ) -> MetastoreResult<UpdateSourceRequest> {\n        let source_config_json = serde_utils::to_json_str(&source_config)?;\n        let request = Self {\n            index_uid: Some(index_uid.into()),\n            source_config_json,\n        };\n        Ok(request)\n    }\n\n    fn deserialize_source_config(&self) -> MetastoreResult<SourceConfig> {\n        serde_utils::from_json_str(&self.source_config_json)\n    }\n}\n\n/// Helper trait to build a [`StageSplitsRequest`] and deserialize its payload.\npub trait StageSplitsRequestExt {\n    /// Creates a new [`StageSplitsRequest`] from a [`SplitMetadata`].\n    fn try_from_split_metadata(\n        index_uid: impl Into<IndexUid>,\n        split_metadata: &SplitMetadata,\n    ) -> MetastoreResult<StageSplitsRequest>;\n\n    /// Creates a new [`StageSplitsRequest`] from a list of [`SplitMetadata`].\n    fn try_from_splits_metadata(\n        index_uid: impl Into<IndexUid>,\n        splits_metadata: impl IntoIterator<Item = SplitMetadata>,\n    ) -> MetastoreResult<StageSplitsRequest>;\n\n    /// Deserializes the 
`split_metadata_list_serialized_json` field of a [`StageSplitsRequest`]\n    /// into a list of [`SplitMetadata`].\n    fn deserialize_splits_metadata(&self) -> MetastoreResult<Vec<SplitMetadata>>;\n}\n\nimpl StageSplitsRequestExt for StageSplitsRequest {\n    fn try_from_split_metadata(\n        index_uid: impl Into<IndexUid>,\n        split_metadata: &SplitMetadata,\n    ) -> MetastoreResult<StageSplitsRequest> {\n        let split_metadata_list_serialized_json = serde_utils::to_json_str(&[split_metadata])?;\n        let request = Self {\n            index_uid: Some(index_uid.into()),\n            split_metadata_list_serialized_json,\n        };\n        Ok(request)\n    }\n\n    fn try_from_splits_metadata(\n        index_uid: impl Into<IndexUid>,\n        splits_metadata: impl IntoIterator<Item = SplitMetadata>,\n    ) -> MetastoreResult<StageSplitsRequest> {\n        let splits_metadata: Vec<SplitMetadata> = splits_metadata.into_iter().collect();\n        let split_metadata_list_serialized_json = serde_utils::to_json_str(&splits_metadata)?;\n        let request = Self {\n            index_uid: Some(index_uid.into()),\n            split_metadata_list_serialized_json,\n        };\n        Ok(request)\n    }\n\n    fn deserialize_splits_metadata(&self) -> MetastoreResult<Vec<SplitMetadata>> {\n        serde_utils::from_json_str(&self.split_metadata_list_serialized_json)\n    }\n}\n\n/// Helper trait to build a [`ListSplitsRequest`] and deserialize its payload.\npub trait ListSplitsRequestExt {\n    /// Creates a new [`ListSplitsRequest`] from an [`IndexUid`].\n    fn try_from_index_uid(index_uid: IndexUid) -> MetastoreResult<ListSplitsRequest>;\n\n    /// Creates a new [`ListSplitsRequest`] from a [`ListSplitsQuery`].\n    fn try_from_list_splits_query(\n        list_splits_query: &ListSplitsQuery,\n    ) -> MetastoreResult<ListSplitsRequest>;\n\n    /// Deserializes the `query_json` field of a [`ListSplitsRequest`] into a [`ListSplitsQuery`].\n    fn 
deserialize_list_splits_query(&self) -> MetastoreResult<ListSplitsQuery>;\n}\n\nimpl ListSplitsRequestExt for ListSplitsRequest {\n    fn try_from_index_uid(index_uid: IndexUid) -> MetastoreResult<ListSplitsRequest> {\n        let list_splits_query = ListSplitsQuery::for_index(index_uid);\n        Self::try_from_list_splits_query(&list_splits_query)\n    }\n\n    fn try_from_list_splits_query(\n        list_splits_query: &ListSplitsQuery,\n    ) -> MetastoreResult<ListSplitsRequest> {\n        let query_json = serde_utils::to_json_str(&list_splits_query)?;\n        let request = Self { query_json };\n        Ok(request)\n    }\n\n    fn deserialize_list_splits_query(&self) -> MetastoreResult<ListSplitsQuery> {\n        let list_splits_query = serde_utils::from_json_str(&self.query_json)?;\n        Ok(list_splits_query)\n    }\n}\n\n/// Helper trait to build a [`ListSplitsResponse`] and deserialize its payload.\n#[async_trait]\npub trait ListSplitsResponseExt {\n    /// Creates a new [`ListSplitsResponse`] from a list of [`Split`].\n    fn try_from_splits(\n        splits: impl IntoIterator<Item = Split>,\n    ) -> MetastoreResult<ListSplitsResponse>;\n\n    /// Deserializes the `splits_serialized_json` field of a [`ListSplitsResponse`] into a list of\n    /// [`Split`].\n    async fn deserialize_splits(self) -> MetastoreResult<Vec<Split>>;\n\n    /// Deserializes the `splits_serialized_json` field of a [`ListSplitsResponse`] into a list of\n    /// [`SplitMetadata`].\n    async fn deserialize_splits_metadata(self) -> MetastoreResult<Vec<SplitMetadata>>;\n\n    /// Deserializes the `splits_serialized_json` field of a [`ListSplitsResponse`] into a list of\n    /// [`SplitId`].\n    async fn deserialize_split_ids(self) -> MetastoreResult<Vec<SplitId>>;\n\n    /// Creates an empty [`ListSplitsResponse`].\n    fn empty() -> Self;\n}\n\n/// Helper trait for [`PublishSplitsRequest`] to deserialize its payload.\npub trait PublishSplitsRequestExt {\n    /// Deserializes the 
`index_checkpoint_delta_json_opt` field of a [`PublishSplitsRequest`] into\n    /// an [`Option<IndexCheckpointDelta>`].\n    fn deserialize_index_checkpoint(&self) -> MetastoreResult<Option<IndexCheckpointDelta>>;\n}\n\nimpl PublishSplitsRequestExt for PublishSplitsRequest {\n    fn deserialize_index_checkpoint(&self) -> MetastoreResult<Option<IndexCheckpointDelta>> {\n        self.index_checkpoint_delta_json_opt\n            .as_ref()\n            .map(|value| serde_utils::from_json_str(value))\n            .transpose()\n    }\n}\n\n#[async_trait]\nimpl ListSplitsResponseExt for ListSplitsResponse {\n    fn empty() -> Self {\n        Self {\n            splits_serialized_json: \"[]\".to_string(),\n        }\n    }\n\n    fn try_from_splits(splits: impl IntoIterator<Item = Split>) -> MetastoreResult<Self> {\n        let splits_serialized_json = serde_utils::to_json_str(&splits.into_iter().collect_vec())?;\n        let response = Self {\n            splits_serialized_json,\n        };\n        Ok(response)\n    }\n\n    async fn deserialize_splits(self) -> MetastoreResult<Vec<Split>> {\n        run_cpu_intensive(move || serde_utils::from_json_str(&self.splits_serialized_json))\n            .await\n            .map_err(|join_error| MetastoreError::Internal {\n                message: \"failed to deserialize splits\".to_string(),\n                cause: join_error.to_string(),\n            })?\n    }\n\n    async fn deserialize_splits_metadata(self) -> MetastoreResult<Vec<SplitMetadata>> {\n        let splits = self.deserialize_splits().await?;\n        let splits_metadata = splits\n            .into_iter()\n            .map(|split| split.split_metadata)\n            .collect();\n        Ok(splits_metadata)\n    }\n\n    async fn deserialize_split_ids(self) -> MetastoreResult<Vec<SplitId>> {\n        let splits = self.deserialize_splits().await?;\n        let split_ids = splits\n            .into_iter()\n            .map(|split| split.split_metadata.split_id)\n       
     .collect();\n        Ok(split_ids)\n    }\n}\n\n#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]\n/// A query builder for listing splits within the metastore.\npub struct ListSplitsQuery {\n    /// A non-empty list of index UIDs for which to fetch the splits, or\n    /// None if we want splits from all indexes.\n    pub index_uids: Option<Vec<IndexUid>>,\n\n    /// A specific node ID to filter by.\n    pub node_id: Option<NodeId>,\n\n    /// The maximum number of splits to retrieve.\n    pub limit: Option<usize>,\n\n    /// The number of splits to skip.\n    pub offset: Option<usize>,\n\n    /// A specific split state(s) to filter by.\n    pub split_states: Vec<SplitState>,\n\n    /// A specific set of tag(s) to filter by.\n    pub tags: Option<TagFilterAst>,\n\n    /// The time range to filter by.\n    pub time_range: FilterRange<i64>,\n\n    /// The maximum time range end to filter by.\n    pub max_time_range_end: Option<i64>,\n\n    /// The delete opstamp range to filter by.\n    pub delete_opstamp: FilterRange<u64>,\n\n    /// The update timestamp range to filter by.\n    pub update_timestamp: FilterRange<i64>,\n\n    /// The create timestamp range to filter by.\n    pub create_timestamp: FilterRange<i64>,\n\n    /// The datetime at which you include or exclude mature splits.\n    pub mature: Bound<OffsetDateTime>,\n\n    /// Sorts the splits by staleness, i.e. 
by delete opstamp and publish timestamp in ascending\n    /// order.\n    pub sort_by: SortBy,\n\n    /// Only return splits whose (index_uid, split_id) are lexicographically after this split\n    pub after_split: Option<(IndexUid, SplitId)>,\n}\n\n#[derive(Debug, Clone, PartialEq, serde::Serialize, serde::Deserialize)]\npub enum SortBy {\n    None,\n    Staleness,\n    IndexUid,\n}\n\nimpl SortBy {\n    fn compare(&self, left_split: &Split, right_split: &Split) -> Ordering {\n        match self {\n            SortBy::None => Ordering::Equal,\n            SortBy::Staleness => left_split\n                .split_metadata\n                .delete_opstamp\n                .cmp(&right_split.split_metadata.delete_opstamp)\n                .then_with(|| {\n                    left_split\n                        .publish_timestamp\n                        .cmp(&right_split.publish_timestamp)\n                }),\n            SortBy::IndexUid => left_split\n                .split_metadata\n                .index_uid\n                .cmp(&right_split.split_metadata.index_uid)\n                .then_with(|| {\n                    left_split\n                        .split_metadata\n                        .split_id\n                        .cmp(&right_split.split_metadata.split_id)\n                }),\n        }\n    }\n}\n\n#[allow(unused_attributes)]\nimpl ListSplitsQuery {\n    /// Creates a new [`ListSplitsQuery`] for the designated index.\n    pub fn for_index(index_uid: IndexUid) -> Self {\n        Self {\n            index_uids: Some(vec![index_uid]),\n            node_id: None,\n            limit: None,\n            offset: None,\n            split_states: Vec::new(),\n            tags: None,\n            time_range: Default::default(),\n            max_time_range_end: None,\n            delete_opstamp: Default::default(),\n            update_timestamp: Default::default(),\n            create_timestamp: Default::default(),\n            mature: Bound::Unbounded,\n    
        sort_by: SortBy::None,\n            after_split: None,\n        }\n    }\n\n    /// Creates a new [`ListSplitsQuery`] from a non-empty list of index UIDs.\n    /// Returns None if the list is empty.\n    pub fn try_from_index_uids(index_uids: Vec<IndexUid>) -> Option<Self> {\n        if index_uids.is_empty() {\n            return None;\n        }\n        Some(Self {\n            index_uids: Some(index_uids),\n            node_id: None,\n            limit: None,\n            offset: None,\n            split_states: Vec::new(),\n            tags: None,\n            time_range: Default::default(),\n            max_time_range_end: None,\n            delete_opstamp: Default::default(),\n            update_timestamp: Default::default(),\n            create_timestamp: Default::default(),\n            mature: Bound::Unbounded,\n            sort_by: SortBy::None,\n            after_split: None,\n        })\n    }\n\n    /// Creates a new [`ListSplitsQuery`] for all indexes.\n    pub fn for_all_indexes() -> Self {\n        Self {\n            index_uids: None,\n            node_id: None,\n            limit: None,\n            offset: None,\n            split_states: Vec::new(),\n            tags: None,\n            time_range: Default::default(),\n            max_time_range_end: None,\n            delete_opstamp: Default::default(),\n            update_timestamp: Default::default(),\n            create_timestamp: Default::default(),\n            mature: Bound::Unbounded,\n            sort_by: SortBy::None,\n            after_split: None,\n        }\n    }\n\n    /// Selects splits produced by the specified node.\n    pub fn with_node_id(mut self, node_id: NodeId) -> Self {\n        self.node_id = Some(node_id);\n        self\n    }\n\n    /// Sets the maximum number of splits to retrieve.\n    pub fn with_limit(mut self, n: usize) -> Self {\n        self.limit = Some(n);\n        self\n    }\n\n    /// Sets the number of splits to skip.\n    pub fn with_offset(mut 
self, n: usize) -> Self {\n        self.offset = Some(n);\n        self\n    }\n\n    /// Selects splits which have the given split state.\n    pub fn with_split_state(mut self, state: SplitState) -> Self {\n        self.split_states.push(state);\n        self\n    }\n\n    /// Selects splits which have any of the given split states.\n    pub fn with_split_states(mut self, states: impl AsRef<[SplitState]>) -> Self {\n        self.split_states.extend_from_slice(states.as_ref());\n        self\n    }\n\n    /// Selects splits which match the given tag filter.\n    pub fn with_tags_filter(mut self, tags: TagFilterAst) -> Self {\n        self.tags = Some(tags);\n        self\n    }\n\n    /// Sets the field's upper bound to match values that are\n    /// *less than or equal to* the provided value.\n    pub fn with_time_range_end_lte(mut self, v: i64) -> Self {\n        self.time_range.end = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's upper bound to match values that are\n    /// *less than* the provided value.\n    pub fn with_time_range_end_lt(mut self, v: i64) -> Self {\n        self.time_range.end = Bound::Excluded(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than or equal to* the provided value.\n    pub fn with_time_range_start_gte(mut self, v: i64) -> Self {\n        self.time_range.start = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than* the provided value.\n    pub fn with_time_range_start_gt(mut self, v: i64) -> Self {\n        self.time_range.start = Bound::Excluded(v);\n        self\n    }\n\n    /// Retains only splits with a time range end that is\n    /// *less than or equal to* the provided value.\n    pub fn with_max_time_range_end(mut self, v: i64) -> Self {\n        self.max_time_range_end = Some(v);\n        self\n    }\n\n    /// Sets the field's upper bound to match values 
that are\n    /// *less than or equal to* the provided value.\n    pub fn with_delete_opstamp_lte(mut self, v: u64) -> Self {\n        self.delete_opstamp.end = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's upper bound to match values that are\n    /// *less than* the provided value.\n    pub fn with_delete_opstamp_lt(mut self, v: u64) -> Self {\n        self.delete_opstamp.end = Bound::Excluded(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than or equal to* the provided value.\n    pub fn with_delete_opstamp_gte(mut self, v: u64) -> Self {\n        self.delete_opstamp.start = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than* the provided value.\n    pub fn with_delete_opstamp_gt(mut self, v: u64) -> Self {\n        self.delete_opstamp.start = Bound::Excluded(v);\n        self\n    }\n\n    /// Sets the field's upper bound to match values that are\n    /// *less than or equal to* the provided value.\n    pub fn with_update_timestamp_lte(mut self, v: i64) -> Self {\n        self.update_timestamp.end = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's upper bound to match values that are\n    /// *less than* the provided value.\n    pub fn with_update_timestamp_lt(mut self, v: i64) -> Self {\n        self.update_timestamp.end = Bound::Excluded(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than or equal to* the provided value.\n    pub fn with_update_timestamp_gte(mut self, v: i64) -> Self {\n        self.update_timestamp.start = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than* the provided value.\n    pub fn with_update_timestamp_gt(mut self, v: i64) -> Self {\n        self.update_timestamp.start = Bound::Excluded(v);\n        self\n    }\n\n    
/// Sets the field's upper bound to match values that are\n    /// *less than or equal to* the provided value.\n    pub fn with_create_timestamp_lte(mut self, v: i64) -> Self {\n        self.create_timestamp.end = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's upper bound to match values that are\n    /// *less than* the provided value.\n    pub fn with_create_timestamp_lt(mut self, v: i64) -> Self {\n        self.create_timestamp.end = Bound::Excluded(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than or equal to* the provided value.\n    pub fn with_create_timestamp_gte(mut self, v: i64) -> Self {\n        self.create_timestamp.start = Bound::Included(v);\n        self\n    }\n\n    /// Sets the field's lower bound to match values that are\n    /// *greater than* the provided value.\n    pub fn with_create_timestamp_gt(mut self, v: i64) -> Self {\n        self.create_timestamp.start = Bound::Excluded(v);\n        self\n    }\n\n    /// Retains splits that are mature at the given datetime.\n    pub fn retain_mature(mut self, now: OffsetDateTime) -> Self {\n        self.mature = Bound::Included(now);\n        self\n    }\n\n    /// Retains splits that are immature at the given datetime.\n    pub fn retain_immature(mut self, now: OffsetDateTime) -> Self {\n        self.mature = Bound::Excluded(now);\n        self\n    }\n\n    /// Sorts the splits by staleness, i.e. 
by delete opstamp and publish timestamp in ascending\n    /// order.\n    pub fn sort_by_staleness(mut self) -> Self {\n        self.sort_by = SortBy::Staleness;\n        self\n    }\n\n    /// Sorts the splits by index_uid and split_id.\n    pub fn sort_by_index_uid(mut self) -> Self {\n        self.sort_by = SortBy::IndexUid;\n        self\n    }\n\n    /// Only return splits whose (index_uid, split_id) are lexicographically after this split.\n    /// This is only useful if results are sorted by index_uid and split_id.\n    pub fn after_split(mut self, split_meta: &SplitMetadata) -> Self {\n        self.after_split = Some((split_meta.index_uid.clone(), split_meta.split_id.clone()));\n        self\n    }\n}\n\n#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]\n/// A range containing the upper and lower bounds to filter documents by.\npub struct FilterRange<T> {\n    /// The lower bound of the filter.\n    pub start: Bound<T>,\n    /// The upper bound of the filter.\n    pub end: Bound<T>,\n}\n\nimpl<T: PartialEq + PartialOrd> FilterRange<T> {\n    /// Checks if both the upper and lower bound are `Bound::Unbounded`.\n    pub fn is_unbounded(&self) -> bool {\n        self.start == Bound::Unbounded && self.end == Bound::Unbounded\n    }\n\n    /// Checks if the provided value lies within the upper and lower bounds\n    /// of the range.\n    pub fn contains(&self, value: &T) -> bool {\n        if self.is_unbounded() {\n            return true;\n        }\n\n        let lower_check = match &self.start {\n            Bound::Unbounded => true,\n            Bound::Included(left) => left <= value,\n            Bound::Excluded(left) => left < value,\n        };\n\n        let upper_check = match &self.end {\n            Bound::Unbounded => true,\n            Bound::Included(left) => left >= value,\n            Bound::Excluded(left) => left > value,\n        };\n\n        lower_check && upper_check\n    }\n\n    /// Checks if the provided range overlaps with the 
range.\n    pub fn overlaps_with(&self, range: RangeInclusive<T>) -> bool {\n        if self.is_unbounded() {\n            return true;\n        }\n\n        let lower_check = match &self.start {\n            Bound::Unbounded => true,\n            Bound::Included(left) => left <= range.end(),\n            Bound::Excluded(left) => left < range.end(),\n        };\n\n        let upper_check = match &self.end {\n            Bound::Unbounded => true,\n            Bound::Included(left) => left >= range.start(),\n            Bound::Excluded(left) => left > range.start(),\n        };\n\n        lower_check && upper_check\n    }\n}\n\n// The `Default` derive implementation imposes a restriction\n// for `T` to also implement Default when this is not required.\nimpl<T> Default for FilterRange<T> {\n    fn default() -> Self {\n        Self {\n            start: Bound::Unbounded,\n            end: Bound::Unbounded,\n        }\n    }\n}\n\n/// Maps the given source params to whether checkpoints should be stored in the index metadata\n/// (false) or the shard table (true)\nfn use_shard_api(params: &SourceParams) -> bool {\n    match params {\n        SourceParams::File(FileSourceParams::Filepath(_)) => false,\n        SourceParams::File(FileSourceParams::Notifications(_)) => true,\n        SourceParams::Ingest => true,\n        SourceParams::IngestApi => false,\n        SourceParams::IngestCli => false,\n        SourceParams::Kafka(_) => false,\n        SourceParams::Kinesis(_) => false,\n        SourceParams::PubSub(_) => false,\n        SourceParams::Pulsar(_) => false,\n        SourceParams::Stdin => panic!(\"stdin cannot be checkpointed\"),\n        SourceParams::Vec(_) => false,\n        SourceParams::Void(_) => false,\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_filter_contains() {\n        let filter = FilterRange {\n            start: Bound::Unbounded,\n            end: Bound::Excluded(50),\n        };\n        
assert!(!filter.contains(&50));\n        assert!(filter.contains(&0));\n        assert!(filter.contains(&49));\n\n        let filter = FilterRange {\n            start: Bound::Included(50),\n            end: Bound::Unbounded,\n        };\n        assert!(filter.contains(&50));\n        assert!(filter.contains(&51));\n        assert!(!filter.contains(&0));\n\n        let filter = FilterRange {\n            start: Bound::Included(50),\n            end: Bound::Excluded(75),\n        };\n        assert!(filter.contains(&50));\n        assert!(filter.contains(&51));\n        assert!(!filter.contains(&0));\n        assert!(!filter.contains(&75));\n        assert!(filter.contains(&74));\n    }\n\n    #[test]\n    fn test_overlaps_with() {\n        let filter = FilterRange {\n            start: Bound::Unbounded,\n            end: Bound::Excluded(50),\n        };\n        assert!(filter.overlaps_with(0..=50));\n        assert!(filter.overlaps_with(0..=51));\n        assert!(filter.overlaps_with(32..=63));\n        assert!(filter.overlaps_with(32..=32));\n        assert!(!filter.overlaps_with(51..=76));\n        assert!(!filter.overlaps_with(50..=76));\n\n        let filter = FilterRange {\n            start: Bound::Unbounded,\n            end: Bound::Included(50),\n        };\n        assert!(filter.overlaps_with(0..=50));\n        assert!(filter.overlaps_with(0..=51));\n        assert!(filter.overlaps_with(50..=76));\n        assert!(!filter.overlaps_with(51..=76));\n\n        let filter = FilterRange {\n            start: Bound::Excluded(50),\n            end: Bound::Unbounded,\n        };\n        assert!(filter.overlaps_with(51..=75));\n        assert!(filter.overlaps_with(0..=51));\n        assert!(filter.overlaps_with(51..=76));\n        assert!(filter.overlaps_with(50..=76));\n        assert!(!filter.overlaps_with(0..=49));\n        assert!(!filter.overlaps_with(0..=50));\n\n        let filter = FilterRange {\n            start: Bound::Included(50),\n            end: 
Bound::Unbounded,\n        };\n        assert!(filter.overlaps_with(51..=75));\n        assert!(filter.overlaps_with(0..=51));\n        assert!(filter.overlaps_with(51..=76));\n        assert!(filter.overlaps_with(50..=76));\n        assert!(filter.overlaps_with(0..=50));\n        assert!(!filter.overlaps_with(0..=49));\n\n        let filter = FilterRange {\n            start: Bound::Included(50),\n            end: Bound::Excluded(75),\n        };\n        assert!(filter.overlaps_with(51..=75));\n        assert!(filter.overlaps_with(0..=51));\n        assert!(filter.overlaps_with(45..=76));\n        assert!(filter.overlaps_with(50..=76));\n        assert!(filter.overlaps_with(0..=50));\n        assert!(filter.overlaps_with(74..=124));\n        assert!(!filter.overlaps_with(0..=49));\n        assert!(!filter.overlaps_with(75..=124));\n    }\n\n    #[tokio::test]\n    async fn test_list_splits_response_empty() {\n        let response = ListSplitsResponse::empty();\n        let splits = response.deserialize_splits().await.unwrap();\n        assert!(splits.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_list_indexes_metadata_response_serde() {\n        let response = ListIndexesMetadataResponse::try_from_indexes_metadata(Vec::new())\n            .await\n            .unwrap();\n        let indexes_metadata = response.deserialize_indexes_metadata().await.unwrap();\n        assert!(indexes_metadata.is_empty());\n\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let response = ListIndexesMetadataResponse::for_test(vec![index_metadata.clone()]);\n        let indexes_metadata = response.deserialize_indexes_metadata().await.unwrap();\n        assert_eq!(indexes_metadata.len(), 1);\n        assert_eq!(indexes_metadata[0], index_metadata);\n    }\n\n    #[tokio::test]\n    async fn test_list_indexes_metadata_backward_compatible_serde() {\n        let indexes_metadata_json = 
serde_json::to_string(&Vec::<IndexMetadata>::new()).unwrap();\n        let response = ListIndexesMetadataResponse {\n            indexes_metadata_json_opt: Some(indexes_metadata_json),\n            indexes_metadata_json_zstd: Bytes::from_static(b\"\"),\n        };\n        let indexes_metadata = response.deserialize_indexes_metadata().await.unwrap();\n        assert!(indexes_metadata.is_empty());\n\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let indexes_metadata_json = serde_json::to_string(&vec![index_metadata.clone()]).unwrap();\n        let response = ListIndexesMetadataResponse {\n            indexes_metadata_json_opt: Some(indexes_metadata_json),\n            indexes_metadata_json_zstd: Bytes::from_static(b\"\"),\n        };\n        let indexes_metadata = response.deserialize_indexes_metadata().await.unwrap();\n        assert_eq!(indexes_metadata.len(), 1);\n        assert_eq!(indexes_metadata[0], index_metadata);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_proto::metastore::{EntityKind, MetastoreError};\nuse sqlx::postgres::PgDatabaseError;\nuse tracing::error;\n\n// https://www.postgresql.org/docs/current/errcodes-appendix.html\nmod pg_error_codes {\n    pub const FOREIGN_KEY_VIOLATION: &str = \"23503\";\n    pub const UNIQUE_VIOLATION: &str = \"23505\";\n}\n\npub(super) fn convert_sqlx_err(index_id: &str, sqlx_error: sqlx::Error) -> MetastoreError {\n    match &sqlx_error {\n        sqlx::Error::Database(boxed_db_error) => {\n            let pg_db_error = boxed_db_error.downcast_ref::<PgDatabaseError>();\n            let pg_error_code = pg_db_error.code();\n            let pg_error_table = pg_db_error.table();\n\n            match (pg_error_code, pg_error_table) {\n                (pg_error_codes::FOREIGN_KEY_VIOLATION, _) => {\n                    MetastoreError::NotFound(EntityKind::Index {\n                        index_id: index_id.to_string(),\n                    })\n                }\n                (pg_error_codes::UNIQUE_VIOLATION, Some(table)) if table.starts_with(\"indexes\") => {\n                    MetastoreError::AlreadyExists(EntityKind::Index {\n                        index_id: index_id.to_string(),\n                    })\n                }\n                (pg_error_codes::UNIQUE_VIOLATION, _) => {\n                    error!(error=?boxed_db_error, 
\"postgresql-error\");\n                    MetastoreError::Internal {\n                        message: \"unique key violation\".to_string(),\n                        cause: format!(\"DB error {boxed_db_error:?}\"),\n                    }\n                }\n                _ => {\n                    error!(error=?boxed_db_error, \"postgresql-error\");\n                    MetastoreError::Db {\n                        message: boxed_db_error.to_string(),\n                    }\n                }\n            }\n        }\n        _ => {\n            error!(error=?sqlx_error, \"an error has occurred in the database operation\");\n            MetastoreError::Db {\n                message: sqlx_error.to_string(),\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/factory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{MetastoreBackend, MetastoreConfig};\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse tokio::sync::Mutex;\nuse tracing::debug;\n\nuse crate::{MetastoreFactory, MetastoreResolverError, PostgresqlMetastore};\n\n#[derive(Clone, Default)]\npub struct PostgresqlMetastoreFactory {\n    // Under normal conditions of use, this cache will contain a single `Metastore`.\n    //\n    // In contrast to the file-backed metastore, we use a strong pointer here, so that the\n    // `Metastore` doesn't get dropped. 
This is done in order to keep the underlying connection\n    // pool to Postgres alive.\n    cache: Arc<Mutex<HashMap<Uri, MetastoreServiceClient>>>,\n}\n\nimpl PostgresqlMetastoreFactory {\n    async fn get_from_cache(&self, uri: &Uri) -> Option<MetastoreServiceClient> {\n        let cache_lock = self.cache.lock().await;\n        cache_lock.get(uri).cloned()\n    }\n\n    /// If there is a valid entry in the cache to begin with, we trash the new\n    /// one and return the old one.\n    ///\n    /// This way we make sure that we keep only one instance associated\n    /// to the key `uri` outside of this struct.\n    async fn cache_metastore(\n        &self,\n        uri: Uri,\n        metastore: MetastoreServiceClient,\n    ) -> MetastoreServiceClient {\n        let mut cache_lock = self.cache.lock().await;\n        if let Some(metastore) = cache_lock.get(&uri) {\n            return metastore.clone();\n        }\n        cache_lock.insert(uri, metastore.clone());\n        metastore\n    }\n}\n\n#[async_trait]\nimpl MetastoreFactory for PostgresqlMetastoreFactory {\n    fn backend(&self) -> MetastoreBackend {\n        MetastoreBackend::PostgreSQL\n    }\n\n    async fn resolve(\n        &self,\n        metastore_config: &MetastoreConfig,\n        uri: &Uri,\n    ) -> Result<MetastoreServiceClient, MetastoreResolverError> {\n        if let Some(metastore) = self.get_from_cache(uri).await {\n            debug!(\"using metastore from cache\");\n            return Ok(metastore);\n        }\n        debug!(\"metastore not found in cache\");\n        let postgresql_metastore_config = metastore_config.as_postgres().ok_or_else(|| {\n            let message = format!(\n                \"expected PostgreSQL metastore config, got `{:?}`\",\n                metastore_config.backend()\n            );\n            MetastoreResolverError::InvalidConfig(message)\n        })?;\n        let postgresql_metastore = PostgresqlMetastore::new(postgresql_metastore_config, uri)\n           
 .await\n            .map(MetastoreServiceClient::new)\n            .map_err(MetastoreResolverError::Initialization)?;\n        let unique_metastore_for_uri = self\n            .cache_metastore(uri.clone(), postgresql_metastore)\n            .await;\n        Ok(unique_metastore_for_uri)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/metastore.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt::{self, Write};\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse futures::StreamExt;\nuse itertools::Itertools;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{ServiceStream, get_bool_from_env, rate_limited_error};\nuse quickwit_config::{\n    IndexTemplate, IndexTemplateId, PostgresMetastoreConfig, validate_index_id_pattern,\n};\nuse quickwit_proto::ingest::{Shard, ShardState};\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AcquireShardsResponse, AddSourceRequest, CreateIndexRequest,\n    CreateIndexResponse, CreateIndexTemplateRequest, DeleteIndexRequest,\n    DeleteIndexTemplatesRequest, DeleteQuery, DeleteShardsRequest, DeleteShardsResponse,\n    DeleteSourceRequest, DeleteSplitsRequest, DeleteTask, EmptyResponse, EntityKind,\n    FindIndexTemplateMatchesRequest, FindIndexTemplateMatchesResponse, GetClusterIdentityRequest,\n    GetClusterIdentityResponse, GetIndexTemplateRequest, GetIndexTemplateResponse,\n    IndexMetadataFailure, IndexMetadataFailureReason, IndexMetadataRequest, IndexMetadataResponse,\n    IndexStats, IndexTemplateMatch, IndexesMetadataRequest, IndexesMetadataResponse,\n    LastDeleteOpstampRequest, LastDeleteOpstampResponse, ListDeleteTasksRequest,\n    
ListDeleteTasksResponse, ListIndexStatsRequest, ListIndexStatsResponse,\n    ListIndexTemplatesRequest, ListIndexTemplatesResponse, ListIndexesMetadataRequest,\n    ListIndexesMetadataResponse, ListShardsRequest, ListShardsResponse, ListShardsSubresponse,\n    ListSplitsRequest, ListSplitsResponse, ListStaleSplitsRequest, MarkSplitsForDeletionRequest,\n    MetastoreError, MetastoreResult, MetastoreService, MetastoreServiceStream, OpenShardSubrequest,\n    OpenShardSubresponse, OpenShardsRequest, OpenShardsResponse, PruneShardsRequest,\n    PublishSplitsRequest, ResetSourceCheckpointRequest, SplitStats, StageSplitsRequest,\n    ToggleSourceRequest, UpdateIndexRequest, UpdateSourceRequest, UpdateSplitsDeleteOpstampRequest,\n    UpdateSplitsDeleteOpstampResponse, serde_utils,\n};\nuse quickwit_proto::types::{IndexId, IndexUid, Position, PublishToken, ShardId, SourceId};\nuse sea_query::{Alias, Asterisk, Expr, Func, PostgresQueryBuilder, Query, UnionType};\nuse sea_query_binder::SqlxBinder;\nuse sqlx::{Acquire, Executor, Postgres, Transaction};\nuse time::OffsetDateTime;\nuse tracing::{debug, info, instrument, warn};\nuse uuid::Uuid;\n\nuse super::error::convert_sqlx_err;\nuse super::migrator::run_migrations;\nuse super::model::{PgDeleteTask, PgIndex, PgIndexTemplate, PgShard, PgSplit, Splits};\nuse super::pool::TrackedPool;\nuse super::split_stream::SplitStream;\nuse super::utils::{append_query_filters_and_order_by, establish_connection};\nuse super::{\n    QW_POSTGRES_READ_ONLY_ENV_KEY, QW_POSTGRES_SKIP_MIGRATION_LOCKING_ENV_KEY,\n    QW_POSTGRES_SKIP_MIGRATIONS_ENV_KEY,\n};\nuse crate::checkpoint::{\n    IndexCheckpointDelta, PartitionId, SourceCheckpoint, SourceCheckpointDelta,\n};\nuse crate::file_backed::MutationOccurred;\nuse crate::metastore::postgres::model::Shards;\nuse crate::metastore::postgres::utils::split_maturity_timestamp;\nuse crate::metastore::{\n    IndexesMetadataResponseExt, PublishSplitsRequestExt, STREAM_SPLITS_CHUNK_SIZE,\n    
UpdateSourceRequestExt, use_shard_api,\n};\nuse crate::{\n    AddSourceRequestExt, CreateIndexRequestExt, IndexMetadata, IndexMetadataResponseExt,\n    ListIndexesMetadataResponseExt, ListSplitsRequestExt, ListSplitsResponseExt,\n    MetastoreServiceExt, Split, SplitState, StageSplitsRequestExt, UpdateIndexRequestExt,\n};\n\n/// PostgreSQL metastore implementation.\n#[derive(Clone)]\npub struct PostgresqlMetastore {\n    uri: Uri,\n    connection_pool: TrackedPool<Postgres>,\n}\n\nimpl fmt::Debug for PostgresqlMetastore {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"PostgresqlMetastore\")\n            .field(\"uri\", &self.uri)\n            .finish()\n    }\n}\n\nimpl PostgresqlMetastore {\n    /// Creates a metastore given a database URI.\n    pub async fn new(\n        postgres_metastore_config: &PostgresMetastoreConfig,\n        connection_uri: &Uri,\n    ) -> MetastoreResult<Self> {\n        let min_connections = postgres_metastore_config.min_connections;\n        let max_connections = postgres_metastore_config.max_connections.get();\n        let acquire_timeout = postgres_metastore_config\n            .acquire_connection_timeout()\n            .expect(\"PostgreSQL metastore config should have been validated\");\n        let idle_timeout_opt = postgres_metastore_config\n            .idle_connection_timeout_opt()\n            .expect(\"PostgreSQL metastore config should have been validated\");\n        let max_lifetime_opt = postgres_metastore_config\n            .max_connection_lifetime_opt()\n            .expect(\"PostgreSQL metastore config should have been validated\");\n\n        let read_only = get_bool_from_env(QW_POSTGRES_READ_ONLY_ENV_KEY, false);\n        let skip_migrations = get_bool_from_env(QW_POSTGRES_SKIP_MIGRATIONS_ENV_KEY, false);\n        let skip_locking = get_bool_from_env(QW_POSTGRES_SKIP_MIGRATION_LOCKING_ENV_KEY, false);\n\n        let connection_pool = establish_connection(\n            
connection_uri,\n            min_connections,\n            max_connections,\n            acquire_timeout,\n            idle_timeout_opt,\n            max_lifetime_opt,\n            read_only,\n        )\n        .await?;\n\n        run_migrations(&connection_pool, skip_migrations, skip_locking).await?;\n\n        let metastore = PostgresqlMetastore {\n            uri: connection_uri.clone(),\n            connection_pool,\n        };\n        Ok(metastore)\n    }\n}\n\n/// Returns an Index object given an index_id or None if it does not exist.\nasync fn index_opt<'a, E>(\n    executor: E,\n    index_id: &str,\n    lock: bool,\n) -> MetastoreResult<Option<PgIndex>>\nwhere\n    E: sqlx::Executor<'a, Database = Postgres>,\n{\n    let index_opt: Option<PgIndex> = sqlx::query_as::<_, PgIndex>(&format!(\n        r#\"\n        SELECT *\n        FROM indexes\n        WHERE index_id = $1\n        {}\n        \"#,\n        if lock { \"FOR UPDATE\" } else { \"\" }\n    ))\n    .bind(index_id)\n    .fetch_optional(executor)\n    .await?;\n    Ok(index_opt)\n}\n\n/// Returns an Index object given an index_uid or None if it does not exist.\nasync fn index_opt_for_uid<'a, E>(\n    executor: E,\n    index_uid: IndexUid,\n    lock: bool,\n) -> MetastoreResult<Option<PgIndex>>\nwhere\n    E: sqlx::Executor<'a, Database = Postgres>,\n{\n    let index_opt: Option<PgIndex> = sqlx::query_as::<_, PgIndex>(&format!(\n        r#\"\n        SELECT *\n        FROM indexes\n        WHERE index_uid = $1\n        {}\n        \"#,\n        if lock { \"FOR UPDATE\" } else { \"\" }\n    ))\n    .bind(&index_uid)\n    .fetch_optional(executor)\n    .await?;\n    Ok(index_opt)\n}\n\nasync fn index_metadata(\n    tx: &mut Transaction<'_, Postgres>,\n    index_id: &str,\n    lock: bool,\n) -> MetastoreResult<IndexMetadata> {\n    index_opt(tx.as_mut(), index_id, lock)\n        .await?\n        .ok_or_else(|| {\n            MetastoreError::NotFound(EntityKind::Index {\n                index_id: 
index_id.to_string(),\n            })\n        })?\n        .index_metadata()\n}\n\nasync fn try_apply_delta_v2(\n    tx: &mut Transaction<'_, Postgres>,\n    index_uid: &IndexUid,\n    source_id: &SourceId,\n    checkpoint_delta: SourceCheckpointDelta,\n    publish_token: PublishToken,\n) -> MetastoreResult<()> {\n    let num_partitions = checkpoint_delta.num_partitions();\n    let shard_ids: Vec<String> = checkpoint_delta\n        .partitions()\n        .map(|partition_id| partition_id.to_string())\n        .collect();\n\n    let shards: Vec<(String, String, Option<PublishToken>)> = sqlx::query_as(\n        r#\"\n        SELECT\n            shard_id, publish_position_inclusive, publish_token\n        FROM\n            shards\n        WHERE\n            index_uid = $1\n            AND source_id = $2\n            AND shard_id = ANY($3)\n        FOR UPDATE\n        \"#,\n    )\n    .bind(index_uid)\n    .bind(source_id)\n    .bind(shard_ids)\n    .fetch_all(tx.as_mut())\n    .await?;\n\n    if shards.len() != num_partitions {\n        let queue_id = format!(\"{index_uid}/{source_id}\"); // FIXME\n        let entity_kind = EntityKind::Shard { queue_id };\n        return Err(MetastoreError::NotFound(entity_kind));\n    }\n    let mut current_checkpoint = SourceCheckpoint::default();\n\n    for (shard_id, current_position, current_publish_token_opt) in shards {\n        if current_publish_token_opt.is_none()\n            || current_publish_token_opt.unwrap() != publish_token\n        {\n            let message = \"failed to apply checkpoint delta: invalid publish token\".to_string();\n            return Err(MetastoreError::InvalidArgument { message });\n        }\n        let partition_id = PartitionId::from(shard_id);\n        let current_position = Position::from(current_position);\n        current_checkpoint.add_partition(partition_id, current_position);\n    }\n    current_checkpoint\n        .try_apply_delta(checkpoint_delta)\n        .map_err(|error| 
MetastoreError::InvalidArgument {\n            message: error.to_string(),\n        })?;\n\n    let mut shard_ids = Vec::with_capacity(num_partitions);\n    let mut new_positions = Vec::with_capacity(num_partitions);\n\n    for (partition_id, new_position) in current_checkpoint.iter() {\n        shard_ids.push(partition_id.to_string());\n        new_positions.push(new_position.to_string());\n    }\n\n    sqlx::query(\n        r#\"\n            UPDATE\n                shards\n            SET\n                publish_position_inclusive = new_positions.position,\n                shard_state = CASE WHEN new_positions.position LIKE '~%' THEN 'closed' ELSE shards.shard_state END,\n                update_timestamp = $5\n            FROM\n                UNNEST($3, $4)\n                AS new_positions(shard_id, position)\n            WHERE\n                index_uid = $1\n                AND source_id = $2\n                AND shards.shard_id = new_positions.shard_id\n            \"#,\n    )\n    .bind(index_uid)\n    .bind(source_id)\n    .bind(shard_ids)\n    .bind(new_positions)\n    // Use a timestamp generated by the metastore node to avoid clock drift issues\n    .bind(OffsetDateTime::now_utc())\n    .execute(tx.as_mut())\n    .await?;\n    Ok(())\n}\n\n/// This macro systematically wraps a metastore operation in a\n/// transaction, committing it on `Result::Ok` and rolling it back on `Result::Err`.\n///\n/// Note that this is suboptimal:\n/// some of the methods do not actually require a transaction.\n///\n/// We still use this macro for them in order to make the code\n/// \"trivially correct\".\nmacro_rules! 
run_with_tx {\n    ($connection_pool:expr, $tx_refmut:ident, $label:literal, $x:block) => {{\n        let mut tx: Transaction<'_, Postgres> = $connection_pool.begin().await?;\n        let $tx_refmut = &mut tx;\n        let op_fut = move || async move { $x };\n        let op_result: MetastoreResult<_> = op_fut().await;\n        match &op_result {\n            Ok(_) => {\n                debug!(\"committing transaction\");\n                tx.commit().await?;\n            }\n            Err(error) => {\n                rate_limited_error!(limit_per_min = 60, error=%error, \"failed to {}, rolling transaction back\" , $label);\n                tx.rollback().await?;\n            }\n        }\n        op_result\n    }};\n}\n\nasync fn mutate_index_metadata<E, M>(\n    tx: &mut Transaction<'_, Postgres>,\n    index_uid: IndexUid,\n    mutate_fn: M,\n) -> MetastoreResult<IndexMetadata>\nwhere\n    MetastoreError: From<E>,\n    M: FnOnce(&mut IndexMetadata) -> Result<MutationOccurred<()>, E>,\n{\n    let index_id = &index_uid.index_id;\n    let mut index_metadata = index_metadata(tx, index_id, true).await?;\n\n    if index_metadata.index_uid != index_uid {\n        return Err(MetastoreError::NotFound(EntityKind::Index {\n            index_id: index_id.to_string(),\n        }));\n    }\n    if let MutationOccurred::No(()) = mutate_fn(&mut index_metadata)? 
{\n        return Ok(index_metadata);\n    }\n    let index_metadata_json = serde_utils::to_json_str(&index_metadata)?;\n\n    let update_index_res = sqlx::query(\n        r#\"\n        UPDATE indexes\n        SET index_metadata_json = $1\n        WHERE index_uid = $2\n        \"#,\n    )\n    .bind(index_metadata_json)\n    .bind(&index_uid)\n    .execute(tx.as_mut())\n    .await?;\n    if update_index_res.rows_affected() == 0 {\n        return Err(MetastoreError::NotFound(EntityKind::Index {\n            index_id: index_id.to_string(),\n        }));\n    }\n    Ok(index_metadata)\n}\n\n#[async_trait]\nimpl MetastoreService for PostgresqlMetastore {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.connection_pool.acquire().await?;\n        Ok(())\n    }\n\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        vec![self.uri.clone()]\n    }\n\n    // Index API:\n    // - `create_index`\n    // - `update_index`\n    // - `index_metadata`\n    // - `indexes_metadata`\n    // - `list_indexes_metadata`\n\n    #[instrument(skip(self))]\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> MetastoreResult<CreateIndexResponse> {\n        let index_config = request.deserialize_index_config()?;\n        let mut index_metadata = IndexMetadata::new(index_config);\n\n        let source_configs = request.deserialize_source_configs()?;\n\n        for source_config in source_configs {\n            index_metadata.add_source(source_config)?;\n        }\n        let index_metadata_json = serde_utils::to_json_str(&index_metadata)?;\n\n        sqlx::query(\n            \"INSERT INTO indexes (index_uid, index_id, index_metadata_json) VALUES ($1, $2, $3)\",\n        )\n        .bind(index_metadata.index_uid.to_string())\n        .bind(&index_metadata.index_uid.index_id)\n        .bind(&index_metadata_json)\n        .execute(&self.connection_pool)\n        .await\n        .map_err(|sqlx_error| 
convert_sqlx_err(index_metadata.index_id(), sqlx_error))?;\n\n        let response = CreateIndexResponse {\n            index_uid: index_metadata.index_uid.into(),\n            index_metadata_json,\n        };\n        Ok(response)\n    }\n\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> MetastoreResult<IndexMetadataResponse> {\n        let doc_mapping = request.deserialize_doc_mapping()?;\n        let indexing_settings = request.deserialize_indexing_settings()?;\n        let ingest_settings = request.deserialize_ingest_settings()?;\n        let search_settings = request.deserialize_search_settings()?;\n        let retention_policy_opt = request.deserialize_retention_policy()?;\n\n        let index_uid: IndexUid = request.index_uid().clone();\n        let updated_index_metadata = run_with_tx!(self.connection_pool, tx, \"update index\", {\n            mutate_index_metadata::<MetastoreError, _>(tx, index_uid, |index_metadata| {\n                let mutation_occurred = index_metadata.update_index_config(\n                    doc_mapping,\n                    indexing_settings,\n                    ingest_settings,\n                    search_settings,\n                    retention_policy_opt,\n                )?;\n                Ok(MutationOccurred::from(mutation_occurred))\n            })\n            .await\n        })?;\n        IndexMetadataResponse::try_from_index_metadata(&updated_index_metadata)\n    }\n\n    #[instrument(skip(self))]\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> MetastoreResult<IndexMetadataResponse> {\n        let pg_index_opt = if let Some(index_uid) = &request.index_uid {\n            index_opt_for_uid(&self.connection_pool, index_uid.clone(), false).await?\n        } else if let Some(index_id) = &request.index_id {\n            index_opt(&self.connection_pool, index_id, false).await?\n        } else {\n            let message = \"invalid 
request: neither `index_id` nor `index_uid` is set\".to_string();\n            return Err(MetastoreError::Internal {\n                message,\n                cause: \"\".to_string(),\n            });\n        };\n        let index_metadata = pg_index_opt\n            .ok_or(MetastoreError::NotFound(EntityKind::Index {\n                index_id: request\n                    .into_index_id()\n                    .expect(\"`index_id` or `index_uid` should be set\"),\n            }))?\n            .index_metadata()?;\n        let response = IndexMetadataResponse::try_from_index_metadata(&index_metadata)?;\n        Ok(response)\n    }\n\n    #[instrument(skip(self))]\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> MetastoreResult<IndexesMetadataResponse> {\n        const INDEXES_METADATA_QUERY: &str = include_str!(\"queries/indexes_metadata.sql\");\n\n        let num_subrequests = request.subrequests.len();\n\n        if num_subrequests == 0 {\n            return Ok(Default::default());\n        }\n        let mut index_ids: Vec<IndexId> = Vec::new();\n        let mut index_uids: Vec<IndexUid> = Vec::with_capacity(num_subrequests);\n        let mut failures: Vec<IndexMetadataFailure> = Vec::new();\n\n        for subrequest in request.subrequests {\n            if let Some(index_id) = subrequest.index_id {\n                index_ids.push(index_id);\n            } else if let Some(index_uid) = subrequest.index_uid {\n                index_uids.push(index_uid);\n            } else {\n                let failure = IndexMetadataFailure {\n                    index_id: subrequest.index_id,\n                    index_uid: subrequest.index_uid,\n                    reason: IndexMetadataFailureReason::Internal as i32,\n                };\n                failures.push(failure);\n            }\n        }\n        let pg_indexes: Vec<PgIndex> = sqlx::query_as::<_, PgIndex>(INDEXES_METADATA_QUERY)\n            
.bind(&index_ids)\n            .bind(&index_uids)\n            .fetch_all(&self.connection_pool)\n            .await?;\n\n        let indexes_metadata: Vec<IndexMetadata> = pg_indexes\n            .iter()\n            .map(|pg_index| pg_index.index_metadata())\n            .collect::<MetastoreResult<_>>()?;\n\n        if pg_indexes.len() + failures.len() < num_subrequests {\n            for index_id in index_ids {\n                if pg_indexes\n                    .iter()\n                    .all(|pg_index| pg_index.index_id != index_id)\n                {\n                    let failure = IndexMetadataFailure {\n                        index_id: Some(index_id),\n                        index_uid: None,\n                        reason: IndexMetadataFailureReason::NotFound as i32,\n                    };\n                    failures.push(failure);\n                }\n            }\n            for index_uid in index_uids {\n                if pg_indexes\n                    .iter()\n                    .all(|pg_index| pg_index.index_uid != index_uid)\n                {\n                    let failure = IndexMetadataFailure {\n                        index_id: None,\n                        index_uid: Some(index_uid),\n                        reason: IndexMetadataFailureReason::NotFound as i32,\n                    };\n                    failures.push(failure);\n                }\n            }\n        }\n        let response =\n            IndexesMetadataResponse::try_from_indexes_metadata(indexes_metadata, failures).await?;\n        Ok(response)\n    }\n\n    #[instrument(skip(self))]\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> MetastoreResult<ListIndexesMetadataResponse> {\n        let sql =\n            build_index_id_patterns_sql_query(&request.index_id_patterns).map_err(|error| {\n                MetastoreError::Internal {\n                    message: \"failed to build 
`list_indexes_metadata` SQL query\".to_string(),\n                    cause: error.to_string(),\n                }\n            })?;\n        let pg_indexes = sqlx::query_as::<_, PgIndex>(&sql)\n            .fetch_all(&self.connection_pool)\n            .await?;\n        let indexes_metadata: Vec<IndexMetadata> = pg_indexes\n            .into_iter()\n            .map(|pg_index| pg_index.index_metadata())\n            .collect::<MetastoreResult<_>>()?;\n        let response =\n            ListIndexesMetadataResponse::try_from_indexes_metadata(indexes_metadata).await?;\n        Ok(response)\n    }\n\n    #[instrument(skip_all, fields(index_id=%request.index_uid()))]\n    async fn delete_index(&self, request: DeleteIndexRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let delete_result = sqlx::query(\"DELETE FROM indexes WHERE index_uid = $1\")\n            .bind(&index_uid)\n            .execute(&self.connection_pool)\n            .await?;\n        // FIXME: This is not idempotent.\n        if delete_result.rows_affected() == 0 {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.index_id,\n            }));\n        }\n        info!(index_id = index_uid.index_id, \"deleted index successfully\");\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip_all, fields(split_ids))]\n    async fn stage_splits(&self, request: StageSplitsRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let splits_metadata = request.deserialize_splits_metadata()?;\n\n        if splits_metadata.is_empty() {\n            return Ok(Default::default());\n        }\n        let mut split_ids = Vec::with_capacity(splits_metadata.len());\n        let mut time_range_start_list = Vec::with_capacity(splits_metadata.len());\n        let mut time_range_end_list = Vec::with_capacity(splits_metadata.len());\n     
   let mut tags_list = Vec::with_capacity(splits_metadata.len());\n        let mut splits_metadata_json = Vec::with_capacity(splits_metadata.len());\n        let mut delete_opstamps = Vec::with_capacity(splits_metadata.len());\n        let mut maturity_timestamps = Vec::with_capacity(splits_metadata.len());\n        let mut node_ids = Vec::with_capacity(splits_metadata.len());\n\n        for split_metadata in splits_metadata {\n            let split_metadata_json = serde_utils::to_json_str(&split_metadata)?;\n            splits_metadata_json.push(split_metadata_json);\n\n            let time_range_start = split_metadata\n                .time_range\n                .as_ref()\n                .map(|range| *range.start());\n            time_range_start_list.push(time_range_start);\n            maturity_timestamps.push(split_maturity_timestamp(&split_metadata));\n\n            let time_range_end = split_metadata.time_range.map(|range| *range.end());\n            time_range_end_list.push(time_range_end);\n\n            let tags: Vec<String> = split_metadata.tags.into_iter().collect();\n            tags_list.push(sqlx::types::Json(tags));\n            split_ids.push(split_metadata.split_id);\n            delete_opstamps.push(split_metadata.delete_opstamp as i64);\n            node_ids.push(split_metadata.node_id);\n        }\n        tracing::Span::current().record(\"split_ids\", format!(\"{split_ids:?}\"));\n\n        // TODO: Remove transaction.\n        run_with_tx!(self.connection_pool, tx, \"stage splits\", {\n            let upserted_split_ids: Vec<String> = sqlx::query_scalar(r#\"\n                INSERT INTO splits\n                    (split_id, time_range_start, time_range_end, tags, split_metadata_json, delete_opstamp, maturity_timestamp, split_state, index_uid, node_id)\n                SELECT\n                    split_id,\n                    time_range_start,\n                    time_range_end,\n                    ARRAY(SELECT 
json_array_elements_text(tags_json::json)) as tags,\n                    split_metadata_json,\n                    delete_opstamp,\n                    to_timestamp(maturity_timestamp),\n                    $9 as split_state,\n                    $10 as index_uid,\n                    node_id\n                FROM\n                    UNNEST($1, $2, $3, $4, $5, $6, $7, $8)\n                    AS staged_splits (split_id, time_range_start, time_range_end, tags_json, split_metadata_json, delete_opstamp, maturity_timestamp, node_id)\n                ON CONFLICT(index_uid, split_id) DO UPDATE\n                    SET\n                        time_range_start = excluded.time_range_start,\n                        time_range_end = excluded.time_range_end,\n                        tags = excluded.tags,\n                        split_metadata_json = excluded.split_metadata_json,\n                        delete_opstamp = excluded.delete_opstamp,\n                        maturity_timestamp = excluded.maturity_timestamp,\n                        node_id = excluded.node_id,\n                        update_timestamp = CURRENT_TIMESTAMP,\n                        create_timestamp = CURRENT_TIMESTAMP\n                    WHERE splits.split_id = excluded.split_id AND splits.split_state = 'Staged'\n                RETURNING split_id;\n                \"#)\n                .bind(&split_ids)\n                .bind(time_range_start_list)\n                .bind(time_range_end_list)\n                .bind(tags_list)\n                .bind(splits_metadata_json)\n                .bind(delete_opstamps)\n                .bind(maturity_timestamps)\n                .bind(&node_ids)\n                .bind(SplitState::Staged.as_str())\n                .bind(&index_uid)\n                .fetch_all(tx.as_mut())\n                .await\n                .map_err(|sqlx_error| convert_sqlx_err(&index_uid.index_id, sqlx_error))?;\n\n            if upserted_split_ids.len() != split_ids.len() {\n          
      let failed_split_ids: Vec<String> = split_ids\n                    .into_iter()\n                    .filter(|split_id| !upserted_split_ids.contains(split_id))\n                    .collect();\n                let entity = EntityKind::Splits {\n                    split_ids: failed_split_ids,\n                };\n                let message = \"splits are not staged\".to_string();\n                return Err(MetastoreError::FailedPrecondition { entity, message });\n            }\n            info!(\n                %index_uid,\n                \"staged `{}` splits successfully\", split_ids.len()\n            );\n            Ok(EmptyResponse {})\n        })\n    }\n\n    #[instrument(skip(self))]\n    async fn publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let checkpoint_delta_opt: Option<IndexCheckpointDelta> =\n            request.deserialize_index_checkpoint()?;\n        let index_uid: IndexUid = request.index_uid().clone();\n        let staged_split_ids = request.staged_split_ids;\n        let replaced_split_ids = request.replaced_split_ids;\n\n        run_with_tx!(self.connection_pool, tx, \"publish splits\", {\n            let mut index_metadata = index_metadata(tx, &index_uid.index_id, true).await?;\n            if index_metadata.index_uid != index_uid {\n                return Err(MetastoreError::NotFound(EntityKind::Index {\n                    index_id: index_uid.index_id,\n                }));\n            }\n            if let Some(checkpoint_delta) = checkpoint_delta_opt {\n                let source_id = checkpoint_delta.source_id.clone();\n                let source = index_metadata.sources.get(&source_id).ok_or_else(|| {\n                    MetastoreError::NotFound(EntityKind::Source {\n                        index_id: index_uid.index_id.to_string(),\n                        source_id: source_id.to_string(),\n                    })\n                })?;\n\n           
     if use_shard_api(&source.source_params) {\n                    let publish_token = request.publish_token_opt.ok_or_else(|| {\n                        let message = format!(\n                            \"publish token is required for publishing splits for source \\\n                             `{source_id}`\"\n                        );\n                        MetastoreError::InvalidArgument { message }\n                    })?;\n                    try_apply_delta_v2(\n                        tx,\n                        &index_uid,\n                        &source_id,\n                        checkpoint_delta.source_delta,\n                        publish_token,\n                    )\n                    .await?;\n                } else {\n                    index_metadata\n                        .checkpoint\n                        .try_apply_delta(checkpoint_delta)\n                        .map_err(|error| {\n                            let entity = EntityKind::CheckpointDelta {\n                                index_id: index_uid.index_id.to_string(),\n                                source_id,\n                            };\n                            let message = error.to_string();\n                            MetastoreError::FailedPrecondition { entity, message }\n                        })?;\n                }\n            }\n            let index_metadata_json = serde_utils::to_json_str(&index_metadata)?;\n\n            const PUBLISH_SPLITS_QUERY: &str = r#\"\n            -- Select the splits to update, regardless of their state.\n            -- The left join make it possible to identify the splits that do not exist.\n            WITH input_splits AS (\n                SELECT input_splits.split_id, input_splits.expected_split_state, splits.actual_split_state\n                FROM (\n                    SELECT split_id, 'Staged' AS expected_split_state\n                    FROM UNNEST($3) AS staged_splits(split_id)\n                    UNION\n 
                   SELECT split_id, 'Published' AS expected_split_state\n                    FROM UNNEST($4) AS published_splits(split_id)\n                ) input_splits\n                LEFT JOIN (\n                    SELECT split_id, split_state AS actual_split_state\n                    FROM splits\n                    WHERE\n                        index_uid = $1\n                        AND (split_id = ANY($3) OR split_id = ANY($4))\n                    FOR UPDATE\n                    ) AS splits\n                USING (split_id)\n            ),\n            -- Update the index metadata with the new checkpoint.\n            updated_index_metadata AS (\n                UPDATE indexes\n                SET\n                    index_metadata_json = $2\n                WHERE\n                    index_uid = $1\n                    AND NOT EXISTS (\n                        SELECT 1\n                        FROM input_splits\n                        WHERE\n                            actual_split_state != expected_split_state\n                        )\n            ),\n            -- Publish the staged splits and mark the published splits for deletion.\n            updated_splits AS (\n                UPDATE splits\n                SET\n                    split_state = CASE split_state\n                        WHEN 'Staged' THEN 'Published'\n                        ELSE 'MarkedForDeletion'\n                    END,\n                    update_timestamp = (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),\n                    publish_timestamp = (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')\n                FROM input_splits\n                WHERE\n                    splits.index_uid = $1\n                    AND splits.split_id = input_splits.split_id\n                    AND NOT EXISTS (\n                        SELECT 1\n                        FROM input_splits\n                        WHERE\n                            actual_split_state != expected_split_state\n              
      )\n            )\n            -- Report the outcome of the update query.\n            SELECT\n                COUNT(1) FILTER (WHERE actual_split_state = 'Staged' AND expected_split_state = 'Staged'),\n                COUNT(1) FILTER (WHERE actual_split_state = 'Published' AND expected_split_state = 'Published'),\n                COALESCE(ARRAY_AGG(split_id) FILTER (WHERE actual_split_state IS NULL), ARRAY[]::TEXT[]),\n                COALESCE(ARRAY_AGG(split_id) FILTER (WHERE actual_split_state != 'Staged' AND expected_split_state = 'Staged'), ARRAY[]::TEXT[]),\n                COALESCE(ARRAY_AGG(split_id) FILTER (WHERE actual_split_state != 'Published' AND expected_split_state = 'Published'), ARRAY[]::TEXT[])\n                FROM input_splits\n        \"#;\n            let (\n                num_published_splits,\n                num_marked_splits,\n                not_found_split_ids,\n                not_staged_split_ids,\n                not_marked_split_ids,\n            ): (i64, i64, Vec<String>, Vec<String>, Vec<String>) =\n                sqlx::query_as(PUBLISH_SPLITS_QUERY)\n                    .bind(&index_uid)\n                    .bind(index_metadata_json)\n                    .bind(staged_split_ids)\n                    .bind(replaced_split_ids)\n                    .fetch_one(tx.as_mut())\n                    .await\n                    .map_err(|sqlx_error| convert_sqlx_err(&index_uid.index_id, sqlx_error))?;\n\n            if !not_found_split_ids.is_empty() {\n                return Err(MetastoreError::NotFound(EntityKind::Splits {\n                    split_ids: not_found_split_ids,\n                }));\n            }\n            if !not_staged_split_ids.is_empty() {\n                let entity = EntityKind::Splits {\n                    split_ids: not_staged_split_ids,\n                };\n                let message = \"splits are not staged\".to_string();\n                return Err(MetastoreError::FailedPrecondition { entity, message 
});\n            }\n            if !not_marked_split_ids.is_empty() {\n                let entity = EntityKind::Splits {\n                    split_ids: not_marked_split_ids,\n                };\n                let message = \"splits are not marked for deletion\".to_string();\n                return Err(MetastoreError::FailedPrecondition { entity, message });\n            }\n            info!(\n                %index_uid,\n                \"published {num_published_splits} splits and marked {num_marked_splits} for deletion successfully\"\n            );\n            Ok(EmptyResponse {})\n        })\n    }\n\n    #[instrument(skip(self))]\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        let list_splits_query = request.deserialize_list_splits_query()?;\n        let mut sql_query_builder = Query::select();\n        sql_query_builder.column(Asterisk).from(Splits::Table);\n        append_query_filters_and_order_by(&mut sql_query_builder, &list_splits_query);\n\n        let (sql_query, values) = sql_query_builder.build_sqlx(PostgresQueryBuilder);\n        let pg_split_stream = SplitStream::new(\n            self.connection_pool.clone(),\n            sql_query,\n            |connection_pool: &TrackedPool<Postgres>, sql_query: &String| {\n                sqlx::query_as_with::<_, PgSplit, _>(sql_query, values).fetch(connection_pool)\n            },\n        );\n        let split_stream =\n            pg_split_stream\n                .chunks(STREAM_SPLITS_CHUNK_SIZE)\n                .map(|pg_splits_results| {\n                    let mut splits = Vec::with_capacity(pg_splits_results.len());\n                    for pg_split_result in pg_splits_results {\n                        let pg_split = match pg_split_result {\n                            Ok(pg_split) => pg_split,\n                            Err(error) => {\n                                return 
Err(MetastoreError::Internal {\n                                    message: \"failed to fetch splits\".to_string(),\n                                    cause: error.to_string(),\n                                });\n                            }\n                        };\n                        let split: Split = match pg_split.try_into() {\n                            Ok(split) => split,\n                            Err(error) => {\n                                return Err(MetastoreError::Internal {\n                                    message: \"failed to convert `PgSplit` to `Split`\".to_string(),\n                                    cause: error.to_string(),\n                                });\n                            }\n                        };\n                        splits.push(split);\n                    }\n                    ListSplitsResponse::try_from_splits(splits)\n                });\n        let service_stream = ServiceStream::new(Box::pin(split_stream));\n        Ok(service_stream)\n    }\n\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> MetastoreResult<ListIndexStatsResponse> {\n        let index_pattern_sql = build_index_id_patterns_sql_query(&request.index_id_patterns)\n            .map_err(|error| MetastoreError::Internal {\n                message: \"failed to build `list_index_stats` SQL query\".to_string(),\n                cause: error.to_string(),\n            })?;\n        let sql = format!(\n            \"SELECT\n                i.index_uid,\n                s.split_state,\n                COUNT(s.split_state) AS num_splits,\n                COALESCE(SUM(s.split_size_bytes)::BIGINT, 0) AS total_size_bytes\n            FROM ({index_pattern_sql}) i\n            LEFT JOIN splits s ON s.index_uid = i.index_uid\n            GROUP BY i.index_uid, s.split_state\"\n        );\n\n        let rows: Vec<(String, Option<String>, i64, i64)> = sqlx::query_as(&sql)\n            
.fetch_all(&self.connection_pool)\n            .await?;\n\n        let mut index_stats = HashMap::new();\n        for (index_uid_str, split_state, num_splits, total_size_bytes) in rows {\n            let Ok(index_uid) = IndexUid::from_str(&index_uid_str) else {\n                return Err(MetastoreError::Internal {\n                    message: \"failed to parse index_uid\".to_string(),\n                    cause: index_uid_str.to_string(),\n                });\n            };\n            let stats = index_stats\n                .entry(index_uid_str)\n                .or_insert_with(|| IndexStats {\n                    index_uid: Some(index_uid),\n                    staged: Some(SplitStats::default()),\n                    published: Some(SplitStats::default()),\n                    marked_for_deletion: Some(SplitStats::default()),\n                });\n            let num_splits = num_splits as u64;\n            let total_size_bytes = total_size_bytes as u64;\n            match split_state.as_deref() {\n                Some(\"Staged\") => {\n                    stats.staged = Some(SplitStats {\n                        num_splits,\n                        total_size_bytes,\n                    });\n                }\n                Some(\"Published\") => {\n                    stats.published = Some(SplitStats {\n                        num_splits,\n                        total_size_bytes,\n                    });\n                }\n                Some(\"MarkedForDeletion\") => {\n                    stats.marked_for_deletion = Some(SplitStats {\n                        num_splits,\n                        total_size_bytes,\n                    });\n                }\n                None => {} // if an index has no splits, we can keep the defaults\n                Some(split_state) => {\n                    return Err(MetastoreError::Internal {\n                        message: \"invalid split state\".to_string(),\n                        cause: 
split_state.to_string(),\n                    });\n                }\n            }\n        }\n\n        Ok(ListIndexStatsResponse {\n            index_stats: index_stats.into_values().collect(),\n        })\n    }\n\n    #[instrument(skip(self))]\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let split_ids = request.split_ids;\n        const MARK_SPLITS_FOR_DELETION_QUERY: &str = r#\"\n            -- Select the splits to update, regardless of their state.\n            -- The left join makes it possible to identify the splits that do not exist.\n            WITH input_splits AS (\n                SELECT input_splits.split_id, splits.split_state\n                FROM UNNEST($2) AS input_splits(split_id)\n                LEFT JOIN (\n                    SELECT split_id, split_state\n                    FROM splits\n                    WHERE\n                        index_uid = $1\n                        AND split_id = ANY($2)\n                    FOR UPDATE\n                    ) AS splits\n                USING (split_id)\n            ),\n            -- Mark the staged and published splits for deletion.\n            marked_splits AS (\n                UPDATE splits\n                SET\n                    split_state = 'MarkedForDeletion',\n                    update_timestamp = (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')\n                FROM input_splits\n                WHERE\n                    splits.index_uid = $1\n                    AND splits.split_id = input_splits.split_id\n                    AND splits.split_state IN ('Staged', 'Published')\n            )\n            -- Report the outcome of the update query.\n            SELECT\n                COUNT(split_state),\n                COUNT(1) FILTER (WHERE split_state IN ('Staged', 'Published')),\n                
COALESCE(ARRAY_AGG(split_id) FILTER (WHERE split_state IS NULL), ARRAY[]::TEXT[])\n                FROM input_splits\n        \"#;\n        let (num_found_splits, num_marked_splits, not_found_split_ids): (i64, i64, Vec<String>) =\n            sqlx::query_as(MARK_SPLITS_FOR_DELETION_QUERY)\n                .bind(&index_uid)\n                .bind(split_ids.clone())\n                .fetch_one(&self.connection_pool)\n                .await\n                .map_err(|sqlx_error| convert_sqlx_err(&index_uid.index_id, sqlx_error))?;\n\n        if num_found_splits == 0\n            && index_opt(&self.connection_pool, &index_uid.index_id, false)\n                .await?\n                .is_none()\n        {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.index_id,\n            }));\n        }\n        info!(\n            %index_uid,\n            \"Marked {} splits for deletion, among which {} were newly marked.\",\n            split_ids.len() - not_found_split_ids.len(),\n            num_marked_splits\n        );\n        if !not_found_split_ids.is_empty() {\n            warn!(\n                %index_uid,\n                split_ids=?PrettySample::new(&not_found_split_ids, 5),\n                \"{} splits were not found and could not be marked for deletion.\",\n                not_found_split_ids.len()\n            );\n        }\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip(self))]\n    async fn delete_splits(&self, request: DeleteSplitsRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let split_ids = request.split_ids;\n        const DELETE_SPLITS_QUERY: &str = r#\"\n            -- Select the splits to delete, regardless of their state.\n            -- The left join makes it possible to identify the splits that do not exist.\n            WITH input_splits AS (\n                SELECT input_splits.split_id, splits.split_state\n      
          FROM UNNEST($2) AS input_splits(split_id)\n                LEFT JOIN (\n                    SELECT split_id, split_state\n                    FROM splits\n                    WHERE\n                        index_uid = $1\n                        AND split_id = ANY($2)\n                    FOR UPDATE\n                    ) AS splits\n                USING (split_id)\n            ),\n            -- Delete the splits if and only if all the splits are marked for deletion.\n            deleted_splits AS (\n                DELETE FROM splits\n                USING input_splits\n                WHERE\n                    splits.index_uid = $1\n                    AND splits.split_id = input_splits.split_id\n                    AND NOT EXISTS (\n                        SELECT 1\n                        FROM input_splits\n                        WHERE\n                            split_state IN ('Staged', 'Published')\n                    )\n            )\n            -- Report the outcome of the delete query.\n            SELECT\n                COUNT(split_state),\n                COUNT(1) FILTER (WHERE split_state = 'MarkedForDeletion'),\n                COALESCE(ARRAY_AGG(split_id) FILTER (WHERE split_state IN ('Staged', 'Published')), ARRAY[]::TEXT[]),\n                COALESCE(ARRAY_AGG(split_id) FILTER (WHERE split_state IS NULL), ARRAY[]::TEXT[])\n                FROM input_splits\n        \"#;\n        let (num_found_splits, num_deleted_splits, not_deletable_split_ids, not_found_split_ids): (\n            i64,\n            i64,\n            Vec<String>,\n            Vec<String>,\n        ) = sqlx::query_as(DELETE_SPLITS_QUERY)\n            .bind(&index_uid)\n            .bind(split_ids)\n            .fetch_one(&self.connection_pool)\n            .await\n            .map_err(|sqlx_error| convert_sqlx_err(&index_uid.index_id, sqlx_error))?;\n\n        if num_found_splits == 0\n            && index_opt_for_uid(&self.connection_pool, index_uid.clone(), 
false)\n                .await?\n                .is_none()\n        {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.index_id,\n            }));\n        }\n        if !not_deletable_split_ids.is_empty() {\n            let message = format!(\n                \"splits `{}` are not deletable\",\n                not_deletable_split_ids.join(\", \")\n            );\n            let entity = EntityKind::Splits {\n                split_ids: not_deletable_split_ids,\n            };\n            return Err(MetastoreError::FailedPrecondition { entity, message });\n        }\n        info!(%index_uid, \"deleted {} splits from index\", num_deleted_splits);\n\n        if !not_found_split_ids.is_empty() {\n            warn!(\n                %index_uid,\n                split_ids=?PrettySample::new(&not_found_split_ids, 5),\n                \"{} splits were not found and could not be deleted.\",\n                not_found_split_ids.len()\n            );\n        }\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip(self))]\n    async fn add_source(&self, request: AddSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let source_config = request.deserialize_source_config()?;\n        let index_uid: IndexUid = request.index_uid().clone();\n        run_with_tx!(self.connection_pool, tx, \"add source\", {\n            mutate_index_metadata::<MetastoreError, _>(tx, index_uid, |index_metadata| {\n                index_metadata.add_source(source_config)?;\n                Ok(MutationOccurred::Yes(()))\n            })\n            .await?;\n            Ok(())\n        })?;\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip(self))]\n    async fn update_source(&self, request: UpdateSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let source_config = request.deserialize_source_config()?;\n        let index_uid: IndexUid = request.index_uid().clone();\n        
run_with_tx!(self.connection_pool, tx, \"update source\", {\n            mutate_index_metadata::<MetastoreError, _>(tx, index_uid, |index_metadata| {\n                let mutation_occurred = index_metadata.update_source(source_config)?;\n                Ok(MutationOccurred::from(mutation_occurred))\n            })\n            .await?;\n            Ok(())\n        })?;\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip(self))]\n    async fn toggle_source(&self, request: ToggleSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        run_with_tx!(self.connection_pool, tx, \"toggle source\", {\n            mutate_index_metadata(tx, index_uid, |index_metadata| {\n                if index_metadata.toggle_source(&request.source_id, request.enable)? {\n                    Ok::<_, MetastoreError>(MutationOccurred::Yes(()))\n                } else {\n                    Ok::<_, MetastoreError>(MutationOccurred::No(()))\n                }\n            })\n            .await?;\n            Ok(())\n        })?;\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip(self))]\n    async fn delete_source(&self, request: DeleteSourceRequest) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let source_id = request.source_id.clone();\n        run_with_tx!(self.connection_pool, tx, \"delete source\", {\n            mutate_index_metadata(tx, index_uid.clone(), |index_metadata| {\n                index_metadata.delete_source(&source_id)?;\n                Ok::<_, MetastoreError>(MutationOccurred::Yes(()))\n            })\n            .await?;\n            sqlx::query(\n                r#\"\n                    DELETE FROM shards\n                    WHERE\n                        index_uid = $1\n                        AND source_id = $2\n                \"#,\n            )\n            .bind(&index_uid)\n            .bind(source_id)\n          
  .execute(tx.as_mut())\n            .await?;\n            Ok(())\n        })?;\n        Ok(EmptyResponse {})\n    }\n\n    #[instrument(skip(self))]\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        run_with_tx!(self.connection_pool, tx, \"reset source checkpoint\", {\n            mutate_index_metadata(tx, index_uid, |index_metadata| {\n                if index_metadata.checkpoint.reset_source(&request.source_id) {\n                    Ok::<_, MetastoreError>(MutationOccurred::Yes(()))\n                } else {\n                    Ok::<_, MetastoreError>(MutationOccurred::No(()))\n                }\n            })\n            .await?;\n            Ok(())\n        })?;\n        Ok(EmptyResponse {})\n    }\n\n    /// Retrieves the last delete opstamp for a given `index_id`.\n    #[instrument(skip(self))]\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> MetastoreResult<LastDeleteOpstampResponse> {\n        let max_opstamp: i64 = sqlx::query_scalar(\n            r#\"\n            SELECT COALESCE(MAX(opstamp), 0)\n            FROM delete_tasks\n            WHERE index_uid = $1\n        \"#,\n        )\n        .bind(request.index_uid())\n        .fetch_one(&self.connection_pool)\n        .await\n        .map_err(|error| MetastoreError::Db {\n            message: error.to_string(),\n        })?;\n\n        Ok(LastDeleteOpstampResponse::new(max_opstamp as u64))\n    }\n\n    /// Creates a delete task from a delete query.\n    #[instrument(skip(self))]\n    async fn create_delete_task(&self, delete_query: DeleteQuery) -> MetastoreResult<DeleteTask> {\n        let delete_query_json = serde_utils::to_json_str(&delete_query)?;\n        let (create_timestamp, opstamp): (sqlx::types::time::PrimitiveDateTime, i64) =\n            sqlx::query_as(\n          
      r#\"\n                INSERT INTO delete_tasks (index_uid, delete_query_json) VALUES ($1, $2)\n                RETURNING create_timestamp, opstamp\n            \"#,\n            )\n            .bind(delete_query.index_uid().to_string())\n            .bind(&delete_query_json)\n            .fetch_one(&self.connection_pool)\n            .await\n            .map_err(|error| convert_sqlx_err(&delete_query.index_uid().index_id, error))?;\n\n        Ok(DeleteTask {\n            create_timestamp: create_timestamp.assume_utc().unix_timestamp(),\n            opstamp: opstamp as u64,\n            delete_query: Some(delete_query),\n        })\n    }\n\n    /// Update splits delete opstamps.\n    #[instrument(skip(self))]\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let split_ids = request.split_ids;\n        if split_ids.is_empty() {\n            return Ok(UpdateSplitsDeleteOpstampResponse {});\n        }\n        let update_result = sqlx::query(\n            r#\"\n            UPDATE splits\n            SET\n                delete_opstamp = $1,\n                -- The values we compare with are *before* the modification:\n                update_timestamp = CASE\n                    WHEN delete_opstamp != $1 THEN (CURRENT_TIMESTAMP AT TIME ZONE 'UTC')\n                    ELSE update_timestamp\n                END\n            WHERE\n                index_uid = $2\n                AND split_id = ANY($3)\n        \"#,\n        )\n        .bind(request.delete_opstamp as i64)\n        .bind(&index_uid)\n        .bind(split_ids)\n        .execute(&self.connection_pool)\n        .await?;\n\n        // If no splits were updated, maybe the index does not exist in the first place?\n        if update_result.rows_affected() == 0\n            && 
index_opt_for_uid(&self.connection_pool, index_uid.clone(), false)\n                .await?\n                .is_none()\n        {\n            return Err(MetastoreError::NotFound(EntityKind::Index {\n                index_id: index_uid.index_id,\n            }));\n        }\n        Ok(UpdateSplitsDeleteOpstampResponse {})\n    }\n\n    /// Lists the delete tasks with opstamp > `opstamp_start`.\n    #[instrument(skip(self))]\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> MetastoreResult<ListDeleteTasksResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let pg_delete_tasks: Vec<PgDeleteTask> = sqlx::query_as::<_, PgDeleteTask>(\n            r#\"\n                SELECT * FROM delete_tasks\n                WHERE\n                    index_uid = $1\n                    AND opstamp > $2\n                \"#,\n        )\n        .bind(&index_uid)\n        .bind(request.opstamp_start as i64)\n        .fetch_all(&self.connection_pool)\n        .await?;\n        let delete_tasks: Vec<DeleteTask> = pg_delete_tasks\n            .into_iter()\n            .map(|pg_delete_task| pg_delete_task.try_into())\n            .collect::<MetastoreResult<_>>()?;\n        Ok(ListDeleteTasksResponse { delete_tasks })\n    }\n\n    /// Returns `num_splits` published splits with `split.delete_opstamp` < `delete_opstamp`.\n    /// Results are ordered by ascending `split.delete_opstamp` and `split.publish_timestamp`\n    /// values.\n    #[instrument(skip(self))]\n    async fn list_stale_splits(\n        &self,\n        request: ListStaleSplitsRequest,\n    ) -> MetastoreResult<ListSplitsResponse> {\n        let index_uid: IndexUid = request.index_uid().clone();\n        let stale_pg_splits: Vec<PgSplit> = sqlx::query_as::<_, PgSplit>(\n            r#\"\n                SELECT *\n                FROM splits\n                WHERE\n                    index_uid = $1\n                    AND delete_opstamp < 
$2\n                    AND split_state = $3\n                    AND (maturity_timestamp = to_timestamp(0) OR (CURRENT_TIMESTAMP AT TIME ZONE 'UTC') >= maturity_timestamp)\n                ORDER BY delete_opstamp ASC, publish_timestamp ASC\n                LIMIT $4\n            \"#,\n        )\n        .bind(&index_uid)\n        .bind(request.delete_opstamp as i64)\n        .bind(SplitState::Published.as_str())\n        .bind(request.num_splits as i64)\n        .fetch_all(&self.connection_pool)\n        .await?;\n\n        let stale_splits: Vec<Split> = stale_pg_splits\n            .into_iter()\n            .map(|pg_split| pg_split.try_into())\n            .collect::<MetastoreResult<_>>()?;\n        let response = ListSplitsResponse::try_from_splits(stale_splits)?;\n        Ok(response)\n    }\n\n    // TODO: Issue a single SQL query.\n    async fn open_shards(&self, request: OpenShardsRequest) -> MetastoreResult<OpenShardsResponse> {\n        let mut subresponses = Vec::with_capacity(request.subrequests.len());\n\n        for subrequest in request.subrequests {\n            let open_shard: Shard = open_or_fetch_shard(&self.connection_pool, &subrequest).await?;\n            let subresponse = OpenShardSubresponse {\n                subrequest_id: subrequest.subrequest_id,\n                open_shard: Some(open_shard),\n            };\n            subresponses.push(subresponse);\n        }\n        Ok(OpenShardsResponse { subresponses })\n    }\n\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> MetastoreResult<AcquireShardsResponse> {\n        const ACQUIRE_SHARDS_QUERY: &str = include_str!(\"queries/shards/acquire.sql\");\n\n        if request.shard_ids.is_empty() {\n            return Ok(Default::default());\n        }\n        let pg_shards: Vec<PgShard> = sqlx::query_as(ACQUIRE_SHARDS_QUERY)\n            .bind(request.index_uid())\n            .bind(&request.source_id)\n            .bind(&request.shard_ids)\n        
    .bind(&request.publish_token)\n            .fetch_all(&self.connection_pool)\n            .await?;\n        let acquired_shards = pg_shards\n            .into_iter()\n            .map(|pg_shard| pg_shard.into())\n            .collect();\n        let response = AcquireShardsResponse { acquired_shards };\n        Ok(response)\n    }\n\n    async fn list_shards(&self, request: ListShardsRequest) -> MetastoreResult<ListShardsResponse> {\n        if request.subrequests.is_empty() {\n            return Ok(Default::default());\n        }\n        let mut sql_query_builder = Query::select();\n\n        for (idx, subrequest) in request.subrequests.iter().enumerate() {\n            let mut sql_subquery_builder = Query::select();\n\n            sql_subquery_builder\n                .column(Asterisk)\n                .from(Shards::Table)\n                .and_where(Expr::col(Shards::IndexUid).eq(subrequest.index_uid()))\n                .and_where(Expr::col(Shards::SourceId).eq(&subrequest.source_id));\n\n            let shard_state = subrequest.shard_state();\n\n            if shard_state != ShardState::Unspecified {\n                let shard_state_str = shard_state.as_json_str_name();\n                let shard_state_alias = Alias::new(\"SHARD_STATE\");\n                let cast_expr = Func::cast_as(shard_state_str, shard_state_alias);\n                sql_subquery_builder.and_where(Expr::col(Shards::ShardState).eq(cast_expr));\n            }\n            if idx == 0 {\n                sql_query_builder = sql_subquery_builder;\n            } else {\n                sql_query_builder.union(UnionType::All, sql_subquery_builder);\n            }\n        }\n        let (sql_query, values) = sql_query_builder.build_sqlx(PostgresQueryBuilder);\n\n        let pg_shards: Vec<PgShard> = sqlx::query_as_with::<_, PgShard, _>(&sql_query, values)\n            .fetch_all(&self.connection_pool)\n            .await?;\n\n        let mut per_source_subresponses: HashMap<(IndexUid, 
SourceId), ListShardsSubresponse> =\n            request\n                .subrequests\n                .into_iter()\n                .map(|subrequest| {\n                    let index_uid = subrequest.index_uid().clone();\n                    let source_id = subrequest.source_id.clone();\n                    (\n                        (index_uid, source_id),\n                        ListShardsSubresponse {\n                            index_uid: subrequest.index_uid,\n                            source_id: subrequest.source_id,\n                            shards: Vec::new(),\n                        },\n                    )\n                })\n                .collect();\n\n        for pg_shard in pg_shards {\n            let shard: Shard = pg_shard.into();\n            let source_key = (shard.index_uid().clone(), shard.source_id.clone());\n\n            let Some(subresponse) = per_source_subresponses.get_mut(&source_key) else {\n                warn!(\n                    index_uid=%shard.index_uid(),\n                    source_id=%shard.source_id,\n                    \"could not find source in subresponses: this should never happen, please report\"\n                );\n                continue;\n            };\n            subresponse.shards.push(shard);\n        }\n        let subresponses = per_source_subresponses.into_values().collect();\n        let response = ListShardsResponse { subresponses };\n        Ok(response)\n    }\n\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> MetastoreResult<DeleteShardsResponse> {\n        const DELETE_SHARDS_QUERY: &str = include_str!(\"queries/shards/delete.sql\");\n\n        const FIND_NOT_DELETABLE_SHARDS_QUERY: &str =\n            include_str!(\"queries/shards/find_not_deletable.sql\");\n\n        if request.shard_ids.is_empty() {\n            return Ok(Default::default());\n        }\n        let query_result = sqlx::query(DELETE_SHARDS_QUERY)\n            
.bind(request.index_uid())\n            .bind(&request.source_id)\n            .bind(&request.shard_ids)\n            .bind(request.force)\n            .execute(&self.connection_pool)\n            .await?;\n\n        // Happy path: all shards were deleted.\n        if request.force || query_result.rows_affected() == request.shard_ids.len() as u64 {\n            let response = DeleteShardsResponse {\n                index_uid: request.index_uid,\n                source_id: request.source_id,\n                successes: request.shard_ids,\n                failures: Vec::new(),\n            };\n            return Ok(response);\n        }\n        // Unhappy path: some shards were not deleted because they do not exist or are not fully\n        // indexed.\n        let not_deletable_pg_shards: Vec<PgShard> = sqlx::query_as(FIND_NOT_DELETABLE_SHARDS_QUERY)\n            .bind(request.index_uid())\n            .bind(&request.source_id)\n            .bind(&request.shard_ids)\n            .fetch_all(&self.connection_pool)\n            .await?;\n\n        if not_deletable_pg_shards.is_empty() {\n            let response = DeleteShardsResponse {\n                index_uid: request.index_uid,\n                source_id: request.source_id,\n                successes: request.shard_ids,\n                failures: Vec::new(),\n            };\n            return Ok(response);\n        }\n        let failures: Vec<ShardId> = not_deletable_pg_shards\n            .into_iter()\n            .map(|pg_shard| pg_shard.shard_id)\n            .collect();\n        warn!(\n            index_uid=%request.index_uid(),\n            source_id=%request.source_id,\n            \"failed to delete shards `{}`: shards are not fully indexed\",\n            failures.iter().join(\", \")\n        );\n        let successes: Vec<ShardId> = request\n            .shard_ids\n            .into_iter()\n            .filter(|shard_id| !failures.contains(shard_id))\n            .collect();\n        let response = 
DeleteShardsResponse {\n            index_uid: request.index_uid,\n            source_id: request.source_id,\n            successes,\n            failures,\n        };\n        Ok(response)\n    }\n\n    async fn prune_shards(&self, request: PruneShardsRequest) -> MetastoreResult<EmptyResponse> {\n        const PRUNE_AGE_SHARDS_QUERY: &str = include_str!(\"queries/shards/prune_age.sql\");\n        const PRUNE_COUNT_SHARDS_QUERY: &str = include_str!(\"queries/shards/prune_count.sql\");\n\n        if let Some(max_age_secs) = request.max_age_secs {\n            let limit_datetime =\n                OffsetDateTime::now_utc() - Duration::from_secs(max_age_secs as u64);\n            sqlx::query(PRUNE_AGE_SHARDS_QUERY)\n                .bind(request.index_uid())\n                .bind(&request.source_id)\n                .bind(limit_datetime)\n                .execute(&self.connection_pool)\n                .await?;\n        }\n\n        if let Some(max_count) = request.max_count {\n            sqlx::query(PRUNE_COUNT_SHARDS_QUERY)\n                .bind(request.index_uid())\n                .bind(&request.source_id)\n                .bind(max_count as i64)\n                .execute(&self.connection_pool)\n                .await?;\n        }\n        Ok(EmptyResponse {})\n    }\n\n    // Index Template API\n\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        const INSERT_INDEX_TEMPLATE_QUERY: &str =\n            include_str!(\"queries/index_templates/insert.sql\");\n        const UPSERT_INDEX_TEMPLATE_QUERY: &str =\n            include_str!(\"queries/index_templates/upsert.sql\");\n\n        let index_template: IndexTemplate =\n            serde_utils::from_json_str(&request.index_template_json)?;\n\n        index_template\n            .validate()\n            .map_err(|error| MetastoreError::InvalidArgument {\n                message: format!(\n                    
\"invalid index template `{}`: `{error}`\",\n                    index_template.template_id\n                ),\n            })?;\n\n        let mut positive_patterns = Vec::new();\n        let mut negative_patterns = Vec::new();\n\n        for pattern in &index_template.index_id_patterns {\n            if let Some(negative_pattern) = pattern.strip_prefix('-') {\n                negative_patterns.push(negative_pattern.replace('*', \"%\"));\n            } else {\n                positive_patterns.push(pattern.replace('*', \"%\"));\n            }\n        }\n        if request.overwrite {\n            sqlx::query(UPSERT_INDEX_TEMPLATE_QUERY)\n                .bind(&index_template.template_id)\n                .bind(positive_patterns)\n                .bind(negative_patterns)\n                .bind(index_template.priority as i32)\n                .bind(&request.index_template_json)\n                .execute(&self.connection_pool)\n                .await?;\n\n            return Ok(EmptyResponse {});\n        }\n        let pg_query_result = sqlx::query(INSERT_INDEX_TEMPLATE_QUERY)\n            .bind(&index_template.template_id)\n            .bind(positive_patterns)\n            .bind(negative_patterns)\n            .bind(index_template.priority as i32)\n            .bind(&request.index_template_json)\n            .execute(&self.connection_pool)\n            .await?;\n\n        if pg_query_result.rows_affected() == 0 {\n            return Err(MetastoreError::AlreadyExists(EntityKind::IndexTemplate {\n                template_id: index_template.template_id,\n            }));\n        }\n        Ok(EmptyResponse {})\n    }\n\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> MetastoreResult<GetIndexTemplateResponse> {\n        let pg_index_template_json: PgIndexTemplate =\n            sqlx::query_as(\"SELECT * FROM index_templates WHERE template_id = $1\")\n                .bind(&request.template_id)\n                
.fetch_optional(&self.connection_pool)\n                .await?\n                .ok_or({\n                    MetastoreError::NotFound(EntityKind::IndexTemplate {\n                        template_id: request.template_id,\n                    })\n                })?;\n        let response = GetIndexTemplateResponse {\n            index_template_json: pg_index_template_json.index_template_json,\n        };\n        Ok(response)\n    }\n\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> MetastoreResult<FindIndexTemplateMatchesResponse> {\n        if request.index_ids.is_empty() {\n            return Ok(Default::default());\n        }\n        const FIND_INDEX_TEMPLATE_MATCHES_QUERY: &str =\n            include_str!(\"queries/index_templates/find.sql\");\n\n        let sql_matches: Vec<(IndexId, IndexTemplateId, String)> =\n            sqlx::query_as(FIND_INDEX_TEMPLATE_MATCHES_QUERY)\n                .bind(&request.index_ids)\n                .fetch_all(&self.connection_pool)\n                .await?;\n\n        let matches = sql_matches\n            .into_iter()\n            .map(\n                |(index_id, template_id, index_template_json)| IndexTemplateMatch {\n                    index_id,\n                    template_id,\n                    index_template_json,\n                },\n            )\n            .collect();\n        let response = FindIndexTemplateMatchesResponse { matches };\n        Ok(response)\n    }\n\n    async fn list_index_templates(\n        &self,\n        _request: ListIndexTemplatesRequest,\n    ) -> MetastoreResult<ListIndexTemplatesResponse> {\n        let pg_index_templates_json: Vec<(String,)> = sqlx::query_as(\n            \"SELECT index_template_json FROM index_templates ORDER BY template_id ASC\",\n        )\n        .fetch_all(&self.connection_pool)\n        .await?;\n        let index_templates_json: Vec<String> = pg_index_templates_json\n            
.into_iter()\n            .map(|(index_template_json,)| index_template_json)\n            .collect();\n        let response = ListIndexTemplatesResponse {\n            index_templates_json,\n        };\n        Ok(response)\n    }\n\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> MetastoreResult<EmptyResponse> {\n        sqlx::query(\"DELETE FROM index_templates WHERE template_id = ANY($1)\")\n            .bind(&request.template_ids)\n            .execute(&self.connection_pool)\n            .await?;\n        Ok(EmptyResponse {})\n    }\n\n    async fn get_cluster_identity(\n        &self,\n        _: GetClusterIdentityRequest,\n    ) -> MetastoreResult<GetClusterIdentityResponse> {\n        // `ON CONFLICT DO NOTHING RETURNING` returns NULL if no insert happens.\n        // To always get the value, we use this pattern:\n        let (uuid,) = sqlx::query_as(\n            r\"\n                INSERT INTO kv (key, value)\n                VALUES ('cluster_identity', $1)\n                ON CONFLICT (key) DO UPDATE SET key = EXCLUDED.key\n                RETURNING value\n            \",\n        )\n        .bind(Uuid::new_v4().hyphenated().to_string())\n        .fetch_one(&self.connection_pool)\n        .await?;\n        Ok(GetClusterIdentityResponse { uuid })\n    }\n}\n\nasync fn open_or_fetch_shard<'e>(\n    executor: impl Executor<'e, Database = Postgres> + Clone,\n    subrequest: &OpenShardSubrequest,\n) -> MetastoreResult<Shard> {\n    const OPEN_SHARDS_QUERY: &str = include_str!(\"queries/shards/open.sql\");\n\n    let pg_shard_opt: Option<PgShard> = sqlx::query_as(OPEN_SHARDS_QUERY)\n        .bind(subrequest.index_uid())\n        .bind(&subrequest.source_id)\n        .bind(subrequest.shard_id().as_str())\n        .bind(&subrequest.leader_id)\n        .bind(&subrequest.follower_id)\n        .bind(subrequest.doc_mapping_uid)\n        .bind(&subrequest.publish_token)\n        // Use a timestamp 
generated by the metastore node to avoid clock drift issues\n        .bind(OffsetDateTime::now_utc())\n        .fetch_optional(executor.clone())\n        .await?;\n\n    if let Some(pg_shard) = pg_shard_opt {\n        let shard: Shard = pg_shard.into();\n        info!(\n            index_uid=%shard.index_uid(),\n            source_id=%shard.source_id,\n            shard_id=%shard.shard_id(),\n            leader_id=%shard.leader_id,\n            follower_id=?shard.follower_id,\n            \"opened shard\"\n        );\n        return Ok(shard);\n    }\n    const FETCH_SHARD_QUERY: &str = include_str!(\"queries/shards/fetch.sql\");\n\n    let pg_shard_opt: Option<PgShard> = sqlx::query_as(FETCH_SHARD_QUERY)\n        .bind(subrequest.index_uid())\n        .bind(&subrequest.source_id)\n        .bind(subrequest.shard_id().as_str())\n        .fetch_optional(executor)\n        .await?;\n\n    if let Some(pg_shard) = pg_shard_opt {\n        return Ok(pg_shard.into());\n    }\n    Err(MetastoreError::NotFound(EntityKind::Source {\n        index_id: subrequest.index_uid().to_string(),\n        source_id: subrequest.source_id.clone(),\n    }))\n}\n\nimpl MetastoreServiceExt for PostgresqlMetastore {}\n\n/// Builds the SQL query that returns indexes matching at least one pattern in\n/// `index_id_patterns`, and none of the patterns starting with '-'\n///\n/// For each pattern, we check whether the pattern is valid and replace `*` by `%`\n/// to build a SQL `LIKE` query.\nfn build_index_id_patterns_sql_query(index_id_patterns: &[String]) -> anyhow::Result<String> {\n    let mut positive_patterns = Vec::new();\n    let mut negative_patterns = Vec::new();\n    for pattern in index_id_patterns {\n        if let Some(negative_pattern) = pattern.strip_prefix('-') {\n            negative_patterns.push(negative_pattern.to_string());\n        } else {\n            positive_patterns.push(pattern);\n        }\n    }\n\n    if positive_patterns.is_empty() {\n        anyhow::bail!(\"The 
list of index id patterns may not be empty.\");\n    }\n\n    if index_id_patterns.iter().any(|pattern| pattern == \"*\") && negative_patterns.is_empty() {\n        return Ok(\"SELECT * FROM indexes\".to_string());\n    }\n\n    let mut where_like_query = String::new();\n    for (index_id_pattern_idx, index_id_pattern) in positive_patterns.iter().enumerate() {\n        validate_index_id_pattern(index_id_pattern, false).map_err(|error| {\n            MetastoreError::Internal {\n                message: \"failed to build list indexes query\".to_string(),\n                cause: error.to_string(),\n            }\n        })?;\n        if index_id_pattern_idx != 0 {\n            where_like_query.push_str(\" OR \");\n        }\n        if index_id_pattern.contains('*') {\n            let sql_pattern = index_id_pattern.replace('*', \"%\");\n            let _ = write!(where_like_query, \"index_id LIKE '{sql_pattern}'\");\n        } else {\n            let _ = write!(where_like_query, \"index_id = '{index_id_pattern}'\");\n        }\n    }\n    let mut negative_like_query = String::new();\n    for index_id_pattern in negative_patterns.iter() {\n        validate_index_id_pattern(index_id_pattern, false).map_err(|error| {\n            MetastoreError::Internal {\n                message: \"failed to build list indexes query\".to_string(),\n                cause: error.to_string(),\n            }\n        })?;\n        negative_like_query.push_str(\" AND \");\n        if index_id_pattern.contains('*') {\n            let sql_pattern = index_id_pattern.replace('*', \"%\");\n            let _ = write!(negative_like_query, \"index_id NOT LIKE '{sql_pattern}'\");\n        } else {\n            let _ = write!(negative_like_query, \"index_id <> '{index_id_pattern}'\");\n        }\n    }\n\n    Ok(format!(\n        \"SELECT * FROM indexes WHERE ({where_like_query}){negative_like_query}\"\n    ))\n}\n\n/// A postgres metastore factory\n#[cfg(test)]\n#[async_trait]\nimpl 
crate::tests::DefaultForTest for PostgresqlMetastore {\n    async fn default_for_test() -> Self {\n        // We cannot use a singleton here,\n        // because sqlx needs the runtime used to create a connection to\n        // not being dropped.\n        //\n        // Each unit test runs its own tokio Runtime, so a singleton would mean\n        // tying the connection pool to the runtime of one unit test.\n        // Concretely this results in a \"IO driver has terminated\"\n        // once the first unit test finishes and its runtime is dropped.\n        //\n        // The number of connections to Postgres should not be\n        // too catastrophic, as it is limited by the number of concurrent\n        // unit tests running (= number of test-threads).\n        dotenvy::dotenv().ok();\n        let uri: Uri = std::env::var(\"QW_TEST_DATABASE_URL\")\n            .expect(\"environment variable `QW_TEST_DATABASE_URL` should be set\")\n            .parse()\n            .expect(\"environment variable `QW_TEST_DATABASE_URL` should be a valid URI\");\n        PostgresqlMetastore::new(&PostgresMetastoreConfig::default(), &uri)\n            .await\n            .expect(\"failed to initialize PostgreSQL metastore test\")\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use async_trait::async_trait;\n    use quickwit_common::uri::Protocol;\n    use quickwit_doc_mapper::tag_pruning::TagFilterAst;\n    use quickwit_proto::ingest::Shard;\n    use quickwit_proto::metastore::MetastoreService;\n    use quickwit_proto::types::{IndexUid, SourceId};\n    use sea_query::{Asterisk, PostgresQueryBuilder, Query};\n    use time::OffsetDateTime;\n\n    use super::*;\n    use crate::metastore::postgres::metastore::build_index_id_patterns_sql_query;\n    use crate::metastore::postgres::model::{PgShard, Splits};\n    use crate::tests::DefaultForTest;\n    use crate::tests::shard::ReadWriteShardsForTest;\n    use crate::{ListSplitsQuery, SplitState, metastore_test_suite};\n\n    #[async_trait]\n    
impl ReadWriteShardsForTest for PostgresqlMetastore {\n        async fn insert_shards(\n            &self,\n            index_uid: &IndexUid,\n            source_id: &SourceId,\n            shards: Vec<Shard>,\n        ) {\n            const INSERT_SHARD_QUERY: &str = include_str!(\"queries/shards/insert.sql\");\n\n            for shard in shards {\n                assert_eq!(&shard.source_id, source_id);\n                assert_eq!(shard.index_uid(), index_uid);\n                // explicit destructuring to ensure new fields are properly handled\n                let Shard {\n                    doc_mapping_uid,\n                    follower_id,\n                    index_uid,\n                    leader_id,\n                    publish_position_inclusive,\n                    publish_token,\n                    shard_id,\n                    shard_state,\n                    source_id,\n                    update_timestamp,\n                } = shard;\n                let shard_state_name = ShardState::try_from(shard_state)\n                    .unwrap()\n                    .as_json_str_name();\n                let update_timestamp = OffsetDateTime::from_unix_timestamp(update_timestamp)\n                    .expect(\"Bad timestamp format\");\n                sqlx::query(INSERT_SHARD_QUERY)\n                    .bind(index_uid)\n                    .bind(source_id)\n                    .bind(shard_id.unwrap())\n                    .bind(shard_state_name)\n                    .bind(leader_id)\n                    .bind(follower_id)\n                    .bind(doc_mapping_uid)\n                    .bind(publish_position_inclusive.unwrap().to_string())\n                    .bind(publish_token)\n                    .bind(update_timestamp)\n                    .execute(&self.connection_pool)\n                    .await\n                    .unwrap();\n            }\n        }\n\n        async fn list_all_shards(&self, index_uid: &IndexUid, source_id: &SourceId) -> 
Vec<Shard> {\n            let pg_shards: Vec<PgShard> = sqlx::query_as(\n                r#\"\n                SELECT *\n                FROM shards\n                WHERE\n                    index_uid = $1\n                    AND source_id = $2\n                \"#,\n            )\n            .bind(index_uid)\n            .bind(source_id)\n            .fetch_all(&self.connection_pool)\n            .await\n            .unwrap();\n\n            pg_shards\n                .into_iter()\n                .map(|pg_shard| pg_shard.into())\n                .collect()\n        }\n    }\n\n    metastore_test_suite!(crate::PostgresqlMetastore);\n\n    #[tokio::test]\n    async fn test_metastore_connectivity_and_endpoints() {\n        let metastore = PostgresqlMetastore::default_for_test().await;\n        metastore.check_connectivity().await.unwrap();\n        assert_eq!(metastore.endpoints()[0].protocol(), Protocol::PostgreSQL);\n    }\n\n    #[test]\n    fn test_single_sql_query_builder() {\n        let mut select_statement = Query::select();\n\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"split_state\" IN ('Staged')\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Published);\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            
sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"split_state\" IN ('Published')\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_states([SplitState::Published, SplitState::MarkedForDeletion]);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"split_state\" IN ('Published', 'MarkedForDeletion')\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_update_timestamp_lt(51);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"update_timestamp\" < TO_TIMESTAMP(51)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_create_timestamp_lte(55);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"create_timestamp\" <= TO_TIMESTAMP(55)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = 
select_statement.column(Asterisk).from(Splits::Table);\n\n        let maturity_evaluation_datetime = OffsetDateTime::from_unix_timestamp(55).unwrap();\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .retain_mature(maturity_evaluation_datetime);\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (\"maturity_timestamp\" = TO_TIMESTAMP(0) OR \"maturity_timestamp\" <= TO_TIMESTAMP(55))\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .retain_immature(maturity_evaluation_datetime);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"maturity_timestamp\" > TO_TIMESTAMP(55)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_delete_opstamp_gte(4);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"delete_opstamp\" >= 4\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_time_range_start_gt(45);\n        
append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (\"time_range_end\" > 45 OR \"time_range_end\" IS NULL)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_time_range_end_lt(45);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (\"time_range_start\" < 45 OR \"time_range_start\" IS NULL)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_tags_filter(TagFilterAst::Tag {\n                is_present: false,\n                tag: \"tag-2\".to_string(),\n            });\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (NOT ($$tag-2$$ = ANY(tags)))\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_offset(4);\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') ORDER BY \"split_id\" ASC 
OFFSET 4\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).sort_by_index_uid();\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') ORDER BY \"index_uid\" ASC, \"split_id\" ASC\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).after_split(&crate::SplitMetadata {\n                index_uid: index_uid.clone(),\n                split_id: \"my_split\".to_string(),\n                ..Default::default()\n            });\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (\"index_uid\", \"split_id\") > ('{index_uid}', 'my_split')\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_all_indexes().with_split_state(SplitState::Staged);\n        append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            r#\"SELECT * FROM \"splits\" WHERE \"split_state\" IN ('Staged')\"#\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_all_indexes().with_max_time_range_end(42);\n        
append_query_filters_and_order_by(sql, &query);\n\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            r#\"SELECT * FROM \"splits\" WHERE \"time_range_end\" <= 42\"#\n        );\n    }\n\n    #[test]\n    fn test_combination_sql_query_builder() {\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let index_uid = IndexUid::new_with_random_ulid(\"test-index\");\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_time_range_start_gt(0)\n            .with_time_range_end_lt(40);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (\"time_range_end\" > 0 OR \"time_range_end\" IS NULL) AND (\"time_range_start\" < 40 OR \"time_range_start\" IS NULL)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_time_range_start_gt(45)\n            .with_delete_opstamp_gt(0);\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND (\"time_range_end\" > 45 OR \"time_range_end\" IS NULL) AND \"delete_opstamp\" > 0\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_update_timestamp_lt(51)\n            .with_create_timestamp_lte(63);\n        append_query_filters_and_order_by(sql, &query);\n   
     assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND \"update_timestamp\" < TO_TIMESTAMP(51) AND \"create_timestamp\" <= TO_TIMESTAMP(63)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_time_range_start_gt(90)\n            .with_tags_filter(TagFilterAst::Tag {\n                is_present: true,\n                tag: \"tag-1\".to_string(),\n            });\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}') AND ($$tag-1$$ = ANY(tags)) AND (\"time_range_end\" > 90 OR \"time_range_end\" IS NULL)\"#\n            )\n        );\n\n        let mut select_statement = Query::select();\n        let sql = select_statement.column(Asterisk).from(Splits::Table);\n\n        let index_uid_2 = IndexUid::new_with_random_ulid(\"test-index-2\");\n        let query =\n            ListSplitsQuery::try_from_index_uids(vec![index_uid.clone(), index_uid_2.clone()])\n                .unwrap();\n        append_query_filters_and_order_by(sql, &query);\n        assert_eq!(\n            sql.to_string(PostgresQueryBuilder),\n            format!(\n                r#\"SELECT * FROM \"splits\" WHERE \"index_uid\" IN ('{index_uid}', '{index_uid_2}')\"#\n            )\n        );\n    }\n\n    #[test]\n    fn test_index_id_pattern_like_query() {\n        assert_eq!(\n            &build_index_id_patterns_sql_query(&[\"*-index-*-last*\".to_string()]).unwrap(),\n            \"SELECT * FROM indexes WHERE (index_id LIKE '%-index-%-last%')\"\n        );\n        assert_eq!(\n            
&build_index_id_patterns_sql_query(&[\n                \"*-index-*-last*\".to_string(),\n                \"another-index\".to_string()\n            ])\n            .unwrap(),\n            \"SELECT * FROM indexes WHERE (index_id LIKE '%-index-%-last%' OR index_id = \\\n             'another-index')\"\n        );\n        assert_eq!(\n            &build_index_id_patterns_sql_query(&[\n                \"*-index-*-last**\".to_string(),\n                \"another-index\".to_string(),\n                \"*\".to_string()\n            ])\n            .unwrap(),\n            \"SELECT * FROM indexes\"\n        );\n        assert_eq!(\n            build_index_id_patterns_sql_query(&[\"*-index-*-&-last**\".to_string()])\n                .unwrap_err()\n                .to_string(),\n            \"internal error: failed to build list indexes query; cause: `index ID pattern \\\n             `*-index-*-&-last**` is invalid: patterns must match the following regular \\\n             expression: `^[a-zA-Z\\\\*][a-zA-Z0-9-_\\\\.\\\\*]{0,254}$``\"\n        );\n\n        assert_eq!(\n            &build_index_id_patterns_sql_query(&[\"*\".to_string(), \"-index-name\".to_string()])\n                .unwrap(),\n            \"SELECT * FROM indexes WHERE (index_id LIKE '%') AND index_id <> 'index-name'\"\n        );\n\n        assert_eq!(\n            &build_index_id_patterns_sql_query(&[\n                \"*-index-*-last*\".to_string(),\n                \"another-index\".to_string(),\n                \"-*-index-1-last*\".to_string(),\n                \"-index-2-last\".to_string(),\n            ])\n            .unwrap(),\n            \"SELECT * FROM indexes WHERE (index_id LIKE '%-index-%-last%' OR index_id = \\\n             'another-index') AND index_id NOT LIKE '%-index-1-last%' AND index_id <> \\\n             'index-2-last'\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{IntGauge, new_gauge};\n\n#[derive(Clone)]\npub(super) struct PostgresMetrics {\n    pub acquire_connections: IntGauge,\n    pub active_connections: IntGauge,\n    pub idle_connections: IntGauge,\n}\n\nimpl Default for PostgresMetrics {\n    fn default() -> Self {\n        Self {\n            acquire_connections: new_gauge(\n                \"acquire_connections\",\n                \"Number of connections being acquired.\",\n                \"metastore\",\n                &[],\n            ),\n            active_connections: new_gauge(\n                \"active_connections\",\n                \"Number of active (used + idle) connections.\",\n                \"metastore\",\n                &[],\n            ),\n            idle_connections: new_gauge(\n                \"idle_connections\",\n                \"Number of idle connections.\",\n                \"metastore\",\n                &[],\n            ),\n        }\n    }\n}\n\npub(super) static POSTGRES_METRICS: Lazy<PostgresMetrics> = Lazy::new(PostgresMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/migrator.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\n\nuse quickwit_proto::metastore::{MetastoreError, MetastoreResult};\nuse sqlx::migrate::{Migrate, Migrator};\nuse sqlx::{Acquire, PgConnection, Postgres};\nuse tracing::{error, instrument};\n\nuse super::pool::TrackedPool;\n\nfn get_migrations() -> Migrator {\n    sqlx::migrate!(\"migrations/postgresql\")\n}\n\n/// Initializes the database and runs the SQL migrations stored in the\n/// `quickwit-metastore/migrations` directory.\n#[instrument(skip_all)]\npub(super) async fn run_migrations(\n    pool: &TrackedPool<Postgres>,\n    skip_migrations: bool,\n    skip_locking: bool,\n) -> MetastoreResult<()> {\n    let mut tx = pool.begin().await?;\n    let conn = tx.acquire().await?;\n\n    let mut migrator = get_migrations();\n\n    if skip_locking {\n        migrator.set_locking(false);\n    }\n\n    if !skip_migrations {\n        // this is a hidden function, made to get around the annoying \"implementation of `Acquire`\n        // is not general enough\" error, which is the error we get otherwise.\n        let migrate_result = migrator.run_direct(conn).await;\n\n        let Err(migrate_error) = migrate_result else {\n            tx.commit().await?;\n            return Ok(());\n        };\n        tx.rollback().await?;\n        error!(error=%migrate_error, \"failed to run PostgreSQL migrations\");\n\n        
Err(MetastoreError::Internal {\n            message: \"failed to run PostgreSQL migrations\".to_string(),\n            cause: migrate_error.to_string(),\n        })\n    } else {\n        check_migrations(migrator, conn).await\n    }\n}\n\nasync fn check_migrations(migrator: Migrator, conn: &mut PgConnection) -> MetastoreResult<()> {\n    let dirty = match conn.dirty_version().await {\n        Ok(dirty) => dirty,\n        Err(migrate_error) => {\n            error!(error=%migrate_error, \"failed to validate PostgreSQL migrations\");\n\n            return Err(MetastoreError::Internal {\n                message: \"failed to validate PostgreSQL migrations\".to_string(),\n                cause: migrate_error.to_string(),\n            });\n        }\n    };\n    if let Some(dirty) = dirty {\n        error!(\"migration {dirty} is dirty\");\n\n        return Err(MetastoreError::Internal {\n            message: \"failed to validate PostgreSQL migrations\".to_string(),\n            cause: format!(\"migration {dirty} is dirty\"),\n        });\n    };\n    let applied_migrations = match conn.list_applied_migrations().await {\n        Ok(applied_migrations) => applied_migrations,\n        Err(migrate_error) => {\n            error!(error=%migrate_error, \"failed to validate PostgreSQL migrations\");\n\n            return Err(MetastoreError::Internal {\n                message: \"failed to validate PostgreSQL migrations\".to_string(),\n                cause: migrate_error.to_string(),\n            });\n        }\n    };\n    let expected_migrations: BTreeMap<_, _> = migrator\n        .iter()\n        .filter(|migration| migration.migration_type.is_up_migration())\n        .map(|migration| (migration.version, migration))\n        .collect();\n    if applied_migrations.len() < expected_migrations.len() {\n        error!(\n            \"missing migrations, expected {} migrations, only {} present in database\",\n            expected_migrations.len(),\n            
applied_migrations.len()\n        );\n\n        return Err(MetastoreError::Internal {\n            message: \"failed to validate PostgreSQL migrations\".to_string(),\n            cause: format!(\n                \"missing migrations, expected {} migrations, only {} present in database\",\n                expected_migrations.len(),\n                applied_migrations.len()\n            ),\n        });\n    }\n    for applied_migration in applied_migrations {\n        let Some(migration) = expected_migrations.get(&applied_migration.version) else {\n            error!(\n                \"found unknown migration {} in database\",\n                applied_migration.version\n            );\n\n            return Err(MetastoreError::Internal {\n                message: \"failed to validate PostgreSQL migrations\".to_string(),\n                cause: format!(\n                    \"found unknown migration {} in database\",\n                    applied_migration.version\n                ),\n            });\n        };\n        if migration.checksum != applied_migration.checksum {\n            error!(\n                \"migration {} differ between database and expected value\",\n                applied_migration.version\n            );\n\n            return Err(MetastoreError::Internal {\n                message: \"failed to validate PostgreSQL migrations\".to_string(),\n                cause: format!(\n                    \"migration {} differ between database and expected value\",\n                    applied_migration.version\n                ),\n            });\n        }\n    }\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use quickwit_common::uri::Uri;\n    use sqlx::Acquire;\n    use sqlx::migrate::Migrate;\n\n    use super::{get_migrations, run_migrations};\n    use crate::metastore::postgres::utils::establish_connection;\n\n    #[tokio::test]\n    #[serial_test::file_serial]\n    async fn test_metastore_check_migration() {\n        
let _ = tracing_subscriber::fmt::try_init();\n\n        dotenvy::dotenv().ok();\n        let uri: Uri = std::env::var(\"QW_TEST_DATABASE_URL\")\n            .expect(\"environment variable `QW_TEST_DATABASE_URL` should be set\")\n            .parse()\n            .expect(\"environment variable `QW_TEST_DATABASE_URL` should be a valid URI\");\n\n        {\n            let connection_pool =\n                establish_connection(&uri, 1, 5, Duration::from_secs(2), None, None, false)\n                    .await\n                    .unwrap();\n            // make sure migrations are run\n            run_migrations(&connection_pool, false, false)\n                .await\n                .unwrap();\n\n            // we just ran migration, nothing else to run\n            run_migrations(&connection_pool, true, false).await.unwrap();\n\n            let migrations = get_migrations();\n            let last_migration = migrations\n                .iter()\n                .map(|migration| migration.version)\n                .max()\n                .expect(\"no migration exists?\");\n            let up_migration = migrations\n                .iter()\n                .find(|migration| {\n                    migration.version == last_migration\n                        && migration.migration_type.is_up_migration()\n                })\n                .unwrap();\n            let down_migration = migrations\n                .iter()\n                .find(|migration| {\n                    migration.version == last_migration\n                        && migration.migration_type.is_down_migration()\n                })\n                .unwrap();\n            let mut conn = connection_pool.acquire().await.unwrap();\n\n            conn.revert(down_migration).await.unwrap();\n\n            run_migrations(&connection_pool, true, false)\n                .await\n                .unwrap_err();\n\n            conn.apply(up_migration).await.unwrap();\n        }\n\n        {\n            let 
connection_pool =\n                establish_connection(&uri, 1, 5, Duration::from_secs(2), None, None, true)\n                    .await\n                    .unwrap();\n            // error because we are in read only mode, and we try to run migrations\n            run_migrations(&connection_pool, false, false)\n                .await\n                .unwrap_err();\n            // okay because all migrations were already run before\n            run_migrations(&connection_pool, true, false).await.unwrap();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod error;\nmod factory;\nmod metastore;\nmod metrics;\nmod migrator;\nmod model;\nmod pool;\nmod split_stream;\nmod tags;\nmod utils;\n\npub use factory::PostgresqlMetastoreFactory;\npub use metastore::PostgresqlMetastore;\n\nconst QW_POSTGRES_SKIP_MIGRATIONS_ENV_KEY: &str = \"QW_POSTGRES_SKIP_MIGRATIONS\";\nconst QW_POSTGRES_SKIP_MIGRATION_LOCKING_ENV_KEY: &str = \"QW_POSTGRES_SKIP_MIGRATION_LOCKING\";\nconst QW_POSTGRES_READ_ONLY_ENV_KEY: &str = \"QW_POSTGRES_READ_ONLY\";\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/model.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![allow(dead_code)]\n\nuse std::convert::TryInto;\nuse std::str::FromStr;\n\nuse quickwit_proto::ingest::{Shard, ShardState};\nuse quickwit_proto::metastore::{DeleteQuery, DeleteTask, MetastoreError, MetastoreResult};\nuse quickwit_proto::types::{DocMappingUid, IndexId, IndexUid, ShardId, SourceId, SplitId};\nuse sea_query::{Iden, Write};\nuse tracing::error;\n\nuse crate::{IndexMetadata, Split, SplitMetadata, SplitState};\n\n#[derive(Iden, Clone, Copy)]\n#[allow(dead_code)]\npub enum Indexes {\n    Table,\n    IndexUid,\n    IndexId,\n    IndexMetadataJson,\n    CreateTimestamp,\n}\n\n/// A model structure for handling index metadata in a database.\n#[derive(sqlx::FromRow)]\npub(super) struct PgIndex {\n    /// Index UID. The index UID identifies the index when querying the metastore from the\n    /// application.\n    #[sqlx(try_from = \"String\")]\n    pub index_uid: IndexUid,\n    /// Index ID. 
The index ID is used to resolve user queries.\n    pub index_id: IndexId,\n    // A JSON string containing all of the IndexMetadata.\n    pub index_metadata_json: String,\n    /// Timestamp for tracking when the index was created.\n    pub create_timestamp: sqlx::types::time::PrimitiveDateTime,\n}\n\nimpl PgIndex {\n    /// Deserializes index metadata from JSON string stored in column and sets appropriate\n    /// timestamps.\n    pub fn index_metadata(&self) -> MetastoreResult<IndexMetadata> {\n        let mut index_metadata = serde_json::from_str::<IndexMetadata>(&self.index_metadata_json)\n            .map_err(|error| {\n            error!(index_id=%self.index_id, error=?error, \"failed to deserialize index metadata\");\n\n            MetastoreError::JsonDeserializeError {\n                struct_name: \"IndexMetadata\".to_string(),\n                message: error.to_string(),\n            }\n        })?;\n        // `create_timestamp` and `update_timestamp` are stored in dedicated columns but are also\n        // duplicated in [`IndexMetadata`]. 
We must override the duplicates with the authentic\n        // values upon deserialization.\n        index_metadata.create_timestamp = self.create_timestamp.assume_utc().unix_timestamp();\n        Ok(index_metadata)\n    }\n}\n\n#[derive(Iden, Clone, Copy)]\n#[allow(dead_code)]\npub enum Splits {\n    Table,\n    SplitId,\n    SplitState,\n    TimeRangeStart,\n    TimeRangeEnd,\n    CreateTimestamp,\n    UpdateTimestamp,\n    PublishTimestamp,\n    MaturityTimestamp,\n    Tags,\n    SplitMetadataJson,\n    IndexUid,\n    NodeId,\n    DeleteOpstamp,\n}\n\npub(super) struct ToTimestampFunc;\n\nimpl Iden for ToTimestampFunc {\n    fn unquoted(&self, s: &mut dyn Write) {\n        write!(s, \"TO_TIMESTAMP\").unwrap()\n    }\n}\n\n/// A model structure for handling split metadata in a database.\n#[derive(sqlx::FromRow)]\npub(super) struct PgSplit {\n    /// Split ID.\n    pub split_id: SplitId,\n    /// The state of the split. With `update_timestamp`, this is the only mutable attribute of the\n    /// split.\n    pub split_state: String,\n    /// If a timestamp field is available, the min timestamp of the split.\n    pub time_range_start: Option<i64>,\n    /// If a timestamp field is available, the max timestamp of the split.\n    pub time_range_end: Option<i64>,\n    /// Timestamp for tracking when the split was created.\n    pub create_timestamp: sqlx::types::time::PrimitiveDateTime,\n    /// Timestamp for tracking when the split was last updated.\n    pub update_timestamp: sqlx::types::time::PrimitiveDateTime,\n    /// Timestamp for tracking when the split was published.\n    pub publish_timestamp: Option<sqlx::types::time::PrimitiveDateTime>,\n    /// Timestamp for tracking when the split becomes mature.\n    /// If a split is already mature, this timestamp is set to 0.\n    pub maturity_timestamp: sqlx::types::time::PrimitiveDateTime,\n    /// A list of tags for categorizing and searching group of splits.\n    pub tags: Vec<String>,\n    // The split's metadata 
serialized as a JSON string.\n    pub split_metadata_json: String,\n    /// Index UID. It is used as a foreign key in the database.\n    #[sqlx(try_from = \"String\")]\n    pub index_uid: IndexUid,\n    /// Delete opstamp.\n    pub delete_opstamp: i64,\n}\n\nimpl PgSplit {\n    /// Deserializes and returns the split's metadata.\n    fn split_metadata(&self) -> MetastoreResult<SplitMetadata> {\n        serde_json::from_str::<SplitMetadata>(&self.split_metadata_json).map_err(|error| {\n            error!(index_id=%self.index_uid.index_id, split_id=%self.split_id, error=?error, \"failed to deserialize split metadata\");\n\n            MetastoreError::JsonDeserializeError {\n                struct_name: \"SplitMetadata\".to_string(),\n                message: error.to_string(),\n            }\n        })\n    }\n\n    /// Deserializes and returns the split's state.\n    fn split_state(&self) -> MetastoreResult<SplitState> {\n        SplitState::from_str(&self.split_state).map_err(|error| {\n            error!(index_id=%self.index_uid.index_id, split_id=%self.split_id, split_state=?self.split_state, error=?error, \"failed to deserialize split state\");\n            MetastoreError::JsonDeserializeError {\n                struct_name: \"SplitState\".to_string(),\n                message: error,\n            }\n        })\n    }\n}\n\nimpl TryInto<Split> for PgSplit {\n    type Error = MetastoreError;\n\n    fn try_into(self) -> Result<Split, Self::Error> {\n        let mut split_metadata = self.split_metadata()?;\n        // `create_timestamp` and `delete_opstamp` are duplicated in `SplitMetadata` and need to be\n        // overridden with the \"true\" values stored in columns.\n        split_metadata.create_timestamp = self.create_timestamp.assume_utc().unix_timestamp();\n        let split_state = self.split_state()?;\n        let update_timestamp = self.update_timestamp.assume_utc().unix_timestamp();\n        let publish_timestamp = self\n            
.publish_timestamp\n            .map(|publish_timestamp| publish_timestamp.assume_utc().unix_timestamp());\n        split_metadata.index_uid = self.index_uid;\n        split_metadata.delete_opstamp = self.delete_opstamp as u64;\n        Ok(Split {\n            split_metadata,\n            split_state,\n            update_timestamp,\n            publish_timestamp,\n        })\n    }\n}\n\n/// A model structure for handling delete tasks in a database.\n#[derive(sqlx::FromRow)]\npub(super) struct PgDeleteTask {\n    /// Create timestamp.\n    pub create_timestamp: sqlx::types::time::PrimitiveDateTime,\n    /// Monotonic increasing unique opstamp.\n    pub opstamp: i64,\n    /// Index uid.\n    #[sqlx(try_from = \"String\")]\n    pub index_uid: IndexUid,\n    /// Query serialized as a JSON string.\n    pub delete_query_json: String,\n}\n\nimpl PgDeleteTask {\n    /// Deserializes and returns the delete query.\n    fn delete_query(&self) -> MetastoreResult<DeleteQuery> {\n        serde_json::from_str::<DeleteQuery>(&self.delete_query_json).map_err(|error| {\n            error!(index_id=%self.index_uid.index_id, opstamp=%self.opstamp, error=?error, \"failed to deserialize delete query\");\n\n            MetastoreError::JsonDeserializeError {\n                struct_name: \"DeleteQuery\".to_string(),\n                message: error.to_string(),\n            }\n        })\n    }\n}\n\nimpl TryInto<DeleteTask> for PgDeleteTask {\n    type Error = MetastoreError;\n\n    fn try_into(self) -> Result<DeleteTask, Self::Error> {\n        let delete_query = self.delete_query()?;\n        Ok(DeleteTask {\n            create_timestamp: self.create_timestamp.assume_utc().unix_timestamp(),\n            opstamp: self.opstamp as u64,\n            delete_query: Some(delete_query),\n        })\n    }\n}\n\n#[derive(Iden, Clone, Copy)]\npub(super) enum Shards {\n    Table,\n    IndexUid,\n    SourceId,\n    ShardId,\n    ShardState,\n    LeaderId,\n    FollowerId,\n    
PublishPositionInclusive,\n    PublishToken,\n}\n\n#[derive(sqlx::Type, PartialEq, Debug)]\n#[sqlx(type_name = \"SHARD_STATE\", rename_all = \"snake_case\")]\npub(super) enum PgShardState {\n    Unspecified,\n    Open,\n    Unavailable,\n    Closed,\n}\n\nimpl From<PgShardState> for ShardState {\n    fn from(pg_shard_state: PgShardState) -> Self {\n        match pg_shard_state {\n            PgShardState::Unspecified => ShardState::Unspecified,\n            PgShardState::Open => ShardState::Open,\n            PgShardState::Unavailable => ShardState::Unavailable,\n            PgShardState::Closed => ShardState::Closed,\n        }\n    }\n}\n\n#[derive(sqlx::FromRow, Debug)]\npub(super) struct PgShard {\n    #[sqlx(try_from = \"String\")]\n    pub index_uid: IndexUid,\n    #[sqlx(try_from = \"String\")]\n    pub source_id: SourceId,\n    #[sqlx(try_from = \"String\")]\n    pub shard_id: ShardId,\n    pub leader_id: String,\n    pub follower_id: Option<String>,\n    pub shard_state: PgShardState,\n    #[sqlx(try_from = \"String\")]\n    pub doc_mapping_uid: DocMappingUid,\n    pub publish_position_inclusive: String,\n    pub publish_token: Option<String>,\n    pub update_timestamp: sqlx::types::time::PrimitiveDateTime,\n}\n\nimpl From<PgShard> for Shard {\n    fn from(pg_shard: PgShard) -> Self {\n        Shard {\n            index_uid: Some(pg_shard.index_uid),\n            source_id: pg_shard.source_id,\n            shard_id: Some(pg_shard.shard_id),\n            shard_state: ShardState::from(pg_shard.shard_state) as i32,\n            leader_id: pg_shard.leader_id,\n            follower_id: pg_shard.follower_id,\n            doc_mapping_uid: Some(pg_shard.doc_mapping_uid),\n            publish_position_inclusive: Some(pg_shard.publish_position_inclusive.into()),\n            publish_token: pg_shard.publish_token,\n            update_timestamp: pg_shard.update_timestamp.assume_utc().unix_timestamp(),\n        }\n    }\n}\n\n#[derive(sqlx::FromRow, Debug)]\npub(super) 
struct PgIndexTemplate {\n    pub index_template_json: String,\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/pool.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse futures::future::BoxFuture;\nuse futures::stream::BoxStream;\nuse quickwit_common::metrics::GaugeGuard;\nuse sqlx::pool::PoolConnection;\nuse sqlx::pool::maybe::MaybePoolConnection;\nuse sqlx::{\n    Acquire, Database, Describe, Either, Error, Execute, Executor, Pool, Postgres, Transaction,\n};\n\nuse super::metrics::POSTGRES_METRICS;\n\n#[derive(Debug)]\npub(super) struct TrackedPool<DB: Database> {\n    inner_pool: Pool<DB>,\n}\n\nimpl TrackedPool<Postgres> {\n    pub fn new(inner_pool: Pool<Postgres>) -> Self {\n        Self { inner_pool }\n    }\n}\n\nimpl<DB: Database> Clone for TrackedPool<DB> {\n    fn clone(&self) -> Self {\n        Self {\n            inner_pool: self.inner_pool.clone(),\n        }\n    }\n}\n\nimpl<'a, DB: Database> Acquire<'a> for &TrackedPool<DB> {\n    type Database = DB;\n\n    type Connection = PoolConnection<DB>;\n\n    fn acquire(self) -> BoxFuture<'static, Result<Self::Connection, Error>> {\n        let acquire_conn_fut = self.inner_pool.acquire();\n\n        POSTGRES_METRICS\n            .active_connections\n            .set(self.inner_pool.size() as i64);\n        POSTGRES_METRICS\n            .idle_connections\n            .set(self.inner_pool.num_idle() as i64);\n\n        Box::pin(async move {\n            let mut gauge_guard = GaugeGuard::from_gauge(&POSTGRES_METRICS.acquire_connections);\n           
 gauge_guard.add(1);\n\n            let conn = acquire_conn_fut.await?;\n            Ok(conn)\n        })\n    }\n\n    fn begin(self) -> BoxFuture<'static, Result<Transaction<'a, DB>, Error>> {\n        let acquire_conn_fut = self.acquire();\n\n        Box::pin(async move {\n            Transaction::begin(\n                MaybePoolConnection::PoolConnection(acquire_conn_fut.await?),\n                None,\n            )\n            .await\n        })\n    }\n}\n\nimpl<DB: Database> Executor<'_> for &TrackedPool<DB>\nwhere for<'c> &'c mut DB::Connection: Executor<'c, Database = DB>\n{\n    type Database = DB;\n\n    fn fetch_many<'e, 'q: 'e, E>(\n        self,\n        query: E,\n    ) -> BoxStream<'e, Result<Either<DB::QueryResult, DB::Row>, Error>>\n    where\n        E: Execute<'q, Self::Database> + 'q,\n    {\n        self.inner_pool.fetch_many(query)\n    }\n\n    fn fetch_optional<'e, 'q: 'e, E>(\n        self,\n        query: E,\n    ) -> BoxFuture<'e, Result<Option<DB::Row>, Error>>\n    where\n        E: Execute<'q, Self::Database> + 'q,\n    {\n        self.inner_pool.fetch_optional(query)\n    }\n\n    fn prepare_with<'e, 'q: 'e>(\n        self,\n        sql: &'q str,\n        parameters: &'e [<Self::Database as Database>::TypeInfo],\n    ) -> BoxFuture<'e, Result<<Self::Database as Database>::Statement<'q>, Error>> {\n        self.inner_pool.prepare_with(sql, parameters)\n    }\n\n    #[doc(hidden)]\n    fn describe<'e, 'q: 'e>(\n        self,\n        sql: &'q str,\n    ) -> BoxFuture<'e, Result<Describe<Self::Database>, Error>> {\n        self.inner_pool.describe(sql)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/index_templates/find.sql",
    "content": "SELECT DISTINCT ON (index_id)\n    index_id,\n    template_id,\n    index_template_json\nFROM\n    unnest($1) AS index_ids(index_id)\n    JOIN index_templates ON index_ids.index_id LIKE ANY (index_templates.positive_index_id_patterns)\n        AND (cardinality(index_templates.negative_index_id_patterns) = 0\n            OR index_ids.index_id NOT LIKE ANY (index_templates.negative_index_id_patterns))\n    ORDER BY\n        index_id,\n        - priority,\n        template_id ASC\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/index_templates/insert.sql",
    "content": "INSERT INTO index_templates(template_id, positive_index_id_patterns, negative_index_id_patterns, priority, index_template_json)\n    VALUES ($1, $2, $3, $4, $5)\nON CONFLICT (template_id)\n    DO NOTHING\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/index_templates/upsert.sql",
    "content": "INSERT INTO index_templates(template_id, positive_index_id_patterns, negative_index_id_patterns, priority, index_template_json)\n    VALUES ($1, $2, $3, $4, $5)\nON CONFLICT (template_id)\n    DO UPDATE SET\n        positive_index_id_patterns = $2, negative_index_id_patterns = $3, priority = $4, index_template_json = $5\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/indexes_metadata.sql",
    "content": "SELECT\n    *\nFROM\n    indexes\nWHERE\n    index_id = ANY ($1)\n    OR index_uid = ANY ($2)\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/acquire.sql",
    "content": "UPDATE\n    shards\nSET\n    publish_token = $4\nWHERE\n    index_uid = $1\n    AND source_id = $2\n    AND shard_id = ANY ($3)\nRETURNING\n    *\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/delete.sql",
    "content": "DELETE FROM shards\nWHERE index_uid = $1\n    AND source_id = $2\n    AND shard_id = ANY ($3)\n    AND ($4\n        OR publish_position_inclusive LIKE '~%')\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/fetch.sql",
    "content": "SELECT\n    *\nFROM\n    shards\nWHERE\n    index_uid = $1\n    AND source_id = $2\n    AND shard_id = $3\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/find_not_deletable.sql",
    "content": "SELECT\n    *\nFROM\n    shards\nWHERE\n    index_uid = $1\n    AND source_id = $2\n    AND shard_id = ANY ($3)\n    AND publish_position_inclusive NOT LIKE '~%'\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/insert.sql",
    "content": "INSERT INTO shards(index_uid, source_id, shard_id, shard_state, leader_id, follower_id, doc_mapping_uid, publish_position_inclusive, publish_token, update_timestamp)\n    VALUES ($1, $2, $3, CAST($4 AS SHARD_STATE), $5, $6, $7, $8, $9, $10)\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/open.sql",
    "content": "INSERT INTO shards(index_uid, source_id, shard_id, leader_id, follower_id, doc_mapping_uid, publish_token, update_timestamp)\n    VALUES ($1, $2, $3, $4, $5, $6, $7, $8)\nON CONFLICT\n    DO NOTHING\nRETURNING\n    *\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/prune_age.sql",
    "content": "DELETE FROM shards\nWHERE index_uid = $1\n    AND source_id = $2\n    AND update_timestamp < $3\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/queries/shards/prune_count.sql",
    "content": "WITH recent_shards AS (\n    SELECT shard_id\n    FROM shards\n    WHERE index_uid = $1\n        AND source_id = $2\n    ORDER BY update_timestamp DESC\n    LIMIT $3\n)\nDELETE FROM shards\nWHERE index_uid = $1\n    AND source_id = $2\n    AND shard_id NOT IN (SELECT shard_id FROM recent_shards)\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/split_stream.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\n\nuse futures::stream::BoxStream;\nuse ouroboros::self_referencing;\nuse sqlx::Postgres;\nuse tokio_stream::Stream;\n\nuse super::pool::TrackedPool;\n\n#[self_referencing(pub_extras)]\npub struct SplitStream<T> {\n    connection_pool: TrackedPool<Postgres>,\n    sql: String,\n    #[borrows(connection_pool, sql)]\n    #[covariant]\n    inner: BoxStream<'this, Result<T, sqlx::Error>>,\n}\n\nimpl<T> Stream for SplitStream<T> {\n    type Item = Result<T, sqlx::Error>;\n\n    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {\n        SplitStream::with_inner_mut(&mut self, |this| Pin::new(&mut this.as_mut()).poll_next(cx))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/tags.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_doc_mapper::tag_pruning::TagFilterAst;\nuse sea_query::{Cond, Expr, all};\n\n// We use dollar-quoted strings in PostgreSQL.\n//\n// In order to ensure that we do not risk SQL injection,\n// we need to generate a string that does not appear in\n// the literal we want to dollar quote.\nfn generate_dollar_guard(tag: &str) -> String {\n    if !tag.contains('$') {\n        // That's our happy path here.\n        return String::new();\n    }\n    let mut dollar_guard = String::new();\n    loop {\n        dollar_guard.push_str(\"QuickwitGuard\");\n        // This terminates because `dollar_guard`\n        // will eventually be longer than `tag`.\n        if !tag.contains(&dollar_guard) {\n            return dollar_guard;\n        }\n    }\n}\n\n/// Takes a tag filter AST and returns a SQL expression that can be used as\n/// a filter.\npub(super) fn generate_sql_condition(tag_ast: &TagFilterAst) -> Cond {\n    match tag_ast {\n        TagFilterAst::And(child_asts) => {\n            if child_asts.is_empty() {\n                return all![Expr::cust(\"TRUE\")];\n            }\n            child_asts\n                .iter()\n                .map(generate_sql_condition)\n                .fold(Cond::all(), |cond, child_cond| cond.add(child_cond))\n        }\n        TagFilterAst::Or(child_asts) => {\n            if child_asts.is_empty() {\n    
            return all![Expr::cust(\"TRUE\")];\n            }\n            child_asts\n                .iter()\n                .map(generate_sql_condition)\n                .fold(Cond::any(), |cond, child_cond| cond.add(child_cond))\n        }\n        TagFilterAst::Tag { tag, is_present } => {\n            let dollar_guard = generate_dollar_guard(tag);\n            let expr_str = format!(\"${dollar_guard}${tag}${dollar_guard}$ = ANY(tags)\");\n            let expr = if *is_present {\n                Expr::cust(expr_str)\n            } else {\n                Expr::cust(expr_str).not()\n            };\n            all![expr]\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_doc_mapper::tag_pruning::{no_tag, tag};\n    use sea_query::any;\n\n    use super::*;\n\n    fn test_tags_filter_expression_helper(tags_ast: TagFilterAst, expected: Cond) {\n        assert_eq!(generate_sql_condition(&tags_ast), expected);\n    }\n\n    #[test]\n    fn test_tags_filter_expression_single_tag() {\n        let tags_ast = tag(\"my_field:titi\");\n\n        let expected = all![Expr::cust(\"$$my_field:titi$$ = ANY(tags)\")];\n\n        test_tags_filter_expression_helper(tags_ast, expected);\n    }\n\n    #[test]\n    fn test_tags_filter_expression_not_tag() {\n        let expected = all![Expr::cust(\"$$my_field:titi$$ = ANY(tags)\").not()];\n\n        test_tags_filter_expression_helper(no_tag(\"my_field:titi\"), expected);\n    }\n\n    #[test]\n    fn test_tags_filter_expression_ands() {\n        let tags_ast = TagFilterAst::And(vec![tag(\"tag:val1\"), tag(\"tag:val2\"), tag(\"tag:val3\")]);\n\n        let expected = all![\n            Expr::cust(\"$$tag:val1$$ = ANY(tags)\"),\n            Expr::cust(\"$$tag:val2$$ = ANY(tags)\"),\n            Expr::cust(\"$$tag:val3$$ = ANY(tags)\"),\n        ];\n\n        test_tags_filter_expression_helper(tags_ast, expected);\n    }\n\n    #[test]\n    fn test_tags_filter_expression_and_or() {\n        let tags_ast = 
TagFilterAst::Or(vec![\n            TagFilterAst::And(vec![tag(\"tag:val1\"), tag(\"tag:val2\")]),\n            tag(\"tag:val3\"),\n        ]);\n\n        let expected = any![\n            all![\n                Expr::cust(\"$$tag:val1$$ = ANY(tags)\"),\n                Expr::cust(\"$$tag:val2$$ = ANY(tags)\"),\n            ],\n            Expr::cust(\"$$tag:val3$$ = ANY(tags)\"),\n        ];\n\n        test_tags_filter_expression_helper(tags_ast, expected);\n    }\n\n    #[test]\n    fn test_tags_filter_expression_and_or_correct_parenthesis() {\n        let tags_ast = TagFilterAst::And(vec![\n            TagFilterAst::Or(vec![tag(\"tag:val1\"), tag(\"tag:val2\")]),\n            tag(\"tag:val3\"),\n        ]);\n\n        let expected = all![\n            any![\n                Expr::cust(\"$$tag:val1$$ = ANY(tags)\"),\n                Expr::cust(\"$$tag:val2$$ = ANY(tags)\"),\n            ],\n            Expr::cust(\"$$tag:val3$$ = ANY(tags)\"),\n        ];\n\n        test_tags_filter_expression_helper(tags_ast, expected);\n    }\n\n    #[test]\n    fn test_tags_sql_injection_attempt() {\n        let tags_ast = tag(\"tag:$$;DELETE FROM something_evil\");\n\n        let expected = all![Expr::cust(\n            \"$QuickwitGuard$tag:$$;DELETE FROM something_evil$QuickwitGuard$ = ANY(tags)\"\n        ),];\n\n        test_tags_filter_expression_helper(tags_ast, expected);\n\n        let tags_ast = tag(\"tag:$QuickwitGuard$;DELETE FROM something_evil\");\n\n        let expected = all![Expr::cust(\n            \"$QuickwitGuardQuickwitGuard$tag:$QuickwitGuard$;DELETE FROM \\\n             something_evil$QuickwitGuardQuickwitGuard$ = ANY(tags)\"\n        )];\n\n        test_tags_filter_expression_helper(tags_ast, expected);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore/postgres/utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Display;\nuse std::ops::Bound;\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse quickwit_common::uri::Uri;\nuse quickwit_proto::metastore::{MetastoreError, MetastoreResult};\nuse sea_query::{Expr, Func, Order, SelectStatement, any};\nuse sqlx::postgres::{PgConnectOptions, PgPoolOptions};\nuse sqlx::{ConnectOptions, Postgres};\nuse tracing::error;\nuse tracing::log::LevelFilter;\n\nuse super::model::{Splits, ToTimestampFunc};\nuse super::pool::TrackedPool;\nuse super::tags::generate_sql_condition;\nuse crate::metastore::{FilterRange, SortBy};\nuse crate::{ListSplitsQuery, SplitMaturity, SplitMetadata};\n\n/// Establishes a connection to the given database URI.\npub(super) async fn establish_connection(\n    connection_uri: &Uri,\n    min_connections: usize,\n    max_connections: usize,\n    acquire_timeout: Duration,\n    idle_timeout_opt: Option<Duration>,\n    max_lifetime_opt: Option<Duration>,\n    read_only: bool,\n) -> MetastoreResult<TrackedPool<Postgres>> {\n    let pool_options = PgPoolOptions::new()\n        .min_connections(min_connections as u32)\n        .max_connections(max_connections as u32)\n        .acquire_timeout(acquire_timeout)\n        .idle_timeout(idle_timeout_opt)\n        .max_lifetime(max_lifetime_opt);\n\n    let mut connect_options: PgConnectOptions =\n        
PgConnectOptions::from_str(connection_uri.as_str())?\n            .application_name(\"quickwit-metastore\")\n            .log_statements(LevelFilter::Info);\n\n    if read_only {\n        // this isn't a security mechanism, only a safeguard against involuntary misuse\n        connect_options = connect_options.options([(\"default_transaction_read_only\", \"on\")]);\n    }\n    let sqlx_pool = pool_options\n        .connect_with(connect_options)\n        .await\n        .map_err(|error| {\n            error!(connection_uri=%connection_uri, error=?error, \"failed to establish connection to database\");\n            MetastoreError::Connection {\n                message: error.to_string(),\n            }\n        })?;\n    let tracked_pool = TrackedPool::new(sqlx_pool);\n    Ok(tracked_pool)\n}\n\n/// Extends an existing SQL string with the generated filter range appended to the query.\n///\n/// This method is **not** SQL injection proof and should not be used with user-defined values.\npub(super) fn append_range_filters<V: Display>(\n    sql: &mut SelectStatement,\n    field_name: Splits,\n    filter_range: &FilterRange<V>,\n    value_formatter: impl Fn(&V) -> Expr,\n) {\n    if let Bound::Included(value) = &filter_range.start {\n        sql.cond_where(Expr::col(field_name).gte((value_formatter)(value)));\n    };\n\n    if let Bound::Excluded(value) = &filter_range.start {\n        sql.cond_where(Expr::col(field_name).gt((value_formatter)(value)));\n    };\n\n    if let Bound::Included(value) = &filter_range.end {\n        sql.cond_where(Expr::col(field_name).lte((value_formatter)(value)));\n    };\n\n    if let Bound::Excluded(value) = &filter_range.end {\n        sql.cond_where(Expr::col(field_name).lt((value_formatter)(value)));\n    };\n}\n\npub(super) fn append_query_filters_and_order_by(\n    sql: &mut SelectStatement,\n    query: &ListSplitsQuery,\n) {\n    if let Some(index_uids) = &query.index_uids {\n        // Note: `ListSplitsQuery` builder enforces a non 
empty `index_uids` list.\n        // TODO we should explore IN VALUES, = ANY and similar constructs in case they perform\n        // better.\n        sql.cond_where(Expr::col(Splits::IndexUid).is_in(index_uids));\n    }\n\n    if let Some(node_id) = &query.node_id {\n        sql.cond_where(Expr::col(Splits::NodeId).eq(node_id));\n    };\n\n    if !query.split_states.is_empty() {\n        sql.cond_where(\n            Expr::col(Splits::SplitState)\n                .is_in(query.split_states.iter().map(|val| val.to_string())),\n        );\n    };\n\n    if let Some(tags) = &query.tags {\n        sql.cond_where(generate_sql_condition(tags));\n    };\n\n    if let Some(v) = query.max_time_range_end {\n        sql.cond_where(Expr::col(Splits::TimeRangeEnd).lte(v));\n    }\n\n    match query.time_range.start {\n        Bound::Included(v) => {\n            sql.cond_where(any![\n                Expr::col(Splits::TimeRangeEnd).gte(v),\n                Expr::col(Splits::TimeRangeEnd).is_null()\n            ]);\n        }\n        Bound::Excluded(v) => {\n            sql.cond_where(any![\n                Expr::col(Splits::TimeRangeEnd).gt(v),\n                Expr::col(Splits::TimeRangeEnd).is_null()\n            ]);\n        }\n        Bound::Unbounded => {}\n    };\n\n    match query.time_range.end {\n        Bound::Included(v) => {\n            sql.cond_where(any![\n                Expr::col(Splits::TimeRangeStart).lte(v),\n                Expr::col(Splits::TimeRangeStart).is_null()\n            ]);\n        }\n        Bound::Excluded(v) => {\n            sql.cond_where(any![\n                Expr::col(Splits::TimeRangeStart).lt(v),\n                Expr::col(Splits::TimeRangeStart).is_null()\n            ]);\n        }\n        Bound::Unbounded => {}\n    };\n\n    match &query.mature {\n        Bound::Included(evaluation_datetime) => {\n            sql.cond_where(any![\n                Expr::col(Splits::MaturityTimestamp)\n                    
.eq(Func::cust(ToTimestampFunc).arg(Expr::val(0))),\n                Expr::col(Splits::MaturityTimestamp).lte(\n                    Func::cust(ToTimestampFunc)\n                        .arg(Expr::val(evaluation_datetime.unix_timestamp()))\n                )\n            ]);\n        }\n        Bound::Excluded(evaluation_datetime) => {\n            sql.cond_where(Expr::col(Splits::MaturityTimestamp).gt(\n                Func::cust(ToTimestampFunc).arg(Expr::val(evaluation_datetime.unix_timestamp())),\n            ));\n        }\n        Bound::Unbounded => {}\n    };\n    append_range_filters(\n        sql,\n        Splits::UpdateTimestamp,\n        &query.update_timestamp,\n        |&val| Expr::expr(Func::cust(ToTimestampFunc).arg(Expr::val(val))),\n    );\n    append_range_filters(\n        sql,\n        Splits::CreateTimestamp,\n        &query.create_timestamp,\n        |&val| Expr::expr(Func::cust(ToTimestampFunc).arg(Expr::val(val))),\n    );\n    append_range_filters(sql, Splits::DeleteOpstamp, &query.delete_opstamp, |&val| {\n        Expr::expr(val)\n    });\n\n    if let Some((index_uid, split_id)) = &query.after_split {\n        sql.cond_where(\n            Expr::tuple([\n                Expr::col(Splits::IndexUid).into(),\n                Expr::col(Splits::SplitId).into(),\n            ])\n            .gt(Expr::tuple([Expr::value(index_uid), Expr::value(split_id)])),\n        );\n    }\n\n    match query.sort_by {\n        SortBy::Staleness => {\n            sql.order_by(Splits::DeleteOpstamp, Order::Asc)\n                .order_by(Splits::PublishTimestamp, Order::Asc);\n        }\n        SortBy::IndexUid => {\n            sql.order_by(Splits::IndexUid, Order::Asc)\n                .order_by(Splits::SplitId, Order::Asc);\n        }\n        SortBy::None => (),\n    }\n\n    if let Some(limit) = query.limit {\n        sql.limit(limit as u64);\n    }\n\n    if let Some(offset) = query.offset {\n        sql.order_by(Splits::SplitId, Order::Asc)\n            
.offset(offset as u64);\n    }\n}\n\n/// Returns the unix timestamp at which the split becomes mature.\n/// If the split is mature (`SplitMaturity::Mature`), we return 0\n/// as we don't want the maturity to depend on datetime.\npub(super) fn split_maturity_timestamp(split_metadata: &SplitMetadata) -> i64 {\n    match split_metadata.maturity {\n        SplitMaturity::Mature => 0,\n        SplitMaturity::Immature { maturation_period } => {\n            split_metadata.create_timestamp + maturation_period.as_secs() as i64\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore_factory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{MetastoreBackend, MetastoreConfig};\nuse quickwit_proto::metastore::MetastoreServiceClient;\n\nuse crate::MetastoreResolverError;\n\n/// A metastore factory builds a [`MetastoreServiceClient`] object for a target [`MetastoreBackend`]\n/// from a [`MetastoreConfig`] and a [`Uri`].\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait]\npub trait MetastoreFactory: Send + Sync + 'static {\n    /// Returns the metastore backend targeted by the factory.\n    fn backend(&self) -> MetastoreBackend;\n\n    /// Returns the appropriate [`MetastoreServiceClient`] object for the `uri`.\n    async fn resolve(\n        &self,\n        metastore_config: &MetastoreConfig,\n        uri: &Uri,\n    ) -> Result<MetastoreServiceClient, MetastoreResolverError>;\n}\n\n/// A metastore factory for handling unsupported or unavailable metastore backends.\n#[derive(Clone)]\npub struct UnsupportedMetastore {\n    backend: MetastoreBackend,\n    message: &'static str,\n}\n\nimpl UnsupportedMetastore {\n    /// Creates a new [`UnsupportedMetastore`].\n    pub fn new(backend: MetastoreBackend, message: &'static str) -> Self {\n        Self { backend, message }\n    }\n}\n\n#[async_trait]\nimpl MetastoreFactory for UnsupportedMetastore {\n    fn backend(&self) -> 
MetastoreBackend {\n        self.backend\n    }\n\n    async fn resolve(\n        &self,\n        _metastore_config: &MetastoreConfig,\n        _uri: &Uri,\n    ) -> Result<MetastoreServiceClient, MetastoreResolverError> {\n        Err(MetastoreResolverError::UnsupportedBackend(\n            self.message.to_string(),\n        ))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/metastore_resolver.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::sync::Arc;\n\nuse anyhow::ensure;\nuse once_cell::sync::Lazy;\nuse quickwit_common::uri::{Protocol, Uri};\nuse quickwit_config::{MetastoreBackend, MetastoreConfig, MetastoreConfigs};\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse quickwit_storage::StorageResolver;\n\nuse crate::metastore::file_backed::FileBackedMetastoreFactory;\n#[cfg(feature = \"postgres\")]\nuse crate::metastore::postgres::PostgresqlMetastoreFactory;\nuse crate::{MetastoreFactory, MetastoreResolverError};\n\ntype FactoryAndConfig = (Box<dyn MetastoreFactory>, MetastoreConfig);\n\n/// Returns the [`MetastoreServiceClient`] instance associated with the protocol of a URI. The\n/// actual creation of metastore objects is delegated to pre-registered [`MetastoreFactory`]. 
The\n/// resolver is only responsible for dispatching to the appropriate factory.\n#[derive(Clone)]\npub struct MetastoreResolver {\n    per_backend_factories: Arc<HashMap<MetastoreBackend, FactoryAndConfig>>,\n}\n\nimpl fmt::Debug for MetastoreResolver {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"MetastoreResolver\").finish()\n    }\n}\n\nimpl MetastoreResolver {\n    /// Creates an empty [`MetastoreResolverBuilder`].\n    pub fn builder() -> MetastoreResolverBuilder {\n        MetastoreResolverBuilder::default()\n    }\n\n    /// Resolves the given `uri`.\n    pub async fn resolve(\n        &self,\n        uri: &Uri,\n    ) -> Result<MetastoreServiceClient, MetastoreResolverError> {\n        let backend = match uri.protocol() {\n            Protocol::Azure => MetastoreBackend::File,\n            Protocol::Google => MetastoreBackend::File,\n            Protocol::File => MetastoreBackend::File,\n            Protocol::Ram => MetastoreBackend::File,\n            Protocol::S3 => MetastoreBackend::File,\n            Protocol::PostgreSQL => MetastoreBackend::PostgreSQL,\n            _ => {\n                return Err(MetastoreResolverError::UnsupportedBackend(\n                    \"no implementation exists for this backend\".to_string(),\n                ));\n            }\n        };\n        let (metastore_factory, metastore_config) = self\n            .per_backend_factories\n            .get(&backend)\n            .ok_or(MetastoreResolverError::UnsupportedBackend(\n                \"no metastore factory is registered for this backend\".to_string(),\n            ))?;\n        let metastore = metastore_factory.resolve(metastore_config, uri).await?;\n        Ok(metastore)\n    }\n\n    /// Creates and returns a [`MetastoreResolver`] holding the default configuration for each\n    /// backend. Note that if the environment (env vars, instance metadata, ...) 
fails\n    /// to provide the necessary credentials, the default Azure or S3 file-backed metastores\n    /// returned by this resolver will not work.\n    pub fn unconfigured() -> Self {\n        static METASTORE_RESOLVER: Lazy<MetastoreResolver> = Lazy::new(|| {\n            MetastoreResolver::configured(\n                StorageResolver::unconfigured(),\n                &MetastoreConfigs::default(),\n            )\n        });\n        METASTORE_RESOLVER.clone()\n    }\n\n    /// Creates and returns a [`MetastoreResolver`].\n    pub fn configured(\n        storage_resolver: StorageResolver,\n        metastore_configs: &MetastoreConfigs,\n    ) -> Self {\n        let mut builder = MetastoreResolver::builder().register(\n            FileBackedMetastoreFactory::new(storage_resolver),\n            metastore_configs\n                .find_file()\n                .cloned()\n                .unwrap_or_default()\n                .into(),\n        );\n        #[cfg(feature = \"postgres\")]\n        {\n            builder = builder.register(\n                PostgresqlMetastoreFactory::default(),\n                metastore_configs\n                    .find_postgres()\n                    .cloned()\n                    .unwrap_or_default()\n                    .into(),\n            );\n        }\n        #[cfg(not(feature = \"postgres\"))]\n        {\n            use quickwit_config::PostgresMetastoreConfig;\n\n            use crate::UnsupportedMetastore;\n\n            builder = builder.register(\n                UnsupportedMetastore::new(\n                    MetastoreBackend::PostgreSQL,\n                    \"Quickwit was compiled without the `postgres` feature\",\n                ),\n                PostgresMetastoreConfig::default().into(),\n            );\n        }\n        builder\n            .build()\n            .expect(\"metastore factory and config backends should match\")\n    }\n}\n\n#[derive(Default)]\npub struct MetastoreResolverBuilder {\n    
per_protocol_factories: HashMap<MetastoreBackend, (Box<dyn MetastoreFactory>, MetastoreConfig)>,\n}\n\nimpl MetastoreResolverBuilder {\n    pub fn register<S: MetastoreFactory>(\n        mut self,\n        metastore_factory: S,\n        metastore_config: MetastoreConfig,\n    ) -> Self {\n        self.per_protocol_factories.insert(\n            metastore_factory.backend(),\n            (Box::new(metastore_factory), metastore_config),\n        );\n        self\n    }\n\n    pub fn build(self) -> anyhow::Result<MetastoreResolver> {\n        for (metastore_factory, metastore_config) in self.per_protocol_factories.values() {\n            ensure!(\n                metastore_factory.backend() == metastore_config.backend(),\n                \"metastore factory and config backends do not match: {:?} vs. {:?}\",\n                metastore_factory.backend(),\n                metastore_config.backend(),\n            );\n        }\n        let metastore_resolver = MetastoreResolver {\n            per_backend_factories: Arc::new(self.per_protocol_factories),\n        };\n        Ok(metastore_resolver)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::str::FromStr;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_metastore_resolver_should_not_raise_errors_on_file() {\n        let metastore_resolver = MetastoreResolver::unconfigured();\n        let tmp_dir = tempfile::tempdir().unwrap();\n        let metastore_filepath = format!(\"file://{}/metastore\", tmp_dir.path().display());\n        let metastore_uri = Uri::from_str(&metastore_filepath).unwrap();\n        metastore_resolver.resolve(&metastore_uri).await.unwrap();\n    }\n\n    #[cfg(feature = \"postgres\")]\n    #[tokio::test]\n    async fn test_postgres_and_postgresql_protocol_accepted() {\n        use std::env;\n        let metastore_resolver = MetastoreResolver::unconfigured();\n        // If the database defined in the env var or the default one is not up, the\n        // test block after making 10 
attempts with a timeout of 10s each = 100s.\n        let test_database_url = env::var(\"QW_TEST_DATABASE_URL\").unwrap_or_else(|_| {\n            \"postgres://quickwit-dev:quickwit-dev@localhost/quickwit-metastore-dev\".to_string()\n        });\n        let (_uri_protocol, uri_path) = test_database_url.split_once(\"://\").unwrap();\n        for protocol in &[\"postgres\", \"postgresql\"] {\n            let postgres_uri = Uri::from_str(&format!(\"{protocol}://{uri_path}\")).unwrap();\n            metastore_resolver.resolve(&postgres_uri).await.unwrap();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/split_metadata.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::fmt;\nuse std::ops::{Range, RangeInclusive};\nuse std::path::PathBuf;\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\nuse quickwit_proto::types::{DocMappingUid, IndexUid, SourceId, SplitId};\nuse serde::{Deserialize, Serialize};\nuse serde_with::{DurationMilliSeconds, serde_as};\nuse time::OffsetDateTime;\n\nuse crate::split_metadata_version::VersionedSplitMetadata;\n\n/// Carries split metadata.\n#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\npub struct Split {\n    /// The state of the split.\n    pub split_state: SplitState,\n\n    /// Timestamp for tracking when the split was last updated.\n    pub update_timestamp: i64,\n\n    /// Timestamp for tracking when the split was published.\n    pub publish_timestamp: Option<i64>,\n\n    #[serde(flatten)]\n    #[schema(value_type = VersionedSplitMetadata)]\n    /// Immutable part of the split.\n    pub split_metadata: SplitMetadata,\n}\n\nimpl Split {\n    /// Returns the split_id.\n    pub fn split_id(&self) -> &str {\n        &self.split_metadata.split_id\n    }\n}\n\n/// Carries immutable split metadata.\n/// This struct can deserialize older format automatically\n/// but can only serialize to the last version.\n#[derive(Clone, Default, Eq, PartialEq, Serialize, Deserialize, 
utoipa::ToSchema)]\n#[serde(into = \"VersionedSplitMetadata\")]\n#[serde(try_from = \"VersionedSplitMetadata\")]\npub struct SplitMetadata {\n    /// Split ID. Joined with the index URI (<index URI>/<split ID>), this ID\n    /// should be enough to uniquely identify a split.\n    /// In reality, some information may be implicitly configured\n    /// in the storage resolver: for instance, the Amazon S3 region.\n    #[schema(value_type = String)]\n    pub split_id: SplitId,\n\n    /// UID of the index this split belongs to.\n    pub index_uid: IndexUid,\n\n    /// Partition to which the split belongs.\n    ///\n    /// Partitions are usually meant to isolate documents based on some field like\n    /// `tenant_id`. For this reason, ideally splits with a different `partition_id`\n    /// should not be merged together. Merging two splits with different `partition_id`\n    /// does not hurt correctness, however.\n    pub partition_id: u64,\n\n    /// Source ID.\n    pub source_id: SourceId,\n\n    /// Node ID.\n    pub node_id: String,\n\n    /// Number of records (or documents) in the split.\n    /// TODO make u64\n    pub num_docs: usize,\n\n    /// Sum of the size (in bytes) of the raw documents in this split.\n    ///\n    /// Note this is not the split file size. 
It is the size of the original\n    /// JSON payloads.\n    pub uncompressed_docs_size_in_bytes: u64,\n\n    /// If a timestamp field is available, the min / max timestamp in\n    /// the split, expressed in seconds.\n    pub time_range: Option<RangeInclusive<i64>>,\n\n    /// Timestamp for tracking when the split was created.\n    pub create_timestamp: i64,\n\n    /// Split maturity: either `Mature` or `Immature` with a given maturation period.\n    pub maturity: SplitMaturity,\n\n    /// Set of unique tag values of the form `{field_name}:{field_value}`.\n    /// The set is filled at indexing with values from each field registered\n    /// in the [`DocMapping`](quickwit_config::DocMapping) `tag_fields` attribute and only when\n    /// cardinality of a given field is less than or equal to [`MAX_VALUES_PER_TAG_FIELD`].\n    /// An additional special tag of the form `{field_name}!` is added to the set\n    /// to indicate that this field `field_name` was indeed registered in `tag_fields`.\n    /// When cardinality is strictly higher than [`MAX_VALUES_PER_TAG_FIELD`],\n    /// no field value is added to the set.\n    ///\n    /// [`MAX_VALUES_PER_TAG_FIELD`]: https://github.com/quickwit-oss/quickwit/blob/main/quickwit-indexing/src/actors/packager.rs#L36\n    pub tags: BTreeSet<String>,\n\n    /// Contains the range of bytes of the footer that needs to be downloaded\n    /// in order to open a split.\n    ///\n    /// The footer offsets\n    /// make it possible to download the footer in a single call to `.get_slice(...)`.\n    pub footer_offsets: Range<u64>,\n\n    /// Delete opstamp.\n    pub delete_opstamp: u64,\n\n    /// Number of merge operations that were involved in creating\n    /// this split.\n    pub num_merge_ops: usize,\n\n    /// Doc mapping UID used when creating this split. 
This split may only be merged with other\n    /// splits using the same doc mapping UID.\n    pub doc_mapping_uid: DocMappingUid,\n}\n\nimpl fmt::Debug for SplitMetadata {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        let mut debug_struct = f.debug_struct(\"SplitMetadata\");\n        debug_struct.field(\"split_id\", &self.split_id);\n        debug_struct.field(\"index_uid\", &self.index_uid);\n        debug_struct.field(\"partition_id\", &self.partition_id);\n        debug_struct.field(\"source_id\", &self.source_id);\n        debug_struct.field(\"node_id\", &self.node_id);\n        debug_struct.field(\"num_docs\", &self.num_docs);\n        debug_struct.field(\n            \"uncompressed_docs_size_in_bytes\",\n            &self.uncompressed_docs_size_in_bytes,\n        );\n        debug_struct.field(\"time_range\", &self.time_range);\n        debug_struct.field(\"create_timestamp\", &self.create_timestamp);\n        debug_struct.field(\"maturity\", &self.maturity);\n        if !self.tags.is_empty() {\n            let mut tags_iter = self.tags.iter();\n            let mut tags_str = String::new();\n            tags_str.push('{');\n            for _ in 0..4 {\n                if let Some(tag) = tags_iter.next() {\n                    tags_str.push('\"');\n                    tags_str.push_str(tag);\n                    tags_str.push_str(\"\\\", \");\n                } else {\n                    break;\n                }\n            }\n            if tags_iter.next().is_some() {\n                let remaining_count = self.tags.len() - 4;\n                tags_str.push_str(&format!(\"and {remaining_count} more\"));\n            } else {\n                tags_str.pop();\n                tags_str.pop();\n            }\n            tags_str.push('}');\n            debug_struct.field(\"tags\", &tags_str);\n        }\n        debug_struct.field(\"footer_offsets\", &self.footer_offsets);\n        debug_struct.field(\"delete_opstamp\", 
&self.delete_opstamp);\n        debug_struct.field(\"num_merge_ops\", &self.num_merge_ops);\n        debug_struct.finish()\n    }\n}\n\nimpl SplitMetadata {\n    /// Creates a new instance of split metadata.\n    pub fn new(\n        split_id: SplitId,\n        index_uid: IndexUid,\n        partition_id: u64,\n        source_id: SourceId,\n        node_id: String,\n    ) -> Self {\n        Self {\n            split_id,\n            index_uid,\n            partition_id,\n            source_id,\n            node_id,\n            create_timestamp: utc_now_timestamp(),\n            ..Default::default()\n        }\n    }\n\n    /// Returns the split_id.\n    pub fn split_id(&self) -> &str {\n        &self.split_id\n    }\n\n    /// Returns true if the split is mature at the given `datetime`.\n    pub fn is_mature(&self, datetime: OffsetDateTime) -> bool {\n        match self.maturity {\n            SplitMaturity::Mature => true,\n            SplitMaturity::Immature {\n                maturation_period: time_to_maturity,\n            } => {\n                self.create_timestamp + time_to_maturity.as_secs() as i64\n                    <= datetime.unix_timestamp()\n            }\n        }\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    /// Returns an instance of `SplitMetadata` for testing.\n    pub fn for_test(split_id: SplitId) -> SplitMetadata {\n        SplitMetadata {\n            split_id,\n            ..Default::default()\n        }\n    }\n\n    /// Converts the split metadata into a [`SplitInfo`].\n    pub fn as_split_info(&self) -> SplitInfo {\n        let file_name = quickwit_common::split_file(self.split_id());\n\n        SplitInfo {\n            uncompressed_docs_size_bytes: ByteSize(self.uncompressed_docs_size_in_bytes),\n            file_name: PathBuf::from(file_name),\n            file_size_bytes: ByteSize(self.footer_offsets.end),\n            split_id: self.split_id.clone(),\n            num_docs: self.num_docs,\n        }\n    }\n}\n\n/// 
A summarized version of the split metadata for display purposes.\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\npub struct SplitInfo {\n    /// The split ID.\n    #[schema(value_type = String)]\n    pub split_id: SplitId,\n    /// The number of documents in the split.\n    pub num_docs: usize,\n    /// The sum of the sizes of the original JSON payloads in bytes.\n    #[schema(value_type = u64)]\n    pub uncompressed_docs_size_bytes: ByteSize,\n    /// The name of the split file on disk.\n    #[schema(value_type = String)]\n    pub file_name: PathBuf,\n    /// The size of the split file on disk in bytes.\n    #[schema(value_type = u64)]\n    pub file_size_bytes: ByteSize,\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\nimpl quickwit_config::TestableForRegression for SplitMetadata {\n    fn sample_for_regression() -> Self {\n        SplitMetadata {\n            split_id: \"split\".to_string(),\n            index_uid: IndexUid::for_test(\"my-index\", 1),\n            source_id: \"source\".to_string(),\n            node_id: \"node\".to_string(),\n            delete_opstamp: 10,\n            partition_id: 7u64,\n            num_docs: 12303,\n            uncompressed_docs_size_in_bytes: 234234,\n            time_range: Some(121000..=130198),\n            create_timestamp: 3,\n            maturity: SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(4),\n            },\n            tags: [\"234\".to_string(), \"aaa\".to_string()].into_iter().collect(),\n            footer_offsets: 1000..2000,\n            num_merge_ops: 3,\n            doc_mapping_uid: DocMappingUid::default(),\n        }\n    }\n\n    fn assert_equality(&self, other: &Self) {\n        assert_eq!(self, other);\n    }\n}\n\n/// A split state.\n#[derive(Clone, Copy, Debug, Serialize, Deserialize, Eq, PartialEq, utoipa::ToSchema)]\npub enum SplitState {\n    /// The split is almost ready. 
Some of its files may have been uploaded to the storage.\n    Staged,\n\n    /// The split is ready and published.\n    Published,\n\n    /// The split is marked for deletion.\n    MarkedForDeletion,\n}\n\nimpl fmt::Display for SplitState {\n    fn fmt(&self, f: &mut fmt::Formatter) -> std::fmt::Result {\n        write!(f, \"{self:?}\")\n    }\n}\n\nimpl SplitState {\n    /// Returns a string representation of the given enum.\n    pub fn as_str(&self) -> &'static str {\n        match self {\n            SplitState::Staged => \"Staged\",\n            SplitState::Published => \"Published\",\n            SplitState::MarkedForDeletion => \"MarkedForDeletion\",\n        }\n    }\n}\n\nimpl FromStr for SplitState {\n    type Err = String;\n\n    fn from_str(input: &str) -> Result<SplitState, Self::Err> {\n        let split_state = match input {\n            \"Staged\" => SplitState::Staged,\n            \"Published\" => SplitState::Published,\n            \"MarkedForDeletion\" => SplitState::MarkedForDeletion,\n            \"ScheduledForDeletion\" => SplitState::MarkedForDeletion, // Deprecated\n            \"New\" => SplitState::Staged,                             // Deprecated\n            _ => return Err(format!(\"unknown split state `{input}`\")),\n        };\n        Ok(split_state)\n    }\n}\n\n/// `SplitMaturity` defines the maturity of a split: it is either `Mature`\n/// or `Immature` with a given maturation period.\n/// The maturity is determined by the `MergePolicy`.\n#[serde_as]\n#[derive(Clone, Copy, Debug, Default, Eq, Serialize, Deserialize, PartialEq, PartialOrd, Ord)]\n#[serde(tag = \"type\")]\n#[serde(rename_all = \"snake_case\")]\npub enum SplitMaturity {\n    /// The split is mature and no longer a candidate for merges.\n    #[default]\n    Mature,\n    /// The split is immature and can undergo merges until `maturation_period` passes,\n    /// measured relative to the split's creation timestamp.\n    Immature {\n        /// Maturation period.\n    
    #[serde_as(as = \"DurationMilliSeconds<u64>\")]\n        #[serde(rename = \"maturation_period_millis\")]\n        maturation_period: Duration,\n    },\n}\n\n/// Helper function to provide a UTC now timestamp to use\n/// as a default in deserialization.\n///\n/// During unit test, the value is constant.\npub fn utc_now_timestamp() -> i64 {\n    if cfg!(any(test, feature = \"testsuite\")) {\n        1640577000\n    } else {\n        OffsetDateTime::now_utc().unix_timestamp()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_split_maturity_serialization() {\n        {\n            let split_maturity = super::SplitMaturity::Immature {\n                maturation_period: std::time::Duration::from_millis(10),\n            };\n            let serialized = serde_json::to_string(&split_maturity).unwrap();\n            assert_eq!(\n                serialized,\n                r#\"{\"type\":\"immature\",\"maturation_period_millis\":10}\"#\n            );\n            let deserialized: super::SplitMaturity = serde_json::from_str(&serialized).unwrap();\n            assert_eq!(deserialized, split_maturity);\n        }\n        {\n            let split_maturity = super::SplitMaturity::Mature;\n            let serialized = serde_json::to_string(&split_maturity).unwrap();\n            assert_eq!(serialized, r#\"{\"type\":\"mature\"}\"#);\n            let deserialized: super::SplitMaturity = serde_json::from_str(&serialized).unwrap();\n            assert_eq!(deserialized, split_maturity);\n        }\n    }\n\n    #[test]\n    fn test_split_metadata_debug() {\n        let split_metadata = SplitMetadata {\n            split_id: \"split-1\".to_string(),\n            index_uid: IndexUid::for_test(\"00000000-0000-0000-0000-000000000000\", 0),\n            partition_id: 0,\n            source_id: \"source-1\".to_string(),\n            node_id: \"node-1\".to_string(),\n            num_docs: 100,\n            uncompressed_docs_size_in_bytes: 1024,\n   
         time_range: Some(0..=100),\n            create_timestamp: 1629867600,\n            maturity: SplitMaturity::Mature,\n            tags: {\n                let mut tags = BTreeSet::new();\n                tags.insert(\"🐱\".to_string());\n                tags.insert(\"🙀\".to_string());\n                tags.insert(\"😻\".to_string());\n                tags.insert(\"😼\".to_string());\n                tags.insert(\"😿\".to_string());\n                tags\n            },\n            footer_offsets: 0..1024,\n            delete_opstamp: 0,\n            num_merge_ops: 0,\n            doc_mapping_uid: DocMappingUid::default(),\n        };\n\n        let expected_output = \"SplitMetadata { split_id: \\\"split-1\\\", index_uid: IndexUid { \\\n                               index_id: \\\"00000000-0000-0000-0000-000000000000\\\", \\\n                               incarnation_id: Ulid(0) }, partition_id: 0, source_id: \\\n                               \\\"source-1\\\", node_id: \\\"node-1\\\", num_docs: 100, \\\n                               uncompressed_docs_size_in_bytes: 1024, time_range: Some(0..=100), \\\n                               create_timestamp: 1629867600, maturity: Mature, tags: \\\n                               \\\"{\\\\\\\"🐱\\\\\\\", \\\\\\\"😻\\\\\\\", \\\\\\\"😼\\\\\\\", \\\\\\\"😿\\\\\\\", and 1 more}\\\", \\\n                               footer_offsets: 0..1024, delete_opstamp: 0, num_merge_ops: 0 }\";\n\n        assert_eq!(format!(\"{split_metadata:?}\"), expected_output);\n    }\n\n    #[test]\n    fn test_spit_maturity_order() {\n        assert!(\n            SplitMaturity::Mature\n                < SplitMaturity::Immature {\n                    maturation_period: Duration::from_secs(0)\n                }\n        );\n        assert!(\n            SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(0)\n            } < SplitMaturity::Immature {\n                maturation_period: Duration::from_secs(1)\n            
}\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/split_metadata_version.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::ops::{Range, RangeInclusive};\n\nuse quickwit_proto::types::{DocMappingUid, IndexUid, SplitId};\nuse serde::{Deserialize, Serialize};\n\nuse crate::SplitMetadata;\nuse crate::split_metadata::{SplitMaturity, utc_now_timestamp};\n\n#[derive(Clone, Debug, Default, Eq, PartialEq, Serialize, Deserialize, utoipa::ToSchema)]\npub(crate) struct SplitMetadataV0_8 {\n    /// Split ID. Joined with the index URI (<index URI>/<split ID>), this ID\n    /// should be enough to uniquely identify a split.\n    /// In reality, some information may be implicitly configured\n    /// in the storage resolver: for instance, the Amazon S3 region.\n    #[schema(value_type = String)]\n    pub split_id: SplitId,\n\n    /// Uid of the index this split belongs to.\n    #[schema(value_type = String)]\n    #[serde(alias = \"index_id\")]\n    pub index_uid: IndexUid,\n\n    #[serde(default)]\n    pub partition_id: u64,\n\n    #[serde(default)]\n    pub source_id: Option<String>,\n\n    #[serde(default)]\n    pub node_id: Option<String>,\n\n    /// Number of records (or documents) in the split.\n    pub num_docs: usize,\n\n    /// Sum of the size (in bytes) of the raw documents in this split.\n    ///\n    /// Note this is not the split file size. 
It is the size of the original\n    /// JSON payloads.\n    #[serde(alias = \"size_in_bytes\")]\n    pub uncompressed_docs_size_in_bytes: u64,\n\n    #[schema(value_type = Option<Object>)]\n    /// If a timestamp field is available, the min / max timestamp in\n    /// the split.\n    pub time_range: Option<RangeInclusive<i64>>,\n\n    /// Timestamp for tracking when the split was created.\n    #[serde(default = \"utc_now_timestamp\")]\n    pub create_timestamp: i64,\n\n    /// Split maturity either `Mature` or `Immature` with a given maturation period.\n    #[serde(default)]\n    #[schema(value_type = Value)]\n    pub maturity: SplitMaturity,\n\n    #[serde(default)]\n    #[schema(value_type = Vec<String>)]\n    /// A set of tags for categorizing and searching group of splits.\n    pub tags: BTreeSet<String>,\n\n    #[schema(value_type = Object)]\n    /// Contains the range of bytes of the footer that needs to be downloaded\n    /// in order to open a split.\n    ///\n    /// The footer offsets\n    /// make it possible to download the footer in a single call to `.get_slice(...)`.\n    pub footer_offsets: Range<u64>,\n\n    /// Split delete opstamp.\n    #[serde(default)]\n    pub delete_opstamp: u64,\n\n    #[serde(default)]\n    num_merge_ops: usize,\n\n    // we default fill with zero: we don't know the right uid, and it's correct to assume all\n    // splits before when updates first appeared are compatible with each other.\n    #[serde(default)]\n    doc_mapping_uid: DocMappingUid,\n}\n\nimpl From<SplitMetadataV0_8> for SplitMetadata {\n    fn from(v8: SplitMetadataV0_8) -> Self {\n        let source_id = v8.source_id.unwrap_or_else(|| \"unknown\".to_string());\n\n        let node_id = if let Some(node_id) = v8.node_id {\n            // The previous version encoded `v1.node_id` as `{node_id}/{pipeline_ord}`.\n            // Since pipeline_ord is no longer needed, we only extract the `node_id` portion\n            // to keep backward compatibility.  
This has the advantage of avoiding a\n            // brand new version.\n            if let Some((node_id, _)) = node_id.rsplit_once('/') {\n                node_id.to_string()\n            } else {\n                node_id\n            }\n        } else {\n            \"unknown\".to_string()\n        };\n\n        SplitMetadata {\n            split_id: v8.split_id,\n            index_uid: v8.index_uid,\n            partition_id: v8.partition_id,\n            source_id,\n            node_id,\n            delete_opstamp: v8.delete_opstamp,\n            num_docs: v8.num_docs,\n            uncompressed_docs_size_in_bytes: v8.uncompressed_docs_size_in_bytes,\n            time_range: v8.time_range,\n            create_timestamp: v8.create_timestamp,\n            maturity: v8.maturity,\n            tags: v8.tags,\n            footer_offsets: v8.footer_offsets,\n            num_merge_ops: v8.num_merge_ops,\n            doc_mapping_uid: v8.doc_mapping_uid,\n        }\n    }\n}\n\nimpl From<SplitMetadata> for SplitMetadataV0_8 {\n    fn from(split: SplitMetadata) -> Self {\n        SplitMetadataV0_8 {\n            split_id: split.split_id,\n            index_uid: split.index_uid,\n            partition_id: split.partition_id,\n            source_id: Some(split.source_id),\n            node_id: Some(split.node_id),\n            delete_opstamp: split.delete_opstamp,\n            num_docs: split.num_docs,\n            uncompressed_docs_size_in_bytes: split.uncompressed_docs_size_in_bytes,\n            time_range: split.time_range,\n            create_timestamp: split.create_timestamp,\n            maturity: split.maturity,\n            tags: split.tags,\n            footer_offsets: split.footer_offsets,\n            num_merge_ops: split.num_merge_ops,\n            doc_mapping_uid: split.doc_mapping_uid,\n        }\n    }\n}\n\n#[derive(Serialize, Deserialize, utoipa::ToSchema)]\n#[serde(tag = \"version\")]\npub(crate) enum VersionedSplitMetadata {\n    #[serde(rename = 
\"0.9\")]\n    // Retro compatibility.\n    #[serde(alias = \"0.8\")]\n    #[serde(alias = \"0.7\")]\n    V0_8(SplitMetadataV0_8),\n}\n\nimpl From<VersionedSplitMetadata> for SplitMetadata {\n    fn from(versioned_helper: VersionedSplitMetadata) -> Self {\n        match versioned_helper {\n            VersionedSplitMetadata::V0_8(v0_8) => v0_8.into(),\n        }\n    }\n}\n\nimpl From<SplitMetadata> for VersionedSplitMetadata {\n    fn from(split_metadata: SplitMetadata) -> Self {\n        VersionedSplitMetadata::V0_8(split_metadata.into())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/delete_task.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::IndexConfig;\nuse quickwit_proto::metastore::{\n    CreateIndexRequest, DeleteIndexRequest, DeleteQuery, EntityKind, LastDeleteOpstampRequest,\n    ListDeleteTasksRequest, MetastoreError,\n};\nuse quickwit_proto::types::IndexUid;\nuse quickwit_query::query_ast::qast_json_helper;\n\nuse super::DefaultForTest;\nuse crate::tests::cleanup_index;\nuse crate::{CreateIndexRequestExt, MetastoreServiceExt};\n\npub async fn test_metastore_create_delete_task<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n    let index_id = append_random_suffix(\"add-delete-task\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let delete_query = DeleteQuery {\n        index_uid: Some(index_uid.clone()),\n        query_ast: qast_json_helper(\"my_field:my_value\", &[]),\n        start_timestamp: Some(1),\n        end_timestamp: Some(2),\n    };\n\n    // Create a delete task on 
non-existing index.\n    let error = metastore\n        .create_delete_task(DeleteQuery {\n            index_uid: Some(IndexUid::new_with_random_ulid(\"does-not-exist\")),\n            ..delete_query.clone()\n        })\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    // Create a delete task on an index with wrong incarnation_id\n    let error = metastore\n        .create_delete_task(DeleteQuery {\n            index_uid: Some(IndexUid::for_test(&index_id, 12345)),\n            ..delete_query.clone()\n        })\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    // Create a delete task.\n    let delete_task_1 = metastore\n        .create_delete_task(delete_query.clone())\n        .await\n        .unwrap();\n    assert!(delete_task_1.opstamp > 0);\n    let delete_query_1 = delete_task_1.delete_query.unwrap();\n    assert_eq!(delete_query_1.index_uid, delete_query.index_uid);\n    assert_eq!(delete_query_1.start_timestamp, delete_query.start_timestamp);\n    assert_eq!(delete_query_1.end_timestamp, delete_query.end_timestamp);\n    let delete_task_2 = metastore\n        .create_delete_task(delete_query.clone())\n        .await\n        .unwrap();\n    assert!(delete_task_2.opstamp > delete_task_1.opstamp);\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_last_delete_opstamp<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n    let index_id_1 = append_random_suffix(\"test-last-delete-opstamp-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n    let index_id_2 = append_random_suffix(\"test-last-delete-opstamp-2\");\n    let index_uri_2 = 
format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n    let index_uid_1 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_1).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_2 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_2).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let delete_query_index_1 = DeleteQuery {\n        index_uid: Some(index_uid_1.clone()),\n        query_ast: qast_json_helper(\"my_field:my_value\", &[]),\n        start_timestamp: Some(1),\n        end_timestamp: Some(2),\n    };\n    let delete_query_index_2 = DeleteQuery {\n        index_uid: Some(index_uid_2.clone()),\n        query_ast: qast_json_helper(\"my_field:my_value\", &[]),\n        start_timestamp: Some(1),\n        end_timestamp: Some(2),\n    };\n\n    let last_opstamp_index_1_with_no_task = metastore\n        .last_delete_opstamp(LastDeleteOpstampRequest {\n            index_uid: Some(index_uid_1.clone()),\n        })\n        .await\n        .unwrap()\n        .last_delete_opstamp;\n    assert_eq!(last_opstamp_index_1_with_no_task, 0);\n\n    // Create a delete task.\n    metastore\n        .create_delete_task(delete_query_index_1.clone())\n        .await\n        .unwrap();\n    let delete_task_2 = metastore\n        .create_delete_task(delete_query_index_1.clone())\n        .await\n        .unwrap();\n    let delete_task_3 = metastore\n        .create_delete_task(delete_query_index_2.clone())\n        .await\n        .unwrap();\n\n    let last_opstamp_index_1 = metastore\n        .last_delete_opstamp(LastDeleteOpstampRequest {\n            index_uid: Some(index_uid_1.clone()),\n        })\n        .await\n        .unwrap()\n        .last_delete_opstamp;\n    let last_opstamp_index_2 = metastore\n        
.last_delete_opstamp(LastDeleteOpstampRequest {\n            index_uid: Some(index_uid_2.clone()),\n        })\n        .await\n        .unwrap()\n        .last_delete_opstamp;\n    assert_eq!(last_opstamp_index_1, delete_task_2.opstamp);\n    assert_eq!(last_opstamp_index_2, delete_task_3.opstamp);\n    cleanup_index(&mut metastore, index_uid_1).await;\n    cleanup_index(&mut metastore, index_uid_2).await;\n}\n\npub async fn test_metastore_delete_index_with_tasks<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let metastore = MetastoreToTest::default_for_test().await;\n    let index_id = append_random_suffix(\"delete-delete-tasks\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let delete_query = DeleteQuery {\n        index_uid: Some(index_uid.clone()),\n        query_ast: qast_json_helper(\"my_field:my_value\", &[]),\n        start_timestamp: Some(1),\n        end_timestamp: Some(2),\n    };\n    let _ = metastore\n        .create_delete_task(delete_query.clone())\n        .await\n        .unwrap();\n    let _ = metastore\n        .create_delete_task(delete_query.clone())\n        .await\n        .unwrap();\n\n    metastore\n        .delete_index(DeleteIndexRequest {\n            index_uid: Some(index_uid),\n        })\n        .await\n        .unwrap();\n}\n\npub async fn test_metastore_list_delete_tasks<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n    let index_id_1 = append_random_suffix(\"test-list-delete-tasks-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = 
IndexConfig::for_test(&index_id_1, &index_uri_1);\n    let index_id_2 = append_random_suffix(\"test-list-delete-tasks-2\");\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n    let index_uid_1 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_1).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_2 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_2).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let delete_query_index_1 = DeleteQuery {\n        index_uid: Some(index_uid_1.clone()),\n        query_ast: qast_json_helper(\"my_field:my_value\", &[]),\n        start_timestamp: Some(1),\n        end_timestamp: Some(2),\n    };\n    let delete_query_index_2 = DeleteQuery {\n        index_uid: Some(index_uid_2.clone()),\n        query_ast: qast_json_helper(\"my_field:my_value\", &[]),\n        start_timestamp: Some(1),\n        end_timestamp: Some(2),\n    };\n\n    // Create a delete task.\n    let delete_task_1 = metastore\n        .create_delete_task(delete_query_index_1.clone())\n        .await\n        .unwrap();\n    let delete_task_2 = metastore\n        .create_delete_task(delete_query_index_1.clone())\n        .await\n        .unwrap();\n    let _ = metastore\n        .create_delete_task(delete_query_index_2.clone())\n        .await\n        .unwrap();\n\n    let all_index_id_1_delete_tasks = metastore\n        .list_delete_tasks(ListDeleteTasksRequest::new(index_uid_1.clone(), 0))\n        .await\n        .unwrap()\n        .delete_tasks;\n    assert_eq!(all_index_id_1_delete_tasks.len(), 2);\n\n    let recent_index_id_1_delete_tasks = metastore\n        .list_delete_tasks(ListDeleteTasksRequest::new(\n            index_uid_1.clone(),\n            delete_task_1.opstamp,\n        ))\n        
.await\n        .unwrap()\n        .delete_tasks;\n    assert_eq!(recent_index_id_1_delete_tasks.len(), 1);\n    assert_eq!(\n        recent_index_id_1_delete_tasks[0].opstamp,\n        delete_task_2.opstamp\n    );\n    cleanup_index(&mut metastore, index_uid_1).await;\n    cleanup_index(&mut metastore, index_uid_2).await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/get_identity.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// Cluster identity API tests\n//\n//  - get_cluster_identity\n\nuse quickwit_proto::metastore::{GetClusterIdentityRequest, MetastoreService};\nuse uuid::Uuid;\n\nuse super::DefaultForTest;\nuse crate::MetastoreServiceExt;\n\npub async fn test_metastore_get_identity<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let metastore = MetastoreToTest::default_for_test().await;\n\n    let identity_1 = metastore\n        .get_cluster_identity(GetClusterIdentityRequest {})\n        .await\n        .unwrap()\n        .uuid;\n\n    let identity_2 = metastore\n        .get_cluster_identity(GetClusterIdentityRequest {})\n        .await\n        .unwrap()\n        .uuid;\n\n    assert_eq!(identity_1, identity_2);\n    assert_ne!(identity_1, Uuid::nil().hyphenated().to_string());\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/index.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// Index API tests\n//\n//  - create_index\n//  - index_exists\n//  - index_metadata\n//  - list_indexes\n//  - delete_index\n\nuse std::num::NonZeroUsize;\n\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::merge_policy_config::{MergePolicyConfig, StableLogMergePolicyConfig};\nuse quickwit_config::{\n    CLI_SOURCE_ID, INGEST_V2_SOURCE_ID, IndexConfig, IndexingSettings, IngestSettings,\n    RetentionPolicy, SearchSettings, SourceConfig,\n};\nuse quickwit_doc_mapper::{Cardinality, FieldMappingEntry, FieldMappingType, QuickwitJsonOptions};\nuse quickwit_proto::metastore::{\n    CreateIndexRequest, DeleteIndexRequest, EntityKind, IndexMetadataFailure,\n    IndexMetadataFailureReason, IndexMetadataRequest, IndexMetadataSubrequest, IndexStats,\n    IndexesMetadataRequest, ListIndexStatsRequest, ListIndexesMetadataRequest, MetastoreError,\n    MetastoreService, PublishSplitsRequest, SplitStats, StageSplitsRequest, UpdateIndexRequest,\n};\nuse quickwit_proto::types::{DocMappingUid, IndexUid};\n\nuse super::DefaultForTest;\nuse crate::tests::cleanup_index;\nuse crate::{\n    CreateIndexRequestExt, IndexMetadataResponseExt, IndexesMetadataResponseExt,\n    ListIndexesMetadataResponseExt, MetastoreServiceExt, SplitMetadata, StageSplitsRequestExt,\n    UpdateIndexRequestExt,\n};\n\npub async fn test_metastore_create_index<\n    
MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-create-index\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid = metastore\n        .create_index(create_index_request.clone())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n\n    assert_eq!(index_metadata.index_id(), index_id);\n    assert_eq!(index_metadata.index_uri(), &index_uri);\n\n    let error = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap_err();\n    assert!(matches!(error, MetastoreError::AlreadyExists { .. 
}));\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\nasync fn setup_metastore_for_update<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() -> (MetastoreToTest, IndexUid, IndexConfig) {\n    let metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-update-index\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid = metastore\n        .create_index(create_index_request.clone())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    (metastore, index_uid, index_config)\n}\n\npub async fn test_metastore_update_retention_policy<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let (mut metastore, index_uid, index_config) =\n        setup_metastore_for_update::<MetastoreToTest>().await;\n    let new_retention_policy_opt = Some(RetentionPolicy {\n        retention_period: String::from(\"3 days\"),\n        evaluation_schedule: String::from(\"daily\"),\n    });\n\n    // set and unset retention policy multiple times\n    for loop_retention_policy_opt in [\n        None,\n        new_retention_policy_opt.clone(),\n        new_retention_policy_opt.clone(),\n        None,\n    ] {\n        let index_update = UpdateIndexRequest::try_from_updates(\n            index_uid.clone(),\n            &index_config.doc_mapping,\n            &index_config.indexing_settings,\n            &index_config.ingest_settings,\n            &index_config.search_settings,\n            &loop_retention_policy_opt,\n        )\n        .unwrap();\n        let response_metadata = metastore\n            .update_index(index_update)\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n     
   assert_eq!(response_metadata.index_uid, index_uid);\n        assert_eq!(\n            response_metadata.index_config.retention_policy_opt,\n            loop_retention_policy_opt\n        );\n        let updated_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\n                index_uid.index_id.to_string(),\n            ))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(response_metadata, updated_metadata);\n    }\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_update_ingest_settings<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let (mut metastore, index_uid, index_config) =\n        setup_metastore_for_update::<MetastoreToTest>().await;\n\n    let ingest_settings = IngestSettings {\n        min_shards: NonZeroUsize::new(12).unwrap(),\n        ..Default::default()\n    };\n    let index_update_request = UpdateIndexRequest::try_from_updates(\n        index_uid.clone(),\n        &index_config.doc_mapping,\n        &index_config.indexing_settings,\n        &ingest_settings,\n        &index_config.search_settings,\n        &index_config.retention_policy_opt,\n    )\n    .unwrap();\n\n    let min_shards = metastore\n        .update_index(index_update_request)\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap()\n        .index_config\n        .ingest_settings\n        .min_shards\n        .get();\n    assert_eq!(min_shards, 12);\n\n    let index_metadata_request = IndexMetadataRequest::for_index_uid(index_uid.clone());\n\n    let min_shards = metastore\n        .index_metadata(index_metadata_request)\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap()\n        .index_config\n        .ingest_settings\n        .min_shards\n        .get();\n    assert_eq!(min_shards, 12);\n\n    
cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_update_search_settings<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let (mut metastore, index_uid, index_config) =\n        setup_metastore_for_update::<MetastoreToTest>().await;\n\n    for default_search_fields in [\n        Vec::new(),\n        vec![\"body\".to_string()],\n        vec![\"body\".to_string()],\n        vec![\"body\".to_string(), \"owner\".to_string()],\n        Vec::new(),\n    ] {\n        let search_settings = SearchSettings {\n            default_search_fields: default_search_fields.clone(),\n        };\n        let index_update = UpdateIndexRequest::try_from_updates(\n            index_uid.clone(),\n            &index_config.doc_mapping,\n            &index_config.indexing_settings,\n            &index_config.ingest_settings,\n            &search_settings,\n            &index_config.retention_policy_opt,\n        )\n        .unwrap();\n        let response_metadata = metastore\n            .update_index(index_update)\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(\n            response_metadata\n                .index_config\n                .search_settings\n                .default_search_fields,\n            default_search_fields\n        );\n        let updated_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\n                index_uid.index_id.to_string(),\n            ))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(\n            updated_metadata\n                .index_config\n                .search_settings\n                .default_search_fields,\n            default_search_fields\n        );\n    }\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn 
test_metastore_update_indexing_settings<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let (mut metastore, index_uid, index_config) =\n        setup_metastore_for_update::<MetastoreToTest>().await;\n\n    for merge_policy in [\n        MergePolicyConfig::Nop,\n        MergePolicyConfig::Nop,\n        MergePolicyConfig::StableLog(StableLogMergePolicyConfig {\n            merge_factor: 5,\n            ..Default::default()\n        }),\n    ] {\n        let indexing_settings = IndexingSettings {\n            merge_policy: merge_policy.clone(),\n            ..Default::default()\n        };\n        let index_update = UpdateIndexRequest::try_from_updates(\n            index_uid.clone(),\n            &index_config.doc_mapping,\n            &indexing_settings,\n            &index_config.ingest_settings,\n            &index_config.search_settings,\n            &index_config.retention_policy_opt,\n        )\n        .unwrap();\n        let resp_metadata = metastore\n            .update_index(index_update)\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(\n            resp_metadata.index_config.indexing_settings.merge_policy,\n            merge_policy\n        );\n        let updated_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\n                index_uid.index_id.to_string(),\n            ))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(\n            updated_metadata.index_config.indexing_settings.merge_policy,\n            merge_policy\n        );\n    }\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_update_doc_mapping<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let (mut metastore, index_uid, index_config) =\n        
setup_metastore_for_update::<MetastoreToTest>().await;\n\n    let json_options = QuickwitJsonOptions {\n        description: None,\n        stored: false,\n        indexing_options: None,\n        expand_dots: false,\n        fast: Default::default(),\n    };\n\n    let initial = index_config.doc_mapping.clone();\n    let mut new_field = initial.clone();\n    new_field.field_mappings.push(FieldMappingEntry {\n        name: \"new_field\".to_string(),\n        mapping_type: FieldMappingType::Json(json_options.clone(), Cardinality::SingleValued),\n    });\n    new_field.doc_mapping_uid = DocMappingUid::random();\n    let mut new_field_stored = initial.clone();\n    new_field_stored.field_mappings.push(FieldMappingEntry {\n        name: \"new_field\".to_string(),\n        mapping_type: FieldMappingType::Json(\n            QuickwitJsonOptions {\n                stored: true,\n                ..json_options\n            },\n            Cardinality::SingleValued,\n        ),\n    });\n    new_field_stored.doc_mapping_uid = DocMappingUid::random();\n\n    for loop_doc_mapping in [initial.clone(), new_field, new_field_stored, initial] {\n        let index_update = UpdateIndexRequest::try_from_updates(\n            index_uid.clone(),\n            &loop_doc_mapping,\n            &index_config.indexing_settings,\n            &index_config.ingest_settings,\n            &index_config.search_settings,\n            &index_config.retention_policy_opt,\n        )\n        .unwrap();\n        let resp_metadata = metastore\n            .update_index(index_update)\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(resp_metadata.index_config.doc_mapping, loop_doc_mapping);\n        let updated_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\n                index_uid.index_id.to_string(),\n            ))\n            .await\n            .unwrap()\n            
.deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(updated_metadata.index_config.doc_mapping, loop_doc_mapping);\n    }\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_create_index_with_sources<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-create-index-with-sources\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let index_config_json = serde_json::to_string(&index_config).unwrap();\n\n    let source_configs_json = vec![\n        serde_json::to_string(&SourceConfig::cli()).unwrap(),\n        serde_json::to_string(&SourceConfig::ingest_v2()).unwrap(),\n    ];\n    let create_index_request = CreateIndexRequest {\n        index_config_json,\n        source_configs_json,\n    };\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request.clone())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n\n    assert_eq!(index_metadata.index_id(), index_id);\n    assert_eq!(index_metadata.index_uri(), &index_uri);\n\n    assert_eq!(index_metadata.sources.len(), 2);\n    assert!(index_metadata.sources.contains_key(CLI_SOURCE_ID));\n    assert!(index_metadata.sources.contains_key(INGEST_V2_SOURCE_ID));\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_create_index_enforces_index_id_maximum_length<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = 
MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(format!(\"very-long-index-{}\", \"a\".repeat(233)).as_str());\n    assert_eq!(index_id.len(), 255);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_index_exists<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-index-exists\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n\n    assert!(!metastore.index_exists(&index_id).await.unwrap());\n\n    let index_uid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    assert!(metastore.index_exists(&index_id).await.unwrap());\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_index_metadata<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-index-metadata\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let error = metastore\n        
.index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n\n    assert_eq!(index_metadata.index_id(), index_id);\n    assert_eq!(index_metadata.index_uri(), &index_uri);\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_indexes_metadata<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id_0 = append_random_suffix(\"test-indexes-metadata-0\");\n    let index_uri_0 = format!(\"ram:///indexes/{index_id_0}\");\n    let index_config_0 = IndexConfig::for_test(&index_id_0, &index_uri_0);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_0).unwrap();\n    let index_uid_0: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let index_id_1 = append_random_suffix(\"test-indexes-metadata-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_1).unwrap();\n    let index_uid_1: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        
.index_uid()\n        .clone();\n\n    let indexes_metadata_request = IndexesMetadataRequest {\n        subrequests: vec![\n            IndexMetadataSubrequest {\n                index_id: None,\n                index_uid: None,\n            },\n            IndexMetadataSubrequest {\n                index_id: Some(index_id_0.clone()),\n                index_uid: None,\n            },\n            IndexMetadataSubrequest {\n                index_id: Some(\"test-indexes-metadata-foo\".to_string()),\n                index_uid: None,\n            },\n            IndexMetadataSubrequest {\n                index_id: None,\n                index_uid: Some(index_uid_1.clone()),\n            },\n            IndexMetadataSubrequest {\n                index_id: None,\n                index_uid: Some(IndexUid::for_test(\"test-indexes-metadata-bar\", 123)),\n            },\n        ],\n    };\n    let mut indexes_metadata_response = metastore\n        .indexes_metadata(indexes_metadata_request)\n        .await\n        .unwrap();\n\n    let failures = &mut indexes_metadata_response.failures;\n    assert_eq!(failures.len(), 3);\n\n    failures.sort_by(|left, right| left.index_id().cmp(right.index_id()));\n\n    let expected_failure_0 = IndexMetadataFailure {\n        index_id: None,\n        index_uid: None,\n        reason: IndexMetadataFailureReason::Internal as i32,\n    };\n    assert_eq!(failures[0], expected_failure_0);\n\n    let expected_failure_1 = IndexMetadataFailure {\n        index_id: None,\n        index_uid: Some(IndexUid::for_test(\"test-indexes-metadata-bar\", 123)),\n        reason: IndexMetadataFailureReason::NotFound as i32,\n    };\n    assert_eq!(failures[1], expected_failure_1);\n\n    let expected_failure_2 = IndexMetadataFailure {\n        index_id: Some(\"test-indexes-metadata-foo\".to_string()),\n        index_uid: None,\n        reason: IndexMetadataFailureReason::NotFound as i32,\n    };\n    assert_eq!(failures[2], expected_failure_2);\n\n    let 
mut indexes_metadata = indexes_metadata_response\n        .deserialize_indexes_metadata()\n        .await\n        .unwrap();\n    assert_eq!(indexes_metadata.len(), 2);\n\n    indexes_metadata.sort_by(|left, right| left.index_id().cmp(right.index_id()));\n    assert_eq!(indexes_metadata[0].index_id(), index_id_0);\n    assert_eq!(indexes_metadata[1].index_id(), index_id_1);\n\n    cleanup_index(&mut metastore, index_uid_0).await;\n    cleanup_index(&mut metastore, index_uid_1).await;\n}\n\npub async fn test_metastore_list_all_indexes<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id_prefix = append_random_suffix(\"test-list-all-indexes\");\n    let index_id_1 = format!(\"{index_id_prefix}-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let index_id_2 = format!(\"{index_id_prefix}-2\");\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n    let indexes_count = metastore\n        .list_indexes_metadata(ListIndexesMetadataRequest::all())\n        .await\n        .unwrap()\n        .deserialize_indexes_metadata()\n        .await\n        .unwrap()\n        .into_iter()\n        .filter(|index| index.index_id().starts_with(&index_id_prefix))\n        .count();\n    assert_eq!(indexes_count, 0);\n\n    let index_uid_1 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_1).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_2 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_2).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let indexes_count = metastore\n        
.list_indexes_metadata(ListIndexesMetadataRequest::all())\n        .await\n        .unwrap()\n        .deserialize_indexes_metadata()\n        .await\n        .unwrap()\n        .into_iter()\n        .filter(|index| index.index_id().starts_with(&index_id_prefix))\n        .count();\n    assert_eq!(indexes_count, 2);\n\n    cleanup_index(&mut metastore, index_uid_1).await;\n    cleanup_index(&mut metastore, index_uid_2).await;\n}\n\npub async fn test_metastore_list_indexes<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id_fragment = append_random_suffix(\"test-list-indexes\");\n    let index_id_1 = format!(\"prefix-1-{index_id_fragment}-suffix-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let index_id_2 = format!(\"prefix-2-{index_id_fragment}-suffix-2\");\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n\n    let index_id_3 = format!(\"prefix.3.{index_id_fragment}.3\");\n    let index_uri_3 = format!(\"ram:///indexes/{index_id_3}\");\n    let index_config_3 = IndexConfig::for_test(&index_id_3, &index_uri_3);\n\n    let index_id_4 = format!(\"p-4-{index_id_fragment}-suffix-4\");\n    let index_uri_4 = format!(\"ram:///indexes/{index_id_4}\");\n    let index_config_4 = IndexConfig::for_test(&index_id_4, &index_uri_4);\n\n    let index_id_5 = format!(\"my-exact-index-{index_id_fragment}-5\");\n    let index_uri_5 = format!(\"ram:///indexes/{index_id_5}\");\n    let index_config_5 = IndexConfig::for_test(&index_id_5, &index_uri_5);\n\n    let index_id_patterns = vec![\n        format!(\"prefix-*-{index_id_fragment}-suffix-*\"),\n        format!(\"prefix*{index_id_fragment}*suffix-*\"),\n        format!(\"my-exact-index-{index_id_fragment}-5\"),\n    ];\n    let indexes_count = 
metastore\n        .list_indexes_metadata(ListIndexesMetadataRequest { index_id_patterns })\n        .await\n        .unwrap()\n        .deserialize_indexes_metadata()\n        .await\n        .unwrap()\n        .len();\n    assert_eq!(indexes_count, 0);\n\n    let index_uid_1 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_1).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_2 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_2).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_3 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_3).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_4 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_4).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let index_uid_5 = metastore\n        .create_index(CreateIndexRequest::try_from_index_config(&index_config_5).unwrap())\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let index_id_patterns = vec![\n        format!(\"prefix-*-{index_id_fragment}-suffix-*\"),\n        format!(\"my-exact-index-{index_id_fragment}-5\"),\n    ];\n    let indexes_count = metastore\n        .list_indexes_metadata(ListIndexesMetadataRequest { index_id_patterns })\n        .await\n        .unwrap()\n        .deserialize_indexes_metadata()\n        .await\n        .unwrap()\n        .len();\n    assert_eq!(indexes_count, 3);\n\n    cleanup_index(&mut metastore, index_uid_1).await;\n    cleanup_index(&mut metastore, index_uid_2).await;\n    cleanup_index(&mut metastore, index_uid_3).await;\n    cleanup_index(&mut metastore, index_uid_4).await;\n    cleanup_index(&mut metastore, index_uid_5).await;\n}\n\npub 
async fn test_metastore_delete_index<\n    MetastoreToTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-delete-index\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let index_uid_not_existing = IndexUid::new_with_random_ulid(\"index-not-found\");\n    let error = metastore\n        .delete_index(DeleteIndexRequest {\n            index_uid: Some(index_uid_not_existing.clone()),\n        })\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    let error = metastore\n        .delete_index(DeleteIndexRequest {\n            index_uid: Some(index_uid_not_existing),\n        })\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    metastore\n        .delete_index(DeleteIndexRequest {\n            index_uid: index_uid.clone().into(),\n        })\n        .await\n        .unwrap();\n\n    assert!(!metastore.index_exists(&index_id).await.unwrap());\n\n    let split_id = format!(\"{index_id}--split\");\n    let split_metadata = SplitMetadata {\n        split_id: split_id.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid\n        .unwrap();\n\n    let stage_splits_request =\n        StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata).unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    // TODO: We should not be able to delete an index that has remaining splits, at least not as\n    // a default behavior. Let's implement the logic that allows this test to pass.\n    // let error = metastore.delete_index(index_uid).await.unwrap_err();\n    // assert!(matches!(error, MetastoreError::IndexNotEmpty { .. 
}));\n    // let splits = metastore.list_all_splits(index_uid.clone()).await.unwrap();\n    // assert_eq!(splits.len(), 1)\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_list_index_stats<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id_1 = append_random_suffix(\"test-list-index-stats\");\n    let index_uid_1 = IndexUid::new_with_random_ulid(&index_id_1);\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let index_id_2 = append_random_suffix(\"test-list-index-stats\");\n    let index_uid_2 = IndexUid::new_with_random_ulid(&index_id_2);\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n\n    let split_id_1 = format!(\"{index_id_1}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid_1.clone(),\n        footer_offsets: 0..2048,\n        ..Default::default()\n    };\n\n    let split_id_2 = format!(\"{index_id_1}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid_1.clone(),\n        footer_offsets: 0..2048,\n        ..Default::default()\n    };\n\n    let split_id_3 = format!(\"{index_id_1}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid_2.clone(),\n        footer_offsets: 0..1000,\n        ..Default::default()\n    };\n\n    // add split-1 and split-2 to index-1\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_1).unwrap();\n    let index_uid_1: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        
.clone();\n\n    let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n        index_uid_1.clone(),\n        vec![split_metadata_1.clone(), split_metadata_2.clone()],\n    )\n    .unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(index_uid_1.clone()),\n        staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    // add split-3 to index-2\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_2).unwrap();\n    let index_uid_2: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n        index_uid_2.clone(),\n        vec![split_metadata_3.clone()],\n    )\n    .unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let expected_stats_1 = IndexStats {\n        index_uid: Some(index_uid_1.clone()),\n        staged: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 0,\n        }),\n        published: Some(SplitStats {\n            num_splits: 2,\n            total_size_bytes: 4096,\n        }),\n        marked_for_deletion: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 0,\n        }),\n    };\n    let expected_stats_2 = IndexStats {\n        index_uid: Some(index_uid_2.clone()),\n        staged: Some(SplitStats {\n            num_splits: 1,\n            total_size_bytes: 1000,\n        }),\n        published: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 0,\n        }),\n        marked_for_deletion: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 
0,\n        }),\n    };\n\n    let response = metastore\n        .list_index_stats(ListIndexStatsRequest {\n            index_id_patterns: vec![\"test-list-index-stats*\".to_string()],\n        })\n        .await\n        .unwrap();\n\n    let index_stats_1 = response\n        .index_stats\n        .iter()\n        .find(|index| index.index_uid == Some(index_uid_1.clone()))\n        .expect(\"Should find index 1\");\n\n    assert_eq!(index_stats_1, &expected_stats_1);\n\n    let index_stats_2 = response\n        .index_stats\n        .iter()\n        .find(|index| index.index_uid == Some(index_uid_2.clone()))\n        .expect(\"Should find index 2\");\n    assert_eq!(index_stats_2, &expected_stats_2);\n}\n\npub async fn test_metastore_list_index_stats_no_splits<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-list-index-stats-no-splits\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let expected_stats = IndexStats {\n        index_uid: Some(index_uid.clone()),\n        staged: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 0,\n        }),\n        published: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 0,\n        }),\n        marked_for_deletion: Some(SplitStats {\n            num_splits: 0,\n            total_size_bytes: 0,\n        }),\n    };\n\n    let response = metastore\n        .list_index_stats(ListIndexStatsRequest {\n            index_id_patterns: vec![\"test-list-index-stats-no-splits*\".to_string()],\n        
})\n        .await\n        .unwrap();\n\n    let index_stats = response\n        .index_stats\n        .iter()\n        .find(|index| index.index_uid == Some(index_uid.clone()))\n        .expect(\"Should find index\");\n\n    assert_eq!(index_stats, &expected_stats);\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/list_splits.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse futures::TryStreamExt;\nuse itertools::Itertools;\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::IndexConfig;\nuse quickwit_doc_mapper::tag_pruning::{TagFilterAst, no_tag, tag};\nuse quickwit_proto::metastore::{\n    CreateIndexRequest, ListSplitsRequest, ListStaleSplitsRequest, MarkSplitsForDeletionRequest,\n    PublishSplitsRequest, StageSplitsRequest,\n};\nuse quickwit_proto::types::{IndexUid, NodeId, SplitId};\nuse time::OffsetDateTime;\nuse tokio::time::sleep;\nuse tracing::info;\n\nuse super::{DefaultForTest, to_btree_set};\nuse crate::metastore::MetastoreServiceStreamSplitsExt;\nuse crate::tests::{cleanup_index, collect_split_ids};\nuse crate::{\n    CreateIndexRequestExt, ListSplitsQuery, ListSplitsRequestExt, ListSplitsResponseExt,\n    MetastoreServiceExt, SplitMaturity, SplitMetadata, SplitState, StageSplitsRequestExt,\n};\n\npub async fn test_metastore_list_all_splits<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-list-all-splits\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let 
split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let split_id_3 = format!(\"{index_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let split_id_4 = format!(\"{index_id}--split-4\");\n    let split_metadata_4 = SplitMetadata {\n        split_id: split_id_4.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let split_id_5 = format!(\"{index_id}--split-5\");\n    let split_metadata_5 = SplitMetadata {\n        split_id: split_id_5.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let split_id_6 = format!(\"{index_id}--split-6\");\n    let split_metadata_6 = SplitMetadata {\n        split_id: split_id_6.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n\n    let no_splits = metastore\n        .list_splits(\n            ListSplitsRequest::try_from_index_uid(IndexUid::new_with_random_ulid(\n                \"index-not-found\",\n            ))\n            .unwrap(),\n        )\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    assert!(no_splits.is_empty());\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n       
 index_uid.clone(),\n        vec![\n            split_metadata_1,\n            split_metadata_2,\n            split_metadata_3,\n            split_metadata_4,\n            split_metadata_5,\n            split_metadata_6,\n        ],\n    )\n    .unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    let mark_splits_for_deletion = MarkSplitsForDeletionRequest::new(\n        index_uid.clone(),\n        vec![split_id_3.clone(), split_id_4.clone()],\n    );\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion)\n        .await\n        .unwrap();\n\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    let split_ids = collect_split_ids(&splits);\n    assert_eq!(\n        split_ids,\n        &[\n            &split_id_1,\n            &split_id_2,\n            &split_id_3,\n            &split_id_4,\n            &split_id_5,\n            &split_id_6\n        ]\n    );\n\n    cleanup_index(&mut metastore, index_uid.clone()).await;\n}\n\npub async fn test_metastore_stream_splits<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-stream-splits\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        
.create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let mut split_metadatas_to_create = Vec::new();\n    for split_idx in 1..1001 {\n        let split_id = format!(\"{index_id}--split-{split_idx:0>4}\");\n        let split_metadata = SplitMetadata {\n            split_id: split_id.clone(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        split_metadatas_to_create.push(split_metadata);\n\n        // Stage and publish the accumulated splits in batches of 100.\n        if split_idx % 100 == 0 {\n            let staged_split_ids: Vec<SplitId> = split_metadatas_to_create\n                .iter()\n                .map(|split_metadata| split_metadata.split_id.clone())\n                .collect();\n            let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n                index_uid.clone(),\n                split_metadatas_to_create.clone(),\n            )\n            .unwrap();\n            metastore.stage_splits(stage_splits_request).await.unwrap();\n            let publish_splits_request = PublishSplitsRequest {\n                index_uid: Some(index_uid.clone()),\n                staged_split_ids,\n                ..Default::default()\n            };\n            metastore\n                .publish_splits(publish_splits_request)\n                .await\n                .unwrap();\n            split_metadatas_to_create.clear();\n        }\n    }\n\n    let stream_splits_request = ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap();\n    let mut stream_response = metastore.list_splits(stream_splits_request).await.unwrap();\n    let mut all_splits = Vec::new();\n    for _ in 0..10 {\n        let mut splits = stream_response\n            .try_next()\n            .await\n            .unwrap()\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 100);\n        all_splits.append(&mut 
splits);\n    }\n    all_splits.sort_by_key(|split| split.split_id().to_string());\n    assert_eq!(all_splits[0].split_id(), format!(\"{index_id}--split-0001\"));\n    assert_eq!(\n        all_splits[all_splits.len() - 1].split_id(),\n        format!(\"{index_id}--split-1000\")\n    );\n}\n\npub async fn test_metastore_list_splits<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n    let index_id = append_random_suffix(\"test-list-splits\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        time_range: Some(0..=99),\n        create_timestamp: current_timestamp,\n        maturity: SplitMaturity::Immature {\n            maturation_period: Duration::from_secs(0),\n        },\n        tags: to_btree_set(&[\"tag!\", \"tag:foo\", \"$tag!\", \"$tag:bar\"]),\n        delete_opstamp: 3,\n        ..Default::default()\n    };\n\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        time_range: Some(100..=199),\n        create_timestamp: current_timestamp,\n        maturity: SplitMaturity::Immature {\n            maturation_period: Duration::from_secs(10),\n        },\n        tags: to_btree_set(&[\"tag!\", \"$tag!\", \"$tag:bar\"]),\n        delete_opstamp: 1,\n        ..Default::default()\n    };\n\n    let split_id_3 = format!(\"{index_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: 
index_uid.clone(),\n        time_range: Some(200..=299),\n        create_timestamp: current_timestamp,\n        maturity: SplitMaturity::Immature {\n            maturation_period: Duration::from_secs(20),\n        },\n        tags: to_btree_set(&[\"tag!\", \"tag:foo\", \"tag:baz\", \"$tag!\"]),\n        delete_opstamp: 5,\n        ..Default::default()\n    };\n\n    let split_id_4 = format!(\"{index_id}--split-4\");\n    let split_metadata_4 = SplitMetadata {\n        split_id: split_id_4.clone(),\n        index_uid: index_uid.clone(),\n        time_range: Some(300..=399),\n        tags: to_btree_set(&[\"tag!\", \"tag:foo\", \"$tag!\"]),\n        delete_opstamp: 7,\n        ..Default::default()\n    };\n\n    let split_id_5 = format!(\"{index_id}--split-5\");\n    let split_metadata_5 = SplitMetadata {\n        split_id: split_id_5.clone(),\n        index_uid: index_uid.clone(),\n        time_range: None,\n        create_timestamp: current_timestamp,\n        tags: to_btree_set(&[\"tag!\", \"tag:baz\", \"tag:biz\", \"$tag!\"]),\n        delete_opstamp: 9,\n        ..Default::default()\n    };\n\n    {\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert!(splits.is_empty());\n    }\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            vec![\n                
split_metadata_1.clone(),\n                split_metadata_2.clone(),\n                split_metadata_3.clone(),\n                split_metadata_4.clone(),\n                split_metadata_5.clone(),\n            ],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_limit(3);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(\n            splits.len(),\n            3,\n            \"Expected number of splits returned to match limit.\",\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_offset(3);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        assert_eq!(\n            splits.len(),\n            2,\n            \"Expected 3 splits to be skipped out of the 5 provided splits.\",\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(99);\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids: Vec<&str> = splits\n            .iter()\n            .map(|split| split.split_id())\n            .sorted()\n            .collect();\n        assert_eq!(split_ids, &[&split_id_1, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n 
           .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(200);\n\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_3, &split_id_4, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_end_lt(200);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_1, &split_id_2, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(100);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_1, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(101);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            
.unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_1, &split_id_2, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(199);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_1, &split_id_2, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(200);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_1, &split_id_2, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(201);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_1, &split_id_2, &split_id_3, &split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            
.with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(299);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_1, &split_id_2, &split_id_3, &split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(300);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_1, &split_id_2, &split_id_3, &split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(0)\n            .with_time_range_end_lt(301);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[\n                &split_id_1,\n                &split_id_2,\n                &split_id_3,\n                &split_id_4,\n                &split_id_5\n            ]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            
.with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(301)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_4, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(300)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_4, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(299)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_3, &split_id_4, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(201)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n         
   .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_3, &split_id_4, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(200)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_3, &split_id_4, &split_id_5]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(199)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_2, &split_id_3, &split_id_4, &split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(101)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_2, &split_id_3, &split_id_4, 
&split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(101)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_2, &split_id_3, &split_id_4, &split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(100)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n\n        assert_eq!(\n            split_ids,\n            &[&split_id_2, &split_id_3, &split_id_4, &split_id_5]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(99)\n            .with_time_range_end_lt(400);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[\n                &split_id_1,\n                &split_id_2,\n                &split_id_3,\n                &split_id_4,\n                &split_id_5\n            
]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_time_range_start_gte(1000)\n            .with_time_range_end_lt(1100);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_5]);\n\n        // Artificially increase the create_timestamp\n        sleep(Duration::from_secs(1)).await;\n        // add a split without tag\n        let split_id_6 = format!(\"{index_id}--split-6\");\n        let split_metadata_6 = SplitMetadata {\n            split_id: split_id_6.clone(),\n            index_uid: index_uid.clone(),\n            time_range: None,\n            create_timestamp: OffsetDateTime::now_utc().unix_timestamp(),\n            ..Default::default()\n        };\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            vec![split_metadata_6.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let query =\n            ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[\n                &split_id_1,\n                &split_id_2,\n                &split_id_3,\n                &split_id_4,\n                &split_id_5,\n                &split_id_6,\n            ]\n        );\n\n      
  let tag_filter_ast = TagFilterAst::Or(vec![\n            TagFilterAst::Or(vec![no_tag(\"$tag!\"), tag(\"$tag:bar\")]),\n            TagFilterAst::Or(vec![no_tag(\"tag!\"), tag(\"tag:baz\")]),\n        ]);\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::Staged)\n            .with_tags_filter(tag_filter_ast);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[\n                &split_id_1,\n                &split_id_2,\n                &split_id_3,\n                &split_id_5,\n                &split_id_6,\n            ]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_update_timestamp_gte(current_timestamp);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[\n                &split_id_1,\n                &split_id_2,\n                &split_id_3,\n                &split_id_4,\n                &split_id_5,\n                &split_id_6,\n            ]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_update_timestamp_gte(split_metadata_6.create_timestamp);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let 
split_ids: Vec<&String> = splits\n            .iter()\n            .map(|split| &split.split_metadata.split_id)\n            .sorted()\n            .collect();\n        assert_eq!(split_ids, vec![&split_id_6]);\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .with_create_timestamp_lt(split_metadata_6.create_timestamp);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[\n                &split_id_1,\n                &split_id_2,\n                &split_id_3,\n                &split_id_4,\n                &split_id_5,\n            ]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone()).with_delete_opstamp_lt(6);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(\n            split_ids,\n            &[&split_id_1, &split_id_2, &split_id_3, &split_id_6,]\n        );\n\n        // Test maturity filter\n        let maturity_evaluation_timestamp =\n            OffsetDateTime::from_unix_timestamp(current_timestamp).unwrap();\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .retain_mature(maturity_evaluation_timestamp);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        
assert_eq!(\n            split_ids,\n            &[&split_id_1, &split_id_4, &split_id_5, &split_id_6,]\n        );\n\n        let query = ListSplitsQuery::for_index(index_uid.clone())\n            .retain_immature(maturity_evaluation_timestamp);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        let split_ids = collect_split_ids(&splits);\n        assert_eq!(split_ids, &[&split_id_2, &split_id_3]);\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n}\n\npub async fn test_metastore_list_splits_by_node_id<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let metastore = MetastoreToTest::default_for_test().await;\n\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    let index_id = append_random_suffix(\"test-list-splits-by-node-id\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid\n        .unwrap();\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 20,\n        node_id: \"test-node-1\".to_string(),\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 10,\n       
 node_id: \"test-node-2\".to_string(),\n        ..Default::default()\n    };\n    let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n        index_uid.clone(),\n        vec![split_metadata_1.clone(), split_metadata_2.clone()],\n    )\n    .unwrap();\n\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let list_splits_query =\n        ListSplitsQuery::for_index(index_uid.clone()).with_node_id(NodeId::from(\"test-node-1\"));\n    let list_splits_request =\n        ListSplitsRequest::try_from_list_splits_query(&list_splits_query).unwrap();\n\n    let splits = metastore\n        .list_splits(list_splits_request)\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n\n    assert_eq!(splits.len(), 1);\n    assert_eq!(splits[0].split_metadata.split_id, split_id_1);\n    assert_eq!(splits[0].split_metadata.node_id, \"test-node-1\");\n}\n\npub async fn test_metastore_list_stale_splits<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    let index_id = append_random_suffix(\"test-list-stale-splits\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 20,\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 
10,\n        ..Default::default()\n    };\n    let split_id_3 = format!(\"{index_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 0,\n        ..Default::default()\n    };\n    let split_id_4 = format!(\"{index_id}--split-4\");\n    let split_metadata_4 = SplitMetadata {\n        split_id: split_id_4.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 20,\n        ..Default::default()\n    };\n    // immature split\n    let split_id_5 = format!(\"{index_id}--split-5\");\n    let split_metadata_5 = SplitMetadata {\n        split_id: split_id_5.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        maturity: SplitMaturity::Immature {\n            maturation_period: Duration::from_secs(100),\n        },\n        delete_opstamp: 0,\n        ..Default::default()\n    };\n\n    let list_stale_splits_request = ListStaleSplitsRequest {\n        index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n        delete_opstamp: 0,\n        num_splits: 100,\n    };\n    let no_splits = metastore\n        .list_stale_splits(list_stale_splits_request)\n        .await\n        .unwrap()\n        .deserialize_splits()\n        .await\n        .unwrap();\n    assert!(no_splits.is_empty());\n\n    {\n        info!(\"list stale splits on an index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            vec![\n                
split_metadata_1.clone(),\n                split_metadata_2.clone(),\n                split_metadata_3.clone(),\n                split_metadata_5.clone(),\n            ],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        // Sleep for 1 second to have different publish timestamps.\n        sleep(Duration::from_secs(1)).await;\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            vec![split_metadata_4.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            staged_split_ids: vec![split_id_4.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n        // Sleep for 1 second to have different publish timestamps.\n        tokio::time::sleep(Duration::from_secs(1)).await;\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone(), split_id_5.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            delete_opstamp: 100,\n            num_splits: 1,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 1);\n        assert_eq!(\n            splits[0].split_metadata.delete_opstamp,\n    
        split_metadata_2.delete_opstamp\n        );\n\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            delete_opstamp: 100,\n            num_splits: 4,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 3);\n        assert_eq!(splits[0].split_id(), split_metadata_2.split_id());\n        assert_eq!(splits[1].split_id(), split_metadata_4.split_id());\n        assert_eq!(splits[2].split_id(), split_metadata_1.split_id());\n        assert_eq!(\n            splits[2].split_metadata.delete_opstamp,\n            split_metadata_1.delete_opstamp\n        );\n\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            delete_opstamp: 20,\n            num_splits: 2,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 1);\n        assert_eq!(\n            splits[0].split_metadata.delete_opstamp,\n            split_metadata_2.delete_opstamp\n        );\n\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            delete_opstamp: 10,\n            num_splits: 2,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert!(splits.is_empty());\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n}\n\npub async fn test_metastore_list_sorted_splits<\n    MetastoreToTest: 
MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let split_id = append_random_suffix(\"test-list-sorted-splits-\");\n    let index_id_1 = append_random_suffix(\"test-list-sorted-splits-1\");\n    let index_uid_1 = IndexUid::new_with_random_ulid(&index_id_1);\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let index_id_2 = append_random_suffix(\"test-list-sorted-splits-2\");\n    let index_uid_2 = IndexUid::new_with_random_ulid(&index_id_2);\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n\n    let split_id_1 = format!(\"{split_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid_1.clone(),\n        delete_opstamp: 5,\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{split_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid_2.clone(),\n        delete_opstamp: 3,\n        ..Default::default()\n    };\n    let split_id_3 = format!(\"{split_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid_1.clone(),\n        delete_opstamp: 1,\n        ..Default::default()\n    };\n    let split_id_4 = format!(\"{split_id}--split-4\");\n    let split_metadata_4 = SplitMetadata {\n        split_id: split_id_4.clone(),\n        index_uid: index_uid_2.clone(),\n        delete_opstamp: 0,\n        ..Default::default()\n    };\n    let split_id_5 = format!(\"{split_id}--split-5\");\n    let split_metadata_5 = SplitMetadata {\n        split_id: split_id_5.clone(),\n        index_uid: index_uid_1.clone(),\n        delete_opstamp: 2,\n        ..Default::default()\n    };\n    let 
split_id_6 = format!(\"{split_id}--split-6\");\n    let split_metadata_6 = SplitMetadata {\n        split_id: split_id_6.clone(),\n        index_uid: index_uid_2.clone(),\n        delete_opstamp: 4,\n        ..Default::default()\n    };\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_1).unwrap();\n    let index_uid_1: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_2).unwrap();\n    let index_uid_2: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    {\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid_1.clone(),\n            vec![split_metadata_1, split_metadata_3, split_metadata_5],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid_1.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid_1.clone(), vec![split_id_3.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid_2.clone(),\n            vec![split_metadata_2, split_metadata_4, split_metadata_6],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = 
PublishSplitsRequest {\n            index_uid: Some(index_uid_2.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid_2.clone(), vec![split_id_4.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n    }\n\n    let query =\n        ListSplitsQuery::try_from_index_uids(vec![index_uid_1.clone(), index_uid_2.clone()])\n            .unwrap()\n            .sort_by_staleness();\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    // we don't use collect_split_ids because it sorts splits internally\n    let split_ids = splits\n        .iter()\n        .map(|split| split.split_id())\n        .collect::<Vec<_>>();\n    assert_eq!(\n        split_ids,\n        &[\n            &split_id_4,\n            &split_id_3,\n            &split_id_5,\n            &split_id_2,\n            &split_id_6,\n            &split_id_1,\n        ]\n    );\n\n    let query =\n        ListSplitsQuery::try_from_index_uids(vec![index_uid_1.clone(), index_uid_2.clone()])\n            .unwrap()\n            .sort_by_index_uid();\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    // we don't use collect_split_ids because it sorts splits internally\n    let split_ids = splits\n        .iter()\n        .map(|split| split.split_id())\n        .collect::<Vec<_>>();\n    assert_eq!(\n        split_ids,\n        &[\n            
&split_id_1,\n            &split_id_3,\n            &split_id_5,\n            &split_id_2,\n            &split_id_4,\n            &split_id_6,\n        ]\n    );\n\n    cleanup_index(&mut metastore, index_uid_1.clone()).await;\n    cleanup_index(&mut metastore, index_uid_2.clone()).await;\n}\n\npub async fn test_metastore_list_after_split<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let split_id = append_random_suffix(\"test-list-sorted-splits-\");\n    let index_id_1 = append_random_suffix(\"test-list-sorted-splits-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let index_id_2 = append_random_suffix(\"test-list-sorted-splits-2\");\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_1).unwrap();\n    let index_uid_1: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_2).unwrap();\n    let index_uid_2: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let split_id_1 = format!(\"{split_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid_1.clone(),\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{split_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid_2.clone(),\n        ..Default::default()\n    };\n    let split_id_3 = 
format!(\"{split_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid_1.clone(),\n        ..Default::default()\n    };\n    let split_id_4 = format!(\"{split_id}--split-4\");\n    let split_metadata_4 = SplitMetadata {\n        split_id: split_id_4.clone(),\n        index_uid: index_uid_2.clone(),\n        ..Default::default()\n    };\n    let split_id_5 = format!(\"{split_id}--split-5\");\n    let split_metadata_5 = SplitMetadata {\n        split_id: split_id_5.clone(),\n        index_uid: index_uid_1.clone(),\n        ..Default::default()\n    };\n    let split_id_6 = format!(\"{split_id}--split-6\");\n    let split_metadata_6 = SplitMetadata {\n        split_id: split_id_6.clone(),\n        index_uid: index_uid_2.clone(),\n        ..Default::default()\n    };\n\n    {\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid_1.clone(),\n            vec![\n                split_metadata_1.clone(),\n                split_metadata_3.clone(),\n                split_metadata_5.clone(),\n            ],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid_1.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid_1.clone(), vec![split_id_3.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid_2.clone(),\n            vec![\n                
split_metadata_2.clone(),\n                split_metadata_4.clone(),\n                split_metadata_6.clone(),\n            ],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid_2.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid_2.clone(), vec![split_id_4.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n    }\n\n    let expected_all = [\n        &split_metadata_1,\n        &split_metadata_3,\n        &split_metadata_5,\n        &split_metadata_2,\n        &split_metadata_4,\n        &split_metadata_6,\n    ];\n\n    for i in 0..expected_all.len() {\n        let after = expected_all[i];\n        let expected_res = expected_all[(i + 1)..]\n            .iter()\n            .map(|split| (&split.index_uid, &split.split_id))\n            .collect::<Vec<_>>();\n\n        let query =\n            ListSplitsQuery::try_from_index_uids(vec![index_uid_1.clone(), index_uid_2.clone()])\n                .unwrap()\n                .sort_by_index_uid()\n                .after_split(after);\n        let splits = metastore\n            .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap();\n        // we don't use collect_split_ids because it sorts splits internally\n        let split_ids = splits\n            .iter()\n            .map(|split| {\n                (\n                    &split.split_metadata.index_uid,\n                
    &split.split_metadata.split_id,\n                )\n            })\n            .collect::<Vec<_>>();\n        assert_eq!(split_ids, expected_res,);\n    }\n\n    cleanup_index(&mut metastore, index_uid_1.clone()).await;\n    cleanup_index(&mut metastore, index_uid_2.clone()).await;\n}\n\npub async fn test_metastore_list_splits_from_all_indexes<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let split_id = append_random_suffix(\"test-list-sorted-splits-\");\n    let index_id_1 = append_random_suffix(\"test-list-sorted-splits-1\");\n    let index_uri_1 = format!(\"ram:///indexes/{index_id_1}\");\n    let index_config_1 = IndexConfig::for_test(&index_id_1, &index_uri_1);\n\n    let index_id_2 = append_random_suffix(\"test-list-sorted-splits-2\");\n    let index_uri_2 = format!(\"ram:///indexes/{index_id_2}\");\n    let index_config_2 = IndexConfig::for_test(&index_id_2, &index_uri_2);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_1).unwrap();\n    let index_uid_1: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config_2).unwrap();\n    let index_uid_2: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let split_id_1 = format!(\"{split_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid_1.clone(),\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{split_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid_2.clone(),\n        ..Default::default()\n    };\n    let split_id_3 = 
format!(\"{split_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid_1.clone(),\n        ..Default::default()\n    };\n    let split_id_4 = format!(\"{split_id}--split-4\");\n    let split_metadata_4 = SplitMetadata {\n        split_id: split_id_4.clone(),\n        index_uid: index_uid_2.clone(),\n        ..Default::default()\n    };\n    let split_id_5 = format!(\"{split_id}--split-5\");\n    let split_metadata_5 = SplitMetadata {\n        split_id: split_id_5.clone(),\n        index_uid: index_uid_1.clone(),\n        ..Default::default()\n    };\n    let split_id_6 = format!(\"{split_id}--split-6\");\n    let split_metadata_6 = SplitMetadata {\n        split_id: split_id_6.clone(),\n        index_uid: index_uid_2.clone(),\n        ..Default::default()\n    };\n\n    {\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid_1.clone(),\n            vec![\n                split_metadata_1.clone(),\n                split_metadata_3.clone(),\n                split_metadata_5.clone(),\n            ],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid_1.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid_1.clone(), vec![split_id_3.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid_2.clone(),\n            vec![\n                
split_metadata_2.clone(),\n                split_metadata_4.clone(),\n                split_metadata_6.clone(),\n            ],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid_2.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion =\n            MarkSplitsForDeletionRequest::new(index_uid_2.clone(), vec![split_id_4.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion)\n            .await\n            .unwrap();\n    }\n\n    let expected_all = [\n        &split_metadata_1,\n        &split_metadata_3,\n        &split_metadata_5,\n        &split_metadata_2,\n        &split_metadata_4,\n        &split_metadata_6,\n    ];\n\n    let expected_res = expected_all[1..]\n        .iter()\n        .map(|split| (&split.index_uid, &split.split_id))\n        .collect::<Vec<_>>();\n\n    let query = ListSplitsQuery::for_all_indexes()\n        .sort_by_index_uid()\n        .after_split(expected_all[0]);\n    let splits = metastore\n        .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    // we don't use collect_split_ids because it sorts splits internally\n    let split_ids = splits\n        .iter()\n        .map(|split| {\n            (\n                &split.split_metadata.index_uid,\n                &split.split_metadata.split_id,\n            )\n        })\n        // when running this test against a clean database, this line isn't needed. 
In practice,\n        // any test that leaves any split behind breaks this test if we remove this filter\n        .filter(|(index_uid, _split_id)| {\n            [index_uid_1.clone(), index_uid_2.clone()].contains(index_uid)\n        })\n        .collect::<Vec<_>>();\n    assert_eq!(split_ids, expected_res);\n\n    cleanup_index(&mut metastore, index_uid_1.clone()).await;\n    cleanup_index(&mut metastore, index_uid_2.clone()).await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\n\nuse async_trait::async_trait;\nuse bytesize::ByteSize;\nuse itertools::Itertools;\nuse quickwit_proto::metastore::metastore_service_grpc_client::MetastoreServiceGrpcClient;\nuse quickwit_proto::metastore::{\n    DeleteIndexRequest, DeleteSplitsRequest, ListSplitsRequest, MarkSplitsForDeletionRequest,\n    MetastoreServiceClient, MetastoreServiceGrpcClientAdapter,\n};\nuse quickwit_proto::tonic::transport::Channel;\nuse quickwit_proto::types::IndexUid;\n\npub(crate) mod delete_task;\npub(crate) mod get_identity;\npub(crate) mod index;\npub(crate) mod list_splits;\npub(crate) mod shard;\npub(crate) mod source;\npub(crate) mod split;\npub(crate) mod template;\n\nuse crate::metastore::MetastoreServiceStreamSplitsExt;\nuse crate::{ListSplitsRequestExt, MetastoreServiceExt, Split};\n\nconst MAX_GRPC_MESSAGE_SIZE: ByteSize = ByteSize::mib(1);\n\n#[async_trait]\npub trait DefaultForTest {\n    async fn default_for_test() -> Self;\n}\n\n// We implement the trait to test the gRPC adapter backed by a file backed metastore.\n#[async_trait]\nimpl DefaultForTest for MetastoreServiceGrpcClientAdapter<MetastoreServiceGrpcClient<Channel>> {\n    async fn default_for_test() -> Self {\n        use quickwit_proto::tonic::transport::Server;\n        use quickwit_storage::RamStorage;\n\n        use crate::FileBackedMetastore;\n     
   let metastore =\n            FileBackedMetastore::try_new(std::sync::Arc::new(RamStorage::default()), None)\n                .await\n                .unwrap();\n        let (client, server) = tokio::io::duplex(1024);\n        tokio::spawn(async move {\n            Server::builder()\n                .add_service(\n                    MetastoreServiceClient::new(metastore).as_grpc_service(MAX_GRPC_MESSAGE_SIZE),\n                )\n                .serve_with_incoming(futures::stream::iter(vec![Ok::<_, std::io::Error>(server)]))\n                .await\n        });\n        let channel = create_channel(client).await.unwrap();\n        let (_, connection_keys_watcher) =\n            tokio::sync::watch::channel(std::collections::HashSet::new());\n\n        MetastoreServiceGrpcClientAdapter::new(\n            MetastoreServiceGrpcClient::new(channel),\n            connection_keys_watcher,\n        )\n    }\n}\n\nimpl MetastoreServiceExt\n    for MetastoreServiceGrpcClientAdapter<MetastoreServiceGrpcClient<Channel>>\n{\n}\n\nasync fn create_channel(client: tokio::io::DuplexStream) -> anyhow::Result<Channel> {\n    use http::Uri;\n    use quickwit_proto::tonic::transport::Endpoint;\n\n    let mut outer_client_opt = Some(client);\n    let channel = Endpoint::try_from(\"http://test.server\")?\n        .connect_with_connector(tower::service_fn(move |_: Uri| {\n            let inner_client_opt = outer_client_opt.take();\n            async move {\n                let client = inner_client_opt\n                    .ok_or_else(|| std::io::Error::other(\"client already taken\"))?;\n                std::io::Result::Ok(hyper_util::rt::TokioIo::new(client))\n            }\n        }))\n        .await?;\n    Ok(channel)\n}\n\n// crate::metastore_test_suite!(\n//     quickwit_proto::metastore::MetastoreServiceGrpcClientAdapter<\n//         quickwit_proto::metastore::metastore_service_grpc_client::MetastoreServiceGrpcClient<\n//             
quickwit_proto::tonic::transport::Channel,\n//         >,\n//     >\n// );\n\nfn collect_split_ids(splits: &[Split]) -> Vec<&str> {\n    splits\n        .iter()\n        .map(|split| split.split_id())\n        .sorted()\n        .collect()\n}\n\nfn to_btree_set(tags: &[&str]) -> BTreeSet<String> {\n    tags.iter().map(|tag| tag.to_string()).collect()\n}\n\nasync fn cleanup_index(metastore: &mut dyn MetastoreServiceExt, index_uid: IndexUid) {\n    // List all splits.\n    let all_splits = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n\n    if !all_splits.is_empty() {\n        let all_split_ids: Vec<String> = all_splits\n            .iter()\n            .map(|split| split.split_id().to_string())\n            .collect();\n\n        // Mark splits for deletion.\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), all_split_ids.clone());\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        // Delete splits.\n        let delete_splits_request = DeleteSplitsRequest {\n            index_uid: index_uid.clone().into(),\n            split_ids: all_split_ids,\n        };\n        metastore\n            .delete_splits(delete_splits_request)\n            .await\n            .unwrap();\n    }\n    // Delete index.\n    metastore\n        .delete_index(DeleteIndexRequest {\n            index_uid: index_uid.clone().into(),\n        })\n        .await\n        .unwrap();\n}\n\n/// Macro that generates a test suite for a metastore implementation.\n#[macro_export]\nmacro_rules! 
metastore_test_suite {\n    ($metastore_type:ty) => {\n        #[cfg(test)]\n        mod common_tests {\n\n            // Index API tests\n            //\n            //  - create_index\n            //  - update_index\n            //  - index_exists\n            //  - index_metadata\n            //  - indexes_metadata\n            //  - list_indexes\n            //  - delete_index\n            //  - list_index_stats\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_create_index() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_create_index::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_create_index_with_sources() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_create_index_with_sources::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_retention_policy() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_update_retention_policy::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_search_settings() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_update_search_settings::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_doc_mapping() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_update_doc_mapping::<$metastore_type>().await;\n            }\n\n            
#[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_indexing_settings() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_update_indexing_settings::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_ingest_settings() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_update_ingest_settings::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_create_index_enforces_index_id_maximum_length() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_create_index_enforces_index_id_maximum_length::<\n                    $metastore_type,\n                >()\n                .await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_index_exists() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_index_exists::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_index_metadata() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_index_metadata::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_indexes_metadata() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_indexes_metadata::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            
#[serial_test::file_serial]\n            async fn test_metastore_list_indexes() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_list_indexes::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_all_indexes() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_list_all_indexes::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_delete_index() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_delete_index::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_index_stats() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_list_index_stats::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_index_stats_no_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::index::test_metastore_list_index_stats_no_splits::<$metastore_type>().await;\n            }\n\n            // Split API tests\n            //\n            //  - stage_splits\n            //  - publish_splits\n            //  - stream_splits\n            //  - mark_splits_for_deletion\n            //  - delete_splits\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_publish_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_publish_splits::<$metastore_type>().await;\n 
           }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_publish_splits_concurrency() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_publish_splits_concurrency::<$metastore_type>(\n                )\n                .await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_publish_splits_empty_splits_array_is_allowed() {\n                $crate::tests::split::test_metastore_publish_splits_empty_splits_array_is_allowed::<\n                            $metastore_type,\n                        >()\n                        .await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_replace_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_replace_splits::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_mark_splits_for_deletion() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_mark_splits_for_deletion::<$metastore_type>()\n                    .await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_delete_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_delete_splits::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_stream_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_stream_splits::<$metastore_type>().await;\n          
  }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_all_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_all_splits::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_splits::<$metastore_type>().await;\n            }\n\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_splits_by_node() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_splits_by_node_id::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_split_update_timestamp() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_split_update_timestamp::<$metastore_type>()\n                    .await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_add_source() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::source::test_metastore_add_source::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_source() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::source::test_metastore_update_source::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_toggle_source() {\n    
            let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::source::test_metastore_toggle_source::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_delete_source() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::source::test_metastore_delete_source::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_reset_checkpoint() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::source::test_metastore_reset_checkpoint::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_create_delete_task() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::delete_task::test_metastore_create_delete_task::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_last_delete_opstamp() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::delete_task::test_metastore_last_delete_opstamp::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_delete_index_with_tasks() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::delete_task::test_metastore_delete_index_with_tasks::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_delete_tasks() {\n                let _ = tracing_subscriber::fmt::try_init();\n                
$crate::tests::delete_task::test_metastore_list_delete_tasks::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_stale_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_stale_splits::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_sorted_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_sorted_splits::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_after_split() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_after_split::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_splits_from_all_indexes() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::list_splits::test_metastore_list_splits_from_all_indexes::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_update_splits_delete_opstamp() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::split::test_metastore_update_splits_delete_opstamp::<$metastore_type>()\n                    .await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_stage_splits() {\n                let _ = tracing_subscriber::fmt::try_init();\n                
$crate::tests::split::test_metastore_stage_splits::<$metastore_type>().await;\n            }\n\n            /// Shard API tests\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_open_shards() {\n                $crate::tests::shard::test_metastore_open_shards::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_acquire_shards() {\n                $crate::tests::shard::test_metastore_acquire_shards::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_shards() {\n                $crate::tests::shard::test_metastore_list_shards::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_delete_shards() {\n                $crate::tests::shard::test_metastore_delete_shards::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_prune_shards() {\n                $crate::tests::shard::test_metastore_prune_shards::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::serial]\n            async fn test_metastore_apply_checkpoint_delta_v2_single_shard() {\n                $crate::tests::shard::test_metastore_apply_checkpoint_delta_v2_single_shard::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_apply_checkpoint_delta_v2_multi_shards() {\n                $crate::tests::shard::test_metastore_apply_checkpoint_delta_v2_multi_shards::<$metastore_type>().await;\n            }\n\n            /// Index Template API tests\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n          
  async fn test_metastore_create_index_template() {\n                $crate::tests::template::test_metastore_create_index_template::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_get_index_template() {\n                $crate::tests::template::test_metastore_get_index_template::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_find_index_template_matches() {\n                $crate::tests::template::test_metastore_find_index_template_matches::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_list_index_templates() {\n                $crate::tests::template::test_metastore_list_index_templates::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_delete_index_templates() {\n                $crate::tests::template::test_metastore_delete_index_templates::<$metastore_type>().await;\n            }\n\n            #[tokio::test]\n            #[serial_test::file_serial]\n            async fn test_metastore_get_identity() {\n                let _ = tracing_subscriber::fmt::try_init();\n                $crate::tests::get_identity::test_metastore_get_identity::<$metastore_type>().await;\n            }\n        }\n    };\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/shard.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse async_trait::async_trait;\nuse itertools::Itertools;\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::{IndexConfig, SourceConfig};\nuse quickwit_proto::compatibility_shard_update_timestamp;\nuse quickwit_proto::ingest::{Shard, ShardState};\nuse quickwit_proto::metastore::{\n    AcquireShardsRequest, AddSourceRequest, CreateIndexRequest, DeleteShardsRequest, EntityKind,\n    ListShardsRequest, ListShardsSubrequest, MetastoreError, MetastoreService, OpenShardSubrequest,\n    OpenShardsRequest, PruneShardsRequest, PublishSplitsRequest,\n};\nuse quickwit_proto::types::{DocMappingUid, IndexUid, Position, ShardId, SourceId};\nuse time::OffsetDateTime;\n\nuse super::DefaultForTest;\nuse crate::checkpoint::{IndexCheckpointDelta, PartitionId, SourceCheckpointDelta};\nuse crate::tests::cleanup_index;\nuse crate::{AddSourceRequestExt, CreateIndexRequestExt, MetastoreServiceExt};\n\n#[async_trait]\npub trait ReadWriteShardsForTest {\n    async fn insert_shards(&self, index_uid: &IndexUid, source_id: &SourceId, shards: Vec<Shard>);\n\n    async fn list_all_shards(&self, index_uid: &IndexUid, source_id: &SourceId) -> Vec<Shard>;\n}\n\nstruct TestIndex {\n    index_uid: IndexUid,\n    source_id: SourceId,\n}\n\nimpl TestIndex {\n    async fn create_index_with_source(\n        metastore: &mut dyn MetastoreService,\n        index_id: 
&str,\n        source_config: SourceConfig,\n    ) -> Self {\n        let index_id = append_random_suffix(index_id);\n        let index_uri = format!(\"ram:///indexes/{index_id}\");\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let create_index_response = metastore.create_index(create_index_request).await.unwrap();\n        let index_uid: IndexUid = create_index_response.index_uid().clone();\n\n        let add_source_request =\n            AddSourceRequest::try_from_source_config(index_uid.clone(), &source_config).unwrap();\n        metastore.add_source(add_source_request).await.unwrap();\n\n        Self {\n            index_uid,\n            source_id: source_config.source_id,\n        }\n    }\n}\n\npub async fn test_metastore_open_shards<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-open-shards\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    // Test empty request.\n    let open_shards_request = OpenShardsRequest {\n        subrequests: Vec::new(),\n    };\n    let open_shards_response = metastore.open_shards(open_shards_request).await.unwrap();\n    assert!(open_shards_response.subresponses.is_empty());\n\n    // Test index not found.\n    // let open_shards_request = OpenShardsRequest {\n    //     subrequests: vec![OpenShardSubrequest {\n    //         index_uid: \"index-does-not-exist:0\".to_string(),\n    //         source_id: test_index.source_id.clone(),\n    //         leader_id: \"test-ingester-foo\".to_string(),\n    //         ..Default::default()\n    //     }],\n    // };\n    // let error = metastore\n    //     
.open_shards(open_shards_request)\n    //     .await\n    //     .unwrap_err();\n    // assert!(\n    //     matches!(error, MetastoreError::NotFound(EntityKind::Index { index_id }) if index_id ==\n    // \"index-does-not-exist\") );\n\n    // // Test source not found.\n    // let open_shards_request = OpenShardsRequest {\n    //     subrequests: vec![OpenShardSubrequest {\n    //         index_uid: Some(test_index.index_uid.clone()),\n    //         source_id: \"source-does-not-exist\".to_string(),\n    //         leader_id: \"test-ingester-foo\".to_string(),\n    //         ..Default::default()\n    //     }],\n    // };\n    // let error = metastore\n    //     .open_shards(open_shards_request)\n    //     .await\n    //     .unwrap_err();\n    // assert!(\n    //     matches!(error, MetastoreError::NotFound(EntityKind::Source { source_id, ..}) if\n    // source_id == \"source-does-not-exist\") );\n\n    // Test open shard #1.\n    let open_shards_request = OpenShardsRequest {\n        subrequests: vec![OpenShardSubrequest {\n            subrequest_id: 0,\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-ingester-foo\".to_string(),\n            follower_id: Some(\"test-ingester-bar\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_token: None,\n        }],\n    };\n    let open_shards_response = metastore.open_shards(open_shards_request).await.unwrap();\n    assert_eq!(open_shards_response.subresponses.len(), 1);\n\n    let subresponse = &open_shards_response.subresponses[0];\n    assert_eq!(subresponse.subrequest_id, 0);\n\n    let shard = subresponse.open_shard();\n    assert_eq!(shard.index_uid(), &test_index.index_uid);\n    assert_eq!(shard.source_id, test_index.source_id);\n    assert_eq!(shard.shard_id(), ShardId::from(1));\n    assert_eq!(shard.shard_state(), 
ShardState::Open);\n    assert_eq!(shard.leader_id, \"test-ingester-foo\");\n    assert_eq!(shard.follower_id(), \"test-ingester-bar\");\n    assert_eq!(shard.doc_mapping_uid(), DocMappingUid::default(),);\n    assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n    let shard_ts = shard.update_timestamp;\n    assert_ne!(shard_ts, compatibility_shard_update_timestamp());\n    assert_ne!(shard_ts, 0);\n    assert!(shard.publish_token.is_none());\n\n    // Test open shard #1 is idempotent.\n    let open_shards_request = OpenShardsRequest {\n        subrequests: vec![OpenShardSubrequest {\n            subrequest_id: 0,\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            leader_id: \"test-ingester-foo\".to_string(),\n            follower_id: Some(\"test-ingester-bar\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_token: Some(\"publish-token-baz\".to_string()),\n        }],\n    };\n    let open_shards_response = metastore.open_shards(open_shards_request).await.unwrap();\n    assert_eq!(open_shards_response.subresponses.len(), 1);\n\n    let subresponse = &open_shards_response.subresponses[0];\n    assert_eq!(subresponse.subrequest_id, 0);\n\n    let shard = subresponse.open_shard();\n    assert_eq!(shard.index_uid(), &test_index.index_uid);\n    assert_eq!(shard.source_id, test_index.source_id);\n    assert_eq!(shard.shard_id(), ShardId::from(1));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.leader_id, \"test-ingester-foo\");\n    assert_eq!(shard.follower_id(), \"test-ingester-bar\");\n    assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n    assert_eq!(shard.update_timestamp, shard_ts);\n    assert!(shard.publish_token.is_none());\n\n    // Test open shard #2.\n    let open_shards_request = OpenShardsRequest {\n        
subrequests: vec![OpenShardSubrequest {\n            subrequest_id: 0,\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            leader_id: \"test-ingester-foo\".to_string(),\n            follower_id: None,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_token: Some(\"publish-token-open\".to_string()),\n        }],\n    };\n    let open_shards_response = metastore.open_shards(open_shards_request).await.unwrap();\n    assert_eq!(open_shards_response.subresponses.len(), 1);\n\n    let subresponse = &open_shards_response.subresponses[0];\n    assert_eq!(subresponse.subrequest_id, 0);\n\n    let shard = subresponse.open_shard();\n    assert_eq!(shard.index_uid(), &test_index.index_uid);\n    assert_eq!(shard.source_id, test_index.source_id);\n    assert_eq!(shard.shard_id(), ShardId::from(2));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.leader_id, \"test-ingester-foo\");\n    assert!(shard.follower_id.is_none());\n    assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n    assert_eq!(shard.publish_token(), \"publish-token-open\");\n\n    cleanup_index(&mut metastore, test_index.index_uid).await;\n}\n\npub async fn test_metastore_acquire_shards<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-acquire-shards\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    let shards = vec![\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Closed as i32,\n            
leader_id: \"test-ingester-foo\".to_string(),\n            follower_id: Some(\"test-ingester-bar\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Beginning),\n            publish_token: Some(\"test-publish-token-foo\".to_string()),\n            update_timestamp: 1724158996,\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"test-ingester-bar\".to_string(),\n            follower_id: Some(\"test-ingester-qux\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Beginning),\n            publish_token: Some(\"test-publish-token-bar\".to_string()),\n            update_timestamp: 1724158996,\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(3)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"test-ingester-qux\".to_string(),\n            follower_id: Some(\"test-ingester-baz\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Beginning),\n            publish_token: None,\n            update_timestamp: 1724158996,\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(4)),\n            shard_state: ShardState::Open as i32,\n            leader_id: \"test-ingester-baz\".to_string(),\n            follower_id: Some(\"test-ingester-tux\".to_string()),\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            
publish_position_inclusive: Some(Position::Beginning),\n            publish_token: None,\n            update_timestamp: 1724158996,\n        },\n    ];\n    metastore\n        .insert_shards(&test_index.index_uid, &test_index.source_id, shards)\n        .await;\n\n    // Test acquire shards.\n    let acquire_shards_request = AcquireShardsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        source_id: test_index.source_id.clone(),\n        shard_ids: vec![\n            ShardId::from(1),\n            ShardId::from(2),\n            ShardId::from(3),\n            ShardId::from(666),\n        ], // shard 666 does not exist\n        publish_token: \"test-publish-token-foo\".to_string(),\n    };\n    let mut acquire_shards_response = metastore\n        .acquire_shards(acquire_shards_request)\n        .await\n        .unwrap();\n\n    acquire_shards_response\n        .acquired_shards\n        .sort_unstable_by(|left, right| left.shard_id().cmp(right.shard_id()));\n\n    let shard = &acquire_shards_response.acquired_shards[0];\n    assert_eq!(shard.index_uid(), &test_index.index_uid);\n    assert_eq!(shard.source_id, test_index.source_id);\n    assert_eq!(shard.shard_id(), ShardId::from(1));\n    assert_eq!(shard.shard_state(), ShardState::Closed);\n    assert_eq!(shard.leader_id, \"test-ingester-foo\");\n    assert_eq!(shard.follower_id(), \"test-ingester-bar\");\n    assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n    assert_eq!(shard.publish_token(), \"test-publish-token-foo\");\n\n    let shard = &acquire_shards_response.acquired_shards[1];\n    assert_eq!(shard.index_uid(), &test_index.index_uid);\n    assert_eq!(shard.source_id, test_index.source_id);\n    assert_eq!(shard.shard_id(), ShardId::from(2));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.leader_id, \"test-ingester-bar\");\n    assert_eq!(shard.follower_id(), \"test-ingester-qux\");\n    assert_eq!(shard.publish_position_inclusive(), 
Position::Beginning);\n    assert_eq!(shard.publish_token(), \"test-publish-token-foo\");\n\n    let shard = &acquire_shards_response.acquired_shards[2];\n    assert_eq!(shard.index_uid(), &test_index.index_uid);\n    assert_eq!(shard.source_id, test_index.source_id);\n    assert_eq!(shard.shard_id(), ShardId::from(3));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.leader_id, \"test-ingester-qux\");\n    assert_eq!(shard.follower_id(), \"test-ingester-baz\");\n    assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n    assert_eq!(shard.publish_token(), \"test-publish-token-foo\");\n\n    cleanup_index(&mut metastore, test_index.index_uid).await;\n}\n\npub async fn test_metastore_list_shards<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index_0 = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-list-shards-0\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    let test_index_1 = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-list-shards-1\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    for test_index in [&test_index_0, &test_index_1] {\n        let shards = vec![\n            Shard {\n                index_uid: Some(test_index.index_uid.clone()),\n                source_id: test_index.source_id.clone(),\n                shard_id: Some(ShardId::from(1)),\n                shard_state: ShardState::Open as i32,\n                leader_id: \"test-ingester-foo\".to_string(),\n                follower_id: Some(\"test-ingester-bar\".to_string()),\n                doc_mapping_uid: Some(DocMappingUid::default()),\n                publish_position_inclusive: Some(Position::Beginning),\n                publish_token: Some(\"test-publish-token-foo\".to_string()),\n                
update_timestamp: 1724158996,\n            },\n            Shard {\n                index_uid: Some(test_index.index_uid.clone()),\n                source_id: test_index.source_id.clone(),\n                shard_id: Some(ShardId::from(2)),\n                shard_state: ShardState::Closed as i32,\n                leader_id: \"test-ingester-bar\".to_string(),\n                follower_id: Some(\"test-ingester-qux\".to_string()),\n                doc_mapping_uid: Some(DocMappingUid::default()),\n                publish_position_inclusive: Some(Position::Beginning),\n                publish_token: Some(\"test-publish-token-bar\".to_string()),\n                update_timestamp: 1724158997,\n            },\n        ];\n        metastore\n            .insert_shards(&test_index.index_uid, &test_index.source_id, shards)\n            .await;\n    }\n\n    // Test list shards.\n    let list_shards_request = ListShardsRequest {\n        subrequests: vec![\n            ListShardsSubrequest {\n                index_uid: Some(test_index_0.index_uid.clone()),\n                source_id: test_index_0.source_id.clone(),\n                shard_state: None,\n            },\n            ListShardsSubrequest {\n                index_uid: Some(test_index_1.index_uid.clone()),\n                source_id: test_index_1.source_id.clone(),\n                shard_state: None,\n            },\n        ],\n    };\n    let mut list_shards_response = metastore.list_shards(list_shards_request).await.unwrap();\n    assert_eq!(list_shards_response.subresponses.len(), 2);\n\n    list_shards_response\n        .subresponses\n        .sort_unstable_by(|left, right| left.index_uid().cmp(right.index_uid()));\n\n    for (idx, test_index) in [&test_index_0, &test_index_1].into_iter().enumerate() {\n        let subresponse = &mut list_shards_response.subresponses[idx];\n        assert_eq!(subresponse.index_uid(), &test_index.index_uid);\n        assert_eq!(subresponse.source_id, test_index.source_id);\n       
 assert_eq!(subresponse.shards.len(), 2);\n\n        subresponse\n            .shards\n            .sort_unstable_by(|left, right| left.shard_id.cmp(&right.shard_id));\n\n        let shard = &subresponse.shards[0];\n        assert_eq!(shard.index_uid(), &test_index.index_uid);\n        assert_eq!(shard.source_id, test_index.source_id);\n        assert_eq!(shard.shard_id(), ShardId::from(1));\n        assert_eq!(shard.shard_state(), ShardState::Open);\n        assert_eq!(shard.leader_id, \"test-ingester-foo\");\n        assert_eq!(shard.follower_id(), \"test-ingester-bar\");\n        assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n        assert_eq!(shard.publish_token(), \"test-publish-token-foo\");\n        assert_eq!(shard.update_timestamp, 1724158996);\n\n        let shard = &subresponse.shards[1];\n        assert_eq!(shard.index_uid(), &test_index.index_uid);\n        assert_eq!(shard.source_id, test_index.source_id);\n        assert_eq!(shard.shard_id(), ShardId::from(2));\n        assert_eq!(shard.shard_state(), ShardState::Closed);\n        assert_eq!(shard.leader_id, \"test-ingester-bar\");\n        assert_eq!(shard.follower_id(), \"test-ingester-qux\");\n        assert_eq!(shard.publish_position_inclusive(), Position::Beginning);\n        assert_eq!(shard.publish_token(), \"test-publish-token-bar\");\n        assert_eq!(shard.update_timestamp, 1724158997);\n    }\n\n    // Test list shards with shard state filter.\n    let list_shards_request = ListShardsRequest {\n        subrequests: vec![\n            ListShardsSubrequest {\n                index_uid: Some(test_index_0.index_uid.clone()),\n                source_id: test_index_0.source_id.clone(),\n                shard_state: Some(ShardState::Open as i32),\n            },\n            ListShardsSubrequest {\n                index_uid: Some(test_index_1.index_uid.clone()),\n                source_id: test_index_1.source_id.clone(),\n                shard_state: 
Some(ShardState::Closed as i32),\n            },\n        ],\n    };\n    let mut list_shards_response = metastore.list_shards(list_shards_request).await.unwrap();\n    assert_eq!(list_shards_response.subresponses.len(), 2);\n\n    list_shards_response\n        .subresponses\n        .sort_unstable_by(|left, right| left.index_uid().cmp(right.index_uid()));\n\n    assert_eq!(list_shards_response.subresponses[0].shards.len(), 1);\n\n    let shard = &list_shards_response.subresponses[0].shards[0];\n    assert_eq!(shard.shard_id(), ShardId::from(1));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n\n    assert_eq!(list_shards_response.subresponses[1].shards.len(), 1);\n\n    let shard = &list_shards_response.subresponses[1].shards[0];\n    assert_eq!(shard.shard_id(), ShardId::from(2));\n    assert_eq!(shard.shard_state(), ShardState::Closed);\n\n    let list_shards_request = ListShardsRequest {\n        subrequests: vec![ListShardsSubrequest {\n            index_uid: Some(test_index_0.index_uid.clone()),\n            source_id: test_index_0.source_id.clone(),\n            shard_state: Some(ShardState::Unavailable as i32),\n        }],\n    };\n    let list_shards_response = metastore.list_shards(list_shards_request).await.unwrap();\n    assert_eq!(list_shards_response.subresponses.len(), 1);\n    assert!(list_shards_response.subresponses[0].shards.is_empty());\n\n    cleanup_index(&mut metastore, test_index_0.index_uid).await;\n}\n\npub async fn test_metastore_delete_shards<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-delete-shards\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    let shards = vec![\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: 
test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Beginning),\n            ..Default::default()\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Beginning),\n            ..Default::default()\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(3)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Eof(None)),\n            ..Default::default()\n        },\n    ];\n    metastore\n        .insert_shards(&test_index.index_uid, &test_index.source_id, shards)\n        .await;\n\n    // Attempt to delete shards #1, #2, #3, and #4.\n    let delete_index_request = DeleteShardsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        source_id: test_index.source_id.clone(),\n        shard_ids: vec![\n            ShardId::from(1),\n            ShardId::from(2),\n            ShardId::from(3),\n            ShardId::from(4),\n        ],\n        force: false,\n    };\n    let mut response = metastore.delete_shards(delete_index_request).await.unwrap();\n\n    assert_eq!(response.index_uid(), &test_index.index_uid);\n    assert_eq!(response.source_id, test_index.source_id);\n    assert_eq!(response.successes.len(), 2);\n    assert_eq!(response.failures.len(), 2);\n\n    response.successes.sort_unstable();\n    
assert_eq!(response.successes[0], ShardId::from(3));\n    assert_eq!(response.successes[1], ShardId::from(4));\n\n    response.failures.sort_unstable();\n    assert_eq!(response.failures[0], ShardId::from(1));\n    assert_eq!(response.failures[1], ShardId::from(2));\n\n    let mut all_shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n    assert_eq!(all_shards.len(), 2);\n\n    all_shards.sort_unstable_by(|left, right| left.shard_id.cmp(&right.shard_id));\n\n    assert_eq!(all_shards[0].shard_id(), ShardId::from(1));\n    assert_eq!(all_shards[1].shard_id(), ShardId::from(2));\n\n    // Force-delete shards #1, #2, #3, and #4 (`force: true`).\n    let delete_index_request = DeleteShardsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        source_id: test_index.source_id.clone(),\n        shard_ids: vec![\n            ShardId::from(1),\n            ShardId::from(2),\n            ShardId::from(3),\n            ShardId::from(4),\n        ],\n        force: true,\n    };\n    let mut response = metastore.delete_shards(delete_index_request).await.unwrap();\n\n    assert_eq!(response.index_uid(), &test_index.index_uid);\n    assert_eq!(response.source_id, test_index.source_id);\n\n    assert_eq!(response.successes.len(), 4);\n    assert_eq!(response.failures.len(), 0);\n\n    response.successes.sort_unstable();\n    assert_eq!(response.successes[0], ShardId::from(1));\n    assert_eq!(response.successes[1], ShardId::from(2));\n    assert_eq!(response.successes[2], ShardId::from(3));\n    assert_eq!(response.successes[3], ShardId::from(4));\n\n    let all_shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n\n    assert_eq!(all_shards.len(), 0);\n\n    cleanup_index(&mut metastore, test_index.index_uid).await;\n}\n\npub async fn test_metastore_prune_shards<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + 
ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-prune-shards\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    let now_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    let oldest_shard_age = 10000u32;\n\n    // Create shards with timestamp intervals of 100s starting from\n    // now_timestamp - oldest_shard_age\n    let shards = (0..100)\n        .map(|shard_id| Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(shard_id)),\n            shard_state: ShardState::Closed as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::Beginning),\n            update_timestamp: now_timestamp - oldest_shard_age as i64 + shard_id as i64 * 100,\n            ..Default::default()\n        })\n        .collect_vec();\n\n    metastore\n        .insert_shards(&test_index.index_uid, &test_index.source_id, shards)\n        .await;\n\n    // noop prune request\n    {\n        let prune_index_request = PruneShardsRequest {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            max_age_secs: None,\n            max_count: None,\n            interval_secs: None,\n        };\n        metastore.prune_shards(prune_index_request).await.unwrap();\n        let all_shards = metastore\n            .list_all_shards(&test_index.index_uid, &test_index.source_id)\n            .await;\n        assert_eq!(all_shards.len(), 100);\n    }\n\n    // delete the 4 oldest shards with age limit\n    {\n        let prune_index_request = PruneShardsRequest {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            
max_age_secs: Some(oldest_shard_age - 350),\n            max_count: None,\n            interval_secs: None,\n        };\n        metastore.prune_shards(prune_index_request).await.unwrap();\n\n        let mut all_shards = metastore\n            .list_all_shards(&test_index.index_uid, &test_index.source_id)\n            .await;\n        assert_eq!(all_shards.len(), 96);\n        all_shards.sort_unstable_by_key(|shard| shard.update_timestamp);\n        assert_eq!(all_shards[0].shard_id(), ShardId::from(4));\n        assert_eq!(all_shards[95].shard_id(), ShardId::from(99));\n    }\n\n    // delete 6 more shards with count limit\n    {\n        let prune_index_request = PruneShardsRequest {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            max_age_secs: None,\n            max_count: Some(90),\n            interval_secs: None,\n        };\n        metastore.prune_shards(prune_index_request).await.unwrap();\n        let mut all_shards = metastore\n            .list_all_shards(&test_index.index_uid, &test_index.source_id)\n            .await;\n        assert_eq!(all_shards.len(), 90);\n        all_shards.sort_unstable_by_key(|shard| shard.update_timestamp);\n        assert_eq!(all_shards[0].shard_id(), ShardId::from(10));\n        assert_eq!(all_shards[89].shard_id(), ShardId::from(99));\n    }\n\n    // age limit is the limiting factor, delete 20 more shards\n    let prune_index_request = PruneShardsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        source_id: test_index.source_id.clone(),\n        max_age_secs: Some(oldest_shard_age - 2950),\n        max_count: Some(80),\n        interval_secs: None,\n    };\n    metastore.prune_shards(prune_index_request).await.unwrap();\n    let all_shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n    assert_eq!(all_shards.len(), 70);\n\n    // count limit is the limiting 
factor, delete 20 more shards\n    let prune_index_request = PruneShardsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        source_id: test_index.source_id.clone(),\n        max_age_secs: Some(oldest_shard_age - 4000),\n        max_count: Some(50),\n        interval_secs: None,\n    };\n    metastore.prune_shards(prune_index_request).await.unwrap();\n    let all_shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n    assert_eq!(all_shards.len(), 50);\n\n    cleanup_index(&mut metastore, test_index.index_uid).await;\n}\n\npub async fn test_metastore_apply_checkpoint_delta_v2_single_shard<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-delete-shards\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    let mut source_delta = SourceCheckpointDelta::default();\n    source_delta\n        .record_partition_delta(\n            PartitionId::from(0u64),\n            Position::Beginning,\n            Position::offset(0u64),\n        )\n        .unwrap();\n    let index_checkpoint_delta = IndexCheckpointDelta {\n        source_id: test_index.source_id.clone(),\n        source_delta,\n    };\n    let index_checkpoint_delta_json = serde_json::to_string(&index_checkpoint_delta).unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        staged_split_ids: Vec::new(),\n        replaced_split_ids: Vec::new(),\n        index_checkpoint_delta_json_opt: Some(index_checkpoint_delta_json),\n        publish_token_opt: Some(\"test-publish-token-foo\".to_string()),\n    };\n    let error = metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap_err();\n    
assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Shard { .. })\n    ));\n\n    let dummy_create_timestamp = 1;\n    let shards = vec![Shard {\n        index_uid: Some(test_index.index_uid.clone()),\n        source_id: test_index.source_id.clone(),\n        shard_id: Some(ShardId::from(0)),\n        shard_state: ShardState::Open as i32,\n        doc_mapping_uid: Some(DocMappingUid::default()),\n        publish_position_inclusive: Some(Position::Beginning),\n        publish_token: Some(\"test-publish-token-bar\".to_string()),\n        update_timestamp: dummy_create_timestamp,\n        ..Default::default()\n    }];\n    metastore\n        .insert_shards(&test_index.index_uid, &test_index.source_id, shards)\n        .await;\n\n    let index_checkpoint_delta_json = serde_json::to_string(&index_checkpoint_delta).unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        staged_split_ids: Vec::new(),\n        replaced_split_ids: Vec::new(),\n        index_checkpoint_delta_json_opt: Some(index_checkpoint_delta_json),\n        publish_token_opt: Some(\"test-publish-token-foo\".to_string()),\n    };\n    let error = metastore\n        .publish_splits(publish_splits_request.clone())\n        .await\n        .unwrap_err();\n    assert!(\n        matches!(error, MetastoreError::InvalidArgument { message } if message.contains(\"token\"))\n    );\n\n    let index_checkpoint_delta_json = serde_json::to_string(&index_checkpoint_delta).unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        staged_split_ids: Vec::new(),\n        replaced_split_ids: Vec::new(),\n        index_checkpoint_delta_json_opt: Some(index_checkpoint_delta_json),\n        publish_token_opt: Some(\"test-publish-token-bar\".to_string()),\n    };\n    metastore\n        .publish_splits(publish_splits_request.clone())\n        .await\n   
     .unwrap();\n\n    let shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n    assert_eq!(shards.len(), 1);\n    assert_eq!(shards[0].shard_state(), ShardState::Open);\n    assert_eq!(\n        shards[0].publish_position_inclusive(),\n        Position::offset(0u64)\n    );\n    assert!(\n        shards[0].update_timestamp > dummy_create_timestamp,\n        \"shard timestamp was not updated\"\n    );\n\n    let index_checkpoint_delta_json = serde_json::to_string(&index_checkpoint_delta).unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        staged_split_ids: Vec::new(),\n        replaced_split_ids: Vec::new(),\n        index_checkpoint_delta_json_opt: Some(index_checkpoint_delta_json),\n        publish_token_opt: Some(\"test-publish-token-bar\".to_string()),\n    };\n    let error = metastore\n        .publish_splits(publish_splits_request.clone())\n        .await\n        .unwrap_err();\n    assert!(\n        matches!(error, MetastoreError::InvalidArgument { message } if message.contains(\"checkpoint\"))\n    );\n\n    let mut source_delta = SourceCheckpointDelta::default();\n    source_delta\n        .record_partition_delta(\n            PartitionId::from(0u64),\n            Position::offset(0u64),\n            Position::eof(1u64),\n        )\n        .unwrap();\n    let index_checkpoint_delta = IndexCheckpointDelta {\n        source_id: test_index.source_id.clone(),\n        source_delta,\n    };\n    let index_checkpoint_delta_json = serde_json::to_string(&index_checkpoint_delta).unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        staged_split_ids: Vec::new(),\n        replaced_split_ids: Vec::new(),\n        index_checkpoint_delta_json_opt: Some(index_checkpoint_delta_json),\n        publish_token_opt: 
Some(\"test-publish-token-bar\".to_string()),\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    let shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n    assert_eq!(shards.len(), 1);\n    assert_eq!(shards[0].shard_state(), ShardState::Closed);\n    assert_eq!(shards[0].publish_position_inclusive(), Position::eof(1u64));\n    cleanup_index(&mut metastore, test_index.index_uid).await;\n}\n\npub async fn test_metastore_apply_checkpoint_delta_v2_multi_shards<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest + ReadWriteShardsForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n\n    let test_index = TestIndex::create_index_with_source(\n        &mut metastore,\n        \"test-delete-shards\",\n        SourceConfig::ingest_v2(),\n    )\n    .await;\n\n    let dummy_create_timestamp = 1;\n    let shards = vec![\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(0)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::offset(0u64)),\n            publish_token: Some(\"test-publish-token-foo\".to_string()),\n            update_timestamp: dummy_create_timestamp,\n            ..Default::default()\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(1)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::offset(1u64)),\n            publish_token: Some(\"test-publish-token-foo\".to_string()),\n            
update_timestamp: dummy_create_timestamp,\n            ..Default::default()\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(2)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::offset(2u64)),\n            publish_token: Some(\"test-publish-token-foo\".to_string()),\n            update_timestamp: dummy_create_timestamp,\n            ..Default::default()\n        },\n        Shard {\n            index_uid: Some(test_index.index_uid.clone()),\n            source_id: test_index.source_id.clone(),\n            shard_id: Some(ShardId::from(3)),\n            shard_state: ShardState::Open as i32,\n            doc_mapping_uid: Some(DocMappingUid::default()),\n            publish_position_inclusive: Some(Position::offset(3u64)),\n            publish_token: Some(\"test-publish-token-bar\".to_string()),\n            update_timestamp: dummy_create_timestamp,\n            ..Default::default()\n        },\n    ];\n    metastore\n        .insert_shards(&test_index.index_uid, &test_index.source_id, shards)\n        .await;\n\n    let mut source_delta = SourceCheckpointDelta::default();\n    source_delta\n        .record_partition_delta(\n            PartitionId::from(0u64),\n            Position::offset(0u64),\n            Position::offset(10u64),\n        )\n        .unwrap();\n    source_delta\n        .record_partition_delta(\n            PartitionId::from(1u64),\n            Position::offset(1u64),\n            Position::offset(11u64),\n        )\n        .unwrap();\n    source_delta\n        .record_partition_delta(\n            PartitionId::from(2u64),\n            Position::offset(2u64),\n            Position::eof(12u64),\n        )\n        .unwrap();\n    let index_checkpoint_delta = IndexCheckpointDelta {\n        
source_id: test_index.source_id.clone(),\n        source_delta,\n    };\n    let index_checkpoint_delta_json = serde_json::to_string(&index_checkpoint_delta).unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(test_index.index_uid.clone()),\n        staged_split_ids: Vec::new(),\n        replaced_split_ids: Vec::new(),\n        index_checkpoint_delta_json_opt: Some(index_checkpoint_delta_json),\n        publish_token_opt: Some(\"test-publish-token-foo\".to_string()),\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    let mut shards = metastore\n        .list_all_shards(&test_index.index_uid, &test_index.source_id)\n        .await;\n    assert_eq!(shards.len(), 4);\n\n    shards.sort_unstable_by(|left, right| left.shard_id.cmp(&right.shard_id));\n\n    let shard = &shards[0];\n    assert_eq!(shard.shard_id(), ShardId::from(0));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.publish_position_inclusive(), Position::offset(10u64));\n    assert!(shard.update_timestamp > dummy_create_timestamp);\n\n    let shard = &shards[1];\n    assert_eq!(shard.shard_id(), ShardId::from(1));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.publish_position_inclusive(), Position::offset(11u64));\n    assert!(shard.update_timestamp > dummy_create_timestamp);\n\n    let shard = &shards[2];\n    assert_eq!(shard.shard_id(), ShardId::from(2));\n    assert_eq!(shard.shard_state(), ShardState::Closed);\n    assert_eq!(shard.publish_position_inclusive(), Position::eof(12u64));\n    assert!(shard.update_timestamp > dummy_create_timestamp);\n\n    let shard = &shards[3];\n    assert_eq!(shard.shard_id(), ShardId::from(3));\n    assert_eq!(shard.shard_state(), ShardState::Open);\n    assert_eq!(shard.publish_position_inclusive(), Position::offset(3u64));\n    assert_eq!(shard.update_timestamp, dummy_create_timestamp);\n\n    
cleanup_index(&mut metastore, test_index.index_uid).await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/source.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroUsize;\n\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::{\n    IndexConfig, SourceConfig, SourceInputFormat, SourceParams, TransformConfig,\n};\nuse quickwit_proto::metastore::{\n    AddSourceRequest, CreateIndexRequest, DeleteSourceRequest, EntityKind, IndexMetadataRequest,\n    MetastoreError, PublishSplitsRequest, ResetSourceCheckpointRequest, SourceType,\n    StageSplitsRequest, ToggleSourceRequest, UpdateSourceRequest,\n};\nuse quickwit_proto::types::IndexUid;\n\nuse super::DefaultForTest;\nuse crate::checkpoint::SourceCheckpoint;\nuse crate::metastore::UpdateSourceRequestExt;\nuse crate::tests::cleanup_index;\nuse crate::{\n    AddSourceRequestExt, CreateIndexRequestExt, IndexMetadataResponseExt, MetastoreServiceExt,\n    SplitMetadata, StageSplitsRequestExt,\n};\n\npub async fn test_metastore_add_source<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-add-source\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        
.create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let source_id = format!(\"{index_id}--source\");\n\n    let source = SourceConfig {\n        source_id: source_id.to_string(),\n        num_pipelines: NonZeroUsize::MIN,\n        enabled: true,\n        source_params: SourceParams::void(),\n        transform_config: None,\n        input_format: SourceInputFormat::Json,\n    };\n\n    assert_eq!(\n        metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap()\n            .checkpoint\n            .source_checkpoint(&source_id),\n        None\n    );\n\n    let add_source_request =\n        AddSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap();\n    metastore.add_source(add_source_request).await.unwrap();\n\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n\n    let sources = &index_metadata.sources;\n    assert_eq!(sources.len(), 1);\n    assert!(sources.contains_key(&source_id));\n    assert_eq!(sources.get(&source_id).unwrap().source_id, source_id);\n    assert_eq!(\n        sources.get(&source_id).unwrap().source_type(),\n        SourceType::Void\n    );\n    assert_eq!(\n        index_metadata.checkpoint.source_checkpoint(&source_id),\n        Some(&SourceCheckpoint::default())\n    );\n\n    assert!(matches!(\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap()\n            )\n            .await\n            .unwrap_err(),\n        MetastoreError::AlreadyExists(EntityKind::Source { .. 
})\n    ));\n    assert!(matches!(\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(\n                    IndexUid::new_with_random_ulid(\"index-not-found\"),\n                    &source\n                )\n                .unwrap()\n            )\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n    assert!(matches!(\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(\n                    IndexUid::new_with_random_ulid(&index_id),\n                    &source\n                )\n                .unwrap()\n            )\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_update_source<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-add-source\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let source_id = format!(\"{index_id}--source\");\n\n    let mut source = SourceConfig {\n        source_id: source_id.to_string(),\n        num_pipelines: NonZeroUsize::MIN,\n        enabled: true,\n        source_params: SourceParams::void(),\n        transform_config: None,\n        input_format: SourceInputFormat::Json,\n    };\n\n    assert_eq!(\n        metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await\n         
   .unwrap()\n            .deserialize_index_metadata()\n            .unwrap()\n            .checkpoint\n            .source_checkpoint(&source_id),\n        None\n    );\n\n    let add_source_request =\n        AddSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap();\n    metastore.add_source(add_source_request).await.unwrap();\n\n    source.transform_config = Some(TransformConfig::new(\"del(.username)\".to_string(), None));\n\n    // Update the source twice with the same value to validate idempotency\n    for _ in 0..2 {\n        let update_source_request =\n            UpdateSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap();\n        metastore\n            .update_source(update_source_request)\n            .await\n            .unwrap();\n\n        let index_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n\n        let sources = &index_metadata.sources;\n        assert_eq!(sources.len(), 1);\n        assert!(sources.contains_key(&source_id));\n        assert_eq!(sources.get(&source_id).unwrap().source_id, source_id);\n        assert_eq!(\n            sources.get(&source_id).unwrap().source_type(),\n            SourceType::Void\n        );\n        assert_eq!(\n            sources.get(&source_id).unwrap().transform_config,\n            Some(TransformConfig::new(\"del(.username)\".to_string(), None))\n        );\n        assert_eq!(\n            index_metadata.checkpoint.source_checkpoint(&source_id),\n            Some(&SourceCheckpoint::default())\n        );\n    }\n\n    source.source_id = \"unknown-src-id\".to_string();\n    assert!(matches!(\n        metastore\n            .update_source(\n                UpdateSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap()\n            )\n            .await\n            
.unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Source { .. })\n    ));\n    source.source_id = source_id;\n    assert!(matches!(\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(\n                    IndexUid::new_with_random_ulid(\"index-not-found\"),\n                    &source\n                )\n                .unwrap()\n            )\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_toggle_source<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-toggle-source\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let source_id = format!(\"{index_id}--source\");\n    let source = SourceConfig {\n        source_id: source_id.to_string(),\n        num_pipelines: NonZeroUsize::MIN,\n        enabled: true,\n        source_params: SourceParams::void(),\n        transform_config: None,\n        input_format: SourceInputFormat::Json,\n    };\n    let add_source_request =\n        AddSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap();\n    metastore.add_source(add_source_request).await.unwrap();\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n    let source = 
index_metadata.sources.get(&source_id).unwrap();\n    assert_eq!(source.enabled, true);\n\n    // Disable source.\n    metastore\n        .toggle_source(ToggleSourceRequest {\n            index_uid: index_uid.clone().into(),\n            source_id: source.source_id.clone(),\n            enable: false,\n        })\n        .await\n        .unwrap();\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n    let source = index_metadata.sources.get(&source_id).unwrap();\n    assert_eq!(source.enabled, false);\n\n    // Enable source.\n    metastore\n        .toggle_source(ToggleSourceRequest {\n            index_uid: index_uid.clone().into(),\n            source_id: source.source_id.clone(),\n            enable: true,\n        })\n        .await\n        .unwrap();\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n    let source = index_metadata.sources.get(&source_id).unwrap();\n    assert_eq!(source.enabled, true);\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_delete_source<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-delete-source\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let source_id = format!(\"{index_id}--source\");\n\n    let source = SourceConfig {\n        source_id: source_id.to_string(),\n        num_pipelines: NonZeroUsize::MIN,\n        enabled: true,\n        source_params: SourceParams::void(),\n        transform_config: None,\n        input_format: SourceInputFormat::Json,\n    };\n\n    let index_config = 
IndexConfig::for_test(&index_id, index_uri.as_str());\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n    assert!(matches!(\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(\n                    IndexUid::new_with_random_ulid(\"index-not-found\"),\n                    &source\n                )\n                .unwrap()\n            )\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n    assert!(matches!(\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(\n                    IndexUid::new_with_random_ulid(&index_id),\n                    &source\n                )\n                .unwrap()\n            )\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    metastore\n        .add_source(AddSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap())\n        .await\n        .unwrap();\n    metastore\n        .delete_source(DeleteSourceRequest {\n            index_uid: index_uid.clone().into(),\n            source_id: source_id.clone(),\n        })\n        .await\n        .unwrap();\n\n    let sources = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap()\n        .sources;\n    assert!(sources.is_empty());\n\n    assert!(matches!(\n        metastore\n            .delete_source(DeleteSourceRequest {\n                index_uid: index_uid.clone().into(),\n                source_id: source_id.to_string()\n            })\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Source { .. })\n    ));\n    assert!(matches!(\n        metastore\n            .delete_source(DeleteSourceRequest {\n                index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n                source_id: source_id.to_string()\n            })\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n    assert!(matches!(\n        metastore\n            .delete_source(DeleteSourceRequest {\n                index_uid: Some(IndexUid::new_with_random_ulid(&index_id)),\n                source_id: source_id.to_string()\n            })\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_reset_checkpoint<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-reset-checkpoint\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let source_ids: Vec<String> = (0..2).map(|i| format!(\"{index_id}--source-{i}\")).collect();\n    let split_ids: Vec<String> = (0..2).map(|i| format!(\"{index_id}--split-{i}\")).collect();\n\n    for (source_id, split_id) in source_ids.iter().zip(split_ids.iter()) {\n        let source = SourceConfig {\n            source_id: source_id.clone(),\n            num_pipelines: NonZeroUsize::MIN,\n            enabled: true,\n            source_params: SourceParams::void(),\n            transform_config: None,\n            input_format: SourceInputFormat::Json,\n        };\n        metastore\n            .add_source(\n                AddSourceRequest::try_from_source_config(index_uid.clone(), &source).unwrap(),\n            )\n            .await\n            .unwrap();\n\n        let split_metadata = SplitMetadata {\n            split_id: split_id.clone(),\n            index_uid: index_uid.clone(),\n            ..Default::default()\n        };\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: 
Some(index_uid.clone()),\n            staged_split_ids: vec![split_id.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n    }\n    assert!(\n        !metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap()\n            .checkpoint\n            .is_empty()\n    );\n\n    metastore\n        .reset_source_checkpoint(ResetSourceCheckpointRequest {\n            index_uid: index_uid.clone().into(),\n            source_id: source_ids[0].clone(),\n        })\n        .await\n        .unwrap();\n\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n    assert!(\n        index_metadata\n            .checkpoint\n            .source_checkpoint(&source_ids[0])\n            .is_none()\n    );\n\n    assert!(\n        index_metadata\n            .checkpoint\n            .source_checkpoint(&source_ids[1])\n            .is_some()\n    );\n\n    assert!(matches!(\n        metastore\n            .reset_source_checkpoint(ResetSourceCheckpointRequest {\n                index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n                source_id: source_ids[1].clone(),\n            })\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    assert!(matches!(\n        metastore\n            .reset_source_checkpoint(ResetSourceCheckpointRequest {\n                index_uid: Some(IndexUid::new_with_random_ulid(&index_id)),\n                source_id: source_ids[1].to_string(),\n            })\n            .await\n            .unwrap_err(),\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    metastore\n        .reset_source_checkpoint(ResetSourceCheckpointRequest {\n            index_uid: index_uid.clone().into(),\n            source_id: source_ids[1].to_string(),\n        })\n        .await\n        .unwrap();\n\n    assert!(\n        metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap()\n            .checkpoint\n            .is_empty()\n    );\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/split.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse futures::future::try_join_all;\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::{IndexConfig, SourceConfig, SourceParams};\nuse quickwit_proto::metastore::{\n    CreateIndexRequest, DeleteSplitsRequest, EntityKind, IndexMetadataRequest, ListSplitsRequest,\n    ListStaleSplitsRequest, MarkSplitsForDeletionRequest, MetastoreError, PublishSplitsRequest,\n    StageSplitsRequest, UpdateSplitsDeleteOpstampRequest,\n};\nuse quickwit_proto::types::{IndexUid, Position};\nuse time::OffsetDateTime;\nuse tokio::time::sleep;\nuse tracing::{error, info};\n\nuse super::DefaultForTest;\nuse crate::checkpoint::{IndexCheckpointDelta, PartitionId, SourceCheckpointDelta};\nuse crate::metastore::MetastoreServiceStreamSplitsExt;\nuse crate::tests::cleanup_index;\nuse crate::{\n    CreateIndexRequestExt, IndexMetadataResponseExt, ListSplitsQuery, ListSplitsRequestExt,\n    ListSplitsResponseExt, MetastoreServiceExt, SplitMetadata, SplitState, StageSplitsRequestExt,\n};\n\npub async fn test_metastore_publish_splits_empty_splits_array_is_allowed<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-publish-splits-empty\");\n    let non_existent_index_uid = 
IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n\n    let source_id = format!(\"{index_id}--source\");\n\n    // Publish a split on a non-existent index\n    {\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(non_existent_index_uid),\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 1..10;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Index { .. })\n        ));\n    }\n\n    // Update the checkpoint, by publishing an empty array of splits with a non-empty\n    // checkpoint. 
This operation is allowed and used in the Indexer.\n    {\n        let index_config = IndexConfig::for_test(&index_id, &index_uri);\n        let source_configs = &[SourceConfig::for_test(&source_id, SourceParams::void())];\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 0..100;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let index_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        let source_checkpoint = index_metadata\n            .checkpoint\n            .source_checkpoint(&source_id)\n            .unwrap();\n        assert_eq!(source_checkpoint.num_partitions(), 1);\n        assert_eq!(\n            source_checkpoint\n                .position_for_partition(&PartitionId::default())\n                .unwrap(),\n            &Position::offset(100u64 - 1)\n        );\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n}\n\npub async fn test_metastore_publish_splits<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = 
MetastoreToTest::default_for_test().await;\n\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n    let index_id = append_random_suffix(\"test-publish-splits\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let source_id = format!(\"{index_id}--source\");\n    let source_configs = &[SourceConfig::for_test(&source_id, SourceParams::void())];\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        time_range: Some(0..=99),\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        time_range: Some(30..=99),\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n\n    // Publish a split on a non-existent index\n    {\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n            staged_split_ids: vec![\"split-not-found\".to_string()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 0..10;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Index { .. 
})\n        ));\n    }\n\n    // Publish a split on a wrong index uid\n    {\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(IndexUid::new_with_random_ulid(&index_id)),\n            staged_split_ids: vec![\"split-not-found\".to_string()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 0..10;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Index { .. })\n        ));\n    }\n\n    // Publish a non-existent split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![\"split-not-found\".to_string()],\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. 
})\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a staged split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a published split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            
.publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 1..12;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::FailedPrecondition {\n                entity: EntityKind::Splits { .. },\n                ..\n            }\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a non-staged split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 12..15;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                
serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id_1.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 15..18;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::FailedPrecondition {\n                entity: EntityKind::Splits { .. 
},\n                ..\n            }\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a staged split and non-existent split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), \"split-not-found\".to_string()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 15..18;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. 
})\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a published split and non-existent split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 15..18;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), \"split-not-found\".to_string()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 18..24;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        
assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. })\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a non-staged split and non-existent split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 18..24;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id_1.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), \"split-not-found\".to_string()],\n            index_checkpoint_delta_json_opt: Some({\n                let 
offsets = 24..26;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. })\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish staged splits on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_2)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 24..26;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            
.publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish a staged split and published split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            [split_metadata_1.clone(), split_metadata_2.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 26..28;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 28..30;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n          
  .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::FailedPrecondition {\n                entity: EntityKind::Splits { .. },\n                ..\n            }\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Publish published splits on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_and_source_configs(&index_config, source_configs)\n                .unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            [split_metadata_1.clone(), split_metadata_2.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 30..31;\n                let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            index_checkpoint_delta_json_opt: Some({\n                let offsets = 30..31;\n                let checkpoint_delta = 
IndexCheckpointDelta::for_test(&source_id, offsets);\n                serde_json::to_string(&checkpoint_delta).unwrap()\n            }),\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::FailedPrecondition {\n                entity: EntityKind::CheckpointDelta { .. },\n                ..\n            }\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n}\n\npub async fn test_metastore_publish_splits_concurrency<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest + Clone,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-publish-concurrency\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let source_id = format!(\"{index_id}--source\");\n\n    let source_config = SourceConfig::for_test(&source_id, SourceParams::void());\n    let create_index_request =\n        CreateIndexRequest::try_from_index_and_source_configs(&index_config, &[source_config])\n            .unwrap();\n\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let mut join_handles = Vec::with_capacity(10);\n\n    for partition_id in 0..10 {\n        let metastore_clone = metastore.clone();\n        let index_id = index_id.clone();\n        let source_id = source_id.clone();\n\n        let join_handle = tokio::spawn({\n            let index_uid = index_uid.clone();\n            async move {\n                let split_id = format!(\"{index_id}--split-{partition_id}\");\n                let split_metadata = SplitMetadata {\n                    split_id: split_id.clone(),\n                    
index_uid: index_uid.clone(),\n                    ..Default::default()\n                };\n                let stage_splits_request =\n                    StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata)\n                        .unwrap();\n                metastore_clone\n                    .stage_splits(stage_splits_request)\n                    .await\n                    .unwrap();\n                let source_delta = SourceCheckpointDelta::from_partition_delta(\n                    PartitionId::from(partition_id as u64),\n                    Position::Beginning,\n                    Position::offset(partition_id as u64),\n                )\n                .unwrap();\n                let checkpoint_delta = IndexCheckpointDelta {\n                    source_id,\n                    source_delta,\n                };\n                let publish_splits_request = PublishSplitsRequest {\n                    index_uid: Some(index_uid.clone()),\n                    staged_split_ids: vec![split_id.clone()],\n                    index_checkpoint_delta_json_opt: Some(\n                        serde_json::to_string(&checkpoint_delta).unwrap(),\n                    ),\n                    ..Default::default()\n                };\n                metastore_clone\n                    .publish_splits(publish_splits_request)\n                    .await\n                    .unwrap();\n            }\n        });\n        join_handles.push(join_handle);\n    }\n    try_join_all(join_handles).await.unwrap();\n\n    let index_metadata = metastore\n        .index_metadata(IndexMetadataRequest::for_index_id(index_id.to_string()))\n        .await\n        .unwrap()\n        .deserialize_index_metadata()\n        .unwrap();\n    let source_checkpoint = index_metadata\n        .checkpoint\n        .source_checkpoint(&source_id)\n        .unwrap();\n\n    assert_eq!(source_checkpoint.num_partitions(), 10);\n\n    cleanup_index(&mut metastore, 
index_uid).await\n}\n\npub async fn test_metastore_replace_splits<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n    let index_id = append_random_suffix(\"test-replace-splits\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        time_range: None,\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        time_range: None,\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n\n    let split_id_3 = format!(\"{index_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid.clone(),\n        time_range: None,\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n\n    // Replace splits on a non-existent index\n    {\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n            staged_split_ids: vec![\"split-not-found-1\".to_string()],\n            replaced_split_ids: vec![\"split-not-found-2\".to_string()],\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            
MetastoreError::NotFound(EntityKind::Index { .. })\n        ));\n    }\n\n    // Replace a non-existent split on an index\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![\"split-not-found-1\".to_string()],\n            replaced_split_ids: vec![\"split-not-found-2\".to_string()],\n            ..Default::default()\n        };\n        // TODO source id\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. 
})\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Replace a published split with a non-existent split\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        // TODO Source id\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            replaced_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. 
})\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Replace a published split with a deleted split\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            [split_metadata_1.clone(), split_metadata_2.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id_2.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        // TODO source_id\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            replaced_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::FailedPrecondition {\n                entity: EntityKind::Splits { .. 
},\n                ..\n            }\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Replace a publish split with mixed splits\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_2)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_2.clone(), split_id_3.clone()],\n            replaced_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request) // TODO source id\n            .await\n            .unwrap_err();\n        assert!(matches!(\n            error,\n            MetastoreError::NotFound(EntityKind::Splits { .. 
})\n        ));\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Replace a deleted split with a new split\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let mark_splits_for_deletion_request =\n            MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id_1.clone()]);\n        metastore\n            .mark_splits_for_deletion(mark_splits_for_deletion_request)\n            .await\n            .unwrap();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_2)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_2.clone()],\n            replaced_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        let error = metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap_err();\n        assert!(\n            matches!(error, 
MetastoreError::FailedPrecondition { entity: EntityKind::Splits { split_ids }, .. } if split_ids == [split_id_1.clone()])\n        );\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n\n    // Replace a publish split with staged splits\n    {\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request =\n            StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1)\n                .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            [split_metadata_2.clone(), split_metadata_3.clone()],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n\n        // TODO Source id\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_2.clone(), split_id_3.clone()],\n            replaced_split_ids: vec![split_id_1.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n}\n\npub async fn test_metastore_mark_splits_for_deletion<\n    MetastoreToTest: 
MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n    let index_id = append_random_suffix(\"test-mark-splits-for-deletion\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let mark_splits_for_deletion_request = MarkSplitsForDeletionRequest::new(\n        \"index-not-found:00000000000000000000000000\"\n            .parse()\n            .unwrap(),\n        Vec::new(),\n    );\n    let error = metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    let mark_splits_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![\"split-not-found\".to_string()]);\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap();\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n    let stage_splits_request =\n        StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1).unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n    let stage_splits_request =\n        StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_2).unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        staged_split_ids: vec![split_id_2.clone()],\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    let split_id_3 = format!(\"{index_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n    let stage_splits_request =\n        StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_3).unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n    let publish_splits_request = 
PublishSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        staged_split_ids: vec![split_id_3.clone()],\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    // Sleep for 1s so we can observe the timestamp update.\n    sleep(Duration::from_secs(1)).await;\n\n    let mark_splits_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id_3.clone()]);\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap();\n\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(\n        &ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::MarkedForDeletion),\n    )\n    .unwrap();\n    let marked_splits = metastore\n        .list_splits(list_splits_request)\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n\n    assert_eq!(marked_splits.len(), 1);\n    assert_eq!(marked_splits[0].split_id(), split_id_3);\n\n    let split_3_update_timestamp = marked_splits[0].update_timestamp;\n    assert!(current_timestamp < split_3_update_timestamp);\n\n    // Sleep for 1s so we can observe the timestamp update.\n    sleep(Duration::from_secs(1)).await;\n\n    let mark_splits_for_deletion_request = MarkSplitsForDeletionRequest::new(\n        index_uid.clone(),\n        vec![\n            split_id_1.clone(),\n            split_id_2.clone(),\n            split_id_3.clone(),\n            \"split-not-found\".to_string(),\n        ],\n    );\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap();\n\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(\n        &ListSplitsQuery::for_index(index_uid.clone())\n            .with_split_state(SplitState::MarkedForDeletion),\n    )\n    .unwrap();\n    let mut 
marked_splits = metastore\n        .list_splits(list_splits_request)\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n\n    marked_splits.sort_by_key(|split| split.split_id().to_string());\n\n    assert_eq!(marked_splits.len(), 3);\n\n    assert_eq!(marked_splits[0].split_id(), split_id_1);\n    assert!(current_timestamp + 2 <= marked_splits[0].update_timestamp);\n\n    assert_eq!(marked_splits[1].split_id(), split_id_2);\n    assert!(current_timestamp + 2 <= marked_splits[1].update_timestamp);\n\n    assert_eq!(marked_splits[2].split_id(), split_id_3);\n    assert_eq!(marked_splits[2].update_timestamp, split_3_update_timestamp);\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_delete_splits<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let index_id = append_random_suffix(\"test-delete-splits\");\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    let delete_splits_request = DeleteSplitsRequest {\n        index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n        split_ids: Vec::new(),\n    };\n    let error = metastore\n        .delete_splits(delete_splits_request)\n        .await\n        .unwrap_err();\n\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    let index_not_existing_uid = IndexUid::new_with_random_ulid(&index_id);\n    // Check error if index does not exist.\n    let delete_splits_request = DeleteSplitsRequest {\n        index_uid: Some(index_not_existing_uid),\n        split_ids: Vec::new(),\n    };\n    let error = metastore\n        .delete_splits(delete_splits_request)\n        .await\n        .unwrap_err();\n\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. })\n    ));\n\n    let delete_splits_request = DeleteSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        split_ids: vec![\"split-not-found\".to_string()],\n    };\n    metastore\n        .delete_splits(delete_splits_request)\n        .await\n        .unwrap();\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let stage_splits_request =\n        StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_1).unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        staged_split_ids: vec![split_id_1.clone()],\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        ..Default::default()\n    };\n    let stage_splits_request =\n        StageSplitsRequest::try_from_split_metadata(index_uid.clone(), &split_metadata_2).unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let delete_splits_request = DeleteSplitsRequest {\n        index_uid: 
Some(index_uid.clone()),\n        split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n    };\n    let error = metastore\n        .delete_splits(delete_splits_request)\n        .await\n        .unwrap_err();\n\n    assert!(matches!(\n        error,\n        MetastoreError::FailedPrecondition {\n            entity: EntityKind::Splits { .. },\n            ..\n        }\n    ));\n\n    assert_eq!(\n        metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap()\n            .len(),\n        2\n    );\n\n    let mark_splits_for_deletion_request = MarkSplitsForDeletionRequest::new(\n        index_uid.clone(),\n        vec![split_id_1.clone(), split_id_2.clone()],\n    );\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap();\n\n    let delete_splits_request = DeleteSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        split_ids: vec![\n            split_id_1.clone(),\n            split_id_2.clone(),\n            \"split-not-found\".to_string(),\n        ],\n    };\n    metastore\n        .delete_splits(delete_splits_request)\n        .await\n        .unwrap();\n\n    assert_eq!(\n        metastore\n            .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n            .await\n            .unwrap()\n            .collect_splits()\n            .await\n            .unwrap()\n            .len(),\n        0\n    );\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_split_update_timestamp<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n\n    let mut current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n\n    let index_id = 
append_random_suffix(\"split-update-timestamp\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let source_id = format!(\"{index_id}--source\");\n    let source_config = SourceConfig::for_test(&source_id, SourceParams::void());\n\n    let split_id = format!(\"{index_id}--split\");\n    let split_metadata = SplitMetadata {\n        split_id: split_id.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        ..Default::default()\n    };\n\n    // Create an index\n    let create_index_request =\n        CreateIndexRequest::try_from_index_and_source_configs(&index_config, &[source_config])\n            .unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    // wait for 1s, stage split & check `update_timestamp`\n    sleep(Duration::from_secs(1)).await;\n    let stage_splits_request =\n        StageSplitsRequest::try_from_splits_metadata(index_uid.clone(), [split_metadata.clone()])\n            .unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    sleep(Duration::from_secs(1)).await;\n    let split_meta = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap()[0]\n        .clone();\n    assert!(split_meta.update_timestamp > current_timestamp);\n    assert!(split_meta.publish_timestamp.is_none());\n\n    current_timestamp = split_meta.update_timestamp;\n\n    // wait for 1s, publish split & check `update_timestamp`\n    sleep(Duration::from_secs(1)).await;\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        staged_split_ids: 
vec![split_id.clone()],\n        index_checkpoint_delta_json_opt: Some({\n            let offsets = 0..5;\n            let checkpoint_delta = IndexCheckpointDelta::for_test(&source_id, offsets);\n            serde_json::to_string(&checkpoint_delta).unwrap()\n        }),\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n    let split_meta = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap()[0]\n        .clone();\n    assert!(split_meta.update_timestamp > current_timestamp);\n    assert_eq!(\n        split_meta.publish_timestamp,\n        Some(split_meta.update_timestamp)\n    );\n    current_timestamp = split_meta.update_timestamp;\n\n    // wait for 1s, mark split for deletion & check `update_timestamp`\n    sleep(Duration::from_secs(1)).await;\n    let mark_splits_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id.clone()]);\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap();\n    let split_meta = metastore\n        .list_splits(ListSplitsRequest::try_from_index_uid(index_uid.clone()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap()[0]\n        .clone();\n    assert!(split_meta.update_timestamp > current_timestamp);\n    assert!(split_meta.publish_timestamp.is_some());\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_stage_splits<MetastoreToTest: MetastoreServiceExt + DefaultForTest>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    let index_id = append_random_suffix(\"test-stage-splits\");\n    let index_uid = 
IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 20,\n        node_id: \"node-1\".to_string(),\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 10,\n        node_id: \"node-2\".to_string(),\n        ..Default::default()\n    };\n\n    // Stage a split on a non-existent index\n    let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n        IndexUid::new_with_random_ulid(\"index-not-found\"),\n        [split_metadata_1.clone()],\n    )\n    .unwrap();\n    let error = metastore\n        .stage_splits(stage_splits_request)\n        .await\n        .unwrap_err();\n    assert!(matches!(\n        error,\n        MetastoreError::NotFound(EntityKind::Index { .. 
})\n    ));\n\n    let create_index_request = CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n    let index_uid: IndexUid = metastore\n        .create_index(create_index_request)\n        .await\n        .unwrap()\n        .index_uid()\n        .clone();\n\n    // Stage two splits on an existing index\n    let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n        index_uid.clone(),\n        [split_metadata_1.clone(), split_metadata_2.clone()],\n    )\n    .unwrap();\n    metastore.stage_splits(stage_splits_request).await.unwrap();\n\n    let query = ListSplitsQuery::for_index(index_uid.clone()).with_split_state(SplitState::Staged);\n    let mut splits = metastore\n        .list_splits(ListSplitsRequest::try_from_list_splits_query(&query).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n\n    assert_eq!(splits.len(), 2);\n    splits.sort_unstable_by(|left, right| left.split_id().cmp(right.split_id()));\n\n    assert_eq!(splits[0].split_id(), &split_id_1);\n    assert_eq!(splits[0].split_metadata.node_id, \"node-1\");\n\n    assert_eq!(splits[1].split_id(), &split_id_2);\n    assert_eq!(splits[1].split_metadata.node_id, \"node-2\");\n\n    // Re-stage an already staged split on the index\n    let stage_splits_request =\n        StageSplitsRequest::try_from_splits_metadata(index_uid.clone(), [split_metadata_1.clone()])\n            .unwrap();\n    metastore\n        .stage_splits(stage_splits_request)\n        .await\n        .expect(\"Pre-existing staged splits should be updated.\");\n\n    let publish_splits_request = PublishSplitsRequest {\n        index_uid: Some(index_uid.clone()),\n        staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n        ..Default::default()\n    };\n    metastore\n        .publish_splits(publish_splits_request)\n        .await\n        .unwrap();\n    let stage_splits_request =\n        
StageSplitsRequest::try_from_splits_metadata(index_uid.clone(), [split_metadata_1.clone()])\n            .unwrap();\n    let error = metastore\n        .stage_splits(stage_splits_request)\n        .await\n        .expect_err(\"Metastore should not allow splits which are not `Staged` to be overwritten.\");\n    assert!(matches!(\n        error,\n        MetastoreError::FailedPrecondition {\n            entity: EntityKind::Splits { .. },\n            ..\n        }\n    ),);\n\n    let mark_splits_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid.clone(), vec![split_id_2.clone()]);\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await\n        .unwrap();\n    let stage_splits_request =\n        StageSplitsRequest::try_from_splits_metadata(index_uid.clone(), [split_metadata_2.clone()])\n            .unwrap();\n    let error = metastore\n        .stage_splits(stage_splits_request)\n        .await\n        .expect_err(\"Metastore should not allow splits which are not `Staged` to be overwritten.\");\n    assert!(matches!(\n        error,\n        MetastoreError::FailedPrecondition {\n            entity: EntityKind::Splits { .. 
},\n            ..\n        }\n    ),);\n\n    cleanup_index(&mut metastore, index_uid).await;\n}\n\npub async fn test_metastore_update_splits_delete_opstamp<\n    MetastoreToTest: MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreToTest::default_for_test().await;\n    let current_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    let index_id = append_random_suffix(\"update-splits-delete-opstamp\");\n    let index_uid = IndexUid::new_with_random_ulid(&index_id);\n    let index_uri = format!(\"ram:///indexes/{index_id}\");\n    let index_config = IndexConfig::for_test(&index_id, &index_uri);\n\n    let split_id_1 = format!(\"{index_id}--split-1\");\n    let split_metadata_1 = SplitMetadata {\n        split_id: split_id_1.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 20,\n        ..Default::default()\n    };\n    let split_id_2 = format!(\"{index_id}--split-2\");\n    let split_metadata_2 = SplitMetadata {\n        split_id: split_id_2.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 10,\n        ..Default::default()\n    };\n    let split_id_3 = format!(\"{index_id}--split-3\");\n    let split_metadata_3 = SplitMetadata {\n        split_id: split_id_3.clone(),\n        index_uid: index_uid.clone(),\n        create_timestamp: current_timestamp,\n        delete_opstamp: 0,\n        ..Default::default()\n    };\n\n    {\n        info!(\"update splits delete opstamp on a non-existent index\");\n        let update_splits_delete_opstamp_request = UpdateSplitsDeleteOpstampRequest {\n            index_uid: Some(IndexUid::new_with_random_ulid(\"index-not-found\")),\n            split_ids: vec![split_id_1.clone()],\n            delete_opstamp: 10,\n        };\n        let metastore_err = metastore\n            .update_splits_delete_opstamp(update_splits_delete_opstamp_request)\n            
.await\n            .unwrap_err();\n        error!(err=?metastore_err);\n        assert!(matches!(\n            metastore_err,\n            MetastoreError::NotFound(EntityKind::Index { .. })\n        ));\n    }\n\n    {\n        info!(\"update splits delete opstamp on an index\");\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        let index_uid: IndexUid = metastore\n            .create_index(create_index_request)\n            .await\n            .unwrap()\n            .index_uid()\n            .clone();\n\n        let stage_splits_request = StageSplitsRequest::try_from_splits_metadata(\n            index_uid.clone(),\n            [split_metadata_1, split_metadata_2, split_metadata_3],\n        )\n        .unwrap();\n        metastore.stage_splits(stage_splits_request).await.unwrap();\n        let publish_splits_request = PublishSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            staged_split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            ..Default::default()\n        };\n        metastore\n            .publish_splits(publish_splits_request)\n            .await\n            .unwrap();\n\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            delete_opstamp: 100,\n            num_splits: 2,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 2);\n\n        let update_splits_delete_opstamp_request = UpdateSplitsDeleteOpstampRequest {\n            index_uid: Some(index_uid.clone()),\n            split_ids: vec![split_id_1.clone(), split_id_2.clone()],\n            delete_opstamp: 100,\n        };\n        metastore\n            
.update_splits_delete_opstamp(update_splits_delete_opstamp_request)\n            .await\n            .unwrap();\n\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            delete_opstamp: 100,\n            num_splits: 2,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 0);\n\n        let list_stale_splits_request = ListStaleSplitsRequest {\n            index_uid: Some(index_uid.clone()),\n            delete_opstamp: 200,\n            num_splits: 2,\n        };\n        let splits = metastore\n            .list_stale_splits(list_stale_splits_request)\n            .await\n            .unwrap()\n            .deserialize_splits()\n            .await\n            .unwrap();\n        assert_eq!(splits.len(), 2);\n        assert_eq!(splits[0].split_metadata.delete_opstamp, 100);\n        assert_eq!(splits[1].split_metadata.delete_opstamp, 100);\n\n        cleanup_index(&mut metastore, index_uid).await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/src/tests/template.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::rand::append_random_suffix;\nuse quickwit_config::IndexTemplate;\nuse quickwit_proto::metastore::{\n    CreateIndexTemplateRequest, DeleteIndexTemplatesRequest, EntityKind,\n    FindIndexTemplateMatchesRequest, GetIndexTemplateRequest, ListIndexTemplatesRequest,\n    MetastoreError, MetastoreResult, MetastoreService, serde_utils,\n};\n\nuse super::DefaultForTest;\nuse crate::MetastoreServiceExt;\n\nasync fn list_all_index_templates(\n    metastore: &mut dyn MetastoreService,\n) -> MetastoreResult<Vec<IndexTemplate>> {\n    let list_index_templates_request = ListIndexTemplatesRequest {};\n    let list_index_templates_response = metastore\n        .list_index_templates(list_index_templates_request)\n        .await?;\n    list_index_templates_response\n        .index_templates_json\n        .into_iter()\n        .map(|index_template_json| serde_utils::from_json_str(&index_template_json))\n        .collect()\n}\n\nasync fn cleanup_templates(metastore: &mut dyn MetastoreService) {\n    let template_ids = list_all_index_templates(metastore)\n        .await\n        .unwrap()\n        .into_iter()\n        .map(|index_template| index_template.template_id)\n        .collect::<Vec<_>>();\n\n    let delete_templates_request = DeleteIndexTemplatesRequest { template_ids };\n    metastore\n        
.delete_index_templates(delete_templates_request)\n        .await\n        .unwrap();\n}\n\npub async fn test_metastore_create_index_template<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n    cleanup_templates(&mut metastore).await;\n\n    let template_id = append_random_suffix(\"test-create-template\");\n    let index_template = IndexTemplate::for_test(&template_id, &[\"test-template-*\"], 100);\n    let index_template_json = serde_json::to_string(&index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: index_template_json.clone(),\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let index_templates = list_all_index_templates(&mut metastore).await.unwrap();\n    assert_eq!(index_templates.len(), 1);\n\n    assert_eq!(index_templates[0].template_id, template_id);\n    assert_eq!(index_templates[0].index_id_patterns, [\"test-template-*\"]);\n    assert_eq!(index_templates[0].priority, 100);\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: index_template_json.clone(),\n        overwrite: false,\n    };\n    let error = metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap_err();\n    assert!(\n        matches!(error, MetastoreError::AlreadyExists(EntityKind::IndexTemplate { template_id }) if template_id.starts_with(\"test-create-template\"))\n    );\n\n    let index_template = IndexTemplate::for_test(&template_id, &[\"test-template-*\"], 200);\n    let index_template_json = serde_json::to_string(&index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: index_template_json.clone(),\n        overwrite: 
true,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    // Overwriting must replace the existing template rather than add a second one.\n    let index_templates = list_all_index_templates(&mut metastore).await.unwrap();\n    assert_eq!(index_templates.len(), 1);\n    assert_eq!(index_templates[0].priority, 200);\n}\n\npub async fn test_metastore_get_index_template<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n    cleanup_templates(&mut metastore).await;\n\n    let template_id = append_random_suffix(\"test-get-template\");\n    let index_template = IndexTemplate::for_test(&template_id, &[\"test-template\"], 100);\n    let index_template_json = serde_json::to_string(&index_template).unwrap();\n\n    let get_index_template_request = GetIndexTemplateRequest {\n        template_id: template_id.clone(),\n    };\n    let error = metastore\n        .get_index_template(get_index_template_request.clone())\n        .await\n        .unwrap_err();\n    assert!(\n        matches!(error, MetastoreError::NotFound(EntityKind::IndexTemplate { template_id }) if template_id.starts_with(\"test-get-template\"))\n    );\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let get_index_template_response = metastore\n        .get_index_template(get_index_template_request.clone())\n        .await\n        .unwrap();\n    let index_template: IndexTemplate =\n        serde_utils::from_json_str(&get_index_template_response.index_template_json).unwrap();\n\n    assert_eq!(index_template.template_id, template_id);\n    
assert_eq!(index_template.index_id_patterns, [\"test-template\"]);\n    assert_eq!(index_template.priority, 100);\n}\n\npub async fn test_metastore_find_index_template_matches<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n    cleanup_templates(&mut metastore).await;\n\n    let foo_template_id = append_random_suffix(\"test-template-foo\");\n    let foo_index_template = IndexTemplate::for_test(\n        &foo_template_id,\n        &[\"test-index-foo*\", \"-test-index-fool\"],\n        200,\n    );\n    let foo_index_template_json = serde_json::to_string(&foo_index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: foo_index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let foobar_template_id = append_random_suffix(\"test-template-foobar\");\n    let foobar_index_template =\n        IndexTemplate::for_test(&foobar_template_id, &[\"test-index-foobar*\"], 100);\n    let foobar_index_template_json = serde_json::to_string(&foobar_index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: foobar_index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let bar_template_id = append_random_suffix(\"test-template-bar\");\n    let bar_index_template = IndexTemplate::for_test(&bar_template_id, &[\"test-index-bar*\"], 100);\n    let bar_index_template_json = serde_json::to_string(&bar_index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: bar_index_template_json,\n        overwrite: false,\n    };\n    metastore\n     
   .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let find_index_template_matches = FindIndexTemplateMatchesRequest {\n        index_ids: vec![\n            \"test-index-foo\".to_string(),\n            \"test-index-fool\".to_string(),\n            \"test-index-foobar\".to_string(),\n            \"test-index-bar\".to_string(),\n            \"test-index-qux\".to_string(),\n        ],\n    };\n    let find_index_template_matches_response = metastore\n        .find_index_template_matches(find_index_template_matches)\n        .await\n        .unwrap();\n    let mut matches = find_index_template_matches_response.matches;\n    matches.sort_unstable_by(|left, right| left.index_id.cmp(&right.index_id));\n\n    assert_eq!(matches.len(), 3);\n\n    assert_eq!(matches[0].index_id, \"test-index-bar\");\n    assert_eq!(matches[0].template_id, bar_template_id);\n\n    assert_eq!(matches[1].index_id, \"test-index-foo\");\n    assert_eq!(matches[1].template_id, foo_template_id);\n\n    assert_eq!(matches[2].index_id, \"test-index-foobar\");\n    assert_eq!(matches[2].template_id, foo_template_id);\n}\n\npub async fn test_metastore_list_index_templates<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n    cleanup_templates(&mut metastore).await;\n\n    let list_index_templates_request = ListIndexTemplatesRequest {};\n    let list_index_templates_response = metastore\n        .list_index_templates(list_index_templates_request)\n        .await\n        .unwrap();\n    assert_eq!(list_index_templates_response.index_templates_json.len(), 0);\n\n    let template_id = append_random_suffix(\"test-list-template\");\n    let index_template = IndexTemplate::for_test(&template_id, &[\"test-template\"], 100);\n    let index_template_json = serde_json::to_string(&index_template).unwrap();\n\n    let create_index_template_request = 
CreateIndexTemplateRequest {\n        index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let list_index_templates_request = ListIndexTemplatesRequest {};\n    let list_index_templates_response = metastore\n        .list_index_templates(list_index_templates_request)\n        .await\n        .unwrap();\n    assert_eq!(list_index_templates_response.index_templates_json.len(), 1);\n\n    let index_template: IndexTemplate =\n        serde_utils::from_json_str(&list_index_templates_response.index_templates_json[0]).unwrap();\n\n    assert_eq!(index_template.template_id, template_id);\n    assert_eq!(index_template.index_id_patterns, [\"test-template\"]);\n    assert_eq!(\n        index_template.index_root_uri.unwrap().as_str(),\n        \"ram:///indexes\"\n    );\n    assert_eq!(index_template.priority, 100);\n    assert_eq!(index_template.description.unwrap(), \"Test description.\");\n    assert_eq!(index_template.doc_mapping.timestamp_field.unwrap(), \"ts\");\n}\n\npub async fn test_metastore_delete_index_templates<\n    MetastoreUnderTest: MetastoreService + MetastoreServiceExt + DefaultForTest,\n>() {\n    let mut metastore = MetastoreUnderTest::default_for_test().await;\n    cleanup_templates(&mut metastore).await;\n\n    let foo_template_id = append_random_suffix(\"test-template-foo\");\n    let foo_index_template = IndexTemplate::for_test(&foo_template_id, &[\"test-index-foo*\"], 100);\n    let foo_index_template_json = serde_json::to_string(&foo_index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: foo_index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let bar_template_id = append_random_suffix(\"test-template-bar\");\n    let 
bar_index_template = IndexTemplate::for_test(&bar_template_id, &[\"test-index-bar*\"], 100);\n    let bar_index_template_json = serde_json::to_string(&bar_index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: bar_index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let qux_template_id = append_random_suffix(\"test-template-qux\");\n    let qux_index_template = IndexTemplate::for_test(&qux_template_id, &[\"test-index-qux*\"], 100);\n    let qux_index_template_json = serde_json::to_string(&qux_index_template).unwrap();\n\n    let create_index_template_request = CreateIndexTemplateRequest {\n        index_template_json: qux_index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template_request)\n        .await\n        .unwrap();\n\n    let delete_index_templates_request = DeleteIndexTemplatesRequest {\n        template_ids: vec![foo_template_id.clone(), bar_template_id.clone()],\n    };\n    metastore\n        .delete_index_templates(delete_index_templates_request.clone())\n        .await\n        .unwrap();\n\n    // Test idempotency.\n    metastore\n        .delete_index_templates(delete_index_templates_request.clone())\n        .await\n        .unwrap();\n\n    let index_templates = list_all_index_templates(&mut metastore).await.unwrap();\n    assert_eq!(index_templates.len(), 1);\n    assert_eq!(index_templates[0].template_id, qux_template_id);\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/.gitignore",
    "content": "*.modified.json\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/file-backed-index/v0.7.expected.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"index\": {\n    \"version\": \"0.9\",\n    \"index_uid\": \"my-index:00000000000000000000000000\",\n    \"index_config\": {\n      \"version\": \"0.9\",\n      \"index_id\": \"my-index\",\n      \"index_uri\": \"s3://quickwit-indexes/my-index\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000000\",\n        \"mode\": \"dynamic\",\n        \"dynamic_mapping\": {\n          \"indexed\": true,\n          \"tokenizer\": \"raw\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          }\n        },\n        \"field_mappings\": [\n          {\n            \"name\": \"tenant_id\",\n            \"type\": \"u64\",\n            \"stored\": true,\n            \"indexed\": true,\n            \"fast\": true,\n            \"coerce\": true,\n            \"output_format\": \"number\"\n          },\n          {\n            \"name\": \"timestamp\",\n            \"type\": \"datetime\",\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"output_format\": \"rfc3339\",\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"stored\": true,\n            \"fast\": true\n          },\n          {\n            \"name\": \"log_level\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"raw\",\n            \"record\": \"basic\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          },\n          {\n            \"name\": \"message\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"default\",\n            \"record\": \"position\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          }\n        ],\n  
      \"timestamp_field\": \"timestamp\",\n        \"tag_fields\": [\n          \"log_level\",\n          \"tenant_id\"\n        ],\n        \"partition_key\": \"tenant_id\",\n        \"max_num_partitions\": 100,\n        \"index_field_presence\": true,\n        \"store_document_size\": false,\n        \"store_source\": true,\n        \"tokenizers\": [\n          {\n            \"name\": \"custom_tokenizer\",\n            \"type\": \"regex\",\n            \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n            \"filters\": []\n          }\n        ]\n      },\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 301,\n        \"docstore_compression_level\": 8,\n        \"docstore_blocksize\": 1000000,\n        \"split_num_docs_target\": 10000001,\n        \"merge_policy\": {\n          \"type\": \"stable_log\",\n          \"min_level_num_docs\": 100000,\n          \"merge_factor\": 9,\n          \"max_merge_factor\": 11,\n          \"maturation_period\": \"2days\"\n        },\n        \"resources\": {\n          \"heap_size\": 50000000\n        }\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 1\n      },\n      \"search_settings\": {\n        \"default_search_fields\": [\n          \"message\"\n        ]\n      },\n      \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n      }\n    },\n    \"checkpoint\": {\n      \"kafka-source\": {\n        \"00000000000000000000\": \"00000000000000000042\"\n      }\n    },\n    \"create_timestamp\": 1789,\n    \"sources\": [\n      {\n        \"version\": \"0.9\",\n        \"source_id\": \"kafka-source\",\n        \"num_pipelines\": 2,\n        \"enabled\": true,\n        \"source_type\": \"kafka\",\n        \"params\": {\n          \"topic\": \"kafka-topic\",\n          \"client_params\": {}\n        },\n        \"transform\": {\n          \"script\": \".message = downcase(string!(.message))\",\n          \"timezone\": \"UTC\"\n        },\n        \"input_format\": 
\"json\"\n      }\n    ]\n  },\n  \"splits\": [\n    {\n      \"split_state\": \"Published\",\n      \"update_timestamp\": 1789,\n      \"publish_timestamp\": 1789,\n      \"version\": \"0.9\",\n      \"split_id\": \"split\",\n      \"index_uid\": \"my-index:00000000000000000000000000\",\n      \"partition_id\": 7,\n      \"source_id\": \"source\",\n      \"node_id\": \"node\",\n      \"num_docs\": 12303,\n      \"uncompressed_docs_size_in_bytes\": 234234,\n      \"time_range\": {\n        \"start\": 121000,\n        \"end\": 130198\n      },\n      \"create_timestamp\": 3,\n      \"maturity\": {\n        \"type\": \"immature\",\n        \"maturation_period_millis\": 4000\n      },\n      \"tags\": [\n        \"234\",\n        \"aaa\"\n      ],\n      \"footer_offsets\": {\n        \"start\": 1000,\n        \"end\": 2000\n      },\n      \"delete_opstamp\": 10,\n      \"num_merge_ops\": 3,\n      \"doc_mapping_uid\": \"00000000000000000000000000\"\n    }\n  ],\n  \"shards\": {\n    \"_ingest-source\": [\n      {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"source_id\": \"_ingest-source\",\n        \"shard_id\": \"00000000000000000001\",\n        \"leader_id\": \"leader-ingester\",\n        \"follower_id\": \"follower-ingester\",\n        \"shard_state\": 1,\n        \"publish_position_inclusive\": \"\",\n        \"doc_mapping_uid\": \"00000000000000000000000000\",\n        \"update_timestamp\": 1704067200\n      }\n    ]\n  },\n  \"delete_tasks\": [\n    {\n      \"create_timestamp\": 0,\n      \"opstamp\": 10,\n      \"delete_query\": {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"query_ast\": 
\"{\\\"type\\\":\\\"bool\\\",\\\"must\\\":[{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Harry\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}},{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Potter\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}}]}\"\n      }\n    }\n  ]\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/file-backed-index/v0.7.json",
    "content": "{\n  \"delete_tasks\": [\n    {\n      \"create_timestamp\": 0,\n      \"delete_query\": {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"query_ast\": \"{\\\"type\\\":\\\"bool\\\",\\\"must\\\":[{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Harry\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}},{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Potter\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}}]}\"\n      },\n      \"opstamp\": 10\n    }\n  ],\n  \"index\": {\n    \"checkpoint\": {\n      \"kafka-source\": {\n        \"00000000000000000000\": \"00000000000000000042\"\n      }\n    },\n    \"create_timestamp\": 1789,\n    \"index_config\": {\n      \"doc_mapping\": {\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"coerce\": true,\n            \"fast\": true,\n            \"indexed\": true,\n            \"name\": \"tenant_id\",\n            \"output_format\": \"number\",\n            \"stored\": true,\n            \"type\": \"u64\"\n          },\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"timestamp\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"fast\": false,\n            \"fieldnorms\": false,\n            \"indexed\": true,\n            \"name\": \"log_level\",\n            
\"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"text\"\n          },\n          {\n            \"fast\": false,\n            \"fieldnorms\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"position\",\n            \"stored\": true,\n            \"tokenizer\": \"default\",\n            \"type\": \"text\"\n          }\n        ],\n        \"index_field_presence\": true,\n        \"max_num_partitions\": 100,\n        \"mode\": \"dynamic\",\n        \"partition_key\": \"tenant_id\",\n        \"store_source\": true,\n        \"tag_fields\": [\n          \"log_level\",\n          \"tenant_id\"\n        ],\n        \"timestamp_field\": \"timestamp\",\n        \"tokenizers\": [\n          {\n            \"filters\": [],\n            \"name\": \"custom_tokenizer\",\n            \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n            \"type\": \"regex\"\n          }\n        ]\n      },\n      \"index_id\": \"my-index\",\n      \"index_uri\": \"s3://quickwit-indexes/my-index\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 301,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n          \"max_merge_factor\": 11,\n          \"merge_factor\": 9,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": \"50.0 MB\"\n        },\n        \"split_num_docs_target\": 10000001\n      },\n      \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n      },\n      \"search_settings\": {\n        \"default_search_fields\": [\n          \"message\"\n        ]\n      },\n      \"version\": \"0.7\"\n    },\n    \"index_uid\": \"my-index:00000000000000000000000000\",\n    \"sources\": [\n      {\n        \"desired_num_pipelines\": 
2,\n        \"enabled\": true,\n        \"input_format\": \"json\",\n        \"max_num_pipelines_per_indexer\": 2,\n        \"params\": {\n          \"client_params\": {},\n          \"topic\": \"kafka-topic\"\n        },\n        \"source_id\": \"kafka-source\",\n        \"source_type\": \"kafka\",\n        \"transform\": {\n          \"script\": \".message = downcase(string!(.message))\",\n          \"timezone\": \"UTC\"\n        },\n        \"version\": \"0.7\"\n      }\n    ],\n    \"version\": \"0.7\"\n  },\n  \"shards\": {\n    \"_ingest-source\": [\n      {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"shard_id\": \"00000000000000000001\",\n        \"source_id\": \"_ingest-source\",\n        \"shard_state\": 1,\n        \"leader_id\": \"leader-ingester\",\n        \"follower_id\": \"follower-ingester\",\n        \"publish_position_inclusive\": \"\"\n      }\n    ]\n  },\n  \"splits\": [\n    {\n      \"create_timestamp\": 3,\n      \"delete_opstamp\": 10,\n      \"footer_offsets\": {\n        \"end\": 2000,\n        \"start\": 1000\n      },\n      \"index_uid\": \"my-index:00000000000000000000000000\",\n      \"maturity\": {\n        \"maturation_period_millis\": 4000,\n        \"type\": \"immature\"\n      },\n      \"node_id\": \"node\",\n      \"num_docs\": 12303,\n      \"num_merge_ops\": 3,\n      \"partition_id\": 7,\n      \"publish_timestamp\": 1789,\n      \"source_id\": \"source\",\n      \"split_id\": \"split\",\n      \"split_state\": \"Published\",\n      \"tags\": [\n        \"234\",\n        \"aaa\"\n      ],\n      \"time_range\": {\n        \"end\": 130198,\n        \"start\": 121000\n      },\n      \"uncompressed_docs_size_in_bytes\": 234234,\n      \"update_timestamp\": 1789,\n      \"version\": \"0.7\"\n    }\n  ],\n  \"version\": \"0.7\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/file-backed-index/v0.8.expected.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"index\": {\n    \"version\": \"0.9\",\n    \"index_uid\": \"my-index:00000000000000000000000000\",\n    \"index_config\": {\n      \"version\": \"0.9\",\n      \"index_id\": \"my-index\",\n      \"index_uri\": \"s3://quickwit-indexes/my-index\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000000\",\n        \"mode\": \"dynamic\",\n        \"dynamic_mapping\": {\n          \"indexed\": true,\n          \"tokenizer\": \"raw\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          }\n        },\n        \"field_mappings\": [\n          {\n            \"name\": \"tenant_id\",\n            \"type\": \"u64\",\n            \"stored\": true,\n            \"indexed\": true,\n            \"fast\": true,\n            \"coerce\": true,\n            \"output_format\": \"number\"\n          },\n          {\n            \"name\": \"timestamp\",\n            \"type\": \"datetime\",\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"output_format\": \"rfc3339\",\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"stored\": true,\n            \"fast\": true\n          },\n          {\n            \"name\": \"log_level\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"raw\",\n            \"record\": \"basic\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          },\n          {\n            \"name\": \"message\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"default\",\n            \"record\": \"position\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          }\n        ],\n  
      \"timestamp_field\": \"timestamp\",\n        \"tag_fields\": [\n          \"log_level\",\n          \"tenant_id\"\n        ],\n        \"partition_key\": \"tenant_id\",\n        \"max_num_partitions\": 100,\n        \"index_field_presence\": true,\n        \"store_document_size\": false,\n        \"store_source\": true,\n        \"tokenizers\": [\n          {\n            \"name\": \"custom_tokenizer\",\n            \"type\": \"regex\",\n            \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n            \"filters\": []\n          }\n        ]\n      },\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 301,\n        \"docstore_compression_level\": 8,\n        \"docstore_blocksize\": 1000000,\n        \"split_num_docs_target\": 10000001,\n        \"merge_policy\": {\n          \"type\": \"stable_log\",\n          \"min_level_num_docs\": 100000,\n          \"merge_factor\": 9,\n          \"max_merge_factor\": 11,\n          \"maturation_period\": \"2days\"\n        },\n        \"resources\": {\n          \"heap_size\": 50000000\n        }\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 1\n      },\n      \"search_settings\": {\n        \"default_search_fields\": [\n          \"message\"\n        ]\n      },\n      \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n      }\n    },\n    \"checkpoint\": {\n      \"kafka-source\": {\n        \"00000000000000000000\": \"00000000000000000042\"\n      }\n    },\n    \"create_timestamp\": 1789,\n    \"sources\": [\n      {\n        \"version\": \"0.9\",\n        \"source_id\": \"kafka-source\",\n        \"num_pipelines\": 2,\n        \"enabled\": true,\n        \"source_type\": \"kafka\",\n        \"params\": {\n          \"topic\": \"kafka-topic\",\n          \"client_params\": {}\n        },\n        \"transform\": {\n          \"script\": \".message = downcase(string!(.message))\",\n          \"timezone\": \"UTC\"\n        },\n        \"input_format\": 
\"json\"\n      }\n    ]\n  },\n  \"splits\": [\n    {\n      \"split_state\": \"Published\",\n      \"update_timestamp\": 1789,\n      \"publish_timestamp\": 1789,\n      \"version\": \"0.9\",\n      \"split_id\": \"split\",\n      \"index_uid\": \"my-index:00000000000000000000000000\",\n      \"partition_id\": 7,\n      \"source_id\": \"source\",\n      \"node_id\": \"node\",\n      \"num_docs\": 12303,\n      \"uncompressed_docs_size_in_bytes\": 234234,\n      \"time_range\": {\n        \"start\": 121000,\n        \"end\": 130198\n      },\n      \"create_timestamp\": 3,\n      \"maturity\": {\n        \"type\": \"immature\",\n        \"maturation_period_millis\": 4000\n      },\n      \"tags\": [\n        \"234\",\n        \"aaa\"\n      ],\n      \"footer_offsets\": {\n        \"start\": 1000,\n        \"end\": 2000\n      },\n      \"delete_opstamp\": 10,\n      \"num_merge_ops\": 3,\n      \"doc_mapping_uid\": \"00000000000000000000000000\"\n    }\n  ],\n  \"shards\": {\n    \"_ingest-source\": [\n      {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"source_id\": \"_ingest-source\",\n        \"shard_id\": \"00000000000000000001\",\n        \"leader_id\": \"leader-ingester\",\n        \"follower_id\": \"follower-ingester\",\n        \"shard_state\": 1,\n        \"publish_position_inclusive\": \"\",\n        \"doc_mapping_uid\": \"00000000000000000000000000\",\n        \"update_timestamp\": 1704067200\n      }\n    ]\n  },\n  \"delete_tasks\": [\n    {\n      \"create_timestamp\": 0,\n      \"opstamp\": 10,\n      \"delete_query\": {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"query_ast\": 
\"{\\\"type\\\":\\\"bool\\\",\\\"must\\\":[{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Harry\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}},{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Potter\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}}]}\"\n      }\n    }\n  ]\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/file-backed-index/v0.8.json",
    "content": "{\n  \"delete_tasks\": [\n    {\n      \"create_timestamp\": 0,\n      \"delete_query\": {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"query_ast\": \"{\\\"type\\\":\\\"bool\\\",\\\"must\\\":[{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Harry\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}},{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Potter\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}}}]}\"\n      },\n      \"opstamp\": 10\n    }\n  ],\n  \"index\": {\n    \"checkpoint\": {\n      \"kafka-source\": {\n        \"00000000000000000000\": \"00000000000000000042\"\n      }\n    },\n    \"create_timestamp\": 1789,\n    \"index_config\": {\n      \"doc_mapping\": {\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"coerce\": true,\n            \"fast\": true,\n            \"indexed\": true,\n            \"name\": \"tenant_id\",\n            \"output_format\": \"number\",\n            \"stored\": true,\n            \"type\": \"u64\"\n          },\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"timestamp\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"fast\": false,\n            \"fieldnorms\": false,\n            \"indexed\": true,\n            \"name\": \"log_level\",\n            
\"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"text\"\n          },\n          {\n            \"fast\": false,\n            \"fieldnorms\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"position\",\n            \"stored\": true,\n            \"tokenizer\": \"default\",\n            \"type\": \"text\"\n          }\n        ],\n        \"index_field_presence\": true,\n        \"max_num_partitions\": 100,\n        \"mode\": \"dynamic\",\n        \"partition_key\": \"tenant_id\",\n        \"store_document_size\": false,\n        \"store_source\": true,\n        \"tag_fields\": [\n          \"log_level\",\n          \"tenant_id\"\n        ],\n        \"timestamp_field\": \"timestamp\",\n        \"tokenizers\": [\n          {\n            \"filters\": [],\n            \"name\": \"custom_tokenizer\",\n            \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n            \"type\": \"regex\"\n          }\n        ]\n      },\n      \"index_id\": \"my-index\",\n      \"index_uri\": \"s3://quickwit-indexes/my-index\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 301,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n          \"max_merge_factor\": 11,\n          \"merge_factor\": 9,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": \"50.0 MB\"\n        },\n        \"split_num_docs_target\": 10000001\n      },\n      \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n      },\n      \"search_settings\": {\n        \"default_search_fields\": [\n          \"message\"\n        ]\n      },\n      \"version\": \"0.8\"\n    },\n    \"index_uid\": \"my-index:00000000000000000000000000\",\n    \"sources\": [\n     
 {\n        \"enabled\": true,\n        \"input_format\": \"json\",\n        \"num_pipelines\": 2,\n        \"params\": {\n          \"client_params\": {},\n          \"topic\": \"kafka-topic\"\n        },\n        \"source_id\": \"kafka-source\",\n        \"source_type\": \"kafka\",\n        \"transform\": {\n          \"script\": \".message = downcase(string!(.message))\",\n          \"timezone\": \"UTC\"\n        },\n        \"version\": \"0.8\"\n      }\n    ],\n    \"version\": \"0.8\"\n  },\n  \"shards\": {\n    \"_ingest-source\": [\n      {\n        \"index_uid\": \"my-index:00000000000000000000000000\",\n        \"shard_id\": \"00000000000000000001\",\n        \"source_id\": \"_ingest-source\",\n        \"shard_state\": 1,\n        \"leader_id\": \"leader-ingester\",\n        \"follower_id\": \"follower-ingester\",\n        \"publish_position_inclusive\": \"\"\n      }\n    ]\n  },\n  \"splits\": [\n    {\n      \"create_timestamp\": 3,\n      \"delete_opstamp\": 10,\n      \"footer_offsets\": {\n        \"end\": 2000,\n        \"start\": 1000\n      },\n      \"index_uid\": \"my-index:00000000000000000000000000\",\n      \"maturity\": {\n        \"maturation_period_millis\": 4000,\n        \"type\": \"immature\"\n      },\n      \"node_id\": \"node\",\n      \"num_docs\": 12303,\n      \"num_merge_ops\": 3,\n      \"partition_id\": 7,\n      \"publish_timestamp\": 1789,\n      \"source_id\": \"source\",\n      \"split_id\": \"split\",\n      \"split_state\": \"Published\",\n      \"tags\": [\n        \"234\",\n        \"aaa\"\n      ],\n      \"time_range\": {\n        \"end\": 130198,\n        \"start\": 121000\n      },\n      \"uncompressed_docs_size_in_bytes\": 234234,\n      \"update_timestamp\": 1789,\n      \"version\": \"0.8\"\n    }\n  ],\n  \"version\": \"0.8\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/file-backed-index/v0.9.expected.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"index\": {\n    \"version\": \"0.9\",\n    \"index_uid\": \"my-index:00000000000000000000000001\",\n    \"index_config\": {\n      \"version\": \"0.9\",\n      \"index_id\": \"my-index\",\n      \"index_uri\": \"s3://quickwit-indexes/my-index\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000001\",\n        \"mode\": \"dynamic\",\n        \"dynamic_mapping\": {\n          \"indexed\": true,\n          \"tokenizer\": \"raw\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          }\n        },\n        \"field_mappings\": [\n          {\n            \"name\": \"tenant_id\",\n            \"type\": \"u64\",\n            \"stored\": true,\n            \"indexed\": true,\n            \"fast\": true,\n            \"coerce\": true,\n            \"output_format\": \"number\"\n          },\n          {\n            \"name\": \"timestamp\",\n            \"type\": \"datetime\",\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"output_format\": \"rfc3339\",\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"stored\": true,\n            \"fast\": true\n          },\n          {\n            \"name\": \"log_level\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"raw\",\n            \"record\": \"basic\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          },\n          {\n            \"name\": \"message\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"default\",\n            \"record\": \"position\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          }\n        ],\n  
      \"timestamp_field\": \"timestamp\",\n        \"tag_fields\": [\n          \"log_level\",\n          \"tenant_id\"\n        ],\n        \"partition_key\": \"tenant_id\",\n        \"max_num_partitions\": 100,\n        \"index_field_presence\": true,\n        \"store_document_size\": false,\n        \"store_source\": true,\n        \"tokenizers\": [\n          {\n            \"name\": \"custom_tokenizer\",\n            \"type\": \"regex\",\n            \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n            \"filters\": []\n          }\n        ]\n      },\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 301,\n        \"docstore_compression_level\": 8,\n        \"docstore_blocksize\": 1000000,\n        \"split_num_docs_target\": 10000001,\n        \"merge_policy\": {\n          \"type\": \"stable_log\",\n          \"min_level_num_docs\": 100000,\n          \"merge_factor\": 9,\n          \"max_merge_factor\": 11,\n          \"maturation_period\": \"2days\"\n        },\n        \"resources\": {\n          \"heap_size\": 50000000\n        }\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 12\n      },\n      \"search_settings\": {\n        \"default_search_fields\": [\n          \"message\"\n        ]\n      },\n      \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n      }\n    },\n    \"checkpoint\": {\n      \"kafka-source\": {\n        \"00000000000000000000\": \"00000000000000000042\"\n      }\n    },\n    \"create_timestamp\": 1789,\n    \"sources\": [\n      {\n        \"version\": \"0.9\",\n        \"source_id\": \"kafka-source\",\n        \"num_pipelines\": 2,\n        \"enabled\": true,\n        \"source_type\": \"kafka\",\n        \"params\": {\n          \"topic\": \"kafka-topic\",\n          \"client_params\": {}\n        },\n        \"transform\": {\n          \"script\": \".message = downcase(string!(.message))\",\n          \"timezone\": \"UTC\"\n        },\n        \"input_format\": 
\"json\"\n      }\n    ]\n  },\n  \"splits\": [\n    {\n      \"split_state\": \"Published\",\n      \"update_timestamp\": 1789,\n      \"publish_timestamp\": 1789,\n      \"version\": \"0.9\",\n      \"split_id\": \"split\",\n      \"index_uid\": \"my-index:00000000000000000000000001\",\n      \"partition_id\": 7,\n      \"source_id\": \"source\",\n      \"node_id\": \"node\",\n      \"num_docs\": 12303,\n      \"uncompressed_docs_size_in_bytes\": 234234,\n      \"time_range\": {\n        \"start\": 121000,\n        \"end\": 130198\n      },\n      \"create_timestamp\": 3,\n      \"maturity\": {\n        \"type\": \"immature\",\n        \"maturation_period_millis\": 4000\n      },\n      \"tags\": [\n        \"234\",\n        \"aaa\"\n      ],\n      \"footer_offsets\": {\n        \"start\": 1000,\n        \"end\": 2000\n      },\n      \"delete_opstamp\": 10,\n      \"num_merge_ops\": 3,\n      \"doc_mapping_uid\": \"00000000000000000000000000\"\n    }\n  ],\n  \"shards\": {\n    \"_ingest-source\": [\n      {\n        \"index_uid\": \"my-index:00000000000000000000000001\",\n        \"source_id\": \"_ingest-source\",\n        \"shard_id\": \"00000000000000000001\",\n        \"leader_id\": \"leader-ingester\",\n        \"follower_id\": \"follower-ingester\",\n        \"shard_state\": 1,\n        \"publish_position_inclusive\": \"\",\n        \"doc_mapping_uid\": \"00000000000000000000000001\",\n        \"update_timestamp\": 1724240908\n      }\n    ]\n  },\n  \"delete_tasks\": [\n    {\n      \"create_timestamp\": 0,\n      \"opstamp\": 10,\n      \"delete_query\": {\n        \"index_uid\": \"my-index:00000000000000000000000001\",\n        \"query_ast\": 
\"{\\\"type\\\":\\\"bool\\\",\\\"must\\\":[{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Harry\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}},\\\"lenient\\\":false},{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Potter\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}},\\\"lenient\\\":false}]}\"\n      }\n    }\n  ]\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/file-backed-index/v0.9.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"index\": {\n    \"version\": \"0.9\",\n    \"index_uid\": \"my-index:00000000000000000000000001\",\n    \"index_config\": {\n      \"version\": \"0.9\",\n      \"index_id\": \"my-index\",\n      \"index_uri\": \"s3://quickwit-indexes/my-index\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000001\",\n        \"mode\": \"dynamic\",\n        \"dynamic_mapping\": {\n          \"indexed\": true,\n          \"tokenizer\": \"raw\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          }\n        },\n        \"field_mappings\": [\n          {\n            \"name\": \"tenant_id\",\n            \"type\": \"u64\",\n            \"stored\": true,\n            \"indexed\": true,\n            \"fast\": true,\n            \"coerce\": true,\n            \"output_format\": \"number\"\n          },\n          {\n            \"name\": \"timestamp\",\n            \"type\": \"datetime\",\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"output_format\": \"rfc3339\",\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"stored\": true,\n            \"fast\": true\n          },\n          {\n            \"name\": \"log_level\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"raw\",\n            \"record\": \"basic\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          },\n          {\n            \"name\": \"message\",\n            \"type\": \"text\",\n            \"indexed\": true,\n            \"tokenizer\": \"default\",\n            \"record\": \"position\",\n            \"fieldnorms\": false,\n            \"stored\": true,\n            \"fast\": false\n          }\n        ],\n  
      \"timestamp_field\": \"timestamp\",\n        \"tag_fields\": [\n          \"log_level\",\n          \"tenant_id\"\n        ],\n        \"partition_key\": \"tenant_id\",\n        \"max_num_partitions\": 100,\n        \"index_field_presence\": true,\n        \"store_document_size\": false,\n        \"store_source\": true,\n        \"tokenizers\": [\n          {\n            \"name\": \"custom_tokenizer\",\n            \"type\": \"regex\",\n            \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n            \"filters\": []\n          }\n        ]\n      },\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 301,\n        \"docstore_compression_level\": 8,\n        \"docstore_blocksize\": 1000000,\n        \"split_num_docs_target\": 10000001,\n        \"merge_policy\": {\n          \"type\": \"stable_log\",\n          \"min_level_num_docs\": 100000,\n          \"merge_factor\": 9,\n          \"max_merge_factor\": 11,\n          \"maturation_period\": \"2days\"\n        },\n        \"resources\": {\n          \"heap_size\": 50000000\n        }\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 12\n      },\n      \"search_settings\": {\n        \"default_search_fields\": [\n          \"message\"\n        ]\n      },\n      \"retention\": {\n        \"period\": \"90 days\",\n        \"schedule\": \"daily\"\n      }\n    },\n    \"checkpoint\": {\n      \"kafka-source\": {\n        \"00000000000000000000\": \"00000000000000000042\"\n      }\n    },\n    \"create_timestamp\": 1789,\n    \"sources\": [\n      {\n        \"version\": \"0.9\",\n        \"source_id\": \"kafka-source\",\n        \"num_pipelines\": 2,\n        \"enabled\": true,\n        \"source_type\": \"kafka\",\n        \"params\": {\n          \"topic\": \"kafka-topic\",\n          \"client_params\": {}\n        },\n        \"transform\": {\n          \"script\": \".message = downcase(string!(.message))\",\n          \"timezone\": \"UTC\"\n        },\n        \"input_format\": 
\"json\"\n      }\n    ]\n  },\n  \"splits\": [\n    {\n      \"split_state\": \"Published\",\n      \"update_timestamp\": 1789,\n      \"publish_timestamp\": 1789,\n      \"version\": \"0.9\",\n      \"split_id\": \"split\",\n      \"index_uid\": \"my-index:00000000000000000000000001\",\n      \"partition_id\": 7,\n      \"source_id\": \"source\",\n      \"node_id\": \"node\",\n      \"num_docs\": 12303,\n      \"uncompressed_docs_size_in_bytes\": 234234,\n      \"time_range\": {\n        \"start\": 121000,\n        \"end\": 130198\n      },\n      \"create_timestamp\": 3,\n      \"maturity\": {\n        \"type\": \"immature\",\n        \"maturation_period_millis\": 4000\n      },\n      \"tags\": [\n        \"234\",\n        \"aaa\"\n      ],\n      \"footer_offsets\": {\n        \"start\": 1000,\n        \"end\": 2000\n      },\n      \"delete_opstamp\": 10,\n      \"num_merge_ops\": 3,\n      \"doc_mapping_uid\": \"00000000000000000000000000\"\n    }\n  ],\n  \"shards\": {\n    \"_ingest-source\": [\n      {\n        \"index_uid\": \"my-index:00000000000000000000000001\",\n        \"source_id\": \"_ingest-source\",\n        \"shard_id\": \"00000000000000000001\",\n        \"leader_id\": \"leader-ingester\",\n        \"follower_id\": \"follower-ingester\",\n        \"shard_state\": 1,\n        \"publish_position_inclusive\": \"\",\n        \"doc_mapping_uid\": \"00000000000000000000000001\",\n        \"update_timestamp\": 1724240908\n      }\n    ]\n  },\n  \"delete_tasks\": [\n    {\n      \"create_timestamp\": 0,\n      \"opstamp\": 10,\n      \"delete_query\": {\n        \"index_uid\": \"my-index:00000000000000000000000001\",\n        \"query_ast\": 
\"{\\\"type\\\":\\\"bool\\\",\\\"must\\\":[{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Harry\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}},\\\"lenient\\\":false},{\\\"type\\\":\\\"full_text\\\",\\\"field\\\":\\\"body\\\",\\\"text\\\":\\\"Potter\\\",\\\"params\\\":{\\\"mode\\\":{\\\"type\\\":\\\"phrase_fallback_to_intersection\\\"}},\\\"lenient\\\":false}]}\"\n      }\n    }\n  ]\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/index-metadata/v0.7.expected.json",
    "content": "{\n  \"checkpoint\": {\n    \"kafka-source\": {\n      \"00000000000000000000\": \"00000000000000000042\"\n    }\n  },\n  \"create_timestamp\": 1789,\n  \"index_config\": {\n    \"doc_mapping\": {\n      \"doc_mapping_uid\": \"00000000000000000000000000\",\n      \"dynamic_mapping\": {\n        \"expand_dots\": true,\n        \"fast\": {\n          \"normalizer\": \"raw\"\n        },\n        \"indexed\": true,\n        \"record\": \"basic\",\n        \"stored\": true,\n        \"tokenizer\": \"raw\"\n      },\n      \"field_mappings\": [\n        {\n          \"coerce\": true,\n          \"fast\": true,\n          \"indexed\": true,\n          \"name\": \"tenant_id\",\n          \"output_format\": \"number\",\n          \"stored\": true,\n          \"type\": \"u64\"\n        },\n        {\n          \"fast\": true,\n          \"fast_precision\": \"seconds\",\n          \"indexed\": true,\n          \"input_formats\": [\n            \"rfc3339\",\n            \"unix_timestamp\"\n          ],\n          \"name\": \"timestamp\",\n          \"output_format\": \"rfc3339\",\n          \"stored\": true,\n          \"type\": \"datetime\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"log_level\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\",\n          \"type\": \"text\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"message\",\n          \"record\": \"position\",\n          \"stored\": true,\n          \"tokenizer\": \"default\",\n          \"type\": \"text\"\n        }\n      ],\n      \"index_field_presence\": true,\n      \"max_num_partitions\": 100,\n      \"mode\": \"dynamic\",\n      \"partition_key\": \"tenant_id\",\n      \"store_document_size\": false,\n      \"store_source\": true,\n      \"tag_fields\": [\n        
\"log_level\",\n        \"tenant_id\"\n      ],\n      \"timestamp_field\": \"timestamp\",\n      \"tokenizers\": [\n        {\n          \"filters\": [],\n          \"name\": \"custom_tokenizer\",\n          \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n          \"type\": \"regex\"\n        }\n      ]\n    },\n    \"index_id\": \"my-index\",\n    \"index_uri\": \"s3://quickwit-indexes/my-index\",\n    \"indexing_settings\": {\n      \"commit_timeout_secs\": 301,\n      \"docstore_blocksize\": 1000000,\n      \"docstore_compression_level\": 8,\n      \"merge_policy\": {\n        \"maturation_period\": \"2days\",\n        \"max_merge_factor\": 11,\n        \"merge_factor\": 9,\n        \"min_level_num_docs\": 100000,\n        \"type\": \"stable_log\"\n      },\n      \"resources\": {\n        \"heap_size\": 50000000\n      },\n      \"split_num_docs_target\": 10000001\n    },\n    \"ingest_settings\": {\n      \"min_shards\": 1\n    },\n    \"retention\": {\n      \"period\": \"90 days\",\n      \"schedule\": \"daily\"\n    },\n    \"search_settings\": {\n      \"default_search_fields\": [\n        \"message\"\n      ]\n    },\n    \"version\": \"0.9\"\n  },\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"sources\": [\n    {\n      \"enabled\": true,\n      \"input_format\": \"json\",\n      \"num_pipelines\": 2,\n      \"params\": {\n        \"client_params\": {},\n        \"topic\": \"kafka-topic\"\n      },\n      \"source_id\": \"kafka-source\",\n      \"source_type\": \"kafka\",\n      \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"UTC\"\n      },\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/index-metadata/v0.7.json",
    "content": "{\n  \"checkpoint\": {\n    \"kafka-source\": {\n      \"00000000000000000000\": \"00000000000000000042\"\n    }\n  },\n  \"create_timestamp\": 1789,\n  \"index_config\": {\n    \"doc_mapping\": {\n      \"dynamic_mapping\": {\n        \"expand_dots\": true,\n        \"fast\": {\n          \"normalizer\": \"raw\"\n        },\n        \"indexed\": true,\n        \"record\": \"basic\",\n        \"stored\": true,\n        \"tokenizer\": \"raw\"\n      },\n      \"field_mappings\": [\n        {\n          \"coerce\": true,\n          \"fast\": true,\n          \"indexed\": true,\n          \"name\": \"tenant_id\",\n          \"output_format\": \"number\",\n          \"stored\": true,\n          \"type\": \"u64\"\n        },\n        {\n          \"fast\": true,\n          \"fast_precision\": \"seconds\",\n          \"indexed\": true,\n          \"input_formats\": [\n            \"rfc3339\",\n            \"unix_timestamp\"\n          ],\n          \"name\": \"timestamp\",\n          \"output_format\": \"rfc3339\",\n          \"stored\": true,\n          \"type\": \"datetime\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"log_level\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\",\n          \"type\": \"text\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"message\",\n          \"record\": \"position\",\n          \"stored\": true,\n          \"tokenizer\": \"default\",\n          \"type\": \"text\"\n        }\n      ],\n      \"index_field_presence\": true,\n      \"max_num_partitions\": 100,\n      \"mode\": \"dynamic\",\n      \"partition_key\": \"tenant_id\",\n      \"store_source\": true,\n      \"tag_fields\": [\n        \"log_level\",\n        \"tenant_id\"\n      ],\n      \"timestamp_field\": \"timestamp\",\n      
\"tokenizers\": [\n        {\n          \"filters\": [],\n          \"name\": \"custom_tokenizer\",\n          \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n          \"type\": \"regex\"\n        }\n      ]\n    },\n    \"index_id\": \"my-index\",\n    \"index_uri\": \"s3://quickwit-indexes/my-index\",\n    \"indexing_settings\": {\n      \"commit_timeout_secs\": 301,\n      \"docstore_blocksize\": 1000000,\n      \"docstore_compression_level\": 8,\n      \"merge_policy\": {\n        \"maturation_period\": \"2days\",\n        \"max_merge_factor\": 11,\n        \"merge_factor\": 9,\n        \"min_level_num_docs\": 100000,\n        \"type\": \"stable_log\"\n      },\n      \"resources\": {\n        \"heap_size\": \"50.0 MB\"\n      },\n      \"split_num_docs_target\": 10000001\n    },\n    \"retention\": {\n      \"period\": \"90 days\",\n      \"schedule\": \"daily\"\n    },\n    \"search_settings\": {\n      \"default_search_fields\": [\n        \"message\"\n      ]\n    },\n    \"version\": \"0.7\"\n  },\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"sources\": [\n    {\n      \"desired_num_pipelines\": 2,\n      \"enabled\": true,\n      \"input_format\": \"json\",\n      \"max_num_pipelines_per_indexer\": 2,\n      \"params\": {\n        \"client_params\": {},\n        \"topic\": \"kafka-topic\"\n      },\n      \"source_id\": \"kafka-source\",\n      \"source_type\": \"kafka\",\n      \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"UTC\"\n      },\n      \"version\": \"0.7\"\n    }\n  ],\n  \"version\": \"0.7\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/index-metadata/v0.8.expected.json",
    "content": "{\n  \"checkpoint\": {\n    \"kafka-source\": {\n      \"00000000000000000000\": \"00000000000000000042\"\n    }\n  },\n  \"create_timestamp\": 1789,\n  \"index_config\": {\n    \"doc_mapping\": {\n      \"doc_mapping_uid\": \"00000000000000000000000000\",\n      \"dynamic_mapping\": {\n        \"expand_dots\": true,\n        \"fast\": {\n          \"normalizer\": \"raw\"\n        },\n        \"indexed\": true,\n        \"record\": \"basic\",\n        \"stored\": true,\n        \"tokenizer\": \"raw\"\n      },\n      \"field_mappings\": [\n        {\n          \"coerce\": true,\n          \"fast\": true,\n          \"indexed\": true,\n          \"name\": \"tenant_id\",\n          \"output_format\": \"number\",\n          \"stored\": true,\n          \"type\": \"u64\"\n        },\n        {\n          \"fast\": true,\n          \"fast_precision\": \"seconds\",\n          \"indexed\": true,\n          \"input_formats\": [\n            \"rfc3339\",\n            \"unix_timestamp\"\n          ],\n          \"name\": \"timestamp\",\n          \"output_format\": \"rfc3339\",\n          \"stored\": true,\n          \"type\": \"datetime\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"log_level\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\",\n          \"type\": \"text\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"message\",\n          \"record\": \"position\",\n          \"stored\": true,\n          \"tokenizer\": \"default\",\n          \"type\": \"text\"\n        }\n      ],\n      \"index_field_presence\": true,\n      \"max_num_partitions\": 100,\n      \"mode\": \"dynamic\",\n      \"partition_key\": \"tenant_id\",\n      \"store_document_size\": false,\n      \"store_source\": true,\n      \"tag_fields\": [\n        
\"log_level\",\n        \"tenant_id\"\n      ],\n      \"timestamp_field\": \"timestamp\",\n      \"tokenizers\": [\n        {\n          \"filters\": [],\n          \"name\": \"custom_tokenizer\",\n          \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n          \"type\": \"regex\"\n        }\n      ]\n    },\n    \"index_id\": \"my-index\",\n    \"index_uri\": \"s3://quickwit-indexes/my-index\",\n    \"indexing_settings\": {\n      \"commit_timeout_secs\": 301,\n      \"docstore_blocksize\": 1000000,\n      \"docstore_compression_level\": 8,\n      \"merge_policy\": {\n        \"maturation_period\": \"2days\",\n        \"max_merge_factor\": 11,\n        \"merge_factor\": 9,\n        \"min_level_num_docs\": 100000,\n        \"type\": \"stable_log\"\n      },\n      \"resources\": {\n        \"heap_size\": 50000000\n      },\n      \"split_num_docs_target\": 10000001\n    },\n    \"ingest_settings\": {\n      \"min_shards\": 1\n    },\n    \"retention\": {\n      \"period\": \"90 days\",\n      \"schedule\": \"daily\"\n    },\n    \"search_settings\": {\n      \"default_search_fields\": [\n        \"message\"\n      ]\n    },\n    \"version\": \"0.9\"\n  },\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"sources\": [\n    {\n      \"enabled\": true,\n      \"input_format\": \"json\",\n      \"num_pipelines\": 2,\n      \"params\": {\n        \"client_params\": {},\n        \"topic\": \"kafka-topic\"\n      },\n      \"source_id\": \"kafka-source\",\n      \"source_type\": \"kafka\",\n      \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"UTC\"\n      },\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/index-metadata/v0.8.json",
    "content": "{\n  \"checkpoint\": {\n    \"kafka-source\": {\n      \"00000000000000000000\": \"00000000000000000042\"\n    }\n  },\n  \"create_timestamp\": 1789,\n  \"index_config\": {\n    \"doc_mapping\": {\n      \"dynamic_mapping\": {\n        \"expand_dots\": true,\n        \"fast\": {\n          \"normalizer\": \"raw\"\n        },\n        \"indexed\": true,\n        \"record\": \"basic\",\n        \"stored\": true,\n        \"tokenizer\": \"raw\"\n      },\n      \"field_mappings\": [\n        {\n          \"coerce\": true,\n          \"fast\": true,\n          \"indexed\": true,\n          \"name\": \"tenant_id\",\n          \"output_format\": \"number\",\n          \"stored\": true,\n          \"type\": \"u64\"\n        },\n        {\n          \"fast\": true,\n          \"fast_precision\": \"seconds\",\n          \"indexed\": true,\n          \"input_formats\": [\n            \"rfc3339\",\n            \"unix_timestamp\"\n          ],\n          \"name\": \"timestamp\",\n          \"output_format\": \"rfc3339\",\n          \"stored\": true,\n          \"type\": \"datetime\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"log_level\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\",\n          \"type\": \"text\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"message\",\n          \"record\": \"position\",\n          \"stored\": true,\n          \"tokenizer\": \"default\",\n          \"type\": \"text\"\n        }\n      ],\n      \"index_field_presence\": true,\n      \"max_num_partitions\": 100,\n      \"mode\": \"dynamic\",\n      \"partition_key\": \"tenant_id\",\n      \"store_document_size\": false,\n      \"store_source\": true,\n      \"tag_fields\": [\n        \"log_level\",\n        \"tenant_id\"\n      ],\n      
\"timestamp_field\": \"timestamp\",\n      \"tokenizers\": [\n        {\n          \"filters\": [],\n          \"name\": \"custom_tokenizer\",\n          \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n          \"type\": \"regex\"\n        }\n      ]\n    },\n    \"index_id\": \"my-index\",\n    \"index_uri\": \"s3://quickwit-indexes/my-index\",\n    \"indexing_settings\": {\n      \"commit_timeout_secs\": 301,\n      \"docstore_blocksize\": 1000000,\n      \"docstore_compression_level\": 8,\n      \"merge_policy\": {\n        \"maturation_period\": \"2days\",\n        \"max_merge_factor\": 11,\n        \"merge_factor\": 9,\n        \"min_level_num_docs\": 100000,\n        \"type\": \"stable_log\"\n      },\n      \"resources\": {\n        \"heap_size\": \"50.0 MB\"\n      },\n      \"split_num_docs_target\": 10000001\n    },\n    \"retention\": {\n      \"period\": \"90 days\",\n      \"schedule\": \"daily\"\n    },\n    \"search_settings\": {\n      \"default_search_fields\": [\n        \"message\"\n      ]\n    },\n    \"version\": \"0.8\"\n  },\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"sources\": [\n    {\n      \"enabled\": true,\n      \"input_format\": \"json\",\n      \"num_pipelines\": 2,\n      \"params\": {\n        \"client_params\": {},\n        \"topic\": \"kafka-topic\"\n      },\n      \"source_id\": \"kafka-source\",\n      \"source_type\": \"kafka\",\n      \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"UTC\"\n      },\n      \"version\": \"0.8\"\n    }\n  ],\n  \"version\": \"0.8\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/index-metadata/v0.9.expected.json",
    "content": "{\n  \"checkpoint\": {\n    \"kafka-source\": {\n      \"00000000000000000000\": \"00000000000000000042\"\n    }\n  },\n  \"create_timestamp\": 1789,\n  \"index_config\": {\n    \"doc_mapping\": {\n      \"doc_mapping_uid\": \"00000000000000000000000001\",\n      \"dynamic_mapping\": {\n        \"expand_dots\": true,\n        \"fast\": {\n          \"normalizer\": \"raw\"\n        },\n        \"indexed\": true,\n        \"record\": \"basic\",\n        \"stored\": true,\n        \"tokenizer\": \"raw\"\n      },\n      \"field_mappings\": [\n        {\n          \"coerce\": true,\n          \"fast\": true,\n          \"indexed\": true,\n          \"name\": \"tenant_id\",\n          \"output_format\": \"number\",\n          \"stored\": true,\n          \"type\": \"u64\"\n        },\n        {\n          \"fast\": true,\n          \"fast_precision\": \"seconds\",\n          \"indexed\": true,\n          \"input_formats\": [\n            \"rfc3339\",\n            \"unix_timestamp\"\n          ],\n          \"name\": \"timestamp\",\n          \"output_format\": \"rfc3339\",\n          \"stored\": true,\n          \"type\": \"datetime\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"log_level\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\",\n          \"type\": \"text\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"message\",\n          \"record\": \"position\",\n          \"stored\": true,\n          \"tokenizer\": \"default\",\n          \"type\": \"text\"\n        }\n      ],\n      \"index_field_presence\": true,\n      \"max_num_partitions\": 100,\n      \"mode\": \"dynamic\",\n      \"partition_key\": \"tenant_id\",\n      \"store_document_size\": false,\n      \"store_source\": true,\n      \"tag_fields\": [\n        
\"log_level\",\n        \"tenant_id\"\n      ],\n      \"timestamp_field\": \"timestamp\",\n      \"tokenizers\": [\n        {\n          \"filters\": [],\n          \"name\": \"custom_tokenizer\",\n          \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n          \"type\": \"regex\"\n        }\n      ]\n    },\n    \"index_id\": \"my-index\",\n    \"index_uri\": \"s3://quickwit-indexes/my-index\",\n    \"indexing_settings\": {\n      \"commit_timeout_secs\": 301,\n      \"docstore_blocksize\": 1000000,\n      \"docstore_compression_level\": 8,\n      \"merge_policy\": {\n        \"maturation_period\": \"2days\",\n        \"max_merge_factor\": 11,\n        \"merge_factor\": 9,\n        \"min_level_num_docs\": 100000,\n        \"type\": \"stable_log\"\n      },\n      \"resources\": {\n        \"heap_size\": 50000000\n      },\n      \"split_num_docs_target\": 10000001\n    },\n    \"retention\": {\n      \"period\": \"90 days\",\n      \"schedule\": \"daily\"\n    },\n    \"ingest_settings\": {\n      \"min_shards\": 12\n    },\n    \"search_settings\": {\n      \"default_search_fields\": [\n        \"message\"\n      ]\n    },\n    \"version\": \"0.9\"\n  },\n  \"index_uid\": \"my-index:00000000000000000000000001\",\n  \"sources\": [\n    {\n      \"enabled\": true,\n      \"input_format\": \"json\",\n      \"num_pipelines\": 2,\n      \"params\": {\n        \"client_params\": {},\n        \"topic\": \"kafka-topic\"\n      },\n      \"source_id\": \"kafka-source\",\n      \"source_type\": \"kafka\",\n      \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"UTC\"\n      },\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/index-metadata/v0.9.json",
    "content": "{\n  \"checkpoint\": {\n    \"kafka-source\": {\n      \"00000000000000000000\": \"00000000000000000042\"\n    }\n  },\n  \"create_timestamp\": 1789,\n  \"index_config\": {\n    \"doc_mapping\": {\n      \"doc_mapping_uid\": \"00000000000000000000000001\",\n      \"dynamic_mapping\": {\n        \"expand_dots\": true,\n        \"fast\": {\n          \"normalizer\": \"raw\"\n        },\n        \"indexed\": true,\n        \"record\": \"basic\",\n        \"stored\": true,\n        \"tokenizer\": \"raw\"\n      },\n      \"field_mappings\": [\n        {\n          \"coerce\": true,\n          \"fast\": true,\n          \"indexed\": true,\n          \"name\": \"tenant_id\",\n          \"output_format\": \"number\",\n          \"stored\": true,\n          \"type\": \"u64\"\n        },\n        {\n          \"fast\": true,\n          \"fast_precision\": \"seconds\",\n          \"indexed\": true,\n          \"input_formats\": [\n            \"rfc3339\",\n            \"unix_timestamp\"\n          ],\n          \"name\": \"timestamp\",\n          \"output_format\": \"rfc3339\",\n          \"stored\": true,\n          \"type\": \"datetime\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"log_level\",\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\",\n          \"type\": \"text\"\n        },\n        {\n          \"fast\": false,\n          \"fieldnorms\": false,\n          \"indexed\": true,\n          \"name\": \"message\",\n          \"record\": \"position\",\n          \"stored\": true,\n          \"tokenizer\": \"default\",\n          \"type\": \"text\"\n        }\n      ],\n      \"index_field_presence\": true,\n      \"max_num_partitions\": 100,\n      \"mode\": \"dynamic\",\n      \"partition_key\": \"tenant_id\",\n      \"store_document_size\": false,\n      \"store_source\": true,\n      \"tag_fields\": [\n        
\"log_level\",\n        \"tenant_id\"\n      ],\n      \"timestamp_field\": \"timestamp\",\n      \"tokenizers\": [\n        {\n          \"filters\": [],\n          \"name\": \"custom_tokenizer\",\n          \"pattern\": \"[^\\\\p{L}\\\\p{N}]+\",\n          \"type\": \"regex\"\n        }\n      ]\n    },\n    \"index_id\": \"my-index\",\n    \"index_uri\": \"s3://quickwit-indexes/my-index\",\n    \"indexing_settings\": {\n      \"commit_timeout_secs\": 301,\n      \"docstore_blocksize\": 1000000,\n      \"docstore_compression_level\": 8,\n      \"merge_policy\": {\n        \"maturation_period\": \"2days\",\n        \"max_merge_factor\": 11,\n        \"merge_factor\": 9,\n        \"min_level_num_docs\": 100000,\n        \"type\": \"stable_log\"\n      },\n      \"resources\": {\n        \"heap_size\": 50000000\n      },\n      \"split_num_docs_target\": 10000001\n    },\n    \"retention\": {\n      \"period\": \"90 days\",\n      \"schedule\": \"daily\"\n    },\n    \"ingest_settings\": {\n      \"min_shards\": 12\n    },\n    \"search_settings\": {\n      \"default_search_fields\": [\n        \"message\"\n      ]\n    },\n    \"version\": \"0.9\"\n  },\n  \"index_uid\": \"my-index:00000000000000000000000001\",\n  \"sources\": [\n    {\n      \"enabled\": true,\n      \"input_format\": \"json\",\n      \"num_pipelines\": 2,\n      \"params\": {\n        \"client_params\": {},\n        \"topic\": \"kafka-topic\"\n      },\n      \"source_id\": \"kafka-source\",\n      \"source_type\": \"kafka\",\n      \"transform\": {\n        \"script\": \".message = downcase(string!(.message))\",\n        \"timezone\": \"UTC\"\n      },\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/manifest/v0.7.expected.json",
    "content": "{\n  \"indexes\": {\n    \"test-index-1\": \"creating\",\n    \"test-index-2\": \"active\",\n    \"test-index-3\": \"deleting\"\n  },\n  \"templates\": [\n    {\n      \"description\": \"Test description.\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000000\",\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"ts\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"expand_dots\": true,\n            \"fast\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"json\"\n          }\n        ],\n        \"index_field_presence\": false,\n        \"max_num_partitions\": 200,\n        \"mode\": \"dynamic\",\n        \"store_document_size\": false,\n        \"store_source\": false,\n        \"tag_fields\": [],\n        \"timestamp_field\": \"ts\",\n        \"tokenizers\": []\n      },\n      \"index_id_patterns\": [\n        \"test-index-foo*\",\n        \"-test-index-foobar\"\n      ],\n      \"index_root_uri\": \"ram:///indexes\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 60,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n       
   \"max_merge_factor\": 12,\n          \"merge_factor\": 10,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": 2000000000\n        },\n        \"split_num_docs_target\": 10000000\n      },\n      \"priority\": 100,\n      \"retention\": {\n        \"period\": \"42 days\",\n        \"schedule\": \"daily\"\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 1\n      },\n      \"search_settings\": {\n        \"default_search_fields\": []\n      },\n      \"template_id\": \"test-template\",\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/manifest/v0.7.json",
    "content": "{\n  \"indexes\": {\n    \"test-index-1\": \"creating\",\n    \"test-index-2\": \"active\",\n    \"test-index-3\": \"deleting\"\n  },\n  \"templates\": [\n    {\n      \"description\": \"Test description.\",\n      \"doc_mapping\": {\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"ts\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"expand_dots\": true,\n            \"fast\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"json\"\n          }\n        ],\n        \"index_field_presence\": false,\n        \"max_num_partitions\": 200,\n        \"mode\": \"dynamic\",\n        \"store_source\": false,\n        \"tag_fields\": [],\n        \"timestamp_field\": \"ts\",\n        \"tokenizers\": []\n      },\n      \"index_id_patterns\": [\n        \"test-index-foo*\",\n        \"-test-index-foobar\"\n      ],\n      \"index_root_uri\": \"ram:///indexes\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 60,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n          \"max_merge_factor\": 12,\n          \"merge_factor\": 10,\n          \"min_level_num_docs\": 
100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": \"2.0 GB\"\n        },\n        \"split_num_docs_target\": 10000000\n      },\n      \"priority\": 100,\n      \"retention\": {\n        \"period\": \"42 days\",\n        \"schedule\": \"daily\"\n      },\n      \"search_settings\": {\n        \"default_search_fields\": []\n      },\n      \"template_id\": \"test-template\",\n      \"version\": \"0.7\"\n    }\n  ],\n  \"version\": \"0.7\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/manifest/v0.8.expected.json",
    "content": "{\n  \"indexes\": {\n    \"test-index-1\": \"creating\",\n    \"test-index-2\": \"active\",\n    \"test-index-3\": \"deleting\"\n  },\n  \"templates\": [\n    {\n      \"description\": \"Test description.\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000000\",\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"ts\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"expand_dots\": true,\n            \"fast\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"json\"\n          }\n        ],\n        \"index_field_presence\": false,\n        \"max_num_partitions\": 200,\n        \"mode\": \"dynamic\",\n        \"store_document_size\": false,\n        \"store_source\": false,\n        \"tag_fields\": [],\n        \"timestamp_field\": \"ts\",\n        \"tokenizers\": []\n      },\n      \"index_id_patterns\": [\n        \"test-index-foo*\",\n        \"-test-index-foobar\"\n      ],\n      \"index_root_uri\": \"ram:///indexes\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 60,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n       
   \"max_merge_factor\": 12,\n          \"merge_factor\": 10,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": 2000000000\n        },\n        \"split_num_docs_target\": 10000000\n      },\n      \"priority\": 100,\n      \"ingest_settings\": {\n        \"min_shards\": 1\n      },\n      \"retention\": {\n        \"period\": \"42 days\",\n        \"schedule\": \"daily\"\n      },\n      \"search_settings\": {\n        \"default_search_fields\": []\n      },\n      \"template_id\": \"test-template\",\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/manifest/v0.8.json",
    "content": "{\n  \"indexes\": {\n    \"test-index-1\": \"creating\",\n    \"test-index-2\": \"active\",\n    \"test-index-3\": \"deleting\"\n  },\n  \"templates\": [\n    {\n      \"description\": \"Test description.\",\n      \"doc_mapping\": {\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"ts\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"expand_dots\": true,\n            \"fast\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"json\"\n          }\n        ],\n        \"index_field_presence\": false,\n        \"max_num_partitions\": 200,\n        \"mode\": \"dynamic\",\n        \"store_document_size\": false,\n        \"store_source\": false,\n        \"tag_fields\": [],\n        \"timestamp_field\": \"ts\",\n        \"tokenizers\": []\n      },\n      \"index_id_patterns\": [\n        \"test-index-foo*\",\n        \"-test-index-foobar\"\n      ],\n      \"index_root_uri\": \"ram:///indexes\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 60,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n          \"max_merge_factor\": 12,\n          \"merge_factor\": 
10,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": \"2.0 GB\"\n        },\n        \"split_num_docs_target\": 10000000\n      },\n      \"priority\": 100,\n      \"retention\": {\n        \"period\": \"42 days\",\n        \"schedule\": \"daily\"\n      },\n      \"search_settings\": {\n        \"default_search_fields\": []\n      },\n      \"template_id\": \"test-template\",\n      \"version\": \"0.8\"\n    }\n  ],\n  \"version\": \"0.8\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/manifest/v0.9.expected.json",
    "content": "{\n  \"indexes\": {\n    \"test-index-1\": \"creating\",\n    \"test-index-2\": \"active\",\n    \"test-index-3\": \"deleting\"\n  },\n  \"templates\": [\n    {\n      \"description\": \"Test description.\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000001\",\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"ts\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"expand_dots\": true,\n            \"fast\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"json\"\n          }\n        ],\n        \"index_field_presence\": false,\n        \"max_num_partitions\": 200,\n        \"mode\": \"dynamic\",\n        \"store_document_size\": false,\n        \"store_source\": false,\n        \"tag_fields\": [],\n        \"timestamp_field\": \"ts\",\n        \"tokenizers\": []\n      },\n      \"index_id_patterns\": [\n        \"test-index-foo*\",\n        \"-test-index-foobar\"\n      ],\n      \"index_root_uri\": \"ram:///indexes\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 60,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n       
   \"max_merge_factor\": 12,\n          \"merge_factor\": 10,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": 2000000000\n        },\n        \"split_num_docs_target\": 10000000\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 1\n      },\n      \"priority\": 100,\n      \"retention\": {\n        \"period\": \"42 days\",\n        \"schedule\": \"daily\"\n      },\n      \"search_settings\": {\n        \"default_search_fields\": []\n      },\n      \"template_id\": \"test-template\",\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/manifest/v0.9.json",
    "content": "{\n  \"indexes\": {\n    \"test-index-1\": \"creating\",\n    \"test-index-2\": \"active\",\n    \"test-index-3\": \"deleting\"\n  },\n  \"templates\": [\n    {\n      \"description\": \"Test description.\",\n      \"doc_mapping\": {\n        \"doc_mapping_uid\": \"00000000000000000000000001\",\n        \"dynamic_mapping\": {\n          \"expand_dots\": true,\n          \"fast\": {\n            \"normalizer\": \"raw\"\n          },\n          \"indexed\": true,\n          \"record\": \"basic\",\n          \"stored\": true,\n          \"tokenizer\": \"raw\"\n        },\n        \"field_mappings\": [\n          {\n            \"fast\": true,\n            \"fast_precision\": \"seconds\",\n            \"indexed\": true,\n            \"input_formats\": [\n              \"rfc3339\",\n              \"unix_timestamp\"\n            ],\n            \"name\": \"ts\",\n            \"output_format\": \"rfc3339\",\n            \"stored\": true,\n            \"type\": \"datetime\"\n          },\n          {\n            \"expand_dots\": true,\n            \"fast\": false,\n            \"indexed\": true,\n            \"name\": \"message\",\n            \"record\": \"basic\",\n            \"stored\": true,\n            \"tokenizer\": \"raw\",\n            \"type\": \"json\"\n          }\n        ],\n        \"index_field_presence\": false,\n        \"max_num_partitions\": 200,\n        \"mode\": \"dynamic\",\n        \"store_document_size\": false,\n        \"store_source\": false,\n        \"tag_fields\": [],\n        \"timestamp_field\": \"ts\",\n        \"tokenizers\": []\n      },\n      \"index_id_patterns\": [\n        \"test-index-foo*\",\n        \"-test-index-foobar\"\n      ],\n      \"index_root_uri\": \"ram:///indexes\",\n      \"indexing_settings\": {\n        \"commit_timeout_secs\": 60,\n        \"docstore_blocksize\": 1000000,\n        \"docstore_compression_level\": 8,\n        \"merge_policy\": {\n          \"maturation_period\": \"2days\",\n       
   \"max_merge_factor\": 12,\n          \"merge_factor\": 10,\n          \"min_level_num_docs\": 100000,\n          \"type\": \"stable_log\"\n        },\n        \"resources\": {\n          \"heap_size\": 2000000000\n        },\n        \"split_num_docs_target\": 10000000\n      },\n      \"priority\": 100,\n      \"retention\": {\n        \"period\": \"42 days\",\n        \"schedule\": \"daily\"\n      },\n      \"ingest_settings\": {\n        \"min_shards\": 1\n      },\n      \"search_settings\": {\n        \"default_search_fields\": []\n      },\n      \"template_id\": \"test-template\",\n      \"version\": \"0.9\"\n    }\n  ],\n  \"version\": \"0.9\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/split-metadata/v0.7.expected.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"split_id\": \"split\",\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"partition_id\": 7,\n  \"source_id\": \"source\",\n  \"node_id\": \"node\",\n  \"num_docs\": 12303,\n  \"uncompressed_docs_size_in_bytes\": 234234,\n  \"time_range\": {\n    \"start\": 121000,\n    \"end\": 130198\n  },\n  \"create_timestamp\": 3,\n  \"maturity\": {\n    \"type\": \"immature\",\n    \"maturation_period_millis\": 4000\n  },\n  \"tags\": [\n    \"234\",\n    \"aaa\"\n  ],\n  \"footer_offsets\": {\n    \"start\": 1000,\n    \"end\": 2000\n  },\n  \"delete_opstamp\": 10,\n  \"num_merge_ops\": 3,\n  \"doc_mapping_uid\": \"00000000000000000000000000\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/split-metadata/v0.7.json",
    "content": "{\n  \"create_timestamp\": 3,\n  \"delete_opstamp\": 10,\n  \"footer_offsets\": {\n    \"end\": 2000,\n    \"start\": 1000\n  },\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"maturity\": {\n    \"maturation_period_millis\": 4000,\n    \"type\": \"immature\"\n  },\n  \"node_id\": \"node\",\n  \"num_docs\": 12303,\n  \"num_merge_ops\": 3,\n  \"partition_id\": 7,\n  \"source_id\": \"source\",\n  \"split_id\": \"split\",\n  \"tags\": [\n    \"234\",\n    \"aaa\"\n  ],\n  \"time_range\": {\n    \"end\": 130198,\n    \"start\": 121000\n  },\n  \"uncompressed_docs_size_in_bytes\": 234234,\n  \"version\": \"0.7\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/split-metadata/v0.8.expected.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"split_id\": \"split\",\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"partition_id\": 7,\n  \"source_id\": \"source\",\n  \"node_id\": \"node\",\n  \"num_docs\": 12303,\n  \"uncompressed_docs_size_in_bytes\": 234234,\n  \"time_range\": {\n    \"start\": 121000,\n    \"end\": 130198\n  },\n  \"create_timestamp\": 3,\n  \"maturity\": {\n    \"type\": \"immature\",\n    \"maturation_period_millis\": 4000\n  },\n  \"tags\": [\n    \"234\",\n    \"aaa\"\n  ],\n  \"footer_offsets\": {\n    \"start\": 1000,\n    \"end\": 2000\n  },\n  \"delete_opstamp\": 10,\n  \"num_merge_ops\": 3,\n  \"doc_mapping_uid\": \"00000000000000000000000000\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/split-metadata/v0.8.json",
    "content": "{\n  \"create_timestamp\": 3,\n  \"delete_opstamp\": 10,\n  \"footer_offsets\": {\n    \"end\": 2000,\n    \"start\": 1000\n  },\n  \"index_uid\": \"my-index:00000000000000000000000000\",\n  \"maturity\": {\n    \"maturation_period_millis\": 4000,\n    \"type\": \"immature\"\n  },\n  \"node_id\": \"node\",\n  \"num_docs\": 12303,\n  \"num_merge_ops\": 3,\n  \"partition_id\": 7,\n  \"source_id\": \"source\",\n  \"split_id\": \"split\",\n  \"tags\": [\n    \"234\",\n    \"aaa\"\n  ],\n  \"time_range\": {\n    \"end\": 130198,\n    \"start\": 121000\n  },\n  \"uncompressed_docs_size_in_bytes\": 234234,\n  \"version\": \"0.8\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/split-metadata/v0.9.expected.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"split_id\": \"split\",\n  \"index_uid\": \"my-index:00000000000000000000000001\",\n  \"partition_id\": 7,\n  \"source_id\": \"source\",\n  \"node_id\": \"node\",\n  \"num_docs\": 12303,\n  \"uncompressed_docs_size_in_bytes\": 234234,\n  \"time_range\": {\n    \"start\": 121000,\n    \"end\": 130198\n  },\n  \"create_timestamp\": 3,\n  \"maturity\": {\n    \"type\": \"immature\",\n    \"maturation_period_millis\": 4000\n  },\n  \"tags\": [\n    \"234\",\n    \"aaa\"\n  ],\n  \"footer_offsets\": {\n    \"start\": 1000,\n    \"end\": 2000\n  },\n  \"delete_opstamp\": 10,\n  \"num_merge_ops\": 3,\n  \"doc_mapping_uid\": \"00000000000000000000000000\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore/test-data/split-metadata/v0.9.json",
    "content": "{\n  \"version\": \"0.9\",\n  \"split_id\": \"split\",\n  \"index_uid\": \"my-index:00000000000000000000000001\",\n  \"partition_id\": 7,\n  \"source_id\": \"source\",\n  \"node_id\": \"node\",\n  \"num_docs\": 12303,\n  \"uncompressed_docs_size_in_bytes\": 234234,\n  \"time_range\": {\n    \"start\": 121000,\n    \"end\": 130198\n  },\n  \"create_timestamp\": 3,\n  \"maturity\": {\n    \"type\": \"immature\",\n    \"maturation_period_millis\": 4000\n  },\n  \"tags\": [\n    \"234\",\n    \"aaa\"\n  ],\n  \"footer_offsets\": {\n    \"start\": 1000,\n    \"end\": 2000\n  },\n  \"delete_opstamp\": 10,\n  \"num_merge_ops\": 3,\n  \"doc_mapping_uid\": \"00000000000000000000000000\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore-utils/Cargo.toml",
    "content": "[package]\nname = \"quickwit-metastore-utils\"\ndescription = \"Metastore utilities\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[[bin]]\nname = \"replay\"\n\n[[bin]]\nname = \"proxy\"\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nquickwit-proto = { workspace = true }\nserde = \"1\"\nserde_json = { workspace = true }\nstructopt = \"0.3\"\ntokio = { workspace = true }\n"
  },
  {
    "path": "quickwit/quickwit-metastore-utils/src/bin/README.md",
    "content": "# Replay\n\nReplay is a small util that sequentially replays a bunch of gRPC calls made to the\nquickwit metastore, as fast as possible.\n\nRight now, both the grpc address and the file are hardcoded.\n\nTo run it:\n\n- run `cargo run --release --bin replay` from the `quickwit-metastore` directory.\n\nIt assumes a quickwit metastore service is running on `localhost:7280`\n\nTo get that, simply run:\n`./quickwit run --service metastore`\n\nA minimal `quickwit.yaml` to run against the postgres could be\n\n```yaml\nversion: \"0.7\"\nmetastore_uri: postgres://quickwit-dev:quickwit-dev@localhost/quickwit-metastore-dev\n```\n\nTo run postgres\n\n`docker-compose up postgres` from the quickwit root directory.\n\n# Warning\n\nThe replay file first request is creating the index.\nThat request actually includes an index_config json data, and this part is about to be heavily changed.\n\nFor the moment, I recommend experimenting on top of quickwit rev 2b0e3963f67303f4e6a362d53fa8bebd3cbad33e.\n\n# Warning 2\n\nThe replay data does not delete the index and the splits.\n\nIt is required to run\n`TRUNCATE TABLE indexes CASCADE;`\nvia\n`psql -h localhost -U quickwit-dev quickwit-metastore-dev`\n\nto rerun the replay data.\n"
  },
  {
    "path": "quickwit/quickwit-metastore-utils/src/bin/proxy.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::net::SocketAddr;\nuse std::path::PathBuf;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_metastore_utils::{GrpcCall, GrpcRequest};\nuse quickwit_proto::metastore::*;\nuse quickwit_proto::tonic;\nuse quickwit_proto::tonic::transport::Channel;\nuse quickwit_proto::tonic::{Request, Response, Status};\nuse structopt::StructOpt;\nuse tokio::fs::File;\nuse tokio::io::{AsyncWriteExt, BufWriter};\nuse tokio::sync::Mutex;\nuse tokio::time::Instant;\n\nstruct Inner {\n    start: Instant,\n    client: MetastoreServiceClient,\n    file: BufWriter<File>,\n}\n\nstruct MetastoreProxyService {\n    inner: Arc<Mutex<Inner>>,\n}\n\nimpl MetastoreProxyService {\n    pub fn new(client: MetastoreServiceClient, record_file: File) -> Self {\n        let inner = Inner {\n            start: Instant::now(),\n            client,\n            file: BufWriter::new(record_file),\n        };\n        Self {\n            inner: Arc::new(Mutex::new(inner)),\n        }\n    }\n}\n\nimpl Inner {\n    async fn record<T: Into<GrpcRequest>>(&mut self, req: T) -> anyhow::Result<()> {\n        let now = Instant::now();\n        let grpc_request = req.into();\n        let elapsed = now - self.start;\n        let grpc_call = GrpcCall {\n            ts: elapsed.as_millis() as u64,\n            grpc_request,\n        };\n        let mut buf = 
serde_json::to_vec(&grpc_call)?;\n        buf.push(b'\\n');\n        self.file.write_all(&buf).await?;\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl MetastoreService for MetastoreProxyService {\n    /// Creates an index.\n    async fn create_index(\n        &self,\n        request: tonic::Request<CreateIndexRequest>,\n    ) -> Result<tonic::Response<CreateIndexResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.create_index(request).await?;\n        Ok(resp)\n    }\n    /// Gets an index metadata.\n    async fn index_metadata(\n        &self,\n        request: tonic::Request<IndexMetadataRequest>,\n    ) -> Result<tonic::Response<IndexMetadataResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.index_metadata(request).await?;\n        Ok(resp)\n    }\n    /// Gets an indexes metadatas.\n    async fn list_indexes_metadata(\n        &self,\n        request: tonic::Request<ListIndexesMetadataRequest>,\n    ) -> Result<tonic::Response<ListIndexesMetadataResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.list_indexes_metadata(request).await?;\n        Ok(resp)\n    }\n    /// Deletes an index\n    async fn delete_index(\n        &self,\n        request: tonic::Request<DeleteIndexRequest>,\n    ) -> Result<tonic::Response<DeleteIndexResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.delete_index(request).await?;\n        Ok(resp)\n    }\n    /// Gets splits from index.\n    async fn list_splits(\n        &self,\n        request: tonic::Request<ListSplitsRequest>,\n    ) -> 
Result<tonic::Response<ListSplitsResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.list_splits(request).await?;\n        Ok(resp)\n    }\n    /// Stages several splits.\n    async fn stage_splits(\n        &self,\n        request: Request<StageSplitsRequest>,\n    ) -> Result<Response<SplitResponse>, Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.stage_splits(request).await?;\n        Ok(resp)\n    }\n    /// Publishes split.\n    async fn publish_splits(\n        &self,\n        request: tonic::Request<PublishSplitsRequest>,\n    ) -> Result<tonic::Response<SplitResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.publish_splits(request).await?;\n        Ok(resp)\n    }\n    /// Marks splits for deletion.\n    async fn mark_splits_for_deletion(\n        &self,\n        request: tonic::Request<MarkSplitsForDeletionRequest>,\n    ) -> Result<tonic::Response<SplitResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.mark_splits_for_deletion(request).await?;\n        Ok(resp)\n    }\n    /// Deletes splits.\n    async fn delete_splits(\n        &self,\n        request: tonic::Request<DeleteSplitsRequest>,\n    ) -> Result<tonic::Response<SplitResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.delete_splits(request).await?;\n        Ok(resp)\n    }\n    /// Adds source.\n    async fn add_source(\n        &self,\n        request: tonic::Request<AddSourceRequest>,\n    ) -> 
Result<tonic::Response<SourceResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.add_source(request).await?;\n        Ok(resp)\n    }\n    /// Toggles source.\n    async fn toggle_source(\n        &self,\n        request: tonic::Request<ToggleSourceRequest>,\n    ) -> Result<tonic::Response<SourceResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.toggle_source(request).await?;\n        Ok(resp)\n    }\n    /// Removes source.\n    async fn delete_source(\n        &self,\n        request: tonic::Request<DeleteSourceRequest>,\n    ) -> Result<tonic::Response<SourceResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.delete_source(request).await?;\n        Ok(resp)\n    }\n    /// Resets source checkpoint.\n    async fn reset_source_checkpoint(\n        &self,\n        request: tonic::Request<ResetSourceCheckpointRequest>,\n    ) -> Result<tonic::Response<SourceResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.reset_source_checkpoint(request).await?;\n        Ok(resp)\n    }\n    /// Gets last opstamp for a given `index_id`.\n    async fn last_delete_opstamp(\n        &self,\n        request: tonic::Request<LastDeleteOpstampRequest>,\n    ) -> Result<tonic::Response<LastDeleteOpstampResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.last_delete_opstamp(request).await?;\n        Ok(resp)\n    }\n    /// Creates a delete task.\n    async fn create_delete_task(\n        
&self,\n        request: tonic::Request<DeleteQuery>,\n    ) -> Result<tonic::Response<DeleteTask>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.create_delete_task(request).await?;\n        Ok(resp)\n    }\n    /// Updates splits `delete_opstamp`.\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: tonic::Request<UpdateSplitsDeleteOpstampRequest>,\n    ) -> Result<tonic::Response<UpdateSplitsDeleteOpstampResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.update_splits_delete_opstamp(request).await?;\n        Ok(resp)\n    }\n    /// Lists delete tasks with `delete_task.opstamp` > `opstamp_start` for a given `index_id`.\n    async fn list_delete_tasks(\n        &self,\n        request: tonic::Request<ListDeleteTasksRequest>,\n    ) -> Result<tonic::Response<ListDeleteTasksResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.list_delete_tasks(request).await?;\n        Ok(resp)\n    }\n    /// Lists splits with `split.delete_opstamp` < `delete_opstamp` for a given `index_id`.\n    async fn list_stale_splits(\n        &self,\n        request: tonic::Request<ListStaleSplitsRequest>,\n    ) -> Result<tonic::Response<ListSplitsResponse>, tonic::Status> {\n        let mut lock = self.inner.lock().await;\n        lock.record(request.get_ref().clone()).await.unwrap();\n        let resp = lock.client.list_stale_splits(request).await?;\n        Ok(resp)\n    }\n}\n\n#[derive(Debug, StructOpt)]\n#[structopt(name = \"proxy\", about = \"A quickwit-metastore recording proxy.\")]\nstruct Opt {\n    #[structopt(default_value = \"127.0.0.1:7291\")]\n    listen_to: SocketAddr,\n    #[structopt(long, 
default_value = \"http://127.0.0.1:7281\")]\n    forward_to: String,\n    #[structopt(long, default_value = \"./replay.ndjson\")]\n    file: PathBuf,\n}\n\n#[tokio::main]\nasync fn main() -> anyhow::Result<()> {\n    let opt = Opt::from_args();\n    let client = MetastoreServiceClient::connect(opt.forward_to.clone()).await?;\n    let file = File::create(&opt.file).await?;\n    let service = MetastoreProxyService::new(client, file);\n    let server = MetastoreServiceServer::new(service);\n    println!(\n        \"Listening to {}, Forwarding to {}\",\n        opt.listen_to, opt.forward_to\n    );\n    tonic::transport::Server::builder()\n        .add_service(server)\n        .serve(opt.listen_to)\n        .await?;\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore-utils/src/bin/replay.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::PathBuf;\n\nuse quickwit_metastore_utils::{GrpcCall, GrpcRequest};\nuse quickwit_proto::metastore::metastore_service_client::MetastoreServiceClient;\nuse quickwit_proto::tonic::transport::Channel;\nuse structopt::StructOpt;\nuse tokio::fs::File;\nuse tokio::io::AsyncBufReadExt;\n\nasync fn replay_grpc_request(\n    client: &mut MetastoreServiceClient<Channel>,\n    req: GrpcRequest,\n) -> anyhow::Result<()> {\n    match req {\n        GrpcRequest::CreateIndexRequest(req) => {\n            client.create_index(req).await?;\n        }\n        GrpcRequest::IndexMetadataRequest(req) => {\n            client.index_metadata(req).await?;\n        }\n        GrpcRequest::ListIndexesMetadataRequest(req) => {\n            client.list_indexes_metadata(req).await?;\n        }\n        GrpcRequest::DeleteIndexRequest(req) => {\n            client.delete_index(req).await?;\n        }\n        GrpcRequest::ListSplitsRequest(req) => {\n            client.list_splits(req).await?;\n        }\n        GrpcRequest::StageSplitsRequest(req) => {\n            client.stage_splits(req).await?;\n        }\n        GrpcRequest::PublishSplitsRequest(req) => {\n            client.publish_splits(req).await?;\n        }\n        GrpcRequest::MarkSplitsForDeletionRequest(req) => {\n            client.mark_splits_for_deletion(req).await?;\n        }\n        
GrpcRequest::DeleteSplitsRequest(req) => {\n            client.delete_splits(req).await?;\n        }\n        GrpcRequest::AddSourceRequest(req) => {\n            client.add_source(req).await?;\n        }\n        GrpcRequest::ToggleSourceRequest(req) => {\n            client.toggle_source(req).await?;\n        }\n        GrpcRequest::DeleteSourceRequest(req) => {\n            client.delete_source(req).await?;\n        }\n        GrpcRequest::LastDeleteOpstampRequest(req) => {\n            client.last_delete_opstamp(req).await?;\n        }\n        GrpcRequest::ResetSourceCheckpointRequest(req) => {\n            client.reset_source_checkpoint(req).await?;\n        }\n        GrpcRequest::DeleteQuery(req) => {\n            client.create_delete_task(req).await?;\n        }\n        GrpcRequest::UpdateSplitsDeleteOpstampRequest(req) => {\n            client.update_splits_delete_opstamp(req).await?;\n        }\n        GrpcRequest::ListDeleteTasksRequest(req) => {\n            client.list_delete_tasks(req).await?;\n        }\n        GrpcRequest::ListStaleSplitsRequest(req) => {\n            client.list_stale_splits(req).await?;\n        }\n    }\n    Ok(())\n}\n\n#[derive(Debug, StructOpt)]\n#[structopt(\n    name = \"replay\",\n    about = \"A quickwit-metastore program to replay request log generated by proxy\"\n)]\nstruct Opt {\n    #[structopt(\n        long,\n        default_value = \"./replay-data/requests-partition-wikitenant.ndjson\"\n    )]\n    file: PathBuf,\n    #[structopt(long, default_value = \"http://127.0.0.1:7281\")]\n    forward_to: String,\n}\n\n#[tokio::main]\nasync fn main() -> anyhow::Result<()> {\n    let opt = Opt::from_args();\n    let file = File::open(&opt.file).await?;\n    let buffered = tokio::io::BufReader::new(file);\n    let mut lines = buffered.lines();\n    let mut client = MetastoreServiceClient::connect(opt.forward_to.clone()).await?;\n    let mut i = 0;\n    while let Some(line) = lines.next_line().await? 
{\n        println!(\"line {i} = {line}\");\n        let grpc_call: GrpcCall = serde_json::from_str(&line)?;\n        replay_grpc_request(&mut client, grpc_call.grpc_request).await?;\n        i += 1;\n    }\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-metastore-utils/src/grpc_request.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_proto::metastore::*;\n\n// The macros below are generating a req enum of the form\n//\n// ```\n// enum GrpcRequest {\n//    CreateIndexRequest(CreateIndexRequest),\n//    IndexMetadataRequest(IndexMetadataRequest),\n//    ...\n// }\n// ```\n//\n// And adds a From<SpecificRequest> implementation for\n// every specific request.\n\nmacro_rules! build_req_enum {\n    ( $($key:ident,)* ) => {\n        use serde::{Serialize, Deserialize};\n        #[derive(Serialize, Deserialize)]\n        #[serde(tag=\"type\")]\n        pub enum GrpcRequest {\n            $( $key($key), )*\n        }\n    }\n}\n\nmacro_rules! generate_req_enum {\n    ( $($key:ident,)* ) => {\n        build_req_enum!($($key,)*);\n        req_from_impls!($($key,)*);\n    }\n}\n\nmacro_rules! 
req_from_impls {\n    ($name:ident,) => {\n        impl From<$name> for GrpcRequest {\n            fn from(req: $name) -> Self {\n                GrpcRequest::$name(req)\n            }\n        }\n    };\n    ($name:ident, $($other:ident,)+) => {\n        req_from_impls!($name,);\n        req_from_impls!($($other,)+);\n    }\n}\n\ngenerate_req_enum!(\n    CreateIndexRequest,\n    IndexMetadataRequest,\n    ListIndexesMetadataRequest,\n    DeleteIndexRequest,\n    ListSplitsRequest,\n    StageSplitsRequest,\n    PublishSplitsRequest,\n    MarkSplitsForDeletionRequest,\n    DeleteSplitsRequest,\n    AddSourceRequest,\n    ToggleSourceRequest,\n    DeleteSourceRequest,\n    LastDeleteOpstampRequest,\n    ResetSourceCheckpointRequest,\n    DeleteQuery,\n    UpdateSplitsDeleteOpstampRequest,\n    ListDeleteTasksRequest,\n    ListStaleSplitsRequest,\n);\n"
  },
  {
    "path": "quickwit/quickwit-metastore-utils/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\nmod grpc_request;\npub use grpc_request::GrpcRequest;\nuse serde::{Deserialize, Serialize};\n\n#[derive(Serialize, Deserialize)]\npub struct GrpcCall {\n    pub ts: u64,\n    pub grpc_request: GrpcRequest,\n}\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/Cargo.toml",
    "content": "[package]\nname = \"quickwit-opentelemetry\"\ndescription = \"Telemetry server\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nonce_cell = { workspace = true }\nprost = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nthiserror = { workspace = true }\ntime = { workspace = true }\ntokio = { workspace = true }\ntonic = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-ingest = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\n\n[features]\ntestsuite = []\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![deny(clippy::disallowed_methods)]\n\npub mod otlp;\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/src/otlp/logs.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse async_trait::async_trait;\nuse prost::Message;\nuse quickwit_common::thread_pool::run_cpu_intensive;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{ConfigFormat, IndexConfig, load_index_config_from_user_config};\nuse quickwit_ingest::{CommitType, JsonDocBatchV2Builder};\nuse quickwit_proto::ingest::DocBatchV2;\nuse quickwit_proto::ingest::router::IngestRouterServiceClient;\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::logs_service_server::LogsService;\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::{\n    ExportLogsPartialSuccess, ExportLogsServiceRequest, ExportLogsServiceResponse,\n};\nuse quickwit_proto::types::{DocUidGenerator, IndexId};\nuse serde::{Deserialize, Serialize};\nuse serde_json::Value as JsonValue;\nuse time::OffsetDateTime;\nuse tonic::{Request, Response, Status};\nuse tracing::field::Empty;\nuse tracing::{Span as RuntimeSpan, error, instrument, warn};\n\nuse super::{\n    OtelSignal, SpanId, TraceId, TryFromSpanIdError, TryFromTraceIdError,\n    extract_otel_index_id_from_metadata, ingest_doc_batch_v2, is_zero, parse_log_record_body,\n};\nuse crate::otlp::extract_attributes;\nuse crate::otlp::metrics::OTLP_SERVICE_METRICS;\n\npub const OTEL_LOGS_INDEX_ID: &str = \"otel-logs-v0_9\";\n\nconst OTEL_LOGS_INDEX_CONFIG: &str = r#\"\nversion: 
0.8\n\nindex_id: ${INDEX_ID}\n\ndoc_mapping:\n  mode: strict\n  field_mappings:\n    - name: timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n      indexed: false\n      fast: true\n      fast_precision: milliseconds\n    - name: observed_timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n    - name: service_name\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: severity_text\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: severity_number\n      type: u64\n      fast: true\n    - name: body\n      type: json\n      tokenizer: default\n    - name: attributes\n      type: json\n      tokenizer: raw\n      fast: true\n    - name: dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: trace_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n    - name: span_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n    - name: trace_flags\n      type: u64\n      indexed: false\n    - name: resource_attributes\n      type: json\n      tokenizer: raw\n      fast: true\n    - name: resource_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: scope_name\n      type: text\n      indexed: false\n    - name: scope_version\n      type: text\n      indexed: false\n    - name: scope_attributes\n      type: json\n      indexed: false\n    - name: scope_dropped_attributes_count\n      type: u64\n      indexed: false\n\n  timestamp_field: timestamp_nanos\n\n  # partition_key: hash_mod(service_name, 100)\n  # tag_fields: [service_name]\n\nindexing_settings:\n  commit_timeout_secs: 5\n\nsearch_settings:\n  default_search_fields: [body.message]\n\"#;\n\n#[derive(Debug, thiserror::Error)]\npub enum OtlpLogsError {\n    #[error(\"failed to deserialize JSON log records: `{0}`\")]\n    Json(#[from] serde_json::Error),\n    
#[error(\"failed to deserialize Protobuf log records: `{0}`\")]\n    Protobuf(#[from] prost::DecodeError),\n    #[error(\"failed to parse log record: `{0}`\")]\n    SpanId(#[from] TryFromSpanIdError),\n    #[error(\"failed to parse log record: `{0}`\")]\n    TraceId(#[from] TryFromTraceIdError),\n}\n\n#[derive(Debug, Serialize, Deserialize)]\npub struct LogRecord {\n    pub timestamp_nanos: u64,\n    pub observed_timestamp_nanos: u64,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"String::is_empty\")]\n    pub service_name: String,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub severity_text: Option<String>,\n    pub severity_number: i32,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub body: Option<JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub dropped_attributes_count: u32,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub trace_id: Option<TraceId>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub span_id: Option<SpanId>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub trace_flags: Option<u32>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub resource_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub resource_dropped_attributes_count: u32,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub scope_name: Option<String>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub scope_version: Option<String>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub 
scope_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub scope_dropped_attributes_count: u32,\n}\n\nstruct ParsedLogRecords {\n    doc_batch: DocBatchV2,\n    num_log_records: u64,\n    num_parse_errors: u64,\n    error_message: String,\n}\n\n#[derive(Clone)]\npub struct OtlpGrpcLogsService {\n    ingest_router: IngestRouterServiceClient,\n}\n\nimpl OtlpGrpcLogsService {\n    pub fn new(ingest_router: IngestRouterServiceClient) -> Self {\n        Self { ingest_router }\n    }\n\n    pub fn index_config(default_index_root_uri: &Uri) -> anyhow::Result<IndexConfig> {\n        let index_config_str = OTEL_LOGS_INDEX_CONFIG.replace(\"${INDEX_ID}\", OTEL_LOGS_INDEX_ID);\n        let index_config = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            index_config_str.as_bytes(),\n            default_index_root_uri,\n        )?;\n        Ok(index_config)\n    }\n\n    async fn export_inner(\n        &mut self,\n        request: ExportLogsServiceRequest,\n        index_id: IndexId,\n        labels: [&str; 4],\n    ) -> Result<ExportLogsServiceResponse, Status> {\n        let ParsedLogRecords {\n            doc_batch,\n            num_log_records,\n            num_parse_errors,\n            error_message,\n        } = run_cpu_intensive({\n            let parent_span = RuntimeSpan::current();\n            || Self::parse_logs(request, parent_span)\n        })\n        .await\n        .map_err(|join_error| {\n            error!(error=?join_error, \"failed to parse log records\");\n            Status::internal(\"failed to parse log records\")\n        })??;\n        if num_log_records == num_parse_errors {\n            return Err(tonic::Status::internal(error_message));\n        }\n        let num_bytes = doc_batch.num_bytes() as u64;\n        self.store_logs(index_id, doc_batch).await?;\n\n        OTLP_SERVICE_METRICS\n            .ingested_log_records_total\n            
.with_label_values(labels)\n            .inc_by(num_log_records);\n        OTLP_SERVICE_METRICS\n            .ingested_bytes_total\n            .with_label_values(labels)\n            .inc_by(num_bytes);\n\n        let response = ExportLogsServiceResponse {\n            // `rejected_log_records=0` and `error_message=\"\"` is considered a \"full\" success.\n            partial_success: Some(ExportLogsPartialSuccess {\n                rejected_log_records: num_parse_errors as i64,\n                error_message,\n            }),\n        };\n        Ok(response)\n    }\n\n    #[instrument(skip_all, parent = parent_span, fields(num_log_records = Empty, num_bytes = Empty, num_parse_errors = Empty))]\n    #[allow(clippy::result_large_err)]\n    fn parse_logs(\n        request: ExportLogsServiceRequest,\n        parent_span: RuntimeSpan,\n    ) -> tonic::Result<ParsedLogRecords> {\n        let log_records = parse_otlp_logs(request)?;\n        let mut num_parse_errors = 0;\n        let num_log_records = log_records.len() as u64;\n        let mut error_message = String::new();\n\n        let mut doc_batch_builder = JsonDocBatchV2Builder::with_num_docs(num_log_records as usize);\n        let mut doc_uid_generator = DocUidGenerator::default();\n        for log_record in log_records {\n            let doc_uid = doc_uid_generator.next_doc_uid();\n            if let Err(error) = doc_batch_builder.add_doc(doc_uid, log_record) {\n                error!(error=?error, \"failed to JSON serialize log record\");\n                error_message = format!(\"failed to JSON serialize log record: {error:?}\");\n                num_parse_errors += 1;\n            }\n        }\n        let doc_batch = doc_batch_builder.build();\n        let current_span = RuntimeSpan::current();\n        current_span.record(\"num_log_records\", num_log_records);\n        current_span.record(\"num_bytes\", doc_batch.num_bytes());\n        current_span.record(\"num_parse_errors\", num_parse_errors);\n\n        let 
parsed_logs = ParsedLogRecords {\n            doc_batch,\n            num_log_records,\n            num_parse_errors,\n            error_message,\n        };\n        Ok(parsed_logs)\n    }\n\n    #[instrument(skip_all, fields(num_bytes = doc_batch.num_bytes()))]\n    async fn store_logs(\n        &mut self,\n        index_id: String,\n        doc_batch: DocBatchV2,\n    ) -> Result<(), tonic::Status> {\n        ingest_doc_batch_v2(\n            self.ingest_router.clone(),\n            index_id,\n            doc_batch,\n            CommitType::Auto,\n        )\n        .await?;\n        Ok(())\n    }\n\n    async fn export_instrumented(\n        &mut self,\n        request: ExportLogsServiceRequest,\n        index_id: IndexId,\n    ) -> Result<ExportLogsServiceResponse, Status> {\n        let start = std::time::Instant::now();\n\n        let labels = [\"logs\", &index_id, \"grpc\", \"protobuf\"];\n\n        OTLP_SERVICE_METRICS\n            .requests_total\n            .with_label_values(labels)\n            .inc();\n        let (export_res, is_error) =\n            match self.export_inner(request, index_id.clone(), labels).await {\n                ok @ Ok(_) => (ok, \"false\"),\n                err @ Err(_) => {\n                    OTLP_SERVICE_METRICS\n                        .request_errors_total\n                        .with_label_values(labels)\n                        .inc();\n                    (err, \"true\")\n                }\n            };\n        let elapsed = start.elapsed().as_secs_f64();\n        let labels = [\"logs\", &index_id, \"grpc\", \"protobuf\", is_error];\n        OTLP_SERVICE_METRICS\n            .request_duration_seconds\n            .with_label_values(labels)\n            .observe(elapsed);\n\n        export_res\n    }\n}\n\n#[async_trait]\nimpl LogsService for OtlpGrpcLogsService {\n    #[instrument(name = \"ingest_logs\", skip_all)]\n    async fn export(\n        &self,\n        request: Request<ExportLogsServiceRequest>,\n    ) 
-> Result<Response<ExportLogsServiceResponse>, Status> {\n        let index_id = extract_otel_index_id_from_metadata(request.metadata(), OtelSignal::Logs)?;\n        let request = request.into_inner();\n        self.clone()\n            .export_instrumented(request, index_id)\n            .await\n            .map(Response::new)\n    }\n}\n\nfn parse_otlp_logs(request: ExportLogsServiceRequest) -> Result<Vec<LogRecord>, OtlpLogsError> {\n    let num_log_records = request\n        .resource_logs\n        .iter()\n        .flat_map(|resource_log| resource_log.scope_logs.iter())\n        .map(|scope_logs| scope_logs.log_records.len())\n        .sum();\n    let mut log_records = Vec::with_capacity(num_log_records);\n\n    for resource_logs in request.resource_logs {\n        let mut resource_attributes = extract_attributes(\n            resource_logs\n                .resource\n                .clone()\n                .map(|rsrc| rsrc.attributes)\n                .unwrap_or_default(),\n        );\n        let resource_dropped_attributes_count = resource_logs\n            .resource\n            .map(|rsrc| rsrc.dropped_attributes_count)\n            .unwrap_or(0);\n\n        let service_name = match resource_attributes.remove(\"service.name\") {\n            Some(JsonValue::String(value)) => value.to_string(),\n            _ => \"unknown_service\".to_string(),\n        };\n        for scope_logs in resource_logs.scope_logs {\n            let scope_name = scope_logs\n                .scope\n                .as_ref()\n                .map(|scope| &scope.name)\n                .filter(|name| !name.is_empty());\n            let scope_version = scope_logs\n                .scope\n                .as_ref()\n                .map(|scope| &scope.version)\n                .filter(|version| !version.is_empty());\n            let scope_attributes = extract_attributes(\n                scope_logs\n                    .scope\n                    .clone()\n                    
.map(|scope| scope.attributes)\n                    .unwrap_or_default(),\n            );\n            let scope_dropped_attributes_count = scope_logs\n                .scope\n                .as_ref()\n                .map(|scope| scope.dropped_attributes_count)\n                .unwrap_or(0);\n\n            for log_record in scope_logs.log_records {\n                let observed_timestamp_nanos = if log_record.observed_time_unix_nano == 0 {\n                    // As per OTEL model spec, this field SHOULD be set once the\n                    // event is observed by OpenTelemetry. If it's not set, we\n                    // consider ourselves as the first OTEL observers.\n                    OffsetDateTime::now_utc().unix_timestamp_nanos() as u64\n                } else {\n                    log_record.observed_time_unix_nano\n                };\n\n                let timestamp_nanos = if log_record.time_unix_nano == 0 {\n                    observed_timestamp_nanos\n                } else {\n                    // When only one timestamp is supported by a recipient, the\n                    // OTEL spec recommends using the `Timestamp` field if\n                    // present, otherwise `ObservedTimestamp`. 
Even though our\n                    // model supports multiple timestamps, we have only one\n                    // field that can be our `timestamp_field` and it\n                    // should be the one that is commonly used for queries.\n                    log_record.time_unix_nano\n                };\n\n                let trace_id = if log_record.trace_id.iter().any(|&byte| byte != 0) {\n                    let trace_id = TraceId::try_from(log_record.trace_id)?;\n                    Some(trace_id)\n                } else {\n                    None\n                };\n                let span_id = if log_record.span_id.iter().any(|&byte| byte != 0) {\n                    let span_id = SpanId::try_from(log_record.span_id)?;\n                    Some(span_id)\n                } else {\n                    None\n                };\n                let trace_flags = Some(log_record.flags);\n\n                let severity_text = if !log_record.severity_text.is_empty() {\n                    Some(log_record.severity_text)\n                } else {\n                    None\n                };\n                let severity_number = log_record.severity_number;\n                let body = log_record.body.and_then(parse_log_record_body);\n                let attributes = extract_attributes(log_record.attributes);\n                let dropped_attributes_count = log_record.dropped_attributes_count;\n\n                let log_record = LogRecord {\n                    timestamp_nanos,\n                    observed_timestamp_nanos,\n                    service_name: service_name.clone(),\n                    severity_text,\n                    severity_number,\n                    body,\n                    attributes,\n                    trace_id,\n                    span_id,\n                    trace_flags,\n                    dropped_attributes_count,\n                    resource_attributes: resource_attributes.clone(),\n                    
resource_dropped_attributes_count,\n                    scope_name: scope_name.cloned(),\n                    scope_version: scope_version.cloned(),\n                    scope_attributes: scope_attributes.clone(),\n                    scope_dropped_attributes_count,\n                };\n                log_records.push(log_record);\n            }\n        }\n    }\n    Ok(log_records)\n}\n\n/// An iterator of JSON OTLP log records for use in the doc processor.\npub struct JsonLogIterator {\n    logs: std::vec::IntoIter<LogRecord>,\n    current_log_idx: usize,\n    num_logs: usize,\n    avg_log_size: usize,\n    avg_log_size_rem: usize,\n}\n\nimpl JsonLogIterator {\n    fn new(logs: Vec<LogRecord>, num_bytes: usize) -> Self {\n        let num_logs = logs.len();\n        let avg_log_size = num_bytes.checked_div(num_logs).unwrap_or(0);\n        let avg_log_size_rem = avg_log_size + num_bytes.checked_rem(num_logs).unwrap_or(0);\n\n        Self {\n            logs: logs.into_iter(),\n            current_log_idx: 0,\n            num_logs,\n            avg_log_size,\n            avg_log_size_rem,\n        }\n    }\n}\n\nimpl Iterator for JsonLogIterator {\n    type Item = (JsonValue, usize);\n\n    fn next(&mut self) -> Option<Self::Item> {\n        let log_opt = self\n            .logs\n            .next()\n            .map(|log| serde_json::to_value(log).expect(\"`LogRecord` should be JSON serializable\"));\n        if log_opt.is_some() {\n            self.current_log_idx += 1;\n        }\n        if self.current_log_idx < self.num_logs {\n            log_opt.map(|span| (span, self.avg_log_size))\n        } else {\n            log_opt.map(|span| (span, self.avg_log_size_rem))\n        }\n    }\n}\n\npub fn parse_otlp_logs_json(payload_json: &[u8]) -> Result<JsonLogIterator, OtlpLogsError> {\n    let request: ExportLogsServiceRequest = serde_json::from_slice(payload_json)?;\n    let log_records = parse_otlp_logs(request)?;\n    Ok(JsonLogIterator::new(log_records, 
payload_json.len()))\n}\n\npub fn parse_otlp_logs_protobuf(payload_proto: &[u8]) -> Result<JsonLogIterator, OtlpLogsError> {\n    let request = ExportLogsServiceRequest::decode(payload_proto)?;\n    let log_records = parse_otlp_logs(request)?;\n    Ok(JsonLogIterator::new(log_records, payload_proto.len()))\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_metastore::{CreateIndexRequestExt, metastore_for_test};\n    use quickwit_proto::metastore::{CreateIndexRequest, MetastoreService};\n\n    use super::*;\n\n    #[test]\n    fn test_index_config_is_valid() {\n        let index_config =\n            OtlpGrpcLogsService::index_config(&Uri::for_test(\"ram:///indexes\")).unwrap();\n        assert_eq!(index_config.index_id, OTEL_LOGS_INDEX_ID);\n    }\n\n    #[tokio::test]\n    async fn test_create_index() {\n        let metastore = metastore_for_test();\n        let index_config =\n            OtlpGrpcLogsService::index_config(&Uri::for_test(\"ram:///indexes\")).unwrap();\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        metastore.create_index(create_index_request).await.unwrap();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/src/otlp/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    HistogramVec, IntCounterVec, exponential_buckets, new_counter_vec, new_histogram_vec,\n};\n\npub struct OtlpServiceMetrics {\n    pub requests_total: IntCounterVec<4>,\n    pub request_errors_total: IntCounterVec<4>,\n    pub request_duration_seconds: HistogramVec<5>,\n    pub ingested_log_records_total: IntCounterVec<4>,\n    pub ingested_spans_total: IntCounterVec<4>,\n    pub ingested_bytes_total: IntCounterVec<4>,\n}\n\nimpl Default for OtlpServiceMetrics {\n    fn default() -> Self {\n        Self {\n            requests_total: new_counter_vec(\n                \"requests_total\",\n                \"Number of requests\",\n                \"otlp\",\n                &[],\n                [\"service\", \"index\", \"transport\", \"format\"],\n            ),\n            request_errors_total: new_counter_vec(\n                \"request_errors_total\",\n                \"Number of failed requests\",\n                \"otlp\",\n                &[],\n                [\"service\", \"index\", \"transport\", \"format\"],\n            ),\n            request_duration_seconds: new_histogram_vec(\n                \"request_duration_seconds\",\n                \"Duration of requests\",\n                \"otlp\",\n                &[],\n                [\"service\", \"index\", \"transport\", 
\"format\", \"error\"],\n                exponential_buckets(0.02, 2.0, 8).unwrap(),\n            ),\n            ingested_log_records_total: new_counter_vec(\n                \"ingested_log_records_total\",\n                \"Number of log records ingested\",\n                \"otlp\",\n                &[],\n                [\"service\", \"index\", \"transport\", \"format\"],\n            ),\n            ingested_spans_total: new_counter_vec(\n                \"ingested_spans_total\",\n                \"Number of spans ingested\",\n                \"otlp\",\n                &[],\n                [\"service\", \"index\", \"transport\", \"format\"],\n            ),\n            ingested_bytes_total: new_counter_vec(\n                \"ingested_bytes_total\",\n                \"Number of bytes ingested\",\n                \"otlp\",\n                &[],\n                [\"service\", \"index\", \"transport\", \"format\"],\n            ),\n        }\n    }\n}\n\n/// `OTLP_SERVICE_METRICS` exposes metrics for each OTLP service.\npub static OTLP_SERVICE_METRICS: Lazy<OtlpServiceMetrics> = Lazy::new(OtlpServiceMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/src/otlp/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse quickwit_common::rate_limited_warn;\nuse quickwit_config::{INGEST_V2_SOURCE_ID, validate_identifier, validate_index_id_pattern};\nuse quickwit_ingest::{CommitType, IngestServiceError};\nuse quickwit_proto::ingest::DocBatchV2;\nuse quickwit_proto::ingest::router::{\n    IngestRequestV2, IngestRouterService, IngestRouterServiceClient, IngestSubrequest,\n};\nuse quickwit_proto::opentelemetry::proto::common::v1::any_value::Value as OtlpValue;\nuse quickwit_proto::opentelemetry::proto::common::v1::{\n    AnyValue as OtlpAnyValue, ArrayValue as OtlpArrayValue, KeyValue as OtlpKeyValue,\n};\nuse serde_json::{Number as JsonNumber, Value as JsonValue};\n\nmod logs;\nmod metrics;\n#[cfg(any(test, feature = \"testsuite\"))]\nmod test_utils;\nmod traces;\n\npub use logs::{\n    JsonLogIterator, OTEL_LOGS_INDEX_ID, OtlpGrpcLogsService, OtlpLogsError, parse_otlp_logs_json,\n    parse_otlp_logs_protobuf,\n};\npub use quickwit_proto::search::{SpanId, TraceId, TryFromSpanIdError, TryFromTraceIdError};\n#[cfg(any(test, feature = \"testsuite\"))]\npub use test_utils::make_resource_spans_for_test;\nuse tonic::Status;\npub use traces::{\n    Event, JsonSpanIterator, Link, OTEL_TRACES_INDEX_ID, OTEL_TRACES_INDEX_ID_PATTERN,\n    OtlpGrpcTracesService, OtlpTracesError, Span, SpanFingerprint, SpanKind, SpanStatus,\n    
parse_otlp_spans_json, parse_otlp_spans_protobuf,\n};\n\n#[derive(Debug, Clone, Copy)]\npub enum OtelSignal {\n    Logs,\n    Traces,\n}\n\nimpl OtelSignal {\n    pub fn header_name(&self) -> &'static str {\n        match self {\n            OtelSignal::Logs => \"qw-otel-logs-index\",\n            OtelSignal::Traces => \"qw-otel-traces-index\",\n        }\n    }\n\n    pub fn default_index_id(&self) -> &'static str {\n        match self {\n            OtelSignal::Logs => OTEL_LOGS_INDEX_ID,\n            OtelSignal::Traces => OTEL_TRACES_INDEX_ID,\n        }\n    }\n}\n\nimpl From<OtlpLogsError> for tonic::Status {\n    fn from(error: OtlpLogsError) -> Self {\n        tonic::Status::invalid_argument(error.to_string())\n    }\n}\n\nimpl From<OtlpTracesError> for tonic::Status {\n    fn from(error: OtlpTracesError) -> Self {\n        tonic::Status::invalid_argument(error.to_string())\n    }\n}\n\n// An `Attribute` is a key-value pair, which MUST have the following properties:\n// - The attribute key MUST be a non-null and non-empty string.\n// - The attribute value is either:\n//  - A primitive type: string, boolean, double precision floating point (IEEE 754-1985) or signed\n//    64 bit integer.\n//  - An array of primitive type values. 
The array MUST be homogeneous, i.e., it MUST NOT contain\n//    values of different types.\n//\n// <https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/common#attribute>\npub(crate) fn extract_attributes(attributes: Vec<OtlpKeyValue>) -> HashMap<String, JsonValue> {\n    let mut attrs = HashMap::with_capacity(attributes.len());\n\n    for attribute in attributes {\n        if attribute.key.is_empty() {\n            continue;\n        }\n        if let Some(value) = attribute\n            .value\n            .and_then(|any_value| any_value.value)\n            .and_then(oltp_value_to_json_value)\n        {\n            attrs.insert(attribute.key, value);\n        }\n    }\n    attrs\n}\n\nfn oltp_value_to_json_value(value: OtlpValue) -> Option<JsonValue> {\n    match value {\n        OtlpValue::ArrayValue(OtlpArrayValue { values }) => Some(\n            values\n                .into_iter()\n                .filter_map(|value| match value.value {\n                    Some(value) => oltp_value_to_json_value(value),\n                    None => None,\n                })\n                .collect(),\n        ),\n        OtlpValue::BoolValue(bool_value) => Some(JsonValue::Bool(bool_value)),\n        OtlpValue::DoubleValue(double_value) => {\n            JsonNumber::from_f64(double_value).map(JsonValue::Number)\n        }\n        OtlpValue::IntValue(int_value) => Some(JsonValue::Number(JsonNumber::from(int_value))),\n        OtlpValue::KvlistValue(key_values) => {\n            let mut map = serde_json::Map::with_capacity(key_values.values.len());\n\n            for key_value in key_values.values {\n                if let Some(value) = key_value\n                    .value\n                    .and_then(|any_value| any_value.value)\n                    .and_then(oltp_value_to_json_value)\n                {\n                    map.insert(key_value.key, value);\n                }\n            }\n            Some(JsonValue::Object(map))\n    
    }\n        OtlpValue::StringValue(string_value) => Some(JsonValue::String(string_value)),\n        OtlpValue::BytesValue(_) => {\n            rate_limited_warn!(limit_per_min = 10, \"ignoring unsupported OTLP bytes value\");\n            None\n        }\n    }\n}\n\npub(crate) fn parse_log_record_body(body: OtlpAnyValue) -> Option<JsonValue> {\n    body.value.and_then(oltp_value_to_json_value).map(|value| {\n        if value.is_string() {\n            let mut map = serde_json::Map::with_capacity(1);\n            map.insert(\"message\".to_string(), value);\n            JsonValue::Object(map)\n        } else {\n            value\n        }\n    })\n}\n\nfn is_zero(count: &u32) -> bool {\n    *count == 0\n}\n\n#[allow(clippy::result_large_err)]\npub fn extract_otel_traces_index_id_patterns_from_metadata(\n    metadata: &tonic::metadata::MetadataMap,\n) -> Result<Vec<String>, Status> {\n    let comma_separated_index_id_patterns = metadata\n        .get(OtelSignal::Traces.header_name())\n        .map(|index| index.to_str())\n        .transpose()\n        .map_err(|error| {\n            Status::internal(format!(\n                \"failed to extract index ID from request header: {error}\",\n            ))\n        })?\n        .unwrap_or(OTEL_TRACES_INDEX_ID_PATTERN);\n    let mut index_id_patterns = Vec::new();\n    for index_id_pattern in comma_separated_index_id_patterns.split(',') {\n        if index_id_pattern.is_empty() {\n            continue;\n        }\n        validate_index_id_pattern(index_id_pattern, true).map_err(|error| {\n            Status::internal(format!(\n                \"invalid index ID pattern in request header: {error}\",\n            ))\n        })?;\n        index_id_patterns.push(index_id_pattern.to_string());\n    }\n    Ok(index_id_patterns)\n}\n\n#[allow(clippy::result_large_err)]\npub(crate) fn extract_otel_index_id_from_metadata(\n    metadata: &tonic::metadata::MetadataMap,\n    otel_signal: OtelSignal,\n) -> Result<String, Status> 
{\n    let index_id = metadata\n        .get(otel_signal.header_name())\n        .map(|index: &tonic::metadata::MetadataValue<tonic::metadata::Ascii>| index.to_str())\n        .transpose()\n        .map_err(|error| {\n            Status::internal(format!(\n                \"failed to extract index ID from request metadata: {error}\",\n            ))\n        })?\n        .unwrap_or_else(|| otel_signal.default_index_id());\n    validate_identifier(\"index_id\", index_id).map_err(|error| {\n        Status::internal(format!(\n            \"invalid index ID pattern in request metadata: {error}\",\n        ))\n    })?;\n    Ok(index_id.to_string())\n}\n\nasync fn ingest_doc_batch_v2(\n    ingest_router: IngestRouterServiceClient,\n    index_id: String,\n    doc_batch: DocBatchV2,\n    commit_type: CommitType,\n) -> Result<(), IngestServiceError> {\n    let subrequest = IngestSubrequest {\n        subrequest_id: 0,\n        index_id,\n        source_id: INGEST_V2_SOURCE_ID.to_string(),\n        doc_batch: Some(doc_batch),\n    };\n    let request = IngestRequestV2 {\n        commit_type: commit_type.into(),\n        subrequests: vec![subrequest],\n    };\n    let mut response = ingest_router.ingest(request).await?;\n    let num_responses = response.successes.len() + response.failures.len();\n    if num_responses != 1 {\n        return Err(IngestServiceError::Internal(format!(\n            \"expected a single failure or success, got {num_responses}\"\n        )));\n    }\n    if response.successes.pop().is_some() {\n        return Ok(());\n    }\n    let ingest_failure = response.failures.pop().unwrap();\n    Err(ingest_failure.into())\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::opentelemetry::proto::common::v1::any_value::{\n        Value as OtlpValue, Value as OtlpAnyValueValue,\n    };\n    use quickwit_proto::opentelemetry::proto::common::v1::{\n        ArrayValue as OtlpArrayValue, KeyValueList as OtlpKeyValueList,\n    };\n    use serde_json::{Value as 
JsonValue, json};\n\n    use super::*;\n    use crate::otlp::{extract_attributes, oltp_value_to_json_value, parse_log_record_body};\n\n    #[test]\n    fn test_oltp_value_to_json_value() {\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::ArrayValue(OtlpArrayValue { values: Vec::new() })),\n            Some(json!([]))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::ArrayValue(OtlpArrayValue {\n                values: vec![\n                    OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::IntValue(1337))\n                    },\n                    OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\"1337\".to_string()))\n                    }\n                ]\n            })),\n            Some(json!([1337, \"1337\"]))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::BoolValue(true)),\n            Some(json!(true))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::DoubleValue(12.0)),\n            Some(json!(12.0))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::IntValue(42)),\n            Some(json!(42))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::KvlistValue(OtlpKeyValueList {\n                values: Vec::new()\n            })),\n            Some(json!({}))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::KvlistValue(OtlpKeyValueList {\n                values: vec![\n                    OtlpKeyValue {\n                        key: \"foo\".to_string(),\n                        value: Some(OtlpAnyValue {\n                            value: Some(OtlpAnyValueValue::IntValue(1337))\n                        })\n                    },\n                    OtlpKeyValue {\n                        key: \"bar\".to_string(),\n                        value: Some(OtlpAnyValue {\n                    
        value: Some(OtlpAnyValueValue::StringValue(\"1337\".to_string()))\n                        })\n                    }\n                ]\n            })),\n            Some(json!({\n                \"foo\": 1337,\n                \"bar\": \"1337\"\n            }))\n        );\n        assert_eq!(\n            oltp_value_to_json_value(OtlpValue::StringValue(\"foo\".to_string())),\n            Some(json!(\"foo\"))\n        );\n    }\n\n    #[test]\n    fn test_extract_attributes() {\n        assert!(extract_attributes(Vec::new()).is_empty());\n\n        let attributes = vec![\n            OtlpKeyValue {\n                key: \"\".to_string(),\n                value: None,\n            },\n            OtlpKeyValue {\n                key: \"\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::BoolValue(true)),\n                }),\n            },\n            OtlpKeyValue {\n                key: \"empty_value\".to_string(),\n                value: None,\n            },\n            OtlpKeyValue {\n                key: \"empty_value_value\".to_string(),\n                value: Some(OtlpAnyValue { value: None }),\n            },\n        ];\n        assert!(extract_attributes(attributes).is_empty());\n\n        let attributes = vec![\n            OtlpKeyValue {\n                key: \"array_key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::ArrayValue(OtlpArrayValue {\n                        values: vec![OtlpAnyValue {\n                            value: Some(OtlpAnyValueValue::IntValue(1337)),\n                        }],\n                    })),\n                }),\n            },\n            OtlpKeyValue {\n                key: \"bool_key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::BoolValue(true)),\n                }),\n            },\n            OtlpKeyValue {\n       
         key: \"double_key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::DoubleValue(12.0)),\n                }),\n            },\n            OtlpKeyValue {\n                key: \"int_key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::IntValue(42)),\n                }),\n            },\n            OtlpKeyValue {\n                key: \"string_key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::StringValue(\"foo\".to_string())),\n                }),\n            },\n        ];\n        let expected_attributes = HashMap::from_iter([\n            (\"array_key\".to_string(), json!([1337])),\n            (\"bool_key\".to_string(), json!(true)),\n            (\"double_key\".to_string(), json!(12.0)),\n            (\"int_key\".to_string(), json!(42)),\n            (\"string_key\".to_string(), json!(\"foo\")),\n        ]);\n        assert_eq!(extract_attributes(attributes), expected_attributes);\n    }\n\n    #[test]\n    fn test_parse_log_record_body() {\n        let value = parse_log_record_body(OtlpAnyValue {\n            value: Some(OtlpAnyValueValue::StringValue(\"body\".to_string())),\n        })\n        .unwrap();\n        let JsonValue::Object(map) = value else {\n            panic!(\"Expected object, got {value:?}\");\n        };\n        assert_eq!(map.len(), 1);\n        assert_eq!(map[\"message\"], json!(\"body\"));\n    }\n\n    #[test]\n    fn test_extract_otel_index_id_patterns_from_metadata() {\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo,bar\".parse().unwrap());\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(&metadata).unwrap();\n        assert_eq!(\n            index_id_patterns,\n            vec![\"foo\".to_string(), 
\"bar\".to_string()]\n        );\n\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"bad-header\", \"foo,bar\".parse().unwrap());\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(&metadata).unwrap();\n        assert_eq!(index_id_patterns, vec![OTEL_TRACES_INDEX_ID_PATTERN]);\n\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo,bar\".parse().unwrap());\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(&metadata).unwrap();\n        assert_eq!(\n            index_id_patterns,\n            vec![\"foo\".to_string(), \"bar\".to_string()]\n        );\n\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo,bar,\".parse().unwrap());\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(&metadata).unwrap();\n        assert_eq!(\n            index_id_patterns,\n            vec![\"foo\".to_string(), \"bar\".to_string()]\n        );\n\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo,bar,,\".parse().unwrap());\n        let index_id_patterns =\n            extract_otel_traces_index_id_patterns_from_metadata(&metadata).unwrap();\n        assert_eq!(\n            index_id_patterns,\n            vec![\"foo\".to_string(), \"bar\".to_string()]\n        );\n\n        // invalid index ID pattern\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo,bar, ,\".parse().unwrap());\n        let extract_res = extract_otel_traces_index_id_patterns_from_metadata(&metadata);\n        assert!(extract_res.is_err());\n    }\n\n    #[test]\n    fn test_extract_otel_index_id_from_metadata() {\n        let mut metadata = 
tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-logs-index\", \"foo\".parse().unwrap());\n        let index_id = extract_otel_index_id_from_metadata(&metadata, OtelSignal::Logs).unwrap();\n        assert_eq!(index_id, \"foo\");\n\n        // default index ID\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"wrong-header\", \"foo\".parse().unwrap());\n        let index_id = extract_otel_index_id_from_metadata(&metadata, OtelSignal::Logs).unwrap();\n        assert_eq!(index_id, OTEL_LOGS_INDEX_ID);\n\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo\".parse().unwrap());\n        let index_id = extract_otel_index_id_from_metadata(&metadata, OtelSignal::Traces).unwrap();\n        assert_eq!(index_id, \"foo\");\n\n        // default index ID\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"wrong-header\", \"foo\".parse().unwrap());\n        let index_id = extract_otel_index_id_from_metadata(&metadata, OtelSignal::Traces).unwrap();\n        assert_eq!(index_id, OTEL_TRACES_INDEX_ID);\n\n        // invalid index ID\n        let mut metadata = tonic::metadata::MetadataMap::new();\n        metadata.insert(\"qw-otel-traces-index\", \"foo bar\".parse().unwrap());\n        let extract_res = extract_otel_index_id_from_metadata(&metadata, OtelSignal::Traces);\n        assert!(extract_res.is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/src/otlp/test_utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse quickwit_proto::opentelemetry::proto::common::v1::any_value::Value as OtlpAnyValueValue;\nuse quickwit_proto::opentelemetry::proto::common::v1::{\n    AnyValue as OtlpAnyValue, ArrayValue, InstrumentationScope, KeyValue as OtlpKeyValue,\n};\nuse quickwit_proto::opentelemetry::proto::resource::v1::Resource;\nuse quickwit_proto::opentelemetry::proto::trace::v1::span::{Event as OtlpEvent, Link as OtlpLink};\nuse quickwit_proto::opentelemetry::proto::trace::v1::{\n    ResourceSpans, ScopeSpans, Span as OtlpSpan, Status as OtlpStatus,\n};\nuse time::OffsetDateTime;\n\nfn now_minus_x_secs(now: &OffsetDateTime, secs: u64) -> u64 {\n    (*now - Duration::from_secs(secs)).unix_timestamp_nanos() as u64\n}\n\npub fn make_resource_spans_for_test() -> Vec<ResourceSpans> {\n    let now: OffsetDateTime = OffsetDateTime::now_utc();\n\n    let attributes = vec![OtlpKeyValue {\n        key: \"span_key\".to_string(),\n        value: Some(OtlpAnyValue {\n            value: Some(OtlpAnyValueValue::StringValue(\"span_value\".to_string())),\n        }),\n    }];\n    let events = vec![OtlpEvent {\n        name: \"event_name\".to_string(),\n        time_unix_nano: 1_000_500_003,\n        attributes: vec![OtlpKeyValue {\n            key: \"event_key\".to_string(),\n            value: Some(OtlpAnyValue {\n                value: 
Some(OtlpAnyValueValue::StringValue(\"event_value\".to_string())),\n            }),\n        }],\n        dropped_attributes_count: 6,\n    }];\n    let links = vec![OtlpLink {\n        trace_id: vec![4; 16],\n        span_id: vec![5; 8],\n        trace_state: \"link_key1=link_value1,link_key2=link_value2\".to_string(),\n        attributes: vec![OtlpKeyValue {\n            key: \"link_key\".to_string(),\n            value: Some(OtlpAnyValue {\n                value: Some(OtlpAnyValueValue::StringValue(\"link_value\".to_string())),\n            }),\n        }],\n        dropped_attributes_count: 7,\n    }];\n    let spans = vec![\n        OtlpSpan {\n            trace_id: vec![1; 16],\n            span_id: vec![1; 8],\n            parent_span_id: Vec::new(),\n            trace_state: \"key1=value1,key2=value2\".to_string(),\n            name: \"stage_splits\".to_string(),\n            kind: 1, // Internal\n            start_time_unix_nano: now_minus_x_secs(&now, 6),\n            end_time_unix_nano: now_minus_x_secs(&now, 5),\n            attributes: Vec::new(),\n            dropped_attributes_count: 0,\n            events: Vec::new(),\n            dropped_events_count: 0,\n            links: Vec::new(),\n            dropped_links_count: 0,\n            status: None,\n        },\n        OtlpSpan {\n            trace_id: vec![2; 16],\n            span_id: vec![2; 8],\n            parent_span_id: Vec::new(),\n            trace_state: \"key1=value1,key2=value2\".to_string(),\n            name: \"publish_splits\".to_string(),\n            kind: 2, // Server\n            start_time_unix_nano: now_minus_x_secs(&now, 4),\n            end_time_unix_nano: now_minus_x_secs(&now, 3),\n            attributes: Vec::new(),\n            dropped_attributes_count: 0,\n            events: Vec::new(),\n            dropped_events_count: 0,\n            links: Vec::new(),\n            dropped_links_count: 0,\n            status: None,\n        },\n        OtlpSpan {\n            
trace_id: vec![3; 16],\n            span_id: vec![3; 8],\n            parent_span_id: Vec::new(),\n            trace_state: \"key1=value1,key2=value2\".to_string(),\n            name: \"list_splits\".to_string(),\n            kind: 3, // Client\n            start_time_unix_nano: now_minus_x_secs(&now, 2),\n            end_time_unix_nano: now_minus_x_secs(&now, 1),\n            attributes,\n            dropped_attributes_count: 0,\n            events: Vec::new(),\n            dropped_events_count: 0,\n            links: Vec::new(),\n            dropped_links_count: 0,\n            status: Some(OtlpStatus {\n                code: 1,\n                message: \"\".to_string(),\n            }),\n        },\n        OtlpSpan {\n            trace_id: vec![4; 16],\n            span_id: vec![4; 8],\n            parent_span_id: Vec::new(),\n            trace_state: \"key1=value1,key2=value2\".to_string(),\n            name: \"list_splits\".to_string(),\n            kind: 3, // Client\n            start_time_unix_nano: now_minus_x_secs(&now, 2),\n            end_time_unix_nano: now_minus_x_secs(&now, 1),\n            attributes: Vec::new(),\n            dropped_attributes_count: 0,\n            events: Vec::new(),\n            dropped_events_count: 0,\n            links: Vec::new(),\n            dropped_links_count: 0,\n            status: Some(OtlpStatus {\n                code: 2,\n                message: \"An error occurred.\".to_string(),\n            }),\n        },\n        OtlpSpan {\n            trace_id: vec![5; 16],\n            span_id: vec![5; 8],\n            parent_span_id: Vec::new(),\n            trace_state: \"key1=value1,key2=value2\".to_string(),\n            name: \"delete_splits\".to_string(),\n            kind: 3, // Client\n            start_time_unix_nano: now_minus_x_secs(&now, 2),\n            end_time_unix_nano: now_minus_x_secs(&now, 1),\n            attributes: Vec::new(),\n            dropped_attributes_count: 0,\n            events,\n          
  dropped_events_count: 0,\n            links,\n            dropped_links_count: 0,\n            status: Some(OtlpStatus {\n                code: 2,\n                message: \"Storage error.\".to_string(),\n            }),\n        },\n    ];\n    let scope_spans = vec![ScopeSpans {\n        scope: Some(InstrumentationScope {\n            name: \"opentelemetry-otlp\".to_string(),\n            version: \"0.11.0\".to_string(),\n            attributes: Vec::new(),\n            dropped_attributes_count: 0,\n        }),\n        spans,\n        schema_url: \"\".to_string(),\n    }];\n    let resource_attributes = vec![\n        OtlpKeyValue {\n            key: \"service.name\".to_string(),\n            value: Some(OtlpAnyValue {\n                value: Some(OtlpAnyValueValue::StringValue(\"quickwit\".to_string())),\n            }),\n        },\n        OtlpKeyValue {\n            key: \"tags\".to_string(),\n            value: Some(OtlpAnyValue {\n                value: Some(OtlpAnyValueValue::ArrayValue(ArrayValue {\n                    values: vec![OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\"foo\".to_string())),\n                    }],\n                })),\n            }),\n        },\n    ];\n    let resource_spans = ResourceSpans {\n        resource: Some(Resource {\n            attributes: resource_attributes,\n            dropped_attributes_count: 0,\n        }),\n        scope_spans,\n        schema_url: \"\".to_string(),\n    };\n    vec![resource_spans]\n}\n"
  },
  {
    "path": "quickwit/quickwit-opentelemetry/src/otlp/traces.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::PartialEq;\nuse std::collections::HashMap;\nuse std::str::FromStr;\n\nuse async_trait::async_trait;\nuse prost::Message;\nuse quickwit_common::thread_pool::run_cpu_intensive;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{ConfigFormat, IndexConfig, load_index_config_from_user_config};\nuse quickwit_ingest::{CommitType, JsonDocBatchV2Builder};\nuse quickwit_proto::ingest::DocBatchV2;\nuse quickwit_proto::ingest::router::IngestRouterServiceClient;\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::trace_service_server::TraceService;\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::{\n    ExportTracePartialSuccess, ExportTraceServiceRequest, ExportTraceServiceResponse,\n};\nuse quickwit_proto::opentelemetry::proto::common::v1::InstrumentationScope;\nuse quickwit_proto::opentelemetry::proto::resource::v1::Resource as OtlpResource;\nuse quickwit_proto::opentelemetry::proto::trace::v1::span::Link as OtlpLink;\nuse quickwit_proto::opentelemetry::proto::trace::v1::status::StatusCode as OtlpStatusCode;\nuse quickwit_proto::opentelemetry::proto::trace::v1::{Span as OtlpSpan, Status as OtlpStatus};\nuse quickwit_proto::types::{DocUidGenerator, IndexId};\nuse serde::{Deserialize, Serialize};\nuse serde_json::Value as JsonValue;\nuse tonic::{Request, Response, Status};\nuse tracing::field::Empty;\nuse 
tracing::{Span as RuntimeSpan, error, instrument, warn};\n\nuse super::{\n    OtelSignal, TryFromSpanIdError, TryFromTraceIdError, extract_otel_index_id_from_metadata,\n    ingest_doc_batch_v2, is_zero,\n};\nuse crate::otlp::metrics::OTLP_SERVICE_METRICS;\nuse crate::otlp::{SpanId, TraceId, extract_attributes};\n\npub const OTEL_TRACES_INDEX_ID: &str = \"otel-traces-v0_9\";\npub const OTEL_TRACES_INDEX_ID_PATTERN: &str = \"otel-traces-v0_*\";\n\nconst OTEL_TRACES_INDEX_CONFIG: &str = r#\"\nversion: 0.8\n\nindex_id: ${INDEX_ID}\n\ndoc_mapping:\n  mode: strict\n  field_mappings:\n    - name: trace_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n      fast: true\n    - name: trace_state\n      type: text\n      indexed: false\n    - name: service_name\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: resource_attributes\n      type: json\n      tokenizer: raw\n    - name: resource_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: scope_name\n      type: text\n      indexed: false\n    - name: scope_version\n      type: text\n      indexed: false\n    - name: scope_attributes\n      type: json\n      indexed: false\n    - name: scope_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: span_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n    - name: span_kind\n      type: u64\n    - name: span_name\n      type: text\n      tokenizer: raw\n      fast: true\n    - name: span_fingerprint\n      type: text\n      tokenizer: raw\n    - name: span_start_timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n      indexed: false\n      fast: true\n      fast_precision: milliseconds\n    - name: span_end_timestamp_nanos\n      type: datetime\n      input_formats: [unix_timestamp]\n      output_format: unix_timestamp_nanos\n      indexed: false\n      fast: false\n    - name: 
span_duration_millis\n      type: u64\n      indexed: false\n      fast: true\n    - name: span_attributes\n      type: json\n      tokenizer: raw\n      fast: true\n    - name: span_dropped_attributes_count\n      type: u64\n      indexed: false\n    - name: span_dropped_events_count\n      type: u64\n      indexed: false\n    - name: span_dropped_links_count\n      type: u64\n      indexed: false\n    - name: span_status\n      type: json\n      indexed: true\n    - name: parent_span_id\n      type: bytes\n      input_format: hex\n      output_format: hex\n      indexed: false\n    - name: is_root\n      type: bool\n      indexed: true\n      stored: false\n    - name: events\n      type: array<json>\n      tokenizer: raw\n      fast: true\n    - name: event_names\n      type: array<text>\n      tokenizer: default\n      record: position\n      stored: false\n    - name: links\n      type: array<json>\n      tokenizer: raw\n\n  timestamp_field: span_start_timestamp_nanos\n\n  # partition_key: hash_mod(service_name, 100)\n  # tag_fields: [service_name]\n\nindexing_settings:\n  commit_timeout_secs: 5\n\nsearch_settings:\n  default_search_fields: [service_name, span_name, event_names]\n\"#;\n\n#[derive(Debug, thiserror::Error)]\npub enum OtlpTracesError {\n    #[error(\"failed to deserialize JSON spans: `{0}`\")]\n    Json(#[from] serde_json::Error),\n    #[error(\"failed to deserialize Protobuf spans: `{0}`\")]\n    Protobuf(#[from] prost::DecodeError),\n    #[error(\"failed to parse span: `{0}`\")]\n    SpanId(#[from] TryFromSpanIdError),\n    #[error(\"failed to parse span: `{0}`\")]\n    TraceId(#[from] TryFromTraceIdError),\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct Span {\n    pub trace_id: TraceId,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub trace_state: Option<String>,\n    pub service_name: String,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub 
resource_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub resource_dropped_attributes_count: u32,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub scope_name: Option<String>,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub scope_version: Option<String>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub scope_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub scope_dropped_attributes_count: u32,\n    pub span_id: SpanId,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub span_kind: u32,\n    pub span_name: String,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub span_fingerprint: Option<SpanFingerprint>,\n    pub span_start_timestamp_nanos: u64,\n    pub span_end_timestamp_nanos: u64,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub span_duration_millis: Option<u64>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub span_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub span_dropped_attributes_count: u32,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub span_dropped_events_count: u32,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub span_dropped_links_count: u32,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"SpanStatus::is_unset\")]\n    pub span_status: SpanStatus,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub parent_span_id: Option<SpanId>,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub is_root: Option<bool>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Vec::is_empty\")]\n    pub events: Vec<Event>,\n    #[serde(default)]\n    
#[serde(skip_serializing_if = \"Vec::is_empty\")]\n    pub event_names: Vec<String>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Vec::is_empty\")]\n    pub links: Vec<Link>,\n}\n\nimpl Span {\n    fn from_otlp(\n        span: OtlpSpan,\n        resource: &Resource,\n        scope: &Scope,\n    ) -> Result<Self, OtlpTracesError> {\n        let trace_id = TraceId::try_from(span.trace_id)?;\n        let span_id = SpanId::try_from(span.span_id)?;\n        let parent_span_id = if !span.parent_span_id.is_empty() {\n            Some(SpanId::try_from(span.parent_span_id)?)\n        } else {\n            None\n        };\n        let span_name = if !span.name.is_empty() {\n            span.name\n        } else {\n            \"unknown\".to_string()\n        };\n        let span_fingerprint =\n            SpanFingerprint::new(&resource.service_name, span.kind.into(), &span_name);\n        // Saturate to zero instead of underflowing when a malformed span ends before it starts.\n        let span_duration_nanos = span\n            .end_time_unix_nano\n            .saturating_sub(span.start_time_unix_nano);\n        let span_duration_millis = Some(span_duration_nanos / 1_000_000);\n        let span_attributes = extract_attributes(span.attributes);\n\n        let events: Vec<Event> = span\n            .events\n            .into_iter()\n            .map(|event| Event {\n                event_timestamp_nanos: event.time_unix_nano,\n                event_name: event.name,\n                event_attributes: extract_attributes(event.attributes),\n                event_dropped_attributes_count: event.dropped_attributes_count,\n            })\n            .collect();\n        let event_names: Vec<String> = events\n            .iter()\n            .map(|event| event.event_name.clone())\n            .collect();\n        let links: Vec<Link> = span\n            .links\n            .into_iter()\n            .map(Link::try_from_otlp)\n            .collect::<Result<_, _>>()?;\n        let trace_state = if span.trace_state.is_empty() {\n            None\n        } else {\n            Some(span.trace_state)\n        
};\n        let span = Span {\n            trace_id,\n            trace_state,\n            service_name: resource.service_name.clone(),\n            resource_attributes: resource.attributes.clone(),\n            resource_dropped_attributes_count: resource.dropped_attributes_count,\n            scope_name: scope.name.clone(),\n            scope_version: scope.version.clone(),\n            scope_attributes: scope.attributes.clone(),\n            scope_dropped_attributes_count: scope.dropped_attributes_count,\n            span_id,\n            span_kind: span.kind as u32,\n            span_name,\n            span_fingerprint: Some(span_fingerprint),\n            span_start_timestamp_nanos: span.start_time_unix_nano,\n            span_end_timestamp_nanos: span.end_time_unix_nano,\n            span_duration_millis,\n            span_attributes,\n            span_dropped_attributes_count: span.dropped_attributes_count,\n            span_dropped_events_count: span.dropped_events_count,\n            span_dropped_links_count: span.dropped_links_count,\n            span_status: span.status.map(SpanStatus::from_otlp).unwrap_or_default(),\n            is_root: Some(parent_span_id.is_none()),\n            parent_span_id,\n            events,\n            event_names,\n            links,\n        };\n        Ok(span)\n    }\n}\n\n#[derive(Debug, Clone)]\npub struct SpanKind(i32);\n\nimpl SpanKind {\n    pub fn as_char(&self) -> char {\n        match self.0 {\n            0 => '0',\n            1 => '1',\n            2 => '2',\n            3 => '3',\n            4 => '4',\n            5 => '5',\n            _ => {\n                panic!(\"Unexpected span kind: {}\", self.0);\n            }\n        }\n    }\n    pub fn as_jaeger(&self) -> &'static str {\n        match self.0 {\n            0 => \"unspecified\",\n            1 => \"internal\",\n            2 => \"server\",\n            3 => \"client\",\n            4 => \"producer\",\n            5 => \"consumer\",\n            
_ => {\n                panic!(\"Unexpected span kind: {}\", self.0);\n            }\n        }\n    }\n\n    pub fn as_otlp(&self) -> &'static str {\n        match self.0 {\n            0 => \"SPAN_KIND_UNSPECIFIED\",\n            1 => \"SPAN_KIND_INTERNAL\",\n            2 => \"SPAN_KIND_SERVER\",\n            3 => \"SPAN_KIND_CLIENT\",\n            4 => \"SPAN_KIND_PRODUCER\",\n            5 => \"SPAN_KIND_CONSUMER\",\n            _ => {\n                panic!(\"Unexpected span kind: {}\", self.0);\n            }\n        }\n    }\n}\n\nimpl From<i32> for SpanKind {\n    fn from(span_kind: i32) -> Self {\n        Self(span_kind)\n    }\n}\n\nimpl FromStr for SpanKind {\n    type Err = String;\n\n    fn from_str(span_kind: &str) -> Result<Self, Self::Err> {\n        let span_kind_i32 = match span_kind {\n            \"0\" | \"unspecified\" | \"SPAN_KIND_UNSPECIFIED\" => 0,\n            \"1\" | \"internal\" | \"SPAN_KIND_INTERNAL\" => 1,\n            \"2\" | \"server\" | \"SPAN_KIND_SERVER\" => 2,\n            \"3\" | \"client\" | \"SPAN_KIND_CLIENT\" => 3,\n            \"4\" | \"producer\" | \"SPAN_KIND_PRODUCER\" => 4,\n            \"5\" | \"consumer\" | \"SPAN_KIND_CONSUMER\" => 5,\n            _ => {\n                if !span_kind.is_empty() {\n                    warn!(span_kind=%span_kind, \"unexpected span kind\");\n                }\n                return Err(format!(\"Unexpected span kind: {span_kind}\"));\n            }\n        };\n        Ok(Self(span_kind_i32))\n    }\n}\n\n/// Concatenation of the service name, span kind, and span name.\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct SpanFingerprint(String);\n\nimpl SpanFingerprint {\n    /// Null character used to separate the service name, span kind, and span name.\n    const NULL_CHAR: char = '\\u{0}';\n\n    /// Start of heading character, the next character after null.\n    const SOH_CHAR: char = '\\u{1}';\n\n    pub fn new(service_name: &str, span_kind: SpanKind, 
span_name: &str) -> Self {\n        Self(format!(\n            \"{service_name}{separator}{span_kind}{separator}{span_name}\",\n            separator = Self::NULL_CHAR,\n            span_kind = span_kind.0\n        ))\n    }\n\n    pub fn as_str(&self) -> &str {\n        self.0.as_str()\n    }\n\n    pub fn from_string(fingerprint: String) -> Self {\n        Self(fingerprint)\n    }\n\n    pub fn service_name(&self) -> Option<&str> {\n        self.0.split(Self::NULL_CHAR).next()\n    }\n\n    pub fn span_kind(&self) -> Option<SpanKind> {\n        self.0\n            .split(Self::NULL_CHAR)\n            .nth(1)\n            .and_then(|span_kind| SpanKind::from_str(span_kind).ok())\n    }\n\n    pub fn span_name(&self) -> Option<&str> {\n        self.0.split(Self::NULL_CHAR).nth(2)\n    }\n\n    pub fn start_key(service_name: &str, span_kind_opt: Option<SpanKind>) -> Option<Vec<u8>> {\n        if service_name.is_empty() {\n            return None;\n        }\n        let mut start_key = service_name.as_bytes().to_vec();\n        start_key.push(Self::NULL_CHAR as u8);\n\n        if let Some(span_kind) = span_kind_opt {\n            start_key.push(span_kind.as_char() as u8);\n            start_key.push(Self::NULL_CHAR as u8);\n        }\n        Some(start_key)\n    }\n\n    pub fn end_key(service_name: &str, span_kind_opt: Option<SpanKind>) -> Option<Vec<u8>> {\n        if service_name.is_empty() {\n            return None;\n        }\n        let mut end_key = service_name.as_bytes().to_vec();\n\n        if let Some(span_kind) = span_kind_opt {\n            end_key.push(Self::NULL_CHAR as u8);\n            end_key.push(span_kind.as_char() as u8);\n        }\n        end_key.push(Self::SOH_CHAR as u8);\n        Some(end_key)\n    }\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct SpanStatus {\n    pub code: OtlpStatusCode,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub message: Option<String>,\n}\n\nimpl SpanStatus 
{\n    pub fn is_unset(&self) -> bool {\n        self.code == OtlpStatusCode::Unset\n    }\n\n    fn from_otlp(span_status: OtlpStatus) -> Self {\n        if span_status.code() == OtlpStatusCode::Ok {\n            Self {\n                code: OtlpStatusCode::Ok,\n                message: None,\n            }\n        } else if span_status.code() == OtlpStatusCode::Error {\n            let message = if span_status.message.is_empty() {\n                None\n            } else {\n                Some(span_status.message)\n            };\n            Self {\n                code: OtlpStatusCode::Error,\n                message,\n            }\n        } else {\n            Self::default()\n        }\n    }\n}\n\nimpl Default for SpanStatus {\n    fn default() -> Self {\n        Self {\n            code: OtlpStatusCode::Unset,\n            message: None,\n        }\n    }\n}\n\nconst UNKNOWN_SERVICE: &str = \"unknown_service\";\n\nconst SERVICE_NAME_KEY: &str = \"service.name\";\n\nstruct Resource {\n    service_name: String,\n    attributes: HashMap<String, JsonValue>,\n    dropped_attributes_count: u32,\n}\n\nimpl Default for Resource {\n    fn default() -> Self {\n        Self {\n            service_name: UNKNOWN_SERVICE.to_string(),\n            attributes: HashMap::new(),\n            dropped_attributes_count: 0,\n        }\n    }\n}\n\nimpl Resource {\n    fn from_otlp(resource: OtlpResource) -> Self {\n        let mut attributes = extract_attributes(resource.attributes);\n        let service_name = match attributes.remove(SERVICE_NAME_KEY) {\n            Some(JsonValue::String(value)) => value,\n            _ => UNKNOWN_SERVICE.to_string(),\n        };\n        Self {\n            service_name,\n            attributes,\n            dropped_attributes_count: resource.dropped_attributes_count,\n        }\n    }\n}\n\n#[derive(Default)]\nstruct Scope {\n    name: Option<String>,\n    version: Option<String>,\n    attributes: HashMap<String, JsonValue>,\n    
dropped_attributes_count: u32,\n}\n\nimpl Scope {\n    fn from_otlp(scope: InstrumentationScope) -> Self {\n        let name = Some(scope.name).filter(|name| !name.is_empty());\n        let version = Some(scope.version).filter(|version| !version.is_empty());\n        let attributes = extract_attributes(scope.attributes);\n        Self {\n            name,\n            version,\n            attributes,\n            dropped_attributes_count: scope.dropped_attributes_count,\n        }\n    }\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct Event {\n    pub event_timestamp_nanos: u64,\n    pub event_name: String,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub event_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub event_dropped_attributes_count: u32,\n}\n\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\npub struct Link {\n    pub link_trace_id: TraceId,\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub link_trace_state: Option<String>,\n    pub link_span_id: SpanId,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"HashMap::is_empty\")]\n    pub link_attributes: HashMap<String, JsonValue>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"is_zero\")]\n    pub link_dropped_attributes_count: u32,\n}\n\nimpl Link {\n    fn try_from_otlp(link: OtlpLink) -> Result<Link, OtlpTracesError> {\n        let link_trace_id = TraceId::try_from(link.trace_id)?;\n        let link_span_id = SpanId::try_from(link.span_id)?;\n        let link = Link {\n            link_trace_id,\n            link_trace_state: if !link.trace_state.is_empty() {\n                Some(link.trace_state)\n            } else {\n                None\n            },\n            link_span_id,\n            link_attributes: extract_attributes(link.attributes),\n            link_dropped_attributes_count: 
link.dropped_attributes_count,\n        };\n        Ok(link)\n    }\n}\n\nfn parse_otlp_spans(request: ExportTraceServiceRequest) -> Result<Vec<Span>, OtlpTracesError> {\n    let num_spans = request\n        .resource_spans\n        .iter()\n        .flat_map(|resource_spans| resource_spans.scope_spans.iter())\n        .map(|scope_spans| scope_spans.spans.len())\n        .sum();\n    let mut spans = Vec::with_capacity(num_spans);\n\n    for resource_spans in request.resource_spans {\n        let resource = resource_spans\n            .resource\n            .map(Resource::from_otlp)\n            .unwrap_or_default();\n        for scope_spans in resource_spans.scope_spans {\n            let scope = scope_spans.scope.map(Scope::from_otlp).unwrap_or_default();\n            for span in scope_spans.spans {\n                let span = Span::from_otlp(span, &resource, &scope)?;\n                spans.push(span);\n            }\n        }\n    }\n    Ok(spans)\n}\n\nstruct ParsedSpans {\n    doc_batch: DocBatchV2,\n    num_spans: u64,\n    num_parse_errors: u64,\n    error_message: String,\n}\n\n#[derive(Debug, Clone)]\npub struct OtlpGrpcTracesService {\n    ingest_router: IngestRouterServiceClient,\n    commit_type: CommitType,\n}\n\nimpl OtlpGrpcTracesService {\n    pub fn new(\n        ingest_router: IngestRouterServiceClient,\n        commit_type_opt: Option<CommitType>,\n    ) -> Self {\n        Self {\n            ingest_router,\n            commit_type: commit_type_opt.unwrap_or_default(),\n        }\n    }\n\n    pub fn index_config(default_index_root_uri: &Uri) -> anyhow::Result<IndexConfig> {\n        let index_config_str =\n            OTEL_TRACES_INDEX_CONFIG.replace(\"${INDEX_ID}\", OTEL_TRACES_INDEX_ID);\n        let index_config = load_index_config_from_user_config(\n            ConfigFormat::Yaml,\n            index_config_str.as_bytes(),\n            default_index_root_uri,\n        )?;\n        Ok(index_config)\n    }\n\n    pub async fn export_inner(\n   
     &mut self,\n        request: ExportTraceServiceRequest,\n        index_id: IndexId,\n        labels: [&str; 4],\n    ) -> Result<ExportTraceServiceResponse, Status> {\n        let ParsedSpans {\n            doc_batch,\n            num_spans,\n            num_parse_errors,\n            error_message,\n        } = run_cpu_intensive({\n            let parent_span = RuntimeSpan::current();\n            || Self::parse_spans(request, parent_span)\n        })\n        .await\n        .map_err(|join_error| {\n            error!(error=%join_error, \"failed to parse spans\");\n            Status::internal(\"failed to parse spans\")\n        })??;\n        if num_spans == 0 {\n            return Err(tonic::Status::invalid_argument(\"request is empty\"));\n        }\n        if num_spans == num_parse_errors {\n            return Err(tonic::Status::internal(error_message));\n        }\n        let num_bytes = doc_batch.num_bytes() as u64;\n        self.store_spans(index_id, doc_batch).await?;\n\n        OTLP_SERVICE_METRICS\n            .ingested_spans_total\n            .with_label_values(labels)\n            .inc_by(num_spans);\n        OTLP_SERVICE_METRICS\n            .ingested_bytes_total\n            .with_label_values(labels)\n            .inc_by(num_bytes);\n\n        let response = ExportTraceServiceResponse {\n            // `rejected_spans=0` and `error_message=\"\"` is considered a \"full\" success.\n            partial_success: Some(ExportTracePartialSuccess {\n                rejected_spans: num_parse_errors as i64,\n                error_message,\n            }),\n        };\n        Ok(response)\n    }\n\n    #[instrument(skip_all, parent = parent_span, fields(num_spans = Empty, num_bytes = Empty, num_parse_errors = Empty))]\n    #[allow(clippy::result_large_err)]\n    fn parse_spans(\n        request: ExportTraceServiceRequest,\n        parent_span: RuntimeSpan,\n    ) -> tonic::Result<ParsedSpans> {\n        let spans = parse_otlp_spans(request)?;\n       
 let num_spans = spans.len() as u64;\n        let mut num_parse_errors = 0;\n        let mut error_message = String::new();\n\n        let mut doc_batch_builder = JsonDocBatchV2Builder::with_num_docs(num_spans as usize);\n        let mut doc_uid_generator = DocUidGenerator::default();\n        for span in spans {\n            let doc_uid = doc_uid_generator.next_doc_uid();\n            if let Err(error) = doc_batch_builder.add_doc(doc_uid, span) {\n                error!(error=?error, \"failed to JSON serialize span\");\n                error_message = format!(\"failed to JSON serialize span: {error:?}\");\n                num_parse_errors += 1;\n            }\n        }\n        let doc_batch = doc_batch_builder.build();\n        let current_span = RuntimeSpan::current();\n        current_span.record(\"num_spans\", num_spans);\n        current_span.record(\"num_bytes\", doc_batch.num_bytes());\n        current_span.record(\"num_parse_errors\", num_parse_errors);\n\n        let parsed_spans = ParsedSpans {\n            doc_batch,\n            num_spans,\n            num_parse_errors,\n            error_message,\n        };\n        Ok(parsed_spans)\n    }\n\n    #[instrument(skip_all, fields(num_bytes = doc_batch.num_bytes()))]\n    async fn store_spans(\n        &mut self,\n        index_id: String,\n        doc_batch: DocBatchV2,\n    ) -> Result<(), tonic::Status> {\n        ingest_doc_batch_v2(\n            self.ingest_router.clone(),\n            index_id,\n            doc_batch,\n            self.commit_type,\n        )\n        .await?;\n        Ok(())\n    }\n\n    async fn export_instrumented(\n        &mut self,\n        request: ExportTraceServiceRequest,\n        index_id: IndexId,\n    ) -> Result<ExportTraceServiceResponse, Status> {\n        let start = std::time::Instant::now();\n\n        let labels = [\"trace\", &index_id, \"grpc\", \"protobuf\"];\n\n        OTLP_SERVICE_METRICS\n            .requests_total\n            
.with_label_values(labels)\n            .inc();\n        let (export_res, is_error) =\n            match self.export_inner(request, index_id.clone(), labels).await {\n                ok @ Ok(_) => (ok, \"false\"),\n                err @ Err(_) => {\n                    OTLP_SERVICE_METRICS\n                        .request_errors_total\n                        .with_label_values(labels)\n                        .inc();\n                    (err, \"true\")\n                }\n            };\n        let elapsed = start.elapsed().as_secs_f64();\n        let labels = [\"trace\", &index_id, \"grpc\", \"protobuf\", is_error];\n        OTLP_SERVICE_METRICS\n            .request_duration_seconds\n            .with_label_values(labels)\n            .observe(elapsed);\n\n        export_res\n    }\n}\n\n#[async_trait]\nimpl TraceService for OtlpGrpcTracesService {\n    #[instrument(name = \"ingest_spans\", skip_all)]\n    async fn export(\n        &self,\n        request: Request<ExportTraceServiceRequest>,\n    ) -> Result<Response<ExportTraceServiceResponse>, Status> {\n        let index_id = extract_otel_index_id_from_metadata(request.metadata(), OtelSignal::Traces)?;\n        let request = request.into_inner();\n        self.clone()\n            .export_instrumented(request, index_id)\n            .await\n            .map(Response::new)\n    }\n}\n\n/// An iterator of JSON OTLP spans for use in the doc processor.\npub struct JsonSpanIterator {\n    spans: std::vec::IntoIter<Span>,\n    current_span_idx: usize,\n    num_spans: usize,\n    avg_span_size: usize,\n    avg_span_size_rem: usize,\n}\n\nimpl JsonSpanIterator {\n    fn new(spans: Vec<Span>, num_bytes: usize) -> Self {\n        let num_spans = spans.len();\n        let avg_span_size = num_bytes.checked_div(num_spans).unwrap_or(0);\n        let avg_span_size_rem = avg_span_size + num_bytes.checked_rem(num_spans).unwrap_or(0);\n\n        Self {\n            spans: spans.into_iter(),\n            current_span_idx: 
0,\n            num_spans,\n            avg_span_size,\n            avg_span_size_rem,\n        }\n    }\n}\n\nimpl Iterator for JsonSpanIterator {\n    type Item = (JsonValue, usize);\n\n    fn next(&mut self) -> Option<Self::Item> {\n        let span_opt = self\n            .spans\n            .next()\n            .map(|span| serde_json::to_value(span).expect(\"`Span` should be JSON serializable\"));\n        if span_opt.is_some() {\n            self.current_span_idx += 1;\n        }\n        if self.current_span_idx < self.num_spans {\n            span_opt.map(|span| (span, self.avg_span_size))\n        } else {\n            span_opt.map(|span| (span, self.avg_span_size_rem))\n        }\n    }\n}\n\npub fn parse_otlp_spans_json(payload_json: &[u8]) -> Result<JsonSpanIterator, OtlpTracesError> {\n    let request: ExportTraceServiceRequest = serde_json::from_slice(payload_json)?;\n    let spans = parse_otlp_spans(request)?;\n    Ok(JsonSpanIterator::new(spans, payload_json.len()))\n}\n\npub fn parse_otlp_spans_protobuf(\n    payload_proto: &[u8],\n) -> Result<JsonSpanIterator, OtlpTracesError> {\n    let request = ExportTraceServiceRequest::decode(payload_proto)?;\n    let spans = parse_otlp_spans(request)?;\n    Ok(JsonSpanIterator::new(spans, payload_proto.len()))\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_metastore::{CreateIndexRequestExt, metastore_for_test};\n    use quickwit_proto::metastore::{CreateIndexRequest, MetastoreService};\n    use quickwit_proto::opentelemetry::proto::common::v1::any_value::Value as OtlpAnyValueValue;\n    use quickwit_proto::opentelemetry::proto::common::v1::{\n        AnyValue as OtlpAnyValue, KeyValue as OtlpKeyValue,\n    };\n    use quickwit_proto::opentelemetry::proto::trace::v1::span::{\n        Event as OtlpEvent, Link as OtlpLink,\n    };\n    use serde_json::json;\n\n    use super::*;\n\n    #[test]\n    fn test_index_config_is_valid() {\n        let index_config =\n            
OtlpGrpcTracesService::index_config(&Uri::for_test(\"ram:///indexes\")).unwrap();\n        assert_eq!(index_config.index_id, OTEL_TRACES_INDEX_ID);\n    }\n\n    #[tokio::test]\n    async fn test_create_index() {\n        let metastore = metastore_for_test();\n        let index_config =\n            OtlpGrpcTracesService::index_config(&Uri::for_test(\"ram:///indexes\")).unwrap();\n        let create_index_request =\n            CreateIndexRequest::try_from_index_config(&index_config).unwrap();\n        metastore.create_index(create_index_request).await.unwrap();\n    }\n\n    #[test]\n    fn test_resource_from_otlp() {\n        let otlp_resource = OtlpResource {\n            attributes: vec![\n                OtlpKeyValue {\n                    key: \"service.name\".to_string(),\n                    value: Some(OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\"quickwit\".to_string())),\n                    }),\n                },\n                OtlpKeyValue {\n                    key: \"key\".to_string(),\n                    value: Some(OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\"value\".to_string())),\n                    }),\n                },\n            ],\n            dropped_attributes_count: 1,\n        };\n        let resource = Resource::from_otlp(otlp_resource);\n        assert_eq!(\n            resource.attributes,\n            HashMap::from_iter([(\"key\".to_string(), json!(\"value\"))])\n        );\n        assert_eq!(resource.service_name, \"quickwit\");\n        assert_eq!(resource.dropped_attributes_count, 1);\n    }\n\n    #[test]\n    fn test_scope_from_otlp() {\n        let otlp_scope = InstrumentationScope {\n            name: \"opentelemetry-otlp\".to_string(),\n            version: \"0.11.0\".to_string(),\n            attributes: vec![OtlpKeyValue {\n                key: \"key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: 
Some(OtlpAnyValueValue::StringValue(\"value\".to_string())),\n                }),\n            }],\n            dropped_attributes_count: 1,\n        };\n        let scope = Scope::from_otlp(otlp_scope);\n        assert_eq!(scope.name.unwrap(), \"opentelemetry-otlp\");\n        assert_eq!(scope.version.unwrap(), \"0.11.0\");\n        assert_eq!(\n            scope.attributes,\n            HashMap::from_iter([(\"key\".to_string(), json!(\"value\"))])\n        );\n        assert_eq!(scope.dropped_attributes_count, 1);\n    }\n\n    #[test]\n    fn test_span_from_otlp() {\n        {\n            // Test minimal span.\n            let otlp_span = OtlpSpan {\n                trace_id: vec![1; 16],\n                span_id: vec![2; 8],\n                parent_span_id: vec![3; 8],\n                trace_state: \"\".to_string(),\n                name: \"publish_splits\".to_string(),\n                kind: 2, // Server\n                start_time_unix_nano: 1_000_000_001,\n                end_time_unix_nano: 1_001_000_002,\n                attributes: Vec::new(),\n                dropped_attributes_count: 3,\n                events: Vec::new(),\n                dropped_events_count: 4,\n                links: Vec::new(),\n                dropped_links_count: 5,\n                status: None,\n            };\n            let span = Span::from_otlp(otlp_span, &Resource::default(), &Scope::default()).unwrap();\n\n            assert_eq!(span.service_name, UNKNOWN_SERVICE);\n            assert!(span.resource_attributes.is_empty());\n            assert_eq!(span.resource_dropped_attributes_count, 0);\n\n            assert!(span.scope_name.is_none());\n            assert!(span.scope_version.is_none());\n            assert!(span.scope_attributes.is_empty());\n            assert_eq!(span.scope_dropped_attributes_count, 0);\n\n            assert_eq!(span.trace_id, TraceId::new([1; 16]));\n            assert!(span.trace_state.is_none());\n\n            assert_eq!(span.parent_span_id, 
Some(SpanId::new([3; 8])));\n            assert_eq!(span.span_id, SpanId::new([2; 8]));\n            assert_eq!(span.span_kind, 2);\n            assert_eq!(span.span_name, \"publish_splits\");\n            assert_eq!(\n                span.span_fingerprint.unwrap(),\n                SpanFingerprint::new(UNKNOWN_SERVICE, SpanKind(2), \"publish_splits\")\n            );\n            assert_eq!(span.span_start_timestamp_nanos, 1_000_000_001);\n            assert_eq!(span.span_end_timestamp_nanos, 1_001_000_002);\n            assert_eq!(span.span_duration_millis.unwrap(), 1);\n            assert!(span.span_attributes.is_empty());\n            assert_eq!(span.span_dropped_attributes_count, 3);\n            assert_eq!(span.span_status.code, OtlpStatusCode::Unset);\n\n            assert!(span.events.is_empty());\n            assert!(span.event_names.is_empty());\n            assert_eq!(span.span_dropped_events_count, 4);\n\n            assert!(span.links.is_empty());\n            assert_eq!(span.span_dropped_links_count, 5);\n        }\n        {\n            let resource = Resource {\n                service_name: \"quickwit\".to_string(),\n                attributes: HashMap::from_iter([(\n                    \"resource_key\".to_string(),\n                    json!(\"resource_value\"),\n                )]),\n                dropped_attributes_count: 1,\n            };\n            let scope = Scope {\n                name: Some(\"opentelemetry-otlp\".to_string()),\n                version: Some(\"0.11.0\".to_string()),\n                attributes: HashMap::from_iter([(\"scope_key\".to_string(), json!(\"scope_value\"))]),\n                dropped_attributes_count: 2,\n            };\n\n            let events = vec![OtlpEvent {\n                name: \"event_name\".to_string(),\n                time_unix_nano: 1_000_500_003,\n                attributes: vec![OtlpKeyValue {\n                    key: \"event_key\".to_string(),\n                    value: Some(OtlpAnyValue 
{\n                        value: Some(OtlpAnyValueValue::StringValue(\"event_value\".to_string())),\n                    }),\n                }],\n                dropped_attributes_count: 6,\n            }];\n            let links = vec![OtlpLink {\n                trace_id: vec![4; 16],\n                span_id: vec![5; 8],\n                trace_state: \"link_key1=link_value1,link_key2=link_value2\".to_string(),\n                attributes: vec![OtlpKeyValue {\n                    key: \"link_key\".to_string(),\n                    value: Some(OtlpAnyValue {\n                        value: Some(OtlpAnyValueValue::StringValue(\"link_value\".to_string())),\n                    }),\n                }],\n                dropped_attributes_count: 7,\n            }];\n            let attributes = vec![OtlpKeyValue {\n                key: \"span_key\".to_string(),\n                value: Some(OtlpAnyValue {\n                    value: Some(OtlpAnyValueValue::StringValue(\"span_value\".to_string())),\n                }),\n            }];\n            let otlp_span = OtlpSpan {\n                trace_id: vec![1; 16],\n                span_id: vec![2; 8],\n                parent_span_id: vec![3; 8],\n                trace_state: \"key1=value1,key2=value2\".to_string(),\n                name: \"publish_splits\".to_string(),\n                kind: 2, // Server\n                start_time_unix_nano: 1_000_000_001,\n                end_time_unix_nano: 1_001_000_002,\n                attributes,\n                dropped_attributes_count: 3,\n                events,\n                dropped_events_count: 4,\n                links,\n                dropped_links_count: 5,\n                status: Some(OtlpStatus {\n                    code: 2,\n                    message: \"An error occurred.\".to_string(),\n                }),\n            };\n            let span = Span::from_otlp(otlp_span, &resource, &scope).unwrap();\n\n            assert_eq!(span.service_name, 
\"quickwit\");\n            assert_eq!(\n                span.resource_attributes,\n                HashMap::from_iter([(\"resource_key\".to_string(), json!(\"resource_value\"))],)\n            );\n            assert_eq!(span.resource_dropped_attributes_count, 1);\n\n            assert_eq!(span.scope_name.unwrap(), \"opentelemetry-otlp\");\n            assert_eq!(span.scope_version.unwrap(), \"0.11.0\");\n            assert_eq!(\n                span.scope_attributes,\n                HashMap::from_iter([(\"scope_key\".to_string(), json!(\"scope_value\"))])\n            );\n            assert_eq!(span.scope_dropped_attributes_count, 2);\n\n            assert_eq!(span.trace_id, TraceId::new([1; 16]));\n            assert_eq!(span.trace_state.unwrap(), \"key1=value1,key2=value2\");\n\n            assert_eq!(span.parent_span_id, Some(SpanId::new([3; 8])));\n            assert_eq!(span.span_id, SpanId::new([2; 8]));\n            assert_eq!(span.span_kind, 2);\n            assert_eq!(span.span_name, \"publish_splits\");\n            assert_eq!(\n                span.span_fingerprint.unwrap(),\n                SpanFingerprint::new(\"quickwit\", SpanKind(2), \"publish_splits\")\n            );\n            assert_eq!(span.span_start_timestamp_nanos, 1_000_000_001);\n            assert_eq!(span.span_end_timestamp_nanos, 1_001_000_002);\n            assert_eq!(span.span_duration_millis.unwrap(), 1);\n            assert_eq!(\n                span.span_attributes,\n                HashMap::from_iter([(\"span_key\".to_string(), json!(\"span_value\"))])\n            );\n            assert_eq!(span.span_dropped_attributes_count, 3);\n            assert_eq!(span.span_status.code, OtlpStatusCode::Error);\n            assert_eq!(span.span_status.message.unwrap(), \"An error occurred.\");\n\n            assert_eq!(\n                span.events,\n                vec![Event {\n                    event_name: \"event_name\".to_string(),\n                    event_timestamp_nanos: 
1_000_500_003,\n                    event_attributes: HashMap::from_iter([(\n                        \"event_key\".to_string(),\n                        json!(\"event_value\")\n                    )]),\n                    event_dropped_attributes_count: 6,\n                }]\n            );\n            assert_eq!(span.event_names, vec![\"event_name\".to_string()]);\n            assert_eq!(span.span_dropped_events_count, 4);\n\n            assert_eq!(\n                span.links,\n                vec![Link {\n                    link_trace_id: TraceId::new([4; 16]),\n                    link_span_id: SpanId::new([5; 8]),\n                    link_trace_state: Some(\n                        \"link_key1=link_value1,link_key2=link_value2\".to_string()\n                    ),\n                    link_attributes: HashMap::from_iter([(\n                        \"link_key\".to_string(),\n                        json!(\"link_value\")\n                    )]),\n                    link_dropped_attributes_count: 7,\n                }]\n            );\n            assert_eq!(span.span_dropped_links_count, 5);\n        }\n    }\n\n    #[test]\n    fn test_span_fingerprint() {\n        let span_fingerprint = SpanFingerprint::new(\"quickwit\", SpanKind(2), \"publish_splits\");\n        assert_eq!(\n            span_fingerprint.as_str(),\n            \"quickwit\\u{0}2\\u{0}publish_splits\"\n        );\n\n        let start_key_opt = SpanFingerprint::start_key(\"\", None);\n        assert!(start_key_opt.is_none());\n\n        let start_key = SpanFingerprint::start_key(\"quickwit\", None)\n            .map(String::from_utf8)\n            .unwrap()\n            .unwrap();\n        assert_eq!(start_key, \"quickwit\\u{0}\");\n        let end_key = SpanFingerprint::end_key(\"quickwit\", None)\n            .map(String::from_utf8)\n            .unwrap()\n            .unwrap();\n        assert_eq!(end_key, \"quickwit\\u{1}\");\n\n        let start_key = 
SpanFingerprint::start_key(\"quickwit\", Some(SpanKind::from(1)))\n            .map(String::from_utf8)\n            .unwrap()\n            .unwrap();\n        assert_eq!(start_key, \"quickwit\\u{0}1\\u{0}\");\n        let end_key = SpanFingerprint::end_key(\"quickwit\", Some(SpanKind::from(1)))\n            .map(String::from_utf8)\n            .unwrap()\n            .unwrap();\n        assert_eq!(end_key, \"quickwit\\u{0}1\\u{1}\");\n    }\n\n    #[test]\n    fn test_span_status_from_otlp() {\n        let otlp_status = OtlpStatus {\n            code: 0,\n            message: \"\".to_string(),\n        };\n        assert!(SpanStatus::from_otlp(otlp_status).is_unset());\n\n        let otlp_status = OtlpStatus {\n            code: 1,\n            message: \"\".to_string(),\n        };\n        let span_status = SpanStatus::from_otlp(otlp_status);\n        assert_eq!(span_status.code, OtlpStatusCode::Ok);\n        assert!(span_status.message.is_none());\n\n        let otlp_status = OtlpStatus {\n            code: 2,\n            message: \"An error occurred.\".to_string(),\n        };\n        let span_status = SpanStatus::from_otlp(otlp_status);\n        assert_eq!(span_status.code, OtlpStatusCode::Error);\n        assert_eq!(span_status.message.unwrap(), \"An error occurred.\");\n    }\n\n    #[test]\n    fn test_span_serde() {\n        {\n            let expected_span = Span {\n                trace_id: TraceId::new([1; 16]),\n                trace_state: None,\n                service_name: \"quickwit\".to_string(),\n                resource_attributes: HashMap::new(),\n                resource_dropped_attributes_count: 0,\n                scope_name: None,\n                scope_version: None,\n                scope_attributes: HashMap::new(),\n                scope_dropped_attributes_count: 0,\n                span_id: SpanId::new([2; 8]),\n                span_kind: 0,\n                span_name: \"publish_splits\".to_string(),\n                span_fingerprint: 
Some(SpanFingerprint::new(\n                    \"quickwit\",\n                    SpanKind(2),\n                    \"publish_splits\",\n                )),\n                span_start_timestamp_nanos: 0,\n                span_end_timestamp_nanos: 1_000,\n                span_duration_millis: Some(1),\n                span_attributes: HashMap::new(),\n                span_dropped_attributes_count: 0,\n                span_dropped_events_count: 0,\n                span_dropped_links_count: 0,\n                span_status: SpanStatus::default(),\n                parent_span_id: None,\n                is_root: Some(true),\n                events: Vec::new(),\n                event_names: Vec::new(),\n                links: Vec::new(),\n            };\n            let span_json = serde_json::to_string_pretty(&expected_span).unwrap();\n            let span = serde_json::from_str::<Span>(&span_json).unwrap();\n            assert_eq!(span, expected_span);\n        }\n        {\n            let expected_span = Span {\n                trace_id: TraceId::new([1; 16]),\n                trace_state: Some(\"key1=value1,key2=value2\".to_string()),\n                service_name: \"quickwit\".to_string(),\n                resource_attributes: HashMap::from([(\n                    \"resource_key\".to_string(),\n                    json!(\"resource_value\"),\n                )]),\n                resource_dropped_attributes_count: 1,\n                scope_name: Some(\"scope_name\".to_string()),\n                scope_version: Some(\"scope_version\".to_string()),\n                scope_attributes: HashMap::from([(\"scope_key\".to_string(), json!(\"scope_value\"))]),\n                scope_dropped_attributes_count: 1,\n                span_id: SpanId::new([2; 8]),\n                span_kind: 1,\n                span_name: \"publish_splits\".to_string(),\n                span_fingerprint: Some(SpanFingerprint::new(\n                    \"quickwit\",\n                    
SpanKind(2),\n                    \"publish_splits\",\n                )),\n                span_start_timestamp_nanos: 0,\n                span_end_timestamp_nanos: 1_000,\n                span_duration_millis: Some(1),\n                span_attributes: HashMap::from([(\"span_key\".to_string(), json!(\"span_value\"))]),\n                span_dropped_attributes_count: 1,\n                span_dropped_events_count: 1,\n                span_dropped_links_count: 1,\n                span_status: SpanStatus {\n                    code: OtlpStatusCode::Ok,\n                    message: None,\n                },\n                parent_span_id: Some(SpanId::new([3; 8])),\n                is_root: Some(false),\n                events: vec![Event {\n                    event_timestamp_nanos: 1,\n                    event_name: \"event_name\".to_string(),\n                    event_attributes: HashMap::new(),\n                    event_dropped_attributes_count: 0,\n                }],\n                event_names: vec![\"event_name\".to_string()],\n                links: vec![Link {\n                    link_trace_id: TraceId::new([1; 16]),\n                    link_span_id: SpanId::new([4; 8]),\n                    link_trace_state: None,\n                    link_attributes: HashMap::new(),\n                    link_dropped_attributes_count: 0,\n                }],\n            };\n            let span_json = serde_json::to_string_pretty(&expected_span).unwrap();\n            let span = serde_json::from_str::<Span>(&span_json).unwrap();\n            assert_eq!(span, expected_span);\n        }\n    }\n\n    #[test]\n    fn test_json_span_iterator() {\n        let mut json_span_iterator = JsonSpanIterator::new(Vec::new(), 0);\n        assert!(json_span_iterator.next().is_none());\n\n        let span_0 = Span {\n            trace_id: TraceId::new([1; 16]),\n            trace_state: None,\n            service_name: \"quickwit\".to_string(),\n            resource_attributes: 
HashMap::new(),\n            resource_dropped_attributes_count: 0,\n            scope_name: None,\n            scope_version: None,\n            scope_attributes: HashMap::new(),\n            scope_dropped_attributes_count: 0,\n            span_id: SpanId::new([2; 8]),\n            span_kind: 0,\n            span_name: \"publish_splits\".to_string(),\n            span_fingerprint: Some(SpanFingerprint::new(\n                \"quickwit\",\n                SpanKind(2),\n                \"publish_splits\",\n            )),\n            span_start_timestamp_nanos: 1_000_000_001,\n            span_end_timestamp_nanos: 1_000_000_002,\n            span_duration_millis: Some(1),\n            span_attributes: HashMap::new(),\n            span_dropped_attributes_count: 0,\n            span_dropped_events_count: 0,\n            span_dropped_links_count: 0,\n            span_status: SpanStatus::default(),\n            parent_span_id: None,\n            is_root: Some(true),\n            events: Vec::new(),\n            event_names: Vec::new(),\n            links: Vec::new(),\n        };\n\n        let spans = vec![span_0.clone()];\n        let mut json_span_iterator = JsonSpanIterator::new(spans, 3);\n\n        assert_eq!(\n            json_span_iterator.next(),\n            Some((serde_json::to_value(&span_0).unwrap(), 3))\n        );\n        assert!(json_span_iterator.next().is_none());\n\n        let mut span_1 = span_0.clone();\n        span_1.span_id = SpanId::new([3; 8]);\n\n        let spans = vec![span_0.clone(), span_1.clone()];\n        let mut json_span_iterator = JsonSpanIterator::new(spans, 7);\n\n        assert_eq!(\n            json_span_iterator.next(),\n            Some((serde_json::to_value(&span_0).unwrap(), 3))\n        );\n        assert_eq!(\n            json_span_iterator.next(),\n            Some((serde_json::to_value(&span_1).unwrap(), 4))\n        );\n        assert!(json_span_iterator.next().is_none());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/.gitignore",
    "content": "gogoproto.rs\ngoogle.protobuf.rs\n"
  },
  {
    "path": "quickwit/quickwit-proto/Cargo.toml",
    "content": "[package]\nname = \"quickwit-proto\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nbytestring = { workspace = true }\nfutures = { workspace = true, optional = true }\nhex = { workspace = true  }\nhttp = { workspace = true }\nmockall = { workspace = true, optional = true }\nopentelemetry = { workspace = true }\nprost = { workspace = true }\nprost-types = { workspace = true }\nsea-query = { workspace = true, optional = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nsqlx = { workspace = true, optional = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntonic = { workspace = true, features = [\n    \"tls-native-roots\",\n    \"server\",\n    \"channel\",\n] }\ntonic-prost = { workspace = true }\ntower = { workspace = true }\ntracing = { workspace = true }\ntracing-opentelemetry = { workspace = true }\nulid = { workspace = true }\nutoipa = { workspace = true }\nzstd = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-common = { workspace = true }\n\n[dev-dependencies]\nfutures = { workspace = true }\nmockall = { workspace = true }\n\n[build-dependencies]\nglob = \"0.3\"\nprost-build = { workspace = true }\ntonic-build = { workspace = true }\ntonic-prost-build = { workspace = true }\n\nquickwit-codegen = { workspace = true }\n\n[features]\npostgres = [\"sea-query\", \"sqlx\"]\ntestsuite = [\"mockall\", \"futures\"]\n"
  },
  {
    "path": "quickwit/quickwit-proto/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::PathBuf;\n\nuse glob::glob;\nuse quickwit_codegen::Codegen;\n\nfn main() -> Result<(), Box<dyn std::error::Error>> {\n    // Prost + tonic + Quickwit codegen for control plane, indexing, metastore, ingest and search\n    // services.\n    //\n    // Cluster service.\n    let mut prost_config = prost_build::Config::default();\n    prost_config.file_descriptor_set_path(\"src/codegen/quickwit/cluster_descriptor.bin\");\n\n    Codegen::builder()\n        .with_prost_config(prost_config)\n        .with_protos(&[\"protos/quickwit/cluster.proto\"])\n        .with_output_dir(\"src/codegen/quickwit\")\n        .with_result_type_path(\"crate::cluster::ClusterResult\")\n        .with_error_type_path(\"crate::cluster::ClusterError\")\n        .generate_rpc_name_impls()\n        .run()\n        .unwrap();\n\n    // Control plane.\n    let mut prost_config = prost_build::Config::default();\n    prost_config.file_descriptor_set_path(\"src/codegen/quickwit/control_plane_descriptor.bin\");\n\n    prost_config\n        .extern_path(\n            \".quickwit.common.DocMappingUid\",\n            \"crate::types::DocMappingUid\",\n        )\n        .extern_path(\".quickwit.common.IndexUid\", \"crate::types::IndexUid\");\n\n    Codegen::builder()\n        .with_prost_config(prost_config)\n        
.with_protos(&[\"protos/quickwit/control_plane.proto\"])\n        .with_includes(&[\"protos\"])\n        .with_output_dir(\"src/codegen/quickwit\")\n        .with_result_type_path(\"crate::control_plane::ControlPlaneResult\")\n        .with_error_type_path(\"crate::control_plane::ControlPlaneError\")\n        .run()\n        .unwrap();\n\n    // Developer service.\n    let mut prost_config = prost_build::Config::default();\n    prost_config\n        .bytes([\"GetDebugInfoResponse.debug_info_json\"])\n        .file_descriptor_set_path(\"src/codegen/quickwit/developer_descriptor.bin\");\n\n    Codegen::builder()\n        .with_prost_config(prost_config)\n        .with_protos(&[\"protos/quickwit/developer.proto\"])\n        .with_output_dir(\"src/codegen/quickwit\")\n        .with_result_type_path(\"crate::developer::DeveloperResult\")\n        .with_error_type_path(\"crate::developer::DeveloperError\")\n        .generate_rpc_name_impls()\n        .run()\n        .unwrap();\n\n    // Indexing Service.\n    let mut prost_config = prost_build::Config::default();\n    prost_config\n        .extern_path(\n            \".quickwit.indexing.PipelineUid\",\n            \"crate::types::PipelineUid\",\n        )\n        .extern_path(\".quickwit.common.IndexUid\", \"crate::types::IndexUid\")\n        .extern_path(\".quickwit.ingest.ShardId\", \"crate::types::ShardId\")\n        .file_descriptor_set_path(\"src/codegen/quickwit/indexing_descriptor.bin\");\n\n    Codegen::builder()\n        .with_prost_config(prost_config)\n        .with_protos(&[\"protos/quickwit/indexing.proto\"])\n        .with_includes(&[\"protos\"])\n        .with_output_dir(\"src/codegen/quickwit\")\n        .with_result_type_path(\"crate::indexing::IndexingResult\")\n        .with_error_type_path(\"crate::indexing::IndexingError\")\n        .run()\n        .unwrap();\n\n    // Metastore service.\n    let mut prost_config = prost_build::Config::default();\n    prost_config\n        .bytes([\n            
\"IndexesMetadataResponse.indexes_metadata_json_zstd\",\n            \"ListIndexesMetadataResponse.indexes_metadata_json_zstd\",\n        ])\n        .extern_path(\n            \".quickwit.common.DocMappingUid\",\n            \"crate::types::DocMappingUid\",\n        )\n        .extern_path(\".quickwit.common.IndexUid\", \"crate::types::IndexUid\")\n        .extern_path(\".quickwit.ingest.ShardId\", \"crate::types::ShardId\")\n        .field_attribute(\"DeleteQuery.index_uid\", \"#[schema(value_type = String)]\")\n        .field_attribute(\"DeleteQuery.index_uid\", \"#[serde(alias = \\\"index_id\\\")]\")\n        .field_attribute(\"DeleteQuery.query_ast\", \"#[serde(alias = \\\"query\\\")]\")\n        .field_attribute(\n            \"DeleteQuery.start_timestamp\",\n            \"#[serde(skip_serializing_if = \\\"Option::is_none\\\")]\",\n        )\n        .field_attribute(\n            \"DeleteQuery.end_timestamp\",\n            \"#[serde(skip_serializing_if = \\\"Option::is_none\\\")]\",\n        )\n        .file_descriptor_set_path(\"src/codegen/quickwit/metastore_descriptor.bin\");\n\n    Codegen::builder()\n        .with_prost_config(prost_config)\n        .with_protos(&[\"protos/quickwit/metastore.proto\"])\n        .with_includes(&[\"protos\"])\n        .with_output_dir(\"src/codegen/quickwit\")\n        .with_result_type_path(\"crate::metastore::MetastoreResult\")\n        .with_error_type_path(\"crate::metastore::MetastoreError\")\n        .generate_extra_service_methods()\n        .generate_rpc_name_impls()\n        .run()\n        .unwrap();\n\n    // Ingest service (metastore service proto should be generated before ingest).\n    let mut prost_config = prost_build::Config::default();\n    prost_config\n        .bytes([\n            \"DocBatchV2.doc_buffer\",\n            \"MRecordBatch.mrecord_buffer\",\n            \"Position.position\",\n        ])\n        .extern_path(\n            \".quickwit.common.DocMappingUid\",\n            
\"crate::types::DocMappingUid\",\n        )\n        .extern_path(\".quickwit.common.DocUid\", \"crate::types::DocUid\")\n        .extern_path(\".quickwit.common.IndexUid\", \"crate::types::IndexUid\")\n        .extern_path(\".quickwit.ingest.Position\", \"crate::types::Position\")\n        .extern_path(\".quickwit.ingest.ShardId\", \"crate::types::ShardId\")\n        .field_attribute(\n            \"Shard.follower_id\",\n            \"#[serde(default, skip_serializing_if = \\\"Option::is_none\\\")]\",\n        )\n        .field_attribute(\n            \"Shard.publish_position_inclusive\",\n            \"#[serde(default, skip_serializing_if = \\\"Option::is_none\\\")]\",\n        )\n        .field_attribute(\n            \"Shard.publish_token\",\n            \"#[serde(default, skip_serializing_if = \\\"Option::is_none\\\")]\",\n        )\n        .field_attribute(\n            \"Shard.replication_position_inclusive\",\n            \"#[serde(default, skip_serializing_if = \\\"Option::is_none\\\")]\",\n        )\n        .field_attribute(\n            \"Shard.update_timestamp\",\n            \"#[serde(default = \\\"super::compatibility_shard_update_timestamp\\\")]\",\n        )\n        .file_descriptor_set_path(\"src/codegen/quickwit/ingest_descriptor.bin\");\n\n    Codegen::builder()\n        .with_prost_config(prost_config)\n        .with_protos(&[\n            \"protos/quickwit/ingester.proto\",\n            \"protos/quickwit/router.proto\",\n        ])\n        .with_includes(&[\"protos\"])\n        .with_output_dir(\"src/codegen/quickwit\")\n        .with_result_type_path(\"crate::ingest::IngestV2Result\")\n        .with_error_type_path(\"crate::ingest::IngestV2Error\")\n        .generate_rpc_name_impls()\n        .run()\n        .unwrap();\n\n    // Search service.\n    let mut prost_config = prost_build::Config::default();\n    prost_config\n        .file_descriptor_set_path(\"src/codegen/quickwit/search_descriptor.bin\")\n        
.protoc_arg(\"--experimental_allow_proto3_optional\");\n\n    tonic_prost_build::configure()\n        .enum_attribute(\".\", \"#[serde(rename_all=\\\"snake_case\\\")]\")\n        .type_attribute(\n            \".\",\n            \"#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\",\n        )\n        .type_attribute(\"PartialHit.sort_value\", \"#[derive(Copy)]\")\n        .type_attribute(\"SortByValue\", \"#[derive(Ord, PartialOrd)]\")\n        .type_attribute(\"SearchRequest\", \"#[derive(Hash, Eq)]\")\n        .type_attribute(\"PartialHit\", \"#[derive(Hash, Eq)]\")\n        .out_dir(\"src/codegen/quickwit\")\n        .compile_with_config(\n            prost_config,\n            &[std::path::PathBuf::from(\"protos/quickwit/search.proto\")],\n            &[std::path::PathBuf::from(\"protos\")],\n        )?;\n\n    // Jaeger proto\n    let protos = find_protos(\"protos/third-party/jaeger\");\n\n    let mut prost_config = prost_build::Config::default();\n    prost_config.type_attribute(\"Operation\", \"#[derive(Ord, PartialOrd)]\");\n\n    tonic_prost_build::configure()\n        .out_dir(\"src/codegen/jaeger\")\n        .compile_with_config(\n            prost_config,\n            &protos,\n            &[\n                std::path::PathBuf::from(\"protos/third-party/jaeger\"),\n                std::path::PathBuf::from(\"protos/third-party\"),\n            ],\n        )?;\n\n    // OTEL proto\n    let mut prost_config = prost_build::Config::default();\n    prost_config.protoc_arg(\"--experimental_allow_proto3_optional\");\n\n    let protos = find_protos(\"protos/third-party/opentelemetry\");\n    tonic_prost_build::configure()\n        .type_attribute(\".\", \"#[derive(serde::Serialize, serde::Deserialize)]\")\n        .type_attribute(\"StatusCode\", r#\"#[serde(rename_all = \"snake_case\")]\"#)\n        .type_attribute(\n            \"ExportLogsServiceResponse\",\n            r#\"#[derive(utoipa::ToSchema)]\"#,\n        )\n        
.out_dir(\"src/codegen/opentelemetry\")\n        .compile_with_config(\n            prost_config,\n            &protos,\n            &[std::path::PathBuf::from(\"protos/third-party\")],\n        )?;\n    Ok(())\n}\n\nfn find_protos(dir_path: &str) -> Vec<PathBuf> {\n    glob(&format!(\"{dir_path}/**/*.proto\"))\n        .unwrap()\n        .flatten()\n        .collect()\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/cluster.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.cluster;\n\nmessage ChitchatId {\n  string node_id = 1;\n  uint64 generation_id = 2;\n  string gossip_advertise_addr = 3;\n}\n\nenum DeletionStatus {\n  Set = 0;\n  Deleted = 1;\n  DeleteAfterTtl = 2;\n}\n\nmessage VersionedKeyValue {\n  string key = 1;\n  string value = 2;\n  uint64 version = 3;\n  DeletionStatus status = 4;\n}\n\nmessage NodeState {\n  ChitchatId chitchat_id = 1;\n  repeated VersionedKeyValue key_values = 2;\n  uint64 max_version = 3;\n  uint64 last_gc_version = 4;\n}\n\nservice ClusterService {\n  rpc FetchClusterState(FetchClusterStateRequest) returns (FetchClusterStateResponse);\n}\n\nmessage FetchClusterStateRequest {\n  string cluster_id = 1;\n}\n\nmessage FetchClusterStateResponse {\n  string cluster_id = 1;\n  repeated NodeState node_states = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/common.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.common;\n\n// The corresponding Rust struct [`crate::types::DocMappingUid`] is defined manually and\n// externally provided during code generation (see `build.rs`).\n//\n// Modify at your own risk.\nmessage DocMappingUid {\n  // ULID encoded as a sequence of 16 bytes (big-endian u128).\n  bytes doc_mapping_uid = 1;\n}\n\n// The corresponding Rust struct [`crate::types::DocUid`] is defined manually and\n// externally provided during code generation (see `build.rs`).\n//\n// Modify at your own risk.\nmessage DocUid {\n  // ULID encoded as a sequence of 16 bytes (big-endian u128).\n  bytes doc_uid = 1;\n}\n\n// The corresponding Rust struct [`crate::types::IndexUid`] is defined manually and\n// externally provided during code generation (see `build.rs`).\n//\n// Modify at your own risk.\nmessage IndexUid {\n  string index_id = 1;\n  // ULID encoded as a sequence of 16 bytes (big-endian u128).\n  bytes incarnation_id = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/control_plane.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.control_plane;\n\nimport \"quickwit/common.proto\";\nimport \"quickwit/indexing.proto\";\nimport \"quickwit/ingest.proto\";\nimport \"quickwit/metastore.proto\";\n\nservice ControlPlaneService {\n  // The control plane acts as a proxy for the metastore for a subset of the API so it can track the state of the\n  // metastore accurately and react to events in real-time.\n\n  // The following RPCs are forwarded and handled by the metastore:\n  // - `create_index`\n  // - `update_index`\n  // - `delete_index`\n  // - `add_source`\n  // - `toggle_source`\n  // - `delete_source`\n\n  // Index API\n\n  // Creates a new index.\n  rpc CreateIndex(quickwit.metastore.CreateIndexRequest) returns (quickwit.metastore.CreateIndexResponse);\n\n  // Updates an index.\n  rpc UpdateIndex(quickwit.metastore.UpdateIndexRequest) returns (quickwit.metastore.IndexMetadataResponse);\n\n  // Deletes an index.\n  rpc DeleteIndex(quickwit.metastore.DeleteIndexRequest) returns (quickwit.metastore.EmptyResponse);\n\n  // Source API\n\n  // Adds a source to an index.\n  rpc AddSource(quickwit.metastore.AddSourceRequest) returns (quickwit.metastore.EmptyResponse);\n\n  // Updates a source.\n  rpc UpdateSource(quickwit.metastore.UpdateSourceRequest) returns (quickwit.metastore.EmptyResponse);\n\n  // Enables or disables a source.\n  rpc 
ToggleSource(quickwit.metastore.ToggleSourceRequest) returns (quickwit.metastore.EmptyResponse);\n\n  // Removes a source from an index.\n  rpc DeleteSource(quickwit.metastore.DeleteSourceRequest) returns (quickwit.metastore.EmptyResponse);\n\n  // Shard API\n\n  // Returns the list of open shards for one or several sources. If the control plane is not able to find any\n  // for a source, it will pick a pair of leader-follower ingesters and will open a new shard.\n  rpc GetOrCreateOpenShards(GetOrCreateOpenShardsRequest) returns (GetOrCreateOpenShardsResponse);\n\n  // Asks the control plane whether the shards listed in the request should be deleted or truncated.\n  rpc AdviseResetShards(AdviseResetShardsRequest) returns (AdviseResetShardsResponse);\n\n  // Performs a debounced shard pruning request to the metastore.\n  rpc PruneShards(quickwit.metastore.PruneShardsRequest) returns (quickwit.metastore.EmptyResponse);\n}\n\n// Shard API\n\nmessage GetOrCreateOpenShardsRequest {\n  // There should be at most one subrequest per index per request.\n  repeated GetOrCreateOpenShardsSubrequest subrequests = 1;\n  repeated quickwit.ingest.ShardIds closed_shards = 2;\n  // The control plane should return shards that are not present on the supplied leaders.\n  //\n  // The control plane does not change the status of those leaders just from this signal.\n  // It will check the status of its own ingester pool.\n  repeated string unavailable_leaders = 3;\n}\n\nmessage GetOrCreateOpenShardsSubrequest {\n  uint32 subrequest_id = 1;\n  string index_id = 2;\n  string source_id = 3;\n}\n\nmessage GetOrCreateOpenShardsResponse {\n  repeated GetOrCreateOpenShardsSuccess successes = 1;\n  repeated GetOrCreateOpenShardsFailure failures = 2;\n}\n\nmessage GetOrCreateOpenShardsSuccess {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  repeated quickwit.ingest.Shard open_shards = 4;\n}\n\nenum GetOrCreateOpenShardsFailureReason {\n  
GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_UNSPECIFIED = 0;\n  GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_INDEX_NOT_FOUND = 1;\n  GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_SOURCE_NOT_FOUND = 2;\n  GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_NO_INGESTERS_AVAILABLE = 3;\n}\n\nmessage GetOrCreateOpenShardsFailure {\n  uint32 subrequest_id = 1;\n  string index_id = 2;\n  string source_id = 3;\n  GetOrCreateOpenShardsFailureReason reason = 4;\n}\n\nmessage AdviseResetShardsRequest {\n  repeated quickwit.ingest.ShardIds shard_ids = 1;\n  string ingester_id = 2;\n}\n\nmessage AdviseResetShardsResponse {\n  repeated quickwit.ingest.ShardIds shards_to_delete = 1;\n  repeated quickwit.ingest.ShardIdPositions shards_to_truncate = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/developer.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.developer;\n\nservice DeveloperService {\n  rpc GetDebugInfo(GetDebugInfoRequest) returns (GetDebugInfoResponse);\n\n  // rpc SetLogLevel(SetLogLevelRequest) returns (SetLogLevelResponse);\n}\n\nmessage GetDebugInfoRequest {\n  // Restricts the debug info to the given roles.\n  repeated string roles = 1;\n}\n\nmessage GetDebugInfoResponse {\n  bytes debug_info_json = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/indexing.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.indexing;\n\nimport \"quickwit/common.proto\";\nimport \"quickwit/ingest.proto\";\n\nservice IndexingService {\n  // Apply an indexing plan on the node.\n  rpc ApplyIndexingPlan(ApplyIndexingPlanRequest) returns (ApplyIndexingPlanResponse);\n}\n\nmessage ApplyIndexingPlanRequest {\n  repeated IndexingTask indexing_tasks = 1;\n}\n\nmessage PipelineUid {\n  bytes pipeline_uid = 1;\n}\n\nmessage IndexingTask {\n  // The tasks's index UID.\n  quickwit.common.IndexUid index_uid = 1;\n  // The task's source ID.\n  string source_id = 2;\n  // pipeline id\n  PipelineUid pipeline_uid = 4;\n  // The shards assigned to the indexer.\n  repeated quickwit.ingest.ShardId shard_ids = 3;\n  // Fingerprint of the pipeline parameters. Anything that should cause a pipeline restart (such\n  // as updating indexing settings, the doc mapping or the source) should influence this value.\n  uint64 params_fingerprint = 6;\n}\n\nmessage ApplyIndexingPlanResponse {}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/ingest.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.ingest;\n\nimport \"quickwit/common.proto\";\n\n// The corresponding Rust struct [`crate::types::Position`] is defined manually and\n// externally provided during code generation (see `build.rs`).\n//\n// Modify at your own risk.\nmessage Position {\n  bytes position = 1;\n}\n\n// The corresponding Rust struct [`crate::types::ShardId`] is defined manually and\n// externally provided during code generation (see `build.rs`).\n//\n// Modify at your own risk.\nmessage ShardId {\n  bytes shard_id = 1;\n}\n\n// Shard primary key.\nmessage ShardPKey {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  ShardId shard_id = 3;\n}\n\nenum CommitTypeV2 {\n  COMMIT_TYPE_V2_UNSPECIFIED = 0;\n  COMMIT_TYPE_V2_AUTO = 1;\n  COMMIT_TYPE_V2_WAIT_FOR = 2;\n  COMMIT_TYPE_V2_FORCE = 3;\n}\n\nmessage DocBatchV2 {\n  bytes doc_buffer = 1;\n  repeated uint32 doc_lengths = 2;\n  repeated quickwit.common.DocUid doc_uids = 3;\n}\n\nmessage MRecordBatch {\n  // Buffer of encoded and then concatenated mrecords.\n  bytes mrecord_buffer = 1;\n  // Lengths of the mrecords in the buffer.\n  repeated uint32 mrecord_lengths = 2;\n}\n\nenum ShardState {\n  SHARD_STATE_UNSPECIFIED = 0;\n  // The shard is open and accepts write requests.\n  SHARD_STATE_OPEN = 1;\n  // The ingester hosting the shard is unavailable.\n  
SHARD_STATE_UNAVAILABLE = 2;\n  // The shard is closed and cannot be written to.\n  // It can be safely deleted if the publish position is superior or equal to `~eof`.\n  SHARD_STATE_CLOSED = 3;\n}\n\nmessage Shard {\n  // Immutable fields\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  ShardId shard_id = 3;\n  // The node ID of the ingester to which all the write requests for this shard should be sent to.\n  string leader_id = 4;\n  // The node ID of the ingester holding a copy of the data.\n  optional string follower_id = 5;\n\n  // Mutable fields\n  ShardState shard_state = 8;\n  // Position up to which indexers have indexed and published the records stored in the shard.\n  // It is updated asynchronously in a best effort manner by the indexers and indicates the position up to which the log can be safely truncated.\n  Position publish_position_inclusive = 9;\n  // A publish token that ensures only one indexer works on a given shard at a time.\n  // For instance, if an indexer goes rogue, eventually the control plane will detect it and assign the shard to another indexer, which will override the publish token.\n  optional string publish_token = 10;\n\n  // The UID of the index doc mapping when the shard was created.\n  quickwit.common.DocMappingUid doc_mapping_uid = 11;\n\n  // Time when the shard was last updated\n  int64 update_timestamp = 12;\n}\n\n// A group of shards belonging to the same index and source.\nmessage ShardIds {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  repeated ShardId shard_ids = 3;\n}\n\nmessage ShardIdPositions {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  repeated ShardIdPosition shard_positions = 3;\n}\n\nmessage ShardIdPosition {\n  ShardId shard_id = 1;\n  Position publish_position_inclusive = 2;\n}\n\nenum ParseFailureReason {\n  PARSE_FAILURE_REASON_UNSPECIFIED = 0;\n  PARSE_FAILURE_REASON_INVALID_JSON = 1;\n  PARSE_FAILURE_REASON_INVALID_SCHEMA = 
2;\n}\n\nmessage ParseFailure {\n  quickwit.common.DocUid doc_uid = 1;\n  ParseFailureReason reason = 2;\n  string message = 3;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/ingester.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.ingest.ingester;\n\nimport \"quickwit/common.proto\";\nimport \"quickwit/ingest.proto\";\n\nservice IngesterService {\n  // Persists batches of documents to primary shards hosted on a leader.\n  rpc Persist(PersistRequest) returns (PersistResponse);\n\n  // Opens a replication stream from a leader to a follower.\n  rpc OpenReplicationStream(stream SynReplicationMessage) returns (stream AckReplicationMessage);\n\n  // Streams records from a leader or a follower. The client can optionally specify a range of positions to fetch,\n  // otherwise the stream will go indefinitely or until the shard is closed.\n  rpc OpenFetchStream(OpenFetchStreamRequest) returns (stream FetchMessage);\n\n  // Streams status updates, called \"observations\", from an ingester.\n  rpc OpenObservationStream(OpenObservationStreamRequest) returns (stream ObservationMessage);\n\n  // Creates and initializes a set of newly opened shards. This RPC is called by the control plane on leaders.\n  rpc InitShards(InitShardsRequest) returns (InitShardsResponse);\n\n  // Only retain the shards that are listed in the request.\n  // Other shards are deleted.\n  rpc RetainShards(RetainShardsRequest) returns (RetainShardsResponse);\n\n  // Truncates a set of shards at the given positions. 
This RPC is called by indexers on leaders AND followers.\n  rpc TruncateShards(TruncateShardsRequest) returns (TruncateShardsResponse);\n\n  // Closes a set of shards. This RPC is called by the control plane.\n  rpc CloseShards(CloseShardsRequest) returns (CloseShardsResponse);\n\n  // Decommissions the ingester.\n  rpc Decommission(DecommissionRequest) returns (DecommissionResponse);\n}\n\nmessage RetainShardsForSource {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  repeated quickwit.ingest.ShardId shard_ids = 3;\n}\n\nmessage RetainShardsRequest {\n  repeated RetainShardsForSource retain_shards_for_sources = 1;\n}\n\nmessage RetainShardsResponse {\n}\n\nmessage PersistRequest {\n  string leader_id = 1;\n  quickwit.ingest.CommitTypeV2 commit_type = 3;\n  repeated PersistSubrequest subrequests = 4;\n}\n\nmessage PersistSubrequest {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.DocBatchV2 doc_batch = 5;\n  reserved 4;\n}\n\nmessage PersistResponse {\n  string leader_id = 1;\n  repeated PersistSuccess successes = 2;\n  repeated PersistFailure failures = 3;\n  RoutingUpdate routing_update = 4;\n}\n\nmessage RoutingUpdate {\n  uint32 capacity_score = 1;\n  repeated SourceShardUpdate source_shard_updates = 2;\n  repeated quickwit.ingest.ShardIds closed_shards = 3;\n}\n\nmessage SourceShardUpdate {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  uint32 open_shard_count = 3;\n}\n\nmessage PersistSuccess {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  quickwit.ingest.Position replication_position_inclusive = 5;\n  uint32 num_persisted_docs = 6;\n  repeated quickwit.ingest.ParseFailure parse_failures = 7;\n}\n\n\nenum PersistFailureReason {\n  PERSIST_FAILURE_REASON_UNSPECIFIED = 0;\n  PERSIST_FAILURE_REASON_WAL_FULL = 4;\n  PERSIST_FAILURE_REASON_TIMEOUT = 5;\n  
PERSIST_FAILURE_REASON_NO_SHARDS_AVAILABLE = 6;\n  PERSIST_FAILURE_REASON_NODE_UNAVAILABLE = 7;\n}\n\nmessage PersistFailure {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  PersistFailureReason reason = 5;\n  reserved 4;\n}\n\nmessage SynReplicationMessage {\n  oneof message {\n    OpenReplicationStreamRequest open_request = 1;\n    InitReplicaRequest init_request = 2;\n    ReplicateRequest replicate_request = 3;\n  }\n}\n\nmessage AckReplicationMessage {\n  oneof message {\n    OpenReplicationStreamResponse open_response = 1;\n    InitReplicaResponse init_response = 2;\n    ReplicateResponse replicate_response = 3;\n  }\n}\n\nmessage OpenReplicationStreamRequest {\n  string leader_id = 1;\n  string follower_id = 2;\n  // Position of the request in the replication stream.\n  uint64 replication_seqno = 3;\n}\n\nmessage OpenReplicationStreamResponse {\n  // Position of the response in the replication stream. It should match the position of the request.\n  uint64 replication_seqno = 1;\n}\n\nmessage InitReplicaRequest {\n  Shard replica_shard = 1;\n  uint64 replication_seqno = 2;\n}\n\nmessage InitReplicaResponse {\n  uint64 replication_seqno = 1;\n}\n\nmessage ReplicateRequest {\n  string leader_id = 1;\n  string follower_id = 2;\n  quickwit.ingest.CommitTypeV2 commit_type = 3;\n  repeated ReplicateSubrequest subrequests = 4;\n  // Position of the request in the replication stream.\n  uint64 replication_seqno = 5;\n}\n\nmessage ReplicateSubrequest {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  quickwit.ingest.Position from_position_exclusive = 5;\n  ingest.DocBatchV2 doc_batch = 6;\n}\n\nmessage ReplicateResponse {\n  string follower_id = 1;\n  repeated ReplicateSuccess successes = 2;\n  repeated ReplicateFailure failures = 3;\n  // Position of the response in the replication stream. 
It should match the position of the request.\n  uint64 replication_seqno = 4;\n}\n\nmessage ReplicateSuccess {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  quickwit.ingest.Position replication_position_inclusive = 5;\n}\n\nenum ReplicateFailureReason {\n  REPLICATE_FAILURE_REASON_UNSPECIFIED = 0;\n  REPLICATE_FAILURE_REASON_SHARD_NOT_FOUND = 1;\n  REPLICATE_FAILURE_REASON_SHARD_CLOSED = 2;\n  reserved 3; // REPLICATE_FAILURE_REASON_RATE_LIMITED = 3;\n  REPLICATE_FAILURE_REASON_WAL_FULL = 4;\n}\n\nmessage ReplicateFailure {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  ReplicateFailureReason reason = 5;\n}\n\nmessage TruncateShardsRequest {\n  string ingester_id = 1;\n  repeated TruncateShardsSubrequest subrequests = 2;\n}\n\nmessage TruncateShardsSubrequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  quickwit.ingest.ShardId shard_id = 3;\n  // The position up to which the shard should be truncated (inclusive).\n  quickwit.ingest.Position truncate_up_to_position_inclusive = 4;\n}\n\nmessage TruncateShardsResponse {\n  // TODO\n}\n\nmessage OpenFetchStreamRequest {\n  string client_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  quickwit.ingest.Position from_position_exclusive = 5;\n}\n\nmessage FetchMessage {\n  oneof message {\n    FetchPayload payload = 1;\n    FetchEof eof = 2;\n  }\n}\n\nmessage FetchPayload {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  quickwit.ingest.ShardId shard_id = 3;\n  quickwit.ingest.MRecordBatch mrecord_batch = 4;\n  quickwit.ingest.Position from_position_exclusive = 5;\n  quickwit.ingest.Position to_position_inclusive = 6;\n}\n\nmessage FetchEof {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  
quickwit.ingest.ShardId shard_id = 3;\n  quickwit.ingest.Position eof_position = 4;\n}\n\nmessage InitShardsRequest {\n  reserved 1;\n  repeated InitShardSubrequest subrequests = 2;\n}\n\nmessage InitShardSubrequest {\n  uint32 subrequest_id = 1;\n  quickwit.ingest.Shard shard = 2;\n  string doc_mapping_json = 3;\n  bool validate_docs = 4;\n}\n\nmessage InitShardsResponse {\n  repeated InitShardSuccess successes = 1;\n  repeated InitShardFailure failures = 2;\n}\n\nmessage InitShardSuccess {\n  uint32 subrequest_id = 1;\n  quickwit.ingest.Shard shard = 2;\n}\n\nmessage InitShardFailure {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  // InitShardFailureReason reason = 5;\n}\n\nmessage CloseShardsRequest {\n  reserved 1;\n  repeated quickwit.ingest.ShardPKey shard_pkeys = 2;\n}\n\nmessage CloseShardsResponse {\n  repeated quickwit.ingest.ShardPKey successes = 1;\n}\n\nmessage DecommissionRequest {\n}\n\nmessage DecommissionResponse {\n}\n\nmessage OpenObservationStreamRequest {\n}\n\nenum IngesterStatus {\n  INGESTER_STATUS_UNSPECIFIED = 0;\n  // The ingester is live but not ready yet to accept requests.\n  INGESTER_STATUS_INITIALIZING = 1;\n  // The ingester is ready and accepts read and write requests.\n  INGESTER_STATUS_READY = 2;\n  // The ingester is about to be decommissioned. It still accepts read and write requests, but will not accept write requests in a few seconds and should be avoided by future write requests.\n  INGESTER_STATUS_RETIRING = 6;\n  // The ingester is being decommissioned. It accepts read requests but rejects write requests\n  // (open shards, persist, and replicate requests). It will transition to `Decommissioned` once\n  // all shards are fully indexed.\n  INGESTER_STATUS_DECOMMISSIONING = 3;\n  // The ingester no longer accepts read and write requests. 
It does not hold any data and can\n  // be safely removed from the cluster.\n  INGESTER_STATUS_DECOMMISSIONED = 4;\n  // The ingester failed to initialize and is not ready to accept requests.\n  INGESTER_STATUS_FAILED = 5;\n}\n\nmessage ObservationMessage {\n  string node_id = 1;\n  IngesterStatus Status = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/metastore.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.metastore;\n\nimport \"quickwit/common.proto\";\nimport \"quickwit/ingest.proto\";\n\nenum SourceType {\n  SOURCE_TYPE_UNSPECIFIED = 0;\n  SOURCE_TYPE_CLI = 1;\n  SOURCE_TYPE_FILE = 2;\n  SOURCE_TYPE_INGEST_V1 = 4;\n  SOURCE_TYPE_INGEST_V2 = 5;\n  // Apache Kafka\n  SOURCE_TYPE_KAFKA = 6;\n  // Amazon Kinesis\n  SOURCE_TYPE_KINESIS = 7;\n  SOURCE_TYPE_NATS = 8;\n  // Google Cloud Pub/Sub\n  SOURCE_TYPE_PUB_SUB = 3;\n  // Apache Pulsar\n  SOURCE_TYPE_PULSAR = 9;\n  SOURCE_TYPE_VEC = 10;\n  SOURCE_TYPE_VOID = 11;\n  SOURCE_TYPE_STDIN = 13;\n}\n\n// Metastore meant to manage Quickwit's indexes, their splits and delete tasks.\n//\n// I. Index and splits management.\n//\n// Quickwit needs a way to ensure that we can cleanup unused files,\n// and this process needs to be resilient to any fail-stop failures.\n// We rely on atomically transitioning the status of splits.\n//\n// The split state goes through the following life cycle:\n// 1. `Staged`\n//   - Start uploading the split files.\n// 2. `Published`\n//   - Uploading the split files is complete and the split is searchable.\n// 3. `MarkedForDeletion`\n//   - Mark the split for deletion.\n//\n// If a split has a file in the storage, it MUST be registered in the metastore,\n// and its state can be as follows:\n// - `Staged`: The split is almost ready. 
Some of its files may have been uploaded in the storage.\n// - `Published`: The split is ready and published.\n// - `MarkedForDeletion`: The split is marked for deletion.\n//\n// Before creating any file, we need to stage the split. If there is a failure, upon recovery, we\n// schedule for deletion all the staged splits. A client may not necessarily remove files from\n// storage right after marking it for deletion. A CLI client may delete files right away, but a\n// more serious deployment should probably only delete those files after a grace period so that the\n// running search queries can complete.\n//\n// II. Delete tasks management.\n//\n// A delete task is defined on a given index and by a search query. It can be\n// applied to all the splits of the index.\n//\n// Quickwit needs a way to track that a delete task has been applied to a split. This is ensured\n// by two mechanisms:\n// - On creation of a delete task, we give to the task a monotonically increasing opstamp (uniqueness\n//   and monotonically increasing must be true at the index level).\n// - When a delete task is executed on a split, that is when the documents matched by the search\n//   query are removed from the splits, we update the split's `delete_opstamp` to the value of the\n//   task's opstamp. This marks the split as \"up-to-date\" regarding this delete task. 
If new delete\n//   tasks are added, we will know that we need to run these delete tasks on the split, as its\n//   `delete_opstamp` will be lower than the `opstamp` of the new tasks.\n//\n// For splits created after a given delete task, Quickwit's indexing ensures that these splits\n// are created with a `delete_opstamp` equal to the latest opstamp of the tasks of the\n// corresponding index.\nservice MetastoreService {\n  // Creates an index.\n  //\n  // This API creates a new index in the metastore.\n  // An error will occur if an index that already exists in the storage is specified.\n  rpc CreateIndex(CreateIndexRequest) returns (CreateIndexResponse);\n\n  // Updates an index.\n  rpc UpdateIndex(UpdateIndexRequest) returns (IndexMetadataResponse);\n\n  // Returns the `IndexMetadata` of an index identified by its IndexID or its IndexUID.\n  rpc IndexMetadata(IndexMetadataRequest) returns (IndexMetadataResponse);\n\n  // Fetches the metadata of a list of indexes identified by their Index IDs or UIDs.\n  rpc IndexesMetadata(IndexesMetadataRequest) returns (IndexesMetadataResponse);\n\n  // Lists the metadata of indexes.\n  rpc ListIndexesMetadata(ListIndexesMetadataRequest) returns (ListIndexesMetadataResponse);\n\n  // Deletes an index.\n  rpc DeleteIndex(DeleteIndexRequest) returns (EmptyResponse);\n\n  // Returns a list of size info for each index.\n  rpc ListIndexStats(ListIndexStatsRequest) returns (ListIndexStatsResponse);\n\n  // Streams splits from index.\n  rpc ListSplits(ListSplitsRequest) returns (stream ListSplitsResponse);\n\n  // Stages several splits.\n  rpc StageSplits(StageSplitsRequest) returns (EmptyResponse);\n\n  // Publishes splits.\n  rpc PublishSplits(PublishSplitsRequest) returns (EmptyResponse);\n\n  // Marks splits for deletion.\n  rpc MarkSplitsForDeletion(MarkSplitsForDeletionRequest) returns (EmptyResponse);\n\n  // Deletes splits.\n  rpc DeleteSplits(DeleteSplitsRequest) returns (EmptyResponse);\n\n  // Adds a source.\n  rpc 
AddSource(AddSourceRequest) returns (EmptyResponse);\n\n  // Updates a source.\n  rpc UpdateSource(UpdateSourceRequest) returns (EmptyResponse);\n\n  // Toggles (turns on or off) source.\n  rpc ToggleSource(ToggleSourceRequest) returns (EmptyResponse);\n\n  // Removes source.\n  rpc DeleteSource(DeleteSourceRequest) returns (EmptyResponse);\n\n  // Resets source checkpoint.\n  rpc ResetSourceCheckpoint(ResetSourceCheckpointRequest) returns (EmptyResponse);\n\n  // Gets last opstamp for a given `index_id`.\n  rpc LastDeleteOpstamp(LastDeleteOpstampRequest) returns (LastDeleteOpstampResponse);\n\n  // Creates a delete task.\n  rpc CreateDeleteTask(DeleteQuery) returns (DeleteTask);\n\n  // Updates splits `delete_opstamp`.\n  rpc UpdateSplitsDeleteOpstamp(UpdateSplitsDeleteOpstampRequest) returns (UpdateSplitsDeleteOpstampResponse);\n\n  // Lists delete tasks with `delete_task.opstamp` > `opstamp_start` for a given `index_id`.\n  rpc ListDeleteTasks(ListDeleteTasksRequest) returns (ListDeleteTasksResponse);\n\n  // Lists splits with `split.delete_opstamp` < `delete_opstamp` for a given `index_id`.\n  rpc ListStaleSplits(ListStaleSplitsRequest) returns (ListSplitsResponse);\n\n  // Shard API\n  //\n  // Note that for the file-backed metastore implementation, the requests are not processed atomically.\n  // Indeed, each request comprises one or more subrequests that target different indexes and sources processed\n  // independently. Responses list the requests that succeeded or failed in the fields `successes` and\n  // `failures`.\n  rpc OpenShards(OpenShardsRequest) returns (OpenShardsResponse);\n\n  // Acquires a set of shards for indexing. This RPC locks the shards for publishing thanks to a publish token and only\n  // the last indexer that has acquired the shards is allowed to publish. 
The response returns for each subrequest the\n  // list of acquired shards along with the positions to index from.\n  //\n  // If a requested shard is missing, this method does not return an error. It should simply return the list of\n  // shards that were actually acquired.\n  //\n  // For this reason, `AcquireShardsResponse.acquired_shards` may contain fewer shards than were requested.\n  // They may also be returned in any order.\n  rpc AcquireShards(AcquireShardsRequest) returns (AcquireShardsResponse);\n\n  // Deletes a set of shards. This RPC deletes the shards from the metastore.\n  // If the shard did not exist to begin with, the operation is successful and does not return any error.\n  rpc DeleteShards(DeleteShardsRequest) returns (DeleteShardsResponse);\n\n  // Deletes outdated shards. This RPC deletes the shards from the metastore.\n  rpc PruneShards(PruneShardsRequest) returns (EmptyResponse);\n\n  rpc ListShards(ListShardsRequest) returns (ListShardsResponse);\n\n  // Index Template API\n  //\n  // Index templates are used to create indexes with a predefined configuration.\n\n  // Creates an index template.\n  rpc CreateIndexTemplate(CreateIndexTemplateRequest) returns (EmptyResponse);\n\n  // Fetches an index template.\n  rpc GetIndexTemplate(GetIndexTemplateRequest) returns (GetIndexTemplateResponse);\n\n  // Finds matching index templates.\n  rpc FindIndexTemplateMatches(FindIndexTemplateMatchesRequest) returns (FindIndexTemplateMatchesResponse);\n\n  // Returns all the index templates.\n  rpc ListIndexTemplates(ListIndexTemplatesRequest) returns (ListIndexTemplatesResponse);\n\n  // Deletes index templates.\n  rpc DeleteIndexTemplates(DeleteIndexTemplatesRequest) returns (EmptyResponse);\n\n  // Gets the cluster identity.\n  rpc GetClusterIdentity(GetClusterIdentityRequest) returns (GetClusterIdentityResponse);\n}\n\nmessage EmptyResponse {\n}\n\nmessage CreateIndexRequest {\n  string index_config_json = 2;\n  repeated string source_configs_json = 
3;\n}\n\nmessage CreateIndexResponse {\n  quickwit.common.IndexUid index_uid = 1;\n  string index_metadata_json = 2;\n}\n\nmessage UpdateIndexRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string doc_mapping_json = 5;\n  string indexing_settings_json = 4;\n  string ingest_settings_json = 6;\n  string search_settings_json = 2;\n  optional string retention_policy_json_opt = 3;\n}\n\nmessage ListIndexesMetadataRequest {\n  reserved  1;\n  // List of patterns an index should match or not match to get considered\n  // An index must match at least one positive pattern (a pattern not starting\n  // with a '-'), and no negative pattern (a pattern starting with a '-').\n  repeated string index_id_patterns = 2;\n}\n\nmessage ListIndexesMetadataResponse {\n  // Deprecated (v0.9.0), use `indexes_metadata_json_zstd` instead.\n  optional string indexes_metadata_json_opt = 1;\n  // A JSON serialized then ZSTD compressed list of `IndexMetadata`: `Vec<IndexMetadata> | JSON | ZSTD`.\n  // We don't use `repeated` here to increase the compression rate and ratio.\n  bytes indexes_metadata_json_zstd = 2;\n}\n\nmessage DeleteIndexRequest {\n  quickwit.common.IndexUid index_uid = 1;\n}\n\n// Request the metadata of an index.\n// Either `index_uid` or `index_id` must be specified.\n//\n// If both are supplied, `index_uid` is used.\nmessage IndexMetadataRequest {\n  optional string index_id = 1;\n  optional quickwit.common.IndexUid index_uid = 2;\n}\n\nmessage IndexMetadataResponse {\n  string index_metadata_serialized_json = 1;\n}\n\nmessage IndexesMetadataRequest {\n  repeated IndexMetadataSubrequest subrequests = 1;\n}\n\nmessage IndexMetadataSubrequest {\n  optional string index_id = 1;\n  optional quickwit.common.IndexUid index_uid = 2;\n}\n\nmessage IndexesMetadataResponse {\n  // A JSON serialized then ZSTD compressed list of `IndexMetadata`: `Vec<IndexMetadata> | JSON | ZSTD`.\n  // We don't use `repeated` here to increase the compression rate and ratio.\n  bytes 
indexes_metadata_json_zstd = 1;\n  repeated IndexMetadataFailure failures = 2;\n}\n\nmessage IndexMetadataFailure {\n  optional string index_id = 1;\n  optional quickwit.common.IndexUid index_uid = 2;\n  IndexMetadataFailureReason reason = 3;\n}\n\nenum IndexMetadataFailureReason {\n  INDEX_METADATA_FAILURE_REASON_UNSPECIFIED = 0;\n  INDEX_METADATA_FAILURE_REASON_NOT_FOUND = 1;\n  INDEX_METADATA_FAILURE_REASON_INTERNAL = 2;\n}\n\nmessage ListIndexStatsRequest {\n  // List of patterns an index should match or not match to get considered\n  // An index must match at least one positive pattern (a pattern not starting\n  // with a '-'), and no negative pattern (a pattern starting with a '-').\n  repeated string index_id_patterns = 1;\n}\n\nmessage ListIndexStatsResponse {\n  // list of IndexStats. each one has the index id, the number of splits and the total size.\n  repeated IndexStats index_stats = 1;\n}\n\nmessage IndexStats {\n  quickwit.common.IndexUid index_uid = 1;\n  SplitStats staged = 2;\n  SplitStats published = 3;\n  SplitStats marked_for_deletion = 4;\n}\n\nmessage SplitStats {\n  uint64 num_splits = 1;\n  uint64 total_size_bytes = 2;\n}\n\nmessage ListSplitsRequest {\n  // Predicate used to filter splits.\n  // The predicate is expressed as a JSON serialized\n  // `ListSplitsQuery`.\n  string query_json = 1;\n}\n\nmessage ListSplitsResponse {\n  // TODO use repeated and encode splits json individually.\n  string splits_serialized_json = 1;\n}\n\nmessage StageSplitsRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string split_metadata_list_serialized_json = 2;\n}\n\nmessage PublishSplitsRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  repeated string staged_split_ids = 2;\n  repeated string replaced_split_ids = 3;\n  optional string index_checkpoint_delta_json_opt = 4;\n  optional string publish_token_opt = 5;\n}\n\nmessage MarkSplitsForDeletionRequest {\n  quickwit.common.IndexUid index_uid = 2;\n  repeated string split_ids = 
3;\n}\n\nmessage DeleteSplitsRequest {\n  quickwit.common.IndexUid index_uid = 2;\n  repeated string split_ids = 3;\n}\n\nmessage AddSourceRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_config_json = 2;\n}\n\nmessage UpdateSourceRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_config_json = 2;\n}\n\nmessage ToggleSourceRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  bool enable = 3;\n}\n\nmessage DeleteSourceRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n}\n\nmessage ResetSourceCheckpointRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n}\n\n//\n// Delete tasks API.\n//\n\nmessage DeleteTask {\n  int64 create_timestamp = 1;\n  uint64 opstamp = 2;\n  DeleteQuery delete_query = 3;\n}\n\nmessage DeleteQuery {\n  reserved 4, 5;\n\n  // Index UID.\n  quickwit.common.IndexUid index_uid = 1;\n  // If set, restrict search to documents with a `timestamp >= start_timestamp`.\n  optional int64 start_timestamp = 2;\n  // If set, restrict search to documents with a `timestamp < end_timestamp``.\n  optional int64 end_timestamp = 3;\n  // Query AST serialized in JSON\n  string query_ast = 6;\n}\n\nmessage UpdateSplitsDeleteOpstampRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  repeated string split_ids = 2;\n  uint64 delete_opstamp = 3;\n}\n\nmessage UpdateSplitsDeleteOpstampResponse {}\n\nmessage LastDeleteOpstampRequest {\n  quickwit.common.IndexUid index_uid = 1;\n}\n\nmessage LastDeleteOpstampResponse {\n  uint64 last_delete_opstamp = 1;\n}\n\nmessage ListStaleSplitsRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  uint64 delete_opstamp = 2;\n  uint64 num_splits = 3;\n}\n\nmessage ListDeleteTasksRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  uint64 opstamp_start = 2;\n}\n\nmessage ListDeleteTasksResponse {\n  repeated DeleteTask delete_tasks = 1;\n}\n\n//\n// Shard API\n//\n\nmessage OpenShardsRequest {\n  repeated 
OpenShardSubrequest subrequests = 1;\n}\n\nmessage OpenShardSubrequest {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  string leader_id = 5;\n  optional string follower_id = 6;\n  quickwit.common.DocMappingUid doc_mapping_uid = 7;\n  optional string publish_token = 8;\n}\n\nmessage OpenShardsResponse {\n  repeated OpenShardSubresponse subresponses = 1;\n}\n\nmessage OpenShardSubresponse {\n  reserved 2, 3;\n\n  uint32 subrequest_id = 1;\n  quickwit.ingest.Shard open_shard = 4;\n}\n\nmessage AcquireShardsRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  repeated quickwit.ingest.ShardId shard_ids = 3;\n  string publish_token = 4;\n}\n\nmessage AcquireShardsResponse {\n  // List of acquired shards, in no specific order.\n  repeated quickwit.ingest.Shard acquired_shards = 3;\n}\n\nmessage DeleteShardsRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  repeated quickwit.ingest.ShardId shard_ids = 3;\n  // If false, only shards at EOF positions will be deleted.\n  bool force = 4;\n}\n\nmessage DeleteShardsResponse {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  // List of shard IDs that were successfully deleted.\n  repeated quickwit.ingest.ShardId successes = 3;\n  // List of shard IDs that could not be deleted because `force` was set to `false` in the request,\n  // and the shards are not at EOF, i.e., not fully indexed.\n  repeated quickwit.ingest.ShardId failures = 4;\n}\n\nmessage PruneShardsRequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  // The maximum age of the shards to keep, in seconds.\n  optional uint32 max_age_secs = 5;\n  // The maximum number of the shards to keep. 
Delete older shards first.\n  optional uint32 max_count = 6;\n  // The interval between two pruning operations, in seconds.\n  optional uint32 interval_secs = 7;\n}\n\nmessage ListShardsRequest {\n  repeated ListShardsSubrequest subrequests = 1;\n}\n\nmessage ListShardsSubrequest {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  optional quickwit.ingest.ShardState shard_state = 3;\n}\n\nmessage ListShardsResponse {\n  repeated ListShardsSubresponse subresponses = 1;\n}\n\nmessage ListShardsSubresponse {\n  quickwit.common.IndexUid index_uid = 1;\n  string source_id = 2;\n  repeated quickwit.ingest.Shard shards = 3;\n}\n\n//\n// Index Template API\n//\n\nmessage CreateIndexTemplateRequest {\n  string index_template_json = 1;\n  bool overwrite = 2;\n}\n\nmessage GetIndexTemplateRequest {\n  string template_id = 1;\n}\n\nmessage GetIndexTemplateResponse {\n  string index_template_json = 1;\n}\n\nmessage FindIndexTemplateMatchesRequest {\n  repeated string index_ids = 1;\n}\n\nmessage FindIndexTemplateMatchesResponse {\n  repeated IndexTemplateMatch matches = 1;\n}\n\nmessage IndexTemplateMatch {\n  string index_id = 1;\n  string template_id = 2;\n  string index_template_json = 3;\n}\n\nmessage ListIndexTemplatesRequest {\n}\n\nmessage ListIndexTemplatesResponse {\n  repeated string index_templates_json = 1;\n}\n\nmessage DeleteIndexTemplatesRequest {\n  repeated string template_ids = 1;\n}\n\nmessage GetClusterIdentityRequest {\n}\n\nmessage GetClusterIdentityResponse {\n  string uuid = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/router.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.ingest.router;\n\nimport \"quickwit/common.proto\";\nimport \"quickwit/ingest.proto\";\n\nservice IngestRouterService {\n  // Ingests batches of documents for one or multiple indexes.\n  // TODO: Describe error cases and how to handle them.\n  rpc Ingest(IngestRequestV2) returns (IngestResponseV2);\n}\n\nmessage IngestRequestV2 {\n  // There should be at most one subrequest per index per request.\n  repeated IngestSubrequest subrequests = 1;\n  quickwit.ingest.CommitTypeV2 commit_type = 2;\n}\n\nmessage IngestSubrequest {\n  // The subrequest ID is used to identify the various subrequests and responses\n  // (ingest, persist, replicate) at play during the ingest and replication\n  // process.\n  uint32 subrequest_id = 1;\n  string index_id = 2;\n  string source_id = 3;\n  quickwit.ingest.DocBatchV2 doc_batch = 4;\n}\n\nmessage IngestResponseV2 {\n  repeated IngestSuccess successes  = 1;\n  repeated IngestFailure failures  = 2;\n}\n\nmessage IngestSuccess {\n  uint32 subrequest_id = 1;\n  quickwit.common.IndexUid index_uid = 2;\n  string source_id = 3;\n  quickwit.ingest.ShardId shard_id = 4;\n  // Replication position inclusive.\n  quickwit.ingest.Position replication_position_inclusive = 5;\n  uint32 num_ingested_docs = 6;\n  repeated quickwit.ingest.ParseFailure parse_failures = 7;\n}\n\nenum 
IngestFailureReason {\n  INGEST_FAILURE_REASON_UNSPECIFIED = 0;\n  INGEST_FAILURE_REASON_INDEX_NOT_FOUND = 1;\n  INGEST_FAILURE_REASON_SOURCE_NOT_FOUND = 2;\n  INGEST_FAILURE_REASON_INTERNAL = 3;\n  INGEST_FAILURE_REASON_NO_SHARDS_AVAILABLE = 4;\n  INGEST_FAILURE_REASON_SHARD_RATE_LIMITED = 5;\n  INGEST_FAILURE_REASON_WAL_FULL = 6;\n  INGEST_FAILURE_REASON_TIMEOUT = 7;\n  INGEST_FAILURE_REASON_ROUTER_LOAD_SHEDDING = 8;\n  INGEST_FAILURE_REASON_LOAD_SHEDDING = 9;\n  INGEST_FAILURE_REASON_CIRCUIT_BREAKER = 10;\n}\n\nmessage IngestFailure {\n  uint32 subrequest_id = 1;\n  string index_id = 2;\n  string source_id = 3;\n  IngestFailureReason reason = 5;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/quickwit/search.proto",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage quickwit.search;\n\nservice SearchService {\n  // Root search API.\n  // This RPC identifies the set of splits on which the query should run on,\n  // and dispatch the several calls to `LeafSearch`.\n  //\n  // It is also in charge of merging back the results.\n  rpc RootSearch(SearchRequest) returns (SearchResponse);\n\n  // Perform a leaf search on a given set of splits.\n  //\n  // It is like a regular search except that:\n  // - the node should perform the search locally instead of dispatching\n  //   it to other nodes.\n  // - it should be applied on the given subset of splits\n  // - Hit content is not fetched, and we instead return so called `PartialHit`.\n  rpc LeafSearch(LeafSearchRequest) returns (LeafSearchResponse);\n\n  /// Fetches the documents contents from the document store.\n  /// This methods takes `PartialHit`s and returns `Hit`s.\n  rpc FetchDocs(FetchDocsRequest) returns (FetchDocsResponse);\n\n  // Root list terms API.\n  // This RPC identifies the set of splits on which the query should run on,\n  // and dispatches the several calls to `LeafListTerms`.\n  //\n  // It is also in charge of merging back the results.\n  rpc RootListTerms(ListTermsRequest) returns (ListTermsResponse);\n\n  // Performs a leaf list terms on a given set of splits.\n  //\n  // It is like a regular list term except 
that:\n  // - the node should perform the listing locally instead of dispatching\n  //   it to other nodes.\n  // - it should be applied on the given subset of splits\n  rpc LeafListTerms(LeafListTermsRequest) returns (LeafListTermsResponse);\n\n  // Performs a scroll request.\n  rpc Scroll(ScrollRequest) returns (SearchResponse);\n\n  // gRPC request used to store a key in the local storage of the targeted node.\n  // This RPC is used in the mini distributed immutable KV store embedded in quickwit.\n  rpc PutKV(PutKVRequest) returns (PutKVResponse);\n\n  // Gets a key from the local storage of the targeted node.\n  // This RPC is used in the mini distributed immutable KV store embedded in quickwit.\n  rpc GetKV(GetKVRequest) returns (GetKVResponse);\n\n  rpc ReportSplits(ReportSplitsRequest) returns (ReportSplitsResponse);\n\n  rpc ListFields(ListFieldsRequest) returns (ListFieldsResponse);\n\n  rpc LeafListFields(LeafListFieldsRequest) returns (ListFieldsResponse);\n\n  // Describe how a search would be processed.\n  rpc SearchPlan(SearchRequest) returns (SearchPlanResponse);\n}\n\n/// Scroll Request\nmessage ScrollRequest {\n  /// The `scroll_id` is the given in the response of a search request including a scroll.\n  string scroll_id = 1;\n  optional uint32 scroll_ttl_secs = 2;\n}\n\nmessage PutKVRequest {\n  bytes key = 1;\n  bytes payload = 2;\n  uint32 ttl_secs = 3;\n}\n\nmessage PutKVResponse {}\n\nmessage GetKVRequest {\n  bytes key = 1;\n}\n\nmessage GetKVResponse {\n  optional bytes payload = 1;\n}\n\n\nmessage ReportSplit {\n  // Split id (ULID format `01HAV29D4XY3D462FS3D8K5Q2H`)\n  string split_id = 2;\n  // The storage uri. 
This URI does NOT include the split id.\n  string storage_uri = 1;\n}\n\nmessage ReportSplitsRequest {\n  repeated ReportSplit report_splits = 1;\n}\n\nmessage ReportSplitsResponse {}\n\n// -- ListFields -------------------\n\nmessage ListFieldsRequest {\n  // Index ID patterns\n  repeated string index_id_patterns = 1;\n  // Optional limit query to a list of fields\n  // Wildcard expressions are supported.\n  repeated string fields = 2;\n\n  // Time filter, expressed in seconds since epoch.\n  // That filter is to be interpreted as the semi-open interval:\n  // [start_timestamp, end_timestamp).\n  optional int64 start_timestamp = 3;\n  optional int64 end_timestamp = 4;\n\n  // JSON-serialized QueryAst for index_filter support.\n  // When provided, only fields from documents matching this query are returned.\n  optional string query_ast = 5;\n\n  // Control if the request will fail if split_ids contains a split that does not exist.\n  // optional bool fail_on_missing_index = 6;\n}\n\nmessage LeafListFieldsRequest {\n  // The index id\n  string index_id = 1;\n  // The index uri\n  string index_uri = 2;\n  // Index split ids to apply the query on.\n  // This ids are resolved from the index_uri defined in the search_request.\n  repeated SplitIdAndFooterOffsets split_offsets = 3;\n\n  // Optional limit query to a list of fields\n  // Wildcard expressions are supported.\n  repeated string fields = 4;\n}\n\nmessage ListFieldsResponse {\n  repeated ListFieldsEntryResponse fields = 1;\n}\n\nmessage ListFieldsEntryResponse {\n  string field_name = 1;\n  ListFieldType field_type = 2;\n  // The index ids the field exists\n  repeated string index_ids = 3;\n  // True means the field is searchable (indexed) in at least some indices.\n  // False means the field is not searchable in any indices.\n  bool searchable = 4;\n  // True means the field is aggregatable (fast) in at least some indices.\n  // False means the field is not aggregatable in any indices.\n  bool aggregatable = 
5;\n  // The index ids the field exists, but is not searchable.\n  repeated string non_searchable_index_ids = 6;\n  // The index ids the field exists, but is not aggregatable\n  repeated string non_aggregatable_index_ids = 7;\n}\n\nenum ListFieldType {\n    STR = 0;\n    U64 = 1;\n    I64 = 2;\n    F64 = 3;\n    BOOL = 4;\n    DATE = 5;\n    FACET = 6;\n    BYTES = 7;\n    IP_ADDR = 8;\n    JSON = 9;\n}\nmessage ListFields {\n  repeated ListFieldsEntryResponse fields = 1;\n}\n// -- Search -------------------\n\nmessage SearchRequest {\n  // Index ID patterns\n  repeated string index_id_patterns = 1;\n\n  // deprecated `query``\n  reserved 2;\n\n  // Json object representing Quickwit's QueryAst.\n  string query_ast = 13;\n\n  // deprecated `search_fields``\n  reserved 3;\n\n  // Time filter, expressed in seconds since epoch.\n  // That filter is to be interpreted as the semi-open interval:\n  // [start_timestamp, end_timestamp).\n  // If the query AST contains a range query over the timestamp field,\n  // then the the bounds of the range query are used directly and\n  // these two fields are ignored.\n  optional int64 start_timestamp = 4;\n  optional int64 end_timestamp = 5;\n\n  // Maximum number of hits to return.\n  uint64 max_hits = 6;\n\n  // First hit to return. 
Together with max_hits, this parameter\n  // can be used for pagination.\n  //\n  // E.g.\n  // The results with rank [start_offset..start_offset + max_hits) are returned.\n  uint64 start_offset = 7;\n\n  // deprecated tag field\n  reserved 8;\n\n  // deprecated `sort_order``\n  reserved 9;\n\n  // deprecated `sort_by_field``\n  reserved 10;\n\n  // json serialized aggregation_request\n  optional string aggregation_request = 11;\n\n  // Fields to extract snippet on\n  repeated string snippet_fields = 12;\n\n  // Optional sort by one or more fields (limited to 2 at the moment).\n  repeated SortField sort_fields = 14;\n\n  // If set, the search response will include a search id\n  // that will make it possible to paginate through the results\n  // in a consistent manner.\n  optional uint32 scroll_ttl_secs = 15;\n\n  // Document with sort tuple smaller or equal to this are discarded to\n  // enable pagination.\n  // If split_id is empty, no comparison with _shard_doc should be done\n  optional PartialHit search_after = 16;\n\n  CountHits count_hits = 17;\n\n  // When an exact index ID is provided (not a pattern), the query fails only if\n  // that index is not found and this parameter is set to `false`.\n  bool ignore_missing_indexes = 18;\n\n  // When true, skip finalization of aggregation results and return\n  // the raw IntermediateAggregationResults bytes instead.\n  bool skip_aggregation_finalization = 19;\n}\n\nenum CountHits {\n  // Count all hits, querying all splits.\n  COUNT_ALL = 0;\n  // Give an underestimate of the number of hits, possibly skipping entire\n  // splits if they are otherwise not needed to fulfull a query.\n  UNDERESTIMATE = 1;\n}\n\nmessage SortField {\n  string field_name = 1;\n  SortOrder sort_order = 2;\n  // Optional sort value format for datetime field only.\n  // If none, the default output format for datetime field is\n  // unix_timestamp_nanos.\n  optional SortDatetimeFormat sort_datetime_format = 3;\n}\n\nenum SortOrder {\n  // 
Ascending order.\n  ASC = 0;\n  // Descending order.\n  DESC = 1; //< This will be the default value;\n}\n\n// Sort value format for datetime field.\n// We keep an enum with only one format\n// for future extension.\nenum SortDatetimeFormat {\n  UNIX_TIMESTAMP_MILLIS = 0;\n  UNIX_TIMESTAMP_NANOS = 1;\n}\n\nmessage SearchResponse {\n  // Number of hits matching the query.\n  uint64 num_hits = 1;\n  // Matched hits\n  repeated Hit hits = 2;\n  // Elapsed time to perform the request. This time is measured\n  // server-side and expressed in microseconds.\n  uint64 elapsed_time_micros = 3;\n\n  // The searcherrors that occurred formatted as string.\n  repeated string errors = 4;\n\n  // used to be json-encoded aggregation\n  reserved 5;\n\n  // Postcard-encoded aggregation response\n  optional bytes aggregation_postcard = 9;\n\n  // Scroll Id (only set if scroll_secs was set in the request)\n  optional string scroll_id = 6;\n\n  // Returns the list of splits for which search failed.\n  // For the moment, the cause is unknown.\n  //\n  // It is up to the caller to decide whether to interpret\n  // this as an overall failure or to present the partial results\n  // to the end user.\n  repeated SplitSearchError failed_splits = 7;\n\n  // Total number of successful splits searched.\n  uint64 num_successful_splits = 8;\n}\n\nmessage SearchPlanResponse {\n  string result = 1;\n}\n\nmessage SplitSearchError {\n  // The searcherror that occurred formatted as string.\n  string error = 1;\n\n  // Split id that failed.\n  string split_id = 2;\n\n  // Flag to indicate if the error can be considered a retryable error\n  bool retryable_error = 3;\n}\n\n// A LeafSearchRequest can span multiple indices.\nmessage LeafSearchRequest {\n  // Search request. 
This is a perfect copy of the original search request\n  // that was sent to root apart from the start_offset, max_hits params and index_id_patterns.\n  // index_id_patterns contains the actual index ids queried on that leaf.\n  SearchRequest search_request = 1;\n\n  // List of leaf requests, one per index.\n  repeated LeafRequestRef leaf_requests = 7;\n\n  // List of unique doc_mappers serialized as json.\n  repeated string doc_mappers = 8;\n\n  // List of index uris\n  // Index URI. The index URI defines the location of the storage that contains the\n  // split files.\n  repeated string index_uris = 9;\n}\n\nmessage ResourceStats {\n    uint64 short_lived_cache_num_bytes = 1;\n    uint64 split_num_docs = 2;\n    uint64 warmup_microsecs = 3;\n    uint64 cpu_thread_pool_wait_microsecs = 4;\n    uint64 cpu_microsecs = 5;\n}\n\n// LeafRequestRef references data in LeafSearchRequest to deduplicate data.\nmessage LeafRequestRef {\n  // The ordinal of the doc_mapper in `LeafSearchRequest.doc_mappers`\n  uint32 doc_mapper_ord = 1;\n\n  // The ordinal of the index uri in LeafSearchRequest.index_uris\n  uint32 index_uri_ord = 2;\n\n  // Index split ids to apply the query on.\n  // This ids are resolved from the index_uri defined in the search_request.\n  repeated SplitIdAndFooterOffsets split_offsets = 3;\n}\n\nmessage SplitIdAndFooterOffsets {\n  // Index split id to apply the query on.\n  // This id is resolved from the index_uri defined in the search_request.\n  string split_id = 1;\n  // The offset of the start of footer in the split bundle. The footer contains the file bundle metadata and the hotcache.\n  uint64 split_footer_start = 2;\n  // The offset of the end of the footer in split bundle. 
The footer contains the file bundle metadata and the hotcache.\n  uint64 split_footer_end = 3;\n  // The lowest timestamp appearing in the split, in seconds since epoch\n  optional int64 timestamp_start = 4;\n  // The highest timestamp appearing in the split, in seconds since epoch\n  optional int64 timestamp_end = 5;\n  // The number of docs in the split\n  uint64 num_docs = 6;\n}\n\n// Hits returned by a FetchDocRequest.\n//\n// The json that is joined is the raw tantivy json doc.\n// It is very different from a quickwit json doc.\n//\n// For instance:\n// - it may contain a _source and a _dynamic field.\n// - since tantivy has no notion of cardinality,\n//  all fields are arrays.\n// - since tantivy has no notion of object, the object is\n//  flattened by concatenating the path to the root.\n//\n// See  `quickwit_search::convert_leaf_hit`\nmessage LeafHit {\n  // The actual content of the hit/\n  string leaf_json = 1;\n  // The partial hit (ie: the sorting field + the document address)\n  PartialHit partial_hit = 2;\n  // A snippet of the matching content\n  optional string leaf_snippet_json = 3;\n}\n\nmessage Hit {\n  // The actual content of the hit\n  string json = 1;\n  // The partial hit (ie: the sorting field + the document address)\n  PartialHit partial_hit = 2;\n  // A snippet of the matching content\n  optional string snippet = 3;\n  // The index id of the hit\n  string index_id = 4;\n}\n\n\n// A partial hit, is a hit for which we have not fetch the content yet.\n// Instead, it holds a document_uri which is enough information to\n// go and fetch the actual document data, by performing a `get_doc(...)`\n// request.\nmessage PartialHit {\n  // Value of the sorting key for the given document.\n  //\n  // Quickwit only computes top-K of this sorting field.\n  // If the user requested for a bottom-K of a given fast field, then quickwit simply\n  // emits an decreasing mapping of this fast field.\n  //\n  // In case of a tie, quickwit uses the increasing 
order of\n  // - the split_id,\n  // - the segment_ord,\n  // - the doc id.\n\n  // Deprecated\n  reserved 1;\n  // Room for eventual future sorted key types.\n  reserved 12 to 20;\n  SortByValue sort_value = 10;\n  SortByValue sort_value2 = 11;\n\n  string split_id = 2;\n\n  // (segment_ord, doc) form a tantivy DocAddress, which is sufficient to identify a document\n  // within a split\n  uint32 segment_ord = 3;\n\n  // The DocId identifies a unique document at the scale of a tantivy segment.\n  uint32 doc_id = 4;\n}\n\nmessage SortByValue {\n  oneof sort_value {\n  uint64 u64 = 1;\n  int64 i64 = 2;\n  double f64 = 3;\n  bool boolean = 4;\n  }\n  // Room for eventual future sorted key types.\n  reserved 5 to 20;\n}\n\nmessage LeafSearchResponse {\n  // Total number of documents matched by the query.\n  uint64 num_hits = 1;\n\n  // List of the best top-K candidates for the given leaf query.\n  repeated PartialHit partial_hits = 2;\n\n  // The list of splits that failed. LeafSearchResponse can be an aggregation of results, so there may be multiple.\n  repeated SplitSearchError failed_splits = 3;\n\n  // Total number of attempt to search into splits.\n  // We do have:\n  // `num_splits_requested == num_successful_splits + num_failed_splits.len()`\n  // But we do not necessarily have:\n  // `num_splits_requested = num_attempted_splits because of retries.`\n  uint64 num_attempted_splits = 4;\n\n  // Total number of successful splits searched.\n  uint64 num_successful_splits = 7;\n\n  // Deprecated json serialized intermediate aggregation_result.\n  reserved 5;\n\n  // postcard serialized intermediate aggregation_result.\n  optional bytes intermediate_aggregation_result = 6;\n\n  ResourceStats resource_stats = 8;\n}\n\n// The result of searching a single split in a Lambda invocation.\n// Each result is tagged with its split_id so that ordering is irrelevant.\nmessage LambdaSingleSplitResult {\n  // The split that was searched.\n  string split_id = 1;\n  oneof outcome 
{\n    // On success, the leaf search response for this split.\n    LeafSearchResponse response = 2;\n    // On failure, the error message.\n    string error = 3;\n  }\n}\n\n// Wrapper for per-split results from a Lambda invocation.\nmessage LambdaSearchResponses {\n  reserved 1; // was: repeated LeafSearchResponse responses\n  repeated LambdaSingleSplitResult split_results = 2;\n}\n\nmessage SnippetRequest {\n  repeated string snippet_fields = 1;\n  string query_ast_resolved = 2;\n}\n\nmessage FetchDocsRequest {\n  // Request fetching the content of a given list of partial_hits.\n  repeated PartialHit partial_hits = 1;\n\n  // Split footer offsets. They are required for fetch docs to\n  // fetch the document content in two reads, when the footer is not\n  // cached.\n  repeated SplitIdAndFooterOffsets split_offsets = 3;\n\n  // Index URI. The index URI defines the location of the storage that contains the\n  // split files.\n  string index_uri = 4;\n\n  optional SnippetRequest snippet_request = 7;\n\n  // `DocMapper` as json serialized trait.\n  string doc_mapper = 6;\n\n  reserved 5;\n}\n\nmessage FetchDocsResponse {\n  // List of complete hits.\n  repeated LeafHit hits = 1;\n}\n\nmessage ListTermsRequest {\n  // Index ID patterns\n  repeated string index_id_patterns = 1;\n\n  // Field to search on\n  string field = 3;\n\n  // Time filter\n  optional int64 start_timestamp = 4;\n  optional int64 end_timestamp = 5;\n\n  // Maximum number of hits to return.\n  optional uint64 max_hits = 6;\n\n  // start_key is included, end_key is excluded\n  optional bytes start_key = 7;\n  optional bytes end_key = 8;\n}\n\nmessage ListTermsResponse {\n  // Number of hits matching the query.\n  uint64 num_hits = 1;\n  // Matched hits\n  repeated bytes terms = 2;\n  // Elapsed time to perform the request. 
This time is measured\n  // server-side and expressed in microseconds.\n  uint64 elapsed_time_micros = 3;\n\n  // The search errors that occurred, formatted as strings.\n  repeated string errors = 4;\n}\n\nmessage LeafListTermsRequest {\n  // Search request. This is a perfect copy of the original list request.\n  ListTermsRequest list_terms_request = 1;\n\n  // Index split ids to apply the query on.\n  // These ids are resolved from the index_uri defined in the search_request.\n  repeated SplitIdAndFooterOffsets split_offsets = 2;\n\n  // Index URI. The index URI defines the location of the storage that contains the\n  // split files.\n  string index_uri = 3;\n}\n\nmessage LeafListTermsResponse {\n  // Total number of documents matched by the query.\n  uint64 num_hits = 1;\n\n  // List of the first K terms for the given leaf query.\n  repeated bytes terms = 2;\n\n  // The list of splits that failed. LeafSearchResponse can be an aggregation of results, so there may be multiple.\n  repeated SplitSearchError failed_splits = 3;\n\n  // Total number of single-split searches attempted.\n  uint64 num_attempted_splits = 4;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/gogoproto/gogo.proto",
    "content": "// Protocol Buffers for Go with Gadgets\n//\n// Copyright (c) 2013, The GoGo Authors. All rights reserved.\n// http://github.com/gogo/protobuf\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto2\";\npackage gogoproto;\n\nimport \"google/protobuf/descriptor.proto\";\n\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"GoGoProtos\";\noption go_package = \"github.com/gogo/protobuf/gogoproto\";\n\nextend google.protobuf.EnumOptions {\n\toptional bool goproto_enum_prefix = 62001;\n\toptional bool goproto_enum_stringer = 62021;\n\toptional bool enum_stringer = 62022;\n\toptional string enum_customname = 62023;\n\toptional bool enumdecl = 
62024;\n}\n\nextend google.protobuf.EnumValueOptions {\n\toptional string enumvalue_customname = 66001;\n}\n\nextend google.protobuf.FileOptions {\n\toptional bool goproto_getters_all = 63001;\n\toptional bool goproto_enum_prefix_all = 63002;\n\toptional bool goproto_stringer_all = 63003;\n\toptional bool verbose_equal_all = 63004;\n\toptional bool face_all = 63005;\n\toptional bool gostring_all = 63006;\n\toptional bool populate_all = 63007;\n\toptional bool stringer_all = 63008;\n\toptional bool onlyone_all = 63009;\n\n\toptional bool equal_all = 63013;\n\toptional bool description_all = 63014;\n\toptional bool testgen_all = 63015;\n\toptional bool benchgen_all = 63016;\n\toptional bool marshaler_all = 63017;\n\toptional bool unmarshaler_all = 63018;\n\toptional bool stable_marshaler_all = 63019;\n\n\toptional bool sizer_all = 63020;\n\n\toptional bool goproto_enum_stringer_all = 63021;\n\toptional bool enum_stringer_all = 63022;\n\n\toptional bool unsafe_marshaler_all = 63023;\n\toptional bool unsafe_unmarshaler_all = 63024;\n\n\toptional bool goproto_extensions_map_all = 63025;\n\toptional bool goproto_unrecognized_all = 63026;\n\toptional bool gogoproto_import = 63027;\n\toptional bool protosizer_all = 63028;\n\toptional bool compare_all = 63029;\n    optional bool typedecl_all = 63030;\n    optional bool enumdecl_all = 63031;\n\n\toptional bool goproto_registration = 63032;\n\toptional bool messagename_all = 63033;\n\n\toptional bool goproto_sizecache_all = 63034;\n\toptional bool goproto_unkeyed_all = 63035;\n}\n\nextend google.protobuf.MessageOptions {\n\toptional bool goproto_getters = 64001;\n\toptional bool goproto_stringer = 64003;\n\toptional bool verbose_equal = 64004;\n\toptional bool face = 64005;\n\toptional bool gostring = 64006;\n\toptional bool populate = 64007;\n\toptional bool stringer = 67008;\n\toptional bool onlyone = 64009;\n\n\toptional bool equal = 64013;\n\toptional bool description = 64014;\n\toptional bool testgen = 64015;\n\toptional 
bool benchgen = 64016;\n\toptional bool marshaler = 64017;\n\toptional bool unmarshaler = 64018;\n\toptional bool stable_marshaler = 64019;\n\n\toptional bool sizer = 64020;\n\n\toptional bool unsafe_marshaler = 64023;\n\toptional bool unsafe_unmarshaler = 64024;\n\n\toptional bool goproto_extensions_map = 64025;\n\toptional bool goproto_unrecognized = 64026;\n\n\toptional bool protosizer = 64028;\n\toptional bool compare = 64029;\n\n\toptional bool typedecl = 64030;\n\n\toptional bool messagename = 64033;\n\n\toptional bool goproto_sizecache = 64034;\n\toptional bool goproto_unkeyed = 64035;\n}\n\nextend google.protobuf.FieldOptions {\n\toptional bool nullable = 65001;\n\toptional bool embed = 65002;\n\toptional string customtype = 65003;\n\toptional string customname = 65004;\n\toptional string jsontag = 65005;\n\toptional string moretags = 65006;\n\toptional string casttype = 65007;\n\toptional string castkey = 65008;\n\toptional string castvalue = 65009;\n\n\toptional bool stdtime = 65010;\n\toptional bool stdduration = 65011;\n\toptional bool wktpointer = 65012;\n\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/any.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption go_package = \"google.golang.org/protobuf/types/known/anypb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"AnyProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\n\n// `Any` contains an arbitrary serialized protocol buffer message along with a\n// URL that describes the type of the serialized message.\n//\n// Protobuf library provides support to pack/unpack Any values in the form\n// of utility functions or additional generated methods of the Any type.\n//\n// Example 1: Pack and unpack a message in C++.\n//\n//     Foo foo = ...;\n//     Any any;\n//     any.PackFrom(foo);\n//     ...\n//     if (any.UnpackTo(&foo)) {\n//       ...\n//     }\n//\n// Example 2: Pack and unpack a message in Java.\n//\n//     Foo foo = ...;\n//     Any any = Any.pack(foo);\n//     ...\n//     if (any.is(Foo.class)) {\n//       foo = any.unpack(Foo.class);\n//     }\n//\n// Example 3: Pack and unpack a message in Python.\n//\n//     foo = Foo(...)\n//     any = Any()\n//     any.Pack(foo)\n//     ...\n//     if any.Is(Foo.DESCRIPTOR):\n//       any.Unpack(foo)\n//       ...\n//\n// Example 4: Pack and unpack a message in Go\n//\n//      foo := &pb.Foo{...}\n//      any, err := anypb.New(foo)\n//      if err != nil {\n//        ...\n//      }\n//      ...\n//      
foo := &pb.Foo{}\n//      if err := any.UnmarshalTo(foo); err != nil {\n//        ...\n//      }\n//\n// The pack methods provided by protobuf library will by default use\n// 'type.googleapis.com/full.type.name' as the type URL and the unpack\n// methods only use the fully qualified type name after the last '/'\n// in the type URL, for example \"foo.bar.com/x/y.z\" will yield type\n// name \"y.z\".\n//\n//\n// JSON\n//\n// The JSON representation of an `Any` value uses the regular\n// representation of the deserialized, embedded message, with an\n// additional field `@type` which contains the type URL. Example:\n//\n//     package google.profile;\n//     message Person {\n//       string first_name = 1;\n//       string last_name = 2;\n//     }\n//\n//     {\n//       \"@type\": \"type.googleapis.com/google.profile.Person\",\n//       \"firstName\": <string>,\n//       \"lastName\": <string>\n//     }\n//\n// If the embedded message type is well-known and has a custom JSON\n// representation, that representation will be embedded adding a field\n// `value` which holds the custom JSON in addition to the `@type`\n// field. Example (for message [google.protobuf.Duration][]):\n//\n//     {\n//       \"@type\": \"type.googleapis.com/google.protobuf.Duration\",\n//       \"value\": \"1.212s\"\n//     }\n//\nmessage Any {\n  // A URL/resource name that uniquely identifies the type of the serialized\n  // protocol buffer message. This string must contain at least\n  // one \"/\" character. The last segment of the URL's path must represent\n  // the fully qualified name of the type (as in\n  // `path/google.protobuf.Duration`). The name should be in a canonical form\n  // (e.g., leading \".\" is not accepted).\n  //\n  // In practice, teams usually precompile into the binary all types that they\n  // expect it to use in the context of Any. 
However, for URLs which use the\n  // scheme `http`, `https`, or no scheme, one can optionally set up a type\n  // server that maps type URLs to message definitions as follows:\n  //\n  // * If no scheme is provided, `https` is assumed.\n  // * An HTTP GET on the URL must yield a [google.protobuf.Type][]\n  //   value in binary format, or produce an error.\n  // * Applications are allowed to cache lookup results based on the\n  //   URL, or have them precompiled into a binary to avoid any\n  //   lookup. Therefore, binary compatibility needs to be preserved\n  //   on changes to types. (Use versioned type names to manage\n  //   breaking changes.)\n  //\n  // Note: this functionality is not currently available in the official\n  // protobuf release, and it is not used for type URLs beginning with\n  // type.googleapis.com.\n  //\n  // Schemes other than `http`, `https` (or the empty scheme) might be\n  // used with implementation specific semantics.\n  //\n  string type_url = 1;\n\n  // Must be a valid serialized protocol buffer of the above specified type.\n  bytes value = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/api.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\nimport \"google/protobuf/source_context.proto\";\nimport \"google/protobuf/type.proto\";\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"ApiProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\noption go_package = \"google.golang.org/protobuf/types/known/apipb\";\n\n// Api is a light-weight descriptor for an API Interface.\n//\n// Interfaces are also described as \"protocol buffer services\" in some contexts,\n// such as by the \"service\" keyword in a .proto file, but they are different\n// from API Services, which represent a concrete implementation of an interface\n// as opposed to simply a description of methods and bindings. They are also\n// sometimes simply referred to as \"APIs\" in other contexts, such as the name of\n// this message itself. See https://cloud.google.com/apis/design/glossary for\n// detailed terminology.\nmessage Api {\n  // The fully qualified name of this interface, including package name\n  // followed by the interface's simple name.\n  string name = 1;\n\n  // The methods of this interface, in unspecified order.\n  repeated Method methods = 2;\n\n  // Any metadata attached to the interface.\n  repeated Option options = 3;\n\n  // A version string for this interface. 
If specified, must have the form\n  // `major-version.minor-version`, as in `1.10`. If the minor version is\n  // omitted, it defaults to zero. If the entire version field is empty, the\n  // major version is derived from the package name, as outlined below. If the\n  // field is not empty, the version in the package name will be verified to be\n  // consistent with what is provided here.\n  //\n  // The versioning schema uses [semantic\n  // versioning](http://semver.org) where the major version number\n  // indicates a breaking change and the minor version an additive,\n  // non-breaking change. Both version numbers are signals to users\n  // what to expect from different versions, and should be carefully\n  // chosen based on the product plan.\n  //\n  // The major version is also reflected in the package name of the\n  // interface, which must end in `v<major-version>`, as in\n  // `google.feature.v1`. For major versions 0 and 1, the suffix can\n  // be omitted. Zero major versions must only be used for\n  // experimental, non-GA interfaces.\n  //\n  //\n  string version = 4;\n\n  // Source context for the protocol buffer service represented by this\n  // message.\n  SourceContext source_context = 5;\n\n  // Included interfaces. 
See [Mixin][].\n  repeated Mixin mixins = 6;\n\n  // The source syntax of the service.\n  Syntax syntax = 7;\n}\n\n// Method represents a method of an API interface.\nmessage Method {\n  // The simple name of this method.\n  string name = 1;\n\n  // A URL of the input message type.\n  string request_type_url = 2;\n\n  // If true, the request is streamed.\n  bool request_streaming = 3;\n\n  // The URL of the output message type.\n  string response_type_url = 4;\n\n  // If true, the response is streamed.\n  bool response_streaming = 5;\n\n  // Any metadata attached to the method.\n  repeated Option options = 6;\n\n  // The source syntax of this method.\n  Syntax syntax = 7;\n}\n\n// Declares an API Interface to be included in this interface. The including\n// interface must redeclare all the methods from the included interface, but\n// documentation and options are inherited as follows:\n//\n// - If after comment and whitespace stripping, the documentation\n//   string of the redeclared method is empty, it will be inherited\n//   from the original method.\n//\n// - Each annotation belonging to the service config (http,\n//   visibility) which is not set in the redeclared method will be\n//   inherited.\n//\n// - If an http annotation is inherited, the path pattern will be\n//   modified as follows. 
Any version prefix will be replaced by the\n//   version of the including interface plus the [root][] path if\n//   specified.\n//\n// Example of a simple mixin:\n//\n//     package google.acl.v1;\n//     service AccessControl {\n//       // Get the underlying ACL object.\n//       rpc GetAcl(GetAclRequest) returns (Acl) {\n//         option (google.api.http).get = \"/v1/{resource=**}:getAcl\";\n//       }\n//     }\n//\n//     package google.storage.v2;\n//     service Storage {\n//       rpc GetAcl(GetAclRequest) returns (Acl);\n//\n//       // Get a data record.\n//       rpc GetData(GetDataRequest) returns (Data) {\n//         option (google.api.http).get = \"/v2/{resource=**}\";\n//       }\n//     }\n//\n// Example of a mixin configuration:\n//\n//     apis:\n//     - name: google.storage.v2.Storage\n//       mixins:\n//       - name: google.acl.v1.AccessControl\n//\n// The mixin construct implies that all methods in `AccessControl` are\n// also declared with same name and request/response types in\n// `Storage`. A documentation generator or annotation processor will\n// see the effective `Storage.GetAcl` method after inheriting\n// documentation and annotations as follows:\n//\n//     service Storage {\n//       // Get the underlying ACL object.\n//       rpc GetAcl(GetAclRequest) returns (Acl) {\n//         option (google.api.http).get = \"/v2/{resource=**}:getAcl\";\n//       }\n//       ...\n//     }\n//\n// Note how the version in the path pattern changed from `v1` to `v2`.\n//\n// If the `root` field in the mixin is specified, it should be a\n// relative path under which inherited HTTP paths are placed. 
Example:\n//\n//     apis:\n//     - name: google.storage.v2.Storage\n//       mixins:\n//       - name: google.acl.v1.AccessControl\n//         root: acls\n//\n// This implies the following inherited HTTP annotation:\n//\n//     service Storage {\n//       // Get the underlying ACL object.\n//       rpc GetAcl(GetAclRequest) returns (Acl) {\n//         option (google.api.http).get = \"/v2/acls/{resource=**}:getAcl\";\n//       }\n//       ...\n//     }\nmessage Mixin {\n  // The fully qualified name of the interface which is included.\n  string name = 1;\n\n  // If non-empty specifies a path under which inherited HTTP paths\n  // are rooted.\n  string root = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/descriptor.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n// Author: kenton@google.com (Kenton Varda)\n//  Based on original Protocol Buffers design by\n//  Sanjay Ghemawat, Jeff Dean, and others.\n//\n// The messages in this file describe the definitions found in .proto files.\n// A valid .proto file can be translated directly to a FileDescriptorProto\n// without any other information (e.g. without reading its imports).\n\n\nsyntax = \"proto2\";\n\npackage google.protobuf;\n\noption go_package = \"google.golang.org/protobuf/types/descriptorpb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"DescriptorProtos\";\noption csharp_namespace = \"Google.Protobuf.Reflection\";\noption objc_class_prefix = \"GPB\";\noption cc_enable_arenas = true;\n\n// descriptor.proto must be optimized for speed because reflection-based\n// algorithms don't work during bootstrapping.\noption optimize_for = SPEED;\n\n// The protocol compiler can output a FileDescriptorSet containing the .proto\n// files it parses.\nmessage FileDescriptorSet {\n  repeated FileDescriptorProto file = 1;\n}\n\n// Describes a complete .proto file.\nmessage FileDescriptorProto {\n  optional string name = 1;     // file name, relative to root of source tree\n  optional string package = 2;  // e.g. 
\"foo\", \"foo.bar\", etc.\n\n  // Names of files imported by this file.\n  repeated string dependency = 3;\n  // Indexes of the public imported files in the dependency list above.\n  repeated int32 public_dependency = 10;\n  // Indexes of the weak imported files in the dependency list.\n  // For Google-internal migration only. Do not use.\n  repeated int32 weak_dependency = 11;\n\n  // All top-level definitions in this file.\n  repeated DescriptorProto message_type = 4;\n  repeated EnumDescriptorProto enum_type = 5;\n  repeated ServiceDescriptorProto service = 6;\n  repeated FieldDescriptorProto extension = 7;\n\n  optional FileOptions options = 8;\n\n  // This field contains optional information about the original source code.\n  // You may safely remove this entire field without harming runtime\n  // functionality of the descriptors -- the information is needed only by\n  // development tools.\n  optional SourceCodeInfo source_code_info = 9;\n\n  // The syntax of the proto file.\n  // The supported values are \"proto2\" and \"proto3\".\n  optional string syntax = 12;\n}\n\n// Describes a message type.\nmessage DescriptorProto {\n  optional string name = 1;\n\n  repeated FieldDescriptorProto field = 2;\n  repeated FieldDescriptorProto extension = 6;\n\n  repeated DescriptorProto nested_type = 3;\n  repeated EnumDescriptorProto enum_type = 4;\n\n  message ExtensionRange {\n    optional int32 start = 1;  // Inclusive.\n    optional int32 end = 2;    // Exclusive.\n\n    optional ExtensionRangeOptions options = 3;\n  }\n  repeated ExtensionRange extension_range = 5;\n\n  repeated OneofDescriptorProto oneof_decl = 8;\n\n  optional MessageOptions options = 7;\n\n  // Range of reserved tag numbers. Reserved tag numbers may not be used by\n  // fields or extension ranges in the same message. 
Reserved ranges may\n  // not overlap.\n  message ReservedRange {\n    optional int32 start = 1;  // Inclusive.\n    optional int32 end = 2;    // Exclusive.\n  }\n  repeated ReservedRange reserved_range = 9;\n  // Reserved field names, which may not be used by fields in the same message.\n  // A given name may only be reserved once.\n  repeated string reserved_name = 10;\n}\n\nmessage ExtensionRangeOptions {\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n}\n\n// Describes a field within a message.\nmessage FieldDescriptorProto {\n  enum Type {\n    // 0 is reserved for errors.\n    // Order is weird for historical reasons.\n    TYPE_DOUBLE = 1;\n    TYPE_FLOAT = 2;\n    // Not ZigZag encoded.  Negative numbers take 10 bytes.  Use TYPE_SINT64 if\n    // negative values are likely.\n    TYPE_INT64 = 3;\n    TYPE_UINT64 = 4;\n    // Not ZigZag encoded.  Negative numbers take 10 bytes.  Use TYPE_SINT32 if\n    // negative values are likely.\n    TYPE_INT32 = 5;\n    TYPE_FIXED64 = 6;\n    TYPE_FIXED32 = 7;\n    TYPE_BOOL = 8;\n    TYPE_STRING = 9;\n    // Tag-delimited aggregate.\n    // Group type is deprecated and not supported in proto3. 
However, Proto3\n    // implementations should still be able to parse the group wire format and\n    // treat group fields as unknown fields.\n    TYPE_GROUP = 10;\n    TYPE_MESSAGE = 11;  // Length-delimited aggregate.\n\n    // New in version 2.\n    TYPE_BYTES = 12;\n    TYPE_UINT32 = 13;\n    TYPE_ENUM = 14;\n    TYPE_SFIXED32 = 15;\n    TYPE_SFIXED64 = 16;\n    TYPE_SINT32 = 17;  // Uses ZigZag encoding.\n    TYPE_SINT64 = 18;  // Uses ZigZag encoding.\n  }\n\n  enum Label {\n    // 0 is reserved for errors\n    LABEL_OPTIONAL = 1;\n    LABEL_REQUIRED = 2;\n    LABEL_REPEATED = 3;\n  }\n\n  optional string name = 1;\n  optional int32 number = 3;\n  optional Label label = 4;\n\n  // If type_name is set, this need not be set.  If both this and type_name\n  // are set, this must be one of TYPE_ENUM, TYPE_MESSAGE or TYPE_GROUP.\n  optional Type type = 5;\n\n  // For message and enum types, this is the name of the type.  If the name\n  // starts with a '.', it is fully-qualified.  Otherwise, C++-like scoping\n  // rules are used to find the type (i.e. first the nested types within this\n  // message are searched, then within the parent, on up to the root\n  // namespace).\n  optional string type_name = 6;\n\n  // For extensions, this is the name of the type being extended.  It is\n  // resolved in the same manner as type_name.\n  optional string extendee = 2;\n\n  // For numeric types, contains the original text representation of the value.\n  // For booleans, \"true\" or \"false\".\n  // For strings, contains the default text contents (not escaped in any way).\n  // For bytes, contains the C escaped value.  All bytes >= 128 are escaped.\n  optional string default_value = 7;\n\n  // If set, gives the index of a oneof in the containing type's oneof_decl\n  // list.  This field is a member of that oneof.\n  optional int32 oneof_index = 9;\n\n  // JSON name of this field. The value is set by protocol compiler. 
If the\n  // user has set a \"json_name\" option on this field, that option's value\n  // will be used. Otherwise, it's deduced from the field's name by converting\n  // it to camelCase.\n  optional string json_name = 10;\n\n  optional FieldOptions options = 8;\n\n  // If true, this is a proto3 \"optional\". When a proto3 field is optional, it\n  // tracks presence regardless of field type.\n  //\n  // When proto3_optional is true, this field must belong to a oneof to\n  // signal to old proto3 clients that presence is tracked for this field. This\n  // oneof is known as a \"synthetic\" oneof, and this field must be its sole\n  // member (each proto3 optional field gets its own synthetic oneof). Synthetic\n  // oneofs exist in the descriptor only, and do not generate any API. Synthetic\n  // oneofs must be ordered after all \"real\" oneofs.\n  //\n  // For message fields, proto3_optional doesn't create any semantic change,\n  // since non-repeated message fields always track presence. However it still\n  // indicates the semantic detail of whether the user wrote \"optional\" or not.\n  // This can be useful for round-tripping the .proto file. For consistency we\n  // give message fields a synthetic oneof also, even though it is not required\n  // to track presence. This is especially important because the parser can't\n  // tell if a field is a message or an enum, so it must always create a\n  // synthetic oneof.\n  //\n  // Proto2 optional fields do not set this flag, because they already indicate\n  // optional with `LABEL_OPTIONAL`.\n  optional bool proto3_optional = 17;\n}\n\n// Describes a oneof.\nmessage OneofDescriptorProto {\n  optional string name = 1;\n  optional OneofOptions options = 2;\n}\n\n// Describes an enum type.\nmessage EnumDescriptorProto {\n  optional string name = 1;\n\n  repeated EnumValueDescriptorProto value = 2;\n\n  optional EnumOptions options = 3;\n\n  // Range of reserved numeric values. 
Reserved values may not be used by\n  // entries in the same enum. Reserved ranges may not overlap.\n  //\n  // Note that this is distinct from DescriptorProto.ReservedRange in that it\n  // is inclusive such that it can appropriately represent the entire int32\n  // domain.\n  message EnumReservedRange {\n    optional int32 start = 1;  // Inclusive.\n    optional int32 end = 2;    // Inclusive.\n  }\n\n  // Range of reserved numeric values. Reserved numeric values may not be used\n  // by enum values in the same enum declaration. Reserved ranges may not\n  // overlap.\n  repeated EnumReservedRange reserved_range = 4;\n\n  // Reserved enum value names, which may not be reused. A given name may only\n  // be reserved once.\n  repeated string reserved_name = 5;\n}\n\n// Describes a value within an enum.\nmessage EnumValueDescriptorProto {\n  optional string name = 1;\n  optional int32 number = 2;\n\n  optional EnumValueOptions options = 3;\n}\n\n// Describes a service.\nmessage ServiceDescriptorProto {\n  optional string name = 1;\n  repeated MethodDescriptorProto method = 2;\n\n  optional ServiceOptions options = 3;\n}\n\n// Describes a method of a service.\nmessage MethodDescriptorProto {\n  optional string name = 1;\n\n  // Input and output type names.  These are resolved in the same way as\n  // FieldDescriptorProto.type_name, but must refer to a message type.\n  optional string input_type = 2;\n  optional string output_type = 3;\n\n  optional MethodOptions options = 4;\n\n  // Identifies if client streams multiple client messages\n  optional bool client_streaming = 5 [default = false];\n  // Identifies if server streams multiple server messages\n  optional bool server_streaming = 6 [default = false];\n}\n\n\n// ===================================================================\n// Options\n\n// Each of the definitions above may have \"options\" attached.  
These are\n// just annotations which may cause code to be generated slightly differently\n// or may contain hints for code that manipulates protocol messages.\n//\n// Clients may define custom options as extensions of the *Options messages.\n// These extensions may not yet be known at parsing time, so the parser cannot\n// store the values in them.  Instead it stores them in a field in the *Options\n// message called uninterpreted_option. This field must have the same name\n// across all *Options messages. We then use this field to populate the\n// extensions when we build a descriptor, at which point all protos have been\n// parsed and so all extensions are known.\n//\n// Extension numbers for custom options may be chosen as follows:\n// * For options which will only be used within a single application or\n//   organization, or for experimental options, use field numbers 50000\n//   through 99999.  It is up to you to ensure that you do not use the\n//   same number for multiple options.\n// * For options which will be published and used publicly by multiple\n//   independent entities, e-mail protobuf-global-extension-registry@google.com\n//   to reserve extension numbers. Simply provide your project name (e.g.\n//   Objective-C plugin) and your project website (if available) -- there's no\n//   need to explain how you intend to use them. Usually you only need one\n//   extension number. You can declare multiple options with only one extension\n//   number by putting them in a sub-message. See the Custom Options section of\n//   the docs for examples:\n//   https://developers.google.com/protocol-buffers/docs/proto#options\n//   If this turns out to be popular, a web service will be set up\n//   to automatically assign option numbers.\n\nmessage FileOptions {\n\n  // Sets the Java package where classes generated from this .proto will be\n  // placed.  
By default, the proto package is used, but this is often\n  // inappropriate because proto packages do not normally start with backwards\n  // domain names.\n  optional string java_package = 1;\n\n\n  // Controls the name of the wrapper Java class generated for the .proto file.\n  // That class will always contain the .proto file's getDescriptor() method as\n  // well as any top-level extensions defined in the .proto file.\n  // If java_multiple_files is disabled, then all the other classes from the\n  // .proto file will be nested inside the single wrapper outer class.\n  optional string java_outer_classname = 8;\n\n  // If enabled, then the Java code generator will generate a separate .java\n  // file for each top-level message, enum, and service defined in the .proto\n  // file.  Thus, these types will *not* be nested inside the wrapper class\n  // named by java_outer_classname.  However, the wrapper class will still be\n  // generated to contain the file's getDescriptor() method as well as any\n  // top-level extensions defined in the file.\n  optional bool java_multiple_files = 10 [default = false];\n\n  // This option does nothing.\n  optional bool java_generate_equals_and_hash = 20 [deprecated=true];\n\n  // If set true, then the Java2 code generator will generate code that\n  // throws an exception whenever an attempt is made to assign a non-UTF-8\n  // byte sequence to a string field.\n  // Message reflection will do the same.\n  // However, an extension field still accepts non-UTF-8 byte sequences.\n  // This option has no effect when used with the lite runtime.\n  optional bool java_string_check_utf8 = 27 [default = false];\n\n\n  // Generated classes can be optimized for speed or code size.\n  enum OptimizeMode {\n    SPEED = 1;         // Generate complete code for parsing, serialization,\n                       // etc.\n    CODE_SIZE = 2;     // Use ReflectionOps to implement these methods.\n    LITE_RUNTIME = 3;  // Generate code using MessageLite 
and the lite runtime.\n  }\n  optional OptimizeMode optimize_for = 9 [default = SPEED];\n\n  // Sets the Go package where structs generated from this .proto will be\n  // placed. If omitted, the Go package will be derived from the following:\n  //   - The basename of the package import path, if provided.\n  //   - Otherwise, the package statement in the .proto file, if present.\n  //   - Otherwise, the basename of the .proto file, without extension.\n  optional string go_package = 11;\n\n\n\n\n  // Should generic services be generated in each language?  \"Generic\" services\n  // are not specific to any particular RPC system.  They are generated by the\n  // main code generators in each language (without additional plugins).\n  // Generic services were the only kind of service generation supported by\n  // early versions of google.protobuf.\n  //\n  // Generic services are now considered deprecated in favor of using plugins\n  // that generate code specific to your particular RPC system.  Therefore,\n  // these default to false.  Old code which depends on generic services should\n  // explicitly set them to true.\n  optional bool cc_generic_services = 16 [default = false];\n  optional bool java_generic_services = 17 [default = false];\n  optional bool py_generic_services = 18 [default = false];\n  optional bool php_generic_services = 42 [default = false];\n\n  // Is this file deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for everything in the file, or it will be completely ignored; in the very\n  // least, this is a formalization for deprecating files.\n  optional bool deprecated = 23 [default = false];\n\n  // Enables the use of arenas for the proto messages in this file. This applies\n  // only to generated classes for C++.\n  optional bool cc_enable_arenas = 31 [default = true];\n\n\n  // Sets the objective c class prefix which is prepended to all objective c\n  // generated classes from this .proto. 
There is no default.\n  optional string objc_class_prefix = 36;\n\n  // Namespace for generated classes; defaults to the package.\n  optional string csharp_namespace = 37;\n\n  // By default Swift generators will take the proto package and CamelCase it\n  // replacing '.' with underscore and use that to prefix the types/symbols\n  // defined. When this option is provided, they will use this value instead\n  // to prefix the types/symbols defined.\n  optional string swift_prefix = 39;\n\n  // Sets the php class prefix which is prepended to all php generated classes\n  // from this .proto. Default is empty.\n  optional string php_class_prefix = 40;\n\n  // Use this option to change the namespace of php generated classes. Default\n  // is empty. When this option is empty, the package name will be used for\n  // determining the namespace.\n  optional string php_namespace = 41;\n\n  // Use this option to change the namespace of php generated metadata classes.\n  // Default is empty. When this option is empty, the proto file name will be\n  // used for determining the namespace.\n  optional string php_metadata_namespace = 44;\n\n  // Use this option to change the package of ruby generated classes. Default\n  // is empty. When this option is not set, the package name will be used for\n  // determining the ruby package.\n  optional string ruby_package = 45;\n\n\n  // The parser stores options it doesn't recognize here.\n  // See the documentation for the \"Options\" section above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message.\n  // See the documentation for the \"Options\" section above.\n  extensions 1000 to max;\n\n  reserved 38;\n}\n\nmessage MessageOptions {\n  // Set true to use the old proto1 MessageSet wire format for extensions.\n  // This is provided for backwards-compatibility with the MessageSet wire\n  // format.  
You should not use this for any other reason:  It's less\n  // efficient, has fewer features, and is more complicated.\n  //\n  // The message must be defined exactly as follows:\n  //   message Foo {\n  //     option message_set_wire_format = true;\n  //     extensions 4 to max;\n  //   }\n  // Note that the message cannot have any defined fields; MessageSets only\n  // have extensions.\n  //\n  // All extensions of your type must be singular messages; e.g. they cannot\n  // be int32s, enums, or repeated messages.\n  //\n  // Because this is an option, the above two restrictions are not enforced by\n  // the protocol compiler.\n  optional bool message_set_wire_format = 1 [default = false];\n\n  // Disables the generation of the standard \"descriptor()\" accessor, which can\n  // conflict with a field of the same name.  This is meant to make migration\n  // from proto1 easier; new code should avoid fields named \"descriptor\".\n  optional bool no_standard_descriptor_accessor = 2 [default = false];\n\n  // Is this message deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for the message, or it will be completely ignored; in the very least,\n  // this is a formalization for deprecating messages.\n  optional bool deprecated = 3 [default = false];\n\n  reserved 4, 5, 6;\n\n  // Whether the message is an automatically generated map entry type for the\n  // maps field.\n  //\n  // For maps fields:\n  //     map<KeyType, ValueType> map_field = 1;\n  // The parsed descriptor looks like:\n  //     message MapFieldEntry {\n  //         option map_entry = true;\n  //         optional KeyType key = 1;\n  //         optional ValueType value = 2;\n  //     }\n  //     repeated MapFieldEntry map_field = 1;\n  //\n  // Implementations may choose not to generate the map_entry=true message, but\n  // use a native map in the target language to hold the keys and values.\n  // The reflection APIs in such implementations still need to work 
as\n  // if the field is a repeated message field.\n  //\n  // NOTE: Do not set the option in .proto files. Always use the maps syntax\n  // instead. The option should only be implicitly set by the proto compiler\n  // parser.\n  optional bool map_entry = 7;\n\n  reserved 8;  // javalite_serializable\n  reserved 9;  // javanano_as_lite\n\n\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n}\n\nmessage FieldOptions {\n  // The ctype option instructs the C++ code generator to use a different\n  // representation of the field than it normally would.  See the specific\n  // options below.  This option is not yet implemented in the open source\n  // release -- sorry, we'll try to include it in a future version!\n  optional CType ctype = 1 [default = STRING];\n  enum CType {\n    // Default mode.\n    STRING = 0;\n\n    CORD = 1;\n\n    STRING_PIECE = 2;\n  }\n  // The packed option can be enabled for repeated primitive fields to enable\n  // a more efficient representation on the wire. Rather than repeatedly\n  // writing the tag and type for each element, the entire array is encoded as\n  // a single length-delimited blob. In proto3, only explicit setting it to\n  // false will avoid using packed encoding.\n  optional bool packed = 2;\n\n  // The jstype option determines the JavaScript type used for values of the\n  // field.  The option is permitted only for 64 bit integral and fixed types\n  // (int64, uint64, sint64, fixed64, sfixed64).  A field with jstype JS_STRING\n  // is represented as JavaScript string, which avoids loss of precision that\n  // can happen when a large value is converted to a floating point JavaScript.\n  // Specifying JS_NUMBER for the jstype causes the generated JavaScript code to\n  // use the JavaScript \"number\" type.  
The behavior of the default option\n  // JS_NORMAL is implementation dependent.\n  //\n  // This option is an enum to permit additional types to be added, e.g.\n  // goog.math.Integer.\n  optional JSType jstype = 6 [default = JS_NORMAL];\n  enum JSType {\n    // Use the default type.\n    JS_NORMAL = 0;\n\n    // Use JavaScript strings.\n    JS_STRING = 1;\n\n    // Use JavaScript numbers.\n    JS_NUMBER = 2;\n  }\n\n  // Should this field be parsed lazily?  Lazy applies only to message-type\n  // fields.  It means that when the outer message is initially parsed, the\n  // inner message's contents will not be parsed but instead stored in encoded\n  // form.  The inner message will actually be parsed when it is first accessed.\n  //\n  // This is only a hint.  Implementations are free to choose whether to use\n  // eager or lazy parsing regardless of the value of this option.  However,\n  // setting this option true suggests that the protocol author believes that\n  // using lazy parsing on this field is worth the additional bookkeeping\n  // overhead typically needed to implement it.\n  //\n  // This option does not affect the public interface of any generated code;\n  // all method signatures remain the same.  Furthermore, thread-safety of the\n  // interface is not affected by this option; const methods remain safe to\n  // call from multiple threads concurrently, while non-const methods continue\n  // to require exclusive access.\n  //\n  //\n  // Note that implementations may choose not to check required fields within\n  // a lazy sub-message.  That is, calling IsInitialized() on the outer message\n  // may return true even if the inner message has missing required fields.\n  // This is necessary because otherwise the inner message would have to be\n  // parsed in order to perform the check, defeating the purpose of lazy\n  // parsing.  An implementation which chooses not to check required fields\n  // must be consistent about it.  
That is, for any particular sub-message, the\n  // implementation must either *always* check its required fields, or *never*\n  // check its required fields, regardless of whether or not the message has\n  // been parsed.\n  //\n  // As of 2021, lazy does no correctness checks on the byte stream during\n  // parsing.  This may lead to crashes if and when an invalid byte stream is\n  // finally parsed upon access.\n  //\n  // TODO(b/211906113):  Enable validation on lazy fields.\n  optional bool lazy = 5 [default = false];\n\n  // unverified_lazy does no correctness checks on the byte stream. This should\n  // only be used where lazy with verification is prohibitive for performance\n  // reasons.\n  optional bool unverified_lazy = 15 [default = false];\n\n  // Is this field deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for accessors, or it will be completely ignored; in the very least, this\n  // is a formalization for deprecating fields.\n  optional bool deprecated = 3 [default = false];\n\n  // For Google-internal migration only. Do not use.\n  optional bool weak = 10 [default = false];\n\n\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n\n  reserved 4;  // removed jtype\n}\n\nmessage OneofOptions {\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. 
See above.\n  extensions 1000 to max;\n}\n\nmessage EnumOptions {\n\n  // Set this option to true to allow mapping different tag names to the same\n  // value.\n  optional bool allow_alias = 2;\n\n  // Is this enum deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for the enum, or it will be completely ignored; in the very least, this\n  // is a formalization for deprecating enums.\n  optional bool deprecated = 3 [default = false];\n\n  reserved 5;  // javanano_as_lite\n\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n}\n\nmessage EnumValueOptions {\n  // Is this enum value deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for the enum value, or it will be completely ignored; in the very least,\n  // this is a formalization for deprecating enum values.\n  optional bool deprecated = 1 [default = false];\n\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n}\n\nmessage ServiceOptions {\n\n  // Note:  Field numbers 1 through 32 are reserved for Google's internal RPC\n  //   framework.  We apologize for hoarding these numbers to ourselves, but\n  //   we were already using them long before we decided to release Protocol\n  //   Buffers.\n\n  // Is this service deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for the service, or it will be completely ignored; in the very least,\n  // this is a formalization for deprecating services.\n  optional bool deprecated = 33 [default = false];\n\n  // The parser stores options it doesn't recognize here. 
See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n}\n\nmessage MethodOptions {\n\n  // Note:  Field numbers 1 through 32 are reserved for Google's internal RPC\n  //   framework.  We apologize for hoarding these numbers to ourselves, but\n  //   we were already using them long before we decided to release Protocol\n  //   Buffers.\n\n  // Is this method deprecated?\n  // Depending on the target platform, this can emit Deprecated annotations\n  // for the method, or it will be completely ignored; in the very least,\n  // this is a formalization for deprecating methods.\n  optional bool deprecated = 33 [default = false];\n\n  // Is this method side-effect-free (or safe in HTTP parlance), or idempotent,\n  // or neither? HTTP based RPC implementation may choose GET verb for safe\n  // methods, and PUT verb for idempotent methods instead of the default POST.\n  enum IdempotencyLevel {\n    IDEMPOTENCY_UNKNOWN = 0;\n    NO_SIDE_EFFECTS = 1;  // implies idempotent\n    IDEMPOTENT = 2;       // idempotent, but may have side effects\n  }\n  optional IdempotencyLevel idempotency_level = 34\n      [default = IDEMPOTENCY_UNKNOWN];\n\n  // The parser stores options it doesn't recognize here. See above.\n  repeated UninterpretedOption uninterpreted_option = 999;\n\n  // Clients can define custom options in extensions of this message. See above.\n  extensions 1000 to max;\n}\n\n\n// A message representing an option the parser does not recognize. This only\n// appears in options protos created by the compiler::Parser class.\n// DescriptorPool resolves these when building Descriptor objects. Therefore,\n// options protos in descriptor objects (e.g. returned by Descriptor::options(),\n// or produced by Descriptor::CopyTo()) will never have UninterpretedOptions\n// in them.\nmessage UninterpretedOption {\n  // The name of the uninterpreted option.  
Each string represents a segment in\n  // a dot-separated name.  is_extension is true iff a segment represents an\n  // extension (denoted with parentheses in options specs in .proto files).\n  // E.g.,{ [\"foo\", false], [\"bar.baz\", true], [\"moo\", false] } represents\n  // \"foo.(bar.baz).moo\".\n  message NamePart {\n    required string name_part = 1;\n    required bool is_extension = 2;\n  }\n  repeated NamePart name = 2;\n\n  // The value of the uninterpreted option, in whatever type the tokenizer\n  // identified it as during parsing. Exactly one of these should be set.\n  optional string identifier_value = 3;\n  optional uint64 positive_int_value = 4;\n  optional int64 negative_int_value = 5;\n  optional double double_value = 6;\n  optional bytes string_value = 7;\n  optional string aggregate_value = 8;\n}\n\n// ===================================================================\n// Optional source code info\n\n// Encapsulates information about the original source file from which a\n// FileDescriptorProto was generated.\nmessage SourceCodeInfo {\n  // A Location identifies a piece of source code in a .proto file which\n  // corresponds to a particular definition.  
This information is intended\n  // to be useful to IDEs, code indexers, documentation generators, and similar\n  // tools.\n  //\n  // For example, say we have a file like:\n  //   message Foo {\n  //     optional string foo = 1;\n  //   }\n  // Let's look at just the field definition:\n  //   optional string foo = 1;\n  //   ^       ^^     ^^  ^  ^^^\n  //   a       bc     de  f  ghi\n  // We have the following locations:\n  //   span   path               represents\n  //   [a,i)  [ 4, 0, 2, 0 ]     The whole field definition.\n  //   [a,b)  [ 4, 0, 2, 0, 4 ]  The label (optional).\n  //   [c,d)  [ 4, 0, 2, 0, 5 ]  The type (string).\n  //   [e,f)  [ 4, 0, 2, 0, 1 ]  The name (foo).\n  //   [g,h)  [ 4, 0, 2, 0, 3 ]  The number (1).\n  //\n  // Notes:\n  // - A location may refer to a repeated field itself (i.e. not to any\n  //   particular index within it).  This is used whenever a set of elements are\n  //   logically enclosed in a single code segment.  For example, an entire\n  //   extend block (possibly containing multiple extension definitions) will\n  //   have an outer location whose path refers to the \"extensions\" repeated\n  //   field without an index.\n  // - Multiple locations may have the same path.  This happens when a single\n  //   logical declaration is spread out across multiple places.  The most\n  //   obvious example is the \"extend\" block again -- there may be multiple\n  //   extend blocks in the same scope, each of which will have the same path.\n  // - A location's span is not always a subset of its parent's span.  For\n  //   example, the \"extendee\" of an extension declaration appears at the\n  //   beginning of the \"extend\" block and is shared by all extensions within\n  //   the block.\n  // - Just because a location's span is a subset of some other location's span\n  //   does not mean that it is a descendant.  For example, a \"group\" defines\n  //   both a type and a field in a single declaration.  
Thus, the locations\n  //   corresponding to the type and field and their components will overlap.\n  // - Code which tries to interpret locations should probably be designed to\n  //   ignore those that it doesn't understand, as more types of locations could\n  //   be recorded in the future.\n  repeated Location location = 1;\n  message Location {\n    // Identifies which part of the FileDescriptorProto was defined at this\n    // location.\n    //\n    // Each element is a field number or an index.  They form a path from\n    // the root FileDescriptorProto to the place where the definition occurs.\n    // For example, this path:\n    //   [ 4, 3, 2, 7, 1 ]\n    // refers to:\n    //   file.message_type(3)  // 4, 3\n    //       .field(7)         // 2, 7\n    //       .name()           // 1\n    // This is because FileDescriptorProto.message_type has field number 4:\n    //   repeated DescriptorProto message_type = 4;\n    // and DescriptorProto.field has field number 2:\n    //   repeated FieldDescriptorProto field = 2;\n    // and FieldDescriptorProto.name has field number 1:\n    //   optional string name = 1;\n    //\n    // Thus, the above path gives the location of a field name.  If we removed\n    // the last element:\n    //   [ 4, 3, 2, 7 ]\n    // this path refers to the whole field declaration (from the beginning\n    // of the label to the terminating semicolon).\n    repeated int32 path = 1 [packed = true];\n\n    // Always has exactly three or four elements: start line, start column,\n    // end line (optional, otherwise assumed same as start line), end column.\n    // These are packed into a single field for efficiency.  
Note that line\n    // and column numbers are zero-based -- typically you will want to add\n    // 1 to each before displaying to a user.\n    repeated int32 span = 2 [packed = true];\n\n    // If this SourceCodeInfo represents a complete declaration, these are any\n    // comments appearing before and after the declaration which appear to be\n    // attached to the declaration.\n    //\n    // A series of line comments appearing on consecutive lines, with no other\n    // tokens appearing on those lines, will be treated as a single comment.\n    //\n    // leading_detached_comments will keep paragraphs of comments that appear\n    // before (but not connected to) the current element. Each paragraph,\n    // separated by empty lines, will be one comment element in the repeated\n    // field.\n    //\n    // Only the comment content is provided; comment markers (e.g. //) are\n    // stripped out.  For block comments, leading whitespace and an asterisk\n    // will be stripped from the beginning of each line other than the first.\n    // Newlines are included in the output.\n    //\n    // Examples:\n    //\n    //   optional int32 foo = 1;  // Comment attached to foo.\n    //   // Comment attached to bar.\n    //   optional int32 bar = 2;\n    //\n    //   optional string baz = 3;\n    //   // Comment attached to baz.\n    //   // Another line attached to baz.\n    //\n    //   // Comment attached to moo.\n    //   //\n    //   // Another line attached to moo.\n    //   optional double moo = 4;\n    //\n    //   // Detached comment for corge. This is not leading or trailing comments\n    //   // to moo or corge because there are blank lines separating it from\n    //   // both.\n    //\n    //   // Detached comment for corge paragraph 2.\n    //\n    //   optional string corge = 5;\n    //   /* Block comment attached\n    //    * to corge.  Leading asterisks\n    //    * will be removed. */\n    //   /* Block comment attached to\n    //    * grault. 
*/\n    //   optional int32 grault = 6;\n    //\n    //   // ignored detached comments.\n    optional string leading_comments = 3;\n    optional string trailing_comments = 4;\n    repeated string leading_detached_comments = 6;\n  }\n}\n\n// Describes the relationship between generated code and its original source\n// file. A GeneratedCodeInfo message is associated with only one generated\n// source file, but may contain references to different source .proto files.\nmessage GeneratedCodeInfo {\n  // An Annotation connects some span of text in generated code to an element\n  // of its generating .proto file.\n  repeated Annotation annotation = 1;\n  message Annotation {\n    // Identifies the element in the original source .proto file. This field\n    // is formatted the same as SourceCodeInfo.Location.path.\n    repeated int32 path = 1 [packed = true];\n\n    // Identifies the filesystem path to the original source .proto.\n    optional string source_file = 2;\n\n    // Identifies the starting offset in bytes in the generated code\n    // that relates to the identified object.\n    optional int32 begin = 3;\n\n    // Identifies the ending offset in bytes in the generated code that\n    // relates to the identified offset. The end offset should be one past\n    // the last relevant byte (so the length of the text = end - begin).\n    optional int32 end = 4;\n  }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/duration.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption cc_enable_arenas = true;\noption go_package = \"google.golang.org/protobuf/types/known/durationpb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"DurationProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\n\n// A Duration represents a signed, fixed-length span of time represented\n// as a count of seconds and fractions of seconds at nanosecond\n// resolution. It is independent of any calendar and concepts like \"day\"\n// or \"month\". It is related to Timestamp in that the difference between\n// two Timestamp values is a Duration and it can be added or subtracted\n// from a Timestamp. 
Range is approximately +-10,000 years.\n//\n// # Examples\n//\n// Example 1: Compute Duration from two Timestamps in pseudo code.\n//\n//     Timestamp start = ...;\n//     Timestamp end = ...;\n//     Duration duration = ...;\n//\n//     duration.seconds = end.seconds - start.seconds;\n//     duration.nanos = end.nanos - start.nanos;\n//\n//     if (duration.seconds < 0 && duration.nanos > 0) {\n//       duration.seconds += 1;\n//       duration.nanos -= 1000000000;\n//     } else if (duration.seconds > 0 && duration.nanos < 0) {\n//       duration.seconds -= 1;\n//       duration.nanos += 1000000000;\n//     }\n//\n// Example 2: Compute Timestamp from Timestamp + Duration in pseudo code.\n//\n//     Timestamp start = ...;\n//     Duration duration = ...;\n//     Timestamp end = ...;\n//\n//     end.seconds = start.seconds + duration.seconds;\n//     end.nanos = start.nanos + duration.nanos;\n//\n//     if (end.nanos < 0) {\n//       end.seconds -= 1;\n//       end.nanos += 1000000000;\n//     } else if (end.nanos >= 1000000000) {\n//       end.seconds += 1;\n//       end.nanos -= 1000000000;\n//     }\n//\n// Example 3: Compute Duration from datetime.timedelta in Python.\n//\n//     td = datetime.timedelta(days=3, minutes=10)\n//     duration = Duration()\n//     duration.FromTimedelta(td)\n//\n// # JSON Mapping\n//\n// In JSON format, the Duration type is encoded as a string rather than an\n// object, where the string ends in the suffix \"s\" (indicating seconds) and\n// is preceded by the number of seconds, with nanoseconds expressed as\n// fractional seconds. For example, 3 seconds with 0 nanoseconds should be\n// encoded in JSON format as \"3s\", while 3 seconds and 1 nanosecond should\n// be expressed in JSON format as \"3.000000001s\", and 3 seconds and 1\n// microsecond should be expressed in JSON format as \"3.000001s\".\n//\n//\nmessage Duration {\n  // Signed seconds of the span of time. Must be from -315,576,000,000\n  // to +315,576,000,000 inclusive. 
Note: these bounds are computed from:\n  // 60 sec/min * 60 min/hr * 24 hr/day * 365.25 days/year * 10000 years\n  int64 seconds = 1;\n\n  // Signed fractions of a second at nanosecond resolution of the span\n  // of time. Durations less than one second are represented with a 0\n  // `seconds` field and a positive or negative `nanos` field. For durations\n  // of one second or more, a non-zero value for the `nanos` field must be\n  // of the same sign as the `seconds` field. Must be from -999,999,999\n  // to +999,999,999 inclusive.\n  int32 nanos = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/empty.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption go_package = \"google.golang.org/protobuf/types/known/emptypb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"EmptyProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\noption cc_enable_arenas = true;\n\n// A generic empty message that you can re-use to avoid defining duplicated\n// empty messages in your APIs. A typical example is to use it as the request\n// or the response type of an API method. For instance:\n//\n//     service Foo {\n//       rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);\n//     }\n//\nmessage Empty {}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/field_mask.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"FieldMaskProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\noption go_package = \"google.golang.org/protobuf/types/known/fieldmaskpb\";\noption cc_enable_arenas = true;\n\n// `FieldMask` represents a set of symbolic field paths, for example:\n//\n//     paths: \"f.a\"\n//     paths: \"f.b.d\"\n//\n// Here `f` represents a field in some root message, `a` and `b`\n// fields in the message found in `f`, and `d` a field found in the\n// message in `f.b`.\n//\n// Field masks are used to specify a subset of fields that should be\n// returned by a get operation or modified by an update operation.\n// Field masks also have a custom JSON encoding (see below).\n//\n// # Field Masks in Projections\n//\n// When used in the context of a projection, a response message or\n// sub-message is filtered by the API to only contain those fields as\n// specified in the mask. 
For example, if the mask in the previous\n// example is applied to a response message as follows:\n//\n//     f {\n//       a : 22\n//       b {\n//         d : 1\n//         x : 2\n//       }\n//       y : 13\n//     }\n//     z: 8\n//\n// The result will not contain specific values for fields x,y and z\n// (their value will be set to the default, and omitted in proto text\n// output):\n//\n//\n//     f {\n//       a : 22\n//       b {\n//         d : 1\n//       }\n//     }\n//\n// A repeated field is not allowed except at the last position of a\n// paths string.\n//\n// If a FieldMask object is not present in a get operation, the\n// operation applies to all fields (as if a FieldMask of all fields\n// had been specified).\n//\n// Note that a field mask does not necessarily apply to the\n// top-level response message. In case of a REST get operation, the\n// field mask applies directly to the response, but in case of a REST\n// list operation, the mask instead applies to each individual message\n// in the returned resource list. In case of a REST custom method,\n// other definitions may be used. Where the mask applies will be\n// clearly documented together with its declaration in the API.  In\n// any case, the effect on the returned resource/resources is required\n// behavior for APIs.\n//\n// # Field Masks in Update Operations\n//\n// A field mask in update operations specifies which fields of the\n// targeted resource are going to be updated. The API is required\n// to only change the values of the fields as specified in the mask\n// and leave the others untouched. If a resource is passed in to\n// describe the updated values, the API ignores the values of all\n// fields not covered by the mask.\n//\n// If a repeated field is specified for an update operation, new values will\n// be appended to the existing repeated field in the target resource. 
Note that\n// a repeated field is only allowed in the last position of a `paths` string.\n//\n// If a sub-message is specified in the last position of the field mask for an\n// update operation, then new value will be merged into the existing sub-message\n// in the target resource.\n//\n// For example, given the target message:\n//\n//     f {\n//       b {\n//         d: 1\n//         x: 2\n//       }\n//       c: [1]\n//     }\n//\n// And an update message:\n//\n//     f {\n//       b {\n//         d: 10\n//       }\n//       c: [2]\n//     }\n//\n// then if the field mask is:\n//\n//  paths: [\"f.b\", \"f.c\"]\n//\n// then the result will be:\n//\n//     f {\n//       b {\n//         d: 10\n//         x: 2\n//       }\n//       c: [1, 2]\n//     }\n//\n// An implementation may provide options to override this default behavior for\n// repeated and message fields.\n//\n// In order to reset a field's value to the default, the field must\n// be in the mask and set to the default value in the provided resource.\n// Hence, in order to reset all fields of a resource, provide a default\n// instance of the resource and set all fields in the mask, or do\n// not provide a mask as described below.\n//\n// If a field mask is not present on update, the operation applies to\n// all fields (as if a field mask of all fields has been specified).\n// Note that in the presence of schema evolution, this may mean that\n// fields the client does not know and has therefore not filled into\n// the request will be reset to their default. If this is unwanted\n// behavior, a specific service may require a client to always specify\n// a field mask, producing an error if not.\n//\n// As with get operations, the location of the resource which\n// describes the updated values in the request message depends on the\n// operation kind. 
In any case, the effect of the field mask is\n// required to be honored by the API.\n//\n// ## Considerations for HTTP REST\n//\n// The HTTP kind of an update operation which uses a field mask must\n// be set to PATCH instead of PUT in order to satisfy HTTP semantics\n// (PUT must only be used for full updates).\n//\n// # JSON Encoding of Field Masks\n//\n// In JSON, a field mask is encoded as a single string where paths are\n// separated by a comma. Fields name in each path are converted\n// to/from lower-camel naming conventions.\n//\n// As an example, consider the following message declarations:\n//\n//     message Profile {\n//       User user = 1;\n//       Photo photo = 2;\n//     }\n//     message User {\n//       string display_name = 1;\n//       string address = 2;\n//     }\n//\n// In proto a field mask for `Profile` may look as such:\n//\n//     mask {\n//       paths: \"user.display_name\"\n//       paths: \"photo\"\n//     }\n//\n// In JSON, the same mask is represented as below:\n//\n//     {\n//       mask: \"user.displayName,photo\"\n//     }\n//\n// # Field Masks and Oneof Fields\n//\n// Field masks treat fields in oneofs just as regular fields. Consider the\n// following message:\n//\n//     message SampleMessage {\n//       oneof test_oneof {\n//         string name = 4;\n//         SubMessage sub_message = 9;\n//       }\n//     }\n//\n// The field mask can be:\n//\n//     mask {\n//       paths: \"name\"\n//     }\n//\n// Or:\n//\n//     mask {\n//       paths: \"sub_message\"\n//     }\n//\n// Note that oneof type names (\"test_oneof\" in this case) cannot be used in\n// paths.\n//\n// ## Field Mask Verification\n//\n// The implementation of any API method which has a FieldMask type field in the\n// request should verify the included field paths, and return an\n// `INVALID_ARGUMENT` error if any path is unmappable.\nmessage FieldMask {\n  // The set of field mask paths.\n  repeated string paths = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/source_context.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"SourceContextProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\noption go_package = \"google.golang.org/protobuf/types/known/sourcecontextpb\";\n\n// `SourceContext` represents information about the source of a\n// protobuf element, like the file in which it is defined.\nmessage SourceContext {\n  // The path-qualified name of the .proto file that contained the associated\n  // protobuf element.  For example: `\"google/protobuf/source_context.proto\"`.\n  string file_name = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/struct.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption cc_enable_arenas = true;\noption go_package = \"google.golang.org/protobuf/types/known/structpb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"StructProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\n\n// `Struct` represents a structured data value, consisting of fields\n// which map to dynamically typed values. In some languages, `Struct`\n// might be supported by a native representation. For example, in\n// scripting languages like JS a struct is represented as an\n// object. The details of that representation are described together\n// with the proto support for the language.\n//\n// The JSON representation for `Struct` is JSON object.\nmessage Struct {\n  // Unordered map of dynamically typed values.\n  map<string, Value> fields = 1;\n}\n\n// `Value` represents a dynamically typed value which can be either\n// null, a number, a string, a boolean, a recursive struct value, or a\n// list of values. A producer of value is expected to set one of these\n// variants. 
Absence of any variant indicates an error.\n//\n// The JSON representation for `Value` is JSON value.\nmessage Value {\n  // The kind of value.\n  oneof kind {\n    // Represents a null value.\n    NullValue null_value = 1;\n    // Represents a double value.\n    double number_value = 2;\n    // Represents a string value.\n    string string_value = 3;\n    // Represents a boolean value.\n    bool bool_value = 4;\n    // Represents a structured value.\n    Struct struct_value = 5;\n    // Represents a repeated `Value`.\n    ListValue list_value = 6;\n  }\n}\n\n// `NullValue` is a singleton enumeration to represent the null value for the\n// `Value` type union.\n//\n//  The JSON representation for `NullValue` is JSON `null`.\nenum NullValue {\n  // Null value.\n  NULL_VALUE = 0;\n}\n\n// `ListValue` is a wrapper around a repeated field of values.\n//\n// The JSON representation for `ListValue` is JSON array.\nmessage ListValue {\n  // Repeated field of dynamically typed values.\n  repeated Value values = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/timestamp.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption cc_enable_arenas = true;\noption go_package = \"google.golang.org/protobuf/types/known/timestamppb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"TimestampProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\n\n// A Timestamp represents a point in time independent of any time zone or local\n// calendar, encoded as a count of seconds and fractions of seconds at\n// nanosecond resolution. The count is relative to an epoch at UTC midnight on\n// January 1, 1970, in the proleptic Gregorian calendar which extends the\n// Gregorian calendar backwards to year one.\n//\n// All minutes are 60 seconds long. Leap seconds are \"smeared\" so that no leap\n// second table is needed for interpretation, using a [24-hour linear\n// smear](https://developers.google.com/time/smear).\n//\n// The range is from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z. 
By\n// restricting to that range, we ensure that we can convert to and from [RFC\n// 3339](https://www.ietf.org/rfc/rfc3339.txt) date strings.\n//\n// # Examples\n//\n// Example 1: Compute Timestamp from POSIX `time()`.\n//\n//     Timestamp timestamp;\n//     timestamp.set_seconds(time(NULL));\n//     timestamp.set_nanos(0);\n//\n// Example 2: Compute Timestamp from POSIX `gettimeofday()`.\n//\n//     struct timeval tv;\n//     gettimeofday(&tv, NULL);\n//\n//     Timestamp timestamp;\n//     timestamp.set_seconds(tv.tv_sec);\n//     timestamp.set_nanos(tv.tv_usec * 1000);\n//\n// Example 3: Compute Timestamp from Win32 `GetSystemTimeAsFileTime()`.\n//\n//     FILETIME ft;\n//     GetSystemTimeAsFileTime(&ft);\n//     UINT64 ticks = (((UINT64)ft.dwHighDateTime) << 32) | ft.dwLowDateTime;\n//\n//     // A Windows tick is 100 nanoseconds. Windows epoch 1601-01-01T00:00:00Z\n//     // is 11644473600 seconds before Unix epoch 1970-01-01T00:00:00Z.\n//     Timestamp timestamp;\n//     timestamp.set_seconds((INT64) ((ticks / 10000000) - 11644473600LL));\n//     timestamp.set_nanos((INT32) ((ticks % 10000000) * 100));\n//\n// Example 4: Compute Timestamp from Java `System.currentTimeMillis()`.\n//\n//     long millis = System.currentTimeMillis();\n//\n//     Timestamp timestamp = Timestamp.newBuilder().setSeconds(millis / 1000)\n//         .setNanos((int) ((millis % 1000) * 1000000)).build();\n//\n//\n// Example 5: Compute Timestamp from Java `Instant.now()`.\n//\n//     Instant now = Instant.now();\n//\n//     Timestamp timestamp =\n//         Timestamp.newBuilder().setSeconds(now.getEpochSecond())\n//             .setNanos(now.getNano()).build();\n//\n//\n// Example 6: Compute Timestamp from current time in Python.\n//\n//     timestamp = Timestamp()\n//     timestamp.GetCurrentTime()\n//\n// # JSON Mapping\n//\n// In JSON format, the Timestamp type is encoded as a string in the\n// [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format. 
That is, the\n// format is \"{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z\"\n// where {year} is always expressed using four digits while {month}, {day},\n// {hour}, {min}, and {sec} are zero-padded to two digits each. The fractional\n// seconds, which can go up to 9 digits (i.e. up to 1 nanosecond resolution),\n// are optional. The \"Z\" suffix indicates the timezone (\"UTC\"); the timezone\n// is required. A proto3 JSON serializer should always use UTC (as indicated by\n// \"Z\") when printing the Timestamp type and a proto3 JSON parser should be\n// able to accept both UTC and other timezones (as indicated by an offset).\n//\n// For example, \"2017-01-15T01:30:15.01Z\" encodes 15.01 seconds past\n// 01:30 UTC on January 15, 2017.\n//\n// In JavaScript, one can convert a Date object to this format using the\n// standard\n// [toISOString()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString)\n// method. In Python, a standard `datetime.datetime` object can be converted\n// to this format using\n// [`strftime`](https://docs.python.org/2/library/time.html#time.strftime) with\n// the time format spec '%Y-%m-%dT%H:%M:%S.%fZ'. Likewise, in Java, one can use\n// the Joda Time's [`ISODateTimeFormat.dateTime()`](\n// http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTime%2D%2D\n// ) to obtain a formatter capable of generating timestamps in this format.\n//\n//\nmessage Timestamp {\n  // Represents seconds of UTC time since Unix epoch\n  // 1970-01-01T00:00:00Z. Must be from 0001-01-01T00:00:00Z to\n  // 9999-12-31T23:59:59Z inclusive.\n  int64 seconds = 1;\n\n  // Non-negative fractions of a second at nanosecond resolution. Negative\n  // second values with fractions must still have non-negative nanos values\n  // that count forward in time. Must be from 0 to 999,999,999\n  // inclusive.\n  int32 nanos = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/type.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\nimport \"google/protobuf/any.proto\";\nimport \"google/protobuf/source_context.proto\";\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption cc_enable_arenas = true;\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"TypeProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\noption go_package = \"google.golang.org/protobuf/types/known/typepb\";\n\n// A protocol buffer message type.\nmessage Type {\n  // The fully qualified message name.\n  string name = 1;\n  // The list of fields.\n  repeated Field fields = 2;\n  // The list of types appearing in `oneof` definitions in this type.\n  repeated string oneofs = 3;\n  // The protocol buffer options.\n  repeated Option options = 4;\n  // The source context.\n  SourceContext source_context = 5;\n  // The source syntax.\n  Syntax syntax = 6;\n}\n\n// A single field of a message type.\nmessage Field {\n  // Basic field types.\n  enum Kind {\n    // Field type unknown.\n    TYPE_UNKNOWN = 0;\n    // Field type double.\n    TYPE_DOUBLE = 1;\n    // Field type float.\n    TYPE_FLOAT = 2;\n    // Field type int64.\n    TYPE_INT64 = 3;\n    // Field type uint64.\n    TYPE_UINT64 = 4;\n    // Field type int32.\n    TYPE_INT32 = 5;\n    // Field type fixed64.\n    TYPE_FIXED64 = 6;\n    // Field type fixed32.\n    TYPE_FIXED32 = 7;\n    // Field type bool.\n    
TYPE_BOOL = 8;\n    // Field type string.\n    TYPE_STRING = 9;\n    // Field type group. Proto2 syntax only, and deprecated.\n    TYPE_GROUP = 10;\n    // Field type message.\n    TYPE_MESSAGE = 11;\n    // Field type bytes.\n    TYPE_BYTES = 12;\n    // Field type uint32.\n    TYPE_UINT32 = 13;\n    // Field type enum.\n    TYPE_ENUM = 14;\n    // Field type sfixed32.\n    TYPE_SFIXED32 = 15;\n    // Field type sfixed64.\n    TYPE_SFIXED64 = 16;\n    // Field type sint32.\n    TYPE_SINT32 = 17;\n    // Field type sint64.\n    TYPE_SINT64 = 18;\n  }\n\n  // Whether a field is optional, required, or repeated.\n  enum Cardinality {\n    // For fields with unknown cardinality.\n    CARDINALITY_UNKNOWN = 0;\n    // For optional fields.\n    CARDINALITY_OPTIONAL = 1;\n    // For required fields. Proto2 syntax only.\n    CARDINALITY_REQUIRED = 2;\n    // For repeated fields.\n    CARDINALITY_REPEATED = 3;\n  }\n\n  // The field type.\n  Kind kind = 1;\n  // The field cardinality.\n  Cardinality cardinality = 2;\n  // The field number.\n  int32 number = 3;\n  // The field name.\n  string name = 4;\n  // The field type URL, without the scheme, for message or enumeration\n  // types. Example: `\"type.googleapis.com/google.protobuf.Timestamp\"`.\n  string type_url = 6;\n  // The index of the field type in `Type.oneofs`, for message or enumeration\n  // types. The first type has index 1; zero means the type is not in the list.\n  int32 oneof_index = 7;\n  // Whether to use alternative packed wire representation.\n  bool packed = 8;\n  // The protocol buffer options.\n  repeated Option options = 9;\n  // The field JSON name.\n  string json_name = 10;\n  // The string value of the default value of this field. 
Proto2 syntax only.\n  string default_value = 11;\n}\n\n// Enum type definition.\nmessage Enum {\n  // Enum type name.\n  string name = 1;\n  // Enum value definitions.\n  repeated EnumValue enumvalue = 2;\n  // Protocol buffer options.\n  repeated Option options = 3;\n  // The source context.\n  SourceContext source_context = 4;\n  // The source syntax.\n  Syntax syntax = 5;\n}\n\n// Enum value definition.\nmessage EnumValue {\n  // Enum value name.\n  string name = 1;\n  // Enum value number.\n  int32 number = 2;\n  // Protocol buffer options.\n  repeated Option options = 3;\n}\n\n// A protocol buffer option, which can be attached to a message, field,\n// enumeration, etc.\nmessage Option {\n  // The option's name. For protobuf built-in options (options defined in\n  // descriptor.proto), this is the short name. For example, `\"map_entry\"`.\n  // For custom options, it should be the fully-qualified name. For example,\n  // `\"google.api.http\"`.\n  string name = 1;\n  // The option's value packed in an Any message. If the value is a primitive,\n  // the corresponding wrapper type defined in google/protobuf/wrappers.proto\n  // should be used. If the value is an enum, it should be stored as an int32\n  // value using the google.protobuf.Int32Value type.\n  Any value = 2;\n}\n\n// The syntax in which a protocol buffer element is defined.\nenum Syntax {\n  // Syntax `proto2`.\n  SYNTAX_PROTO2 = 0;\n  // Syntax `proto3`.\n  SYNTAX_PROTO3 = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/google/protobuf/wrappers.proto",
    "content": "// Protocol Buffers - Google's data interchange format\n// Copyright 2008 Google Inc.  All rights reserved.\n// https://developers.google.com/protocol-buffers/\n//\n// Redistribution and use in source and binary forms, with or without\n// modification, are permitted provided that the following conditions are\n// met:\n//\n//     * Redistributions of source code must retain the above copyright\n// notice, this list of conditions and the following disclaimer.\n//     * Redistributions in binary form must reproduce the above\n// copyright notice, this list of conditions and the following disclaimer\n// in the documentation and/or other materials provided with the\n// distribution.\n//     * Neither the name of Google Inc. nor the names of its\n// contributors may be used to endorse or promote products derived from\n// this software without specific prior written permission.\n//\n// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n// \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\n// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n// Wrappers for primitive (non-message) types. 
These types are useful\n// for embedding primitives in the `google.protobuf.Any` type and for places\n// where we need to distinguish between the absence of a primitive\n// typed field and its default value.\n//\n// These wrappers have no meaningful use within repeated fields as they lack\n// the ability to detect presence on individual elements.\n// These wrappers have no meaningful use within a map or a oneof since\n// individual entries of a map or fields of a oneof can already detect presence.\n\nsyntax = \"proto3\";\n\npackage google.protobuf;\n\noption csharp_namespace = \"Google.Protobuf.WellKnownTypes\";\noption cc_enable_arenas = true;\noption go_package = \"google.golang.org/protobuf/types/known/wrapperspb\";\noption java_package = \"com.google.protobuf\";\noption java_outer_classname = \"WrappersProto\";\noption java_multiple_files = true;\noption objc_class_prefix = \"GPB\";\n\n// Wrapper message for `double`.\n//\n// The JSON representation for `DoubleValue` is JSON number.\nmessage DoubleValue {\n  // The double value.\n  double value = 1;\n}\n\n// Wrapper message for `float`.\n//\n// The JSON representation for `FloatValue` is JSON number.\nmessage FloatValue {\n  // The float value.\n  float value = 1;\n}\n\n// Wrapper message for `int64`.\n//\n// The JSON representation for `Int64Value` is JSON string.\nmessage Int64Value {\n  // The int64 value.\n  int64 value = 1;\n}\n\n// Wrapper message for `uint64`.\n//\n// The JSON representation for `UInt64Value` is JSON string.\nmessage UInt64Value {\n  // The uint64 value.\n  uint64 value = 1;\n}\n\n// Wrapper message for `int32`.\n//\n// The JSON representation for `Int32Value` is JSON number.\nmessage Int32Value {\n  // The int32 value.\n  int32 value = 1;\n}\n\n// Wrapper message for `uint32`.\n//\n// The JSON representation for `UInt32Value` is JSON number.\nmessage UInt32Value {\n  // The uint32 value.\n  uint32 value = 1;\n}\n\n// Wrapper message for `bool`.\n//\n// The JSON representation for 
`BoolValue` is JSON `true` and `false`.\nmessage BoolValue {\n  // The bool value.\n  bool value = 1;\n}\n\n// Wrapper message for `string`.\n//\n// The JSON representation for `StringValue` is JSON string.\nmessage StringValue {\n  // The string value.\n  string value = 1;\n}\n\n// Wrapper message for `bytes`.\n//\n// The JSON representation for `BytesValue` is JSON string.\nmessage BytesValue {\n  // The bytes value.\n  bytes value = 1;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/jaeger/model.proto",
    "content": "// Copyright (c) 2018 Uber Technologies, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n// http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax=\"proto3\";\n\npackage jaeger.api_v2;\n\nimport \"gogoproto/gogo.proto\";\nimport \"google/protobuf/timestamp.proto\";\nimport \"google/protobuf/duration.proto\";\n\n// TODO: document all types and fields\n\n// TODO: once this moves to jaeger-idl repo, we may want to change Go pkg to api_v2\n// and rewrite it to model only in this repo. That should make it easier to generate\n// classes in other languages.\noption go_package = \"model\";\noption java_package = \"io.jaegertracing.api_v2\";\n\n// Enable gogoprotobuf extensions (https://github.com/gogo/protobuf/blob/master/extensions.md).\n// Enable custom Marshal method.\noption (gogoproto.marshaler_all) = true;\n// Enable custom Unmarshal method.\noption (gogoproto.unmarshaler_all) = true;\n// Enable custom Size method (Required by Marshal and Unmarshal).\noption (gogoproto.sizer_all) = true;\n\nenum ValueType {\n  STRING  = 0;\n  BOOL    = 1;\n  INT64   = 2;\n  FLOAT64 = 3;\n  BINARY  = 4;\n};\n\nmessage KeyValue {\n  option (gogoproto.equal) = true;\n  option (gogoproto.compare) = true;\n\n  string    key      = 1;\n  ValueType v_type    = 2;\n  string    v_str     = 3;\n  bool      v_bool    = 4;\n  int64     v_int64   = 5;\n  double    v_float64 = 6;\n  bytes     v_binary  = 7;\n}\n\nmessage Log {\n  google.protobuf.Timestamp timestamp = 1 [\n    (gogoproto.stdtime) = true,\n    
(gogoproto.nullable) = false\n  ];\n  repeated KeyValue fields = 2 [\n    (gogoproto.nullable) = false\n  ];\n}\n\nenum SpanRefType {\n  CHILD_OF = 0;\n  FOLLOWS_FROM = 1;\n};\n\nmessage SpanRef {\n  bytes trace_id = 1 [\n    (gogoproto.nullable) = false,\n    (gogoproto.customtype) = \"TraceID\",\n    (gogoproto.customname) = \"TraceID\"\n  ];\n  bytes span_id = 2 [\n    (gogoproto.nullable) = false,\n    (gogoproto.customtype) = \"SpanID\",\n    (gogoproto.customname) = \"SpanID\"\n  ];\n  SpanRefType ref_type = 3;\n}\n\nmessage Process {\n  string service_name = 1;\n  repeated KeyValue tags = 2 [\n    (gogoproto.nullable) = false\n  ];\n}\n\nmessage Span {\n  bytes trace_id = 1 [\n    (gogoproto.nullable) = false,\n    (gogoproto.customtype) = \"TraceID\",\n    (gogoproto.customname) = \"TraceID\"\n  ];\n  bytes span_id = 2 [\n    (gogoproto.nullable) = false,\n    (gogoproto.customtype) = \"SpanID\",\n    (gogoproto.customname) = \"SpanID\"\n  ];\n  string operation_name = 3;\n  repeated SpanRef references = 4 [\n    (gogoproto.nullable) = false\n  ];\n  uint32 flags = 5 [\n    (gogoproto.nullable) = false,\n    (gogoproto.customtype) = \"Flags\"\n  ];\n  google.protobuf.Timestamp start_time = 6 [\n    (gogoproto.stdtime) = true,\n    (gogoproto.nullable) = false\n  ];\n  google.protobuf.Duration duration = 7 [\n    (gogoproto.stdduration) = true,\n    (gogoproto.nullable) = false\n  ];\n  repeated KeyValue tags = 8 [\n    (gogoproto.nullable) = false\n  ];\n  repeated Log logs = 9 [\n    (gogoproto.nullable) = false\n  ];\n  Process process = 10;\n  string process_id = 11 [\n    (gogoproto.customname) = \"ProcessID\"\n  ];\n  repeated string warnings = 12;\n}\n\nmessage Trace {\n  message ProcessMapping {\n      string process_id = 1 [\n        (gogoproto.customname) = \"ProcessID\"\n      ];\n      Process process = 2 [\n        (gogoproto.nullable) = false\n      ];\n  }\n  repeated Span spans = 1;\n  repeated ProcessMapping process_map = 2 [\n    
(gogoproto.nullable) = false\n  ];\n  repeated string warnings = 3;\n}\n\n// Note that both Span and Batch may contain a Process.\n// This is different from the Thrift model which was only used\n// for transport, because Proto model is also used by the backend\n// as the domain model, where once a batch is received it is split\n// into individual spans which are all processed independently,\n// and therefore they all need a Process. As far as on-the-wire\n// semantics, both Batch and Spans in the same message may contain\n// their own instances of Process, with span.Process taking priority\n// over batch.Process.\nmessage Batch {\n    repeated Span spans = 1;\n    Process process = 2 [\n      (gogoproto.nullable) = true\n    ];\n}\n\nmessage DependencyLink {\n  string parent = 1;\n  string child = 2;\n  uint64 call_count = 3;\n  string source = 4;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/jaeger/storage/v2/trace_storage.proto",
    "content": "// Copyright (c) 2024 The Jaeger Authors.\n// SPDX-License-Identifier: Apache-2.0\n\nsyntax = \"proto3\";\n\npackage jaeger.storage.v2;\n\nimport \"google/protobuf/duration.proto\";\nimport \"google/protobuf/timestamp.proto\";\nimport \"opentelemetry/proto/trace/v1/trace.proto\";\n\noption go_package = \"storage\";\n\n// GetTraceParams represents the query for a single trace from the storage backend.\nmessage GetTraceParams {\n  // trace_id is a 16 byte array containing the unique identifier for the trace to query.\n  bytes trace_id = 1;\n\n  // start_time is the start of the time interval to search for the trace_id.\n  //\n  // This field is optional.\n  google.protobuf.Timestamp start_time = 2;\n\n  // end_time is the end of the time interval to search for the trace_id.\n  //\n  // This field is optional.\n  google.protobuf.Timestamp end_time = 3;\n}\n\n// GetTracesRequest represents a request to retrieve multiple traces.\nmessage GetTracesRequest {\n  repeated GetTraceParams query = 1;\n}\n\n// GetServicesRequest represents a request to get service names.\nmessage GetServicesRequest {}\n\n// GetServicesResponse represents the response for GetServicesRequest.\nmessage GetServicesResponse {\n  repeated string services = 1;\n}\n\n// GetOperationsRequest represents a request to get operation names.\nmessage GetOperationsRequest {\n  // service is the name of the service for which to get operation names.\n  //\n  // This field is required.\n  string service = 1;\n\n  // span_kind is the type of span which is used to distinguish between\n  // spans generated in a particular context.\n  //\n  // This field is optional.\n  string span_kind = 2;\n}\n\n// Operation contains information about an operation for a given service.\nmessage Operation {\n  string name = 1;\n  string span_kind = 2;\n}\n\n// GetOperationsResponse represents the response for GetOperationsRequest.\nmessage GetOperationsResponse {\n  repeated Operation operations = 1;\n}\n\n// KeyValue 
and all its associated types are copied from opentelemetry-proto/common/v1/common.proto\n// (https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/common/v1/common.proto).\n// This type is used to store attributes in traces.\nmessage KeyValue {\n  string key = 1;\n  AnyValue value = 2;\n}\n\nmessage AnyValue {\n  oneof value {\n    string string_value = 1;\n    bool bool_value = 2;\n    int64 int_value = 3;\n    double double_value = 4;\n    ArrayValue array_value = 5;\n    KeyValueList kvlist_value = 6;\n    bytes bytes_value = 7;\n  }\n}\n\nmessage KeyValueList {\n  repeated KeyValue values = 1;\n}\n\nmessage ArrayValue {\n  repeated AnyValue values = 1;\n}\n\n// TraceQueryParameters contains query parameters to find traces. For a detailed\n// definition of each field in this message, refer to `TraceQueryParameters` in `jaeger.api_v3`\n// (https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v3/query_service.proto).\nmessage TraceQueryParameters {\n  string service_name = 1;\n  string operation_name = 2;\n  repeated KeyValue attributes = 3;\n  google.protobuf.Timestamp start_time_min = 4;\n  google.protobuf.Timestamp start_time_max = 5;\n  google.protobuf.Duration duration_min = 6;\n  google.protobuf.Duration duration_max = 7;\n  int32 search_depth = 8;\n}\n\n// FindTracesRequest represents a request to find traces.\n// It can be used to retrieve the traces (FindTraces) or simply\n// the trace IDs (FindTraceIDs).\nmessage FindTracesRequest {\n  TraceQueryParameters query = 1;\n}\n\n// FoundTraceID is a wrapper around trace ID returned from FindTraceIDs\n// with an optional time range that may be used in GetTraces calls.\n//\n// The time range is provided as an optimization hint for some storage backends\n// that can perform more efficient queries when they know the approximate time range.\n// The value should not be used for precise time-based filtering or assumptions.\n// It is meant as a rough boundary and may not be 
populated in all cases.\nmessage FoundTraceID {\n  bytes trace_id = 1;\n  google.protobuf.Timestamp start = 2;\n  google.protobuf.Timestamp end = 3;\n}\n\n// FindTraceIDsResponse represents the response for FindTracesRequest.\nmessage FindTraceIDsResponse {\n  repeated FoundTraceID trace_ids = 1;\n}\n\n// TraceReader is a service that allows reading traces from storage.\n// Note that if you implement this service, you should also implement\n// OTEL's TraceService in package opentelemetry.proto.collector.trace.v1\n// to allow pushing traces to the storage backend\n// (<https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/collector/trace/v1/trace_service.proto>)\nservice TraceReader {\n  // GetTraces returns a stream that retrieves all traces with given IDs.\n  //\n  // Chunking requirements:\n  // - A single TracesData chunk MUST NOT contain spans from multiple traces.\n  // - Large traces MAY be split across multiple, *consecutive* TracesData chunks.\n  // - Each returned TracesData object MUST NOT be empty.\n  //\n  // Edge cases:\n  // - If no spans are found for any given trace ID, the ID is ignored.\n  // - If none of the trace IDs are found in the storage, an empty response is returned.\n  rpc GetTraces(GetTracesRequest) returns (stream opentelemetry.proto.trace.v1.TracesData) {}\n\n  // GetServices returns all service names known to the backend from traces\n  // within its retention period.\n  rpc GetServices(GetServicesRequest) returns (GetServicesResponse) {}\n\n  // GetOperations returns all operation names for a given service\n  // known to the backend from traces within its retention period.\n  rpc GetOperations(GetOperationsRequest) returns (GetOperationsResponse) {}\n\n  // FindTraces returns a stream that retrieves traces matching query parameters.\n  //\n  // The chunking rules are the same as for GetTraces.\n  //\n  // If no matching traces are found, an empty stream is returned.\n  rpc FindTraces(FindTracesRequest) 
returns (stream opentelemetry.proto.trace.v1.TracesData) {}\n\n  // FindTraceIDs returns a stream that retrieves IDs of traces matching query parameters.\n  //\n  // If no matching traces are found, an empty stream is returned.\n  //\n  // This call behaves identically to FindTraces, except that it returns only the list\n  // of matching trace IDs. This is useful in some contexts, such as batch jobs, where a\n  // large list of trace IDs may be queried first and then the full traces are loaded\n  // in batches.\n  rpc FindTraceIDs(FindTracesRequest) returns (FindTraceIDsResponse) {}\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/jaeger/storage.proto",
    "content": "// Copyright (c) 2019 The Jaeger Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n// http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage jaeger.storage.v1;\n\noption go_package = \"storage_v1\";\n\nimport \"gogoproto/gogo.proto\";\nimport \"google/protobuf/timestamp.proto\";\nimport \"google/protobuf/duration.proto\";\n\nimport \"model.proto\";\n\n// Enable gogoprotobuf extensions (https://github.com/gogo/protobuf/blob/master/extensions.md).\n// Enable custom Marshal method.\noption (gogoproto.marshaler_all) = true;\n// Enable custom Unmarshal method.\noption (gogoproto.unmarshaler_all) = true;\n// Enable custom Size method (Required by Marshal and Unmarshal).\noption (gogoproto.sizer_all) = true;\n\nmessage GetDependenciesRequest {\n    google.protobuf.Timestamp start_time = 1 [\n      (gogoproto.stdtime) = true,\n      (gogoproto.nullable) = false\n    ];\n    google.protobuf.Timestamp end_time = 2 [\n      (gogoproto.stdtime) = true,\n      (gogoproto.nullable) = false\n    ];\n}\n\nmessage GetDependenciesResponse {\n    repeated jaeger.api_v2.DependencyLink dependencies = 1 [\n      (gogoproto.nullable) = false\n    ];\n}\n\nmessage WriteSpanRequest {\n    jaeger.api_v2.Span span = 1;\n}\n\n// empty; extensible in the future\nmessage WriteSpanResponse {\n\n}\n\n// empty; extensible in the future\nmessage CloseWriterRequest {\n}\n\n// empty; extensible in the future\nmessage CloseWriterResponse {\n}\n\nmessage GetTraceRequest {\n    bytes trace_id 
= 1 [\n      (gogoproto.nullable) = false,\n      (gogoproto.customtype) = \"github.com/jaegertracing/jaeger/model.TraceID\",\n      (gogoproto.customname) = \"TraceID\"\n    ];\n}\n\nmessage GetServicesRequest {}\n\nmessage GetServicesResponse {\n    repeated string services = 1;\n}\n\nmessage GetOperationsRequest {\n    string service = 1;\n    string span_kind = 2;\n}\n\nmessage Operation {\n    string name = 1;\n    string span_kind = 2;\n}\n\nmessage GetOperationsResponse {\n    repeated string operationNames = 1; // deprecated\n    repeated Operation operations = 2;\n}\n\nmessage TraceQueryParameters {\n    string service_name = 1;\n    string operation_name = 2;\n    map<string, string> tags = 3;\n    google.protobuf.Timestamp start_time_min = 4 [\n      (gogoproto.stdtime) = true,\n      (gogoproto.nullable) = false\n    ];\n    google.protobuf.Timestamp start_time_max = 5 [\n      (gogoproto.stdtime) = true,\n      (gogoproto.nullable) = false\n    ];\n    google.protobuf.Duration duration_min = 6 [\n      (gogoproto.stdduration) = true,\n      (gogoproto.nullable) = false\n    ];\n    google.protobuf.Duration duration_max = 7 [\n      (gogoproto.stdduration) = true,\n      (gogoproto.nullable) = false\n    ];\n    int32 num_traces = 8;\n}\n\nmessage FindTracesRequest {\n    TraceQueryParameters query = 1;\n}\n\nmessage SpansResponseChunk {\n    repeated jaeger.api_v2.Span spans = 1  [\n      (gogoproto.nullable) = false\n    ];\n}\n\nmessage FindTraceIDsRequest {\n    TraceQueryParameters query = 1;\n}\n\nmessage FindTraceIDsResponse {\n    repeated bytes trace_ids = 1 [\n      (gogoproto.nullable) = false,\n      (gogoproto.customtype) = \"github.com/jaegertracing/jaeger/model.TraceID\",\n      (gogoproto.customname) = \"TraceIDs\"\n    ];\n}\n\nservice SpanWriterPlugin {\n    // spanstore/Writer\n    rpc WriteSpan(WriteSpanRequest) returns (WriteSpanResponse);\n    rpc Close(CloseWriterRequest) returns (CloseWriterResponse);\n}\n\nservice 
StreamingSpanWriterPlugin {\n    rpc WriteSpanStream(stream WriteSpanRequest) returns (WriteSpanResponse);\n}\n\nservice SpanReaderPlugin {\n    // spanstore/Reader\n    rpc GetTrace(GetTraceRequest) returns (stream SpansResponseChunk);\n    rpc GetServices(GetServicesRequest) returns (GetServicesResponse);\n    rpc GetOperations(GetOperationsRequest) returns (GetOperationsResponse);\n    rpc FindTraces(FindTracesRequest) returns (stream SpansResponseChunk);\n    rpc FindTraceIDs(FindTraceIDsRequest) returns (FindTraceIDsResponse);\n}\n\nservice ArchiveSpanWriterPlugin {\n    // spanstore/Writer\n    rpc WriteArchiveSpan(WriteSpanRequest) returns (WriteSpanResponse);\n}\n\nservice ArchiveSpanReaderPlugin {\n    // spanstore/Reader\n    rpc GetArchiveTrace(GetTraceRequest) returns (stream SpansResponseChunk);\n}\n\nservice DependenciesReaderPlugin {\n    // dependencystore/Reader\n    rpc GetDependencies(GetDependenciesRequest) returns (GetDependenciesResponse);\n}\n\n// empty; extensible in the future\nmessage CapabilitiesRequest {\n\n}\n\nmessage CapabilitiesResponse {\n    bool archiveSpanReader = 1;\n    bool archiveSpanWriter = 2;\n    bool streamingSpanWriter = 3;\n}\n\nservice PluginCapabilities {\n    rpc Capabilities(CapabilitiesRequest) returns (CapabilitiesResponse);\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/README.md",
    "content": "# OpenTelemetry Collector Proto\n\nThis package describes the OpenTelemetry collector protocol.\n\n## Packages\n\n1. `common` package contains the common messages shared between different services.\n2. `trace` package contains the Trace Service protos.\n3. `metrics` package contains the Metrics Service protos.\n4. `logs` package contains the Logs Service protos.\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/logs/v1/logs_service.proto",
    "content": "// Copyright 2020, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.collector.logs.v1;\n\nimport \"opentelemetry/proto/logs/v1/logs.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Collector.Logs.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.collector.logs.v1\";\noption java_outer_classname = \"LogsServiceProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/collector/logs/v1\";\n\n// Service that can be used to push logs between one Application instrumented with\n// OpenTelemetry and an collector, or between an collector and a central collector (in this\n// case logs are sent/received to/from multiple Applications).\nservice LogsService {\n  // For performance reasons, it is recommended to keep this RPC\n  // alive for the entire life of the application.\n  rpc Export(ExportLogsServiceRequest) returns (ExportLogsServiceResponse) {}\n}\n\nmessage ExportLogsServiceRequest {\n  // An array of ResourceLogs.\n  // For data coming from a single resource this array will typically contain one\n  // element. 
Intermediary nodes (such as OpenTelemetry Collector) that receive\n  // data from multiple origins typically batch the data before forwarding further and\n  // in that case this array will contain multiple elements.\n  repeated opentelemetry.proto.logs.v1.ResourceLogs resource_logs = 1;\n}\n\nmessage ExportLogsServiceResponse {\n  // The details of a partially successful export request.\n  //\n  // If the request is only partially accepted\n  // (i.e. when the server accepts only parts of the data and rejects the rest)\n  // the server MUST initialize the `partial_success` field and MUST\n  // set the `rejected_<signal>` with the number of items it rejected.\n  //\n  // Servers MAY also make use of the `partial_success` field to convey\n  // warnings/suggestions to senders even when the request was fully accepted.\n  // In such cases, the `rejected_<signal>` MUST have a value of `0` and\n  // the `error_message` MUST be non-empty.\n  //\n  // A `partial_success` message with an empty value (rejected_<signal> = 0 and\n  // `error_message` = \"\") is equivalent to it not being set/present. Senders\n  // SHOULD interpret it the same way as in the full success case.\n  ExportLogsPartialSuccess partial_success = 1;\n}\n\nmessage ExportLogsPartialSuccess {\n  // The number of rejected log records.\n  //\n  // A `rejected_<signal>` field holding a `0` value indicates that the\n  // request was fully accepted.\n  int64 rejected_log_records = 1;\n\n  // A developer-facing human-readable message in English. It should be used\n  // either to explain why the server rejected parts of the data during a partial\n  // success or to convey warnings/suggestions during a full success. The message\n  // should offer guidance on how users can address such issues.\n  //\n  // error_message is an optional field. An error_message with an empty value\n  // is equivalent to it not being set.\n  string error_message = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/logs/v1/logs_service_http.yaml",
    "content": "# This is an API configuration to generate an HTTP/JSON -> gRPC gateway for the\n# OpenTelemetry service using github.com/grpc-ecosystem/grpc-gateway.\ntype: google.api.Service\nconfig_version: 3\nhttp:\n rules:\n - selector: opentelemetry.proto.collector.logs.v1.LogsService.Export\n   post: /v1/logs\n   body: \"*\""
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/metrics/v1/metrics_service.proto",
    "content": "// Copyright 2019, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.collector.metrics.v1;\n\nimport \"opentelemetry/proto/metrics/v1/metrics.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Collector.Metrics.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.collector.metrics.v1\";\noption java_outer_classname = \"MetricsServiceProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/collector/metrics/v1\";\n\n// Service that can be used to push metrics between one Application\n// instrumented with OpenTelemetry and a collector, or between a collector and a\n// central collector.\nservice MetricsService {\n  // For performance reasons, it is recommended to keep this RPC\n  // alive for the entire life of the application.\n  rpc Export(ExportMetricsServiceRequest) returns (ExportMetricsServiceResponse) {}\n}\n\nmessage ExportMetricsServiceRequest {\n  // An array of ResourceMetrics.\n  // For data coming from a single resource this array will typically contain one\n  // element. 
Intermediary nodes (such as OpenTelemetry Collector) that receive\n  // data from multiple origins typically batch the data before forwarding further and\n  // in that case this array will contain multiple elements.\n  repeated opentelemetry.proto.metrics.v1.ResourceMetrics resource_metrics = 1;\n}\n\nmessage ExportMetricsServiceResponse {\n  // The details of a partially successful export request.\n  //\n  // If the request is only partially accepted\n  // (i.e. when the server accepts only parts of the data and rejects the rest)\n  // the server MUST initialize the `partial_success` field and MUST\n  // set the `rejected_<signal>` with the number of items it rejected.\n  //\n  // Servers MAY also make use of the `partial_success` field to convey\n  // warnings/suggestions to senders even when the request was fully accepted.\n  // In such cases, the `rejected_<signal>` MUST have a value of `0` and\n  // the `error_message` MUST be non-empty.\n  //\n  // A `partial_success` message with an empty value (rejected_<signal> = 0 and\n  // `error_message` = \"\") is equivalent to it not being set/present. Senders\n  // SHOULD interpret it the same way as in the full success case.\n  ExportMetricsPartialSuccess partial_success = 1;\n}\n\nmessage ExportMetricsPartialSuccess {\n  // The number of rejected data points.\n  //\n  // A `rejected_<signal>` field holding a `0` value indicates that the\n  // request was fully accepted.\n  int64 rejected_data_points = 1;\n\n  // A developer-facing human-readable message in English. It should be used\n  // either to explain why the server rejected parts of the data during a partial\n  // success or to convey warnings/suggestions during a full success. The message\n  // should offer guidance on how users can address such issues.\n  //\n  // error_message is an optional field. An error_message with an empty value\n  // is equivalent to it not being set.\n  string error_message = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/metrics/v1/metrics_service_http.yaml",
    "content": "# This is an API configuration to generate an HTTP/JSON -> gRPC gateway for the\n# OpenTelemetry service using github.com/grpc-ecosystem/grpc-gateway.\ntype: google.api.Service\nconfig_version: 3\nhttp:\n rules:\n - selector: opentelemetry.proto.collector.metrics.v1.MetricsService.Export\n   post: /v1/metrics\n   body: \"*\""
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/trace/v1/trace_service.proto",
    "content": "// Copyright 2019, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.collector.trace.v1;\n\nimport \"opentelemetry/proto/trace/v1/trace.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Collector.Trace.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.collector.trace.v1\";\noption java_outer_classname = \"TraceServiceProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/collector/trace/v1\";\n\n// Service that can be used to push spans between one Application instrumented with\n// OpenTelemetry and a collector, or between a collector and a central collector (in this\n// case spans are sent/received to/from multiple Applications).\nservice TraceService {\n  // For performance reasons, it is recommended to keep this RPC\n  // alive for the entire life of the application.\n  rpc Export(ExportTraceServiceRequest) returns (ExportTraceServiceResponse) {}\n}\n\nmessage ExportTraceServiceRequest {\n  // An array of ResourceSpans.\n  // For data coming from a single resource this array will typically contain one\n  // element. 
Intermediary nodes (such as OpenTelemetry Collector) that receive\n  // data from multiple origins typically batch the data before forwarding further and\n  // in that case this array will contain multiple elements.\n  repeated opentelemetry.proto.trace.v1.ResourceSpans resource_spans = 1;\n}\n\nmessage ExportTraceServiceResponse {\n  // The details of a partially successful export request.\n  //\n  // If the request is only partially accepted\n  // (i.e. when the server accepts only parts of the data and rejects the rest)\n  // the server MUST initialize the `partial_success` field and MUST\n  // set the `rejected_<signal>` with the number of items it rejected.\n  //\n  // Servers MAY also make use of the `partial_success` field to convey\n  // warnings/suggestions to senders even when the request was fully accepted.\n  // In such cases, the `rejected_<signal>` MUST have a value of `0` and\n  // the `error_message` MUST be non-empty.\n  //\n  // A `partial_success` message with an empty value (rejected_<signal> = 0 and\n  // `error_message` = \"\") is equivalent to it not being set/present. Senders\n  // SHOULD interpret it the same way as in the full success case.\n  ExportTracePartialSuccess partial_success = 1;\n}\n\nmessage ExportTracePartialSuccess {\n  // The number of rejected spans.\n  //\n  // A `rejected_<signal>` field holding a `0` value indicates that the\n  // request was fully accepted.\n  int64 rejected_spans = 1;\n\n  // A developer-facing human-readable message in English. It should be used\n  // either to explain why the server rejected parts of the data during a partial\n  // success or to convey warnings/suggestions during a full success. The message\n  // should offer guidance on how users can address such issues.\n  //\n  // error_message is an optional field. An error_message with an empty value\n  // is equivalent to it not being set.\n  string error_message = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/collector/trace/v1/trace_service_http.yaml",
    "content": "# This is an API configuration to generate an HTTP/JSON -> gRPC gateway for the\n# OpenTelemetry service using github.com/grpc-ecosystem/grpc-gateway.\ntype: google.api.Service\nconfig_version: 3\nhttp:\n rules:\n - selector: opentelemetry.proto.collector.trace.v1.TraceService.Export\n   post: /v1/trace\n   body: \"*\""
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/common/v1/common.proto",
    "content": "// Copyright 2019, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.common.v1;\n\noption csharp_namespace = \"OpenTelemetry.Proto.Common.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.common.v1\";\noption java_outer_classname = \"CommonProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/common/v1\";\n\n// AnyValue is used to represent any type of attribute value. AnyValue may contain a\n// primitive value such as a string or integer or it may contain an arbitrary nested\n// object containing arrays, key-value lists and primitives.\nmessage AnyValue {\n  // The value is one of the listed fields. It is valid for all values to be unspecified\n  // in which case this AnyValue is considered to be \"empty\".\n  oneof value {\n    string string_value = 1;\n    bool bool_value = 2;\n    int64 int_value = 3;\n    double double_value = 4;\n    ArrayValue array_value = 5;\n    KeyValueList kvlist_value = 6;\n    bytes bytes_value = 7;\n  }\n}\n\n// ArrayValue is a list of AnyValue messages. We need ArrayValue as a message\n// since oneof in AnyValue does not allow repeated fields.\nmessage ArrayValue {\n  // Array of values. The array may be empty (contain 0 elements).\n  repeated AnyValue values = 1;\n}\n\n// KeyValueList is a list of KeyValue messages. 
We need KeyValueList as a message\n// since `oneof` in AnyValue does not allow repeated fields. Everywhere else where we need\n// a list of KeyValue messages (e.g. in Span) we use `repeated KeyValue` directly to\n// avoid unnecessary extra wrapping (which slows down the protocol). The 2 approaches\n// are semantically equivalent.\nmessage KeyValueList {\n  // A collection of key/value pairs. The list may be empty (may\n  // contain 0 elements).\n  // The keys MUST be unique (it is not allowed to have more than one\n  // value with the same key).\n  repeated KeyValue values = 1;\n}\n\n// KeyValue is a key-value pair that is used to store Span attributes, Link\n// attributes, etc.\nmessage KeyValue {\n  string key = 1;\n  AnyValue value = 2;\n}\n\n// InstrumentationScope is a message representing the instrumentation scope information\n// such as the fully qualified name and version.\nmessage InstrumentationScope {\n  // An empty instrumentation scope name means the name is unknown.\n  string name = 1;\n  string version = 2;\n  repeated KeyValue attributes = 3;\n  uint32 dropped_attributes_count = 4;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/logs/v1/logs.proto",
    "content": "// Copyright 2020, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.logs.v1;\n\nimport \"opentelemetry/proto/common/v1/common.proto\";\nimport \"opentelemetry/proto/resource/v1/resource.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Logs.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.logs.v1\";\noption java_outer_classname = \"LogsProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/logs/v1\";\n\n// LogsData represents the logs data that can be stored in a persistent storage,\n// OR can be embedded by other protocols that transfer OTLP logs data but do not\n// implement the OTLP protocol.\n//\n// The main difference between this message and collector protocol is that\n// in this message there will not be any \"control\" or \"metadata\" specific to\n// OTLP protocol.\n//\n// When new fields are added into this message, the OTLP request MUST be updated\n// as well.\nmessage LogsData {\n  // An array of ResourceLogs.\n  // For data coming from a single resource this array will typically contain\n  // one element. 
Intermediary nodes that receive data from multiple origins\n  // typically batch the data before forwarding further and in that case this\n  // array will contain multiple elements.\n  repeated ResourceLogs resource_logs = 1;\n}\n\n// A collection of ScopeLogs from a Resource.\nmessage ResourceLogs {\n  reserved 1000;\n\n  // The resource for the logs in this message.\n  // If this field is not set then resource info is unknown.\n  opentelemetry.proto.resource.v1.Resource resource = 1;\n\n  // A list of ScopeLogs that originate from a resource.\n  repeated ScopeLogs scope_logs = 2;\n\n  // This schema_url applies to the data in the \"resource\" field. It does not apply\n  // to the data in the \"scope_logs\" field which have their own schema_url field.\n  string schema_url = 3;\n}\n\n// A collection of Logs produced by a Scope.\nmessage ScopeLogs {\n  // The instrumentation scope information for the logs in this message.\n  // Semantically when InstrumentationScope isn't set, it is equivalent with\n  // an empty instrumentation scope name (unknown).\n  opentelemetry.proto.common.v1.InstrumentationScope scope = 1;\n\n  // A list of log records.\n  repeated LogRecord log_records = 2;\n\n  // This schema_url applies to all logs in the \"logs\" field.\n  string schema_url = 3;\n}\n\n// Possible values for LogRecord.SeverityNumber.\nenum SeverityNumber {\n  // UNSPECIFIED is the default SeverityNumber, it MUST NOT be used.\n  SEVERITY_NUMBER_UNSPECIFIED = 0;\n  SEVERITY_NUMBER_TRACE  = 1;\n  SEVERITY_NUMBER_TRACE2 = 2;\n  SEVERITY_NUMBER_TRACE3 = 3;\n  SEVERITY_NUMBER_TRACE4 = 4;\n  SEVERITY_NUMBER_DEBUG  = 5;\n  SEVERITY_NUMBER_DEBUG2 = 6;\n  SEVERITY_NUMBER_DEBUG3 = 7;\n  SEVERITY_NUMBER_DEBUG4 = 8;\n  SEVERITY_NUMBER_INFO   = 9;\n  SEVERITY_NUMBER_INFO2  = 10;\n  SEVERITY_NUMBER_INFO3  = 11;\n  SEVERITY_NUMBER_INFO4  = 12;\n  SEVERITY_NUMBER_WARN   = 13;\n  SEVERITY_NUMBER_WARN2  = 14;\n  SEVERITY_NUMBER_WARN3  = 15;\n  SEVERITY_NUMBER_WARN4  = 16;\n  
SEVERITY_NUMBER_ERROR  = 17;\n  SEVERITY_NUMBER_ERROR2 = 18;\n  SEVERITY_NUMBER_ERROR3 = 19;\n  SEVERITY_NUMBER_ERROR4 = 20;\n  SEVERITY_NUMBER_FATAL  = 21;\n  SEVERITY_NUMBER_FATAL2 = 22;\n  SEVERITY_NUMBER_FATAL3 = 23;\n  SEVERITY_NUMBER_FATAL4 = 24;\n}\n\n// Masks for LogRecord.flags field.\nenum LogRecordFlags {\n  LOG_RECORD_FLAG_UNSPECIFIED = 0;\n  LOG_RECORD_FLAG_TRACE_FLAGS_MASK = 0x000000FF;\n}\n\n// A log record according to OpenTelemetry Log Data Model:\n// https://github.com/open-telemetry/oteps/blob/main/text/logs/0097-log-data-model.md\nmessage LogRecord {\n  reserved 4;\n\n  // time_unix_nano is the time when the event occurred.\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n  // Value of 0 indicates unknown or missing timestamp.\n  fixed64 time_unix_nano = 1;\n\n  // Time when the event was observed by the collection system.\n  // For events that originate in OpenTelemetry (e.g. using OpenTelemetry Logging SDK)\n  // this timestamp is typically set at the generation time and is equal to Timestamp.\n  // For events originating externally and collected by OpenTelemetry (e.g. using\n  // Collector) this is the time when OpenTelemetry's code observed the event measured\n  // by the clock of the OpenTelemetry code. 
This field MUST be set once the event is\n  // observed by OpenTelemetry.\n  //\n  // For converting OpenTelemetry log data to formats that support only one timestamp or\n  // when receiving OpenTelemetry log data by recipients that support only one timestamp\n  // internally the following logic is recommended:\n  //   - Use time_unix_nano if it is present, otherwise use observed_time_unix_nano.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n  // Value of 0 indicates unknown or missing timestamp.\n  fixed64 observed_time_unix_nano = 11;\n\n  // Numerical value of the severity, normalized to values described in Log Data Model.\n  // [Optional].\n  SeverityNumber severity_number = 2;\n\n  // The severity text (also known as log level). The original string representation as\n  // it is known at the source. [Optional].\n  string severity_text = 3;\n\n  // A value containing the body of the log record. Can be for example a human-readable\n  // string message (including multi-line) describing the event in a free form or it can\n  // be a structured data composed of arrays and maps of other values. [Optional].\n  opentelemetry.proto.common.v1.AnyValue body = 5;\n\n  // Additional attributes that describe the specific event occurrence. [Optional].\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 6;\n  uint32 dropped_attributes_count = 7;\n\n  // Flags, a bit field. 8 least significant bits are the trace flags as\n  // defined in W3C Trace Context specification. 24 most significant bits are reserved\n  // and must be set to 0. Readers must not assume that 24 most significant bits\n  // will be zero and must correctly mask the bits when reading 8-bit trace flag (use\n  // flags & TRACE_FLAGS_MASK). [Optional].\n  fixed32 flags = 8;\n\n  // A unique identifier for a trace. 
All logs from the same trace share\n  // the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes\n  // is considered invalid. Can be set for logs that are part of request processing\n  // and have an assigned trace id. [Optional].\n  bytes trace_id = 9;\n\n  // A unique identifier for a span within a trace, assigned when the span\n  // is created. The ID is an 8-byte array. An ID with all zeroes is considered\n  // invalid. Can be set for logs that are part of a particular processing span.\n  // If span_id is present trace_id SHOULD be also present. [Optional].\n  bytes span_id = 10;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/metrics/v1/metrics.proto",
    "content": "// Copyright 2019, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.metrics.v1;\n\nimport \"opentelemetry/proto/common/v1/common.proto\";\nimport \"opentelemetry/proto/resource/v1/resource.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Metrics.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.metrics.v1\";\noption java_outer_classname = \"MetricsProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/metrics/v1\";\n\n// MetricsData represents the metrics data that can be stored in a persistent\n// storage, OR can be embedded by other protocols that transfer OTLP metrics\n// data but do not implement the OTLP protocol.\n//\n// The main difference between this message and collector protocol is that\n// in this message there will not be any \"control\" or \"metadata\" specific to\n// OTLP protocol.\n//\n// When new fields are added into this message, the OTLP request MUST be updated\n// as well.\nmessage MetricsData {\n  // An array of ResourceMetrics.\n  // For data coming from a single resource this array will typically contain\n  // one element. 
Intermediary nodes that receive data from multiple origins\n  // typically batch the data before forwarding further and in that case this\n  // array will contain multiple elements.\n  repeated ResourceMetrics resource_metrics = 1;\n}\n\n// A collection of ScopeMetrics from a Resource.\nmessage ResourceMetrics {\n  reserved 1000;\n\n  // The resource for the metrics in this message.\n  // If this field is not set then no resource info is known.\n  opentelemetry.proto.resource.v1.Resource resource = 1;\n\n  // A list of metrics that originate from a resource.\n  repeated ScopeMetrics scope_metrics = 2;\n\n  // This schema_url applies to the data in the \"resource\" field. It does not apply\n  // to the data in the \"scope_metrics\" field which have their own schema_url field.\n  string schema_url = 3;\n}\n\n// A collection of Metrics produced by a Scope.\nmessage ScopeMetrics {\n  // The instrumentation scope information for the metrics in this message.\n  // Semantically when InstrumentationScope isn't set, it is equivalent with\n  // an empty instrumentation scope name (unknown).\n  opentelemetry.proto.common.v1.InstrumentationScope scope = 1;\n\n  // A list of metrics that originate from an instrumentation library.\n  repeated Metric metrics = 2;\n\n  // This schema_url applies to all metrics in the \"metrics\" field.\n  string schema_url = 3;\n}\n\n// Defines a Metric which has one or more timeseries.  The following is a\n// brief summary of the Metric data model.  For more details, see:\n//\n//   https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md\n//\n//\n// The data model and relation between entities is shown in the\n// diagram below. 
Here, \"DataPoint\" is the term used to refer to any\n// one of the specific data point value types, and \"points\" is the term used\n// to refer to any one of the lists of points contained in the Metric.\n//\n// - Metric is composed of a metadata and data.\n// - Metadata part contains a name, description, unit.\n// - Data is one of the possible types (Sum, Gauge, Histogram, Summary).\n// - DataPoint contains timestamps, attributes, and one of the possible value type\n//   fields.\n//\n//     Metric\n//  +------------+\n//  |name        |\n//  |description |\n//  |unit        |     +------------------------------------+\n//  |data        |---> |Gauge, Sum, Histogram, Summary, ... |\n//  +------------+     +------------------------------------+\n//\n//    Data [One of Gauge, Sum, Histogram, Summary, ...]\n//  +-----------+\n//  |...        |  // Metadata about the Data.\n//  |points     |--+\n//  +-----------+  |\n//                 |      +---------------------------+\n//                 |      |DataPoint 1                |\n//                 v      |+------+------+   +------+ |\n//              +-----+   ||label |label |...|label | |\n//              |  1  |-->||value1|value2|...|valueN| |\n//              +-----+   |+------+------+   +------+ |\n//              |  .  |   |+-----+                    |\n//              |  .  |   ||value|                    |\n//              |  .  |   |+-----+                    |\n//              |  .  |   +---------------------------+\n//              |  .  |                   .\n//              |  .  |                   .\n//              |  .  |                   .\n//              |  .  |   +---------------------------+\n//              |  .  
|   |DataPoint M                |\n//              +-----+   |+------+------+   +------+ |\n//              |  M  |-->||label |label |...|label | |\n//              +-----+   ||value1|value2|...|valueN| |\n//                        |+------+------+   +------+ |\n//                        |+-----+                    |\n//                        ||value|                    |\n//                        |+-----+                    |\n//                        +---------------------------+\n//\n// Each distinct type of DataPoint represents the output of a specific\n// aggregation function, the result of applying the DataPoint's\n// associated function to one or more measurements.\n//\n// All DataPoint types have three common fields:\n// - Attributes includes key-value pairs associated with the data point\n// - TimeUnixNano is required, set to the end time of the aggregation\n// - StartTimeUnixNano is optional, but strongly encouraged for DataPoints\n//   having an AggregationTemporality field, as discussed below.\n//\n// Both TimeUnixNano and StartTimeUnixNano values are expressed as\n// UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n//\n// # TimeUnixNano\n//\n// This field is required, having consistent interpretation across\n// DataPoint types.  TimeUnixNano is the moment corresponding to when\n// the data point's aggregate value was captured.\n//\n// Data points with the 0 value for TimeUnixNano SHOULD be rejected\n// by consumers.\n//\n// # StartTimeUnixNano\n//\n// StartTimeUnixNano in general allows detecting when a sequence of\n// observations is unbroken.  This field indicates to consumers the\n// start time for points with cumulative and delta\n// AggregationTemporality, and it should be included whenever possible\n// to support correct rate calculation.  
Although it may be omitted\n// when the start time is truly unknown, setting StartTimeUnixNano is\n// strongly encouraged.\nmessage Metric {\n  reserved 4, 6, 8;\n\n  // name of the metric, including its DNS name prefix. It must be unique.\n  string name = 1;\n\n  // description of the metric, which can be used in documentation.\n  string description = 2;\n\n  // unit in which the metric value is reported. Follows the format\n  // described by http://unitsofmeasure.org/ucum.html.\n  string unit = 3;\n\n  // Data determines the aggregation type (if any) of the metric, what is the\n  // reported value type for the data points, as well as the relationship to\n  // the time interval over which they are reported.\n  oneof data {\n    Gauge gauge = 5;\n    Sum sum = 7;\n    Histogram histogram = 9;\n    ExponentialHistogram exponential_histogram = 10;\n    Summary summary = 11;\n  }\n}\n\n// Gauge represents the type of a scalar metric that always exports the\n// \"current value\" for every data point. It should be used for an \"unknown\"\n// aggregation.\n//\n// A Gauge does not support different aggregation temporalities. Given the\n// aggregation is unknown, points cannot be combined using the same\n// aggregation, regardless of aggregation temporalities. Therefore,\n// AggregationTemporality is not included. 
Consequently, this also means\n// \"StartTimeUnixNano\" is ignored for all data points.\nmessage Gauge {\n  repeated NumberDataPoint data_points = 1;\n}\n\n// Sum represents the type of a scalar metric that is calculated as a sum of all\n// reported measurements over a time interval.\nmessage Sum {\n  repeated NumberDataPoint data_points = 1;\n\n  // aggregation_temporality describes if the aggregator reports delta changes\n  // since last report time, or cumulative changes since a fixed start time.\n  AggregationTemporality aggregation_temporality = 2;\n\n  // If \"true\", the sum is monotonic.\n  bool is_monotonic = 3;\n}\n\n// Histogram represents the type of a metric that is calculated by aggregating\n// as a Histogram of all reported measurements over a time interval.\nmessage Histogram {\n  repeated HistogramDataPoint data_points = 1;\n\n  // aggregation_temporality describes if the aggregator reports delta changes\n  // since last report time, or cumulative changes since a fixed start time.\n  AggregationTemporality aggregation_temporality = 2;\n}\n\n// ExponentialHistogram represents the type of a metric that is calculated by aggregating\n// as an ExponentialHistogram of all reported double measurements over a time interval.\nmessage ExponentialHistogram {\n  repeated ExponentialHistogramDataPoint data_points = 1;\n\n  // aggregation_temporality describes if the aggregator reports delta changes\n  // since last report time, or cumulative changes since a fixed start time.\n  AggregationTemporality aggregation_temporality = 2;\n}\n\n// Summary metric data are used to convey quantile summaries,\n// a Prometheus (see: https://prometheus.io/docs/concepts/metric_types/#summary)\n// and OpenMetrics (see: https://github.com/OpenObservability/OpenMetrics/blob/4dbf6075567ab43296eed941037c12951faafb92/protos/prometheus.proto#L45)\n// data type. 
These data points cannot always be merged in a meaningful way.\n// While they can be useful in some applications, histogram data points are\n// recommended for new applications.\nmessage Summary {\n  repeated SummaryDataPoint data_points = 1;\n}\n\n// AggregationTemporality defines how a metric aggregator reports aggregated\n// values. It describes how those values relate to the time interval over\n// which they are aggregated.\nenum AggregationTemporality {\n  // UNSPECIFIED is the default AggregationTemporality, it MUST NOT be used.\n  AGGREGATION_TEMPORALITY_UNSPECIFIED = 0;\n\n  // DELTA is an AggregationTemporality for a metric aggregator which reports\n  // changes since last report time. Successive metrics contain aggregation of\n  // values from continuous and non-overlapping intervals.\n  //\n  // The values for a DELTA metric are based only on the time interval\n  // associated with one measurement cycle. There is no dependency on\n  // previous measurements, as is the case for CUMULATIVE metrics.\n  //\n  // For example, consider a system measuring the number of requests that\n  // it receives and reports the sum of these requests every second as a\n  // DELTA metric:\n  //\n  //   1. The system starts receiving at time=t_0.\n  //   2. A request is received, the system measures 1 request.\n  //   3. A request is received, the system measures 1 request.\n  //   4. A request is received, the system measures 1 request.\n  //   5. The 1 second collection cycle ends. A metric is exported for the\n  //      number of requests received over the interval of time t_0 to\n  //      t_0+1 with a value of 3.\n  //   6. A request is received, the system measures 1 request.\n  //   7. A request is received, the system measures 1 request.\n  //   8. The 1 second collection cycle ends. 
A metric is exported for the\n  //      number of requests received over the interval of time t_0+1 to\n  //      t_0+2 with a value of 2.\n  AGGREGATION_TEMPORALITY_DELTA = 1;\n\n  // CUMULATIVE is an AggregationTemporality for a metric aggregator which\n  // reports changes since a fixed start time. This means that current values\n  // of a CUMULATIVE metric depend on all previous measurements since the\n  // start time. Because of this, the sender is required to retain this state\n  // in some form. If this state is lost or invalidated, the CUMULATIVE metric\n  // values MUST be reset and a new fixed start time following the last\n  // reported measurement time sent MUST be used.\n  //\n  // For example, consider a system measuring the number of requests that\n  // it receives and reports the sum of these requests every second as a\n  // CUMULATIVE metric:\n  //\n  //   1. The system starts receiving at time=t_0.\n  //   2. A request is received, the system measures 1 request.\n  //   3. A request is received, the system measures 1 request.\n  //   4. A request is received, the system measures 1 request.\n  //   5. The 1 second collection cycle ends. A metric is exported for the\n  //      number of requests received over the interval of time t_0 to\n  //      t_0+1 with a value of 3.\n  //   6. A request is received, the system measures 1 request.\n  //   7. A request is received, the system measures 1 request.\n  //   8. The 1 second collection cycle ends. A metric is exported for the\n  //      number of requests received over the interval of time t_0 to\n  //      t_0+2 with a value of 5.\n  //   9. The system experiences a fault and loses state.\n  //   10. The system recovers and resumes receiving at time=t_1.\n  //   11. A request is received, the system measures 1 request.\n  //   12. The 1 second collection cycle ends. 
A metric is exported for the\n  //      number of requests received over the interval of time t_1 to\n  //      t_1+1 with a value of 1.\n  //\n  // Note: Even though, when reporting changes since last report time, using\n  // CUMULATIVE is valid, it is not recommended. This may cause problems for\n  // systems that do not use start_time to determine when the aggregation\n  // value was reset (e.g. Prometheus).\n  AGGREGATION_TEMPORALITY_CUMULATIVE = 2;\n}\n\n// DataPointFlags is defined as a protobuf 'uint32' type and is to be used as a\n// bit-field representing 32 distinct boolean flags.  Each flag defined in this\n// enum is a bit-mask.  To test the presence of a single flag in the flags of\n// a data point, for example, use an expression like:\n//\n//   (point.flags & FLAG_NO_RECORDED_VALUE) == FLAG_NO_RECORDED_VALUE\n//\nenum DataPointFlags {\n  FLAG_NONE = 0;\n\n  // This DataPoint is valid but has no recorded value.  This value\n  // SHOULD be used to reflect explicitly missing data in a series, as\n  // for an equivalent to the Prometheus \"staleness marker\".\n  FLAG_NO_RECORDED_VALUE = 1;\n\n  // Bits 2-31 are reserved for future use.\n}\n\n// NumberDataPoint is a single data point in a timeseries that describes the\n// time-varying scalar value of a metric.\nmessage NumberDataPoint {\n  reserved 1;\n\n  // The set of key/value pairs that uniquely identify the timeseries from\n  // where this point belongs. 
The list may be empty (may contain 0 elements).\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 7;\n\n  // StartTimeUnixNano is optional but strongly encouraged, see\n  // the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 start_time_unix_nano = 2;\n\n  // TimeUnixNano is required, see the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 time_unix_nano = 3;\n\n  // The value itself.  A point is considered invalid when one of the recognized\n  // value fields is not present inside this oneof.\n  oneof value {\n    double as_double = 4;\n    sfixed64 as_int = 6;\n  }\n\n  // (Optional) List of exemplars collected from\n  // measurements that were used to form the data point\n  repeated Exemplar exemplars = 5;\n\n  // Flags that apply to this specific data point.  See DataPointFlags\n  // for the available flags and their meaning.\n  uint32 flags = 8;\n}\n\n// HistogramDataPoint is a single data point in a timeseries that describes the\n// time-varying values of a Histogram. A Histogram contains summary statistics\n// for a population of values, it may optionally contain the distribution of\n// those values across a set of buckets.\n//\n// If the histogram contains the distribution of values, then both\n// \"explicit_bounds\" and \"bucket_counts\" fields must be defined.\n// If the histogram does not contain the distribution of values, then both\n// \"explicit_bounds\" and \"bucket_counts\" must be omitted and only \"count\" and\n// \"sum\" are known.\nmessage HistogramDataPoint {\n  reserved 1;\n\n  // The set of key/value pairs that uniquely identify the timeseries from\n  // where this point belongs. 
The list may be empty (may contain 0 elements).\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 9;\n\n  // StartTimeUnixNano is optional but strongly encouraged, see\n  // the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 start_time_unix_nano = 2;\n\n  // TimeUnixNano is required, see the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 time_unix_nano = 3;\n\n  // count is the number of values in the population. Must be non-negative. This\n  // value must be equal to the sum of the \"count\" fields in buckets if a\n  // histogram is provided.\n  fixed64 count = 4;\n\n  // sum of the values in the population. If count is zero then this field\n  // must be zero.\n  //\n  // Note: Sum should only be filled out when measuring non-negative discrete\n  // events, and is assumed to be monotonic over the values of these events.\n  // Negative events *can* be recorded, but sum should not be filled out when\n  // doing so.  
This is specifically to enforce compatibility w/ OpenMetrics,\n  // see: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#histogram\n  optional double sum = 5;\n\n  // bucket_counts is an optional field that contains the count values of the\n  // histogram for each bucket.\n  //\n  // The sum of the bucket_counts must equal the value in the count field.\n  //\n  // The number of elements in bucket_counts array must be one greater than\n  // the number of elements in explicit_bounds array.\n  repeated fixed64 bucket_counts = 6;\n\n  // explicit_bounds specifies buckets with explicitly defined bounds for values.\n  //\n  // The boundaries for bucket at index i are:\n  //\n  // (-infinity, explicit_bounds[i]] for i == 0\n  // (explicit_bounds[i-1], explicit_bounds[i]] for 0 < i < size(explicit_bounds)\n  // (explicit_bounds[i-1], +infinity) for i == size(explicit_bounds)\n  //\n  // The values in the explicit_bounds array must be strictly increasing.\n  //\n  // Histogram buckets are inclusive of their upper boundary, except the last\n  // bucket where the boundary is at infinity. This format is intentionally\n  // compatible with the OpenMetrics histogram definition.\n  repeated double explicit_bounds = 7;\n\n  // (Optional) List of exemplars collected from\n  // measurements that were used to form the data point\n  repeated Exemplar exemplars = 8;\n\n  // Flags that apply to this specific data point.  See DataPointFlags\n  // for the available flags and their meaning.\n  uint32 flags = 10;\n\n  // min is the minimum value over (start_time, end_time].\n  optional double min = 11;\n\n  // max is the maximum value over (start_time, end_time].\n  optional double max = 12;\n}\n\n// ExponentialHistogramDataPoint is a single data point in a timeseries that describes the\n// time-varying values of an ExponentialHistogram of double values. 
An ExponentialHistogram contains\n// summary statistics for a population of values, it may optionally contain the\n// distribution of those values across a set of buckets.\n//\nmessage ExponentialHistogramDataPoint {\n  // The set of key/value pairs that uniquely identify the timeseries from\n  // where this point belongs. The list may be empty (may contain 0 elements).\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 1;\n\n  // StartTimeUnixNano is optional but strongly encouraged, see\n  // the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 start_time_unix_nano = 2;\n\n  // TimeUnixNano is required, see the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 time_unix_nano = 3;\n\n  // count is the number of values in the population. Must be\n  // non-negative. This value must be equal to the sum of the \"bucket_counts\"\n  // values in the positive and negative Buckets plus the \"zero_count\" field.\n  fixed64 count = 4;\n\n  // sum of the values in the population. If count is zero then this field\n  // must be zero.\n  //\n  // Note: Sum should only be filled out when measuring non-negative discrete\n  // events, and is assumed to be monotonic over the values of these events.\n  // Negative events *can* be recorded, but sum should not be filled out when\n  // doing so.  This is specifically to enforce compatibility w/ OpenMetrics,\n  // see: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#histogram\n  optional double sum = 5;\n  \n  // scale describes the resolution of the histogram.  
Boundaries are\n  // located at powers of the base, where:\n  //\n  //   base = (2^(2^-scale))\n  //\n  // The histogram bucket identified by `index`, a signed integer,\n  // contains values that are greater than (base^index) and\n  // less than or equal to (base^(index+1)).\n  //\n  // The positive and negative ranges of the histogram are expressed\n  // separately.  Negative values are mapped by their absolute value\n  // into the negative range using the same scale as the positive range.\n  //\n  // scale is not restricted by the protocol, as the permissible\n  // values depend on the range of the data.\n  sint32 scale = 6;\n\n  // zero_count is the count of values that are either exactly zero or\n  // within the region considered zero by the instrumentation at the\n  // tolerated degree of precision.  This bucket stores values that\n  // cannot be expressed using the standard exponential formula as\n  // well as values that have been rounded to zero.\n  //\n  // Implementations MAY consider the zero bucket to have probability\n  // mass equal to (zero_count / count).\n  fixed64 zero_count = 7;\n\n  // positive carries the positive range of exponential bucket counts.\n  Buckets positive = 8;\n\n  // negative carries the negative range of exponential bucket counts.\n  Buckets negative = 9;\n\n  // Buckets are a set of bucket counts, encoded in a contiguous array\n  // of counts.\n  message Buckets {\n    // Offset is the bucket index of the first entry in the bucket_counts array.\n    // \n    // Note: This uses a varint encoding as a simple form of compression.\n    sint32 offset = 1;\n\n    // Count is an array of counts, where count[i] carries the count\n    // of the bucket at index (offset+i).  count[i] is the count of\n    // values greater than base^(offset+i) and less than or equal to\n    // base^(offset+i+1).\n    //\n    // Note: By contrast, the explicit HistogramDataPoint uses\n    // fixed64.  
This field is expected to have many buckets,\n    // especially zeros, so uint64 has been selected to ensure\n    // varint encoding.\n    repeated uint64 bucket_counts = 2;\n  } \n\n  // Flags that apply to this specific data point.  See DataPointFlags\n  // for the available flags and their meaning.\n  uint32 flags = 10;\n\n  // (Optional) List of exemplars collected from\n  // measurements that were used to form the data point\n  repeated Exemplar exemplars = 11;\n\n  // min is the minimum value over (start_time, end_time].\n  optional double min = 12;\n\n  // max is the maximum value over (start_time, end_time].\n  optional double max = 13;\n}\n\n// SummaryDataPoint is a single data point in a timeseries that describes the\n// time-varying values of a Summary metric.\nmessage SummaryDataPoint {\n  reserved 1;\n\n  // The set of key/value pairs that uniquely identify the timeseries from\n  // where this point belongs. The list may be empty (may contain 0 elements).\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 7;\n\n  // StartTimeUnixNano is optional but strongly encouraged, see\n  // the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 start_time_unix_nano = 2;\n\n  // TimeUnixNano is required, see the detailed comments above Metric.\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 time_unix_nano = 3;\n\n  // count is the number of values in the population. Must be non-negative.\n  fixed64 count = 4;\n\n  // sum of the values in the population. 
If count is zero then this field\n  // must be zero.\n  //\n  // Note: Sum should only be filled out when measuring non-negative discrete\n  // events, and is assumed to be monotonic over the values of these events.\n  // Negative events *can* be recorded, but sum should not be filled out when\n  // doing so.  This is specifically to enforce compatibility w/ OpenMetrics,\n  // see: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#summary\n  double sum = 5;\n\n  // Represents the value at a given quantile of a distribution.\n  //\n  // To record Min and Max values, the following conventions are used:\n  // - The 1.0 quantile is equivalent to the maximum value observed.\n  // - The 0.0 quantile is equivalent to the minimum value observed.\n  //\n  // See the following issue for more context:\n  // https://github.com/open-telemetry/opentelemetry-proto/issues/125\n  message ValueAtQuantile {\n    // The quantile of a distribution. Must be in the interval\n    // [0.0, 1.0].\n    double quantile = 1;\n\n    // The value at the given quantile of a distribution.\n    //\n    // Quantile values must NOT be negative.\n    double value = 2;\n  }\n\n  // (Optional) list of values at different quantiles of the distribution calculated\n  // from the current snapshot. The quantiles must be strictly increasing.\n  repeated ValueAtQuantile quantile_values = 6;\n\n  // Flags that apply to this specific data point.  See DataPointFlags\n  // for the available flags and their meaning.\n  uint32 flags = 8;\n}\n\n// A representation of an exemplar, which is a sample input measurement.\n// Exemplars also hold information about the environment when the measurement\n// was recorded, for example the span and trace ID of the active span when the\n// exemplar was recorded.\nmessage Exemplar {\n  reserved 1;\n\n  // The set of key/value pairs that were filtered out by the aggregator, but\n  // recorded alongside the original measurement. 
Only key/value pairs that were\n  // filtered out by the aggregator should be included\n  repeated opentelemetry.proto.common.v1.KeyValue filtered_attributes = 7;\n\n  // time_unix_nano is the exact time when this exemplar was recorded\n  //\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n  // 1970.\n  fixed64 time_unix_nano = 2;\n\n  // The value of the measurement that was recorded. An exemplar is\n  // considered invalid when one of the recognized value fields is not present\n  // inside this oneof.\n  oneof value {\n    double as_double = 3;\n    sfixed64 as_int = 6;\n  }\n\n  // (Optional) Span ID of the exemplar trace.\n  // span_id may be missing if the measurement is not recorded inside a trace\n  // or if the trace is not sampled.\n  bytes span_id = 4;\n\n  // (Optional) Trace ID of the exemplar trace.\n  // trace_id may be missing if the measurement is not recorded inside a trace\n  // or if the trace is not sampled.\n  bytes trace_id = 5;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/resource/v1/resource.proto",
    "content": "// Copyright 2019, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.resource.v1;\n\nimport \"opentelemetry/proto/common/v1/common.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Resource.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.resource.v1\";\noption java_outer_classname = \"ResourceProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/resource/v1\";\n\n// Resource information.\nmessage Resource {\n  // Set of attributes that describe the resource.\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 1;\n\n  // dropped_attributes_count is the number of dropped attributes. If the value is 0, then\n  // no attributes were dropped.\n  uint32 dropped_attributes_count = 2;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/protos/third-party/opentelemetry/proto/trace/v1/trace.proto",
    "content": "// Copyright 2019, OpenTelemetry Authors\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nsyntax = \"proto3\";\n\npackage opentelemetry.proto.trace.v1;\n\nimport \"opentelemetry/proto/common/v1/common.proto\";\nimport \"opentelemetry/proto/resource/v1/resource.proto\";\n\noption csharp_namespace = \"OpenTelemetry.Proto.Trace.V1\";\noption java_multiple_files = true;\noption java_package = \"io.opentelemetry.proto.trace.v1\";\noption java_outer_classname = \"TraceProto\";\noption go_package = \"go.opentelemetry.io/proto/otlp/trace/v1\";\n\n// TracesData represents the traces data that can be stored in a persistent storage,\n// OR can be embedded by other protocols that transfer OTLP traces data but do\n// not implement the OTLP protocol.\n//\n// The main difference between this message and collector protocol is that\n// in this message there will not be any \"control\" or \"metadata\" specific to\n// OTLP protocol.\n//\n// When new fields are added into this message, the OTLP request MUST be updated\n// as well.\nmessage TracesData {\n  // An array of ResourceSpans.\n  // For data coming from a single resource this array will typically contain\n  // one element. 
Intermediary nodes that receive data from multiple origins\n  // typically batch the data before forwarding further and in that case this\n  // array will contain multiple elements.\n  repeated ResourceSpans resource_spans = 1;\n}\n\n// A collection of ScopeSpans from a Resource.\nmessage ResourceSpans {\n  reserved 1000;\n\n  // The resource for the spans in this message.\n  // If this field is not set then no resource info is known.\n  opentelemetry.proto.resource.v1.Resource resource = 1;\n\n  // A list of ScopeSpans that originate from a resource.\n  repeated ScopeSpans scope_spans = 2;\n\n  // This schema_url applies to the data in the \"resource\" field. It does not apply\n  // to the data in the \"scope_spans\" field which have their own schema_url field.\n  string schema_url = 3;\n}\n\n// A collection of Spans produced by an InstrumentationScope.\nmessage ScopeSpans {\n  // The instrumentation scope information for the spans in this message.\n  // Semantically when InstrumentationScope isn't set, it is equivalent with\n  // an empty instrumentation scope name (unknown).\n  opentelemetry.proto.common.v1.InstrumentationScope scope = 1;\n\n  // A list of Spans that originate from an instrumentation scope.\n  repeated Span spans = 2;\n\n  // This schema_url applies to all spans and span events in the \"spans\" field.\n  string schema_url = 3;\n}\n\n// A Span represents a single operation performed by a single component of the system.\n//\n// The next available field id is 17.\nmessage Span {\n  // A unique identifier for a trace. All spans from the same trace share\n  // the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes\n  // is considered invalid.\n  //\n  // This field is semantically required. Receiver should generate new\n  // random trace_id if empty or invalid trace_id was received.\n  //\n  // This field is required.\n  bytes trace_id = 1;\n\n  // A unique identifier for a span within a trace, assigned when the span\n  // is created. 
The ID is an 8-byte array. An ID with all zeroes is considered\n  // invalid.\n  //\n  // This field is semantically required. Receiver should generate new\n  // random span_id if empty or invalid span_id was received.\n  //\n  // This field is required.\n  bytes span_id = 2;\n\n  // trace_state conveys information about request position in multiple distributed tracing graphs.\n  // It is a trace_state in w3c-trace-context format: https://www.w3.org/TR/trace-context/#tracestate-header\n  // See also https://github.com/w3c/distributed-tracing for more details about this field.\n  string trace_state = 3;\n\n  // The `span_id` of this span's parent span. If this is a root span, then this\n  // field must be empty. The ID is an 8-byte array.\n  bytes parent_span_id = 4;\n\n  // A description of the span's operation.\n  //\n  // For example, the name can be a qualified method name or a file name\n  // and a line number where the operation is called. A best practice is to use\n  // the same display name at the same call point in an application.\n  // This makes it easier to correlate spans in different traces.\n  //\n  // This field is semantically required to be set to non-empty string.\n  // Empty value is equivalent to an unknown span name.\n  //\n  // This field is required.\n  string name = 5;\n\n  // SpanKind is the type of span. Can be used to specify additional relationships between spans\n  // in addition to a parent/child relationship.\n  enum SpanKind {\n    // Unspecified. Do NOT use as default.\n    // Implementations MAY assume SpanKind to be INTERNAL when receiving UNSPECIFIED.\n    SPAN_KIND_UNSPECIFIED = 0;\n\n    // Indicates that the span represents an internal operation within an application,\n    // as opposed to an operation happening at the boundaries. 
Default value.\n    SPAN_KIND_INTERNAL = 1;\n\n    // Indicates that the span covers server-side handling of an RPC or other\n    // remote network request.\n    SPAN_KIND_SERVER = 2;\n\n    // Indicates that the span describes a request to some remote service.\n    SPAN_KIND_CLIENT = 3;\n\n    // Indicates that the span describes a producer sending a message to a broker.\n    // Unlike CLIENT and SERVER, there is often no direct critical path latency relationship\n    // between producer and consumer spans. A PRODUCER span ends when the message was accepted\n    // by the broker while the logical processing of the message might span a much longer time.\n    SPAN_KIND_PRODUCER = 4;\n\n    // Indicates that the span describes consumer receiving a message from a broker.\n    // Like the PRODUCER kind, there is often no direct critical path latency relationship\n    // between producer and consumer spans.\n    SPAN_KIND_CONSUMER = 5;\n  }\n\n  // Distinguishes between spans generated in a particular context. For example,\n  // two spans with the same name may be distinguished using `CLIENT` (caller)\n  // and `SERVER` (callee) to identify queueing latency associated with the span.\n  SpanKind kind = 6;\n\n  // start_time_unix_nano is the start time of the span. On the client side, this is the time\n  // kept by the local machine where the span execution starts. On the server side, this\n  // is the time when the server's application handler starts running.\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n  //\n  // This field is semantically required and it is expected that end_time >= start_time.\n  fixed64 start_time_unix_nano = 7;\n\n  // end_time_unix_nano is the end time of the span. On the client side, this is the time\n  // kept by the local machine where the span execution ends. 
On the server side, this\n  // is the time when the server application handler stops running.\n  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n  //\n  // This field is semantically required and it is expected that end_time >= start_time.\n  fixed64 end_time_unix_nano = 8;\n\n  // attributes is a collection of key/value pairs. Note, global attributes\n  // like server name can be set using the resource API. Examples of attributes:\n  //\n  //     \"/http/user_agent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36\"\n  //     \"/http/server_latency\": 300\n  //     \"abc.com/myattribute\": true\n  //     \"abc.com/score\": 10.239\n  //\n  // The OpenTelemetry API specification further restricts the allowed value types:\n  // https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/README.md#attribute\n  // Attribute keys MUST be unique (it is not allowed to have more than one\n  // attribute with the same key).\n  repeated opentelemetry.proto.common.v1.KeyValue attributes = 9;\n\n  // dropped_attributes_count is the number of attributes that were discarded. Attributes\n  // can be discarded because their keys are too long or because there are too many\n  // attributes. 
If this value is 0, then no attributes were dropped.\n  uint32 dropped_attributes_count = 10;\n\n  // Event is a time-stamped annotation of the span, consisting of user-supplied\n  // text description and key-value pairs.\n  message Event {\n    // time_unix_nano is the time the event occurred.\n    fixed64 time_unix_nano = 1;\n\n    // name of the event.\n    // This field is semantically required to be set to non-empty string.\n    string name = 2;\n\n    // attributes is a collection of attribute key/value pairs on the event.\n    // Attribute keys MUST be unique (it is not allowed to have more than one\n    // attribute with the same key).\n    repeated opentelemetry.proto.common.v1.KeyValue attributes = 3;\n\n    // dropped_attributes_count is the number of dropped attributes. If the value is 0,\n    // then no attributes were dropped.\n    uint32 dropped_attributes_count = 4;\n  }\n\n  // events is a collection of Event items.\n  repeated Event events = 11;\n\n  // dropped_events_count is the number of dropped events. If the value is 0, then no\n  // events were dropped.\n  uint32 dropped_events_count = 12;\n\n  // A pointer from the current span to another span in the same trace or in a\n  // different trace. For example, this can be used in batching operations,\n  // where a single batch handler processes multiple requests from different\n  // traces or when the handler receives a request from a different project.\n  message Link {\n    // A unique identifier of a trace that this linked span is part of. The ID is a\n    // 16-byte array.\n    bytes trace_id = 1;\n\n    // A unique identifier for the linked span. 
The ID is an 8-byte array.\n    bytes span_id = 2;\n\n    // The trace_state associated with the link.\n    string trace_state = 3;\n\n    // attributes is a collection of attribute key/value pairs on the link.\n    // Attribute keys MUST be unique (it is not allowed to have more than one\n    // attribute with the same key).\n    repeated opentelemetry.proto.common.v1.KeyValue attributes = 4;\n\n    // dropped_attributes_count is the number of dropped attributes. If the value is 0,\n    // then no attributes were dropped.\n    uint32 dropped_attributes_count = 5;\n  }\n\n  // links is a collection of Links, which are references from this span to a span\n  // in the same or different trace.\n  repeated Link links = 13;\n\n  // dropped_links_count is the number of dropped links after the maximum size was\n  // enforced. If this value is 0, then no links were dropped.\n  uint32 dropped_links_count = 14;\n\n  // An optional final status for this span. Semantically when Status isn't set, it means\n  // span's status code is unset, i.e. assume STATUS_CODE_UNSET (code = 0).\n  Status status = 15;\n}\n\n// The Status type defines a logical error model that is suitable for different\n// programming environments, including REST APIs and RPC APIs.\nmessage Status {\n  reserved 1;\n\n  // A developer-facing human readable error message.\n  string message = 2;\n\n  // For the semantics of status codes see\n  // https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status\n  enum StatusCode {\n    // The default status.\n    STATUS_CODE_UNSET               = 0;\n    // The Span has been validated by an Application developer or Operator to \n    // have completed successfully.\n    STATUS_CODE_OK                  = 1;\n    // The Span contains an error.\n    STATUS_CODE_ERROR               = 2;\n  };\n\n  // The status code.\n  StatusCode code = 3;\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/cluster/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::tower::MakeLoadShedError;\nuse serde::{Deserialize, Serialize};\nuse thiserror;\n\nuse crate::GrpcServiceError;\nuse crate::error::{ServiceError, ServiceErrorCode};\n\ninclude!(\"../codegen/quickwit/quickwit.cluster.rs\");\n\npub const CLUSTER_PLANE_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/cluster_descriptor.bin\");\n\npub type ClusterResult<T> = std::result::Result<T, ClusterError>;\n\n#[derive(Debug, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum ClusterError {\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests,\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl ServiceError for ClusterError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(err_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"cluster internal error: {err_msg}\");\n                ServiceErrorCode::Internal\n            }\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => 
ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for ClusterError {\n    fn new_internal(message: String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n\nimpl MakeLoadShedError for ClusterError {\n    fn make_load_shed_error() -> Self {\n        ClusterError::TooManyRequests\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/jaeger/jaeger.api_v2.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValue {\n    #[prost(string, tag = \"1\")]\n    pub key: ::prost::alloc::string::String,\n    #[prost(enumeration = \"ValueType\", tag = \"2\")]\n    pub v_type: i32,\n    #[prost(string, tag = \"3\")]\n    pub v_str: ::prost::alloc::string::String,\n    #[prost(bool, tag = \"4\")]\n    pub v_bool: bool,\n    #[prost(int64, tag = \"5\")]\n    pub v_int64: i64,\n    #[prost(double, tag = \"6\")]\n    pub v_float64: f64,\n    #[prost(bytes = \"vec\", tag = \"7\")]\n    pub v_binary: ::prost::alloc::vec::Vec<u8>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Log {\n    #[prost(message, optional, tag = \"1\")]\n    pub timestamp: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub fields: ::prost::alloc::vec::Vec<KeyValue>,\n}\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SpanRef {\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    #[prost(bytes = \"vec\", tag = \"2\")]\n    pub span_id: ::prost::alloc::vec::Vec<u8>,\n    #[prost(enumeration = \"SpanRefType\", tag = \"3\")]\n    pub ref_type: i32,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Process {\n    #[prost(string, tag = \"1\")]\n    pub service_name: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"2\")]\n    pub tags: ::prost::alloc::vec::Vec<KeyValue>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Span {\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    #[prost(bytes = \"vec\", tag = \"2\")]\n    pub span_id: ::prost::alloc::vec::Vec<u8>,\n    #[prost(string, tag = \"3\")]\n    pub operation_name: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"4\")]\n    pub references: ::prost::alloc::vec::Vec<SpanRef>,\n    #[prost(uint32, 
tag = \"5\")]\n    pub flags: u32,\n    #[prost(message, optional, tag = \"6\")]\n    pub start_time: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"7\")]\n    pub duration: ::core::option::Option<::prost_types::Duration>,\n    #[prost(message, repeated, tag = \"8\")]\n    pub tags: ::prost::alloc::vec::Vec<KeyValue>,\n    #[prost(message, repeated, tag = \"9\")]\n    pub logs: ::prost::alloc::vec::Vec<Log>,\n    #[prost(message, optional, tag = \"10\")]\n    pub process: ::core::option::Option<Process>,\n    #[prost(string, tag = \"11\")]\n    pub process_id: ::prost::alloc::string::String,\n    #[prost(string, repeated, tag = \"12\")]\n    pub warnings: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Trace {\n    #[prost(message, repeated, tag = \"1\")]\n    pub spans: ::prost::alloc::vec::Vec<Span>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub process_map: ::prost::alloc::vec::Vec<trace::ProcessMapping>,\n    #[prost(string, repeated, tag = \"3\")]\n    pub warnings: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n/// Nested message and enum types in `Trace`.\npub mod trace {\n    #[derive(Clone, PartialEq, ::prost::Message)]\n    pub struct ProcessMapping {\n        #[prost(string, tag = \"1\")]\n        pub process_id: ::prost::alloc::string::String,\n        #[prost(message, optional, tag = \"2\")]\n        pub process: ::core::option::Option<super::Process>,\n    }\n}\n/// Note that both Span and Batch may contain a Process.\n/// This is different from the Thrift model which was only used\n/// for transport, because Proto model is also used by the backend\n/// as the domain model, where once a batch is received it is split\n/// into individual spans which are all processed independently,\n/// and therefore they all need a Process. 
As far as on-the-wire\n/// semantics, both Batch and Spans in the same message may contain\n/// their own instances of Process, with span.Process taking priority\n/// over batch.Process.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Batch {\n    #[prost(message, repeated, tag = \"1\")]\n    pub spans: ::prost::alloc::vec::Vec<Span>,\n    #[prost(message, optional, tag = \"2\")]\n    pub process: ::core::option::Option<Process>,\n}\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DependencyLink {\n    #[prost(string, tag = \"1\")]\n    pub parent: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub child: ::prost::alloc::string::String,\n    #[prost(uint64, tag = \"3\")]\n    pub call_count: u64,\n    #[prost(string, tag = \"4\")]\n    pub source: ::prost::alloc::string::String,\n}\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum ValueType {\n    String = 0,\n    Bool = 1,\n    Int64 = 2,\n    Float64 = 3,\n    Binary = 4,\n}\nimpl ValueType {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::String => \"STRING\",\n            Self::Bool => \"BOOL\",\n            Self::Int64 => \"INT64\",\n            Self::Float64 => \"FLOAT64\",\n            Self::Binary => \"BINARY\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"STRING\" => Some(Self::String),\n            \"BOOL\" => Some(Self::Bool),\n            \"INT64\" => Some(Self::Int64),\n            \"FLOAT64\" => Some(Self::Float64),\n            
\"BINARY\" => Some(Self::Binary),\n            _ => None,\n        }\n    }\n}\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum SpanRefType {\n    ChildOf = 0,\n    FollowsFrom = 1,\n}\nimpl SpanRefType {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::ChildOf => \"CHILD_OF\",\n            Self::FollowsFrom => \"FOLLOWS_FROM\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"CHILD_OF\" => Some(Self::ChildOf),\n            \"FOLLOWS_FROM\" => Some(Self::FollowsFrom),\n            _ => None,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/jaeger/jaeger.storage.v1.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetDependenciesRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub start_time: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"2\")]\n    pub end_time: ::core::option::Option<::prost_types::Timestamp>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetDependenciesResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub dependencies: ::prost::alloc::vec::Vec<super::super::api_v2::DependencyLink>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct WriteSpanRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub span: ::core::option::Option<super::super::api_v2::Span>,\n}\n/// empty; extensible in the future\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct WriteSpanResponse {}\n/// empty; extensible in the future\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CloseWriterRequest {}\n/// empty; extensible in the future\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CloseWriterResponse {}\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetTraceRequest {\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n}\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetServicesRequest {}\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetServicesResponse {\n    #[prost(string, repeated, tag = \"1\")]\n    pub services: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetOperationsRequest {\n    #[prost(string, tag = \"1\")]\n    pub service: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub span_kind: ::prost::alloc::string::String,\n}\n#[derive(Ord, 
PartialOrd)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct Operation {\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub span_kind: ::prost::alloc::string::String,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetOperationsResponse {\n    /// deprecated\n    #[prost(string, repeated, tag = \"1\")]\n    pub operation_names: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub operations: ::prost::alloc::vec::Vec<Operation>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct TraceQueryParameters {\n    #[prost(string, tag = \"1\")]\n    pub service_name: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub operation_name: ::prost::alloc::string::String,\n    #[prost(map = \"string, string\", tag = \"3\")]\n    pub tags: ::std::collections::HashMap<\n        ::prost::alloc::string::String,\n        ::prost::alloc::string::String,\n    >,\n    #[prost(message, optional, tag = \"4\")]\n    pub start_time_min: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"5\")]\n    pub start_time_max: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"6\")]\n    pub duration_min: ::core::option::Option<::prost_types::Duration>,\n    #[prost(message, optional, tag = \"7\")]\n    pub duration_max: ::core::option::Option<::prost_types::Duration>,\n    #[prost(int32, tag = \"8\")]\n    pub num_traces: i32,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FindTracesRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub query: ::core::option::Option<TraceQueryParameters>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct SpansResponseChunk {\n    #[prost(message, repeated, tag = \"1\")]\n    pub spans: 
::prost::alloc::vec::Vec<super::super::api_v2::Span>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FindTraceIDsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub query: ::core::option::Option<TraceQueryParameters>,\n}\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FindTraceIDsResponse {\n    #[prost(bytes = \"vec\", repeated, tag = \"1\")]\n    pub trace_ids: ::prost::alloc::vec::Vec<::prost::alloc::vec::Vec<u8>>,\n}\n/// empty; extensible in the future\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CapabilitiesRequest {}\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CapabilitiesResponse {\n    #[prost(bool, tag = \"1\")]\n    pub archive_span_reader: bool,\n    #[prost(bool, tag = \"2\")]\n    pub archive_span_writer: bool,\n    #[prost(bool, tag = \"3\")]\n    pub streaming_span_writer: bool,\n}\n/// Generated client implementations.\npub mod span_writer_plugin_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct SpanWriterPluginClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl SpanWriterPluginClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> SpanWriterPluginClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = 
Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> SpanWriterPluginClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            SpanWriterPluginClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn 
max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// spanstore/Writer\n        pub async fn write_span(\n            &mut self,\n            request: impl tonic::IntoRequest<super::WriteSpanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::WriteSpanResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanWriterPlugin/WriteSpan\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v1.SpanWriterPlugin\", \"WriteSpan\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn close(\n            &mut self,\n            request: impl tonic::IntoRequest<super::CloseWriterRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CloseWriterResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", 
e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanWriterPlugin/Close\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"jaeger.storage.v1.SpanWriterPlugin\", \"Close\"));\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod span_writer_plugin_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with SpanWriterPluginServer.\n    #[async_trait]\n    pub trait SpanWriterPlugin: std::marker::Send + std::marker::Sync + 'static {\n        /// spanstore/Writer\n        async fn write_span(\n            &self,\n            request: tonic::Request<super::WriteSpanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::WriteSpanResponse>,\n            tonic::Status,\n        >;\n        async fn close(\n            &self,\n            request: tonic::Request<super::CloseWriterRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CloseWriterResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct SpanWriterPluginServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> SpanWriterPluginServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        
pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for SpanWriterPluginServer<T>\n    where\n        T: SpanWriterPlugin,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 
'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v1.SpanWriterPlugin/WriteSpan\" => {\n                    #[allow(non_camel_case_types)]\n                    struct WriteSpanSvc<T: SpanWriterPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanWriterPlugin,\n                    > tonic::server::UnaryService<super::WriteSpanRequest>\n                    for WriteSpanSvc<T> {\n                        type Response = super::WriteSpanResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::WriteSpanRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanWriterPlugin>::write_span(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = 
self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = WriteSpanSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v1.SpanWriterPlugin/Close\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CloseSvc<T: SpanWriterPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanWriterPlugin,\n                    > tonic::server::UnaryService<super::CloseWriterRequest>\n                    for CloseSvc<T> {\n                        type Response = super::CloseWriterResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::CloseWriterRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanWriterPlugin>::close(&inner, request).await\n                            
};\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = CloseSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n  
                              tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for SpanWriterPluginServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.SpanWriterPlugin\";\n    impl<T> tonic::server::NamedService for SpanWriterPluginServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n/// Generated client implementations.\npub mod streaming_span_writer_plugin_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct StreamingSpanWriterPluginClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl StreamingSpanWriterPluginClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> StreamingSpanWriterPluginClient<T>\n    where\n        T: 
tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> StreamingSpanWriterPluginClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            StreamingSpanWriterPluginClient::new(\n                InterceptedService::new(inner, interceptor),\n            )\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n        
    self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        pub async fn write_span_stream(\n            &mut self,\n            request: impl tonic::IntoStreamingRequest<Message = super::WriteSpanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::WriteSpanResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.StreamingSpanWriterPlugin/WriteSpanStream\",\n            );\n            let mut req = request.into_streaming_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"jaeger.storage.v1.StreamingSpanWriterPlugin\",\n                        \"WriteSpanStream\",\n                    ),\n                );\n            self.inner.client_streaming(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod streaming_span_writer_plugin_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        
clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with StreamingSpanWriterPluginServer.\n    #[async_trait]\n    pub trait StreamingSpanWriterPlugin: std::marker::Send + std::marker::Sync + 'static {\n        async fn write_span_stream(\n            &self,\n            request: tonic::Request<tonic::Streaming<super::WriteSpanRequest>>,\n        ) -> std::result::Result<\n            tonic::Response<super::WriteSpanResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct StreamingSpanWriterPluginServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> StreamingSpanWriterPluginServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n     
   /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for StreamingSpanWriterPluginServer<T>\n    where\n        T: StreamingSpanWriterPlugin,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v1.StreamingSpanWriterPlugin/WriteSpanStream\" => {\n                    #[allow(non_camel_case_types)]\n                    struct WriteSpanStreamSvc<T: StreamingSpanWriterPlugin>(pub Arc<T>);\n                    impl<\n                        T: StreamingSpanWriterPlugin,\n                    > tonic::server::ClientStreamingService<super::WriteSpanRequest>\n      
              for WriteSpanStreamSvc<T> {\n                        type Response = super::WriteSpanResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                tonic::Streaming<super::WriteSpanRequest>,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as StreamingSpanWriterPlugin>::write_span_stream(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = WriteSpanStreamSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            
.apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.client_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for StreamingSpanWriterPluginServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.StreamingSpanWriterPlugin\";\n    impl<T> tonic::server::NamedService for StreamingSpanWriterPluginServer<T> {\n        const NAME: 
&'static str = SERVICE_NAME;\n    }\n}\n/// Generated client implementations.\npub mod span_reader_plugin_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct SpanReaderPluginClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl SpanReaderPluginClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> SpanReaderPluginClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> SpanReaderPluginClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as 
tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            SpanReaderPluginClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// spanstore/Reader\n        pub async fn get_trace(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetTraceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::SpansResponseChunk>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n       
         .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanReaderPlugin/GetTrace\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v1.SpanReaderPlugin\", \"GetTrace\"),\n                );\n            self.inner.server_streaming(req, path, codec).await\n        }\n        pub async fn get_services(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetServicesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetServicesResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanReaderPlugin/GetServices\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v1.SpanReaderPlugin\", \"GetServices\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn get_operations(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetOperationsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetOperationsResponse>,\n            tonic::Status,\n     
   > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanReaderPlugin/GetOperations\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"jaeger.storage.v1.SpanReaderPlugin\",\n                        \"GetOperations\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn find_traces(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FindTracesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::SpansResponseChunk>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanReaderPlugin/FindTraces\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v1.SpanReaderPlugin\", \"FindTraces\"),\n                );\n            self.inner.server_streaming(req, path, codec).await\n        }\n        pub async fn find_trace_i_ds(\n            &mut self,\n          
  request: impl tonic::IntoRequest<super::FindTraceIDsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FindTraceIDsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.SpanReaderPlugin/FindTraceIDs\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v1.SpanReaderPlugin\", \"FindTraceIDs\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod span_reader_plugin_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with SpanReaderPluginServer.\n    #[async_trait]\n    pub trait SpanReaderPlugin: std::marker::Send + std::marker::Sync + 'static {\n        /// Server streaming response type for the GetTrace method.\n        type GetTraceStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::SpansResponseChunk, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// spanstore/Reader\n        async fn get_trace(\n            &self,\n            request: tonic::Request<super::GetTraceRequest>,\n        ) -> std::result::Result<tonic::Response<Self::GetTraceStream>, tonic::Status>;\n        
async fn get_services(\n            &self,\n            request: tonic::Request<super::GetServicesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetServicesResponse>,\n            tonic::Status,\n        >;\n        async fn get_operations(\n            &self,\n            request: tonic::Request<super::GetOperationsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetOperationsResponse>,\n            tonic::Status,\n        >;\n        /// Server streaming response type for the FindTraces method.\n        type FindTracesStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::SpansResponseChunk, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        async fn find_traces(\n            &self,\n            request: tonic::Request<super::FindTracesRequest>,\n        ) -> std::result::Result<tonic::Response<Self::FindTracesStream>, tonic::Status>;\n        async fn find_trace_i_ds(\n            &self,\n            request: tonic::Request<super::FindTraceIDsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FindTraceIDsResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct SpanReaderPluginServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> SpanReaderPluginServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                
max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for SpanReaderPluginServer<T>\n    where\n        T: SpanReaderPlugin,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n        
    _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v1.SpanReaderPlugin/GetTrace\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetTraceSvc<T: SpanReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanReaderPlugin,\n                    > tonic::server::ServerStreamingService<super::GetTraceRequest>\n                    for GetTraceSvc<T> {\n                        type Response = super::SpansResponseChunk;\n                        type ResponseStream = T::GetTraceStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetTraceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanReaderPlugin>::get_trace(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetTraceSvc(inner);\n           
             let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v1.SpanReaderPlugin/GetServices\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetServicesSvc<T: SpanReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanReaderPlugin,\n                    > tonic::server::UnaryService<super::GetServicesRequest>\n                    for GetServicesSvc<T> {\n                        type Response = super::GetServicesResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetServicesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanReaderPlugin>::get_services(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = 
self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetServicesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v1.SpanReaderPlugin/GetOperations\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetOperationsSvc<T: SpanReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanReaderPlugin,\n                    > tonic::server::UnaryService<super::GetOperationsRequest>\n                    for GetOperationsSvc<T> {\n                        type Response = super::GetOperationsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: 
tonic::Request<super::GetOperationsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanReaderPlugin>::get_operations(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetOperationsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v1.SpanReaderPlugin/FindTraces\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FindTracesSvc<T: SpanReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanReaderPlugin,\n        
            > tonic::server::ServerStreamingService<super::FindTracesRequest>\n                    for FindTracesSvc<T> {\n                        type Response = super::SpansResponseChunk;\n                        type ResponseStream = T::FindTracesStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FindTracesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanReaderPlugin>::find_traces(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FindTracesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                
max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v1.SpanReaderPlugin/FindTraceIDs\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FindTraceIDsSvc<T: SpanReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: SpanReaderPlugin,\n                    > tonic::server::UnaryService<super::FindTraceIDsRequest>\n                    for FindTraceIDsSvc<T> {\n                        type Response = super::FindTraceIDsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FindTraceIDsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SpanReaderPlugin>::find_trace_i_ds(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FindTraceIDsSvc(inner);\n             
           let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for SpanReaderPluginServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: 
self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.SpanReaderPlugin\";\n    impl<T> tonic::server::NamedService for SpanReaderPluginServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n/// Generated client implementations.\npub mod archive_span_writer_plugin_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct ArchiveSpanWriterPluginClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl ArchiveSpanWriterPluginClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> ArchiveSpanWriterPluginClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            
interceptor: F,\n        ) -> ArchiveSpanWriterPluginClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            ArchiveSpanWriterPluginClient::new(\n                InterceptedService::new(inner, interceptor),\n            )\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        
/// spanstore/Writer\n        pub async fn write_archive_span(\n            &mut self,\n            request: impl tonic::IntoRequest<super::WriteSpanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::WriteSpanResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.ArchiveSpanWriterPlugin/WriteArchiveSpan\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"jaeger.storage.v1.ArchiveSpanWriterPlugin\",\n                        \"WriteArchiveSpan\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod archive_span_writer_plugin_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with ArchiveSpanWriterPluginServer.\n    #[async_trait]\n    pub trait ArchiveSpanWriterPlugin: std::marker::Send + std::marker::Sync + 'static {\n        /// spanstore/Writer\n        async fn write_archive_span(\n            &self,\n            request: tonic::Request<super::WriteSpanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::WriteSpanResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct 
ArchiveSpanWriterPluginServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> ArchiveSpanWriterPluginServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// 
Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for ArchiveSpanWriterPluginServer<T>\n    where\n        T: ArchiveSpanWriterPlugin,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v1.ArchiveSpanWriterPlugin/WriteArchiveSpan\" => {\n                    #[allow(non_camel_case_types)]\n                    struct WriteArchiveSpanSvc<T: ArchiveSpanWriterPlugin>(pub Arc<T>);\n                    impl<\n                        T: ArchiveSpanWriterPlugin,\n                    > tonic::server::UnaryService<super::WriteSpanRequest>\n                    for WriteArchiveSpanSvc<T> {\n                        type Response = super::WriteSpanResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::WriteSpanRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as 
ArchiveSpanWriterPlugin>::write_archive_span(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = WriteArchiveSpanSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                
tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for ArchiveSpanWriterPluginServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.ArchiveSpanWriterPlugin\";\n    impl<T> tonic::server::NamedService for ArchiveSpanWriterPluginServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n/// Generated client implementations.\npub mod archive_span_reader_plugin_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct ArchiveSpanReaderPluginClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl ArchiveSpanReaderPluginClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: 
TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> ArchiveSpanReaderPluginClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> ArchiveSpanReaderPluginClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            ArchiveSpanReaderPluginClient::new(\n                InterceptedService::new(inner, interceptor),\n            )\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = 
self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// spanstore/Reader\n        pub async fn get_archive_trace(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetTraceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::SpansResponseChunk>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.ArchiveSpanReaderPlugin/GetArchiveTrace\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"jaeger.storage.v1.ArchiveSpanReaderPlugin\",\n                        \"GetArchiveTrace\",\n  
                  ),\n                );\n            self.inner.server_streaming(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod archive_span_reader_plugin_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with ArchiveSpanReaderPluginServer.\n    #[async_trait]\n    pub trait ArchiveSpanReaderPlugin: std::marker::Send + std::marker::Sync + 'static {\n        /// Server streaming response type for the GetArchiveTrace method.\n        type GetArchiveTraceStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::SpansResponseChunk, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// spanstore/Reader\n        async fn get_archive_trace(\n            &self,\n            request: tonic::Request<super::GetTraceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<Self::GetArchiveTraceStream>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct ArchiveSpanReaderPluginServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> ArchiveSpanReaderPluginServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n         
       max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for ArchiveSpanReaderPluginServer<T>\n    where\n        T: ArchiveSpanReaderPlugin,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut 
Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v1.ArchiveSpanReaderPlugin/GetArchiveTrace\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetArchiveTraceSvc<T: ArchiveSpanReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: ArchiveSpanReaderPlugin,\n                    > tonic::server::ServerStreamingService<super::GetTraceRequest>\n                    for GetArchiveTraceSvc<T> {\n                        type Response = super::SpansResponseChunk;\n                        type ResponseStream = T::GetArchiveTraceStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetTraceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ArchiveSpanReaderPlugin>::get_archive_trace(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let 
max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetArchiveTraceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for ArchiveSpanReaderPluginServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            
Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.ArchiveSpanReaderPlugin\";\n    impl<T> tonic::server::NamedService for ArchiveSpanReaderPluginServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n/// Generated client implementations.\npub mod dependencies_reader_plugin_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct DependenciesReaderPluginClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl DependenciesReaderPluginClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> DependenciesReaderPluginClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n      
  }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> DependenciesReaderPluginClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            DependenciesReaderPluginClient::new(\n                InterceptedService::new(inner, interceptor),\n            )\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n 
       ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// dependencystore/Reader\n        pub async fn get_dependencies(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetDependenciesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetDependenciesResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.DependenciesReaderPlugin/GetDependencies\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"jaeger.storage.v1.DependenciesReaderPlugin\",\n                        \"GetDependencies\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod dependencies_reader_plugin_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with DependenciesReaderPluginServer.\n    #[async_trait]\n    pub trait DependenciesReaderPlugin: std::marker::Send + std::marker::Sync + 'static {\n        /// dependencystore/Reader\n        async fn 
get_dependencies(\n            &self,\n            request: tonic::Request<super::GetDependenciesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetDependenciesResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct DependenciesReaderPluginServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> DependenciesReaderPluginServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        
/// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for DependenciesReaderPluginServer<T>\n    where\n        T: DependenciesReaderPlugin,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v1.DependenciesReaderPlugin/GetDependencies\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetDependenciesSvc<T: DependenciesReaderPlugin>(pub Arc<T>);\n                    impl<\n                        T: DependenciesReaderPlugin,\n                    > tonic::server::UnaryService<super::GetDependenciesRequest>\n                    for GetDependenciesSvc<T> {\n                        type Response = super::GetDependenciesResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            
&mut self,\n                            request: tonic::Request<super::GetDependenciesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as DependenciesReaderPlugin>::get_dependencies(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetDependenciesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let 
mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for DependenciesReaderPluginServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.DependenciesReaderPlugin\";\n    impl<T> tonic::server::NamedService for DependenciesReaderPluginServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n/// Generated client implementations.\npub mod plugin_capabilities_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct PluginCapabilitiesClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl 
PluginCapabilitiesClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> PluginCapabilitiesClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> PluginCapabilitiesClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            PluginCapabilitiesClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support 
it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        pub async fn capabilities(\n            &mut self,\n            request: impl tonic::IntoRequest<super::CapabilitiesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CapabilitiesResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v1.PluginCapabilities/Capabilities\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    
GrpcMethod::new(\n                        \"jaeger.storage.v1.PluginCapabilities\",\n                        \"Capabilities\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod plugin_capabilities_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with PluginCapabilitiesServer.\n    #[async_trait]\n    pub trait PluginCapabilities: std::marker::Send + std::marker::Sync + 'static {\n        async fn capabilities(\n            &self,\n            request: tonic::Request<super::CapabilitiesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CapabilitiesResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct PluginCapabilitiesServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> PluginCapabilitiesServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: 
tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for PluginCapabilitiesServer<T>\n    where\n        T: PluginCapabilities,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                
\"/jaeger.storage.v1.PluginCapabilities/Capabilities\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CapabilitiesSvc<T: PluginCapabilities>(pub Arc<T>);\n                    impl<\n                        T: PluginCapabilities,\n                    > tonic::server::UnaryService<super::CapabilitiesRequest>\n                    for CapabilitiesSvc<T> {\n                        type Response = super::CapabilitiesResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::CapabilitiesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as PluginCapabilities>::capabilities(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = CapabilitiesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n        
                        send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for PluginCapabilitiesServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v1.PluginCapabilities\";\n    impl<T> 
tonic::server::NamedService for PluginCapabilitiesServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/jaeger/jaeger.storage.v2.rs",
    "content": "// This file is @generated by prost-build.\n/// GetTraceParams represents the query for a single trace from the storage backend.\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetTraceParams {\n    /// trace_id is a 16 byte array containing the unique identifier for the trace to query.\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    /// start_time is the start of the time interval to search for the trace_id.\n    ///\n    /// This field is optional.\n    #[prost(message, optional, tag = \"2\")]\n    pub start_time: ::core::option::Option<::prost_types::Timestamp>,\n    /// end_time is the end of the time interval to search for the trace_id.\n    ///\n    /// This field is optional.\n    #[prost(message, optional, tag = \"3\")]\n    pub end_time: ::core::option::Option<::prost_types::Timestamp>,\n}\n/// GetTracesRequest represents a request to retrieve multiple traces.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetTracesRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub query: ::prost::alloc::vec::Vec<GetTraceParams>,\n}\n/// GetServicesRequest represents a request to get service names.\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetServicesRequest {}\n/// GetServicesResponse represents the response for GetServicesRequest.\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetServicesResponse {\n    #[prost(string, repeated, tag = \"1\")]\n    pub services: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n/// GetOperationsRequest represents a request to get operation names.\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetOperationsRequest {\n    /// service is the name of the service for which to get operation names.\n    ///\n    /// This field is required.\n    #[prost(string, tag = \"1\")]\n    pub service: ::prost::alloc::string::String,\n    /// span_kind is 
the type of span which is used to distinguish between\n    /// spans generated in a particular context.\n    ///\n    /// This field is optional.\n    #[prost(string, tag = \"2\")]\n    pub span_kind: ::prost::alloc::string::String,\n}\n/// Operation contains information about an operation for a given service.\n#[derive(Ord, PartialOrd)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct Operation {\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub span_kind: ::prost::alloc::string::String,\n}\n/// GetOperationsResponse represents the response for GetOperationsRequest.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetOperationsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub operations: ::prost::alloc::vec::Vec<Operation>,\n}\n/// KeyValue and all its associated types are copied from opentelemetry-proto/common/v1/common.proto\n/// (<https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/common/v1/common.proto>).\n/// This type is used to store attributes in traces.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValue {\n    #[prost(string, tag = \"1\")]\n    pub key: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"2\")]\n    pub value: ::core::option::Option<AnyValue>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AnyValue {\n    #[prost(oneof = \"any_value::Value\", tags = \"1, 2, 3, 4, 5, 6, 7\")]\n    pub value: ::core::option::Option<any_value::Value>,\n}\n/// Nested message and enum types in `AnyValue`.\npub mod any_value {\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Value {\n        #[prost(string, tag = \"1\")]\n        StringValue(::prost::alloc::string::String),\n        #[prost(bool, tag = \"2\")]\n        BoolValue(bool),\n        #[prost(int64, tag = \"3\")]\n        IntValue(i64),\n        #[prost(double, tag = \"4\")]\n        
DoubleValue(f64),\n        #[prost(message, tag = \"5\")]\n        ArrayValue(super::ArrayValue),\n        #[prost(message, tag = \"6\")]\n        KvlistValue(super::KeyValueList),\n        #[prost(bytes, tag = \"7\")]\n        BytesValue(::prost::alloc::vec::Vec<u8>),\n    }\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValueList {\n    #[prost(message, repeated, tag = \"1\")]\n    pub values: ::prost::alloc::vec::Vec<KeyValue>,\n}\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ArrayValue {\n    #[prost(message, repeated, tag = \"1\")]\n    pub values: ::prost::alloc::vec::Vec<AnyValue>,\n}\n/// TraceQueryParameters contains query parameters to find traces. For a detailed\n/// definition of each field in this message, refer to `TraceQueryParameters` in `jaeger.api_v3`\n/// (<https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v3/query_service.proto>).\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct TraceQueryParameters {\n    #[prost(string, tag = \"1\")]\n    pub service_name: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub operation_name: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub attributes: ::prost::alloc::vec::Vec<KeyValue>,\n    #[prost(message, optional, tag = \"4\")]\n    pub start_time_min: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"5\")]\n    pub start_time_max: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"6\")]\n    pub duration_min: ::core::option::Option<::prost_types::Duration>,\n    #[prost(message, optional, tag = \"7\")]\n    pub duration_max: ::core::option::Option<::prost_types::Duration>,\n    #[prost(int32, tag = \"8\")]\n    pub search_depth: i32,\n}\n/// FindTracesRequest represents a request to find traces.\n/// It can be used to retrieve the traces (FindTraces) or simply\n/// the trace IDs (FindTraceIDs).\n#[derive(Clone, 
PartialEq, ::prost::Message)]\npub struct FindTracesRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub query: ::core::option::Option<TraceQueryParameters>,\n}\n/// FoundTraceID is a wrapper around trace ID returned from FindTraceIDs\n/// with an optional time range that may be used in GetTraces calls.\n///\n/// The time range is provided as an optimization hint for some storage backends\n/// that can perform more efficient queries when they know the approximate time range.\n/// The value should not be used for precise time-based filtering or assumptions.\n/// It is meant as a rough boundary and may not be populated in all cases.\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FoundTraceId {\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    #[prost(message, optional, tag = \"2\")]\n    pub start: ::core::option::Option<::prost_types::Timestamp>,\n    #[prost(message, optional, tag = \"3\")]\n    pub end: ::core::option::Option<::prost_types::Timestamp>,\n}\n/// FindTraceIDsResponse represents the response for FindTracesRequest.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FindTraceIDsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub trace_ids: ::prost::alloc::vec::Vec<FoundTraceId>,\n}\n/// Generated client implementations.\npub mod trace_reader_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    /// TraceReader is a service that allows reading traces from storage.\n    /// Note that if you implement this service, you should also implement\n    /// OTEL's TraceService in package opentelemetry.proto.collector.trace.v1\n    /// to allow pushing traces to the storage backend\n    /// 
(<https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/collector/trace/v1/trace_service.proto>)\n    #[derive(Debug, Clone)]\n    pub struct TraceReaderClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl TraceReaderClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> TraceReaderClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> TraceReaderClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + 
std::marker::Sync,\n        {\n            TraceReaderClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// GetTraces returns a stream that retrieves all traces with given IDs.\n        ///\n        /// Chunking requirements:\n        ///\n        /// * A single TracesData chunk MUST NOT contain spans from multiple traces.\n        /// * Large traces MAY be split across multiple, *consecutive* TracesData chunks.\n        /// * Each returned TracesData object MUST NOT be empty.\n        ///\n        /// Edge cases:\n        ///\n        /// * If no spans are found for any given trace ID, the ID is ignored.\n        /// * If none of the trace IDs are found in the storage, an empty response is returned.\n        pub async fn get_traces(\n       
     &mut self,\n            request: impl tonic::IntoRequest<super::GetTracesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<\n                tonic::codec::Streaming<\n                    super::super::super::super::opentelemetry::proto::trace::v1::TracesData,\n                >,\n            >,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v2.TraceReader/GetTraces\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"jaeger.storage.v2.TraceReader\", \"GetTraces\"));\n            self.inner.server_streaming(req, path, codec).await\n        }\n        /// GetServices returns all service names known to the backend from traces\n        /// within its retention period.\n        pub async fn get_services(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetServicesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetServicesResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v2.TraceReader/GetServices\",\n            );\n            let mut req = 
request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"jaeger.storage.v2.TraceReader\", \"GetServices\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// GetOperations returns all operation names for a given service\n        /// known to the backend from traces within its retention period.\n        pub async fn get_operations(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetOperationsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetOperationsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v2.TraceReader/GetOperations\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v2.TraceReader\", \"GetOperations\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// FindTraces returns a stream that retrieves traces matching query parameters.\n        ///\n        /// The chunking rules are the same as for GetTraces.\n        ///\n        /// If no matching traces are found, an empty stream is returned.\n        pub async fn find_traces(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FindTracesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<\n                tonic::codec::Streaming<\n                    super::super::super::super::opentelemetry::proto::trace::v1::TracesData,\n     
           >,\n            >,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v2.TraceReader/FindTraces\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"jaeger.storage.v2.TraceReader\", \"FindTraces\"));\n            self.inner.server_streaming(req, path, codec).await\n        }\n        /// FindTraceIDs returns a stream that retrieves IDs of traces matching query parameters.\n        ///\n        /// If no matching traces are found, an empty stream is returned.\n        ///\n        /// This call behaves identically to FindTraces, except that it returns only the list\n        /// of matching trace IDs. 
This is useful in some contexts, such as batch jobs, where a\n        /// large list of trace IDs may be queried first and then the full traces are loaded\n        /// in batches.\n        pub async fn find_trace_i_ds(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FindTracesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FindTraceIDsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/jaeger.storage.v2.TraceReader/FindTraceIDs\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"jaeger.storage.v2.TraceReader\", \"FindTraceIDs\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod trace_reader_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with TraceReaderServer.\n    #[async_trait]\n    pub trait TraceReader: std::marker::Send + std::marker::Sync + 'static {\n        /// Server streaming response type for the GetTraces method.\n        type GetTracesStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<\n                    super::super::super::super::opentelemetry::proto::trace::v1::TracesData,\n                    
tonic::Status,\n                >,\n            >\n            + std::marker::Send\n            + 'static;\n        /// GetTraces returns a stream that retrieves all traces with given IDs.\n        ///\n        /// Chunking requirements:\n        ///\n        /// * A single TracesData chunk MUST NOT contain spans from multiple traces.\n        /// * Large traces MAY be split across multiple, *consecutive* TracesData chunks.\n        /// * Each returned TracesData object MUST NOT be empty.\n        ///\n        /// Edge cases:\n        ///\n        /// * If no spans are found for any given trace ID, the ID is ignored.\n        /// * If none of the trace IDs are found in the storage, an empty response is returned.\n        async fn get_traces(\n            &self,\n            request: tonic::Request<super::GetTracesRequest>,\n        ) -> std::result::Result<tonic::Response<Self::GetTracesStream>, tonic::Status>;\n        /// GetServices returns all service names known to the backend from traces\n        /// within its retention period.\n        async fn get_services(\n            &self,\n            request: tonic::Request<super::GetServicesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetServicesResponse>,\n            tonic::Status,\n        >;\n        /// GetOperations returns all operation names for a given service\n        /// known to the backend from traces within its retention period.\n        async fn get_operations(\n            &self,\n            request: tonic::Request<super::GetOperationsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetOperationsResponse>,\n            tonic::Status,\n        >;\n        /// Server streaming response type for the FindTraces method.\n        type FindTracesStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<\n                    super::super::super::super::opentelemetry::proto::trace::v1::TracesData,\n            
        tonic::Status,\n                >,\n            >\n            + std::marker::Send\n            + 'static;\n        /// FindTraces returns a stream that retrieves traces matching query parameters.\n        ///\n        /// The chunking rules are the same as for GetTraces.\n        ///\n        /// If no matching traces are found, an empty stream is returned.\n        async fn find_traces(\n            &self,\n            request: tonic::Request<super::FindTracesRequest>,\n        ) -> std::result::Result<tonic::Response<Self::FindTracesStream>, tonic::Status>;\n        /// FindTraceIDs returns a stream that retrieves IDs of traces matching query parameters.\n        ///\n        /// If no matching traces are found, an empty stream is returned.\n        ///\n        /// This call behaves identically to FindTraces, except that it returns only the list\n        /// of matching trace IDs. This is useful in some contexts, such as batch jobs, where a\n        /// large list of trace IDs may be queried first and then the full traces are loaded\n        /// in batches.\n        async fn find_trace_i_ds(\n            &self,\n            request: tonic::Request<super::FindTracesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FindTraceIDsResponse>,\n            tonic::Status,\n        >;\n    }\n    /// TraceReader is a service that allows reading traces from storage.\n    /// Note that if you implement this service, you should also implement\n    /// OTEL's TraceService in package opentelemetry.proto.collector.trace.v1\n    /// to allow pushing traces to the storage backend\n    /// (<https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/collector/trace/v1/trace_service.proto>)\n    #[derive(Debug)]\n    pub struct TraceReaderServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        
max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> TraceReaderServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n 
   impl<T, B> tonic::codegen::Service<http::Request<B>> for TraceReaderServer<T>\n    where\n        T: TraceReader,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/jaeger.storage.v2.TraceReader/GetTraces\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetTracesSvc<T: TraceReader>(pub Arc<T>);\n                    impl<\n                        T: TraceReader,\n                    > tonic::server::ServerStreamingService<super::GetTracesRequest>\n                    for GetTracesSvc<T> {\n                        type Response = super::super::super::super::opentelemetry::proto::trace::v1::TracesData;\n                        type ResponseStream = T::GetTracesStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetTracesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as TraceReader>::get_traces(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                
    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetTracesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v2.TraceReader/GetServices\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetServicesSvc<T: TraceReader>(pub Arc<T>);\n                    impl<\n                        T: TraceReader,\n                    > tonic::server::UnaryService<super::GetServicesRequest>\n                    for GetServicesSvc<T> {\n                        type Response = super::GetServicesResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: 
tonic::Request<super::GetServicesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as TraceReader>::get_services(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetServicesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v2.TraceReader/GetOperations\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetOperationsSvc<T: TraceReader>(pub Arc<T>);\n                    impl<\n                        T: TraceReader,\n                    > 
tonic::server::UnaryService<super::GetOperationsRequest>\n                    for GetOperationsSvc<T> {\n                        type Response = super::GetOperationsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetOperationsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as TraceReader>::get_operations(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetOperationsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let 
res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v2.TraceReader/FindTraces\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FindTracesSvc<T: TraceReader>(pub Arc<T>);\n                    impl<\n                        T: TraceReader,\n                    > tonic::server::ServerStreamingService<super::FindTracesRequest>\n                    for FindTracesSvc<T> {\n                        type Response = super::super::super::super::opentelemetry::proto::trace::v1::TracesData;\n                        type ResponseStream = T::FindTracesStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FindTracesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as TraceReader>::find_traces(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FindTracesSvc(inner);\n                        let codec = 
tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/jaeger.storage.v2.TraceReader/FindTraceIDs\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FindTraceIDsSvc<T: TraceReader>(pub Arc<T>);\n                    impl<\n                        T: TraceReader,\n                    > tonic::server::UnaryService<super::FindTracesRequest>\n                    for FindTraceIDsSvc<T> {\n                        type Response = super::FindTraceIDsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FindTracesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as TraceReader>::find_trace_i_ds(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                  
  let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FindTraceIDsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n    
    }\n    }\n    impl<T> Clone for TraceReaderServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"jaeger.storage.v2.TraceReader\";\n    impl<T> tonic::server::NamedService for TraceReaderServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/jaeger/opentelemetry.proto.common.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// AnyValue is used to represent any type of attribute value. AnyValue may contain a\n/// primitive value such as a string or integer or it may contain an arbitrary nested\n/// object containing arrays, key-value lists and primitives.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AnyValue {\n    /// The value is one of the listed fields. It is valid for all values to be unspecified\n    /// in which case this AnyValue is considered to be \"empty\".\n    #[prost(oneof = \"any_value::Value\", tags = \"1, 2, 3, 4, 5, 6, 7\")]\n    pub value: ::core::option::Option<any_value::Value>,\n}\n/// Nested message and enum types in `AnyValue`.\npub mod any_value {\n    /// The value is one of the listed fields. It is valid for all values to be unspecified\n    /// in which case this AnyValue is considered to be \"empty\".\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Value {\n        #[prost(string, tag = \"1\")]\n        StringValue(::prost::alloc::string::String),\n        #[prost(bool, tag = \"2\")]\n        BoolValue(bool),\n        #[prost(int64, tag = \"3\")]\n        IntValue(i64),\n        #[prost(double, tag = \"4\")]\n        DoubleValue(f64),\n        #[prost(message, tag = \"5\")]\n        ArrayValue(super::ArrayValue),\n        #[prost(message, tag = \"6\")]\n        KvlistValue(super::KeyValueList),\n        #[prost(bytes, tag = \"7\")]\n        BytesValue(::prost::alloc::vec::Vec<u8>),\n    }\n}\n/// ArrayValue is a list of AnyValue messages. We need ArrayValue as a message\n/// since oneof in AnyValue does not allow repeated fields.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ArrayValue {\n    /// Array of values. The array may be empty (contain 0 elements).\n    #[prost(message, repeated, tag = \"1\")]\n    pub values: ::prost::alloc::vec::Vec<AnyValue>,\n}\n/// KeyValueList is a list of KeyValue messages. 
We need KeyValueList as a message\n/// since `oneof` in AnyValue does not allow repeated fields. Everywhere else where we need\n/// a list of KeyValue messages (e.g. in Span) we use `repeated KeyValue` directly to\n/// avoid unnecessary extra wrapping (which slows down the protocol). The 2 approaches\n/// are semantically equivalent.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValueList {\n    /// A collection of key/value pairs. The list may be empty (may\n    /// contain 0 elements).\n    /// The keys MUST be unique (it is not allowed to have more than one\n    /// value with the same key).\n    #[prost(message, repeated, tag = \"1\")]\n    pub values: ::prost::alloc::vec::Vec<KeyValue>,\n}\n/// KeyValue is a key-value pair that is used to store Span attributes, Link\n/// attributes, etc.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValue {\n    #[prost(string, tag = \"1\")]\n    pub key: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"2\")]\n    pub value: ::core::option::Option<AnyValue>,\n}\n/// InstrumentationScope is a message representing the instrumentation scope information\n/// such as the fully qualified name and version.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct InstrumentationScope {\n    /// An empty instrumentation scope name means the name is unknown.\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub version: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub attributes: ::prost::alloc::vec::Vec<KeyValue>,\n    #[prost(uint32, tag = \"4\")]\n    pub dropped_attributes_count: u32,\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/jaeger/opentelemetry.proto.resource.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// Resource information.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Resource {\n    /// Set of attributes that describe the resource.\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"1\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// dropped_attributes_count is the number of dropped attributes. If the value is 0, then\n    /// no attributes were dropped.\n    #[prost(uint32, tag = \"2\")]\n    pub dropped_attributes_count: u32,\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/jaeger/opentelemetry.proto.trace.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// TracesData represents the traces data that can be stored in a persistent storage,\n/// OR can be embedded by other protocols that transfer OTLP traces data but do\n/// not implement the OTLP protocol.\n///\n/// The main difference between this message and collector protocol is that\n/// in this message there will not be any \"control\" or \"metadata\" specific to\n/// OTLP protocol.\n///\n/// When new fields are added into this message, the OTLP request MUST be updated\n/// as well.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct TracesData {\n    /// An array of ResourceSpans.\n    /// For data coming from a single resource this array will typically contain\n    /// one element. Intermediary nodes that receive data from multiple origins\n    /// typically batch the data before forwarding further and in that case this\n    /// array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_spans: ::prost::alloc::vec::Vec<ResourceSpans>,\n}\n/// A collection of ScopeSpans from a Resource.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ResourceSpans {\n    /// The resource for the spans in this message.\n    /// If this field is not set then no resource info is known.\n    #[prost(message, optional, tag = \"1\")]\n    pub resource: ::core::option::Option<super::super::resource::v1::Resource>,\n    /// A list of ScopeSpans that originate from a resource.\n    #[prost(message, repeated, tag = \"2\")]\n    pub scope_spans: ::prost::alloc::vec::Vec<ScopeSpans>,\n    /// This schema_url applies to the data in the \"resource\" field. 
It does not apply\n    /// to the data in the \"scope_spans\" field which have their own schema_url field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A collection of Spans produced by an InstrumentationScope.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ScopeSpans {\n    /// The instrumentation scope information for the spans in this message.\n    /// Semantically when InstrumentationScope isn't set, it is equivalent with\n    /// an empty instrumentation scope name (unknown).\n    #[prost(message, optional, tag = \"1\")]\n    pub scope: ::core::option::Option<super::super::common::v1::InstrumentationScope>,\n    /// A list of Spans that originate from an instrumentation scope.\n    #[prost(message, repeated, tag = \"2\")]\n    pub spans: ::prost::alloc::vec::Vec<Span>,\n    /// This schema_url applies to all spans and span events in the \"spans\" field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A Span represents a single operation performed by a single component of the system.\n///\n/// The next available field id is 17.\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Span {\n    /// A unique identifier for a trace. All spans from the same trace share\n    /// the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes\n    /// is considered invalid.\n    ///\n    /// This field is semantically required. Receiver should generate new\n    /// random trace_id if empty or invalid trace_id was received.\n    ///\n    /// This field is required.\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    /// A unique identifier for a span within a trace, assigned when the span\n    /// is created. The ID is an 8-byte array. An ID with all zeroes is considered\n    /// invalid.\n    ///\n    /// This field is semantically required. 
Receiver should generate new\n    /// random span_id if empty or invalid span_id was received.\n    ///\n    /// This field is required.\n    #[prost(bytes = \"vec\", tag = \"2\")]\n    pub span_id: ::prost::alloc::vec::Vec<u8>,\n    /// trace_state conveys information about request position in multiple distributed tracing graphs.\n    /// It is a trace_state in w3c-trace-context format: <https://www.w3.org/TR/trace-context/#tracestate-header>\n    /// See also <https://github.com/w3c/distributed-tracing> for more details about this field.\n    #[prost(string, tag = \"3\")]\n    pub trace_state: ::prost::alloc::string::String,\n    /// The `span_id` of this span's parent span. If this is a root span, then this\n    /// field must be empty. The ID is an 8-byte array.\n    #[prost(bytes = \"vec\", tag = \"4\")]\n    pub parent_span_id: ::prost::alloc::vec::Vec<u8>,\n    /// A description of the span's operation.\n    ///\n    /// For example, the name can be a qualified method name or a file name\n    /// and a line number where the operation is called. A best practice is to use\n    /// the same display name at the same call point in an application.\n    /// This makes it easier to correlate spans in different traces.\n    ///\n    /// This field is semantically required to be set to non-empty string.\n    /// Empty value is equivalent to an unknown span name.\n    ///\n    /// This field is required.\n    #[prost(string, tag = \"5\")]\n    pub name: ::prost::alloc::string::String,\n    /// Distinguishes between spans generated in a particular context. For example,\n    /// two spans with the same name may be distinguished using `CLIENT` (caller)\n    /// and `SERVER` (callee) to identify queueing latency associated with the span.\n    #[prost(enumeration = \"span::SpanKind\", tag = \"6\")]\n    pub kind: i32,\n    /// start_time_unix_nano is the start time of the span. 
On the client side, this is the time\n    /// kept by the local machine where the span execution starts. On the server side, this\n    /// is the time when the server's application handler starts running.\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n    ///\n    /// This field is semantically required and it is expected that end_time >= start_time.\n    #[prost(fixed64, tag = \"7\")]\n    pub start_time_unix_nano: u64,\n    /// end_time_unix_nano is the end time of the span. On the client side, this is the time\n    /// kept by the local machine where the span execution ends. On the server side, this\n    /// is the time when the server application handler stops running.\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n    ///\n    /// This field is semantically required and it is expected that end_time >= start_time.\n    #[prost(fixed64, tag = \"8\")]\n    pub end_time_unix_nano: u64,\n    /// attributes is a collection of key/value pairs. Note, global attributes\n    /// like server name can be set using the resource API. 
Examples of attributes:\n    ///\n    /// ```text\n    /// \"/http/user_agent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36\"\n    /// \"/http/server_latency\": 300\n    /// \"abc.com/myattribute\": true\n    /// \"abc.com/score\": 10.239\n    /// ```\n    ///\n    /// The OpenTelemetry API specification further restricts the allowed value types:\n    /// <https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/README.md#attribute>\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"9\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// dropped_attributes_count is the number of attributes that were discarded. Attributes\n    /// can be discarded because their keys are too long or because there are too many\n    /// attributes. If this value is 0, then no attributes were dropped.\n    #[prost(uint32, tag = \"10\")]\n    pub dropped_attributes_count: u32,\n    /// events is a collection of Event items.\n    #[prost(message, repeated, tag = \"11\")]\n    pub events: ::prost::alloc::vec::Vec<span::Event>,\n    /// dropped_events_count is the number of dropped events. If the value is 0, then no\n    /// events were dropped.\n    #[prost(uint32, tag = \"12\")]\n    pub dropped_events_count: u32,\n    /// links is a collection of Links, which are references from this span to a span\n    /// in the same or different trace.\n    #[prost(message, repeated, tag = \"13\")]\n    pub links: ::prost::alloc::vec::Vec<span::Link>,\n    /// dropped_links_count is the number of dropped links after the maximum size was\n    /// enforced. If this value is 0, then no links were dropped.\n    #[prost(uint32, tag = \"14\")]\n    pub dropped_links_count: u32,\n    /// An optional final status for this span. 
Semantically when Status isn't set, it means\n    /// span's status code is unset, i.e. assume STATUS_CODE_UNSET (code = 0).\n    #[prost(message, optional, tag = \"15\")]\n    pub status: ::core::option::Option<Status>,\n}\n/// Nested message and enum types in `Span`.\npub mod span {\n    /// Event is a time-stamped annotation of the span, consisting of user-supplied\n    /// text description and key-value pairs.\n    #[derive(Clone, PartialEq, ::prost::Message)]\n    pub struct Event {\n        /// time_unix_nano is the time the event occurred.\n        #[prost(fixed64, tag = \"1\")]\n        pub time_unix_nano: u64,\n        /// name of the event.\n        /// This field is semantically required to be set to non-empty string.\n        #[prost(string, tag = \"2\")]\n        pub name: ::prost::alloc::string::String,\n        /// attributes is a collection of attribute key/value pairs on the event.\n        /// Attribute keys MUST be unique (it is not allowed to have more than one\n        /// attribute with the same key).\n        #[prost(message, repeated, tag = \"3\")]\n        pub attributes: ::prost::alloc::vec::Vec<\n            super::super::super::common::v1::KeyValue,\n        >,\n        /// dropped_attributes_count is the number of dropped attributes. If the value is 0,\n        /// then no attributes were dropped.\n        #[prost(uint32, tag = \"4\")]\n        pub dropped_attributes_count: u32,\n    }\n    /// A pointer from the current span to another span in the same trace or in a\n    /// different trace. For example, this can be used in batching operations,\n    /// where a single batch handler processes multiple requests from different\n    /// traces or when the handler receives a request from a different project.\n    #[derive(Clone, PartialEq, ::prost::Message)]\n    pub struct Link {\n        /// A unique identifier of a trace that this linked span is part of. 
The ID is a\n        /// 16-byte array.\n        #[prost(bytes = \"vec\", tag = \"1\")]\n        pub trace_id: ::prost::alloc::vec::Vec<u8>,\n        /// A unique identifier for the linked span. The ID is an 8-byte array.\n        #[prost(bytes = \"vec\", tag = \"2\")]\n        pub span_id: ::prost::alloc::vec::Vec<u8>,\n        /// The trace_state associated with the link.\n        #[prost(string, tag = \"3\")]\n        pub trace_state: ::prost::alloc::string::String,\n        /// attributes is a collection of attribute key/value pairs on the link.\n        /// Attribute keys MUST be unique (it is not allowed to have more than one\n        /// attribute with the same key).\n        #[prost(message, repeated, tag = \"4\")]\n        pub attributes: ::prost::alloc::vec::Vec<\n            super::super::super::common::v1::KeyValue,\n        >,\n        /// dropped_attributes_count is the number of dropped attributes. If the value is 0,\n        /// then no attributes were dropped.\n        #[prost(uint32, tag = \"5\")]\n        pub dropped_attributes_count: u32,\n    }\n    /// SpanKind is the type of span. Can be used to specify additional relationships between spans\n    /// in addition to a parent/child relationship.\n    #[derive(\n        Clone,\n        Copy,\n        Debug,\n        PartialEq,\n        Eq,\n        Hash,\n        PartialOrd,\n        Ord,\n        ::prost::Enumeration\n    )]\n    #[repr(i32)]\n    pub enum SpanKind {\n        /// Unspecified. Do NOT use as default.\n        /// Implementations MAY assume SpanKind to be INTERNAL when receiving UNSPECIFIED.\n        Unspecified = 0,\n        /// Indicates that the span represents an internal operation within an application,\n        /// as opposed to an operation happening at the boundaries. 
Default value.\n        Internal = 1,\n        /// Indicates that the span covers server-side handling of an RPC or other\n        /// remote network request.\n        Server = 2,\n        /// Indicates that the span describes a request to some remote service.\n        Client = 3,\n        /// Indicates that the span describes a producer sending a message to a broker.\n        /// Unlike CLIENT and SERVER, there is often no direct critical path latency relationship\n        /// between producer and consumer spans. A PRODUCER span ends when the message was accepted\n        /// by the broker while the logical processing of the message might span a much longer time.\n        Producer = 4,\n        /// Indicates that the span describes consumer receiving a message from a broker.\n        /// Like the PRODUCER kind, there is often no direct critical path latency relationship\n        /// between producer and consumer spans.\n        Consumer = 5,\n    }\n    impl SpanKind {\n        /// String value of the enum field names used in the ProtoBuf definition.\n        ///\n        /// The values are not transformed in any way and thus are considered stable\n        /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n        pub fn as_str_name(&self) -> &'static str {\n            match self {\n                Self::Unspecified => \"SPAN_KIND_UNSPECIFIED\",\n                Self::Internal => \"SPAN_KIND_INTERNAL\",\n                Self::Server => \"SPAN_KIND_SERVER\",\n                Self::Client => \"SPAN_KIND_CLIENT\",\n                Self::Producer => \"SPAN_KIND_PRODUCER\",\n                Self::Consumer => \"SPAN_KIND_CONSUMER\",\n            }\n        }\n        /// Creates an enum from field names used in the ProtoBuf definition.\n        pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n            match value {\n                \"SPAN_KIND_UNSPECIFIED\" => Some(Self::Unspecified),\n                
\"SPAN_KIND_INTERNAL\" => Some(Self::Internal),\n                \"SPAN_KIND_SERVER\" => Some(Self::Server),\n                \"SPAN_KIND_CLIENT\" => Some(Self::Client),\n                \"SPAN_KIND_PRODUCER\" => Some(Self::Producer),\n                \"SPAN_KIND_CONSUMER\" => Some(Self::Consumer),\n                _ => None,\n            }\n        }\n    }\n}\n/// The Status type defines a logical error model that is suitable for different\n/// programming environments, including REST APIs and RPC APIs.\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct Status {\n    /// A developer-facing human readable error message.\n    #[prost(string, tag = \"2\")]\n    pub message: ::prost::alloc::string::String,\n    /// The status code.\n    #[prost(enumeration = \"status::StatusCode\", tag = \"3\")]\n    pub code: i32,\n}\n/// Nested message and enum types in `Status`.\npub mod status {\n    /// For the semantics of status codes see\n    /// <https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status>\n    #[derive(\n        Clone,\n        Copy,\n        Debug,\n        PartialEq,\n        Eq,\n        Hash,\n        PartialOrd,\n        Ord,\n        ::prost::Enumeration\n    )]\n    #[repr(i32)]\n    pub enum StatusCode {\n        /// The default status.\n        Unset = 0,\n        /// The Span has been validated by an Application developer or Operator to\n        /// have completed successfully.\n        Ok = 1,\n        /// The Span contains an error.\n        Error = 2,\n    }\n    impl StatusCode {\n        /// String value of the enum field names used in the ProtoBuf definition.\n        ///\n        /// The values are not transformed in any way and thus are considered stable\n        /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n        pub fn as_str_name(&self) -> &'static str {\n            match self {\n                Self::Unset => 
\"STATUS_CODE_UNSET\",\n                Self::Ok => \"STATUS_CODE_OK\",\n                Self::Error => \"STATUS_CODE_ERROR\",\n            }\n        }\n        /// Creates an enum from field names used in the ProtoBuf definition.\n        pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n            match value {\n                \"STATUS_CODE_UNSET\" => Some(Self::Unset),\n                \"STATUS_CODE_OK\" => Some(Self::Ok),\n                \"STATUS_CODE_ERROR\" => Some(Self::Error),\n                _ => None,\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.collector.logs.v1.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ExportLogsServiceRequest {\n    /// An array of ResourceLogs.\n    /// For data coming from a single resource this array will typically contain one\n    /// element. Intermediary nodes (such as OpenTelemetry Collector) that receive\n    /// data from multiple origins typically batch the data before forwarding further and\n    /// in that case this array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_logs: ::prost::alloc::vec::Vec<\n        super::super::super::logs::v1::ResourceLogs,\n    >,\n}\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ExportLogsServiceResponse {\n    /// The details of a partially successful export request.\n    ///\n    /// If the request is only partially accepted\n    /// (i.e. when the server accepts only parts of the data and rejects the rest)\n    /// the server MUST initialize the `partial_success` field and MUST\n    /// set the `rejected_<signal>` with the number of items it rejected.\n    ///\n    /// Servers MAY also make use of the `partial_success` field to convey\n    /// warnings/suggestions to senders even when the request was fully accepted.\n    /// In such cases, the `rejected_<signal>` MUST have a value of `0` and\n    /// the `error_message` MUST be non-empty.\n    ///\n    /// A `partial_success` message with an empty value (`rejected_<signal>1 = 0 and  `error_message\\` = \"\") is equivalent to it not being set/present. 
Senders\n    /// SHOULD interpret it the same way as in the full success case.\n    #[prost(message, optional, tag = \"1\")]\n    pub partial_success: ::core::option::Option<ExportLogsPartialSuccess>,\n}\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ExportLogsPartialSuccess {\n    /// The number of rejected log records.\n    ///\n    /// A `rejected_<signal>` field holding a `0` value indicates that the\n    /// request was fully accepted.\n    #[prost(int64, tag = \"1\")]\n    pub rejected_log_records: i64,\n    /// A developer-facing human-readable message in English. It should be used\n    /// either to explain why the server rejected parts of the data during a partial\n    /// success or to convey warnings/suggestions during a full success. The message\n    /// should offer guidance on how users can address such issues.\n    ///\n    /// error_message is an optional field. An error_message with an empty value\n    /// is equivalent to it not being set.\n    #[prost(string, tag = \"2\")]\n    pub error_message: ::prost::alloc::string::String,\n}\n/// Generated client implementations.\npub mod logs_service_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    /// Service that can be used to push logs between one Application instrumented with\n    /// OpenTelemetry and a collector, or between a collector and a central collector (in this\n    /// case logs are sent/received to/from multiple Applications).\n    #[derive(Debug, Clone)]\n    pub struct LogsServiceClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl LogsServiceClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, 
tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> LogsServiceClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> LogsServiceClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            LogsServiceClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = 
self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// For performance reasons, it is recommended to keep this RPC\n        /// alive for the entire life of the application.\n        pub async fn export(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ExportLogsServiceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ExportLogsServiceResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/opentelemetry.proto.collector.logs.v1.LogsService/Export\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        
\"opentelemetry.proto.collector.logs.v1.LogsService\",\n                        \"Export\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod logs_service_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with LogsServiceServer.\n    #[async_trait]\n    pub trait LogsService: std::marker::Send + std::marker::Sync + 'static {\n        /// For performance reasons, it is recommended to keep this RPC\n        /// alive for the entire life of the application.\n        async fn export(\n            &self,\n            request: tonic::Request<super::ExportLogsServiceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ExportLogsServiceResponse>,\n            tonic::Status,\n        >;\n    }\n    /// Service that can be used to push logs between one Application instrumented with\n    /// OpenTelemetry and a collector, or between a collector and a central collector (in this\n    /// case logs are sent/received to/from multiple Applications).\n    #[derive(Debug)]\n    pub struct LogsServiceServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> LogsServiceServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: 
Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for LogsServiceServer<T>\n    where\n        T: LogsService,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n     
       &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/opentelemetry.proto.collector.logs.v1.LogsService/Export\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ExportSvc<T: LogsService>(pub Arc<T>);\n                    impl<\n                        T: LogsService,\n                    > tonic::server::UnaryService<super::ExportLogsServiceRequest>\n                    for ExportSvc<T> {\n                        type Response = super::ExportLogsServiceResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ExportLogsServiceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as LogsService>::export(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ExportSvc(inner);\n                        let codec = 
tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for LogsServiceServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n              
  max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"opentelemetry.proto.collector.logs.v1.LogsService\";\n    impl<T> tonic::server::NamedService for LogsServiceServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.collector.metrics.v1.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ExportMetricsServiceRequest {\n    /// An array of ResourceMetrics.\n    /// For data coming from a single resource this array will typically contain one\n    /// element. Intermediary nodes (such as OpenTelemetry Collector) that receive\n    /// data from multiple origins typically batch the data before forwarding further and\n    /// in that case this array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_metrics: ::prost::alloc::vec::Vec<\n        super::super::super::metrics::v1::ResourceMetrics,\n    >,\n}\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ExportMetricsServiceResponse {\n    /// The details of a partially successful export request.\n    ///\n    /// If the request is only partially accepted\n    /// (i.e. when the server accepts only parts of the data and rejects the rest)\n    /// the server MUST initialize the `partial_success` field and MUST\n    /// set the `rejected_<signal>` with the number of items it rejected.\n    ///\n    /// Servers MAY also make use of the `partial_success` field to convey\n    /// warnings/suggestions to senders even when the request was fully accepted.\n    /// In such cases, the `rejected_<signal>` MUST have a value of `0` and\n    /// the `error_message` MUST be non-empty.\n    ///\n    /// A `partial_success` message with an empty value (rejected\\_<signal> = 0 and\n    /// `error_message` = \"\") is equivalent to it not being set/present. 
Senders\n    /// SHOULD interpret it the same way as in the full success case.\n    #[prost(message, optional, tag = \"1\")]\n    pub partial_success: ::core::option::Option<ExportMetricsPartialSuccess>,\n}\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ExportMetricsPartialSuccess {\n    /// The number of rejected data points.\n    ///\n    /// A `rejected_<signal>` field holding a `0` value indicates that the\n    /// request was fully accepted.\n    #[prost(int64, tag = \"1\")]\n    pub rejected_data_points: i64,\n    /// A developer-facing human-readable message in English. It should be used\n    /// either to explain why the server rejected parts of the data during a partial\n    /// success or to convey warnings/suggestions during a full success. The message\n    /// should offer guidance on how users can address such issues.\n    ///\n    /// error_message is an optional field. An error_message with an empty value\n    /// is equivalent to it not being set.\n    #[prost(string, tag = \"2\")]\n    pub error_message: ::prost::alloc::string::String,\n}\n/// Generated client implementations.\npub mod metrics_service_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    /// Service that can be used to push metrics between one Application\n    /// instrumented with OpenTelemetry and a collector, or between a collector and a\n    /// central collector.\n    #[derive(Debug, Clone)]\n    pub struct MetricsServiceClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl MetricsServiceClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: 
TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> MetricsServiceClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> MetricsServiceClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            MetricsServiceClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n  
      /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// For performance reasons, it is recommended to keep this RPC\n        /// alive for the entire life of the application.\n        pub async fn export(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ExportMetricsServiceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ExportMetricsServiceResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/opentelemetry.proto.collector.metrics.v1.MetricsService/Export\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"opentelemetry.proto.collector.metrics.v1.MetricsService\",\n           
             \"Export\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod metrics_service_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with MetricsServiceServer.\n    #[async_trait]\n    pub trait MetricsService: std::marker::Send + std::marker::Sync + 'static {\n        /// For performance reasons, it is recommended to keep this RPC\n        /// alive for the entire life of the application.\n        async fn export(\n            &self,\n            request: tonic::Request<super::ExportMetricsServiceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ExportMetricsServiceResponse>,\n            tonic::Status,\n        >;\n    }\n    /// Service that can be used to push metrics between one Application\n    /// instrumented with OpenTelemetry and a collector, or between a collector and a\n    /// central collector.\n    #[derive(Debug)]\n    pub struct MetricsServiceServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> MetricsServiceServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: 
None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for MetricsServiceServer<T>\n    where\n        T: MetricsService,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), 
Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/opentelemetry.proto.collector.metrics.v1.MetricsService/Export\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ExportSvc<T: MetricsService>(pub Arc<T>);\n                    impl<\n                        T: MetricsService,\n                    > tonic::server::UnaryService<super::ExportMetricsServiceRequest>\n                    for ExportSvc<T> {\n                        type Response = super::ExportMetricsServiceResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ExportMetricsServiceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetricsService>::export(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ExportSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = 
tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for MetricsServiceServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n           
 }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"opentelemetry.proto.collector.metrics.v1.MetricsService\";\n    impl<T> tonic::server::NamedService for MetricsServiceServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.collector.trace.v1.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ExportTraceServiceRequest {\n    /// An array of ResourceSpans.\n    /// For data coming from a single resource this array will typically contain one\n    /// element. Intermediary nodes (such as OpenTelemetry Collector) that receive\n    /// data from multiple origins typically batch the data before forwarding further and\n    /// in that case this array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_spans: ::prost::alloc::vec::Vec<\n        super::super::super::trace::v1::ResourceSpans,\n    >,\n}\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ExportTraceServiceResponse {\n    /// The details of a partially successful export request.\n    ///\n    /// If the request is only partially accepted\n    /// (i.e. when the server accepts only parts of the data and rejects the rest)\n    /// the server MUST initialize the `partial_success` field and MUST\n    /// set the `rejected_<signal>` with the number of items it rejected.\n    ///\n    /// Servers MAY also make use of the `partial_success` field to convey\n    /// warnings/suggestions to senders even when the request was fully accepted.\n    /// In such cases, the `rejected_<signal>` MUST have a value of `0` and\n    /// the `error_message` MUST be non-empty.\n    ///\n    /// A `partial_success` message with an empty value (rejected\\_<signal> = 0 and\n    /// `error_message` = \"\") is equivalent to it not being set/present. 
Senders\n    /// SHOULD interpret it the same way as in the full success case.\n    #[prost(message, optional, tag = \"1\")]\n    pub partial_success: ::core::option::Option<ExportTracePartialSuccess>,\n}\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ExportTracePartialSuccess {\n    /// The number of rejected spans.\n    ///\n    /// A `rejected_<signal>` field holding a `0` value indicates that the\n    /// request was fully accepted.\n    #[prost(int64, tag = \"1\")]\n    pub rejected_spans: i64,\n    /// A developer-facing human-readable message in English. It should be used\n    /// either to explain why the server rejected parts of the data during a partial\n    /// success or to convey warnings/suggestions during a full success. The message\n    /// should offer guidance on how users can address such issues.\n    ///\n    /// error_message is an optional field. An error_message with an empty value\n    /// is equivalent to it not being set.\n    #[prost(string, tag = \"2\")]\n    pub error_message: ::prost::alloc::string::String,\n}\n/// Generated client implementations.\npub mod trace_service_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    /// Service that can be used to push spans between one Application instrumented with\n    /// OpenTelemetry and a collector, or between a collector and a central collector (in this\n    /// case spans are sent/received to/from multiple Applications).\n    #[derive(Debug, Clone)]\n    pub struct TraceServiceClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl TraceServiceClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, 
tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> TraceServiceClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> TraceServiceClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            TraceServiceClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = 
self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// For performance reasons, it is recommended to keep this RPC\n        /// alive for the entire life of the application.\n        pub async fn export(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ExportTraceServiceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ExportTraceServiceResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/opentelemetry.proto.collector.trace.v1.TraceService/Export\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        
\"opentelemetry.proto.collector.trace.v1.TraceService\",\n                        \"Export\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod trace_service_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with TraceServiceServer.\n    #[async_trait]\n    pub trait TraceService: std::marker::Send + std::marker::Sync + 'static {\n        /// For performance reasons, it is recommended to keep this RPC\n        /// alive for the entire life of the application.\n        async fn export(\n            &self,\n            request: tonic::Request<super::ExportTraceServiceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ExportTraceServiceResponse>,\n            tonic::Status,\n        >;\n    }\n    /// Service that can be used to push spans between one Application instrumented with\n    /// OpenTelemetry and a collector, or between a collector and a central collector (in this\n    /// case spans are sent/received to/from multiple Applications).\n    #[derive(Debug)]\n    pub struct TraceServiceServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> TraceServiceServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                
send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for TraceServiceServer<T>\n    where\n        T: TraceService,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, 
Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/opentelemetry.proto.collector.trace.v1.TraceService/Export\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ExportSvc<T: TraceService>(pub Arc<T>);\n                    impl<\n                        T: TraceService,\n                    > tonic::server::UnaryService<super::ExportTraceServiceRequest>\n                    for ExportSvc<T> {\n                        type Response = super::ExportTraceServiceResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ExportTraceServiceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as TraceService>::export(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = 
ExportSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for TraceServiceServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                
max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"opentelemetry.proto.collector.trace.v1.TraceService\";\n    impl<T> tonic::server::NamedService for TraceServiceServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.common.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// AnyValue is used to represent any type of attribute value. AnyValue may contain a\n/// primitive value such as a string or integer or it may contain an arbitrary nested\n/// object containing arrays, key-value lists and primitives.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AnyValue {\n    /// The value is one of the listed fields. It is valid for all values to be unspecified\n    /// in which case this AnyValue is considered to be \"empty\".\n    #[prost(oneof = \"any_value::Value\", tags = \"1, 2, 3, 4, 5, 6, 7\")]\n    pub value: ::core::option::Option<any_value::Value>,\n}\n/// Nested message and enum types in `AnyValue`.\npub mod any_value {\n    /// The value is one of the listed fields. It is valid for all values to be unspecified\n    /// in which case this AnyValue is considered to be \"empty\".\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Value {\n        #[prost(string, tag = \"1\")]\n        StringValue(::prost::alloc::string::String),\n        #[prost(bool, tag = \"2\")]\n        BoolValue(bool),\n        #[prost(int64, tag = \"3\")]\n        IntValue(i64),\n        #[prost(double, tag = \"4\")]\n        DoubleValue(f64),\n        #[prost(message, tag = \"5\")]\n        ArrayValue(super::ArrayValue),\n        #[prost(message, tag = \"6\")]\n        KvlistValue(super::KeyValueList),\n        #[prost(bytes, tag = \"7\")]\n        BytesValue(::prost::alloc::vec::Vec<u8>),\n    }\n}\n/// ArrayValue is a list of AnyValue messages. We need ArrayValue as a message\n/// since oneof in AnyValue does not allow repeated fields.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ArrayValue {\n    /// Array of values. 
The array may be empty (contain 0 elements).\n    #[prost(message, repeated, tag = \"1\")]\n    pub values: ::prost::alloc::vec::Vec<AnyValue>,\n}\n/// KeyValueList is a list of KeyValue messages. We need KeyValueList as a message\n/// since `oneof` in AnyValue does not allow repeated fields. Everywhere else where we need\n/// a list of KeyValue messages (e.g. in Span) we use `repeated KeyValue` directly to\n/// avoid unnecessary extra wrapping (which slows down the protocol). The 2 approaches\n/// are semantically equivalent.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValueList {\n    /// A collection of key/value pairs. The list may be empty (may\n    /// contain 0 elements).\n    /// The keys MUST be unique (it is not allowed to have more than one\n    /// value with the same key).\n    #[prost(message, repeated, tag = \"1\")]\n    pub values: ::prost::alloc::vec::Vec<KeyValue>,\n}\n/// KeyValue is a key-value pair that is used to store Span attributes, Link\n/// attributes, etc.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct KeyValue {\n    #[prost(string, tag = \"1\")]\n    pub key: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"2\")]\n    pub value: ::core::option::Option<AnyValue>,\n}\n/// InstrumentationScope is a message representing the instrumentation scope information\n/// such as the fully qualified name and version.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct InstrumentationScope {\n    /// An empty instrumentation scope name means the name is unknown.\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub version: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub attributes: ::prost::alloc::vec::Vec<KeyValue>,\n 
   #[prost(uint32, tag = \"4\")]\n    pub dropped_attributes_count: u32,\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.logs.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// LogsData represents the logs data that can be stored in a persistent storage,\n/// OR can be embedded by other protocols that transfer OTLP logs data but do not\n/// implement the OTLP protocol.\n///\n/// The main difference between this message and collector protocol is that\n/// in this message there will not be any \"control\" or \"metadata\" specific to\n/// OTLP protocol.\n///\n/// When new fields are added into this message, the OTLP request MUST be updated\n/// as well.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LogsData {\n    /// An array of ResourceLogs.\n    /// For data coming from a single resource this array will typically contain\n    /// one element. Intermediary nodes that receive data from multiple origins\n    /// typically batch the data before forwarding further and in that case this\n    /// array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_logs: ::prost::alloc::vec::Vec<ResourceLogs>,\n}\n/// A collection of ScopeLogs from a Resource.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ResourceLogs {\n    /// The resource for the logs in this message.\n    /// If this field is not set then resource info is unknown.\n    #[prost(message, optional, tag = \"1\")]\n    pub resource: ::core::option::Option<super::super::resource::v1::Resource>,\n    /// A list of ScopeLogs that originate from a resource.\n    #[prost(message, repeated, tag = \"2\")]\n    pub scope_logs: ::prost::alloc::vec::Vec<ScopeLogs>,\n    /// This schema_url applies to the data in the \"resource\" field. 
It does not apply\n    /// to the data in the \"scope_logs\" field which have their own schema_url field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A collection of Logs produced by a Scope.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ScopeLogs {\n    /// The instrumentation scope information for the logs in this message.\n    /// Semantically when InstrumentationScope isn't set, it is equivalent with\n    /// an empty instrumentation scope name (unknown).\n    #[prost(message, optional, tag = \"1\")]\n    pub scope: ::core::option::Option<super::super::common::v1::InstrumentationScope>,\n    /// A list of log records.\n    #[prost(message, repeated, tag = \"2\")]\n    pub log_records: ::prost::alloc::vec::Vec<LogRecord>,\n    /// This schema_url applies to all logs in the \"logs\" field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A log record according to OpenTelemetry Log Data Model:\n/// <https://github.com/open-telemetry/oteps/blob/main/text/logs/0097-log-data-model.md>\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LogRecord {\n    /// time_unix_nano is the time when the event occurred.\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n    /// Value of 0 indicates unknown or missing timestamp.\n    #[prost(fixed64, tag = \"1\")]\n    pub time_unix_nano: u64,\n    /// Time when the event was observed by the collection system.\n    /// For events that originate in OpenTelemetry (e.g. using OpenTelemetry Logging SDK)\n    /// this timestamp is typically set at the generation time and is equal to Timestamp.\n    /// For events originating externally and collected by OpenTelemetry (e.g. 
using\n    /// Collector) this is the time when OpenTelemetry's code observed the event measured\n    /// by the clock of the OpenTelemetry code. This field MUST be set once the event is\n    /// observed by OpenTelemetry.\n    ///\n    /// For converting OpenTelemetry log data to formats that support only one timestamp or\n    /// when receiving OpenTelemetry log data by recipients that support only one timestamp\n    /// internally the following logic is recommended:\n    ///\n    /// * Use time_unix_nano if it is present, otherwise use observed_time_unix_nano.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n    /// Value of 0 indicates unknown or missing timestamp.\n    #[prost(fixed64, tag = \"11\")]\n    pub observed_time_unix_nano: u64,\n    /// Numerical value of the severity, normalized to values described in Log Data Model.\n    /// \\[Optional\\].\n    #[prost(enumeration = \"SeverityNumber\", tag = \"2\")]\n    pub severity_number: i32,\n    /// The severity text (also known as log level). The original string representation as\n    /// it is known at the source. \\[Optional\\].\n    #[prost(string, tag = \"3\")]\n    pub severity_text: ::prost::alloc::string::String,\n    /// A value containing the body of the log record. Can be for example a human-readable\n    /// string message (including multi-line) describing the event in a free form or it can\n    /// be a structured data composed of arrays and maps of other values. \\[Optional\\].\n    #[prost(message, optional, tag = \"5\")]\n    pub body: ::core::option::Option<super::super::common::v1::AnyValue>,\n    /// Additional attributes that describe the specific event occurrence. 
\\[Optional\\].\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"6\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    #[prost(uint32, tag = \"7\")]\n    pub dropped_attributes_count: u32,\n    /// Flags, a bit field. 8 least significant bits are the trace flags as\n    /// defined in W3C Trace Context specification. 24 most significant bits are reserved\n    /// and must be set to 0. Readers must not assume that 24 most significant bits\n    /// will be zero and must correctly mask the bits when reading 8-bit trace flag (use\n    /// flags & TRACE_FLAGS_MASK). \\[Optional\\].\n    #[prost(fixed32, tag = \"8\")]\n    pub flags: u32,\n    /// A unique identifier for a trace. All logs from the same trace share\n    /// the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes\n    /// is considered invalid. Can be set for logs that are part of request processing\n    /// and have an assigned trace id. \\[Optional\\].\n    #[prost(bytes = \"vec\", tag = \"9\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    /// A unique identifier for a span within a trace, assigned when the span\n    /// is created. The ID is an 8-byte array. An ID with all zeroes is considered\n    /// invalid. Can be set for logs that are part of a particular processing span.\n    /// If span_id is present trace_id SHOULD be also present. 
\\[Optional\\].\n    #[prost(bytes = \"vec\", tag = \"10\")]\n    pub span_id: ::prost::alloc::vec::Vec<u8>,\n}\n/// Possible values for LogRecord.SeverityNumber.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum SeverityNumber {\n    /// UNSPECIFIED is the default SeverityNumber, it MUST NOT be used.\n    Unspecified = 0,\n    Trace = 1,\n    Trace2 = 2,\n    Trace3 = 3,\n    Trace4 = 4,\n    Debug = 5,\n    Debug2 = 6,\n    Debug3 = 7,\n    Debug4 = 8,\n    Info = 9,\n    Info2 = 10,\n    Info3 = 11,\n    Info4 = 12,\n    Warn = 13,\n    Warn2 = 14,\n    Warn3 = 15,\n    Warn4 = 16,\n    Error = 17,\n    Error2 = 18,\n    Error3 = 19,\n    Error4 = 20,\n    Fatal = 21,\n    Fatal2 = 22,\n    Fatal3 = 23,\n    Fatal4 = 24,\n}\nimpl SeverityNumber {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"SEVERITY_NUMBER_UNSPECIFIED\",\n            Self::Trace => \"SEVERITY_NUMBER_TRACE\",\n            Self::Trace2 => \"SEVERITY_NUMBER_TRACE2\",\n            Self::Trace3 => \"SEVERITY_NUMBER_TRACE3\",\n            Self::Trace4 => \"SEVERITY_NUMBER_TRACE4\",\n            Self::Debug => \"SEVERITY_NUMBER_DEBUG\",\n            Self::Debug2 => \"SEVERITY_NUMBER_DEBUG2\",\n            Self::Debug3 => \"SEVERITY_NUMBER_DEBUG3\",\n            Self::Debug4 => \"SEVERITY_NUMBER_DEBUG4\",\n            Self::Info => \"SEVERITY_NUMBER_INFO\",\n            Self::Info2 => \"SEVERITY_NUMBER_INFO2\",\n            Self::Info3 => \"SEVERITY_NUMBER_INFO3\",\n            Self::Info4 => \"SEVERITY_NUMBER_INFO4\",\n            Self::Warn => \"SEVERITY_NUMBER_WARN\",\n         
   Self::Warn2 => \"SEVERITY_NUMBER_WARN2\",\n            Self::Warn3 => \"SEVERITY_NUMBER_WARN3\",\n            Self::Warn4 => \"SEVERITY_NUMBER_WARN4\",\n            Self::Error => \"SEVERITY_NUMBER_ERROR\",\n            Self::Error2 => \"SEVERITY_NUMBER_ERROR2\",\n            Self::Error3 => \"SEVERITY_NUMBER_ERROR3\",\n            Self::Error4 => \"SEVERITY_NUMBER_ERROR4\",\n            Self::Fatal => \"SEVERITY_NUMBER_FATAL\",\n            Self::Fatal2 => \"SEVERITY_NUMBER_FATAL2\",\n            Self::Fatal3 => \"SEVERITY_NUMBER_FATAL3\",\n            Self::Fatal4 => \"SEVERITY_NUMBER_FATAL4\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"SEVERITY_NUMBER_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"SEVERITY_NUMBER_TRACE\" => Some(Self::Trace),\n            \"SEVERITY_NUMBER_TRACE2\" => Some(Self::Trace2),\n            \"SEVERITY_NUMBER_TRACE3\" => Some(Self::Trace3),\n            \"SEVERITY_NUMBER_TRACE4\" => Some(Self::Trace4),\n            \"SEVERITY_NUMBER_DEBUG\" => Some(Self::Debug),\n            \"SEVERITY_NUMBER_DEBUG2\" => Some(Self::Debug2),\n            \"SEVERITY_NUMBER_DEBUG3\" => Some(Self::Debug3),\n            \"SEVERITY_NUMBER_DEBUG4\" => Some(Self::Debug4),\n            \"SEVERITY_NUMBER_INFO\" => Some(Self::Info),\n            \"SEVERITY_NUMBER_INFO2\" => Some(Self::Info2),\n            \"SEVERITY_NUMBER_INFO3\" => Some(Self::Info3),\n            \"SEVERITY_NUMBER_INFO4\" => Some(Self::Info4),\n            \"SEVERITY_NUMBER_WARN\" => Some(Self::Warn),\n            \"SEVERITY_NUMBER_WARN2\" => Some(Self::Warn2),\n            \"SEVERITY_NUMBER_WARN3\" => Some(Self::Warn3),\n            \"SEVERITY_NUMBER_WARN4\" => Some(Self::Warn4),\n            \"SEVERITY_NUMBER_ERROR\" => Some(Self::Error),\n            \"SEVERITY_NUMBER_ERROR2\" => Some(Self::Error2),\n            
\"SEVERITY_NUMBER_ERROR3\" => Some(Self::Error3),\n            \"SEVERITY_NUMBER_ERROR4\" => Some(Self::Error4),\n            \"SEVERITY_NUMBER_FATAL\" => Some(Self::Fatal),\n            \"SEVERITY_NUMBER_FATAL2\" => Some(Self::Fatal2),\n            \"SEVERITY_NUMBER_FATAL3\" => Some(Self::Fatal3),\n            \"SEVERITY_NUMBER_FATAL4\" => Some(Self::Fatal4),\n            _ => None,\n        }\n    }\n}\n/// Masks for LogRecord.flags field.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum LogRecordFlags {\n    LogRecordFlagUnspecified = 0,\n    LogRecordFlagTraceFlagsMask = 255,\n}\nimpl LogRecordFlags {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::LogRecordFlagUnspecified => \"LOG_RECORD_FLAG_UNSPECIFIED\",\n            Self::LogRecordFlagTraceFlagsMask => \"LOG_RECORD_FLAG_TRACE_FLAGS_MASK\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"LOG_RECORD_FLAG_UNSPECIFIED\" => Some(Self::LogRecordFlagUnspecified),\n            \"LOG_RECORD_FLAG_TRACE_FLAGS_MASK\" => Some(Self::LogRecordFlagTraceFlagsMask),\n            _ => None,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.metrics.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// MetricsData represents the metrics data that can be stored in a persistent\n/// storage, OR can be embedded by other protocols that transfer OTLP metrics\n/// data but do not implement the OTLP protocol.\n///\n/// The main difference between this message and collector protocol is that\n/// in this message there will not be any \"control\" or \"metadata\" specific to\n/// OTLP protocol.\n///\n/// When new fields are added into this message, the OTLP request MUST be updated\n/// as well.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct MetricsData {\n    /// An array of ResourceMetrics.\n    /// For data coming from a single resource this array will typically contain\n    /// one element. Intermediary nodes that receive data from multiple origins\n    /// typically batch the data before forwarding further and in that case this\n    /// array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_metrics: ::prost::alloc::vec::Vec<ResourceMetrics>,\n}\n/// A collection of ScopeMetrics from a Resource.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ResourceMetrics {\n    /// The resource for the metrics in this message.\n    /// If this field is not set then no resource info is known.\n    #[prost(message, optional, tag = \"1\")]\n    pub resource: ::core::option::Option<super::super::resource::v1::Resource>,\n    /// A list of metrics that originate from a resource.\n    #[prost(message, repeated, tag = \"2\")]\n    pub scope_metrics: ::prost::alloc::vec::Vec<ScopeMetrics>,\n    /// This schema_url applies to the data in the \"resource\" field. 
It does not apply\n    /// to the data in the \"scope_metrics\" field which have their own schema_url field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A collection of Metrics produced by an Scope.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ScopeMetrics {\n    /// The instrumentation scope information for the metrics in this message.\n    /// Semantically when InstrumentationScope isn't set, it is equivalent with\n    /// an empty instrumentation scope name (unknown).\n    #[prost(message, optional, tag = \"1\")]\n    pub scope: ::core::option::Option<super::super::common::v1::InstrumentationScope>,\n    /// A list of metrics that originate from an instrumentation library.\n    #[prost(message, repeated, tag = \"2\")]\n    pub metrics: ::prost::alloc::vec::Vec<Metric>,\n    /// This schema_url applies to all metrics in the \"metrics\" field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// Defines a Metric which has one or more timeseries.  The following is a\n/// brief summary of the Metric data model.  For more details, see:\n///\n/// <https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md>\n///\n/// The data model and relation between entities is shown in the\n/// diagram below. 
Here, \"DataPoint\" is the term used to refer to any\n/// one of the specific data point value types, and \"points\" is the term used\n/// to refer to any one of the lists of points contained in the Metric.\n///\n/// * Metric is composed of a metadata and data.\n///\n/// * Metadata part contains a name, description, unit.\n///\n/// * Data is one of the possible types (Sum, Gauge, Histogram, Summary).\n///\n/// * DataPoint contains timestamps, attributes, and one of the possible value type\n///   fields.\n///\n///   Metric\n///   +------------+\n///   \\|name        |\n///   \\|description |\n///   \\|unit        |     +------------------------------------+\n///   \\|data        |---> |Gauge, Sum, Histogram, Summary, ... |\n///   +------------+     +------------------------------------+\n///\n///   Data \\[One of Gauge, Sum, Histogram, Summary, ...\\]\n///   +-----------+\n///   \\|...        |  // Metadata about the Data.\n///   \\|points     |--+\n///   +-----------+  |\n///   \\|      +---------------------------+\n///   \\|      |DataPoint 1                |\n///   v      |+------+------+   +------+ |\n///   +-----+   ||label |label |...|label | |\n///   \\|  1  |-->||value1|value2|...|valueN| |\n///   +-----+   |+------+------+   +------+ |\n///   \\|  .  |   |+-----+                    |\n///   \\|  .  |   ||value|                    |\n///   \\|  .  |   |+-----+                    |\n///   \\|  .  |   +---------------------------+\n///   \\|  .  |                   .\n///   \\|  .  |                   .\n///   \\|  .  |                   .\n///   \\|  .  |   +---------------------------+\n///   \\|  .  
|   |DataPoint M                |\n///   +-----+   |+------+------+   +------+ |\n///   \\|  M  |-->||label |label |...|label | |\n///   +-----+   ||value1|value2|...|valueN| |\n///   \\|+------+------+   +------+ |\n///   \\|+-----+                    |\n///   \\||value|                    |\n///   \\|+-----+                    |\n///   +---------------------------+\n///\n/// Each distinct type of DataPoint represents the output of a specific\n/// aggregation function, the result of applying the DataPoint's\n/// associated function of to one or more measurements.\n///\n/// All DataPoint types have three common fields:\n///\n/// * Attributes includes key-value pairs associated with the data point\n/// * TimeUnixNano is required, set to the end time of the aggregation\n/// * StartTimeUnixNano is optional, but strongly encouraged for DataPoints\n///   having an AggregationTemporality field, as discussed below.\n///\n/// Both TimeUnixNano and StartTimeUnixNano values are expressed as\n/// UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n///\n/// # TimeUnixNano\n///\n/// This field is required, having consistent interpretation across\n/// DataPoint types.  TimeUnixNano is the moment corresponding to when\n/// the data point's aggregate value was captured.\n///\n/// Data points with the 0 value for TimeUnixNano SHOULD be rejected\n/// by consumers.\n///\n/// # StartTimeUnixNano\n///\n/// StartTimeUnixNano in general allows detecting when a sequence of\n/// observations is unbroken.  This field indicates to consumers the\n/// start time for points with cumulative and delta\n/// AggregationTemporality, and it should be included whenever possible\n/// to support correct rate calculation.  
Although it may be omitted\n/// when the start time is truly unknown, setting StartTimeUnixNano is\n/// strongly encouraged.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Metric {\n    /// name of the metric, including its DNS name prefix. It must be unique.\n    #[prost(string, tag = \"1\")]\n    pub name: ::prost::alloc::string::String,\n    /// description of the metric, which can be used in documentation.\n    #[prost(string, tag = \"2\")]\n    pub description: ::prost::alloc::string::String,\n    /// unit in which the metric value is reported. Follows the format\n    /// described by <http://unitsofmeasure.org/ucum.html.>\n    #[prost(string, tag = \"3\")]\n    pub unit: ::prost::alloc::string::String,\n    /// Data determines the aggregation type (if any) of the metric, what is the\n    /// reported value type for the data points, as well as the relatationship to\n    /// the time interval over which they are reported.\n    #[prost(oneof = \"metric::Data\", tags = \"5, 7, 9, 10, 11\")]\n    pub data: ::core::option::Option<metric::Data>,\n}\n/// Nested message and enum types in `Metric`.\npub mod metric {\n    /// Data determines the aggregation type (if any) of the metric, what is the\n    /// reported value type for the data points, as well as the relatationship to\n    /// the time interval over which they are reported.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Data {\n        #[prost(message, tag = \"5\")]\n        Gauge(super::Gauge),\n        #[prost(message, tag = \"7\")]\n        Sum(super::Sum),\n        #[prost(message, tag = \"9\")]\n        Histogram(super::Histogram),\n        #[prost(message, tag = \"10\")]\n        ExponentialHistogram(super::ExponentialHistogram),\n        #[prost(message, tag = \"11\")]\n        Summary(super::Summary),\n    }\n}\n/// Gauge represents the type of a scalar metric that always 
exports the\n/// \"current value\" for every data point. It should be used for an \"unknown\"\n/// aggregation.\n///\n/// A Gauge does not support different aggregation temporalities. Given the\n/// aggregation is unknown, points cannot be combined using the same\n/// aggregation, regardless of aggregation temporalities. Therefore,\n/// AggregationTemporality is not included. Consequently, this also means\n/// \"StartTimeUnixNano\" is ignored for all data points.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Gauge {\n    #[prost(message, repeated, tag = \"1\")]\n    pub data_points: ::prost::alloc::vec::Vec<NumberDataPoint>,\n}\n/// Sum represents the type of a scalar metric that is calculated as a sum of all\n/// reported measurements over a time interval.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Sum {\n    #[prost(message, repeated, tag = \"1\")]\n    pub data_points: ::prost::alloc::vec::Vec<NumberDataPoint>,\n    /// aggregation_temporality describes if the aggregator reports delta changes\n    /// since last report time, or cumulative changes since a fixed start time.\n    #[prost(enumeration = \"AggregationTemporality\", tag = \"2\")]\n    pub aggregation_temporality: i32,\n    /// If \"true\" means that the sum is monotonic.\n    #[prost(bool, tag = \"3\")]\n    pub is_monotonic: bool,\n}\n/// Histogram represents the type of a metric that is calculated by aggregating\n/// as a Histogram of all reported measurements over a time interval.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Histogram {\n    #[prost(message, repeated, tag = \"1\")]\n    pub data_points: ::prost::alloc::vec::Vec<HistogramDataPoint>,\n    /// aggregation_temporality describes if the aggregator reports delta changes\n    /// since last report time, or cumulative changes since a fixed start time.\n   
 #[prost(enumeration = \"AggregationTemporality\", tag = \"2\")]\n    pub aggregation_temporality: i32,\n}\n/// ExponentialHistogram represents the type of a metric that is calculated by aggregating\n/// as a ExponentialHistogram of all reported double measurements over a time interval.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ExponentialHistogram {\n    #[prost(message, repeated, tag = \"1\")]\n    pub data_points: ::prost::alloc::vec::Vec<ExponentialHistogramDataPoint>,\n    /// aggregation_temporality describes if the aggregator reports delta changes\n    /// since last report time, or cumulative changes since a fixed start time.\n    #[prost(enumeration = \"AggregationTemporality\", tag = \"2\")]\n    pub aggregation_temporality: i32,\n}\n/// Summary metric data are used to convey quantile summaries,\n/// a Prometheus (see: <https://prometheus.io/docs/concepts/metric_types/#summary>)\n/// and OpenMetrics (see: <https://github.com/OpenObservability/OpenMetrics/blob/4dbf6075567ab43296eed941037c12951faafb92/protos/prometheus.proto#L45>)\n/// data type. These data points cannot always be merged in a meaningful way.\n/// While they can be useful in some applications, histogram data points are\n/// recommended for new applications.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Summary {\n    #[prost(message, repeated, tag = \"1\")]\n    pub data_points: ::prost::alloc::vec::Vec<SummaryDataPoint>,\n}\n/// NumberDataPoint is a single data point in a timeseries that describes the\n/// time-varying scalar value of a metric.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct NumberDataPoint {\n    /// The set of key/value pairs that uniquely identify the timeseries from\n    /// where this point belongs. 
The list may be empty (may contain 0 elements).\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"7\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// StartTimeUnixNano is optional but strongly encouraged, see the\n    /// the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"2\")]\n    pub start_time_unix_nano: u64,\n    /// TimeUnixNano is required, see the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"3\")]\n    pub time_unix_nano: u64,\n    /// (Optional) List of exemplars collected from\n    /// measurements that were used to form the data point\n    #[prost(message, repeated, tag = \"5\")]\n    pub exemplars: ::prost::alloc::vec::Vec<Exemplar>,\n    /// Flags that apply to this specific data point.  See DataPointFlags\n    /// for the available flags and their meaning.\n    #[prost(uint32, tag = \"8\")]\n    pub flags: u32,\n    /// The value itself.  A point is considered invalid when one of the recognized\n    /// value fields is not present inside this oneof.\n    #[prost(oneof = \"number_data_point::Value\", tags = \"4, 6\")]\n    pub value: ::core::option::Option<number_data_point::Value>,\n}\n/// Nested message and enum types in `NumberDataPoint`.\npub mod number_data_point {\n    /// The value itself.  
A point is considered invalid when one of the recognized\n    /// value fields is not present inside this oneof.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, Copy, PartialEq, ::prost::Oneof)]\n    pub enum Value {\n        #[prost(double, tag = \"4\")]\n        AsDouble(f64),\n        #[prost(sfixed64, tag = \"6\")]\n        AsInt(i64),\n    }\n}\n/// HistogramDataPoint is a single data point in a timeseries that describes the\n/// time-varying values of a Histogram. A Histogram contains summary statistics\n/// for a population of values, it may optionally contain the distribution of\n/// those values across a set of buckets.\n///\n/// If the histogram contains the distribution of values, then both\n/// \"explicit_bounds\" and \"bucket counts\" fields must be defined.\n/// If the histogram does not contain the distribution of values, then both\n/// \"explicit_bounds\" and \"bucket_counts\" must be omitted and only \"count\" and\n/// \"sum\" are known.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct HistogramDataPoint {\n    /// The set of key/value pairs that uniquely identify the timeseries from\n    /// where this point belongs. 
The list may be empty (may contain 0 elements).\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"9\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// StartTimeUnixNano is optional but strongly encouraged, see the\n    /// the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"2\")]\n    pub start_time_unix_nano: u64,\n    /// TimeUnixNano is required, see the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"3\")]\n    pub time_unix_nano: u64,\n    /// count is the number of values in the population. Must be non-negative. This\n    /// value must be equal to the sum of the \"count\" fields in buckets if a\n    /// histogram is provided.\n    #[prost(fixed64, tag = \"4\")]\n    pub count: u64,\n    /// sum of the values in the population. If count is zero then this field\n    /// must be zero.\n    ///\n    /// Note: Sum should only be filled out when measuring non-negative discrete\n    /// events, and is assumed to be monotonic over the values of these events.\n    /// Negative events *can* be recorded, but sum should not be filled out when\n    /// doing so.  
This is specifically to enforce compatibility w/ OpenMetrics,\n    /// see: <https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#histogram>\n    #[prost(double, optional, tag = \"5\")]\n    pub sum: ::core::option::Option<f64>,\n    /// bucket_counts is an optional field contains the count values of histogram\n    /// for each bucket.\n    ///\n    /// The sum of the bucket_counts must equal the value in the count field.\n    ///\n    /// The number of elements in bucket_counts array must be by one greater than\n    /// the number of elements in explicit_bounds array.\n    #[prost(fixed64, repeated, tag = \"6\")]\n    pub bucket_counts: ::prost::alloc::vec::Vec<u64>,\n    /// explicit_bounds specifies buckets with explicitly defined bounds for values.\n    ///\n    /// The boundaries for bucket at index i are:\n    ///\n    /// (-infinity, explicit_bounds\\[i\\]\\] for i == 0\n    /// (explicit_bounds\\[i-1\\], explicit_bounds\\[i\\]\\] for 0 \\< i \\< size(explicit_bounds)\n    /// (explicit_bounds\\[i-1\\], +infinity) for i == size(explicit_bounds)\n    ///\n    /// The values in the explicit_bounds array must be strictly increasing.\n    ///\n    /// Histogram buckets are inclusive of their upper boundary, except the last\n    /// bucket where the boundary is at infinity. This format is intentionally\n    /// compatible with the OpenMetrics histogram definition.\n    #[prost(double, repeated, tag = \"7\")]\n    pub explicit_bounds: ::prost::alloc::vec::Vec<f64>,\n    /// (Optional) List of exemplars collected from\n    /// measurements that were used to form the data point\n    #[prost(message, repeated, tag = \"8\")]\n    pub exemplars: ::prost::alloc::vec::Vec<Exemplar>,\n    /// Flags that apply to this specific data point.  
See DataPointFlags\n    /// for the available flags and their meaning.\n    #[prost(uint32, tag = \"10\")]\n    pub flags: u32,\n    /// min is the minimum value over (start_time, end_time\\].\n    #[prost(double, optional, tag = \"11\")]\n    pub min: ::core::option::Option<f64>,\n    /// max is the maximum value over (start_time, end_time\\].\n    #[prost(double, optional, tag = \"12\")]\n    pub max: ::core::option::Option<f64>,\n}\n/// ExponentialHistogramDataPoint is a single data point in a timeseries that describes the\n/// time-varying values of a ExponentialHistogram of double values. A ExponentialHistogram contains\n/// summary statistics for a population of values, it may optionally contain the\n/// distribution of those values across a set of buckets.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ExponentialHistogramDataPoint {\n    /// The set of key/value pairs that uniquely identify the timeseries from\n    /// where this point belongs. The list may be empty (may contain 0 elements).\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"1\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// StartTimeUnixNano is optional but strongly encouraged, see the\n    /// the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"2\")]\n    pub start_time_unix_nano: u64,\n    /// TimeUnixNano is required, see the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"3\")]\n    pub time_unix_nano: u64,\n    /// count is the number of values in the population. Must be\n    /// non-negative. 
This value must be equal to the sum of the \"bucket_counts\"\n    /// values in the positive and negative Buckets plus the \"zero_count\" field.\n    #[prost(fixed64, tag = \"4\")]\n    pub count: u64,\n    /// sum of the values in the population. If count is zero then this field\n    /// must be zero.\n    ///\n    /// Note: Sum should only be filled out when measuring non-negative discrete\n    /// events, and is assumed to be monotonic over the values of these events.\n    /// Negative events *can* be recorded, but sum should not be filled out when\n    /// doing so.  This is specifically to enforce compatibility w/ OpenMetrics,\n    /// see: <https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#histogram>\n    #[prost(double, optional, tag = \"5\")]\n    pub sum: ::core::option::Option<f64>,\n    /// scale describes the resolution of the histogram.  Boundaries are\n    /// located at powers of the base, where:\n    ///\n    /// base = (2^(2^-scale))\n    ///\n    /// The histogram bucket identified by `index`, a signed integer,\n    /// contains values that are greater than (base^index) and\n    /// less than or equal to (base^(index+1)).\n    ///\n    /// The positive and negative ranges of the histogram are expressed\n    /// separately.  Negative values are mapped by their absolute value\n    /// into the negative range using the same scale as the positive range.\n    ///\n    /// scale is not restricted by the protocol, as the permissible\n    /// values depend on the range of the data.\n    #[prost(sint32, tag = \"6\")]\n    pub scale: i32,\n    /// zero_count is the count of values that are either exactly zero or\n    /// within the region considered zero by the instrumentation at the\n    /// tolerated degree of precision.  
This bucket stores values that\n    /// cannot be expressed using the standard exponential formula as\n    /// well as values that have been rounded to zero.\n    ///\n    /// Implementations MAY consider the zero bucket to have probability\n    /// mass equal to (zero_count / count).\n    #[prost(fixed64, tag = \"7\")]\n    pub zero_count: u64,\n    /// positive carries the positive range of exponential bucket counts.\n    #[prost(message, optional, tag = \"8\")]\n    pub positive: ::core::option::Option<exponential_histogram_data_point::Buckets>,\n    /// negative carries the negative range of exponential bucket counts.\n    #[prost(message, optional, tag = \"9\")]\n    pub negative: ::core::option::Option<exponential_histogram_data_point::Buckets>,\n    /// Flags that apply to this specific data point.  See DataPointFlags\n    /// for the available flags and their meaning.\n    #[prost(uint32, tag = \"10\")]\n    pub flags: u32,\n    /// (Optional) List of exemplars collected from\n    /// measurements that were used to form the data point\n    #[prost(message, repeated, tag = \"11\")]\n    pub exemplars: ::prost::alloc::vec::Vec<Exemplar>,\n    /// min is the minimum value over (start_time, end_time\\].\n    #[prost(double, optional, tag = \"12\")]\n    pub min: ::core::option::Option<f64>,\n    /// max is the maximum value over (start_time, end_time\\].\n    #[prost(double, optional, tag = \"13\")]\n    pub max: ::core::option::Option<f64>,\n}\n/// Nested message and enum types in `ExponentialHistogramDataPoint`.\npub mod exponential_histogram_data_point {\n    /// Buckets are a set of bucket counts, encoded in a contiguous array\n    /// of counts.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\n    pub struct Buckets {\n        /// Offset is the bucket index of the first entry in the bucket_counts array.\n        ///\n        /// Note: This uses a varint encoding as a simple form of 
compression.\n        #[prost(sint32, tag = \"1\")]\n        pub offset: i32,\n        /// Count is an array of counts, where count\\[i\\] carries the count\n        /// of the bucket at index (offset+i).  count\\[i\\] is the count of\n        /// values greater than base^(offset+i) and less or equal to than\n        /// base^(offset+i+1).\n        ///\n        /// Note: By contrast, the explicit HistogramDataPoint uses\n        /// fixed64.  This field is expected to have many buckets,\n        /// especially zeros, so uint64 has been selected to ensure\n        /// varint encoding.\n        #[prost(uint64, repeated, tag = \"2\")]\n        pub bucket_counts: ::prost::alloc::vec::Vec<u64>,\n    }\n}\n/// SummaryDataPoint is a single data point in a timeseries that describes the\n/// time-varying values of a Summary metric.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct SummaryDataPoint {\n    /// The set of key/value pairs that uniquely identify the timeseries from\n    /// where this point belongs. The list may be empty (may contain 0 elements).\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"7\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// StartTimeUnixNano is optional but strongly encouraged, see the\n    /// the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"2\")]\n    pub start_time_unix_nano: u64,\n    /// TimeUnixNano is required, see the detailed comments above Metric.\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"3\")]\n    pub time_unix_nano: u64,\n    /// count is the number of values in the population. 
Must be non-negative.\n    #[prost(fixed64, tag = \"4\")]\n    pub count: u64,\n    /// sum of the values in the population. If count is zero then this field\n    /// must be zero.\n    ///\n    /// Note: Sum should only be filled out when measuring non-negative discrete\n    /// events, and is assumed to be monotonic over the values of these events.\n    /// Negative events *can* be recorded, but sum should not be filled out when\n    /// doing so.  This is specifically to enforce compatibility w/ OpenMetrics,\n    /// see: <https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#summary>\n    #[prost(double, tag = \"5\")]\n    pub sum: f64,\n    /// (Optional) list of values at different quantiles of the distribution calculated\n    /// from the current snapshot. The quantiles must be strictly increasing.\n    #[prost(message, repeated, tag = \"6\")]\n    pub quantile_values: ::prost::alloc::vec::Vec<summary_data_point::ValueAtQuantile>,\n    /// Flags that apply to this specific data point.  See DataPointFlags\n    /// for the available flags and their meaning.\n    #[prost(uint32, tag = \"8\")]\n    pub flags: u32,\n}\n/// Nested message and enum types in `SummaryDataPoint`.\npub mod summary_data_point {\n    /// Represents the value at a given quantile of a distribution.\n    ///\n    /// To record Min and Max values following conventions are used:\n    ///\n    /// * The 1.0 quantile is equivalent to the maximum value observed.\n    /// * The 0.0 quantile is equivalent to the minimum value observed.\n    ///\n    /// See the following issue for more context:\n    /// <https://github.com/open-telemetry/opentelemetry-proto/issues/125>\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, Copy, PartialEq, ::prost::Message)]\n    pub struct ValueAtQuantile {\n        /// The quantile of a distribution. 
Must be in the interval\n        /// \\[0.0, 1.0\\].\n        #[prost(double, tag = \"1\")]\n        pub quantile: f64,\n        /// The value at the given quantile of a distribution.\n        ///\n        /// Quantile values must NOT be negative.\n        #[prost(double, tag = \"2\")]\n        pub value: f64,\n    }\n}\n/// A representation of an exemplar, which is a sample input measurement.\n/// Exemplars also hold information about the environment when the measurement\n/// was recorded, for example the span and trace ID of the active span when the\n/// exemplar was recorded.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Exemplar {\n    /// The set of key/value pairs that were filtered out by the aggregator, but\n    /// recorded alongside the original measurement. Only key/value pairs that were\n    /// filtered out by the aggregator should be included\n    #[prost(message, repeated, tag = \"7\")]\n    pub filtered_attributes: ::prost::alloc::vec::Vec<\n        super::super::common::v1::KeyValue,\n    >,\n    /// time_unix_nano is the exact time when this exemplar was recorded\n    ///\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January\n    /// 1970.\n    #[prost(fixed64, tag = \"2\")]\n    pub time_unix_nano: u64,\n    /// (Optional) Span ID of the exemplar trace.\n    /// span_id may be missing if the measurement is not recorded inside a trace\n    /// or if the trace is not sampled.\n    #[prost(bytes = \"vec\", tag = \"4\")]\n    pub span_id: ::prost::alloc::vec::Vec<u8>,\n    /// (Optional) Trace ID of the exemplar trace.\n    /// trace_id may be missing if the measurement is not recorded inside a trace\n    /// or if the trace is not sampled.\n    #[prost(bytes = \"vec\", tag = \"5\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    /// The value of the measurement that was recorded. 
An exemplar is\n    /// considered invalid when one of the recognized value fields is not present\n    /// inside this oneof.\n    #[prost(oneof = \"exemplar::Value\", tags = \"3, 6\")]\n    pub value: ::core::option::Option<exemplar::Value>,\n}\n/// Nested message and enum types in `Exemplar`.\npub mod exemplar {\n    /// The value of the measurement that was recorded. An exemplar is\n    /// considered invalid when one of the recognized value fields is not present\n    /// inside this oneof.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, Copy, PartialEq, ::prost::Oneof)]\n    pub enum Value {\n        #[prost(double, tag = \"3\")]\n        AsDouble(f64),\n        #[prost(sfixed64, tag = \"6\")]\n        AsInt(i64),\n    }\n}\n/// AggregationTemporality defines how a metric aggregator reports aggregated\n/// values. It describes how those values relate to the time interval over\n/// which they are aggregated.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum AggregationTemporality {\n    /// UNSPECIFIED is the default AggregationTemporality, it MUST not be used.\n    Unspecified = 0,\n    /// DELTA is an AggregationTemporality for a metric aggregator which reports\n    /// changes since last report time. Successive metrics contain aggregation of\n    /// values from continuous and non-overlapping intervals.\n    ///\n    /// The values for a DELTA metric are based only on the time interval\n    /// associated with one measurement cycle. There is no dependency on\n    /// previous measurements like is the case for CUMULATIVE metrics.\n    ///\n    /// For example, consider a system measuring the number of requests that\n    /// it receives and reports the sum of these requests every second as a\n    /// DELTA metric:\n    ///\n    /// 1. The system starts receiving at time=t_0.\n    /// 1. 
A request is received, the system measures 1 request.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. The 1 second collection cycle ends. A metric is exported for the\n    ///    number of requests received over the interval of time t_0 to\n    ///    t_0+1 with a value of 3.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. The 1 second collection cycle ends. A metric is exported for the\n    ///    number of requests received over the interval of time t_0+1 to\n    ///    t_0+2 with a value of 2.\n    Delta = 1,\n    /// CUMULATIVE is an AggregationTemporality for a metric aggregator which\n    /// reports changes since a fixed start time. This means that current values\n    /// of a CUMULATIVE metric depend on all previous measurements since the\n    /// start time. Because of this, the sender is required to retain this state\n    /// in some form. If this state is lost or invalidated, the CUMULATIVE metric\n    /// values MUST be reset and a new fixed start time following the last\n    /// reported measurement time sent MUST be used.\n    ///\n    /// For example, consider a system measuring the number of requests that\n    /// it receives and reports the sum of these requests every second as a\n    /// CUMULATIVE metric:\n    ///\n    /// 1. The system starts receiving at time=t_0.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. The 1 second collection cycle ends. A metric is exported for the\n    ///    number of requests received over the interval of time t_0 to\n    ///    t_0+1 with a value of 3.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. 
A request is received, the system measures 1 request.\n    /// 1. The 1 second collection cycle ends. A metric is exported for the\n    ///    number of requests received over the interval of time t_0 to\n    ///    t_0+2 with a value of 5.\n    /// 1. The system experiences a fault and loses state.\n    /// 1. The system recovers and resumes receiving at time=t_1.\n    /// 1. A request is received, the system measures 1 request.\n    /// 1. The 1 second collection cycle ends. A metric is exported for the\n    ///    number of requests received over the interval of time t_1 to\n    ///    t_1+1 with a value of 1.\n    ///\n    /// Note: Even though, when reporting changes since last report time, using\n    /// CUMULATIVE is valid, it is not recommended. This may cause problems for\n    /// systems that do not use start_time to determine when the aggregation\n    /// value was reset (e.g. Prometheus).\n    Cumulative = 2,\n}\nimpl AggregationTemporality {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"AGGREGATION_TEMPORALITY_UNSPECIFIED\",\n            Self::Delta => \"AGGREGATION_TEMPORALITY_DELTA\",\n            Self::Cumulative => \"AGGREGATION_TEMPORALITY_CUMULATIVE\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"AGGREGATION_TEMPORALITY_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"AGGREGATION_TEMPORALITY_DELTA\" => Some(Self::Delta),\n            \"AGGREGATION_TEMPORALITY_CUMULATIVE\" => Some(Self::Cumulative),\n            _ => None,\n        }\n    }\n}\n/// DataPointFlags is defined 
as a protobuf 'uint32' type and is to be used as a\n/// bit-field representing 32 distinct boolean flags.  Each flag defined in this\n/// enum is a bit-mask.  To test the presence of a single flag in the flags of\n/// a data point, for example, use an expression like:\n///\n/// (point.flags & FLAG_NO_RECORDED_VALUE) == FLAG_NO_RECORDED_VALUE\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum DataPointFlags {\n    FlagNone = 0,\n    /// This DataPoint is valid but has no recorded value.  This value\n    /// SHOULD be used to reflect explicitly missing data in a series, as\n    /// for an equivalent to the Prometheus \"staleness marker\".\n    FlagNoRecordedValue = 1,\n}\nimpl DataPointFlags {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::FlagNone => \"FLAG_NONE\",\n            Self::FlagNoRecordedValue => \"FLAG_NO_RECORDED_VALUE\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"FLAG_NONE\" => Some(Self::FlagNone),\n            \"FLAG_NO_RECORDED_VALUE\" => Some(Self::FlagNoRecordedValue),\n            _ => None,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.resource.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// Resource information.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Resource {\n    /// Set of attributes that describe the resource.\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"1\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// dropped_attributes_count is the number of dropped attributes. If the value is 0, then\n    /// no attributes were dropped.\n    #[prost(uint32, tag = \"2\")]\n    pub dropped_attributes_count: u32,\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/opentelemetry/opentelemetry.proto.trace.v1.rs",
    "content": "// This file is @generated by prost-build.\n/// TracesData represents the traces data that can be stored in a persistent storage,\n/// OR can be embedded by other protocols that transfer OTLP traces data but do\n/// not implement the OTLP protocol.\n///\n/// The main difference between this message and collector protocol is that\n/// in this message there will not be any \"control\" or \"metadata\" specific to\n/// OTLP protocol.\n///\n/// When new fields are added into this message, the OTLP request MUST be updated\n/// as well.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct TracesData {\n    /// An array of ResourceSpans.\n    /// For data coming from a single resource this array will typically contain\n    /// one element. Intermediary nodes that receive data from multiple origins\n    /// typically batch the data before forwarding further and in that case this\n    /// array will contain multiple elements.\n    #[prost(message, repeated, tag = \"1\")]\n    pub resource_spans: ::prost::alloc::vec::Vec<ResourceSpans>,\n}\n/// A collection of ScopeSpans from a Resource.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ResourceSpans {\n    /// The resource for the spans in this message.\n    /// If this field is not set then no resource info is known.\n    #[prost(message, optional, tag = \"1\")]\n    pub resource: ::core::option::Option<super::super::resource::v1::Resource>,\n    /// A list of ScopeSpans that originate from a resource.\n    #[prost(message, repeated, tag = \"2\")]\n    pub scope_spans: ::prost::alloc::vec::Vec<ScopeSpans>,\n    /// This schema_url applies to the data in the \"resource\" field. 
It does not apply\n    /// to the data in the \"scope_spans\" field which have their own schema_url field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A collection of Spans produced by an InstrumentationScope.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ScopeSpans {\n    /// The instrumentation scope information for the spans in this message.\n    /// Semantically when InstrumentationScope isn't set, it is equivalent with\n    /// an empty instrumentation scope name (unknown).\n    #[prost(message, optional, tag = \"1\")]\n    pub scope: ::core::option::Option<super::super::common::v1::InstrumentationScope>,\n    /// A list of Spans that originate from an instrumentation scope.\n    #[prost(message, repeated, tag = \"2\")]\n    pub spans: ::prost::alloc::vec::Vec<Span>,\n    /// This schema_url applies to all spans and span events in the \"spans\" field.\n    #[prost(string, tag = \"3\")]\n    pub schema_url: ::prost::alloc::string::String,\n}\n/// A Span represents a single operation performed by a single component of the system.\n///\n/// The next available field id is 17.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct Span {\n    /// A unique identifier for a trace. All spans from the same trace share\n    /// the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes\n    /// is considered invalid.\n    ///\n    /// This field is semantically required. Receiver should generate new\n    /// random trace_id if empty or invalid trace_id was received.\n    ///\n    /// This field is required.\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub trace_id: ::prost::alloc::vec::Vec<u8>,\n    /// A unique identifier for a span within a trace, assigned when the span\n    /// is created. The ID is an 8-byte array. 
An ID with all zeroes is considered\n    /// invalid.\n    ///\n    /// This field is semantically required. Receiver should generate new\n    /// random span_id if empty or invalid span_id was received.\n    ///\n    /// This field is required.\n    #[prost(bytes = \"vec\", tag = \"2\")]\n    pub span_id: ::prost::alloc::vec::Vec<u8>,\n    /// trace_state conveys information about request position in multiple distributed tracing graphs.\n    /// It is a trace_state in w3c-trace-context format: <https://www.w3.org/TR/trace-context/#tracestate-header>\n    /// See also <https://github.com/w3c/distributed-tracing> for more details about this field.\n    #[prost(string, tag = \"3\")]\n    pub trace_state: ::prost::alloc::string::String,\n    /// The `span_id` of this span's parent span. If this is a root span, then this\n    /// field must be empty. The ID is an 8-byte array.\n    #[prost(bytes = \"vec\", tag = \"4\")]\n    pub parent_span_id: ::prost::alloc::vec::Vec<u8>,\n    /// A description of the span's operation.\n    ///\n    /// For example, the name can be a qualified method name or a file name\n    /// and a line number where the operation is called. A best practice is to use\n    /// the same display name at the same call point in an application.\n    /// This makes it easier to correlate spans in different traces.\n    ///\n    /// This field is semantically required to be set to non-empty string.\n    /// Empty value is equivalent to an unknown span name.\n    ///\n    /// This field is required.\n    #[prost(string, tag = \"5\")]\n    pub name: ::prost::alloc::string::String,\n    /// Distinguishes between spans generated in a particular context. 
For example,\n    /// two spans with the same name may be distinguished using `CLIENT` (caller)\n    /// and `SERVER` (callee) to identify queueing latency associated with the span.\n    #[prost(enumeration = \"span::SpanKind\", tag = \"6\")]\n    pub kind: i32,\n    /// start_time_unix_nano is the start time of the span. On the client side, this is the time\n    /// kept by the local machine where the span execution starts. On the server side, this\n    /// is the time when the server's application handler starts running.\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n    ///\n    /// This field is semantically required and it is expected that end_time >= start_time.\n    #[prost(fixed64, tag = \"7\")]\n    pub start_time_unix_nano: u64,\n    /// end_time_unix_nano is the end time of the span. On the client side, this is the time\n    /// kept by the local machine where the span execution ends. On the server side, this\n    /// is the time when the server application handler stops running.\n    /// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.\n    ///\n    /// This field is semantically required and it is expected that end_time >= start_time.\n    #[prost(fixed64, tag = \"8\")]\n    pub end_time_unix_nano: u64,\n    /// attributes is a collection of key/value pairs. Note, global attributes\n    /// like server name can be set using the resource API. 
Examples of attributes:\n    ///\n    /// ```text\n    /// \"/http/user_agent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36\"\n    /// \"/http/server_latency\": 300\n    /// \"abc.com/myattribute\": true\n    /// \"abc.com/score\": 10.239\n    /// ```\n    ///\n    /// The OpenTelemetry API specification further restricts the allowed value types:\n    /// <https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/README.md#attribute>\n    /// Attribute keys MUST be unique (it is not allowed to have more than one\n    /// attribute with the same key).\n    #[prost(message, repeated, tag = \"9\")]\n    pub attributes: ::prost::alloc::vec::Vec<super::super::common::v1::KeyValue>,\n    /// dropped_attributes_count is the number of attributes that were discarded. Attributes\n    /// can be discarded because their keys are too long or because there are too many\n    /// attributes. If this value is 0, then no attributes were dropped.\n    #[prost(uint32, tag = \"10\")]\n    pub dropped_attributes_count: u32,\n    /// events is a collection of Event items.\n    #[prost(message, repeated, tag = \"11\")]\n    pub events: ::prost::alloc::vec::Vec<span::Event>,\n    /// dropped_events_count is the number of dropped events. If the value is 0, then no\n    /// events were dropped.\n    #[prost(uint32, tag = \"12\")]\n    pub dropped_events_count: u32,\n    /// links is a collection of Links, which are references from this span to a span\n    /// in the same or different trace.\n    #[prost(message, repeated, tag = \"13\")]\n    pub links: ::prost::alloc::vec::Vec<span::Link>,\n    /// dropped_links_count is the number of dropped links after the maximum size was\n    /// enforced. If this value is 0, then no links were dropped.\n    #[prost(uint32, tag = \"14\")]\n    pub dropped_links_count: u32,\n    /// An optional final status for this span. 
Semantically when Status isn't set, it means\n    /// span's status code is unset, i.e. assume STATUS_CODE_UNSET (code = 0).\n    #[prost(message, optional, tag = \"15\")]\n    pub status: ::core::option::Option<Status>,\n}\n/// Nested message and enum types in `Span`.\npub mod span {\n    /// Event is a time-stamped annotation of the span, consisting of user-supplied\n    /// text description and key-value pairs.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, PartialEq, ::prost::Message)]\n    pub struct Event {\n        /// time_unix_nano is the time the event occurred.\n        #[prost(fixed64, tag = \"1\")]\n        pub time_unix_nano: u64,\n        /// name of the event.\n        /// This field is semantically required to be set to non-empty string.\n        #[prost(string, tag = \"2\")]\n        pub name: ::prost::alloc::string::String,\n        /// attributes is a collection of attribute key/value pairs on the event.\n        /// Attribute keys MUST be unique (it is not allowed to have more than one\n        /// attribute with the same key).\n        #[prost(message, repeated, tag = \"3\")]\n        pub attributes: ::prost::alloc::vec::Vec<\n            super::super::super::common::v1::KeyValue,\n        >,\n        /// dropped_attributes_count is the number of dropped attributes. If the value is 0,\n        /// then no attributes were dropped.\n        #[prost(uint32, tag = \"4\")]\n        pub dropped_attributes_count: u32,\n    }\n    /// A pointer from the current span to another span in the same trace or in a\n    /// different trace. 
For example, this can be used in batching operations,\n    /// where a single batch handler processes multiple requests from different\n    /// traces or when the handler receives a request from a different project.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(Clone, PartialEq, ::prost::Message)]\n    pub struct Link {\n        /// A unique identifier of a trace that this linked span is part of. The ID is a\n        /// 16-byte array.\n        #[prost(bytes = \"vec\", tag = \"1\")]\n        pub trace_id: ::prost::alloc::vec::Vec<u8>,\n        /// A unique identifier for the linked span. The ID is an 8-byte array.\n        #[prost(bytes = \"vec\", tag = \"2\")]\n        pub span_id: ::prost::alloc::vec::Vec<u8>,\n        /// The trace_state associated with the link.\n        #[prost(string, tag = \"3\")]\n        pub trace_state: ::prost::alloc::string::String,\n        /// attributes is a collection of attribute key/value pairs on the link.\n        /// Attribute keys MUST be unique (it is not allowed to have more than one\n        /// attribute with the same key).\n        #[prost(message, repeated, tag = \"4\")]\n        pub attributes: ::prost::alloc::vec::Vec<\n            super::super::super::common::v1::KeyValue,\n        >,\n        /// dropped_attributes_count is the number of dropped attributes. If the value is 0,\n        /// then no attributes were dropped.\n        #[prost(uint32, tag = \"5\")]\n        pub dropped_attributes_count: u32,\n    }\n    /// SpanKind is the type of span. Can be used to specify additional relationships between spans\n    /// in addition to a parent/child relationship.\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[derive(\n        Clone,\n        Copy,\n        Debug,\n        PartialEq,\n        Eq,\n        Hash,\n        PartialOrd,\n        Ord,\n        ::prost::Enumeration\n    )]\n    #[repr(i32)]\n    pub enum SpanKind {\n        /// Unspecified. 
Do NOT use as default.\n        /// Implementations MAY assume SpanKind to be INTERNAL when receiving UNSPECIFIED.\n        Unspecified = 0,\n        /// Indicates that the span represents an internal operation within an application,\n        /// as opposed to an operation happening at the boundaries. Default value.\n        Internal = 1,\n        /// Indicates that the span covers server-side handling of an RPC or other\n        /// remote network request.\n        Server = 2,\n        /// Indicates that the span describes a request to some remote service.\n        Client = 3,\n        /// Indicates that the span describes a producer sending a message to a broker.\n        /// Unlike CLIENT and SERVER, there is often no direct critical path latency relationship\n        /// between producer and consumer spans. A PRODUCER span ends when the message was accepted\n        /// by the broker while the logical processing of the message might span a much longer time.\n        Producer = 4,\n        /// Indicates that the span describes consumer receiving a message from a broker.\n        /// Like the PRODUCER kind, there is often no direct critical path latency relationship\n        /// between producer and consumer spans.\n        Consumer = 5,\n    }\n    impl SpanKind {\n        /// String value of the enum field names used in the ProtoBuf definition.\n        ///\n        /// The values are not transformed in any way and thus are considered stable\n        /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n        pub fn as_str_name(&self) -> &'static str {\n            match self {\n                Self::Unspecified => \"SPAN_KIND_UNSPECIFIED\",\n                Self::Internal => \"SPAN_KIND_INTERNAL\",\n                Self::Server => \"SPAN_KIND_SERVER\",\n                Self::Client => \"SPAN_KIND_CLIENT\",\n                Self::Producer => \"SPAN_KIND_PRODUCER\",\n                Self::Consumer => \"SPAN_KIND_CONSUMER\",\n         
   }\n        }\n        /// Creates an enum from field names used in the ProtoBuf definition.\n        pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n            match value {\n                \"SPAN_KIND_UNSPECIFIED\" => Some(Self::Unspecified),\n                \"SPAN_KIND_INTERNAL\" => Some(Self::Internal),\n                \"SPAN_KIND_SERVER\" => Some(Self::Server),\n                \"SPAN_KIND_CLIENT\" => Some(Self::Client),\n                \"SPAN_KIND_PRODUCER\" => Some(Self::Producer),\n                \"SPAN_KIND_CONSUMER\" => Some(Self::Consumer),\n                _ => None,\n            }\n        }\n    }\n}\n/// The Status type defines a logical error model that is suitable for different\n/// programming environments, including REST APIs and RPC APIs.\n#[derive(serde::Serialize, serde::Deserialize)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct Status {\n    /// A developer-facing human readable error message.\n    #[prost(string, tag = \"2\")]\n    pub message: ::prost::alloc::string::String,\n    /// The status code.\n    #[prost(enumeration = \"status::StatusCode\", tag = \"3\")]\n    pub code: i32,\n}\n/// Nested message and enum types in `Status`.\npub mod status {\n    /// For the semantics of status codes see\n    /// <https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status>\n    #[derive(serde::Serialize, serde::Deserialize)]\n    #[serde(rename_all = \"snake_case\")]\n    #[derive(\n        Clone,\n        Copy,\n        Debug,\n        PartialEq,\n        Eq,\n        Hash,\n        PartialOrd,\n        Ord,\n        ::prost::Enumeration\n    )]\n    #[repr(i32)]\n    pub enum StatusCode {\n        /// The default status.\n        Unset = 0,\n        /// The Span has been validated by an Application developer or Operator to\n        /// have completed successfully.\n        Ok = 1,\n        /// The Span contains an error.\n        Error = 2,\n  
  }\n    impl StatusCode {\n        /// String value of the enum field names used in the ProtoBuf definition.\n        ///\n        /// The values are not transformed in any way and thus are considered stable\n        /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n        pub fn as_str_name(&self) -> &'static str {\n            match self {\n                Self::Unset => \"STATUS_CODE_UNSET\",\n                Self::Ok => \"STATUS_CODE_OK\",\n                Self::Error => \"STATUS_CODE_ERROR\",\n            }\n        }\n        /// Creates an enum from field names used in the ProtoBuf definition.\n        pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n            match value {\n                \"STATUS_CODE_UNSET\" => Some(Self::Unset),\n                \"STATUS_CODE_OK\" => Some(Self::Ok),\n                \"STATUS_CODE_ERROR\" => Some(Self::Error),\n                _ => None,\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.cluster.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ChitchatId {\n    #[prost(string, tag = \"1\")]\n    pub node_id: ::prost::alloc::string::String,\n    #[prost(uint64, tag = \"2\")]\n    pub generation_id: u64,\n    #[prost(string, tag = \"3\")]\n    pub gossip_advertise_addr: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct VersionedKeyValue {\n    #[prost(string, tag = \"1\")]\n    pub key: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub value: ::prost::alloc::string::String,\n    #[prost(uint64, tag = \"3\")]\n    pub version: u64,\n    #[prost(enumeration = \"DeletionStatus\", tag = \"4\")]\n    pub status: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct NodeState {\n    #[prost(message, optional, tag = \"1\")]\n    pub chitchat_id: ::core::option::Option<ChitchatId>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub key_values: ::prost::alloc::vec::Vec<VersionedKeyValue>,\n    #[prost(uint64, tag = \"3\")]\n    pub max_version: u64,\n    #[prost(uint64, tag = \"4\")]\n    pub last_gc_version: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FetchClusterStateRequest {\n    #[prost(string, tag = \"1\")]\n    pub cluster_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FetchClusterStateResponse {\n    #[prost(string, tag = \"1\")]\n    pub cluster_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"2\")]\n    pub node_states: 
::prost::alloc::vec::Vec<NodeState>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum DeletionStatus {\n    Set = 0,\n    Deleted = 1,\n    DeleteAfterTtl = 2,\n}\nimpl DeletionStatus {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Set => \"Set\",\n            Self::Deleted => \"Deleted\",\n            Self::DeleteAfterTtl => \"DeleteAfterTtl\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"Set\" => Some(Self::Set),\n            \"Deleted\" => Some(Self::Deleted),\n            \"DeleteAfterTtl\" => Some(Self::DeleteAfterTtl),\n            _ => None,\n        }\n    }\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for FetchClusterStateRequest {\n    fn rpc_name() -> &'static str {\n        \"fetch_cluster_state\"\n    }\n}\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait ClusterService: std::fmt::Debug + Send + Sync + 'static {\n    async fn fetch_cluster_state(\n        &self,\n        request: FetchClusterStateRequest,\n    ) -> crate::cluster::ClusterResult<FetchClusterStateResponse>;\n}\n#[derive(Debug, Clone)]\npub struct ClusterServiceClient {\n    inner: InnerClusterServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct 
InnerClusterServiceClient(std::sync::Arc<dyn ClusterService>);\nimpl ClusterServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: ClusterService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockClusterService > (),\n            \"`MockClusterService` must be wrapped in a `MockClusterServiceWrapper`: use `ClusterServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerClusterServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> cluster_service_grpc_server::ClusterServiceGrpcServer<\n        ClusterServiceGrpcServerAdapter,\n    > {\n        let adapter = ClusterServiceGrpcServerAdapter::new(self.clone());\n        cluster_service_grpc_server::ClusterServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = cluster_service_grpc_client::ClusterServiceGrpcClient::new(\n                channel,\n            )\n            
.max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = ClusterServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> ClusterServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = cluster_service_grpc_client::ClusterServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = ClusterServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        ClusterServiceMailbox<A>: ClusterService,\n    {\n        ClusterServiceClient::new(ClusterServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> ClusterServiceTowerLayerStack {\n        ClusterServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = 
\"testsuite\"))]\n    pub fn from_mock(mock: MockClusterService) -> Self {\n        let mock_wrapper = mock_cluster_service::MockClusterServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockClusterService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl ClusterService for ClusterServiceClient {\n    async fn fetch_cluster_state(\n        &self,\n        request: FetchClusterStateRequest,\n    ) -> crate::cluster::ClusterResult<FetchClusterStateResponse> {\n        self.inner.0.fetch_cluster_state(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_cluster_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockClusterServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockClusterService>,\n    }\n    #[async_trait::async_trait]\n    impl ClusterService for MockClusterServiceWrapper {\n        async fn fetch_cluster_state(\n            &self,\n            request: super::FetchClusterStateRequest,\n        ) -> crate::cluster::ClusterResult<super::FetchClusterStateResponse> {\n            self.inner.lock().await.fetch_cluster_state(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<FetchClusterStateRequest> for InnerClusterServiceClient {\n    type Response = FetchClusterStateResponse;\n    type Error = crate::cluster::ClusterError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: FetchClusterStateRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { 
svc.0.fetch_cluster_state(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct ClusterServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerClusterServiceClient,\n    fetch_cluster_state_svc: quickwit_common::tower::BoxService<\n        FetchClusterStateRequest,\n        FetchClusterStateResponse,\n        crate::cluster::ClusterError,\n    >,\n}\n#[async_trait::async_trait]\nimpl ClusterService for ClusterServiceTowerServiceStack {\n    async fn fetch_cluster_state(\n        &self,\n        request: FetchClusterStateRequest,\n    ) -> crate::cluster::ClusterResult<FetchClusterStateResponse> {\n        self.fetch_cluster_state_svc.clone().ready().await?.call(request).await\n    }\n}\ntype FetchClusterStateLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        FetchClusterStateRequest,\n        FetchClusterStateResponse,\n        crate::cluster::ClusterError,\n    >,\n    FetchClusterStateRequest,\n    FetchClusterStateResponse,\n    crate::cluster::ClusterError,\n>;\n#[derive(Debug, Default)]\npub struct ClusterServiceTowerLayerStack {\n    fetch_cluster_state_layers: Vec<FetchClusterStateLayer>,\n}\nimpl ClusterServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    FetchClusterStateRequest,\n                    FetchClusterStateResponse,\n                    crate::cluster::ClusterError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                FetchClusterStateRequest,\n                FetchClusterStateResponse,\n                crate::cluster::ClusterError,\n            >,\n        >>::Service: tower::Service<\n                FetchClusterStateRequest,\n                Response = 
FetchClusterStateResponse,\n                Error = crate::cluster::ClusterError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                FetchClusterStateRequest,\n                FetchClusterStateResponse,\n                crate::cluster::ClusterError,\n            >,\n        >>::Service as tower::Service<FetchClusterStateRequest>>::Future: Send + 'static,\n    {\n        self.fetch_cluster_state_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_fetch_cluster_state_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    FetchClusterStateRequest,\n                    FetchClusterStateResponse,\n                    crate::cluster::ClusterError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                FetchClusterStateRequest,\n                Response = FetchClusterStateResponse,\n                Error = crate::cluster::ClusterError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<FetchClusterStateRequest>>::Future: Send + 'static,\n    {\n        self.fetch_cluster_state_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> ClusterServiceClient\n    where\n        T: ClusterService,\n    {\n        let inner_client = InnerClusterServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> ClusterServiceClient {\n        let client = 
ClusterServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> ClusterServiceClient {\n        let client = ClusterServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> ClusterServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        ClusterServiceMailbox<A>: ClusterService,\n    {\n        let inner_client = InnerClusterServiceClient(\n            std::sync::Arc::new(ClusterServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockClusterService) -> ClusterServiceClient {\n        let client = ClusterServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerClusterServiceClient,\n    ) -> ClusterServiceClient {\n        let fetch_cluster_state_svc = self\n            .fetch_cluster_state_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| 
layer.layer(svc),\n            );\n        let tower_svc_stack = ClusterServiceTowerServiceStack {\n            inner: inner_client,\n            fetch_cluster_state_svc,\n        };\n        ClusterServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct ClusterServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::cluster::ClusterError>,\n}\nimpl<A: quickwit_actors::Actor> ClusterServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for ClusterServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for ClusterServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::cluster::ClusterError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::cluster::ClusterError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! 
This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> ClusterService for ClusterServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    ClusterServiceMailbox<\n        A,\n    >: tower::Service<\n        FetchClusterStateRequest,\n        Response = FetchClusterStateResponse,\n        Error = crate::cluster::ClusterError,\n        Future = BoxFuture<FetchClusterStateResponse, crate::cluster::ClusterError>,\n    >,\n{\n    async fn fetch_cluster_state(\n        &self,\n        request: FetchClusterStateRequest,\n    ) -> crate::cluster::ClusterResult<FetchClusterStateResponse> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct ClusterServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> ClusterServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> ClusterService\nfor ClusterServiceGrpcClientAdapter<\n    cluster_service_grpc_client::ClusterServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + 
Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn fetch_cluster_state(\n        &self,\n        request: FetchClusterStateRequest,\n    ) -> crate::cluster::ClusterResult<FetchClusterStateResponse> {\n        self.inner\n            .clone()\n            .fetch_cluster_state(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                FetchClusterStateRequest::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct ClusterServiceGrpcServerAdapter {\n    inner: InnerClusterServiceClient,\n}\nimpl ClusterServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: ClusterService,\n    {\n        Self {\n            inner: InnerClusterServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl cluster_service_grpc_server::ClusterServiceGrpc\nfor ClusterServiceGrpcServerAdapter {\n    async fn fetch_cluster_state(\n        &self,\n        request: tonic::Request<FetchClusterStateRequest>,\n    ) -> Result<tonic::Response<FetchClusterStateResponse>, tonic::Status> {\n        self.inner\n            .0\n            .fetch_cluster_state(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod cluster_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct 
ClusterServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl ClusterServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> ClusterServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> ClusterServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            ClusterServiceGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests 
with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        pub async fn fetch_cluster_state(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FetchClusterStateRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FetchClusterStateResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.cluster.ClusterService/FetchClusterState\",\n            );\n            let mut req = 
request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.cluster.ClusterService\",\n                        \"FetchClusterState\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod cluster_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with ClusterServiceGrpcServer.\n    #[async_trait]\n    pub trait ClusterServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        async fn fetch_cluster_state(\n            &self,\n            request: tonic::Request<super::FetchClusterStateRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FetchClusterStateResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct ClusterServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> ClusterServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: 
T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for ClusterServiceGrpcServer<T>\n    where\n        T: ClusterServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, 
req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.cluster.ClusterService/FetchClusterState\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FetchClusterStateSvc<T: ClusterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ClusterServiceGrpc,\n                    > tonic::server::UnaryService<super::FetchClusterStateRequest>\n                    for FetchClusterStateSvc<T> {\n                        type Response = super::FetchClusterStateResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FetchClusterStateRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ClusterServiceGrpc>::fetch_cluster_state(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FetchClusterStateSvc(inner);\n                 
       let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for ClusterServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: 
self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.cluster.ClusterService\";\n    impl<T> tonic::server::NamedService for ClusterServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.common.rs",
    "content": "// This file is @generated by prost-build.\n/// The corresponding Rust struct \\[`crate::types::DocUid`\\] is defined manually and\n/// externally provided during code generation (see `build.rs`).\n///\n/// Modify at your own risk.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DocUid {\n    /// ULID encoded as a sequence of 16 bytes (big-endian u128).\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub doc_uid: ::prost::alloc::vec::Vec<u8>,\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.control_plane.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetOrCreateOpenShardsRequest {\n    /// There should be at most one subrequest per index per request.\n    #[prost(message, repeated, tag = \"1\")]\n    pub subrequests: ::prost::alloc::vec::Vec<GetOrCreateOpenShardsSubrequest>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub closed_shards: ::prost::alloc::vec::Vec<super::ingest::ShardIds>,\n    /// The control plane should return shards that are not present on the supplied leaders.\n    ///\n    /// The control plane does not change the status of those leaders just from this signal.\n    /// It will check the status of its own ingester pool.\n    #[prost(string, repeated, tag = \"3\")]\n    pub unavailable_leaders: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetOrCreateOpenShardsSubrequest {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(string, tag = \"2\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetOrCreateOpenShardsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub successes: ::prost::alloc::vec::Vec<GetOrCreateOpenShardsSuccess>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub failures: ::prost::alloc::vec::Vec<GetOrCreateOpenShardsFailure>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct GetOrCreateOpenShardsSuccess {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag 
= \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"4\")]\n    pub open_shards: ::prost::alloc::vec::Vec<super::ingest::Shard>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetOrCreateOpenShardsFailure {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(string, tag = \"2\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"GetOrCreateOpenShardsFailureReason\", tag = \"4\")]\n    pub reason: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AdviseResetShardsRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub shard_ids: ::prost::alloc::vec::Vec<super::ingest::ShardIds>,\n    #[prost(string, tag = \"2\")]\n    pub ingester_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AdviseResetShardsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub shards_to_delete: ::prost::alloc::vec::Vec<super::ingest::ShardIds>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub shards_to_truncate: ::prost::alloc::vec::Vec<super::ingest::ShardIdPositions>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum GetOrCreateOpenShardsFailureReason {\n    Unspecified = 0,\n    IndexNotFound = 1,\n    SourceNotFound = 2,\n    NoIngestersAvailable = 3,\n}\nimpl GetOrCreateOpenShardsFailureReason {\n    /// String value 
of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_UNSPECIFIED\",\n            Self::IndexNotFound => {\n                \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_INDEX_NOT_FOUND\"\n            }\n            Self::SourceNotFound => {\n                \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_SOURCE_NOT_FOUND\"\n            }\n            Self::NoIngestersAvailable => {\n                \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_NO_INGESTERS_AVAILABLE\"\n            }\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_UNSPECIFIED\" => {\n                Some(Self::Unspecified)\n            }\n            \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_INDEX_NOT_FOUND\" => {\n                Some(Self::IndexNotFound)\n            }\n            \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_SOURCE_NOT_FOUND\" => {\n                Some(Self::SourceNotFound)\n            }\n            \"GET_OR_CREATE_OPEN_SHARDS_FAILURE_REASON_NO_INGESTERS_AVAILABLE\" => {\n                Some(Self::NoIngestersAvailable)\n            }\n            _ => None,\n        }\n    }\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait ControlPlaneService: std::fmt::Debug + Send + Sync + 'static {\n    ///Creates a new index.\n    async fn create_index(\n        &self,\n        request: 
super::metastore::CreateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::CreateIndexResponse>;\n    ///Updates an index.\n    async fn update_index(\n        &self,\n        request: super::metastore::UpdateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::IndexMetadataResponse,\n    >;\n    ///Deletes an index.\n    async fn delete_index(\n        &self,\n        request: super::metastore::DeleteIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse>;\n    ///Adds a source to an index.\n    async fn add_source(\n        &self,\n        request: super::metastore::AddSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse>;\n    ///Update a source.\n    async fn update_source(\n        &self,\n        request: super::metastore::UpdateSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse>;\n    ///Enables or disables a source.\n    async fn toggle_source(\n        &self,\n        request: super::metastore::ToggleSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse>;\n    ///Removes a source from an index.\n    async fn delete_source(\n        &self,\n        request: super::metastore::DeleteSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse>;\n    ///Returns the list of open shards for one or several sources. 
If the control plane is not able to find any\n    ///for a source, it will pick a pair of leader-follower ingesters and will open a new shard.\n    async fn get_or_create_open_shards(\n        &self,\n        request: GetOrCreateOpenShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<GetOrCreateOpenShardsResponse>;\n    ///Asks the control plane whether the shards listed in the request should be deleted or truncated.\n    async fn advise_reset_shards(\n        &self,\n        request: AdviseResetShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<AdviseResetShardsResponse>;\n    ///Performs a debounced shard pruning request to the metastore.\n    async fn prune_shards(\n        &self,\n        request: super::metastore::PruneShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse>;\n}\n#[derive(Debug, Clone)]\npub struct ControlPlaneServiceClient {\n    inner: InnerControlPlaneServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerControlPlaneServiceClient(std::sync::Arc<dyn ControlPlaneService>);\nimpl ControlPlaneServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: ControlPlaneService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockControlPlaneService > (),\n            \"`MockControlPlaneService` must be wrapped in a `MockControlPlaneServiceWrapper`: use `ControlPlaneServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerControlPlaneServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> control_plane_service_grpc_server::ControlPlaneServiceGrpcServer<\n        ControlPlaneServiceGrpcServerAdapter,\n    > {\n        let adapter = 
ControlPlaneServiceGrpcServerAdapter::new(self.clone());\n        control_plane_service_grpc_server::ControlPlaneServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = control_plane_service_grpc_client::ControlPlaneServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = ControlPlaneServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> ControlPlaneServiceClient {\n        let connection_keys_watcher = 
balance_channel.connection_keys_watcher();\n        let mut client = control_plane_service_grpc_client::ControlPlaneServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = ControlPlaneServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        ControlPlaneServiceMailbox<A>: ControlPlaneService,\n    {\n        ControlPlaneServiceClient::new(ControlPlaneServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> ControlPlaneServiceTowerLayerStack {\n        ControlPlaneServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockControlPlaneService) -> Self {\n        let mock_wrapper = mock_control_plane_service::MockControlPlaneServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockControlPlaneService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl ControlPlaneService for ControlPlaneServiceClient {\n    async fn create_index(\n        &self,\n        request: super::metastore::CreateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::CreateIndexResponse,\n    > {\n        self.inner.0.create_index(request).await\n    }\n    async fn update_index(\n       
 &self,\n        request: super::metastore::UpdateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::IndexMetadataResponse,\n    > {\n        self.inner.0.update_index(request).await\n    }\n    async fn delete_index(\n        &self,\n        request: super::metastore::DeleteIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner.0.delete_index(request).await\n    }\n    async fn add_source(\n        &self,\n        request: super::metastore::AddSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner.0.add_source(request).await\n    }\n    async fn update_source(\n        &self,\n        request: super::metastore::UpdateSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner.0.update_source(request).await\n    }\n    async fn toggle_source(\n        &self,\n        request: super::metastore::ToggleSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner.0.toggle_source(request).await\n    }\n    async fn delete_source(\n        &self,\n        request: super::metastore::DeleteSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner.0.delete_source(request).await\n    }\n    async fn get_or_create_open_shards(\n        &self,\n        request: GetOrCreateOpenShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<GetOrCreateOpenShardsResponse> {\n        self.inner.0.get_or_create_open_shards(request).await\n    }\n    async fn advise_reset_shards(\n        &self,\n        request: AdviseResetShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<AdviseResetShardsResponse> {\n        self.inner.0.advise_reset_shards(request).await\n    }\n    async fn prune_shards(\n        &self,\n       
 request: super::metastore::PruneShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner.0.prune_shards(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_control_plane_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockControlPlaneServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockControlPlaneService>,\n    }\n    #[async_trait::async_trait]\n    impl ControlPlaneService for MockControlPlaneServiceWrapper {\n        async fn create_index(\n            &self,\n            request: super::super::metastore::CreateIndexRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::CreateIndexResponse,\n        > {\n            self.inner.lock().await.create_index(request).await\n        }\n        async fn update_index(\n            &self,\n            request: super::super::metastore::UpdateIndexRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::IndexMetadataResponse,\n        > {\n            self.inner.lock().await.update_index(request).await\n        }\n        async fn delete_index(\n            &self,\n            request: super::super::metastore::DeleteIndexRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::EmptyResponse,\n        > {\n            self.inner.lock().await.delete_index(request).await\n        }\n        async fn add_source(\n            &self,\n            request: super::super::metastore::AddSourceRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::EmptyResponse,\n        > {\n            self.inner.lock().await.add_source(request).await\n        }\n        async fn update_source(\n            &self,\n            request: super::super::metastore::UpdateSourceRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n 
           super::super::metastore::EmptyResponse,\n        > {\n            self.inner.lock().await.update_source(request).await\n        }\n        async fn toggle_source(\n            &self,\n            request: super::super::metastore::ToggleSourceRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::EmptyResponse,\n        > {\n            self.inner.lock().await.toggle_source(request).await\n        }\n        async fn delete_source(\n            &self,\n            request: super::super::metastore::DeleteSourceRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::EmptyResponse,\n        > {\n            self.inner.lock().await.delete_source(request).await\n        }\n        async fn get_or_create_open_shards(\n            &self,\n            request: super::GetOrCreateOpenShardsRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::GetOrCreateOpenShardsResponse,\n        > {\n            self.inner.lock().await.get_or_create_open_shards(request).await\n        }\n        async fn advise_reset_shards(\n            &self,\n            request: super::AdviseResetShardsRequest,\n        ) -> crate::control_plane::ControlPlaneResult<super::AdviseResetShardsResponse> {\n            self.inner.lock().await.advise_reset_shards(request).await\n        }\n        async fn prune_shards(\n            &self,\n            request: super::super::metastore::PruneShardsRequest,\n        ) -> crate::control_plane::ControlPlaneResult<\n            super::super::metastore::EmptyResponse,\n        > {\n            self.inner.lock().await.prune_shards(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<super::metastore::CreateIndexRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = 
super::metastore::CreateIndexResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::CreateIndexRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.create_index(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<super::metastore::UpdateIndexRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = super::metastore::IndexMetadataResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::UpdateIndexRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.update_index(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<super::metastore::DeleteIndexRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = super::metastore::EmptyResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::DeleteIndexRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.delete_index(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<super::metastore::AddSourceRequest>\nfor 
InnerControlPlaneServiceClient {\n    type Response = super::metastore::EmptyResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::AddSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.add_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<super::metastore::UpdateSourceRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = super::metastore::EmptyResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::UpdateSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.update_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<super::metastore::ToggleSourceRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = super::metastore::EmptyResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::ToggleSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.toggle_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl 
tower::Service<super::metastore::DeleteSourceRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = super::metastore::EmptyResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::DeleteSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.delete_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<GetOrCreateOpenShardsRequest> for InnerControlPlaneServiceClient {\n    type Response = GetOrCreateOpenShardsResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: GetOrCreateOpenShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.get_or_create_open_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<AdviseResetShardsRequest> for InnerControlPlaneServiceClient {\n    type Response = AdviseResetShardsResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: AdviseResetShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.advise_reset_shards(request).await };\n        Box::pin(fut)\n    
}\n}\nimpl tower::Service<super::metastore::PruneShardsRequest>\nfor InnerControlPlaneServiceClient {\n    type Response = super::metastore::EmptyResponse;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: super::metastore::PruneShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.prune_shards(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct ControlPlaneServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerControlPlaneServiceClient,\n    create_index_svc: quickwit_common::tower::BoxService<\n        super::metastore::CreateIndexRequest,\n        super::metastore::CreateIndexResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    update_index_svc: quickwit_common::tower::BoxService<\n        super::metastore::UpdateIndexRequest,\n        super::metastore::IndexMetadataResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    delete_index_svc: quickwit_common::tower::BoxService<\n        super::metastore::DeleteIndexRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    add_source_svc: quickwit_common::tower::BoxService<\n        super::metastore::AddSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    update_source_svc: quickwit_common::tower::BoxService<\n        super::metastore::UpdateSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    toggle_source_svc: quickwit_common::tower::BoxService<\n        
super::metastore::ToggleSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    delete_source_svc: quickwit_common::tower::BoxService<\n        super::metastore::DeleteSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    get_or_create_open_shards_svc: quickwit_common::tower::BoxService<\n        GetOrCreateOpenShardsRequest,\n        GetOrCreateOpenShardsResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    advise_reset_shards_svc: quickwit_common::tower::BoxService<\n        AdviseResetShardsRequest,\n        AdviseResetShardsResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    prune_shards_svc: quickwit_common::tower::BoxService<\n        super::metastore::PruneShardsRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n}\n#[async_trait::async_trait]\nimpl ControlPlaneService for ControlPlaneServiceTowerServiceStack {\n    async fn create_index(\n        &self,\n        request: super::metastore::CreateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::CreateIndexResponse,\n    > {\n        self.create_index_svc.clone().ready().await?.call(request).await\n    }\n    async fn update_index(\n        &self,\n        request: super::metastore::UpdateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::IndexMetadataResponse,\n    > {\n        self.update_index_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_index(\n        &self,\n        request: super::metastore::DeleteIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.delete_index_svc.clone().ready().await?.call(request).await\n    }\n    async fn add_source(\n        &self,\n        request: 
super::metastore::AddSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.add_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn update_source(\n        &self,\n        request: super::metastore::UpdateSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.update_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn toggle_source(\n        &self,\n        request: super::metastore::ToggleSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.toggle_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_source(\n        &self,\n        request: super::metastore::DeleteSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.delete_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn get_or_create_open_shards(\n        &self,\n        request: GetOrCreateOpenShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<GetOrCreateOpenShardsResponse> {\n        self.get_or_create_open_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn advise_reset_shards(\n        &self,\n        request: AdviseResetShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<AdviseResetShardsResponse> {\n        self.advise_reset_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn prune_shards(\n        &self,\n        request: super::metastore::PruneShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.prune_shards_svc.clone().ready().await?.call(request).await\n    }\n}\ntype CreateIndexLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::CreateIndexRequest,\n        
super::metastore::CreateIndexResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::CreateIndexRequest,\n    super::metastore::CreateIndexResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype UpdateIndexLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::UpdateIndexRequest,\n        super::metastore::IndexMetadataResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::UpdateIndexRequest,\n    super::metastore::IndexMetadataResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype DeleteIndexLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::DeleteIndexRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::DeleteIndexRequest,\n    super::metastore::EmptyResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype AddSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::AddSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::AddSourceRequest,\n    super::metastore::EmptyResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype UpdateSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::UpdateSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::UpdateSourceRequest,\n    super::metastore::EmptyResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype ToggleSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::ToggleSourceRequest,\n        super::metastore::EmptyResponse,\n        
crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::ToggleSourceRequest,\n    super::metastore::EmptyResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype DeleteSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::DeleteSourceRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::DeleteSourceRequest,\n    super::metastore::EmptyResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype GetOrCreateOpenShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        GetOrCreateOpenShardsRequest,\n        GetOrCreateOpenShardsResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    GetOrCreateOpenShardsRequest,\n    GetOrCreateOpenShardsResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype AdviseResetShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        AdviseResetShardsRequest,\n        AdviseResetShardsResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    AdviseResetShardsRequest,\n    AdviseResetShardsResponse,\n    crate::control_plane::ControlPlaneError,\n>;\ntype PruneShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        super::metastore::PruneShardsRequest,\n        super::metastore::EmptyResponse,\n        crate::control_plane::ControlPlaneError,\n    >,\n    super::metastore::PruneShardsRequest,\n    super::metastore::EmptyResponse,\n    crate::control_plane::ControlPlaneError,\n>;\n#[derive(Debug, Default)]\npub struct ControlPlaneServiceTowerLayerStack {\n    create_index_layers: Vec<CreateIndexLayer>,\n    update_index_layers: Vec<UpdateIndexLayer>,\n    delete_index_layers: Vec<DeleteIndexLayer>,\n    add_source_layers: Vec<AddSourceLayer>,\n    update_source_layers: Vec<UpdateSourceLayer>,\n    
toggle_source_layers: Vec<ToggleSourceLayer>,\n    delete_source_layers: Vec<DeleteSourceLayer>,\n    get_or_create_open_shards_layers: Vec<GetOrCreateOpenShardsLayer>,\n    advise_reset_shards_layers: Vec<AdviseResetShardsLayer>,\n    prune_shards_layers: Vec<PruneShardsLayer>,\n}\nimpl ControlPlaneServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::CreateIndexRequest,\n                    super::metastore::CreateIndexResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::CreateIndexRequest,\n                super::metastore::CreateIndexResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::CreateIndexRequest,\n                Response = super::metastore::CreateIndexResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::CreateIndexRequest,\n                super::metastore::CreateIndexResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::CreateIndexRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::UpdateIndexRequest,\n                    super::metastore::IndexMetadataResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as 
tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::UpdateIndexRequest,\n                super::metastore::IndexMetadataResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::UpdateIndexRequest,\n                Response = super::metastore::IndexMetadataResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::UpdateIndexRequest,\n                super::metastore::IndexMetadataResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::UpdateIndexRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::DeleteIndexRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::DeleteIndexRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::DeleteIndexRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::DeleteIndexRequest,\n                super::metastore::EmptyResponse,\n                
crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::DeleteIndexRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::AddSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::AddSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::AddSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::AddSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::AddSourceRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::UpdateSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::UpdateSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        
>>::Service: tower::Service<\n                super::metastore::UpdateSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::UpdateSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::UpdateSourceRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::ToggleSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::ToggleSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::ToggleSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::ToggleSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::ToggleSourceRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    
super::metastore::DeleteSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::DeleteSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::DeleteSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::DeleteSourceRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::DeleteSourceRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetOrCreateOpenShardsRequest,\n                    GetOrCreateOpenShardsResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetOrCreateOpenShardsRequest,\n                GetOrCreateOpenShardsResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                GetOrCreateOpenShardsRequest,\n                Response = GetOrCreateOpenShardsResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            
quickwit_common::tower::BoxService<\n                GetOrCreateOpenShardsRequest,\n                GetOrCreateOpenShardsResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            GetOrCreateOpenShardsRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    AdviseResetShardsRequest,\n                    AdviseResetShardsResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                AdviseResetShardsRequest,\n                AdviseResetShardsResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                AdviseResetShardsRequest,\n                Response = AdviseResetShardsResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                AdviseResetShardsRequest,\n                AdviseResetShardsResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<AdviseResetShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::PruneShardsRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::PruneShardsRequest,\n                super::metastore::EmptyResponse,\n                
crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service: tower::Service<\n                super::metastore::PruneShardsRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                super::metastore::PruneShardsRequest,\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >>::Service as tower::Service<\n            super::metastore::PruneShardsRequest,\n        >>::Future: Send + 'static,\n    {\n        self.create_index_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.update_index_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_index_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.add_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.update_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.toggle_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.get_or_create_open_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.advise_reset_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.prune_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_create_index_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n 
                   super::metastore::CreateIndexRequest,\n                    super::metastore::CreateIndexResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::CreateIndexRequest,\n                Response = super::metastore::CreateIndexResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::CreateIndexRequest,\n        >>::Future: Send + 'static,\n    {\n        self.create_index_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_update_index_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::UpdateIndexRequest,\n                    super::metastore::IndexMetadataResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::UpdateIndexRequest,\n                Response = super::metastore::IndexMetadataResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::UpdateIndexRequest,\n        >>::Future: Send + 'static,\n    {\n        self.update_index_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_index_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::DeleteIndexRequest,\n                    super::metastore::EmptyResponse,\n                    
crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::DeleteIndexRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::DeleteIndexRequest,\n        >>::Future: Send + 'static,\n    {\n        self.delete_index_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_add_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::AddSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::AddSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::AddSourceRequest,\n        >>::Future: Send + 'static,\n    {\n        self.add_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_update_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::UpdateSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::UpdateSourceRequest,\n         
       Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::UpdateSourceRequest,\n        >>::Future: Send + 'static,\n    {\n        self.update_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_toggle_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::ToggleSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::ToggleSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::ToggleSourceRequest,\n        >>::Future: Send + 'static,\n    {\n        self.toggle_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::DeleteSourceRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::DeleteSourceRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service 
as tower::Service<\n            super::metastore::DeleteSourceRequest,\n        >>::Future: Send + 'static,\n    {\n        self.delete_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_get_or_create_open_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetOrCreateOpenShardsRequest,\n                    GetOrCreateOpenShardsResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                GetOrCreateOpenShardsRequest,\n                Response = GetOrCreateOpenShardsResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            GetOrCreateOpenShardsRequest,\n        >>::Future: Send + 'static,\n    {\n        self.get_or_create_open_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_advise_reset_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    AdviseResetShardsRequest,\n                    AdviseResetShardsResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                AdviseResetShardsRequest,\n                Response = AdviseResetShardsResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<AdviseResetShardsRequest>>::Future: Send + 'static,\n    {\n        self.advise_reset_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    
pub fn stack_prune_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    super::metastore::PruneShardsRequest,\n                    super::metastore::EmptyResponse,\n                    crate::control_plane::ControlPlaneError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                super::metastore::PruneShardsRequest,\n                Response = super::metastore::EmptyResponse,\n                Error = crate::control_plane::ControlPlaneError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            super::metastore::PruneShardsRequest,\n        >>::Future: Send + 'static,\n    {\n        self.prune_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> ControlPlaneServiceClient\n    where\n        T: ControlPlaneService,\n    {\n        let inner_client = InnerControlPlaneServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> ControlPlaneServiceClient {\n        let client = ControlPlaneServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: 
Option<tonic::codec::CompressionEncoding>,\n    ) -> ControlPlaneServiceClient {\n        let client = ControlPlaneServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> ControlPlaneServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        ControlPlaneServiceMailbox<A>: ControlPlaneService,\n    {\n        let inner_client = InnerControlPlaneServiceClient(\n            std::sync::Arc::new(ControlPlaneServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(\n        self,\n        mock: MockControlPlaneService,\n    ) -> ControlPlaneServiceClient {\n        let client = ControlPlaneServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerControlPlaneServiceClient,\n    ) -> ControlPlaneServiceClient {\n        let create_index_svc = self\n            .create_index_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let update_index_svc = self\n            .update_index_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_index_svc = self\n            .delete_index_layers\n            .into_iter()\n       
     .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let add_source_svc = self\n            .add_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let update_source_svc = self\n            .update_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let toggle_source_svc = self\n            .toggle_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_source_svc = self\n            .delete_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let get_or_create_open_shards_svc = self\n            .get_or_create_open_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let advise_reset_shards_svc = self\n            .advise_reset_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let prune_shards_svc = self\n            .prune_shards_layers\n            
.into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = ControlPlaneServiceTowerServiceStack {\n            inner: inner_client,\n            create_index_svc,\n            update_index_svc,\n            delete_index_svc,\n            add_source_svc,\n            update_source_svc,\n            toggle_source_svc,\n            delete_source_svc,\n            get_or_create_open_shards_svc,\n            advise_reset_shards_svc,\n            prune_shards_svc,\n        };\n        ControlPlaneServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct ControlPlaneServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::control_plane::ControlPlaneError>,\n}\nimpl<A: quickwit_actors::Actor> ControlPlaneServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for ControlPlaneServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for ControlPlaneServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = 
Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::control_plane::ControlPlaneError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::control_plane::ControlPlaneError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> ControlPlaneService for ControlPlaneServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    ControlPlaneServiceMailbox<\n        A,\n    >: tower::Service<\n            super::metastore::CreateIndexRequest,\n            Response = super::metastore::CreateIndexResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::CreateIndexResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            super::metastore::UpdateIndexRequest,\n            Response = super::metastore::IndexMetadataResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::IndexMetadataResponse,\n                crate::control_plane::ControlPlaneError,\n            
>,\n        >\n        + tower::Service<\n            super::metastore::DeleteIndexRequest,\n            Response = super::metastore::EmptyResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            super::metastore::AddSourceRequest,\n            Response = super::metastore::EmptyResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            super::metastore::UpdateSourceRequest,\n            Response = super::metastore::EmptyResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            super::metastore::ToggleSourceRequest,\n            Response = super::metastore::EmptyResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            super::metastore::DeleteSourceRequest,\n            Response = super::metastore::EmptyResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            GetOrCreateOpenShardsRequest,\n            Response = GetOrCreateOpenShardsResponse,\n            Error = 
crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                GetOrCreateOpenShardsResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            AdviseResetShardsRequest,\n            Response = AdviseResetShardsResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                AdviseResetShardsResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >\n        + tower::Service<\n            super::metastore::PruneShardsRequest,\n            Response = super::metastore::EmptyResponse,\n            Error = crate::control_plane::ControlPlaneError,\n            Future = BoxFuture<\n                super::metastore::EmptyResponse,\n                crate::control_plane::ControlPlaneError,\n            >,\n        >,\n{\n    async fn create_index(\n        &self,\n        request: super::metastore::CreateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::CreateIndexResponse,\n    > {\n        self.clone().call(request).await\n    }\n    async fn update_index(\n        &self,\n        request: super::metastore::UpdateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::IndexMetadataResponse,\n    > {\n        self.clone().call(request).await\n    }\n    async fn delete_index(\n        &self,\n        request: super::metastore::DeleteIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn add_source(\n        &self,\n        request: super::metastore::AddSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn update_source(\n        &self,\n        request: 
super::metastore::UpdateSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn toggle_source(\n        &self,\n        request: super::metastore::ToggleSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn delete_source(\n        &self,\n        request: super::metastore::DeleteSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn get_or_create_open_shards(\n        &self,\n        request: GetOrCreateOpenShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<GetOrCreateOpenShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn advise_reset_shards(\n        &self,\n        request: AdviseResetShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<AdviseResetShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn prune_shards(\n        &self,\n        request: super::metastore::PruneShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct ControlPlaneServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> ControlPlaneServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> ControlPlaneService\nfor 
ControlPlaneServiceGrpcClientAdapter<\n    control_plane_service_grpc_client::ControlPlaneServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn create_index(\n        &self,\n        request: super::metastore::CreateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::CreateIndexResponse,\n    > {\n        self.inner\n            .clone()\n            .create_index(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::CreateIndexRequest::rpc_name(),\n            ))\n    }\n    async fn update_index(\n        &self,\n        request: super::metastore::UpdateIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<\n        super::metastore::IndexMetadataResponse,\n    > {\n        self.inner\n            .clone()\n            .update_index(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::UpdateIndexRequest::rpc_name(),\n            ))\n    }\n    async fn delete_index(\n        &self,\n        request: super::metastore::DeleteIndexRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner\n            .clone()\n            .delete_index(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n        
        super::metastore::DeleteIndexRequest::rpc_name(),\n            ))\n    }\n    async fn add_source(\n        &self,\n        request: super::metastore::AddSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner\n            .clone()\n            .add_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::AddSourceRequest::rpc_name(),\n            ))\n    }\n    async fn update_source(\n        &self,\n        request: super::metastore::UpdateSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner\n            .clone()\n            .update_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::UpdateSourceRequest::rpc_name(),\n            ))\n    }\n    async fn toggle_source(\n        &self,\n        request: super::metastore::ToggleSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner\n            .clone()\n            .toggle_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::ToggleSourceRequest::rpc_name(),\n            ))\n    }\n    async fn delete_source(\n        &self,\n        request: super::metastore::DeleteSourceRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner\n            .clone()\n            .delete_source(request)\n            .await\n            .map(|response| response.into_inner())\n 
           .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::DeleteSourceRequest::rpc_name(),\n            ))\n    }\n    async fn get_or_create_open_shards(\n        &self,\n        request: GetOrCreateOpenShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<GetOrCreateOpenShardsResponse> {\n        self.inner\n            .clone()\n            .get_or_create_open_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                GetOrCreateOpenShardsRequest::rpc_name(),\n            ))\n    }\n    async fn advise_reset_shards(\n        &self,\n        request: AdviseResetShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<AdviseResetShardsResponse> {\n        self.inner\n            .clone()\n            .advise_reset_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                AdviseResetShardsRequest::rpc_name(),\n            ))\n    }\n    async fn prune_shards(\n        &self,\n        request: super::metastore::PruneShardsRequest,\n    ) -> crate::control_plane::ControlPlaneResult<super::metastore::EmptyResponse> {\n        self.inner\n            .clone()\n            .prune_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                super::metastore::PruneShardsRequest::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct ControlPlaneServiceGrpcServerAdapter {\n    inner: InnerControlPlaneServiceClient,\n}\nimpl ControlPlaneServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: 
ControlPlaneService,\n    {\n        Self {\n            inner: InnerControlPlaneServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl control_plane_service_grpc_server::ControlPlaneServiceGrpc\nfor ControlPlaneServiceGrpcServerAdapter {\n    async fn create_index(\n        &self,\n        request: tonic::Request<super::metastore::CreateIndexRequest>,\n    ) -> Result<tonic::Response<super::metastore::CreateIndexResponse>, tonic::Status> {\n        self.inner\n            .0\n            .create_index(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn update_index(\n        &self,\n        request: tonic::Request<super::metastore::UpdateIndexRequest>,\n    ) -> Result<\n        tonic::Response<super::metastore::IndexMetadataResponse>,\n        tonic::Status,\n    > {\n        self.inner\n            .0\n            .update_index(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_index(\n        &self,\n        request: tonic::Request<super::metastore::DeleteIndexRequest>,\n    ) -> Result<tonic::Response<super::metastore::EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .delete_index(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn add_source(\n        &self,\n        request: tonic::Request<super::metastore::AddSourceRequest>,\n    ) -> Result<tonic::Response<super::metastore::EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .add_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn 
update_source(\n        &self,\n        request: tonic::Request<super::metastore::UpdateSourceRequest>,\n    ) -> Result<tonic::Response<super::metastore::EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .update_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn toggle_source(\n        &self,\n        request: tonic::Request<super::metastore::ToggleSourceRequest>,\n    ) -> Result<tonic::Response<super::metastore::EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .toggle_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_source(\n        &self,\n        request: tonic::Request<super::metastore::DeleteSourceRequest>,\n    ) -> Result<tonic::Response<super::metastore::EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .delete_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn get_or_create_open_shards(\n        &self,\n        request: tonic::Request<GetOrCreateOpenShardsRequest>,\n    ) -> Result<tonic::Response<GetOrCreateOpenShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .get_or_create_open_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn advise_reset_shards(\n        &self,\n        request: tonic::Request<AdviseResetShardsRequest>,\n    ) -> Result<tonic::Response<AdviseResetShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .advise_reset_shards(request.into_inner())\n            .await\n            
.map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn prune_shards(\n        &self,\n        request: tonic::Request<super::metastore::PruneShardsRequest>,\n    ) -> Result<tonic::Response<super::metastore::EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .prune_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod control_plane_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct ControlPlaneServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl ControlPlaneServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> ControlPlaneServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = 
tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> ControlPlaneServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            ControlPlaneServiceGrpcClient::new(\n                InterceptedService::new(inner, interceptor),\n            )\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn 
max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Creates a new index.\n        pub async fn create_index(\n            &mut self,\n            request: impl tonic::IntoRequest<super::super::metastore::CreateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::CreateIndexResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/CreateIndex\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"CreateIndex\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Updates an index.\n        pub async fn update_index(\n            &mut self,\n            request: impl tonic::IntoRequest<super::super::metastore::UpdateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::IndexMetadataResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = 
tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/UpdateIndex\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"UpdateIndex\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Deletes an index.\n        pub async fn delete_index(\n            &mut self,\n            request: impl tonic::IntoRequest<super::super::metastore::DeleteIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/DeleteIndex\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"DeleteIndex\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Adds a source to an index.\n        pub async fn add_source(\n            &mut self,\n            request: impl tonic::IntoRequest<super::super::metastore::AddSourceRequest>,\n        ) -> std::result::Result<\n            
tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/AddSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"AddSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Update a source.\n        pub async fn update_source(\n            &mut self,\n            request: impl tonic::IntoRequest<\n                super::super::metastore::UpdateSourceRequest,\n            >,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/UpdateSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        
\"quickwit.control_plane.ControlPlaneService\",\n                        \"UpdateSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Enables or disables a source.\n        pub async fn toggle_source(\n            &mut self,\n            request: impl tonic::IntoRequest<\n                super::super::metastore::ToggleSourceRequest,\n            >,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/ToggleSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"ToggleSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Removes a source from an index.\n        pub async fn delete_source(\n            &mut self,\n            request: impl tonic::IntoRequest<\n                super::super::metastore::DeleteSourceRequest,\n            >,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        
format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/DeleteSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"DeleteSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Returns the list of open shards for one or several sources. If the control plane is not able to find any\n        /// for a source, it will pick a pair of leader-follower ingesters and will open a new shard.\n        pub async fn get_or_create_open_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetOrCreateOpenShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetOrCreateOpenShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/GetOrCreateOpenShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"GetOrCreateOpenShards\",\n                  
  ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Asks the control plane whether the shards listed in the request should be deleted or truncated.\n        pub async fn advise_reset_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::AdviseResetShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::AdviseResetShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/AdviseResetShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"AdviseResetShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Performs a debounced shard pruning request to the metastore.\n        pub async fn prune_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::super::metastore::PruneShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                
})?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.control_plane.ControlPlaneService/PruneShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.control_plane.ControlPlaneService\",\n                        \"PruneShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod control_plane_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with ControlPlaneServiceGrpcServer.\n    #[async_trait]\n    pub trait ControlPlaneServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Creates a new index.\n        async fn create_index(\n            &self,\n            request: tonic::Request<super::super::metastore::CreateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::CreateIndexResponse>,\n            tonic::Status,\n        >;\n        /// Updates an index.\n        async fn update_index(\n            &self,\n            request: tonic::Request<super::super::metastore::UpdateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::IndexMetadataResponse>,\n            tonic::Status,\n        >;\n        /// Deletes an index.\n        async fn delete_index(\n            &self,\n            request: tonic::Request<super::super::metastore::DeleteIndexRequest>,\n        ) -> std::result::Result<\n            
tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        >;\n        /// Adds a source to an index.\n        async fn add_source(\n            &self,\n            request: tonic::Request<super::super::metastore::AddSourceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        >;\n        /// Update a source.\n        async fn update_source(\n            &self,\n            request: tonic::Request<super::super::metastore::UpdateSourceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        >;\n        /// Enables or disables a source.\n        async fn toggle_source(\n            &self,\n            request: tonic::Request<super::super::metastore::ToggleSourceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        >;\n        /// Removes a source from an index.\n        async fn delete_source(\n            &self,\n            request: tonic::Request<super::super::metastore::DeleteSourceRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        >;\n        /// Returns the list of open shards for one or several sources. 
If the control plane is not able to find any\n        /// for a source, it will pick a pair of leader-follower ingesters and will open a new shard.\n        async fn get_or_create_open_shards(\n            &self,\n            request: tonic::Request<super::GetOrCreateOpenShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetOrCreateOpenShardsResponse>,\n            tonic::Status,\n        >;\n        /// Asks the control plane whether the shards listed in the request should be deleted or truncated.\n        async fn advise_reset_shards(\n            &self,\n            request: tonic::Request<super::AdviseResetShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::AdviseResetShardsResponse>,\n            tonic::Status,\n        >;\n        /// Performs a debounced shard pruning request to the metastore.\n        async fn prune_shards(\n            &self,\n            request: tonic::Request<super::super::metastore::PruneShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::super::metastore::EmptyResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct ControlPlaneServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> ControlPlaneServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n    
    }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for ControlPlaneServiceGrpcServer<T>\n    where\n        T: ControlPlaneServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> 
{\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.control_plane.ControlPlaneService/CreateIndex\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CreateIndexSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::CreateIndexRequest,\n                    > for CreateIndexSvc<T> {\n                        type Response = super::super::metastore::CreateIndexResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::CreateIndexRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::create_index(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = 
self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = CreateIndexSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/UpdateIndex\" => {\n                    #[allow(non_camel_case_types)]\n                    struct UpdateIndexSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::UpdateIndexRequest,\n                    > for UpdateIndexSvc<T> {\n                        type Response = super::super::metastore::IndexMetadataResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::UpdateIndexRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = 
Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::update_index(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = UpdateIndexSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/DeleteIndex\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteIndexSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: 
ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::DeleteIndexRequest,\n                    > for DeleteIndexSvc<T> {\n                        type Response = super::super::metastore::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::DeleteIndexRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::delete_index(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DeleteIndexSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                
accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/AddSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct AddSourceSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::AddSourceRequest,\n                    > for AddSourceSvc<T> {\n                        type Response = super::super::metastore::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::AddSourceRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::add_source(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n               
     let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = AddSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/UpdateSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct UpdateSourceSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::UpdateSourceRequest,\n                    > for UpdateSourceSvc<T> {\n                        type Response = super::super::metastore::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: 
tonic::Request<\n                                super::super::metastore::UpdateSourceRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::update_source(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = UpdateSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                
\"/quickwit.control_plane.ControlPlaneService/ToggleSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ToggleSourceSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::ToggleSourceRequest,\n                    > for ToggleSourceSvc<T> {\n                        type Response = super::super::metastore::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::ToggleSourceRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::toggle_source(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = 
ToggleSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/DeleteSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteSourceSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::DeleteSourceRequest,\n                    > for DeleteSourceSvc<T> {\n                        type Response = super::super::metastore::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::DeleteSourceRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::delete_source(\n               
                         &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DeleteSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/GetOrCreateOpenShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetOrCreateOpenShardsSvc<T: ControlPlaneServiceGrpc>(\n                        pub Arc<T>,\n                    );\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > 
tonic::server::UnaryService<super::GetOrCreateOpenShardsRequest>\n                    for GetOrCreateOpenShardsSvc<T> {\n                        type Response = super::GetOrCreateOpenShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetOrCreateOpenShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::get_or_create_open_shards(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetOrCreateOpenShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                     
       .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/AdviseResetShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct AdviseResetShardsSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<super::AdviseResetShardsRequest>\n                    for AdviseResetShardsSvc<T> {\n                        type Response = super::AdviseResetShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::AdviseResetShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::advise_reset_shards(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let 
max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = AdviseResetShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.control_plane.ControlPlaneService/PruneShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct PruneShardsSvc<T: ControlPlaneServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: ControlPlaneServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::super::metastore::PruneShardsRequest,\n                    > for PruneShardsSvc<T> {\n                        type Response = super::super::metastore::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::super::metastore::PruneShardsRequest,\n      
                      >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as ControlPlaneServiceGrpc>::prune_shards(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = PruneShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            
tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for ControlPlaneServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.control_plane.ControlPlaneService\";\n    impl<T> tonic::server::NamedService for ControlPlaneServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.developer.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetDebugInfoRequest {\n    /// Restricts the debug info to the given roles.\n    #[prost(string, repeated, tag = \"1\")]\n    pub roles: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetDebugInfoResponse {\n    #[prost(bytes = \"bytes\", tag = \"1\")]\n    pub debug_info_json: ::prost::bytes::Bytes,\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for GetDebugInfoRequest {\n    fn rpc_name() -> &'static str {\n        \"get_debug_info\"\n    }\n}\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait DeveloperService: std::fmt::Debug + Send + Sync + 'static {\n    async fn get_debug_info(\n        &self,\n        request: GetDebugInfoRequest,\n    ) -> crate::developer::DeveloperResult<GetDebugInfoResponse>;\n}\n#[derive(Debug, Clone)]\npub struct DeveloperServiceClient {\n    inner: InnerDeveloperServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerDeveloperServiceClient(std::sync::Arc<dyn DeveloperService>);\nimpl DeveloperServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: DeveloperService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockDeveloperService > (),\n            \"`MockDeveloperService` must be wrapped in a `MockDeveloperServiceWrapper`: use `DeveloperServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: 
InnerDeveloperServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> developer_service_grpc_server::DeveloperServiceGrpcServer<\n        DeveloperServiceGrpcServerAdapter,\n    > {\n        let adapter = DeveloperServiceGrpcServerAdapter::new(self.clone());\n        developer_service_grpc_server::DeveloperServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = developer_service_grpc_client::DeveloperServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = DeveloperServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        
balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> DeveloperServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = developer_service_grpc_client::DeveloperServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = DeveloperServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        DeveloperServiceMailbox<A>: DeveloperService,\n    {\n        DeveloperServiceClient::new(DeveloperServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> DeveloperServiceTowerLayerStack {\n        DeveloperServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockDeveloperService) -> Self {\n        let mock_wrapper = mock_developer_service::MockDeveloperServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockDeveloperService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl DeveloperService for DeveloperServiceClient {\n    async fn get_debug_info(\n        &self,\n        request: 
GetDebugInfoRequest,\n    ) -> crate::developer::DeveloperResult<GetDebugInfoResponse> {\n        self.inner.0.get_debug_info(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_developer_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockDeveloperServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockDeveloperService>,\n    }\n    #[async_trait::async_trait]\n    impl DeveloperService for MockDeveloperServiceWrapper {\n        async fn get_debug_info(\n            &self,\n            request: super::GetDebugInfoRequest,\n        ) -> crate::developer::DeveloperResult<super::GetDebugInfoResponse> {\n            self.inner.lock().await.get_debug_info(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<GetDebugInfoRequest> for InnerDeveloperServiceClient {\n    type Response = GetDebugInfoResponse;\n    type Error = crate::developer::DeveloperError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: GetDebugInfoRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.get_debug_info(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct DeveloperServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerDeveloperServiceClient,\n    get_debug_info_svc: quickwit_common::tower::BoxService<\n        GetDebugInfoRequest,\n        GetDebugInfoResponse,\n        crate::developer::DeveloperError,\n    >,\n}\n#[async_trait::async_trait]\nimpl DeveloperService for DeveloperServiceTowerServiceStack {\n    async fn get_debug_info(\n        &self,\n 
       request: GetDebugInfoRequest,\n    ) -> crate::developer::DeveloperResult<GetDebugInfoResponse> {\n        self.get_debug_info_svc.clone().ready().await?.call(request).await\n    }\n}\ntype GetDebugInfoLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        GetDebugInfoRequest,\n        GetDebugInfoResponse,\n        crate::developer::DeveloperError,\n    >,\n    GetDebugInfoRequest,\n    GetDebugInfoResponse,\n    crate::developer::DeveloperError,\n>;\n#[derive(Debug, Default)]\npub struct DeveloperServiceTowerLayerStack {\n    get_debug_info_layers: Vec<GetDebugInfoLayer>,\n}\nimpl DeveloperServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetDebugInfoRequest,\n                    GetDebugInfoResponse,\n                    crate::developer::DeveloperError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetDebugInfoRequest,\n                GetDebugInfoResponse,\n                crate::developer::DeveloperError,\n            >,\n        >>::Service: tower::Service<\n                GetDebugInfoRequest,\n                Response = GetDebugInfoResponse,\n                Error = crate::developer::DeveloperError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetDebugInfoRequest,\n                GetDebugInfoResponse,\n                crate::developer::DeveloperError,\n            >,\n        >>::Service as tower::Service<GetDebugInfoRequest>>::Future: Send + 'static,\n    {\n        self.get_debug_info_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_get_debug_info_layer<L>(mut self, layer: L) -> Self\n    
where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetDebugInfoRequest,\n                    GetDebugInfoResponse,\n                    crate::developer::DeveloperError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                GetDebugInfoRequest,\n                Response = GetDebugInfoResponse,\n                Error = crate::developer::DeveloperError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<GetDebugInfoRequest>>::Future: Send + 'static,\n    {\n        self.get_debug_info_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> DeveloperServiceClient\n    where\n        T: DeveloperService,\n    {\n        let inner_client = InnerDeveloperServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> DeveloperServiceClient {\n        let client = DeveloperServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> DeveloperServiceClient {\n        let client = DeveloperServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            
compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> DeveloperServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        DeveloperServiceMailbox<A>: DeveloperService,\n    {\n        let inner_client = InnerDeveloperServiceClient(\n            std::sync::Arc::new(DeveloperServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockDeveloperService) -> DeveloperServiceClient {\n        let client = DeveloperServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerDeveloperServiceClient,\n    ) -> DeveloperServiceClient {\n        let get_debug_info_svc = self\n            .get_debug_info_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = DeveloperServiceTowerServiceStack {\n            inner: inner_client,\n            get_debug_info_svc,\n        };\n        DeveloperServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct DeveloperServiceMailbox<A: 
quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::developer::DeveloperError>,\n}\nimpl<A: quickwit_actors::Actor> DeveloperServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for DeveloperServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for DeveloperServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::developer::DeveloperError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::developer::DeveloperError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! 
mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> DeveloperService for DeveloperServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    DeveloperServiceMailbox<\n        A,\n    >: tower::Service<\n        GetDebugInfoRequest,\n        Response = GetDebugInfoResponse,\n        Error = crate::developer::DeveloperError,\n        Future = BoxFuture<GetDebugInfoResponse, crate::developer::DeveloperError>,\n    >,\n{\n    async fn get_debug_info(\n        &self,\n        request: GetDebugInfoRequest,\n    ) -> crate::developer::DeveloperResult<GetDebugInfoResponse> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct DeveloperServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> DeveloperServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> DeveloperService\nfor DeveloperServiceGrpcClientAdapter<\n    developer_service_grpc_client::DeveloperServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: 
Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn get_debug_info(\n        &self,\n        request: GetDebugInfoRequest,\n    ) -> crate::developer::DeveloperResult<GetDebugInfoResponse> {\n        self.inner\n            .clone()\n            .get_debug_info(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                GetDebugInfoRequest::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct DeveloperServiceGrpcServerAdapter {\n    inner: InnerDeveloperServiceClient,\n}\nimpl DeveloperServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: DeveloperService,\n    {\n        Self {\n            inner: InnerDeveloperServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl developer_service_grpc_server::DeveloperServiceGrpc\nfor DeveloperServiceGrpcServerAdapter {\n    async fn get_debug_info(\n        &self,\n        request: tonic::Request<GetDebugInfoRequest>,\n    ) -> Result<tonic::Response<GetDebugInfoResponse>, tonic::Status> {\n        self.inner\n            .0\n            .get_debug_info(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod developer_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct DeveloperServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl DeveloperServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given 
endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> DeveloperServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> DeveloperServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            DeveloperServiceGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut 
self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        pub async fn get_debug_info(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetDebugInfoRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetDebugInfoResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.developer.DeveloperService/GetDebugInfo\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.developer.DeveloperService\",\n                        
\"GetDebugInfo\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod developer_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with DeveloperServiceGrpcServer.\n    #[async_trait]\n    pub trait DeveloperServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        async fn get_debug_info(\n            &self,\n            request: tonic::Request<super::GetDebugInfoRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetDebugInfoResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct DeveloperServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> DeveloperServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        
}\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for DeveloperServiceGrpcServer<T>\n    where\n        T: DeveloperServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.developer.DeveloperService/GetDebugInfo\" => {\n                    #[allow(non_camel_case_types)]\n         
           struct GetDebugInfoSvc<T: DeveloperServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: DeveloperServiceGrpc,\n                    > tonic::server::UnaryService<super::GetDebugInfoRequest>\n                    for GetDebugInfoSvc<T> {\n                        type Response = super::GetDebugInfoResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetDebugInfoRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as DeveloperServiceGrpc>::get_debug_info(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetDebugInfoSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            
.apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for DeveloperServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.developer.DeveloperService\";\n    impl<T> tonic::server::NamedService for DeveloperServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n   
 }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.indexing.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ApplyIndexingPlanRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub indexing_tasks: ::prost::alloc::vec::Vec<IndexingTask>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IndexingTask {\n    /// The tasks's index UID.\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    /// The task's source ID.\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    /// pipeline id\n    #[prost(message, optional, tag = \"4\")]\n    pub pipeline_uid: ::core::option::Option<crate::types::PipelineUid>,\n    /// The shards assigned to the indexer.\n    #[prost(message, repeated, tag = \"3\")]\n    pub shard_ids: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n    /// Fingerprint of the pipeline parameters. 
Anything that should cause a pipeline restart (such\n    /// as updating indexing settings, the doc mapping or the source) should influence this value.\n    #[prost(uint64, tag = \"6\")]\n    pub params_fingerprint: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ApplyIndexingPlanResponse {}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait IndexingService: std::fmt::Debug + Send + Sync + 'static {\n    ///Apply an indexing plan on the node.\n    async fn apply_indexing_plan(\n        &self,\n        request: ApplyIndexingPlanRequest,\n    ) -> crate::indexing::IndexingResult<ApplyIndexingPlanResponse>;\n}\n#[derive(Debug, Clone)]\npub struct IndexingServiceClient {\n    inner: InnerIndexingServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerIndexingServiceClient(std::sync::Arc<dyn IndexingService>);\nimpl IndexingServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IndexingService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockIndexingService > (),\n            \"`MockIndexingService` must be wrapped in a `MockIndexingServiceWrapper`: use `IndexingServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerIndexingServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> indexing_service_grpc_server::IndexingServiceGrpcServer<\n        IndexingServiceGrpcServerAdapter,\n    > {\n        let adapter = IndexingServiceGrpcServerAdapter::new(self.clone());\n        
indexing_service_grpc_server::IndexingServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = indexing_service_grpc_client::IndexingServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IndexingServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IndexingServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = 
indexing_service_grpc_client::IndexingServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IndexingServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IndexingServiceMailbox<A>: IndexingService,\n    {\n        IndexingServiceClient::new(IndexingServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> IndexingServiceTowerLayerStack {\n        IndexingServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockIndexingService) -> Self {\n        let mock_wrapper = mock_indexing_service::MockIndexingServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockIndexingService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl IndexingService for IndexingServiceClient {\n    async fn apply_indexing_plan(\n        &self,\n        request: ApplyIndexingPlanRequest,\n    ) -> crate::indexing::IndexingResult<ApplyIndexingPlanResponse> {\n        self.inner.0.apply_indexing_plan(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_indexing_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockIndexingServiceWrapper {\n        pub(super) inner: 
tokio::sync::Mutex<MockIndexingService>,\n    }\n    #[async_trait::async_trait]\n    impl IndexingService for MockIndexingServiceWrapper {\n        async fn apply_indexing_plan(\n            &self,\n            request: super::ApplyIndexingPlanRequest,\n        ) -> crate::indexing::IndexingResult<super::ApplyIndexingPlanResponse> {\n            self.inner.lock().await.apply_indexing_plan(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<ApplyIndexingPlanRequest> for InnerIndexingServiceClient {\n    type Response = ApplyIndexingPlanResponse;\n    type Error = crate::indexing::IndexingError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ApplyIndexingPlanRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.apply_indexing_plan(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct IndexingServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerIndexingServiceClient,\n    apply_indexing_plan_svc: quickwit_common::tower::BoxService<\n        ApplyIndexingPlanRequest,\n        ApplyIndexingPlanResponse,\n        crate::indexing::IndexingError,\n    >,\n}\n#[async_trait::async_trait]\nimpl IndexingService for IndexingServiceTowerServiceStack {\n    async fn apply_indexing_plan(\n        &self,\n        request: ApplyIndexingPlanRequest,\n    ) -> crate::indexing::IndexingResult<ApplyIndexingPlanResponse> {\n        self.apply_indexing_plan_svc.clone().ready().await?.call(request).await\n    }\n}\ntype ApplyIndexingPlanLayer = quickwit_common::tower::BoxLayer<\n    
quickwit_common::tower::BoxService<\n        ApplyIndexingPlanRequest,\n        ApplyIndexingPlanResponse,\n        crate::indexing::IndexingError,\n    >,\n    ApplyIndexingPlanRequest,\n    ApplyIndexingPlanResponse,\n    crate::indexing::IndexingError,\n>;\n#[derive(Debug, Default)]\npub struct IndexingServiceTowerLayerStack {\n    apply_indexing_plan_layers: Vec<ApplyIndexingPlanLayer>,\n}\nimpl IndexingServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ApplyIndexingPlanRequest,\n                    ApplyIndexingPlanResponse,\n                    crate::indexing::IndexingError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ApplyIndexingPlanRequest,\n                ApplyIndexingPlanResponse,\n                crate::indexing::IndexingError,\n            >,\n        >>::Service: tower::Service<\n                ApplyIndexingPlanRequest,\n                Response = ApplyIndexingPlanResponse,\n                Error = crate::indexing::IndexingError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ApplyIndexingPlanRequest,\n                ApplyIndexingPlanResponse,\n                crate::indexing::IndexingError,\n            >,\n        >>::Service as tower::Service<ApplyIndexingPlanRequest>>::Future: Send + 'static,\n    {\n        self.apply_indexing_plan_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_apply_indexing_plan_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ApplyIndexingPlanRequest,\n                    ApplyIndexingPlanResponse,\n      
              crate::indexing::IndexingError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ApplyIndexingPlanRequest,\n                Response = ApplyIndexingPlanResponse,\n                Error = crate::indexing::IndexingError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ApplyIndexingPlanRequest>>::Future: Send + 'static,\n    {\n        self.apply_indexing_plan_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> IndexingServiceClient\n    where\n        T: IndexingService,\n    {\n        let inner_client = InnerIndexingServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IndexingServiceClient {\n        let client = IndexingServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IndexingServiceClient {\n        let client = IndexingServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn 
build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> IndexingServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IndexingServiceMailbox<A>: IndexingService,\n    {\n        let inner_client = InnerIndexingServiceClient(\n            std::sync::Arc::new(IndexingServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockIndexingService) -> IndexingServiceClient {\n        let client = IndexingServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerIndexingServiceClient,\n    ) -> IndexingServiceClient {\n        let apply_indexing_plan_svc = self\n            .apply_indexing_plan_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = IndexingServiceTowerServiceStack {\n            inner: inner_client,\n            apply_indexing_plan_svc,\n        };\n        IndexingServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct IndexingServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::indexing::IndexingError>,\n}\nimpl<A: quickwit_actors::Actor> IndexingServiceMailbox<A> {\n    pub fn 
new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for IndexingServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for IndexingServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::indexing::IndexingError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::indexing::IndexingError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! 
mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> IndexingService for IndexingServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    IndexingServiceMailbox<\n        A,\n    >: tower::Service<\n        ApplyIndexingPlanRequest,\n        Response = ApplyIndexingPlanResponse,\n        Error = crate::indexing::IndexingError,\n        Future = BoxFuture<ApplyIndexingPlanResponse, crate::indexing::IndexingError>,\n    >,\n{\n    async fn apply_indexing_plan(\n        &self,\n        request: ApplyIndexingPlanRequest,\n    ) -> crate::indexing::IndexingResult<ApplyIndexingPlanResponse> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct IndexingServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> IndexingServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> IndexingService\nfor IndexingServiceGrpcClientAdapter<\n    indexing_service_grpc_client::IndexingServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as 
tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn apply_indexing_plan(\n        &self,\n        request: ApplyIndexingPlanRequest,\n    ) -> crate::indexing::IndexingResult<ApplyIndexingPlanResponse> {\n        self.inner\n            .clone()\n            .apply_indexing_plan(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ApplyIndexingPlanRequest::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct IndexingServiceGrpcServerAdapter {\n    inner: InnerIndexingServiceClient,\n}\nimpl IndexingServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IndexingService,\n    {\n        Self {\n            inner: InnerIndexingServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl indexing_service_grpc_server::IndexingServiceGrpc\nfor IndexingServiceGrpcServerAdapter {\n    async fn apply_indexing_plan(\n        &self,\n        request: tonic::Request<ApplyIndexingPlanRequest>,\n    ) -> Result<tonic::Response<ApplyIndexingPlanResponse>, tonic::Status> {\n        self.inner\n            .0\n            .apply_indexing_plan(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod indexing_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct IndexingServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl IndexingServiceGrpcClient<tonic::transport::Channel> {\n        
/// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> IndexingServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> IndexingServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            IndexingServiceGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n 
       #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Apply an indexing plan on the node.\n        pub async fn apply_indexing_plan(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ApplyIndexingPlanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ApplyIndexingPlanResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.indexing.IndexingService/ApplyIndexingPlan\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    
GrpcMethod::new(\n                        \"quickwit.indexing.IndexingService\",\n                        \"ApplyIndexingPlan\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod indexing_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with IndexingServiceGrpcServer.\n    #[async_trait]\n    pub trait IndexingServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Apply an indexing plan on the node.\n        async fn apply_indexing_plan(\n            &self,\n            request: tonic::Request<super::ApplyIndexingPlanRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ApplyIndexingPlanResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct IndexingServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> IndexingServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> 
InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for IndexingServiceGrpcServer<T>\n    where\n        T: IndexingServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n      
      match req.uri().path() {\n                \"/quickwit.indexing.IndexingService/ApplyIndexingPlan\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ApplyIndexingPlanSvc<T: IndexingServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IndexingServiceGrpc,\n                    > tonic::server::UnaryService<super::ApplyIndexingPlanRequest>\n                    for ApplyIndexingPlanSvc<T> {\n                        type Response = super::ApplyIndexingPlanResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ApplyIndexingPlanRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IndexingServiceGrpc>::apply_indexing_plan(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ApplyIndexingPlanSvc(inner);\n                        let codec = 
tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for IndexingServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n      
          max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.indexing.IndexingService\";\n    impl<T> tonic::server::NamedService for IndexingServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.ingest.ingester.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct RetainShardsForSource {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub shard_ids: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct RetainShardsRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub retain_shards_for_sources: ::prost::alloc::vec::Vec<RetainShardsForSource>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct RetainShardsResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct PersistRequest {\n    #[prost(string, tag = \"1\")]\n    pub leader_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"super::CommitTypeV2\", tag = \"3\")]\n    pub commit_type: i32,\n    #[prost(message, repeated, tag = \"4\")]\n    pub subrequests: ::prost::alloc::vec::Vec<PersistSubrequest>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct PersistSubrequest {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"5\")]\n    pub doc_batch: ::core::option::Option<super::DocBatchV2>,\n}\n#[derive(serde::Serialize, serde::Deserialize, 
utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct PersistResponse {\n    #[prost(string, tag = \"1\")]\n    pub leader_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"2\")]\n    pub successes: ::prost::alloc::vec::Vec<PersistSuccess>,\n    #[prost(message, repeated, tag = \"3\")]\n    pub failures: ::prost::alloc::vec::Vec<PersistFailure>,\n    #[prost(message, optional, tag = \"4\")]\n    pub routing_update: ::core::option::Option<RoutingUpdate>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct RoutingUpdate {\n    #[prost(uint32, tag = \"1\")]\n    pub capacity_score: u32,\n    #[prost(message, repeated, tag = \"2\")]\n    pub source_shard_updates: ::prost::alloc::vec::Vec<SourceShardUpdate>,\n    #[prost(message, repeated, tag = \"3\")]\n    pub closed_shards: ::prost::alloc::vec::Vec<super::ShardIds>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SourceShardUpdate {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(uint32, tag = \"3\")]\n    pub open_shard_count: u32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct PersistSuccess {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"5\")]\n    pub replication_position_inclusive: 
::core::option::Option<crate::types::Position>,\n    #[prost(uint32, tag = \"6\")]\n    pub num_persisted_docs: u32,\n    #[prost(message, repeated, tag = \"7\")]\n    pub parse_failures: ::prost::alloc::vec::Vec<super::ParseFailure>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PersistFailure {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"PersistFailureReason\", tag = \"5\")]\n    pub reason: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct SynReplicationMessage {\n    #[prost(oneof = \"syn_replication_message::Message\", tags = \"1, 2, 3\")]\n    pub message: ::core::option::Option<syn_replication_message::Message>,\n}\n/// Nested message and enum types in `SynReplicationMessage`.\npub mod syn_replication_message {\n    #[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n    #[serde(rename_all = \"snake_case\")]\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Message {\n        #[prost(message, tag = \"1\")]\n        OpenRequest(super::OpenReplicationStreamRequest),\n        #[prost(message, tag = \"2\")]\n        InitRequest(super::InitReplicaRequest),\n        #[prost(message, tag = \"3\")]\n        ReplicateRequest(super::ReplicateRequest),\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AckReplicationMessage {\n    #[prost(oneof = \"ack_replication_message::Message\", tags = \"1, 2, 3\")]\n    pub message: ::core::option::Option<ack_replication_message::Message>,\n}\n/// Nested message and enum types in 
`AckReplicationMessage`.\npub mod ack_replication_message {\n    #[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n    #[serde(rename_all = \"snake_case\")]\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Message {\n        #[prost(message, tag = \"1\")]\n        OpenResponse(super::OpenReplicationStreamResponse),\n        #[prost(message, tag = \"2\")]\n        InitResponse(super::InitReplicaResponse),\n        #[prost(message, tag = \"3\")]\n        ReplicateResponse(super::ReplicateResponse),\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct OpenReplicationStreamRequest {\n    #[prost(string, tag = \"1\")]\n    pub leader_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub follower_id: ::prost::alloc::string::String,\n    /// Position of the request in the replication stream.\n    #[prost(uint64, tag = \"3\")]\n    pub replication_seqno: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct OpenReplicationStreamResponse {\n    /// Position of the response in the replication stream. 
It should match the position of the request.\n    #[prost(uint64, tag = \"1\")]\n    pub replication_seqno: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct InitReplicaRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub replica_shard: ::core::option::Option<super::Shard>,\n    #[prost(uint64, tag = \"2\")]\n    pub replication_seqno: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct InitReplicaResponse {\n    #[prost(uint64, tag = \"1\")]\n    pub replication_seqno: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ReplicateRequest {\n    #[prost(string, tag = \"1\")]\n    pub leader_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub follower_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"super::CommitTypeV2\", tag = \"3\")]\n    pub commit_type: i32,\n    #[prost(message, repeated, tag = \"4\")]\n    pub subrequests: ::prost::alloc::vec::Vec<ReplicateSubrequest>,\n    /// Position of the request in the replication stream.\n    #[prost(uint64, tag = \"5\")]\n    pub replication_seqno: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ReplicateSubrequest {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"5\")]\n    pub from_position_exclusive: 
::core::option::Option<crate::types::Position>,\n    #[prost(message, optional, tag = \"6\")]\n    pub doc_batch: ::core::option::Option<super::DocBatchV2>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ReplicateResponse {\n    #[prost(string, tag = \"1\")]\n    pub follower_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"2\")]\n    pub successes: ::prost::alloc::vec::Vec<ReplicateSuccess>,\n    #[prost(message, repeated, tag = \"3\")]\n    pub failures: ::prost::alloc::vec::Vec<ReplicateFailure>,\n    /// Position of the response in the replication stream. It should match the position of the request.\n    #[prost(uint64, tag = \"4\")]\n    pub replication_seqno: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ReplicateSuccess {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"5\")]\n    pub replication_position_inclusive: ::core::option::Option<crate::types::Position>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ReplicateFailure {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    
#[prost(enumeration = \"ReplicateFailureReason\", tag = \"5\")]\n    pub reason: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct TruncateShardsRequest {\n    #[prost(string, tag = \"1\")]\n    pub ingester_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"2\")]\n    pub subrequests: ::prost::alloc::vec::Vec<TruncateShardsSubrequest>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct TruncateShardsSubrequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"3\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    /// The position up to which the shard should be truncated (inclusive).\n    #[prost(message, optional, tag = \"4\")]\n    pub truncate_up_to_position_inclusive: ::core::option::Option<\n        crate::types::Position,\n    >,\n}\n/// TODO\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct TruncateShardsResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct OpenFetchStreamRequest {\n    #[prost(string, tag = \"1\")]\n    pub client_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"5\")]\n    pub from_position_exclusive: 
::core::option::Option<crate::types::Position>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FetchMessage {\n    #[prost(oneof = \"fetch_message::Message\", tags = \"1, 2\")]\n    pub message: ::core::option::Option<fetch_message::Message>,\n}\n/// Nested message and enum types in `FetchMessage`.\npub mod fetch_message {\n    #[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n    #[serde(rename_all = \"snake_case\")]\n    #[derive(Clone, PartialEq, Eq, Hash, ::prost::Oneof)]\n    pub enum Message {\n        #[prost(message, tag = \"1\")]\n        Payload(super::FetchPayload),\n        #[prost(message, tag = \"2\")]\n        Eof(super::FetchEof),\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FetchPayload {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"3\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"4\")]\n    pub mrecord_batch: ::core::option::Option<super::MRecordBatch>,\n    #[prost(message, optional, tag = \"5\")]\n    pub from_position_exclusive: ::core::option::Option<crate::types::Position>,\n    #[prost(message, optional, tag = \"6\")]\n    pub to_position_inclusive: ::core::option::Option<crate::types::Position>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FetchEof {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = 
\"3\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"4\")]\n    pub eof_position: ::core::option::Option<crate::types::Position>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct InitShardsRequest {\n    #[prost(message, repeated, tag = \"2\")]\n    pub subrequests: ::prost::alloc::vec::Vec<InitShardSubrequest>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct InitShardSubrequest {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub shard: ::core::option::Option<super::Shard>,\n    #[prost(string, tag = \"3\")]\n    pub doc_mapping_json: ::prost::alloc::string::String,\n    #[prost(bool, tag = \"4\")]\n    pub validate_docs: bool,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct InitShardsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub successes: ::prost::alloc::vec::Vec<InitShardSuccess>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub failures: ::prost::alloc::vec::Vec<InitShardFailure>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct InitShardSuccess {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub shard: ::core::option::Option<super::Shard>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct InitShardFailure {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = 
\"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    /// InitShardFailureReason reason = 5;\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct CloseShardsRequest {\n    #[prost(message, repeated, tag = \"2\")]\n    pub shard_pkeys: ::prost::alloc::vec::Vec<super::ShardPKey>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct CloseShardsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub successes: ::prost::alloc::vec::Vec<super::ShardPKey>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DecommissionRequest {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DecommissionResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct OpenObservationStreamRequest {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ObservationMessage {\n    #[prost(string, tag = \"1\")]\n    pub node_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"IngesterStatus\", tag = \"2\")]\n    pub status: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum PersistFailureReason {\n    Unspecified = 0,\n    WalFull = 4,\n    Timeout = 5,\n    NoShardsAvailable = 6,\n    NodeUnavailable = 7,\n}\nimpl PersistFailureReason {\n    /// String value of the enum 
field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"PERSIST_FAILURE_REASON_UNSPECIFIED\",\n            Self::WalFull => \"PERSIST_FAILURE_REASON_WAL_FULL\",\n            Self::Timeout => \"PERSIST_FAILURE_REASON_TIMEOUT\",\n            Self::NoShardsAvailable => \"PERSIST_FAILURE_REASON_NO_SHARDS_AVAILABLE\",\n            Self::NodeUnavailable => \"PERSIST_FAILURE_REASON_NODE_UNAVAILABLE\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"PERSIST_FAILURE_REASON_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"PERSIST_FAILURE_REASON_WAL_FULL\" => Some(Self::WalFull),\n            \"PERSIST_FAILURE_REASON_TIMEOUT\" => Some(Self::Timeout),\n            \"PERSIST_FAILURE_REASON_NO_SHARDS_AVAILABLE\" => Some(Self::NoShardsAvailable),\n            \"PERSIST_FAILURE_REASON_NODE_UNAVAILABLE\" => Some(Self::NodeUnavailable),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum ReplicateFailureReason {\n    Unspecified = 0,\n    ShardNotFound = 1,\n    ShardClosed = 2,\n    WalFull = 4,\n}\nimpl ReplicateFailureReason {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match 
self {\n            Self::Unspecified => \"REPLICATE_FAILURE_REASON_UNSPECIFIED\",\n            Self::ShardNotFound => \"REPLICATE_FAILURE_REASON_SHARD_NOT_FOUND\",\n            Self::ShardClosed => \"REPLICATE_FAILURE_REASON_SHARD_CLOSED\",\n            Self::WalFull => \"REPLICATE_FAILURE_REASON_WAL_FULL\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"REPLICATE_FAILURE_REASON_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"REPLICATE_FAILURE_REASON_SHARD_NOT_FOUND\" => Some(Self::ShardNotFound),\n            \"REPLICATE_FAILURE_REASON_SHARD_CLOSED\" => Some(Self::ShardClosed),\n            \"REPLICATE_FAILURE_REASON_WAL_FULL\" => Some(Self::WalFull),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum IngesterStatus {\n    Unspecified = 0,\n    /// The ingester is live but not ready yet to accept requests.\n    Initializing = 1,\n    /// The ingester is ready and accepts read and write requests.\n    Ready = 2,\n    /// The ingester is about to be decommissioned. It still accepts read and write requests, but will not accept write requests in a few seconds and should be avoided by future write requests.\n    Retiring = 6,\n    /// The ingester is being decommissioned. It accepts read requests but rejects write requests\n    /// (open shards, persist, and replicate requests). It will transition to `Decommissioned` once\n    /// all shards are fully indexed.\n    Decommissioning = 3,\n    /// The ingester no longer accepts read and write requests. 
It does not hold any data and can\n    /// be safely removed from the cluster.\n    Decommissioned = 4,\n    /// The ingester failed to initialize and is not ready to accept requests.\n    Failed = 5,\n}\nimpl IngesterStatus {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"INGESTER_STATUS_UNSPECIFIED\",\n            Self::Initializing => \"INGESTER_STATUS_INITIALIZING\",\n            Self::Ready => \"INGESTER_STATUS_READY\",\n            Self::Retiring => \"INGESTER_STATUS_RETIRING\",\n            Self::Decommissioning => \"INGESTER_STATUS_DECOMMISSIONING\",\n            Self::Decommissioned => \"INGESTER_STATUS_DECOMMISSIONED\",\n            Self::Failed => \"INGESTER_STATUS_FAILED\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"INGESTER_STATUS_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"INGESTER_STATUS_INITIALIZING\" => Some(Self::Initializing),\n            \"INGESTER_STATUS_READY\" => Some(Self::Ready),\n            \"INGESTER_STATUS_RETIRING\" => Some(Self::Retiring),\n            \"INGESTER_STATUS_DECOMMISSIONING\" => Some(Self::Decommissioning),\n            \"INGESTER_STATUS_DECOMMISSIONED\" => Some(Self::Decommissioned),\n            \"INGESTER_STATUS_FAILED\" => Some(Self::Failed),\n            _ => None,\n        }\n    }\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for PersistRequest {\n    fn rpc_name() -> &'static str {\n        \"persist\"\n    
}\n}\nimpl RpcName for SynReplicationMessage {\n    fn rpc_name() -> &'static str {\n        \"open_replication_stream\"\n    }\n}\nimpl RpcName for OpenFetchStreamRequest {\n    fn rpc_name() -> &'static str {\n        \"open_fetch_stream\"\n    }\n}\nimpl RpcName for OpenObservationStreamRequest {\n    fn rpc_name() -> &'static str {\n        \"open_observation_stream\"\n    }\n}\nimpl RpcName for InitShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"init_shards\"\n    }\n}\nimpl RpcName for RetainShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"retain_shards\"\n    }\n}\nimpl RpcName for TruncateShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"truncate_shards\"\n    }\n}\nimpl RpcName for CloseShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"close_shards\"\n    }\n}\nimpl RpcName for DecommissionRequest {\n    fn rpc_name() -> &'static str {\n        \"decommission\"\n    }\n}\npub type IngesterServiceStream<T> = quickwit_common::ServiceStream<\n    crate::ingest::IngestV2Result<T>,\n>;\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait IngesterService: std::fmt::Debug + Send + Sync + 'static {\n    ///Persists batches of documents to primary shards hosted on a leader.\n    async fn persist(\n        &self,\n        request: PersistRequest,\n    ) -> crate::ingest::IngestV2Result<PersistResponse>;\n    ///Opens a replication stream from a leader to a follower.\n    async fn open_replication_stream(\n        &self,\n        request: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<AckReplicationMessage>>;\n    ///Streams records from a leader or a follower. 
The client can optionally specify a range of positions to fetch,\n    ///otherwise the stream will go indefinitely or until the shard is closed.\n    async fn open_fetch_stream(\n        &self,\n        request: OpenFetchStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<FetchMessage>>;\n    ///Streams status updates, called \"observations\", from an ingester.\n    async fn open_observation_stream(\n        &self,\n        request: OpenObservationStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<ObservationMessage>>;\n    ///Creates and initializes a set of newly opened shards. This RPC is called by the control plane on leaders.\n    async fn init_shards(\n        &self,\n        request: InitShardsRequest,\n    ) -> crate::ingest::IngestV2Result<InitShardsResponse>;\n    ///Only retain the shards that are listed in the request.\n    ///Other shards are deleted.\n    async fn retain_shards(\n        &self,\n        request: RetainShardsRequest,\n    ) -> crate::ingest::IngestV2Result<RetainShardsResponse>;\n    ///Truncates a set of shards at the given positions. This RPC is called by indexers on leaders AND followers.\n    async fn truncate_shards(\n        &self,\n        request: TruncateShardsRequest,\n    ) -> crate::ingest::IngestV2Result<TruncateShardsResponse>;\n    ///Closes a set of shards. 
This RPC is called by the control plane.\n    async fn close_shards(\n        &self,\n        request: CloseShardsRequest,\n    ) -> crate::ingest::IngestV2Result<CloseShardsResponse>;\n    ///Decommissions the ingester.\n    async fn decommission(\n        &self,\n        request: DecommissionRequest,\n    ) -> crate::ingest::IngestV2Result<DecommissionResponse>;\n}\n#[derive(Debug, Clone)]\npub struct IngesterServiceClient {\n    inner: InnerIngesterServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerIngesterServiceClient(std::sync::Arc<dyn IngesterService>);\nimpl IngesterServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IngesterService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockIngesterService > (),\n            \"`MockIngesterService` must be wrapped in a `MockIngesterServiceWrapper`: use `IngesterServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerIngesterServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> ingester_service_grpc_server::IngesterServiceGrpcServer<\n        IngesterServiceGrpcServerAdapter,\n    > {\n        let adapter = IngesterServiceGrpcServerAdapter::new(self.clone());\n        ingester_service_grpc_server::IngesterServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: 
std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = ingester_service_grpc_client::IngesterServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IngesterServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngesterServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = ingester_service_grpc_client::IngesterServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IngesterServiceGrpcClientAdapter::new(\n            client,\n            
connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IngesterServiceMailbox<A>: IngesterService,\n    {\n        IngesterServiceClient::new(IngesterServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> IngesterServiceTowerLayerStack {\n        IngesterServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockIngesterService) -> Self {\n        let mock_wrapper = mock_ingester_service::MockIngesterServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockIngesterService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl IngesterService for IngesterServiceClient {\n    async fn persist(\n        &self,\n        request: PersistRequest,\n    ) -> crate::ingest::IngestV2Result<PersistResponse> {\n        self.inner.0.persist(request).await\n    }\n    async fn open_replication_stream(\n        &self,\n        request: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<AckReplicationMessage>> {\n        self.inner.0.open_replication_stream(request).await\n    }\n    async fn open_fetch_stream(\n        &self,\n        request: OpenFetchStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<FetchMessage>> {\n        self.inner.0.open_fetch_stream(request).await\n    }\n    async fn open_observation_stream(\n        &self,\n        request: OpenObservationStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<ObservationMessage>> {\n        self.inner.0.open_observation_stream(request).await\n    }\n    async fn init_shards(\n        &self,\n      
  request: InitShardsRequest,\n    ) -> crate::ingest::IngestV2Result<InitShardsResponse> {\n        self.inner.0.init_shards(request).await\n    }\n    async fn retain_shards(\n        &self,\n        request: RetainShardsRequest,\n    ) -> crate::ingest::IngestV2Result<RetainShardsResponse> {\n        self.inner.0.retain_shards(request).await\n    }\n    async fn truncate_shards(\n        &self,\n        request: TruncateShardsRequest,\n    ) -> crate::ingest::IngestV2Result<TruncateShardsResponse> {\n        self.inner.0.truncate_shards(request).await\n    }\n    async fn close_shards(\n        &self,\n        request: CloseShardsRequest,\n    ) -> crate::ingest::IngestV2Result<CloseShardsResponse> {\n        self.inner.0.close_shards(request).await\n    }\n    async fn decommission(\n        &self,\n        request: DecommissionRequest,\n    ) -> crate::ingest::IngestV2Result<DecommissionResponse> {\n        self.inner.0.decommission(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_ingester_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockIngesterServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockIngesterService>,\n    }\n    #[async_trait::async_trait]\n    impl IngesterService for MockIngesterServiceWrapper {\n        async fn persist(\n            &self,\n            request: super::PersistRequest,\n        ) -> crate::ingest::IngestV2Result<super::PersistResponse> {\n            self.inner.lock().await.persist(request).await\n        }\n        async fn open_replication_stream(\n            &self,\n            request: quickwit_common::ServiceStream<super::SynReplicationMessage>,\n        ) -> crate::ingest::IngestV2Result<\n            IngesterServiceStream<super::AckReplicationMessage>,\n        > {\n            self.inner.lock().await.open_replication_stream(request).await\n        }\n        async fn open_fetch_stream(\n            &self,\n            request: 
super::OpenFetchStreamRequest,\n        ) -> crate::ingest::IngestV2Result<IngesterServiceStream<super::FetchMessage>> {\n            self.inner.lock().await.open_fetch_stream(request).await\n        }\n        async fn open_observation_stream(\n            &self,\n            request: super::OpenObservationStreamRequest,\n        ) -> crate::ingest::IngestV2Result<\n            IngesterServiceStream<super::ObservationMessage>,\n        > {\n            self.inner.lock().await.open_observation_stream(request).await\n        }\n        async fn init_shards(\n            &self,\n            request: super::InitShardsRequest,\n        ) -> crate::ingest::IngestV2Result<super::InitShardsResponse> {\n            self.inner.lock().await.init_shards(request).await\n        }\n        async fn retain_shards(\n            &self,\n            request: super::RetainShardsRequest,\n        ) -> crate::ingest::IngestV2Result<super::RetainShardsResponse> {\n            self.inner.lock().await.retain_shards(request).await\n        }\n        async fn truncate_shards(\n            &self,\n            request: super::TruncateShardsRequest,\n        ) -> crate::ingest::IngestV2Result<super::TruncateShardsResponse> {\n            self.inner.lock().await.truncate_shards(request).await\n        }\n        async fn close_shards(\n            &self,\n            request: super::CloseShardsRequest,\n        ) -> crate::ingest::IngestV2Result<super::CloseShardsResponse> {\n            self.inner.lock().await.close_shards(request).await\n        }\n        async fn decommission(\n            &self,\n            request: super::DecommissionRequest,\n        ) -> crate::ingest::IngestV2Result<super::DecommissionResponse> {\n            self.inner.lock().await.decommission(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<PersistRequest> for 
InnerIngesterServiceClient {\n    type Response = PersistResponse;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: PersistRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.persist(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<quickwit_common::ServiceStream<SynReplicationMessage>>\nfor InnerIngesterServiceClient {\n    type Response = IngesterServiceStream<AckReplicationMessage>;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(\n        &mut self,\n        request: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.open_replication_stream(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<OpenFetchStreamRequest> for InnerIngesterServiceClient {\n    type Response = IngesterServiceStream<FetchMessage>;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: OpenFetchStreamRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.open_fetch_stream(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<OpenObservationStreamRequest> for 
InnerIngesterServiceClient {\n    type Response = IngesterServiceStream<ObservationMessage>;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: OpenObservationStreamRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.open_observation_stream(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<InitShardsRequest> for InnerIngesterServiceClient {\n    type Response = InitShardsResponse;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: InitShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.init_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<RetainShardsRequest> for InnerIngesterServiceClient {\n    type Response = RetainShardsResponse;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: RetainShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.retain_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<TruncateShardsRequest> for InnerIngesterServiceClient {\n    type Response = TruncateShardsResponse;\n    type Error = 
crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: TruncateShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.truncate_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<CloseShardsRequest> for InnerIngesterServiceClient {\n    type Response = CloseShardsResponse;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: CloseShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.close_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DecommissionRequest> for InnerIngesterServiceClient {\n    type Response = DecommissionResponse;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DecommissionRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.decommission(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct IngesterServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerIngesterServiceClient,\n    persist_svc: quickwit_common::tower::BoxService<\n        PersistRequest,\n        PersistResponse,\n  
      crate::ingest::IngestV2Error,\n    >,\n    open_replication_stream_svc: quickwit_common::tower::BoxService<\n        quickwit_common::ServiceStream<SynReplicationMessage>,\n        IngesterServiceStream<AckReplicationMessage>,\n        crate::ingest::IngestV2Error,\n    >,\n    open_fetch_stream_svc: quickwit_common::tower::BoxService<\n        OpenFetchStreamRequest,\n        IngesterServiceStream<FetchMessage>,\n        crate::ingest::IngestV2Error,\n    >,\n    open_observation_stream_svc: quickwit_common::tower::BoxService<\n        OpenObservationStreamRequest,\n        IngesterServiceStream<ObservationMessage>,\n        crate::ingest::IngestV2Error,\n    >,\n    init_shards_svc: quickwit_common::tower::BoxService<\n        InitShardsRequest,\n        InitShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    retain_shards_svc: quickwit_common::tower::BoxService<\n        RetainShardsRequest,\n        RetainShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    truncate_shards_svc: quickwit_common::tower::BoxService<\n        TruncateShardsRequest,\n        TruncateShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    close_shards_svc: quickwit_common::tower::BoxService<\n        CloseShardsRequest,\n        CloseShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    decommission_svc: quickwit_common::tower::BoxService<\n        DecommissionRequest,\n        DecommissionResponse,\n        crate::ingest::IngestV2Error,\n    >,\n}\n#[async_trait::async_trait]\nimpl IngesterService for IngesterServiceTowerServiceStack {\n    async fn persist(\n        &self,\n        request: PersistRequest,\n    ) -> crate::ingest::IngestV2Result<PersistResponse> {\n        self.persist_svc.clone().ready().await?.call(request).await\n    }\n    async fn open_replication_stream(\n        &self,\n        request: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> 
crate::ingest::IngestV2Result<IngesterServiceStream<AckReplicationMessage>> {\n        self.open_replication_stream_svc.clone().ready().await?.call(request).await\n    }\n    async fn open_fetch_stream(\n        &self,\n        request: OpenFetchStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<FetchMessage>> {\n        self.open_fetch_stream_svc.clone().ready().await?.call(request).await\n    }\n    async fn open_observation_stream(\n        &self,\n        request: OpenObservationStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<ObservationMessage>> {\n        self.open_observation_stream_svc.clone().ready().await?.call(request).await\n    }\n    async fn init_shards(\n        &self,\n        request: InitShardsRequest,\n    ) -> crate::ingest::IngestV2Result<InitShardsResponse> {\n        self.init_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn retain_shards(\n        &self,\n        request: RetainShardsRequest,\n    ) -> crate::ingest::IngestV2Result<RetainShardsResponse> {\n        self.retain_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn truncate_shards(\n        &self,\n        request: TruncateShardsRequest,\n    ) -> crate::ingest::IngestV2Result<TruncateShardsResponse> {\n        self.truncate_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn close_shards(\n        &self,\n        request: CloseShardsRequest,\n    ) -> crate::ingest::IngestV2Result<CloseShardsResponse> {\n        self.close_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn decommission(\n        &self,\n        request: DecommissionRequest,\n    ) -> crate::ingest::IngestV2Result<DecommissionResponse> {\n        self.decommission_svc.clone().ready().await?.call(request).await\n    }\n}\ntype PersistLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        PersistRequest,\n        
PersistResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    PersistRequest,\n    PersistResponse,\n    crate::ingest::IngestV2Error,\n>;\ntype OpenReplicationStreamLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        quickwit_common::ServiceStream<SynReplicationMessage>,\n        IngesterServiceStream<AckReplicationMessage>,\n        crate::ingest::IngestV2Error,\n    >,\n    quickwit_common::ServiceStream<SynReplicationMessage>,\n    IngesterServiceStream<AckReplicationMessage>,\n    crate::ingest::IngestV2Error,\n>;\ntype OpenFetchStreamLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        OpenFetchStreamRequest,\n        IngesterServiceStream<FetchMessage>,\n        crate::ingest::IngestV2Error,\n    >,\n    OpenFetchStreamRequest,\n    IngesterServiceStream<FetchMessage>,\n    crate::ingest::IngestV2Error,\n>;\ntype OpenObservationStreamLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        OpenObservationStreamRequest,\n        IngesterServiceStream<ObservationMessage>,\n        crate::ingest::IngestV2Error,\n    >,\n    OpenObservationStreamRequest,\n    IngesterServiceStream<ObservationMessage>,\n    crate::ingest::IngestV2Error,\n>;\ntype InitShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        InitShardsRequest,\n        InitShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    InitShardsRequest,\n    InitShardsResponse,\n    crate::ingest::IngestV2Error,\n>;\ntype RetainShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        RetainShardsRequest,\n        RetainShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    RetainShardsRequest,\n    RetainShardsResponse,\n    crate::ingest::IngestV2Error,\n>;\ntype TruncateShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        
TruncateShardsRequest,\n        TruncateShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    TruncateShardsRequest,\n    TruncateShardsResponse,\n    crate::ingest::IngestV2Error,\n>;\ntype CloseShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        CloseShardsRequest,\n        CloseShardsResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    CloseShardsRequest,\n    CloseShardsResponse,\n    crate::ingest::IngestV2Error,\n>;\ntype DecommissionLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DecommissionRequest,\n        DecommissionResponse,\n        crate::ingest::IngestV2Error,\n    >,\n    DecommissionRequest,\n    DecommissionResponse,\n    crate::ingest::IngestV2Error,\n>;\n#[derive(Debug, Default)]\npub struct IngesterServiceTowerLayerStack {\n    persist_layers: Vec<PersistLayer>,\n    open_replication_stream_layers: Vec<OpenReplicationStreamLayer>,\n    open_fetch_stream_layers: Vec<OpenFetchStreamLayer>,\n    open_observation_stream_layers: Vec<OpenObservationStreamLayer>,\n    init_shards_layers: Vec<InitShardsLayer>,\n    retain_shards_layers: Vec<RetainShardsLayer>,\n    truncate_shards_layers: Vec<TruncateShardsLayer>,\n    close_shards_layers: Vec<CloseShardsLayer>,\n    decommission_layers: Vec<DecommissionLayer>,\n}\nimpl IngesterServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    PersistRequest,\n                    PersistResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                PersistRequest,\n                PersistResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n              
  PersistRequest,\n                Response = PersistResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                PersistRequest,\n                PersistResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<PersistRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    quickwit_common::ServiceStream<SynReplicationMessage>,\n                    IngesterServiceStream<AckReplicationMessage>,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                quickwit_common::ServiceStream<SynReplicationMessage>,\n                IngesterServiceStream<AckReplicationMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                quickwit_common::ServiceStream<SynReplicationMessage>,\n                Response = IngesterServiceStream<AckReplicationMessage>,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                quickwit_common::ServiceStream<SynReplicationMessage>,\n                IngesterServiceStream<AckReplicationMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<\n            quickwit_common::ServiceStream<SynReplicationMessage>,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    OpenFetchStreamRequest,\n                    IngesterServiceStream<FetchMessage>,\n                    
crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                OpenFetchStreamRequest,\n                IngesterServiceStream<FetchMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                OpenFetchStreamRequest,\n                Response = IngesterServiceStream<FetchMessage>,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                OpenFetchStreamRequest,\n                IngesterServiceStream<FetchMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<OpenFetchStreamRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    OpenObservationStreamRequest,\n                    IngesterServiceStream<ObservationMessage>,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                OpenObservationStreamRequest,\n                IngesterServiceStream<ObservationMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                OpenObservationStreamRequest,\n                Response = IngesterServiceStream<ObservationMessage>,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                OpenObservationStreamRequest,\n                IngesterServiceStream<ObservationMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as 
tower::Service<\n            OpenObservationStreamRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    InitShardsRequest,\n                    InitShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                InitShardsRequest,\n                InitShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                InitShardsRequest,\n                Response = InitShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                InitShardsRequest,\n                InitShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<InitShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    RetainShardsRequest,\n                    RetainShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                RetainShardsRequest,\n                RetainShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                RetainShardsRequest,\n                Response = RetainShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                RetainShardsRequest,\n                
RetainShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<RetainShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    TruncateShardsRequest,\n                    TruncateShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                TruncateShardsRequest,\n                TruncateShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                TruncateShardsRequest,\n                Response = TruncateShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                TruncateShardsRequest,\n                TruncateShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<TruncateShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    CloseShardsRequest,\n                    CloseShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                CloseShardsRequest,\n                CloseShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                CloseShardsRequest,\n                Response = CloseShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            
quickwit_common::tower::BoxService<\n                CloseShardsRequest,\n                CloseShardsResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<CloseShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DecommissionRequest,\n                    DecommissionResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DecommissionRequest,\n                DecommissionResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                DecommissionRequest,\n                Response = DecommissionResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DecommissionRequest,\n                DecommissionResponse,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<DecommissionRequest>>::Future: Send + 'static,\n    {\n        self.persist_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.open_replication_stream_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.open_fetch_stream_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.open_observation_stream_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.init_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.retain_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.truncate_shards_layers\n     
       .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.close_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.decommission_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_persist_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    PersistRequest,\n                    PersistResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                PersistRequest,\n                Response = PersistResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<PersistRequest>>::Future: Send + 'static,\n    {\n        self.persist_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_open_replication_stream_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    quickwit_common::ServiceStream<SynReplicationMessage>,\n                    IngesterServiceStream<AckReplicationMessage>,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                quickwit_common::ServiceStream<SynReplicationMessage>,\n                Response = IngesterServiceStream<AckReplicationMessage>,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            quickwit_common::ServiceStream<SynReplicationMessage>,\n        >>::Future: Send + 'static,\n    {\n        self.open_replication_stream_layers\n            
.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_open_fetch_stream_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    OpenFetchStreamRequest,\n                    IngesterServiceStream<FetchMessage>,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                OpenFetchStreamRequest,\n                Response = IngesterServiceStream<FetchMessage>,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<OpenFetchStreamRequest>>::Future: Send + 'static,\n    {\n        self.open_fetch_stream_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_open_observation_stream_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    OpenObservationStreamRequest,\n                    IngesterServiceStream<ObservationMessage>,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                OpenObservationStreamRequest,\n                Response = IngesterServiceStream<ObservationMessage>,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            OpenObservationStreamRequest,\n        >>::Future: Send + 'static,\n    {\n        self.open_observation_stream_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_init_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    
InitShardsRequest,\n                    InitShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                InitShardsRequest,\n                Response = InitShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<InitShardsRequest>>::Future: Send + 'static,\n    {\n        self.init_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_retain_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    RetainShardsRequest,\n                    RetainShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                RetainShardsRequest,\n                Response = RetainShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<RetainShardsRequest>>::Future: Send + 'static,\n    {\n        self.retain_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_truncate_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    TruncateShardsRequest,\n                    TruncateShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                TruncateShardsRequest,\n                Response = TruncateShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as 
tower::Service<TruncateShardsRequest>>::Future: Send + 'static,\n    {\n        self.truncate_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_close_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    CloseShardsRequest,\n                    CloseShardsResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                CloseShardsRequest,\n                Response = CloseShardsResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<CloseShardsRequest>>::Future: Send + 'static,\n    {\n        self.close_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_decommission_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DecommissionRequest,\n                    DecommissionResponse,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DecommissionRequest,\n                Response = DecommissionResponse,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<DecommissionRequest>>::Future: Send + 'static,\n    {\n        self.decommission_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> IngesterServiceClient\n    where\n        T: IngesterService,\n    {\n        let inner_client = InnerIngesterServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n 
   pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngesterServiceClient {\n        let client = IngesterServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngesterServiceClient {\n        let client = IngesterServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> IngesterServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IngesterServiceMailbox<A>: IngesterService,\n    {\n        let inner_client = InnerIngesterServiceClient(\n            std::sync::Arc::new(IngesterServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockIngesterService) -> IngesterServiceClient {\n        let client = IngesterServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: 
InnerIngesterServiceClient,\n    ) -> IngesterServiceClient {\n        let persist_svc = self\n            .persist_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let open_replication_stream_svc = self\n            .open_replication_stream_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let open_fetch_stream_svc = self\n            .open_fetch_stream_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let open_observation_stream_svc = self\n            .open_observation_stream_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let init_shards_svc = self\n            .init_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let retain_shards_svc = self\n            .retain_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let truncate_shards_svc = self\n            .truncate_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                
quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let close_shards_svc = self\n            .close_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let decommission_svc = self\n            .decommission_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = IngesterServiceTowerServiceStack {\n            inner: inner_client,\n            persist_svc,\n            open_replication_stream_svc,\n            open_fetch_stream_svc,\n            open_observation_stream_svc,\n            init_shards_svc,\n            retain_shards_svc,\n            truncate_shards_svc,\n            close_shards_svc,\n            decommission_svc,\n        };\n        IngesterServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct IngesterServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::ingest::IngestV2Error>,\n}\nimpl<A: quickwit_actors::Actor> IngesterServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: 
quickwit_actors::Actor> Clone for IngesterServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for IngesterServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::ingest::IngestV2Error: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! 
mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> IngesterService for IngesterServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    IngesterServiceMailbox<\n        A,\n    >: tower::Service<\n            PersistRequest,\n            Response = PersistResponse,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<PersistResponse, crate::ingest::IngestV2Error>,\n        >\n        + tower::Service<\n            quickwit_common::ServiceStream<SynReplicationMessage>,\n            Response = IngesterServiceStream<AckReplicationMessage>,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<\n                IngesterServiceStream<AckReplicationMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >\n        + tower::Service<\n            OpenFetchStreamRequest,\n            Response = IngesterServiceStream<FetchMessage>,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<\n                IngesterServiceStream<FetchMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >\n        + tower::Service<\n            OpenObservationStreamRequest,\n            Response = IngesterServiceStream<ObservationMessage>,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<\n                IngesterServiceStream<ObservationMessage>,\n                crate::ingest::IngestV2Error,\n            >,\n        >\n        + tower::Service<\n            InitShardsRequest,\n            Response = InitShardsResponse,\n            Error = 
crate::ingest::IngestV2Error,\n            Future = BoxFuture<InitShardsResponse, crate::ingest::IngestV2Error>,\n        >\n        + tower::Service<\n            RetainShardsRequest,\n            Response = RetainShardsResponse,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<RetainShardsResponse, crate::ingest::IngestV2Error>,\n        >\n        + tower::Service<\n            TruncateShardsRequest,\n            Response = TruncateShardsResponse,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<TruncateShardsResponse, crate::ingest::IngestV2Error>,\n        >\n        + tower::Service<\n            CloseShardsRequest,\n            Response = CloseShardsResponse,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<CloseShardsResponse, crate::ingest::IngestV2Error>,\n        >\n        + tower::Service<\n            DecommissionRequest,\n            Response = DecommissionResponse,\n            Error = crate::ingest::IngestV2Error,\n            Future = BoxFuture<DecommissionResponse, crate::ingest::IngestV2Error>,\n        >,\n{\n    async fn persist(\n        &self,\n        request: PersistRequest,\n    ) -> crate::ingest::IngestV2Result<PersistResponse> {\n        self.clone().call(request).await\n    }\n    async fn open_replication_stream(\n        &self,\n        request: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<AckReplicationMessage>> {\n        self.clone().call(request).await\n    }\n    async fn open_fetch_stream(\n        &self,\n        request: OpenFetchStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<FetchMessage>> {\n        self.clone().call(request).await\n    }\n    async fn open_observation_stream(\n        &self,\n        request: OpenObservationStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<ObservationMessage>> 
{\n        self.clone().call(request).await\n    }\n    async fn init_shards(\n        &self,\n        request: InitShardsRequest,\n    ) -> crate::ingest::IngestV2Result<InitShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn retain_shards(\n        &self,\n        request: RetainShardsRequest,\n    ) -> crate::ingest::IngestV2Result<RetainShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn truncate_shards(\n        &self,\n        request: TruncateShardsRequest,\n    ) -> crate::ingest::IngestV2Result<TruncateShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn close_shards(\n        &self,\n        request: CloseShardsRequest,\n    ) -> crate::ingest::IngestV2Result<CloseShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn decommission(\n        &self,\n        request: DecommissionRequest,\n    ) -> crate::ingest::IngestV2Result<DecommissionResponse> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct IngesterServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> IngesterServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> IngesterService\nfor IngesterServiceGrpcClientAdapter<\n    ingester_service_grpc_client::IngesterServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    
<T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn persist(\n        &self,\n        request: PersistRequest,\n    ) -> crate::ingest::IngestV2Result<PersistResponse> {\n        self.inner\n            .clone()\n            .persist(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                PersistRequest::rpc_name(),\n            ))\n    }\n    async fn open_replication_stream(\n        &self,\n        request: quickwit_common::ServiceStream<SynReplicationMessage>,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<AckReplicationMessage>> {\n        self.inner\n            .clone()\n            .open_replication_stream(request)\n            .await\n            .map(|response| {\n                let streaming: tonic::Streaming<_> = response.into_inner();\n                let stream = quickwit_common::ServiceStream::from(streaming);\n                stream\n                    .map_err(|status| crate::error::grpc_status_to_service_error(\n                        status,\n                        SynReplicationMessage::rpc_name(),\n                    ))\n            })\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                SynReplicationMessage::rpc_name(),\n            ))\n    }\n    async fn open_fetch_stream(\n        &self,\n        request: OpenFetchStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<FetchMessage>> {\n        self.inner\n            .clone()\n            .open_fetch_stream(request)\n            .await\n            .map(|response| {\n                let streaming: tonic::Streaming<_> = response.into_inner();\n                let stream = quickwit_common::ServiceStream::from(streaming);\n                stream\n                    
.map_err(|status| crate::error::grpc_status_to_service_error(\n                        status,\n                        OpenFetchStreamRequest::rpc_name(),\n                    ))\n            })\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                OpenFetchStreamRequest::rpc_name(),\n            ))\n    }\n    async fn open_observation_stream(\n        &self,\n        request: OpenObservationStreamRequest,\n    ) -> crate::ingest::IngestV2Result<IngesterServiceStream<ObservationMessage>> {\n        self.inner\n            .clone()\n            .open_observation_stream(request)\n            .await\n            .map(|response| {\n                let streaming: tonic::Streaming<_> = response.into_inner();\n                let stream = quickwit_common::ServiceStream::from(streaming);\n                stream\n                    .map_err(|status| crate::error::grpc_status_to_service_error(\n                        status,\n                        OpenObservationStreamRequest::rpc_name(),\n                    ))\n            })\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                OpenObservationStreamRequest::rpc_name(),\n            ))\n    }\n    async fn init_shards(\n        &self,\n        request: InitShardsRequest,\n    ) -> crate::ingest::IngestV2Result<InitShardsResponse> {\n        self.inner\n            .clone()\n            .init_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                InitShardsRequest::rpc_name(),\n            ))\n    }\n    async fn retain_shards(\n        &self,\n        request: RetainShardsRequest,\n    ) -> crate::ingest::IngestV2Result<RetainShardsResponse> {\n        self.inner\n            .clone()\n            .retain_shards(request)\n            .await\n     
       .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                RetainShardsRequest::rpc_name(),\n            ))\n    }\n    async fn truncate_shards(\n        &self,\n        request: TruncateShardsRequest,\n    ) -> crate::ingest::IngestV2Result<TruncateShardsResponse> {\n        self.inner\n            .clone()\n            .truncate_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                TruncateShardsRequest::rpc_name(),\n            ))\n    }\n    async fn close_shards(\n        &self,\n        request: CloseShardsRequest,\n    ) -> crate::ingest::IngestV2Result<CloseShardsResponse> {\n        self.inner\n            .clone()\n            .close_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                CloseShardsRequest::rpc_name(),\n            ))\n    }\n    async fn decommission(\n        &self,\n        request: DecommissionRequest,\n    ) -> crate::ingest::IngestV2Result<DecommissionResponse> {\n        self.inner\n            .clone()\n            .decommission(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                DecommissionRequest::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct IngesterServiceGrpcServerAdapter {\n    inner: InnerIngesterServiceClient,\n}\nimpl IngesterServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IngesterService,\n    {\n        Self {\n            inner: InnerIngesterServiceClient(std::sync::Arc::new(instance)),\n        }\n    
}\n}\n#[async_trait::async_trait]\nimpl ingester_service_grpc_server::IngesterServiceGrpc\nfor IngesterServiceGrpcServerAdapter {\n    async fn persist(\n        &self,\n        request: tonic::Request<PersistRequest>,\n    ) -> Result<tonic::Response<PersistResponse>, tonic::Status> {\n        self.inner\n            .0\n            .persist(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    type OpenReplicationStreamStream = quickwit_common::ServiceStream<\n        tonic::Result<AckReplicationMessage>,\n    >;\n    async fn open_replication_stream(\n        &self,\n        request: tonic::Request<tonic::Streaming<SynReplicationMessage>>,\n    ) -> Result<tonic::Response<Self::OpenReplicationStreamStream>, tonic::Status> {\n        self.inner\n            .0\n            .open_replication_stream({\n                let streaming: tonic::Streaming<_> = request.into_inner();\n                quickwit_common::ServiceStream::from(streaming)\n            })\n            .await\n            .map(|stream| tonic::Response::new(\n                stream.map_err(crate::error::grpc_error_to_grpc_status),\n            ))\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    type OpenFetchStreamStream = quickwit_common::ServiceStream<\n        tonic::Result<FetchMessage>,\n    >;\n    async fn open_fetch_stream(\n        &self,\n        request: tonic::Request<OpenFetchStreamRequest>,\n    ) -> Result<tonic::Response<Self::OpenFetchStreamStream>, tonic::Status> {\n        self.inner\n            .0\n            .open_fetch_stream(request.into_inner())\n            .await\n            .map(|stream| tonic::Response::new(\n                stream.map_err(crate::error::grpc_error_to_grpc_status),\n            ))\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    type OpenObservationStreamStream = quickwit_common::ServiceStream<\n       
 tonic::Result<ObservationMessage>,\n    >;\n    async fn open_observation_stream(\n        &self,\n        request: tonic::Request<OpenObservationStreamRequest>,\n    ) -> Result<tonic::Response<Self::OpenObservationStreamStream>, tonic::Status> {\n        self.inner\n            .0\n            .open_observation_stream(request.into_inner())\n            .await\n            .map(|stream| tonic::Response::new(\n                stream.map_err(crate::error::grpc_error_to_grpc_status),\n            ))\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn init_shards(\n        &self,\n        request: tonic::Request<InitShardsRequest>,\n    ) -> Result<tonic::Response<InitShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .init_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn retain_shards(\n        &self,\n        request: tonic::Request<RetainShardsRequest>,\n    ) -> Result<tonic::Response<RetainShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .retain_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn truncate_shards(\n        &self,\n        request: tonic::Request<TruncateShardsRequest>,\n    ) -> Result<tonic::Response<TruncateShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .truncate_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn close_shards(\n        &self,\n        request: tonic::Request<CloseShardsRequest>,\n    ) -> Result<tonic::Response<CloseShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .close_shards(request.into_inner())\n   
         .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn decommission(\n        &self,\n        request: tonic::Request<DecommissionRequest>,\n    ) -> Result<tonic::Response<DecommissionResponse>, tonic::Status> {\n        self.inner\n            .0\n            .decommission(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod ingester_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct IngesterServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl IngesterServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> IngesterServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, 
origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> IngesterServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            IngesterServiceGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = 
self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Persists batches of documents to primary shards hosted on a leader.\n        pub async fn persist(\n            &mut self,\n            request: impl tonic::IntoRequest<super::PersistRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::PersistResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/Persist\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"Persist\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Opens a replication stream from a leader to a follower.\n        pub async fn open_replication_stream(\n            &mut self,\n            request: impl tonic::IntoStreamingRequest<\n                Message = super::SynReplicationMessage,\n            >,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::AckReplicationMessage>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = 
tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/OpenReplicationStream\",\n            );\n            let mut req = request.into_streaming_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"OpenReplicationStream\",\n                    ),\n                );\n            self.inner.streaming(req, path, codec).await\n        }\n        /// Streams records from a leader or a follower. The client can optionally specify a range of positions to fetch,\n        /// otherwise the stream will go indefinitely or until the shard is closed.\n        pub async fn open_fetch_stream(\n            &mut self,\n            request: impl tonic::IntoRequest<super::OpenFetchStreamRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::FetchMessage>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/OpenFetchStream\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"OpenFetchStream\",\n                    ),\n                );\n            self.inner.server_streaming(req, path, codec).await\n        }\n        /// Streams 
status updates, called \"observations\", from an ingester.\n        pub async fn open_observation_stream(\n            &mut self,\n            request: impl tonic::IntoRequest<super::OpenObservationStreamRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::ObservationMessage>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/OpenObservationStream\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"OpenObservationStream\",\n                    ),\n                );\n            self.inner.server_streaming(req, path, codec).await\n        }\n        /// Creates and initializes a set of newly opened shards. 
This RPC is called by the control plane on leaders.\n        pub async fn init_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::InitShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::InitShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/InitShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"InitShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Only retain the shards that are listed in the request.\n        /// Other shards are deleted.\n        pub async fn retain_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::RetainShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::RetainShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                
\"/quickwit.ingest.ingester.IngesterService/RetainShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"RetainShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Truncates a set of shards at the given positions. This RPC is called by indexers on leaders AND followers.\n        pub async fn truncate_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::TruncateShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::TruncateShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/TruncateShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"TruncateShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Closes a set of shards. 
This RPC is called by the control plane.\n        pub async fn close_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::CloseShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CloseShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/CloseShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"CloseShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Decommissions the ingester.\n        pub async fn decommission(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DecommissionRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::DecommissionResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.ingester.IngesterService/Decommission\",\n            );\n            let mut req = 
request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.ingester.IngesterService\",\n                        \"Decommission\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod ingester_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with IngesterServiceGrpcServer.\n    #[async_trait]\n    pub trait IngesterServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Persists batches of documents to primary shards hosted on a leader.\n        async fn persist(\n            &self,\n            request: tonic::Request<super::PersistRequest>,\n        ) -> std::result::Result<tonic::Response<super::PersistResponse>, tonic::Status>;\n        /// Server streaming response type for the OpenReplicationStream method.\n        type OpenReplicationStreamStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::AckReplicationMessage, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// Opens a replication stream from a leader to a follower.\n        async fn open_replication_stream(\n            &self,\n            request: tonic::Request<tonic::Streaming<super::SynReplicationMessage>>,\n        ) -> std::result::Result<\n            tonic::Response<Self::OpenReplicationStreamStream>,\n            tonic::Status,\n        >;\n        /// Server streaming response type for the OpenFetchStream method.\n        type OpenFetchStreamStream: tonic::codegen::tokio_stream::Stream<\n                Item = 
std::result::Result<super::FetchMessage, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// Streams records from a leader or a follower. The client can optionally specify a range of positions to fetch,\n        /// otherwise the stream will go indefinitely or until the shard is closed.\n        async fn open_fetch_stream(\n            &self,\n            request: tonic::Request<super::OpenFetchStreamRequest>,\n        ) -> std::result::Result<\n            tonic::Response<Self::OpenFetchStreamStream>,\n            tonic::Status,\n        >;\n        /// Server streaming response type for the OpenObservationStream method.\n        type OpenObservationStreamStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::ObservationMessage, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// Streams status updates, called \"observations\", from an ingester.\n        async fn open_observation_stream(\n            &self,\n            request: tonic::Request<super::OpenObservationStreamRequest>,\n        ) -> std::result::Result<\n            tonic::Response<Self::OpenObservationStreamStream>,\n            tonic::Status,\n        >;\n        /// Creates and initializes a set of newly opened shards. 
This RPC is called by the control plane on leaders.\n        async fn init_shards(\n            &self,\n            request: tonic::Request<super::InitShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::InitShardsResponse>,\n            tonic::Status,\n        >;\n        /// Only retain the shards that are listed in the request.\n        /// Other shards are deleted.\n        async fn retain_shards(\n            &self,\n            request: tonic::Request<super::RetainShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::RetainShardsResponse>,\n            tonic::Status,\n        >;\n        /// Truncates a set of shards at the given positions. This RPC is called by indexers on leaders AND followers.\n        async fn truncate_shards(\n            &self,\n            request: tonic::Request<super::TruncateShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::TruncateShardsResponse>,\n            tonic::Status,\n        >;\n        /// Closes a set of shards. 
This RPC is called by the control plane.\n        async fn close_shards(\n            &self,\n            request: tonic::Request<super::CloseShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CloseShardsResponse>,\n            tonic::Status,\n        >;\n        /// Decommissions the ingester.\n        async fn decommission(\n            &self,\n            request: tonic::Request<super::DecommissionRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::DecommissionResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct IngesterServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> IngesterServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses 
with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for IngesterServiceGrpcServer<T>\n    where\n        T: IngesterServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.ingest.ingester.IngesterService/Persist\" => {\n                    #[allow(non_camel_case_types)]\n                    struct PersistSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::UnaryService<super::PersistRequest>\n                    for PersistSvc<T> {\n                        type Response = 
super::PersistResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::PersistRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::persist(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = PersistSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                
\"/quickwit.ingest.ingester.IngesterService/OpenReplicationStream\" => {\n                    #[allow(non_camel_case_types)]\n                    struct OpenReplicationStreamSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::StreamingService<super::SynReplicationMessage>\n                    for OpenReplicationStreamSvc<T> {\n                        type Response = super::AckReplicationMessage;\n                        type ResponseStream = T::OpenReplicationStreamStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                tonic::Streaming<super::SynReplicationMessage>,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::open_replication_stream(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = 
async move {\n                        let method = OpenReplicationStreamSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.ingest.ingester.IngesterService/OpenFetchStream\" => {\n                    #[allow(non_camel_case_types)]\n                    struct OpenFetchStreamSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::ServerStreamingService<\n                        super::OpenFetchStreamRequest,\n                    > for OpenFetchStreamSvc<T> {\n                        type Response = super::FetchMessage;\n                        type ResponseStream = T::OpenFetchStreamStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::OpenFetchStreamRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as 
IngesterServiceGrpc>::open_fetch_stream(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = OpenFetchStreamSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.ingest.ingester.IngesterService/OpenObservationStream\" => {\n                    #[allow(non_camel_case_types)]\n                    struct OpenObservationStreamSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::ServerStreamingService<\n              
          super::OpenObservationStreamRequest,\n                    > for OpenObservationStreamSvc<T> {\n                        type Response = super::ObservationMessage;\n                        type ResponseStream = T::OpenObservationStreamStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::OpenObservationStreamRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::open_observation_stream(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = OpenObservationStreamSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                
send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.ingest.ingester.IngesterService/InitShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct InitShardsSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::UnaryService<super::InitShardsRequest>\n                    for InitShardsSvc<T> {\n                        type Response = super::InitShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::InitShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::init_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let 
max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = InitShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.ingest.ingester.IngesterService/RetainShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct RetainShardsSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::UnaryService<super::RetainShardsRequest>\n                    for RetainShardsSvc<T> {\n                        type Response = super::RetainShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::RetainShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as 
IngesterServiceGrpc>::retain_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = RetainShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.ingest.ingester.IngesterService/TruncateShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct TruncateShardsSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::UnaryService<super::TruncateShardsRequest>\n                    for TruncateShardsSvc<T> {\n                        type Response = super::TruncateShardsResponse;\n                        type 
Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::TruncateShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::truncate_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = TruncateShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n          
      \"/quickwit.ingest.ingester.IngesterService/CloseShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CloseShardsSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::UnaryService<super::CloseShardsRequest>\n                    for CloseShardsSvc<T> {\n                        type Response = super::CloseShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::CloseShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::close_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = CloseShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n  
                              send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.ingest.ingester.IngesterService/Decommission\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DecommissionSvc<T: IngesterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngesterServiceGrpc,\n                    > tonic::server::UnaryService<super::DecommissionRequest>\n                    for DecommissionSvc<T> {\n                        type Response = super::DecommissionResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DecommissionRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngesterServiceGrpc>::decommission(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n             
       let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DecommissionSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for IngesterServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self 
{\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.ingest.ingester.IngesterService\";\n    impl<T> tonic::server::NamedService for IngesterServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.ingest.router.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IngestRequestV2 {\n    /// There should be at most one subrequest per index per request.\n    #[prost(message, repeated, tag = \"1\")]\n    pub subrequests: ::prost::alloc::vec::Vec<IngestSubrequest>,\n    #[prost(enumeration = \"super::CommitTypeV2\", tag = \"2\")]\n    pub commit_type: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IngestSubrequest {\n    /// The subrequest ID is used to identify the various subrequests and responses\n    /// (ingest, persist, replicate) at play during the ingest and replication\n    /// process.\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(string, tag = \"2\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub doc_batch: ::core::option::Option<super::DocBatchV2>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IngestResponseV2 {\n    #[prost(message, repeated, tag = \"1\")]\n    pub successes: ::prost::alloc::vec::Vec<IngestSuccess>,\n    #[prost(message, repeated, tag = \"2\")]\n    pub failures: ::prost::alloc::vec::Vec<IngestFailure>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IngestSuccess {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: 
::core::option::Option<crate::types::ShardId>,\n    /// Replication position inclusive.\n    #[prost(message, optional, tag = \"5\")]\n    pub replication_position_inclusive: ::core::option::Option<crate::types::Position>,\n    #[prost(uint32, tag = \"6\")]\n    pub num_ingested_docs: u32,\n    #[prost(message, repeated, tag = \"7\")]\n    pub parse_failures: ::prost::alloc::vec::Vec<super::ParseFailure>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IngestFailure {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(string, tag = \"2\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"IngestFailureReason\", tag = \"5\")]\n    pub reason: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum IngestFailureReason {\n    Unspecified = 0,\n    IndexNotFound = 1,\n    SourceNotFound = 2,\n    Internal = 3,\n    NoShardsAvailable = 4,\n    ShardRateLimited = 5,\n    WalFull = 6,\n    Timeout = 7,\n    RouterLoadShedding = 8,\n    LoadShedding = 9,\n    CircuitBreaker = 10,\n}\nimpl IngestFailureReason {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"INGEST_FAILURE_REASON_UNSPECIFIED\",\n            Self::IndexNotFound => \"INGEST_FAILURE_REASON_INDEX_NOT_FOUND\",\n            Self::SourceNotFound => \"INGEST_FAILURE_REASON_SOURCE_NOT_FOUND\",\n            
Self::Internal => \"INGEST_FAILURE_REASON_INTERNAL\",\n            Self::NoShardsAvailable => \"INGEST_FAILURE_REASON_NO_SHARDS_AVAILABLE\",\n            Self::ShardRateLimited => \"INGEST_FAILURE_REASON_SHARD_RATE_LIMITED\",\n            Self::WalFull => \"INGEST_FAILURE_REASON_WAL_FULL\",\n            Self::Timeout => \"INGEST_FAILURE_REASON_TIMEOUT\",\n            Self::RouterLoadShedding => \"INGEST_FAILURE_REASON_ROUTER_LOAD_SHEDDING\",\n            Self::LoadShedding => \"INGEST_FAILURE_REASON_LOAD_SHEDDING\",\n            Self::CircuitBreaker => \"INGEST_FAILURE_REASON_CIRCUIT_BREAKER\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"INGEST_FAILURE_REASON_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"INGEST_FAILURE_REASON_INDEX_NOT_FOUND\" => Some(Self::IndexNotFound),\n            \"INGEST_FAILURE_REASON_SOURCE_NOT_FOUND\" => Some(Self::SourceNotFound),\n            \"INGEST_FAILURE_REASON_INTERNAL\" => Some(Self::Internal),\n            \"INGEST_FAILURE_REASON_NO_SHARDS_AVAILABLE\" => Some(Self::NoShardsAvailable),\n            \"INGEST_FAILURE_REASON_SHARD_RATE_LIMITED\" => Some(Self::ShardRateLimited),\n            \"INGEST_FAILURE_REASON_WAL_FULL\" => Some(Self::WalFull),\n            \"INGEST_FAILURE_REASON_TIMEOUT\" => Some(Self::Timeout),\n            \"INGEST_FAILURE_REASON_ROUTER_LOAD_SHEDDING\" => {\n                Some(Self::RouterLoadShedding)\n            }\n            \"INGEST_FAILURE_REASON_LOAD_SHEDDING\" => Some(Self::LoadShedding),\n            \"INGEST_FAILURE_REASON_CIRCUIT_BREAKER\" => Some(Self::CircuitBreaker),\n            _ => None,\n        }\n    }\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for IngestRequestV2 {\n    fn rpc_name() -> 
&'static str {\n        \"ingest\"\n    }\n}\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait IngestRouterService: std::fmt::Debug + Send + Sync + 'static {\n    ///Ingests batches of documents for one or multiple indexes.\n    ///TODO: Describe error cases and how to handle them.\n    async fn ingest(\n        &self,\n        request: IngestRequestV2,\n    ) -> crate::ingest::IngestV2Result<IngestResponseV2>;\n}\n#[derive(Debug, Clone)]\npub struct IngestRouterServiceClient {\n    inner: InnerIngestRouterServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerIngestRouterServiceClient(std::sync::Arc<dyn IngestRouterService>);\nimpl IngestRouterServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IngestRouterService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockIngestRouterService > (),\n            \"`MockIngestRouterService` must be wrapped in a `MockIngestRouterServiceWrapper`: use `IngestRouterServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerIngestRouterServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> ingest_router_service_grpc_server::IngestRouterServiceGrpcServer<\n        IngestRouterServiceGrpcServerAdapter,\n    > {\n        let adapter = IngestRouterServiceGrpcServerAdapter::new(self.clone());\n        ingest_router_service_grpc_server::IngestRouterServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            
.max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = ingest_router_service_grpc_client::IngestRouterServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IngestRouterServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngestRouterServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = ingest_router_service_grpc_client::IngestRouterServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                
.accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = IngestRouterServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IngestRouterServiceMailbox<A>: IngestRouterService,\n    {\n        IngestRouterServiceClient::new(IngestRouterServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> IngestRouterServiceTowerLayerStack {\n        IngestRouterServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockIngestRouterService) -> Self {\n        let mock_wrapper = mock_ingest_router_service::MockIngestRouterServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockIngestRouterService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl IngestRouterService for IngestRouterServiceClient {\n    async fn ingest(\n        &self,\n        request: IngestRequestV2,\n    ) -> crate::ingest::IngestV2Result<IngestResponseV2> {\n        self.inner.0.ingest(request).await\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_ingest_router_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockIngestRouterServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockIngestRouterService>,\n    }\n    #[async_trait::async_trait]\n    impl IngestRouterService for MockIngestRouterServiceWrapper {\n        async fn ingest(\n            &self,\n            request: super::IngestRequestV2,\n        ) -> crate::ingest::IngestV2Result<super::IngestResponseV2> {\n            
self.inner.lock().await.ingest(request).await\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<IngestRequestV2> for InnerIngestRouterServiceClient {\n    type Response = IngestResponseV2;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: IngestRequestV2) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.ingest(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct IngestRouterServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerIngestRouterServiceClient,\n    ingest_svc: quickwit_common::tower::BoxService<\n        IngestRequestV2,\n        IngestResponseV2,\n        crate::ingest::IngestV2Error,\n    >,\n}\n#[async_trait::async_trait]\nimpl IngestRouterService for IngestRouterServiceTowerServiceStack {\n    async fn ingest(\n        &self,\n        request: IngestRequestV2,\n    ) -> crate::ingest::IngestV2Result<IngestResponseV2> {\n        self.ingest_svc.clone().ready().await?.call(request).await\n    }\n}\ntype IngestLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        IngestRequestV2,\n        IngestResponseV2,\n        crate::ingest::IngestV2Error,\n    >,\n    IngestRequestV2,\n    IngestResponseV2,\n    crate::ingest::IngestV2Error,\n>;\n#[derive(Debug, Default)]\npub struct IngestRouterServiceTowerLayerStack {\n    ingest_layers: Vec<IngestLayer>,\n}\nimpl IngestRouterServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                
quickwit_common::tower::BoxService<\n                    IngestRequestV2,\n                    IngestResponseV2,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IngestRequestV2,\n                IngestResponseV2,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service: tower::Service<\n                IngestRequestV2,\n                Response = IngestResponseV2,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IngestRequestV2,\n                IngestResponseV2,\n                crate::ingest::IngestV2Error,\n            >,\n        >>::Service as tower::Service<IngestRequestV2>>::Future: Send + 'static,\n    {\n        self.ingest_layers.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_ingest_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IngestRequestV2,\n                    IngestResponseV2,\n                    crate::ingest::IngestV2Error,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                IngestRequestV2,\n                Response = IngestResponseV2,\n                Error = crate::ingest::IngestV2Error,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<IngestRequestV2>>::Future: Send + 'static,\n    {\n        self.ingest_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> IngestRouterServiceClient\n    where\n        T: IngestRouterService,\n    {\n        let inner_client = 
InnerIngestRouterServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngestRouterServiceClient {\n        let client = IngestRouterServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> IngestRouterServiceClient {\n        let client = IngestRouterServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> IngestRouterServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        IngestRouterServiceMailbox<A>: IngestRouterService,\n    {\n        let inner_client = InnerIngestRouterServiceClient(\n            std::sync::Arc::new(IngestRouterServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(\n        self,\n        mock: MockIngestRouterService,\n    ) -> IngestRouterServiceClient {\n        let client = 
IngestRouterServiceClient::from_mock(mock);\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerIngestRouterServiceClient,\n    ) -> IngestRouterServiceClient {\n        let ingest_svc = self\n            .ingest_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = IngestRouterServiceTowerServiceStack {\n            inner: inner_client,\n            ingest_svc,\n        };\n        IngestRouterServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct IngestRouterServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::ingest::IngestV2Error>,\n}\nimpl<A: quickwit_actors::Actor> IngestRouterServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for IngestRouterServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for IngestRouterServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + 
quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::ingest::IngestV2Error: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::ingest::IngestV2Error;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> IngestRouterService for IngestRouterServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    IngestRouterServiceMailbox<\n        A,\n    >: tower::Service<\n        IngestRequestV2,\n        Response = IngestResponseV2,\n        Error = crate::ingest::IngestV2Error,\n        Future = BoxFuture<IngestResponseV2, crate::ingest::IngestV2Error>,\n    >,\n{\n    async fn ingest(\n        &self,\n        request: IngestRequestV2,\n    ) -> crate::ingest::IngestV2Result<IngestResponseV2> {\n        self.clone().call(request).await\n    }\n}\n#[derive(Debug, Clone)]\npub struct IngestRouterServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> IngestRouterServiceGrpcClientAdapter<T> {\n    pub 
fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> IngestRouterService\nfor IngestRouterServiceGrpcClientAdapter<\n    ingest_router_service_grpc_client::IngestRouterServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn ingest(\n        &self,\n        request: IngestRequestV2,\n    ) -> crate::ingest::IngestV2Result<IngestResponseV2> {\n        self.inner\n            .clone()\n            .ingest(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                IngestRequestV2::rpc_name(),\n            ))\n    }\n}\n#[derive(Debug)]\npub struct IngestRouterServiceGrpcServerAdapter {\n    inner: InnerIngestRouterServiceClient,\n}\nimpl IngestRouterServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: IngestRouterService,\n    {\n        Self {\n            inner: InnerIngestRouterServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl ingest_router_service_grpc_server::IngestRouterServiceGrpc\nfor IngestRouterServiceGrpcServerAdapter {\n    async fn ingest(\n        &self,\n        request: tonic::Request<IngestRequestV2>,\n    ) -> Result<tonic::Response<IngestResponseV2>, tonic::Status> {\n        self.inner\n            .0\n            .ingest(request.into_inner())\n   
         .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod ingest_router_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct IngestRouterServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl IngestRouterServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> IngestRouterServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> IngestRouterServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                
http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            IngestRouterServiceGrpcClient::new(\n                InterceptedService::new(inner, interceptor),\n            )\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Ingests batches of documents for one or multiple indexes.\n        /// TODO: Describe error cases and how to handle them.\n        pub async fn ingest(\n            &mut self,\n            request: impl 
tonic::IntoRequest<super::IngestRequestV2>,\n        ) -> std::result::Result<\n            tonic::Response<super::IngestResponseV2>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.ingest.router.IngestRouterService/Ingest\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.ingest.router.IngestRouterService\",\n                        \"Ingest\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod ingest_router_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with IngestRouterServiceGrpcServer.\n    #[async_trait]\n    pub trait IngestRouterServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Ingests batches of documents for one or multiple indexes.\n        /// TODO: Describe error cases and how to handle them.\n        async fn ingest(\n            &self,\n            request: tonic::Request<super::IngestRequestV2>,\n        ) -> std::result::Result<\n            tonic::Response<super::IngestResponseV2>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct IngestRouterServiceGrpcServer<T> {\n       
 inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> IngestRouterServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n 
       pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for IngestRouterServiceGrpcServer<T>\n    where\n        T: IngestRouterServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.ingest.router.IngestRouterService/Ingest\" => {\n                    #[allow(non_camel_case_types)]\n                    struct IngestSvc<T: IngestRouterServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: IngestRouterServiceGrpc,\n                    > tonic::server::UnaryService<super::IngestRequestV2>\n                    for IngestSvc<T> {\n                        type Response = super::IngestResponseV2;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::IngestRequestV2>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as IngestRouterServiceGrpc>::ingest(&inner, request)\n                                    
.await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = IngestSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                          
      http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for IngestRouterServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.ingest.router.IngestRouterService\";\n    impl<T> tonic::server::NamedService for IngestRouterServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.ingest.rs",
    "content": "// This file is @generated by prost-build.\n/// Shard primary key.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ShardPKey {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"3\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct DocBatchV2 {\n    #[prost(bytes = \"bytes\", tag = \"1\")]\n    pub doc_buffer: ::prost::bytes::Bytes,\n    #[prost(uint32, repeated, tag = \"2\")]\n    pub doc_lengths: ::prost::alloc::vec::Vec<u32>,\n    #[prost(message, repeated, tag = \"3\")]\n    pub doc_uids: ::prost::alloc::vec::Vec<crate::types::DocUid>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct MRecordBatch {\n    /// Buffer of encoded and then concatenated mrecords.\n    #[prost(bytes = \"bytes\", tag = \"1\")]\n    pub mrecord_buffer: ::prost::bytes::Bytes,\n    /// Lengths of the mrecords in the buffer.\n    #[prost(uint32, repeated, tag = \"2\")]\n    pub mrecord_lengths: ::prost::alloc::vec::Vec<u32>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct Shard {\n    /// Immutable fields\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"3\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    /// The node ID of the ingester to which all the write requests 
for this shard should be sent to.\n    #[prost(string, tag = \"4\")]\n    pub leader_id: ::prost::alloc::string::String,\n    /// The node ID of the ingester holding a copy of the data.\n    #[prost(string, optional, tag = \"5\")]\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub follower_id: ::core::option::Option<::prost::alloc::string::String>,\n    /// Mutable fields\n    #[prost(enumeration = \"ShardState\", tag = \"8\")]\n    pub shard_state: i32,\n    /// Position up to which indexers have indexed and published the records stored in the shard.\n    /// It is updated asynchronously in a best effort manner by the indexers and indicates the position up to which the log can be safely truncated.\n    #[prost(message, optional, tag = \"9\")]\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub publish_position_inclusive: ::core::option::Option<crate::types::Position>,\n    /// A publish token that ensures only one indexer works on a given shard at a time.\n    /// For instance, if an indexer goes rogue, eventually the control plane will detect it and assign the shard to another indexer, which will override the publish token.\n    #[prost(string, optional, tag = \"10\")]\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub publish_token: ::core::option::Option<::prost::alloc::string::String>,\n    /// The UID of the index doc mapping when the shard was created.\n    #[prost(message, optional, tag = \"11\")]\n    pub doc_mapping_uid: ::core::option::Option<crate::types::DocMappingUid>,\n    /// Time when the shard was last updated\n    #[prost(int64, tag = \"12\")]\n    #[serde(default = \"super::compatibility_shard_update_timestamp\")]\n    pub update_timestamp: i64,\n}\n/// A group of shards belonging to the same index and source.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ShardIds {\n    #[prost(message, optional, 
tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub shard_ids: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ShardIdPositions {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub shard_positions: ::prost::alloc::vec::Vec<ShardIdPosition>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ShardIdPosition {\n    #[prost(message, optional, tag = \"1\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(message, optional, tag = \"2\")]\n    pub publish_position_inclusive: ::core::option::Option<crate::types::Position>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ParseFailure {\n    #[prost(message, optional, tag = \"1\")]\n    pub doc_uid: ::core::option::Option<crate::types::DocUid>,\n    #[prost(enumeration = \"ParseFailureReason\", tag = \"2\")]\n    pub reason: i32,\n    #[prost(string, tag = \"3\")]\n    pub message: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum CommitTypeV2 {\n    Unspecified = 0,\n    Auto = 1,\n    WaitFor = 2,\n    Force = 3,\n}\nimpl CommitTypeV2 {\n    /// String value of the enum field names used in the ProtoBuf definition.\n  
  ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"COMMIT_TYPE_V2_UNSPECIFIED\",\n            Self::Auto => \"COMMIT_TYPE_V2_AUTO\",\n            Self::WaitFor => \"COMMIT_TYPE_V2_WAIT_FOR\",\n            Self::Force => \"COMMIT_TYPE_V2_FORCE\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"COMMIT_TYPE_V2_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"COMMIT_TYPE_V2_AUTO\" => Some(Self::Auto),\n            \"COMMIT_TYPE_V2_WAIT_FOR\" => Some(Self::WaitFor),\n            \"COMMIT_TYPE_V2_FORCE\" => Some(Self::Force),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum ShardState {\n    Unspecified = 0,\n    /// The shard is open and accepts write requests.\n    Open = 1,\n    /// The ingester hosting the shard is unavailable.\n    Unavailable = 2,\n    /// The shard is closed and cannot be written to.\n    /// It can be safely deleted if the publish position is superior or equal to `~eof`.\n    Closed = 3,\n}\nimpl ShardState {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"SHARD_STATE_UNSPECIFIED\",\n            Self::Open => \"SHARD_STATE_OPEN\",\n         
   Self::Unavailable => \"SHARD_STATE_UNAVAILABLE\",\n            Self::Closed => \"SHARD_STATE_CLOSED\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"SHARD_STATE_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"SHARD_STATE_OPEN\" => Some(Self::Open),\n            \"SHARD_STATE_UNAVAILABLE\" => Some(Self::Unavailable),\n            \"SHARD_STATE_CLOSED\" => Some(Self::Closed),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum ParseFailureReason {\n    Unspecified = 0,\n    InvalidJson = 1,\n    InvalidSchema = 2,\n}\nimpl ParseFailureReason {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"PARSE_FAILURE_REASON_UNSPECIFIED\",\n            Self::InvalidJson => \"PARSE_FAILURE_REASON_INVALID_JSON\",\n            Self::InvalidSchema => \"PARSE_FAILURE_REASON_INVALID_SCHEMA\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"PARSE_FAILURE_REASON_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"PARSE_FAILURE_REASON_INVALID_JSON\" => Some(Self::InvalidJson),\n            \"PARSE_FAILURE_REASON_INVALID_SCHEMA\" => Some(Self::InvalidSchema),\n            _ => None,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.metastore.rs",
    "content": "// This file is @generated by prost-build.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct EmptyResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CreateIndexRequest {\n    #[prost(string, tag = \"2\")]\n    pub index_config_json: ::prost::alloc::string::String,\n    #[prost(string, repeated, tag = \"3\")]\n    pub source_configs_json: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CreateIndexResponse {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub index_metadata_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct UpdateIndexRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"5\")]\n    pub doc_mapping_json: ::prost::alloc::string::String,\n    #[prost(string, tag = \"4\")]\n    pub indexing_settings_json: ::prost::alloc::string::String,\n    #[prost(string, tag = \"6\")]\n    pub ingest_settings_json: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub search_settings_json: ::prost::alloc::string::String,\n    #[prost(string, optional, tag = \"3\")]\n    pub retention_policy_json_opt: ::core::option::Option<\n        ::prost::alloc::string::String,\n    >,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListIndexesMetadataRequest {\n    /// List of patterns an index 
should match or not match to get considered\n    /// An index must match at least one positive pattern (a pattern not starting\n    /// with a '-'), and no negative pattern (a pattern starting with a '-').\n    #[prost(string, repeated, tag = \"2\")]\n    pub index_id_patterns: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListIndexesMetadataResponse {\n    /// Deprecated (v0.9.0), use `indexes_metadata_json_zstd` instead.\n    #[prost(string, optional, tag = \"1\")]\n    pub indexes_metadata_json_opt: ::core::option::Option<\n        ::prost::alloc::string::String,\n    >,\n    /// A JSON serialized then ZSTD compressed list of `IndexMetadata`: `Vec<IndexMetadata> | JSON | ZSTD`.\n    /// We don't use `repeated` here to increase the compression rate and ratio.\n    #[prost(bytes = \"bytes\", tag = \"2\")]\n    pub indexes_metadata_json_zstd: ::prost::bytes::Bytes,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DeleteIndexRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n}\n/// Request the metadata of an index.\n/// Either `index_uid` or `index_id` must be specified.\n///\n/// If both are supplied, `index_uid` is used.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IndexMetadataRequest {\n    #[prost(string, optional, tag = \"1\")]\n    pub index_id: ::core::option::Option<::prost::alloc::string::String>,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct 
IndexMetadataResponse {\n    #[prost(string, tag = \"1\")]\n    pub index_metadata_serialized_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IndexesMetadataRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub subrequests: ::prost::alloc::vec::Vec<IndexMetadataSubrequest>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IndexMetadataSubrequest {\n    #[prost(string, optional, tag = \"1\")]\n    pub index_id: ::core::option::Option<::prost::alloc::string::String>,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct IndexesMetadataResponse {\n    /// A JSON serialized then ZSTD compressed list of `IndexMetadata`: `Vec<IndexMetadata> | JSON | ZSTD`.\n    /// We don't use `repeated` here to increase the compression rate and ratio.\n    #[prost(bytes = \"bytes\", tag = \"1\")]\n    pub indexes_metadata_json_zstd: ::prost::bytes::Bytes,\n    #[prost(message, repeated, tag = \"2\")]\n    pub failures: ::prost::alloc::vec::Vec<IndexMetadataFailure>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IndexMetadataFailure {\n    #[prost(string, optional, tag = \"1\")]\n    pub index_id: ::core::option::Option<::prost::alloc::string::String>,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(enumeration = \"IndexMetadataFailureReason\", tag = \"3\")]\n    pub reason: i32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct 
ListIndexStatsRequest {\n    /// List of patterns an index should match or not match to get considered\n    /// An index must match at least one positive pattern (a pattern not starting\n    /// with a '-'), and no negative pattern (a pattern starting with a '-').\n    #[prost(string, repeated, tag = \"1\")]\n    pub index_id_patterns: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListIndexStatsResponse {\n    /// list of IndexStats. each one has the index id, the number of splits and the total size.\n    #[prost(message, repeated, tag = \"1\")]\n    pub index_stats: ::prost::alloc::vec::Vec<IndexStats>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IndexStats {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(message, optional, tag = \"2\")]\n    pub staged: ::core::option::Option<SplitStats>,\n    #[prost(message, optional, tag = \"3\")]\n    pub published: ::core::option::Option<SplitStats>,\n    #[prost(message, optional, tag = \"4\")]\n    pub marked_for_deletion: ::core::option::Option<SplitStats>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SplitStats {\n    #[prost(uint64, tag = \"1\")]\n    pub num_splits: u64,\n    #[prost(uint64, tag = \"2\")]\n    pub total_size_bytes: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListSplitsRequest {\n    /// Predicate used to filter splits.\n    /// The predicate is expressed as a JSON serialized\n    /// `ListSplitsQuery`.\n    #[prost(string, tag = \"1\")]\n    pub query_json: 
::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListSplitsResponse {\n    /// TODO use repeated and encode splits json individually.\n    #[prost(string, tag = \"1\")]\n    pub splits_serialized_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct StageSplitsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub split_metadata_list_serialized_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PublishSplitsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, repeated, tag = \"2\")]\n    pub staged_split_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    #[prost(string, repeated, tag = \"3\")]\n    pub replaced_split_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    #[prost(string, optional, tag = \"4\")]\n    pub index_checkpoint_delta_json_opt: ::core::option::Option<\n        ::prost::alloc::string::String,\n    >,\n    #[prost(string, optional, tag = \"5\")]\n    pub publish_token_opt: ::core::option::Option<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct MarkSplitsForDeletionRequest {\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, repeated, tag = \"3\")]\n    pub split_ids: 
::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DeleteSplitsRequest {\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, repeated, tag = \"3\")]\n    pub split_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct AddSourceRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_config_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct UpdateSourceRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_config_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ToggleSourceRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(bool, tag = \"3\")]\n    pub enable: bool,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DeleteSourceRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: 
::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ResetSourceCheckpointRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DeleteTask {\n    #[prost(int64, tag = \"1\")]\n    pub create_timestamp: i64,\n    #[prost(uint64, tag = \"2\")]\n    pub opstamp: u64,\n    #[prost(message, optional, tag = \"3\")]\n    pub delete_query: ::core::option::Option<DeleteQuery>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DeleteQuery {\n    /// Index UID.\n    #[prost(message, optional, tag = \"1\")]\n    #[schema(value_type = String)]\n    #[serde(alias = \"index_id\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    /// If set, restrict search to documents with a `timestamp >= start_timestamp`.\n    #[prost(int64, optional, tag = \"2\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub start_timestamp: ::core::option::Option<i64>,\n    /// If set, restrict search to documents with a \\`timestamp \\< end_timestamp\\``.\n    #[prost(int64, optional, tag = \"3\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub end_timestamp: ::core::option::Option<i64>,\n    /// Query AST serialized in JSON\n    #[prost(string, tag = \"6\")]\n    #[serde(alias = \"query\")]\n    pub query_ast: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct UpdateSplitsDeleteOpstampRequest {\n    #[prost(message, optional, tag = 
\"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, repeated, tag = \"2\")]\n    pub split_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    #[prost(uint64, tag = \"3\")]\n    pub delete_opstamp: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct UpdateSplitsDeleteOpstampResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct LastDeleteOpstampRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct LastDeleteOpstampResponse {\n    #[prost(uint64, tag = \"1\")]\n    pub last_delete_opstamp: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListStaleSplitsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(uint64, tag = \"2\")]\n    pub delete_opstamp: u64,\n    #[prost(uint64, tag = \"3\")]\n    pub num_splits: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListDeleteTasksRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(uint64, tag = \"2\")]\n    pub opstamp_start: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListDeleteTasksResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub delete_tasks: 
::prost::alloc::vec::Vec<DeleteTask>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct OpenShardsRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub subrequests: ::prost::alloc::vec::Vec<OpenShardSubrequest>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct OpenShardSubrequest {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"2\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"3\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"4\")]\n    pub shard_id: ::core::option::Option<crate::types::ShardId>,\n    #[prost(string, tag = \"5\")]\n    pub leader_id: ::prost::alloc::string::String,\n    #[prost(string, optional, tag = \"6\")]\n    pub follower_id: ::core::option::Option<::prost::alloc::string::String>,\n    #[prost(message, optional, tag = \"7\")]\n    pub doc_mapping_uid: ::core::option::Option<crate::types::DocMappingUid>,\n    #[prost(string, optional, tag = \"8\")]\n    pub publish_token: ::core::option::Option<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct OpenShardsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub subresponses: ::prost::alloc::vec::Vec<OpenShardSubresponse>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct OpenShardSubresponse {\n    #[prost(uint32, tag = \"1\")]\n    pub subrequest_id: u32,\n    #[prost(message, optional, tag = \"4\")]\n    pub open_shard: ::core::option::Option<super::ingest::Shard>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, 
PartialEq, ::prost::Message)]\npub struct AcquireShardsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub shard_ids: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n    #[prost(string, tag = \"4\")]\n    pub publish_token: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct AcquireShardsResponse {\n    /// List of acquired shards, in no specific order.\n    #[prost(message, repeated, tag = \"3\")]\n    pub acquired_shards: ::prost::alloc::vec::Vec<super::ingest::Shard>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct DeleteShardsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub shard_ids: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n    /// If false, only shards at EOF positions will be deleted.\n    #[prost(bool, tag = \"4\")]\n    pub force: bool,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct DeleteShardsResponse {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    /// List of shard IDs that were successfully deleted.\n    #[prost(message, repeated, tag = \"3\")]\n    pub successes: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n    /// List of shard IDs that could not be deleted because `force` was set to 
`false` in the request,\n    /// and the shards are not at EOF, i.e., not fully indexed.\n    #[prost(message, repeated, tag = \"4\")]\n    pub failures: ::prost::alloc::vec::Vec<crate::types::ShardId>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PruneShardsRequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    /// The maximum age of the shards to keep, in seconds.\n    #[prost(uint32, optional, tag = \"5\")]\n    pub max_age_secs: ::core::option::Option<u32>,\n    /// The maximum number of the shards to keep. Delete older shards first.\n    #[prost(uint32, optional, tag = \"6\")]\n    pub max_count: ::core::option::Option<u32>,\n    /// The interval between two pruning operations, in seconds.\n    #[prost(uint32, optional, tag = \"7\")]\n    pub interval_secs: ::core::option::Option<u32>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListShardsRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub subrequests: ::prost::alloc::vec::Vec<ListShardsSubrequest>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListShardsSubrequest {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(enumeration = \"super::ingest::ShardState\", optional, tag = \"3\")]\n    pub shard_state: ::core::option::Option<i32>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListShardsResponse {\n    #[prost(message, repeated, 
tag = \"1\")]\n    pub subresponses: ::prost::alloc::vec::Vec<ListShardsSubresponse>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListShardsSubresponse {\n    #[prost(message, optional, tag = \"1\")]\n    pub index_uid: ::core::option::Option<crate::types::IndexUid>,\n    #[prost(string, tag = \"2\")]\n    pub source_id: ::prost::alloc::string::String,\n    #[prost(message, repeated, tag = \"3\")]\n    pub shards: ::prost::alloc::vec::Vec<super::ingest::Shard>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct CreateIndexTemplateRequest {\n    #[prost(string, tag = \"1\")]\n    pub index_template_json: ::prost::alloc::string::String,\n    #[prost(bool, tag = \"2\")]\n    pub overwrite: bool,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetIndexTemplateRequest {\n    #[prost(string, tag = \"1\")]\n    pub template_id: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetIndexTemplateResponse {\n    #[prost(string, tag = \"1\")]\n    pub index_template_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct FindIndexTemplateMatchesRequest {\n    #[prost(string, repeated, tag = \"1\")]\n    pub index_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FindIndexTemplateMatchesResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub matches: ::prost::alloc::vec::Vec<IndexTemplateMatch>,\n}\n#[derive(serde::Serialize, 
serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct IndexTemplateMatch {\n    #[prost(string, tag = \"1\")]\n    pub index_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"2\")]\n    pub template_id: ::prost::alloc::string::String,\n    #[prost(string, tag = \"3\")]\n    pub index_template_json: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListIndexTemplatesRequest {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListIndexTemplatesResponse {\n    #[prost(string, repeated, tag = \"1\")]\n    pub index_templates_json: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct DeleteIndexTemplatesRequest {\n    #[prost(string, repeated, tag = \"1\")]\n    pub template_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetClusterIdentityRequest {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetClusterIdentityResponse {\n    #[prost(string, tag = \"1\")]\n    pub uuid: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum SourceType {\n    Unspecified = 0,\n    Cli = 1,\n    File = 2,\n    IngestV1 = 4,\n    IngestV2 = 5,\n    /// Apache Kafka\n    Kafka = 6,\n    /// Amazon Kinesis\n    Kinesis = 7,\n   
 Nats = 8,\n    /// Google Cloud Pub/Sub\n    PubSub = 3,\n    /// Apache Pulsar\n    Pulsar = 9,\n    Vec = 10,\n    Void = 11,\n    Stdin = 13,\n}\nimpl SourceType {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"SOURCE_TYPE_UNSPECIFIED\",\n            Self::Cli => \"SOURCE_TYPE_CLI\",\n            Self::File => \"SOURCE_TYPE_FILE\",\n            Self::IngestV1 => \"SOURCE_TYPE_INGEST_V1\",\n            Self::IngestV2 => \"SOURCE_TYPE_INGEST_V2\",\n            Self::Kafka => \"SOURCE_TYPE_KAFKA\",\n            Self::Kinesis => \"SOURCE_TYPE_KINESIS\",\n            Self::Nats => \"SOURCE_TYPE_NATS\",\n            Self::PubSub => \"SOURCE_TYPE_PUB_SUB\",\n            Self::Pulsar => \"SOURCE_TYPE_PULSAR\",\n            Self::Vec => \"SOURCE_TYPE_VEC\",\n            Self::Void => \"SOURCE_TYPE_VOID\",\n            Self::Stdin => \"SOURCE_TYPE_STDIN\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"SOURCE_TYPE_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"SOURCE_TYPE_CLI\" => Some(Self::Cli),\n            \"SOURCE_TYPE_FILE\" => Some(Self::File),\n            \"SOURCE_TYPE_INGEST_V1\" => Some(Self::IngestV1),\n            \"SOURCE_TYPE_INGEST_V2\" => Some(Self::IngestV2),\n            \"SOURCE_TYPE_KAFKA\" => Some(Self::Kafka),\n            \"SOURCE_TYPE_KINESIS\" => Some(Self::Kinesis),\n            \"SOURCE_TYPE_NATS\" => Some(Self::Nats),\n            \"SOURCE_TYPE_PUB_SUB\" => Some(Self::PubSub),\n            \"SOURCE_TYPE_PULSAR\" => Some(Self::Pulsar),\n            \"SOURCE_TYPE_VEC\" => 
Some(Self::Vec),\n            \"SOURCE_TYPE_VOID\" => Some(Self::Void),\n            \"SOURCE_TYPE_STDIN\" => Some(Self::Stdin),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum IndexMetadataFailureReason {\n    Unspecified = 0,\n    NotFound = 1,\n    Internal = 2,\n}\nimpl IndexMetadataFailureReason {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"INDEX_METADATA_FAILURE_REASON_UNSPECIFIED\",\n            Self::NotFound => \"INDEX_METADATA_FAILURE_REASON_NOT_FOUND\",\n            Self::Internal => \"INDEX_METADATA_FAILURE_REASON_INTERNAL\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"INDEX_METADATA_FAILURE_REASON_UNSPECIFIED\" => Some(Self::Unspecified),\n            \"INDEX_METADATA_FAILURE_REASON_NOT_FOUND\" => Some(Self::NotFound),\n            \"INDEX_METADATA_FAILURE_REASON_INTERNAL\" => Some(Self::Internal),\n            _ => None,\n        }\n    }\n}\n/// BEGIN quickwit-codegen\n#[allow(unused_imports)]\nuse std::str::FromStr;\nuse tower::{Layer, Service, ServiceExt};\nuse quickwit_common::tower::RpcName;\nimpl RpcName for CreateIndexRequest {\n    fn rpc_name() -> &'static str {\n        \"create_index\"\n    }\n}\nimpl RpcName for UpdateIndexRequest {\n    fn rpc_name() -> &'static str {\n        \"update_index\"\n    }\n}\nimpl RpcName for IndexMetadataRequest {\n    fn 
rpc_name() -> &'static str {\n        \"index_metadata\"\n    }\n}\nimpl RpcName for IndexesMetadataRequest {\n    fn rpc_name() -> &'static str {\n        \"indexes_metadata\"\n    }\n}\nimpl RpcName for ListIndexesMetadataRequest {\n    fn rpc_name() -> &'static str {\n        \"list_indexes_metadata\"\n    }\n}\nimpl RpcName for DeleteIndexRequest {\n    fn rpc_name() -> &'static str {\n        \"delete_index\"\n    }\n}\nimpl RpcName for ListIndexStatsRequest {\n    fn rpc_name() -> &'static str {\n        \"list_index_stats\"\n    }\n}\nimpl RpcName for ListSplitsRequest {\n    fn rpc_name() -> &'static str {\n        \"list_splits\"\n    }\n}\nimpl RpcName for StageSplitsRequest {\n    fn rpc_name() -> &'static str {\n        \"stage_splits\"\n    }\n}\nimpl RpcName for PublishSplitsRequest {\n    fn rpc_name() -> &'static str {\n        \"publish_splits\"\n    }\n}\nimpl RpcName for MarkSplitsForDeletionRequest {\n    fn rpc_name() -> &'static str {\n        \"mark_splits_for_deletion\"\n    }\n}\nimpl RpcName for DeleteSplitsRequest {\n    fn rpc_name() -> &'static str {\n        \"delete_splits\"\n    }\n}\nimpl RpcName for AddSourceRequest {\n    fn rpc_name() -> &'static str {\n        \"add_source\"\n    }\n}\nimpl RpcName for UpdateSourceRequest {\n    fn rpc_name() -> &'static str {\n        \"update_source\"\n    }\n}\nimpl RpcName for ToggleSourceRequest {\n    fn rpc_name() -> &'static str {\n        \"toggle_source\"\n    }\n}\nimpl RpcName for DeleteSourceRequest {\n    fn rpc_name() -> &'static str {\n        \"delete_source\"\n    }\n}\nimpl RpcName for ResetSourceCheckpointRequest {\n    fn rpc_name() -> &'static str {\n        \"reset_source_checkpoint\"\n    }\n}\nimpl RpcName for LastDeleteOpstampRequest {\n    fn rpc_name() -> &'static str {\n        \"last_delete_opstamp\"\n    }\n}\nimpl RpcName for DeleteQuery {\n    fn rpc_name() -> &'static str {\n        \"create_delete_task\"\n    }\n}\nimpl RpcName for 
UpdateSplitsDeleteOpstampRequest {\n    fn rpc_name() -> &'static str {\n        \"update_splits_delete_opstamp\"\n    }\n}\nimpl RpcName for ListDeleteTasksRequest {\n    fn rpc_name() -> &'static str {\n        \"list_delete_tasks\"\n    }\n}\nimpl RpcName for ListStaleSplitsRequest {\n    fn rpc_name() -> &'static str {\n        \"list_stale_splits\"\n    }\n}\nimpl RpcName for OpenShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"open_shards\"\n    }\n}\nimpl RpcName for AcquireShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"acquire_shards\"\n    }\n}\nimpl RpcName for DeleteShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"delete_shards\"\n    }\n}\nimpl RpcName for PruneShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"prune_shards\"\n    }\n}\nimpl RpcName for ListShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"list_shards\"\n    }\n}\nimpl RpcName for CreateIndexTemplateRequest {\n    fn rpc_name() -> &'static str {\n        \"create_index_template\"\n    }\n}\nimpl RpcName for GetIndexTemplateRequest {\n    fn rpc_name() -> &'static str {\n        \"get_index_template\"\n    }\n}\nimpl RpcName for FindIndexTemplateMatchesRequest {\n    fn rpc_name() -> &'static str {\n        \"find_index_template_matches\"\n    }\n}\nimpl RpcName for ListIndexTemplatesRequest {\n    fn rpc_name() -> &'static str {\n        \"list_index_templates\"\n    }\n}\nimpl RpcName for DeleteIndexTemplatesRequest {\n    fn rpc_name() -> &'static str {\n        \"delete_index_templates\"\n    }\n}\nimpl RpcName for GetClusterIdentityRequest {\n    fn rpc_name() -> &'static str {\n        \"get_cluster_identity\"\n    }\n}\npub type MetastoreServiceStream<T> = quickwit_common::ServiceStream<\n    crate::metastore::MetastoreResult<T>,\n>;\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait::async_trait]\npub trait MetastoreService: std::fmt::Debug + Send + Sync + 'static {\n    
///Creates an index.\n    ///\n    ///This API creates a new index in the metastore.\n    ///Returns an error if the specified index already exists in the storage.\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<CreateIndexResponse>;\n    ///Updates an index.\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse>;\n    ///Returns the `IndexMetadata` of an index identified by its IndexID or its IndexUID.\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse>;\n    ///Fetches the metadata of a list of indexes identified by their Index IDs or UIDs.\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexesMetadataResponse>;\n    ///Lists the metadata of indexes.\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexesMetadataResponse>;\n    ///Deletes an index.\n    async fn delete_index(\n        &self,\n        request: DeleteIndexRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Returns a list of size info for each index.\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexStatsResponse>;\n    ///Streams splits from index.\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<MetastoreServiceStream<ListSplitsResponse>>;\n    ///Stages several splits.\n    async fn stage_splits(\n        &self,\n        request: StageSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Publishes splits.\n    async fn 
publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Marks splits for deletion.\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Deletes splits.\n    async fn delete_splits(\n        &self,\n        request: DeleteSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Adds a source.\n    async fn add_source(\n        &self,\n        request: AddSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Updates a source.\n    async fn update_source(\n        &self,\n        request: UpdateSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Toggles (turns on or off) source.\n    async fn toggle_source(\n        &self,\n        request: ToggleSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Removes source.\n    async fn delete_source(\n        &self,\n        request: DeleteSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Resets source checkpoint.\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Gets last opstamp for a given `index_id`.\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<LastDeleteOpstampResponse>;\n    ///Creates a delete task.\n    async fn create_delete_task(\n        &self,\n        request: DeleteQuery,\n    ) -> crate::metastore::MetastoreResult<DeleteTask>;\n    ///Updates splits `delete_opstamp`.\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> 
crate::metastore::MetastoreResult<UpdateSplitsDeleteOpstampResponse>;\n    ///Lists delete tasks with `delete_task.opstamp` > `opstamp_start` for a given `index_id`.\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> crate::metastore::MetastoreResult<ListDeleteTasksResponse>;\n    ///Lists splits with `split.delete_opstamp` \\< `delete_opstamp` for a given `index_id`.\n    async fn list_stale_splits(\n        &self,\n        request: ListStaleSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<ListSplitsResponse>;\n    ///Shard API\n    ///\n    ///Note that for the file-backed metastore implementation, the requests are not processed atomically.\n    ///Indeed, each request comprises one or more subrequests that target different indexes and sources processed\n    ///independently. Responses list the requests that succeeded or failed in the fields `successes` and\n    ///`failures`.\n    async fn open_shards(\n        &self,\n        request: OpenShardsRequest,\n    ) -> crate::metastore::MetastoreResult<OpenShardsResponse>;\n    ///Acquires a set of shards for indexing. This RPC locks the shards for publishing thanks to a publish token and only\n    ///the last indexer that has acquired the shards is allowed to publish. The response returns for each subrequest the\n    ///list of acquired shards along with the positions to index from.\n    ///\n    ///If a requested shard is missing, this method does not return an error. It should simply return the list of\n    ///shards that were actually acquired.\n    ///\n    ///For this reason, AcquireShards.acquire_shards may return fewer subresponses than there were subrequests in the request.\n    ///They may also be returned in any order.\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> crate::metastore::MetastoreResult<AcquireShardsResponse>;\n    ///Deletes a set of shards. 
This RPC deletes the shards from the metastore.\n    ///If the shard did not exist to begin with, the operation is successful and does not return any error.\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> crate::metastore::MetastoreResult<DeleteShardsResponse>;\n    ///Deletes outdated shards. This RPC deletes the shards from the metastore.\n    async fn prune_shards(\n        &self,\n        request: PruneShardsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    async fn list_shards(\n        &self,\n        request: ListShardsRequest,\n    ) -> crate::metastore::MetastoreResult<ListShardsResponse>;\n    ///Creates an index template.\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Fetches an index template.\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<GetIndexTemplateResponse>;\n    ///Finds matching index templates.\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> crate::metastore::MetastoreResult<FindIndexTemplateMatchesResponse>;\n    ///Returns all the index templates.\n    async fn list_index_templates(\n        &self,\n        request: ListIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexTemplatesResponse>;\n    ///Deletes index templates.\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse>;\n    ///Get cluster identity\n    async fn get_cluster_identity(\n        &self,\n        request: GetClusterIdentityRequest,\n    ) -> crate::metastore::MetastoreResult<GetClusterIdentityResponse>;\n    async fn check_connectivity(&self) -> anyhow::Result<()>;\n    fn endpoints(&self) -> 
Vec<quickwit_common::uri::Uri>;\n}\n#[derive(Debug, Clone)]\npub struct MetastoreServiceClient {\n    inner: InnerMetastoreServiceClient,\n}\n#[derive(Debug, Clone)]\nstruct InnerMetastoreServiceClient(std::sync::Arc<dyn MetastoreService>);\nimpl MetastoreServiceClient {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: MetastoreService,\n    {\n        #[cfg(any(test, feature = \"testsuite\"))]\n        assert!(\n            std::any::TypeId::of:: < T > () != std::any::TypeId::of:: <\n            MockMetastoreService > (),\n            \"`MockMetastoreService` must be wrapped in a `MockMetastoreServiceWrapper`: use `MetastoreServiceClient::from_mock(mock)` to instantiate the client\"\n        );\n        Self {\n            inner: InnerMetastoreServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n    pub fn as_grpc_service(\n        &self,\n        max_message_size: bytesize::ByteSize,\n    ) -> metastore_service_grpc_server::MetastoreServiceGrpcServer<\n        MetastoreServiceGrpcServerAdapter,\n    > {\n        let adapter = MetastoreServiceGrpcServerAdapter::new(self.clone());\n        metastore_service_grpc_server::MetastoreServiceGrpcServer::new(adapter)\n            .accept_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .accept_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .send_compressed(tonic::codec::CompressionEncoding::Gzip)\n            .send_compressed(tonic::codec::CompressionEncoding::Zstd)\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize)\n    }\n    pub fn from_channel(\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> Self {\n        let (_, connection_keys_watcher) = tokio::sync::watch::channel(\n            
std::collections::HashSet::from_iter([addr]),\n        );\n        let mut client = metastore_service_grpc_client::MetastoreServiceGrpcClient::new(\n                channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = MetastoreServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_balance_channel(\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> MetastoreServiceClient {\n        let connection_keys_watcher = balance_channel.connection_keys_watcher();\n        let mut client = metastore_service_grpc_client::MetastoreServiceGrpcClient::new(\n                balance_channel,\n            )\n            .max_decoding_message_size(max_message_size.0 as usize)\n            .max_encoding_message_size(max_message_size.0 as usize);\n        if let Some(compression_encoding) = compression_encoding_opt {\n            client = client\n                .accept_compressed(compression_encoding)\n                .send_compressed(compression_encoding);\n        }\n        let adapter = MetastoreServiceGrpcClientAdapter::new(\n            client,\n            connection_keys_watcher,\n        );\n        Self::new(adapter)\n    }\n    pub fn from_mailbox<A>(mailbox: quickwit_actors::Mailbox<A>) -> Self\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        MetastoreServiceMailbox<A>: MetastoreService,\n    {\n        
MetastoreServiceClient::new(MetastoreServiceMailbox::new(mailbox))\n    }\n    pub fn tower() -> MetastoreServiceTowerLayerStack {\n        MetastoreServiceTowerLayerStack::default()\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn from_mock(mock: MockMetastoreService) -> Self {\n        let mock_wrapper = mock_metastore_service::MockMetastoreServiceWrapper {\n            inner: tokio::sync::Mutex::new(mock),\n        };\n        Self::new(mock_wrapper)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn mocked() -> Self {\n        Self::from_mock(MockMetastoreService::new())\n    }\n}\n#[async_trait::async_trait]\nimpl MetastoreService for MetastoreServiceClient {\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<CreateIndexResponse> {\n        self.inner.0.create_index(request).await\n    }\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.inner.0.update_index(request).await\n    }\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.inner.0.index_metadata(request).await\n    }\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexesMetadataResponse> {\n        self.inner.0.indexes_metadata(request).await\n    }\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexesMetadataResponse> {\n        self.inner.0.list_indexes_metadata(request).await\n    }\n    async fn delete_index(\n        &self,\n        request: DeleteIndexRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.delete_index(request).await\n    }\n  
  async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexStatsResponse> {\n        self.inner.0.list_index_stats(request).await\n    }\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        self.inner.0.list_splits(request).await\n    }\n    async fn stage_splits(\n        &self,\n        request: StageSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.stage_splits(request).await\n    }\n    async fn publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.publish_splits(request).await\n    }\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.mark_splits_for_deletion(request).await\n    }\n    async fn delete_splits(\n        &self,\n        request: DeleteSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.delete_splits(request).await\n    }\n    async fn add_source(\n        &self,\n        request: AddSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.add_source(request).await\n    }\n    async fn update_source(\n        &self,\n        request: UpdateSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.update_source(request).await\n    }\n    async fn toggle_source(\n        &self,\n        request: ToggleSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.toggle_source(request).await\n    }\n    async fn delete_source(\n        &self,\n        request: DeleteSourceRequest,\n    ) -> 
crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.delete_source(request).await\n    }\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.reset_source_checkpoint(request).await\n    }\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<LastDeleteOpstampResponse> {\n        self.inner.0.last_delete_opstamp(request).await\n    }\n    async fn create_delete_task(\n        &self,\n        request: DeleteQuery,\n    ) -> crate::metastore::MetastoreResult<DeleteTask> {\n        self.inner.0.create_delete_task(request).await\n    }\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n        self.inner.0.update_splits_delete_opstamp(request).await\n    }\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> crate::metastore::MetastoreResult<ListDeleteTasksResponse> {\n        self.inner.0.list_delete_tasks(request).await\n    }\n    async fn list_stale_splits(\n        &self,\n        request: ListStaleSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<ListSplitsResponse> {\n        self.inner.0.list_stale_splits(request).await\n    }\n    async fn open_shards(\n        &self,\n        request: OpenShardsRequest,\n    ) -> crate::metastore::MetastoreResult<OpenShardsResponse> {\n        self.inner.0.open_shards(request).await\n    }\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> crate::metastore::MetastoreResult<AcquireShardsResponse> {\n        self.inner.0.acquire_shards(request).await\n    }\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> 
crate::metastore::MetastoreResult<DeleteShardsResponse> {\n        self.inner.0.delete_shards(request).await\n    }\n    async fn prune_shards(\n        &self,\n        request: PruneShardsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.prune_shards(request).await\n    }\n    async fn list_shards(\n        &self,\n        request: ListShardsRequest,\n    ) -> crate::metastore::MetastoreResult<ListShardsResponse> {\n        self.inner.0.list_shards(request).await\n    }\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.create_index_template(request).await\n    }\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<GetIndexTemplateResponse> {\n        self.inner.0.get_index_template(request).await\n    }\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> crate::metastore::MetastoreResult<FindIndexTemplateMatchesResponse> {\n        self.inner.0.find_index_template_matches(request).await\n    }\n    async fn list_index_templates(\n        &self,\n        request: ListIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexTemplatesResponse> {\n        self.inner.0.list_index_templates(request).await\n    }\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner.0.delete_index_templates(request).await\n    }\n    async fn get_cluster_identity(\n        &self,\n        request: GetClusterIdentityRequest,\n    ) -> crate::metastore::MetastoreResult<GetClusterIdentityResponse> {\n        self.inner.0.get_cluster_identity(request).await\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> 
{\n        self.inner.0.check_connectivity().await\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        self.inner.0.endpoints()\n    }\n}\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod mock_metastore_service {\n    use super::*;\n    #[derive(Debug)]\n    pub struct MockMetastoreServiceWrapper {\n        pub(super) inner: tokio::sync::Mutex<MockMetastoreService>,\n    }\n    #[async_trait::async_trait]\n    impl MetastoreService for MockMetastoreServiceWrapper {\n        async fn create_index(\n            &self,\n            request: super::CreateIndexRequest,\n        ) -> crate::metastore::MetastoreResult<super::CreateIndexResponse> {\n            self.inner.lock().await.create_index(request).await\n        }\n        async fn update_index(\n            &self,\n            request: super::UpdateIndexRequest,\n        ) -> crate::metastore::MetastoreResult<super::IndexMetadataResponse> {\n            self.inner.lock().await.update_index(request).await\n        }\n        async fn index_metadata(\n            &self,\n            request: super::IndexMetadataRequest,\n        ) -> crate::metastore::MetastoreResult<super::IndexMetadataResponse> {\n            self.inner.lock().await.index_metadata(request).await\n        }\n        async fn indexes_metadata(\n            &self,\n            request: super::IndexesMetadataRequest,\n        ) -> crate::metastore::MetastoreResult<super::IndexesMetadataResponse> {\n            self.inner.lock().await.indexes_metadata(request).await\n        }\n        async fn list_indexes_metadata(\n            &self,\n            request: super::ListIndexesMetadataRequest,\n        ) -> crate::metastore::MetastoreResult<super::ListIndexesMetadataResponse> {\n            self.inner.lock().await.list_indexes_metadata(request).await\n        }\n        async fn delete_index(\n            &self,\n            request: super::DeleteIndexRequest,\n        ) -> 
crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.delete_index(request).await\n        }\n        async fn list_index_stats(\n            &self,\n            request: super::ListIndexStatsRequest,\n        ) -> crate::metastore::MetastoreResult<super::ListIndexStatsResponse> {\n            self.inner.lock().await.list_index_stats(request).await\n        }\n        async fn list_splits(\n            &self,\n            request: super::ListSplitsRequest,\n        ) -> crate::metastore::MetastoreResult<\n            MetastoreServiceStream<super::ListSplitsResponse>,\n        > {\n            self.inner.lock().await.list_splits(request).await\n        }\n        async fn stage_splits(\n            &self,\n            request: super::StageSplitsRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.stage_splits(request).await\n        }\n        async fn publish_splits(\n            &self,\n            request: super::PublishSplitsRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.publish_splits(request).await\n        }\n        async fn mark_splits_for_deletion(\n            &self,\n            request: super::MarkSplitsForDeletionRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.mark_splits_for_deletion(request).await\n        }\n        async fn delete_splits(\n            &self,\n            request: super::DeleteSplitsRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.delete_splits(request).await\n        }\n        async fn add_source(\n            &self,\n            request: super::AddSourceRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.add_source(request).await\n        }\n        async 
fn update_source(\n            &self,\n            request: super::UpdateSourceRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.update_source(request).await\n        }\n        async fn toggle_source(\n            &self,\n            request: super::ToggleSourceRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.toggle_source(request).await\n        }\n        async fn delete_source(\n            &self,\n            request: super::DeleteSourceRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.delete_source(request).await\n        }\n        async fn reset_source_checkpoint(\n            &self,\n            request: super::ResetSourceCheckpointRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.reset_source_checkpoint(request).await\n        }\n        async fn last_delete_opstamp(\n            &self,\n            request: super::LastDeleteOpstampRequest,\n        ) -> crate::metastore::MetastoreResult<super::LastDeleteOpstampResponse> {\n            self.inner.lock().await.last_delete_opstamp(request).await\n        }\n        async fn create_delete_task(\n            &self,\n            request: super::DeleteQuery,\n        ) -> crate::metastore::MetastoreResult<super::DeleteTask> {\n            self.inner.lock().await.create_delete_task(request).await\n        }\n        async fn update_splits_delete_opstamp(\n            &self,\n            request: super::UpdateSplitsDeleteOpstampRequest,\n        ) -> crate::metastore::MetastoreResult<\n            super::UpdateSplitsDeleteOpstampResponse,\n        > {\n            self.inner.lock().await.update_splits_delete_opstamp(request).await\n        }\n        async fn list_delete_tasks(\n            &self,\n            request: 
super::ListDeleteTasksRequest,\n        ) -> crate::metastore::MetastoreResult<super::ListDeleteTasksResponse> {\n            self.inner.lock().await.list_delete_tasks(request).await\n        }\n        async fn list_stale_splits(\n            &self,\n            request: super::ListStaleSplitsRequest,\n        ) -> crate::metastore::MetastoreResult<super::ListSplitsResponse> {\n            self.inner.lock().await.list_stale_splits(request).await\n        }\n        async fn open_shards(\n            &self,\n            request: super::OpenShardsRequest,\n        ) -> crate::metastore::MetastoreResult<super::OpenShardsResponse> {\n            self.inner.lock().await.open_shards(request).await\n        }\n        async fn acquire_shards(\n            &self,\n            request: super::AcquireShardsRequest,\n        ) -> crate::metastore::MetastoreResult<super::AcquireShardsResponse> {\n            self.inner.lock().await.acquire_shards(request).await\n        }\n        async fn delete_shards(\n            &self,\n            request: super::DeleteShardsRequest,\n        ) -> crate::metastore::MetastoreResult<super::DeleteShardsResponse> {\n            self.inner.lock().await.delete_shards(request).await\n        }\n        async fn prune_shards(\n            &self,\n            request: super::PruneShardsRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.prune_shards(request).await\n        }\n        async fn list_shards(\n            &self,\n            request: super::ListShardsRequest,\n        ) -> crate::metastore::MetastoreResult<super::ListShardsResponse> {\n            self.inner.lock().await.list_shards(request).await\n        }\n        async fn create_index_template(\n            &self,\n            request: super::CreateIndexTemplateRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            
self.inner.lock().await.create_index_template(request).await\n        }\n        async fn get_index_template(\n            &self,\n            request: super::GetIndexTemplateRequest,\n        ) -> crate::metastore::MetastoreResult<super::GetIndexTemplateResponse> {\n            self.inner.lock().await.get_index_template(request).await\n        }\n        async fn find_index_template_matches(\n            &self,\n            request: super::FindIndexTemplateMatchesRequest,\n        ) -> crate::metastore::MetastoreResult<super::FindIndexTemplateMatchesResponse> {\n            self.inner.lock().await.find_index_template_matches(request).await\n        }\n        async fn list_index_templates(\n            &self,\n            request: super::ListIndexTemplatesRequest,\n        ) -> crate::metastore::MetastoreResult<super::ListIndexTemplatesResponse> {\n            self.inner.lock().await.list_index_templates(request).await\n        }\n        async fn delete_index_templates(\n            &self,\n            request: super::DeleteIndexTemplatesRequest,\n        ) -> crate::metastore::MetastoreResult<super::EmptyResponse> {\n            self.inner.lock().await.delete_index_templates(request).await\n        }\n        async fn get_cluster_identity(\n            &self,\n            request: super::GetClusterIdentityRequest,\n        ) -> crate::metastore::MetastoreResult<super::GetClusterIdentityResponse> {\n            self.inner.lock().await.get_cluster_identity(request).await\n        }\n        async fn check_connectivity(&self) -> anyhow::Result<()> {\n            self.inner.lock().await.check_connectivity().await\n        }\n        fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n            futures::executor::block_on(self.inner.lock()).endpoints()\n        }\n    }\n}\npub type BoxFuture<T, E> = std::pin::Pin<\n    Box<dyn std::future::Future<Output = Result<T, E>> + Send + 'static>,\n>;\nimpl tower::Service<CreateIndexRequest> for 
InnerMetastoreServiceClient {\n    type Response = CreateIndexResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: CreateIndexRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.create_index(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<UpdateIndexRequest> for InnerMetastoreServiceClient {\n    type Response = IndexMetadataResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: UpdateIndexRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.update_index(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<IndexMetadataRequest> for InnerMetastoreServiceClient {\n    type Response = IndexMetadataResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: IndexMetadataRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.index_metadata(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<IndexesMetadataRequest> for InnerMetastoreServiceClient {\n    type Response = IndexesMetadataResponse;\n    type Error = crate::metastore::MetastoreError;\n    
type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: IndexesMetadataRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.indexes_metadata(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListIndexesMetadataRequest> for InnerMetastoreServiceClient {\n    type Response = ListIndexesMetadataResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListIndexesMetadataRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_indexes_metadata(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DeleteIndexRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DeleteIndexRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.delete_index(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListIndexStatsRequest> for InnerMetastoreServiceClient {\n    type Response = ListIndexStatsResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: 
&mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListIndexStatsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_index_stats(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListSplitsRequest> for InnerMetastoreServiceClient {\n    type Response = MetastoreServiceStream<ListSplitsResponse>;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListSplitsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_splits(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<StageSplitsRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: StageSplitsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.stage_splits(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<PublishSplitsRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n  
  fn call(&mut self, request: PublishSplitsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.publish_splits(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<MarkSplitsForDeletionRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: MarkSplitsForDeletionRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.mark_splits_for_deletion(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DeleteSplitsRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DeleteSplitsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.delete_splits(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<AddSourceRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: AddSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move 
{ svc.0.add_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<UpdateSourceRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: UpdateSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.update_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ToggleSourceRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ToggleSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.toggle_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DeleteSourceRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DeleteSourceRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.delete_source(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ResetSourceCheckpointRequest> for InnerMetastoreServiceClient {\n  
  type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ResetSourceCheckpointRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.reset_source_checkpoint(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<LastDeleteOpstampRequest> for InnerMetastoreServiceClient {\n    type Response = LastDeleteOpstampResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: LastDeleteOpstampRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.last_delete_opstamp(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DeleteQuery> for InnerMetastoreServiceClient {\n    type Response = DeleteTask;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DeleteQuery) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.create_delete_task(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<UpdateSplitsDeleteOpstampRequest> for InnerMetastoreServiceClient {\n    type Response = UpdateSplitsDeleteOpstampResponse;\n    type Error = crate::metastore::MetastoreError;\n    
type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: UpdateSplitsDeleteOpstampRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.update_splits_delete_opstamp(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListDeleteTasksRequest> for InnerMetastoreServiceClient {\n    type Response = ListDeleteTasksResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListDeleteTasksRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_delete_tasks(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListStaleSplitsRequest> for InnerMetastoreServiceClient {\n    type Response = ListSplitsResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListStaleSplitsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_stale_splits(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<OpenShardsRequest> for InnerMetastoreServiceClient {\n    type Response = OpenShardsResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut 
self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: OpenShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.open_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<AcquireShardsRequest> for InnerMetastoreServiceClient {\n    type Response = AcquireShardsResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: AcquireShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.acquire_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DeleteShardsRequest> for InnerMetastoreServiceClient {\n    type Response = DeleteShardsResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DeleteShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.delete_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<PruneShardsRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n  
  }\n    fn call(&mut self, request: PruneShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.prune_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListShardsRequest> for InnerMetastoreServiceClient {\n    type Response = ListShardsResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListShardsRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_shards(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<CreateIndexTemplateRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: CreateIndexTemplateRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.create_index_template(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<GetIndexTemplateRequest> for InnerMetastoreServiceClient {\n    type Response = GetIndexTemplateResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: GetIndexTemplateRequest) -> Self::Future {\n        let svc = self.clone();\n        
let fut = async move { svc.0.get_index_template(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<FindIndexTemplateMatchesRequest> for InnerMetastoreServiceClient {\n    type Response = FindIndexTemplateMatchesResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: FindIndexTemplateMatchesRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.find_index_template_matches(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<ListIndexTemplatesRequest> for InnerMetastoreServiceClient {\n    type Response = ListIndexTemplatesResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: ListIndexTemplatesRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.list_index_templates(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<DeleteIndexTemplatesRequest> for InnerMetastoreServiceClient {\n    type Response = EmptyResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: DeleteIndexTemplatesRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { 
svc.0.delete_index_templates(request).await };\n        Box::pin(fut)\n    }\n}\nimpl tower::Service<GetClusterIdentityRequest> for InnerMetastoreServiceClient {\n    type Response = GetClusterIdentityResponse;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, request: GetClusterIdentityRequest) -> Self::Future {\n        let svc = self.clone();\n        let fut = async move { svc.0.get_cluster_identity(request).await };\n        Box::pin(fut)\n    }\n}\n/// A tower service stack is a set of tower services.\n#[derive(Debug)]\nstruct MetastoreServiceTowerServiceStack {\n    #[allow(dead_code)]\n    inner: InnerMetastoreServiceClient,\n    create_index_svc: quickwit_common::tower::BoxService<\n        CreateIndexRequest,\n        CreateIndexResponse,\n        crate::metastore::MetastoreError,\n    >,\n    update_index_svc: quickwit_common::tower::BoxService<\n        UpdateIndexRequest,\n        IndexMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    index_metadata_svc: quickwit_common::tower::BoxService<\n        IndexMetadataRequest,\n        IndexMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    indexes_metadata_svc: quickwit_common::tower::BoxService<\n        IndexesMetadataRequest,\n        IndexesMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_indexes_metadata_svc: quickwit_common::tower::BoxService<\n        ListIndexesMetadataRequest,\n        ListIndexesMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    delete_index_svc: quickwit_common::tower::BoxService<\n        DeleteIndexRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_index_stats_svc: 
quickwit_common::tower::BoxService<\n        ListIndexStatsRequest,\n        ListIndexStatsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_splits_svc: quickwit_common::tower::BoxService<\n        ListSplitsRequest,\n        MetastoreServiceStream<ListSplitsResponse>,\n        crate::metastore::MetastoreError,\n    >,\n    stage_splits_svc: quickwit_common::tower::BoxService<\n        StageSplitsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    publish_splits_svc: quickwit_common::tower::BoxService<\n        PublishSplitsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    mark_splits_for_deletion_svc: quickwit_common::tower::BoxService<\n        MarkSplitsForDeletionRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    delete_splits_svc: quickwit_common::tower::BoxService<\n        DeleteSplitsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    add_source_svc: quickwit_common::tower::BoxService<\n        AddSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    update_source_svc: quickwit_common::tower::BoxService<\n        UpdateSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    toggle_source_svc: quickwit_common::tower::BoxService<\n        ToggleSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    delete_source_svc: quickwit_common::tower::BoxService<\n        DeleteSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    reset_source_checkpoint_svc: quickwit_common::tower::BoxService<\n        ResetSourceCheckpointRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    last_delete_opstamp_svc: quickwit_common::tower::BoxService<\n        LastDeleteOpstampRequest,\n        LastDeleteOpstampResponse,\n  
      crate::metastore::MetastoreError,\n    >,\n    create_delete_task_svc: quickwit_common::tower::BoxService<\n        DeleteQuery,\n        DeleteTask,\n        crate::metastore::MetastoreError,\n    >,\n    update_splits_delete_opstamp_svc: quickwit_common::tower::BoxService<\n        UpdateSplitsDeleteOpstampRequest,\n        UpdateSplitsDeleteOpstampResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_delete_tasks_svc: quickwit_common::tower::BoxService<\n        ListDeleteTasksRequest,\n        ListDeleteTasksResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_stale_splits_svc: quickwit_common::tower::BoxService<\n        ListStaleSplitsRequest,\n        ListSplitsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    open_shards_svc: quickwit_common::tower::BoxService<\n        OpenShardsRequest,\n        OpenShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    acquire_shards_svc: quickwit_common::tower::BoxService<\n        AcquireShardsRequest,\n        AcquireShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    delete_shards_svc: quickwit_common::tower::BoxService<\n        DeleteShardsRequest,\n        DeleteShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    prune_shards_svc: quickwit_common::tower::BoxService<\n        PruneShardsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_shards_svc: quickwit_common::tower::BoxService<\n        ListShardsRequest,\n        ListShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    create_index_template_svc: quickwit_common::tower::BoxService<\n        CreateIndexTemplateRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    get_index_template_svc: quickwit_common::tower::BoxService<\n        GetIndexTemplateRequest,\n        GetIndexTemplateResponse,\n        crate::metastore::MetastoreError,\n    >,\n    
find_index_template_matches_svc: quickwit_common::tower::BoxService<\n        FindIndexTemplateMatchesRequest,\n        FindIndexTemplateMatchesResponse,\n        crate::metastore::MetastoreError,\n    >,\n    list_index_templates_svc: quickwit_common::tower::BoxService<\n        ListIndexTemplatesRequest,\n        ListIndexTemplatesResponse,\n        crate::metastore::MetastoreError,\n    >,\n    delete_index_templates_svc: quickwit_common::tower::BoxService<\n        DeleteIndexTemplatesRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    get_cluster_identity_svc: quickwit_common::tower::BoxService<\n        GetClusterIdentityRequest,\n        GetClusterIdentityResponse,\n        crate::metastore::MetastoreError,\n    >,\n}\n#[async_trait::async_trait]\nimpl MetastoreService for MetastoreServiceTowerServiceStack {\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<CreateIndexResponse> {\n        self.create_index_svc.clone().ready().await?.call(request).await\n    }\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.update_index_svc.clone().ready().await?.call(request).await\n    }\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.index_metadata_svc.clone().ready().await?.call(request).await\n    }\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexesMetadataResponse> {\n        self.indexes_metadata_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexesMetadataResponse> {\n        
self.list_indexes_metadata_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_index(\n        &self,\n        request: DeleteIndexRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.delete_index_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexStatsResponse> {\n        self.list_index_stats_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        self.list_splits_svc.clone().ready().await?.call(request).await\n    }\n    async fn stage_splits(\n        &self,\n        request: StageSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.stage_splits_svc.clone().ready().await?.call(request).await\n    }\n    async fn publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.publish_splits_svc.clone().ready().await?.call(request).await\n    }\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.mark_splits_for_deletion_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_splits(\n        &self,\n        request: DeleteSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.delete_splits_svc.clone().ready().await?.call(request).await\n    }\n    async fn add_source(\n        &self,\n        request: AddSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.add_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn update_source(\n        
&self,\n        request: UpdateSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.update_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn toggle_source(\n        &self,\n        request: ToggleSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.toggle_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_source(\n        &self,\n        request: DeleteSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.delete_source_svc.clone().ready().await?.call(request).await\n    }\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.reset_source_checkpoint_svc.clone().ready().await?.call(request).await\n    }\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<LastDeleteOpstampResponse> {\n        self.last_delete_opstamp_svc.clone().ready().await?.call(request).await\n    }\n    async fn create_delete_task(\n        &self,\n        request: DeleteQuery,\n    ) -> crate::metastore::MetastoreResult<DeleteTask> {\n        self.create_delete_task_svc.clone().ready().await?.call(request).await\n    }\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n        self.update_splits_delete_opstamp_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> crate::metastore::MetastoreResult<ListDeleteTasksResponse> {\n        self.list_delete_tasks_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_stale_splits(\n        &self,\n        request: 
ListStaleSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<ListSplitsResponse> {\n        self.list_stale_splits_svc.clone().ready().await?.call(request).await\n    }\n    async fn open_shards(\n        &self,\n        request: OpenShardsRequest,\n    ) -> crate::metastore::MetastoreResult<OpenShardsResponse> {\n        self.open_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> crate::metastore::MetastoreResult<AcquireShardsResponse> {\n        self.acquire_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> crate::metastore::MetastoreResult<DeleteShardsResponse> {\n        self.delete_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn prune_shards(\n        &self,\n        request: PruneShardsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.prune_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_shards(\n        &self,\n        request: ListShardsRequest,\n    ) -> crate::metastore::MetastoreResult<ListShardsResponse> {\n        self.list_shards_svc.clone().ready().await?.call(request).await\n    }\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.create_index_template_svc.clone().ready().await?.call(request).await\n    }\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<GetIndexTemplateResponse> {\n        self.get_index_template_svc.clone().ready().await?.call(request).await\n    }\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> 
crate::metastore::MetastoreResult<FindIndexTemplateMatchesResponse> {\n        self.find_index_template_matches_svc.clone().ready().await?.call(request).await\n    }\n    async fn list_index_templates(\n        &self,\n        request: ListIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexTemplatesResponse> {\n        self.list_index_templates_svc.clone().ready().await?.call(request).await\n    }\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.delete_index_templates_svc.clone().ready().await?.call(request).await\n    }\n    async fn get_cluster_identity(\n        &self,\n        request: GetClusterIdentityRequest,\n    ) -> crate::metastore::MetastoreResult<GetClusterIdentityResponse> {\n        self.get_cluster_identity_svc.clone().ready().await?.call(request).await\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.inner.0.check_connectivity().await\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        self.inner.0.endpoints()\n    }\n}\ntype CreateIndexLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        CreateIndexRequest,\n        CreateIndexResponse,\n        crate::metastore::MetastoreError,\n    >,\n    CreateIndexRequest,\n    CreateIndexResponse,\n    crate::metastore::MetastoreError,\n>;\ntype UpdateIndexLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        UpdateIndexRequest,\n        IndexMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    UpdateIndexRequest,\n    IndexMetadataResponse,\n    crate::metastore::MetastoreError,\n>;\ntype IndexMetadataLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        IndexMetadataRequest,\n        IndexMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    
IndexMetadataRequest,\n    IndexMetadataResponse,\n    crate::metastore::MetastoreError,\n>;\ntype IndexesMetadataLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        IndexesMetadataRequest,\n        IndexesMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    IndexesMetadataRequest,\n    IndexesMetadataResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListIndexesMetadataLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListIndexesMetadataRequest,\n        ListIndexesMetadataResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ListIndexesMetadataRequest,\n    ListIndexesMetadataResponse,\n    crate::metastore::MetastoreError,\n>;\ntype DeleteIndexLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DeleteIndexRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    DeleteIndexRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListIndexStatsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListIndexStatsRequest,\n        ListIndexStatsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ListIndexStatsRequest,\n    ListIndexStatsResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListSplitsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListSplitsRequest,\n        MetastoreServiceStream<ListSplitsResponse>,\n        crate::metastore::MetastoreError,\n    >,\n    ListSplitsRequest,\n    MetastoreServiceStream<ListSplitsResponse>,\n    crate::metastore::MetastoreError,\n>;\ntype StageSplitsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        StageSplitsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    StageSplitsRequest,\n    EmptyResponse,\n    
crate::metastore::MetastoreError,\n>;\ntype PublishSplitsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        PublishSplitsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    PublishSplitsRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype MarkSplitsForDeletionLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        MarkSplitsForDeletionRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    MarkSplitsForDeletionRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype DeleteSplitsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DeleteSplitsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    DeleteSplitsRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype AddSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        AddSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    AddSourceRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype UpdateSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        UpdateSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    UpdateSourceRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ToggleSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ToggleSourceRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ToggleSourceRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype DeleteSourceLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DeleteSourceRequest,\n        EmptyResponse,\n        
crate::metastore::MetastoreError,\n    >,\n    DeleteSourceRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ResetSourceCheckpointLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ResetSourceCheckpointRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ResetSourceCheckpointRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype LastDeleteOpstampLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        LastDeleteOpstampRequest,\n        LastDeleteOpstampResponse,\n        crate::metastore::MetastoreError,\n    >,\n    LastDeleteOpstampRequest,\n    LastDeleteOpstampResponse,\n    crate::metastore::MetastoreError,\n>;\ntype CreateDeleteTaskLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DeleteQuery,\n        DeleteTask,\n        crate::metastore::MetastoreError,\n    >,\n    DeleteQuery,\n    DeleteTask,\n    crate::metastore::MetastoreError,\n>;\ntype UpdateSplitsDeleteOpstampLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        UpdateSplitsDeleteOpstampRequest,\n        UpdateSplitsDeleteOpstampResponse,\n        crate::metastore::MetastoreError,\n    >,\n    UpdateSplitsDeleteOpstampRequest,\n    UpdateSplitsDeleteOpstampResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListDeleteTasksLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListDeleteTasksRequest,\n        ListDeleteTasksResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ListDeleteTasksRequest,\n    ListDeleteTasksResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListStaleSplitsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListStaleSplitsRequest,\n        ListSplitsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    
ListStaleSplitsRequest,\n    ListSplitsResponse,\n    crate::metastore::MetastoreError,\n>;\ntype OpenShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        OpenShardsRequest,\n        OpenShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    OpenShardsRequest,\n    OpenShardsResponse,\n    crate::metastore::MetastoreError,\n>;\ntype AcquireShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        AcquireShardsRequest,\n        AcquireShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    AcquireShardsRequest,\n    AcquireShardsResponse,\n    crate::metastore::MetastoreError,\n>;\ntype DeleteShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DeleteShardsRequest,\n        DeleteShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    DeleteShardsRequest,\n    DeleteShardsResponse,\n    crate::metastore::MetastoreError,\n>;\ntype PruneShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        PruneShardsRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    PruneShardsRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListShardsLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListShardsRequest,\n        ListShardsResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ListShardsRequest,\n    ListShardsResponse,\n    crate::metastore::MetastoreError,\n>;\ntype CreateIndexTemplateLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        CreateIndexTemplateRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    CreateIndexTemplateRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype GetIndexTemplateLayer = quickwit_common::tower::BoxLayer<\n    
quickwit_common::tower::BoxService<\n        GetIndexTemplateRequest,\n        GetIndexTemplateResponse,\n        crate::metastore::MetastoreError,\n    >,\n    GetIndexTemplateRequest,\n    GetIndexTemplateResponse,\n    crate::metastore::MetastoreError,\n>;\ntype FindIndexTemplateMatchesLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        FindIndexTemplateMatchesRequest,\n        FindIndexTemplateMatchesResponse,\n        crate::metastore::MetastoreError,\n    >,\n    FindIndexTemplateMatchesRequest,\n    FindIndexTemplateMatchesResponse,\n    crate::metastore::MetastoreError,\n>;\ntype ListIndexTemplatesLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        ListIndexTemplatesRequest,\n        ListIndexTemplatesResponse,\n        crate::metastore::MetastoreError,\n    >,\n    ListIndexTemplatesRequest,\n    ListIndexTemplatesResponse,\n    crate::metastore::MetastoreError,\n>;\ntype DeleteIndexTemplatesLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        DeleteIndexTemplatesRequest,\n        EmptyResponse,\n        crate::metastore::MetastoreError,\n    >,\n    DeleteIndexTemplatesRequest,\n    EmptyResponse,\n    crate::metastore::MetastoreError,\n>;\ntype GetClusterIdentityLayer = quickwit_common::tower::BoxLayer<\n    quickwit_common::tower::BoxService<\n        GetClusterIdentityRequest,\n        GetClusterIdentityResponse,\n        crate::metastore::MetastoreError,\n    >,\n    GetClusterIdentityRequest,\n    GetClusterIdentityResponse,\n    crate::metastore::MetastoreError,\n>;\n#[derive(Debug, Default)]\npub struct MetastoreServiceTowerLayerStack {\n    create_index_layers: Vec<CreateIndexLayer>,\n    update_index_layers: Vec<UpdateIndexLayer>,\n    index_metadata_layers: Vec<IndexMetadataLayer>,\n    indexes_metadata_layers: Vec<IndexesMetadataLayer>,\n    list_indexes_metadata_layers: Vec<ListIndexesMetadataLayer>,\n    delete_index_layers: 
Vec<DeleteIndexLayer>,\n    list_index_stats_layers: Vec<ListIndexStatsLayer>,\n    list_splits_layers: Vec<ListSplitsLayer>,\n    stage_splits_layers: Vec<StageSplitsLayer>,\n    publish_splits_layers: Vec<PublishSplitsLayer>,\n    mark_splits_for_deletion_layers: Vec<MarkSplitsForDeletionLayer>,\n    delete_splits_layers: Vec<DeleteSplitsLayer>,\n    add_source_layers: Vec<AddSourceLayer>,\n    update_source_layers: Vec<UpdateSourceLayer>,\n    toggle_source_layers: Vec<ToggleSourceLayer>,\n    delete_source_layers: Vec<DeleteSourceLayer>,\n    reset_source_checkpoint_layers: Vec<ResetSourceCheckpointLayer>,\n    last_delete_opstamp_layers: Vec<LastDeleteOpstampLayer>,\n    create_delete_task_layers: Vec<CreateDeleteTaskLayer>,\n    update_splits_delete_opstamp_layers: Vec<UpdateSplitsDeleteOpstampLayer>,\n    list_delete_tasks_layers: Vec<ListDeleteTasksLayer>,\n    list_stale_splits_layers: Vec<ListStaleSplitsLayer>,\n    open_shards_layers: Vec<OpenShardsLayer>,\n    acquire_shards_layers: Vec<AcquireShardsLayer>,\n    delete_shards_layers: Vec<DeleteShardsLayer>,\n    prune_shards_layers: Vec<PruneShardsLayer>,\n    list_shards_layers: Vec<ListShardsLayer>,\n    create_index_template_layers: Vec<CreateIndexTemplateLayer>,\n    get_index_template_layers: Vec<GetIndexTemplateLayer>,\n    find_index_template_matches_layers: Vec<FindIndexTemplateMatchesLayer>,\n    list_index_templates_layers: Vec<ListIndexTemplatesLayer>,\n    delete_index_templates_layers: Vec<DeleteIndexTemplatesLayer>,\n    get_cluster_identity_layers: Vec<GetClusterIdentityLayer>,\n}\nimpl MetastoreServiceTowerLayerStack {\n    pub fn stack_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    CreateIndexRequest,\n                    CreateIndexResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as 
tower::Layer<\n            quickwit_common::tower::BoxService<\n                CreateIndexRequest,\n                CreateIndexResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                CreateIndexRequest,\n                Response = CreateIndexResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                CreateIndexRequest,\n                CreateIndexResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<CreateIndexRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    UpdateIndexRequest,\n                    IndexMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                UpdateIndexRequest,\n                IndexMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                UpdateIndexRequest,\n                Response = IndexMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                UpdateIndexRequest,\n                IndexMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<UpdateIndexRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IndexMetadataRequest,\n                    IndexMetadataResponse,\n                    
crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IndexMetadataRequest,\n                IndexMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                IndexMetadataRequest,\n                Response = IndexMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IndexMetadataRequest,\n                IndexMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<IndexMetadataRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IndexesMetadataRequest,\n                    IndexesMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IndexesMetadataRequest,\n                IndexesMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                IndexesMetadataRequest,\n                Response = IndexesMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                IndexesMetadataRequest,\n                IndexesMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<IndexesMetadataRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                
quickwit_common::tower::BoxService<\n                    ListIndexesMetadataRequest,\n                    ListIndexesMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListIndexesMetadataRequest,\n                ListIndexesMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListIndexesMetadataRequest,\n                Response = ListIndexesMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListIndexesMetadataRequest,\n                ListIndexesMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            ListIndexesMetadataRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteIndexRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteIndexRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                DeleteIndexRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteIndexRequest,\n                EmptyResponse,\n                
crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<DeleteIndexRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListIndexStatsRequest,\n                    ListIndexStatsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListIndexStatsRequest,\n                ListIndexStatsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListIndexStatsRequest,\n                Response = ListIndexStatsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListIndexStatsRequest,\n                ListIndexStatsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<ListIndexStatsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListSplitsRequest,\n                    MetastoreServiceStream<ListSplitsResponse>,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListSplitsRequest,\n                MetastoreServiceStream<ListSplitsResponse>,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListSplitsRequest,\n                Response = MetastoreServiceStream<ListSplitsResponse>,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 
'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListSplitsRequest,\n                MetastoreServiceStream<ListSplitsResponse>,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<ListSplitsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    StageSplitsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                StageSplitsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                StageSplitsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                StageSplitsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<StageSplitsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    PublishSplitsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                PublishSplitsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                PublishSplitsRequest,\n                Response = EmptyResponse,\n                Error = 
crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                PublishSplitsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<PublishSplitsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    MarkSplitsForDeletionRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                MarkSplitsForDeletionRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                MarkSplitsForDeletionRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                MarkSplitsForDeletionRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            MarkSplitsForDeletionRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteSplitsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteSplitsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: 
tower::Service<\n                DeleteSplitsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteSplitsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<DeleteSplitsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    AddSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                AddSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                AddSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                AddSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<AddSourceRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    UpdateSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                UpdateSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n      
      >,\n        >>::Service: tower::Service<\n                UpdateSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                UpdateSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<UpdateSourceRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ToggleSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ToggleSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ToggleSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ToggleSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<ToggleSourceRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteSourceRequest,\n                EmptyResponse,\n           
     crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                DeleteSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteSourceRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<DeleteSourceRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ResetSourceCheckpointRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ResetSourceCheckpointRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ResetSourceCheckpointRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ResetSourceCheckpointRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            ResetSourceCheckpointRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    LastDeleteOpstampRequest,\n                    LastDeleteOpstampResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as 
tower::Layer<\n            quickwit_common::tower::BoxService<\n                LastDeleteOpstampRequest,\n                LastDeleteOpstampResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                LastDeleteOpstampRequest,\n                Response = LastDeleteOpstampResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                LastDeleteOpstampRequest,\n                LastDeleteOpstampResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<LastDeleteOpstampRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteQuery,\n                    DeleteTask,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteQuery,\n                DeleteTask,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                DeleteQuery,\n                Response = DeleteTask,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteQuery,\n                DeleteTask,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<DeleteQuery>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    UpdateSplitsDeleteOpstampRequest,\n                    UpdateSplitsDeleteOpstampResponse,\n                    
crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                UpdateSplitsDeleteOpstampRequest,\n                UpdateSplitsDeleteOpstampResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                UpdateSplitsDeleteOpstampRequest,\n                Response = UpdateSplitsDeleteOpstampResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                UpdateSplitsDeleteOpstampRequest,\n                UpdateSplitsDeleteOpstampResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            UpdateSplitsDeleteOpstampRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListDeleteTasksRequest,\n                    ListDeleteTasksResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListDeleteTasksRequest,\n                ListDeleteTasksResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListDeleteTasksRequest,\n                Response = ListDeleteTasksResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListDeleteTasksRequest,\n                ListDeleteTasksResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as 
tower::Service<ListDeleteTasksRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListStaleSplitsRequest,\n                    ListSplitsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListStaleSplitsRequest,\n                ListSplitsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListStaleSplitsRequest,\n                Response = ListSplitsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListStaleSplitsRequest,\n                ListSplitsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<ListStaleSplitsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    OpenShardsRequest,\n                    OpenShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                OpenShardsRequest,\n                OpenShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                OpenShardsRequest,\n                Response = OpenShardsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                OpenShardsRequest,\n                
OpenShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<OpenShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    AcquireShardsRequest,\n                    AcquireShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                AcquireShardsRequest,\n                AcquireShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                AcquireShardsRequest,\n                Response = AcquireShardsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                AcquireShardsRequest,\n                AcquireShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<AcquireShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteShardsRequest,\n                    DeleteShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteShardsRequest,\n                DeleteShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                DeleteShardsRequest,\n                Response = DeleteShardsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as 
tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteShardsRequest,\n                DeleteShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<DeleteShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    PruneShardsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                PruneShardsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                PruneShardsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                PruneShardsRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<PruneShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListShardsRequest,\n                    ListShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListShardsRequest,\n                ListShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListShardsRequest,\n                Response = ListShardsResponse,\n                Error = crate::metastore::MetastoreError,\n      
      > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListShardsRequest,\n                ListShardsResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<ListShardsRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    CreateIndexTemplateRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                CreateIndexTemplateRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                CreateIndexTemplateRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                CreateIndexTemplateRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            CreateIndexTemplateRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetIndexTemplateRequest,\n                    GetIndexTemplateResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetIndexTemplateRequest,\n                GetIndexTemplateResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n         
       GetIndexTemplateRequest,\n                Response = GetIndexTemplateResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetIndexTemplateRequest,\n                GetIndexTemplateResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<GetIndexTemplateRequest>>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    FindIndexTemplateMatchesRequest,\n                    FindIndexTemplateMatchesResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                FindIndexTemplateMatchesRequest,\n                FindIndexTemplateMatchesResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                FindIndexTemplateMatchesRequest,\n                Response = FindIndexTemplateMatchesResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                FindIndexTemplateMatchesRequest,\n                FindIndexTemplateMatchesResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            FindIndexTemplateMatchesRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListIndexTemplatesRequest,\n                    ListIndexTemplatesResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n  
      <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListIndexTemplatesRequest,\n                ListIndexTemplatesResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                ListIndexTemplatesRequest,\n                Response = ListIndexTemplatesResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                ListIndexTemplatesRequest,\n                ListIndexTemplatesResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            ListIndexTemplatesRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteIndexTemplatesRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteIndexTemplatesRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                DeleteIndexTemplatesRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                DeleteIndexTemplatesRequest,\n                EmptyResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            DeleteIndexTemplatesRequest,\n        >>::Future: Send + 'static,\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n    
                GetClusterIdentityRequest,\n                    GetClusterIdentityResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Clone + Send + Sync + 'static,\n        <L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetClusterIdentityRequest,\n                GetClusterIdentityResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service: tower::Service<\n                GetClusterIdentityRequest,\n                Response = GetClusterIdentityResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <<L as tower::Layer<\n            quickwit_common::tower::BoxService<\n                GetClusterIdentityRequest,\n                GetClusterIdentityResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >>::Service as tower::Service<\n            GetClusterIdentityRequest,\n        >>::Future: Send + 'static,\n    {\n        self.create_index_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.update_index_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.index_metadata_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.indexes_metadata_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_indexes_metadata_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_index_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_index_stats_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_splits_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.stage_splits_layers\n            
.push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.publish_splits_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.mark_splits_for_deletion_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_splits_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.add_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.update_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.toggle_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_source_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.reset_source_checkpoint_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.last_delete_opstamp_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.create_delete_task_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.update_splits_delete_opstamp_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_delete_tasks_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_stale_splits_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.open_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.acquire_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.prune_shards_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_shards_layers\n 
           .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.create_index_template_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.get_index_template_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.find_index_template_matches_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.list_index_templates_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.delete_index_templates_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self.get_cluster_identity_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer.clone()));\n        self\n    }\n    pub fn stack_create_index_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    CreateIndexRequest,\n                    CreateIndexResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                CreateIndexRequest,\n                Response = CreateIndexResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<CreateIndexRequest>>::Future: Send + 'static,\n    {\n        self.create_index_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_update_index_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    UpdateIndexRequest,\n                    IndexMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                
UpdateIndexRequest,\n                Response = IndexMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<UpdateIndexRequest>>::Future: Send + 'static,\n    {\n        self.update_index_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_index_metadata_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IndexMetadataRequest,\n                    IndexMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                IndexMetadataRequest,\n                Response = IndexMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<IndexMetadataRequest>>::Future: Send + 'static,\n    {\n        self.index_metadata_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_indexes_metadata_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    IndexesMetadataRequest,\n                    IndexesMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                IndexesMetadataRequest,\n                Response = IndexesMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<IndexesMetadataRequest>>::Future: Send + 'static,\n    {\n        self.indexes_metadata_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub 
fn stack_list_indexes_metadata_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListIndexesMetadataRequest,\n                    ListIndexesMetadataResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ListIndexesMetadataRequest,\n                Response = ListIndexesMetadataResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            ListIndexesMetadataRequest,\n        >>::Future: Send + 'static,\n    {\n        self.list_indexes_metadata_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_index_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteIndexRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DeleteIndexRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<DeleteIndexRequest>>::Future: Send + 'static,\n    {\n        self.delete_index_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_list_index_stats_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListIndexStatsRequest,\n                    ListIndexStatsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + 
Sync + 'static,\n        L::Service: tower::Service<\n                ListIndexStatsRequest,\n                Response = ListIndexStatsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ListIndexStatsRequest>>::Future: Send + 'static,\n    {\n        self.list_index_stats_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_list_splits_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListSplitsRequest,\n                    MetastoreServiceStream<ListSplitsResponse>,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ListSplitsRequest,\n                Response = MetastoreServiceStream<ListSplitsResponse>,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ListSplitsRequest>>::Future: Send + 'static,\n    {\n        self.list_splits_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_stage_splits_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    StageSplitsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                StageSplitsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<StageSplitsRequest>>::Future: Send + 'static,\n    {\n        
self.stage_splits_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_publish_splits_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    PublishSplitsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                PublishSplitsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<PublishSplitsRequest>>::Future: Send + 'static,\n    {\n        self.publish_splits_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_mark_splits_for_deletion_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    MarkSplitsForDeletionRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                MarkSplitsForDeletionRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            MarkSplitsForDeletionRequest,\n        >>::Future: Send + 'static,\n    {\n        self.mark_splits_for_deletion_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_splits_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteSplitsRequest,\n                    EmptyResponse,\n               
     crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DeleteSplitsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<DeleteSplitsRequest>>::Future: Send + 'static,\n    {\n        self.delete_splits_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_add_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    AddSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                AddSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<AddSourceRequest>>::Future: Send + 'static,\n    {\n        self.add_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_update_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    UpdateSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                UpdateSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<UpdateSourceRequest>>::Future: Send + 'static,\n    {\n        
self.update_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_toggle_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ToggleSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ToggleSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ToggleSourceRequest>>::Future: Send + 'static,\n    {\n        self.toggle_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_source_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteSourceRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DeleteSourceRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<DeleteSourceRequest>>::Future: Send + 'static,\n    {\n        self.delete_source_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_reset_source_checkpoint_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ResetSourceCheckpointRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            
> + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ResetSourceCheckpointRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            ResetSourceCheckpointRequest,\n        >>::Future: Send + 'static,\n    {\n        self.reset_source_checkpoint_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_last_delete_opstamp_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    LastDeleteOpstampRequest,\n                    LastDeleteOpstampResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                LastDeleteOpstampRequest,\n                Response = LastDeleteOpstampResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<LastDeleteOpstampRequest>>::Future: Send + 'static,\n    {\n        self.last_delete_opstamp_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_create_delete_task_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteQuery,\n                    DeleteTask,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DeleteQuery,\n                Response = DeleteTask,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<DeleteQuery>>::Future: Send + 
'static,\n    {\n        self.create_delete_task_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_update_splits_delete_opstamp_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    UpdateSplitsDeleteOpstampRequest,\n                    UpdateSplitsDeleteOpstampResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                UpdateSplitsDeleteOpstampRequest,\n                Response = UpdateSplitsDeleteOpstampResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            UpdateSplitsDeleteOpstampRequest,\n        >>::Future: Send + 'static,\n    {\n        self.update_splits_delete_opstamp_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_list_delete_tasks_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListDeleteTasksRequest,\n                    ListDeleteTasksResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ListDeleteTasksRequest,\n                Response = ListDeleteTasksResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ListDeleteTasksRequest>>::Future: Send + 'static,\n    {\n        self.list_delete_tasks_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_list_stale_splits_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n     
           quickwit_common::tower::BoxService<\n                    ListStaleSplitsRequest,\n                    ListSplitsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ListStaleSplitsRequest,\n                Response = ListSplitsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ListStaleSplitsRequest>>::Future: Send + 'static,\n    {\n        self.list_stale_splits_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_open_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    OpenShardsRequest,\n                    OpenShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                OpenShardsRequest,\n                Response = OpenShardsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<OpenShardsRequest>>::Future: Send + 'static,\n    {\n        self.open_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_acquire_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    AcquireShardsRequest,\n                    AcquireShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                AcquireShardsRequest,\n                Response = AcquireShardsResponse,\n                Error = 
crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<AcquireShardsRequest>>::Future: Send + 'static,\n    {\n        self.acquire_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteShardsRequest,\n                    DeleteShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DeleteShardsRequest,\n                Response = DeleteShardsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<DeleteShardsRequest>>::Future: Send + 'static,\n    {\n        self.delete_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_prune_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    PruneShardsRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                PruneShardsRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<PruneShardsRequest>>::Future: Send + 'static,\n    {\n        self.prune_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_list_shards_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                
quickwit_common::tower::BoxService<\n                    ListShardsRequest,\n                    ListShardsResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ListShardsRequest,\n                Response = ListShardsResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<ListShardsRequest>>::Future: Send + 'static,\n    {\n        self.list_shards_layers.push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_create_index_template_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    CreateIndexTemplateRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                CreateIndexTemplateRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            CreateIndexTemplateRequest,\n        >>::Future: Send + 'static,\n    {\n        self.create_index_template_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_get_index_template_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetIndexTemplateRequest,\n                    GetIndexTemplateResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                GetIndexTemplateRequest,\n                Response = 
GetIndexTemplateResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<GetIndexTemplateRequest>>::Future: Send + 'static,\n    {\n        self.get_index_template_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_find_index_template_matches_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    FindIndexTemplateMatchesRequest,\n                    FindIndexTemplateMatchesResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                FindIndexTemplateMatchesRequest,\n                Response = FindIndexTemplateMatchesResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            FindIndexTemplateMatchesRequest,\n        >>::Future: Send + 'static,\n    {\n        self.find_index_template_matches_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_list_index_templates_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    ListIndexTemplatesRequest,\n                    ListIndexTemplatesResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                ListIndexTemplatesRequest,\n                Response = ListIndexTemplatesResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            ListIndexTemplatesRequest,\n        >>::Future: 
Send + 'static,\n    {\n        self.list_index_templates_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_delete_index_templates_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    DeleteIndexTemplatesRequest,\n                    EmptyResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                DeleteIndexTemplatesRequest,\n                Response = EmptyResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            DeleteIndexTemplatesRequest,\n        >>::Future: Send + 'static,\n    {\n        self.delete_index_templates_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn stack_get_cluster_identity_layer<L>(mut self, layer: L) -> Self\n    where\n        L: tower::Layer<\n                quickwit_common::tower::BoxService<\n                    GetClusterIdentityRequest,\n                    GetClusterIdentityResponse,\n                    crate::metastore::MetastoreError,\n                >,\n            > + Send + Sync + 'static,\n        L::Service: tower::Service<\n                GetClusterIdentityRequest,\n                Response = GetClusterIdentityResponse,\n                Error = crate::metastore::MetastoreError,\n            > + Clone + Send + Sync + 'static,\n        <L::Service as tower::Service<\n            GetClusterIdentityRequest,\n        >>::Future: Send + 'static,\n    {\n        self.get_cluster_identity_layers\n            .push(quickwit_common::tower::BoxLayer::new(layer));\n        self\n    }\n    pub fn build<T>(self, instance: T) -> MetastoreServiceClient\n    where\n        T: MetastoreService,\n    
{\n        let inner_client = InnerMetastoreServiceClient(std::sync::Arc::new(instance));\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_channel(\n        self,\n        addr: std::net::SocketAddr,\n        channel: tonic::transport::Channel,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> MetastoreServiceClient {\n        let client = MetastoreServiceClient::from_channel(\n            addr,\n            channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_balance_channel(\n        self,\n        balance_channel: quickwit_common::tower::BalanceChannel<std::net::SocketAddr>,\n        max_message_size: bytesize::ByteSize,\n        compression_encoding_opt: Option<tonic::codec::CompressionEncoding>,\n    ) -> MetastoreServiceClient {\n        let client = MetastoreServiceClient::from_balance_channel(\n            balance_channel,\n            max_message_size,\n            compression_encoding_opt,\n        );\n        let inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    pub fn build_from_mailbox<A>(\n        self,\n        mailbox: quickwit_actors::Mailbox<A>,\n    ) -> MetastoreServiceClient\n    where\n        A: quickwit_actors::Actor + std::fmt::Debug + Send + 'static,\n        MetastoreServiceMailbox<A>: MetastoreService,\n    {\n        let inner_client = InnerMetastoreServiceClient(\n            std::sync::Arc::new(MetastoreServiceMailbox::new(mailbox)),\n        );\n        self.build_from_inner_client(inner_client)\n    }\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn build_from_mock(self, mock: MockMetastoreService) -> MetastoreServiceClient {\n        let client = MetastoreServiceClient::from_mock(mock);\n        let 
inner_client = client.inner;\n        self.build_from_inner_client(inner_client)\n    }\n    fn build_from_inner_client(\n        self,\n        inner_client: InnerMetastoreServiceClient,\n    ) -> MetastoreServiceClient {\n        let create_index_svc = self\n            .create_index_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let update_index_svc = self\n            .update_index_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let index_metadata_svc = self\n            .index_metadata_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let indexes_metadata_svc = self\n            .indexes_metadata_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let list_indexes_metadata_svc = self\n            .list_indexes_metadata_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_index_svc = self\n            .delete_index_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let list_index_stats_svc = self\n            
.list_index_stats_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let list_splits_svc = self\n            .list_splits_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let stage_splits_svc = self\n            .stage_splits_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let publish_splits_svc = self\n            .publish_splits_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let mark_splits_for_deletion_svc = self\n            .mark_splits_for_deletion_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_splits_svc = self\n            .delete_splits_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let add_source_svc = self\n            .add_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let update_source_svc = self\n            
.update_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let toggle_source_svc = self\n            .toggle_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_source_svc = self\n            .delete_source_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let reset_source_checkpoint_svc = self\n            .reset_source_checkpoint_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let last_delete_opstamp_svc = self\n            .last_delete_opstamp_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let create_delete_task_svc = self\n            .create_delete_task_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let update_splits_delete_opstamp_svc = self\n            .update_splits_delete_opstamp_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            
);\n        let list_delete_tasks_svc = self\n            .list_delete_tasks_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let list_stale_splits_svc = self\n            .list_stale_splits_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let open_shards_svc = self\n            .open_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let acquire_shards_svc = self\n            .acquire_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_shards_svc = self\n            .delete_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let prune_shards_svc = self\n            .prune_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let list_shards_svc = self\n            .list_shards_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n       
 let create_index_template_svc = self\n            .create_index_template_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let get_index_template_svc = self\n            .get_index_template_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let find_index_template_matches_svc = self\n            .find_index_template_matches_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let list_index_templates_svc = self\n            .list_index_templates_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let delete_index_templates_svc = self\n            .delete_index_templates_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let get_cluster_identity_svc = self\n            .get_cluster_identity_layers\n            .into_iter()\n            .rev()\n            .fold(\n                quickwit_common::tower::BoxService::new(inner_client.clone()),\n                |svc, layer| layer.layer(svc),\n            );\n        let tower_svc_stack = MetastoreServiceTowerServiceStack {\n            inner: inner_client,\n            create_index_svc,\n            update_index_svc,\n            index_metadata_svc,\n            
indexes_metadata_svc,\n            list_indexes_metadata_svc,\n            delete_index_svc,\n            list_index_stats_svc,\n            list_splits_svc,\n            stage_splits_svc,\n            publish_splits_svc,\n            mark_splits_for_deletion_svc,\n            delete_splits_svc,\n            add_source_svc,\n            update_source_svc,\n            toggle_source_svc,\n            delete_source_svc,\n            reset_source_checkpoint_svc,\n            last_delete_opstamp_svc,\n            create_delete_task_svc,\n            update_splits_delete_opstamp_svc,\n            list_delete_tasks_svc,\n            list_stale_splits_svc,\n            open_shards_svc,\n            acquire_shards_svc,\n            delete_shards_svc,\n            prune_shards_svc,\n            list_shards_svc,\n            create_index_template_svc,\n            get_index_template_svc,\n            find_index_template_matches_svc,\n            list_index_templates_svc,\n            delete_index_templates_svc,\n            get_cluster_identity_svc,\n        };\n        MetastoreServiceClient::new(tower_svc_stack)\n    }\n}\n#[derive(Debug, Clone)]\nstruct MailboxAdapter<A: quickwit_actors::Actor, E> {\n    inner: quickwit_actors::Mailbox<A>,\n    phantom: std::marker::PhantomData<E>,\n}\nimpl<A, E> std::ops::Deref for MailboxAdapter<A, E>\nwhere\n    A: quickwit_actors::Actor,\n{\n    type Target = quickwit_actors::Mailbox<A>;\n    fn deref(&self) -> &Self::Target {\n        &self.inner\n    }\n}\n#[derive(Debug)]\npub struct MetastoreServiceMailbox<A: quickwit_actors::Actor> {\n    inner: MailboxAdapter<A, crate::metastore::MetastoreError>,\n}\nimpl<A: quickwit_actors::Actor> MetastoreServiceMailbox<A> {\n    pub fn new(instance: quickwit_actors::Mailbox<A>) -> Self {\n        let inner = MailboxAdapter {\n            inner: instance,\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A: quickwit_actors::Actor> Clone for 
MetastoreServiceMailbox<A> {\n    fn clone(&self) -> Self {\n        let inner = MailboxAdapter {\n            inner: self.inner.clone(),\n            phantom: std::marker::PhantomData,\n        };\n        Self { inner }\n    }\n}\nimpl<A, M, T, E> tower::Service<M> for MetastoreServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor\n        + quickwit_actors::DeferableReplyHandler<M, Reply = Result<T, E>> + Send\n        + 'static,\n    M: std::fmt::Debug + Send + 'static,\n    T: Send + 'static,\n    E: std::fmt::Debug + Send + 'static,\n    crate::metastore::MetastoreError: From<quickwit_actors::AskError<E>>,\n{\n    type Response = T;\n    type Error = crate::metastore::MetastoreError;\n    type Future = BoxFuture<Self::Response, Self::Error>;\n    fn poll_ready(\n        &mut self,\n        _cx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Result<(), Self::Error>> {\n        //! This does not work with balance middlewares such as `tower::balance::pool::Pool` because\n        //! this always returns `Poll::Ready`. The fix is to acquire a permit from the\n        //! 
mailbox in `poll_ready` and consume it in `call`.\n        std::task::Poll::Ready(Ok(()))\n    }\n    fn call(&mut self, message: M) -> Self::Future {\n        let mailbox = self.inner.clone();\n        let fut = async move {\n            mailbox.ask_for_res(message).await.map_err(|error| error.into())\n        };\n        Box::pin(fut)\n    }\n}\n#[async_trait::async_trait]\nimpl<A> MetastoreService for MetastoreServiceMailbox<A>\nwhere\n    A: quickwit_actors::Actor + std::fmt::Debug,\n    MetastoreServiceMailbox<\n        A,\n    >: tower::Service<\n            CreateIndexRequest,\n            Response = CreateIndexResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<CreateIndexResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            UpdateIndexRequest,\n            Response = IndexMetadataResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<IndexMetadataResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            IndexMetadataRequest,\n            Response = IndexMetadataResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<IndexMetadataResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            IndexesMetadataRequest,\n            Response = IndexesMetadataResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<IndexesMetadataResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ListIndexesMetadataRequest,\n            Response = ListIndexesMetadataResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                ListIndexesMetadataResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            DeleteIndexRequest,\n            
Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ListIndexStatsRequest,\n            Response = ListIndexStatsResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<ListIndexStatsResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ListSplitsRequest,\n            Response = MetastoreServiceStream<ListSplitsResponse>,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                MetastoreServiceStream<ListSplitsResponse>,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            StageSplitsRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            PublishSplitsRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            MarkSplitsForDeletionRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            DeleteSplitsRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            AddSourceRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, 
crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            UpdateSourceRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ToggleSourceRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            DeleteSourceRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ResetSourceCheckpointRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            LastDeleteOpstampRequest,\n            Response = LastDeleteOpstampResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                LastDeleteOpstampResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            DeleteQuery,\n            Response = DeleteTask,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<DeleteTask, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            UpdateSplitsDeleteOpstampRequest,\n            Response = UpdateSplitsDeleteOpstampResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                UpdateSplitsDeleteOpstampResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            
ListDeleteTasksRequest,\n            Response = ListDeleteTasksResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<ListDeleteTasksResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ListStaleSplitsRequest,\n            Response = ListSplitsResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<ListSplitsResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            OpenShardsRequest,\n            Response = OpenShardsResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<OpenShardsResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            AcquireShardsRequest,\n            Response = AcquireShardsResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<AcquireShardsResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            DeleteShardsRequest,\n            Response = DeleteShardsResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<DeleteShardsResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            PruneShardsRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            ListShardsRequest,\n            Response = ListShardsResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<ListShardsResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            CreateIndexTemplateRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, 
crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            GetIndexTemplateRequest,\n            Response = GetIndexTemplateResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                GetIndexTemplateResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            FindIndexTemplateMatchesRequest,\n            Response = FindIndexTemplateMatchesResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                FindIndexTemplateMatchesResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            ListIndexTemplatesRequest,\n            Response = ListIndexTemplatesResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                ListIndexTemplatesResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >\n        + tower::Service<\n            DeleteIndexTemplatesRequest,\n            Response = EmptyResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<EmptyResponse, crate::metastore::MetastoreError>,\n        >\n        + tower::Service<\n            GetClusterIdentityRequest,\n            Response = GetClusterIdentityResponse,\n            Error = crate::metastore::MetastoreError,\n            Future = BoxFuture<\n                GetClusterIdentityResponse,\n                crate::metastore::MetastoreError,\n            >,\n        >,\n{\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<CreateIndexResponse> {\n        self.clone().call(request).await\n    }\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        
self.clone().call(request).await\n    }\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.clone().call(request).await\n    }\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexesMetadataResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexesMetadataResponse> {\n        self.clone().call(request).await\n    }\n    async fn delete_index(\n        &self,\n        request: DeleteIndexRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexStatsResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        self.clone().call(request).await\n    }\n    async fn stage_splits(\n        &self,\n        request: StageSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn delete_splits(\n        &self,\n        request: DeleteSplitsRequest,\n    ) -> 
crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn add_source(\n        &self,\n        request: AddSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn update_source(\n        &self,\n        request: UpdateSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn toggle_source(\n        &self,\n        request: ToggleSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn delete_source(\n        &self,\n        request: DeleteSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<LastDeleteOpstampResponse> {\n        self.clone().call(request).await\n    }\n    async fn create_delete_task(\n        &self,\n        request: DeleteQuery,\n    ) -> crate::metastore::MetastoreResult<DeleteTask> {\n        self.clone().call(request).await\n    }\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> crate::metastore::MetastoreResult<ListDeleteTasksResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_stale_splits(\n        &self,\n        
request: ListStaleSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<ListSplitsResponse> {\n        self.clone().call(request).await\n    }\n    async fn open_shards(\n        &self,\n        request: OpenShardsRequest,\n    ) -> crate::metastore::MetastoreResult<OpenShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> crate::metastore::MetastoreResult<AcquireShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> crate::metastore::MetastoreResult<DeleteShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn prune_shards(\n        &self,\n        request: PruneShardsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_shards(\n        &self,\n        request: ListShardsRequest,\n    ) -> crate::metastore::MetastoreResult<ListShardsResponse> {\n        self.clone().call(request).await\n    }\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<GetIndexTemplateResponse> {\n        self.clone().call(request).await\n    }\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> crate::metastore::MetastoreResult<FindIndexTemplateMatchesResponse> {\n        self.clone().call(request).await\n    }\n    async fn list_index_templates(\n        &self,\n        request: ListIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexTemplatesResponse> {\n        
self.clone().call(request).await\n    }\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.clone().call(request).await\n    }\n    async fn get_cluster_identity(\n        &self,\n        request: GetClusterIdentityRequest,\n    ) -> crate::metastore::MetastoreResult<GetClusterIdentityResponse> {\n        self.clone().call(request).await\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if self.inner.is_disconnected() {\n            anyhow::bail!(\"actor `{}` is disconnected\", self.inner.actor_instance_id())\n        }\n        Ok(())\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        vec![\n            quickwit_common::uri::Uri::from_str(& format!(\"actor://localhost/{}\", self\n            .inner.actor_instance_id())).expect(\"URI should be valid\")\n        ]\n    }\n}\n#[derive(Debug, Clone)]\npub struct MetastoreServiceGrpcClientAdapter<T> {\n    inner: T,\n    #[allow(dead_code)]\n    connection_addrs_rx: tokio::sync::watch::Receiver<\n        std::collections::HashSet<std::net::SocketAddr>,\n    >,\n}\nimpl<T> MetastoreServiceGrpcClientAdapter<T> {\n    pub fn new(\n        instance: T,\n        connection_addrs_rx: tokio::sync::watch::Receiver<\n            std::collections::HashSet<std::net::SocketAddr>,\n        >,\n    ) -> Self {\n        Self {\n            inner: instance,\n            connection_addrs_rx,\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl<T> MetastoreService\nfor MetastoreServiceGrpcClientAdapter<\n    metastore_service_grpc_client::MetastoreServiceGrpcClient<T>,\n>\nwhere\n    T: tonic::client::GrpcService<tonic::body::Body> + std::fmt::Debug + Clone + Send\n        + Sync + 'static,\n    T::ResponseBody: tonic::codegen::Body<Data = tonic::codegen::Bytes> + Send + 'static,\n    <T::ResponseBody as tonic::codegen::Body>::Error: 
Into<tonic::codegen::StdError>\n        + Send,\n    T::Future: Send,\n{\n    async fn create_index(\n        &self,\n        request: CreateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<CreateIndexResponse> {\n        self.inner\n            .clone()\n            .create_index(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                CreateIndexRequest::rpc_name(),\n            ))\n    }\n    async fn update_index(\n        &self,\n        request: UpdateIndexRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.inner\n            .clone()\n            .update_index(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                UpdateIndexRequest::rpc_name(),\n            ))\n    }\n    async fn index_metadata(\n        &self,\n        request: IndexMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexMetadataResponse> {\n        self.inner\n            .clone()\n            .index_metadata(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                IndexMetadataRequest::rpc_name(),\n            ))\n    }\n    async fn indexes_metadata(\n        &self,\n        request: IndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<IndexesMetadataResponse> {\n        self.inner\n            .clone()\n            .indexes_metadata(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                IndexesMetadataRequest::rpc_name(),\n            ))\n    }\n    async 
fn list_indexes_metadata(\n        &self,\n        request: ListIndexesMetadataRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexesMetadataResponse> {\n        self.inner\n            .clone()\n            .list_indexes_metadata(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListIndexesMetadataRequest::rpc_name(),\n            ))\n    }\n    async fn delete_index(\n        &self,\n        request: DeleteIndexRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .delete_index(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                DeleteIndexRequest::rpc_name(),\n            ))\n    }\n    async fn list_index_stats(\n        &self,\n        request: ListIndexStatsRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexStatsResponse> {\n        self.inner\n            .clone()\n            .list_index_stats(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListIndexStatsRequest::rpc_name(),\n            ))\n    }\n    async fn list_splits(\n        &self,\n        request: ListSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<MetastoreServiceStream<ListSplitsResponse>> {\n        self.inner\n            .clone()\n            .list_splits(request)\n            .await\n            .map(|response| {\n                let streaming: tonic::Streaming<_> = response.into_inner();\n                let stream = quickwit_common::ServiceStream::from(streaming);\n                stream\n                    .map_err(|status| 
crate::error::grpc_status_to_service_error(\n                        status,\n                        ListSplitsRequest::rpc_name(),\n                    ))\n            })\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListSplitsRequest::rpc_name(),\n            ))\n    }\n    async fn stage_splits(\n        &self,\n        request: StageSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .stage_splits(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                StageSplitsRequest::rpc_name(),\n            ))\n    }\n    async fn publish_splits(\n        &self,\n        request: PublishSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .publish_splits(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                PublishSplitsRequest::rpc_name(),\n            ))\n    }\n    async fn mark_splits_for_deletion(\n        &self,\n        request: MarkSplitsForDeletionRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .mark_splits_for_deletion(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                MarkSplitsForDeletionRequest::rpc_name(),\n            ))\n    }\n    async fn delete_splits(\n        &self,\n        request: DeleteSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            
.delete_splits(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                DeleteSplitsRequest::rpc_name(),\n            ))\n    }\n    async fn add_source(\n        &self,\n        request: AddSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .add_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                AddSourceRequest::rpc_name(),\n            ))\n    }\n    async fn update_source(\n        &self,\n        request: UpdateSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .update_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                UpdateSourceRequest::rpc_name(),\n            ))\n    }\n    async fn toggle_source(\n        &self,\n        request: ToggleSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .toggle_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ToggleSourceRequest::rpc_name(),\n            ))\n    }\n    async fn delete_source(\n        &self,\n        request: DeleteSourceRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .delete_source(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| 
crate::error::grpc_status_to_service_error(\n                status,\n                DeleteSourceRequest::rpc_name(),\n            ))\n    }\n    async fn reset_source_checkpoint(\n        &self,\n        request: ResetSourceCheckpointRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .reset_source_checkpoint(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ResetSourceCheckpointRequest::rpc_name(),\n            ))\n    }\n    async fn last_delete_opstamp(\n        &self,\n        request: LastDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<LastDeleteOpstampResponse> {\n        self.inner\n            .clone()\n            .last_delete_opstamp(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                LastDeleteOpstampRequest::rpc_name(),\n            ))\n    }\n    async fn create_delete_task(\n        &self,\n        request: DeleteQuery,\n    ) -> crate::metastore::MetastoreResult<DeleteTask> {\n        self.inner\n            .clone()\n            .create_delete_task(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                DeleteQuery::rpc_name(),\n            ))\n    }\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: UpdateSplitsDeleteOpstampRequest,\n    ) -> crate::metastore::MetastoreResult<UpdateSplitsDeleteOpstampResponse> {\n        self.inner\n            .clone()\n            .update_splits_delete_opstamp(request)\n            .await\n            .map(|response| response.into_inner())\n            
.map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                UpdateSplitsDeleteOpstampRequest::rpc_name(),\n            ))\n    }\n    async fn list_delete_tasks(\n        &self,\n        request: ListDeleteTasksRequest,\n    ) -> crate::metastore::MetastoreResult<ListDeleteTasksResponse> {\n        self.inner\n            .clone()\n            .list_delete_tasks(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListDeleteTasksRequest::rpc_name(),\n            ))\n    }\n    async fn list_stale_splits(\n        &self,\n        request: ListStaleSplitsRequest,\n    ) -> crate::metastore::MetastoreResult<ListSplitsResponse> {\n        self.inner\n            .clone()\n            .list_stale_splits(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListStaleSplitsRequest::rpc_name(),\n            ))\n    }\n    async fn open_shards(\n        &self,\n        request: OpenShardsRequest,\n    ) -> crate::metastore::MetastoreResult<OpenShardsResponse> {\n        self.inner\n            .clone()\n            .open_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                OpenShardsRequest::rpc_name(),\n            ))\n    }\n    async fn acquire_shards(\n        &self,\n        request: AcquireShardsRequest,\n    ) -> crate::metastore::MetastoreResult<AcquireShardsResponse> {\n        self.inner\n            .clone()\n            .acquire_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| 
crate::error::grpc_status_to_service_error(\n                status,\n                AcquireShardsRequest::rpc_name(),\n            ))\n    }\n    async fn delete_shards(\n        &self,\n        request: DeleteShardsRequest,\n    ) -> crate::metastore::MetastoreResult<DeleteShardsResponse> {\n        self.inner\n            .clone()\n            .delete_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                DeleteShardsRequest::rpc_name(),\n            ))\n    }\n    async fn prune_shards(\n        &self,\n        request: PruneShardsRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .prune_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                PruneShardsRequest::rpc_name(),\n            ))\n    }\n    async fn list_shards(\n        &self,\n        request: ListShardsRequest,\n    ) -> crate::metastore::MetastoreResult<ListShardsResponse> {\n        self.inner\n            .clone()\n            .list_shards(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListShardsRequest::rpc_name(),\n            ))\n    }\n    async fn create_index_template(\n        &self,\n        request: CreateIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .create_index_template(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                
CreateIndexTemplateRequest::rpc_name(),\n            ))\n    }\n    async fn get_index_template(\n        &self,\n        request: GetIndexTemplateRequest,\n    ) -> crate::metastore::MetastoreResult<GetIndexTemplateResponse> {\n        self.inner\n            .clone()\n            .get_index_template(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                GetIndexTemplateRequest::rpc_name(),\n            ))\n    }\n    async fn find_index_template_matches(\n        &self,\n        request: FindIndexTemplateMatchesRequest,\n    ) -> crate::metastore::MetastoreResult<FindIndexTemplateMatchesResponse> {\n        self.inner\n            .clone()\n            .find_index_template_matches(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                FindIndexTemplateMatchesRequest::rpc_name(),\n            ))\n    }\n    async fn list_index_templates(\n        &self,\n        request: ListIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<ListIndexTemplatesResponse> {\n        self.inner\n            .clone()\n            .list_index_templates(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                ListIndexTemplatesRequest::rpc_name(),\n            ))\n    }\n    async fn delete_index_templates(\n        &self,\n        request: DeleteIndexTemplatesRequest,\n    ) -> crate::metastore::MetastoreResult<EmptyResponse> {\n        self.inner\n            .clone()\n            .delete_index_templates(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| 
crate::error::grpc_status_to_service_error(\n                status,\n                DeleteIndexTemplatesRequest::rpc_name(),\n            ))\n    }\n    async fn get_cluster_identity(\n        &self,\n        request: GetClusterIdentityRequest,\n    ) -> crate::metastore::MetastoreResult<GetClusterIdentityResponse> {\n        self.inner\n            .clone()\n            .get_cluster_identity(request)\n            .await\n            .map(|response| response.into_inner())\n            .map_err(|status| crate::error::grpc_status_to_service_error(\n                status,\n                GetClusterIdentityRequest::rpc_name(),\n            ))\n    }\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if self.connection_addrs_rx.borrow().is_empty() {\n            anyhow::bail!(\"no server currently available\")\n        }\n        Ok(())\n    }\n    fn endpoints(&self) -> Vec<quickwit_common::uri::Uri> {\n        self.connection_addrs_rx\n            .borrow()\n            .iter()\n            .flat_map(|addr| quickwit_common::uri::Uri::from_str(\n                &format!(\"grpc://{addr}/{}.{}\", \"quickwit.metastore\", \"MetastoreService\"),\n            ))\n            .collect()\n    }\n}\n#[derive(Debug)]\npub struct MetastoreServiceGrpcServerAdapter {\n    inner: InnerMetastoreServiceClient,\n}\nimpl MetastoreServiceGrpcServerAdapter {\n    pub fn new<T>(instance: T) -> Self\n    where\n        T: MetastoreService,\n    {\n        Self {\n            inner: InnerMetastoreServiceClient(std::sync::Arc::new(instance)),\n        }\n    }\n}\n#[async_trait::async_trait]\nimpl metastore_service_grpc_server::MetastoreServiceGrpc\nfor MetastoreServiceGrpcServerAdapter {\n    async fn create_index(\n        &self,\n        request: tonic::Request<CreateIndexRequest>,\n    ) -> Result<tonic::Response<CreateIndexResponse>, tonic::Status> {\n        self.inner\n            .0\n            .create_index(request.into_inner())\n            .await\n        
    .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn update_index(\n        &self,\n        request: tonic::Request<UpdateIndexRequest>,\n    ) -> Result<tonic::Response<IndexMetadataResponse>, tonic::Status> {\n        self.inner\n            .0\n            .update_index(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn index_metadata(\n        &self,\n        request: tonic::Request<IndexMetadataRequest>,\n    ) -> Result<tonic::Response<IndexMetadataResponse>, tonic::Status> {\n        self.inner\n            .0\n            .index_metadata(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn indexes_metadata(\n        &self,\n        request: tonic::Request<IndexesMetadataRequest>,\n    ) -> Result<tonic::Response<IndexesMetadataResponse>, tonic::Status> {\n        self.inner\n            .0\n            .indexes_metadata(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn list_indexes_metadata(\n        &self,\n        request: tonic::Request<ListIndexesMetadataRequest>,\n    ) -> Result<tonic::Response<ListIndexesMetadataResponse>, tonic::Status> {\n        self.inner\n            .0\n            .list_indexes_metadata(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_index(\n        &self,\n        request: tonic::Request<DeleteIndexRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .delete_index(request.into_inner())\n            .await\n            
.map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn list_index_stats(\n        &self,\n        request: tonic::Request<ListIndexStatsRequest>,\n    ) -> Result<tonic::Response<ListIndexStatsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .list_index_stats(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    type ListSplitsStream = quickwit_common::ServiceStream<\n        tonic::Result<ListSplitsResponse>,\n    >;\n    async fn list_splits(\n        &self,\n        request: tonic::Request<ListSplitsRequest>,\n    ) -> Result<tonic::Response<Self::ListSplitsStream>, tonic::Status> {\n        self.inner\n            .0\n            .list_splits(request.into_inner())\n            .await\n            .map(|stream| tonic::Response::new(\n                stream.map_err(crate::error::grpc_error_to_grpc_status),\n            ))\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn stage_splits(\n        &self,\n        request: tonic::Request<StageSplitsRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .stage_splits(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn publish_splits(\n        &self,\n        request: tonic::Request<PublishSplitsRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .publish_splits(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn mark_splits_for_deletion(\n        &self,\n        request: tonic::Request<MarkSplitsForDeletionRequest>,\n    ) -> 
Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .mark_splits_for_deletion(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_splits(\n        &self,\n        request: tonic::Request<DeleteSplitsRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .delete_splits(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn add_source(\n        &self,\n        request: tonic::Request<AddSourceRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .add_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn update_source(\n        &self,\n        request: tonic::Request<UpdateSourceRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .update_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn toggle_source(\n        &self,\n        request: tonic::Request<ToggleSourceRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .toggle_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_source(\n        &self,\n        request: tonic::Request<DeleteSourceRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n           
 .0\n            .delete_source(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn reset_source_checkpoint(\n        &self,\n        request: tonic::Request<ResetSourceCheckpointRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .reset_source_checkpoint(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn last_delete_opstamp(\n        &self,\n        request: tonic::Request<LastDeleteOpstampRequest>,\n    ) -> Result<tonic::Response<LastDeleteOpstampResponse>, tonic::Status> {\n        self.inner\n            .0\n            .last_delete_opstamp(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn create_delete_task(\n        &self,\n        request: tonic::Request<DeleteQuery>,\n    ) -> Result<tonic::Response<DeleteTask>, tonic::Status> {\n        self.inner\n            .0\n            .create_delete_task(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn update_splits_delete_opstamp(\n        &self,\n        request: tonic::Request<UpdateSplitsDeleteOpstampRequest>,\n    ) -> Result<tonic::Response<UpdateSplitsDeleteOpstampResponse>, tonic::Status> {\n        self.inner\n            .0\n            .update_splits_delete_opstamp(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn list_delete_tasks(\n        &self,\n        request: tonic::Request<ListDeleteTasksRequest>,\n    ) -> Result<tonic::Response<ListDeleteTasksResponse>, 
tonic::Status> {\n        self.inner\n            .0\n            .list_delete_tasks(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn list_stale_splits(\n        &self,\n        request: tonic::Request<ListStaleSplitsRequest>,\n    ) -> Result<tonic::Response<ListSplitsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .list_stale_splits(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn open_shards(\n        &self,\n        request: tonic::Request<OpenShardsRequest>,\n    ) -> Result<tonic::Response<OpenShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .open_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn acquire_shards(\n        &self,\n        request: tonic::Request<AcquireShardsRequest>,\n    ) -> Result<tonic::Response<AcquireShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .acquire_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_shards(\n        &self,\n        request: tonic::Request<DeleteShardsRequest>,\n    ) -> Result<tonic::Response<DeleteShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .delete_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn prune_shards(\n        &self,\n        request: tonic::Request<PruneShardsRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n 
           .prune_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn list_shards(\n        &self,\n        request: tonic::Request<ListShardsRequest>,\n    ) -> Result<tonic::Response<ListShardsResponse>, tonic::Status> {\n        self.inner\n            .0\n            .list_shards(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn create_index_template(\n        &self,\n        request: tonic::Request<CreateIndexTemplateRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .create_index_template(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn get_index_template(\n        &self,\n        request: tonic::Request<GetIndexTemplateRequest>,\n    ) -> Result<tonic::Response<GetIndexTemplateResponse>, tonic::Status> {\n        self.inner\n            .0\n            .get_index_template(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn find_index_template_matches(\n        &self,\n        request: tonic::Request<FindIndexTemplateMatchesRequest>,\n    ) -> Result<tonic::Response<FindIndexTemplateMatchesResponse>, tonic::Status> {\n        self.inner\n            .0\n            .find_index_template_matches(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn list_index_templates(\n        &self,\n        request: tonic::Request<ListIndexTemplatesRequest>,\n    ) -> Result<tonic::Response<ListIndexTemplatesResponse>, 
tonic::Status> {\n        self.inner\n            .0\n            .list_index_templates(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn delete_index_templates(\n        &self,\n        request: tonic::Request<DeleteIndexTemplatesRequest>,\n    ) -> Result<tonic::Response<EmptyResponse>, tonic::Status> {\n        self.inner\n            .0\n            .delete_index_templates(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n    async fn get_cluster_identity(\n        &self,\n        request: tonic::Request<GetClusterIdentityRequest>,\n    ) -> Result<tonic::Response<GetClusterIdentityResponse>, tonic::Status> {\n        self.inner\n            .0\n            .get_cluster_identity(request.into_inner())\n            .await\n            .map(tonic::Response::new)\n            .map_err(crate::error::grpc_error_to_grpc_status)\n    }\n}\n/// Generated client implementations.\npub mod metastore_service_grpc_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    /// Metastore meant to manage Quickwit's indexes, their splits and delete tasks.\n    ///\n    /// I. Index and splits management.\n    ///\n    /// Quickwit needs a way to ensure that we can cleanup unused files,\n    /// and this process needs to be resilient to any fail-stop failures.\n    /// We rely on atomically transitioning the status of splits.\n    ///\n    /// The split state goes through the following life cycle:\n    ///\n    /// 1. `Staged`\n    ///\n    /// * Start uploading the split files.\n    ///\n    /// 2. 
`Published`\n    ///\n    /// * Uploading the split files is complete and the split is searchable.\n    ///\n    /// 3. `MarkedForDeletion`\n    ///\n    /// * Mark the split for deletion.\n    ///\n    /// If a split has a file in the storage, it MUST be registered in the metastore,\n    /// and its state can be as follows:\n    ///\n    /// * `Staged`: The split is almost ready. Some of its files may have been uploaded in the storage.\n    /// * `Published`: The split is ready and published.\n    /// * `MarkedForDeletion`: The split is marked for deletion.\n    ///\n    /// Before creating any file, we need to stage the split. If there is a failure, upon recovery, we\n    /// schedule for deletion all the staged splits. A client may not necessarily remove files from\n    /// storage right after marking it for deletion. A CLI client may delete files right away, but a\n    /// more serious deployment should probably only delete those files after a grace period so that the\n    /// running search queries can complete.\n    ///\n    /// II. Delete tasks management.\n    ///\n    /// A delete task is defined on a given index and by a search query. It can be\n    /// applied to all the splits of the index.\n    ///\n    /// Quickwit needs a way to track that a delete task has been applied to a split. This is ensured\n    /// by two mechanisms:\n    ///\n    /// * On creation of a delete task, we give the task a monotonically increasing opstamp (uniqueness\n    ///  and monotonicity must hold at the index level).\n    /// * When a delete task is executed on a split, that is when the documents matched by the search\n    ///  query are removed from the splits, we update the split's `delete_opstamp` to the value of the\n    ///  task's opstamp. This marks the split as \"up-to-date\" regarding this delete task. 
If new delete\n    ///  tasks are added, we will know that we need to run these delete tasks on the split, as its\n    ///  `delete_opstamp` will be lower than the `opstamp` of the new tasks.\n    ///\n    /// For splits created after a given delete task, Quickwit's indexing ensures that these splits\n    /// are created with a `delete_opstamp` equal to the latest opstamp of the tasks of the\n    /// corresponding index.\n    #[derive(Debug, Clone)]\n    pub struct MetastoreServiceGrpcClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl MetastoreServiceGrpcClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> MetastoreServiceGrpcClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> MetastoreServiceGrpcClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n       
         Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            MetastoreServiceGrpcClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Creates an index.\n        ///\n        /// This API creates a new index in the metastore.\n        /// An error will occur if an index that already exists in the storage is specified.\n        pub async fn create_index(\n            &mut self,\n            request: impl 
tonic::IntoRequest<super::CreateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CreateIndexResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/CreateIndex\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"CreateIndex\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Update an index.\n        pub async fn update_index(\n            &mut self,\n            request: impl tonic::IntoRequest<super::UpdateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::IndexMetadataResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/UpdateIndex\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"UpdateIndex\"),\n                );\n            self.inner.unary(req, 
path, codec).await\n        }\n        /// Returns the `IndexMetadata` of an index identified by its IndexID or its IndexUID.\n        pub async fn index_metadata(\n            &mut self,\n            request: impl tonic::IntoRequest<super::IndexMetadataRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::IndexMetadataResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/IndexMetadata\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"IndexMetadata\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Fetches the metadata of a list of indexes identified by their Index IDs or UIDs.\n        pub async fn indexes_metadata(\n            &mut self,\n            request: impl tonic::IntoRequest<super::IndexesMetadataRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::IndexesMetadataResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = 
http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/IndexesMetadata\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"IndexesMetadata\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Lists indexes metadata.\n        pub async fn list_indexes_metadata(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ListIndexesMetadataRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListIndexesMetadataResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListIndexesMetadata\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ListIndexesMetadata\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Deletes an index.\n        pub async fn delete_index(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DeleteIndexRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                
.ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/DeleteIndex\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"DeleteIndex\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Returns a list of size info for each index.\n        pub async fn list_index_stats(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ListIndexStatsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListIndexStatsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListIndexStats\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ListIndexStats\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Streams splits from index.\n        pub async fn list_splits(\n    
        &mut self,\n            request: impl tonic::IntoRequest<super::ListSplitsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<tonic::codec::Streaming<super::ListSplitsResponse>>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListSplits\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"ListSplits\"),\n                );\n            self.inner.server_streaming(req, path, codec).await\n        }\n        /// Stages several splits.\n        pub async fn stage_splits(\n            &mut self,\n            request: impl tonic::IntoRequest<super::StageSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/StageSplits\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"StageSplits\"),\n                
);\n            self.inner.unary(req, path, codec).await\n        }\n        /// Publishes split.\n        pub async fn publish_splits(\n            &mut self,\n            request: impl tonic::IntoRequest<super::PublishSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/PublishSplits\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"PublishSplits\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Marks splits for deletion.\n        pub async fn mark_splits_for_deletion(\n            &mut self,\n            request: impl tonic::IntoRequest<super::MarkSplitsForDeletionRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/MarkSplitsForDeletion\",\n            );\n            let mut req = 
request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"MarkSplitsForDeletion\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Deletes splits.\n        pub async fn delete_splits(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DeleteSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/DeleteSplits\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"DeleteSplits\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Adds a source.\n        pub async fn add_source(\n            &mut self,\n            request: impl tonic::IntoRequest<super::AddSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = 
tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/AddSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"AddSource\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Updates a source.\n        pub async fn update_source(\n            &mut self,\n            request: impl tonic::IntoRequest<super::UpdateSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/UpdateSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"UpdateSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Toggles (turns on or off) source.\n        pub async fn toggle_source(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ToggleSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    
tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ToggleSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ToggleSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Removes source.\n        pub async fn delete_source(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DeleteSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/DeleteSource\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"DeleteSource\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Resets source checkpoint.\n        pub async fn reset_source_checkpoint(\n            &mut self,\n            request: impl 
tonic::IntoRequest<super::ResetSourceCheckpointRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ResetSourceCheckpoint\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ResetSourceCheckpoint\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Gets last opstamp for a given `index_id`.\n        pub async fn last_delete_opstamp(\n            &mut self,\n            request: impl tonic::IntoRequest<super::LastDeleteOpstampRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::LastDeleteOpstampResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/LastDeleteOpstamp\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n       
                 \"quickwit.metastore.MetastoreService\",\n                        \"LastDeleteOpstamp\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Creates a delete task.\n        pub async fn create_delete_task(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DeleteQuery>,\n        ) -> std::result::Result<tonic::Response<super::DeleteTask>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/CreateDeleteTask\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"CreateDeleteTask\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Updates splits `delete_opstamp`.\n        pub async fn update_splits_delete_opstamp(\n            &mut self,\n            request: impl tonic::IntoRequest<super::UpdateSplitsDeleteOpstampRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::UpdateSplitsDeleteOpstampResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = 
tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/UpdateSplitsDeleteOpstamp\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"UpdateSplitsDeleteOpstamp\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Lists delete tasks with `delete_task.opstamp` > `opstamp_start` for a given `index_id`.\n        pub async fn list_delete_tasks(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ListDeleteTasksRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListDeleteTasksResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListDeleteTasks\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ListDeleteTasks\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Lists splits with `split.delete_opstamp` \\< `delete_opstamp` for a given `index_id`.\n        pub async fn list_stale_splits(\n            &mut self,\n            
request: impl tonic::IntoRequest<super::ListStaleSplitsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListSplitsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListStaleSplits\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ListStaleSplits\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Shard API\n        ///\n        /// Note that for the file-backed metastore implementation, the requests are not processed atomically.\n        /// Indeed, each request comprises one or more subrequests that target different indexes and sources processed\n        /// independently. 
Responses list the requests that succeeded or failed in the fields `successes` and\n        /// `failures`.\n        pub async fn open_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::OpenShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::OpenShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/OpenShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"OpenShards\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Acquires a set of shards for indexing. This RPC locks the shards for publishing thanks to a publish token and only\n        /// the last indexer that has acquired the shards is allowed to publish. The response returns for each subrequest the\n        /// list of acquired shards along with the positions to index from.\n        ///\n        /// If a requested shard is missing, this method does not return an error. 
It should simply return the list of\n        /// shards that were actually acquired.\n        ///\n        /// For this reason, AcquireShards.acquire_shards may return fewer subresponses than there were subrequests in the request.\n        /// They may also be returned in any order.\n        pub async fn acquire_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::AcquireShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::AcquireShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/AcquireShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"AcquireShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Deletes a set of shards. 
This RPC deletes the shards from the metastore.\n        /// If the shard did not exist to begin with, the operation is successful and does not return any error.\n        pub async fn delete_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DeleteShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::DeleteShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/DeleteShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"DeleteShards\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Deletes outdated shards. 
This RPC deletes the shards from the metastore.\n        pub async fn prune_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::PruneShardsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/PruneShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"PruneShards\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn list_shards(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ListShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListShardsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListShards\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.metastore.MetastoreService\", \"ListShards\"),\n   
             );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Creates an index template.\n        pub async fn create_index_template(\n            &mut self,\n            request: impl tonic::IntoRequest<super::CreateIndexTemplateRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/CreateIndexTemplate\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"CreateIndexTemplate\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Fetches an index template.\n        pub async fn get_index_template(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetIndexTemplateRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetIndexTemplateResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                
\"/quickwit.metastore.MetastoreService/GetIndexTemplate\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"GetIndexTemplate\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Finds matching index templates.\n        pub async fn find_index_template_matches(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FindIndexTemplateMatchesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FindIndexTemplateMatchesResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/FindIndexTemplateMatches\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"FindIndexTemplateMatches\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Returns all the index templates.\n        pub async fn list_index_templates(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ListIndexTemplatesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListIndexTemplatesResponse>,\n            tonic::Status,\n   
     > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/ListIndexTemplates\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"ListIndexTemplates\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Deletes index templates.\n        pub async fn delete_index_templates(\n            &mut self,\n            request: impl tonic::IntoRequest<super::DeleteIndexTemplatesRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/DeleteIndexTemplates\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"DeleteIndexTemplates\",\n                    ),\n                );\n            self.inner.unary(req, path, 
codec).await\n        }\n        /// Get cluster identity\n        pub async fn get_cluster_identity(\n            &mut self,\n            request: impl tonic::IntoRequest<super::GetClusterIdentityRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetClusterIdentityResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.metastore.MetastoreService/GetClusterIdentity\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\n                        \"quickwit.metastore.MetastoreService\",\n                        \"GetClusterIdentity\",\n                    ),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod metastore_service_grpc_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with MetastoreServiceGrpcServer.\n    #[async_trait]\n    pub trait MetastoreServiceGrpc: std::marker::Send + std::marker::Sync + 'static {\n        /// Creates an index.\n        ///\n        /// This API creates a new index in the metastore.\n        /// An error will occur if an index that already exists in the storage is specified.\n        async fn create_index(\n            &self,\n            request: 
tonic::Request<super::CreateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::CreateIndexResponse>,\n            tonic::Status,\n        >;\n        /// Update an index.\n        async fn update_index(\n            &self,\n            request: tonic::Request<super::UpdateIndexRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::IndexMetadataResponse>,\n            tonic::Status,\n        >;\n        /// Returns the `IndexMetadata` of an index identified by its IndexID or its IndexUID.\n        async fn index_metadata(\n            &self,\n            request: tonic::Request<super::IndexMetadataRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::IndexMetadataResponse>,\n            tonic::Status,\n        >;\n        /// Fetches the metadata of a list of indexes identified by their Index IDs or UIDs.\n        async fn indexes_metadata(\n            &self,\n            request: tonic::Request<super::IndexesMetadataRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::IndexesMetadataResponse>,\n            tonic::Status,\n        >;\n        /// Gets the indexes' metadata.\n        async fn list_indexes_metadata(\n            &self,\n            request: tonic::Request<super::ListIndexesMetadataRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListIndexesMetadataResponse>,\n            tonic::Status,\n        >;\n        /// Deletes an index\n        async fn delete_index(\n            &self,\n            request: tonic::Request<super::DeleteIndexRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Returns a list of size info for each index.\n        async fn list_index_stats(\n            &self,\n            request: tonic::Request<super::ListIndexStatsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListIndexStatsResponse>,\n     
       tonic::Status,\n        >;\n        /// Server streaming response type for the ListSplits method.\n        type ListSplitsStream: tonic::codegen::tokio_stream::Stream<\n                Item = std::result::Result<super::ListSplitsResponse, tonic::Status>,\n            >\n            + std::marker::Send\n            + 'static;\n        /// Streams splits from index.\n        async fn list_splits(\n            &self,\n            request: tonic::Request<super::ListSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<Self::ListSplitsStream>, tonic::Status>;\n        /// Stages several splits.\n        async fn stage_splits(\n            &self,\n            request: tonic::Request<super::StageSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Publishes split.\n        async fn publish_splits(\n            &self,\n            request: tonic::Request<super::PublishSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Marks splits for deletion.\n        async fn mark_splits_for_deletion(\n            &self,\n            request: tonic::Request<super::MarkSplitsForDeletionRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Deletes splits.\n        async fn delete_splits(\n            &self,\n            request: tonic::Request<super::DeleteSplitsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Adds a source.\n        async fn add_source(\n            &self,\n            request: tonic::Request<super::AddSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Updates a source.\n        async fn update_source(\n            &self,\n            request: tonic::Request<super::UpdateSourceRequest>,\n        ) -> 
std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Toggles (turns on or off) source.\n        async fn toggle_source(\n            &self,\n            request: tonic::Request<super::ToggleSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Removes source.\n        async fn delete_source(\n            &self,\n            request: tonic::Request<super::DeleteSourceRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Resets source checkpoint.\n        async fn reset_source_checkpoint(\n            &self,\n            request: tonic::Request<super::ResetSourceCheckpointRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Gets last opstamp for a given `index_id`.\n        async fn last_delete_opstamp(\n            &self,\n            request: tonic::Request<super::LastDeleteOpstampRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::LastDeleteOpstampResponse>,\n            tonic::Status,\n        >;\n        /// Creates a delete task.\n        async fn create_delete_task(\n            &self,\n            request: tonic::Request<super::DeleteQuery>,\n        ) -> std::result::Result<tonic::Response<super::DeleteTask>, tonic::Status>;\n        /// Updates splits `delete_opstamp`.\n        async fn update_splits_delete_opstamp(\n            &self,\n            request: tonic::Request<super::UpdateSplitsDeleteOpstampRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::UpdateSplitsDeleteOpstampResponse>,\n            tonic::Status,\n        >;\n        /// Lists delete tasks with `delete_task.opstamp` > `opstamp_start` for a given `index_id`.\n        async fn list_delete_tasks(\n            &self,\n            request: tonic::Request<super::ListDeleteTasksRequest>,\n        ) -> std::result::Result<\n  
          tonic::Response<super::ListDeleteTasksResponse>,\n            tonic::Status,\n        >;\n        /// Lists splits with `split.delete_opstamp` \\< `delete_opstamp` for a given `index_id`.\n        async fn list_stale_splits(\n            &self,\n            request: tonic::Request<super::ListStaleSplitsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListSplitsResponse>,\n            tonic::Status,\n        >;\n        /// Shard API\n        ///\n        /// Note that for the file-backed metastore implementation, the requests are not processed atomically.\n        /// Indeed, each request comprises one or more subrequests that target different indexes and sources processed\n        /// independently. Responses list the requests that succeeded or failed in the fields `successes` and\n        /// `failures`.\n        async fn open_shards(\n            &self,\n            request: tonic::Request<super::OpenShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::OpenShardsResponse>,\n            tonic::Status,\n        >;\n        /// Acquires a set of shards for indexing. This RPC locks the shards for publishing thanks to a publish token and only\n        /// the last indexer that has acquired the shards is allowed to publish. The response returns for each subrequest the\n        /// list of acquired shards along with the positions to index from.\n        ///\n        /// If a requested shard is missing, this method does not return an error. 
It should simply return the list of\n        /// shards that were actually acquired.\n        ///\n        /// For this reason, AcquireShards.acquire_shards may return fewer subresponses than there were subrequests in the request.\n        /// Also, they may be returned in any order.\n        async fn acquire_shards(\n            &self,\n            request: tonic::Request<super::AcquireShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::AcquireShardsResponse>,\n            tonic::Status,\n        >;\n        /// Deletes a set of shards. This RPC deletes the shards from the metastore.\n        /// If the shard did not exist to begin with, the operation is successful and does not return any error.\n        async fn delete_shards(\n            &self,\n            request: tonic::Request<super::DeleteShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::DeleteShardsResponse>,\n            tonic::Status,\n        >;\n        /// Deletes outdated shards. 
This RPC deletes the shards from the metastore.\n        async fn prune_shards(\n            &self,\n            request: tonic::Request<super::PruneShardsRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        async fn list_shards(\n            &self,\n            request: tonic::Request<super::ListShardsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListShardsResponse>,\n            tonic::Status,\n        >;\n        /// Creates an index template.\n        async fn create_index_template(\n            &self,\n            request: tonic::Request<super::CreateIndexTemplateRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// Fetches an index template.\n        async fn get_index_template(\n            &self,\n            request: tonic::Request<super::GetIndexTemplateRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetIndexTemplateResponse>,\n            tonic::Status,\n        >;\n        /// Finds matching index templates.\n        async fn find_index_template_matches(\n            &self,\n            request: tonic::Request<super::FindIndexTemplateMatchesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FindIndexTemplateMatchesResponse>,\n            tonic::Status,\n        >;\n        /// Returns all the index templates.\n        async fn list_index_templates(\n            &self,\n            request: tonic::Request<super::ListIndexTemplatesRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListIndexTemplatesResponse>,\n            tonic::Status,\n        >;\n        /// Deletes index templates.\n        async fn delete_index_templates(\n            &self,\n            request: tonic::Request<super::DeleteIndexTemplatesRequest>,\n        ) -> std::result::Result<tonic::Response<super::EmptyResponse>, tonic::Status>;\n        /// 
Get cluster identity\n        async fn get_cluster_identity(\n            &self,\n            request: tonic::Request<super::GetClusterIdentityRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::GetClusterIdentityResponse>,\n            tonic::Status,\n        >;\n    }\n    /// Metastore meant to manage Quickwit's indexes, their splits and delete tasks.\n    ///\n    /// I. Index and splits management.\n    ///\n    /// Quickwit needs a way to ensure that we can clean up unused files,\n    /// and this process needs to be resilient to any fail-stop failures.\n    /// We rely on atomically transitioning the status of splits.\n    ///\n    /// The split state goes through the following life cycle:\n    ///\n    /// 1. `Staged`\n    ///\n    /// * Start uploading the split files.\n    ///\n    /// 2. `Published`\n    ///\n    /// * Uploading the split files is complete and the split is searchable.\n    ///\n    /// 3. `MarkedForDeletion`\n    ///\n    /// * Mark the split for deletion.\n    ///\n    /// If a split has a file in the storage, it MUST be registered in the metastore,\n    /// and its state can be as follows:\n    ///\n    /// * `Staged`: The split is almost ready. Some of its files may have been uploaded in the storage.\n    /// * `Published`: The split is ready and published.\n    /// * `MarkedForDeletion`: The split is marked for deletion.\n    ///\n    /// Before creating any file, we need to stage the split. If there is a failure, upon recovery, we\n    /// schedule for deletion all the staged splits. A client may not necessarily remove files from\n    /// storage right after marking a split for deletion. A CLI client may delete files right away, but a\n    /// more serious deployment should probably only delete those files after a grace period so that the\n    /// running search queries can complete.\n    ///\n    /// II. 
Delete tasks management.\n    ///\n    /// A delete task is defined on a given index and by a search query. It can be\n    /// applied to all the splits of the index.\n    ///\n    /// Quickwit needs a way to track that a delete task has been applied to a split. This is ensured\n    /// by two mechanisms:\n    ///\n    /// * On creation of a delete task, we give the task a monotonically increasing opstamp (uniqueness\n    ///  and monotonicity must hold at the index level).\n    /// * When a delete task is executed on a split, that is when the documents matched by the search\n    ///  query are removed from the split, we update the split's `delete_opstamp` to the value of the\n    ///  task's opstamp. This marks the split as \"up-to-date\" regarding this delete task. If new delete\n    ///  tasks are added, we will know that we need to run these delete tasks on the split, as its\n    ///  `delete_opstamp` will be lower than the `opstamp` of the new tasks.\n    ///\n    /// For splits created after a given delete task, Quickwit's indexing ensures that these splits\n    /// are created with a `delete_opstamp` equal to the latest opstamp of the tasks of the\n    /// corresponding index.\n    #[derive(Debug)]\n    pub struct MetastoreServiceGrpcServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> MetastoreServiceGrpcServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                
max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>>\n    for MetastoreServiceGrpcServer<T>\n    where\n        T: MetastoreServiceGrpc,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        
) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.metastore.MetastoreService/CreateIndex\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CreateIndexSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::CreateIndexRequest>\n                    for CreateIndexSvc<T> {\n                        type Response = super::CreateIndexResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::CreateIndexRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::create_index(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = CreateIndexSvc(inner);\n                        let codec = 
tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/UpdateIndex\" => {\n                    #[allow(non_camel_case_types)]\n                    struct UpdateIndexSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::UpdateIndexRequest>\n                    for UpdateIndexSvc<T> {\n                        type Response = super::IndexMetadataResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::UpdateIndexRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::update_index(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let 
accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = UpdateIndexSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/IndexMetadata\" => {\n                    #[allow(non_camel_case_types)]\n                    struct IndexMetadataSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::IndexMetadataRequest>\n                    for IndexMetadataSvc<T> {\n                        type Response = super::IndexMetadataResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                           
 request: tonic::Request<super::IndexMetadataRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::index_metadata(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = IndexMetadataSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/IndexesMetadata\" => {\n                    #[allow(non_camel_case_types)]\n                    struct IndexesMetadataSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        
T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::IndexesMetadataRequest>\n                    for IndexesMetadataSvc<T> {\n                        type Response = super::IndexesMetadataResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::IndexesMetadataRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::indexes_metadata(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = IndexesMetadataSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n               
             .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListIndexesMetadata\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListIndexesMetadataSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::ListIndexesMetadataRequest>\n                    for ListIndexesMetadataSvc<T> {\n                        type Response = super::ListIndexesMetadataResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListIndexesMetadataRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::list_indexes_metadata(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let 
max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListIndexesMetadataSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/DeleteIndex\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteIndexSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::DeleteIndexRequest>\n                    for DeleteIndexSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DeleteIndexRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                         
   let fut = async move {\n                                <T as MetastoreServiceGrpc>::delete_index(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DeleteIndexSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListIndexStats\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListIndexStatsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::ListIndexStatsRequest>\n                    for ListIndexStatsSvc<T> {\n                        type Response = 
super::ListIndexStatsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListIndexStatsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::list_index_stats(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListIndexStatsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n  
                      let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListSplits\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListSplitsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::ServerStreamingService<super::ListSplitsRequest>\n                    for ListSplitsSvc<T> {\n                        type Response = super::ListSplitsResponse;\n                        type ResponseStream = T::ListSplitsStream;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::ResponseStream>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListSplitsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::list_splits(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListSplitsSvc(inner);\n       
                 let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.server_streaming(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/StageSplits\" => {\n                    #[allow(non_camel_case_types)]\n                    struct StageSplitsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::StageSplitsRequest>\n                    for StageSplitsSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::StageSplitsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::stage_splits(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    
let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = StageSplitsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/PublishSplits\" => {\n                    #[allow(non_camel_case_types)]\n                    struct PublishSplitsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::PublishSplitsRequest>\n                    for PublishSplitsSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            
request: tonic::Request<super::PublishSplitsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::publish_splits(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = PublishSplitsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/MarkSplitsForDeletion\" => {\n                    #[allow(non_camel_case_types)]\n                    struct MarkSplitsForDeletionSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n              
          T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::MarkSplitsForDeletionRequest>\n                    for MarkSplitsForDeletionSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::MarkSplitsForDeletionRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::mark_splits_for_deletion(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = MarkSplitsForDeletionSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n              
              )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/DeleteSplits\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteSplitsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::DeleteSplitsRequest>\n                    for DeleteSplitsSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DeleteSplitsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::delete_splits(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n         
           let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DeleteSplitsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/AddSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct AddSourceSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::AddSourceRequest>\n                    for AddSourceSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::AddSourceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::add_source(&inner, request)\n                                    .await\n           
                 };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = AddSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/UpdateSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct UpdateSourceSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::UpdateSourceRequest>\n                    for UpdateSourceSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n     
                   >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::UpdateSourceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::update_source(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = UpdateSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ToggleSource\" => {\n                    #[allow(non_camel_case_types)]\n                   
 struct ToggleSourceSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::ToggleSourceRequest>\n                    for ToggleSourceSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ToggleSourceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::toggle_source(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ToggleSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            
.apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/DeleteSource\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteSourceSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::DeleteSourceRequest>\n                    for DeleteSourceSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DeleteSourceRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::delete_source(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n 
                   let fut = async move {\n                        let method = DeleteSourceSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ResetSourceCheckpoint\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ResetSourceCheckpointSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::ResetSourceCheckpointRequest>\n                    for ResetSourceCheckpointSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ResetSourceCheckpointRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::reset_source_checkpoint(\n                                        
&inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ResetSourceCheckpointSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/LastDeleteOpstamp\" => {\n                    #[allow(non_camel_case_types)]\n                    struct LastDeleteOpstampSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::LastDeleteOpstampRequest>\n                    for LastDeleteOpstampSvc<T> {\n                        type Response 
= super::LastDeleteOpstampResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::LastDeleteOpstampRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::last_delete_opstamp(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = LastDeleteOpstampSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                    
        );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/CreateDeleteTask\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CreateDeleteTaskSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::DeleteQuery>\n                    for CreateDeleteTaskSvc<T> {\n                        type Response = super::DeleteTask;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DeleteQuery>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::create_delete_task(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n          
              let method = CreateDeleteTaskSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/UpdateSplitsDeleteOpstamp\" => {\n                    #[allow(non_camel_case_types)]\n                    struct UpdateSplitsDeleteOpstampSvc<T: MetastoreServiceGrpc>(\n                        pub Arc<T>,\n                    );\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<\n                        super::UpdateSplitsDeleteOpstampRequest,\n                    > for UpdateSplitsDeleteOpstampSvc<T> {\n                        type Response = super::UpdateSplitsDeleteOpstampResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::UpdateSplitsDeleteOpstampRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move 
{\n                                <T as MetastoreServiceGrpc>::update_splits_delete_opstamp(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = UpdateSplitsDeleteOpstampSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListDeleteTasks\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListDeleteTasksSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > 
tonic::server::UnaryService<super::ListDeleteTasksRequest>\n                    for ListDeleteTasksSvc<T> {\n                        type Response = super::ListDeleteTasksResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListDeleteTasksRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::list_delete_tasks(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListDeleteTasksSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n 
                               max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListStaleSplits\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListStaleSplitsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::ListStaleSplitsRequest>\n                    for ListStaleSplitsSvc<T> {\n                        type Response = super::ListSplitsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListStaleSplitsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::list_stale_splits(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let 
max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListStaleSplitsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/OpenShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct OpenShardsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::OpenShardsRequest>\n                    for OpenShardsSvc<T> {\n                        type Response = super::OpenShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::OpenShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as 
MetastoreServiceGrpc>::open_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = OpenShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/AcquireShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct AcquireShardsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::AcquireShardsRequest>\n                    for AcquireShardsSvc<T> {\n                        type Response = super::AcquireShardsResponse;\n                        type Future = 
BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::AcquireShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::acquire_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = AcquireShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                
\"/quickwit.metastore.MetastoreService/DeleteShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteShardsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::DeleteShardsRequest>\n                    for DeleteShardsSvc<T> {\n                        type Response = super::DeleteShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DeleteShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::delete_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DeleteShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n  
                              send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/PruneShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct PruneShardsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::PruneShardsRequest>\n                    for PruneShardsSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::PruneShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::prune_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let 
max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = PruneShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListShards\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListShardsSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::ListShardsRequest>\n                    for ListShardsSvc<T> {\n                        type Response = super::ListShardsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListShardsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as 
MetastoreServiceGrpc>::list_shards(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListShardsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/CreateIndexTemplate\" => {\n                    #[allow(non_camel_case_types)]\n                    struct CreateIndexTemplateSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::CreateIndexTemplateRequest>\n                    for CreateIndexTemplateSvc<T> {\n                        type Response = super::EmptyResponse;\n                        
type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::CreateIndexTemplateRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::create_index_template(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = CreateIndexTemplateSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = 
grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/GetIndexTemplate\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetIndexTemplateSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::GetIndexTemplateRequest>\n                    for GetIndexTemplateSvc<T> {\n                        type Response = super::GetIndexTemplateResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetIndexTemplateRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::get_index_template(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                  
      let method = GetIndexTemplateSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/FindIndexTemplateMatches\" => {\n                    #[allow(non_camel_case_types)]\n                    struct FindIndexTemplateMatchesSvc<T: MetastoreServiceGrpc>(\n                        pub Arc<T>,\n                    );\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::FindIndexTemplateMatchesRequest>\n                    for FindIndexTemplateMatchesSvc<T> {\n                        type Response = super::FindIndexTemplateMatchesResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<\n                                super::FindIndexTemplateMatchesRequest,\n                            >,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as 
MetastoreServiceGrpc>::find_index_template_matches(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FindIndexTemplateMatchesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/ListIndexTemplates\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListIndexTemplatesSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > 
tonic::server::UnaryService<super::ListIndexTemplatesRequest>\n                    for ListIndexTemplatesSvc<T> {\n                        type Response = super::ListIndexTemplatesResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListIndexTemplatesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::list_index_templates(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListIndexTemplatesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            
.apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/DeleteIndexTemplates\" => {\n                    #[allow(non_camel_case_types)]\n                    struct DeleteIndexTemplatesSvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::DeleteIndexTemplatesRequest>\n                    for DeleteIndexTemplatesSvc<T> {\n                        type Response = super::EmptyResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::DeleteIndexTemplatesRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as MetastoreServiceGrpc>::delete_index_templates(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = 
self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = DeleteIndexTemplatesSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.metastore.MetastoreService/GetClusterIdentity\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetClusterIdentitySvc<T: MetastoreServiceGrpc>(pub Arc<T>);\n                    impl<\n                        T: MetastoreServiceGrpc,\n                    > tonic::server::UnaryService<super::GetClusterIdentityRequest>\n                    for GetClusterIdentitySvc<T> {\n                        type Response = super::GetClusterIdentityResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetClusterIdentityRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n    
                        let fut = async move {\n                                <T as MetastoreServiceGrpc>::get_cluster_identity(\n                                        &inner,\n                                        request,\n                                    )\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetClusterIdentitySvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                    
    headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for MetastoreServiceGrpcServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.metastore.MetastoreService\";\n    impl<T> tonic::server::NamedService for MetastoreServiceGrpcServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/codegen/quickwit/quickwit.search.rs",
    "content": "// This file is @generated by prost-build.\n/// / Scroll Request\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ScrollRequest {\n    /// / The `scroll_id` is the given in the response of a search request including a scroll.\n    #[prost(string, tag = \"1\")]\n    pub scroll_id: ::prost::alloc::string::String,\n    #[prost(uint32, optional, tag = \"2\")]\n    pub scroll_ttl_secs: ::core::option::Option<u32>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PutKvRequest {\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub key: ::prost::alloc::vec::Vec<u8>,\n    #[prost(bytes = \"vec\", tag = \"2\")]\n    pub payload: ::prost::alloc::vec::Vec<u8>,\n    #[prost(uint32, tag = \"3\")]\n    pub ttl_secs: u32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct PutKvResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetKvRequest {\n    #[prost(bytes = \"vec\", tag = \"1\")]\n    pub key: ::prost::alloc::vec::Vec<u8>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct GetKvResponse {\n    #[prost(bytes = \"vec\", optional, tag = \"1\")]\n    pub payload: ::core::option::Option<::prost::alloc::vec::Vec<u8>>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ReportSplit {\n    /// Split id (ULID format `01HAV29D4XY3D462FS3D8K5Q2H`)\n    #[prost(string, tag = \"2\")]\n    pub split_id: ::prost::alloc::string::String,\n    /// The storage uri. 
This URI does NOT include the split id.\n    #[prost(string, tag = \"1\")]\n    pub storage_uri: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ReportSplitsRequest {\n    #[prost(message, repeated, tag = \"1\")]\n    pub report_splits: ::prost::alloc::vec::Vec<ReportSplit>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ReportSplitsResponse {}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListFieldsRequest {\n    /// Index ID patterns\n    #[prost(string, repeated, tag = \"1\")]\n    pub index_id_patterns: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// Optional limit query to a list of fields\n    /// Wildcard expressions are supported.\n    #[prost(string, repeated, tag = \"2\")]\n    pub fields: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// Time filter, expressed in seconds since epoch.\n    /// That filter is to be interpreted as the semi-open interval:\n    /// \\[start_timestamp, end_timestamp).\n    #[prost(int64, optional, tag = \"3\")]\n    pub start_timestamp: ::core::option::Option<i64>,\n    #[prost(int64, optional, tag = \"4\")]\n    pub end_timestamp: ::core::option::Option<i64>,\n    /// JSON-serialized QueryAst for index_filter support.\n    /// When provided, only fields from documents matching this query are returned.\n    #[prost(string, optional, tag = \"5\")]\n    pub query_ast: ::core::option::Option<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafListFieldsRequest {\n    /// The index id\n    #[prost(string, tag = \"1\")]\n    pub index_id: ::prost::alloc::string::String,\n    /// The index 
uri\n    #[prost(string, tag = \"2\")]\n    pub index_uri: ::prost::alloc::string::String,\n    /// Index split ids to apply the query on.\n    /// This ids are resolved from the index_uri defined in the search_request.\n    #[prost(message, repeated, tag = \"3\")]\n    pub split_offsets: ::prost::alloc::vec::Vec<SplitIdAndFooterOffsets>,\n    /// Optional limit query to a list of fields\n    /// Wildcard expressions are supported.\n    #[prost(string, repeated, tag = \"4\")]\n    pub fields: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListFieldsResponse {\n    #[prost(message, repeated, tag = \"1\")]\n    pub fields: ::prost::alloc::vec::Vec<ListFieldsEntryResponse>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListFieldsEntryResponse {\n    #[prost(string, tag = \"1\")]\n    pub field_name: ::prost::alloc::string::String,\n    #[prost(enumeration = \"ListFieldType\", tag = \"2\")]\n    pub field_type: i32,\n    /// The index ids the field exists\n    #[prost(string, repeated, tag = \"3\")]\n    pub index_ids: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// True means the field is searchable (indexed) in at least some indices.\n    /// False means the field is not searchable in any indices.\n    #[prost(bool, tag = \"4\")]\n    pub searchable: bool,\n    /// True means the field is aggregatable (fast) in at least some indices.\n    /// False means the field is not aggregatable in any indices.\n    #[prost(bool, tag = \"5\")]\n    pub aggregatable: bool,\n    /// The index ids the field exists, but is not searchable.\n    #[prost(string, repeated, tag = \"6\")]\n    pub non_searchable_index_ids: ::prost::alloc::vec::Vec<\n        ::prost::alloc::string::String,\n    >,\n    /// The index ids the field exists, but 
is not aggregatable\n    #[prost(string, repeated, tag = \"7\")]\n    pub non_aggregatable_index_ids: ::prost::alloc::vec::Vec<\n        ::prost::alloc::string::String,\n    >,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct ListFields {\n    #[prost(message, repeated, tag = \"1\")]\n    pub fields: ::prost::alloc::vec::Vec<ListFieldsEntryResponse>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Hash, Eq)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct SearchRequest {\n    /// Index ID patterns\n    #[prost(string, repeated, tag = \"1\")]\n    pub index_id_patterns: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// Json object representing Quickwit's QueryAst.\n    #[prost(string, tag = \"13\")]\n    pub query_ast: ::prost::alloc::string::String,\n    /// Time filter, expressed in seconds since epoch.\n    /// That filter is to be interpreted as the semi-open interval:\n    /// \\[start_timestamp, end_timestamp).\n    /// If the query AST contains a range query over the timestamp field,\n    /// then the the bounds of the range query are used directly and\n    /// these two fields are ignored.\n    #[prost(int64, optional, tag = \"4\")]\n    pub start_timestamp: ::core::option::Option<i64>,\n    #[prost(int64, optional, tag = \"5\")]\n    pub end_timestamp: ::core::option::Option<i64>,\n    /// Maximum number of hits to return.\n    #[prost(uint64, tag = \"6\")]\n    pub max_hits: u64,\n    /// First hit to return. 
Together with max_hits, this parameter\n    /// can be used for pagination.\n    ///\n    /// E.g.\n    /// The results with rank \\[start_offset..start_offset + max_hits) are returned.\n    #[prost(uint64, tag = \"7\")]\n    pub start_offset: u64,\n    /// json serialized aggregation_request\n    #[prost(string, optional, tag = \"11\")]\n    pub aggregation_request: ::core::option::Option<::prost::alloc::string::String>,\n    /// Fields to extract snippet on\n    #[prost(string, repeated, tag = \"12\")]\n    pub snippet_fields: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// Optional sort by one or more fields (limited to 2 at the moment).\n    #[prost(message, repeated, tag = \"14\")]\n    pub sort_fields: ::prost::alloc::vec::Vec<SortField>,\n    /// If set, the search response will include a search id\n    /// that will make it possible to paginate through the results\n    /// in a consistent manner.\n    #[prost(uint32, optional, tag = \"15\")]\n    pub scroll_ttl_secs: ::core::option::Option<u32>,\n    /// Document with sort tuple smaller or equal to this are discarded to\n    /// enable pagination.\n    /// If split_id is empty, no comparison with \\_shard_doc should be done\n    #[prost(message, optional, tag = \"16\")]\n    pub search_after: ::core::option::Option<PartialHit>,\n    #[prost(enumeration = \"CountHits\", tag = \"17\")]\n    pub count_hits: i32,\n    /// When an exact index ID is provided (not a pattern), the query fails only if\n    /// that index is not found and this parameter is set to `false`.\n    #[prost(bool, tag = \"18\")]\n    pub ignore_missing_indexes: bool,\n    /// When true, skip finalization of aggregation results and return\n    /// the raw IntermediateAggregationResults bytes instead.\n    #[prost(bool, tag = \"19\")]\n    pub skip_aggregation_finalization: bool,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct 
SortField {\n    #[prost(string, tag = \"1\")]\n    pub field_name: ::prost::alloc::string::String,\n    #[prost(enumeration = \"SortOrder\", tag = \"2\")]\n    pub sort_order: i32,\n    /// Optional sort value format for datetime field only.\n    /// If none, the default output format for datetime field is\n    /// unix_timestamp_nanos.\n    #[prost(enumeration = \"SortDatetimeFormat\", optional, tag = \"3\")]\n    pub sort_datetime_format: ::core::option::Option<i32>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct SearchResponse {\n    /// Number of hits matching the query.\n    #[prost(uint64, tag = \"1\")]\n    pub num_hits: u64,\n    /// Matched hits\n    #[prost(message, repeated, tag = \"2\")]\n    pub hits: ::prost::alloc::vec::Vec<Hit>,\n    /// Elapsed time to perform the request. This time is measured\n    /// server-side and expressed in microseconds.\n    #[prost(uint64, tag = \"3\")]\n    pub elapsed_time_micros: u64,\n    /// The searcherrors that occurred formatted as string.\n    #[prost(string, repeated, tag = \"4\")]\n    pub errors: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// Postcard-encoded aggregation response\n    #[prost(bytes = \"vec\", optional, tag = \"9\")]\n    pub aggregation_postcard: ::core::option::Option<::prost::alloc::vec::Vec<u8>>,\n    /// Scroll Id (only set if scroll_secs was set in the request)\n    #[prost(string, optional, tag = \"6\")]\n    pub scroll_id: ::core::option::Option<::prost::alloc::string::String>,\n    /// Returns the list of splits for which search failed.\n    /// For the moment, the cause is unknown.\n    ///\n    /// It is up to the caller to decide whether to interpret\n    /// this as an overall failure or to present the partial results\n    /// to the end user.\n    #[prost(message, repeated, tag = \"7\")]\n    pub failed_splits: ::prost::alloc::vec::Vec<SplitSearchError>,\n    /// Total number of 
successful splits searched.\n    #[prost(uint64, tag = \"8\")]\n    pub num_successful_splits: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SearchPlanResponse {\n    #[prost(string, tag = \"1\")]\n    pub result: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SplitSearchError {\n    /// The searcherror that occurred formatted as string.\n    #[prost(string, tag = \"1\")]\n    pub error: ::prost::alloc::string::String,\n    /// Split id that failed.\n    #[prost(string, tag = \"2\")]\n    pub split_id: ::prost::alloc::string::String,\n    /// Flag to indicate if the error can be considered a retryable error\n    #[prost(bool, tag = \"3\")]\n    pub retryable_error: bool,\n}\n/// A LeafSearchRequest can span multiple indices.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafSearchRequest {\n    /// Search request. This is a perfect copy of the original search request\n    /// that was sent to root apart from the start_offset, max_hits params and index_id_patterns.\n    /// index_id_patterns contains the actual index ids queried on that leaf.\n    #[prost(message, optional, tag = \"1\")]\n    pub search_request: ::core::option::Option<SearchRequest>,\n    /// List of leaf requests, one per index.\n    #[prost(message, repeated, tag = \"7\")]\n    pub leaf_requests: ::prost::alloc::vec::Vec<LeafRequestRef>,\n    /// List of unique doc_mappers serialized as json.\n    #[prost(string, repeated, tag = \"8\")]\n    pub doc_mappers: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// List of index uris\n    /// Index URI. 
The index URI defines the location of the storage that contains the\n    /// split files.\n    #[prost(string, repeated, tag = \"9\")]\n    pub index_uris: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, Copy, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ResourceStats {\n    #[prost(uint64, tag = \"1\")]\n    pub short_lived_cache_num_bytes: u64,\n    #[prost(uint64, tag = \"2\")]\n    pub split_num_docs: u64,\n    #[prost(uint64, tag = \"3\")]\n    pub warmup_microsecs: u64,\n    #[prost(uint64, tag = \"4\")]\n    pub cpu_thread_pool_wait_microsecs: u64,\n    #[prost(uint64, tag = \"5\")]\n    pub cpu_microsecs: u64,\n}\n/// LeafRequestRef references data in LeafSearchRequest to deduplicate data.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafRequestRef {\n    /// The ordinal of the doc_mapper in `LeafSearchRequest.doc_mappers`\n    #[prost(uint32, tag = \"1\")]\n    pub doc_mapper_ord: u32,\n    /// The ordinal of the index uri in LeafSearchRequest.index_uris\n    #[prost(uint32, tag = \"2\")]\n    pub index_uri_ord: u32,\n    /// Index split ids to apply the query on.\n    /// This ids are resolved from the index_uri defined in the search_request.\n    #[prost(message, repeated, tag = \"3\")]\n    pub split_offsets: ::prost::alloc::vec::Vec<SplitIdAndFooterOffsets>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SplitIdAndFooterOffsets {\n    /// Index split id to apply the query on.\n    /// This id is resolved from the index_uri defined in the search_request.\n    #[prost(string, tag = \"1\")]\n    pub split_id: ::prost::alloc::string::String,\n    /// The offset of the start of footer in the split bundle. 
The footer contains the file bundle metadata and the hotcache.\n    #[prost(uint64, tag = \"2\")]\n    pub split_footer_start: u64,\n    /// The offset of the end of the footer in split bundle. The footer contains the file bundle metadata and the hotcache.\n    #[prost(uint64, tag = \"3\")]\n    pub split_footer_end: u64,\n    /// The lowest timestamp appearing in the split, in seconds since epoch\n    #[prost(int64, optional, tag = \"4\")]\n    pub timestamp_start: ::core::option::Option<i64>,\n    /// The highest timestamp appearing in the split, in seconds since epoch\n    #[prost(int64, optional, tag = \"5\")]\n    pub timestamp_end: ::core::option::Option<i64>,\n    /// The number of docs in the split\n    #[prost(uint64, tag = \"6\")]\n    pub num_docs: u64,\n}\n/// Hits returned by a FetchDocRequest.\n///\n/// The json that is joined is the raw tantivy json doc.\n/// It is very different from a quickwit json doc.\n///\n/// For instance:\n///\n/// * it may contain a \\_source and a \\_dynamic field.\n/// * since tantivy has no notion of cardinality,\n///   all fields are arrays.\n/// * since tantivy has no notion of object, the object is\n///   flattened by concatenating the path to the root.\n///\n/// See  `quickwit_search::convert_leaf_hit`\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafHit {\n    /// The actual content of the hit/\n    #[prost(string, tag = \"1\")]\n    pub leaf_json: ::prost::alloc::string::String,\n    /// The partial hit (ie: the sorting field + the document address)\n    #[prost(message, optional, tag = \"2\")]\n    pub partial_hit: ::core::option::Option<PartialHit>,\n    /// A snippet of the matching content\n    #[prost(string, optional, tag = \"3\")]\n    pub leaf_snippet_json: ::core::option::Option<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, 
::prost::Message)]\npub struct Hit {\n    /// The actual content of the hit\n    #[prost(string, tag = \"1\")]\n    pub json: ::prost::alloc::string::String,\n    /// The partial hit (ie: the sorting field + the document address)\n    #[prost(message, optional, tag = \"2\")]\n    pub partial_hit: ::core::option::Option<PartialHit>,\n    /// A snippet of the matching content\n    #[prost(string, optional, tag = \"3\")]\n    pub snippet: ::core::option::Option<::prost::alloc::string::String>,\n    /// The index id of the hit\n    #[prost(string, tag = \"4\")]\n    pub index_id: ::prost::alloc::string::String,\n}\n/// A partial hit, is a hit for which we have not fetch the content yet.\n/// Instead, it holds a document_uri which is enough information to\n/// go and fetch the actual document data, by performing a `get_doc(...)`\n/// request.\n///\n/// Value of the sorting key for the given document.\n///\n/// Quickwit only computes top-K of this sorting field.\n/// If the user requested for a bottom-K of a given fast field, then quickwit simply\n/// emits an decreasing mapping of this fast field.\n///\n/// In case of a tie, quickwit uses the increasing order of\n///\n/// * the split_id,\n/// * the segment_ord,\n/// * the doc id.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Hash, Eq)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct PartialHit {\n    #[prost(message, optional, tag = \"10\")]\n    pub sort_value: ::core::option::Option<SortByValue>,\n    #[prost(message, optional, tag = \"11\")]\n    pub sort_value2: ::core::option::Option<SortByValue>,\n    #[prost(string, tag = \"2\")]\n    pub split_id: ::prost::alloc::string::String,\n    /// (segment_ord, doc) form a tantivy DocAddress, which is sufficient to identify a document\n    /// within a split\n    #[prost(uint32, tag = \"3\")]\n    pub segment_ord: u32,\n    /// The DocId identifies a unique document at the scale of a tantivy segment.\n    #[prost(uint32, tag = 
\"4\")]\n    pub doc_id: u32,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Ord, PartialOrd)]\n#[derive(Clone, Copy, PartialEq, ::prost::Message)]\npub struct SortByValue {\n    #[prost(oneof = \"sort_by_value::SortValue\", tags = \"1, 2, 3, 4\")]\n    pub sort_value: ::core::option::Option<sort_by_value::SortValue>,\n}\n/// Nested message and enum types in `SortByValue`.\npub mod sort_by_value {\n    #[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n    #[serde(rename_all = \"snake_case\")]\n    #[derive(Clone, Copy, PartialEq, ::prost::Oneof)]\n    pub enum SortValue {\n        #[prost(uint64, tag = \"1\")]\n        U64(u64),\n        #[prost(int64, tag = \"2\")]\n        I64(i64),\n        #[prost(double, tag = \"3\")]\n        F64(f64),\n        #[prost(bool, tag = \"4\")]\n        Boolean(bool),\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafSearchResponse {\n    /// Total number of documents matched by the query.\n    #[prost(uint64, tag = \"1\")]\n    pub num_hits: u64,\n    /// List of the best top-K candidates for the given leaf query.\n    #[prost(message, repeated, tag = \"2\")]\n    pub partial_hits: ::prost::alloc::vec::Vec<PartialHit>,\n    /// The list of splits that failed. 
LeafSearchResponse can be an aggregation of results, so there may be multiple.\n    #[prost(message, repeated, tag = \"3\")]\n    pub failed_splits: ::prost::alloc::vec::Vec<SplitSearchError>,\n    /// Total number of attempt to search into splits.\n    /// We do have:\n    /// `num_splits_requested == num_successful_splits + num_failed_splits.len()`\n    /// But we do not necessarily have:\n    /// `num_splits_requested = num_attempted_splits because of retries.`\n    #[prost(uint64, tag = \"4\")]\n    pub num_attempted_splits: u64,\n    /// Total number of successful splits searched.\n    #[prost(uint64, tag = \"7\")]\n    pub num_successful_splits: u64,\n    /// postcard serialized intermediate aggregation_result.\n    #[prost(bytes = \"vec\", optional, tag = \"6\")]\n    pub intermediate_aggregation_result: ::core::option::Option<\n        ::prost::alloc::vec::Vec<u8>,\n    >,\n    #[prost(message, optional, tag = \"8\")]\n    pub resource_stats: ::core::option::Option<ResourceStats>,\n}\n/// The result of searching a single split in a Lambda invocation.\n/// Each result is tagged with its split_id so that ordering is irrelevant.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LambdaSingleSplitResult {\n    /// The split that was searched.\n    #[prost(string, tag = \"1\")]\n    pub split_id: ::prost::alloc::string::String,\n    #[prost(oneof = \"lambda_single_split_result::Outcome\", tags = \"2, 3\")]\n    pub outcome: ::core::option::Option<lambda_single_split_result::Outcome>,\n}\n/// Nested message and enum types in `LambdaSingleSplitResult`.\npub mod lambda_single_split_result {\n    #[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n    #[serde(rename_all = \"snake_case\")]\n    #[derive(Clone, PartialEq, ::prost::Oneof)]\n    pub enum Outcome {\n        /// On success, the leaf search response for this split.\n        #[prost(message, tag = \"2\")]\n        
Response(super::LeafSearchResponse),\n        /// On failure, the error message.\n        #[prost(string, tag = \"3\")]\n        Error(::prost::alloc::string::String),\n    }\n}\n/// Wrapper for per-split results from a Lambda invocation.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LambdaSearchResponses {\n    #[prost(message, repeated, tag = \"2\")]\n    pub split_results: ::prost::alloc::vec::Vec<LambdaSingleSplitResult>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct SnippetRequest {\n    #[prost(string, repeated, tag = \"1\")]\n    pub snippet_fields: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    #[prost(string, tag = \"2\")]\n    pub query_ast_resolved: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FetchDocsRequest {\n    /// Request fetching the content of a given list of partial_hits.\n    #[prost(message, repeated, tag = \"1\")]\n    pub partial_hits: ::prost::alloc::vec::Vec<PartialHit>,\n    /// Split footer offsets. They are required for fetch docs to\n    /// fetch the document content in two reads, when the footer is not\n    /// cached.\n    #[prost(message, repeated, tag = \"3\")]\n    pub split_offsets: ::prost::alloc::vec::Vec<SplitIdAndFooterOffsets>,\n    /// Index URI. 
The index URI defines the location of the storage that contains the\n    /// split files.\n    #[prost(string, tag = \"4\")]\n    pub index_uri: ::prost::alloc::string::String,\n    #[prost(message, optional, tag = \"7\")]\n    pub snippet_request: ::core::option::Option<SnippetRequest>,\n    /// `DocMapper` as json serialized trait.\n    #[prost(string, tag = \"6\")]\n    pub doc_mapper: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct FetchDocsResponse {\n    /// List of complete hits.\n    #[prost(message, repeated, tag = \"1\")]\n    pub hits: ::prost::alloc::vec::Vec<LeafHit>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListTermsRequest {\n    /// Index ID patterns\n    #[prost(string, repeated, tag = \"1\")]\n    pub index_id_patterns: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n    /// Field to search on\n    #[prost(string, tag = \"3\")]\n    pub field: ::prost::alloc::string::String,\n    /// Time filter\n    #[prost(int64, optional, tag = \"4\")]\n    pub start_timestamp: ::core::option::Option<i64>,\n    #[prost(int64, optional, tag = \"5\")]\n    pub end_timestamp: ::core::option::Option<i64>,\n    /// Maximum number of hits to return.\n    #[prost(uint64, optional, tag = \"6\")]\n    pub max_hits: ::core::option::Option<u64>,\n    /// start_key is included, end_key is excluded\n    #[prost(bytes = \"vec\", optional, tag = \"7\")]\n    pub start_key: ::core::option::Option<::prost::alloc::vec::Vec<u8>>,\n    #[prost(bytes = \"vec\", optional, tag = \"8\")]\n    pub end_key: ::core::option::Option<::prost::alloc::vec::Vec<u8>>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)]\npub struct ListTermsResponse {\n    /// Number of hits matching the query.\n    
#[prost(uint64, tag = \"1\")]\n    pub num_hits: u64,\n    /// Matched hits\n    #[prost(bytes = \"vec\", repeated, tag = \"2\")]\n    pub terms: ::prost::alloc::vec::Vec<::prost::alloc::vec::Vec<u8>>,\n    /// Elapsed time to perform the request. This time is measured\n    /// server-side and expressed in microseconds.\n    #[prost(uint64, tag = \"3\")]\n    pub elapsed_time_micros: u64,\n    /// The searcherrors that occurred formatted as string.\n    #[prost(string, repeated, tag = \"4\")]\n    pub errors: ::prost::alloc::vec::Vec<::prost::alloc::string::String>,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafListTermsRequest {\n    /// Search request. This is a perfect copy of the original list request,\n    #[prost(message, optional, tag = \"1\")]\n    pub list_terms_request: ::core::option::Option<ListTermsRequest>,\n    /// Index split ids to apply the query on.\n    /// This ids are resolved from the index_uri defined in the search_request.\n    #[prost(message, repeated, tag = \"2\")]\n    pub split_offsets: ::prost::alloc::vec::Vec<SplitIdAndFooterOffsets>,\n    /// Index URI. The index URI defines the location of the storage that contains the\n    /// split files.\n    #[prost(string, tag = \"3\")]\n    pub index_uri: ::prost::alloc::string::String,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[derive(Clone, PartialEq, ::prost::Message)]\npub struct LeafListTermsResponse {\n    /// Total number of documents matched by the query.\n    #[prost(uint64, tag = \"1\")]\n    pub num_hits: u64,\n    /// List of the first K terms the given leaf query.\n    #[prost(bytes = \"vec\", repeated, tag = \"2\")]\n    pub terms: ::prost::alloc::vec::Vec<::prost::alloc::vec::Vec<u8>>,\n    /// The list of splits that failed. 
LeafSearchResponse can be an aggregation of results, so there may be multiple.\n    #[prost(message, repeated, tag = \"3\")]\n    pub failed_splits: ::prost::alloc::vec::Vec<SplitSearchError>,\n    /// Total number of single split search attempted.\n    #[prost(uint64, tag = \"4\")]\n    pub num_attempted_splits: u64,\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum ListFieldType {\n    Str = 0,\n    U64 = 1,\n    I64 = 2,\n    F64 = 3,\n    Bool = 4,\n    Date = 5,\n    Facet = 6,\n    Bytes = 7,\n    IpAddr = 8,\n    Json = 9,\n}\nimpl ListFieldType {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Str => \"STR\",\n            Self::U64 => \"U64\",\n            Self::I64 => \"I64\",\n            Self::F64 => \"F64\",\n            Self::Bool => \"BOOL\",\n            Self::Date => \"DATE\",\n            Self::Facet => \"FACET\",\n            Self::Bytes => \"BYTES\",\n            Self::IpAddr => \"IP_ADDR\",\n            Self::Json => \"JSON\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"STR\" => Some(Self::Str),\n            \"U64\" => Some(Self::U64),\n            \"I64\" => Some(Self::I64),\n            \"F64\" => Some(Self::F64),\n            \"BOOL\" => Some(Self::Bool),\n            \"DATE\" => Some(Self::Date),\n            \"FACET\" => Some(Self::Facet),\n            \"BYTES\" => Some(Self::Bytes),\n            \"IP_ADDR\" => 
Some(Self::IpAddr),\n            \"JSON\" => Some(Self::Json),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum CountHits {\n    /// Count all hits, querying all splits.\n    CountAll = 0,\n    /// Give an underestimate of the number of hits, possibly skipping entire\n    /// splits if they are otherwise not needed to fulfill a query.\n    Underestimate = 1,\n}\nimpl CountHits {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::CountAll => \"COUNT_ALL\",\n            Self::Underestimate => \"UNDERESTIMATE\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"COUNT_ALL\" => Some(Self::CountAll),\n            \"UNDERESTIMATE\" => Some(Self::Underestimate),\n            _ => None,\n        }\n    }\n}\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum SortOrder {\n    /// Ascending order.\n    Asc = 0,\n    /// Descending order.\n    ///\n    /// \\< This will be the default value;\n    Desc = 1,\n}\nimpl SortOrder {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for 
programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::Asc => \"ASC\",\n            Self::Desc => \"DESC\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"ASC\" => Some(Self::Asc),\n            \"DESC\" => Some(Self::Desc),\n            _ => None,\n        }\n    }\n}\n/// Sort value format for datetime field.\n/// We keep an enum with only one format\n/// for future extension.\n#[derive(serde::Serialize, serde::Deserialize, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]\n#[repr(i32)]\npub enum SortDatetimeFormat {\n    UnixTimestampMillis = 0,\n    UnixTimestampNanos = 1,\n}\nimpl SortDatetimeFormat {\n    /// String value of the enum field names used in the ProtoBuf definition.\n    ///\n    /// The values are not transformed in any way and thus are considered stable\n    /// (if the ProtoBuf definition does not change) and safe for programmatic use.\n    pub fn as_str_name(&self) -> &'static str {\n        match self {\n            Self::UnixTimestampMillis => \"UNIX_TIMESTAMP_MILLIS\",\n            Self::UnixTimestampNanos => \"UNIX_TIMESTAMP_NANOS\",\n        }\n    }\n    /// Creates an enum from field names used in the ProtoBuf definition.\n    pub fn from_str_name(value: &str) -> ::core::option::Option<Self> {\n        match value {\n            \"UNIX_TIMESTAMP_MILLIS\" => Some(Self::UnixTimestampMillis),\n            \"UNIX_TIMESTAMP_NANOS\" => Some(Self::UnixTimestampNanos),\n            _ => None,\n        }\n    }\n}\n/// Generated client implementations.\npub mod search_service_client {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    
)]\n    use tonic::codegen::*;\n    use tonic::codegen::http::Uri;\n    #[derive(Debug, Clone)]\n    pub struct SearchServiceClient<T> {\n        inner: tonic::client::Grpc<T>,\n    }\n    impl SearchServiceClient<tonic::transport::Channel> {\n        /// Attempt to create a new client by connecting to a given endpoint.\n        pub async fn connect<D>(dst: D) -> Result<Self, tonic::transport::Error>\n        where\n            D: TryInto<tonic::transport::Endpoint>,\n            D::Error: Into<StdError>,\n        {\n            let conn = tonic::transport::Endpoint::new(dst)?.connect().await?;\n            Ok(Self::new(conn))\n        }\n    }\n    impl<T> SearchServiceClient<T>\n    where\n        T: tonic::client::GrpcService<tonic::body::Body>,\n        T::Error: Into<StdError>,\n        T::ResponseBody: Body<Data = Bytes> + std::marker::Send + 'static,\n        <T::ResponseBody as Body>::Error: Into<StdError> + std::marker::Send,\n    {\n        pub fn new(inner: T) -> Self {\n            let inner = tonic::client::Grpc::new(inner);\n            Self { inner }\n        }\n        pub fn with_origin(inner: T, origin: Uri) -> Self {\n            let inner = tonic::client::Grpc::with_origin(inner, origin);\n            Self { inner }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> SearchServiceClient<InterceptedService<T, F>>\n        where\n            F: tonic::service::Interceptor,\n            T::ResponseBody: Default,\n            T: tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n                Response = http::Response<\n                    <T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody,\n                >,\n            >,\n            <T as tonic::codegen::Service<\n                http::Request<tonic::body::Body>,\n            >>::Error: Into<StdError> + std::marker::Send + std::marker::Sync,\n        {\n            
SearchServiceClient::new(InterceptedService::new(inner, interceptor))\n        }\n        /// Compress requests with the given encoding.\n        ///\n        /// This requires the server to support it otherwise it might respond with an\n        /// error.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.send_compressed(encoding);\n            self\n        }\n        /// Enable decompressing responses.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.inner = self.inner.accept_compressed(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_decoding_message_size(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.inner = self.inner.max_encoding_message_size(limit);\n            self\n        }\n        /// Root search API.\n        /// This RPC identifies the set of splits on which the query should run on,\n        /// and dispatch the several calls to `LeafSearch`.\n        ///\n        /// It is also in charge of merging back the results.\n        pub async fn root_search(\n            &mut self,\n            request: impl tonic::IntoRequest<super::SearchRequest>,\n        ) -> std::result::Result<tonic::Response<super::SearchResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", 
e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/RootSearch\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"RootSearch\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Perform a leaf search on a given set of splits.\n        ///\n        /// It is like a regular search except that:\n        ///\n        /// * the node should perform the search locally instead of dispatching\n        ///  it to other nodes.\n        /// * it should be applied on the given subset of splits\n        /// * Hit content is not fetched, and we instead return so called `PartialHit`.\n        pub async fn leaf_search(\n            &mut self,\n            request: impl tonic::IntoRequest<super::LeafSearchRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::LeafSearchResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/LeafSearch\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"LeafSearch\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Fetches the document contents from the document store.\n        /// This method 
takes `PartialHit`s and returns `Hit`s.\n        pub async fn fetch_docs(\n            &mut self,\n            request: impl tonic::IntoRequest<super::FetchDocsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FetchDocsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/FetchDocs\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"FetchDocs\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Root list terms API.\n        /// This RPC identifies the set of splits on which the query should run on,\n        /// and dispatches the several calls to `LeafListTerms`.\n        ///\n        /// It is also in charge of merging back the results.\n        pub async fn root_list_terms(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ListTermsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListTermsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                
\"/quickwit.search.SearchService/RootListTerms\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.search.SearchService\", \"RootListTerms\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Performs a leaf list terms on a given set of splits.\n        ///\n        /// It is like a regular list term except that:\n        ///\n        /// * the node should perform the listing locally instead of dispatching\n        ///  it to other nodes.\n        /// * it should be applied on the given subset of splits\n        pub async fn leaf_list_terms(\n            &mut self,\n            request: impl tonic::IntoRequest<super::LeafListTermsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::LeafListTermsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/LeafListTerms\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.search.SearchService\", \"LeafListTerms\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Performs a scroll request.\n        pub async fn scroll(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ScrollRequest>,\n        ) -> std::result::Result<tonic::Response<super::SearchResponse>, tonic::Status> {\n            
self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/Scroll\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"Scroll\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// gRPC request used to store a key in the local storage of the targeted node.\n        /// This RPC is used in the mini distributed immutable KV store embedded in quickwit.\n        pub async fn put_kv(\n            &mut self,\n            request: impl tonic::IntoRequest<super::PutKvRequest>,\n        ) -> std::result::Result<tonic::Response<super::PutKvResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/PutKV\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"PutKV\"));\n            self.inner.unary(req, path, codec).await\n        }\n        /// Gets a key from the local storage of the targeted node.\n        /// This RPC is used in the mini distributed immutable KV store embedded in quickwit.\n        pub async fn get_kv(\n    
        &mut self,\n            request: impl tonic::IntoRequest<super::GetKvRequest>,\n        ) -> std::result::Result<tonic::Response<super::GetKvResponse>, tonic::Status> {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/GetKV\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"GetKV\"));\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn report_splits(\n            &mut self,\n            request: impl tonic::IntoRequest<super::ReportSplitsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ReportSplitsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/ReportSplits\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.search.SearchService\", \"ReportSplits\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn list_fields(\n            &mut self,\n            request: 
impl tonic::IntoRequest<super::ListFieldsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListFieldsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/ListFields\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"ListFields\"));\n            self.inner.unary(req, path, codec).await\n        }\n        pub async fn leaf_list_fields(\n            &mut self,\n            request: impl tonic::IntoRequest<super::LeafListFieldsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListFieldsResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/LeafListFields\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(\n                    GrpcMethod::new(\"quickwit.search.SearchService\", \"LeafListFields\"),\n                );\n            self.inner.unary(req, path, codec).await\n        }\n        /// Describe how a search would be processed.\n 
       pub async fn search_plan(\n            &mut self,\n            request: impl tonic::IntoRequest<super::SearchRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::SearchPlanResponse>,\n            tonic::Status,\n        > {\n            self.inner\n                .ready()\n                .await\n                .map_err(|e| {\n                    tonic::Status::unknown(\n                        format!(\"Service was not ready: {}\", e.into()),\n                    )\n                })?;\n            let codec = tonic_prost::ProstCodec::default();\n            let path = http::uri::PathAndQuery::from_static(\n                \"/quickwit.search.SearchService/SearchPlan\",\n            );\n            let mut req = request.into_request();\n            req.extensions_mut()\n                .insert(GrpcMethod::new(\"quickwit.search.SearchService\", \"SearchPlan\"));\n            self.inner.unary(req, path, codec).await\n        }\n    }\n}\n/// Generated server implementations.\npub mod search_service_server {\n    #![allow(\n        unused_variables,\n        dead_code,\n        missing_docs,\n        clippy::wildcard_imports,\n        clippy::let_unit_value,\n    )]\n    use tonic::codegen::*;\n    /// Generated trait containing gRPC methods that should be implemented for use with SearchServiceServer.\n    #[async_trait]\n    pub trait SearchService: std::marker::Send + std::marker::Sync + 'static {\n        /// Root search API.\n        /// This RPC identifies the set of splits on which the query should run on,\n        /// and dispatch the several calls to `LeafSearch`.\n        ///\n        /// It is also in charge of merging back the results.\n        async fn root_search(\n            &self,\n            request: tonic::Request<super::SearchRequest>,\n        ) -> std::result::Result<tonic::Response<super::SearchResponse>, tonic::Status>;\n        /// Perform a leaf search on a given set of splits.\n        ///\n        /// 
It is like a regular search except that:\n        ///\n        /// * the node should perform the search locally instead of dispatching\n        ///  it to other nodes.\n        /// * it should be applied on the given subset of splits\n        /// * Hit content is not fetched, and we instead return so called `PartialHit`.\n        async fn leaf_search(\n            &self,\n            request: tonic::Request<super::LeafSearchRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::LeafSearchResponse>,\n            tonic::Status,\n        >;\n        /// Fetches the document contents from the document store.\n        /// This method takes `PartialHit`s and returns `Hit`s.\n        async fn fetch_docs(\n            &self,\n            request: tonic::Request<super::FetchDocsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::FetchDocsResponse>,\n            tonic::Status,\n        >;\n        /// Root list terms API.\n        /// This RPC identifies the set of splits on which the query should run on,\n        /// and dispatches the several calls to `LeafListTerms`.\n        ///\n        /// It is also in charge of merging back the results.\n        async fn root_list_terms(\n            &self,\n            request: tonic::Request<super::ListTermsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListTermsResponse>,\n            tonic::Status,\n        >;\n        /// Performs a leaf list terms on a given set of splits.\n        ///\n        /// It is like a regular list term except that:\n        ///\n        /// * the node should perform the listing locally instead of dispatching\n        ///  it to other nodes.\n        /// * it should be applied on the given subset of splits\n        async fn leaf_list_terms(\n            &self,\n            request: tonic::Request<super::LeafListTermsRequest>,\n        ) -> std::result::Result<\n            
tonic::Response<super::LeafListTermsResponse>,\n            tonic::Status,\n        >;\n        /// Performs a scroll request.\n        async fn scroll(\n            &self,\n            request: tonic::Request<super::ScrollRequest>,\n        ) -> std::result::Result<tonic::Response<super::SearchResponse>, tonic::Status>;\n        /// gRPC request used to store a key in the local storage of the targeted node.\n        /// This RPC is used in the mini distributed immutable KV store embedded in quickwit.\n        async fn put_kv(\n            &self,\n            request: tonic::Request<super::PutKvRequest>,\n        ) -> std::result::Result<tonic::Response<super::PutKvResponse>, tonic::Status>;\n        /// Gets a key from the local storage of the targeted node.\n        /// This RPC is used in the mini distributed immutable KV store embedded in quickwit.\n        async fn get_kv(\n            &self,\n            request: tonic::Request<super::GetKvRequest>,\n        ) -> std::result::Result<tonic::Response<super::GetKvResponse>, tonic::Status>;\n        async fn report_splits(\n            &self,\n            request: tonic::Request<super::ReportSplitsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ReportSplitsResponse>,\n            tonic::Status,\n        >;\n        async fn list_fields(\n            &self,\n            request: tonic::Request<super::ListFieldsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListFieldsResponse>,\n            tonic::Status,\n        >;\n        async fn leaf_list_fields(\n            &self,\n            request: tonic::Request<super::LeafListFieldsRequest>,\n        ) -> std::result::Result<\n            tonic::Response<super::ListFieldsResponse>,\n            tonic::Status,\n        >;\n        /// Describe how a search would be processed.\n        async fn search_plan(\n            &self,\n            request: tonic::Request<super::SearchRequest>,\n        ) -> 
std::result::Result<\n            tonic::Response<super::SearchPlanResponse>,\n            tonic::Status,\n        >;\n    }\n    #[derive(Debug)]\n    pub struct SearchServiceServer<T> {\n        inner: Arc<T>,\n        accept_compression_encodings: EnabledCompressionEncodings,\n        send_compression_encodings: EnabledCompressionEncodings,\n        max_decoding_message_size: Option<usize>,\n        max_encoding_message_size: Option<usize>,\n    }\n    impl<T> SearchServiceServer<T> {\n        pub fn new(inner: T) -> Self {\n            Self::from_arc(Arc::new(inner))\n        }\n        pub fn from_arc(inner: Arc<T>) -> Self {\n            Self {\n                inner,\n                accept_compression_encodings: Default::default(),\n                send_compression_encodings: Default::default(),\n                max_decoding_message_size: None,\n                max_encoding_message_size: None,\n            }\n        }\n        pub fn with_interceptor<F>(\n            inner: T,\n            interceptor: F,\n        ) -> InterceptedService<Self, F>\n        where\n            F: tonic::service::Interceptor,\n        {\n            InterceptedService::new(Self::new(inner), interceptor)\n        }\n        /// Enable decompressing requests with the given encoding.\n        #[must_use]\n        pub fn accept_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.accept_compression_encodings.enable(encoding);\n            self\n        }\n        /// Compress responses with the given encoding, if the client supports it.\n        #[must_use]\n        pub fn send_compressed(mut self, encoding: CompressionEncoding) -> Self {\n            self.send_compression_encodings.enable(encoding);\n            self\n        }\n        /// Limits the maximum size of a decoded message.\n        ///\n        /// Default: `4MB`\n        #[must_use]\n        pub fn max_decoding_message_size(mut self, limit: usize) -> Self {\n            
self.max_decoding_message_size = Some(limit);\n            self\n        }\n        /// Limits the maximum size of an encoded message.\n        ///\n        /// Default: `usize::MAX`\n        #[must_use]\n        pub fn max_encoding_message_size(mut self, limit: usize) -> Self {\n            self.max_encoding_message_size = Some(limit);\n            self\n        }\n    }\n    impl<T, B> tonic::codegen::Service<http::Request<B>> for SearchServiceServer<T>\n    where\n        T: SearchService,\n        B: Body + std::marker::Send + 'static,\n        B::Error: Into<StdError> + std::marker::Send + 'static,\n    {\n        type Response = http::Response<tonic::body::Body>;\n        type Error = std::convert::Infallible;\n        type Future = BoxFuture<Self::Response, Self::Error>;\n        fn poll_ready(\n            &mut self,\n            _cx: &mut Context<'_>,\n        ) -> Poll<std::result::Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n        fn call(&mut self, req: http::Request<B>) -> Self::Future {\n            match req.uri().path() {\n                \"/quickwit.search.SearchService/RootSearch\" => {\n                    #[allow(non_camel_case_types)]\n                    struct RootSearchSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::SearchRequest>\n                    for RootSearchSvc<T> {\n                        type Response = super::SearchResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::SearchRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            
let fut = async move {\n                                <T as SearchService>::root_search(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = RootSearchSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/LeafSearch\" => {\n                    #[allow(non_camel_case_types)]\n                    struct LeafSearchSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::LeafSearchRequest>\n                    for LeafSearchSvc<T> {\n                        type Response = super::LeafSearchResponse;\n                        type Future = BoxFuture<\n        
                    tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::LeafSearchRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::leaf_search(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = LeafSearchSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/FetchDocs\" => {\n                    
#[allow(non_camel_case_types)]\n                    struct FetchDocsSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::FetchDocsRequest>\n                    for FetchDocsSvc<T> {\n                        type Response = super::FetchDocsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::FetchDocsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::fetch_docs(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = FetchDocsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n         
                       max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/RootListTerms\" => {\n                    #[allow(non_camel_case_types)]\n                    struct RootListTermsSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::ListTermsRequest>\n                    for RootListTermsSvc<T> {\n                        type Response = super::ListTermsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListTermsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::root_list_terms(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = 
RootListTermsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/LeafListTerms\" => {\n                    #[allow(non_camel_case_types)]\n                    struct LeafListTermsSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::LeafListTermsRequest>\n                    for LeafListTermsSvc<T> {\n                        type Response = super::LeafListTermsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::LeafListTermsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::leaf_list_terms(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let 
accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = LeafListTermsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/Scroll\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ScrollSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::ScrollRequest>\n                    for ScrollSvc<T> {\n                        type Response = super::SearchResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ScrollRequest>,\n    
                    ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::scroll(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ScrollSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/PutKV\" => {\n                    #[allow(non_camel_case_types)]\n                    struct PutKVSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::PutKvRequest> for PutKVSvc<T> {\n                        type Response = 
super::PutKvResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::PutKvRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::put_kv(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = PutKVSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                
\"/quickwit.search.SearchService/GetKV\" => {\n                    #[allow(non_camel_case_types)]\n                    struct GetKVSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::GetKvRequest> for GetKVSvc<T> {\n                        type Response = super::GetKvResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::GetKvRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::get_kv(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = GetKVSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            
.apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/ReportSplits\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ReportSplitsSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::ReportSplitsRequest>\n                    for ReportSplitsSvc<T> {\n                        type Response = super::ReportSplitsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ReportSplitsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::report_splits(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n               
         let method = ReportSplitsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/ListFields\" => {\n                    #[allow(non_camel_case_types)]\n                    struct ListFieldsSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::ListFieldsRequest>\n                    for ListFieldsSvc<T> {\n                        type Response = super::ListFieldsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::ListFieldsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::list_fields(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let 
accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = ListFieldsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/LeafListFields\" => {\n                    #[allow(non_camel_case_types)]\n                    struct LeafListFieldsSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                    > tonic::server::UnaryService<super::LeafListFieldsRequest>\n                    for LeafListFieldsSvc<T> {\n                        type Response = super::ListFieldsResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: 
tonic::Request<super::LeafListFieldsRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::leaf_list_fields(&inner, request)\n                                    .await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = LeafListFieldsSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                \"/quickwit.search.SearchService/SearchPlan\" => {\n                    #[allow(non_camel_case_types)]\n                    struct SearchPlanSvc<T: SearchService>(pub Arc<T>);\n                    impl<\n                        T: SearchService,\n                  
  > tonic::server::UnaryService<super::SearchRequest>\n                    for SearchPlanSvc<T> {\n                        type Response = super::SearchPlanResponse;\n                        type Future = BoxFuture<\n                            tonic::Response<Self::Response>,\n                            tonic::Status,\n                        >;\n                        fn call(\n                            &mut self,\n                            request: tonic::Request<super::SearchRequest>,\n                        ) -> Self::Future {\n                            let inner = Arc::clone(&self.0);\n                            let fut = async move {\n                                <T as SearchService>::search_plan(&inner, request).await\n                            };\n                            Box::pin(fut)\n                        }\n                    }\n                    let accept_compression_encodings = self.accept_compression_encodings;\n                    let send_compression_encodings = self.send_compression_encodings;\n                    let max_decoding_message_size = self.max_decoding_message_size;\n                    let max_encoding_message_size = self.max_encoding_message_size;\n                    let inner = self.inner.clone();\n                    let fut = async move {\n                        let method = SearchPlanSvc(inner);\n                        let codec = tonic_prost::ProstCodec::default();\n                        let mut grpc = tonic::server::Grpc::new(codec)\n                            .apply_compression_config(\n                                accept_compression_encodings,\n                                send_compression_encodings,\n                            )\n                            .apply_max_message_size_config(\n                                max_decoding_message_size,\n                                max_encoding_message_size,\n                            );\n                        let res = 
grpc.unary(method, req).await;\n                        Ok(res)\n                    };\n                    Box::pin(fut)\n                }\n                _ => {\n                    Box::pin(async move {\n                        let mut response = http::Response::new(\n                            tonic::body::Body::default(),\n                        );\n                        let headers = response.headers_mut();\n                        headers\n                            .insert(\n                                tonic::Status::GRPC_STATUS,\n                                (tonic::Code::Unimplemented as i32).into(),\n                            );\n                        headers\n                            .insert(\n                                http::header::CONTENT_TYPE,\n                                tonic::metadata::GRPC_CONTENT_TYPE,\n                            );\n                        Ok(response)\n                    })\n                }\n            }\n        }\n    }\n    impl<T> Clone for SearchServiceServer<T> {\n        fn clone(&self) -> Self {\n            let inner = self.inner.clone();\n            Self {\n                inner,\n                accept_compression_encodings: self.accept_compression_encodings,\n                send_compression_encodings: self.send_compression_encodings,\n                max_decoding_message_size: self.max_decoding_message_size,\n                max_encoding_message_size: self.max_encoding_message_size,\n            }\n        }\n    }\n    /// Generated gRPC service name\n    pub const SERVICE_NAME: &str = \"quickwit.search.SearchService\";\n    impl<T> tonic::server::NamedService for SearchServiceServer<T> {\n        const NAME: &'static str = SERVICE_NAME;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/control_plane/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_actors::AskError;\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::tower::{MakeLoadShedError, RpcName, TimeoutExceeded};\nuse serde::{Deserialize, Serialize};\nuse thiserror;\n\nuse crate::metastore::{MetastoreError, OpenShardSubrequest};\nuse crate::{GrpcServiceError, ServiceError, ServiceErrorCode};\n\ninclude!(\"../codegen/quickwit/quickwit.control_plane.rs\");\n\npub const CONTROL_PLANE_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/control_plane_descriptor.bin\");\n\npub type ControlPlaneResult<T> = std::result::Result<T, ControlPlaneError>;\n\n#[derive(Debug, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum ControlPlaneError {\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n    #[error(\"metastore error: {0}\")]\n    Metastore(#[from] MetastoreError),\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests,\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl From<TimeoutExceeded> for ControlPlaneError {\n    fn from(_timeout_exceeded: TimeoutExceeded) -> Self {\n        Self::Timeout(\"tower layer timeout\".to_string())\n    }\n}\n\nimpl From<quickwit_common::tower::TaskCancelled> for ControlPlaneError {\n    fn 
from(task_cancelled: quickwit_common::tower::TaskCancelled) -> Self {\n        ControlPlaneError::Internal(task_cancelled.to_string())\n    }\n}\n\nimpl ServiceError for ControlPlaneError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(error_msg) => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"control plane internal error: {error_msg}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::Metastore(metastore_error) => metastore_error.error_code(),\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for ControlPlaneError {\n    fn new_internal(message: String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n\nimpl MakeLoadShedError for ControlPlaneError {\n    fn make_load_shed_error() -> Self {\n        Self::TooManyRequests\n    }\n}\n\nimpl From<ControlPlaneError> for MetastoreError {\n    fn from(error: ControlPlaneError) -> Self {\n        match error {\n            ControlPlaneError::Internal(message) => MetastoreError::Internal {\n                message: \"an internal metastore error occurred\".to_string(),\n                cause: message,\n            },\n            ControlPlaneError::Metastore(error) => error,\n            ControlPlaneError::Timeout(message) => MetastoreError::Timeout(message),\n            ControlPlaneError::TooManyRequests => MetastoreError::TooManyRequests,\n            ControlPlaneError::Unavailable(message) => 
MetastoreError::Unavailable(message),\n        }\n    }\n}\n\nimpl From<AskError<ControlPlaneError>> for ControlPlaneError {\n    fn from(error: AskError<ControlPlaneError>) -> Self {\n        match error {\n            AskError::ErrorReply(error) => error,\n            AskError::MessageNotDelivered => {\n                Self::new_unavailable(\"request could not be delivered to actor\".to_string())\n            }\n            AskError::ProcessMessageError => {\n                Self::new_internal(\"an error occurred while processing the request\".to_string())\n            }\n        }\n    }\n}\n\nimpl RpcName for GetOrCreateOpenShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"get_or_create_open_shards\"\n    }\n}\n\nimpl RpcName for AdviseResetShardsRequest {\n    fn rpc_name() -> &'static str {\n        \"advise_reset_shards\"\n    }\n}\n\nimpl GetOrCreateOpenShardsFailureReason {\n    pub fn create_failure(\n        &self,\n        subrequest: impl Into<GetOrCreateOpenShardsSubrequest>,\n    ) -> GetOrCreateOpenShardsFailure {\n        let subrequest = subrequest.into();\n\n        GetOrCreateOpenShardsFailure {\n            subrequest_id: subrequest.subrequest_id,\n            index_id: subrequest.index_id,\n            source_id: subrequest.source_id,\n            reason: *self as i32,\n        }\n    }\n}\n\nimpl From<crate::metastore::OpenShardSubrequest> for GetOrCreateOpenShardsSubrequest {\n    fn from(metastore_open_shard_subrequest: OpenShardSubrequest) -> Self {\n        let index_id = metastore_open_shard_subrequest.index_uid().index_id.clone();\n\n        Self {\n            subrequest_id: metastore_open_shard_subrequest.subrequest_id,\n            index_id,\n            source_id: metastore_open_shard_subrequest.source_id,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/developer/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse thiserror;\n\nuse crate::{GrpcServiceError, ServiceError, ServiceErrorCode};\n\ninclude!(\"../codegen/quickwit/quickwit.developer.rs\");\n\npub const DEVELOPER_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/developer_descriptor.bin\");\n\npub type DeveloperResult<T> = std::result::Result<T, DeveloperError>;\n\n#[derive(Debug, thiserror::Error, Eq, PartialEq, serde::Serialize, serde::Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum DeveloperError {\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n    #[error(\"invalid argument: {0}\")]\n    InvalidArgument(String),\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests,\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl ServiceError for DeveloperError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(_) => ServiceErrorCode::Internal,\n            Self::InvalidArgument(_) => ServiceErrorCode::BadRequest,\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for DeveloperError {\n    fn new_internal(message: 
String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::Infallible;\nuse std::error::Error;\nuse std::fmt::Debug;\n\nuse anyhow::Context;\nuse quickwit_actors::AskError;\nuse serde::Serialize;\nuse serde::de::DeserializeOwned;\nuse tonic::metadata::BinaryMetadataValue;\nuse tracing::{error, warn};\n\nconst QW_ERROR_HEADER_NAME: &str = \"qw-error-bin\";\n\n/// This enum maps our internal error codes to\n/// gRPC and HTTP status codes.\n///\n/// It is voluntarily a restricted subset of gRPC status codes. 
Please introduce new variants\n/// thoughtfully.\n#[derive(Clone, Copy)]\npub enum ServiceErrorCode {\n    AlreadyExists,\n    BadRequest,\n    // Use `Unauthenticated` if the caller cannot be identified.\n    Forbidden,\n    Internal,\n    NotFound,\n    Timeout,\n    TooManyRequests,\n    Unauthenticated,\n    Unavailable,\n}\n\nimpl ServiceErrorCode {\n    fn grpc_status_code(&self) -> tonic::Code {\n        match self {\n            Self::AlreadyExists => tonic::Code::AlreadyExists,\n            Self::BadRequest => tonic::Code::InvalidArgument,\n            Self::Forbidden => tonic::Code::PermissionDenied,\n            Self::Internal => tonic::Code::Internal,\n            Self::NotFound => tonic::Code::NotFound,\n            Self::Timeout => tonic::Code::DeadlineExceeded,\n            Self::TooManyRequests => tonic::Code::ResourceExhausted,\n            Self::Unauthenticated => tonic::Code::Unauthenticated,\n            Self::Unavailable => tonic::Code::Unavailable,\n        }\n    }\n\n    pub fn http_status_code(&self) -> http::StatusCode {\n        match self {\n            Self::AlreadyExists => http::StatusCode::BAD_REQUEST,\n            Self::BadRequest => http::StatusCode::BAD_REQUEST,\n            Self::Forbidden => http::StatusCode::FORBIDDEN,\n            Self::Internal => http::StatusCode::INTERNAL_SERVER_ERROR,\n            Self::NotFound => http::StatusCode::NOT_FOUND,\n            Self::Timeout => http::StatusCode::REQUEST_TIMEOUT,\n            Self::TooManyRequests => http::StatusCode::TOO_MANY_REQUESTS,\n            Self::Unauthenticated => http::StatusCode::UNAUTHORIZED,\n            Self::Unavailable => http::StatusCode::SERVICE_UNAVAILABLE,\n        }\n    }\n}\n\npub trait ServiceError: Error + Debug + 'static {\n    fn error_code(&self) -> ServiceErrorCode;\n}\n\nimpl ServiceError for Infallible {\n    fn error_code(&self) -> ServiceErrorCode {\n        unreachable!()\n    }\n}\n\nimpl<E> ServiceError for AskError<E>\nwhere E: 
ServiceError\n{\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            AskError::ErrorReply(error) => error.error_code(),\n            AskError::MessageNotDelivered => ServiceErrorCode::Unavailable,\n            AskError::ProcessMessageError => ServiceErrorCode::Internal,\n        }\n    }\n}\n\n/// A trait for encoding/decoding service errors to/from gRPC statuses. Errors are stored in JSON\n/// in the gRPC header `qw-error-bin`. This allows for propagating them transparently\n/// between clients and servers over the network without being semantically limited to a status code\n/// and a message. However, it also means that modifying the serialization format of existing errors\n/// or introducing new ones is not backward compatible.\npub trait GrpcServiceError: ServiceError + Serialize + DeserializeOwned + Send + Sync {\n    fn into_grpc_status(self) -> tonic::Status {\n        grpc_error_to_grpc_status(self)\n    }\n\n    fn new_internal(message: String) -> Self;\n\n    fn new_timeout(message: String) -> Self;\n\n    fn new_too_many_requests() -> Self;\n\n    fn new_unavailable(message: String) -> Self;\n}\n\n/// Converts a service error into a gRPC status.\npub fn grpc_error_to_grpc_status<E>(service_error: E) -> tonic::Status\nwhere E: GrpcServiceError {\n    let code = service_error.error_code().grpc_status_code();\n    let message = service_error.to_string();\n    let mut status = tonic::Status::new(code, message);\n\n    match encode_error(&service_error) {\n        Ok(header_value) => {\n            status\n                .metadata_mut()\n                .insert_bin(QW_ERROR_HEADER_NAME, header_value);\n        }\n        Err(error) => {\n            warn!(%error, \"failed to encode error `{service_error:?}`\");\n        }\n    }\n    status\n}\n\n/// Converts a gRPC status into a service error.\npub fn grpc_status_to_service_error<E>(status: tonic::Status, rpc_name: &'static str) -> E\nwhere E: GrpcServiceError {\n    if let 
Some(header_value) = status.metadata().get_bin(QW_ERROR_HEADER_NAME) {\n        let service_error = match decode_error(header_value) {\n            Ok(service_error) => service_error,\n            Err(error) => {\n                let message = format!(\n                    \"failed to deserialize error returned from server (this can happen during \\\n                     rolling upgrades): {error}\"\n                );\n                E::new_internal(message)\n            }\n        };\n        return service_error;\n    }\n    let message = status.message().to_string();\n    error!(code = ?status.code(), rpc = rpc_name, \"gRPC transport error: {message}\");\n\n    match status.code() {\n        // `Cancelled` is a client timeout whereas `DeadlineExceeded` is a server timeout. At this\n        // stage, we don't distinguish them.\n        tonic::Code::Cancelled | tonic::Code::DeadlineExceeded => E::new_timeout(message),\n        tonic::Code::Unavailable => E::new_unavailable(message),\n        _ => E::new_internal(message),\n    }\n}\n\n/// Encodes a service error into a gRPC header value.\nfn encode_error<E: Serialize>(service_error: &E) -> anyhow::Result<BinaryMetadataValue> {\n    let service_error_json = serde_json::to_vec(&service_error)?;\n    let header_value = BinaryMetadataValue::from_bytes(&service_error_json);\n    Ok(header_value)\n}\n\n/// Decodes a service error from a gRPC header value.\nfn decode_error<E: DeserializeOwned>(header_value: &BinaryMetadataValue) -> anyhow::Result<E> {\n    let service_error_json = header_value.to_bytes().context(\"invalid header value\")?;\n    let service_error = serde_json::from_slice(&service_error_json).with_context(|| {\n        if let Ok(service_error_json_str) = std::str::from_utf8(&service_error_json) {\n            format!(\"invalid JSON `{service_error_json_str}`\")\n        } else {\n            \"invalid JSON\".to_string()\n        }\n    })?;\n    
Ok(service_error)\n}\n\n#[allow(clippy::result_large_err)]\npub fn convert_to_grpc_result<T, E: GrpcServiceError>(\n    result: Result<T, E>,\n) -> tonic::Result<tonic::Response<T>> {\n    result\n        .map(tonic::Response::new)\n        .map_err(|error| error.into_grpc_status())\n}\n\n#[cfg(test)]\nmod tests {\n    use serde::Deserialize;\n\n    use super::*;\n\n    #[test]\n    fn test_grpc_service_error_roundtrip() {\n        #[derive(Clone, Debug, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\n        #[serde(rename_all = \"snake_case\")]\n        enum MyError {\n            #[error(\"internal error: {0}\")]\n            Internal(String),\n            #[error(\"request timed out: {0}\")]\n            Timeout(String),\n\n            #[error(\"too many requests\")]\n            TooManyRequests,\n\n            #[error(\"service unavailable: {0}\")]\n            Unavailable(String),\n        }\n\n        impl ServiceError for MyError {\n            fn error_code(&self) -> ServiceErrorCode {\n                match self {\n                    Self::Internal(_) => ServiceErrorCode::Internal,\n                    Self::Timeout(_) => ServiceErrorCode::Timeout,\n                    Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n                    Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n                }\n            }\n        }\n\n        impl GrpcServiceError for MyError {\n            fn new_internal(message: String) -> Self {\n                Self::Internal(message)\n            }\n\n            fn new_timeout(message: String) -> Self {\n                Self::Timeout(message)\n            }\n\n            fn new_too_many_requests() -> Self {\n                Self::TooManyRequests\n            }\n\n            fn new_unavailable(message: String) -> Self {\n                Self::Unavailable(message)\n            }\n        }\n\n        let service_error = MyError::new_internal(\"test\".to_string());\n        let status = 
grpc_error_to_grpc_status(service_error.clone());\n        let expected_error: MyError = grpc_status_to_service_error(status, \"rpc_name\");\n        assert_eq!(service_error, expected_error);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/getters.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse crate::control_plane::*;\nuse crate::indexing::*;\nuse crate::ingest::ingester::*;\nuse crate::ingest::router::*;\nuse crate::ingest::*;\nuse crate::metastore::*;\nuse crate::types::*;\n\nmacro_rules! generate_getters {\n    (impl fn $field:ident() -> $type:ty {} for $($struct:ty),+) => {\n        $(\n        impl $struct {\n            // we track caller so the reported line isn't the macro invocation below\n            #[track_caller]\n            pub fn $field(&self) -> $type {\n                self.$field\n                    .as_ref()\n                    .expect(concat!(\"`\",\n                    stringify!($field), \"` should be a required field\"))\n            }\n        }\n        )*\n    }\n}\n\nmacro_rules! generate_clone_getters {\n    (impl fn $field:ident() -> $type:ty {} for $($struct:ty),+) => {\n        $(\n        impl $struct {\n            // we track caller so the reported line isn't the macro invocation below\n            #[track_caller]\n            pub fn $field(&self) -> $type {\n                self.$field\n                    .clone()\n                    .expect(concat!(\"`\",\n                    stringify!($field), \"` should be a required field\"))\n            }\n        }\n        )*\n    }\n}\n\nmacro_rules! 
generate_copy_getters {\n    (impl fn $field:ident() -> $type:ty {} for $($struct:ty),+) => {\n        $(\n        impl $struct {\n            // we track caller so the reported line isn't the macro invocation below\n            #[track_caller]\n            pub fn $field(&self) -> $type {\n                self.$field\n                    .expect(concat!(\"`\",\n                    stringify!($field), \"` should be a required field\"))\n            }\n        }\n        )*\n    }\n}\n\n// [`DocMappingUid`] getters\ngenerate_copy_getters!(\n    impl fn doc_mapping_uid() -> DocMappingUid {} for\n\n    OpenShardSubrequest,\n    Shard\n);\n\n// [`DocUid`] getters\ngenerate_copy_getters! {\n    impl fn doc_uid() -> DocUid {} for\n\n    ParseFailure\n}\n\n// [`IndexUid`] getters\ngenerate_getters! {\n    impl fn index_uid() -> &IndexUid {} for\n    // Control Plane API\n    GetOrCreateOpenShardsSuccess,\n\n    // Indexing API\n    IndexingTask,\n\n    // Ingest API\n    FetchEof,\n    FetchPayload,\n    IngestSuccess,\n    OpenFetchStreamRequest,\n    PersistFailure,\n    PersistSubrequest,\n    PersistSuccess,\n    ReplicateFailure,\n    ReplicateSubrequest,\n    ReplicateSuccess,\n    RetainShardsForSource,\n    Shard,\n    ShardIdPositions,\n    ShardIds,\n    ShardPKey,\n    TruncateShardsSubrequest,\n    SourceShardUpdate,\n\n    // Metastore API\n    AcquireShardsRequest,\n    AddSourceRequest,\n    CreateIndexResponse,\n    DeleteIndexRequest,\n    DeleteQuery,\n    DeleteShardsRequest,\n    DeleteShardsResponse,\n    DeleteSourceRequest,\n    DeleteSplitsRequest,\n    LastDeleteOpstampRequest,\n    ListDeleteTasksRequest,\n    ListShardsSubrequest,\n    ListShardsSubresponse,\n    ListStaleSplitsRequest,\n    MarkSplitsForDeletionRequest,\n    OpenShardSubrequest,\n    PruneShardsRequest,\n    PublishSplitsRequest,\n    ResetSourceCheckpointRequest,\n    StageSplitsRequest,\n    ToggleSourceRequest,\n    UpdateIndexRequest,\n    UpdateSourceRequest,\n    
UpdateSplitsDeleteOpstampRequest\n}\n\n// [`PipelineUid`] getters\ngenerate_copy_getters! {\n    impl fn pipeline_uid() -> PipelineUid {} for\n\n    IndexingTask\n}\n\n// [`Position`] getters. We use `clone` because `Position` is an `Arc` under the hood.\ngenerate_clone_getters! {\n    impl fn eof_position() -> Position {} for\n\n    FetchEof\n}\n\ngenerate_clone_getters! {\n    impl fn from_position_exclusive() -> Position {} for\n\n    FetchPayload,\n    OpenFetchStreamRequest,\n    ReplicateSubrequest\n}\n\ngenerate_clone_getters! {\n    impl fn to_position_inclusive() -> Position {} for\n\n    FetchPayload\n}\n\ngenerate_clone_getters! {\n    impl fn publish_position_inclusive() -> Position {} for\n\n    Shard,\n    ShardIdPosition\n}\n\ngenerate_clone_getters! {\n    impl fn replication_position_inclusive() -> Position {} for\n\n    ReplicateSuccess\n}\n\ngenerate_clone_getters! {\n    impl fn truncate_up_to_position_inclusive() -> Position {} for\n\n    TruncateShardsSubrequest\n}\n\n// [`Shard`] getters\ngenerate_getters! {\n    impl fn open_shard() -> &Shard {} for\n\n    OpenShardSubresponse\n}\n\ngenerate_getters! {\n    impl fn shard() -> &Shard {} for\n\n    InitShardSubrequest,\n    InitShardSuccess\n}\n\n// [`ShardId`] getters\ngenerate_getters! {\n    impl fn shard_id() -> &ShardId {} for\n\n    FetchEof,\n    FetchPayload,\n    InitShardFailure,\n    OpenFetchStreamRequest,\n    OpenShardSubrequest,\n    PersistSuccess,\n    ReplicateFailure,\n    ReplicateSubrequest,\n    ReplicateSuccess,\n    Shard,\n    ShardIdPosition,\n    ShardPKey,\n    TruncateShardsSubrequest\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/indexing/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::fmt::{Display, Formatter};\nuse std::hash::Hash;\nuse std::ops::{Add, Mul, Sub};\nuse std::str::FromStr;\n\nuse bytesize::ByteSize;\nuse quickwit_actors::AskError;\nuse quickwit_common::pubsub::Event;\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::tower::{MakeLoadShedError, RpcName, TimeoutExceeded};\nuse serde::{Deserialize, Serialize};\nuse thiserror;\n\nuse crate::metastore::MetastoreError;\nuse crate::types::{IndexUid, NodeId, PipelineUid, Position, ShardId, SourceId, SourceUid};\nuse crate::{GrpcServiceError, ServiceError, ServiceErrorCode};\n\ninclude!(\"../codegen/quickwit/quickwit.indexing.rs\");\n\npub const INDEXING_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/indexing_descriptor.bin\");\n\npub type IndexingResult<T> = std::result::Result<T, IndexingError>;\n\n#[derive(Debug, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum IndexingError {\n    #[error(\"internal error: {0}\")]\n    Internal(String),\n    #[error(\"metastore error: {0}\")]\n    Metastore(#[from] MetastoreError),\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests,\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\nimpl From<TimeoutExceeded> for IndexingError {\n    fn 
from(_timeout_exceeded: TimeoutExceeded) -> Self {\n        Self::Timeout(\"tower layer timeout\".to_string())\n    }\n}\n\nimpl ServiceError for IndexingError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(err_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"indexing error: {err_msg}\");\n                ServiceErrorCode::Internal\n            }\n            Self::Metastore(metastore_error) => metastore_error.error_code(),\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for IndexingError {\n    fn new_internal(message: String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n\nimpl MakeLoadShedError for IndexingError {\n    fn make_load_shed_error() -> Self {\n        Self::TooManyRequests\n    }\n}\n\nimpl From<AskError<IndexingError>> for IndexingError {\n    fn from(error: AskError<IndexingError>) -> Self {\n        match error {\n            AskError::ErrorReply(error) => error,\n            AskError::MessageNotDelivered => {\n                Self::new_unavailable(\"request could not be delivered to actor\".to_string())\n            }\n            AskError::ProcessMessageError => {\n                Self::new_internal(\"an error occurred while processing the request\".to_string())\n            }\n        }\n    }\n}\n\n/// Uniquely identifies an indexing pipeline. 
There can be multiple indexing pipelines per\n/// source `(index_uid, source_id)` running simultaneously on an indexer.\n#[derive(Clone, Debug, Hash, Eq, PartialEq)]\npub struct IndexingPipelineId {\n    pub node_id: NodeId,\n    pub index_uid: IndexUid,\n    pub source_id: SourceId,\n    pub pipeline_uid: PipelineUid,\n}\n\nimpl IndexingPipelineId {\n    pub fn merge_pipeline_id(&self) -> MergePipelineId {\n        MergePipelineId {\n            node_id: self.node_id.clone(),\n            index_uid: self.index_uid.clone(),\n            source_id: self.source_id.clone(),\n        }\n    }\n}\n\nimpl Display for IndexingPipelineId {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"{}:{}\", self.index_uid, &self.source_id)\n    }\n}\n\n/// Uniquely identifies a merge pipeline. There exists at most one merge pipeline per\n/// `(index_uid, source_id)` running on an indexer at any given time, fed by one or more indexing\n/// pipelines.\n#[derive(Clone, Debug, Hash, Eq, PartialEq)]\npub struct MergePipelineId {\n    pub node_id: NodeId,\n    pub index_uid: IndexUid,\n    pub source_id: SourceId,\n}\n\nimpl Display for MergePipelineId {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"merge:{}:{}\", self.index_uid, &self.source_id)\n    }\n}\n\nimpl Display for IndexingTask {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"{}:{}\", self.index_uid(), &self.source_id)\n    }\n}\n\nimpl Eq for IndexingTask {}\n\n// TODO: This implementation conflicts with the default derived implementation. 
It would be better\n// to use a wrapper over `IndexingTask` where we need to group indexing tasks by index UID and\n// source ID.\nimpl Hash for IndexingTask {\n    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {\n        self.index_uid.hash(state);\n        self.source_id.hash(state);\n    }\n}\n#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, utoipa::ToSchema)]\npub struct PipelineMetrics {\n    pub cpu_load: CpuCapacity,\n    // Indexing throughput (when the CPU is working).\n    // This measures the theoretical maximum number of MB/s a full indexing pipeline could process\n    // provided enough data was being ingested.\n    pub throughput_mb_per_sec: u16,\n}\n\nimpl Display for PipelineMetrics {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"{},{}MB/s\", self.cpu_load, self.throughput_mb_per_sec)\n    }\n}\n\n/// One full pipeline (including merging) is assumed to consume 4 CPU threads.\n/// The actual number is somewhere between 3 and 4. Quickwit is not super sensitive to this number.\n///\n/// It simply impacts the point where we prefer to work on balancing the load over the different\n/// indexers and the point where we prefer improving other features of the system (shard locality,\n/// grouping pipelines associated with a given index on the same node, etc.).\npub const PIPELINE_FULL_CAPACITY: CpuCapacity = CpuCapacity::from_cpu_millis(4_000u32);\n\n/// One full pipeline (including merging) is supposed to have the capacity to index at least 20MB/s.\n/// This is a defensive value: In reality, this is typically above 30MB/s.\npub const PIPELINE_THROUGHPUT: ByteSize = ByteSize::mb(20);\n\n/// The CpuCapacity represents an amount of CPU resource available.\n///\n/// It is usually expressed in CPU millis (For instance, one full CPU thread is\n/// displayed as `1000m`).\n#[derive(\n    Copy, Clone, Debug, Eq, PartialEq, Deserialize, Serialize, Ord, PartialOrd, utoipa::ToSchema,\n)]\n#[serde(\n    into = 
\"CpuCapacityForSerialization\",\n    try_from = \"CpuCapacityForSerialization\"\n)]\npub struct CpuCapacity(u32);\n\n/// Short helper function to build `CpuCapacity`.\n#[inline(always)]\npub const fn mcpu(milli_cpus: u32) -> CpuCapacity {\n    CpuCapacity::from_cpu_millis(milli_cpus)\n}\n\nimpl CpuCapacity {\n    #[inline(always)]\n    pub const fn from_cpu_millis(cpu_millis: u32) -> CpuCapacity {\n        CpuCapacity(cpu_millis)\n    }\n\n    #[inline(always)]\n    pub fn cpu_millis(self) -> u32 {\n        self.0\n    }\n\n    #[inline(always)]\n    pub fn zero() -> CpuCapacity {\n        CpuCapacity::from_cpu_millis(0u32)\n    }\n\n    #[inline(always)]\n    pub fn one_cpu_thread() -> CpuCapacity {\n        CpuCapacity::from_cpu_millis(1_000u32)\n    }\n}\n\nimpl Sub<CpuCapacity> for CpuCapacity {\n    type Output = CpuCapacity;\n\n    #[inline(always)]\n    fn sub(self, rhs: CpuCapacity) -> Self::Output {\n        CpuCapacity::from_cpu_millis(self.0 - rhs.0)\n    }\n}\n\nimpl Add<CpuCapacity> for CpuCapacity {\n    type Output = CpuCapacity;\n\n    #[inline(always)]\n    fn add(self, rhs: CpuCapacity) -> Self::Output {\n        CpuCapacity::from_cpu_millis(self.0 + rhs.0)\n    }\n}\n\nimpl Mul<u32> for CpuCapacity {\n    type Output = CpuCapacity;\n\n    #[inline(always)]\n    fn mul(self, rhs: u32) -> CpuCapacity {\n        CpuCapacity::from_cpu_millis(self.0 * rhs)\n    }\n}\n\nimpl Mul<f32> for CpuCapacity {\n    type Output = CpuCapacity;\n\n    #[inline(always)]\n    fn mul(self, scale: f32) -> CpuCapacity {\n        CpuCapacity::from_cpu_millis((self.0 as f32 * scale) as u32)\n    }\n}\n\nimpl Display for CpuCapacity {\n    fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {\n        write!(f, \"{}m\", self.0)\n    }\n}\n\n#[derive(Serialize, Deserialize)]\n#[serde(untagged)]\nenum CpuCapacityForSerialization {\n    Float(f32),\n    MilliCpuWithUnit(String),\n}\n\nimpl TryFrom<CpuCapacityForSerialization> for CpuCapacity {\n    type Error = String;\n\n 
   fn try_from(\n        cpu_capacity_for_serialization: CpuCapacityForSerialization,\n    ) -> Result<CpuCapacity, Self::Error> {\n        match cpu_capacity_for_serialization {\n            CpuCapacityForSerialization::Float(cpu_capacity) => {\n                Ok(CpuCapacity((cpu_capacity * 1000.0f32) as u32))\n            }\n            CpuCapacityForSerialization::MilliCpuWithUnit(cpu_capacity_str) => {\n                Self::from_str(&cpu_capacity_str)\n            }\n        }\n    }\n}\n\nimpl FromStr for CpuCapacity {\n    type Err = String;\n\n    fn from_str(cpu_capacity_str: &str) -> Result<Self, Self::Err> {\n        let Some(milli_cpus_without_unit_str) = cpu_capacity_str.strip_suffix('m') else {\n            return Err(format!(\n                \"invalid cpu capacity: `{cpu_capacity_str}`. String format expects a trailing 'm'.\"\n            ));\n        };\n        let milli_cpus: u32 = milli_cpus_without_unit_str\n            .parse::<u32>()\n            .map_err(|_err| format!(\"invalid cpu capacity: `{cpu_capacity_str}`.\"))?;\n        Ok(CpuCapacity(milli_cpus))\n    }\n}\n\nimpl From<CpuCapacity> for CpuCapacityForSerialization {\n    fn from(cpu_capacity: CpuCapacity) -> CpuCapacityForSerialization {\n        CpuCapacityForSerialization::MilliCpuWithUnit(format!(\"{}m\", cpu_capacity.0))\n    }\n}\n\n/// Whenever a shard position update is detected (whether it is emitted by an indexing pipeline\n/// local to the cluster or received via chitchat), the shard positions service publishes a\n/// `ShardPositionsUpdate` event through the cluster's `EventBroker`.\n#[derive(Debug, Clone, PartialEq, Eq)]\npub struct ShardPositionsUpdate {\n    pub source_uid: SourceUid,\n    // Only shards that received an update are listed here.\n    pub updated_shard_positions: Vec<(ShardId, Position)>,\n}\n\nimpl Event for ShardPositionsUpdate {}\n\nimpl RpcName for ApplyIndexingPlanRequest {\n    fn rpc_name() -> &'static str {\n        \"apply_indexing_plan\"\n    
}\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_cpu_capacity_serialization() {\n        assert_eq!(CpuCapacity::from_str(\"2000m\").unwrap(), mcpu(2000));\n        assert_eq!(CpuCapacity::from_cpu_millis(2500), mcpu(2500));\n        assert_eq!(\n            CpuCapacity::from_str(\"2.5\").unwrap_err(),\n            \"invalid cpu capacity: `2.5`. String format expects a trailing 'm'.\"\n        );\n        assert_eq!(\n            serde_json::from_value::<CpuCapacity>(serde_json::Value::String(\"1200m\".to_string()))\n                .unwrap(),\n            mcpu(1200)\n        );\n        assert_eq!(\n            serde_json::from_value::<CpuCapacity>(serde_json::Value::Number(\n                serde_json::Number::from_f64(1.2f64).unwrap()\n            ))\n            .unwrap(),\n            mcpu(1200)\n        );\n        assert_eq!(\n            serde_json::from_value::<CpuCapacity>(serde_json::Value::Number(\n                serde_json::Number::from(1u32)\n            ))\n            .unwrap(),\n            mcpu(1000)\n        );\n        assert_eq!(CpuCapacity::from_cpu_millis(2500).to_string(), \"2500m\");\n        assert_eq!(serde_json::to_string(&mcpu(2500)).unwrap(), \"\\\"2500m\\\"\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/ingest/ingester.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytesize::ByteSize;\n\nuse crate::types::{Position, QueueId, queue_id};\n\ninclude!(\"../codegen/quickwit/quickwit.ingest.ingester.rs\");\n\npub use ingester_service_grpc_server::IngesterServiceGrpcServer;\n\nimpl FetchMessage {\n    pub fn new_payload(payload: FetchPayload) -> Self {\n        assert!(\n            matches!(&payload.mrecord_batch, Some(batch) if !batch.mrecord_lengths.is_empty()),\n            \"`mrecord_batch` must be set and non-empty\"\n        );\n\n        Self {\n            message: Some(fetch_message::Message::Payload(payload)),\n        }\n    }\n\n    pub fn new_eof(eof: FetchEof) -> Self {\n        assert!(\n            matches!(eof.eof_position, Some(Position::Eof(_))),\n            \"`eof_position` must be set\"\n        );\n\n        Self {\n            message: Some(fetch_message::Message::Eof(eof)),\n        }\n    }\n}\n\nimpl FetchPayload {\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n\n    pub fn num_mrecords(&self) -> usize {\n        if let Some(mrecord_batch) = &self.mrecord_batch {\n            mrecord_batch.mrecord_lengths.len()\n        } else {\n            0\n        }\n    }\n\n    pub fn estimate_size(&self) -> ByteSize {\n        if let Some(mrecord_batch) = &self.mrecord_batch {\n            
mrecord_batch.estimate_size()\n        } else {\n            ByteSize(0)\n        }\n    }\n}\n\nimpl IngesterStatus {\n    pub fn as_json_str_name(&self) -> &'static str {\n        match self {\n            Self::Unspecified => \"unspecified\",\n            Self::Initializing => \"initializing\",\n            Self::Ready => \"ready\",\n            Self::Retiring => \"retiring\",\n            Self::Decommissioning => \"decommissioning\",\n            Self::Decommissioned => \"decommissioned\",\n            Self::Failed => \"failed\",\n        }\n    }\n\n    pub fn from_json_str_name(value: &str) -> Option<Self> {\n        match value {\n            \"unspecified\" => Some(Self::Unspecified),\n            \"initializing\" => Some(Self::Initializing),\n            \"ready\" => Some(Self::Ready),\n            \"retiring\" => Some(Self::Retiring),\n            \"decommissioning\" => Some(Self::Decommissioning),\n            \"decommissioned\" => Some(Self::Decommissioned),\n            \"failed\" => Some(Self::Failed),\n            _ => None,\n        }\n    }\n\n    pub fn is_ready(&self) -> bool {\n        matches!(self, Self::Ready)\n    }\n\n    pub fn accepts_write_requests(&self) -> bool {\n        matches!(self, Self::Ready | Self::Retiring)\n    }\n}\n\nimpl std::fmt::Display for IngesterStatus {\n    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        write!(f, \"{}\", self.as_json_str_name())\n    }\n}\n\nimpl OpenFetchStreamRequest {\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n}\n\nimpl PersistSuccess {\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n}\n\nimpl SynReplicationMessage {\n    pub fn into_open_request(self) -> Option<OpenReplicationStreamRequest> {\n        match self.message {\n            Some(syn_replication_message::Message::OpenRequest(open_request)) => 
Some(open_request),\n            _ => None,\n        }\n    }\n\n    pub fn new_open_request(open_request: OpenReplicationStreamRequest) -> Self {\n        Self {\n            message: Some(syn_replication_message::Message::OpenRequest(open_request)),\n        }\n    }\n\n    pub fn new_init_replica_request(init_replica_request: InitReplicaRequest) -> Self {\n        Self {\n            message: Some(syn_replication_message::Message::InitRequest(\n                init_replica_request,\n            )),\n        }\n    }\n\n    pub fn new_replicate_request(replicate_request: ReplicateRequest) -> Self {\n        Self {\n            message: Some(syn_replication_message::Message::ReplicateRequest(\n                replicate_request,\n            )),\n        }\n    }\n}\n\nimpl AckReplicationMessage {\n    pub fn into_open_response(self) -> Option<OpenReplicationStreamResponse> {\n        match self.message {\n            Some(ack_replication_message::Message::OpenResponse(open_response)) => {\n                Some(open_response)\n            }\n            _ => None,\n        }\n    }\n\n    pub fn new_open_response(open_response: OpenReplicationStreamResponse) -> Self {\n        Self {\n            message: Some(ack_replication_message::Message::OpenResponse(\n                open_response,\n            )),\n        }\n    }\n\n    pub fn new_init_replica_response(init_replica_response: InitReplicaResponse) -> Self {\n        Self {\n            message: Some(ack_replication_message::Message::InitResponse(\n                init_replica_response,\n            )),\n        }\n    }\n\n    pub fn new_replicate_response(replicate_response: ReplicateResponse) -> Self {\n        Self {\n            message: Some(ack_replication_message::Message::ReplicateResponse(\n                replicate_response,\n            )),\n        }\n    }\n}\n\nimpl ReplicateRequest {\n    pub fn num_bytes(&self) -> usize {\n        self.subrequests\n            .iter()\n            
.flat_map(|subrequest| &subrequest.doc_batch)\n            .map(|doc_batch| doc_batch.num_bytes())\n            .sum()\n    }\n}\n\nimpl ReplicateSubrequest {\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n}\n\nimpl TruncateShardsSubrequest {\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/ingest/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::iter::zip;\n\nuse bytes::Bytes;\nuse bytesize::ByteSize;\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::tower::MakeLoadShedError;\nuse serde::{Deserialize, Serialize};\n\nuse self::ingester::{PersistFailureReason, ReplicateFailureReason};\nuse self::router::IngestFailureReason;\nuse super::GrpcServiceError;\nuse crate::types::{DocUid, NodeIdRef, Position, QueueId, ShardId, SourceUid, queue_id};\nuse crate::{ServiceError, ServiceErrorCode};\n\npub mod ingester;\npub mod router;\n\ninclude!(\"../codegen/quickwit/quickwit.ingest.rs\");\n\npub const INGEST_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/ingest_descriptor.bin\");\n\npub type IngestV2Result<T> = std::result::Result<T, IngestV2Error>;\n\n#[derive(Debug, Copy, Clone, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\npub enum RateLimitingCause {\n    #[error(\"router load shedding\")]\n    RouterLoadShedding,\n    #[error(\"load shedding\")]\n    LoadShedding,\n    #[error(\"wal full (memory or disk)\")]\n    WalFull,\n    #[error(\"circuit breaker\")]\n    CircuitBreaker,\n    #[error(\"shard rate limiting\")]\n    ShardRateLimiting,\n    #[error(\"unknown\")]\n    Unknown,\n}\n\n#[derive(Debug, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum IngestV2Error {\n    
#[error(\"internal error: {0}\")]\n    Internal(String),\n    #[error(\"shard `{shard_id}` not found\")]\n    ShardNotFound { shard_id: ShardId },\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests(RateLimitingCause),\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl From<quickwit_common::tower::TimeoutExceeded> for IngestV2Error {\n    fn from(_: quickwit_common::tower::TimeoutExceeded) -> IngestV2Error {\n        IngestV2Error::Timeout(\"tower layer timeout\".to_string())\n    }\n}\n\nimpl From<quickwit_common::tower::TaskCancelled> for IngestV2Error {\n    fn from(task_cancelled: quickwit_common::tower::TaskCancelled) -> IngestV2Error {\n        IngestV2Error::Internal(task_cancelled.to_string())\n    }\n}\n\nimpl ServiceError for IngestV2Error {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::Internal(error_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"ingest internal error: {error_msg}\");\n                ServiceErrorCode::Internal\n            }\n            Self::ShardNotFound { .. 
} => ServiceErrorCode::NotFound,\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests(_) => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for IngestV2Error {\n    fn new_internal(message: String) -> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests(RateLimitingCause::Unknown)\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n\nimpl MakeLoadShedError for IngestV2Error {\n    fn make_load_shed_error() -> Self {\n        IngestV2Error::TooManyRequests(RateLimitingCause::LoadShedding)\n    }\n}\n\nimpl Shard {\n    /// List of nodes that are storing the shard (the leader, and optionally the follower).\n    pub fn ingesters(&self) -> impl Iterator<Item = &NodeIdRef> + '_ {\n        [Some(&self.leader_id), self.follower_id.as_ref()]\n            .into_iter()\n            .flatten()\n            .map(|node_id| NodeIdRef::from_str(node_id))\n    }\n\n    pub fn source_uid(&self) -> SourceUid {\n        SourceUid {\n            index_uid: self.index_uid().clone(),\n            source_id: self.source_id.clone(),\n        }\n    }\n}\n\nimpl ShardPKey {\n    pub fn queue_id(&self) -> QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n}\n\nimpl DocBatchV2 {\n    pub fn docs(&self) -> impl Iterator<Item = (DocUid, Bytes)> + '_ {\n        zip(&self.doc_uids, &self.doc_lengths).scan(\n            self.doc_buffer.clone(),\n            |doc_buffer, (doc_uid, doc_len)| {\n                let doc = doc_buffer.split_to(*doc_len as usize);\n                Some((*doc_uid, doc))\n            },\n        )\n    }\n\n    pub fn into_docs(self) -> impl Iterator<Item = (DocUid, Bytes)> {\n        
zip(self.doc_uids, self.doc_lengths).scan(\n            self.doc_buffer,\n            |doc_buffer, (doc_uid, doc_len)| {\n                let doc = doc_buffer.split_to(doc_len as usize);\n                Some((doc_uid, doc))\n            },\n        )\n    }\n\n    pub fn is_empty(&self) -> bool {\n        self.doc_lengths.is_empty()\n    }\n\n    pub fn num_bytes(&self) -> usize {\n        self.doc_buffer.len() + self.doc_lengths.len() * 4\n    }\n\n    pub fn num_docs(&self) -> usize {\n        self.doc_lengths.len()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(docs: impl IntoIterator<Item = &'static str>) -> Self {\n        let mut doc_uids = Vec::new();\n        let mut doc_buffer = Vec::new();\n        let mut doc_lengths = Vec::new();\n\n        for (doc_uid, doc) in docs.into_iter().enumerate() {\n            doc_uids.push(DocUid::for_test(doc_uid as u128));\n            doc_buffer.extend(doc.as_bytes());\n            doc_lengths.push(doc.len() as u32);\n        }\n        Self {\n            doc_uids,\n            doc_buffer: Bytes::from(doc_buffer),\n            doc_lengths,\n        }\n    }\n}\n\nimpl MRecordBatch {\n    pub fn encoded_mrecords(&self) -> impl Iterator<Item = Bytes> + '_ {\n        self.mrecord_lengths\n            .iter()\n            .scan(0, |start_offset, mrecord_length| {\n                let start = *start_offset;\n                let end = start + *mrecord_length as usize;\n                *start_offset = end;\n                Some(self.mrecord_buffer.slice(start..end))\n            })\n    }\n\n    pub fn is_empty(&self) -> bool {\n        self.mrecord_lengths.is_empty()\n    }\n\n    pub fn estimate_size(&self) -> ByteSize {\n        ByteSize((self.mrecord_buffer.len() + self.mrecord_lengths.len() * 4) as u64)\n    }\n\n    pub fn num_mrecords(&self) -> usize {\n        self.mrecord_lengths.len()\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(mrecords: impl 
IntoIterator<Item = &'static str>) -> Option<Self> {\n        let mut mrecord_buffer = Vec::new();\n        let mut mrecord_lengths = Vec::new();\n\n        for mrecord in mrecords {\n            mrecord_buffer.extend(mrecord.as_bytes());\n            mrecord_lengths.push(mrecord.len() as u32);\n        }\n        Some(Self {\n            mrecord_lengths,\n            mrecord_buffer: Bytes::from(mrecord_buffer),\n        })\n    }\n}\n\nimpl Shard {\n    pub fn is_open(&self) -> bool {\n        self.shard_state().is_open()\n    }\n\n    pub fn is_unavailable(&self) -> bool {\n        self.shard_state().is_unavailable()\n    }\n\n    pub fn is_closed(&self) -> bool {\n        self.shard_state().is_closed()\n    }\n\n    pub fn queue_id(&self) -> super::types::QueueId {\n        queue_id(self.index_uid(), &self.source_id, self.shard_id())\n    }\n}\n\nimpl ShardState {\n    pub fn is_open(&self) -> bool {\n        *self == ShardState::Open\n    }\n\n    pub fn is_unavailable(&self) -> bool {\n        *self == ShardState::Unavailable\n    }\n\n    pub fn is_closed(&self) -> bool {\n        *self == ShardState::Closed\n    }\n\n    pub fn as_json_str_name(&self) -> &'static str {\n        match self {\n            ShardState::Unspecified => \"unspecified\",\n            ShardState::Open => \"open\",\n            ShardState::Unavailable => \"unavailable\",\n            ShardState::Closed => \"closed\",\n        }\n    }\n\n    pub fn from_json_str_name(shard_state_json_name: &str) -> Option<Self> {\n        match shard_state_json_name {\n            \"unspecified\" => Some(Self::Unspecified),\n            \"open\" => Some(Self::Open),\n            \"unavailable\" => Some(Self::Unavailable),\n            \"closed\" => Some(Self::Closed),\n            _ => None,\n        }\n    }\n}\n\nimpl ShardIds {\n    pub fn queue_ids(&self) -> impl Iterator<Item = QueueId> + '_ {\n        self.shard_ids\n            .iter()\n            .map(|shard_id| queue_id(self.index_uid(), 
&self.source_id, shard_id))\n    }\n\n    pub fn pkeys(&self) -> impl Iterator<Item = ShardPKey> + '_ {\n        self.shard_ids.iter().map(move |shard_id| ShardPKey {\n            index_uid: self.index_uid.clone(),\n            source_id: self.source_id.clone(),\n            shard_id: Some(shard_id.clone()),\n        })\n    }\n}\n\nimpl ShardIdPositions {\n    pub fn queue_id_positions(&self) -> impl Iterator<Item = (QueueId, Position)> + '_ {\n        self.shard_positions.iter().map(|shard_position| {\n            let queue_id = queue_id(self.index_uid(), &self.source_id, shard_position.shard_id());\n            (queue_id, shard_position.publish_position_inclusive())\n        })\n    }\n}\n\nimpl From<PersistFailureReason> for IngestFailureReason {\n    fn from(reason: PersistFailureReason) -> Self {\n        match reason {\n            PersistFailureReason::Unspecified => IngestFailureReason::Unspecified,\n            PersistFailureReason::NoShardsAvailable => IngestFailureReason::NoShardsAvailable,\n            PersistFailureReason::WalFull => IngestFailureReason::WalFull,\n            PersistFailureReason::Timeout => IngestFailureReason::Timeout,\n            PersistFailureReason::NodeUnavailable => IngestFailureReason::NoShardsAvailable,\n        }\n    }\n}\n\nimpl From<ReplicateFailureReason> for PersistFailureReason {\n    fn from(reason: ReplicateFailureReason) -> Self {\n        match reason {\n            ReplicateFailureReason::Unspecified => PersistFailureReason::Unspecified,\n            ReplicateFailureReason::ShardNotFound => PersistFailureReason::NoShardsAvailable,\n            ReplicateFailureReason::ShardClosed => PersistFailureReason::NoShardsAvailable,\n            ReplicateFailureReason::WalFull => PersistFailureReason::WalFull,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_shard_state_json_str_name() {\n        let shard_state_json_name = ShardState::Unspecified.as_json_str_name();\n        
let shard_state = ShardState::from_json_str_name(shard_state_json_name).unwrap();\n        assert_eq!(shard_state, ShardState::Unspecified);\n\n        let shard_state_json_name = ShardState::Open.as_json_str_name();\n        let shard_state = ShardState::from_json_str_name(shard_state_json_name).unwrap();\n        assert_eq!(shard_state, ShardState::Open);\n\n        let shard_state_json_name = ShardState::Unavailable.as_json_str_name();\n        let shard_state = ShardState::from_json_str_name(shard_state_json_name).unwrap();\n        assert_eq!(shard_state, ShardState::Unavailable);\n\n        let shard_state_json_name = ShardState::Closed.as_json_str_name();\n        let shard_state = ShardState::from_json_str_name(shard_state_json_name).unwrap();\n        assert_eq!(shard_state, ShardState::Closed);\n\n        assert!(ShardState::from_json_str_name(\"unknown\").is_none());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/ingest/router.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\ninclude!(\"../codegen/quickwit/quickwit.ingest.router.rs\");\n\nimpl IngestRequestV2 {\n    pub fn num_bytes(&self) -> usize {\n        self.subrequests\n            .iter()\n            .map(|subrequest| subrequest.num_bytes())\n            .sum()\n    }\n}\n\nimpl IngestSubrequest {\n    pub fn num_bytes(&self) -> usize {\n        self.doc_batch\n            .as_ref()\n            .map(|doc_batch| doc_batch.doc_buffer.len())\n            .unwrap_or(0)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![allow(clippy::derive_partial_eq_without_eq)]\n#![allow(clippy::disallowed_methods)]\n#![allow(clippy::doc_lazy_continuation)]\n#![allow(rustdoc::invalid_html_tags)]\n\nuse std::cmp::Ordering;\n\nuse ::opentelemetry::global;\nuse ::opentelemetry::propagation::{Extractor, Injector};\nuse tonic::Status;\nuse tonic::service::Interceptor;\nuse tracing::Span;\nuse tracing_opentelemetry::OpenTelemetrySpanExt;\n\npub mod cluster;\npub mod control_plane;\npub use bytes;\npub use tonic;\npub mod developer;\npub mod error;\nmod getters;\npub mod indexing;\npub mod ingest;\npub mod metastore;\npub mod search;\npub mod types;\n\npub use error::{GrpcServiceError, ServiceError, ServiceErrorCode};\nuse search::ReportSplitsRequest;\n\npub mod jaeger {\n    pub mod api_v2 {\n        include!(\"codegen/jaeger/jaeger.api_v2.rs\");\n    }\n    pub mod storage {\n        pub mod v1 {\n            include!(\"codegen/jaeger/jaeger.storage.v1.rs\");\n        }\n        pub mod v2 {\n            include!(\"codegen/jaeger/jaeger.storage.v2.rs\");\n        }\n    }\n}\n\npub mod opentelemetry {\n    #[cfg(not(doctest))]\n    pub mod proto {\n\n        pub mod collector {\n            pub mod logs {\n                pub mod v1 {\n                    include!(\"codegen/opentelemetry/opentelemetry.proto.collector.logs.v1.rs\");\n                }\n            }\n          
  // One can dream.\n            // pub mod metrics {\n            //     pub mod v1 {\n            //         include!(\"codegen/opentelemetry/opentelemetry.proto.collector.metrics.v1.rs\"\n            // );     }\n            // }\n            pub mod trace {\n                pub mod v1 {\n                    include!(\"codegen/opentelemetry/opentelemetry.proto.collector.trace.v1.rs\");\n                }\n            }\n        }\n        pub mod common {\n            pub mod v1 {\n                include!(\"codegen/opentelemetry/opentelemetry.proto.common.v1.rs\");\n            }\n        }\n        pub mod logs {\n            pub mod v1 {\n                include!(\"codegen/opentelemetry/opentelemetry.proto.logs.v1.rs\");\n            }\n        }\n        // pub mod metrics {\n        //     pub mod experimental {\n        //         include!(\"codegen/opentelemetry/opentelemetry.proto.metrics.experimental.rs\");\n        //     }\n        //     pub mod v1 {\n        //         tonic::include_proto!(\"codegen/opentelemetry/opentelemetry.proto.metrics.v1\");\n        //     }\n        // }\n        pub mod resource {\n            pub mod v1 {\n                include!(\"codegen/opentelemetry/opentelemetry.proto.resource.v1.rs\");\n            }\n        }\n        pub mod trace {\n            pub mod v1 {\n                include!(\"codegen/opentelemetry/opentelemetry.proto.trace.v1.rs\");\n            }\n        }\n    }\n}\n\nimpl TryFrom<metastore::DeleteQuery> for search::SearchRequest {\n    type Error = anyhow::Error;\n\n    fn try_from(delete_query: metastore::DeleteQuery) -> anyhow::Result<Self> {\n        Ok(Self {\n            index_id_patterns: vec![delete_query.index_uid().index_id.to_string()],\n            query_ast: delete_query.query_ast,\n            start_timestamp: delete_query.start_timestamp,\n            end_timestamp: delete_query.end_timestamp,\n            ..Default::default()\n        })\n    }\n}\n\n/// `MutMetadataMap` used to 
extract [`tonic::metadata::MetadataMap`] from a request.\npub struct MutMetadataMap<'a>(&'a mut tonic::metadata::MetadataMap);\n\nimpl Injector for MutMetadataMap<'_> {\n    /// Sets a key-value pair in the [`MetadataMap`]. No-op if the key or value is invalid.\n    fn set(&mut self, key: &str, value: String) {\n        if let Ok(metadata_key) = tonic::metadata::MetadataKey::from_bytes(key.as_bytes())\n            && let Ok(metadata_value) = tonic::metadata::MetadataValue::try_from(&value)\n        {\n            self.0.insert(metadata_key, metadata_value);\n        }\n    }\n}\n\nimpl Extractor for MutMetadataMap<'_> {\n    /// Gets a value for a key from the MetadataMap.  If the value can't be converted to &str,\n    /// returns None.\n    fn get(&self, key: &str) -> Option<&str> {\n        self.0.get(key).and_then(|metadata| metadata.to_str().ok())\n    }\n\n    /// Collect all the keys from the MetadataMap.\n    fn keys(&self) -> Vec<&str> {\n        self.0\n            .keys()\n            .map(|key| match key {\n                tonic::metadata::KeyRef::Ascii(v) => v.as_str(),\n                tonic::metadata::KeyRef::Binary(v) => v.as_str(),\n            })\n            .collect::<Vec<_>>()\n    }\n}\n\n/// [`tonic::service::interceptor::Interceptor`] which injects the span context into\n/// [`tonic::metadata::MetadataMap`].\n#[derive(Clone, Debug)]\npub struct SpanContextInterceptor;\n\nimpl Interceptor for SpanContextInterceptor {\n    fn call(&mut self, mut request: tonic::Request<()>) -> Result<tonic::Request<()>, Status> {\n        global::get_text_map_propagator(|propagator| {\n            propagator.inject_context(\n                &tracing::Span::current().context(),\n                &mut MutMetadataMap(request.metadata_mut()),\n            )\n        });\n        Ok(request)\n    }\n}\n\n/// `MetadataMap` extracts OpenTelemetry\n/// tracing keys from request's headers.\nstruct MetadataMap<'a>(&'a tonic::metadata::MetadataMap);\n\nimpl Extractor for 
MetadataMap<'_> {\n    /// Gets a value for a key from the MetadataMap.  If the value can't be converted to &str,\n    /// returns None.\n    fn get(&self, key: &str) -> Option<&str> {\n        self.0.get(key).and_then(|metadata| metadata.to_str().ok())\n    }\n\n    /// Collect all the keys from the MetadataMap.\n    fn keys(&self) -> Vec<&str> {\n        self.0\n            .keys()\n            .map(|key| match key {\n                tonic::metadata::KeyRef::Ascii(v) => v.as_str(),\n                tonic::metadata::KeyRef::Binary(v) => v.as_str(),\n            })\n            .collect::<Vec<_>>()\n    }\n}\n\n/// Sets parent span context derived from [`tonic::metadata::MetadataMap`].\npub fn set_parent_span_from_request_metadata(request_metadata: &tonic::metadata::MetadataMap) {\n    let parent_cx =\n        global::get_text_map_propagator(|prop| prop.extract(&MetadataMap(request_metadata)));\n    let _ = Span::current().set_parent(parent_cx);\n}\n\nimpl search::SortOrder {\n    #[inline(always)]\n    pub fn compare_opt<T: Ord>(&self, this: &Option<T>, other: &Option<T>) -> Ordering {\n        match (this, other) {\n            (Some(this), Some(other)) => self.compare(this, other),\n            (Some(_), None) => Ordering::Greater,\n            (None, Some(_)) => Ordering::Less,\n            (None, None) => Ordering::Equal,\n        }\n    }\n\n    pub fn compare<T: Ord>(&self, this: &T, other: &T) -> Ordering {\n        if self == &search::SortOrder::Desc {\n            this.cmp(other)\n        } else {\n            other.cmp(this)\n        }\n    }\n}\n\nimpl quickwit_common::pubsub::Event for ReportSplitsRequest {}\n\n/// Shard update_timestamp to use when reading file metastores <v0.9\npub fn compatibility_shard_update_timestamp() -> i64 {\n    // We prefer a fixed value here because it makes backward compatibility tests\n    // simpler. Very few users use the shard API in versions <0.9 anyway.\n    1704067200 // 2024-01-01T00:00:00Z\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/metastore/events.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// use quickwit_common::pubsub::Event;\n\nuse quickwit_common::pubsub::Event;\n\nuse super::{\n    AddSourceRequest, CreateIndexRequest, DeleteIndexRequest, DeleteSourceRequest, SourceType,\n    ToggleSourceRequest,\n};\nuse crate::types::{IndexUid, SourceId};\n\n/// Delete index event.\n#[derive(Debug, Clone, Eq, PartialEq)]\npub struct DeleteIndexEvent {\n    /// Index ID of the deleted index.\n    pub index_uid: IndexUid,\n}\n\n/// Add source event.\n#[derive(Debug, Clone, Eq, PartialEq)]\npub struct AddSourceEvent {\n    /// The ID of the index to which the source belongs.\n    pub index_uid: IndexUid,\n    /// The source ID.\n    pub source_id: SourceId,\n    /// The source type.\n    pub source_type: SourceType,\n}\n\n/// Toggle source events.\n#[derive(Debug, Clone, Eq, PartialEq)]\npub struct ToggleSourceEvent {\n    /// Index ID of the toggled source.\n    pub index_uid: IndexUid,\n    /// Source ID of the toggled source.\n    pub source_id: SourceId,\n    /// Whether the source is enabled.\n    pub enabled: bool,\n}\n\n/// Delete source event.\n#[derive(Debug, Clone, Eq, PartialEq)]\npub struct DeleteSourceEvent {\n    /// Index ID of the deleted source.\n    pub index_uid: IndexUid,\n    /// Source ID of the deleted source.\n    pub source_id: SourceId,\n}\n\nimpl Event for AddSourceRequest {}\nimpl Event for CreateIndexRequest 
{}\nimpl Event for DeleteIndexRequest {}\nimpl Event for DeleteSourceRequest {}\nimpl Event for ToggleSourceRequest {}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/metastore/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::retry::Retryable;\nuse quickwit_common::tower::{MakeLoadShedError, TimeoutExceeded};\nuse serde::{Deserialize, Serialize};\n\nuse crate::types::{IndexId, IndexUid, QueueId, SourceId, SplitId};\nuse crate::{GrpcServiceError, ServiceError, ServiceErrorCode};\n\npub mod events;\n\ninclude!(\"../codegen/quickwit/quickwit.metastore.rs\");\n\npub const METASTORE_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/metastore_descriptor.bin\");\n\npub type MetastoreResult<T> = Result<T, MetastoreError>;\n\n/// Lists the object types stored and managed by the metastore.\n#[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum EntityKind {\n    /// A checkpoint delta.\n    CheckpointDelta {\n        /// Index ID.\n        index_id: IndexId,\n        /// Source ID.\n        source_id: SourceId,\n    },\n    /// An index.\n    Index {\n        /// Index ID.\n        index_id: IndexId,\n    },\n    /// A set of indexes.\n    Indexes {\n        /// Index IDs.\n        index_ids: Vec<IndexId>,\n    },\n    /// A source.\n    Source {\n        /// Index ID.\n        index_id: IndexId,\n        /// Source ID.\n        source_id: SourceId,\n    },\n    /// A shard.\n    Shard {\n        /// Shard queue ID: 
<index_uid>/<source_id>/<shard_id>\n        queue_id: QueueId,\n    },\n    /// A split.\n    Split {\n        /// Split ID.\n        split_id: SplitId,\n    },\n    /// A set of splits.\n    Splits {\n        /// Split IDs.\n        split_ids: Vec<String>,\n    },\n    /// An index template.\n    IndexTemplate {\n        /// Index template ID.\n        template_id: String,\n    },\n}\n\nimpl fmt::Display for EntityKind {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        match self {\n            EntityKind::CheckpointDelta {\n                index_id,\n                source_id,\n            } => write!(f, \"checkpoint delta `{index_id}/{source_id}`\"),\n            EntityKind::Index { index_id } => write!(f, \"index `{index_id}`\"),\n            EntityKind::Indexes { index_ids } => write!(f, \"indexes `{}`\", index_ids.join(\", \")),\n            EntityKind::Shard { queue_id } => write!(f, \"shard `{queue_id}`\"),\n            EntityKind::Source {\n                index_id,\n                source_id,\n            } => write!(f, \"source `{index_id}/{source_id}`\"),\n            EntityKind::Split { split_id } => write!(f, \"split `{split_id}`\"),\n            EntityKind::Splits { split_ids } => write!(f, \"splits `{}`\", split_ids.join(\", \")),\n            EntityKind::IndexTemplate { template_id } => {\n                write!(f, \"index template `{template_id}`\")\n            }\n        }\n    }\n}\n\n#[derive(Debug, Clone, thiserror::Error, Eq, PartialEq, Serialize, Deserialize)]\npub enum MetastoreError {\n    #[error(\"{0} already exist(s)\")]\n    AlreadyExists(EntityKind),\n\n    #[error(\"connection error: {message}\")]\n    Connection { message: String },\n\n    #[error(\"database error: {message}\")]\n    Db { message: String },\n\n    #[error(\"precondition failed for {entity}: {message}\")]\n    FailedPrecondition { entity: EntityKind, message: String },\n\n    #[error(\"access forbidden: {message}\")]\n    Forbidden { message: 
String },\n\n    #[error(\"internal error: {message}; cause: `{cause}`\")]\n    Internal { message: String, cause: String },\n\n    #[error(\"invalid argument: {message}\")]\n    InvalidArgument { message: String },\n\n    #[error(\"IO error: {message}\")]\n    Io { message: String },\n\n    #[error(\"failed to deserialize `{struct_name}` from JSON: {message}\")]\n    JsonDeserializeError {\n        struct_name: String,\n        message: String,\n    },\n\n    #[error(\"failed to serialize `{struct_name}` to JSON: {message}\")]\n    JsonSerializeError {\n        struct_name: String,\n        message: String,\n    },\n\n    #[error(\"{0} not found\")]\n    NotFound(EntityKind),\n\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n\n    #[error(\"too many requests\")]\n    TooManyRequests,\n\n    #[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl MetastoreError {\n    /// Returns `true` if the transaction that emitted this error was certainly aborted.\n    /// Returns `false` if we cannot know whether the transaction was successful or not.\n    pub fn is_transaction_certainly_aborted(&self) -> bool {\n        match self {\n            MetastoreError::AlreadyExists(_)\n            | MetastoreError::FailedPrecondition { .. }\n            | MetastoreError::Forbidden { .. }\n            | MetastoreError::InvalidArgument { .. }\n            | MetastoreError::JsonDeserializeError { .. }\n            | MetastoreError::JsonSerializeError { .. }\n            | MetastoreError::NotFound(_)\n            | MetastoreError::TooManyRequests => true,\n            MetastoreError::Connection { .. }\n            | MetastoreError::Db { .. }\n            | MetastoreError::Internal { .. }\n            | MetastoreError::Io { .. }\n            | MetastoreError::Timeout { .. 
}\n            | MetastoreError::Unavailable(_) => false,\n        }\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl From<sqlx::Error> for MetastoreError {\n    fn from(error: sqlx::Error) -> Self {\n        MetastoreError::Db {\n            message: error.to_string(),\n        }\n    }\n}\n\nimpl From<TimeoutExceeded> for MetastoreError {\n    fn from(_: TimeoutExceeded) -> Self {\n        MetastoreError::Timeout(\"client\".to_string())\n    }\n}\n\nimpl ServiceError for MetastoreError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::AlreadyExists(_) => ServiceErrorCode::AlreadyExists,\n            Self::Connection { message } => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"metastore/connection internal error: {message}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::Db { message } => {\n                rate_limited_error!(limit_per_min = 6, \"metastore/db internal error: {message}\");\n                ServiceErrorCode::Internal\n            }\n            Self::FailedPrecondition { .. } => ServiceErrorCode::BadRequest,\n            Self::Forbidden { .. } => ServiceErrorCode::Forbidden,\n            Self::Internal { message, cause } => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"metastore internal error: {message} cause: {cause}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::InvalidArgument { .. 
} => ServiceErrorCode::BadRequest,\n            Self::Io { message } => {\n                rate_limited_error!(limit_per_min = 6, \"metastore/io internal error: {message}\");\n                ServiceErrorCode::Internal\n            }\n            Self::JsonDeserializeError {\n                struct_name,\n                message,\n            } => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"metastore/jsondeser internal error: [{struct_name}] {message}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::JsonSerializeError {\n                struct_name,\n                message,\n            } => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"metastore/jsonser internal error: [{struct_name}]  {message}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::NotFound(_) => ServiceErrorCode::NotFound,\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for MetastoreError {\n    fn new_internal(message: String) -> Self {\n        quickwit_common::rate_limited_error!(limit_per_min=6, message=%message.as_str(), \"metastore error: internal\");\n        Self::Internal {\n            message,\n            cause: \"\".to_string(),\n        }\n    }\n\n    fn new_timeout(message: String) -> Self {\n        quickwit_common::rate_limited_error!(limit_per_min=6, message=%message.as_str(), \"metastore error: timeout\");\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        quickwit_common::rate_limited_error!(\n            limit_per_min = 6,\n            \"metastore error: too many requests\"\n        );\n        Self::TooManyRequests\n    }\n\n    
fn new_unavailable(message: String) -> Self {\n        quickwit_common::rate_limited_error!(limit_per_min=6, message=%message.as_str(), \"metastore error: unavailable metastore\");\n        Self::Unavailable(message)\n    }\n}\n\nimpl Retryable for MetastoreError {\n    fn is_retryable(&self) -> bool {\n        matches!(\n            self,\n            Self::Connection { .. }\n                | Self::Db { .. }\n                | Self::Internal { .. }\n                | Self::Io { .. }\n                | Self::Timeout(_)\n                | Self::Unavailable(_)\n        )\n    }\n}\n\nimpl MakeLoadShedError for MetastoreError {\n    fn make_load_shed_error() -> Self {\n        MetastoreError::TooManyRequests\n    }\n}\n\nimpl SourceType {\n    pub fn as_str(&self) -> &'static str {\n        match self {\n            SourceType::Cli => \"ingest-cli\",\n            SourceType::File => \"file\",\n            SourceType::IngestV1 => \"ingest-api\",\n            SourceType::IngestV2 => \"ingest\",\n            SourceType::Kafka => \"kafka\",\n            SourceType::Kinesis => \"kinesis\",\n            SourceType::Nats => \"nats\",\n            SourceType::PubSub => \"pubsub\",\n            SourceType::Pulsar => \"pulsar\",\n            SourceType::Stdin => \"stdin\",\n            SourceType::Unspecified => \"unspecified\",\n            SourceType::Vec => \"vec\",\n            SourceType::Void => \"void\",\n        }\n    }\n}\n\nimpl fmt::Display for SourceType {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        let source_type_str = match self {\n            SourceType::Cli => \"CLI ingest\",\n            SourceType::File => \"file\",\n            SourceType::IngestV1 => \"ingest API v1\",\n            SourceType::IngestV2 => \"ingest API v2\",\n            SourceType::Kafka => \"Apache Kafka\",\n            SourceType::Kinesis => \"Amazon Kinesis\",\n            SourceType::Nats => \"NATS\",\n            SourceType::PubSub => \"Google Cloud 
Pub/Sub\",\n            SourceType::Pulsar => \"Apache Pulsar\",\n            SourceType::Stdin => \"Stdin\",\n            SourceType::Unspecified => \"unspecified\",\n            SourceType::Vec => \"vec\",\n            SourceType::Void => \"void\",\n        };\n        write!(f, \"{source_type_str}\")\n    }\n}\n\nimpl IndexMetadataRequest {\n    pub fn into_index_id(self) -> Option<IndexId> {\n        self.index_uid\n            .map(|index_uid| index_uid.index_id)\n            .or(self.index_id)\n    }\n\n    pub fn for_index_id(index_id: IndexId) -> Self {\n        Self {\n            index_uid: None,\n            index_id: Some(index_id),\n        }\n    }\n\n    pub fn for_index_uid(index_uid: IndexUid) -> Self {\n        Self {\n            index_uid: Some(index_uid),\n            index_id: None,\n        }\n    }\n}\n\nimpl MarkSplitsForDeletionRequest {\n    pub fn new(index_uid: IndexUid, split_ids: Vec<String>) -> Self {\n        Self {\n            index_uid: index_uid.into(),\n            split_ids,\n        }\n    }\n}\n\nimpl LastDeleteOpstampResponse {\n    pub fn new(last_delete_opstamp: u64) -> Self {\n        Self {\n            last_delete_opstamp,\n        }\n    }\n}\n\nimpl ListDeleteTasksRequest {\n    pub fn new(index_uid: IndexUid, opstamp_start: u64) -> Self {\n        Self {\n            index_uid: index_uid.into(),\n            opstamp_start,\n        }\n    }\n}\n\nimpl SplitStats {\n    pub fn add_split(&mut self, size_bytes: u64) {\n        self.num_splits += 1;\n        self.total_size_bytes += size_bytes;\n    }\n}\n\npub mod serde_utils {\n    use serde::de::DeserializeOwned;\n    use serde::{Deserialize, Serialize};\n    use serde_json::Value as JsonValue;\n\n    use super::{MetastoreError, MetastoreResult};\n\n    pub fn from_json_bytes<'de, T: Deserialize<'de>>(value_bytes: &'de [u8]) -> MetastoreResult<T> {\n        serde_json::from_slice(value_bytes).map_err(|error| MetastoreError::JsonDeserializeError {\n            
struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n\n    pub fn from_json_zstd<T: DeserializeOwned>(value_bytes: &[u8]) -> MetastoreResult<T> {\n        let value_json = zstd::decode_all(value_bytes).map_err(|error| {\n            MetastoreError::JsonDeserializeError {\n                struct_name: std::any::type_name::<T>().to_string(),\n                message: error.to_string(),\n            }\n        })?;\n        serde_json::from_slice(&value_json).map_err(|error| MetastoreError::JsonDeserializeError {\n            struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n\n    pub fn from_json_str<'de, T: Deserialize<'de>>(value_str: &'de str) -> MetastoreResult<T> {\n        serde_json::from_str(value_str).map_err(|error| MetastoreError::JsonDeserializeError {\n            struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n\n    pub fn from_json_value<T: DeserializeOwned>(value: JsonValue) -> MetastoreResult<T> {\n        serde_json::from_value(value).map_err(|error| MetastoreError::JsonDeserializeError {\n            struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n\n    pub fn to_json_str<T: Serialize>(value: &T) -> Result<String, MetastoreError> {\n        serde_json::to_string(value).map_err(|error| MetastoreError::JsonSerializeError {\n            struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n\n    pub fn to_json_bytes<T: Serialize>(value: &T) -> Result<Vec<u8>, MetastoreError> {\n        serde_json::to_vec(value).map_err(|error| MetastoreError::JsonSerializeError {\n            struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n\n    pub fn to_json_zstd<T: Serialize>(\n        
value: &T,\n        compression_level: i32,\n    ) -> Result<Vec<u8>, MetastoreError> {\n        let value_json =\n            serde_json::to_vec(value).map_err(|error| MetastoreError::JsonSerializeError {\n                struct_name: std::any::type_name::<T>().to_string(),\n                message: error.to_string(),\n            })?;\n        zstd::encode_all(value_json.as_slice(), compression_level).map_err(|error| {\n            MetastoreError::JsonSerializeError {\n                struct_name: std::any::type_name::<T>().to_string(),\n                message: error.to_string(),\n            }\n        })\n    }\n\n    pub fn to_json_bytes_pretty<T: Serialize>(value: &T) -> Result<Vec<u8>, MetastoreError> {\n        serde_json::to_vec_pretty(value).map_err(|error| MetastoreError::JsonSerializeError {\n            struct_name: std::any::type_name::<T>().to_string(),\n            message: error.to_string(),\n        })\n    }\n}\n\nimpl ListIndexesMetadataRequest {\n    pub fn all() -> ListIndexesMetadataRequest {\n        ListIndexesMetadataRequest {\n            index_id_patterns: vec![\"*\".to_string()],\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/search/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod span_id;\nmod trace_id;\n\nuse std::cmp::Ordering;\nuse std::fmt;\nuse std::io::{self, Read};\n\nuse prost::Message;\npub use sort_by_value::SortValue;\npub use span_id::{SpanId, TryFromSpanIdError};\npub use trace_id::{TraceId, TryFromTraceIdError};\n\ninclude!(\"../codegen/quickwit/quickwit.search.rs\");\n\npub const SEARCH_FILE_DESCRIPTOR_SET: &[u8] =\n    include_bytes!(\"../codegen/quickwit/search_descriptor.bin\");\n\nimpl SearchRequest {\n    pub fn time_range(&self) -> impl std::ops::RangeBounds<i64> {\n        use std::ops::Bound;\n        (\n            self.start_timestamp\n                .map_or(Bound::Unbounded, Bound::Included),\n            self.end_timestamp.map_or(Bound::Unbounded, Bound::Excluded),\n        )\n    }\n}\n\nimpl SplitIdAndFooterOffsets {\n    pub fn time_range(&self) -> impl std::ops::RangeBounds<i64> {\n        use std::ops::Bound;\n        (\n            self.timestamp_start\n                .map_or(Bound::Unbounded, Bound::Included),\n            self.timestamp_end.map_or(Bound::Unbounded, Bound::Included),\n        )\n    }\n}\n\nimpl fmt::Display for SplitSearchError {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(f, \"({}, split_id: {})\", self.error, self.split_id)\n    }\n}\n\nimpl Eq for SortByValue {}\n\nimpl From<SortValue> for SortByValue {\n    fn 
from(sort_value: SortValue) -> Self {\n        SortByValue {\n            sort_value: Some(sort_value),\n        }\n    }\n}\n\nimpl std::hash::Hash for SortByValue {\n    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {\n        self.sort_value.hash(state);\n    }\n}\n\nimpl SortByValue {\n    pub fn into_json(self) -> serde_json::Value {\n        use serde_json::Value::*;\n        match self.sort_value {\n            Some(SortValue::U64(num)) => Number(num.into()),\n            Some(SortValue::I64(num)) => Number(num.into()),\n            Some(SortValue::F64(num)) => {\n                if let Some(num) = serde_json::Number::from_f64(num) {\n                    Number(num)\n                } else {\n                    // TODO is there a better way to handle infinite/nan?\n                    Null\n                }\n            }\n            Some(SortValue::Boolean(b)) => Bool(b),\n            None => Null,\n        }\n    }\n\n    pub fn try_from_json(value: serde_json::Value) -> Option<Self> {\n        use serde_json::Value::*;\n        let sort_value = match value {\n            Null => None,\n            Bool(b) => Some(SortValue::Boolean(b)),\n            Number(number) => {\n                if let Some(number) = number.as_u64() {\n                    Some(SortValue::U64(number))\n                } else if let Some(number) = number.as_i64() {\n                    Some(SortValue::I64(number))\n                } else if let Some(number) = number.as_f64() {\n                    Some(SortValue::F64(number))\n                } else {\n                    // this should never happen as we don't emit such number ourselves\n                    return None;\n                }\n            }\n            // Strings that can be converted to a number are accepted.\n            // Some clients (like JS clients) can't easily handle large integers\n            // without losing precision, so we accept them as strings.\n            String(value) => {\n                
if let Ok(number) = value.parse::<i64>() {\n                    Some(SortValue::I64(number))\n                } else if let Ok(number) = value.parse::<u64>() {\n                    Some(SortValue::U64(number))\n                } else {\n                    return None;\n                }\n            }\n            Array(_) | Object(_) => return None,\n        };\n        Some(SortByValue { sort_value })\n    }\n}\n\n// !!! Disclaimer !!!\n//\n// Prost imposes the PartialEq derived implementation.\n// This is terrible because it means Eq and PartialEq are not really in line with Ord's\n// implementation in the presence of NaN.\nimpl Eq for SortValue {}\n\nimpl Ord for SortValue {\n    #[inline]\n    fn cmp(&self, other: &Self) -> Ordering {\n        // We make sure to end up with a total order.\n        match (*self, *other) {\n            // Same types.\n            (SortValue::U64(left), SortValue::U64(right)) => left.cmp(&right),\n            (SortValue::I64(left), SortValue::I64(right)) => left.cmp(&right),\n            (SortValue::Boolean(left), SortValue::Boolean(right)) => left.cmp(&right),\n            // We halve the logic by making sure we keep\n            // the \"stronger\" type on the left.\n            (SortValue::U64(left), SortValue::I64(right)) => {\n                if left > i64::MAX as u64 {\n                    return Ordering::Greater;\n                }\n                (left as i64).cmp(&right)\n            }\n            (SortValue::F64(left), SortValue::F64(right)) => left.total_cmp(&right),\n            (SortValue::F64(left), SortValue::U64(right)) => left.total_cmp(&(right as f64)),\n            (SortValue::F64(left), SortValue::I64(right)) => left.total_cmp(&(right as f64)),\n            (SortValue::Boolean(left), right) => SortValue::U64(left as u64).cmp(&right),\n            (left, right) => right.cmp(&left).reverse(),\n        }\n    }\n}\n\nimpl PartialOrd for SortValue {\n    fn partial_cmp(&self, other: &Self) -> Option<Ordering> 
{\n        Some(self.cmp(other))\n    }\n}\n\nimpl std::hash::Hash for SortValue {\n    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {\n        let this = self.normalize();\n        std::mem::discriminant(&this).hash(state);\n        match this {\n            SortValue::U64(number) => {\n                number.hash(state);\n            }\n            SortValue::I64(number) => {\n                number.hash(state);\n            }\n            SortValue::F64(number) => {\n                number.to_bits().hash(state);\n            }\n            SortValue::Boolean(b) => {\n                b.hash(state);\n            }\n        }\n    }\n}\n\nimpl SortValue {\n    /// Where multiple variants could represent the same logical value, convert to a canonical form.\n    ///\n    /// For numbers, we prefer to represent them, in order, as i64, then as u64 and finally as f64.\n    pub fn normalize(&self) -> Self {\n        match self {\n            SortValue::I64(_) => *self,\n            SortValue::Boolean(_) => *self,\n            SortValue::U64(number) => {\n                if let Ok(number) = (*number).try_into() {\n                    SortValue::I64(number)\n                } else {\n                    *self\n                }\n            }\n            SortValue::F64(number) => {\n                let number = *number;\n                if number.ceil() == number {\n                    // number is not NaN and has no fractional part\n                    if number >= i64::MIN as f64 && number <= i64::MAX as f64 {\n                        return SortValue::I64(number as i64);\n                    } else if number.is_sign_positive() && number <= u64::MAX as f64 {\n                        return SortValue::U64(number as u64);\n                    }\n                }\n                *self\n            }\n        }\n    }\n}\n\nimpl PartialHit {\n    /// Helper to get access to the 1st sort value\n    pub fn sort_value(&self) -> Option<SortValue> {\n        if let 
Some(sort_value) = self.sort_value {\n            sort_value.sort_value\n        } else {\n            None\n        }\n    }\n}\n\n/// Serializes the Split fields.\n///\n/// `fields_metadata` has to be sorted.\npub fn serialize_split_fields(list_fields: ListFields) -> Vec<u8> {\n    let payload = list_fields.encode_to_vec();\n    let compression_level = 3;\n    let payload_compressed = zstd::stream::encode_all(&mut &payload[..], compression_level)\n        .expect(\"zstd encoding failed\");\n    let mut out = Vec::new();\n    // Write Header -- Format Version 2\n    let format_version = 2u8;\n    out.push(format_version);\n    // Write Payload\n    out.extend_from_slice(&payload_compressed);\n    out\n}\n\n/// Reads a fixed number of bytes into an array and returns the array.\nfn read_exact_array<const N: usize>(reader: &mut impl Read) -> io::Result<[u8; N]> {\n    let mut buffer = [0u8; N];\n    reader.read_exact(&mut buffer)?;\n    Ok(buffer)\n}\n\n/// Reads the Split fields from a zstd compressed stream of bytes\npub fn deserialize_split_fields<R: Read>(mut reader: R) -> io::Result<ListFields> {\n    let format_version = read_exact_array::<1>(&mut reader)?[0];\n    if format_version != 2 {\n        return Err(io::Error::new(\n            io::ErrorKind::InvalidData,\n            format!(\"Unsupported split field format version: {format_version}\"),\n        ));\n    }\n    let reader = zstd::Decoder::new(reader)?;\n    read_split_fields_from_zstd(reader)\n}\n\n/// Reads the Split fields from a stream of bytes\n#[allow(clippy::unbuffered_bytes)]\nfn read_split_fields_from_zstd<R: Read>(reader: R) -> io::Result<ListFields> {\n    let all_bytes: Vec<_> = reader.bytes().collect::<io::Result<_>>()?;\n    let serialized_list_fields: ListFields = prost::Message::decode(&all_bytes[..])?;\n\n    Ok(serialized_list_fields)\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/search/span_id.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Deserializer, Serialize, Serializer, de};\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd, Hash)]\npub struct SpanId([u8; 8]);\n\nimpl SpanId {\n    pub const HEX_LENGTH: usize = 16;\n\n    pub fn new(bytes: [u8; 8]) -> Self {\n        Self(bytes)\n    }\n\n    pub fn as_bytes(&self) -> &[u8] {\n        &self.0\n    }\n\n    pub fn to_vec(&self) -> Vec<u8> {\n        self.0.to_vec()\n    }\n}\n\nimpl Serialize for SpanId {\n    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n        let hexspan_id = hex::encode(self.0);\n        serializer.serialize_str(&hexspan_id)\n    }\n}\n\nimpl<'de> Deserialize<'de> for SpanId {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let hexspan_id = String::deserialize(deserializer)?;\n\n        if hexspan_id.len() != SpanId::HEX_LENGTH {\n            let message = format!(\n                \"hex span ID must be {} bytes long, got {}\",\n                SpanId::HEX_LENGTH,\n                hexspan_id.len()\n            );\n            return Err(de::Error::custom(message));\n        }\n        let mut span_id = [0u8; 8];\n        hex::decode_to_slice(hexspan_id, &mut span_id).map_err(|error| {\n            let message = format!(\"failed to decode hex span ID: {error:?}\");\n        
    de::Error::custom(message)\n        })?;\n        Ok(SpanId(span_id))\n    }\n}\n\n#[derive(Debug, thiserror::Error)]\n#[error(\"span ID must be 8 bytes long, got {0}\")]\npub struct TryFromSpanIdError(usize);\n\nimpl TryFrom<&[u8]> for SpanId {\n    type Error = TryFromSpanIdError;\n\n    fn try_from(slice: &[u8]) -> Result<Self, Self::Error> {\n        let span_id = slice\n            .try_into()\n            .map_err(|_| TryFromSpanIdError(slice.len()))?;\n        Ok(SpanId(span_id))\n    }\n}\n\nimpl TryFrom<Vec<u8>> for SpanId {\n    type Error = TryFromSpanIdError;\n\n    fn try_from(vec: Vec<u8>) -> Result<Self, Self::Error> {\n        Self::try_from(&vec[..])\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_span_id_serde() {\n        let expected_span_id = SpanId::new([1; 8]);\n        let span_id_json = serde_json::to_string(&expected_span_id).unwrap();\n        assert_eq!(span_id_json, r#\"\"0101010101010101\"\"#);\n\n        let span_id = serde_json::from_str::<SpanId>(&span_id_json).unwrap();\n        assert_eq!(span_id, expected_span_id,);\n    }\n\n    #[test]\n    fn test_span_id_try_from() {\n        let expected_span_id = SpanId::new([1; 8]);\n        let span_id = SpanId::try_from([1; 8].as_slice()).unwrap();\n        assert_eq!(span_id, expected_span_id);\n\n        let error = SpanId::try_from([1; 9].as_slice()).unwrap_err();\n        assert_eq!(error.0, 9);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/search/trace_id.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Deserializer, Serialize, Serializer, de};\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd, Hash)]\npub struct TraceId([u8; 16]);\n\nimpl TraceId {\n    pub const HEX_LENGTH: usize = 32;\n\n    pub fn new(bytes: [u8; 16]) -> Self {\n        Self(bytes)\n    }\n\n    pub fn into_bytes(self) -> [u8; 16] {\n        self.0\n    }\n\n    pub fn to_vec(&self) -> Vec<u8> {\n        self.0.to_vec()\n    }\n\n    pub fn hex_display(&self) -> String {\n        hex::encode(self.0)\n    }\n}\n\nimpl Serialize for TraceId {\n    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n        if serializer.is_human_readable() {\n            let hextrace_id = hex::encode(self.0);\n            serializer.serialize_str(&hextrace_id)\n        } else {\n            self.0.serialize(serializer)\n        }\n    }\n}\n\nimpl<'de> Deserialize<'de> for TraceId {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        if deserializer.is_human_readable() {\n            let hextrace_id = String::deserialize(deserializer)?;\n            if hextrace_id.len() != TraceId::HEX_LENGTH {\n                let message = format!(\n                    \"hex trace ID must be {} bytes long, got {}\",\n                    TraceId::HEX_LENGTH,\n                    
hextrace_id.len()\n                );\n                return Err(de::Error::custom(message));\n            }\n            let mut trace_id_bytes = [0u8; 16];\n            hex::decode_to_slice(hextrace_id, &mut trace_id_bytes).map_err(|error| {\n                let message = format!(\"failed to decode hex trace ID: {error:?}\");\n                de::Error::custom(message)\n            })?;\n            Ok(TraceId(trace_id_bytes))\n        } else {\n            let trace_id_bytes: [u8; 16] = <[u8; 16]>::deserialize(deserializer)?;\n            Ok(TraceId(trace_id_bytes))\n        }\n    }\n}\n\n#[derive(Debug, thiserror::Error)]\n#[error(\"trace ID must be 16 bytes long, got {0}\")]\npub struct TryFromTraceIdError(usize);\n\nimpl TryFrom<&[u8]> for TraceId {\n    type Error = TryFromTraceIdError;\n\n    fn try_from(slice: &[u8]) -> Result<Self, Self::Error> {\n        let trace_id = slice\n            .try_into()\n            .map_err(|_| TryFromTraceIdError(slice.len()))?;\n        Ok(TraceId(trace_id))\n    }\n}\n\nimpl TryFrom<Vec<u8>> for TraceId {\n    type Error = TryFromTraceIdError;\n\n    fn try_from(vec: Vec<u8>) -> Result<Self, Self::Error> {\n        Self::try_from(&vec[..])\n    }\n}\n\nimpl From<TryFromTraceIdError> for tonic::Status {\n    fn from(error: TryFromTraceIdError) -> Self {\n        tonic::Status::invalid_argument(error.to_string())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_trace_id_serde() {\n        let expected_trace_id = TraceId::new([1; 16]);\n        let trace_id_json = serde_json::to_string(&expected_trace_id).unwrap();\n        assert_eq!(trace_id_json, r#\"\"01010101010101010101010101010101\"\"#);\n\n        let trace_id = serde_json::from_str::<TraceId>(&trace_id_json).unwrap();\n        assert_eq!(trace_id, expected_trace_id,);\n    }\n\n    #[test]\n    fn test_trace_id_try_from() {\n        let expected_trace_id = TraceId::new([1; 16]);\n        let trace_id = TraceId::try_from([1; 
16].as_slice()).unwrap();\n        assert_eq!(trace_id, expected_trace_id);\n\n        let error = TraceId::try_from([1; 17].as_slice()).unwrap_err();\n        assert_eq!(error.0, 17);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/doc_mapping_uid.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::fmt;\nuse std::str::FromStr;\n\nuse anyhow::Context;\nuse serde::de::Error;\nuse serde::{Deserialize, Deserializer, Serialize, Serializer};\npub use ulid::Ulid;\n\nuse super::ULID_SIZE;\n\n/// Unique identifier for a document mapping.\n#[derive(Clone, Copy, Default, Hash, Eq, PartialEq, Ord, PartialOrd, utoipa::ToSchema)]\npub struct DocMappingUid(Ulid);\n\nimpl fmt::Debug for DocMappingUid {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"DocMapping({})\", self.0)\n    }\n}\n\nimpl fmt::Display for DocMappingUid {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        self.0.fmt(f)\n    }\n}\n\nimpl From<Ulid> for DocMappingUid {\n    fn from(ulid: Ulid) -> Self {\n        Self(ulid)\n    }\n}\n\nimpl DocMappingUid {\n    /// Creates a new random doc mapping UID.\n    pub fn random() -> Self {\n        Self(Ulid::new())\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(ulid_u128: u128) -> DocMappingUid {\n        Self(Ulid::from(ulid_u128))\n    }\n}\n\nimpl<'de> Deserialize<'de> for DocMappingUid {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let doc_mapping_uid_str: Cow<'de, str> = Cow::deserialize(deserializer)?;\n        
doc_mapping_uid_str.parse().map_err(D::Error::custom)\n    }\n}\n\nimpl Serialize for DocMappingUid {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        serializer.collect_str(&self.0)\n    }\n}\n\nimpl prost::Message for DocMappingUid {\n    fn encode_raw(&self, buf: &mut impl prost::bytes::BufMut) {\n        // TODO: when `bytes::encode` supports `&[u8]`, we can remove this allocation.\n        prost::encoding::bytes::encode(1u32, &self.0.to_bytes().to_vec(), buf);\n    }\n\n    fn merge_field(\n        &mut self,\n        tag: u32,\n        wire_type: prost::encoding::WireType,\n        buf: &mut impl prost::bytes::Buf,\n        ctx: prost::encoding::DecodeContext,\n    ) -> ::core::result::Result<(), prost::DecodeError> {\n        const STRUCT_NAME: &str = \"DocMappingUid\";\n\n        match tag {\n            1u32 => {\n                let mut buffer = Vec::with_capacity(ULID_SIZE);\n\n                prost::encoding::bytes::merge(wire_type, &mut buffer, buf, ctx).map_err(\n                    |mut error| {\n                        error.push(STRUCT_NAME, \"doc_mapping_uid\");\n                        error\n                    },\n                )?;\n                let ulid_bytes: [u8; ULID_SIZE] =\n                    buffer.try_into().map_err(|buffer: Vec<u8>| {\n                        prost::DecodeError::new(format!(\n                            \"invalid length for field `doc_mapping_uid`, expected 16 bytes, got {}\",\n                            buffer.len()\n                        ))\n                    })?;\n                self.0 = Ulid::from_bytes(ulid_bytes);\n                Ok(())\n            }\n            _ => prost::encoding::skip_field(wire_type, tag, buf, ctx),\n        }\n    }\n\n    #[inline]\n    fn encoded_len(&self) -> usize {\n        prost::encoding::key_len(1u32)\n            + prost::encoding::encoded_len_varint(ULID_SIZE as u64)\n            + ULID_SIZE\n    }\n\n    fn 
clear(&mut self) {\n        self.0 = Ulid::nil();\n    }\n}\n\nimpl FromStr for DocMappingUid {\n    type Err = anyhow::Error;\n\n    fn from_str(doc_mapping_uid_str: &str) -> Result<Self, Self::Err> {\n        Ulid::from_string(doc_mapping_uid_str)\n            .map(Self)\n            .with_context(|| format!(\"failed to parse doc mapping UID `{doc_mapping_uid_str}`\"))\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl TryFrom<String> for DocMappingUid {\n    type Error = anyhow::Error;\n\n    fn try_from(doc_mapping_uid_str: String) -> Result<Self, Self::Error> {\n        doc_mapping_uid_str.parse()\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::Type<sqlx::Postgres> for DocMappingUid {\n    fn type_info() -> sqlx::postgres::PgTypeInfo {\n        sqlx::postgres::PgTypeInfo::with_name(\"VARCHAR(26)\")\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::Encode<'_, sqlx::Postgres> for DocMappingUid {\n    fn encode_by_ref(\n        &self,\n        buf: &mut sqlx::postgres::PgArgumentBuffer,\n    ) -> Result<sqlx::encode::IsNull, sqlx::error::BoxDynError> {\n        sqlx::Encode::<sqlx::Postgres>::encode(self.0.to_string(), buf)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytes::Bytes;\n    use prost::Message;\n\n    use super::*;\n\n    #[test]\n    fn test_doc_mapping_uid_json_serde_roundtrip() {\n        let doc_mapping_uid = DocMappingUid::default();\n        let serialized = serde_json::to_string(&doc_mapping_uid).unwrap();\n        assert_eq!(serialized, r#\"\"00000000000000000000000000\"\"#);\n\n        let deserialized: DocMappingUid = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, doc_mapping_uid);\n    }\n\n    #[test]\n    fn test_doc_mapping_uid_prost_serde_roundtrip() {\n        let doc_mapping_uid = DocMappingUid::random();\n\n        let encoded = doc_mapping_uid.encode_to_vec();\n        assert_eq!(\n            DocMappingUid::decode(Bytes::from(encoded)).unwrap(),\n            doc_mapping_uid\n        
);\n\n        let encoded = doc_mapping_uid.encode_length_delimited_to_vec();\n        assert_eq!(\n            DocMappingUid::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            doc_mapping_uid\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/doc_uid.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::fmt;\n\nuse serde::de::Error;\nuse serde::{Deserialize, Deserializer, Serialize, Serializer};\npub use ulid::Ulid;\n\nuse super::ULID_SIZE;\n\n/// A doc UID identifies a document across segments, splits, and indexes.\n#[derive(Clone, Copy, Default, Hash, Eq, PartialEq, Ord, PartialOrd)]\npub struct DocUid(Ulid);\n\nimpl fmt::Debug for DocUid {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"Doc({})\", self.0)\n    }\n}\n\nimpl fmt::Display for DocUid {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        self.0.fmt(f)\n    }\n}\n\nimpl From<Ulid> for DocUid {\n    fn from(ulid: Ulid) -> Self {\n        Self(ulid)\n    }\n}\n\nimpl DocUid {\n    /// Creates a new random doc UID.\n    pub fn random() -> Self {\n        Self(Ulid::new())\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(ulid_u128: u128) -> DocUid {\n        Self(Ulid::from(ulid_u128))\n    }\n}\n\nimpl<'de> Deserialize<'de> for DocUid {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let doc_uid_str: Cow<'de, str> = Cow::deserialize(deserializer)?;\n        let doc_uid = Ulid::from_string(&doc_uid_str).map_err(D::Error::custom)?;\n        Ok(Self(doc_uid))\n    }\n}\n\nimpl Serialize for DocUid {\n    fn 
serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        serializer.collect_str(&self.0)\n    }\n}\n\nimpl prost::Message for DocUid {\n    fn encode_raw(&self, buf: &mut impl prost::bytes::BufMut) {\n        // TODO: when `bytes::encode` supports `&[u8]`, we can remove this allocation.\n        prost::encoding::bytes::encode(1u32, &self.0.to_bytes().to_vec(), buf);\n    }\n\n    fn merge_field(\n        &mut self,\n        tag: u32,\n        wire_type: prost::encoding::WireType,\n        buf: &mut impl prost::bytes::Buf,\n        ctx: prost::encoding::DecodeContext,\n    ) -> ::core::result::Result<(), prost::DecodeError> {\n        const STRUCT_NAME: &str = \"DocUid\";\n\n        match tag {\n            1u32 => {\n                let mut buffer = Vec::with_capacity(ULID_SIZE);\n\n                prost::encoding::bytes::merge(wire_type, &mut buffer, buf, ctx).map_err(\n                    |mut error| {\n                        error.push(STRUCT_NAME, \"doc_uid\");\n                        error\n                    },\n                )?;\n                let ulid_bytes: [u8; ULID_SIZE] =\n                    buffer.try_into().map_err(|buffer: Vec<u8>| {\n                        prost::DecodeError::new(format!(\n                            \"invalid length for field `doc_uid`, expected 16 bytes, got {}\",\n                            buffer.len()\n                        ))\n                    })?;\n                self.0 = Ulid::from_bytes(ulid_bytes);\n                Ok(())\n            }\n            _ => prost::encoding::skip_field(wire_type, tag, buf, ctx),\n        }\n    }\n\n    #[inline]\n    fn encoded_len(&self) -> usize {\n        prost::encoding::key_len(1u32)\n            + prost::encoding::encoded_len_varint(ULID_SIZE as u64)\n            + ULID_SIZE\n    }\n\n    fn clear(&mut self) {\n        self.0 = Ulid::nil();\n    }\n}\n\n/// Generates monotonically increasing doc UIDs. 
It is neither `Clone` nor `Copy` on purpose.\n#[derive(Debug)]\npub struct DocUidGenerator {\n    next_ulid: Ulid,\n}\n\nimpl Default for DocUidGenerator {\n    fn default() -> Self {\n        Self {\n            next_ulid: Ulid::new(),\n        }\n    }\n}\n\nimpl DocUidGenerator {\n    /// Generates a new doc UID.\n    #[allow(clippy::unwrap_or_default)]\n    pub fn next_doc_uid(&mut self) -> DocUid {\n        let doc_uid = DocUid(self.next_ulid);\n        // Clippy insists on using `unwrap_or_default`, but that's really not what we want here:\n        // https://github.com/rust-lang/rust-clippy/issues/11631\n        self.next_ulid = self.next_ulid.increment().unwrap_or_else(Ulid::new);\n        doc_uid\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytes::Bytes;\n    use prost::Message;\n\n    use super::*;\n\n    #[test]\n    fn test_doc_uid_json_serde_roundtrip() {\n        let doc_uid = DocUid::default();\n        let serialized = serde_json::to_string(&doc_uid).unwrap();\n        assert_eq!(serialized, r#\"\"00000000000000000000000000\"\"#);\n\n        let deserialized: DocUid = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, doc_uid);\n    }\n\n    #[test]\n    fn test_doc_uid_prost_serde_roundtrip() {\n        let doc_uid = DocUid::random();\n\n        let encoded = doc_uid.encode_to_vec();\n        assert_eq!(DocUid::decode(Bytes::from(encoded)).unwrap(), doc_uid);\n\n        let encoded = doc_uid.encode_length_delimited_to_vec();\n        assert_eq!(\n            DocUid::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            doc_uid\n        );\n    }\n\n    #[test]\n    fn test_doc_uid_generator() {\n        let mut generator = DocUidGenerator::default();\n        let doc_uids: Vec<DocUid> = (0..10_000).map(|_| generator.next_doc_uid()).collect();\n        assert!(doc_uids.windows(2).all(|window| window[0] < window[1]));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/index_uid.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::fmt;\nuse std::str::FromStr;\n\nuse serde::de::Error;\nuse serde::{Deserialize, Deserializer, Serialize, Serializer};\nuse thiserror::Error;\npub use ulid::Ulid;\n\nuse super::ULID_SIZE;\nuse crate::types::IndexId;\n\n/// Index identifiers that uniquely identify not only the index, but also\n/// its incarnation allowing to distinguish between deleted and recreated indexes.\n/// It is represented as a string in index_id:incarnation_id format.\n#[derive(Clone, Debug, Default, PartialEq, Eq, Ord, PartialOrd, Hash)]\npub struct IndexUid {\n    pub index_id: IndexId,\n    pub incarnation_id: Ulid,\n}\n\nimpl fmt::Display for IndexUid {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"{}:{}\", self.index_id, self.incarnation_id)\n    }\n}\n\nimpl IndexUid {\n    /// Creates a new index UID from an index ID using a random ULID as incarnation ID.\n    pub fn new_with_random_ulid(index_id: &str) -> Self {\n        Self::new(index_id, Ulid::new())\n    }\n\n    fn new(index_id: &str, incarnation_id: impl Into<Ulid>) -> Self {\n        assert!(!index_id.contains(':'), \"index ID may not contain `:`\");\n\n        Self {\n            index_id: index_id.to_string(),\n            incarnation_id: incarnation_id.into(),\n        }\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub 
fn for_test(index_id: &str, incarnation_id: u128) -> Self {\n        Self {\n            index_id: index_id.to_string(),\n            incarnation_id: incarnation_id.into(),\n        }\n    }\n}\n\n#[derive(Error, Debug)]\n#[error(\"invalid index UID `{0}`\")]\npub struct InvalidIndexUid(String);\n\nimpl FromStr for IndexUid {\n    type Err = InvalidIndexUid;\n\n    fn from_str(index_uid_str: &str) -> Result<Self, Self::Err> {\n        let Some((index_id, incarnation_id_str)) = index_uid_str.split_once(':') else {\n            return Err(InvalidIndexUid(index_uid_str.to_string()));\n        };\n        let incarnation_id = Ulid::from_string(incarnation_id_str)\n            .map_err(|_| InvalidIndexUid(index_uid_str.to_string()))?;\n        let index_uid = IndexUid {\n            index_id: index_id.to_string(),\n            incarnation_id,\n        };\n        Ok(index_uid)\n    }\n}\n\nimpl<'de> Deserialize<'de> for IndexUid {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: Deserializer<'de> {\n        let index_uid_str: Cow<'de, str> = Cow::deserialize(deserializer)?;\n        let index_uid = IndexUid::from_str(&index_uid_str).map_err(D::Error::custom)?;\n        Ok(index_uid)\n    }\n}\n\nimpl Serialize for IndexUid {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        serializer.collect_str(&self)\n    }\n}\n\nimpl prost::Message for IndexUid {\n    fn encode_raw(&self, buf: &mut impl prost::bytes::BufMut) {\n        if !self.index_id.is_empty() {\n            prost::encoding::string::encode(1u32, &self.index_id, buf);\n        }\n        // TODO: when `bytes::encode` supports `&[u8]`, we can remove this allocation.\n        prost::encoding::bytes::encode(2u32, &self.incarnation_id.to_bytes().to_vec(), buf);\n    }\n\n    fn merge_field(\n        &mut self,\n        tag: u32,\n        wire_type: prost::encoding::WireType,\n        buf: &mut impl prost::bytes::Buf,\n        ctx: 
prost::encoding::DecodeContext,\n    ) -> ::core::result::Result<(), prost::DecodeError> {\n        const STRUCT_NAME: &str = \"IndexUid\";\n\n        match tag {\n            1u32 => {\n                let value = &mut self.index_id;\n                prost::encoding::string::merge(wire_type, value, buf, ctx).map_err(|mut error| {\n                    error.push(STRUCT_NAME, \"index_id\");\n                    error\n                })\n            }\n            2u32 => {\n                let mut buffer = Vec::with_capacity(ULID_SIZE);\n\n                prost::encoding::bytes::merge(wire_type, &mut buffer, buf, ctx).map_err(\n                    |mut error| {\n                        error.push(STRUCT_NAME, \"incarnation_id\");\n                        error\n                    },\n                )?;\n                let ulid_bytes: [u8; ULID_SIZE] =\n                    buffer.try_into().map_err(|buffer: Vec<u8>| {\n                        prost::DecodeError::new(format!(\n                            \"invalid length for field `incarnation_id`, expected 16 bytes, got {}\",\n                            buffer.len()\n                        ))\n                    })?;\n                self.incarnation_id = Ulid::from_bytes(ulid_bytes);\n                Ok(())\n            }\n            _ => prost::encoding::skip_field(wire_type, tag, buf, ctx),\n        }\n    }\n\n    #[inline]\n    fn encoded_len(&self) -> usize {\n        let mut len = 0;\n\n        if !self.index_id.is_empty() {\n            len += prost::encoding::string::encoded_len(1u32, &self.index_id);\n        }\n\n        len += prost::encoding::key_len(2u32)\n            + prost::encoding::encoded_len_varint(ULID_SIZE as u64)\n            + ULID_SIZE;\n        len\n    }\n\n    fn clear(&mut self) {\n        self.index_id.clear();\n        self.incarnation_id = Ulid::nil();\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl TryFrom<String> for IndexUid {\n    type Error = InvalidIndexUid;\n\n    fn 
try_from(value: String) -> Result<Self, Self::Error> {\n        value.parse()\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::Type<sqlx::Postgres> for IndexUid {\n    fn type_info() -> sqlx::postgres::PgTypeInfo {\n        sqlx::postgres::PgTypeInfo::with_name(\"VARCHAR\")\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::Encode<'_, sqlx::Postgres> for IndexUid {\n    fn encode_by_ref(\n        &self,\n        buf: &mut sqlx::postgres::PgArgumentBuffer,\n    ) -> Result<sqlx::encode::IsNull, sqlx::error::BoxDynError> {\n        let _ = sqlx::Encode::<sqlx::Postgres>::encode(&self.index_id, buf)?;\n        let _ = sqlx::Encode::<sqlx::Postgres>::encode(\":\", buf)?;\n        sqlx::Encode::<sqlx::Postgres>::encode(self.incarnation_id.to_string(), buf)\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::postgres::PgHasArrayType for IndexUid {\n    fn array_type_info() -> sqlx::postgres::PgTypeInfo {\n        sqlx::postgres::PgTypeInfo::with_name(\"VARCHAR[]\")\n    }\n}\n\nimpl PartialEq<(&'static str, u128)> for IndexUid {\n    fn eq(&self, (index_id, incarnation_id): &(&str, u128)) -> bool {\n        self.index_id == *index_id && self.incarnation_id == Ulid::from(*incarnation_id)\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl From<IndexUid> for sea_query::Value {\n    fn from(index_uid: IndexUid) -> Self {\n        index_uid.to_string().into()\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl From<&IndexUid> for sea_query::Value {\n    fn from(index_uid: &IndexUid) -> Self {\n        index_uid.to_string().into()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Borrow;\nuse std::convert::Infallible;\nuse std::fmt;\nuse std::fmt::{Display, Formatter};\nuse std::ops::Deref;\nuse std::str::FromStr;\n\nuse serde::{Deserialize, Serialize};\nuse tracing::error;\npub use ulid::Ulid;\n\nmod doc_mapping_uid;\nmod doc_uid;\nmod index_uid;\nmod pipeline_uid;\nmod position;\nmod shard_id;\n\npub use doc_mapping_uid::DocMappingUid;\npub use doc_uid::{DocUid, DocUidGenerator};\npub use index_uid::IndexUid;\npub use pipeline_uid::PipelineUid;\npub use position::Position;\npub use shard_id::ShardId;\n\n/// The size of an ULID in bytes. 
Use `ULID_LEN` for the length of Base32 encoded ULID strings.\npub(crate) const ULID_SIZE: usize = 16;\n\npub type IndexId = String;\n\npub type SourceId = String;\n\npub type SplitId = String;\n\npub type SubrequestId = u32;\n\n/// See the file `ingest.proto` for more details.\npub type PublishToken = String;\n\n/// Uniquely identifies a shard and its underlying mrecordlog queue.\npub type QueueId = String; // <index_uid>/<source_id>/<shard_id>\n\npub fn queue_id(index_uid: &IndexUid, source_id: &str, shard_id: &ShardId) -> QueueId {\n    format!(\"{index_uid}/{source_id}/{shard_id}\")\n}\n\npub fn split_queue_id(queue_id: &str) -> Option<(IndexUid, SourceId, ShardId)> {\n    let parts_opt = split_queue_id_inner(queue_id);\n\n    if parts_opt.is_none() {\n        error!(\"failed to parse queue ID `{queue_id}`: this should never happen, please report\");\n    }\n    parts_opt\n}\n\nfn split_queue_id_inner(queue_id: &str) -> Option<(IndexUid, SourceId, ShardId)> {\n    let mut parts = queue_id.split('/');\n    let index_uid = parts.next()?;\n    let source_id = parts.next()?;\n    let shard_id = parts.next()?;\n    Some((\n        index_uid.parse().ok()?,\n        source_id.to_string(),\n        ShardId::from(shard_id),\n    ))\n}\n\n/// In itself, `SourceId` is not unique, but the pair `(IndexUid, SourceId)` is.\n/// A source ID can, however, appear only once in a given index.\n#[derive(PartialEq, Eq, Debug, PartialOrd, Ord, Hash, Clone)]\npub struct SourceUid {\n    pub index_uid: IndexUid,\n    pub source_id: SourceId,\n}\n\nimpl Display for SourceUid {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"{}:{}\", self.index_uid, self.source_id)\n    }\n}\n\n#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Serialize, Deserialize)]\npub struct NodeId(String);\n\nimpl NodeId {\n    /// Constructs a new [`NodeId`].\n    pub const fn new(node_id: String) -> Self {\n        Self(node_id)\n    }\n\n    /// Takes ownership of the underlying 
[`String`], consuming `self`.\n    pub fn take(self) -> String {\n        self.0\n    }\n}\n\nimpl AsRef<NodeIdRef> for NodeId {\n    fn as_ref(&self) -> &NodeIdRef {\n        self.deref()\n    }\n}\n\nimpl Borrow<str> for NodeId {\n    fn borrow(&self) -> &str {\n        &self.0\n    }\n}\n\nimpl Borrow<String> for NodeId {\n    fn borrow(&self) -> &String {\n        &self.0\n    }\n}\n\nimpl Borrow<NodeIdRef> for NodeId {\n    fn borrow(&self) -> &NodeIdRef {\n        self.deref()\n    }\n}\n\nimpl Deref for NodeId {\n    type Target = NodeIdRef;\n\n    fn deref(&self) -> &Self::Target {\n        NodeIdRef::from_str(&self.0)\n    }\n}\n\nimpl Display for NodeId {\n    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        write!(f, \"{}\", self.0)\n    }\n}\n\nimpl From<&'_ str> for NodeId {\n    fn from(node_id: &str) -> Self {\n        Self::new(node_id.to_string())\n    }\n}\n\nimpl From<String> for NodeId {\n    fn from(node_id: String) -> Self {\n        Self::new(node_id)\n    }\n}\n\nimpl From<NodeId> for String {\n    fn from(node_id: NodeId) -> Self {\n        node_id.0\n    }\n}\n\nimpl From<&'_ NodeIdRef> for NodeId {\n    fn from(node_id: &NodeIdRef) -> Self {\n        node_id.to_owned()\n    }\n}\n\nimpl FromStr for NodeId {\n    type Err = Infallible;\n\n    fn from_str(node_id: &str) -> Result<Self, Self::Err> {\n        Ok(NodeId::new(node_id.to_string()))\n    }\n}\n\nimpl PartialEq<&str> for NodeId {\n    fn eq(&self, other: &&str) -> bool {\n        self.as_str() == *other\n    }\n}\n\nimpl PartialEq<String> for NodeId {\n    fn eq(&self, other: &String) -> bool {\n        self.as_str() == *other\n    }\n}\n\n#[repr(transparent)]\n#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]\npub struct NodeIdRef(str);\n\nimpl NodeIdRef {\n    /// Transparently reinterprets the string slice as a strongly-typed [`NodeIdRef`].\n    pub const fn from_str(node_id: &str) -> &Self {\n        let ptr: *const str = node_id;\n        // 
SAFETY: `NodeIdRef` is `#[repr(transparent)]` around a single `str` field, so a `*const\n        // str` can be safely reinterpreted as a `*const NodeIdRef`\n        unsafe { &*(ptr as *const Self) }\n    }\n\n    /// Transparently reinterprets the static string slice as a strongly-typed [`NodeIdRef`].\n    pub const fn from_static(node_id: &'static str) -> &'static Self {\n        Self::from_str(node_id)\n    }\n\n    /// Provides access to the underlying value as a string slice.\n    pub const fn as_str(&self) -> &str {\n        &self.0\n    }\n}\n\nimpl AsRef<str> for NodeIdRef {\n    fn as_ref(&self) -> &str {\n        &self.0\n    }\n}\n\nimpl Borrow<str> for NodeIdRef {\n    fn borrow(&self) -> &str {\n        &self.0\n    }\n}\n\nimpl Display for NodeIdRef {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"{}\", &self.0)\n    }\n}\n\nimpl<'a> From<&'a str> for &'a NodeIdRef {\n    fn from(node_id: &'a str) -> &'a NodeIdRef {\n        NodeIdRef::from_str(node_id)\n    }\n}\n\nimpl PartialEq<NodeIdRef> for NodeId {\n    fn eq(&self, other: &NodeIdRef) -> bool {\n        self.as_str() == other.as_str()\n    }\n}\n\nimpl PartialEq<&'_ NodeIdRef> for NodeId {\n    fn eq(&self, other: &&NodeIdRef) -> bool {\n        self.as_str() == other.as_str()\n    }\n}\n\nimpl PartialEq<NodeId> for NodeIdRef {\n    fn eq(&self, other: &NodeId) -> bool {\n        self.as_str() == other.as_str()\n    }\n}\n\nimpl PartialEq<NodeId> for &'_ NodeIdRef {\n    fn eq(&self, other: &NodeId) -> bool {\n        self.as_str() == other.as_str()\n    }\n}\n\nimpl PartialEq<NodeId> for String {\n    fn eq(&self, other: &NodeId) -> bool {\n        self.as_str() == other.as_str()\n    }\n}\n\nimpl ToOwned for NodeIdRef {\n    type Owned = NodeId;\n\n    fn to_owned(&self) -> Self::Owned {\n        NodeId(self.0.to_string())\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl From<&NodeId> for sea_query::Value {\n    fn from(node_id: &NodeId) -> Self {\n        
node_id.to_string().into()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_queue_id() {\n        assert_eq!(\n            queue_id(\n                &IndexUid::for_test(\"test-index\", 0),\n                \"test-source\",\n                &ShardId::from(1u64)\n            ),\n            \"test-index:00000000000000000000000000/test-source/00000000000000000001\"\n        );\n    }\n\n    #[test]\n    fn test_split_queue_id() {\n        let splits = split_queue_id(\"test-index:00000000000000000000000000\");\n        assert!(splits.is_none());\n\n        let splits = split_queue_id(\"test-index:00000000000000000000000000/test-source\");\n        assert!(splits.is_none());\n\n        let (index_uid, source_id, shard_id) = split_queue_id(\n            \"test-index:00000000000000000000000000/test-source/00000000000000000001\",\n        )\n        .unwrap();\n        assert_eq!(\n            &index_uid.to_string(),\n            \"test-index:00000000000000000000000000\"\n        );\n        assert_eq!(source_id, \"test-source\");\n        assert_eq!(shard_id, ShardId::from(1u64));\n    }\n\n    #[test]\n    fn test_node_id() {\n        let node_id = NodeId::new(\"test-node\".to_string());\n        assert_eq!(node_id.as_str(), \"test-node\");\n        assert_eq!(node_id, NodeIdRef::from_str(\"test-node\"));\n    }\n\n    #[test]\n    fn test_node_serde() {\n        #[derive(Debug, PartialEq, Eq, Serialize, Deserialize)]\n        struct Node {\n            node_id: NodeId,\n        }\n        let node = Node {\n            node_id: NodeId::from(\"test-node\"),\n        };\n        let serialized = serde_json::to_string(&node).unwrap();\n        assert_eq!(serialized, r#\"{\"node_id\":\"test-node\"}\"#);\n\n        let deserialized = serde_json::from_str::<Node>(&serialized).unwrap();\n        assert_eq!(deserialized, node);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/pipeline_uid.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::fmt;\nuse std::fmt::{Display, Formatter};\nuse std::str::FromStr;\n\nuse serde::de::Error;\nuse serde::{Deserialize, Serialize};\nuse ulid::Ulid;\n\nuse super::ULID_SIZE;\n\n/// A pipeline UID identifies an indexing pipeline and an indexing task.\n#[derive(Clone, Copy, Default, Hash, Eq, PartialEq, Ord, PartialOrd)]\npub struct PipelineUid(Ulid);\n\nimpl fmt::Debug for PipelineUid {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        write!(f, \"Pipeline({})\", self.0)\n    }\n}\n\nimpl Display for PipelineUid {\n    fn fmt(&self, f: &mut Formatter) -> fmt::Result {\n        self.0.fmt(f)\n    }\n}\n\nimpl PipelineUid {\n    /// Creates a new random pipeline UID.\n    pub fn random() -> Self {\n        Self(Ulid::new())\n    }\n\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test(ulid_u128: u128) -> PipelineUid {\n        Self(Ulid::from(ulid_u128))\n    }\n}\n\nimpl FromStr for PipelineUid {\n    type Err = &'static str;\n\n    fn from_str(pipeline_uid_str: &str) -> Result<PipelineUid, Self::Err> {\n        let pipeline_ulid =\n            Ulid::from_string(pipeline_uid_str).map_err(|_| \"invalid pipeline UID\")?;\n        Ok(PipelineUid(pipeline_ulid))\n    }\n}\n\nimpl Serialize for PipelineUid {\n    fn serialize<S: serde::Serializer>(&self, serializer: S) -> Result<S::Ok, 
S::Error> {\n        serializer.collect_str(&self.0)\n    }\n}\n\nimpl<'de> Deserialize<'de> for PipelineUid {\n    fn deserialize<D: serde::Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {\n        let ulid_str: Cow<'de, str> = Cow::deserialize(deserializer)?;\n        let ulid = Ulid::from_string(&ulid_str).map_err(D::Error::custom)?;\n        Ok(Self(ulid))\n    }\n}\n\nimpl prost::Message for PipelineUid {\n    fn encode_raw(&self, buf: &mut impl prost::bytes::BufMut) {\n        // TODO: when `bytes::encode` supports `&[u8]`, we can remove this allocation.\n        prost::encoding::bytes::encode(1u32, &self.0.to_bytes().to_vec(), buf);\n    }\n\n    fn merge_field(\n        &mut self,\n        tag: u32,\n        wire_type: prost::encoding::WireType,\n        buf: &mut impl prost::bytes::Buf,\n        ctx: prost::encoding::DecodeContext,\n    ) -> ::core::result::Result<(), prost::DecodeError> {\n        const STRUCT_NAME: &str = \"PipelineUid\";\n\n        match tag {\n            1u32 => {\n                let mut buffer = Vec::with_capacity(ULID_SIZE);\n\n                prost::encoding::bytes::merge(wire_type, &mut buffer, buf, ctx).map_err(\n                    |mut error| {\n                        error.push(STRUCT_NAME, \"pipeline_uid\");\n                        error\n                    },\n                )?;\n                let ulid_bytes: [u8; ULID_SIZE] =\n                    buffer.try_into().map_err(|buffer: Vec<u8>| {\n                        prost::DecodeError::new(format!(\n                            \"invalid length for field `pipeline_uid`, expected 16 bytes, got {}\",\n                            buffer.len()\n                        ))\n                    })?;\n                self.0 = Ulid::from_bytes(ulid_bytes);\n                Ok(())\n            }\n            _ => prost::encoding::skip_field(wire_type, tag, buf, ctx),\n        }\n    }\n\n    #[inline]\n    fn encoded_len(&self) -> usize {\n        
prost::encoding::key_len(1u32)\n            + prost::encoding::encoded_len_varint(ULID_SIZE as u64)\n            + ULID_SIZE\n    }\n\n    fn clear(&mut self) {\n        self.0 = Ulid::default();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytes::Bytes;\n    use prost::Message;\n\n    use super::*;\n\n    #[test]\n    fn test_pipeline_uid_json_serde_roundtrip() {\n        let pipeline_uid = PipelineUid::default();\n        let serialized = serde_json::to_string(&pipeline_uid).unwrap();\n        assert_eq!(serialized, r#\"\"00000000000000000000000000\"\"#);\n\n        let deserialized: PipelineUid = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, pipeline_uid);\n    }\n\n    #[test]\n    fn test_pipeline_uid_prost_serde_roundtrip() {\n        let pipeline_uid = PipelineUid::random();\n\n        let encoded = pipeline_uid.encode_to_vec();\n        assert_eq!(\n            PipelineUid::decode(Bytes::from(encoded)).unwrap(),\n            pipeline_uid\n        );\n\n        let encoded = pipeline_uid.encode_length_delimited_to_vec();\n        assert_eq!(\n            PipelineUid::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            pipeline_uid\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/position.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::{Debug, Display};\nuse std::{fmt, mem};\n\nuse bytes::{Bytes, BytesMut};\nuse bytestring::ByteString;\nuse prost::{self, DecodeError};\nuse quickwit_common::pretty::PrettyDisplay;\nuse serde::{Deserialize, Serialize};\n\nconst BEGINNING: &str = \"\";\n\nconst EOF_PREFIX: &str = \"~\";\n\n#[derive(Clone, Debug, Default, Eq, PartialEq, Hash, Ord, PartialOrd)]\npub struct Offset(ByteString);\n\nimpl Offset {\n    pub fn as_str(&self) -> &str {\n        &self.0\n    }\n\n    pub fn as_i64(&self) -> Option<i64> {\n        self.0.parse::<i64>().ok()\n    }\n\n    pub fn as_u64(&self) -> Option<u64> {\n        self.0.parse::<u64>().ok()\n    }\n\n    pub fn as_usize(&self) -> Option<usize> {\n        self.0.parse::<usize>().ok()\n    }\n}\n\nimpl fmt::Display for Offset {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"{}\", &self.0)\n    }\n}\n\nimpl From<i64> for Offset {\n    fn from(offset: i64) -> Self {\n        Self(ByteString::from(format!(\"{offset:0>20}\")))\n    }\n}\n\nimpl From<u64> for Offset {\n    fn from(offset: u64) -> Self {\n        Self(ByteString::from(format!(\"{offset:0>20}\")))\n    }\n}\n\nimpl From<usize> for Offset {\n    fn from(offset: usize) -> Self {\n        Self(ByteString::from(format!(\"{offset:0>20}\")))\n    }\n}\n\nimpl From<&str> for Offset {\n    fn from(offset: 
&str) -> Self {\n        Self(ByteString::from(offset))\n    }\n}\n\n/// Marks a position within a specific partition/shard of a source.\n///\n/// The nature of the position depends on the source.\n/// Each source must encode it as a `String` in such a way that\n/// the lexicographical order matches the natural order of the\n/// position.\n///\n/// For instance, for u64, a 20-left-padded decimal representation\n/// can be used. Alternatively, a base64 representation of their\n/// big-endian representation can be used.\n///\n/// The empty string can be used to represent the beginning of the source,\n/// if no position makes sense. It can be built via `Position::default()`.\n#[derive(Clone, Default, Eq, PartialEq, Hash, Ord, PartialOrd)]\npub enum Position {\n    #[default]\n    Beginning,\n    Offset(Offset),\n    /// End of partition/shard at the given offset. `Eof(None)` means no records were ever written.\n    Eof(Option<Offset>),\n}\n\nimpl Debug for Position {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        match self {\n            Self::Beginning => write!(f, \"Position::Beginning\"),\n            // The derive implementation would show `Offset(Offset(0000001u64))` here.\n            Self::Offset(offset) => write!(f, \"Position::Offset({offset})\"),\n            Self::Eof(Some(offset)) => write!(f, \"Position::Eof({offset})\"),\n            Self::Eof(None) => write!(f, \"Position::Eof\"),\n        }\n    }\n}\n\n// Caution: This is also the serialization format for chitchat and serde. 
Modify with care.\nimpl Display for Position {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        match self {\n            Self::Beginning => write!(f, \"{BEGINNING}\"),\n            Self::Offset(offset) => write!(f, \"{offset}\"),\n            Self::Eof(Some(offset)) => write!(f, \"{EOF_PREFIX}{offset}\"),\n            Self::Eof(None) => write!(f, \"{EOF_PREFIX}\"),\n        }\n    }\n}\n\nstruct PositionPrettyDisplay<'a>(&'a Position);\n\nimpl fmt::Display for PositionPrettyDisplay<'_> {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        match self.0 {\n            Position::Beginning => write!(f, \"beginning\"),\n            Position::Offset(offset) => write!(f, \"{offset}\"),\n            Position::Eof(Some(offset)) => write!(f, \"eof({offset})\"),\n            Position::Eof(None) => write!(f, \"eof\"),\n        }\n    }\n}\n\nimpl PrettyDisplay for Position {\n    fn pretty_display(&self) -> impl fmt::Display {\n        PositionPrettyDisplay(self)\n    }\n}\n\nimpl Position {\n    pub fn offset(offset: impl Into<Offset>) -> Self {\n        Self::Offset(offset.into())\n    }\n\n    pub fn eof(offset: impl Into<Offset>) -> Self {\n        Self::Eof(Some(offset.into()))\n    }\n\n    pub fn as_eof(&self) -> Self {\n        match self {\n            Self::Beginning => Self::Eof(None),\n            Self::Offset(offset) => Self::Eof(Some(offset.clone())),\n            _ => self.clone(),\n        }\n    }\n\n    pub fn to_eof(&mut self) {\n        match self {\n            Self::Beginning => *self = Self::Eof(None),\n            Self::Offset(offset) => *self = Self::Eof(Some(mem::take(offset))),\n            _ => (),\n        }\n    }\n\n    pub fn as_i64(&self) -> Option<i64> {\n        match self {\n            Self::Offset(offset) | Self::Eof(Some(offset)) => offset.as_i64(),\n            _ => None,\n        }\n    }\n\n    pub fn as_u64(&self) -> Option<u64> {\n        match self {\n            Self::Offset(offset) | 
Self::Eof(Some(offset)) => offset.as_u64(),\n            _ => None,\n        }\n    }\n\n    pub fn as_usize(&self) -> Option<usize> {\n        match self {\n            Self::Offset(offset) | Self::Eof(Some(offset)) => offset.as_usize(),\n            _ => None,\n        }\n    }\n\n    pub fn is_beginning(&self) -> bool {\n        matches!(self, Self::Beginning)\n    }\n\n    pub fn is_eof(&self) -> bool {\n        matches!(self, Self::Eof(_))\n    }\n\n    fn as_bytes(&self) -> Bytes {\n        match self {\n            Self::Beginning => Bytes::from_static(BEGINNING.as_bytes()),\n            Self::Offset(offset) => offset.0.as_bytes().clone(),\n            Self::Eof(Some(offset)) => {\n                let mut bytes = BytesMut::with_capacity(EOF_PREFIX.len() + offset.0.len());\n                bytes.extend_from_slice(EOF_PREFIX.as_bytes());\n                bytes.extend_from_slice(offset.0.as_bytes());\n                bytes.freeze()\n            }\n            Self::Eof(None) => Bytes::from_static(EOF_PREFIX.as_bytes()),\n        }\n    }\n}\n\nimpl From<ByteString> for Position {\n    fn from(position: ByteString) -> Self {\n        match &position[..] 
{\n            BEGINNING => Self::Beginning,\n            EOF_PREFIX => Self::Eof(None),\n            offset if offset.starts_with(EOF_PREFIX) => {\n                let offset = ByteString::from(&offset[EOF_PREFIX.len()..]);\n                Self::Eof(Some(Offset(offset)))\n            }\n            _ => Self::Offset(Offset(position)),\n        }\n    }\n}\n\nimpl From<String> for Position {\n    fn from(position: String) -> Self {\n        Self::from(ByteString::from(position))\n    }\n}\n\nimpl Serialize for Position {\n    fn serialize<S: serde::Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n        serializer.collect_str(self)\n    }\n}\n\nimpl<'de> Deserialize<'de> for Position {\n    fn deserialize<D: serde::Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {\n        let position_str = String::deserialize(deserializer)?;\n        Ok(Self::from(position_str))\n    }\n}\n\nimpl PartialEq<Position> for &Position {\n    #[inline]\n    fn eq(&self, other: &Position) -> bool {\n        *self == other\n    }\n}\n\nimpl prost::Message for Position {\n    fn encode_raw(&self, buf: &mut impl prost::bytes::BufMut) {\n        prost::encoding::bytes::encode(1u32, &self.as_bytes(), buf);\n    }\n\n    fn merge_field(\n        &mut self,\n        tag: u32,\n        wire_type: prost::encoding::WireType,\n        buf: &mut impl prost::bytes::Buf,\n        ctx: prost::encoding::DecodeContext,\n    ) -> ::core::result::Result<(), prost::DecodeError> {\n        const STRUCT_NAME: &str = \"Position\";\n\n        match tag {\n            1u32 => {\n                let mut value = Vec::new();\n                prost::encoding::bytes::merge(wire_type, &mut value, buf, ctx).map_err(\n                    |mut error| {\n                        error.push(STRUCT_NAME, \"position\");\n                        error\n                    },\n                )?;\n                let byte_string = ByteString::try_from(value)\n                    .map_err(|_| 
DecodeError::new(\"position is not valid UTF-8\"))?;\n                *self = Self::from(byte_string);\n                Ok(())\n            }\n            _ => prost::encoding::skip_field(wire_type, tag, buf, ctx),\n        }\n    }\n\n    #[inline]\n    fn encoded_len(&self) -> usize {\n        prost::encoding::bytes::encoded_len(1u32, &self.as_bytes())\n    }\n\n    fn clear(&mut self) {\n        *self = Self::default();\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use prost::Message;\n\n    use super::*;\n\n    #[test]\n    #[allow(clippy::cmp_owned)]\n    fn test_position_ord() {\n        assert!(Position::Beginning < Position::offset(0u64));\n        assert!(Position::Beginning < Position::Eof(None));\n        assert!(Position::Beginning < Position::eof(0u64));\n\n        assert!(Position::offset(0u64) < Position::offset(1u64));\n\n        assert!(Position::Eof(None) < Position::eof(0u64));\n        assert!(Position::eof(0u64) < Position::eof(1u64));\n    }\n\n    #[test]\n    fn test_position_as_eof() {\n        let eof_position = Position::Beginning.as_eof();\n\n        assert!(eof_position.is_eof());\n        assert!(eof_position.as_u64().is_none());\n\n        let eof_position = Position::offset(0u64).as_eof();\n\n        assert!(eof_position.is_eof());\n        assert_eq!(eof_position.as_u64().unwrap(), 0u64);\n    }\n\n    #[test]\n    fn test_position_to_eof() {\n        let mut position = Position::Beginning;\n        position.to_eof();\n        assert!(matches!(position, Position::Eof(None)));\n\n        let mut position = Position::offset(0u64);\n        position.to_eof();\n        assert!(matches!(position, Position::Eof(Some(offset)) if offset.as_u64().unwrap() == 0));\n    }\n\n    #[test]\n    fn test_position_json_serde_roundtrip() {\n        let serialized = serde_json::to_string(&Position::Beginning).unwrap();\n        assert_eq!(serialized, r#\"\"\"\"#);\n        let deserialized: Position = serde_json::from_str(&serialized).unwrap();\n        
assert_eq!(deserialized, Position::Beginning);\n\n        let serialized = serde_json::to_string(&Position::offset(0u64)).unwrap();\n        assert_eq!(serialized, r#\"\"00000000000000000000\"\"#);\n        let deserialized: Position = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, Position::offset(0u64));\n\n        let serialized = serde_json::to_string(&Position::Eof(None)).unwrap();\n        assert_eq!(serialized, r#\"\"~\"\"#);\n        let deserialized: Position = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, Position::Eof(None));\n\n        let serialized = serde_json::to_string(&Position::eof(0u64)).unwrap();\n        assert_eq!(serialized, r#\"\"~00000000000000000000\"\"#);\n        let deserialized: Position = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, Position::eof(0u64));\n    }\n\n    #[test]\n    fn test_position_prost_serde_roundtrip() {\n        let encoded = Position::Beginning.encode_to_vec();\n        assert_eq!(\n            Position::decode(Bytes::from(encoded)).unwrap(),\n            Position::Beginning\n        );\n        let encoded = Position::Beginning.encode_length_delimited_to_vec();\n        assert_eq!(\n            Position::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            Position::Beginning\n        );\n\n        let encoded = Position::offset(0u64).encode_to_vec();\n        assert_eq!(\n            Position::decode(Bytes::from(encoded)).unwrap(),\n            Position::offset(0u64)\n        );\n        let encoded = Position::offset(0u64).encode_length_delimited_to_vec();\n        assert_eq!(\n            Position::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            Position::offset(0u64)\n        );\n\n        let encoded = Position::Eof(None).encode_to_vec();\n        assert_eq!(\n            Position::decode(Bytes::from(encoded)).unwrap(),\n            Position::Eof(None)\n        );\n        let 
encoded = Position::Eof(None).encode_length_delimited_to_vec();\n        assert_eq!(\n            Position::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            Position::Eof(None)\n        );\n\n        let encoded = Position::eof(0u64).encode_to_vec();\n        assert_eq!(\n            Position::decode(Bytes::from(encoded)).unwrap(),\n            Position::eof(0u64)\n        );\n        let encoded = Position::eof(0u64).encode_length_delimited_to_vec();\n        assert_eq!(\n            Position::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            Position::eof(0u64)\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-proto/src/types/shard_id.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::fmt::Debug;\n\nuse bytestring::ByteString;\nuse prost::DecodeError;\nuse serde::{Deserialize, Serialize};\nuse ulid::Ulid;\n\n/// Shard ID.\n/// Shard ID are required to be globally unique.\n///\n/// In other words, there cannot be two shards belonging to two different sources\n/// with the same shard ID.\n#[derive(Clone, Debug, Default, Eq, PartialEq, Hash, Ord, PartialOrd)]\npub struct ShardId(ByteString);\n\nimpl ShardId {\n    pub fn as_str(&self) -> &str {\n        &self.0\n    }\n\n    pub fn as_u64(&self) -> Option<u64> {\n        self.0.parse().ok()\n    }\n}\n\nimpl fmt::Display for ShardId {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"{}\", &self.0)\n    }\n}\n\nimpl From<&str> for ShardId {\n    fn from(shard_id: &str) -> Self {\n        Self(ByteString::from(shard_id))\n    }\n}\n\nimpl From<String> for ShardId {\n    fn from(shard_id: String) -> Self {\n        Self(ByteString::from(shard_id))\n    }\n}\n\nimpl From<u64> for ShardId {\n    fn from(shard_id: u64) -> Self {\n        Self(ByteString::from(format!(\"{shard_id:0>20}\")))\n    }\n}\n\nimpl From<Ulid> for ShardId {\n    fn from(shard_id: Ulid) -> Self {\n        Self(ByteString::from(shard_id.to_string()))\n    }\n}\n\nimpl Serialize for ShardId {\n    fn serialize<S: serde::Serializer>(&self, serializer: S) -> 
Result<S::Ok, S::Error> {\n        serializer.collect_str(self)\n    }\n}\n\nimpl<'de> Deserialize<'de> for ShardId {\n    fn deserialize<D: serde::Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {\n        let shard_id = String::deserialize(deserializer)?;\n        Ok(Self::from(shard_id))\n    }\n}\n\nimpl prost::Message for ShardId {\n    fn encode_raw(&self, buf: &mut impl prost::bytes::BufMut) {\n        prost::encoding::bytes::encode(1u32, &self.0.as_bytes().clone(), buf);\n    }\n\n    fn merge_field(\n        &mut self,\n        tag: u32,\n        wire_type: prost::encoding::WireType,\n        buf: &mut impl prost::bytes::Buf,\n        ctx: prost::encoding::DecodeContext,\n    ) -> ::core::result::Result<(), prost::DecodeError> {\n        const STRUCT_NAME: &str = \"ShardId\";\n\n        match tag {\n            1u32 => {\n                let mut value = Vec::new();\n                prost::encoding::bytes::merge(wire_type, &mut value, buf, ctx).map_err(\n                    |mut error| {\n                        error.push(STRUCT_NAME, \"shard_id\");\n                        error\n                    },\n                )?;\n                let byte_string = ByteString::try_from(value)\n                    .map_err(|_| DecodeError::new(\"shard_id is not valid UTF-8\"))?;\n                *self = Self(byte_string);\n                Ok(())\n            }\n            _ => prost::encoding::skip_field(wire_type, tag, buf, ctx),\n        }\n    }\n\n    #[inline]\n    fn encoded_len(&self) -> usize {\n        prost::encoding::bytes::encoded_len(1u32, &self.0.as_bytes().clone())\n    }\n\n    fn clear(&mut self) {\n        *self = Self::default();\n    }\n}\n\nimpl PartialEq<ShardId> for &ShardId {\n    #[inline]\n    fn eq(&self, other: &ShardId) -> bool {\n        *self == other\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::Type<sqlx::Postgres> for ShardId {\n    fn type_info() -> sqlx::postgres::PgTypeInfo {\n        
sqlx::postgres::PgTypeInfo::with_name(\"VARCHAR\")\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::Encode<'_, sqlx::Postgres> for ShardId {\n    fn encode_by_ref(\n        &self,\n        buf: &mut sqlx::postgres::PgArgumentBuffer,\n    ) -> Result<sqlx::encode::IsNull, sqlx::error::BoxDynError> {\n        sqlx::Encode::<sqlx::Postgres>::encode(self.as_str(), buf)\n    }\n}\n\n#[cfg(feature = \"postgres\")]\nimpl sqlx::postgres::PgHasArrayType for ShardId {\n    fn array_type_info() -> sqlx::postgres::PgTypeInfo {\n        sqlx::postgres::PgTypeInfo::with_name(\"VARCHAR[]\")\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use bytes::Bytes;\n    use prost::Message;\n\n    use super::*;\n\n    #[test]\n    fn test_shard_id_json_serde_roundtrip() {\n        let serialized = serde_json::to_string(&ShardId::from(0)).unwrap();\n        assert_eq!(serialized, r#\"\"00000000000000000000\"\"#);\n        let deserialized: ShardId = serde_json::from_str(&serialized).unwrap();\n        assert_eq!(deserialized, ShardId::from(0));\n    }\n\n    #[test]\n    fn test_shard_id_prost_serde_roundtrip() {\n        let ulid = Ulid::new();\n        let encoded = ShardId::from(ulid).encode_to_vec();\n        assert_eq!(\n            ShardId::decode(Bytes::from(encoded)).unwrap(),\n            ShardId::from(ulid)\n        );\n        let encoded = ShardId::from(ulid).encode_length_delimited_to_vec();\n        assert_eq!(\n            ShardId::decode_length_delimited(Bytes::from(encoded)).unwrap(),\n            ShardId::from(ulid)\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/Cargo.toml",
    "content": "[package]\nname = \"quickwit-query\"\ndescription = \"Query DSL definition and parsing\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nbase64 = { workspace = true }\nbitpacking = { workspace = true }\nhex = { workspace = true }\nonce_cell = { workspace = true }\nregex = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nserde_with = { workspace = true }\ntantivy = { workspace = true }\ntantivy-fst = { workspace = true }\ntracing = { workspace = true }\ntime = { workspace = true }\nthiserror = { workspace = true }\nrustc-hash = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-datetime = { workspace = true }\nquickwit-proto = { workspace = true }\n\n[dev-dependencies]\ncriterion = { workspace = true }\nproptest = { workspace = true }\ntime = { workspace = true }\n\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\n\n[[bench]]\nname = \"tokenizers_bench\"\nharness = false\n"
  },
  {
    "path": "quickwit/quickwit-query/README.md",
    "content": "Quickwit\n\n```mermaid\nflowchart LR\n    EsApiQParam[ES q= query string param] -->|parse user query| QueryAST\n    EsQueryDSL[ES query DSL in json body] -->|Parse json| QueryAST\n    QuickwitAPI -->|parse user query| QueryAST\n    QueryAST -->|apply query to a split| TantivyQuery\n\n```\n\nIn quickwit and tantivy, we call XXXQuery every object representing a predicate over a document, hence selecting a subset of documents.\n\nSuch objects do not contain information about what to do with the document, for instance, how to sort them, which aggregation to run on them, etc.\n\nThe `SearchRequest` object, on the other hand, is the larger object in charge of gathering all of the information of the request, including the user query.\n\nQuickwit uses a `QueryAST` object to represent queries internally.\nRegardless of how the query has been supplied (ES-compatible API, quickwit search API,  search stream API), we convert\nit to a QueryAST. Because it is schema agnostic, it can be serialized and passed around.\n\n# root / leaf\n\nOne confusing thing about the QueryAST is that because we want its construction to not depend on\nthe docmapper: building the QueryAST should not require interrogating the metastore. 
It\nis built without knowing the default fields of the index.\n\nFor this reason, the AST contains a node called `UserInputQuery` that has a bit of special status:\nbefore usage, the AST must parse the content of these nodes and replace them with an AST.\n\nThis operation is done on the root search (on which the doc mapper, and hence the default fields, are known),\nand dispatched to the leaf search.\nThe root search checks the validity of the search *against the current DocMapper*.\n\nThat way we are able to return an error to the user, if for instance the query includes a range query that does not target a fast field.\n\nThe leaf search is applied on splits that may have been produced with a different doc mapper.\nConsidering our example again, we want users to be able to run range queries after they have updated their schema.\n\nReindexing is not an option in quickwit, so what happens is, generally speaking, a best-effort solution.\nThe range query node of the AST will act as if it were a match-nothing node, and the recall will be affected for these\nlegacy splits.\nGenerally, this behavior decreases recall, but `MUST NOT` clauses can actually increase recall.\n\n# Elasticsearch compatibility API\n\nThe user's request contains information in both the HTTP body and the querystring parameters. These parameters may overlap, in which case the querystring parameter takes priority.\n\nIn the body, a user can supply the query using a rich query DSL expressed in JSON format.\n```json\n{\n    \"query\": { /* ESQueryDSL */}\n}\n```\n\nWhen the query is passed as an `ESQueryDSL`, it is simply deserialized into a `QueryAST` object. The `QueryAST` is a one-to-one representation of the user input. It is entirely schema-agnostic.\n\n"
  },
  {
    "path": "quickwit/quickwit-query/benches/tokenizers_bench.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse criterion::{Criterion, Throughput, black_box, criterion_group, criterion_main};\nuse quickwit_query::CodeTokenizer;\nuse tantivy::tokenizer::{RegexTokenizer, TextAnalyzer, Token, TokenStream};\n\n// A random ascii string of length 100 chars.\nstatic CODE_TEXT: &str = r#\"\n# Camel case variables\nfirstName = \"John\"\nlastName = \"Doe\"\nageOfPerson = 30\nisEmployed = True\n\n# Snake case variables\nfirst_name = \"Jane\"\nlast_name = \"Smith\"\nage_of_person = 25\nis_employed = False\n\n# Mixed case variables\nfullName = firstName + \" \" + lastName\nisPersonEmployed = isEmployed and is_employed\n\n# Code logic\nif isEmployed and is_employed:\n    print(f\"{firstName} {first_name} is currently employed.\")\nelse:\n    print(f\"{lastName} {last_name} is not employed at the moment.\")\n\ntotalAge = ageOfPerson + age_of_person\nprint(f\"The combined age is: {totalAge}\")\n\n# Longer word examples\nlongCamelCaseWord = \"LongCamelCase\"\nlongSnakeCaseWord = \"long_snake_case\"\nmixedCaseWord = \"ThisIsAMixedCaseWord\"\nlongCamelCaseWord = \"LongCamelCase\"\nlongSnakeCaseWord = \"long_snake_case\"\nmixedCaseWord = \"ThisIsAMixedCaseWord\"\n\n# Words with consecutive uppercase letters\nWORDWITHConsecutiveUppercase1 = \"1\"\nWORDWITHCONSECUTIVEUppercase2 = \"2\"\nWORDWITHCONSECUTIVEUPPERCASE2 = \"3\"\n\"#;\n\nfn process_tokens(analyzer: &mut 
TextAnalyzer, text: &str) -> Vec<Token> {\n    let mut token_stream = analyzer.token_stream(text);\n    let mut tokens: Vec<Token> = Vec::new();\n    token_stream.process(&mut |token: &Token| tokens.push(token.clone()));\n    tokens\n}\n\npub fn tokenizers_throughput_benchmark(c: &mut Criterion) {\n    let mut group = c.benchmark_group(\"code_tokenizer\");\n    let mut regex_tokenizer = TextAnalyzer::from(\n        RegexTokenizer::new(\"(\\\\p{Ll}+|\\\\p{Lu}\\\\p{Ll}+|\\\\p{Lu}+|\\\\d+)\").unwrap(),\n    );\n    let mut code_tokenizer = TextAnalyzer::from(CodeTokenizer::default());\n\n    group\n        .throughput(Throughput::Bytes(CODE_TEXT.len() as u64))\n        .bench_with_input(\"regex-tokenize\", CODE_TEXT, |b, text| {\n            b.iter(|| process_tokens(&mut regex_tokenizer, black_box(text)));\n        });\n    group\n        .throughput(Throughput::Bytes(CODE_TEXT.len() as u64))\n        .bench_with_input(\"code-tokenize\", CODE_TEXT, |b, text| {\n            b.iter(|| process_tokens(&mut code_tokenizer, black_box(text)));\n        });\n}\n\ncriterion_group!(\n    tokenizers_throughput_benches,\n    tokenizers_throughput_benchmark\n);\ncriterion_main!(tokenizers_throughput_benches);\n"
  },
  {
    "path": "quickwit/quickwit-query/src/aggregations.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse rustc_hash::FxHashMap;\nuse serde::{Deserialize, Serialize};\nuse tantivy::aggregation::Key as TantivyKey;\nuse tantivy::aggregation::agg_result::{\n    AggregationResult as TantivyAggregationResult, AggregationResults as TantivyAggregationResults,\n    BucketEntries as TantivyBucketEntries, BucketEntry as TantivyBucketEntry,\n    BucketResult as TantivyBucketResult, MetricResult as TantivyMetricResult,\n    RangeBucketEntry as TantivyRangeBucketEntry,\n};\nuse tantivy::aggregation::metric::{\n    ExtendedStats, PercentileValues as TantivyPercentileValues, PercentileValuesVecEntry,\n    PercentilesMetricResult as TantivyPercentilesMetricResult, SingleMetricResult, Stats,\n    TopHitsMetricResult,\n};\n\n// hopefully all From in this module are no-ops, otherwise, this is a very sad situation\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\n/// The final aggregation result.\npub struct AggregationResults(pub Vec<(String, AggregationResult)>);\n\nimpl From<TantivyAggregationResults> for AggregationResults {\n    fn from(value: TantivyAggregationResults) -> AggregationResults {\n        AggregationResults(value.0.into_iter().map(|(k, v)| (k, v.into())).collect())\n    }\n}\n\nimpl From<AggregationResults> for TantivyAggregationResults {\n    fn from(value: AggregationResults) -> TantivyAggregationResults {\n        
TantivyAggregationResults(value.0.into_iter().map(|(k, v)| (k, v.into())).collect())\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\n/// An aggregation is either a bucket or a metric.\npub enum AggregationResult {\n    /// Bucket result variant.\n    BucketResult(BucketResult),\n    /// Metric result variant.\n    MetricResult(MetricResult),\n}\n\nimpl From<TantivyAggregationResult> for AggregationResult {\n    fn from(value: TantivyAggregationResult) -> AggregationResult {\n        match value {\n            TantivyAggregationResult::BucketResult(bucket) => {\n                AggregationResult::BucketResult(bucket.into())\n            }\n            TantivyAggregationResult::MetricResult(metric) => {\n                AggregationResult::MetricResult(metric.into())\n            }\n        }\n    }\n}\n\nimpl From<AggregationResult> for TantivyAggregationResult {\n    fn from(value: AggregationResult) -> TantivyAggregationResult {\n        match value {\n            AggregationResult::BucketResult(bucket) => {\n                TantivyAggregationResult::BucketResult(bucket.into())\n            }\n            AggregationResult::MetricResult(metric) => {\n                TantivyAggregationResult::MetricResult(metric.into())\n            }\n        }\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\n/// MetricResult\npub enum MetricResult {\n    /// Average metric result.\n    Average(SingleMetricResult),\n    /// Count metric result.\n    Count(SingleMetricResult),\n    /// Max metric result.\n    Max(SingleMetricResult),\n    /// Min metric result.\n    Min(SingleMetricResult),\n    /// Stats metric result.\n    Stats(Stats),\n    /// ExtendedStats metric result.\n    ExtendedStats(Box<ExtendedStats>),\n    /// Sum metric result.\n    Sum(SingleMetricResult),\n    /// Percentiles metric result.\n    Percentiles(PercentilesMetricResult),\n    /// Top hits metric result\n    TopHits(TopHitsMetricResult),\n    /// Cardinality metric 
result\n    Cardinality(SingleMetricResult),\n}\n\nimpl From<TantivyMetricResult> for MetricResult {\n    fn from(value: TantivyMetricResult) -> MetricResult {\n        match value {\n            TantivyMetricResult::Average(val) => MetricResult::Average(val),\n            TantivyMetricResult::Count(val) => MetricResult::Count(val),\n            TantivyMetricResult::Max(val) => MetricResult::Max(val),\n            TantivyMetricResult::Min(val) => MetricResult::Min(val),\n            TantivyMetricResult::Stats(val) => MetricResult::Stats(val),\n            TantivyMetricResult::ExtendedStats(val) => MetricResult::ExtendedStats(val),\n            TantivyMetricResult::Sum(val) => MetricResult::Sum(val),\n            TantivyMetricResult::Percentiles(val) => MetricResult::Percentiles(val.into()),\n            TantivyMetricResult::TopHits(val) => MetricResult::TopHits(val),\n            TantivyMetricResult::Cardinality(val) => MetricResult::Cardinality(val),\n        }\n    }\n}\n\nimpl From<MetricResult> for TantivyMetricResult {\n    fn from(value: MetricResult) -> TantivyMetricResult {\n        match value {\n            MetricResult::Average(val) => TantivyMetricResult::Average(val),\n            MetricResult::Count(val) => TantivyMetricResult::Count(val),\n            MetricResult::Max(val) => TantivyMetricResult::Max(val),\n            MetricResult::Min(val) => TantivyMetricResult::Min(val),\n            MetricResult::Stats(val) => TantivyMetricResult::Stats(val),\n            MetricResult::ExtendedStats(val) => TantivyMetricResult::ExtendedStats(val),\n            MetricResult::Sum(val) => TantivyMetricResult::Sum(val),\n            MetricResult::Percentiles(val) => TantivyMetricResult::Percentiles(val.into()),\n            MetricResult::TopHits(val) => TantivyMetricResult::TopHits(val),\n            MetricResult::Cardinality(val) => TantivyMetricResult::Cardinality(val),\n        }\n    }\n}\n\n/// BucketEntry holds bucket aggregation result 
types.\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub enum BucketResult {\n    /// This is the range entry for a bucket, which contains a key, count, from, to, and optionally\n    /// sub-aggregations.\n    Range {\n        /// The range buckets sorted by range.\n        buckets: BucketEntries<RangeBucketEntry>,\n    },\n    /// This is the histogram entry for a bucket, which contains a key, count, and optionally\n    /// sub-aggregations.\n    Histogram {\n        /// The buckets.\n        ///\n        /// Whether there are holes depends on the request: if `min_doc_count` is 0, there are\n        /// no holes between the first and last bucket.\n        /// See `HistogramAggregation`\n        buckets: BucketEntries<BucketEntry>,\n    },\n    /// This is the term result\n    Terms {\n        /// The buckets.\n        ///\n        /// See `TermsAggregation`\n        buckets: Vec<BucketEntry>,\n        /// The number of documents that didn’t make it into the top N due to shard_size or size\n        sum_other_doc_count: u64,\n        /// The upper bound error for the doc count of each term.\n        doc_count_error_upper_bound: Option<u64>,\n    },\n}\n\nimpl From<TantivyBucketResult> for BucketResult {\n    fn from(value: TantivyBucketResult) -> BucketResult {\n        match value {\n            TantivyBucketResult::Range { buckets } => BucketResult::Range {\n                buckets: buckets.into(),\n            },\n            TantivyBucketResult::Histogram { buckets } => BucketResult::Histogram {\n                buckets: buckets.into(),\n            },\n            TantivyBucketResult::Terms {\n                buckets,\n                sum_other_doc_count,\n                doc_count_error_upper_bound,\n            } => BucketResult::Terms {\n                buckets: buckets.into_iter().map(Into::into).collect(),\n                sum_other_doc_count,\n                doc_count_error_upper_bound,\n            },\n            
TantivyBucketResult::Filter(_filter_bucket_result) => {\n                unimplemented!(\"filter aggregation is not yet supported in quickwit\")\n            }\n        }\n    }\n}\n\nimpl From<BucketResult> for TantivyBucketResult {\n    fn from(value: BucketResult) -> TantivyBucketResult {\n        match value {\n            BucketResult::Range { buckets } => TantivyBucketResult::Range {\n                buckets: buckets.into(),\n            },\n            BucketResult::Histogram { buckets } => TantivyBucketResult::Histogram {\n                buckets: buckets.into(),\n            },\n            BucketResult::Terms {\n                buckets,\n                sum_other_doc_count,\n                doc_count_error_upper_bound,\n            } => TantivyBucketResult::Terms {\n                buckets: buckets.into_iter().map(Into::into).collect(),\n                sum_other_doc_count,\n                doc_count_error_upper_bound,\n            },\n        }\n    }\n}\n\n/// This is the wrapper of buckets entries, which can be vector or hashmap\n/// depending on if it's keyed or not.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\npub enum BucketEntries<T> {\n    /// Vector format bucket entries\n    Vec(Vec<T>),\n    /// HashMap format bucket entries\n    HashMap(Vec<(String, T)>),\n}\n\nimpl<T, U> From<TantivyBucketEntries<T>> for BucketEntries<U>\nwhere U: From<T>\n{\n    fn from(value: TantivyBucketEntries<T>) -> BucketEntries<U> {\n        match value {\n            TantivyBucketEntries::Vec(vec) => {\n                BucketEntries::Vec(vec.into_iter().map(Into::into).collect())\n            }\n            TantivyBucketEntries::HashMap(map) => {\n                BucketEntries::HashMap(map.into_iter().map(|(k, v)| (k, v.into())).collect())\n            }\n        }\n    }\n}\n\nimpl<T, U> From<BucketEntries<T>> for TantivyBucketEntries<U>\nwhere U: From<T>\n{\n    fn from(value: BucketEntries<T>) -> TantivyBucketEntries<U> {\n        match value {\n   
         BucketEntries::Vec(vec) => {\n                TantivyBucketEntries::Vec(vec.into_iter().map(Into::into).collect())\n            }\n            BucketEntries::HashMap(map) => {\n                TantivyBucketEntries::HashMap(map.into_iter().map(|(k, v)| (k, v.into())).collect())\n            }\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub struct RangeBucketEntry {\n    /// The identifier of the bucket.\n    pub key: Key,\n    /// Number of documents in the bucket.\n    pub doc_count: u64,\n    /// Sub-aggregations in this bucket.\n    // here we had a flatten, postcard didn't like that (unknown map size)\n    pub sub_aggregation: AggregationResults,\n    /// The from range of the bucket. Equals `f64::MIN` when `None`.\n    pub from: Option<f64>,\n    /// The to range of the bucket. Equals `f64::MAX` when `None`.\n    pub to: Option<f64>,\n    /// The optional string representation for the `from` range.\n    pub from_as_string: Option<String>,\n    /// The optional string representation for the `to` range.\n    pub to_as_string: Option<String>,\n}\n\nimpl From<TantivyRangeBucketEntry> for RangeBucketEntry {\n    fn from(value: TantivyRangeBucketEntry) -> RangeBucketEntry {\n        RangeBucketEntry {\n            key: value.key.into(),\n            doc_count: value.doc_count,\n            from: value.from,\n            to: value.to,\n            from_as_string: value.from_as_string,\n            to_as_string: value.to_as_string,\n            sub_aggregation: value.sub_aggregation.into(),\n        }\n    }\n}\n\nimpl From<RangeBucketEntry> for TantivyRangeBucketEntry {\n    fn from(value: RangeBucketEntry) -> TantivyRangeBucketEntry {\n        TantivyRangeBucketEntry {\n            key: value.key.into(),\n            doc_count: value.doc_count,\n            from: value.from,\n            to: value.to,\n            from_as_string: value.from_as_string,\n            to_as_string: value.to_as_string,\n            sub_aggregation: 
value.sub_aggregation.into(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub struct BucketEntry {\n    /// The string representation of the bucket.\n    pub key_as_string: Option<String>,\n    /// The identifier of the bucket.\n    pub key: Key,\n    /// Number of documents in the bucket.\n    pub doc_count: u64,\n    /// Sub-aggregations in this bucket.\n    pub sub_aggregation: AggregationResults,\n}\n\nimpl From<TantivyBucketEntry> for BucketEntry {\n    fn from(value: TantivyBucketEntry) -> BucketEntry {\n        BucketEntry {\n            key_as_string: value.key_as_string,\n            key: value.key.into(),\n            doc_count: value.doc_count,\n            sub_aggregation: value.sub_aggregation.into(),\n        }\n    }\n}\n\nimpl From<BucketEntry> for TantivyBucketEntry {\n    fn from(value: BucketEntry) -> TantivyBucketEntry {\n        TantivyBucketEntry {\n            key_as_string: value.key_as_string,\n            key: value.key.into(),\n            doc_count: value.doc_count,\n            sub_aggregation: value.sub_aggregation.into(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub enum Key {\n    /// String key\n    Str(String),\n    /// `i64` key\n    I64(i64),\n    /// `u64` key\n    U64(u64),\n    /// `f64` key\n    F64(f64),\n}\n\nimpl From<TantivyKey> for Key {\n    fn from(value: TantivyKey) -> Key {\n        match value {\n            TantivyKey::Str(s) => Key::Str(s),\n            TantivyKey::I64(i) => Key::I64(i),\n            TantivyKey::U64(u) => Key::U64(u),\n            TantivyKey::F64(f) => Key::F64(f),\n        }\n    }\n}\n\nimpl From<Key> for TantivyKey {\n    fn from(value: Key) -> TantivyKey {\n        match value {\n            Key::Str(s) => TantivyKey::Str(s),\n            Key::I64(i) => TantivyKey::I64(i),\n            Key::U64(u) => TantivyKey::U64(u),\n            Key::F64(f) => TantivyKey::F64(f),\n        }\n    }\n}\n\n/// Single-metric aggregations use this 
common result structure.\n///\n/// Main reason to wrap it in value is to match elasticsearch output structure.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\npub struct PercentilesMetricResult {\n    /// The result of the percentile metric.\n    pub values: PercentileValues,\n}\n\n/// This is the wrapper of percentile entries, which can be vector or hashmap\n/// depending on if it's keyed or not.\n#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]\npub enum PercentileValues {\n    /// Vector format percentile entries\n    Vec(Vec<PercentileValuesVecEntry>),\n    /// HashMap format percentile entries. Key is the serialized percentile\n    // we use a hashmap here because neither key nor value require conversion, almost\n    // all usage of PercentileValues will be direct conversion to TantivyPercentilesValue\n    HashMap(FxHashMap<String, f64>),\n}\n\nimpl From<TantivyPercentilesMetricResult> for PercentilesMetricResult {\n    fn from(value: TantivyPercentilesMetricResult) -> PercentilesMetricResult {\n        let values = match value.values {\n            TantivyPercentileValues::Vec(vec) => PercentileValues::Vec(vec),\n            TantivyPercentileValues::HashMap(map) => PercentileValues::HashMap(map),\n        };\n        PercentilesMetricResult { values }\n    }\n}\n\nimpl From<PercentilesMetricResult> for TantivyPercentilesMetricResult {\n    fn from(value: PercentilesMetricResult) -> TantivyPercentilesMetricResult {\n        let values = match value.values {\n            PercentileValues::Vec(vec) => TantivyPercentileValues::Vec(vec),\n            PercentileValues::HashMap(map) => TantivyPercentileValues::HashMap(map),\n        };\n        TantivyPercentilesMetricResult { values }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/bool_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\nuse serde_with::formats::PreferMany;\nuse serde_with::{DefaultOnNull, OneOrMany, serde_as};\n\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, ElasticQueryDslInner};\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::{self, QueryAst};\n\n/// # Unsupported features\n/// - named queries\n#[serde_as]\n#[derive(Deserialize, Debug, PartialEq, Clone)]\n#[serde(deny_unknown_fields)]\npub struct BoolQuery {\n    #[serde_as(deserialize_as = \"DefaultOnNull<OneOrMany<_, PreferMany>>\")]\n    #[serde(default)]\n    must: Vec<ElasticQueryDslInner>,\n    #[serde_as(deserialize_as = \"DefaultOnNull<OneOrMany<_, PreferMany>>\")]\n    #[serde(default)]\n    must_not: Vec<ElasticQueryDslInner>,\n    #[serde_as(deserialize_as = \"DefaultOnNull<OneOrMany<_, PreferMany>>\")]\n    #[serde(default)]\n    should: Vec<ElasticQueryDslInner>,\n    #[serde_as(deserialize_as = \"DefaultOnNull<OneOrMany<_, PreferMany>>\")]\n    #[serde(default)]\n    filter: Vec<ElasticQueryDslInner>,\n    #[serde(default)]\n    pub boost: Option<NotNaNf32>,\n    #[serde(default)]\n    pub minimum_should_match: Option<MinimumShouldMatch>,\n    #[serde(alias = \"adjust_pure_negative\", default, skip_serializing)]\n    _adjust_pure_negative: Option<serde::de::IgnoredAny>,\n}\n\n// `IgnoredAny` implements `PartialEq` but not `Eq`, so we derive 
`PartialEq`\n// and manually assert `Eq` (safe because `IgnoredAny` is a unit struct).\nimpl Eq for BoolQuery {}\n\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(untagged)]\npub enum MinimumShouldMatch {\n    Str(String),\n    Int(isize),\n}\n\nimpl MinimumShouldMatch {\n    fn resolve(&self, num_should_clauses: usize) -> anyhow::Result<MinimumShouldMatchResolved> {\n        match self {\n            MinimumShouldMatch::Str(minimum_should_match_dsl) => {\n                let Some(percentage) = parse_percentage(minimum_should_match_dsl) else {\n                    anyhow::bail!(\n                        \"Unsupported minimum should match dsl {}. quickwit currently only \\\n                         supports the format '35%' and `-35%`\",\n                        minimum_should_match_dsl\n                    );\n                };\n                let min_should_match = percentage * num_should_clauses as isize / 100;\n                MinimumShouldMatch::Int(min_should_match).resolve(num_should_clauses)\n            }\n            MinimumShouldMatch::Int(neg_num_missing_should_clauses)\n                if *neg_num_missing_should_clauses < 0 =>\n            {\n                let num_missing_should_clauses = -neg_num_missing_should_clauses as usize;\n                if num_missing_should_clauses >= num_should_clauses {\n                    Ok(MinimumShouldMatchResolved::Unspecified)\n                } else {\n                    Ok(MinimumShouldMatchResolved::Min(\n                        num_should_clauses - num_missing_should_clauses,\n                    ))\n                }\n            }\n            MinimumShouldMatch::Int(num_required_should_clauses) => {\n                let num_required_should_clauses: usize = *num_required_should_clauses as usize;\n                if num_required_should_clauses > num_should_clauses {\n                    Ok(MinimumShouldMatchResolved::NoMatch)\n                } else {\n                    
Ok(MinimumShouldMatchResolved::Min(num_required_should_clauses))\n                }\n            }\n        }\n    }\n}\n\n#[derive(Deserialize, Debug, Copy, Clone, Eq, PartialEq)]\nenum MinimumShouldMatchResolved {\n    Unspecified,\n    Min(usize),\n    NoMatch,\n}\n\nfn parse_percentage(s: &str) -> Option<isize> {\n    let percentage_str = s.strip_suffix('%')?;\n    let percentage_isize = percentage_str.parse::<isize>().ok()?;\n    if percentage_isize.abs() > 100 {\n        return None;\n    }\n    Some(percentage_isize)\n}\n\nimpl BoolQuery {\n    fn resolve_minimum_should_match(&self) -> anyhow::Result<MinimumShouldMatchResolved> {\n        let num_should_clauses = self.should.len();\n        let Some(minimum_should_match) = &self.minimum_should_match else {\n            return Ok(MinimumShouldMatchResolved::Unspecified);\n        };\n        minimum_should_match.resolve(num_should_clauses)\n    }\n}\n\nimpl BoolQuery {\n    // Combines a list of children queries into a boolean union.\n    pub(crate) fn union(children: Vec<ElasticQueryDslInner>) -> BoolQuery {\n        BoolQuery {\n            must: Vec::new(),\n            must_not: Vec::new(),\n            should: children,\n            filter: Vec::new(),\n            boost: None,\n            minimum_should_match: None,\n            _adjust_pure_negative: None,\n        }\n    }\n}\n\nfn convert_vec(query_dsls: Vec<ElasticQueryDslInner>) -> anyhow::Result<Vec<QueryAst>> {\n    query_dsls\n        .into_iter()\n        .map(|query_dsl| query_dsl.convert_to_query_ast())\n        .collect()\n}\n\nimpl ConvertibleToQueryAst for BoolQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let minimum_should_match_resolved = self.resolve_minimum_should_match()?;\n        let must = convert_vec(self.must)?;\n        let must_not = convert_vec(self.must_not)?;\n        let should = convert_vec(self.should)?;\n        let filter = convert_vec(self.filter)?;\n\n        let 
minimum_should_match_opt = match minimum_should_match_resolved {\n            MinimumShouldMatchResolved::Unspecified => None,\n            MinimumShouldMatchResolved::Min(minimum_should_match) => Some(minimum_should_match),\n            MinimumShouldMatchResolved::NoMatch => {\n                return Ok(QueryAst::MatchNone);\n            }\n        };\n        let bool_query_ast = query_ast::BoolQuery {\n            must,\n            must_not,\n            should,\n            filter,\n            minimum_should_match: minimum_should_match_opt,\n        };\n        Ok(bool_query_ast.into())\n    }\n}\n\nimpl From<BoolQuery> for ElasticQueryDslInner {\n    fn from(bool_query: BoolQuery) -> Self {\n        ElasticQueryDslInner::Bool(bool_query)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::parse_percentage;\n    use crate::elastic_query_dsl::ConvertibleToQueryAst;\n    use crate::elastic_query_dsl::bool_query::{\n        BoolQuery, MinimumShouldMatch, MinimumShouldMatchResolved,\n    };\n    use crate::elastic_query_dsl::term_query::term_query_from_field_value;\n    use crate::query_ast::QueryAst;\n\n    #[test]\n    fn test_dsl_bool_query_deserialize_simple() {\n        let bool_query_json = r#\"{\n            \"must\": [\n                { \"term\": {\"product_id\": {\"value\": \"1\" }} },\n                { \"term\": {\"product_id\": {\"value\": \"2\" }} }\n            ]\n        }\"#;\n        let bool_query: BoolQuery = serde_json::from_str(bool_query_json).unwrap();\n        assert_eq!(\n            &bool_query,\n            &BoolQuery {\n                must: vec![\n                    term_query_from_field_value(\"product_id\", \"1\").into(),\n                    term_query_from_field_value(\"product_id\", \"2\").into(),\n                ],\n                must_not: Vec::new(),\n                should: Vec::new(),\n                filter: Vec::new(),\n                boost: None,\n                minimum_should_match: None,\n                
_adjust_pure_negative: None,\n            }\n        );\n    }\n\n    #[test]\n    fn test_dsl_query_single() {\n        let bool_query_json = r#\"{\n            \"must\": { \"term\": {\"product_id\": {\"value\": \"1\" }} },\n            \"filter\": { \"term\": {\"product_id\": {\"value\": \"2\" }} }\n        }\"#;\n        let bool_query: BoolQuery = serde_json::from_str(bool_query_json).unwrap();\n        assert_eq!(\n            &bool_query,\n            &BoolQuery {\n                must: vec![term_query_from_field_value(\"product_id\", \"1\").into(),],\n                must_not: Vec::new(),\n                should: Vec::new(),\n                filter: vec![term_query_from_field_value(\"product_id\", \"2\").into(),],\n                boost: None,\n                minimum_should_match: None,\n                _adjust_pure_negative: None,\n            }\n        );\n    }\n\n    #[test]\n    fn test_dsl_query_with_null_values() {\n        let bool_query_json = r#\"{\n            \"must\": null,\n            \"must_not\": null,\n            \"should\": null,\n            \"filter\": null,\n            \"boost\": null\n        }\"#;\n        let bool_query: BoolQuery = serde_json::from_str(bool_query_json).unwrap();\n        assert_eq!(\n            &bool_query,\n            &BoolQuery {\n                must: Vec::new(),\n                must_not: Vec::new(),\n                should: Vec::new(),\n                filter: Vec::new(),\n                boost: None,\n                minimum_should_match: None,\n                _adjust_pure_negative: None,\n            }\n        );\n    }\n\n    #[test]\n    fn test_dsl_bool_query_deserialize_adjust_pure_negative() {\n        let bool_query_json = r#\"{\n            \"must\": [\n                { \"term\": {\"product_id\": {\"value\": \"1\" }} }\n            ],\n            \"adjust_pure_negative\": true\n        }\"#;\n        let bool_query: BoolQuery = serde_json::from_str(bool_query_json).unwrap();\n        
assert!(bool_query._adjust_pure_negative.is_some());\n        assert_eq!(bool_query.must.len(), 1);\n        bool_query.convert_to_query_ast().unwrap();\n    }\n\n    #[test]\n    fn test_dsl_bool_query_deserialize_minimum_should_match() {\n        let bool_query: super::BoolQuery = serde_json::from_str(\n            r#\"{\n            \"must\": [\n                { \"term\": {\"product_id\": {\"value\": \"1\" }} },\n                { \"term\": {\"product_id\": {\"value\": \"2\" }} }\n            ],\n            \"minimum_should_match\": -2\n        }\"#,\n        )\n        .unwrap();\n        assert_eq!(\n            bool_query.minimum_should_match.as_ref().unwrap(),\n            &MinimumShouldMatch::Int(-2)\n        );\n    }\n\n    #[test]\n    fn test_dsl_query_with_minimum_should_match() {\n        let bool_query_json = r#\"{\n                \"should\": [\n                    { \"term\": {\"product_id\": {\"value\": \"1\" }} },\n                    { \"term\": {\"product_id\": {\"value\": \"2\" }} },\n                    { \"term\": {\"product_id\": {\"value\": \"3\" }} }\n                ],\n                \"minimum_should_match\": 2\n            }\"#;\n        let bool_query: BoolQuery = serde_json::from_str(bool_query_json).unwrap();\n        assert_eq!(bool_query.should.len(), 3);\n        assert_eq!(\n            bool_query.minimum_should_match.as_ref().unwrap(),\n            &super::MinimumShouldMatch::Int(2)\n        );\n        let QueryAst::Bool(bool_query_ast) = bool_query.convert_to_query_ast().unwrap() else {\n            panic!();\n        };\n        assert_eq!(bool_query_ast.should.len(), 3);\n        assert_eq!(bool_query_ast.minimum_should_match, Some(2));\n    }\n\n    #[test]\n    fn test_parse_percentage() {\n        assert_eq!(parse_percentage(\"10%\"), Some(10));\n        assert_eq!(parse_percentage(\"101%\"), None);\n        assert_eq!(parse_percentage(\"0%\"), Some(0));\n        assert_eq!(parse_percentage(\"100%\"), Some(100));\n    
    assert_eq!(parse_percentage(\"-20%\"), Some(-20));\n        assert_eq!(parse_percentage(\"20\"), None);\n        assert_eq!(parse_percentage(\"20a%\"), None);\n    }\n\n    #[test]\n    fn test_resolve_minimum_should_match() {\n        assert_eq!(\n            MinimumShouldMatch::Str(\"30%\".to_string())\n                .resolve(10)\n                .unwrap(),\n            MinimumShouldMatchResolved::Min(3)\n        );\n        // A negative percentage is the share of should clauses allowed to be missing.\n        assert_eq!(\n            MinimumShouldMatch::Str(\"-30%\".to_string())\n                .resolve(10)\n                .unwrap(),\n            MinimumShouldMatchResolved::Min(7)\n        );\n        assert!(\n            MinimumShouldMatch::Str(\"-30!\".to_string())\n                .resolve(10)\n                .is_err()\n        );\n        assert_eq!(\n            MinimumShouldMatch::Int(10).resolve(11).unwrap(),\n            MinimumShouldMatchResolved::Min(10)\n        );\n        assert_eq!(\n            MinimumShouldMatch::Int(-10).resolve(11).unwrap(),\n            MinimumShouldMatchResolved::Min(1)\n        );\n        assert_eq!(\n            MinimumShouldMatch::Int(-12).resolve(11).unwrap(),\n            MinimumShouldMatchResolved::Unspecified\n        );\n        assert_eq!(\n            MinimumShouldMatch::Int(12).resolve(11).unwrap(),\n            MinimumShouldMatchResolved::NoMatch\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/exists_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse crate::elastic_query_dsl::ConvertibleToQueryAst;\nuse crate::query_ast::{self, QueryAst};\n\n#[derive(Deserialize, Clone, Eq, PartialEq, Debug)]\npub struct ExistsQuery {\n    field: String,\n}\n\nimpl ConvertibleToQueryAst for ExistsQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        Ok(QueryAst::FieldPresence(query_ast::FieldPresenceQuery {\n            field: self.field,\n        }))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::elastic_query_dsl::exists_query::ExistsQuery;\n\n    #[test]\n    fn test_dsl_exists_query_deserialize_simple() {\n        let exists_query_json = r#\"{\n           \"field\": \"privileged\"\n        }\"#;\n        let exists_query: ExistsQuery = serde_json::from_str(exists_query_json).unwrap();\n        assert_eq!(\n            &exists_query,\n            &ExistsQuery {\n                field: \"privileged\".to_string(),\n            }\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/match_bool_prefix.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse super::{ElasticQueryDslInner, StringOrStructForSerialization};\nuse crate::OneFieldMap;\nuse crate::elastic_query_dsl::match_query::MatchQueryParams;\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, default_max_expansions};\nuse crate::query_ast::{FullTextParams, FullTextQuery, QueryAst};\n\n/// `MatchBoolPrefixQuery` as defined in\n/// <https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-bool-prefix-query.html>\n#[derive(Deserialize, Clone, Eq, PartialEq, Debug)]\n#[serde(from = \"OneFieldMap<StringOrStructForSerialization<MatchQueryParams>>\")]\npub(crate) struct MatchBoolPrefixQuery {\n    pub(crate) field: String,\n    pub(crate) params: MatchQueryParams,\n}\n\nimpl ConvertibleToQueryAst for MatchBoolPrefixQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let full_text_params = FullTextParams {\n            tokenizer: None,\n            mode: crate::query_ast::FullTextMode::BoolPrefix {\n                operator: self.params.operator,\n                max_expansions: default_max_expansions(),\n            },\n            zero_terms_query: self.params.zero_terms_query,\n        };\n        Ok(QueryAst::FullText(FullTextQuery {\n            field: self.field,\n            text: self.params.query,\n            params: full_text_params,\n            
lenient: self.params.lenient,\n        }))\n    }\n}\n\nimpl From<MatchBoolPrefixQuery> for ElasticQueryDslInner {\n    fn from(match_bool_prefix_query: MatchBoolPrefixQuery) -> Self {\n        ElasticQueryDslInner::MatchBoolPrefix(match_bool_prefix_query)\n    }\n}\n\nimpl From<OneFieldMap<StringOrStructForSerialization<MatchQueryParams>>> for MatchBoolPrefixQuery {\n    fn from(\n        match_query_params: OneFieldMap<StringOrStructForSerialization<MatchQueryParams>>,\n    ) -> Self {\n        let OneFieldMap { field, value } = match_query_params;\n        MatchBoolPrefixQuery {\n            field,\n            params: value.inner,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/match_phrase_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse crate::elastic_query_dsl::{\n    ConvertibleToQueryAst, ElasticQueryDslInner, StringOrStructForSerialization,\n};\nuse crate::query_ast::{FullTextMode, FullTextParams, FullTextQuery, QueryAst};\nuse crate::{MatchAllOrNone, OneFieldMap};\n\n/// `MatchPhraseQuery` as defined in\n/// <https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase.html>\n#[derive(Deserialize, Clone, Eq, PartialEq, Debug)]\n#[serde(from = \"OneFieldMap<StringOrStructForSerialization<MatchPhraseQueryParams>>\")]\npub(crate) struct MatchPhraseQuery {\n    pub(crate) field: String,\n    pub(crate) params: MatchPhraseQueryParams,\n}\n\n#[derive(Clone, Deserialize, PartialEq, Eq, Debug)]\n#[serde(deny_unknown_fields)]\npub struct MatchPhraseQueryParams {\n    pub(crate) query: String,\n    #[serde(default)]\n    pub(crate) zero_terms_query: MatchAllOrNone,\n    #[serde(default)]\n    pub(crate) analyzer: Option<String>,\n    #[serde(default)]\n    pub(crate) slop: u32,\n}\n\nimpl ConvertibleToQueryAst for MatchPhraseQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let full_text_params = FullTextParams {\n            tokenizer: self.params.analyzer,\n            mode: FullTextMode::Phrase {\n                slop: self.params.slop,\n            },\n            zero_terms_query: 
self.params.zero_terms_query,\n        };\n        Ok(QueryAst::FullText(FullTextQuery {\n            field: self.field,\n            text: self.params.query,\n            params: full_text_params,\n            lenient: false,\n        }))\n    }\n}\n\nimpl From<MatchPhraseQuery> for ElasticQueryDslInner {\n    fn from(match_phrase_query: MatchPhraseQuery) -> Self {\n        ElasticQueryDslInner::MatchPhrase(match_phrase_query)\n    }\n}\n\nimpl From<OneFieldMap<StringOrStructForSerialization<MatchPhraseQueryParams>>>\n    for MatchPhraseQuery\n{\n    fn from(\n        match_query_params: OneFieldMap<StringOrStructForSerialization<MatchPhraseQueryParams>>,\n    ) -> Self {\n        let OneFieldMap { field, value } = match_query_params;\n        MatchPhraseQuery {\n            field,\n            params: value.inner,\n        }\n    }\n}\n\nimpl From<String> for MatchPhraseQueryParams {\n    fn from(query: String) -> MatchPhraseQueryParams {\n        MatchPhraseQueryParams {\n            query,\n            zero_terms_query: Default::default(),\n            analyzer: None,\n            slop: 0,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_deserialize_match_query_string() {\n        // We accept a single string\n        let match_query: MatchPhraseQuery =\n            serde_json::from_str(r#\"{\"my_field\": \"my_query\"}\"#).unwrap();\n        assert_eq!(match_query.field, \"my_field\");\n        assert_eq!(&match_query.params.query, \"my_query\");\n        assert_eq!(match_query.params.slop, 0u32);\n        assert!(match_query.params.analyzer.is_none());\n        assert_eq!(\n            match_query.params.zero_terms_query,\n            MatchAllOrNone::MatchNone\n        );\n    }\n\n    #[test]\n    fn test_deserialize_match_query_struct() {\n        // We accept a struct too.\n        let match_query: MatchPhraseQuery = serde_json::from_str(\n            r#\"\n            {\"my_field\":\n                {\n       
             \"query\": \"my_query\",\n                    \"slop\": 1\n                }\n            }\n        \"#,\n        )\n        .unwrap();\n        assert_eq!(match_query.field, \"my_field\");\n        assert_eq!(&match_query.params.query, \"my_query\");\n        assert_eq!(match_query.params.slop, 1u32);\n    }\n\n    #[test]\n    fn test_deserialize_match_query_nice_errors() {\n        let deser_error = serde_json::from_str::<MatchPhraseQuery>(\n            r#\"{\"my_field\": {\"query\": \"my_query\", \"wrong_param\": 2}}\"#,\n        )\n        .unwrap_err();\n        assert!(\n            deser_error\n                .to_string()\n                .contains(\"unknown field `wrong_param`\")\n        );\n    }\n\n    #[test]\n    fn test_match_query() {\n        let match_query = MatchPhraseQuery {\n            field: \"body\".to_string(),\n            params: MatchPhraseQueryParams {\n                analyzer: Some(\"whitespace\".to_string()),\n                query: \"hello\".to_string(),\n                slop: 2u32,\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n            },\n        };\n        let ast = match_query.convert_to_query_ast().unwrap();\n        let QueryAst::FullText(FullTextQuery {\n            field,\n            text,\n            params,\n            lenient: _,\n        }) = ast\n        else {\n            panic!()\n        };\n        assert_eq!(field, \"body\");\n        assert_eq!(text, \"hello\");\n        assert_eq!(params.mode, FullTextMode::Phrase { slop: 2u32 });\n        assert_eq!(params.zero_terms_query, MatchAllOrNone::MatchAll);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/match_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse super::LeniencyBool;\nuse crate::elastic_query_dsl::{\n    ConvertibleToQueryAst, ElasticQueryDslInner, StringOrStructForSerialization,\n};\nuse crate::query_ast::{FullTextParams, FullTextQuery, QueryAst};\nuse crate::{BooleanOperand, MatchAllOrNone, OneFieldMap};\n\n/// `MatchQuery` as defined in\n/// <https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html>\n#[derive(Deserialize, Clone, Eq, PartialEq, Debug)]\n#[serde(from = \"OneFieldMap<StringOrStructForSerialization<MatchQueryParams>>\")]\npub struct MatchQuery {\n    pub(crate) field: String,\n    pub(crate) params: MatchQueryParams,\n}\n\n#[derive(Clone, Deserialize, PartialEq, Eq, Debug)]\n#[serde(deny_unknown_fields)]\npub(crate) struct MatchQueryParams {\n    pub(crate) query: String,\n    #[serde(default)]\n    pub(crate) operator: BooleanOperand,\n    #[serde(default)]\n    pub(crate) zero_terms_query: MatchAllOrNone,\n    #[serde(default)]\n    pub(crate) lenient: LeniencyBool,\n}\n\nimpl ConvertibleToQueryAst for MatchQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let full_text_params = FullTextParams {\n            tokenizer: None,\n            mode: self.params.operator.into(),\n            zero_terms_query: self.params.zero_terms_query,\n        };\n        
Ok(QueryAst::FullText(FullTextQuery {\n            field: self.field,\n            text: self.params.query,\n            params: full_text_params,\n            lenient: self.params.lenient,\n        }))\n    }\n}\n\nimpl From<MatchQuery> for ElasticQueryDslInner {\n    fn from(match_query: MatchQuery) -> Self {\n        ElasticQueryDslInner::Match(match_query)\n    }\n}\n\nimpl From<OneFieldMap<StringOrStructForSerialization<MatchQueryParams>>> for MatchQuery {\n    fn from(\n        match_query_params: OneFieldMap<StringOrStructForSerialization<MatchQueryParams>>,\n    ) -> Self {\n        let OneFieldMap { field, value } = match_query_params;\n        MatchQuery {\n            field,\n            params: value.inner,\n        }\n    }\n}\n\nimpl From<String> for MatchQueryParams {\n    fn from(query: String) -> MatchQueryParams {\n        MatchQueryParams {\n            query,\n            zero_terms_query: Default::default(),\n            operator: Default::default(),\n            lenient: false,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::query_ast::FullTextMode;\n\n    #[test]\n    fn test_deserialize_match_query_string() {\n        // We accept a single string\n        let match_query: MatchQuery = serde_json::from_str(r#\"{\"my_field\": \"my_query\"}\"#).unwrap();\n        assert_eq!(match_query.field, \"my_field\");\n        assert_eq!(&match_query.params.query, \"my_query\");\n        assert_eq!(match_query.params.operator, BooleanOperand::Or);\n    }\n\n    #[test]\n    fn test_deserialize_match_query_struct() {\n        // We accept a struct too.\n        let match_query: MatchQuery =\n            serde_json::from_str(r#\"{\"my_field\": {\"query\": \"my_query\", \"operator\": \"AND\"}}\"#)\n                .unwrap();\n        assert_eq!(match_query.field, \"my_field\");\n        assert_eq!(&match_query.params.query, \"my_query\");\n        assert_eq!(match_query.params.operator, BooleanOperand::And);\n    }\n\n    
#[test]\n    fn test_deserialize_match_query_nice_errors() {\n        let deser_error = serde_json::from_str::<MatchQuery>(\n            r#\"{\"my_field\": {\"query\": \"my_query\", \"wrong_param\": 2}}\"#,\n        )\n        .unwrap_err();\n        assert!(\n            deser_error\n                .to_string()\n                .contains(\"unknown field `wrong_param`\")\n        );\n    }\n\n    #[test]\n    fn test_match_query() {\n        let match_query = MatchQuery {\n            field: \"body\".to_string(),\n            params: MatchQueryParams {\n                query: \"hello\".to_string(),\n                operator: BooleanOperand::And,\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n                lenient: false,\n            },\n        };\n        let ast = match_query.convert_to_query_ast().unwrap();\n        let QueryAst::FullText(FullTextQuery {\n            field,\n            text,\n            params,\n            lenient: _,\n        }) = ast\n        else {\n            panic!()\n        };\n        assert_eq!(field, \"body\");\n        assert_eq!(text, \"hello\");\n        assert_eq!(\n            params.mode,\n            FullTextMode::Bool {\n                operator: BooleanOperand::And\n            }\n        );\n        assert_eq!(params.zero_terms_query, MatchAllOrNone::MatchAll);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Serialize};\n\nmod bool_query;\nmod exists_query;\nmod match_bool_prefix;\nmod match_phrase_query;\nmod match_query;\nmod multi_match;\nmod one_field_map;\nmod phrase_prefix_query;\nmod prefix_query;\nmod query_string_query;\nmod range_query;\nmod regex_query;\nmod string_or_struct;\nmod term_query;\nmod terms_query;\nmod wildcard_query;\n\nuse bool_query::BoolQuery;\npub use one_field_map::OneFieldMap;\nuse phrase_prefix_query::MatchPhrasePrefixQuery;\nuse prefix_query::PrefixQuery;\npub(crate) use query_string_query::QueryStringQuery;\nuse range_query::RangeQuery;\npub(crate) use string_or_struct::StringOrStructForSerialization;\nuse term_query::TermQuery;\n\nuse crate::elastic_query_dsl::exists_query::ExistsQuery;\nuse crate::elastic_query_dsl::match_bool_prefix::MatchBoolPrefixQuery;\nuse crate::elastic_query_dsl::match_phrase_query::MatchPhraseQuery;\nuse crate::elastic_query_dsl::match_query::MatchQuery;\nuse crate::elastic_query_dsl::multi_match::MultiMatchQuery;\nuse crate::elastic_query_dsl::regex_query::RegexQuery;\nuse crate::elastic_query_dsl::terms_query::TermsQuery;\nuse crate::elastic_query_dsl::wildcard_query::WildcardQuery;\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::QueryAst;\n\n/// Quickwit and Elasticsearch have different interpretations of leniency:\n/// - In Quickwit, lenient mode 
allows ignoring parts of the query that reference non-existing\n///   columns. This is a behavior that Elasticsearch supports by default.\n/// - In Elasticsearch, lenient mode primarily addresses type errors (such as searching for text in\n///   an integer field). Quickwit always supports this behavior, regardless of the `lenient`\n///   setting.\npub type LeniencyBool = bool;\n\nfn default_max_expansions() -> u32 {\n    50\n}\n\n#[derive(Serialize, Deserialize, Debug, Eq, PartialEq, Clone, Copy, Default)]\n#[serde(deny_unknown_fields)]\npub(crate) struct MatchAllQuery {\n    pub boost: Option<NotNaNf32>,\n}\n\n#[derive(Serialize, Deserialize, Debug, Eq, PartialEq, Clone, Copy)]\npub(crate) struct MatchNoneQuery;\n\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(rename_all = \"snake_case\", deny_unknown_fields)]\npub(crate) enum ElasticQueryDslInner {\n    QueryString(QueryStringQuery),\n    Bool(BoolQuery),\n    Term(TermQuery),\n    Terms(TermsQuery),\n    MatchAll(MatchAllQuery),\n    MatchNone(MatchNoneQuery),\n    Match(MatchQuery),\n    MatchBoolPrefix(MatchBoolPrefixQuery),\n    MatchPhrase(MatchPhraseQuery),\n    MatchPhrasePrefix(MatchPhrasePrefixQuery),\n    MultiMatch(MultiMatchQuery),\n    Range(RangeQuery),\n    Exists(ExistsQuery),\n    Regexp(RegexQuery),\n    Wildcard(WildcardQuery),\n    Prefix(PrefixQuery),\n}\n\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(transparent)]\npub struct ElasticQueryDsl(ElasticQueryDslInner);\n\nimpl TryFrom<ElasticQueryDsl> for QueryAst {\n    type Error = anyhow::Error;\n\n    fn try_from(es_dsl: ElasticQueryDsl) -> anyhow::Result<Self> {\n        es_dsl.0.convert_to_query_ast()\n    }\n}\n\npub(crate) trait ConvertibleToQueryAst {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst>;\n}\n\nimpl ConvertibleToQueryAst for ElasticQueryDslInner {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        match self {\n            
Self::QueryString(query_string_query) => query_string_query.convert_to_query_ast(),\n            Self::Bool(bool_query) => bool_query.convert_to_query_ast(),\n            Self::Term(term_query) => term_query.convert_to_query_ast(),\n            Self::Terms(terms_query) => terms_query.convert_to_query_ast(),\n            Self::MatchAll(match_all_query) => {\n                if let Some(boost) = match_all_query.boost {\n                    Ok(QueryAst::Boost {\n                        boost,\n                        underlying: Box::new(QueryAst::MatchAll),\n                    })\n                } else {\n                    Ok(QueryAst::MatchAll)\n                }\n            }\n            Self::MatchNone(_) => Ok(QueryAst::MatchNone),\n            Self::MatchBoolPrefix(match_bool_prefix_query) => {\n                match_bool_prefix_query.convert_to_query_ast()\n            }\n            Self::MatchPhrase(match_phrase_query) => match_phrase_query.convert_to_query_ast(),\n            Self::MatchPhrasePrefix(match_phrase_prefix) => {\n                match_phrase_prefix.convert_to_query_ast()\n            }\n            Self::Range(range_query) => range_query.convert_to_query_ast(),\n            Self::Match(match_query) => match_query.convert_to_query_ast(),\n            Self::Exists(exists_query) => exists_query.convert_to_query_ast(),\n            Self::MultiMatch(multi_match_query) => multi_match_query.convert_to_query_ast(),\n            Self::Regexp(regex_query) => regex_query.convert_to_query_ast(),\n            Self::Wildcard(wildcard_query) => wildcard_query.convert_to_query_ast(),\n            Self::Prefix(prefix_query) => prefix_query.convert_to_query_ast(),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::elastic_query_dsl::term_query::term_query_from_field_value;\n\n    #[test]\n    fn test_query_dsl_deserialize_simple() {\n        let term_query_json = r#\"{\n            \"term\": {\n                
\"product_id\": { \"value\": \"61809\" }\n            }\n        }\"#;\n        let query_dsl = serde_json::from_str(term_query_json).unwrap();\n        let ElasticQueryDsl(ElasticQueryDslInner::Term(term_query)) = query_dsl else {\n            panic!()\n        };\n        assert_eq!(\n            &term_query,\n            &term_query_from_field_value(\"product_id\", \"61809\")\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/multi_match.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\nuse serde_with::formats::PreferMany;\nuse serde_with::{OneOrMany, serde_as};\n\nuse super::LeniencyBool;\nuse crate::elastic_query_dsl::bool_query::BoolQuery;\nuse crate::elastic_query_dsl::match_bool_prefix::MatchBoolPrefixQuery;\nuse crate::elastic_query_dsl::match_phrase_query::{MatchPhraseQuery, MatchPhraseQueryParams};\nuse crate::elastic_query_dsl::match_query::{MatchQuery, MatchQueryParams};\nuse crate::elastic_query_dsl::phrase_prefix_query::{\n    MatchPhrasePrefixQuery, MatchPhrasePrefixQueryParams,\n};\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, ElasticQueryDslInner};\n\n/// Multi match queries are a bit odd. 
They end up being expanded into another type of query.\n/// In Quickwit, we perform this expansion generically at deserialization time.\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(try_from = \"MultiMatchQueryForDeserialization\")]\npub struct MultiMatchQuery(Box<ElasticQueryDslInner>);\n\n#[serde_as]\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\nstruct MultiMatchQueryForDeserialization {\n    #[serde(rename = \"type\", default)]\n    match_type: MatchType,\n    // `other_parameters` dynamically collects the remaining parameters.\n    // We then expand the query at the JSON level and deserialize the right object.\n    #[serde(flatten)]\n    other_parameters: serde_json::Map<String, serde_json::Value>,\n    #[serde_as(deserialize_as = \"OneOrMany<_, PreferMany>\")]\n    #[serde(default)]\n    fields: Vec<String>,\n    #[serde(default)]\n    lenient: LeniencyBool,\n}\n\nfn deserialize_match_query_for_one_field(\n    match_type: MatchType,\n    field: &str,\n    json_object: serde_json::Map<String, serde_json::Value>,\n) -> serde_json::Result<ElasticQueryDslInner> {\n    let json_val = serde_json::Value::Object(json_object);\n    match match_type {\n        MatchType::Phrase => {\n            let params: MatchPhraseQueryParams = serde_json::from_value(json_val)?;\n            let phrase_query = MatchPhraseQuery {\n                field: field.to_string(),\n                params,\n            };\n            Ok(ElasticQueryDslInner::MatchPhrase(phrase_query))\n        }\n        MatchType::PhrasePrefix => {\n            let phrase_prefix_params: MatchPhrasePrefixQueryParams =\n                serde_json::from_value(json_val)?;\n            let phrase_prefix = MatchPhrasePrefixQuery {\n                field: field.to_string(),\n                value: phrase_prefix_params,\n            };\n            Ok(ElasticQueryDslInner::MatchPhrasePrefix(phrase_prefix))\n        }\n        MatchType::BoolPrefix => {\n            let 
bool_prefix_params: MatchQueryParams = serde_json::from_value(json_val)?;\n            let bool_prefix = MatchBoolPrefixQuery {\n                params: bool_prefix_params,\n                field: field.to_string(),\n            };\n            Ok(ElasticQueryDslInner::MatchBoolPrefix(bool_prefix))\n        }\n        MatchType::MostFields | MatchType::BestFields | MatchType::CrossFields => {\n            let match_query_params: MatchQueryParams = serde_json::from_value(json_val)?;\n            let match_query = MatchQuery {\n                field: field.to_string(),\n                params: match_query_params,\n            };\n            Ok(ElasticQueryDslInner::Match(match_query))\n        }\n    }\n}\n\nfn validate_field_name(field_name: &str) -> Result<(), String> {\n    if field_name.contains('^') {\n        return Err(format!(\n            \"Quickwit does not support field boosting in the multi match query fields (got \\\n             `{field_name}`)\"\n        ));\n    }\n    if field_name.contains('*') {\n        return Err(format!(\n            \"Quickwit does not support wildcards in the multi match query fields (got \\\n             `{field_name}`)\"\n        ));\n    }\n    Ok(())\n}\n\nimpl TryFrom<MultiMatchQueryForDeserialization> for MultiMatchQuery {\n    type Error = serde_json::Error;\n\n    fn try_from(multi_match_query: MultiMatchQueryForDeserialization) -> Result<Self, Self::Error> {\n        if multi_match_query.fields.is_empty() {\n            // TODO: We can use default field from index configuration instead\n            return Err(serde::de::Error::custom(\n                \"Quickwit does not support multi match query with 0 fields. 
MultiMatchQueries \\\n                 must have at least one field.\",\n            ));\n        }\n        for field in &multi_match_query.fields {\n            validate_field_name(field).map_err(serde::de::Error::custom)?;\n        }\n        let mut children = Vec::new();\n        for field in multi_match_query.fields {\n            let child = deserialize_match_query_for_one_field(\n                multi_match_query.match_type,\n                &field,\n                multi_match_query.other_parameters.clone(),\n            )?;\n            children.push(child);\n        }\n        let bool_query = BoolQuery::union(children);\n        Ok(MultiMatchQuery(Box::new(ElasticQueryDslInner::Bool(\n            bool_query,\n        ))))\n    }\n}\n\n#[derive(Deserialize, Debug, Default, Eq, PartialEq, Clone, Copy)]\n#[serde(rename_all = \"snake_case\")]\npub enum MatchType {\n    #[default]\n    MostFields,\n    BestFields,  // Not implemented: treated as MostFields.\n    CrossFields, // Not implemented: treated as MostFields.\n    Phrase,\n    PhrasePrefix,\n    BoolPrefix,\n}\n\nimpl ConvertibleToQueryAst for MultiMatchQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<crate::query_ast::QueryAst> {\n        self.0.convert_to_query_ast()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::elastic_query_dsl::default_max_expansions;\n\n    #[track_caller]\n    fn test_multimatch_query_ok_aux<T: Into<ElasticQueryDslInner>>(json: &str, expected: T) {\n        let expected: ElasticQueryDslInner = expected.into();\n        let multi_match_query: MultiMatchQuery = serde_json::from_str(json).unwrap();\n        let es_query = &*multi_match_query.0;\n        assert_eq!(es_query, &expected);\n    }\n\n    #[track_caller]\n    fn test_multimatch_query_err_aux(json: &str, expected_error_msg: &'static str) {\n        let err_msg: String = serde_json::from_str::<MultiMatchQuery>(json)\n            .unwrap_err()\n            
.to_string();\n        assert!(err_msg.contains(expected_error_msg), \"Got `{err_msg}`\");\n    }\n\n    #[test]\n    fn test_multimatch_query_deserialization() {\n        test_multimatch_query_ok_aux(\n            r#\"{\n                \"query\": \"quick brown fox\",\n                \"type\": \"most_fields\",\n                \"fields\": [\"title\", \"body\"]\n            }\"#,\n            BoolQuery::union(vec![\n                MatchQuery {\n                    field: \"title\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n                MatchQuery {\n                    field: \"body\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n            ]),\n        );\n\n        test_multimatch_query_ok_aux(\n            r#\"{\n            \"query\": \"quick brown fox\",\n            \"type\": \"best_fields\",\n            \"fields\": [\"title\", \"body\"]\n        }\"#,\n            BoolQuery::union(vec![\n                MatchQuery {\n                    field: \"title\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n                
MatchQuery {\n                    field: \"body\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n            ]),\n        );\n\n        test_multimatch_query_ok_aux(\n            r#\"{\n            \"query\": \"quick brown fox\",\n            \"type\": \"cross_fields\",\n            \"fields\": [\"title\", \"body\"]\n        }\"#,\n            BoolQuery::union(vec![\n                MatchQuery {\n                    field: \"title\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n                MatchQuery {\n                    field: \"body\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n            ]),\n        );\n\n        test_multimatch_query_ok_aux(\n            r#\"{\n            \"query\": \"quick brown fox\",\n            \"type\": \"phrase\",\n            \"fields\": [\"title\", \"body\"]\n        }\"#,\n            BoolQuery::union(vec![\n                MatchPhraseQuery {\n                    field: \"title\".to_string(),\n                    params: MatchPhraseQueryParams {\n                        query: \"quick brown 
fox\".to_string(),\n                        zero_terms_query: Default::default(),\n                        analyzer: None,\n                        slop: Default::default(),\n                    },\n                }\n                .into(),\n                MatchPhraseQuery {\n                    field: \"body\".to_string(),\n                    params: MatchPhraseQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        zero_terms_query: Default::default(),\n                        analyzer: None,\n                        slop: Default::default(),\n                    },\n                }\n                .into(),\n            ]),\n        );\n\n        test_multimatch_query_ok_aux(\n            r#\"{\n            \"query\": \"quick brown fox\",\n            \"type\": \"phrase_prefix\",\n            \"fields\": [\"title\", \"body\"]\n        }\"#,\n            BoolQuery::union(vec![\n                MatchPhrasePrefixQuery {\n                    field: \"title\".to_string(),\n                    value: MatchPhrasePrefixQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        analyzer: Default::default(),\n                        max_expansions: default_max_expansions(),\n                        slop: Default::default(),\n                        zero_terms_query: Default::default(),\n                    },\n                }\n                .into(),\n                MatchPhrasePrefixQuery {\n                    field: \"body\".to_string(),\n                    value: MatchPhrasePrefixQueryParams {\n                        query: \"quick brown fox\".to_string(),\n                        analyzer: Default::default(),\n                        max_expansions: default_max_expansions(),\n                        slop: Default::default(),\n                        zero_terms_query: Default::default(),\n                    },\n                }\n                .into(),\n          
  ]),\n        );\n\n        test_multimatch_query_ok_aux(\n            r#\"{\n            \"query\": \"quick brown\",\n            \"type\": \"bool_prefix\",\n            \"fields\": [\"title\", \"body\"]\n        }\"#,\n            BoolQuery::union(vec![\n                MatchBoolPrefixQuery {\n                    field: \"title\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n                MatchBoolPrefixQuery {\n                    field: \"body\".to_string(),\n                    params: MatchQueryParams {\n                        query: \"quick brown\".to_string(),\n                        operator: crate::BooleanOperand::Or,\n                        zero_terms_query: Default::default(),\n                        lenient: false,\n                    },\n                }\n                .into(),\n            ]),\n        );\n    }\n\n    #[test]\n    fn test_multimatch_unsupported() {\n        test_multimatch_query_err_aux(\n            r#\"{\n                \"query\": \"quick brown fox\",\n                \"type\": \"most_fields\",\n                \"fields\": [\"body\", \"body.*\"]\n            }\"#,\n            \"Quickwit does not support wildcards\",\n        );\n        test_multimatch_query_err_aux(\n            r#\"{\n                \"query\": \"quick brown fox\",\n                \"type\": \"most_fields\",\n                \"fields\": [\"body\", \"title^3\"]\n            }\"#,\n            \"Quickwit does not support field boosting\",\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/one_field_map.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::marker::PhantomData;\n\nuse serde::de::Visitor;\nuse serde::ser::SerializeMap;\nuse serde::{Deserialize, Serialize};\n\n/// Helper to serialize/deserialize `{\"my_field\": {..}}` object\n/// often present in Elasticsearch DSL.\n#[derive(PartialEq, Eq, Debug, Clone)]\npub struct OneFieldMap<V> {\n    pub field: String,\n    pub value: V,\n}\n\nimpl<V: Serialize> Serialize for OneFieldMap<V> {\n    fn serialize<S: serde::Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n        let mut map = serializer.serialize_map(Some(1))?;\n        map.serialize_entry(&self.field, &self.value)?;\n        map.end()\n    }\n}\n\nstruct OneFieldMapVisitor<V> {\n    _data: PhantomData<V>,\n}\n\nimpl<'de, V: Deserialize<'de>> Visitor<'de> for OneFieldMapVisitor<V> {\n    type Value = OneFieldMap<V>;\n\n    fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(f, \"Expected a map with a single field.\")\n    }\n\n    fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>\n    where A: serde::de::MapAccess<'de> {\n        if let Some(num_keys) = map.size_hint()\n            && num_keys != 1\n        {\n            return Err(serde::de::Error::custom(format!(\n                \"expected a single field. 
got {num_keys}\"\n            )));\n        }\n        let Some((key, val)) = map.next_entry()? else {\n            return Err(serde::de::Error::custom(\n                \"expected a single field. got none\",\n            ));\n        };\n        if let Some(second_key) = map.next_key::<String>()? {\n            return Err(serde::de::Error::custom(format!(\n                \"expected a single field. got several ({key}, {second_key}, ...)\"\n            )));\n        }\n        Ok(OneFieldMap {\n            field: key,\n            value: val,\n        })\n    }\n}\n\nimpl<'de, V: Deserialize<'de>> Deserialize<'de> for OneFieldMap<V> {\n    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\n    where D: serde::Deserializer<'de> {\n        deserializer.deserialize_map(OneFieldMapVisitor {\n            _data: Default::default(),\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use serde::{Deserialize, Serialize};\n\n    use crate::OneFieldMap;\n    #[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\n    struct Property {\n        count: usize,\n    }\n\n    #[test]\n    fn test_one_field_hash_map_simple() {\n        let one_field_map = OneFieldMap {\n            field: \"my-field\".to_string(),\n            value: Property { count: 2 },\n        };\n        let json = serde_json::to_value(one_field_map).unwrap();\n        assert_eq!(&json, &serde_json::json!({\"my-field\": {\"count\": 2}}));\n        let deser_ser = serde_json::from_value::<OneFieldMap<Property>>(json).unwrap();\n        assert_eq!(deser_ser.field.as_str(), \"my-field\");\n        assert_eq!(deser_ser.value.count, 2);\n    }\n\n    #[test]\n    fn test_one_field_hash_map_deserialize_error_too_many_fields() {\n        let deser: serde_json::Result<OneFieldMap<Property>> =\n            serde_json::from_value(serde_json::json!({\n                \"my-field\": {\"count\": 2},\n                \"my-field2\": {\"count\": 2}\n            }));\n        let deser_err = 
deser.unwrap_err();\n        assert_eq!(deser_err.to_string(), \"expected a single field. got 2\");\n    }\n\n    #[test]\n    fn test_one_field_hash_map_deserialize_error_no_fields() {\n        let deser: serde_json::Result<OneFieldMap<Property>> =\n            serde_json::from_value(serde_json::json!({}));\n        let deser_err = deser.unwrap_err();\n        assert_eq!(deser_err.to_string(), \"expected a single field. got 0\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/phrase_prefix_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse crate::MatchAllOrNone;\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::elastic_query_dsl::{\n    ConvertibleToQueryAst, ElasticQueryDslInner, default_max_expansions,\n};\nuse crate::query_ast::{self, FullTextMode, FullTextParams, QueryAst};\n\npub(crate) type MatchPhrasePrefixQuery = OneFieldMap<MatchPhrasePrefixQueryParams>;\n\n#[derive(PartialEq, Eq, Debug, Deserialize, Clone)]\n#[serde(deny_unknown_fields)]\npub(crate) struct MatchPhrasePrefixQueryParams {\n    pub query: String,\n    #[serde(default)]\n    pub analyzer: Option<String>,\n    #[serde(default = \"default_max_expansions\")]\n    pub max_expansions: u32,\n    #[serde(default)]\n    pub slop: u32,\n    #[serde(default, skip_serializing_if = \"MatchAllOrNone::is_none\")]\n    pub zero_terms_query: MatchAllOrNone,\n}\n\nimpl From<MatchPhrasePrefixQuery> for ElasticQueryDslInner {\n    fn from(term_query: MatchPhrasePrefixQuery) -> Self {\n        Self::MatchPhrasePrefix(term_query)\n    }\n}\n\nimpl ConvertibleToQueryAst for MatchPhrasePrefixQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let MatchPhrasePrefixQueryParams {\n            query,\n            analyzer,\n            max_expansions,\n            slop,\n            zero_terms_query,\n        } = self.value;\n        let analyzer = 
FullTextParams {\n            tokenizer: analyzer,\n            mode: FullTextMode::Phrase { slop },\n            zero_terms_query,\n        };\n        let phrase_prefix_query_ast = query_ast::PhrasePrefixQuery {\n            field: self.field,\n            phrase: query,\n            params: analyzer,\n            max_expansions,\n            lenient: false,\n        };\n        Ok(phrase_prefix_query_ast.into())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::{MatchAllOrNone, MatchPhrasePrefixQuery, MatchPhrasePrefixQueryParams};\n\n    #[test]\n    fn test_phrase_prefix_query_simple() {\n        let phrase_prefix_json = r#\"{ \"message\": { \"query\": \"quick brown f\" } }\"#;\n        let phrase_prefix: MatchPhrasePrefixQuery =\n            serde_json::from_str(phrase_prefix_json).unwrap();\n        let expected = MatchPhrasePrefixQuery {\n            field: \"message\".to_string(),\n            value: MatchPhrasePrefixQueryParams {\n                query: \"quick brown f\".to_string(),\n                analyzer: None,\n                max_expansions: 50,\n                slop: 0,\n                zero_terms_query: MatchAllOrNone::MatchNone,\n            },\n        };\n\n        assert_eq!(&phrase_prefix, &expected);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/prefix_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, StringOrStructForSerialization};\nuse crate::query_ast::{QueryAst, WildcardQuery as AstWildcardQuery};\n\n#[derive(Deserialize, Clone, Eq, PartialEq, Debug)]\n#[serde(from = \"OneFieldMap<StringOrStructForSerialization<PrefixQueryParams>>\")]\npub(crate) struct PrefixQuery {\n    pub(crate) field: String,\n    pub(crate) params: PrefixQueryParams,\n}\n\n#[derive(Deserialize, Debug, Default, Eq, PartialEq, Clone)]\n#[serde(deny_unknown_fields)]\npub struct PrefixQueryParams {\n    value: String,\n    #[serde(default)]\n    case_insensitive: bool,\n}\n\nimpl ConvertibleToQueryAst for PrefixQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let wildcard = format!(\n            \"{}*\",\n            self.params\n                .value\n                .replace(r\"\\\", r\"\\\\\")\n                .replace(\"*\", r\"\\*\")\n                .replace(\"?\", r\"\\?\")\n        );\n        Ok(AstWildcardQuery {\n            field: self.field,\n            value: wildcard,\n            lenient: true,\n            case_insensitive: self.params.case_insensitive,\n        }\n        .into())\n    }\n}\n\nimpl From<OneFieldMap<StringOrStructForSerialization<PrefixQueryParams>>> for 
PrefixQuery {\n    fn from(\n        match_query_params: OneFieldMap<StringOrStructForSerialization<PrefixQueryParams>>,\n    ) -> Self {\n        let OneFieldMap { field, value } = match_query_params;\n        PrefixQuery {\n            field,\n            params: value.inner,\n        }\n    }\n}\n\nimpl From<String> for PrefixQueryParams {\n    fn from(value: String) -> PrefixQueryParams {\n        PrefixQueryParams {\n            value,\n            case_insensitive: false,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_prefix_query_convert_to_query_ast() {\n        let prefix_query_json = r#\"{\n            \"user_name\": {\n                \"value\": \"john\"\n            }\n        }\"#;\n        let prefix_query: PrefixQuery = serde_json::from_str(prefix_query_json).unwrap();\n        let query_ast = prefix_query.convert_to_query_ast().unwrap();\n\n        if let QueryAst::Wildcard(prefix) = query_ast {\n            assert_eq!(prefix.field, \"user_name\");\n            assert_eq!(prefix.value, \"john*\");\n            assert!(prefix.lenient);\n        } else {\n            panic!(\"Expected QueryAst::Wildcard, got {:?}\", query_ast);\n        }\n    }\n\n    #[test]\n    fn test_prefix_query_convert_to_query_ast_special_chars() {\n        let prefix_query_json = r#\"{\n            \"user_name\": {\n                \"value\": \"a\\\\dm?n*\"\n            }\n        }\"#;\n        let prefix_query: PrefixQuery = serde_json::from_str(prefix_query_json).unwrap();\n        let query_ast = prefix_query.convert_to_query_ast().unwrap();\n\n        if let QueryAst::Wildcard(prefix) = query_ast {\n            assert_eq!(prefix.field, \"user_name\");\n            assert_eq!(prefix.value, r\"a\\\\dm\\?n\\**\");\n            assert!(prefix.lenient);\n        } else {\n            panic!(\"Expected QueryAst::Wildcard, got {:?}\", query_ast);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/query_string_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse super::LeniencyBool;\nuse crate::BooleanOperand;\nuse crate::elastic_query_dsl::ConvertibleToQueryAst;\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::UserInputQuery;\n\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(deny_unknown_fields)]\npub(crate) struct QueryStringQuery {\n    query: String,\n    /// Limitation. 
We do not support * at the moment.\n    /// We do not support JSON field either.\n    ///\n    /// Note that following elastic, we do not support \"string\" and require an array here.\n    #[serde(default)]\n    fields: Option<Vec<String>>,\n    #[serde(default)]\n    default_field: Option<String>,\n    #[serde(default)]\n    default_operator: BooleanOperand,\n    #[serde(default)]\n    boost: Option<NotNaNf32>,\n    #[serde(default)]\n    lenient: LeniencyBool,\n}\n\nimpl ConvertibleToQueryAst for QueryStringQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<crate::query_ast::QueryAst> {\n        if self.default_field.is_some() && self.fields.is_some() {\n            anyhow::bail!(\"fields and default_field cannot be both set in `query_string` queries\");\n        }\n        let default_fields: Option<Vec<String>> = self\n            .default_field\n            .map(|default_field| vec![default_field])\n            .or(self.fields);\n        let user_text_query = UserInputQuery {\n            user_text: self.query,\n            default_fields,\n            default_operator: self.default_operator,\n            lenient: self.lenient,\n        };\n        Ok(user_text_query.into())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::BooleanOperand;\n    use crate::elastic_query_dsl::{ConvertibleToQueryAst, QueryStringQuery};\n    use crate::query_ast::{QueryAst, UserInputQuery};\n\n    #[test]\n    fn test_build_query_string_query_with_fields_non_empty() {\n        let query_string_query = crate::elastic_query_dsl::QueryStringQuery {\n            query: \"hello world\".to_string(),\n            fields: Some(vec![\"hello\".to_string()]),\n            default_operator: crate::BooleanOperand::Or,\n            default_field: None,\n            boost: None,\n            lenient: false,\n        };\n        let QueryAst::UserInput(user_input_query) =\n            query_string_query.convert_to_query_ast().unwrap()\n        else {\n            panic!();\n        
};\n        assert_eq!(user_input_query.default_operator, BooleanOperand::Or);\n        assert_eq!(\n            user_input_query.default_fields.unwrap(),\n            vec![\"hello\".to_string()]\n        );\n    }\n\n    #[test]\n    fn test_build_query_string_query_with_default_field_non_empty() {\n        let query_string_query = crate::elastic_query_dsl::QueryStringQuery {\n            query: \"hello world\".to_string(),\n            fields: None,\n            default_operator: crate::BooleanOperand::Or,\n            default_field: Some(\"hello\".to_string()),\n            boost: None,\n            lenient: false,\n        };\n        let QueryAst::UserInput(user_input_query) =\n            query_string_query.convert_to_query_ast().unwrap()\n        else {\n            panic!();\n        };\n        assert_eq!(user_input_query.default_operator, BooleanOperand::Or);\n        assert_eq!(\n            user_input_query.default_fields.unwrap(),\n            vec![\"hello\".to_string()]\n        );\n    }\n\n    #[test]\n    fn test_build_query_string_query_with_both_default_fields_and_field_yield_an_error() {\n        let query_string_query = crate::elastic_query_dsl::QueryStringQuery {\n            query: \"hello world\".to_string(),\n            fields: Some(vec![\"hello\".to_string()]),\n            default_operator: crate::BooleanOperand::Or,\n            default_field: Some(\"hello\".to_string()),\n            boost: None,\n            lenient: false,\n        };\n        let err_msg = query_string_query\n            .convert_to_query_ast()\n            .unwrap_err()\n            .to_string();\n        assert!(err_msg.contains(\"cannot be both set\"));\n    }\n\n    #[test]\n    fn test_build_query_string_query_with_default_operand_and() {\n        let query_string_query = crate::elastic_query_dsl::QueryStringQuery {\n            query: \"hello world\".to_string(),\n            fields: Some(Vec::new()),\n            default_field: None,\n            
default_operator: crate::BooleanOperand::And,\n            boost: None,\n            lenient: false,\n        };\n        let QueryAst::UserInput(user_input_query) =\n            query_string_query.convert_to_query_ast().unwrap()\n        else {\n            panic!();\n        };\n        assert_eq!(user_input_query.default_operator, BooleanOperand::And);\n    }\n\n    #[test]\n    fn test_build_query_string_query_with_empty_default_field() {\n        let query_string_query = crate::elastic_query_dsl::QueryStringQuery {\n            query: \"hello world\".to_string(),\n            fields: Some(Vec::new()),\n            default_field: None,\n            default_operator: crate::BooleanOperand::Or,\n            boost: None,\n            lenient: false,\n        };\n        let QueryAst::UserInput(user_input_query) =\n            query_string_query.convert_to_query_ast().unwrap()\n        else {\n            panic!();\n        };\n        assert_eq!(user_input_query.default_operator, BooleanOperand::Or);\n        assert!(user_input_query.default_fields.unwrap().is_empty());\n    }\n\n    #[test]\n    fn test_build_query_string_query_no_default_fields() {\n        let query_string_query = crate::elastic_query_dsl::QueryStringQuery {\n            query: \"hello world\".to_string(),\n            fields: None,\n            default_field: None,\n            default_operator: crate::BooleanOperand::Or,\n            boost: None,\n            lenient: false,\n        };\n        let QueryAst::UserInput(user_input_query) =\n            query_string_query.convert_to_query_ast().unwrap()\n        else {\n            panic!();\n        };\n        assert!(user_input_query.default_fields.is_none());\n    }\n\n    #[test]\n    fn test_build_query_string_default_operator() {\n        let query_string_query: QueryStringQuery =\n            serde_json::from_str(r#\"{ \"query\": \"hello world\", \"fields\": [\"text\"] }\"#).unwrap();\n        // By default the default operator is OR in 
elasticsearch and opensearch.\n        assert_eq!(query_string_query.default_operator, BooleanOperand::Or);\n        assert_eq!(query_string_query.fields, Some(vec![\"text\".to_string()]));\n        assert_eq!(&query_string_query.query, \"hello world\");\n        assert_eq!(query_string_query.boost, None);\n        let query_ast: QueryAst = query_string_query.convert_to_query_ast().unwrap();\n        assert!(matches!(query_ast, QueryAst::UserInput(UserInputQuery {\n            user_text,\n            default_fields,\n            default_operator,\n            lenient: _,\n        }) if user_text == \"hello world\"\n            && default_operator == BooleanOperand::Or\n            && default_fields == Some(vec![\"text\".to_string()])));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/range_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Bound;\n\nuse quickwit_datetime::StrptimeParser;\nuse serde::Deserialize;\nuse time::format_description::well_known::Rfc3339;\n\nuse crate::JsonLiteral;\nuse crate::elastic_query_dsl::ConvertibleToQueryAst;\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::QueryAst;\n\n#[derive(Deserialize, Debug, Default, Eq, PartialEq, Clone)]\n#[serde(deny_unknown_fields)]\npub struct RangeQueryParams {\n    #[serde(default)]\n    gt: Option<JsonLiteral>,\n    #[serde(default)]\n    gte: Option<JsonLiteral>,\n    #[serde(default)]\n    lt: Option<JsonLiteral>,\n    #[serde(default)]\n    lte: Option<JsonLiteral>,\n    #[serde(default)]\n    boost: Option<NotNaNf32>,\n    #[serde(default)]\n    format: Option<JsonLiteral>,\n    #[serde(default)]\n    from: Option<JsonLiteral>,\n    #[serde(default)]\n    to: Option<JsonLiteral>,\n    #[serde(default)]\n    include_lower: Option<bool>,\n    #[serde(default)]\n    include_upper: Option<bool>,\n}\n\npub type RangeQuery = OneFieldMap<RangeQueryParams>;\n\nimpl ConvertibleToQueryAst for RangeQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let field = self.field;\n        let RangeQueryParams {\n            gt,\n            gte,\n            lt,\n            lte,\n            boost,\n            
format,\n            from,\n            to,\n            include_lower,\n            include_upper,\n        } = self.value;\n\n        let (mut gt, mut gte, mut lt, mut lte) = (gt, gte, lt, lte);\n        if let Some(from_val) = from\n            && gt.is_none()\n            && gte.is_none()\n        {\n            if include_lower.unwrap_or(true) {\n                gte = Some(from_val);\n            } else {\n                gt = Some(from_val);\n            }\n        }\n        if let Some(to_val) = to\n            && lt.is_none()\n            && lte.is_none()\n        {\n            if include_upper.unwrap_or(true) {\n                lte = Some(to_val);\n            } else {\n                lt = Some(to_val);\n            }\n        }\n\n        let (gt, gte, lt, lte) = if let Some(JsonLiteral::String(java_date_format)) = format {\n            let parser = StrptimeParser::from_java_datetime_format(&java_date_format)\n                .map_err(|err| anyhow::anyhow!(\"failed to parse range query date format. 
{err}\"))?;\n            (\n                gt.map(|v| parse_and_convert(v, &parser)).transpose()?,\n                gte.map(|v| parse_and_convert(v, &parser)).transpose()?,\n                lt.map(|v| parse_and_convert(v, &parser)).transpose()?,\n                lte.map(|v| parse_and_convert(v, &parser)).transpose()?,\n            )\n        } else {\n            (gt, gte, lt, lte)\n        };\n\n        let range_query_ast = crate::query_ast::RangeQuery {\n            field,\n            lower_bound: match (gt, gte) {\n                (Some(_gt), Some(_gte)) => {\n                    anyhow::bail!(\"both gt and gte are set\")\n                }\n                (Some(gt), None) => Bound::Excluded(gt),\n                (None, Some(gte)) => Bound::Included(gte),\n                (None, None) => Bound::Unbounded,\n            },\n            upper_bound: match (lt, lte) {\n                (Some(_lt), Some(_lte)) => {\n                    anyhow::bail!(\"both lt and lte are set\")\n                }\n                (Some(lt), None) => Bound::Excluded(lt),\n                (None, Some(lte)) => Bound::Included(lte),\n                (None, None) => Bound::Unbounded,\n            },\n        };\n        let ast: QueryAst = range_query_ast.into();\n        Ok(ast.boost(boost))\n    }\n}\n\nfn parse_and_convert(literal: JsonLiteral, parser: &StrptimeParser) -> anyhow::Result<JsonLiteral> {\n    if let JsonLiteral::String(date_time_str) = literal {\n        let parsed_date_time = parser\n            .parse_date_time(&date_time_str)\n            .map_err(|reason| anyhow::anyhow!(\"Failed to parse date time: {}\", reason))?;\n        let parsed_date_time_rfc3339 = parsed_date_time.format(&Rfc3339)?;\n        Ok(JsonLiteral::String(parsed_date_time_rfc3339))\n    } else {\n        Ok(literal)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::Bound;\n\n    use super::{RangeQuery as ElasticRangeQuery, RangeQueryParams as ElasticRangeQueryParams};\n    use 
crate::JsonLiteral;\n    use crate::elastic_query_dsl::ConvertibleToQueryAst;\n    use crate::query_ast::{QueryAst, RangeQuery};\n\n    #[test]\n    fn test_date_range_query_with_format() {\n        let range_query_params = ElasticRangeQueryParams {\n            gt: Some(JsonLiteral::String(\"2021-01-03T13:32:43\".to_string())),\n            gte: None,\n            lt: None,\n            lte: None,\n            boost: None,\n            format: JsonLiteral::String(\"yyyy-MM-dd['T'HH:mm:ss]\".to_string()).into(),\n            ..Default::default()\n        };\n        let range_query: ElasticRangeQuery = ElasticRangeQuery {\n            field: \"date\".to_string(),\n            value: range_query_params,\n        };\n        let range_query_ast = range_query.convert_to_query_ast().unwrap();\n        assert!(matches!(\n            range_query_ast,\n            QueryAst::Range(RangeQuery {\n                field,\n                lower_bound: Bound::Excluded(lower_bound),\n                upper_bound: Bound::Unbounded,\n            })\n            if field == \"date\" && lower_bound == JsonLiteral::String(\"2021-01-03T13:32:43Z\".to_string())\n        ));\n    }\n\n    fn into_json_number(n: u64) -> JsonLiteral {\n        JsonLiteral::Number(serde_json::Number::from(n))\n    }\n\n    #[test]\n    fn test_range_query_with_from_to_inclusive() {\n        let range_json =\n            r#\"{\"score\": {\"from\": 50, \"to\": 100, \"include_lower\": true, \"include_upper\": true}}\"#;\n        let range_query: ElasticRangeQuery = serde_json::from_str(range_json).unwrap();\n        let ast = range_query.convert_to_query_ast().unwrap();\n        let QueryAst::Range(rq) = ast else {\n            panic!(\"expected Range, got {ast:?}\");\n        };\n        assert_eq!(rq.field, \"score\");\n        assert_eq!(rq.lower_bound, Bound::Included(into_json_number(50)));\n        assert_eq!(rq.upper_bound, Bound::Included(into_json_number(100)));\n    }\n\n    #[test]\n    fn 
test_range_query_with_from_to_exclusive() {\n        let range_json =\n            r#\"{\"score\": {\"from\": 50, \"to\": 100, \"include_lower\": false, \"include_upper\": false}}\"#;\n        let range_query: ElasticRangeQuery = serde_json::from_str(range_json).unwrap();\n        let ast = range_query.convert_to_query_ast().unwrap();\n        let QueryAst::Range(rq) = ast else {\n            panic!(\"expected Range, got {ast:?}\");\n        };\n        assert_eq!(rq.field, \"score\");\n        assert_eq!(rq.lower_bound, Bound::Excluded(into_json_number(50)));\n        assert_eq!(rq.upper_bound, Bound::Excluded(into_json_number(100)));\n    }\n\n    #[test]\n    fn test_range_query_with_from_to_defaults() {\n        let range_json = r#\"{\"score\": {\"from\": 50, \"to\": 100}}\"#;\n        let range_query: ElasticRangeQuery = serde_json::from_str(range_json).unwrap();\n        let ast = range_query.convert_to_query_ast().unwrap();\n        let QueryAst::Range(rq) = ast else {\n            panic!(\"expected Range, got {ast:?}\");\n        };\n        assert_eq!(rq.field, \"score\");\n        assert_eq!(rq.lower_bound, Bound::Included(into_json_number(50)));\n        assert_eq!(rq.upper_bound, Bound::Included(into_json_number(100)));\n    }\n\n    #[test]\n    fn test_date_range_query_with_strict_date_optional_time_format() {\n        let range_query_params = ElasticRangeQueryParams {\n            gt: None,\n            gte: None,\n            lt: None,\n            lte: Some(JsonLiteral::String(\"2024-09-28T10:22:55.797Z\".to_string())),\n            boost: None,\n            format: JsonLiteral::String(\"strict_date_optional_time\".to_string()).into(),\n            ..Default::default()\n        };\n        let range_query: ElasticRangeQuery = ElasticRangeQuery {\n            field: \"timestamp\".to_string(),\n            value: range_query_params,\n        };\n        let range_query_ast = range_query.convert_to_query_ast().unwrap();\n        assert!(matches!(\n    
        range_query_ast,\n            QueryAst::Range(RangeQuery {\n                field,\n                lower_bound: Bound::Unbounded,\n                upper_bound: Bound::Included(upper_bound),\n            })\n            if field == \"timestamp\" && upper_bound == JsonLiteral::String(\"2024-09-28T10:22:55.797Z\".to_string())\n        ));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/regex_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse crate::elastic_query_dsl::ConvertibleToQueryAst;\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::query_ast::{QueryAst, RegexQuery as AstRegexQuery};\n\n/// Elasticsearch supports two formats for regexp queries:\n/// - Shorthand: `{\"regexp\": {\"field\": \"pattern\"}}`\n/// - Full:      `{\"regexp\": {\"field\": {\"value\": \"pattern\", \"case_insensitive\": true}}}`\n#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(untagged)]\npub enum RegexQueryParams {\n    Full {\n        #[serde(rename = \"value\")]\n        pattern: String,\n        #[serde(default)]\n        case_insensitive: bool,\n    },\n    Shorthand(String),\n}\n\nimpl RegexQueryParams {\n    fn into_tuple(self) -> (String, bool) {\n        match self {\n            RegexQueryParams::Full {\n                pattern,\n                case_insensitive,\n            } => (pattern, case_insensitive),\n            RegexQueryParams::Shorthand(pattern) => (pattern, false),\n        }\n    }\n}\n\npub type RegexQuery = OneFieldMap<RegexQueryParams>;\n\nimpl ConvertibleToQueryAst for RegexQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let (pattern, case_insensitive) = self.value.into_tuple();\n\n        let regex = if case_insensitive {\n            format!(\"(?i){pattern}\")\n        } else 
{\n            pattern\n        };\n        Ok(AstRegexQuery {\n            field: self.field,\n            regex,\n        }\n        .into())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_regex_query_shorthand_format() {\n        let json = serde_json::json!({\"service\": \".*logs.*\"});\n        let query: RegexQuery = serde_json::from_value(json).unwrap();\n        assert_eq!(query.field, \"service\");\n        let (pattern, case_insensitive) = query.value.into_tuple();\n        assert_eq!(pattern, \".*logs.*\");\n        assert!(!case_insensitive);\n    }\n\n    #[test]\n    fn test_regex_query_full_format() {\n        let json = serde_json::json!({\"service\": {\"value\": \".*logs.*\", \"case_insensitive\": true}});\n        let query: RegexQuery = serde_json::from_value(json).unwrap();\n        assert_eq!(query.field, \"service\");\n        let (pattern, case_insensitive) = query.value.into_tuple();\n        assert_eq!(pattern, \".*logs.*\");\n        assert!(case_insensitive);\n    }\n\n    #[test]\n    fn test_regex_query_full_format_default_case() {\n        let json = serde_json::json!({\"service\": {\"value\": \".*logs.*\"}});\n        let query: RegexQuery = serde_json::from_value(json).unwrap();\n        assert_eq!(query.field, \"service\");\n        let (pattern, case_insensitive) = query.value.into_tuple();\n        assert_eq!(pattern, \".*logs.*\");\n        assert!(!case_insensitive);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/string_or_struct.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::marker::PhantomData;\n\nuse serde::de::{MapAccess, Visitor};\nuse serde::{Deserialize, Deserializer, de};\n\n/// The point of `StringOrStructForSerialization` is to support\n/// the two following formats for various queries.\n///\n/// `{\"field\": {\"query\": \"my query\", \"default_operator\": \"OR\"}}`\n///\n/// and the shorter.\n/// `{\"field\": \"my query\"}`\n///\n/// If a integer is passed, we cast it to string. 
Floats are not supported.\n///\n/// We don't use an untagged enum to support this, in order to keep good errors.\n///\n/// The code below is adapted from the solution described here: <https://serde.rs/string-or-struct.html>\n#[derive(Deserialize)]\n#[serde(transparent)]\npub(crate) struct StringOrStructForSerialization<T>\nwhere\n    T: From<String>,\n    for<'de2> T: Deserialize<'de2>,\n{\n    #[serde(deserialize_with = \"string_or_struct\")]\n    pub inner: T,\n}\n\nstruct StringOrStructVisitor<T> {\n    phantom_data: PhantomData<T>,\n}\n\nfn string_or_struct<'de, D, T>(deserializer: D) -> Result<T, D::Error>\nwhere\n    D: Deserializer<'de>,\n    T: From<String> + Deserialize<'de>,\n{\n    deserializer.deserialize_any(StringOrStructVisitor {\n        phantom_data: Default::default(),\n    })\n}\n\nimpl<'de, T> Visitor<'de> for StringOrStructVisitor<T>\nwhere\n    T: From<String>,\n    T: Deserialize<'de>,\n{\n    type Value = T;\n\n    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        let type_str = std::any::type_name::<T>();\n        formatter.write_str(&format!(\"string or map to deserialize {type_str}.\"))\n    }\n\n    fn visit_i64<E>(self, v: i64) -> Result<Self::Value, E>\n    where E: de::Error {\n        self.visit_str(&v.to_string())\n    }\n\n    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>\n    where E: de::Error {\n        self.visit_str(&v.to_string())\n    }\n\n    fn visit_str<E>(self, query: &str) -> Result<Self::Value, E>\n    where E: serde::de::Error {\n        Ok(T::from(query.to_string()))\n    }\n\n    fn visit_map<M>(self, map: M) -> Result<T, M::Error>\n    where M: MapAccess<'de> {\n        Deserialize::deserialize(de::value::MapAccessDeserializer::new(map))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/term_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Deserializer, Serialize};\n\nuse super::StringOrStructForSerialization;\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, ElasticQueryDslInner};\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::{self, QueryAst};\n\n#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]\n#[serde(from = \"OneFieldMap<StringOrStructForSerialization<TermQueryParams>>\")]\npub struct TermQuery {\n    pub field: String,\n    pub value: TermQueryParams,\n}\n\nimpl From<OneFieldMap<StringOrStructForSerialization<TermQueryParams>>> for TermQuery {\n    fn from(one_field_map: OneFieldMap<StringOrStructForSerialization<TermQueryParams>>) -> Self {\n        TermQuery {\n            field: one_field_map.field,\n            value: one_field_map.value.inner,\n        }\n    }\n}\n\nimpl From<String> for TermQueryParams {\n    fn from(query: String) -> TermQueryParams {\n        TermQueryParams {\n            value: query,\n            boost: None,\n            case_insensitive: false,\n        }\n    }\n}\n\n#[derive(Deserialize)]\n#[serde(untagged)]\nenum TermValue {\n    I64(i64),\n    U64(u64),\n    Str(String),\n}\n\nfn deserialize_term_value<'de, D>(deserializer: D) -> Result<String, D::Error>\nwhere D: Deserializer<'de> {\n    let term_value = 
TermValue::deserialize(deserializer)?;\n    match term_value {\n        TermValue::I64(i64) => Ok(i64.to_string()),\n        TermValue::U64(u64) => Ok(u64.to_string()),\n        TermValue::Str(str) => Ok(str),\n    }\n}\n\n#[derive(PartialEq, Eq, Debug, Serialize, Deserialize, Clone)]\n#[serde(deny_unknown_fields)]\npub struct TermQueryParams {\n    #[serde(deserialize_with = \"deserialize_term_value\")]\n    pub value: String,\n    #[serde(default)]\n    pub boost: Option<NotNaNf32>,\n    #[serde(default)]\n    case_insensitive: bool,\n}\n\n#[cfg(test)]\npub fn term_query_from_field_value(field: impl ToString, value: impl ToString) -> TermQuery {\n    TermQuery {\n        field: field.to_string(),\n        value: TermQueryParams {\n            value: value.to_string(),\n            boost: None,\n            case_insensitive: false,\n        },\n    }\n}\n\nimpl From<TermQuery> for ElasticQueryDslInner {\n    fn from(term_query: TermQuery) -> Self {\n        Self::Term(term_query)\n    }\n}\n\nimpl ConvertibleToQueryAst for TermQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let TermQueryParams {\n            value,\n            boost,\n            case_insensitive,\n        } = self.value;\n        if case_insensitive {\n            let ci_value = format!(\"(?i){}\", regex::escape(&value));\n            let term_ast: QueryAst = query_ast::RegexQuery {\n                field: self.field,\n                regex: ci_value,\n            }\n            .into();\n            return Ok(term_ast.boost(boost));\n        }\n        let term_ast: QueryAst = query_ast::TermQuery {\n            field: self.field,\n            value,\n        }\n        .into();\n        Ok(term_ast.boost(boost))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_term_query_simple() {\n        let term_query_json = r#\"{ \"product_id\": { \"value\": \"61809\" } }\"#;\n        let term_query: TermQuery = 
serde_json::from_str(term_query_json).unwrap();\n        assert_eq!(\n            &term_query,\n            &term_query_from_field_value(\"product_id\", \"61809\")\n        );\n    }\n\n    #[test]\n    fn test_term_query_deserialization_in_short_format() {\n        let term_query: TermQuery = serde_json::from_str(\n            r#\"{\n            \"product_id\": \"61809\"\n        }\"#,\n        )\n        .unwrap();\n        assert_eq!(\n            &term_query,\n            &term_query_from_field_value(\"product_id\", \"61809\")\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/terms_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeSet, HashMap};\n\nuse serde::Deserialize;\n\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, ElasticQueryDslInner};\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::{QueryAst, TermSetQuery};\n\n#[derive(PartialEq, Eq, Debug, Deserialize, Clone)]\n#[serde(try_from = \"TermsQueryForSerialization\")]\npub struct TermsQuery {\n    pub boost: Option<NotNaNf32>,\n    pub field: String,\n    pub values: Vec<String>,\n}\n\n#[derive(Deserialize)]\nstruct TermsQueryForSerialization {\n    #[serde(default)]\n    boost: Option<NotNaNf32>,\n    #[serde(flatten)]\n    capture_other: serde_json::Value,\n}\n\n#[derive(Deserialize)]\n#[serde(untagged)]\nenum TermValue {\n    I64(i64),\n    U64(u64),\n    Str(String),\n}\n\nimpl From<TermValue> for String {\n    fn from(term_value: TermValue) -> String {\n        match term_value {\n            TermValue::I64(val) => val.to_string(),\n            TermValue::U64(val) => val.to_string(),\n            TermValue::Str(val) => val,\n        }\n    }\n}\n\n#[derive(Deserialize)]\n#[serde(untagged)]\nenum OneOrMany {\n    One(TermValue),\n    Many(Vec<TermValue>),\n}\n\nimpl From<OneOrMany> for Vec<String> {\n    fn from(one_or_many: OneOrMany) -> Vec<String> {\n        match one_or_many {\n            
OneOrMany::One(one_value) => vec![String::from(one_value)],\n            OneOrMany::Many(values) => values.into_iter().map(String::from).collect(),\n        }\n    }\n}\n\nimpl TryFrom<TermsQueryForSerialization> for TermsQuery {\n    type Error = serde_json::Error;\n\n    fn try_from(value: TermsQueryForSerialization) -> serde_json::Result<TermsQuery> {\n        let one_field: OneFieldMap<OneOrMany> = serde_json::from_value(value.capture_other)?;\n        let one_field_values: Vec<String> = one_field.value.into();\n        Ok(TermsQuery {\n            boost: value.boost,\n            field: one_field.field,\n            values: one_field_values,\n        })\n    }\n}\n\nimpl ConvertibleToQueryAst for TermsQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let mut terms_per_field = HashMap::new();\n        let values_set: BTreeSet<String> = self.values.into_iter().collect();\n        terms_per_field.insert(self.field, values_set);\n\n        let term_set_query = TermSetQuery { terms_per_field };\n        let query_ast: QueryAst = term_set_query.into();\n\n        Ok(query_ast.boost(self.boost))\n    }\n}\n\nimpl From<TermsQuery> for ElasticQueryDslInner {\n    fn from(term_query: TermsQuery) -> Self {\n        Self::Terms(term_query)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_terms_query_simple() {\n        let terms_query_json = r#\"{ \"user.id\": [\"hello\", \"happy\"] }\"#;\n        let terms_query: TermsQuery = serde_json::from_str(terms_query_json).unwrap();\n        assert_eq!(&terms_query.field, \"user.id\");\n        assert_eq!(\n            &terms_query.values[..],\n            &[\"hello\".to_string(), \"happy\".to_string()]\n        );\n    }\n\n    #[test]\n    fn test_terms_query_single_term_not_array() {\n        let terms_query_json = r#\"{ \"user.id\": \"hello\"}\"#;\n        let terms_query: TermsQuery = serde_json::from_str(terms_query_json).unwrap();\n        
assert_eq!(&terms_query.field, \"user.id\");\n        assert_eq!(&terms_query.values[..], &[\"hello\".to_string()]);\n    }\n\n    #[test]\n    fn test_terms_query_not_string() {\n        let terms_query_json = r#\"{ \"user.id\": [1, 2] }\"#;\n        let terms_query: TermsQuery = serde_json::from_str(terms_query_json).unwrap();\n        assert_eq!(&terms_query.field, \"user.id\");\n        assert_eq!(&terms_query.values[..], &[\"1\".to_string(), \"2\".to_string()]);\n    }\n\n    #[test]\n    fn test_terms_query_single_term_boost() {\n        let terms_query_json = r#\"{ \"user.id\": [\"hello\", \"happy\"], \"boost\": 2 }\"#;\n        let terms_query: TermsQuery = serde_json::from_str(terms_query_json).unwrap();\n        assert_eq!(&terms_query.field, \"user.id\");\n        assert_eq!(\n            &terms_query.values[..],\n            &[\"hello\".to_string(), \"happy\".to_string()]\n        );\n        let boost: f32 = terms_query.boost.unwrap().into();\n        assert!((boost - 2.0f32).abs() < 0.0001f32);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/visitor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse crate::match_all::MatchAllQuery;\nuse crate::match_none::MatchNoneQuery;\nuse crate::query_string_query::QueryStringQuery;\nuse crate::range_query::RangeQuery;\nuse crate::term_query::TermQuery;\nuse crate::QueryDsl;\n\npub trait QueryDslVisitor<'a> {\n    type Err;\n\n    fn visit(&mut self, query_dsl: &'a QueryDsl) -> Result<(), Self::Err> {\n        match query_dsl {\n            QueryDsl::QueryString(query_string_query) => {\n                self.visit_query_string(query_string_query)\n            }\n            QueryDsl::Bool(bool_query) => self.visit_bool_query(bool_query),\n            QueryDsl::Term(term_query) => self.visit_term(term_query),\n            QueryDsl::MatchAll(just_boost) => self.visit_match_all(just_boost),\n            QueryDsl::MatchNone(match_none) => self.visit_match_none(match_none),\n            QueryDsl::Range(range_query) => self.visit_range(range_query),\n        }\n    }\n\n    fn visit_query_string(\n        &mut self,\n        _query_string_query: &'a QueryStringQuery,\n    ) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_bool_query(&mut self, bool_query: &'a BoolQuery) -> Result<(), Self::Err> {\n        for ast in bool_query\n            .must\n            .iter()\n            .chain(bool_query.should.iter())\n            .chain(bool_query.must_not.iter())\n            
.chain(bool_query.filter.iter())\n        {\n            self.visit(ast)?;\n        }\n        Ok(())\n    }\n\n    fn visit_term(&mut self, _term_query: &'a TermQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_match_all(&mut self, _match_all: &'a MatchAllQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_match_none(&mut self, _match_none: &'a MatchNoneQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_range(&mut self, _range_query: &'a RangeQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/elastic_query_dsl/wildcard_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\nuse crate::NotNaNf32;\nuse crate::elastic_query_dsl::one_field_map::OneFieldMap;\nuse crate::elastic_query_dsl::{ConvertibleToQueryAst, StringOrStructForSerialization};\nuse crate::query_ast::{QueryAst, WildcardQuery as AstWildcardQuery};\n\n#[derive(Deserialize, Clone, Eq, PartialEq, Debug)]\n#[serde(from = \"OneFieldMap<StringOrStructForSerialization<WildcardQueryParams>>\")]\npub(crate) struct WildcardQuery {\n    pub(crate) field: String,\n    pub(crate) params: WildcardQueryParams,\n}\n\n#[derive(Deserialize, Debug, Default, Eq, PartialEq, Clone)]\n#[serde(deny_unknown_fields)]\npub struct WildcardQueryParams {\n    value: String,\n    #[serde(default)]\n    pub boost: Option<NotNaNf32>,\n    #[serde(default)]\n    case_insensitive: bool,\n}\n\nimpl ConvertibleToQueryAst for WildcardQuery {\n    fn convert_to_query_ast(self) -> anyhow::Result<QueryAst> {\n        let wildcard_ast: QueryAst = AstWildcardQuery {\n            field: self.field,\n            value: self.params.value,\n            lenient: true,\n            case_insensitive: self.params.case_insensitive,\n        }\n        .into();\n        Ok(wildcard_ast.boost(self.params.boost))\n    }\n}\n\nimpl From<OneFieldMap<StringOrStructForSerialization<WildcardQueryParams>>> for WildcardQuery {\n    fn from(\n        match_query_params: 
OneFieldMap<StringOrStructForSerialization<WildcardQueryParams>>,\n    ) -> Self {\n        let OneFieldMap { field, value } = match_query_params;\n        WildcardQuery {\n            field,\n            params: value.inner,\n        }\n    }\n}\n\nimpl From<String> for WildcardQueryParams {\n    fn from(value: String) -> WildcardQueryParams {\n        WildcardQueryParams {\n            value,\n            boost: None,\n            case_insensitive: false,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_wildcard_query_convert_to_query_ast() {\n        let wildcard_query_json = r#\"{\n            \"user_name\": {\n                \"value\": \"john*\"\n            }\n        }\"#;\n        let wildcard_query: WildcardQuery = serde_json::from_str(wildcard_query_json).unwrap();\n        let query_ast = wildcard_query.convert_to_query_ast().unwrap();\n\n        if let QueryAst::Wildcard(wildcard) = query_ast {\n            assert_eq!(wildcard.field, \"user_name\");\n            assert_eq!(wildcard.value, \"john*\");\n            assert!(wildcard.lenient);\n        } else {\n            panic!(\"Expected QueryAst::Wildcard\");\n        }\n    }\n\n    #[test]\n    fn test_boosted_wildcard_query_convert_to_query_ast() {\n        let wildcard_query_json = r#\"{\n            \"user_name\": {\n                \"value\": \"john*\",\n                \"boost\": 2.0\n            }\n        }\"#;\n        let wildcard_query: WildcardQuery = serde_json::from_str(wildcard_query_json).unwrap();\n        let query_ast = wildcard_query.convert_to_query_ast().unwrap();\n\n        if let QueryAst::Boost { underlying, boost } = query_ast {\n            if let QueryAst::Wildcard(wildcard) = *underlying {\n                assert_eq!(wildcard.field, \"user_name\");\n                assert_eq!(wildcard.value, \"john*\");\n                assert!(wildcard.lenient);\n            } else {\n                panic!(\"Expected underlying 
QueryAst::Wildcard\");\n            }\n            assert_eq!(boost, NotNaNf32::try_from(2.0).unwrap());\n        } else {\n            panic!(\"Expected QueryAst::Boost\");\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse thiserror::Error;\n\n#[derive(Error, Debug)]\npub enum InvalidQuery {\n    #[error(\"query is incompatible with schema. {0})\")]\n    SchemaError(String),\n    #[error(\"expected `{expected_value_type}` boundary for field `{field_name}`\")]\n    InvalidBoundary {\n        expected_value_type: &'static str,\n        field_name: String,\n    },\n    #[error(\n        \"expected a `{expected_value_type}` search value for field `{field_name}`, got `{value}`\"\n    )]\n    InvalidSearchTerm {\n        expected_value_type: &'static str,\n        field_name: String,\n        value: String,\n    },\n    #[error(\"range query on `{value_type}` field (`{field_name}`) forbidden\")]\n    RangeQueryNotSupportedForField {\n        value_type: &'static str,\n        field_name: String,\n    },\n    #[error(\"field does not exist: `{full_path}`\")]\n    FieldDoesNotExist { full_path: String },\n    #[error(\"Json field root is not a valid search field: `{full_path}`\")]\n    JsonFieldRootNotSearchable { full_path: String },\n    #[error(\"user query should have been parsed\")]\n    UserQueryNotParsed,\n    #[error(\"{0}\")]\n    Other(#[from] anyhow::Error),\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/json_literal.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::net::{IpAddr, Ipv6Addr};\nuse std::str::FromStr;\n\nuse base64::Engine;\nuse once_cell::sync::OnceCell;\nuse quickwit_datetime::{DateTimeInputFormat, parse_date_time_str, parse_timestamp};\nuse serde::{Deserialize, Serialize};\nuse tantivy::schema::IntoIpv6Addr;\n\nfn get_default_date_time_format() -> &'static [DateTimeInputFormat] {\n    static DEFAULT_DATE_TIME_FORMATS: OnceCell<Vec<DateTimeInputFormat>> = OnceCell::new();\n    DEFAULT_DATE_TIME_FORMATS\n        .get_or_init(|| {\n            vec![\n                DateTimeInputFormat::Rfc3339,\n                DateTimeInputFormat::Rfc2822,\n                DateTimeInputFormat::Timestamp,\n                DateTimeInputFormat::from_str(\"%Y-%m-%dT%H:%M:%S\").unwrap(),\n                DateTimeInputFormat::from_str(\"%Y-%m-%d %H:%M:%S.%f\").unwrap(),\n                DateTimeInputFormat::from_str(\"%Y-%m-%d %H:%M:%S\").unwrap(),\n                DateTimeInputFormat::from_str(\"%Y-%m-%d\").unwrap(),\n                DateTimeInputFormat::from_str(\"%Y/%m/%d\").unwrap(),\n            ]\n        })\n        .as_slice()\n}\n\n#[derive(Serialize, Deserialize, Eq, PartialEq, Clone, Debug)]\n#[serde(untagged)]\npub enum JsonLiteral {\n    Number(serde_json::Number),\n    // String is a bit special.\n    //\n    // It can either mean it was passed as a string by the user (via the es query dsl 
for\n    // instance), or it can mean its type is unknown as it was parsed out of tantivy's query\n    // language.\n    //\n    // We have decided to not make a difference at the moment.\n    String(String),\n    Bool(bool),\n}\n\npub trait InterpretUserInput<'a>: Sized {\n    fn interpret_json(user_input: &'a JsonLiteral) -> Option<Self> {\n        match user_input {\n            JsonLiteral::Number(number) => Self::interpret_number(number),\n            JsonLiteral::String(str_val) => Self::interpret_str(str_val),\n            JsonLiteral::Bool(bool_val) => Self::interpret_bool(*bool_val),\n        }\n    }\n\n    fn interpret_number(_number: &serde_json::Number) -> Option<Self> {\n        None\n    }\n\n    fn interpret_bool(_bool: bool) -> Option<Self> {\n        None\n    }\n    fn interpret_str(_text: &'a str) -> Option<Self> {\n        None\n    }\n\n    fn name() -> &'static str {\n        std::any::type_name::<Self>()\n    }\n}\n\nimpl<'a> InterpretUserInput<'a> for &'a str {\n    fn interpret_str(text: &'a str) -> Option<Self> {\n        Some(text)\n    }\n}\n\nimpl<'a> InterpretUserInput<'a> for u64 {\n    fn interpret_number(number: &serde_json::Number) -> Option<Self> {\n        number.as_u64()\n    }\n\n    fn interpret_str(text: &'a str) -> Option<Self> {\n        text.parse().ok()\n    }\n}\n\nimpl<'a> InterpretUserInput<'a> for i64 {\n    fn interpret_number(number: &serde_json::Number) -> Option<Self> {\n        number.as_i64()\n    }\n\n    fn interpret_str(text: &'a str) -> Option<Self> {\n        text.parse().ok()\n    }\n}\n\n// We refuse NaN and infinity.\nimpl<'a> InterpretUserInput<'a> for f64 {\n    fn interpret_number(number: &serde_json::Number) -> Option<Self> {\n        let val = number.as_f64()?;\n        if val.is_nan() || val.is_infinite() {\n            return None;\n        }\n        Some(val)\n    }\n\n    fn interpret_str(text: &'a str) -> Option<f64> {\n        let val: f64 = text.parse().ok()?;\n        if val.is_nan() || 
val.is_infinite() {\n            return None;\n        }\n        Some(val)\n    }\n}\n\nimpl InterpretUserInput<'_> for bool {\n    fn interpret_bool(b: bool) -> Option<Self> {\n        Some(b)\n    }\n\n    fn interpret_str(text: &str) -> Option<Self> {\n        text.parse().ok()\n    }\n}\n\nimpl InterpretUserInput<'_> for Ipv6Addr {\n    fn interpret_str(text: &str) -> Option<Self> {\n        let ip_addr: IpAddr = text.parse().ok()?;\n        Some(ip_addr.into_ipv6_addr())\n    }\n}\n\nimpl InterpretUserInput<'_> for tantivy::DateTime {\n    fn interpret_str(text: &str) -> Option<Self> {\n        let date_time_formats = get_default_date_time_format();\n        if let Ok(datetime) = parse_date_time_str(text, date_time_formats) {\n            return Some(datetime);\n        }\n        // Parsing the normal string formats failed.\n        // Maybe it is actually a timestamp as a string?\n        let possible_timestamp = text.parse::<i64>().ok()?;\n        parse_timestamp(possible_timestamp).ok()\n    }\n\n    fn interpret_number(number: &serde_json::Number) -> Option<Self> {\n        let possible_timestamp = number.as_i64()?;\n        parse_timestamp(possible_timestamp).ok()\n    }\n}\n\n/// Lenient base64 engine that allows users to use padding or not.\nconst LENIENT_BASE64_ENGINE: base64::engine::GeneralPurpose = base64::engine::GeneralPurpose::new(\n    &base64::alphabet::STANDARD,\n    base64::engine::GeneralPurposeConfig::new()\n        .with_decode_padding_mode(base64::engine::DecodePaddingMode::Indifferent),\n);\n\nimpl InterpretUserInput<'_> for Vec<u8> {\n    fn interpret_str(mut text: &str) -> Option<Vec<u8>> {\n        let Some(first_byte) = text.as_bytes().first().copied() else {\n            return Some(Vec::new());\n        };\n        let mut buffer = Vec::with_capacity(text.len() * 3 / 4);\n        if first_byte == b'!' {\n            // We use ! 
as a marker to force base64 decoding.\n            text = &text[1..];\n        } else {\n            buffer.resize(text.len() / 2, 0u8);\n            if hex::decode_to_slice(text, &mut buffer[..]).is_ok() {\n                return Some(buffer);\n            }\n            buffer.clear();\n        }\n        LENIENT_BASE64_ENGINE.decode_vec(text, &mut buffer).ok()?;\n        Some(buffer)\n    }\n}\n\nimpl From<bool> for JsonLiteral {\n    fn from(b: bool) -> JsonLiteral {\n        JsonLiteral::Bool(b)\n    }\n}\n\nimpl From<String> for JsonLiteral {\n    fn from(s: String) -> JsonLiteral {\n        JsonLiteral::String(s)\n    }\n}\n\nimpl From<u64> for JsonLiteral {\n    fn from(number: u64) -> JsonLiteral {\n        JsonLiteral::Number(number.into())\n    }\n}\n\nimpl From<i64> for JsonLiteral {\n    fn from(number: i64) -> JsonLiteral {\n        JsonLiteral::Number(number.into())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::DateTime;\n    use time::macros::datetime;\n\n    use crate::JsonLiteral;\n    use crate::json_literal::InterpretUserInput;\n\n    #[test]\n    fn test_interpret_str_u64() {\n        let val_opt = u64::interpret_str(\"123\");\n        assert_eq!(val_opt, Some(123u64));\n    }\n\n    #[test]\n    fn test_interpret_datetime_simple_date() {\n        let dt_opt = DateTime::interpret_json(&JsonLiteral::String(\"2023-05-25\".to_string()));\n        let expected_datetime = datetime!(2023-05-25 00:00 UTC);\n        assert_eq!(dt_opt, Some(DateTime::from_utc(expected_datetime)));\n    }\n\n    #[test]\n    fn test_interpret_datetime_rfc3339_with_no_timezone() {\n        let dt_opt =\n            DateTime::interpret_json(&JsonLiteral::String(\"2023-05-25T18:00:00\".to_string()));\n        let expected_datetime = datetime!(2023-05-25 18:00 UTC);\n        assert_eq!(dt_opt, Some(DateTime::from_utc(expected_datetime)));\n    }\n\n    #[test]\n    fn test_interpret_datetime_fractional_millis() {\n        let dt_opt =\n            
DateTime::interpret_json(&JsonLiteral::String(\"2023-05-25 10:20:11.322\".to_string()));\n        let expected_datetime = datetime!(2023-05-25 10:20:11.322 UTC);\n        assert_eq!(dt_opt, Some(DateTime::from_utc(expected_datetime)));\n    }\n\n    #[test]\n    fn test_interpret_datetime_unix_timestamp_as_string() {\n        let dt_opt = DateTime::interpret_json(&JsonLiteral::String(\"1685086013\".to_string()));\n        let expected_datetime = datetime!(2023-05-26 07:26:53 UTC);\n        assert_eq!(dt_opt, Some(DateTime::from_utc(expected_datetime)));\n    }\n\n    #[test]\n    fn test_interpret_datetime_unix_timestamp_as_number() {\n        let dt_opt = DateTime::interpret_json(&JsonLiteral::Number(1685086013.into()));\n        let expected_datetime = datetime!(2023-05-26 07:26:53 UTC);\n        assert_eq!(dt_opt, Some(DateTime::from_utc(expected_datetime)));\n    }\n\n    #[test]\n    fn test_interpret_bytes_base16_lowercase() {\n        let bytes_opt = Vec::<u8>::interpret_str(\"deadbeef\");\n        assert_eq!(bytes_opt, Some(vec![0xde, 0xad, 0xbe, 0xef]));\n    }\n\n    #[test]\n    fn test_interpret_bytes_base16_uppercase() {\n        let bytes_opt = Vec::<u8>::interpret_str(\"DEADBEEF\");\n        assert_eq!(bytes_opt, Some(vec![0xde, 0xad, 0xbe, 0xef]));\n    }\n\n    #[test]\n    fn test_interpret_bytes_base16_mixed_casing() {\n        let bytes_opt = Vec::<u8>::interpret_str(\"dEadbeef\");\n        assert_eq!(bytes_opt, Some(vec![0xde, 0xad, 0xbe, 0xef]));\n    }\n\n    #[test]\n    fn test_interpret_bytes_base64() {\n        let decoded = Vec::<u8>::interpret_str(\"aGVsbG8=\").unwrap();\n        assert_eq!(decoded, b\"hello\");\n    }\n\n    #[test]\n    fn test_interpret_force_ambiguous_base64() {\n        let decoded = Vec::<u8>::interpret_str(\"!beef\").unwrap();\n        assert_eq!(decoded, &[109, 231, 159]);\n    }\n\n    #[test]\n    fn test_interpret_with_and_without_padding() {\n        let decoded_without_padding = 
Vec::<u8>::interpret_str(\"cQ\").unwrap();\n        let decoded_with_padding = Vec::<u8>::interpret_str(\"cQ==\").unwrap();\n        assert_eq!(&decoded_with_padding, &decoded_without_padding);\n        assert_eq!(&decoded_with_padding, b\"q\");\n    }\n\n    #[test]\n    fn test_interpret_bytes_invalid() {\n        assert!(Vec::<u8>::interpret_str(\"deadbeef@\").is_none());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! QueryDSL partially compatible with Elasticsearch/Opensearch QueryDSL.\n//! See documentation here:\n//! <https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html>\n\n// As you add queries in this file please insert it in the order of the OpenSearch 2.6\n// documentation (the opensearch documentation has a nicer structure than that of ES).\n// https://opensearch.org/docs/2.6/query-dsl/term/\n//\n// For the individual detailed API documentation however, you should refer to elastic\n// documentation.\n\npub mod aggregations;\nmod elastic_query_dsl;\nmod error;\nmod json_literal;\nmod not_nan_f32;\npub mod query_ast;\npub mod tokenizers;\n\npub use elastic_query_dsl::{ElasticQueryDsl, OneFieldMap};\npub use error::InvalidQuery;\npub use json_literal::{InterpretUserInput, JsonLiteral};\npub(crate) use not_nan_f32::NotNaNf32;\npub use query_ast::utils::find_field_or_hit_dynamic;\nuse serde::{Deserialize, Serialize};\npub use tantivy::query::Query as TantivyQuery;\npub use tokenizers::{\n    CodeTokenizer, DEFAULT_REMOVE_TOKEN_LENGTH, create_default_quickwit_tokenizer_manager,\n    get_quickwit_fastfield_normalizer_manager,\n};\n\n#[derive(Serialize, Deserialize, Debug, Default, Copy, Clone, Eq, PartialEq)]\npub enum BooleanOperand {\n    #[serde(alias = \"AND\")]\n    And,\n    #[default]\n    #[serde(alias = \"OR\")]\n    
Or,\n}\n\n#[derive(Serialize, Deserialize, Debug, Copy, Clone, Eq, PartialEq, Default)]\npub enum MatchAllOrNone {\n    #[serde(rename = \"none\")]\n    #[default]\n    MatchNone,\n    #[serde(rename = \"all\")]\n    MatchAll,\n}\n\nimpl MatchAllOrNone {\n    pub fn is_none(&self) -> bool {\n        self == &MatchAllOrNone::MatchNone\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/not_nan_f32.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Serialize};\n\n#[derive(Serialize, Deserialize, Debug, Copy, Clone, PartialEq)]\n#[serde(into = \"f32\", try_from = \"f32\")]\npub struct NotNaNf32(f32);\n\nimpl NotNaNf32 {\n    pub const ZERO: Self = NotNaNf32(0.0f32);\n    pub const ONE: Self = NotNaNf32(1.0f32);\n}\n\nimpl From<NotNaNf32> for f32 {\n    fn from(not_nan_f32: NotNaNf32) -> f32 {\n        not_nan_f32.0\n    }\n}\n\nimpl TryFrom<f32> for NotNaNf32 {\n    type Error = &'static str;\n\n    fn try_from(possibly_nan: f32) -> Result<NotNaNf32, &'static str> {\n        if possibly_nan.is_nan() {\n            return Err(\"NaN is not supported as a boost value\");\n        }\n        Ok(NotNaNf32(possibly_nan))\n    }\n}\n\nimpl Eq for NotNaNf32 {}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/bool_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Serialize};\n\nuse super::{BuildTantivyAst, BuildTantivyAstContext, TantivyQueryAst};\nuse crate::InvalidQuery;\nuse crate::query_ast::QueryAst;\n\n/// # Unsupported features\n/// - named queries\n///\n/// Edge cases of BooleanQuery are not obvious,\n/// and different behavior could be justified.\n///\n/// Here we align ourselves with Elasticsearch.\n/// A boolean query is to be interpreted like a filtering predicate\n/// over the set of documents.\n///\n/// If all clauses are empty, then the full set of documents is returned.\n/// Adding a match all must clause does not change the result of a boolean query.\n#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Clone, Default)]\npub struct BoolQuery {\n    #[serde(default, skip_serializing_if = \"Vec::is_empty\")]\n    pub must: Vec<QueryAst>,\n    #[serde(default, skip_serializing_if = \"Vec::is_empty\")]\n    pub must_not: Vec<QueryAst>,\n    #[serde(default, skip_serializing_if = \"Vec::is_empty\")]\n    pub should: Vec<QueryAst>,\n    #[serde(default, skip_serializing_if = \"Vec::is_empty\")]\n    pub filter: Vec<QueryAst>,\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub minimum_should_match: Option<usize>,\n}\n\nimpl From<BoolQuery> for QueryAst {\n    fn from(bool_query: BoolQuery) -> Self {\n        QueryAst::Bool(bool_query)\n    
}\n}\n\nimpl BuildTantivyAst for BoolQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let mut boolean_query = super::tantivy_query_ast::TantivyBoolQuery {\n            minimum_should_match: self.minimum_should_match,\n            ..Default::default()\n        };\n        for must in &self.must {\n            let must_leaf = must.build_tantivy_ast_call(context)?;\n            boolean_query.must.push(must_leaf);\n        }\n        for must_not in &self.must_not {\n            let must_not_leaf = must_not.build_tantivy_ast_call(context)?;\n            boolean_query.must_not.push(must_not_leaf);\n        }\n        for should in &self.should {\n            let should_leaf = should.build_tantivy_ast_call(context)?;\n            boolean_query.should.push(should_leaf);\n        }\n        for filter in &self.filter {\n            let filter_leaf = filter.build_tantivy_ast_call(context)?;\n            boolean_query.filter.push(filter_leaf);\n        }\n        Ok(TantivyQueryAst::Bool(boolean_query))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/cache_node.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse bitpacking::{BitPacker, BitPacker1x};\nuse quickwit_proto::types::SplitId;\nuse serde::{Deserialize, Serialize};\n\nuse super::{BuildTantivyAst, BuildTantivyAstContext, TantivyQueryAst};\nuse crate::InvalidQuery;\nuse crate::query_ast::QueryAst;\n\n/// A node caching the result of an inner query.\n///\n/// This can be used when it's known that some sub-ast might appear in many queries,\n/// or that the same query might be run, with various aggregations.\n///\n/// /!\\ Sprinkling this everywhere can lead to performance degradations: the whole posting\n/// list of the underlying query will need to be evaluated to build the cache, whereas it could\n/// have been largely skipped if some other part of the query is very selective.\n#[derive(Serialize, Deserialize, Debug, Clone)]\npub struct CacheNode {\n    pub inner: Box<QueryAst>,\n    #[serde(skip)]\n    pub state: CacheState,\n}\n\n#[derive(Default, Clone)]\npub enum CacheState {\n    // This is the state a CacheNode should be before\n    #[default]\n    Uninitialized,\n    CacheHit(CacheEntry),\n    CacheMiss(CacheFiller),\n}\n\nimpl std::fmt::Debug for CacheState {\n    fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        match self {\n            CacheState::Uninitialized => fmt.debug_tuple(\"Uninitialized\").finish(),\n            
CacheState::CacheHit(_) => fmt.debug_tuple(\"CacheHit\").finish_non_exhaustive(),\n            CacheState::CacheMiss(_) => fmt.debug_tuple(\"CacheMiss\").finish_non_exhaustive(),\n        }\n    }\n}\n\n// cache state shouldn't impact a CacheNode equality\nimpl Eq for CacheNode {}\nimpl PartialEq for CacheNode {\n    fn eq(&self, other: &Self) -> bool {\n        self.inner == other.inner\n    }\n}\n\nimpl From<CacheNode> for QueryAst {\n    fn from(cache_node: CacheNode) -> Self {\n        QueryAst::Cache(cache_node)\n    }\n}\n\nimpl CacheNode {\n    pub fn new(ast: QueryAst) -> Self {\n        CacheNode {\n            inner: Box::new(ast),\n            state: CacheState::Uninitialized,\n        }\n    }\n\n    pub fn fill_cache_state(&mut self, cache: &Arc<dyn PredicateCache>, split_id: &str) {\n        let Ok(query) = serde_json::to_string(&self.inner) else {\n            return;\n        };\n        if let Some((segment_id, hits)) = cache.get(split_id.to_string(), query.clone()) {\n            self.state = CacheState::CacheHit(CacheEntry { segment_id, hits });\n        } else {\n            self.state = CacheState::CacheMiss(CacheFiller {\n                cache: cache.clone(),\n                split_id: split_id.to_string(),\n                query,\n            });\n        }\n    }\n}\n\nimpl BuildTantivyAst for CacheNode {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        match &self.state {\n            CacheState::Uninitialized => self.inner.build_tantivy_ast_call(context),\n            CacheState::CacheHit(cache_entry) => Ok(CacheHitQuery {\n                cache_entry: cache_entry.clone(),\n            }\n            .into()),\n            CacheState::CacheMiss(cache_filler) => {\n                let tantivy_query: Box<dyn Query> = self\n                    .inner\n                    .build_tantivy_ast_call(context)?\n                    .simplify()\n      
              .into();\n                Ok(CacheFillerQuery {\n                    inner_query: Box::new(tantivy_query),\n                    cache_filler: cache_filler.clone(),\n                }\n                .into())\n            }\n        }\n    }\n}\n\nuse tantivy::directory::OwnedBytes;\nuse tantivy::index::SegmentId;\nuse tantivy::query::{EnableScoring, Explanation, Query, Scorer, Weight};\nuse tantivy::{DocId, DocSet, Score, SegmentReader, TantivyError};\n\n#[derive(Clone, Debug)]\npub struct CacheHitQuery {\n    cache_entry: CacheEntry,\n}\n\nimpl Query for CacheHitQuery {\n    fn weight(&self, enable_scoring: EnableScoring<'_>) -> tantivy::Result<Box<dyn Weight>> {\n        if enable_scoring.is_scoring_enabled() {\n            Err(tantivy::TantivyError::InternalError(\n                \"Predicate cache doesn't support scoring yet\".to_string(),\n            ))\n        } else {\n            Ok(Box::new(CacheHitWeight {\n                cache_entry: self.cache_entry.clone(),\n            }))\n        }\n    }\n}\n\n/// Weight associated with the `CacheHitQuery` query.\npub struct CacheHitWeight {\n    cache_entry: CacheEntry,\n}\n\nimpl Weight for CacheHitWeight {\n    fn scorer(&self, reader: &SegmentReader, boost: Score) -> tantivy::Result<Box<dyn Scorer>> {\n        // we could try to run the query if for some reason we don't actually find an entry in\n        // cache, but that would have required loading stuff during warmup which we skipped.\n        // An error is the best we can do\n        let mut hit_set = self\n            .cache_entry\n            .for_segment(reader.segment_id())\n            .ok_or_else(|| TantivyError::InternalError(\"Segment not found in cache\".to_string()))?;\n        hit_set.boost = boost;\n        Ok(Box::new(hit_set))\n    }\n\n    fn explain(&self, reader: &SegmentReader, doc: DocId) -> tantivy::Result<Explanation> {\n        let mut scorer = self.scorer(reader, 1.0)?;\n        if scorer.seek(doc) == doc {\n            
Ok(Explanation::new(\"HitSet\", 1.0))\n        } else {\n            Err(TantivyError::InvalidArgument(\n                \"Document does not exist\".to_string(),\n            ))\n        }\n    }\n}\n\n#[derive(Clone)]\npub struct CacheEntry {\n    segment_id: SegmentId,\n    hits: HitSet,\n}\n\nimpl std::fmt::Debug for CacheEntry {\n    fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        fmt.debug_struct(\"CacheEntry\")\n            .field(\"segment_id\", &self.segment_id)\n            .finish_non_exhaustive()\n    }\n}\n\nimpl CacheEntry {\n    fn for_segment(&self, segment_id: SegmentId) -> Option<HitSet> {\n        if segment_id == self.segment_id {\n            Some(self.hits.clone())\n        } else {\n            None\n        }\n    }\n}\n\n#[derive(Debug, Clone)]\npub struct HitSet {\n    buffer: OwnedBytes,\n    buffer_pos: usize,\n    previous_last_val: Option<u32>,\n    current_block: [u32; BitPacker1x::BLOCK_LEN],\n    block_pos: usize,\n    boost: Score,\n}\n\nconst INCOMPLETE_BLOCK_MARKER: u8 = 0x80;\n\nimpl HitSet {\n    #[cfg(test)]\n    fn empty() -> Self {\n        Self::from_buffer(OwnedBytes::new(vec![0, 0, 0, 0]))\n    }\n\n    /// Build a HitSet from its serialized form.\n    ///\n    /// The provided buffer must come from `HitSet::into_buffer`\n    pub fn from_buffer(buffer: OwnedBytes) -> Self {\n        let mut this = Self {\n            buffer,\n            // skip count\n            buffer_pos: 4,\n            previous_last_val: None,\n            current_block: [0; BitPacker1x::BLOCK_LEN],\n            // we set this to block_len minus 1 so we can call advance() once to initialize\n            // everything\n            block_pos: BitPacker1x::BLOCK_LEN - 1,\n            boost: 1.0,\n        };\n        this.advance();\n        this\n    }\n\n    /// Return a buffer representing the underlying data.\n    ///\n    /// This does not preserve where in the DocSet you are.\n    pub fn into_buffer(self) -> 
OwnedBytes {\n        self.buffer\n    }\n\n    fn load_new_block(&mut self) {\n        let Some(num_bits) = self.buffer.get(self.buffer_pos) else {\n            // we ended iteration: simply fill the current_block full of TERMINATED\n            self.current_block = [tantivy::TERMINATED; 32];\n            return;\n        };\n        self.buffer_pos += 1;\n        if *num_bits == INCOMPLETE_BLOCK_MARKER {\n            // final block, decode as many ids as possible\n            let mut i = 0;\n            for chunk in self.buffer[self.buffer_pos..].as_chunks().0 {\n                self.current_block[i] = u32::from_ne_bytes(*chunk);\n                i += 1;\n            }\n            // pad with TERMINATED\n            while i < BitPacker1x::BLOCK_LEN {\n                self.current_block[i] = tantivy::TERMINATED;\n                i += 1;\n            }\n            self.buffer_pos = self.buffer.len();\n        } else {\n            self.buffer_pos += BitPacker1x.decompress_strictly_sorted(\n                self.previous_last_val,\n                &self.buffer[self.buffer_pos..],\n                &mut self.current_block,\n                *num_bits,\n            );\n            self.previous_last_val = self.current_block.last().copied();\n        }\n    }\n}\n\nimpl DocSet for HitSet {\n    fn advance(&mut self) -> DocId {\n        self.block_pos += 1;\n        if let Some(doc_id) = self.current_block.get(self.block_pos) {\n            return *doc_id;\n        }\n        self.load_new_block();\n        self.block_pos = 0;\n        self.current_block[0]\n    }\n\n    // fn seek(&mut self, target: DocId) -> DocId {\n    // }\n\n    #[inline(always)]\n    fn doc(&self) -> DocId {\n        self.current_block[self.block_pos]\n    }\n\n    fn size_hint(&self) -> u32 {\n        u32::from_ne_bytes(self.buffer[0..4].try_into().unwrap())\n    }\n}\n\nimpl Scorer for HitSet {\n    fn score(&mut self) -> f32 {\n        self.boost\n    }\n}\n\npub struct HitSetBuilder {\n    
count: u32,\n    current_block: [u32; BitPacker1x::BLOCK_LEN],\n    previous_last_val: Option<u32>,\n    buffer: Vec<u8>,\n}\n\nimpl HitSetBuilder {\n    pub fn new() -> Self {\n        HitSetBuilder {\n            count: 0,\n            current_block: [0; BitPacker1x::BLOCK_LEN],\n            previous_last_val: None,\n            buffer: vec![0; 4],\n        }\n    }\n\n    fn in_block_pos(&self) -> usize {\n        (self.count % BitPacker1x::BLOCK_LEN as u32) as usize\n    }\n\n    fn end_of_block(&self) -> bool {\n        self.in_block_pos() == (BitPacker1x::BLOCK_LEN - 1)\n    }\n\n    fn flush_block(&mut self) {\n        let num_bits =\n            BitPacker1x.num_bits_strictly_sorted(self.previous_last_val, &self.current_block);\n        self.buffer.push(num_bits);\n        let current_buffer_pos = self.buffer.len();\n        let new_end = current_buffer_pos + (BitPacker1x::BLOCK_LEN * num_bits as usize) / 8;\n        self.buffer.resize(new_end, 0);\n        BitPacker1x.compress_strictly_sorted(\n            self.previous_last_val,\n            &self.current_block,\n            &mut self.buffer[current_buffer_pos..],\n            num_bits,\n        );\n        self.previous_last_val = self.current_block.last().copied();\n    }\n\n    pub fn insert(&mut self, value: u32) {\n        self.current_block[self.in_block_pos()] = value;\n        if self.end_of_block() {\n            self.flush_block();\n        }\n        self.count += 1;\n    }\n\n    pub fn build(mut self) -> HitSet {\n        if self.in_block_pos() != 0 {\n            self.buffer.push(INCOMPLETE_BLOCK_MARKER);\n            for elem in &self.current_block[..self.in_block_pos()] {\n                self.buffer.extend_from_slice(&elem.to_ne_bytes());\n            }\n        }\n        // write back the count of items\n        self.buffer[0..4].copy_from_slice(&self.count.to_ne_bytes());\n        HitSet::from_buffer(OwnedBytes::new(self.buffer))\n    }\n}\n\n#[derive(Clone)]\npub struct CacheFiller {\n 
   cache: Arc<dyn PredicateCache>,\n    split_id: String,\n    query: String,\n}\n\nimpl std::fmt::Debug for CacheFiller {\n    fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        fmt.debug_struct(\"CacheFiller\")\n            .field(\"split_id\", &self.split_id)\n            .finish_non_exhaustive()\n    }\n}\n\nimpl CacheFiller {\n    fn fill_segment(&self, segment_id: SegmentId, value: HitSet) {\n        self.cache\n            .put(self.split_id.clone(), self.query.clone(), segment_id, value);\n    }\n}\n\n#[derive(Debug)]\npub struct CacheFillerQuery {\n    inner_query: Box<dyn Query>,\n    cache_filler: CacheFiller,\n}\n\nimpl Clone for CacheFillerQuery {\n    fn clone(&self) -> Self {\n        Self {\n            inner_query: self.inner_query.box_clone(),\n            cache_filler: self.cache_filler.clone(),\n        }\n    }\n}\n\nimpl Query for CacheFillerQuery {\n    fn weight(&self, enable_scoring: EnableScoring<'_>) -> tantivy::Result<Box<dyn Weight>> {\n        if enable_scoring.is_scoring_enabled() {\n            Err(tantivy::TantivyError::InternalError(\n                \"Predicate cache doesn't support scoring yet\".to_string(),\n            ))\n        } else {\n            Ok(Box::new(CacheFillerWeight {\n                inner_weight: self.inner_query.weight(enable_scoring)?,\n                cache_filler: self.cache_filler.clone(),\n            }))\n        }\n    }\n    fn query_terms<'a>(&'a self, visitor: &mut dyn FnMut(&'a tantivy::Term, bool)) {\n        self.inner_query.query_terms(visitor)\n    }\n}\n\n/// Weight associated with the `CacheFillerQuery` query.\npub struct CacheFillerWeight {\n    inner_weight: Box<dyn Weight>,\n    cache_filler: CacheFiller,\n}\n\nimpl Weight for CacheFillerWeight {\n    fn scorer(&self, reader: &SegmentReader, boost: Score) -> tantivy::Result<Box<dyn Scorer>> {\n        let mut hit_set_builder = HitSetBuilder::new();\n        let mut scorer = self.inner_weight.scorer(reader, 1.0)?;\n     
   let mut doc_id = scorer.doc();\n        while doc_id < tantivy::TERMINATED {\n            hit_set_builder.insert(doc_id);\n            doc_id = scorer.advance();\n        }\n        let mut hit_set = hit_set_builder.build();\n        self.cache_filler\n            .fill_segment(reader.segment_id(), hit_set.clone());\n        hit_set.boost = boost;\n        Ok(Box::new(hit_set))\n    }\n\n    fn explain(&self, reader: &SegmentReader, doc: DocId) -> tantivy::Result<Explanation> {\n        self.inner_weight.explain(reader, doc)\n    }\n}\n\n/// A transformer that goes through a QueryAst, and changes the state of all CacheNodes\n/// to Hit/Miss based on the provided cache.\n///\n/// This must be called for any CacheNode inside a QueryAst to do anything (though not calling\n/// it isn't an error, it just means no cache will be used).\npub struct PredicateCacheInjector {\n    pub cache: Arc<dyn PredicateCache>,\n    pub split_id: String,\n}\n\nimpl crate::query_ast::QueryAstTransformer for PredicateCacheInjector {\n    type Err = std::convert::Infallible;\n\n    fn transform_cache_node(\n        &mut self,\n        mut cache_node: CacheNode,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        cache_node.fill_cache_state(&self.cache, &self.split_id);\n        self.transform(*cache_node.inner).map(|maybe_ast| {\n            maybe_ast.map(|inner| {\n                QueryAst::Cache(CacheNode {\n                    inner: Box::new(inner),\n                    state: cache_node.state,\n                })\n            })\n        })\n    }\n}\n\n// we use a trait to dodge circular dependencies with quickwit-storage\npub trait PredicateCache: Send + Sync + 'static {\n    fn get(&self, split_id: SplitId, query_ast_json: String) -> Option<(SegmentId, HitSet)>;\n\n    fn put(&self, split_id: SplitId, query_ast_json: String, segment: SegmentId, results: HitSet);\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashMap;\n    use std::sync::Mutex;\n\n    use 
tantivy::DocSet;\n    use tantivy::query::Query as TantivyQuery;\n    use tantivy::schema::{Schema, TEXT};\n\n    use super::*;\n    use crate::query_ast::{\n        BuildTantivyAstContext, QueryAstTransformer, QueryAstVisitor, TermQuery,\n    };\n\n    impl PredicateCache for Mutex<HashMap<(SplitId, String), (SegmentId, HitSet)>> {\n        fn get(&self, split_id: SplitId, query_ast_json: String) -> Option<(SegmentId, HitSet)> {\n            self.lock()\n                .unwrap()\n                .get(&(split_id, query_ast_json))\n                .cloned()\n        }\n\n        fn put(\n            &self,\n            split_id: SplitId,\n            query_ast_json: String,\n            segment: SegmentId,\n            results: HitSet,\n        ) {\n            self.lock()\n                .unwrap()\n                .insert((split_id, query_ast_json), (segment, results));\n        }\n    }\n\n    #[track_caller]\n    fn test_hit_set_roundtrip_helper<I: Iterator<Item = u32> + Clone>(iter: I) {\n        let mut hitset_builder = HitSetBuilder::new();\n        for i in iter.clone() {\n            hitset_builder.insert(i);\n        }\n        let mut hitset = hitset_builder.build();\n\n        for val in iter {\n            assert_eq!(hitset.doc(), val);\n            hitset.advance();\n        }\n        for _ in 0..96 {\n            assert_eq!(hitset.doc(), tantivy::TERMINATED);\n            hitset.advance();\n        }\n    }\n\n    #[test]\n    fn test_hit_set_roundtrip() {\n        // this generates a pseudorandom strictly increasing sequence\n        let generator = std::iter::successors(Some(0u32), |x| Some(x + x.trailing_ones() + 1));\n\n        // empty\n        test_hit_set_roundtrip_helper(generator.clone().take(0));\n        // one item\n        test_hit_set_roundtrip_helper(generator.clone().take(1));\n        test_hit_set_roundtrip_helper(generator.clone().skip(10).take(1));\n        // partial block\n        
test_hit_set_roundtrip_helper(generator.clone().take(24));\n        test_hit_set_roundtrip_helper(generator.clone().skip(10).take(24));\n\n        // one block\n        test_hit_set_roundtrip_helper(generator.clone().take(32));\n        test_hit_set_roundtrip_helper(generator.clone().skip(10).take(32));\n        // two blocks\n        test_hit_set_roundtrip_helper(generator.clone().take(64));\n        test_hit_set_roundtrip_helper(generator.clone().skip(10).take(64));\n\n        // many blocks, partial last block\n        test_hit_set_roundtrip_helper(generator.clone().take(1024 + 6));\n        test_hit_set_roundtrip_helper(generator.clone().skip(10).take(1024 + 6));\n    }\n\n    #[test]\n    fn test_built_tantivy_ast() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"body\", TEXT);\n        let schema = schema_builder.build();\n        let term_query: QueryAst = TermQuery {\n            field: \"body\".to_string(),\n            value: \"val\".to_string(),\n        }\n        .into();\n        let tantivy_term_query: Box<dyn TantivyQuery> = term_query\n            .build_tantivy_ast_impl(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap()\n            .into();\n\n        {\n            let ast = CacheNode {\n                inner: Box::new(term_query.clone()),\n                state: CacheState::Uninitialized,\n            };\n            let uninit_cache_query: Box<dyn TantivyQuery> = ast\n                .build_tantivy_ast_impl(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap()\n                .into();\n            assert_eq!(\n                format!(\"{uninit_cache_query:?}\"),\n                format!(\"{tantivy_term_query:?}\")\n            );\n        }\n\n        {\n            let cache_entry = CacheEntry {\n                segment_id: SegmentId::from_uuid_string(\"1686a000d4f7a91939d0e71df1646d7a\")\n                    .unwrap(),\n                hits: HitSet::empty(),\n  
          };\n            let ast = CacheNode {\n                inner: Box::new(term_query.clone()),\n                state: CacheState::CacheHit(cache_entry),\n            };\n            let cache_hit_query: Box<dyn TantivyQuery> = ast\n                .build_tantivy_ast_impl(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap()\n                .into();\n\n            let debug_query = format!(\"{cache_hit_query:?}\");\n            assert!(debug_query.contains(\"CacheHitQuery\"));\n            assert!(!debug_query.contains(\"TermQuery\"));\n        }\n        {\n            let cache_filler = CacheFiller {\n                cache: Arc::new(Mutex::new(HashMap::new())),\n                split_id: \"split_id\".to_string(),\n                query: \"{}\".to_string(),\n            };\n            let ast = CacheNode {\n                inner: Box::new(term_query.clone()),\n                state: CacheState::CacheMiss(cache_filler),\n            };\n            let cache_miss_query: Box<dyn TantivyQuery> = ast\n                .build_tantivy_ast_impl(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap()\n                .into();\n\n            let debug_query = format!(\"{cache_miss_query:?}\");\n            assert!(debug_query.contains(\"CacheFillerQuery\"));\n            assert!(debug_query.contains(&format!(\"{tantivy_term_query:?}\")));\n        }\n    }\n\n    struct FoundATermVisitor(bool);\n    impl QueryAstVisitor<'_> for FoundATermVisitor {\n        type Err = std::convert::Infallible;\n        fn visit_term(&mut self, _term: &TermQuery) -> Result<(), Self::Err> {\n            self.0 = true;\n            Ok(())\n        }\n    }\n\n    impl QueryAstTransformer for FoundATermVisitor {\n        type Err = std::convert::Infallible;\n        fn transform_term(&mut self, term: TermQuery) -> Result<Option<QueryAst>, Self::Err> {\n            self.0 = true;\n            Ok(Some(term.into()))\n        }\n    }\n\n    #[test]\n    fn 
test_default_visitor_ignore_cached_node() {\n        let term_query: QueryAst = TermQuery {\n            field: \"body\".to_string(),\n            value: \"val\".to_string(),\n        }\n        .into();\n        {\n            let ast = CacheNode {\n                inner: Box::new(term_query.clone()),\n                state: CacheState::Uninitialized,\n            }\n            .into();\n\n            let mut visitor = FoundATermVisitor(false);\n            visitor.visit(&ast).unwrap();\n            assert!(visitor.0);\n            let mut visitor = FoundATermVisitor(false);\n            visitor.transform(ast).unwrap();\n            assert!(visitor.0);\n        }\n        {\n            let cache_entry = CacheEntry {\n                segment_id: SegmentId::from_uuid_string(\"1686a000d4f7a91939d0e71df1646d7a\")\n                    .unwrap(),\n                hits: HitSet::empty(),\n            };\n            let ast = CacheNode {\n                inner: Box::new(term_query.clone()),\n                state: CacheState::CacheHit(cache_entry),\n            }\n            .into();\n\n            let mut visitor = FoundATermVisitor(false);\n            visitor.visit(&ast).unwrap();\n            assert!(!visitor.0);\n            let mut visitor = FoundATermVisitor(false);\n            visitor.transform(ast).unwrap();\n            assert!(!visitor.0);\n        }\n        {\n            let cache_filler = CacheFiller {\n                cache: Arc::new(Mutex::new(HashMap::new())),\n                split_id: \"split_id\".to_string(),\n                query: \"{}\".to_string(),\n            };\n            let ast = CacheNode {\n                inner: Box::new(term_query.clone()),\n                state: CacheState::CacheMiss(cache_filler),\n            }\n            .into();\n\n            let mut visitor = FoundATermVisitor(false);\n            visitor.visit(&ast).unwrap();\n            assert!(visitor.0);\n            let mut visitor = FoundATermVisitor(false);\n       
     visitor.transform(ast).unwrap();\n            assert!(visitor.0);\n        }\n    }\n\n    #[test]\n    fn test_cache_preigniter_fills_cache() {\n        let term_query: QueryAst = TermQuery {\n            field: \"body\".to_string(),\n            value: \"val\".to_string(),\n        }\n        .into();\n        let cache_node = CacheNode {\n            inner: Box::new(term_query.clone()),\n            state: CacheState::Uninitialized,\n        };\n        let query_json = serde_json::to_string(&cache_node.inner).unwrap();\n        let ast: QueryAst = cache_node.into();\n        let cache = Arc::new(Mutex::new(HashMap::new()));\n        cache.put(\n            \"split_2\".to_string(),\n            query_json,\n            SegmentId::from_uuid_string(\"1686a000d4f7a91939d0e71df1646d7a\").unwrap(),\n            HitSet::empty(),\n        );\n\n        {\n            let mut pre_igniter = PredicateCacheInjector {\n                cache: cache.clone(),\n                split_id: \"split_1\".to_string(),\n            };\n            let filled = pre_igniter.transform(ast.clone()).unwrap().unwrap();\n            assert!(matches!(\n                filled,\n                QueryAst::Cache(CacheNode {\n                    state: CacheState::CacheMiss(_),\n                    ..\n                })\n            ));\n        }\n\n        {\n            let mut pre_igniter = PredicateCacheInjector {\n                cache: cache.clone(),\n                split_id: \"split_2\".to_string(),\n            };\n            let filled = pre_igniter.transform(ast.clone()).unwrap().unwrap();\n            assert!(matches!(\n                filled,\n                QueryAst::Cache(CacheNode {\n                    state: CacheState::CacheHit(_),\n                    ..\n                })\n            ));\n        }\n    }\n\n    #[test]\n    fn test_cache_hit_returns_correct_docs() {\n        let mut schema_builder = Schema::builder();\n        let host_field = 
schema_builder.add_text_field(\"host\", TEXT);\n        let schema = schema_builder.build();\n        let index = tantivy::IndexBuilder::new()\n            .schema(schema.clone())\n            .create_in_ram()\n            .unwrap();\n        let mut index_writer = index.writer_with_num_threads(1, 20_000_000).unwrap();\n        for count in 1..13 {\n            let mut doc = tantivy::TantivyDocument::default();\n            doc.add_text(host_field, format!(\"host_{count}\"));\n            for _ in 0..count {\n                index_writer.add_document(doc.clone()).unwrap();\n            }\n        }\n        index_writer.commit().unwrap();\n        let searcher = index.reader().unwrap().searcher();\n        let segment_id = searcher.segment_readers()[0].segment_id();\n\n        let generator =\n            std::iter::successors(Some(0u32), |x| Some(x + x.trailing_ones() + 1)).take(500);\n        let mut hitset_builder = HitSetBuilder::new();\n        for i in generator {\n            hitset_builder.insert(i);\n        }\n        let hitset = hitset_builder.build();\n\n        // this query isn't even valid for that split, but that's not relevant as it won't get run\n        let term_query: QueryAst = TermQuery {\n            field: \"body\".to_string(),\n            value: \"val\".to_string(),\n        }\n        .into();\n        let cache_entry = CacheEntry {\n            segment_id,\n            hits: hitset,\n        };\n        let ast = CacheNode {\n            inner: Box::new(term_query.clone()),\n            state: CacheState::CacheHit(cache_entry),\n        };\n        let cache_hit_query: Box<dyn TantivyQuery> = ast\n            .build_tantivy_ast_impl(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap()\n            .into();\n\n        assert_eq!(cache_hit_query.count(&searcher).unwrap(), 500);\n    }\n\n    #[test]\n    fn test_cache_miss_returns_correct_docs_and_fill_cache() {\n        let mut schema_builder = Schema::builder();\n        
let host_field = schema_builder.add_text_field(\"host\", TEXT);\n        let schema = schema_builder.build();\n        let index = tantivy::IndexBuilder::new()\n            .schema(schema.clone())\n            .create_in_ram()\n            .unwrap();\n        let mut index_writer = index.writer_with_num_threads(1, 20_000_000).unwrap();\n        for count in 1..13 {\n            let mut doc = tantivy::TantivyDocument::default();\n            doc.add_text(host_field, format!(\"host_{count}\"));\n            for _ in 0..count {\n                index_writer.add_document(doc.clone()).unwrap();\n            }\n        }\n        index_writer.commit().unwrap();\n        let searcher = index.reader().unwrap().searcher();\n        let segment_id = searcher.segment_readers()[0].segment_id();\n\n        let term_query: QueryAst = TermQuery {\n            field: \"host\".to_string(),\n            value: \"11\".to_string(),\n        }\n        .into();\n        let cache = Arc::new(Mutex::new(HashMap::new()));\n        let cache_filler = CacheFiller {\n            cache: cache.clone(),\n            split_id: \"split_id\".to_string(),\n            query: \"{some_query}\".to_string(),\n        };\n        let ast = CacheNode {\n            inner: Box::new(term_query.clone()),\n            state: CacheState::CacheMiss(cache_filler),\n        };\n        let cache_hit_query: Box<dyn TantivyQuery> = ast\n            .build_tantivy_ast_impl(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap()\n            .into();\n\n        assert_eq!(cache_hit_query.count(&searcher).unwrap(), 11);\n        let mut cache_entry = cache\n            .get(\"split_id\".to_string(), \"{some_query}\".to_string())\n            .unwrap();\n        assert_eq!(cache_entry.0, segment_id);\n        let expected = (10 * 11 / 2)..(11 * 12 / 2);\n        for doc_id in expected {\n            assert_eq!(cache_entry.1.doc(), doc_id);\n            cache_entry.1.advance();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/field_presence.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::PathHasher;\nuse quickwit_common::shared_consts::FIELD_PRESENCE_FIELD_NAME;\nuse serde::{Deserialize, Serialize};\nuse tantivy::Term;\nuse tantivy::schema::{Field, FieldEntry, IndexRecordOption, Schema as TantivySchema};\n\nuse super::tantivy_query_ast::TantivyBoolQuery;\nuse super::utils::{DYNAMIC_FIELD_NAME, find_subfields};\nuse crate::query_ast::tantivy_query_ast::TantivyQueryAst;\nuse crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext, QueryAst};\nuse crate::{BooleanOperand, InvalidQuery, find_field_or_hit_dynamic};\n\n#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\npub struct FieldPresenceQuery {\n    pub field: String,\n}\n\nimpl From<FieldPresenceQuery> for QueryAst {\n    fn from(field_presence_query: FieldPresenceQuery) -> Self {\n        QueryAst::FieldPresence(field_presence_query)\n    }\n}\n\nfn compute_field_presence_hash(field: Field, field_path: &str) -> PathHasher {\n    let mut path_hasher: PathHasher = PathHasher::default();\n    path_hasher.append(&field.field_id().to_le_bytes()[..]);\n    let mut escaped = false;\n    let mut current_segment = String::new();\n    for c in field_path.chars() {\n        if escaped {\n            escaped = false;\n            current_segment.push(c);\n            continue;\n        }\n        match c {\n            '\\\\' => {\n                
escaped = true;\n            }\n            '.' => {\n                path_hasher.append(current_segment.as_bytes());\n                current_segment.clear();\n            }\n            _ => {\n                current_segment.push(c);\n            }\n        }\n    }\n    if !current_segment.is_empty() {\n        path_hasher.append(current_segment.as_bytes());\n    }\n    path_hasher\n}\n\nfn build_existence_query(\n    field_presence_field: Field,\n    field: Field,\n    field_entry: &FieldEntry,\n    path: &str,\n) -> TantivyQueryAst {\n    if field_entry.is_fast() {\n        let full_path = if path.is_empty() {\n            field_entry.name().to_string()\n        } else {\n            format!(\"{}.{}\", field_entry.name(), path)\n        };\n        let exists_query = tantivy::query::ExistsQuery::new(full_path, true);\n        TantivyQueryAst::from(exists_query)\n    } else {\n        // fallback to the presence field\n        let presence_hasher = compute_field_presence_hash(field, path);\n        let leaf_term = Term::from_field_u64(field_presence_field, presence_hasher.finish_leaf());\n        if field_entry.field_type().is_json() {\n            let intermediate_term =\n                Term::from_field_u64(field_presence_field, presence_hasher.finish_intermediate());\n            let query = tantivy::query::TermSetQuery::new([leaf_term, intermediate_term]);\n            TantivyQueryAst::from(query)\n        } else {\n            let query = tantivy::query::TermQuery::new(leaf_term, IndexRecordOption::Basic);\n            TantivyQueryAst::from(query)\n        }\n    }\n}\n\nimpl FieldPresenceQuery {\n    /// Identify the field and potential subfields that are required for this query.\n    ///\n    /// This is only based on the schema and cannot know about dynamic fields.\n    pub fn find_field_and_subfields<'a>(\n        &'a self,\n        schema: &'a TantivySchema,\n    ) -> Vec<(Field, &'a FieldEntry, &'a str)> {\n        let mut fields = Vec::new();\n      
  if let Some((field, entry, path)) = find_field_or_hit_dynamic(&self.field, schema) {\n            fields.push((field, entry, path));\n        };\n        // if `self.field` was not found, it might still be an `object` field\n        if fields.is_empty() || fields[0].1.name() == DYNAMIC_FIELD_NAME {\n            for (field, entry) in find_subfields(&self.field, schema) {\n                fields.push((field, entry, \"\"));\n            }\n        }\n        fields\n    }\n}\n\nimpl BuildTantivyAst for FieldPresenceQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let field_presence_field = context\n            .schema\n            .get_field(FIELD_PRESENCE_FIELD_NAME)\n            .map_err(|_| {\n                InvalidQuery::SchemaError(\n                    \"field presence is not available for this split\".to_string(),\n                )\n            })?;\n        let fields = self.find_field_and_subfields(context.schema);\n        if fields.is_empty() {\n            // the schema is not dynamic and no subfields are defined\n            return Err(InvalidQuery::FieldDoesNotExist {\n                full_path: self.field.clone(),\n            });\n        }\n        let queries = fields\n            .into_iter()\n            .map(|(field, entry, path)| {\n                build_existence_query(field_presence_field, field, entry, path)\n            })\n            .collect();\n        Ok(TantivyQueryAst::Bool(TantivyBoolQuery::build_clause(\n            BooleanOperand::Or,\n            queries,\n        )))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use super::*;\n\n    #[test]\n    fn test_field_presence_single() {\n        let field_presence_term: u64 =\n            compute_field_presence_hash(Field::from_field_id(17u32), \"attributes\").finish_leaf();\n        assert_eq!(\n            field_presence_term,\n            
PathHasher::hash_path(&[&17u32.to_le_bytes()[..], b\"attributes\"])\n        );\n    }\n\n    #[test]\n    fn test_field_presence_hash_simple() {\n        let field_presence_term: u64 =\n            compute_field_presence_hash(Field::from_field_id(17u32), \"attributes.color\")\n                .finish_leaf();\n        assert_eq!(\n            field_presence_term,\n            PathHasher::hash_path(&[&17u32.to_le_bytes()[..], b\"attributes\", b\"color\"])\n        );\n    }\n\n    #[test]\n    fn test_field_presence_hash_escaped_dot() {\n        let field_presence_term: u64 =\n            compute_field_presence_hash(Field::from_field_id(17u32), r\"attributes\\.color.hello\")\n                .finish_leaf();\n        assert_eq!(\n            field_presence_term,\n            PathHasher::hash_path(&[&17u32.to_le_bytes()[..], b\"attributes.color\", b\"hello\"])\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/full_text_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse anyhow::Context;\nuse serde::{Deserialize, Serialize};\nuse tantivy::Term;\nuse tantivy::query::{\n    PhrasePrefixQuery as TantivyPhrasePrefixQuery, PhraseQuery as TantivyPhraseQuery,\n    TermQuery as TantivyTermQuery,\n};\nuse tantivy::schema::{\n    Field, FieldType, IndexRecordOption, JsonObjectOptions, Schema as TantivySchema,\n    TextFieldIndexing,\n};\nuse tantivy::tokenizer::{TextAnalyzer, TokenStream};\n\nuse crate::query_ast::tantivy_query_ast::{TantivyBoolQuery, TantivyQueryAst};\nuse crate::query_ast::utils::full_text_query;\nuse crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext, QueryAst};\nuse crate::tokenizers::TokenizerManager;\nuse crate::{BooleanOperand, InvalidQuery, MatchAllOrNone, find_field_or_hit_dynamic};\n\n#[derive(Serialize, Deserialize, Debug, Eq, PartialEq, Clone)]\n#[serde(deny_unknown_fields)]\npub struct FullTextParams {\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub tokenizer: Option<String>,\n    pub mode: FullTextMode,\n    // How an empty query (no terms after tokenization) should be interpreted.\n    // By default we match no documents.\n    #[serde(default, skip_serializing_if = \"MatchAllOrNone::is_none\")]\n    pub zero_terms_query: MatchAllOrNone,\n}\n\nimpl FullTextParams {\n    fn text_analyzer(\n        &self,\n        text_field_indexing: 
&TextFieldIndexing,\n        tokenizer_manager: &TokenizerManager,\n    ) -> anyhow::Result<TextAnalyzer> {\n        let tokenizer_name: &str = self\n            .tokenizer\n            .as_deref()\n            .unwrap_or(text_field_indexing.tokenizer());\n        tokenizer_manager\n            .get_tokenizer(tokenizer_name)\n            .with_context(|| format!(\"no tokenizer named `{tokenizer_name}` is registered\"))\n    }\n\n    pub(crate) fn tokenize_text_into_terms_json(\n        &self,\n        field: Field,\n        json_path: &str,\n        text: &str,\n        json_options: &JsonObjectOptions,\n        tokenizer_manager: &TokenizerManager,\n    ) -> anyhow::Result<Vec<(usize, Term)>> {\n        let text_indexing_options = json_options\n            .get_text_indexing_options()\n            .with_context(|| format!(\"Json field text `{json_path}` is not indexed\"))?;\n        let mut text_analyzer: TextAnalyzer =\n            self.text_analyzer(text_indexing_options, tokenizer_manager)?;\n        let mut token_stream = text_analyzer.token_stream(text);\n        let mut tokens = Vec::new();\n        token_stream.process(&mut |token| {\n            let mut term =\n                Term::from_field_json_path(field, json_path, json_options.is_expand_dots_enabled());\n            term.append_type_and_str(&token.text);\n            tokens.push((token.position, term));\n        });\n        Ok(tokens)\n    }\n\n    pub(crate) fn tokenize_text_into_terms(\n        &self,\n        field: Field,\n        text: &str,\n        text_field_indexing: &TextFieldIndexing,\n        tokenizer_manager: &TokenizerManager,\n    ) -> anyhow::Result<Vec<(usize, Term)>> {\n        let mut text_analyzer: TextAnalyzer =\n            self.text_analyzer(text_field_indexing, tokenizer_manager)?;\n        let mut token_stream = text_analyzer.token_stream(text);\n        let mut tokens = Vec::new();\n        token_stream.process(&mut |token| {\n            let term: Term = 
Term::from_field_text(field, &token.text);\n            tokens.push((token.position, term));\n        });\n        Ok(tokens)\n    }\n\n    pub(crate) fn make_query(\n        &self,\n        mut terms: Vec<(usize, Term)>,\n        index_record_option: IndexRecordOption,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        if terms.is_empty() {\n            return Ok(self.zero_terms_query.into());\n        }\n        if terms.len() == 1 {\n            let term = terms.pop().unwrap().1;\n            return Ok(TantivyTermQuery::new(term, IndexRecordOption::WithFreqs).into());\n        }\n        match self.mode {\n            FullTextMode::Bool { operator } => {\n                let leaf_queries: Vec<TantivyQueryAst> = terms\n                    .into_iter()\n                    .map(|(_, term)| TantivyTermQuery::new(term, index_record_option).into())\n                    .collect();\n                Ok(TantivyBoolQuery::build_clause(operator, leaf_queries).into())\n            }\n            FullTextMode::BoolPrefix {\n                operator,\n                max_expansions,\n            } => {\n                let term_with_prefix = terms.pop();\n                let mut leaf_queries: Vec<TantivyQueryAst> = terms\n                    .into_iter()\n                    .map(|(_, term)| TantivyTermQuery::new(term, index_record_option).into())\n                    .collect();\n                if let Some(term_with_prefix) = term_with_prefix {\n                    let mut phrase_prefix_query =\n                        TantivyPhrasePrefixQuery::new_with_offset(vec![term_with_prefix]);\n                    phrase_prefix_query.set_max_expansions(max_expansions);\n                    leaf_queries.push(phrase_prefix_query.into());\n                }\n                Ok(TantivyBoolQuery::build_clause(operator, leaf_queries).into())\n            }\n            FullTextMode::Phrase { slop } => {\n                if !index_record_option.has_positions() {\n                  
  return Err(InvalidQuery::SchemaError(\n                        \"Applied phrase query on field which does not have positions indexed\"\n                            .to_string(),\n                    ));\n                }\n                let mut phrase_query = TantivyPhraseQuery::new_with_offset(terms);\n                phrase_query.set_slop(slop);\n                Ok(phrase_query.into())\n            }\n            FullTextMode::PhraseFallbackToIntersection => {\n                if index_record_option.has_positions() {\n                    Ok(TantivyPhraseQuery::new_with_offset(terms).into())\n                } else {\n                    let term_query: Vec<TantivyQueryAst> = terms\n                        .into_iter()\n                        .map(|(_, term)| TantivyTermQuery::new(term, index_record_option).into())\n                        .collect();\n                    Ok(TantivyBoolQuery::build_clause(BooleanOperand::And, term_query).into())\n                }\n            }\n        }\n    }\n}\n\nfn is_zero(val: &u32) -> bool {\n    *val == 0u32\n}\n\n/// `FullTextMode` describes how we should derive a query from a user sequence of tokens.\n#[derive(Copy, Clone, Debug, Eq, PartialEq, Serialize, Deserialize)]\n#[serde(tag = \"type\", rename_all = \"snake_case\")]\npub enum FullTextMode {\n    // After tokenization, the different tokens should be used to\n    // create a boolean clause (conjunction or disjunction based on the operator).\n    Bool {\n        operator: BooleanOperand,\n    },\n    BoolPrefix {\n        operator: BooleanOperand,\n        // max_expansions corresponds to the fuzzy stop of query evaluation. 
It's not the same as\n        // the max_expansions of a PhrasePrefixQuery, where it's used for the range\n        // expansion.\n        max_expansions: u32,\n    },\n    // Act as Phrase with slop 0 if the field has positions,\n    // otherwise act as an intersection.\n    PhraseFallbackToIntersection,\n    // After tokenization, the different tokens should be used to create\n    // a phrase query.\n    //\n    // A non-zero slop allows the position of the terms to be slightly off.\n    Phrase {\n        #[serde(default, skip_serializing_if = \"is_zero\")]\n        slop: u32,\n    },\n}\n\nimpl From<BooleanOperand> for FullTextMode {\n    fn from(operator: BooleanOperand) -> Self {\n        FullTextMode::Bool { operator }\n    }\n}\n\n/// The Full Text query is tokenized into a sequence of tokens\n/// that will then be searched.\n///\n/// The `full_text_params` defines what type of match is accepted.\n/// The tokens might be transformed into a phrase query,\n/// into a disjunction, or into a conjunction.\n///\n/// If after tokenization, a single term is emitted, it will naturally\n/// produce a tantivy TermQuery.\n///\n/// If no term is emitted, it will produce a query that matches all or no documents,\n/// depending on `full_text_params.zero_terms_query`.\n///\n/// Contrary to the user input query, the FullTextQuery does not\n/// interpret a boolean query grammar and targets a specific field.\n#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\npub struct FullTextQuery {\n    pub field: String,\n    pub text: String,\n    pub params: FullTextParams,\n    /// Support missing fields\n    pub lenient: bool,\n}\n\nimpl From<FullTextQuery> for QueryAst {\n    fn from(full_text_query: FullTextQuery) -> Self {\n        QueryAst::FullText(full_text_query)\n    }\n}\n\nimpl BuildTantivyAst for FullTextQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n       
 full_text_query(\n            &self.field,\n            &self.text,\n            &self.params,\n            context.schema,\n            context.tokenizer_manager,\n            self.lenient,\n        )\n    }\n}\n\nimpl FullTextQuery {\n    /// Returns the last term of the query assuming the query is targeting a string or a Json\n    /// field.\n    ///\n    /// This strange method is used to identify which term range should be warmed up for\n    /// phrase prefix queries.\n    pub fn get_prefix_term(\n        &self,\n        schema: &TantivySchema,\n        tokenizer_manager: &TokenizerManager,\n    ) -> Option<Term> {\n        if !matches!(self.params.mode, FullTextMode::BoolPrefix { .. }) {\n            return None;\n        };\n\n        let (field, field_entry, json_path) = find_field_or_hit_dynamic(&self.field, schema)?;\n        let field_type: &FieldType = field_entry.field_type();\n        match field_type {\n            FieldType::Str(text_options) => {\n                let text_field_indexing = text_options.get_indexing_options()?;\n                let mut terms = self\n                    .params\n                    .tokenize_text_into_terms(\n                        field,\n                        &self.text,\n                        text_field_indexing,\n                        tokenizer_manager,\n                    )\n                    .ok()?;\n                let (_pos, term) = terms.pop()?;\n                Some(term)\n            }\n            FieldType::JsonObject(json_options) => {\n                let mut terms = self\n                    .params\n                    .tokenize_text_into_terms_json(\n                        field,\n                        json_path,\n                        &self.text,\n                        json_options,\n                        tokenizer_manager,\n                    )\n                    .ok()?;\n                let (_pos, term) = terms.pop()?;\n                Some(term)\n            }\n            
_ => None,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::schema::{DateOptions, DateTimePrecision, Schema, TEXT};\n\n    use crate::BooleanOperand;\n    use crate::query_ast::tantivy_query_ast::TantivyQueryAst;\n    use crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext, FullTextMode, FullTextQuery};\n\n    #[test]\n    fn test_zero_terms() {\n        let full_text_query = FullTextQuery {\n            field: \"body\".to_string(),\n            text: \"\".to_string(),\n            params: super::FullTextParams {\n                tokenizer: None,\n                mode: BooleanOperand::And.into(),\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n            },\n            lenient: false,\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"body\", TEXT);\n        let schema = schema_builder.build();\n        let ast: TantivyQueryAst = full_text_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        assert_eq!(ast.const_predicate(), Some(crate::MatchAllOrNone::MatchAll));\n    }\n\n    #[test]\n    fn test_phrase_mode_default_tokenizer() {\n        let full_text_query = FullTextQuery {\n            field: \"body\".to_string(),\n            text: \"Hello World!\".to_string(),\n            params: super::FullTextParams {\n                tokenizer: None,\n                mode: FullTextMode::Phrase { slop: 1 },\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n            },\n            lenient: false,\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"body\", TEXT);\n        let schema = schema_builder.build();\n        let ast: TantivyQueryAst = full_text_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = ast.as_leaf().unwrap();\n        assert_eq!(\n    
        &format!(\"{leaf:?}\"),\n            \"PhraseQuery { field: Field(0), phrase_terms: [(0, Term(field=0, type=Str, \\\n             \\\"hello\\\")), (1, Term(field=0, type=Str, \\\"world\\\"))], slop: 1 }\"\n        );\n    }\n\n    #[test]\n    fn test_full_text_specific_tokenizer() {\n        let full_text_query = FullTextQuery {\n            field: \"body\".to_string(),\n            text: \"Hello world\".to_string(),\n            params: super::FullTextParams {\n                tokenizer: Some(\"raw\".to_string()),\n                mode: FullTextMode::Phrase { slop: 1 },\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n            },\n            lenient: false,\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"body\", TEXT);\n        let schema = schema_builder.build();\n        let ast: TantivyQueryAst = full_text_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = ast.as_leaf().unwrap();\n        assert_eq!(\n            &format!(\"{leaf:?}\"),\n            r#\"TermQuery(Term(field=0, type=Str, \"Hello world\"))\"#\n        );\n    }\n\n    #[test]\n    fn test_full_text_datetime() {\n        let full_text_query = FullTextQuery {\n            field: \"ts\".to_string(),\n            text: \"2025-12-13T16:13:12.666777Z\".to_string(),\n            params: super::FullTextParams {\n                tokenizer: Some(\"raw\".to_string()),\n                mode: FullTextMode::Phrase { slop: 1 },\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n            },\n            lenient: false,\n        };\n        {\n            // indexed, we truncate to the second\n            let mut schema_builder = Schema::builder();\n            schema_builder.add_date_field(\n                \"ts\",\n                DateOptions::default()\n                    .set_precision(DateTimePrecision::Milliseconds)\n 
                   .set_fast()\n                    .set_indexed(),\n            );\n            let schema = schema_builder.build();\n            let ast: TantivyQueryAst = full_text_query\n                .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap();\n            let leaf = ast.as_leaf().unwrap();\n            assert_eq!(\n                &format!(\"{leaf:?}\"),\n                r#\"TermQuery(Term(field=0, type=Date, 2025-12-13T16:13:12Z))\"#\n            );\n        }\n        {\n            // not indexed, we truncate to fastfield precision\n            let mut schema_builder = Schema::builder();\n            schema_builder.add_date_field(\n                \"ts\",\n                DateOptions::default()\n                    .set_precision(DateTimePrecision::Milliseconds)\n                    .set_fast(),\n            );\n            let schema = schema_builder.build();\n            let ast: TantivyQueryAst = full_text_query\n                .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap();\n            let leaf = ast.as_leaf().unwrap();\n            assert_eq!(\n                &format!(\"{leaf:?}\"),\n                r#\"TermQuery(Term(field=0, type=Date, 2025-12-13T16:13:12.666Z))\"#\n            );\n        }\n    }\n\n    #[test]\n    fn test_full_text_bool_mode() {\n        let full_text_query = FullTextQuery {\n            field: \"body\".to_string(),\n            text: \"Hello world\".to_string(),\n            params: super::FullTextParams {\n                tokenizer: None,\n                mode: BooleanOperand::And.into(),\n                zero_terms_query: crate::MatchAllOrNone::MatchAll,\n            },\n            lenient: false,\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"body\", TEXT);\n        let schema = schema_builder.build();\n        let ast: TantivyQueryAst = full_text_query\n         
   .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let bool_query = ast.as_bool_query().unwrap();\n        assert_eq!(bool_query.must.len(), 2);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Serialize};\nuse tantivy::query::BoostQuery as TantivyBoostQuery;\nuse tantivy::schema::Schema as TantivySchema;\n\nuse crate::tokenizers::TokenizerManager;\n\nmod bool_query;\nmod cache_node;\nmod field_presence;\nmod full_text_query;\nmod phrase_prefix_query;\nmod range_query;\nmod regex_query;\nmod tantivy_query_ast;\nmod term_query;\nmod term_set_query;\nmod user_input_query;\npub(crate) mod utils;\nmod visitor;\nmod wildcard_query;\n\npub use bool_query::BoolQuery;\npub use cache_node::{CacheNode, HitSet, PredicateCache, PredicateCacheInjector};\npub use field_presence::FieldPresenceQuery;\npub use full_text_query::{FullTextMode, FullTextParams, FullTextQuery};\npub use phrase_prefix_query::PhrasePrefixQuery;\npub use range_query::RangeQuery;\npub use regex_query::{AutomatonQuery, JsonPathPrefix, RegexQuery};\nuse tantivy_query_ast::TantivyQueryAst;\npub use term_query::TermQuery;\npub use term_set_query::TermSetQuery;\npub use user_input_query::UserInputQuery;\npub use visitor::{QueryAstTransformer, QueryAstVisitor};\npub use wildcard_query::WildcardQuery;\n\nuse crate::{BooleanOperand, InvalidQuery, NotNaNf32};\n\n#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\n#[serde(tag = \"type\")]\n#[serde(rename_all = \"snake_case\")]\npub enum QueryAst {\n    Bool(BoolQuery),\n    Term(TermQuery),\n    
TermSet(TermSetQuery),\n    FieldPresence(FieldPresenceQuery),\n    FullText(FullTextQuery),\n    PhrasePrefix(PhrasePrefixQuery),\n    Range(RangeQuery),\n    UserInput(UserInputQuery),\n    Wildcard(WildcardQuery),\n    Regex(RegexQuery),\n    MatchAll,\n    MatchNone,\n    Boost {\n        underlying: Box<QueryAst>,\n        boost: NotNaNf32,\n    },\n    Cache(CacheNode),\n}\n\nimpl QueryAst {\n    pub fn parse_user_query(\n        self: QueryAst,\n        default_search_fields: &[String],\n    ) -> anyhow::Result<QueryAst> {\n        match self {\n            QueryAst::Bool(BoolQuery {\n                must,\n                must_not,\n                should,\n                filter,\n                minimum_should_match,\n            }) => {\n                let must = parse_user_query_in_asts(must, default_search_fields)?;\n                let must_not = parse_user_query_in_asts(must_not, default_search_fields)?;\n                let should = parse_user_query_in_asts(should, default_search_fields)?;\n                let filter = parse_user_query_in_asts(filter, default_search_fields)?;\n                Ok(BoolQuery {\n                    must,\n                    must_not,\n                    should,\n                    filter,\n                    minimum_should_match,\n                }\n                .into())\n            }\n            ast @ QueryAst::Term(_)\n            | ast @ QueryAst::TermSet(_)\n            | ast @ QueryAst::FullText(_)\n            | ast @ QueryAst::PhrasePrefix(_)\n            | ast @ QueryAst::MatchAll\n            | ast @ QueryAst::MatchNone\n            | ast @ QueryAst::FieldPresence(_)\n            | ast @ QueryAst::Range(_)\n            | ast @ QueryAst::Wildcard(_)\n            | ast @ QueryAst::Regex(_) => Ok(ast),\n            QueryAst::UserInput(user_text_query) => {\n                user_text_query.parse_user_query(default_search_fields)\n            }\n            QueryAst::Boost { underlying, boost } => {\n      
          let underlying = underlying.parse_user_query(default_search_fields)?;\n                Ok(QueryAst::Boost {\n                    underlying: Box::new(underlying),\n                    boost,\n                })\n            }\n            QueryAst::Cache(cache_node) => {\n                let inner = cache_node.inner.parse_user_query(default_search_fields)?;\n                let uninitialized =\n                    matches!(cache_node.state, cache_node::CacheState::Uninitialized);\n                debug_assert!(\n                    uninitialized,\n                    \"QueryAst::parse_user_query called on initialized CacheNode, this is probably \\\n                     a mistake\"\n                );\n                if !uninitialized {\n                    tracing::warn!(\n                        \"QueryAst::parse_user_query called on initialized CacheNode, cache \\\n                         discarded\"\n                    );\n                }\n                Ok(CacheNode {\n                    inner: Box::new(inner),\n                    // inner got modified; the result is supposed to be equivalent, but to be safe,\n                    // let's reinitialize the cache. In practice this function\n                    // shouldn't ever be called after the cache was resolved.\n                    state: cache_node::CacheState::Uninitialized,\n                }\n                .into())\n            }\n        }\n    }\n\n    pub fn boost(self, scale_boost_opt: Option<NotNaNf32>) -> Self {\n        let Some(scale_boost) = scale_boost_opt else {\n            return self;\n        };\n        match self {\n            QueryAst::Boost { underlying, boost } => {\n                let scale_boost_f32: f32 = scale_boost.into();\n                let boost_f32: f32 = boost.into();\n                let new_boost =\n                    NotNaNf32::try_from(scale_boost_f32 * boost_f32).unwrap_or(NotNaNf32::ZERO);\n                QueryAst::Boost {\n                    
underlying,\n                    boost: new_boost,\n                }\n            }\n            ast => {\n                let underlying = Box::new(ast);\n                QueryAst::Boost {\n                    underlying,\n                    boost: scale_boost,\n                }\n            }\n        }\n    }\n}\n\n/// Context used when building a tantivy ast.\npub struct BuildTantivyAstContext<'a> {\n    pub schema: &'a TantivySchema,\n    pub tokenizer_manager: &'a TokenizerManager,\n    pub search_fields: &'a [String],\n    pub with_validation: bool,\n}\n\nimpl<'a> BuildTantivyAstContext<'a> {\n    pub fn for_test(schema: &'a TantivySchema) -> Self {\n        use once_cell::sync::Lazy;\n\n        // we do that to have a TokenizerManager with a long enough lifetime\n        static DEFAULT_TOKENIZER_MANAGER: Lazy<TokenizerManager> =\n            Lazy::new(crate::create_default_quickwit_tokenizer_manager);\n\n        BuildTantivyAstContext {\n            schema,\n            tokenizer_manager: &DEFAULT_TOKENIZER_MANAGER,\n            search_fields: &[],\n            with_validation: true,\n        }\n    }\n\n    pub fn without_validation(mut self) -> Self {\n        self.with_validation = false;\n        self\n    }\n}\n\ntrait BuildTantivyAst {\n    /// Transforms a query Ast node into a TantivyQueryAst.\n    ///\n    /// This function is supposed to return an error if it detects a problem in the schema.\n    /// It can call `build_tantivy_ast_call` but should never call `build_tantivy_ast_impl`.\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery>;\n\n    /// This method is meant to be called, but should never be overridden.\n    fn build_tantivy_ast_call(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let tantivy_ast_res = self.build_tantivy_ast_impl(context);\n        if !context.with_validation && 
tantivy_ast_res.is_err() {\n            return match tantivy_ast_res {\n                res @ Ok(_) | res @ Err(InvalidQuery::UserQueryNotParsed) => res,\n                Err(_) => Ok(TantivyQueryAst::match_none()),\n            };\n        }\n        tantivy_ast_res\n    }\n}\n\nimpl BuildTantivyAst for QueryAst {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        match self {\n            QueryAst::Bool(bool_query) => bool_query.build_tantivy_ast_call(context),\n            QueryAst::Term(term_query) => term_query.build_tantivy_ast_call(context),\n            QueryAst::Range(range_query) => range_query.build_tantivy_ast_call(context),\n            QueryAst::MatchAll => Ok(TantivyQueryAst::match_all()),\n            QueryAst::MatchNone => Ok(TantivyQueryAst::match_none()),\n            QueryAst::Boost { boost, underlying } => {\n                let underlying = underlying.build_tantivy_ast_call(context)?.simplify();\n                let boost_query = TantivyBoostQuery::new(underlying.into(), (*boost).into());\n                Ok(boost_query.into())\n            }\n            QueryAst::TermSet(term_set) => term_set.build_tantivy_ast_call(context),\n            QueryAst::FullText(full_text_query) => full_text_query.build_tantivy_ast_call(context),\n            QueryAst::PhrasePrefix(phrase_prefix_query) => {\n                phrase_prefix_query.build_tantivy_ast_call(context)\n            }\n            QueryAst::UserInput(user_text_query) => user_text_query.build_tantivy_ast_call(context),\n            QueryAst::FieldPresence(field_presence) => {\n                field_presence.build_tantivy_ast_call(context)\n            }\n            QueryAst::Wildcard(wildcard) => wildcard.build_tantivy_ast_call(context),\n            QueryAst::Regex(regex) => regex.build_tantivy_ast_call(context),\n            QueryAst::Cache(cache_node) => 
cache_node.build_tantivy_ast_call(context),\n        }\n    }\n}\n\nimpl QueryAst {\n    pub fn build_tantivy_query(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<Box<dyn crate::TantivyQuery>, InvalidQuery> {\n        let tantivy_query_ast = self.build_tantivy_ast_call(context)?;\n        Ok(tantivy_query_ast.simplify().into())\n    }\n}\n\nfn parse_user_query_in_asts(\n    asts: Vec<QueryAst>,\n    default_search_fields: &[String],\n) -> anyhow::Result<Vec<QueryAst>> {\n    asts.into_iter()\n        .map(|ast| ast.parse_user_query(default_search_fields))\n        .collect::<anyhow::Result<_>>()\n}\n\n/// Parses a user query and returns a JSON query AST.\n///\n/// The resulting query does not include `UserInputQuery` nodes.\n/// The resolution assumes that there are no default search fields\n/// in the doc mapper.\n///\n/// # Panics\n///\n/// Panics if the user text is invalid.\npub fn qast_json_helper(user_text: &str, default_fields: &[&'static str]) -> String {\n    let ast = qast_helper(user_text, default_fields);\n    serde_json::to_string(&ast).expect(\"The query AST should be JSON serializable.\")\n}\n\npub fn qast_helper(user_text: &str, default_fields: &[&'static str]) -> QueryAst {\n    let default_fields: Vec<String> = default_fields\n        .iter()\n        .map(|default_field| default_field.to_string())\n        .collect();\n    query_ast_from_user_text(user_text, Some(default_fields))\n        .parse_user_query(&[])\n        .expect(\"The user query should be valid.\")\n}\n\n/// Creates a QueryAST with a single UserInputQuery node.\n///\n/// Disclaimer:\n/// At this point the query has not been parsed.\n///\n/// The actual parsing is meant to happen on a root node,\n/// `default_fields` can be passed to decide which field should be search\n/// if not specified specifically in the user query (e.g. 
hello as opposed to \"body:hello\").\n///\n/// If it is not supplied, the docmapper search fields are meant to be used.\n///\n/// If no boolean operator is specified, the default is `AND` (contrary to the Elasticsearch\n/// default).\npub fn query_ast_from_user_text(user_text: &str, default_fields: Option<Vec<String>>) -> QueryAst {\n    UserInputQuery {\n        user_text: user_text.to_string(),\n        default_fields,\n        default_operator: BooleanOperand::And,\n        lenient: false,\n    }\n    .into()\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::query_ast::tantivy_query_ast::TantivyQueryAst;\n    use crate::query_ast::{\n        BoolQuery, BuildTantivyAst, BuildTantivyAstContext, QueryAst, UserInputQuery,\n        query_ast_from_user_text,\n    };\n    use crate::{BooleanOperand, InvalidQuery};\n\n    #[test]\n    fn test_user_query_not_parsed() {\n        let query_ast: QueryAst = UserInputQuery {\n            user_text: \"*\".to_string(),\n            default_fields: Default::default(),\n            default_operator: Default::default(),\n            lenient: false,\n        }\n        .into();\n        let schema = tantivy::schema::Schema::builder().build();\n        let build_tantivy_ast_err: InvalidQuery = query_ast\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap_err();\n        assert!(matches!(\n            build_tantivy_ast_err,\n            InvalidQuery::UserQueryNotParsed\n        ));\n    }\n\n    #[test]\n    fn test_user_query_parsed() {\n        let query_ast: QueryAst = UserInputQuery {\n            user_text: \"*\".to_string(),\n            default_fields: Default::default(),\n            default_operator: Default::default(),\n            lenient: false,\n        }\n        .into();\n        let query_ast_with_parsed_user_query: QueryAst = query_ast.parse_user_query(&[]).unwrap();\n        let schema = tantivy::schema::Schema::builder().build();\n        let tantivy_query_ast = 
query_ast_with_parsed_user_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        assert_eq!(&tantivy_query_ast, &TantivyQueryAst::match_all(),);\n    }\n\n    #[test]\n    fn test_user_query_parsed_query_ast() {\n        let query_ast: QueryAst = UserInputQuery {\n            user_text: \"*\".to_string(),\n            default_fields: Default::default(),\n            default_operator: Default::default(),\n            lenient: false,\n        }\n        .into();\n        let bool_query_ast: QueryAst = BoolQuery {\n            filter: vec![query_ast],\n            ..Default::default()\n        }\n        .into();\n        let query_ast_with_parsed_user_query: QueryAst =\n            bool_query_ast.parse_user_query(&[]).unwrap();\n        let schema = tantivy::schema::Schema::builder().build();\n        let tantivy_query_ast = query_ast_with_parsed_user_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let tantivy_query_ast_simplified = tantivy_query_ast.simplify();\n        // This does not get more simplified than this, because we need the boost 0 score.\n        let tantivy_bool_query = tantivy_query_ast_simplified.as_bool_query().unwrap();\n        assert_eq!(tantivy_bool_query.must.len(), 0);\n        assert_eq!(tantivy_bool_query.should.len(), 0);\n        assert_eq!(tantivy_bool_query.must_not.len(), 0);\n        assert_eq!(tantivy_bool_query.filter.len(), 1);\n        assert_eq!(&tantivy_bool_query.filter[0], &TantivyQueryAst::match_all(),);\n    }\n\n    #[test]\n    fn test_query_parse_default_occur_must() {\n        let query_ast: QueryAst = UserInputQuery {\n            user_text: \"field:hello field:toto\".to_string(),\n            default_fields: None,\n            default_operator: crate::BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[])\n        .unwrap();\n        let 
QueryAst::Bool(bool_query) = query_ast else {\n            panic!()\n        };\n        assert_eq!(bool_query.must.len(), 2);\n    }\n\n    #[test]\n    fn test_query_parse_default_occur_should() {\n        let query_ast: QueryAst = UserInputQuery {\n            user_text: \"field:hello field:toto\".to_string(),\n            default_fields: None,\n            default_operator: crate::BooleanOperand::Or,\n            lenient: false,\n        }\n        .parse_user_query(&[])\n        .unwrap();\n        let QueryAst::Bool(bool_query) = query_ast else {\n            panic!()\n        };\n        assert_eq!(bool_query.should.len(), 2);\n    }\n\n    #[test]\n    fn test_query_ast_from_user_text_default_as_and() {\n        let ast = query_ast_from_user_text(\"hello you\", None);\n        let QueryAst::UserInput(input_query) = ast else {\n            panic!()\n        };\n        assert_eq!(input_query.default_operator, BooleanOperand::And);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/phrase_prefix_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::{Deserialize, Serialize};\nuse tantivy::Term;\nuse tantivy::query::PhrasePrefixQuery as TantivyPhrasePrefixQuery;\nuse tantivy::schema::{Field, FieldType, Schema as TantivySchema};\n\nuse crate::query_ast::tantivy_query_ast::TantivyQueryAst;\nuse crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext, FullTextParams, QueryAst};\nuse crate::tokenizers::TokenizerManager;\nuse crate::{InvalidQuery, find_field_or_hit_dynamic};\n\n/// The PhraseQuery node is meant to be tokenized and searched.\n///\n/// If after tokenization, a single term is emitted, it will naturally be\n/// produce a tantivy TermQuery.\n/// If not terms is emitted, it will produce a query that match no documents..\n#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\npub struct PhrasePrefixQuery {\n    pub field: String,\n    pub phrase: String,\n    pub max_expansions: u32,\n    pub params: FullTextParams,\n    /// Support missing fields\n    pub lenient: bool,\n}\n\nimpl PhrasePrefixQuery {\n    pub fn get_terms(\n        &self,\n        schema: &TantivySchema,\n        tokenizer_manager: &TokenizerManager,\n    ) -> Result<(Field, Vec<(usize, Term)>), InvalidQuery> {\n        let (field, field_entry, json_path) = find_field_or_hit_dynamic(&self.field, schema)\n            .ok_or_else(|| InvalidQuery::FieldDoesNotExist {\n                full_path: 
self.field.clone(),\n            })?;\n        let field_type = field_entry.field_type();\n\n        match field_type {\n            FieldType::Str(text_options) => {\n                let text_field_indexing = text_options.get_indexing_options().ok_or_else(|| {\n                    InvalidQuery::SchemaError(format!(\n                        \"field {} is not full-text searchable\",\n                        field_entry.name()\n                    ))\n                })?;\n                let terms = self.params.tokenize_text_into_terms(\n                    field,\n                    &self.phrase,\n                    text_field_indexing,\n                    tokenizer_manager,\n                )?;\n                if !text_field_indexing.index_option().has_positions() && terms.len() > 1 {\n                    return Err(InvalidQuery::SchemaError(\n                        \"trying to run a phrase prefix query on a field which does not have \\\n                         positions indexed\"\n                            .to_string(),\n                    ));\n                }\n                Ok((field, terms))\n            }\n            FieldType::JsonObject(json_options) => {\n                let text_field_indexing =\n                    json_options.get_text_indexing_options().ok_or_else(|| {\n                        InvalidQuery::SchemaError(format!(\n                            \"field {} is not full-text searchable\",\n                            field_entry.name()\n                        ))\n                    })?;\n                let terms = self.params.tokenize_text_into_terms_json(\n                    field,\n                    json_path,\n                    &self.phrase,\n                    json_options,\n                    tokenizer_manager,\n                )?;\n                if !text_field_indexing.index_option().has_positions() && terms.len() > 1 {\n                    return Err(InvalidQuery::SchemaError(\n                        \"trying 
to run a PhrasePrefix query on a field which does not have \\\n                         positions indexed\"\n                            .to_string(),\n                    ));\n                }\n                Ok((field, terms))\n            }\n            _ => Err(InvalidQuery::SchemaError(\n                \"trying to run a PhrasePrefix query on a non-text field\".to_string(),\n            )),\n        }\n    }\n}\n\nimpl From<PhrasePrefixQuery> for QueryAst {\n    fn from(phrase_query: PhrasePrefixQuery) -> Self {\n        QueryAst::PhrasePrefix(phrase_query)\n    }\n}\n\nimpl BuildTantivyAst for PhrasePrefixQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let (_, terms) = match self.get_terms(context.schema, context.tokenizer_manager) {\n            Ok(res) => res,\n            Err(InvalidQuery::FieldDoesNotExist { .. }) if self.lenient => {\n                return Ok(TantivyQueryAst::match_none());\n            }\n            Err(e) => return Err(e),\n        };\n\n        if terms.is_empty() {\n            if self.params.zero_terms_query.is_none() {\n                Ok(TantivyQueryAst::match_none())\n            } else {\n                Ok(TantivyQueryAst::match_all())\n            }\n        } else {\n            let mut phrase_prefix_query = TantivyPhrasePrefixQuery::new_with_offset(terms);\n            phrase_prefix_query.set_max_expansions(self.max_expansions);\n            Ok(phrase_prefix_query.into())\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/range_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Bound;\n\nuse serde::{Deserialize, Serialize};\nuse tantivy::fastfield::FastValue;\nuse tantivy::query::FastFieldRangeQuery;\nuse tantivy::tokenizer::TextAnalyzer;\nuse tantivy::{DateTime, Term};\n\nuse super::QueryAst;\nuse super::tantivy_query_ast::TantivyBoolQuery;\nuse crate::json_literal::InterpretUserInput;\nuse crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext, TantivyQueryAst};\nuse crate::{InvalidQuery, JsonLiteral};\n\n#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, Eq)]\npub struct RangeQuery {\n    pub field: String,\n    pub lower_bound: Bound<JsonLiteral>,\n    pub upper_bound: Bound<JsonLiteral>,\n}\n\n/// Converts a given bound JsonLiteral bound into a bound of type T.\nfn convert_bound<'a, T>(bound: &'a Bound<JsonLiteral>) -> Option<Bound<T>>\nwhere T: InterpretUserInput<'a> {\n    match bound {\n        Bound::Included(val) => {\n            let val = T::interpret_json(val)?;\n            Some(Bound::Included(val))\n        }\n        Bound::Excluded(val) => {\n            let val = T::interpret_json(val)?;\n            Some(Bound::Excluded(val))\n        }\n        Bound::Unbounded => Some(Bound::Unbounded),\n    }\n}\n\n/// Converts a given bound JsonLiteral bound into a bound of type T.\nfn convert_bounds<'a, T>(\n    lower_bound: &'a Bound<JsonLiteral>,\n    upper_bound: &'a 
Bound<JsonLiteral>,\n    field_name: &str,\n) -> Result<(Bound<T>, Bound<T>), InvalidQuery>\nwhere\n    T: InterpretUserInput<'a>,\n{\n    let invalid_query = || InvalidQuery::InvalidBoundary {\n        expected_value_type: T::name(),\n        field_name: field_name.to_string(),\n    };\n    let lower_bound = convert_bound(lower_bound).ok_or_else(invalid_query)?;\n    let upper_bound = convert_bound(upper_bound).ok_or_else(invalid_query)?;\n    Ok((lower_bound, upper_bound))\n}\n\nimpl From<RangeQuery> for QueryAst {\n    fn from(range_query: RangeQuery) -> Self {\n        QueryAst::Range(range_query)\n    }\n}\n\nfn term_with_fastval<T: FastValue>(term: &Term, val: T) -> Term {\n    let mut term = term.clone();\n    term.append_type_and_fast_value(val);\n    term\n}\n\nfn query_from_fast_val_range<T: FastValue>(\n    empty_term: &Term,\n    range: (Bound<T>, Bound<T>),\n) -> FastFieldRangeQuery {\n    let (lower_bound, upper_bound) = range;\n    FastFieldRangeQuery::new(\n        lower_bound.map(|val| term_with_fastval(empty_term, val)),\n        upper_bound.map(|val| term_with_fastval(empty_term, val)),\n    )\n}\n\nfn get_normalized_text(normalizer: &mut Option<TextAnalyzer>, text: &str) -> String {\n    if let Some(normalizer) = normalizer {\n        let mut token_stream = normalizer.token_stream(text);\n        let mut tokens = Vec::new();\n        token_stream.process(&mut |token| {\n            tokens.push(token.text.clone());\n        });\n        tokens[0].to_string()\n    } else {\n        text.to_string()\n    }\n}\n\nimpl BuildTantivyAst for RangeQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let (field, field_entry, json_path) =\n            super::utils::find_field_or_hit_dynamic(&self.field, context.schema).ok_or_else(\n                || InvalidQuery::FieldDoesNotExist 
{\n                    full_path: self.field.clone(),\n                },\n            )?;\n        if !field_entry.is_fast() {\n            return Err(InvalidQuery::SchemaError(format!(\n                \"range queries are only supported for fast fields. (`{}` is not a fast field)\",\n                field_entry.name()\n            )));\n        }\n        Ok(match field_entry.field_type() {\n            tantivy::schema::FieldType::Str(options) => {\n                let mut normalizer =\n                    options\n                        .get_fast_field_tokenizer_name()\n                        .and_then(|tokenizer_name| {\n                            context.tokenizer_manager.get_normalizer(tokenizer_name)\n                        });\n\n                let (lower_bound, upper_bound) =\n                    convert_bounds(&self.lower_bound, &self.upper_bound, field_entry.name())?;\n\n                FastFieldRangeQuery::new(\n                    lower_bound.map(|text| {\n                        Term::from_field_text(field, &get_normalized_text(&mut normalizer, text))\n                    }),\n                    upper_bound.map(|text| {\n                        Term::from_field_text(field, &get_normalized_text(&mut normalizer, text))\n                    }),\n                )\n                .into()\n            }\n            tantivy::schema::FieldType::U64(_) => {\n                let (lower_bound, upper_bound) =\n                    convert_bounds(&self.lower_bound, &self.upper_bound, field_entry.name())?;\n                FastFieldRangeQuery::new(\n                    lower_bound.map(|val| Term::from_field_u64(field, val)),\n                    upper_bound.map(|val| Term::from_field_u64(field, val)),\n                )\n                .into()\n            }\n            tantivy::schema::FieldType::I64(_) => {\n                let (lower_bound, upper_bound) =\n                    convert_bounds(&self.lower_bound, &self.upper_bound, field_entry.name())?;\n  
              FastFieldRangeQuery::new(\n                    lower_bound.map(|val| Term::from_field_i64(field, val)),\n                    upper_bound.map(|val| Term::from_field_i64(field, val)),\n                )\n                .into()\n            }\n            tantivy::schema::FieldType::F64(_) => {\n                let (lower_bound, upper_bound) =\n                    convert_bounds(&self.lower_bound, &self.upper_bound, field_entry.name())?;\n                FastFieldRangeQuery::new(\n                    lower_bound.map(|val| Term::from_field_f64(field, val)),\n                    upper_bound.map(|val| Term::from_field_f64(field, val)),\n                )\n                .into()\n            }\n            tantivy::schema::FieldType::Bool(_) => {\n                return Err(InvalidQuery::RangeQueryNotSupportedForField {\n                    value_type: \"bool\",\n                    field_name: field_entry.name().to_string(),\n                });\n            }\n            tantivy::schema::FieldType::Date(date_options) => {\n                let (lower_bound, upper_bound) =\n                    convert_bounds(&self.lower_bound, &self.upper_bound, field_entry.name())?;\n                let truncate_datetime =\n                    |date: &DateTime| date.truncate(date_options.get_precision());\n                let lower_bound = lower_bound.as_ref().map(truncate_datetime);\n                let upper_bound = upper_bound.as_ref().map(truncate_datetime);\n                FastFieldRangeQuery::new(\n                    lower_bound.map(|val| Term::from_field_date(field, val)),\n                    upper_bound.map(|val| Term::from_field_date(field, val)),\n                )\n                .into()\n            }\n            tantivy::schema::FieldType::Facet(_) => {\n                return Err(InvalidQuery::RangeQueryNotSupportedForField {\n                    value_type: \"facet\",\n                    field_name: field_entry.name().to_string(),\n                
});\n            }\n            tantivy::schema::FieldType::Bytes(_) => {\n                return Err(InvalidQuery::RangeQueryNotSupportedForField {\n                    value_type: \"bytes\",\n                    field_name: field_entry.name().to_string(),\n                });\n            }\n            tantivy::schema::FieldType::JsonObject(options) => {\n                let mut sub_queries: Vec<TantivyQueryAst> = Vec::new();\n                let empty_term =\n                    Term::from_field_json_path(field, json_path, options.is_expand_dots_enabled());\n                // Try to convert the bounds into numerical values in the following order: i64,\n                // u64, f64. Tantivy will convert to the correct numerical type of the column if\n                // it doesn't match.\n                let bounds_range_i64: Option<(Bound<i64>, Bound<i64>)> =\n                    convert_bound(&self.lower_bound).zip(convert_bound(&self.upper_bound));\n                let bounds_range_u64: Option<(Bound<u64>, Bound<u64>)> =\n                    convert_bound(&self.lower_bound).zip(convert_bound(&self.upper_bound));\n                let bounds_range_f64: Option<(Bound<f64>, Bound<f64>)> =\n                    convert_bound(&self.lower_bound).zip(convert_bound(&self.upper_bound));\n                if let Some(range) = bounds_range_i64 {\n                    sub_queries.push(query_from_fast_val_range(&empty_term, range).into());\n                } else if let Some(range) = bounds_range_u64 {\n                    sub_queries.push(query_from_fast_val_range(&empty_term, range).into());\n                } else if let Some(range) = bounds_range_f64 {\n                    sub_queries.push(query_from_fast_val_range(&empty_term, range).into());\n                }\n                let bounds_range_date: Option<(Bound<DateTime>, Bound<DateTime>)> =\n                    convert_bound(&self.lower_bound).zip(convert_bound(&self.upper_bound));\n                if let Some(range) = bounds_range_date {\n                    sub_queries.push(query_from_fast_val_range(&empty_term, range).into());\n                }\n                let mut normalizer =\n                    options\n                 
       .get_fast_field_tokenizer_name()\n                        .and_then(|tokenizer_name| {\n                            context.tokenizer_manager.get_normalizer(tokenizer_name)\n                        });\n\n                let bounds_range_str: Option<(Bound<&str>, Bound<&str>)> =\n                    convert_bound(&self.lower_bound).zip(convert_bound(&self.upper_bound));\n                if let Some(range) = bounds_range_str {\n                    let str_query = FastFieldRangeQuery::new(\n                        range.0.map(|val| {\n                            let val = get_normalized_text(&mut normalizer, val);\n                            let mut term = empty_term.clone();\n                            term.append_type_and_str(&val);\n                            term\n                        }),\n                        range.1.map(|val| {\n                            let val = get_normalized_text(&mut normalizer, val);\n                            let mut term = empty_term.clone();\n                            term.append_type_and_str(&val);\n                            term\n                        }),\n                    )\n                    .into();\n                    sub_queries.push(str_query);\n                }\n                if sub_queries.is_empty() {\n                    return Err(InvalidQuery::InvalidBoundary {\n                        expected_value_type: \"i64, u64, f64, str\",\n                        field_name: field_entry.name().to_string(),\n                    });\n                }\n                if sub_queries.len() == 1 {\n                    return Ok(sub_queries.pop().unwrap());\n                }\n\n                let bool_query = TantivyBoolQuery {\n                    should: sub_queries,\n                    ..Default::default()\n                };\n                bool_query.into()\n            }\n            tantivy::schema::FieldType::IpAddr(_) => {\n                let (lower_bound, upper_bound) =\n                
    convert_bounds(&self.lower_bound, &self.upper_bound, field_entry.name())?;\n                FastFieldRangeQuery::new(\n                    lower_bound.map(|val| Term::from_field_ip_addr(field, val)),\n                    upper_bound.map(|val| Term::from_field_ip_addr(field, val)),\n                )\n                .into()\n            }\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::Bound;\n\n    use tantivy::schema::{DateOptions, DateTimePrecision, FAST, STORED, Schema, TEXT};\n\n    use super::RangeQuery;\n    use crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext};\n    use crate::{InvalidQuery, JsonLiteral, MatchAllOrNone};\n\n    fn make_schema(dynamic_mode: bool) -> Schema {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_i64_field(\"my_i64_field\", FAST);\n        schema_builder.add_u64_field(\"my_u64_field\", FAST);\n        schema_builder.add_f64_field(\"my_f64_field\", FAST);\n        schema_builder.add_text_field(\"my_str_field\", FAST);\n        let date_options = DateOptions::default()\n            .set_fast()\n            .set_precision(DateTimePrecision::Milliseconds);\n        schema_builder.add_date_field(\"my_date_field\", date_options);\n        schema_builder.add_u64_field(\"my_u64_not_fastfield\", STORED);\n        if dynamic_mode {\n            schema_builder.add_json_field(\"_dynamic\", TEXT | STORED | FAST);\n        }\n        schema_builder.build()\n    }\n\n    fn test_range_query_typed_field_util(\n        field: &str,\n        lower_value: JsonLiteral,\n        upper_value: JsonLiteral,\n        expected: &str,\n    ) {\n        let schema = make_schema(false);\n        let range_query = RangeQuery {\n            field: field.to_string(),\n            lower_bound: Bound::Included(lower_value),\n            upper_bound: Bound::Included(upper_value),\n        };\n        let tantivy_ast = range_query\n            
.build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap()\n            .simplify();\n        let leaf = tantivy_ast.as_leaf().unwrap();\n        let leaf_str = format!(\"{leaf:?}\");\n        assert_eq!(leaf_str, expected);\n    }\n\n    #[test]\n    fn test_range_query_typed_field() {\n        test_range_query_typed_field_util(\n            \"my_i64_field\",\n            JsonLiteral::String(\"1980\".to_string()),\n            JsonLiteral::String(\"1989\".to_string()),\n            \"FastFieldRangeQuery { bounds: BoundsRange { lower_bound: Included(Term(field=0, \\\n             type=I64, 1980)), upper_bound: Included(Term(field=0, type=I64, 1989)) } }\",\n        );\n        test_range_query_typed_field_util(\n            \"my_u64_field\",\n            JsonLiteral::String(\"1980\".to_string()),\n            JsonLiteral::String(\"1989\".to_string()),\n            \"FastFieldRangeQuery { bounds: BoundsRange { lower_bound: Included(Term(field=1, \\\n             type=U64, 1980)), upper_bound: Included(Term(field=1, type=U64, 1989)) } }\",\n        );\n        test_range_query_typed_field_util(\n            \"my_f64_field\",\n            JsonLiteral::String(\"1980\".to_string()),\n            JsonLiteral::String(\"1989\".to_string()),\n            \"FastFieldRangeQuery { bounds: BoundsRange { lower_bound: Included(Term(field=2, \\\n             type=F64, 1980.0)), upper_bound: Included(Term(field=2, type=F64, 1989.0)) } }\",\n        );\n    }\n\n    #[test]\n    fn test_range_query_missing_field() {\n        let schema = make_schema(false);\n        let range_query = RangeQuery {\n            field: \"missing_field.toto\".to_string(),\n            lower_bound: Bound::Included(JsonLiteral::String(\"1980\".to_string())),\n            upper_bound: Bound::Included(JsonLiteral::String(\"1989\".to_string())),\n        };\n        // with validation\n        let invalid_query: InvalidQuery = range_query\n            
.build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap_err();\n        assert!(\n            matches!(invalid_query, InvalidQuery::FieldDoesNotExist { full_path } if full_path == \"missing_field.toto\")\n        );\n        // without validation\n        assert_eq!(\n            range_query\n                .build_tantivy_ast_call(\n                    &BuildTantivyAstContext::for_test(&schema).without_validation()\n                )\n                .unwrap()\n                .const_predicate(),\n            Some(MatchAllOrNone::MatchNone)\n        );\n    }\n\n    #[test]\n    fn test_range_dynamic() {\n        let range_query = RangeQuery {\n            field: \"hello\".to_string(),\n            lower_bound: Bound::Included(JsonLiteral::String(\"1980\".to_string())),\n            upper_bound: Bound::Included(JsonLiteral::String(\"1989\".to_string())),\n        };\n        let schema = make_schema(true);\n        let tantivy_ast = range_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        assert_eq!(\n            format!(\"{tantivy_ast:?}\"),\n            \"Bool(TantivyBoolQuery { must: [], must_not: [], should: [Leaf(FastFieldRangeQuery { \\\n             bounds: BoundsRange { lower_bound: Included(Term(field=6, type=Json, path=hello, \\\n             type=I64, 1980)), upper_bound: Included(Term(field=6, type=Json, path=hello, \\\n             type=I64, 1989)) } }), Leaf(FastFieldRangeQuery { bounds: BoundsRange { lower_bound: \\\n             Included(Term(field=6, type=Json, path=hello, type=Str, \\\"1980\\\")), upper_bound: \\\n             Included(Term(field=6, type=Json, path=hello, type=Str, \\\"1989\\\")) } })], filter: \\\n             [], minimum_should_match: None })\"\n        );\n    }\n\n    #[test]\n    fn test_range_dynamic_datetime() {\n        let range_query = RangeQuery {\n            field: \"hello\".to_string(),\n            lower_bound: 
Bound::Included(JsonLiteral::String(\n                \"2020-12-09T16:09:53+00:00\".to_string(),\n            )),\n            upper_bound: Bound::Included(JsonLiteral::String(\n                \"2020-12-09T16:09:53+00:00\".to_string(),\n            )),\n        };\n        let schema = make_schema(true);\n        let tantivy_ast = range_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        assert_eq!(\n            format!(\"{tantivy_ast:?}\"),\n            \"Bool(TantivyBoolQuery { must: [], must_not: [], should: [Leaf(FastFieldRangeQuery { \\\n             bounds: BoundsRange { lower_bound: Included(Term(field=6, type=Json, path=hello, \\\n             type=Date, 2020-12-09T16:09:53Z)), upper_bound: Included(Term(field=6, type=Json, \\\n             path=hello, type=Date, 2020-12-09T16:09:53Z)) } }), Leaf(FastFieldRangeQuery { \\\n             bounds: BoundsRange { lower_bound: Included(Term(field=6, type=Json, path=hello, \\\n             type=Str, \\\"2020-12-09T16:09:53+00:00\\\")), upper_bound: Included(Term(field=6, \\\n             type=Json, path=hello, type=Str, \\\"2020-12-09T16:09:53+00:00\\\")) } })], filter: [], \\\n             minimum_should_match: None })\"\n        );\n    }\n\n    #[test]\n    fn test_range_query_not_fast_field() {\n        let range_query = RangeQuery {\n            field: \"my_u64_not_fastfield\".to_string(),\n            lower_bound: Bound::Included(JsonLiteral::String(\"1980\".to_string())),\n            upper_bound: Bound::Included(JsonLiteral::String(\"1989\".to_string())),\n        };\n        let schema = make_schema(false);\n        let err = range_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap_err();\n        assert!(matches!(err, InvalidQuery::SchemaError { .. }));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/regex_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse anyhow::Context;\npub use prefix::{AutomatonQuery, JsonPathPrefix};\nuse serde::{Deserialize, Serialize};\nuse tantivy::Term;\nuse tantivy::schema::{Field, FieldType, Schema as TantivySchema};\n\nuse super::{BuildTantivyAst, BuildTantivyAstContext, QueryAst};\nuse crate::query_ast::TantivyQueryAst;\nuse crate::{InvalidQuery, find_field_or_hit_dynamic};\n\n/// A Regex query\n#[derive(PartialEq, Eq, Debug, Serialize, Deserialize, Clone)]\npub struct RegexQuery {\n    pub field: String,\n    pub regex: String,\n}\n\nimpl From<RegexQuery> for QueryAst {\n    fn from(regex_query: RegexQuery) -> Self {\n        Self::Regex(regex_query)\n    }\n}\n\nimpl RegexQuery {\n    #[cfg(test)]\n    pub fn from_field_value(field: impl ToString, regex: impl ToString) -> Self {\n        Self {\n            field: field.to_string(),\n            regex: regex.to_string(),\n        }\n    }\n}\n\nimpl RegexQuery {\n    pub fn to_field_and_regex(\n        &self,\n        schema: &TantivySchema,\n    ) -> Result<(Field, Option<Vec<u8>>, String), InvalidQuery> {\n        let Some((field, field_entry, json_path)) = find_field_or_hit_dynamic(&self.field, schema)\n        else {\n            return Err(InvalidQuery::FieldDoesNotExist {\n                full_path: self.field.clone(),\n            });\n        };\n        let field_type = 
field_entry.field_type();\n\n        match field_type {\n            FieldType::Str(text_options) => {\n                text_options.get_indexing_options().ok_or_else(|| {\n                    InvalidQuery::SchemaError(format!(\n                        \"field {} is not full-text searchable\",\n                        field_entry.name()\n                    ))\n                })?;\n\n                Ok((field, None, self.regex.to_string()))\n            }\n            FieldType::JsonObject(json_options) => {\n                json_options.get_text_indexing_options().ok_or_else(|| {\n                    InvalidQuery::SchemaError(format!(\n                        \"field {} is not full-text searchable\",\n                        field_entry.name()\n                    ))\n                })?;\n\n                let mut term_for_path = Term::from_field_json_path(\n                    field,\n                    json_path,\n                    json_options.is_expand_dots_enabled(),\n                );\n                term_for_path.append_type_and_str(\"\");\n\n                let value = term_for_path.value();\n                // We skip the 1st byte which is a marker to tell this is json. 
This isn't present\n                // in the dictionary\n                let byte_path_prefix = value.as_serialized()[1..].to_owned();\n                Ok((field, Some(byte_path_prefix), self.regex.to_string()))\n            }\n            _ => Err(InvalidQuery::SchemaError(\n                \"trying to run a regex query on a non-text field\".to_string(),\n            )),\n        }\n    }\n}\n\nimpl BuildTantivyAst for RegexQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let (field, path, regex) = self.to_field_and_regex(context.schema)?;\n        let regex = tantivy_fst::Regex::new(&regex).context(\"failed to parse regex\")?;\n        let regex_automaton_with_path = JsonPathPrefix {\n            prefix: path.unwrap_or_default(),\n            automaton: regex.into(),\n        };\n        let regex_query_with_path = AutomatonQuery {\n            field,\n            automaton: Arc::new(regex_automaton_with_path),\n        };\n        Ok(regex_query_with_path.into())\n    }\n}\n\nmod prefix {\n    use std::sync::Arc;\n\n    use tantivy::query::{AutomatonWeight, EnableScoring, Query, Weight};\n    use tantivy::schema::Field;\n    use tantivy_fst::Automaton;\n\n    pub struct JsonPathPrefix<A> {\n        pub prefix: Vec<u8>,\n        pub automaton: Arc<A>,\n    }\n\n    // we need to implement manually because the std adds an unnecessary bound `A: Clone`\n    impl<A> Clone for JsonPathPrefix<A> {\n        fn clone(&self) -> Self {\n            JsonPathPrefix {\n                prefix: self.prefix.clone(),\n                automaton: self.automaton.clone(),\n            }\n        }\n    }\n\n    #[derive(Clone, Debug, PartialEq)]\n    pub enum JsonPathPrefixState<A> {\n        Prefix(usize),\n        Inner(A),\n        PrefixFailed,\n    }\n\n    impl<A: Automaton> Automaton for JsonPathPrefix<A> {\n        type State = JsonPathPrefixState<A::State>;\n\n       
 fn start(&self) -> Self::State {\n            if self.prefix.is_empty() {\n                JsonPathPrefixState::Inner(self.automaton.start())\n            } else {\n                JsonPathPrefixState::Prefix(0)\n            }\n        }\n\n        fn is_match(&self, state: &Self::State) -> bool {\n            match state {\n                JsonPathPrefixState::Prefix(_) => false,\n                JsonPathPrefixState::Inner(inner_state) => self.automaton.is_match(inner_state),\n                JsonPathPrefixState::PrefixFailed => false,\n            }\n        }\n\n        fn accept(&self, state: &Self::State, byte: u8) -> Self::State {\n            match state {\n                JsonPathPrefixState::Prefix(i) => {\n                    if self.prefix.get(*i) != Some(&byte) {\n                        return JsonPathPrefixState::PrefixFailed;\n                    }\n                    let next_pos = i + 1;\n                    if next_pos == self.prefix.len() {\n                        JsonPathPrefixState::Inner(self.automaton.start())\n                    } else {\n                        JsonPathPrefixState::Prefix(next_pos)\n                    }\n                }\n                JsonPathPrefixState::Inner(inner_state) => {\n                    JsonPathPrefixState::Inner(self.automaton.accept(inner_state, byte))\n                }\n                JsonPathPrefixState::PrefixFailed => JsonPathPrefixState::PrefixFailed,\n            }\n        }\n\n        fn can_match(&self, state: &Self::State) -> bool {\n            match state {\n                JsonPathPrefixState::Prefix(_) => true,\n                JsonPathPrefixState::Inner(inner_state) => self.automaton.can_match(inner_state),\n                JsonPathPrefixState::PrefixFailed => false,\n            }\n        }\n\n        fn will_always_match(&self, state: &Self::State) -> bool {\n            match state {\n                JsonPathPrefixState::Prefix(_) => false,\n                
JsonPathPrefixState::Inner(inner_state) => {\n                    self.automaton.will_always_match(inner_state)\n                }\n                JsonPathPrefixState::PrefixFailed => false,\n            }\n        }\n    }\n\n    // we don't use RegexQuery to handle our path. We could tinker with the regex to embed\n    // json field path inside, but that seems not as clean, and would prevent support of\n    // case-insensitive search in the future (we would also make the path insensitive,\n    // which we shouldn't)\n    pub struct AutomatonQuery<A> {\n        pub automaton: Arc<A>,\n        pub field: Field,\n    }\n\n    impl<A> std::fmt::Debug for AutomatonQuery<A> {\n        fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n            f.debug_struct(\"AutomatonQuery\")\n                .field(\"field\", &self.field)\n                .field(\"automaton\", &std::any::type_name::<A>())\n                .finish()\n        }\n    }\n\n    impl<A> Clone for AutomatonQuery<A> {\n        fn clone(&self) -> Self {\n            AutomatonQuery {\n                automaton: self.automaton.clone(),\n                field: self.field,\n            }\n        }\n    }\n\n    impl<A: Automaton + Send + Sync + 'static> Query for AutomatonQuery<A>\n    where A::State: Clone\n    {\n        fn weight(&self, _enabled_scoring: EnableScoring<'_>) -> tantivy::Result<Box<dyn Weight>> {\n            Ok(Box::new(AutomatonWeight::<A>::new(\n                self.field,\n                self.automaton.clone(),\n            )))\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n\n    use tantivy::schema::{Schema as TantivySchema, TEXT};\n    use tantivy_fst::{Automaton, Regex};\n\n    use super::prefix::JsonPathPrefixState;\n    use super::{JsonPathPrefix, RegexQuery};\n\n    #[test]\n    fn test_regex_query_text_field() {\n        let mut schema_builder = TantivySchema::builder();\n        schema_builder.add_text_field(\"field\", TEXT);\n        
let schema = schema_builder.build();\n\n        let query = RegexQuery {\n            field: \"field\".to_string(),\n            regex: \"abc.*xyz\".to_string(),\n        };\n        let (field, path, regex) = query.to_field_and_regex(&schema).unwrap();\n        assert_eq!(field, schema.get_field(\"field\").unwrap());\n        assert!(path.is_none());\n        assert_eq!(regex, query.regex);\n    }\n\n    #[test]\n    fn test_regex_query_json_field() {\n        let mut schema_builder = TantivySchema::builder();\n        schema_builder.add_json_field(\"field\", TEXT);\n        let schema = schema_builder.build();\n\n        let query = RegexQuery {\n            field: \"field.sub.field\".to_string(),\n            regex: \"abc.*xyz\".to_string(),\n        };\n        let (field, path, regex) = query.to_field_and_regex(&schema).unwrap();\n        assert_eq!(field, schema.get_field(\"field\").unwrap());\n        assert_eq!(path.unwrap(), b\"sub\\x01field\\0s\");\n        assert_eq!(regex, query.regex);\n\n        // i believe this is how concatenated field behave\n        let query_empty_path = RegexQuery {\n            field: \"field\".to_string(),\n            regex: \"abc.*xyz\".to_string(),\n        };\n        let (field, path, regex) = query_empty_path.to_field_and_regex(&schema).unwrap();\n        assert_eq!(field, schema.get_field(\"field\").unwrap());\n        assert_eq!(path.unwrap(), b\"\\0s\");\n        assert_eq!(regex, query_empty_path.regex);\n    }\n\n    #[test]\n    fn test_json_prefix_automaton_empty_path() {\n        let regex = Arc::new(Regex::new(\"e(f|g.*)\").unwrap());\n        let empty_path_automaton = JsonPathPrefix {\n            prefix: Vec::new(),\n            automaton: regex.clone(),\n        };\n\n        let start = empty_path_automaton.start();\n        assert_eq!(start, JsonPathPrefixState::Inner(regex.start()));\n    }\n\n    #[test]\n    fn test_json_prefix_automaton() {\n        let regex = 
Arc::new(Regex::new(\"e(f|g.*)\").unwrap());\n        let automaton = JsonPathPrefix {\n            prefix: b\"ab\".to_vec(),\n            automaton: regex.clone(),\n        };\n\n        let start = automaton.start();\n        assert!(matches!(start, JsonPathPrefixState::Prefix(_)));\n        assert!(automaton.can_match(&start));\n        assert!(!automaton.is_match(&start));\n\n        let miss = automaton.accept(&start, b'g');\n        assert_eq!(miss, JsonPathPrefixState::PrefixFailed);\n        // supporting this is important for optimisation\n        assert!(!automaton.can_match(&miss));\n        assert!(!automaton.is_match(&miss));\n\n        let a = automaton.accept(&start, b'a');\n        assert!(matches!(a, JsonPathPrefixState::Prefix(_)));\n        assert!(automaton.can_match(&a));\n        assert!(!automaton.is_match(&a));\n\n        let ab = automaton.accept(&a, b'b');\n        assert_eq!(ab, JsonPathPrefixState::Inner(regex.start()));\n        assert!(automaton.can_match(&ab));\n        assert!(!automaton.is_match(&ab));\n\n        // starting here, we just take that we passthrough correctly,\n        // and reply to can_match as well as possible\n        // (we don't test will_always_match because Regex doesn't support it)\n        let abc = automaton.accept(&ab, b'c');\n        assert!(matches!(abc, JsonPathPrefixState::Inner(_)));\n        assert!(!automaton.can_match(&abc));\n        assert!(!automaton.is_match(&abc));\n\n        let abe = automaton.accept(&ab, b'e');\n        assert!(matches!(abe, JsonPathPrefixState::Inner(_)));\n        assert!(automaton.can_match(&abe));\n        assert!(!automaton.is_match(&abe));\n\n        let abef = automaton.accept(&abe, b'f');\n        assert!(matches!(abef, JsonPathPrefixState::Inner(_)));\n        assert!(automaton.can_match(&abef));\n        assert!(automaton.is_match(&abef));\n\n        let abefg = automaton.accept(&abef, b'g');\n        assert!(matches!(abefg, JsonPathPrefixState::Inner(_)));\n      
  assert!(!automaton.can_match(&abefg));\n        assert!(!automaton.is_match(&abefg));\n\n        let abeg = automaton.accept(&abe, b'g');\n        assert!(matches!(abeg, JsonPathPrefixState::Inner(_)));\n        assert!(automaton.can_match(&abeg));\n        assert!(automaton.is_match(&abeg));\n\n        let abegh = automaton.accept(&abeg, b'h');\n        assert!(matches!(abegh, JsonPathPrefixState::Inner(_)));\n        assert!(automaton.can_match(&abegh));\n        assert!(automaton.is_match(&abegh));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/tantivy_query_ast.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse tantivy::query::{\n    AllQuery as TantivyAllQuery, BooleanQuery, ConstScoreQuery as TantivyConstScoreQuery,\n    EmptyQuery as TantivyEmptyQuery,\n};\nuse tantivy::query_grammar::Occur;\n\nuse crate::{BooleanOperand, MatchAllOrNone, TantivyQuery};\n\n/// This AST exists only to make it easier to simplify the generated Tantivy query\n/// when we convert a QueryAst into a TantivyQueryAst.\n///\n/// Let's keep it private.\n#[derive(Debug)]\npub(crate) enum TantivyQueryAst {\n    Bool(TantivyBoolQuery),\n    Leaf(Box<dyn TantivyQuery>),\n    ConstPredicate(MatchAllOrNone),\n}\n\nimpl Clone for TantivyQueryAst {\n    fn clone(&self) -> Self {\n        match self {\n            TantivyQueryAst::Bool(bool_query) => TantivyQueryAst::Bool(bool_query.clone()),\n            TantivyQueryAst::ConstPredicate(predicate) => {\n                TantivyQueryAst::ConstPredicate(*predicate)\n            }\n            TantivyQueryAst::Leaf(query) => TantivyQueryAst::Leaf(query.box_clone()),\n        }\n    }\n}\n\nimpl From<MatchAllOrNone> for TantivyQueryAst {\n    fn from(match_all_or_none: MatchAllOrNone) -> Self {\n        TantivyQueryAst::ConstPredicate(match_all_or_none)\n    }\n}\n\nimpl PartialEq for TantivyQueryAst {\n    fn eq(&self, other: &Self) -> bool {\n        match (self, other) {\n            (Self::Bool(left), Self::Bool(right)) => left == 
right,\n            (Self::Leaf(left), Self::Leaf(right)) => {\n                let left_str: String = format!(\"{left:?}\");\n                let right_str: String = format!(\"{right:?}\");\n                left_str == right_str\n            }\n            (Self::ConstPredicate(left), Self::ConstPredicate(right)) => left == right,\n            _ => false,\n        }\n    }\n}\n\nimpl Eq for TantivyQueryAst {}\n\nimpl TantivyQueryAst {\n    #[cfg(test)]\n    pub(crate) fn as_bool_query(&self) -> Option<&TantivyBoolQuery> {\n        match self {\n            TantivyQueryAst::Bool(bool) => Some(bool),\n            _ => None,\n        }\n    }\n\n    #[cfg(test)]\n    pub(crate) fn as_leaf(&self) -> Option<&dyn TantivyQuery> {\n        match self {\n            TantivyQueryAst::Leaf(tantivy_query) => Some(&**tantivy_query),\n            _ => None,\n        }\n    }\n\n    pub(crate) fn const_predicate(&self) -> Option<MatchAllOrNone> {\n        if let Self::ConstPredicate(always_or_never) = self {\n            Some(*always_or_never)\n        } else {\n            None\n        }\n    }\n\n    pub fn match_all() -> Self {\n        Self::ConstPredicate(MatchAllOrNone::MatchAll)\n    }\n\n    pub fn match_none() -> Self {\n        Self::ConstPredicate(MatchAllOrNone::MatchNone)\n    }\n\n    pub fn simplify(self) -> TantivyQueryAst {\n        match self {\n            TantivyQueryAst::Bool(bool_query) => bool_query.simplify(),\n            ast => ast,\n        }\n    }\n}\n\nimpl<Q: TantivyQuery> From<Q> for TantivyQueryAst {\n    fn from(query: Q) -> TantivyQueryAst {\n        TantivyQueryAst::Leaf(Box::new(query))\n    }\n}\n\nimpl From<TantivyQueryAst> for Box<dyn TantivyQuery> {\n    fn from(boxed_tantivy_query: TantivyQueryAst) -> Box<dyn TantivyQuery> {\n        match boxed_tantivy_query {\n            TantivyQueryAst::Bool(boolean_query) => boolean_query.into(),\n            TantivyQueryAst::Leaf(leaf) => leaf,\n            
TantivyQueryAst::ConstPredicate(always_or_never_match) => match always_or_never_match {\n                MatchAllOrNone::MatchAll => Box::new(TantivyAllQuery),\n                MatchAllOrNone::MatchNone => Box::new(TantivyEmptyQuery),\n            },\n        }\n    }\n}\n\n// Remove the occurrence of trivial AST in the given list of asts.\n//\n// If `stop_before_empty` is true, then we will make sure to stop removing asts if it is\n// the last element.\n// This function may change the order of asts.\nfn remove_with_guard(\n    asts: &mut Vec<TantivyQueryAst>,\n    to_remove: MatchAllOrNone,\n    stop_before_empty: bool,\n) {\n    let mut i = 0;\n    while i < asts.len() {\n        if stop_before_empty && asts.len() == 1 {\n            break;\n        }\n        if asts[i].const_predicate() == Some(to_remove) {\n            asts.swap_remove(i);\n        } else {\n            i += 1;\n        }\n    }\n}\n\n#[derive(Default, Debug, Clone, Eq, PartialEq)]\npub(crate) struct TantivyBoolQuery {\n    pub must: Vec<TantivyQueryAst>,\n    pub must_not: Vec<TantivyQueryAst>,\n    pub should: Vec<TantivyQueryAst>,\n    pub filter: Vec<TantivyQueryAst>,\n    pub minimum_should_match: Option<usize>,\n}\n\nfn simplify_asts(asts: Vec<TantivyQueryAst>) -> Vec<TantivyQueryAst> {\n    asts.into_iter().map(|ast| ast.simplify()).collect()\n}\n\nimpl TantivyBoolQuery {\n    pub fn build_clause(operator: BooleanOperand, children: Vec<TantivyQueryAst>) -> Self {\n        match operator {\n            BooleanOperand::And => Self {\n                must: children,\n                ..Default::default()\n            },\n            BooleanOperand::Or => Self {\n                should: children,\n                ..Default::default()\n            },\n        }\n    }\n\n    pub fn simplify(mut self) -> TantivyQueryAst {\n        // simplify sub branches\n        self.must = simplify_asts(self.must);\n        self.should = simplify_asts(self.should);\n        self.must_not = 
simplify_asts(self.must_not);\n        self.filter = simplify_asts(self.filter);\n\n        for must_children in [&mut self.must, &mut self.filter] {\n            for child in must_children {\n                if child.const_predicate() == Some(MatchAllOrNone::MatchNone) {\n                    return TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchNone);\n                }\n            }\n        }\n        if self.should.is_empty()\n            && self.must.is_empty()\n            && self.filter.is_empty()\n            && self.must_not.is_empty()\n            && self.minimum_should_match.unwrap_or(0) == 0\n        {\n            // This is just a convention mimicking Elastic/Commonsearch's behavior.\n            return TantivyQueryAst::match_all();\n        }\n\n        let mut new_must = Vec::with_capacity(self.must.len());\n        for must in self.must {\n            let mut must_bool = match must {\n                TantivyQueryAst::Bool(bool_query) => bool_query,\n                _ => {\n                    new_must.push(must);\n                    continue;\n                }\n            };\n            if must_bool.should.is_empty() && must_bool.minimum_should_match.is_none() {\n                new_must.append(&mut must_bool.must);\n                self.filter.append(&mut must_bool.filter);\n                self.must_not.append(&mut must_bool.must_not);\n            } else {\n                new_must.push(TantivyQueryAst::Bool(must_bool));\n            }\n        }\n        self.must = new_must;\n\n        let mut new_filter = Vec::with_capacity(self.filter.len());\n        for filter in self.filter {\n            let mut filter_bool = match filter {\n                TantivyQueryAst::Bool(bool_query) => bool_query,\n                _ => {\n                    new_filter.push(filter);\n                    continue;\n                }\n            };\n            if filter_bool.should.is_empty() && filter_bool.minimum_should_match.is_none() {\n            
    new_filter.append(&mut filter_bool.must);\n                new_filter.append(&mut filter_bool.filter);\n                // must_not doesn't contribute to score, no need to move it to some filter_not kind\n                // of thing\n                self.must_not.append(&mut filter_bool.must_not);\n            } else {\n                new_filter.push(TantivyQueryAst::Bool(filter_bool));\n            }\n        }\n        self.filter = new_filter;\n\n        if self.minimum_should_match.is_none() {\n            let mut new_should = Vec::with_capacity(self.should.len());\n            for should in self.should {\n                let mut should_bool = match should {\n                    TantivyQueryAst::Bool(bool_query) => bool_query,\n                    _ => {\n                        new_should.push(should);\n                        continue;\n                    }\n                };\n                if should_bool.must.is_empty()\n                    && should_bool.filter.is_empty()\n                    && should_bool.must_not.is_empty()\n                    && should_bool.minimum_should_match.is_none()\n                {\n                    new_should.append(&mut should_bool.should);\n                } else {\n                    new_should.push(TantivyQueryAst::Bool(should_bool));\n                }\n            }\n            self.should = new_should;\n        }\n\n        // TODO we could turn must_not(must_not(abc, def)) into should(filter(abc), filter(def)),\n        // we can't simply have should(abc, def) because of scoring, and should(filter(abc, def))\n        // has a different meaning\n\n        // remove sub-queries which don't impact the result\n        remove_with_guard(&mut self.must, MatchAllOrNone::MatchAll, true);\n        let mut has_no_positive_ast_so_far = self.must.is_empty();\n        remove_with_guard(\n            &mut self.filter,\n            MatchAllOrNone::MatchAll,\n            has_no_positive_ast_so_far,\n        );\n        
has_no_positive_ast_so_far &= self.filter.is_empty();\n        if !self.filter.is_empty() {\n            // if filter is not empty, we can re-try cleaning must. we can't just check\n            // has_no_positive_ast_so_far as it would clean must if must or filter contained\n            // something\n            remove_with_guard(&mut self.must, MatchAllOrNone::MatchAll, false);\n        }\n        remove_with_guard(\n            &mut self.should,\n            MatchAllOrNone::MatchNone,\n            has_no_positive_ast_so_far,\n        );\n        has_no_positive_ast_so_far &= self.should.is_empty();\n        remove_with_guard(\n            &mut self.must_not,\n            MatchAllOrNone::MatchNone,\n            has_no_positive_ast_so_far,\n        );\n\n        for must_child in self.must.iter().chain(self.filter.iter()) {\n            if must_child.const_predicate() == Some(MatchAllOrNone::MatchNone) {\n                return TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchNone);\n            }\n        }\n        for must_not_child in &self.must_not {\n            if must_not_child.const_predicate() == Some(MatchAllOrNone::MatchAll) {\n                return TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchNone);\n            }\n        }\n        let has_positive_children =\n            !(self.must.is_empty() && self.should.is_empty() && self.filter.is_empty());\n\n        if !has_positive_children {\n            if self.minimum_should_match.unwrap_or(0) > 0 {\n                return MatchAllOrNone::MatchNone.into();\n            }\n            if self\n                .must_not\n                .iter()\n                .all(|must_not| must_not.const_predicate() == Some(MatchAllOrNone::MatchNone))\n            {\n                return MatchAllOrNone::MatchAll.into();\n            }\n            self.must.push(TantivyQueryAst::match_all());\n        } else {\n            let num_children =\n                self.must.len() + self.should.len() + 
self.must_not.len() + self.filter.len();\n            if num_children == 1\n                && self.minimum_should_match.is_none()\n                && let Some(ast) = self.must.pop().or(self.should.pop())\n            {\n                return ast;\n            }\n            // We do not optimize a single filter clause for the moment.\n            // We do need a mechanism to make sure we keep the boost of 0.\n        }\n\n        TantivyQueryAst::Bool(self)\n    }\n}\n\nimpl From<TantivyBoolQuery> for TantivyQueryAst {\n    fn from(bool_query: TantivyBoolQuery) -> Self {\n        TantivyQueryAst::Bool(bool_query)\n    }\n}\n\nimpl From<TantivyBoolQuery> for Box<dyn TantivyQuery> {\n    fn from(bool_query: TantivyBoolQuery) -> Box<dyn TantivyQuery> {\n        let mut clause: Vec<(Occur, Box<dyn TantivyQuery>)> = Vec::with_capacity(\n            bool_query.must.len()\n                + bool_query.must_not.len()\n                + bool_query.should.len()\n                + bool_query.filter.len(),\n        );\n        for (occur, child_asts) in [\n            (Occur::Must, bool_query.must),\n            (Occur::MustNot, bool_query.must_not),\n            (Occur::Should, bool_query.should),\n        ] {\n            for child_ast in child_asts {\n                let sub_query = child_ast.into();\n                clause.push((occur, sub_query));\n            }\n        }\n        for filter_child in bool_query.filter {\n            let filter_query = filter_child.into();\n            clause.push((\n                Occur::Must,\n                Box::new(TantivyConstScoreQuery::new(filter_query, 0.0f32)),\n            ));\n        }\n        let tantivy_bool_query = if let Some(minimum_should_match) = bool_query.minimum_should_match\n        {\n            BooleanQuery::with_minimum_required_clauses(clause, minimum_should_match)\n        } else {\n            BooleanQuery::from(clause)\n        };\n        Box::new(tantivy_bool_query)\n    }\n}\n\n#[cfg(test)]\nmod 
tests {\n    use proptest::prelude::*;\n    use tantivy::query::{EmptyQuery, TermQuery};\n\n    use super::TantivyBoolQuery;\n    use crate::query_ast::tantivy_query_ast::{MatchAllOrNone, TantivyQueryAst, remove_with_guard};\n\n    fn term(val: &str) -> TantivyQueryAst {\n        use tantivy::schema::{Field, Term};\n        TermQuery::new(\n            Term::from_field_text(Field::from_field_id(0), val),\n            Default::default(),\n        )\n        .into()\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_no_clauses() {\n        let bool_query = TantivyBoolQuery::default();\n        assert_eq!(bool_query.simplify(), TantivyQueryAst::match_all());\n    }\n\n    #[test]\n    fn test_remove_with_guard() {\n        {\n            let mut asts = Vec::new();\n            // we are just checking for panics\n            remove_with_guard(&mut asts, MatchAllOrNone::MatchAll, true);\n            remove_with_guard(&mut asts, MatchAllOrNone::MatchAll, false);\n        }\n        {\n            let mut asts = vec![\n                MatchAllOrNone::MatchAll.into(),\n                MatchAllOrNone::MatchAll.into(),\n            ];\n            remove_with_guard(&mut asts, MatchAllOrNone::MatchAll, true);\n            assert_eq!(asts.len(), 1);\n        }\n        {\n            let mut asts = vec![\n                MatchAllOrNone::MatchAll.into(),\n                MatchAllOrNone::MatchAll.into(),\n            ];\n            remove_with_guard(&mut asts, MatchAllOrNone::MatchAll, false);\n            assert!(asts.is_empty());\n        }\n        {\n            let mut asts = vec![\n                MatchAllOrNone::MatchAll.into(),\n                MatchAllOrNone::MatchNone.into(),\n                MatchAllOrNone::MatchAll.into(),\n            ];\n            remove_with_guard(&mut asts, MatchAllOrNone::MatchAll, true);\n            assert_eq!(asts.len(), 1);\n        }\n        {\n            let mut asts = vec![\n                MatchAllOrNone::MatchAll.into(),\n 
               MatchAllOrNone::MatchNone.into(),\n                MatchAllOrNone::MatchAll.into(),\n            ];\n            remove_with_guard(&mut asts, MatchAllOrNone::MatchAll, false);\n            assert_eq!(asts.len(), 1);\n        }\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_one_clauses() {\n        {\n            let tantivy_query = EmptyQuery.into();\n            let bool_query = TantivyBoolQuery {\n                must: vec![tantivy_query],\n                ..Default::default()\n            };\n            assert!(bool_query.simplify().as_leaf().is_some());\n        }\n        {\n            let tantivy_query = EmptyQuery.into();\n            let bool_query = TantivyBoolQuery {\n                should: vec![tantivy_query],\n                ..Default::default()\n            };\n            assert!(bool_query.simplify().as_leaf().is_some());\n        }\n        {\n            let tantivy_query = EmptyQuery.into();\n            let bool_query = TantivyBoolQuery {\n                filter: vec![tantivy_query],\n                ..Default::default()\n            };\n            // We do not simplify filter. 
We somehow need a mechanism to make sure we end up with a\n            // const-score.\n            assert!(bool_query.simplify().as_leaf().is_none());\n        }\n    }\n\n    #[test]\n    fn test_bool_negative_query_add_wildcard() {\n        let tantivy_query = EmptyQuery.into();\n        let simplified_ast = TantivyBoolQuery {\n            must_not: vec![tantivy_query],\n            ..Default::default()\n        }\n        .simplify();\n        let simplified_ast_bool = simplified_ast.as_bool_query().unwrap();\n        assert_eq!(simplified_ast_bool.must_not.len(), 1);\n        assert_eq!(\n            simplified_ast_bool.should.len() + simplified_ast_bool.filter.len(),\n            0\n        );\n        assert_eq!(simplified_ast_bool.must.len(), 1);\n        assert_eq!(\n            simplified_ast_bool.must[0].const_predicate(),\n            Some(MatchAllOrNone::MatchAll)\n        );\n    }\n\n    #[test]\n    fn test_bool_multiple_negative_query_add_wildcard() {\n        let simplified_ast = TantivyBoolQuery {\n            must_not: vec![EmptyQuery.into(), EmptyQuery.into()],\n            ..Default::default()\n        }\n        .simplify();\n        let simplified_ast_bool = simplified_ast.as_bool_query().unwrap();\n        assert_eq!(simplified_ast_bool.must_not.len(), 2);\n        assert_eq!(\n            simplified_ast_bool.should.len() + simplified_ast_bool.filter.len(),\n            0\n        );\n        assert_eq!(simplified_ast_bool.must.len(), 1);\n        assert_eq!(\n            simplified_ast_bool.must[0].const_predicate(),\n            Some(MatchAllOrNone::MatchAll)\n        );\n    }\n\n    #[test]\n    fn test_bool_multiple_negative_query_with_positive() {\n        let simplified_ast = TantivyBoolQuery {\n            must: vec![EmptyQuery.into()],\n            must_not: vec![EmptyQuery.into(), EmptyQuery.into()],\n            ..Default::default()\n        }\n        .simplify();\n        let simplified_ast_bool = 
simplified_ast.as_bool_query().unwrap();\n        assert_eq!(simplified_ast_bool.must_not.len(), 2);\n        assert_eq!(\n            simplified_ast_bool.should.len() + simplified_ast_bool.filter.len(),\n            0\n        );\n        assert_eq!(simplified_ast_bool.must.len(), 1);\n        assert!(simplified_ast_bool.must[0].const_predicate().is_none(),);\n    }\n\n    #[test]\n    fn test_should_lift_simplification() {\n        let test_leaf = TantivyQueryAst::Leaf(Box::new(tantivy::query::AllQuery));\n        let ast = TantivyQueryAst::Bool(TantivyBoolQuery {\n            should: vec![\n                test_leaf.clone(),\n                TantivyQueryAst::Bool(TantivyBoolQuery {\n                    should: vec![test_leaf.clone(), test_leaf],\n                    ..Default::default()\n                }),\n            ],\n            ..Default::default()\n        });\n        let simplified_ast = ast.clone().simplify();\n        assert_ne!(simplified_ast, ast);\n        let TantivyQueryAst::Bool(bool_query) = simplified_ast else {\n            panic!();\n        };\n        assert_eq!(bool_query.should.len(), 3);\n        assert!(bool_query.must.is_empty());\n        assert!(bool_query.filter.is_empty());\n        assert!(bool_query.must_not.is_empty());\n        assert!(bool_query.minimum_should_match.is_none());\n    }\n\n    #[test]\n    fn test_minimum_should_match_prevent_lift_simplification() {\n        let test_leaf = TantivyQueryAst::Leaf(Box::new(tantivy::query::AllQuery));\n        let ast = TantivyQueryAst::Bool(TantivyBoolQuery {\n            should: vec![\n                test_leaf.clone(),\n                TantivyQueryAst::Bool(TantivyBoolQuery {\n                    should: vec![test_leaf.clone(), test_leaf],\n                    ..Default::default()\n                }),\n            ],\n            minimum_should_match: Some(2),\n            ..Default::default()\n        });\n        let simplified_ast = ast.clone().simplify();\n        
assert_eq!(simplified_ast, ast);\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_match_all_must_not_clauses() {\n        let tantivy_query = EmptyQuery.into();\n        let bool_query = TantivyBoolQuery {\n            must: vec![tantivy_query],\n            must_not: vec![TantivyQueryAst::match_all()],\n            ..Default::default()\n        };\n        assert_eq!(\n            bool_query.simplify().const_predicate(),\n            Some(MatchAllOrNone::MatchNone)\n        );\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_match_must_clauses() {\n        let tantivy_query = EmptyQuery.into();\n        let bool_query = TantivyBoolQuery {\n            must: vec![tantivy_query, TantivyQueryAst::match_all()],\n            ..Default::default()\n        }\n        .simplify();\n        assert!(bool_query.as_leaf().is_some());\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_match_must_and_other_positive_clauses() {\n        let bool_query = TantivyBoolQuery {\n            must: vec![TantivyQueryAst::match_all()],\n            filter: vec![EmptyQuery.into()],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                filter: vec![EmptyQuery.into()],\n                ..Default::default()\n            }\n            .into()\n        );\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_match_none_must_clauses() {\n        let tantivy_query = EmptyQuery.into();\n        let bool_query = TantivyBoolQuery {\n            must: vec![TantivyQueryAst::match_none()],\n            should: vec![tantivy_query],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query.const_predicate(),\n            Some(MatchAllOrNone::MatchNone)\n        );\n    }\n\n    #[test]\n    fn test_simplify_bool_query_with_match_none_no_positive_clauses() {\n        let bool_query = TantivyBoolQuery {\n      
      must_not: vec![TantivyQueryAst::match_none()],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(bool_query.const_predicate(), Some(MatchAllOrNone::MatchAll));\n    }\n\n    #[test]\n    fn test_simplify_empty_bool_query_matches_all() {\n        let empty_bool_query = TantivyBoolQuery::default().simplify();\n        assert_eq!(\n            empty_bool_query.const_predicate(),\n            Some(MatchAllOrNone::MatchAll)\n        );\n    }\n\n    #[test]\n    fn test_simplify_lift_bool_bool() {\n        let bool_query = TantivyBoolQuery {\n            must: vec![\n                TantivyBoolQuery {\n                    must: vec![term(\"abc\"), term(\"def\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    must: vec![term(\"ghi\"), term(\"jkl\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                must: vec![term(\"abc\"), term(\"def\"), term(\"ghi\"), term(\"jkl\"),],\n                ..Default::default()\n            }\n            .into()\n        );\n\n        let bool_query = TantivyBoolQuery {\n            should: vec![\n                TantivyBoolQuery {\n                    should: vec![term(\"abc\"), term(\"def\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    should: vec![term(\"ghi\"), term(\"jkl\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                should: vec![term(\"abc\"), term(\"def\"), 
term(\"ghi\"), term(\"jkl\"),],\n                ..Default::default()\n            }\n            .into()\n        );\n\n        let bool_query = TantivyBoolQuery {\n            must: vec![\n                TantivyBoolQuery {\n                    must: vec![term(\"abc\"), term(\"def\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    should: vec![term(\"ghi\"), term(\"jkl\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                must: vec![\n                    term(\"abc\"),\n                    term(\"def\"),\n                    TantivyBoolQuery {\n                        should: vec![term(\"ghi\"), term(\"jkl\")],\n                        ..Default::default()\n                    }\n                    .into(),\n                ],\n                ..Default::default()\n            }\n            .into()\n        );\n\n        let bool_query = TantivyBoolQuery {\n            should: vec![\n                TantivyBoolQuery {\n                    must: vec![term(\"abc\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    filter: vec![term(\"ghi\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                should: vec![\n                    term(\"abc\"),\n                    // filter can't get optimized for scoring reasons\n                    TantivyBoolQuery {\n                        filter: vec![term(\"ghi\")],\n                        ..Default::default()\n  
                  }\n                    .into(),\n                ],\n                ..Default::default()\n            }\n            .into()\n        );\n\n        let bool_query = TantivyBoolQuery {\n            must: vec![\n                TantivyBoolQuery {\n                    should: vec![term(\"abc\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    should: vec![term(\"def\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                must: vec![term(\"abc\"), term(\"def\"),],\n                ..Default::default()\n            }\n            .into()\n        );\n\n        let bool_query = TantivyBoolQuery {\n            must_not: vec![\n                TantivyBoolQuery {\n                    should: vec![term(\"abc\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    must: vec![term(\"def\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                must: vec![MatchAllOrNone::MatchAll.into()],\n                must_not: vec![term(\"abc\"), term(\"def\"),],\n                ..Default::default()\n            }\n            .into()\n        );\n\n        let bool_query = TantivyBoolQuery {\n            must: vec![\n                TantivyBoolQuery {\n                    must_not: vec![term(\"abc\"), term(\"def\")],\n                    ..Default::default()\n                }\n                .into(),\n                TantivyBoolQuery {\n                    
must_not: vec![term(\"ghi\")],\n                    ..Default::default()\n                }\n                .into(),\n            ],\n            ..Default::default()\n        }\n        .simplify();\n        assert_eq!(\n            bool_query,\n            TantivyBoolQuery {\n                must: vec![MatchAllOrNone::MatchAll.into()],\n                must_not: vec![term(\"abc\"), term(\"def\"), term(\"ghi\"),],\n                ..Default::default()\n            }\n            .into()\n        );\n    }\n\n    #[derive(Debug, Clone)]\n    struct ConstQuery(bool, u32);\n\n    impl tantivy::query::Query for ConstQuery {\n        fn weight(\n            &self,\n            _: tantivy::query::EnableScoring<'_>,\n        ) -> tantivy::Result<Box<dyn tantivy::query::Weight>> {\n            unimplemented!()\n        }\n    }\n\n    impl TantivyQueryAst {\n        fn evaluate_test(&self) -> Option<u32> {\n            match self {\n                TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchNone) => None,\n                TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchAll) => Some(0),\n                TantivyQueryAst::Bool(bool_query) => bool_query.evaluate_test(),\n                TantivyQueryAst::Leaf(query) => {\n                    let const_query = query\n                        .downcast_ref::<ConstQuery>()\n                        .expect(\"query wasn't a ConstQuery\");\n                    const_query.0.then_some(const_query.1)\n                }\n            }\n        }\n    }\n\n    impl TantivyBoolQuery {\n        fn evaluate_test(&self) -> Option<u32> {\n            if self\n                .must_not\n                .iter()\n                .any(|sub_ast| sub_ast.evaluate_test().is_some())\n            {\n                return None;\n            }\n\n            let mut should_score = 0u32;\n            let mut matching_should_count = 0;\n            for should in &self.should {\n                if let Some(score) = should.evaluate_test() 
{\n                    should_score += score;\n                    matching_should_count += 1;\n                }\n            }\n\n            if let Some(minimum_should_match) = self.minimum_should_match\n                && minimum_should_match > matching_should_count\n            {\n                return None;\n            }\n\n            if self.must.len() + self.filter.len() > 0 {\n                if self\n                    .must\n                    .iter()\n                    .all(|sub_ast| sub_ast.evaluate_test().is_some())\n                    && self\n                        .filter\n                        .iter()\n                        .all(|sub_ast| sub_ast.evaluate_test().is_some())\n                {\n                    Some(\n                        self.must\n                            .iter()\n                            .map(|sub_ast| sub_ast.evaluate_test().unwrap())\n                            .sum::<u32>()\n                            + should_score,\n                    )\n                } else {\n                    None\n                }\n            } else {\n                if self.should.is_empty() {\n                    // by convention, an empty query returns all match.\n                    return Some(0);\n                }\n                self.should\n                    .iter()\n                    .any(|sub_ast| sub_ast.evaluate_test().is_some())\n                    .then_some(should_score)\n            }\n        }\n    }\n\n    fn ast_strategy() -> impl Strategy<Value = TantivyQueryAst> {\n        let ast_leaf = proptest::prop_oneof![\n            Just(TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchNone)),\n            Just(TantivyQueryAst::ConstPredicate(MatchAllOrNone::MatchAll)),\n            (prop::bool::ANY, 0u32..5)\n                .prop_map(|(matc, score)| TantivyQueryAst::Leaf(Box::new(ConstQuery(matc, score)))),\n        ];\n\n        ast_leaf.prop_recursive(4, 32, 16, |element| {\n            let 
must = proptest::collection::vec(element.clone(), 0..4);\n            let filter = proptest::collection::vec(element.clone(), 0..4);\n            let should = proptest::collection::vec(element.clone(), 0..4);\n            let must_not = proptest::collection::vec(element.clone(), 0..4);\n            let minimum_should_match = (0usize..=2).prop_map(|n: usize| n.checked_sub(1));\n            (must, filter, should, must_not, minimum_should_match).prop_map(\n                |(must, filter, should, must_not, minimum_should_match)| {\n                    TantivyQueryAst::Bool(TantivyBoolQuery {\n                        must,\n                        filter,\n                        should,\n                        must_not,\n                        minimum_should_match,\n                    })\n                },\n            )\n        })\n    }\n\n    #[track_caller]\n    fn test_aux_simplify_never_change_result(ast: TantivyQueryAst) {\n        let simplified_ast = ast.clone().simplify();\n        assert_eq!(dbg!(simplified_ast).evaluate_test(), ast.evaluate_test());\n    }\n\n    proptest::proptest! {\n        #![proptest_config(ProptestConfig {\n          cases: 100000, .. ProptestConfig::default()\n        })]\n        #[test]\n        fn test_proptest_simplify_never_change_result(ast in ast_strategy()) {\n            test_aux_simplify_never_change_result(ast);\n        }\n    }\n\n    #[test]\n    fn test_simplify_never_change_result_simple_corner_case() {\n        let ast = TantivyQueryAst::Bool(TantivyBoolQuery {\n            minimum_should_match: Some(1),\n            ..Default::default()\n        });\n        test_aux_simplify_never_change_result(ast);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/term_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse serde::{Deserialize, Serialize};\n\nuse super::{BuildTantivyAst, QueryAst};\nuse crate::query_ast::{BuildTantivyAstContext, FullTextParams, TantivyQueryAst};\nuse crate::{BooleanOperand, InvalidQuery};\n\n/// The TermQuery acts exactly like a FullTextQuery with\n/// a raw tokenizer.\n#[derive(PartialEq, Eq, Debug, Serialize, Deserialize, Clone)]\npub struct TermQuery {\n    pub field: String,\n    pub value: String,\n}\n\nimpl From<TermQuery> for QueryAst {\n    fn from(term_query: TermQuery) -> Self {\n        Self::Term(term_query)\n    }\n}\n\nimpl TermQuery {\n    #[cfg(test)]\n    pub fn from_field_value(field: impl ToString, value: impl ToString) -> Self {\n        Self {\n            field: field.to_string(),\n            value: value.to_string(),\n        }\n    }\n}\n\nimpl BuildTantivyAst for TermQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let full_text_params = FullTextParams {\n            tokenizer: Some(\"raw\".to_string()),\n            // The parameter below won't matter, since we will have only one term\n            mode: BooleanOperand::Or.into(),\n            zero_terms_query: Default::default(),\n        };\n        crate::query_ast::utils::full_text_query(\n            
&self.field,\n            &self.value,\n            &full_text_params,\n            context.schema,\n            context.tokenizer_manager,\n            false,\n        )\n    }\n}\n\n// Private struct used for serialization.\n// It represents the value of a term query in its JSON form: `{field: <TermQueryValue>}`.\n#[derive(Serialize, Deserialize)]\nstruct TermQueryValue {\n    value: String,\n}\n\nimpl From<TermQuery> for (String, TermQueryValue) {\n    fn from(term_query: TermQuery) -> Self {\n        (\n            term_query.field,\n            TermQueryValue {\n                value: term_query.value,\n            },\n        )\n    }\n}\n\nimpl From<(String, TermQueryValue)> for TermQuery {\n    fn from((field, term_query_value): (String, TermQueryValue)) -> Self {\n        Self {\n            field,\n            value: term_query_value.value,\n        }\n    }\n}\n\nimpl TryFrom<HashMap<String, TermQueryValue>> for TermQuery {\n    type Error = &'static str;\n\n    fn try_from(map: HashMap<String, TermQueryValue>) -> Result<Self, Self::Error> {\n        // Reject empty maps as well as maps with several entries.\n        if map.len() != 1 {\n            return Err(\"TermQuery must have exactly one entry\");\n        }\n        // The unwrap is justified by the length check above.\n        Ok(TermQuery::from(map.into_iter().next().unwrap()))\n    }\n}\n\nimpl From<TermQuery> for HashMap<String, TermQueryValue> {\n    fn from(term_query: TermQuery) -> HashMap<String, TermQueryValue> {\n        let (field, term_query_value) = term_query.into();\n        let mut map = HashMap::with_capacity(1);\n        map.insert(field, term_query_value);\n        map\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::schema::{INDEXED, Schema};\n\n    use crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext, TermQuery};\n\n    #[test]\n    fn test_term_query_with_ipaddr_ipv4() {\n        let term_query = TermQuery {\n            field: \"ip\".to_string(),\n            value: \"127.0.0.1\".to_string(),\n        };\n        let mut 
schema_builder = Schema::builder();\n        schema_builder.add_ip_addr_field(\"ip\", INDEXED);\n        let schema = schema_builder.build();\n        let tantivy_query_ast = term_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = tantivy_query_ast.as_leaf().unwrap();\n        assert_eq!(\n            &format!(\"{leaf:?}\"),\n            \"TermQuery(Term(field=0, type=IpAddr, ::ffff:127.0.0.1))\"\n        );\n    }\n\n    #[test]\n    fn test_term_query_with_ipaddr_compressed_ipv6() {\n        let term_query = TermQuery {\n            field: \"ip\".to_string(),\n            value: \"2001:db8:85a3::8a2e:370:7334\".to_string(), //< note the ::. This is a compressed form\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_ip_addr_field(\"ip\", INDEXED);\n        let schema = schema_builder.build();\n        let tantivy_query_ast = term_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = tantivy_query_ast.as_leaf().unwrap();\n        assert_eq!(\n            &format!(\"{leaf:?}\"),\n            \"TermQuery(Term(field=0, type=IpAddr, 2001:db8:85a3::8a2e:370:7334))\"\n        );\n    }\n\n    #[test]\n    fn test_term_query_bytes_with_padding() {\n        let term_query = TermQuery {\n            field: \"bytes\".to_string(),\n            value: \"bGlnaHQgdw==\".to_string(),\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_bytes_field(\"bytes\", INDEXED);\n        let schema = schema_builder.build();\n        let tantivy_query_ast = term_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = tantivy_query_ast.as_leaf().unwrap();\n        assert_eq!(\n            &format!(\"{leaf:?}\"),\n            \"TermQuery(Term(field=0, type=Bytes, [108, 105, 103, 104, 
116, 32, 119]))\"\n        );\n    }\n\n    #[test]\n    fn test_term_query_bytes_without_padding() {\n        let term_query = TermQuery {\n            field: \"bytes\".to_string(),\n            value: \"bGlnaHQgdw\".to_string(),\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_bytes_field(\"bytes\", INDEXED);\n        let schema = schema_builder.build();\n        let tantivy_query_ast = term_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = tantivy_query_ast.as_leaf().unwrap();\n        assert_eq!(\n            &format!(\"{leaf:?}\"),\n            \"TermQuery(Term(field=0, type=Bytes, [108, 105, 103, 104, 116, 32, 119]))\"\n        );\n    }\n\n    #[test]\n    fn test_term_query_with_date_nanosecond() {\n        let term_query = TermQuery {\n            field: \"timestamp\".to_string(),\n            value: \"2025-08-07T14:49:21.831343Z\".to_string(),\n        };\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_date_field(\"timestamp\", INDEXED);\n        let schema = schema_builder.build();\n        let tantivy_query_ast = term_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n        let leaf = tantivy_query_ast.as_leaf().unwrap();\n        // The date should have been truncated to seconds precision.\n        assert_eq!(\n            &format!(\"{leaf:?}\"),\n            \"TermQuery(Term(field=0, type=Date, 2025-08-07T14:49:21Z))\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/term_set_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeSet, HashMap, HashSet};\n\nuse serde::{Deserialize, Serialize};\nuse tantivy::Term;\n\nuse crate::InvalidQuery;\nuse crate::query_ast::{\n    BoolQuery, BuildTantivyAst, BuildTantivyAstContext, QueryAst, TantivyQueryAst, TermQuery,\n};\n\n/// TermSetQuery matches the same document set as if it was a union of\n/// the equivalent set of TermQueries.\n///\n/// The text will be used as is, untokenized.\n#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\npub struct TermSetQuery {\n    pub terms_per_field: HashMap<String, BTreeSet<String>>,\n}\n\nimpl TermSetQuery {\n    fn has_fast_only_field(&self, context: &BuildTantivyAstContext) -> bool {\n        for full_path in self.terms_per_field.keys() {\n            if let Some((_, field_entry, _)) =\n                super::utils::find_field_or_hit_dynamic(full_path, context.schema)\n                && field_entry.is_fast()\n                && !field_entry.is_indexed()\n            {\n                return true;\n            }\n        }\n        false\n    }\n\n    fn build_bool_query(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let should_clauses = self\n            .terms_per_field\n            .iter()\n            .flat_map(|(full_path, values)| {\n                
values.iter().map(|value| {\n                    QueryAst::Term(TermQuery {\n                        field: full_path.to_string(),\n                        value: value.to_string(),\n                    })\n                })\n            })\n            .collect();\n\n        let bool_query = BoolQuery {\n            should: should_clauses,\n            ..Default::default()\n        };\n\n        bool_query.build_tantivy_ast_impl(context)\n    }\n\n    fn build_term_set_query(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let terms_it = self.make_term_iterator(context)?;\n        let term_set_query = tantivy::query::TermSetQuery::new(terms_it);\n        Ok(term_set_query.into())\n    }\n\n    fn make_term_iterator(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<HashSet<Term>, InvalidQuery> {\n        let mut terms: HashSet<Term> = HashSet::default();\n\n        for (full_path, values) in &self.terms_per_field {\n            for value in values {\n                // Mapping a text (field, value) is non-trivial:\n                // It depends on the schema of course, and can actually result in a disjunction of\n                // multiple terms if the query targets a dynamic field (due to the\n                // different types).\n                //\n                // Here, we ensure the logic is the same as for a TermQuery, by creating the term\n                // query and extracting the terms from the resulting `TermQuery`.\n                let term_query = TermQuery {\n                    field: full_path.to_string(),\n                    value: value.to_string(),\n                };\n                let ast = term_query.build_tantivy_ast_call(context)?;\n                let tantivy_query: Box<dyn crate::TantivyQuery> = ast.simplify().into();\n                tantivy_query.query_terms(&mut |term, _| {\n                    terms.insert(term.clone());\n            
    });\n            }\n        }\n        Ok(terms)\n    }\n}\n\nimpl BuildTantivyAst for TermSetQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        if self.has_fast_only_field(context) {\n            self.build_bool_query(context)\n        } else {\n            self.build_term_set_query(context)\n        }\n    }\n}\n\nimpl From<TermSetQuery> for QueryAst {\n    fn from(term_set_query: TermSetQuery) -> Self {\n        QueryAst::TermSet(term_set_query)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::{BTreeSet, HashMap};\n\n    use tantivy::schema::{FAST, INDEXED, Schema};\n\n    use super::TermSetQuery;\n    use crate::query_ast::{BuildTantivyAst, BuildTantivyAstContext};\n\n    #[test]\n    fn test_term_set_query_with_fast_only_field_returns_bool_query() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_u64_field(\"fast_field\", FAST);\n        let schema = schema_builder.build();\n\n        let terms_per_field = HashMap::from([(\n            \"fast_field\".to_string(),\n            BTreeSet::from([\"1\".to_string(), \"2\".to_string()]),\n        )]);\n        let term_set_query = TermSetQuery { terms_per_field };\n\n        let tantivy_query_ast = term_set_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n\n        let bool_query = tantivy_query_ast\n            .as_bool_query()\n            .expect(\"Expected BoolQuery for fast-only field, but got a different query type\");\n        assert_eq!(bool_query.should.len(), 2);\n        assert_eq!(bool_query.must.len(), 0);\n        assert_eq!(bool_query.must_not.len(), 0);\n        assert_eq!(bool_query.filter.len(), 0);\n    }\n\n    #[test]\n    fn test_term_set_query_with_indexed_field_uses_term_set() {\n        let mut schema_builder = Schema::builder();\n        
schema_builder.add_u64_field(\"indexed_field\", FAST | INDEXED);\n        let schema = schema_builder.build();\n\n        let terms_per_field = HashMap::from([(\n            \"indexed_field\".to_string(),\n            BTreeSet::from([\"1\".to_string(), \"2\".to_string()]),\n        )]);\n        let term_set_query = TermSetQuery { terms_per_field };\n\n        let tantivy_query_ast = term_set_query\n            .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n            .unwrap();\n\n        // Should return a leaf query (TermSetQuery wrapped in TantivyQueryAst)\n        let leaf = tantivy_query_ast\n            .as_leaf()\n            .expect(\"Expected a leaf query (TermSetQuery), but got a complex query\");\n\n        // Verify it's a TermSetQuery by checking the debug representation\n        let debug_str = format!(\"{leaf:?}\");\n        assert!(\n            debug_str.contains(\"TermSetQuery\"),\n            \"Expected TermSetQuery, got: {debug_str}\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/user_input_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeSet, HashMap};\nuse std::ops::Bound;\n\nuse anyhow::bail;\nuse serde::{Deserialize, Serialize};\nuse tantivy::query_grammar::{\n    Delimiter, Occur, UserInputAst, UserInputBound, UserInputLeaf, UserInputLiteral,\n};\n\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::{\n    self, BuildTantivyAst, BuildTantivyAstContext, FieldPresenceQuery, FullTextMode,\n    FullTextParams, QueryAst, TantivyQueryAst,\n};\nuse crate::{BooleanOperand, InvalidQuery, JsonLiteral};\n\nconst DEFAULT_PHRASE_QUERY_MAX_EXPANSION: u32 = 50;\n\n/// A query expressed in the tantivy query grammar DSL.\n#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]\npub struct UserInputQuery {\n    pub user_text: String,\n    // Set of search fields to search into for text not specifically\n    // targeting a field.\n    //\n    // If None, the default search fields, as defined in the DocMapper\n    // will be used.\n    #[serde(default, skip_serializing_if = \"Option::is_none\")]\n    pub default_fields: Option<Vec<String>>,\n    pub default_operator: BooleanOperand,\n    /// Support missing fields\n    pub lenient: bool,\n}\n\nimpl UserInputQuery {\n    /// Parse the user query to generate a structured QueryAST, without any UserInputQuery node.\n    ///\n    /// The `UserInputQuery` have an optional search_fields property that takes 
precedence over\n    /// the `default_search_fields`.\n    ///\n    /// In quickwit, the search fields in the `UserInputQuery` are usually supplied with the user\n    /// request.\n    /// The default_search_fields argument on the other hand, is the default search fields defined\n    /// in the `DocMapper`.\n    pub fn parse_user_query(&self, default_search_fields: &[String]) -> anyhow::Result<QueryAst> {\n        let search_fields = self\n            .default_fields\n            .as_ref()\n            .map(|search_fields| &search_fields[..])\n            .unwrap_or(default_search_fields);\n        let user_input_ast = tantivy::query_grammar::parse_query(&self.user_text)\n            .map_err(|_| anyhow::anyhow!(\"failed to parse query: `{}`\", &self.user_text))?;\n        let default_occur = match self.default_operator {\n            BooleanOperand::And => Occur::Must,\n            BooleanOperand::Or => Occur::Should,\n        };\n        convert_user_input_ast_to_query_ast(\n            user_input_ast,\n            default_occur,\n            search_fields,\n            self.lenient,\n        )\n    }\n}\n\nimpl From<UserInputQuery> for QueryAst {\n    fn from(user_text_query: UserInputQuery) -> Self {\n        QueryAst::UserInput(user_text_query)\n    }\n}\n\nimpl BuildTantivyAst for UserInputQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        _context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, crate::InvalidQuery> {\n        Err(InvalidQuery::UserQueryNotParsed)\n    }\n}\n\n/// Convert the AST of a text query to a QueryAst, filling in default field and default occur when\n/// they were not present.\nfn convert_user_input_ast_to_query_ast(\n    user_input_ast: UserInputAst,\n    default_occur: Occur,\n    default_search_fields: &[String],\n    lenient: bool,\n) -> anyhow::Result<QueryAst> {\n    match user_input_ast {\n        UserInputAst::Clause(clause) => {\n            let mut bool_query = query_ast::BoolQuery::default();\n     
       for (occur_opt, sub_ast) in clause {\n                let sub_ast = convert_user_input_ast_to_query_ast(\n                    sub_ast,\n                    default_occur,\n                    default_search_fields,\n                    lenient,\n                )?;\n                let children_ast_for_occur: &mut Vec<QueryAst> =\n                    match occur_opt.unwrap_or(default_occur) {\n                        Occur::Should => &mut bool_query.should,\n                        Occur::Must => &mut bool_query.must,\n                        Occur::MustNot => &mut bool_query.must_not,\n                    };\n                children_ast_for_occur.push(sub_ast);\n            }\n            Ok(bool_query.into())\n        }\n        UserInputAst::Leaf(leaf) => match *leaf {\n            UserInputLeaf::Literal(literal) => {\n                convert_user_input_literal(literal, default_search_fields, lenient)\n            }\n            UserInputLeaf::All => Ok(QueryAst::MatchAll),\n            UserInputLeaf::Range {\n                field,\n                lower,\n                upper,\n            } => {\n                let field = if let Some(field) = field {\n                    field\n                } else if default_search_fields.len() == 1 {\n                    default_search_fields[0].clone()\n                } else if default_search_fields.is_empty() {\n                    bail!(\"range query without field is not supported\");\n                } else {\n                    bail!(\"range query with multiple fields is not supported\");\n                };\n                let convert_bound = |user_input_bound: UserInputBound| match user_input_bound {\n                    UserInputBound::Inclusive(user_text) => {\n                        Bound::Included(JsonLiteral::String(user_text))\n                    }\n                    UserInputBound::Exclusive(user_text) => {\n                        Bound::Excluded(JsonLiteral::String(user_text))\n           
         }\n                    UserInputBound::Unbounded => Bound::Unbounded,\n                };\n                let range_query = query_ast::RangeQuery {\n                    field,\n                    lower_bound: convert_bound(lower),\n                    upper_bound: convert_bound(upper),\n                };\n                Ok(range_query.into())\n            }\n            UserInputLeaf::Set { field, elements } => {\n                let field_names: Vec<String> = if let Some(field) = field.as_ref() {\n                    vec![field.to_string()]\n                } else {\n                    default_search_fields.to_vec()\n                };\n                if field_names.is_empty() {\n                    anyhow::bail!(\"set query need to target a specific field\");\n                }\n                let mut terms_per_field: HashMap<String, BTreeSet<String>> = Default::default();\n                let terms: BTreeSet<String> = elements.into_iter().collect();\n                for field in field_names {\n                    terms_per_field.insert(field.to_string(), terms.clone());\n                }\n                let term_set_query = query_ast::TermSetQuery { terms_per_field };\n                Ok(term_set_query.into())\n            }\n            UserInputLeaf::Exists { field } => Ok(FieldPresenceQuery { field }.into()),\n            UserInputLeaf::Regex { field, pattern } => {\n                let field = if let Some(field) = field {\n                    field\n                } else if default_search_fields.len() == 1 {\n                    default_search_fields[0].clone()\n                } else if default_search_fields.is_empty() {\n                    bail!(\"regex query without field is not supported\");\n                } else {\n                    bail!(\"regex query with multiple fields is not supported\");\n                };\n                let regex_query = query_ast::RegexQuery {\n                    field,\n                    regex: 
pattern,\n                };\n                Ok(regex_query.into())\n            }\n        },\n        UserInputAst::Boost(underlying, boost) => {\n            let query_ast = convert_user_input_ast_to_query_ast(\n                *underlying,\n                default_occur,\n                default_search_fields,\n                lenient,\n            )?;\n            let boost: NotNaNf32 = (boost.into_inner() as f32)\n                .try_into()\n                .map_err(|err_msg: &str| anyhow::anyhow!(err_msg))?;\n            Ok(QueryAst::Boost {\n                underlying: Box::new(query_ast),\n                boost,\n            })\n        }\n    }\n}\n\nfn is_wildcard(phrase: &str) -> bool {\n    use std::ops::ControlFlow;\n    enum State {\n        Normal,\n        Escaped,\n    }\n\n    phrase\n        .chars()\n        .try_fold(State::Normal, |state, c| match state {\n            State::Escaped => ControlFlow::Continue(State::Normal),\n            State::Normal => {\n                if c == '*' || c == '?' 
{\n                    // we are in a wildcard query\n                    ControlFlow::Break(())\n                } else if c == '\\\\' {\n                    ControlFlow::Continue(State::Escaped)\n                } else {\n                    ControlFlow::Continue(State::Normal)\n                }\n            }\n        })\n        .is_break()\n}\n\n/// Convert a leaf of a text query AST to a QueryAst.\n/// This may generate more than a single leaf if there are multiple default fields.\nfn convert_user_input_literal(\n    user_input_literal: UserInputLiteral,\n    default_search_fields: &[String],\n    lenient: bool,\n) -> anyhow::Result<QueryAst> {\n    let UserInputLiteral {\n        field_name,\n        phrase,\n        prefix,\n        delimiter,\n        slop,\n    } = user_input_literal;\n    let field_names: Vec<String> = if let Some(field_name) = field_name {\n        vec![field_name]\n    } else {\n        default_search_fields\n            .iter()\n            .map(|field_name| field_name.to_string())\n            .collect()\n    };\n    if field_names.is_empty() {\n        anyhow::bail!(\"query requires a default search field and none was supplied\");\n    }\n    let mode = match delimiter {\n        Delimiter::None => FullTextMode::PhraseFallbackToIntersection,\n        Delimiter::SingleQuotes => FullTextMode::Bool {\n            operator: BooleanOperand::And,\n        },\n        Delimiter::DoubleQuotes => FullTextMode::Phrase { slop },\n    };\n    let full_text_params = FullTextParams {\n        tokenizer: None,\n        mode,\n        zero_terms_query: crate::MatchAllOrNone::MatchNone,\n    };\n    let wildcard = delimiter == Delimiter::None && is_wildcard(&phrase);\n    let mut phrase_queries: Vec<QueryAst> = field_names\n        .into_iter()\n        .map(|field_name| {\n            if prefix {\n                query_ast::PhrasePrefixQuery {\n                    field: field_name,\n                    phrase: phrase.clone(),\n                    
params: full_text_params.clone(),\n                    max_expansions: DEFAULT_PHRASE_QUERY_MAX_EXPANSION,\n                    lenient,\n                }\n                .into()\n            } else if wildcard {\n                query_ast::WildcardQuery {\n                    field: field_name,\n                    value: phrase.clone(),\n                    lenient,\n                    case_insensitive: false,\n                }\n                .into()\n            } else {\n                query_ast::FullTextQuery {\n                    field: field_name,\n                    text: phrase.clone(),\n                    params: full_text_params.clone(),\n                    lenient,\n                }\n                .into()\n            }\n        })\n        .collect();\n    if phrase_queries.is_empty() {\n        Ok(QueryAst::MatchNone)\n    } else if phrase_queries.len() == 1 {\n        Ok(phrase_queries.pop().unwrap())\n    } else {\n        Ok(query_ast::BoolQuery {\n            should: phrase_queries,\n            ..Default::default()\n        }\n        .into())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::query_ast::{\n        BoolQuery, BuildTantivyAst, BuildTantivyAstContext, FullTextMode, FullTextQuery, QueryAst,\n        UserInputQuery,\n    };\n    use crate::{BooleanOperand, InvalidQuery};\n\n    #[test]\n    fn test_user_input_query_not_parsed_error() {\n        let user_input_query = UserInputQuery {\n            user_text: \"hello\".to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        };\n        let schema = tantivy::schema::Schema::builder().build();\n        {\n            let invalid_query = user_input_query\n                .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap_err();\n            assert!(matches!(invalid_query, InvalidQuery::UserQueryNotParsed));\n        }\n        {\n            let 
invalid_query = user_input_query\n                .build_tantivy_ast_call(&BuildTantivyAstContext::for_test(&schema))\n                .unwrap_err();\n            assert!(matches!(invalid_query, InvalidQuery::UserQueryNotParsed));\n        }\n    }\n\n    #[test]\n    fn test_user_input_query_missing_fields() {\n        {\n            let invalid_err = UserInputQuery {\n                user_text: \"hello\".to_string(),\n                default_fields: None,\n                default_operator: BooleanOperand::And,\n                lenient: false,\n            }\n            .parse_user_query(&[])\n            .unwrap_err();\n            assert_eq!(\n                &invalid_err.to_string(),\n                \"query requires a default search field and none was supplied\"\n            );\n        }\n        {\n            let invalid_err = UserInputQuery {\n                user_text: \"hello\".to_string(),\n                default_fields: Some(Vec::new()),\n                default_operator: BooleanOperand::And,\n                lenient: false,\n            }\n            .parse_user_query(&[])\n            .unwrap_err();\n            assert_eq!(\n                &invalid_err.to_string(),\n                \"query requires a default search field and none was supplied\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_user_input_query_predefined_default_fields() {\n        let ast = UserInputQuery {\n            user_text: \"hello\".to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[\"defaultfield\".to_string()])\n        .unwrap();\n        let QueryAst::FullText(phrase_query) = ast else {\n            panic!()\n        };\n        assert_eq!(&phrase_query.field, \"defaultfield\");\n        assert_eq!(&phrase_query.text, \"hello\");\n        assert_eq!(\n            phrase_query.params.mode,\n            
FullTextMode::PhraseFallbackToIntersection\n        );\n    }\n\n    #[test]\n    fn test_user_input_query_phrase_with_prefix() {\n        let ast = UserInputQuery {\n            user_text: \"field:\\\"hello\\\"*\".to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[])\n        .unwrap();\n        let QueryAst::PhrasePrefix(phrase_prefix_query) = ast else {\n            panic!()\n        };\n        assert_eq!(&phrase_prefix_query.field, \"field\");\n        assert_eq!(&phrase_prefix_query.phrase, \"hello\");\n        assert_eq!(phrase_prefix_query.max_expansions, 50);\n        assert_eq!(\n            phrase_prefix_query.params.mode,\n            FullTextMode::Phrase { slop: 0 }\n        );\n    }\n\n    #[test]\n    fn test_user_input_query_override_default_fields() {\n        let ast = UserInputQuery {\n            user_text: \"hello\".to_string(),\n            default_fields: Some(vec![\"defaultfield\".to_string()]),\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[\"defaultfieldweshouldignore\".to_string()])\n        .unwrap();\n        let QueryAst::FullText(phrase_query) = ast else {\n            panic!()\n        };\n        assert_eq!(&phrase_query.field, \"defaultfield\");\n        assert_eq!(&phrase_query.text, \"hello\");\n        assert_eq!(\n            phrase_query.params.mode,\n            FullTextMode::PhraseFallbackToIntersection\n        );\n    }\n\n    #[test]\n    fn test_user_input_query_several_default_fields() {\n        let ast = UserInputQuery {\n            user_text: \"hello\".to_string(),\n            default_fields: Some(vec![\"fielda\".to_string(), \"fieldb\".to_string()]),\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[\"defaultfieldweshouldignore\".to_string()])\n        
.unwrap();\n        let QueryAst::Bool(BoolQuery { should, .. }) = ast else {\n            panic!()\n        };\n        assert_eq!(should.len(), 2);\n    }\n\n    #[test]\n    fn test_user_input_query_field_specified_in_user_input() {\n        let ast = UserInputQuery {\n            user_text: \"myfield:hello\".to_string(),\n            default_fields: Some(vec![\"fieldtoignore\".to_string()]),\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[\"fieldtoignore\".to_string()])\n        .unwrap();\n        let QueryAst::FullText(full_text_query) = ast else {\n            panic!()\n        };\n        assert_eq!(&full_text_query.field, \"myfield\");\n        assert_eq!(&full_text_query.text, \"hello\");\n        assert_eq!(\n            full_text_query.params.mode,\n            FullTextMode::PhraseFallbackToIntersection\n        );\n    }\n\n    #[test]\n    fn test_user_input_query_different_delimiter() {\n        let parse_user_query_delimiter_util = |query: &str| {\n            let ast = UserInputQuery {\n                user_text: query.to_string(),\n                default_fields: None,\n                default_operator: BooleanOperand::Or,\n                lenient: false,\n            }\n            .parse_user_query(&[])\n            .unwrap();\n            let QueryAst::FullText(full_text_query) = ast else {\n                panic!()\n            };\n            full_text_query\n        };\n        {\n            let double_quote_query: FullTextQuery =\n                parse_user_query_delimiter_util(\"jobtitle:\\\"editor-in-chief\\\"\");\n            assert_eq!(&double_quote_query.field, \"jobtitle\");\n            assert_eq!(&double_quote_query.text, \"editor-in-chief\");\n            assert_eq!(\n                double_quote_query.params.mode,\n                FullTextMode::Phrase { slop: 0 }\n            );\n        }\n        {\n            let double_quote_query: FullTextQuery =\n     
           parse_user_query_delimiter_util(\"jobtitle:\\\"editor-in-chief\\\"~2\");\n            assert_eq!(&double_quote_query.field, \"jobtitle\");\n            assert_eq!(&double_quote_query.text, \"editor-in-chief\");\n            assert_eq!(\n                double_quote_query.params.mode,\n                FullTextMode::Phrase { slop: 2 }\n            );\n        }\n        {\n            let double_quote_query: FullTextQuery =\n                parse_user_query_delimiter_util(\"jobtitle:'editor-in-chief'\");\n            assert_eq!(&double_quote_query.field, \"jobtitle\");\n            assert_eq!(&double_quote_query.text, \"editor-in-chief\");\n            assert_eq!(\n                double_quote_query.params.mode,\n                FullTextMode::Bool {\n                    operator: BooleanOperand::And\n                }\n            );\n        }\n        {\n            let double_quote_query: FullTextQuery =\n                parse_user_query_delimiter_util(\"jobtitle:editor-in-chief\");\n            assert_eq!(&double_quote_query.field, \"jobtitle\");\n            assert_eq!(&double_quote_query.text, \"editor-in-chief\");\n            assert_eq!(\n                double_quote_query.params.mode,\n                FullTextMode::PhraseFallbackToIntersection\n            );\n        }\n    }\n\n    #[test]\n    fn test_user_input_query_regex() {\n        let ast = UserInputQuery {\n            user_text: \"field: /.*/\".to_string(),\n            default_fields: None,\n            default_operator: BooleanOperand::And,\n            lenient: false,\n        }\n        .parse_user_query(&[])\n        .unwrap();\n        let QueryAst::Regex(regex_query) = ast else {\n            panic!()\n        };\n        assert_eq!(&regex_query.field, \"field\");\n        assert_eq!(&regex_query.regex, \".*\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/utils.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse tantivy::Term;\nuse tantivy::json_utils::convert_to_fast_value_and_append_to_json_term;\nuse tantivy::query::TermQuery as TantivyTermQuery;\nuse tantivy::schema::{\n    Field, FieldEntry, FieldType, IndexRecordOption, JsonObjectOptions, Schema as TantivySchema,\n    TextFieldIndexing, Type,\n};\n\nuse crate::InvalidQuery;\nuse crate::MatchAllOrNone::MatchNone as TantivyEmptyQuery;\nuse crate::json_literal::InterpretUserInput;\nuse crate::query_ast::full_text_query::FullTextParams;\nuse crate::query_ast::tantivy_query_ast::{TantivyBoolQuery, TantivyQueryAst};\nuse crate::tokenizers::{RAW_TOKENIZER_NAME, TokenizerManager};\n\npub(crate) const DYNAMIC_FIELD_NAME: &str = \"_dynamic\";\n\nfn make_term_query(term: Term) -> TantivyQueryAst {\n    TantivyTermQuery::new(term, IndexRecordOption::WithFreqs).into()\n}\n\n/// Find the field or fallback to the dynamic field if it exists\npub fn find_field_or_hit_dynamic<'a>(\n    full_path: &'a str,\n    schema: &'a TantivySchema,\n) -> Option<(Field, &'a FieldEntry, &'a str)> {\n    let (field, path) = if let Some((field, path)) = schema.find_field(full_path) {\n        (field, path)\n    } else {\n        let dynamic_field = schema.get_field(DYNAMIC_FIELD_NAME).ok()?;\n        (dynamic_field, full_path)\n    };\n    let field_entry = schema.get_field_entry(field);\n    let typ = 
field_entry.field_type().value_type();\n    if !path.is_empty() && typ != Type::Json {\n        return None;\n    }\n    Some((field, field_entry, path))\n}\n\n/// Find all the fields that are below the given path.\n///\n/// This will return a list of fields only when the path is that of a composite\n/// type in the doc mapping.\npub fn find_subfields<'a>(\n    path: &'a str,\n    schema: &'a TantivySchema,\n) -> Vec<(Field, &'a FieldEntry)> {\n    let prefix = format!(\"{path}.\");\n    schema\n        .fields()\n        .filter(|(_, field_entry)| field_entry.name().starts_with(&prefix))\n        .collect()\n}\n\n/// Creates a full text query.\n///\n/// For text and JSON fields, the text is tokenized according to `full_text_params`; for other\n/// field types, it is parsed into a single term. If the field cannot be found, `lenient`\n/// decides between returning an empty query and a `FieldDoesNotExist` error.\npub(crate) fn full_text_query(\n    full_path: &str,\n    text_query: &str,\n    full_text_params: &FullTextParams,\n    schema: &TantivySchema,\n    tokenizer_manager: &TokenizerManager,\n    lenient: bool,\n) -> Result<TantivyQueryAst, InvalidQuery> {\n    let Some((field, field_entry, path)) = find_field_or_hit_dynamic(full_path, schema) else {\n        if lenient {\n            return Ok(TantivyEmptyQuery.into());\n        } else {\n            return Err(InvalidQuery::FieldDoesNotExist {\n                full_path: full_path.to_string(),\n            });\n        }\n    };\n    compute_query_with_field(\n        field,\n        field_entry,\n        path,\n        text_query,\n        full_text_params,\n        tokenizer_manager,\n    )\n}\n\nfn parse_value_from_user_text<'a, T: InterpretUserInput<'a>>(\n    text: &'a str,\n    field_name: &str,\n) -> Result<T, InvalidQuery> {\n    if let Some(parsed_value) = T::interpret_str(text) {\n        return Ok(parsed_value);\n    }\n    Err(InvalidQuery::InvalidSearchTerm {\n        expected_value_type: T::name(),\n        field_name: field_name.to_string(),\n        value: text.to_string(),\n    })\n}\n\nfn compute_query_with_field(\n    field: Field,\n    field_entry: &FieldEntry,\n    json_path: &str,\n    
value: &str,\n    full_text_params: &FullTextParams,\n    tokenizer_manager: &TokenizerManager,\n) -> Result<TantivyQueryAst, InvalidQuery> {\n    let field_type = field_entry.field_type();\n    match field_type {\n        FieldType::U64(_) => {\n            let val = parse_value_from_user_text::<u64>(value, field_entry.name())?;\n            let term = Term::from_field_u64(field, val);\n            Ok(make_term_query(term))\n        }\n        FieldType::I64(_) => {\n            let val = parse_value_from_user_text::<i64>(value, field_entry.name())?;\n            let term = Term::from_field_i64(field, val);\n            Ok(make_term_query(term))\n        }\n        FieldType::F64(_) => {\n            let val = parse_value_from_user_text::<f64>(value, field_entry.name())?;\n            let term = Term::from_field_f64(field, val);\n            Ok(make_term_query(term))\n        }\n        FieldType::Bool(_) => {\n            let bool_val = parse_value_from_user_text(value, field_entry.name())?;\n            let term = Term::from_field_bool(field, bool_val);\n            Ok(make_term_query(term))\n        }\n        FieldType::Date(date_options) => {\n            let dt = parse_value_from_user_text(value, field_entry.name())?;\n            let term = if date_options.is_indexed() {\n                Term::from_field_date_for_search(field, dt)\n            } else {\n                Term::from_field_date(field, dt.truncate(date_options.get_precision()))\n            };\n            Ok(make_term_query(term))\n        }\n        FieldType::Str(text_options) => {\n            let columnar_opt = TextFieldIndexing::default()\n                .set_fieldnorms(false)\n                .set_tokenizer(RAW_TOKENIZER_NAME);\n            let text_field_indexing = text_options\n                .get_indexing_options()\n                .or_else(|| text_options.is_fast().then_some(&columnar_opt))\n                .ok_or_else(|| {\n                    InvalidQuery::SchemaError(format!(\n   
                     \"field {} is not full-text searchable\",\n                        field_entry.name()\n                    ))\n                })?;\n            let terms = full_text_params.tokenize_text_into_terms(\n                field,\n                value,\n                text_field_indexing,\n                tokenizer_manager,\n            )?;\n            full_text_params.make_query(terms, text_field_indexing.index_option())\n        }\n        FieldType::IpAddr(_) => {\n            let ip_v6 = parse_value_from_user_text(value, field_entry.name())?;\n            let term = Term::from_field_ip_addr(field, ip_v6);\n            Ok(make_term_query(term))\n        }\n        FieldType::JsonObject(json_options) => compute_tantivy_ast_query_for_json(\n            field,\n            json_path,\n            value,\n            full_text_params,\n            json_options,\n            tokenizer_manager,\n        ),\n        FieldType::Facet(_) => Err(InvalidQuery::SchemaError(\n            \"facets are not supported in Quickwit\".to_string(),\n        )),\n        FieldType::Bytes(_) => {\n            let buffer: Vec<u8> = parse_value_from_user_text(value, field_entry.name())?;\n            let term = Term::from_field_bytes(field, &buffer[..]);\n            Ok(make_term_query(term))\n        }\n    }\n}\n\nfn compute_tantivy_ast_query_for_json(\n    field: Field,\n    json_path: &str,\n    text: &str,\n    full_text_params: &FullTextParams,\n    json_options: &JsonObjectOptions,\n    tokenizer_manager: &TokenizerManager,\n) -> Result<TantivyQueryAst, InvalidQuery> {\n    let mut bool_query = TantivyBoolQuery::default();\n    let term = Term::from_field_json_path(field, json_path, json_options.is_expand_dots_enabled());\n    if let Some(term) = convert_to_fast_value_and_append_to_json_term(&term, text, true) {\n        bool_query\n            .should\n            .push(TantivyTermQuery::new(term, IndexRecordOption::Basic).into());\n    }\n    let 
position_terms: Vec<(usize, Term)> = full_text_params.tokenize_text_into_terms_json(\n        field,\n        json_path,\n        text,\n        json_options,\n        tokenizer_manager,\n    )?;\n    let index_record_option = json_options\n        .get_text_indexing_options()\n        .map(|text_indexing_options| text_indexing_options.index_option())\n        .unwrap_or(IndexRecordOption::Basic);\n    bool_query\n        .should\n        .push(full_text_params.make_query(position_terms, index_record_option)?);\n    Ok(bool_query.into())\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/visitor.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse crate::not_nan_f32::NotNaNf32;\nuse crate::query_ast::cache_node::CacheState;\nuse crate::query_ast::field_presence::FieldPresenceQuery;\nuse crate::query_ast::user_input_query::UserInputQuery;\nuse crate::query_ast::{\n    BoolQuery, CacheNode, FullTextQuery, PhrasePrefixQuery, QueryAst, RangeQuery, RegexQuery,\n    TermQuery, TermSetQuery, WildcardQuery,\n};\n\n/// Simple trait to implement a Visitor over the QueryAst.\npub trait QueryAstVisitor<'a> {\n    type Err;\n\n    fn visit(&mut self, query_ast: &'a QueryAst) -> Result<(), Self::Err> {\n        match query_ast {\n            QueryAst::Bool(bool_query) => self.visit_bool(bool_query),\n            QueryAst::Term(term_query) => self.visit_term(term_query),\n            QueryAst::TermSet(term_set_query) => self.visit_term_set(term_set_query),\n            QueryAst::FullText(full_text_query) => self.visit_full_text(full_text_query),\n            QueryAst::PhrasePrefix(phrase_prefix_query) => {\n                self.visit_phrase_prefix(phrase_prefix_query)\n            }\n            QueryAst::Range(range_query) => self.visit_range(range_query),\n            QueryAst::MatchAll => self.visit_match_all(),\n            QueryAst::MatchNone => self.visit_match_none(),\n            QueryAst::Boost { underlying, boost } => self.visit_boost(underlying, *boost),\n            
QueryAst::UserInput(user_text_query) => self.visit_user_text(user_text_query),\n            QueryAst::FieldPresence(exists) => self.visit_exists(exists),\n            QueryAst::Wildcard(wildcard) => self.visit_wildcard(wildcard),\n            QueryAst::Regex(regex) => self.visit_regex(regex),\n            QueryAst::Cache(cache_node) => self.visit_cache_node(cache_node),\n        }\n    }\n\n    fn visit_bool(&mut self, bool_query: &'a BoolQuery) -> Result<(), Self::Err> {\n        for ast in bool_query\n            .must\n            .iter()\n            .chain(bool_query.should.iter())\n            .chain(bool_query.must_not.iter())\n            .chain(bool_query.filter.iter())\n        {\n            self.visit(ast)?;\n        }\n        Ok(())\n    }\n\n    fn visit_term(&mut self, _term_query: &'a TermQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_term_set(&mut self, _term_query: &'a TermSetQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_full_text(&mut self, _full_text: &'a FullTextQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_phrase_prefix(\n        &mut self,\n        _phrase_query: &'a PhrasePrefixQuery,\n    ) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_match_all(&mut self) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_match_none(&mut self) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_boost(\n        &mut self,\n        underlying: &'a QueryAst,\n        _boost: NotNaNf32,\n    ) -> Result<(), Self::Err> {\n        self.visit(underlying)\n    }\n\n    fn visit_range(&mut self, _range_query: &'a RangeQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_user_text(&mut self, _user_text_query: &'a UserInputQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_exists(&mut self, _exists_query: &'a FieldPresenceQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn 
visit_wildcard(&mut self, _wildcard_query: &'a WildcardQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_regex(&mut self, _regex_query: &'a RegexQuery) -> Result<(), Self::Err> {\n        Ok(())\n    }\n\n    fn visit_cache_node(&mut self, cache_node: &'a CacheNode) -> Result<(), Self::Err> {\n        // This goes a bit against how the rest of the default Visitor behaves. The rationale is\n        // that in practice, on a cache hit, we don't want to do anything with that node.\n        // On an uninitialized cache, any kind of data extraction could make sense (extracting\n        // tags or timestamp bounds). On a cache miss, we still want to know what we need for\n        // warmup. But on a cache hit, it's too late to do optimisation based on tags and\n        // timestamps, and we don't want to warm up anything.\n        if !matches!(cache_node.state, CacheState::CacheHit(_)) {\n            self.visit(&cache_node.inner)?\n        }\n        Ok(())\n    }\n}\n\n/// Simple trait to implement a Transformer over the QueryAst.\n///\n/// Each `transform_*` method consumes a node and returns its replacement, or `None` to drop it.\npub trait QueryAstTransformer {\n    type Err;\n\n    fn transform(&mut self, query_ast: QueryAst) -> Result<Option<QueryAst>, Self::Err> {\n        match query_ast {\n            QueryAst::Bool(bool_query) => self.transform_bool(bool_query),\n            QueryAst::Term(term_query) => self.transform_term(term_query),\n            QueryAst::TermSet(term_set_query) => self.transform_term_set(term_set_query),\n            QueryAst::FullText(full_text_query) => self.transform_full_text(full_text_query),\n            QueryAst::PhrasePrefix(phrase_prefix_query) => {\n                self.transform_phrase_prefix(phrase_prefix_query)\n            }\n            QueryAst::Range(range_query) => self.transform_range(range_query),\n            QueryAst::MatchAll => self.transform_match_all(),\n            QueryAst::MatchNone => self.transform_match_none(),\n            QueryAst::Boost { underlying, boost } => self.transform_boost(*underlying, boost),\n   
         QueryAst::UserInput(user_text_query) => self.transform_user_text(user_text_query),\n            QueryAst::FieldPresence(exists) => self.transform_exists(exists),\n            QueryAst::Wildcard(wildcard) => self.transform_wildcard(wildcard),\n            QueryAst::Regex(regex) => self.transform_regex(regex),\n            QueryAst::Cache(cache_node) => self.transform_cache_node(cache_node),\n        }\n    }\n\n    fn transform_bool(&mut self, mut bool_query: BoolQuery) -> Result<Option<QueryAst>, Self::Err> {\n        bool_query.must = bool_query\n            .must\n            .into_iter()\n            .filter_map(|query_ast| self.transform(query_ast).transpose())\n            .collect::<Result<Vec<_>, _>>()?;\n        bool_query.should = bool_query\n            .should\n            .into_iter()\n            .filter_map(|query_ast| self.transform(query_ast).transpose())\n            .collect::<Result<Vec<_>, _>>()?;\n        bool_query.must_not = bool_query\n            .must_not\n            .into_iter()\n            .filter_map(|query_ast| self.transform(query_ast).transpose())\n            .collect::<Result<Vec<_>, _>>()?;\n        bool_query.filter = bool_query\n            .filter\n            .into_iter()\n            .filter_map(|query_ast| self.transform(query_ast).transpose())\n            .collect::<Result<Vec<_>, _>>()?;\n\n        Ok(Some(QueryAst::Bool(bool_query)))\n    }\n\n    fn transform_term(&mut self, term_query: TermQuery) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::Term(term_query)))\n    }\n\n    fn transform_term_set(\n        &mut self,\n        term_set: TermSetQuery,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::TermSet(term_set)))\n    }\n\n    fn transform_full_text(\n        &mut self,\n        full_text: FullTextQuery,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::FullText(full_text)))\n    }\n\n    fn transform_phrase_prefix(\n        &mut 
self,\n        phrase_query: PhrasePrefixQuery,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::PhrasePrefix(phrase_query)))\n    }\n\n    fn transform_match_all(&mut self) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::MatchAll))\n    }\n\n    fn transform_match_none(&mut self) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::MatchNone))\n    }\n\n    fn transform_boost(\n        &mut self,\n        underlying: QueryAst,\n        boost: NotNaNf32,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        self.transform(underlying).map(|maybe_ast| {\n            maybe_ast.map(|underlying| QueryAst::Boost {\n                underlying: Box::new(underlying),\n                boost,\n            })\n        })\n    }\n\n    fn transform_range(&mut self, range_query: RangeQuery) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::Range(range_query)))\n    }\n\n    fn transform_user_text(\n        &mut self,\n        user_text_query: UserInputQuery,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::UserInput(user_text_query)))\n    }\n\n    fn transform_exists(\n        &mut self,\n        exists_query: FieldPresenceQuery,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::FieldPresence(exists_query)))\n    }\n\n    fn transform_wildcard(\n        &mut self,\n        wildcard_query: WildcardQuery,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::Wildcard(wildcard_query)))\n    }\n\n    fn transform_regex(&mut self, regex_query: RegexQuery) -> Result<Option<QueryAst>, Self::Err> {\n        Ok(Some(QueryAst::Regex(regex_query)))\n    }\n\n    fn transform_cache_node(\n        &mut self,\n        cache_node: CacheNode,\n    ) -> Result<Option<QueryAst>, Self::Err> {\n        if matches!(cache_node.state, CacheState::CacheHit(_)) {\n            return Ok(Some(cache_node.into()));\n        }\n        
self.transform(*cache_node.inner).map(|maybe_ast| {\n            maybe_ast.map(|inner| {\n                QueryAst::Cache(CacheNode {\n                    inner: Box::new(inner),\n                    state: Default::default(),\n                })\n            })\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/query_ast/wildcard_query.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::sync::Arc;\n\nuse anyhow::{Context, bail};\nuse serde::{Deserialize, Serialize};\nuse tantivy::Term;\nuse tantivy::schema::{Field, FieldType, Schema as TantivySchema};\n\nuse super::{BuildTantivyAst, QueryAst};\nuse crate::query_ast::{AutomatonQuery, BuildTantivyAstContext, JsonPathPrefix, TantivyQueryAst};\nuse crate::tokenizers::TokenizerManager;\nuse crate::{InvalidQuery, find_field_or_hit_dynamic};\n\n/// A Wildcard query allows matching 'bond' with a query like 'b*d'.\n#[derive(PartialEq, Eq, Debug, Serialize, Deserialize, Clone)]\npub struct WildcardQuery {\n    pub field: String,\n    pub value: String,\n    /// Support missing fields\n    pub lenient: bool,\n    pub case_insensitive: bool,\n}\n\nimpl From<WildcardQuery> for QueryAst {\n    fn from(wildcard_query: WildcardQuery) -> Self {\n        Self::Wildcard(wildcard_query)\n    }\n}\n\nfn parse_wildcard_query(mut query: &str) -> Vec<SubQuery> {\n    let mut res = Vec::new();\n    while let Some(pos) = query.find(['*', '?', '\\\\']) {\n        if pos > 0 {\n            res.push(SubQuery::Text(query[..pos].to_string()));\n        }\n        let chr = &query[pos..pos + 1];\n        query = &query[pos + 1..];\n        match chr {\n            \"*\" => res.push(SubQuery::Wildcard),\n            \"?\" => res.push(SubQuery::QuestionMark),\n            
\"\\\\\" => {\n                if let Some(chr) = query.chars().next() {\n                    res.push(SubQuery::Text(chr.to_string()));\n                    query = &query[chr.len_utf8()..];\n                } else {\n                    // escaping at the end is invalid, handle it as if that escape sequence wasn't\n                    // present\n                    break;\n                }\n            }\n            _ => unreachable!(\"find shouldn't return non-matching position\"),\n        }\n    }\n    if !query.is_empty() {\n        res.push(SubQuery::Text(query.to_string()));\n    }\n    res\n}\n\nenum SubQuery {\n    Text(String),\n    Wildcard,\n    QuestionMark,\n}\n\nfn sub_query_parts_to_regex(\n    sub_query_parts: Vec<SubQuery>,\n    tokenizer_name: &str,\n    tokenizer_manager: &TokenizerManager,\n) -> anyhow::Result<String> {\n    let mut normalizer = tokenizer_manager\n        .get_normalizer(tokenizer_name)\n        .with_context(|| format!(\"no tokenizer named `{tokenizer_name}` is registered\"))?;\n\n    sub_query_parts\n        .into_iter()\n        .map(|part| match part {\n            SubQuery::Text(text) => {\n                let mut token_stream = normalizer.token_stream(&text);\n                let expected_token = token_stream\n                    .next()\n                    .context(\"normalizer generated no content\")?\n                    .text\n                    .clone();\n                if let Some(_unexpected_token) = token_stream.next() {\n                    bail!(\"normalizer generated multiple tokens\")\n                }\n                Ok(Cow::Owned(regex::escape(&expected_token)))\n            }\n            SubQuery::Wildcard => Ok(Cow::Borrowed(\".*\")),\n            SubQuery::QuestionMark => Ok(Cow::Borrowed(\".\")),\n        })\n        .collect::<Result<String, _>>()\n}\n\nimpl WildcardQuery {\n    pub fn to_regex(\n        &self,\n        schema: &TantivySchema,\n        tokenizer_manager: &TokenizerManager,\n   
 ) -> Result<(Field, Option<Vec<u8>>, String), InvalidQuery> {\n        let Some((field, field_entry, json_path)) = find_field_or_hit_dynamic(&self.field, schema)\n        else {\n            return Err(InvalidQuery::FieldDoesNotExist {\n                full_path: self.field.clone(),\n            });\n        };\n        let field_type = field_entry.field_type();\n\n        let sub_query_parts = parse_wildcard_query(&self.value);\n\n        match field_type {\n            FieldType::Str(text_options) => {\n                let text_field_indexing = text_options.get_indexing_options().ok_or_else(|| {\n                    InvalidQuery::SchemaError(format!(\n                        \"field {} is not full-text searchable\",\n                        field_entry.name()\n                    ))\n                })?;\n                let tokenizer_name = text_field_indexing.tokenizer();\n                let regex =\n                    sub_query_parts_to_regex(sub_query_parts, tokenizer_name, tokenizer_manager)?;\n                let regex = if self.case_insensitive {\n                    format!(\"(?i){}\", regex)\n                } else {\n                    regex\n                };\n\n                Ok((field, None, regex))\n            }\n            FieldType::JsonObject(json_options) => {\n                let text_field_indexing =\n                    json_options.get_text_indexing_options().ok_or_else(|| {\n                        InvalidQuery::SchemaError(format!(\n                            \"field {} is not full-text searchable\",\n                            field_entry.name()\n                        ))\n                    })?;\n                let tokenizer_name = text_field_indexing.tokenizer();\n                let regex =\n                    sub_query_parts_to_regex(sub_query_parts, tokenizer_name, tokenizer_manager)?;\n                let regex = if self.case_insensitive {\n                    format!(\"(?i){}\", regex)\n                } else {\n      
              regex\n                };\n\n                let mut term_for_path = Term::from_field_json_path(\n                    field,\n                    json_path,\n                    json_options.is_expand_dots_enabled(),\n                );\n                term_for_path.append_type_and_str(\"\");\n\n                let value = term_for_path.value();\n                // We skip the 1st byte which is a marker to tell this is json. This isn't present\n                // in the dictionary\n                let byte_path_prefix = value.as_serialized()[1..].to_owned();\n\n                Ok((field, Some(byte_path_prefix), regex))\n            }\n            _ => Err(InvalidQuery::SchemaError(\n                \"trying to run a Wildcard query on a non-text field\".to_string(),\n            )),\n        }\n    }\n}\n\nimpl BuildTantivyAst for WildcardQuery {\n    fn build_tantivy_ast_impl(\n        &self,\n        context: &BuildTantivyAstContext,\n    ) -> Result<TantivyQueryAst, InvalidQuery> {\n        let (field, path, regex) = match self.to_regex(context.schema, context.tokenizer_manager) {\n            Ok(res) => res,\n            Err(InvalidQuery::FieldDoesNotExist { .. 
}) if self.lenient => {\n                return Ok(TantivyQueryAst::match_none());\n            }\n            Err(e) => return Err(e),\n        };\n        let regex =\n            tantivy_fst::Regex::new(&regex).context(\"failed to parse regex built from wildcard\")?;\n        let regex_automaton_with_path = JsonPathPrefix {\n            prefix: path.unwrap_or_default(),\n            automaton: regex.into(),\n        };\n        let regex_query_with_path = AutomatonQuery {\n            field,\n            automaton: Arc::new(regex_automaton_with_path),\n        };\n        Ok(regex_query_with_path.into())\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::schema::{TextFieldIndexing, TextOptions};\n\n    use super::*;\n    use crate::create_default_quickwit_tokenizer_manager;\n\n    fn single_text_field_schema(field_name: &str, tokenizer: &str) -> TantivySchema {\n        let mut schema_builder = TantivySchema::builder();\n        let text_options = TextOptions::default()\n            .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n        schema_builder.add_text_field(field_name, text_options);\n        schema_builder.build()\n    }\n\n    #[test]\n    fn test_wildcard_query_to_regex_on_text() {\n        let query = WildcardQuery {\n            field: \"text_field\".to_string(),\n            value: \"MyString Wh1ch?a.nOrMal Tokenizer would*cut\".to_string(),\n            lenient: false,\n            case_insensitive: false,\n        };\n\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n        for tokenizer in [\"raw\", \"whitespace\"] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_text_field(\"text_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, 
path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"MyString Wh1ch.a\\\\.nOrMal Tokenizer would.*cut\");\n            assert!(path.is_none());\n        }\n\n        for tokenizer in [\n            \"raw_lowercase\",\n            \"lowercase\",\n            \"default\",\n            \"chinese_compatible\",\n            \"source_code_default\",\n            \"source_code_with_hex\",\n        ] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_text_field(\"text_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"mystring wh1ch.a\\\\.normal tokenizer would.*cut\");\n            assert!(path.is_none());\n        }\n    }\n\n    #[test]\n    fn test_wildcard_query_to_regex_on_escaped_text() {\n        let query = WildcardQuery {\n            field: \"text_field\".to_string(),\n            value: \"MyString Wh1ch\\\\?a.nOrMal Tokenizer would\\\\*cut\".to_string(),\n            lenient: false,\n            case_insensitive: false,\n        };\n\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n        for tokenizer in [\"raw\", \"whitespace\"] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_text_field(\"text_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"MyString Wh1ch\\\\?a\\\\.nOrMal Tokenizer 
would\\\\*cut\");\n            assert!(path.is_none());\n        }\n\n        for tokenizer in [\n            \"raw_lowercase\",\n            \"lowercase\",\n            \"default\",\n            \"chinese_compatible\",\n            \"source_code_default\",\n            \"source_code_with_hex\",\n        ] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_text_field(\"text_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"mystring wh1ch\\\\?a\\\\.normal tokenizer would\\\\*cut\");\n            assert!(path.is_none());\n        }\n    }\n\n    #[test]\n    fn test_wildcard_query_to_regex_on_json() {\n        let query = WildcardQuery {\n            // this voluntarily contains uppercase and regex-unsafe char to make sure we properly\n            // keep the case, but sanitize special chars\n            field: \"json_field.Inner.Fie*ld\".to_string(),\n            value: \"MyString Wh1ch?a.nOrMal Tokenizer would*cut\".to_string(),\n            lenient: false,\n            case_insensitive: false,\n        };\n\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n        for tokenizer in [\"raw\", \"whitespace\"] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_json_field(\"json_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"MyString 
Wh1ch.a\\\\.nOrMal Tokenizer would.*cut\");\n            assert_eq!(path.unwrap(), \"Inner\\u{1}Fie*ld\\0s\".as_bytes());\n        }\n\n        for tokenizer in [\n            \"raw_lowercase\",\n            \"lowercase\",\n            \"default\",\n            \"chinese_compatible\",\n            \"source_code_default\",\n            \"source_code_with_hex\",\n        ] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_json_field(\"json_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"mystring wh1ch.a\\\\.normal tokenizer would.*cut\");\n            assert_eq!(path.unwrap(), \"Inner\\u{1}Fie*ld\\0s\".as_bytes());\n        }\n    }\n\n    #[test]\n    fn test_extract_regex_wildcard_missing_field() {\n        let query = WildcardQuery {\n            field: \"my_missing_field\".to_string(),\n            value: \"My query value*\".to_string(),\n            lenient: false,\n            case_insensitive: false,\n        };\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n        let schema = single_text_field_schema(\"my_field\", \"whitespace\");\n        let err = query.to_regex(&schema, &tokenizer_manager).unwrap_err();\n        let InvalidQuery::FieldDoesNotExist {\n            full_path: missing_field_full_path,\n        } = err\n        else {\n            panic!(\"unexpected error: {err:?}\");\n        };\n        assert_eq!(missing_field_full_path, \"my_missing_field\");\n    }\n\n    #[test]\n    fn test_wildcard_query_to_regex_on_text_case_insensitive() {\n        let query = WildcardQuery {\n            field: \"text_field\".to_string(),\n            value: \"MyString Wh1ch?a.nOrMal 
Tokenizer would*cut\".to_string(),\n            lenient: false,\n            case_insensitive: true,\n        };\n\n        let tokenizer_manager = create_default_quickwit_tokenizer_manager();\n        for tokenizer in [\"raw\", \"whitespace\"] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_text_field(\"text_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"(?i)MyString Wh1ch.a\\\\.nOrMal Tokenizer would.*cut\");\n            assert!(path.is_none());\n        }\n\n        for tokenizer in [\n            \"raw_lowercase\",\n            \"lowercase\",\n            \"default\",\n            \"chinese_compatible\",\n            \"source_code_default\",\n            \"source_code_with_hex\",\n        ] {\n            let mut schema_builder = TantivySchema::builder();\n            let text_options = TextOptions::default()\n                .set_indexing_options(TextFieldIndexing::default().set_tokenizer(tokenizer));\n            schema_builder.add_text_field(\"text_field\", text_options);\n            let schema = schema_builder.build();\n\n            let (_field, path, regex) = query.to_regex(&schema, &tokenizer_manager).unwrap();\n            assert_eq!(regex, \"(?i)mystring wh1ch.a\\\\.normal tokenizer would.*cut\");\n            assert!(path.is_none());\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/tokenizers/chinese_compatible.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::str::CharIndices;\n\nuse tantivy::tokenizer::{Token, TokenStream, Tokenizer};\n\n#[derive(Clone)]\npub(crate) struct ChineseTokenizer;\n\nimpl Tokenizer for ChineseTokenizer {\n    type TokenStream<'a> = ChineseTokenStream<'a>;\n\n    fn token_stream<'a>(&'a mut self, text: &'a str) -> Self::TokenStream<'a> {\n        ChineseTokenStream {\n            text,\n            last_char: None,\n            chars: text.char_indices(),\n            token: Token::default(),\n        }\n    }\n}\n\npub(crate) struct ChineseTokenStream<'a> {\n    text: &'a str,\n    last_char: Option<(usize, char)>,\n    chars: CharIndices<'a>,\n    token: Token,\n}\n\nfn char_is_cjk(c: char) -> bool {\n    // Block                                   Range       Comment\n    // CJK Unified Ideographs                  4E00-9FFF   Common\n    // CJK Unified Ideographs Extension A      3400-4DBF   Rare\n    // CJK Unified Ideographs Extension B      20000-2A6DF Rare, historic\n    // CJK Unified Ideographs Extension C      2A700–2B73F Rare, historic\n    // CJK Unified Ideographs Extension D      2B740–2B81F Uncommon, some in current use\n    // CJK Unified Ideographs Extension E      2B820–2CEAF Rare, historic\n    matches!(c,\n        '\\u{4500}'..='\\u{9FFF}' |\n        '\\u{3400}'..='\\u{4DBF}' |\n        '\\u{20000}'..='\\u{2A6DF}' |\n        
'\\u{2A700}'..='\\u{2CEAF}' // merge of extension C,D and E.\n    )\n}\n\n#[derive(Clone, Debug, Eq, PartialEq)]\nenum Grouping {\n    Keep,\n    SplitKeep,\n    SplitIgnore,\n}\n\nfn char_grouping(c: char) -> Grouping {\n    if c.is_alphanumeric() {\n        if char_is_cjk(c) {\n            Grouping::SplitKeep\n        } else {\n            Grouping::Keep\n        }\n    } else {\n        Grouping::SplitIgnore\n    }\n}\n\nimpl TokenStream for ChineseTokenStream<'_> {\n    fn advance(&mut self) -> bool {\n        self.token.text.clear();\n        self.token.position = self.token.position.wrapping_add(1);\n\n        let mut iter = self.last_char.take().into_iter().chain(&mut self.chars);\n\n        while let Some((offset_from, c)) = iter.next() {\n            match char_grouping(c) {\n                Grouping::Keep => {\n                    let offset_to = if let Some((next_index, next_char)) =\n                        iter.find(|&(_, c)| char_grouping(c) != Grouping::Keep)\n                    {\n                        self.last_char = Some((next_index, next_char));\n                        next_index\n                    } else {\n                        self.text.len()\n                    };\n\n                    self.token.offset_from = offset_from;\n                    self.token.offset_to = offset_to;\n                    self.token.text.push_str(&self.text[offset_from..offset_to]);\n                    return true;\n                }\n                Grouping::SplitKeep => {\n                    let num_bytes_in_char = c.len_utf8();\n                    self.token.offset_from = offset_from;\n                    self.token.offset_to = offset_from + num_bytes_in_char;\n                    self.token\n                        .text\n                        .push_str(&self.text[offset_from..(self.token.offset_to)]);\n                    return true;\n                }\n                Grouping::SplitIgnore => (),\n            }\n        }\n        false\n    
}\n\n    fn token(&self) -> &Token {\n        &self.token\n    }\n\n    fn token_mut(&mut self) -> &mut Token {\n        &mut self.token\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::tokenizer::{Token, TokenStream};\n\n    #[test]\n    fn test_chinese_tokenizer() {\n        let text = \"Hello world, 你好世界, bonjour monde\";\n        let tokenizer_manager = crate::create_default_quickwit_tokenizer_manager();\n        let mut tokenizer = tokenizer_manager\n            .get_tokenizer(\"chinese_compatible\")\n            .unwrap();\n        let mut text_stream = tokenizer.token_stream(text);\n\n        let mut res = Vec::new();\n        while let Some(tok) = text_stream.next() {\n            res.push(tok.clone());\n        }\n\n        // latin alphabet split on white spaces, Han split on each char\n        let expected = [\n            Token {\n                offset_from: 0,\n                offset_to: 5,\n                position: 0,\n                text: \"hello\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 6,\n                offset_to: 11,\n                position: 1,\n                text: \"world\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 13,\n                offset_to: 16,\n                position: 2,\n                text: \"你\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 16,\n                offset_to: 19,\n                position: 3,\n                text: \"好\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 19,\n                offset_to: 22,\n                position: 4,\n                text: \"世\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 22,\n                offset_to: 25,\n           
     position: 5,\n                text: \"界\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 27,\n                offset_to: 34,\n                position: 6,\n                text: \"bonjour\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 35,\n                offset_to: 40,\n                position: 7,\n                text: \"monde\".to_owned(),\n                position_length: 1,\n            },\n        ];\n\n        assert_eq!(res, expected);\n    }\n\n    #[test]\n    fn test_chinese_tokenizer_no_space() {\n        let text = \"Hello你好bonjour\";\n        let tokenizer_manager = crate::create_default_quickwit_tokenizer_manager();\n        let mut tokenizer = tokenizer_manager\n            .get_tokenizer(\"chinese_compatible\")\n            .unwrap();\n        let mut text_stream = tokenizer.token_stream(text);\n\n        let mut res = Vec::new();\n        while let Some(tok) = text_stream.next() {\n            res.push(tok.clone());\n        }\n\n        let expected = [\n            Token {\n                offset_from: 0,\n                offset_to: 5,\n                position: 0,\n                text: \"hello\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 5,\n                offset_to: 8,\n                position: 1,\n                text: \"你\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 8,\n                offset_to: 11,\n                position: 2,\n                text: \"好\".to_owned(),\n                position_length: 1,\n            },\n            Token {\n                offset_from: 11,\n                offset_to: 18,\n                position: 3,\n                text: \"bonjour\".to_owned(),\n                position_length: 1,\n            },\n        ];\n\n   
     assert_eq!(res, expected);\n    }\n\n    proptest::proptest! {\n        #[test]\n        fn test_proptest_ascii_default_chinese_equal(text in \"[ -~]{0,64}\") {\n            let tokenizer_manager = crate::create_default_quickwit_tokenizer_manager();\n            let mut cn_tok = tokenizer_manager.get_tokenizer(\"chinese_compatible\").unwrap();\n            let mut default_tok = tokenizer_manager.get_tokenizer(\"default\").unwrap();\n\n            let mut text_stream = cn_tok.token_stream(&text);\n\n            let mut cn_res = Vec::new();\n            while let Some(tok) = text_stream.next() {\n                cn_res.push(tok.clone());\n            }\n\n            let mut text_stream = default_tok.token_stream(&text);\n\n            let mut default_res = Vec::new();\n            while let Some(tok) = text_stream.next() {\n                default_res.push(tok.clone());\n            }\n\n            assert_eq!(cn_res, default_res);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/tokenizers/code_tokenizer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Range;\nuse std::str::CharIndices;\n\nuse tantivy::tokenizer::{Token, TokenStream, Tokenizer};\n\n/// A Tokenizer splitting based on casing families often used in code such ase camelCase or\n/// PascalCase.\n///\n/// For instance, it splits `PigCaféFactory2` as `[Pig, Café, Factory, 2]`, or `RPCResult` into\n/// `[RPC, Result]`.\n///\n/// Optionally, it can keep sequences of hexadecimal chars together, which can be useful when\n/// dealing with ids encoded in that way, such as UUIDs.\n#[derive(Clone, Default)]\npub struct CodeTokenizer {\n    token: Token,\n    enable_hex: bool,\n}\n\nimpl CodeTokenizer {\n    /// When hex support is enabled, the tokenizer tries to keep group of hexadecimal digits as one\n    /// token, instead of splitting them in groups of letters and numbers.\n    pub fn with_hex_support() -> Self {\n        CodeTokenizer {\n            token: Token::default(),\n            enable_hex: true,\n        }\n    }\n}\n\nimpl Tokenizer for CodeTokenizer {\n    type TokenStream<'a> = CodeTokenStream<'a>;\n\n    fn token_stream<'a>(&'a mut self, text: &'a str) -> Self::TokenStream<'a> {\n        self.token.reset();\n        CodeTokenStream {\n            chars: text.char_indices(),\n            state: CodeTokenStreamState::Empty,\n            text,\n            token: &mut self.token,\n            enable_hex: 
self.enable_hex,\n        }\n    }\n}\n\npub struct CodeTokenStream<'a> {\n    text: &'a str,\n    chars: CharIndices<'a>,\n    token: &'a mut Token,\n    state: CodeTokenStreamState,\n    enable_hex: bool,\n}\n\nenum AdvanceResult {\n    None,\n    Emit(TokenOffsets),\n    Backtrack,\n}\n\nimpl CodeTokenStream<'_> {\n    fn advance_inner(&mut self, enable_hex: bool) -> bool {\n        // this is cheap, just a copy of a few ptrs and integers\n        let checkpoint = self.chars.clone();\n\n        while let Some((next_char_offset, next_char)) = self.chars.next() {\n            match self.state.advance(next_char_offset, next_char, enable_hex) {\n                AdvanceResult::None => {}\n                AdvanceResult::Emit(token_offsets) => {\n                    self.update_token(token_offsets);\n                    return true;\n                }\n                AdvanceResult::Backtrack => {\n                    self.chars = checkpoint;\n                    self.state.reset();\n                    // this can't recurse more than once, Backtrack is only emitted from hex state,\n                    // and calling with false prevents that state from being generated.\n                    return self.advance_inner(false);\n                }\n            }\n        }\n\n        // No more chars.\n        match self.state.finalize() {\n            AdvanceResult::None => {}\n            AdvanceResult::Emit(token_offsets) => {\n                self.update_token(token_offsets);\n                return true;\n            }\n            AdvanceResult::Backtrack => {\n                self.chars = checkpoint;\n                self.state.reset();\n                return self.advance_inner(false);\n            }\n        }\n\n        false\n    }\n}\n\nimpl TokenStream for CodeTokenStream<'_> {\n    fn advance(&mut self) -> bool {\n        self.token.text.clear();\n        self.token.position = self.token.position.wrapping_add(1);\n\n        self.advance_inner(self.enable_hex)\n  
  }\n\n    fn token(&self) -> &Token {\n        self.token\n    }\n\n    fn token_mut(&mut self) -> &mut Token {\n        self.token\n    }\n}\n\nimpl CodeTokenStream<'_> {\n    fn update_token(&mut self, token_offsets: Range<usize>) {\n        self.token.offset_from = token_offsets.start;\n        self.token.offset_to = token_offsets.end;\n        self.token\n            .text\n            .push_str(&self.text[token_offsets.start..token_offsets.end]);\n    }\n}\n\nenum CodeTokenStreamState {\n    Empty,\n    ProcessingChars(ProcessingCharsState),\n    ProcessingHex(ProcessingHexState),\n}\n\nstruct ProcessingCharsState {\n    is_first_char: bool,\n    start_offset: usize,\n    current_char: char,\n    current_char_offset: usize,\n    current_char_type: CharType,\n}\n\ntype TokenOffsets = Range<usize>;\n\nimpl CodeTokenStreamState {\n    fn reset(&mut self) {\n        *self = CodeTokenStreamState::Empty;\n    }\n\n    fn advance(\n        &mut self,\n        next_char_offset: usize,\n        next_char: char,\n        allow_hex: bool,\n    ) -> AdvanceResult {\n        let next_char_type = get_char_type(next_char);\n        match self {\n            Self::Empty => {\n                match next_char_type {\n                    CharType::Delimiter => {\n                        self.reset();\n                    }\n                    _ => {\n                        let is_hex = next_char.is_ascii_digit()\n                            || ('a'..='f').contains(&next_char)\n                            || ('A'..='F').contains(&next_char);\n                        if allow_hex && is_hex {\n                            *self = CodeTokenStreamState::ProcessingHex(ProcessingHexState {\n                                seen_lowercase: next_char_type == CharType::LowerCase,\n                                seen_uppercase: next_char_type == CharType::UpperCase,\n                                seen_number: next_char_type == CharType::Numeric,\n                                
start_offset: next_char_offset,\n                                current_char: next_char,\n                                current_char_offset: next_char_offset,\n                            });\n                        } else {\n                            *self = CodeTokenStreamState::ProcessingChars(ProcessingCharsState {\n                                is_first_char: true,\n                                start_offset: next_char_offset,\n                                current_char_offset: next_char_offset,\n                                current_char: next_char,\n                                current_char_type: next_char_type,\n                            });\n                        }\n                    }\n                }\n                AdvanceResult::None\n            }\n            Self::ProcessingChars(state) => {\n                match (state.current_char_type, next_char_type) {\n                    (_, CharType::Delimiter) => {\n                        let offsets = TokenOffsets {\n                            start: state.start_offset,\n                            end: state.current_char_offset + state.current_char.len_utf8(),\n                        };\n                        // this is the only case where we want to reset, otherwise we might get\n                        // back to a hex-state in a place where we did not get a delimiter\n                        self.reset();\n                        AdvanceResult::Emit(offsets)\n                    }\n                    // We do not emit a token if we have only `Ac` (is_first_char = true).\n                    // But we emit the token `AB` if we have `ABCa`,\n                    (CharType::UpperCase, CharType::LowerCase) => {\n                        if state.is_first_char {\n                            state.is_first_char = false;\n                            state.current_char_offset = next_char_offset;\n                            state.current_char = next_char;\n                         
   state.current_char_type = next_char_type;\n                            AdvanceResult::None\n                        } else {\n                            let offsets = TokenOffsets {\n                                start: state.start_offset,\n                                end: state.current_char_offset,\n                            };\n                            state.is_first_char = false;\n                            state.start_offset = state.current_char_offset;\n                            state.current_char_offset = next_char_offset;\n                            state.current_char = next_char;\n                            state.current_char_type = next_char_type;\n                            AdvanceResult::Emit(offsets)\n                        }\n                    }\n                    // Don't emit tokens on identical char types.\n                    (CharType::UpperCase, CharType::UpperCase)\n                    | (CharType::LowerCase, CharType::LowerCase)\n                    | (CharType::Numeric, CharType::Numeric) => {\n                        state.is_first_char = false;\n                        state.current_char_offset = next_char_offset;\n                        state.current_char = next_char;\n                        AdvanceResult::None\n                    }\n                    _ => {\n                        let offsets = TokenOffsets {\n                            start: state.start_offset,\n                            end: state.current_char_offset + state.current_char.len_utf8(),\n                        };\n                        state.is_first_char = true;\n                        state.start_offset = next_char_offset;\n                        state.current_char_offset = next_char_offset;\n                        state.current_char = next_char;\n                        state.current_char_type = next_char_type;\n                        AdvanceResult::Emit(offsets)\n                    }\n                }\n            }\n          
  Self::ProcessingHex(state) => {\n                match state.consume_char(next_char_offset, next_char) {\n                    HexResult::None => AdvanceResult::None,\n                    HexResult::Emit(offsets) => {\n                        self.reset();\n                        AdvanceResult::Emit(offsets)\n                    }\n                    HexResult::RecoverableError(state) => {\n                        *self = CodeTokenStreamState::ProcessingChars(state);\n                        // the char wasn't actually consumed, we recurse once to make sure it is\n                        self.advance(next_char_offset, next_char, allow_hex)\n                    }\n                    HexResult::IrrecoverableError => AdvanceResult::Backtrack,\n                }\n            }\n        }\n    }\n\n    fn finalize(&mut self) -> AdvanceResult {\n        match self {\n            Self::Empty => AdvanceResult::None,\n            Self::ProcessingChars(char_state) => {\n                let offsets = TokenOffsets {\n                    start: char_state.start_offset,\n                    end: char_state.current_char_offset + char_state.current_char.len_utf8(),\n                };\n                *self = Self::Empty;\n                AdvanceResult::Emit(offsets)\n            }\n            CodeTokenStreamState::ProcessingHex(hex_state) => match hex_state.finalize() {\n                HexResult::None => unreachable!(),\n                HexResult::Emit(offsets) => {\n                    *self = Self::Empty;\n                    AdvanceResult::Emit(offsets)\n                }\n                HexResult::RecoverableError(state) => {\n                    *self = CodeTokenStreamState::ProcessingChars(state);\n                    self.finalize()\n                }\n                HexResult::IrrecoverableError => AdvanceResult::Backtrack,\n            },\n        }\n    }\n}\n\n/// Returns the type of the character:\n/// - `UpperCase` for `p{Lu}`.\n/// - `LowerCase` for 
`p{Ll}`.\n/// - `Numeric` for `\\d`.\n/// - `Delimiter` for the remaining characters.\nfn get_char_type(c: char) -> CharType {\n    if c.is_alphabetic() {\n        if c.is_uppercase() {\n            CharType::UpperCase\n        } else {\n            CharType::LowerCase\n        }\n    } else if c.is_numeric() {\n        CharType::Numeric\n    } else {\n        CharType::Delimiter\n    }\n}\n\n#[derive(Clone, Copy, Debug, Eq, PartialEq)]\nenum CharType {\n    // Equivalent of regex `p{Lu}`.\n    UpperCase,\n    // Equivalent of regex `p{Ll}`.\n    LowerCase,\n    // Equivalent of regex `\\d`.\n    Numeric,\n    // Other characters.\n    Delimiter,\n}\n\n#[derive(Debug)]\nstruct ProcessingHexState {\n    seen_uppercase: bool,\n    seen_lowercase: bool,\n    seen_number: bool,\n\n    start_offset: usize,\n    current_char_offset: usize,\n    current_char: char,\n}\n\nenum HexResult {\n    // no token emitted\n    None,\n    // a token is being emitted, after that the state needs to be reset.\n    Emit(TokenOffsets),\n    // we got an error, but were able to generate a code tokenizer state\n    RecoverableError(ProcessingCharsState),\n    // we got an error and can't generate a code tokenizer state, we need to backtrack\n    IrrecoverableError,\n}\n\nimpl ProcessingHexState {\n    // if this returns an error, the char was *not* consumed\n    fn consume_char(&mut self, next_char_offset: usize, next_char: char) -> HexResult {\n        match next_char {\n            '0'..='9' => self.seen_number = true,\n            'a'..='f' => {\n                if !self.seen_uppercase {\n                    self.seen_lowercase = true;\n                } else {\n                    return self.to_processing_chars_state();\n                }\n            }\n            'A'..='F' => {\n                if !self.seen_lowercase {\n                    self.seen_uppercase = true;\n                } else {\n                    return self.to_processing_chars_state();\n                }\n       
     }\n            c => {\n                if get_char_type(c) == CharType::Delimiter {\n                    // end of sequence, check if size is a multiple of 2, or try to generate code\n                    // state. We use next_char_offset as it already takes into account the size of\n                    // the last character\n                    if (next_char_offset - self.start_offset).is_multiple_of(2) {\n                        return HexResult::Emit(self.start_offset..next_char_offset);\n                    }\n                }\n                // we got an invalid non-delimiter, or our sequence has an odd length. Either way,\n                // we need to switch to the code tokenizer\n                return self.to_processing_chars_state();\n            }\n        }\n        // char was accepted, update state\n        self.current_char_offset = next_char_offset;\n        self.current_char = next_char;\n        HexResult::None\n    }\n\n    fn to_processing_chars_state(&self) -> HexResult {\n        let current_char_type = match (self.seen_uppercase, self.seen_lowercase, self.seen_number) {\n            // for `Aab`, we actually take this branch, as `a` hasn't been consumed just yet.\n            (true, false, false) => CharType::UpperCase,\n            (false, true, false) => CharType::LowerCase,\n            (false, false, true) => CharType::Numeric,\n            _ => return HexResult::IrrecoverableError,\n        };\n        HexResult::RecoverableError(ProcessingCharsState {\n            current_char: self.current_char,\n            current_char_offset: self.current_char_offset,\n            start_offset: self.start_offset,\n            is_first_char: self.current_char_offset == self.start_offset,\n            current_char_type,\n        })\n    }\n\n    fn finalize(&self) -> HexResult {\n        let next_char_offset = self.current_char_offset + self.current_char.len_utf8();\n        if (next_char_offset - self.start_offset).is_multiple_of(2) {\n            
return HexResult::Emit(self.start_offset..next_char_offset);\n        }\n        self.to_processing_chars_state()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::tokenizer::{Token, TokenStream, Tokenizer};\n\n    use super::CodeTokenizer;\n\n    #[test]\n    fn test_code_tokenizer() {\n        let mut tokenizer = CodeTokenizer::default();\n        {\n            let mut token_stream = tokenizer.token_stream(\"PigCaféFactory2\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                    offset_from: 0,\n                    offset_to: 3,\n                    position: 0,\n                    text: \"Pig\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 3,\n                    offset_to: 8,\n                    position: 1,\n                    text: \"Café\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 8,\n                    offset_to: 15,\n                    position: 2,\n                    text: \"Factory\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 15,\n                    offset_to: 16,\n                    position: 3,\n                    text: \"2\".to_owned(),\n                    position_length: 1,\n                },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n        {\n            let mut token_stream = tokenizer.token_stream(\"PIG_CAFE_FACTORY\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                   
 offset_from: 0,\n                    offset_to: 3,\n                    position: 0,\n                    text: \"PIG\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 4,\n                    offset_to: 8,\n                    position: 1,\n                    text: \"CAFE\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 9,\n                    offset_to: 16,\n                    position: 2,\n                    text: \"FACTORY\".to_owned(),\n                    position_length: 1,\n                },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n        {\n            let mut token_stream = tokenizer.token_stream(\"TPigCafeFactory\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                    offset_from: 0,\n                    offset_to: 1,\n                    position: 0,\n                    text: \"T\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 1,\n                    offset_to: 4,\n                    position: 1,\n                    text: \"Pig\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 4,\n                    offset_to: 8,\n                    position: 2,\n                    text: \"Cafe\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 8,\n                    offset_to: 15,\n                    position: 3,\n                    text: \"Factory\".to_owned(),\n                    position_length: 1,\n                },\n   
         ];\n            assert_eq!(res, expected_tokens);\n        }\n        {\n            let mut token_stream = tokenizer.token_stream(\"PIG# Cafe@FACTORY\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                    offset_from: 0,\n                    offset_to: 3,\n                    position: 0,\n                    text: \"PIG\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 5,\n                    offset_to: 9,\n                    position: 1,\n                    text: \"Cafe\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 10,\n                    offset_to: 17,\n                    position: 2,\n                    text: \"FACTORY\".to_owned(),\n                    position_length: 1,\n                },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n    }\n\n    #[test]\n    fn test_code_tokenizer_hex() {\n        let mut tokenizer = CodeTokenizer::with_hex_support();\n        {\n            let mut token_stream = tokenizer.token_stream(\"PigCaféFactory2\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                    offset_from: 0,\n                    offset_to: 3,\n                    position: 0,\n                    text: \"Pig\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 3,\n                    offset_to: 8,\n                    position: 1,\n                    text: \"Café\".to_owned(),\n                    
position_length: 1,\n                },\n                Token {\n                    offset_from: 8,\n                    offset_to: 15,\n                    position: 2,\n                    text: \"Factory\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 15,\n                    offset_to: 16,\n                    position: 3,\n                    text: \"2\".to_owned(),\n                    position_length: 1,\n                },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n        {\n            let mut token_stream = tokenizer.token_stream(\"PIG_CAFE_FACTORY\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                    offset_from: 0,\n                    offset_to: 3,\n                    position: 0,\n                    text: \"PIG\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 4,\n                    offset_to: 8,\n                    position: 1,\n                    text: \"CAFE\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 9,\n                    offset_to: 16,\n                    position: 2,\n                    text: \"FACTORY\".to_owned(),\n                    position_length: 1,\n                },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n        {\n            let mut token_stream = tokenizer.token_stream(\"TPigCafeFactory\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n           
         offset_from: 0,\n                    offset_to: 1,\n                    position: 0,\n                    text: \"T\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 1,\n                    offset_to: 4,\n                    position: 1,\n                    text: \"Pig\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 4,\n                    offset_to: 8,\n                    position: 2,\n                    text: \"Cafe\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 8,\n                    offset_to: 15,\n                    position: 3,\n                    text: \"Factory\".to_owned(),\n                    position_length: 1,\n                },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n        {\n            let mut token_stream = tokenizer.token_stream(\"PIG# Cafe@FACTORY\");\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.clone());\n            }\n            let expected_tokens = vec![\n                Token {\n                    offset_from: 0,\n                    offset_to: 3,\n                    position: 0,\n                    text: \"PIG\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 5,\n                    offset_to: 9,\n                    position: 1,\n                    text: \"Cafe\".to_owned(),\n                    position_length: 1,\n                },\n                Token {\n                    offset_from: 10,\n                    offset_to: 17,\n                    position: 2,\n                    text: \"FACTORY\".to_owned(),\n                    position_length: 1,\n            
    },\n            ];\n            assert_eq!(res, expected_tokens);\n        }\n    }\n\n    #[test]\n    fn test_code_tokenizer_hex_scenaris() {\n        let test_vectors = vec![\n            // simple hex, separated by delimiter, or at end of string\n            (\n                \"fa63bbbf-0fb9-5ec8-ae63-561dc0f444aa\",\n                vec![\"fa63bbbf\", \"0fb9\", \"5ec8\", \"ae63\", \"561dc0f444aa\"],\n            ),\n            (\n                \"FA63BBBF-0FB9-5EC8-AE63-561DC0F444AA\",\n                vec![\"FA63BBBF\", \"0FB9\", \"5EC8\", \"AE63\", \"561DC0F444AA\"],\n            ),\n            // last token has odd len\n            (\n                \"fa63bbbf-0fb9-5ec8-ae63-561dc0f444a\",\n                vec![\n                    \"fa63bbbf\", \"0fb9\", \"5ec8\", \"ae63\", \"561\", \"dc\", \"0\", \"f\", \"444\", \"a\",\n                ],\n            ),\n            // a middle token has odd len\n            (\n                \"fa63bbbf-0fb9-5ec8-ae6-561dc0f444aa\",\n                vec![\"fa63bbbf\", \"0fb9\", \"5ec8\", \"ae\", \"6\", \"561dc0f444aa\"],\n            ),\n            // token starts with upper case\n            (\n                \"Fa63bbbf-0fb9-5ec8-ae63-561dc0f444aa\",\n                vec![\"Fa\", \"63\", \"bbbf\", \"0fb9\", \"5ec8\", \"ae63\", \"561dc0f444aa\"],\n            ),\n            // change in case during a token\n            (\n                \"fa63Bbbf-0fb9-5ec8-ae63-561dc0f444aa\",\n                vec![\"fa\", \"63\", \"Bbbf\", \"0fb9\", \"5ec8\", \"ae63\", \"561dc0f444aa\"],\n            ),\n            (\n                \"fa63bbBf-0fb9-5ec8-ae63-561dc0f444aa\",\n                vec![\n                    \"fa\",\n                    \"63\",\n                    \"bb\",\n                    \"Bf\",\n                    \"0fb9\",\n                    \"5ec8\",\n                    \"ae63\",\n                    \"561dc0f444aa\",\n                ],\n            ),\n            // token starts with lower 
case\n            (\n                \"fA63BBBF-0FB9-5EC8-AE63-561DC0F444AA\",\n                vec![\n                    \"f\",\n                    \"A\",\n                    \"63\",\n                    \"BBBF\",\n                    \"0FB9\",\n                    \"5EC8\",\n                    \"AE63\",\n                    \"561DC0F444AA\",\n                ],\n            ),\n            // token contain non hex\n            (\n                \"fa63bgbf-0fb9-5ec8-ae63-561dc0f444aa\",\n                vec![\"fa\", \"63\", \"bgbf\", \"0fb9\", \"5ec8\", \"ae63\", \"561dc0f444aa\"],\n            ),\n            // non 0-9 numeric\n            (\n                \"fa6③bbbf-0fb9-5ec8-ae63-561dc0f444aa\",\n                vec![\"fa\", \"6③\", \"bbbf\", \"0fb9\", \"5ec8\", \"ae63\", \"561dc0f444aa\"],\n            ),\n            (\"301ms\", vec![\"301\", \"ms\"]),\n            (\"301cd\", vec![\"301\", \"cd\"]),\n            (\"30ms\", vec![\"30\", \"ms\"]),\n            // we don't know if it's candelas or hex, and assume hex in this case\n            (\"30cd\", vec![\"30cd\"]),\n            (\"ABCDef\", vec![\"ABC\", \"Def\"]),\n        ];\n\n        let mut tokenizer = CodeTokenizer::with_hex_support();\n        for (text, expected) in test_vectors {\n            let mut token_stream = tokenizer.token_stream(text);\n            let mut res = Vec::new();\n            while let Some(tok) = token_stream.next() {\n                res.push(tok.text.clone());\n            }\n            assert_eq!(res, expected);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/tokenizers/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod chinese_compatible;\nmod code_tokenizer;\nmod tokenizer_manager;\n\nuse once_cell::sync::Lazy;\nuse tantivy::tokenizer::{\n    AsciiFoldingFilter, LowerCaser, RawTokenizer, RemoveLongFilter, SimpleTokenizer, TextAnalyzer,\n    WhitespaceTokenizer,\n};\n\nuse self::chinese_compatible::ChineseTokenizer;\npub use self::code_tokenizer::CodeTokenizer;\npub use self::tokenizer_manager::{RAW_TOKENIZER_NAME, TokenizerManager};\n\npub const DEFAULT_REMOVE_TOKEN_LENGTH: usize = 255;\n\n/// Quickwit's tokenizer/analyzer manager.\npub fn create_default_quickwit_tokenizer_manager() -> TokenizerManager {\n    let tokenizer_manager = TokenizerManager::new();\n\n    let raw_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .build();\n    tokenizer_manager.register(\"raw\", raw_tokenizer, false);\n\n    let raw_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n        .filter(LowerCaser)\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .build();\n    tokenizer_manager.register(\"raw_lowercase\", raw_tokenizer, true);\n\n    let lower_case_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n        .filter(LowerCaser)\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .build();\n    
tokenizer_manager.register(\"lowercase\", lower_case_tokenizer, true);\n\n    let default_tokenizer = TextAnalyzer::builder(SimpleTokenizer::default())\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .filter(LowerCaser)\n        .build();\n    tokenizer_manager.register(\"default\", default_tokenizer, true);\n    tokenizer_manager.register(\"whitespace\", WhitespaceTokenizer::default(), false);\n\n    let chinese_tokenizer = TextAnalyzer::builder(ChineseTokenizer)\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .filter(LowerCaser)\n        .build();\n    tokenizer_manager.register(\"chinese_compatible\", chinese_tokenizer, true);\n    tokenizer_manager.register(\n        \"source_code_default\",\n        TextAnalyzer::builder(CodeTokenizer::default())\n            .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n            .filter(LowerCaser)\n            .filter(AsciiFoldingFilter)\n            .build(),\n        true,\n    );\n    tokenizer_manager.register(\n        \"source_code_with_hex\",\n        TextAnalyzer::builder(CodeTokenizer::with_hex_support())\n            .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n            .filter(LowerCaser)\n            .filter(AsciiFoldingFilter)\n            .build(),\n        true,\n    );\n    tokenizer_manager\n}\n\nfn create_quickwit_fastfield_normalizer_manager() -> TokenizerManager {\n    let raw_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .build();\n    let lower_case_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n        .filter(LowerCaser)\n        .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n        .build();\n    let tokenizer_manager = TokenizerManager::new();\n    tokenizer_manager.register(\"raw\", raw_tokenizer, false);\n    tokenizer_manager.register(\"lowercase\", lower_case_tokenizer, 
true);\n    tokenizer_manager\n}\n\npub fn get_quickwit_fastfield_normalizer_manager() -> &'static TokenizerManager {\n    static QUICKWIT_FAST_FIELD_NORMALIZER_MANAGER: Lazy<TokenizerManager> =\n        Lazy::new(create_quickwit_fastfield_normalizer_manager);\n    &QUICKWIT_FAST_FIELD_NORMALIZER_MANAGER\n}\n\n#[cfg(test)]\nmod tests {\n\n    #[test]\n    fn test_tokenizers_in_manager() {\n        let tokenizer_manager = super::create_default_quickwit_tokenizer_manager();\n        tokenizer_manager\n            .get_tokenizer(\"chinese_compatible\")\n            .unwrap();\n        tokenizer_manager.get_tokenizer(\"default\").unwrap();\n        tokenizer_manager.get_tokenizer(\"raw\").unwrap();\n    }\n\n    #[test]\n    fn test_raw_tokenizer() {\n        let tokenizer_manager = super::create_default_quickwit_tokenizer_manager();\n        let my_haiku = r#\"\n        white sandy beach\n        a strong wind is coming\n        sand in my face\n        \"#;\n        let my_long_text = \"a text, that is just too long, no one will type it, no one will like \\\n                            it, no one shall find it. 
I just need some more chars, now you may \\\n                            not pass.\";\n\n        let mut tokenizer = tokenizer_manager.get_tokenizer(\"raw\").unwrap();\n        let mut haiku_stream = tokenizer.token_stream(my_haiku);\n        assert!(haiku_stream.advance());\n        assert!(!haiku_stream.advance());\n        let mut other_tokenizer = tokenizer_manager.get_tokenizer(\"raw\").unwrap();\n        let mut other_stream = other_tokenizer.token_stream(my_long_text);\n        assert!(other_stream.advance());\n        assert!(!other_stream.advance());\n    }\n\n    #[test]\n    fn test_code_tokenizer_in_tokenizer_manager() {\n        let mut code_tokenizer = super::create_default_quickwit_tokenizer_manager()\n            .get_tokenizer(\"source_code_default\")\n            .unwrap();\n        let mut token_stream = code_tokenizer.token_stream(\"PigCaféFactory2\");\n        let mut tokens = Vec::new();\n        while let Some(token) = token_stream.next() {\n            tokens.push(token.text.to_string());\n        }\n        assert_eq!(tokens, vec![\"pig\", \"cafe\", \"factory\", \"2\"])\n    }\n\n    #[test]\n    fn test_raw_lowercase_tokenizer() {\n        let tokenizer_manager = super::create_default_quickwit_tokenizer_manager();\n        let my_long_text = \"a text, that is just too long, no one will type it, no one will like \\\n                            it, no one shall find it. I just need some more chars, now you may \\\n                            not pass.\";\n\n        let mut tokenizer = tokenizer_manager.get_tokenizer(\"raw_lowercase\").unwrap();\n        let mut stream = tokenizer.token_stream(my_long_text);\n        assert!(stream.advance());\n        assert_eq!(stream.token().text.len(), my_long_text.len());\n        // there are non letter, so we can't check for all lowercase directly\n        assert!(stream.token().text.chars().all(|c| !c.is_uppercase()));\n        assert!(!stream.advance());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-query/src/tokenizers/tokenizer_manager.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::sync::{Arc, RwLock};\n\nuse tantivy::tokenizer::{\n    LowerCaser, RawTokenizer, RemoveLongFilter, TextAnalyzer,\n    TokenizerManager as TantivyTokenizerManager,\n};\n\nuse crate::DEFAULT_REMOVE_TOKEN_LENGTH;\n\npub const RAW_TOKENIZER_NAME: &str = \"raw\";\nconst LOWERCASE_TOKENIZER_NAME: &str = \"lowercase\";\nconst RAW_LOWERCASE_TOKENIZER_NAME: &str = \"raw_lowercase\";\n\n#[derive(Clone)]\npub struct TokenizerManager {\n    inner: TantivyTokenizerManager,\n    is_lowercaser: Arc<RwLock<HashMap<String, bool>>>,\n}\n\nimpl TokenizerManager {\n    /// Creates an empty tokenizer manager.\n    pub fn new() -> Self {\n        let this = Self {\n            inner: TantivyTokenizerManager::new(),\n            is_lowercaser: Arc::new(RwLock::new(HashMap::new())),\n        };\n\n        // in practice these will almost always be overridden in\n        // create_default_quickwit_tokenizer_manager()\n        let raw_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n            .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n            .build();\n        this.register(RAW_TOKENIZER_NAME, raw_tokenizer, false);\n        let raw_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n            .filter(LowerCaser)\n            
.filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n            .build();\n        this.register(RAW_LOWERCASE_TOKENIZER_NAME, raw_tokenizer, true);\n        let lower_case_tokenizer = TextAnalyzer::builder(RawTokenizer::default())\n            .filter(LowerCaser)\n            .filter(RemoveLongFilter::limit(DEFAULT_REMOVE_TOKEN_LENGTH))\n            .build();\n        this.register(LOWERCASE_TOKENIZER_NAME, lower_case_tokenizer, true);\n\n        this\n    }\n\n    /// Registers a new tokenizer associated with a given name.\n    pub fn register<T>(&self, tokenizer_name: &str, tokenizer: T, does_lowercasing: bool)\n    where TextAnalyzer: From<T> {\n        self.inner.register(tokenizer_name, tokenizer);\n        self.is_lowercaser\n            .write()\n            .unwrap()\n            .insert(tokenizer_name.to_string(), does_lowercasing);\n    }\n\n    /// Returns the tokenizer registered under the given name, if any.\n    pub fn get_tokenizer(&self, tokenizer_name: &str) -> Option<TextAnalyzer> {\n        self.inner.get(tokenizer_name)\n    }\n\n    /// Returns the normalizer matching the given tokenizer: the `raw_lowercase` analyzer if the\n    /// tokenizer was registered as lowercasing, the `raw` analyzer otherwise. Returns `None` if\n    /// the tokenizer is unknown.\n    pub fn get_normalizer(&self, tokenizer_name: &str) -> Option<TextAnalyzer> {\n        let use_lowercaser = self\n            .is_lowercaser\n            .read()\n            .unwrap()\n            .get(tokenizer_name)\n            .copied()?;\n        let analyzer = if use_lowercaser {\n            RAW_LOWERCASE_TOKENIZER_NAME\n        } else {\n            RAW_TOKENIZER_NAME\n        };\n        self.get_tokenizer(analyzer)\n    }\n\n    /// Returns the inner tantivy `TokenizerManager`.\n    pub fn tantivy_manager(&self) -> &TantivyTokenizerManager {\n        &self.inner\n    }\n}\n\nimpl Default for TokenizerManager {\n    fn default() -> Self {\n        Self::new()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/Cargo.toml",
    "content": "[package]\nname = \"quickwit-rest-client\"\ndescription = \"Rust client for Quickwit REST API\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nbytes = { workspace = true }\nreqwest = { workspace = true }\nreqwest-middleware = { workspace = true }\nreqwest-retry = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\n\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-indexing = { workspace = true }\nquickwit-ingest = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-serve = { workspace = true }\n\n[dev-dependencies]\nhttp = { workspace = true }\nwiremock = { workspace = true }\n\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/README.md",
    "content": "# quickwit-rest-client\n\nThis project hosts quickwit REST client.\n\n\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/resources/tests/documents_to_ingest.json",
    "content": "{\"user\":\"8\",\"tags\":[\"rust\"]}\n{\"user\":\"7\",\"tags\":[\"python\"]}\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse reqwest::StatusCode;\nuse reqwest_middleware::Error as MiddlewareError;\nuse serde::Deserialize;\nuse thiserror::Error;\n\npub static DEFAULT_ADDRESS: &str = \"http://127.0.0.1:7280\";\npub static DEFAULT_CONTENT_TYPE: &str = \"application/json\";\n\n#[derive(Error, Debug)]\npub enum Error {\n    // Error returned by Quickwit server.\n    #[error(\"API error: {0}\")]\n    Api(#[from] ApiError),\n    // Error returned by reqwest lib.\n    #[error(\"client error: {0:?}\")]\n    Client(#[from] reqwest::Error),\n    // IO Error returned by tokio lib.\n    #[error(\"IO error: {0}\")]\n    Io(#[from] tokio::io::Error),\n    // Internal error returned by quickwit client lib.\n    #[error(\"internal Quickwit client error: {0}\")]\n    Internal(String),\n    // Error returned by reqwest middleware.\n    #[error(\"client middleware error: {0:?}\")]\n    Middleware(anyhow::Error),\n    // Error returned by url lib when parsing a string.\n    #[error(\"URL parsing error: {0}\")]\n    UrlParse(String),\n}\n\nimpl Error {\n    pub fn status_code(&self) -> Option<StatusCode> {\n        match &self {\n            Self::Api(error) => Some(error.code),\n            Self::Client(error) => error.status(),\n            Self::Internal(_) => Some(StatusCode::INTERNAL_SERVER_ERROR),\n            Self::Io(_) => Some(StatusCode::INTERNAL_SERVER_ERROR),\n            
Self::Middleware(_) => Some(StatusCode::INTERNAL_SERVER_ERROR),\n            Self::UrlParse(_) => Some(StatusCode::BAD_REQUEST),\n        }\n    }\n}\n\nimpl From<MiddlewareError> for Error {\n    fn from(error: MiddlewareError) -> Self {\n        match error {\n            MiddlewareError::Middleware(error) => Error::Middleware(error),\n            MiddlewareError::Reqwest(error) => Error::Client(error),\n        }\n    }\n}\n\n#[derive(Debug, Error)]\npub struct ApiError {\n    pub message: Option<String>,\n    pub code: StatusCode,\n}\n\n// Implement `Display` for `ApiError`.\nimpl std::fmt::Display for ApiError {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        if let Some(error) = &self.message {\n            write!(f, \"(code={}, message={})\", self.code, error)\n        } else {\n            write!(f, \"(code={})\", self.code)\n        }\n    }\n}\n\n#[derive(Deserialize)]\npub(crate) struct ErrorResponsePayload {\n    pub message: String,\n}\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::Path;\nuse std::{io, mem};\n\nuse bytes::Bytes;\nuse tokio::fs::File;\nuse tokio::io::{AsyncBufReadExt, AsyncRead, BufReader};\nuse tracing::warn;\n\npub mod error;\npub mod models;\npub mod rest_client;\n\n// re-exports\npub use quickwit_config::ConfigFormat;\npub use reqwest::Url;\n\npub(crate) struct BatchLineReader {\n    buf_reader: BufReader<Box<dyn AsyncRead + Send + Sync + Unpin>>,\n    buffer: Vec<u8>,\n    alloc_num_bytes: usize,\n    max_batch_num_bytes: usize,\n    num_lines: usize,\n    has_next: bool,\n}\n\nimpl BatchLineReader {\n    pub async fn from_file(filepath: &Path, max_batch_num_bytes: usize) -> io::Result<Self> {\n        let file = File::open(&filepath).await?;\n        Ok(Self::new(Box::new(file), max_batch_num_bytes))\n    }\n\n    pub fn from_stdin(max_batch_num_bytes: usize) -> Self {\n        Self::new(Box::new(tokio::io::stdin()), max_batch_num_bytes)\n    }\n\n    pub fn new(\n        reader: Box<dyn AsyncRead + Send + Sync + Unpin>,\n        max_batch_num_bytes: usize,\n    ) -> Self {\n        let alloc_num_bytes = max_batch_num_bytes + 100 * 1024; // Add 100 KiB headroom to avoid reallocation.\n        Self {\n            buf_reader: BufReader::new(reader),\n            buffer: Vec::with_capacity(alloc_num_bytes),\n            alloc_num_bytes,\n            max_batch_num_bytes,\n            
num_lines: 0,\n            has_next: true,\n        }\n    }\n\n    pub async fn next_batch(&mut self) -> io::Result<Option<Bytes>> {\n        loop {\n            let line_num_bytes = self.buf_reader.read_until(b'\\n', &mut self.buffer).await?;\n\n            if line_num_bytes > self.max_batch_num_bytes {\n                warn!(\n                    \"Skipping line {}, which exceeds the maximum allowed content length ({} vs. \\\n                     {} bytes).\",\n                    self.num_lines + 1,\n                    line_num_bytes,\n                    self.max_batch_num_bytes\n                );\n                let new_len = self.buffer.len() - line_num_bytes;\n                self.buffer.truncate(new_len);\n                continue;\n            }\n            if self.buffer.len() > self.max_batch_num_bytes {\n                let mut new_buffer = Vec::with_capacity(self.alloc_num_bytes);\n                let new_len = self.buffer.len() - line_num_bytes;\n                new_buffer.extend_from_slice(&self.buffer[new_len..]);\n                self.buffer.truncate(new_len);\n                let batch = mem::replace(&mut self.buffer, new_buffer);\n                return Ok(Some(Bytes::from(batch)));\n            }\n            if line_num_bytes == 0 {\n                self.has_next = false;\n                if self.buffer.is_empty() {\n                    return Ok(None);\n                }\n                let batch = mem::take(&mut self.buffer);\n                return Ok(Some(Bytes::from(batch)));\n            }\n            self.num_lines += 1;\n        }\n    }\n\n    /// Returns whether there is still data available\n    ///\n    /// This can spuriously return `true` when there was no data\n    /// to send at all.\n    pub fn has_next(&self) -> bool {\n        self.has_next\n    }\n\n    fn from_string(payload: impl ToString, max_batch_num_bytes: usize) -> Self {\n        Self::new(\n            
Box::new(std::io::Cursor::new(payload.to_string().into_bytes())),\n            max_batch_num_bytes,\n        )\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[tokio::test]\n    async fn test_batch_reader() {\n        {\n            let mut batch_reader = BatchLineReader::from_string(\"\".to_string(), 10);\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n        }\n        {\n            let mut batch_reader = BatchLineReader::from_string(\"foo\\n\", 10);\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"foo\\n\"\n            );\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n        }\n        {\n            let mut batch_reader = BatchLineReader::from_string(\"foo\\nbar\\nqux\\n\", 10);\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"foo\\nbar\\n\"\n            );\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"qux\\n\"\n            );\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n        }\n        {\n            let mut batch_reader = BatchLineReader::from_string(\"fooo\\nbaar\\nqux\\n\", 10);\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"fooo\\nbaar\\n\"\n            );\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"qux\\n\"\n            );\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n        }\n   
     {\n            let mut batch_reader =\n                BatchLineReader::from_string(\"foobarquxbaz\\nfoo\\nbar\\nqux\\n\", 10);\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"foo\\nbar\\n\"\n            );\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"qux\\n\"\n            );\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n        }\n        {\n            let mut batch_reader =\n                BatchLineReader::from_string(\"foo\\nbar\\nfoobarquxbaz\\nqux\\n\", 10);\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"foo\\nbar\\n\"\n            );\n            assert_eq!(\n                &batch_reader.next_batch().await.unwrap().unwrap()[..],\n                b\"qux\\n\"\n            );\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n            assert!(batch_reader.next_batch().await.unwrap().is_none());\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/src/models.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::PathBuf;\nuse std::time::Duration;\n\nuse reqwest::StatusCode;\nuse serde::de::DeserializeOwned;\nuse serde::{Deserialize, Serialize};\nuse serde_json::Value as JsonValue;\n\nuse crate::error::{ApiError, Error, ErrorResponsePayload};\n\n#[derive(Debug)]\npub struct ApiResponse {\n    inner: reqwest::Response,\n}\n\nimpl ApiResponse {\n    pub fn new(inner: reqwest::Response) -> Self {\n        Self { inner }\n    }\n    /// Get the HTTP status code of the response\n    pub fn status_code(&self) -> StatusCode {\n        self.inner.status()\n    }\n\n    /// Checks status and returns error if appropriate.\n    pub async fn check(self) -> Result<(), Error> {\n        if self.inner.status().is_client_error() || self.inner.status().is_server_error() {\n            return Err(self.api_error().await);\n        }\n        Ok(())\n    }\n\n    async fn extract_error_message(self) -> Option<String> {\n        let error_body_bytes = self.inner.bytes().await.ok()?;\n        let error_body_text = std::str::from_utf8(&error_body_bytes).ok()?;\n        if let Ok(error_payload) = serde_json::from_str::<ErrorResponsePayload>(error_body_text) {\n            Some(error_payload.message)\n        } else {\n            Some(error_body_text.to_string())\n        }\n    }\n\n    async fn api_error(self) -> Error {\n        let code = 
self.inner.status();\n        let error_message = self.extract_error_message().await;\n        Error::from(ApiError {\n            message: error_message,\n            code,\n        })\n    }\n\n    pub async fn deserialize<T: DeserializeOwned>(self) -> Result<T, Error> {\n        if self.inner.status().is_client_error() || self.inner.status().is_server_error() {\n            Err(self.api_error().await)\n        } else {\n            let object = self.inner.json::<T>().await?;\n            Ok(object)\n        }\n    }\n}\n\n/// A cousin of `quickwit_search::SearchResponseRest` that implements [`Deserialize`]\n///\n/// This version of the response is necessary because\n/// `serde_json_borrow::OwnedValue` is not deserializable.\n#[derive(Deserialize, Serialize, PartialEq, Debug)]\npub struct SearchResponseRestClient {\n    pub num_hits: u64,\n    pub hits: Vec<JsonValue>,\n    pub snippets: Option<Vec<JsonValue>>,\n    pub elapsed_time_micros: u64,\n    pub errors: Vec<String>,\n    pub aggregations: Option<JsonValue>,\n}\n\n#[derive(Clone)]\npub enum IngestSource {\n    Str(String),\n    File(PathBuf),\n    Stdin,\n}\n\n/// A structure that represents a timeout. 
Unlike Duration it can also represent an infinite or no\n/// timeout value.\n#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default, Debug)]\npub struct Timeout {\n    duration: Duration,\n}\n\nconst SECS_PER_MIN: u64 = 60;\nconst MINS_PER_HOUR: u64 = 60;\nconst HOURS_PER_DAY: u64 = 24;\n\nimpl Timeout {\n    /// Creates a new timeout from duration\n    pub const fn new(duration: Duration) -> Timeout {\n        Timeout { duration }\n    }\n\n    /// Creates a new timeout from seconds\n    pub const fn from_secs(secs: u64) -> Timeout {\n        Timeout {\n            duration: Duration::from_secs(secs),\n        }\n    }\n\n    /// Creates a new timeout from minutes\n    pub const fn from_mins(mins: u64) -> Timeout {\n        Self::from_secs(mins * SECS_PER_MIN)\n    }\n\n    /// Creates a new timeout from hours\n    pub const fn from_hours(hours: u64) -> Timeout {\n        Self::from_secs(hours * SECS_PER_MIN * MINS_PER_HOUR)\n    }\n\n    /// Creates a new timeout from days\n    pub const fn from_days(days: u64) -> Timeout {\n        Self::from_secs(days * SECS_PER_MIN * MINS_PER_HOUR * HOURS_PER_DAY)\n    }\n\n    /// Creates a new infinite timeout\n    pub const fn none() -> Timeout {\n        Timeout {\n            duration: Duration::MAX,\n        }\n    }\n\n    /// Converts timeout into Some(Duration) or None if it is infinite.\n    pub fn as_duration_opt(&self) -> Option<Duration> {\n        if self.duration != Duration::MAX {\n            Some(self.duration)\n        } else {\n            None\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-rest-client/src/rest_client.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::time::Duration;\n\nuse bytes::Bytes;\nuse quickwit_cluster::ClusterSnapshot;\nuse quickwit_config::{ConfigFormat, SourceConfig};\nuse quickwit_indexing::actors::IndexingServiceCounters;\npub use quickwit_ingest::CommitType;\nuse quickwit_metastore::{IndexMetadata, Split, SplitInfo};\nuse quickwit_proto::ingest::Shard;\nuse quickwit_serve::{\n    ListSplitsQueryParams, ListSplitsResponse, RestIngestResponse, SearchRequestQueryString,\n};\nuse reqwest::header::{CONTENT_TYPE, HeaderMap, HeaderValue};\nuse reqwest::tls::Certificate;\nuse reqwest::{ClientBuilder as ReqwestClientBuilder, Method, StatusCode, Url};\nuse reqwest_middleware::{ClientBuilder as ReqwestMiddlewareClientBuilder, ClientWithMiddleware};\nuse reqwest_retry::RetryTransientMiddleware;\nuse reqwest_retry::policies::ExponentialBackoff;\nuse serde::Serialize;\nuse serde_json::json;\n\nuse crate::BatchLineReader;\nuse crate::error::Error;\nuse crate::models::{ApiResponse, IngestSource, SearchResponseRestClient, Timeout};\n\npub const DEFAULT_BASE_URL: &str = \"http://127.0.0.1:7280\";\npub const DEFAULT_CONTENT_TYPE: &str = \"application/json\";\npub const INGEST_CONTENT_LENGTH_LIMIT: usize = 10 * 1024 * 1024; // 10MiB\npub const DEFAULT_CLIENT_CONNECT_TIMEOUT: Timeout = Timeout::from_secs(5);\npub const DEFAULT_CLIENT_TIMEOUT: Timeout = 
Timeout::from_secs(10);\npub const DEFAULT_CLIENT_SEARCH_TIMEOUT: Timeout = Timeout::from_mins(1);\npub const DEFAULT_CLIENT_INGEST_TIMEOUT: Timeout = Timeout::from_mins(1);\npub const DEFAULT_CLIENT_COMMIT_TIMEOUT: Timeout = Timeout::from_mins(30);\n\nstruct Transport {\n    base_url: Url,\n    api_url: Url,\n    client: ClientWithMiddleware,\n}\n\nimpl Transport {\n    fn new(\n        endpoint: Url,\n        connect_timeout: Timeout,\n        ca_cert: Option<Certificate>,\n        num_retries: u32,\n    ) -> Self {\n        let base_url = endpoint;\n        let api_url = base_url\n            .join(\"api/v1/\")\n            .expect(\"root url should be well-formed\");\n        let mut reqwest_client_builder = ReqwestClientBuilder::new();\n        if let Some(duration) = connect_timeout.as_duration_opt() {\n            reqwest_client_builder = reqwest_client_builder.connect_timeout(duration);\n        }\n        if let Some(ca_cert) = ca_cert {\n            reqwest_client_builder = reqwest_client_builder\n                .tls_built_in_root_certs(false)\n                .add_root_certificate(ca_cert);\n        }\n        let retry_policy = ExponentialBackoff::builder()\n            .retry_bounds(Duration::from_secs(1), Duration::from_secs(60))\n            .build_with_max_retries(num_retries);\n        let retry_transient_middleware = RetryTransientMiddleware::new_with_policy(retry_policy);\n        let reqwest_client = reqwest_client_builder\n            .build()\n            .expect(\"`client_builder.build()` should not fail\");\n        let client = ReqwestMiddlewareClientBuilder::new(reqwest_client)\n            .with(retry_transient_middleware)\n            .build();\n        Self {\n            base_url,\n            api_url,\n            client,\n        }\n    }\n\n    /// Creates an asynchronous request that can be awaited\n    async fn send<Q: Serialize + ?Sized>(\n        &self,\n        method: Method,\n        path: &str,\n        header_map: 
Option<HeaderMap>,\n        query_string: Option<&Q>,\n        body: Option<Bytes>,\n        timeout: Timeout,\n    ) -> Result<ApiResponse, Error> {\n        let url = if path.starts_with('/') {\n            self.base_url.join(path)\n        } else {\n            self.api_url.join(path)\n        }\n        .map_err(|error| Error::UrlParse(error.to_string()))?;\n        let mut request_builder = self.client.request(method, url);\n        if let Some(duration) = timeout.as_duration_opt() {\n            request_builder = request_builder.timeout(duration);\n        }\n        let mut request_headers = HeaderMap::new();\n        request_headers.insert(CONTENT_TYPE, HeaderValue::from_static(DEFAULT_CONTENT_TYPE));\n        if let Some(header_map_val) = header_map {\n            request_headers.extend(header_map_val.into_iter());\n        }\n        request_builder = request_builder.headers(request_headers);\n        if let Some(bytes) = body {\n            request_builder = request_builder.body(bytes);\n        };\n        if let Some(qs) = query_string {\n            request_builder = request_builder.query(qs);\n        }\n        let response = request_builder.send().await?;\n\n        Ok(ApiResponse::new(response))\n    }\n}\n\npub struct QuickwitClientBuilder {\n    /// Base url for the client\n    base_url: Url,\n    /// Connection timeout.\n    connect_timeout: Timeout,\n    /// Timeout for most operations except search and ingest.\n    timeout: Timeout,\n    /// Timeout for search operations.\n    search_timeout: Timeout,\n    /// Timeout for the ingest operations with auto commit.\n    ingest_timeout: Timeout,\n    /// Timeout for the ingest operations that require waiting for commit.\n    commit_timeout: Timeout,\n    /// Forces use of ingest v1.\n    use_legacy_ingest: bool,\n    /// Request detailed parse failures report from the ingest api.\n    detailed_response: bool,\n    /// Validate against a custom TLS certificate authority\n    ca_cert: 
Option<Certificate>,\n    /// Maximum number of retries for transient errors.\n    num_retries: u32,\n}\n\nimpl QuickwitClientBuilder {\n    pub fn new(endpoint: Url) -> Self {\n        QuickwitClientBuilder {\n            base_url: endpoint,\n            connect_timeout: DEFAULT_CLIENT_CONNECT_TIMEOUT,\n            timeout: DEFAULT_CLIENT_TIMEOUT,\n            search_timeout: DEFAULT_CLIENT_SEARCH_TIMEOUT,\n            ingest_timeout: DEFAULT_CLIENT_INGEST_TIMEOUT,\n            commit_timeout: DEFAULT_CLIENT_COMMIT_TIMEOUT,\n            use_legacy_ingest: false,\n            detailed_response: false,\n            ca_cert: None,\n            num_retries: 0,\n        }\n    }\n\n    pub fn connect_timeout(mut self, timeout: Timeout) -> Self {\n        self.connect_timeout = timeout;\n        self\n    }\n\n    pub fn timeout(mut self, timeout: Timeout) -> Self {\n        self.timeout = timeout;\n        self\n    }\n\n    pub fn search_timeout(mut self, timeout: Timeout) -> Self {\n        self.search_timeout = timeout;\n        self\n    }\n\n    pub fn ingest_timeout(mut self, timeout: Timeout) -> Self {\n        self.ingest_timeout = timeout;\n        self\n    }\n\n    // TODO(#5604)\n    pub fn use_legacy_ingest(mut self, use_legacy_ingest: bool) -> Self {\n        self.use_legacy_ingest = use_legacy_ingest;\n        self\n    }\n\n    pub fn detailed_response(mut self, is_detailed: bool) -> Self {\n        self.detailed_response = is_detailed;\n        self\n    }\n\n    pub fn commit_timeout(mut self, timeout: Timeout) -> Self {\n        self.commit_timeout = timeout;\n        self\n    }\n\n    pub fn set_tls_ca(mut self, ca_cert: Option<Certificate>) -> Self {\n        self.ca_cert = ca_cert;\n        self\n    }\n\n    pub fn num_retries(mut self, num_retries: u32) -> Self {\n        self.num_retries = num_retries;\n        self\n    }\n\n    pub fn build(self) -> QuickwitClient {\n        let transport = Transport::new(\n            self.base_url,\n       
     self.connect_timeout,\n            self.ca_cert,\n            self.num_retries,\n        );\n        QuickwitClient {\n            transport,\n            timeout: self.timeout,\n            search_timeout: self.search_timeout,\n            ingest_timeout: self.ingest_timeout,\n            commit_timeout: self.commit_timeout,\n            use_legacy_ingest: self.use_legacy_ingest,\n            detailed_response: self.detailed_response,\n        }\n    }\n}\n\n/// Root client for top level APIs.\npub struct QuickwitClient {\n    transport: Transport,\n    /// Timeout for all operations except search and ingest.\n    timeout: Timeout,\n    /// Timeout for search operations.\n    search_timeout: Timeout,\n    /// Timeout for the ingest operations.\n    ingest_timeout: Timeout,\n    /// Timeout for the ingest operations that require waiting for commit.\n    commit_timeout: Timeout,\n    /// Forces use of ingest v1.\n    use_legacy_ingest: bool,\n    /// Request detailed parse failures report from the ingest api.\n    detailed_response: bool,\n}\n\nimpl QuickwitClient {\n    pub async fn search(\n        &self,\n        index_id: &str,\n        search_query: SearchRequestQueryString,\n    ) -> Result<SearchResponseRestClient, Error> {\n        let path = format!(\"{index_id}/search\");\n        let bytes = serde_json::to_string(&search_query)\n            .unwrap()\n            .as_bytes()\n            .to_vec();\n        let body = Bytes::from(bytes);\n        let response = self\n            .transport\n            .send::<()>(\n                Method::POST,\n                &path,\n                None,\n                None,\n                Some(body),\n                self.search_timeout,\n            )\n            .await?;\n        let search_response = response.deserialize().await?;\n        Ok(search_response)\n    }\n\n    pub fn indexes(&self) -> IndexClient<'_> {\n        IndexClient::new(&self.transport, self.timeout)\n    }\n\n    pub fn 
splits<'a>(&'a self, index_id: &'a str) -> SplitClient<'a, 'a> {\n        SplitClient::new(&self.transport, self.timeout, index_id)\n    }\n\n    pub fn sources<'a>(&'a self, index_id: &'a str) -> SourceClient<'a> {\n        SourceClient::new(&self.transport, self.timeout, index_id)\n    }\n\n    pub fn cluster(&self) -> ClusterClient<'_> {\n        ClusterClient::new(&self.transport, self.timeout)\n    }\n\n    pub fn node_stats(&self) -> NodeStatsClient<'_> {\n        NodeStatsClient::new(&self.transport, self.timeout)\n    }\n\n    pub fn node_health(&self) -> NodeHealthClient<'_> {\n        NodeHealthClient::new(&self.transport, self.timeout)\n    }\n\n    pub async fn ingest(\n        &self,\n        index_id: &str,\n        ingest_source: IngestSource,\n        batch_size_limit_opt: Option<usize>,\n        mut on_ingest_event: Option<&mut (dyn FnMut(IngestEvent) + Sync)>,\n        last_block_commit: CommitType,\n    ) -> Result<RestIngestResponse, Error> {\n        let ingest_path = format!(\"{index_id}/ingest\");\n        let mut query_params = HashMap::new();\n        // TODO(#5604)\n        if self.use_legacy_ingest {\n            query_params.insert(\"use_legacy_ingest\", \"true\");\n        }\n        if self.detailed_response {\n            query_params.insert(\"detailed_response\", \"true\");\n        }\n        let batch_size_limit = batch_size_limit_opt.unwrap_or(INGEST_CONTENT_LENGTH_LIMIT);\n        let mut batch_reader = match ingest_source {\n            IngestSource::File(filepath) => {\n                BatchLineReader::from_file(&filepath, batch_size_limit).await?\n            }\n            IngestSource::Stdin => BatchLineReader::from_stdin(batch_size_limit),\n            IngestSource::Str(ingest_payload) => {\n                BatchLineReader::from_string(ingest_payload, batch_size_limit)\n            }\n        };\n        let mut cumulated_resp = RestIngestResponse::default();\n        while let Some(batch) = batch_reader.next_batch().await? 
{\n            loop {\n                let timeout = if !batch_reader.has_next() && last_block_commit != CommitType::Auto {\n                    self.commit_timeout\n                } else {\n                    self.ingest_timeout\n                };\n                match last_block_commit {\n                    CommitType::Auto => {}\n                    CommitType::WaitFor => {\n                        query_params.insert(\"commit\", \"wait_for\");\n                    }\n                    CommitType::Force => {\n                        query_params.insert(\"commit\", \"force\");\n                    }\n                }\n                let response = self\n                    .transport\n                    .send(\n                        Method::POST,\n                        &ingest_path,\n                        None,\n                        Some(&query_params),\n                        Some(batch.clone()),\n                        timeout,\n                    )\n                    .await?;\n                if response.status_code() == StatusCode::TOO_MANY_REQUESTS {\n                    if let Some(event_fn) = &mut on_ingest_event {\n                        event_fn(IngestEvent::Sleep)\n                    }\n                    tokio::time::sleep(Duration::from_millis(500)).await;\n                } else {\n                    let current_parsed_resp = response.deserialize().await?;\n                    cumulated_resp = cumulated_resp.merge(current_parsed_resp);\n                    break;\n                }\n            }\n            if let Some(event_fn) = &mut on_ingest_event {\n                event_fn(IngestEvent::IngestedDocBatch(batch.len()))\n            }\n        }\n\n        Ok(cumulated_resp)\n    }\n}\n\npub enum IngestEvent {\n    IngestedDocBatch(usize),\n    Sleep,\n}\n\n/// Client for indexes APIs.\npub struct IndexClient<'a> {\n    transport: &'a Transport,\n    timeout: Timeout,\n}\n\nimpl<'a> IndexClient<'a> {\n    fn 
new(transport: &'a Transport, timeout: Timeout) -> Self {\n        Self { transport, timeout }\n    }\n\n    pub async fn create(\n        &self,\n        index_config: impl AsRef<[u8]>,\n        config_format: ConfigFormat,\n        overwrite: bool,\n    ) -> Result<IndexMetadata, Error> {\n        let header_map = header_from_config_format(config_format);\n        let body = Bytes::copy_from_slice(index_config.as_ref());\n        let response = self\n            .transport\n            .send(\n                Method::POST,\n                \"indexes\",\n                Some(header_map),\n                Some(&[(\"overwrite\", overwrite)]),\n                Some(body),\n                self.timeout,\n            )\n            .await?;\n        let index_metadata = response.deserialize().await?;\n        Ok(index_metadata)\n    }\n\n    pub async fn update(\n        &self,\n        index_id: &str,\n        index_config: impl AsRef<[u8]>,\n        config_format: ConfigFormat,\n        create: bool,\n    ) -> Result<IndexMetadata, Error> {\n        let header_map = header_from_config_format(config_format);\n        let body = Bytes::copy_from_slice(index_config.as_ref());\n        let mut query_params = HashMap::new();\n        if create {\n            query_params.insert(\"create\", \"true\");\n        }\n        let path = format!(\"indexes/{index_id}\");\n        let response = self\n            .transport\n            .send(\n                Method::PUT,\n                &path,\n                Some(header_map),\n                Some(&query_params),\n                Some(body),\n                self.timeout,\n            )\n            .await?;\n        let index_metadata = response.deserialize().await?;\n        Ok(index_metadata)\n    }\n\n    pub async fn list(&self) -> Result<Vec<IndexMetadata>, Error> {\n        let response = self\n            .transport\n            .send::<()>(Method::GET, \"indexes\", None, None, None, self.timeout)\n            
.await?;\n        let indexes_metadatas = response.deserialize().await?;\n        Ok(indexes_metadatas)\n    }\n\n    pub async fn get(&self, index_id: &str) -> Result<IndexMetadata, Error> {\n        let path = format!(\"indexes/{index_id}\");\n        let response = self\n            .transport\n            .send::<()>(Method::GET, &path, None, None, None, self.timeout)\n            .await?;\n        let index_metadata = response.deserialize().await?;\n        Ok(index_metadata)\n    }\n\n    pub async fn clear(&self, index_id: &str) -> Result<(), Error> {\n        let path = format!(\"indexes/{index_id}/clear\");\n        let response = self\n            .transport\n            .send::<()>(Method::PUT, &path, None, None, None, self.timeout)\n            .await?;\n        response.check().await?;\n        Ok(())\n    }\n\n    pub async fn delete(&self, index_id: &str, dry_run: bool) -> Result<Vec<SplitInfo>, Error> {\n        let path = format!(\"indexes/{index_id}\");\n        let response = self\n            .transport\n            .send(\n                Method::DELETE,\n                &path,\n                None,\n                Some(&[(\"dry_run\", dry_run)]),\n                None,\n                self.timeout,\n            )\n            .await?;\n        let file_entries = response.deserialize().await?;\n        Ok(file_entries)\n    }\n}\n\n/// Client for splits APIs.\npub struct SplitClient<'a, 'b> {\n    transport: &'a Transport,\n    timeout: Timeout,\n    index_id: &'b str,\n}\n\nimpl<'a, 'b> SplitClient<'a, 'b> {\n    fn new(transport: &'a Transport, timeout: Timeout, index_id: &'b str) -> Self {\n        Self {\n            transport,\n            timeout,\n            index_id,\n        }\n    }\n\n    fn splits_root_url(&self) -> String {\n        format!(\"indexes/{}/splits\", self.index_id)\n    }\n\n    pub async fn list(\n        &self,\n        list_splits_query_params: ListSplitsQueryParams,\n    ) -> Result<Vec<Split>, Error> {\n       
 let path = self.splits_root_url();\n        let response = self\n            .transport\n            .send(\n                Method::GET,\n                &path,\n                None,\n                Some(&list_splits_query_params),\n                None,\n                self.timeout,\n            )\n            .await?;\n        let list_splits_response: ListSplitsResponse = response.deserialize().await?;\n        Ok(list_splits_response.splits)\n    }\n\n    pub async fn mark_for_deletion(&self, split_ids: Vec<String>) -> Result<(), Error> {\n        let path = format!(\"{}/mark-for-deletion\", self.splits_root_url());\n        let body_json = json!({ \"split_ids\": split_ids });\n        let body_vec =\n            serde_json::to_vec(&body_json).expect(\"serializing `body_json` should never fail\");\n        let body_bytes = Bytes::from(body_vec);\n        let response = self\n            .transport\n            .send::<()>(\n                Method::PUT,\n                &path,\n                None,\n                None,\n                Some(body_bytes),\n                self.timeout,\n            )\n            .await?;\n        response.check().await?;\n        Ok(())\n    }\n}\n\n/// Client for source APIs.\npub struct SourceClient<'a> {\n    transport: &'a Transport,\n    timeout: Timeout,\n    index_id: &'a str,\n}\n\nimpl<'a> SourceClient<'a> {\n    fn new(transport: &'a Transport, timeout: Timeout, index_id: &'a str) -> Self {\n        Self {\n            transport,\n            timeout,\n            index_id,\n        }\n    }\n\n    fn sources_root_url(&self) -> String {\n        format!(\"indexes/{}/sources\", self.index_id)\n    }\n\n    pub async fn create(\n        &self,\n        source_config_input: impl AsRef<[u8]>,\n        config_format: ConfigFormat,\n    ) -> Result<SourceConfig, Error> {\n        let header_map = header_from_config_format(config_format);\n        let source_config_bytes = 
Bytes::copy_from_slice(source_config_input.as_ref());\n        let response = self\n            .transport\n            .send::<()>(\n                Method::POST,\n                &self.sources_root_url(),\n                Some(header_map),\n                None,\n                Some(source_config_bytes),\n                self.timeout,\n            )\n            .await?;\n        let source_config = response.deserialize().await?;\n        Ok(source_config)\n    }\n\n    pub async fn update(\n        &self,\n        source_id: &str,\n        source_config_input: impl AsRef<[u8]>,\n        config_format: ConfigFormat,\n        create: bool,\n    ) -> Result<SourceConfig, Error> {\n        let header_map = header_from_config_format(config_format);\n        let source_config_bytes = Bytes::copy_from_slice(source_config_input.as_ref());\n        let mut query_params = HashMap::new();\n        if create {\n            query_params.insert(\"create\", \"true\");\n        }\n        let path = format!(\"{}/{source_id}\", self.sources_root_url());\n        let response = self\n            .transport\n            .send(\n                Method::PUT,\n                &path,\n                Some(header_map),\n                Some(&query_params),\n                Some(source_config_bytes),\n                self.timeout,\n            )\n            .await?;\n        let source_config = response.deserialize().await?;\n        Ok(source_config)\n    }\n\n    pub async fn get(&self, source_id: &str) -> Result<SourceConfig, Error> {\n        let path = format!(\"{}/{source_id}\", self.sources_root_url());\n        let response = self\n            .transport\n            .send::<()>(Method::GET, &path, None, None, None, self.timeout)\n            .await?;\n        let source_config = response.deserialize().await?;\n        Ok(source_config)\n    }\n\n    pub async fn toggle(&self, source_id: &str, enable: bool) -> Result<(), Error> {\n        let json_value = json!({ \"enable\": 
enable });\n        let json_bytes = serde_json::to_vec(&json_value).expect(\"Serialization should never fail.\");\n        let path = format!(\"{}/{source_id}/toggle\", self.sources_root_url());\n        let response = self\n            .transport\n            .send::<()>(\n                Method::PUT,\n                &path,\n                None,\n                None,\n                Some(Bytes::from(json_bytes)),\n                self.timeout,\n            )\n            .await?;\n        response.check().await?;\n        Ok(())\n    }\n\n    pub async fn reset_checkpoint(&self, source_id: &str) -> Result<(), Error> {\n        let path = format!(\"{}/{source_id}/reset-checkpoint\", self.sources_root_url());\n        let response = self\n            .transport\n            .send::<()>(Method::PUT, &path, None, None, None, self.timeout)\n            .await?;\n        response.check().await?;\n        Ok(())\n    }\n\n    pub async fn list(&self) -> Result<Vec<SourceConfig>, Error> {\n        let response = self\n            .transport\n            .send::<()>(\n                Method::GET,\n                &self.sources_root_url(),\n                None,\n                None,\n                None,\n                self.timeout,\n            )\n            .await?;\n        let source_configs = response.deserialize().await?;\n        Ok(source_configs)\n    }\n\n    pub async fn delete(&self, source_id: &str) -> Result<(), Error> {\n        let path = format!(\"{}/{source_id}\", self.sources_root_url());\n        let response = self\n            .transport\n            .send::<()>(Method::DELETE, &path, None, None, None, self.timeout)\n            .await?;\n        response.check().await?;\n        Ok(())\n    }\n\n    pub async fn get_shards(&self, source_id: &str) -> Result<Vec<Shard>, Error> {\n        let path = format!(\"{}/{source_id}/shards\", self.sources_root_url());\n        let response = self\n            .transport\n            
.send::<()>(Method::GET, &path, None, None, None, self.timeout)\n            .await?;\n        let shards = response.deserialize().await?;\n        Ok(shards)\n    }\n}\n\n/// Client for Cluster APIs.\npub struct ClusterClient<'a> {\n    transport: &'a Transport,\n    timeout: Timeout,\n}\n\nimpl<'a> ClusterClient<'a> {\n    fn new(transport: &'a Transport, timeout: Timeout) -> Self {\n        Self { transport, timeout }\n    }\n\n    pub async fn snapshot(&self) -> Result<ClusterSnapshot, Error> {\n        let response = self\n            .transport\n            .send::<()>(Method::GET, \"cluster\", None, None, None, self.timeout)\n            .await?;\n        let cluster_snapshot = response.deserialize().await?;\n        Ok(cluster_snapshot)\n    }\n}\n\n/// Client for Node-level Stats APIs.\npub struct NodeStatsClient<'a> {\n    transport: &'a Transport,\n    timeout: Timeout,\n}\n\nimpl<'a> NodeStatsClient<'a> {\n    fn new(transport: &'a Transport, timeout: Timeout) -> Self {\n        Self { transport, timeout }\n    }\n\n    pub async fn indexing(&self) -> Result<IndexingServiceCounters, Error> {\n        let response = self\n            .transport\n            .send::<()>(Method::GET, \"indexing\", None, None, None, self.timeout)\n            .await?;\n        let indexing_stats = response.deserialize().await?;\n        Ok(indexing_stats)\n    }\n}\n\n/// Client for Node-level Health APIs.\npub struct NodeHealthClient<'a> {\n    transport: &'a Transport,\n    timeout: Timeout,\n}\n\nimpl<'a> NodeHealthClient<'a> {\n    fn new(transport: &'a Transport, timeout: Timeout) -> Self {\n        Self { transport, timeout }\n    }\n\n    /// Returns true if the node is healthy, returns false or an error otherwise.\n    pub async fn is_live(&self) -> Result<bool, Error> {\n        let response = self\n            .transport\n            .send::<()>(Method::GET, \"/health/livez\", None, None, None, self.timeout)\n            .await?;\n        let result: 
bool = response.deserialize().await?;\n        Ok(result)\n    }\n\n    /// Returns true if the node is ready, returns false or an error otherwise.\n    pub async fn is_ready(&self) -> Result<bool, Error> {\n        let response = self\n            .transport\n            .send::<()>(\n                Method::GET,\n                \"/health/readyz\",\n                None,\n                None,\n                None,\n                self.timeout,\n            )\n            .await?;\n        let result: bool = response.deserialize().await?;\n        Ok(result)\n    }\n}\n\nfn header_from_config_format(config_format: ConfigFormat) -> HeaderMap {\n    let mut header_map = HeaderMap::new();\n    let content_type_value = format!(\"application/{}\", config_format.as_str());\n    header_map.insert(\n        CONTENT_TYPE,\n        HeaderValue::from_str(&content_type_value).expect(\"Content type should always be valid.\"),\n    );\n    header_map\n}\n\n#[cfg(test)]\nmod test {\n\n    use std::path::PathBuf;\n    use std::str::FromStr;\n\n    use http::StatusCode;\n    use quickwit_config::{ConfigFormat, SourceConfig};\n    use quickwit_indexing::mock_split;\n    use quickwit_ingest::CommitType;\n    use quickwit_metastore::IndexMetadata;\n    use quickwit_serve::{\n        ListSplitsQueryParams, ListSplitsResponse, RestIngestResponse, SearchRequestQueryString,\n    };\n    use reqwest::Url;\n    use reqwest::header::CONTENT_TYPE;\n    use serde_json::json;\n    use tokio::fs::File;\n    use tokio::io::AsyncReadExt;\n    use wiremock::matchers::{\n        body_bytes, body_json, header, method, path, query_param, query_param_is_missing,\n    };\n    use wiremock::{Mock, MockServer, ResponseTemplate};\n\n    use crate::error::Error;\n    use crate::models::{IngestSource, SearchResponseRestClient};\n    use crate::rest_client::QuickwitClientBuilder;\n    #[tokio::test]\n    async fn test_client_no_server() {\n        let port = 
quickwit_common::net::find_available_tcp_port().unwrap();\n        let server_url = Url::parse(&format!(\"http://127.0.0.1:{port}\")).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let error = qw_client.indexes().list().await.unwrap_err();\n        assert!(matches!(error, Error::Middleware(_)));\n        assert!(error.to_string().contains(\"tcp connect error\"));\n    }\n\n    #[tokio::test]\n    async fn test_search_endpoint() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        // Search\n        let search_query_params = SearchRequestQueryString {\n            ..Default::default()\n        };\n        let expected_search_response = SearchResponseRestClient {\n            num_hits: 0,\n            hits: Vec::new(),\n            snippets: None,\n            aggregations: None,\n            elapsed_time_micros: 100,\n            errors: Vec::new(),\n        };\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/my-index/search\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(\n                json!({\"num_hits\": 0, \"hits\": [], \"elapsed_time_micros\": 100, \"errors\": []}),\n            ))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n            qw_client\n                .search(\"my-index\", search_query_params)\n                .await\n                .unwrap(),\n            expected_search_response\n        );\n    }\n\n    fn get_ndjson_filepath(ndjson_dataset_filename: &str) -> String {\n        format!(\n            \"{}/resources/tests/{}\",\n            env!(\"CARGO_MANIFEST_DIR\"),\n            ndjson_dataset_filename\n        )\n    }\n\n    #[tokio::test]\n    async fn test_ingest_endpoint() {\n        let mock_server = 
MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let ndjson_filepath = get_ndjson_filepath(\"documents_to_ingest.json\");\n        let mut buffer = Vec::new();\n        File::open(&ndjson_filepath)\n            .await\n            .unwrap()\n            .read_to_end(&mut buffer)\n            .await\n            .unwrap();\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/my-index/ingest\"))\n            .and(query_param_is_missing(\"commit\"))\n            .and(body_bytes(buffer.clone()))\n            .respond_with(ResponseTemplate::new(StatusCode::TOO_MANY_REQUESTS))\n            .up_to_n_times(2)\n            .expect(2)\n            .mount(&mock_server)\n            .await;\n        let mock_response = RestIngestResponse {\n            num_docs_for_processing: 2,\n            num_ingested_docs: Some(2),\n            num_rejected_docs: Some(0),\n            parse_failures: Some(Vec::new()),\n        };\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/my-index/ingest\"))\n            .and(query_param_is_missing(\"commit\"))\n            .and(body_bytes(buffer))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(&mock_response))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        let ingest_source = IngestSource::File(PathBuf::from_str(&ndjson_filepath).unwrap());\n        let actual_response = qw_client\n            .ingest(\"my-index\", ingest_source, None, None, CommitType::Auto)\n            .await\n            .unwrap();\n        assert_eq!(actual_response, mock_response);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_endpoint_with_force_commit() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = 
QuickwitClientBuilder::new(server_url).build();\n        let ndjson_filepath = get_ndjson_filepath(\"documents_to_ingest.json\");\n        let mut buffer = Vec::new();\n        File::open(&ndjson_filepath)\n            .await\n            .unwrap()\n            .read_to_end(&mut buffer)\n            .await\n            .unwrap();\n        let mock_response = RestIngestResponse {\n            num_docs_for_processing: 2,\n            num_ingested_docs: Some(2),\n            num_rejected_docs: Some(0),\n            parse_failures: Some(Vec::new()),\n        };\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/my-index/ingest\"))\n            .and(query_param(\"commit\", \"force\"))\n            .and(body_bytes(buffer))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(&mock_response))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        let ingest_source = IngestSource::File(PathBuf::from_str(&ndjson_filepath).unwrap());\n        let actual_response = qw_client\n            .ingest(\"my-index\", ingest_source, None, None, CommitType::Force)\n            .await\n            .unwrap();\n        assert_eq!(actual_response, mock_response);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_endpoint_with_wait_for_commit() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let ndjson_filepath = get_ndjson_filepath(\"documents_to_ingest.json\");\n        let mut buffer = Vec::new();\n        File::open(&ndjson_filepath)\n            .await\n            .unwrap()\n            .read_to_end(&mut buffer)\n            .await\n            .unwrap();\n        let mock_response = RestIngestResponse {\n            num_docs_for_processing: 2,\n            num_ingested_docs: Some(2),\n            num_rejected_docs: Some(0),\n            
parse_failures: Some(Vec::new()),\n        };\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/my-index/ingest\"))\n            .and(query_param(\"commit\", \"wait_for\"))\n            .and(body_bytes(buffer))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(&mock_response))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        let ingest_source = IngestSource::File(PathBuf::from_str(&ndjson_filepath).unwrap());\n        let actual_response = qw_client\n            .ingest(\"my-index\", ingest_source, None, None, CommitType::WaitFor)\n            .await\n            .unwrap();\n        assert_eq!(actual_response, mock_response);\n    }\n\n    #[tokio::test]\n    async fn test_ingest_endpoint_should_return_api_error() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let ndjson_filepath = get_ndjson_filepath(\"documents_to_ingest.json\");\n        let mut buffer = Vec::new();\n        File::open(&ndjson_filepath)\n            .await\n            .unwrap()\n            .read_to_end(&mut buffer)\n            .await\n            .unwrap();\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/my-index/ingest\"))\n            .and(body_bytes(buffer.clone()))\n            .respond_with(\n                ResponseTemplate::new(405).set_body_json(json!({\"message\": \"internal error\"})),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        let ingest_source = IngestSource::File(PathBuf::from_str(&ndjson_filepath).unwrap());\n        let error = qw_client\n            .ingest(\n                \"my-index\",\n                ingest_source,\n                Some(4096),\n                None,\n                CommitType::Auto,\n            )\n            
.await\n            .unwrap_err();\n        assert!(matches!(error, Error::Api(_)));\n        assert!(error.to_string().contains(\"internal error\"));\n    }\n\n    #[tokio::test]\n    async fn test_indexes_endpoints() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///indexes/test-index\");\n        // GET indexes\n        Mock::given(method(\"GET\"))\n            .and(path(\"/api/v1/indexes\"))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK).set_body_json(vec![index_metadata.clone()]),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n            qw_client.indexes().list().await.unwrap(),\n            vec![index_metadata.clone()]\n        );\n\n        // POST create index\n        let index_config_to_create = index_metadata.index_config.clone();\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/indexes\"))\n            .and(body_json(index_config_to_create.clone()))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK).set_body_json(index_metadata.clone()),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        let post_body = serde_json::to_string(&index_config_to_create).unwrap();\n        assert_eq!(\n            qw_client\n                .indexes()\n                .create(post_body, ConfigFormat::Json, false)\n                .await\n                .unwrap(),\n            index_metadata\n        );\n\n        // POST create index with yaml\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/indexes\"))\n            .and(header(CONTENT_TYPE.as_str(), \"application/yaml\"))\n            .respond_with(\n      
          ResponseTemplate::new(StatusCode::OK).set_body_json(index_metadata.clone()),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n            qw_client\n                .indexes()\n                .create(\"\", ConfigFormat::Yaml, false)\n                .await\n                .unwrap(),\n            index_metadata\n        );\n\n        // PUT clear index\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/clear\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client.indexes().clear(\"my-index\").await.unwrap();\n\n        // PUT clear index returns an error\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/clear\"))\n            .respond_with(ResponseTemplate::new(StatusCode::BAD_REQUEST))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client.indexes().clear(\"my-index\").await.unwrap_err();\n\n        // DELETE index\n        Mock::given(method(\"DELETE\"))\n            .and(path(\"/api/v1/indexes/my-index\"))\n            .and(query_param(\"dry_run\", \"true\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(json!([{\n                \"split_id\": \"my-split\",\n                \"num_docs\": 1,\n                \"uncompressed_docs_size_bytes\": 1024,\n                \"file_name\": \"my-split.split\",\n                \"file_size_bytes\": 128,\n            }])))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client.indexes().delete(\"my-index\", true).await.unwrap();\n\n        // DELETE index returns an error\n        Mock::given(method(\"DELETE\"))\n            .and(path(\"/api/v1/indexes/my-index\"))\n            
.respond_with(ResponseTemplate::new(StatusCode::UNSUPPORTED_MEDIA_TYPE))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .indexes()\n            .delete(\"my-index\", true)\n            .await\n            .unwrap_err();\n    }\n\n    #[tokio::test]\n    async fn test_splits_endpoints() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let split = mock_split(\"split-1\");\n        // GET splits\n        let list_splits_params = ListSplitsQueryParams {\n            start_timestamp: Some(1),\n            ..Default::default()\n        };\n        let response = ListSplitsResponse {\n            offset: 0,\n            size: 1,\n            splits: vec![split.clone()],\n        };\n        Mock::given(method(\"GET\"))\n            .and(path(\"/api/v1/indexes/my-index/splits\"))\n            .and(query_param(\"start_timestamp\", \"1\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(response))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n            qw_client\n                .splits(\"my-index\")\n                .list(list_splits_params)\n                .await\n                .unwrap(),\n            vec![split.clone()]\n        );\n\n        // Mark for deletion\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/splits/mark-for-deletion\"))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK)\n                    .set_body_json(json!({\"split_ids\": [\"split-1\"]})),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .splits(\"my-index\")\n            .mark_for_deletion(vec![\"split-1\".to_string()])\n     
       .await\n            .unwrap();\n\n        // Mark for deletion returns an error\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/splits/mark-for-deletion\"))\n            .respond_with(ResponseTemplate::new(StatusCode::METHOD_NOT_ALLOWED))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .splits(\"my-index\")\n            .mark_for_deletion(vec![\"split-1\".to_string()])\n            .await\n            .unwrap_err();\n    }\n\n    #[tokio::test]\n    async fn test_sources_endpoints() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n        let source_config = SourceConfig::ingest_api_default();\n        // POST create source with toml\n        Mock::given(method(\"POST\"))\n            .and(path(\"/api/v1/indexes/my-index/sources\"))\n            .and(header(CONTENT_TYPE.as_str(), \"application/toml\"))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK).set_body_json(source_config.clone()),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n            qw_client\n                .sources(\"my-index\")\n                .create(\"\", ConfigFormat::Toml)\n                .await\n                .unwrap(),\n            source_config\n        );\n\n        // PUT update source with yaml\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/sources/my-source-1\"))\n            .and(header(CONTENT_TYPE.as_str(), \"application/yaml\"))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK).set_body_json(source_config.clone()),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n    
        qw_client\n                .sources(\"my-index\")\n                .update(\"my-source-1\", \"\", ConfigFormat::Yaml, false)\n                .await\n                .unwrap(),\n            source_config\n        );\n\n        // GET sources\n        Mock::given(method(\"GET\"))\n            .and(path(\"/api/v1/indexes/my-index/sources\"))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK).set_body_json(vec![source_config.clone()]),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        assert_eq!(\n            qw_client.sources(\"my-index\").list().await.unwrap(),\n            vec![source_config.clone()]\n        );\n\n        // Toggle source\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/sources/my-source-1/toggle\"))\n            .respond_with(\n                ResponseTemplate::new(StatusCode::OK).set_body_json(json!({\"enable\": true})),\n            )\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .sources(\"my-index\")\n            .toggle(\"my-source-1\", true)\n            .await\n            .unwrap();\n\n        // Toggle source returns an error\n        Mock::given(method(\"PUT\"))\n            .and(path(\"/api/v1/indexes/my-index/sources/my-source-2/toggle\"))\n            .respond_with(ResponseTemplate::new(StatusCode::BAD_REQUEST))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .sources(\"my-index\")\n            .toggle(\"my-source-2\", true)\n            .await\n            .unwrap_err();\n\n        // PUT reset checkpoint\n        Mock::given(method(\"PUT\"))\n            .and(path(\n                \"/api/v1/indexes/my-index/sources/my-source/reset-checkpoint\",\n            ))\n            .respond_with(ResponseTemplate::new(StatusCode::OK))\n            
.up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .sources(\"my-index\")\n            .reset_checkpoint(\"my-source\")\n            .await\n            .unwrap();\n\n        // PUT reset checkpoint returns an error\n        Mock::given(method(\"PUT\"))\n            .and(path(\n                \"/api/v1/indexes/my-index/sources/my-source/reset-checkpoint\",\n            ))\n            .respond_with(ResponseTemplate::new(StatusCode::BAD_GATEWAY))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .sources(\"my-index\")\n            .reset_checkpoint(\"my-source\")\n            .await\n            .unwrap_err();\n\n        // DELETE source\n        Mock::given(method(\"DELETE\"))\n            .and(path(\"/api/v1/indexes/my-index/sources/my-source\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .sources(\"my-index\")\n            .delete(\"my-source\")\n            .await\n            .unwrap();\n\n        // DELETE source returns an error\n        Mock::given(method(\"DELETE\"))\n            .and(path(\"/api/v1/indexes/my-index/sources/my-source\"))\n            .respond_with(ResponseTemplate::new(StatusCode::BAD_GATEWAY))\n            .up_to_n_times(1)\n            .mount(&mock_server)\n            .await;\n        qw_client\n            .sources(\"my-index\")\n            .delete(\"my-source\")\n            .await\n            .unwrap_err();\n    }\n\n    #[tokio::test]\n    async fn test_health_endpoints() {\n        let mock_server = MockServer::start().await;\n        let server_url = Url::parse(&mock_server.uri()).unwrap();\n        let qw_client = QuickwitClientBuilder::new(server_url).build();\n\n        assert!(qw_client.node_health().is_live().await.is_err());\n        
assert!(qw_client.node_health().is_ready().await.is_err());\n\n        // GET /health/livez\n        Mock::given(method(\"GET\"))\n            .and(path(\"/health/livez\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(true))\n            .expect(1)\n            .mount(&mock_server)\n            .await;\n        assert!(qw_client.node_health().is_live().await.unwrap());\n\n        // GET /health/readyz\n        Mock::given(method(\"GET\"))\n            .and(path(\"/health/readyz\"))\n            .respond_with(ResponseTemplate::new(StatusCode::OK).set_body_json(true))\n            .expect(1)\n            .mount(&mock_server)\n            .await;\n        assert!(qw_client.node_health().is_ready().await.unwrap());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/Cargo.toml",
    "content": "[package]\nname = \"quickwit-search\"\ndescription = \"Distributed search\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbase64 = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nfnv = { workspace = true }\nfutures = { workspace = true }\nhttp = { workspace = true }\nitertools = { workspace = true }\nmockall = { workspace = true }\nonce_cell = { workspace = true }\npin-project = { workspace = true }\npostcard = { workspace = true }\nprost = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntantivy = { workspace = true }\ntantivy-fst = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntower = { workspace = true, features = [\"timeout\"] }\ntracing = { workspace = true }\nttl_cache = { workspace = true }\nulid = { workspace = true }\nutoipa = { workspace = true }\n\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-directories = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-query = { workspace = true }\nquickwit-storage = { workspace = true }\n\n[dev-dependencies]\nassert-json-diff = { workspace = true }\nproptest = { workspace = true }\nrand = { workspace = true }\nserde_json = { workspace = true }\n\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[features]\ntestsuite = []\nci-test = []\n"
  },
  {
    "path": "quickwit/quickwit-search/README.md",
    "content": "# Quickwit-search\n\nThis project implements Quickwit's search API.\n\n# Architecture\n\nQuickwit relies on a pool of stateless search servers.\nAll search servers are identical and are meant to be queried through a simple load balancer.\n\nThe server that receives a query acts as the *root* server for the duration of that query.\n\nThe *root* server coordinates the work of the *leaf* servers:\n- it interprets the user query\n- it queries the metastore to identify the list of relevant index splits\n- it dispatches the work to the leaf servers\n- it gathers and merges the leaf results.\n\nThe *leaf* servers are in charge of performing the actual search task on their\nassigned subset of index splits.\n\nA search request on one split typically works in phases:\n- downloading the hotcache and opening the directory\n- downloading all of the data required for the query phase on the split\n- performing the query_search_phase\n- if required, performing the fetch_docs_phase.\n"
  },
  {
    "path": "quickwit/quickwit-search/src/client.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::net::SocketAddr;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\nuse http::Uri;\nuse quickwit_proto::search::{GetKvRequest, PutKvRequest, ReportSplitsRequest};\nuse quickwit_proto::tonic::Request;\nuse quickwit_proto::tonic::codegen::InterceptedService;\nuse quickwit_proto::tonic::transport::{Channel, Endpoint};\nuse quickwit_proto::{SpanContextInterceptor, tonic};\nuse tower::timeout::Timeout;\nuse tracing::warn;\n\nuse crate::SearchService;\nuse crate::error::parse_grpc_error;\n\n/// An enumeration of the client implementations for Quickwit's search service: a local,\n/// in-process service or a remote gRPC client.\n#[derive(Clone)]\nenum SearchServiceClientImpl {\n    Local(Arc<dyn SearchService>),\n    Grpc(\n        quickwit_proto::search::search_service_client::SearchServiceClient<\n            InterceptedService<Timeout<Channel>, SpanContextInterceptor>,\n        >,\n    ),\n}\n\n/// A search service client.\n/// It contains the client implementation and the gRPC address of the node to which the client\n/// connects.\n#[derive(Clone)]\npub struct SearchServiceClient {\n    client_impl: SearchServiceClientImpl,\n    grpc_addr: SocketAddr,\n}\n\nimpl fmt::Debug for SearchServiceClient {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        match &self.client_impl {\n            
SearchServiceClientImpl::Local(_service) => {\n                write!(formatter, \"Local({:?})\", self.grpc_addr)\n            }\n            SearchServiceClientImpl::Grpc(_grpc_client) => {\n                write!(formatter, \"Grpc({:?})\", self.grpc_addr)\n            }\n        }\n    }\n}\n\nimpl SearchServiceClient {\n    /// Create a search service client instance given a gRPC client and gRPC address.\n    pub fn from_grpc_client(\n        client: quickwit_proto::search::search_service_client::SearchServiceClient<\n            InterceptedService<Timeout<Channel>, SpanContextInterceptor>,\n        >,\n        grpc_addr: SocketAddr,\n    ) -> Self {\n        SearchServiceClient {\n            client_impl: SearchServiceClientImpl::Grpc(client),\n            grpc_addr,\n        }\n    }\n\n    /// Create a search service client instance given a search service and gRPC address.\n    pub fn from_service(service: Arc<dyn SearchService>, grpc_addr: SocketAddr) -> Self {\n        SearchServiceClient {\n            client_impl: SearchServiceClientImpl::Local(service),\n            grpc_addr,\n        }\n    }\n\n    /// Return the grpc_addr the underlying client connects to.\n    pub fn grpc_addr(&self) -> SocketAddr {\n        self.grpc_addr\n    }\n\n    /// Returns whether the underlying client is local or remote.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn is_local(&self) -> bool {\n        matches!(self.client_impl, SearchServiceClientImpl::Local(_))\n    }\n\n    /// Perform root search.\n    pub async fn root_search(\n        &mut self,\n        request: quickwit_proto::search::SearchRequest,\n    ) -> crate::Result<quickwit_proto::search::SearchResponse> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Grpc(grpc_client) => grpc_client\n                .root_search(request)\n                .await\n                .map(|tonic_response| tonic_response.into_inner())\n                .map_err(|tonic_error| 
parse_grpc_error(&tonic_error)),\n            SearchServiceClientImpl::Local(service) => service.root_search(request).await,\n        }\n    }\n\n    /// Perform leaf search.\n    pub async fn leaf_search(\n        &mut self,\n        request: quickwit_proto::search::LeafSearchRequest,\n    ) -> crate::Result<quickwit_proto::search::LeafSearchResponse> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Grpc(grpc_client) => grpc_client\n                .leaf_search(request)\n                .await\n                .map(|tonic_response| tonic_response.into_inner())\n                .map_err(|tonic_error| parse_grpc_error(&tonic_error)),\n            SearchServiceClientImpl::Local(service) => service.leaf_search(request).await,\n        }\n    }\n\n    /// Perform leaf list fields.\n    pub async fn leaf_list_fields(\n        &mut self,\n        request: quickwit_proto::search::LeafListFieldsRequest,\n    ) -> crate::Result<quickwit_proto::search::ListFieldsResponse> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Grpc(grpc_client) => {\n                let tonic_request = Request::new(request);\n                let tonic_response = grpc_client\n                    .leaf_list_fields(tonic_request)\n                    .await\n                    .map_err(|tonic_error| parse_grpc_error(&tonic_error))?;\n                Ok(tonic_response.into_inner())\n            }\n            SearchServiceClientImpl::Local(service) => service.leaf_list_fields(request).await,\n        }\n    }\n\n    /// Perform fetch docs.\n    pub async fn fetch_docs(\n        &mut self,\n        request: quickwit_proto::search::FetchDocsRequest,\n    ) -> crate::Result<quickwit_proto::search::FetchDocsResponse> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Grpc(grpc_client) => {\n                let tonic_request = Request::new(request);\n                let tonic_response = grpc_client\n                    
.fetch_docs(tonic_request)\n                    .await\n                    .map_err(|tonic_error| parse_grpc_error(&tonic_error))?;\n                Ok(tonic_response.into_inner())\n            }\n            SearchServiceClientImpl::Local(service) => service.fetch_docs(request).await,\n        }\n    }\n\n    /// Perform leaf list terms.\n    pub async fn leaf_list_terms(\n        &mut self,\n        request: quickwit_proto::search::LeafListTermsRequest,\n    ) -> crate::Result<quickwit_proto::search::LeafListTermsResponse> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Grpc(grpc_client) => {\n                let tonic_request = Request::new(request);\n                let tonic_response = grpc_client\n                    .leaf_list_terms(tonic_request)\n                    .await\n                    .map_err(|tonic_error| parse_grpc_error(&tonic_error))?;\n                Ok(tonic_response.into_inner())\n            }\n            SearchServiceClientImpl::Local(service) => service.leaf_list_terms(request).await,\n        }\n    }\n\n    /// Gets the value associated to a key stored locally in the targeted node.\n    /// This call is not \"distributed\".\n    /// If the key is not present on the targeted search `None` is simply returned.\n    pub async fn get_kv(&mut self, get_kv_req: GetKvRequest) -> crate::Result<Option<Vec<u8>>> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Local(service) => {\n                let search_after_context_opt = service.get_kv(get_kv_req).await;\n                Ok(search_after_context_opt)\n            }\n            SearchServiceClientImpl::Grpc(grpc_client) => {\n                let grpc_resp: tonic::Response<quickwit_proto::search::GetKvResponse> = grpc_client\n                    .get_kv(get_kv_req)\n                    .await\n                    .map_err(|tonic_error| parse_grpc_error(&tonic_error))?;\n                let get_search_after_context_resp = 
grpc_resp.into_inner();\n                Ok(get_search_after_context_resp.payload)\n            }\n        }\n    }\n\n    /// Stores a key-value pair locally on the targeted node.\n    /// This call is not \"distributed\". It is up to the client to put the K,V pair\n    /// on several nodes.\n    pub async fn put_kv(&mut self, put_kv_req: PutKvRequest) -> crate::Result<()> {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Local(service) => {\n                service.put_kv(put_kv_req).await;\n            }\n            SearchServiceClientImpl::Grpc(grpc_client) => {\n                grpc_client\n                    .put_kv(put_kv_req)\n                    .await\n                    .map_err(|tonic_error| parse_grpc_error(&tonic_error))?;\n            }\n        }\n        Ok(())\n    }\n\n    /// Indexers call report_splits to inform searcher nodes about the presence of a split, which\n    /// would then be considered as a candidate for the searcher split cache.\n    pub async fn report_splits(&mut self, report_splits_request: ReportSplitsRequest) {\n        match &mut self.client_impl {\n            SearchServiceClientImpl::Local(service) => {\n                let _ = service.report_splits(report_splits_request).await;\n            }\n            SearchServiceClientImpl::Grpc(search_client) => {\n                // Ignoring any error.\n                if search_client\n                    .report_splits(report_splits_request)\n                    .await\n                    .is_err()\n                {\n                    warn!(\n                        \"Failed to report splits. 
This is not critical as this message is only \\\n                         used to identify caching opportunities.\"\n                    );\n                }\n            }\n        }\n    }\n}\n\n/// Creates a [`SearchServiceClient`] from a socket address.\n/// The underlying channel connects lazily and is set up to time out after 5 seconds. It reconnects\n/// automatically should the connection be dropped.\npub fn create_search_client_from_grpc_addr(\n    grpc_addr: SocketAddr,\n    max_message_size: ByteSize,\n) -> SearchServiceClient {\n    let uri = Uri::builder()\n        .scheme(\"http\")\n        .authority(grpc_addr.to_string().as_str())\n        .path_and_query(\"/\")\n        .build()\n        .expect(\"The URI should be well-formed.\");\n    let channel = Endpoint::from(uri).connect_lazy();\n    let timeout_channel = Timeout::new(channel, Duration::from_secs(5));\n    create_search_client_from_channel(grpc_addr, timeout_channel, max_message_size)\n}\n\n/// Creates a [`SearchServiceClient`] from a pre-established connection (channel).\npub fn create_search_client_from_channel(\n    grpc_addr: SocketAddr,\n    channel: Timeout<Channel>,\n    max_message_size: ByteSize,\n) -> SearchServiceClient {\n    let client =\n        quickwit_proto::search::search_service_client::SearchServiceClient::with_interceptor(\n            channel,\n            SpanContextInterceptor,\n        )\n        .max_decoding_message_size(max_message_size.0 as usize)\n        .max_encoding_message_size(max_message_size.0 as usize);\n    SearchServiceClient::from_grpc_client(client, grpc_addr)\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/cluster_client.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse base64::Engine;\nuse futures::future::ready;\nuse futures::{Future, StreamExt};\nuse quickwit_proto::search::{\n    FetchDocsRequest, FetchDocsResponse, GetKvRequest, LeafListFieldsRequest, LeafListTermsRequest,\n    LeafListTermsResponse, LeafSearchRequest, LeafSearchResponse, ListFieldsResponse, PutKvRequest,\n};\nuse tantivy::aggregation::intermediate_agg_result::IntermediateAggregationResults;\nuse tracing::{debug, error, info, warn};\n\nuse crate::retry::search::LeafSearchRetryPolicy;\nuse crate::retry::{DefaultRetryPolicy, RetryPolicy, retry_client};\nuse crate::{SearchJobPlacer, SearchServiceClient, merge_resource_stats_it};\n\n/// Maximum number of put requests emitted to perform a replicated given PUT KV.\nconst MAX_PUT_KV_ATTEMPTS: usize = 6;\n\n/// Maximum number of get requests emitted to perform a GET KV request.\nconst MAX_GET_KV_ATTEMPTS: usize = 6;\n\n/// We attempt to store our KVs on two nodes.\nconst TARGET_NUM_REPLICATION: usize = 2;\n\n/// Client that executes placed requests (Request, `SearchServiceClient`) and\n/// provides retry policies for `FetchDocsRequest` and `LeafSearchRequest` to\n/// retry on other `SearchServiceClient`.\n#[derive(Clone)]\npub struct ClusterClient {\n    pub(crate) search_job_placer: SearchJobPlacer,\n}\n\nimpl ClusterClient {\n    /// Instantiates 
[`ClusterClient`].\n    pub fn new(search_job_placer: SearchJobPlacer) -> Self {\n        Self { search_job_placer }\n    }\n\n    /// Fetches docs with retry on another node client.\n    pub async fn fetch_docs(\n        &self,\n        request: FetchDocsRequest,\n        mut client: SearchServiceClient,\n    ) -> crate::Result<FetchDocsResponse> {\n        let mut response_res = client.fetch_docs(request.clone()).await;\n        let retry_policy = DefaultRetryPolicy {};\n        if let Some(retry_request) = retry_policy.retry_request(request, &response_res) {\n            assert!(!retry_request.split_offsets.is_empty());\n            client = retry_client(\n                &self.search_job_placer,\n                client.grpc_addr(),\n                &retry_request.split_offsets[0].split_id,\n            )\n            .await?;\n            debug!(\n                \"Fetch docs response error: `{:?}`. Retry once to execute {:?} with {:?}\",\n                response_res, retry_request, client\n            );\n            response_res = client.fetch_docs(retry_request).await;\n        }\n        response_res\n    }\n\n    /// Leaf search with retry on another node client.\n    pub async fn leaf_search(\n        &self,\n        request: LeafSearchRequest,\n        mut client: SearchServiceClient,\n    ) -> crate::Result<LeafSearchResponse> {\n        let mut response_res = client.leaf_search(request.clone()).await;\n        let retry_policy = LeafSearchRetryPolicy {};\n        // We retry only once.\n        let Some(retry_request) = retry_policy.retry_request(request, &response_res) else {\n            return response_res;\n        };\n        let Some(first_split) = retry_request\n            .leaf_requests\n            .iter()\n            .flat_map(|leaf_req| leaf_req.split_offsets.iter())\n            .next()\n        else {\n            warn!(\n                \"the retry request did not contain any split to retry. 
this should never happen, \\\n                 please report\"\n            );\n            return response_res;\n        };\n        // There could be more than one split in the retry request. We pick a single client\n        // arbitrarily only considering the affinity of the first split.\n        client = retry_client(\n            &self.search_job_placer,\n            client.grpc_addr(),\n            &first_split.split_id,\n        )\n        .await?;\n        debug!(\n            \"Leaf search response error: `{:?}`. Retry once to execute {:?} with {:?}\",\n            response_res, retry_request, client\n        );\n        let retry_result = client.leaf_search(retry_request).await;\n        response_res = merge_original_with_retry_leaf_search_results(response_res, retry_result);\n        response_res\n    }\n\n    /// Leaf list fields on the given node client. No retry is performed.\n    pub async fn leaf_list_fields(\n        &self,\n        request: LeafListFieldsRequest,\n        mut client: SearchServiceClient,\n    ) -> crate::Result<ListFieldsResponse> {\n        client.leaf_list_fields(request.clone()).await\n    }\n\n    /// Leaf list terms on the given node client. Retry is not implemented yet.\n    pub async fn leaf_list_terms(\n        &self,\n        request: LeafListTermsRequest,\n        mut client: SearchServiceClient,\n    ) -> crate::Result<LeafListTermsResponse> {\n        // TODO: implement retry\n        client.leaf_list_terms(request.clone()).await\n    }\n\n    /// Attempts to store a given key value pair within the cluster.\n    ///\n    /// Tries to replicate the pair to [`TARGET_NUM_REPLICATION`] nodes, but this function may fail\n    /// silently (e.g. if no client was available). Even in case of success, this storage is not\n    /// persistent. 
For instance during a rolling upgrade, all replicas will be lost as there is no\n    /// mechanism to maintain the replication count.\n    pub async fn put_kv(&self, key: &[u8], payload: &[u8], ttl: Duration) {\n        let clients: Vec<SearchServiceClient> = self\n            .search_job_placer\n            .best_nodes_per_affinity(key)\n            .await\n            .take(MAX_PUT_KV_ATTEMPTS)\n            .collect();\n\n        if clients.is_empty() {\n            // We only log a warning as it might be that we are just running in a\n            // single node cluster.\n            // (That's odd though, the node running this code should be in the pool too)\n            warn!(\"no other node available to replicate scroll context\");\n            return;\n        }\n\n        // We run the put requests concurrently.\n        // Our target is a replication over TARGET_NUM_REPLICATION nodes, we therefore try to avoid\n        // replicating on more than TARGET_NUM_REPLICATION nodes at the same time. Of\n        // course, this may still result in the replication over more nodes, but this is not\n        // a problem.\n        //\n        // The requests are made in a concurrent manner, up to TARGET_NUM_REPLICATION at a time. 
As\n        // soon as TARGET_NUM_REPLICATION requests are successful, we stop.\n        let put_kv_futs = clients\n            .into_iter()\n            .map(|client| replicate_kv_to_one_server(client, key, payload, ttl));\n        let successful_replication = futures::stream::iter(put_kv_futs)\n            .buffer_unordered(TARGET_NUM_REPLICATION)\n            .filter(|put_kv_successful| ready(*put_kv_successful))\n            .take(TARGET_NUM_REPLICATION)\n            .count()\n            .await;\n\n        if successful_replication == 0 {\n            error!(successful_replication=%successful_replication,\"failed-to-replicate-scroll-context\");\n        }\n    }\n\n    /// Returns a search_after context\n    pub async fn get_kv(&self, key: &[u8]) -> Option<Vec<u8>> {\n        let clients = self.search_job_placer.best_nodes_per_affinity(key).await;\n        // On the read side, we attempt to contact up to 6 nodes.\n        for mut client in clients.take(MAX_GET_KV_ATTEMPTS) {\n            let get_request = GetKvRequest { key: key.to_vec() };\n            if let Ok(Some(search_after_resp)) = client.get_kv(get_request.clone()).await {\n                return Some(search_after_resp);\n            } else {\n                let base64_key: String = base64::prelude::BASE64_STANDARD.encode(key);\n                info!(destination=?client, key=base64_key, \"Failed to get KV\");\n            }\n        }\n        None\n    }\n}\n\nfn replicate_kv_to_one_server(\n    mut client: SearchServiceClient,\n    key: &[u8],\n    payload: &[u8],\n    ttl: Duration,\n) -> impl Future<Output = bool> {\n    let put_kv_request = PutKvRequest {\n        key: key.to_vec(),\n        payload: payload.to_vec(),\n        ttl_secs: ttl.as_secs() as u32,\n    };\n    let base64_key: String = base64::prelude::BASE64_STANDARD.encode(key);\n    async move {\n        if client.put_kv(put_kv_request).await.is_ok() {\n            true\n        } else {\n            warn!(destination=?client, 
key=base64_key, \"Failed to replicate KV\");\n            false\n        }\n    }\n}\n\n/// Takes two intermediate aggregation results serialized using postcard,\n/// merges them and returns the merged serialized result.\nfn merge_intermediate_aggregation(left: &[u8], right: &[u8]) -> crate::Result<Vec<u8>> {\n    let mut intermediate_aggregation_results_left: IntermediateAggregationResults =\n        postcard::from_bytes(left)?;\n    let intermediate_aggregation_results_right: IntermediateAggregationResults =\n        postcard::from_bytes(right)?;\n    intermediate_aggregation_results_left.merge_fruits(intermediate_aggregation_results_right)?;\n    let serialized = postcard::to_allocvec(&intermediate_aggregation_results_left)?;\n    Ok(serialized)\n}\n\n/// Merges two leaf search responses.\n///\n/// # Quirk\n///\n/// This is implemented for retries.\n/// The set of splits attempted by the retry is supposed to be the set of failed splits of the\n/// original response, so the overall list of failed splits is the list of failed splits of the\n/// `retry_response`.\nfn merge_original_with_retry_leaf_search_response(\n    mut original_response: LeafSearchResponse,\n    retry_response: LeafSearchResponse,\n) -> crate::Result<LeafSearchResponse> {\n    original_response\n        .partial_hits\n        .extend(retry_response.partial_hits);\n    let intermediate_aggregation_result: Option<Vec<u8>> = match (\n        original_response.intermediate_aggregation_result,\n        retry_response.intermediate_aggregation_result,\n    ) {\n        (Some(left_agg_bytes), Some(right_agg_bytes)) => {\n            let intermediate_aggregation_bytes: Vec<u8> =\n                merge_intermediate_aggregation(&left_agg_bytes[..], &right_agg_bytes[..])?;\n            Some(intermediate_aggregation_bytes)\n        }\n        (None, Some(right)) => Some(right),\n        (Some(left), None) => Some(left),\n        (None, None) => None,\n    };\n    let resource_stats = 
merge_resource_stats_it([\n        &original_response.resource_stats,\n        &retry_response.resource_stats,\n    ]);\n    Ok(LeafSearchResponse {\n        intermediate_aggregation_result,\n        num_hits: original_response.num_hits + retry_response.num_hits,\n        num_attempted_splits: original_response.num_attempted_splits\n            + retry_response.num_attempted_splits,\n        failed_splits: retry_response.failed_splits,\n        partial_hits: original_response.partial_hits,\n        num_successful_splits: original_response.num_successful_splits\n            + retry_response.num_successful_splits,\n        resource_stats,\n    })\n}\n\n// Merge initial leaf search results with results obtained from a retry.\nfn merge_original_with_retry_leaf_search_results(\n    left_search_response_result: crate::Result<LeafSearchResponse>,\n    right_search_response_result: crate::Result<LeafSearchResponse>,\n) -> crate::Result<LeafSearchResponse> {\n    match (left_search_response_result, right_search_response_result) {\n        (Ok(left_response), Ok(right_response)) => {\n            merge_original_with_retry_leaf_search_response(left_response, right_response)\n        }\n        (Ok(single_valid_response), Err(_)) => Ok(single_valid_response),\n        (Err(_), Ok(single_valid_response)) => Ok(single_valid_response),\n        (Err(error), Err(_)) => Err(error),\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashSet;\n    use std::net::SocketAddr;\n\n    use quickwit_proto::search::{\n        LeafRequestRef, PartialHit, SearchRequest, SortValue, SplitIdAndFooterOffsets,\n        SplitSearchError,\n    };\n    use quickwit_query::query_ast::qast_json_helper;\n\n    use super::*;\n    use crate::root::SearchJob;\n    use crate::{MockSearchService, SearchError, searcher_pool_for_test};\n\n    fn mock_partial_hit(split_id: &str, sort_value: u64, doc_id: u32) -> PartialHit {\n        PartialHit {\n            sort_value: 
Some(SortValue::U64(sort_value).into()),\n            sort_value2: None,\n            split_id: split_id.to_string(),\n            segment_ord: 1,\n            doc_id,\n        }\n    }\n\n    fn mock_doc_request(split_id: &str) -> FetchDocsRequest {\n        FetchDocsRequest {\n            partial_hits: Vec::new(),\n            index_uri: \"uri\".to_string(),\n            split_offsets: vec![SplitIdAndFooterOffsets {\n                split_id: split_id.to_string(),\n                split_footer_end: 100,\n                split_footer_start: 0,\n                timestamp_start: None,\n                timestamp_end: None,\n                num_docs: 0,\n            }],\n            ..Default::default()\n        }\n    }\n\n    fn mock_leaf_search_request() -> LeafSearchRequest {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        LeafSearchRequest {\n            search_request: Some(search_request),\n            doc_mappers: vec![\"doc_mapper\".to_string()],\n            index_uris: vec![\"uri\".to_string()],\n            leaf_requests: vec![LeafRequestRef {\n                index_uri_ord: 0,\n                doc_mapper_ord: 0,\n                split_offsets: vec![\n                    SplitIdAndFooterOffsets {\n                        split_id: \"split_1\".to_string(),\n                        split_footer_start: 0,\n                        split_footer_end: 100,\n                        timestamp_start: None,\n                        timestamp_end: None,\n                        num_docs: 0,\n                    },\n                    SplitIdAndFooterOffsets {\n                        split_id: \"split_2\".to_string(),\n                        split_footer_start: 0,\n                        split_footer_end: 100,\n                        timestamp_start: None,\n  
                      timestamp_end: None,\n                        num_docs: 0,\n                    },\n                ],\n            }],\n        }\n    }\n\n    #[tokio::test]\n    async fn test_cluster_client_fetch_docs_no_retry() {\n        let request = mock_doc_request(\"split_1\");\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_fetch_docs().return_once(\n            |_: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse { hits: Vec::new() })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let first_client = search_job_placer\n            .assign_job(SearchJob::for_test(\"split_1\", 0), &HashSet::new())\n            .await\n            .unwrap();\n        let cluster_client = ClusterClient::new(search_job_placer);\n        let fetch_docs_response = cluster_client\n            .fetch_docs(request, first_client)\n            .await\n            .unwrap();\n        assert_eq!(fetch_docs_response.hits.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_cluster_client_fetch_docs_retry_with_final_success() {\n        let request = mock_doc_request(\"split_1\");\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_fetch_docs().return_once(\n            |_: quickwit_proto::search::FetchDocsRequest| {\n                Err(SearchError::Internal(\"error\".to_string()))\n            },\n        );\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_fetch_docs().return_once(\n            |_: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse { hits: Vec::new() })\n            },\n        );\n        let searcher_pool = 
searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let first_client_addr: SocketAddr = \"127.0.0.1:1001\".parse().unwrap();\n        let first_client = searcher_pool.get(&first_client_addr).unwrap();\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer);\n        let fetch_docs_response = cluster_client\n            .fetch_docs(request, first_client)\n            .await\n            .unwrap();\n        assert_eq!(fetch_docs_response.hits.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_cluster_client_fetch_docs_retry_with_final_error() {\n        let request = mock_doc_request(\"split_1\");\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_fetch_docs().returning(\n            |_: quickwit_proto::search::FetchDocsRequest| {\n                Err(SearchError::Internal(\"error\".to_string()))\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let first_client_addr: SocketAddr = \"127.0.0.1:1001\".parse().unwrap();\n        let first_client = searcher_pool.get(&first_client_addr).unwrap();\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer);\n        let search_error = cluster_client\n            .fetch_docs(request, first_client)\n            .await\n            .unwrap_err();\n        assert!(matches!(search_error, SearchError::Internal(_)));\n    }\n\n    #[tokio::test]\n    async fn test_cluster_client_leaf_search_no_retry() {\n        let request = mock_leaf_search_request();\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_leaf_search()\n            .return_once(|_: 
LeafSearchRequest| {\n                Ok(LeafSearchResponse {\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let first_client = search_job_placer\n            .assign_job(SearchJob::for_test(\"split_1\", 0), &HashSet::new())\n            .await\n            .unwrap();\n        let cluster_client = ClusterClient::new(search_job_placer);\n        let leaf_search_response = cluster_client\n            .leaf_search(request, first_client)\n            .await\n            .unwrap();\n        assert_eq!(leaf_search_response.num_attempted_splits, 1);\n    }\n\n    #[tokio::test]\n    async fn test_cluster_client_leaf_search_retry_on_failing_splits() {\n        let request = mock_leaf_search_request();\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_leaf_search()\n            .withf(|request| request.leaf_requests[0].split_offsets[0].split_id == \"split_1\")\n            .return_once(|_: LeafSearchRequest| {\n                Ok(LeafSearchResponse {\n                    num_hits: 1,\n                    failed_splits: vec![SplitSearchError {\n                        error: \"mock_error\".to_string(),\n                        split_id: \"split_2\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        mock_search_service\n            .expect_leaf_search()\n            .withf(|request| request.leaf_requests[0].split_offsets[0].split_id == \"split_2\")\n            .return_once(|_: LeafSearchRequest| {\n                Ok(LeafSearchResponse {\n                    num_hits: 1,\n                    
partial_hits: Vec::new(),\n                    failed_splits: vec![SplitSearchError {\n                        error: \"mock_error\".to_string(),\n                        split_id: \"split_3\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let first_client = search_job_placer\n            .assign_job(SearchJob::for_test(\"split_1\", 0), &HashSet::new())\n            .await\n            .unwrap();\n        let cluster_client = ClusterClient::new(search_job_placer);\n        let result = cluster_client.leaf_search(request, first_client).await;\n        assert!(result.is_ok());\n        assert_eq!(result.unwrap().num_hits, 2);\n    }\n\n    #[test]\n    fn test_merge_leaf_search_retry_on_partial_success() -> anyhow::Result<()> {\n        let split_error = SplitSearchError {\n            error: \"error\".to_string(),\n            split_id: \"split_2\".to_string(),\n            retryable_error: true,\n        };\n        let leaf_response = LeafSearchResponse {\n            num_hits: 1,\n            partial_hits: vec![mock_partial_hit(\"split_1\", 3, 1)],\n            failed_splits: vec![split_error],\n            num_attempted_splits: 1,\n            ..Default::default()\n        };\n        let leaf_response_retry = LeafSearchResponse {\n            num_hits: 1,\n            partial_hits: vec![mock_partial_hit(\"split_2\", 3, 1)],\n            failed_splits: Vec::new(),\n            num_attempted_splits: 1,\n            ..Default::default()\n        };\n        let merged_leaf_search_response = merge_original_with_retry_leaf_search_results(\n            Ok(leaf_response),\n            Ok(leaf_response_retry),\n        )\n        
.unwrap();\n        assert_eq!(merged_leaf_search_response.num_attempted_splits, 2);\n        assert_eq!(merged_leaf_search_response.num_hits, 2);\n        assert_eq!(merged_leaf_search_response.partial_hits.len(), 2);\n        assert_eq!(merged_leaf_search_response.failed_splits.len(), 0);\n        Ok(())\n    }\n\n    #[test]\n    fn test_merge_leaf_search_retry_on_error() -> anyhow::Result<()> {\n        let split_error = SplitSearchError {\n            error: \"error\".to_string(),\n            split_id: \"split_2\".to_string(),\n            retryable_error: true,\n        };\n        let leaf_response = LeafSearchResponse {\n            num_hits: 1,\n            partial_hits: vec![mock_partial_hit(\"split_1\", 3, 1)],\n            failed_splits: vec![split_error],\n            num_attempted_splits: 1,\n            ..Default::default()\n        };\n        let merged_result = merge_original_with_retry_leaf_search_results(\n            Err(SearchError::Internal(\"error\".to_string())),\n            Ok(leaf_response),\n        )\n        .unwrap();\n        assert_eq!(merged_result.num_attempted_splits, 1);\n        assert_eq!(merged_result.num_hits, 1);\n        assert_eq!(merged_result.partial_hits.len(), 1);\n        assert_eq!(merged_result.failed_splits.len(), 1);\n        Ok(())\n    }\n\n    #[test]\n    fn test_merge_leaf_search_retry_error_on_error() -> anyhow::Result<()> {\n        let merge_error = merge_original_with_retry_leaf_search_results(\n            Err(SearchError::Internal(\"error\".to_string())),\n            Err(SearchError::Internal(\"retry error\".to_string())),\n        )\n        .unwrap_err();\n        assert_eq!(merge_error.to_string(), \"internal error: `error`\");\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_put_kv_happy_path() {\n        // 3 servers 1, 2, 3\n        // Targeted key has affinity [2, 3, 1].\n        //\n        // Put on 2 and 3 is successful\n        // Get succeeds on 2.\n        let 
mock_search_service_1 = MockSearchService::new();\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_put_kv().once().returning(\n            |put_req: quickwit_proto::search::PutKvRequest| {\n                assert_eq!(put_req.key, b\"my_key\");\n                assert_eq!(put_req.payload, b\"my_payload\");\n            },\n        );\n        mock_search_service_2.expect_get_kv().once().returning(\n            |get_req: quickwit_proto::search::GetKvRequest| {\n                assert_eq!(get_req.key, b\"my_key\");\n                Some(b\"my_payload\".to_vec())\n            },\n        );\n        let mut mock_search_service_3 = MockSearchService::new();\n        // Due to the buffered call it is possible for the\n        // put request to 3 to be emitted too.\n        mock_search_service_3\n            .expect_put_kv()\n            .returning(|_put_req: quickwit_proto::search::PutKvRequest| {});\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n            (\"127.0.0.1:1003\", mock_search_service_3),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer);\n        cluster_client\n            .put_kv(\n                &b\"my_key\"[..],\n                &b\"my_payload\"[..],\n                Duration::from_secs(10 * 60),\n            )\n            .await;\n        let result = cluster_client.get_kv(&b\"my_key\"[..]).await;\n        assert_eq!(result, Some(b\"my_payload\".to_vec()))\n    }\n\n    #[tokio::test]\n    async fn test_put_kv_failing_get() {\n        // 3 servers 1, 2, 3\n        // Targeted key has affinity [2, 3, 1].\n        //\n        // Put on 2 and 3 is successful\n        // Get fails on 2.\n        // Get succeeds on 3.\n        let mock_search_service_1 = 
MockSearchService::new();\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_put_kv().once().returning(\n            |put_req: quickwit_proto::search::PutKvRequest| {\n                assert_eq!(put_req.key, b\"my_key\");\n                assert_eq!(put_req.payload, b\"my_payload\");\n            },\n        );\n        mock_search_service_2.expect_get_kv().once().returning(\n            |get_req: quickwit_proto::search::GetKvRequest| {\n                assert_eq!(get_req.key, b\"my_key\");\n                None\n            },\n        );\n        let mut mock_search_service_3 = MockSearchService::new();\n        mock_search_service_3.expect_put_kv().once().returning(\n            |put_req: quickwit_proto::search::PutKvRequest| {\n                assert_eq!(put_req.key, b\"my_key\");\n                assert_eq!(put_req.payload, b\"my_payload\");\n            },\n        );\n        mock_search_service_3.expect_get_kv().once().returning(\n            |get_req: quickwit_proto::search::GetKvRequest| {\n                assert_eq!(get_req.key, b\"my_key\");\n                Some(b\"my_payload\".to_vec())\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n            (\"127.0.0.1:1003\", mock_search_service_3),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer);\n        cluster_client\n            .put_kv(\n                &b\"my_key\"[..],\n                &b\"my_payload\"[..],\n                Duration::from_secs(10 * 60),\n            )\n            .await;\n        let result = cluster_client.get_kv(&b\"my_key\"[..]).await;\n        assert_eq!(result, Some(b\"my_payload\".to_vec()))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/collector.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Cow;\nuse std::cmp::Ordering;\nuse std::collections::HashSet;\n\nuse itertools::Itertools;\nuse quickwit_common::binary_heap::{SortKeyMapper, TopK};\nuse quickwit_doc_mapper::{FastFieldWarmupInfo, WarmupInfo};\nuse quickwit_proto::search::{\n    LeafSearchResponse, PartialHit, ResourceStats, SearchRequest, SortByValue, SortOrder,\n    SortValue, SplitSearchError,\n};\nuse quickwit_proto::types::SplitId;\nuse serde::Deserialize;\nuse tantivy::aggregation::agg_req::{Aggregations, get_fast_field_names};\nuse tantivy::aggregation::intermediate_agg_result::IntermediateAggregationResults;\nuse tantivy::aggregation::{AggContextParams, AggregationLimitsGuard, AggregationSegmentCollector};\nuse tantivy::collector::{Collector, SegmentCollector};\nuse tantivy::columnar::{ColumnType, MonotonicallyMappableToU64};\nuse tantivy::fastfield::Column;\nuse tantivy::tokenizer::TokenizerManager;\nuse tantivy::{DateTime, DocId, Score, SegmentOrdinal, SegmentReader, TantivyError};\n\nuse crate::find_trace_ids_collector::{FindTraceIdsCollector, FindTraceIdsSegmentCollector, Span};\nuse crate::top_k_collector::{QuickwitSegmentTopKCollector, specialized_top_k_segment_collector};\nuse crate::{GlobalDocAddress, merge_resource_stats, merge_resource_stats_it};\n\n#[derive(Clone, Debug)]\npub(crate) enum SortByComponent {\n    DocId {\n        order: 
SortOrder,\n    },\n    FastField {\n        field_name: String,\n        order: SortOrder,\n    },\n    Score {\n        order: SortOrder,\n    },\n}\nimpl From<SortByComponent> for SortByPair {\n    fn from(value: SortByComponent) -> Self {\n        Self {\n            first: value,\n            second: None,\n        }\n    }\n}\n#[derive(Clone)]\npub(crate) struct SortByPair {\n    first: SortByComponent,\n    second: Option<SortByComponent>,\n}\nimpl SortByPair {\n    pub fn sort_orders(&self) -> (SortOrder, SortOrder) {\n        (\n            self.first.sort_order(),\n            self.second\n                .as_ref()\n                .map(|sort_by| sort_by.sort_order())\n                .unwrap_or(SortOrder::Desc),\n        )\n    }\n}\nimpl SortByComponent {\n    fn to_sorting_field_extractor_component(\n        &self,\n        segment_reader: &SegmentReader,\n    ) -> tantivy::Result<SortingFieldExtractorComponent> {\n        match self {\n            SortByComponent::DocId { .. } => Ok(SortingFieldExtractorComponent::DocId),\n            SortByComponent::FastField { field_name, .. } => {\n                let sort_column_opt: Option<(Column<u64>, ColumnType)> =\n                    segment_reader.fast_fields().u64_lenient(field_name)?;\n                let (sort_column, column_type) = sort_column_opt.unwrap_or_else(|| {\n                    (\n                        Column::build_empty_column(segment_reader.max_doc()),\n                        ColumnType::U64,\n                    )\n                });\n                let sort_field_type = SortFieldType::try_from(column_type)?;\n                Ok(SortingFieldExtractorComponent::FastField {\n                    sort_column,\n                    sort_field_type,\n                })\n            }\n            SortByComponent::Score { .. 
} => Ok(SortingFieldExtractorComponent::Score),\n        }\n    }\n    pub fn requires_scoring(&self) -> bool {\n        match self {\n            SortByComponent::DocId { .. } => false,\n            SortByComponent::FastField { .. } => false,\n            SortByComponent::Score { .. } => true,\n        }\n    }\n    pub fn add_fast_field(&self, set: &mut HashSet<String>) {\n        if let SortByComponent::FastField {\n            field_name,\n            order: _,\n        } = self\n        {\n            set.insert(field_name.clone());\n        }\n    }\n    pub fn sort_order(&self) -> SortOrder {\n        match self {\n            SortByComponent::DocId { order } => *order,\n            SortByComponent::FastField { order, .. } => *order,\n            SortByComponent::Score { order } => *order,\n        }\n    }\n}\n\n#[derive(Copy, Clone, Debug, Eq, PartialEq)]\npub(crate) enum SortFieldType {\n    U64,\n    I64,\n    F64,\n    DateTime,\n    Bool,\n}\n\n/// The `SortingFieldExtractor` is used to extract a score, which can either be a true score,\n/// a value from a fast field, or nothing (sort by DocId).\npub(crate) enum SortingFieldExtractorComponent {\n    /// If undefined, we simply sort by DocIds.\n    DocId,\n    FastField {\n        sort_column: Column<u64>,\n        sort_field_type: SortFieldType,\n    },\n    Score,\n}\n\nimpl SortingFieldExtractorComponent {\n    pub fn is_score(&self) -> bool {\n        matches!(self, SortingFieldExtractorComponent::Score)\n    }\n    pub fn is_fast_field(&self) -> bool {\n        matches!(self, SortingFieldExtractorComponent::FastField { .. })\n    }\n    /// Loads the fast field values for the given doc_ids in its u64 representation. 
The returned\n    /// u64 representation maintains the ordering of the original value.\n    #[inline]\n    pub fn extract_typed_sort_values_block(&self, doc_ids: &[DocId], values: &mut [Option<u64>]) {\n        // In the collect block case we don't have scores to extract\n        if let SortingFieldExtractorComponent::FastField { sort_column, .. } = self {\n            let values = &mut values[..doc_ids.len()];\n            sort_column.first_vals(doc_ids, values);\n        }\n    }\n\n    /// Returns the sort value for the given element in its u64 representation. The returned u64\n    /// representation maintains the ordering of the original value.\n    ///\n    /// The function returns None if the sort key is a fast field, for which we have no value\n    /// for the given doc_id, or we sort by DocId.\n    #[inline]\n    fn extract_typed_sort_value_opt(&self, doc_id: DocId, score: Score) -> Option<u64> {\n        match self {\n            // Tie breaks are not handled here, but in SegmentPartialHit\n            SortingFieldExtractorComponent::DocId => None,\n            SortingFieldExtractorComponent::FastField { sort_column, .. 
} => {\n                sort_column.first(doc_id)\n            }\n            SortingFieldExtractorComponent::Score => Some((score as f64).to_u64()),\n        }\n    }\n\n    #[inline]\n    /// Converts u64 fast field values to its correct type.\n    /// The conversion is delayed for performance reasons.\n    ///\n    /// This is used to convert `search_after` sort value to a u64 representation that will respect\n    /// the same order as the `SortValue` representation.\n    pub fn convert_u64_ff_val_to_sort_value(&self, sort_value: u64) -> SortValue {\n        let map_fast_field_to_value = |fast_field_value, field_type| match field_type {\n            SortFieldType::U64 => SortValue::U64(fast_field_value),\n            SortFieldType::I64 => SortValue::I64(i64::from_u64(fast_field_value)),\n            SortFieldType::F64 => SortValue::F64(f64::from_u64(fast_field_value)),\n            SortFieldType::DateTime => SortValue::I64(i64::from_u64(fast_field_value)),\n            SortFieldType::Bool => SortValue::Boolean(fast_field_value != 0u64),\n        };\n        match self {\n            SortingFieldExtractorComponent::DocId => SortValue::U64(sort_value),\n            SortingFieldExtractorComponent::FastField {\n                sort_field_type, ..\n            } => map_fast_field_to_value(sort_value, *sort_field_type),\n            SortingFieldExtractorComponent::Score => SortValue::F64(f64::from_u64(sort_value)),\n        }\n    }\n    /// Converts fast field values into their u64 fast field representation.\n    ///\n    /// Returns None if value is out of bounds of target value.\n    /// None means that the search_after will be disabled and everything matches.\n    ///\n    /// What's currently missing is to signal that _nothing_ matches to generate an optimized\n    /// query. 
For now we just choose the max value of the target type.\n    #[inline]\n    pub fn convert_to_u64_ff_val(\n        &self,\n        sort_value: SortValue,\n        sort_order: SortOrder,\n    ) -> Option<u64> {\n        match self {\n            SortingFieldExtractorComponent::DocId => match sort_value {\n                SortValue::U64(val) => Some(val),\n                _ => panic!(\"Internal error: Got non-U64 sort value for DocId.\"),\n            },\n            SortingFieldExtractorComponent::FastField {\n                sort_field_type, ..\n            } => {\n                // We need to convert a (potentially user-provided) value into the correct u64\n                // representation of the fast field.\n                // This requires this weird conversion of first casting into the target type\n                // (if possible) and then to its u64 representation.\n                //\n                // For the conversion into the target type it's important to know if the target\n                // type does not cover the whole range of the source type. In that case we need to\n                // add additional conversion checks, to see if it matches everything\n                // or nothing. 
(Which also depends on the sort order).\n                // Below are the visual representations of the value ranges of the different types.\n                // Note: DateTime is equal to I64 and omitted.\n                //\n                //     Bool value range (0, 1):\n                //                        <->\n                //\n                //     I64 value range (signed 64-bit integer):\n                //     <------------------------------------>\n                //     -2^63                             2^63-1\n                //     U64 value range (unsigned 64-bit integer):\n                //                        <------------------------------------>\n                //                        0                                  2^64-1\n                // F64 value range (64-bit floating point, conceptual, not to scale):\n                // <-------------------------------------------------------------------->\n                // Very negative numbers                                       Very positive numbers\n                //\n                // Those conversions have limited target type value space:\n                // - [X] U64 -> I64\n                // - [X] F64 -> I64\n                // - [X] I64 -> U64\n                // - [X] F64 -> U64\n                //\n                // - [X] F64 -> Bool\n                // - [X] I64 -> Bool\n                // - [X] U64 -> Bool\n                //\n                let val = match (sort_value, sort_field_type) {\n                    // Same field type, no conversion needed.\n                    (SortValue::U64(val), SortFieldType::U64) => val,\n                    (SortValue::F64(val), SortFieldType::F64) => val.to_u64(),\n                    (SortValue::Boolean(val), SortFieldType::Bool) => val.to_u64(),\n                    (SortValue::I64(val), SortFieldType::I64) => val.to_u64(),\n                    (SortValue::U64(mut val), SortFieldType::I64) => {\n                        if 
sort_order == SortOrder::Desc && val > i64::MAX as u64 {\n                            return None;\n                        }\n                        // Add a limit to avoid overflow.\n                        val = val.min(i64::MAX as u64);\n                        (val as i64).to_u64()\n                    }\n                    (SortValue::U64(val), SortFieldType::F64) => (val as f64).to_u64(),\n                    (SortValue::U64(mut val), SortFieldType::DateTime) => {\n                        // Match everything\n                        if sort_order == SortOrder::Desc && val > i64::MAX as u64 {\n                            return None;\n                        }\n                        // Add a limit to avoid overflow.\n                        val = val.min(i64::MAX as u64);\n                        DateTime::from_timestamp_nanos(val as i64).to_u64()\n                    }\n                    (SortValue::I64(val), SortFieldType::U64) => {\n                        if val < 0 && sort_order == SortOrder::Asc {\n                            return None;\n                        }\n                        if val < 0 && sort_order == SortOrder::Desc {\n                            u64::MIN // matches nothing as search_after is not inclusive\n                        } else {\n                            val as u64\n                        }\n                    }\n                    (SortValue::I64(val), SortFieldType::F64) => (val as f64).to_u64(),\n                    (SortValue::I64(val), SortFieldType::DateTime) => {\n                        DateTime::from_timestamp_nanos(val).to_u64()\n                    }\n                    (SortValue::F64(val), SortFieldType::U64) => {\n                        let all_values_ahead1 =\n                            val < u64::MIN as f64 && sort_order == SortOrder::Asc;\n                        let all_values_ahead2 =\n                            val > u64::MAX as f64 && sort_order == SortOrder::Desc;\n                        
if all_values_ahead1 || all_values_ahead2 {\n                            return None;\n                        }\n                        // f64 cast already handles under/overflow and clamps the value\n                        (val as u64).to_u64()\n                    }\n                    (SortValue::F64(val), SortFieldType::I64)\n                    | (SortValue::F64(val), SortFieldType::DateTime) => {\n                        let all_values_ahead1 =\n                            val < i64::MIN as f64 && sort_order == SortOrder::Asc;\n                        let all_values_ahead2 =\n                            val > i64::MAX as f64 && sort_order == SortOrder::Desc;\n                        if all_values_ahead1 || all_values_ahead2 {\n                            return None;\n                        }\n                        // f64 cast already handles under/overflow and clamps the value\n                        let val_i64 = val as i64;\n\n                        if *sort_field_type == SortFieldType::DateTime {\n                            DateTime::from_timestamp_nanos(val_i64).to_u64()\n                        } else {\n                            val_i64.to_u64()\n                        }\n                    }\n                    // Not sure when we hit this, it's probably a very rare case.\n                    (SortValue::Boolean(val), SortFieldType::U64) => val as u64,\n                    (SortValue::Boolean(val), SortFieldType::F64) => (val as u64 as f64).to_u64(),\n                    (SortValue::Boolean(val), SortFieldType::I64) => (val as i64).to_u64(),\n                    (SortValue::Boolean(val), SortFieldType::DateTime) => {\n                        DateTime::from_timestamp_nanos(val as i64).to_u64()\n                    }\n                    (SortValue::U64(mut val), SortFieldType::Bool) => {\n                        let all_values_ahead1 = val > 1 && sort_order == SortOrder::Desc;\n                        if all_values_ahead1 {\n           
                 return None;\n                        }\n                        // clamp value for comparison\n                        val = val.clamp(0, 1);\n                        (val == 1).to_u64()\n                    }\n                    (SortValue::I64(mut val), SortFieldType::Bool) => {\n                        let all_values_ahead1 = val > 1 && sort_order == SortOrder::Desc;\n                        let all_values_ahead2 = val < 0 && sort_order == SortOrder::Asc;\n                        if all_values_ahead1 || all_values_ahead2 {\n                            return None;\n                        }\n                        // clamp value for comparison\n                        val = val.clamp(0, 1);\n                        (val == 1).to_u64()\n                    }\n                    (SortValue::F64(mut val), SortFieldType::Bool) => {\n                        let all_values_ahead1 = val > 1.0 && sort_order == SortOrder::Desc;\n                        let all_values_ahead2 = val < 0.0 && sort_order == SortOrder::Asc;\n                        if all_values_ahead1 || all_values_ahead2 {\n                            return None;\n                        }\n                        val = val.clamp(0.0, 1.0);\n                        (val >= 0.5).to_u64() // Is this correct?\n                    }\n                };\n                Some(val)\n            }\n            SortingFieldExtractorComponent::Score => match sort_value {\n                SortValue::F64(val) => Some(val.to_u64()),\n                _ => panic!(\"Internal error: Got non-F64 sort value for Score.\"),\n            },\n        }\n    }\n}\n\nimpl From<SortingFieldExtractorComponent> for SortingFieldExtractorPair {\n    fn from(value: SortingFieldExtractorComponent) -> Self {\n        Self {\n            first: value,\n            second: None,\n        }\n    }\n}\n\npub(crate) struct SortingFieldExtractorPair {\n    pub first: SortingFieldExtractorComponent,\n    pub second: 
Option<SortingFieldExtractorComponent>,\n}\n\nimpl SortingFieldExtractorPair {\n    pub fn is_score(&self) -> bool {\n        self.first.is_score()\n            || self\n                .second\n                .as_ref()\n                .map(|second| second.is_score())\n                .unwrap_or(false)\n    }\n    /// Returns the list of sort values for the given element\n    ///\n    /// See also [`SortingFieldExtractorComponent::extract_typed_sort_values_block`] for more\n    /// information.\n    #[inline]\n    pub(crate) fn extract_typed_sort_values(\n        &self,\n        doc_ids: &[DocId],\n        values1: &mut [Option<u64>],\n        values2: &mut [Option<u64>],\n    ) {\n        self.first\n            .extract_typed_sort_values_block(doc_ids, &mut values1[..doc_ids.len()]);\n        if let Some(second) = self.second.as_ref() {\n            second.extract_typed_sort_values_block(doc_ids, &mut values2[..doc_ids.len()]);\n        }\n    }\n    /// Returns the list of sort values for the given element\n    ///\n    /// See also [`SortingFieldExtractorComponent::extract_typed_sort_value_opt`] for more\n    /// information.\n    #[inline]\n    pub(crate) fn extract_typed_sort_value(\n        &self,\n        doc_id: DocId,\n        score: Score,\n    ) -> (Option<u64>, Option<u64>) {\n        let first = self.first.extract_typed_sort_value_opt(doc_id, score);\n        let second = self\n            .second\n            .as_ref()\n            .and_then(|second| second.extract_typed_sort_value_opt(doc_id, score));\n        (first, second)\n    }\n}\n\nimpl TryFrom<ColumnType> for SortFieldType {\n    type Error = tantivy::TantivyError;\n\n    fn try_from(column_type: ColumnType) -> tantivy::Result<Self> {\n        match column_type {\n            ColumnType::U64 => Ok(SortFieldType::U64),\n            ColumnType::I64 => Ok(SortFieldType::I64),\n            ColumnType::F64 => Ok(SortFieldType::F64),\n            ColumnType::DateTime => 
Ok(SortFieldType::DateTime),\n            ColumnType::Bool => Ok(SortFieldType::Bool),\n            _ => Err(TantivyError::InvalidArgument(format!(\n                \"Unsupported sort field type `{column_type:?}`.\"\n            ))),\n        }\n    }\n}\n\n/// Takes a user-defined sorting criteria and resolves it to a\n/// segment specific `SortingFieldExtractorPair`.\nfn get_score_extractor(\n    sort_by: &SortByPair,\n    segment_reader: &SegmentReader,\n) -> tantivy::Result<SortingFieldExtractorPair> {\n    Ok(SortingFieldExtractorPair {\n        first: sort_by\n            .first\n            .to_sorting_field_extractor_component(segment_reader)?,\n        second: sort_by\n            .second\n            .as_ref()\n            .map(|first| first.to_sorting_field_extractor_component(segment_reader))\n            .transpose()?,\n    })\n}\n\n#[allow(clippy::large_enum_variant)]\nenum AggregationSegmentCollectors {\n    FindTraceIdsSegmentCollector(Box<FindTraceIdsSegmentCollector>),\n    TantivyAggregationSegmentCollector(AggregationSegmentCollector),\n}\n\n/// Quickwit collector working at the scale of the segment.\npub struct QuickwitSegmentCollector {\n    segment_top_k_collector: Option<Box<dyn QuickwitSegmentTopKCollector>>,\n    aggregation: Option<AggregationSegmentCollectors>,\n    num_hits: u64,\n}\n\n#[derive(Copy, Clone, Debug)]\npub(crate) struct SegmentPartialHit {\n    /// Normalized to u64, the typed value can be reconstructed with\n    /// SortingFieldExtractorComponent.\n    pub sort_value: Option<u64>,\n    pub sort_value2: Option<u64>,\n    pub doc_id: DocId,\n}\n\nimpl SegmentPartialHit {\n    pub fn into_partial_hit(\n        self,\n        split_id: SplitId,\n        segment_ord: SegmentOrdinal,\n        first: &SortingFieldExtractorComponent,\n        second: &Option<SortingFieldExtractorComponent>,\n    ) -> PartialHit {\n        PartialHit {\n            sort_value: self\n                .sort_value\n                .map(|sort_value| 
first.convert_u64_ff_val_to_sort_value(sort_value))\n                .map(|sort_value| SortByValue {\n                    sort_value: Some(sort_value),\n                }),\n            sort_value2: self\n                .sort_value2\n                .map(|sort_value| {\n                    second\n                        .as_ref()\n                        .expect(\"Internal error: Got sort_value2, but no sort extractor\")\n                        .convert_u64_ff_val_to_sort_value(sort_value)\n                })\n                .map(|sort_value| SortByValue {\n                    sort_value: Some(sort_value),\n                }),\n            doc_id: self.doc_id,\n            split_id,\n            segment_ord,\n        }\n    }\n}\n\nimpl SegmentCollector for QuickwitSegmentCollector {\n    type Fruit = tantivy::Result<LeafSearchResponse>;\n\n    #[inline]\n    fn collect_block(&mut self, filtered_docs: &[DocId]) {\n        // Update results\n        self.num_hits += filtered_docs.len() as u64;\n\n        if let Some(segment_top_k_collector) = self.segment_top_k_collector.as_mut() {\n            segment_top_k_collector.collect_top_k_block(filtered_docs);\n        }\n\n        match self.aggregation.as_mut() {\n            Some(AggregationSegmentCollectors::FindTraceIdsSegmentCollector(collector)) => {\n                collector.collect_block(filtered_docs)\n            }\n            Some(AggregationSegmentCollectors::TantivyAggregationSegmentCollector(collector)) => {\n                collector.collect_block(filtered_docs)\n            }\n            None => (),\n        }\n    }\n\n    #[inline]\n    fn collect(&mut self, doc_id: DocId, score: Score) {\n        self.num_hits += 1;\n        if let Some(segment_top_k_collector) = self.segment_top_k_collector.as_mut() {\n            segment_top_k_collector.collect_top_k(doc_id, score);\n        }\n\n        match self.aggregation.as_mut() {\n            
Some(AggregationSegmentCollectors::FindTraceIdsSegmentCollector(collector)) => {\n                collector.collect(doc_id, score)\n            }\n            Some(AggregationSegmentCollectors::TantivyAggregationSegmentCollector(collector)) => {\n                collector.collect(doc_id, score)\n            }\n            None => (),\n        }\n    }\n\n    fn harvest(self) -> Self::Fruit {\n        let mut partial_hits: Vec<PartialHit> = Vec::new();\n        if let Some(segment_top_k_collector) = self.segment_top_k_collector {\n            partial_hits = segment_top_k_collector.get_top_k();\n        }\n\n        let intermediate_aggregation_result = match self.aggregation {\n            Some(AggregationSegmentCollectors::FindTraceIdsSegmentCollector(collector)) => {\n                let fruit: Vec<Span> = collector.harvest();\n                let serialized =\n                    postcard::to_allocvec(&fruit).expect(\"Collector fruit should be serializable.\");\n                Some(serialized)\n            }\n            Some(AggregationSegmentCollectors::TantivyAggregationSegmentCollector(collector)) => {\n                let serialized = postcard::to_allocvec(&collector.harvest()?)\n                    .expect(\"Collector fruit should be serializable.\");\n                Some(serialized)\n            }\n            None => None,\n        };\n\n        Ok(LeafSearchResponse {\n            intermediate_aggregation_result,\n            num_hits: self.num_hits,\n            partial_hits,\n            failed_splits: Vec::new(),\n            num_attempted_splits: 1,\n            num_successful_splits: 1,\n            resource_stats: None,\n        })\n    }\n}\n\n/// Available aggregation types.\n#[derive(Debug, Clone, PartialEq, Deserialize)]\n#[serde(untagged)]\npub enum QuickwitAggregations {\n    /// Aggregation used by the Jaeger service to find trace IDs that match a\n    /// [`quickwit_proto::jaeger::storage::v1::FindTraceIDsRequest`].\n    
FindTraceIdsAggregation(FindTraceIdsCollector),\n    /// Your classic Tantivy aggregation.\n    TantivyAggregations(Aggregations),\n}\n\nimpl QuickwitAggregations {\n    /// Returns the list of fast fields that should be loaded for the aggregation.\n    pub fn fast_field_names(&self) -> HashSet<String> {\n        match self {\n            QuickwitAggregations::FindTraceIdsAggregation(collector) => {\n                collector.fast_field_names()\n            }\n            QuickwitAggregations::TantivyAggregations(aggregations) => {\n                get_fast_field_names(aggregations)\n            }\n        }\n    }\n\n    fn maybe_incremental_aggregator(&self) -> QuickwitIncrementalAggregations {\n        match self {\n            QuickwitAggregations::FindTraceIdsAggregation(aggreg) => {\n                QuickwitIncrementalAggregations::FindTraceIdsAggregation(aggreg.clone(), Vec::new())\n            }\n            QuickwitAggregations::TantivyAggregations(aggreg) => {\n                QuickwitIncrementalAggregations::TantivyAggregations(aggreg.clone(), Vec::new())\n            }\n        }\n    }\n}\n\n#[derive(Clone)]\nenum QuickwitIncrementalAggregations {\n    FindTraceIdsAggregation(FindTraceIdsCollector, Vec<Vec<Span>>),\n    TantivyAggregations(Aggregations, Vec<Vec<u8>>),\n    NoAggregation,\n}\n\nimpl QuickwitIncrementalAggregations {\n    fn add(&mut self, intermediate_result: Vec<u8>) -> tantivy::Result<()> {\n        match self {\n            QuickwitIncrementalAggregations::FindTraceIdsAggregation(collector, state) => {\n                let fruits: Vec<Span> =\n                    postcard::from_bytes(&intermediate_result).map_err(map_error)?;\n                state.push(fruits);\n                if state.iter().map(Vec::len).sum::<usize>() >= collector.num_traces {\n                    let new_state = collector.merge_fruits(std::mem::take(state))?;\n                    state.push(new_state);\n                }\n            }\n            
QuickwitIncrementalAggregations::TantivyAggregations(_, state) => {\n                state.push(intermediate_result);\n            }\n            QuickwitIncrementalAggregations::NoAggregation => (),\n        }\n        Ok(())\n    }\n\n    fn virtual_worst_hit(&self) -> Option<PartialHit> {\n        match self {\n            QuickwitIncrementalAggregations::FindTraceIdsAggregation(collector, state) => {\n                if let Some(first) = state.first()\n                    && first.len() >= collector.num_traces\n                    && let Some(last_elem) = first.last()\n                {\n                    let timestamp = last_elem.span_timestamp.into_timestamp_nanos();\n                    return Some(PartialHit {\n                        sort_value: Some(SortByValue {\n                            sort_value: Some(SortValue::I64(timestamp)),\n                        }),\n                        sort_value2: None,\n                        split_id: SplitId::new(),\n                        segment_ord: 0,\n                        doc_id: 0,\n                    });\n                }\n                None\n            }\n            QuickwitIncrementalAggregations::TantivyAggregations(_, _) => None,\n            QuickwitIncrementalAggregations::NoAggregation => None,\n        }\n    }\n\n    fn finalize(self) -> tantivy::Result<Option<Vec<u8>>> {\n        match self {\n            QuickwitIncrementalAggregations::FindTraceIdsAggregation(collector, mut state) => {\n                let merged_fruit = if state.len() > 1 {\n                    collector.merge_fruits(state)?\n                } else {\n                    state.pop().unwrap_or_default()\n                };\n                let serialized = postcard::to_allocvec(&merged_fruit).map_err(map_error)?;\n                Ok(Some(serialized))\n            }\n            QuickwitIncrementalAggregations::TantivyAggregations(aggregation, state) => {\n                merge_intermediate_aggregation_result(\n       
             &Some(QuickwitAggregations::TantivyAggregations(aggregation)),\n                    state.iter().map(|vec| vec.as_slice()),\n                )\n            }\n            QuickwitIncrementalAggregations::NoAggregation => Ok(None),\n        }\n    }\n}\n\n/// The quickwit collector is the tantivy Collector used in Quickwit.\n///\n/// It defines the data that should be accumulated about the documents matching\n/// the query.\n#[derive(Clone)]\npub(crate) struct QuickwitCollector {\n    pub split_id: SplitId,\n    pub start_offset: usize,\n    pub max_hits: usize,\n    pub sort_by: SortByPair,\n    pub aggregation: Option<QuickwitAggregations>,\n    pub agg_context_params: AggContextParams,\n    search_after: Option<PartialHit>,\n}\n\nimpl QuickwitCollector {\n    pub fn is_count_only(&self) -> bool {\n        self.max_hits == 0 && self.aggregation.is_none()\n    }\n    /// Updates search parameters affecting the returned documents.\n    /// Does not update aggregations.\n    pub fn update_search_param(&mut self, search_request: &SearchRequest) {\n        let sort_by = sort_by_from_request(search_request);\n        self.sort_by = sort_by;\n        self.max_hits = search_request.max_hits as usize;\n        self.start_offset = search_request.start_offset as usize;\n        self.search_after.clone_from(&search_request.search_after);\n    }\n    pub fn fast_field_names(&self) -> HashSet<String> {\n        let mut fast_field_names = HashSet::default();\n        self.sort_by.first.add_fast_field(&mut fast_field_names);\n        if let Some(sort_by_second) = &self.sort_by.second {\n            sort_by_second.add_fast_field(&mut fast_field_names);\n        }\n        if let Some(aggregations) = &self.aggregation {\n            fast_field_names.extend(aggregations.fast_field_names());\n        }\n        fast_field_names\n    }\n\n    pub fn warmup_info(&self) -> WarmupInfo {\n        WarmupInfo {\n            fast_fields: self\n                
.fast_field_names()\n                .into_iter()\n                .map(|name| FastFieldWarmupInfo {\n                    name,\n                    with_subfields: false,\n                })\n                .collect(),\n            field_norms: self.requires_scoring(),\n            ..WarmupInfo::default()\n        }\n    }\n}\n\nimpl Collector for QuickwitCollector {\n    type Child = QuickwitSegmentCollector;\n    type Fruit = LeafSearchResponse;\n\n    fn for_segment(\n        &self,\n        segment_ord: SegmentOrdinal,\n        segment_reader: &SegmentReader,\n    ) -> tantivy::Result<Self::Child> {\n        // Regardless of the start_offset, we need to collect the top-K\n        // starting from 0 on every leaf.\n        let leaf_max_hits = self.max_hits + self.start_offset;\n\n        let aggregation = match &self.aggregation {\n            Some(QuickwitAggregations::FindTraceIdsAggregation(collector)) => {\n                Some(AggregationSegmentCollectors::FindTraceIdsSegmentCollector(\n                    Box::new(collector.for_segment(0, segment_reader)?),\n                ))\n            }\n            Some(QuickwitAggregations::TantivyAggregations(aggs)) => Some(\n                AggregationSegmentCollectors::TantivyAggregationSegmentCollector(\n                    AggregationSegmentCollector::from_agg_req_and_reader(\n                        aggs,\n                        segment_reader,\n                        segment_ord,\n                        &self.agg_context_params,\n                    )?,\n                ),\n            ),\n            None => None,\n        };\n        let score_extractor = get_score_extractor(&self.sort_by, segment_reader)?;\n        let (order1, order2) = self.sort_by.sort_orders();\n\n        let segment_top_k_collector = if leaf_max_hits == 0 {\n            None\n        } else {\n            let coll: Box<dyn QuickwitSegmentTopKCollector> = specialized_top_k_segment_collector(\n                
self.split_id.clone(),\n                score_extractor,\n                leaf_max_hits,\n                segment_ord,\n                self.search_after.clone(),\n                order1,\n                order2,\n            );\n            Some(coll)\n        };\n\n        Ok(QuickwitSegmentCollector {\n            num_hits: 0,\n            segment_top_k_collector,\n            aggregation,\n        })\n    }\n\n    fn requires_scoring(&self) -> bool {\n        // We do not need BM25 scoring in Quickwit if it is not opted-in.\n        // By returning false, we inform tantivy that it does not need to decompress\n        // term frequencies.\n        self.sort_by.first.requires_scoring()\n            || self\n                .sort_by\n                .second\n                .as_ref()\n                .map(|sort_by| sort_by.requires_scoring())\n                .unwrap_or(false)\n    }\n\n    fn merge_fruits(\n        &self,\n        segment_fruits: Vec<tantivy::Result<LeafSearchResponse>>,\n    ) -> tantivy::Result<Self::Fruit> {\n        let segment_fruits: tantivy::Result<Vec<LeafSearchResponse>> =\n            segment_fruits.into_iter().collect();\n        // We want the hits in [start_offset..start_offset + max_hits).\n        // All leaves will return their top [0..start_offset + max_hits) documents.\n        // We compute the overall [0..start_offset + max_hits) documents ...\n        let num_hits = self.start_offset + self.max_hits;\n        let (sort_order1, sort_order2) = self.sort_by.sort_orders();\n        let mut merged_leaf_response = merge_leaf_responses(\n            &self.aggregation,\n            segment_fruits?,\n            sort_order1,\n            sort_order2,\n            num_hits,\n        )?;\n        // ... 
and drop the first [..start_offset) hits.\n        // Note that self.start_offset is 0 when merging from leaf_search, and is only set when\n        // merging from root_search, so that the first elements are removed only once.\n        merged_leaf_response.partial_hits.drain(\n            0..self\n                .start_offset\n                .min(merged_leaf_response.partial_hits.len()),\n        );\n        merged_leaf_response.partial_hits.truncate(self.max_hits);\n        Ok(merged_leaf_response)\n    }\n}\n\nfn map_error(error: postcard::Error) -> TantivyError {\n    TantivyError::InternalError(format!(\n        \"failed to merge intermediate aggregation results: Postcard error: {error}\"\n    ))\n}\n\n/// Merges a set of Leaf Results.\nfn merge_intermediate_aggregation_result<'a>(\n    aggregations_opt: &Option<QuickwitAggregations>,\n    intermediate_aggregation_results: impl Iterator<Item = &'a [u8]>,\n) -> tantivy::Result<Option<Vec<u8>>> {\n    let merged_intermediate_aggregation_result = match aggregations_opt {\n        Some(QuickwitAggregations::FindTraceIdsAggregation(collector)) => {\n            let fruits: Vec<\n                <<FindTraceIdsCollector as Collector>::Child as SegmentCollector>::Fruit,\n            > = intermediate_aggregation_results\n                .map(|intermediate_aggregation_result| {\n                    postcard::from_bytes(intermediate_aggregation_result).map_err(map_error)\n                })\n                .collect::<Result<_, _>>()?;\n            let merged_fruit: Vec<Span> = collector.merge_fruits(fruits)?;\n            let serialized = postcard::to_allocvec(&merged_fruit).map_err(map_error)?;\n            Some(serialized)\n        }\n        Some(QuickwitAggregations::TantivyAggregations(_)) => {\n            let merged_opt = intermediate_aggregation_results\n                .map(|bytes| postcard::from_bytes(bytes).map_err(map_error))\n                .try_fold::<_, _, Result<_, TantivyError>>(\n                    
None,\n                    |acc: Option<IntermediateAggregationResults>, fruits_res| {\n                        let fruits = fruits_res?;\n                        match acc {\n                            Some(mut merged_fruits) => {\n                                merged_fruits.merge_fruits(fruits)?;\n                                Ok(Some(merged_fruits))\n                            }\n                            None => Ok(Some(fruits)),\n                        }\n                    },\n                )?;\n            let serialized =\n                postcard::to_allocvec(&merged_opt.unwrap_or_default()).map_err(map_error)?;\n            Some(serialized)\n        }\n        None => None,\n    };\n\n    Ok(merged_intermediate_aggregation_result)\n}\n\n/// Merges a set of Leaf Results.\nfn merge_leaf_responses(\n    aggregations_opt: &Option<QuickwitAggregations>,\n    mut leaf_responses: Vec<LeafSearchResponse>,\n    sort_order1: SortOrder,\n    sort_order2: SortOrder,\n    max_hits: usize,\n) -> tantivy::Result<LeafSearchResponse> {\n    // Optimization: No merging needed if there is only one result.\n    if leaf_responses.len() == 1 {\n        return Ok(leaf_responses.pop().unwrap());\n    }\n\n    let resource_stats_it = leaf_responses\n        .iter()\n        .map(|leaf_response| &leaf_response.resource_stats);\n    let merged_resource_stats = merge_resource_stats_it(resource_stats_it);\n\n    let merged_intermediate_aggregation_result: Option<Vec<u8>> =\n        merge_intermediate_aggregation_result(\n            aggregations_opt,\n            leaf_responses.iter().filter_map(|leaf_response| {\n                leaf_response.intermediate_aggregation_result.as_deref()\n            }),\n        )?;\n    let num_attempted_splits = leaf_responses\n        .iter()\n        .map(|leaf_response| leaf_response.num_attempted_splits)\n        .sum();\n    let num_successful_splits = leaf_responses\n        .iter()\n        .map(|leaf_response| 
leaf_response.num_successful_splits)\n        .sum::<u64>();\n    let num_hits: u64 = leaf_responses\n        .iter()\n        .map(|leaf_response| leaf_response.num_hits)\n        .sum();\n    let failed_splits = leaf_responses\n        .iter()\n        .flat_map(|leaf_response| leaf_response.failed_splits.iter())\n        .cloned()\n        .collect_vec();\n    let all_partial_hits: Vec<PartialHit> = leaf_responses\n        .into_iter()\n        .flat_map(|leaf_response| leaf_response.partial_hits)\n        .collect();\n    let top_k_partial_hits: Vec<PartialHit> = top_k_partial_hits(\n        all_partial_hits.into_iter(),\n        sort_order1,\n        sort_order2,\n        max_hits,\n    );\n    Ok(LeafSearchResponse {\n        intermediate_aggregation_result: merged_intermediate_aggregation_result,\n        num_hits,\n        partial_hits: top_k_partial_hits,\n        failed_splits,\n        num_attempted_splits,\n        num_successful_splits,\n        resource_stats: merged_resource_stats,\n    })\n}\n\n/// Returns the top `num_hits` hits from `partial_hits`,\n/// sorted according to `order1` and `order2`.\n///\n/// TODO we could possibly optimize the sort away (but I doubt it matters).\nfn top_k_partial_hits(\n    partial_hits: impl Iterator<Item = PartialHit>,\n    order1: SortOrder,\n    order2: SortOrder,\n    num_hits: usize,\n) -> Vec<PartialHit> {\n    let sort_key_mapper = HitSortingMapper { order1, order2 };\n    let mut top_k_hits = TopK::new(num_hits, sort_key_mapper);\n\n    partial_hits.for_each(|hit| top_k_hits.add_entry(hit));\n\n    top_k_hits.finalize()\n}\n\npub(crate) fn sort_by_from_request(search_request: &SearchRequest) -> SortByPair {\n    let to_sort_by_component = |field_name: &str, order| {\n        if field_name == \"_score\" {\n            SortByComponent::Score { order }\n        } else if field_name == \"_shard_doc\" || field_name == \"_doc\" {\n            SortByComponent::DocId { order }\n        } else {\n            
SortByComponent::FastField {\n                field_name: field_name.to_string(),\n                order,\n            }\n        }\n    };\n\n    let num_sort_fields = search_request.sort_fields.len();\n    if num_sort_fields == 0 {\n        SortByComponent::DocId {\n            order: SortOrder::Desc,\n        }\n        .into()\n    } else if num_sort_fields == 1 {\n        let sort_field = &search_request.sort_fields[0];\n        let order = SortOrder::try_from(sort_field.sort_order).unwrap_or(SortOrder::Desc);\n        to_sort_by_component(&sort_field.field_name, order).into()\n    } else if num_sort_fields == 2 {\n        let sort_field1 = &search_request.sort_fields[0];\n        let order1 = SortOrder::try_from(sort_field1.sort_order).unwrap_or(SortOrder::Desc);\n        let sort_field2 = &search_request.sort_fields[1];\n        let order2 = SortOrder::try_from(sort_field2.sort_order).unwrap_or(SortOrder::Desc);\n        SortByPair {\n            first: to_sort_by_component(&sort_field1.field_name, order1),\n            second: Some(to_sort_by_component(&sort_field2.field_name, order2)),\n        }\n    } else {\n        panic!(\"Sort by more than 2 fields is not supported yet.\")\n    }\n}\n\n/// Builds the QuickwitCollector, in function of the information that was requested by the user.\npub(crate) fn make_collector_for_split(\n    split_id: SplitId,\n    search_request: &SearchRequest,\n    agg_context_params: AggContextParams,\n) -> crate::Result<QuickwitCollector> {\n    let aggregation = match &search_request.aggregation_request {\n        Some(aggregation) => Some(serde_json::from_str(aggregation)?),\n        None => None,\n    };\n    let sort_by = sort_by_from_request(search_request);\n    Ok(QuickwitCollector {\n        split_id,\n        start_offset: search_request.start_offset as usize,\n        max_hits: search_request.max_hits as usize,\n        sort_by,\n        aggregation,\n        agg_context_params,\n        search_after: 
search_request.search_after.clone(),\n    })\n}\n\n/// Builds a QuickwitCollector that's only useful for merging fruits.\npub(crate) fn make_merge_collector(\n    search_request: &SearchRequest,\n    agg_limits: AggregationLimitsGuard,\n) -> crate::Result<QuickwitCollector> {\n    // Note: at this point the tokenizer manager is not used anymore by aggregations (filter query),\n    // so we can create an empty one. So if it will ever be used, it would panic.\n    let agg_context_params = AggContextParams {\n        limits: agg_limits,\n        tokenizers: TokenizerManager::new(),\n    };\n\n    let aggregation = match &search_request.aggregation_request {\n        Some(aggregation) => Some(serde_json::from_str(aggregation)?),\n        None => None,\n    };\n    let sort_by = sort_by_from_request(search_request);\n    Ok(QuickwitCollector {\n        split_id: SplitId::default(),\n        start_offset: search_request.start_offset as usize,\n        max_hits: search_request.max_hits as usize,\n        sort_by,\n        aggregation,\n        agg_context_params,\n        search_after: search_request.search_after.clone(),\n    })\n}\n\n#[derive(Clone, Copy, Debug, PartialEq, Eq)]\npub struct SegmentPartialHitSortingKey {\n    sort_value: Option<u64>,\n    sort_value2: Option<u64>,\n    doc_id: DocId,\n    // TODO This should not be there.\n    sort_order: SortOrder,\n    // TODO This should not be there.\n    sort_order2: SortOrder,\n}\n\nimpl Ord for SegmentPartialHitSortingKey {\n    fn cmp(&self, other: &SegmentPartialHitSortingKey) -> Ordering {\n        debug_assert_eq!(\n            self.sort_order, other.sort_order,\n            \"comparing two PartialHitSortingKey of different ordering\"\n        );\n        debug_assert_eq!(\n            self.sort_order2, other.sort_order2,\n            \"comparing two PartialHitSortingKey of different ordering\"\n        );\n        let order = self\n            .sort_order\n            .compare_opt(&self.sort_value, 
&other.sort_value);\n        let order2 = self\n            .sort_order2\n            .compare_opt(&self.sort_value2, &other.sort_value2);\n        let order_addr = self.sort_order.compare(&self.doc_id, &other.doc_id);\n        order.then(order2).then(order_addr)\n    }\n}\n\nimpl PartialOrd for SegmentPartialHitSortingKey {\n    fn partial_cmp(&self, other: &SegmentPartialHitSortingKey) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\n#[derive(Clone, Debug, PartialEq, Eq)]\npub(crate) struct PartialHitSortingKey {\n    sort_value: Option<SortValue>,\n    sort_value2: Option<SortValue>,\n    address: GlobalDocAddress,\n    // TODO remove this\n    sort_order: SortOrder,\n    sort_order2: SortOrder,\n}\n\nimpl Ord for PartialHitSortingKey {\n    fn cmp(&self, other: &PartialHitSortingKey) -> Ordering {\n        assert_eq!(\n            self.sort_order, other.sort_order,\n            \"comparing two PartialHitSortingKey of different ordering\"\n        );\n        assert_eq!(\n            self.sort_order2, other.sort_order2,\n            \"comparing two PartialHitSortingKey of different ordering\"\n        );\n\n        let order = self\n            .sort_order\n            .compare_opt(&self.sort_value, &other.sort_value);\n\n        let order2 = self\n            .sort_order2\n            .compare_opt(&self.sort_value2, &other.sort_value2);\n\n        let order_addr = self.sort_order.compare(&self.address, &other.address);\n\n        order.then(order2).then(order_addr)\n    }\n}\n\nimpl PartialOrd for PartialHitSortingKey {\n    fn partial_cmp(&self, other: &PartialHitSortingKey) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\n#[derive(Clone)]\npub(crate) struct HitSortingMapper {\n    pub order1: SortOrder,\n    pub order2: SortOrder,\n}\n\nimpl SortKeyMapper<PartialHit> for HitSortingMapper {\n    type Key = PartialHitSortingKey;\n    fn get_sort_key(&self, partial_hit: &PartialHit) -> PartialHitSortingKey {\n        
PartialHitSortingKey {\n            sort_value: partial_hit.sort_value.and_then(|v| v.sort_value),\n            sort_value2: partial_hit.sort_value2.and_then(|v| v.sort_value),\n            address: GlobalDocAddress::from_partial_hit(partial_hit),\n            sort_order: self.order1,\n            sort_order2: self.order2,\n        }\n    }\n}\n\nimpl SortKeyMapper<SegmentPartialHit> for HitSortingMapper {\n    type Key = SegmentPartialHitSortingKey;\n    fn get_sort_key(&self, partial_hit: &SegmentPartialHit) -> SegmentPartialHitSortingKey {\n        SegmentPartialHitSortingKey {\n            sort_value: partial_hit.sort_value,\n            sort_value2: partial_hit.sort_value2,\n            doc_id: partial_hit.doc_id,\n            sort_order: self.order1,\n            sort_order2: self.order2,\n        }\n    }\n}\n\n/// Incrementally merge segment results.\n#[derive(Clone)]\npub(crate) struct IncrementalCollector {\n    top_k_hits: TopK<PartialHit, PartialHitSortingKey, HitSortingMapper>,\n    incremental_aggregation: QuickwitIncrementalAggregations,\n    num_hits: u64,\n    failed_splits: Vec<SplitSearchError>,\n    num_attempted_splits: u64,\n    num_successful_splits: u64,\n    start_offset: usize,\n    resource_stats: Option<ResourceStats>,\n}\n\nimpl IncrementalCollector {\n    /// Create a new incremental collector\n    pub(crate) fn new(collector: QuickwitCollector) -> Self {\n        let incremental_aggregation = collector\n            .aggregation\n            .as_ref()\n            .map(QuickwitAggregations::maybe_incremental_aggregator)\n            .unwrap_or(QuickwitIncrementalAggregations::NoAggregation);\n        let (order1, order2) = collector.sort_by.sort_orders();\n        let sort_key_mapper = HitSortingMapper { order1, order2 };\n        IncrementalCollector {\n            top_k_hits: TopK::new(collector.max_hits + collector.start_offset, sort_key_mapper),\n            start_offset: collector.start_offset,\n            
incremental_aggregation,\n            num_hits: 0,\n            failed_splits: Vec::new(),\n            num_attempted_splits: 0,\n            num_successful_splits: 0,\n            resource_stats: None,\n        }\n    }\n\n    /// Merge one search result with the current state\n    pub(crate) fn add_result(&mut self, leaf_response: LeafSearchResponse) -> tantivy::Result<()> {\n        let LeafSearchResponse {\n            num_hits,\n            partial_hits,\n            failed_splits,\n            num_attempted_splits,\n            intermediate_aggregation_result,\n            num_successful_splits,\n            resource_stats,\n        } = leaf_response;\n\n        merge_resource_stats(&resource_stats, &mut self.resource_stats);\n\n        self.num_hits += num_hits;\n        self.top_k_hits.add_entries(partial_hits.into_iter());\n        self.failed_splits.extend(failed_splits);\n        self.num_attempted_splits += num_attempted_splits;\n        self.num_successful_splits += num_successful_splits;\n        if let Some(intermediate_aggregation_result) = intermediate_aggregation_result {\n            self.incremental_aggregation\n                .add(intermediate_aggregation_result)?;\n        }\n        Ok(())\n    }\n\n    /// Add a failed split to the state\n    pub(crate) fn add_failed_split(&mut self, split_error: SplitSearchError) {\n        self.failed_splits.push(split_error)\n    }\n\n    /// Get the worst top-hit. 
Can be used to skip splits if they can't possibly do better.\n    ///\n    /// Only returns a result if enough hits were recorded already.\n    pub(crate) fn peek_worst_hit(&self) -> Option<Cow<'_, PartialHit>> {\n        if self.top_k_hits.max_len() == 0 {\n            return self\n                .incremental_aggregation\n                .virtual_worst_hit()\n                .map(Cow::Owned);\n        }\n\n        if self.top_k_hits.at_capacity() {\n            self.top_k_hits.peek_worst().map(Cow::Borrowed)\n        } else {\n            None\n        }\n    }\n\n    /// Finalize the merge, creating a LeafSearchResponse.\n    pub(crate) fn finalize(self) -> tantivy::Result<LeafSearchResponse> {\n        let intermediate_aggregation_result = self.incremental_aggregation.finalize()?;\n        let mut partial_hits = self.top_k_hits.finalize();\n        if self.start_offset != 0 {\n            partial_hits.drain(0..self.start_offset.min(partial_hits.len()));\n        }\n        Ok(LeafSearchResponse {\n            num_hits: self.num_hits,\n            partial_hits,\n            failed_splits: self.failed_splits,\n            num_attempted_splits: self.num_attempted_splits,\n            num_successful_splits: self.num_successful_splits,\n            intermediate_aggregation_result,\n            resource_stats: self.resource_stats,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::cmp::Ordering;\n\n    use quickwit_proto::search::{\n        LeafSearchResponse, PartialHit, ResourceStats, SearchRequest, SortByValue, SortField,\n        SortOrder, SortValue, SplitSearchError,\n    };\n    use tantivy::TantivyDocument;\n    use tantivy::aggregation::agg_req::Aggregations;\n    use tantivy::aggregation::intermediate_agg_result::IntermediateAggregationResults;\n    use tantivy::collector::Collector;\n\n    use super::{IncrementalCollector, make_merge_collector};\n    use crate::QuickwitAggregations;\n    use 
crate::collector::{merge_intermediate_aggregation_result, top_k_partial_hits};\n\n    #[test]\n    fn test_merge_partial_hits_no_tie() {\n        let make_doc = |sort_value: u64| PartialHit {\n            sort_value: Some(SortValue::U64(sort_value).into()),\n            sort_value2: None,\n            split_id: \"split1\".to_string(),\n            segment_ord: 0u32,\n            doc_id: 0u32,\n        };\n        assert_eq!(\n            top_k_partial_hits(\n                vec![make_doc(1u64), make_doc(3u64), make_doc(2u64),].into_iter(),\n                SortOrder::Asc,\n                SortOrder::Asc,\n                2\n            ),\n            vec![make_doc(1), make_doc(2)]\n        );\n    }\n\n    #[test]\n    fn test_merge_partial_hits_with_tie() {\n        let make_hit_given_split_id = |split_id: u64| PartialHit {\n            sort_value: Some(SortValue::U64(0u64).into()),\n            sort_value2: None,\n            split_id: format!(\"split_{split_id}\"),\n            segment_ord: 0u32,\n            doc_id: 0u32,\n        };\n        assert_eq!(\n            &top_k_partial_hits(\n                vec![\n                    make_hit_given_split_id(1u64),\n                    make_hit_given_split_id(3u64),\n                    make_hit_given_split_id(2u64),\n                ]\n                .into_iter(),\n                SortOrder::Desc,\n                SortOrder::Desc,\n                2\n            ),\n            &[make_hit_given_split_id(3), make_hit_given_split_id(2)]\n        );\n        assert_eq!(\n            &top_k_partial_hits(\n                vec![\n                    make_hit_given_split_id(1u64),\n                    make_hit_given_split_id(3u64),\n                    make_hit_given_split_id(2u64),\n                ]\n                .into_iter(),\n                SortOrder::Asc,\n                SortOrder::Asc,\n                2\n            ),\n            &[make_hit_given_split_id(1), make_hit_given_split_id(2)]\n        );\n    
}\n\n    fn sort_dataset() -> Vec<(Option<u64>, Option<u64>)> {\n        // every combination of 0..=2 + None, in random order.\n        // (2, 1) is duplicated to allow testing for DocId sorting with two sort fields\n        vec![\n            (Some(2), Some(1)),\n            (Some(0), Some(1)),\n            (Some(1), Some(1)),\n            (Some(0), Some(0)),\n            (None, Some(1)),\n            (None, Some(2)),\n            (Some(2), Some(1)),\n            (Some(1), Some(2)),\n            (Some(0), None),\n            (None, Some(0)),\n            (Some(2), Some(0)),\n            (Some(2), Some(2)),\n            (Some(0), Some(2)),\n            (Some(2), None),\n            (None, None),\n            (Some(1), Some(0)),\n            (Some(1), None),\n        ]\n    }\n\n    fn make_request(max_hits: u64, sort_fields: &str) -> SearchRequest {\n        SearchRequest {\n            max_hits,\n            sort_fields: sort_fields\n                .split(',')\n                .filter(|field| !field.is_empty())\n                .map(|field| {\n                    if let Some(field) = field.strip_prefix('-') {\n                        SortField {\n                            field_name: field.to_string(),\n                            sort_order: SortOrder::Asc.into(),\n                            sort_datetime_format: None,\n                        }\n                    } else {\n                        SortField {\n                            field_name: field.to_string(),\n                            sort_order: SortOrder::Desc.into(),\n                            sort_datetime_format: None,\n                        }\n                    }\n                })\n                .collect(),\n            ..SearchRequest::default()\n        }\n    }\n\n    fn make_index() -> tantivy::Index {\n        use tantivy::Index;\n        use tantivy::indexer::UserOperation;\n        use tantivy::schema::{NumericOptions, Schema};\n\n        let dataset = sort_dataset();\n\n 
       let mut schema_builder = Schema::builder();\n        let opts = NumericOptions::default().set_fast();\n\n        schema_builder.add_u64_field(\"sort1\", opts.clone());\n        schema_builder.add_u64_field(\"sort2\", opts);\n        let schema = schema_builder.build();\n\n        let field1 = schema.get_field(\"sort1\").unwrap();\n        let field2 = schema.get_field(\"sort2\").unwrap();\n\n        let index = Index::create_in_ram(schema);\n        let mut index_writer = index.writer(50_000_000).unwrap();\n\n        index_writer\n            .run(\n                dataset\n                    .into_iter()\n                    .map(|(val1, val2)| {\n                        let mut doc = TantivyDocument::new();\n                        if let Some(val1) = val1 {\n                            doc.add_u64(field1, val1);\n                        }\n                        if let Some(val2) = val2 {\n                            doc.add_u64(field2, val2);\n                        }\n                        doc\n                    })\n                    .map(UserOperation::Add),\n            )\n            .unwrap();\n        index_writer.commit().unwrap();\n\n        index\n    }\n\n    #[test]\n    fn test_single_split_sorting() {\n        let index = make_index();\n\n        let reader = index.reader().unwrap();\n        let searcher = reader.searcher();\n\n        // tuple of DocId and sort value\n        type Doc = (usize, (Option<u64>, Option<u64>));\n\n        let mut dataset: Vec<Doc> = sort_dataset().into_iter().enumerate().collect();\n\n        let reverse_int = |val: &Option<u64>| val.as_ref().map(|val| u64::MAX - val);\n        let cmp_doc_id_desc = |a: &Doc, b: &Doc| b.0.cmp(&a.0);\n        let cmp_doc_id_asc = |a: &Doc, b: &Doc| a.0.cmp(&b.0);\n        let cmp_1_desc = |a: &Doc, b: &Doc| b.1.0.cmp(&a.1.0);\n        let cmp_1_asc = |a: &Doc, b: &Doc| reverse_int(&b.1.0).cmp(&reverse_int(&a.1.0));\n        let cmp_2_desc = |a: &Doc, b: &Doc| 
b.1.1.cmp(&a.1.1);\n        let cmp_2_asc = |a: &Doc, b: &Doc| reverse_int(&b.1.1).cmp(&reverse_int(&a.1.1));\n\n        {\n            // The sorting logic isn't easy to wrap one's head around. These simple tests are\n            // here to convince oneself that the comparators do what we want them to do.\n            let mut data = vec![(1, (None, None)), (0, (None, None))];\n            let data_copy = data.clone();\n            data.sort_by(cmp_doc_id_desc);\n            assert_eq!(data, data_copy);\n\n            let mut data = vec![(0, (None, None)), (1, (None, None))];\n            let data_copy = data.clone();\n            data.sort_by(cmp_doc_id_asc);\n            assert_eq!(data, data_copy);\n\n            let mut data = vec![\n                (1, (Some(2), None)),\n                (0, (Some(1), None)),\n                (2, (None, None)),\n            ];\n            let data_copy = data.clone();\n            data.sort_by(cmp_1_desc);\n            assert_eq!(data, data_copy);\n\n            let mut data = vec![\n                (1, (Some(1), None)),\n                (0, (Some(2), None)),\n                (2, (None, None)),\n            ];\n            let data_copy = data.clone();\n            data.sort_by(cmp_1_asc);\n            assert_eq!(data, data_copy);\n\n            let mut data = vec![\n                (1, (None, Some(2))),\n                (0, (None, Some(1))),\n                (2, (None, None)),\n            ];\n            let data_copy = data.clone();\n            data.sort_by(cmp_2_desc);\n            assert_eq!(data, data_copy);\n\n            let mut data = vec![\n                (1, (None, Some(1))),\n                (0, (None, Some(2))),\n                (2, (None, None)),\n            ];\n            let data_copy = data.clone();\n            data.sort_by(cmp_2_asc);\n            assert_eq!(data, data_copy);\n        }\n\n        #[allow(clippy::type_complexity)]\n        let sort_orders: Vec<(_, Box<dyn Fn(&Doc, &Doc) -> Ordering>)> = vec![\n        
    (\"\", Box::new(cmp_doc_id_desc)),\n            (\n                \"sort1\",\n                Box::new(|a, b| cmp_1_desc(a, b).then(cmp_doc_id_desc(a, b))),\n            ),\n            (\n                \"-sort1\",\n                Box::new(|a, b| cmp_1_asc(a, b).then(cmp_doc_id_asc(a, b))),\n            ),\n            (\n                \"sort1,sort2\",\n                Box::new(|a, b| {\n                    cmp_1_desc(a, b).then(cmp_2_desc(a, b).then(cmp_doc_id_desc(a, b)))\n                }),\n            ),\n            (\n                \"-sort1,sort2\",\n                Box::new(|a, b| {\n                    cmp_1_asc(a, b)\n                        .then(cmp_2_desc(a, b))\n                        .then(cmp_doc_id_asc(a, b))\n                }),\n            ),\n            (\n                \"sort1,-sort2\",\n                Box::new(|a, b| cmp_1_desc(a, b).then(cmp_2_asc(a, b).then(cmp_doc_id_desc(a, b)))),\n            ),\n            (\n                \"-sort1,-sort2\",\n                Box::new(|a, b| {\n                    cmp_1_asc(a, b)\n                        .then(cmp_2_asc(a, b))\n                        .then(cmp_doc_id_asc(a, b))\n                }),\n            ),\n        ];\n\n        for (sort_str, sort_function) in sort_orders {\n            dataset.sort_by(sort_function);\n            // Check increasing slice sizes of the dataset\n            for slice_len in 0..dataset.len() {\n                let collector = super::make_collector_for_split(\n                    \"fake_split_id\".to_string(),\n                    &make_request(slice_len as u64, sort_str),\n                    Default::default(),\n                )\n                .unwrap();\n                let res = searcher\n                    .search(&tantivy::query::AllQuery, &collector)\n                    .unwrap();\n                assert_eq!(\n                    res.partial_hits.len(),\n                    slice_len,\n                    \"mismatch slice_len for 
\\\"{sort_str}\\\":{slice_len}\"\n                );\n                for (expected, got) in dataset.iter().zip(res.partial_hits.iter()) {\n                    if expected.0 as u32 != got.doc_id {\n                        let expected_docids = dataset\n                            .iter()\n                            .map(|(docid, val)| {\n                                format!(\"{} {:?} {:?}\", *docid as u32, val.0.clone(), val.1.clone())\n                            })\n                            .collect::<Vec<_>>();\n                        let got_docids = res\n                            .partial_hits\n                            .iter()\n                            .map(|hit| {\n                                format!(\n                                    \"{} {:?} {:?}\",\n                                    hit.doc_id,\n                                    hit.sort_value.and_then(|el| el.sort_value).clone(),\n                                    hit.sort_value2.and_then(|el| el.sort_value).clone()\n                                )\n                            })\n                            .collect::<Vec<_>>();\n                        eprintln!(\"expected: {expected_docids:#?}\");\n                        eprintln!(\"got: {got_docids:#?}\");\n                        panic!(\"mismatch ordering for \\\"{sort_str}\\\":{slice_len}\");\n                    }\n                }\n            }\n        }\n    }\n\n    #[test]\n    fn test_search_after() {\n        let index = make_index();\n\n        let reader = index.reader().unwrap();\n        let searcher = reader.searcher();\n\n        // tuple of DocId and sort value\n        type Doc = (usize, (Option<u64>, Option<u64>));\n\n        let mut dataset: Vec<Doc> = sort_dataset().into_iter().enumerate().collect();\n\n        let reverse_int = |val: &Option<u64>| val.as_ref().map(|val| u64::MAX - val);\n        let cmp_doc_id_desc = |a: &Doc, b: &Doc| b.0.cmp(&a.0);\n        let cmp_1_desc = |a: &Doc, b: &Doc| 
b.1.0.cmp(&a.1.0);\n        let cmp_2_asc = |a: &Doc, b: &Doc| reverse_int(&b.1.1).cmp(&reverse_int(&a.1.1));\n\n        let sort_function =\n            |a: &Doc, b: &Doc| cmp_1_desc(a, b).then(cmp_2_asc(a, b).then(cmp_doc_id_desc(a, b)));\n        dataset.sort_by(sort_function);\n        let partial_sort_value = dataset\n            .iter()\n            .map(|(doc_id, (val1, val2))| PartialHit {\n                split_id: \"fake_split_id\".to_string(),\n                segment_ord: 0,\n                doc_id: *doc_id as u32,\n                sort_value: Some(SortByValue {\n                    sort_value: val1.map(SortValue::U64),\n                }),\n                sort_value2: Some(SortByValue {\n                    sort_value: val2.map(SortValue::U64),\n                }),\n            })\n            .collect::<Vec<_>>();\n        // we eliminate based on sort value\n        for (i, search_after) in partial_sort_value.into_iter().enumerate() {\n            let request = SearchRequest {\n                max_hits: 1000,\n                sort_fields: vec![\n                    SortField {\n                        field_name: \"sort1\".to_string(),\n                        sort_order: SortOrder::Desc.into(),\n                        sort_datetime_format: None,\n                    },\n                    SortField {\n                        field_name: \"sort2\".to_string(),\n                        sort_order: SortOrder::Asc.into(),\n                        sort_datetime_format: None,\n                    },\n                ],\n                search_after: Some(search_after),\n                ..SearchRequest::default()\n            };\n            let collector = super::make_collector_for_split(\n                \"fake_split_id\".to_string(),\n                &request,\n                Default::default(),\n            )\n            .unwrap();\n            let res = searcher\n                .search(&tantivy::query::AllQuery, &collector)\n                
.unwrap();\n            // we count results even if they were removed due to search_after\n            assert_eq!(res.num_hits, dataset.len() as u64);\n            // we get as many results as expected\n            assert_eq!(res.partial_hits.len(), dataset.len() - i - 1);\n            for (expected, got) in dataset[i + 1..].iter().zip(res.partial_hits.iter()) {\n                assert_eq!(expected.0 as u32, got.doc_id,);\n            }\n        }\n\n        // we eliminate based on split id\n        {\n            let search_after = PartialHit {\n                split_id: \"fake_split_id2\".to_string(),\n                segment_ord: 0,\n                doc_id: 5,\n                sort_value: None,\n                sort_value2: None,\n            };\n            let request = SearchRequest {\n                max_hits: 1000,\n                sort_fields: vec![SortField {\n                    field_name: \"_shard_doc\".to_string(),\n                    sort_order: SortOrder::Desc.into(),\n                    sort_datetime_format: None,\n                }],\n                search_after: Some(search_after),\n                ..SearchRequest::default()\n            };\n\n            let collector = super::make_collector_for_split(\n                \"fake_split_id1\".to_string(),\n                &request,\n                Default::default(),\n            )\n            .unwrap();\n            let res = searcher\n                .search(&tantivy::query::AllQuery, &collector)\n                .unwrap();\n            assert_eq!(res.num_hits, dataset.len() as u64);\n            // we are searching split id1, and we remove anything before id2 in descending order\n            // (i.e. 
higher than id2 lexicographically), so every document matches\n            assert_eq!(res.partial_hits.len(), dataset.len());\n\n            let collector = super::make_collector_for_split(\n                \"fake_split_id2\".to_string(),\n                &request,\n                Default::default(),\n            )\n            .unwrap();\n            let res = searcher\n                .search(&tantivy::query::AllQuery, &collector)\n                .unwrap();\n            assert_eq!(res.num_hits, dataset.len() as u64);\n            // we are searching the limit split, but only doc_id in 0..5\n            assert_eq!(res.partial_hits.len(), 5);\n\n            let collector = super::make_collector_for_split(\n                \"fake_split_id3\".to_string(),\n                &request,\n                Default::default(),\n            )\n            .unwrap();\n            let res = searcher\n                .search(&tantivy::query::AllQuery, &collector)\n                .unwrap();\n            assert_eq!(res.num_hits, dataset.len() as u64);\n            // we are searching split id3, and we remove anything before id2 in descending order\n            // (i.e. 
higher than id2 lexicographically), so everything is removed\n            assert_eq!(res.partial_hits.len(), 0);\n        }\n    }\n\n    fn merge_collector_equal_results(\n        request: &SearchRequest,\n        results: Vec<LeafSearchResponse>,\n    ) -> LeafSearchResponse {\n        let collector = make_merge_collector(request, Default::default()).unwrap();\n        let mut incremental_collector = IncrementalCollector::new(collector.clone());\n\n        let result = collector\n            .merge_fruits(results.iter().cloned().map(Ok).collect())\n            .unwrap();\n\n        for split_result in results {\n            incremental_collector.add_result(split_result).unwrap();\n        }\n\n        let incremental_result = incremental_collector.finalize().unwrap();\n        assert_eq!(result, incremental_result);\n        result\n    }\n\n    #[test]\n    fn test_merge_collectors() {\n        let result = merge_collector_equal_results(\n            &SearchRequest {\n                start_offset: 0,\n                max_hits: 2,\n                sort_fields: vec![SortField {\n                    field_name: \"timestamp\".to_string(),\n                    sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: None,\n                }],\n                aggregation_request: None,\n                ..Default::default()\n            },\n            vec![LeafSearchResponse {\n                num_hits: 1234,\n                partial_hits: vec![PartialHit {\n                    split_id: \"1\".to_string(),\n                    segment_ord: 0,\n                    doc_id: 123,\n                    sort_value: Some(SortValue::I64(1234).into()),\n                    sort_value2: None,\n                }],\n                failed_splits: Vec::new(),\n                num_attempted_splits: 3,\n                num_successful_splits: 3,\n                intermediate_aggregation_result: None,\n                resource_stats: None,\n            }],\n      
  );\n\n        assert_eq!(\n            result,\n            LeafSearchResponse {\n                num_hits: 1234,\n                partial_hits: vec![PartialHit {\n                    split_id: \"1\".to_string(),\n                    segment_ord: 0,\n                    doc_id: 123,\n                    sort_value: Some(SortValue::I64(1234).into()),\n                    sort_value2: None,\n                }],\n                failed_splits: Vec::new(),\n                num_attempted_splits: 3,\n                num_successful_splits: 3,\n                intermediate_aggregation_result: None,\n                resource_stats: None,\n            }\n        );\n\n        let result = merge_collector_equal_results(\n            &SearchRequest {\n                start_offset: 0,\n                max_hits: 2,\n                sort_fields: vec![SortField {\n                    field_name: \"timestamp\".to_string(),\n                    sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: None,\n                }],\n                aggregation_request: None,\n                ..Default::default()\n            },\n            vec![\n                LeafSearchResponse {\n                    num_hits: 1234,\n                    partial_hits: vec![\n                        PartialHit {\n                            split_id: \"1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 123,\n                            sort_value: Some(SortValue::I64(1234).into()),\n                            sort_value2: None,\n                        },\n                        PartialHit {\n                            split_id: \"1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 125,\n                            sort_value: Some(SortValue::I64(1236).into()),\n                            sort_value2: None,\n                        },\n                    ],\n                    
failed_splits: Vec::new(),\n                    num_attempted_splits: 3,\n                    num_successful_splits: 3,\n                    intermediate_aggregation_result: None,\n                    resource_stats: None,\n                },\n                LeafSearchResponse {\n                    num_hits: 10,\n                    partial_hits: vec![PartialHit {\n                        split_id: \"2\".to_string(),\n                        segment_ord: 0,\n                        doc_id: 3,\n                        sort_value: Some(SortValue::I64(1235).into()),\n                        sort_value2: None,\n                    }],\n                    failed_splits: vec![SplitSearchError {\n                        error: \"fake error\".to_string(),\n                        split_id: \"3\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 2,\n                    num_successful_splits: 1,\n                    intermediate_aggregation_result: None,\n                    resource_stats: None,\n                },\n            ],\n        );\n\n        assert_eq!(\n            result,\n            LeafSearchResponse {\n                num_hits: 1244,\n                partial_hits: vec![\n                    PartialHit {\n                        split_id: \"1\".to_string(),\n                        segment_ord: 0,\n                        doc_id: 125,\n                        sort_value: Some(SortValue::I64(1236).into()),\n                        sort_value2: None,\n                    },\n                    PartialHit {\n                        split_id: \"2\".to_string(),\n                        segment_ord: 0,\n                        doc_id: 3,\n                        sort_value: Some(SortValue::I64(1235).into()),\n                        sort_value2: None,\n                    },\n                ],\n                failed_splits: vec![SplitSearchError {\n                    error: 
\"fake error\".to_string(),\n                    split_id: \"3\".to_string(),\n                    retryable_error: true,\n                }],\n                num_attempted_splits: 5,\n                num_successful_splits: 4,\n                intermediate_aggregation_result: None,\n                resource_stats: None,\n            }\n        );\n\n        // same request, but we reverse sort order\n        let result = merge_collector_equal_results(\n            &SearchRequest {\n                start_offset: 0,\n                max_hits: 2,\n                sort_fields: vec![SortField {\n                    field_name: \"timestamp\".to_string(),\n                    sort_order: SortOrder::Asc as i32,\n                    sort_datetime_format: None,\n                }],\n                aggregation_request: None,\n                ..Default::default()\n            },\n            vec![\n                LeafSearchResponse {\n                    num_hits: 1234,\n                    partial_hits: vec![\n                        PartialHit {\n                            split_id: \"1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 123,\n                            sort_value: Some(SortValue::I64(1234).into()),\n                            sort_value2: None,\n                        },\n                        PartialHit {\n                            split_id: \"1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 125,\n                            sort_value: Some(SortValue::I64(1236).into()),\n                            sort_value2: None,\n                        },\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 3,\n                    num_successful_splits: 3,\n                    intermediate_aggregation_result: None,\n                    resource_stats: Some(ResourceStats {\n                        
cpu_microsecs: 100,\n                        ..Default::default()\n                    }),\n                },\n                LeafSearchResponse {\n                    num_hits: 10,\n                    partial_hits: vec![PartialHit {\n                        split_id: \"2\".to_string(),\n                        segment_ord: 0,\n                        doc_id: 3,\n                        sort_value: Some(SortValue::I64(1235).into()),\n                        sort_value2: None,\n                    }],\n                    failed_splits: vec![SplitSearchError {\n                        error: \"fake error\".to_string(),\n                        split_id: \"3\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 2,\n                    num_successful_splits: 1,\n                    intermediate_aggregation_result: None,\n                    resource_stats: Some(ResourceStats {\n                        cpu_microsecs: 50,\n                        ..Default::default()\n                    }),\n                },\n            ],\n        );\n\n        assert_eq!(\n            result,\n            LeafSearchResponse {\n                num_hits: 1244,\n                partial_hits: vec![\n                    PartialHit {\n                        split_id: \"1\".to_string(),\n                        segment_ord: 0,\n                        doc_id: 123,\n                        sort_value: Some(SortValue::I64(1234).into()),\n                        sort_value2: None,\n                    },\n                    PartialHit {\n                        split_id: \"2\".to_string(),\n                        segment_ord: 0,\n                        doc_id: 3,\n                        sort_value: Some(SortValue::I64(1235).into()),\n                        sort_value2: None,\n                    },\n                ],\n                failed_splits: vec![SplitSearchError {\n                    error: 
\"fake error\".to_string(),\n                    split_id: \"3\".to_string(),\n                    retryable_error: true,\n                }],\n                num_attempted_splits: 5,\n                num_successful_splits: 4,\n                intermediate_aggregation_result: None,\n                resource_stats: Some(ResourceStats {\n                    cpu_microsecs: 150,\n                    ..Default::default()\n                }),\n            }\n        );\n        // TODO would be nice to test aggregation too.\n    }\n\n    #[test]\n    fn test_merge_empty_intermediate_aggregation_result() {\n        let merged = merge_intermediate_aggregation_result(&None, std::iter::empty()).unwrap();\n        assert!(merged.is_none());\n\n        let aggregations_json = r#\"{\n            \"avg_price\": { \"avg\": { \"field\": \"price\" } }\n        }\"#;\n        let ttv_aggregations: Aggregations = serde_json::from_str(aggregations_json).unwrap();\n        let qw_aggregations = QuickwitAggregations::TantivyAggregations(ttv_aggregations);\n        let serialized =\n            merge_intermediate_aggregation_result(&Some(qw_aggregations), std::iter::empty())\n                .unwrap()\n                .unwrap();\n        let _merged: IntermediateAggregationResults = postcard::from_bytes(&serialized).unwrap();\n        // Hopefully `_merged` is empty but the API does not allow us to assert that.\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse itertools::Itertools;\nuse quickwit_common::rate_limited_error;\nuse quickwit_common::retry::Retryable;\nuse quickwit_doc_mapper::QueryParserError;\nuse quickwit_proto::error::grpc_error_to_grpc_status;\nuse quickwit_proto::metastore::{EntityKind, MetastoreError};\nuse quickwit_proto::search::SplitSearchError;\nuse quickwit_proto::{GrpcServiceError, ServiceError, ServiceErrorCode, tonic};\nuse quickwit_storage::StorageResolverError;\nuse serde::{Deserialize, Serialize};\nuse tantivy::TantivyError;\nuse thiserror::Error;\nuse tokio::task::JoinError;\n\n/// Possible SearchError\n#[allow(missing_docs)]\n#[derive(Error, Debug, Serialize, Deserialize, Clone)]\n#[serde(rename_all = \"snake_case\")]\npub enum SearchError {\n    #[error(\"could not find indexes matching the IDs `{index_ids:?}`\")]\n    IndexesNotFound { index_ids: Vec<String> },\n    #[error(\"internal error: `{0}`\")]\n    Internal(String),\n    #[error(\"invalid aggregation request: {0}\")]\n    InvalidAggregationRequest(String),\n    #[error(\"Invalid argument: {0}\")]\n    InvalidArgument(String),\n    #[error(\"{0}\")]\n    InvalidQuery(String),\n    #[error(\"storage not found: `{0}`\")]\n    StorageResolver(#[from] StorageResolverError),\n    #[error(\"request timed out: {0}\")]\n    Timeout(String),\n    #[error(\"too many requests\")]\n    TooManyRequests,\n    
#[error(\"service unavailable: {0}\")]\n    Unavailable(String),\n}\n\nimpl SearchError {\n    /// Creates an internal `SearchError` from a list of split search errors.\n    pub fn from_split_errors(failed_splits: &[SplitSearchError]) -> Option<SearchError> {\n        let first_failing_split = failed_splits.first()?;\n        let failed_splits = failed_splits\n            .iter()\n            .map(|failed_split| &failed_split.split_id)\n            .join(\", \");\n        let error_msg = format!(\n            \"search failed for the following splits: {failed_splits:}. For instance, split {} \\\n             failed with the following error message: {}\",\n            first_failing_split.split_id, first_failing_split.error,\n        );\n        Some(SearchError::Internal(error_msg))\n    }\n}\n\nimpl ServiceError for SearchError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            Self::IndexesNotFound { .. } => ServiceErrorCode::NotFound,\n            Self::Internal(error_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"search internal error: {error_msg}\");\n                ServiceErrorCode::Internal\n            }\n            Self::InvalidAggregationRequest(_) => ServiceErrorCode::BadRequest,\n            Self::InvalidArgument(_) => ServiceErrorCode::BadRequest,\n            Self::InvalidQuery(_) => ServiceErrorCode::BadRequest,\n            Self::StorageResolver(storage_err) => {\n                rate_limited_error!(\n                    limit_per_min = 6,\n                    \"search's storage resolver internal error: {storage_err}\"\n                );\n                ServiceErrorCode::Internal\n            }\n            Self::Timeout(_) => ServiceErrorCode::Timeout,\n            Self::TooManyRequests => ServiceErrorCode::TooManyRequests,\n            Self::Unavailable(_) => ServiceErrorCode::Unavailable,\n        }\n    }\n}\n\nimpl GrpcServiceError for SearchError {\n    fn new_internal(message: String) 
-> Self {\n        Self::Internal(message)\n    }\n\n    fn new_timeout(message: String) -> Self {\n        Self::Timeout(message)\n    }\n\n    fn new_too_many_requests() -> Self {\n        Self::TooManyRequests\n    }\n\n    fn new_unavailable(message: String) -> Self {\n        Self::Unavailable(message)\n    }\n}\n\nimpl From<SearchError> for tonic::Status {\n    fn from(error: SearchError) -> Self {\n        grpc_error_to_grpc_status(error)\n    }\n}\n\n/// Parses a tonic error and returns a `SearchError`.\npub fn parse_grpc_error(grpc_error: &tonic::Status) -> SearchError {\n    // TODO: the serialization to JSON part is missing.\n    serde_json::from_str(grpc_error.message())\n        .unwrap_or_else(|_| SearchError::Internal(grpc_error.message().to_string()))\n}\n\nimpl From<TantivyError> for SearchError {\n    fn from(tantivy_error: TantivyError) -> Self {\n        SearchError::Internal(format!(\"tantivy error: {tantivy_error}\"))\n    }\n}\n\nimpl From<tokio::time::error::Elapsed> for SearchError {\n    fn from(_elapsed: tokio::time::error::Elapsed) -> Self {\n        SearchError::Timeout(\"timeout exceeded\".to_string())\n    }\n}\n\nimpl From<postcard::Error> for SearchError {\n    fn from(error: postcard::Error) -> Self {\n        SearchError::Internal(format!(\"Postcard error: {error}\"))\n    }\n}\n\nimpl From<serde_json::Error> for SearchError {\n    fn from(serde_error: serde_json::Error) -> Self {\n        SearchError::Internal(format!(\"serde error: {serde_error}\"))\n    }\n}\n\nimpl From<anyhow::Error> for SearchError {\n    fn from(any_error: anyhow::Error) -> Self {\n        SearchError::Internal(any_error.to_string())\n    }\n}\n\nimpl From<QueryParserError> for SearchError {\n    fn from(query_parser_error: QueryParserError) -> Self {\n        SearchError::InvalidQuery(query_parser_error.to_string())\n    }\n}\n\nimpl From<MetastoreError> for SearchError {\n    fn from(metastore_error: MetastoreError) -> SearchError {\n        match 
metastore_error {\n            MetastoreError::NotFound(EntityKind::Index { index_id }) => {\n                SearchError::IndexesNotFound {\n                    index_ids: vec![index_id],\n                }\n            }\n            MetastoreError::NotFound(EntityKind::Indexes { index_ids }) => {\n                SearchError::IndexesNotFound { index_ids }\n            }\n            _ => SearchError::Internal(metastore_error.to_string()),\n        }\n    }\n}\n\nimpl Retryable for SearchError {\n    fn is_retryable(&self) -> bool {\n        matches!(self, SearchError::TooManyRequests | SearchError::Timeout(_))\n    }\n}\n\nimpl From<JoinError> for SearchError {\n    fn from(join_error: JoinError) -> SearchError {\n        SearchError::Internal(format!(\"spawned task in root join failed: {join_error}\"))\n    }\n}\n\nimpl From<std::convert::Infallible> for SearchError {\n    fn from(infallible: std::convert::Infallible) -> SearchError {\n        match infallible {}\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/fetch_docs.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeMap, HashMap};\nuse std::sync::Arc;\n\nuse anyhow::{Context, Ok};\nuse futures::{StreamExt, TryStreamExt};\nuse itertools::Itertools;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::search::{\n    FetchDocsResponse, PartialHit, SnippetRequest, SplitIdAndFooterOffsets,\n};\nuse quickwit_storage::Storage;\nuse tantivy::query::Query;\nuse tantivy::schema::document::CompactDocValue;\nuse tantivy::schema::{Document as DocumentTrait, Field, TantivyDocument, Value};\nuse tantivy::snippet::SnippetGenerator;\nuse tantivy::{ReloadPolicy, Score, Searcher, Term};\nuse tracing::{Instrument, error};\n\nuse crate::leaf::open_index_with_caches;\nuse crate::service::SearcherContext;\nuse crate::{GlobalDocAddress, convert_document_to_json_string};\n\nconst SNIPPET_MAX_NUM_CHARS: usize = 150;\n\n/// Given a list of global doc address, fetches all the documents and\n/// returns them as a hashmap.\nasync fn fetch_docs_to_map(\n    searcher_context: Arc<SearcherContext>,\n    mut global_doc_addrs: Vec<GlobalDocAddress>,\n    index_storage: Arc<dyn Storage>,\n    splits: &[SplitIdAndFooterOffsets],\n    doc_mapper: Arc<DocMapper>,\n    snippet_request_opt: Option<&SnippetRequest>,\n) -> anyhow::Result<HashMap<GlobalDocAddress, Document>> {\n    let mut split_fetch_docs_futures = Vec::new();\n\n    let split_offsets_map: 
HashMap<&str, &SplitIdAndFooterOffsets> = splits\n        .iter()\n        .map(|split| (split.split_id.as_str(), split))\n        .collect();\n\n    // We sort global hit addrs in order to allow for the group-by.\n    global_doc_addrs.sort_by(|a, b| a.split.cmp(&b.split));\n    for (split_id, global_doc_addrs) in global_doc_addrs\n        .iter()\n        .chunk_by(|global_doc_addr| global_doc_addr.split.as_str())\n        .into_iter()\n    {\n        let global_doc_addrs: Vec<GlobalDocAddress> =\n            global_doc_addrs.into_iter().cloned().collect();\n        let split_and_offset = split_offsets_map\n            .get(split_id)\n            .ok_or_else(|| anyhow::anyhow!(\"failed to find offset for split {}\", split_id))?;\n        split_fetch_docs_futures.push(fetch_docs_in_split(\n            searcher_context.clone(),\n            global_doc_addrs,\n            index_storage.clone(),\n            split_and_offset,\n            doc_mapper.clone(),\n            snippet_request_opt,\n        ));\n    }\n\n    let split_fetch_docs: Vec<Vec<(GlobalDocAddress, Document)>> = futures::future::try_join_all(\n        split_fetch_docs_futures,\n    )\n    .await\n    .map_err(|error| {\n        let split_ids = splits\n            .iter()\n            .map(|split| split.split_id.clone())\n            .collect_vec();\n        error!(split_ids = ?split_ids, error = ?error, \"error when fetching docs in splits\");\n        anyhow::anyhow!(\n            \"error when fetching docs for splits {:?}: {:?}\",\n            split_ids,\n            error\n        )\n    })?;\n\n    let global_doc_addr_to_doc_json: HashMap<GlobalDocAddress, Document> = split_fetch_docs\n        .into_iter()\n        .flat_map(|docs| docs.into_iter())\n        .collect();\n\n    Ok(global_doc_addr_to_doc_json)\n}\n\n/// `fetch_docs` step of search.\n///\n/// This function takes a list of partial hits (possibly from different splits)\n/// and the storage associated to an index, fetches the document 
from\n/// the split document stores, and returns the full hits.\npub async fn fetch_docs(\n    searcher_context: Arc<SearcherContext>,\n    partial_hits: Vec<PartialHit>,\n    index_storage: Arc<dyn Storage>,\n    splits: &[SplitIdAndFooterOffsets],\n    doc_mapper: Arc<DocMapper>,\n    snippet_request_opt: Option<&SnippetRequest>,\n) -> anyhow::Result<FetchDocsResponse> {\n    let global_doc_addrs: Vec<GlobalDocAddress> = partial_hits\n        .iter()\n        .map(GlobalDocAddress::from_partial_hit)\n        .collect();\n\n    let mut global_doc_addr_to_doc_json = fetch_docs_to_map(\n        searcher_context,\n        global_doc_addrs,\n        index_storage,\n        splits,\n        doc_mapper,\n        snippet_request_opt,\n    )\n    .await?;\n\n    let hits: Vec<quickwit_proto::search::LeafHit> = partial_hits\n        .into_iter()\n        .flat_map(|partial_hit| {\n            let global_doc_addr = GlobalDocAddress::from_partial_hit(&partial_hit);\n            if let Some((_, document)) = global_doc_addr_to_doc_json.remove_entry(&global_doc_addr)\n            {\n                Some(quickwit_proto::search::LeafHit {\n                    leaf_json: document.content_json,\n                    partial_hit: Some(partial_hit),\n                    leaf_snippet_json: document.snippet_json,\n                })\n            } else {\n                None\n            }\n        })\n        .collect();\n    Ok(FetchDocsResponse { hits })\n}\n\n// number of concurrent fetch allowed for a single split.\nconst NUM_CONCURRENT_REQUESTS: usize = 30;\n\n/// A struct for holding a fetched document's content and snippet.\n#[derive(Debug)]\nstruct Document {\n    content_json: String,\n    snippet_json: Option<String>,\n}\n\n/// Fetching docs from a specific split.\nasync fn fetch_docs_in_split(\n    searcher_context: Arc<SearcherContext>,\n    mut global_doc_addrs: Vec<GlobalDocAddress>,\n    index_storage: Arc<dyn Storage>,\n    split: &SplitIdAndFooterOffsets,\n    
doc_mapper: Arc<DocMapper>,\n    snippet_request_opt: Option<&SnippetRequest>,\n) -> anyhow::Result<Vec<(GlobalDocAddress, Document)>> {\n    global_doc_addrs.sort_by_key(|doc| doc.doc_addr);\n    // Opens the index without the ephemeral unbounded cache, this cache is indeed not useful\n    // when fetching docs as we will fetch them only once.\n    let (mut index, _) = open_index_with_caches(\n        &searcher_context,\n        index_storage,\n        split,\n        Some(doc_mapper.tokenizer_manager()),\n        None,\n    )\n    .await\n    .context(\"open-index-for-split\")?;\n    // we add an executor here, we could add it in open_index_with_caches, though we should verify\n    // the side-effect before\n    let tantivy_executor = crate::search_thread_pool()\n        .get_underlying_rayon_thread_pool()\n        .into();\n    index.set_executor(tantivy_executor);\n    let index_reader = index\n        .reader_builder()\n        // the docs are presorted so a cache size of NUM_CONCURRENT_REQUESTS is fine\n        .doc_store_cache_num_blocks(NUM_CONCURRENT_REQUESTS)\n        .reload_policy(ReloadPolicy::Manual)\n        .try_into()?;\n    let searcher = Arc::new(index_reader.searcher());\n    let fields_snippet_generator_opt = if let Some(snippet_request) = snippet_request_opt {\n        Some(create_fields_snippet_generator(&searcher, doc_mapper.clone(), snippet_request).await?)\n    } else {\n        None\n    };\n\n    let doc_futures = global_doc_addrs.into_iter().map(|global_doc_addr| {\n        let moved_searcher = searcher.clone();\n        let moved_doc_mapper = doc_mapper.clone();\n        let fields_snippet_generator_opt_clone = fields_snippet_generator_opt.clone();\n        async move {\n            let doc: TantivyDocument = moved_searcher\n                .doc_async(global_doc_addr.doc_addr)\n                .await\n                .context(\"searcher-doc-async\")?;\n\n            let named_field_doc = doc.to_named_doc(moved_searcher.schema());\n     
       let content_json = convert_document_to_json_string(named_field_doc, &moved_doc_mapper)?;\n            if fields_snippet_generator_opt_clone.is_none() {\n                return Ok((\n                    global_doc_addr,\n                    Document {\n                        content_json,\n                        snippet_json: None,\n                    },\n                ));\n            }\n\n            let fields_snippet_generator_clone = fields_snippet_generator_opt_clone.unwrap();\n            if fields_snippet_generator_clone.is_empty() {\n                return Ok((\n                    global_doc_addr,\n                    Document {\n                        content_json,\n                        snippet_json: None,\n                    },\n                ));\n            }\n\n            let mut snippets = HashMap::new();\n            for (field, field_values) in doc.get_sorted_field_values() {\n                let field_name = moved_searcher.schema().get_field_name(field);\n                if let Some(values) = fields_snippet_generator_clone\n                    .snippets_from_field_values(field_name, field_values)\n                {\n                    snippets.insert(field_name, values);\n                }\n            }\n            let snippet_json = serde_json::to_string(&snippets)?;\n            Ok((\n                global_doc_addr,\n                Document {\n                    content_json,\n                    snippet_json: Some(snippet_json),\n                },\n            ))\n        }\n        .in_current_span()\n    });\n\n    futures::stream::iter(doc_futures)\n        .buffer_unordered(NUM_CONCURRENT_REQUESTS)\n        .try_collect::<Vec<_>>()\n        .await\n}\n\n// A struct to hold the snippet generators associated to\n// the snippet fields from a search request.\n#[derive(Clone)]\nstruct FieldsSnippetGenerator {\n    field_generators: Arc<HashMap<String, SnippetGenerator>>,\n}\n\nimpl FieldsSnippetGenerator {\n    // 
Returns the snippets from field values.\n    fn snippets_from_field_values(\n        &self,\n        field_name: &str,\n        field_values: Vec<CompactDocValue<'_>>,\n    ) -> Option<Vec<String>> {\n        if let Some(snippet_generator) = self.field_generators.get(field_name) {\n            let values = field_values\n                .into_iter()\n                .filter_map(|value| {\n                    value.as_str().and_then(|text| {\n                        let snippet = snippet_generator.snippet(text);\n                        match snippet.is_empty() {\n                            false => Some(snippet.to_html()),\n                            _ => None,\n                        }\n                    })\n                })\n                .collect();\n            Some(values)\n        } else {\n            None\n        }\n    }\n\n    fn is_empty(&self) -> bool {\n        self.field_generators.is_empty()\n    }\n}\n\n// Creates a `FieldsSnippetGenerator`.\nasync fn create_fields_snippet_generator(\n    searcher: &Searcher,\n    doc_mapper: Arc<DocMapper>,\n    snippet_request: &SnippetRequest,\n) -> anyhow::Result<FieldsSnippetGenerator> {\n    let schema = searcher.schema();\n    let query_ast_resolved = serde_json::from_str(&snippet_request.query_ast_resolved)\n        .context(\"failed to deserialize QueryAst\")?;\n    let (query, _) = doc_mapper.query(schema.clone(), query_ast_resolved, false, None)?;\n    let mut snippet_generators = HashMap::new();\n    for field_name in &snippet_request.snippet_fields {\n        let field = schema.get_field(field_name)?;\n        let snippet_generator = create_snippet_generator(searcher, &query, field).await?;\n        snippet_generators.insert(field_name.clone(), snippet_generator);\n    }\n\n    Ok(FieldsSnippetGenerator {\n        field_generators: Arc::new(snippet_generators),\n    })\n}\n\n// Creates a snippet generator associated with a field.\nasync fn create_snippet_generator(\n    searcher: &Searcher,\n    
query: &dyn Query,\n    field: Field,\n) -> anyhow::Result<SnippetGenerator> {\n    let mut terms: Vec<&Term> = Vec::new();\n    // TODO ok with termset?\n    query.query_terms(&mut |term, _need_position| {\n        if term.field() == field {\n            terms.push(term);\n        }\n    });\n    let mut terms_text: BTreeMap<String, f32> = BTreeMap::default();\n    for term in terms {\n        let value = term.value();\n        let Some(term_str) = value.as_str() else {\n            continue;\n        };\n        let doc_freq = searcher.doc_freq_async(term).await?;\n        if doc_freq > 0 {\n            let score = 1.0 / (1.0 + doc_freq as Score);\n            terms_text.insert(term_str.to_string(), score);\n        }\n    }\n    let tokenizer = searcher.index().tokenizer_for_field(field)?;\n    Ok(SnippetGenerator::new(\n        terms_text,\n        tokenizer,\n        field,\n        SNIPPET_MAX_NUM_CHARS,\n    ))\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/find_trace_ids_collector.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::{Ord, Ordering};\nuse std::collections::HashSet;\n\nuse fnv::{FnvHashMap, FnvHashSet};\nuse itertools::Itertools;\nuse quickwit_proto::search::TraceId;\nuse serde::{Deserialize, Serialize};\nuse tantivy::collector::{Collector, SegmentCollector};\nuse tantivy::columnar::BytesColumn;\nuse tantivy::fastfield::Column;\nuse tantivy::{DateTime, DocId, Score, SegmentReader};\n\ntype TermOrd = u64;\n\n#[derive(Debug, Clone, Serialize, Deserialize)]\n/// Metadata about a single span\npub struct Span {\n    /// The trace id this span is part of\n    pub trace_id: TraceId,\n    /// The start timestamp of the span\n    #[serde(with = \"serde_datetime\")]\n    pub span_timestamp: DateTime,\n}\n\nimpl Span {\n    fn new(trace_id: TraceId, span_timestamp: DateTime) -> Self {\n        Self {\n            trace_id,\n            span_timestamp,\n        }\n    }\n}\n\nimpl Ord for Span {\n    fn cmp(&self, other: &Self) -> Ordering {\n        self.span_timestamp\n            .cmp(&other.span_timestamp)\n            .reverse()\n            .then(self.trace_id.cmp(&other.trace_id))\n    }\n}\n\nimpl PartialOrd for Span {\n    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl PartialEq for Span {\n    fn eq(&self, other: &Self) -> bool {\n        self.cmp(other) == Ordering::Equal\n    
}\n}\n\nimpl Eq for Span {}\n\n#[derive(Debug)]\npub struct TraceIdTermOrd {\n    pub term_ord: TermOrd,\n    pub span_timestamp: DateTime,\n}\n\nimpl TraceIdTermOrd {\n    pub fn new(term_ord: TermOrd, span_timestamp: DateTime) -> Self {\n        Self {\n            term_ord,\n            span_timestamp,\n        }\n    }\n}\n\nimpl Ord for TraceIdTermOrd {\n    fn cmp(&self, other: &Self) -> Ordering {\n        self.span_timestamp\n            .cmp(&other.span_timestamp)\n            .reverse()\n            .then(self.term_ord.cmp(&other.term_ord))\n    }\n}\n\nimpl PartialOrd for TraceIdTermOrd {\n    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl PartialEq for TraceIdTermOrd {\n    fn eq(&self, other: &Self) -> bool {\n        self.cmp(other) == Ordering::Equal\n    }\n}\n\nimpl Eq for TraceIdTermOrd {}\n\n/// Finds the most recent trace ids among a set of matching spans. Multiple spans belonging to the\n/// same trace can be found in the document set. 
As a result, this problem is akin to finding the\n/// top k elements with duplicates.\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\npub struct FindTraceIdsCollector {\n    /// The number of traces to select.\n    pub num_traces: usize,\n    /// The name of the fast field storing the trace IDs.\n    pub trace_id_field_name: String,\n    /// The name of the fast field recording the spans' start timestamp.\n    pub span_timestamp_field_name: String,\n}\n\nimpl FindTraceIdsCollector {\n    /// The names of the fast fields accessed by this collector.\n    pub fn fast_field_names(&self) -> HashSet<String> {\n        HashSet::from_iter([\n            self.trace_id_field_name.clone(),\n            self.span_timestamp_field_name.clone(),\n        ])\n    }\n\n    /// The field names of the term dictionaries accessed by this collector.\n    pub fn term_dict_field_names(&self) -> HashSet<String> {\n        HashSet::from_iter([self.trace_id_field_name.clone()])\n    }\n}\n\nimpl Collector for FindTraceIdsCollector {\n    type Fruit = Vec<Span>;\n    type Child = FindTraceIdsSegmentCollector;\n\n    fn for_segment(\n        &self,\n        _segment_local_id: u32,\n        segment_reader: &SegmentReader,\n    ) -> tantivy::Result<Self::Child> {\n        let trace_id_column = segment_reader\n            .fast_fields()\n            .bytes(&self.trace_id_field_name)?\n            .ok_or_else(|| {\n                let err_msg = format!(\n                    \"failed to find column for trace_id field `{}`\",\n                    self.trace_id_field_name\n                );\n                tantivy::TantivyError::InternalError(err_msg)\n            })?;\n        let span_timestamp_column: Column<DateTime> = segment_reader\n            .fast_fields()\n            .date(&self.span_timestamp_field_name)?;\n        Ok(FindTraceIdsSegmentCollector {\n            trace_id_column,\n            span_timestamp_column,\n            select_trace_ids: 
SelectTraceIds::new(self.num_traces),\n        })\n    }\n\n    fn merge_fruits(\n        &self,\n        segment_fruits: Vec<<Self::Child as SegmentCollector>::Fruit>,\n    ) -> tantivy::Result<Self::Fruit> {\n        Ok(merge_segment_fruits(segment_fruits, self.num_traces))\n    }\n\n    fn requires_scoring(&self) -> bool {\n        false\n    }\n}\n\nfn merge_segment_fruits(mut segment_fruits: Vec<Vec<Span>>, num_traces: usize) -> Vec<Span> {\n    // Spans are ordered in reverse order of their timestamp.\n    for segment_fruit in &mut segment_fruits {\n        segment_fruit.sort_unstable()\n    }\n    let mut spans: Vec<Span> = Vec::with_capacity(num_traces);\n    let mut seen_trace_ids: FnvHashSet<TraceId> = FnvHashSet::default();\n\n    for span in segment_fruits.into_iter().kmerge() {\n        if seen_trace_ids.insert(span.trace_id) {\n            spans.push(span);\n\n            if spans.len() == num_traces {\n                break;\n            }\n        }\n    }\n    spans\n}\n\npub struct FindTraceIdsSegmentCollector {\n    trace_id_column: BytesColumn,\n    span_timestamp_column: Column<DateTime>,\n    select_trace_ids: SelectTraceIds,\n}\n\nimpl FindTraceIdsSegmentCollector {\n    fn trace_id_term_ord(&self, doc: DocId) -> TermOrd {\n        self.trace_id_column\n            .term_ords(doc)\n            .next()\n            .unwrap_or_default()\n    }\n\n    fn span_timestamp(&self, doc: DocId) -> DateTime {\n        self.span_timestamp_column.first(doc).unwrap_or_default()\n    }\n}\n\nimpl SegmentCollector for FindTraceIdsSegmentCollector {\n    type Fruit = Vec<Span>;\n\n    fn collect(&mut self, doc: DocId, _score: Score) {\n        let term_ord = self.trace_id_term_ord(doc);\n        let span_timestamp = self.span_timestamp(doc);\n        self.select_trace_ids.collect(term_ord, span_timestamp);\n    }\n\n    fn harvest(self) -> Self::Fruit {\n        let mut buffer = Vec::with_capacity(TraceId::HEX_LENGTH);\n        self.select_trace_ids\n         
   .harvest()\n            .into_iter()\n            .map(|trace_id_term_ord| {\n                let span_timestamp = trace_id_term_ord.span_timestamp;\n                let found_term = self\n                    .trace_id_column\n                    .ord_to_bytes(trace_id_term_ord.term_ord, &mut buffer)\n                    .expect(\"Failed to lookup trace ID in the column term dictionary\");\n                debug_assert!(found_term);\n                let trace_id = TraceId::try_from(buffer.as_slice())\n                    .expect(\"The term dict should store valid trace IDs.\");\n                Span::new(trace_id, span_timestamp)\n            })\n            .collect()\n    }\n}\n\nstruct SelectTraceIds {\n    num_traces: usize,\n    dedup_workbench: FnvHashMap<TermOrd, DateTime>,\n    select_workbench: Vec<TraceIdTermOrd>,\n    running_term_ord: Option<TermOrd>,\n    running_span_timestamp: DateTime,\n    // This is the lowest timestamp required to enter our top K.\n    span_timestamp_sentinel: DateTime,\n}\n\nimpl SelectTraceIds {\n    fn new(num_traces: usize) -> Self {\n        Self {\n            num_traces,\n            dedup_workbench: FnvHashMap::with_capacity_and_hasher(\n                2 * num_traces,\n                Default::default(),\n            ),\n            select_workbench: Vec::with_capacity(2 * num_traces),\n            running_term_ord: None,\n            running_span_timestamp: DateTime::default(),\n            span_timestamp_sentinel: DateTime::from_timestamp_nanos(i64::MIN),\n        }\n    }\n\n    fn collect(&mut self, term_ord: TermOrd, span_timestamp: DateTime) {\n        if self.running_term_ord.is_none() {\n            self.running_term_ord = Some(term_ord);\n            self.running_span_timestamp = span_timestamp;\n            return;\n        }\n        if self.span_timestamp_sentinel >= span_timestamp {\n            return;\n        }\n        let running_term_ord = self\n            .running_term_ord\n            
.expect(\"The running trace ID should be set.\");\n\n        if running_term_ord == term_ord {\n            self.running_span_timestamp = self.running_span_timestamp.max(span_timestamp);\n        } else {\n            self.dedup(running_term_ord, self.running_span_timestamp);\n            self.truncate();\n            self.running_term_ord = Some(term_ord);\n            self.running_span_timestamp = span_timestamp;\n        }\n    }\n\n    fn dedup(&mut self, term_ord: TermOrd, span_timestamp: DateTime) {\n        self.dedup_workbench\n            .entry(term_ord)\n            .and_modify(|entry| {\n                if *entry < span_timestamp {\n                    *entry = span_timestamp\n                }\n            })\n            .or_insert(span_timestamp);\n    }\n\n    fn select(&mut self) {\n        if self.num_traces == 0 || self.dedup_workbench.is_empty() {\n            return;\n        }\n        self.select_workbench.clear();\n\n        for (term_ord, span_timestamp) in self.dedup_workbench.drain() {\n            let trace_id = TraceIdTermOrd::new(term_ord, span_timestamp);\n            self.select_workbench.push(trace_id);\n        }\n        let select_len = self.num_traces.min(self.select_workbench.len());\n        let select_index = select_len - 1;\n        self.select_workbench.select_nth_unstable(select_index);\n        self.select_workbench.truncate(select_len);\n        self.span_timestamp_sentinel = self.select_workbench[select_index].span_timestamp;\n    }\n\n    fn truncate(&mut self) {\n        if self.dedup_workbench.len() < 2 * self.num_traces {\n            return;\n        }\n        self.select();\n        for trace_id in self.select_workbench.drain(..self.num_traces) {\n            self.dedup_workbench\n                .insert(trace_id.term_ord, trace_id.span_timestamp);\n        }\n    }\n\n    fn harvest(mut self) -> Vec<TraceIdTermOrd> {\n        if let Some(running_term_ord) = self.running_term_ord.take() {\n            
self.dedup(running_term_ord, self.running_span_timestamp);\n        }\n        self.select();\n        self.select_workbench\n    }\n}\n\nmod serde_datetime {\n    use serde::{Deserialize, Deserializer, Serializer};\n    use tantivy::DateTime;\n\n    pub(crate) fn serialize<S>(datetime: &DateTime, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        serializer.serialize_i64(datetime.into_timestamp_nanos())\n    }\n\n    pub(crate) fn deserialize<'de, D>(deserializer: D) -> Result<DateTime, D::Error>\n    where D: Deserializer<'de> {\n        let datetime_i64: i64 = Deserialize::deserialize(deserializer)?;\n        Ok(DateTime::from_timestamp_nanos(datetime_i64))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::DateTime;\n    use tantivy::time::OffsetDateTime;\n\n    use super::*;\n    use crate::collector::QuickwitAggregations;\n\n    impl Span {\n        fn for_test(bytes: &[u8], span_timestamp_nanos: i64) -> Self {\n            let mut trace_id = [0u8; 16];\n            trace_id[..bytes.len()].copy_from_slice(bytes);\n            let span_timestamp = DateTime::from_timestamp_nanos(span_timestamp_nanos);\n            Self::new(TraceId::new(trace_id), span_timestamp)\n        }\n    }\n\n    impl TraceIdTermOrd {\n        fn for_test(term_ord: TermOrd, span_timestamp_nanos: i64) -> Self {\n            Self {\n                term_ord,\n                span_timestamp: DateTime::from_timestamp_nanos(span_timestamp_nanos),\n            }\n        }\n    }\n\n    impl SelectTraceIds {\n        fn collect_for_test(&mut self, term_ord: TermOrd, span_timestamp_nanos: i64) {\n            let span_timestamp = DateTime::from_timestamp_nanos(span_timestamp_nanos);\n            self.collect(term_ord, span_timestamp)\n        }\n    }\n\n    #[test]\n    fn test_find_trace_ids_collector_serde() {\n        let collector_json = serde_json::to_string(&FindTraceIdsCollector {\n            num_traces: 10,\n            trace_id_field_name: 
\"trace_id\".to_string(),\n            span_timestamp_field_name: \"span_timestamp\".to_string(),\n        })\n        .unwrap();\n        let aggregation: QuickwitAggregations = serde_json::from_str(&collector_json).unwrap();\n        let QuickwitAggregations::FindTraceIdsAggregation(collector) = aggregation else {\n            panic!(\"Expected FindTraceIdsAggregation\");\n        };\n        assert_eq!(collector.num_traces, 10);\n        assert_eq!(collector.trace_id_field_name, \"trace_id\");\n        assert_eq!(collector.span_timestamp_field_name, \"span_timestamp\");\n    }\n\n    #[test]\n    fn test_span_serde() {\n        let span_timestamp_nanos = OffsetDateTime::now_utc().unix_timestamp_nanos() as i64;\n        let expected_span = Span::for_test(b\"trace_id\", span_timestamp_nanos);\n        let span_json = serde_json::to_string(&expected_span).unwrap();\n        let span = serde_json::from_str::<Span>(&span_json).unwrap();\n        assert_eq!(span, expected_span);\n    }\n\n    #[test]\n    fn test_select_trace_ids() {\n        {\n            let select_trace_ids = SelectTraceIds::new(0);\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n            assert_eq!(trace_ids, &[]);\n        }\n        {\n            let select_trace_ids = SelectTraceIds::new(3);\n\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n\n            assert_eq!(trace_ids, &[]);\n        }\n        {\n            let mut select_trace_ids = SelectTraceIds::new(0);\n            select_trace_ids.collect_for_test(0, 0);\n\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n\n            assert_eq!(trace_ids, &[]);\n        }\n        {\n            let mut select_trace_ids = SelectTraceIds::new(3);\n            select_trace_ids.collect_for_test(0, 0);\n\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n\n            
assert_eq!(trace_ids, &[TraceIdTermOrd::for_test(0, 0)]);\n        }\n        {\n            let mut select_trace_ids = SelectTraceIds::new(3);\n            select_trace_ids.collect_for_test(0, 1);\n            select_trace_ids.collect_for_test(0, 0);\n\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n\n            assert_eq!(trace_ids, &[TraceIdTermOrd::for_test(0, 1)]);\n        }\n        {\n            let mut select_trace_ids = SelectTraceIds::new(3);\n            select_trace_ids.collect_for_test(0, 2);\n            select_trace_ids.collect_for_test(1, 1);\n            select_trace_ids.collect_for_test(2, 0);\n\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n\n            assert_eq!(\n                trace_ids,\n                &[\n                    TraceIdTermOrd::for_test(0, 2),\n                    TraceIdTermOrd::for_test(1, 1),\n                    TraceIdTermOrd::for_test(2, 0),\n                ]\n            );\n        }\n        {\n            let mut select_trace_ids = SelectTraceIds::new(3);\n            select_trace_ids.collect_for_test(0, 7);\n            select_trace_ids.collect_for_test(1, 6);\n            select_trace_ids.collect_for_test(2, 5);\n            select_trace_ids.collect_for_test(3, 4);\n            select_trace_ids.collect_for_test(4, 3);\n            select_trace_ids.collect_for_test(5, 2);\n            select_trace_ids.collect_for_test(6, 1);\n            select_trace_ids.collect_for_test(7, 0);\n\n            assert_eq!(select_trace_ids.select_workbench.capacity(), 6);\n\n            let mut trace_ids = select_trace_ids.harvest();\n            trace_ids.sort();\n\n            assert_eq!(\n                trace_ids,\n                &[\n                    TraceIdTermOrd::for_test(0, 7),\n                    TraceIdTermOrd::for_test(1, 6),\n                    TraceIdTermOrd::for_test(2, 5),\n                ]\n            );\n        }\n 
   }\n\n    #[test]\n    fn test_merge_segment_fruits() {\n        {\n            let segment_fruits = Vec::new();\n            let merged_fruit = merge_segment_fruits(segment_fruits, 0);\n            assert_eq!(merged_fruit, &[]);\n        }\n        {\n            let segment_fruits = vec![vec![Span::for_test(b\"foo\", 0), Span::for_test(b\"foo\", 1)]];\n            let merged_fruit = merge_segment_fruits(segment_fruits, 3);\n            assert_eq!(merged_fruit, &[Span::for_test(b\"foo\", 1)]);\n        }\n        {\n            let segment_fruits = vec![\n                vec![Span::for_test(b\"foo\", 0), Span::for_test(b\"foo\", 1)],\n                vec![Span::for_test(b\"foo\", 1), Span::for_test(b\"foo\", 2)],\n            ];\n            let merged_fruit = merge_segment_fruits(segment_fruits, 3);\n            assert_eq!(merged_fruit, &[Span::for_test(b\"foo\", 2)]);\n        }\n        {\n            let segment_fruits = vec![\n                vec![\n                    Span::for_test(b\"foo\", 0),\n                    Span::for_test(b\"foo\", 1),\n                    Span::for_test(b\"foo\", 2),\n                ],\n                vec![Span::for_test(b\"foo\", 2), Span::for_test(b\"bar\", 2)],\n                vec![Span::for_test(b\"foo\", 2), Span::for_test(b\"bar\", 3)],\n            ];\n            let merged_fruit = merge_segment_fruits(segment_fruits, 3);\n            assert_eq!(\n                merged_fruit,\n                &[Span::for_test(b\"bar\", 3), Span::for_test(b\"foo\", 2)]\n            );\n        }\n        {\n            let segment_fruits = vec![\n                vec![\n                    Span::for_test(b\"foo\", 0),\n                    Span::for_test(b\"foo\", 1),\n                    Span::for_test(b\"foo\", 2),\n                ],\n                vec![Span::for_test(b\"foo\", 2), Span::for_test(b\"bar\", 2)],\n                vec![Span::for_test(b\"foo\", 2), Span::for_test(b\"bar\", 3)],\n                
vec![Span::for_test(b\"qux\", 4)],\n            ];\n            let merged_fruit = merge_segment_fruits(segment_fruits, 3);\n            assert_eq!(\n                merged_fruit,\n                &[\n                    Span::for_test(b\"qux\", 4),\n                    Span::for_test(b\"bar\", 3),\n                    Span::for_test(b\"foo\", 2)\n                ]\n            );\n        }\n    }\n\n    use proptest::prelude::*;\n\n    fn span_strategy() -> impl Strategy<Value = Span> {\n        let trace_id_strat = proptest::array::uniform16(any::<u8>());\n        let span_timestamp_strat = any::<i64>();\n        (trace_id_strat, span_timestamp_strat).prop_map(|(trace_id, span_timestamp)| {\n            Span::new(\n                TraceId::new(trace_id),\n                tantivy::DateTime::from_timestamp_nanos(span_timestamp),\n            )\n        })\n    }\n\n    fn test_postcard_aux<I: Serialize + std::fmt::Debug + for<'a> Deserialize<'a> + Eq>(item: &I) {\n        let payload = postcard::to_allocvec(item).unwrap();\n        let deserialized_item: I = postcard::from_bytes(&payload).unwrap();\n        assert_eq!(item, &deserialized_item);\n    }\n\n    #[test]\n    fn test_proptest_spans_postcard_empty_vec() {\n        test_postcard_aux(&Vec::<Span>::new());\n    }\n\n    #[test]\n    fn test_proptest_spans_postcard_extreme_values() {\n        test_postcard_aux(&vec![Span {\n            trace_id: TraceId::new([255u8; 16]),\n            span_timestamp: tantivy::DateTime::from_timestamp_nanos(i64::MIN),\n        }]);\n    }\n\n    proptest::proptest! {\n\n        #[test]\n        fn test_proptest_spans_postcard_serdeser(span in span_strategy()) {\n            test_postcard_aux(&span);\n        }\n\n        #[test]\n        fn test_proptest_spans_vec_postcard_serdeser(spans in proptest::collection::vec(span_strategy(), 0..100)) {\n            test_postcard_aux(&spans);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/invoker.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! Trait for invoking remote serverless functions for leaf search.\n\nuse async_trait::async_trait;\nuse quickwit_proto::search::{LambdaSingleSplitResult, LeafSearchRequest};\n\nuse crate::SearchError;\n\n/// Trait for invoking remote serverless functions (e.g., AWS Lambda) for leaf search.\n///\n/// This abstraction allows different cloud providers to be supported.\n/// Implementations are provided by the `quickwit-lambda` crate.\n#[async_trait]\npub trait LambdaLeafSearchInvoker: Send + Sync + 'static {\n    /// Invoke the remote function with a LeafSearchRequest.\n    ///\n    /// Returns one `LambdaSingleSplitResult` per split in the request.\n    /// Each result is tagged with its split_id so ordering is irrelevant.\n    /// Individual split failures are reported per-split; the outer `Result`\n    /// only represents transport-level errors.\n    async fn invoke_leaf_search(\n        &self,\n        request: LeafSearchRequest,\n    ) -> Result<Vec<LambdaSingleSplitResult>, SearchError>;\n}\n\n#[async_trait]\nimpl<T> LambdaLeafSearchInvoker for Box<T>\nwhere T: LambdaLeafSearchInvoker + ?Sized\n{\n    async fn invoke_leaf_search(\n        &self,\n        request: LeafSearchRequest,\n    ) -> Result<Vec<LambdaSingleSplitResult>, SearchError> {\n        (**self).invoke_leaf_search(request).await\n    }\n}\n\n#[async_trait]\nimpl<T> 
LambdaLeafSearchInvoker for std::sync::Arc<T>\nwhere T: LambdaLeafSearchInvoker + ?Sized\n{\n    async fn invoke_leaf_search(\n        &self,\n        request: LeafSearchRequest,\n    ) -> Result<Vec<LambdaSingleSplitResult>, SearchError> {\n        (**self).invoke_leaf_search(request).await\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/leaf.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Reverse;\nuse std::collections::binary_heap::PeekMut;\nuse std::collections::{BinaryHeap, HashMap, HashSet};\nuse std::num::NonZeroUsize;\nuse std::ops::Bound;\nuse std::path::PathBuf;\nuse std::str::FromStr;\nuse std::sync::{Arc, Mutex, RwLock};\nuse std::time::{Duration, Instant};\n\nuse anyhow::Context;\nuse bytesize::ByteSize;\nuse futures::future::try_join_all;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_common::uri::Uri;\nuse quickwit_directories::{CachingDirectory, HotDirectory, StorageDirectory};\nuse quickwit_doc_mapper::{Automaton, DocMapper, FastFieldWarmupInfo, TermRange, WarmupInfo};\nuse quickwit_proto::search::lambda_single_split_result::Outcome;\nuse quickwit_proto::search::{\n    CountHits, LeafSearchRequest, LeafSearchResponse, PartialHit, ResourceStats, SearchRequest,\n    SortOrder, SortValue, SplitIdAndFooterOffsets, SplitSearchError,\n};\nuse quickwit_query::query_ast::{\n    BoolQuery, CacheNode, QueryAst, QueryAstTransformer, RangeQuery, TermQuery,\n};\nuse quickwit_query::tokenizers::TokenizerManager;\nuse quickwit_storage::{\n    BundleStorage, ByteRangeCache, MemorySizedCache, OwnedBytes, SplitCache, Storage,\n    StorageResolver, TimeoutAndRetryStorage, wrap_storage_with_cache,\n};\nuse tantivy::aggregation::AggContextParams;\nuse tantivy::aggregation::agg_req::{AggregationVariants, 
Aggregations};\nuse tantivy::collector::Collector;\nuse tantivy::directory::FileSlice;\nuse tantivy::fastfield::FastFieldReaders;\nuse tantivy::schema::Field;\nuse tantivy::{DateTime, Index, ReloadPolicy, Searcher, TantivyError, Term};\nuse tokio::task::{JoinError, JoinSet};\nuse tracing::*;\n\nuse crate::collector::{IncrementalCollector, make_collector_for_split, make_merge_collector};\nuse crate::leaf_cache::LeafSearchCache;\nuse crate::metrics::SplitSearchOutcomeCounters;\nuse crate::root::is_metadata_count_request_with_ast;\nuse crate::search_permit_provider::{\n    SearchPermit, SearchPermitFuture, compute_initial_memory_allocation,\n};\nuse crate::service::{SearcherContext, deserialize_doc_mapper};\nuse crate::{QuickwitAggregations, SearchError};\n\n/// Distributes items across batches using a greedy LPT (Longest Processing Time)\n/// algorithm to balance total weight across batches.\n///\n/// Items are sorted by weight descending, then each item is assigned to the\n/// batch with the smallest current total weight. 
This produces a good\n/// approximation of balanced batches.\nfn greedy_batch_split<T>(\n    items: Vec<T>,\n    weight_fn: impl Fn(&T) -> u64,\n    max_items_per_batch: NonZeroUsize,\n) -> Vec<Vec<T>> {\n    if items.is_empty() {\n        return Vec::new();\n    }\n\n    let num_items = items.len();\n    let max_items_per_batch: usize = max_items_per_batch.get();\n    let num_batches = num_items.div_ceil(max_items_per_batch);\n\n    // Compute weights, then sort descending by weight\n    let mut weighted_items: Vec<(u64, T)> = Vec::with_capacity(num_items);\n    for item in items {\n        let weight = weight_fn(&item);\n        weighted_items.push((weight, item));\n    }\n    weighted_items.sort_unstable_by_key(|(weight, _)| std::cmp::Reverse(*weight));\n\n    let mut batches: Vec<Vec<T>> = std::iter::repeat_with(Vec::new).take(num_batches).collect();\n\n    // Min-heap of (weight, item_count, batch_index).\n    // Reverse turns BinaryHeap into a min-heap.\n    // Ties break naturally: lighter weight → fewer items → lower index.\n    let mut heap: BinaryHeap<Reverse<(u64, usize, usize)>> = BinaryHeap::with_capacity(num_batches);\n    for batch_idx in 0..num_batches {\n        heap.push(Reverse((0, 0, batch_idx)));\n    }\n\n    // Greedily assign each item to the lightest batch.\n    // Full batches are removed via PeekMut::pop().\n    for (weight, item) in weighted_items {\n        let mut top = heap.peek_mut().unwrap();\n        let Reverse((ref mut batch_weight, ref mut batch_count, batch_idx)) = *top;\n        batches[batch_idx].push(item);\n        *batch_weight += weight;\n        *batch_count += 1;\n        if *batch_count >= max_items_per_batch {\n            PeekMut::pop(top);\n        }\n    }\n\n    batches\n}\n\nasync fn get_split_footer_from_cache_or_fetch(\n    index_storage: Arc<dyn Storage>,\n    split_and_footer_offsets: &SplitIdAndFooterOffsets,\n    footer_cache: &MemorySizedCache<String>,\n) -> anyhow::Result<OwnedBytes> {\n    {\n        let 
possible_val = footer_cache.get(&split_and_footer_offsets.split_id);\n        if let Some(footer_data) = possible_val {\n            return Ok(footer_data);\n        }\n    }\n    let split_file = PathBuf::from(format!(\"{}.split\", split_and_footer_offsets.split_id));\n    let footer_data_opt = index_storage\n        .get_slice(\n            &split_file,\n            split_and_footer_offsets.split_footer_start as usize\n                ..split_and_footer_offsets.split_footer_end as usize,\n        )\n        .await\n        .with_context(|| {\n            format!(\n                \"failed to fetch hotcache and footer from {} for split `{}`\",\n                index_storage.uri(),\n                split_and_footer_offsets.split_id\n            )\n        })?;\n\n    footer_cache.put(\n        split_and_footer_offsets.split_id.to_owned(),\n        footer_data_opt.clone(),\n    );\n\n    Ok(footer_data_opt)\n}\n\n/// Returns hotcache_bytes and the split directory (`BundleStorage`) with cache layer:\n/// - A split footer cache given by `SearcherContext.split_footer_cache`.\npub(crate) async fn open_split_bundle(\n    searcher_context: &SearcherContext,\n    index_storage: Arc<dyn Storage>,\n    split_and_footer_offsets: &SplitIdAndFooterOffsets,\n) -> anyhow::Result<(FileSlice, BundleStorage)> {\n    let split_file = PathBuf::from(format!(\"{}.split\", split_and_footer_offsets.split_id));\n    let footer_data = get_split_footer_from_cache_or_fetch(\n        index_storage.clone(),\n        split_and_footer_offsets,\n        &searcher_context.split_footer_cache,\n    )\n    .await?;\n\n    // We wrap the top-level storage with the split cache.\n    // This is before the bundle storage: at this point, this storage is reading `.split` files.\n    let index_storage_with_split_cache =\n        if let Some(split_cache) = searcher_context.split_cache_opt.as_ref() {\n            SplitCache::wrap_storage(split_cache.clone(), index_storage.clone())\n        } else {\n           
 index_storage.clone()\n        };\n\n    let (hotcache_bytes, bundle_storage) = BundleStorage::open_from_split_data(\n        index_storage_with_split_cache,\n        split_file,\n        FileSlice::new(Arc::new(footer_data)),\n    )?;\n\n    Ok((hotcache_bytes, bundle_storage))\n}\n\n/// Add a storage proxy to retry `get_slice` requests if they are taking too long,\n/// if configured in the searcher config.\n///\n/// The goal here is to ensure low latency.\nfn configure_storage_retries(\n    searcher_context: &SearcherContext,\n    index_storage: Arc<dyn Storage>,\n) -> Arc<dyn Storage> {\n    if let Some(storage_timeout_policy) = &searcher_context.searcher_config.storage_timeout_policy {\n        Arc::new(TimeoutAndRetryStorage::new(\n            index_storage,\n            storage_timeout_policy.clone(),\n        ))\n    } else {\n        index_storage\n    }\n}\n\n/// Opens a `tantivy::Index` for the given split with several cache layers:\n/// - A split footer cache given by `SearcherContext.split_footer_cache`.\n/// - A fast fields cache given by `SearcherContext.fast_fields_cache`.\n/// - An ephemeral unbounded cache directory (whose lifetime is tied to the returned `Index` if no\n///   `ByteRangeCache` is provided).\npub(crate) async fn open_index_with_caches(\n    searcher_context: &SearcherContext,\n    index_storage: Arc<dyn Storage>,\n    split_and_footer_offsets: &SplitIdAndFooterOffsets,\n    tokenizer_manager: Option<&TokenizerManager>,\n    ephemeral_unbounded_cache: Option<ByteRangeCache>,\n) -> anyhow::Result<(Index, HotDirectory)> {\n    let index_storage_with_retry_on_timeout =\n        configure_storage_retries(searcher_context, index_storage);\n\n    let (hotcache_bytes, bundle_storage) = open_split_bundle(\n        searcher_context,\n        index_storage_with_retry_on_timeout,\n        split_and_footer_offsets,\n    )\n    .await?;\n\n    let bundle_storage_with_cache = wrap_storage_with_cache(\n        
searcher_context.fast_fields_cache.clone(),\n        Arc::new(bundle_storage),\n    );\n\n    let directory = StorageDirectory::new(bundle_storage_with_cache);\n\n    let hot_directory = if let Some(cache) = ephemeral_unbounded_cache {\n        let caching_directory = CachingDirectory::new(Arc::new(directory), cache);\n        HotDirectory::open(caching_directory, hotcache_bytes.read_bytes()?)?\n    } else {\n        HotDirectory::open(directory, hotcache_bytes.read_bytes()?)?\n    };\n\n    let mut index = Index::open(hot_directory.clone())?;\n    if let Some(tokenizer_manager) = tokenizer_manager {\n        index.set_tokenizers(tokenizer_manager.tantivy_manager().clone());\n    }\n    index.set_fast_field_tokenizers(\n        quickwit_query::get_quickwit_fastfield_normalizer_manager()\n            .tantivy_manager()\n            .clone(),\n    );\n    Ok((index, hot_directory))\n}\n\n/// Tantivy search does not make it possible to fetch data asynchronously during\n/// search.\n///\n/// All required data must therefore be downloaded in advance.\n/// This is the role of the `warmup` function.\n///\n/// The downloaded data depends on the query (which terms' posting lists are required,\n/// and whether positions are required too), and on the collector.\n///\n/// * `query` - query is used to extract the terms and their fields which will be loaded from the\n/// inverted_index.\n///\n/// * `term_dict_field_names` - A list of fields, where the whole dictionary needs to be loaded.\n/// This is e.g. 
required for term aggregation, since we don't know in advance which terms are going\n/// to be hit.\n#[instrument(skip_all)]\npub(crate) async fn warmup(searcher: &Searcher, warmup_info: &WarmupInfo) -> anyhow::Result<()> {\n    debug!(warmup_info=?warmup_info);\n    let warm_up_terms_future = warm_up_terms(searcher, &warmup_info.terms_grouped_by_field)\n        .instrument(debug_span!(\"warm_up_terms\"));\n    let warm_up_term_ranges_future =\n        warm_up_term_ranges(searcher, &warmup_info.term_ranges_grouped_by_field)\n            .instrument(debug_span!(\"warm_up_term_ranges\"));\n    let warm_up_term_dict_future =\n        warm_up_term_dict_fields(searcher, &warmup_info.term_dict_fields)\n            .instrument(debug_span!(\"warm_up_term_dicts\"));\n    let warm_up_fastfields_future = warm_up_fastfields(searcher, &warmup_info.fast_fields)\n        .instrument(debug_span!(\"warm_up_fastfields\"));\n    let warm_up_fieldnorms_future = warm_up_fieldnorms(searcher, warmup_info.field_norms)\n        .instrument(debug_span!(\"warm_up_fieldnorms\"));\n    // TODO merge warm_up_postings into warm_up_term_dict_fields\n    let warm_up_postings_future = warm_up_postings(searcher, &warmup_info.term_dict_fields)\n        .instrument(debug_span!(\"warm_up_postings\"));\n    let warm_up_automatons_future =\n        warm_up_automatons(searcher, &warmup_info.automatons_grouped_by_field)\n            .instrument(debug_span!(\"warm_up_automatons\"));\n\n    tokio::try_join!(\n        warm_up_terms_future,\n        warm_up_term_ranges_future,\n        warm_up_fastfields_future,\n        warm_up_term_dict_future,\n        warm_up_fieldnorms_future,\n        warm_up_postings_future,\n        warm_up_automatons_future,\n    )?;\n\n    Ok(())\n}\n\nasync fn warm_up_term_dict_fields(\n    searcher: &Searcher,\n    term_dict_fields: &HashSet<Field>,\n) -> anyhow::Result<()> {\n    let mut warm_up_futures = Vec::new();\n    for field in term_dict_fields {\n        for segment_reader 
in searcher.segment_readers() {\n            let inverted_index = segment_reader.inverted_index(*field)?.clone();\n            warm_up_futures.push(async move {\n                let dict = inverted_index.terms();\n                dict.warm_up_dictionary().await\n            });\n        }\n    }\n    try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nasync fn warm_up_postings(searcher: &Searcher, fields: &HashSet<Field>) -> anyhow::Result<()> {\n    let mut warm_up_futures = Vec::new();\n    for field in fields {\n        for segment_reader in searcher.segment_readers() {\n            let inverted_index = segment_reader.inverted_index(*field)?.clone();\n            warm_up_futures.push(async move { inverted_index.warm_postings_full(false).await });\n        }\n    }\n    try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nasync fn warm_up_fastfield(\n    fast_field_reader: &FastFieldReaders,\n    fast_field: &FastFieldWarmupInfo,\n) -> anyhow::Result<()> {\n    let mut columns = fast_field_reader\n        .list_dynamic_column_handles(&fast_field.name)\n        .await?;\n    if fast_field.with_subfields {\n        let subpath_columns = fast_field_reader\n            .list_subpath_dynamic_column_handles(&fast_field.name)\n            .await?;\n        columns.extend(subpath_columns);\n    }\n    futures::future::try_join_all(\n        columns\n            .into_iter()\n            .map(|col| async move { col.file_slice().read_bytes_async().await }),\n    )\n    .await?;\n    Ok(())\n}\n\n/// Populates the short-lived cache with the data for\n/// all of the fast fields passed as argument.\nasync fn warm_up_fastfields(\n    searcher: &Searcher,\n    fast_fields: &HashSet<FastFieldWarmupInfo>,\n) -> anyhow::Result<()> {\n    let mut warm_up_futures = Vec::new();\n    for segment_reader in searcher.segment_readers() {\n        let fast_field_reader = segment_reader.fast_fields();\n        for fast_field in fast_fields {\n            let warm_up_fut = 
warm_up_fastfield(fast_field_reader, fast_field);\n            warm_up_futures.push(Box::pin(warm_up_fut));\n        }\n    }\n    futures::future::try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nasync fn warm_up_terms(\n    searcher: &Searcher,\n    terms_grouped_by_field: &HashMap<Field, HashMap<Term, bool>>,\n) -> anyhow::Result<()> {\n    let mut warm_up_futures = Vec::new();\n    for (field, terms) in terms_grouped_by_field {\n        for segment_reader in searcher.segment_readers() {\n            let inv_idx = segment_reader.inverted_index(*field)?;\n            for (term, position_needed) in terms.iter() {\n                let inv_idx_clone = inv_idx.clone();\n                warm_up_futures\n                    .push(async move { inv_idx_clone.warm_postings(term, *position_needed).await });\n            }\n        }\n    }\n    try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nasync fn warm_up_term_ranges(\n    searcher: &Searcher,\n    terms_grouped_by_field: &HashMap<Field, HashMap<TermRange, bool>>,\n) -> anyhow::Result<()> {\n    let mut warm_up_futures = Vec::new();\n    for (field, terms) in terms_grouped_by_field {\n        for segment_reader in searcher.segment_readers() {\n            let inv_idx = segment_reader.inverted_index(*field)?;\n            for (term_range, position_needed) in terms.iter() {\n                let inv_idx_clone = inv_idx.clone();\n                let range = (term_range.start.as_ref(), term_range.end.as_ref());\n                warm_up_futures.push(async move {\n                    inv_idx_clone\n                        .warm_postings_range(range, term_range.limit, *position_needed)\n                        .await\n                });\n            }\n        }\n    }\n    try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nasync fn warm_up_automatons(\n    searcher: &Searcher,\n    terms_grouped_by_field: &HashMap<Field, HashSet<Automaton>>,\n) -> anyhow::Result<()> {\n    let mut warm_up_futures = 
Vec::new();\n    let cpu_intensive_executor = |task| async {\n        crate::search_thread_pool()\n            .run_cpu_intensive(task)\n            .await\n            .map_err(|_| std::io::Error::other(\"task panicked\"))?\n    };\n    for (field, automatons) in terms_grouped_by_field {\n        for segment_reader in searcher.segment_readers() {\n            let inv_idx = segment_reader.inverted_index(*field)?;\n            for automaton in automatons {\n                let inv_idx_clone = inv_idx.clone();\n                warm_up_futures.push(async move {\n                    match automaton {\n                        Automaton::Regex(path, regex_str) => {\n                            let regex = tantivy_fst::Regex::new(regex_str)\n                                .context(\"failed to parse regex during warmup\")?;\n                            inv_idx_clone\n                                .warm_postings_automaton(\n                                    quickwit_query::query_ast::JsonPathPrefix {\n                                        automaton: regex.into(),\n                                        prefix: path.clone().unwrap_or_default(),\n                                    },\n                                    cpu_intensive_executor,\n                                )\n                                .await\n                                .context(\"failed to load automaton\")\n                        }\n                    }\n                });\n            }\n        }\n    }\n    try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nasync fn warm_up_fieldnorms(searcher: &Searcher, requires_scoring: bool) -> anyhow::Result<()> {\n    if !requires_scoring {\n        return Ok(());\n    }\n    let mut warm_up_futures = Vec::new();\n    for field in searcher.schema().fields() {\n        for segment_reader in searcher.segment_readers() {\n            let fieldnorm_readers = segment_reader.fieldnorms_readers();\n            let file_handle_opt = 
fieldnorm_readers.get_inner_file().open_read(field.0);\n            if let Some(file_handle) = file_handle_opt {\n                warm_up_futures.push(async move { file_handle.read_bytes_async().await })\n            }\n        }\n    }\n    try_join_all(warm_up_futures).await?;\n    Ok(())\n}\n\nfn get_leaf_resp_from_count(count: u64) -> LeafSearchResponse {\n    LeafSearchResponse {\n        num_hits: count,\n        partial_hits: Vec::new(),\n        failed_splits: Vec::new(),\n        num_attempted_splits: 1,\n        num_successful_splits: 1,\n        intermediate_aggregation_result: None,\n        resource_stats: None,\n    }\n}\n\n/// Compute the size of the index, store excluded.\nfn compute_index_size(hot_directory: &HotDirectory) -> ByteSize {\n    let size_bytes = hot_directory\n        .get_file_lengths()\n        .iter()\n        .filter(|(path, _)| !path.to_string_lossy().ends_with(\"store\"))\n        .map(|(_, size)| *size)\n        .sum();\n    ByteSize(size_bytes)\n}\n\n/// Apply a leaf search on a single split.\n#[allow(clippy::too_many_arguments)]\nasync fn leaf_search_single_split(\n    search_request: SearchRequest,\n    ctx: Arc<LeafSearchContext>,\n    storage: Arc<dyn Storage>,\n    split: SplitIdAndFooterOffsets,\n    search_permit: &mut SearchPermit,\n) -> crate::Result<Option<LeafSearchResponse>> {\n    let mut leaf_search_state_guard =\n        SplitSearchStateGuard::new(ctx.split_outcome_counters.clone());\n\n    // We already checked if the result was already in the partial result cache,\n    // but it's not a bad idea to check again.\n    if let Some(cached_answer) = ctx\n        .searcher_context\n        .leaf_search_cache\n        .get(split.clone(), search_request.clone())\n    {\n        leaf_search_state_guard.set_state(SplitSearchState::CacheHit);\n        return Ok(Some(cached_answer));\n    }\n\n    let query_ast: QueryAst = serde_json::from_str(search_request.query_ast.as_str())\n        .map_err(|err| 
SearchError::InvalidQuery(err.to_string()))?;\n\n    // CanSplitDoBetter or rewrite_request may have changed the request to be a count only request\n    // This may be the case for AllQuery with a sort by date and time filter, where the current\n    // split can't have better results.\n    if is_metadata_count_request_with_ast(&query_ast, &search_request) {\n        leaf_search_state_guard.set_state(SplitSearchState::PrunedBeforeWarmup);\n        return Ok(Some(get_leaf_resp_from_count(split.num_docs)));\n    }\n\n    let split_id = split.split_id.to_string();\n    let byte_range_cache =\n        ByteRangeCache::with_infinite_capacity(&quickwit_storage::STORAGE_METRICS.shortlived_cache);\n    let (index, hot_directory) = open_index_with_caches(\n        &ctx.searcher_context,\n        storage,\n        &split,\n        Some(ctx.doc_mapper.tokenizer_manager()),\n        Some(byte_range_cache.clone()),\n    )\n    .await?;\n\n    let index_size = compute_index_size(&hot_directory);\n    if index_size < search_permit.memory_allocation() {\n        search_permit.update_memory_usage(index_size);\n    }\n\n    let searcher = index\n        .reader_builder()\n        .reload_policy(ReloadPolicy::Manual)\n        .try_into()?\n        .searcher();\n\n    let agg_context_params = AggContextParams {\n        limits: ctx.searcher_context.get_aggregation_limits(),\n        tokenizers: ctx.doc_mapper.tokenizer_manager().tantivy_manager().clone(),\n    };\n    let mut collector =\n        make_collector_for_split(split_id.clone(), &search_request, agg_context_params)?;\n\n    let predicate_cache = if collector.requires_scoring() {\n        // at the moment the predicate cache doesn't support scoring\n        None\n    } else {\n        Some((\n            ctx.searcher_context.predicate_cache.clone() as _,\n            split.split_id.clone(),\n        ))\n    };\n    let split_schema = index.schema();\n    let (query, mut warmup_info) = ctx.doc_mapper.query(\n        
split_schema.clone(),\n        query_ast.clone(),\n        false,\n        predicate_cache,\n    )?;\n\n    let collector_warmup_info = collector.warmup_info();\n    warmup_info.merge(collector_warmup_info);\n    warmup_info.simplify();\n\n    let warmup_start = Instant::now();\n    leaf_search_state_guard.set_state(SplitSearchState::WarmUp);\n    warmup(&searcher, &warmup_info).await?;\n    let warmup_end = Instant::now();\n    let warmup_duration: Duration = warmup_end.duration_since(warmup_start);\n    let warmup_size = ByteSize(byte_range_cache.get_num_bytes());\n    if warmup_size > search_permit.memory_allocation() {\n        warn!(\n            memory_usage = ?warmup_size,\n            memory_allocation = ?search_permit.memory_allocation(),\n            \"current leaf search is consuming more memory than the initial allocation\"\n        );\n    }\n    crate::SEARCH_METRICS\n        .leaf_search_single_split_warmup_num_bytes\n        .observe(warmup_size.as_u64() as f64);\n    search_permit.update_memory_usage(warmup_size);\n    search_permit.free_warmup_slot();\n\n    let split_num_docs = split.num_docs;\n\n    let span = info_span!(\"tantivy_search\");\n\n    let split_clone = split.clone();\n\n    let ctx_clone = ctx.clone();\n    leaf_search_state_guard.set_state(SplitSearchState::CpuQueue);\n    let search_request_and_result: Option<(SearchRequest, LeafSearchResponse)> =\n        crate::search_thread_pool()\n            .run_cpu_intensive(move || {\n                leaf_search_state_guard.set_state(SplitSearchState::Cpu);\n                let cpu_start = Instant::now();\n                let cpu_thread_pool_wait_microsecs = cpu_start.duration_since(warmup_end);\n                let _span_guard = span.enter();\n                // Our search execution has been scheduled, let's check if we can improve the\n                // request based on the results of the preceding searches\n                let Some(simplified_search_request) =\n                    
simplify_search_request(search_request, &split_clone, &ctx_clone.split_filter)\n                else {\n                    leaf_search_state_guard.set_state(SplitSearchState::PrunedAfterWarmup);\n                    return Ok(None);\n                };\n                collector.update_search_param(&simplified_search_request);\n                let mut leaf_search_response: LeafSearchResponse =\n                    if is_metadata_count_request_with_ast(&query_ast, &simplified_search_request) {\n                        get_leaf_resp_from_count(searcher.num_docs())\n                    } else if collector.is_count_only() {\n                        let count = query.count(&searcher)? as u64;\n                        get_leaf_resp_from_count(count)\n                    } else {\n                        searcher.search(&query, &collector)?\n                    };\n                leaf_search_response.resource_stats = Some(ResourceStats {\n                    cpu_microsecs: cpu_start.elapsed().as_micros() as u64,\n                    short_lived_cache_num_bytes: warmup_size.as_u64(),\n                    split_num_docs,\n                    warmup_microsecs: warmup_duration.as_micros() as u64,\n                    cpu_thread_pool_wait_microsecs: cpu_thread_pool_wait_microsecs.as_micros()\n                        as u64,\n                });\n                leaf_search_state_guard.set_state(SplitSearchState::Success);\n                Result::<_, TantivyError>::Ok(Some((\n                    simplified_search_request,\n                    leaf_search_response,\n                )))\n            })\n            .await\n            .map_err(|_| {\n                crate::SearchError::Internal(format!(\"leaf search panicked. 
split={split_id}\"))\n            })??;\n\n    // Let's cache this result in the partial result cache.\n    let Some((leaf_search_req, leaf_search_resp)) = search_request_and_result else {\n        return Ok(None);\n    };\n    // We save our result in the cache.\n    ctx.searcher_context\n        .leaf_search_cache\n        .put(split, leaf_search_req, leaf_search_resp.clone());\n    Ok(Some(leaf_search_resp))\n}\n\n/// Rewrite a request removing parts which incur additional download or computation with no\n/// effect.\n///\n/// This includes things such as sorting results by a field or _score when no document is\n/// requested, or applying a date range when the range covers the entire split.\nfn rewrite_request(\n    search_request: &mut SearchRequest,\n    split: &SplitIdAndFooterOffsets,\n    timestamp_field: Option<&str>,\n) {\n    if search_request.max_hits == 0 {\n        search_request.sort_fields = Vec::new();\n    }\n    if let Some(timestamp_field) = timestamp_field {\n        remove_redundant_timestamp_range(search_request, split, timestamp_field);\n    }\n    rewrite_aggregation(search_request);\n    // we add a top level cache node when search_after is set, this won't help for this query (which\n    // is the 2nd in its series), but should speed up every other request that comes after\n    if search_request.search_after.is_some() {\n        add_top_cache_node(search_request)\n    }\n}\n\nfn add_top_cache_node(search_request: &mut SearchRequest) {\n    let Ok(query_ast) = serde_json::from_str(search_request.query_ast.as_str()) else {\n        // an error will get raised a bit after anyway\n        return;\n    };\n    let new_ast: QueryAst = CacheNode::new(query_ast).into();\n    search_request.query_ast = serde_json::to_string(&new_ast).unwrap();\n}\n\n/// Rewrite aggregations to make them easier to cache\n///\n/// This is only valid for options which are handled while merging results, which is\n/// mostly `extended_bounds`.\nfn 
rewrite_aggregation(search_request: &mut SearchRequest) {\n    if let Some(aggregation) = &search_request.aggregation_request {\n        let Ok(QuickwitAggregations::TantivyAggregations(mut aggregations)) =\n            serde_json::from_str(aggregation)\n        else {\n            return;\n        };\n        let modified_something = visit_aggregation_mut(&mut aggregations, &|aggregation_variant| {\n            match aggregation_variant {\n                // we take() away the extended bounds, and record we did something\n                AggregationVariants::Histogram(histogram) => {\n                    histogram.extended_bounds.take().is_some()\n                }\n                AggregationVariants::DateHistogram(histogram) => {\n                    histogram.extended_bounds.take().is_some()\n                }\n                _ => false,\n            }\n        });\n        if modified_something {\n            // it's fine to put a (Tantivy)Aggregations and not a QuickwitAggregations because\n            // the former is a serde-untagged variant of the latter\n            search_request.aggregation_request =\n                Some(serde_json::to_string(&aggregations).expect(\"serializing should never fail\"));\n        }\n    }\n}\n\n// this is a rather limited visitor, but enough to do the job\nfn visit_aggregation_mut(\n    aggregations: &mut Aggregations,\n    callback: &impl Fn(&mut AggregationVariants) -> bool,\n) -> bool {\n    let mut modified_something = false;\n    for aggregation in aggregations.values_mut() {\n        modified_something |= callback(&mut aggregation.agg);\n        modified_something |= visit_aggregation_mut(&mut aggregation.sub_aggregation, callback);\n    }\n    modified_something\n}\n\n// returns the max of left and right, that isn't unbounded. 
Useful for making\n// the intersection of lower bound of ranges\nfn max_bound<T: Ord + Copy>(left: Bound<T>, right: Bound<T>) -> Bound<T> {\n    use Bound::*;\n    match (left, right) {\n        (Unbounded, right) => right,\n        (left, Unbounded) => left,\n        (Included(left), Included(right)) => Included(left.max(right)),\n        (Excluded(left), Excluded(right)) => Excluded(left.max(right)),\n        (excluded_total @ Excluded(excluded), included_total @ Included(included)) => {\n            if included > excluded {\n                included_total\n            } else {\n                excluded_total\n            }\n        }\n        (included_total @ Included(included), excluded_total @ Excluded(excluded)) => {\n            if included > excluded {\n                included_total\n            } else {\n                excluded_total\n            }\n        }\n    }\n}\n\n// returns the min of left and right, that isn't unbounded. Useful for making\n// the intersection of upper bound of ranges\nfn min_bound<T: Ord + Copy>(left: Bound<T>, right: Bound<T>) -> Bound<T> {\n    use Bound::*;\n    match (left, right) {\n        (Unbounded, right) => right,\n        (left, Unbounded) => left,\n        (Included(left), Included(right)) => Included(left.min(right)),\n        (Excluded(left), Excluded(right)) => Excluded(left.min(right)),\n        (excluded_total @ Excluded(excluded), included_total @ Included(included)) => {\n            if included < excluded {\n                included_total\n            } else {\n                excluded_total\n            }\n        }\n        (included_total @ Included(included), excluded_total @ Excluded(excluded)) => {\n            if included < excluded {\n                included_total\n            } else {\n                excluded_total\n            }\n        }\n    }\n}\n\n/// remove timestamp range that would be present both in QueryAst and SearchRequest\n///\n/// this can save us from doing double the work in some 
cases, and help with the partial request\n/// cache.\nfn remove_redundant_timestamp_range(\n    search_request: &mut SearchRequest,\n    split: &SplitIdAndFooterOffsets,\n    timestamp_field: &str,\n) {\n    let Ok(query_ast) = serde_json::from_str(search_request.query_ast.as_str()) else {\n        // an error will get raised a bit after anyway\n        return;\n    };\n\n    let start_timestamp = search_request\n        .start_timestamp\n        .map(DateTime::from_timestamp_secs)\n        .map(Bound::Included)\n        .unwrap_or(Bound::Unbounded);\n    let end_timestamp = search_request\n        .end_timestamp\n        .map(DateTime::from_timestamp_secs)\n        .map(Bound::Excluded)\n        .unwrap_or(Bound::Unbounded);\n\n    let mut visitor = RemoveTimestampRange {\n        timestamp_field,\n        start_timestamp,\n        end_timestamp,\n    };\n    let mut new_ast = visitor\n        .transform(query_ast)\n        .expect(\"can't fail unwrapping Infallible\")\n        .unwrap_or(QueryAst::MatchAll);\n\n    let final_start_timestamp = match (\n        visitor.start_timestamp,\n        split.timestamp_start.map(DateTime::from_timestamp_secs),\n    ) {\n        (Bound::Included(query_ts), Some(split_ts)) => {\n            if query_ts > split_ts {\n                Bound::Included(query_ts)\n            } else {\n                Bound::Unbounded\n            }\n        }\n        (Bound::Excluded(query_ts), Some(split_ts)) => {\n            if query_ts >= split_ts {\n                Bound::Excluded(query_ts)\n            } else {\n                Bound::Unbounded\n            }\n        }\n        (Bound::Unbounded, Some(_)) => Bound::Unbounded,\n        (timestamp, None) => timestamp,\n    };\n    let final_end_timestamp = match (\n        visitor.end_timestamp,\n        split.timestamp_end.map(DateTime::from_timestamp_secs),\n    ) {\n        (Bound::Included(query_ts), Some(split_ts)) => {\n            if query_ts < split_ts {\n                
Bound::Included(query_ts)\n            } else {\n                Bound::Unbounded\n            }\n        }\n        (Bound::Excluded(query_ts), Some(split_ts)) => {\n            if query_ts <= split_ts {\n                Bound::Excluded(query_ts)\n            } else {\n                Bound::Unbounded\n            }\n        }\n        (Bound::Unbounded, Some(_)) => Bound::Unbounded,\n        (timestamp, None) => timestamp,\n    };\n    if final_start_timestamp != Bound::Unbounded || final_end_timestamp != Bound::Unbounded {\n        let range = RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: final_start_timestamp.map(|bound| bound.into_timestamp_nanos().into()),\n            upper_bound: final_end_timestamp.map(|bound| bound.into_timestamp_nanos().into()),\n        };\n        new_ast = if let QueryAst::Bool(mut bool_query) = new_ast {\n            if bool_query.must.is_empty()\n                && bool_query.filter.is_empty()\n                && !bool_query.should.is_empty()\n            {\n                // we can't simply add a filter if we have some should but no must/filter. 
We must\n                // add a new layer of bool query\n                BoolQuery {\n                    must: vec![bool_query.into()],\n                    filter: vec![range.into()],\n                    ..Default::default()\n                }\n                .into()\n            } else {\n                bool_query.filter.push(range.into());\n                QueryAst::Bool(bool_query)\n            }\n        } else {\n            BoolQuery {\n                must: vec![new_ast],\n                filter: vec![range.into()],\n                ..Default::default()\n            }\n            .into()\n        }\n    }\n\n    search_request.query_ast = serde_json::to_string(&new_ast).unwrap();\n    search_request.start_timestamp = None;\n    search_request.end_timestamp = None;\n}\n\n/// Remove all `must` and `filter` timestamp ranges, and summarize them\n#[derive(Debug, Clone)]\nstruct RemoveTimestampRange<'a> {\n    timestamp_field: &'a str,\n    start_timestamp: Bound<DateTime>,\n    end_timestamp: Bound<DateTime>,\n}\n\nimpl RemoveTimestampRange<'_> {\n    fn update_start_timestamp(\n        &mut self,\n        lower_bound: &quickwit_query::JsonLiteral,\n        included: bool,\n    ) {\n        use quickwit_query::InterpretUserInput;\n        let Some(lower_bound) = DateTime::interpret_json(lower_bound) else {\n            // we shouldn't be able to get here, we would have errored much earlier in root search\n            warn!(\"unparsable time bound in leaf search: {lower_bound:?}\");\n            return;\n        };\n        let bound = if included {\n            Bound::Included(lower_bound)\n        } else {\n            Bound::Excluded(lower_bound)\n        };\n\n        self.start_timestamp = max_bound(self.start_timestamp, bound);\n    }\n\n    fn update_end_timestamp(&mut self, upper_bound: &quickwit_query::JsonLiteral, included: bool) {\n        use quickwit_query::InterpretUserInput;\n        let Some(upper_bound) = 
DateTime::interpret_json(upper_bound) else {\n            // we shouldn't be able to get here, we would have errored much earlier in root search\n            warn!(\"unparsable time bound in leaf search: {upper_bound:?}\");\n            return;\n        };\n        let bound = if included {\n            Bound::Included(upper_bound)\n        } else {\n            Bound::Excluded(upper_bound)\n        };\n\n        self.end_timestamp = min_bound(self.end_timestamp, bound);\n    }\n}\n\nimpl QueryAstTransformer for RemoveTimestampRange<'_> {\n    type Err = std::convert::Infallible;\n\n    fn transform_bool(&mut self, mut bool_query: BoolQuery) -> Result<Option<QueryAst>, Self::Err> {\n        // we only want to visit sub-queries which are strict (positive) requirements\n        bool_query.must = bool_query\n            .must\n            .into_iter()\n            .filter_map(|query_ast| self.transform(query_ast).transpose())\n            .collect::<Result<Vec<_>, _>>()?;\n        bool_query.filter = bool_query\n            .filter\n            .into_iter()\n            .filter_map(|query_ast| self.transform(query_ast).transpose())\n            .collect::<Result<Vec<_>, _>>()?;\n\n        Ok(Some(QueryAst::Bool(bool_query)))\n    }\n\n    fn transform_range(&mut self, range_query: RangeQuery) -> Result<Option<QueryAst>, Self::Err> {\n        if range_query.field == self.timestamp_field {\n            match range_query.lower_bound {\n                Bound::Included(lower_bound) => {\n                    self.update_start_timestamp(&lower_bound, true);\n                }\n                Bound::Excluded(lower_bound) => {\n                    self.update_start_timestamp(&lower_bound, false);\n                }\n                Bound::Unbounded => (),\n            };\n\n            match range_query.upper_bound {\n                Bound::Included(upper_bound) => {\n                    self.update_end_timestamp(&upper_bound, true);\n                }\n                
Bound::Excluded(upper_bound) => {\n                    self.update_end_timestamp(&upper_bound, false);\n                }\n                Bound::Unbounded => (),\n            };\n\n            Ok(Some(QueryAst::MatchAll))\n        } else {\n            Ok(Some(range_query.into()))\n        }\n    }\n\n    fn transform_term(&mut self, term_query: TermQuery) -> Result<Option<QueryAst>, Self::Err> {\n        // TODO we could remove query bounds, this point query surely is more precise, and it\n        // doesn't require loading a fastfield\n        Ok(Some(QueryAst::Term(term_query)))\n    }\n}\n\n/// Checks if the request is a simple match-all query.\n/// Simple in this case can still include sorting.\nfn is_simple_all_query(search_request: &SearchRequest) -> bool {\n    if search_request.aggregation_request.is_some() {\n        return false;\n    }\n\n    if search_request.search_after.is_some() {\n        return false;\n    }\n\n    // TODO: Update the logic to handle start_timestamp end_timestamp ranges\n    if search_request.start_timestamp.is_some() || search_request.end_timestamp.is_some() {\n        return false;\n    }\n\n    let Ok(query_ast) = serde_json::from_str(&search_request.query_ast) else {\n        return false;\n    };\n\n    matches!(query_ast, QueryAst::MatchAll)\n}\n\n#[derive(Debug, Clone)]\nenum CanSplitDoBetter {\n    Uninformative,\n    SplitIdHigher(Option<String>),\n    SplitTimestampHigher(Option<i64>),\n    SplitTimestampLower(Option<i64>),\n    FindTraceIdsAggregation(Option<i64>),\n}\n\nimpl CanSplitDoBetter {\n    /// Create a CanSplitDoBetter from a SearchRequest\n    fn from_request(request: &SearchRequest, timestamp_field_name: Option<&str>) -> Self {\n        if request.max_hits == 0\n            && let Some(aggregation) = &request.aggregation_request\n            && let Ok(crate::QuickwitAggregations::FindTraceIdsAggregation(find_trace_aggregation)) =\n                serde_json::from_str(aggregation)\n            && 
Some(find_trace_aggregation.span_timestamp_field_name.as_str())\n                == timestamp_field_name\n        {\n            return CanSplitDoBetter::FindTraceIdsAggregation(None);\n        }\n\n        if request.sort_fields.is_empty() {\n            CanSplitDoBetter::SplitIdHigher(None)\n        } else if let Some((sort_by, timestamp_field)) =\n            request.sort_fields.first().zip(timestamp_field_name)\n        {\n            if sort_by.field_name == timestamp_field {\n                if sort_by.sort_order() == SortOrder::Desc {\n                    CanSplitDoBetter::SplitTimestampHigher(None)\n                } else {\n                    CanSplitDoBetter::SplitTimestampLower(None)\n                }\n            } else {\n                CanSplitDoBetter::Uninformative\n            }\n        } else {\n            CanSplitDoBetter::Uninformative\n        }\n    }\n\n    /// Optimize the order in which splits will get processed based on how it can skip the most\n    /// splits.\n    ///\n    /// The leaf search code contains some logic that makes it possible to skip entire splits\n    /// when we are confident they won't make it into the top K.\n    /// To make this optimization as potent as possible, we sort the splits so that the first splits\n    /// are the most likely to fill our top K.\n    /// In the future, as splits get more metadata per column, we may be able to do this for more\n    /// than just timestamp and \"unsorted\" requests.\n    fn optimize_split_order(&self, splits: &mut [SplitIdAndFooterOffsets]) {\n        match self {\n            CanSplitDoBetter::SplitIdHigher(_) => {\n                splits.sort_unstable_by(|a, b| b.split_id.cmp(&a.split_id))\n            }\n            CanSplitDoBetter::SplitTimestampHigher(_)\n            | CanSplitDoBetter::FindTraceIdsAggregation(_) => {\n                splits.sort_unstable_by_key(|split| std::cmp::Reverse(split.timestamp_end()))\n            }\n            
CanSplitDoBetter::SplitTimestampLower(_) => {\n                splits.sort_unstable_by_key(|split| split.timestamp_start())\n            }\n            CanSplitDoBetter::Uninformative => (),\n        }\n    }\n\n    /// This function tries to detect upfront which splits contain the top n hits and convert other\n    /// split searches to count only searches. It also optimizes split order.\n    ///\n    /// Returns the search_requests with their split.\n    fn optimize(\n        &self,\n        request: &SearchRequest,\n        mut splits: Vec<SplitIdAndFooterOffsets>,\n    ) -> Result<Vec<(SplitIdAndFooterOffsets, SearchRequest)>, SearchError> {\n        self.optimize_split_order(&mut splits);\n\n        if !is_simple_all_query(request) {\n            // no optimization opportunity here.\n            return Ok(splits\n                .into_iter()\n                .map(|split| (split, (*request).clone()))\n                .collect::<Vec<_>>());\n        }\n\n        let num_requested_docs = request.start_offset + request.max_hits;\n\n        // Calculate the number of splits which are guaranteed to deliver enough documents.\n        let min_required_splits = splits\n            .iter()\n            .map(|split| split.num_docs)\n            // computing the partial sum\n            .scan(0u64, |partial_sum: &mut u64, num_docs_in_split: u64| {\n                *partial_sum += num_docs_in_split;\n                Some(*partial_sum)\n            })\n            .take_while(|partial_sum| *partial_sum < num_requested_docs)\n            .count()\n            + 1;\n\n        // TODO: we maybe want here some deduplication + Cow logic\n        let mut split_with_req = splits\n            .into_iter()\n            .map(|split| (split, (*request).clone()))\n            .collect::<Vec<_>>();\n\n        // reuse the detected sort order in split_filter\n        // we want to detect cases where we can convert some split queries to count only queries\n        match self {\n            
CanSplitDoBetter::SplitIdHigher(_) => {\n                // In this case there is no sort order, we order by split id.\n                // If the first split has enough documents, we can convert the other queries to\n                // count only queries\n                for (_split, request) in split_with_req.iter_mut().skip(min_required_splits) {\n                    disable_search_request_hits(request);\n                }\n            }\n            CanSplitDoBetter::Uninformative => {}\n            CanSplitDoBetter::SplitTimestampLower(_) => {\n                // We order by timestamp asc. split_with_req is sorted by timestamp_start.\n                //\n                // If we know that some splits will deliver enough documents, we can convert the\n                // others to count only queries.\n                // Since we only have start and end ranges and don't know the distribution we make\n                // sure the splits don't overlap, since the distribution of two\n                // splits could be like this (dot is a timestamp doc on an x axis), for top 2\n                // queries.\n                // ```\n                // [.          .] Split1 has enough docs, but last doc is not in top 2\n                //           [..         .] 
Split2 first doc is in top2\n                // ```\n                // Let's get the biggest timestamp_end of the first num_splits splits\n                let biggest_end_timestamp = split_with_req\n                    .iter()\n                    .take(min_required_splits)\n                    .map(|(split, _)| split.timestamp_end())\n                    .max()\n                    // if min_required_splits is 0, we choose a value that disables all splits\n                    .unwrap_or(i64::MIN);\n                for (split, request) in split_with_req.iter_mut().skip(min_required_splits) {\n                    if split.timestamp_start() > biggest_end_timestamp {\n                        disable_search_request_hits(request);\n                    }\n                }\n            }\n            CanSplitDoBetter::SplitTimestampHigher(_) => {\n                // We order by timestamp desc. split_with_req is sorted by timestamp_end desc.\n                //\n                // We have the number of splits we need to search to get enough docs, now we need to\n                // find the splits that don't overlap.\n                //\n                // Let's get the smallest timestamp_start of the first num_splits splits\n                let smallest_start_timestamp = split_with_req\n                    .iter()\n                    .take(min_required_splits)\n                    .map(|(split, _)| split.timestamp_start())\n                    .min()\n                    // if min_required_splits is 0, we choose a value that disables all splits\n                    .unwrap_or(i64::MAX);\n                for (split, request) in split_with_req.iter_mut().skip(min_required_splits) {\n                    if split.timestamp_end() < smallest_start_timestamp {\n                        disable_search_request_hits(request);\n                    }\n                }\n            }\n            CanSplitDoBetter::FindTraceIdsAggregation(_) => {}\n        }\n\n        
Ok(split_with_req)\n    }\n\n    /// Returns whether the given split can possibly give documents better than the one already\n    /// known to match.\n    fn can_be_better(&self, split: &SplitIdAndFooterOffsets) -> bool {\n        match self {\n            CanSplitDoBetter::SplitIdHigher(Some(split_id)) => split.split_id >= *split_id,\n            CanSplitDoBetter::SplitTimestampHigher(Some(timestamp))\n            | CanSplitDoBetter::FindTraceIdsAggregation(Some(timestamp)) => {\n                split.timestamp_end() >= *timestamp\n            }\n            CanSplitDoBetter::SplitTimestampLower(Some(timestamp)) => {\n                split.timestamp_start() <= *timestamp\n            }\n            _ => true,\n        }\n    }\n\n    /// Record the new worst-of-the-top document, that is, the document which would first be\n    /// evicted from the list of best documents, if a better document was found. Only call this\n    /// function if you have at least max_hits documents already.\n    fn record_new_worst_hit(&mut self, hit: &PartialHit) {\n        match self {\n            CanSplitDoBetter::Uninformative => (),\n            CanSplitDoBetter::SplitIdHigher(split_id) => *split_id = Some(hit.split_id.clone()),\n            CanSplitDoBetter::SplitTimestampHigher(timestamp)\n            | CanSplitDoBetter::FindTraceIdsAggregation(timestamp) => {\n                if let Some(SortValue::I64(timestamp_ns)) = hit.sort_value() {\n                    // if we get a timestamp of, say, 1.5s, we need to check up to 2s to make\n                    // sure we don't throw away something like 1.2s, so we should round up while\n                    // dividing.\n                    *timestamp = Some(quickwit_common::div_ceil(timestamp_ns, 1_000_000_000));\n                }\n            }\n            CanSplitDoBetter::SplitTimestampLower(timestamp) => {\n                if let Some(SortValue::I64(timestamp_ns)) = hit.sort_value() {\n                    // if we get a timestamp of, 
say, 1.5s, we need to check down to 1s to make\n                    // sure we don't throw away something like 1.7s, so we should truncate,\n                    // which is the default behavior of division\n                    let timestamp_s = timestamp_ns / 1_000_000_000;\n                    *timestamp = Some(timestamp_s);\n                }\n            }\n        }\n    }\n}\n\n/// Searches multiple splits, potentially in multiple indexes, sitting on different storages and\n/// having different doc mappings.\n#[instrument(skip_all, fields(index = ?leaf_search_request.search_request.as_ref().unwrap().index_id_patterns))]\npub async fn multi_index_leaf_search(\n    searcher_context: Arc<SearcherContext>,\n    leaf_search_request: LeafSearchRequest,\n    storage_resolver: StorageResolver,\n) -> Result<LeafSearchResponse, SearchError> {\n    let search_request: Arc<SearchRequest> = leaf_search_request\n        .search_request\n        .ok_or_else(|| SearchError::Internal(\"no search request\".to_string()))?\n        .into();\n\n    let doc_mappers: Vec<Arc<DocMapper>> = leaf_search_request\n        .doc_mappers\n        .iter()\n        .map(|doc_mapper| deserialize_doc_mapper(doc_mapper))\n        .collect::<crate::Result<_>>()?;\n\n    // TODO: to avoid lockstep, we should pull up the future creation over the list of split ids\n    // and have the semaphore on this level.\n    // This will lower resource consumption due to fewer in-flight futures and avoid contention.\n    // It also allows passing early exit conditions between indices.\n    //\n    // It is a little bit tricky how to handle which is now the incremental_merge_collector, one\n    // per index, e.g. 
when to merge results and how to avoid lock contention.\n    let mut leaf_request_futures = JoinSet::new();\n    for leaf_search_request_ref in leaf_search_request.leaf_requests.into_iter() {\n        let index_uri = quickwit_common::uri::Uri::from_str(\n            leaf_search_request\n                .index_uris\n                .get(leaf_search_request_ref.index_uri_ord as usize)\n                .ok_or_else(|| {\n                    SearchError::Internal(format!(\n                        \"Received incorrect request, index_uri_ord out of bounds: {}\",\n                        leaf_search_request_ref.index_uri_ord\n                    ))\n                })?,\n        )?;\n        let doc_mapper = doc_mappers\n            .get(leaf_search_request_ref.doc_mapper_ord as usize)\n            .ok_or_else(|| {\n                SearchError::Internal(format!(\n                    \"Received incorrect request, doc_mapper_ord out of bounds: {}\",\n                    leaf_search_request_ref.doc_mapper_ord\n                ))\n            })?\n            .clone();\n\n        let storage_resolver = storage_resolver.clone();\n        let searcher_context = searcher_context.clone();\n        let search_request = search_request.clone();\n\n        leaf_request_futures.spawn({\n            async move {\n                let storage = storage_resolver.resolve(&index_uri).await?;\n                single_doc_mapping_leaf_search(\n                    searcher_context,\n                    search_request,\n                    storage,\n                    leaf_search_request_ref.split_offsets,\n                    doc_mapper,\n                )\n                .in_current_span()\n                .await\n            }\n        });\n    }\n\n    // Creates a collector which merges responses into one\n    let merge_collector =\n        make_merge_collector(&search_request, searcher_context.get_aggregation_limits())?;\n    let mut incremental_merge_collector = 
IncrementalCollector::new(merge_collector);\n\n    while let Some(leaf_response_join_result) = leaf_request_futures.join_next().await {\n        // abort the search on join errors\n        let leaf_response_result = leaf_response_join_result?;\n        match leaf_response_result {\n            Ok(leaf_response) => {\n                incremental_merge_collector.add_result(leaf_response)?;\n            }\n            Err(err) => {\n                incremental_merge_collector.add_failed_split(SplitSearchError {\n                    split_id: \"unknown\".to_string(),\n                    error: format!(\"{err}\"),\n                    retryable_error: true,\n                });\n            }\n        }\n    }\n\n    crate::search_thread_pool()\n        .run_cpu_intensive(|| incremental_merge_collector.finalize().map_err(Into::into))\n        .instrument(info_span!(\"incremental_merge_finalize\"))\n        .await\n        .context(\"failed to merge split search responses\")?\n}\n\n/// Optimizes the search_request based on CanSplitDoBetter.\n/// Returns None if the search request would do nothing and can be skipped.\n#[must_use]\nfn simplify_search_request(\n    mut search_request: SearchRequest,\n    split: &SplitIdAndFooterOffsets,\n    split_filter_lock: &Arc<RwLock<CanSplitDoBetter>>,\n) -> Option<SearchRequest> {\n    let can_be_better: bool;\n    let is_trace_req: bool;\n    {\n        let split_filter_guard = split_filter_lock.read().unwrap();\n        can_be_better = split_filter_guard.can_be_better(split);\n        // The info is originally from the search_request.aggregation as a string (yes we need to\n        // clean this eventually). 
We don't want to parse it again, so we use the\n        // split_filter variant to get that info.\n        is_trace_req = matches!(\n            &*split_filter_guard,\n            &CanSplitDoBetter::FindTraceIdsAggregation(_)\n        );\n    }\n    if !can_be_better {\n        disable_search_request_hits(&mut search_request);\n    }\n    if is_trace_req {\n        return Some(search_request);\n    }\n    if search_request.max_hits > 0 {\n        return Some(search_request);\n    }\n    if search_request.aggregation_request.is_some() {\n        return Some(search_request);\n    }\n    if search_request.count_hits() == CountHits::CountAll {\n        return Some(search_request);\n    }\n    None\n}\n\n/// Alter the search request so it does not return any docs.\n///\n/// This is usually done since it cannot provide better hits results than existing fetched results.\nfn disable_search_request_hits(search_request: &mut SearchRequest) {\n    search_request.max_hits = 0;\n    search_request.start_offset = 0;\n    search_request.sort_fields.clear();\n    search_request.search_after = None;\n}\n\n/// Searches multiple splits for a specific index and a single doc mapping\n/// Offloads splits to Lambda invocations, distributing them across batches\n/// balanced by document count. 
Each batch is invoked independently; a failure\n/// in one batch does not affect others.\nasync fn run_offloaded_search_tasks(\n    searcher_context: &SearcherContext,\n    search_request: &SearchRequest,\n    doc_mapper: &DocMapper,\n    index_uri: Uri,\n    splits_with_requests: Vec<(SplitIdAndFooterOffsets, SearchRequest)>,\n    incremental_merge_collector: &Mutex<IncrementalCollector>,\n) -> Result<(), SearchError> {\n    if splits_with_requests.is_empty() {\n        return Ok(());\n    }\n\n    info!(\n        num_offloaded_splits = splits_with_requests.len(),\n        \"offloading to lambda\"\n    );\n\n    let lambda_invoker = searcher_context.lambda_invoker.as_ref().expect(\n        \"did not receive enough permit futures despite not having any lambda invoker to offload to\",\n    );\n    let lambda_config = searcher_context.searcher_config.lambda.as_ref().unwrap();\n\n    let doc_mapper_str = serde_json::to_string(doc_mapper)\n        .map_err(|err| SearchError::Internal(format!(\"failed to serialize doc mapper: {err}\")))?;\n\n    // Build a lookup so we can match lambda results (tagged by split_id) back to the\n    // split metadata and per-split SearchRequest needed for caching.\n    let mut split_lookup: HashMap<String, (SplitIdAndFooterOffsets, SearchRequest)> =\n        HashMap::with_capacity(splits_with_requests.len());\n    let splits: Vec<SplitIdAndFooterOffsets> = splits_with_requests\n        .into_iter()\n        .map(|(split, search_req)| {\n            split_lookup.insert(split.split_id.clone(), (split.clone(), search_req));\n            split\n        })\n        .collect();\n\n    let batches: Vec<Vec<SplitIdAndFooterOffsets>> = greedy_batch_split(\n        splits,\n        |split| split.num_docs,\n        lambda_config.max_splits_per_invocation,\n    );\n\n    let mut lambda_tasks_joinset = JoinSet::new();\n    for batch in batches {\n        let batch_split_ids: Vec<String> =\n            batch.iter().map(|split| 
split.split_id.clone()).collect();\n        let leaf_request = LeafSearchRequest {\n            // Note this is not the split-specific rewritten request, we ship the main request,\n            // and the leaf will apply the split specific rewrite on its own.\n            search_request: Some(search_request.clone()),\n            doc_mappers: vec![doc_mapper_str.clone()],\n            index_uris: vec![index_uri.as_str().to_string()], //< careful here. Calling to_string() directly would return a redacted uri.\n            leaf_requests: vec![quickwit_proto::search::LeafRequestRef {\n                index_uri_ord: 0,\n                doc_mapper_ord: 0,\n                split_offsets: batch,\n            }],\n        };\n        let invoker = lambda_invoker.clone();\n        lambda_tasks_joinset.spawn(async move {\n            (\n                batch_split_ids,\n                invoker.invoke_leaf_search(leaf_request).await,\n            )\n        });\n    }\n\n    while let Some(join_res) = lambda_tasks_joinset.join_next().await {\n        let Ok((batch_split_ids, result)) = join_res else {\n            error!(\"lambda join error\");\n            return Err(SearchError::Internal(\"lambda join error\".to_string()));\n        };\n        match result {\n            Ok(split_results) => {\n                let mut locked = incremental_merge_collector.lock().unwrap();\n                for split_result in split_results {\n                    match split_result.outcome {\n                        Some(Outcome::Response(response)) => {\n                            if let Some((split_info, single_split_search_req)) =\n                                split_lookup.remove(&split_result.split_id)\n                            {\n                                // We use the single_split_search_req to perform the search\n                                searcher_context.leaf_search_cache.put(\n                                    split_info,\n                                    
single_split_search_req,\n                                    response.clone(),\n                                );\n                            }\n                            if let Err(err) = locked.add_result(response) {\n                                error!(error = %err, \"failed to add lambda result to collector\");\n                            }\n                        }\n                        Some(Outcome::Error(error_msg)) => {\n                            locked.add_failed_split(SplitSearchError {\n                                split_id: split_result.split_id,\n                                error: format!(\"lambda split error: {error_msg}\"),\n                                retryable_error: true,\n                            });\n                        }\n                        None => {\n                            locked.add_failed_split(SplitSearchError {\n                                split_id: split_result.split_id,\n                                error: \"lambda returned empty outcome\".to_string(),\n                                retryable_error: true,\n                            });\n                        }\n                    }\n                }\n            }\n            Err(err) => {\n                // Transport-level failure: the Lambda invocation itself failed.\n                // Mark all splits in this batch as failed.\n                error!(\n                    error = %err,\n                    num_splits = batch_split_ids.len(),\n                    \"lambda invocation failed for batch\"\n                );\n                let mut locked = incremental_merge_collector.lock().unwrap();\n                for split_id in batch_split_ids {\n                    locked.add_failed_split(SplitSearchError {\n                        split_id,\n                        error: format!(\"lambda invocation error: {err}\"),\n                        retryable_error: true,\n                    });\n                }\n            }\n 
       }\n    }\n\n    Ok(())\n}\n\nstruct LocalSearchTask {\n    split: SplitIdAndFooterOffsets,\n    search_request: SearchRequest,\n    search_permit_future: SearchPermitFuture,\n}\n\nstruct ScheduleSearchTaskResult {\n    // The search permit futures associated with each local_search_task are\n    // guaranteed to resolve in order.\n    local_search_tasks: Vec<LocalSearchTask>,\n    // The per-split SearchRequest (already rewritten by `rewrite_request()`) is preserved\n    // so that lambda results can be cached with the correct cache key in `leaf_search_cache`.\n    offloaded_search_tasks: Vec<(SplitIdAndFooterOffsets, SearchRequest)>,\n}\n\n/// Schedule search tasks, either:\n/// - locally\n/// - remotely on lambdas, if lambdas are configured and the number of scheduled tasks exceeds\n///   the offload threshold.\nasync fn schedule_search_tasks(\n    mut splits: Vec<(SplitIdAndFooterOffsets, SearchRequest)>,\n    searcher_context: &SearcherContext,\n) -> ScheduleSearchTaskResult {\n    let permit_sizes: Vec<ByteSize> = splits\n        .iter()\n        .map(|(split, _)| {\n            compute_initial_memory_allocation(\n                split,\n                searcher_context\n                    .searcher_config\n                    .warmup_single_split_initial_allocation,\n            )\n        })\n        .collect();\n\n    let offload_threshold: usize = if searcher_context.lambda_invoker.is_some()\n        && let Some(lambda_config) = &searcher_context.searcher_config.lambda\n    {\n        lambda_config.offload_threshold\n    } else {\n        usize::MAX\n    };\n\n    let search_permit_futures = searcher_context\n        .search_permit_provider\n        .get_permits_with_offload(permit_sizes, offload_threshold)\n        .await;\n\n    let splits_to_run_on_lambda: Vec<(SplitIdAndFooterOffsets, SearchRequest)> =\n        splits.drain(search_permit_futures.len()..).collect();\n\n    let splits_to_run_locally: Vec<LocalSearchTask> = splits\n        
.into_iter()\n        .zip(search_permit_futures)\n        .map(\n            |((split, search_request), search_permit_future)| LocalSearchTask {\n                split,\n                search_request,\n                search_permit_future,\n            },\n        )\n        .collect();\n\n    ScheduleSearchTaskResult {\n        local_search_tasks: splits_to_run_locally,\n        offloaded_search_tasks: splits_to_run_on_lambda,\n    }\n}\n\n/// The leaf search collects all kinds of information, and returns a set of\n/// [PartialHit] candidates. The root will be in\n/// charge of consolidating them, identifying the actual final top hits to display, and\n/// fetching the actual documents to convert the partial hits into actual Hits.\npub async fn single_doc_mapping_leaf_search(\n    searcher_context: Arc<SearcherContext>,\n    request: Arc<SearchRequest>,\n    index_storage: Arc<dyn Storage>,\n    splits: Vec<SplitIdAndFooterOffsets>,\n    doc_mapper: Arc<DocMapper>,\n) -> Result<LeafSearchResponse, SearchError> {\n    let num_docs: u64 = splits.iter().map(|split| split.num_docs).sum();\n    let num_splits = splits.len();\n    info!(num_docs, num_splits, split_offsets = ?PrettySample::new(&splits, 5));\n\n    // We simplify the request as much as possible.\n    let split_filter: CanSplitDoBetter =\n        CanSplitDoBetter::from_request(&request, doc_mapper.timestamp_field_name());\n    let mut split_with_req: Vec<(SplitIdAndFooterOffsets, SearchRequest)> =\n        split_filter.optimize(&request, splits)?;\n    for (split, single_split_search_request) in &mut split_with_req {\n        rewrite_request(\n            single_split_search_request,\n            split,\n            doc_mapper.timestamp_field_name(),\n        );\n    }\n    let split_filter_arc: Arc<RwLock<CanSplitDoBetter>> = Arc::new(RwLock::new(split_filter));\n\n    let merge_collector =\n        make_merge_collector(&request, searcher_context.get_aggregation_limits())?;\n    let mut incremental_merge_collector = 
IncrementalCollector::new(merge_collector);\n\n    let split_outcome_counters = Arc::new(SplitSearchOutcomeCounters::new_unregistered());\n\n    // Sort out the splits that are already in the partial result cache.\n    let uncached_splits: Vec<(SplitIdAndFooterOffsets, SearchRequest)> =\n        process_partial_result_cache(\n            &searcher_context.leaf_search_cache,\n            split_with_req,\n            split_outcome_counters.clone(),\n            &mut incremental_merge_collector,\n        )?;\n    let incremental_merge_collector_arc: Arc<Mutex<IncrementalCollector>> =\n        Arc::new(Mutex::new(incremental_merge_collector));\n\n    // Determine which uncached splits to process locally vs offload.\n    let ScheduleSearchTaskResult {\n        local_search_tasks,\n        offloaded_search_tasks,\n    } = schedule_search_tasks(uncached_splits, &searcher_context).await;\n\n    // Offload splits to Lambda.\n    let run_offloaded_search_tasks_fut = run_offloaded_search_tasks(\n        &searcher_context,\n        &request,\n        &doc_mapper,\n        index_storage.uri().clone(),\n        offloaded_search_tasks,\n        &incremental_merge_collector_arc,\n    );\n\n    // Spawn local split search tasks.\n    let leaf_search_context = Arc::new(LeafSearchContext {\n        searcher_context: searcher_context.clone(),\n        split_outcome_counters,\n        incremental_merge_collector: incremental_merge_collector_arc.clone(),\n        doc_mapper: doc_mapper.clone(),\n        split_filter: split_filter_arc.clone(),\n    });\n    let run_local_search_tasks_fut = run_local_search_tasks(\n        local_search_tasks,\n        index_storage,\n        split_filter_arc,\n        leaf_search_context,\n    );\n\n    let (offloaded_res, _) =\n        tokio::join!(run_offloaded_search_tasks_fut, run_local_search_tasks_fut);\n    offloaded_res?;\n\n    // we can't use unwrap_or_clone because mutexes aren't Clone\n    let incremental_merge_collector = match 
Arc::try_unwrap(incremental_merge_collector_arc) {\n        Ok(filter_merger) => filter_merger.into_inner().unwrap(),\n        Err(filter_merger) => filter_merger.lock().unwrap().clone(),\n    };\n\n    let leaf_search_response_result: tantivy::Result<LeafSearchResponse> =\n        crate::search_thread_pool()\n            .run_cpu_intensive(|| incremental_merge_collector.finalize())\n            .instrument(info_span!(\"incremental_merge_intermediate\"))\n            .await\n            .context(\"failed to merge split search responses: thread panicked\")?;\n\n    Ok(leaf_search_response_result?)\n}\n\nasync fn run_local_search_tasks(\n    local_search_tasks: Vec<LocalSearchTask>,\n    index_storage: Arc<dyn Storage + 'static>,\n    split_filter_arc: Arc<RwLock<CanSplitDoBetter>>,\n    leaf_search_context: Arc<LeafSearchContext>,\n) {\n    let mut split_search_joinset = JoinSet::new();\n    let mut task_id_to_split_id_map = HashMap::with_capacity(local_search_tasks.len());\n\n    for LocalSearchTask {\n        split,\n        search_request,\n        search_permit_future,\n    } in local_search_tasks\n    {\n        let leaf_split_search_permit = search_permit_future\n            .instrument(info_span!(\"waiting_for_leaf_search_split_semaphore\"))\n            .await;\n\n        // We run simplify search request again: as we push split into the merge collector,\n        // we may have discovered that we won't find any better candidates for top hits in this\n        // split, in which case we can remove top hits collection.\n        let Some(simplified_search_request) =\n            simplify_search_request(search_request, &split, &split_filter_arc)\n        else {\n            let mut leaf_search_state_guard =\n                SplitSearchStateGuard::new(leaf_search_context.split_outcome_counters.clone());\n            leaf_search_state_guard.set_state(SplitSearchState::PrunedBeforeWarmup);\n            continue;\n        };\n        let split_id = 
split.split_id.clone();\n        let handle = split_search_joinset.spawn(\n            leaf_search_single_split_wrapper(\n                simplified_search_request,\n                leaf_search_context.clone(),\n                index_storage.clone(),\n                split.clone(),\n                leaf_split_search_permit,\n            )\n            .in_current_span(),\n        );\n        task_id_to_split_id_map.insert(handle.id(), split_id);\n    }\n\n    // Await all local tasks.\n    let mut split_search_join_errors: Vec<(String, JoinError)> = Vec::new();\n\n    while let Some(leaf_search_join_result) = split_search_joinset.join_next().await {\n        if let Err(join_error) = leaf_search_join_result {\n            if join_error.is_cancelled() {\n                continue;\n            }\n            let split_id = task_id_to_split_id_map.get(&join_error.id()).unwrap();\n            if join_error.is_panic() {\n                error!(split=%split_id, \"leaf search task panicked\");\n            } else {\n                error!(split=%split_id, \"please report: leaf search was not cancelled, and could not extract panic. 
this should never happen\");\n            }\n            split_search_join_errors.push((split_id.clone(), join_error));\n        }\n    }\n\n    let mut incremental_merge_collector_lock = leaf_search_context\n        .incremental_merge_collector\n        .lock()\n        .unwrap();\n    for (split_id, split_search_join_error) in split_search_join_errors {\n        incremental_merge_collector_lock.add_failed_split(SplitSearchError {\n            split_id,\n            error: SearchError::from(split_search_join_error).to_string(),\n            retryable_error: true,\n        });\n    }\n\n    info!(split_outcome_counters=%leaf_search_context.split_outcome_counters, \"leaf split search finished\");\n}\n\n/// We identify the splits that are in the cache and append them to the incremental merge collector.\n/// The (split, request) that are yet to be processed are returned.\nfn process_partial_result_cache(\n    leaf_search_cache: &LeafSearchCache,\n    split_with_req: Vec<(SplitIdAndFooterOffsets, SearchRequest)>,\n    split_outcome_counters: Arc<SplitSearchOutcomeCounters>,\n    incremental_merge_collector: &mut IncrementalCollector,\n) -> Result<Vec<(SplitIdAndFooterOffsets, SearchRequest)>, SearchError> {\n    let mut uncached_splits: Vec<(SplitIdAndFooterOffsets, SearchRequest)> =\n        Vec::with_capacity(split_with_req.len());\n    for (split, search_request) in split_with_req {\n        if let Some(cached_response) = leaf_search_cache\n            // TODO remove the clone here.\n            .get(split.clone(), search_request.clone())\n        {\n            let mut split_search_guard = SplitSearchStateGuard::new(split_outcome_counters.clone());\n            split_search_guard.set_state(SplitSearchState::CacheHit);\n            incremental_merge_collector.add_result(cached_response)?;\n        } else {\n            uncached_splits.push((split, search_request));\n        }\n    }\n    Ok(uncached_splits)\n}\n\n#[derive(Copy, Clone)]\nenum SplitSearchState {\n    
Start,\n    CacheHit,\n    PrunedBeforeWarmup,\n    WarmUp,\n    PrunedAfterWarmup,\n    CpuQueue,\n    Cpu,\n    Success,\n}\n\nimpl SplitSearchState {\n    pub fn inc(self, counters: &SplitSearchOutcomeCounters) {\n        match self {\n            SplitSearchState::Start => counters.cancel_before_warmup.inc(),\n            SplitSearchState::CacheHit => counters.cache_hit.inc(),\n            SplitSearchState::PrunedBeforeWarmup => counters.pruned_before_warmup.inc(),\n            SplitSearchState::WarmUp => counters.cancel_warmup.inc(),\n            SplitSearchState::PrunedAfterWarmup => counters.pruned_after_warmup.inc(),\n            SplitSearchState::CpuQueue => counters.cancel_cpu_queue.inc(),\n            SplitSearchState::Cpu => counters.cancel_cpu.inc(),\n            SplitSearchState::Success => counters.success.inc(),\n        }\n    }\n}\n\n// When the guard is dropped, its last recorded state is counted in both the global\n// metrics and the per-request counters.\nimpl Drop for SplitSearchStateGuard {\n    fn drop(&mut self) {\n        self.state\n            .inc(&crate::metrics::SEARCH_METRICS.split_search_outcome_total);\n        self.state.inc(&self.local_split_search_outcome_counters);\n    }\n}\n\nstruct SplitSearchStateGuard {\n    state: SplitSearchState,\n    local_split_search_outcome_counters: Arc<SplitSearchOutcomeCounters>,\n}\n\nimpl SplitSearchStateGuard {\n    pub fn new(local_split_search_outcome_counters: Arc<SplitSearchOutcomeCounters>) -> Self {\n        SplitSearchStateGuard {\n            state: SplitSearchState::Start,\n            local_split_search_outcome_counters,\n        }\n    }\n\n    pub fn set_state(&mut self, state: SplitSearchState) {\n        self.state = state;\n    }\n}\n\nstruct LeafSearchContext {\n    searcher_context: Arc<SearcherContext>,\n    split_outcome_counters: Arc<SplitSearchOutcomeCounters>,\n    incremental_merge_collector: Arc<Mutex<IncrementalCollector>>,\n    doc_mapper: Arc<DocMapper>,\n    split_filter: 
Arc<RwLock<CanSplitDoBetter>>,\n}\n\n#[allow(clippy::too_many_arguments)]\n#[instrument(skip_all, fields(split_id = split.split_id, num_docs = split.num_docs))]\nasync fn leaf_search_single_split_wrapper(\n    request: SearchRequest,\n    ctx: Arc<LeafSearchContext>,\n    index_storage: Arc<dyn Storage>,\n    split: SplitIdAndFooterOffsets,\n    mut search_permit: SearchPermit,\n) {\n    let timer = crate::SEARCH_METRICS\n        .leaf_search_split_duration_secs\n        .start_timer();\n    let leaf_search_single_split_opt_res: crate::Result<Option<LeafSearchResponse>> =\n        leaf_search_single_split(\n            request,\n            ctx.clone(),\n            index_storage,\n            split.clone(),\n            &mut search_permit,\n        )\n        .await;\n\n    // Explicitly drop the permit for readability.\n    // This should always happen after the ephemeral search cache is dropped.\n    std::mem::drop(search_permit);\n\n    if leaf_search_single_split_opt_res.is_ok() {\n        timer.observe_duration();\n    }\n\n    let mut locked_incremental_merge_collector = ctx.incremental_merge_collector.lock().unwrap();\n    match leaf_search_single_split_opt_res {\n        Ok(Some(split_search_res)) => {\n            if let Err(err) = locked_incremental_merge_collector.add_result(split_search_res) {\n                locked_incremental_merge_collector.add_failed_split(SplitSearchError {\n                    split_id: split.split_id.clone(),\n                    error: format!(\"Error parsing aggregation result: {err}\"),\n                    retryable_error: true,\n                });\n            }\n        }\n        Ok(None) => {}\n        Err(err) => locked_incremental_merge_collector.add_failed_split(SplitSearchError {\n            split_id: split.split_id.clone(),\n            error: format!(\"{err}\"),\n            retryable_error: true,\n        }),\n    }\n    if let Some(last_hit) = locked_incremental_merge_collector.peek_worst_hit() {\n        // 
TODO: we could use the RWLock instead and read the value instead of updating it\n        // unconditionally.\n        ctx.split_filter\n            .write()\n            .unwrap()\n            .record_new_worst_hit(last_hit.as_ref());\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::Bound;\n\n    use async_trait::async_trait;\n    use bytes::BufMut;\n    use quickwit_config::{LambdaConfig, SearcherConfig};\n    use quickwit_directories::write_hotcache;\n    use quickwit_proto::search::LambdaSingleSplitResult;\n    use rand::Rng;\n    use tantivy::TantivyDocument;\n    use tantivy::directory::RamDirectory;\n    use tantivy::schema::{\n        BytesOptions, FieldEntry, Schema, TextFieldIndexing, TextOptions, Value,\n    };\n\n    use super::*;\n    use crate::LambdaLeafSearchInvoker;\n\n    fn bool_filter(ast: impl Into<QueryAst>) -> QueryAst {\n        BoolQuery {\n            must: vec![QueryAst::MatchAll],\n            filter: vec![ast.into()],\n            ..Default::default()\n        }\n        .into()\n    }\n\n    #[track_caller]\n    fn assert_ast_eq(got: &SearchRequest, expected: &QueryAst) {\n        let got_ast: QueryAst = serde_json::from_str(&got.query_ast).unwrap();\n        assert_eq!(&got_ast, expected);\n        assert!(got.start_timestamp.is_none());\n        assert!(got.end_timestamp.is_none());\n    }\n\n    #[track_caller]\n    fn remove_timestamp_test_case(\n        request: &SearchRequest,\n        split: &SplitIdAndFooterOffsets,\n        expected: Option<RangeQuery>,\n    ) {\n        let timestamp_field = \"timestamp\";\n\n        // test the query directly\n        let mut request_direct = request.clone();\n        remove_redundant_timestamp_range(&mut request_direct, split, timestamp_field);\n        let expected_direct = expected\n            .clone()\n            .map(bool_filter)\n            .unwrap_or(QueryAst::MatchAll);\n        assert_ast_eq(&request_direct, &expected_direct);\n    }\n\n    #[test]\n    fn 
test_remove_timestamp_range() {\n        const S_TO_NS: i64 = 1_000_000_000;\n        let time1 = 1700001000;\n        let time2 = 1700002000;\n        let time3 = 1700003000;\n        let time4 = 1700004000;\n\n        let timestamp_field = \"timestamp\".to_string();\n\n        // cases where the bounds are larger than the split: no bound is emitted\n        let split = SplitIdAndFooterOffsets {\n            timestamp_start: Some(time2),\n            timestamp_end: Some(time3),\n            ..SplitIdAndFooterOffsets::default()\n        };\n\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time1.into()),\n                // *1000 has no impact, we detect timestamp in ms instead of s\n                upper_bound: Bound::Included((time4 * 1000).into()),\n            }))\n            .unwrap(),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, None);\n\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time1.into()),\n                upper_bound: Bound::Included(time3.into()),\n            }))\n            .unwrap(),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, None);\n\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::MatchAll).unwrap(),\n            start_timestamp: Some(time1),\n            end_timestamp: Some(time4),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, None);\n\n        // exclusive request bounds are handled properly\n        let expected_upper_exclusive = 
RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Unbounded,\n            upper_bound: Bound::Excluded((time3 * S_TO_NS).into()),\n        };\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time1.into()),\n                upper_bound: Bound::Excluded(time3.into()),\n            }))\n            .unwrap(),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(\n            &search_request,\n            &split,\n            Some(expected_upper_exclusive.clone()),\n        );\n\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::MatchAll).unwrap(),\n            start_timestamp: Some(time1),\n            end_timestamp: Some(time3),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(\n            &search_request,\n            &split,\n            Some(expected_upper_exclusive.clone()),\n        );\n\n        let expected_lower_exclusive = RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Excluded((time2 * S_TO_NS).into()),\n            upper_bound: Bound::Unbounded,\n        };\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Excluded(time2.into()),\n                upper_bound: Bound::Included(time3.into()),\n            }))\n            .unwrap(),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(\n            &search_request,\n            &split,\n            Some(expected_lower_exclusive.clone()),\n        );\n\n        // we take the most restrictive bounds\n        let split = 
SplitIdAndFooterOffsets {\n            timestamp_start: Some(time1),\n            timestamp_end: Some(time4),\n            ..SplitIdAndFooterOffsets::default()\n        };\n\n        let expected_upper_2_ex = RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Unbounded,\n            upper_bound: Bound::Excluded((time2 * S_TO_NS).into()),\n        };\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time1.into()),\n                upper_bound: Bound::Included(time3.into()),\n            }))\n            .unwrap(),\n            start_timestamp: Some(time1),\n            end_timestamp: Some(time2),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, Some(expected_upper_2_ex));\n\n        let expected_upper_2_inc = RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Unbounded,\n            upper_bound: Bound::Included((time2 * S_TO_NS).into()),\n        };\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time1.into()),\n                upper_bound: Bound::Included(time2.into()),\n            }))\n            .unwrap(),\n            start_timestamp: Some(time1),\n            end_timestamp: Some(time3),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, Some(expected_upper_2_inc));\n\n        let expected_lower_3 = RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Included((time3 * S_TO_NS).into()),\n            upper_bound: Bound::Unbounded,\n        };\n\n        let search_request 
= SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time2.into()),\n                upper_bound: Bound::Included(time4.into()),\n            }))\n            .unwrap(),\n            start_timestamp: Some(time3),\n            end_timestamp: Some(time4 + 1),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, Some(expected_lower_3.clone()));\n\n        let search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Range(RangeQuery {\n                field: timestamp_field.to_string(),\n                lower_bound: Bound::Included(time3.into()),\n                upper_bound: Bound::Included(time4.into()),\n            }))\n            .unwrap(),\n            start_timestamp: Some(time2),\n            end_timestamp: Some(time4 + 1),\n            ..SearchRequest::default()\n        };\n        remove_timestamp_test_case(&search_request, &split, Some(expected_lower_3));\n\n        let mut search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::MatchAll).unwrap(),\n            start_timestamp: Some(time1),\n            end_timestamp: Some(time4),\n            ..SearchRequest::default()\n        };\n        let split = SplitIdAndFooterOffsets {\n            timestamp_start: Some(time2),\n            timestamp_end: Some(time3),\n            ..SplitIdAndFooterOffsets::default()\n        };\n        remove_redundant_timestamp_range(&mut search_request, &split, &timestamp_field);\n        assert_ast_eq(&search_request, &QueryAst::MatchAll);\n    }\n\n    // regression test for #4935\n    #[test]\n    fn test_remove_timestamp_range_keep_should() {\n        let time1 = 1700001000;\n        let time2 = 1700002000;\n        let time3 = 1700003000;\n\n        let timestamp_field = \"timestamp\".to_string();\n\n        
// the request's lower bound falls inside the split, so the time bound is kept and\n        // the original `should` query is preserved under `must`\n        let split = SplitIdAndFooterOffsets {\n            timestamp_start: Some(time1),\n            timestamp_end: Some(time3),\n            ..SplitIdAndFooterOffsets::default()\n        };\n\n        let mut search_request = SearchRequest {\n            query_ast: serde_json::to_string(&QueryAst::Bool(BoolQuery {\n                should: vec![QueryAst::MatchAll],\n                ..BoolQuery::default()\n            }))\n            .unwrap(),\n            start_timestamp: Some(time2),\n            end_timestamp: None,\n            ..SearchRequest::default()\n        };\n        remove_redundant_timestamp_range(&mut search_request, &split, &timestamp_field);\n        assert_ast_eq(\n            &search_request,\n            &QueryAst::Bool(BoolQuery {\n                // original request\n                must: vec![QueryAst::Bool(BoolQuery {\n                    should: vec![QueryAst::MatchAll],\n                    ..BoolQuery::default()\n                })],\n                // time bound\n                filter: vec![\n                    RangeQuery {\n                        field: \"timestamp\".to_string(),\n                        lower_bound: Bound::Included(1_700_002_000_000_000_000u64.into()),\n                        upper_bound: Bound::Unbounded,\n                    }\n                    .into(),\n                ],\n                ..BoolQuery::default()\n            }),\n        );\n    }\n\n    #[test]\n    fn test_remove_extended_bounds_from_histogram() {\n        let histo_at_root = r#\"\n{\n  \"date_histo\": {\n    \"date_histogram\": {\n      \"extended_bounds\": {\n        \"max\": 1425254400000,\n        \"min\": 1420070400000\n      },\n      \"field\": \"date\",\n      \"fixed_interval\": \"30d\",\n      \"offset\": \"-4d\"\n    }\n  }\n}\n\"#;\n\n        let histo_at_root_no_bounds = r#\"\n{\n  \"date_histo\": {\n    \"date_histogram\": {\n      \"field\": 
\"date\",\n      \"fixed_interval\": \"30d\",\n      \"offset\": \"-4d\"\n    }\n  }\n}\n\"#;\n\n        let histo_at_root_with_sibling = r#\"\n{\n  \"metrics\": {\n    \"aggs\": {\n      \"response\": {\n        \"percentiles\": {\n          \"field\": \"response\",\n          \"keyed\": false,\n          \"percents\": [\n            85\n          ]\n        }\n      }\n    },\n    \"date_histogram\": {\n      \"extended_bounds\": {\n        \"max\": 1425254400000,\n        \"min\": 1420070400000\n      },\n      \"field\": \"date\",\n      \"fixed_interval\": \"30d\",\n      \"offset\": \"-4d\"\n    }\n  }\n}\n\"#;\n\n        let histo_at_root_with_sibling_no_bounds = r#\"\n{\n  \"metrics\": {\n    \"aggs\": {\n      \"response\": {\n        \"percentiles\": {\n          \"field\": \"response\",\n          \"keyed\": false,\n          \"percents\": [\n            85\n          ]\n        }\n      }\n    },\n    \"date_histogram\": {\n      \"field\": \"date\",\n      \"fixed_interval\": \"30d\",\n      \"offset\": \"-4d\"\n    }\n  }\n}\n\"#;\n        let histo_at_leaf = r#\"\n{\n  \"metrics\": {\n    \"aggs\": {\n      \"response\": {\n        \"date_histogram\": {\n          \"extended_bounds\": {\n            \"max\": 1425254400000,\n            \"min\": 1420070400000\n          },\n          \"field\": \"date\",\n          \"fixed_interval\": \"30d\",\n          \"offset\": \"-4d\"\n        }\n      }\n    },\n    \"percentiles\": {\n      \"field\": \"response\",\n      \"keyed\": false,\n      \"percents\": [\n        85\n      ]\n    }\n  }\n}\n\"#;\n\n        let histo_at_leaf_no_bounds = r#\"\n{\n  \"metrics\": {\n    \"aggs\": {\n      \"response\": {\n        \"date_histogram\": {\n          \"field\": \"date\",\n          \"fixed_interval\": \"30d\",\n          \"offset\": \"-4d\"\n        }\n      }\n    },\n    \"percentiles\": {\n      \"field\": \"response\",\n      \"keyed\": false,\n      \"percents\": [\n        85\n      ]\n    }\n  
}\n}\n\"#;\n        for (bounds, no_bounds) in [\n            (histo_at_root, histo_at_root_no_bounds),\n            (\n                histo_at_root_with_sibling,\n                histo_at_root_with_sibling_no_bounds,\n            ),\n            (histo_at_leaf, histo_at_leaf_no_bounds),\n        ] {\n            // first assert we do nothing when there are no bounds\n            let request_no_bounds = SearchRequest {\n                aggregation_request: Some(no_bounds.to_string()),\n                ..SearchRequest::default()\n            };\n            let mut request_no_bounds_clone = request_no_bounds.clone();\n            rewrite_aggregation(&mut request_no_bounds_clone);\n            assert_eq!(request_no_bounds, request_no_bounds_clone);\n\n            let mut request_bounds = SearchRequest {\n                aggregation_request: Some(bounds.to_string()),\n                ..SearchRequest::default()\n            };\n            rewrite_aggregation(&mut request_bounds);\n            // we can't just compare bounds and no_bounds, they must be structurally equal, but not\n            // necessarily identical (field order, null vs absent...). 
So we parse both and verify\n            // the results are equal instead\n            let no_bounds_agg: QuickwitAggregations =\n                serde_json::from_str(&request_no_bounds.aggregation_request.unwrap()).unwrap();\n            let rewrote_bounds_agg: QuickwitAggregations =\n                serde_json::from_str(&request_bounds.aggregation_request.unwrap()).unwrap();\n            assert_eq!(rewrote_bounds_agg, no_bounds_agg);\n        }\n    }\n\n    fn create_tantivy_dir_with_hotcache<'a, V>(\n        field_entry: FieldEntry,\n        field_value: V,\n    ) -> (HotDirectory, usize)\n    where\n        V: Value<'a>,\n    {\n        let field_name = field_entry.name().to_string();\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_field(field_entry);\n        let schema = schema_builder.build();\n\n        let ram_directory = RamDirectory::create();\n        let index = Index::open_or_create(ram_directory.clone(), schema.clone()).unwrap();\n\n        let mut index_writer = index.writer(15_000_000).unwrap();\n        let field = schema.get_field(&field_name).unwrap();\n        let mut new_doc = TantivyDocument::default();\n        new_doc.add_field_value(field, field_value);\n        index_writer.add_document(new_doc).unwrap();\n        index_writer.commit().unwrap();\n\n        let mut hotcache_bytes_writer = Vec::new().writer();\n        write_hotcache(ram_directory.clone(), &mut hotcache_bytes_writer).unwrap();\n        let hotcache_bytes = OwnedBytes::new(hotcache_bytes_writer.into_inner());\n        let hot_directory = HotDirectory::open(ram_directory.clone(), hotcache_bytes).unwrap();\n        (hot_directory, ram_directory.total_mem_usage())\n    }\n\n    #[test]\n    fn test_compute_index_size_without_store() {\n        // We don't want to make assertions on absolute index sizes (it might\n        // change in future Tantivy versions), but rather verify that the store\n        // is properly excluded from the computed 
size.\n\n        // We use random bytes so that the store can't compress them\n        let mut payload = vec![0u8; 1024];\n        rand::rng().fill(&mut payload[..]);\n\n        let (hotcache_directory_stored_payload, directory_size_stored_payload) =\n            create_tantivy_dir_with_hotcache(\n                FieldEntry::new_bytes(\"payload\".to_string(), BytesOptions::default().set_stored()),\n                &payload,\n            );\n        let size_with_stored_payload =\n            compute_index_size(&hotcache_directory_stored_payload).as_u64();\n\n        let (hotcache_directory_index_only, directory_size_index_only) =\n            create_tantivy_dir_with_hotcache(\n                FieldEntry::new_bytes(\"payload\".to_string(), BytesOptions::default()),\n                &payload,\n            );\n        let size_index_only = compute_index_size(&hotcache_directory_index_only).as_u64();\n\n        assert!(directory_size_stored_payload > directory_size_index_only + 1000);\n        assert!(size_with_stored_payload.abs_diff(size_index_only) < 10);\n    }\n\n    #[test]\n    fn test_compute_index_size_varies_with_data() {\n        // We don't want to make assertions on absolute index sizes (it might\n        // change in future Tantivy versions), but rather verify that an index\n        // with more data is indeed bigger.\n\n        let indexing_options =\n            TextOptions::default().set_indexing_options(TextFieldIndexing::default());\n\n        let (hotcache_directory_larger, directory_size_larger) = create_tantivy_dir_with_hotcache(\n            FieldEntry::new_text(\"text\".to_string(), indexing_options.clone()),\n            \"Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium \\\n             doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore \\\n             veritatis et quasi architecto beatae vitae dicta sunt explicabo. 
Nemo enim ipsam \\\n             voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur \\\n             magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, \\\n             qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non \\\n             numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat \\\n             voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis \\\n             suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum \\\n             iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, \\\n             vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?\",\n        );\n        let larger_size = compute_index_size(&hotcache_directory_larger).as_u64();\n\n        let (hotcache_directory_smaller, directory_size_smaller) = create_tantivy_dir_with_hotcache(\n            FieldEntry::new_text(\"text\".to_string(), indexing_options),\n            \"hi\",\n        );\n        let smaller_size = compute_index_size(&hotcache_directory_smaller).as_u64();\n\n        assert!(directory_size_larger > directory_size_smaller + 100);\n        assert!(larger_size > smaller_size + 100);\n    }\n\n    fn nz(n: usize) -> std::num::NonZeroUsize {\n        std::num::NonZeroUsize::new(n).unwrap()\n    }\n\n    #[test]\n    fn test_greedy_batch_split_empty() {\n        let items: Vec<u64> = vec![];\n        let batches = super::greedy_batch_split(items, |&x| x, nz(5));\n        assert!(batches.is_empty());\n    }\n\n    #[test]\n    fn test_greedy_batch_split_single_batch() {\n        let items = vec![10u64, 20, 30];\n        let batches = super::greedy_batch_split(items, |&x| x, nz(10));\n        assert_eq!(batches.len(), 1);\n        assert_eq!(batches[0].len(), 3);\n    }\n\n    #[test]\n    fn test_greedy_batch_split_balances_weights() {\n        // 7 items with 
weights, max 3 per batch -> 3 batches\n        let items = vec![100u64, 80, 60, 50, 40, 30, 20];\n        let batches = super::greedy_batch_split(items, |&x| x, nz(3));\n\n        assert_eq!(batches.len(), 3);\n\n        // All items should be present\n        let mut all_items: Vec<u64> = batches.iter().flatten().copied().collect();\n        all_items.sort_unstable();\n        assert_eq!(all_items, vec![20, 30, 40, 50, 60, 80, 100]);\n\n        // Check weights are reasonably balanced\n        let weights: Vec<u64> = batches.iter().map(|b| b.iter().sum()).collect();\n        let max_weight = *weights.iter().max().unwrap();\n        let min_weight = *weights.iter().min().unwrap();\n        // With greedy LPT, the imbalance should be bounded\n        assert!(\n            max_weight <= min_weight * 2,\n            \"weights should be reasonably balanced: {:?}\",\n            weights\n        );\n    }\n\n    #[test]\n    fn test_greedy_batch_split_count_balance() {\n        // 10 items, max 3 per batch -> 4 batches\n        // counts should be either 2 or 3 per batch\n        let items: Vec<u64> = (0..10).collect();\n        let batches = super::greedy_batch_split(items, |&x| x, nz(3));\n\n        assert_eq!(batches.len(), 4);\n        let counts: Vec<usize> = batches.iter().map(|b| b.len()).collect();\n        for count in &counts {\n            assert!(\n                *count >= 2 && *count <= 3,\n                \"count should be 2 or 3, got {}\",\n                count\n            );\n        }\n        assert_eq!(counts.iter().sum::<usize>(), 10);\n    }\n\n    fn make_splits_with_requests(\n        num_splits: usize,\n    ) -> Vec<(SplitIdAndFooterOffsets, SearchRequest)> {\n        (0..num_splits)\n            .map(|idx| {\n                let split = SplitIdAndFooterOffsets {\n                    split_id: format!(\"split_{idx}\"),\n                    num_docs: 100,\n                    ..Default::default()\n                };\n                (split, 
SearchRequest::default())\n            })\n            .collect()\n    }\n\n    #[tokio::test]\n    async fn test_schedule_search_tasks_no_lambda_all_local() {\n        let searcher_context = SearcherContext::for_test();\n        let splits = make_splits_with_requests(5);\n        let result = super::schedule_search_tasks(splits, &searcher_context).await;\n        assert_eq!(result.local_search_tasks.len(), 5);\n        assert!(result.offloaded_search_tasks.is_empty());\n        for (idx, task) in result.local_search_tasks.iter().enumerate() {\n            assert_eq!(task.split.split_id, format!(\"split_{idx}\"));\n        }\n    }\n\n    struct DummyInvoker;\n    #[async_trait]\n    impl LambdaLeafSearchInvoker for DummyInvoker {\n        async fn invoke_leaf_search(\n            &self,\n            _req: LeafSearchRequest,\n        ) -> Result<Vec<LambdaSingleSplitResult>, SearchError> {\n            todo!()\n        }\n    }\n\n    #[tokio::test]\n    async fn test_schedule_search_tasks_lambda_offloads_excess() {\n        let mut config = SearcherConfig::default();\n        config.lambda = Some(LambdaConfig {\n            offload_threshold: 3,\n            ..LambdaConfig::for_test()\n        });\n        let searcher_context = SearcherContext::new(config, None, Some(Arc::new(DummyInvoker)));\n        let splits = make_splits_with_requests(7);\n        let result = super::schedule_search_tasks(splits, &searcher_context).await;\n        assert_eq!(result.local_search_tasks.len(), 3);\n        assert_eq!(result.offloaded_search_tasks.len(), 4);\n        for (idx, task) in result.local_search_tasks.iter().enumerate() {\n            assert_eq!(task.split.split_id, format!(\"split_{idx}\"));\n        }\n        for (idx, (split, _req)) in result.offloaded_search_tasks.iter().enumerate() {\n            assert_eq!(split.split_id, format!(\"split_{}\", idx + 3));\n        }\n    }\n\n    #[tokio::test]\n    async fn 
test_schedule_search_tasks_lambda_threshold_zero_offloads_all() {\n        let mut config = SearcherConfig::default();\n        config.lambda = Some(LambdaConfig {\n            offload_threshold: 0,\n            ..LambdaConfig::for_test()\n        });\n        let searcher_context = SearcherContext::new(config, None, Some(Arc::new(DummyInvoker)));\n        let splits = make_splits_with_requests(5);\n        let result = super::schedule_search_tasks(splits, &searcher_context).await;\n        assert!(result.local_search_tasks.is_empty());\n        assert_eq!(result.offloaded_search_tasks.len(), 5);\n    }\n\n    #[tokio::test]\n    async fn test_schedule_search_tasks_lambda_threshold_above_split_count() {\n        let mut config = SearcherConfig::default();\n        config.lambda = Some(LambdaConfig {\n            offload_threshold: 100,\n            ..LambdaConfig::for_test()\n        });\n        let searcher_context = SearcherContext::new(config, None, Some(Arc::new(DummyInvoker)));\n        let splits = make_splits_with_requests(5);\n        let result = super::schedule_search_tasks(splits, &searcher_context).await;\n        assert_eq!(result.local_search_tasks.len(), 5);\n        assert!(result.offloaded_search_tasks.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_schedule_search_tasks_empty() {\n        let searcher_context = SearcherContext::for_test();\n        let result = super::schedule_search_tasks(Vec::new(), &searcher_context).await;\n        assert!(result.local_search_tasks.is_empty());\n        assert!(result.offloaded_search_tasks.is_empty());\n    }\n\n    mod proptest_greedy_batch {\n        use std::num::NonZeroUsize;\n\n        use proptest::prelude::*;\n\n        proptest! 
{\n            #[test]\n            fn all_items_preserved(\n                items in prop::collection::vec(0u64..1000, 0..100),\n                max_per_batch in 1usize..20\n            ) {\n                let original: Vec<u64> = items.clone();\n                let max_per_batch = NonZeroUsize::new(max_per_batch).unwrap();\n                let batches = super::super::greedy_batch_split(items, |&x| x, max_per_batch);\n\n                // All items should be present exactly once\n                let mut result: Vec<u64> = batches.into_iter().flatten().collect();\n                result.sort_unstable();\n                let mut expected = original;\n                expected.sort_unstable();\n                prop_assert_eq!(result, expected);\n            }\n\n            #[test]\n            fn batch_count_correct(\n                items in prop::collection::vec(0u64..1000, 1..100),\n                max_per_batch in 1usize..20\n            ) {\n                let n = items.len();\n                let max_per_batch_nz = NonZeroUsize::new(max_per_batch).unwrap();\n                let batches = super::super::greedy_batch_split(items, |&x| x, max_per_batch_nz);\n\n                let expected_batches = n.div_ceil(max_per_batch);\n                prop_assert_eq!(batches.len(), expected_batches);\n            }\n\n            #[test]\n            fn total_items_matches(\n                items in prop::collection::vec(0u64..1000, 1..100),\n                max_per_batch in 1usize..20\n            ) {\n                let n = items.len();\n                let max_per_batch = NonZeroUsize::new(max_per_batch).unwrap();\n                let batches = super::super::greedy_batch_split(items, |&x| x, max_per_batch);\n\n                // Total items across all batches equals input\n                let total: usize = batches.iter().map(|b| b.len()).sum();\n                prop_assert_eq!(total, n);\n            }\n\n            #[test]\n            fn 
greedy_balances_by_weight_not_count(\n                // Use items with significant weights to test weight balancing\n                items in prop::collection::vec(100u64..1000, 4..30),\n                max_per_batch in 2usize..10\n            ) {\n                let max_per_batch = NonZeroUsize::new(max_per_batch).unwrap();\n                let batches = super::super::greedy_batch_split(items, |&x| x, max_per_batch);\n\n                if batches.len() >= 2 {\n                    let weights: Vec<u64> = batches.iter().map(|b| b.iter().sum()).collect();\n                    let total_weight: u64 = weights.iter().sum();\n                    let avg_weight = total_weight / batches.len() as u64;\n\n                    // LPT guarantees max makespan <= (4/3) * optimal\n                    // With balanced input, max should be close to average\n                    let max_weight = *weights.iter().max().unwrap();\n\n                    // Max weight should be at most 2x average (generous bound)\n                    prop_assert!(\n                        max_weight <= avg_weight * 2 + 1000, // +1000 for rounding slack\n                        \"max weight {} too far from average {}\",\n                        max_weight,\n                        avg_weight\n                    );\n                }\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/leaf_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::{Bound, RangeBounds};\n\nuse prost::Message;\nuse quickwit_config::CacheConfig;\nuse quickwit_proto::search::{\n    CountHits, LeafSearchResponse, SearchRequest, SplitIdAndFooterOffsets,\n};\nuse quickwit_proto::types::SplitId;\nuse quickwit_storage::{MemorySizedCache, OwnedBytes};\nuse tantivy::index::SegmentId;\n\n/// A cache to memoize `leaf_search_single_split` results.\npub struct LeafSearchCache {\n    content: MemorySizedCache<CacheKey>,\n}\n\n// TODO we could be smarter about search_after. If we have a cached request with a search_after\n// (possibly equal to None) A, and a corresponding response with the 1st element having the value\n// B, and we receive a 2nd request with a search_after such that A <= C < B, we can serve from\n// cache directly. Only the case A = C < B is currently handled.\n// TODO if we don't request counting all results, have no aggregation, and we get a request we can\n// match, the merged_time_range is strictly smaller, and every hit we had fits in the new\n// timebound, we can reply from cache, saying we hit only result.partial_hits.len() res. 
It always\n// undercounts, and necessarily returns the right hits.\n// TODO if we stored a result for X hits, but a subsequent request asks for Y < X hits, we can\n// modify the answer and serve from cache.\n// TODO mix of 1 and 3.\n// TODO this means given a request for X documents, we could search for k*X docs in each split,\n// truncate to X while merging, and get free results from cache for at least the next k subsequent\n// queries which vary only by search_after.\n\nimpl LeafSearchCache {\n    pub fn new(config: &CacheConfig) -> LeafSearchCache {\n        LeafSearchCache {\n            content: MemorySizedCache::from_config(\n                config,\n                &quickwit_storage::STORAGE_METRICS.partial_request_cache,\n            ),\n        }\n    }\n    pub fn get(\n        &self,\n        split_info: SplitIdAndFooterOffsets,\n        search_request: SearchRequest,\n    ) -> Option<LeafSearchResponse> {\n        let key = CacheKey::from_split_meta_and_request(split_info, search_request);\n        let encoded_result = self.content.get(&key)?;\n        // this should never fail\n        LeafSearchResponse::decode(&*encoded_result).ok()\n    }\n\n    pub fn put(\n        &self,\n        split_info: SplitIdAndFooterOffsets,\n        search_request: SearchRequest,\n        result: LeafSearchResponse,\n    ) {\n        let key = CacheKey::from_split_meta_and_request(split_info, search_request);\n        let encoded_result = result.encode_to_vec();\n        self.content.put(key, OwnedBytes::new(encoded_result));\n    }\n}\n\n/// A key inside a [`LeafSearchCache`].\n#[derive(Debug, Hash, Clone, PartialEq, Eq)]\nstruct CacheKey {\n    /// The split this entry refers to\n    split_id: SplitId,\n    /// The request this matches. 
The timerange of the request was removed.\n    request: SearchRequest,\n    /// The effective time range of the request, that is, the intersection of the timerange\n    /// requested, and the timerange covered by the split.\n    merged_time_range: HalfOpenRange,\n}\n\nimpl CacheKey {\n    fn from_split_meta_and_request(\n        split_info: SplitIdAndFooterOffsets,\n        mut search_request: SearchRequest,\n    ) -> Self {\n        let split_time_range = HalfOpenRange::from_bounds(split_info.time_range());\n        let request_time_range = HalfOpenRange::from_bounds(search_request.time_range());\n        let merged_time_range = request_time_range.intersect(&split_time_range);\n\n        search_request.start_timestamp = None;\n        search_request.end_timestamp = None;\n        // it doesn't matter whether or not we count all hits at the scale of a\n        // single split: either we did process it and got everything, or we didn't.\n        search_request.count_hits = CountHits::CountAll.into();\n\n        CacheKey {\n            split_id: split_info.split_id,\n            request: search_request,\n            merged_time_range,\n        }\n    }\n}\n\n/// A (half-open) range bounded inclusively below and exclusively above [start..end).\n#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]\nstruct HalfOpenRange {\n    start: i64,\n    end: Option<i64>,\n}\n\nimpl HalfOpenRange {\n    fn empty_range() -> HalfOpenRange {\n        HalfOpenRange {\n            start: 0,\n            end: Some(0),\n        }\n    }\n\n    /// Create a Range from bounds.\n    fn from_bounds(range: impl RangeBounds<i64>) -> Self {\n        let start = match range.start_bound() {\n            Bound::Included(start) => *start,\n            Bound::Excluded(start) => {\n                // if we exclude i64::MAX from the start bound, the range is necessarily empty\n                if let Some(start) = start.checked_add(1) {\n                    start\n                } else {\n                
    return Self::empty_range();\n                }\n            }\n            Bound::Unbounded => i64::MIN,\n        };\n        let end = match range.end_bound() {\n            // if we include i64::MAX at the end bound, this is essentially boundless\n            Bound::Included(end) => end.checked_add(1),\n            Bound::Excluded(end) => Some(*end),\n            Bound::Unbounded => None,\n        };\n\n        HalfOpenRange { start, end }.normalize()\n    }\n\n    fn is_empty(self) -> bool {\n        !self.contains(&self.start)\n    }\n\n    /// Normalize empty ranges to be 0..0\n    fn normalize(self) -> HalfOpenRange {\n        if self.is_empty() {\n            Self::empty_range()\n        } else {\n            self\n        }\n    }\n\n    /// Return the intersection of self and other.\n    fn intersect(&self, other: &HalfOpenRange) -> HalfOpenRange {\n        let start = self.start.max(other.start);\n        let end = match (self.end, other.end) {\n            (Some(this), Some(other)) => Some(this.min(other)),\n            (Some(this), None) => Some(this),\n            (None, other) => other,\n        };\n        HalfOpenRange { start, end }.normalize()\n    }\n}\n\nimpl RangeBounds<i64> for HalfOpenRange {\n    fn start_bound(&self) -> Bound<&i64> {\n        Bound::Included(&self.start)\n    }\n\n    fn end_bound(&self) -> Bound<&i64> {\n        if let Some(end_bound) = &self.end {\n            Bound::Excluded(end_bound)\n        } else {\n            Bound::Unbounded\n        }\n    }\n}\n\npub struct PredicateCacheImpl {\n    content: MemorySizedCache<(SplitId, String)>,\n}\n\nimpl PredicateCacheImpl {\n    pub fn new(config: &CacheConfig) -> Self {\n        PredicateCacheImpl {\n            content: MemorySizedCache::from_config(\n                config,\n                &quickwit_storage::STORAGE_METRICS.predicate_cache,\n            ),\n        }\n    }\n}\n\nimpl quickwit_query::query_ast::PredicateCache for PredicateCacheImpl {\n    fn get(\n    
    &self,\n        split_id: SplitId,\n        query_ast_json: String,\n    ) -> Option<(SegmentId, quickwit_query::query_ast::HitSet)> {\n        let encoded_result = self.content.get(&(split_id, query_ast_json))?;\n        let (segment_id_bytes, hits_buffer) = encoded_result.split(32);\n        let segment_id =\n            SegmentId::from_uuid_string(str::from_utf8(&segment_id_bytes).ok()?).ok()?;\n        let hits = quickwit_query::query_ast::HitSet::from_buffer(hits_buffer);\n        Some((segment_id, hits))\n    }\n\n    fn put(\n        &self,\n        split_id: SplitId,\n        query_ast_json: String,\n        segment: SegmentId,\n        hits: quickwit_query::query_ast::HitSet,\n    ) {\n        let hits_buffer = hits.into_buffer();\n        let mut buffer = Vec::with_capacity(32 + hits_buffer.len());\n        buffer.extend_from_slice(segment.uuid_string().as_bytes());\n        buffer.extend_from_slice(&hits_buffer);\n        self.content\n            .put((split_id, query_ast_json), OwnedBytes::new(buffer));\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytesize::ByteSize;\n    use quickwit_proto::search::{\n        LeafSearchResponse, PartialHit, ResourceStats, SearchRequest, SortValue,\n        SplitIdAndFooterOffsets,\n    };\n\n    use super::LeafSearchCache;\n\n    #[test]\n    fn test_leaf_search_cache_no_timestamp() {\n        let cache = LeafSearchCache::new(&ByteSize::mb(64).into());\n\n        let split_1 = SplitIdAndFooterOffsets {\n            split_id: \"split_1\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: None,\n            timestamp_end: None,\n            num_docs: 0,\n        };\n\n        let split_2 = SplitIdAndFooterOffsets {\n            split_id: \"split_2\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: None,\n            timestamp_end: None,\n            num_docs: 0,\n        };\n\n        
let query_1 = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: \"test\".to_string(),\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: 10,\n            start_offset: 0,\n            ..Default::default()\n        };\n\n        let query_2 = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: \"test2\".to_string(),\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: 10,\n            start_offset: 0,\n            ..Default::default()\n        };\n\n        let result = LeafSearchResponse {\n            failed_splits: Vec::new(),\n            intermediate_aggregation_result: None,\n            num_attempted_splits: 1,\n            num_successful_splits: 1,\n            num_hits: 1234,\n            partial_hits: vec![PartialHit {\n                doc_id: 1,\n                segment_ord: 0,\n                sort_value: Some(SortValue::U64(0u64).into()),\n                sort_value2: None,\n                split_id: \"split_1\".to_string(),\n            }],\n            resource_stats: None,\n        };\n\n        assert!(cache.get(split_1.clone(), query_1.clone()).is_none());\n\n        cache.put(split_1.clone(), query_1.clone(), result.clone());\n        assert_eq!(cache.get(split_1.clone(), query_1.clone()).unwrap(), result);\n        assert!(cache.get(split_2, query_1).is_none());\n        assert!(cache.get(split_1, query_2).is_none());\n    }\n\n    #[test]\n    fn test_leaf_search_cache_timestamp() {\n        let cache = LeafSearchCache::new(&ByteSize::mb(64).into());\n\n        let split_1 = SplitIdAndFooterOffsets {\n            split_id: \"split_1\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: Some(100),\n            timestamp_end: Some(199),\n            num_docs: 0,\n        };\n        let split_2 = 
SplitIdAndFooterOffsets {\n            split_id: \"split_2\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: Some(150),\n            timestamp_end: Some(249),\n            num_docs: 0,\n        };\n        let split_3 = SplitIdAndFooterOffsets {\n            split_id: \"split_3\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: Some(150),\n            timestamp_end: Some(249),\n            num_docs: 0,\n        };\n\n        let query_1 = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: \"test\".to_string(),\n            start_timestamp: Some(100),\n            end_timestamp: Some(250),\n            max_hits: 10,\n            start_offset: 0,\n            ..Default::default()\n        };\n        let query_1bis = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: \"test\".to_string(),\n            start_timestamp: Some(150),\n            end_timestamp: Some(300),\n            max_hits: 10,\n            start_offset: 0,\n            ..Default::default()\n        };\n\n        let query_2 = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: \"test2\".to_string(),\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: 10,\n            start_offset: 0,\n            ..Default::default()\n        };\n        let query_2bis = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: \"test2\".to_string(),\n            start_timestamp: Some(50),\n            end_timestamp: Some(200),\n            max_hits: 10,\n            start_offset: 0,\n            ..Default::default()\n        };\n\n        let result = LeafSearchResponse {\n            failed_splits: Vec::new(),\n            
intermediate_aggregation_result: None,\n            num_attempted_splits: 1,\n            num_successful_splits: 1,\n            num_hits: 1234,\n            partial_hits: vec![PartialHit {\n                doc_id: 1,\n                segment_ord: 0,\n                sort_value: Some(SortValue::U64(0).into()),\n                sort_value2: None,\n                split_id: \"split_1\".to_string(),\n            }],\n            resource_stats: Some(ResourceStats::default()),\n        };\n\n        // for split_1, 1 and 1bis cover different timestamp ranges\n        cache.put(split_1.clone(), query_1.clone(), result.clone());\n        assert!(cache.get(split_1.clone(), query_1.clone()).is_some());\n        assert!(cache.get(split_1.clone(), query_1bis.clone()).is_none());\n\n        // for split_2, both 1 and 1bis cover everything, so it should cache-hit\n        cache.put(split_2.clone(), query_1.clone(), result.clone());\n        assert!(cache.get(split_2.clone(), query_1).is_some());\n        assert!(cache.get(split_2.clone(), query_1bis).is_some());\n\n        // for split_1, both 2 and 2bis cover everything, so it should cache-hit\n        cache.put(split_1.clone(), query_2.clone(), result.clone());\n        assert!(cache.get(split_1.clone(), query_2.clone()).is_some());\n        assert!(cache.get(split_1, query_2bis.clone()).is_some());\n\n        // for split_2, 2 covers everything, but 2bis covers only a subrange\n        cache.put(split_2.clone(), query_2.clone(), result.clone());\n        assert!(cache.get(split_2.clone(), query_2.clone()).is_some());\n        assert!(cache.get(split_2, query_2bis.clone()).is_none());\n\n        // same for split_3, but we try caching the bounded request and query for the unbounded one\n        cache.put(split_3.clone(), query_2bis.clone(), result);\n        assert!(cache.get(split_3.clone(), query_2).is_none());\n        assert!(cache.get(split_3, query_2bis).is_some());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n//! This projects implements quickwit's search API.\n#![warn(missing_docs)]\n#![allow(clippy::bool_assert_comparison)]\n#![deny(clippy::disallowed_methods)]\n\nmod client;\nmod cluster_client;\nmod collector;\nmod error;\nmod fetch_docs;\nmod find_trace_ids_collector;\n\nmod invoker;\n/// Leaf search operations.\npub mod leaf;\nmod leaf_cache;\nmod list_fields;\nmod list_fields_cache;\nmod list_terms;\nmod metrics_trackers;\nmod retry;\nmod root;\nmod scroll_context;\nmod search_job_placer;\nmod search_response_rest;\nmod service;\npub(crate) mod top_k_collector;\n\nmod metrics;\nmod search_permit_provider;\n\n#[cfg(test)]\nmod tests;\n\npub use collector::QuickwitAggregations;\nuse metrics::SEARCH_METRICS;\nuse quickwit_common::thread_pool::ThreadPool;\nuse quickwit_common::tower::Pool;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::metastore::{\n    ListIndexesMetadataRequest, ListSplitsRequest, MetastoreService, MetastoreServiceClient,\n};\nuse tantivy::schema::NamedFieldDocument;\n\n/// Refer to this as `crate::Result<T>`.\npub type Result<T> = std::result::Result<T, SearchError>;\n\nuse std::net::{Ipv4Addr, SocketAddr};\nuse std::sync::{Arc, OnceLock};\n\npub use find_trace_ids_collector::{FindTraceIdsCollector, Span};\nuse quickwit_config::SearcherConfig;\nuse quickwit_doc_mapper::tag_pruning::TagFilterAst;\nuse 
quickwit_metastore::{\n    IndexMetadata, ListIndexesMetadataResponseExt, ListSplitsQuery, ListSplitsRequestExt,\n    MetastoreServiceStreamSplitsExt, SplitMetadata, SplitState,\n};\nuse quickwit_proto::search::{\n    PartialHit, ResourceStats, SearchRequest, SearchResponse, SplitIdAndFooterOffsets,\n};\nuse quickwit_proto::types::IndexUid;\nuse quickwit_storage::StorageResolver;\npub use service::SearcherContext;\nuse tantivy::DocAddress;\n\npub use crate::client::{\n    SearchServiceClient, create_search_client_from_channel, create_search_client_from_grpc_addr,\n};\npub use crate::cluster_client::ClusterClient;\npub use crate::error::{SearchError, parse_grpc_error};\nuse crate::fetch_docs::fetch_docs;\npub use crate::invoker::LambdaLeafSearchInvoker;\npub use crate::root::{\n    IndexMetasForLeafSearch, SearchJob, ensure_all_indexes_found, jobs_to_leaf_request,\n    root_search, search_plan,\n};\npub use crate::search_job_placer::{Job, SearchJobPlacer};\npub use crate::search_response_rest::{\n    AggregationResults, SearchPlanResponseRest, SearchResponseRest,\n};\npub use crate::service::{MockSearchService, SearchService, SearchServiceImpl};\n\n/// A pool of searcher clients identified by their gRPC socket address.\npub type SearcherPool = Pool<SocketAddr, SearchServiceClient>;\n\nfn search_thread_pool() -> &'static ThreadPool {\n    static SEARCH_THREAD_POOL: OnceLock<ThreadPool> = OnceLock::new();\n    SEARCH_THREAD_POOL.get_or_init(|| ThreadPool::new(\"search\", None))\n}\n\n/// GlobalDocAddress serves as a hit address.\n#[derive(Clone, Eq, Debug, PartialEq, Hash, Ord, PartialOrd)]\npub struct GlobalDocAddress {\n    /// Split containing the document\n    pub split: String,\n    /// Document address inside the split\n    pub doc_addr: DocAddress,\n}\n\n/// An error that occurred while converting a string to a GlobalDocAddress\n#[derive(Debug, Clone, Copy)]\npub struct GlobalDocAddressParseError;\n\nimpl GlobalDocAddress {\n    /// Extract a GlobalDocAddress from a 
PartialHit\n    pub fn from_partial_hit(partial_hit: &PartialHit) -> Self {\n        Self {\n            split: partial_hit.split_id.to_string(),\n            doc_addr: DocAddress {\n                segment_ord: partial_hit.segment_ord,\n                doc_id: partial_hit.doc_id,\n            },\n        }\n    }\n}\n\nimpl std::fmt::Display for GlobalDocAddress {\n    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        f.write_str(&self.split)?;\n        write!(\n            f,\n            \":{:08x}:{:08x}\",\n            self.doc_addr.segment_ord, self.doc_addr.doc_id\n        )\n    }\n}\n\nimpl std::str::FromStr for GlobalDocAddress {\n    type Err = GlobalDocAddressParseError;\n\n    fn from_str(s: &str) -> std::result::Result<Self, Self::Err> {\n        let mut s_iter = s.splitn(3, ':');\n        let split = s_iter.next().ok_or(GlobalDocAddressParseError)?.to_string();\n        let segment = s_iter.next().ok_or(GlobalDocAddressParseError)?;\n        let doc_id = s_iter.next().ok_or(GlobalDocAddressParseError)?;\n\n        let segment_ord =\n            u32::from_str_radix(segment, 16).map_err(|_| GlobalDocAddressParseError)?;\n        let doc_id = u32::from_str_radix(doc_id, 16).map_err(|_| GlobalDocAddressParseError)?;\n\n        Ok(GlobalDocAddress {\n            split,\n            doc_addr: DocAddress {\n                segment_ord,\n                doc_id,\n            },\n        })\n    }\n}\n\nfn extract_split_and_footer_offsets(split_metadata: &SplitMetadata) -> SplitIdAndFooterOffsets {\n    SplitIdAndFooterOffsets {\n        split_id: split_metadata.split_id.clone(),\n        split_footer_start: split_metadata.footer_offsets.start,\n        split_footer_end: split_metadata.footer_offsets.end,\n        timestamp_start: split_metadata\n            .time_range\n            .as_ref()\n            .map(|time_range| *time_range.start()),\n        timestamp_end: split_metadata\n            .time_range\n            .as_ref()\n   
         .map(|time_range| *time_range.end()),\n        num_docs: split_metadata.num_docs as u64,\n    }\n}\n\n/// Get all splits of given index ids\npub async fn list_all_splits(\n    index_uids: Vec<IndexUid>,\n    metastore: &mut MetastoreServiceClient,\n) -> crate::Result<Vec<SplitMetadata>> {\n    list_relevant_splits(index_uids, None, None, None, metastore).await\n}\n\n/// Extract the list of relevant splits for a given request.\npub async fn list_relevant_splits(\n    index_uids: Vec<IndexUid>,\n    start_timestamp: Option<i64>,\n    end_timestamp: Option<i64>,\n    tags_filter_opt: Option<TagFilterAst>,\n    metastore: &mut MetastoreServiceClient,\n) -> crate::Result<Vec<SplitMetadata>> {\n    let Some(mut query) = ListSplitsQuery::try_from_index_uids(index_uids) else {\n        return Ok(Vec::new());\n    };\n    query = query.with_split_state(SplitState::Published);\n\n    if let Some(start_ts) = start_timestamp {\n        query = query.with_time_range_start_gte(start_ts);\n    }\n    if let Some(end_ts) = end_timestamp {\n        query = query.with_time_range_end_lt(end_ts);\n    }\n    if let Some(tags_filter) = tags_filter_opt {\n        query = query.with_tags_filter(tags_filter);\n    }\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n    let splits_metadata: Vec<SplitMetadata> = metastore\n        .list_splits(list_splits_request)\n        .await?\n        .collect_splits_metadata()\n        .await?;\n    Ok(splits_metadata)\n}\n\n/// Resolve index patterns and returns IndexMetadata for found indices.\n/// Patterns follow the elastic search patterns.\npub async fn resolve_index_patterns(\n    index_id_patterns: &[String],\n    metastore: &mut MetastoreServiceClient,\n) -> crate::Result<Vec<IndexMetadata>> {\n    let list_indexes_metadata_request = if index_id_patterns.is_empty() {\n        ListIndexesMetadataRequest::all()\n    } else {\n        ListIndexesMetadataRequest {\n            index_id_patterns: 
index_id_patterns.to_vec(),\n        }\n    };\n\n    // Get the index ids from the request\n    let indexes_metadata = metastore\n        .list_indexes_metadata(list_indexes_metadata_request)\n        .await?\n        .deserialize_indexes_metadata()\n        .await?;\n    ensure_all_indexes_found(&indexes_metadata, index_id_patterns)?;\n    Ok(indexes_metadata)\n}\n\n/// Converts a Tantivy `NamedFieldDocument` into a json string using the\n/// schema defined by the DocMapper.\n///\n/// We perform this conversion at leaf level only to avoid having\n/// another intermediate json format between the leaves and the root.\nfn convert_document_to_json_string(\n    named_field_doc: NamedFieldDocument,\n    doc_mapper: &DocMapper,\n) -> anyhow::Result<String> {\n    let NamedFieldDocument(named_field_doc_map) = named_field_doc;\n    let doc_json_map = doc_mapper.doc_to_json(named_field_doc_map)?;\n    let content_json =\n        serde_json::to_string(&doc_json_map).expect(\"Json serialization should never fail.\");\n    Ok(content_json)\n}\n\n/// Starts a search node, aka a `searcher`.\npub async fn start_searcher_service(\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n    search_job_placer: SearchJobPlacer,\n    searcher_context: Arc<SearcherContext>,\n) -> anyhow::Result<Arc<dyn SearchService>> {\n    let cluster_client = ClusterClient::new(search_job_placer);\n    let search_service = Arc::new(SearchServiceImpl::new(\n        metastore,\n        storage_resolver,\n        cluster_client,\n        searcher_context,\n    ));\n    Ok(search_service)\n}\n\n/// Performs a search on the current node.\n/// See also `[distributed_search]`.\npub async fn single_node_search(\n    search_request: SearchRequest,\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n) -> crate::Result<SearchResponse> {\n    let socket_addr = SocketAddr::new(Ipv4Addr::new(127, 0, 0, 1).into(), 7280u16);\n    let searcher_pool = 
SearcherPool::default();\n    let search_job_placer = SearchJobPlacer::new(searcher_pool.clone());\n    let cluster_client = ClusterClient::new(search_job_placer);\n    let searcher_config = SearcherConfig::default();\n    let searcher_context = Arc::new(SearcherContext::new_without_invoker(searcher_config, None));\n    let search_service = Arc::new(SearchServiceImpl::new(\n        metastore.clone(),\n        storage_resolver,\n        cluster_client.clone(),\n        searcher_context.clone(),\n    ));\n    let search_service_client =\n        SearchServiceClient::from_service(search_service.clone(), socket_addr);\n    searcher_pool.insert(socket_addr, search_service_client);\n    root_search(\n        &searcher_context,\n        search_request,\n        metastore,\n        &cluster_client,\n    )\n    .await\n}\n\n/// Creates a tantivy Term from a &str.\n#[cfg(any(test, feature = \"testsuite\"))]\n#[macro_export]\nmacro_rules! encode_term_for_test {\n    ($field:expr, $value:expr) => {{\n        #[allow(deprecated)]\n        {\n            ::tantivy::schema::Term::from_field_text(\n                ::tantivy::schema::Field::from_field_id($field),\n                $value,\n            )\n            .serialized_term()\n            .to_vec()\n        }\n    }};\n    ($value:expr) => {\n        encode_term_for_test!(0, $value)\n    };\n}\n\n/// Creates a `SearcherPool` for tests from an iterator of socket addresses and mock search\n/// services.\n#[cfg(any(test, feature = \"testsuite\"))]\npub fn searcher_pool_for_test(\n    iter: impl IntoIterator<Item = (&'static str, MockSearchService)>,\n) -> SearcherPool {\n    SearcherPool::from_iter(\n        iter.into_iter()\n            .map(|(grpc_addr_str, mock_search_service)| {\n                let grpc_addr: SocketAddr = grpc_addr_str\n                    .parse()\n                    .expect(\"The gRPC address should be valid socket address.\");\n                let client =\n                    
SearchServiceClient::from_service(Arc::new(mock_search_service), grpc_addr);\n                (grpc_addr, client)\n            }),\n    )\n}\n\npub(crate) fn merge_resource_stats_it<'a>(\n    stats_it: impl IntoIterator<Item = &'a Option<ResourceStats>>,\n) -> Option<ResourceStats> {\n    let mut acc_stats: Option<ResourceStats> = None;\n    for new_stats in stats_it {\n        merge_resource_stats(new_stats, &mut acc_stats);\n    }\n    acc_stats\n}\n\nfn merge_resource_stats(\n    new_stats_opt: &Option<ResourceStats>,\n    stat_accs_opt: &mut Option<ResourceStats>,\n) {\n    if let Some(new_stats) = new_stats_opt {\n        if let Some(stat_accs) = stat_accs_opt {\n            stat_accs.short_lived_cache_num_bytes += new_stats.short_lived_cache_num_bytes;\n            stat_accs.split_num_docs += new_stats.split_num_docs;\n            stat_accs.warmup_microsecs += new_stats.warmup_microsecs;\n            stat_accs.cpu_thread_pool_wait_microsecs += new_stats.cpu_thread_pool_wait_microsecs;\n            stat_accs.cpu_microsecs += new_stats.cpu_microsecs;\n        } else {\n            *stat_accs_opt = Some(*new_stats);\n        }\n    }\n}\n#[cfg(test)]\nmod stats_merge_tests {\n    use super::*;\n\n    #[test]\n    fn test_merge_resource_stats() {\n        let mut acc_stats = None;\n\n        merge_resource_stats(&None, &mut acc_stats);\n\n        assert_eq!(acc_stats, None);\n\n        let stats = Some(ResourceStats {\n            short_lived_cache_num_bytes: 100,\n            split_num_docs: 200,\n            warmup_microsecs: 300,\n            cpu_thread_pool_wait_microsecs: 400,\n            cpu_microsecs: 500,\n        });\n\n        merge_resource_stats(&stats, &mut acc_stats);\n\n        assert_eq!(acc_stats, stats);\n\n        let new_stats = Some(ResourceStats {\n            short_lived_cache_num_bytes: 50,\n            split_num_docs: 100,\n            warmup_microsecs: 150,\n            cpu_thread_pool_wait_microsecs: 200,\n            cpu_microsecs: 
250,\n        });\n\n        merge_resource_stats(&new_stats, &mut acc_stats);\n\n        let stats_plus_new_stats = Some(ResourceStats {\n            short_lived_cache_num_bytes: 150,\n            split_num_docs: 300,\n            warmup_microsecs: 450,\n            cpu_thread_pool_wait_microsecs: 600,\n            cpu_microsecs: 750,\n        });\n\n        assert_eq!(acc_stats, stats_plus_new_stats);\n\n        merge_resource_stats(&None, &mut acc_stats);\n\n        assert_eq!(acc_stats, stats_plus_new_stats);\n    }\n\n    #[test]\n    fn test_merge_resource_stats_it() {\n        let merged_stats = merge_resource_stats_it(Vec::<&Option<ResourceStats>>::new());\n        assert_eq!(merged_stats, None);\n\n        let stats1 = Some(ResourceStats {\n            short_lived_cache_num_bytes: 100,\n            split_num_docs: 200,\n            warmup_microsecs: 300,\n            cpu_thread_pool_wait_microsecs: 400,\n            cpu_microsecs: 500,\n        });\n\n        let merged_stats = merge_resource_stats_it(vec![&None, &stats1, &None]);\n\n        assert_eq!(merged_stats, stats1);\n\n        let stats2 = Some(ResourceStats {\n            short_lived_cache_num_bytes: 50,\n            split_num_docs: 100,\n            warmup_microsecs: 150,\n            cpu_thread_pool_wait_microsecs: 200,\n            cpu_microsecs: 250,\n        });\n\n        let stats3 = Some(ResourceStats {\n            short_lived_cache_num_bytes: 25,\n            split_num_docs: 50,\n            warmup_microsecs: 75,\n            cpu_thread_pool_wait_microsecs: 100,\n            cpu_microsecs: 125,\n        });\n\n        let merged_stats = merge_resource_stats_it(vec![&stats1, &stats2, &stats3]);\n\n        assert_eq!(\n            merged_stats,\n            Some(ResourceStats {\n                short_lived_cache_num_bytes: 175,\n                split_num_docs: 350,\n                warmup_microsecs: 525,\n                cpu_thread_pool_wait_microsecs: 700,\n                cpu_microsecs: 
875,\n            })\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/list_fields.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::path::Path;\nuse std::str::FromStr;\nuse std::sync::{Arc, LazyLock};\n\nuse anyhow::Context;\nuse futures::future;\nuse futures::future::try_join_all;\nuse itertools::Itertools;\nuse quickwit_common::rate_limited_warn;\nuse quickwit_common::shared_consts::{FIELD_PRESENCE_FIELD_NAME, SPLIT_FIELDS_FILE_NAME};\nuse quickwit_common::uri::Uri;\nuse quickwit_config::build_doc_mapper;\nuse quickwit_doc_mapper::tag_pruning::extract_tags_from_query;\nuse quickwit_metastore::SplitMetadata;\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse quickwit_proto::search::{\n    LeafListFieldsRequest, ListFields, ListFieldsEntryResponse, ListFieldsRequest,\n    ListFieldsResponse, SplitIdAndFooterOffsets, deserialize_split_fields,\n};\nuse quickwit_proto::types::{IndexId, IndexUid};\nuse quickwit_query::query_ast::QueryAst;\nuse quickwit_storage::Storage;\n\nuse crate::leaf::open_split_bundle;\nuse crate::search_job_placer::group_jobs_by_index_id;\nuse crate::service::SearcherContext;\nuse crate::{\n    ClusterClient, SearchError, SearchJob, list_relevant_splits, resolve_index_patterns,\n    search_thread_pool,\n};\n\n/// QW_FIELD_LIST_SIZE_LIMIT defines a hard limit on the number of fields that\n/// can be returned (error otherwise).\n///\n/// Having many fields can happen when a user is creating 
fields dynamically in\n/// a JSON type with random field names. This leads to huge memory consumption\n/// when building the response. This is a workaround until a way is found to\n/// prune the long tail of rare fields.\nstatic FIELD_LIST_SIZE_LIMIT: LazyLock<usize> =\n    LazyLock::new(|| quickwit_common::get_from_env(\"QW_FIELD_LIST_SIZE_LIMIT\", 100_000, false));\n\nconst DYNAMIC_FIELD_PREFIX: &str = \"_dynamic.\";\n\n/// Get the list of fields in the given split.\n/// The returned list is guaranteed to be strictly sorted by (field_name, field_type).\nasync fn get_fields_from_split(\n    searcher_context: &SearcherContext,\n    index_id: IndexId,\n    split_and_footer_offsets: &SplitIdAndFooterOffsets,\n    index_storage: Arc<dyn Storage>,\n) -> anyhow::Result<Vec<ListFieldsEntryResponse>> {\n    if let Some(list_fields) = searcher_context\n        .list_fields_cache\n        .get(split_and_footer_offsets.clone())\n    {\n        return Ok(list_fields.fields);\n    }\n    let (_, split_bundle) =\n        open_split_bundle(searcher_context, index_storage, split_and_footer_offsets).await?;\n\n    let serialized_split_fields = split_bundle\n        .get_all(Path::new(SPLIT_FIELDS_FILE_NAME))\n        .await?;\n    let serialized_split_fields_len = serialized_split_fields.len();\n    let list_fields_proto =\n        deserialize_split_fields(serialized_split_fields).with_context(|| {\n            format!(\"could not read split fields (serialized len: {serialized_split_fields_len})\",)\n        })?;\n\n    let mut list_fields = list_fields_proto.fields;\n    list_fields.retain(|list_field_entry| list_field_entry.field_name != FIELD_PRESENCE_FIELD_NAME);\n\n    for list_field_entry in list_fields.iter_mut() {\n        list_field_entry.index_ids = vec![index_id.to_string()];\n\n        if list_field_entry\n            .field_name\n            .starts_with(DYNAMIC_FIELD_PREFIX)\n        {\n            list_field_entry\n                .field_name\n                
.replace_range(..DYNAMIC_FIELD_PREFIX.len(), \"\");\n        }\n    }\n\n    // We sort our fields, as the removal of dynamic_field prefix could have caused them to be out\n    // of order. We also defensively make sure there are no duplicates here.\n    make_sorted_and_dedup(&mut list_fields);\n\n    // Put result into cache\n    searcher_context.list_fields_cache.put(\n        split_and_footer_offsets.clone(),\n        ListFields {\n            fields: list_fields.clone(),\n        },\n    );\n\n    Ok(list_fields)\n}\n\nfn field_order(\n    left: &ListFieldsEntryResponse,\n    right: &ListFieldsEntryResponse,\n) -> std::cmp::Ordering {\n    left.field_name\n        .cmp(&right.field_name)\n        .then_with(|| left.field_type.cmp(&right.field_type))\n}\n\n// Sorts and deduplicates the list of fields.\n//\n// If somehow we end up with duplicate fields, only the first one is kept,\n// and we log a warning.\nfn make_sorted_and_dedup(list_fields: &mut Vec<ListFieldsEntryResponse>) {\n    list_fields.sort_unstable_by(field_order);\n\n    // We defensively make sure there are no duplicates here.\n    list_fields.dedup_by(|left, right| {\n        if left.field_name == right.field_name && left.field_type == right.field_type {\n            rate_limited_warn!(\n                limit_per_min = 1,\n                left.field_name,\n                \"duplicate fields found, please report\"\n            );\n            true\n        } else {\n            false\n        }\n    });\n}\n\n/// `current_group` needs to contain at least one element.\n/// The group needs to be of the same field name and type.\nfn merge_same_field_group(\n    current_group: &mut Vec<ListFieldsEntryResponse>,\n) -> ListFieldsEntryResponse {\n    // Make sure all fields have the same name and type in current_group\n    assert!(!current_group.is_empty());\n    assert!(\n        current_group\n            .windows(2)\n            .all(|window| window[0].field_name == window[1].field_name\n               
 && window[0].field_type == window[1].field_type)\n    );\n\n    if current_group.len() == 1 {\n        return current_group\n            .pop()\n            .expect(\"`current_group` should not be empty\");\n    }\n    let metadata = current_group\n        .last()\n        .expect(\"`current_group` should not be empty\");\n    let searchable = current_group.iter().any(|entry| entry.searchable);\n    let aggregatable = current_group.iter().any(|entry| entry.aggregatable);\n    let field_name = metadata.field_name.to_string();\n    let field_type = metadata.field_type;\n    let mut non_searchable_index_ids = if searchable {\n        // We need to combine the non_searchable_index_ids + index_ids where searchable is set to\n        // false (as they are all non_searchable)\n        current_group\n            .iter()\n            .flat_map(|entry| {\n                if !entry.searchable {\n                    entry.index_ids.iter().cloned()\n                } else {\n                    entry.non_searchable_index_ids.iter().cloned()\n                }\n            })\n            .collect()\n    } else {\n        // Not searchable => no need to list all the indices\n        Vec::new()\n    };\n    non_searchable_index_ids.sort_unstable();\n    non_searchable_index_ids.dedup();\n\n    let mut non_aggregatable_index_ids = if aggregatable {\n        // We need to combine the non_aggregatable_index_ids + index_ids where aggregatable is set\n        // to false (as they are all non_aggregatable)\n        current_group\n            .iter()\n            .flat_map(|entry| {\n                if !entry.aggregatable {\n                    entry.index_ids.iter().cloned()\n                } else {\n                    entry.non_aggregatable_index_ids.iter().cloned()\n                }\n            })\n            .collect()\n    } else {\n        // Not aggregatable => no need to list all the indices\n        Vec::new()\n    };\n    non_aggregatable_index_ids.sort_unstable();\n    
non_aggregatable_index_ids.dedup();\n    let mut index_ids: Vec<String> = current_group\n        .drain(..)\n        .flat_map(|entry| entry.index_ids.into_iter())\n        .collect();\n    index_ids.sort_unstable();\n    index_ids.dedup();\n\n    ListFieldsEntryResponse {\n        field_name,\n        field_type,\n        searchable,\n        aggregatable,\n        non_searchable_index_ids,\n        non_aggregatable_index_ids,\n        index_ids,\n    }\n}\n\n/// Merge iterators of ListFieldsEntryResponse into a `Vec<ListFieldsEntryResponse>`.\n///\n/// The iterators need to be sorted by (field_name, field_type).\nfn merge_leaf_list_fields(\n    iterators: Vec<impl Iterator<Item = ListFieldsEntryResponse>>,\n) -> crate::Result<Vec<ListFieldsEntryResponse>> {\n    let merged = iterators\n        .into_iter()\n        .kmerge_by(|a, b| (&a.field_name, a.field_type) <= (&b.field_name, b.field_type));\n    let mut responses = Vec::new();\n\n    let mut current_group: Vec<ListFieldsEntryResponse> = Vec::new();\n    // Build ListFieldsEntryResponse from current group\n    let flush_group = |responses: &mut Vec<_>, current_group: &mut Vec<ListFieldsEntryResponse>| {\n        let entry = merge_same_field_group(current_group);\n        responses.push(entry);\n        current_group.clear();\n    };\n\n    for entry in merged {\n        if let Some(last) = current_group.last()\n            && (last.field_name != entry.field_name || last.field_type != entry.field_type)\n        {\n            flush_group(&mut responses, &mut current_group);\n        }\n        if responses.len() >= *FIELD_LIST_SIZE_LIMIT {\n            return Err(SearchError::Internal(format!(\n                \"list fields response exceeded {} fields\",\n                *FIELD_LIST_SIZE_LIMIT\n            )));\n        }\n        current_group.push(entry);\n    }\n    if !current_group.is_empty() {\n        flush_group(&mut responses, &mut current_group);\n    }\n\n    Ok(responses)\n}\n\n// Returns true if 
any of the patterns match the field name.\nfn matches_any_pattern(field_name: &str, field_patterns: &[FieldPattern]) -> bool {\n    field_patterns\n        .iter()\n        .any(|pattern| pattern.matches(field_name))\n}\n\nenum FieldPattern {\n    Match { field: String },\n    Wildcard { prefix: String, suffix: String },\n}\n\nimpl FromStr for FieldPattern {\n    type Err = crate::SearchError;\n\n    fn from_str(field_pattern: &str) -> crate::Result<Self> {\n        match field_pattern.find('*') {\n            None => Ok(FieldPattern::Match {\n                field: field_pattern.to_string(),\n            }),\n            Some(pos) => {\n                let prefix = field_pattern[..pos].to_string();\n                let suffix = field_pattern[pos + 1..].to_string();\n                if suffix.contains(\"*\") {\n                    return Err(crate::SearchError::InvalidArgument(format!(\n                        \"invalid field pattern `{field_pattern}`: we only support one wildcard\"\n                    )));\n                }\n                Ok(FieldPattern::Wildcard { prefix, suffix })\n            }\n        }\n    }\n}\n\nimpl FieldPattern {\n    pub fn matches(&self, field_name: &str) -> bool {\n        match self {\n            FieldPattern::Match { field } => field == field_name,\n            FieldPattern::Wildcard { prefix, suffix } => {\n                field_name.starts_with(prefix) && field_name.ends_with(suffix)\n            }\n        }\n    }\n}\n\n/// `leaf` step of list fields.\n///\n/// Returns field metadata from the assigned splits.\npub async fn leaf_list_fields(\n    index_id: IndexId,\n    index_storage: Arc<dyn Storage>,\n    searcher_context: &SearcherContext,\n    split_ids: &[SplitIdAndFooterOffsets],\n    field_patterns_str: &[String],\n) -> crate::Result<ListFieldsResponse> {\n    let field_patterns: Vec<FieldPattern> = field_patterns_str\n        .iter()\n        .map(|pattern_str| FieldPattern::from_str(pattern_str))\n        
.collect::<crate::Result<_>>()?;\n\n    // If no splits, return empty response\n    if split_ids.is_empty() {\n        return Ok(ListFieldsResponse { fields: Vec::new() });\n    }\n\n    // Get fields from all splits\n    let single_split_list_fields_futures: Vec<_> = split_ids\n        .iter()\n        .map(|split_id| {\n            get_fields_from_split(\n                searcher_context,\n                index_id.to_string(),\n                split_id,\n                index_storage.clone(),\n            )\n        })\n        .collect();\n\n    let mut single_split_list_fields_vec: Vec<Vec<ListFieldsEntryResponse>> =\n        future::try_join_all(single_split_list_fields_futures).await?;\n\n    let fields = search_thread_pool()\n        .run_cpu_intensive(move || {\n            for single_split_list_fields in &mut single_split_list_fields_vec {\n                // This contract is enforced on a different node, etc. so we defensively check that\n                // the fields are sorted and deduplicated.\n                if !single_split_list_fields.is_sorted_by(|left, right| {\n                    // Checking for `Less` ensures that this is both sorted AND that there are no\n                    // duplicates\n                    field_order(left, right) == std::cmp::Ordering::Less\n                }) {\n                    rate_limited_warn!(\n                        limit_per_min = 1,\n                        \"contract breach: fields returned by a leaf are not strictly sorted! 
\\\n                         please report\"\n                    );\n                    make_sorted_and_dedup(single_split_list_fields);\n                }\n            }\n\n            let filtered_list_fields_sorted_iters: Vec<_> = single_split_list_fields_vec\n                .into_iter()\n                .map(|list_fields_sorted| {\n                    list_fields_sorted.into_iter().filter(|field| {\n                        if field_patterns.is_empty() {\n                            true\n                        } else {\n                            matches_any_pattern(&field.field_name, &field_patterns)\n                        }\n                    })\n                })\n                .collect();\n            merge_leaf_list_fields(filtered_list_fields_sorted_iters)\n        })\n        .await\n        .context(\"failed to merge single split list fields\")??;\n    Ok(ListFieldsResponse { fields })\n}\n\n/// Index metas needed for executing a leaf list fields request.\n#[derive(Clone, Debug)]\npub struct IndexMetasForLeafSearch {\n    /// Index id.\n    pub index_id: IndexId,\n    /// Index URI.\n    pub index_uri: Uri,\n}\n\n/// Performs a distributed list fields request.\n/// 1. Sends leaf requests over gRPC to multiple leaf nodes.\n/// 2. Merges the search results.\n/// 3. 
Builds the response and returns.\npub async fn root_list_fields(\n    list_fields_req: ListFieldsRequest,\n    cluster_client: &ClusterClient,\n    mut metastore: MetastoreServiceClient,\n) -> crate::Result<ListFieldsResponse> {\n    let indexes_metadata =\n        resolve_index_patterns(&list_fields_req.index_id_patterns[..], &mut metastore).await?;\n    // The request contains a wildcard, but couldn't find any index.\n    if indexes_metadata.is_empty() {\n        return Ok(ListFieldsResponse { fields: Vec::new() });\n    }\n\n    // Build index metadata map and extract timestamp field for time range refinement\n    let mut index_uid_to_index_meta: HashMap<IndexUid, IndexMetasForLeafSearch> = HashMap::new();\n    let mut index_uids: Vec<IndexUid> = Vec::new();\n    let mut timestamp_field_opt: Option<String> = None;\n\n    for index_metadata in indexes_metadata {\n        // Extract timestamp field for time range refinement (use first index's field)\n        if timestamp_field_opt.is_none()\n            && list_fields_req.query_ast.is_some()\n            && let Ok(doc_mapper) = build_doc_mapper(\n                &index_metadata.index_config.doc_mapping,\n                &index_metadata.index_config.search_settings,\n            )\n        {\n            timestamp_field_opt = doc_mapper.timestamp_field_name().map(|s| s.to_string());\n        }\n\n        let index_metadata_for_leaf_search = IndexMetasForLeafSearch {\n            index_uri: index_metadata.index_uri().clone(),\n            index_id: index_metadata.index_config.index_id.to_string(),\n        };\n\n        index_uids.push(index_metadata.index_uid.clone());\n        index_uid_to_index_meta.insert(\n            index_metadata.index_uid.clone(),\n            index_metadata_for_leaf_search,\n        );\n    }\n\n    // Extract tags and refine time range from query_ast for split pruning\n    let mut start_timestamp = list_fields_req.start_timestamp;\n    let mut end_timestamp = 
list_fields_req.end_timestamp;\n    let tags_filter_opt = if let Some(ref query_ast_json) = list_fields_req.query_ast {\n        let query_ast: QueryAst = serde_json::from_str(query_ast_json)\n            .map_err(|err| SearchError::InvalidQuery(err.to_string()))?;\n\n        // Refine time range from query AST if timestamp field is available\n        if let Some(ref timestamp_field) = timestamp_field_opt {\n            crate::root::refine_start_end_timestamp_from_ast(\n                &query_ast,\n                timestamp_field,\n                &mut start_timestamp,\n                &mut end_timestamp,\n            );\n        }\n\n        extract_tags_from_query(query_ast)\n    } else {\n        None\n    };\n\n    let split_metadatas: Vec<SplitMetadata> = list_relevant_splits(\n        index_uids,\n        start_timestamp,\n        end_timestamp,\n        tags_filter_opt,\n        &mut metastore,\n    )\n    .await?;\n\n    // Build requests for each index id\n    let jobs: Vec<SearchJob> = split_metadatas.iter().map(SearchJob::from).collect();\n    let assigned_leaf_search_jobs = cluster_client\n        .search_job_placer\n        .assign_jobs(jobs, &HashSet::default())\n        .await?;\n    let mut leaf_request_tasks = Vec::new();\n    // For each node, forward to a node with an affinity for that index id.\n    for (client, client_jobs) in assigned_leaf_search_jobs {\n        let leaf_requests =\n            jobs_to_leaf_requests(&list_fields_req, &index_uid_to_index_meta, client_jobs)?;\n        for leaf_request in leaf_requests {\n            leaf_request_tasks.push(cluster_client.leaf_list_fields(leaf_request, client.clone()));\n        }\n    }\n    let leaf_list_fields_protos: Vec<ListFieldsResponse> = try_join_all(leaf_request_tasks).await?;\n    let fields = search_thread_pool()\n        .run_cpu_intensive(move || {\n            let leaf_list_fields = leaf_list_fields_protos\n                .into_iter()\n                .map(|leaf_list_fields_proto| 
leaf_list_fields_proto.fields.into_iter())\n                .collect();\n            merge_leaf_list_fields(leaf_list_fields)\n        })\n        .await\n        .context(\"failed to merge leaf list fields responses\")??;\n\n    Ok(ListFieldsResponse { fields })\n}\n\n/// Builds a list of [`LeafListFieldsRequest`], one per index, from a list of [`SearchJob`].\npub fn jobs_to_leaf_requests(\n    request: &ListFieldsRequest,\n    index_uid_to_id: &HashMap<IndexUid, IndexMetasForLeafSearch>,\n    jobs: Vec<SearchJob>,\n) -> crate::Result<Vec<LeafListFieldsRequest>> {\n    let search_request_for_leaf = request.clone();\n    let mut leaf_search_requests = Vec::new();\n    // Group jobs by index uid.\n    group_jobs_by_index_id(jobs, |job_group| {\n        let index_uid = &job_group[0].index_uid;\n        let index_meta = index_uid_to_id.get(index_uid).ok_or_else(|| {\n            SearchError::Internal(format!(\n                \"received list fields job for an unknown index {index_uid}. it should never happen\"\n            ))\n        })?;\n\n        let leaf_search_request = LeafListFieldsRequest {\n            index_id: index_meta.index_id.to_string(),\n            index_uri: index_meta.index_uri.to_string(),\n            fields: search_request_for_leaf.fields.clone(),\n            split_offsets: job_group.into_iter().map(|job| job.offsets).collect(),\n        };\n        leaf_search_requests.push(leaf_search_request);\n        Ok(())\n    })?;\n\n    Ok(leaf_search_requests)\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::search::{ListFieldType, ListFieldsEntryResponse};\n\n    use super::*;\n\n    #[test]\n    fn merge_leaf_list_fields_identical_test() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: 
Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        assert_eq!(resp, vec![entry1]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_different_test() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field2\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        assert_eq!(resp, vec![entry1, entry2]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_non_searchable_test() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n       
     aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: false,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index2\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        let expected = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: vec![\"index2\".to_string()],\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string(), \"index2\".to_string()],\n        };\n        assert_eq!(resp, vec![expected]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_non_aggregatable_test() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: false,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: 
Vec::new(),\n            index_ids: vec![\"index2\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        let expected = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: vec![\"index2\".to_string()],\n            index_ids: vec![\"index1\".to_string(), \"index2\".to_string()],\n        };\n        assert_eq!(resp, vec![expected]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_mixed_types1() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry3 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::U64 as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n          
  vec![entry1.clone(), entry2.clone()].into_iter(),\n            vec![entry3.clone()].into_iter(),\n        ])\n        .unwrap();\n        assert_eq!(resp, vec![entry1.clone(), entry3.clone()]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_mixed_types2() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry3 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::U64 as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone(), entry3.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        assert_eq!(resp, vec![entry1.clone(), entry3.clone()]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_multiple_field_names() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: 
Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let entry3 = ListFieldsEntryResponse {\n            field_name: \"field2\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index1\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone(), entry3.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        assert_eq!(resp, vec![entry1.clone(), entry3.clone()]);\n    }\n    #[test]\n    fn merge_leaf_list_fields_non_aggregatable_list_test() {\n        let entry1 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: vec![\"index1\".to_string()],\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\n                \"index1\".to_string(),\n                \"index2\".to_string(),\n                \"index3\".to_string(),\n            ],\n        };\n        let entry2 = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: false,\n            aggregatable: true,\n            
non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index4\".to_string()],\n        };\n        let resp = merge_leaf_list_fields(vec![\n            vec![entry1.clone()].into_iter(),\n            vec![entry2.clone()].into_iter(),\n        ])\n        .unwrap();\n        let expected = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: true,\n            aggregatable: true,\n            non_searchable_index_ids: vec![\"index1\".to_string(), \"index4\".to_string()],\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\n                \"index1\".to_string(),\n                \"index2\".to_string(),\n                \"index3\".to_string(),\n                \"index4\".to_string(),\n            ],\n        };\n        assert_eq!(resp, vec![expected]);\n    }\n\n    #[test]\n    fn test_field_pattern() {\n        let prefix_pattern = FieldPattern::from_str(\"toto*\").unwrap();\n        assert!(!prefix_pattern.matches(\"\"));\n        assert!(!prefix_pattern.matches(\"tot3\"));\n        assert!(!prefix_pattern.matches(\"atoto\"));\n        assert!(prefix_pattern.matches(\"toto\"));\n        assert!(prefix_pattern.matches(\"totowhatever\"));\n\n        let suffix_pattern = FieldPattern::from_str(\"*toto\").unwrap();\n        assert!(!suffix_pattern.matches(\"\"));\n        assert!(!suffix_pattern.matches(\"3tot\"));\n        assert!(!suffix_pattern.matches(\"totoa\"));\n        assert!(suffix_pattern.matches(\"toto\"));\n        assert!(suffix_pattern.matches(\"whatevertoto\"));\n\n        let inner_pattern = FieldPattern::from_str(\"to*ti\").unwrap();\n        assert!(!inner_pattern.matches(\"\"));\n        assert!(!inner_pattern.matches(\"tot\"));\n        assert!(!inner_pattern.matches(\"totia\"));\n        assert!(!inner_pattern.matches(\"atoti\"));\n        
assert!(inner_pattern.matches(\"toti\"));\n        assert!(!inner_pattern.matches(\"tito\"));\n        assert!(inner_pattern.matches(\"towhateverti\"));\n\n        assert!(FieldPattern::from_str(\"to**\").is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/list_fields_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::CacheConfig;\nuse quickwit_proto::search::{\n    ListFields, SplitIdAndFooterOffsets, deserialize_split_fields, serialize_split_fields,\n};\nuse quickwit_proto::types::SplitId;\nuse quickwit_storage::{MemorySizedCache, OwnedBytes};\n\n/// A cache to memoize `leaf_search_single_split` results.\npub struct ListFieldsCache {\n    content: MemorySizedCache<CacheKey>,\n}\n\n// TODO For now this simply caches the whole ListFieldsEntryResponse. 
We could\n// be more clever and cache aggregates instead.\nimpl ListFieldsCache {\n    pub fn new(config: &CacheConfig) -> ListFieldsCache {\n        ListFieldsCache {\n            content: MemorySizedCache::from_config(\n                config,\n                &quickwit_storage::STORAGE_METRICS.partial_request_cache,\n            ),\n        }\n    }\n    pub fn get(&self, split_info: SplitIdAndFooterOffsets) -> Option<ListFields> {\n        let key = CacheKey::from_split_meta(split_info);\n        let encoded_result = self.content.get(&key)?;\n        // this should never fail\n        deserialize_split_fields(encoded_result).ok()\n    }\n\n    pub fn put(&self, split_info: SplitIdAndFooterOffsets, list_fields: ListFields) {\n        let key = CacheKey::from_split_meta(split_info);\n\n        let encoded_result = serialize_split_fields(list_fields);\n        self.content.put(key, OwnedBytes::new(encoded_result));\n    }\n}\n\n/// A key inside a [`ListFieldsCache`].\n#[derive(Debug, Hash, Clone, PartialEq, Eq)]\nstruct CacheKey {\n    /// The split this entry refers to\n    split_id: SplitId,\n}\n\nimpl CacheKey {\n    fn from_split_meta(split_info: SplitIdAndFooterOffsets) -> Self {\n        CacheKey {\n            split_id: split_info.split_id,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytesize::ByteSize;\n    use quickwit_proto::search::{\n        ListFieldType, ListFields, ListFieldsEntryResponse, SplitIdAndFooterOffsets,\n    };\n\n    use super::ListFieldsCache;\n\n    #[test]\n    fn test_list_fields_cache() {\n        let cache = ListFieldsCache::new(&ByteSize::mb(64).into());\n\n        let split_1 = SplitIdAndFooterOffsets {\n            split_id: \"split_1\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: None,\n            timestamp_end: None,\n            num_docs: 0,\n        };\n\n        let split_2 = SplitIdAndFooterOffsets {\n            split_id: 
\"split_2\".to_string(),\n            split_footer_start: 0,\n            split_footer_end: 100,\n            timestamp_start: None,\n            timestamp_end: None,\n            num_docs: 0,\n        };\n\n        let result = ListFieldsEntryResponse {\n            field_name: \"field1\".to_string(),\n            field_type: ListFieldType::Str as i32,\n            searchable: false,\n            aggregatable: true,\n            non_searchable_index_ids: Vec::new(),\n            non_aggregatable_index_ids: Vec::new(),\n            index_ids: vec![\"index4\".to_string()],\n        };\n\n        assert!(cache.get(split_1.clone()).is_none());\n\n        let list_fields = ListFields {\n            fields: vec![result.clone()],\n        };\n\n        cache.put(split_1.clone(), list_fields.clone());\n        assert_eq!(cache.get(split_1.clone()).unwrap(), list_fields);\n        assert!(cache.get(split_2).is_none());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/list_terms.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::ops::Bound;\nuse std::sync::Arc;\n\nuse anyhow::Context;\nuse bytesize::ByteSize;\nuse futures::future::try_join_all;\nuse itertools::{Either, Itertools};\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_config::build_doc_mapper;\nuse quickwit_metastore::{ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, SplitMetadata};\nuse quickwit_proto::metastore::{ListSplitsRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_proto::search::{\n    LeafListTermsRequest, LeafListTermsResponse, ListTermsRequest, ListTermsResponse,\n    SplitIdAndFooterOffsets, SplitSearchError,\n};\nuse quickwit_proto::types::IndexUid;\nuse quickwit_storage::{ByteRangeCache, Storage};\nuse tantivy::schema::{Field, FieldType};\nuse tantivy::{ReloadPolicy, Term};\nuse tracing::{debug, error, info, instrument};\n\nuse crate::leaf::open_index_with_caches;\nuse crate::search_job_placer::group_jobs_by_index_id;\nuse crate::search_permit_provider::compute_initial_memory_allocation;\nuse crate::{ClusterClient, SearchError, SearchJob, SearcherContext, resolve_index_patterns};\n\n/// Performs a distributed list terms.\n/// 1. Sends leaf requests over gRPC to multiple leaf nodes.\n/// 2. Merges the search results.\n/// 3. 
Builds the response and returns.\n/// this is much simpler than `root_search` as it doesn't need to get actual docs.\n#[instrument(skip(list_terms_request, cluster_client, metastore))]\npub async fn root_list_terms(\n    list_terms_request: &ListTermsRequest,\n    mut metastore: MetastoreServiceClient,\n    cluster_client: &ClusterClient,\n) -> crate::Result<ListTermsResponse> {\n    let start_instant = tokio::time::Instant::now();\n    let indexes_metadata =\n        resolve_index_patterns(&list_terms_request.index_id_patterns, &mut metastore).await?;\n    // The request contains a wildcard, but couldn't find any index.\n    if indexes_metadata.is_empty() {\n        return Ok(ListTermsResponse {\n            num_hits: 0,\n            terms: Vec::new(),\n            elapsed_time_micros: 0,\n            errors: Vec::new(),\n        });\n    }\n\n    for index_metadata in indexes_metadata.iter() {\n        let index_config = &index_metadata.index_config;\n        let doc_mapper = build_doc_mapper(&index_config.doc_mapping, &index_config.search_settings)\n            .map_err(|err| {\n                SearchError::Internal(format!(\"failed to build doc mapper. 
cause: {err}\"))\n            })?;\n        let schema = doc_mapper.schema();\n        let field = schema.get_field(&list_terms_request.field).map_err(|_| {\n            SearchError::InvalidQuery(format!(\n                \"failed to list terms in `{}`, field doesn't exist\",\n                list_terms_request.field\n            ))\n        })?;\n        let field_entry = schema.get_field_entry(field);\n        if !field_entry.is_indexed() {\n            return Err(SearchError::InvalidQuery(\n                \"trying to list terms on field which isn't indexed\".to_string(),\n            ));\n        }\n    }\n    let index_uids: Vec<IndexUid> = indexes_metadata\n        .iter()\n        .map(|index_metadata| index_metadata.index_uid.clone())\n        .collect();\n\n    let Some(mut query) = quickwit_metastore::ListSplitsQuery::try_from_index_uids(index_uids)\n    else {\n        return Ok(ListTermsResponse::default());\n    };\n    query = query.with_split_state(quickwit_metastore::SplitState::Published);\n\n    if let Some(start_ts) = list_terms_request.start_timestamp {\n        query = query.with_time_range_start_gte(start_ts);\n    }\n\n    if let Some(end_ts) = list_terms_request.end_timestamp {\n        query = query.with_time_range_end_lt(end_ts);\n    }\n    let index_uid_to_index_uri: HashMap<IndexUid, String> = indexes_metadata\n        .iter()\n        .map(|index_metadata| {\n            (\n                index_metadata.index_uid.clone(),\n                index_metadata.index_uri().to_string(),\n            )\n        })\n        .collect();\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n    let split_metadatas: Vec<SplitMetadata> = metastore\n        .clone()\n        .list_splits(list_splits_request)\n        .await?\n        .collect_splits_metadata()\n        .await?;\n\n    let jobs: Vec<SearchJob> = split_metadatas.iter().map(SearchJob::from).collect();\n    let assigned_leaf_search_jobs = 
cluster_client\n        .search_job_placer\n        .assign_jobs(jobs, &HashSet::default())\n        .await?;\n    let mut leaf_request_tasks = Vec::new();\n    // For each node, forward to a node with an affinity for that index id.\n    for (client, client_jobs) in assigned_leaf_search_jobs {\n        let leaf_requests =\n            jobs_to_leaf_requests(list_terms_request, &index_uid_to_index_uri, client_jobs)?;\n        for leaf_request in leaf_requests {\n            leaf_request_tasks.push(cluster_client.leaf_list_terms(leaf_request, client.clone()));\n        }\n    }\n    let leaf_search_responses: Vec<LeafListTermsResponse> =\n        try_join_all(leaf_request_tasks).await?;\n\n    let failed_splits: Vec<_> = leaf_search_responses\n        .iter()\n        .flat_map(|leaf_search_response| &leaf_search_response.failed_splits)\n        .collect();\n\n    if !failed_splits.is_empty() {\n        error!(failed_splits = ?failed_splits, \"leaf search response contains at least one failed split\");\n        let errors: String = failed_splits\n            .iter()\n            .map(|splits| splits.to_string())\n            .collect::<Vec<_>>()\n            .join(\", \");\n        return Err(SearchError::Internal(errors));\n    }\n\n    // Merging is a cpu-bound task, but probably fast enough to not require\n    // spawning it on a blocking thread.\n    let merged_iter = leaf_search_responses\n        .into_iter()\n        .map(|leaf_search_response| leaf_search_response.terms)\n        .kmerge()\n        .dedup();\n    let leaf_list_terms_response: Vec<Vec<u8>> = if let Some(limit) = list_terms_request.max_hits {\n        merged_iter.take(limit as usize).collect()\n    } else {\n        merged_iter.collect()\n    };\n\n    debug!(\n        leaf_list_terms_response_count = leaf_list_terms_response.len(),\n        \"Merged leaf search response.\"\n    );\n\n    let elapsed = start_instant.elapsed();\n\n    Ok(ListTermsResponse {\n        num_hits: 
leaf_list_terms_response.len() as u64,\n        terms: leaf_list_terms_response,\n        elapsed_time_micros: elapsed.as_micros() as u64,\n        errors: Vec::new(),\n    })\n}\n\n/// Builds a list of [`LeafListTermsRequest`], one per index, from a list of [`SearchJob`].\npub fn jobs_to_leaf_requests(\n    request: &ListTermsRequest,\n    index_uid_to_uri: &HashMap<IndexUid, String>,\n    jobs: Vec<SearchJob>,\n) -> crate::Result<Vec<LeafListTermsRequest>> {\n    let search_request_for_leaf = request.clone();\n    let mut leaf_search_requests = Vec::new();\n    group_jobs_by_index_id(jobs, |job_group| {\n        let index_uid = &job_group[0].index_uid;\n        let index_uri = index_uid_to_uri.get(index_uid).ok_or_else(|| {\n            SearchError::Internal(format!(\n                \"received list fields job for an unknown index {index_uid}. it should never happen\"\n            ))\n        })?;\n\n        let leaf_search_request = LeafListTermsRequest {\n            list_terms_request: Some(search_request_for_leaf.clone()),\n            index_uri: index_uri.to_string(),\n            split_offsets: job_group.into_iter().map(|job| job.offsets).collect(),\n        };\n        leaf_search_requests.push(leaf_search_request);\n        Ok(())\n    })?;\n    Ok(leaf_search_requests)\n}\n\n/// Apply a leaf list terms on a single split.\n#[instrument(skip_all, fields(split_id = split.split_id))]\n#[allow(deprecated)]\nasync fn leaf_list_terms_single_split(\n    searcher_context: &SearcherContext,\n    search_request: &ListTermsRequest,\n    storage: Arc<dyn Storage>,\n    split: SplitIdAndFooterOffsets,\n) -> crate::Result<LeafListTermsResponse> {\n    let cache =\n        ByteRangeCache::with_infinite_capacity(&quickwit_storage::STORAGE_METRICS.shortlived_cache);\n    let (index, _) =\n        open_index_with_caches(searcher_context, storage, &split, None, Some(cache)).await?;\n    let split_schema = index.schema();\n    let reader = index\n        .reader_builder()\n  
      .reload_policy(ReloadPolicy::Manual)\n        .try_into()?;\n    let searcher = reader.searcher();\n\n    let field = split_schema\n        .get_field(&search_request.field)\n        .with_context(|| {\n            format!(\n                \"couldn't get field named {:?} from schema to list terms\",\n                search_request.field\n            )\n        })?;\n\n    let field_type = split_schema.get_field_entry(field).field_type();\n    let start_term: Option<Term> = search_request\n        .start_key\n        .as_ref()\n        .map(|data| term_from_data(field, field_type, data));\n    let end_term: Option<Term> = search_request\n        .end_key\n        .as_ref()\n        .map(|data| term_from_data(field, field_type, data));\n\n    let mut segment_results = Vec::new();\n    for segment_reader in searcher.segment_readers() {\n        let inverted_index = segment_reader.inverted_index(field)?.clone();\n        let dict = inverted_index.terms();\n        dict.file_slice_for_range(\n            (\n                start_term\n                    .as_ref()\n                    .map(Term::serialized_value_bytes)\n                    .map(Bound::Included)\n                    .unwrap_or(Bound::Unbounded),\n                end_term\n                    .as_ref()\n                    .map(Term::serialized_value_bytes)\n                    .map(Bound::Excluded)\n                    .unwrap_or(Bound::Unbounded),\n            ),\n            search_request.max_hits,\n        )\n        .read_bytes_async()\n        .await\n        .with_context(|| \"failed to load sstable range\")?;\n\n        let mut range = dict.range();\n        if let Some(limit) = search_request.max_hits {\n            range = range.limit(limit);\n        }\n        if let Some(start_term) = &start_term {\n            range = range.ge(start_term.serialized_value_bytes())\n        }\n        if let Some(end_term) = &end_term {\n            range = range.lt(end_term.serialized_value_bytes())\n 
       }\n        let mut stream = range\n            .into_stream()\n            .with_context(|| \"failed to create stream over sstable\")?;\n        let mut segment_result: Vec<Vec<u8>> =\n            Vec::with_capacity(search_request.max_hits.unwrap_or(0) as usize);\n        while stream.advance() {\n            segment_result.push(term_to_data(field, field_type, stream.key()));\n        }\n        segment_results.push(segment_result);\n    }\n\n    let merged_iter = segment_results.into_iter().kmerge().dedup();\n    let merged_results: Vec<Vec<u8>> = if let Some(limit) = search_request.max_hits {\n        merged_iter.take(limit as usize).collect()\n    } else {\n        merged_iter.collect()\n    };\n\n    Ok(LeafListTermsResponse {\n        num_hits: merged_results.len() as u64,\n        terms: merged_results,\n        num_attempted_splits: 1,\n        failed_splits: Vec::new(),\n    })\n}\n\nfn term_from_data(field: Field, field_type: &FieldType, data: &[u8]) -> Term {\n    let mut term = Term::from_field_bool(field, false);\n    term.clear_with_type(field_type.value_type());\n    term.append_bytes(data);\n    term\n}\n\n#[allow(deprecated)]\nfn term_to_data(field: Field, field_type: &FieldType, field_value: &[u8]) -> Vec<u8> {\n    let mut term = Term::from_field_bool(field, false);\n    term.clear_with_type(field_type.value_type());\n    term.append_bytes(field_value);\n    term.serialized_term().to_vec()\n}\n\n/// `leaf` step of list terms.\n#[instrument(skip_all)]\npub async fn leaf_list_terms(\n    searcher_context: Arc<SearcherContext>,\n    request: &ListTermsRequest,\n    index_storage: Arc<dyn Storage>,\n    splits: &[SplitIdAndFooterOffsets],\n) -> Result<LeafListTermsResponse, SearchError> {\n    info!(split_offsets = ?PrettySample::new(splits, 5));\n    let permit_sizes: Vec<ByteSize> = splits\n        .iter()\n        .map(|split| {\n            compute_initial_memory_allocation(\n                split,\n                searcher_context\n        
            .searcher_config\n                    .warmup_single_split_initial_allocation,\n            )\n        })\n        .collect();\n    // We have added offloading leaf search to lambdas, but not for list_terms yet.\n    // TODO (Add it)\n    // https://github.com/quickwit-oss/quickwit/issues/6150\n    let permits = searcher_context\n        .search_permit_provider\n        .get_permits(permit_sizes)\n        .await;\n    let leaf_search_single_split_futures: Vec<_> = splits\n        .iter()\n        .zip(permits.into_iter())\n        .map(|(split, search_permit_recv)| {\n            let index_storage_clone = index_storage.clone();\n            let searcher_context_clone = searcher_context.clone();\n            async move {\n                let leaf_split_search_permit = search_permit_recv.await;\n                // TODO dedicated counter and timer?\n                crate::SEARCH_METRICS.leaf_list_terms_splits_total.inc();\n                let timer = crate::SEARCH_METRICS\n                    .leaf_search_split_duration_secs\n                    .start_timer();\n                let leaf_search_single_split_res = leaf_list_terms_single_split(\n                    &searcher_context_clone,\n                    request,\n                    index_storage_clone,\n                    split.clone(),\n                )\n                .await;\n                timer.observe_duration();\n\n                // Explicitly drop the permit for readability.\n                // This should always happen after the ephemeral search cache is dropped.\n                std::mem::drop(leaf_split_search_permit);\n\n                leaf_search_single_split_res.map_err(|err| (split.split_id.clone(), err))\n            }\n        })\n        .collect();\n\n    let split_search_results = futures::future::join_all(leaf_search_single_split_futures).await;\n\n    let (split_search_responses, errors): (Vec<LeafListTermsResponse>, Vec<(String, SearchError)>) =\n        
split_search_results\n            .into_iter()\n            .partition_map(|split_search_res| match split_search_res {\n                Ok(split_search_resp) => Either::Left(split_search_resp),\n                Err(err) => Either::Right(err),\n            });\n\n    let merged_iter = split_search_responses\n        .into_iter()\n        .map(|leaf_search_response| leaf_search_response.terms)\n        .kmerge()\n        .dedup();\n    let terms: Vec<Vec<u8>> = if let Some(limit) = request.max_hits {\n        merged_iter.take(limit as usize).collect()\n    } else {\n        merged_iter.collect()\n    };\n\n    let failed_splits = errors\n        .into_iter()\n        .map(|(split_id, err)| SplitSearchError {\n            split_id,\n            error: err.to_string(),\n            retryable_error: true,\n        })\n        .collect();\n    let merged_search_response = LeafListTermsResponse {\n        num_hits: terms.len() as u64,\n        terms,\n        num_attempted_splits: splits.len() as u64,\n        failed_splits,\n    };\n\n    Ok(merged_search_response)\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// See https://prometheus.io/docs/practices/naming/\n\nuse std::fmt;\n\nuse bytesize::ByteSize;\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    Histogram, HistogramVec, IntCounter, IntCounterVec, IntGauge, exponential_buckets,\n    linear_buckets, new_counter, new_counter_vec, new_gauge, new_gauge_vec, new_histogram,\n    new_histogram_vec,\n};\n\nfn print_if_not_null(\n    field_name: &'static str,\n    counter: &IntCounter,\n    f: &mut fmt::Formatter,\n) -> fmt::Result {\n    let val = counter.get();\n    if val > 0 {\n        write!(f, \"{}={} \", field_name, val)?;\n    }\n    Ok(())\n}\n\npub struct SplitSearchOutcomeCounters {\n    pub cancel_before_warmup: IntCounter,\n    pub cache_hit: IntCounter,\n    pub pruned_before_warmup: IntCounter,\n    pub cancel_warmup: IntCounter,\n    pub pruned_after_warmup: IntCounter,\n    pub cancel_cpu_queue: IntCounter,\n    pub cancel_cpu: IntCounter,\n    pub success: IntCounter,\n}\n\nimpl fmt::Display for SplitSearchOutcomeCounters {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        print_if_not_null(\"cancel_before_warmup\", &self.cancel_before_warmup, f)?;\n        print_if_not_null(\"cache_hit\", &self.cache_hit, f)?;\n        print_if_not_null(\"pruned_before_warmup\", &self.pruned_before_warmup, f)?;\n        print_if_not_null(\"cancel_warmup\", 
&self.cancel_warmup, f)?;\n        print_if_not_null(\"pruned_after_warmup\", &self.pruned_after_warmup, f)?;\n        print_if_not_null(\"cancel_cpu_queue\", &self.cancel_cpu_queue, f)?;\n        print_if_not_null(\"cancel_cpu\", &self.cancel_cpu, f)?;\n        print_if_not_null(\"success\", &self.success, f)?;\n        Ok(())\n    }\n}\n\nimpl SplitSearchOutcomeCounters {\n    /// Create a new SplitSearchOutcomeCounters instance, registered in prometheus.\n    pub fn new_registered() -> Self {\n        let search_split_outcome_vec = new_counter_vec(\n            \"split_search_outcome\",\n            \"Count the state in which each leaf search split ended\",\n            \"search\",\n            &[],\n            [\"category\"],\n        );\n        Self::new_from_counter_vec(search_split_outcome_vec)\n    }\n\n    /// Create a new SplitSearchOutcomeCounters instance, but this one won't be reported to\n    /// prometheus.\n    pub fn new_unregistered() -> Self {\n        let search_split_outcome_vec = IntCounterVec::new(\n            \"split_search_outcome\",\n            \"Count the state in which each leaf search split ended\",\n            \"search\",\n            &[],\n            [\"category\"],\n        );\n        Self::new_from_counter_vec(search_split_outcome_vec)\n    }\n\n    pub fn new_from_counter_vec(search_split_outcome_vec: IntCounterVec<1>) -> Self {\n        SplitSearchOutcomeCounters {\n            cancel_before_warmup: search_split_outcome_vec\n                .with_label_values([\"cancel_before_warmup\"]),\n            cache_hit: search_split_outcome_vec.with_label_values([\"cache_hit\"]),\n            pruned_before_warmup: search_split_outcome_vec\n                .with_label_values([\"pruned_before_warmup\"]),\n            cancel_warmup: search_split_outcome_vec.with_label_values([\"cancel_warmup\"]),\n            pruned_after_warmup: search_split_outcome_vec\n                .with_label_values([\"pruned_after_warmup\"]),\n            
cancel_cpu_queue: search_split_outcome_vec.with_label_values([\"cancel_cpu_queue\"]),\n            cancel_cpu: search_split_outcome_vec.with_label_values([\"cancel_cpu\"]),\n            success: search_split_outcome_vec.with_label_values([\"success\"]),\n        }\n    }\n}\n\npub struct SearchMetrics {\n    pub root_search_requests_total: IntCounterVec<1>,\n    pub root_search_request_duration_seconds: HistogramVec<1>,\n    pub root_search_targeted_splits: HistogramVec<1>,\n    pub leaf_search_requests_total: IntCounterVec<1>,\n    pub leaf_search_request_duration_seconds: HistogramVec<1>,\n    pub leaf_search_targeted_splits: HistogramVec<1>,\n    pub leaf_list_terms_splits_total: IntCounter,\n    pub split_search_outcome_total: SplitSearchOutcomeCounters,\n    pub leaf_search_split_duration_secs: Histogram,\n    pub job_assigned_total: IntCounterVec<1>,\n    pub leaf_search_single_split_tasks_pending: IntGauge,\n    pub leaf_search_single_split_tasks_ongoing: IntGauge,\n    pub leaf_search_single_split_warmup_num_bytes: Histogram,\n    pub searcher_local_kv_store_size_bytes: IntGauge,\n}\n\n/// From 0.008s to 131.072s\nfn duration_buckets() -> Vec<f64> {\n    exponential_buckets(0.008, 2.0, 15).unwrap()\n}\n\nimpl Default for SearchMetrics {\n    fn default() -> Self {\n        let targeted_splits_buckets: Vec<f64> = [\n            linear_buckets(0.0, 10.0, 10).unwrap(),\n            linear_buckets(100.0, 100.0, 9).unwrap(),\n            linear_buckets(1000.0, 1000.0, 9).unwrap(),\n            linear_buckets(10000.0, 10000.0, 10).unwrap(),\n        ]\n        .iter()\n        .flatten()\n        .copied()\n        .collect();\n\n        let pseudo_exponential_bytes_buckets = vec![\n            ByteSize::mb(10).as_u64() as f64,\n            ByteSize::mb(20).as_u64() as f64,\n            ByteSize::mb(50).as_u64() as f64,\n            ByteSize::mb(100).as_u64() as f64,\n            ByteSize::mb(200).as_u64() as f64,\n            ByteSize::mb(500).as_u64() as f64,\n 
           ByteSize::gb(1).as_u64() as f64,\n            ByteSize::gb(2).as_u64() as f64,\n            ByteSize::gb(5).as_u64() as f64,\n        ];\n\n        let leaf_search_single_split_tasks = new_gauge_vec::<1>(\n            \"leaf_search_single_split_tasks\",\n            \"Number of single split search tasks pending or ongoing\",\n            \"search\",\n            &[],\n            [\"status\"], // takes values \"ongoing\" or \"pending\"\n        );\n\n        SearchMetrics {\n            root_search_requests_total: new_counter_vec(\n                \"root_search_requests_total\",\n                \"Total number of root search gRPC requests processed.\",\n                \"search\",\n                &[(\"kind\", \"server\")],\n                [\"status\"],\n            ),\n            root_search_request_duration_seconds: new_histogram_vec(\n                \"root_search_request_duration_seconds\",\n                \"Duration of root search gRPC requests in seconds.\",\n                \"search\",\n                &[(\"kind\", \"server\")],\n                [\"status\"],\n                duration_buckets(),\n            ),\n            root_search_targeted_splits: new_histogram_vec(\n                \"root_search_targeted_splits\",\n                \"Number of splits targeted per root search GRPC request.\",\n                \"search\",\n                &[],\n                [\"status\"],\n                targeted_splits_buckets.clone(),\n            ),\n            leaf_search_requests_total: new_counter_vec(\n                \"leaf_search_requests_total\",\n                \"Total number of leaf search gRPC requests processed.\",\n                \"search\",\n                &[(\"kind\", \"server\")],\n                [\"status\"],\n            ),\n            leaf_search_request_duration_seconds: new_histogram_vec(\n                \"leaf_search_request_duration_seconds\",\n                \"Duration of leaf search gRPC requests in seconds.\",\n         
       \"search\",\n                &[(\"kind\", \"server\")],\n                [\"status\"],\n                duration_buckets(),\n            ),\n            leaf_search_targeted_splits: new_histogram_vec(\n                \"leaf_search_targeted_splits\",\n                \"Number of splits targeted per leaf search GRPC request.\",\n                \"search\",\n                &[],\n                [\"status\"],\n                targeted_splits_buckets,\n            ),\n\n            leaf_list_terms_splits_total: new_counter(\n                \"leaf_list_terms_splits_total\",\n                \"Number of list terms splits total\",\n                \"search\",\n                &[],\n            ),\n            split_search_outcome_total: SplitSearchOutcomeCounters::new_registered(),\n\n            leaf_search_split_duration_secs: new_histogram(\n                \"leaf_search_split_duration_secs\",\n                \"Number of seconds required to run a leaf search over a single split. The timer \\\n                 starts after the semaphore is obtained.\",\n                \"search\",\n                duration_buckets(),\n            ),\n            leaf_search_single_split_tasks_ongoing: leaf_search_single_split_tasks\n                .with_label_values([\"ongoing\"]),\n            leaf_search_single_split_tasks_pending: leaf_search_single_split_tasks\n                .with_label_values([\"pending\"]),\n            leaf_search_single_split_warmup_num_bytes: new_histogram(\n                \"leaf_search_single_split_warmup_num_bytes\",\n                \"Size of the short lived cache for a single split once the warmup is done.\",\n                \"search\",\n                pseudo_exponential_bytes_buckets,\n            ),\n            job_assigned_total: new_counter_vec(\n                \"job_assigned_total\",\n                \"Number of job assigned to searchers, per affinity rank.\",\n                \"search\",\n                &[],\n                
[\"affinity\"],\n            ),\n            searcher_local_kv_store_size_bytes: new_gauge(\n                \"searcher_local_kv_store_size_bytes\",\n                \"Size of the searcher kv store in bytes. This store is used to cache scroll \\\n                 contexts.\",\n                \"search\",\n                &[],\n            ),\n        }\n    }\n}\n\n/// `SEARCH_METRICS` exposes a set of search-related metrics through a Prometheus\n/// endpoint.\npub static SEARCH_METRICS: Lazy<SearchMetrics> = Lazy::new(SearchMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-search/src/metrics_trackers.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// See https://prometheus.io/docs/practices/naming/\n\nuse std::pin::Pin;\nuse std::task::{Context, Poll, ready};\nuse std::time::Instant;\n\nuse pin_project::{pin_project, pinned_drop};\nuse quickwit_proto::search::LeafSearchResponse;\n\nuse crate::SearchError;\nuse crate::metrics::SEARCH_METRICS;\n\n// root\n\npub enum RootSearchMetricsStep {\n    Plan,\n    Exec { num_targeted_splits: usize },\n}\n\n/// Wrapper around the plan and search futures to track metrics.\n#[pin_project(PinnedDrop)]\npub struct RootSearchMetricsFuture<F> {\n    #[pin]\n    pub tracked: F,\n    pub start: Instant,\n    pub step: RootSearchMetricsStep,\n    pub is_success: Option<bool>,\n}\n\n#[pinned_drop]\nimpl<F> PinnedDrop for RootSearchMetricsFuture<F> {\n    fn drop(self: Pin<&mut Self>) {\n        let (num_targeted_splits, status) = match (&self.step, self.is_success) {\n            // this is a partial success, actual success is recorded during the search step\n            (RootSearchMetricsStep::Plan, Some(true)) => return,\n            (RootSearchMetricsStep::Plan, Some(false)) => (0, \"plan-error\"),\n            (RootSearchMetricsStep::Plan, None) => (0, \"plan-cancelled\"),\n            (\n                RootSearchMetricsStep::Exec {\n                    num_targeted_splits,\n                },\n                Some(true),\n            ) => 
(*num_targeted_splits, \"success\"),\n            (\n                RootSearchMetricsStep::Exec {\n                    num_targeted_splits,\n                },\n                Some(false),\n            ) => (*num_targeted_splits, \"error\"),\n            (\n                RootSearchMetricsStep::Exec {\n                    num_targeted_splits,\n                },\n                None,\n            ) => (*num_targeted_splits, \"cancelled\"),\n        };\n\n        let label_values = [status];\n        SEARCH_METRICS\n            .root_search_requests_total\n            .with_label_values(label_values)\n            .inc();\n        SEARCH_METRICS\n            .root_search_request_duration_seconds\n            .with_label_values(label_values)\n            .observe(self.start.elapsed().as_secs_f64());\n        SEARCH_METRICS\n            .root_search_targeted_splits\n            .with_label_values(label_values)\n            .observe(num_targeted_splits as f64);\n    }\n}\n\nimpl<F, R, E> Future for RootSearchMetricsFuture<F>\nwhere F: Future<Output = Result<R, E>>\n{\n    type Output = Result<R, E>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let this = self.project();\n        let response = ready!(this.tracked.poll(cx));\n        *this.is_success = Some(response.is_ok());\n        Poll::Ready(Ok(response?))\n    }\n}\n\n// leaf\n\n/// Wrapper around the search future to track metrics.\n#[pin_project(PinnedDrop)]\npub struct LeafSearchMetricsFuture<F>\nwhere F: Future<Output = Result<LeafSearchResponse, SearchError>>\n{\n    #[pin]\n    pub tracked: F,\n    pub start: Instant,\n    pub targeted_splits: usize,\n    pub status: Option<&'static str>,\n}\n\n#[pinned_drop]\nimpl<F> PinnedDrop for LeafSearchMetricsFuture<F>\nwhere F: Future<Output = Result<LeafSearchResponse, SearchError>>\n{\n    fn drop(self: Pin<&mut Self>) {\n        let label_values = [self.status.unwrap_or(\"cancelled\")];\n        SEARCH_METRICS\n    
        .leaf_search_requests_total\n            .with_label_values(label_values)\n            .inc();\n        SEARCH_METRICS\n            .leaf_search_request_duration_seconds\n            .with_label_values(label_values)\n            .observe(self.start.elapsed().as_secs_f64());\n        SEARCH_METRICS\n            .leaf_search_targeted_splits\n            .with_label_values(label_values)\n            .observe(self.targeted_splits as f64);\n    }\n}\n\nimpl<F> Future for LeafSearchMetricsFuture<F>\nwhere F: Future<Output = Result<LeafSearchResponse, SearchError>>\n{\n    type Output = Result<LeafSearchResponse, SearchError>;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let this = self.project();\n        let response = ready!(this.tracked.poll(cx));\n        *this.status = if response.is_ok() {\n            Some(\"success\")\n        } else {\n            Some(\"error\")\n        };\n        Poll::Ready(Ok(response?))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/retry/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\npub mod search;\n\nuse std::collections::HashSet;\nuse std::net::SocketAddr;\n\nuse crate::search_job_placer::Job;\nuse crate::{SearchJobPlacer, SearchServiceClient};\n\n/// A retry policy to evaluate if a request should be retried.\n/// A retry can be made either on an error or on a partial success.\npub trait RetryPolicy<Request, Response, Error>: Sized {\n    /// Returns a retry request in case of retry.\n    fn retry_request(\n        &self,\n        request: Request,\n        response_res: &Result<Response, Error>,\n    ) -> Option<Request>;\n}\n\n/// Default retry policy:\n/// - All responses are treated as success.\n/// - All errors are retryable and the retry request is the same as the original one.\npub struct DefaultRetryPolicy {}\n\nimpl<Request, Response, Error> RetryPolicy<Request, Response, Error> for DefaultRetryPolicy {\n    fn retry_request(\n        &self,\n        request: Request,\n        response_res: &Result<Response, Error>,\n    ) -> Option<Request> {\n        match response_res {\n            Ok(_) => None,\n            Err(_) => Some(request),\n        }\n    }\n}\n\nimpl Job for &str {\n    fn split_id(&self) -> &str {\n        self\n    }\n\n    fn cost(&self) -> usize {\n        1\n    }\n}\n\n// Select a new client from the client pool by the following oversimplified policy:\n// 1. 
Take the first split_id of the request\n// 2. Ask for a relevant client for that split while excluding the failing client, identified by\n// its socket addr.\npub async fn retry_client(\n    search_job_placer: &SearchJobPlacer,\n    excluded_addr: SocketAddr,\n    split_id: &str,\n) -> anyhow::Result<SearchServiceClient> {\n    let excluded_addrs = HashSet::from_iter([excluded_addr]);\n    search_job_placer\n        .assign_job(split_id, &excluded_addrs)\n        .await\n}\n\n#[cfg(test)]\nmod tests {\n    use std::net::SocketAddr;\n    use std::sync::Arc;\n\n    use quickwit_proto::search::{FetchDocsResponse, SplitIdAndFooterOffsets};\n\n    use crate::retry::{DefaultRetryPolicy, RetryPolicy, retry_client};\n    use crate::{\n        MockSearchService, SearchError, SearchJobPlacer, SearchServiceClient, SearcherPool,\n    };\n\n    #[test]\n    fn test_should_retry_on_error() {\n        let retry_policy = DefaultRetryPolicy {};\n        let response_res = crate::Result::<()>::Err(SearchError::Internal(\"test\".to_string()));\n        retry_policy.retry_request((), &response_res).unwrap()\n    }\n\n    #[test]\n    fn test_should_not_retry_if_result_is_ok() {\n        let retry_policy = DefaultRetryPolicy {};\n        let response_res =\n            crate::Result::<FetchDocsResponse>::Ok(FetchDocsResponse { hits: Vec::new() });\n        assert!(retry_policy.retry_request((), &response_res).is_none());\n    }\n\n    #[tokio::test]\n    async fn test_retry_client_should_return_another_client() -> anyhow::Result<()> {\n        let searcher_grpc_addr_1 = ([127, 0, 0, 1], 1000).into();\n        let mock_search_service_1 = MockSearchService::new();\n        let searcher_client_1 = SearchServiceClient::from_service(\n            Arc::new(mock_search_service_1),\n            searcher_grpc_addr_1,\n        );\n        let searcher_grpc_addr_2 = ([127, 0, 0, 1], 1001).into();\n        let mock_search_service_2 = MockSearchService::new();\n        let searcher_client_2 = 
SearchServiceClient::from_service(\n            Arc::new(mock_search_service_2),\n            searcher_grpc_addr_2,\n        );\n        let searcher_pool = SearcherPool::from_iter([\n            (searcher_grpc_addr_1, searcher_client_1),\n            (searcher_grpc_addr_2, searcher_client_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let _first_grpc_addr: SocketAddr = \"127.0.0.1:1000\".parse()?;\n        let split_id_and_footer_offsets = SplitIdAndFooterOffsets {\n            split_id: \"split_1\".to_string(),\n            split_footer_end: 100,\n            split_footer_start: 0,\n            timestamp_start: None,\n            timestamp_end: None,\n            num_docs: 0,\n        };\n        let client_for_retry = retry_client(\n            &search_job_placer,\n            searcher_grpc_addr_1,\n            &split_id_and_footer_offsets.split_id,\n        )\n        .await\n        .unwrap();\n        assert_eq!(client_for_retry.grpc_addr().to_string(), \"127.0.0.1:1001\");\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/retry/search.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\n\nuse quickwit_proto::search::{LeafSearchRequest, LeafSearchResponse};\n\nuse super::RetryPolicy;\nuse crate::SearchError;\n\n/// Retry policy for LeafSearchRequest.\n/// A retry is made either on an error or if there are some failing splits.\n/// In the last case, a retry request is built on failing splits only.\npub struct LeafSearchRetryPolicy {}\n\nimpl RetryPolicy<LeafSearchRequest, LeafSearchResponse, SearchError> for LeafSearchRetryPolicy {\n    // Build a retry request on failing split ids only.\n    fn retry_request(\n        &self,\n        mut request: LeafSearchRequest,\n        response_res: &Result<LeafSearchResponse, SearchError>,\n    ) -> Option<LeafSearchRequest> {\n        match response_res {\n            Ok(response) => {\n                if response.failed_splits.is_empty() {\n                    return None;\n                }\n                let failed_splits_hash_set: HashSet<&str> = response\n                    .failed_splits\n                    .iter()\n                    .map(|failed_split| failed_split.split_id.as_str())\n                    .collect();\n                for request in request.leaf_requests.iter_mut() {\n                    // Keep only failed splits\n                    request.split_offsets.retain(|split_metadata| {\n                        
failed_splits_hash_set.contains(split_metadata.split_id.as_str())\n                    });\n                }\n                // Remove requests with empty split_offsets\n                request\n                    .leaf_requests\n                    .retain(|request| !request.split_offsets.is_empty());\n                Some(request)\n            }\n            Err(SearchError::Timeout(_)) => None, // Don't retry on timeout\n            Err(_) => Some(request),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::search::{\n        LeafRequestRef, LeafSearchRequest, LeafSearchResponse, SearchRequest,\n        SplitIdAndFooterOffsets, SplitSearchError,\n    };\n    use quickwit_query::query_ast::qast_json_helper;\n\n    use crate::SearchError;\n    use crate::retry::RetryPolicy;\n    use crate::retry::search::LeafSearchRetryPolicy;\n\n    fn mock_leaf_search_request() -> LeafSearchRequest {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![\"test-idx\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        LeafSearchRequest {\n            search_request: Some(search_request),\n            doc_mappers: vec![\"doc_mapper\".to_string()],\n            index_uris: vec![\"uri\".to_string()],\n            leaf_requests: vec![LeafRequestRef {\n                index_uri_ord: 0,\n                doc_mapper_ord: 0,\n                split_offsets: vec![\n                    SplitIdAndFooterOffsets {\n                        split_id: \"split_1\".to_string(),\n                        split_footer_start: 0,\n                        split_footer_end: 100,\n                        timestamp_start: None,\n                        timestamp_end: None,\n                        num_docs: 0,\n                    },\n                    SplitIdAndFooterOffsets {\n                        split_id: \"split_2\".to_string(),\n       
                 split_footer_start: 0,\n                        split_footer_end: 100,\n                        timestamp_start: None,\n                        timestamp_end: None,\n                        num_docs: 0,\n                    },\n                ],\n            }],\n        }\n    }\n\n    #[test]\n    fn test_should_retry_on_error() {\n        let retry_policy = LeafSearchRetryPolicy {};\n        let request = mock_leaf_search_request();\n        let response_res = Result::<LeafSearchResponse, SearchError>::Err(SearchError::Internal(\n            \"test\".to_string(),\n        ));\n        retry_policy.retry_request(request, &response_res).unwrap();\n    }\n\n    #[test]\n    fn test_should_not_retry_if_result_is_ok_and_no_failing_splits() {\n        let retry_policy = LeafSearchRetryPolicy {};\n        let request = mock_leaf_search_request();\n        let response_res = Ok(LeafSearchResponse {\n            num_hits: 0,\n            partial_hits: Vec::new(),\n            failed_splits: Vec::new(),\n            num_attempted_splits: 1,\n            ..Default::default()\n        });\n        assert!(retry_policy.retry_request(request, &response_res).is_none())\n    }\n\n    #[test]\n    fn test_should_retry_on_failed_splits() {\n        let retry_policy = LeafSearchRetryPolicy {};\n        let request = mock_leaf_search_request();\n        let mut expected_retry_request = request.clone();\n        expected_retry_request.leaf_requests[0]\n            .split_offsets\n            .remove(0);\n        let split_error = SplitSearchError {\n            error: \"error\".to_string(),\n            split_id: \"split_2\".to_string(),\n            retryable_error: true,\n        };\n        let response_res = Ok(LeafSearchResponse {\n            num_hits: 0,\n            partial_hits: Vec::new(),\n            failed_splits: vec![split_error],\n            num_attempted_splits: 1,\n            ..Default::default()\n        });\n        let retry_request = 
retry_policy.retry_request(request, &response_res).unwrap();\n        assert_eq!(retry_request, expected_retry_request);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/root.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::sync::OnceLock;\nuse std::sync::atomic::{AtomicU64, Ordering};\nuse std::time::{Duration, Instant};\n\nuse anyhow::Context;\nuse futures::future::try_join_all;\nuse itertools::Itertools;\nuse quickwit_common::pretty::PrettySample;\nuse quickwit_common::shared_consts;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::build_doc_mapper;\nuse quickwit_doc_mapper::DYNAMIC_FIELD_NAME;\nuse quickwit_doc_mapper::tag_pruning::extract_tags_from_query;\nuse quickwit_metastore::{IndexMetadata, ListIndexesMetadataResponseExt, SplitMetadata};\nuse quickwit_proto::metastore::{\n    ListIndexesMetadataRequest, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::search::{\n    FetchDocsRequest, FetchDocsResponse, Hit, LeafHit, LeafRequestRef, LeafSearchRequest,\n    LeafSearchResponse, PartialHit, SearchPlanResponse, SearchRequest, SearchResponse,\n    SnippetRequest, SortDatetimeFormat, SortField, SortValue, SplitIdAndFooterOffsets,\n};\nuse quickwit_proto::types::{IndexUid, SplitId};\nuse quickwit_query::query_ast::{\n    BoolQuery, QueryAst, QueryAstVisitor, RangeQuery, TermQuery, TermSetQuery,\n};\nuse serde::{Deserialize, Serialize};\nuse tantivy::TantivyError;\nuse tantivy::aggregation::agg_result::AggregationResults;\nuse 
tantivy::aggregation::intermediate_agg_result::IntermediateAggregationResults;\nuse tantivy::collector::Collector;\nuse tantivy::schema::{Field, FieldEntry, FieldType, Schema};\nuse tracing::{debug, error, info, info_span, instrument};\n\nuse crate::cluster_client::ClusterClient;\nuse crate::collector::{QuickwitAggregations, make_merge_collector};\nuse crate::metrics_trackers::{RootSearchMetricsFuture, RootSearchMetricsStep};\nuse crate::scroll_context::{ScrollContext, ScrollKeyAndStartOffset};\nuse crate::search_job_placer::{Job, group_by, group_jobs_by_index_id};\nuse crate::search_response_rest::StorageRequestCount;\nuse crate::service::SearcherContext;\nuse crate::{\n    SearchError, SearchJobPlacer, SearchPlanResponseRest, SearchServiceClient,\n    extract_split_and_footer_offsets, list_relevant_splits,\n};\n\n/// Maximum accepted scroll TTL.\nfn max_scroll_ttl() -> Duration {\n    static MAX_SCROLL_TTL_LOCK: OnceLock<Duration> = OnceLock::new();\n    *MAX_SCROLL_TTL_LOCK.get_or_init(|| {\n        let split_deletion_grace_period = shared_consts::split_deletion_grace_period();\n        assert!(\n            split_deletion_grace_period >= shared_consts::MINIMUM_DELETION_GRACE_PERIOD,\n            \"The split deletion grace period is too short ({split_deletion_grace_period:?}). 
This \\\n             should not happen.\"\n        );\n        // We remove an extra margin of 2minutes from the split deletion grace period.\n        split_deletion_grace_period - Duration::from_secs(60 * 2)\n    })\n}\n\nconst SORT_DOC_FIELD_NAMES: &[&str] = &[\"_shard_doc\", \"_doc\"];\n\n/// SearchJob to be assigned to search clients by the [`SearchJobPlacer`].\n#[derive(Debug, Clone, PartialEq)]\npub struct SearchJob {\n    /// The index UID.\n    pub index_uid: IndexUid,\n    cost: usize,\n    /// The split ID and footer offsets of the split.\n    pub offsets: SplitIdAndFooterOffsets,\n}\n\nimpl SearchJob {\n    /// Create a fake job from a split_id (used for hashing), and a cost.\n    #[cfg(test)]\n    pub fn for_test(split_id: &str, cost: usize) -> SearchJob {\n        use std::str::FromStr;\n        SearchJob {\n            index_uid: IndexUid::from_str(\"test-index:00000000000000000000000000\").unwrap(),\n            cost,\n            offsets: SplitIdAndFooterOffsets {\n                split_id: split_id.to_string(),\n                ..Default::default()\n            },\n        }\n    }\n}\n\nimpl From<SearchJob> for SplitIdAndFooterOffsets {\n    fn from(search_job: SearchJob) -> Self {\n        search_job.offsets\n    }\n}\n\nimpl<'a> From<&'a SplitMetadata> for SearchJob {\n    fn from(split_metadata: &'a SplitMetadata) -> Self {\n        SearchJob {\n            index_uid: split_metadata.index_uid.clone(),\n            cost: compute_split_cost(split_metadata),\n            offsets: extract_split_and_footer_offsets(split_metadata),\n        }\n    }\n}\n\nimpl Job for SearchJob {\n    fn split_id(&self) -> &str {\n        &self.offsets.split_id\n    }\n\n    fn cost(&self) -> usize {\n        self.cost\n    }\n}\n\npub struct FetchDocsJob {\n    index_uid: IndexUid,\n    offsets: SplitIdAndFooterOffsets,\n    pub partial_hits: Vec<PartialHit>,\n}\n\nimpl Job for FetchDocsJob {\n    fn split_id(&self) -> &str {\n        &self.offsets.split_id\n    
}\n\n    fn cost(&self) -> usize {\n        self.partial_hits.len()\n    }\n}\n\nimpl From<FetchDocsJob> for SplitIdAndFooterOffsets {\n    fn from(fetch_docs_job: FetchDocsJob) -> SplitIdAndFooterOffsets {\n        fetch_docs_job.offsets\n    }\n}\n\n/// Index metas needed for executing a leaf search request.\n#[derive(Serialize, Deserialize, Clone, Debug)]\npub struct IndexMetasForLeafSearch {\n    /// Index URI.\n    pub index_uri: Uri,\n    /// Doc mapper json string.\n    pub doc_mapper_str: String,\n}\n\npub(crate) type IndexesMetasForLeafSearch = HashMap<IndexUid, IndexMetasForLeafSearch>;\n\n#[derive(Debug)]\nstruct RequestMetadata {\n    timestamp_field_opt: Option<String>,\n    query_ast_resolved: QueryAst,\n    indexes_meta_for_leaf_search: IndexesMetasForLeafSearch,\n    sort_fields_is_datetime: HashMap<String, bool>,\n}\n\n/// Validates request against each index's doc mapper and ensures that:\n/// - timestamp fields (if any) are equal across indexes.\n/// - resolved query ASTs are the same across indexes.\n/// - if a sort field is of type datetime, it must be a datetime field on all indexes. 
This\n///   constraint come from the need to support datetime formatting on sort values.\n///\n/// Returns the timestamp field, the resolved query AST and the indexes metadatas\n/// needed for leaf search requests.\n/// Note: the requirements on timestamp fields and resolved query ASTs can be lifted\n/// but it adds complexity that does not seem needed right now.\nfn validate_request_and_build_metadata(\n    indexes_metadata: &[IndexMetadata],\n    search_request: &SearchRequest,\n) -> crate::Result<RequestMetadata> {\n    validate_sort_by_fields_and_search_after(\n        &search_request.sort_fields,\n        &search_request.search_after,\n    )?;\n    let query_ast: QueryAst = serde_json::from_str(&search_request.query_ast)\n        .map_err(|err| SearchError::InvalidQuery(err.to_string()))?;\n    let mut indexes_meta_for_leaf_search: HashMap<IndexUid, IndexMetasForLeafSearch> =\n        HashMap::new();\n    let mut query_ast_resolved_opt: Option<QueryAst> = None;\n    let mut timestamp_field_opt: Option<String> = None;\n    let mut sort_fields_is_datetime: HashMap<String, bool> = HashMap::new();\n\n    for index_metadata in indexes_metadata {\n        let doc_mapper = build_doc_mapper(\n            &index_metadata.index_config.doc_mapping,\n            &index_metadata.index_config.search_settings,\n        )\n        .map_err(|err| {\n            SearchError::Internal(format!(\"failed to build doc mapper. 
cause: {err}\"))\n        })?;\n        let query_ast_resolved_for_index = query_ast\n            .clone()\n            .parse_user_query(doc_mapper.default_search_fields())\n            // We convert the error to return a 400 to the user (and not a 500).\n            .map_err(|err| SearchError::InvalidQuery(err.to_string()))?;\n\n        // Validate uniqueness of resolved query AST.\n        if let Some(query_ast_resolved) = &query_ast_resolved_opt {\n            if query_ast_resolved != &query_ast_resolved_for_index {\n                return Err(SearchError::InvalidQuery(\n                    \"resolved query ASTs must be the same across indexes. resolving queries with \\\n                     different default fields are different between indexes is not supported\"\n                        .to_string(),\n                ));\n            }\n        } else {\n            query_ast_resolved_opt = Some(query_ast_resolved_for_index.clone());\n        }\n\n        // Validate uniqueness of timestamp field if any.\n        if let Some(timestamp_field_for_index) = doc_mapper.timestamp_field_name() {\n            match timestamp_field_opt {\n                Some(timestamp_field) if timestamp_field != timestamp_field_for_index => {\n                    return Err(SearchError::InvalidQuery(\n                        \"the timestamp field (if present) must be the same for all indexes\"\n                            .to_string(),\n                    ));\n                }\n                None => {\n                    timestamp_field_opt = Some(timestamp_field_for_index.to_string());\n                }\n                _ => {}\n            }\n        }\n\n        // Validate request against the current index schema.\n        let schema = doc_mapper.schema();\n        validate_request(&schema, &doc_mapper.timestamp_field_name(), search_request)?;\n\n        validate_sort_field_types(\n            &schema,\n            &search_request.sort_fields,\n            &mut 
sort_fields_is_datetime,\n        )?;\n\n        // Validates the query by effectively building it against the current schema.\n        doc_mapper.query(\n            doc_mapper.schema(),\n            query_ast_resolved_for_index,\n            true,\n            None,\n        )?;\n\n        let index_metadata_for_leaf_search = IndexMetasForLeafSearch {\n            index_uri: index_metadata.index_uri().clone(),\n            doc_mapper_str: serde_json::to_string(&doc_mapper).map_err(|err| {\n                SearchError::Internal(format!(\"failed to serialize doc mapper. cause: {err}\"))\n            })?,\n        };\n        indexes_meta_for_leaf_search.insert(\n            index_metadata.index_uid.clone(),\n            index_metadata_for_leaf_search,\n        );\n    }\n\n    let query_ast_resolved = query_ast_resolved_opt.ok_or_else(|| {\n        SearchError::Internal(\n            \"resolved query AST must be present. this should never happen\".to_string(),\n        )\n    })?;\n\n    Ok(RequestMetadata {\n        timestamp_field_opt,\n        query_ast_resolved,\n        indexes_meta_for_leaf_search,\n        sort_fields_is_datetime,\n    })\n}\n\n/// Validate sort field types.\nfn validate_sort_field_types(\n    schema: &Schema,\n    sort_fields: &[SortField],\n    sort_field_is_datetime: &mut HashMap<String, bool>,\n) -> crate::Result<()> {\n    for sort_field in sort_fields.iter() {\n        if let Some(sort_field_entry) = get_sort_by_field_entry(&sort_field.field_name, schema)? 
{\n            validate_sort_by_field_type(\n                sort_field_entry,\n                sort_field.sort_datetime_format.is_some(),\n            )?;\n            // If sort field type is a date, ensure it's true for all indexes.\n            if let Some(is_datetime) = sort_field_is_datetime.get(&sort_field.field_name) {\n                if *is_datetime != sort_field_entry.field_type().is_date() {\n                    return Err(SearchError::InvalidQuery(format!(\n                        \"sort datetime field `{}` must be of type datetime on all indexes\",\n                        sort_field_entry.name(),\n                    )));\n                }\n            } else {\n                sort_field_is_datetime.insert(\n                    sort_field.field_name.to_string(),\n                    sort_field_entry.field_type().is_date(),\n                );\n            }\n        } else {\n            sort_field_is_datetime.insert(sort_field.field_name.to_string(), false);\n        }\n    }\n    Ok(())\n}\n\nfn validate_requested_snippet_fields(\n    schema: &Schema,\n    snippet_fields: &[String],\n) -> anyhow::Result<()> {\n    for field_name in snippet_fields {\n        let field_entry = schema\n            .get_field(field_name)\n            .map(|field| schema.get_field_entry(field))?;\n        match field_entry.field_type() {\n            FieldType::Str(text_options) => {\n                if !text_options.is_stored() {\n                    return Err(anyhow::anyhow!(\n                        \"the snippet field `{}` must be stored\",\n                        field_name\n                    ));\n                }\n            }\n            other => {\n                return Err(anyhow::anyhow!(\n                    \"the snippet field `{}` must be of type `Str`, got `{}`\",\n                    field_name,\n                    other.value_type().name()\n                ));\n            }\n        }\n    }\n    Ok(())\n}\n\nfn 
simplify_search_request_for_scroll_api(req: &SearchRequest) -> crate::Result<SearchRequest> {\n    if req.search_after.is_some() {\n        return Err(SearchError::InvalidArgument(\n            \"search_after cannot be used in a scroll context\".to_string(),\n        ));\n    }\n\n    // We do not mutate\n    Ok(SearchRequest {\n        index_id_patterns: req.index_id_patterns.clone(),\n        query_ast: req.query_ast.clone(),\n        start_timestamp: req.start_timestamp,\n        end_timestamp: req.end_timestamp,\n        max_hits: req.max_hits,\n        start_offset: req.start_offset,\n        sort_fields: req.sort_fields.clone(),\n        // We remove all aggregation request.\n        // The aggregation will not be computed for each scroll request.\n        aggregation_request: None,\n        // We remove the snippet fields. This feature is not supported for scroll requests.\n        snippet_fields: Vec::new(),\n        // We remove the scroll ttl parameter. It is irrelevant to process later request\n        scroll_ttl_secs: None,\n        search_after: None,\n        // request is simplified after initial query, and we cache the hit count, so we don't need\n        // to recompute it afterward.\n        count_hits: quickwit_proto::search::CountHits::Underestimate as i32,\n        ignore_missing_indexes: req.ignore_missing_indexes,\n        skip_aggregation_finalization: false,\n    })\n}\n\n/// Validates sort fields and search after values.\n/// - validate sort fields length.\n/// - search after values must be set for all sort fields.\nfn validate_sort_by_fields_and_search_after(\n    sort_fields: &[SortField],\n    search_after: &Option<PartialHit>,\n) -> crate::Result<()> {\n    if sort_fields.is_empty() {\n        return Ok(());\n    }\n    if sort_fields.len() > 2 {\n        return Err(SearchError::InvalidArgument(format!(\n            \"sort by field must be up to 2 fields, got {}\",\n            sort_fields.len()\n        )));\n    }\n    let 
Some(search_after_partial_hit) = search_after.as_ref() else {\n        return Ok(());\n    };\n\n    let sort_fields_without_doc_count = sort_fields\n        .iter()\n        .filter(|sort_field| !SORT_DOC_FIELD_NAMES.contains(&sort_field.field_name.as_str()))\n        .count();\n    let has_doc_sort_field = sort_fields_without_doc_count != sort_fields.len();\n    if has_doc_sort_field && search_after_partial_hit.split_id.is_empty() {\n        return Err(SearchError::InvalidArgument(\n            \"search_after with a sort field `_doc` must define a split ID, segment ID and doc ID \\\n             values\"\n                .to_string(),\n        ));\n    }\n\n    let mut search_after_sort_value_count = 0;\n    // TODO: we could validate if the search after sort value types of consistent with the sort\n    // field types.\n    if let Some(sort_by_value) = search_after_partial_hit.sort_value.as_ref() {\n        sort_by_value.sort_value.context(\"sort value must be set\")?;\n        search_after_sort_value_count += 1;\n    }\n    if let Some(sort_by_value_2) = search_after_partial_hit.sort_value2.as_ref() {\n        sort_by_value_2\n            .sort_value\n            .context(\"sort value must be set\")?;\n        search_after_sort_value_count += 1;\n    }\n    if search_after_sort_value_count != sort_fields_without_doc_count {\n        return Err(SearchError::InvalidArgument(format!(\n            \"`search_after` must have the same number of sort values as sort by fields {:?}\",\n            sort_fields\n                .iter()\n                .map(|sort_field| &sort_field.field_name)\n                .collect_vec()\n        )));\n    }\n    Ok(())\n}\n\nfn get_sort_by_field_entry<'a>(\n    field_name: &str,\n    schema: &'a Schema,\n) -> crate::Result<Option<&'a FieldEntry>> {\n    if \"_score\" == field_name || SORT_DOC_FIELD_NAMES.contains(&field_name) {\n        return Ok(None);\n    }\n    let dynamic_field_opt = schema.get_field(DYNAMIC_FIELD_NAME).ok();\n   
 let (sort_by_field, _json_path) = schema\n        .find_field_with_default(field_name, dynamic_field_opt)\n        .ok_or_else(|| {\n            SearchError::InvalidArgument(format!(\"unknown field used in `sort by`: {field_name}\"))\n        })?;\n    let sort_by_field_entry = schema.get_field_entry(sort_by_field);\n    Ok(Some(sort_by_field_entry))\n}\n\n/// Validates sort field type.\nfn validate_sort_by_field_type(\n    sort_by_field_entry: &FieldEntry,\n    has_timestamp_format: bool,\n) -> crate::Result<()> {\n    let field_name = sort_by_field_entry.name();\n    if matches!(sort_by_field_entry.field_type(), FieldType::Str(_)) {\n        return Err(SearchError::InvalidArgument(format!(\n            \"sort by field on type text is currently not supported `{field_name}`\"\n        )));\n    }\n    if !sort_by_field_entry.is_fast() {\n        return Err(SearchError::InvalidArgument(format!(\n            \"sort by field must be a fast field, please add the fast property to your field \\\n             `{field_name}`\",\n        )));\n    }\n    if has_timestamp_format && !sort_by_field_entry.field_type().is_date() {\n        return Err(SearchError::InvalidArgument(format!(\n            \"sort by field with a timestamp format must be a datetime field and the field \\\n             `{field_name}` is not\",\n        )));\n    }\n    Ok(())\n}\n\nfn check_is_fast_field(\n    schema: &Schema,\n    fast_field_name: &str,\n    dynamic_fast_field: Option<Field>,\n) -> crate::Result<()> {\n    let Some((field, _path)): Option<(Field, &str)> =\n        schema.find_field_with_default(fast_field_name, dynamic_fast_field)\n    else {\n        return Err(SearchError::InvalidArgument(format!(\n            \"Field \\\"{fast_field_name}\\\" does not exist\"\n        )));\n    };\n    let field_entry: &FieldEntry = schema.get_field_entry(field);\n    if !field_entry.is_fast() {\n        return Err(SearchError::InvalidArgument(format!(\n            \"Field \\\"{fast_field_name}\\\" 
is not configured as a fast field\"\n        )));\n    }\n    Ok(())\n}\n\nfn validate_request(\n    schema: &Schema,\n    timestamp_field_name: &Option<&str>,\n    search_request: &SearchRequest,\n) -> crate::Result<()> {\n    if timestamp_field_name.is_none()\n        && (search_request.start_timestamp.is_some() || search_request.end_timestamp.is_some())\n    {\n        return Err(SearchError::InvalidQuery(format!(\n            \"the timestamp field is not set in index: {:?} definition but start-timestamp or \\\n             end-timestamp are set in the query\",\n            search_request.index_id_patterns\n        )));\n    }\n\n    validate_requested_snippet_fields(schema, &search_request.snippet_fields)?;\n\n    if let Some(agg) = search_request.aggregation_request.as_ref() {\n        let aggs: QuickwitAggregations = serde_json::from_str(agg).map_err(|_err| {\n            let err = serde_json::from_str::<tantivy::aggregation::agg_req::Aggregations>(agg)\n                .unwrap_err();\n            SearchError::InvalidAggregationRequest(err.to_string())\n        })?;\n\n        // ensure that the required fast fields are indeed configured as fast fields.\n        let fast_field_names = aggs.fast_field_names();\n        let dynamic_field = schema.get_field(DYNAMIC_FIELD_NAME).ok();\n        for fast_field_name in &fast_field_names {\n            check_is_fast_field(schema, fast_field_name, dynamic_field)?;\n        }\n    };\n\n    if search_request.start_offset > 10_000 {\n        return Err(SearchError::InvalidArgument(format!(\n            \"max value for start_offset is 10_000, but got {}\",\n            search_request.start_offset\n        )));\n    }\n\n    if search_request.max_hits > 10_000 {\n        return Err(SearchError::InvalidArgument(format!(\n            \"max value for max_hits is 10_000, but got {}\",\n            search_request.max_hits\n        )));\n    }\n\n    Ok(())\n}\n\nfn get_scroll_ttl_duration(search_request: &SearchRequest) -> 
crate::Result<Option<Duration>> {\n    let Some(scroll_ttl_secs) = search_request.scroll_ttl_secs else {\n        return Ok(None);\n    };\n    let scroll_ttl: Duration = Duration::from_secs(scroll_ttl_secs as u64);\n    let max_scroll_ttl = max_scroll_ttl();\n    if scroll_ttl > max_scroll_ttl {\n        return Err(SearchError::InvalidArgument(format!(\n            \"Quickwit only supports scroll TTL period up to {} secs\",\n            max_scroll_ttl.as_secs()\n        )));\n    }\n    Ok(Some(scroll_ttl))\n}\n\n#[instrument(level = \"debug\", skip_all)]\nasync fn search_partial_hits_phase_with_scroll(\n    searcher_context: &SearcherContext,\n    indexes_metas_for_leaf_search: &IndexesMetasForLeafSearch,\n    mut search_request: SearchRequest,\n    split_metadatas: &[SplitMetadata],\n    cluster_client: &ClusterClient,\n) -> crate::Result<(LeafSearchResponse, Option<ScrollKeyAndStartOffset>)> {\n    let scroll_ttl_opt = get_scroll_ttl_duration(&search_request)?;\n\n    if let Some(scroll_ttl) = scroll_ttl_opt {\n        let max_hits = search_request.max_hits;\n        // This is a scroll request.\n        //\n        // We increase max hits to populate the scroll cache.\n        search_request.max_hits = search_request\n            .max_hits\n            .max(shared_consts::SCROLL_BATCH_LEN as u64);\n        search_request.scroll_ttl_secs = None;\n        let mut leaf_search_resp = search_partial_hits_phase(\n            searcher_context,\n            indexes_metas_for_leaf_search,\n            &search_request,\n            split_metadatas,\n            cluster_client,\n        )\n        .await?;\n        let cached_partial_hits = leaf_search_resp.partial_hits.clone();\n        leaf_search_resp.partial_hits.truncate(max_hits as usize);\n        let last_hit = leaf_search_resp\n            .partial_hits\n            .last()\n            .cloned()\n            .unwrap_or_default();\n\n        let scroll_context_search_request =\n            
simplify_search_request_for_scroll_api(&search_request)?;\n        let mut scroll_ctx = ScrollContext {\n            indexes_metas_for_leaf_search: indexes_metas_for_leaf_search.clone(),\n            split_metadatas: split_metadatas.to_vec(),\n            search_request: scroll_context_search_request,\n            total_num_hits: leaf_search_resp.num_hits,\n            max_hits_per_page: max_hits,\n            cached_partial_hits_start_offset: search_request.start_offset,\n            cached_partial_hits,\n            failed_splits: leaf_search_resp.failed_splits.clone(),\n            num_successful_splits: leaf_search_resp.num_successful_splits,\n        };\n        let scroll_key_and_start_offset: ScrollKeyAndStartOffset =\n            ScrollKeyAndStartOffset::new_with_start_offset(\n                scroll_ctx.search_request.start_offset,\n                max_hits as u32,\n                last_hit.clone(),\n            )\n            .next_page(leaf_search_resp.partial_hits.len() as u64, last_hit);\n\n        scroll_ctx.clear_cache_if_unneeded();\n        let payload: Vec<u8> = scroll_ctx.serialize();\n        let scroll_key = scroll_key_and_start_offset.scroll_key();\n        cluster_client\n            .put_kv(&scroll_key, &payload, scroll_ttl)\n            .await;\n        Ok((leaf_search_resp, Some(scroll_key_and_start_offset)))\n    } else {\n        let leaf_search_resp = search_partial_hits_phase(\n            searcher_context,\n            indexes_metas_for_leaf_search,\n            &search_request,\n            split_metadatas,\n            cluster_client,\n        )\n        .await?;\n        Ok((leaf_search_resp, None))\n    }\n}\n\n/// Check if the request is a count request without any filters, so we can just return the split\n/// metadata count.\n///\n/// This is done by exclusion, so we will need to keep it up to date if fields are added.\npub fn is_metadata_count_request(request: &SearchRequest) -> bool {\n    let query_ast: QueryAst = 
serde_json::from_str(&request.query_ast).unwrap();\n    is_metadata_count_request_with_ast(&query_ast, request)\n}\n\n/// Check if the request is a count request without any filters, so we can just return the split\n/// metadata count.\n///\n/// This is done by exclusion, so we will need to keep it up to date if fields are added.\n///\n/// The passed query_ast should match the one serialized in the request.\npub fn is_metadata_count_request_with_ast(query_ast: &QueryAst, request: &SearchRequest) -> bool {\n    // TODO detect Cache(MatchAll), Boost(MatchAll) and Bool{must/should:MatchAll}\n    if query_ast != &QueryAst::MatchAll {\n        return false;\n    }\n    if request.max_hits != 0 {\n        return false;\n    }\n\n    // If the start and end timestamp encompass the whole split, it is still a count query.\n    // We remove this currently on the leaf level, but not yet on the root level.\n    // There's a small advantage when we would do this on the root level, since we have the\n    // counts available on the split. 
On the leaf it is currently required to open the split\n    // to get the count.\n    if request.start_timestamp.is_some() || request.end_timestamp.is_some() {\n        return false;\n    }\n    if request.aggregation_request.is_some() || !request.snippet_fields.is_empty() {\n        return false;\n    }\n    true\n}\n\n/// Get a leaf search response that returns the num_docs of the split\npub fn get_count_from_metadata(split_metadatas: &[SplitMetadata]) -> Vec<LeafSearchResponse> {\n    split_metadatas\n        .iter()\n        .map(|metadata| LeafSearchResponse {\n            num_hits: metadata.num_docs as u64,\n            partial_hits: Vec::new(),\n            failed_splits: Vec::new(),\n            num_attempted_splits: 1,\n            num_successful_splits: 1,\n            intermediate_aggregation_result: None,\n            resource_stats: None,\n        })\n        .collect()\n}\n\n/// Returns true if the query is particularly memory intensive.\n///\n/// This function only considers the memory usage associated with the input data\n/// and does not take into account aggregation results (intermediate or not), for instance.\n///\n/// Since its point is to log memory intensive queries, it focuses on the metric of the number of\n/// bytes per document.\n///\n/// The threshold is computed dynamically using gradient descent.\nfn is_top_5pct_memory_intensive(num_bytes: u64, split_num_docs: u64) -> bool {\n    // It is not worth considering small splits for this.\n    if split_num_docs < 100_000 {\n        return false;\n    }\n    // We multiply those figures by 1_000 for accuracy.\n    const PERCENTILE: u64 = 95;\n    const PRIOR_NUM_BYTES_PER_DOC: u64 = 3 * 1_000;\n    static NUM_BYTES_PER_DOC_95_PERCENTILE_ESTIMATOR: AtomicU64 =\n        AtomicU64::new(PRIOR_NUM_BYTES_PER_DOC);\n    let num_bits_per_docs = num_bytes * 1_000 / split_num_docs;\n    let current_estimator = NUM_BYTES_PER_DOC_95_PERCENTILE_ESTIMATOR.load(Ordering::Relaxed);\n    let is_memory_intensive = 
num_bits_per_docs > current_estimator;\n    let new_estimator: u64 = if is_memory_intensive {\n        current_estimator.saturating_add(PRIOR_NUM_BYTES_PER_DOC * PERCENTILE / 100)\n    } else {\n        current_estimator.saturating_sub(PRIOR_NUM_BYTES_PER_DOC * (100 - PERCENTILE) / 100)\n    };\n    // We do not use fetch_add / fetch_sub directly as they wrap around.\n    // Concurrency could lead to different results here, but really we don't care.\n    //\n    // This is just ignoring some gradient updates.\n    NUM_BYTES_PER_DOC_95_PERCENTILE_ESTIMATOR.store(new_estimator, Ordering::Relaxed);\n    is_memory_intensive\n}\n\n/// If this method fails for some splits, a partial search response is returned, with the list of\n/// faulty splits in the failed_splits field.\n#[instrument(level = \"debug\", skip_all)]\npub(crate) async fn search_partial_hits_phase(\n    searcher_context: &SearcherContext,\n    indexes_metas_for_leaf_search: &IndexesMetasForLeafSearch,\n    search_request: &SearchRequest,\n    split_metadatas: &[SplitMetadata],\n    cluster_client: &ClusterClient,\n) -> crate::Result<LeafSearchResponse> {\n    let leaf_search_responses: Vec<LeafSearchResponse> =\n        if is_metadata_count_request(search_request) {\n            get_count_from_metadata(split_metadatas)\n        } else {\n            let jobs: Vec<SearchJob> = split_metadatas.iter().map(SearchJob::from).collect();\n            let assigned_leaf_search_jobs = cluster_client\n                .search_job_placer\n                .assign_jobs(jobs, &HashSet::default())\n                .await?;\n            let mut leaf_request_tasks = Vec::new();\n            for (client, client_jobs) in assigned_leaf_search_jobs {\n                let leaf_request = jobs_to_leaf_request(\n                    search_request,\n                    indexes_metas_for_leaf_search,\n                    client_jobs,\n                )?;\n                leaf_request_tasks.push(cluster_client.leaf_search(leaf_request, 
client.clone()));\n            }\n            try_join_all(leaf_request_tasks).await?\n        };\n\n    let merge_collector =\n        make_merge_collector(search_request, searcher_context.get_aggregation_limits())?;\n\n    // Merging is a cpu-bound task.\n    // It should be executed by Tokio's blocking threads.\n\n    // Wrap into result for merge_fruits\n    let leaf_search_results: Vec<tantivy::Result<LeafSearchResponse>> =\n        leaf_search_responses.into_iter().map(Ok).collect_vec();\n    let span = info_span!(\"merge_fruits\");\n    let leaf_search_response = crate::search_thread_pool()\n        .run_cpu_intensive(move || {\n            let _span_guard = span.enter();\n            merge_collector.merge_fruits(leaf_search_results)\n        })\n        .await\n        .context(\"failed to merge leaf search responses\")?\n        .map_err(|error: TantivyError| crate::SearchError::Internal(error.to_string()))?;\n    debug!(\n        num_hits = leaf_search_response.num_hits,\n        failed_splits = ?leaf_search_response.failed_splits,\n        num_attempted_splits = leaf_search_response.num_attempted_splits,\n        has_intermediate_aggregation_result = leaf_search_response.intermediate_aggregation_result.is_some(),\n        \"Merged leaf search response.\"\n    );\n\n    if let Some(resource_stats) = &leaf_search_response.resource_stats\n        && is_top_5pct_memory_intensive(\n            resource_stats.short_lived_cache_num_bytes,\n            resource_stats.split_num_docs,\n        )\n    {\n        // We log at most 5 times per minute.\n        quickwit_common::rate_limited_info!(\n            limit_per_min = 5,\n            split_num_docs = resource_stats.split_num_docs,\n            short_lived_cached_num_bytes = resource_stats.short_lived_cache_num_bytes,\n            \"memory intensive query\"\n        );\n    }\n\n    if !leaf_search_response.failed_splits.is_empty() {\n        quickwit_common::rate_limited_error!(limit_per_min=6, failed_splits = 
?leaf_search_response.failed_splits, \"leaf search response contains at least one failed split\");\n    }\n\n    Ok(leaf_search_response)\n}\n\npub(crate) fn get_snippet_request(search_request: &SearchRequest) -> Option<SnippetRequest> {\n    if search_request.snippet_fields.is_empty() {\n        return None;\n    }\n    Some(SnippetRequest {\n        snippet_fields: search_request.snippet_fields.clone(),\n        query_ast_resolved: search_request.query_ast.clone(),\n    })\n}\n\n#[instrument(skip_all, fields(partial_hits_num=partial_hits.len()))]\npub(crate) async fn fetch_docs_phase(\n    indexes_metas_for_leaf_search: &IndexesMetasForLeafSearch,\n    partial_hits: &[PartialHit],\n    split_metadatas: &[SplitMetadata],\n    search_request: &SearchRequest,\n    cluster_client: &ClusterClient,\n) -> crate::Result<Vec<Hit>> {\n    let snippet_request: Option<SnippetRequest> = get_snippet_request(search_request);\n    let hit_order: HashMap<(String, u32, u32), usize> = partial_hits\n        .iter()\n        .enumerate()\n        .map(|(position, partial_hit)| {\n            let key = (\n                partial_hit.split_id.clone(),\n                partial_hit.segment_ord,\n                partial_hit.doc_id,\n            );\n            (key, position)\n        })\n        .collect();\n\n    let assigned_fetch_docs_jobs = assign_client_fetch_docs_jobs(\n        partial_hits,\n        split_metadatas,\n        &cluster_client.search_job_placer,\n    )\n    .await?;\n\n    let mut fetch_docs_tasks = Vec::new();\n    for (client, client_jobs) in assigned_fetch_docs_jobs {\n        let fetch_jobs_requests = jobs_to_fetch_docs_requests(\n            snippet_request.clone(),\n            indexes_metas_for_leaf_search,\n            client_jobs,\n        )?;\n        for fetch_docs_request in fetch_jobs_requests {\n            fetch_docs_tasks.push(cluster_client.fetch_docs(fetch_docs_request, client.clone()));\n        }\n    }\n    let fetch_docs_responses: 
Vec<FetchDocsResponse> = try_join_all(fetch_docs_tasks).await?;\n\n    // Merge the fetched docs.\n    let leaf_hits = fetch_docs_responses\n        .into_iter()\n        .flat_map(|response| response.hits.into_iter());\n\n    // Build map of Split ID > index ID to add the index ID to the hits.\n    // Used for ES compatibility.\n    let split_id_to_index_id_map: HashMap<&SplitId, &str> = split_metadatas\n        .iter()\n        .map(|split_metadata| {\n            (\n                &split_metadata.split_id,\n                split_metadata.index_uid.index_id.as_str(),\n            )\n        })\n        .collect();\n    let mut sort_field_iter = search_request.sort_fields.iter();\n    let sort_field_1_datetime_format_opt: Option<SortDatetimeFormat> =\n        get_sort_field_datetime_format(sort_field_iter.next())?;\n    let sort_field_2_datetime_format_opt: Option<SortDatetimeFormat> =\n        get_sort_field_datetime_format(sort_field_iter.next())?;\n    let mut hits_with_position: Vec<(usize, Hit)> = leaf_hits\n        .map(|leaf_hit| {\n            build_hit_with_position(\n                leaf_hit,\n                &split_id_to_index_id_map,\n                &hit_order,\n                &sort_field_1_datetime_format_opt,\n                &sort_field_2_datetime_format_opt,\n            )\n        })\n        .try_collect()?;\n\n    hits_with_position.sort_by_key(|(position, _)| *position);\n    let hits: Vec<Hit> = hits_with_position\n        .into_iter()\n        .map(|(_position, hit)| hit)\n        .collect();\n\n    Ok(hits)\n}\n\nfn build_hit_with_position(\n    mut leaf_hit: LeafHit,\n    split_id_to_index_id_map: &HashMap<&SplitId, &str>,\n    hit_order: &HashMap<(String, u32, u32), usize>,\n    sort_field_1_datetime_format_opt: &Option<SortDatetimeFormat>,\n    sort_field_2_datetime_format_opt: &Option<SortDatetimeFormat>,\n) -> crate::Result<(usize, Hit)> {\n    let partial_hit_ref = leaf_hit\n        .partial_hit\n        .as_mut()\n        
.expect(\"partial hit must be present\");\n    let key = (\n        partial_hit_ref.split_id.clone(),\n        partial_hit_ref.segment_ord,\n        partial_hit_ref.doc_id,\n    );\n    let sort_value_opt = partial_hit_ref\n        .sort_value\n        .as_mut()\n        .and_then(|sort_field| sort_field.sort_value.as_mut());\n    if let Some(sort_by_value) = sort_value_opt\n        && let Some(output_datetime_format) = &sort_field_1_datetime_format_opt\n    {\n        convert_sort_datetime_value(sort_by_value, *output_datetime_format)?;\n    }\n    let sort_value_2_opt = partial_hit_ref\n        .sort_value2\n        .as_mut()\n        .and_then(|sort_field| sort_field.sort_value.as_mut());\n    if let Some(sort_by_value) = sort_value_2_opt\n        && let Some(output_datetime_format) = &sort_field_2_datetime_format_opt\n    {\n        convert_sort_datetime_value(sort_by_value, *output_datetime_format)?;\n    }\n    let position = *hit_order.get(&key).expect(\"hit order must be present\");\n    let index_id = split_id_to_index_id_map\n        .get(&partial_hit_ref.split_id)\n        .map(|split_id| split_id.to_string())\n        .unwrap_or_default();\n\n    Result::<(usize, Hit), SearchError>::Ok((\n        position,\n        Hit {\n            json: leaf_hit.leaf_json,\n            partial_hit: leaf_hit.partial_hit,\n            snippet: leaf_hit.leaf_snippet_json,\n            index_id,\n        },\n    ))\n}\n\nfn get_sort_field_datetime_format(\n    sort_field: Option<&SortField>,\n) -> crate::Result<Option<SortDatetimeFormat>> {\n    if let Some(sort_field) = sort_field\n        && let Some(sort_field_datetime_format_int) = &sort_field.sort_datetime_format\n    {\n        let sort_field_datetime_format =\n            SortDatetimeFormat::try_from(*sort_field_datetime_format_int)\n                .context(\"invalid sort datetime format\")?;\n        return Ok(Some(sort_field_datetime_format));\n    }\n    Ok(None)\n}\n\n/// Performs a distributed search.\n/// 
1. Sends leaf requests over gRPC to multiple leaf nodes.\n/// 2. Merges the search results.\n/// 3. Sends fetch docs requests to multiple leaf nodes.\n/// 4. Builds the response with docs and returns.\nasync fn root_search_aux(\n    searcher_context: &SearcherContext,\n    indexes_metas_for_leaf_search: &IndexesMetasForLeafSearch,\n    search_request: SearchRequest,\n    split_metadatas: Vec<SplitMetadata>,\n    cluster_client: &ClusterClient,\n) -> crate::Result<SearchResponse> {\n    debug!(split_metadatas = ?PrettySample::new(&split_metadatas, 5));\n    let (first_phase_result, scroll_key_and_start_offset_opt): (\n        LeafSearchResponse,\n        Option<ScrollKeyAndStartOffset>,\n    ) = search_partial_hits_phase_with_scroll(\n        searcher_context,\n        indexes_metas_for_leaf_search,\n        search_request.clone(),\n        &split_metadatas[..],\n        cluster_client,\n    )\n    .await?;\n\n    let hits = fetch_docs_phase(\n        indexes_metas_for_leaf_search,\n        &first_phase_result.partial_hits,\n        &split_metadatas[..],\n        &search_request,\n        cluster_client,\n    )\n    .await?;\n\n    let mut aggregation_result_postcard_opt = finalize_aggregation_if_any(\n        &search_request,\n        first_phase_result.intermediate_aggregation_result,\n        searcher_context,\n    )?;\n    // In case there is no index, we don't want the response to contain any aggregation structure\n    if indexes_metas_for_leaf_search.is_empty() {\n        aggregation_result_postcard_opt = None;\n    }\n\n    Ok(SearchResponse {\n        aggregation_postcard: aggregation_result_postcard_opt,\n        num_hits: first_phase_result.num_hits,\n        hits,\n        elapsed_time_micros: 0u64,\n        errors: Vec::new(),\n        scroll_id: scroll_key_and_start_offset_opt\n            .as_ref()\n            .map(ToString::to_string),\n        failed_splits: first_phase_result.failed_splits,\n        num_successful_splits: 
first_phase_result.num_successful_splits,\n    })\n}\n\nfn finalize_aggregation(\n    intermediate_aggregation_result_bytes_opt: Option<Vec<u8>>,\n    aggregations: QuickwitAggregations,\n    searcher_context: &SearcherContext,\n) -> crate::Result<Option<Vec<u8>>> {\n    let merge_aggregation_result = match aggregations {\n        QuickwitAggregations::FindTraceIdsAggregation(_) => {\n            // The merge collector has already merged the intermediate results.\n            return Ok(intermediate_aggregation_result_bytes_opt);\n        }\n        QuickwitAggregations::TantivyAggregations(aggregations) => {\n            let intermediate_aggregation_results =\n                if let Some(intermediate_aggregation_result_bytes) =\n                    intermediate_aggregation_result_bytes_opt\n                {\n                    let intermediate_aggregation_results: IntermediateAggregationResults =\n                        postcard::from_bytes(&intermediate_aggregation_result_bytes)?;\n                    intermediate_aggregation_results\n                } else {\n                    // Default, to return correct structure\n                    Default::default()\n                };\n            let final_aggregation_results: AggregationResults = intermediate_aggregation_results\n                .into_final_result(aggregations, searcher_context.get_aggregation_limits())?;\n            let final_aggregation_proxy: quickwit_query::aggregations::AggregationResults =\n                final_aggregation_results.into();\n            postcard::to_stdvec(&final_aggregation_proxy)?\n        }\n    };\n    Ok(Some(merge_aggregation_result))\n}\n\nfn finalize_aggregation_if_any(\n    search_request: &SearchRequest,\n    intermediate_aggregation_result_bytes_opt: Option<Vec<u8>>,\n    searcher_context: &SearcherContext,\n) -> crate::Result<Option<Vec<u8>>> {\n    let Some(aggregations_json) = search_request.aggregation_request.as_ref() else {\n        return Ok(None);\n    };\n  
  if search_request.skip_aggregation_finalization {\n        return Ok(intermediate_aggregation_result_bytes_opt);\n    }\n    let aggregations: QuickwitAggregations = serde_json::from_str(aggregations_json)?;\n    let aggregation_result_postcard = finalize_aggregation(\n        intermediate_aggregation_result_bytes_opt,\n        aggregations,\n        searcher_context,\n    )?;\n    Ok(aggregation_result_postcard)\n}\n\n/// Checks that all of the searched indexes were found.\n///\n/// An index pattern (= containing a wildcard) not matching is not an error.\n/// A specific index id however must be found.\n///\n/// We put this check here and not in the metastore to make sure the logic is independent\n/// of the metastore implementation, and some different use cases could require different\n/// behaviors. This specification was principally motivated by #4042.\npub fn ensure_all_indexes_found(\n    indexes_metadata: &[IndexMetadata],\n    index_id_patterns: &[String],\n) -> crate::Result<()> {\n    let mut index_ids: HashSet<&str> = index_id_patterns\n        .iter()\n        .filter(|pattern| !pattern.contains('*') && !pattern.starts_with('-'))\n        .map(|pattern| pattern.as_str())\n        .collect();\n\n    if index_ids.is_empty() {\n        // All the patterns are wildcard or negative patterns.\n        return Ok(());\n    }\n    for index_metadata in indexes_metadata {\n        index_ids.remove(index_metadata.index_id());\n    }\n    if index_ids.is_empty() {\n        return Ok(());\n    }\n    let not_found_index_ids = index_ids\n        .into_iter()\n        .map(|index_id| index_id.to_string())\n        .collect();\n\n    Err(SearchError::IndexesNotFound {\n        index_ids: not_found_index_ids,\n    })\n}\n\nasync fn refine_and_list_matches(\n    metastore: &mut MetastoreServiceClient,\n    search_request: &mut SearchRequest,\n    indexes_metadata: Vec<IndexMetadata>,\n    query_ast_resolved: QueryAst,\n    sort_fields_is_datetime: HashMap<String, bool>,\n 
   timestamp_field_opt: Option<String>,\n) -> crate::Result<Vec<SplitMetadata>> {\n    let index_uids = indexes_metadata\n        .iter()\n        .map(|index_metadata| index_metadata.index_uid.clone())\n        .collect_vec();\n    search_request.query_ast = serde_json::to_string(&query_ast_resolved)?;\n\n    // convert search_after datetime values from input datetime format to nanos.\n    convert_search_after_datetime_values(search_request, &sort_fields_is_datetime)?;\n\n    // update_search_after_datetime_in_nanos(&mut search_request)?;\n    if let Some(timestamp_field) = &timestamp_field_opt {\n        refine_start_end_timestamp_from_ast(\n            &query_ast_resolved,\n            timestamp_field,\n            &mut search_request.start_timestamp,\n            &mut search_request.end_timestamp,\n        );\n    }\n    let tag_filter_ast = extract_tags_from_query(query_ast_resolved);\n\n    // TODO if search after is set, we sort by timestamp and we don't want to count all results,\n    // we can refine more here. 
Same if we sort by _shard_doc\n    let split_metadatas: Vec<SplitMetadata> = list_relevant_splits(\n        index_uids,\n        search_request.start_timestamp,\n        search_request.end_timestamp,\n        tag_filter_ast,\n        metastore,\n    )\n    .await?;\n    Ok(split_metadatas)\n}\n\n/// Fetches the list of splits and their metadata from the metastore\nasync fn plan_splits_for_root_search(\n    search_request: &mut SearchRequest,\n    metastore: &mut MetastoreServiceClient,\n) -> crate::Result<(Vec<SplitMetadata>, IndexesMetasForLeafSearch)> {\n    let list_indexes_metadatas_request = ListIndexesMetadataRequest {\n        index_id_patterns: search_request.index_id_patterns.clone(),\n    };\n    let indexes_metadata: Vec<IndexMetadata> = metastore\n        .list_indexes_metadata(list_indexes_metadatas_request)\n        .await?\n        .deserialize_indexes_metadata()\n        .await?;\n\n    if !search_request.ignore_missing_indexes {\n        ensure_all_indexes_found(&indexes_metadata[..], &search_request.index_id_patterns[..])?;\n    }\n\n    if indexes_metadata.is_empty() {\n        return Ok((Vec::new(), HashMap::default()));\n    }\n\n    let request_metadata = validate_request_and_build_metadata(&indexes_metadata, search_request)?;\n    let split_metadatas = refine_and_list_matches(\n        metastore,\n        search_request,\n        indexes_metadata,\n        request_metadata.query_ast_resolved,\n        request_metadata.sort_fields_is_datetime,\n        request_metadata.timestamp_field_opt,\n    )\n    .await?;\n    Ok((\n        split_metadatas,\n        request_metadata.indexes_meta_for_leaf_search,\n    ))\n}\n\n/// Performs a distributed search.\n/// 1. Sends leaf requests over gRPC to multiple leaf nodes.\n/// 2. Merges the search results.\n/// 3. Sends fetch docs requests to multiple leaf nodes.\n/// 4. 
Builds the response with docs and returns.\n#[instrument(skip_all)]\npub async fn root_search(\n    searcher_context: &SearcherContext,\n    mut search_request: SearchRequest,\n    mut metastore: MetastoreServiceClient,\n    cluster_client: &ClusterClient,\n) -> crate::Result<SearchResponse> {\n    let start_instant = Instant::now();\n\n    let (split_metadatas, indexes_meta_for_leaf_search) = RootSearchMetricsFuture {\n        start: start_instant,\n        tracked: plan_splits_for_root_search(&mut search_request, &mut metastore),\n        is_success: None,\n        step: RootSearchMetricsStep::Plan,\n    }\n    .await?;\n\n    let num_docs: usize = split_metadatas.iter().map(|split| split.num_docs).sum();\n    let num_splits = split_metadatas.len();\n\n    // It would have been nice to add those in the context of the trace span,\n    // but with our current logging setting, it makes logs too verbose.\n    info!(\n        query_ast = search_request.query_ast.as_str(),\n        agg = search_request.aggregation_request(),\n        start_ts = ?(search_request.start_timestamp()..search_request.end_timestamp()),\n        count_required = search_request.count_hits().as_str_name(),\n        num_docs = num_docs,\n        num_splits = num_splits,\n        \"root_search\"\n    );\n\n    if let Some(max_total_split_searches) = searcher_context.searcher_config.max_splits_per_search\n        && max_total_split_searches < num_splits\n    {\n        error!(\n            num_splits,\n            max_total_split_searches,\n            index=?search_request.index_id_patterns,\n            query=%search_request.query_ast,\n            \"max total splits exceeded\"\n        );\n        return Err(SearchError::InvalidArgument(format!(\n            \"Number of targeted splits {num_splits} exceeds the limit {max_total_split_searches}\"\n        )));\n    }\n\n    let mut search_response_result = RootSearchMetricsFuture {\n        start: start_instant,\n        tracked: 
root_search_aux(\n            searcher_context,\n            &indexes_meta_for_leaf_search,\n            search_request,\n            split_metadatas,\n            cluster_client,\n        ),\n        is_success: None,\n        step: RootSearchMetricsStep::Exec {\n            num_targeted_splits: num_splits,\n        },\n    }\n    .await;\n\n    if let Ok(search_response) = &mut search_response_result {\n        search_response.elapsed_time_micros = start_instant.elapsed().as_micros() as u64;\n    }\n\n    search_response_result\n}\n\n/// Returns details on how a query would be executed\npub async fn search_plan(\n    mut search_request: SearchRequest,\n    mut metastore: MetastoreServiceClient,\n) -> crate::Result<SearchPlanResponse> {\n    let list_indexes_metadatas_request = ListIndexesMetadataRequest {\n        index_id_patterns: search_request.index_id_patterns.clone(),\n    };\n    let indexes_metadata: Vec<IndexMetadata> = metastore\n        .list_indexes_metadata(list_indexes_metadatas_request)\n        .await?\n        .deserialize_indexes_metadata()\n        .await?;\n\n    if !search_request.ignore_missing_indexes {\n        ensure_all_indexes_found(&indexes_metadata[..], &search_request.index_id_patterns[..])?;\n    }\n    if indexes_metadata.is_empty() {\n        return Ok(SearchPlanResponse {\n            result: serde_json::to_string(&SearchPlanResponseRest {\n                quickwit_ast: QueryAst::MatchAll,\n                tantivy_ast: String::new(),\n                searched_splits: Vec::new(),\n                storage_requests: StorageRequestCount::default(),\n            })?,\n        });\n    }\n    let doc_mapper = build_doc_mapper(\n        &indexes_metadata[0].index_config.doc_mapping,\n        &indexes_metadata[0].index_config.search_settings,\n    )\n    .map_err(|err| SearchError::Internal(format!(\"failed to build doc mapper. 
cause: {err}\")))?;\n\n    let request_metadata = validate_request_and_build_metadata(&indexes_metadata, &search_request)?;\n    let split_metadatas = refine_and_list_matches(\n        &mut metastore,\n        &mut search_request,\n        indexes_metadata,\n        request_metadata.query_ast_resolved.clone(),\n        request_metadata.sort_fields_is_datetime,\n        request_metadata.timestamp_field_opt,\n    )\n    .await?;\n\n    let (query, mut warmup_info) = doc_mapper.query(\n        doc_mapper.schema(),\n        request_metadata.query_ast_resolved.clone(),\n        true,\n        None,\n    )?;\n    let merge_collector = make_merge_collector(&search_request, Default::default())?;\n    warmup_info.merge(merge_collector.warmup_info());\n    warmup_info.simplify();\n\n    let split_ids = split_metadatas\n        .into_iter()\n        .map(|split| format!(\"{}/{}\", split.index_uid.index_id, split.split_id))\n        .collect();\n    // this is an upper bound, we'd need access to a hotdir for more precise results\n    let fieldnorm_query_count = if warmup_info.field_norms {\n        doc_mapper\n            .schema()\n            .fields()\n            .filter(|(_, entry)| entry.has_fieldnorms())\n            .count()\n    } else {\n        0\n    };\n    let sstable_query_count = warmup_info.term_dict_fields.len()\n        + warmup_info\n            .terms_grouped_by_field\n            .values()\n            .map(|terms: &HashMap<tantivy::Term, bool>| terms.len())\n            .sum::<usize>()\n        + warmup_info\n            .term_ranges_grouped_by_field\n            .values()\n            .map(|terms: &HashMap<_, bool>| terms.len())\n            .sum::<usize>();\n    let position_query_count = warmup_info\n        .terms_grouped_by_field\n        .values()\n        .map(|terms: &HashMap<tantivy::Term, bool>| {\n            terms\n                .values()\n                .filter(|load_position| **load_position)\n                .count()\n        })\n       
 .sum::<usize>()\n        + warmup_info\n            .term_ranges_grouped_by_field\n            .values()\n            .map(|terms: &HashMap<_, bool>| {\n                terms\n                    .values()\n                    .filter(|load_position| **load_position)\n                    .count()\n            })\n            .sum::<usize>();\n    Ok(SearchPlanResponse {\n        result: serde_json::to_string(&SearchPlanResponseRest {\n            quickwit_ast: request_metadata.query_ast_resolved,\n            tantivy_ast: format!(\"{query:#?}\"),\n            searched_splits: split_ids,\n            storage_requests: StorageRequestCount {\n                footer: 1,\n                fastfield: warmup_info.fast_fields.len(),\n                fieldnorm: fieldnorm_query_count,\n                sstable: sstable_query_count,\n                posting: sstable_query_count,\n                position: position_query_count,\n            },\n        })?,\n    })\n}\n\n/// Converts search after with datetime format to nanoseconds (representation in tantivy).\n/// If the sort field is a datetime field and no datetime format is set, the default format is\n/// milliseconds.\n/// `sort_fields_is_datetime` maps a sort field name to whether that field is a datetime field;\n/// sort fields missing from the map are treated as non-datetime.\nfn convert_search_after_datetime_values(\n    search_request: &mut SearchRequest,\n    sort_fields_is_datetime: &HashMap<String, bool>,\n) -> crate::Result<()> {\n    for sort_field in search_request.sort_fields.iter_mut() {\n        if *sort_fields_is_datetime\n            .get(&sort_field.field_name)\n            .unwrap_or(&false)\n            && sort_field.sort_datetime_format.is_none()\n        {\n            sort_field.sort_datetime_format = Some(SortDatetimeFormat::UnixTimestampMillis as i32);\n        }\n    }\n    if let Some(partial_hit) = search_request.search_after.as_mut() {\n        let search_after_values = [\n            partial_hit.sort_value.as_mut(),\n            
partial_hit.sort_value2.as_mut(),\n        ];\n        for (sort_field, search_after_value_opt) in\n            search_request.sort_fields.iter().zip(search_after_values)\n        {\n            let Some(search_after_sort_by_value) = search_after_value_opt else {\n                continue;\n            };\n            let Some(search_after_sort_value) = search_after_sort_by_value.sort_value.as_mut()\n            else {\n                continue;\n            };\n            let Some(datetime_format_int) = sort_field.sort_datetime_format else {\n                continue;\n            };\n            let input_datetime_format = SortDatetimeFormat::try_from(datetime_format_int)\n                .context(\"invalid sort datetime format\")?;\n            convert_sort_datetime_value_into_nanos(search_after_sort_value, input_datetime_format)?;\n        }\n    }\n    Ok(())\n}\n\n/// Convert sort values from input datetime format into nanoseconds.\n/// The conversion is done only for U64 and I64 sort values, an error is returned for other types.\nfn convert_sort_datetime_value_into_nanos(\n    sort_value: &mut SortValue,\n    input_format: SortDatetimeFormat,\n) -> crate::Result<()> {\n    match sort_value {\n        SortValue::U64(value) => match input_format {\n            SortDatetimeFormat::UnixTimestampMillis => {\n                *value = value.checked_mul(1_000_000).ok_or_else(|| {\n                    SearchError::Internal(format!(\n                        \"sort value defined in milliseconds is too large and cannot be converted \\\n                         into nanoseconds: {value}\"\n                    ))\n                })?;\n            }\n            SortDatetimeFormat::UnixTimestampNanos => {\n                // Nothing to do as the internal format is nanos.\n            }\n        },\n        SortValue::I64(value) => match input_format {\n            SortDatetimeFormat::UnixTimestampMillis => {\n                *value = 
value.checked_mul(1_000_000).ok_or_else(|| {\n                    SearchError::Internal(format!(\n                        \"sort value defined in milliseconds is too large and cannot be converted \\\n                         into nanoseconds: {value}\"\n                    ))\n                })?;\n            }\n            SortDatetimeFormat::UnixTimestampNanos => {\n                // Nothing to do as the internal format is nanos.\n            }\n        },\n        _ => {\n            return Err(SearchError::Internal(format!(\n                \"datetime conversion are only support for u64 and i64 sort values, not \\\n                 `{sort_value:?}`\"\n            )));\n        }\n    }\n    Ok(())\n}\n\n/// Convert sort values from nanoseconds to the requested output format.\n/// The conversion is done only for U64 and I64 sort values, an error is returned for other types.\nfn convert_sort_datetime_value(\n    sort_value: &mut SortValue,\n    output_format: SortDatetimeFormat,\n) -> crate::Result<()> {\n    match sort_value {\n        SortValue::U64(value) => match output_format {\n            SortDatetimeFormat::UnixTimestampMillis => {\n                *value /= 1_000_000;\n            }\n            SortDatetimeFormat::UnixTimestampNanos => {\n                // Nothing to do as the internal format is in nanos.\n            }\n        },\n        SortValue::I64(value) => match output_format {\n            SortDatetimeFormat::UnixTimestampMillis => {\n                *value /= 1_000_000;\n            }\n            SortDatetimeFormat::UnixTimestampNanos => {\n                // Nothing to do as the internal format is in nanos.\n            }\n        },\n        _ => {\n            return Err(SearchError::Internal(format!(\n                \"datetime conversion are only support for u64 and i64 sort values, not \\\n                 `{sort_value:?}`\"\n            )));\n        }\n    }\n    Ok(())\n}\n\npub(crate) fn refine_start_end_timestamp_from_ast(\n    
query_ast: &QueryAst,\n    timestamp_field: &str,\n    start_timestamp: &mut Option<i64>,\n    end_timestamp: &mut Option<i64>,\n) {\n    let mut timestamp_range_extractor = ExtractTimestampRange {\n        timestamp_field,\n        start_timestamp: *start_timestamp,\n        end_timestamp: *end_timestamp,\n    };\n    timestamp_range_extractor\n        .visit(query_ast)\n        .expect(\"can't fail unwrapping Infallible\");\n    *start_timestamp = timestamp_range_extractor.start_timestamp;\n    *end_timestamp = timestamp_range_extractor.end_timestamp;\n}\n\n/// Boundaries identified as being implied by the QueryAst.\n///\n/// `start_timestamp` is to be interpreted as Inclusive (or Unbounded)\n/// `end_timestamp` is to be interpreted as Exclusive (or Unbounded)\n/// In other words, this is a `[start_timestamp..end_timestamp)` interval.\nstruct ExtractTimestampRange<'a> {\n    timestamp_field: &'a str,\n    start_timestamp: Option<i64>,\n    end_timestamp: Option<i64>,\n}\n\nimpl ExtractTimestampRange<'_> {\n    fn update_start_timestamp(\n        &mut self,\n        lower_bound: &quickwit_query::JsonLiteral,\n        included: bool,\n    ) {\n        use quickwit_query::InterpretUserInput;\n        let Some(lower_bound) = tantivy::DateTime::interpret_json(lower_bound) else {\n            return;\n        };\n        let mut lower_bound = lower_bound.into_timestamp_secs();\n        if !included {\n            // TODO saturating isn't exactly right, we should replace the RangeQuery with\n            // a match_none, but the visitor doesn't allow mutation.\n            lower_bound = lower_bound.saturating_add(1);\n        }\n\n        self.start_timestamp = self.start_timestamp.max(Some(lower_bound));\n    }\n\n    fn update_end_timestamp(&mut self, upper_bound: &quickwit_query::JsonLiteral, included: bool) {\n        use quickwit_query::InterpretUserInput;\n        let Some(upper_bound_timestamp) = tantivy::DateTime::interpret_json(upper_bound) else {\n            
return;\n        };\n        let mut upper_bound = upper_bound_timestamp.into_timestamp_secs();\n        let round_up = (upper_bound_timestamp.into_timestamp_nanos() % 1_000_000_000) != 0;\n        if included || round_up {\n            // TODO saturating isn't exactly right, we should replace the RangeQuery with\n            // a match_none, but the visitor doesn't allow mutation.\n            upper_bound = upper_bound.saturating_add(1);\n        }\n\n        let new_end_timestamp = self.end_timestamp.unwrap_or(upper_bound).min(upper_bound);\n        self.end_timestamp = Some(new_end_timestamp);\n    }\n}\n\nimpl<'b> QueryAstVisitor<'b> for ExtractTimestampRange<'_> {\n    type Err = std::convert::Infallible;\n\n    fn visit_bool(&mut self, bool_query: &'b BoolQuery) -> Result<(), Self::Err> {\n        // we only want to visit sub-queries which are strict (positive) requirements\n        for ast in bool_query.must.iter().chain(bool_query.filter.iter()) {\n            self.visit(ast)?;\n        }\n        Ok(())\n    }\n\n    fn visit_range(&mut self, range_query: &'b RangeQuery) -> Result<(), Self::Err> {\n        use std::ops::Bound;\n\n        if range_query.field == self.timestamp_field {\n            match &range_query.lower_bound {\n                Bound::Included(lower_bound) => self.update_start_timestamp(lower_bound, true),\n                Bound::Excluded(lower_bound) => self.update_start_timestamp(lower_bound, false),\n                Bound::Unbounded => (),\n            }\n            match &range_query.upper_bound {\n                Bound::Included(upper_bound) => self.update_end_timestamp(upper_bound, true),\n                Bound::Excluded(upper_bound) => self.update_end_timestamp(upper_bound, false),\n                Bound::Unbounded => (),\n            }\n        }\n        Ok(())\n    }\n\n    // if we visit a term, limit the range to DATE..=DATE\n    fn visit_term(&mut self, term_query: &'b TermQuery) -> Result<(), Self::Err> {\n        if 
term_query.field == self.timestamp_field {\n            // TODO when fixing #3323, this may need to be modified to support numbers too\n            let json_term = quickwit_query::JsonLiteral::String(term_query.value.clone());\n            self.update_start_timestamp(&json_term, true);\n            self.update_end_timestamp(&json_term, true);\n        }\n        Ok(())\n    }\n\n    // if we visit a termset, limit the range to LOWEST..=HIGHEST\n    fn visit_term_set(&mut self, term_query: &'b TermSetQuery) -> Result<(), Self::Err> {\n        if let Some(term_set) = term_query.terms_per_field.get(self.timestamp_field) {\n            // rfc3339 is lexicographically ordered if YEAR <= 9999, so we can use string\n            // ordering to get the start and end quickly.\n            if let Some(first) = term_set.first() {\n                let json_term = quickwit_query::JsonLiteral::String(first.clone());\n                self.update_start_timestamp(&json_term, true);\n            }\n            if let Some(last) = term_set.last() {\n                let json_term = quickwit_query::JsonLiteral::String(last.clone());\n                self.update_end_timestamp(&json_term, true);\n            }\n        }\n        Ok(())\n    }\n}\n\nasync fn assign_client_fetch_docs_jobs(\n    partial_hits: &[PartialHit],\n    split_metadatas: &[SplitMetadata],\n    client_pool: &SearchJobPlacer,\n) -> crate::Result<impl Iterator<Item = (SearchServiceClient, Vec<FetchDocsJob>)>> {\n    let index_uids_and_split_offsets_map: HashMap<String, (IndexUid, SplitIdAndFooterOffsets)> =\n        split_metadatas\n            .iter()\n            .map(|metadata| {\n                (\n                    metadata.split_id().to_string(),\n                    (\n                        metadata.index_uid.clone(),\n                        extract_split_and_footer_offsets(metadata),\n                    ),\n                )\n            })\n            .collect();\n\n    // Group the partial hits per 
split\n    let mut partial_hits_map: HashMap<String, Vec<PartialHit>> = HashMap::new();\n    for partial_hit in partial_hits.iter() {\n        partial_hits_map\n            .entry(partial_hit.split_id.clone())\n            .or_default()\n            .push(partial_hit.clone());\n    }\n\n    let mut fetch_docs_req_jobs: Vec<FetchDocsJob> = Vec::new();\n    for (split_id, partial_hits) in partial_hits_map {\n        let (index_uid, offsets) = index_uids_and_split_offsets_map\n            .get(&split_id)\n            .ok_or_else(|| {\n                crate::SearchError::Internal(format!(\n                    \"received partial hit from an unknown split {split_id}\"\n                ))\n            })?\n            .clone();\n        let fetch_docs_job = FetchDocsJob {\n            index_uid: index_uid.clone(),\n            offsets,\n            partial_hits,\n        };\n        fetch_docs_req_jobs.push(fetch_docs_job);\n    }\n\n    let assigned_jobs = client_pool\n        .assign_jobs(fetch_docs_req_jobs, &HashSet::new())\n        .await?;\n\n    Ok(assigned_jobs)\n}\n\n// Measure the cost associated to searching in a given split metadata.\nfn compute_split_cost(split_metadata: &SplitMetadata) -> usize {\n    // TODO this formula could be tuned a lot more. 
The general idea is that there is a fixed\n    // cost to searching a split, plus a somewhat-linear cost depending on the size of the split\n    5 + split_metadata.num_docs / 100_000\n}\n\n/// Builds a LeafSearchRequest to one node, from a list of [`SearchJob`].\npub fn jobs_to_leaf_request(\n    request: &SearchRequest,\n    search_indexes_metadatas: &IndexesMetasForLeafSearch,\n    jobs: Vec<SearchJob>,\n) -> crate::Result<LeafSearchRequest> {\n    let mut search_request_for_leaf = request.clone();\n    search_request_for_leaf.start_offset = 0;\n    search_request_for_leaf.max_hits += request.start_offset;\n    search_request_for_leaf.index_id_patterns = Vec::new();\n\n    let mut leaf_search_request = LeafSearchRequest {\n        search_request: Some(search_request_for_leaf),\n        leaf_requests: Vec::new(),\n        doc_mappers: Vec::new(),\n        index_uris: Vec::new(),\n    };\n\n    let mut added_doc_mappers: HashMap<&str, u32> = HashMap::new();\n    // Group jobs by index uid, as the split offsets are relative to the index.\n    group_jobs_by_index_id(jobs, |job_group| {\n        let index_uid = &job_group[0].index_uid;\n        leaf_search_request\n            .search_request\n            .as_mut()\n            .unwrap()\n            .index_id_patterns\n            .push(index_uid.index_id.to_string());\n        let search_index_meta = search_indexes_metadatas.get(index_uid).ok_or_else(|| {\n            SearchError::Internal(format!(\n                \"received job for an unknown index {index_uid}. 
it should never happen\"\n            ))\n        })?;\n        let doc_mapper_ord = *added_doc_mappers\n            .entry(&search_index_meta.doc_mapper_str)\n            .or_insert_with(|| {\n                let ord = leaf_search_request.doc_mappers.len();\n                leaf_search_request\n                    .doc_mappers\n                    .push(search_index_meta.doc_mapper_str.to_string());\n                ord as u32\n            });\n        let index_uri_ord = leaf_search_request.index_uris.len() as u32;\n        leaf_search_request\n            .index_uris\n            .push(search_index_meta.index_uri.to_string());\n\n        let leaf_search_request_ref = LeafRequestRef {\n            split_offsets: job_group.into_iter().map(|job| job.offsets).collect(),\n            doc_mapper_ord,\n            index_uri_ord,\n        };\n        leaf_search_request\n            .leaf_requests\n            .push(leaf_search_request_ref);\n        Ok(())\n    })?;\n    Ok(leaf_search_request)\n}\n\n/// Builds a list of [`FetchDocsRequest`], one per index, from a list of [`FetchDocsJob`].\npub fn jobs_to_fetch_docs_requests(\n    snippet_request_opt: Option<SnippetRequest>,\n    indexes_metas_for_leaf_search: &IndexesMetasForLeafSearch,\n    jobs: Vec<FetchDocsJob>,\n) -> crate::Result<Vec<FetchDocsRequest>> {\n    let mut fetch_docs_requests = Vec::new();\n    // Group jobs by index uid.\n    group_by(\n        jobs,\n        |job| &job.index_uid,\n        |fetch_docs_jobs| {\n            let index_uid = &fetch_docs_jobs[0].index_uid;\n\n            let index_meta = indexes_metas_for_leaf_search\n                .get(index_uid)\n                .ok_or_else(|| {\n                    SearchError::Internal(format!(\n                        \"received search job for an unknown index {index_uid}\"\n                    ))\n                })?;\n            let partial_hits: Vec<PartialHit> = fetch_docs_jobs\n                .iter()\n                
.flat_map(|fetch_doc_job| fetch_doc_job.partial_hits.iter().cloned())\n                .collect();\n            let split_offsets: Vec<SplitIdAndFooterOffsets> = fetch_docs_jobs\n                .into_iter()\n                .map(|fetch_doc_job| fetch_doc_job.into())\n                .collect();\n            let fetch_docs_req = FetchDocsRequest {\n                partial_hits,\n                split_offsets,\n                index_uri: index_meta.index_uri.to_string(),\n                snippet_request: snippet_request_opt.clone(),\n                doc_mapper: index_meta.doc_mapper_str.clone(),\n            };\n            fetch_docs_requests.push(fetch_docs_req);\n\n            Ok(())\n        },\n    )?;\n    Ok(fetch_docs_requests)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::Range;\n    use std::str::FromStr;\n    use std::sync::{Arc, RwLock};\n\n    use quickwit_common::ServiceStream;\n    use quickwit_common::shared_consts::SCROLL_BATCH_LEN;\n    use quickwit_config::{\n        DocMapping, IndexConfig, IndexingSettings, IngestSettings, SearchSettings,\n    };\n    use quickwit_indexing::MockSplitBuilder;\n    use quickwit_metastore::{IndexMetadata, ListSplitsRequestExt, ListSplitsResponseExt};\n    use quickwit_proto::metastore::{\n        ListIndexesMetadataResponse, ListSplitsResponse, MockMetastoreService,\n    };\n    use quickwit_proto::search::{\n        ScrollRequest, SortByValue, SortOrder, SortValue, SplitSearchError,\n    };\n    use quickwit_query::query_ast::{qast_helper, qast_json_helper, query_ast_from_user_text};\n    use tantivy::schema::{FAST, STORED, TEXT};\n\n    use super::*;\n    use crate::{MockSearchService, searcher_pool_for_test};\n\n    #[track_caller]\n    fn check_snippet_fields_validation(snippet_fields: &[String]) -> anyhow::Result<()> {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"title\", TEXT);\n        schema_builder.add_text_field(\"desc\", TEXT | STORED);\n        
schema_builder.add_ip_addr_field(\"ip\", FAST | STORED);\n        let schema = schema_builder.build();\n        validate_requested_snippet_fields(&schema, snippet_fields)\n    }\n\n    #[test]\n    fn test_validate_requested_snippet_fields() {\n        check_snippet_fields_validation(&[\"desc\".to_string()]).unwrap();\n        let field_not_stored_err =\n            check_snippet_fields_validation(&[\"title\".to_string()]).unwrap_err();\n        assert_eq!(\n            field_not_stored_err.to_string(),\n            \"the snippet field `title` must be stored\"\n        );\n        let field_doesnotexist_err =\n            check_snippet_fields_validation(&[\"doesnotexist\".to_string()]).unwrap_err();\n        assert_eq!(\n            field_doesnotexist_err.to_string(),\n            \"The field does not exist: 'doesnotexist'\"\n        );\n        let field_is_not_text_err =\n            check_snippet_fields_validation(&[\"ip\".to_string()]).unwrap_err();\n        assert_eq!(\n            field_is_not_text_err.to_string(),\n            \"the snippet field `ip` must be of type `Str`, got `IpAddr`\"\n        );\n    }\n\n    #[test]\n    fn test_get_sort_by_field_entry() {\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_text_field(\"title\", TEXT);\n        schema_builder.add_text_field(\"desc\", TEXT | STORED);\n        schema_builder.add_u64_field(\"timestamp\", FAST | STORED);\n        let schema = schema_builder.build();\n        get_sort_by_field_entry(\"timestamp\", &schema)\n            .unwrap()\n            .unwrap();\n        let sort_by_field_entry_err = get_sort_by_field_entry(\"doesnotexist\", &schema).unwrap_err();\n        assert_eq!(\n            sort_by_field_entry_err.to_string(),\n            \"Invalid argument: unknown field used in `sort by`: doesnotexist\"\n        );\n        for sort_field_name in &[\"_doc\", \"_score\", \"_shard_doc\"] {\n            assert!(\n                
get_sort_by_field_entry(sort_field_name, &schema)\n                    .unwrap()\n                    .is_none()\n            );\n        }\n    }\n\n    fn index_metadata_for_multi_indexes_test(index_id: &str, index_uri: &str) -> IndexMetadata {\n        let index_uri = Uri::from_str(index_uri).unwrap();\n        let doc_mapping_json = r#\"{\n            \"mode\": \"lenient\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"body\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                }\n            ],\n            \"timestamp_field\": \"timestamp\",\n            \"store_source\": true\n        }\"#;\n        let doc_mapping = serde_json::from_str(doc_mapping_json).unwrap();\n        let indexing_settings = IndexingSettings::default();\n        let ingest_settings = IngestSettings::default();\n        let search_settings = SearchSettings {\n            default_search_fields: vec![\"body\".to_string()],\n        };\n        IndexMetadata::new(IndexConfig {\n            index_id: index_id.to_string(),\n            index_uri,\n            doc_mapping,\n            indexing_settings,\n            ingest_settings,\n            search_settings,\n            retention_policy_opt: None,\n        })\n    }\n\n    #[test]\n    fn test_validate_request_and_build_metadatas_ok() {\n        let request_query_ast = qast_helper(\"body:test\", &[]);\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: serde_json::to_string(&request_query_ast).unwrap(),\n            max_hits: 10,\n            start_offset: 10,\n            sort_fields: vec![\n                SortField {\n                    field_name: \"timestamp\".to_string(),\n     
               sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n                },\n                SortField {\n                    field_name: \"_doc\".to_string(),\n                    sort_order: SortOrder::Asc as i32,\n                    sort_datetime_format: None,\n                },\n            ],\n            ..Default::default()\n        };\n        let index_metadata = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let index_metadata_with_other_config =\n            index_metadata_for_multi_indexes_test(\"test-index-2\", \"ram:///test-index-2\");\n        let mut index_metadata_no_timestamp =\n            IndexMetadata::for_test(\"test-index-3\", \"ram:///test-index-3\");\n        index_metadata_no_timestamp\n            .index_config\n            .doc_mapping\n            .timestamp_field = None;\n        let request_metadata = validate_request_and_build_metadata(\n            &[\n                index_metadata,\n                index_metadata_with_other_config,\n                index_metadata_no_timestamp,\n            ],\n            &search_request,\n        )\n        .unwrap();\n        assert_eq!(\n            request_metadata.timestamp_field_opt,\n            Some(\"timestamp\".to_string())\n        );\n        assert_eq!(request_metadata.query_ast_resolved, request_query_ast);\n        assert_eq!(request_metadata.indexes_meta_for_leaf_search.len(), 3);\n        assert_eq!(request_metadata.sort_fields_is_datetime.len(), 2);\n        assert_eq!(\n            request_metadata.sort_fields_is_datetime.get(\"timestamp\"),\n            Some(&true)\n        );\n        assert_eq!(\n            request_metadata.sort_fields_is_datetime.get(\"_doc\"),\n            Some(&false)\n        );\n    }\n\n    #[test]\n    fn test_validate_request_and_build_metadatas_fail_with_different_timestamps() {\n        let search_request = 
quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            start_offset: 10,\n            ..Default::default()\n        };\n        let index_metadata_1 = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let mut index_metadata_2 = IndexMetadata::for_test(\"test-index-2\", \"ram:///test-index-2\");\n        let doc_mapping_json_2 = r#\"{\n            \"mode\": \"lenient\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp-2\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"body\",\n                    \"type\": \"text\"\n                }\n            ],\n            \"timestamp_field\": \"timestamp-2\",\n            \"store_source\": true\n        }\"#;\n        let doc_mapping_2: DocMapping = serde_json::from_str(doc_mapping_json_2).unwrap();\n        index_metadata_2.index_config.doc_mapping = doc_mapping_2;\n        index_metadata_2\n            .index_config\n            .search_settings\n            .default_search_fields = Vec::new();\n        let timestamp_field_different = validate_request_and_build_metadata(\n            &[index_metadata_1, index_metadata_2],\n            &search_request,\n        )\n        .unwrap_err();\n        assert_eq!(\n            timestamp_field_different.to_string(),\n            \"the timestamp field (if present) must be the same for all indexes\"\n        );\n    }\n\n    #[test]\n    fn test_validate_request_and_build_metadatas_fail_with_different_resolved_qast() {\n        let qast = query_ast_from_user_text(\"test\", None);\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: 
serde_json::to_string(&qast).unwrap(),\n            max_hits: 10,\n            start_offset: 10,\n            ..Default::default()\n        };\n        let index_metadata_1 = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let mut index_metadata_2 = IndexMetadata::for_test(\"test-index-2\", \"ram:///test-index-2\");\n        index_metadata_2\n            .index_config\n            .search_settings\n            .default_search_fields = vec![\"owner\".to_string()];\n        let timestamp_field_different = validate_request_and_build_metadata(\n            &[index_metadata_1, index_metadata_2],\n            &search_request,\n        )\n        .unwrap_err();\n        assert_eq!(\n            timestamp_field_different.to_string(),\n            \"resolved query ASTs must be the same across indexes. resolving queries with \\\n             different default fields are different between indexes is not supported\"\n        );\n    }\n\n    fn index_metadata_for_multi_indexes_test_with_incompatible_sort_type(\n        index_id: &str,\n        index_uri: &str,\n    ) -> IndexMetadata {\n        let index_uri = Uri::from_str(index_uri).unwrap();\n        let doc_mapping_json = r#\"{\n            \"mode\": \"lenient\",\n            \"field_mappings\": [\n                {\n                    \"name\": \"timestamp\",\n                    \"type\": \"datetime\",\n                    \"fast\": true\n                },\n                {\n                    \"name\": \"body\",\n                    \"type\": \"text\",\n                    \"stored\": true\n                },\n                {\n                    \"name\": \"response_date\",\n                    \"type\": \"i64\",\n                    \"stored\": true,\n                    \"fast\": true\n                }\n            ],\n            \"timestamp_field\": \"timestamp\",\n            \"store_source\": true\n        }\"#;\n        let doc_mapping = 
serde_json::from_str(doc_mapping_json).unwrap();\n        let ingest_settings = IngestSettings::default();\n        let indexing_settings = IndexingSettings::default();\n        let search_settings = SearchSettings {\n            default_search_fields: vec![\"body\".to_string()],\n        };\n        IndexMetadata::new(IndexConfig {\n            index_id: index_id.to_string(),\n            index_uri,\n            doc_mapping,\n            ingest_settings,\n            indexing_settings,\n            search_settings,\n            retention_policy_opt: None,\n        })\n    }\n\n    #[test]\n    fn test_validate_request_and_build_metadatas_fail_with_incompatible_sort_field_types() {\n        let request_query_ast = qast_helper(\"body:test\", &[]);\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: serde_json::to_string(&request_query_ast).unwrap(),\n            max_hits: 10,\n            start_offset: 10,\n            sort_fields: vec![SortField {\n                field_name: \"response_date\".to_string(),\n                sort_order: SortOrder::Desc as i32,\n                sort_datetime_format: None,\n            }],\n            ..Default::default()\n        };\n        let index_metadata = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let index_metadata_with_other_config =\n            index_metadata_for_multi_indexes_test_with_incompatible_sort_type(\n                \"test-index-2\",\n                \"ram:///test-index-2\",\n            );\n        let search_error = validate_request_and_build_metadata(\n            &[index_metadata, index_metadata_with_other_config],\n            &search_request,\n        )\n        .unwrap_err();\n        assert_eq!(\n            search_error.to_string(),\n            \"sort datetime field `response_date` must be of type datetime on all indexes\"\n        );\n    }\n\n    #[test]\n    
fn test_convert_sort_datetime_value() {\n        let mut sort_value = SortValue::U64(1617000000000000000);\n        convert_sort_datetime_value(&mut sort_value, SortDatetimeFormat::UnixTimestampMillis)\n            .unwrap();\n        assert_eq!(sort_value, SortValue::U64(1617000000000));\n        let mut sort_value = SortValue::I64(1617000000000000000);\n        convert_sort_datetime_value(&mut sort_value, SortDatetimeFormat::UnixTimestampMillis)\n            .unwrap();\n        assert_eq!(sort_value, SortValue::I64(1617000000000));\n\n        // conversion with float values should fail.\n        let mut sort_value = SortValue::F64(1617000000000000000.0);\n        let error =\n            convert_sort_datetime_value(&mut sort_value, SortDatetimeFormat::UnixTimestampMillis)\n                .unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"internal error: `datetime conversion are only support for u64 and i64 sort values, \\\n             not `F64(1.617e18)``\"\n        );\n    }\n\n    #[test]\n    fn test_convert_sort_datetime_value_into_nanos() {\n        let mut sort_value = SortValue::U64(1617000000000);\n        convert_sort_datetime_value_into_nanos(\n            &mut sort_value,\n            SortDatetimeFormat::UnixTimestampMillis,\n        )\n        .unwrap();\n        assert_eq!(sort_value, SortValue::U64(1617000000000000000));\n        let mut sort_value = SortValue::I64(1617000000000);\n        convert_sort_datetime_value_into_nanos(\n            &mut sort_value,\n            SortDatetimeFormat::UnixTimestampMillis,\n        )\n        .unwrap();\n        assert_eq!(sort_value, SortValue::I64(1617000000000000000));\n\n        // conversion with a too large millisecond value should fail.\n        let mut sort_value = SortValue::I64(1617000000000000);\n        let error = convert_sort_datetime_value_into_nanos(\n            &mut sort_value,\n            SortDatetimeFormat::UnixTimestampMillis,\n        )\n        
.unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"internal error: `sort value defined in milliseconds is too large and cannot be \\\n             converted into nanoseconds: 1617000000000000`\"\n        );\n        // conversion with float values should fail.\n        let mut sort_value = SortValue::F64(1617000000000000.0);\n        let error = convert_sort_datetime_value_into_nanos(\n            &mut sort_value,\n            SortDatetimeFormat::UnixTimestampMillis,\n        )\n        .unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"internal error: `datetime conversion are only support for u64 and i64 sort values, \\\n             not `F64(1617000000000000.0)``\"\n        );\n    }\n\n    #[test]\n    fn test_validate_sort_field_types_with_doc_and_shard_doc() {\n        let sort_fields = vec![\n            SortField {\n                field_name: \"_doc\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n            SortField {\n                field_name: \"_shard_doc\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_date_field(\"timestamp\", FAST);\n        schema_builder.add_u64_field(\"id\", FAST);\n        let schema = schema_builder.build();\n        let mut sort_field_are_datetime = HashMap::new();\n        validate_sort_field_types(&schema, &sort_fields, &mut sort_field_are_datetime).unwrap();\n        assert_eq!(sort_field_are_datetime.get(\"_doc\"), Some(&false));\n        assert_eq!(sort_field_are_datetime.get(\"_shard_doc\"), Some(&false));\n    }\n\n    #[test]\n    fn test_validate_sort_field_types_valid() {\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                
sort_datetime_format: None,\n            },\n            SortField {\n                field_name: \"id\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_date_field(\"timestamp\", FAST);\n        schema_builder.add_u64_field(\"id\", FAST);\n        let schema = schema_builder.build();\n        let mut sort_field_are_datetime = HashMap::new();\n        validate_sort_field_types(&schema, &sort_fields, &mut sort_field_are_datetime).unwrap();\n        assert_eq!(sort_field_are_datetime.get(\"timestamp\"), Some(&true));\n        assert_eq!(sort_field_are_datetime.get(\"id\"), Some(&false));\n    }\n\n    #[test]\n    fn test_validate_sort_field_types_with_inconsistent_datetime_type() {\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n            SortField {\n                field_name: \"id\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let mut schema_builder = Schema::builder();\n        schema_builder.add_date_field(\"timestamp\", FAST);\n        schema_builder.add_u64_field(\"id\", FAST);\n        let schema = schema_builder.build();\n        {\n            let mut sort_field_are_datetime = HashMap::new();\n            sort_field_are_datetime.insert(\"timestamp\".to_string(), false);\n            sort_field_are_datetime.insert(\"id\".to_string(), false);\n            let error =\n                validate_sort_field_types(&schema, &sort_fields, &mut sort_field_are_datetime)\n                    .unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"sort datetime field `timestamp` must be of type datetime on all indexes\"\n            );\n        
}\n        {\n            let mut sort_field_are_datetime = HashMap::new();\n            sort_field_are_datetime.insert(\"id\".to_string(), true);\n            let error =\n                validate_sort_field_types(&schema, &sort_fields, &mut sort_field_are_datetime)\n                    .unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"sort datetime field `id` must be of type datetime on all indexes\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_with_datetime_format_ok() {\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"id\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        validate_sort_by_fields_and_search_after(&sort_fields, &None).unwrap();\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_and_search_after_ok() {\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"id\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let partial_hit = PartialHit {\n            sort_value: Some(SortByValue {\n                sort_value: Some(SortValue::U64(1)),\n            }),\n            sort_value2: Some(SortByValue {\n                sort_value: Some(SortValue::U64(2)),\n            }),\n            split_id: \"\".to_string(),\n            segment_ord: 0,\n            doc_id: 0,\n        };\n        
validate_sort_by_fields_and_search_after(&sort_fields, &Some(partial_hit)).unwrap();\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_and_search_after_ok_with_doc_sort_field() {\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"_doc\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let partial_hit = PartialHit {\n            sort_value: Some(SortByValue {\n                sort_value: Some(SortValue::U64(1)),\n            }),\n            sort_value2: None,\n            split_id: \"split1\".to_string(),\n            segment_ord: 1,\n            doc_id: 1,\n        };\n        validate_sort_by_fields_and_search_after(&sort_fields, &Some(partial_hit)).unwrap();\n    }\n\n    #[test]\n    fn test_validate_sort_by_field_type() {\n        let mut schema_builder = Schema::builder();\n        let timestamp_field = schema_builder.add_date_field(\"timestamp\", FAST);\n        let id_field = schema_builder.add_u64_field(\"id\", FAST);\n        let no_fast_field = schema_builder.add_u64_field(\"no_fast\", STORED);\n        let text_field = schema_builder.add_text_field(\"text\", STORED);\n        let schema = schema_builder.build();\n        {\n            let sort_by_field_entry = schema.get_field_entry(timestamp_field);\n            validate_sort_by_field_type(sort_by_field_entry, false).unwrap();\n            validate_sort_by_field_type(sort_by_field_entry, true).unwrap();\n        }\n        {\n            let sort_by_field_entry = schema.get_field_entry(id_field);\n            validate_sort_by_field_type(sort_by_field_entry, false).unwrap();\n            let error = validate_sort_by_field_type(sort_by_field_entry, 
true).unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"Invalid argument: sort by field with a timestamp format must be a datetime field \\\n                 and the field `id` is not\"\n            );\n        }\n        {\n            let sort_by_field_entry = schema.get_field_entry(no_fast_field);\n            let error = validate_sort_by_field_type(sort_by_field_entry, true).unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"Invalid argument: sort by field must be a fast field, please add the fast \\\n                 property to your field `no_fast`\"\n            );\n        }\n        {\n            let sort_by_field_entry = schema.get_field_entry(text_field);\n            let error = validate_sort_by_field_type(sort_by_field_entry, true).unwrap_err();\n            assert_eq!(\n                error.to_string(),\n                \"Invalid argument: sort by field on type text is currently not supported `text`\"\n            );\n        }\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_and_search_after_invalid_1() {\n        // 2 sort fields + search after with only one sort value is invalid.\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"id\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let partial_hit = PartialHit {\n            sort_value: Some(SortByValue {\n                sort_value: Some(SortValue::U64(1)),\n            }),\n            sort_value2: None,\n            split_id: \"split1\".to_string(),\n            segment_ord: 1,\n            doc_id: 1,\n        };\n        let error =\n            
validate_sort_by_fields_and_search_after(&sort_fields, &Some(partial_hit)).unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"Invalid argument: `search_after` must have the same number of sort values as sort by \\\n             fields [\\\"timestamp\\\", \\\"id\\\"]\"\n        );\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_and_search_after_invalid_with_missing_split_id() {\n        // `_doc` sort field + search after with an empty split ID is invalid.\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"_doc\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let partial_hit = PartialHit {\n            sort_value: Some(SortByValue {\n                sort_value: Some(SortValue::U64(1)),\n            }),\n            sort_value2: None,\n            split_id: \"\".to_string(),\n            segment_ord: 1,\n            doc_id: 1,\n        };\n        let error =\n            validate_sort_by_fields_and_search_after(&sort_fields, &Some(partial_hit)).unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"Invalid argument: search_after with a sort field `_doc` must define a split ID, \\\n             segment ID and doc ID values\"\n        );\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_and_search_valid_1() {\n        // 2 sort fields + search after with only one sort value is invalid.\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField 
{\n                field_name: \"id\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: None,\n            },\n        ];\n        let partial_hit = PartialHit {\n            sort_value: Some(SortByValue {\n                sort_value: Some(SortValue::U64(1)),\n            }),\n            sort_value2: None,\n            split_id: \"split1\".to_string(),\n            segment_ord: 1,\n            doc_id: 1,\n        };\n        let error =\n            validate_sort_by_fields_and_search_after(&sort_fields, &Some(partial_hit)).unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"Invalid argument: `search_after` must have the same number of sort values as sort by \\\n             fields [\\\"timestamp\\\", \\\"id\\\"]\"\n        );\n    }\n\n    #[test]\n    fn test_validate_sort_by_field_type_invalid() {\n        // sort non-datetime field with a datetime format is invalid.\n        let mut schema_builder = Schema::builder();\n        let field = schema_builder.add_u64_field(\"timestamp\", FAST);\n        let schema = schema_builder.build();\n        let field_entry = schema.get_field_entry(field);\n        let error = validate_sort_by_field_type(field_entry, true).unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"Invalid argument: sort by field with a timestamp format must be a datetime field and \\\n             the field `timestamp` is not\"\n        );\n    }\n\n    #[test]\n    fn test_validate_sort_by_fields_and_search_after_invalid_3() {\n        // 3 sort fields is not possible.\n        let sort_fields = vec![\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                
sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n            SortField {\n                field_name: \"timestamp\".to_string(),\n                sort_order: 0,\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampMillis as i32),\n            },\n        ];\n        let error = validate_sort_by_fields_and_search_after(&sort_fields, &None).unwrap_err();\n        assert_eq!(\n            error.to_string(),\n            \"Invalid argument: sort by field must be up to 2 fields, got 3\"\n        );\n    }\n\n    fn mock_partial_hit(\n        split_id: &str,\n        sort_value: u64,\n        doc_id: u32,\n    ) -> quickwit_proto::search::PartialHit {\n        quickwit_proto::search::PartialHit {\n            sort_value: Some(SortValue::U64(sort_value).into()),\n            sort_value2: None,\n            split_id: split_id.to_string(),\n            segment_ord: 1,\n            doc_id,\n        }\n    }\n\n    fn mock_partial_hit_opt_sort_value(\n        split_id: &str,\n        sort_value: Option<u64>,\n        doc_id: u32,\n    ) -> quickwit_proto::search::PartialHit {\n        quickwit_proto::search::PartialHit {\n            sort_value: sort_value.map(|sort_value| SortValue::U64(sort_value).into()),\n            sort_value2: None,\n            split_id: split_id.to_string(),\n            segment_ord: 1,\n            doc_id,\n        }\n    }\n\n    fn get_doc_for_fetch_req(\n        fetch_docs_req: quickwit_proto::search::FetchDocsRequest,\n    ) -> Vec<quickwit_proto::search::LeafHit> {\n        fetch_docs_req\n            .partial_hits\n            .into_iter()\n            .map(|req| quickwit_proto::search::LeafHit {\n                leaf_json: serde_json::to_string_pretty(&serde_json::json!({\n                    \"title\": [req.doc_id.to_string()],\n                    \"body\": [\"test 1\"],\n                    \"url\": [\"http://127.0.0.1/1\"]\n                }))\n                
.expect(\"Json serialization should not fail\"),\n                partial_hit: Some(req),\n                leaf_snippet_json: None,\n            })\n            .collect()\n    }\n\n    #[tokio::test]\n    async fn test_root_search_offset_out_of_bounds_1085() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            start_offset: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_indexes_metadata_request| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    
num_hits: 3,\n                    partial_hits: vec![\n                        mock_partial_hit(\"split1\", 3, 1),\n                        mock_partial_hit(\"split1\", 2, 2),\n                        mock_partial_hit(\"split1\", 1, 3),\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 2,\n                    partial_hits: vec![\n                        mock_partial_hit(\"split2\", 3, 1),\n                        mock_partial_hit(\"split2\", 1, 3),\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = 
SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 5);\n        assert_eq!(search_response.hits.len(), 0);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_single_split() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_list_splits_request| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_leaf_search().returning(\n            |_leaf_search_req: 
quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 3,\n                    partial_hits: vec![\n                        mock_partial_hit(\"split1\", 3, 1),\n                        mock_partial_hit(\"split1\", 2, 2),\n                        mock_partial_hit(\"split1\", 1, 3),\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n\n        let searcher_context = SearcherContext::for_test();\n        let search_response = root_search(\n            &searcher_context,\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 3);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multiple_splits() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = 
MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 2,\n                    partial_hits: vec![\n                        mock_partial_hit(\"split1\", 3, 1),\n                        mock_partial_hit(\"split1\", 1, 3),\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: 
get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 1,\n                    partial_hits: vec![mock_partial_hit(\"split2\", 2, 2)],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 3);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multiple_splits_with_failure() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n          
  max_hits: 2,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            |leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                if leaf_search_req.leaf_requests[0].split_offsets.len() == 2 {\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: 2,\n                        partial_hits: vec![\n                            mock_partial_hit(\"split1\", 3, 1),\n                            mock_partial_hit(\"split1\", 1, 3),\n                        ],\n                        failed_splits: vec![SplitSearchError {\n                            error: \"some error\".to_string(),\n                            split_id: \"split2\".to_string(),\n                            retryable_error: 
true,\n                        }],\n                        num_attempted_splits: 2,\n                        ..Default::default()\n                    })\n                } else {\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: 1,\n                        partial_hits: vec![mock_partial_hit(\"split2\", 2, 2)],\n                        failed_splits: Vec::new(),\n                        num_attempted_splits: 1,\n                        ..Default::default()\n                    })\n                }\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service_1)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 2);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multiple_splits_sort_heteregeneous_field_ascending()\n    -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            sort_fields: vec![SortField {\n                field_name: \"response_date\".to_string(),\n  
              sort_order: SortOrder::Asc.into(),\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampNanos as i32),\n            }],\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 2,\n                    partial_hits: vec![\n                        quickwit_proto::search::PartialHit {\n                            sort_value: Some(SortValue::U64(2u64).into()),\n                            sort_value2: None,\n                            split_id: \"split1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 0,\n            
            },\n                        quickwit_proto::search::PartialHit {\n                            sort_value: None,\n                            sort_value2: None,\n                            split_id: \"split1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 1,\n                        },\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 3,\n                    partial_hits: vec![\n                        quickwit_proto::search::PartialHit {\n                            sort_value: Some(SortValue::I64(-1i64).into()),\n                            sort_value2: None,\n                            split_id: \"split2\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 1,\n                        },\n                        quickwit_proto::search::PartialHit {\n                            sort_value: Some(SortValue::I64(1i64).into()),\n                            sort_value2: None,\n                            split_id: \"split2\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 0,\n                        },\n                        
quickwit_proto::search::PartialHit {\n                            sort_value: None,\n                            sort_value2: None,\n                            split_id: \"split2\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 2,\n                        },\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request.clone(),\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await?;\n\n        assert_eq!(search_response.num_hits, 5);\n        assert_eq!(search_response.hits.len(), 5);\n        assert_eq!(\n            search_response.hits[0].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split2\".to_string(),\n                segment_ord: 0,\n                doc_id: 1,\n                sort_value: Some(SortValue::I64(-1i64).into()),\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[1].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                
split_id: \"split2\".to_string(),\n                segment_ord: 0,\n                doc_id: 0,\n                sort_value: Some(SortValue::I64(1i64).into()),\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[2].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split1\".to_string(),\n                segment_ord: 0,\n                doc_id: 0,\n                sort_value: Some(SortValue::U64(2u64).into()),\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[3].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split1\".to_string(),\n                segment_ord: 0,\n                doc_id: 1,\n                sort_value: None,\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[4].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split2\".to_string(),\n                segment_ord: 0,\n                doc_id: 2,\n                sort_value: None,\n                sort_value2: None,\n            }\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multiple_splits_sort_heteregeneous_field_descending()\n    -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            sort_fields: vec![SortField {\n                field_name: \"response_date\".to_string(),\n                sort_order: SortOrder::Desc.into(),\n                sort_datetime_format: Some(SortDatetimeFormat::UnixTimestampNanos as i32),\n            }],\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        
let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 2,\n                    partial_hits: vec![\n                        quickwit_proto::search::PartialHit {\n                            sort_value: Some(SortValue::U64(2u64).into()),\n                            sort_value2: None,\n                            split_id: \"split1\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 0,\n                        },\n                        quickwit_proto::search::PartialHit {\n                            sort_value: None,\n                            sort_value2: None,\n                            split_id: \"split1\".to_string(),\n                            
segment_ord: 0,\n                            doc_id: 1,\n                        },\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 3,\n                    partial_hits: vec![\n                        quickwit_proto::search::PartialHit {\n                            sort_value: Some(SortValue::I64(1i64).into()),\n                            sort_value2: None,\n                            split_id: \"split2\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 0,\n                        },\n                        quickwit_proto::search::PartialHit {\n                            sort_value: Some(SortValue::I64(-1i64).into()),\n                            sort_value2: None,\n                            split_id: \"split2\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 1,\n                        },\n                        quickwit_proto::search::PartialHit {\n                            sort_value: None,\n                            sort_value2: None,\n                            split_id: \"split2\".to_string(),\n                            segment_ord: 0,\n                            doc_id: 2,\n                
        },\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request.clone(),\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await?;\n\n        assert_eq!(search_response.num_hits, 5);\n        assert_eq!(search_response.hits.len(), 5);\n        assert_eq!(\n            search_response.hits[0].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split1\".to_string(),\n                segment_ord: 0,\n                doc_id: 0,\n                sort_value: Some(SortValue::U64(2u64).into()),\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[1].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split2\".to_string(),\n                segment_ord: 0,\n                doc_id: 0,\n                sort_value: Some(SortValue::I64(1i64).into()),\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            
search_response.hits[2].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split2\".to_string(),\n                segment_ord: 0,\n                doc_id: 1,\n                sort_value: Some(SortValue::I64(-1i64).into()),\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[3].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split2\".to_string(),\n                segment_ord: 0,\n                doc_id: 2,\n                sort_value: None,\n                sort_value2: None,\n            }\n        );\n        assert_eq!(\n            search_response.hits[4].partial_hit.as_ref().unwrap(),\n            &PartialHit {\n                split_id: \"split1\".to_string(),\n                segment_ord: 0,\n                doc_id: 1,\n                sort_value: None,\n                sort_value2: None,\n            }\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_missing_index() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index1\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_list_splits_request| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                
Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mock_metastore_client = MetastoreServiceClient::from_mock(mock_metastore);\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_leaf_search().returning(\n            |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 3,\n                    partial_hits: vec![\n                        mock_partial_hit(\"split1\", 3, 1),\n                        mock_partial_hit(\"split1\", 2, 2),\n                        mock_partial_hit(\"split1\", 1, 3),\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n\n        let searcher_context = SearcherContext::for_test();\n\n        // search with ignore_missing_indexes=true succeeds\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index1\".to_string(), \"test-index2\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ignore_missing_indexes: true,\n            ..Default::default()\n        };\n        let search_response = root_search(\n            &searcher_context,\n   
         search_request,\n            mock_metastore_client.clone(),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 3);\n\n        // search with ignore_missing_indexes=false fails\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index1\".to_string(), \"test-index2\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ignore_missing_indexes: false,\n            ..Default::default()\n        };\n        let search_error = root_search(\n            &searcher_context,\n            search_request,\n            mock_metastore_client,\n            &cluster_client,\n        )\n        .await\n        .unwrap_err();\n        if let SearchError::IndexesNotFound { index_ids } = search_error {\n            assert_eq!(index_ids, vec![\"test-index2\".to_string()]);\n        } else {\n            panic!(\"unexpected error type: {search_error}\");\n        }\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multiple_splits_retry_on_other_node() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n      
      });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1\n            .expect_leaf_search()\n            .times(2)\n            .returning(\n                |leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                    let split_ids: Vec<&str> = leaf_search_req.leaf_requests[0]\n                        .split_offsets\n                        .iter()\n                        .map(|metadata| metadata.split_id.as_str())\n                        .collect();\n                    if split_ids == [\"split1\"] {\n                        Ok(quickwit_proto::search::LeafSearchResponse {\n                            num_hits: 2,\n                            partial_hits: vec![\n                                mock_partial_hit(\"split1\", 3, 1),\n                                mock_partial_hit(\"split1\", 1, 3),\n                            ],\n                            failed_splits: Vec::new(),\n                            num_attempted_splits: 1,\n                            ..Default::default()\n                        })\n                    } else if split_ids == [\"split2\"] {\n                        // RETRY REQUEST!\n                        Ok(quickwit_proto::search::LeafSearchResponse {\n                            num_hits: 1,\n                            partial_hits: 
vec![mock_partial_hit(\"split2\", 2, 2)],\n                            failed_splits: Vec::new(),\n                            num_attempted_splits: 1,\n                            ..Default::default()\n                        })\n                    } else {\n                        panic!(\"unexpected request in test {split_ids:?}\");\n                    }\n                },\n            );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2\n            .expect_leaf_search()\n            .times(1)\n            .returning(\n                |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        // requests from split 2 arrive here - simulate failure\n                        num_hits: 0,\n                        partial_hits: Vec::new(),\n                        failed_splits: vec![SplitSearchError {\n                            error: \"mock_error\".to_string(),\n                            split_id: \"split2\".to_string(),\n                            retryable_error: true,\n                        }],\n                        num_attempted_splits: 1,\n                        ..Default::default()\n                    })\n                },\n            );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = 
searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 3);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multiple_splits_retry_on_all_nodes() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_indexes_metadata_request| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n               
 ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1\n            .expect_leaf_search()\n            .withf(|leaf_search_req| {\n                leaf_search_req.leaf_requests[0].split_offsets[0].split_id == \"split2\"\n            })\n            .return_once(|_| {\n                // requests from split 2 arrive here - simulate failure.\n                // a retry will be made on the second service.\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 0,\n                    partial_hits: Vec::new(),\n                    failed_splits: vec![SplitSearchError {\n                        error: \"mock_error\".to_string(),\n                        split_id: \"split2\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        mock_search_service_1\n            .expect_leaf_search()\n            .withf(|leaf_search_req| {\n                leaf_search_req.leaf_requests[0].split_offsets[0].split_id == \"split1\"\n            })\n            .return_once(|_| {\n                // RETRY REQUEST from split1\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 2,\n                    partial_hits: vec![\n                        mock_partial_hit(\"split1\", 3, 1),\n                        mock_partial_hit(\"split1\", 1, 3),\n                    ],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: 
quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2\n            .expect_leaf_search()\n            .withf(|leaf_search_req| {\n                leaf_search_req.leaf_requests[0].split_offsets[0].split_id == \"split2\"\n            })\n            .return_once(|_| {\n                // the retry for split 2 arrives here - simulate success.\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 1,\n                    partial_hits: vec![mock_partial_hit(\"split2\", 2, 2)],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        mock_search_service_2\n            .expect_leaf_search()\n            .withf(|leaf_search_req| {\n                leaf_search_req.leaf_requests[0].split_offsets[0].split_id == \"split1\"\n            })\n            .return_once(|_| {\n                // requests for split 1 arrive here - simulate failure.\n                // a retry will be made on the first service.\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 0,\n                    partial_hits: Vec::new(),\n                    failed_splits: vec![SplitSearchError {\n                        error: \"mock_error\".to_string(),\n                        split_id: \"split1\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            });\n        mock_search_service_2.expect_fetch_docs().returning(\n            |fetch_docs_req: 
quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 3);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_single_split_retry_single_node() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_list_splits_request| {\n                let splits = vec![\n                    
MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut first_call = true;\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_leaf_search().times(2).returning(\n            move |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                // requests for split 1 arrive here - simulate failure, then success\n                if first_call {\n                    first_call = false;\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: 0,\n                        partial_hits: Vec::new(),\n                        failed_splits: vec![SplitSearchError {\n                            error: \"mock_error\".to_string(),\n                            split_id: \"split1\".to_string(),\n                            retryable_error: true,\n                        }],\n                        num_attempted_splits: 1,\n                        ..Default::default()\n                    })\n                } else {\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: 1,\n                        partial_hits: vec![mock_partial_hit(\"split1\", 2, 2)],\n                        failed_splits: Vec::new(),\n                        num_attempted_splits: 1,\n                        ..Default::default()\n                    })\n                }\n            },\n        );\n        mock_search_service.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                
})\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 1);\n        assert_eq!(search_response.hits.len(), 1);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_single_split_retry_single_node_fails() {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n\n        let 
mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_leaf_search().times(2).returning(\n            move |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 0,\n                    partial_hits: Vec::new(),\n                    failed_splits: vec![SplitSearchError {\n                        error: \"mock_error\".to_string(),\n                        split_id: \"split1\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service.expect_fetch_docs().returning(\n            |_fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Err(SearchError::Internal(\"mockerr docs\".to_string()))\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.failed_splits.len(), 1);\n    }\n\n    #[tokio::test]\n    async fn test_root_search_one_splits_two_nodes_but_one_is_failing_for_split()\n    -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = 
MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        // Service1 - working node.\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            move |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                // requests for split 1, including retries, succeed here\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 1,\n                    partial_hits: vec![mock_partial_hit(\"split1\", 2, 2)],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        // Service2 - node failing for the split.\n        let mut mock_search_service_2 = 
MockSearchService::new();\n        mock_search_service_2.expect_leaf_search().returning(\n            move |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 0,\n                    partial_hits: Vec::new(),\n                    failed_splits: vec![SplitSearchError {\n                        error: \"mock_error\".to_string(),\n                        split_id: \"split1\".to_string(),\n                        retryable_error: true,\n                    }],\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |_fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Err(SearchError::Internal(\"mockerr docs\".to_string()))\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 1);\n        assert_eq!(search_response.hits.len(), 1);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_one_splits_two_nodes_but_one_is_failing_completely()\n    -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n       
     max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n\n        // Service1 - working node.\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1.expect_leaf_search().returning(\n            move |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Ok(quickwit_proto::search::LeafSearchResponse {\n                    num_hits: 1,\n                    partial_hits: vec![mock_partial_hit(\"split1\", 2, 2)],\n                    failed_splits: Vec::new(),\n                    num_attempted_splits: 1,\n                    ..Default::default()\n                })\n            },\n        );\n        mock_search_service_1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n        // Service2 - broken node.\n        let 
mut mock_search_service_2 = MockSearchService::new();\n        mock_search_service_2.expect_leaf_search().returning(\n            move |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                Err(SearchError::Internal(\"mockerr search\".to_string()))\n            },\n        );\n        mock_search_service_2.expect_fetch_docs().returning(\n            |_fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                Err(SearchError::Internal(\"mockerr docs\".to_string()))\n            },\n        );\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service_1),\n            (\"127.0.0.1:1002\", mock_search_service_2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 1);\n        assert_eq!(search_response.hits.len(), 1);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_invalid_queries() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    
MockSplitBuilder::new(\"split\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", MockSearchService::new())]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let searcher_context = SearcherContext::for_test();\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n\n        assert!(\n            root_search(\n                &searcher_context,\n                quickwit_proto::search::SearchRequest {\n                    index_id_patterns: vec![\"test-index\".to_string()],\n                    query_ast: qast_json_helper(\"invalid_field:\\\"test\\\"\", &[\"body\"]),\n                    max_hits: 10,\n                    ..Default::default()\n                },\n                metastore.clone(),\n                &cluster_client,\n            )\n            .await\n            .is_err()\n        );\n\n        assert!(\n            root_search(\n                &searcher_context,\n                quickwit_proto::search::SearchRequest {\n                    index_id_patterns: vec![\"test-index\".to_string()],\n                    query_ast: qast_json_helper(\"test\", &[\"invalid_field\"]),\n                    max_hits: 10,\n                    ..Default::default()\n                },\n                metastore,\n                &cluster_client,\n            )\n            .await\n            .is_err()\n        );\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_invalid_aggregation() -> anyhow::Result<()> {\n        let agg_req = r#\"\n            {\n                \"expensive_colors\": {\n              
      \"termss\": {\n                        \"field\": \"color\",\n                        \"order\": {\n                            \"price_stats.max\": \"desc\"\n                        }\n                    },\n                    \"aggs\": {\n                        \"price_stats\" : {\n                            \"stats\": {\n                                \"field\": \"price\"\n                            }\n                        }\n                    }\n                }\n            }\"#;\n\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            aggregation_request: Some(agg_req.to_string()),\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", MockSearchService::new())]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let 
cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await;\n        assert!(search_response.is_err());\n        assert!(\n            search_response.unwrap_err().to_string().starts_with(\n                \"invalid aggregation request: unknown variant `termss`, expected one of\"\n            )\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_invalid_request() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            start_offset: 20_000,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let searcher_pool = 
searcher_pool_for_test([(\"127.0.0.1:1001\", MockSearchService::new())]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            metastore.clone(),\n            &cluster_client,\n        )\n        .await;\n        assert!(search_response.is_err());\n        assert_eq!(\n            search_response.unwrap_err().to_string(),\n            \"Invalid argument: max value for start_offset is 10_000, but got 20000\",\n        );\n\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 20_000,\n            ..Default::default()\n        };\n\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            metastore,\n            &cluster_client,\n        )\n        .await;\n        assert!(search_response.is_err());\n        assert_eq!(\n            search_response.unwrap_err().to_string(),\n            \"Invalid argument: max value for max_hits is 10_000, but got 20000\",\n        );\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_search_plan_multiple_splits() -> anyhow::Result<()> {\n        use quickwit_query::MatchAllOrNone;\n        use quickwit_query::query_ast::{FullTextMode, FullTextParams, FullTextQuery};\n\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test-query\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = 
MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let search_response = search_plan(\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n        )\n        .await\n        .unwrap();\n        let response: SearchPlanResponseRest =\n            serde_json::from_str(&search_response.result).unwrap();\n        assert_eq!(\n            response,\n            SearchPlanResponseRest {\n                quickwit_ast: QueryAst::FullText(FullTextQuery {\n                    field: \"body\".to_string(),\n                    text: \"test-query\".to_string(),\n                    params: FullTextParams {\n                        tokenizer: None,\n                        mode: FullTextMode::PhraseFallbackToIntersection,\n                        zero_terms_query: MatchAllOrNone::MatchNone,\n                    },\n                    lenient: false,\n                },),\n                tantivy_ast: r#\"BooleanQuery {\n    
subqueries: [\n        (\n            Must,\n            TermQuery(Term(field=3, type=Str, \"test\")),\n        ),\n        (\n            Must,\n            TermQuery(Term(field=3, type=Str, \"query\")),\n        ),\n    ],\n    minimum_number_should_match: 0,\n}\"#\n                .to_string(),\n                searched_splits: vec![\n                    \"test-index/split1\".to_string(),\n                    \"test-index/split2\".to_string()\n                ],\n                storage_requests: StorageRequestCount {\n                    footer: 1,\n                    fastfield: 0,\n                    fieldnorm: 0,\n                    sstable: 2,\n                    posting: 2,\n                    position: 0,\n                },\n            }\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_search_plan_missing_index() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index1\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n    
    let mock_metastore_service = MetastoreServiceClient::from_mock(mock_metastore);\n\n        // plan with ignore_missing_indexes=true succeeds\n        search_plan(\n            quickwit_proto::search::SearchRequest {\n                index_id_patterns: vec![\"test-index1\".to_string(), \"test-index2\".to_string()],\n                query_ast: qast_json_helper(\"test-query\", &[\"body\"]),\n                max_hits: 10,\n                ignore_missing_indexes: true,\n                ..Default::default()\n            },\n            mock_metastore_service.clone(),\n        )\n        .await\n        .unwrap();\n\n        // plan with ignore_missing_indexes=false fails\n        let search_error = search_plan(\n            quickwit_proto::search::SearchRequest {\n                index_id_patterns: vec![\"test-index1\".to_string(), \"test-index2\".to_string()],\n                query_ast: qast_json_helper(\"test-query\", &[\"body\"]),\n                max_hits: 10,\n                ignore_missing_indexes: false,\n                ..Default::default()\n            },\n            mock_metastore_service.clone(),\n        )\n        .await\n        .unwrap_err();\n        if let SearchError::IndexesNotFound { index_ids } = search_error {\n            assert_eq!(index_ids, vec![\"test-index2\".to_string()]);\n        } else {\n            panic!(\"unexpected error type: {search_error}\");\n        }\n        Ok(())\n    }\n\n    #[test]\n    fn test_extract_timestamp_range_from_ast() {\n        use std::ops::Bound;\n\n        use quickwit_query::JsonLiteral;\n\n        let timestamp_field = \"timestamp\";\n\n        let simple_range = quickwit_query::query_ast::RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Included(JsonLiteral::String(\"2021-04-13T22:45:41Z\".to_owned())),\n            upper_bound: Bound::Excluded(JsonLiteral::String(\"2021-05-06T06:51:19Z\".to_owned())),\n        }\n        .into();\n\n        // direct 
range\n        let mut timestamp_range_extractor = ExtractTimestampRange {\n            timestamp_field,\n            start_timestamp: None,\n            end_timestamp: None,\n        };\n        timestamp_range_extractor.visit(&simple_range).unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(1618353941));\n        assert_eq!(timestamp_range_extractor.end_timestamp, Some(1620283879));\n\n        // range inside a must bool query\n        let bool_query_must = quickwit_query::query_ast::BoolQuery {\n            must: vec![simple_range.clone()],\n            ..Default::default()\n        };\n        timestamp_range_extractor.start_timestamp = None;\n        timestamp_range_extractor.end_timestamp = None;\n        timestamp_range_extractor\n            .visit(&bool_query_must.into())\n            .unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(1618353941));\n        assert_eq!(timestamp_range_extractor.end_timestamp, Some(1620283879));\n\n        // range inside a should bool query\n        let bool_query_should = quickwit_query::query_ast::BoolQuery {\n            should: vec![simple_range.clone()],\n            ..Default::default()\n        };\n        timestamp_range_extractor.start_timestamp = Some(123);\n        timestamp_range_extractor.end_timestamp = None;\n        timestamp_range_extractor\n            .visit(&bool_query_should.into())\n            .unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(123));\n        assert_eq!(timestamp_range_extractor.end_timestamp, None);\n\n        // start bound was already more restrictive\n        timestamp_range_extractor.start_timestamp = Some(1618601297);\n        timestamp_range_extractor.end_timestamp = Some(i64::MAX);\n        timestamp_range_extractor.visit(&simple_range).unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(1618601297));\n        assert_eq!(timestamp_range_extractor.end_timestamp, 
Some(1620283879));\n\n        // end bound was already more restrictive\n        timestamp_range_extractor.start_timestamp = Some(1);\n        timestamp_range_extractor.end_timestamp = Some(1618601297);\n        timestamp_range_extractor.visit(&simple_range).unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(1618353941));\n        assert_eq!(timestamp_range_extractor.end_timestamp, Some(1618601297));\n\n        // bounds are (start..end] instead of [start..end)\n        let unusual_bounds = quickwit_query::query_ast::RangeQuery {\n            field: timestamp_field.to_string(),\n            lower_bound: Bound::Excluded(JsonLiteral::String(\"2021-04-13T22:45:41Z\".to_owned())),\n            upper_bound: Bound::Included(JsonLiteral::String(\"2021-05-06T06:51:19Z\".to_owned())),\n        }\n        .into();\n        timestamp_range_extractor.start_timestamp = None;\n        timestamp_range_extractor.end_timestamp = None;\n        timestamp_range_extractor.visit(&unusual_bounds).unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(1618353942));\n        assert_eq!(timestamp_range_extractor.end_timestamp, Some(1620283880));\n\n        let wrong_field = quickwit_query::query_ast::RangeQuery {\n            field: \"other_field\".to_string(),\n            lower_bound: Bound::Included(JsonLiteral::String(\"2021-04-13T22:45:41Z\".to_owned())),\n            upper_bound: Bound::Excluded(JsonLiteral::String(\"2021-05-06T06:51:19Z\".to_owned())),\n        }\n        .into();\n        timestamp_range_extractor.start_timestamp = None;\n        timestamp_range_extractor.end_timestamp = None;\n        timestamp_range_extractor.visit(&wrong_field).unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, None);\n        assert_eq!(timestamp_range_extractor.end_timestamp, None);\n\n        let high_precision = quickwit_query::query_ast::RangeQuery {\n            field: timestamp_field.to_string(),\n            
lower_bound: Bound::Included(JsonLiteral::String(\n                \"2021-04-13T22:45:41.001Z\".to_owned(),\n            )),\n            upper_bound: Bound::Excluded(JsonLiteral::String(\n                \"2021-05-06T06:51:19.001Z\".to_owned(),\n            )),\n        }\n        .into();\n\n        // the upper bound should be rounded up so as to include documents from X.000 to X.001\n        let mut timestamp_range_extractor = ExtractTimestampRange {\n            timestamp_field,\n            start_timestamp: None,\n            end_timestamp: None,\n        };\n        timestamp_range_extractor.visit(&high_precision).unwrap();\n        assert_eq!(timestamp_range_extractor.start_timestamp, Some(1618353941));\n        assert_eq!(timestamp_range_extractor.end_timestamp, Some(1620283880));\n    }\n\n    fn create_search_resp(\n        index_uri: &str,\n        hit_range: Range<usize>,\n        search_after: Option<PartialHit>,\n    ) -> LeafSearchResponse {\n        let (num_total_hits, split_id) = match index_uri {\n            \"ram:///test-index-1\" => (TOTAL_NUM_HITS_INDEX_1, \"split1\"),\n            \"ram:///test-index-2\" => (TOTAL_NUM_HITS_INDEX_2, \"split2\"),\n            _ => panic!(\"unexpected index uri\"),\n        };\n\n        let doc_ids = (0..num_total_hits)\n            .rev()\n            .filter(|elem| {\n                if let Some(search_after) = &search_after {\n                    if split_id == search_after.split_id {\n                        *elem < (search_after.doc_id as usize)\n                    } else {\n                        split_id < search_after.split_id.as_str()\n                    }\n                } else {\n                    true\n                }\n            })\n            .skip(hit_range.start)\n            .take(hit_range.end - hit_range.start);\n        quickwit_proto::search::LeafSearchResponse {\n            num_hits: num_total_hits as u64,\n            partial_hits: doc_ids\n                .map(|doc_id| 
mock_partial_hit_opt_sort_value(split_id, None, doc_id as u32))\n                .collect(),\n            num_attempted_splits: 1,\n            ..Default::default()\n        }\n    }\n\n    const TOTAL_NUM_HITS_INDEX_1: usize = 2_005;\n    const TOTAL_NUM_HITS_INDEX_2: usize = 10;\n    const MAX_HITS_PER_PAGE: usize = 93;\n    const MAX_HITS_PER_PAGE_LARGE: usize = 1_005;\n\n    #[tokio::test]\n    async fn test_root_search_with_scroll() {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let index_uid = index_metadata.index_uid.clone();\n        let index_metadata_2 = IndexMetadata::for_test(\"test-index-2\", \"ram:///test-index-2\");\n        let index_uid_2 = index_metadata_2.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                let indexes_metadata = vec![index_metadata.clone(), index_metadata_2.clone()];\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid_2)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        // We add two mock_search_service to simulate a multi node environment, where the requests\n        // are forwarded to two nodes.\n        let mut mock_search_service1 = MockSearchService::new();\n        mock_search_service1\n         
   .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, SCROLL_BATCH_LEN);\n                assert!(search_req.search_after.is_none());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service1\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, SCROLL_BATCH_LEN);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service1\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the 
scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, SCROLL_BATCH_LEN);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n\n        let mut mock_search_service2 = MockSearchService::new();\n        mock_search_service2\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, SCROLL_BATCH_LEN);\n                assert!(search_req.search_after.is_none());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service2\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as 
usize, SCROLL_BATCH_LEN);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service2\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, SCROLL_BATCH_LEN);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n\n        let kv: Arc<RwLock<HashMap<Vec<u8>, Vec<u8>>>> = Default::default();\n        let kv_clone = kv.clone();\n        mock_search_service1\n            .expect_put_kv()\n            .returning(move |put_kv_req| {\n                kv_clone\n                    .write()\n                    .unwrap()\n                    .insert(put_kv_req.key, put_kv_req.payload);\n            });\n        mock_search_service1\n            .expect_get_kv()\n            .returning(move |get_kv_req| kv.read().unwrap().get(&get_kv_req.key).cloned());\n\n        let kv: Arc<RwLock<HashMap<Vec<u8>, Vec<u8>>>> = Default::default();\n        let kv_clone = kv.clone();\n        mock_search_service2\n            .expect_put_kv()\n            
.returning(move |put_kv_req| {\n                kv_clone\n                    .write()\n                    .unwrap()\n                    .insert(put_kv_req.key, put_kv_req.payload);\n            });\n        mock_search_service2\n            .expect_get_kv()\n            .returning(move |get_kv_req| kv.read().unwrap().get(&get_kv_req.key).cloned());\n\n        mock_search_service1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                assert!(fetch_docs_req.partial_hits.len() <= MAX_HITS_PER_PAGE);\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n\n        mock_search_service2.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                assert!(fetch_docs_req.partial_hits.len() <= MAX_HITS_PER_PAGE);\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service1),\n            (\"127.0.0.1:1002\", mock_search_service2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let searcher_context = SearcherContext::for_test();\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n\n        let mut count_seen_hits = 0;\n\n        let mut scroll_id: String = {\n            let search_request = quickwit_proto::search::SearchRequest {\n                index_id_patterns: vec![\"test-index-*\".to_string()],\n                query_ast: qast_json_helper(\"test\", &[\"body\"]),\n                max_hits: MAX_HITS_PER_PAGE as u64,\n                scroll_ttl_secs: Some(60),\n                ..Default::default()\n            };\n            
let search_response = root_search(\n                &searcher_context,\n                search_request,\n                MetastoreServiceClient::from_mock(mock_metastore),\n                &cluster_client,\n            )\n            .await\n            .unwrap();\n            assert_eq!(\n                search_response.num_hits,\n                (TOTAL_NUM_HITS_INDEX_1 + TOTAL_NUM_HITS_INDEX_2) as u64\n            );\n            assert_eq!(search_response.hits.len(), MAX_HITS_PER_PAGE);\n            let expected = (0..TOTAL_NUM_HITS_INDEX_2)\n                .rev()\n                .zip(std::iter::repeat(\"split2\"))\n                .chain(\n                    (0..TOTAL_NUM_HITS_INDEX_1)\n                        .rev()\n                        .zip(std::iter::repeat(\"split1\")),\n                );\n            for (hit, (doc_id, split)) in search_response.hits.iter().zip(expected) {\n                assert_eq!(\n                    hit.partial_hit.as_ref().unwrap(),\n                    &mock_partial_hit_opt_sort_value(split, None, doc_id as u32)\n                );\n            }\n            count_seen_hits += search_response.hits.len();\n            search_response.scroll_id.unwrap()\n        };\n        for page in 1.. 
{\n            let scroll_req = ScrollRequest {\n                scroll_id,\n                scroll_ttl_secs: Some(60),\n            };\n            let scroll_resp =\n                crate::service::scroll(scroll_req, &cluster_client, &searcher_context)\n                    .await\n                    .unwrap();\n            assert_eq!(\n                scroll_resp.num_hits,\n                (TOTAL_NUM_HITS_INDEX_1 + TOTAL_NUM_HITS_INDEX_2) as u64\n            );\n            let expected = (0..TOTAL_NUM_HITS_INDEX_2)\n                .rev()\n                .zip(std::iter::repeat(\"split2\"))\n                .chain(\n                    (0..TOTAL_NUM_HITS_INDEX_1)\n                        .rev()\n                        .zip(std::iter::repeat(\"split1\")),\n                )\n                .skip(page * MAX_HITS_PER_PAGE);\n            for (hit, (doc_id, split)) in scroll_resp.hits.iter().zip(expected) {\n                assert_eq!(\n                    hit.partial_hit.as_ref().unwrap(),\n                    &mock_partial_hit_opt_sort_value(split, None, doc_id as u32)\n                );\n            }\n            scroll_id = scroll_resp.scroll_id.unwrap();\n            count_seen_hits += scroll_resp.hits.len();\n            if scroll_resp.hits.is_empty() {\n                break;\n            }\n        }\n\n        assert_eq!(\n            count_seen_hits,\n            TOTAL_NUM_HITS_INDEX_1 + TOTAL_NUM_HITS_INDEX_2\n        );\n    }\n\n    #[tokio::test]\n    async fn test_root_search_with_scroll_large_page() {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let index_uid = index_metadata.index_uid.clone();\n        let index_metadata_2 = IndexMetadata::for_test(\"test-index-2\", \"ram:///test-index-2\");\n        let index_uid_2 = index_metadata_2.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n 
           .returning(move |_index_ids_query| {\n                let indexes_metadata = vec![index_metadata.clone(), index_metadata_2.clone()];\n                Ok(ListIndexesMetadataResponse::for_test(indexes_metadata))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid_2)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        // We add two mock_search_service to simulate a multi node environment, where the requests\n        // are forwarded to two nodes.\n        let mut mock_search_service1 = MockSearchService::new();\n        mock_search_service1\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, MAX_HITS_PER_PAGE_LARGE);\n                assert!(search_req.search_after.is_none());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service1\n            
.expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, MAX_HITS_PER_PAGE_LARGE);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service1\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, MAX_HITS_PER_PAGE_LARGE);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        let kv: Arc<RwLock<HashMap<Vec<u8>, Vec<u8>>>> = Default::default();\n        let kv_clone = kv.clone();\n        mock_search_service1\n            .expect_put_kv()\n            .returning(move |put_kv_req| {\n                kv_clone\n                    .write()\n          
          .unwrap()\n                    .insert(put_kv_req.key, put_kv_req.payload);\n            });\n        mock_search_service1\n            .expect_get_kv()\n            .returning(move |get_kv_req| kv.read().unwrap().get(&get_kv_req.key).cloned());\n        mock_search_service1.expect_fetch_docs().returning(\n            |fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                assert!(fetch_docs_req.partial_hits.len() <= MAX_HITS_PER_PAGE_LARGE);\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n\n        let mut mock_search_service2 = MockSearchService::new();\n        mock_search_service2\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, MAX_HITS_PER_PAGE_LARGE);\n                assert!(search_req.search_after.is_none());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service2\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                
assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, MAX_HITS_PER_PAGE_LARGE);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        mock_search_service2\n            .expect_leaf_search()\n            .times(1)\n            .returning(|req: quickwit_proto::search::LeafSearchRequest| {\n                let search_req = req.search_request.unwrap();\n                // the leaf request does not need to know about the scroll_ttl.\n                assert_eq!(search_req.start_offset, 0u64);\n                assert!(search_req.scroll_ttl_secs.is_none());\n                assert_eq!(search_req.max_hits as usize, MAX_HITS_PER_PAGE_LARGE);\n                assert!(search_req.search_after.is_some());\n                Ok(create_search_resp(\n                    &req.index_uris[0],\n                    search_req.start_offset as usize\n                        ..(search_req.start_offset + search_req.max_hits) as usize,\n                    search_req.search_after,\n                ))\n            });\n        let kv: Arc<RwLock<HashMap<Vec<u8>, Vec<u8>>>> = Default::default();\n        let kv_clone = kv.clone();\n        mock_search_service2\n            .expect_put_kv()\n            .returning(move |put_kv_req| {\n                kv_clone\n                    .write()\n                    .unwrap()\n                    .insert(put_kv_req.key, put_kv_req.payload);\n            });\n        mock_search_service2\n            .expect_get_kv()\n            .returning(move |get_kv_req| kv.read().unwrap().get(&get_kv_req.key).cloned());\n        mock_search_service2.expect_fetch_docs().returning(\n            
|fetch_docs_req: quickwit_proto::search::FetchDocsRequest| {\n                assert!(fetch_docs_req.partial_hits.len() <= MAX_HITS_PER_PAGE_LARGE);\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            },\n        );\n\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", mock_search_service1),\n            (\"127.0.0.1:1002\", mock_search_service2),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let searcher_context = SearcherContext::for_test();\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n\n        let mut count_seen_hits = 0;\n\n        let mut scroll_id: String = {\n            let search_request = quickwit_proto::search::SearchRequest {\n                index_id_patterns: vec![\"test-index-*\".to_string()],\n                query_ast: qast_json_helper(\"test\", &[\"body\"]),\n                max_hits: MAX_HITS_PER_PAGE_LARGE as u64,\n                scroll_ttl_secs: Some(60),\n                ..Default::default()\n            };\n            let search_response = root_search(\n                &searcher_context,\n                search_request,\n                MetastoreServiceClient::from_mock(mock_metastore),\n                &cluster_client,\n            )\n            .await\n            .unwrap();\n            assert_eq!(\n                search_response.num_hits,\n                (TOTAL_NUM_HITS_INDEX_1 + TOTAL_NUM_HITS_INDEX_2) as u64\n            );\n            assert_eq!(search_response.hits.len(), MAX_HITS_PER_PAGE_LARGE);\n            let expected = (0..TOTAL_NUM_HITS_INDEX_2)\n                .rev()\n                .zip(std::iter::repeat(\"split2\"))\n                .chain(\n                    (0..TOTAL_NUM_HITS_INDEX_1)\n                        .rev()\n                        .zip(std::iter::repeat(\"split1\")),\n             
   );\n            for (hit, (doc_id, split)) in search_response.hits.iter().zip(expected) {\n                assert_eq!(\n                    hit.partial_hit.as_ref().unwrap(),\n                    &mock_partial_hit_opt_sort_value(split, None, doc_id as u32)\n                );\n            }\n            count_seen_hits += search_response.hits.len();\n            search_response.scroll_id.unwrap()\n        };\n        for page in 1.. {\n            let scroll_req = ScrollRequest {\n                scroll_id,\n                scroll_ttl_secs: Some(60),\n            };\n            let scroll_resp =\n                crate::service::scroll(scroll_req, &cluster_client, &searcher_context)\n                    .await\n                    .unwrap();\n            assert_eq!(\n                scroll_resp.num_hits,\n                (TOTAL_NUM_HITS_INDEX_1 + TOTAL_NUM_HITS_INDEX_2) as u64\n            );\n            let expected = (0..TOTAL_NUM_HITS_INDEX_2)\n                .rev()\n                .zip(std::iter::repeat(\"split2\"))\n                .chain(\n                    (0..TOTAL_NUM_HITS_INDEX_1)\n                        .rev()\n                        .zip(std::iter::repeat(\"split1\")),\n                )\n                .skip(page * MAX_HITS_PER_PAGE_LARGE);\n            for (hit, (doc_id, split)) in scroll_resp.hits.iter().zip(expected) {\n                assert_eq!(\n                    hit.partial_hit.as_ref().unwrap(),\n                    &mock_partial_hit_opt_sort_value(split, None, doc_id as u32)\n                );\n            }\n            scroll_id = scroll_resp.scroll_id.unwrap();\n            count_seen_hits += scroll_resp.hits.len();\n            if scroll_resp.hits.is_empty() {\n                break;\n            }\n        }\n\n        assert_eq!(\n            count_seen_hits,\n            TOTAL_NUM_HITS_INDEX_1 + TOTAL_NUM_HITS_INDEX_2\n        );\n    }\n\n    #[tokio::test]\n    async fn test_root_search_multi_indices() -> 
anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index-*\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata_1 = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let index_uid_1 = index_metadata_1.index_uid.clone();\n        let index_metadata_2 =\n            index_metadata_for_multi_indexes_test(\"test-index-2\", \"ram:///test-index-2\");\n        let index_uid_2 = index_metadata_2.index_uid.clone();\n        let index_metadata_3 =\n            index_metadata_for_multi_indexes_test(\"test-index-3\", \"ram:///test-index-3\");\n        let index_uid_3 = index_metadata_3.index_uid.clone();\n        mock_metastore.expect_list_indexes_metadata().return_once(\n            move |list_indexes_metadata_request: ListIndexesMetadataRequest| {\n                let index_id_patterns = list_indexes_metadata_request.index_id_patterns;\n                assert_eq!(&index_id_patterns, &[\"test-index-*\".to_string()]);\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata_1,\n                    index_metadata_2,\n                    index_metadata_3,\n                ]))\n            },\n        );\n        mock_metastore\n            .expect_list_splits()\n            .return_once(move |list_splits_request| {\n                let list_splits_query =\n                    list_splits_request.deserialize_list_splits_query().unwrap();\n                assert!(\n                    list_splits_query.index_uids\n                        == Some(vec![\n                            index_uid_1.clone(),\n                            index_uid_2.clone(),\n                            index_uid_3.clone()\n                        ])\n                
);\n                let splits = vec![\n                    MockSplitBuilder::new(\"index-1-split-1\")\n                        .with_index_uid(&index_uid_1)\n                        .build(),\n                    MockSplitBuilder::new(\"index-1-split-2\")\n                        .with_index_uid(&index_uid_1)\n                        .build(),\n                    MockSplitBuilder::new(\"index-2-split-1\")\n                        .with_index_uid(&index_uid_2)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1\n            .expect_leaf_search()\n            .times(1)\n            .withf(|leaf_search_req| {\n                (&leaf_search_req.index_uris[0] == \"ram:///test-index-1\"\n                    && leaf_search_req.leaf_requests[0].split_offsets.len() == 2)\n                    || (leaf_search_req.index_uris[0] == \"ram:///test-index-2\"\n                        && leaf_search_req.leaf_requests[0].split_offsets[0].split_id\n                            == \"index-2-split-1\")\n            })\n            .returning(\n                |leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                    let mut partial_hits = leaf_search_req.leaf_requests[0]\n                        .split_offsets\n                        .iter()\n                        .map(|split_offset| mock_partial_hit(&split_offset.split_id, 3, 1))\n                        .collect_vec();\n                    let partial_hits2 = leaf_search_req.leaf_requests[1]\n                        .split_offsets\n                        .iter()\n                        .map(|split_offset| mock_partial_hit(&split_offset.split_id, 3, 1))\n                        .collect_vec();\n                    
partial_hits.extend_from_slice(&partial_hits2);\n                    let num_attempted_splits: u64 = leaf_search_req\n                        .leaf_requests\n                        .iter()\n                        .map(|leaf_req| leaf_req.split_offsets.len() as u64)\n                        .sum::<u64>();\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: leaf_search_req.leaf_requests[0].split_offsets.len() as u64\n                            + leaf_search_req.leaf_requests[1].split_offsets.len() as u64,\n                        partial_hits,\n                        failed_splits: Vec::new(),\n                        num_attempted_splits,\n                        ..Default::default()\n                    })\n                },\n            );\n        mock_search_service_1\n            .expect_fetch_docs()\n            .times(2)\n            .withf(|fetch_docs_req: &FetchDocsRequest| {\n                (fetch_docs_req.index_uri == \"ram:///test-index-1\"\n                    && fetch_docs_req.partial_hits.len() == 2)\n                    || (fetch_docs_req.index_uri == \"ram:///test-index-2\"\n                        && fetch_docs_req.partial_hits[0].split_id == \"index-2-split-1\")\n            })\n            .returning(|fetch_docs_req| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service_1)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        
assert_eq!(search_response.num_hits, 3);\n        assert_eq!(search_response.hits.len(), 3);\n        assert_eq!(\n            search_response\n                .hits\n                .iter()\n                .map(|hit| &hit.index_id)\n                .collect_vec(),\n            vec![\"test-index-1\", \"test-index-1\", \"test-index-2\"]\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_split_failures() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index-1\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata_1 = IndexMetadata::for_test(\"test-index-1\", \"ram:///test-index-1\");\n        let index_uid_1 = index_metadata_1.index_uid.clone();\n        mock_metastore.expect_list_indexes_metadata().return_once(\n            move |_list_indexes_metadata_request: ListIndexesMetadataRequest| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata_1,\n                ]))\n            },\n        );\n        mock_metastore\n            .expect_list_splits()\n            .return_once(move |list_splits_request| {\n                let list_splits_query =\n                    list_splits_request.deserialize_list_splits_query().unwrap();\n                assert!(list_splits_query.index_uids == Some(vec![index_uid_1.clone()]));\n                let splits = vec![\n                    MockSplitBuilder::new(\"index-1-split-1\")\n                        .with_index_uid(&index_uid_1)\n                        .build(),\n                    MockSplitBuilder::new(\"index-1-split-2\")\n                        .with_index_uid(&index_uid_1)\n                        .build(),\n                ];\n                let 
splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mut mock_search_service_1 = MockSearchService::new();\n        mock_search_service_1\n            .expect_leaf_search()\n            .withf(\n                |leaf_search_req: &quickwit_proto::search::LeafSearchRequest| {\n                    leaf_search_req.leaf_requests.len() == 1\n                        && leaf_search_req.leaf_requests[0].split_offsets.len() == 2\n                },\n            )\n            .times(1)\n            .returning(\n                |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                    let partial_hits = vec![mock_partial_hit(\"index-1-split-1\", 0u64, 1u32)];\n                    Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: 1,\n                        partial_hits,\n                        failed_splits: vec![{\n                            SplitSearchError {\n                                error: \"some error\".to_string(),\n                                split_id: \"index-1-split-1\".to_string(),\n                                retryable_error: true,\n                            }\n                        }],\n                        num_attempted_splits: 3,\n                        ..Default::default()\n                    })\n                },\n            );\n        mock_search_service_1\n            .expect_leaf_search()\n            .withf(\n                |leaf_search_req: &quickwit_proto::search::LeafSearchRequest| {\n                    leaf_search_req.leaf_requests.len() == 1\n                        && leaf_search_req.leaf_requests[0].split_offsets.len() == 1\n                },\n            )\n            .times(1)\n            .returning(\n                |_leaf_search_req: quickwit_proto::search::LeafSearchRequest| {\n                    
Ok(quickwit_proto::search::LeafSearchResponse {\n                        num_hits: 0,\n                        partial_hits: Vec::new(),\n                        failed_splits: vec![{\n                            SplitSearchError {\n                                error: \"some error\".to_string(),\n                                split_id: \"index-1-split-1\".to_string(),\n                                retryable_error: true,\n                            }\n                        }],\n                        num_attempted_splits: 1,\n                        ..Default::default()\n                    })\n                },\n            );\n        mock_search_service_1\n            .expect_fetch_docs()\n            .times(1)\n            .returning(|fetch_docs_req| {\n                Ok(quickwit_proto::search::FetchDocsResponse {\n                    hits: get_doc_for_fetch_req(fetch_docs_req),\n                })\n            });\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service_1)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n        let search_response = root_search(\n            &SearcherContext::for_test(),\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap();\n        assert_eq!(search_response.num_hits, 1);\n        assert_eq!(search_response.hits.len(), 1);\n        assert_eq!(search_response.failed_splits.len(), 1);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_root_search_too_many_splits() -> anyhow::Result<()> {\n        let search_request = quickwit_proto::search::SearchRequest {\n            index_id_patterns: vec![\"test-index\".to_string()],\n            query_ast: qast_json_helper(\"test\", &[\"body\"]),\n            max_hits: 10,\n            ..Default::default()\n      
  };\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata = IndexMetadata::for_test(\"test-index\", \"ram:///test-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .returning(move |_index_ids_query| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    index_metadata.clone(),\n                ]))\n            });\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |_filter| {\n                let splits = vec![\n                    MockSplitBuilder::new(\"split1\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                    MockSplitBuilder::new(\"split2\")\n                        .with_index_uid(&index_uid)\n                        .build(),\n                ];\n                let splits_response = ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits_response)]))\n            });\n        let mock_search_service = MockSearchService::new();\n        let searcher_pool = searcher_pool_for_test([(\"127.0.0.1:1001\", mock_search_service)]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let cluster_client = ClusterClient::new(search_job_placer.clone());\n\n        let mut searcher_context = SearcherContext::for_test();\n        searcher_context.searcher_config.max_splits_per_search = Some(1);\n        let search_error = root_search(\n            &searcher_context,\n            search_request,\n            MetastoreServiceClient::from_mock(mock_metastore),\n            &cluster_client,\n        )\n        .await\n        .unwrap_err();\n        assert!(matches!(search_error, SearchError::InvalidArgument { .. 
}));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_finalize_aggregation_if_any_no_aggregation_request() {\n        let search_request = SearchRequest {\n            aggregation_request: None,\n            skip_aggregation_finalization: false,\n            ..Default::default()\n        };\n        let searcher_context = SearcherContext::for_test();\n        let result =\n            finalize_aggregation_if_any(&search_request, Some(vec![1, 2, 3]), &searcher_context)\n                .unwrap();\n        assert!(result.is_none());\n    }\n\n    #[tokio::test]\n    async fn test_finalize_aggregation_if_any_skip_finalization_returns_intermediate_bytes() {\n        let agg_req = r#\"{\"avg_price\": {\"avg\": {\"field\": \"price\"}}}\"#;\n        let intermediate_bytes = vec![42, 43, 44];\n        let search_request = SearchRequest {\n            aggregation_request: Some(agg_req.to_string()),\n            skip_aggregation_finalization: true,\n            ..Default::default()\n        };\n        let searcher_context = SearcherContext::for_test();\n        let result = finalize_aggregation_if_any(\n            &search_request,\n            Some(intermediate_bytes.clone()),\n            &searcher_context,\n        )\n        .unwrap();\n        assert_eq!(result, Some(intermediate_bytes));\n    }\n\n    #[tokio::test]\n    async fn test_finalize_aggregation_if_any_skip_finalization_none_bytes() {\n        let agg_req = r#\"{\"avg_price\": {\"avg\": {\"field\": \"price\"}}}\"#;\n        let search_request = SearchRequest {\n            aggregation_request: Some(agg_req.to_string()),\n            skip_aggregation_finalization: true,\n            ..Default::default()\n        };\n        let searcher_context = SearcherContext::for_test();\n        let result = finalize_aggregation_if_any(&search_request, None, &searcher_context).unwrap();\n        assert!(result.is_none());\n    }\n\n    #[tokio::test]\n    async fn 
test_finalize_aggregation_if_any_default_finalizes() {\n        let agg_req = r#\"{\"avg_price\": {\"avg\": {\"field\": \"price\"}}}\"#;\n        let intermediate_results = IntermediateAggregationResults::default();\n        let intermediate_bytes = postcard::to_stdvec(&intermediate_results).unwrap();\n        let search_request = SearchRequest {\n            aggregation_request: Some(agg_req.to_string()),\n            skip_aggregation_finalization: false,\n            ..Default::default()\n        };\n        let searcher_context = SearcherContext::for_test();\n        let result = finalize_aggregation_if_any(\n            &search_request,\n            Some(intermediate_bytes.clone()),\n            &searcher_context,\n        )\n        .unwrap();\n        // Result should be Some (finalized), but different from intermediate bytes\n        assert!(result.is_some());\n        assert_ne!(result.unwrap(), intermediate_bytes);\n    }\n\n    #[tokio::test]\n    async fn test_finalize_aggregation_if_any_false_flag_finalizes() {\n        let agg_req = r#\"{\"avg_price\": {\"avg\": {\"field\": \"price\"}}}\"#;\n        let intermediate_results = IntermediateAggregationResults::default();\n        let intermediate_bytes = postcard::to_stdvec(&intermediate_results).unwrap();\n        let search_request = SearchRequest {\n            aggregation_request: Some(agg_req.to_string()),\n            skip_aggregation_finalization: false,\n            ..Default::default()\n        };\n        let searcher_context = SearcherContext::for_test();\n        let result = finalize_aggregation_if_any(\n            &search_request,\n            Some(intermediate_bytes.clone()),\n            &searcher_context,\n        )\n        .unwrap();\n        assert!(result.is_some());\n        assert_ne!(result.unwrap(), intermediate_bytes);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/scroll_context.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::ops::Range;\nuse std::str::FromStr;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse anyhow::Context;\nuse base64::Engine;\nuse base64::prelude::BASE64_STANDARD;\nuse quickwit_common::metrics::GaugeGuard;\nuse quickwit_common::shared_consts::SCROLL_BATCH_LEN;\nuse quickwit_metastore::SplitMetadata;\nuse quickwit_proto::search::{LeafSearchResponse, PartialHit, SearchRequest, SplitSearchError};\nuse quickwit_proto::types::IndexUid;\nuse serde::{Deserialize, Serialize};\nuse tokio::sync::RwLock;\nuse ttl_cache::TtlCache;\nuse ulid::Ulid;\n\nuse crate::ClusterClient;\nuse crate::root::IndexMetasForLeafSearch;\nuse crate::service::SearcherContext;\n\n/// Maximum number of values in the local search KV store.\n///\n/// TODO make configurable.\n///\n/// Assuming a search context of 1MB, this can\n/// amount to up to 1GB.\nconst LOCAL_KV_CACHE_SIZE: usize = 1_000;\n\n#[derive(Serialize, Deserialize)]\npub(crate) struct ScrollContext {\n    pub split_metadatas: Vec<SplitMetadata>,\n    pub search_request: SearchRequest,\n    pub indexes_metas_for_leaf_search: HashMap<IndexUid, IndexMetasForLeafSearch>,\n    pub total_num_hits: u64,\n    pub max_hits_per_page: u64,\n    pub cached_partial_hits_start_offset: u64,\n    pub cached_partial_hits: Vec<PartialHit>,\n    pub failed_splits: 
Vec<SplitSearchError>,\n    pub num_successful_splits: u64,\n}\n\nimpl ScrollContext {\n    /// Returns as many of the requested hits as are available in the cache.\n    pub fn get_cached_partial_hits(&self, doc_range: Range<u64>) -> &[PartialHit] {\n        if doc_range.end <= doc_range.start {\n            return &[];\n        }\n        if doc_range.start < self.cached_partial_hits_start_offset {\n            return &[];\n        }\n        if doc_range.start\n            >= self.cached_partial_hits_start_offset + self.cached_partial_hits.len() as u64\n        {\n            return &[];\n        }\n        let truncated_partial_hits = &self.cached_partial_hits\n            [(doc_range.start - self.cached_partial_hits_start_offset) as usize..];\n        let num_partial_hits = truncated_partial_hits\n            .len()\n            .min((doc_range.end - doc_range.start) as usize);\n        &truncated_partial_hits[..num_partial_hits]\n    }\n\n    /// Clears the cache if it would not be useful, i.e. if the page size is greater than\n    /// `SCROLL_BATCH_LEN`.\n    pub fn clear_cache_if_unneeded(&mut self) {\n        if self.search_request.max_hits > SCROLL_BATCH_LEN as u64 {\n            self.cached_partial_hits.clear();\n        }\n    }\n\n    pub fn serialize(&self) -> Vec<u8> {\n        let uncompressed_payload = serde_json::to_string(self).unwrap();\n        uncompressed_payload.as_bytes().to_vec()\n    }\n\n    pub fn load(payload: &[u8]) -> anyhow::Result<Self> {\n        let scroll_context =\n            serde_json::from_slice(payload).context(\"failed to deserialize context\")?;\n        Ok(scroll_context)\n    }\n\n    /// Loads into the `ScrollContext` cache all the\n    /// hits in the range [start_offset..start_offset + SCROLL_BATCH_LEN).\n    pub async fn load_batch_starting_at(\n        &mut self,\n        start_offset: u64,\n        previous_last_hit: PartialHit,\n        cluster_client: &ClusterClient,\n        searcher_context: &SearcherContext,\n    ) -> crate::Result<bool> {\n        
self.search_request.search_after = Some(previous_last_hit);\n        let leaf_search_response: LeafSearchResponse = crate::root::search_partial_hits_phase(\n            searcher_context,\n            &self.indexes_metas_for_leaf_search,\n            &self.search_request,\n            &self.split_metadatas[..],\n            cluster_client,\n        )\n        .await?;\n        self.cached_partial_hits_start_offset = start_offset;\n        self.cached_partial_hits = leaf_search_response.partial_hits;\n        Ok(true)\n    }\n}\n\nstruct TrackedValue {\n    content: Vec<u8>,\n    _total_size_metric_guard: GaugeGuard<'static>,\n}\n\n/// In memory key value store with TTL and limited size.\n///\n/// Once the capacity [LOCAL_KV_CACHE_SIZE] is reached, the oldest entries are\n/// removed.\n///\n/// Currently this store is only used for caching scroll contexts. Using it for\n/// other purposes is risky as use cases would compete for its capacity.\n#[derive(Clone)]\npub(crate) struct MiniKV {\n    ttl_with_cache: Arc<RwLock<TtlCache<Vec<u8>, TrackedValue>>>,\n}\n\nimpl Default for MiniKV {\n    fn default() -> MiniKV {\n        MiniKV {\n            ttl_with_cache: Arc::new(RwLock::new(TtlCache::new(LOCAL_KV_CACHE_SIZE))),\n        }\n    }\n}\n\nimpl MiniKV {\n    pub async fn put(&self, key: Vec<u8>, payload: Vec<u8>, ttl: Duration) {\n        let mut metric_guard =\n            GaugeGuard::from_gauge(&crate::SEARCH_METRICS.searcher_local_kv_store_size_bytes);\n        metric_guard.add(payload.len() as i64);\n        let mut cache_lock = self.ttl_with_cache.write().await;\n        cache_lock.insert(\n            key,\n            TrackedValue {\n                content: payload,\n                _total_size_metric_guard: metric_guard,\n            },\n            ttl,\n        );\n    }\n\n    pub async fn get(&self, key: &[u8]) -> Option<Vec<u8>> {\n        let cache_lock = self.ttl_with_cache.read().await;\n        let tracked_value = cache_lock.get(key)?;\n        
Some(tracked_value.content.clone())\n    }\n}\n\n#[derive(Serialize, Deserialize, Clone, Eq, PartialEq, Debug)]\npub(crate) struct ScrollKeyAndStartOffset {\n    scroll_ulid: Ulid,\n    pub(crate) start_offset: u64,\n    // this is set to zero if there are no more documents\n    pub(crate) max_hits_per_page: u32,\n    pub(crate) search_after: PartialHit,\n}\n\nimpl ScrollKeyAndStartOffset {\n    pub fn new_with_start_offset(\n        start_offset: u64,\n        max_hits_per_page: u32,\n        search_after: PartialHit,\n    ) -> ScrollKeyAndStartOffset {\n        let scroll_ulid: Ulid = Ulid::new();\n        // technically we could only initialize search_after on first call to next_page, and use\n        // default() before, but that feels like partial initialization.\n        ScrollKeyAndStartOffset {\n            scroll_ulid,\n            start_offset,\n            max_hits_per_page,\n            search_after,\n        }\n    }\n\n    pub fn next_page(\n        mut self,\n        found_hits_in_current_page: u64,\n        last_hit: PartialHit,\n    ) -> ScrollKeyAndStartOffset {\n        self.start_offset += found_hits_in_current_page;\n        if found_hits_in_current_page < self.max_hits_per_page as u64 {\n            self.max_hits_per_page = 0;\n        }\n        self.search_after = last_hit;\n        self\n    }\n\n    pub fn scroll_key(&self) -> [u8; 16] {\n        u128::from(self.scroll_ulid).to_le_bytes()\n    }\n}\n\nimpl fmt::Display for ScrollKeyAndStartOffset {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        let mut payload = vec![0u8; 28];\n        payload[..16].copy_from_slice(&u128::from(self.scroll_ulid).to_le_bytes());\n        payload[16..24].copy_from_slice(&self.start_offset.to_le_bytes());\n        payload[24..28].copy_from_slice(&self.max_hits_per_page.to_le_bytes());\n        serde_json::to_writer(&mut payload, &self.search_after)\n            .expect(\"serializing PartialHit should never fail\");\n        let 
b64_payload = BASE64_STANDARD.encode(payload);\n        write!(formatter, \"{b64_payload}\")\n    }\n}\n\nimpl FromStr for ScrollKeyAndStartOffset {\n    type Err = &'static str;\n\n    fn from_str(scroll_id_str: &str) -> Result<Self, Self::Err> {\n        let base64_decoded: Vec<u8> = BASE64_STANDARD\n            .decode(scroll_id_str)\n            .map_err(|_| \"scroll id is invalid base64.\")?;\n        if base64_decoded.len() <= 16 + 8 + 4 {\n            return Err(\"scroll id payload is truncated\");\n        }\n        let (scroll_ulid_bytes, from_bytes, max_hits_bytes) = (\n            &base64_decoded[..16],\n            &base64_decoded[16..24],\n            &base64_decoded[24..28],\n        );\n        let scroll_ulid = u128::from_le_bytes(scroll_ulid_bytes.try_into().unwrap()).into();\n        let from = u64::from_le_bytes(from_bytes.try_into().unwrap());\n        let max_hits = u32::from_le_bytes(max_hits_bytes.try_into().unwrap());\n        if max_hits > 10_000 {\n            return Err(\"scroll id is malformed\");\n        }\n        let search_after =\n            serde_json::from_slice(&base64_decoded[28..]).map_err(|_| \"scroll id is malformed\")?;\n        Ok(ScrollKeyAndStartOffset {\n            scroll_ulid,\n            start_offset: from,\n            max_hits_per_page: max_hits,\n            search_after,\n        })\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::str::FromStr;\n\n    use quickwit_proto::search::PartialHit;\n\n    use crate::scroll_context::ScrollKeyAndStartOffset;\n\n    #[test]\n    fn test_scroll_id() {\n        let partial_hit = PartialHit {\n            sort_value: None,\n            sort_value2: None,\n            split_id: \"split\".to_string(),\n            segment_ord: 1,\n            doc_id: 2,\n        };\n        let scroll = ScrollKeyAndStartOffset::new_with_start_offset(10, 100, partial_hit);\n        let scroll_str = scroll.to_string();\n        let ser_deser_scroll = 
ScrollKeyAndStartOffset::from_str(&scroll_str).unwrap();\n        assert_eq!(scroll, ser_deser_scroll);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/search_job_placer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Ordering;\nuse std::collections::{HashMap, HashSet};\nuse std::fmt;\nuse std::hash::{Hash, Hasher};\nuse std::net::SocketAddr;\n\nuse anyhow::bail;\nuse async_trait::async_trait;\nuse quickwit_common::SocketAddrLegacyHash;\nuse quickwit_common::pubsub::EventSubscriber;\nuse quickwit_common::rendezvous_hasher::{node_affinity, sort_by_rendez_vous_hash};\nuse quickwit_proto::search::{ReportSplit, ReportSplitsRequest};\nuse tracing::{info, warn};\n\nuse crate::{SEARCH_METRICS, SearchJob, SearchServiceClient, SearcherPool};\n\n/// Job.\n/// The unit in which distributed search is performed.\n///\n/// The `split_id` is used to define an affinity between a leaf nodes and a job.\n/// The `cost` is used to spread the work evenly amongst nodes.\npub trait Job {\n    /// Split ID of the targeted split.\n    fn split_id(&self) -> &str;\n\n    /// Estimation of the load associated with running a given job.\n    ///\n    /// A list of jobs will be assigned to leaf nodes in a way that spread\n    /// the sum of cost evenly.\n    fn cost(&self) -> usize;\n\n    /// Compares the cost of two jobs in reverse order, breaking ties by split ID.\n    fn compare_cost(&self, other: &Self) -> Ordering {\n        self.cost()\n            .cmp(&other.cost())\n            .reverse()\n            .then_with(|| self.split_id().cmp(other.split_id()))\n    
}\n}\n\n/// Search job placer.\n/// It assigns jobs to search clients.\n#[derive(Clone, Default)]\npub struct SearchJobPlacer {\n    /// Search clients pool.\n    searcher_pool: SearcherPool,\n}\n\n#[async_trait]\nimpl EventSubscriber<ReportSplitsRequest> for SearchJobPlacer {\n    async fn handle_event(&mut self, evt: ReportSplitsRequest) {\n        let mut nodes: HashMap<SocketAddr, SearchServiceClient> =\n            self.searcher_pool.pairs().into_iter().collect();\n        if nodes.is_empty() {\n            return;\n        }\n        let mut splits_per_node: HashMap<SocketAddr, Vec<ReportSplit>> =\n            HashMap::with_capacity(nodes.len().min(evt.report_splits.len()));\n        for report_split in evt.report_splits {\n            let node_addr = nodes\n                .keys()\n                .max_by_key(|node_addr| {\n                    node_affinity(SocketAddrLegacyHash(node_addr), &report_split.split_id)\n                })\n                // This actually never happens thanks to the if-condition at the\n                // top of this function.\n                .expect(\"`nodes` should not be empty\");\n            splits_per_node\n                .entry(*node_addr)\n                .or_default()\n                .push(report_split);\n        }\n        for (node_addr, report_splits) in splits_per_node {\n            if let Some(search_client) = nodes.get_mut(&node_addr) {\n                let report_splits_req = ReportSplitsRequest { report_splits };\n                let _ = search_client.report_splits(report_splits_req).await;\n            }\n        }\n    }\n}\n\nimpl fmt::Debug for SearchJobPlacer {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"SearchJobPlacer\").finish()\n    }\n}\n\nimpl SearchJobPlacer {\n    /// Returns an [`SearchJobPlacer`] from a search service client pool.\n    pub fn new(searcher_pool: SearcherPool) -> Self {\n        Self { searcher_pool }\n    }\n}\n\nstruct 
SocketAddrAndClient {\n    socket_addr: SocketAddr,\n    client: SearchServiceClient,\n}\n\nimpl Hash for SocketAddrAndClient {\n    fn hash<H: Hasher>(&self, hasher: &mut H) {\n        SocketAddrLegacyHash(&self.socket_addr).hash(hasher);\n    }\n}\n\nimpl SearchJobPlacer {\n    /// Returns an iterator over the search nodes, ordered by their affinity\n    /// with the `affinity_key`, as defined by rendez-vous hashing.\n    pub async fn best_nodes_per_affinity(\n        &self,\n        affinity_key: &[u8],\n    ) -> impl Iterator<Item = SearchServiceClient> {\n        let mut nodes: Vec<SocketAddrAndClient> = self\n            .searcher_pool\n            .pairs()\n            .into_iter()\n            .map(|(socket_addr, client)| SocketAddrAndClient {\n                socket_addr,\n                client,\n            })\n            .collect();\n        sort_by_rendez_vous_hash(&mut nodes[..], affinity_key);\n        nodes\n            .into_iter()\n            .map(|socket_addr_and_client| socket_addr_and_client.client)\n    }\n\n    /// Assigns the given jobs to the clients.\n    /// Returns an iterator over pairs (`SearchServiceClient`, `Vec<Job>`).\n    ///\n    /// When `excluded_addrs` would filter out all clients, it is ignored.\n    pub async fn assign_jobs<J: Job>(\n        &self,\n        mut jobs: Vec<J>,\n        excluded_addrs: &HashSet<SocketAddr>,\n    ) -> anyhow::Result<impl Iterator<Item = (SearchServiceClient, Vec<J>)> + use<J>> {\n        let mut all_nodes = self.searcher_pool.pairs();\n\n        if all_nodes.is_empty() {\n            bail!(\n                \"failed to assign search jobs: there are no available searcher nodes in the \\\n                 cluster\"\n            );\n        }\n        if !excluded_addrs.is_empty() && excluded_addrs.len() < all_nodes.len() {\n            all_nodes.retain(|(grpc_addr, _)| !excluded_addrs.contains(grpc_addr));\n\n            // This should never happen, but... 
belt and suspenders policy.\n            if all_nodes.is_empty() {\n                bail!(\n                    \"failed to assign search jobs: there are no searcher nodes candidates for \\\n                     these jobs\"\n                );\n            }\n            info!(\n                \"excluded {} nodes from search job placement, {} remaining\",\n                excluded_addrs.len(),\n                all_nodes.len()\n            );\n        }\n        let mut candidate_nodes: Vec<CandidateNode> = all_nodes\n            .into_iter()\n            .map(|(grpc_addr, client)| CandidateNode {\n                grpc_addr,\n                client,\n                load: 0,\n            })\n            .collect();\n\n        jobs.sort_unstable_by(Job::compare_cost);\n\n        let num_nodes = candidate_nodes.len();\n\n        let mut job_assignments: HashMap<SocketAddr, (SearchServiceClient, Vec<J>)> =\n            HashMap::with_capacity(num_nodes);\n\n        let total_load: usize = jobs.iter().map(|job| job.cost()).sum();\n\n        // allow around 5% disparity. Round up so we never end up in a case where\n        // target_load * num_nodes < total_load\n        // some of our tests needs 2 splits to be put on 2 different searchers. It makes sense for\n        // these tests to keep doing so (testing root merge). 
Either we can make the allowed\n        // difference stricter, find the right split names (\"split6\" instead of \"split2\" works).\n        // or modify mock_split_meta() so that not all splits have the same job cost\n        // for now i went with the mock_split_meta() changes.\n        const ALLOWED_DIFFERENCE: usize = 105;\n        let target_load = (total_load * ALLOWED_DIFFERENCE).div_ceil(num_nodes * 100);\n        for job in jobs {\n            sort_by_rendez_vous_hash(&mut candidate_nodes, job.split_id());\n\n            let (chosen_node_idx, chosen_node) = if let Some((idx, node)) = candidate_nodes\n                .iter_mut()\n                .enumerate()\n                .find(|(_pos, node)| node.load < target_load)\n            {\n                (idx, node)\n            } else {\n                warn!(\"found no lightly loaded searcher for split, this should never happen\");\n                (0, &mut candidate_nodes[0])\n            };\n            let metric_node_idx = match chosen_node_idx {\n                0 => \"0\",\n                1 => \"1\",\n                _ => \"> 1\",\n            };\n            SEARCH_METRICS\n                .job_assigned_total\n                .with_label_values([metric_node_idx])\n                .inc();\n            chosen_node.load += job.cost();\n\n            job_assignments\n                .entry(chosen_node.grpc_addr)\n                .or_insert_with(|| (chosen_node.client.clone(), Vec::new()))\n                .1\n                .push(job);\n        }\n        Ok(job_assignments.into_values())\n    }\n\n    /// Assigns a single job to a client.\n    pub async fn assign_job<J: Job>(\n        &self,\n        job: J,\n        excluded_addrs: &HashSet<SocketAddr>,\n    ) -> anyhow::Result<SearchServiceClient> {\n        let client = self\n            .assign_jobs(vec![job], excluded_addrs)\n            .await?\n            .next()\n            .map(|(client, _jobs)| client)\n            .expect(\"`assign_jobs` 
should return at least one client or fail.\");\n        Ok(client)\n    }\n}\n\n#[derive(Debug, Clone)]\nstruct CandidateNode {\n    pub grpc_addr: SocketAddr,\n    pub client: SearchServiceClient,\n    pub load: usize,\n}\n\nimpl Hash for CandidateNode {\n    fn hash<H: Hasher>(&self, state: &mut H) {\n        SocketAddrLegacyHash(&self.grpc_addr).hash(state);\n    }\n}\n\nimpl PartialEq for CandidateNode {\n    fn eq(&self, other: &Self) -> bool {\n        self.grpc_addr == other.grpc_addr\n    }\n}\n\nimpl Eq for CandidateNode {}\n\n/// Groups jobs by index id and returns a list of `SearchJob` per index\npub fn group_jobs_by_index_id(\n    jobs: Vec<SearchJob>,\n    cb: impl FnMut(Vec<SearchJob>) -> crate::Result<()>,\n) -> crate::Result<()> {\n    // Group jobs by index uid.\n    group_by(jobs, |job| &job.index_uid, cb)?;\n    Ok(())\n}\n\n/// Note: The data will be sorted.\n///\n/// Returns slices of the input data grouped by passed closure.\npub fn group_by<T, K: Ord, F>(\n    mut data: Vec<T>,\n    compare_by: impl Fn(&T) -> &K,\n    mut callback: F,\n) -> crate::Result<()>\nwhere\n    F: FnMut(Vec<T>) -> crate::Result<()>,\n{\n    data.sort_by(|job1, job2| compare_by(job2).cmp(compare_by(job1)));\n    while !data.is_empty() {\n        let last_element = data.last().unwrap();\n        let count = data\n            .iter()\n            .rev()\n            .take_while(|&x| compare_by(x) == compare_by(last_element))\n            .count();\n\n        let group = data.split_off(data.len() - count);\n        callback(group)?;\n    }\n\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n    use crate::{MockSearchService, SearchJob, searcher_pool_for_test};\n\n    #[test]\n    fn test_group_by_1() {\n        let data = vec![1, 1, 2, 2, 2, 3, 4, 4, 5, 5, 5];\n        let mut outputs: Vec<Vec<i32>> = Vec::new();\n        group_by(\n            data,\n            |el| el,\n            |group| {\n                outputs.push(group);\n                Ok(())\n 
           },\n        )\n        .unwrap();\n        assert_eq!(outputs.len(), 5);\n        assert_eq!(outputs[0], vec![1, 1]);\n        assert_eq!(outputs[1], vec![2, 2, 2]);\n        assert_eq!(outputs[2], vec![3]);\n        assert_eq!(outputs[3], vec![4, 4]);\n        assert_eq!(outputs[4], vec![5, 5, 5]);\n    }\n    #[test]\n    fn test_group_by_all_same() {\n        let data = vec![1, 1];\n        let mut outputs: Vec<Vec<i32>> = Vec::new();\n        group_by(\n            data,\n            |el| el,\n            |group| {\n                outputs.push(group);\n                Ok(())\n            },\n        )\n        .unwrap();\n        assert_eq!(outputs.len(), 1);\n        assert_eq!(outputs[0], vec![1, 1]);\n    }\n    #[test]\n    fn test_group_by_empty() {\n        let data = vec![];\n        let mut outputs: Vec<Vec<i32>> = Vec::new();\n        group_by(\n            data,\n            |el| el,\n            |group| {\n                outputs.push(group);\n                Ok(())\n            },\n        )\n        .unwrap();\n        assert_eq!(outputs.len(), 0);\n    }\n\n    #[tokio::test]\n    async fn test_search_job_placer() {\n        {\n            let searcher_pool = SearcherPool::default();\n            let search_job_placer = SearchJobPlacer::new(searcher_pool);\n            assert!(\n                search_job_placer\n                    .assign_jobs::<SearchJob>(Vec::new(), &HashSet::new())\n                    .await\n                    .is_err()\n            );\n        }\n        {\n            let searcher_pool =\n                searcher_pool_for_test([(\"127.0.0.1:1001\", MockSearchService::new())]);\n            let search_job_placer = SearchJobPlacer::new(searcher_pool);\n            let jobs = vec![\n                SearchJob::for_test(\"split1\", 1),\n                SearchJob::for_test(\"split2\", 2),\n                SearchJob::for_test(\"split3\", 3),\n                SearchJob::for_test(\"split4\", 4),\n            ];\n      
      let assigned_jobs: Vec<(SocketAddr, Vec<SearchJob>)> = search_job_placer\n                .assign_jobs(jobs, &HashSet::default())\n                .await\n                .unwrap()\n                .map(|(client, jobs)| (client.grpc_addr(), jobs))\n                .collect();\n            let expected_searcher_addr: SocketAddr = ([127, 0, 0, 1], 1001).into();\n            let expected_assigned_jobs = vec![(\n                expected_searcher_addr,\n                vec![\n                    SearchJob::for_test(\"split4\", 4),\n                    SearchJob::for_test(\"split3\", 3),\n                    SearchJob::for_test(\"split2\", 2),\n                    SearchJob::for_test(\"split1\", 1),\n                ],\n            )];\n            assert_eq!(assigned_jobs, expected_assigned_jobs);\n        }\n        {\n            let searcher_pool = searcher_pool_for_test([\n                (\"127.0.0.1:1001\", MockSearchService::new()),\n                (\"127.0.0.1:1002\", MockSearchService::new()),\n            ]);\n            let search_job_placer = SearchJobPlacer::new(searcher_pool);\n            let jobs = vec![\n                SearchJob::for_test(\"split1\", 1),\n                SearchJob::for_test(\"split2\", 2),\n                SearchJob::for_test(\"split3\", 3),\n                SearchJob::for_test(\"split4\", 4),\n                SearchJob::for_test(\"split5\", 5),\n                SearchJob::for_test(\"split6\", 6),\n            ];\n            let mut assigned_jobs: Vec<(SocketAddr, Vec<SearchJob>)> = search_job_placer\n                .assign_jobs(jobs, &HashSet::default())\n                .await\n                .unwrap()\n                .map(|(client, jobs)| (client.grpc_addr(), jobs))\n                .collect();\n            assigned_jobs.sort_unstable_by_key(|(node_uid, _)| *node_uid);\n\n            let expected_searcher_addr_1: SocketAddr = ([127, 0, 0, 1], 1001).into();\n            let expected_searcher_addr_2: SocketAddr = ([127, 0, 
0, 1], 1002).into();\n            // on a small number of splits, we may be unbalanced\n            let expected_assigned_jobs = vec![\n                (\n                    expected_searcher_addr_1,\n                    vec![\n                        SearchJob::for_test(\"split5\", 5),\n                        SearchJob::for_test(\"split4\", 4),\n                        SearchJob::for_test(\"split3\", 3),\n                    ],\n                ),\n                (\n                    expected_searcher_addr_2,\n                    vec![\n                        SearchJob::for_test(\"split6\", 6),\n                        SearchJob::for_test(\"split2\", 2),\n                        SearchJob::for_test(\"split1\", 1),\n                    ],\n                ),\n            ];\n            assert_eq!(assigned_jobs, expected_assigned_jobs);\n        }\n        {\n            let searcher_pool = searcher_pool_for_test([\n                (\"127.0.0.1:1001\", MockSearchService::new()),\n                (\"127.0.0.1:1002\", MockSearchService::new()),\n            ]);\n            let search_job_placer = SearchJobPlacer::new(searcher_pool);\n            let jobs = vec![\n                SearchJob::for_test(\"split1\", 1000),\n                SearchJob::for_test(\"split2\", 1),\n            ];\n            let mut assigned_jobs: Vec<(SocketAddr, Vec<SearchJob>)> = search_job_placer\n                .assign_jobs(jobs, &HashSet::default())\n                .await\n                .unwrap()\n                .map(|(client, jobs)| (client.grpc_addr(), jobs))\n                .collect();\n            assigned_jobs.sort_unstable_by_key(|(node_uid, _)| *node_uid);\n\n            let expected_searcher_addr_1: SocketAddr = ([127, 0, 0, 1], 1001).into();\n            let expected_searcher_addr_2: SocketAddr = ([127, 0, 0, 1], 1002).into();\n            let expected_assigned_jobs = vec![\n                (\n                    expected_searcher_addr_1,\n                    
vec![SearchJob::for_test(\"split1\", 1000)],\n                ),\n                (\n                    expected_searcher_addr_2,\n                    vec![SearchJob::for_test(\"split2\", 1)],\n                ),\n            ];\n            assert_eq!(assigned_jobs, expected_assigned_jobs);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_search_job_placer_many_splits() {\n        let searcher_pool = searcher_pool_for_test([\n            (\"127.0.0.1:1001\", MockSearchService::new()),\n            (\"127.0.0.1:1002\", MockSearchService::new()),\n            (\"127.0.0.1:1003\", MockSearchService::new()),\n            (\"127.0.0.1:1004\", MockSearchService::new()),\n            (\"127.0.0.1:1005\", MockSearchService::new()),\n        ]);\n        let search_job_placer = SearchJobPlacer::new(searcher_pool);\n        let jobs = (0..1000)\n            .map(|id| SearchJob::for_test(&format!(\"split{id}\"), 1))\n            .collect();\n        let jobs_len: Vec<usize> = search_job_placer\n            .assign_jobs(jobs, &HashSet::default())\n            .await\n            .unwrap()\n            .map(|(_, jobs)| jobs.len())\n            .collect();\n        for job_len in jobs_len {\n            assert!(job_len <= 1050 / 5);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/search_permit_provider.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BinaryHeap;\nuse std::collections::binary_heap::PeekMut;\nuse std::future::Future;\nuse std::pin::Pin;\nuse std::sync::Arc;\nuse std::task::{Context, Poll};\n\nuse bytesize::ByteSize;\nuse quickwit_common::metrics::GaugeGuard;\nuse quickwit_proto::search::SplitIdAndFooterOffsets;\nuse tokio::sync::{mpsc, oneshot};\n\n/// Distributor of permits to perform split search operation.\n///\n/// Requests are served in order. Each permit initially reserves a slot for the\n/// warmup (limit concurrent downloads) and a pessimistic amount of memory. Once\n/// the warmup is completed, the actual memory usage is set and the warmup slot\n/// is released. Once the search is completed and the permit is dropped, the\n/// remaining memory is also released.\n#[derive(Clone)]\npub struct SearchPermitProvider {\n    message_sender: mpsc::UnboundedSender<SearchPermitMessage>,\n    #[allow(dead_code)]\n    actor_join_handle: Arc<tokio::task::JoinHandle<SearchPermitActor>>,\n}\n\npub enum SearchPermitMessage {\n    RequestWithOffload {\n        permit_sizes: Vec<u64>,\n        /// Maximum number of pending requests. 
If granting all\n        /// requested permits would cause the number of pending requests to exceed this threshold,\n        /// some permits will be offloaded to Lambda.\n        offload_threshold: usize,\n        /// Channel to return the result message from the actor.\n        /// When offloading permits, the number of futures can be smaller than the number of\n        /// requested permits.\n        permit_resp_tx: oneshot::Sender<Vec<SearchPermitFuture>>,\n    },\n    UpdateMemory {\n        memory_delta: i64,\n    },\n    FreeWarmupSlot,\n    Drop {\n        memory_size: u64,\n        warmup_slot_freed: bool,\n    },\n}\n\n/// Makes a very pessimistic estimate of the memory allocation required for a split search.\n///\n/// This is refined later on when more data is available about the split.\npub fn compute_initial_memory_allocation(\n    split: &SplitIdAndFooterOffsets,\n    warmup_single_split_initial_allocation: ByteSize,\n) -> ByteSize {\n    let split_size = split.split_footer_start;\n    // we consider the configured initial allocation to be set for a large split with 10M docs\n    const LARGE_SPLIT_NUM_DOCS: u64 = 10_000_000;\n    let proportional_allocation =\n        warmup_single_split_initial_allocation.as_u64() * split.num_docs / LARGE_SPLIT_NUM_DOCS;\n    let size_bytes = [\n        split_size,\n        proportional_allocation,\n        warmup_single_split_initial_allocation.as_u64(),\n    ]\n    .into_iter()\n    .min()\n    .unwrap();\n    const MINIMUM_ALLOCATION_BYTES: u64 = 10_000_000;\n    ByteSize(size_bytes.max(MINIMUM_ALLOCATION_BYTES))\n}\n\nimpl SearchPermitProvider {\n    pub fn new(num_download_slots: usize, memory_budget: ByteSize) -> Self {\n        let (message_sender, message_receiver) = mpsc::unbounded_channel();\n        let actor = SearchPermitActor {\n            msg_receiver: message_receiver,\n            msg_sender: message_sender.downgrade(),\n            num_warmup_slots_available: num_download_slots,\n            
total_memory_budget: memory_budget.as_u64(),\n            permits_requests: BinaryHeap::new(),\n            total_memory_allocated: 0u64,\n        };\n        let actor_join_handle = Arc::new(tokio::spawn(actor.run()));\n        Self {\n            message_sender,\n            actor_join_handle,\n        }\n    }\n\n    #[cfg(test)]\n    async fn stop_and_unwrap(self) -> SearchPermitActor {\n        let SearchPermitProvider {\n            message_sender,\n            actor_join_handle,\n            ..\n        } = self;\n        drop(message_sender);\n        Arc::into_inner(actor_join_handle).unwrap().await.unwrap()\n    }\n\n    /// Returns permits for local splits\n    ///\n    /// The returned futures are guaranteed to resolve in order.\n    pub async fn get_permits(&self, splits: Vec<ByteSize>) -> Vec<SearchPermitFuture> {\n        self.get_permits_with_offload(splits, usize::MAX).await\n    }\n\n    /// Returns permits for local splits and a list of split indices to offload.\n    ///\n    /// The actor checks the current pending queue depth. 
If adding all splits\n    /// would exceed `offload_threshold` pending requests, only enough splits\n    /// to fill up to the threshold are processed locally; the rest are offloaded.\n    ///\n    /// The returned futures are guaranteed to resolve in order.\n    ///\n    /// If `offload_threshold` is 0, all splits are offloaded.\n    /// If `offload_threshold` is usize::MAX, all splits are processed locally.\n    pub async fn get_permits_with_offload(\n        &self,\n        splits: Vec<ByteSize>,\n        offload_threshold: usize,\n    ) -> Vec<SearchPermitFuture> {\n        if splits.is_empty() {\n            return Vec::new();\n        }\n        let (permit_sender, permit_receiver) = oneshot::channel();\n        let permit_sizes = splits.into_iter().map(|size| size.as_u64()).collect();\n        self.message_sender\n            .send(SearchPermitMessage::RequestWithOffload {\n                permit_resp_tx: permit_sender,\n                permit_sizes,\n                offload_threshold,\n            })\n            .expect(\"Receiver lives longer than sender\");\n        permit_receiver\n            .await\n            .expect(\"Receiver lives longer than sender\")\n    }\n}\n\nstruct SearchPermitActor {\n    msg_receiver: mpsc::UnboundedReceiver<SearchPermitMessage>,\n    msg_sender: mpsc::WeakUnboundedSender<SearchPermitMessage>,\n    num_warmup_slots_available: usize,\n    /// Note it is possible for memory_allocated to exceed memory_budget temporarily,\n    /// if and only if a split leaf search task ended up using more than `initial_allocation`.\n    /// When it happens, new permits will not be assigned until the memory is freed.\n    total_memory_budget: u64,\n    total_memory_allocated: u64,\n    permits_requests: BinaryHeap<LeafPermitRequest>,\n}\n\nstruct SingleSplitPermitRequest {\n    permit_sender: oneshot::Sender<SearchPermit>,\n    permit_size: u64,\n}\n\nstruct LeafPermitRequest {\n    /// Single split permit requests for this leaf search.\n    
single_split_permit_requests: std::vec::IntoIter<SingleSplitPermitRequest>,\n}\n\nimpl Ord for LeafPermitRequest {\n    fn cmp(&self, other: &Self) -> std::cmp::Ordering {\n        // we compare other with self and not the other way around because we want a min-heap and\n        // Rust's is a max-heap\n        other\n            .single_split_permit_requests\n            .as_slice()\n            .len()\n            .cmp(&self.single_split_permit_requests.as_slice().len())\n    }\n}\n\nimpl PartialOrd for LeafPermitRequest {\n    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl PartialEq for LeafPermitRequest {\n    fn eq(&self, other: &Self) -> bool {\n        self.cmp(other).is_eq()\n    }\n}\n\nimpl Eq for LeafPermitRequest {}\n\nimpl LeafPermitRequest {\n    // `permit_sizes` must not be empty.\n    fn from_estimated_costs(permit_sizes: Vec<u64>) -> (Self, Vec<SearchPermitFuture>) {\n        assert!(!permit_sizes.is_empty(), \"permit_sizes must not be empty\");\n        let mut permits = Vec::with_capacity(permit_sizes.len());\n        let mut single_split_permit_requests = Vec::with_capacity(permit_sizes.len());\n        for permit_size in permit_sizes {\n            let (tx, rx) = oneshot::channel();\n            // we keep our internal list of permits and the returned wait handles in the\n            // same order to make sure we emit each permit in the right order. 
Doing otherwise\n            // may cause deadlocks\n            single_split_permit_requests.push(SingleSplitPermitRequest {\n                permit_sender: tx,\n                permit_size,\n            });\n            permits.push(SearchPermitFuture(rx));\n        }\n        (\n            LeafPermitRequest {\n                single_split_permit_requests: single_split_permit_requests.into_iter(),\n            },\n            permits,\n        )\n    }\n\n    fn pop_if_smaller_than(&mut self, max_size: u64) -> Option<SingleSplitPermitRequest> {\n        let peeked_single_split_req = self.single_split_permit_requests.as_slice().first()?;\n        if peeked_single_split_req.permit_size > max_size {\n            return None;\n        }\n        self.single_split_permit_requests.next()\n    }\n\n    fn is_empty(&self) -> bool {\n        self.single_split_permit_requests.as_slice().is_empty()\n    }\n}\n\nimpl SearchPermitActor {\n    async fn run(mut self) -> Self {\n        // Stops when the last clone of SearchPermitProvider is dropped.\n        while let Some(msg) = self.msg_receiver.recv().await {\n            self.handle_message(msg);\n        }\n        self\n    }\n\n    fn handle_message(&mut self, msg: SearchPermitMessage) {\n        match msg {\n            SearchPermitMessage::RequestWithOffload {\n                mut permit_sizes,\n                permit_resp_tx: permit_sender,\n                offload_threshold,\n            } => {\n                let current_pending = self\n                    .permits_requests\n                    .iter()\n                    .map(|req| req.single_split_permit_requests.as_slice().len())\n                    .sum();\n                // How many new splits can we accept locally before hitting the threshold.\n                let local_capacity = offload_threshold.saturating_sub(current_pending);\n\n                // If this indeed truncates the permit_sizes vector, other splits will be offloaded\n                // to 
lambdas.\n                permit_sizes.truncate(local_capacity);\n\n                // We special case here in order to avoid pushing empty request in the queue.\n                // (they would never be removed)\n                if permit_sizes.is_empty() {\n                    let _ = permit_sender.send(Vec::new());\n                    return;\n                }\n\n                let (leaf_permit_request, permit_futures) =\n                    LeafPermitRequest::from_estimated_costs(permit_sizes);\n                self.permits_requests.push(leaf_permit_request);\n                self.assign_available_permits();\n                let _ = permit_sender.send(permit_futures);\n            }\n            SearchPermitMessage::UpdateMemory { memory_delta } => {\n                if self.total_memory_allocated as i64 + memory_delta < 0 {\n                    panic!(\"More memory released than allocated, should never happen.\")\n                }\n                self.total_memory_allocated =\n                    (self.total_memory_allocated as i64 + memory_delta) as u64;\n                self.assign_available_permits();\n            }\n            SearchPermitMessage::FreeWarmupSlot => {\n                self.num_warmup_slots_available += 1;\n                self.assign_available_permits();\n            }\n            SearchPermitMessage::Drop {\n                memory_size,\n                warmup_slot_freed,\n            } => {\n                if !warmup_slot_freed {\n                    self.num_warmup_slots_available += 1;\n                }\n                self.total_memory_allocated = self\n                    .total_memory_allocated\n                    .checked_sub(memory_size)\n                    .expect(\"More memory released than allocated, should never happen.\");\n                self.assign_available_permits();\n            }\n        }\n    }\n\n    fn pop_next_request_if_serviceable(&mut self) -> Option<SingleSplitPermitRequest> {\n        if 
self.num_warmup_slots_available == 0 {\n            return None;\n        }\n        let available_memory = self\n            .total_memory_budget\n            .checked_sub(self.total_memory_allocated)?;\n        let mut peeked = self.permits_requests.peek_mut()?;\n\n        assert!(\n            !peeked.is_empty(),\n            \"unexpected empty permits_requests present in the search permit provider queue\"\n        );\n        if let Some(permit_request) = peeked.pop_if_smaller_than(available_memory) {\n            if peeked.is_empty() {\n                PeekMut::pop(peeked);\n            }\n            return Some(permit_request);\n        }\n        None\n    }\n\n    fn assign_available_permits(&mut self) {\n        while let Some(permit_request) = self.pop_next_request_if_serviceable() {\n            let mut ongoing_gauge_guard = GaugeGuard::from_gauge(\n                &crate::SEARCH_METRICS.leaf_search_single_split_tasks_ongoing,\n            );\n            ongoing_gauge_guard.add(1);\n            self.total_memory_allocated += permit_request.permit_size;\n            self.num_warmup_slots_available -= 1;\n            permit_request\n                .permit_sender\n                .send(SearchPermit {\n                    _ongoing_gauge_guard: ongoing_gauge_guard,\n                    msg_sender: self.msg_sender.clone(),\n                    memory_allocation: permit_request.permit_size,\n                    warmup_slot_freed: false,\n                })\n                // if the requester dropped its receiver, we drop the newly\n                // created SearchPermit which releases the resources\n                .ok();\n        }\n        crate::SEARCH_METRICS\n            .leaf_search_single_split_tasks_pending\n            .set(self.permits_requests.len() as i64);\n    }\n}\n\npub struct SearchPermit {\n    _ongoing_gauge_guard: GaugeGuard<'static>,\n    msg_sender: mpsc::WeakUnboundedSender<SearchPermitMessage>,\n    memory_allocation: u64,\n    
warmup_slot_freed: bool,\n}\n\nimpl SearchPermit {\n    /// Update the memory usage attached to this permit.\n    ///\n    /// This will increase or decrease the available memory in the [`SearchPermitProvider`].\n    pub fn update_memory_usage(&mut self, new_memory_usage: ByteSize) {\n        let new_usage_bytes = new_memory_usage.as_u64();\n        let memory_delta = new_usage_bytes as i64 - self.memory_allocation as i64;\n        self.memory_allocation = new_usage_bytes;\n        self.send_if_still_running(SearchPermitMessage::UpdateMemory { memory_delta });\n    }\n\n    /// Drop the warmup permit, allowing more downloads to be started. Only one\n    /// slot is attached to each permit so calling this again has no effect.\n    pub fn free_warmup_slot(&mut self) {\n        if self.warmup_slot_freed {\n            return;\n        }\n        self.warmup_slot_freed = true;\n        self.send_if_still_running(SearchPermitMessage::FreeWarmupSlot);\n    }\n\n    pub fn memory_allocation(&self) -> ByteSize {\n        ByteSize(self.memory_allocation)\n    }\n\n    fn send_if_still_running(&self, msg: SearchPermitMessage) {\n        if let Some(sender) = self.msg_sender.upgrade() {\n            sender\n                .send(msg)\n                // Receiver instance in the event loop is never dropped or\n                // closed as long as there is a strong sender reference.\n                .expect(\"Receiver should live longer than sender\");\n        }\n    }\n}\n\nimpl Drop for SearchPermit {\n    fn drop(&mut self) {\n        self.send_if_still_running(SearchPermitMessage::Drop {\n            memory_size: self.memory_allocation,\n            warmup_slot_freed: self.warmup_slot_freed,\n        });\n    }\n}\n\npub struct SearchPermitFuture(oneshot::Receiver<SearchPermit>);\n\nimpl Future for SearchPermitFuture {\n    type Output = SearchPermit;\n\n    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {\n        let receiver = Pin::new(&mut 
self.get_mut().0);\n        match receiver.poll(cx) {\n            Poll::Ready(Ok(search_permit)) => Poll::Ready(search_permit),\n            Poll::Ready(Err(_)) => panic!(\"Failed to acquire permit. This should never happen! Please, report on https://github.com/quickwit-oss/quickwit/issues.\"),\n            Poll::Pending => Poll::Pending,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use futures::StreamExt;\n    use rand::seq::SliceRandom;\n    use tokio::task::JoinSet;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_search_permit_order() {\n        let permit_provider = SearchPermitProvider::new(1, ByteSize::mb(100));\n        let mut all_futures = Vec::new();\n        let first_batch_of_permits = permit_provider\n            .get_permits(vec![ByteSize::mb(10); 10])\n            .await;\n        assert_eq!(first_batch_of_permits.len(), 10);\n        all_futures.extend(\n            first_batch_of_permits\n                .into_iter()\n                .enumerate()\n                .map(move |(i, fut)| ((1, i), fut)),\n        );\n\n        let second_batch_of_permits = permit_provider\n            .get_permits(vec![ByteSize::mb(10); 10])\n            .await;\n        assert_eq!(second_batch_of_permits.len(), 10);\n        all_futures.extend(\n            second_batch_of_permits\n                .into_iter()\n                .enumerate()\n                .map(move |(i, fut)| ((2, i), fut)),\n        );\n\n        // not super useful, considering what join set does, but still a tiny bit more sound.\n        all_futures.shuffle(&mut rand::rng());\n\n        let mut join_set = JoinSet::new();\n        for (res, fut) in all_futures {\n            join_set.spawn(async move {\n                let permit = fut.await;\n                (res, permit)\n            });\n        }\n        let mut ordered_result: Vec<(usize, usize)> = Vec::with_capacity(20);\n        while let Some(Ok(((batch_id, order), _permit))) = 
join_set.join_next().await {\n            ordered_result.push((batch_id, order));\n        }\n\n        assert_eq!(ordered_result.len(), 20);\n        for (i, res) in ordered_result[0..10].iter().enumerate() {\n            assert_eq!(res, &(1, i));\n        }\n        for (i, res) in ordered_result[10..20].iter().enumerate() {\n            assert_eq!(res, &(2, i));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_search_permit_order_with_concurrent_search() {\n        let permit_provider = SearchPermitProvider::new(4, ByteSize::mb(100));\n        let mut all_futures = Vec::new();\n        let first_batch_of_permits = permit_provider.get_permits(vec![ByteSize::mb(10); 8]).await;\n        assert_eq!(first_batch_of_permits.len(), 8);\n        all_futures.extend(\n            first_batch_of_permits\n                .into_iter()\n                .enumerate()\n                .map(move |(i, fut)| ((1, i), fut)),\n        );\n\n        let second_batch_of_permits = permit_provider.get_permits(vec![ByteSize::mb(10); 2]).await;\n        all_futures.extend(\n            second_batch_of_permits\n                .into_iter()\n                .enumerate()\n                .map(move |(i, fut)| ((2, i), fut)),\n        );\n\n        let third_batch_of_permits = permit_provider.get_permits(vec![ByteSize::mb(10); 6]).await;\n        all_futures.extend(\n            third_batch_of_permits\n                .into_iter()\n                .enumerate()\n                .map(move |(i, fut)| ((3, i), fut)),\n        );\n\n        // not super useful, considering what join set does, but still a tiny bit more sound.\n        all_futures.shuffle(&mut rand::rng());\n\n        let mut join_set = JoinSet::new();\n        for (res, fut) in all_futures {\n            join_set.spawn(async move {\n                let permit = fut.await;\n                (res, permit)\n            });\n        }\n        let mut ordered_result: Vec<(usize, usize)> = Vec::with_capacity(20);\n        while 
let Some(Ok(((batch_id, order), _permit))) = join_set.join_next().await {\n            ordered_result.push((batch_id, order));\n        }\n\n        let mut counters = [0; 4];\n        let expected_result: Vec<(usize, usize)> = [\n            1, 1, 1, 1, // initial 4 permits\n            2, 2, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3,\n        ]\n        .into_iter()\n        .map(|batch_id| {\n            let order = counters[batch_id];\n            counters[batch_id] += 1;\n            (batch_id, order)\n        })\n        .collect();\n\n        // for the first 4 permits, the order is not well defined as they are all granted at once,\n        // and we poll futures in a random order. We sort them to fix that artifact\n        ordered_result[..4].sort();\n        assert_eq!(ordered_result, expected_result);\n    }\n\n    #[tokio::test]\n    async fn test_search_permit_early_drops() {\n        let permit_provider = SearchPermitProvider::new(1, ByteSize::mb(100));\n        let permit_fut1 = permit_provider\n            .get_permits(vec![ByteSize::mb(10)])\n            .await\n            .into_iter()\n            .next()\n            .unwrap();\n        let permit_fut2 = permit_provider\n            .get_permits(vec![ByteSize::mb(10)])\n            .await\n            .into_iter()\n            .next()\n            .unwrap();\n        drop(permit_fut1);\n        let permit = permit_fut2.await;\n        assert_eq!(permit.memory_allocation, ByteSize::mb(10).as_u64());\n        assert!(!permit_provider.actor_join_handle.is_finished());\n\n        let _permit_fut3 = permit_provider\n            .get_permits(vec![ByteSize::mb(10)])\n            .await\n            .into_iter()\n            .next()\n            .unwrap();\n        let SearchPermitProvider {\n            message_sender,\n            actor_join_handle,\n        } = permit_provider;\n        drop(message_sender);\n        Arc::into_inner(actor_join_handle).unwrap().await.unwrap();\n    }\n\n    /// Tries to wait for a 
permit\n    async fn try_get(permit_fut: SearchPermitFuture) -> anyhow::Result<SearchPermit> {\n        // using a short timeout is a bit flaky, but it should be enough for these tests\n        let permit = tokio::time::timeout(Duration::from_millis(20), permit_fut).await?;\n        Ok(permit)\n    }\n\n    #[tokio::test]\n    async fn test_memory_budget() {\n        let permit_provider = SearchPermitProvider::new(100, ByteSize::mb(100));\n        let mut permit_futs = permit_provider\n            .get_permits(vec![ByteSize::mb(10); 14])\n            .await;\n        let mut remaining_permit_futs = permit_futs.split_off(10).into_iter();\n        assert_eq!(remaining_permit_futs.len(), 4);\n        // we should be able to obtain 10 permits right away (100MB / 10MB)\n        let mut permits: Vec<SearchPermit> = futures::stream::iter(permit_futs.into_iter())\n            .buffered(1)\n            .collect()\n            .await;\n        // the next permit is blocked by the memory budget\n        let next_blocked_permit_fut = remaining_permit_futs.next().unwrap();\n        try_get(next_blocked_permit_fut).await.err().unwrap();\n        // if we drop one of the permits, we can get a new one\n        permits.drain(0..1);\n        let next_permit_fut = remaining_permit_futs.next().unwrap();\n        let _new_permit = try_get(next_permit_fut).await.unwrap();\n        // the next permit is blocked again by the memory budget\n        let next_blocked_permit_fut = remaining_permit_futs.next().unwrap();\n        try_get(next_blocked_permit_fut).await.err().unwrap();\n        // by setting a more accurate memory usage after a completed warmup, we can get more permits\n        permits[0].update_memory_usage(ByteSize::mb(4));\n        permits[1].update_memory_usage(ByteSize::mb(6));\n        let next_permit_fut = remaining_permit_futs.next().unwrap();\n        try_get(next_permit_fut).await.unwrap();\n    }\n\n    #[tokio::test]\n    async fn 
test_get_permits_with_offload_threshold_max_returns_all() {\n        let permit_provider = SearchPermitProvider::new(100, ByteSize::mb(100));\n        let permits = permit_provider\n            .get_permits_with_offload(vec![ByteSize::mb(1); 8], usize::MAX)\n            .await;\n        assert_eq!(permits.len(), 8);\n    }\n\n    #[tokio::test]\n    async fn test_get_permits_with_offload_threshold_zero_returns_none() {\n        let permit_provider = SearchPermitProvider::new(100, ByteSize::mb(100));\n        let permits = permit_provider\n            .get_permits_with_offload(vec![ByteSize::mb(1); 5], 0)\n            .await;\n        assert!(permits.is_empty());\n        let permit_actor = permit_provider.stop_and_unwrap().await;\n        assert!(permit_actor.permits_requests.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_get_permits_with_offload_truncates_to_threshold() {\n        let permit_provider = SearchPermitProvider::new(100, ByteSize::mb(100));\n        let permits = permit_provider\n            .get_permits_with_offload(vec![ByteSize::mb(1); 10], 4)\n            .await;\n        assert_eq!(permits.len(), 4);\n    }\n\n    #[tokio::test]\n    async fn test_get_permits_with_offload_futures_resolve_in_order() {\n        // We use a search permit provider with a capacity of 1 to make sure that the permits are\n        // resolved in order.\n        let permit_provider = SearchPermitProvider::new(1, ByteSize::mb(100));\n        let permits = permit_provider\n            .get_permits_with_offload(vec![ByteSize::mb(1); 4], 10)\n            .await;\n        assert_eq!(permits.len(), 4);\n        let mut futs: Vec<_> = permits\n            .into_iter()\n            .enumerate()\n            .map(|(i, permit_fut)| async move {\n                permit_fut.await;\n                i\n            })\n            .collect();\n        futs.shuffle(&mut rand::rng());\n        let mut join_set = JoinSet::new();\n        for fut in futs {\n            
join_set.spawn(fut);\n        }\n        let mut results = Vec::new();\n        while let Some(result) = join_set.join_next().await {\n            results.push(result.unwrap());\n        }\n        assert_eq!(results, vec![0, 1, 2, 3]);\n    }\n\n    #[tokio::test]\n    async fn test_get_permits_with_offload_pending_consumed_frees_capacity() {\n        let permit_provider = SearchPermitProvider::new(100, ByteSize::mb(100));\n        // First call: 4 splits, threshold 6.\n        let first_permits = permit_provider\n            .get_permits_with_offload(vec![ByteSize::mb(1); 4], 6)\n            .await;\n        assert_eq!(first_permits.len(), 4);\n        // Consume all permits from the first batch (they resolve and get dropped).\n        for permit_fut in first_permits {\n            let _permit = permit_fut.await;\n        }\n        // Second call: the consumed permits no longer count as pending.\n        let second_permits = permit_provider\n            .get_permits_with_offload(vec![ByteSize::mb(1); 5], 6)\n            .await;\n        assert_eq!(second_permits.len(), 5);\n    }\n\n    #[tokio::test]\n    async fn test_warmup_slot() {\n        let permit_provider = SearchPermitProvider::new(10, ByteSize::mb(100));\n        let mut permit_futs = permit_provider.get_permits(vec![ByteSize::mb(1); 16]).await;\n        let mut remaining_permit_futs = permit_futs.split_off(10).into_iter();\n        assert_eq!(remaining_permit_futs.len(), 6);\n        // we should be able to obtain 10 permits right away\n        let mut permits: Vec<SearchPermit> = futures::stream::iter(permit_futs.into_iter())\n            .buffered(1)\n            .collect()\n            .await;\n        // the next permit is blocked by the warmup slots\n        let next_blocked_permit_fut = remaining_permit_futs.next().unwrap();\n        try_get(next_blocked_permit_fut).await.err().unwrap();\n        // if we drop one of the permits, we can get a new one\n        permits.drain(0..1);\n        let 
next_permit_fut = remaining_permit_futs.next().unwrap();\n        permits.push(try_get(next_permit_fut).await.unwrap());\n        // the next permit is blocked again by the warmup slots\n        let next_blocked_permit_fut = remaining_permit_futs.next().unwrap();\n        try_get(next_blocked_permit_fut).await.err().unwrap();\n        // we can explicitly free the warmup slot on a permit\n        permits[0].free_warmup_slot();\n        let next_permit_fut = remaining_permit_futs.next().unwrap();\n        permits.push(try_get(next_permit_fut).await.unwrap());\n        // dropping that same permit does not free up another slot\n        permits.drain(0..1);\n        let next_blocked_permit_fut = remaining_permit_futs.next().unwrap();\n        try_get(next_blocked_permit_fut).await.err().unwrap();\n        // but dropping a permit for which the slot wasn't explicitly freed does free up a slot\n        permits.drain(0..1);\n        let next_permit_fut = remaining_permit_futs.next().unwrap();\n        permits.push(try_get(next_permit_fut).await.unwrap());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/search_response_rest.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::TryFrom;\n\nuse quickwit_common::truncate_str;\nuse quickwit_proto::search::SearchResponse;\nuse quickwit_query::aggregations::AggregationResults as AggregationResultsProxy;\nuse quickwit_query::query_ast::QueryAst;\nuse serde::{Deserialize, Serialize};\nuse serde_json::Value as JsonValue;\n\nuse crate::error::SearchError;\n\n/// A classic ES aggregation result ast\n// TODO previously, we were using zero-copy when possible, which we are no longer doing:\n// is that problematic? 
How can we return to zero/low-copy without it being painful?\n#[derive(Serialize, PartialEq, Debug)]\npub struct AggregationResults(tantivy::aggregation::agg_result::AggregationResults);\n\nimpl AggregationResults {\n    /// Parse an ES aggregation result ast from our non-ambiguous postcard format\n    pub fn from_postcard(postcard_bytes: &[u8]) -> anyhow::Result<Self> {\n        let aggregation_result: AggregationResultsProxy = postcard::from_bytes(postcard_bytes)?;\n        Ok(AggregationResults(aggregation_result.into()))\n    }\n}\n\n/// SearchResponseRest represents the response returned by the REST search API\n/// and is meant to be serialized into JSON.\n#[derive(Serialize, PartialEq, Debug, utoipa::ToSchema)]\npub struct SearchResponseRest {\n    /// Overall number of documents matching the query.\n    pub num_hits: u64,\n    #[schema(value_type = Vec<Object>)]\n    /// List of hits returned.\n    pub hits: Vec<JsonValue>,\n    /// List of snippets\n    #[schema(value_type = Vec<Object>)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub snippets: Option<Vec<JsonValue>>,\n    /// Elapsed time.\n    pub elapsed_time_micros: u64,\n    /// Search errors.\n    pub errors: Vec<String>,\n    /// Aggregations.\n    #[schema(value_type = Object)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub aggregations: Option<AggregationResults>,\n}\n\nimpl TryFrom<SearchResponse> for SearchResponseRest {\n    type Error = SearchError;\n\n    fn try_from(search_response: SearchResponse) -> Result<Self, Self::Error> {\n        let mut documents = Vec::with_capacity(search_response.hits.len());\n        let mut snippets = Vec::new();\n        for hit in search_response.hits {\n            let document: JsonValue = serde_json::from_str(&hit.json).map_err(|err| {\n                SearchError::Internal(format!(\n                    \"failed to serialize document `{}` to JSON: `{}`\",\n                    truncate_str(&hit.json, 100),\n              
      err\n                ))\n            })?;\n            documents.push(document);\n\n            if let Some(snippet_json) = hit.snippet {\n                let snippet_opt: JsonValue =\n                    serde_json::from_str(&snippet_json).map_err(|err| {\n                        SearchError::Internal(format!(\n                            \"failed to serialize snippet `{snippet_json}` to JSON: `{err}`\"\n                        ))\n                    })?;\n                snippets.push(snippet_opt);\n            }\n        }\n\n        let snippet_opt = if !snippets.is_empty() {\n            Some(snippets)\n        } else {\n            None\n        };\n\n        let aggregations_opt =\n            if let Some(aggregation_postcard) = search_response.aggregation_postcard {\n                let aggregation = AggregationResults::from_postcard(&aggregation_postcard)\n                    .map_err(|err| SearchError::Internal(err.to_string()))?;\n                Some(aggregation)\n            } else {\n                None\n            };\n\n        Ok(SearchResponseRest {\n            num_hits: search_response.num_hits,\n            hits: documents,\n            snippets: snippet_opt,\n            elapsed_time_micros: search_response.elapsed_time_micros,\n            errors: search_response.errors,\n            aggregations: aggregations_opt,\n        })\n    }\n}\n\n/// Details on how a query would be executed.\n#[derive(Serialize, Deserialize, PartialEq, Debug, utoipa::ToSchema)]\npub struct SearchPlanResponseRest {\n    /// Quickwit AST of the query.\n    #[schema(value_type = Object)]\n    pub quickwit_ast: QueryAst,\n    /// Resolved Tantivy AST of the query, according to the latest docmapping.\n    ///\n    /// It's possible older splits actually resolve to a different ast.\n    pub tantivy_ast: String,\n    /// List of splits that would be searched by this query\n    pub searched_splits: Vec<String>,\n    /// Requests expected for each split\n    
#[schema(value_type = Object)]\n    pub storage_requests: StorageRequestCount,\n}\n\n/// Number of expected storage requests, per request kind.\n///\n/// These figures do not take into account whether the data is already cached or not.\n#[derive(Serialize, Deserialize, PartialEq, Debug, Default)]\npub struct StorageRequestCount {\n    /// Number of split footers downloaded, always 1\n    pub footer: usize,\n    /// Number of fastfields downloaded\n    pub fastfield: usize,\n    /// Number of fieldnorms downloaded\n    pub fieldnorm: usize,\n    /// Number of sstables downloaded\n    pub sstable: usize,\n    /// Number of posting lists downloaded\n    pub posting: usize,\n    /// Number of position lists downloaded\n    pub position: usize,\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/service.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::str::FromStr;\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::SearcherConfig;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse quickwit_proto::search::{\n    FetchDocsRequest, FetchDocsResponse, GetKvRequest, Hit, LeafListFieldsRequest,\n    LeafListTermsRequest, LeafListTermsResponse, LeafSearchRequest, LeafSearchResponse,\n    ListFieldsRequest, ListFieldsResponse, ListTermsRequest, ListTermsResponse, PutKvRequest,\n    ReportSplitsRequest, ReportSplitsResponse, ScrollRequest, SearchPlanResponse, SearchRequest,\n    SearchResponse, SnippetRequest,\n};\nuse quickwit_storage::{\n    MemorySizedCache, QuickwitCache, SplitCache, StorageCache, StorageResolver,\n};\nuse tantivy::aggregation::AggregationLimitsGuard;\n\nuse crate::invoker::LambdaLeafSearchInvoker;\nuse crate::leaf::multi_index_leaf_search;\nuse crate::leaf_cache::{LeafSearchCache, PredicateCacheImpl};\nuse crate::list_fields::{leaf_list_fields, root_list_fields};\nuse crate::list_fields_cache::ListFieldsCache;\nuse crate::list_terms::{leaf_list_terms, root_list_terms};\nuse crate::metrics_trackers::LeafSearchMetricsFuture;\nuse crate::root::fetch_docs_phase;\nuse crate::scroll_context::{MiniKV, ScrollContext, 
ScrollKeyAndStartOffset};\nuse crate::search_permit_provider::SearchPermitProvider;\nuse crate::{ClusterClient, SearchError, fetch_docs, root_search, search_plan};\n\n#[derive(Clone)]\n/// The search service implementation.\npub struct SearchServiceImpl {\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n    cluster_client: ClusterClient,\n    searcher_context: Arc<SearcherContext>,\n    local_kv_store: MiniKV,\n}\n\n/// Trait representing a search service.\n///\n/// It mirrors the gRPC service `SearchService`, but with a more concrete\n/// error type that can be converted into an API Error.\n/// The REST API relies directly on the `SearchService`.\n/// Also, it is mockable.\n#[mockall::automock]\n#[async_trait]\npub trait SearchService: 'static + Send + Sync {\n    /// Root search API.\n    /// This RPC identifies the set of splits on which the query should run,\n    /// and dispatches the multiple calls to `LeafSearch`.\n    ///\n    /// It is also in charge of merging back the responses.\n    async fn root_search(&self, request: SearchRequest) -> crate::Result<SearchResponse>;\n\n    /// Performs a leaf search on a given set of splits.\n    ///\n    /// It is like a regular search except that:\n    /// - the node should perform the search locally instead of dispatching\n    ///   it to other nodes.\n    /// - it should be applied on the given subset of splits\n    /// - hit content is not fetched, and we instead return a so-called `PartialHit`.\n    async fn leaf_search(&self, request: LeafSearchRequest) -> crate::Result<LeafSearchResponse>;\n\n    /// Fetches the document contents from the document store.\n    /// This method takes `PartialHit`s and returns `Hit`s.\n    async fn fetch_docs(&self, request: FetchDocsRequest) -> crate::Result<FetchDocsResponse>;\n\n    /// Root list terms API.\n    /// This RPC identifies the set of splits on which the request should run,\n    /// and dispatches the multiple calls to `leaf_list_terms`.\n  
  ///\n    /// It is also in charge of merging back the responses.\n    async fn root_list_terms(&self, request: ListTermsRequest) -> crate::Result<ListTermsResponse>;\n\n    /// Performs a leaf list terms request on a given set of splits.\n    ///\n    /// It is like a root list terms request except that:\n    /// - the node should list the terms locally instead of dispatching\n    ///   the request to other nodes.\n    /// - it should be applied on the given subset of splits\n    async fn leaf_list_terms(\n        &self,\n        request: LeafListTermsRequest,\n    ) -> crate::Result<LeafListTermsResponse>;\n\n    /// Performs a scroll request.\n    async fn scroll(&self, scroll_request: ScrollRequest) -> crate::Result<SearchResponse>;\n\n    /// Stores a key-value pair in the local cache.\n    /// This operation is not distributed. The distribution logic lives in\n    /// the `ClusterClient`.\n    async fn put_kv(&self, put_kv: PutKvRequest);\n\n    /// Gets the payload associated with a key in the local cache.\n    /// See also `put_kv(..)`.\n    async fn get_kv(&self, get_kv: GetKvRequest) -> Option<Vec<u8>>;\n\n    /// Indexers call report_splits to inform searcher nodes about the presence of a split, which\n    /// would then be considered as a candidate for the searcher split cache.\n    async fn report_splits(&self, report_splits: ReportSplitsRequest) -> ReportSplitsResponse;\n\n    /// Returns the list of fields for one or more indexes.\n    async fn root_list_fields(\n        &self,\n        list_fields: ListFieldsRequest,\n    ) -> crate::Result<ListFieldsResponse>;\n\n    /// Returns the list of fields for one index.\n    async fn leaf_list_fields(\n        &self,\n        list_fields: LeafListFieldsRequest,\n    ) -> crate::Result<ListFieldsResponse>;\n\n    /// Describes how a search would be processed.\n    async fn search_plan(&self, request: SearchRequest) -> 
crate::Result<SearchPlanResponse>;\n}\n\nimpl SearchServiceImpl {\n    /// Creates a new search service.\n    pub fn new(\n        metastore: MetastoreServiceClient,\n        storage_resolver: StorageResolver,\n        cluster_client: ClusterClient,\n        searcher_context: Arc<SearcherContext>,\n    ) -> Self {\n        SearchServiceImpl {\n            metastore,\n            storage_resolver,\n            cluster_client,\n            searcher_context,\n            local_kv_store: MiniKV::default(),\n        }\n    }\n}\n\n/// Deserializes a JSON-encoded doc mapper string into an `Arc<DocMapper>`.\npub(crate) fn deserialize_doc_mapper(doc_mapper_str: &str) -> crate::Result<Arc<DocMapper>> {\n    let doc_mapper = serde_json::from_str::<Arc<DocMapper>>(doc_mapper_str).map_err(|err| {\n        SearchError::Internal(format!(\"failed to deserialize doc mapper: `{err}`\"))\n    })?;\n    Ok(doc_mapper)\n}\n\n#[async_trait]\nimpl SearchService for SearchServiceImpl {\n    async fn root_search(&self, search_request: SearchRequest) -> crate::Result<SearchResponse> {\n        let search_result = root_search(\n            &self.searcher_context,\n            search_request,\n            self.metastore.clone(),\n            &self.cluster_client,\n        )\n        .await?;\n        Ok(search_result)\n    }\n\n    async fn leaf_search(\n        &self,\n        leaf_search_request: LeafSearchRequest,\n    ) -> crate::Result<LeafSearchResponse> {\n        // Check leaf_search_request existence before tracing with `instrument` call.\n        if leaf_search_request.search_request.is_none() {\n            return Err(SearchError::Internal(\"no search request\".to_string()));\n        }\n        let num_splits = leaf_search_request\n            .leaf_requests\n            .iter()\n            .map(|req| req.split_offsets.len())\n            .sum::<usize>();\n\n        let tracked_future = LeafSearchMetricsFuture {\n            tracked: multi_index_leaf_search(\n                
self.searcher_context.clone(),\n                leaf_search_request,\n                self.storage_resolver.clone(),\n            ),\n            start: Instant::now(),\n            targeted_splits: num_splits,\n            status: None,\n        };\n        let timeout = self.searcher_context.searcher_config.request_timeout();\n        tokio::time::timeout(timeout, tracked_future).await?\n    }\n\n    async fn fetch_docs(\n        &self,\n        fetch_docs_request: FetchDocsRequest,\n    ) -> crate::Result<FetchDocsResponse> {\n        let index_uri = Uri::from_str(&fetch_docs_request.index_uri)?;\n        let storage = self.storage_resolver.resolve(&index_uri).await?;\n        let snippet_request_opt: Option<&SnippetRequest> =\n            fetch_docs_request.snippet_request.as_ref();\n        let doc_mapper = deserialize_doc_mapper(&fetch_docs_request.doc_mapper)?;\n        let fetch_docs_response = fetch_docs(\n            self.searcher_context.clone(),\n            fetch_docs_request.partial_hits,\n            storage,\n            &fetch_docs_request.split_offsets,\n            doc_mapper,\n            snippet_request_opt,\n        )\n        .await?;\n\n        Ok(fetch_docs_response)\n    }\n\n    async fn root_list_terms(\n        &self,\n        list_terms_request: ListTermsRequest,\n    ) -> crate::Result<ListTermsResponse> {\n        let search_result = root_list_terms(\n            &list_terms_request,\n            self.metastore.clone(),\n            &self.cluster_client,\n        )\n        .await?;\n\n        Ok(search_result)\n    }\n\n    async fn leaf_list_terms(\n        &self,\n        leaf_search_request: LeafListTermsRequest,\n    ) -> crate::Result<LeafListTermsResponse> {\n        let search_request = leaf_search_request\n            .list_terms_request\n            .ok_or_else(|| SearchError::Internal(\"no search request\".to_string()))?;\n        let index_uri = Uri::from_str(&leaf_search_request.index_uri)?;\n        let storage = 
self.storage_resolver.resolve(&index_uri).await?;\n        let split_ids = leaf_search_request.split_offsets;\n\n        let leaf_search_response = leaf_list_terms(\n            self.searcher_context.clone(),\n            &search_request,\n            storage.clone(),\n            &split_ids[..],\n        )\n        .await?;\n\n        Ok(leaf_search_response)\n    }\n\n    async fn scroll(&self, scroll_request: ScrollRequest) -> crate::Result<SearchResponse> {\n        scroll(scroll_request, &self.cluster_client, &self.searcher_context).await\n    }\n\n    async fn put_kv(&self, put_request: PutKvRequest) {\n        let ttl = Duration::from_secs(put_request.ttl_secs as u64);\n        self.local_kv_store\n            .put(put_request.key, put_request.payload, ttl)\n            .await;\n    }\n\n    async fn get_kv(&self, get_request: GetKvRequest) -> Option<Vec<u8>> {\n        let payload: Vec<u8> = self.local_kv_store.get(&get_request.key).await?;\n        Some(payload)\n    }\n\n    async fn report_splits(&self, report_splits: ReportSplitsRequest) -> ReportSplitsResponse {\n        if let Some(split_cache) = self.searcher_context.split_cache_opt.as_ref() {\n            split_cache.report_splits(report_splits.report_splits);\n        }\n        ReportSplitsResponse {}\n    }\n\n    async fn root_list_fields(\n        &self,\n        list_fields_req: ListFieldsRequest,\n    ) -> crate::Result<ListFieldsResponse> {\n        root_list_fields(\n            list_fields_req,\n            &self.cluster_client,\n            self.metastore.clone(),\n        )\n        .await\n    }\n\n    async fn leaf_list_fields(\n        &self,\n        list_fields_req: LeafListFieldsRequest,\n    ) -> crate::Result<ListFieldsResponse> {\n        let index_uri = Uri::from_str(&list_fields_req.index_uri)?;\n        let storage = self.storage_resolver.resolve(&index_uri).await?;\n        let index_id = list_fields_req.index_id;\n        let split_ids = list_fields_req.split_offsets;\n     
   leaf_list_fields(\n            index_id,\n            storage,\n            &self.searcher_context,\n            &split_ids[..],\n            &list_fields_req.fields,\n        )\n        .await\n    }\n\n    async fn search_plan(\n        &self,\n        search_request: SearchRequest,\n    ) -> crate::Result<SearchPlanResponse> {\n        let search_plan = search_plan(search_request, self.metastore.clone()).await?;\n        Ok(search_plan)\n    }\n}\n\npub(crate) async fn scroll(\n    scroll_request: ScrollRequest,\n    cluster_client: &ClusterClient,\n    searcher_context: &SearcherContext,\n) -> crate::Result<SearchResponse> {\n    let start = Instant::now();\n    let current_scroll = ScrollKeyAndStartOffset::from_str(&scroll_request.scroll_id)\n        .map_err(|msg| SearchError::InvalidArgument(msg.to_string()))?;\n    let start_doc = current_scroll.start_offset;\n    let scroll_key: [u8; 16] = current_scroll.scroll_key();\n    let payload = cluster_client.get_kv(&scroll_key[..]).await;\n    let payload =\n        payload.ok_or_else(|| SearchError::Internal(\"scroll key not found\".to_string()))?;\n\n    let mut scroll_context = ScrollContext::load(&payload)\n        .map_err(|_| SearchError::Internal(\"corrupted Scroll context\".to_string()))?;\n\n    let end_doc: u64 = start_doc + scroll_context.max_hits_per_page;\n\n    let mut partial_hits = Vec::new();\n    let mut scroll_context_modified = false;\n\n    let cached_results = scroll_context.get_cached_partial_hits(start_doc..end_doc);\n    partial_hits.extend_from_slice(cached_results);\n    if (partial_hits.len() as u64) < current_scroll.max_hits_per_page as u64 {\n        let search_after = partial_hits\n            .last()\n            .cloned()\n            .unwrap_or_else(|| current_scroll.search_after.clone());\n        let cursor = start_doc + partial_hits.len() as u64;\n        scroll_context\n            .load_batch_starting_at(cursor, search_after, cluster_client, searcher_context)\n            
.await?;\n        partial_hits.extend_from_slice(scroll_context.get_cached_partial_hits(cursor..end_doc));\n        scroll_context_modified = true;\n    }\n\n    // Fetch the actual documents.\n    let hits: Vec<Hit> = fetch_docs_phase(\n        &scroll_context.indexes_metas_for_leaf_search,\n        &partial_hits[..],\n        &scroll_context.split_metadatas[..],\n        &scroll_context.search_request,\n        cluster_client,\n    )\n    .await?;\n\n    let next_scroll_id = current_scroll.next_page(\n        hits.len() as u64,\n        partial_hits.last().cloned().unwrap_or_default(),\n    );\n\n    if let Some(scroll_ttl_secs) = scroll_request.scroll_ttl_secs\n        && scroll_context_modified\n    {\n        scroll_context.clear_cache_if_unneeded();\n        let payload = scroll_context.serialize();\n        let scroll_ttl = Duration::from_secs(scroll_ttl_secs as u64);\n        cluster_client\n            .put_kv(&scroll_key, &payload, scroll_ttl)\n            .await;\n    }\n\n    Ok(SearchResponse {\n        hits,\n        num_hits: scroll_context.total_num_hits,\n        elapsed_time_micros: start.elapsed().as_micros() as u64,\n        scroll_id: Some(next_scroll_id.to_string()),\n        errors: Vec::new(),\n        aggregation_postcard: None,\n        failed_splits: scroll_context.failed_splits,\n        num_successful_splits: scroll_context.num_successful_splits,\n    })\n}\n/// [`SearcherContext`] provides a common set of variables\n/// shared by a searcher instance (which instantiates a\n/// [`SearchServiceImpl`]).\npub struct SearcherContext {\n    /// Searcher config.\n    pub searcher_config: SearcherConfig,\n    /// Fast fields cache.\n    pub fast_fields_cache: Arc<dyn StorageCache>,\n    /// Counting semaphore to limit concurrent leaf search split requests.\n    pub search_permit_provider: SearchPermitProvider,\n    /// Split footer cache.\n    pub split_footer_cache: MemorySizedCache<String>,\n    /// Per-split and per-query cache.\n    pub 
leaf_search_cache: LeafSearchCache,\n    /// Per-split and per-predicate cache.\n    pub predicate_cache: Arc<PredicateCacheImpl>,\n    /// Search split cache. `None` if no split cache is configured.\n    pub split_cache_opt: Option<Arc<SplitCache>>,\n    /// List fields cache. Caches the list fields response for a given split.\n    pub list_fields_cache: ListFieldsCache,\n    /// The aggregation limits are passed to limit the memory usage.\n    /// This object is shared across all request.\n    pub aggregation_limit: AggregationLimitsGuard,\n    /// Optional Lambda invoker for offloading leaf search to serverless functions.\n    pub lambda_invoker: Option<Arc<dyn LambdaLeafSearchInvoker>>,\n}\n\nimpl std::fmt::Debug for SearcherContext {\n    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {\n        f.debug_struct(\"SearcherContext\")\n            .field(\"searcher_config\", &self.searcher_config)\n            .finish()\n    }\n}\n\nimpl SearcherContext {\n    /// Create a default SearcherContext\n    #[cfg(test)]\n    pub fn for_test() -> SearcherContext {\n        let searcher_config = SearcherConfig::default();\n        SearcherContext::new_without_invoker(searcher_config, None)\n    }\n\n    /// Creates a new searcher context without a lambda invoker.\n    pub fn new_without_invoker(\n        searcher_config: SearcherConfig,\n        split_cache_opt: Option<Arc<SplitCache>>,\n    ) -> Self {\n        Self::new(\n            searcher_config,\n            split_cache_opt,\n            None::<Box<dyn LambdaLeafSearchInvoker>>,\n        )\n    }\n\n    /// Creates a new searcher context, given a searcher config, and an optional `SplitCache`.\n    pub fn new(\n        searcher_config: SearcherConfig,\n        split_cache_opt: Option<Arc<SplitCache>>,\n        lambda_invoker: Option<impl LambdaLeafSearchInvoker + 'static>,\n    ) -> Self {\n        let global_split_footer_cache = MemorySizedCache::from_config(\n            
&searcher_config.split_footer_cache,\n            &quickwit_storage::STORAGE_METRICS.split_footer_cache,\n        );\n        let leaf_search_split_semaphore = SearchPermitProvider::new(\n            searcher_config.max_num_concurrent_split_searches,\n            searcher_config.warmup_memory_budget,\n        );\n        let storage_long_term_cache =\n            Arc::new(QuickwitCache::new(&searcher_config.fast_field_cache));\n        let leaf_search_cache = LeafSearchCache::new(&searcher_config.partial_request_cache);\n        let predicate_cache = PredicateCacheImpl::new(&searcher_config.predicate_cache);\n        let list_fields_cache = ListFieldsCache::new(&searcher_config.partial_request_cache);\n        let aggregation_limit = AggregationLimitsGuard::new(\n            Some(searcher_config.aggregation_memory_limit.as_u64()),\n            Some(searcher_config.aggregation_bucket_limit),\n        );\n\n        let lambda_invoker =\n            lambda_invoker.map(|invoker| Arc::new(invoker) as Arc<dyn LambdaLeafSearchInvoker>);\n\n        Self {\n            searcher_config,\n            fast_fields_cache: storage_long_term_cache,\n            predicate_cache: predicate_cache.into(),\n            search_permit_provider: leaf_search_split_semaphore,\n            split_footer_cache: global_split_footer_cache,\n            leaf_search_cache,\n            list_fields_cache,\n            split_cache_opt,\n            aggregation_limit,\n            lambda_invoker,\n        }\n    }\n\n    /// Returns the shared instance to track the aggregation memory usage.\n    pub fn get_aggregation_limits(&self) -> AggregationLimitsGuard {\n        self.aggregation_limit.clone()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Ordering;\nuse std::collections::{BTreeMap, BTreeSet};\n\nuse assert_json_diff::{assert_json_eq, assert_json_include};\nuse quickwit_config::SearcherConfig;\nuse quickwit_doc_mapper::DocMapper;\nuse quickwit_doc_mapper::tag_pruning::extract_tags_from_query;\nuse quickwit_indexing::TestSandbox;\nuse quickwit_proto::search::{\n    LeafListTermsResponse, ListTermsRequest, SearchRequest, SortByValue, SortField, SortOrder,\n    SortValue, TraceId,\n};\nuse quickwit_query::query_ast::{\n    QueryAst, qast_helper, qast_json_helper, query_ast_from_user_text,\n};\nuse serde_json::{Value as JsonValue, json};\nuse tantivy::Term;\nuse tantivy::schema::OwnedValue as TantivyValue;\nuse tantivy::time::OffsetDateTime;\n\nuse self::leaf::single_doc_mapping_leaf_search;\nuse super::*;\nuse crate::find_trace_ids_collector::Span;\nuse crate::list_terms::leaf_list_terms;\nuse crate::service::SearcherContext;\nuse crate::single_node_search;\n\n#[tokio::test]\nasync fn test_single_node_simple() -> anyhow::Result<()> {\n    let index_id = \"single-node-simple-1\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n              - name: url\n                type: text\n              - name: binary\n                type: bytes\n      
  \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n    let docs = vec![\n        json!({\"title\": \"snoopy\", \"body\": \"Snoopy is an anthropomorphic beagle[5] in the comic strip...\", \"url\": \"http://snoopy\", \"binary\": \"dGhpcyBpcyBhIHRlc3Qu\"}),\n        json!({\"title\": \"beagle\", \"body\": \"The beagle is a breed of small scent hound, similar in appearance to the much larger foxhound.\", \"url\": \"http://beagle\", \"binary\": \"bWFkZSB5b3UgbG9vay4=\"}),\n    ];\n    test_sandbox.add_documents(docs.clone()).await?;\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"anthropomorphic\", &[\"body\"]),\n        max_hits: 2,\n        ..Default::default()\n    };\n    let single_node_result = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(single_node_result.num_hits, 1);\n    assert_eq!(single_node_result.hits.len(), 1);\n    let hit_json: JsonValue = serde_json::from_str(&single_node_result.hits[0].json)?;\n    let expected_json: JsonValue = json!({\"title\": \"snoopy\", \"body\": \"Snoopy is an anthropomorphic beagle[5] in the comic strip...\", \"url\": \"http://snoopy\", \"binary\": \"dGhpcyBpcyBhIHRlc3Qu\"});\n    assert_json_include!(actual: hit_json, expected: expected_json);\n    assert!(single_node_result.elapsed_time_micros > 10);\n    assert!(single_node_result.elapsed_time_micros < 1_000_000);\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_node_termset() -> anyhow::Result<()> {\n    let index_id = \"single-node-termset-1\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n              - name: url\n                type: text\n 
             - name: binary\n                type: bytes\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n    let docs = vec![\n        json!({\"title\": \"snoopy\", \"body\": \"Snoopy is an anthropomorphic beagle[5] in the comic strip...\", \"url\": \"http://snoopy\", \"binary\": \"dGhpcyBpcyBhIHRlc3Qu\"}),\n        json!({\"title\": \"beagle\", \"body\": \"The beagle is a breed of small scent hound, similar in appearance to the much larger foxhound.\", \"url\": \"http://beagle\", \"binary\": \"bWFkZSB5b3UgbG9vay4=\"}),\n    ];\n    test_sandbox.add_documents(docs.clone()).await?;\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"title: IN [beagle]\", &[]),\n        start_timestamp: None,\n        end_timestamp: None,\n        max_hits: 2,\n        start_offset: 0,\n        ..Default::default()\n    };\n    let single_node_result = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(single_node_result.num_hits, 1);\n    assert_eq!(single_node_result.hits.len(), 1);\n    let hit_json: JsonValue = serde_json::from_str(&single_node_result.hits[0].json)?;\n    let expected_json: JsonValue = json!({\"title\": \"beagle\", \"body\": \"The beagle is a breed of small scent hound, similar in appearance to the much larger foxhound.\", \"url\": \"http://beagle\", \"binary\": \"bWFkZSB5b3UgbG9vay4=\"});\n    assert_json_include!(actual: hit_json, expected: expected_json);\n    assert!(single_node_result.elapsed_time_micros > 10);\n    assert!(single_node_result.elapsed_time_micros < 1_000_000);\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_search_with_snippet() -> anyhow::Result<()> {\n    let index_id = \"single-node-with-snippet\";\n    let doc_mapping_yaml = r#\"\n       
     field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n    let docs = vec![\n        json!({\"title\": \"snoopy\", \"body\": \"Snoopy is an anthropomorphic beagle in the comic strip.\"}),\n        json!({\"title\": \"beagle\", \"body\": \"The beagle is a breed of small scent hound.\"}),\n        json!({\"title\": \"lisa\", \"body\": \"Lisa is a character in `The Simpsons` animated tv series.\"}),\n    ];\n    test_sandbox.add_documents(docs.clone()).await?;\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"beagle\", &[\"title\", \"body\"]),\n        snippet_fields: vec![\"title\".to_string(), \"body\".to_string()],\n        max_hits: 2,\n        ..Default::default()\n    };\n    let single_node_result = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(single_node_result.num_hits, 2);\n    assert_eq!(single_node_result.hits.len(), 2);\n\n    let highlight_json: JsonValue =\n        serde_json::from_str(single_node_result.hits[0].snippet.as_ref().unwrap())?;\n    let expected_json: JsonValue = json!({\n        \"title\": [\"<b>beagle</b>\"],\n        \"body\": [\"The <b>beagle</b> is a breed of small scent hound\"]\n    });\n\n    assert_json_eq!(highlight_json, expected_json);\n    let highlight_json: JsonValue =\n        serde_json::from_str(single_node_result.hits[1].snippet.as_ref().unwrap())?;\n    let expected_json: JsonValue = json!({\"title\": [], \"body\": [\"Snoopy is an anthropomorphic <b>beagle</b> in the comic strip\"]});\n    assert_json_eq!(highlight_json, expected_json);\n\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\nasync fn slop_search_and_check(\n    
test_sandbox: &TestSandbox,\n    index_id: &str,\n    query: &str,\n    expected_num_match: u64,\n) -> anyhow::Result<()> {\n    let query_ast = qast_json_helper(query, &[\"body\"]);\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast,\n        max_hits: 5,\n        ..Default::default()\n    };\n    let single_node_result = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(\n        single_node_result.num_hits, expected_num_match,\n        \"query: {query}\"\n    );\n    assert_eq!(\n        single_node_result.hits.len(),\n        expected_num_match as usize,\n        \"query: {query}\"\n    );\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_slop_queries() {\n    let index_id = \"slop-query\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n                record: position\n        \"#;\n\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"])\n        .await\n        .unwrap();\n    let docs = vec![\n        json!({\"title\": \"one\", \"body\": \"a red bike\"}),\n        json!({\"title\": \"two\", \"body\": \"a small blue bike\"}),\n        json!({\"title\": \"three\", \"body\": \"a small, rusty, and yellow bike\"}),\n        json!({\"title\": \"four\", \"body\": \"fred's small bike\"}),\n        json!({\"title\": \"five\", \"body\": \"a tiny shelter\"}),\n    ];\n    test_sandbox.add_documents(docs.clone()).await.unwrap();\n\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"small bird\\\"~2\", 0)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"red bike\\\"~2\", 1)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"small blue 
bike\\\"~3\", 1)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"small bike\\\"\", 1)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"small bike\\\"~1\", 2)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"small bike\\\"~2\", 2)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"small bike\\\"~3\", 3)\n        .await\n        .unwrap();\n    slop_search_and_check(&test_sandbox, index_id, \"\\\"tiny shelter\\\"~3\", 1)\n        .await\n        .unwrap();\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_single_node_several_splits() -> anyhow::Result<()> {\n    let index_id = \"single-node-several-splits\";\n    let doc_mapping_yaml = r#\"\n            tag_fields:\n              - \"owner\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n              - name: url\n                type: text\n              - name: owner\n                type: text\n                tokenizer: 'raw'\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n    for _ in 0..10u32 {\n        test_sandbox.add_documents(vec![\n                json!({\"title\": \"snoopy\", \"body\": \"Snoopy is an anthropomorphic beagle[5] in the comic strip...\", \"url\": \"http://snoopy\"}),\n                json!({\"title\": \"beagle\", \"body\": \"The beagle is a breed of small scent hound, similar in appearance to the much larger foxhound.\", \"url\": \"http://beagle\"}),\n            ]).await?;\n    }\n    let query_ast = query_ast_from_user_text(\"beagle\", None);\n    let query_ast_json = serde_json::to_string(&query_ast).unwrap();\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: 
query_ast_json,\n        max_hits: 6,\n        ..Default::default()\n    };\n    let single_node_result = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(single_node_result.num_hits, 20);\n    assert_eq!(single_node_result.hits.len(), 6);\n    assert!(&single_node_result.hits[0].json.contains(\"breed\"));\n    assert!(&single_node_result.hits[1].json.contains(\"Snoopy\"));\n    let hit_keys = single_node_result.hits.iter().flat_map(|hit| {\n        hit.partial_hit\n            .as_ref()\n            .map(|partial_hit| (partial_hit.split_id.as_str(), partial_hit.doc_id as i32))\n    });\n    assert!(hit_keys.is_sorted_by(|left, right| left.cmp(right) == Ordering::Greater));\n    assert!(single_node_result.elapsed_time_micros > 10);\n    assert!(single_node_result.elapsed_time_micros < 1_000_000);\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_node_filtering() -> anyhow::Result<()> {\n    let index_id = \"single-node-filtering\";\n    let doc_mapping_yaml = r#\"\n            tag_fields:\n              - owner\n            field_mappings:\n              - name: body\n                type: text\n              - name: ts\n                type: datetime\n                input_formats:\n                    - \"rfc3339\"\n                    - \"unix_timestamp\"\n                fast: true\n              - name: owner\n                type: text\n                tokenizer: raw\n            timestamp_field: ts\n            mode: lenient\n        \"#;\n    let indexing_settings_json = r#\"{}\"#;\n    let test_sandbox = TestSandbox::create(\n        index_id,\n        doc_mapping_yaml,\n        indexing_settings_json,\n        &[\"body\"],\n    )\n    .await?;\n\n    let mut docs = Vec::new();\n    let start_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    for i in 0..30 {\n        let body = format!(\"info @ 
t:{}\", i + 1);\n        docs.push(json!({\"body\": body, \"ts\": start_timestamp + i + 1}));\n    }\n    test_sandbox.add_documents(docs).await?;\n\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"info\", &[\"body\"]),\n        start_timestamp: Some(start_timestamp + 10),\n        end_timestamp: Some(start_timestamp + 20),\n        max_hits: 15,\n        sort_fields: vec![SortField {\n            field_name: \"ts\".to_string(),\n            sort_order: SortOrder::Desc as i32,\n            sort_datetime_format: None,\n        }],\n        ..Default::default()\n    };\n    let single_node_response = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(single_node_response.num_hits, 10);\n    assert_eq!(single_node_response.hits.len(), 10);\n    assert!(&single_node_response.hits[0].json.contains(\"t:19\"));\n    assert!(&single_node_response.hits[9].json.contains(\"t:10\"));\n\n    // filter on time range [i64::MIN 20[ should only hit first 19 docs because of filtering\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"info\", &[\"body\"]),\n        end_timestamp: Some(start_timestamp + 20),\n        max_hits: 25,\n        sort_fields: vec![SortField {\n            field_name: \"ts\".to_string(),\n            sort_order: SortOrder::Desc as i32,\n            sort_datetime_format: None,\n        }],\n        ..Default::default()\n    };\n    let single_node_response = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    assert_eq!(single_node_response.num_hits, 19);\n    assert_eq!(single_node_response.hits.len(), 19);\n    assert!(&single_node_response.hits[0].json.contains(\"t:19\"));\n    
assert!(&single_node_response.hits[18].json.contains(\"t:1\"));\n\n    // filter on `tag`, which is not a field in the doc mapping: the query should be rejected\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"tag:foo AND info\", &[\"body\"]),\n        max_hits: 25,\n        sort_fields: vec![SortField {\n            field_name: \"ts\".to_string(),\n            sort_order: SortOrder::Desc as i32,\n            sort_datetime_format: None,\n        }],\n        ..Default::default()\n    };\n    let single_node_response = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await;\n    assert!(single_node_response.is_err());\n    assert_eq!(\n        single_node_response.err().map(|err| err.to_string()),\n        Some(\"invalid query: field does not exist: `tag`\".to_string())\n    );\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_node_without_timestamp_with_query_start_timestamp_enabled()\n-> anyhow::Result<()> {\n    let index_id = \"single-node-no-timestamp\";\n    let doc_mapping_yaml = r#\"\n            tag_fields:\n              - owner\n            field_mappings:\n              - name: body\n                type: text\n              - name: owner\n                type: text\n                tokenizer: raw\n        \"#;\n    let indexing_settings_json = r#\"{}\"#;\n    let test_sandbox = TestSandbox::create(\n        index_id,\n        doc_mapping_yaml,\n        indexing_settings_json,\n        &[\"body\"],\n    )\n    .await?;\n\n    let mut docs = Vec::new();\n    let start_timestamp = OffsetDateTime::now_utc().unix_timestamp();\n    for i in 0..30 {\n        let body = format!(\"info @ t:{}\", i + 1);\n        docs.push(json!({ \"body\": body }));\n    }\n    test_sandbox.add_documents(docs).await?;\n\n    let search_request = SearchRequest {\n        index_id_patterns: 
vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"info\", &[\"body\"]),\n        start_timestamp: Some(start_timestamp + 10),\n        end_timestamp: Some(start_timestamp + 20),\n        max_hits: 15,\n        ..Default::default()\n    };\n    let single_node_response = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await;\n\n    assert!(single_node_response.is_err());\n    assert_eq!(\n        single_node_response.err().map(|err| err.to_string()),\n        Some(\n            \"the timestamp field is not set in index: [\\\"single-node-no-timestamp\\\"] definition \\\n             but start-timestamp or end-timestamp are set in the query\"\n                .to_string()\n        )\n    );\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\nasync fn single_node_search_sort_by_field(\n    sort_by_field: &str,\n    fieldnorms_enabled: bool,\n) -> anyhow::Result<()> {\n    let index_id = \"single-node-sorting-sort-by-\".to_string()\n        + sort_by_field\n        + \"fieldnorms-\"\n        + &fieldnorms_enabled.to_string();\n\n    let doc_mapping_with_fieldnorms = r#\"\n            field_mappings:\n              - name: description\n                type: text\n                fieldnorms: true\n              - name: ts\n                type: datetime\n                fast: true\n              - name: temperature\n                type: i64\n                fast: true\n            timestamp_field: ts\n            \"#;\n\n    let doc_mapping_without_fieldnorms = r#\"\n            field_mappings:\n              - name: description\n                type: text\n              - name: ts\n                type: datetime\n                fast: true\n              - name: temperature\n                type: i64\n                fast: true\n            timestamp_field: ts\n            \"#;\n\n    let doc_mapping_yaml = if fieldnorms_enabled {\n        
doc_mapping_with_fieldnorms\n    } else {\n        doc_mapping_without_fieldnorms\n    };\n\n    let indexing_settings_json = r#\"{}\"#;\n    let test_sandbox = TestSandbox::create(\n        &index_id,\n        doc_mapping_yaml,\n        indexing_settings_json,\n        &[\"description\"],\n    )\n    .await?;\n\n    let mut docs = Vec::new();\n    let start_timestamp = 72057595;\n    for i in 0..30 {\n        let timestamp = start_timestamp + (i + 1) as i64;\n        let description = format!(\"city info-{timestamp}\");\n        docs.push(json!({\"description\": description, \"ts\": timestamp, \"temperature\": i+32}));\n    }\n    test_sandbox.add_documents(docs).await?;\n\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"city\", &[\"description\"]),\n        max_hits: 15,\n        sort_fields: vec![SortField {\n            field_name: sort_by_field.to_string(),\n            sort_order: SortOrder::Desc as i32,\n            sort_datetime_format: None,\n        }],\n        ..Default::default()\n    };\n\n    match single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await\n    {\n        Ok(single_node_response) => {\n            assert_eq!(single_node_response.num_hits, 30);\n            assert_eq!(single_node_response.hits.len(), 15);\n            assert!(\n                single_node_response.hits.windows(2).all(|hits| hits[0]\n                    .partial_hit\n                    .as_ref()\n                    .unwrap()\n                    .sort_value\n                    >= hits[1].partial_hit.as_ref().unwrap().sort_value)\n            );\n            test_sandbox.assert_quit().await;\n            Ok(())\n        }\n        Err(err) => {\n            test_sandbox.assert_quit().await;\n            Err(anyhow::Error::from(err))\n        }\n    }\n}\n\n#[tokio::test]\nasync fn 
test_single_node_sorting_with_query_fieldnorms_enabled() -> anyhow::Result<()> {\n    single_node_search_sort_by_field(\"_score\", true).await\n}\n\n#[tokio::test]\nasync fn test_single_node_sorting_with_query_fieldnorms_disabled() -> anyhow::Result<()> {\n    single_node_search_sort_by_field(\"temperature\", false).await\n}\n\n#[tokio::test]\nasync fn test_sort_bm25() {\n    let index_id = \"sort_by_bm25\".to_string();\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n                record: freq\n                fieldnorms: true\n              - name: body\n                type: text\n                record: freq\n                fieldnorms: true\n              - name: nofreq\n                type: text\n                record: basic\n                fieldnorms: true\n              - name: nofreq_nofieldnorms\n                type: text\n                fieldnorms: false\n            \"#;\n    let default_search_fields = &[\"title\", \"body\", \"nofreq\", \"nofreq_nofieldnorms\"];\n    let test_sandbox = TestSandbox::create(\n        &index_id,\n        doc_mapping_yaml,\n        \"{}\",\n        &default_search_fields[..],\n    )\n    .await\n    .unwrap();\n    let docs = vec![\n        json!({\"title\": \"one pad\", \"nofreq\": \"two pad\"}), // 0\n        json!({\"title\": \"one\", \"nofreq\": \"two\"}),         // 1\n        json!({\"title\": \"one one\", \"nofreq\": \"two two\"}), // 2\n    ];\n    test_sandbox.add_documents(docs).await.unwrap();\n    let search_hits = |query: &str| {\n        let query_ast_json = serde_json::to_string(&query_ast_from_user_text(query, None)).unwrap();\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: query_ast_json,\n            max_hits: 1_000,\n            sort_fields: vec![SortField {\n                field_name: \"_score\".to_string(),\n                sort_order: 
SortOrder::Desc as i32,\n                sort_datetime_format: None,\n            }],\n            ..Default::default()\n        };\n        let metastore = test_sandbox.metastore();\n        let storage_resolver = test_sandbox.storage_resolver();\n        async move {\n            single_node_search(search_request, metastore, storage_resolver)\n                .await\n                .unwrap()\n                .hits\n                .into_iter()\n                .map(|hit| {\n                    let partial_hit = hit.partial_hit.unwrap();\n                    let Some(SortByValue {\n                        sort_value: Some(SortValue::F64(score)),\n                    }) = partial_hit.sort_value\n                    else {\n                        panic!()\n                    };\n                    (score as f32, partial_hit.doc_id)\n                })\n                .collect()\n        }\n    };\n    {\n        let hits: Vec<(f32, u32)> = search_hits(\"title:one\").await;\n        assert_eq!(\n            &hits[..],\n            &[(0.1738279, 2), (0.15965714, 1), (0.12343242, 0)]\n        );\n    }\n    {\n        let hits: Vec<(f32, u32)> = search_hits(\"nofreq:two\").await;\n        assert_eq!(\n            &hits[..],\n            &[(0.15965714, 1), (0.12343242, 2), (0.12343242, 0)]\n        );\n    }\n    {\n        let hits: Vec<(f32, u32)> = search_hits(\"title:one nofreq:two\").await;\n        assert_eq!(\n            &hits[..],\n            &[(0.31931427, 1), (0.2972603, 2), (0.24686484, 0)]\n        );\n    }\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_sort_by_static_and_dynamic_field() {\n    let index_id = \"sort_by_dynamic_field\".to_string();\n    // In this test, we will try sorting docs by several fields.\n    // - static_i64\n    // - static_u64\n    // - dynamic_i64\n    // - dynamic_u64\n    let doc_mapping_yaml = r#\"\n            mode: dynamic\n            field_mappings:\n              - name: static_u64\n     
           type: u64\n                fast: true\n              - name: static_i64\n                type: i64\n                fast: true\n            dynamic_mapping:\n                fast: true\n                stored: true\n            \"#;\n    let test_sandbox = TestSandbox::create(&index_id, doc_mapping_yaml, \"{}\", &[])\n        .await\n        .unwrap();\n    let docs = vec![\n        // 0\n        json!({\"static_u64\": 3u64, \"dynamic_u64\": 3u64, \"static_i64\": 0i64, \"dynamic_i64\": 0i64}),\n        // 1\n        json!({\"static_u64\": 2u64, \"dynamic_u64\": 2u64, \"static_i64\": -1i64, \"dynamic_i64\": -1i64}),\n        // 2\n        json!({}),\n        // 3\n        json!({\"static_u64\": 4u64, \"dynamic_u64\": (i64::MAX as u64) + 1, \"static_i64\": 1i64, \"dynamic_i64\": 1i64}),\n    ];\n    test_sandbox.add_documents(docs).await.unwrap();\n    let search_hits = |sort_field: &str, order: SortOrder| {\n        let query_ast_json = serde_json::to_string(&QueryAst::MatchAll).unwrap();\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: query_ast_json,\n            max_hits: 1_000,\n            sort_fields: vec![SortField {\n                field_name: sort_field.to_string(),\n                sort_order: order as i32,\n                sort_datetime_format: None,\n            }],\n            ..Default::default()\n        };\n        let metastore = test_sandbox.metastore();\n        let storage_resolver = test_sandbox.storage_resolver();\n        async move {\n            let search_resp = single_node_search(search_request, metastore, storage_resolver)\n                .await\n                .unwrap();\n            assert_eq!(search_resp.num_hits, 4);\n            search_resp\n                .hits\n                .into_iter()\n                .map(|hit| {\n                    let partial_hit = hit.partial_hit.unwrap();\n                    partial_hit.doc_id\n              
  })\n                .collect::<Vec<u32>>()\n        }\n    };\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"static_u64\", SortOrder::Desc).await;\n        assert_eq!(&ordered_docs[..], &[3, 0, 1, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"static_u64\", SortOrder::Asc).await;\n        assert_eq!(&ordered_docs[..], &[1, 0, 3, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"static_i64\", SortOrder::Desc).await;\n        assert_eq!(&ordered_docs[..], &[3, 0, 1, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"static_i64\", SortOrder::Asc).await;\n        assert_eq!(&ordered_docs[..], &[1, 0, 3, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"dynamic_u64\", SortOrder::Desc).await;\n        assert_eq!(&ordered_docs[..], &[3, 0, 1, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"dynamic_u64\", SortOrder::Asc).await;\n        assert_eq!(&ordered_docs[..], &[1, 0, 3, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"dynamic_i64\", SortOrder::Desc).await;\n        assert_eq!(&ordered_docs[..], &[3, 0, 1, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> = search_hits(\"dynamic_i64\", SortOrder::Asc).await;\n        assert_eq!(&ordered_docs[..], &[1, 0, 3, 2]);\n    }\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_sort_by_2_field() {\n    let index_id = \"sort_by_dynamic_field\".to_string();\n    // In this test, we will try sorting docs by several fields.\n    // - static_u64\n    // - dynamic_u64\n    let doc_mapping_yaml = r#\"\n            mode: dynamic\n            field_mappings:\n              - name: static_u64\n                type: u64\n                fast: true\n            dynamic_mapping:\n                fast: true\n                stored: true\n            \"#;\n    let test_sandbox = TestSandbox::create(&index_id, doc_mapping_yaml, \"{}\", &[])\n        .await\n        
.unwrap();\n    let docs = vec![\n        // 0\n        json!({\"static_u64\": 3u64, \"dynamic_u64\": 3u64}),\n        // 1\n        json!({\"static_u64\": 3u64, \"dynamic_u64\": 2u64}),\n        // 2\n        json!({}),\n        // 3\n        json!({\"dynamic_u64\": 2u64}),\n        // 4\n        json!({\"static_u64\": 4u64, \"dynamic_u64\": (i64::MAX as u64) + 1}),\n    ];\n    test_sandbox.add_documents(docs).await.unwrap();\n    let search_hits =\n        |sort_field1: &str, order1: SortOrder, sort_field2: &str, order2: SortOrder| {\n            let query_ast_json = serde_json::to_string(&QueryAst::MatchAll).unwrap();\n            let search_request = SearchRequest {\n                index_id_patterns: vec![index_id.to_string()],\n                query_ast: query_ast_json,\n                max_hits: 1_000,\n                sort_fields: vec![\n                    SortField {\n                        field_name: sort_field1.to_string(),\n                        sort_order: order1 as i32,\n                        sort_datetime_format: None,\n                    },\n                    SortField {\n                        field_name: sort_field2.to_string(),\n                        sort_order: order2 as i32,\n                        sort_datetime_format: None,\n                    },\n                ],\n                ..Default::default()\n            };\n            let metastore = test_sandbox.metastore();\n            let storage_resolver = test_sandbox.storage_resolver();\n            async move {\n                let search_resp = single_node_search(search_request, metastore, storage_resolver)\n                    .await\n                    .unwrap();\n                assert_eq!(search_resp.num_hits, 5);\n                search_resp\n                    .hits\n                    .into_iter()\n                    .map(|hit| {\n                        let partial_hit = hit.partial_hit.unwrap();\n                        partial_hit.doc_id\n                   
 })\n                    .collect::<Vec<u32>>()\n            }\n        };\n    {\n        let ordered_docs: Vec<u32> = search_hits(\n            \"static_u64\",\n            SortOrder::Desc,\n            \"dynamic_u64\",\n            SortOrder::Desc,\n        )\n        .await;\n        assert_eq!(&ordered_docs[..], &[4, 0, 1, 3, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> =\n            search_hits(\"static_u64\", SortOrder::Desc, \"dynamic_u64\", SortOrder::Asc).await;\n        assert_eq!(&ordered_docs[..], &[4, 1, 0, 3, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> =\n            search_hits(\"static_u64\", SortOrder::Asc, \"dynamic_u64\", SortOrder::Desc).await;\n        assert_eq!(&ordered_docs[..], &[0, 1, 4, 3, 2]);\n    }\n    {\n        let ordered_docs: Vec<u32> =\n            search_hits(\"static_u64\", SortOrder::Asc, \"dynamic_u64\", SortOrder::Asc).await;\n        assert_eq!(&ordered_docs[..], &[1, 0, 4, 3, 2]);\n    }\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_single_node_invalid_sorting_with_query() {\n    let index_id = \"single-node-invalid-sorting\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: description\n                type: text\n                fast: true\n              - name: temperature\n                type: i64\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"description\"])\n        .await\n        .unwrap();\n\n    let mut docs = Vec::new();\n    for i in 0..30 {\n        let description = format!(\"city info-{}\", i + 1);\n        docs.push(json!({\"description\": description, \"ts\": i+1, \"temperature\": i+32}));\n    }\n    test_sandbox.add_documents(docs).await.unwrap();\n\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"city\", &[\"description\"]),\n        max_hits: 15,\n        sort_fields: 
vec![SortField {\n            field_name: \"description\".to_string(),\n            sort_order: SortOrder::Desc as i32,\n            sort_datetime_format: None,\n        }],\n        ..Default::default()\n    };\n    let single_node_response = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await;\n    assert!(single_node_response.is_err());\n    let error_msg = single_node_response.unwrap_err().to_string();\n    assert_eq!(\n        error_msg,\n        \"Invalid argument: sort by field on type text is currently not supported `description`\"\n    );\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_single_node_split_pruning_by_tags() -> anyhow::Result<()> {\n    let doc_mapping_yaml = r#\"\n            tag_fields:\n              - owner\n            field_mappings:\n              - name: owner\n                type: text\n                tokenizer: raw\n        \"#;\n    let index_id = \"single-node-pruning-by-tags\";\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[]).await?;\n    let index_uid = test_sandbox.index_uid();\n\n    let owners = [\"paul\", \"adrien\"];\n    for owner in owners {\n        let mut docs = Vec::new();\n        for i in 0..10 {\n            docs.push(json!({\"body\": format!(\"content num #{}\", i + 1), \"owner\": owner}));\n        }\n        test_sandbox.add_documents(docs).await?;\n    }\n\n    let query_ast: QueryAst = qast_helper(\"owner:francois\", &[]);\n\n    let selected_splits = list_relevant_splits(\n        vec![index_uid.clone()],\n        None,\n        None,\n        extract_tags_from_query(query_ast),\n        &mut test_sandbox.metastore(),\n    )\n    .await?;\n    assert!(selected_splits.is_empty());\n\n    let query_ast: QueryAst = qast_helper(\"\", &[]);\n\n    let selected_splits = list_relevant_splits(\n        vec![index_uid.clone()],\n        None,\n        None,\n        
extract_tags_from_query(query_ast),\n        &mut test_sandbox.metastore(),\n    )\n    .await?;\n    assert_eq!(selected_splits.len(), 2);\n\n    let query_ast: QueryAst = qast_helper(\"owner:francois OR owner:paul OR owner:adrien\", &[]);\n\n    let selected_splits = list_relevant_splits(\n        vec![index_uid.clone()],\n        None,\n        None,\n        extract_tags_from_query(query_ast),\n        &mut test_sandbox.metastore(),\n    )\n    .await?;\n    assert_eq!(selected_splits.len(), 2);\n    let split_tags: BTreeSet<String> = selected_splits\n        .iter()\n        .flat_map(|split| split.tags.clone())\n        .collect();\n    assert_eq!(\n        split_tags\n            .iter()\n            .map(|tag| tag.as_str())\n            .collect::<Vec<&str>>(),\n        vec![\"owner!\", \"owner:adrien\", \"owner:paul\"]\n    );\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\nasync fn test_search_util(test_sandbox: &TestSandbox, query: &str) -> Vec<u32> {\n    let splits = test_sandbox\n        .metastore()\n        .list_splits(ListSplitsRequest::try_from_index_uid(test_sandbox.index_uid()).unwrap())\n        .await\n        .unwrap()\n        .collect_splits()\n        .await\n        .unwrap();\n    let splits_offsets: Vec<_> = splits\n        .into_iter()\n        .map(|split| extract_split_and_footer_offsets(&split.split_metadata))\n        .collect();\n    let request = Arc::new(SearchRequest {\n        index_id_patterns: vec![test_sandbox.index_uid().index_id.to_string()],\n        query_ast: qast_json_helper(query, &[]),\n        max_hits: 100,\n        ..Default::default()\n    });\n    let searcher_context: Arc<SearcherContext> = Arc::new(SearcherContext::new_without_invoker(\n        SearcherConfig::default(),\n        None,\n    ));\n\n    let search_response = single_doc_mapping_leaf_search(\n        searcher_context,\n        request,\n        test_sandbox.storage(),\n        splits_offsets,\n        test_sandbox.doc_mapper(),\n    )\n 
   .await\n    .unwrap();\n\n    search_response\n        .partial_hits\n        .into_iter()\n        .map(|partial_hit| partial_hit.doc_id)\n        .collect::<Vec<u32>>()\n}\n\n#[tokio::test]\nasync fn test_search_dynamic_mode() -> anyhow::Result<()> {\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: body\n                type: text\n                tokenizer: default\n                indexed: true\n            mode: dynamic\n            dynamic_mapping:\n                tokenizer: raw\n        \"#;\n    let test_sandbox = TestSandbox::create(\"search_dynamic_mode\", doc_mapping_yaml, \"{}\", &[])\n        .await\n        .unwrap();\n    let docs = vec![\n        json!({\"body\": \"hello happy tax payer\"}),\n        json!({\"body\": \"hello\"}),\n        json!({\"body_dynamic\": \"hello happy tax payer\"}),\n        json!({\"body_dynamic\": \"hello\"}),\n    ];\n    test_sandbox.add_documents(docs).await.unwrap();\n    {\n        let docs = test_search_util(&test_sandbox, \"body:hello\").await;\n        assert_eq!(&docs[..], &[1u32, 0u32]);\n    }\n    {\n        let docs = test_search_util(&test_sandbox, \"body_dynamic:hello\").await;\n        assert_eq!(&docs[..], &[3u32]); // 1 is not matched due to the raw tokenizer\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_search_dynamic_mode_expand_dots() -> anyhow::Result<()> {\n    let doc_mapping_yaml = r#\"\n            field_mappings: []\n            mode: dynamic\n            #dynamic_mapping:\n            #  expand_dots: true -- that's the default value.\n        \"#;\n    let test_sandbox = TestSandbox::create(\n        \"search_dynamic_mode_expand_dots\",\n        doc_mapping_yaml,\n        \"{}\",\n        &[],\n    )\n    .await\n    .unwrap();\n    let docs = vec![json!({\"k8s.component.name\": \"quickwit\"})];\n    test_sandbox.add_documents(docs).await.unwrap();\n    {\n        let docs = test_search_util(&test_sandbox, 
\"k8s.component.name:quickwit\").await;\n        assert_eq!(&docs[..], &[0u32]);\n    }\n    {\n        let docs = test_search_util(&test_sandbox, r\"k8s\\.component\\.name:quickwit\").await;\n        assert_eq!(&docs[..], &[0u32]);\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_search_dynamic_mode_do_not_expand_dots() -> anyhow::Result<()> {\n    let doc_mapping_yaml = r#\"\n            field_mappings: []\n            mode: dynamic\n            dynamic_mapping:\n                expand_dots: false\n        \"#;\n    let test_sandbox = TestSandbox::create(\n        \"search_dynamic_mode_not_expand_dots\",\n        doc_mapping_yaml,\n        \"{}\",\n        &[],\n    )\n    .await\n    .unwrap();\n    let docs = vec![json!({\"k8s.component.name\": \"quickwit\"})];\n    test_sandbox.add_documents(docs).await.unwrap();\n    {\n        let docs = test_search_util(&test_sandbox, r\"k8s\\.component\\.name:quickwit\").await;\n        assert_eq!(&docs[..], &[0u32]);\n    }\n    {\n        let docs = test_search_util(&test_sandbox, r#\"k8s.component.name:quickwit\"#).await;\n        assert!(docs.is_empty());\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\nfn json_to_named_field_doc(doc_json: JsonValue) -> NamedFieldDocument {\n    assert!(doc_json.is_object());\n    let mut doc_map: BTreeMap<String, Vec<TantivyValue>> = BTreeMap::new();\n    for (key, value) in doc_json.as_object().unwrap().clone() {\n        doc_map.insert(key, json_value_to_tantivy_value(value));\n    }\n    NamedFieldDocument(doc_map)\n}\n\nfn json_value_to_tantivy_value(value: JsonValue) -> Vec<TantivyValue> {\n    match value {\n        JsonValue::Bool(val) => vec![TantivyValue::Bool(val)],\n        JsonValue::String(val) => vec![TantivyValue::Str(val)],\n        JsonValue::Array(values) => values\n            .into_iter()\n            .flat_map(json_value_to_tantivy_value)\n            .collect(),\n        JsonValue::Object(object) => {\n  
          vec![TantivyValue::Object(\n                object\n                    .into_iter()\n                    .map(|(key, val)| (key, TantivyValue::from(val)))\n                    .collect(),\n            )]\n        }\n        JsonValue::Null => Vec::new(),\n        value => vec![value.into()],\n    }\n}\n\n#[track_caller]\nfn test_convert_leaf_hit_aux(\n    default_doc_mapper_json: JsonValue,\n    document_json: JsonValue,\n    expected_hit_json: JsonValue,\n) {\n    let default_doc_mapper: DocMapper = serde_json::from_value(default_doc_mapper_json).unwrap();\n    let named_field_doc = json_to_named_field_doc(document_json);\n    let hit_json_str =\n        convert_document_to_json_string(named_field_doc, &default_doc_mapper).unwrap();\n    let hit_json: JsonValue = serde_json::from_str(&hit_json_str).unwrap();\n    assert_eq!(hit_json, expected_hit_json);\n}\n\n#[test]\nfn test_convert_leaf_hit_multiple_cardinality() {\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [\n                { \"name\": \"body\", \"type\": \"array<text>\" }\n            ],\n            \"mode\": \"lenient\"\n        }),\n        json!({ \"body\": [\"hello\", \"happy\"] }),\n        json!({ \"body\": [\"hello\", \"happy\"] }),\n    );\n}\n\n#[test]\nfn test_convert_leaf_hit_simple_cardinality() {\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [\n                { \"name\": \"body\", \"type\": \"text\" }\n            ],\n            \"mode\": \"lenient\"\n        }),\n        json!({ \"body\": [\"hello\", \"happy\"] }),\n        json!({ \"body\": \"hello\" }),\n    );\n}\n\n#[test]\nfn test_convert_dynamic() {\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [\n                { \"name\": \"body\", \"type\": \"text\" }\n            ],\n            \"mode\": \"dynamic\"\n        }),\n        json!({ \"body\": [\"hello\", \"happy\"], \"_dynamic\": [{\"title\": \"hello\"}] }),\n      
  json!({ \"body\": \"hello\", \"title\": \"hello\" }),\n    );\n}\n\n#[test]\nfn test_convert_leaf_object() {\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [\n                {\n                    \"name\": \"user\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\"name\": \"username\", \"type\": \"text\"},\n                        {\"name\": \"email\", \"type\": \"text\"}\n                    ]\n                }\n            ],\n            \"mode\": \"lenient\"\n        }),\n        json!({ \"user.username\": [\"fulmicoton\"], \"user.email\": [\"werwe33@quickwit.io\"]}),\n        json!({ \"user\": {\"username\": \"fulmicoton\", \"email\": \"werwe33@quickwit.io\"}}),\n    );\n}\n\n#[test]\nfn test_convert_leaf_object_used_to_be_dynamic() {\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [\n                {\n                    \"name\": \"user\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\"name\": \"username\", \"type\": \"text\"},\n                    ]\n                }\n            ],\n            \"mode\": \"dynamic\"\n        }),\n        json!({ \"_dynamic\": [{ \"user\": {\"username\": \"fulmicoton\", \"email\": \"werwe33@quickwit.io\"}}]}),\n        json!({ \"user\": {\"username\": \"fulmicoton\", \"email\": \"werwe33@quickwit.io\"}}),\n    );\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [\n                {\n                    \"name\": \"user\",\n                    \"type\": \"object\",\n                    \"field_mappings\": [\n                        {\"name\": \"username\", \"type\": \"text\"},\n                    ]\n                }\n            ],\n            \"mode\": \"dynamic\"\n        }),\n        json!({ \"_dynamic\": [{ \"user\": {\"email\": \"werwe33@quickwit.io\"}}], \"user.username\": 
[\"fulmicoton\"] }),\n        json!({ \"user\": {\"username\": \"fulmicoton\", \"email\": \"werwe33@quickwit.io\"}}),\n    );\n}\n\n// This spec might change in the future. The mode has no impact on the\n// output of convert_document_to_json_string. In particular, it does not ignore\n// the previously gathered dynamic field.\n#[test]\nfn test_convert_leaf_object_arguable_mode_does_not_affect_format() {\n    test_convert_leaf_hit_aux(\n        json!({ \"mode\": \"strict\" }),\n        json!({ \"_dynamic\": [{ \"user\": {\"username\": \"fulmicoton\", \"email\": \"werwe33@quickwit.io\"}}]}),\n        json!({ \"user\": {\"username\": \"fulmicoton\", \"email\": \"werwe33@quickwit.io\"}}),\n    );\n}\n\n#[test]\nfn test_convert_leaf_hit_with_source() {\n    test_convert_leaf_hit_aux(\n        json!({\n            \"field_mappings\": [ {\"name\": \"username\", \"type\": \"text\"} ],\n            \"mode\": \"strict\"\n        }),\n        json!({ \"_source\": [{\"username\": \"fulmicoton\"}], \"username\": [\"fulmicoton\"] }),\n        json!({ \"username\": \"fulmicoton\", \"_source\": {\"username\": \"fulmicoton\"}}),\n    );\n}\n\n#[tokio::test]\nasync fn test_single_node_aggregation() -> anyhow::Result<()> {\n    let index_id = \"single-node-agg-1\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: color\n                type: text\n                fast: true\n              - name: price\n                type: f64\n                fast: true\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"color\"]).await?;\n    let docs = vec![\n        json!({\"color\": \"blue\", \"price\": 10.0}),\n        json!({\"color\": \"blue\", \"price\": 15.0}),\n        json!({\"color\": \"green\", \"price\": 10.0}),\n        json!({\"color\": \"white\", \"price\": 100.0}),\n        json!({\"color\": \"white\", \"price\": 1.0}),\n    ];\n    let agg_req = r#\"\n {\n   \"expensive_colors\": {\n     \"terms\": 
{\n       \"field\": \"color\",\n       \"order\": {\n            \"price_stats.max\": \"desc\"\n       }\n     },\n     \"aggs\": {\n       \"price_stats\" : {\n          \"stats\": {\n              \"field\": \"price\"\n          }\n       }\n     }\n   }\n }\"#;\n\n    test_sandbox.add_documents(docs.clone()).await?;\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"*\", &[]),\n        max_hits: 2,\n        aggregation_request: Some(agg_req.to_string()),\n        ..Default::default()\n    };\n    let single_node_result = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await?;\n    let agg_res_struct =\n        AggregationResults::from_postcard(&single_node_result.aggregation_postcard.unwrap())?;\n    let agg_res_json = serde_json::to_string(&agg_res_struct)?;\n    let agg_res_parsed_json: JsonValue = serde_json::from_str(&agg_res_json)?;\n    assert_eq!(\n        agg_res_parsed_json[\"expensive_colors\"][\"buckets\"][0][\"key\"],\n        \"white\"\n    );\n    assert_eq!(\n        agg_res_parsed_json[\"expensive_colors\"][\"buckets\"][1][\"key\"],\n        \"blue\"\n    );\n    assert_eq!(\n        agg_res_parsed_json[\"expensive_colors\"][\"buckets\"][2][\"key\"],\n        \"green\"\n    );\n    assert!(single_node_result.elapsed_time_micros > 10);\n    assert!(single_node_result.elapsed_time_micros < 1_000_000);\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_node_aggregation_missing_fast_field() {\n    let index_id = \"single-node-agg-2\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: color\n                type: text\n              - name: price\n                type: f64\n                fast: true\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", 
&[\"color\"])\n        .await\n        .unwrap();\n    let docs = vec![\n        json!({\"color\": \"blue\", \"price\": 10.0}),\n        json!({\"color\": \"blue\", \"price\": 15.0}),\n        json!({\"color\": \"green\", \"price\": 10.0}),\n        json!({\"color\": \"white\", \"price\": 100.0}),\n        json!({\"color\": \"white\", \"price\": 1.0}),\n    ];\n    let agg_req = r#\"\n {\n   \"expensive_colors\": {\n     \"terms\": {\n       \"field\": \"color\",\n       \"order\": {\n            \"price_stats.max\": \"desc\"\n       }\n     },\n     \"aggs\": {\n       \"price_stats\" : {\n          \"stats\": {\n              \"field\": \"price\"\n          }\n       }\n     }\n   }\n }\"#;\n\n    test_sandbox.add_documents(docs.clone()).await.unwrap();\n    let search_request = SearchRequest {\n        index_id_patterns: vec![index_id.to_string()],\n        query_ast: qast_json_helper(\"*\", &[]),\n        max_hits: 2,\n        aggregation_request: Some(agg_req.to_string()),\n        ..Default::default()\n    };\n    let single_node_error = single_node_search(\n        search_request,\n        test_sandbox.metastore(),\n        test_sandbox.storage_resolver(),\n    )\n    .await\n    .unwrap_err();\n    let SearchError::InvalidArgument(error_msg) = single_node_error else {\n        panic!();\n    };\n    assert!(error_msg.contains(\"Field \\\"color\\\" is not configured as a fast field\"));\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_single_node_with_ip_field() -> anyhow::Result<()> {\n    let index_id = \"single-node-with-ip-field\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: log\n                type: text\n              - name: host\n                type: ip\n        \"#;\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"log\"]).await?;\n    let docs = vec![\n        json!({\"log\": \"User not found\", \"host\": \"192.168.0.1\"}),\n        
json!({\"log\": \"Request failed\", \"host\": \"10.10.12.123\"}),\n        json!({\"log\": \"Request successful\", \"host\": \"10.10.11.125\"}),\n        json!({\"log\": \"Auth service error\", \"host\": \"2001:db8::1:0:0:1\"}),\n        json!({\"log\": \"Settings saved\", \"host\": \"::afff:4567:890a\"}),\n        json!({\"log\": \"Request failed\", \"host\": \"10.10.12.123\"}),\n    ];\n    test_sandbox.add_documents(docs.clone()).await?;\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"*\", &[]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 6);\n        assert_eq!(single_node_result.hits.len(), 6);\n    }\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"10.10.11.125\", &[\"host\"]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 1);\n        assert_eq!(single_node_result.hits.len(), 1);\n        let hit_json: JsonValue = serde_json::from_str(&single_node_result.hits[0].json)?;\n        let expected_json: JsonValue = json!({\"log\": \"Request successful\", \"host\": \"10.10.11.125\"});\n        assert_json_include!(actual: hit_json, expected: expected_json);\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_node_range_queries() -> anyhow::Result<()> {\n    let 
index_id = \"single-node-range-queries\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: datetime\n                type: datetime\n                fast: true\n              - name: log\n                type: text\n              - name: status_code\n                type: u64\n                fast: true\n              - name: host\n                type: ip\n                fast: true\n              - name: latency\n                type: f64\n                fast: true\n              - name: error_code\n                type: i64\n                fast: true\n        \"#;\n    let docs = vec![\n        json!({\"datetime\": \"2023-01-10T15:13:35Z\", \"log\": \"User not found\", \"status_code\": 404, \"host\": \"192.168.0.1\", \"latency\": 12.34, \"error_code\": 4}),\n        json!({\"datetime\": \"2023-01-10T15:13:36Z\", \"log\": \"Request failed\", \"status_code\": 400, \"host\": \"10.10.12.123\", \"latency\": 56.78, \"error_code\": 1}),\n        json!({\"datetime\": \"2023-01-10T15:13:37Z\", \"log\": \"Request successful\", \"status_code\": 200, \"host\": \"10.10.11.125\", \"latency\": 91.10, \"error_code\": -1}),\n        json!({\"datetime\": \"2023-01-10T15:13:38Z\", \"log\": \"Auth service error\", \"status_code\": 401, \"host\": \"2001:db8::1:0:0:1\", \"latency\": 111.12, \"error_code\": 2}),\n        json!({\"datetime\": \"2023-01-10T15:13:39Z\", \"log\": \"Settings saved\", \"status_code\": 200, \"host\": \"::afff:4567:890a\", \"latency\": 112.13, \"error_code\": -1}),\n        json!({\"datetime\": \"2023-01-10T15:13:40Z\", \"log\": \"Request failed\", \"status_code\": 400, \"host\": \"10.10.12.123\", \"latency\": 114.15, \"error_code\": 1}),\n    ];\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[\"log\"]).await?;\n    test_sandbox.add_documents(docs).await?;\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            
query_ast: qast_json_helper(\n                \"datetime:[2023-01-10T15:13:36Z TO 2023-01-10T15:13:38Z}\",\n                &[],\n            ),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 2);\n        assert_eq!(single_node_result.hits.len(), 2);\n    }\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"status_code:[400 TO 401]\", &[]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 3);\n        assert_eq!(single_node_result.hits.len(), 3);\n    }\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"host:[10.0.0.0 TO 10.255.255.255]\", &[]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 3);\n        assert_eq!(single_node_result.hits.len(), 3);\n    }\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"latency:[100 TO *]\", &[]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = 
single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 3);\n        assert_eq!(single_node_result.hits.len(), 3);\n    }\n    {\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"error_code:[-1 TO 1]\", &[]),\n            max_hits: 10,\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await?;\n        assert_eq!(single_node_result.num_hits, 4);\n        assert_eq!(single_node_result.hits.len(), 4);\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[allow(deprecated)]\nfn collect_str_terms(response: LeafListTermsResponse) -> Vec<String> {\n    response\n        .terms\n        .into_iter()\n        .map(|term| Term::wrap(&term).value().as_str().unwrap().to_string())\n        .collect()\n}\n\n#[tokio::test]\nasync fn test_single_node_list_terms() -> anyhow::Result<()> {\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n              - name: url\n                type: text\n              - name: binary\n                type: bytes\n        \"#;\n    let test_sandbox =\n        TestSandbox::create(\"single-node-list-terms\", doc_mapping_yaml, \"{}\", &[\"body\"]).await?;\n    let docs = vec![\n        json!({\"title\": \"snoopy\", \"body\": \"Snoopy is an anthropomorphic beagle[5] in the comic strip...\", \"url\": \"http://snoopy\", \"binary\": \"dGhpcyBpcyBhIHRlc3Qu\"}),\n        json!({\"title\": \"beagle\", \"body\": \"The beagle is a breed of small scent hound, similar in appearance to the much 
larger foxhound.\", \"url\": \"http://beagle\", \"binary\": \"bWFkZSB5b3UgbG9vay4=\"}),\n    ];\n    test_sandbox.add_documents(docs).await.unwrap();\n\n    let splits = test_sandbox\n        .metastore()\n        .list_splits(ListSplitsRequest::try_from_index_uid(test_sandbox.index_uid()).unwrap())\n        .await?\n        .collect_splits()\n        .await\n        .unwrap();\n    let splits_offsets: Vec<_> = splits\n        .into_iter()\n        .map(|split| extract_split_and_footer_offsets(&split.split_metadata))\n        .collect();\n    let searcher_context = Arc::new(SearcherContext::new_without_invoker(\n        SearcherConfig::default(),\n        None,\n    ));\n\n    {\n        let request = ListTermsRequest {\n            index_id_patterns: vec![test_sandbox.index_uid().index_id.to_string()],\n            field: \"title\".to_string(),\n            start_key: None,\n            end_key: None,\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: Some(100),\n        };\n        let search_response = leaf_list_terms(\n            searcher_context.clone(),\n            &request,\n            test_sandbox.storage(),\n            &splits_offsets,\n        )\n        .await\n        .unwrap();\n        let terms = collect_str_terms(search_response);\n        assert_eq!(terms, &[\"beagle\", \"snoopy\",]);\n    }\n    {\n        let request = ListTermsRequest {\n            index_id_patterns: vec![test_sandbox.index_uid().index_id.to_string()],\n            field: \"title\".to_string(),\n            start_key: None,\n            end_key: None,\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: Some(1),\n        };\n        let search_response = leaf_list_terms(\n            searcher_context.clone(),\n            &request,\n            test_sandbox.storage(),\n            &splits_offsets,\n        )\n        .await\n        .unwrap();\n        let terms = 
collect_str_terms(search_response);\n        assert_eq!(terms, &[\"beagle\"]);\n    }\n    {\n        let request = ListTermsRequest {\n            index_id_patterns: vec![test_sandbox.index_uid().index_id.to_string()],\n            field: \"title\".to_string(),\n            start_key: Some(\"casper\".as_bytes().to_vec()),\n            end_key: None,\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: Some(100),\n        };\n        let search_response = leaf_list_terms(\n            searcher_context.clone(),\n            &request,\n            test_sandbox.storage(),\n            &splits_offsets,\n        )\n        .await\n        .unwrap();\n        let terms = collect_str_terms(search_response);\n        assert_eq!(terms, &[\"snoopy\"]);\n    }\n    {\n        let request = ListTermsRequest {\n            index_id_patterns: vec![test_sandbox.index_uid().index_id.to_string()],\n            field: \"title\".to_string(),\n            start_key: None,\n            end_key: Some(\"casper\".as_bytes().to_vec()),\n            start_timestamp: None,\n            end_timestamp: None,\n            max_hits: Some(100),\n        };\n        let search_response = leaf_list_terms(\n            searcher_context.clone(),\n            &request,\n            test_sandbox.storage(),\n            &splits_offsets,\n        )\n        .await\n        .unwrap();\n        let terms = collect_str_terms(search_response);\n        assert_eq!(terms, &[\"beagle\"]);\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[tokio::test]\nasync fn test_single_node_find_trace_ids_collector() {\n    let index_id = \"single-node-find-trace-ids-collector\";\n    let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: trace_id\n                type: bytes\n                fast: true\n                input_format: hex\n                output_format: hex\n              - name: span_timestamp_secs\n                type: 
datetime\n                fast: true\n                fast_precision: seconds\n        \"#;\n    let foo_trace_id = TraceId::new([1u8; 16]);\n    let bar_trace_id = TraceId::new([2u8; 16]);\n    let qux_trace_id = TraceId::new([3u8; 16]);\n    let baz_trace_id = TraceId::new([4u8; 16]);\n\n    let docs = vec![\n        json!({\"trace_id\": foo_trace_id, \"span_timestamp_secs\": \"2023-01-10T15:13:35Z\"}),\n        json!({\"trace_id\": foo_trace_id, \"span_timestamp_secs\": \"2023-01-10T15:13:36Z\"}),\n        json!({\"trace_id\": foo_trace_id, \"span_timestamp_secs\": \"2023-01-10T15:13:37Z\"}),\n        json!({\"trace_id\": foo_trace_id, \"span_timestamp_secs\": \"2023-01-10T15:13:38Z\"}),\n        json!({\"trace_id\": foo_trace_id, \"span_timestamp_secs\": \"2023-01-10T15:13:39Z\"}),\n        json!({\"trace_id\": foo_trace_id, \"span_timestamp_secs\": \"2023-01-10T15:13:40Z\"}),\n        json!({\"trace_id\": bar_trace_id, \"span_timestamp_secs\": \"2024-01-10T15:13:35Z\"}),\n        json!({\"trace_id\": bar_trace_id, \"span_timestamp_secs\": \"2024-01-10T15:13:40Z\"}),\n        json!({\"trace_id\": qux_trace_id, \"span_timestamp_secs\": \"2025-01-10T15:13:40Z\"}),\n        json!({\"trace_id\": qux_trace_id, \"span_timestamp_secs\": \"2025-01-10T15:13:35Z\"}),\n        json!({\"trace_id\": baz_trace_id, \"span_timestamp_secs\": \"2022-01-10T15:13:35Z\"}),\n    ];\n    let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"{}\", &[])\n        .await\n        .unwrap();\n    test_sandbox.add_documents(docs).await.unwrap();\n    {\n        let aggregations = r#\"{\n            \"num_traces\": 3,\n            \"trace_id_field_name\": \"trace_id\",\n            \"span_timestamp_field_name\": \"span_timestamp_secs\"\n        }\"#\n        .to_string();\n\n        let search_request = SearchRequest {\n            index_id_patterns: vec![index_id.to_string()],\n            query_ast: qast_json_helper(\"*\", &[]),\n            aggregation_request: 
Some(aggregations),\n            ..Default::default()\n        };\n        let single_node_result = single_node_search(\n            search_request,\n            test_sandbox.metastore(),\n            test_sandbox.storage_resolver(),\n        )\n        .await\n        .unwrap();\n        let aggregation_postcard = single_node_result.aggregation_postcard.unwrap();\n        let trace_ids: Vec<Span> = postcard::from_bytes(&aggregation_postcard).unwrap();\n        assert_eq!(trace_ids.len(), 3);\n\n        assert_eq!(trace_ids[0].trace_id, qux_trace_id);\n        assert_eq!(\n            trace_ids[0].span_timestamp.into_timestamp_secs(),\n            1736522020\n        );\n        assert_eq!(trace_ids[1].trace_id, bar_trace_id);\n        assert_eq!(\n            trace_ids[1].span_timestamp.into_timestamp_secs(),\n            1704899620\n        );\n        assert_eq!(trace_ids[2].trace_id, foo_trace_id);\n        assert_eq!(\n            trace_ids[2].span_timestamp.into_timestamp_secs(),\n            1673363620\n        );\n    }\n    test_sandbox.assert_quit().await;\n}\n\n#[tokio::test]\nasync fn test_search_in_text_field_with_custom_tokenizer() -> anyhow::Result<()> {\n    let doc_mapping_yaml = r#\"\n            tokenizers:\n              - name: custom_tokenizer\n                type: ngram\n                min_gram: 3\n                max_gram: 5\n                prefix_only: true\n            field_mappings:\n              - name: body\n                type: text\n                tokenizer: custom_tokenizer\n                indexed: true\n        \"#;\n    let test_sandbox = TestSandbox::create(\"search_custom_tokenizer\", doc_mapping_yaml, \"{}\", &[])\n        .await\n        .unwrap();\n    let docs = vec![json!({\"body\": \"hellohappy\"})];\n    test_sandbox.add_documents(docs).await.unwrap();\n    {\n        let docs = test_search_util(&test_sandbox, \"body:happy\").await;\n        assert!(&docs.is_empty());\n    }\n    {\n        let docs = 
test_search_util(&test_sandbox, \"body:hel\").await;\n        assert_eq!(&docs[..], &[0u32]);\n    }\n    test_sandbox.assert_quit().await;\n    Ok(())\n}\n\n#[test]\nfn test_global_doc_address_ser_deser() {\n    let doc_address = GlobalDocAddress {\n        split: \"split_id\".to_string(),\n        doc_addr: DocAddress {\n            segment_ord: 0,\n            doc_id: 123456,\n        },\n    };\n    let doc_address_string = doc_address.to_string();\n    let doc_address_deser: GlobalDocAddress = doc_address_string.parse().unwrap();\n    assert_eq!(doc_address_deser, doc_address);\n}\n"
  },
  {
    "path": "quickwit/quickwit-search/src/top_k_collector.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::{Ordering, Reverse};\nuse std::fmt::Debug;\nuse std::marker::PhantomData;\n\nuse quickwit_common::binary_heap::TopK;\nuse quickwit_proto::search::{PartialHit, SortOrder};\nuse quickwit_proto::types::SplitId;\nuse tantivy::{DocId, Score};\n\nuse crate::collector::{\n    HitSortingMapper, SegmentPartialHit, SegmentPartialHitSortingKey,\n    SortingFieldExtractorComponent, SortingFieldExtractorPair,\n};\n\npub trait QuickwitSegmentTopKCollector {\n    fn collect_top_k_block(&mut self, docs: &[DocId]);\n    fn collect_top_k(&mut self, doc_id: DocId, score: Score);\n    fn get_top_k(&self) -> Vec<PartialHit>;\n}\n\ntrait IntoOptionU64 {\n    #[inline]\n    fn is_unit_type() -> bool {\n        false\n    }\n    fn into_option_u64(self) -> Option<u64>;\n    fn from_option_u64(value: Option<u64>) -> Self;\n}\ntrait MinValue {\n    fn min_value() -> Self;\n}\n\nimpl IntoOptionU64 for Option<u64> {\n    #[inline]\n    fn into_option_u64(self) -> Option<u64> {\n        self\n    }\n    #[inline]\n    fn from_option_u64(value: Option<u64>) -> Self {\n        value\n    }\n}\n\nimpl MinValue for Option<u64> {\n    #[inline]\n    fn min_value() -> Self {\n        None\n    }\n}\n\nimpl IntoOptionU64 for Option<Reverse<u64>> {\n    #[inline]\n    fn into_option_u64(self) -> Option<u64> {\n        self.map(|el| el.0)\n    }\n    #[inline]\n    fn 
from_option_u64(value: Option<u64>) -> Self {\n        value.map(Reverse)\n    }\n}\nimpl MinValue for Option<Reverse<u64>> {\n    #[inline]\n    fn min_value() -> Self {\n        None\n    }\n}\n\nimpl IntoOptionU64 for () {\n    #[inline]\n    fn is_unit_type() -> bool {\n        true\n    }\n    #[inline]\n    fn into_option_u64(self) -> Option<u64> {\n        None\n    }\n    #[inline]\n    fn from_option_u64(_: Option<u64>) -> Self {}\n}\nimpl MinValue for () {\n    #[inline]\n    fn min_value() -> Self {}\n}\n\n/// Generic hit struct for top k collector.\n/// V1 and V2 are the types of the two values to sort by.\n/// They are either Option<u64> or _statically_ disabled via unit type.\n#[derive(Debug, Copy, Clone, PartialEq, Eq)]\nstruct Hit<V1, V2, const REVERSE_DOCID: bool> {\n    doc_id: DocId,\n    value1: V1,\n    value2: V2,\n}\n\nimpl<V1, V2, const REVERSE_DOCID: bool> MinValue for Hit<V1, V2, REVERSE_DOCID>\nwhere\n    V1: MinValue,\n    V2: MinValue,\n{\n    #[inline]\n    fn min_value() -> Self {\n        let doc_id = if REVERSE_DOCID {\n            DocId::MAX\n        } else {\n            DocId::MIN\n        };\n        Hit {\n            doc_id,\n            value1: V1::min_value(),\n            value2: V2::min_value(),\n        }\n    }\n}\n\nimpl<V1, V2, const REVERSE_DOCID: bool> std::fmt::Display for Hit<V1, V2, REVERSE_DOCID>\nwhere\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + Debug,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + Debug,\n{\n    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {\n        write!(\n            f,\n            \"Hit(doc_id: {}, value1: {:?}, value2: {:?})\",\n            self.doc_id, self.value1, self.value2\n        )\n    }\n}\n\nimpl<V1, V2, const REVERSE_DOCID: bool> Ord for Hit<V1, V2, REVERSE_DOCID>\nwhere\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + Debug + MinValue,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + Debug + MinValue,\n{\n    #[inline]\n    fn 
cmp(&self, other: &Self) -> Ordering {\n        let order = self.value1.cmp(&other.value1);\n        order\n            .then_with(|| self.value2.cmp(&other.value2))\n            .then_with(|| {\n                if REVERSE_DOCID {\n                    other.doc_id.cmp(&self.doc_id)\n                } else {\n                    self.doc_id.cmp(&other.doc_id)\n                }\n            })\n    }\n}\n\nimpl<V1, V2, const REVERSE_DOCID: bool> PartialOrd for Hit<V1, V2, REVERSE_DOCID>\nwhere\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + Debug + MinValue,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + Debug + MinValue,\n{\n    #[inline]\n    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl<\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue,\n    const REVERSE_DOCID: bool,\n> Hit<V1, V2, REVERSE_DOCID>\n{\n    #[inline]\n    fn into_segment_partial_hit(self) -> SegmentPartialHit {\n        SegmentPartialHit {\n            sort_value: self.value1.into_option_u64(),\n            sort_value2: self.value2.into_option_u64(),\n            doc_id: self.doc_id,\n        }\n    }\n}\n\npub fn specialized_top_k_segment_collector(\n    split_id: SplitId,\n    score_extractor: SortingFieldExtractorPair,\n    leaf_max_hits: usize,\n    segment_ord: u32,\n    search_after_option: Option<PartialHit>,\n    order1: SortOrder,\n    order2: SortOrder,\n) -> Box<dyn QuickwitSegmentTopKCollector> {\n    // TODO: Add support for search_after to the specialized collector.\n    // Eventually we may want to remove the generic collector to reduce complexity.\n    if search_after_option.is_some() || score_extractor.is_score() {\n        return Box::new(GenericQuickwitSegmentTopKCollector::new(\n            split_id,\n            score_extractor,\n            leaf_max_hits,\n            segment_ord,\n       
     search_after_option,\n            order1,\n            order2,\n        ));\n    }\n\n    let sort_first_by_ff = score_extractor.first.is_fast_field();\n    let sort_second_by_ff = score_extractor\n        .second\n        .as_ref()\n        .map(|extr| extr.is_fast_field())\n        .unwrap_or(false);\n\n    #[derive(Debug)]\n    enum SortType {\n        DocId,\n        OneFFSort,\n        TwoFFSorts,\n    }\n    let sort_type = match (sort_first_by_ff, sort_second_by_ff) {\n        (false, false) => SortType::DocId,\n        (true, false) => SortType::OneFFSort,\n        (true, true) => SortType::TwoFFSorts,\n        (false, true) => panic!(\"Internal error: Got second sort, but no first sort\"),\n    };\n    // only check order1 for OneFFSort and DocId, as it's the only sort\n    //\n    // REVERSE_DOCID is only used for SortType::DocId and SortType::OneFFSort\n    match (sort_type, order1, order2) {\n        (SortType::DocId, SortOrder::Desc, _) => {\n            Box::new(SpecializedSegmentTopKCollector::<(), (), false>::new(\n                split_id,\n                score_extractor,\n                leaf_max_hits,\n                segment_ord,\n            ))\n        }\n        (SortType::DocId, SortOrder::Asc, _) => {\n            Box::new(SpecializedSegmentTopKCollector::<(), (), true>::new(\n                split_id,\n                score_extractor,\n                leaf_max_hits,\n                segment_ord,\n            ))\n        }\n        (SortType::OneFFSort, SortOrder::Asc, SortOrder::Asc) => {\n            Box::new(SpecializedSegmentTopKCollector::<\n                Option<Reverse<u64>>,\n                (),\n                true,\n            >::new(\n                split_id, score_extractor, leaf_max_hits, segment_ord\n            ))\n        }\n        (SortType::OneFFSort, SortOrder::Desc, SortOrder::Asc) => Box::new(\n            SpecializedSegmentTopKCollector::<Option<u64>, (), false>::new(\n                split_id,\n             
   score_extractor,\n                leaf_max_hits,\n                segment_ord,\n            ),\n        ),\n        (SortType::OneFFSort, SortOrder::Asc, SortOrder::Desc) => {\n            Box::new(SpecializedSegmentTopKCollector::<\n                Option<Reverse<u64>>,\n                (),\n                true,\n            >::new(\n                split_id, score_extractor, leaf_max_hits, segment_ord\n            ))\n        }\n        (SortType::OneFFSort, SortOrder::Desc, SortOrder::Desc) => Box::new(\n            SpecializedSegmentTopKCollector::<Option<u64>, (), false>::new(\n                split_id,\n                score_extractor,\n                leaf_max_hits,\n                segment_ord,\n            ),\n        ),\n        (SortType::TwoFFSorts, SortOrder::Asc, SortOrder::Asc) => {\n            Box::new(SpecializedSegmentTopKCollector::<\n                Option<Reverse<u64>>,\n                Option<Reverse<u64>>,\n                true,\n            >::new(\n                split_id, score_extractor, leaf_max_hits, segment_ord\n            ))\n        }\n        (SortType::TwoFFSorts, SortOrder::Asc, SortOrder::Desc) => {\n            Box::new(SpecializedSegmentTopKCollector::<\n                Option<Reverse<u64>>,\n                Option<u64>,\n                true,\n            >::new(\n                split_id, score_extractor, leaf_max_hits, segment_ord\n            ))\n        }\n        (SortType::TwoFFSorts, SortOrder::Desc, SortOrder::Asc) => {\n            Box::new(SpecializedSegmentTopKCollector::<\n                Option<u64>,\n                Option<Reverse<u64>>,\n                false,\n            >::new(\n                split_id, score_extractor, leaf_max_hits, segment_ord\n            ))\n        }\n        (SortType::TwoFFSorts, SortOrder::Desc, SortOrder::Desc) => {\n            Box::new(SpecializedSegmentTopKCollector::<\n                Option<u64>,\n                Option<u64>,\n                false,\n            
>::new(\n                split_id, score_extractor, leaf_max_hits, segment_ord\n            ))\n        }\n    }\n}\n\n/// Fast Top K Computation\n///\n/// The buffer is truncated to the top_n elements when it reaches the capacity of the Vec.\n/// That means capacity has special meaning and should be carried over when cloning or serializing.\n///\n/// For TopK == 0, it will be relatively expensive.\nstruct TopKComputer<D> {\n    /// Reverses sort order to get top-semantics instead of bottom-semantics\n    buffer: Vec<Reverse<D>>,\n    top_n: usize,\n    pub(crate) threshold: D,\n}\n\n// Custom clone to keep capacity\nimpl<D: Clone> Clone for TopKComputer<D> {\n    fn clone(&self) -> Self {\n        let mut buffer_clone = Vec::with_capacity(self.buffer.capacity());\n        buffer_clone.extend(self.buffer.iter().cloned());\n\n        TopKComputer {\n            buffer: buffer_clone,\n            top_n: self.top_n,\n            threshold: self.threshold.clone(),\n        }\n    }\n}\n\nimpl<D> TopKComputer<D>\nwhere D: Ord + Copy + Debug + MinValue\n{\n    /// Create a new `TopKComputer`.\n    pub fn new(top_n: usize) -> Self {\n        // Vec cap can't be 0, since it would panic in push\n        let vec_cap = top_n.max(1) * 10;\n        TopKComputer {\n            buffer: Vec::with_capacity(vec_cap),\n            top_n,\n            threshold: D::min_value(),\n        }\n    }\n\n    /// Push a new document to the top n.\n    /// If the document is below the current threshold, it will be ignored.\n    #[inline]\n    pub fn push(&mut self, doc: D) {\n        if doc < self.threshold {\n            return;\n        }\n        if self.buffer.len() == self.buffer.capacity() {\n            let median = self.truncate_top_n();\n            self.threshold = median;\n        }\n\n        // This is faster since it avoids inlining the buffer resizing code from vec.push()\n        // (this is in the hot path)\n        // TODO: Replace with `push_within_capacity` when it's 
stabilized\n        let uninit = self.buffer.spare_capacity_mut();\n        // This cannot panic: `truncate_top_n` removes at least one element, since the capacity\n        // is always larger than `top_n`.\n        uninit[0].write(Reverse(doc));\n        // This is safe: the line above would have panicked if there were no spare capacity.\n        unsafe {\n            self.buffer.set_len(self.buffer.len() + 1);\n        }\n    }\n\n    #[inline(never)]\n    fn truncate_top_n(&mut self) -> D {\n        // Use select_nth_unstable to find the top nth score\n        let (_, median_el, _) = self.buffer.select_nth_unstable(self.top_n);\n\n        let median_score = *median_el;\n        // Remove all elements below the top_n\n        self.buffer.truncate(self.top_n);\n\n        median_score.0\n    }\n\n    /// Returns the top n elements in sorted order.\n    pub fn into_sorted_vec(mut self) -> Vec<D> {\n        if self.buffer.len() > self.top_n {\n            self.truncate_top_n();\n        }\n        self.buffer.sort_unstable();\n        self.buffer.into_iter().map(|el| el.0).collect()\n    }\n\n    /// Returns the top n elements in stored order.\n    /// Useful if you do not need the elements in sorted order,\n    /// for example when merging the results of multiple segments.\n    #[allow(dead_code)]\n    pub fn into_vec(mut self) -> Vec<D> {\n        if self.buffer.len() > self.top_n {\n            self.truncate_top_n();\n        }\n        self.buffer.into_iter().map(|el| el.0).collect()\n    }\n}\n\npub use tantivy::COLLECT_BLOCK_BUFFER_LEN;\nstruct SpecSortingFieldExtractor<V1, V2> {\n    _phantom: std::marker::PhantomData<(V1, V2)>,\n    sort_values1: Box<[Option<u64>; COLLECT_BLOCK_BUFFER_LEN]>,\n    sort_values2: Box<[Option<u64>; COLLECT_BLOCK_BUFFER_LEN]>,\n\n    pub first: SortingFieldExtractorComponent,\n    pub second: Option<SortingFieldExtractorComponent>,\n}\n\nimpl<\n    V1: Copy + PartialEq + PartialOrd + Ord + IntoOptionU64 + Debug,\n    V2: Copy + PartialEq + PartialOrd + 
Ord + IntoOptionU64 + Debug,\n> SpecSortingFieldExtractor<V1, V2>\n{\n    fn new(\n        first: SortingFieldExtractorComponent,\n        second: Option<SortingFieldExtractorComponent>,\n    ) -> Self {\n        Self {\n            _phantom: PhantomData,\n            sort_values1: vec![None; COLLECT_BLOCK_BUFFER_LEN]\n                .into_boxed_slice()\n                .try_into()\n                .unwrap(),\n            sort_values2: vec![None; COLLECT_BLOCK_BUFFER_LEN]\n                .into_boxed_slice()\n                .try_into()\n                .unwrap(),\n            first,\n            second,\n        }\n    }\n    /// Fetches the sort values for the given docs.\n    /// Does nothing when sorting by doc ID.\n    fn fetch_data(&mut self, docs: &[DocId]) {\n        self.first\n            .extract_typed_sort_values_block(docs, &mut self.sort_values1[..docs.len()]);\n        if let Some(second) = self.second.as_ref() {\n            second.extract_typed_sort_values_block(docs, &mut self.sort_values2[..docs.len()]);\n        }\n    }\n    #[inline]\n    fn iter_hits<'a, const REVERSE_DOCID: bool>(\n        &'a self,\n        docs: &'a [DocId],\n    ) -> impl Iterator<Item = Hit<V1, V2, REVERSE_DOCID>> + 'a {\n        SpecSortingFieldIter::<V1, V2, REVERSE_DOCID>::new(\n            docs,\n            &self.sort_values1,\n            &self.sort_values2,\n        )\n    }\n}\n\nstruct SpecSortingFieldIter<'a, V1, V2, const REVERSE_DOCID: bool> {\n    docs: std::slice::Iter<'a, DocId>,\n    sort_values1: std::slice::Iter<'a, Option<u64>>,\n    sort_values2: std::slice::Iter<'a, Option<u64>>,\n    _phantom: PhantomData<(V1, V2)>,\n}\n\nimpl<'a, V1, V2, const REVERSE_DOCID: bool> SpecSortingFieldIter<'a, V1, V2, REVERSE_DOCID>\nwhere\n    V1: Copy + PartialEq + PartialOrd + Ord + IntoOptionU64,\n    V2: Copy + PartialEq + PartialOrd + Ord + IntoOptionU64,\n{\n    #[inline]\n    pub fn new(\n        docs: &'a [DocId],\n        sort_values1: &'a [Option<u64>; 
COLLECT_BLOCK_BUFFER_LEN],\n        sort_values2: &'a [Option<u64>; COLLECT_BLOCK_BUFFER_LEN],\n    ) -> Self {\n        Self {\n            docs: docs.iter(),\n            sort_values1: sort_values1.iter(),\n            sort_values2: sort_values2.iter(),\n            _phantom: PhantomData,\n        }\n    }\n}\n\nimpl<V1, V2, const REVERSE_DOCID: bool> Iterator for SpecSortingFieldIter<'_, V1, V2, REVERSE_DOCID>\nwhere\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug,\n{\n    type Item = Hit<V1, V2, REVERSE_DOCID>;\n\n    #[inline]\n    fn next(&mut self) -> Option<Self::Item> {\n        let doc_id = *self.docs.next()?;\n\n        let value1 = if !V1::is_unit_type() {\n            V1::from_option_u64(*self.sort_values1.next()?)\n        } else {\n            V1::from_option_u64(None)\n        };\n\n        let value2 = if !V2::is_unit_type() {\n            V2::from_option_u64(*self.sort_values2.next()?)\n        } else {\n            V2::from_option_u64(None)\n        };\n\n        Some(Hit {\n            doc_id,\n            value1,\n            value2,\n        })\n    }\n}\n\n/// No search after handling\n/// Quickwit collector working at the scale of the segment.\nstruct SpecializedSegmentTopKCollector<\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue,\n    const REVERSE_DOCID: bool,\n> {\n    split_id: SplitId,\n    hit_fetcher: SpecSortingFieldExtractor<V1, V2>,\n    top_k_hits: TopKComputer<Hit<V1, V2, REVERSE_DOCID>>,\n    segment_ord: u32,\n}\n\nimpl<\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue + 'static,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue + 'static,\n    const REVERSE_DOCID: bool,\n> SpecializedSegmentTopKCollector<V1, V2, REVERSE_DOCID>\n{\n    
pub fn new(\n        split_id: SplitId,\n        score_extractor: SortingFieldExtractorPair,\n        leaf_max_hits: usize,\n        segment_ord: u32,\n    ) -> Self {\n        let hit_fetcher =\n            SpecSortingFieldExtractor::new(score_extractor.first, score_extractor.second);\n        let top_k_hits = TopKComputer::new(leaf_max_hits);\n        Self {\n            split_id,\n            hit_fetcher,\n            top_k_hits,\n            segment_ord,\n        }\n    }\n}\nimpl<\n    V1: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue,\n    V2: Copy + PartialEq + Eq + PartialOrd + Ord + IntoOptionU64 + Debug + MinValue,\n    const REVERSE_DOCID: bool,\n> QuickwitSegmentTopKCollector for SpecializedSegmentTopKCollector<V1, V2, REVERSE_DOCID>\n{\n    fn collect_top_k_block(&mut self, docs: &[DocId]) {\n        self.hit_fetcher.fetch_data(docs);\n        let iter = self.hit_fetcher.iter_hits::<REVERSE_DOCID>(docs);\n        for doc_id in iter {\n            self.top_k_hits.push(doc_id);\n        }\n    }\n\n    #[inline]\n    fn collect_top_k(&mut self, _doc_id: DocId, _score: Score) {\n        panic!(\"Internal Error: This collector does not support collect_top_k\");\n    }\n\n    fn get_top_k(&self) -> Vec<PartialHit> {\n        self.top_k_hits\n            .clone()\n            .into_sorted_vec()\n            .into_iter()\n            .map(|el| el.into_segment_partial_hit())\n            .map(|segment_partial_hit: SegmentPartialHit| {\n                segment_partial_hit.into_partial_hit(\n                    self.split_id.clone(),\n                    self.segment_ord,\n                    &self.hit_fetcher.first,\n                    &self.hit_fetcher.second,\n                )\n            })\n            .collect()\n    }\n}\n\n/// Quickwit collector working at the scale of the segment.\npub(crate) struct GenericQuickwitSegmentTopKCollector {\n    split_id: SplitId,\n    score_extractor: SortingFieldExtractorPair,\n    // 
PartialHits in this heap don't contain a split_id yet.\n    top_k_hits: TopK<SegmentPartialHit, SegmentPartialHitSortingKey, HitSortingMapper>,\n    segment_ord: u32,\n    search_after: Option<SearchAfterSegment>,\n    // Precomputed order for search_after for split_id and segment_ord\n    precomp_search_after_order: Ordering,\n    sort_values1: Box<[Option<u64>; COLLECT_BLOCK_BUFFER_LEN]>,\n    sort_values2: Box<[Option<u64>; COLLECT_BLOCK_BUFFER_LEN]>,\n}\n\nimpl GenericQuickwitSegmentTopKCollector {\n    pub fn new(\n        split_id: SplitId,\n        score_extractor: SortingFieldExtractorPair,\n        leaf_max_hits: usize,\n        segment_ord: u32,\n        search_after_option: Option<PartialHit>,\n        order1: SortOrder,\n        order2: SortOrder,\n    ) -> Self {\n        let sort_key_mapper = HitSortingMapper { order1, order2 };\n        let precomp_search_after_order = match &search_after_option {\n            Some(search_after) if !search_after.split_id.is_empty() => order1\n                .compare(&split_id, &search_after.split_id)\n                .then_with(|| order1.compare(&segment_ord, &search_after.segment_ord)),\n            // This value isn't actually used.\n            _ => Ordering::Equal,\n        };\n        let search_after =\n            SearchAfterSegment::new(search_after_option, order1, order2, &score_extractor);\n\n        GenericQuickwitSegmentTopKCollector {\n            split_id,\n            score_extractor,\n            top_k_hits: TopK::new(leaf_max_hits, sort_key_mapper), // Adjusted for context\n            segment_ord,\n            search_after,\n            precomp_search_after_order,\n            sort_values1: vec![None; COLLECT_BLOCK_BUFFER_LEN]\n                .into_boxed_slice()\n                .try_into()\n                .unwrap(),\n            sort_values2: vec![None; COLLECT_BLOCK_BUFFER_LEN]\n                .into_boxed_slice()\n                .try_into()\n                .unwrap(),\n        }\n    }\n    
#[inline]\n    /// Generic top k collection, that includes search_after handling\n    ///\n    /// Outside of the collector to circumvent lifetime issues.\n    fn collect_top_k_vals(\n        doc_id: DocId,\n        sort_value: Option<u64>,\n        sort_value2: Option<u64>,\n        search_after: &Option<SearchAfterSegment>,\n        precomp_search_after_order: Ordering,\n        top_k_hits: &mut TopK<SegmentPartialHit, SegmentPartialHitSortingKey, HitSortingMapper>,\n    ) {\n        if let Some(search_after) = &search_after {\n            let search_after_value1 = search_after.sort_value;\n            let search_after_value2 = search_after.sort_value2;\n            let orders = &top_k_hits.sort_key_mapper;\n            let mut cmp_result = orders\n                .order1\n                .compare_opt(&sort_value, &search_after_value1)\n                .then_with(|| {\n                    orders\n                        .order2\n                        .compare_opt(&sort_value2, &search_after_value2)\n                });\n            if search_after.compare_on_equal {\n                // TODO actually it's not first, it should be what's in _shard_doc then first then\n                // default\n                let order = orders.order1;\n                cmp_result = cmp_result\n                    .then(precomp_search_after_order)\n                    // We compare doc_id only if sort_value1, sort_value2, split_id and segment_ord\n                    // are equal.\n                    .then_with(|| order.compare(&doc_id, &search_after.doc_id))\n            }\n\n            if cmp_result != Ordering::Less {\n                return;\n            }\n        }\n\n        let hit = SegmentPartialHit {\n            sort_value,\n            sort_value2,\n            doc_id,\n        };\n        top_k_hits.add_entry(hit);\n    }\n}\nimpl QuickwitSegmentTopKCollector for GenericQuickwitSegmentTopKCollector {\n    fn collect_top_k_block(&mut self, docs: &[DocId]) {\n       
 self.score_extractor.extract_typed_sort_values(\n            docs,\n            &mut self.sort_values1[..],\n            &mut self.sort_values2[..],\n        );\n        if self.search_after.is_some() {\n            // Search after not optimized for block collection yet\n            for ((doc_id, sort_value), sort_value2) in docs\n                .iter()\n                .cloned()\n                .zip(self.sort_values1.iter().cloned())\n                .zip(self.sort_values2.iter().cloned())\n            {\n                Self::collect_top_k_vals(\n                    doc_id,\n                    sort_value,\n                    sort_value2,\n                    &self.search_after,\n                    self.precomp_search_after_order,\n                    &mut self.top_k_hits,\n                );\n            }\n        } else {\n            // Probably would make sense to check the fence against e.g. sort_values1 earlier,\n            // before creating the SegmentPartialHit.\n            //\n            // Below are different versions to avoid iterating the caches if they are unused.\n            //\n            // No sort values loaded. 
Sort only by doc_id.\n            if !self.score_extractor.first.is_fast_field() {\n                for doc_id in docs.iter().cloned() {\n                    let hit = SegmentPartialHit {\n                        sort_value: None,\n                        sort_value2: None,\n                        doc_id,\n                    };\n                    self.top_k_hits.add_entry(hit);\n                }\n                return;\n            }\n            let has_no_second_sort = !self\n                .score_extractor\n                .second\n                .as_ref()\n                .map(|extr| extr.is_fast_field())\n                .unwrap_or(false);\n            // No second sort values => We can skip iterating the second sort values cache.\n            if has_no_second_sort {\n                for (doc_id, sort_value) in\n                    docs.iter().cloned().zip(self.sort_values1.iter().cloned())\n                {\n                    let hit = SegmentPartialHit {\n                        sort_value,\n                        sort_value2: None,\n                        doc_id,\n                    };\n                    self.top_k_hits.add_entry(hit);\n                }\n                return;\n            }\n\n            for ((doc_id, sort_value), sort_value2) in docs\n                .iter()\n                .cloned()\n                .zip(self.sort_values1.iter().cloned())\n                .zip(self.sort_values2.iter().cloned())\n            {\n                let hit = SegmentPartialHit {\n                    sort_value,\n                    sort_value2,\n                    doc_id,\n                };\n                self.top_k_hits.add_entry(hit);\n            }\n        }\n    }\n\n    #[inline]\n    fn collect_top_k(&mut self, doc_id: DocId, score: Score) {\n        let (sort_value, sort_value2): (Option<u64>, Option<u64>) =\n            self.score_extractor.extract_typed_sort_value(doc_id, score);\n        Self::collect_top_k_vals(\n            
doc_id,\n            sort_value,\n            sort_value2,\n            &self.search_after,\n            self.precomp_search_after_order,\n            &mut self.top_k_hits,\n        );\n    }\n\n    fn get_top_k(&self) -> Vec<PartialHit> {\n        self.top_k_hits\n            .clone()\n            .finalize()\n            .into_iter()\n            .map(|segment_partial_hit: SegmentPartialHit| {\n                segment_partial_hit.into_partial_hit(\n                    self.split_id.clone(),\n                    self.segment_ord,\n                    &self.score_extractor.first,\n                    &self.score_extractor.second,\n                )\n            })\n            .collect()\n    }\n}\n\n/// Search After, but the sort values are converted to the u64 fast field representation.\npub(crate) struct SearchAfterSegment {\n    sort_value: Option<u64>,\n    sort_value2: Option<u64>,\n    compare_on_equal: bool,\n    doc_id: DocId,\n}\nimpl SearchAfterSegment {\n    pub fn new(\n        search_after_opt: Option<PartialHit>,\n        sort_order1: SortOrder,\n        sort_order2: SortOrder,\n        score_extractor: &SortingFieldExtractorPair,\n    ) -> Option<Self> {\n        let search_after = search_after_opt?;\n        let mut sort_value = None;\n        if let Some(search_after_sort_value) = search_after\n            .sort_value\n            .and_then(|sort_value| sort_value.sort_value)\n        {\n            if let Some(new_value) = score_extractor\n                .first\n                .convert_to_u64_ff_val(search_after_sort_value, sort_order1)\n            {\n                sort_value = Some(new_value);\n            } else {\n                // Value is out of bounds, we ignore sort_value2 and disable the whole\n                // search_after\n                return None;\n            }\n        }\n        let mut sort_value2 = None;\n        if let Some(search_after_sort_value) = search_after\n            .sort_value2\n            
.and_then(|sort_value2| sort_value2.sort_value)\n        {\n            let extractor = score_extractor\n                .second\n                .as_ref()\n                .expect(\"Internal error: Got sort_value2, but no sort extractor\");\n            if let Some(new_value) =\n                extractor.convert_to_u64_ff_val(search_after_sort_value, sort_order2)\n            {\n                sort_value2 = Some(new_value);\n            }\n        }\n        Some(Self {\n            sort_value,\n            sort_value2,\n            compare_on_equal: !search_after.split_id.is_empty(),\n            doc_id: search_after.doc_id,\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/Cargo.toml",
    "content": "[package]\nname = \"quickwit-serve\"\ndescription = \"REST API server\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbase64 = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nelasticsearch-dsl = \"0.4\"\nflate2 = { workspace = true }\nfutures = { workspace = true }\nfutures-util = { workspace = true }\nglob = { workspace = true }\nhex = { workspace = true }\nhttp = { workspace = true }\nhttp-body = { workspace = true }\nhttp-serde = { workspace = true }\nhumantime = { workspace = true }\nhyper-util = {workspace = true}\nitertools = { workspace = true }\nmime_guess = { workspace = true }\nonce_cell = { workspace = true }\npercent-encoding = { workspace = true }\npprof = { workspace = true, optional = true }\nprost = { workspace = true }\nprost-types = { workspace = true }\nregex = { workspace = true }\nrust-embed = { workspace = true }\nrustls = { workspace = true }\nrustls-pemfile = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\nserde_qs = { workspace = true }\nserde_with = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true }\ntokio-rustls = { workspace = true }\ntokio-stream = { workspace = true }\ntokio-util = { workspace = true }\ntonic = { workspace = true }\ntonic-health = { workspace = true }\ntonic-reflection = { workspace = true }\ntower = { workspace = true, features = [\"limit\"] }\ntower-http = { workspace = true }\ntracing = { workspace = true }\nutoipa = { workspace = true }\nwarp = { workspace = true, features = [\"server\"] }\nzstd = { workspace = true }\n\nquickwit-actors = { workspace = true }\nquickwit-cluster = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { 
workspace = true }\nquickwit-control-plane = { workspace = true }\nquickwit-doc-mapper = { workspace = true }\nquickwit-index-management = { workspace = true }\nquickwit-indexing = { workspace = true }\nquickwit-ingest = { workspace = true }\nquickwit-jaeger = { workspace = true }\nquickwit-janitor = { workspace = true }\nquickwit-metastore = { workspace = true }\nquickwit-opentelemetry = { workspace = true }\nquickwit-proto = { workspace = true }\nquickwit-query = { workspace = true }\nquickwit-search = { workspace = true }\nquickwit-lambda-client = { workspace = true, optional = true }\nquickwit-storage = { workspace = true }\nquickwit-telemetry = { workspace = true }\n\n[build-dependencies]\ntime = { workspace = true }\n\n[dev-dependencies]\nassert-json-diff = { workspace = true }\nhttp = { workspace = true }\nitertools = { workspace = true }\nmockall = { workspace = true }\ntempfile = { workspace = true }\ntokio = { workspace = true }\ntokio-stream = { workspace = true }\ntonic = { workspace = true }\n\nquickwit-actors = { workspace = true, features = [\"testsuite\"] }\nquickwit-cluster = { workspace = true, features = [\"testsuite\"] }\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\nquickwit-config = { workspace = true, features = [\"testsuite\"] }\nquickwit-control-plane = { workspace = true, features = [\"testsuite\"] }\nquickwit-indexing = { workspace = true, features = [\"testsuite\"] }\nquickwit-ingest = { workspace = true, features = [\"testsuite\"] }\nquickwit-janitor = { workspace = true, features = [\"testsuite\"] }\nquickwit-metastore = { workspace = true, features = [\"testsuite\"] }\nquickwit-opentelemetry = { workspace = true, features = [\"testsuite\"] }\nquickwit-proto = { workspace = true, features = [\"testsuite\"] }\nquickwit-search = { workspace = true, features = [\"testsuite\"] }\nquickwit-storage = { workspace = true, features = [\"testsuite\"] }\n\n[features]\npprof = [\n  \"dep:pprof\"\n]\njemalloc-profiled = [\n  
\"quickwit-common/jemalloc-profiled\"\n]\ntestsuite = []\nsqs-for-tests = [\n  \"quickwit-indexing/sqs\",\n  \"quickwit-indexing/sqs-test-helpers\"\n]\nlambda = [\n  \"quickwit-lambda-client\"\n]\n"
  },
  {
    "path": "quickwit/quickwit-serve/README.md",
    "content": "# quickwit-serve\n\nThis project hosts the REST, the gRPC API associated with quickwit and the react UI.\n\n## REST and gRPC API\n\nThe API is split into:\n- the search API: the normal and the stream search api;\n- the index management API: create, delete, list indexes and list splits of an index;\n- the ingest API;\n- the cluster API: expose information about the cluster, its members etc;\n- the health check API: the health check of the current node. This API is rest only at the moment.\n\nThe APIs are usually accessible both via gRPC and REST.\nThis is done consistently using the following pattern.\n\nA service async trait mimics the tonic service api, but without the `tonic`\nwrapping of the request and with a rich and specific error type instead of tonic::Status.\nThe argument and the response on the other hand are typically using protobuf object\ndirectly whenever sensible.\n\nThis service only has one implementation but is a trait for mocking purpose.\nThis service is typically exposed by another crate, specific to the API considered.\nFor instance, the search api has a `SearchService` trait, using the `SearchError`\nresponse in the `quickwit-search` crate.\n\nAn adapter then wraps this service to implement the grpc::Service\n(It simply does the wrapping of the request / results and converts errors to the tonic status.).\n\nThe rest API then relies on calling this service.\n\n```mermaid\ngraph TD\n    grpc_service[grpc::BlopService] --> |wraps| service(Go shopping)\n    rest[blop_handler] --> |calls| service(Go shopping)\n    service[BlopService]\n```\n\n## UI\n\nThe server also exposes at `/ui` all static files located in `quickwit-ui/build` directory. These static files are\nproduced by the react app build in `quickwit-ui`.\nDuring development, the server will serve the local files. When building the binary, these static files will be embedded in it.\n"
  },
  {
    "path": "quickwit/quickwit-serve/build.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::env;\nuse std::process::Command;\n\nuse time::OffsetDateTime;\nuse time::macros::format_description;\n\nfn main() {\n    println!(\n        \"cargo:rustc-env=BUILD_DATE={}\",\n        OffsetDateTime::now_utc()\n            .format(format_description!(\n                \"[year]-[month]-[day]T[hour]:[minute]:[second]Z\"\n            ))\n            .unwrap()\n    );\n    println!(\n        \"cargo:rustc-env=BUILD_PROFILE={}\",\n        env::var(\"PROFILE\").unwrap()\n    );\n    println!(\n        \"cargo:rustc-env=BUILD_TARGET={}\",\n        env::var(\"TARGET\").unwrap()\n    );\n    commit_info();\n}\n\n/// Extracts commit date, hash, and tags\nfn commit_info() {\n    // Extract commit date and hash.\n    let output_bytes = match Command::new(\"git\")\n        .arg(\"log\")\n        .arg(\"-1\")\n        .arg(\"--format=%cd %H\")\n        .arg(\"--date=format-local:%Y-%m-%dT%H:%M:%SZ\")\n        .env(\"TZ\", \"UTC0\")\n        .output()\n    {\n        Ok(output) if output.status.success() => output.stdout,\n        _ => Vec::new(),\n    };\n    let output = String::from_utf8(output_bytes).unwrap();\n    let mut parts = output.split_whitespace();\n\n    if let Some(commit_date) = parts.next() {\n        println!(\"cargo:rustc-env=QW_COMMIT_DATE={commit_date}\");\n    }\n    if let Some(commit_hash) = parts.next() {\n        
println!(\"cargo:rustc-env=QW_COMMIT_HASH={commit_hash}\");\n    }\n\n    // Extract commit tags.\n    let output_bytes = match Command::new(\"git\")\n        .arg(\"tag\")\n        .arg(\"--points-at\")\n        .arg(\"HEAD\")\n        .output()\n    {\n        Ok(output) if output.status.success() => output.stdout,\n        _ => Vec::new(),\n    };\n    let output = String::from_utf8(output_bytes).unwrap();\n    let tags = output.lines().collect::<Vec<_>>();\n    if !tags.is_empty() {\n        println!(\"cargo:rustc-env=QW_COMMIT_TAGS={}\", tags.join(\",\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/resources/tests/jaeger_ui_trace.json",
    "content": "{\n    \"traceID\": \"0000000000000001\",\n    \"spans\": [\n      {\n        \"traceID\": \"0000000000000001\",\n        \"spanID\": \"0000000000000001\",\n        \"operationName\": \"test-general-conversion\",\n        \"references\": [],\n        \"startTime\": 1485467191639875,\n        \"duration\": 5,\n        \"flags\": 0,\n        \"tags\": [],\n        \"logs\": [\n          {\n            \"timestamp\": 1485467191639875,\n            \"fields\": [\n              {\n                \"key\": \"event\",\n                \"type\": \"string\",\n                \"value\": \"some-event\"\n              }\n            ]\n          },\n          {\n            \"timestamp\": 1485467191639875,\n            \"fields\": [\n              {\n                \"key\": \"x\",\n                \"type\": \"string\",\n                \"value\": \"y\"\n              }\n            ]\n          }\n        ],\n        \"processID\": \"p1\",\n        \"warnings\": []\n      },\n      {\n        \"traceID\": \"0000000000000001\",\n        \"spanID\": \"0000000000000002\",\n        \"operationName\": \"some-operation\",\n        \"references\": [],\n        \"flags\": 0,\n        \"startTime\": 1485467191639875,\n        \"duration\": 5,\n        \"tags\": [\n          {\n            \"key\": \"peer.service\",\n            \"type\": \"string\",\n            \"value\": \"service-y\"\n          },\n          {\n            \"key\": \"peer.ipv4\",\n            \"type\": \"int64\",\n            \"value\": 23456\n          },\n          {\n            \"key\": \"error\",\n            \"type\": \"bool\",\n            \"value\": true\n          },\n          {\n            \"key\": \"temperature\",\n            \"type\": \"float64\",\n            \"value\": 72.5\n          },\n          {\n            \"key\": \"javascript_limit\",\n            \"type\": \"int64\",\n            \"value\": \"9223372036854775222\"\n          },\n          {\n            \"key\": 
\"blob\",\n            \"type\": \"binary\",\n            \"value\": \"AAAwOQ==\"\n          }\n        ],\n        \"logs\": [],\n        \"processID\": \"p1\",\n        \"warnings\": []\n      },\n      {\n        \"traceID\": \"0000000000000001\",\n        \"spanID\": \"0000000000000003\",\n        \"operationName\": \"some-operation\",\n        \"flags\": 0,\n        \"references\": [\n          {\n            \"refType\": \"CHILD_OF\",\n            \"traceID\": \"0000000000000001\",\n            \"spanID\": \"0000000000000002\"\n          }\n        ],\n        \"startTime\": 1485467191639875,\n        \"duration\": 5,\n        \"tags\": [],\n        \"logs\": [],\n        \"processID\": \"p2\",\n        \"warnings\": []\n      },\n      {\n        \"traceID\": \"0000000000000001\",\n        \"spanID\": \"0000000000000004\",\n        \"operationName\": \"reference-test\",\n        \"flags\": 0,\n        \"references\": [\n          {\n            \"refType\": \"CHILD_OF\",\n            \"traceID\": \"00000000000000ff\",\n            \"spanID\": \"00000000000000ff\"\n          },\n          {\n            \"refType\": \"CHILD_OF\",\n            \"traceID\": \"0000000000000001\",\n            \"spanID\": \"0000000000000002\"\n          },\n          {\n            \"refType\": \"FOLLOWS_FROM\",\n            \"traceID\": \"0000000000000001\",\n            \"spanID\": \"0000000000000002\"\n          }\n        ],\n        \"startTime\": 1485467191639875,\n        \"duration\": 5,\n        \"tags\": [],\n        \"logs\": [],\n        \"processID\": \"p2\",\n        \"warnings\": [\n          \"some span warning\"\n        ]\n      },\n      {\n        \"traceID\": \"0000000000000001\",\n        \"spanID\": \"0000000000000005\",\n        \"operationName\": \"preserveParentID-test\",\n        \"flags\": 0,\n        \"references\": [\n          {\n            \"refType\": \"CHILD_OF\",\n            \"traceID\": \"0000000000000001\",\n            \"spanID\": 
\"0000000000000004\"\n          }\n        ],\n        \"startTime\": 1485467191639875,\n        \"duration\": 4,\n        \"tags\": [],\n        \"logs\": [],\n        \"processID\": \"p2\",\n        \"warnings\": [\n          \"some span warning\"\n        ]\n      }\n    ],\n    \"processes\": {\n      \"p1\": {\n        \"serviceName\": \"service-x\",\n        \"key\": \"p1\",\n        \"tags\": []\n      },\n      \"p2\": {\n        \"serviceName\": \"service-y\",\n        \"key\": \"p2\",\n        \"tags\": []\n      }\n    },\n    \"warnings\": [\n    ]\n  }\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/build_info.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::OnceCell;\nuse quickwit_common::runtimes::RuntimesConfig;\nuse serde::Serialize;\n\n#[derive(Debug, Eq, PartialEq, Serialize, utoipa::ToSchema)]\npub struct BuildInfo {\n    pub build_date: &'static str,\n    pub build_profile: &'static str,\n    pub build_target: &'static str,\n    pub cargo_pkg_version: &'static str,\n    pub commit_date: &'static str,\n    pub commit_hash: &'static str,\n    pub commit_short_hash: &'static str,\n    pub commit_tags: Vec<String>,\n    pub version: String,\n}\n\nimpl BuildInfo {\n    /// Returns the properties of the binary.\n    pub fn get() -> &'static Self {\n        const UNKNOWN: &str = \"unknown\";\n\n        static INSTANCE: OnceCell<BuildInfo> = OnceCell::new();\n\n        INSTANCE.get_or_init(|| {\n            let commit_date = option_env!(\"QW_COMMIT_DATE\")\n                .filter(|commit_date| !commit_date.is_empty())\n                .unwrap_or(UNKNOWN);\n            let commit_hash = option_env!(\"QW_COMMIT_HASH\")\n                .filter(|commit_hash| !commit_hash.is_empty())\n                .unwrap_or(UNKNOWN);\n            let commit_short_hash = option_env!(\"QW_COMMIT_HASH\")\n                .filter(|commit_hash| commit_hash.len() >= 7)\n                .map(|commit_hash| &commit_hash[..7])\n                .unwrap_or(UNKNOWN);\n            let mut commit_tags: 
Vec<String> = option_env!(\"QW_COMMIT_TAGS\")\n                .map(|tags| {\n                    tags.split(',')\n                        .map(|tag| tag.trim().to_string())\n                        .filter(|tag| !tag.is_empty())\n                        .collect()\n                })\n                .unwrap_or_default();\n            commit_tags.sort();\n\n            let version = commit_tags\n                .iter()\n                .find(|tag| tag.starts_with('v'))\n                .cloned()\n                .unwrap_or_else(|| concat!(env!(\"CARGO_PKG_VERSION\"), \"-nightly\").to_string());\n\n            Self {\n                build_date: env!(\"BUILD_DATE\"),\n                build_profile: env!(\"BUILD_PROFILE\"),\n                build_target: env!(\"BUILD_TARGET\"),\n                cargo_pkg_version: env!(\"CARGO_PKG_VERSION\"),\n                commit_date,\n                commit_hash,\n                commit_short_hash,\n                commit_tags,\n                version,\n            }\n        })\n    }\n\n    pub fn get_version_text() -> String {\n        let build_info = Self::get();\n        format!(\n            \"{} ({} {} {})\",\n            build_info.cargo_pkg_version,\n            build_info.build_target,\n            build_info.commit_date,\n            build_info.commit_short_hash\n        )\n    }\n}\n\n#[derive(Debug, Eq, PartialEq, Serialize, utoipa::ToSchema)]\npub struct RuntimeInfo {\n    // Number of logical CPUs: vCPUs or hyperthreads, depending on where you are running.\n    // This is NOT necessarily the number of physical cores.\n    pub num_cpus: usize,\n    pub num_threads_blocking: usize,\n    pub num_threads_non_blocking: usize,\n}\n\nimpl RuntimeInfo {\n    /// Returns the properties of the node.\n    pub fn get() -> &'static Self {\n        static INSTANCE: OnceCell<RuntimeInfo> = OnceCell::new();\n\n        INSTANCE.get_or_init(|| {\n            let num_cpus = quickwit_common::num_cpus();\n            let 
runtimes_config = RuntimesConfig::with_num_cpus(num_cpus);\n            Self {\n                num_cpus,\n                num_threads_blocking: runtimes_config.num_threads_blocking,\n                num_threads_non_blocking: runtimes_config.num_threads_non_blocking,\n            }\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/cluster_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod rest_handler;\n\npub use rest_handler::{ClusterApi, cluster_handler};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/cluster_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::Infallible;\n\nuse quickwit_cluster::{Cluster, ClusterSnapshot, NodeIdSchema};\nuse warp::{Filter, Rejection};\n\nuse crate::format::extract_format_from_qs;\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::into_rest_api_response;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(\n    paths(get_cluster),\n    components(schemas(ClusterSnapshot, NodeIdSchema,))\n)]\npub struct ClusterApi;\n\n/// Cluster handler.\npub fn cluster_handler(\n    cluster: Cluster,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"cluster\")\n        .and(warp::path::end())\n        .and(warp::get())\n        .and(warp::path::end().map(move || cluster.clone()))\n        .then(get_cluster)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .recover(recover_fn)\n        .boxed()\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Cluster Info\",\n    path = \"/cluster\",\n    responses(\n        (status = 200, description = \"Successfully fetched cluster information.\", body = ClusterSnapshot)\n    )\n)]\n\n/// Get cluster information.\nasync fn get_cluster(cluster: Cluster) -> Result<ClusterSnapshot, Infallible> {\n    let snapshot = cluster.snapshot().await;\n    Ok(snapshot)\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/decompression.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io::Read;\nuse std::sync::OnceLock;\n\nuse bytes::Bytes;\nuse flate2::read::{MultiGzDecoder, ZlibDecoder};\nuse quickwit_common::metrics::{GaugeGuard, MEMORY_METRICS};\nuse quickwit_common::thread_pool::run_cpu_intensive;\nuse thiserror::Error;\nuse warp::Filter;\nuse warp::reject::Reject;\n\nuse crate::load_shield::{LoadShield, LoadShieldPermit};\n\nfn get_ingest_load_shield() -> &'static LoadShield {\n    static LOAD_SHIELD: OnceLock<LoadShield> = OnceLock::new();\n    LOAD_SHIELD.get_or_init(|| LoadShield::new(\"ingest\"))\n}\n\n/// There are two ways to decompress the body:\n/// - Stream the body through an async decompressor\n/// - Fetch the body and then decompress the bytes\n///\n/// The first approach lowers the latency, while the second approach is more CPU efficient.\n/// Ingesting data is usually CPU bound and there is considerable latency until the data is\n/// searchable, so the second approach is more suitable for this use case.\nasync fn decompress_body(encoding: Option<String>, body: Bytes) -> Result<Bytes, warp::Rejection> {\n    match encoding.as_deref() {\n        Some(\"identity\") => Ok(body),\n        Some(\"gzip\" | \"x-gzip\") => {\n            let decompressed = run_cpu_intensive(move || {\n                let mut decompressed = Vec::new();\n                let mut decoder = 
MultiGzDecoder::new(body.as_ref());\n                decoder\n                    .read_to_end(&mut decompressed)\n                    .map_err(|_| warp::reject::custom(CorruptedData))?;\n                Result::<_, warp::Rejection>::Ok(Bytes::from(decompressed))\n            })\n            .await\n            .map_err(|_| warp::reject::custom(CorruptedData))??;\n            Ok(decompressed)\n        }\n        Some(\"zstd\") => {\n            let decompressed = run_cpu_intensive(move || {\n                zstd::decode_all(body.as_ref())\n                    .map(Bytes::from)\n                    .map_err(|_| warp::reject::custom(CorruptedData))\n            })\n            .await\n            .map_err(|_| warp::reject::custom(CorruptedData))??;\n            Ok(decompressed)\n        }\n        Some(\"deflate\" | \"x-deflate\") => {\n            let decompressed = run_cpu_intensive(move || {\n                let mut decompressed = Vec::new();\n                ZlibDecoder::new(body.as_ref())\n                    .read_to_end(&mut decompressed)\n                    .map_err(|_| warp::reject::custom(CorruptedData))?;\n                Result::<_, warp::Rejection>::Ok(Bytes::from(decompressed))\n            })\n            .await\n            .map_err(|_| warp::reject::custom(CorruptedData))??;\n            Ok(decompressed)\n        }\n        Some(encoding) => Err(warp::reject::custom(UnsupportedEncoding(\n            encoding.to_string(),\n        ))),\n        _ => Ok(body),\n    }\n}\n\n#[derive(Debug, Error)]\n#[error(\"Error while decompressing the data\")]\npub(crate) struct CorruptedData;\n\nimpl Reject for CorruptedData {}\n\n#[derive(Debug, Error)]\n#[error(\"Unsupported Content-Encoding {}. 
Supported encodings are 'gzip', 'zstd', and 'deflate'\", self.0)]\npub(crate) struct UnsupportedEncoding(String);\n\nimpl Reject for UnsupportedEncoding {}\n\n/// Custom filter for optional decompression\npub(crate) fn get_body_bytes() -> impl Filter<Extract = (Body,), Error = warp::Rejection> + Clone {\n    warp::header::optional(\"content-encoding\")\n        .and(warp::body::bytes())\n        .and_then(|encoding: Option<String>, body: Bytes| async move {\n            let permit = get_ingest_load_shield().acquire_permit().await?;\n            decompress_body(encoding, body)\n                .await\n                .map(|content| Body::new(content, permit))\n        })\n}\n\npub(crate) struct Body {\n    pub content: Bytes,\n    _gauge_guard: GaugeGuard<'static>,\n    _permit: LoadShieldPermit,\n}\n\nimpl Body {\n    pub fn new(content: Bytes, load_shield_permit: LoadShieldPermit) -> Body {\n        let mut gauge_guard = GaugeGuard::from_gauge(&MEMORY_METRICS.in_flight.rest_server);\n        gauge_guard.add(content.len() as i64);\n        Body {\n            content,\n            _gauge_guard: gauge_guard,\n            _permit: load_shield_permit,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/delete_task_api/handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_config::build_doc_mapper;\nuse quickwit_janitor::error::JanitorError;\nuse quickwit_metastore::IndexMetadataResponseExt;\nuse quickwit_proto::metastore::{\n    DeleteQuery, DeleteTask, IndexMetadataRequest, ListDeleteTasksRequest, MetastoreResult,\n    MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::search::SearchRequest;\nuse quickwit_proto::types::{IndexId, IndexUid};\nuse quickwit_query::query_ast::{QueryAst, query_ast_from_user_text};\nuse serde::Deserialize;\nuse warp::{Filter, Rejection};\n\nuse crate::format::extract_format_from_qs;\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::with_arg;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(\n    paths(get_delete_tasks, post_delete_request),\n    components(schemas(DeleteQueryRequest, DeleteTask, DeleteQuery))\n)]\npub struct DeleteTaskApi;\n\n/// This struct represents the delete query passed to\n/// the rest API.\n#[derive(Deserialize, Debug, Eq, PartialEq, Default, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct DeleteQueryRequest {\n    /// Query text. 
The query language is that of tantivy.\n    pub query: String,\n    /// Fields to search on.\n    #[serde(rename(deserialize = \"search_field\"))]\n    #[serde(default)]\n    pub search_fields: Option<Vec<String>>,\n    /// If set, restrict delete to documents with a `timestamp >= start_timestamp`.\n    pub start_timestamp: Option<i64>,\n    /// If set, restrict delete to documents with a `timestamp < end_timestamp`.\n    pub end_timestamp: Option<i64>,\n}\n\n/// Delete query API handlers.\npub fn delete_task_api_handlers(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    get_delete_tasks_handler(metastore.clone())\n        .or(post_delete_tasks_handler(metastore.clone()))\n        .recover(recover_fn)\n        .boxed()\n}\n\npub fn get_delete_tasks_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(String / \"delete-tasks\")\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(get_delete_tasks)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Delete Tasks\",\n    path = \"/{index_id}/delete-tasks\",\n    responses(\n        (status = 200, description = \"Successfully fetched delete tasks.\", body = [DeleteTask])\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to retrieve delete tasks for.\"),\n    )\n)]\n/// Get Delete Tasks\n///\n/// Returns delete tasks in JSON format for a given `index_id`.\n// Note that `_delete_task_service_mailbox` is not used...\n// Explanation: we don't want to expose any delete tasks endpoints without a running\n// `DeleteTaskService`. 
This is ensured by requiring a `Mailbox<DeleteTaskService>` in\n// `get_delete_tasks_handler` and consequently we get the mailbox in `get_delete_tasks` signature.\npub async fn get_delete_tasks(\n    index_id: IndexId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<Vec<DeleteTask>> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    let list_delete_tasks_request = ListDeleteTasksRequest::new(index_uid, 0);\n    let delete_tasks = metastore\n        .list_delete_tasks(list_delete_tasks_request)\n        .await?\n        .delete_tasks;\n    Ok(delete_tasks)\n}\n\npub fn post_delete_tasks_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(String / \"delete-tasks\")\n        .and(warp::body::json())\n        .and(warp::post())\n        .and(with_arg(metastore))\n        .then(post_delete_request)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Delete Tasks\",\n    path = \"/{index_id}/delete-tasks\",\n    request_body = DeleteQueryRequest,\n    responses(\n        (status = 200, description = \"Successfully added a new delete task.\", body = DeleteTask)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to add the delete task to.\"),\n    )\n)]\n/// Create Delete Task\n///\n/// This operation will not be immediately executed, instead it will be added to a queue\n/// and cleaned up in the near future.\npub async fn post_delete_request(\n    index_id: IndexId,\n    delete_request: DeleteQueryRequest,\n    metastore: MetastoreServiceClient,\n) -> Result<DeleteTask, JanitorError> {\n    let index_metadata_request = 
IndexMetadataRequest::for_index_id(index_id.to_string());\n    let metadata = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?;\n    let index_uid: IndexUid = metadata.index_uid.clone();\n    let query_ast = query_ast_from_user_text(&delete_request.query, delete_request.search_fields)\n        .parse_user_query(&metadata.index_config.search_settings.default_search_fields)\n        .map_err(|err| JanitorError::InvalidDeleteQuery(err.to_string()))?;\n    let query_ast_json = serde_json::to_string(&query_ast).map_err(|_err| {\n        JanitorError::Internal(\"failed to serialize delete query AST\".to_string())\n    })?;\n    let delete_query = DeleteQuery {\n        index_uid: Some(index_uid),\n        start_timestamp: delete_request.start_timestamp,\n        end_timestamp: delete_request.end_timestamp,\n        query_ast: query_ast_json,\n    };\n    let index_config = metadata.into_index_config();\n    // TODO should it be something else than a JanitorError?\n    let doc_mapper = build_doc_mapper(&index_config.doc_mapping, &index_config.search_settings)\n        .map_err(|error| JanitorError::Internal(error.to_string()))?;\n    let delete_search_request = SearchRequest::try_from(delete_query.clone())\n        .map_err(|error| JanitorError::InvalidDeleteQuery(error.to_string()))?;\n\n    // Validate the delete query against the current doc mapping configuration.\n    let query_ast: QueryAst = serde_json::from_str(&delete_search_request.query_ast)\n        .map_err(|err| JanitorError::InvalidDeleteQuery(err.to_string()))?;\n    doc_mapper\n        .query(doc_mapper.schema(), query_ast, true, None)\n        .map_err(|error| JanitorError::InvalidDeleteQuery(error.to_string()))?;\n    let delete_task = metastore.create_delete_task(delete_query).await?;\n    Ok(delete_task)\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_indexing::TestSandbox;\n    use quickwit_proto::metastore::DeleteTask;\n    use 
warp::Filter;\n\n    use crate::rest::recover_fn;\n\n    #[tokio::test]\n    async fn test_delete_task_api() {\n        let index_id = \"test-delete-task-rest\";\n        let doc_mapping_yaml = r#\"\n            field_mappings:\n              - name: title\n                type: text\n              - name: body\n                type: text\n              - name: ts\n                type: i64\n                fast: true\n            mode: lenient\n        \"#;\n        let test_sandbox = TestSandbox::create(index_id, doc_mapping_yaml, \"\", &[\"title\"])\n            .await\n            .unwrap();\n        let metastore = test_sandbox.metastore();\n        let delete_query_api_handlers =\n            super::delete_task_api_handlers(metastore).recover(recover_fn);\n\n        // POST a delete query with explicit field name in query\n        let resp = warp::test::request()\n            .path(\"/test-delete-task-rest/delete-tasks\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"query\": \"body:myterm\", \"start_timestamp\": 1, \"end_timestamp\": 10}\"#)\n            .reply(&delete_query_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let created_delete_task: DeleteTask = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(created_delete_task.opstamp, 1);\n        let created_delete_query = created_delete_task.delete_query.unwrap();\n        assert_eq!(created_delete_query.index_uid(), &test_sandbox.index_uid());\n        assert_eq!(\n            created_delete_query.query_ast,\n            r#\"{\"type\":\"full_text\",\"field\":\"body\",\"text\":\"myterm\",\"params\":{\"mode\":{\"type\":\"phrase_fallback_to_intersection\"}},\"lenient\":false}\"#\n        );\n        assert_eq!(created_delete_query.start_timestamp, Some(1));\n        assert_eq!(created_delete_query.end_timestamp, Some(10));\n\n        // POST a delete query with specified default field\n        let resp = 
warp::test::request()\n            .path(\"/test-delete-task-rest/delete-tasks\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"query\": \"myterm\", \"start_timestamp\": 1, \"end_timestamp\": 10, \"search_field\": [\"body\"]}\"#)\n            .reply(&delete_query_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let created_delete_task: DeleteTask = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(created_delete_task.opstamp, 2);\n        let created_delete_query = created_delete_task.delete_query.unwrap();\n        assert_eq!(created_delete_query.index_uid(), &test_sandbox.index_uid());\n        assert_eq!(\n            created_delete_query.query_ast,\n            r#\"{\"type\":\"full_text\",\"field\":\"body\",\"text\":\"myterm\",\"params\":{\"mode\":{\"type\":\"phrase_fallback_to_intersection\"}},\"lenient\":false}\"#\n        );\n        assert_eq!(created_delete_query.start_timestamp, Some(1));\n        assert_eq!(created_delete_query.end_timestamp, Some(10));\n\n        // POST a delete query using the config default field\n        let resp = warp::test::request()\n            .path(\"/test-delete-task-rest/delete-tasks\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"query\": \"myterm\", \"start_timestamp\": 1, \"end_timestamp\": 10}\"#)\n            .reply(&delete_query_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let created_delete_task: DeleteTask = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(created_delete_task.opstamp, 3);\n        let created_delete_query = created_delete_task.delete_query.unwrap();\n        assert_eq!(created_delete_query.index_uid(), &test_sandbox.index_uid());\n        assert_eq!(\n            created_delete_query.query_ast,\n            
r#\"{\"type\":\"full_text\",\"field\":\"title\",\"text\":\"myterm\",\"params\":{\"mode\":{\"type\":\"phrase_fallback_to_intersection\"}},\"lenient\":false}\"#\n        );\n        assert_eq!(created_delete_query.start_timestamp, Some(1));\n        assert_eq!(created_delete_query.end_timestamp, Some(10));\n\n        // POST an invalid delete query.\n        let resp = warp::test::request()\n            .path(\"/test-delete-task-rest/delete-tasks\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"query\": \"unknown_field:test\", \"start_timestamp\": 1, \"end_timestamp\": 10}\"#)\n            .reply(&delete_query_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 400);\n        assert!(String::from_utf8_lossy(resp.body()).contains(\"invalid delete query\"));\n\n        // GET delete tasks.\n        let resp = warp::test::request()\n            .path(\"/test-delete-task-rest/delete-tasks\")\n            .reply(&delete_query_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let delete_tasks: Vec<DeleteTask> = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(delete_tasks.len(), 3);\n\n        test_sandbox.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/delete_task_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod handler;\n\npub use handler::{DeleteTaskApi, delete_task_api_handlers};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/debug.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{HashMap, HashSet};\nuse std::time::Duration;\n\nuse futures::StreamExt;\nuse futures::stream::FuturesUnordered;\nuse glob::{MatchOptions, Pattern as GlobPattern};\nuse quickwit_cluster::Cluster;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_proto::developer::{DeveloperService, DeveloperServiceClient, GetDebugInfoRequest};\nuse quickwit_proto::tonic::codec::CompressionEncoding;\nuse quickwit_proto::types::{NodeId, NodeIdRef};\nuse serde::Deserialize;\nuse serde_json::Value as JsonValue;\nuse tokio::time::timeout;\nuse tracing::error;\nuse warp::hyper::StatusCode;\nuse warp::{Filter, Rejection, Reply};\n\nuse super::DeveloperApiServer;\nuse crate::with_arg;\n\n#[derive(Deserialize)]\nstruct DebugInfoQueryParams {\n    // Comma-separated list of case insensitive node ID glob patterns to restrict the debug\n    // information to.\n    node_ids: Option<String>,\n    // Comma-separated list of roles to restrict the debug information to.\n    roles: Option<String>,\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Debug\",\n    path = \"/debug\",\n    responses(\n        (status = 200, description = \"Successfully fetched debug info.\"),\n    ),\n)]\n/// Get debug information for the nodes in the cluster.\npub(super) fn debug_handler(\n    cluster: Cluster,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = 
Rejection> + Clone {\n    warp::path(\"debug\")\n        .and(warp::path::end())\n        .and(with_arg(cluster))\n        .and(warp::query::<DebugInfoQueryParams>())\n        .then(get_node_debug_infos)\n}\n\nasync fn get_node_debug_infos(\n    cluster: Cluster,\n    query_params: DebugInfoQueryParams,\n) -> warp::reply::Response {\n    let node_id_patterns = if let Some(node_ids) = &query_params.node_ids {\n        match NodeIdGlobPatterns::try_from_comma_separated_patterns(node_ids) {\n            Ok(node_id_patterns) => node_id_patterns,\n            Err(error) => {\n                return warp::reply::with_status(\n                    format!(\n                        \"failed to parse node ID glob patterns `{}`: {error}\",\n                        query_params.node_ids.as_deref().unwrap_or(\"\")\n                    ),\n                    StatusCode::BAD_REQUEST,\n                )\n                .into_response();\n            }\n        }\n    } else {\n        NodeIdGlobPatterns::default()\n    };\n    let target_roles: HashSet<QuickwitService> = if let Some(roles) = query_params.roles {\n        let target_roles_res = roles.split(',').map(|role| role.parse()).collect();\n\n        match target_roles_res {\n            Ok(target_roles) => target_roles,\n            Err(error) => {\n                return warp::reply::with_status(\n                    format!(\"failed to parse roles `{roles}`: {error}\"),\n                    StatusCode::BAD_REQUEST,\n                )\n                .into_response();\n            }\n        }\n    } else {\n        HashSet::new()\n    };\n    let ready_nodes = cluster.ready_nodes().await;\n    let mut debug_infos: HashMap<NodeId, JsonValue> = HashMap::with_capacity(ready_nodes.len());\n\n    let mut get_debug_info_futures = FuturesUnordered::new();\n\n    for ready_node in ready_nodes {\n        if node_id_patterns.matches(ready_node.node_id()) {\n            let node_id = ready_node.node_id().to_owned();\n            
let client = DeveloperServiceClient::from_channel(\n                ready_node.grpc_advertise_addr(),\n                ready_node.channel(),\n                DeveloperApiServer::MAX_GRPC_MESSAGE_SIZE,\n                Some(CompressionEncoding::Zstd),\n            );\n            let roles = target_roles.iter().map(|role| role.to_string()).collect();\n            let request = GetDebugInfoRequest { roles };\n            let get_debug_info_future = async move {\n                let get_debug_info_res =\n                    timeout(Duration::from_secs(5), client.get_debug_info(request)).await;\n                (node_id, get_debug_info_res)\n            };\n            get_debug_info_futures.push(get_debug_info_future);\n        }\n    }\n    while let Some(get_debug_info_res) = get_debug_info_futures.next().await {\n        match get_debug_info_res {\n            (node_id, Ok(Ok(debug_info_response))) => {\n                match serde_json::from_slice(&debug_info_response.debug_info_json) {\n                    Ok(debug_info) => {\n                        debug_infos.insert(node_id, debug_info);\n                    }\n                    Err(error) => {\n                        error!(%node_id, %error, \"failed to parse JSON debug info from node\");\n                    }\n                };\n            }\n            (node_id, Ok(Err(error))) => {\n                error!(%node_id, %error, \"failed to get debug info from node\");\n            }\n            (node_id, Err(_elapsed)) => {\n                error!(%node_id, \"get debug info request timed out\");\n            }\n        }\n    }\n    warp::reply::json(&debug_infos).into_response()\n}\n\n#[derive(Debug)]\nstruct NodeIdGlobPatterns(HashSet<GlobPattern>, MatchOptions);\n\nimpl Default for NodeIdGlobPatterns {\n    fn default() -> Self {\n        let glob_patterns = HashSet::new();\n        let match_options = MatchOptions {\n            case_sensitive: false,\n            ..Default::default()\n        };\n  
      Self(glob_patterns, match_options)\n    }\n}\n\nimpl NodeIdGlobPatterns {\n    fn try_from_comma_separated_patterns(comma_separated_patterns: &str) -> anyhow::Result<Self> {\n        let glob_patterns: HashSet<GlobPattern> = comma_separated_patterns\n            .split(',')\n            .filter(|pattern| !pattern.is_empty())\n            .map(GlobPattern::new)\n            .collect::<Result<_, _>>()?;\n        let match_options = MatchOptions {\n            case_sensitive: false,\n            ..Default::default()\n        };\n        Ok(Self(glob_patterns, match_options))\n    }\n\n    fn matches(&self, node_id: &NodeIdRef) -> bool {\n        if self.0.is_empty() {\n            return true;\n        }\n        self.0\n            .iter()\n            .any(|pattern| pattern.matches_with(node_id.as_str(), self.1))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_developer_api_debug_handler() {\n        let peer_seeds = Vec::new();\n        let transport = ChannelTransport::default();\n        let self_node_readiness = true;\n        let cluster = create_cluster_for_test(\n            peer_seeds,\n            &[\"control-plane\"],\n            &transport,\n            self_node_readiness,\n        )\n        .await\n        .unwrap();\n\n        let debug_handler = debug_handler(cluster);\n\n        let response = warp::test::request()\n            .path(\"/debug?roles=foo\")\n            .method(\"GET\")\n            .reply(&debug_handler)\n            .await;\n        assert_eq!(response.status(), 400);\n\n        let response = warp::test::request()\n            .path(\"/debug?node_ids=[\")\n            .method(\"GET\")\n            .reply(&debug_handler)\n            .await;\n        assert_eq!(response.status(), 400);\n\n        let response = warp::test::request()\n            .path(\"/debug\")\n            .method(\"GET\")\n          
  .reply(&debug_handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        // TODO: Refactor handler and test against mock developer service servers.\n    }\n\n    #[test]\n    fn test_node_id_glob_patterns() {\n        let node_id_patterns = NodeIdGlobPatterns::try_from_comma_separated_patterns(\"\").unwrap();\n        let node_id = NodeIdRef::from_str(\"node-1\");\n        assert!(node_id_patterns.matches(node_id));\n\n        let node_id_patterns = NodeIdGlobPatterns::try_from_comma_separated_patterns(\",\").unwrap();\n        let node_id = NodeIdRef::from_str(\"node-1\");\n        assert!(node_id_patterns.matches(node_id));\n\n        let node_id_patterns = NodeIdGlobPatterns::try_from_comma_separated_patterns(\n            \"control-plane,,indexer-[1-2],searcher*\",\n        )\n        .unwrap();\n\n        let node_id = NodeIdRef::from_str(\"control-plane\");\n        assert!(node_id_patterns.matches(node_id));\n\n        let node_id = NodeIdRef::from_str(\"indexer-1\");\n        assert!(node_id_patterns.matches(node_id));\n\n        let node_id = NodeIdRef::from_str(\"Indexer-2\");\n        assert!(node_id_patterns.matches(node_id));\n\n        let node_id = NodeIdRef::from_str(\"indexer-3\");\n        assert!(!node_id_patterns.matches(node_id));\n\n        let node_id = NodeIdRef::from_str(\"searcher-1\");\n        assert!(node_id_patterns.matches(node_id));\n\n        let node_id = NodeIdRef::from_str(\"janitor\");\n        assert!(!node_id_patterns.matches(node_id));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/heap_prof.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::jemalloc_profiled::{start_profiling, stop_profiling};\nuse serde::Deserialize;\nuse warp::Filter;\nuse warp::reply::Reply;\n\npub fn heap_prof_handlers()\n-> impl Filter<Extract = impl warp::Reply, Error = warp::Rejection> + Clone {\n    #[derive(Deserialize)]\n    struct ProfilerQueryParams {\n        min_alloc_size: Option<u64>,\n        backtrace_every: Option<u64>,\n    }\n\n    let start_profiler = {\n        warp::path!(\"heap-prof\" / \"start\")\n            .and(warp::query::<ProfilerQueryParams>())\n            .and_then(move |params: ProfilerQueryParams| start_profiler_handler(params))\n    };\n\n    let stop_profiler = { warp::path!(\"heap-prof\" / \"stop\").and_then(stop_profiler_handler) };\n\n    async fn start_profiler_handler(\n        params: ProfilerQueryParams,\n    ) -> Result<warp::reply::Response, warp::Rejection> {\n        start_profiling(params.min_alloc_size, params.backtrace_every);\n        let response =\n            warp::reply::with_status(\"Heap profiling started\", warp::http::StatusCode::OK)\n                .into_response();\n        Ok(response)\n    }\n\n    async fn stop_profiler_handler() -> Result<warp::reply::Response, warp::Rejection> {\n        stop_profiling();\n        let response =\n            warp::reply::with_status(\"Heap profiling stopped\", warp::http::StatusCode::OK)\n   
             .into_response();\n        Ok(response)\n    }\n\n    start_profiler.or(stop_profiler)\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/heap_prof_disabled.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse warp::Filter;\n\nfn not_implemented_handler() -> impl warp::Reply {\n    warp::reply::with_status(\n        \"Quickwit was compiled without the `jemalloc-profiled` feature\",\n        warp::http::StatusCode::NOT_IMPLEMENTED,\n    )\n}\n\npub fn heap_prof_handlers()\n-> impl Filter<Extract = impl warp::Reply, Error = warp::Rejection> + Clone {\n    let start_profiler = { warp::path!(\"heap-prof\" / \"start\").map(not_implemented_handler) };\n    let stop_profiler = { warp::path!(\"heap-prof\" / \"stop\").map(not_implemented_handler) };\n    start_profiler.or(stop_profiler)\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/log_level.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\nuse tracing::{error, info};\nuse warp::hyper::StatusCode;\nuse warp::{Filter, Rejection};\n\nuse crate::{EnvFilterReloadFn, with_arg};\n\n#[derive(Deserialize)]\nstruct EnvFilter {\n    filter: String,\n}\n\n/// Dynamically Quickwit's log level\n#[utoipa::path(get, tag = \"Debug\", path = \"/log-level\")]\npub fn log_level_handler(\n    env_filter_reload_fn: EnvFilterReloadFn,\n) -> impl warp::Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path(\"log-level\")\n        .and(warp::get().or(warp::post()).unify())\n        .and(warp::path::end())\n        .and(with_arg(env_filter_reload_fn))\n        .and(warp::query::<EnvFilter>())\n        .then(\n            |env_filter_reload_fn: EnvFilterReloadFn, env_filter: EnvFilter| async move {\n                match env_filter_reload_fn(&env_filter.filter) {\n                    Ok(_) => {\n                        info!(filter = env_filter.filter, \"setting log level\");\n                        warp::reply::with_status(\n                            format!(\"set log level to: [{}]\", env_filter.filter),\n                            StatusCode::OK,\n                        )\n                    }\n                    Err(err) => {\n                        error!(filter = env_filter.filter, %err, \"failed to set log level\");\n                       
 warp::reply::with_status(\n                            format!(\n                                \"failed to set log level to: [{}], {}\",\n                                env_filter.filter, err\n                            ),\n                            StatusCode::BAD_REQUEST,\n                        )\n                    }\n                }\n            },\n        )\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod debug;\n\n#[cfg_attr(not(feature = \"jemalloc-profiled\"), path = \"heap_prof_disabled.rs\")]\nmod heap_prof;\nmod log_level;\n#[cfg_attr(not(feature = \"pprof\"), path = \"pprof_disabled.rs\")]\nmod pprof;\nmod server;\n\nuse debug::debug_handler;\nuse heap_prof::heap_prof_handlers;\nuse log_level::log_level_handler;\nuse pprof::pprof_handlers;\nuse quickwit_cluster::Cluster;\npub(crate) use server::DeveloperApiServer;\nuse warp::{Filter, Rejection};\n\nuse crate::EnvFilterReloadFn;\nuse crate::rest::recover_fn;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(debug::debug_handler, log_level::log_level_handler))]\npub struct DeveloperApi;\n\npub(crate) fn developer_api_routes(\n    cluster: Cluster,\n    env_filter_reload_fn: EnvFilterReloadFn,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"api\" / \"developer\" / ..)\n        .and(\n            debug_handler(cluster.clone())\n                .or(log_level_handler(env_filter_reload_fn.clone()).boxed())\n                .or(pprof_handlers())\n                .or(heap_prof_handlers()),\n        )\n        .recover(recover_fn)\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/pprof.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::OnceLock;\n\nuse regex::Regex;\nuse warp::Filter;\n\nfn remove_trailing_numbers(thread_name: &mut String) {\n    static REMOVE_TRAILING_NUMBER_PTN: OnceLock<Regex> = OnceLock::new();\n    let captures_opt = REMOVE_TRAILING_NUMBER_PTN\n        .get_or_init(|| Regex::new(r\"^(.*?)[-\\d]+$\").unwrap())\n        .captures(thread_name);\n    if let Some(captures) = captures_opt {\n        *thread_name = captures[1].to_string();\n    }\n}\n\nfn frames_post_processor(frames: &mut pprof::Frames) {\n    remove_trailing_numbers(&mut frames.thread_name);\n}\n\n/// pprof/start to start cpu profiling.\n/// pprof/start?duration=5&sampling=1000 to start a short high frequency cpu profiling\n/// pprof/flamegraph to stop the current cpu profiling and return a flamegraph or return the last\n/// flamegraph\n///\n/// Query parameters:\n/// - duration: duration of the profiling in seconds, default is 30 seconds. 
max value is 300\n/// - sampling: the sampling rate, default is 100, max value is 1000\npub fn pprof_handlers() -> impl Filter<Extract = impl warp::Reply, Error = warp::Rejection> + Clone\n{\n    use std::sync::{Arc, Mutex};\n\n    use pprof::ProfilerGuard;\n    use serde::Deserialize;\n    use tokio::time::{self, Duration};\n    use warp::reply::Reply;\n\n    struct ProfilerState {\n        profiler_guard: Option<ProfilerGuard<'static>>,\n        // We will keep the latest flamegraph and return it at the flamegraph endpoint\n        // A new run will overwrite the flamegraph_data\n        flamegraph_data: Option<Vec<u8>>,\n    }\n\n    let profiler_state = Arc::new(Mutex::new(ProfilerState {\n        profiler_guard: None,\n        flamegraph_data: None,\n    }));\n\n    #[derive(Deserialize)]\n    struct ProfilerQueryParams {\n        duration: Option<u64>, // max allowed value is 300 seconds, default is 30 seconds\n        sampling: Option<i32>, // max value is 1000, default is 100\n    }\n\n    let start_profiler = {\n        let profiler_state = Arc::clone(&profiler_state);\n        warp::path!(\"pprof\" / \"start\")\n            .and(warp::query::<ProfilerQueryParams>())\n            .and_then(move |params: ProfilerQueryParams| {\n                start_profiler_handler(profiler_state.clone(), params)\n            })\n    };\n\n    let stop_profiler = {\n        let profiler_state = Arc::clone(&profiler_state);\n        warp::path!(\"pprof\" / \"flamegraph\")\n            .and_then(move || get_flamegraph_handler(Arc::clone(&profiler_state)))\n    };\n\n    async fn start_profiler_handler(\n        profiler_state: Arc<Mutex<ProfilerState>>,\n        params: ProfilerQueryParams,\n    ) -> Result<impl warp::Reply, warp::Rejection> {\n        let mut state = profiler_state.lock().unwrap();\n\n        if state.profiler_guard.is_none() {\n            let duration = params.duration.unwrap_or(30).min(300);\n            let sampling = 
params.sampling.unwrap_or(100).min(1000);\n            state.profiler_guard = Some(pprof::ProfilerGuard::new(sampling).unwrap());\n            let profiler_state = Arc::clone(&profiler_state);\n            tokio::spawn(async move {\n                time::sleep(Duration::from_secs(duration)).await;\n                save_flamegraph(profiler_state).await;\n            });\n            Ok(warp::reply::with_status(\n                \"CPU profiling started\",\n                warp::http::StatusCode::OK,\n            ))\n        } else {\n            Ok(warp::reply::with_status(\n                \"CPU profiling is already running\",\n                warp::http::StatusCode::BAD_REQUEST,\n            ))\n        }\n    }\n\n    async fn get_flamegraph_handler(\n        profiler_state: Arc<Mutex<ProfilerState>>,\n    ) -> Result<impl warp::Reply, warp::Rejection> {\n        let state = profiler_state.lock().unwrap();\n\n        if let Some(data) = state.flamegraph_data.clone() {\n            Ok(warp::reply::with_header(data, \"Content-Type\", \"image/svg+xml\").into_response())\n        } else {\n            Ok(warp::reply::with_status(\n                \"flamegraph is not available\",\n                warp::http::StatusCode::BAD_REQUEST,\n            )\n            .into_response())\n        }\n    }\n\n    async fn save_flamegraph(profiler_state: Arc<Mutex<ProfilerState>>) {\n        let handle = quickwit_common::thread_pool::run_cpu_intensive(move || {\n            let mut state = profiler_state.lock().unwrap();\n            if let Some(profiler) = state.profiler_guard.take()\n                && let Ok(report) = profiler\n                    .report()\n                    .frames_post_processor(frames_post_processor)\n                    .build()\n            {\n                let mut buffer = Vec::new();\n                if report.flamegraph(&mut buffer).is_ok() {\n                    state.flamegraph_data = Some(buffer);\n                }\n            }\n        });\n 
       let _ = handle.await;\n    }\n\n    start_profiler.or(stop_profiler)\n}\n\n#[cfg(test)]\nmod tests {\n    use super::remove_trailing_numbers;\n\n    #[track_caller]\n    fn test_remove_trailing_numbers_aux(thread_name: &str, expected: &str) {\n        let mut thread_name = thread_name.to_string();\n        remove_trailing_numbers(&mut thread_name);\n        assert_eq!(&thread_name, expected);\n    }\n\n    #[test]\n    fn test_remove_trailing_numbers() {\n        test_remove_trailing_numbers_aux(\"thread-12\", \"thread\");\n        test_remove_trailing_numbers_aux(\"thread12\", \"thread\");\n        test_remove_trailing_numbers_aux(\"thread-\", \"thread\");\n        test_remove_trailing_numbers_aux(\"thread-1-2\", \"thread\");\n        test_remove_trailing_numbers_aux(\"thread-1-2\", \"thread\");\n        test_remove_trailing_numbers_aux(\"12-aa\", \"12-aa\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/pprof_disabled.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse warp::Filter;\n\nfn not_implemented_handler() -> impl warp::Reply {\n    warp::reply::with_status(\n        \"Quickwit was compiled without the `pprof` feature\",\n        warp::http::StatusCode::NOT_IMPLEMENTED,\n    )\n}\n\n/// pprof/start disabled\n/// pprof/flamegraph disabled\npub fn pprof_handlers() -> impl Filter<Extract = impl warp::Reply, Error = warp::Rejection> + Clone\n{\n    let start_profiler = { warp::path!(\"pprof\" / \"start\").map(not_implemented_handler) };\n    let stop_profiler = { warp::path!(\"pprof\" / \"flamegraph\").map(not_implemented_handler) };\n    start_profiler.or(stop_profiler)\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/developer_api/server.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::fmt;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse bytes::Bytes;\nuse bytesize::ByteSize;\nuse quickwit_actors::Mailbox;\nuse quickwit_cluster::Cluster;\nuse quickwit_config::NodeConfig;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_control_plane::control_plane::{ControlPlane, GetDebugInfo};\nuse quickwit_ingest::{IngestRouter, Ingester};\nuse quickwit_proto::developer::{\n    DeveloperError, DeveloperResult, DeveloperService, GetDebugInfoRequest, GetDebugInfoResponse,\n};\nuse serde_json::json;\n\nuse crate::{BuildInfo, QuickwitServices, RuntimeInfo};\n\n#[derive(Clone)]\npub(crate) struct DeveloperApiServer {\n    node_config: Arc<NodeConfig>,\n    cluster: Cluster,\n    control_plane_mailbox_opt: Option<Mailbox<ControlPlane>>,\n    ingest_router_opt: Option<IngestRouter>,\n    ingester_opt: Option<Ingester>,\n}\n\nimpl fmt::Debug for DeveloperApiServer {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"DeveloperApiServer\").finish()\n    }\n}\n\nimpl DeveloperApiServer {\n    pub const MAX_GRPC_MESSAGE_SIZE: ByteSize = ByteSize::mib(100);\n\n    pub fn from_services(services: &QuickwitServices) -> Self {\n        Self {\n            node_config: services.node_config.clone(),\n            cluster: services.cluster.clone(),\n          
  control_plane_mailbox_opt: services.control_plane_server_opt.clone(),\n            ingest_router_opt: services.ingest_router_opt.clone(),\n            ingester_opt: services.ingester_opt.clone(),\n        }\n    }\n}\n\n#[async_trait]\nimpl DeveloperService for DeveloperApiServer {\n    async fn get_debug_info(\n        &self,\n        request: GetDebugInfoRequest,\n    ) -> DeveloperResult<GetDebugInfoResponse> {\n        let roles: HashSet<QuickwitService> = request\n            .roles\n            .into_iter()\n            .map(|role| role.parse())\n            .collect::<anyhow::Result<_>>()\n            .map_err(|error| DeveloperError::InvalidArgument(error.to_string()))?;\n\n        let cluster_snapshot = self.cluster.snapshot().await;\n\n        // We must redact sensitive information such as credentials.\n        let mut node_config = (*self.node_config).clone();\n        node_config.redact();\n\n        let mut debug_info = json!({\n            \"build_info\": BuildInfo::get(),\n            \"runtime_info\": RuntimeInfo::get(),\n            \"node_config\": node_config,\n            \"cluster_membership_info\": json!({\n                \"ready_nodes\": cluster_snapshot.ready_nodes,\n                \"live_nodes\": cluster_snapshot.live_nodes,\n                \"dead_nodes\": cluster_snapshot.dead_nodes,\n                \"chitchat_state\": cluster_snapshot.chitchat_state_snapshot.node_states,\n            })\n        });\n        if let Some(control_plane_mailbox) = &self.control_plane_mailbox_opt\n            && (roles.is_empty() || roles.contains(&QuickwitService::ControlPlane))\n        {\n            debug_info[\"control_plane\"] = match control_plane_mailbox.ask(GetDebugInfo).await {\n                Ok(debug_info) => debug_info,\n                Err(error) => {\n                    json!({\"error\": error.to_string()})\n                }\n            };\n        }\n        if let Some(ingest_router) = &self.ingest_router_opt {\n            
debug_info[\"ingest_router\"] = ingest_router.debug_info().await;\n        }\n        if let Some(ingester) = &self.ingester_opt\n            && (roles.is_empty() || roles.contains(&QuickwitService::Indexer))\n        {\n            debug_info[\"ingester\"] = ingester.debug_info().await;\n        };\n        let debug_info_json = serde_json::to_vec(&debug_info).map_err(|error| {\n            let message = format!(\"failed to JSON serialize debug info: {error}\");\n            DeveloperError::Internal(message)\n        })?;\n        let response = GetDebugInfoResponse {\n            debug_info_json: Bytes::from(debug_info_json),\n        };\n        Ok(response)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use serde_json::Value as JsonValue;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_developer_api_server_get_debug_info() {\n        let peer_seeds = Vec::new();\n        let transport = ChannelTransport::default();\n        let self_node_readiness = true;\n        let cluster = create_cluster_for_test(\n            peer_seeds,\n            &[\"metastore\", \"control-plane\", \"indexer\"],\n            &transport,\n            self_node_readiness,\n        )\n        .await\n        .unwrap();\n\n        let mut node_config = NodeConfig::for_test();\n        node_config.metastore_uri =\n            quickwit_common::uri::Uri::for_test(\"postgresql://username:password@db\");\n        let node_config = Arc::new(node_config);\n\n        let developer_api_server = DeveloperApiServer {\n            node_config,\n            cluster,\n            control_plane_mailbox_opt: None,\n            ingest_router_opt: None,\n            ingester_opt: None,\n        };\n        let request = GetDebugInfoRequest { roles: Vec::new() };\n        let response = developer_api_server.get_debug_info(request).await.unwrap();\n        let debug_info: JsonValue = 
serde_json::from_slice(&response.debug_info_json).unwrap();\n\n        assert!(debug_info[\"build_info\"].is_object());\n        assert!(debug_info[\"runtime_info\"].is_object());\n        assert!(debug_info[\"node_config\"].is_object());\n        assert!(debug_info[\"cluster_membership_info\"].is_object());\n\n        assert_eq!(\n            debug_info[\"node_config\"][\"metastore_uri\"],\n            \"postgresql://username:***redacted***@db\"\n        );\n\n        // TODO: Test control plane and ingester debug info.\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/bulk.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::time::Instant;\n\nuse bytesize::ByteSize;\nuse quickwit_ingest::{\n    CommitType, DocBatchBuilder, IngestRequest, IngestService, IngestServiceClient,\n};\nuse quickwit_proto::ingest::router::IngestRouterServiceClient;\nuse quickwit_proto::types::IndexId;\nuse warp::http::StatusCode;\nuse warp::{Filter, Rejection};\n\nuse super::bulk_v2::{ElasticBulkResponse, elastic_bulk_ingest_v2};\nuse crate::elasticsearch_api::filter::{elastic_bulk_filter, elastic_index_bulk_filter};\nuse crate::elasticsearch_api::make_elastic_api_response;\nuse crate::elasticsearch_api::model::{BulkAction, ElasticBulkOptions, ElasticsearchError};\nuse crate::format::extract_format_from_qs;\nuse crate::ingest_api::lines;\nuse crate::rest::recover_fn;\nuse crate::{Body, with_arg};\n\n/// POST `_elastic/_bulk`\npub fn es_compat_bulk_handler(\n    ingest_service: IngestServiceClient,\n    ingest_router: IngestRouterServiceClient,\n    content_length_limit: ByteSize,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_bulk_filter(content_length_limit)\n        .and(with_arg(ingest_service))\n        .and(with_arg(ingest_router))\n        .then(move |body, bulk_options, ingest_service, ingest_router| {\n            elastic_ingest_bulk(\n       
         None,\n                body,\n                bulk_options,\n                ingest_service,\n                ingest_router,\n                enable_ingest_v1,\n                enable_ingest_v2,\n            )\n        })\n        .and(extract_format_from_qs())\n        .map(make_elastic_api_response)\n        .recover(recover_fn)\n}\n\n/// POST `_elastic/<index>/_bulk`\npub fn es_compat_index_bulk_handler(\n    ingest_service: IngestServiceClient,\n    ingest_router: IngestRouterServiceClient,\n    content_length_limit: ByteSize,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_bulk_filter(content_length_limit)\n        .and(with_arg(ingest_service))\n        .and(with_arg(ingest_router))\n        .then(\n            move |index_id, body, bulk_options, ingest_service, ingest_router| {\n                elastic_ingest_bulk(\n                    Some(index_id),\n                    body,\n                    bulk_options,\n                    ingest_service,\n                    ingest_router,\n                    enable_ingest_v1,\n                    enable_ingest_v2,\n                )\n            },\n        )\n        .and(extract_format_from_qs())\n        .map(make_elastic_api_response)\n        .recover(recover_fn)\n        .boxed()\n}\n\nasync fn elastic_ingest_bulk(\n    default_index_id: Option<IndexId>,\n    body: Body,\n    bulk_options: ElasticBulkOptions,\n    ingest_service: IngestServiceClient,\n    ingest_router: IngestRouterServiceClient,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> Result<ElasticBulkResponse, ElasticsearchError> {\n    if enable_ingest_v2 && !bulk_options.use_legacy_ingest {\n        return elastic_bulk_ingest_v2(default_index_id, body, bulk_options, ingest_router).await;\n    }\n    if !enable_ingest_v1 {\n        return Err(ElasticsearchError::new(\n            
StatusCode::INTERNAL_SERVER_ERROR,\n            \"ingest v1 is disabled: environment variable `QW_DISABLE_INGEST_V1` is set\".to_string(),\n            None,\n        ));\n    }\n    let now = Instant::now();\n    let mut doc_batch_builders = HashMap::new();\n    let mut lines = lines(&body.content).enumerate();\n\n    while let Some((line_number, line)) = lines.next() {\n        let action = serde_json::from_slice::<BulkAction>(line).map_err(|error| {\n            ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                format!(\"Malformed action/metadata line [#{line_number}]. Details: `{error}`\"),\n                None,\n            )\n        })?;\n        let (_, source) = lines.next().ok_or_else(|| {\n            ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                \"expected source for the action\".to_string(),\n                None,\n            )\n        })?;\n        // when ingesting on /my-index/_bulk, if _index: is set to something other than my-index,\n        // ES honors it and creates the doc in the requested index. 
That is, `my-index` is a default\n        // value in case _index: is missing, but not a constraint on each sub-action.\n        let index_id = action\n            .into_index_id()\n            .or_else(|| default_index_id.clone())\n            .ok_or_else(|| {\n                ElasticsearchError::new(\n                    StatusCode::BAD_REQUEST,\n                    format!(\"missing required field: `_index` in the line [#{line_number}].\"),\n                    None,\n                )\n            })?;\n        let doc_batch_builder = doc_batch_builders\n            .entry(index_id.clone())\n            .or_insert(DocBatchBuilder::new(index_id));\n\n        doc_batch_builder.ingest_doc(source);\n    }\n    let doc_batches = doc_batch_builders\n        .into_values()\n        .map(|builder| builder.build())\n        .collect();\n    let commit_type: CommitType = bulk_options.refresh.into();\n    let ingest_request = IngestRequest {\n        doc_batches,\n        commit: commit_type.into(),\n    };\n    ingest_service.ingest(ingest_request).await?;\n\n    let took_millis = now.elapsed().as_millis() as u64;\n    let errors = false;\n    let bulk_response = ElasticBulkResponse {\n        took_millis,\n        errors,\n        actions: Vec::new(),\n    };\n    Ok(bulk_response)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n    use std::time::Duration;\n\n    use quickwit_config::{IngestApiConfig, NodeConfig};\n    use quickwit_index_management::IndexService;\n    use quickwit_ingest::{FetchRequest, IngestServiceClient, SuggestTruncateRequest};\n    use quickwit_metastore::metastore_for_test;\n    use quickwit_proto::ingest::router::IngestRouterServiceClient;\n    use quickwit_proto::metastore::MetastoreServiceClient;\n    use quickwit_search::MockSearchService;\n    use quickwit_storage::StorageResolver;\n    use warp::hyper::StatusCode;\n\n    use crate::elasticsearch_api::bulk_v2::ElasticBulkResponse;\n    use 
crate::elasticsearch_api::elastic_api_handlers;\n    use crate::elasticsearch_api::model::ElasticsearchError;\n    use crate::elasticsearch_api::tests::mock_cluster;\n    use crate::ingest_api::setup_ingest_v1_service;\n\n    #[tokio::test]\n    async fn test_bulk_api_returns_404_if_index_id_does_not_exist() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = r#\"\n            { \"create\" : { \"_index\" : \"my-index\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"index-2\", \"_id\" : \"1\" } }\n            {\"id\": 1, \"message\": \"push\"}\"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&elastic_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 404);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_returns_200() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index-1\", \"my-index-2\"], 
&IngestApiConfig::default())\n                .await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = r#\"\n            { \"create\" : { \"_index\" : \"my-index-1\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"my-index-2\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"my-index-1\" } }\n            {\"id\": 2, \"message\": \"push\"}\"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&elastic_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(resp.body()).unwrap();\n        assert!(!bulk_response.errors);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_returns_200_if_payload_has_blank_lines() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index-1\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n           
 mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = \"\n            {\\\"create\\\": {\\\"_index\\\": \\\"my-index-1\\\", \\\"_id\\\": \\\"1674834324802805760\\\"}}\n            \\u{20}\\u{20}\\u{20}\\u{20}\\n\n            {\\\"_line\\\": {\\\"message\\\": \\\"hello-world\\\"}}\";\n        let resp = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&elastic_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(resp.body()).unwrap();\n        assert!(!bulk_response.errors);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_index_api_returns_200() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index-1\", \"my-index-2\"], &IngestApiConfig::default())\n                .await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = r#\"\n            { \"create\" : { \"_index\" : \"my-index-1\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { 
\"create\" : { \"_index\" : \"my-index-2\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : {} }\n            {\"id\": 2, \"message\": \"push\"}\"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/my-index-1/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&elastic_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(resp.body()).unwrap();\n        assert!(!bulk_response.errors);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_blocks_when_refresh_wait_for_is_specified() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let (universe, _temp_dir, ingest_service, ingest_service_mailbox) =\n            setup_ingest_v1_service(&[\"my-index-1\", \"my-index-2\"], &IngestApiConfig::default())\n                .await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = r#\"\n            { \"create\" : { \"_index\" : \"my-index-1\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"my-index-2\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"my-index-1\" } }\n            {\"id\": 2, \"message\": \"push\"}\"#;\n        let handle = tokio::spawn(async 
move {\n            let resp = warp::test::request()\n                .path(\"/_elastic/_bulk?refresh=wait_for\")\n                .method(\"POST\")\n                .body(payload)\n                .reply(&elastic_api_handlers)\n                .await;\n\n            assert_eq!(resp.status(), 200);\n            let bulk_response: ElasticBulkResponse = serde_json::from_slice(resp.body()).unwrap();\n            assert!(!bulk_response.errors);\n        });\n        universe.sleep(Duration::from_secs(10)).await;\n        assert!(!handle.is_finished());\n        assert_eq!(\n            ingest_service_mailbox\n                .ask_for_res(FetchRequest {\n                    index_id: \"my-index-1\".to_string(),\n                    start_after: None,\n                    num_bytes_limit: None,\n                })\n                .await\n                .unwrap()\n                .doc_batch\n                .unwrap()\n                .num_docs(),\n            2\n        );\n        assert!(!handle.is_finished());\n        assert_eq!(\n            ingest_service_mailbox\n                .ask_for_res(FetchRequest {\n                    index_id: \"my-index-2\".to_string(),\n                    start_after: None,\n                    num_bytes_limit: None,\n                })\n                .await\n                .unwrap()\n                .doc_batch\n                .unwrap()\n                .num_docs(),\n            1\n        );\n        ingest_service_mailbox\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"my-index-1\".to_string(),\n                up_to_position_included: 1,\n            })\n            .await\n            .unwrap();\n        universe.sleep(Duration::from_secs(10)).await;\n        assert!(!handle.is_finished());\n        ingest_service_mailbox\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"my-index-2\".to_string(),\n                up_to_position_included: 0,\n            })\n        
    .await\n            .unwrap();\n        handle.await.unwrap();\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_blocks_when_refresh_true_is_specified() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let (universe, _temp_dir, ingest_service, ingest_service_mailbox) =\n            setup_ingest_v1_service(&[\"my-index-1\", \"my-index-2\"], &IngestApiConfig::default())\n                .await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = r#\"\n            { \"create\" : { \"_index\" : \"my-index-1\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"my-index-2\", \"_id\" : \"1\"} }\n            {\"id\": 1, \"message\": \"push\"}\n            { \"create\" : { \"_index\" : \"my-index-1\" } }\n            {\"id\": 2, \"message\": \"push\"}\"#;\n        let handle = tokio::spawn(async move {\n            let resp = warp::test::request()\n                .path(\"/_elastic/_bulk?refresh\")\n                .method(\"POST\")\n                .body(payload)\n                .reply(&elastic_api_handlers)\n                .await;\n\n            assert_eq!(resp.status(), 200);\n            let bulk_response: ElasticBulkResponse = serde_json::from_slice(resp.body()).unwrap();\n            assert!(!bulk_response.errors);\n        });\n        universe.sleep(Duration::from_secs(10)).await;\n        
assert!(!handle.is_finished());\n        assert_eq!(\n            ingest_service_mailbox\n                .ask_for_res(FetchRequest {\n                    index_id: \"my-index-1\".to_string(),\n                    start_after: None,\n                    num_bytes_limit: None,\n                })\n                .await\n                .unwrap()\n                .doc_batch\n                .unwrap()\n                .num_docs(),\n            3\n        );\n        assert_eq!(\n            ingest_service_mailbox\n                .ask_for_res(FetchRequest {\n                    index_id: \"my-index-2\".to_string(),\n                    start_after: None,\n                    num_bytes_limit: None,\n                })\n                .await\n                .unwrap()\n                .doc_batch\n                .unwrap()\n                .num_docs(),\n            2\n        );\n        ingest_service_mailbox\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"my-index-1\".to_string(),\n                up_to_position_included: 1,\n            })\n            .await\n            .unwrap();\n        universe.sleep(Duration::from_secs(10)).await;\n        assert!(!handle.is_finished());\n        ingest_service_mailbox\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"my-index-2\".to_string(),\n                up_to_position_included: 0,\n            })\n            .await\n            .unwrap();\n        handle.await.unwrap();\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_ingest_request_returns_400_if_action_is_malformed() {\n        let config = Arc::new(NodeConfig::for_test());\n        let search_service = Arc::new(MockSearchService::new());\n        let ingest_service = IngestServiceClient::mocked();\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), 
StorageResolver::unconfigured());\n        let elastic_api_handlers = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            search_service,\n            ingest_service,\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let payload = r#\"\n            {\"create\": {\"_index\": \"my-index\", \"_id\": \"1\"},}\n            {\"id\": 1, \"message\": \"my-doc\"}\"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&elastic_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 400);\n        let es_error: ElasticsearchError = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(es_error.status, StatusCode::BAD_REQUEST);\n        assert_eq!(\n            es_error.error.reason.unwrap(),\n            \"Malformed action/metadata line [#0]. Details: `expected value at line 1 column 57`\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/bulk_v2.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::time::Instant;\n\nuse quickwit_common::rate_limited_error;\nuse quickwit_config::{INGEST_V2_SOURCE_ID, validate_identifier};\nuse quickwit_ingest::IngestRequestV2Builder;\nuse quickwit_proto::ingest::CommitTypeV2;\nuse quickwit_proto::ingest::router::{\n    IngestFailureReason, IngestResponseV2, IngestRouterService, IngestRouterServiceClient,\n};\nuse quickwit_proto::types::{DocUid, IndexId};\nuse serde::{Deserialize, Serialize};\nuse warp::hyper::StatusCode;\n\nuse super::model::ElasticException;\nuse crate::Body;\nuse crate::elasticsearch_api::model::{BulkAction, ElasticBulkOptions, ElasticsearchError};\nuse crate::ingest_api::lines;\n\n#[derive(Debug, Default, Serialize, Deserialize)]\npub(crate) struct ElasticBulkResponse {\n    #[serde(rename = \"took\")]\n    pub took_millis: u64,\n    pub errors: bool,\n    #[serde(rename = \"items\")]\n    pub actions: Vec<ElasticBulkAction>,\n}\n\n#[derive(Debug, Clone, Serialize, Deserialize)]\npub(crate) enum ElasticBulkAction {\n    #[serde(rename = \"create\")]\n    Create(ElasticBulkItem),\n    #[serde(rename = \"index\")]\n    Index(ElasticBulkItem),\n}\n\n#[derive(Debug, Clone, Serialize, Deserialize)]\npub(crate) struct ElasticBulkItem {\n    #[serde(rename = \"_index\")]\n    pub index_id: IndexId,\n    #[serde(rename = \"_id\")]\n    pub es_doc_id: 
Option<String>,\n    #[serde(with = \"http_serde::status_code\")]\n    pub status: StatusCode,\n    pub error: Option<ElasticBulkError>,\n}\n\n#[derive(Debug, Clone, Serialize, Deserialize)]\npub(crate) struct ElasticBulkError {\n    #[serde(rename = \"index\")]\n    pub index_id: Option<IndexId>,\n    #[serde(rename = \"type\")]\n    pub exception: ElasticException,\n    pub reason: String,\n}\n\ntype ElasticDocId = String;\n\n#[derive(Debug)]\nstruct DocHandle {\n    doc_position: usize,\n    doc_uid: DocUid,\n    es_doc_id: Option<ElasticDocId>,\n    // Whether the document failed to parse. When the struct is instantiated, this value is set to\n    // `false` and then mutated if the ingest response contains a parse failure for this document.\n    is_parse_failure: bool,\n}\n\npub(crate) async fn elastic_bulk_ingest_v2(\n    default_index_id: Option<IndexId>,\n    body: Body,\n    bulk_options: ElasticBulkOptions,\n    ingest_router: IngestRouterServiceClient,\n) -> Result<ElasticBulkResponse, ElasticsearchError> {\n    let now = Instant::now();\n    let mut ingest_request_builder = IngestRequestV2Builder::default();\n    let mut lines = lines(&body.content).enumerate();\n    let mut per_subrequest_doc_handles: HashMap<u32, Vec<DocHandle>> = HashMap::new();\n    let mut action_count = 0;\n    let mut invalid_index_id_items = Vec::new();\n    while let Some((line_no, line)) = lines.next() {\n        let action = serde_json::from_slice::<BulkAction>(line).map_err(|error| {\n            ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                format!(\"Malformed action/metadata line [{}]: {error}\", line_no + 1),\n                Some(ElasticException::IllegalArgument),\n            )\n        })?;\n        let (_, doc) = lines.next().ok_or_else(|| {\n            ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                \"Validation Failed: 1: no requests added;\".to_string(),\n                
Some(ElasticException::ActionRequestValidation),\n            )\n        })?;\n        let meta = action.into_meta();\n        // When ingesting into `/my-index/_bulk`, if `_index` is set to something other than\n        // `my-index`, ES honors it and creates the doc for the requested index. That is,\n        // `my-index` is a default value in case `_index` is missing, but not a constraint on\n        // each sub-action.\n        let index_id = meta\n            .index_id\n            .or_else(|| default_index_id.clone())\n            .ok_or_else(|| {\n                ElasticsearchError::new(\n                    StatusCode::BAD_REQUEST,\n                    \"Validation Failed: 1: index is missing;\".to_string(),\n                    Some(ElasticException::ActionRequestValidation),\n                )\n            })?;\n\n        // Validate index ID early because propagating back the right error (400)\n        // from deeper ingest layers is harder.\n        if validate_identifier(\"\", &index_id).is_err() {\n            let invalid_item = make_invalid_index_id_item(index_id.clone(), meta.es_doc_id);\n            invalid_index_id_items.push((action_count, invalid_item));\n            action_count += 1;\n            continue;\n        }\n\n        let (subrequest_id, doc_uid) = ingest_request_builder.add_doc(index_id, doc);\n\n        let doc_handle = DocHandle {\n            doc_position: action_count,\n            doc_uid,\n            es_doc_id: meta.es_doc_id,\n            is_parse_failure: false,\n        };\n        action_count += 1;\n        per_subrequest_doc_handles\n            .entry(subrequest_id)\n            .or_default()\n            .push(doc_handle);\n    }\n    let commit_type: CommitTypeV2 = bulk_options.refresh.into();\n\n    let ingest_request_opt = ingest_request_builder.build(INGEST_V2_SOURCE_ID, commit_type);\n\n    let Some(ingest_request) = ingest_request_opt else {\n        return Ok(ElasticBulkResponse::default());\n    };\n    let 
ingest_response = ingest_router.ingest(ingest_request).await.map_err(|err| {\n        rate_limited_error!(limit_per_min=6, err=?err, \"router error\");\n        err\n    })?;\n    make_elastic_bulk_response_v2(\n        ingest_response,\n        per_subrequest_doc_handles,\n        now,\n        action_count,\n        invalid_index_id_items,\n    )\n}\n\n#[allow(clippy::result_large_err)]\nfn make_elastic_bulk_response_v2(\n    ingest_response_v2: IngestResponseV2,\n    mut per_subrequest_doc_handles: HashMap<u32, Vec<DocHandle>>,\n    now: Instant,\n    action_count: usize,\n    invalid_index_id_items: Vec<(usize, ElasticBulkItem)>,\n) -> Result<ElasticBulkResponse, ElasticsearchError> {\n    let mut positioned_actions: Vec<(usize, ElasticBulkAction)> = Vec::with_capacity(action_count);\n    let mut errors = false;\n\n    // Populate the items for each `IngestSuccess` subresponse. They may be partially successful and\n    // contain some parse failures.\n    for success in ingest_response_v2.successes {\n        let index_id = success\n            .index_uid\n            .map(|index_uid| index_uid.index_id)\n            .expect(\"`index_uid` should be a required field\");\n\n        // Find the doc handles for the subresponse.\n        let mut doc_handles = remove_doc_handles(\n            &mut per_subrequest_doc_handles,\n            success.subrequest_id,\n        )\n        .inspect_err(|_| {\n            rate_limited_error!(limit_per_min=6, index_id=%index_id, \"could not find subrequest id\");\n        })?;\n        doc_handles.sort_unstable_by(|left, right| left.doc_uid.cmp(&right.doc_uid));\n\n        // Populate the response items with one error per parse failure.\n        for parse_failure in success.parse_failures {\n            errors = true;\n\n            let failed_doc_uid = parse_failure.doc_uid();\n            let doc_handle_idx = doc_handles\n                .binary_search_by_key(&failed_doc_uid, |doc_handle| doc_handle.doc_uid)\n                
.map_err(|_| {\n                    rate_limited_error!(limit_per_min=6, doc_uid=%failed_doc_uid, \"could not find doc_uid from parse failure\");\n                    ElasticsearchError::new(\n                        StatusCode::INTERNAL_SERVER_ERROR,\n                        format!(\n                            \"could not find doc `{}` in bulk request\",\n                            parse_failure.doc_uid()\n                        ),\n                        None,\n                    )\n                })?;\n            let doc_handle = &mut doc_handles[doc_handle_idx];\n            doc_handle.is_parse_failure = true;\n\n            let error = ElasticBulkError {\n                index_id: Some(index_id.clone()),\n                exception: ElasticException::DocumentParsing,\n                reason: parse_failure.message,\n            };\n            let item = ElasticBulkItem {\n                index_id: index_id.clone(),\n                es_doc_id: doc_handle.es_doc_id.take(),\n                status: StatusCode::BAD_REQUEST,\n                error: Some(error),\n            };\n            let action = ElasticBulkAction::Index(item);\n            positioned_actions.push((doc_handle.doc_position, action));\n        }\n        // Populate the remaining successful items.\n        for mut doc_handle in doc_handles {\n            if doc_handle.is_parse_failure {\n                continue;\n            }\n            let item = ElasticBulkItem {\n                index_id: index_id.clone(),\n                es_doc_id: doc_handle.es_doc_id.take(),\n                status: StatusCode::CREATED,\n                error: None,\n            };\n            let action = ElasticBulkAction::Index(item);\n            positioned_actions.push((doc_handle.doc_position, action));\n        }\n    }\n    // Repeat the operation for each `IngestFailure` subresponse.\n    for failure in ingest_response_v2.failures {\n        errors = true;\n\n        // Find the doc handles for the 
subrequest.\n        let doc_handles =\n            remove_doc_handles(&mut per_subrequest_doc_handles, failure.subrequest_id)\n                .inspect_err(|_| {\n                    rate_limited_error!(\n                        limit_per_min = 6,\n                        subrequest = failure.subrequest_id,\n                        \"failed to find error subrequest\"\n                    );\n                })?;\n\n        // Populate the response items with one error per doc handle.\n        let (exception, reason, status) = match failure.reason() {\n            IngestFailureReason::IndexNotFound => (\n                ElasticException::IndexNotFound,\n                format!(\"no such index [{}]\", failure.index_id),\n                StatusCode::NOT_FOUND,\n            ),\n            IngestFailureReason::SourceNotFound => (\n                ElasticException::SourceNotFound,\n                format!(\"no such source [{}]\", failure.index_id),\n                StatusCode::NOT_FOUND,\n            ),\n            IngestFailureReason::Timeout => (\n                ElasticException::Timeout,\n                format!(\"timeout [{}]\", failure.index_id),\n                StatusCode::REQUEST_TIMEOUT,\n            ),\n            IngestFailureReason::ShardRateLimited => (\n                ElasticException::RateLimited,\n                format!(\"shard rate limiting [{}]\", failure.index_id),\n                StatusCode::TOO_MANY_REQUESTS,\n            ),\n            IngestFailureReason::NoShardsAvailable => (\n                ElasticException::RateLimited,\n                format!(\"no shards available [{}]\", failure.index_id),\n                StatusCode::TOO_MANY_REQUESTS,\n            ),\n            reason => {\n                let pretty_reason = reason\n                    .as_str_name()\n                    .strip_prefix(\"INGEST_FAILURE_REASON_\")\n                    .unwrap_or(\"\")\n                    .replace('_', \" \")\n                    
.to_ascii_lowercase();\n                (\n                    ElasticException::Internal,\n                    format!(\"{} error [{}]\", pretty_reason, failure.index_id),\n                    StatusCode::INTERNAL_SERVER_ERROR,\n                )\n            }\n        };\n        for mut doc_handle in doc_handles {\n            let error = ElasticBulkError {\n                index_id: Some(failure.index_id.clone()),\n                exception,\n                reason: reason.clone(),\n            };\n            let item = ElasticBulkItem {\n                index_id: failure.index_id.clone(),\n                es_doc_id: doc_handle.es_doc_id.take(),\n                status,\n                error: Some(error),\n            };\n            let action = ElasticBulkAction::Index(item);\n            positioned_actions.push((doc_handle.doc_position, action));\n        }\n    }\n    assert!(\n        per_subrequest_doc_handles.is_empty(),\n        \"doc handles should be empty\"\n    );\n\n    for (position, item) in invalid_index_id_items {\n        errors = true;\n        let action = ElasticBulkAction::Index(item);\n        positioned_actions.push((position, action));\n    }\n\n    assert_eq!(\n        positioned_actions.len(),\n        action_count,\n        \"request and response action count should match\"\n    );\n    positioned_actions.sort_unstable_by_key(|(idx, _)| *idx);\n    let actions = positioned_actions\n        .into_iter()\n        .map(|(_, action)| action)\n        .collect();\n\n    let took_millis = now.elapsed().as_millis() as u64;\n\n    let bulk_response = ElasticBulkResponse {\n        took_millis,\n        errors,\n        actions,\n    };\n    Ok(bulk_response)\n}\n\n#[allow(clippy::result_large_err)]\nfn remove_doc_handles(\n    per_subrequest_doc_handles: &mut HashMap<u32, Vec<DocHandle>>,\n    subrequest_id: u32,\n) -> Result<Vec<DocHandle>, ElasticsearchError> {\n    per_subrequest_doc_handles\n        .remove(&subrequest_id)\n        
.ok_or_else(|| {\n            ElasticsearchError::new(\n                StatusCode::INTERNAL_SERVER_ERROR,\n                format!(\"could not find subrequest `{subrequest_id}` in bulk request\"),\n                None,\n            )\n        })\n}\n\nfn make_invalid_index_id_item(index_id: String, es_doc_id: Option<String>) -> ElasticBulkItem {\n    let error = ElasticBulkError {\n        index_id: Some(index_id.clone()),\n        exception: ElasticException::IllegalArgument,\n        reason: format!(\"invalid index id [{index_id}]\"),\n    };\n    ElasticBulkItem {\n        index_id,\n        es_doc_id,\n        status: StatusCode::BAD_REQUEST,\n        error: Some(error),\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytesize::ByteSize;\n    use quickwit_proto::ingest::router::{\n        IngestFailure, IngestFailureReason, IngestResponseV2, IngestSuccess,\n        MockIngestRouterService,\n    };\n    use quickwit_proto::ingest::{ParseFailure, ParseFailureReason};\n    use quickwit_proto::types::{IndexUid, Position, ShardId};\n    use warp::{Filter, Rejection, Reply};\n\n    use super::*;\n    use crate::elasticsearch_api::bulk_v2::ElasticBulkResponse;\n    use crate::elasticsearch_api::filter::elastic_bulk_filter;\n    use crate::elasticsearch_api::make_elastic_api_response;\n    use crate::elasticsearch_api::model::ElasticsearchError;\n    use crate::format::extract_format_from_qs;\n    use crate::with_arg;\n\n    impl ElasticBulkAction {\n        fn index_id(&self) -> &IndexId {\n            match self {\n                ElasticBulkAction::Create(item) => &item.index_id,\n                ElasticBulkAction::Index(item) => &item.index_id,\n            }\n        }\n\n        fn es_doc_id(&self) -> Option<&str> {\n            match self {\n                ElasticBulkAction::Create(item) => item.es_doc_id.as_deref(),\n                ElasticBulkAction::Index(item) => item.es_doc_id.as_deref(),\n            }\n        }\n\n        fn status(&self) -> 
StatusCode {\n            match self {\n                ElasticBulkAction::Create(item) => item.status,\n                ElasticBulkAction::Index(item) => item.status,\n            }\n        }\n\n        fn error(&self) -> Option<&ElasticBulkError> {\n            match self {\n                ElasticBulkAction::Create(item) => item.error.as_ref(),\n                ElasticBulkAction::Index(item) => item.error.as_ref(),\n            }\n        }\n    }\n\n    fn es_compat_bulk_handler_v2(\n        ingest_router: IngestRouterServiceClient,\n        content_length_limit: ByteSize,\n    ) -> impl Filter<Extract = (impl Reply,), Error = Rejection> + Clone {\n        elastic_bulk_filter(content_length_limit)\n            .and(with_arg(ingest_router))\n            .then(|body, bulk_options, ingest_router| {\n                elastic_bulk_ingest_v2(None, body, bulk_options, ingest_router)\n            })\n            .and(extract_format_from_qs())\n            .map(make_elastic_api_response)\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_happy_path() {\n        let mut mock_ingest_router = MockIngestRouterService::new();\n        mock_ingest_router\n            .expect_ingest()\n            .once()\n            .returning(|ingest_request| {\n                assert_eq!(ingest_request.subrequests.len(), 2);\n                assert_eq!(ingest_request.commit_type(), CommitTypeV2::Auto);\n\n                let mut subrequests = ingest_request.subrequests;\n                subrequests.sort_by(|left, right| left.index_id.cmp(&right.index_id));\n\n                assert_eq!(subrequests[0].subrequest_id, 0);\n                assert_eq!(subrequests[0].index_id, \"my-index-1\");\n                assert_eq!(subrequests[0].source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(subrequests[0].doc_batch.as_ref().unwrap().num_docs(), 2);\n                assert_eq!(subrequests[0].doc_batch.as_ref().unwrap().num_bytes(), 104);\n\n                
assert_eq!(subrequests[1].subrequest_id, 1);\n                assert_eq!(subrequests[1].index_id, \"my-index-2\");\n                assert_eq!(subrequests[1].source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(subrequests[1].doc_batch.as_ref().unwrap().num_docs(), 1);\n                assert_eq!(subrequests[1].doc_batch.as_ref().unwrap().num_bytes(), 52);\n\n                Ok(IngestResponseV2 {\n                    successes: vec![\n                        IngestSuccess {\n                            subrequest_id: 0,\n                            index_uid: Some(IndexUid::for_test(\"my-index-1\", 0)),\n                            source_id: INGEST_V2_SOURCE_ID.to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            replication_position_inclusive: Some(Position::offset(1u64)),\n                            num_ingested_docs: 2,\n                            parse_failures: Vec::new(),\n                        },\n                        IngestSuccess {\n                            subrequest_id: 1,\n                            index_uid: Some(IndexUid::for_test(\"my-index-2\", 0)),\n                            source_id: INGEST_V2_SOURCE_ID.to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            replication_position_inclusive: Some(Position::offset(0u64)),\n                            num_ingested_docs: 1,\n                            parse_failures: Vec::new(),\n                        },\n                    ],\n                    failures: Vec::new(),\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let payload = r#\"\n            {\"create\": {\"_index\": \"my-index-1\", \"_id\" : \"1\"}}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n            {\"create\": {\"_index\": 
\"my-index-2\", \"_id\" : \"1\"}}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n            {\"create\": {\"_index\": \"my-index-1\"}}\n            {\"ts\": 2, \"message\": \"my-message-2\"}\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(response.body()).unwrap();\n        assert!(!bulk_response.errors);\n\n        let mut items = bulk_response\n            .actions\n            .into_iter()\n            .map(|action| match action {\n                ElasticBulkAction::Create(item) => item,\n                ElasticBulkAction::Index(item) => item,\n            })\n            .collect::<Vec<_>>();\n        assert_eq!(items.len(), 3);\n\n        items.sort_by(|left, right| {\n            left.index_id\n                .cmp(&right.index_id)\n                .then(left.es_doc_id.cmp(&right.es_doc_id))\n        });\n        assert_eq!(items[0].index_id, \"my-index-1\");\n        assert!(items[0].es_doc_id.is_none());\n        assert_eq!(items[0].status, StatusCode::CREATED);\n\n        assert_eq!(items[1].index_id, \"my-index-1\");\n        assert_eq!(items[1].es_doc_id.as_ref().unwrap(), \"1\");\n        assert_eq!(items[1].status, StatusCode::CREATED);\n\n        assert_eq!(items[2].index_id, \"my-index-2\");\n        assert_eq!(items[2].es_doc_id.as_ref().unwrap(), \"1\");\n        assert_eq!(items[2].status, StatusCode::CREATED);\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_accepts_empty_requests() {\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n   
         .body(\"\")\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(response.body()).unwrap();\n        assert!(!bulk_response.errors)\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_ignores_blank_lines() {\n        let mut mock_ingest_router = MockIngestRouterService::new();\n        mock_ingest_router\n            .expect_ingest()\n            .once()\n            .returning(|ingest_request| {\n                assert_eq!(ingest_request.subrequests.len(), 1);\n                assert_eq!(ingest_request.commit_type(), CommitTypeV2::Auto);\n\n                let subrequest_0 = &ingest_request.subrequests[0];\n\n                assert_eq!(subrequest_0.index_id, \"my-index-1\");\n                assert_eq!(subrequest_0.source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(subrequest_0.doc_batch.as_ref().unwrap().num_docs(), 1);\n                assert_eq!(subrequest_0.doc_batch.as_ref().unwrap().num_bytes(), 52);\n\n                Ok(IngestResponseV2 {\n                    successes: vec![IngestSuccess {\n                        subrequest_id: 0,\n                        index_uid: Some(IndexUid::for_test(\"my-index-1\", 0)),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        replication_position_inclusive: Some(Position::offset(0u64)),\n                        num_ingested_docs: 1,\n                        parse_failures: Vec::new(),\n                    }],\n                    failures: Vec::new(),\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let payload = r#\"\n\n            {\"create\": {\"_index\": \"my-index-1\", \"_id\" : \"1\"}}\n\n            {\"ts\": 
1, \"message\": \"my-message-1\"}\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(response.body()).unwrap();\n        assert!(!bulk_response.errors);\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_handles_malformed_requests() {\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let payload = r#\"\n            {\"create\": {\"_index\": \"my-index-1\", \"_id\" : \"1\"},}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 400);\n\n        let es_error: ElasticsearchError = serde_json::from_slice(response.body()).unwrap();\n        assert_eq!(es_error.status, StatusCode::BAD_REQUEST);\n\n        let reason = es_error.error.reason.unwrap();\n        assert_eq!(\n            reason,\n            \"Malformed action/metadata line [1]: expected value at line 1 column 60\"\n        );\n\n        let payload = r#\"\n            {\"create\": {\"_index\": \"my-index-1\", \"_id\" : \"1\"}}\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 400);\n\n        let es_error: ElasticsearchError = serde_json::from_slice(response.body()).unwrap();\n        assert_eq!(es_error.status, StatusCode::BAD_REQUEST);\n\n        let reason = 
es_error.error.reason.unwrap();\n        assert_eq!(reason, \"Validation Failed: 1: no requests added;\");\n\n        let payload = r#\"\n            {\"create\": {\"_id\" : \"1\"}}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 400);\n\n        let es_error: ElasticsearchError = serde_json::from_slice(response.body()).unwrap();\n        assert_eq!(es_error.status, StatusCode::BAD_REQUEST);\n\n        let reason = es_error.error.reason.unwrap();\n        assert_eq!(reason, \"Validation Failed: 1: index is missing;\");\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_index_not_found() {\n        let mut mock_ingest_router = MockIngestRouterService::new();\n        mock_ingest_router\n            .expect_ingest()\n            .once()\n            .returning(|ingest_request| {\n                assert_eq!(ingest_request.subrequests.len(), 2);\n                assert_eq!(ingest_request.commit_type(), CommitTypeV2::Auto);\n\n                let mut subrequests = ingest_request.subrequests;\n                subrequests.sort_by(|left, right| left.index_id.cmp(&right.index_id));\n\n                assert_eq!(subrequests[0].subrequest_id, 0);\n                assert_eq!(subrequests[0].index_id, \"my-index-1\");\n                assert_eq!(subrequests[0].source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(subrequests[0].doc_batch.as_ref().unwrap().num_docs(), 2);\n\n                assert_eq!(subrequests[1].subrequest_id, 1);\n                assert_eq!(subrequests[1].index_id, \"my-index-2\");\n                assert_eq!(subrequests[1].source_id, INGEST_V2_SOURCE_ID);\n                assert_eq!(subrequests[1].doc_batch.as_ref().unwrap().num_docs(), 1);\n\n                Ok(IngestResponseV2 {\n           
         successes: Vec::new(),\n                    failures: vec![\n                        IngestFailure {\n                            subrequest_id: 0,\n                            index_id: \"my-index-1\".to_string(),\n                            source_id: INGEST_V2_SOURCE_ID.to_string(),\n                            reason: IngestFailureReason::IndexNotFound as i32,\n                        },\n                        IngestFailure {\n                            subrequest_id: 1,\n                            index_id: \"my-index-2\".to_string(),\n                            source_id: INGEST_V2_SOURCE_ID.to_string(),\n                            reason: IngestFailureReason::IndexNotFound as i32,\n                        },\n                    ],\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let payload = r#\"\n            {\"index\": {\"_index\": \"my-index-1\", \"_id\" : \"1\"}}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n            {\"index\": {\"_index\": \"my-index-1\"}}\n            {\"ts\": 2, \"message\": \"my-message-1\"}\n            {\"index\": {\"_index\": \"my-index-2\", \"_id\" : \"1\"}}\n            {\"ts\": 3, \"message\": \"my-message-2\"}\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(response.body()).unwrap();\n        assert!(bulk_response.errors);\n        assert_eq!(bulk_response.actions.len(), 3);\n    }\n\n    #[test]\n    fn test_bulk_api_make_elastic_bulk_response_v2() {\n        let response = make_elastic_bulk_response_v2(\n            IngestResponseV2::default(),\n     
       HashMap::new(),\n            Instant::now(),\n            0,\n            Vec::new(),\n        )\n        .unwrap();\n\n        assert!(!response.errors);\n        assert!(response.actions.is_empty());\n\n        let ingest_response_v2 = IngestResponseV2 {\n            successes: vec![IngestSuccess {\n                subrequest_id: 0,\n                index_uid: Some(IndexUid::for_test(\"test-index-foo\", 0)),\n                source_id: \"test-source\".to_string(),\n                shard_id: Some(ShardId::from(0)),\n                replication_position_inclusive: Some(Position::offset(0u64)),\n                num_ingested_docs: 1,\n                parse_failures: vec![ParseFailure {\n                    doc_uid: Some(DocUid::for_test(1)),\n                    reason: ParseFailureReason::InvalidJson as i32,\n                    message: \"failed to parse JSON document\".to_string(),\n                }],\n            }],\n            failures: vec![IngestFailure {\n                subrequest_id: 1,\n                index_id: \"test-index-bar\".to_string(),\n                source_id: \"test-source\".to_string(),\n                reason: IngestFailureReason::IndexNotFound as i32,\n            }],\n        };\n        let per_request_doc_handles = HashMap::from_iter([\n            (\n                0,\n                vec![\n                    DocHandle {\n                        doc_position: 0,\n                        doc_uid: DocUid::for_test(0),\n                        es_doc_id: Some(\"0\".to_string()),\n                        is_parse_failure: false,\n                    },\n                    DocHandle {\n                        doc_position: 1,\n                        doc_uid: DocUid::for_test(1),\n                        es_doc_id: Some(\"1\".to_string()),\n                        is_parse_failure: false,\n                    },\n                ],\n            ),\n            (\n                1,\n                vec![DocHandle {\n             
       doc_position: 2,\n                    doc_uid: DocUid::for_test(2),\n                    es_doc_id: Some(\"2\".to_string()),\n                    is_parse_failure: false,\n                }],\n            ),\n        ]);\n        let response = make_elastic_bulk_response_v2(\n            ingest_response_v2,\n            per_request_doc_handles,\n            Instant::now(),\n            3,\n            Vec::new(),\n        )\n        .unwrap();\n\n        assert!(response.errors);\n        assert_eq!(response.actions.len(), 3);\n\n        assert_eq!(response.actions[0].index_id(), \"test-index-foo\");\n        assert_eq!(response.actions[0].es_doc_id(), Some(\"0\"));\n        assert_eq!(response.actions[0].status(), StatusCode::CREATED);\n        assert!(response.actions[0].error().is_none());\n\n        assert_eq!(response.actions[1].index_id(), \"test-index-foo\");\n        assert_eq!(response.actions[1].es_doc_id(), Some(\"1\"));\n        assert_eq!(response.actions[1].status(), StatusCode::BAD_REQUEST);\n\n        let error = response.actions[1].error().unwrap();\n        assert_eq!(error.index_id.as_ref().unwrap(), \"test-index-foo\");\n        assert_eq!(error.exception, ElasticException::DocumentParsing);\n        assert_eq!(error.reason, \"failed to parse JSON document\");\n\n        assert_eq!(response.actions[2].index_id(), \"test-index-bar\");\n        assert_eq!(response.actions[2].es_doc_id(), Some(\"2\"));\n        assert_eq!(response.actions[2].status(), StatusCode::NOT_FOUND);\n\n        let error = response.actions[2].error().unwrap();\n        assert_eq!(error.index_id.as_ref().unwrap(), \"test-index-bar\");\n        assert_eq!(error.exception, ElasticException::IndexNotFound);\n        assert_eq!(error.reason, \"no such index [test-index-bar]\");\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_refresh_parameter() {\n        let mut mock_ingest_router = MockIngestRouterService::new();\n        mock_ingest_router\n            
.expect_ingest()\n            .once()\n            .returning(|ingest_request| {\n                assert_eq!(ingest_request.commit_type(), CommitTypeV2::WaitFor);\n                Ok(IngestResponseV2 {\n                    successes: vec![IngestSuccess {\n                        subrequest_id: 0,\n                        index_uid: Some(IndexUid::for_test(\"my-index-1\", 0)),\n                        source_id: INGEST_V2_SOURCE_ID.to_string(),\n                        shard_id: Some(ShardId::from(1)),\n                        replication_position_inclusive: Some(Position::offset(1u64)),\n                        num_ingested_docs: 2,\n                        parse_failures: Vec::new(),\n                    }],\n                    failures: Vec::new(),\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let payload = r#\"\n            {\"create\": {\"_index\": \"my-index-1\", \"_id\" : \"1\"}}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n        \"#;\n        warp::test::request()\n            .path(\"/_elastic/_bulk?refresh=wait_for\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n    }\n\n    #[tokio::test]\n    async fn test_bulk_api_invalid_index_id() {\n        let mut mock_ingest_router = MockIngestRouterService::new();\n        mock_ingest_router\n            .expect_ingest()\n            .once()\n            .returning(|ingest_request| {\n                assert_eq!(ingest_request.subrequests.len(), 2);\n                Ok(IngestResponseV2 {\n                    successes: vec![\n                        IngestSuccess {\n                            subrequest_id: 0,\n                            index_uid: Some(IndexUid::for_test(\"my-index-1\", 0)),\n                            source_id: 
INGEST_V2_SOURCE_ID.to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            replication_position_inclusive: Some(Position::offset(1u64)),\n                            num_ingested_docs: 2,\n                            parse_failures: Vec::new(),\n                        },\n                        IngestSuccess {\n                            subrequest_id: 1,\n                            index_uid: Some(IndexUid::for_test(\"my-index-2\", 0)),\n                            source_id: INGEST_V2_SOURCE_ID.to_string(),\n                            shard_id: Some(ShardId::from(1)),\n                            replication_position_inclusive: Some(Position::offset(0u64)),\n                            num_ingested_docs: 1,\n                            parse_failures: Vec::new(),\n                        },\n                    ],\n                    failures: Vec::new(),\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let handler = es_compat_bulk_handler_v2(ingest_router, ByteSize::mb(10));\n\n        let payload = r#\"\n            {\"create\": {\"_index\": \"my-index-1\"}}\n            {\"ts\": 1, \"message\": \"my-message-1\"}\n            {\"create\": {\"_index\": \"bad!\"}}\n            {\"ts\": 1, \"message\": \"my-message-2\"}\n            {\"create\": {\"_index\": \"my-index-2\", \"_id\" : \"1\"}}\n            {\"ts\": 1, \"message\": \"my-message-3\"}\n\n        \"#;\n        let response = warp::test::request()\n            .path(\"/_elastic/_bulk\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let bulk_response: ElasticBulkResponse = serde_json::from_slice(response.body()).unwrap();\n        assert!(bulk_response.errors);\n\n        let items = bulk_response\n            .actions\n            .into_iter()\n     
       .map(|action| match action {\n                ElasticBulkAction::Create(item) => item,\n                ElasticBulkAction::Index(item) => item,\n            })\n            .collect::<Vec<_>>();\n        assert_eq!(items.len(), 3);\n\n        assert_eq!(items[0].index_id, \"my-index-1\");\n        assert!(items[0].es_doc_id.is_none());\n        assert_eq!(items[0].status, StatusCode::CREATED);\n\n        assert_eq!(items[1].index_id, \"bad!\");\n        assert!(items[1].es_doc_id.is_none());\n        assert_eq!(items[1].status, StatusCode::BAD_REQUEST);\n\n        assert_eq!(items[2].index_id, \"my-index-2\");\n        assert_eq!(items[2].es_doc_id.as_ref().unwrap(), \"1\");\n        assert_eq!(items[2].status, StatusCode::CREATED);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/filter.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytes::Bytes;\nuse bytesize::ByteSize;\nuse serde::de::DeserializeOwned;\nuse warp::reject::LengthRequired;\nuse warp::{Filter, Rejection};\n\nuse super::model::{\n    CatIndexQueryParams, DeleteQueryParams, FieldCapabilityQueryParams, FieldCapabilityRequestBody,\n    MultiSearchQueryParams, SearchQueryParamsCount,\n};\nuse crate::Body;\nuse crate::decompression::get_body_bytes;\nuse crate::elasticsearch_api::model::{\n    ElasticBulkOptions, ScrollQueryParams, SearchBody, SearchQueryParams,\n};\nuse crate::search_api::{extract_index_id_patterns, extract_index_id_patterns_default};\n\nconst BODY_LENGTH_LIMIT: ByteSize = ByteSize::mib(1);\n\n// TODO: Make all elastic endpoint models `utoipa` compatible\n// and register them here.\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(elastic_cluster_info_filter,))]\npub struct ElasticCompatibleApi;\n\n#[utoipa::path(get, tag = \"Cluster Info\", path = \"/_elastic\")]\npub(crate) fn elastic_cluster_info_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone\n{\n    warp::path!(\"_elastic\")\n        .and(warp::get().or(warp::head()).unify())\n        .and(warp::path::end())\n}\n\n#[utoipa::path(get, tag = \"Search\", path = \"/_search\")]\npub(crate) fn elasticsearch_filter()\n-> impl Filter<Extract = (SearchQueryParams,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / 
\"_search\")\n        .and(warp::get().or(warp::post()).unify())\n        .and(warp::query())\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Ingest\",\n    path = \"/_bulk\",\n    request_body(content = String, description = \"Elasticsearch compatible bulk request body limited to 10MB\", content_type = \"application/json\"),\n    responses(\n        (status = 200, description = \"Successfully ingested documents.\", body = IngestResponse)\n    ),\n    params(\n        (\"refresh\" = Option<ElasticRefresh>, Query, description = \"Force or wait for commit at the end of the indexing operation.\"),\n    )\n)]\npub(crate) fn elastic_bulk_filter(\n    content_length_limit: ByteSize,\n) -> impl Filter<Extract = (Body, ElasticBulkOptions), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_bulk\")\n        .and(warp::post().or(warp::put()).unify())\n        .and(warp::body::content_length_limit(\n            content_length_limit.as_u64(),\n        ))\n        .and(get_body_bytes())\n        .and(warp::query())\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Ingest\",\n    path = \"/{index}/_bulk\",\n    request_body(content = String, description = \"Elasticsearch compatible bulk request body limited to 10MB\", content_type = \"application/json\"),\n    responses(\n        (status = 200, description = \"Successfully ingested documents.\", body = IngestResponse)\n    ),\n    params(\n        (\"refresh\" = Option<ElasticRefresh>, Query, description = \"Force or wait for commit at the end of the indexing operation.\"),\n    )\n)]\npub(crate) fn elastic_index_bulk_filter(\n    content_length_limit: ByteSize,\n) -> impl Filter<Extract = (String, Body, ElasticBulkOptions), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / String / \"_bulk\")\n        .and(warp::post().or(warp::put()).unify())\n        .and(warp::body::content_length_limit(\n            content_length_limit.as_u64(),\n        ))\n        .and(get_body_bytes())\n        
.and(warp::query::<ElasticBulkOptions>())\n}\n\n/// Like the warp json filter, but accepts an empty body and interprets it as `T::default`.\nfn json_or_empty<T: DeserializeOwned + Send + Default>()\n-> impl Filter<Extract = (T,), Error = Rejection> + Copy {\n    warp::body::content_length_limit(BODY_LENGTH_LIMIT.as_u64())\n        .and(warp::body::bytes().and_then(|buf: Bytes| async move {\n            if buf.is_empty() {\n                return Ok(T::default());\n            }\n            serde_json::from_slice(&buf)\n                .map_err(|err| warp::reject::custom(crate::rest::InvalidJsonRequest(err)))\n        }))\n        .recover(|rejection: Rejection| async {\n            // A missing `Content-Length` header is not an error as long as\n            // there is no body.\n            if rejection.find::<LengthRequired>().is_some() {\n                Ok(T::default())\n            } else {\n                Err(rejection)\n            }\n        })\n        .unify()\n}\n\n#[utoipa::path(get, tag = \"Metadata\", path = \"/{index}/_field_caps\")]\npub(crate) fn elastic_index_field_capabilities_filter() -> impl Filter<\n    Extract = (\n        Vec<String>,\n        FieldCapabilityQueryParams,\n        FieldCapabilityRequestBody,\n    ),\n    Error = Rejection,\n> + Clone {\n    warp::path!(\"_elastic\" / String / \"_field_caps\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::get().or(warp::post()).unify())\n        .and(warp::query())\n        .and(json_or_empty())\n}\n\n#[utoipa::path(get, tag = \"Metadata\", path = \"/_field_caps\")]\npub(crate) fn elastic_field_capabilities_filter() -> impl Filter<\n    Extract = (\n        Vec<String>,\n        FieldCapabilityQueryParams,\n        FieldCapabilityRequestBody,\n    ),\n    Error = Rejection,\n> + Clone {\n    warp::path!(\"_elastic\" / \"_field_caps\")\n        .and_then(extract_index_id_patterns_default)\n        .and(warp::get().or(warp::post()).unify())\n        
.and(warp::query())\n        .and(json_or_empty())\n}\n\n#[utoipa::path(get, tag = \"Metadata\", path = \"/_resolve/index/{index}\")]\npub(crate) fn elastic_resolve_index_filter()\n-> impl Filter<Extract = (Vec<String>,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_resolve\" / \"index\" / String)\n        .and_then(extract_index_id_patterns)\n        .and(warp::get())\n}\n\n#[utoipa::path(get, tag = \"Count\", path = \"/{index}/_count\")]\npub(crate) fn elastic_index_count_filter()\n-> impl Filter<Extract = (Vec<String>, SearchQueryParamsCount, SearchBody), Error = Rejection> + Clone\n{\n    warp::path!(\"_elastic\" / String / \"_count\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::get().or(warp::post()).unify())\n        .and(warp::query())\n        .and(json_or_empty())\n}\n\n#[utoipa::path(delete, tag = \"Indexes\", path = \"/{index}\")]\npub(crate) fn elastic_delete_index_filter()\n-> impl Filter<Extract = (Vec<String>, DeleteQueryParams), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / String)\n        .and(warp::delete())\n        .and_then(extract_index_id_patterns)\n        .and(warp::query())\n}\n\n// No support for any query parameters for now.\n#[utoipa::path(get, tag = \"Search\", path = \"/{index}/_stats\")]\npub(crate) fn elastic_index_stats_filter()\n-> impl Filter<Extract = (Vec<String>,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / String / \"_stats\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::get())\n}\n\n#[utoipa::path(get, tag = \"Search\", path = \"/_stats\")]\npub(crate) fn elastic_stats_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_stats\").and(warp::get())\n}\n\n#[utoipa::path(get, tag = \"Search\", path = \"/_cluster/health\")]\npub(crate) fn elastic_cluster_health_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone\n{\n    warp::path!(\"_elastic\" / \"_cluster\" / 
\"health\").and(warp::get())\n}\n\n#[utoipa::path(get, tag = \"Search\", path = \"/_cat/indices/{index}\")]\npub(crate) fn elastic_index_cat_indices_filter()\n-> impl Filter<Extract = (Vec<String>, CatIndexQueryParams), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_cat\" / \"indices\" / String)\n        .and_then(extract_index_id_patterns)\n        .and(warp::get())\n        .and(warp::query())\n}\n\n#[utoipa::path(get, tag = \"Search\", path = \"/_cat/indices\")]\npub(crate) fn elastic_cat_indices_filter()\n-> impl Filter<Extract = (CatIndexQueryParams,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_cat\" / \"indices\")\n        .and(warp::get())\n        .and(warp::query())\n}\n\n#[utoipa::path(get, tag = \"Search\", path = \"/{index}/_search\")]\npub(crate) fn elastic_index_search_filter()\n-> impl Filter<Extract = (Vec<String>, SearchQueryParams, SearchBody), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / String / \"_search\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::get().or(warp::post()).unify())\n        .and(warp::query())\n        .and(json_or_empty())\n}\n\n#[utoipa::path(post, tag = \"Search\", path = \"/_msearch\")]\npub(crate) fn elastic_multi_search_filter()\n-> impl Filter<Extract = (Bytes, MultiSearchQueryParams), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_msearch\")\n        .and(warp::body::content_length_limit(BODY_LENGTH_LIMIT.as_u64()))\n        .and(warp::body::bytes())\n        .and(warp::post())\n        .and(warp::query())\n}\n\nfn merge_scroll_body_params(\n    from_query_string: ScrollQueryParams,\n    from_body: ScrollQueryParams,\n) -> ScrollQueryParams {\n    ScrollQueryParams {\n        scroll: from_query_string.scroll.or(from_body.scroll),\n        scroll_id: from_query_string.scroll_id.or(from_body.scroll_id),\n    }\n}\n\npub(crate) fn elastic_nodes_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone {\n    
warp::path!(\"_elastic\" / \"_nodes\" / \"http\").and(warp::get())\n}\n\npub(crate) fn elastic_search_shards_filter()\n-> impl Filter<Extract = (String,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / String / \"_search_shards\").and(warp::get())\n}\n\n#[utoipa::path(post, tag = \"Search\", path = \"/_search/scroll\")]\npub(crate) fn elastic_scroll_filter()\n-> impl Filter<Extract = (ScrollQueryParams,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_search\" / \"scroll\")\n        .and(warp::body::content_length_limit(BODY_LENGTH_LIMIT.as_u64()))\n        .and(warp::get().or(warp::post()).unify())\n        .and(warp::query())\n        .and(json_or_empty())\n        .map(\n            |scroll_query_params: ScrollQueryParams, scroll_body: ScrollQueryParams| {\n                merge_scroll_body_params(scroll_query_params, scroll_body)\n            },\n        )\n}\n\npub(crate) fn elastic_delete_scroll_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone\n{\n    warp::path!(\"_elastic\" / \"_search\" / \"scroll\").and(warp::delete())\n}\n\npub(crate) fn elastic_aliases_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / \"_aliases\").and(warp::get())\n}\n\npub(crate) fn elastic_index_mapping_filter()\n-> impl Filter<Extract = (String,), Error = Rejection> + Clone {\n    warp::path!(\"_elastic\" / String / \"_mapping\")\n        .or(warp::path!(\"_elastic\" / String / \"_mappings\"))\n        .unify()\n        .and(warp::get())\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod bulk;\nmod bulk_v2;\nmod filter;\nmod model;\nmod rest_handler;\n\nuse std::sync::Arc;\n\nuse bulk::{es_compat_bulk_handler, es_compat_index_bulk_handler};\npub use filter::ElasticCompatibleApi;\nuse quickwit_cluster::Cluster;\nuse quickwit_config::NodeConfig;\nuse quickwit_index_management::IndexService;\nuse quickwit_ingest::IngestServiceClient;\nuse quickwit_proto::ingest::router::IngestRouterServiceClient;\nuse quickwit_proto::metastore::MetastoreServiceClient;\nuse quickwit_search::SearchService;\npub use rest_handler::{\n    es_compat_cat_indices_handler, es_compat_cluster_info_handler, es_compat_delete_index_handler,\n    es_compat_delete_scroll_handler, es_compat_index_cat_indices_handler,\n    es_compat_index_count_handler, es_compat_index_field_capabilities_handler,\n    es_compat_index_multi_search_handler, es_compat_index_search_handler,\n    es_compat_index_stats_handler, es_compat_resolve_index_handler, es_compat_scroll_handler,\n    es_compat_search_handler, es_compat_stats_handler,\n};\nuse rest_handler::{\n    es_compat_cluster_health_handler, es_compat_nodes_handler, es_compat_search_shards_handler,\n};\nuse serde::{Deserialize, Serialize};\nuse warp::hyper::StatusCode;\nuse warp::{Filter, Rejection};\n\nuse crate::elasticsearch_api::model::ElasticsearchError;\nuse crate::elasticsearch_api::rest_handler::{\n    
es_compat_aliases_handler, es_compat_index_mapping_handler,\n};\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::RestApiResponse;\nuse crate::{BodyFormat, BuildInfo};\n\n/// Setup Elasticsearch API handlers\n///\n/// This is where all newly supported Elasticsearch handlers\n/// should be registered.\n#[allow(clippy::too_many_arguments)] // Will go away when we remove ingest v1.\npub fn elastic_api_handlers(\n    cluster: Cluster,\n    node_config: Arc<NodeConfig>,\n    search_service: Arc<dyn SearchService>,\n    ingest_service: IngestServiceClient,\n    ingest_router: IngestRouterServiceClient,\n    metastore: MetastoreServiceClient,\n    index_service: IndexService,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    let ingest_content_length_limit = node_config.ingest_api_config.content_length_limit;\n    es_compat_cluster_info_handler(node_config.clone(), BuildInfo::get())\n        .or(es_compat_nodes_handler(node_config.clone()))\n        .or(es_compat_search_handler(search_service.clone()))\n        .or(es_compat_bulk_handler(\n            ingest_service.clone(),\n            ingest_router.clone(),\n            ingest_content_length_limit,\n            enable_ingest_v1,\n            enable_ingest_v2,\n        ))\n        .boxed()\n        .or(es_compat_index_bulk_handler(\n            ingest_service,\n            ingest_router,\n            ingest_content_length_limit,\n            enable_ingest_v1,\n            enable_ingest_v2,\n        ))\n        .or(es_compat_index_search_handler(search_service.clone()))\n        .or(es_compat_index_count_handler(search_service.clone()))\n        .or(es_compat_scroll_handler(search_service.clone()))\n        .or(es_compat_delete_scroll_handler())\n        .or(es_compat_index_multi_search_handler(search_service.clone()))\n        .or(es_compat_index_field_capabilities_handler(\n            search_service.clone(),\n        
))\n        .boxed()\n        .or(es_compat_index_stats_handler(metastore.clone()))\n        .or(es_compat_delete_index_handler(index_service))\n        .or(es_compat_stats_handler(metastore.clone()))\n        .or(es_compat_cluster_health_handler(cluster))\n        .or(es_compat_index_cat_indices_handler(metastore.clone()))\n        .or(es_compat_cat_indices_handler(metastore.clone()))\n        .or(es_compat_resolve_index_handler(metastore.clone()))\n        .or(es_compat_aliases_handler())\n        .or(es_compat_index_mapping_handler(\n            metastore.clone(),\n            search_service.clone(),\n        ))\n        .or(es_compat_search_shards_handler(node_config))\n        .recover(recover_fn)\n        .with(warp::reply::with::header(\n            \"X-Elastic-Product\",\n            \"Elasticsearch\",\n        ))\n        .boxed()\n    // Register newly created handlers here.\n}\n\n/// Helper type needed by the Elasticsearch endpoints.\n/// Controls how the total number of hits should be tracked.\n///\n/// When set to `Track` with a value `true`, the response will always track the number of hits that\n/// match the query accurately.\n///\n/// When set to `Count` with an integer value `n`, the response accurately tracks the total\n/// number of hits that match the query, up to `n` documents.\n#[derive(Debug, Clone, Copy, PartialEq, Serialize, Deserialize)]\n#[serde(untagged)]\npub enum TrackTotalHits {\n    /// Track the number of hits that match the query accurately.\n    Track(bool),\n    /// Track the number of hits up to the specified value.\n    Count(i64),\n}\n\nimpl From<bool> for TrackTotalHits {\n    fn from(b: bool) -> Self {\n        TrackTotalHits::Track(b)\n    }\n}\n\nimpl From<i64> for TrackTotalHits {\n    fn from(i: i64) -> Self {\n        TrackTotalHits::Count(i)\n    }\n}\n\nfn make_elastic_api_response<T: serde::Serialize>(\n    elasticsearch_result: Result<T, ElasticsearchError>,\n    body_format: BodyFormat,\n) -> RestApiResponse {\n    let 
status_code = match &elasticsearch_result {\n        Ok(_) => StatusCode::OK,\n        Err(error) => error.status,\n    };\n    RestApiResponse::new(&elasticsearch_result, status_code, body_format)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::sync::Arc;\n\n    use assert_json_diff::assert_json_include;\n    use mockall::predicate;\n    use quickwit_cluster::{ChannelTransport, Cluster, create_cluster_for_test};\n    use quickwit_config::NodeConfig;\n    use quickwit_index_management::IndexService;\n    use quickwit_ingest::{IngestApiService, IngestServiceClient};\n    use quickwit_metastore::metastore_for_test;\n    use quickwit_proto::ingest::router::IngestRouterServiceClient;\n    use quickwit_proto::metastore::MetastoreServiceClient;\n    use quickwit_search::MockSearchService;\n    use quickwit_storage::StorageResolver;\n    use serde_json::Value as JsonValue;\n    use warp::Filter;\n\n    use super::elastic_api_handlers;\n    use super::model::ElasticsearchError;\n    use crate::BuildInfo;\n    use crate::elasticsearch_api::rest_handler::es_compat_cluster_info_handler;\n    use crate::rest::recover_fn;\n\n    fn ingest_service_client() -> IngestServiceClient {\n        let universe = quickwit_actors::Universe::new();\n        let (ingest_service_mailbox, _) = universe.create_test_mailbox::<IngestApiService>();\n        IngestServiceClient::from_mailbox(ingest_service_mailbox)\n    }\n\n    pub async fn mock_cluster() -> Cluster {\n        let transport = ChannelTransport::default();\n        create_cluster_for_test(Vec::new(), &[], &transport, false)\n            .await\n            .unwrap()\n    }\n\n    #[tokio::test]\n    async fn test_msearch_api_return_200_responses() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .with(predicate::function(\n                |search_request: 
&quickwit_proto::search::SearchRequest| {\n                    (search_request.index_id_patterns == vec![\"index-1\".to_string()]\n                        && search_request.start_offset == 5\n                        && search_request.max_hits == 20)\n                        || (search_request.index_id_patterns == vec![\"index-2\".to_string()]\n                            && search_request.start_offset == 0\n                            && search_request.max_hits == 10)\n                },\n            ))\n            .returning(|_| Ok(Default::default()));\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = super::elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            ingest_service_client(),\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {\"index\":\"index-1\"}\n            {\"query\":{\"query_string\":{\"query\":\"test\"}}, \"from\": 5, \"size\": 20}\n            {\"index\":\"index-2\"}\n            {\"query\":{\"query_string\":{\"query\":\"test\"}}}\n            \"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        assert_eq!(\n            resp.headers().get(\"x-elastic-product\").unwrap(),\n            \"Elasticsearch\"\n        );\n        let string_body = String::from_utf8(resp.body().to_vec()).unwrap();\n        let es_msearch_response: serde_json::Value = serde_json::from_str(&string_body).unwrap();\n        let responses = es_msearch_response\n    
        .get(\"responses\")\n            .unwrap()\n            .as_array()\n            .unwrap();\n        assert_eq!(responses.len(), 2);\n        for response in responses {\n            assert_eq!(response.get(\"status\").unwrap().as_u64().unwrap(), 200);\n            assert_eq!(response.get(\"error\"), None);\n            response.get(\"hits\").unwrap();\n        }\n    }\n\n    #[tokio::test]\n    async fn test_msearch_api_return_one_500_and_one_200_responses() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .returning(|search_request| {\n                if search_request\n                    .index_id_patterns\n                    .contains(&\"index-1\".to_string())\n                {\n                    Ok(Default::default())\n                } else {\n                    Err(quickwit_search::SearchError::Internal(\n                        \"something bad happened\".to_string(),\n                    ))\n                }\n            });\n\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = super::elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            ingest_service_client(),\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {\"index\":\"index-1\"}\n            {\"query\":{\"query_string\":{\"query\":\"test\"}}, \"from\": 5, \"size\": 10}\n            {\"index\":\"index-2\"}\n            {\"query\":{\"query_string\":{\"query\":\"test\"}}}\n            \"#;\n        let resp = warp::test::request()\n            
.path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let es_msearch_response: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n        let responses = es_msearch_response\n            .get(\"responses\")\n            .unwrap()\n            .as_array()\n            .unwrap();\n        assert_eq!(responses.len(), 2);\n        assert_eq!(responses[0].get(\"status\").unwrap().as_u64().unwrap(), 200);\n        assert_eq!(responses[0].get(\"error\"), None);\n        assert_eq!(responses[1].get(\"status\").unwrap().as_u64().unwrap(), 500);\n        assert_eq!(responses[1].get(\"hits\"), None);\n        let error_cause = responses[1].get(\"error\").unwrap();\n        assert_eq!(\n            error_cause.get(\"reason\").unwrap().as_str().unwrap(),\n            \"internal error: `something bad happened`\"\n        );\n    }\n\n    #[tokio::test]\n    async fn test_msearch_api_return_400_with_malformed_request_header() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mock_search_service = MockSearchService::new();\n\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = super::elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            ingest_service_client(),\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {\"index\":\"index-1\"\n            {\"query\":{\"query_string\":{\"query\":\"test\"}}}\n            \"#;\n        let resp = warp::test::request()\n            
.path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 400);\n        let es_error: ElasticsearchError = serde_json::from_slice(resp.body()).unwrap();\n        assert!(\n            es_error\n                .error\n                .reason\n                .unwrap()\n                .starts_with(\"Invalid argument: failed to parse request header\")\n        );\n    }\n\n    #[tokio::test]\n    async fn test_msearch_api_return_400_with_malformed_request_body() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mock_search_service = MockSearchService::new();\n\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            ingest_service_client(),\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {\"index\":\"index-1\"}\n            {\"query\":{\"query_string\":{\"bad\":\"test\"}}}\n            \"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 400);\n        let es_error: ElasticsearchError = serde_json::from_slice(resp.body()).unwrap();\n        assert!(\n            es_error\n                .error\n                .reason\n                .unwrap()\n                .starts_with(\"Invalid argument: failed to parse request body\")\n        );\n    }\n\n    
#[tokio::test]\n    async fn test_msearch_api_return_400_with_only_a_header_request() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mock_search_service = MockSearchService::new();\n\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = super::elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            ingest_service_client(),\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {\"index\":\"index-1\"}\n            \"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 400);\n        let es_error: ElasticsearchError = serde_json::from_slice(resp.body()).unwrap();\n        assert!(\n            es_error\n                .error\n                .reason\n                .unwrap()\n                .starts_with(\"Invalid argument: expect request body after request header\")\n        );\n    }\n\n    #[tokio::test]\n    async fn test_msearch_api_return_400_with_no_index() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mock_search_service = MockSearchService::new();\n\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = super::elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            
ingest_service_client(),\n            ingest_router,\n            MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {}\n            {\"query\":{\"query_string\":{\"bad\":\"test\"}}}\n            \"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 400);\n        let es_error: ElasticsearchError = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(\n            es_error.error.reason.unwrap(),\n            \"Invalid argument: `_msearch` request header must define at least one index\"\n        );\n    }\n\n    #[tokio::test]\n    async fn test_msearch_api_return_400_with_multiple_indexes() {\n        let config = Arc::new(NodeConfig::for_test());\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .returning(|search_request| {\n                if search_request.index_id_patterns\n                    == vec![\"index-1\".to_string(), \"index-2\".to_string()]\n                {\n                    Ok(Default::default())\n                } else {\n                    Err(quickwit_search::SearchError::Internal(\n                        \"something bad happened\".to_string(),\n                    ))\n                }\n            });\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_for_test(), StorageResolver::unconfigured());\n        let es_search_api_handler = super::elastic_api_handlers(\n            mock_cluster().await,\n            config,\n            Arc::new(mock_search_service),\n            ingest_service_client(),\n            ingest_router,\n            
MetastoreServiceClient::mocked(),\n            index_service,\n            true,\n            false,\n        );\n        let msearch_payload = r#\"\n            {\"index\": [\"index-1\", \"index-2\"]}\n            {\"query\":{\"query_string\":{\"query\":\"test\"}}}\n            \"#;\n        let resp = warp::test::request()\n            .path(\"/_elastic/_msearch\")\n            .method(\"POST\")\n            .body(msearch_payload)\n            .reply(&es_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n    }\n\n    #[tokio::test]\n    async fn test_es_compat_cluster_info_handler() {\n        let build_info = BuildInfo::get();\n        let config = Arc::new(NodeConfig::for_test());\n        let handler =\n            es_compat_cluster_info_handler(config.clone(), build_info).recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/_elastic\")\n            .reply(&handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"name\" : config.node_id,\n            \"cluster_name\" : config.cluster_id,\n            \"version\" : {\n                \"distribution\" : \"quickwit\",\n                \"number\" : \"7.17.0\",\n                \"build_hash\" : build_info.commit_hash,\n                \"build_date\" : build_info.build_date,\n                \"build_snapshot\" : false,\n                \"lucene_version\" : \"8.11.1\",\n                \"minimum_wire_compatibility_version\" : \"6.8.0\",\n                \"minimum_index_compatibility_version\" : \"6.0.0-beta1\",\n            }\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n    }\n\n    #[tokio::test]\n    async fn test_head_request_on_root_endpoint() {\n        let build_info = BuildInfo::get();\n        let config = 
Arc::new(NodeConfig::for_test());\n        let handler =\n            es_compat_cluster_info_handler(config.clone(), build_info).recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/_elastic\")\n            .method(\"HEAD\")\n            .reply(&handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/bulk_body.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_proto::types::IndexId;\nuse serde::Deserialize;\n\n#[derive(Clone, Debug, Deserialize, PartialEq)]\n#[serde(rename_all(deserialize = \"lowercase\"))]\npub enum BulkAction {\n    Create(BulkActionMeta),\n    Index(BulkActionMeta),\n}\n\nimpl BulkAction {\n    pub fn into_index_id(self) -> Option<IndexId> {\n        match self {\n            BulkAction::Index(meta) => meta.index_id,\n            BulkAction::Create(meta) => meta.index_id,\n        }\n    }\n\n    pub fn into_meta(self) -> BulkActionMeta {\n        match self {\n            BulkAction::Create(meta) => meta,\n            BulkAction::Index(meta) => meta,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Deserialize, PartialEq)]\npub struct BulkActionMeta {\n    #[serde(alias = \"_index\")]\n    #[serde(default)]\n    pub index_id: Option<IndexId>,\n    #[serde(alias = \"_id\")]\n    #[serde(default)]\n    pub es_doc_id: Option<String>,\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::elasticsearch_api::model::BulkAction;\n    use crate::elasticsearch_api::model::bulk_body::BulkActionMeta;\n\n    #[test]\n    fn test_bulk_action_serde() {\n        {\n            let bulk_action_json = r#\"{\n                \"create\": {\n                    \"_index\": \"test\",\n                    \"_id\" : \"2\"\n                }\n            }\"#;\n            let bulk_action = 
serde_json::from_str::<BulkAction>(bulk_action_json).unwrap();\n            assert_eq!(\n                bulk_action,\n                BulkAction::Create(BulkActionMeta {\n                    index_id: Some(\"test\".to_string()),\n                    es_doc_id: Some(\"2\".to_string()),\n                })\n            );\n        }\n        {\n            let bulk_action_json = r#\"{\n                \"create\": {\n                    \"_index\": \"test\"\n                }\n            }\"#;\n            let bulk_action = serde_json::from_str::<BulkAction>(bulk_action_json).unwrap();\n            assert_eq!(\n                bulk_action,\n                BulkAction::Create(BulkActionMeta {\n                    index_id: Some(\"test\".to_string()),\n                    es_doc_id: None,\n                })\n            );\n        }\n        {\n            let bulk_action_json = r#\"{\n                \"create\": {\n                    \"_id\": \"3\"\n                }\n            }\"#;\n            let bulk_action = serde_json::from_str::<BulkAction>(bulk_action_json).unwrap();\n            assert_eq!(\n                bulk_action,\n                BulkAction::Create(BulkActionMeta {\n                    index_id: None,\n                    es_doc_id: Some(\"3\".to_string()),\n                })\n            );\n        }\n        {\n            let bulk_action_json = r#\"{\n                \"delete\": {\n                    \"_index\": \"test\",\n                    \"_id\": \"2\"\n                }\n            }\"#;\n            serde_json::from_str::<BulkAction>(bulk_action_json).unwrap_err();\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/bulk_query_params.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_ingest::CommitType;\nuse quickwit_proto::ingest::CommitTypeV2;\nuse serde::Deserialize;\n\n#[derive(Clone, Copy, Debug, Default, Deserialize, PartialEq)]\npub struct ElasticBulkOptions {\n    #[serde(default)]\n    pub refresh: ElasticRefresh,\n    #[serde(default)]\n    pub use_legacy_ingest: bool,\n}\n\n/// ?refresh parameter for elasticsearch bulk request\n///\n/// The syntax for this parameter is a bit confusing for backward compatibility reasons.\n/// - Absence of ?refresh parameter or ?refresh=false means no refresh\n/// - Presence of ?refresh parameter without any values or ?refresh=true means force refresh\n/// - ?refresh=wait_for means wait for refresh\n#[derive(Clone, Copy, Debug, Deserialize, PartialEq, utoipa::ToSchema)]\n#[serde(rename_all(deserialize = \"snake_case\"))]\n#[derive(Default)]\npub enum ElasticRefresh {\n    // if the refresh parameter is not present it is false\n    #[default]\n    /// The request doesn't wait for commit\n    False,\n    // but if it is present without a value like this: ?refresh, it should be the same as\n    // ?refresh=true\n    #[serde(alias = \"\")]\n    /// The request forces an immediate commit after the last document in the batch and waits for\n    /// it to finish.\n    True,\n    /// The request will wait for the next scheduled commit to finish.\n    WaitFor,\n}\n\nimpl 
From<ElasticRefresh> for CommitType {\n    fn from(val: ElasticRefresh) -> Self {\n        match val {\n            ElasticRefresh::False => Self::Auto,\n            ElasticRefresh::True => Self::Force,\n            ElasticRefresh::WaitFor => Self::WaitFor,\n        }\n    }\n}\n\nimpl From<ElasticRefresh> for CommitTypeV2 {\n    fn from(val: ElasticRefresh) -> Self {\n        match val {\n            ElasticRefresh::False => Self::Auto,\n            ElasticRefresh::True => Self::Force,\n            ElasticRefresh::WaitFor => Self::WaitFor,\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::elasticsearch_api::model::ElasticBulkOptions;\n    use crate::elasticsearch_api::model::bulk_query_params::ElasticRefresh;\n\n    #[test]\n    fn test_elastic_refresh_parsing() {\n        assert_eq!(\n            serde_qs::from_str::<ElasticBulkOptions>(\"\")\n                .unwrap()\n                .refresh,\n            ElasticRefresh::False\n        );\n        assert_eq!(\n            serde_qs::from_str::<ElasticBulkOptions>(\"refresh=true\")\n                .unwrap()\n                .refresh,\n            ElasticRefresh::True\n        );\n        assert_eq!(\n            serde_qs::from_str::<ElasticBulkOptions>(\"refresh=false\")\n                .unwrap()\n                .refresh,\n            ElasticRefresh::False\n        );\n        assert_eq!(\n            serde_qs::from_str::<ElasticBulkOptions>(\"refresh=wait_for\")\n                .unwrap()\n                .refresh,\n            ElasticRefresh::WaitFor\n        );\n        assert_eq!(\n            serde_qs::from_str::<ElasticBulkOptions>(\"refresh\")\n                .unwrap()\n                .refresh,\n            ElasticRefresh::True\n        );\n        assert_eq!(\n            serde_qs::from_str::<ElasticBulkOptions>(\"refresh=wait\")\n                .unwrap_err()\n                .to_string(),\n            \"unknown variant `wait`, expected one of `false`, ``, `true`, `wait_for`\"\n     
   );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/cat_indices.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::ops::AddAssign;\n\nuse quickwit_metastore::{IndexMetadata, SplitMetadata};\nuse serde::{Deserialize, Serialize, Serializer};\nuse warp::hyper::StatusCode;\n\nuse super::ElasticsearchError;\nuse crate::simple_list::{from_simple_list, to_simple_list};\n\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct CatIndexQueryParams {\n    #[serde(default)]\n    /// Only JSON supported for now.\n    pub format: Option<String>,\n    /// Comma-separated list of column names to display.\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub h: Option<Vec<String>>,\n    #[serde(default)]\n    /// Filter for health: green, yellow, or red\n    pub health: Option<Health>,\n    /// Unit used to display byte values.\n    /// Unsupported for now.\n    #[serde(default)]\n    pub bytes: Option<String>,\n    /// Comma-separated list of column names or column aliases used to sort the response.\n    /// Unsupported for now.\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub s: Option<Vec<String>>,\n    /// If true, the response includes column headings. 
Defaults to false.\n    /// Unsupported for now.\n    #[serde(default)]\n    pub v: Option<bool>,\n}\nimpl CatIndexQueryParams {\n    #[allow(clippy::result_large_err)]\n    pub fn validate(&self) -> Result<(), ElasticsearchError> {\n        if let Some(format) = &self.format {\n            if format.to_lowercase() != \"json\" {\n                return Err(ElasticsearchError::new(\n                    StatusCode::BAD_REQUEST,\n                    format!(\"Format {format:?} is not supported. Only format=json is supported.\"),\n                    None,\n                ));\n            }\n        } else {\n            return Err(ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                \"Only format=json is supported.\".to_string(),\n                None,\n            ));\n        }\n        let unsupported_parameter_error = |field: &str| {\n            ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                format!(\"Parameter {field:?} is not supported.\"),\n                None,\n            )\n        };\n        if self.bytes.is_some() {\n            return Err(unsupported_parameter_error(\"bytes\"));\n        }\n        if self.v.is_some() {\n            return Err(unsupported_parameter_error(\"v\"));\n        }\n        if let Some(sort_by) = &self.s {\n            if sort_by.len() > 1 {\n                return Err(unsupported_parameter_error(\"s\"));\n            }\n            if sort_by[0] != \"index\" && sort_by[0] != \"index:asc\" {\n                return Err(unsupported_parameter_error(\"s\"));\n            }\n        }\n        Ok(())\n    }\n}\n\n#[derive(Debug, Clone, Default, Serialize)]\npub struct ElasticsearchCatIndexResponse {\n    pub health: Health,\n    status: Status,\n    pub index: String,\n    uuid: String,\n    pri: String,\n    rep: String,\n    #[serde(rename = \"docs.count\", serialize_with = \"serialize_u64_as_string\")]\n    docs_count: u64,\n    #[serde(rename = 
\"docs.deleted\", serialize_with = \"serialize_u64_as_string\")]\n    docs_deleted: u64,\n    #[serde(rename = \"store.size\", serialize_with = \"ser_es_format\")]\n    store_size: u64,\n    #[serde(rename = \"pri.store.size\", serialize_with = \"ser_es_format\")]\n    pri_store_size: u64,\n    #[serde(rename = \"dataset.size\", serialize_with = \"ser_es_format\")]\n    dataset_size: u64,\n}\n\nimpl ElasticsearchCatIndexResponse {\n    pub fn serialize_filtered(\n        &self,\n        fields: &Option<Vec<String>>,\n    ) -> serde_json::Result<serde_json::Value> {\n        let mut value = serde_json::to_value(self)?;\n\n        if let Some(fields) = fields {\n            let fields: HashSet<String> = fields.iter().cloned().collect();\n            // If fields are specified, retain only those fields\n            if let serde_json::Value::Object(ref mut map) = value {\n                map.retain(|key, _| fields.contains(key));\n            }\n        }\n\n        Ok(value)\n    }\n}\nimpl AddAssign for ElasticsearchCatIndexResponse {\n    fn add_assign(&mut self, rhs: Self) {\n        self.health += rhs.health;\n        self.status += rhs.status;\n        self.docs_count += rhs.docs_count;\n        self.docs_deleted += rhs.docs_deleted;\n        self.store_size += rhs.store_size;\n        self.pri_store_size += rhs.pri_store_size;\n        self.dataset_size += rhs.dataset_size;\n    }\n}\n\nimpl From<IndexMetadata> for ElasticsearchCatIndexResponse {\n    fn from(index_metadata: IndexMetadata) -> Self {\n        ElasticsearchCatIndexResponse {\n            uuid: index_metadata.index_uid.to_string(),\n            index: index_metadata.index_config.index_id.to_string(),\n            pri: \"1\".to_string(),\n            rep: \"1\".to_string(),\n            ..Default::default()\n        }\n    }\n}\n\nimpl From<SplitMetadata> for ElasticsearchCatIndexResponse {\n    fn from(split_metadata: SplitMetadata) -> Self {\n        ElasticsearchCatIndexResponse {\n            
store_size: split_metadata.as_split_info().file_size_bytes.as_u64(),\n            pri_store_size: split_metadata.as_split_info().file_size_bytes.as_u64(),\n            dataset_size: split_metadata\n                .as_split_info()\n                .uncompressed_docs_size_bytes\n                .as_u64(),\n            uuid: split_metadata.index_uid.to_string(),\n            pri: \"1\".to_string(),\n            rep: \"1\".to_string(),\n            docs_count: split_metadata.as_split_info().num_docs as u64,\n            ..Default::default()\n        }\n    }\n}\n\n#[derive(Debug, Clone, Default, Serialize)]\npub struct ElasticsearchResolveIndexResponse {\n    pub indices: Vec<ElasticsearchResolveIndexEntryResponse>,\n    // Unused for the moment.\n    pub aliases: Vec<serde_json::Value>,\n    pub data_streams: Vec<serde_json::Value>,\n}\n\n#[derive(Debug, Clone, Default, Serialize)]\npub struct ElasticsearchResolveIndexEntryResponse {\n    pub name: String,\n    pub attributes: Vec<Status>,\n}\n\nimpl From<IndexMetadata> for ElasticsearchResolveIndexEntryResponse {\n    fn from(index_metadata: IndexMetadata) -> Self {\n        ElasticsearchResolveIndexEntryResponse {\n            name: index_metadata.index_config.index_id.to_string(),\n            attributes: vec![Status::Open],\n        }\n    }\n}\n\nfn serialize_u64_as_string<S>(value: &u64, serializer: S) -> Result<S::Ok, S::Error>\nwhere S: Serializer {\n    serializer.serialize_str(&value.to_string())\n}\n\nfn ser_es_format<S>(bytes: &u64, serializer: S) -> Result<S::Ok, S::Error>\nwhere S: Serializer {\n    serializer.serialize_str(&format_byte_size(*bytes))\n}\n\nfn format_byte_size(bytes: u64) -> String {\n    const KILOBYTE: u64 = 1024;\n    const MEGABYTE: u64 = KILOBYTE * 1024;\n    const GIGABYTE: u64 = MEGABYTE * 1024;\n    const TERABYTE: u64 = GIGABYTE * 1024;\n    if bytes < KILOBYTE {\n        format!(\"{bytes}b\")\n    } else if bytes < MEGABYTE {\n        format!(\"{:.1}kb\", bytes as f64 / 
KILOBYTE as f64)\n    } else if bytes < GIGABYTE {\n        format!(\"{:.1}mb\", bytes as f64 / MEGABYTE as f64)\n    } else if bytes < TERABYTE {\n        format!(\"{:.1}gb\", bytes as f64 / GIGABYTE as f64)\n    } else {\n        format!(\"{:.1}tb\", bytes as f64 / TERABYTE as f64)\n    }\n}\n\n#[derive(Debug, Default, Clone, Copy, Serialize, Deserialize, Eq, PartialEq)]\n#[serde(rename_all = \"lowercase\")]\npub enum Health {\n    #[default]\n    Green = 1,\n    Yellow = 2,\n    Red = 3,\n}\nimpl AddAssign for Health {\n    fn add_assign(&mut self, other: Self) {\n        *self = match std::cmp::max(*self as u8, other as u8) {\n            1 => Health::Green,\n            2 => Health::Yellow,\n            _ => Health::Red,\n        };\n    }\n}\n\n#[derive(Debug, Default, Clone, Copy, Serialize, Deserialize)]\n#[serde(rename_all = \"lowercase\")]\npub enum Status {\n    #[default]\n    Open = 1,\n}\nimpl AddAssign for Status {\n    fn add_assign(&mut self, other: Self) {\n        *self = match std::cmp::max(*self as u8, other as u8) {\n            1 => Status::Open,\n            _ => Status::Open,\n        };\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use serde_json::json;\n\n    use super::*;\n\n    #[test]\n    fn test_serialize_filtered() {\n        let response = ElasticsearchCatIndexResponse {\n            health: Health::Green,\n            status: Status::Open,\n            index: \"test_index\".to_string(),\n            uuid: \"test_uuid\".to_string(),\n            pri: \"1\".to_string(),\n            rep: \"2\".to_string(),\n            docs_count: 100,\n            docs_deleted: 10,\n            store_size: 1000,\n            pri_store_size: 500,\n            dataset_size: 1500,\n        };\n\n        // Test serialization with all fields\n        let all_fields = response.serialize_filtered(&None).unwrap();\n        let expected_all_fields = json!({\n            \"health\": \"green\",\n            \"status\": \"open\",\n            \"index\": 
\"test_index\",\n            \"uuid\": \"test_uuid\",\n            \"pri\": \"1\",\n            \"rep\": \"2\",\n            \"docs.count\": \"100\",\n            \"docs.deleted\": \"10\",\n            \"store.size\": \"1000b\",  // Assuming ser_es_format formats size to kb\n            \"pri.store.size\": \"500b\", // Example format\n            \"dataset.size\": \"1.5kb\", // Example format\n        });\n        assert_eq!(all_fields, expected_all_fields);\n\n        // Test serialization with selected fields\n        let selected_fields = response\n            .serialize_filtered(&Some(vec![\"index\".to_string(), \"uuid\".to_string()]))\n            .unwrap();\n        let expected_selected_fields = json!({\n            \"index\": \"test_index\",\n            \"uuid\": \"test_uuid\"\n        });\n        assert_eq!(selected_fields, expected_selected_fields);\n\n        // Add more test cases as needed\n    }\n\n    #[test]\n    fn test_cat_index_query_params_validate_s_parameter() {\n        let params = CatIndexQueryParams {\n            format: Some(\"json\".to_string()),\n            s: Some(vec![\"index:asc\".to_string()]),\n            ..Default::default()\n        };\n        assert!(params.validate().is_ok());\n\n        let params = CatIndexQueryParams {\n            format: Some(\"json\".to_string()),\n            s: Some(vec![\"index\".to_string()]),\n            ..Default::default()\n        };\n        assert!(params.validate().is_ok());\n\n        let params = CatIndexQueryParams {\n            format: Some(\"json\".to_string()),\n            s: Some(vec![\"index:desc\".to_string()]),\n            ..Default::default()\n        };\n        assert!(params.validate().is_err());\n\n        let params = CatIndexQueryParams {\n            format: Some(\"json\".to_string()),\n            s: Some(vec![\"index:asc\".to_string(), \"docs.count\".to_string()]),\n            ..Default::default()\n        };\n        assert!(params.validate().is_err());\n\n       
 let params = CatIndexQueryParams {\n            format: Some(\"json\".to_string()),\n            s: None,\n            ..Default::default()\n        };\n        assert!(params.validate().is_ok());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse elasticsearch_dsl::search::ErrorCause;\nuse quickwit_common::{rate_limited_debug, rate_limited_error};\nuse quickwit_index_management::IndexServiceError;\nuse quickwit_ingest::IngestServiceError;\nuse quickwit_proto::ServiceError;\nuse quickwit_proto::ingest::IngestV2Error;\nuse quickwit_search::SearchError;\nuse serde::{Deserialize, Serialize};\nuse warp::hyper::StatusCode;\n\n#[derive(Debug, Clone, Serialize, Deserialize)]\npub struct ElasticsearchError {\n    #[serde(with = \"http_serde::status_code\")]\n    pub status: StatusCode,\n    pub error: ErrorCause,\n}\n\nimpl ElasticsearchError {\n    pub fn new(\n        status: StatusCode,\n        reason: String,\n        exception_opt: Option<ElasticException>,\n    ) -> Self {\n        if status.is_server_error() {\n            rate_limited_error!(limit_per_min=10, status=%status, \"http request failed with server error: {reason}\");\n        } else if !status.is_success() {\n            rate_limited_debug!(limit_per_min=10, status=%status, \"http request failed: {reason}\");\n        }\n        ElasticsearchError {\n            status,\n            error: ErrorCause {\n                reason: Some(reason),\n                caused_by: None,\n                root_cause: Vec::new(),\n                stack_trace: None,\n                suppressed: Vec::new(),\n                ty: 
exception_opt.map(|exception| exception.as_str().to_string()),\n                additional_details: Default::default(),\n            },\n        }\n    }\n}\n\nimpl From<SearchError> for ElasticsearchError {\n    fn from(search_error: SearchError) -> Self {\n        let status = search_error.error_code().http_status_code();\n        // Fill only reason field to keep it simple.\n        let reason = ErrorCause {\n            reason: Some(search_error.to_string()),\n            caused_by: None,\n            root_cause: Vec::new(),\n            stack_trace: None,\n            suppressed: Vec::new(),\n            ty: None,\n            additional_details: Default::default(),\n        };\n        ElasticsearchError {\n            status,\n            error: reason,\n        }\n    }\n}\n\nimpl From<IngestServiceError> for ElasticsearchError {\n    fn from(ingest_service_error: IngestServiceError) -> Self {\n        let status = ingest_service_error.error_code().http_status_code();\n\n        let reason = ErrorCause {\n            reason: Some(ingest_service_error.to_string()),\n            caused_by: None,\n            root_cause: Vec::new(),\n            stack_trace: None,\n            suppressed: Vec::new(),\n            ty: None,\n            additional_details: Default::default(),\n        };\n        ElasticsearchError {\n            status,\n            error: reason,\n        }\n    }\n}\n\nimpl From<IngestV2Error> for ElasticsearchError {\n    fn from(ingest_error: IngestV2Error) -> Self {\n        let status = ingest_error.error_code().http_status_code();\n\n        let reason = ErrorCause {\n            reason: Some(ingest_error.to_string()),\n            caused_by: None,\n            root_cause: Vec::new(),\n            stack_trace: None,\n            suppressed: Vec::new(),\n            ty: None,\n            additional_details: Default::default(),\n        };\n        ElasticsearchError {\n            status,\n            error: reason,\n        }\n    
}\n}\n\nimpl From<IndexServiceError> for ElasticsearchError {\n    fn from(index_service_error: IndexServiceError) -> Self {\n        let status = index_service_error.error_code().http_status_code();\n\n        let reason = ErrorCause {\n            reason: Some(index_service_error.to_string()),\n            caused_by: None,\n            root_cause: Vec::new(),\n            stack_trace: None,\n            suppressed: Vec::new(),\n            ty: None,\n            additional_details: Default::default(),\n        };\n        ElasticsearchError {\n            status,\n            error: reason,\n        }\n    }\n}\n\n#[derive(Debug, Clone, Copy, Eq, PartialEq, Serialize, Deserialize)]\npub enum ElasticException {\n    #[serde(rename = \"action_request_validation_exception\")]\n    ActionRequestValidation,\n    #[serde(rename = \"document_parsing_exception\")]\n    DocumentParsing,\n    // This exception is specific to Quickwit.\n    #[serde(rename = \"internal_exception\")]\n    Internal,\n    #[serde(rename = \"illegal_argument_exception\")]\n    IllegalArgument,\n    #[serde(rename = \"index_not_found_exception\")]\n    IndexNotFound,\n    // This exception is specific to Quickwit.\n    #[serde(rename = \"rate_limited_exception\")]\n    RateLimited,\n    // This exception is specific to Quickwit.\n    #[serde(rename = \"source_not_found_exception\")]\n    SourceNotFound,\n    #[serde(rename = \"timeout_exception\")]\n    Timeout,\n}\n\nimpl ElasticException {\n    pub fn as_str(&self) -> &'static str {\n        match self {\n            Self::ActionRequestValidation => \"action_request_validation_exception\",\n            Self::DocumentParsing => \"document_parsing_exception\",\n            Self::Internal => \"internal_exception\",\n            Self::RateLimited => \"rate_limited_exception\",\n            Self::IllegalArgument => \"illegal_argument_exception\",\n            Self::IndexNotFound => \"index_not_found_exception\",\n            Self::SourceNotFound => 
\"source_not_found_exception\",\n            Self::Timeout => \"timeout_exception\",\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/field_capability.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse quickwit_proto::search::{ListFieldType, ListFieldsEntryResponse, ListFieldsResponse};\nuse quickwit_query::ElasticQueryDsl;\nuse quickwit_query::query_ast::QueryAst;\nuse serde::{Deserialize, Serialize};\nuse warp::hyper::StatusCode;\n\nuse super::ElasticsearchError;\nuse super::search_query_params::*;\nuse crate::simple_list::{from_simple_list, to_simple_list};\n\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct FieldCapabilityQueryParams {\n    #[serde(default)]\n    pub allow_no_indices: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub expand_wildcards: Option<Vec<ExpandWildcards>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub fields: Option<Vec<String>>,\n    #[serde(default)]\n    pub ignore_unavailable: Option<bool>,\n    /// Non-ES Parameter. If set, restricts splits to documents with a `time_range.start >=\n    /// start_timestamp`.\n    pub start_timestamp: Option<i64>,\n    /// Non-ES Parameter. 
If set, restricts splits to documents with a `time_range.end <\n    /// end_timestamp`.\n    pub end_timestamp: Option<i64>,\n}\n\n#[derive(Debug, Default, Clone, Deserialize, PartialEq)]\n#[serde(deny_unknown_fields)]\npub struct FieldCapabilityRequestBody {\n    #[serde(default)]\n    // Elasticsearch query DSL, converted by `parse_index_filter_to_query_ast`\n    pub index_filter: serde_json::Value,\n    #[serde(default)]\n    // unsupported currently\n    pub runtime_mappings: serde_json::Value,\n}\n\n#[derive(Serialize, Deserialize, Debug)]\npub struct FieldCapabilityResponse {\n    indices: Vec<String>,\n    fields: HashMap<String, FieldCapabilityFieldTypesResponse>,\n}\n\ntype FieldCapabilityFieldTypesResponse =\n    HashMap<FieldCapabilityEntryType, FieldCapabilityEntryResponse>;\n\n#[derive(Serialize, Deserialize, Debug, Clone, Eq, PartialEq, Hash)]\nenum FieldCapabilityEntryType {\n    #[serde(rename = \"long\")]\n    Long,\n    #[serde(rename = \"keyword\")]\n    Keyword,\n    #[serde(rename = \"text\")]\n    Text,\n    #[serde(rename = \"date_nanos\")]\n    DateNanos,\n    #[serde(rename = \"binary\")]\n    Binary,\n    #[serde(rename = \"double\")]\n    Double,\n    #[serde(rename = \"boolean\")]\n    Boolean,\n    #[serde(rename = \"ip\")]\n    Ip,\n    // Unmapped currently\n    #[serde(rename = \"nested\")]\n    Nested,\n    // Unmapped currently\n    #[serde(rename = \"object\")]\n    Object,\n}\n\n#[derive(Serialize, Deserialize, Debug, Clone)]\nstruct FieldCapabilityEntryResponse {\n    metadata_field: bool, // Always false\n    searchable: bool,\n    aggregatable: bool,\n    // Option since it is filled later\n    #[serde(rename = \"type\")]\n    typ: Option<FieldCapabilityEntryType>,\n    #[serde(skip_serializing_if = \"Vec::is_empty\")]\n    indices: Vec<String>, // [ \"index1\", \"index2\" ],\n    #[serde(skip_serializing_if = \"Vec::is_empty\")]\n    non_aggregatable_indices: Vec<String>, // [ \"index1\" ]\n    #[serde(skip_serializing_if = \"Vec::is_empty\")]\n    non_searchable_indices: 
Vec<String>, // [ \"index1\" ]\n}\nimpl FieldCapabilityEntryResponse {\n    fn from_list_field_entry_response(entry: ListFieldsEntryResponse) -> Self {\n        Self {\n            metadata_field: false,\n            searchable: entry.searchable,\n            aggregatable: entry.aggregatable,\n            typ: None,\n            indices: entry.index_ids.clone(),\n            non_aggregatable_indices: entry.non_aggregatable_index_ids,\n            non_searchable_indices: entry.non_searchable_index_ids,\n        }\n    }\n}\n\npub fn convert_to_es_field_capabilities_response(\n    resp: ListFieldsResponse,\n) -> FieldCapabilityResponse {\n    let mut indices = resp\n        .fields\n        .iter()\n        .flat_map(|entry| entry.index_ids.iter().cloned())\n        .collect::<Vec<_>>();\n    indices.sort();\n    indices.dedup();\n\n    let mut fields: HashMap<String, FieldCapabilityFieldTypesResponse> = HashMap::new();\n    for list_field_resp in resp.fields {\n        let entry = fields\n            .entry(list_field_resp.field_name.to_string())\n            .or_default();\n\n        let field_type = ListFieldType::try_from(list_field_resp.field_type).unwrap();\n        let add_entry =\n            FieldCapabilityEntryResponse::from_list_field_entry_response(list_field_resp);\n        let types = match field_type {\n            ListFieldType::Str => {\n                vec![\n                    FieldCapabilityEntryType::Keyword,\n                    FieldCapabilityEntryType::Text,\n                ]\n            }\n            ListFieldType::U64 => vec![FieldCapabilityEntryType::Long],\n            ListFieldType::I64 => vec![FieldCapabilityEntryType::Long],\n            ListFieldType::F64 => vec![FieldCapabilityEntryType::Double],\n            ListFieldType::Bool => vec![FieldCapabilityEntryType::Boolean],\n            ListFieldType::Date => vec![FieldCapabilityEntryType::DateNanos],\n            ListFieldType::Facet => continue,\n            ListFieldType::Json => 
continue,\n            ListFieldType::Bytes => vec![FieldCapabilityEntryType::Binary],\n            ListFieldType::IpAddr => vec![FieldCapabilityEntryType::Ip],\n        };\n        for field_type in types {\n            let mut add_entry = add_entry.clone();\n            add_entry.typ = Some(field_type.clone());\n\n            // If the field exists in all indices, we omit field.indices in the response.\n            let exists_in_all_indices = add_entry.indices.len() == indices.len();\n            if exists_in_all_indices {\n                add_entry.indices = Vec::new();\n            }\n\n            entry.insert(field_type, add_entry);\n        }\n    }\n    FieldCapabilityResponse { indices, fields }\n}\n\n/// Parses an Elasticsearch index_filter JSON value into a Quickwit QueryAst.\n///\n/// Returns `Ok(None)` if the index_filter is null.\n/// Returns `Ok(Some(QueryAst))` if the index_filter is valid.\n/// Returns `Err` if the index_filter is invalid or cannot be converted (including empty object).\n#[allow(clippy::result_large_err)]\npub fn parse_index_filter_to_query_ast(\n    index_filter: serde_json::Value,\n) -> Result<Option<QueryAst>, ElasticsearchError> {\n    if index_filter.is_null() {\n        return Ok(None);\n    }\n\n    // Parse ES Query DSL to internal QueryAst\n    let elastic_query_dsl: ElasticQueryDsl =\n        serde_json::from_value(index_filter).map_err(|err| {\n            ElasticsearchError::new(\n                StatusCode::BAD_REQUEST,\n                format!(\"Invalid index_filter: {err}\"),\n                None,\n            )\n        })?;\n\n    let query_ast: QueryAst = elastic_query_dsl.try_into().map_err(|err: anyhow::Error| {\n        ElasticsearchError::new(\n            StatusCode::BAD_REQUEST,\n            format!(\"Failed to convert index_filter: {err}\"),\n            None,\n        )\n    })?;\n\n    Ok(Some(query_ast))\n}\n\n#[allow(clippy::result_large_err)]\npub fn build_list_field_request_for_es_api(\n    
index_id_patterns: Vec<String>,\n    search_params: FieldCapabilityQueryParams,\n    search_body: FieldCapabilityRequestBody,\n) -> Result<quickwit_proto::search::ListFieldsRequest, ElasticsearchError> {\n    let query_ast = parse_index_filter_to_query_ast(search_body.index_filter)?;\n    let query_ast_json = query_ast\n        .map(|ast| serde_json::to_string(&ast).expect(\"QueryAst should be JSON serializable\"));\n\n    Ok(quickwit_proto::search::ListFieldsRequest {\n        index_id_patterns,\n        fields: search_params.fields.unwrap_or_default(),\n        start_timestamp: search_params.start_timestamp,\n        end_timestamp: search_params.end_timestamp,\n        query_ast: query_ast_json,\n    })\n}\n\n#[cfg(test)]\nmod tests {\n    use serde_json::json;\n\n    use super::*;\n\n    #[test]\n    fn test_build_list_field_request_empty_index_filter() {\n        let result = build_list_field_request_for_es_api(\n            vec![\"test_index\".to_string()],\n            FieldCapabilityQueryParams::default(),\n            FieldCapabilityRequestBody::default(),\n        )\n        .unwrap();\n\n        assert_eq!(result.index_id_patterns, vec![\"test_index\".to_string()]);\n        assert!(result.query_ast.is_none());\n    }\n\n    #[test]\n    fn test_build_list_field_request_with_term_index_filter() {\n        let search_body = FieldCapabilityRequestBody {\n            index_filter: json!({\n                \"term\": {\n                    \"status\": \"active\"\n                }\n            }),\n            runtime_mappings: serde_json::Value::Null,\n        };\n\n        let result = build_list_field_request_for_es_api(\n            vec![\"test_index\".to_string()],\n            FieldCapabilityQueryParams::default(),\n            search_body,\n        )\n        .unwrap();\n\n        assert_eq!(result.index_id_patterns, vec![\"test_index\".to_string()]);\n        assert!(result.query_ast.is_some());\n\n        // Verify the query_ast is valid JSON\n        
let query_ast: serde_json::Value =\n            serde_json::from_str(&result.query_ast.unwrap()).unwrap();\n        assert!(query_ast.is_object());\n    }\n\n    #[test]\n    fn test_build_list_field_request_with_bool_index_filter() {\n        let search_body = FieldCapabilityRequestBody {\n            index_filter: json!({\n                \"bool\": {\n                    \"must\": [\n                        { \"term\": { \"status\": \"active\" } }\n                    ],\n                    \"filter\": [\n                        { \"range\": { \"age\": { \"gte\": 18 } } }\n                    ]\n                }\n            }),\n            runtime_mappings: serde_json::Value::Null,\n        };\n\n        let result = build_list_field_request_for_es_api(\n            vec![\"test_index\".to_string()],\n            FieldCapabilityQueryParams::default(),\n            search_body,\n        )\n        .unwrap();\n\n        assert!(result.query_ast.is_some());\n    }\n\n    #[test]\n    fn test_build_list_field_request_with_invalid_index_filter() {\n        let search_body = FieldCapabilityRequestBody {\n            index_filter: json!({\n                \"invalid_query_type\": {\n                    \"field\": \"value\"\n                }\n            }),\n            runtime_mappings: serde_json::Value::Null,\n        };\n\n        let result = build_list_field_request_for_es_api(\n            vec![\"test_index\".to_string()],\n            FieldCapabilityQueryParams::default(),\n            search_body,\n        );\n\n        assert!(result.is_err());\n        let err = result.unwrap_err();\n        assert_eq!(err.status, StatusCode::BAD_REQUEST);\n    }\n\n    #[test]\n    fn test_build_list_field_request_with_null_index_filter() {\n        let search_body = FieldCapabilityRequestBody {\n            index_filter: serde_json::Value::Null,\n            runtime_mappings: serde_json::Value::Null,\n        };\n\n        let result = 
build_list_field_request_for_es_api(\n            vec![\"test_index\".to_string()],\n            FieldCapabilityQueryParams::default(),\n            search_body,\n        )\n        .unwrap();\n\n        assert!(result.query_ast.is_none());\n    }\n\n    #[test]\n    fn test_build_list_field_request_preserves_other_params() {\n        let search_params = FieldCapabilityQueryParams {\n            fields: Some(vec![\"field1\".to_string(), \"field2\".to_string()]),\n            start_timestamp: Some(1000),\n            end_timestamp: Some(2000),\n            ..Default::default()\n        };\n\n        let search_body = FieldCapabilityRequestBody {\n            index_filter: json!({ \"match_all\": {} }),\n            runtime_mappings: serde_json::Value::Null,\n        };\n\n        let result = build_list_field_request_for_es_api(\n            vec![\"test_index\".to_string()],\n            search_params,\n            search_body,\n        )\n        .unwrap();\n\n        assert_eq!(\n            result.fields,\n            vec![\"field1\".to_string(), \"field2\".to_string()]\n        );\n        assert_eq!(result.start_timestamp, Some(1000));\n        assert_eq!(result.end_timestamp, Some(2000));\n        assert!(result.query_ast.is_some());\n    }\n\n    #[test]\n    fn test_parse_index_filter_to_query_ast_null() {\n        let result = parse_index_filter_to_query_ast(serde_json::Value::Null).unwrap();\n        assert!(result.is_none());\n    }\n\n    #[test]\n    fn test_parse_index_filter_to_query_ast_empty_object() {\n        // Empty object {} should return error to match ES behavior\n        let result = parse_index_filter_to_query_ast(json!({}));\n        assert!(result.is_err());\n    }\n\n    #[test]\n    fn test_parse_index_filter_to_query_ast_valid_term() {\n        let result = parse_index_filter_to_query_ast(json!({\n            \"term\": { \"status\": \"active\" }\n        }))\n        .unwrap();\n        assert!(result.is_some());\n    }\n\n    #[test]\n 
   fn test_parse_index_filter_to_query_ast_invalid() {\n        let result = parse_index_filter_to_query_ast(json!({\n            \"invalid_query_type\": { \"field\": \"value\" }\n        }));\n        assert!(result.is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/mappings.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse quickwit_doc_mapper::{FieldMappingEntry, FieldMappingType};\nuse quickwit_metastore::IndexMetadata;\nuse quickwit_proto::search::{ListFieldType, ListFieldsResponse};\nuse serde::ser::SerializeMap;\nuse serde::{Serialize, Serializer};\n\n/// Top-level response for `GET /{index}/_mapping(s)`.\n///\n/// Serializes as `{ \"<index_id>\": { \"mappings\": { \"properties\": { ... 
} } } }`.\npub(crate) struct ElasticsearchMappingsResponse {\n    indices: HashMap<String, IndexMappings>,\n}\n\nimpl Serialize for ElasticsearchMappingsResponse {\n    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n        let mut map = serializer.serialize_map(Some(self.indices.len()))?;\n        for (index_id, mappings) in &self.indices {\n            map.serialize_entry(index_id, mappings)?;\n        }\n        map.end()\n    }\n}\n\n#[derive(Debug, Serialize)]\nstruct IndexMappings {\n    mappings: MappingProperties,\n}\n\n#[derive(Debug, Serialize)]\nstruct MappingProperties {\n    properties: HashMap<String, FieldMapping>,\n}\n\n#[derive(Debug, Serialize)]\n#[serde(untagged)]\nenum FieldMapping {\n    Leaf {\n        #[serde(rename = \"type\")]\n        typ: &'static str,\n    },\n    Object {\n        #[serde(rename = \"type\")]\n        typ: &'static str,\n        properties: HashMap<String, FieldMapping>,\n    },\n}\n\nimpl ElasticsearchMappingsResponse {\n    pub fn from_doc_mapping(\n        indexes_metadata: Vec<IndexMetadata>,\n        list_fields_response: Option<&ListFieldsResponse>,\n    ) -> Self {\n        let indices = indexes_metadata\n            .into_iter()\n            .map(|index_metadata| {\n                let field_mappings = &index_metadata.index_config.doc_mapping.field_mappings;\n                let mut properties = build_properties(field_mappings);\n                if let Some(list_fields) = list_fields_response {\n                    merge_dynamic_fields(&mut properties, list_fields);\n                }\n                let index_id = index_metadata.index_id().to_string();\n                (\n                    index_id,\n                    IndexMappings {\n                        mappings: MappingProperties { properties },\n                    },\n                )\n            })\n            .collect();\n        Self { indices }\n    }\n}\n\nfn build_properties(field_mappings: 
&[FieldMappingEntry]) -> HashMap<String, FieldMapping> {\n    let mut properties = HashMap::with_capacity(field_mappings.len());\n    for entry in field_mappings {\n        if let Some(field_mapping) = field_mapping_from_entry(entry) {\n            properties.insert(entry.name.clone(), field_mapping);\n        }\n    }\n    properties\n}\n\nfn field_mapping_from_entry(entry: &FieldMappingEntry) -> Option<FieldMapping> {\n    match &entry.mapping_type {\n        // Quickwit text fields behave like ES keyword fields: they support exact\n        // match, prefix, and regexp queries. Reporting them as \"keyword\" enables\n        // downstream connectors (e.g. Trino ES connector) to push down filters and\n        // LIKE predicates, which they only do for keyword-typed fields.\n        FieldMappingType::Text(..) => Some(FieldMapping::Leaf { typ: \"keyword\" }),\n        FieldMappingType::I64(..) => Some(FieldMapping::Leaf { typ: \"long\" }),\n        FieldMappingType::U64(..) => Some(FieldMapping::Leaf { typ: \"long\" }),\n        FieldMappingType::F64(..) => Some(FieldMapping::Leaf { typ: \"double\" }),\n        FieldMappingType::Bool(..) => Some(FieldMapping::Leaf { typ: \"boolean\" }),\n        FieldMappingType::DateTime(..) => Some(FieldMapping::Leaf { typ: \"date\" }),\n        FieldMappingType::IpAddr(..) => Some(FieldMapping::Leaf { typ: \"ip\" }),\n        FieldMappingType::Bytes(..) => Some(FieldMapping::Leaf { typ: \"binary\" }),\n        FieldMappingType::Json(..) 
=> Some(FieldMapping::Leaf { typ: \"object\" }),\n        FieldMappingType::Object(options) => {\n            let properties = build_properties(&options.field_mappings);\n            Some(FieldMapping::Object {\n                typ: \"object\",\n                properties,\n            })\n        }\n        FieldMappingType::Concatenate(_) => Some(FieldMapping::Leaf { typ: \"keyword\" }),\n    }\n}\n\n/// Merges dynamic fields from a `ListFieldsResponse` into the properties map.\n///\n/// Fields already present in the map (from explicit doc mappings) are skipped,\n/// as are internal fields (prefixed with `_`).\nfn merge_dynamic_fields(\n    properties: &mut HashMap<String, FieldMapping>,\n    list_fields_response: &ListFieldsResponse,\n) {\n    for field_entry in &list_fields_response.fields {\n        let field_name = &field_entry.field_name;\n        if field_name.starts_with('_') {\n            continue;\n        }\n        if properties.contains_key(field_name) {\n            continue;\n        }\n        let Ok(field_type) = ListFieldType::try_from(field_entry.field_type) else {\n            continue;\n        };\n        if let Some(es_type) = es_type_from_list_field_type(field_type) {\n            properties.insert(field_name.clone(), FieldMapping::Leaf { typ: es_type });\n        }\n    }\n}\n\nfn es_type_from_list_field_type(field_type: ListFieldType) -> Option<&'static str> {\n    match field_type {\n        ListFieldType::Str => Some(\"keyword\"),\n        ListFieldType::U64 | ListFieldType::I64 => Some(\"long\"),\n        ListFieldType::F64 => Some(\"double\"),\n        ListFieldType::Bool => Some(\"boolean\"),\n        ListFieldType::Date => Some(\"date\"),\n        ListFieldType::Bytes => Some(\"binary\"),\n        ListFieldType::IpAddr => Some(\"ip\"),\n        ListFieldType::Facet | ListFieldType::Json => None,\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use serde_json::json;\n\n    use super::*;\n\n    #[test]\n    fn 
test_field_mapping_from_entry_bool() {\n        let entry_json = json!({ \"name\": \"active\", \"type\": \"bool\" });\n        let entry: FieldMappingEntry = serde_json::from_value(entry_json).unwrap();\n        let mapping = field_mapping_from_entry(&entry).unwrap();\n        let serialized = serde_json::to_value(&mapping).unwrap();\n        assert_eq!(serialized, json!({ \"type\": \"boolean\" }));\n    }\n\n    #[test]\n    fn test_field_mapping_from_entry_text() {\n        let entry_json = json!({ \"name\": \"message\", \"type\": \"text\" });\n        let entry: FieldMappingEntry = serde_json::from_value(entry_json).unwrap();\n        let mapping = field_mapping_from_entry(&entry).unwrap();\n        let serialized = serde_json::to_value(&mapping).unwrap();\n        assert_eq!(serialized, json!({ \"type\": \"keyword\" }));\n    }\n\n    #[test]\n    fn test_field_mapping_from_entry_i64() {\n        let entry_json = json!({ \"name\": \"count\", \"type\": \"i64\" });\n        let entry: FieldMappingEntry = serde_json::from_value(entry_json).unwrap();\n        let mapping = field_mapping_from_entry(&entry).unwrap();\n        let serialized = serde_json::to_value(&mapping).unwrap();\n        assert_eq!(serialized, json!({ \"type\": \"long\" }));\n    }\n\n    #[test]\n    fn test_field_mapping_from_entry_object() {\n        let entry_json = json!({\n            \"name\": \"nested\",\n            \"type\": \"object\",\n            \"field_mappings\": [\n                { \"name\": \"id\", \"type\": \"u64\" },\n                { \"name\": \"label\", \"type\": \"text\" }\n            ]\n        });\n        let entry: FieldMappingEntry = serde_json::from_value(entry_json).unwrap();\n        let mapping = field_mapping_from_entry(&entry).unwrap();\n        let serialized = serde_json::to_value(&mapping).unwrap();\n        assert_eq!(\n            serialized,\n            json!({\n                \"type\": \"object\",\n                \"properties\": {\n                   
 \"id\": { \"type\": \"long\" },\n                    \"label\": { \"type\": \"keyword\" }\n                }\n            })\n        );\n    }\n\n    #[test]\n    fn test_field_mapping_from_entry_concatenate_exposed_as_keyword() {\n        let entry_json = json!({\n            \"name\": \"concat_field\",\n            \"type\": \"concatenate\",\n            \"concatenate_fields\": [\"field_a\", \"field_b\"]\n        });\n        let entry: FieldMappingEntry = serde_json::from_value(entry_json).unwrap();\n        let mapping = field_mapping_from_entry(&entry).unwrap();\n        let serialized = serde_json::to_value(&mapping).unwrap();\n        assert_eq!(serialized, json!({ \"type\": \"keyword\" }));\n    }\n\n    #[test]\n    fn test_build_properties_all_leaf_types() {\n        let entries: Vec<FieldMappingEntry> = serde_json::from_value(json!([\n            { \"name\": \"title\", \"type\": \"text\" },\n            { \"name\": \"count\", \"type\": \"i64\" },\n            { \"name\": \"unsigned\", \"type\": \"u64\" },\n            { \"name\": \"score\", \"type\": \"f64\" },\n            { \"name\": \"active\", \"type\": \"bool\" },\n            { \"name\": \"created_at\", \"type\": \"datetime\" },\n            { \"name\": \"ip_field\", \"type\": \"ip\" },\n            { \"name\": \"data\", \"type\": \"bytes\" },\n            { \"name\": \"payload\", \"type\": \"json\" },\n            {\n                \"name\": \"metadata\",\n                \"type\": \"object\",\n                \"field_mappings\": [\n                    { \"name\": \"source\", \"type\": \"text\" }\n                ]\n            }\n        ]))\n        .unwrap();\n\n        let props = build_properties(&entries);\n        let to_json = |fm: &FieldMapping| serde_json::to_value(fm).unwrap();\n\n        assert_eq!(to_json(&props[\"title\"]), json!({ \"type\": \"keyword\" }));\n        assert_eq!(to_json(&props[\"count\"]), json!({ \"type\": \"long\" }));\n        
assert_eq!(to_json(&props[\"unsigned\"]), json!({ \"type\": \"long\" }));\n        assert_eq!(to_json(&props[\"score\"]), json!({ \"type\": \"double\" }));\n        assert_eq!(to_json(&props[\"active\"]), json!({ \"type\": \"boolean\" }));\n        assert_eq!(to_json(&props[\"created_at\"]), json!({ \"type\": \"date\" }));\n        assert_eq!(to_json(&props[\"ip_field\"]), json!({ \"type\": \"ip\" }));\n        assert_eq!(to_json(&props[\"data\"]), json!({ \"type\": \"binary\" }));\n        assert_eq!(to_json(&props[\"payload\"]), json!({ \"type\": \"object\" }));\n\n        let meta = to_json(&props[\"metadata\"]);\n        assert_eq!(meta[\"type\"], \"object\");\n        assert_eq!(meta[\"properties\"][\"source\"][\"type\"], \"keyword\");\n    }\n\n    #[test]\n    fn test_merge_dynamic_fields_skips_existing_and_internal() {\n        use quickwit_proto::search::ListFieldsEntryResponse;\n\n        let mut properties = HashMap::new();\n        properties.insert(\"title\".to_string(), FieldMapping::Leaf { typ: \"text\" });\n\n        let list_fields = ListFieldsResponse {\n            fields: vec![\n                ListFieldsEntryResponse {\n                    field_name: \"title\".to_string(),\n                    field_type: ListFieldType::Str as i32,\n                    ..Default::default()\n                },\n                ListFieldsEntryResponse {\n                    field_name: \"_timestamp\".to_string(),\n                    field_type: ListFieldType::Date as i32,\n                    ..Default::default()\n                },\n                ListFieldsEntryResponse {\n                    field_name: \"dynamic_field\".to_string(),\n                    field_type: ListFieldType::Str as i32,\n                    ..Default::default()\n                },\n            ],\n        };\n\n        merge_dynamic_fields(&mut properties, &list_fields);\n\n        assert_eq!(properties.len(), 2);\n        assert!(properties.contains_key(\"title\"));\n        
assert!(properties.contains_key(\"dynamic_field\"));\n        assert!(!properties.contains_key(\"_timestamp\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod bulk_body;\nmod bulk_query_params;\nmod cat_indices;\nmod error;\nmod field_capability;\nmod mappings;\nmod multi_search;\nmod scroll;\nmod search_body;\nmod search_query_params;\nmod search_response;\nmod stats;\n\npub use bulk_body::BulkAction;\npub use bulk_query_params::ElasticBulkOptions;\npub use cat_indices::{\n    CatIndexQueryParams, ElasticsearchCatIndexResponse, ElasticsearchResolveIndexEntryResponse,\n    ElasticsearchResolveIndexResponse,\n};\npub use error::{ElasticException, ElasticsearchError};\npub use field_capability::{\n    FieldCapabilityQueryParams, FieldCapabilityRequestBody, FieldCapabilityResponse,\n    build_list_field_request_for_es_api, convert_to_es_field_capabilities_response,\n};\npub(crate) use mappings::ElasticsearchMappingsResponse;\npub use multi_search::{\n    MultiSearchHeader, MultiSearchQueryParams, MultiSearchResponse, MultiSearchSingleResponse,\n};\nuse quickwit_proto::search::{SortDatetimeFormat, SortOrder};\npub use scroll::ScrollQueryParams;\npub use search_body::SearchBody;\npub use search_query_params::{DeleteQueryParams, SearchQueryParams, SearchQueryParamsCount};\npub use search_response::ElasticsearchResponse;\nuse serde::{Deserialize, Serialize};\npub use stats::{ElasticsearchStatsResponse, StatsResponseEntry};\n\n#[derive(Debug, Clone, Eq, PartialEq)]\npub struct SortField {\n    pub field: 
String,\n    pub order: SortOrder,\n    pub date_format: Option<ElasticDateFormat>,\n}\n\n#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]\n#[serde(rename_all = \"snake_case\")]\npub enum ElasticDateFormat {\n    /// Sort values are in milliseconds by default to ease migration from ES.\n    /// We allow the user to specify nanoseconds if needed.\n    /// We add `Int` to the name to avoid confusion with the ES variant `EpochMillis`,\n    /// which returns milliseconds as strings.\n    EpochNanosInt,\n}\n\nimpl From<ElasticDateFormat> for SortDatetimeFormat {\n    fn from(date_format: ElasticDateFormat) -> Self {\n        match date_format {\n            ElasticDateFormat::EpochNanosInt => SortDatetimeFormat::UnixTimestampNanos,\n        }\n    }\n}\n\npub(crate) fn default_elasticsearch_sort_order(field_name: &str) -> SortOrder {\n    if field_name == \"_score\" {\n        SortOrder::Desc\n    } else {\n        SortOrder::Asc\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/multi_search.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse elasticsearch_dsl::ErrorCause;\nuse serde::{Deserialize, Serialize};\nuse serde_with::formats::PreferMany;\nuse serde_with::{OneOrMany, serde_as};\nuse warp::hyper::StatusCode;\n\nuse super::ElasticsearchError;\nuse super::search_query_params::ExpandWildcards;\nuse super::search_response::ElasticsearchResponse;\nuse crate::simple_list::{from_simple_list, to_simple_list};\n\n// Multi search doc: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html\n\n#[serde_as]\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct MultiSearchQueryParams {\n    #[serde(default)]\n    pub allow_no_indices: Option<bool>,\n    #[serde(default)]\n    pub ccs_minimize_roundtrips: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub expand_wildcards: Option<Vec<ExpandWildcards>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    /// Additional filters to be applied to the query.\n    /// Useful for permissions and other use cases.\n    /// This is not part of the official Elasticsearch API.\n    ///\n    /// This will set extra_filters on the search request.\n    pub 
extra_filters: Option<Vec<String>>,\n    #[serde(default)]\n    pub ignore_throttled: Option<bool>,\n    #[serde(default)]\n    pub ignore_unavailable: Option<bool>,\n    /// List of indexes to search.\n    #[serde_as(deserialize_as = \"OneOrMany<_, PreferMany>\")]\n    #[serde(default, rename = \"index\")]\n    pub indexes: Vec<String>,\n    #[serde(default)]\n    pub max_concurrent_searches: Option<u64>,\n    #[serde(default)]\n    pub max_concurrent_shard_requests: Option<i64>,\n    #[serde(default)]\n    pub pre_filter_shard_size: Option<i64>,\n    #[serde(default)]\n    pub rest_total_hits_as_int: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub routing: Option<Vec<String>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    /// This is not part of the official Elasticsearch API.\n    /// This will set source_excludes on the search request.\n    pub _source_excludes: Option<Vec<String>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    /// This is not part of the official Elasticsearch API.\n    /// This will set source_includes on the search request.\n    pub _source_includes: Option<Vec<String>>,\n    #[serde(default)]\n    pub typed_keys: Option<bool>,\n}\n\n#[serde_as]\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct MultiSearchHeader {\n    #[serde(default)]\n    pub allow_no_indices: Option<bool>,\n    #[serde(default)]\n    pub expand_wildcards: Option<Vec<ExpandWildcards>>,\n    #[serde(default)]\n    pub ignore_unavailable: Option<bool>,\n    #[serde_as(deserialize_as = \"OneOrMany<_, PreferMany>\")]\n    #[serde(default, rename = \"index\")]\n    pub indexes: Vec<String>,\n    #[serde(default)]\n    pub 
preference: Option<String>,\n    #[serde(default)]\n    pub request_cache: Option<bool>,\n    #[serde(default)]\n    pub routing: Option<Vec<String>>,\n}\n\nimpl MultiSearchHeader {\n    pub fn apply_query_param_defaults(&mut self, defaults: &MultiSearchQueryParams) {\n        if self.allow_no_indices.is_none() {\n            self.allow_no_indices = defaults.allow_no_indices;\n        }\n        if self.expand_wildcards.is_none() {\n            self.expand_wildcards = defaults.expand_wildcards.clone();\n        }\n        if self.ignore_unavailable.is_none() {\n            self.ignore_unavailable = defaults.ignore_unavailable;\n        }\n        if self.indexes.is_empty() {\n            self.indexes = defaults.indexes.clone();\n        }\n        if self.routing.is_none() {\n            self.routing = defaults.routing.clone();\n        }\n    }\n}\n\n#[derive(Serialize)]\npub struct MultiSearchResponse {\n    pub responses: Vec<MultiSearchSingleResponse>,\n}\n\n#[derive(Serialize, Debug)]\npub struct MultiSearchSingleResponse {\n    #[serde(with = \"http_serde::status_code\")]\n    pub status: StatusCode,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(flatten)]\n    pub response: Option<ElasticsearchResponse>,\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub error: Option<ErrorCause>,\n}\n\nimpl From<ElasticsearchResponse> for MultiSearchSingleResponse {\n    fn from(response: ElasticsearchResponse) -> Self {\n        MultiSearchSingleResponse {\n            status: StatusCode::OK,\n            response: Some(response),\n            error: None,\n        }\n    }\n}\n\nimpl From<ElasticsearchError> for MultiSearchSingleResponse {\n    fn from(error: ElasticsearchError) -> Self {\n        MultiSearchSingleResponse {\n            status: error.status,\n            response: None,\n            error: Some(error.error),\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/scroll.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse serde::Deserialize;\n\n#[derive(Deserialize, Default)]\npub struct ScrollQueryParams {\n    pub scroll: Option<String>,\n    pub scroll_id: Option<String>,\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/search_body.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::fmt;\n\nuse quickwit_proto::search::SortOrder;\nuse quickwit_query::{ElasticQueryDsl, OneFieldMap};\nuse serde::de::{MapAccess, Visitor};\nuse serde::{Deserialize, Deserializer, Serialize};\n\nuse super::ElasticDateFormat;\nuse crate::elasticsearch_api::TrackTotalHits;\nuse crate::elasticsearch_api::model::{SortField, default_elasticsearch_sort_order};\n\n#[derive(Debug, Clone, PartialEq, Deserialize)]\n#[serde(untagged)]\nenum FieldSortParamsForDeser {\n    // we can't just use FieldSortParams or we get infinite recursion on deser\n    Object {\n        order: Option<SortOrder>,\n        format: Option<ElasticDateFormat>,\n    },\n    String(SortOrder),\n}\n\nimpl From<FieldSortParamsForDeser> for FieldSortParams {\n    fn from(for_deser: FieldSortParamsForDeser) -> FieldSortParams {\n        match for_deser {\n            FieldSortParamsForDeser::Object {\n                order,\n                format: date_format,\n            } => FieldSortParams { order, date_format },\n            FieldSortParamsForDeser::String(order) => FieldSortParams {\n                order: Some(order),\n                date_format: None,\n            },\n        }\n    }\n}\n\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\n#[serde(from = \"FieldSortParamsForDeser\")]\n#[serde(deny_unknown_fields)]\nstruct 
FieldSortParams {\n    #[serde(default)]\n    pub order: Option<SortOrder>,\n    #[serde(default)]\n    #[serde(rename = \"format\")]\n    pub date_format: Option<ElasticDateFormat>,\n}\n\n#[derive(Debug, Default, Clone, Deserialize, PartialEq)]\n#[serde(deny_unknown_fields)]\npub struct SearchBody {\n    #[serde(default)]\n    pub from: Option<u64>,\n    #[serde(default)]\n    pub size: Option<u64>,\n    #[serde(default)]\n    pub query: Option<ElasticQueryDsl>,\n    #[serde(default)]\n    #[serde(deserialize_with = \"deserialize_field_sorts\")]\n    pub sort: Option<Vec<SortField>>,\n    #[serde(default)]\n    pub aggs: serde_json::Map<String, serde_json::Value>,\n    #[serde(default)]\n    pub track_total_hits: Option<TrackTotalHits>,\n    #[serde(default)]\n    pub stored_fields: Option<BTreeSet<String>>,\n    #[serde(default)]\n    pub search_after: Vec<serde_json::Value>,\n\n    // Ignored values, only here for compatibility with OpenSearch Dashboards.\n    #[serde(default)]\n    pub _source: serde::de::IgnoredAny,\n    #[serde(default)]\n    pub docvalue_fields: serde::de::IgnoredAny,\n    #[serde(default)]\n    pub script_fields: serde::de::IgnoredAny,\n    #[serde(default)]\n    pub highlight: serde::de::IgnoredAny,\n    #[serde(default)]\n    pub version: serde::de::IgnoredAny,\n}\n\nstruct FieldSortVecVisitor;\n\n#[derive(Deserialize)]\n#[serde(untagged)]\nenum StringOrMapFieldSort {\n    FieldNameOnly(String),\n    Sort(OneFieldMap<FieldSortParams>),\n}\n\nimpl From<StringOrMapFieldSort> for SortField {\n    fn from(string_or_map_field_sort: StringOrMapFieldSort) -> Self {\n        match string_or_map_field_sort {\n            StringOrMapFieldSort::FieldNameOnly(field_name) => {\n                let order = default_elasticsearch_sort_order(&field_name);\n                SortField {\n                    field: field_name,\n                    order,\n                    date_format: None,\n                }\n            }\n            
StringOrMapFieldSort::Sort(sort) => {\n                let order = sort\n                    .value\n                    .order\n                    .unwrap_or_else(|| default_elasticsearch_sort_order(&sort.field));\n                SortField {\n                    field: sort.field,\n                    order,\n                    date_format: sort.value.date_format,\n                }\n            }\n        }\n    }\n}\n\nimpl<'de> Visitor<'de> for FieldSortVecVisitor {\n    type Value = Vec<SortField>;\n\n    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter.write_str(\"A string, array, or object containing the sort fields.\")\n    }\n\n    fn visit_str<E>(self, field_name: &str) -> Result<Vec<SortField>, E>\n    where E: serde::de::Error {\n        let order = default_elasticsearch_sort_order(field_name);\n        Ok(vec![SortField {\n            field: field_name.to_string(),\n            order,\n            date_format: None,\n        }])\n    }\n\n    fn visit_seq<A>(self, mut seq: A) -> Result<Vec<SortField>, A::Error>\n    where A: serde::de::SeqAccess<'de> {\n        let mut sort_fields: Vec<SortField> = Vec::new();\n        while let Some(field_sort) = seq.next_element::<StringOrMapFieldSort>()? 
{\n            sort_fields.push(field_sort.into());\n        }\n        Ok(sort_fields)\n    }\n\n    fn visit_map<M>(self, mut map: M) -> Result<Vec<SortField>, M::Error>\n    where M: MapAccess<'de> {\n        let mut sort_fields: Vec<SortField> = Vec::new();\n        while let Some((field_sort_key, field_sort_params)) =\n            map.next_entry::<String, FieldSortParams>()?\n        {\n            let sort_order = field_sort_params\n                .order\n                .unwrap_or_else(|| default_elasticsearch_sort_order(&field_sort_key));\n            sort_fields.push(SortField {\n                field: field_sort_key,\n                order: sort_order,\n                date_format: field_sort_params.date_format,\n            });\n        }\n        Ok(sort_fields)\n    }\n}\n\n/// ES accepts structs to describe the sort field.\n/// In that case the order of apparition in the JSON object matters.\nfn deserialize_field_sorts<'de, D>(deserializer: D) -> Result<Option<Vec<SortField>>, D::Error>\nwhere D: Deserializer<'de> {\n    deserializer.deserialize_any(FieldSortVecVisitor).map(Some)\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_sort_field_array() {\n        let json = r#\"\n        {\n            \"sort\": [\n                { \"timestamp\": { \"order\": \"desc\", \"format\": \"epoch_nanos_int\" } },\n                { \"uid\": { \"order\": \"asc\" } },\n                { \"my_field\": \"asc\" },\n                { \"hello\": {}},\n                { \"_score\": {}}\n            ]\n        }\n        \"#;\n        let search_body: SearchBody = serde_json::from_str(json).unwrap();\n        let sort_fields = search_body.sort.unwrap();\n        assert_eq!(sort_fields.len(), 5);\n        assert_eq!(sort_fields[0].field, \"timestamp\");\n        assert_eq!(sort_fields[0].order, SortOrder::Desc);\n        assert_eq!(\n            sort_fields[0].date_format,\n            Some(ElasticDateFormat::EpochNanosInt)\n        );\n       
 assert_eq!(sort_fields[1].field, \"uid\");\n        assert_eq!(sort_fields[1].order, SortOrder::Asc);\n        assert_eq!(sort_fields[1].date_format, None);\n        assert_eq!(sort_fields[2].field, \"my_field\");\n        assert_eq!(sort_fields[2].order, SortOrder::Asc);\n        assert_eq!(sort_fields[2].date_format, None);\n        assert_eq!(sort_fields[3].field, \"hello\");\n        assert_eq!(sort_fields[3].order, SortOrder::Asc);\n        assert_eq!(sort_fields[3].date_format, None);\n        assert_eq!(sort_fields[4].field, \"_score\");\n        assert_eq!(sort_fields[4].order, SortOrder::Desc);\n        assert_eq!(sort_fields[4].date_format, None);\n    }\n\n    #[test]\n    fn test_sort_field_obj() {\n        let json = r#\"\n        {\n            \"sort\": {\n                \"timestamp\": { \"order\": \"desc\" },\n                \"uid\": { \"order\": \"asc\" }\n            }\n        }\n        \"#;\n        let search_body: SearchBody = serde_json::from_str(json).unwrap();\n        let field_sorts = search_body.sort.unwrap();\n        assert_eq!(field_sorts.len(), 2);\n        assert_eq!(field_sorts[0].field, \"timestamp\");\n        assert_eq!(field_sorts[0].order, SortOrder::Desc);\n        assert_eq!(field_sorts[1].field, \"uid\");\n        assert_eq!(field_sorts[1].order, SortOrder::Asc);\n    }\n\n    #[test]\n    fn test_sort_field_str() {\n        let json = r#\"\n        {\n            \"sort\": \"timestamp\"\n        }\n        \"#;\n        let search_body: SearchBody = serde_json::from_str(json).unwrap();\n        let field_sorts = search_body.sort.unwrap();\n        assert_eq!(field_sorts.len(), 1);\n        assert_eq!(field_sorts[0].field, \"timestamp\");\n        assert_eq!(field_sorts[0].order, SortOrder::Asc);\n    }\n\n    #[test]\n    fn test_sort_default_orders() {\n        let json = r#\"\n        {\n            \"sort\": [\n                \"timestamp\",\n                \"uid\",\n                \"_score\",\n                
\"_doc\"\n            ]\n        }\n        \"#;\n        let search_body: SearchBody = serde_json::from_str(json).unwrap();\n        let field_sorts = search_body.sort.unwrap();\n        assert_eq!(field_sorts.len(), 4);\n        assert_eq!(field_sorts[0].field, \"timestamp\");\n        assert_eq!(field_sorts[0].order, SortOrder::Asc);\n        assert_eq!(field_sorts[1].field, \"uid\");\n        assert_eq!(field_sorts[1].order, SortOrder::Asc);\n        assert_eq!(field_sorts[2].field, \"_score\");\n        assert_eq!(field_sorts[2].order, SortOrder::Desc);\n        assert_eq!(field_sorts[3].field, \"_doc\");\n        assert_eq!(field_sorts[3].order, SortOrder::Asc);\n    }\n\n    #[test]\n    fn test_unknown_field_behaviour() {\n        let json = r#\"\n            {\n                \"term\": {\n                    \"actor.id\": {\n                        \"value\": \"95077794\"\n                     }\n                }\n            }\n        \"#;\n\n        let search_body = serde_json::from_str::<SearchBody>(json);\n        let error_msg = search_body.unwrap_err().to_string();\n        assert!(error_msg.contains(\"unknown field `term`\"));\n        assert!(error_msg.contains(\"expected one of \"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/search_query_params.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::str::FromStr;\nuse std::time::Duration;\n\nuse quickwit_query::BooleanOperand;\nuse quickwit_search::SearchError;\nuse serde::{Deserialize, Serialize};\n\nuse super::super::TrackTotalHits;\nuse super::MultiSearchHeader;\nuse crate::elasticsearch_api::model::{SortField, default_elasticsearch_sort_order};\nuse crate::simple_list::{from_simple_list, to_simple_list};\n\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct SearchQueryParams {\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub _source: Option<Vec<String>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub _source_excludes: Option<Vec<String>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub _source_includes: Option<Vec<String>>,\n    #[serde(default)]\n    pub allow_no_indices: Option<bool>,\n    #[serde(default)]\n    pub allow_partial_search_results: Option<bool>,\n    #[serde(default)]\n    pub analyze_wildcard: Option<bool>,\n    #[serde(default)]\n    pub analyzer: Option<String>,\n    #[serde(default)]\n    pub 
batched_reduce_size: Option<u64>,\n    #[serde(default)]\n    pub ccs_minimize_roundtrips: Option<bool>,\n    #[serde(default)]\n    pub default_operator: Option<BooleanOperand>,\n    #[serde(default)]\n    pub df: Option<String>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub docvalue_fields: Option<Vec<String>>,\n    #[serde(default)]\n    pub error_trace: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub expand_wildcards: Option<Vec<ExpandWildcards>>,\n    #[serde(default)]\n    pub explain: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    /// Additional filters to be applied to the query.\n    /// Useful for permissions and other use cases.\n    /// This is not part of the official Elasticsearch API.\n    pub extra_filters: Option<Vec<String>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub filter_path: Option<Vec<String>>,\n    #[serde(default)]\n    pub force_synthetic_source: Option<bool>,\n    #[serde(default)]\n    pub from: Option<u64>,\n    #[serde(default)]\n    pub human: Option<bool>,\n    #[serde(default)]\n    pub ignore_throttled: Option<bool>,\n    #[serde(default)]\n    pub ignore_unavailable: Option<bool>,\n    #[serde(default)]\n    pub lenient: Option<bool>,\n    #[serde(default)]\n    pub max_concurrent_shard_requests: Option<u64>,\n    #[serde(default)]\n    pub min_compatible_shard_node: Option<String>,\n    #[serde(default)]\n    pub pre_filter_shard_size: Option<u64>,\n    #[serde(default)]\n    pub preference: Option<String>,\n    #[serde(default)]\n    pub pretty: Option<bool>,\n    #[serde(default)]\n    pub q: Option<String>,\n    #[serde(default)]\n    pub 
request_cache: Option<bool>,\n    #[serde(default)]\n    pub rest_total_hits_as_int: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub routing: Option<Vec<String>>,\n    #[serde(default)]\n    pub scroll: Option<String>,\n    #[serde(default)]\n    pub search_type: Option<String>,\n    #[serde(default)]\n    pub seq_no_primary_term: Option<bool>,\n    #[serde(default)]\n    pub size: Option<u64>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub sort: Option<Vec<String>>,\n    #[serde(default)]\n    pub stats: Option<Vec<String>>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub stored_fields: Option<Vec<String>>,\n    #[serde(default)]\n    pub suggest_field: Option<String>,\n    #[serde(default)]\n    pub suggest_mode: Option<SuggestMode>,\n    #[serde(default)]\n    pub suggest_size: Option<u64>,\n    #[serde(default)]\n    pub suggest_text: Option<String>,\n    #[serde(default)]\n    pub terminate_after: Option<u64>,\n    #[serde(default)]\n    pub timeout: Option<String>,\n    #[serde(default)]\n    pub track_scores: Option<bool>,\n    #[serde(default)]\n    pub track_total_hits: Option<TrackTotalHits>,\n    #[serde(default)]\n    pub typed_keys: Option<bool>,\n    #[serde(default)]\n    pub version: Option<bool>,\n}\n\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct SearchQueryParamsCount {\n    #[serde(default)]\n    pub allow_no_indices: Option<bool>,\n    #[serde(default)]\n    pub analyze_wildcard: Option<bool>,\n    #[serde(default)]\n    pub analyzer: Option<String>,\n    #[serde(default)]\n    pub default_operator: Option<BooleanOperand>,\n    #[serde(default)]\n    pub df: Option<String>,\n    
#[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub expand_wildcards: Option<Vec<ExpandWildcards>>,\n    #[serde(default)]\n    pub ignore_throttled: Option<bool>,\n    #[serde(default)]\n    pub ignore_unavailable: Option<bool>,\n    #[serde(default)]\n    pub lenient: Option<bool>,\n    #[serde(default)]\n    pub max_concurrent_shard_requests: Option<u64>,\n    #[serde(default)]\n    pub preference: Option<String>,\n    #[serde(default)]\n    pub q: Option<String>,\n    #[serde(default)]\n    pub request_cache: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub routing: Option<Vec<String>>,\n}\nimpl From<SearchQueryParamsCount> for SearchQueryParams {\n    fn from(value: SearchQueryParamsCount) -> Self {\n        SearchQueryParams {\n            allow_no_indices: value.allow_no_indices,\n            analyze_wildcard: value.analyze_wildcard,\n            analyzer: value.analyzer,\n            default_operator: value.default_operator,\n            df: value.df,\n            expand_wildcards: value.expand_wildcards,\n            ignore_throttled: value.ignore_throttled,\n            ignore_unavailable: value.ignore_unavailable,\n            preference: value.preference,\n            q: value.q,\n            request_cache: value.request_cache,\n            routing: value.routing,\n            size: Some(0),\n            ..Default::default()\n        }\n    }\n}\n\n#[serde_with::skip_serializing_none]\n#[derive(Default, Debug, Serialize, Deserialize)]\n#[serde(deny_unknown_fields)]\npub struct DeleteQueryParams {\n    #[serde(default)]\n    pub allow_no_indices: Option<bool>,\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(default)]\n    pub expand_wildcards: Option<Vec<ExpandWildcards>>,\n    #[serde(default)]\n    pub 
ignore_unavailable: Option<bool>,\n    #[serde(default)]\n    pub master_timeout: Option<String>,\n    #[serde(default)]\n    pub timeout: Option<String>,\n}\n\n/// Parses a string as if it was a json value string.\nfn parse_str_like_json<T: serde::de::DeserializeOwned>(s: &str) -> Option<T> {\n    let json_value = serde_json::Value::String(s.to_string());\n    serde_json::from_value::<T>(json_value).ok()\n}\n\n// Parse a single sort field parameter from ES sort query string parameter.\nfn parse_sort_field_str(sort_field_str: &str) -> Result<SortField, SearchError> {\n    if let Some((field, order_str)) = sort_field_str.split_once(':') {\n        let order = parse_str_like_json(order_str).ok_or_else(|| {\n            SearchError::InvalidArgument(format!(\n                \"invalid sort order `{order_str}`. expected `asc` or `desc`\"\n            ))\n        })?;\n        Ok(SortField {\n            field: field.to_string(),\n            order,\n            date_format: None,\n        })\n    } else {\n        let order = default_elasticsearch_sort_order(sort_field_str);\n        Ok(SortField {\n            field: sort_field_str.to_string(),\n            order,\n            date_format: None,\n        })\n    }\n}\n\nimpl SearchQueryParams {\n    /// Accessor for the list of sort fields passed in the sort query string parameter.\n    ///\n    /// Returns an error if the sort query string is not in the expected format\n    /// (`field:order,field2:order2,...`). 
Returns `Ok(None)` if the sort query string parameter\n    /// is not present.\n    #[allow(clippy::type_complexity)]\n    pub(crate) fn sort_fields(&self) -> Result<Option<Vec<SortField>>, SearchError> {\n        let Some(sort_fields_str) = self.sort.as_ref() else {\n            return Ok(None);\n        };\n        let mut sort_fields: Vec<SortField> = Vec::with_capacity(sort_fields_str.len());\n        for sort_field_str in sort_fields_str {\n            sort_fields.push(parse_sort_field_str(sort_field_str)?);\n        }\n        Ok(Some(sort_fields))\n    }\n\n    /// Returns the scroll duration supplied by the user.\n    ///\n    /// This function returns an error if the scroll duration is not in the expected format. (`40s`\n    /// etc.)\n    pub fn parse_scroll_ttl(&self) -> Result<Option<Duration>, SearchError> {\n        let Some(scroll_str) = self.scroll.as_ref() else {\n            return Ok(None);\n        };\n        let duration: Duration = humantime::parse_duration(scroll_str).map_err(|_err| {\n            SearchError::InvalidArgument(format!(\"invalid scroll duration: `{scroll_str}`\"))\n        })?;\n        Ok(Some(duration))\n    }\n\n    pub fn allow_partial_search_results(&self) -> bool {\n        // By default, elastic search allows partial results.\n        self.allow_partial_search_results.unwrap_or(true)\n    }\n}\n\n#[doc = \"Whether to expand wildcard expression to concrete indices that are open, closed or both.\"]\n#[derive(Debug, PartialEq, Deserialize, Serialize, Clone, Copy)]\n#[serde(rename_all = \"lowercase\")]\npub enum ExpandWildcards {\n    Open,\n    Closed,\n    Hidden,\n    None,\n    All,\n}\n\nimpl FromStr for ExpandWildcards {\n    type Err = &'static str;\n    fn from_str(value_str: &str) -> Result<Self, Self::Err> {\n        match value_str {\n            \"open\" => Ok(Self::Open),\n            \"closed\" => Ok(Self::Closed),\n            \"hidden\" => Ok(Self::Hidden),\n            \"none\" => Ok(Self::None),\n          
  \"all\" => Ok(Self::All),\n            _ => Err(\"unknown enum variant\"),\n        }\n    }\n}\n\nimpl fmt::Display for ExpandWildcards {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        match self {\n            Self::Open => write!(formatter, \"open\"),\n            Self::Closed => write!(formatter, \"closed\"),\n            Self::Hidden => write!(formatter, \"hidden\"),\n            Self::None => write!(formatter, \"none\"),\n            Self::All => write!(formatter, \"all\"),\n        }\n    }\n}\n\nimpl From<MultiSearchHeader> for SearchQueryParams {\n    fn from(multi_search_header: MultiSearchHeader) -> Self {\n        SearchQueryParams {\n            allow_no_indices: multi_search_header.allow_no_indices,\n            expand_wildcards: multi_search_header.expand_wildcards,\n            ignore_unavailable: multi_search_header.ignore_unavailable,\n            routing: multi_search_header.routing,\n            request_cache: multi_search_header.request_cache,\n            preference: multi_search_header.preference,\n            ..Default::default()\n        }\n    }\n}\n\n/// Specify suggest mode\n#[derive(Debug, PartialEq, Deserialize, Serialize, Clone, Copy)]\n#[serde(rename_all = \"lowercase\")]\npub enum SuggestMode {\n    Missing,\n    Popular,\n    Always,\n}\n\nimpl FromStr for SuggestMode {\n    type Err = &'static str;\n    fn from_str(value_str: &str) -> Result<Self, Self::Err> {\n        match value_str {\n            \"missing\" => Ok(Self::Missing),\n            \"popular\" => Ok(Self::Popular),\n            \"always\" => Ok(Self::Always),\n            _ => Err(\"unknown enum variant\"),\n        }\n    }\n}\n\nimpl fmt::Display for SuggestMode {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        match self {\n            Self::Missing => write!(formatter, \"missing\"),\n            Self::Popular => write!(formatter, \"popular\"),\n            Self::Always => write!(formatter, 
\"always\"),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use quickwit_proto::search::SortOrder;\n\n    use super::*;\n\n    #[derive(Deserialize, PartialEq, Eq, Debug)]\n    #[serde(rename_all = \"snake_case\")]\n    enum TestEnum {\n        FirstItem,\n        SecondItem,\n    }\n\n    #[test]\n    fn test_parse_str_like_json() {\n        assert_eq!(\n            parse_str_like_json::<TestEnum>(\"first_item\").unwrap(),\n            TestEnum::FirstItem\n        );\n        assert!(parse_str_like_json::<TestEnum>(\"FirstItem\").is_none());\n    }\n\n    #[test]\n    fn test_sort_order_qs() {\n        let sort_order_qs = parse_sort_field_str(\"timestamp:desc\").unwrap();\n        assert_eq!(\n            sort_order_qs,\n            SortField {\n                field: \"timestamp\".to_string(),\n                order: SortOrder::Desc,\n                date_format: None\n            }\n        );\n        let sort_order_qs = parse_sort_field_str(\"timestamp:asc\").unwrap();\n        assert_eq!(\n            sort_order_qs,\n            SortField {\n                field: \"timestamp\".to_string(),\n                order: SortOrder::Asc,\n                date_format: None\n            }\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/search_response.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse elasticsearch_dsl::{ClusterStatistics, HitsMetadata, ShardStatistics, Suggest};\nuse quickwit_search::AggregationResults;\nuse serde::Serialize;\n\ntype Map<K, V> = std::collections::BTreeMap<K, V>;\n\n/// Search response\n///\n/// This is a fork of [`elasticsearch_dsl::SearchResponse`] with the\n/// `aggregations` field using [`AggregationResults`] instead of\n/// [`serde_json::Value`].\n#[derive(Debug, Default, Serialize, PartialEq)]\npub struct ElasticsearchResponse {\n    /// The time that it took Elasticsearch to process the query\n    pub took: u32,\n\n    /// The search has been cancelled and results are partial\n    pub timed_out: bool,\n\n    /// Indicates if search has been terminated early\n    #[serde(default)]\n    pub terminated_early: Option<bool>,\n\n    /// Scroll Id\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(rename = \"_scroll_id\")]\n    pub scroll_id: Option<String>,\n\n    /// Dynamically fetched fields\n    #[serde(default)]\n    pub fields: Map<String, serde_json::Value>,\n\n    /// Point in time Id\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub pit_id: Option<String>,\n\n    /// Number of reduce phases\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub num_reduce_phases: Option<u64>,\n\n    /// Maximum document score. 
[None] when documents are implicitly sorted\n    /// by a field other than `_score`\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub max_score: Option<f32>,\n\n    /// Number of clusters touched with their states\n    #[serde(skip_serializing_if = \"Option::is_none\", rename = \"_clusters\")]\n    pub clusters: Option<ClusterStatistics>,\n\n    /// Number of shards touched with their states\n    #[serde(rename = \"_shards\")]\n    pub shards: ShardStatistics,\n\n    /// Search hits\n    pub hits: HitsMetadata,\n\n    /// Search aggregations\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub aggregations: Option<AggregationResults>,\n\n    #[serde(skip_serializing_if = \"Map::is_empty\", default)]\n    /// Suggest response\n    pub suggest: Map<String, Vec<Suggest>>,\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/model/stats.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::ops::AddAssign;\n\nuse quickwit_metastore::SplitMetadata;\nuse serde::{Deserialize, Serialize};\n\n/// Returns JSON in the format:\n///\n/// {\n///   \"_all\": {\n///     \"primaries\": {\n///       \"store\": {\"size_in_bytes\": 123456789},\n///       \"docs\": {\"count\": 5000}\n///     },\n///     \"total\": {\n///       \"segments\": {\"count\": 100},\n///       \"docs\": {\"count\": 5000}\n///     }\n///   },\n///   \"indices\": {\n///     \"exampleIndex\": {\n///       \"primaries\": {\n///         \"store\": {\"size_in_bytes\": 123456789},\n///         \"docs\": {\"count\": 5000}\n///       },\n///       \"total\": {\n///         \"segments\": {\"count\": 50},\n///         \"docs\": {\"count\": 5000}\n///       }\n///     }\n///   }\n/// }\n#[derive(Clone, Serialize, Deserialize, Debug)]\npub struct ElasticsearchStatsResponse {\n    pub _all: StatsResponseEntry,\n    pub indices: HashMap<String, StatsResponseEntry>, // String is Field name\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, Default)]\npub struct StatsResponseEntry {\n    primaries: StatsPrimariesResponse,\n    total: StatsTotalResponse,\n}\n\nimpl AddAssign for StatsResponseEntry {\n    fn add_assign(&mut self, rhs: Self) {\n        self.primaries.store.size_in_bytes += rhs.primaries.store.size_in_bytes;\n        
self.primaries.docs.count += rhs.primaries.docs.count;\n        self.total.segments.count += rhs.total.segments.count;\n        self.total.docs.count += rhs.total.docs.count;\n    }\n}\n\nimpl From<SplitMetadata> for StatsResponseEntry {\n    fn from(split_metadata: SplitMetadata) -> Self {\n        let mut stats_response_entry = StatsResponseEntry::default();\n        stats_response_entry.primaries.store.size_in_bytes =\n            split_metadata.as_split_info().file_size_bytes.as_u64();\n        stats_response_entry.primaries.docs.count = split_metadata.num_docs as u64;\n        stats_response_entry.total.docs.count = split_metadata.num_docs as u64;\n        stats_response_entry.total.segments.count = 1;\n        stats_response_entry\n    }\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, Default)]\npub struct StatsPrimariesResponse {\n    store: StatsStoreResponse,\n    docs: StatsDocsResponse,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, Default)]\npub struct StatsStoreResponse {\n    size_in_bytes: u64,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, Default)]\npub struct StatsDocsResponse {\n    count: u64,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, Default)]\npub struct StatsTotalResponse {\n    segments: StatsTotalSegmentsResponse,\n    docs: StatsDocsResponse,\n}\n\n#[derive(Clone, Serialize, Deserialize, Debug, Default)]\npub struct StatsTotalSegmentsResponse {\n    count: u64,\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/elasticsearch_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::str::from_utf8;\nuse std::sync::Arc;\nuse std::time::{Duration, Instant};\n\nuse bytes::Bytes;\nuse elasticsearch_dsl::search::Hit as ElasticHit;\nuse elasticsearch_dsl::{HitsMetadata, ShardStatistics, Source, TotalHits, TotalHitsRelation};\nuse futures_util::StreamExt;\nuse itertools::Itertools;\nuse quickwit_cluster::Cluster;\nuse quickwit_common::truncate_str;\nuse quickwit_config::{NodeConfig, validate_index_id_pattern};\nuse quickwit_index_management::IndexService;\nuse quickwit_metastore::*;\nuse quickwit_proto::metastore::{IndexMetadataRequest, MetastoreService, MetastoreServiceClient};\nuse quickwit_proto::search::{\n    CountHits, ListFieldsResponse, PartialHit, ScrollRequest, SearchResponse, SortByValue,\n    SortDatetimeFormat,\n};\nuse quickwit_proto::types::IndexUid;\nuse quickwit_query::BooleanOperand;\nuse quickwit_query::query_ast::{BoolQuery, QueryAst, UserInputQuery};\nuse quickwit_search::{\n    AggregationResults, SearchError, SearchService, list_all_splits, resolve_index_patterns,\n};\nuse serde::{Deserialize, Serialize};\nuse serde_json::{Map, Value, json};\nuse warp::hyper::StatusCode;\nuse warp::reply::with_status;\nuse warp::{Filter, Rejection};\n\nuse super::filter::{\n    elastic_aliases_filter, elastic_cat_indices_filter, elastic_cluster_health_filter,\n    
elastic_cluster_info_filter, elastic_delete_index_filter, elastic_delete_scroll_filter,\n    elastic_field_capabilities_filter, elastic_index_cat_indices_filter,\n    elastic_index_count_filter, elastic_index_field_capabilities_filter,\n    elastic_index_mapping_filter, elastic_index_search_filter, elastic_index_stats_filter,\n    elastic_multi_search_filter, elastic_nodes_filter, elastic_resolve_index_filter,\n    elastic_scroll_filter, elastic_search_shards_filter, elastic_stats_filter,\n    elasticsearch_filter,\n};\nuse super::model::{\n    CatIndexQueryParams, DeleteQueryParams, ElasticsearchCatIndexResponse, ElasticsearchError,\n    ElasticsearchResolveIndexEntryResponse, ElasticsearchResolveIndexResponse,\n    ElasticsearchResponse, ElasticsearchStatsResponse, FieldCapabilityQueryParams,\n    FieldCapabilityRequestBody, FieldCapabilityResponse, MultiSearchHeader, MultiSearchQueryParams,\n    MultiSearchResponse, MultiSearchSingleResponse, ScrollQueryParams, SearchBody,\n    SearchQueryParams, SearchQueryParamsCount, StatsResponseEntry,\n    build_list_field_request_for_es_api, convert_to_es_field_capabilities_response,\n};\nuse super::{TrackTotalHits, make_elastic_api_response};\nuse crate::elasticsearch_api::model::ElasticsearchMappingsResponse;\nuse crate::format::BodyFormat;\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::{RestApiError, RestApiResponse};\nuse crate::{BuildInfo, with_arg};\n\n/// Elastic compatible cluster info handler.\npub fn es_compat_cluster_info_handler(\n    node_config: Arc<NodeConfig>,\n    build_info: &'static BuildInfo,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_cluster_info_filter()\n        .and(with_arg(node_config.clone()))\n        .and(with_arg(build_info))\n        .then(\n            |config: Arc<NodeConfig>, build_info: &'static BuildInfo| async move {\n                warp::reply::json(&json!({\n                    \"name\" : config.node_id,\n                  
  \"cluster_name\" : config.cluster_id,\n                    \"cluster_uuid\" : config.cluster_id,\n                    \"tagline\" : \"You Know, for Search\",\n                    \"version\" : {\n                        \"distribution\" : \"quickwit\",\n                        \"number\" : \"7.17.0\",\n                        \"build_hash\" : build_info.commit_hash,\n                        \"build_date\" : build_info.build_date,\n                        \"build_snapshot\" : false,\n                        \"lucene_version\" : \"8.11.1\",\n                        \"minimum_wire_compatibility_version\" : \"6.8.0\",\n                        \"minimum_index_compatibility_version\" : \"6.0.0-beta1\",\n                    }\n                }))\n            },\n        )\n        .boxed()\n}\n\n/// GET _elastic/_nodes/http\npub fn es_compat_nodes_handler(\n    node_config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_nodes_filter()\n        .and(with_arg(node_config))\n        .then(|config: Arc<NodeConfig>| async move {\n            let advertise_addr = std::net::SocketAddr::new(\n                config.grpc_advertise_addr.ip(),\n                config.rest_config.listen_addr.port(),\n            );\n            warp::reply::json(&json!({\n                \"nodes\": {\n                    config.node_id.as_str(): {\n                        \"roles\": [\"data\", \"ingest\"],\n                        \"http\": {\n                            \"publish_address\": advertise_addr.to_string()\n                        }\n                    }\n                }\n            }))\n        })\n        .boxed()\n}\n\n/// GET _elastic/{index}/_search_shards\npub fn es_compat_search_shards_handler(\n    node_config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_search_shards_filter()\n        .and(with_arg(node_config))\n        .then(|index_id: String, 
config: Arc<NodeConfig>| async move {\n            warp::reply::json(&json!({\n                \"shards\": [[{\n                    \"index\": index_id,\n                    \"shard\": 0,\n                    \"primary\": true,\n                    \"node\": config.node_id.as_str()\n                }]]\n            }))\n        })\n        .boxed()\n}\n\n/// GET _elastic/_aliases\npub fn es_compat_aliases_handler()\n-> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_aliases_filter()\n        .then(|| async { Ok(Value::Object(Map::new())) })\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// GET _elastic/{index}/_mapping or _elastic/{index}/_mappings\npub fn es_compat_index_mapping_handler(\n    metastore: MetastoreServiceClient,\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_mapping_filter()\n        .and(with_arg(metastore))\n        .and(with_arg(search_service))\n        .then(es_compat_index_mapping)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n}\n\nasync fn get_index_metadata(\n    index_id: String,\n    metastore: MetastoreServiceClient,\n) -> Result<IndexMetadata, SearchError> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id);\n    let index_metadata = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?;\n    Ok(index_metadata)\n}\n\nasync fn es_compat_index_mapping(\n    index_id: String,\n    mut metastore: MetastoreServiceClient,\n    search_service: Arc<dyn SearchService>,\n) -> Result<ElasticsearchMappingsResponse, ElasticsearchError> {\n    let indexes_metadata = if index_id.contains('*') || index_id.contains(',') {\n        let patterns: Vec<String> = 
index_id.split(',').map(|s| s.trim().to_string()).collect();\n        resolve_index_patterns(&patterns, &mut metastore).await?\n    } else {\n        vec![get_index_metadata(index_id.clone(), metastore).await?]\n    };\n    let index_id_patterns: Vec<String> = indexes_metadata\n        .iter()\n        .map(|m| m.index_id().to_string())\n        .collect();\n    let list_fields_request = quickwit_proto::search::ListFieldsRequest {\n        index_id_patterns,\n        fields: Vec::new(),\n        start_timestamp: None,\n        end_timestamp: None,\n        query_ast: None,\n    };\n    let list_fields_response = search_service\n        .root_list_fields(list_fields_request)\n        .await\n        .ok();\n    let response = ElasticsearchMappingsResponse::from_doc_mapping(\n        indexes_metadata,\n        list_fields_response.as_ref(),\n    );\n    Ok(response)\n}\n\n/// GET or POST _elastic/_search\npub fn es_compat_search_handler(\n    _search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elasticsearch_filter()\n        .then(|_params: SearchQueryParams| async move {\n            // TODO\n            let api_error = RestApiError {\n                status_code: StatusCode::NOT_IMPLEMENTED,\n                message: \"_elastic/_search is not supported yet. 
Please try the index search \\\n                          endpoint (_elastic/{index}/_search)\"\n                    .to_string(),\n            };\n            RestApiResponse::new::<(), _>(\n                &Err(api_error),\n                StatusCode::NOT_IMPLEMENTED,\n                BodyFormat::default(),\n            )\n        })\n        .recover(recover_fn)\n}\n\n/// GET or POST _elastic/{index}/_field_caps\npub fn es_compat_index_field_capabilities_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_field_capabilities_filter()\n        .or(elastic_field_capabilities_filter())\n        .unify()\n        .and(with_arg(search_service))\n        .then(es_compat_index_field_capabilities)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n}\n\n/// DELETE _elastic/{index}\npub fn es_compat_delete_index_handler(\n    index_service: IndexService,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_delete_index_filter()\n        .and(with_arg(index_service))\n        .then(es_compat_delete_index)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .boxed()\n}\n\n/// GET _elastic/_stats\npub fn es_compat_stats_handler(\n    metastore_service: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_stats_filter()\n        .and(with_arg(metastore_service))\n        .then(es_compat_stats)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// Returns `true` if `param` is a known query parameter that is not supported and must be rejected.\nfn is_unsupported_qp(param: &str) -> bool {\n    [\"wait_for_status\", \"timeout\", \"level\"].contains(&param)\n}\n\n/// GET _elastic/_cluster/health\npub fn es_compat_cluster_health_handler(\n    
cluster: Cluster,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_cluster_health_filter()\n        .and(warp::query::<HashMap<String, String>>())\n        .and(with_arg(cluster))\n        .then(es_compat_cluster_health)\n        .recover(recover_fn)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Node Health\",\n    path = \"/_elastic/_cluster/health\",\n    responses(\n        (status = 200, description = \"The cluster is healthy.\", body = bool),\n        (status = 503, description = \"The cluster is unhealthy.\", body = bool),\n    ),\n)]\n/// Get Node Liveliness\nasync fn es_compat_cluster_health(\n    query_params: HashMap<String, String>,\n    cluster: Cluster,\n) -> impl warp::Reply {\n    if let Some(invalid_param) = query_params.keys().find(|key| is_unsupported_qp(key)) {\n        let error_body = warp::reply::json(&json!({\n            \"error\": \"Unsupported parameter.\",\n            \"param\": invalid_param\n        }));\n        return with_status(error_body, StatusCode::BAD_REQUEST);\n    }\n    let is_ready = cluster.is_self_node_ready().await;\n    if is_ready {\n        with_status(\n            warp::reply::json(&json!({\"status\": \"green\"})),\n            StatusCode::OK,\n        )\n    } else {\n        with_status(\n            warp::reply::json(&json!({\"status\": \"red\"})),\n            StatusCode::SERVICE_UNAVAILABLE,\n        )\n    }\n}\n\n/// GET _elastic/{index}/_stats\npub fn es_compat_index_stats_handler(\n    metastore_service: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_stats_filter()\n        .and(with_arg(metastore_service))\n        .then(es_compat_index_stats)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// GET _elastic/_cat/indices\npub fn es_compat_cat_indices_handler(\n    metastore_service: MetastoreServiceClient,\n) 
-> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_cat_indices_filter()\n        .and(with_arg(metastore_service))\n        .then(es_compat_cat_indices)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// GET _elastic/_cat/indices/{index}\npub fn es_compat_index_cat_indices_handler(\n    metastore_service: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_cat_indices_filter()\n        .and(with_arg(metastore_service))\n        .then(es_compat_index_cat_indices)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// GET _elastic/_resolve/index/{index}\npub fn es_compat_resolve_index_handler(\n    metastore_service: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_resolve_index_filter()\n        .and(with_arg(metastore_service))\n        .then(es_compat_resolve_index)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .boxed()\n}\n\n/// GET or POST _elastic/{index}/_search\npub fn es_compat_index_search_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_search_filter()\n        .and(with_arg(search_service))\n        .then(es_compat_index_search)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// GET or POST _elastic/{index}/_count\npub fn es_compat_index_count_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_index_count_filter()\n        .and(with_arg(search_service))\n        .then(es_compat_index_count)\n    
    .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// POST _elastic/_msearch\npub fn es_compat_index_multi_search_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_multi_search_filter()\n        .and(with_arg(search_service))\n        .then(es_compat_index_multi_search)\n        .map(|result: Result<MultiSearchResponse, ElasticsearchError>| {\n            let status_code = match &result {\n                Ok(_) => StatusCode::OK,\n                Err(err) => err.status,\n            };\n            RestApiResponse::new(&result, status_code, BodyFormat::default())\n        })\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// GET or POST _elastic/_search/scroll\npub fn es_compat_scroll_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_scroll_filter()\n        .and(with_arg(search_service))\n        .then(es_scroll)\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n/// DELETE _elastic/_search/scroll\n///\n/// Clears a scroll context. 
Quickwit manages scroll lifetime via TTL,\n/// so this is a no-op that returns success.\npub fn es_compat_delete_scroll_handler()\n-> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    elastic_delete_scroll_filter()\n        .then(|| async {\n            Ok::<_, ElasticsearchError>(json!({\n                \"succeeded\": true,\n                \"num_freed\": 0\n            }))\n        })\n        .map(|result| make_elastic_api_response(result, BodyFormat::default()))\n        .recover(recover_fn)\n        .boxed()\n}\n\n#[allow(clippy::result_large_err)]\nfn build_request_for_es_api(\n    index_id_patterns: Vec<String>,\n    search_params: SearchQueryParams,\n    search_body: SearchBody,\n) -> Result<(quickwit_proto::search::SearchRequest, bool), ElasticsearchError> {\n    let default_operator = search_params.default_operator.unwrap_or(BooleanOperand::Or);\n    // The query string, if present, takes priority over what can be in the request\n    // body.\n    let mut query_ast = if let Some(q) = &search_params.q {\n        let user_text_query = UserInputQuery {\n            user_text: q.to_string(),\n            default_fields: None,\n            default_operator,\n            lenient: false,\n        };\n        user_text_query.into()\n    } else if let Some(query_dsl) = search_body.query {\n        query_dsl\n            .try_into()\n            .map_err(|err: anyhow::Error| SearchError::InvalidQuery(err.to_string()))?\n    } else {\n        QueryAst::MatchAll\n    };\n\n    if let Some(extra_filters) = &search_params.extra_filters {\n        let queries: Vec<QueryAst> = extra_filters\n            .iter()\n            .map(|query| {\n                let user_text_query = UserInputQuery {\n                    user_text: query.to_string(),\n                    default_fields: None,\n                    default_operator,\n                    lenient: false,\n                };\n                QueryAst::UserInput(user_text_query)\n          
  })\n            .collect();\n\n        query_ast = QueryAst::Bool(BoolQuery {\n            must: vec![query_ast],\n            must_not: Vec::new(),\n            should: Vec::new(),\n            filter: queries,\n            minimum_should_match: None,\n        });\n    }\n\n    let aggregation_request: Option<String> = if search_body.aggs.is_empty() {\n        None\n    } else {\n        serde_json::to_string(&search_body.aggs).ok()\n    };\n\n    let max_hits = search_params.size.or(search_body.size).unwrap_or(10);\n    let start_offset = search_params.from.or(search_body.from).unwrap_or(0);\n    let ignore_missing_indexes = search_params.ignore_unavailable.unwrap_or(false);\n    let count_hits = match search_params\n        .track_total_hits\n        .or(search_body.track_total_hits)\n    {\n        None => CountHits::Underestimate,\n        Some(TrackTotalHits::Track(false)) => CountHits::Underestimate,\n        Some(TrackTotalHits::Count(count)) if count <= max_hits as i64 => CountHits::Underestimate,\n        Some(TrackTotalHits::Track(true) | TrackTotalHits::Count(_)) => CountHits::CountAll,\n    }\n    .into();\n\n    let sort_fields: Vec<quickwit_proto::search::SortField> = search_params\n        .sort_fields()?\n        .or_else(|| search_body.sort.clone())\n        .unwrap_or_default()\n        .iter()\n        .map(|sort_field| quickwit_proto::search::SortField {\n            field_name: sort_field.field.to_string(),\n            sort_order: sort_field.order as i32,\n            sort_datetime_format: sort_field\n                .date_format\n                .clone()\n                .map(|date_format| SortDatetimeFormat::from(date_format) as i32),\n        })\n        .take_while_inclusive(|sort_field| !is_doc_field(sort_field))\n        .collect();\n    if sort_fields.len() >= 3 {\n        return Err(ElasticsearchError::from(SearchError::InvalidArgument(\n            format!(\"only up to two sort fields supported at the moment. 
got {sort_fields:?}\"),\n        )));\n    }\n\n    let scroll_duration: Option<Duration> = search_params.parse_scroll_ttl()?;\n    let scroll_ttl_secs: Option<u32> = scroll_duration.map(|duration| duration.as_secs() as u32);\n\n    let has_doc_id_field = sort_fields.iter().any(is_doc_field);\n    let search_after = partial_hit_from_search_after_param(search_body.search_after, &sort_fields)?;\n\n    Ok((\n        quickwit_proto::search::SearchRequest {\n            index_id_patterns,\n            query_ast: serde_json::to_string(&query_ast).expect(\"Failed to serialize QueryAst\"),\n            max_hits,\n            start_offset,\n            aggregation_request,\n            sort_fields,\n            start_timestamp: None,\n            end_timestamp: None,\n            snippet_fields: Vec::new(),\n            scroll_ttl_secs,\n            search_after,\n            count_hits,\n            ignore_missing_indexes,\n            skip_aggregation_finalization: false,\n        },\n        has_doc_id_field,\n    ))\n}\n\nfn is_doc_field(field: &quickwit_proto::search::SortField) -> bool {\n    field.field_name == \"_shard_doc\" || field.field_name == \"_doc\"\n}\n\n#[allow(clippy::result_large_err)]\nfn partial_hit_from_search_after_param(\n    search_after: Vec<serde_json::Value>,\n    sort_order: &[quickwit_proto::search::SortField],\n) -> Result<Option<PartialHit>, ElasticsearchError> {\n    if search_after.is_empty() {\n        return Ok(None);\n    }\n    if search_after.len() != sort_order.len() {\n        return Err(ElasticsearchError::new(\n            StatusCode::BAD_REQUEST,\n            \"sort and search_after are of different length\".to_string(),\n            None,\n        ));\n    }\n    let mut parsed_search_after = PartialHit::default();\n    for (value, field) in search_after.into_iter().zip(sort_order) {\n        if is_doc_field(field) {\n            if let Some(value_str) = value.as_str() {\n                let address: 
quickwit_search::GlobalDocAddress =\n                    value_str.parse().map_err(|_| {\n                        ElasticsearchError::new(\n                            StatusCode::BAD_REQUEST,\n                            \"invalid search_after doc id, must be of form \\\n                             `{split_id}:{segment_id: u32}:{doc_id: u32}`\"\n                                .to_string(),\n                            None,\n                        )\n                    })?;\n                parsed_search_after.split_id = address.split;\n                parsed_search_after.segment_ord = address.doc_addr.segment_ord;\n                parsed_search_after.doc_id = address.doc_addr.doc_id;\n                return Ok(Some(parsed_search_after));\n            } else {\n                return Err(ElasticsearchError::new(\n                    StatusCode::BAD_REQUEST,\n                    \"search_after doc id must be of string type\".to_string(),\n                    None,\n                ));\n            }\n        } else {\n            let value = SortByValue::try_from_json(value).ok_or_else(|| {\n                ElasticsearchError::new(\n                    StatusCode::BAD_REQUEST,\n                    \"invalid search_after field value, expect bool, number or string\".to_string(),\n                    None,\n                )\n            })?;\n            // TODO make cleaner once we support Vec\n            if parsed_search_after.sort_value.is_none() {\n                parsed_search_after.sort_value = Some(value);\n            } else {\n                parsed_search_after.sort_value2 = Some(value);\n            }\n        }\n    }\n    Ok(Some(parsed_search_after))\n}\n\n#[derive(Debug, Serialize, Deserialize)]\nstruct ElasticsearchCountResponse {\n    count: u64,\n}\n\nasync fn es_compat_index_count(\n    index_id_patterns: Vec<String>,\n    search_params: SearchQueryParamsCount,\n    search_body: SearchBody,\n    search_service: Arc<dyn SearchService>,\n) -> 
Result<ElasticsearchCountResponse, ElasticsearchError> {\n    let mut search_params: SearchQueryParams = search_params.into();\n    search_params.track_total_hits = Some(TrackTotalHits::Track(true));\n    let (search_request, _append_shard_doc) =\n        build_request_for_es_api(index_id_patterns, search_params, search_body)?;\n    let search_response: SearchResponse = search_service.root_search(search_request).await?;\n    let search_response_rest: ElasticsearchCountResponse = ElasticsearchCountResponse {\n        count: search_response.num_hits,\n    };\n    Ok(search_response_rest)\n}\n\nasync fn es_compat_index_search(\n    index_id_patterns: Vec<String>,\n    search_params: SearchQueryParams,\n    search_body: SearchBody,\n    search_service: Arc<dyn SearchService>,\n) -> Result<ElasticsearchResponse, ElasticsearchError> {\n    if search_params.scroll.is_some() && !search_params.allow_partial_search_results() {\n        return Err(ElasticsearchError::from(SearchError::InvalidArgument(\n            \"Quickwit only supports scroll API with allow_partial_search_results set to true\"\n                .to_string(),\n        )));\n    }\n    let _source_excludes = search_params._source_excludes.clone();\n    let _source_includes = search_params._source_includes.clone();\n    let start_instant = Instant::now();\n    let allow_partial_search_results = search_params.allow_partial_search_results();\n    let (search_request, append_shard_doc) =\n        build_request_for_es_api(index_id_patterns, search_params, search_body)?;\n    let search_response: SearchResponse = search_service.root_search(search_request).await?;\n    let elapsed = start_instant.elapsed();\n    let mut search_response_rest: ElasticsearchResponse = convert_to_es_search_response(\n        search_response,\n        append_shard_doc,\n        _source_excludes,\n        _source_includes,\n        allow_partial_search_results,\n    )?;\n    search_response_rest.took = elapsed.as_millis() as u32;\n    
Ok(search_response_rest)\n}\n\n/// Returns JSON in the format:\n///\n/// {\n///   \"acknowledged\": true\n/// }\n#[derive(Clone, Serialize, Deserialize, Debug)]\npub struct ElasticsearchDeleteResponse {\n    pub acknowledged: bool,\n}\n\nasync fn es_compat_delete_index(\n    index_id_patterns: Vec<String>,\n    query_params: DeleteQueryParams,\n    index_service: IndexService,\n) -> Result<ElasticsearchDeleteResponse, ElasticsearchError> {\n    index_service\n        .delete_indexes(\n            index_id_patterns,\n            query_params.ignore_unavailable.unwrap_or_default(),\n            false,\n        )\n        .await?;\n    Ok(ElasticsearchDeleteResponse { acknowledged: true })\n}\n\nasync fn es_compat_stats(\n    metastore: MetastoreServiceClient,\n) -> Result<ElasticsearchStatsResponse, ElasticsearchError> {\n    es_compat_index_stats(vec![\"*\".to_string()], metastore).await\n}\n\nasync fn es_compat_index_stats(\n    index_id_patterns: Vec<String>,\n    mut metastore: MetastoreServiceClient,\n) -> Result<ElasticsearchStatsResponse, ElasticsearchError> {\n    let indexes_metadata = resolve_index_patterns(&index_id_patterns, &mut metastore).await?;\n\n    // Index uid to index id mapping\n    let index_uid_to_index_id: HashMap<IndexUid, String> = indexes_metadata\n        .iter()\n        .map(|metadata| (metadata.index_uid.clone(), metadata.index_id().to_owned()))\n        .collect();\n\n    let index_uids = indexes_metadata\n        .into_iter()\n        .map(|index_metadata| index_metadata.index_uid)\n        .collect_vec();\n    // calling into the search module is not necessary, but reuses established patterns\n    let splits_metadata = list_all_splits(index_uids, &mut metastore).await?;\n\n    let search_response_rest: ElasticsearchStatsResponse =\n        convert_to_es_stats_response(index_uid_to_index_id, splits_metadata);\n\n    Ok(search_response_rest)\n}\n\nasync fn es_compat_cat_indices(\n    query_params: CatIndexQueryParams,\n    metastore: 
MetastoreServiceClient,\n) -> Result<Vec<serde_json::Value>, ElasticsearchError> {\n    es_compat_index_cat_indices(vec![\"*\".to_string()], query_params, metastore).await\n}\n\nasync fn es_compat_index_cat_indices(\n    index_id_patterns: Vec<String>,\n    query_params: CatIndexQueryParams,\n    mut metastore: MetastoreServiceClient,\n) -> Result<Vec<serde_json::Value>, ElasticsearchError> {\n    query_params.validate()?;\n    let indexes_metadata = resolve_index_patterns(&index_id_patterns, &mut metastore).await?;\n    let mut index_id_to_resp: HashMap<IndexUid, ElasticsearchCatIndexResponse> = indexes_metadata\n        .iter()\n        .map(|metadata| (metadata.index_uid.to_owned(), metadata.clone().into()))\n        .collect();\n\n    let splits_metadata = {\n        let index_uids = indexes_metadata\n            .into_iter()\n            .map(|index_metadata| index_metadata.index_uid)\n            .collect_vec();\n\n        // calling into the search module is not necessary, but reuses established patterns\n        list_all_splits(index_uids, &mut metastore).await?\n    };\n\n    let search_response_rest: Vec<ElasticsearchCatIndexResponse> =\n        convert_to_es_cat_indices_response(&mut index_id_to_resp, splits_metadata);\n\n    let search_response_rest = search_response_rest\n        .into_iter()\n        .filter(|resp| {\n            if let Some(health) = query_params.health {\n                resp.health == health\n            } else {\n                true\n            }\n        })\n        .map(|cat_index| cat_index.serialize_filtered(&query_params.h))\n        .collect::<Result<Vec<serde_json::Value>, serde_json::Error>>()\n        .map_err(|serde_error| {\n            ElasticsearchError::new(\n                StatusCode::INTERNAL_SERVER_ERROR,\n                format!(\"Failed to serialize cat indices response: {serde_error}\"),\n                None,\n            )\n        })?;\n\n    Ok(search_response_rest)\n}\n\nasync fn 
es_compat_resolve_index(\n    index_id_patterns: Vec<String>,\n    mut metastore: MetastoreServiceClient,\n) -> Result<ElasticsearchResolveIndexResponse, ElasticsearchError> {\n    let indexes_metadata = resolve_index_patterns(&index_id_patterns, &mut metastore).await?;\n    let mut indices: Vec<ElasticsearchResolveIndexEntryResponse> = indexes_metadata\n        .into_iter()\n        .map(|metadata| metadata.into())\n        .collect();\n\n    indices.sort_by(|left, right| left.name.cmp(&right.name));\n\n    Ok(ElasticsearchResolveIndexResponse {\n        indices,\n        ..Default::default()\n    })\n}\n\nasync fn es_compat_index_field_capabilities(\n    index_id_patterns: Vec<String>,\n    search_params: FieldCapabilityQueryParams,\n    search_body: FieldCapabilityRequestBody,\n    search_service: Arc<dyn SearchService>,\n) -> Result<FieldCapabilityResponse, ElasticsearchError> {\n    let search_request =\n        build_list_field_request_for_es_api(index_id_patterns, search_params, search_body)?;\n    let search_response: ListFieldsResponse =\n        search_service.root_list_fields(search_request).await?;\n    let search_response_rest: FieldCapabilityResponse =\n        convert_to_es_field_capabilities_response(search_response);\n    Ok(search_response_rest)\n}\n\nfn filter_source(\n    value: &mut serde_json::Value,\n    _source_excludes: &Option<Vec<String>>,\n    _source_includes: &Option<Vec<String>>,\n) {\n    fn remove_path(value: &mut serde_json::Value, path: &str) {\n        for (prefix, suffix) in generate_path_variants_with_suffix(path) {\n            match value {\n                serde_json::Value::Object(map) => {\n                    if let Some(suffix) = suffix {\n                        if let Some(sub_value) = map.get_mut(prefix) {\n                            remove_path(sub_value, suffix);\n                            return;\n                        }\n                    } else {\n                        map.remove(prefix);\n               
     }\n                }\n                _ => continue,\n            }\n        }\n    }\n    fn retain_includes(\n        value: &mut serde_json::Value,\n        current_path: &str,\n        include_paths: &Vec<String>,\n    ) {\n        if let Some(ref mut map) = value.as_object_mut() {\n            map.retain(|key, sub_value| {\n                let path = if current_path.is_empty() {\n                    key.to_string()\n                } else {\n                    format!(\"{current_path}.{key}\")\n                };\n\n                if include_paths.contains(&path) {\n                    // Exact match keep whole node\n                    return true;\n                }\n                // Check if the path is sub path of any allowed path\n                for allowed_path in include_paths {\n                    if allowed_path.starts_with(path.as_str()) {\n                        retain_includes(sub_value, &path, include_paths);\n                        return true;\n                    }\n                }\n                false\n            });\n        }\n    }\n\n    // Remove fields that are not included\n    if let Some(includes) = _source_includes {\n        retain_includes(value, \"\", includes);\n    }\n\n    // Remove fields that are excluded\n    if let Some(excludes) = _source_excludes {\n        for exclude in excludes {\n            remove_path(value, exclude);\n        }\n    }\n}\n\n/// \"app.id.name\" -> [(\"app\", Some(\"id.name\")), (\"app.id\", Some(\"name\")), (\"app.id.name\", None)]\nfn generate_path_variants_with_suffix(input: &str) -> Vec<(&str, Option<&str>)> {\n    let mut variants = Vec::new();\n\n    // Iterate over each character in the input.\n    for (idx, ch) in input.char_indices() {\n        if ch == '.' 
{\n            // If a dot is found, create a variant using the current slice and the remainder of the\n            // string.\n            let prefix = &input[0..idx];\n            let suffix = if idx + 1 < input.len() {\n                Some(&input[idx + 1..])\n            } else {\n                None\n            };\n            variants.push((prefix, suffix));\n        }\n    }\n\n    variants.push((&input[0..], None));\n\n    variants\n}\n\nfn convert_hit(\n    hit: quickwit_proto::search::Hit,\n    append_shard_doc: bool,\n    _source_excludes: &Option<Vec<String>>,\n    _source_includes: &Option<Vec<String>>,\n) -> ElasticHit {\n    let mut json: serde_json::Value = serde_json::from_str(&hit.json).unwrap_or(json!({}));\n    filter_source(&mut json, _source_excludes, _source_includes);\n    let source =\n        Source::from_string(serde_json::to_string(&json).unwrap_or_else(|_| \"{}\".to_string()))\n            .unwrap_or_else(|_| Source::from_string(\"{}\".to_string()).unwrap());\n\n    let mut sort = Vec::new();\n    if let Some(partial_hit) = hit.partial_hit {\n        if let Some(sort_value) = partial_hit.sort_value {\n            sort.push(sort_value.into_json());\n        }\n        if let Some(sort_value2) = partial_hit.sort_value2 {\n            sort.push(sort_value2.into_json());\n        }\n        if append_shard_doc {\n            sort.push(serde_json::Value::String(\n                quickwit_search::GlobalDocAddress::from_partial_hit(&partial_hit).to_string(),\n            ));\n        }\n    }\n\n    ElasticHit {\n        fields: Default::default(),\n        explanation: None,\n        index: hit.index_id,\n        id: \"\".to_string(),\n        score: None,\n        nested: None,\n        source,\n        highlight: Default::default(),\n        inner_hits: Default::default(),\n        matched_queries: Vec::default(),\n        sort,\n    }\n}\n\nasync fn es_compat_index_multi_search(\n    payload: Bytes,\n    multi_search_params: 
MultiSearchQueryParams,\n    search_service: Arc<dyn SearchService>,\n) -> Result<MultiSearchResponse, ElasticsearchError> {\n    let mut search_requests = Vec::new();\n    let str_payload = from_utf8(&payload)\n        .map_err(|err| SearchError::InvalidQuery(format!(\"invalid UTF-8: {err}\")))?;\n    let mut payload_lines = str_lines(str_payload);\n\n    while let Some(line) = payload_lines.next() {\n        let mut request_header =\n            serde_json::from_str::<MultiSearchHeader>(line).map_err(|err| {\n                SearchError::InvalidArgument(format!(\n                    \"failed to parse request header `{}...`: {}\",\n                    truncate_str(line, 20),\n                    err\n                ))\n            })?;\n        request_header.apply_query_param_defaults(&multi_search_params);\n        if request_header.indexes.is_empty() {\n            return Err(ElasticsearchError::from(SearchError::InvalidArgument(\n                \"`_msearch` request header must define at least one index\".to_string(),\n            )));\n        }\n        for index in &request_header.indexes {\n            validate_index_id_pattern(index, true).map_err(|err| {\n                SearchError::InvalidArgument(format!(\n                    \"request header contains an invalid index: {err}\"\n                ))\n            })?;\n        }\n        let index_ids_patterns = request_header.indexes.clone();\n        let search_body = payload_lines\n            .next()\n            .ok_or_else(|| {\n                SearchError::InvalidArgument(\"expect request body after request header\".to_string())\n            })\n            .and_then(|line| {\n                serde_json::from_str::<SearchBody>(line).map_err(|err| {\n                    SearchError::InvalidArgument(format!(\n                        \"failed to parse request body `{}...`: {}\",\n                        truncate_str(line, 20),\n                        err\n                    ))\n                })\n 
           })?;\n        let mut search_query_params = SearchQueryParams::from(request_header);\n        if let Some(_source_excludes) = &multi_search_params._source_excludes {\n            search_query_params._source_excludes = Some(_source_excludes.to_vec());\n        }\n        if let Some(_source_includes) = &multi_search_params._source_includes {\n            search_query_params._source_includes = Some(_source_includes.to_vec());\n        }\n        if let Some(extra_filters) = &multi_search_params.extra_filters {\n            search_query_params.extra_filters = Some(extra_filters.to_vec());\n        }\n        let es_request =\n            build_request_for_es_api(index_ids_patterns, search_query_params, search_body)?;\n        search_requests.push(es_request);\n    }\n\n    // TODO: forced to do weird referencing to work around https://github.com/rust-lang/rust/issues/100905\n    // otherwise append_shard_doc is captured by ref, and we get lifetime issues\n    let futures = search_requests\n        .into_iter()\n        .map(|(search_request, append_shard_doc)| {\n            let search_service = &search_service;\n            let _source_excludes = multi_search_params._source_excludes.clone();\n            let _source_includes = multi_search_params._source_includes.clone();\n            async move {\n                let start_instant = Instant::now();\n                let search_response: SearchResponse =\n                    search_service.clone().root_search(search_request).await?;\n                let elapsed = start_instant.elapsed();\n                let mut search_response_rest: ElasticsearchResponse =\n                    convert_to_es_search_response(\n                        search_response,\n                        append_shard_doc,\n                        _source_excludes,\n                        _source_includes,\n                        true, //< allow_partial_results. 
Set to true to match ES's behavior.\n                    )?;\n                search_response_rest.took = elapsed.as_millis() as u32;\n                Ok::<_, ElasticsearchError>(search_response_rest)\n            }\n        });\n    let max_concurrent_searches =\n        multi_search_params.max_concurrent_searches.unwrap_or(10) as usize;\n    let search_responses = futures::stream::iter(futures)\n        .buffered(max_concurrent_searches)\n        .collect::<Vec<_>>()\n        .await;\n    let responses = search_responses\n        .into_iter()\n        .map(|search_response| match search_response {\n            Ok(search_response) => MultiSearchSingleResponse::from(search_response),\n            Err(error) => MultiSearchSingleResponse::from(error),\n        })\n        .collect_vec();\n    let multi_search_response = MultiSearchResponse { responses };\n    Ok(multi_search_response)\n}\n\nasync fn es_scroll(\n    scroll_query_params: ScrollQueryParams,\n    search_service: Arc<dyn SearchService>,\n) -> Result<ElasticsearchResponse, ElasticsearchError> {\n    let start_instant = Instant::now();\n    let Some(scroll_id) = scroll_query_params.scroll_id.clone() else {\n        return Err(SearchError::InvalidArgument(\"missing scroll_id\".to_string()).into());\n    };\n    let scroll_ttl_secs: Option<u32> = if let Some(scroll_ttl) = scroll_query_params.scroll {\n        let scroll_ttl_duration = humantime::parse_duration(&scroll_ttl)\n            .map_err(|_| SearchError::InvalidArgument(format!(\"Scroll invalid: {scroll_ttl}\")))?;\n        Some(scroll_ttl_duration.as_secs() as u32)\n    } else {\n        None\n    };\n    let scroll_request = ScrollRequest {\n        scroll_id,\n        scroll_ttl_secs,\n    };\n    let search_response: SearchResponse = search_service.scroll(scroll_request).await?;\n    // TODO append_shard_doc depends on the initial request, but we don't have access to it\n\n    // Ideally, we would have wanted to reuse the setting from the initial 
search request.\n    // However, passing that parameter is cumbersome, so we cut some corner and forbid the\n    // use of scroll requests in combination with allow_partial_results set to false.\n    let allow_failed_splits = true;\n    let mut search_response_rest: ElasticsearchResponse =\n        convert_to_es_search_response(search_response, false, None, None, allow_failed_splits)?;\n    search_response_rest.took = start_instant.elapsed().as_millis() as u32;\n    Ok(search_response_rest)\n}\n\nfn convert_to_es_cat_indices_response(\n    index_id_to_resp: &mut HashMap<IndexUid, ElasticsearchCatIndexResponse>,\n    splits: Vec<SplitMetadata>,\n) -> Vec<ElasticsearchCatIndexResponse> {\n    for split_metadata in splits {\n        let resp_entry = index_id_to_resp\n            .get_mut(&split_metadata.index_uid)\n            .unwrap_or_else(|| {\n                panic!(\n                    \"index_id {} not found in index_id_to_resp\",\n                    split_metadata.index_uid\n                )\n            });\n        let cat_index_entry: ElasticsearchCatIndexResponse = split_metadata.into();\n        *resp_entry += cat_index_entry.clone();\n    }\n    let mut indices: Vec<ElasticsearchCatIndexResponse> =\n        index_id_to_resp.values().cloned().collect();\n    indices.sort_by(|a, b| a.index.cmp(&b.index));\n\n    indices\n}\n\nfn convert_to_es_stats_response(\n    index_uid_to_index_id: HashMap<IndexUid, String>,\n    splits: Vec<SplitMetadata>,\n) -> ElasticsearchStatsResponse {\n    let mut indices: HashMap<String, StatsResponseEntry> = index_uid_to_index_id\n        .values()\n        .map(|index_id| (index_id.to_owned(), StatsResponseEntry::default()))\n        .collect();\n    let mut _all = StatsResponseEntry::default();\n\n    for split_metadata in splits {\n        let index_id = index_uid_to_index_id\n            .get(&split_metadata.index_uid)\n            .unwrap_or_else(|| {\n                panic!(\n                    \"index_uid {} not 
found in index_uid_to_index_id\",\n                    split_metadata.index_uid\n                )\n            });\n        let resp_entry = indices.get_mut(index_id).unwrap_or_else(|| {\n            panic!(\n                \"index_id {} not found in index_id_to_resp\",\n                split_metadata.index_uid\n            )\n        });\n        let stats_entry: StatsResponseEntry = split_metadata.into();\n        *resp_entry += stats_entry.clone();\n        _all += stats_entry.clone();\n    }\n    ElasticsearchStatsResponse { _all, indices }\n}\n\n#[allow(clippy::result_large_err)]\nfn convert_to_es_search_response(\n    resp: SearchResponse,\n    append_shard_doc: bool,\n    _source_excludes: Option<Vec<String>>,\n    _source_includes: Option<Vec<String>>,\n    allow_partial_results: bool,\n) -> Result<ElasticsearchResponse, ElasticsearchError> {\n    if (!allow_partial_results || resp.num_successful_splits == 0)\n        && let Some(search_error) = SearchError::from_split_errors(&resp.failed_splits)\n    {\n        return Err(ElasticsearchError::from(search_error));\n    }\n    let hits: Vec<ElasticHit> = resp\n        .hits\n        .into_iter()\n        .map(|hit| convert_hit(hit, append_shard_doc, &_source_excludes, &_source_includes))\n        .collect();\n    let aggregations: Option<AggregationResults> =\n        if let Some(aggregation_postcard) = resp.aggregation_postcard {\n            let aggregations =\n                AggregationResults::from_postcard(&aggregation_postcard).map_err(|_| {\n                    ElasticsearchError::new(\n                        StatusCode::INTERNAL_SERVER_ERROR,\n                        \"Failed to parse aggregation results\".to_string(),\n                        None,\n                    )\n                })?;\n            Some(aggregations)\n        } else {\n            None\n        };\n    let num_failed_splits = resp.failed_splits.len() as u32;\n    let num_successful_splits = resp.num_successful_splits as 
u32;\n    let num_total_splits = num_successful_splits + num_failed_splits;\n    Ok(ElasticsearchResponse {\n        timed_out: false,\n        hits: HitsMetadata {\n            total: Some(TotalHits {\n                value: resp.num_hits,\n                relation: TotalHitsRelation::Equal,\n            }),\n            max_score: None,\n            hits,\n        },\n        aggregations,\n        scroll_id: resp.scroll_id,\n        // There is no concept of shards here, but use this to convey split search failures.\n        shards: ShardStatistics {\n            total: num_total_splits,\n            successful: num_successful_splits,\n            skipped: 0u32,\n            failed: num_failed_splits,\n            failures: Vec::new(),\n        },\n        ..Default::default()\n    })\n}\n\npub(crate) fn str_lines(body: &str) -> impl Iterator<Item = &str> {\n    body.lines()\n        .map(|line| line.trim())\n        .filter(|line| !line.is_empty())\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::search::SplitSearchError;\n    use warp::hyper::StatusCode;\n\n    use super::{partial_hit_from_search_after_param, *};\n\n    #[test]\n    fn test_partial_hit_from_search_after_param_invalid_length() {\n        let search_after = vec![serde_json::json!([1])];\n        let sort_order = &[];\n        let error = partial_hit_from_search_after_param(search_after, sort_order).unwrap_err();\n        assert_eq!(error.status, StatusCode::BAD_REQUEST);\n        assert_eq!(\n            error.error.reason.unwrap(),\n            \"sort and search_after are of different length\"\n        );\n    }\n\n    #[test]\n    fn test_partial_hit_from_search_after_param_invalid_search_after_value() {\n        let search_after = vec![serde_json::json!([1])];\n        let sort_order = &[quickwit_proto::search::SortField {\n            field_name: \"field1\".to_string(),\n            sort_order: 1,\n            sort_datetime_format: None,\n        }];\n        let error = 
partial_hit_from_search_after_param(search_after, sort_order).unwrap_err();\n        assert_eq!(error.status, StatusCode::BAD_REQUEST);\n        assert_eq!(\n            error.error.reason.unwrap(),\n            \"invalid search_after field value, expect bool, number or string\"\n        );\n    }\n\n    #[test]\n    fn test_partial_hit_from_search_after_param_invalid_search_after_doc_id() {\n        let search_after = vec![serde_json::json!(\"split_id:1112\")];\n        let sort_order = &[quickwit_proto::search::SortField {\n            field_name: \"_doc\".to_string(),\n            sort_order: 1,\n            sort_datetime_format: None,\n        }];\n        let error = partial_hit_from_search_after_param(search_after, sort_order).unwrap_err();\n        assert_eq!(error.status, StatusCode::BAD_REQUEST);\n        assert_eq!(\n            error.error.reason.unwrap(),\n            \"invalid search_after doc id, must be of form `{split_id}:{segment_id: u32}:{doc_id: \\\n             u32}`\"\n        );\n    }\n\n    #[test]\n    fn test_single_element() {\n        let input = \"app\";\n        let expected = vec![(\"app\", None)];\n        assert_eq!(generate_path_variants_with_suffix(input), expected);\n    }\n\n    #[test]\n    fn test_two_elements() {\n        let input = \"app.id\";\n        let expected = vec![(\"app\", Some(\"id\")), (\"app.id\", None)];\n        assert_eq!(generate_path_variants_with_suffix(input), expected);\n    }\n\n    #[test]\n    fn test_multiple_elements() {\n        let input = \"app.id.name\";\n        let expected = vec![\n            (\"app\", Some(\"id.name\")),\n            (\"app.id\", Some(\"name\")),\n            (\"app.id.name\", None),\n        ];\n        assert_eq!(generate_path_variants_with_suffix(input), expected);\n    }\n\n    #[test]\n    fn test_include_fields1() {\n        let mut fields = json!({\n            \"app\": { \"id\": 123, \"name\": \"Blub\" },\n            \"user\": { \"id\": 456, \"name\": \"Fred\" }\n  
      });\n\n        let includes = Some(vec![\"app.id\".to_string()]);\n        filter_source(&mut fields, &None, &includes);\n\n        let expected = json!({\n            \"app\": { \"id\": 123 }\n        });\n\n        assert_eq!(fields, expected);\n    }\n    #[test]\n    fn test_include_fields2() {\n        let mut fields = json!({\n            \"app\": { \"id\": 123, \"name\": \"Blub\" },\n            \"app.id\": { \"id\": 123, \"name\": \"Blub\" },\n            \"user\": { \"id\": 456, \"name\": \"Fred\" }\n        });\n\n        let includes = Some(vec![\"app\".to_string(), \"app.id\".to_string()]);\n        filter_source(&mut fields, &None, &includes);\n\n        let expected = json!({\n            \"app\": { \"id\": 123, \"name\": \"Blub\" },\n            \"app.id\": { \"id\": 123, \"name\": \"Blub\" },\n        });\n\n        assert_eq!(fields, expected);\n    }\n\n    #[test]\n    fn test_exclude_fields() {\n        let mut fields = json!({\n            \"app\": {\n                \"id\": 123,\n                \"name\": \"Blub\"\n            },\n            \"user\": {\n                \"id\": 456,\n                \"name\": \"Fred\"\n            }\n        });\n\n        let excludes = Some(vec![\"app.name\".to_string(), \"user.id\".to_string()]);\n        filter_source(&mut fields, &excludes, &None);\n\n        let expected = json!({\n            \"app\": {\n                \"id\": 123\n            },\n            \"user\": {\n                \"name\": \"Fred\"\n            }\n        });\n\n        assert_eq!(fields, expected);\n    }\n\n    #[test]\n    fn test_include_and_exclude_fields() {\n        let mut fields = json!({\n            \"app\": { \"id\": 123, \"name\": \"Blub\", \"version\": \"1.0\" },\n            \"user\": { \"id\": 456, \"name\": \"Fred\", \"email\": \"john@example.com\" }\n        });\n\n        let includes = Some(vec![\n            \"app\".to_string(),\n            \"user.name\".to_string(),\n            
\"user.email\".to_string(),\n        ]);\n        let excludes = Some(vec![\"app.version\".to_string(), \"user.email\".to_string()]);\n        filter_source(&mut fields, &excludes, &includes);\n\n        let expected = json!({\n            \"app\": { \"id\": 123, \"name\": \"Blub\" },\n            \"user\": { \"name\": \"Fred\" }\n        });\n\n        assert_eq!(fields, expected);\n    }\n\n    #[test]\n    fn test_no_includes_or_excludes() {\n        let mut fields = json!({\n            \"app\": {\n                \"id\": 123,\n                \"name\": \"Blub\"\n            }\n        });\n\n        filter_source(&mut fields, &None, &None);\n\n        let expected = json!({\n            \"app\": {\n                \"id\": 123,\n                \"name\": \"Blub\"\n            }\n        });\n\n        assert_eq!(fields, expected);\n    }\n\n    // We test that the behavior of allow partial search results.\n    #[test]\n    fn test_convert_to_es_search_response_allow_partial() {\n        let split_error = SplitSearchError {\n            error: \"some-error\".to_string(),\n            split_id: \"some-split-id\".to_string(),\n            retryable_error: true,\n        };\n        {\n            let search_response = SearchResponse {\n                num_successful_splits: 1,\n                failed_splits: vec![split_error.clone()],\n                ..Default::default()\n            };\n            convert_to_es_search_response(search_response, false, None, None, false).unwrap_err();\n        }\n        {\n            let search_response = SearchResponse {\n                num_successful_splits: 1,\n                failed_splits: vec![split_error.clone()],\n                ..Default::default()\n            };\n            // if we allow partial search results, this should not fail, but we report the presence\n            // of failed splits in the fail shard response.\n            let es_search_resp =\n                
convert_to_es_search_response(search_response, false, None, None, true).unwrap();\n            assert_eq!(es_search_resp.shards.failed, 1);\n        }\n        {\n            let search_response = SearchResponse {\n                failed_splits: vec![split_error.clone()],\n                ..Default::default()\n            };\n            // Even if we allow partial search results, with a failure and no success, we have a\n            // failure.\n            convert_to_es_search_response(search_response, false, None, None, true).unwrap_err();\n        }\n        {\n            // Not having any splits (no failure + no success) is not considered a failure.\n            for allow_partial in [true, false] {\n                let search_response = SearchResponse::default();\n                let es_search_resp = convert_to_es_search_response(\n                    search_response,\n                    false,\n                    None,\n                    None,\n                    allow_partial,\n                )\n                .unwrap();\n                assert_eq!(es_search_resp.shards.failed, 0);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/format.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\n\nuse quickwit_config::ConfigFormat;\nuse serde::{self, Deserialize, Serialize, Serializer};\nuse thiserror::Error;\nuse warp::hyper::header::CONTENT_TYPE;\nuse warp::{Filter, Rejection};\n\n/// Body output format used for the REST API.\n#[derive(Deserialize, Clone, Debug, Eq, PartialEq, Copy, utoipa::ToSchema)]\n#[serde(rename_all = \"snake_case\")]\n#[derive(Default)]\npub enum BodyFormat {\n    Json,\n    #[default]\n    PrettyJson,\n}\n\nimpl BodyFormat {\n    pub(crate) fn result_to_vec<T: serde::Serialize, E: serde::Serialize>(\n        &self,\n        result: &Result<T, E>,\n    ) -> Result<Vec<u8>, ()> {\n        match result {\n            Ok(value) => self.value_to_vec(value),\n            Err(err) => self.value_to_vec(err),\n        }\n    }\n\n    fn value_to_vec(&self, value: &impl serde::Serialize) -> Result<Vec<u8>, ()> {\n        match &self {\n            Self::Json => serde_json::to_vec(value),\n            Self::PrettyJson => serde_json::to_vec_pretty(value),\n        }\n        .map_err(|_| {\n            tracing::error!(\"response serialization failed\");\n        })\n    }\n}\n\nimpl fmt::Display for BodyFormat {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        match &self {\n            Self::Json => write!(formatter, \"json\"),\n            Self::PrettyJson => write!(formatter, 
\"pretty_json\"),\n        }\n    }\n}\n\nimpl Serialize for BodyFormat {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        serializer.serialize_str(&self.to_string())\n    }\n}\n\n/// This struct represents a QueryString passed to\n/// the REST API.\n#[derive(Deserialize, Debug, Eq, PartialEq, utoipa::IntoParams)]\n#[into_params(parameter_in = Query)]\nstruct FormatQueryString {\n    /// The output format requested.\n    #[serde(default)]\n    pub format: BodyFormat,\n}\n\npub(crate) fn extract_format_from_qs()\n-> impl Filter<Extract = (BodyFormat,), Error = Rejection> + Clone {\n    warp::query::<FormatQueryString>().map(|format_qs: FormatQueryString| format_qs.format)\n}\n\n#[derive(Debug, Error)]\n#[error(\n    \"request's content-type is not supported: supported media types are `application/json`, \\\n     `application/toml`, and `application/yaml`\"\n)]\npub(crate) struct UnsupportedMediaType;\n\nimpl warp::reject::Reject for UnsupportedMediaType {}\n\npub(crate) fn extract_config_format()\n-> impl Filter<Extract = (ConfigFormat,), Error = Rejection> + Copy {\n    warp::filters::header::optional::<mime_guess::Mime>(CONTENT_TYPE.as_str()).and_then(\n        |mime_opt: Option<mime_guess::Mime>| {\n            if let Some(mime) = mime_opt {\n                let config_format = match mime.subtype().as_str() {\n                    \"json\" => ConfigFormat::Json,\n                    \"toml\" => ConfigFormat::Toml,\n                    \"yaml\" => ConfigFormat::Yaml,\n                    _ => {\n                        return futures::future::err(warp::reject::custom(UnsupportedMediaType));\n                    }\n                };\n                return futures::future::ok(config_format);\n            }\n            futures::future::ok(ConfigFormat::Json)\n        },\n    )\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/grpc.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeSet;\nuse std::sync::Arc;\n\nuse anyhow::Context;\nuse quickwit_cluster::cluster_grpc_server;\nuse quickwit_common::tower::BoxFutureInfaillible;\nuse quickwit_config::GrpcConfig;\nuse quickwit_config::service::QuickwitService;\nuse quickwit_proto::developer::DeveloperServiceClient;\nuse quickwit_proto::indexing::IndexingServiceClient;\nuse quickwit_proto::jaeger::storage::v1::span_reader_plugin_server::SpanReaderPluginServer;\nuse quickwit_proto::jaeger::storage::v2::trace_reader_server::TraceReaderServer;\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::logs_service_server::LogsServiceServer;\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::trace_service_server::TraceServiceServer;\nuse quickwit_proto::search::search_service_server::SearchServiceServer;\nuse quickwit_proto::tonic::codegen::CompressionEncoding;\nuse quickwit_proto::tonic::transport::server::TcpIncoming;\nuse quickwit_proto::tonic::transport::{Certificate, Identity, Server, ServerTlsConfig};\nuse tokio::net::TcpListener;\nuse tonic_health::pb::FILE_DESCRIPTOR_SET as HEALTH_FILE_DESCRIPTOR_SET;\nuse tonic_health::pb::health_server::{Health, HealthServer};\nuse tonic_reflection::pb::v1::FILE_DESCRIPTOR_SET as REFLECTION_FILE_DESCRIPTOR_SET;\nuse tonic_reflection::server::v1::{ServerReflection, ServerReflectionServer};\nuse 
tracing::*;\n\nuse crate::developer_api::DeveloperApiServer;\nuse crate::search_api::GrpcSearchAdapter;\nuse crate::{INDEXING_GRPC_SERVER_METRICS_LAYER, QuickwitServices};\n\n/// Starts and binds gRPC services to `grpc_listen_addr`.\npub(crate) async fn start_grpc_server(\n    tcp_listener: TcpListener,\n    grpc_config: GrpcConfig,\n    services: Arc<QuickwitServices>,\n    readiness_trigger: BoxFutureInfaillible<()>,\n    shutdown_signal: BoxFutureInfaillible<()>,\n    health_service: HealthServer<impl Health>,\n) -> anyhow::Result<()> {\n    let mut enabled_grpc_services = BTreeSet::new();\n    let mut file_descriptor_sets = Vec::new();\n    let mut server = Server::builder();\n\n    if let Some(tls_config) = grpc_config.tls {\n        let cert = std::fs::read_to_string(tls_config.cert_path)?;\n        let key = std::fs::read_to_string(tls_config.key_path)?;\n        let identity = Identity::from_pem(cert, key);\n\n        let mut tls = ServerTlsConfig::new().identity(identity);\n\n        if tls_config.validate_client {\n            let ca_cert = std::fs::read_to_string(tls_config.ca_path)?;\n            let ca_cert = Certificate::from_pem(ca_cert);\n            tls = tls.client_ca_root(ca_cert);\n        }\n        // TODO using this builtin method means we have no way of hot-reloading certificates\n        // (i.e. 
the process must be restarted every time its certificate expires)\n        // to do better, we'd need to wrap the TcpListener with something that does (m)TLS\n        // and that we control, however it would be somewhat painful, and more error prone\n        server = server.tls_config(tls)?;\n    }\n\n    let cluster_grpc_service = cluster_grpc_server(services.cluster.clone());\n    file_descriptor_sets.push(quickwit_proto::cluster::CLUSTER_PLANE_FILE_DESCRIPTOR_SET);\n\n    // Mount gRPC metastore service if `QuickwitService::Metastore` is enabled on node.\n    let metastore_grpc_service = if let Some(metastore_server) = &services.metastore_server_opt {\n        enabled_grpc_services.insert(\"metastore\");\n        file_descriptor_sets.push(quickwit_proto::metastore::METASTORE_FILE_DESCRIPTOR_SET);\n\n        Some(metastore_server.as_grpc_service(grpc_config.max_message_size))\n    } else {\n        None\n    };\n    // Mount gRPC indexing service if `QuickwitService::Indexer` is enabled on node.\n    let indexing_grpc_service = if services\n        .node_config\n        .is_service_enabled(QuickwitService::Indexer)\n    {\n        if let Some(indexing_service) = services.indexing_service_opt.clone() {\n            enabled_grpc_services.insert(\"indexing\");\n            file_descriptor_sets.push(quickwit_proto::indexing::INDEXING_FILE_DESCRIPTOR_SET);\n\n            let indexing_service = IndexingServiceClient::tower()\n                .stack_layer(INDEXING_GRPC_SERVER_METRICS_LAYER.clone())\n                .build_from_mailbox(indexing_service);\n            Some(indexing_service.as_grpc_service(grpc_config.max_message_size))\n        } else {\n            None\n        }\n    } else {\n        None\n    };\n    // Mount gRPC ingest service if `QuickwitService::Indexer` is enabled on node.\n    let ingest_api_grpc_service = if services\n        .node_config\n        .is_service_enabled(QuickwitService::Indexer)\n    {\n        
enabled_grpc_services.insert(\"ingest-api\");\n        Some(\n            services\n                .ingest_service\n                .as_grpc_service(grpc_config.max_message_size),\n        )\n    } else {\n        None\n    };\n    let ingest_router_grpc_service = if services\n        .node_config\n        .is_service_enabled(QuickwitService::Indexer)\n    {\n        enabled_grpc_services.insert(\"ingest-router\");\n\n        let ingest_router_service = services\n            .ingest_router_service\n            .as_grpc_service(grpc_config.max_message_size);\n        Some(ingest_router_service)\n    } else {\n        None\n    };\n\n    let ingester_grpc_service = if let Some(ingester_service) = services.ingester_service() {\n        enabled_grpc_services.insert(\"ingester\");\n        file_descriptor_sets.push(quickwit_proto::ingest::INGEST_FILE_DESCRIPTOR_SET);\n        let ingester_grpc_service = ingester_service.as_grpc_service(grpc_config.max_message_size);\n        Some(ingester_grpc_service)\n    } else {\n        None\n    };\n\n    // Mount gRPC control plane service if `QuickwitService::ControlPlane` is enabled on node.\n    let control_plane_grpc_service = if services\n        .node_config\n        .is_service_enabled(QuickwitService::ControlPlane)\n    {\n        enabled_grpc_services.insert(\"control-plane\");\n        file_descriptor_sets.push(quickwit_proto::control_plane::CONTROL_PLANE_FILE_DESCRIPTOR_SET);\n\n        Some(\n            services\n                .control_plane_client\n                .as_grpc_service(grpc_config.max_message_size),\n        )\n    } else {\n        None\n    };\n    // Mount gRPC OpenTelemetry OTLP services if present.\n    let otlp_trace_grpc_service =\n        if let Some(otlp_traces_service) = services.otlp_traces_service_opt.clone() {\n            enabled_grpc_services.insert(\"otlp-traces\");\n            let trace_service = TraceServiceServer::new(otlp_traces_service)\n                
.accept_compressed(CompressionEncoding::Gzip)\n                .accept_compressed(CompressionEncoding::Zstd)\n                .max_decoding_message_size(grpc_config.max_message_size.0 as usize)\n                .max_encoding_message_size(grpc_config.max_message_size.0 as usize);\n            Some(trace_service)\n        } else {\n            None\n        };\n    let otlp_log_grpc_service =\n        if let Some(otlp_logs_service) = services.otlp_logs_service_opt.clone() {\n            enabled_grpc_services.insert(\"otlp-logs\");\n            let logs_service = LogsServiceServer::new(otlp_logs_service)\n                .accept_compressed(CompressionEncoding::Gzip)\n                .accept_compressed(CompressionEncoding::Zstd)\n                .max_decoding_message_size(grpc_config.max_message_size.0 as usize)\n                .max_encoding_message_size(grpc_config.max_message_size.0 as usize);\n            Some(logs_service)\n        } else {\n            None\n        };\n    // Mount gRPC search service if `QuickwitService::Searcher` is enabled on node.\n    let search_grpc_service = if services\n        .node_config\n        .is_service_enabled(QuickwitService::Searcher)\n    {\n        enabled_grpc_services.insert(\"search\");\n        file_descriptor_sets.push(quickwit_proto::search::SEARCH_FILE_DESCRIPTOR_SET);\n\n        let search_service = services.search_service.clone();\n        let grpc_search_service = GrpcSearchAdapter::from(search_service);\n        Some(\n            SearchServiceServer::new(grpc_search_service)\n                .max_decoding_message_size(grpc_config.max_message_size.0 as usize)\n                .max_encoding_message_size(grpc_config.max_message_size.0 as usize),\n        )\n    } else {\n        None\n    };\n\n    // Mount gRPC jaeger service if present.\n    let jaeger_grpc_service = if let Some(jaeger_service) = services.jaeger_service_opt.clone() {\n        enabled_grpc_services.insert(\"jaeger\");\n        
Some(SpanReaderPluginServer::new(jaeger_service))\n    } else {\n        None\n    };\n\n    // Mount gRPC jaeger v2 service (TraceReader) if present.\n    let jaeger_v2_grpc_service = if let Some(jaeger_service) = services.jaeger_service_opt.clone() {\n        enabled_grpc_services.insert(\"jaeger-v2\");\n        Some(TraceReaderServer::new(jaeger_service))\n    } else {\n        None\n    };\n\n    let developer_grpc_service = {\n        enabled_grpc_services.insert(\"developer\");\n        file_descriptor_sets.push(quickwit_proto::developer::DEVELOPER_FILE_DESCRIPTOR_SET);\n\n        let developer_service = DeveloperApiServer::from_services(&services);\n\n        DeveloperServiceClient::new(developer_service)\n            .as_grpc_service(DeveloperApiServer::MAX_GRPC_MESSAGE_SIZE)\n    };\n    enabled_grpc_services.insert(\"health\");\n    file_descriptor_sets.push(HEALTH_FILE_DESCRIPTOR_SET);\n\n    enabled_grpc_services.insert(\"reflection\");\n    file_descriptor_sets.push(REFLECTION_FILE_DESCRIPTOR_SET);\n    let reflection_service = build_reflection_service(&file_descriptor_sets)?;\n\n    let server_router = server\n        .add_service(cluster_grpc_service)\n        .add_service(developer_grpc_service)\n        .add_service(health_service)\n        .add_service(reflection_service)\n        .add_optional_service(control_plane_grpc_service)\n        .add_optional_service(indexing_grpc_service)\n        .add_optional_service(ingest_api_grpc_service)\n        .add_optional_service(ingest_router_grpc_service)\n        .add_optional_service(ingester_grpc_service)\n        .add_optional_service(jaeger_grpc_service)\n        .add_optional_service(jaeger_v2_grpc_service)\n        .add_optional_service(metastore_grpc_service)\n        .add_optional_service(otlp_log_grpc_service)\n        .add_optional_service(otlp_trace_grpc_service)\n        .add_optional_service(search_grpc_service);\n\n    let grpc_listen_addr = tcp_listener.local_addr()?;\n    info!(\n        
enabled_grpc_services=?enabled_grpc_services,\n        grpc_listen_addr=?grpc_listen_addr,\n        \"starting gRPC server listening on {grpc_listen_addr}\"\n    );\n    // nodelay=true and keepalive=None are the default values for Server::builder()\n    let tcp_incoming = TcpIncoming::from(tcp_listener)\n        .with_nodelay(Some(true))\n        .with_keepalive(None);\n    let serve_fut = server_router.serve_with_incoming_shutdown(tcp_incoming, shutdown_signal);\n    let (serve_res, _trigger_res) = tokio::join!(serve_fut, readiness_trigger);\n    serve_res?;\n    Ok(())\n}\n\nfn build_reflection_service(\n    file_descriptor_sets: &[&[u8]],\n) -> anyhow::Result<ServerReflectionServer<impl ServerReflection>> {\n    let mut builder = tonic_reflection::server::Builder::configure();\n\n    for file_descriptor_set in file_descriptor_sets {\n        builder = builder.register_encoded_file_descriptor_set(file_descriptor_set)\n    }\n    builder\n        .build_v1()\n        .context(\"failed to build reflection service\")\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/health_check_api/handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_actors::{Healthz, Mailbox};\nuse quickwit_cluster::Cluster;\nuse quickwit_indexing::IndexingService;\nuse quickwit_janitor::JanitorService;\nuse tracing::error;\nuse warp::hyper::StatusCode;\nuse warp::reply::with_status;\nuse warp::{Filter, Rejection};\n\nuse crate::rest::recover_fn;\nuse crate::with_arg;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(get_liveness, get_readiness))]\npub struct HealthCheckApi;\n\n/// Health check handlers.\npub(crate) fn health_check_handlers(\n    cluster: Cluster,\n    indexer_service_opt: Option<Mailbox<IndexingService>>,\n    janitor_service_opt: Option<Mailbox<JanitorService>>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    liveness_handler(indexer_service_opt, janitor_service_opt).or(readiness_handler(cluster))\n}\n\nfn liveness_handler(\n    indexer_service_opt: Option<Mailbox<IndexingService>>,\n    janitor_service_opt: Option<Mailbox<JanitorService>>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"health\" / \"livez\")\n        .and(warp::get())\n        .and(with_arg(indexer_service_opt))\n        .and(with_arg(janitor_service_opt))\n        .then(get_liveness)\n        .recover(recover_fn)\n}\n\nfn readiness_handler(\n    cluster: Cluster,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = 
Rejection> + Clone {\n    warp::path!(\"health\" / \"readyz\")\n        .and(warp::get())\n        .and(with_arg(cluster))\n        .then(get_readiness)\n        .recover(recover_fn)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Node Health\",\n    path = \"/livez\",\n    responses(\n        (status = 200, description = \"The service is live.\", body = bool),\n        (status = 503, description = \"The service is not live.\", body = bool),\n    ),\n)]\n/// Get Node Liveliness\nasync fn get_liveness(\n    indexer_service_opt: Option<Mailbox<IndexingService>>,\n    janitor_service_opt: Option<Mailbox<JanitorService>>,\n) -> impl warp::Reply {\n    let mut is_live = true;\n\n    if let Some(indexer_service) = indexer_service_opt\n        && !indexer_service.ask(Healthz).await.unwrap_or(false)\n    {\n        error!(\"indexer service is unhealthy\");\n        is_live = false;\n    }\n    if let Some(janitor_service) = janitor_service_opt\n        && !janitor_service.ask(Healthz).await.unwrap_or(false)\n    {\n        error!(\"janitor service is unhealthy\");\n        is_live = false;\n    }\n    let status_code = if is_live {\n        StatusCode::OK\n    } else {\n        StatusCode::SERVICE_UNAVAILABLE\n    };\n    with_status(warp::reply::json(&is_live), status_code)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Node Health\",\n    path = \"/readyz\",\n    responses(\n        (status = 200, description = \"The service is ready.\", body = bool),\n        (status = 503, description = \"The service is not ready.\", body = bool),\n    ),\n)]\n/// Get Node Readiness\nasync fn get_readiness(cluster: Cluster) -> impl warp::Reply {\n    let is_ready = cluster.is_self_node_ready().await;\n    let status_code = if is_ready {\n        StatusCode::OK\n    } else {\n        StatusCode::SERVICE_UNAVAILABLE\n    };\n    with_status(warp::reply::json(&is_ready), status_code)\n}\n\n#[cfg(test)]\nmod tests {\n\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n\n   
 #[tokio::test]\n    async fn test_rest_search_api_health_checks() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[], &transport, false)\n            .await\n            .unwrap();\n        let health_check_handler = super::health_check_handlers(cluster.clone(), None, None);\n        let resp = warp::test::request()\n            .path(\"/health/livez\")\n            .reply(&health_check_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp = warp::test::request()\n            .path(\"/health/readyz\")\n            .reply(&health_check_handler)\n            .await;\n        assert_eq!(resp.status(), 503);\n        cluster.set_self_node_readiness(true).await;\n        let resp = warp::test::request()\n            .path(\"/health/readyz\")\n            .reply(&health_check_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/health_check_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod handler;\n\npub(crate) use handler::{HealthCheckApi, health_check_handlers};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/index_api/index_resource.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse bytes::Bytes;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{\n    ConfigFormat, NodeConfig, load_index_config_update, validate_index_id_pattern,\n};\nuse quickwit_index_management::{IndexService, IndexServiceError};\nuse quickwit_metastore::{\n    IndexMetadata, IndexMetadataResponseExt, ListIndexesMetadataResponseExt, ListSplitsQuery,\n    ListSplitsRequestExt, MetastoreServiceStreamSplitsExt, Split, SplitInfo, SplitState,\n};\nuse quickwit_proto::metastore::{\n    IndexMetadataRequest, ListIndexesMetadataRequest, ListSplitsRequest, MetastoreError,\n    MetastoreResult, MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::IndexId;\nuse serde::{Deserialize, Serialize};\nuse tracing::info;\nuse warp::{Filter, Rejection};\n\nuse super::rest_handler::log_failure;\nuse crate::format::{extract_config_format, extract_format_from_qs};\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::simple_list::from_simple_list;\nuse crate::with_arg;\n\npub fn get_index_metadata_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String)\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(get_index_metadata)\n        .and(extract_format_from_qs())\n        
.map(into_rest_api_response)\n        .boxed()\n}\n\npub async fn get_index_metadata(\n    index_id: IndexId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<IndexMetadata> {\n    info!(index_id = %index_id, \"get-index-metadata\");\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_metadata = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?;\n    Ok(index_metadata)\n}\n\n/// This struct represents the QueryString passed to\n/// the rest API to filter indexes.\n#[derive(Debug, Clone, Deserialize, Serialize, utoipa::IntoParams, utoipa::ToSchema, Default)]\n#[into_params(parameter_in = Query)]\npub struct ListIndexesQueryParams {\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub index_id_patterns: Option<Vec<String>>,\n}\n\npub fn list_indexes_metadata_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\")\n        .and(warp::get())\n        .and(warp::query())\n        .and(with_arg(metastore))\n        .then(list_indexes_metadata)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n/// Describes an index with its main information and statistics.\n#[derive(Serialize, Deserialize, utoipa::ToSchema)]\npub struct IndexStats {\n    #[schema(value_type = String)]\n    pub index_id: IndexId,\n    #[schema(value_type = String)]\n    pub index_uri: Uri,\n    pub num_published_splits: usize,\n    pub size_published_splits: u64,\n    pub num_published_docs: u64,\n    pub size_published_docs_uncompressed: u64,\n    pub timestamp_field_name: Option<String>,\n    pub min_timestamp: Option<i64>,\n    pub max_timestamp: Option<i64>,\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Indexes\",\n    path = 
\"/indexes/{index_id}/describe\",\n    responses(\n        (status = 200, description = \"Successfully fetched stats about Index.\", body = IndexStats)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to describe.\"),\n    )\n)]\n\n/// Describes an index.\npub async fn describe_index(\n    index_id: IndexId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<IndexStats> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_metadata = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?;\n    let query = ListSplitsQuery::for_index(index_metadata.index_uid.clone());\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n    let splits = metastore\n        .list_splits(list_splits_request)\n        .await?\n        .collect_splits()\n        .await?;\n    let published_splits: Vec<Split> = splits\n        .into_iter()\n        .filter(|split| split.split_state == SplitState::Published)\n        .collect();\n    let mut total_num_docs = 0;\n    let mut total_num_bytes = 0;\n    let mut total_uncompressed_num_bytes = 0;\n    let mut min_timestamp: Option<i64> = None;\n    let mut max_timestamp: Option<i64> = None;\n\n    for split in &published_splits {\n        total_num_docs += split.split_metadata.num_docs as u64;\n        total_num_bytes += split.split_metadata.footer_offsets.end;\n        total_uncompressed_num_bytes += split.split_metadata.uncompressed_docs_size_in_bytes;\n\n        if let Some(time_range) = &split.split_metadata.time_range {\n            min_timestamp = min_timestamp\n                .min(Some(*time_range.start()))\n                .or(Some(*time_range.start()));\n            max_timestamp = max_timestamp\n                .max(Some(*time_range.end()))\n                .or(Some(*time_range.end()));\n        }\n    }\n\n    let index_config = 
index_metadata.into_index_config();\n    let index_stats = IndexStats {\n        index_id,\n        index_uri: index_config.index_uri.clone(),\n        num_published_splits: published_splits.len(),\n        size_published_splits: total_num_bytes,\n        num_published_docs: total_num_docs,\n        size_published_docs_uncompressed: total_uncompressed_num_bytes,\n        timestamp_field_name: index_config.doc_mapping.timestamp_field,\n        min_timestamp,\n        max_timestamp,\n    };\n\n    Ok(index_stats)\n}\n\npub fn describe_index_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"describe\")\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(describe_index)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Indexes\",\n    path = \"/indexes\",\n    responses(\n        // We return `VersionedIndexMetadata` as it's the serialized model view.\n        (status = 200, description = \"Successfully fetched all indexes.\", body = [VersionedIndexMetadata])\n    ),\n    params(\n        ListIndexesQueryParams,\n        (\"index_id_patterns\" = String, Path, description = \"The index ID pattern to retrieve indexes for.\"),\n    )\n)]\n/// Gets indexes metadata.\npub async fn list_indexes_metadata(\n    list_indexes_params: ListIndexesQueryParams,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<Vec<IndexMetadata>> {\n    let list_indexes_metadata_request =\n        if let Some(index_id_patterns) = list_indexes_params.index_id_patterns {\n            for index_id_pattern in &index_id_patterns {\n                validate_index_id_pattern(index_id_pattern, true).map_err(|error| {\n                    MetastoreError::InvalidArgument {\n                        message: error.to_string(),\n                    }\n                
})?;\n            }\n            ListIndexesMetadataRequest { index_id_patterns }\n        } else {\n            ListIndexesMetadataRequest::all()\n        };\n    metastore\n        .list_indexes_metadata(list_indexes_metadata_request)\n        .await?\n        .deserialize_indexes_metadata()\n        .await\n}\n\n#[derive(Deserialize, utoipa::IntoParams, utoipa::ToSchema)]\n#[into_params(parameter_in = Query)]\npub struct CreateIndexQueryParams {\n    #[serde(default)]\n    overwrite: bool,\n}\n\npub fn create_index_handler(\n    index_service: IndexService,\n    node_config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\")\n        .and(warp::post())\n        .and(warp::query())\n        .and(extract_config_format())\n        .and(warp::body::content_length_limit(1024 * 1024))\n        .and(warp::filters::body::bytes())\n        .and(with_arg(index_service))\n        .and(with_arg(node_config))\n        .then(create_index)\n        .map(log_failure(\"failed to create index\"))\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Indexes\",\n    path = \"/indexes\",\n    request_body = VersionedIndexConfig,\n    responses(\n        // We return `VersionedIndexMetadata` as it's the serialized model view.\n        (status = 200, description = \"Successfully created index.\", body = VersionedIndexMetadata)\n    ),\n    params(\n        CreateIndexQueryParams,\n    )\n)]\n/// Creates index.\npub async fn create_index(\n    create_index_query_params: CreateIndexQueryParams,\n    config_format: ConfigFormat,\n    index_config_bytes: Bytes,\n    mut index_service: IndexService,\n    node_config: Arc<NodeConfig>,\n) -> Result<IndexMetadata, IndexServiceError> {\n    let index_config = quickwit_config::load_index_config_from_user_config(\n        config_format,\n        &index_config_bytes,\n        
&node_config.default_index_root_uri,\n    )\n    .map_err(IndexServiceError::InvalidConfig)?;\n    info!(index_id = %index_config.index_id, overwrite = create_index_query_params.overwrite, \"create-index\");\n    index_service\n        .create_index(index_config, create_index_query_params.overwrite)\n        .await\n}\n\n/// Query parameters for update index queries\n#[derive(Deserialize, Debug, Eq, PartialEq, utoipa::IntoParams)]\n#[into_params(parameter_in = Query)]\npub struct UpdateQueryParams {\n    /// Create the index if it doesn't exist yet\n    #[serde(default)]\n    pub create: bool,\n}\n\nfn update_index_qp() -> impl Filter<Extract = (UpdateQueryParams,), Error = Rejection> + Clone {\n    warp::query::<UpdateQueryParams>()\n}\n\npub fn update_index_handler(\n    index_service: IndexService,\n    node_config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String)\n        .and(warp::put())\n        .and(extract_config_format())\n        .and(update_index_qp())\n        .and(warp::body::content_length_limit(1024 * 1024))\n        .and(warp::filters::body::bytes())\n        .and(with_arg(index_service))\n        .and(with_arg(node_config))\n        .then(update_index)\n        .map(log_failure(\"failed to update index\"))\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Indexes\",\n    path = \"/indexes/{index_id}\",\n    request_body = VersionedIndexConfig,\n    responses(\n        (status = 200, description = \"Successfully updated the index configuration.\", body = VersionedIndexMetadata)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to update.\"),\n        UpdateQueryParams,\n    )\n)]\n/// Updates an existing index.\n///\n/// This endpoint follows PUT semantics, which means that all the fields of the\n/// current configuration are replaced by 
the values specified in this request\n/// or the associated defaults. In particular, if the field is optional (e.g.\n/// `retention_policy`), omitting it will delete the associated configuration.\n/// If the new configuration file contains updates that cannot be applied, the\n/// request fails, and none of the updates are applied.\npub async fn update_index(\n    target_index_id: IndexId,\n    config_format: ConfigFormat,\n    query_params: UpdateQueryParams,\n    index_config_bytes: Bytes,\n    mut index_service: IndexService,\n    node_config: Arc<NodeConfig>,\n) -> Result<IndexMetadata, IndexServiceError> {\n    info!(index_id = %target_index_id, \"update-index\");\n\n    let metastore = index_service.metastore();\n    let index_metadata_request = IndexMetadataRequest::for_index_id(target_index_id.to_string());\n    let current_index_metadata_res = metastore.index_metadata(index_metadata_request).await;\n\n    let current_index_metadata_ser = match current_index_metadata_res {\n        Ok(index_metadata) => index_metadata,\n        Err(MetastoreError::NotFound(_)) if query_params.create => {\n            let index_config = quickwit_config::load_index_config_from_user_config(\n                config_format,\n                &index_config_bytes,\n                &node_config.default_index_root_uri,\n            )\n            .map_err(IndexServiceError::InvalidConfig)?;\n            if index_config.index_id != target_index_id {\n                return Err(IndexServiceError::InvalidConfig(anyhow::anyhow!(\n                    \"`index_id` in config file does not match index_id from query path\"\n                )));\n            }\n            info!(index_id = %index_config.index_id, \"create-index-on-update\");\n            match index_service.create_index(index_config, false).await {\n                Err(IndexServiceError::Metastore(MetastoreError::AlreadyExists(_))) => {\n                    // If the index was created just after we tried to update it, try to 
update as\n                    // if nothing happened. But if it gets deleted again before we update it, just\n                    // error out\n                    let index_metadata_request =\n                        IndexMetadataRequest::for_index_id(target_index_id.to_string());\n                    metastore.index_metadata(index_metadata_request).await?\n                }\n                other => return other,\n            }\n        }\n        Err(e) => return Err(e.into()),\n    };\n    let current_index_metadata = current_index_metadata_ser.deserialize_index_metadata()?;\n    let index_uid = current_index_metadata.index_uid.clone();\n    let current_index_config = current_index_metadata.into_index_config();\n\n    let new_index_config = load_index_config_update(\n        config_format,\n        &index_config_bytes,\n        &node_config.default_index_root_uri,\n        &current_index_config,\n    )\n    .map_err(IndexServiceError::InvalidConfig)?;\n\n    let index_metadata = index_service\n        .update_index(index_uid, new_index_config)\n        .await?;\n    Ok(index_metadata)\n}\n\npub fn clear_index_handler(\n    index_service: IndexService,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"clear\")\n        .and(warp::put())\n        .and(with_arg(index_service))\n        .then(clear_index)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Indexes\",\n    path = \"/indexes/{index_id}/clear\",\n    responses(\n        (status = 200, description = \"Successfully cleared index.\")\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to clear.\"),\n    )\n)]\n/// Removes all the data (splits, queued documents) associated with the index, but keeps the index\n/// configuration. 
(See also, `delete-index`).\npub async fn clear_index(\n    index_id: IndexId,\n    mut index_service: IndexService,\n) -> Result<(), IndexServiceError> {\n    info!(index_id = %index_id, \"clear-index\");\n    index_service.clear_index(&index_id).await\n}\n\n#[derive(Deserialize, utoipa::IntoParams, utoipa::ToSchema)]\n#[into_params(parameter_in = Query)]\npub struct DeleteIndexQueryParam {\n    #[serde(default)]\n    dry_run: bool,\n}\n\npub fn delete_index_handler(\n    index_service: IndexService,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String)\n        .and(warp::delete())\n        .and(warp::query())\n        .and(with_arg(index_service))\n        .then(delete_index)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    delete,\n    tag = \"Indexes\",\n    path = \"/indexes/{index_id}\",\n    responses(\n        (status = 200, description = \"Successfully deleted index.\", body = [FileEntry])\n    ),\n    params(\n        DeleteIndexQueryParam,\n        (\"index_id\" = String, Path, description = \"The index ID to delete.\"),\n    )\n)]\n/// Deletes index.\npub async fn delete_index(\n    index_id: IndexId,\n    delete_index_query_param: DeleteIndexQueryParam,\n    mut index_service: IndexService,\n) -> Result<Vec<SplitInfo>, IndexServiceError> {\n    info!(index_id = %index_id, dry_run = delete_index_query_param.dry_run, \"delete-index\");\n    index_service\n        .delete_index(&index_id, delete_index_query_param.dry_run)\n        .await\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/index_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod index_resource;\nmod rest_handler;\nmod source_resource;\nmod split_resource;\n\npub use self::index_resource::get_index_metadata_handler;\npub use self::rest_handler::{IndexApi, index_management_handlers};\npub use self::split_resource::{ListSplitsQueryParams, ListSplitsResponse};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/index_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse quickwit_config::NodeConfig;\nuse quickwit_doc_mapper::{TokenizerConfig, analyze_text};\nuse quickwit_index_management::{IndexService, IndexServiceError};\nuse quickwit_query::query_ast::{QueryAst, query_ast_from_user_text};\nuse serde::Deserialize;\nuse serde::de::DeserializeOwned;\nuse tracing::warn;\nuse warp::{Filter, Rejection};\n\nuse super::get_index_metadata_handler;\nuse super::index_resource::{\n    __path_clear_index, __path_create_index, __path_delete_index, __path_describe_index,\n    __path_list_indexes_metadata, __path_update_index, IndexStats, clear_index_handler,\n    create_index_handler, delete_index_handler, describe_index_handler,\n    list_indexes_metadata_handler, update_index_handler,\n};\nuse super::source_resource::{\n    __path_create_source, __path_delete_source, __path_reset_source_checkpoint,\n    __path_toggle_source, __path_update_source, ToggleSource, create_source_handler,\n    delete_source_handler, get_source_handler, get_source_shards_handler,\n    reset_source_checkpoint_handler, toggle_source_handler, update_source_handler,\n};\nuse super::split_resource::{\n    __path_list_splits, __path_mark_splits_for_deletion, SplitsForDeletion, list_splits_handler,\n    mark_splits_for_deletion_handler,\n};\nuse crate::format::extract_format_from_qs;\nuse crate::rest::recover_fn;\nuse 
crate::rest_api_response::into_rest_api_response;\nuse crate::simple_list::from_simple_list;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(\n    paths(\n        create_index,\n        update_index,\n        clear_index,\n        delete_index,\n        list_indexes_metadata,\n        list_splits,\n        describe_index,\n        mark_splits_for_deletion,\n        create_source,\n        update_source,\n        reset_source_checkpoint,\n        toggle_source,\n        delete_source,\n    ),\n    components(schemas(ToggleSource, SplitsForDeletion, IndexStats))\n)]\npub struct IndexApi;\n\npub fn log_failure<T, E: std::fmt::Display>(\n    message: &'static str,\n) -> impl Fn(Result<T, E>) -> Result<T, E> + Clone {\n    move |result| {\n        if let Err(err) = &result {\n            warn!(\"{message}: {err}\");\n        };\n        result\n    }\n}\n\npub fn json_body<T: DeserializeOwned + Send>()\n-> impl Filter<Extract = (T,), Error = warp::Rejection> + Clone {\n    warp::body::content_length_limit(1024 * 1024).and(warp::body::json())\n}\n\npub fn index_management_handlers(\n    index_service: IndexService,\n    node_config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    // Indexes handlers.\n    get_index_metadata_handler(index_service.metastore())\n        .or(list_indexes_metadata_handler(index_service.metastore()))\n        .or(create_index_handler(\n            index_service.clone(),\n            node_config.clone(),\n        ))\n        .or(update_index_handler(index_service.clone(), node_config))\n        .or(clear_index_handler(index_service.clone()))\n        .or(delete_index_handler(index_service.clone()))\n        .boxed()\n        // Splits handlers\n        .or(list_splits_handler(index_service.metastore()))\n        .or(describe_index_handler(index_service.metastore()))\n        .or(mark_splits_for_deletion_handler(index_service.metastore()))\n        .boxed()\n        // Sources handlers.\n        
.or(reset_source_checkpoint_handler(index_service.metastore()))\n        .or(toggle_source_handler(index_service.metastore()))\n        .or(create_source_handler(index_service.clone()))\n        .or(update_source_handler(index_service.clone()))\n        .or(get_source_handler(index_service.metastore()))\n        .or(delete_source_handler(index_service.metastore()))\n        .or(get_source_shards_handler(index_service.metastore()))\n        .boxed()\n        // Tokenizer handlers.\n        .or(analyze_request_handler())\n        // Parse query into query AST handler.\n        .or(parse_query_request_handler())\n        .recover(recover_fn)\n        .boxed()\n}\n\n#[derive(Debug, Deserialize, utoipa::IntoParams, utoipa::ToSchema)]\nstruct AnalyzeRequest {\n    /// The tokenizer to use.\n    #[serde(flatten)]\n    pub tokenizer_config: TokenizerConfig,\n    /// The text to analyze.\n    pub text: String,\n}\n\nfn analyze_request_filter() -> impl Filter<Extract = (AnalyzeRequest,), Error = Rejection> + Clone {\n    warp::path!(\"analyze\")\n        .and(warp::post())\n        .and(warp::body::json())\n}\n\nfn analyze_request_handler() -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone\n{\n    analyze_request_filter()\n        .then(analyze_request)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n/// Analyzes text with the given tokenizer config and returns the list of tokens.\n#[utoipa::path(\n    post,\n    tag = \"analyze\",\n    path = \"/analyze\",\n    request_body = AnalyzeRequest,\n    responses(\n        (status = 200, description = \"Successfully analyzed text.\")\n    ),\n)]\nasync fn analyze_request(request: AnalyzeRequest) -> Result<serde_json::Value, IndexServiceError> {\n    let tokens = analyze_text(&request.text, &request.tokenizer_config)\n        .map_err(|err| IndexServiceError::Internal(format!(\"{err:?}\")))?;\n    let json_value = serde_json::to_value(tokens)\n        
.map_err(|err| IndexServiceError::Internal(format!(\"cannot serialize tokens: {err}\")))?;\n    Ok(json_value)\n}\n\n#[derive(Debug, Deserialize, utoipa::IntoParams, utoipa::ToSchema)]\nstruct ParseQueryRequest {\n    /// Query text. The query language is that of tantivy.\n    pub query: String,\n    /// Fields to search on.\n    #[param(rename = \"search_field\")]\n    #[serde(default)]\n    #[serde(rename(deserialize = \"search_field\"))]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    pub search_fields: Option<Vec<String>>,\n}\n\nfn parse_query_request_filter()\n-> impl Filter<Extract = (ParseQueryRequest,), Error = Rejection> + Clone {\n    warp::path!(\"parse-query\")\n        .and(warp::post())\n        .and(warp::body::json())\n}\n\nfn parse_query_request_handler()\n-> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    parse_query_request_filter()\n        .then(parse_query_request)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n/// Parses the query text into a query AST.\n#[utoipa::path(\n    post,\n    tag = \"parse_query\",\n    path = \"/parse-query\",\n    request_body = ParseQueryRequest,\n    responses(\n        (status = 200, description = \"Successfully parsed query into AST.\")\n    ),\n)]\nasync fn parse_query_request(request: ParseQueryRequest) -> Result<QueryAst, IndexServiceError> {\n    let query_ast = query_ast_from_user_text(&request.query, request.search_fields)\n        .parse_user_query(&[])\n        .map_err(|err| IndexServiceError::Internal(err.to_string()))?;\n    Ok(query_ast)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::ops::{Bound, RangeInclusive};\n\n    use assert_json_diff::assert_json_include;\n    use quickwit_common::ServiceStream;\n    use quickwit_common::uri::Uri;\n    use quickwit_config::{\n        CLI_SOURCE_ID, INGEST_API_SOURCE_ID, NodeConfig, SourceParams, VecSourceParams,\n    };\n    use 
quickwit_indexing::{MockSplitBuilder, mock_split};\n    use quickwit_metastore::{\n        IndexMetadata, IndexMetadataResponseExt, ListIndexesMetadataResponseExt,\n        ListSplitsRequestExt, ListSplitsResponseExt, SplitState, metastore_for_test,\n    };\n    use quickwit_proto::metastore::{\n        DeleteSourceRequest, EmptyResponse, EntityKind, IndexMetadataRequest,\n        IndexMetadataResponse, ListIndexesMetadataRequest, ListIndexesMetadataResponse,\n        ListSplitsRequest, ListSplitsResponse, MarkSplitsForDeletionRequest, MetastoreError,\n        MetastoreService, MetastoreServiceClient, MockMetastoreService,\n        ResetSourceCheckpointRequest, SourceType, ToggleSourceRequest,\n    };\n    use quickwit_proto::types::IndexUid;\n    use quickwit_storage::StorageResolver;\n    use serde_json::Value as JsonValue;\n\n    use super::*;\n    use crate::recover_fn;\n\n    #[tokio::test]\n    async fn test_get_index() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_index_metadata().return_once(|_| {\n            Ok(\n                IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                    \"test-index\",\n                    \"ram:///indexes/test-index\",\n                ))\n                .unwrap(),\n            )\n        });\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/test-index\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let actual_response_json: JsonValue = serde_json::from_slice(resp.body())?;\n        
let expected_response_json = serde_json::json!({\n            \"index_id\": \"test-index\",\n            \"index_uri\": \"ram:///indexes/test-index\",\n        });\n        assert_json_include!(\n            actual: actual_response_json.get(\"index_config\").unwrap(),\n            expected: expected_response_json\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_get_non_existing_index() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore, StorageResolver::unconfigured());\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/test-index\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 404);\n    }\n\n    #[tokio::test]\n    async fn test_get_splits() {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata =\n            IndexMetadata::for_test(\"quickwit-demo-index\", \"ram:///indexes/quickwit-demo-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_index_metadata()\n            .returning(move |_| {\n                Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n            })\n            .times(2);\n        mock_metastore\n            .expect_list_splits()\n            .returning(move |list_splits_request: ListSplitsRequest| {\n                let list_split_query = list_splits_request.deserialize_list_splits_query().unwrap();\n                if list_split_query.index_uids.unwrap().contains(&index_uid)\n                    && list_split_query.split_states\n                        == vec![SplitState::Published, SplitState::Staged]\n                    && list_split_query.time_range.start == Bound::Included(10)\n  
                  && list_split_query.time_range.end == Bound::Excluded(20)\n                    && list_split_query.create_timestamp.end == Bound::Excluded(2)\n                {\n                    let splits = vec![\n                        MockSplitBuilder::new(\"split_1\")\n                            .with_index_uid(&index_uid)\n                            .build(),\n                    ];\n                    let splits = ListSplitsResponse::try_from_splits(splits).unwrap();\n                    return Ok(ServiceStream::from(vec![Ok(splits)]));\n                }\n                Err(MetastoreError::Internal {\n                    message: \"\".to_string(),\n                    cause: \"\".to_string(),\n                })\n            })\n            .times(2);\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        {\n            let resp = warp::test::request()\n                .path(\n                    \"/indexes/quickwit-demo-index/splits?split_states=Published,Staged&\\\n                     start_timestamp=10&end_timestamp=20&end_create_timestamp=2\",\n                )\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n            let expected_response_json = serde_json::json!({\n                \"splits\": [\n                    {\n                        \"create_timestamp\": 0,\n                        \"split_id\": \"split_1\",\n                    }\n                ]\n            });\n            assert_json_include!(\n                actual: actual_response_json,\n                
expected: expected_response_json\n            );\n        }\n        {\n            let resp = warp::test::request()\n                .path(\n                    \"/indexes/quickwit-demo-index/splits?split_states=Published&\\\n                     start_timestamp=11&end_timestamp=20&end_create_timestamp=2\",\n                )\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 500);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_describe_index() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata =\n            IndexMetadata::for_test(\"quickwit-demo-index\", \"ram:///indexes/quickwit-demo-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_index_metadata()\n            .return_once(move |_| {\n                Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n            });\n        let split_1 = MockSplitBuilder::new(\"split_1\")\n            .with_index_uid(&index_uid)\n            .build();\n        let split_1_time_range = split_1.split_metadata.time_range.clone().unwrap();\n        let mut split_2 = MockSplitBuilder::new(\"split_2\")\n            .with_index_uid(&index_uid)\n            .build();\n        split_2.split_metadata.time_range = Some(RangeInclusive::new(\n            split_1_time_range.start() - 10,\n            split_1_time_range.end() + 10,\n        ));\n        mock_metastore\n            .expect_list_splits()\n            .withf(move |list_split_request| -> bool {\n                let list_split_query = list_split_request.deserialize_list_splits_query().unwrap();\n                list_split_query.index_uids.unwrap().contains(&index_uid)\n            })\n            .return_once(move |_| {\n                let splits = vec![split_1, split_2];\n                let splits = 
ListSplitsResponse::try_from_splits(splits).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits)]))\n            });\n\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/describe\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n\n        let actual_response_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"index_id\": \"quickwit-demo-index\",\n            \"index_uri\": \"ram:///indexes/quickwit-demo-index\",\n            \"num_published_splits\": 2,\n            \"size_published_splits\": 1600,\n            \"num_published_docs\": 20,\n            \"size_published_docs_uncompressed\": 512,\n            \"timestamp_field_name\": \"timestamp\",\n            \"min_timestamp\": split_1_time_range.start() - 10,\n            \"max_timestamp\": split_1_time_range.end() + 10,\n        });\n\n        assert_eq!(actual_response_json, expected_response_json);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_get_all_splits() {\n        let mut mock_metastore = MockMetastoreService::new();\n        let index_metadata =\n            IndexMetadata::for_test(\"quickwit-demo-index\", \"ram:///indexes/quickwit-demo-index\");\n        let index_uid = index_metadata.index_uid.clone();\n        mock_metastore\n            .expect_index_metadata()\n            .return_once(move |_| {\n                Ok(IndexMetadataResponse::try_from_index_metadata(&index_metadata).unwrap())\n            });\n        
mock_metastore.expect_list_splits().return_once(\n            move |list_split_request: ListSplitsRequest| {\n                let list_split_query = list_split_request.deserialize_list_splits_query().unwrap();\n                if list_split_query.index_uids.unwrap().contains(&index_uid)\n                    && list_split_query.split_states.is_empty()\n                    && list_split_query.time_range.is_unbounded()\n                    && list_split_query.create_timestamp.is_unbounded()\n                {\n                    return Ok(ServiceStream::empty());\n                }\n                Err(MetastoreError::Internal {\n                    message: \"\".to_string(),\n                    cause: \"\".to_string(),\n                })\n            },\n        );\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/splits\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n    }\n\n    #[tokio::test]\n    async fn test_mark_splits_for_deletion() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .returning(|_| {\n                Ok(\n                    IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                        \"quickwit-demo-index\",\n                        \"ram:///indexes/quickwit-demo-index\",\n                    ))\n                    .unwrap(),\n                )\n            })\n            .times(2);\n        mock_metastore\n            
.expect_mark_splits_for_deletion()\n            .returning(\n                |mark_splits_for_deletion_request: MarkSplitsForDeletionRequest| {\n                    let index_uid: IndexUid = mark_splits_for_deletion_request.index_uid().clone();\n                    let split_ids = mark_splits_for_deletion_request.split_ids;\n                    if index_uid.index_id == \"quickwit-demo-index\"\n                        && split_ids == [\"split-1\", \"split-2\"]\n                    {\n                        return Ok(EmptyResponse {});\n                    }\n                    Err(MetastoreError::Internal {\n                        message: \"\".to_string(),\n                        cause: \"\".to_string(),\n                    })\n                },\n            )\n            .times(2);\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/splits/mark-for-deletion\")\n            .method(\"PUT\")\n            .json(&true)\n            .body(r#\"{\"split_ids\": [\"split-1\", \"split-2\"]}\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/splits/mark-for-deletion\")\n            .json(&true)\n            .body(r#\"{\"split_ids\": [\"\"]}\"#)\n            .method(\"PUT\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 500);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_get_list_indexes() -> anyhow::Result<()> {\n        let mut mock_metastore = 
MockMetastoreService::new();\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|list_indexes_request| {\n                assert_eq!(\n                    list_indexes_request.index_id_patterns,\n                    vec![\"test-index-*\".to_string()]\n                );\n                let index_metadata =\n                    IndexMetadata::for_test(\"test-index\", \"ram:///indexes/test-index\");\n                Ok(ListIndexesMetadataResponse::for_test(vec![index_metadata]))\n            });\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes?index_id_patterns=test-index-*\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let actual_response_json: JsonValue = serde_json::from_slice(resp.body())?;\n        let actual_response_arr: &Vec<JsonValue> = actual_response_json.as_array().unwrap();\n        assert_eq!(actual_response_arr.len(), 1);\n        let actual_index_metadata_json: &JsonValue = &actual_response_arr[0];\n        let expected_response_json = serde_json::json!({\n            \"index_id\": \"test-index\",\n            \"index_uri\": \"ram:///indexes/test-index\",\n        });\n        assert_json_include!(\n            actual: actual_index_metadata_json.get(\"index_config\").unwrap(),\n            expected: expected_response_json\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_clear_index() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_index_metadata().return_once(|_| {\n            
Ok(\n                IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                    \"quickwit-demo-index\",\n                    \"file:///path/to/index/quickwit-demo-index\",\n                ))\n                .unwrap(),\n            )\n        });\n        mock_metastore.expect_list_splits().return_once(|_| {\n            let splits = ListSplitsResponse::try_from_splits(vec![mock_split(\"split_1\")]).unwrap();\n            Ok(ServiceStream::from(vec![Ok(splits)]))\n        });\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .return_once(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_delete_splits()\n            .return_once(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_reset_source_checkpoint()\n            .return_once(|_| Ok(EmptyResponse {}));\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/clear\")\n            .method(\"PUT\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_delete_index() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .returning(|_| {\n                Ok(\n                    IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                        \"quickwit-demo-index\",\n                        \"file:///path/to/index/quickwit-demo-index\",\n                    ))\n                    .unwrap(),\n     
           )\n            })\n            .times(2);\n        mock_metastore\n            .expect_list_splits()\n            .returning(|_| {\n                let splits =\n                    ListSplitsResponse::try_from_splits(vec![mock_split(\"split_1\")]).unwrap();\n                Ok(ServiceStream::from(vec![Ok(splits)]))\n            })\n            .times(3);\n        mock_metastore\n            .expect_mark_splits_for_deletion()\n            .return_once(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_delete_splits()\n            .return_once(|_| Ok(EmptyResponse {}));\n        mock_metastore\n            .expect_delete_index()\n            .return_once(|_| Ok(EmptyResponse {}));\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        {\n            // Dry run\n            let resp = warp::test::request()\n                .path(\"/indexes/quickwit-demo-index?dry_run=true\")\n                .method(\"DELETE\")\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n            let expected_response_json = serde_json::json!([{\n                \"file_name\": \"split_1.split\",\n                \"file_size_bytes\": \"800 B\",\n            }]);\n            assert_json_include!(actual: resp_json, expected: expected_response_json);\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes/quickwit-demo-index\")\n                .method(\"DELETE\")\n                .reply(&index_management_handler)\n                .await;\n            
assert_eq!(resp.status(), 200);\n            let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n            let expected_response_json = serde_json::json!([{\n                \"file_name\": \"split_1.split\",\n                \"file_size_bytes\": \"800 B\",\n            }]);\n            assert_json_include!(actual: resp_json, expected: expected_response_json);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_delete_on_non_existing_index() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore, StorageResolver::unconfigured());\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index\")\n            .method(\"DELETE\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 404);\n    }\n\n    #[tokio::test]\n    async fn test_create_index_with_overwrite() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore.clone(), StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config));\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes?overwrite=true\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n    
        assert_eq!(resp.status(), 200);\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes?overwrite=true\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 400);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_create_delete_index_and_source() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore.clone(), StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config));\n        let resp = warp::test::request()\n            .path(\"/indexes\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]}}\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 
200);\n        let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"index_config\": {\n                \"index_id\": \"hdfs-logs\",\n                \"index_uri\": \"file:///default-index-root-uri/hdfs-logs\",\n            }\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n\n        // Create source.\n        let source_config_body = r#\"{\"version\": \"0.7\", \"source_id\": \"vec-source\", \"source_type\": \"vec\", \"params\": {\"docs\": [], \"batch_num_docs\": 10}}\"#;\n        let resp = warp::test::request()\n            .path(\"/indexes/hdfs-logs/sources\")\n            .method(\"POST\")\n            .json(&true)\n            .body(source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n\n        // Get source.\n        let resp = warp::test::request()\n            .path(\"/indexes/hdfs-logs/sources/vec-source\")\n            .method(\"GET\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n\n        // Check that the source has been added to index metadata.\n        let index_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\"hdfs-logs\".to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert!(index_metadata.sources.contains_key(\"vec-source\"));\n        let source_config = index_metadata.sources.get(\"vec-source\").unwrap();\n        assert_eq!(source_config.source_type(), SourceType::Vec);\n        assert_eq!(\n            source_config.source_params,\n            SourceParams::Vec(VecSourceParams {\n                docs: Vec::new(),\n                batch_num_docs: 10,\n                partition: \"\".to_string(),\n            })\n        );\n\n        // 
Check delete source.\n        let resp = warp::test::request()\n            .path(\"/indexes/hdfs-logs/sources/vec-source\")\n            .method(\"DELETE\")\n            .body(source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let index_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\"hdfs-logs\".to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert!(!index_metadata.sources.contains_key(\"vec-source\"));\n\n        // Check that a source managed by Quickwit cannot be deleted.\n        let resp = warp::test::request()\n            .path(format!(\"/indexes/hdfs-logs/sources/{INGEST_API_SOURCE_ID}\").as_str())\n            .method(\"DELETE\")\n            .body(source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 403);\n\n        let resp = warp::test::request()\n            .path(format!(\"/indexes/hdfs-logs/sources/{CLI_SOURCE_ID}\").as_str())\n            .method(\"DELETE\")\n            .body(source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 403);\n\n        // Check that getting a non-existing source returns 404.\n        let resp = warp::test::request()\n            .path(\"/indexes/hdfs-logs/sources/file-source\")\n            .method(\"GET\")\n            .body(source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 404);\n\n        // Check delete index.\n        let resp = warp::test::request()\n            .path(\"/indexes/hdfs-logs\")\n            .method(\"DELETE\")\n            .body(source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let indexes = metastore\n            
.list_indexes_metadata(ListIndexesMetadataRequest::all())\n            .await\n            .unwrap()\n            .deserialize_indexes_metadata()\n            .await\n            .unwrap();\n        assert!(indexes.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_create_index_with_yaml() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore.clone(), StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes\")\n            .method(\"POST\")\n            .header(\"content-type\", \"application/yaml\")\n            .body(\n                r#\"\n            version: 0.8\n            index_id: hdfs-logs\n            doc_mapping:\n              field_mappings:\n                - name: timestamp\n                  type: i64\n                  fast: true\n                  indexed: true\n            \"#,\n            )\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"index_config\": {\n                \"index_id\": \"hdfs-logs\",\n                \"index_uri\": \"file:///default-index-root-uri/hdfs-logs\",\n            }\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n    }\n\n    #[tokio::test]\n    async fn test_create_index_and_source_with_toml() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore.clone(), 
StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes\")\n            .method(\"POST\")\n            .header(\"content-type\", \"application/toml\")\n            .body(\n                r#\"\n            version = \"0.7\"\n            index_id = \"hdfs-logs\"\n            [doc_mapping]\n            field_mappings = [\n                { name = \"timestamp\", type = \"i64\", fast = true, indexed = true}\n            ]\n            \"#,\n            )\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"index_config\": {\n                \"index_id\": \"hdfs-logs\",\n                \"index_uri\": \"file:///default-index-root-uri/hdfs-logs\",\n            }\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n    }\n\n    #[tokio::test]\n    async fn test_create_index_with_wrong_content_type() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore.clone(), StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes\")\n            
.method(\"POST\")\n            .header(\"content-type\", \"application/yoml\")\n            .body(r#\"\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 415);\n        let body = std::str::from_utf8(resp.body()).unwrap();\n        assert!(body.contains(\"content-type is not supported\"));\n    }\n\n    #[tokio::test]\n    async fn test_create_index_with_bad_config() -> anyhow::Result<()> {\n        let index_service = IndexService::new(\n            MetastoreServiceClient::mocked(),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes\")\n            .method(\"POST\")\n            .json(&true)\n            .body(\n                r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-log\", \"doc_mapping\":\n    {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"unknown\", \"fast\": true, \"indexed\":\n    true}]}}\"#,\n            )\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 400);\n        let body = std::str::from_utf8(resp.body()).unwrap();\n        assert!(body.contains(\"field `timestamp` has an unknown type\"));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_update_index() {\n        let metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore.clone(), StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config));\n        {\n            let resp = warp::test::request()\n                
.path(\"/indexes\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]},\"search_settings\":{\"default_search_fields\":[\"body\"]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n            let expected_response_json = serde_json::json!({\n                \"index_config\": {\n                    \"search_settings\": {\n                        \"default_search_fields\": [\"body\"]\n                    }\n                }\n            });\n            assert_json_include!(actual: resp_json, expected: expected_response_json);\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes/hdfs-logs\")\n                .method(\"PUT\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]},\"search_settings\":{\"default_search_fields\":[\"severity_text\", \"body\"]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n            let expected_response_json = serde_json::json!({\n                \"index_config\": {\n                    \"search_settings\": {\n                        \"default_search_fields\": [\"severity_text\", \"body\"]\n                    }\n                }\n            });\n            assert_json_include!(actual: resp_json, expected: expected_response_json);\n        }\n        // check that 
the metastore was updated\n        let index_metadata = metastore\n            .index_metadata(IndexMetadataRequest::for_index_id(\"hdfs-logs\".to_string()))\n            .await\n            .unwrap()\n            .deserialize_index_metadata()\n            .unwrap();\n        assert_eq!(\n            index_metadata\n                .index_config\n                .search_settings\n                .default_search_fields,\n            [\"severity_text\", \"body\"]\n        );\n        // test with index_uri at the root of a bucket\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs2\", \"index_uri\": \"s3://my-bucket\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]},\"search_settings\":{\"default_search_fields\":[\"body\"]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n            let body = std::str::from_utf8(resp.body()).unwrap();\n            assert_eq!(resp.status(), 200, \"{body}\",);\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes/hdfs-logs2\")\n                .method(\"PUT\")\n                .json(&true)\n                .body(r#\"{\"version\": \"0.7\", \"index_id\": \"hdfs-logs2\", \"index_uri\": \"s3://my-bucket\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]},\"search_settings\":{\"default_search_fields\":[\"severity_text\", \"body\"]}}\"#)\n                .reply(&index_management_handler)\n                .await;\n            let body = std::str::from_utf8(resp.body()).unwrap();\n            assert_eq!(resp.status(), 200, \"{body}\",);\n        }\n    }\n\n    #[tokio::test]\n    async fn test_create_source_with_bad_config() {\n        let 
metastore = metastore_for_test();\n        let index_service = IndexService::new(metastore, StorageResolver::unconfigured());\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        {\n            // Source config with bad version.\n            let resp = warp::test::request()\n                .path(\"/indexes/my-index/sources\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"version\": 0.4, \"source_id\": \"file-source\"}\"#)\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 400);\n            let body = std::str::from_utf8(resp.body()).unwrap();\n            assert!(body.contains(\"invalid type: floating point `0.4`\"));\n        }\n        {\n            // Invalid pulsar source config with number of pipelines > 1, not supported yet.\n            let resp = warp::test::request()\n                .path(\"/indexes/my-index/sources\")\n                .method(\"POST\")\n                .json(&true)\n                .body(\n                    r#\"{\"version\": \"0.8\", \"source_id\": \"pulsar-source\",\n    \"num_pipelines\": 2, \"source_type\": \"pulsar\", \"params\": {\"topics\": [\"my-topic\"],\n    \"address\": \"pulsar://localhost:6650\" }}\"#,\n                )\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 400);\n            let body = std::str::from_utf8(resp.body()).unwrap();\n            assert!(body.contains(\n                \"Quickwit currently supports multiple pipelines only for GCP PubSub or Kafka \\\n                 sources\"\n            ));\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes/hdfs-logs/sources\")\n                .method(\"POST\")\n                .body(\n                    
r#\"{\"version\": \"0.8\", \"source_id\": \"my-stdin-source\", \"source_type\": \"stdin\"}\"#,\n                )\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 400);\n            let response_body = std::str::from_utf8(resp.body()).unwrap();\n            assert!(\n                response_body.contains(\"stdin can only be used as source through the CLI command\")\n            )\n        }\n        {\n            let resp = warp::test::request()\n                .path(\"/indexes/hdfs-logs/sources\")\n                .method(\"POST\")\n                .body(\n                    r#\"{\"version\": \"0.8\", \"source_id\": \"my-local-file-source\", \"source_type\": \"file\", \"params\": {\"filepath\": \"localfile\"}}\"#,\n                )\n                .reply(&index_management_handler)\n                .await;\n            assert_eq!(resp.status(), 400);\n            let response_body = std::str::from_utf8(resp.body()).unwrap();\n            assert!(response_body.contains(\"limited to a local usage\"))\n        }\n    }\n\n    #[cfg(feature = \"sqs-for-tests\")]\n    #[tokio::test]\n    async fn test_update_source() {\n        use quickwit_indexing::source::sqs_queue::test_helpers::start_mock_sqs_get_queue_attributes_endpoint;\n\n        let metastore = metastore_for_test();\n        let (queue_url, _guard) = start_mock_sqs_get_queue_attributes_endpoint().await;\n        let index_service = IndexService::new(metastore.clone(), StorageResolver::unconfigured());\n        let mut node_config = NodeConfig::for_test();\n        node_config.default_index_root_uri = Uri::for_test(\"file:///default-index-root-uri\");\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(node_config));\n        let resp = warp::test::request()\n            .path(\"/indexes\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"version\": 
\"0.7\", \"index_id\": \"hdfs-logs\", \"doc_mapping\": {\"field_mappings\":[{\"name\": \"timestamp\", \"type\": \"i64\", \"fast\": true, \"indexed\": true}]}}\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp_json: serde_json::Value = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"index_config\": {\n                \"index_id\": \"hdfs-logs\",\n                \"index_uri\": \"file:///default-index-root-uri/hdfs-logs\",\n            }\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n\n        // Create source.\n        let source_config_body = serde_json::json!({\n            \"version\": \"0.7\",\n            \"source_id\": \"sqs-source\",\n            \"source_type\": \"file\",\n            \"params\": {\"notifications\": [{\"type\": \"sqs\", \"queue_url\": queue_url, \"message_type\": \"s3_notification\"}]},\n        });\n        let resp = warp::test::request()\n            .path(\"/indexes/hdfs-logs/sources\")\n            .method(\"POST\")\n            .json(&source_config_body)\n            .reply(&index_management_handler)\n            .await;\n        let resp_body = std::str::from_utf8(resp.body()).unwrap();\n        assert_eq!(resp.status(), 200, \"{resp_body}\");\n\n        {\n            // Update the source.\n            let update_source_config_body = serde_json::json!({\n                \"version\": \"0.7\",\n                \"source_id\": \"sqs-source\",\n                \"source_type\": \"file\",\n                \"params\": {\"notifications\": [{\"type\": \"sqs\", \"queue_url\": queue_url, \"message_type\": \"s3_notification\"}]},\n            });\n            let resp = warp::test::request()\n                .path(\"/indexes/hdfs-logs/sources/sqs-source\")\n                .method(\"PUT\")\n                .json(&update_source_config_body)\n       
         .reply(&index_management_handler)\n                .await;\n            let resp_body = std::str::from_utf8(resp.body()).unwrap();\n            assert_eq!(resp.status(), 200, \"{resp_body}\");\n            // Check that the source has been updated.\n            let index_metadata = metastore\n                .index_metadata(IndexMetadataRequest::for_index_id(\"hdfs-logs\".to_string()))\n                .await\n                .unwrap()\n                .deserialize_index_metadata()\n                .unwrap();\n            let metastore_source_config = index_metadata.sources.get(\"sqs-source\").unwrap();\n            assert_eq!(metastore_source_config.source_type(), SourceType::File);\n            assert_eq!(\n                metastore_source_config,\n                &serde_json::from_value(update_source_config_body).unwrap(),\n            );\n        }\n        {\n            // Update the source with a different source_id (forbidden)\n            let update_source_config_body = serde_json::json!({\n                \"version\": \"0.7\",\n                \"source_id\": \"new-source-id\",\n                \"source_type\": \"file\",\n                \"params\": {\"notifications\": [{\"type\": \"sqs\", \"queue_url\": queue_url, \"message_type\": \"s3_notification\"}]},\n            });\n            let resp = warp::test::request()\n                .path(\"/indexes/hdfs-logs/sources/sqs-source\")\n                .method(\"PUT\")\n                .json(&update_source_config_body)\n                .reply(&index_management_handler)\n                .await;\n            let resp_body = std::str::from_utf8(resp.body()).unwrap();\n            assert_eq!(resp.status(), 400, \"{resp_body}\");\n            // Check that the source hasn't been updated.\n            let index_metadata = metastore\n                .index_metadata(IndexMetadataRequest::for_index_id(\"hdfs-logs\".to_string()))\n                .await\n                .unwrap()\n                
.deserialize_index_metadata()\n                .unwrap();\n            assert!(index_metadata.sources.contains_key(\"sqs-source\"));\n            assert!(!index_metadata.sources.contains_key(\"new-source-id\"));\n        }\n    }\n\n    #[tokio::test]\n    async fn test_delete_non_existing_source() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_index_metadata().return_once(|_| {\n            Ok(\n                IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                    \"quickwit-demo-index\",\n                    \"file:///path/to/index/quickwit-demo-index\",\n                ))\n                .unwrap(),\n            )\n        });\n        // TODO\n        // metastore\n        //     .expect_index_exists()\n        //     .return_once(|index_id: &str| Ok(index_id == \"quickwit-demo-index\"));\n        mock_metastore.expect_delete_source().return_once(\n            |delete_source_request: DeleteSourceRequest| {\n                let index_uid: IndexUid = delete_source_request.index_uid().clone();\n                let source_id = delete_source_request.source_id;\n                assert_eq!(index_uid.index_id, \"quickwit-demo-index\");\n                Err(MetastoreError::NotFound(EntityKind::Source {\n                    index_id: \"quickwit-demo-index\".to_string(),\n                    source_id: source_id.to_string(),\n                }))\n            },\n        );\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/sources/foo-source\")\n            .method(\"DELETE\")\n            
.reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 404);\n    }\n\n    #[tokio::test]\n    async fn test_source_reset_checkpoint() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .returning(|_| {\n                Ok(\n                    IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                        \"quickwit-demo-index\",\n                        \"file:///path/to/index/quickwit-demo-index\",\n                    ))\n                    .unwrap(),\n                )\n            })\n            .times(2);\n        mock_metastore\n            .expect_reset_source_checkpoint()\n            .returning(\n                |reset_source_checkpoint_request: ResetSourceCheckpointRequest| {\n                    let index_uid: IndexUid = reset_source_checkpoint_request.index_uid().clone();\n                    let source_id = reset_source_checkpoint_request.source_id;\n                    if index_uid.index_id == \"quickwit-demo-index\" && source_id == \"source-to-reset\"\n                    {\n                        return Ok(EmptyResponse {});\n                    }\n                    Err(MetastoreError::Internal {\n                        message: \"\".to_string(),\n                        cause: \"\".to_string(),\n                    })\n                },\n            )\n            .times(2);\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/sources/source-to-reset/reset-checkpoint\")\n            
.method(\"PUT\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/sources/source-to-reset-2/reset-checkpoint\")\n            .method(\"PUT\")\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 500);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_source_toggle() -> anyhow::Result<()> {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_index_metadata()\n            .returning(|_| {\n                Ok(\n                    IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                        \"quickwit-demo-index\",\n                        \"file:///path/to/index/quickwit-demo-index\",\n                    ))\n                    .unwrap(),\n                )\n            })\n            .times(3);\n        mock_metastore.expect_toggle_source().return_once(\n            |toggle_source_request: ToggleSourceRequest| {\n                let index_uid: IndexUid = toggle_source_request.index_uid().clone();\n                let source_id = toggle_source_request.source_id;\n                let enable = toggle_source_request.enable;\n                if index_uid.index_id == \"quickwit-demo-index\"\n                    && source_id == \"source-to-toggle\"\n                    && enable\n                {\n                    return Ok(EmptyResponse {});\n                }\n                Err(MetastoreError::Internal {\n                    message: \"\".to_string(),\n                    cause: \"\".to_string(),\n                })\n            },\n        );\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n          
  super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/sources/source-to-toggle/toggle\")\n            .method(\"PUT\")\n            .json(&true)\n            .body(r#\"{\"enable\": true}\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp = warp::test::request()\n            .path(\"/indexes/quickwit-demo-index/sources/source-to-toggle/toggle\")\n            .method(\"PUT\")\n            .json(&true)\n            .body(r#\"{\"toggle\": true}\"#) // unknown field, should return 400.\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 400);\n        // Check cannot toggle source managed by Quickwit.\n        let resp = warp::test::request()\n            .path(format!(\"/indexes/hdfs-logs/sources/{INGEST_API_SOURCE_ID}/toggle\").as_str())\n            .method(\"PUT\")\n            .body(r#\"{\"enable\": true}\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 403);\n\n        let resp = warp::test::request()\n            .path(format!(\"/indexes/hdfs-logs/sources/{CLI_SOURCE_ID}/toggle\").as_str())\n            .method(\"PUT\")\n            .body(r#\"{\"enable\": true}\"#)\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 403);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_analyze_request() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore.expect_index_metadata().return_once(|_| {\n            Ok(\n                IndexMetadataResponse::try_from_index_metadata(&IndexMetadata::for_test(\n                    \"test-index\",\n                    \"ram:///indexes/test-index\",\n                ))\n                
.unwrap(),\n            )\n        });\n        let index_service = IndexService::new(\n            MetastoreServiceClient::from_mock(mock_metastore),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/analyze\")\n            .method(\"POST\")\n            .json(&true)\n            .body(\n                r#\"{\"type\": \"ngram\", \"min_gram\": 3, \"max_gram\": 3, \"text\": \"Hel\", \"filters\":\n    [\"lower_caser\"]}\"#,\n            )\n            .reply(&index_management_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let actual_response_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!([\n            {\n                \"offset_from\": 0,\n                \"offset_to\": 3,\n                \"position\": 0,\n                \"position_length\": 1,\n                \"text\": \"hel\"\n            }\n        ]);\n        assert_json_include!(\n            actual: actual_response_json,\n            expected: expected_response_json\n        );\n    }\n\n    #[tokio::test]\n    async fn test_parse_query_request() {\n        let index_service = IndexService::new(\n            MetastoreServiceClient::mocked(),\n            StorageResolver::unconfigured(),\n        );\n        let index_management_handler =\n            super::index_management_handlers(index_service, Arc::new(NodeConfig::for_test()))\n                .recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/parse-query\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"query\": \"field:this AND field:that\"}\"#)\n            .reply(&index_management_handler)\n            .await;\n        
assert_eq!(resp.status(), 200);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/index_api/source_resource.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytes::Bytes;\nuse quickwit_config::{\n    CLI_SOURCE_ID, ConfigFormat, FileSourceParams, INGEST_API_SOURCE_ID, SourceConfig,\n    SourceParams, load_source_config_from_user_config, load_source_config_update,\n};\nuse quickwit_index_management::{IndexService, IndexServiceError};\nuse quickwit_metastore::IndexMetadataResponseExt;\nuse quickwit_proto::ingest::Shard;\nuse quickwit_proto::metastore::{\n    DeleteSourceRequest, EntityKind, IndexMetadataRequest, ListShardsRequest, ListShardsSubrequest,\n    MetastoreError, MetastoreResult, MetastoreService, MetastoreServiceClient,\n    ResetSourceCheckpointRequest, ToggleSourceRequest,\n};\nuse quickwit_proto::types::{IndexId, IndexUid, SourceId};\nuse serde::Deserialize;\nuse tracing::info;\nuse warp::{Filter, Rejection};\n\nuse super::rest_handler::{json_body, log_failure};\nuse crate::format::{extract_config_format, extract_format_from_qs};\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::with_arg;\n\npub fn create_source_handler(\n    index_service: IndexService,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\")\n        .and(warp::post())\n        .and(extract_config_format())\n        .and(warp::body::content_length_limit(1024 * 1024))\n        .and(warp::filters::body::bytes())\n        
.and(with_arg(index_service))\n        .then(create_source)\n        .map(log_failure(\"failed to create source\"))\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[allow(clippy::result_large_err)]\nfn check_source_type(source_params: &SourceParams) -> Result<(), IndexServiceError> {\n    // Note: This check is performed here instead of the source config serde\n    // because many tests use the file source, and can't store that config in\n    // the metastore without going through the validation.\n    if let SourceParams::File(FileSourceParams::Filepath(_)) = source_params {\n        return Err(IndexServiceError::InvalidConfig(anyhow::anyhow!(\n            \"path based file sources are limited to a local usage, please use the CLI command \\\n             `quickwit tool local-ingest` to ingest data from a specific file or setup a \\\n             notification based file source\"\n        )));\n    }\n    Ok(())\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Sources\",\n    path = \"/indexes/{index_id}/sources\",\n    request_body = VersionedSourceConfig,\n    responses(\n        // We return `VersionedSourceConfig` as it's the serialized model view.\n        (status = 200, description = \"Successfully created source.\", body = VersionedSourceConfig)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to create a source for.\"),\n    )\n)]\n/// Creates Source.\npub async fn create_source(\n    index_id: IndexId,\n    config_format: ConfigFormat,\n    source_config_bytes: Bytes,\n    mut index_service: IndexService,\n) -> Result<SourceConfig, IndexServiceError> {\n    let source_config: SourceConfig =\n        load_source_config_from_user_config(config_format, &source_config_bytes)\n            .map_err(IndexServiceError::InvalidConfig)?;\n    check_source_type(&source_config.source_params)?;\n    let index_metadata_request = 
IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = index_service\n        .metastore()\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    info!(index_id = %index_id, source_id = %source_config.source_id, \"create-source\");\n    index_service.add_source(index_uid, source_config).await\n}\n\n/// Query parameters for update source requests.\n#[derive(Deserialize, Debug, Eq, PartialEq, utoipa::IntoParams)]\n#[into_params(parameter_in = Query)]\npub struct UpdateQueryParams {\n    /// Create the source if it doesn't exist yet.\n    #[serde(default)]\n    pub create: bool,\n}\n\nfn update_source_qp() -> impl Filter<Extract = (UpdateQueryParams,), Error = Rejection> + Clone {\n    warp::query::<UpdateQueryParams>()\n}\n\npub fn update_source_handler(\n    index_service: IndexService,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\" / String)\n        .and(warp::put())\n        .and(extract_config_format())\n        .and(update_source_qp())\n        .and(warp::body::content_length_limit(1024 * 1024))\n        .and(warp::filters::body::bytes())\n        .and(with_arg(index_service))\n        .then(update_source)\n        .map(log_failure(\"failed to update source\"))\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Sources\",\n    path = \"/indexes/{index_id}/sources/{source_id}\",\n    request_body = VersionedSourceConfig,\n    responses(\n        // We return `VersionedSourceConfig` as it's the serialized model view.\n        (status = 200, description = \"Successfully updated source.\", body = VersionedSourceConfig)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID of the source to update.\"),\n        (\"source_id\" = String, Path, description = 
\"The source ID to update.\"),\n        UpdateQueryParams,\n    )\n)]\n/// Updates Source.\npub async fn update_source(\n    index_id: IndexId,\n    source_id: SourceId,\n    config_format: ConfigFormat,\n    query_params: UpdateQueryParams,\n    source_config_bytes: Bytes,\n    mut index_service: IndexService,\n) -> Result<SourceConfig, IndexServiceError> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let mut current_index_metadata = index_service\n        .metastore()\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?;\n    let current_source_config = match current_index_metadata.sources.remove(&source_id) {\n        Some(source_config) => source_config,\n        None if query_params.create => {\n            let source_config: SourceConfig =\n                load_source_config_from_user_config(config_format, &source_config_bytes)\n                    .map_err(IndexServiceError::InvalidConfig)?;\n            if source_config.source_id != source_id {\n                return Err(IndexServiceError::InvalidConfig(anyhow::anyhow!(\n                    \"`source_id` in config file does not match source_id from query path\"\n                )));\n            }\n            check_source_type(&source_config.source_params)?;\n            info!(index_id = %index_id, source_id = %source_config.source_id, \"create-source-on-update\");\n            // TODO handle already exists?\n            return index_service\n                .add_source(current_index_metadata.index_uid, source_config)\n                .await;\n        }\n        None => {\n            return Err(MetastoreError::NotFound(EntityKind::Source {\n                index_id: index_id.to_string(),\n                source_id,\n            })\n            .into());\n        }\n    };\n\n    let new_source_config: SourceConfig =\n        load_source_config_update(config_format, &source_config_bytes, 
&current_source_config)\n            .map_err(IndexServiceError::InvalidConfig)?;\n\n    info!(index_id = %index_id, source_id = %new_source_config.source_id, \"update-source\");\n    index_service\n        .update_source(current_index_metadata.index_uid, new_source_config)\n        .await\n}\n\npub fn get_source_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\" / String)\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(get_source)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\npub async fn get_source(\n    index_id: IndexId,\n    source_id: SourceId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<SourceConfig> {\n    info!(index_id = %index_id, source_id = %source_id, \"get-source\");\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let source_config = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .sources\n        .remove(&source_id)\n        .ok_or({\n            MetastoreError::NotFound(EntityKind::Source {\n                index_id,\n                source_id,\n            })\n        })?;\n    Ok(source_config)\n}\n\npub fn reset_source_checkpoint_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\" / String / \"reset-checkpoint\")\n        .and(warp::put())\n        .and(with_arg(metastore))\n        .then(reset_source_checkpoint)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Sources\",\n    path = \"/indexes/{index_id}/sources/{source_id}/reset-checkpoint\",\n    responses(\n        (status = 
200, description = \"Successfully reset source checkpoint.\")\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID of the source.\"),\n        (\"source_id\" = String, Path, description = \"The source ID whose checkpoint is reset.\"),\n    )\n)]\n/// Resets source checkpoint.\npub async fn reset_source_checkpoint(\n    index_id: IndexId,\n    source_id: SourceId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<()> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    info!(index_id = %index_id, source_id = %source_id, \"reset-checkpoint\");\n    let reset_source_checkpoint_request = ResetSourceCheckpointRequest {\n        index_uid: Some(index_uid),\n        source_id: source_id.clone(),\n    };\n    metastore\n        .reset_source_checkpoint(reset_source_checkpoint_request)\n        .await?;\n    Ok(())\n}\n\npub fn toggle_source_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\" / String / \"toggle\")\n        .and(warp::put())\n        .and(json_body())\n        .and(with_arg(metastore))\n        .then(toggle_source)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[derive(Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct ToggleSource {\n    enable: bool,\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Sources\",\n    path = \"/indexes/{index_id}/sources/{source_id}/toggle\",\n    request_body = ToggleSource,\n    responses(\n        (status = 200, description = \"Successfully toggled source.\")\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID of the source.\"),\n 
       (\"source_id\" = String, Path, description = \"The source ID to toggle.\"),\n    )\n)]\n/// Toggles source.\npub async fn toggle_source(\n    index_id: IndexId,\n    source_id: SourceId,\n    toggle_source: ToggleSource,\n    metastore: MetastoreServiceClient,\n) -> Result<(), IndexServiceError> {\n    info!(index_id = %index_id, source_id = %source_id, enable = toggle_source.enable, \"toggle-source\");\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    if [CLI_SOURCE_ID, INGEST_API_SOURCE_ID].contains(&source_id.as_str()) {\n        return Err(IndexServiceError::OperationNotAllowed(format!(\n            \"source `{source_id}` is managed by Quickwit, you cannot enable or disable a source \\\n             managed by Quickwit\"\n        )));\n    }\n    let toggle_source_request = ToggleSourceRequest {\n        index_uid: Some(index_uid),\n        source_id: source_id.clone(),\n        enable: toggle_source.enable,\n    };\n    metastore.toggle_source(toggle_source_request).await?;\n    Ok(())\n}\n\npub fn delete_source_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\" / String)\n        .and(warp::delete())\n        .and(with_arg(metastore))\n        .then(delete_source)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[utoipa::path(\n    delete,\n    tag = \"Sources\",\n    path = \"/indexes/{index_id}/sources/{source_id}\",\n    responses(\n        (status = 200, description = \"Successfully deleted source.\")\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to remove the source from.\"),\n        (\"source_id\" = String, Path, 
description = \"The source ID to remove from the index.\"),\n    )\n)]\n/// Deletes source.\npub async fn delete_source(\n    index_id: IndexId,\n    source_id: SourceId,\n    metastore: MetastoreServiceClient,\n) -> Result<(), IndexServiceError> {\n    info!(index_id = %index_id, source_id = %source_id, \"delete-source\");\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    if [INGEST_API_SOURCE_ID, CLI_SOURCE_ID].contains(&source_id.as_str()) {\n        return Err(IndexServiceError::OperationNotAllowed(format!(\n            \"source `{source_id}` is managed by Quickwit, you cannot delete a source managed by \\\n             Quickwit\"\n        )));\n    }\n    let delete_source_request = DeleteSourceRequest {\n        index_uid: Some(index_uid),\n        source_id: source_id.clone(),\n    };\n    metastore.delete_source(delete_source_request).await?;\n    Ok(())\n}\n\npub fn get_source_shards_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"sources\" / String / \"shards\")\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(get_source_shards)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\npub async fn get_source_shards(\n    index_id: IndexId,\n    source_id: SourceId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<Vec<Shard>> {\n    info!(index_id = %index_id, source_id = %source_id, \"get-source-shards\");\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        
.deserialize_index_metadata()?\n        .index_uid;\n    let response = metastore\n        .list_shards(ListShardsRequest {\n            subrequests: vec![ListShardsSubrequest {\n                index_uid: Some(index_uid),\n                source_id: source_id.to_string(),\n                ..Default::default()\n            }],\n        })\n        .await?;\n    let shards = response\n        .subresponses\n        .into_iter()\n        .flat_map(|resp| resp.shards)\n        .collect();\n    Ok(shards)\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/index_api/split_resource.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_metastore::{\n    IndexMetadataResponseExt, ListSplitsQuery, ListSplitsRequestExt,\n    MetastoreServiceStreamSplitsExt, Split, SplitState,\n};\nuse quickwit_proto::metastore::{\n    IndexMetadataRequest, ListSplitsRequest, MarkSplitsForDeletionRequest, MetastoreResult,\n    MetastoreService, MetastoreServiceClient,\n};\nuse quickwit_proto::types::{IndexId, IndexUid};\nuse serde::{Deserialize, Serialize};\nuse tracing::info;\nuse warp::{Filter, Rejection};\n\nuse super::rest_handler::json_body;\nuse crate::format::extract_format_from_qs;\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::simple_list::{from_simple_list, to_simple_list};\nuse crate::with_arg;\n\n/// This struct represents the query string passed to\n/// the REST API to filter splits.\n#[derive(Debug, Clone, Deserialize, Serialize, utoipa::IntoParams, utoipa::ToSchema, Default)]\n#[into_params(parameter_in = Query)]\npub struct ListSplitsQueryParams {\n    /// If set, defines the number of splits to skip.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub offset: Option<usize>,\n    /// If set, restricts the maximum number of splits to retrieve.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub limit: Option<usize>,\n    /// Split state(s) to filter by.\n    
#[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(serialize_with = \"to_simple_list\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub split_states: Option<Vec<SplitState>>,\n    /// If set, restrict splits to documents with a `timestamp >= start_timestamp`.\n    /// This timestamp is in seconds.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub start_timestamp: Option<i64>,\n    /// If set, restrict splits to documents with a `timestamp < end_timestamp`.\n    /// This timestamp is in seconds.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub end_timestamp: Option<i64>,\n    /// If set, restrict splits whose creation dates are before this date.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(default)]\n    pub end_create_timestamp: Option<i64>,\n}\n\n#[derive(Serialize, Deserialize, Debug, utoipa::ToSchema)]\npub struct ListSplitsResponse {\n    #[serde(default)]\n    pub offset: usize,\n    #[serde(default)]\n    pub size: usize,\n    #[serde(default)]\n    pub splits: Vec<Split>,\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Indexes\",\n    path = \"/indexes/{index_id}/splits\",\n    responses(\n        (status = 200, description = \"Successfully fetched splits.\", body = ListSplitsResponse)\n    ),\n    params(\n        ListSplitsQueryParams,\n        (\"index_id\" = String, Path, description = \"The index ID to retrieve splits for.\"),\n    )\n)]\n\n/// Get splits.\npub async fn list_splits(\n    index_id: IndexId,\n    list_split_query: ListSplitsQueryParams,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<ListSplitsResponse> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    
info!(index_id = %index_id, list_split_query = ?list_split_query, \"get-splits\");\n    let mut query = ListSplitsQuery::for_index(index_uid);\n    let mut offset = 0;\n    if let Some(offset_value) = list_split_query.offset {\n        query = query.with_offset(offset_value);\n        offset = offset_value;\n    }\n    if let Some(limit) = list_split_query.limit {\n        query = query.with_limit(limit);\n    }\n    if let Some(split_states) = list_split_query.split_states {\n        query = query.with_split_states(split_states);\n    }\n    if let Some(start_timestamp) = list_split_query.start_timestamp {\n        query = query.with_time_range_start_gte(start_timestamp);\n    }\n    if let Some(end_timestamp) = list_split_query.end_timestamp {\n        query = query.with_time_range_end_lt(end_timestamp);\n    }\n    if let Some(end_created_timestamp) = list_split_query.end_create_timestamp {\n        query = query.with_create_timestamp_lt(end_created_timestamp);\n    }\n    let list_splits_request = ListSplitsRequest::try_from_list_splits_query(&query)?;\n    let splits = metastore\n        .list_splits(list_splits_request)\n        .await?\n        .collect_splits()\n        .await?;\n    Ok(ListSplitsResponse {\n        offset,\n        size: splits.len(),\n        splits,\n    })\n}\n\npub fn list_splits_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"splits\")\n        .and(warp::get())\n        .and(warp::query())\n        .and(with_arg(metastore))\n        .then(list_splits)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[derive(Deserialize, utoipa::ToSchema)]\n#[serde(deny_unknown_fields)]\npub struct SplitsForDeletion {\n    pub split_ids: Vec<String>,\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Splits\",\n    path = \"/indexes/{index_id}/splits/mark-for-deletion\",\n    
request_body = SplitsForDeletion,\n    responses(\n        (status = 200, description = \"Successfully marked splits for deletion.\")\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to mark splits for deletion for.\"),\n    )\n)]\n/// Marks splits for deletion.\npub async fn mark_splits_for_deletion(\n    index_id: IndexId,\n    splits_for_deletion: SplitsForDeletion,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<()> {\n    let index_metadata_request = IndexMetadataRequest::for_index_id(index_id.to_string());\n    let index_uid: IndexUid = metastore\n        .index_metadata(index_metadata_request)\n        .await?\n        .deserialize_index_metadata()?\n        .index_uid;\n    info!(index_id = %index_id, splits_ids = ?splits_for_deletion.split_ids, \"mark-splits-for-deletion\");\n    let split_ids: Vec<String> = splits_for_deletion\n        .split_ids\n        .iter()\n        .map(|split_id| split_id.to_string())\n        .collect();\n    let mark_splits_for_deletion_request =\n        MarkSplitsForDeletionRequest::new(index_uid, split_ids.clone());\n    metastore\n        .mark_splits_for_deletion(mark_splits_for_deletion_request)\n        .await?;\n    Ok(())\n}\n\npub fn mark_splits_for_deletion_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"indexes\" / String / \"splits\" / \"mark-for-deletion\")\n        .and(warp::put())\n        .and(json_body())\n        .and(with_arg(metastore))\n        .then(mark_splits_for_deletion)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/indexing_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod rest_handler;\n\npub use rest_handler::{IndexingApi, indexing_get_handler};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/indexing_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::Infallible;\n\nuse quickwit_actors::{AskError, Mailbox, Observe};\nuse quickwit_indexing::actors::{IndexingService, IndexingServiceCounters};\nuse warp::{Filter, Rejection};\n\nuse crate::format::extract_format_from_qs;\nuse crate::require;\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::into_rest_api_response;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(indexing_endpoint))]\npub struct IndexingApi;\n\n#[utoipa::path(\n    get,\n    tag = \"Indexing\",\n    path = \"/indexing\",\n    responses(\n        (status = 200, description = \"Successfully observed indexing pipelines.\", body = IndexingStatistics)\n    ),\n)]\n/// Observe Indexing Pipeline\nasync fn indexing_endpoint(\n    indexing_service_mailbox: Mailbox<IndexingService>,\n) -> Result<IndexingServiceCounters, AskError<Infallible>> {\n    let counters = indexing_service_mailbox.ask(Observe).await?;\n    Ok(counters)\n}\n\nfn indexing_get_filter() -> impl Filter<Extract = (), Error = Rejection> + Clone {\n    warp::path!(\"indexing\").and(warp::get())\n}\n\npub fn indexing_get_handler(\n    indexing_service_mailbox_opt: Option<Mailbox<IndexingService>>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    indexing_get_filter()\n        
.and(require(indexing_service_mailbox_opt))\n        .then(indexing_endpoint)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .recover(recover_fn)\n        .boxed()\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/ingest_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod response;\nmod rest_handler;\n\npub use response::{RestIngestResponse, RestParseFailure};\n#[cfg(test)]\npub(crate) use rest_handler::tests::setup_ingest_v1_service;\npub use rest_handler::{IngestApi, IngestApiSchemas};\npub(crate) use rest_handler::{ingest_api_handlers, lines};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/ingest_api/response.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::BTreeMap;\n\nuse bytes::Bytes;\nuse quickwit_ingest::{IngestResponse, IngestServiceError};\nuse quickwit_proto::ingest::router::IngestResponseV2;\nuse quickwit_proto::ingest::{DocBatchV2, ParseFailureReason};\nuse quickwit_proto::types::DocUid;\nuse serde::{Deserialize, Serialize};\n\n#[derive(Serialize, Deserialize, Debug, PartialEq, utoipa::ToSchema)]\npub struct RestParseFailure {\n    pub message: String,\n    pub document: String,\n    pub reason: ParseFailureReason,\n}\n\n#[derive(Serialize, Deserialize, Debug, PartialEq, Default, utoipa::ToSchema)]\npub struct RestIngestResponse {\n    /// Number of rows in the request payload\n    pub num_docs_for_processing: u64,\n    /// Number of docs successfully ingested\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub num_ingested_docs: Option<u64>, // TODO(#5604) remove Option\n    /// Number of docs rejected because of parsing errors\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub num_rejected_docs: Option<u64>, // TODO(#5604) remove Option\n    /// Detailed description of parsing errors (available if the query param\n    /// `detailed_response` is set to `true`)\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub parse_failures: Option<Vec<RestParseFailure>>,\n}\n\nimpl RestIngestResponse {\n    pub(crate) fn 
from_ingest_v1(ingest_response: IngestResponse) -> Self {\n        Self {\n            num_docs_for_processing: ingest_response.num_docs_for_processing,\n            ..Default::default()\n        }\n    }\n\n    /// Converts [`IngestResponseV2`] into [`RestIngestResponse`].\n    ///\n    /// Generates a detailed failure description (`parse_failures`) if\n    /// `doc_batch_clone_opt.is_some()`\n    pub(crate) fn from_ingest_v2(\n        mut ingest_response: IngestResponseV2,\n        doc_batch_clone_opt: Option<&DocBatchV2>,\n        num_docs_for_processing: u64,\n    ) -> Result<Self, IngestServiceError> {\n        let num_responses = ingest_response.successes.len() + ingest_response.failures.len();\n        if num_responses != 1 {\n            return Err(IngestServiceError::Internal(format!(\n                \"expected a single failure/success, got {num_responses}\",\n            )));\n        }\n        if let Some(failure_resp) = ingest_response.failures.pop() {\n            return Err(failure_resp.into());\n        }\n        let success_resp = ingest_response.successes.pop().unwrap();\n\n        let mut resp = Self {\n            num_docs_for_processing,\n            num_ingested_docs: Some(success_resp.num_ingested_docs as u64),\n            num_rejected_docs: Some(success_resp.parse_failures.len() as u64),\n            parse_failures: None,\n        };\n        if let Some(doc_batch) = doc_batch_clone_opt {\n            let docs: BTreeMap<DocUid, Bytes> = doc_batch.docs().collect();\n            let mut parse_failures = Vec::with_capacity(success_resp.parse_failures.len());\n            for failure in success_resp.parse_failures {\n                let doc = docs.get(&failure.doc_uid()).ok_or_else(|| {\n                    IngestServiceError::Internal(format!(\n                        \"failed doc_uid {} not found in the original doc batch\",\n                        failure.doc_uid()\n                    ))\n                })?;\n                
parse_failures.push(RestParseFailure {\n                    reason: failure.reason(),\n                    message: failure.message,\n                    document: String::from_utf8(doc.to_vec()).unwrap(),\n                });\n            }\n            resp.parse_failures = Some(parse_failures);\n        }\n        Ok(resp)\n    }\n\n    /// Aggregates ingest counts and errors.\n    pub fn merge(self, other: Self) -> Self {\n        Self {\n            num_docs_for_processing: self.num_docs_for_processing + other.num_docs_for_processing,\n            num_ingested_docs: apply_op(self.num_ingested_docs, other.num_ingested_docs, |a, b| {\n                a + b\n            }),\n            num_rejected_docs: apply_op(self.num_rejected_docs, other.num_rejected_docs, |a, b| {\n                a + b\n            }),\n            parse_failures: apply_op(self.parse_failures, other.parse_failures, |a, b| {\n                a.into_iter().chain(b).collect()\n            }),\n        }\n    }\n}\n\nfn apply_op<T>(a: Option<T>, b: Option<T>, f: impl Fn(T, T) -> T) -> Option<T> {\n    match (a, b) {\n        (Some(a), Some(b)) => Some(f(a, b)),\n        (Some(a), None) => Some(a),\n        (None, Some(b)) => Some(b),\n        (None, None) => None,\n    }\n}\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::ingest::ParseFailure;\n    use quickwit_proto::ingest::router::{IngestFailure, IngestFailureReason, IngestSuccess};\n    use quickwit_proto::types::IndexUid;\n\n    use super::*;\n\n    #[test]\n    fn test_from_ingest_v1() {\n        let ingest_response = IngestResponse {\n            num_docs_for_processing: 10,\n        };\n        let rest_response = RestIngestResponse::from_ingest_v1(ingest_response);\n        assert_eq!(rest_response.num_docs_for_processing, 10);\n        assert_eq!(rest_response.num_ingested_docs, None);\n        assert_eq!(rest_response.num_rejected_docs, None);\n        assert_eq!(rest_response.parse_failures, None);\n    }\n\n    #[test]\n    fn 
test_from_ingest_v2_success() {\n        let success_resp = IngestResponseV2 {\n            successes: vec![IngestSuccess {\n                subrequest_id: 0,\n                index_uid: Some(IndexUid::new_with_random_ulid(\"myindex\")),\n                source_id: String::from(\"mysource\"),\n                shard_id: Some(\"myshard\".into()),\n                replication_position_inclusive: None,\n                num_ingested_docs: 5,\n                parse_failures: vec![],\n            }],\n            failures: vec![],\n        };\n        let rest_response = RestIngestResponse::from_ingest_v2(success_resp, None, 10).unwrap();\n        assert_eq!(rest_response.num_docs_for_processing, 10);\n        assert_eq!(rest_response.num_ingested_docs, Some(5));\n        assert_eq!(rest_response.num_rejected_docs, Some(0));\n        assert_eq!(rest_response.parse_failures, None);\n    }\n\n    #[test]\n    fn test_from_ingest_v2_partial_success() {\n        let success_resp = IngestResponseV2 {\n            successes: vec![IngestSuccess {\n                subrequest_id: 0,\n                index_uid: Some(IndexUid::new_with_random_ulid(\"myindex\")),\n                source_id: String::from(\"mysource\"),\n                shard_id: Some(\"myshard\".into()),\n                replication_position_inclusive: None,\n                num_ingested_docs: 5,\n                parse_failures: vec![ParseFailure {\n                    doc_uid: Some(DocUid::for_test(42)),\n                    message: \"error\".to_string(),\n                    reason: ParseFailureReason::InvalidJson.into(),\n                }],\n            }],\n            failures: vec![],\n        };\n        let rest_response = RestIngestResponse::from_ingest_v2(success_resp, None, 10).unwrap();\n        assert_eq!(rest_response.num_docs_for_processing, 10);\n        assert_eq!(rest_response.num_ingested_docs, Some(5));\n        assert_eq!(rest_response.num_rejected_docs, Some(1));\n        
assert_eq!(rest_response.parse_failures, None);\n    }\n\n    #[test]\n    fn test_from_ingest_v2_failure() {\n        let failure_resp = IngestResponseV2 {\n            successes: vec![],\n            failures: vec![IngestFailure {\n                subrequest_id: 0,\n                index_id: String::from(\"myindex\"),\n                source_id: String::from(\"mysource\"),\n                reason: IngestFailureReason::SourceNotFound.into(),\n            }],\n        };\n        let result = RestIngestResponse::from_ingest_v2(failure_resp, None, 10);\n        assert!(result.is_err());\n    }\n\n    #[test]\n    fn test_merge_responses() {\n        let response1 = RestIngestResponse {\n            num_docs_for_processing: 10,\n            num_ingested_docs: Some(5),\n            num_rejected_docs: Some(2),\n            parse_failures: Some(vec![RestParseFailure {\n                message: \"error1\".to_string(),\n                document: \"doc1\".to_string(),\n                reason: ParseFailureReason::InvalidJson,\n            }]),\n        };\n        let response2 = RestIngestResponse {\n            num_docs_for_processing: 15,\n            num_ingested_docs: Some(10),\n            num_rejected_docs: Some(3),\n            parse_failures: Some(vec![RestParseFailure {\n                message: \"error2\".to_string(),\n                document: \"doc2\".to_string(),\n                reason: ParseFailureReason::InvalidJson,\n            }]),\n        };\n        let merged_response = response1.merge(response2);\n        assert_eq!(merged_response.num_docs_for_processing, 25);\n        assert_eq!(merged_response.num_ingested_docs.unwrap(), 15);\n        assert_eq!(merged_response.num_rejected_docs.unwrap(), 5);\n        assert_eq!(\n            merged_response.parse_failures.unwrap(),\n            vec![\n                RestParseFailure {\n                    message: \"error1\".to_string(),\n                    document: \"doc1\".to_string(),\n                    
reason: ParseFailureReason::InvalidJson,\n                },\n                RestParseFailure {\n                    message: \"error2\".to_string(),\n                    document: \"doc2\".to_string(),\n                    reason: ParseFailureReason::InvalidJson,\n                }\n            ]\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/ingest_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse bytes::{Buf, Bytes};\nuse quickwit_config::{INGEST_V2_SOURCE_ID, IngestApiConfig, validate_identifier};\nuse quickwit_ingest::{\n    CommitType, DocBatchBuilder, DocBatchV2Builder, FetchResponse, IngestRequest, IngestService,\n    IngestServiceClient, IngestServiceError, TailRequest,\n};\nuse quickwit_proto::ingest::CommitTypeV2;\nuse quickwit_proto::ingest::router::{\n    IngestRequestV2, IngestRouterService, IngestRouterServiceClient, IngestSubrequest,\n};\nuse quickwit_proto::types::{DocUidGenerator, IndexId};\nuse serde::Deserialize;\nuse warp::{Filter, Rejection};\n\nuse super::RestIngestResponse;\nuse crate::decompression::get_body_bytes;\nuse crate::format::extract_format_from_qs;\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::{Body, BodyFormat, with_arg};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(ingest, tail_endpoint,))]\npub struct IngestApi;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(components(schemas(\n    quickwit_ingest::DocBatch,\n    quickwit_ingest::FetchResponse,\n    quickwit_ingest::IngestResponse,\n    quickwit_ingest::CommitType,\n)))]\npub struct IngestApiSchemas;\n\n#[derive(Clone, Debug, Deserialize, PartialEq)]\nstruct IngestOptions {\n    #[serde(alias = \"commit\", default = \"IngestOptions::default_commit_type\")]\n    commit_type: CommitTypeV2,\n    #[serde(default)]\n    
use_legacy_ingest: bool,\n    #[serde(default)]\n    detailed_response: bool,\n}\n\nimpl IngestOptions {\n    // This default implementation is necessary because `CommitTypeV2::default()` is\n    // `CommitTypeV2::Unspecified`.\n    fn default_commit_type() -> CommitTypeV2 {\n        CommitTypeV2::Auto\n    }\n\n    fn commit_type_v1(&self) -> CommitType {\n        match self.commit_type {\n            CommitTypeV2::Unspecified | CommitTypeV2::Auto => CommitType::Auto,\n            CommitTypeV2::Force => CommitType::Force,\n            CommitTypeV2::WaitFor => CommitType::WaitFor,\n        }\n    }\n}\n\npub(crate) fn ingest_api_handlers(\n    ingest_router: IngestRouterServiceClient,\n    ingest_service: IngestServiceClient,\n    config: IngestApiConfig,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    ingest_handler(\n        ingest_router,\n        ingest_service.clone(),\n        config,\n        enable_ingest_v1,\n        enable_ingest_v2,\n    )\n    .or(tail_handler(ingest_service))\n    .boxed()\n}\n\nfn ingest_filter(\n    config: IngestApiConfig,\n) -> impl Filter<Extract = (String, Body, IngestOptions), Error = Rejection> + Clone {\n    warp::path!(String / \"ingest\")\n        .and(warp::post())\n        .and(warp::body::content_length_limit(\n            config.content_length_limit.as_u64(),\n        ))\n        .and(get_body_bytes())\n        .and(warp::query::<IngestOptions>())\n}\n\nfn ingest_handler(\n    ingest_router: IngestRouterServiceClient,\n    ingest_service: IngestServiceClient,\n    config: IngestApiConfig,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    ingest_filter(config)\n        .and(with_arg(ingest_router))\n        .and(with_arg(ingest_service))\n        .then(\n            move |index_id, body, ingest_options, ingest_router, ingest_service| {\n     
           ingest(\n                    index_id,\n                    body,\n                    ingest_options,\n                    ingest_router,\n                    ingest_service,\n                    enable_ingest_v1,\n                    enable_ingest_v2,\n                )\n            },\n        )\n        .map(|result| into_rest_api_response(result, BodyFormat::default()))\n        .boxed()\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Ingest\",\n    path = \"/{index_id}/ingest\",\n    request_body(content = String, description = \"Documents to ingest in NDJSON format and limited to 10MB\", content_type = \"application/json\"),\n    responses(\n        (status = 200, description = \"Successfully ingested documents.\", body = RestIngestResponse)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to add docs to.\"),\n        (\"commit\" = Option<CommitType>, Query, description = \"Force or wait for commit at the end of the indexing operation.\"),\n    )\n)]\n/// Ingest documents\nasync fn ingest(\n    index_id: IndexId,\n    body: Body,\n    ingest_options: IngestOptions,\n    ingest_router: IngestRouterServiceClient,\n    ingest_service: IngestServiceClient,\n    enable_ingest_v1: bool,\n    enable_ingest_v2: bool,\n) -> Result<RestIngestResponse, IngestServiceError> {\n    if enable_ingest_v2 && !ingest_options.use_legacy_ingest {\n        return ingest_v2(index_id, body, ingest_options, ingest_router).await;\n    }\n    if !enable_ingest_v1 {\n        let message = \"ingest v1 is disabled: environment variable `QW_DISABLE_INGEST_V1` is set\";\n        return Err(IngestServiceError::Internal(message.to_string()));\n    }\n    ingest_v1(index_id, body, ingest_options, ingest_service).await\n}\n\n/// Ingest documents\nasync fn ingest_v1(\n    index_id: IndexId,\n    body: Body,\n    ingest_options: IngestOptions,\n    ingest_service: IngestServiceClient,\n) -> Result<RestIngestResponse, IngestServiceError> {\n    if 
ingest_options.detailed_response {\n        return Err(IngestServiceError::BadRequest(\n            \"detailed_response is not supported in ingest v1\".to_string(),\n        ));\n    }\n    // The size of the body should be an upper bound of the size of the batch. The removal of the\n    // end of line character for each doc compensates the addition of the `DocCommand` header.\n    let mut doc_batch_builder = DocBatchBuilder::with_capacity(index_id, body.content.remaining());\n    for line in lines(&body.content) {\n        doc_batch_builder.ingest_doc(line);\n    }\n    let ingest_req = IngestRequest {\n        doc_batches: vec![doc_batch_builder.build()],\n        commit: ingest_options.commit_type_v1() as i32,\n    };\n    let ingest_response = ingest_service.ingest(ingest_req).await?;\n    Ok(RestIngestResponse::from_ingest_v1(ingest_response))\n}\n\nasync fn ingest_v2(\n    index_id: IndexId,\n    body: Body,\n    ingest_options: IngestOptions,\n    ingest_router: IngestRouterServiceClient,\n) -> Result<RestIngestResponse, IngestServiceError> {\n    let mut doc_batch_builder = DocBatchV2Builder::default();\n    let mut doc_uid_generator = DocUidGenerator::default();\n\n    for doc in lines(&body.content) {\n        doc_batch_builder.add_doc(doc_uid_generator.next_doc_uid(), doc);\n    }\n    drop(body);\n    let doc_batch_opt = doc_batch_builder.build();\n\n    let Some(doc_batch) = doc_batch_opt else {\n        let response = RestIngestResponse::default();\n        return Ok(response);\n    };\n    let num_docs_for_processing = doc_batch.num_docs() as u64;\n    let doc_batch_clone_opt = if ingest_options.detailed_response {\n        Some(doc_batch.clone())\n    } else {\n        None\n    };\n\n    // Validate index ID early because propagating back the right error (400)\n    // from deeper ingest layers is harder\n    if validate_identifier(\"\", &index_id).is_err() {\n        return Err(IngestServiceError::BadRequest(\n            \"invalid index 
ID\".to_string(),\n        ));\n    }\n\n    let subrequest = IngestSubrequest {\n        subrequest_id: 0,\n        index_id,\n        source_id: INGEST_V2_SOURCE_ID.to_string(),\n        doc_batch: Some(doc_batch),\n    };\n    let request = IngestRequestV2 {\n        commit_type: ingest_options.commit_type as i32,\n        subrequests: vec![subrequest],\n    };\n    let response = ingest_router.ingest(request).await?;\n    RestIngestResponse::from_ingest_v2(\n        response,\n        doc_batch_clone_opt.as_ref(),\n        num_docs_for_processing,\n    )\n}\n\npub fn tail_handler(\n    ingest_service: IngestServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    tail_filter()\n        .and(with_arg(ingest_service))\n        .then(tail_endpoint)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n        .boxed()\n}\n\nfn tail_filter() -> impl Filter<Extract = (String,), Error = Rejection> + Clone {\n    warp::path!(String / \"tail\").and(warp::get())\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Ingest\",\n    path = \"/{index_id}/tail\",\n    responses(\n        (status = 200, description = \"Successfully fetched documents.\", body = FetchResponse)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to tail.\"),\n    )\n)]\n/// Returns the last few ingested documents.\nasync fn tail_endpoint(\n    index_id: IndexId,\n    ingest_service: IngestServiceClient,\n) -> Result<FetchResponse, IngestServiceError> {\n    let fetch_response = ingest_service.tail(TailRequest { index_id }).await?;\n    Ok(fetch_response)\n}\n\npub(crate) fn lines(body: &Bytes) -> impl Iterator<Item = &[u8]> {\n    body.split(|byte| byte == &b'\\n')\n        .filter(|line| !is_empty_or_blank_line(line))\n}\n\n#[inline]\nfn is_empty_or_blank_line(line: &[u8]) -> bool {\n    line.is_empty() || line.iter().all(|ch| ch.is_ascii_whitespace())\n}\n\n#[cfg(test)]\npub(crate) mod tests {\n    
use std::str;\n    use std::time::Duration;\n\n    use bytes::Bytes;\n    use quickwit_actors::{Mailbox, Universe};\n    use quickwit_config::IngestApiConfig;\n    use quickwit_ingest::{\n        CreateQueueIfNotExistsRequest, FetchRequest, FetchResponse, IngestApiService,\n        IngestServiceClient, QUEUES_DIR_NAME, SuggestTruncateRequest, init_ingest_api,\n    };\n    use quickwit_proto::ingest::router::IngestRouterServiceClient;\n\n    use super::{RestIngestResponse, ingest_api_handlers};\n    use crate::ingest_api::lines;\n\n    #[test]\n    fn test_process_lines() {\n        let test_cases = [\n            // an empty line is inserted before the metadata action and the doc\n            (&b\"\\n{ \\\"create\\\" : { \\\"_index\\\" : \\\"my-index-1\\\", \\\"_id\\\" : \\\"1\\\"} }\\n{\\\"id\\\": 1, \\\"message\\\": \\\"push\\\"}\"[..], 2),\n            // a blank line is inserted before the metadata action and the doc\n            (&b\"       \\n{ \\\"create\\\" : { \\\"_index\\\" : \\\"my-index-1\\\", \\\"_id\\\" : \\\"1\\\"} }\\n{\\\"id\\\": 1, \\\"message\\\": \\\"push\\\"}\"[..], 2),\n            // an empty line is inserted after the metadata action and before the doc\n            (&b\"{ \\\"create\\\" : { \\\"_index\\\" : \\\"my-index-1\\\", \\\"_id\\\" : \\\"1\\\"} }\\n\\n{\\\"id\\\": 1, \\\"message\\\": \\\"push\\\"}\"[..], 2),\n            // a blank line is inserted after the metadata action and before the doc\n            (&b\"{ \\\"create\\\" : { \\\"_index\\\" : \\\"my-index-1\\\", \\\"_id\\\" : \\\"1\\\"} }\\n     \\n{\\\"id\\\": 1, \\\"message\\\": \\\"push\\\"}\"[..], 2),\n        ];\n\n        for &(input, expected_count) in &test_cases {\n            assert_eq!(lines(&Bytes::from(input)).count(), expected_count);\n        }\n    }\n\n    pub(crate) async fn setup_ingest_v1_service(\n        queues: &[&str],\n        config: &IngestApiConfig,\n    ) -> (\n        Universe,\n        tempfile::TempDir,\n        IngestServiceClient,\n        
Mailbox<IngestApiService>,\n    ) {\n        let universe = Universe::with_accelerated_time();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let queues_dir_path = temp_dir.path().join(QUEUES_DIR_NAME);\n        let ingest_service_mailbox = init_ingest_api(&universe, &queues_dir_path, config)\n            .await\n            .unwrap();\n        for queue in queues {\n            let create_queue_req = CreateQueueIfNotExistsRequest {\n                queue_id: queue.to_string(),\n            };\n            ingest_service_mailbox\n                .ask_for_res(create_queue_req)\n                .await\n                .unwrap();\n        }\n        let ingest_service = IngestServiceClient::from_mailbox(ingest_service_mailbox.clone());\n        (universe, temp_dir, ingest_service, ingest_service_mailbox)\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_returns_200_when_ingest_json_and_fetch() {\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let ingest_api_handlers = ingest_api_handlers(\n            ingest_router,\n            ingest_service,\n            IngestApiConfig::default(),\n            true,\n            false,\n        );\n        let resp = warp::test::request()\n            .path(\"/my-index/ingest\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"id\": 1, \"message\": \"push\"}\"#)\n            .reply(&ingest_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let ingest_response: RestIngestResponse = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(ingest_response.num_docs_for_processing, 1);\n\n        let resp = warp::test::request()\n            .path(\"/my-index/tail\")\n            .method(\"GET\")\n            .reply(&ingest_api_handlers)\n            .await;\n       
 assert_eq!(resp.status(), 200);\n        let fetch_response: FetchResponse = serde_json::from_slice(resp.body()).unwrap();\n        let doc_batch = fetch_response.doc_batch.unwrap();\n        assert_eq!(doc_batch.index_id, \"my-index\");\n        assert_eq!(doc_batch.num_docs(), 1);\n        assert_eq!(\n            doc_batch.doc_lengths.iter().sum::<u32>() as usize,\n            doc_batch.doc_buffer.len()\n        );\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_returns_200_when_ingest_ndjson_and_fetch() {\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let ingest_api_handlers = ingest_api_handlers(\n            ingest_router,\n            ingest_service,\n            IngestApiConfig::default(),\n            true,\n            false,\n        );\n        let payload = r#\"\n            {\"id\": 1, \"message\": \"push\"}\n            {\"id\": 2, \"message\": \"push\"}\n            {\"id\": 3, \"message\": \"push\"}\"#;\n        let resp = warp::test::request()\n            .path(\"/my-index/ingest\")\n            .method(\"POST\")\n            .body(payload)\n            .reply(&ingest_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let ingest_response: RestIngestResponse = serde_json::from_slice(resp.body()).unwrap();\n        assert_eq!(ingest_response.num_docs_for_processing, 3);\n\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_return_429_if_above_limits() {\n        let config: IngestApiConfig =\n            serde_json::from_str(r#\"{ \"max_queue_memory_usage\": \"1\" }\"#).unwrap();\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index\"], &config).await;\n        let ingest_router = 
IngestRouterServiceClient::mocked();\n        let ingest_api_handlers = ingest_api_handlers(\n            ingest_router,\n            ingest_service,\n            IngestApiConfig::default(),\n            true,\n            false,\n        );\n        let resp = warp::test::request()\n            .path(\"/my-index/ingest\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"id\": 1, \"message\": \"push\"}\"#)\n            .reply(&ingest_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 429);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_return_413_if_above_content_limit() {\n        let config: IngestApiConfig =\n            serde_json::from_str(r#\"{ \"content_length_limit\": \"1\" }\"#).unwrap();\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let ingest_api_handlers =\n            ingest_api_handlers(ingest_router, ingest_service, config.clone(), true, false);\n        let resp = warp::test::request()\n            .path(\"/my-index/ingest\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"id\": 1, \"message\": \"push\"}\"#)\n            .reply(&ingest_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 413);\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_blocks_when_wait_is_specified() {\n        let (universe, _temp_dir, ingest_service_client, ingest_service_mailbox) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let ingest_api_handlers = ingest_api_handlers(\n            ingest_router,\n            ingest_service_client,\n            IngestApiConfig::default(),\n        
    true,\n            false,\n        );\n        let handle = tokio::spawn(async move {\n            let resp = warp::test::request()\n                .path(\"/my-index/ingest?commit=wait_for\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"id\": 1, \"message\": \"push\"}\"#)\n                .reply(&ingest_api_handlers)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let ingest_response: RestIngestResponse = serde_json::from_slice(resp.body()).unwrap();\n            assert_eq!(ingest_response.num_docs_for_processing, 1);\n        });\n        universe.sleep(Duration::from_secs(10)).await;\n        assert!(!handle.is_finished());\n        assert_eq!(\n            ingest_service_mailbox\n                .ask_for_res(FetchRequest {\n                    index_id: \"my-index\".to_string(),\n                    start_after: None,\n                    num_bytes_limit: None,\n                })\n                .await\n                .unwrap()\n                .doc_batch\n                .unwrap()\n                .num_docs(),\n            1\n        );\n        ingest_service_mailbox\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"my-index\".to_string(),\n                up_to_position_included: 0,\n            })\n            .await\n            .unwrap();\n        handle.await.unwrap();\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_blocks_when_force_is_specified() {\n        let (universe, _temp_dir, ingest_service_client, ingest_service_mailbox) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let ingest_api_handlers = ingest_api_handlers(\n            ingest_router,\n            ingest_service_client,\n            IngestApiConfig::default(),\n            true,\n            false,\n   
     );\n        let handle = tokio::spawn(async move {\n            let resp = warp::test::request()\n                .path(\"/my-index/ingest?commit=force\")\n                .method(\"POST\")\n                .json(&true)\n                .body(r#\"{\"id\": 1, \"message\": \"push\"}\"#)\n                .reply(&ingest_api_handlers)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let ingest_response: RestIngestResponse = serde_json::from_slice(resp.body()).unwrap();\n            assert_eq!(ingest_response.num_docs_for_processing, 1);\n        });\n        universe.sleep(Duration::from_secs(10)).await;\n        assert!(!handle.is_finished());\n        assert_eq!(\n            ingest_service_mailbox\n                .ask_for_res(FetchRequest {\n                    index_id: \"my-index\".to_string(),\n                    start_after: None,\n                    num_bytes_limit: None,\n                })\n                .await\n                .unwrap()\n                .doc_batch\n                .unwrap()\n                .num_docs(),\n            2\n        );\n        ingest_service_mailbox\n            .ask_for_res(SuggestTruncateRequest {\n                index_id: \"my-index\".to_string(),\n                up_to_position_included: 0,\n            })\n            .await\n            .unwrap();\n        handle.await.unwrap();\n        universe.assert_quit().await;\n    }\n\n    #[tokio::test]\n    async fn test_ingest_api_unsupported_detailed_errors() {\n        let (universe, _temp_dir, ingest_service, _) =\n            setup_ingest_v1_service(&[\"my-index\"], &IngestApiConfig::default()).await;\n        let ingest_router = IngestRouterServiceClient::mocked();\n        let ingest_api_handlers = ingest_api_handlers(\n            ingest_router,\n            ingest_service,\n            IngestApiConfig::default(),\n            true,\n            false,\n        );\n        let resp = warp::test::request()\n            
.path(\"/my-index/ingest?detailed_response=true\")\n            .method(\"POST\")\n            .json(&true)\n            .body(r#\"{\"id\": 1, \"message\": \"push\"}\"#)\n            .reply(&ingest_api_handlers)\n            .await;\n        assert_eq!(resp.status(), 400);\n        universe.assert_quit().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/jaeger_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod model;\nmod parse_duration;\nmod rest_handler;\npub(crate) use rest_handler::{JaegerApi, jaeger_api_handlers};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/jaeger_api/model.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\n\nuse base64::prelude::{BASE64_STANDARD, Engine};\nuse itertools::Itertools;\nuse prost_types::{Duration, Timestamp};\nuse quickwit_proto::jaeger::api_v2::{KeyValue, Log, Process, Span, SpanRef, ValueType};\nuse serde::{Deserialize, Serialize};\nuse serde_json::{Value, json};\nuse serde_with::serde_as;\nuse warp::hyper::StatusCode;\n\npub(super) const DEFAULT_NUMBER_OF_TRACES: i32 = 20;\n\npub(super) fn build_jaeger_traces(spans: Vec<JaegerSpan>) -> anyhow::Result<Vec<JaegerTrace>> {\n    let jaeger_traces: Vec<JaegerTrace> = spans\n        .into_iter()\n        .chunk_by(|span| span.trace_id.clone())\n        .into_iter()\n        .map(|(trace_id, group)| JaegerTrace::new(trace_id, group.collect()))\n        .collect();\n    Ok(jaeger_traces)\n}\n\n#[derive(Debug, Default, Clone, Serialize, Deserialize, PartialEq)]\n#[serde(deny_unknown_fields)]\npub struct JaegerResponseBody<T> {\n    pub data: T,\n}\n\n#[serde_with::skip_serializing_none]\n#[derive(Clone, Default, Debug, Serialize, Deserialize, utoipa::IntoParams)]\n#[serde(rename_all = \"camelCase\")]\n#[serde(deny_unknown_fields)]\npub struct TracesSearchQueryParams {\n    #[serde(default)]\n    pub service: Option<String>,\n    #[serde(default)]\n    pub operation: Option<String>,\n    // these are microsecond precision\n    pub start: Option<i64>,\n    pub 
end: Option<i64>,\n    pub tags: Option<String>,\n    // these are unit-suffixed numbers. in practice we only support precision up to the ms\n    pub min_duration: Option<String>,\n    pub max_duration: Option<String>,\n    pub lookback: Option<String>,\n    pub limit: Option<i32>,\n}\n\n// Jaeger Model for UI\n// Source: https://github.com/jaegertracing/jaeger/blob/main/model/json/model.go#L82\n\n#[derive(Clone, Default, Debug, PartialEq, Serialize, utoipa::IntoParams)]\n#[serde(rename_all = \"camelCase\")]\npub struct JaegerTrace {\n    #[serde(rename = \"traceID\")]\n    #[serde(serialize_with = \"serialize_bytes_to_hex\")]\n    trace_id: Vec<u8>,\n    spans: Vec<JaegerSpan>,\n    processes: HashMap<String, JaegerProcess>,\n    warnings: Vec<String>,\n}\n\nimpl JaegerTrace {\n    pub fn new(trace_id: Vec<u8>, mut spans: Vec<JaegerSpan>) -> Self {\n        let processes = Self::build_process_map(&mut spans);\n        JaegerTrace {\n            trace_id,\n            spans,\n            processes,\n            warnings: Vec::new(),\n        }\n    }\n\n    /// Processes a collection of spans, updating the `process_id` field based on the unique\n    /// `service_name` values. 
It tracks service names already seen in\n    /// `service_name_to_process_id` and assigns a new key (`p1`, `p2`, ...) to each unique\n    /// `service_name` value.\n    /// The logic has been replicated from\n    /// <https://github.com/jaegertracing/jaeger/blob/995231c42cadd70bce2bbbf02579e33f6e6329c8/model/converter/json/process_hashtable.go#L37>\n    /// TODO: also use tags to identify processes.\n    fn build_process_map(spans: &mut [JaegerSpan]) -> HashMap<String, JaegerProcess> {\n        let mut service_name_to_process_id: HashMap<String, String> = HashMap::new();\n        let mut process_map: HashMap<String, JaegerProcess> = HashMap::new();\n        let mut process_counter: i32 = 0;\n        for span in spans.iter_mut() {\n            let Some(current_process) = span.process.as_mut() else {\n                continue;\n            };\n            if let Some(process_id) = service_name_to_process_id.get(&current_process.service_name)\n            {\n                span.process_id = Some(process_id.clone());\n            } else {\n                process_counter += 1;\n                current_process.key = format!(\"p{process_counter}\");\n                span.process_id = Some(current_process.key.clone());\n                process_map.insert(current_process.key.clone(), current_process.clone());\n                service_name_to_process_id.insert(\n                    current_process.service_name.clone(),\n                    current_process.key.clone(),\n                );\n            }\n        }\n        process_map\n    }\n}\n\n#[serde_as]\n#[derive(Debug, Clone, PartialEq, Serialize)]\n#[serde(rename_all = \"camelCase\")]\npub struct JaegerSpan {\n    #[serde(rename = \"traceID\")]\n    #[serde(serialize_with = \"serialize_bytes_to_hex\")]\n    pub trace_id: Vec<u8>,\n    #[serde(rename = \"spanID\")]\n    #[serde(serialize_with = \"serialize_bytes_to_hex\")]\n    span_id: Vec<u8>,\n    operation_name: String,\n    references: Vec<JaegerSpanRef>,\n    
#[serde(default)]\n    flags: u32,\n    start_time: i64, // microseconds since Unix epoch\n    duration: i64,   // microseconds\n    tags: Vec<JaegerKeyValue>,\n    logs: Vec<JaegerLog>,\n    #[serde(default)]\n    #[serde(skip_serializing)]\n    process: Option<JaegerProcess>,\n    #[serde(rename = \"processID\")]\n    pub process_id: Option<String>,\n    pub warnings: Vec<String>,\n}\n\nimpl TryFrom<Span> for JaegerSpan {\n    type Error = anyhow::Error;\n    fn try_from(span: Span) -> Result<Self, Self::Error> {\n        let references: Vec<JaegerSpanRef> =\n            span.references.iter().map(JaegerSpanRef::from).collect();\n        let tags: Vec<JaegerKeyValue> = span.tags.iter().map(JaegerKeyValue::from).collect();\n        let logs: Vec<JaegerLog> = span.logs.iter().map(JaegerLog::from).collect();\n        Ok(Self {\n            trace_id: span.trace_id,\n            span_id: span.span_id,\n            operation_name: span.operation_name.clone(),\n            references,\n            flags: span.flags,\n            start_time: span\n                .start_time\n                .as_ref()\n                .map(convert_timestamp_to_microsecs)\n                .unwrap_or(0),\n            duration: span\n                .duration\n                .map(convert_duration_to_microsecs)\n                .unwrap_or(0),\n            tags,\n            logs,\n            process: span.process.map(JaegerProcess::from),\n            process_id: None,\n            warnings: span.warnings.iter().map(|s| s.to_string()).collect(),\n        })\n    }\n}\n\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"camelCase\")]\npub struct JaegerSpanRef {\n    #[serde(rename = \"traceID\")]\n    #[serde(serialize_with = \"serialize_bytes_to_hex\")]\n    trace_id: Vec<u8>,\n    #[serde(rename = \"spanID\")]\n    #[serde(serialize_with = \"serialize_bytes_to_hex\")]\n    span_id: Vec<u8>,\n    ref_type: String,\n}\n\nimpl From<&SpanRef> for JaegerSpanRef 
{\n    fn from(sr: &SpanRef) -> Self {\n        Self {\n            trace_id: sr.trace_id.clone(),\n            span_id: sr.span_id.clone(),\n            ref_type: if sr.ref_type == 0 {\n                \"CHILD_OF\".to_string()\n            } else {\n                \"FOLLOWS_FROM\".to_string()\n            },\n        }\n    }\n}\n\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\npub struct JaegerKeyValue {\n    key: String,\n    #[serde(rename = \"type\")]\n    value_type: String,\n    value: Value,\n}\n\nimpl From<&KeyValue> for JaegerKeyValue {\n    fn from(kv: &KeyValue) -> Self {\n        match kv.v_type {\n            // String = 0,\n            0 => Self {\n                key: kv.key.to_string(),\n                value_type: ValueType::String.as_str_name().to_lowercase(),\n                value: json!(kv.v_str.to_string()),\n            },\n            // Bool = 1,\n            1 => Self {\n                key: kv.key.to_string(),\n                value_type: ValueType::Bool.as_str_name().to_lowercase(),\n                value: json!(kv.v_bool),\n            },\n            // Int64 = 2,\n            2 => {\n                if kv.v_int64 > 9007199254740991 {\n                    Self {\n                        key: kv.key.to_string(),\n                        value_type: ValueType::Int64.as_str_name().to_lowercase(),\n                        value: json!(kv.v_int64.to_string()),\n                    }\n                } else {\n                    Self {\n                        key: kv.key.to_string(),\n                        value_type: ValueType::Int64.as_str_name().to_lowercase(),\n                        value: json!(kv.v_int64),\n                    }\n                }\n            }\n            // Float64 = 3,\n            3 => Self {\n                key: kv.key.to_string(),\n                value_type: ValueType::Float64.as_str_name().to_lowercase(),\n                value: json!(kv.v_float64),\n            },\n            // Binary 
= 4,\n            4 => Self {\n                key: kv.key.to_string(),\n                value_type: ValueType::Binary.as_str_name().to_lowercase(),\n                value: serde_json::Value::String(BASE64_STANDARD.encode(kv.v_binary.as_slice())),\n            },\n            _ => Self {\n                key: \"no_value\".to_string(),\n                value_type: \"unsupported_type\".to_string(),\n                value: Default::default(),\n            },\n        }\n    }\n}\n\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\npub struct JaegerLog {\n    timestamp: i64, // microseconds since Unix epoch\n    fields: Vec<JaegerKeyValue>,\n}\n\nimpl From<&Log> for JaegerLog {\n    fn from(log: &Log) -> Self {\n        Self {\n            timestamp: log\n                .timestamp\n                .as_ref()\n                .map(convert_timestamp_to_microsecs)\n                .unwrap_or(0),\n            fields: log.fields.iter().map(JaegerKeyValue::from).collect(),\n        }\n    }\n}\n\n#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]\n#[serde(rename_all = \"camelCase\")]\nstruct JaegerProcess {\n    service_name: String,\n    key: String,\n    tags: Vec<JaegerKeyValue>,\n}\n\nimpl Default for JaegerProcess {\n    fn default() -> Self {\n        Self {\n            service_name: \"none\".to_string(),\n            key: \"\".to_string(),\n            tags: Vec::new(),\n        }\n    }\n}\n\nimpl From<Process> for JaegerProcess {\n    fn from(process: Process) -> Self {\n        Self {\n            service_name: process.service_name.to_string(),\n            key: \"\".to_string(),\n            tags: process.tags.iter().map(JaegerKeyValue::from).collect(),\n        }\n    }\n}\n\n#[derive(Debug, Clone, Serialize, Deserialize)]\npub struct JaegerError {\n    #[serde(with = \"http_serde::status_code\")]\n    pub status: StatusCode,\n    pub message: String,\n}\n\nimpl From<anyhow::Error> for JaegerError {\n    fn from(error: anyhow::Error) -> Self 
{\n        Self {\n            status: StatusCode::INTERNAL_SERVER_ERROR,\n            message: error.to_string(),\n        }\n    }\n}\n\nfn serialize_bytes_to_hex<S>(bytes: &Vec<u8>, s: S) -> Result<S::Ok, S::Error>\nwhere S: serde::Serializer {\n    s.serialize_str(&format!(\"{:0>16}\", hex::encode(bytes)))\n}\n\nfn convert_timestamp_to_microsecs(timestamp: &Timestamp) -> i64 {\n    timestamp.seconds * 1_000_000 + i64::from(timestamp.nanos / 1000)\n}\n\nfn convert_duration_to_microsecs(duration: Duration) -> i64 {\n    duration.seconds * 1_000_000 + i64::from(duration.nanos / 1000)\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::jaeger::api_v2::Log;\n\n    use crate::jaeger_api::model::{JaegerSpan, build_jaeger_traces};\n\n    #[test]\n    fn test_convert_grpc_jaeger_spans_into_jaeger_ui_model() {\n        let file_content = std::fs::read_to_string(get_jaeger_ui_trace_filepath()).unwrap();\n        let expected_jaeger_trace: serde_json::Value = serde_json::from_str(&file_content).unwrap();\n        let grpc_spans = create_grpc_spans();\n        let jaeger_spans: Vec<JaegerSpan> = grpc_spans\n            .iter()\n            .map(|span| super::JaegerSpan::try_from(span.clone()).unwrap())\n            .collect();\n        let traces = build_jaeger_traces(jaeger_spans).unwrap();\n        let trace_json: serde_json::Value = serde_json::to_value(traces[0].clone()).unwrap();\n        assert_json_diff::assert_json_eq!(expected_jaeger_trace, trace_json);\n    }\n\n    fn get_jaeger_ui_trace_filepath() -> String {\n        format!(\n            \"{}/resources/tests/jaeger_ui_trace.json\",\n            env!(\"CARGO_MANIFEST_DIR\"),\n        )\n    }\n\n    fn create_grpc_spans() -> Vec<quickwit_proto::jaeger::api_v2::Span> {\n        let span_0 = quickwit_proto::jaeger::api_v2::Span {\n            trace_id: vec![1],\n            span_id: vec![1],\n            operation_name: \"test-general-conversion\".to_string(),\n            start_time: 
Some(prost_types::Timestamp {\n                seconds: 1485467191,\n                nanos: 639875000,\n            }),\n            duration: Some(prost_types::Duration {\n                seconds: 0,\n                nanos: 5000,\n            }),\n            process: Some(quickwit_proto::jaeger::api_v2::Process {\n                service_name: \"service-x\".to_string(),\n                tags: Vec::new(),\n            }),\n            logs: vec![\n                Log {\n                    timestamp: Some(prost_types::Timestamp {\n                        seconds: 1485467191,\n                        nanos: 639875000,\n                    }),\n                    fields: vec![quickwit_proto::jaeger::api_v2::KeyValue {\n                        key: \"event\".to_string(),\n                        v_type: 0,\n                        v_str: \"some-event\".to_string(),\n                        ..Default::default()\n                    }],\n                },\n                Log {\n                    timestamp: Some(prost_types::Timestamp {\n                        seconds: 1485467191,\n                        nanos: 639875000,\n                    }),\n                    fields: vec![quickwit_proto::jaeger::api_v2::KeyValue {\n                        key: \"x\".to_string(),\n                        v_type: 0,\n                        v_str: \"y\".to_string(),\n                        ..Default::default()\n                    }],\n                },\n            ],\n            ..Default::default()\n        };\n        let span_1 = quickwit_proto::jaeger::api_v2::Span {\n            operation_name: \"some-operation\".to_string(),\n            trace_id: vec![1],\n            span_id: vec![2],\n            start_time: Some(prost_types::Timestamp {\n                seconds: 1485467191,\n                nanos: 639875000,\n            }),\n            duration: Some(prost_types::Duration {\n                seconds: 0,\n                nanos: 5000,\n            }),\n        
    process: Some(quickwit_proto::jaeger::api_v2::Process {\n                service_name: \"service-x\".to_string(),\n                tags: Vec::new(),\n            }),\n            process_id: \"\".to_string(),\n            tags: vec![\n                quickwit_proto::jaeger::api_v2::KeyValue {\n                    key: \"peer.service\".to_string(),\n                    v_type: 0,\n                    v_str: \"service-y\".to_string(),\n                    ..Default::default()\n                },\n                quickwit_proto::jaeger::api_v2::KeyValue {\n                    key: \"peer.ipv4\".to_string(),\n                    v_type: 2,\n                    v_int64: 23456,\n                    ..Default::default()\n                },\n                quickwit_proto::jaeger::api_v2::KeyValue {\n                    key: \"error\".to_string(),\n                    v_type: 1,\n                    v_bool: true,\n                    ..Default::default()\n                },\n                quickwit_proto::jaeger::api_v2::KeyValue {\n                    key: \"temperature\".to_string(),\n                    v_type: 3,\n                    v_float64: 72.5,\n                    ..Default::default()\n                },\n                quickwit_proto::jaeger::api_v2::KeyValue {\n                    key: \"javascript_limit\".to_string(),\n                    v_type: 2,\n                    v_int64: 9223372036854775222,\n                    ..Default::default()\n                },\n                quickwit_proto::jaeger::api_v2::KeyValue {\n                    key: \"blob\".to_string(),\n                    v_type: 4,\n                    v_binary: vec![0b0, 0b0, 0b00110000, 0b00111001],\n                    ..Default::default()\n                },\n            ],\n            ..Default::default()\n        };\n        let span_2 = quickwit_proto::jaeger::api_v2::Span {\n            operation_name: \"some-operation\".to_string(),\n            trace_id: vec![1],\n            
span_id: vec![3],\n            references: vec![quickwit_proto::jaeger::api_v2::SpanRef {\n                trace_id: vec![1],\n                span_id: vec![2],\n                ref_type: 0,\n            }],\n            start_time: Some(prost_types::Timestamp {\n                seconds: 1485467191,\n                nanos: 639875000,\n            }),\n            duration: Some(prost_types::Duration {\n                seconds: 0,\n                nanos: 5000,\n            }),\n            process: Some(quickwit_proto::jaeger::api_v2::Process {\n                service_name: \"service-y\".to_string(),\n                tags: Vec::new(),\n            }),\n            process_id: \"\".to_string(),\n            ..Default::default()\n        };\n        let span_3 = quickwit_proto::jaeger::api_v2::Span {\n            operation_name: \"reference-test\".to_string(),\n            trace_id: vec![1],\n            span_id: vec![4],\n            references: vec![\n                quickwit_proto::jaeger::api_v2::SpanRef {\n                    trace_id: vec![255],\n                    span_id: vec![255],\n                    ref_type: 0,\n                },\n                quickwit_proto::jaeger::api_v2::SpanRef {\n                    trace_id: vec![1],\n                    span_id: vec![2],\n                    ref_type: 0,\n                },\n                quickwit_proto::jaeger::api_v2::SpanRef {\n                    trace_id: vec![1],\n                    span_id: vec![2],\n                    ref_type: 1,\n                },\n            ],\n            start_time: Some(prost_types::Timestamp {\n                seconds: 1485467191,\n                nanos: 639875000,\n            }),\n            duration: Some(prost_types::Duration {\n                seconds: 0,\n                nanos: 5000,\n            }),\n            process: Some(quickwit_proto::jaeger::api_v2::Process {\n                service_name: \"service-y\".to_string(),\n                tags: Vec::new(),\n   
         }),\n            process_id: \"\".to_string(),\n            warnings: vec![\"some span warning\".to_string()],\n            ..Default::default()\n        };\n        let span_4 = quickwit_proto::jaeger::api_v2::Span {\n            operation_name: \"preserveParentID-test\".to_string(),\n            trace_id: vec![1],\n            span_id: vec![5],\n            references: vec![quickwit_proto::jaeger::api_v2::SpanRef {\n                trace_id: vec![1],\n                span_id: vec![4],\n                ref_type: 0,\n            }],\n            start_time: Some(prost_types::Timestamp {\n                seconds: 1485467191,\n                nanos: 639875000,\n            }),\n            duration: Some(prost_types::Duration {\n                seconds: 0,\n                nanos: 4000,\n            }),\n            process: Some(quickwit_proto::jaeger::api_v2::Process {\n                service_name: \"service-y\".to_string(),\n                tags: Vec::new(),\n            }),\n            process_id: \"\".to_string(),\n            warnings: vec![\"some span warning\".to_string()],\n            ..Default::default()\n        };\n        vec![span_0, span_1, span_2, span_3, span_4]\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/jaeger_api/parse_duration.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse prost_types::{Duration as ProstDuration, Timestamp as ProstTimestamp};\n\npub(crate) fn parse_duration_with_units(duration_string: String) -> anyhow::Result<ProstDuration> {\n    parse_duration_nanos(&duration_string)\n        .map(to_well_known_timestamp)\n        .map(|timestamp| ProstDuration {\n            seconds: timestamp.seconds,\n            nanos: timestamp.nanos,\n        })\n        .map_err(|error| anyhow::anyhow!(\"Failed to parse duration: {:?}\", error))\n}\n\npub(crate) fn to_well_known_timestamp(timestamp_nanos: i64) -> ProstTimestamp {\n    let seconds = timestamp_nanos / 1_000_000_000;\n    let nanos = (timestamp_nanos % 1_000_000_000) as i32;\n    ProstTimestamp { seconds, nanos }\n}\n\n/// Parses a duration string and return duration in nanoseconds.\n/// A duration string is a possibly signed sequence of decimal numbers, each\n/// with optional fraction and a unit suffix, such as \"300ms\", \"-1.5h\".\n///\n/// Valid time units are \"ns\", \"us\" (or \"µs\"), \"ms\", \"s\", \"m\", \"h\".\nfn parse_duration_nanos(input: &str) -> anyhow::Result<i64> {\n    let mut num_str = String::new();\n    for ch in input.trim().chars() {\n        if ch.is_ascii_digit() || ch == '.' 
|| ch == '-' {\n            num_str.push(ch);\n            continue;\n        }\n        if ch.is_alphabetic() {\n            // Slice the trimmed input so surrounding whitespace does not end up in the unit.\n            let unit = &input.trim()[num_str.len()..];\n            let num: f64 = num_str.parse()?;\n            let duration: f64 = match unit {\n                \"ns\" => num,\n                \"us\" | \"µs\" => num * 1000.0,\n                \"ms\" => num * 1_000_000.0,\n                \"s\" => num * 1_000_000_000.0,\n                \"m\" => num * 60.0 * 1_000_000_000.0,\n                \"h\" => num * 3600.0 * 1_000_000_000.0,\n                _ => anyhow::bail!(\"Invalid time unit: {}\", unit),\n            };\n            // Check the scaled value, not the raw number, against the `i64` range.\n            if duration < i64::MIN as f64 || duration > i64::MAX as f64 {\n                anyhow::bail!(\"Invalid duration: {}\", num_str)\n            }\n            return Ok(duration.round() as i64);\n        } else {\n            anyhow::bail!(\"Invalid duration string\")\n        }\n    }\n    anyhow::bail!(\"Invalid duration string\")\n}\n\n#[cfg(test)]\nmod tests {\n    use crate::jaeger_api::parse_duration::parse_duration_nanos;\n\n    #[test]\n    fn test_parse_duration_nanos() {\n        // Test valid duration strings\n        assert_eq!(parse_duration_nanos(\"300ns\").unwrap(), 300);\n        assert_eq!(parse_duration_nanos(\"1us\").unwrap(), 1000);\n        assert_eq!(parse_duration_nanos(\"2.5ms\").unwrap(), 2500000);\n        assert_eq!(parse_duration_nanos(\"3s\").unwrap(), 3000000000);\n        assert_eq!(parse_duration_nanos(\"4m\").unwrap(), 240000000000);\n        assert_eq!(parse_duration_nanos(\"5h\").unwrap(), 18000000000000);\n        assert_eq!(parse_duration_nanos(\"-100ns\").unwrap(), -100);\n        assert_eq!(parse_duration_nanos(\"-2us\").unwrap(), -2000);\n        assert_eq!(parse_duration_nanos(\"-3.5ms\").unwrap(), -3500000);\n        assert_eq!(parse_duration_nanos(\"-4s\").unwrap(), -4000000000);\n        assert_eq!(parse_duration_nanos(\"-5m\").unwrap(), -300000000000);\n        
assert_eq!(parse_duration_nanos(\"-6h\").unwrap(), -21600000000000);\n\n        // Test invalid duration strings\n        assert!(parse_duration_nanos(\"abc\").is_err());\n        assert!(parse_duration_nanos(\"1.2.3s\").is_err());\n        assert!(parse_duration_nanos(\"1-.23s\").is_err());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/jaeger_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::time::Instant;\n\nuse itertools::Itertools;\nuse quickwit_jaeger::JaegerService;\nuse quickwit_proto::jaeger::storage::v1::{\n    FindTracesRequest, GetOperationsRequest, GetServicesRequest, GetTraceRequest,\n    SpansResponseChunk, TraceQueryParameters,\n};\nuse quickwit_proto::tonic;\nuse tokio_stream::StreamExt;\nuse tokio_stream::wrappers::ReceiverStream;\nuse tracing::error;\nuse warp::hyper::StatusCode;\nuse warp::{Filter, Rejection};\n\nuse super::model::build_jaeger_traces;\nuse super::parse_duration::{parse_duration_with_units, to_well_known_timestamp};\nuse crate::jaeger_api::model::{\n    DEFAULT_NUMBER_OF_TRACES, JaegerError, JaegerResponseBody, JaegerSpan, JaegerTrace,\n    TracesSearchQueryParams,\n};\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::RestApiResponse;\nuse crate::search_api::extract_index_id_patterns;\nuse crate::{BodyFormat, require};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(\n    jaeger_services_handler,\n    jaeger_service_operations_handler,\n    jaeger_traces_search_handler,\n    jaeger_traces_handler\n))]\npub(crate) struct JaegerApi;\n\n/// Setup Jaeger API handlers\n///\n/// This is where all Jaeger handlers\n/// should be registered.\n/// Request are executed on the `otel-traces-v0_*` indexes.\npub(crate) fn jaeger_api_handlers(\n    
jaeger_service_opt: Option<JaegerService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    jaeger_services_handler(jaeger_service_opt.clone())\n        .or(jaeger_service_operations_handler(\n            jaeger_service_opt.clone(),\n        ))\n        .or(jaeger_traces_search_handler(jaeger_service_opt.clone()))\n        .or(jaeger_traces_handler(jaeger_service_opt.clone()))\n        .recover(recover_fn)\n        .boxed()\n}\n\nfn jaeger_api_path_filter() -> impl Filter<Extract = (Vec<String>,), Error = Rejection> + Clone {\n    warp::path!(String / \"jaeger\" / \"api\" / ..)\n        .and(warp::get())\n        .and_then(extract_index_id_patterns)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Jaeger\",\n    path = \"/{otel-traces-index-id}/jaeger/api/services\",\n    responses(\n        (status = 200, description = \"Successfully fetched service names.\", body = JaegerResponseBody )\n    ),\n    params(\n        (\"otel-traces-index-id\" = String, Path, description = \"The name of the index to get services for.\")\n    )\n)]\npub fn jaeger_services_handler(\n    jaeger_service_opt: Option<JaegerService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    jaeger_api_path_filter()\n        .and(warp::path!(\"services\"))\n        .and(require(jaeger_service_opt))\n        .then(jaeger_services)\n        .map(|result| make_jaeger_api_response(result, BodyFormat::default()))\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Jaeger\",\n    path = \"/{otel-traces-index-id}/jaeger/api/services/{service}/operations\",\n    responses(\n        (status = 200, description = \"Successfully fetched operation names for the given service.\", body = JaegerResponseBody )\n    ),\n    params(\n        (\"otel-traces-index-id\" = String, Path, description = \"The name of the index to get operations for.\"),\n        (\"service\" = String, Path, description = \"The name of the service to get operations for.\"),\n    
)\n)]\npub fn jaeger_service_operations_handler(\n    jaeger_service_opt: Option<JaegerService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    jaeger_api_path_filter()\n        .and(warp::path!(\"services\" / String / \"operations\"))\n        .and(require(jaeger_service_opt))\n        .then(jaeger_service_operations)\n        .map(|result| make_jaeger_api_response(result, BodyFormat::default()))\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Jaeger\",\n    path = \"/{otel-traces-index-id}/jaeger/api/traces\",\n    responses(\n        (status = 200, description = \"Successfully fetched traces information.\", body = JaegerResponseBody )\n    ),\n    params(\n        (\"otel-traces-index-id\" = String, Path, description = \"The name of the index to get traces for.\"),\n        (\"service\" = Option<String>, Query, description = \"The service name.\"),\n        (\"operation\" = Option<String>, Query, description = \"The operation name.\"),\n        (\"start\" = Option<i64>, Query, description = \"The start time in microseconds.\"),\n        (\"end\" = Option<i64>, Query, description = \"The end time in microseconds.\"),\n        (\"tags\" = Option<String>, Query, description = \"Tags to filter on, given as a URL-encoded JSON object mapping tag names to string values.\"),\n        (\"min_duration\" = Option<String>, Query, description = \"Filters all traces with a duration higher than the set value. Possible values are 1.2s, 100ms, 500us.\"),\n        (\"max_duration\" = Option<String>, Query, description = \"Filters all traces with a duration lower than the set value. 
Possible values are 1.2s, 100ms, 500us.\"),\n        (\"limit\" = Option<i32>, Query, description = \"Limits the number of traces returned.\"),\n    )\n)]\npub fn jaeger_traces_search_handler(\n    jaeger_service_opt: Option<JaegerService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    jaeger_api_path_filter()\n        .and(warp::path!(\"traces\"))\n        .and(warp::query())\n        .and(require(jaeger_service_opt))\n        .then(jaeger_traces_search)\n        .map(|result| make_jaeger_api_response(result, BodyFormat::default()))\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Jaeger\",\n    path = \"/{otel-traces-index-id}/jaeger/api/traces/{id}\",\n    responses(\n        (status = 200, description = \"Successfully fetched traces spans for the provided trace ID.\", body = JaegerResponseBody )\n    ),\n    params(\n        (\"otel-traces-index-id\" = String, Path, description = \"The name of the index to get traces for.\"),\n        (\"id\" = String, Path, description = \"The ID of the trace to get spans for.\"),\n    )\n)]\npub fn jaeger_traces_handler(\n    jaeger_service_opt: Option<JaegerService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    jaeger_api_path_filter()\n        .and(warp::path!(\"traces\" / String))\n        .and(warp::get())\n        .and(require(jaeger_service_opt))\n        .then(jaeger_get_trace_by_id)\n        .map(|result| make_jaeger_api_response(result, BodyFormat::default()))\n}\n\nasync fn jaeger_services(\n    index_id_patterns: Vec<String>,\n    jaeger_service: JaegerService,\n) -> Result<JaegerResponseBody<Vec<String>>, JaegerError> {\n    let get_services_response = jaeger_service\n        .get_services_for_indexes(GetServicesRequest {}, index_id_patterns)\n        .await\n        .map_err(|error| JaegerError {\n            status: StatusCode::INTERNAL_SERVER_ERROR,\n            message: format!(\"failed to fetch services: {error}\"),\n        })?;\n    
Ok(JaegerResponseBody::<Vec<String>> {\n        data: get_services_response.services,\n    })\n}\n\nasync fn jaeger_service_operations(\n    index_id_patterns: Vec<String>,\n    service_name: String,\n    jaeger_service: JaegerService,\n) -> Result<JaegerResponseBody<Vec<String>>, JaegerError> {\n    let get_operations_request = GetOperationsRequest {\n        service: service_name,\n        span_kind: \"\".to_string(),\n    };\n    let get_operations_response = jaeger_service\n        .get_operations_for_indexes(get_operations_request, index_id_patterns)\n        .await\n        .map_err(|error| JaegerError {\n            status: StatusCode::INTERNAL_SERVER_ERROR,\n            message: format!(\"failed to fetch operations: {error}\"),\n        })?;\n\n    let operations = get_operations_response\n        .operations\n        .into_iter()\n        .map(|op| op.name)\n        .collect_vec();\n    Ok(JaegerResponseBody::<Vec<String>> { data: operations })\n}\n\nasync fn jaeger_traces_search(\n    index_id_patterns: Vec<String>,\n    search_params: TracesSearchQueryParams,\n    jaeger_service: JaegerService,\n) -> Result<JaegerResponseBody<Vec<JaegerTrace>>, JaegerError> {\n    let duration_min = search_params\n        .min_duration\n        .map(parse_duration_with_units)\n        .transpose()?;\n    let duration_max = search_params\n        .max_duration\n        .map(parse_duration_with_units)\n        .transpose()?;\n    let tags = search_params\n        .tags\n        .clone()\n        .map(|s| {\n            serde_json::from_str::<HashMap<String, String>>(&s).map_err(|error| {\n                let error_msg = format!(\n                    \"failed to deserialize tags `{:?}`: {:?}\",\n                    search_params.tags, error\n                );\n                error!(error_msg);\n                JaegerError {\n                    status: StatusCode::INTERNAL_SERVER_ERROR,\n                    message: error_msg,\n                }\n            })\n        
})\n        .transpose()?\n        .unwrap_or(Default::default());\n    let query = TraceQueryParameters {\n        service_name: search_params.service.unwrap_or_default(),\n        operation_name: search_params.operation.unwrap_or_default(),\n        tags,\n        start_time_min: search_params\n            .start\n            .map(|ts| to_well_known_timestamp(ts * 1000)),\n        start_time_max: search_params\n            .end\n            .map(|ts| to_well_known_timestamp(ts * 1000)),\n        duration_min,\n        duration_max,\n        num_traces: search_params.limit.unwrap_or(DEFAULT_NUMBER_OF_TRACES),\n    };\n    let find_traces_request = FindTracesRequest { query: Some(query) };\n    let spans_chunk_stream = jaeger_service\n        .find_traces_for_indexes(\n            find_traces_request,\n            \"find_traces\",\n            Instant::now(),\n            index_id_patterns,\n            true,\n        )\n        .await\n        .map_err(|error| {\n            error!(error = ?error, \"failed to fetch traces\");\n            JaegerError {\n                status: StatusCode::INTERNAL_SERVER_ERROR,\n                message: \"failed to fetch traces\".to_string(),\n            }\n        })?;\n    let jaeger_spans = collect_and_build_jaeger_spans(spans_chunk_stream).await?;\n    let jaeger_traces: Vec<JaegerTrace> = build_jaeger_traces(jaeger_spans)?;\n    Ok(JaegerResponseBody {\n        data: jaeger_traces,\n    })\n}\n\nasync fn collect_and_build_jaeger_spans(\n    mut spans_chunk_stream: ReceiverStream<Result<SpansResponseChunk, tonic::Status>>,\n) -> anyhow::Result<Vec<JaegerSpan>> {\n    let mut all_spans = Vec::<JaegerSpan>::new();\n    while let Some(Ok(SpansResponseChunk { spans })) = spans_chunk_stream.next().await {\n        let jaeger_spans: Vec<JaegerSpan> =\n            spans.into_iter().map(JaegerSpan::try_from).try_collect()?;\n        all_spans.extend(jaeger_spans);\n    }\n    Ok(all_spans)\n}\n\nasync fn jaeger_get_trace_by_id(\n    
index_id_patterns: Vec<String>,\n    trace_id_string: String,\n    jaeger_service: JaegerService,\n) -> Result<JaegerResponseBody<Vec<JaegerTrace>>, JaegerError> {\n    let trace_id = hex::decode(trace_id_string.clone()).map_err(|error| {\n        error!(error = ?error, \"failed to decode trace `{}`\", trace_id_string.clone());\n        JaegerError {\n            status: StatusCode::INTERNAL_SERVER_ERROR,\n            message: \"failed to decode trace id\".to_string(),\n        }\n    })?;\n    let get_trace_request = GetTraceRequest { trace_id };\n    let spans_chunk_stream: ReceiverStream<Result<SpansResponseChunk, tonic::Status>> =\n        jaeger_service\n            .get_trace_for_indexes(\n                get_trace_request,\n                \"get_trace\",\n                Instant::now(),\n                index_id_patterns,\n            )\n            .await\n            .map_err(|error| {\n                error!(error = ?error, \"failed to fetch trace `{trace_id_string}`\");\n                JaegerError {\n                    status: StatusCode::INTERNAL_SERVER_ERROR,\n                    message: \"failed to fetch trace\".to_string(),\n                }\n            })?;\n    let jaeger_spans = collect_and_build_jaeger_spans(spans_chunk_stream).await?;\n    let jaeger_traces: Vec<JaegerTrace> = build_jaeger_traces(jaeger_spans)?;\n    Ok(JaegerResponseBody {\n        data: jaeger_traces,\n    })\n}\n\nfn make_jaeger_api_response<T: serde::Serialize>(\n    jaeger_result: Result<T, JaegerError>,\n    body_format: BodyFormat,\n) -> RestApiResponse {\n    let status_code = match &jaeger_result {\n        Ok(_) => StatusCode::OK,\n        Err(err) => err.status,\n    };\n    RestApiResponse::new(&jaeger_result, status_code, body_format)\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashMap;\n    use std::sync::Arc;\n\n    use quickwit_config::JaegerConfig;\n    use quickwit_opentelemetry::otlp::OTEL_TRACES_INDEX_ID;\n    use 
quickwit_search::MockSearchService;\n    use serde_json::Value as JsonValue;\n\n    use super::*;\n    use crate::recover_fn;\n\n    #[tokio::test]\n    async fn test_when_jaeger_not_found() {\n        let jaeger_api_handler = jaeger_api_handlers(None).recover(crate::rest::recover_fn_final);\n        let resp = warp::test::request()\n            .path(\"/otel-traces-v0_9/jaeger/api/services\")\n            .reply(&jaeger_api_handler)\n            .await;\n        assert_eq!(resp.status(), 404);\n        let error_body = serde_json::from_slice::<HashMap<String, String>>(resp.body()).unwrap();\n        assert!(error_body.contains_key(\"message\"));\n        assert_eq!(error_body.get(\"message\").unwrap(), \"Route not found\");\n    }\n\n    #[tokio::test]\n    async fn test_jaeger_services() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_list_terms()\n            .withf(|req| {\n                req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID]\n                    && req.field == \"service_name\"\n                    && req.start_timestamp.is_some()\n            })\n            .return_once(|_| {\n                Ok(quickwit_proto::search::ListTermsResponse {\n                    num_hits: 0,\n                    terms: Vec::new(),\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                })\n            });\n        let mock_search_service = Arc::new(mock_search_service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), mock_search_service);\n\n        let jaeger_api_handler = jaeger_api_handlers(Some(jaeger)).recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/otel-traces-v0_9/jaeger/api/services\")\n            .reply(&jaeger_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let actual_response_json: JsonValue = 
serde_json::from_slice(resp.body())?;\n        assert!(\n            actual_response_json\n                .get(\"data\")\n                .unwrap()\n                .as_array()\n                .unwrap()\n                .is_empty()\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_jaeger_service_operations() {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_list_terms()\n            .withf(|req| {\n                req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID]\n                    && req.field == \"span_fingerprint\"\n                    && req.start_timestamp.is_some()\n            })\n            .return_once(|_| {\n                Ok(quickwit_proto::search::ListTermsResponse {\n                    num_hits: 1,\n                    terms: Vec::new(),\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                })\n            });\n        let mock_search_service = Arc::new(mock_search_service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), mock_search_service);\n        let jaeger_api_handler = jaeger_api_handlers(Some(jaeger)).recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/otel-traces-v0_9/jaeger/api/services/service1/operations\")\n            .reply(&jaeger_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let actual_response_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        assert!(\n            actual_response_json\n                .get(\"data\")\n                .unwrap()\n                .as_array()\n                .unwrap()\n                .is_empty()\n        );\n    }\n\n    #[tokio::test]\n    async fn test_jaeger_traces_search() {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .withf(|req| {\n       
         assert!(req.query_ast.contains(\n                    \"{\\\"type\\\":\\\"term\\\",\\\"field\\\":\\\"resource_attributes.tag.first\\\",\\\"value\\\":\\\"\\\n                     common\\\"}\"\n                ));\n                assert!(req.query_ast.contains(\n                    \"{\\\"type\\\":\\\"term\\\",\\\"field\\\":\\\"resource_attributes.tag.second\\\",\\\"value\\\":\\\"\\\n                     true\\\"}\"\n                ));\n                assert!(req.query_ast.contains(\n                    \"{\\\"type\\\":\\\"term\\\",\\\"field\\\":\\\"resource_attributes.tag.second\\\",\\\"value\\\":\\\"\\\n                     true\\\"}\"\n                ));\n                // no lowerbound because minDuration < 1ms,\n                assert!(req.query_ast.contains(\n                    \"{\\\"type\\\":\\\"range\\\",\\\"field\\\":\\\"span_duration_millis\\\",\\\"lower_bound\\\":\\\"\\\n                     Unbounded\\\",\\\"upper_bound\\\":{\\\"Included\\\":1200}}\"\n                ));\n                assert_eq!(req.start_timestamp, Some(1702352106));\n                // TODO(trinity) i think we have an off by 1 here, imo this should be rounded up\n                assert_eq!(req.end_timestamp, Some(1702373706));\n                assert_eq!(\n                    req.index_id_patterns,\n                    vec![OTEL_TRACES_INDEX_ID.to_string()]\n                );\n                true\n            })\n            .return_once(|_| {\n                Ok(quickwit_proto::search::SearchResponse {\n                    num_hits: 0,\n                    hits: Vec::new(),\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                    aggregation_postcard: None,\n                    scroll_id: None,\n                    failed_splits: Vec::new(),\n                    num_successful_splits: 1,\n                })\n            });\n        let mock_search_service = Arc::new(mock_search_service);\n        let jaeger = 
JaegerService::new(JaegerConfig::default(), mock_search_service);\n        let jaeger_api_handler = jaeger_api_handlers(Some(jaeger)).recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\n                \"/otel-traces-v0_9/jaeger/api/traces?service=quickwit&\\\n                 operation=delete_splits_marked_for_deletion&minDuration=500us&maxDuration=1.2s&\\\n                 tags=%7B%22tag.first%22%3A%22common%22%2C%22tag.second%22%3A%22true%22%7D&\\\n                 limit=1&start=1702352106016000&end=1702373706016000&lookback=custom\",\n            )\n            .reply(&jaeger_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n    }\n\n    #[tokio::test]\n    async fn test_jaeger_trace_by_id() {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .withf(|req| req.index_id_patterns == vec![OTEL_TRACES_INDEX_ID.to_string()])\n            .return_once(|_| {\n                Ok(quickwit_proto::search::SearchResponse {\n                    num_hits: 0,\n                    hits: Vec::new(),\n                    elapsed_time_micros: 0,\n                    errors: Vec::new(),\n                    aggregation_postcard: None,\n                    scroll_id: None,\n                    failed_splits: Vec::new(),\n                    num_successful_splits: 1,\n                })\n            });\n        let mock_search_service = Arc::new(mock_search_service);\n        let jaeger = JaegerService::new(JaegerConfig::default(), mock_search_service);\n\n        let jaeger_api_handler = jaeger_api_handlers(Some(jaeger)).recover(recover_fn);\n        let resp = warp::test::request()\n            .path(\"/otel-traces-v0_9/jaeger/api/traces/1506026ddd216249555653218dc88a6c\")\n            .reply(&jaeger_api_handler)\n            .await;\n\n        assert_eq!(resp.status(), 200);\n        let actual_response_json: JsonValue = 
serde_json::from_slice(resp.body()).unwrap();\n        assert!(\n            actual_response_json\n                .get(\"data\")\n                .unwrap()\n                .as_array()\n                .unwrap()\n                .is_empty()\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![recursion_limit = \"256\"]\n\nmod build_info;\nmod cluster_api;\nmod decompression;\nmod delete_task_api;\nmod developer_api;\nmod elasticsearch_api;\nmod format;\nmod grpc;\nmod health_check_api;\nmod index_api;\nmod indexing_api;\nmod ingest_api;\nmod jaeger_api;\nmod load_shield;\nmod metrics;\nmod metrics_api;\nmod node_info_handler;\nmod openapi;\nmod otlp_api;\nmod rate_modulator;\nmod rest;\nmod rest_api_response;\nmod search_api;\npub(crate) mod simple_list;\npub mod tcp_listener;\nmod template_api;\nmod ui_handler;\n\nuse std::collections::{HashMap, HashSet};\nuse std::convert::Infallible;\nuse std::fs;\nuse std::net::SocketAddr;\nuse std::num::NonZeroUsize;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse anyhow::{Context, bail};\nuse bytesize::ByteSize;\npub(crate) use decompression::Body;\npub use format::BodyFormat;\nuse futures::StreamExt;\nuse itertools::Itertools;\nuse once_cell::sync::Lazy;\nuse quickwit_actors::{ActorExitStatus, Mailbox, SpawnContext, Universe};\nuse quickwit_cluster::{\n    Cluster, ClusterChange, ClusterChangeStream, ClusterNode, ListenerHandle, start_cluster_service,\n};\nuse quickwit_common::pubsub::{EventBroker, EventSubscriptionHandle};\nuse quickwit_common::rate_limiter::RateLimiterSettings;\nuse quickwit_common::retry::RetryParams;\nuse quickwit_common::runtimes::RuntimesConfig;\nuse 
quickwit_common::tower::{\n    BalanceChannel, BoxFutureInfaillible, BufferLayer, Change, CircuitBreakerEvaluator,\n    ConstantRate, EstimateRateLayer, EventListenerLayer, GrpcMetricsLayer, LoadShedLayer,\n    RateLimitLayer, RetryLayer, RetryPolicy, SmaRateEstimator, TimeoutLayer,\n};\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{get_bool_from_env, spawn_named_task};\nuse quickwit_config::service::QuickwitService;\nuse quickwit_config::{ClusterConfig, IngestApiConfig, NodeConfig};\nuse quickwit_control_plane::control_plane::{ControlPlane, ControlPlaneEventSubscriber};\nuse quickwit_control_plane::{IndexerNodeInfo, IndexerPool};\nuse quickwit_index_management::{IndexService as IndexManager, IndexServiceError};\nuse quickwit_indexing::actors::IndexingService;\nuse quickwit_indexing::models::ShardPositionsService;\nuse quickwit_indexing::start_indexing_service;\nuse quickwit_ingest::{\n    GetMemoryCapacity, IngestRequest, IngestRouter, IngestServiceClient, Ingester, IngesterPool,\n    IngesterPoolEntry, LocalShardsUpdate, get_idle_shard_timeout,\n    setup_ingester_capacity_update_listener, setup_local_shards_update_listener,\n    start_ingest_api_service, try_get_ingester_status, wait_for_ingester_decommission,\n    wait_for_ingester_status,\n};\nuse quickwit_jaeger::JaegerService;\nuse quickwit_janitor::{JanitorService, start_janitor_service};\nuse quickwit_metastore::{\n    ControlPlaneMetastore, ListIndexesMetadataResponseExt, MetastoreResolver,\n};\nuse quickwit_opentelemetry::otlp::{OtlpGrpcLogsService, OtlpGrpcTracesService};\nuse quickwit_proto::control_plane::ControlPlaneServiceClient;\nuse quickwit_proto::indexing::{IndexingServiceClient, ShardPositionsUpdate};\nuse quickwit_proto::ingest::ingester::{\n    IngesterService, IngesterServiceClient, IngesterServiceTowerLayerStack, IngesterStatus,\n    PersistFailureReason, PersistResponse,\n};\nuse quickwit_proto::ingest::router::IngestRouterServiceClient;\nuse quickwit_proto::ingest::{IngestV2Error, 
RateLimitingCause};\nuse quickwit_proto::metastore::{\n    EntityKind, ListIndexesMetadataRequest, MetastoreError, MetastoreService,\n    MetastoreServiceClient,\n};\nuse quickwit_proto::search::ReportSplitsRequest;\nuse quickwit_proto::types::NodeId;\nuse quickwit_search::{\n    SearchJobPlacer, SearchService, SearchServiceClient, SearcherContext, SearcherPool,\n    create_search_client_from_channel, start_searcher_service,\n};\nuse quickwit_storage::{SplitCache, StorageResolver};\nuse tcp_listener::TcpListenerResolver;\nuse tokio::sync::oneshot;\nuse tonic::codec::CompressionEncoding;\nuse tonic_health::ServingStatus;\nuse tonic_health::server::HealthReporter;\nuse tower::ServiceBuilder;\nuse tower::timeout::Timeout;\nuse tracing::{debug, error, info, warn};\nuse warp::{Filter, Rejection};\n\npub use crate::build_info::{BuildInfo, RuntimeInfo};\npub use crate::index_api::{ListSplitsQueryParams, ListSplitsResponse};\npub use crate::ingest_api::{RestIngestResponse, RestParseFailure};\npub use crate::metrics::SERVE_METRICS;\nuse crate::rate_modulator::RateModulator;\n#[cfg(test)]\nuse crate::rest::recover_fn;\npub use crate::search_api::{SearchRequestQueryString, SortBy, search_request_from_api_request};\n\nconst READINESS_REPORTING_INTERVAL: Duration = if cfg!(any(test, feature = \"testsuite\")) {\n    Duration::from_millis(25)\n} else {\n    Duration::from_secs(10)\n};\n\nconst METASTORE_CLIENT_MAX_CONCURRENCY_ENV_KEY: &str = \"QW_METASTORE_CLIENT_MAX_CONCURRENCY\";\nconst DEFAULT_METASTORE_CLIENT_MAX_CONCURRENCY: usize = 6;\nconst DISABLE_DELETE_TASK_SERVICE_ENV_KEY: &str = \"QW_DISABLE_DELETE_TASK_SERVICE\";\n\npub type EnvFilterReloadFn = Arc<dyn Fn(&str) -> anyhow::Result<()> + Send + Sync>;\n\npub fn do_nothing_env_filter_reload_fn() -> EnvFilterReloadFn {\n    Arc::new(|_| Ok(()))\n}\n\nfn get_metastore_client_max_concurrency() -> usize {\n    quickwit_common::get_from_env(\n        METASTORE_CLIENT_MAX_CONCURRENCY_ENV_KEY,\n        
DEFAULT_METASTORE_CLIENT_MAX_CONCURRENCY,\n        false,\n    )\n}\n\nstatic CP_GRPC_CLIENT_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"control_plane\", \"client\"));\nstatic CP_GRPC_SERVER_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"control_plane\", \"server\"));\n\nstatic INDEXING_GRPC_CLIENT_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"indexing\", \"client\"));\npub(crate) static INDEXING_GRPC_SERVER_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"indexing\", \"server\"));\n\nstatic INGEST_GRPC_CLIENT_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"ingest\", \"client\"));\nstatic INGEST_GRPC_SERVER_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"ingest\", \"server\"));\n\nstatic METASTORE_GRPC_CLIENT_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"metastore\", \"client\"));\nstatic METASTORE_GRPC_SERVER_METRICS_LAYER: Lazy<GrpcMetricsLayer> =\n    Lazy::new(|| GrpcMetricsLayer::new(\"metastore\", \"server\"));\n\nstatic GRPC_INGESTER_SERVICE_TIMEOUT: Duration = Duration::from_secs(30);\nstatic GRPC_INDEXING_SERVICE_TIMEOUT: Duration = Duration::from_secs(30);\nstatic GRPC_METASTORE_SERVICE_TIMEOUT: Duration = Duration::from_secs(10);\n\nstruct QuickwitServices {\n    pub node_config: Arc<NodeConfig>,\n    pub cluster: Cluster,\n    pub metastore_server_opt: Option<MetastoreServiceClient>,\n    pub metastore_client: MetastoreServiceClient,\n    pub control_plane_server_opt: Option<Mailbox<ControlPlane>>,\n    pub control_plane_client: ControlPlaneServiceClient,\n    pub index_manager: IndexManager,\n    pub indexing_service_opt: Option<Mailbox<IndexingService>>,\n    // Ingest v1\n    pub ingest_service: IngestServiceClient,\n    // Ingest v2\n    pub ingest_router_opt: Option<IngestRouter>,\n    pub 
ingest_router_service: IngestRouterServiceClient,\n    ingester_opt: Option<Ingester>,\n\n    pub janitor_service_opt: Option<Mailbox<JanitorService>>,\n    pub jaeger_service_opt: Option<JaegerService>,\n    pub otlp_logs_service_opt: Option<OtlpGrpcLogsService>,\n    pub otlp_traces_service_opt: Option<OtlpGrpcTracesService>,\n    /// We do have a search service even on nodes that are not running `search`.\n    /// It is only used to serve the rest API calls and will only execute\n    /// the root requests.\n    pub search_service: Arc<dyn SearchService>,\n\n    pub env_filter_reload_fn: EnvFilterReloadFn,\n\n    /// The control plane listens to various events.\n    /// We must maintain a reference to the subscription handles to continue receiving\n    /// notifications. Otherwise, the subscriptions are dropped.\n    _local_shards_update_listener_handle_opt: Option<ListenerHandle>,\n    _report_splits_subscription_handle_opt: Option<EventSubscriptionHandle>,\n}\n\nimpl QuickwitServices {\n    /// Client in the type is a bit misleading here.\n    ///\n    /// The object returned is the implementation of the local ingester service,\n    /// with all of the appropriate tower layers.\n    pub fn ingester_service(&self) -> Option<IngesterServiceClient> {\n        let ingester = self.ingester_opt.clone()?;\n        Some(ingester_service_layer_stack(IngesterServiceClient::tower()).build(ingester))\n    }\n}\n\nasync fn balance_channel_for_service(\n    cluster: &Cluster,\n    service: QuickwitService,\n) -> BalanceChannel<SocketAddr> {\n    let cluster_change_stream = cluster.change_stream();\n    let service_change_stream = cluster_change_stream.filter_map(move |cluster_change| {\n        Box::pin(async move {\n            match cluster_change {\n                ClusterChange::Add(node) if node.enabled_services().contains(&service) => {\n                    let chitchat_id = node.chitchat_id();\n                    info!(\n                        node_id = 
chitchat_id.node_id,\n                        generation_id = chitchat_id.generation_id,\n                        \"adding node `{}` to {} pool\",\n                        chitchat_id.node_id,\n                        service.as_str().replace('_', \" \"),\n                    );\n                    Some(Change::Insert(node.grpc_advertise_addr(), node.channel()))\n                }\n                ClusterChange::Remove(node) if node.enabled_services().contains(&service) => {\n                    let chitchat_id = node.chitchat_id();\n                    info!(\n                        node_id = chitchat_id.node_id,\n                        generation_id = chitchat_id.generation_id,\n                        \"removing node `{}` from {} pool\",\n                        chitchat_id.node_id,\n                        service.as_str().replace('_', \" \"),\n                    );\n                    Some(Change::Remove(node.grpc_advertise_addr()))\n                }\n                _ => None,\n            }\n        })\n    });\n    BalanceChannel::from_stream(service_change_stream)\n}\n\nasync fn start_ingest_client_if_needed(\n    node_config: &NodeConfig,\n    universe: &Universe,\n    cluster: &Cluster,\n) -> anyhow::Result<IngestServiceClient> {\n    if node_config.is_service_enabled(QuickwitService::Indexer) {\n        let ingest_api_service = start_ingest_api_service(\n            universe,\n            &node_config.data_dir_path,\n            &node_config.ingest_api_config,\n        )\n        .await?;\n        let num_buckets = NonZeroUsize::new(60).expect(\"60 should be non-zero\");\n        let rate_estimator = SmaRateEstimator::new(\n            num_buckets,\n            Duration::from_secs(10),\n            Duration::from_millis(100),\n        );\n        let memory_capacity = ingest_api_service.ask(GetMemoryCapacity).await?;\n        let min_rate = ConstantRate::new(ByteSize::mib(1).as_u64(), Duration::from_millis(100));\n        let rate_modulator = 
RateModulator::new(rate_estimator.clone(), memory_capacity, min_rate);\n        let ingest_service = IngestServiceClient::tower()\n            .stack_ingest_layer(\n                ServiceBuilder::new()\n                    .layer(EstimateRateLayer::<IngestRequest, _>::new(rate_estimator))\n                    .layer(BufferLayer::new(100))\n                    .layer(RateLimitLayer::new(rate_modulator))\n                    .into_inner(),\n            )\n            .build_from_mailbox(ingest_api_service);\n        Ok(ingest_service)\n    } else {\n        let balance_channel = balance_channel_for_service(cluster, QuickwitService::Indexer).await;\n        let ingest_service = IngestServiceClient::from_balance_channel(\n            balance_channel,\n            node_config.grpc_config.max_message_size,\n            node_config.ingest_api_config.grpc_compression_encoding(),\n        );\n        Ok(ingest_service)\n    }\n}\n\nasync fn start_control_plane_if_needed(\n    node_config: &NodeConfig,\n    cluster: &Cluster,\n    event_broker: &EventBroker,\n    metastore_client: &MetastoreServiceClient,\n    universe: &Universe,\n    indexer_pool: &IndexerPool,\n    ingester_pool: &IngesterPool,\n) -> anyhow::Result<(Option<Mailbox<ControlPlane>>, ControlPlaneServiceClient)> {\n    if node_config.is_service_enabled(QuickwitService::ControlPlane) {\n        check_cluster_configuration(\n            &node_config.enabled_services,\n            &node_config.peer_seeds,\n            metastore_client.clone(),\n        )\n        .await?;\n\n        let self_node_id: NodeId = cluster.self_node_id().into();\n\n        let control_plane_mailbox = setup_control_plane(\n            universe,\n            event_broker,\n            self_node_id,\n            cluster.clone(),\n            indexer_pool.clone(),\n            ingester_pool.clone(),\n            metastore_client.clone(),\n            node_config.default_index_root_uri.clone(),\n            
&node_config.ingest_api_config,\n        )\n        .await?;\n\n        let control_plane_server_opt = Some(control_plane_mailbox.clone());\n        let control_plane_client = ControlPlaneServiceClient::tower()\n            .stack_layer(CP_GRPC_SERVER_METRICS_LAYER.clone())\n            .stack_layer(LoadShedLayer::new(100))\n            .build_from_mailbox(control_plane_mailbox);\n        Ok((control_plane_server_opt, control_plane_client))\n    } else {\n        let balance_channel =\n            balance_channel_for_service(cluster, QuickwitService::ControlPlane).await;\n\n        // If the node is a metastore, we skip this check in order to avoid a deadlock.\n        // If the node is a searcher, we skip this check because the searcher does not need to.\n        if !node_config.is_service_enabled(QuickwitService::Metastore)\n            && node_config.enabled_services != HashSet::from([QuickwitService::Searcher])\n        {\n            info!(\"connecting to control plane\");\n\n            if !balance_channel\n                .wait_for(Duration::from_secs(300), |connections| {\n                    !connections.is_empty()\n                })\n                .await\n            {\n                bail!(\"could not find control plane in the cluster\");\n            }\n        }\n        let control_plane_server_opt = None;\n        let control_plane_client = ControlPlaneServiceClient::tower()\n            .stack_layer(CP_GRPC_CLIENT_METRICS_LAYER.clone())\n            .build_from_balance_channel(\n                balance_channel,\n                node_config.grpc_config.max_message_size,\n                None,\n            );\n        Ok((control_plane_server_opt, control_plane_client))\n    }\n}\n\nfn start_shard_positions_service(\n    ingester_opt: Option<Ingester>,\n    cluster: Cluster,\n    event_broker: EventBroker,\n    spawn_ctx: SpawnContext,\n) {\n    // We spawn a task here, because we need the ingester to be ready before spawning\n    // the 
`ShardPositionsService`. If we don't, all the events we emit too early will be dismissed.\n    tokio::spawn(async move {\n        if let Some(ingester) = &ingester_opt\n            && wait_for_ingester_status(ingester, IngesterStatus::Ready, Duration::from_secs(300))\n                .await\n                .is_err()\n        {\n            warn!(\"ingester failed to reach ready status\");\n        }\n        ShardPositionsService::spawn(&spawn_ctx, event_broker, cluster);\n    });\n}\n\n/// Waits for the shutdown signal and notifies all other services when it\n/// occurs.\n///\n/// Usually called when receiving a SIGTERM signal, e.g. k8s trying to\n/// decommission a pod.\nasync fn shutdown_signal_handler(\n    shutdown_signal: BoxFutureInfaillible<()>,\n    universe: Universe,\n    ingester_opt: Option<Ingester>,\n    grpc_shutdown_trigger_tx: oneshot::Sender<()>,\n    rest_shutdown_trigger_tx: oneshot::Sender<()>,\n    cluster: Cluster,\n) -> HashMap<String, ActorExitStatus> {\n    shutdown_signal.await;\n    // We must decommission the ingester first before terminating the indexing pipelines that\n    // may consume from it. 
We also need to keep the gRPC server running while doing so.\n    if let Some(ingester) = &ingester_opt\n        && let Err(error) = wait_for_ingester_decommission(ingester, Duration::from_secs(300)).await\n    {\n        error!(\"failed to decommission ingester gracefully: {:?}\", error);\n    }\n    let actor_exit_statuses = universe.quit().await;\n\n    if grpc_shutdown_trigger_tx.send(()).is_err() {\n        debug!(\"gRPC server shutdown signal receiver was dropped\");\n    }\n    if rest_shutdown_trigger_tx.send(()).is_err() {\n        debug!(\"REST server shutdown signal receiver was dropped\");\n    }\n    if let Err(err) = cluster.initiate_shutdown().await {\n        debug!(\"{err}\");\n    }\n    actor_exit_statuses\n}\n\npub async fn serve_quickwit(\n    node_config: NodeConfig,\n    runtimes_config: RuntimesConfig,\n    metastore_resolver: MetastoreResolver,\n    storage_resolver: StorageResolver,\n    tcp_listener_resolver: impl TcpListenerResolver,\n    shutdown_signal: BoxFutureInfaillible<()>,\n    env_filter_reload_fn: EnvFilterReloadFn,\n) -> anyhow::Result<HashMap<String, ActorExitStatus>> {\n    let cluster = start_cluster_service(&node_config)\n        .await\n        .context(\"failed to start cluster service\")?;\n\n    let event_broker = EventBroker::default();\n    let indexer_pool = IndexerPool::default();\n    let ingester_pool = IngesterPool::default();\n    let universe = Universe::new();\n    let grpc_config = node_config.grpc_config.clone();\n\n    // Instantiate a metastore \"server\" if the `metastore` role is enabled on the node.\n    let metastore_server_opt: Option<MetastoreServiceClient> =\n        if node_config.is_service_enabled(QuickwitService::Metastore) {\n            let metastore: MetastoreServiceClient = metastore_resolver\n                .resolve(&node_config.metastore_uri)\n                .await\n                .with_context(|| {\n                    format!(\n                        \"failed to resolve metastore 
uri `{}`\",\n                        node_config.metastore_uri\n                    )\n                })?;\n            let max_in_flight_requests = if node_config.metastore_uri.protocol().is_database() {\n                node_config\n                    .metastore_configs\n                    .find_postgres()\n                    .map(|config| config.max_connections.get() * 2)\n                    .unwrap_or_default()\n                    .max(100)\n            } else {\n                100\n            };\n            // These layers apply to all the RPCs of the metastore.\n            let shared_layer = ServiceBuilder::new()\n                .layer(METASTORE_GRPC_SERVER_METRICS_LAYER.clone())\n                .layer(LoadShedLayer::new(max_in_flight_requests))\n                .into_inner();\n            let broker_layer = EventListenerLayer::new(event_broker.clone());\n            let metastore = MetastoreServiceClient::tower()\n                .stack_layer(shared_layer)\n                .stack_create_index_layer(broker_layer.clone())\n                .stack_delete_index_layer(broker_layer.clone())\n                .stack_add_source_layer(broker_layer.clone())\n                .stack_delete_source_layer(broker_layer.clone())\n                .stack_toggle_source_layer(broker_layer)\n                .build(metastore);\n            Some(metastore)\n        } else {\n            None\n        };\n    // Instantiate a metastore client, either local if available or remote otherwise.\n    let metastore_client: MetastoreServiceClient =\n        if let Some(metastore_server) = &metastore_server_opt {\n            metastore_server.clone()\n        } else {\n            info!(\"connecting to metastore\");\n\n            let balance_channel =\n                balance_channel_for_service(&cluster, QuickwitService::Metastore).await;\n\n            if !balance_channel\n                .wait_for(Duration::from_secs(300), |connections| {\n                    
!connections.is_empty()\n                })\n                .await\n            {\n                bail!(\"could not find any metastore node in the cluster\");\n            }\n            MetastoreServiceClient::tower()\n                .stack_layer(RetryLayer::new(RetryPolicy::from(RetryParams::standard())))\n                .stack_layer(TimeoutLayer::new(GRPC_METASTORE_SERVICE_TIMEOUT))\n                .stack_layer(METASTORE_GRPC_CLIENT_METRICS_LAYER.clone())\n                .stack_layer(tower::limit::GlobalConcurrencyLimitLayer::new(\n                    get_metastore_client_max_concurrency(),\n                ))\n                .build_from_balance_channel(balance_channel, grpc_config.max_message_size, None)\n        };\n    // Instantiate a control plane server if the `control-plane` role is enabled on the node.\n    // Otherwise, instantiate a control plane client.\n    let (control_plane_server_opt, control_plane_client) = start_control_plane_if_needed(\n        &node_config,\n        &cluster,\n        &event_broker,\n        &metastore_client,\n        &universe,\n        &indexer_pool,\n        &ingester_pool,\n    )\n    .await\n    .context(\"failed to start control plane service\")?;\n\n    // Set up the \"control plane proxy\" for the metastore.\n    let metastore_through_control_plane = MetastoreServiceClient::new(ControlPlaneMetastore::new(\n        control_plane_client.clone(),\n        metastore_client,\n    ));\n\n    // Setup ingest service v1.\n    let ingest_service = start_ingest_client_if_needed(&node_config, &universe, &cluster)\n        .await\n        .context(\"failed to start ingest v1 service\")?;\n\n    let indexing_service_opt = if node_config.is_service_enabled(QuickwitService::Indexer) {\n        let indexing_service = start_indexing_service(\n            &universe,\n            &node_config,\n            runtimes_config.num_threads_blocking,\n            cluster.clone(),\n            metastore_through_control_plane.clone(),\n   
         ingester_pool.clone(),\n            storage_resolver.clone(),\n            event_broker.clone(),\n        )\n        .await\n        .context(\"failed to start indexing service\")?;\n        Some(indexing_service)\n    } else {\n        None\n    };\n\n    // Setup the indexer pool to track cluster changes.\n    setup_indexer_pool(\n        cluster.change_stream(),\n        indexing_service_opt.clone(),\n        indexer_pool,\n        node_config.grpc_config.max_message_size,\n    );\n\n    // Setup ingest service v2.\n    let (ingest_router, ingest_router_service, ingester_opt) = setup_ingest_v2(\n        &node_config,\n        &cluster,\n        &event_broker,\n        control_plane_client.clone(),\n        ingester_pool,\n    )\n    .await\n    .context(\"failed to start ingest v2 service\")?;\n\n    if node_config.is_service_enabled(QuickwitService::Indexer)\n        || node_config.is_service_enabled(QuickwitService::ControlPlane)\n    {\n        start_shard_positions_service(\n            ingester_opt.clone(),\n            cluster.clone(),\n            event_broker.clone(),\n            universe.spawn_ctx().clone(),\n        );\n    }\n\n    // Any node can serve index management requests (create/update/delete index, add/remove source,\n    // etc.), so we always instantiate an index manager.\n    let mut index_manager = IndexManager::new(\n        metastore_through_control_plane.clone(),\n        storage_resolver.clone(),\n    );\n\n    if node_config.is_service_enabled(QuickwitService::Indexer)\n        && node_config.indexer_config.enable_otlp_endpoint\n    {\n        {\n            let otel_logs_index_config =\n                OtlpGrpcLogsService::index_config(&node_config.default_index_root_uri)\n                    .context(\"failed to load OTEL logs index config\")?;\n            let otel_traces_index_config =\n                OtlpGrpcTracesService::index_config(&node_config.default_index_root_uri)\n                    .context(\"failed to load 
OTEL traces index config\")?;\n\n            for (index_name, index_config) in [\n                (\"OTEL logs\", otel_logs_index_config),\n                (\"OTEL traces\", otel_traces_index_config),\n            ] {\n                match index_manager.create_index(index_config, false).await {\n                    Ok(_)\n                    | Err(IndexServiceError::Metastore(MetastoreError::AlreadyExists(\n                        EntityKind::Index { .. },\n                    ))) => {}\n                    Err(error) => bail!(\"failed to create {index_name} index: {error}\",),\n                };\n            }\n        }\n    }\n\n    let split_cache_opt: Option<Arc<SplitCache>> =\n        if let Some(split_cache_limits) = node_config.searcher_config.split_cache {\n            let split_cache = SplitCache::with_root_path(\n                node_config.data_dir_path.join(\"searcher-split-cache\"),\n                storage_resolver.clone(),\n                split_cache_limits,\n            )\n            .context(\"failed to load searcher split cache\")?;\n            Some(split_cache)\n        } else {\n            None\n        };\n\n    // Initialize Lambda invoker if enabled and searcher service is running\n    let searcher_context = if node_config.is_service_enabled(QuickwitService::Searcher) {\n        if let Some(lambda_config) = &node_config.searcher_config.lambda {\n            #[cfg(feature = \"lambda\")]\n            {\n                info!(\"initializing AWS Lambda invoker for search\");\n                warn!(\"offloading to lambda is EXPERIMENTAL. 
Use at your own risk\");\n                let invoker =\n                    quickwit_lambda_client::try_get_or_deploy_invoker(lambda_config).await?;\n                Arc::new(SearcherContext::new(\n                    node_config.searcher_config.clone(),\n                    split_cache_opt,\n                    Some(invoker),\n                ))\n            }\n            #[cfg(not(feature = \"lambda\"))]\n            {\n                let _ = lambda_config;\n                bail!(\"lambda support is statically disabled, but enabled in configuration\");\n            }\n        } else {\n            Arc::new(SearcherContext::new_without_invoker(\n                node_config.searcher_config.clone(),\n                split_cache_opt,\n            ))\n        }\n    } else {\n        Arc::new(SearcherContext::new_without_invoker(\n            node_config.searcher_config.clone(),\n            split_cache_opt,\n        ))\n    };\n\n    let (search_job_placer, search_service) = setup_searcher(\n        &node_config,\n        cluster.change_stream(),\n        // search remains available without a control plane because not all\n        // metastore RPCs are proxied\n        metastore_through_control_plane.clone(),\n        storage_resolver.clone(),\n        searcher_context,\n    )\n    .await\n    .context(\"failed to start searcher service\")?;\n\n    // The control plane listens for local shards updates to learn about each shard's ingestion\n    // throughput. 
Ingesters (routers) do so to update their shard table.\n    let local_shards_update_listener_handle_opt = if node_config\n        .is_service_enabled(QuickwitService::ControlPlane)\n        || node_config.is_service_enabled(QuickwitService::Indexer)\n    {\n        Some(setup_local_shards_update_listener(cluster.clone(), event_broker.clone()).await)\n    } else {\n        None\n    };\n\n    let report_splits_subscription_handle_opt =\n        // DISCLAIMER: This is quirky: we decide whether to forward the split report based\n        // on the current searcher configuration.\n        if node_config.searcher_config.split_cache.is_some() {\n            // Searchers receive hints about new splits to populate their index.\n            Some(event_broker.subscribe::<ReportSplitsRequest>(search_job_placer.clone()))\n        } else {\n            None\n        };\n\n    let janitor_service_opt = if node_config.is_service_enabled(QuickwitService::Janitor) {\n        let janitor_service = start_janitor_service(\n            &universe,\n            &node_config,\n            metastore_through_control_plane.clone(),\n            search_job_placer,\n            storage_resolver.clone(),\n            event_broker.clone(),\n            !get_bool_from_env(DISABLE_DELETE_TASK_SERVICE_ENV_KEY, false),\n        )\n        .await\n        .context(\"failed to start janitor service\")?;\n        Some(janitor_service)\n    } else {\n        None\n    };\n\n    let jaeger_service_opt = if node_config.jaeger_config.enable_endpoint\n        && node_config.is_service_enabled(QuickwitService::Searcher)\n    {\n        let search_service = search_service.clone();\n        Some(JaegerService::new(\n            node_config.jaeger_config.clone(),\n            search_service,\n        ))\n    } else {\n        None\n    };\n\n    let otlp_logs_service_opt = if node_config.is_service_enabled(QuickwitService::Indexer)\n        && node_config.indexer_config.enable_otlp_endpoint\n    
{\n        Some(OtlpGrpcLogsService::new(ingest_router_service.clone()))\n    } else {\n        None\n    };\n\n    let otlp_traces_service_opt = if node_config.is_service_enabled(QuickwitService::Indexer)\n        && node_config.indexer_config.enable_otlp_endpoint\n    {\n        Some(OtlpGrpcTracesService::new(\n            ingest_router_service.clone(),\n            None,\n        ))\n    } else {\n        None\n    };\n\n    let grpc_listen_addr = node_config.grpc_listen_addr;\n    let rest_listen_addr = node_config.rest_config.listen_addr;\n    let quickwit_services: Arc<QuickwitServices> = Arc::new(QuickwitServices {\n        node_config: Arc::new(node_config),\n        cluster: cluster.clone(),\n        metastore_server_opt,\n        metastore_client: metastore_through_control_plane.clone(),\n        control_plane_server_opt,\n        control_plane_client,\n        _local_shards_update_listener_handle_opt: local_shards_update_listener_handle_opt,\n        _report_splits_subscription_handle_opt: report_splits_subscription_handle_opt,\n        index_manager,\n        indexing_service_opt,\n        ingest_router_opt: Some(ingest_router),\n        ingest_router_service,\n        ingest_service,\n        ingester_opt: ingester_opt.clone(),\n        janitor_service_opt,\n        jaeger_service_opt,\n        otlp_logs_service_opt,\n        otlp_traces_service_opt,\n        search_service,\n        env_filter_reload_fn,\n    });\n    // Setup and start gRPC server.\n    let (grpc_readiness_trigger_tx, grpc_readiness_signal_rx) = oneshot::channel::<()>();\n    let grpc_readiness_trigger = Box::pin(async move {\n        if grpc_readiness_trigger_tx.send(()).is_err() {\n            debug!(\"gRPC server readiness signal receiver was dropped\");\n        }\n    });\n    let (grpc_shutdown_trigger_tx, grpc_shutdown_signal_rx) = oneshot::channel::<()>();\n    let grpc_shutdown_signal = Box::pin(async move {\n        if grpc_shutdown_signal_rx.await.is_err() {\n            
debug!(\"gRPC server shutdown trigger sender was dropped\");\n        }\n    });\n    let (health_reporter, health_service) = tonic_health::server::health_reporter();\n    let grpc_server = grpc::start_grpc_server(\n        tcp_listener_resolver.resolve(grpc_listen_addr).await?,\n        grpc_config,\n        quickwit_services.clone(),\n        grpc_readiness_trigger,\n        grpc_shutdown_signal,\n        health_service,\n    );\n    // Set up and start REST server.\n    let (rest_readiness_trigger_tx, rest_readiness_signal_rx) = oneshot::channel::<()>();\n    let rest_readiness_trigger = Box::pin(async move {\n        if rest_readiness_trigger_tx.send(()).is_err() {\n            debug!(\"REST server readiness signal receiver was dropped\");\n        }\n    });\n    let (rest_shutdown_trigger_tx, rest_shutdown_signal_rx) = oneshot::channel::<()>();\n    let rest_shutdown_signal = Box::pin(async move {\n        if rest_shutdown_signal_rx.await.is_err() {\n            debug!(\"REST server shutdown trigger sender was dropped\");\n        }\n    });\n\n    let rest_server = rest::start_rest_server(\n        tcp_listener_resolver.resolve(rest_listen_addr).await?,\n        quickwit_services,\n        rest_readiness_trigger,\n        rest_shutdown_signal,\n    );\n\n    // Node readiness indicates that the server is ready to receive requests.\n    // Thus, the readiness task is started once the gRPC and REST servers are started.\n    spawn_named_task(\n        node_readiness_reporting_task(\n            cluster.clone(),\n            metastore_through_control_plane,\n            ingester_opt.clone(),\n            grpc_readiness_signal_rx,\n            rest_readiness_signal_rx,\n            health_reporter,\n        ),\n        \"node_readiness_reporting\",\n    );\n\n    let shutdown_handle = tokio::spawn(shutdown_signal_handler(\n        shutdown_signal,\n        universe,\n        ingester_opt,\n        grpc_shutdown_trigger_tx,\n        rest_shutdown_trigger_tx,\n        
cluster.clone(),\n    ));\n    let grpc_join_handle = async move {\n        spawn_named_task(grpc_server, \"grpc_server\")\n            .await\n            .expect(\"tasks running the gRPC server should not panic or be cancelled\")\n            .context(\"gRPC server failed\")\n    };\n\n    let rest_join_handle = async move {\n        spawn_named_task(rest_server, \"rest_server\")\n            .await\n            .expect(\"tasks running the REST server should not panic or be cancelled\")\n            .context(\"REST server failed\")\n    };\n\n    let chitchat_server_handle = cluster.chitchat_server_termination_watcher().await;\n\n    if let Err(err) = tokio::try_join!(grpc_join_handle, rest_join_handle, chitchat_server_handle) {\n        error!(\"server failed: {err:?}\");\n    }\n\n    let actor_exit_statuses = shutdown_handle\n        .await\n        .context(\"failed to gracefully shutdown services\")?;\n    Ok(actor_exit_statuses)\n}\n\n#[derive(Clone, Copy)]\nstruct PersistCircuitBreakerEvaluator;\n\nimpl CircuitBreakerEvaluator for PersistCircuitBreakerEvaluator {\n    type Response = PersistResponse;\n\n    type Error = IngestV2Error;\n\n    fn is_circuit_breaker_error(&self, output: &Result<Self::Response, IngestV2Error>) -> bool {\n        let Ok(persist_response) = output.as_ref() else {\n            return false;\n        };\n        for persist_failure in &persist_response.failures {\n            // This is the error we return when the WAL is full.\n            if persist_failure.reason() == PersistFailureReason::WalFull {\n                return true;\n            }\n        }\n        false\n    }\n\n    fn make_circuit_breaker_output(&self) -> IngestV2Error {\n        IngestV2Error::TooManyRequests(RateLimitingCause::CircuitBreaker)\n    }\n}\n\n/// Stack of layers to use on the server side of the ingester service.\nfn ingester_service_layer_stack(\n    layer_stack: IngesterServiceTowerLayerStack,\n) -> IngesterServiceTowerLayerStack {\n    
layer_stack\n        .stack_layer(INGEST_GRPC_SERVER_METRICS_LAYER.clone())\n        .stack_persist_layer(quickwit_common::tower::OneTaskPerCallLayer)\n        .stack_persist_layer(\n            // \"3\" may seem a little bit low, but we only consider error caused by a full WAL.\n            PersistCircuitBreakerEvaluator.make_layer(\n                3,\n                Duration::from_millis(500),\n                crate::metrics::SERVE_METRICS.circuit_break_total.clone(),\n            ),\n        )\n        .stack_open_replication_stream_layer(quickwit_common::tower::OneTaskPerCallLayer)\n        .stack_init_shards_layer(quickwit_common::tower::OneTaskPerCallLayer)\n        .stack_retain_shards_layer(quickwit_common::tower::OneTaskPerCallLayer)\n        .stack_truncate_shards_layer(quickwit_common::tower::OneTaskPerCallLayer)\n        .stack_close_shards_layer(quickwit_common::tower::OneTaskPerCallLayer)\n        .stack_decommission_layer(quickwit_common::tower::OneTaskPerCallLayer)\n}\n\nasync fn setup_ingest_v2(\n    node_config: &NodeConfig,\n    cluster: &Cluster,\n    event_broker: &EventBroker,\n    control_plane: ControlPlaneServiceClient,\n    ingester_pool: IngesterPool,\n) -> anyhow::Result<(IngestRouter, IngestRouterServiceClient, Option<Ingester>)> {\n    // Instantiate ingest router.\n    let self_node_id: NodeId = cluster.self_node_id().into();\n    let grpc_compression_encoding_opt = node_config.ingest_api_config.grpc_compression_encoding();\n    let replication_factor = node_config\n        .ingest_api_config\n        .replication_factor()\n        .expect(\"replication factor should have been validated\")\n        .get();\n\n    // Any node can serve ingest requests, so we always instantiate an ingest router.\n    // TODO: I'm not sure that's such a good idea.\n    let ingest_router = IngestRouter::new(\n        self_node_id.clone(),\n        control_plane.clone(),\n        ingester_pool.clone(),\n        replication_factor,\n        
event_broker.clone(),\n        node_config.availability_zone.clone(),\n    );\n    ingest_router.subscribe();\n    setup_ingester_capacity_update_listener(cluster.clone(), event_broker.clone())\n        .await\n        .forever();\n\n    let ingest_router_service = IngestRouterServiceClient::tower()\n        .stack_layer(INGEST_GRPC_SERVER_METRICS_LAYER.clone())\n        .build(ingest_router.clone());\n\n    let rate_limit =\n        ConstantRate::bytes_per_sec(node_config.ingest_api_config.shard_throughput_limit);\n    let rate_limiter_settings = RateLimiterSettings {\n        burst_limit: node_config.ingest_api_config.shard_burst_limit.as_u64(),\n        rate_limit,\n        // Refill every 100ms.\n        refill_period: Duration::from_millis(100),\n    };\n\n    // Instantiate ingester.\n    let ingester_opt: Option<Ingester> = if node_config.is_service_enabled(QuickwitService::Indexer)\n    {\n        let wal_dir_path = node_config.data_dir_path.join(\"wal\");\n        fs::create_dir_all(&wal_dir_path)?;\n\n        let idle_shard_timeout = get_idle_shard_timeout();\n        let ingester = Ingester::try_new(\n            cluster.clone(),\n            control_plane,\n            ingester_pool.clone(),\n            &wal_dir_path,\n            node_config.ingest_api_config.max_queue_disk_usage,\n            node_config.ingest_api_config.max_queue_memory_usage,\n            rate_limiter_settings,\n            replication_factor,\n            idle_shard_timeout,\n        )\n        .await?;\n        ingester.subscribe(event_broker);\n        // We will now receive all new shard positions update events, from chitchat.\n        // Unfortunately at this point, chitchat is already running.\n        //\n        // We need to make sure the existing positions are loaded too.\n        Some(ingester)\n    } else {\n        None\n    };\n    setup_ingester_pool(\n        cluster.change_stream(),\n        ingester_opt.clone(),\n        ingester_pool,\n        
grpc_compression_encoding_opt,\n        node_config.grpc_config.max_message_size,\n    );\n    Ok((ingest_router, ingest_router_service, ingester_opt))\n}\n\nfn setup_ingester_pool(\n    cluster_change_stream: ClusterChangeStream,\n    ingester_opt: Option<Ingester>,\n    ingester_pool: IngesterPool,\n    grpc_compression_encoding_opt: Option<CompressionEncoding>,\n    grpc_max_message_size: ByteSize,\n) {\n    let ingester_change_stream = cluster_change_stream.filter_map(move |cluster_change| {\n        let ingester_opt_clone = ingester_opt.clone();\n        Box::pin(async move {\n            match cluster_change {\n                ClusterChange::Add(node) if node.is_indexer() => {\n                    let change = build_ingester_insert_change(\n                        &node,\n                        ingester_opt_clone,\n                        grpc_max_message_size,\n                        grpc_compression_encoding_opt,\n                    );\n                    Some(change)\n                }\n                // only update the ingester pool when the ingester status changes, to avoid\n                // unnecessary churn\n                ClusterChange::Update { previous, updated }\n                    if updated.is_indexer()\n                        && previous.ingester_status() != updated.ingester_status() =>\n                {\n                    let change = build_ingester_insert_change(\n                        &updated,\n                        ingester_opt_clone,\n                        grpc_max_message_size,\n                        grpc_compression_encoding_opt,\n                    );\n                    Some(change)\n                }\n                ClusterChange::Remove(node) if node.is_indexer() => {\n                    let change = build_ingester_remove_change(&node);\n                    Some(change)\n                }\n                _ => None,\n            }\n        })\n    });\n    
ingester_pool.listen_for_changes(ingester_change_stream);\n}\n\nfn build_ingester_insert_change(\n    node: &ClusterNode,\n    ingester_opt: Option<impl IngesterService>,\n    grpc_max_message_size: ByteSize,\n    grpc_compression_encoding_opt: Option<CompressionEncoding>,\n) -> Change<NodeId, IngesterPoolEntry> {\n    let chitchat_id = node.chitchat_id();\n    info!(\n        node_id = chitchat_id.node_id,\n        generation_id = chitchat_id.generation_id,\n        \"adding/updating node `{}` with ingester status `{}` to ingester pool\",\n        chitchat_id.node_id,\n        node.ingester_status(),\n    );\n    let node_id: NodeId = node.node_id().into();\n    let ingester_service = build_ingester_service(\n        node,\n        ingester_opt,\n        grpc_max_message_size,\n        grpc_compression_encoding_opt,\n    );\n    let pool_entry = IngesterPoolEntry {\n        client: ingester_service,\n        status: node.ingester_status(),\n        availability_zone: node.availability_zone().map(|az| az.to_string()),\n    };\n    Change::Insert(node_id, pool_entry)\n}\n\nfn build_ingester_remove_change(node: &ClusterNode) -> Change<NodeId, IngesterPoolEntry> {\n    let chitchat_id = node.chitchat_id();\n    info!(\n        node_id = chitchat_id.node_id,\n        generation_id = chitchat_id.generation_id,\n        \"removing node `{}` from ingester pool\",\n        chitchat_id.node_id,\n    );\n    let node_id: NodeId = node.node_id().into();\n    Change::Remove(node_id)\n}\n\nfn build_ingester_service(\n    node: &ClusterNode,\n    ingester_opt: Option<impl IngesterService>,\n    max_message_size: ByteSize,\n    grpc_compression_encoding_opt: Option<CompressionEncoding>,\n) -> IngesterServiceClient {\n    if node.is_self_node() {\n        // Here, since the service is available locally, we bypass the network stack\n        // and use the instance directly. 
However, we still want client-side\n        // metrics, so we use both metrics layers.\n        let ingester = ingester_opt.expect(\"ingester service should be initialized\");\n        let service = ingester_service_layer_stack(\n            IngesterServiceClient::tower().stack_layer(INGEST_GRPC_CLIENT_METRICS_LAYER.clone()),\n        )\n        .build(ingester);\n        return service;\n    }\n    IngesterServiceClient::tower()\n        .stack_layer(INGEST_GRPC_CLIENT_METRICS_LAYER.clone())\n        .stack_layer(TimeoutLayer::new(GRPC_INGESTER_SERVICE_TIMEOUT))\n        .build_from_channel(\n            node.grpc_advertise_addr(),\n            node.channel(),\n            max_message_size,\n            grpc_compression_encoding_opt,\n        )\n}\n\nasync fn setup_searcher(\n    node_config: &NodeConfig,\n    cluster_change_stream: ClusterChangeStream,\n    metastore: MetastoreServiceClient,\n    storage_resolver: StorageResolver,\n    searcher_context: Arc<SearcherContext>,\n) -> anyhow::Result<(SearchJobPlacer, Arc<dyn SearchService>)> {\n    let searcher_pool = SearcherPool::default();\n    let search_job_placer = SearchJobPlacer::new(searcher_pool.clone());\n\n    let search_service = start_searcher_service(\n        metastore,\n        storage_resolver,\n        search_job_placer.clone(),\n        searcher_context,\n    )\n    .await?;\n    let search_service_clone = search_service.clone();\n    let max_message_size = node_config.grpc_config.max_message_size;\n    let request_timeout = node_config.searcher_config.request_timeout();\n    let searcher_change_stream = cluster_change_stream.filter_map(move |cluster_change| {\n        let search_service_clone = search_service_clone.clone();\n        Box::pin(async move {\n            match cluster_change {\n                ClusterChange::Add(node) if node.is_searcher() => {\n                    let chitchat_id = node.chitchat_id();\n                    info!(\n                        node_id = 
chitchat_id.node_id,\n                        generation_id = chitchat_id.generation_id,\n                        \"adding node `{}` to searcher pool\",\n                        chitchat_id.node_id,\n                    );\n                    let grpc_addr = node.grpc_advertise_addr();\n\n                    if node.is_self_node() {\n                        let search_client =\n                            SearchServiceClient::from_service(search_service_clone, grpc_addr);\n                        Some(Change::Insert(grpc_addr, search_client))\n                    } else {\n                        let timeout_channel = Timeout::new(node.channel(), request_timeout);\n                        let search_client = create_search_client_from_channel(\n                            grpc_addr,\n                            timeout_channel,\n                            max_message_size,\n                        );\n                        Some(Change::Insert(grpc_addr, search_client))\n                    }\n                }\n                ClusterChange::Remove(node) if node.is_searcher() => {\n                    let chitchat_id = node.chitchat_id();\n                    info!(\n                        node_id = chitchat_id.node_id,\n                        generation_id = chitchat_id.generation_id,\n                        \"removing node `{}` from searcher pool\",\n                        chitchat_id.node_id,\n                    );\n                    Some(Change::Remove(node.grpc_advertise_addr()))\n                }\n                _ => None,\n            }\n        })\n    });\n    searcher_pool.listen_for_changes(searcher_change_stream);\n    Ok((search_job_placer, search_service))\n}\n\n#[allow(clippy::too_many_arguments)]\nasync fn setup_control_plane(\n    universe: &Universe,\n    event_broker: &EventBroker,\n    self_node_id: NodeId,\n    cluster: Cluster,\n    indexer_pool: IndexerPool,\n    ingester_pool: IngesterPool,\n    metastore: 
MetastoreServiceClient,\n    default_index_root_uri: Uri,\n    ingest_api_config: &IngestApiConfig,\n) -> anyhow::Result<Mailbox<ControlPlane>> {\n    let cluster_id = cluster.cluster_id().to_string();\n    let replication_factor = ingest_api_config\n        .replication_factor()\n        .expect(\"replication factor should have been validated\")\n        .get();\n    let cluster_config = ClusterConfig {\n        cluster_id,\n        auto_create_indexes: true,\n        default_index_root_uri,\n        replication_factor,\n        shard_throughput_limit: ingest_api_config.shard_throughput_limit,\n        shard_scale_up_factor: ingest_api_config.shard_scale_up_factor,\n    };\n    let (control_plane_mailbox, _control_plane_handle, mut readiness_rx) = ControlPlane::spawn(\n        universe,\n        cluster_config,\n        self_node_id,\n        cluster.clone(),\n        indexer_pool,\n        ingester_pool,\n        metastore,\n    );\n    let subscriber = ControlPlaneEventSubscriber::new(control_plane_mailbox.downgrade());\n    event_broker\n        .subscribe_without_timeout::<LocalShardsUpdate>(subscriber.clone())\n        .forever();\n    event_broker\n        .subscribe_without_timeout::<ShardPositionsUpdate>(subscriber)\n        .forever();\n\n    tokio::time::timeout(\n        Duration::from_secs(300),\n        readiness_rx.wait_for(|readiness| *readiness),\n    )\n    .await\n    .context(\"control plane initialization timed out\")?\n    .context(\"control plane was killed or quit\")?;\n\n    info!(\"control plane is ready\");\n    Ok(control_plane_mailbox)\n}\n\nfn setup_indexer_pool(\n    cluster_change_stream: ClusterChangeStream,\n    indexing_service_opt: Option<Mailbox<IndexingService>>,\n    indexer_pool: IndexerPool,\n    grpc_max_message_size: ByteSize,\n) {\n    let indexer_change_stream = cluster_change_stream.filter_map(move |cluster_change| {\n        let indexing_service_clone_opt = indexing_service_opt.clone();\n        Box::pin(async move 
{\n            match cluster_change {\n                ClusterChange::Add(node) if node.is_indexer() => {\n                    let change = build_indexer_insert_change(\n                        &node,\n                        indexing_service_clone_opt,\n                        grpc_max_message_size,\n                    );\n                    Some(change)\n                }\n                ClusterChange::Remove(node) if node.is_indexer() => {\n                    let change = build_indexer_remove_change(&node);\n                    Some(change)\n                }\n                _ => None,\n            }\n        })\n    });\n    indexer_pool.listen_for_changes(indexer_change_stream);\n}\n\nfn build_indexer_insert_change(\n    node: &ClusterNode,\n    indexing_service_opt: Option<Mailbox<IndexingService>>,\n    grpc_max_message_size: ByteSize,\n) -> Change<NodeId, IndexerNodeInfo> {\n    let chitchat_id = node.chitchat_id();\n    info!(\n        node_id = chitchat_id.node_id,\n        generation_id = chitchat_id.generation_id,\n        \"adding node `{}` with ingester status `{}` to indexer pool\",\n        chitchat_id.node_id,\n        node.ingester_status()\n    );\n    let node_id: NodeId = node.node_id().into();\n    let client = build_indexing_service(node, indexing_service_opt, grpc_max_message_size);\n    Change::Insert(\n        node_id.clone(),\n        IndexerNodeInfo {\n            node_id,\n            generation_id: chitchat_id.generation_id,\n            client,\n            indexing_tasks: node.indexing_tasks().to_vec(),\n            indexing_capacity: node.indexing_capacity(),\n        },\n    )\n}\n\nfn build_indexer_remove_change(node: &ClusterNode) -> Change<NodeId, IndexerNodeInfo> {\n    let chitchat_id = node.chitchat_id();\n    info!(\n        node_id = chitchat_id.node_id,\n        generation_id = chitchat_id.generation_id,\n        \"removing node `{}` from indexer pool\",\n        chitchat_id.node_id,\n    );\n    let node_id: NodeId = 
node.node_id().into();\n    Change::Remove(node_id)\n}\n\nfn build_indexing_service(\n    node: &ClusterNode,\n    indexing_service_opt: Option<Mailbox<IndexingService>>,\n    max_message_size: ByteSize,\n) -> IndexingServiceClient {\n    if node.is_self_node() {\n        // Here, since the service is available locally, we bypass the network stack\n        // and use the mailbox directly. However, we still want client-side metrics,\n        // so we use both metrics layers.\n        let indexing_service_mailbox =\n            indexing_service_opt.expect(\"indexing service should be initialized\");\n        let shared_layers = ServiceBuilder::new()\n            .layer(INDEXING_GRPC_CLIENT_METRICS_LAYER.clone())\n            .layer(INDEXING_GRPC_SERVER_METRICS_LAYER.clone())\n            .into_inner();\n        return IndexingServiceClient::tower()\n            .stack_layer(shared_layers)\n            .build_from_mailbox(indexing_service_mailbox);\n    }\n    IndexingServiceClient::tower()\n        .stack_layer(INDEXING_GRPC_CLIENT_METRICS_LAYER.clone())\n        .stack_layer(TimeoutLayer::new(GRPC_INDEXING_SERVICE_TIMEOUT))\n        .build_from_channel(\n            node.grpc_advertise_addr(),\n            node.channel(),\n            max_message_size,\n            None,\n        )\n}\n\nfn require<T: Clone + Send>(\n    val_opt: Option<T>,\n) -> impl Filter<Extract = (T,), Error = Rejection> + Clone {\n    warp::any().and_then(move || {\n        let val_opt_clone = val_opt.clone();\n        async move {\n            if let Some(val) = val_opt_clone {\n                Ok(val)\n            } else {\n                Err(warp::reject())\n            }\n        }\n    })\n}\n\nfn with_arg<T: Clone + Send>(arg: T) -> impl Filter<Extract = (T,), Error = Infallible> + Clone {\n    warp::any().map(move || arg.clone())\n}\n\n/// Reports node readiness to chitchat cluster every 10 seconds (25 ms for tests).\nasync fn node_readiness_reporting_task(\n    cluster: Cluster,\n    
metastore: MetastoreServiceClient,\n    ingester_opt: Option<impl IngesterService>,\n    grpc_readiness_signal_rx: oneshot::Receiver<()>,\n    rest_readiness_signal_rx: oneshot::Receiver<()>,\n    health_reporter: HealthReporter,\n) {\n    let mut node_ready = false;\n    cluster.set_self_node_readiness(node_ready).await;\n    // Set the initial health status to `NotServing` with \"\" meaning all services, as per\n    // https://github.com/grpc/grpc/blob/master/doc/health-checking.md\n    health_reporter\n        .set_service_status(\"\", ServingStatus::NotServing)\n        .await;\n\n    if grpc_readiness_signal_rx.await.is_err() {\n        // the gRPC server failed.\n        return;\n    };\n    info!(\"gRPC server is ready\");\n\n    if rest_readiness_signal_rx.await.is_err() {\n        // the REST server failed.\n        return;\n    };\n    info!(\"REST server is ready\");\n\n    let mut interval = tokio::time::interval(READINESS_REPORTING_INTERVAL);\n\n    loop {\n        interval.tick().await;\n\n        let metastore_is_available = match metastore.check_connectivity().await {\n            Ok(()) => {\n                debug!(metastore_endpoints=?metastore.endpoints(), \"metastore service is available\");\n                true\n            }\n            Err(error) => {\n                warn!(metastore_endpoints=?metastore.endpoints(), error=?error, \"metastore service is unavailable\");\n                false\n            }\n        };\n        let ingester_is_available = if let Some(ingester) = &ingester_opt {\n            match try_get_ingester_status(ingester).await {\n                Ok(status) => {\n                    status != IngesterStatus::Initializing && status != IngesterStatus::Failed\n                }\n                Err(error) => {\n                    // If we couldn't get the ingester status, it's not looking good, so we set the\n                    // node to not ready.\n                    error!(%error, \"failed to get ingester 
status\");\n                    false\n                }\n            }\n        } else {\n            true\n        };\n        let new_node_ready = metastore_is_available && ingester_is_available;\n\n        if new_node_ready != node_ready {\n            node_ready = new_node_ready;\n            cluster.set_self_node_readiness(node_ready).await;\n\n            let serving_status = if node_ready {\n                ServingStatus::Serving\n            } else {\n                ServingStatus::NotServing\n            };\n            health_reporter.set_service_status(\"\", serving_status).await;\n        }\n    }\n}\n\n/// Displays some warnings if the cluster runs a file-backed metastore or serves file-backed\n/// indexes.\nasync fn check_cluster_configuration(\n    services: &HashSet<QuickwitService>,\n    peer_seeds: &[String],\n    metastore: MetastoreServiceClient,\n) -> anyhow::Result<()> {\n    if !services.contains(&QuickwitService::Metastore) || peer_seeds.is_empty() {\n        return Ok(());\n    }\n    if metastore\n        .endpoints()\n        .iter()\n        .any(|uri| !uri.protocol().is_database())\n    {\n        warn!(\n            metastore_endpoints=?metastore.endpoints(),\n            \"Using a file-backed metastore in cluster mode is not recommended for production use.\n            Running multiple file-backed metastores simultaneously can lead to data loss.\");\n    }\n    let file_backed_indexes = metastore\n        .list_indexes_metadata(ListIndexesMetadataRequest::all())\n        .await?\n        .deserialize_indexes_metadata()\n        .await?\n        .into_iter()\n        .filter(|index_metadata| index_metadata.index_uri().protocol().is_file_storage())\n        .collect::<Vec<_>>();\n    if !file_backed_indexes.is_empty() {\n        let index_ids = file_backed_indexes\n            .iter()\n            .map(|index_metadata| index_metadata.index_id())\n            .join(\", \");\n        let index_uris = file_backed_indexes\n            
.iter()\n            .map(|index_metadata| index_metadata.index_uri())\n            .join(\", \");\n        warn!(\n            index_ids=%index_ids,\n            index_uris=%index_uris,\n            \"Found some file-backed indexes in the metastore. Some nodes in the cluster may not have access to all index files.\"\n        );\n    }\n    Ok(())\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_cluster::{ChannelTransport, ClusterNode, create_cluster_for_test};\n    use quickwit_common::uri::Uri;\n    use quickwit_common::{ServiceStream, assert_eventually};\n    use quickwit_config::SearcherConfig;\n    use quickwit_metastore::{IndexMetadata, metastore_for_test};\n    use quickwit_proto::ingest::ingester::{MockIngesterService, ObservationMessage};\n    use quickwit_proto::metastore::{ListIndexesMetadataResponse, MockMetastoreService};\n    use quickwit_search::Job;\n    use tokio::sync::watch;\n    use tonic::transport::{Channel, Server};\n    use tonic_health::pb::HealthCheckRequest;\n    use tonic_health::pb::health_client::HealthClient;\n    use tonic_health::server::health_reporter;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_check_cluster_configuration() {\n        let services = HashSet::from_iter([QuickwitService::Metastore]);\n        let peer_seeds = [\"192.168.0.12:7280\".to_string()];\n        let mut mock_metastore = MockMetastoreService::new();\n\n        mock_metastore\n            .expect_endpoints()\n            .return_const(vec![Uri::for_test(\"file:///qwdata/indexes\")]);\n        mock_metastore\n            .expect_list_indexes_metadata()\n            .return_once(|_| {\n                Ok(ListIndexesMetadataResponse::for_test(vec![\n                    IndexMetadata::for_test(\"test-index\", \"file:///qwdata/indexes/test-index\"),\n                ]))\n            });\n\n        check_cluster_configuration(\n            &services,\n            &peer_seeds,\n            MetastoreServiceClient::from_mock(mock_metastore),\n      
  )\n        .await\n        .unwrap();\n    }\n\n    #[tokio::test]\n    async fn test_readiness_updates() {\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[], &transport, false)\n            .await\n            .unwrap();\n        let (metastore_readiness_tx, metastore_readiness_rx) = watch::channel(false);\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_check_connectivity()\n            .returning(move || {\n                if *metastore_readiness_rx.borrow() {\n                    Ok(())\n                } else {\n                    Err(anyhow::anyhow!(\"Metastore not ready\"))\n                }\n            });\n        let (ingester_status_tx, ingester_status_rx) = watch::channel(IngesterStatus::Initializing);\n        let mut mock_ingester = MockIngesterService::new();\n        mock_ingester\n            .expect_open_observation_stream()\n            .returning(move |_| {\n                let status_stream = ServiceStream::from(ingester_status_rx.clone());\n                let observation_stream = status_stream.map(|status| {\n                    let message = ObservationMessage {\n                        node_id: \"test-node\".to_string(),\n                        status: status as i32,\n                    };\n                    Ok(message)\n                });\n                Ok(observation_stream)\n            });\n        let (grpc_readiness_trigger_tx, grpc_readiness_signal_rx) = oneshot::channel();\n        let (rest_readiness_trigger_tx, rest_readiness_signal_rx) = oneshot::channel();\n\n        let (health_reporter, health_service) = health_reporter();\n        let (client, server) = tokio::io::duplex(1024);\n        tokio::spawn(async move {\n            Server::builder()\n                .add_service(health_service)\n                .serve_with_incoming(tokio_stream::once(Ok::<_, std::io::Error>(server)))\n             
   .await\n                .unwrap();\n        });\n        let mut client_opt = Some(client);\n        let connector = tower::service_fn(move |_: http::Uri| {\n            let client = client_opt.take().unwrap();\n            async move { Ok::<_, Infallible>(hyper_util::rt::TokioIo::new(client)) }\n        });\n        let channel = Channel::builder(\"http://[::]:50051\".parse().unwrap())\n            .connect_with_connector(connector)\n            .await\n            .unwrap();\n\n        let mut health_client = HealthClient::new(channel);\n\n        tokio::spawn(node_readiness_reporting_task(\n            cluster.clone(),\n            MetastoreServiceClient::from_mock(mock_metastore),\n            Some(mock_ingester),\n            grpc_readiness_signal_rx,\n            rest_readiness_signal_rx,\n            health_reporter,\n        ));\n        assert!(!cluster.is_self_node_ready().await);\n\n        let request = tonic::Request::new(HealthCheckRequest::default());\n        let response = health_client.check(request).await.unwrap().into_inner();\n        assert_eq!(response.status(), ServingStatus::NotServing.into());\n\n        grpc_readiness_trigger_tx.send(()).unwrap();\n        rest_readiness_trigger_tx.send(()).unwrap();\n        assert!(!cluster.is_self_node_ready().await);\n\n        metastore_readiness_tx.send(true).unwrap();\n        ingester_status_tx.send(IngesterStatus::Ready).unwrap();\n        assert_eventually!(cluster.is_self_node_ready().await);\n\n        let request = tonic::Request::new(HealthCheckRequest::default());\n        let response = health_client.check(request).await.unwrap().into_inner();\n        assert_eq!(response.status(), ServingStatus::Serving.into());\n\n        metastore_readiness_tx.send(false).unwrap();\n        assert_eventually!(!cluster.is_self_node_ready().await);\n\n        let request = tonic::Request::new(HealthCheckRequest::default());\n        let response = 
health_client.check(request).await.unwrap().into_inner();\n        assert_eq!(response.status(), ServingStatus::NotServing.into());\n    }\n\n    #[tokio::test]\n    async fn test_setup_indexer_pool() {\n        let universe = Universe::with_accelerated_time();\n        let (indexing_service_mailbox, _indexing_service_inbox) =\n            universe.create_test_mailbox::<IndexingService>();\n        let node_config = NodeConfig::for_test();\n\n        let (cluster_change_stream, cluster_change_stream_tx) =\n            ClusterChangeStream::new_unbounded();\n        let indexer_pool = IndexerPool::default();\n        setup_indexer_pool(\n            cluster_change_stream,\n            Some(indexing_service_mailbox),\n            indexer_pool.clone(),\n            node_config.grpc_config.max_message_size,\n        );\n\n        // adding an indexer node refreshes the indexer pool\n        let new_indexer_node = ClusterNode::for_test(\n            \"test-indexer-node\",\n            1,\n            true,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Ready,\n        )\n        .await;\n        cluster_change_stream_tx\n            .send(ClusterChange::Add(new_indexer_node.clone()))\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert_eq!(indexer_pool.len(), 1);\n\n        // removing an indexer node refreshes the indexer pool\n        cluster_change_stream_tx\n            .send(ClusterChange::Remove(new_indexer_node))\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert!(indexer_pool.is_empty());\n    }\n\n    #[tokio::test]\n    async fn test_setup_searcher() {\n        let node_config = NodeConfig::for_test();\n        let searcher_context = Arc::new(SearcherContext::new_without_invoker(\n            SearcherConfig::default(),\n            None,\n        ));\n        let metastore = metastore_for_test();\n        let (change_stream, 
change_stream_tx) = ClusterChangeStream::new_unbounded();\n        let storage_resolver = StorageResolver::unconfigured();\n        let (search_job_placer, _searcher_service) = setup_searcher(\n            &node_config,\n            change_stream,\n            metastore,\n            storage_resolver,\n            searcher_context,\n        )\n        .await\n        .unwrap();\n\n        struct DummyJob(String);\n\n        impl Job for DummyJob {\n            fn split_id(&self) -> &str {\n                &self.0\n            }\n\n            fn cost(&self) -> usize {\n                1\n            }\n        }\n        search_job_placer\n            .assign_job(DummyJob(\"job-1\".to_string()), &HashSet::new())\n            .await\n            .unwrap_err();\n\n        let self_node = ClusterNode::for_test(\n            \"node-1\",\n            1337,\n            true,\n            &[\"searcher\"],\n            &[],\n            IngesterStatus::Ready,\n        )\n        .await;\n        change_stream_tx\n            .send(ClusterChange::Add(self_node.clone()))\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        let searcher_client = search_job_placer\n            .assign_job(DummyJob(\"job-1\".to_string()), &HashSet::new())\n            .await\n            .unwrap();\n        assert!(searcher_client.is_local());\n\n        change_stream_tx\n            .send(ClusterChange::Remove(self_node))\n            .unwrap();\n\n        let node = ClusterNode::for_test(\n            \"node-1\",\n            1337,\n            false,\n            &[\"searcher\"],\n            &[],\n            IngesterStatus::Ready,\n        )\n        .await;\n        change_stream_tx.send(ClusterChange::Add(node)).unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        let searcher_client = search_job_placer\n            .assign_job(DummyJob(\"job-1\".to_string()), &HashSet::new())\n            .await\n            
.unwrap();\n        assert!(!searcher_client.is_local());\n    }\n\n    #[tokio::test]\n    async fn test_setup_ingester_pool() {\n        let (cluster_change_stream, cluster_change_stream_tx) =\n            ClusterChangeStream::new_unbounded();\n        let ingester_pool = IngesterPool::default();\n        setup_ingester_pool(\n            cluster_change_stream,\n            None::<Ingester>,\n            ingester_pool.clone(),\n            None,\n            ByteSize::mib(20),\n        );\n\n        // Add an indexer node with IngesterStatus::Initializing.\n        let new_node = ClusterNode::for_test(\n            \"test-ingester-node\",\n            1,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Initializing,\n        )\n        .await;\n        cluster_change_stream_tx\n            .send(ClusterChange::Add(new_node.clone()))\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert_eq!(ingester_pool.len(), 1);\n        let pool_entry = ingester_pool\n            .get(&NodeId::from(\"test-ingester-node\"))\n            .unwrap();\n        assert_eq!(pool_entry.status, IngesterStatus::Initializing);\n\n        // Update the node: ingester status transitions from Initializing to Ready.\n        let updated_node = ClusterNode::for_test(\n            \"test-ingester-node\",\n            1,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Ready,\n        )\n        .await;\n        cluster_change_stream_tx\n            .send(ClusterChange::Update {\n                previous: new_node.clone(),\n                updated: updated_node.clone(),\n            })\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert_eq!(ingester_pool.len(), 1);\n        let pool_entry = ingester_pool\n            .get(&NodeId::from(\"test-ingester-node\"))\n            .unwrap();\n        
assert_eq!(pool_entry.status, IngesterStatus::Ready);\n\n        // Update the node: ingester status transitions from Ready to Decommissioning.\n        let updated_node_2 = ClusterNode::for_test(\n            \"test-ingester-node\",\n            1,\n            false,\n            &[\"indexer\"],\n            &[],\n            IngesterStatus::Decommissioning,\n        )\n        .await;\n        cluster_change_stream_tx\n            .send(ClusterChange::Update {\n                previous: updated_node.clone(),\n                updated: updated_node_2.clone(),\n            })\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        // The node should still be in the pool with updated status.\n        assert_eq!(ingester_pool.len(), 1);\n        let pool_entry = ingester_pool\n            .get(&NodeId::from(\"test-ingester-node\"))\n            .unwrap();\n        assert_eq!(pool_entry.status, IngesterStatus::Decommissioning);\n\n        // Remove the node.\n        cluster_change_stream_tx\n            .send(ClusterChange::Remove(updated_node))\n            .unwrap();\n        tokio::time::sleep(Duration::from_millis(1)).await;\n\n        assert!(ingester_pool.is_empty());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/load_shield.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse quickwit_common::metrics::{GaugeGuard, IntGauge};\nuse tokio::sync::{Semaphore, SemaphorePermit};\n\nuse crate::rest::TooManyRequests;\n\npub struct LoadShield {\n    in_flight_semaphore_opt: Option<Semaphore>, // This one is doing the load shedding.\n    concurrency_semaphore_opt: Option<Semaphore>,\n    ongoing_gauge: IntGauge,\n    pending_gauge: IntGauge,\n}\n\npub struct LoadShieldPermit {\n    _concurrency_permit_opt: Option<SemaphorePermit<'static>>,\n    _in_flight_permit_opt: Option<SemaphorePermit<'static>>,\n    _ongoing_gauge_guard: GaugeGuard<'static>,\n}\n\nimpl LoadShield {\n    pub fn new(endpoint_group: &'static str) -> LoadShield {\n        let endpoint_group_uppercase = endpoint_group.to_ascii_uppercase();\n        let max_in_flight_env_key = format!(\"QW_{endpoint_group_uppercase}_MAX_IN_FLIGHT\");\n        let max_concurrency_env_key = format!(\"QW_{endpoint_group_uppercase}_MAX_CONCURRENCY\");\n        let max_in_flight_opt: Option<usize> =\n            quickwit_common::get_from_env_opt(&max_in_flight_env_key, false);\n        let max_concurrency_opt: Option<usize> =\n            quickwit_common::get_from_env_opt(&max_concurrency_env_key, false);\n        let in_flight_semaphore_opt = max_in_flight_opt.map(Semaphore::new);\n        let concurrency_semaphore_opt = 
max_concurrency_opt.map(Semaphore::new);\n        let pending_gauge = crate::metrics::SERVE_METRICS\n            .pending_requests\n            .with_label_values([endpoint_group]);\n        let ongoing_gauge = crate::metrics::SERVE_METRICS\n            .ongoing_requests\n            .with_label_values([endpoint_group]);\n        LoadShield {\n            in_flight_semaphore_opt,\n            concurrency_semaphore_opt,\n            ongoing_gauge,\n            pending_gauge,\n        }\n    }\n\n    async fn acquire_in_flight_permit(\n        &'static self,\n    ) -> Result<Option<SemaphorePermit<'static>>, warp::Rejection> {\n        let Some(in_flight_semaphore) = &self.in_flight_semaphore_opt else {\n            return Ok(None);\n        };\n        let Ok(in_flight_permit) = in_flight_semaphore.try_acquire() else {\n            // Wait a little before load shedding. The point is to lower the load associated\n            // with super aggressive clients.\n            tokio::time::sleep(Duration::from_millis(100)).await;\n            return Err(warp::reject::custom(TooManyRequests));\n        };\n        Ok(Some(in_flight_permit))\n    }\n\n    async fn acquire_concurrency_permit(&'static self) -> Option<SemaphorePermit<'static>> {\n        let concurrency_semaphore = self.concurrency_semaphore_opt.as_ref()?;\n        Some(concurrency_semaphore.acquire().await.unwrap())\n    }\n\n    pub async fn acquire_permit(&'static self) -> Result<LoadShieldPermit, warp::Rejection> {\n        let mut pending_gauge_guard = GaugeGuard::from_gauge(&self.pending_gauge);\n        pending_gauge_guard.add(1);\n        let in_flight_permit_opt = self.acquire_in_flight_permit().await?;\n        let concurrency_permit_opt = self.acquire_concurrency_permit().await;\n        drop(pending_gauge_guard);\n        let mut ongoing_gauge_guard = GaugeGuard::from_gauge(&self.ongoing_gauge);\n        ongoing_gauge_guard.add(1);\n        Ok(LoadShieldPermit {\n            
_in_flight_permit_opt: in_flight_permit_opt,\n            _concurrency_permit_opt: concurrency_permit_opt,\n            _ongoing_gauge_guard: ongoing_gauge_guard,\n        })\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    HistogramVec, IntCounter, IntCounterVec, IntGaugeVec, new_counter, new_counter_vec,\n    new_gauge_vec, new_histogram_vec,\n};\n\npub struct ServeMetrics {\n    pub http_requests_total: IntCounterVec<2>,\n    pub request_duration_secs: HistogramVec<2>,\n    pub ongoing_requests: IntGaugeVec<1>,\n    pub pending_requests: IntGaugeVec<1>,\n    pub circuit_break_total: IntCounter,\n}\n\nimpl Default for ServeMetrics {\n    fn default() -> Self {\n        let circuit_break_total = new_counter(\n            \"circuit_break_total\",\n            \"Circuit breaker counter\",\n            \"grpc\",\n            &[],\n        );\n        ServeMetrics {\n            http_requests_total: new_counter_vec(\n                \"http_requests_total\",\n                \"Total number of HTTP requests processed.\",\n                \"\",\n                &[],\n                [\"method\", \"status_code\"],\n            ),\n            request_duration_secs: new_histogram_vec(\n                \"request_duration_secs\",\n                \"Response time in seconds\",\n                \"\",\n                &[],\n                [\"method\", \"status_code\"],\n                // last bucket is 163.84s\n                quickwit_common::metrics::exponential_buckets(0.02, 2.0, 14).unwrap(),\n            
),\n            ongoing_requests: new_gauge_vec(\n                \"ongoing_requests\",\n                \"Number of ongoing requests.\",\n                \"\",\n                &[],\n                [\"endpoint_group\"],\n            ),\n            pending_requests: new_gauge_vec(\n                \"pending_requests\",\n                \"Number of pending requests.\",\n                \"\",\n                &[],\n                [\"endpoint_group\"],\n            ),\n            circuit_break_total,\n        }\n    }\n}\n\n/// Serve metrics expose a set of metrics about the requests received by Quickwit.\npub static SERVE_METRICS: Lazy<ServeMetrics> = Lazy::new(ServeMetrics::default);\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/metrics_api.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse tracing::error;\nuse warp::hyper::StatusCode;\nuse warp::reply::with_status;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(metrics_handler))]\n/// Endpoints which are weirdly tied to another crate with no\n/// other bits of information attached.\n///\n/// If a crate plans to encompass different schemas, handlers, etc...\n/// Then it should have its own specific API group.\npub struct MetricsApi;\n\n#[utoipa::path(\n    get,\n    tag = \"Get Metrics\",\n    path = \"/\",\n    responses(\n        (status = 200, description = \"Successfully fetched metrics.\", body = String),\n        (status = 500, description = \"Metrics not available.\", body = String),\n    ),\n)]\n/// Get Node Metrics\n///\n/// These are in the form of prometheus metrics.\npub fn metrics_handler() -> impl warp::Reply {\n    match quickwit_common::metrics::metrics_text_payload() {\n        Ok(metrics) => with_status(metrics, StatusCode::OK),\n        Err(e) => {\n            error!(\"failed to encode prometheus metrics: {e}\");\n            with_status(String::new(), StatusCode::INTERNAL_SERVER_ERROR)\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/node_info_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse quickwit_config::NodeConfig;\nuse serde_json::json;\nuse warp::{Filter, Rejection};\n\nuse crate::rest::recover_fn;\nuse crate::{BuildInfo, RuntimeInfo, with_arg};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(node_version_handler, node_config_handler,))]\npub struct NodeInfoApi;\n\npub fn node_info_handler(\n    build_info: &'static BuildInfo,\n    runtime_info: &'static RuntimeInfo,\n    config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    node_version_handler(build_info, runtime_info)\n        .or(node_config_handler(config))\n        .recover(recover_fn)\n        .boxed()\n}\n\n#[utoipa::path(get, tag = \"Node Info\", path = \"/version\")]\nfn node_version_handler(\n    build_info: &'static BuildInfo,\n    runtime_info: &'static RuntimeInfo,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path(\"version\")\n        .and(warp::path::end())\n        .and(with_arg(build_info))\n        .and(with_arg(runtime_info))\n        .then(get_version)\n}\n\nasync fn get_version(\n    build_info: &'static BuildInfo,\n    runtime_info: &'static RuntimeInfo,\n) -> impl warp::Reply {\n    warp::reply::json(&json!({\n        \"build\": build_info,\n        \"runtime\": runtime_info,\n    }))\n}\n\n#[utoipa::path(get, tag = \"Node 
Info\", path = \"/config\")]\nfn node_config_handler(\n    config: Arc<NodeConfig>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path(\"config\")\n        .and(warp::path::end())\n        .and(with_arg(config))\n        .then(get_config)\n}\n\nasync fn get_config(config: Arc<NodeConfig>) -> impl warp::Reply {\n    // We must redact sensitive information such as credentials.\n    let mut config = (*config).clone();\n    config.redact();\n    warp::reply::json(&config)\n}\n\n#[cfg(test)]\nmod tests {\n    use assert_json_diff::assert_json_include;\n    use quickwit_common::uri::Uri;\n    use serde_json::Value as JsonValue;\n\n    use super::*;\n    use crate::recover_fn;\n\n    #[tokio::test]\n    async fn test_rest_node_info() {\n        let build_info = BuildInfo::get();\n        let runtime_info = RuntimeInfo::get();\n        let mut config = NodeConfig::for_test();\n        config.metastore_uri = Uri::for_test(\"postgresql://username:password@db\");\n        let handler = node_info_handler(build_info, runtime_info, Arc::new(config.clone()))\n            .recover(recover_fn);\n        let resp = warp::test::request().path(\"/version\").reply(&handler).await;\n        assert_eq!(resp.status(), 200);\n        let info_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        let build_info_json = info_json.get(\"build\").unwrap();\n        let expected_build_info_json = serde_json::json!({\n            \"commit_date\": build_info.commit_date,\n            \"version\": build_info.version,\n        });\n        assert_json_include!(actual: build_info_json, expected: expected_build_info_json);\n\n        let runtime_info_json = info_json.get(\"runtime\").unwrap();\n        let expected_runtime_info_json = serde_json::json!({\n            \"num_cpus\": runtime_info.num_cpus,\n        });\n        assert_json_include!(\n            actual: runtime_info_json,\n            expected: expected_runtime_info_json\n     
   );\n\n        let resp = warp::test::request().path(\"/config\").reply(&handler).await;\n        assert_eq!(resp.status(), 200);\n        let resp_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        let expected_response_json = serde_json::json!({\n            \"node_id\": config.node_id,\n            \"metastore_uri\": \"postgresql://username:***redacted***@db\",\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/openapi.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::mem;\n\nuse quickwit_config::ConfigApiSchemas;\nuse quickwit_doc_mapper::DocMapperApiSchemas;\nuse quickwit_indexing::IndexingApiSchemas;\nuse quickwit_janitor::JanitorApiSchemas;\nuse quickwit_metastore::MetastoreApiSchemas;\nuse utoipa::OpenApi;\nuse utoipa::openapi::Tag;\n\nuse crate::cluster_api::ClusterApi;\nuse crate::delete_task_api::DeleteTaskApi;\nuse crate::developer_api::DeveloperApi;\nuse crate::elasticsearch_api::ElasticCompatibleApi;\nuse crate::health_check_api::HealthCheckApi;\nuse crate::index_api::IndexApi;\nuse crate::indexing_api::IndexingApi;\nuse crate::ingest_api::{IngestApi, IngestApiSchemas};\nuse crate::jaeger_api::JaegerApi;\nuse crate::metrics_api::MetricsApi;\nuse crate::node_info_handler::NodeInfoApi;\nuse crate::otlp_api::OtlpApi;\nuse crate::search_api::SearchApi;\nuse crate::template_api::IndexTemplateApi;\n\n/// Builds the OpenApi docs structure using the registered/merged docs.\npub fn build_docs() -> utoipa::openapi::OpenApi {\n    let mut docs_base = utoipa::openapi::OpenApiBuilder::new()\n        .info(\n            utoipa::openapi::InfoBuilder::new()\n                .title(\"Quickwit\")\n                .version(env!(\"CARGO_PKG_VERSION\"))\n                .description(Some(env!(\"CARGO_PKG_DESCRIPTION\")))\n                .license(Some(utoipa::openapi::License::new(env!(\n                    
\"CARGO_PKG_LICENSE\"\n                ))))\n                .contact(Some(\n                    utoipa::openapi::ContactBuilder::new()\n                        .name(Some(\"Quickwit, Inc.\"))\n                        .email(Some(\"hello@quickwit.io\"))\n                        .build(),\n                ))\n                .build(),\n        )\n        .paths(utoipa::openapi::Paths::new())\n        .components(Some(utoipa::openapi::Components::new()))\n        .build();\n\n    // Tags use for grouping and sorting routes.\n    let tags = vec![\n        Tag::new(\"Search\"),\n        Tag::new(\"Indexes\"),\n        Tag::new(\"Ingest\"),\n        Tag::new(\"Delete Tasks\"),\n        Tag::new(\"Node Health\"),\n        Tag::new(\"Sources\"),\n        Tag::new(\"Get Metrics\"),\n        Tag::new(\"Cluster Info\"),\n        Tag::new(\"Node Info\"),\n        Tag::new(\"Indexing\"),\n        Tag::new(\"Splits\"),\n        Tag::new(\"Jaeger\"),\n        Tag::new(\"Open Telemetry\"),\n        Tag::new(\"Debug\"),\n    ];\n    docs_base.tags = Some(tags);\n\n    // Routing\n    docs_base.merge_components_and_paths(ClusterApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(DeleteTaskApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base\n        .merge_components_and_paths(DeveloperApi::openapi().with_path_prefix(\"/api/developer\"));\n    docs_base\n        .merge_components_and_paths(ElasticCompatibleApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(OtlpApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(HealthCheckApi::openapi().with_path_prefix(\"/health\"));\n    docs_base.merge_components_and_paths(IndexApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(IndexingApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(IndexTemplateApi::openapi().with_path_prefix(\"/api/v1\"));\n    
docs_base.merge_components_and_paths(IngestApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(JaegerApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(MetricsApi::openapi().with_path_prefix(\"/metrics\"));\n    docs_base.merge_components_and_paths(NodeInfoApi::openapi().with_path_prefix(\"/api/v1\"));\n    docs_base.merge_components_and_paths(SearchApi::openapi().with_path_prefix(\"/api/v1\"));\n\n    // Schemas\n    docs_base.merge_components_and_paths(MetastoreApiSchemas::openapi());\n    docs_base.merge_components_and_paths(ConfigApiSchemas::openapi());\n    docs_base.merge_components_and_paths(JanitorApiSchemas::openapi());\n    docs_base.merge_components_and_paths(DocMapperApiSchemas::openapi());\n    docs_base.merge_components_and_paths(IndexingApiSchemas::openapi());\n    docs_base.merge_components_and_paths(IngestApiSchemas::openapi());\n\n    docs_base\n}\n\npub trait OpenApiMerger {\n    /// Merges a given [OpenApi] schema into another schema.\n    fn merge_components_and_paths(&mut self, schema: utoipa::openapi::OpenApi);\n\n    /// Modifies all of the paths for a given OpenAPI instance\n    /// and prepends the provided prefix to the paths.\n    fn with_path_prefix(self, path: &str) -> Self;\n}\n\nimpl OpenApiMerger for utoipa::openapi::OpenApi {\n    fn merge_components_and_paths(&mut self, schema: utoipa::openapi::OpenApi) {\n        self.paths.paths.extend(schema.paths.paths);\n\n        if let Some(tags) = &mut self.tags {\n            tags.extend(schema.tags.unwrap_or_default());\n        } else {\n            self.tags = schema.tags;\n        }\n\n        if let Some(components) = &mut self.components {\n            let other_components = schema.components.unwrap_or_default();\n\n            components.responses.extend(other_components.responses);\n            components.schemas.extend(other_components.schemas);\n            components\n                .security_schemes\n   
             .extend(other_components.security_schemes);\n        } else {\n            self.components = schema.components;\n        }\n    }\n\n    fn with_path_prefix(mut self, prefix: &str) -> Self {\n        let paths = mem::take(&mut self.paths.paths);\n        for (path, detail) in paths {\n            // We can panic here as it will be raised during unit tests.\n            assert!(\n                path.starts_with('/'),\n                \"Path {path:?} does not start with `/`.\"\n            );\n\n            let adjusted_path = if path != \"/\" {\n                format!(\"{prefix}{path}\")\n            } else {\n                prefix.to_owned()\n            };\n            self.paths.paths.insert(adjusted_path, detail);\n        }\n\n        self\n    }\n}\n\n#[cfg(test)]\nmod openapi_schema_tests {\n    use std::collections::BTreeSet;\n\n    use itertools::Itertools;\n    use utoipa::openapi::schema::AdditionalProperties;\n    use utoipa::openapi::{RefOr, Schema};\n\n    use super::*;\n\n    #[test]\n    fn ensure_schemas_resolve() {\n        let docs = build_docs();\n        resolve_openapi_schemas(&docs).expect(\"All schemas should be resolved.\");\n    }\n\n    fn resolve_openapi_schemas(openapi: &utoipa::openapi::OpenApi) -> anyhow::Result<()> {\n        let schemas_lookup = if let Some(components) = &openapi.components {\n            resolve_component_schemas(components)?\n        } else {\n            BTreeSet::new()\n        };\n\n        let mut errors = Vec::new();\n        for (path, detail) in openapi.paths.paths.iter() {\n            let path = path.as_str();\n            for (method, operation) in detail.operations.iter() {\n                let method = serde_json::to_string(method).unwrap();\n                let contents = operation\n                    .request_body\n                    .as_ref()\n                    .map(|v| &v.content)\n                    .cloned()\n                    .unwrap_or_default();\n                for (key, 
content) in contents {\n                    let location = match content.schema {\n                        RefOr::Ref(r) => r.ref_location,\n                        RefOr::T(_) => continue,\n                    };\n\n                    if !schemas_lookup.contains(&location) {\n                        let info = format!(\"key:{key:?}\");\n                        errors.push((location, method.clone(), path, info));\n                    }\n                }\n\n                for (status, resp) in operation.responses.responses.iter() {\n                    let location = match resp {\n                        RefOr::Ref(r) => &r.ref_location,\n                        RefOr::T(_) => continue,\n                    };\n\n                    if !schemas_lookup.contains(location) {\n                        let info = format!(\"status:{status}\");\n                        errors.push((location.clone(), method.clone(), path, info));\n                    }\n                }\n\n                for parameter in operation.parameters.as_deref().unwrap_or(&[]) {\n                    let location = match &parameter.schema {\n                        Some(RefOr::Ref(r)) => &r.ref_location,\n                        Some(RefOr::T(schema)) => {\n                            let parent = format!(\"param: {}\", &parameter.name);\n                            check_schema(\n                                &method,\n                                path,\n                                &schemas_lookup,\n                                &mut errors,\n                                &parent,\n                                schema,\n                            );\n                            continue;\n                        }\n                        _ => continue,\n                    };\n\n                    if !schemas_lookup.contains(location) {\n                        let info = format!(\"param:{}\", parameter.name);\n                        errors.push((location.clone(), 
method.clone(), path, info));\n                    }\n                }\n            }\n        }\n\n        if !errors.is_empty() {\n            let errors = errors\n                .into_iter()\n                .map(|(location, method, path, info)| {\n                    format!(\"{method} {path:?} {info} - location: {location}\")\n                })\n                .join(\"\\n\");\n\n            anyhow::bail!(\n                \"failed to resolve schemas, do these types implement `ToSchema`?:\\n\\n{errors}\"\n            )\n        }\n\n        Ok(())\n    }\n\n    /// Builds a lookup set of all of the schemas that can be referenced.\n    fn resolve_component_schemas(\n        components: &utoipa::openapi::Components,\n    ) -> anyhow::Result<BTreeSet<String>> {\n        // Loads the core schemas which is used by most references\n        // This can have references in and of itself however, so we\n        // need to track those to resolve later.\n        let mut schema_lookup = BTreeSet::new();\n        let mut pending_resolved = Vec::new();\n        let mut resolve_once = Vec::new();\n\n        for (schema_item, maybe_ref) in &components.schemas {\n            let path = format!(\"#/components/schemas/{schema_item}\");\n            match maybe_ref {\n                RefOr::Ref(r) => {\n                    pending_resolved.push((path, r.ref_location.clone()));\n                }\n                RefOr::T(schema) => {\n                    resolve_schema(&mut resolve_once, schema_item, schema);\n                    schema_lookup.insert(path);\n                }\n            };\n        }\n\n        for schema_item in components.security_schemes.keys() {\n            let path = format!(\"#/components/securitySchemes/{schema_item}\");\n            schema_lookup.insert(path);\n        }\n\n        // Although responses aren't technically a schema, they can be referenced and contain\n        // references, so it's easier to merge them into one.\n        for 
(schema_item, maybe_ref) in &components.responses {\n            let path = format!(\"#/components/responses/{schema_item}\");\n            match maybe_ref {\n                RefOr::Ref(r) => {\n                    pending_resolved.push((path, r.ref_location.clone()));\n                }\n                RefOr::T(schema) => {\n                    for (_, content) in &schema.content {\n                        if let RefOr::Ref(r) = &content.schema\n                            && !schema_lookup.contains(&r.ref_location)\n                        {\n                            resolve_once.push(CheckResolve::new(\n                                r.ref_location.clone(),\n                                schema_item.clone(),\n                            ));\n                        }\n                    }\n                    schema_lookup.insert(path);\n                }\n            };\n        }\n\n        // Walks through the list of references that need to be resolved.\n        // Technically a reference can lead to a reference, so if one\n        // location is resolved later on, we might then be able to resolve\n        // others, hence the loop.\n        loop {\n            for (path, location) in mem::take(&mut pending_resolved) {\n                if schema_lookup.contains(&location) {\n                    schema_lookup.insert(path);\n                } else {\n                    pending_resolved.push((path, location));\n                }\n            }\n\n            if pending_resolved.is_empty() {\n                break;\n            }\n        }\n\n        let mut failed_to_resolve = Vec::new();\n        for resolve in resolve_once {\n            if !schema_lookup.contains(&resolve.location) {\n                failed_to_resolve.push(resolve);\n            }\n        }\n\n        if !pending_resolved.is_empty() || !failed_to_resolve.is_empty() {\n            let errors_pending = pending_resolved\n                .into_iter()\n                .map(|(path, _)| 
format!(\"{path:?}\"))\n                .join(\"\\n\");\n            let errors_resolve_once = failed_to_resolve\n                .into_iter()\n                .map(|resolve| format!(\"Struct: {:?} - {:?}\", resolve.parent, resolve.location,))\n                .join(\"\\n\");\n            anyhow::bail!(\n                \"failed to resolve schemas for OpenAPI \\\n                 spec:\\n{errors_pending}\\n{errors_resolve_once}\"\n            );\n        }\n\n        Ok(schema_lookup)\n    }\n\n    fn resolve_schema(\n        resolve_once: &mut Vec<CheckResolve>,\n        parent_location: &str,\n        schema: &Schema,\n    ) {\n        match schema {\n            Schema::Array(array) => {\n                let parent = format!(\"{parent_location}.Vec\");\n                match &*array.items {\n                    RefOr::Ref(r) => {\n                        resolve_once.push(CheckResolve::new(r.ref_location.clone(), parent))\n                    }\n                    RefOr::T(schema) => resolve_schema(resolve_once, &parent, schema),\n                }\n            }\n            Schema::Object(object) => {\n                for (key, r) in object.properties.iter() {\n                    let parent = format!(\"{parent_location}.{key}\");\n                    match r {\n                        RefOr::Ref(r) => {\n                            resolve_once.push(CheckResolve::new(r.ref_location.clone(), parent))\n                        }\n                        RefOr::T(schema) => resolve_schema(resolve_once, &parent, schema),\n                    }\n                }\n\n                if let Some(ref props) = object.additional_properties\n                    && let AdditionalProperties::RefOr(ref r) = **props\n                {\n                    match r {\n                        RefOr::Ref(r) => resolve_once.push(CheckResolve::new(\n                            r.ref_location.clone(),\n                            parent_location.to_owned(),\n                       
 )),\n                        RefOr::T(schema) => resolve_schema(resolve_once, parent_location, schema),\n                    }\n                }\n            }\n            Schema::OneOf(one_of) => {\n                let parent = format!(\"{parent_location}.Enum\");\n                for r in &one_of.items {\n                    match r {\n                        RefOr::Ref(r) => resolve_once\n                            .push(CheckResolve::new(r.ref_location.clone(), parent.clone())),\n                        RefOr::T(schema) => resolve_schema(resolve_once, &parent, schema),\n                    }\n                }\n            }\n            Schema::AllOf(all_of) => {\n                for r in &all_of.items {\n                    match r {\n                        RefOr::Ref(r) => resolve_once.push(CheckResolve::new(\n                            r.ref_location.clone(),\n                            parent_location.to_owned(),\n                        )),\n                        RefOr::T(schema) => resolve_schema(resolve_once, parent_location, schema),\n                    }\n                }\n            }\n            _ => unimplemented!(\"Unknown schema variant\"),\n        }\n    }\n\n    fn check_schema<'a>(\n        method: &str,\n        path: &'a str,\n        schemas_lookup: &BTreeSet<String>,\n        errors: &mut Vec<(String, String, &'a str, String)>,\n        parent_location: &str,\n        schema: &Schema,\n    ) {\n        match schema {\n            Schema::Array(array) => {\n                let parent = format!(\"{parent_location}.Vec\");\n                match &*array.items {\n                    RefOr::Ref(r) => {\n                        if !schemas_lookup.contains(&r.ref_location) {\n                            errors.push((parent, method.to_string(), path, String::new()));\n                        }\n                    }\n                    RefOr::T(schema) => {\n                        check_schema(method, path, schemas_lookup, errors, 
&parent, schema)\n                    }\n                }\n            }\n            Schema::Object(object) => {\n                for (key, r) in object.properties.iter() {\n                    let parent = format!(\"{parent_location}.{key}\");\n                    match r {\n                        RefOr::Ref(r) => {\n                            if !schemas_lookup.contains(&r.ref_location) {\n                                errors.push((parent, method.to_string(), path, String::new()));\n                            }\n                        }\n                        RefOr::T(schema) => {\n                            check_schema(method, path, schemas_lookup, errors, &parent, schema)\n                        }\n                    }\n                }\n\n                if let Some(ref props) = object.additional_properties\n                    && let AdditionalProperties::RefOr(ref r) = **props\n                {\n                    match r {\n                        RefOr::Ref(r) => {\n                            if !schemas_lookup.contains(&r.ref_location) {\n                                errors.push((\n                                    parent_location.to_string(),\n                                    method.to_string(),\n                                    path,\n                                    String::new(),\n                                ));\n                            }\n                        }\n                        RefOr::T(schema) => check_schema(\n                            method,\n                            path,\n                            schemas_lookup,\n                            errors,\n                            parent_location,\n                            schema,\n                        ),\n                    }\n                }\n            }\n            Schema::OneOf(one_of) => {\n                let parent = format!(\"{parent_location}.Enum\");\n                for r in &one_of.items {\n                    match 
r {\n                        RefOr::Ref(r) => {\n                            if !schemas_lookup.contains(&r.ref_location) {\n                                errors.push((\n                                    parent.clone(),\n                                    method.to_string(),\n                                    path,\n                                    String::new(),\n                                ));\n                            }\n                        }\n                        RefOr::T(schema) => {\n                            check_schema(method, path, schemas_lookup, errors, &parent, schema)\n                        }\n                    }\n                }\n            }\n            Schema::AllOf(all_of) => {\n                for r in &all_of.items {\n                    match r {\n                        RefOr::Ref(r) => {\n                            let (_, type_name) = r.ref_location.rsplit_once('/').unwrap();\n                            let parent = format!(\"{parent_location}.{type_name}\");\n                            if !schemas_lookup.contains(&r.ref_location) {\n                                errors.push((parent, method.to_string(), path, String::new()));\n                            }\n                        }\n                        RefOr::T(schema) => check_schema(\n                            method,\n                            path,\n                            schemas_lookup,\n                            errors,\n                            parent_location,\n                            schema,\n                        ),\n                    }\n                }\n            }\n            _ => unimplemented!(\"Unknown schema variant\"),\n        }\n    }\n\n    struct CheckResolve {\n        location: String,\n        parent: String,\n    }\n\n    impl CheckResolve {\n        fn new(location: String, parent: String) -> Self {\n            Self { location, parent }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/otlp_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod rest_handler;\npub use rest_handler::OtlpApi;\npub(crate) use rest_handler::otlp_ingest_api_handlers;\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/otlp_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_common::rate_limited_error;\nuse quickwit_opentelemetry::otlp::{OtelSignal, OtlpGrpcLogsService, OtlpGrpcTracesService};\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::logs_service_server::LogsService;\nuse quickwit_proto::opentelemetry::proto::collector::logs::v1::{\n    ExportLogsServiceRequest, ExportLogsServiceResponse,\n};\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::trace_service_server::TraceService;\nuse quickwit_proto::opentelemetry::proto::collector::trace::v1::{\n    ExportTraceServiceRequest, ExportTraceServiceResponse,\n};\nuse quickwit_proto::types::IndexId;\nuse quickwit_proto::{ServiceError, ServiceErrorCode, tonic};\nuse serde::{self, Serialize};\nuse warp::{Filter, Rejection};\n\nuse crate::decompression::get_body_bytes;\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::{Body, BodyFormat, require, with_arg};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(paths(\n    otlp_default_logs_handler,\n    otlp_logs_handler,\n    otlp_default_traces_handler,\n    otlp_ingest_traces_handler\n))]\npub struct OtlpApi;\n\n/// Setup OpenTelemetry API handlers.\npub(crate) fn otlp_ingest_api_handlers(\n    otlp_logs_service: Option<OtlpGrpcLogsService>,\n    otlp_traces_service: Option<OtlpGrpcTracesService>,\n) -> impl Filter<Extract = (impl 
warp::Reply,), Error = Rejection> + Clone {\n    otlp_default_logs_handler(otlp_logs_service.clone())\n        .or(otlp_default_traces_handler(otlp_traces_service.clone()).recover(recover_fn))\n        .or(otlp_logs_handler(otlp_logs_service).recover(recover_fn))\n        .or(otlp_ingest_traces_handler(otlp_traces_service).recover(recover_fn))\n        .boxed()\n}\n\n/// Open Telemetry REST/Protobuf logs ingest endpoint.\n#[utoipa::path(\n    post,\n    tag = \"Open Telemetry\",\n    path = \"/otlp/v1/logs\",\n    request_body(content = String, description = \"`ExportLogsServiceRequest` protobuf message\", content_type = \"application/x-protobuf\"),\n    responses(\n        (status = 200, description = \"Successfully exported logs.\", body = ExportLogsServiceResponse)\n    ),\n)]\npub(crate) fn otlp_default_logs_handler(\n    otlp_logs_service: Option<OtlpGrpcLogsService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    require(otlp_logs_service)\n        .and(warp::path!(\"otlp\" / \"v1\" / \"logs\"))\n        .and(warp::header::exact_ignore_case(\n            \"content-type\",\n            \"application/x-protobuf\",\n        ))\n        .and(warp::header::optional::<String>(\n            OtelSignal::Logs.header_name(),\n        ))\n        .and(warp::post())\n        .and(get_body_bytes())\n        .then(\n            |otlp_logs_service, index_id: Option<String>, body| async move {\n                let index_id =\n                    index_id.unwrap_or_else(|| OtelSignal::Logs.default_index_id().to_string());\n                otlp_ingest_logs(otlp_logs_service, index_id, body).await\n            },\n        )\n        .and(with_arg(BodyFormat::default()))\n        .map(into_rest_api_response)\n        .boxed()\n}\n/// Open Telemetry REST/Protobuf logs ingest endpoint.\n#[utoipa::path(\n    post,\n    tag = \"Open Telemetry\",\n    path = \"/{index}/otlp/v1/logs\",\n    request_body(content = String, description = 
\"`ExportLogsServiceRequest` protobuf message\", content_type = \"application/x-protobuf\"),\n    responses(\n        (status = 200, description = \"Successfully exported logs.\", body = ExportLogsServiceResponse)\n    ),\n)]\npub(crate) fn otlp_logs_handler(\n    otlp_log_service: Option<OtlpGrpcLogsService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    require(otlp_log_service)\n        .and(warp::path!(String / \"otlp\" / \"v1\" / \"logs\"))\n        .and(warp::header::exact_ignore_case(\n            \"content-type\",\n            \"application/x-protobuf\",\n        ))\n        .and(warp::post())\n        .and(get_body_bytes())\n        .then(otlp_ingest_logs)\n        .and(with_arg(BodyFormat::default()))\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n/// Open Telemetry REST/Protobuf traces ingest endpoint.\n#[utoipa::path(\n    post,\n    tag = \"Open Telemetry\",\n    path = \"/otlp/v1/traces\",\n    request_body(content = String, description = \"`ExportTraceServiceRequest` protobuf message\", content_type = \"application/x-protobuf\"),\n    responses(\n        (status = 200, description = \"Successfully exported traces.\", body = ExportTraceServiceResponse)\n    ),\n)]\npub(crate) fn otlp_default_traces_handler(\n    otlp_traces_service: Option<OtlpGrpcTracesService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    require(otlp_traces_service)\n        .and(warp::path!(\"otlp\" / \"v1\" / \"traces\"))\n        .and(warp::header::exact_ignore_case(\n            \"content-type\",\n            \"application/x-protobuf\",\n        ))\n        .and(warp::header::optional::<String>(\n            OtelSignal::Traces.header_name(),\n        ))\n        .and(warp::post())\n        .and(get_body_bytes())\n        .then(\n            |otlp_traces_service, index_id: Option<String>, body| async move {\n                let index_id =\n                    index_id.unwrap_or_else(|| 
OtelSignal::Traces.default_index_id().to_string());\n                otlp_ingest_traces(otlp_traces_service, index_id, body).await\n            },\n        )\n        .and(with_arg(BodyFormat::default()))\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n/// Open Telemetry REST/Protobuf traces ingest endpoint.\n#[utoipa::path(\n    post,\n    tag = \"Open Telemetry\",\n    path = \"/{index}/otlp/v1/traces\",\n    request_body(content = String, description = \"`ExportTraceServiceRequest` protobuf message\", content_type = \"application/x-protobuf\"),\n    responses(\n        (status = 200, description = \"Successfully exported traces.\", body = ExportTraceServiceResponse)\n    ),\n)]\npub(crate) fn otlp_ingest_traces_handler(\n    otlp_traces_service: Option<OtlpGrpcTracesService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    require(otlp_traces_service)\n        .and(warp::path!(String / \"otlp\" / \"v1\" / \"traces\"))\n        .and(warp::header::exact_ignore_case(\n            \"content-type\",\n            \"application/x-protobuf\",\n        ))\n        .and(warp::post())\n        .and(get_body_bytes())\n        .then(otlp_ingest_traces)\n        .and(with_arg(BodyFormat::default()))\n        .map(into_rest_api_response)\n        .boxed()\n}\n\n#[derive(Debug, Clone, thiserror::Error, Serialize)]\npub enum OtlpApiError {\n    #[error(\"invalid OTLP request: {0}\")]\n    InvalidPayload(String),\n    #[error(\"error when ingesting payload: {0}\")]\n    Ingest(String),\n}\n\nimpl ServiceError for OtlpApiError {\n    fn error_code(&self) -> ServiceErrorCode {\n        match self {\n            OtlpApiError::InvalidPayload(_) => ServiceErrorCode::BadRequest,\n            OtlpApiError::Ingest(err_msg) => {\n                rate_limited_error!(limit_per_min = 6, \"otlp internal error: {err_msg}\");\n                ServiceErrorCode::Internal\n            }\n        }\n    }\n}\n\nasync fn otlp_ingest_logs(\n    
otlp_logs_service: OtlpGrpcLogsService,\n    index_id: IndexId,\n    body: Body,\n) -> Result<ExportLogsServiceResponse, OtlpApiError> {\n    let export_logs_request: ExportLogsServiceRequest =\n        prost::Message::decode(&body.content[..])\n            .map_err(|err| OtlpApiError::InvalidPayload(err.to_string()))?;\n    let mut request = tonic::Request::new(export_logs_request);\n    let index = index_id\n        .try_into()\n        .map_err(|_| OtlpApiError::InvalidPayload(\"invalid index id\".to_string()))?;\n    request\n        .metadata_mut()\n        .insert(OtelSignal::Logs.header_name(), index);\n    let result = otlp_logs_service\n        .export(request)\n        .await\n        .map_err(|err| OtlpApiError::Ingest(err.to_string()))?;\n    Ok(result.into_inner())\n}\n\nasync fn otlp_ingest_traces(\n    otlp_traces_service: OtlpGrpcTracesService,\n    index_id: IndexId,\n    body: Body,\n) -> Result<ExportTraceServiceResponse, OtlpApiError> {\n    let export_traces_request: ExportTraceServiceRequest =\n        prost::Message::decode(&body.content[..])\n            .map_err(|err| OtlpApiError::InvalidPayload(err.to_string()))?;\n    let mut request = tonic::Request::new(export_traces_request);\n    let index = index_id\n        .try_into()\n        .map_err(|_| OtlpApiError::InvalidPayload(\"invalid index id\".to_string()))?;\n    request\n        .metadata_mut()\n        .insert(OtelSignal::Traces.header_name(), index);\n    let response = otlp_traces_service\n        .export(request)\n        .await\n        .map_err(|err| OtlpApiError::Ingest(err.to_string()))?;\n    Ok(response.into_inner())\n}\n\n#[cfg(test)]\nmod tests {\n    use std::io::Write;\n\n    use flate2::Compression;\n    use flate2::write::GzEncoder;\n    use prost::Message;\n    use quickwit_ingest::CommitType;\n    use quickwit_opentelemetry::otlp::{\n        OtlpGrpcLogsService, OtlpGrpcTracesService, make_resource_spans_for_test,\n    };\n    use quickwit_proto::ingest::router::{\n 
       IngestResponseV2, IngestRouterServiceClient, IngestSuccess, MockIngestRouterService,\n    };\n    use quickwit_proto::opentelemetry::proto::collector::logs::v1::{\n        ExportLogsServiceRequest, ExportLogsServiceResponse,\n    };\n    use quickwit_proto::opentelemetry::proto::collector::trace::v1::{\n        ExportTraceServiceRequest, ExportTraceServiceResponse,\n    };\n    use quickwit_proto::opentelemetry::proto::logs::v1::{LogRecord, ResourceLogs, ScopeLogs};\n    use quickwit_proto::opentelemetry::proto::resource::v1::Resource;\n    use warp::Filter;\n\n    use super::otlp_ingest_api_handlers;\n    use crate::rest::recover_fn;\n\n    fn compress(body: &[u8]) -> Vec<u8> {\n        let mut encoder = GzEncoder::new(Vec::new(), Compression::default());\n        encoder.write_all(body).expect(\"Failed to write to encoder\");\n        encoder.finish().expect(\"Failed to finish compression\")\n    }\n\n    #[tokio::test]\n    async fn test_otlp_ingest_logs_handler() {\n        let mut mock_ingest_router = MockIngestRouterService::new();\n        mock_ingest_router\n            .expect_ingest()\n            .times(2)\n            .withf(|request| {\n                if request.subrequests.len() == 1 {\n                    let subrequest = &request.subrequests[0];\n                    subrequest.doc_batch.is_some()\n                    // && request.commit == CommitType::Auto as i32\n                    && subrequest.doc_batch.as_ref().unwrap().doc_lengths.len() == 1\n                    && subrequest.index_id == quickwit_opentelemetry::otlp::OTEL_LOGS_INDEX_ID\n                } else {\n                    false\n                }\n            })\n            .returning(|_| {\n                Ok(IngestResponseV2 {\n                    successes: vec![IngestSuccess {\n                        num_ingested_docs: 1,\n                        ..Default::default()\n                    }],\n                    failures: Vec::new(),\n                })\n            
});\n        mock_ingest_router\n            .expect_ingest()\n            .times(2)\n            .withf(|request| {\n                if request.subrequests.len() == 1 {\n                    let subrequest = &request.subrequests[0];\n                    subrequest.doc_batch.is_some()\n                    // && request.commit == CommitType::Auto as i32\n                    && subrequest.doc_batch.as_ref().unwrap().doc_lengths.len() == 1\n                    && subrequest.index_id == \"otel-logs-v0_6\"\n                } else {\n                    false\n                }\n            })\n            .returning(|_| {\n                Ok(IngestResponseV2 {\n                    successes: vec![IngestSuccess {\n                        num_ingested_docs: 1,\n                        ..Default::default()\n                    }],\n                    failures: Vec::new(),\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let logs_service = OtlpGrpcLogsService::new(ingest_router.clone());\n        let traces_service = OtlpGrpcTracesService::new(ingest_router, Some(CommitType::Force));\n        let export_logs_request = ExportLogsServiceRequest {\n            resource_logs: vec![ResourceLogs {\n                resource: Some(Resource {\n                    attributes: Vec::new(),\n                    dropped_attributes_count: 0,\n                }),\n                scope_logs: vec![ScopeLogs {\n                    log_records: vec![LogRecord {\n                        body: None,\n                        attributes: Vec::new(),\n                        dropped_attributes_count: 0,\n                        time_unix_nano: 1704036033047000000,\n                        severity_number: 0,\n                        severity_text: \"ERROR\".to_string(),\n                        span_id: Vec::new(),\n                        trace_id: Vec::new(),\n                        flags: 0,\n                   
     observed_time_unix_nano: 0,\n                    }],\n                    scope: None,\n                    schema_url: \"\".to_string(),\n                }],\n                schema_url: \"\".to_string(),\n            }],\n        };\n        let body = export_logs_request.encode_to_vec();\n        let otlp_traces_api_handler =\n            otlp_ingest_api_handlers(Some(logs_service), Some(traces_service)).recover(recover_fn);\n        {\n            // Test default otlp endpoint\n            let resp = warp::test::request()\n                .path(\"/otlp/v1/logs\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .body(body.clone())\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportLogsServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(\n                actual_response\n                    .partial_success\n                    .unwrap()\n                    .rejected_log_records,\n                0\n            );\n        }\n        {\n            // Test default otlp endpoint with compression\n            let resp = warp::test::request()\n                .path(\"/otlp/v1/logs\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .header(\"content-encoding\", \"gzip\")\n                .body(compress(&body))\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportLogsServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(\n                actual_response\n                    
.partial_success\n                    .unwrap()\n                    .rejected_log_records,\n                0\n            );\n        }\n        {\n            // Test endpoint with index ID through header\n            let resp = warp::test::request()\n                .path(\"/otlp/v1/logs\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .header(\"qw-otel-logs-index\", \"otel-logs-v0_6\")\n                .body(body.clone())\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportLogsServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(\n                actual_response\n                    .partial_success\n                    .unwrap()\n                    .rejected_log_records,\n                0\n            );\n        }\n        {\n            // Test endpoint with given index ID through path.\n            let resp = warp::test::request()\n                .path(\"/otel-logs-v0_6/otlp/v1/logs\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .body(body.clone())\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportLogsServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(\n                actual_response\n                    .partial_success\n                    .unwrap()\n                    .rejected_log_records,\n                0\n            );\n        }\n    }\n\n    #[tokio::test]\n    async fn test_otlp_ingest_traces_handler() {\n        let mut mock_ingest_router = 
MockIngestRouterService::new();\n        mock_ingest_router\n            .expect_ingest()\n            .times(2)\n            .withf(|request| {\n                if request.subrequests.len() == 1 {\n                    let subrequest = &request.subrequests[0];\n                    subrequest.doc_batch.is_some()\n                    // && request.commit == CommitType::Auto as i32\n                    && subrequest.doc_batch.as_ref().unwrap().doc_lengths.len() == 5\n                    && subrequest.index_id == quickwit_opentelemetry::otlp::OTEL_TRACES_INDEX_ID\n                } else {\n                    false\n                }\n            })\n            .returning(|_| {\n                Ok(IngestResponseV2 {\n                    successes: vec![IngestSuccess {\n                        num_ingested_docs: 1,\n                        ..Default::default()\n                    }],\n                    failures: Vec::new(),\n                })\n            });\n        mock_ingest_router\n            .expect_ingest()\n            .times(2)\n            .withf(|request| {\n                if request.subrequests.len() == 1 {\n                    let subrequest = &request.subrequests[0];\n                    subrequest.doc_batch.is_some()\n                    // && request.commit == CommitType::Auto as i32\n                    && subrequest.doc_batch.as_ref().unwrap().doc_lengths.len() == 5\n                    && subrequest.index_id == \"otel-traces-v0_6\"\n                } else {\n                    false\n                }\n            })\n            .returning(|_| {\n                Ok(IngestResponseV2 {\n                    successes: vec![IngestSuccess {\n                        num_ingested_docs: 1,\n                        ..Default::default()\n                    }],\n                    failures: Vec::new(),\n                })\n            });\n        let ingest_router = IngestRouterServiceClient::from_mock(mock_ingest_router);\n        let logs_service 
= OtlpGrpcLogsService::new(ingest_router.clone());\n        let traces_service = OtlpGrpcTracesService::new(ingest_router, Some(CommitType::Force));\n        let export_trace_request = ExportTraceServiceRequest {\n            resource_spans: make_resource_spans_for_test(),\n        };\n        let body = export_trace_request.encode_to_vec();\n        let otlp_traces_api_handler =\n            otlp_ingest_api_handlers(Some(logs_service), Some(traces_service)).recover(recover_fn);\n        {\n            // Test default otlp endpoint\n            let resp = warp::test::request()\n                .path(\"/otlp/v1/traces\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .body(body.clone())\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportTraceServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(actual_response.partial_success.unwrap().rejected_spans, 0);\n        }\n        {\n            // Test default otlp endpoint with compression\n            let resp = warp::test::request()\n                .path(\"/otlp/v1/traces\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .header(\"content-encoding\", \"gzip\")\n                .body(compress(&body))\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportTraceServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(actual_response.partial_success.unwrap().rejected_spans, 0);\n        }\n        {\n            // Test endpoint with 
given index ID through header.\n            let resp = warp::test::request()\n                .path(\"/otlp/v1/traces\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .header(\"qw-otel-traces-index\", \"otel-traces-v0_6\")\n                .body(body.clone())\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportTraceServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(actual_response.partial_success.unwrap().rejected_spans, 0);\n        }\n        {\n            // Test endpoint with given index ID through path.\n            let resp = warp::test::request()\n                .path(\"/otel-traces-v0_6/otlp/v1/traces\")\n                .method(\"POST\")\n                .header(\"content-type\", \"application/x-protobuf\")\n                .body(body)\n                .reply(&otlp_traces_api_handler)\n                .await;\n            assert_eq!(resp.status(), 200);\n            let actual_response: ExportTraceServiceResponse =\n                serde_json::from_slice(resp.body()).unwrap();\n            assert!(actual_response.partial_success.is_some());\n            assert_eq!(actual_response.partial_success.unwrap().rejected_spans, 0);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/rate_modulator.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse quickwit_common::tower::{ConstantRate, Rate};\nuse quickwit_ingest::MemoryCapacity;\n\n#[derive(Clone)]\npub struct RateModulator<R> {\n    rate_estimator: R,\n    memory_capacity: MemoryCapacity,\n    min_rate: ConstantRate,\n}\n\nimpl<R> RateModulator<R>\nwhere R: Rate\n{\n    /// Creates a new [`RateModulator`] instance.\n    ///\n    /// # Panics\n    ///\n    /// Panics if `rate_estimator` and `min_rate` have different periods.\n    pub fn new(rate_estimator: R, memory_capacity: MemoryCapacity, min_rate: ConstantRate) -> Self {\n        assert_eq!(\n            rate_estimator.period(),\n            min_rate.period(),\n            \"Rate estimator and min rate periods must be equal.\"\n        );\n\n        Self {\n            rate_estimator,\n            memory_capacity,\n            min_rate,\n        }\n    }\n}\n\nimpl<R> Rate for RateModulator<R>\nwhere R: Rate\n{\n    fn work(&self) -> u64 {\n        let memory_usage_ratio = self.memory_capacity.usage_ratio();\n        let work = self.rate_estimator.work().max(self.min_rate.work());\n\n        if memory_usage_ratio < 0.25 {\n            work * 2\n        } else if memory_usage_ratio > 0.99 {\n            work / 32\n        } else if memory_usage_ratio > 0.98 {\n            work / 16\n        } else if memory_usage_ratio > 0.95 {\n            work / 8\n  
      } else if memory_usage_ratio > 0.90 {\n            work / 4\n        } else if memory_usage_ratio > 0.80 {\n            work / 2\n        } else if memory_usage_ratio > 0.70 {\n            work * 2 / 3\n        } else {\n            work\n        }\n    }\n\n    fn period(&self) -> Duration {\n        self.rate_estimator.period()\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/rest.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt::Formatter;\nuse std::io;\nuse std::sync::Arc;\n\nuse hyper_util::rt::{TokioExecutor, TokioIo};\nuse hyper_util::server::conn::auto::Builder;\nuse hyper_util::service::TowerToHyperService;\nuse quickwit_common::tower::BoxFutureInfaillible;\nuse quickwit_config::{disable_ingest_v1, enable_ingest_v2};\nuse quickwit_search::SearchService;\nuse tokio::io::{AsyncRead, AsyncWrite};\nuse tokio::net::{TcpListener, TcpStream};\nuse tokio_rustls::TlsAcceptor;\nuse tokio_util::either::Either;\nuse tower::ServiceBuilder;\nuse tower_http::compression::CompressionLayer;\nuse tower_http::compression::predicate::{NotForContentType, Predicate, SizeAbove};\nuse tower_http::cors::{AllowOrigin, CorsLayer};\nuse tracing::{error, info};\nuse warp::filters::log::Info;\nuse warp::hyper::http::HeaderValue;\nuse warp::hyper::{Method, StatusCode, http};\nuse warp::{Filter, Rejection, Reply, redirect};\n\nuse crate::cluster_api::cluster_handler;\nuse crate::decompression::{CorruptedData, UnsupportedEncoding};\nuse crate::delete_task_api::delete_task_api_handlers;\nuse crate::developer_api::developer_api_routes;\nuse crate::elasticsearch_api::elastic_api_handlers;\nuse crate::health_check_api::health_check_handlers;\nuse crate::index_api::index_management_handlers;\nuse crate::indexing_api::indexing_get_handler;\nuse 
crate::ingest_api::ingest_api_handlers;\nuse crate::jaeger_api::jaeger_api_handlers;\nuse crate::metrics_api::metrics_handler;\nuse crate::node_info_handler::node_info_handler;\nuse crate::otlp_api::otlp_ingest_api_handlers;\nuse crate::rest_api_response::{RestApiError, RestApiResponse};\nuse crate::search_api::{\n    search_get_handler, search_plan_get_handler, search_plan_post_handler, search_post_handler,\n};\nuse crate::template_api::index_template_api_handlers;\nuse crate::ui_handler::ui_handler;\nuse crate::{BodyFormat, BuildInfo, QuickwitServices, RuntimeInfo};\n\n#[derive(Debug)]\npub(crate) struct InvalidJsonRequest(pub serde_json::Error);\n\nimpl warp::reject::Reject for InvalidJsonRequest {}\n\n#[derive(Debug)]\npub(crate) struct InvalidArgument(pub String);\n\nimpl warp::reject::Reject for InvalidArgument {}\n\n#[derive(Debug)]\npub struct TooManyRequests;\n\nimpl warp::reject::Reject for TooManyRequests {}\n\nimpl std::fmt::Display for TooManyRequests {\n    fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {\n        write!(f, \"too many requests\")\n    }\n}\n\n/// Env variable key to define the minimum size above which a response should be compressed.\n/// If unset, no compression is applied.\nconst QW_MINIMUM_COMPRESSION_SIZE_KEY: &str = \"QW_MINIMUM_COMPRESSION_SIZE\";\n\n#[derive(Clone, Copy)]\nstruct CompressionPredicate {\n    size_above_opt: Option<SizeAbove>,\n}\n\nimpl CompressionPredicate {\n    fn from_env() -> CompressionPredicate {\n        let minimum_compression_size_opt: Option<u16> =\n            quickwit_common::get_from_env_opt::<usize>(QW_MINIMUM_COMPRESSION_SIZE_KEY, false).map(\n                |minimum_compression_size: usize| {\n                    u16::try_from(minimum_compression_size).unwrap_or(u16::MAX)\n                },\n            );\n        let size_above_opt = minimum_compression_size_opt.map(SizeAbove::new);\n        CompressionPredicate { size_above_opt }\n    }\n}\n\nimpl Predicate for CompressionPredicate 
{\n    fn should_compress<B>(&self, response: &http::Response<B>) -> bool\n    where B: http_body::Body {\n        if let Some(size_above) = self.size_above_opt {\n            size_above.should_compress(response)\n        } else {\n            false\n        }\n    }\n}\n\nasync fn apply_tls_if_necessary(\n    tcp_stream: TcpStream,\n    tls_acceptor_opt: &Option<TlsAcceptor>,\n) -> io::Result<impl AsyncRead + AsyncWrite + Unpin + 'static> {\n    let Some(tls_acceptor) = &tls_acceptor_opt else {\n        return Ok(Either::Right(tcp_stream));\n    };\n    let tls_stream_res = tls_acceptor\n        .accept(tcp_stream)\n        .await\n        .inspect_err(|err| error!(\"failed to perform tls handshake: {err:#}\"))?;\n    Ok(Either::Left(tls_stream_res))\n}\n\n/// Starts REST services.\npub(crate) async fn start_rest_server(\n    tcp_listener: TcpListener,\n    quickwit_services: Arc<QuickwitServices>,\n    readiness_trigger: BoxFutureInfaillible<()>,\n    shutdown_signal: BoxFutureInfaillible<()>,\n) -> anyhow::Result<()> {\n    let request_counter = warp::log::custom(|info: Info| {\n        let elapsed = info.elapsed();\n        let status = info.status();\n        let label_values: [&str; 2] = [info.method().as_str(), status.as_str()];\n        crate::SERVE_METRICS\n            .request_duration_secs\n            .with_label_values(label_values)\n            .observe(elapsed.as_secs_f64());\n        crate::SERVE_METRICS\n            .http_requests_total\n            .with_label_values(label_values)\n            .inc();\n    });\n    // Docs routes\n    let api_doc = warp::path(\"openapi.json\")\n        .and(warp::get())\n        .map(|| warp::reply::json(&crate::openapi::build_docs()))\n        .recover(recover_fn)\n        .boxed();\n\n    // `/health/*` routes.\n    let health_check_routes = health_check_handlers(\n        quickwit_services.cluster.clone(),\n        quickwit_services.indexing_service_opt.clone(),\n        
quickwit_services.janitor_service_opt.clone(),\n    )\n    .boxed();\n\n    // `/metrics` route.\n    let metrics_routes = warp::path(\"metrics\")\n        .and(warp::get())\n        .map(metrics_handler)\n        .recover(recover_fn)\n        .boxed();\n\n    // `/api/developer/*` route.\n    let developer_routes = developer_api_routes(\n        quickwit_services.cluster.clone(),\n        quickwit_services.env_filter_reload_fn.clone(),\n    )\n    .boxed();\n\n    // `/api/v1/*` routes.\n    let api_v1_root_route = api_v1_routes(quickwit_services.clone());\n\n    let redirect_root_to_ui_route = warp::path::end()\n        .and(warp::get())\n        .map(|| redirect(http::Uri::from_static(\"/ui/search\")))\n        .recover(recover_fn)\n        .boxed();\n\n    let extra_headers = warp::reply::with::headers(\n        quickwit_services\n            .node_config\n            .rest_config\n            .extra_headers\n            .clone(),\n    );\n\n    // Combine all the routes together.\n    let rest_routes = api_v1_root_route\n        .or(api_doc)\n        .or(redirect_root_to_ui_route)\n        .or(ui_handler())\n        .or(health_check_routes)\n        .or(metrics_routes)\n        .or(developer_routes)\n        .with(request_counter)\n        .recover(recover_fn_final)\n        .with(extra_headers)\n        .boxed();\n\n    let warp_service = warp::service(rest_routes);\n    let compression_predicate = CompressionPredicate::from_env().and(NotForContentType::IMAGES);\n    let cors = build_cors(&quickwit_services.node_config.rest_config.cors_allow_origins);\n\n    let service = ServiceBuilder::new()\n        .layer(\n            CompressionLayer::new()\n                .zstd(true)\n                .gzip(true)\n                .quality(tower_http::CompressionLevel::Fastest)\n                .compress_when(compression_predicate),\n        )\n        .layer(cors)\n        .service(warp_service);\n\n    let rest_listen_addr = tcp_listener.local_addr()?;\n    info!(\n   
     rest_listen_addr=?rest_listen_addr,\n        \"starting REST server listening on {rest_listen_addr}\"\n    );\n\n    let service = TowerToHyperService::new(service);\n\n    let server = Builder::new(TokioExecutor::new());\n    let graceful = hyper_util::server::graceful::GracefulShutdown::new();\n    let mut shutdown_signal = std::pin::pin!(shutdown_signal);\n    readiness_trigger.await;\n\n    let tls_acceptor_opt: Option<TlsAcceptor> =\n        if let Some(tls_config) = &quickwit_services.node_config.rest_config.tls {\n            let rustls_config = tls::make_rustls_config(tls_config)?;\n            Some(TlsAcceptor::from(rustls_config))\n        } else {\n            None\n        };\n\n    loop {\n        tokio::select! {\n            tcp_accept_res = tcp_listener.accept() => {\n                let tcp_stream = match tcp_accept_res {\n                    Ok((tcp_stream, _remote_addr)) => tcp_stream,\n                    Err(err) => {\n                        error!(\"failed to accept connection: {err:#}\");\n                        continue;\n                    }\n                };\n\n                let Ok(tcp_or_tls_stream) = apply_tls_if_necessary(tcp_stream, &tls_acceptor_opt).await else {\n                    continue;\n                };\n\n                let serve_fut = server.serve_connection_with_upgrades(TokioIo::new(tcp_or_tls_stream), service.clone());\n                let serve_with_shutdown_fut = graceful.watch(serve_fut.into_owned());\n                tokio::spawn(async move {\n                    if let Err(err) = serve_with_shutdown_fut.await {\n                        error!(\"failed to serve connection: {err:#}\");\n                    }\n                });\n            },\n            _ = &mut shutdown_signal => {\n                info!(\"REST server shutdown signal received\");\n                break;\n            }\n        }\n    }\n\n    graceful.shutdown().await;\n    info!(\"gracefully shutdown\");\n\n    Ok(())\n}\n\nfn 
search_routes(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    search_get_handler(search_service.clone())\n        .or(search_post_handler(search_service.clone()))\n        .or(search_plan_get_handler(search_service.clone()))\n        .or(search_plan_post_handler(search_service.clone()))\n        .recover(recover_fn)\n        .boxed()\n}\n\nfn api_v1_routes(\n    quickwit_services: Arc<QuickwitServices>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    let api_v1_root_url = warp::path!(\"api\" / \"v1\" / ..);\n    api_v1_root_url.and(\n        elastic_api_handlers(\n            quickwit_services.cluster.clone(),\n            quickwit_services.node_config.clone(),\n            quickwit_services.search_service.clone(),\n            quickwit_services.ingest_service.clone(),\n            quickwit_services.ingest_router_service.clone(),\n            quickwit_services.metastore_client.clone(),\n            quickwit_services.index_manager.clone(),\n            !disable_ingest_v1(),\n            enable_ingest_v2(),\n        )\n        .or(cluster_handler(quickwit_services.cluster.clone()))\n        .boxed()\n        .or(node_info_handler(\n            BuildInfo::get(),\n            RuntimeInfo::get(),\n            quickwit_services.node_config.clone(),\n        ))\n        .boxed()\n        .or(indexing_get_handler(\n            quickwit_services.indexing_service_opt.clone(),\n        ))\n        .boxed()\n        .or(search_routes(quickwit_services.search_service.clone()))\n        .boxed()\n        .or(ingest_api_handlers(\n            quickwit_services.ingest_router_service.clone(),\n            quickwit_services.ingest_service.clone(),\n            quickwit_services.node_config.ingest_api_config.clone(),\n            !disable_ingest_v1(),\n            enable_ingest_v2(),\n        ))\n        .boxed()\n        .or(otlp_ingest_api_handlers(\n            
quickwit_services.otlp_logs_service_opt.clone(),\n            quickwit_services.otlp_traces_service_opt.clone(),\n        ))\n        .boxed()\n        .or(index_management_handlers(\n            quickwit_services.index_manager.clone(),\n            quickwit_services.node_config.clone(),\n        ))\n        .boxed()\n        .or(delete_task_api_handlers(\n            quickwit_services.metastore_client.clone(),\n        ))\n        .boxed()\n        .or(jaeger_api_handlers(\n            quickwit_services.jaeger_service_opt.clone(),\n        ))\n        .boxed()\n        .or(index_template_api_handlers(\n            quickwit_services.metastore_client.clone(),\n        ))\n        .boxed(),\n    )\n}\n\n/// This function returns a formatted error based on the given rejection reason.\n///\n/// The ordering of rejection processing is very important: we need to start\n/// with the most specific rejections and end with the most generic. If not, Quickwit\n/// will return useless errors to the user.\n// TODO: we may want to revamp rejections in the future, as our usage does not exactly\n// match rejection behaviour. When a filter returns a rejection, it means that it\n// did not match, but maybe another filter can. Consequently warp will continue\n// to try to match other filters. 
Once a filter is matched, we can enter into\n// our own logic and return a proper reply.\n// More on this here: https://github.com/seanmonstar/warp/issues/388.\n// We may use this work once the PR is merged: https://github.com/seanmonstar/warp/pull/909.\npub async fn recover_fn(rejection: Rejection) -> Result<impl Reply, Rejection> {\n    let error = get_status_with_error(rejection)?;\n    let status_code = error.status_code;\n    Ok(RestApiResponse::new::<(), _>(\n        &Err(error),\n        status_code,\n        BodyFormat::default(),\n    ))\n}\n\npub async fn recover_fn_final(rejection: Rejection) -> Result<impl Reply, Rejection> {\n    let error = get_status_with_error(rejection).unwrap_or_else(|rejection: Rejection| {\n        if rejection.is_not_found() {\n            RestApiError {\n                status_code: StatusCode::NOT_FOUND,\n                message: \"Route not found\".to_string(),\n            }\n        } else {\n            error!(\"REST server error: {:?}\", rejection);\n            RestApiError {\n                status_code: StatusCode::INTERNAL_SERVER_ERROR,\n                message: \"internal server error\".to_string(),\n            }\n        }\n    });\n    let status_code = error.status_code;\n    Ok(RestApiResponse::new::<(), _>(\n        &Err(error),\n        status_code,\n        BodyFormat::default(),\n    ))\n}\n\nfn get_status_with_error(rejection: Rejection) -> Result<RestApiError, Rejection> {\n    if let Some(error) = rejection.find::<crate::format::UnsupportedMediaType>() {\n        Ok(RestApiError {\n            status_code: StatusCode::UNSUPPORTED_MEDIA_TYPE,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<serde_qs::Error>() {\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<InvalidJsonRequest>() {\n        // Happens when the request body 
could not be deserialized correctly.\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.0.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::filters::body::BodyDeserializeError>() {\n        // Happens when the request body could not be deserialized correctly.\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::reject::UnsupportedMediaType>() {\n        Ok(RestApiError {\n            status_code: StatusCode::UNSUPPORTED_MEDIA_TYPE,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<UnsupportedEncoding>() {\n        Ok(RestApiError {\n            status_code: StatusCode::UNSUPPORTED_MEDIA_TYPE,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<CorruptedData>() {\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::reject::InvalidQuery>() {\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::reject::LengthRequired>() {\n        Ok(RestApiError {\n            status_code: StatusCode::LENGTH_REQUIRED,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::reject::MissingHeader>() {\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::reject::InvalidHeader>() {\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.to_string(),\n        })\n    } else 
if let Some(error) = rejection.find::<warp::reject::PayloadTooLarge>() {\n        Ok(RestApiError {\n            status_code: StatusCode::PAYLOAD_TOO_LARGE,\n            message: error.to_string(),\n        })\n    } else if let Some(err) = rejection.find::<TooManyRequests>() {\n        Ok(RestApiError {\n            status_code: StatusCode::TOO_MANY_REQUESTS,\n            message: err.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<InvalidArgument>() {\n        // Happens when the url path or request body contains invalid argument(s).\n        Ok(RestApiError {\n            status_code: StatusCode::BAD_REQUEST,\n            message: error.0.to_string(),\n        })\n    } else if let Some(error) = rejection.find::<warp::reject::MethodNotAllowed>() {\n        Ok(RestApiError {\n            status_code: StatusCode::METHOD_NOT_ALLOWED,\n            message: error.to_string(),\n        })\n    } else {\n        Err(rejection)\n    }\n}\n\nfn build_cors(cors_origins: &[String]) -> CorsLayer {\n    let debug_mode = quickwit_common::get_bool_from_env(\"QW_ENABLE_CORS_DEBUG\", false);\n    if debug_mode {\n        info!(\"CORS debug mode is enabled, localhost and 127.0.0.1 origins will be allowed\");\n        return CorsLayer::new()\n            .allow_methods([\n                Method::GET,\n                Method::POST,\n                Method::PUT,\n                Method::PATCH,\n                Method::DELETE,\n            ])\n            .allow_origin(AllowOrigin::predicate(|origin, _parts| {\n                [b\"https://localhost:\", b\"https://127.0.0.1:\"]\n                    .iter()\n                    .any(|prefix| origin.as_bytes().starts_with(*prefix))\n            }))\n            .allow_headers([http::header::CONTENT_TYPE]);\n    }\n\n    let mut cors = CorsLayer::new().allow_methods([\n        Method::GET,\n        Method::POST,\n        Method::PUT,\n        Method::DELETE,\n        Method::OPTIONS,\n    ]);\n    if 
!cors_origins.is_empty() {\n        let allow_any = cors_origins.iter().any(|origin| origin.as_str() == \"*\");\n\n        if allow_any {\n            info!(\"CORS is enabled, all origins will be allowed\");\n            cors = cors.allow_origin(tower_http::cors::Any);\n        } else {\n            info!(origins = ?cors_origins, \"CORS is enabled, the following origins will be allowed\");\n            let origins = cors_origins\n                .iter()\n                .map(|origin| origin.parse::<HeaderValue>().unwrap())\n                .collect::<Vec<_>>();\n            cors = cors.allow_origin(origins);\n        };\n    }\n    cors\n}\n\nmod tls {\n    // most of this module is copied from hyper-tls examples, licensed under Apache 2.0, MIT or ISC\n\n    use std::sync::Arc;\n    use std::vec::Vec;\n    use std::{fs, io};\n\n    use quickwit_config::TlsConfig;\n    use rustls::pki_types::{CertificateDer, PrivateKeyDer};\n    use tokio_rustls::rustls::ServerConfig;\n\n    fn io_error(error: String) -> io::Error {\n        io::Error::other(error)\n    }\n\n    // Load public certificate from file.\n    fn load_certs(filename: &str) -> io::Result<Vec<CertificateDer<'static>>> {\n        // Open certificate file.\n        let certfile = fs::File::open(filename)\n            .map_err(|error| io_error(format!(\"failed to open {filename}: {error}\")))?;\n        let mut reader = io::BufReader::new(certfile);\n        // Load and return certificate.\n        rustls_pemfile::certs(&mut reader).collect()\n    }\n\n    // Load private key from file.\n    fn load_private_key(filename: &str) -> io::Result<PrivateKeyDer<'static>> {\n        // Open keyfile.\n        let keyfile = fs::File::open(filename)\n            .map_err(|error| io_error(format!(\"failed to open {filename}: {error}\")))?;\n        let mut reader = io::BufReader::new(keyfile);\n\n        // Load and return a single private key.\n        rustls_pemfile::private_key(&mut reader).map(|key| key.unwrap())\n    
}\n\n    pub fn make_rustls_config(config: &TlsConfig) -> anyhow::Result<Arc<ServerConfig>> {\n        let certs = load_certs(&config.cert_path)?;\n        let key = load_private_key(&config.key_path)?;\n\n        // TODO we could add support for client authorization, it seems less important than on the\n        // gRPC side though\n        if config.validate_client {\n            anyhow::bail!(\"mTLS isn't supported on rest api\");\n        }\n\n        let mut cfg = rustls::ServerConfig::builder()\n            .with_no_client_auth()\n            .with_single_cert(certs, key)\n            .map_err(|error| io_error(error.to_string()))?;\n        // Configure ALPN to accept HTTP/2, HTTP/1.1, and HTTP/1.0 in that order.\n        cfg.alpn_protocols = vec![b\"h2\".to_vec(), b\"http/1.1\".to_vec(), b\"http/1.0\".to_vec()];\n        Ok(Arc::new(cfg))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::future::Future;\n    use std::pin::Pin;\n    use std::task::{Context, Poll};\n\n    use quickwit_cluster::{ChannelTransport, create_cluster_for_test};\n    use quickwit_config::NodeConfig;\n    use quickwit_index_management::IndexService;\n    use quickwit_ingest::{IngestApiService, IngestServiceClient};\n    use quickwit_proto::control_plane::ControlPlaneServiceClient;\n    use quickwit_proto::ingest::router::IngestRouterServiceClient;\n    use quickwit_proto::metastore::MetastoreServiceClient;\n    use quickwit_search::MockSearchService;\n    use quickwit_storage::StorageResolver;\n    use tower::Service;\n    use warp::http::HeaderName;\n    use warp::hyper::{Request, Response, StatusCode};\n\n    use super::*;\n    use crate::rest::recover_fn_final;\n\n    pub(crate) fn ingest_service_client() -> IngestServiceClient {\n        let universe = quickwit_actors::Universe::new();\n        let (ingest_service_mailbox, _) = universe.create_test_mailbox::<IngestApiService>();\n        IngestServiceClient::from_mailbox(ingest_service_mailbox)\n    }\n\n    #[tokio::test]\n    
async fn test_cors() {\n        // No cors enabled\n        {\n            let cors = build_cors(&[]);\n\n            let mut layer = ServiceBuilder::new().layer(cors).service(HelloWorld);\n\n            let resp = layer.call(Request::new(())).await.unwrap();\n            let headers = resp.headers();\n            assert_eq!(headers.get(\"Access-Control-Allow-Origin\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Methods\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n\n            let resp = layer\n                .call(cors_request(\"http://localhost:3000\"))\n                .await\n                .unwrap();\n            let headers = resp.headers();\n            assert_eq!(headers.get(\"Access-Control-Allow-Origin\"), None);\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Methods\"),\n                Some(\n                    &\"GET,POST,PUT,DELETE,OPTIONS\"\n                        .parse::<HeaderValue>()\n                        .unwrap()\n                )\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n        }\n\n        // Wildcard cors enabled\n        {\n            let cors = build_cors(&[\"*\".to_string()]);\n\n            let mut layer = ServiceBuilder::new().layer(cors).service(HelloWorld);\n\n            let resp = layer.call(Request::new(())).await.unwrap();\n            let headers = resp.headers();\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Origin\"),\n                Some(&\"*\".parse::<HeaderValue>().unwrap())\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Methods\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            
assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n\n            let resp = layer\n                .call(cors_request(\"http://localhost:3000\"))\n                .await\n                .unwrap();\n            let headers = resp.headers();\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Origin\"),\n                Some(&\"*\".parse::<HeaderValue>().unwrap())\n            );\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Methods\"),\n                Some(\n                    &\"GET,POST,PUT,DELETE,OPTIONS\"\n                        .parse::<HeaderValue>()\n                        .unwrap()\n                )\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n        }\n\n        // Specific origin cors enabled\n        {\n            let cors = build_cors(&[\"https://quickwit.io\".to_string()]);\n\n            let mut layer = ServiceBuilder::new().layer(cors).service(HelloWorld);\n\n            let resp = layer.call(Request::new(())).await.unwrap();\n            let headers = resp.headers();\n            assert_eq!(headers.get(\"Access-Control-Allow-Origin\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Methods\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n\n            let resp = layer\n                .call(cors_request(\"http://localhost:3000\"))\n                .await\n                .unwrap();\n            let headers = resp.headers();\n            assert_eq!(headers.get(\"Access-Control-Allow-Origin\"), None);\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Methods\"),\n                Some(\n                    &\"GET,POST,PUT,DELETE,OPTIONS\"\n                        .parse::<HeaderValue>()\n                   
     .unwrap()\n                )\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n\n            let resp = layer\n                .call(cors_request(\"https://quickwit.io\"))\n                .await\n                .unwrap();\n            let headers = resp.headers();\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Origin\"),\n                Some(&\"https://quickwit.io\".parse::<HeaderValue>().unwrap())\n            );\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Methods\"),\n                Some(\n                    &\"GET,POST,PUT,DELETE,OPTIONS\"\n                        .parse::<HeaderValue>()\n                        .unwrap()\n                )\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n        }\n\n        // Specific multiple-origin cors enabled\n        {\n            let cors = build_cors(&[\n                \"https://quickwit.io\".to_string(),\n                \"http://localhost:3000\".to_string(),\n            ]);\n\n            let mut layer = ServiceBuilder::new().layer(cors).service(HelloWorld);\n\n            let resp = layer.call(Request::new(())).await.unwrap();\n            let headers = resp.headers();\n            assert_eq!(headers.get(\"Access-Control-Allow-Origin\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Methods\"), None);\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n\n            let resp = layer\n                .call(cors_request(\"http://localhost:3000\"))\n                .await\n                .unwrap();\n            let headers = resp.headers();\n            assert_eq!(\n                
headers.get(\"Access-Control-Allow-Origin\"),\n                Some(&\"http://localhost:3000\".parse::<HeaderValue>().unwrap())\n            );\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Methods\"),\n                Some(\n                    &\"GET,POST,PUT,DELETE,OPTIONS\"\n                        .parse::<HeaderValue>()\n                        .unwrap()\n                )\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n\n            let resp = layer\n                .call(cors_request(\"https://quickwit.io\"))\n                .await\n                .unwrap();\n            let headers = resp.headers();\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Origin\"),\n                Some(&\"https://quickwit.io\".parse::<HeaderValue>().unwrap())\n            );\n            assert_eq!(\n                headers.get(\"Access-Control-Allow-Methods\"),\n                Some(\n                    &\"GET,POST,PUT,DELETE,OPTIONS\"\n                        .parse::<HeaderValue>()\n                        .unwrap()\n                )\n            );\n            assert_eq!(headers.get(\"Access-Control-Allow-Headers\"), None);\n            assert_eq!(headers.get(\"Access-Control-Max-Age\"), None);\n        }\n    }\n\n    fn cors_request(origin: &'static str) -> Request<()> {\n        let mut request = Request::new(());\n        (*request.method_mut()) = Method::OPTIONS;\n        request\n            .headers_mut()\n            .insert(\"Origin\", HeaderValue::from_static(origin));\n        request\n    }\n\n    struct HelloWorld;\n\n    impl Service<Request<()>> for HelloWorld {\n        type Response = Response<String>;\n        type Error = http::Error;\n        type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>>>>;\n\n        fn poll_ready(&mut self, _cx: &mut 
Context<'_>) -> Poll<Result<(), Self::Error>> {\n            Poll::Ready(Ok(()))\n        }\n\n        fn call(&mut self, _req: Request<()>) -> Self::Future {\n            let body = \"hello, world!\\n\".to_string();\n            let resp = Response::builder()\n                .status(StatusCode::OK)\n                .body(body)\n                .expect(\"Unable to create `http::Response`\");\n\n            let fut = async { Ok(resp) };\n\n            Box::pin(fut)\n        }\n    }\n\n    #[tokio::test]\n    async fn test_extra_headers() {\n        let mut node_config = NodeConfig::for_test();\n        node_config.rest_config.extra_headers.insert(\n            HeaderName::from_static(\"x-custom-header\"),\n            HeaderValue::from_static(\"custom-value\"),\n        );\n        node_config.rest_config.extra_headers.insert(\n            HeaderName::from_static(\"x-custom-header-2\"),\n            HeaderValue::from_static(\"custom-value-2\"),\n        );\n        let metastore_client = MetastoreServiceClient::mocked();\n        let index_service =\n            IndexService::new(metastore_client.clone(), StorageResolver::unconfigured());\n        let control_plane_client = ControlPlaneServiceClient::mocked();\n        let transport = ChannelTransport::default();\n        let cluster = create_cluster_for_test(Vec::new(), &[], &transport, false)\n            .await\n            .unwrap();\n        let quickwit_services = QuickwitServices {\n            _report_splits_subscription_handle_opt: None,\n            _local_shards_update_listener_handle_opt: None,\n            cluster,\n            control_plane_server_opt: None,\n            control_plane_client,\n            indexing_service_opt: None,\n            index_manager: index_service,\n            ingest_service: ingest_service_client(),\n            ingest_router_opt: None,\n            ingest_router_service: IngestRouterServiceClient::mocked(),\n            ingester_opt: None,\n            
janitor_service_opt: None,\n            otlp_logs_service_opt: None,\n            otlp_traces_service_opt: None,\n            metastore_client,\n            metastore_server_opt: None,\n            node_config: Arc::new(node_config.clone()),\n            search_service: Arc::new(MockSearchService::new()),\n            jaeger_service_opt: None,\n            env_filter_reload_fn: crate::do_nothing_env_filter_reload_fn(),\n        };\n\n        let handler = api_v1_routes(Arc::new(quickwit_services))\n            .recover(recover_fn_final)\n            .with(warp::reply::with::headers(\n                node_config.rest_config.extra_headers.clone(),\n            ));\n\n        let resp = warp::test::request()\n            .path(\"/api/v1/version\")\n            .reply(&handler.clone())\n            .await;\n\n        assert_eq!(resp.status(), 200);\n        assert_eq!(\n            resp.headers().get(\"x-custom-header\").unwrap(),\n            \"custom-value\"\n        );\n        assert_eq!(\n            resp.headers().get(\"x-custom-header-2\").unwrap(),\n            \"custom-value-2\"\n        );\n\n        let resp_404 = warp::test::request()\n            .path(\"/api/v1/version404\")\n            .reply(&handler)\n            .await;\n\n        assert_eq!(resp_404.status(), 404);\n        assert_eq!(\n            resp_404.headers().get(\"x-custom-header\").unwrap(),\n            \"custom-value\"\n        );\n        assert_eq!(\n            resp_404.headers().get(\"x-custom-header-2\").unwrap(),\n            \"custom-value-2\"\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/rest_api_response.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse quickwit_proto::ServiceError;\nuse serde::{self, Serialize};\nuse warp::Reply;\nuse warp::hyper::StatusCode;\nuse warp::hyper::header::CONTENT_TYPE;\nuse warp::hyper::http::HeaderValue;\n\nuse crate::format::BodyFormat;\n\nconst JSON_SERIALIZATION_ERROR: &str = \"JSON serialization failed.\";\n\n#[derive(Serialize)]\npub(crate) struct RestApiError {\n    // For now, we want to keep [`RestApiError`] as simple as possible\n    // and return just a message.\n    #[serde(skip_serializing)]\n    pub status_code: StatusCode,\n    pub message: String,\n}\n\n/// Makes a JSON API response from a result.\n/// The error is wrapped into an [`RestApiError`] to publicly expose\n/// a consistent error format.\npub(crate) fn into_rest_api_response<T: serde::Serialize, E: ServiceError>(\n    result: Result<T, E>,\n    body_format: BodyFormat,\n) -> RestApiResponse {\n    let rest_api_result = result.map_err(|error| RestApiError {\n        status_code: error.error_code().http_status_code(),\n        message: error.to_string(),\n    });\n    let status_code = match &rest_api_result {\n        Ok(_) => StatusCode::OK,\n        Err(error) => error.status_code,\n    };\n    RestApiResponse::new(&rest_api_result, status_code, body_format)\n}\n\n/// A JSON reply for the REST API.\npub struct RestApiResponse {\n    status_code: StatusCode,\n    inner: 
Result<Vec<u8>, ()>,\n}\n\nimpl RestApiResponse {\n    pub fn new<T: serde::Serialize, E: serde::Serialize>(\n        result: &Result<T, E>,\n        status_code: StatusCode,\n        body_format: BodyFormat,\n    ) -> Self {\n        let inner = body_format.result_to_vec(result);\n        RestApiResponse { status_code, inner }\n    }\n}\n\nimpl Reply for RestApiResponse {\n    #[inline]\n    fn into_response(self) -> warp::reply::Response {\n        match self.inner {\n            Ok(body) => {\n                let mut response = warp::reply::Response::new(body.into());\n                response\n                    .headers_mut()\n                    .insert(CONTENT_TYPE, HeaderValue::from_static(\"application/json\"));\n                *response.status_mut() = self.status_code;\n                response\n            }\n            Err(()) => {\n                quickwit_common::rate_limited_error!(\n                    limit_per_min = 10,\n                    \"REST body json serialization error.\"\n                );\n                warp::reply::json(&RestApiError {\n                    status_code: StatusCode::INTERNAL_SERVER_ERROR,\n                    message: JSON_SERIALIZATION_ERROR.to_string(),\n                })\n                .into_response()\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/search_api/grpc_adapter.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_proto::error::convert_to_grpc_result;\nuse quickwit_proto::search::{\n    GetKvRequest, GetKvResponse, LeafListFieldsRequest, ListFieldsRequest, ListFieldsResponse,\n    ReportSplitsRequest, ReportSplitsResponse, search_service_server as grpc,\n};\nuse quickwit_proto::{set_parent_span_from_request_metadata, tonic};\nuse quickwit_search::SearchService;\nuse tracing::instrument;\n\n#[derive(Clone)]\npub struct GrpcSearchAdapter(Arc<dyn SearchService>);\n\nimpl From<Arc<dyn SearchService>> for GrpcSearchAdapter {\n    fn from(search_service_arc: Arc<dyn SearchService>) -> Self {\n        GrpcSearchAdapter(search_service_arc)\n    }\n}\n\n#[async_trait]\nimpl grpc::SearchService for GrpcSearchAdapter {\n    #[instrument(skip(self, request))]\n    async fn root_search(\n        &self,\n        request: tonic::Request<quickwit_proto::search::SearchRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::SearchResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let search_request = request.into_inner();\n        let search_result = self.0.root_search(search_request).await;\n        convert_to_grpc_result(search_result)\n    }\n\n    #[instrument(skip(self, request))]\n    async fn leaf_search(\n        &self,\n        
request: tonic::Request<quickwit_proto::search::LeafSearchRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::LeafSearchResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let leaf_search_request = request.into_inner();\n        let leaf_search_result = self.0.leaf_search(leaf_search_request).await;\n        convert_to_grpc_result(leaf_search_result)\n    }\n\n    #[instrument(skip(self, request))]\n    async fn fetch_docs(\n        &self,\n        request: tonic::Request<quickwit_proto::search::FetchDocsRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::FetchDocsResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let fetch_docs_request = request.into_inner();\n        let fetch_docs_result = self.0.fetch_docs(fetch_docs_request).await;\n        convert_to_grpc_result(fetch_docs_result)\n    }\n\n    #[instrument(skip(self, request))]\n    async fn root_list_terms(\n        &self,\n        request: tonic::Request<quickwit_proto::search::ListTermsRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::ListTermsResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let search_request = request.into_inner();\n        let search_result = self.0.root_list_terms(search_request).await;\n        convert_to_grpc_result(search_result)\n    }\n\n    #[instrument(skip(self, request))]\n    async fn leaf_list_terms(\n        &self,\n        request: tonic::Request<quickwit_proto::search::LeafListTermsRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::LeafListTermsResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let leaf_search_request = request.into_inner();\n        let leaf_search_result = self.0.leaf_list_terms(leaf_search_request).await;\n        convert_to_grpc_result(leaf_search_result)\n    }\n\n    async 
fn scroll(\n        &self,\n        request: tonic::Request<quickwit_proto::search::ScrollRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::SearchResponse>, tonic::Status> {\n        let scroll_request = request.into_inner();\n        let scroll_result = self.0.scroll(scroll_request).await;\n        convert_to_grpc_result(scroll_result)\n    }\n\n    #[instrument(skip(self, request))]\n    async fn put_kv(\n        &self,\n        request: tonic::Request<quickwit_proto::search::PutKvRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::PutKvResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let put_request = request.into_inner();\n        self.0.put_kv(put_request).await;\n        Ok(tonic::Response::new(\n            quickwit_proto::search::PutKvResponse {},\n        ))\n    }\n\n    #[instrument(skip(self, request))]\n    async fn get_kv(\n        &self,\n        request: tonic::Request<GetKvRequest>,\n    ) -> Result<tonic::Response<GetKvResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let get_search_after_context_request = request.into_inner();\n        let payload = self.0.get_kv(get_search_after_context_request).await;\n        let get_response = GetKvResponse { payload };\n        Ok(tonic::Response::new(get_response))\n    }\n\n    #[instrument(skip(self, request))]\n    async fn report_splits(\n        &self,\n        request: tonic::Request<ReportSplitsRequest>,\n    ) -> Result<tonic::Response<ReportSplitsResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let get_search_after_context_request = request.into_inner();\n        self.0.report_splits(get_search_after_context_request).await;\n        Ok(tonic::Response::new(ReportSplitsResponse {}))\n    }\n\n    #[instrument(skip(self, request))]\n    async fn list_fields(\n        &self,\n        request: 
tonic::Request<ListFieldsRequest>,\n    ) -> Result<tonic::Response<ListFieldsResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let resp = self.0.root_list_fields(request.into_inner()).await;\n        convert_to_grpc_result(resp)\n    }\n    #[instrument(skip(self, request))]\n    async fn leaf_list_fields(\n        &self,\n        request: tonic::Request<LeafListFieldsRequest>,\n    ) -> Result<tonic::Response<ListFieldsResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let resp = self.0.leaf_list_fields(request.into_inner()).await;\n        convert_to_grpc_result(resp)\n    }\n\n    #[instrument(skip(self, request))]\n    async fn search_plan(\n        &self,\n        request: tonic::Request<quickwit_proto::search::SearchRequest>,\n    ) -> Result<tonic::Response<quickwit_proto::search::SearchPlanResponse>, tonic::Status> {\n        set_parent_span_from_request_metadata(request.metadata());\n        let search_request = request.into_inner();\n        let search_result = self.0.search_plan(search_request).await;\n        convert_to_grpc_result(search_result)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/search_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod grpc_adapter;\nmod rest_handler;\n\npub use self::grpc_adapter::GrpcSearchAdapter;\npub use self::rest_handler::{\n    SearchApi, SearchRequestQueryString, SortBy, search_get_handler, search_plan_get_handler,\n    search_plan_post_handler, search_post_handler, search_request_from_api_request,\n};\npub(crate) use self::rest_handler::{extract_index_id_patterns, extract_index_id_patterns_default};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/search_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::convert::TryFrom;\nuse std::sync::Arc;\n\nuse percent_encoding::percent_decode_str;\nuse quickwit_config::validate_index_id_pattern;\nuse quickwit_proto::search::{CountHits, SortField, SortOrder};\nuse quickwit_query::query_ast::query_ast_from_user_text;\nuse quickwit_search::{SearchError, SearchPlanResponseRest, SearchResponseRest, SearchService};\nuse serde::{Deserialize, Deserializer, Serialize, Serializer};\nuse serde_json::Value as JsonValue;\nuse tracing::info;\nuse warp::{Filter, Rejection};\n\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::simple_list::{from_simple_list, to_simple_list};\nuse crate::{BodyFormat, with_arg};\n\n#[derive(utoipa::OpenApi)]\n#[openapi(\n    paths(\n        search_get_handler,\n        search_post_handler,\n        search_plan_get_handler,\n        search_plan_post_handler,\n    ),\n    components(schemas(\n        BodyFormat,\n        SearchRequestQueryString,\n        SearchResponseRest,\n        SearchPlanResponseRest,\n        SortBy,\n        SortField,\n        SortOrder,\n    ),)\n)]\npub struct SearchApi;\n\npub(crate) async fn extract_index_id_patterns_default() -> Result<Vec<String>, Rejection> {\n    let index_id_patterns = Vec::new();\n    Ok(index_id_patterns)\n}\n\npub(crate) async fn extract_index_id_patterns(\n    comma_separated_index_id_patterns: String,\n) -> 
Result<Vec<String>, Rejection> {\n    let percent_decoded_comma_separated_index_id_patterns =\n        percent_decode_str(&comma_separated_index_id_patterns)\n            .decode_utf8()\n            .map_err(|error| {\n                let message = format!(\n                    \"failed to percent decode comma-separated index ID patterns \\\n                     `{comma_separated_index_id_patterns}`: {error}\"\n                );\n                crate::rest::InvalidArgument(message)\n            })?;\n    let mut index_id_patterns = Vec::new();\n\n    for index_id_pattern in percent_decoded_comma_separated_index_id_patterns.split(',') {\n        validate_index_id_pattern(index_id_pattern, true)\n            .map_err(|error| crate::rest::InvalidArgument(error.to_string()))?;\n        index_id_patterns.push(index_id_pattern.to_string());\n    }\n    assert!(!index_id_patterns.is_empty());\n    Ok(index_id_patterns)\n}\n\n#[derive(Debug, Default, Eq, PartialEq, Deserialize, utoipa::ToSchema)]\npub struct SortBy {\n    /// Fields to sort on.\n    pub sort_fields: Vec<SortField>,\n}\n\nimpl SortBy {\n    pub fn is_empty(&self) -> bool {\n        self.sort_fields.is_empty()\n    }\n}\n\nimpl From<String> for SortBy {\n    fn from(sort_by: String) -> Self {\n        let mut sort_fields = Vec::new();\n\n        for field_name in sort_by.split(',') {\n            if field_name.is_empty() {\n                continue;\n            }\n            let (field_name, sort_order) = if let Some(tail) = field_name.strip_prefix('+') {\n                (tail.trim().to_string(), SortOrder::Desc)\n            } else if let Some(tail) = field_name.strip_prefix('-') {\n                (tail.trim().to_string(), SortOrder::Asc)\n            } else {\n                let trimmed_field_name = field_name.trim().to_string();\n\n                (trimmed_field_name, SortOrder::Desc)\n            };\n            let sort_field = SortField {\n                field_name,\n                sort_order: 
sort_order as i32,\n                sort_datetime_format: None,\n            };\n            sort_fields.push(sort_field);\n        }\n        Self { sort_fields }\n    }\n}\n\npub fn sort_by_mini_dsl<'de, D>(deserializer: D) -> Result<SortBy, D::Error>\nwhere D: Deserializer<'de> {\n    let sort_by_mini_dsl = String::deserialize(deserializer)?;\n    Ok(SortBy::from(sort_by_mini_dsl))\n}\n\nimpl Serialize for SortBy {\n    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        let mut sort_by_mini_dsl = String::new();\n\n        for sort_field in &self.sort_fields {\n            if sort_field.sort_order() == SortOrder::Desc {\n                sort_by_mini_dsl.push('-');\n            }\n            sort_by_mini_dsl.push_str(&sort_field.field_name);\n        }\n        serializer.serialize_str(&sort_by_mini_dsl)\n    }\n}\n\nfn default_max_hits() -> u64 {\n    20\n}\n\n/// This struct represents the QueryString passed to\n/// the rest API.\n#[derive(\n    Debug, Default, Eq, PartialEq, Serialize, Deserialize, utoipa::IntoParams, utoipa::ToSchema,\n)]\n#[into_params(parameter_in = Query)]\n#[serde(deny_unknown_fields)]\npub struct SearchRequestQueryString {\n    /// Query text. 
The query language is that of tantivy.\n    pub query: String,\n    #[param(value_type = Object)]\n    #[schema(value_type = Object)]\n    /// The aggregation JSON string.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub aggs: Option<JsonValue>,\n    // Fields to search on\n    #[param(rename = \"search_field\")]\n    #[schema(rename = \"search_field\")]\n    #[serde(default)]\n    #[serde(rename = \"search_field\")]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(serialize_with = \"to_simple_list\")]\n    pub search_fields: Option<Vec<String>>,\n    /// Fields to extract snippets on.\n    #[serde(default)]\n    #[serde(deserialize_with = \"from_simple_list\")]\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    #[serde(serialize_with = \"to_simple_list\")]\n    pub snippet_fields: Option<Vec<String>>,\n    /// If set, restrict search to documents with a `timestamp >= start_timestamp`.\n    /// This timestamp is expressed in seconds.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub start_timestamp: Option<i64>,\n    /// If set, restrict search to documents with a `timestamp < end_timestamp`.\n    /// This timestamp is expressed in seconds.\n    #[serde(skip_serializing_if = \"Option::is_none\")]\n    pub end_timestamp: Option<i64>,\n    /// Maximum number of hits to return (by default 20).\n    #[serde(default = \"default_max_hits\")]\n    pub max_hits: u64,\n    /// First hit to return. Together with `max_hits`, this parameter\n    /// can be used for pagination.\n    ///\n    /// E.g.\n    /// The results with rank [start_offset..start_offset + max_hits) are returned\n    #[serde(default)] // Default to 0. 
(We are 0-indexed)\n    pub start_offset: u64,\n    /// The output format.\n    #[serde(default)]\n    pub format: BodyFormat,\n    /// Specifies how documents are sorted.\n    #[serde(alias = \"sort_by_field\")]\n    #[serde(deserialize_with = \"sort_by_mini_dsl\")]\n    #[serde(default)]\n    #[serde(skip_serializing_if = \"SortBy::is_empty\")]\n    #[param(value_type = String)]\n    pub sort_by: SortBy,\n    #[param(value_type = bool)]\n    #[schema(value_type = bool)]\n    #[serde(with = \"count_hits_from_bool\")]\n    #[serde(default = \"count_hits_from_bool::default\")]\n    pub count_all: CountHits,\n    #[param(value_type = bool)]\n    #[schema(value_type = bool)]\n    #[serde(default)]\n    pub allow_failed_splits: bool,\n}\n\nmod count_hits_from_bool {\n    use quickwit_proto::search::CountHits;\n    use serde::{self, Deserialize, Deserializer, Serializer};\n\n    pub fn serialize<S>(count_hits: &CountHits, serializer: S) -> Result<S::Ok, S::Error>\n    where S: Serializer {\n        if count_hits == &CountHits::Underestimate {\n            serializer.serialize_bool(false)\n        } else {\n            serializer.serialize_none()\n        }\n    }\n\n    pub fn deserialize<'de, D>(deserializer: D) -> Result<CountHits, D::Error>\n    where D: Deserializer<'de> {\n        let count_all = Option::<bool>::deserialize(deserializer)?.unwrap_or(true);\n        Ok(if count_all {\n            CountHits::CountAll\n        } else {\n            CountHits::Underestimate\n        })\n    }\n\n    pub fn default() -> CountHits {\n        CountHits::CountAll\n    }\n}\n\npub fn search_request_from_api_request(\n    index_id_patterns: Vec<String>,\n    search_request: SearchRequestQueryString,\n) -> Result<quickwit_proto::search::SearchRequest, SearchError> {\n    // The query ast below may still contain user input query. 
The actual\n    // parsing of the user query will happen in the root service, and might require\n    // the use of the docmapper default fields (which we do not have at this point).\n    let query_ast = query_ast_from_user_text(&search_request.query, search_request.search_fields);\n    let query_ast_json = serde_json::to_string(&query_ast)?;\n    let search_request = quickwit_proto::search::SearchRequest {\n        index_id_patterns,\n        query_ast: query_ast_json,\n        snippet_fields: search_request.snippet_fields.unwrap_or_default(),\n        start_timestamp: search_request.start_timestamp,\n        end_timestamp: search_request.end_timestamp,\n        max_hits: search_request.max_hits,\n        start_offset: search_request.start_offset,\n        aggregation_request: search_request\n            .aggs\n            .map(|agg| serde_json::to_string(&agg).expect(\"could not serialize JsonValue\")),\n        sort_fields: search_request.sort_by.sort_fields,\n        scroll_ttl_secs: None,\n        search_after: None,\n        count_hits: search_request.count_all.into(),\n        ignore_missing_indexes: false,\n        skip_aggregation_finalization: false,\n    };\n    Ok(search_request)\n}\n\nasync fn search_endpoint(\n    index_id_patterns: Vec<String>,\n    search_request: SearchRequestQueryString,\n    search_service: &dyn SearchService,\n) -> Result<SearchResponseRest, SearchError> {\n    let allow_failed_splits = search_request.allow_failed_splits;\n    let search_request = search_request_from_api_request(index_id_patterns, search_request)?;\n    let search_response =\n        search_service\n            .root_search(search_request)\n            .await\n            .and_then(|search_response| {\n                if (!allow_failed_splits || search_response.num_successful_splits == 0)\n                    && let Some(search_error) =\n                        SearchError::from_split_errors(&search_response.failed_splits[..])\n                {\n                
    return Err(search_error);\n                }\n                Ok(search_response)\n            })?;\n    let search_response_rest = SearchResponseRest::try_from(search_response)?;\n    Ok(search_response_rest)\n}\n\nfn search_get_filter()\n-> impl Filter<Extract = (Vec<String>, SearchRequestQueryString), Error = Rejection> + Clone {\n    warp::path!(String / \"search\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::get())\n        .and(warp::query())\n}\n\nfn search_post_filter()\n-> impl Filter<Extract = (Vec<String>, SearchRequestQueryString), Error = Rejection> + Clone {\n    warp::path!(String / \"search\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::post())\n        .and(warp::body::content_length_limit(1024 * 1024))\n        .and(warp::body::json())\n}\n\nfn search_plan_get_filter()\n-> impl Filter<Extract = (Vec<String>, SearchRequestQueryString), Error = Rejection> + Clone {\n    warp::path!(String / \"search-plan\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::get())\n        .and(warp::query())\n}\n\nfn search_plan_post_filter()\n-> impl Filter<Extract = (Vec<String>, SearchRequestQueryString), Error = Rejection> + Clone {\n    warp::path!(String / \"search-plan\")\n        .and_then(extract_index_id_patterns)\n        .and(warp::post())\n        .and(warp::body::content_length_limit(1024 * 1024))\n        .and(warp::body::json())\n}\n\nasync fn search(\n    index_id_patterns: Vec<String>,\n    search_request: SearchRequestQueryString,\n    search_service: Arc<dyn SearchService>,\n) -> impl warp::Reply {\n    info!(request =? 
search_request, \"search\");\n    let body_format = search_request.format;\n    let result = search_endpoint(index_id_patterns, search_request, &*search_service).await;\n    into_rest_api_response(result, body_format)\n}\n\nasync fn search_plan(\n    index_id_patterns: Vec<String>,\n    search_request: SearchRequestQueryString,\n    search_service: Arc<dyn SearchService>,\n) -> impl warp::Reply {\n    let body_format = search_request.format;\n    let result: Result<SearchPlanResponseRest, SearchError> = async {\n        let plan_request = search_request_from_api_request(index_id_patterns, search_request)?;\n        let plan_response = search_service.search_plan(plan_request).await?;\n        let response = serde_json::from_str(&plan_response.result)?;\n        Ok(response)\n    }\n    .await;\n    into_rest_api_response(result, body_format)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Search\",\n    path = \"/{index_id}/search\",\n    responses(\n        (status = 200, description = \"Successfully executed search.\", body = SearchResponseRest)\n    ),\n    params(\n        SearchRequestQueryString,\n        (\"index_id\" = String, Path, description = \"The index ID to search.\"),\n    )\n)]\n/// Search Index (GET Variant)\n///\n/// Parses the search request from the request query string.\npub fn search_get_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    search_get_filter()\n        .and(with_arg(search_service))\n        .then(search)\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Search\",\n    path = \"/{index_id}/search\",\n    request_body = SearchRequestQueryString,\n    responses(\n        (status = 200, description = \"Successfully executed search.\", body = SearchResponseRest)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to search.\"),\n    )\n)]\n/// Search Index (POST Variant)\n///\n/// REST POST search handler.\n///\n/// 
Parses the search request from the request body.\npub fn search_post_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    search_post_filter()\n        .and(with_arg(search_service))\n        .then(search)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Search\",\n    path = \"/{index_id}/search-plan\",\n    responses(\n        (status = 200, description = \"Metadata about how a request would be executed.\", body = SearchPlanResponseRest)\n    ),\n    params(\n        SearchRequestQueryString,\n        (\"index_id\" = String, Path, description = \"The index ID to search.\"),\n    )\n)]\n/// Plan Query (GET Variant)\n///\n/// Parses the search request from the request query string.\npub fn search_plan_get_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    search_plan_get_filter()\n        .and(with_arg(search_service))\n        .then(search_plan)\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Search\",\n    path = \"/{index_id}/search-plan\",\n    request_body = SearchRequestQueryString,\n    responses(\n        (status = 200, description = \"Metadata about how a request would be executed.\", body = SearchPlanResponseRest)\n    ),\n    params(\n        (\"index_id\" = String, Path, description = \"The index ID to search.\"),\n    )\n)]\n/// Plan Query (POST Variant)\n///\n/// Parses the search request from the request body.\npub fn search_plan_post_handler(\n    search_service: Arc<dyn SearchService>,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    search_plan_post_filter()\n        .and(with_arg(search_service))\n        .then(search_plan)\n}\n\n#[cfg(test)]\nmod tests {\n    use assert_json_diff::{assert_json_eq, assert_json_include};\n    use mockall::predicate;\n    use quickwit_search::{MockSearchService, SearchError};\n    use serde_json::{Value as JsonValue, 
json};\n\n    use super::*;\n    use crate::recover_fn;\n\n    fn search_handler(\n        mock_search_service: MockSearchService,\n    ) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n        let mock_search_service_in_arc = Arc::new(mock_search_service);\n        search_get_handler(mock_search_service_in_arc.clone())\n            .or(search_post_handler(mock_search_service_in_arc.clone()))\n            .or(search_plan_get_handler(mock_search_service_in_arc.clone()))\n            .or(search_plan_post_handler(mock_search_service_in_arc.clone()))\n            .recover(recover_fn)\n    }\n\n    #[tokio::test]\n    async fn test_extract_index_id_patterns() {\n        extract_index_id_patterns(\"my-index\".to_string())\n            .await\n            .unwrap();\n        assert_eq!(\n            extract_index_id_patterns(\"my-index-1,my-index-2%2A\".to_string())\n                .await\n                .unwrap(),\n            vec![\"my-index-1\".to_string(), \"my-index-2*\".to_string()]\n        );\n        assert_eq!(\n            extract_index_id_patterns(\"my-index-1%2Cmy-index-%2A\".to_string())\n                .await\n                .unwrap(),\n            vec![\"my-index-1\".to_string(), \"my-index-*\".to_string()]\n        );\n        extract_index_id_patterns(\"\".to_string()).await.unwrap_err();\n        extract_index_id_patterns(\" \".to_string())\n            .await\n            .unwrap_err();\n    }\n\n    #[test]\n    fn test_serialize_search_response() -> anyhow::Result<()> {\n        let search_response = SearchResponseRest {\n            num_hits: 55,\n            hits: Vec::new(),\n            snippets: None,\n            elapsed_time_micros: 0u64,\n            errors: Vec::new(),\n            aggregations: None,\n        };\n        let search_response_json: JsonValue = serde_json::to_value(search_response)?;\n        let expected_search_response_json: JsonValue = json!({\n            \"num_hits\": 55,\n            
\"hits\": [],\n            \"elapsed_time_micros\": 0,\n        });\n        assert_json_include!(\n            actual: search_response_json,\n            expected: expected_search_response_json\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_post() {\n        let rest_search_api_filter = search_post_filter();\n        let (indexes, req) = warp::test::request()\n            .method(\"POST\")\n            .path(\"/quickwit-demo-index/search\")\n            .json(&true)\n            .body(r#\"{\"query\": \"*\", \"max_hits\":10, \"aggs\": {\"range\":[]} }\"#)\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(indexes, vec![\"quickwit-demo-index\".to_string()]);\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: \"*\".to_string(),\n                search_fields: None,\n                start_timestamp: None,\n                max_hits: 10,\n                format: BodyFormat::default(),\n                sort_by: SortBy::default(),\n                aggs: Some(json!({\"range\":[]})),\n                count_all: CountHits::CountAll,\n                ..Default::default()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_post_multi_indexes() {\n        let rest_search_api_filter = search_post_filter();\n        let (indexes, req) = warp::test::request()\n            .method(\"POST\")\n            .path(\"/quickwit-demo-index,quickwit-demo,quickwit-demo-index-*/search\")\n            .json(&true)\n            .body(r#\"{\"query\": \"*\", \"max_hits\":10, \"aggs\": {\"range\":[]} }\"#)\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(\n            indexes,\n            vec![\n                \"quickwit-demo-index\".to_string(),\n                \"quickwit-demo\".to_string(),\n                
\"quickwit-demo-index-*\".to_string()\n            ]\n        );\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: \"*\".to_string(),\n                search_fields: None,\n                start_timestamp: None,\n                max_hits: 10,\n                format: BodyFormat::default(),\n                sort_by: SortBy::default(),\n                aggs: Some(json!({\"range\":[]})),\n                ..Default::default()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_post_multi_indexes_bad_pattern() {\n        let rest_search_api_filter = search_post_filter();\n        let bad_pattern_rejection = warp::test::request()\n            .method(\"POST\")\n            .path(\"/quickwit-demo-index**/search\")\n            .json(&true)\n            .body(r#\"{\"query\": \"*\", \"max_hits\":10, \"aggs\": {\"range\":[]} }\"#)\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap_err();\n        let rejection = bad_pattern_rejection\n            .find::<crate::rest::InvalidArgument>()\n            .unwrap();\n        assert_eq!(\n            rejection.0,\n            \"index ID pattern `quickwit-demo-index**` is invalid: patterns must not contain \\\n             multiple consecutive `*`\"\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_simple() {\n        let rest_search_api_filter = search_get_filter();\n        let (indexes, req) = warp::test::request()\n            .path(\n                \"/quickwit-demo-index/search?query=*&end_timestamp=1450720000&max_hits=10&\\\n                 start_offset=22\",\n            )\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(indexes, vec![\"quickwit-demo-index\".to_string()]);\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: 
\"*\".to_string(),\n                search_fields: None,\n                start_timestamp: None,\n                end_timestamp: Some(1450720000),\n                max_hits: 10,\n                start_offset: 22,\n                format: BodyFormat::default(),\n                sort_by: SortBy::default(),\n                ..Default::default()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_count_all() {\n        let rest_search_api_filter = search_get_filter();\n        let (indexes, req) = warp::test::request()\n            .path(\"/quickwit-demo-index/search?query=*&count_all=true\")\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(indexes, vec![\"quickwit-demo-index\".to_string()]);\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: \"*\".to_string(),\n                format: BodyFormat::default(),\n                sort_by: SortBy::default(),\n                max_hits: 20,\n                count_all: CountHits::CountAll,\n                ..Default::default()\n            }\n        );\n        let rest_search_api_filter = search_get_filter();\n        let (indexes, req) = warp::test::request()\n            .path(\"/quickwit-demo-index/search?query=*&count_all=false\")\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(indexes, vec![\"quickwit-demo-index\".to_string()]);\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: \"*\".to_string(),\n                format: BodyFormat::default(),\n                sort_by: SortBy::default(),\n                max_hits: 20,\n                count_all: CountHits::Underestimate,\n                ..Default::default()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn 
test_rest_search_api_route_simple_default_num_hits_default_offset() {\n        let rest_search_api_filter = search_get_filter();\n        let (indexes, req) = warp::test::request()\n            .path(\n                \"/quickwit-demo-index/search?query=*&end_timestamp=1450720000&search_field=title,\\\n                 body\",\n            )\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(indexes, vec![\"quickwit-demo-index\".to_string()]);\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: \"*\".to_string(),\n                search_fields: Some(vec![\"title\".to_string(), \"body\".to_string()]),\n                start_timestamp: None,\n                end_timestamp: Some(1450720000),\n                max_hits: 20,\n                start_offset: 0,\n                format: BodyFormat::default(),\n                sort_by: SortBy::default(),\n                ..Default::default()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_simple_format() {\n        let rest_search_api_filter = search_get_filter();\n        let (indexes, req) = warp::test::request()\n            .path(\"/quickwit-demo-index/search?query=*&format=json\")\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n        assert_eq!(indexes, vec![\"quickwit-demo-index\".to_string()]);\n        assert_eq!(\n            &req,\n            &super::SearchRequestQueryString {\n                query: \"*\".to_string(),\n                start_timestamp: None,\n                end_timestamp: None,\n                max_hits: 20,\n                start_offset: 0,\n                format: BodyFormat::Json,\n                search_fields: None,\n                sort_by: SortBy::default(),\n                ..Default::default()\n            }\n        );\n    }\n\n    #[tokio::test]\n    async fn 
test_rest_search_api_route_sort_by() {\n        for (sort_by_query_param, expected_sort_fields) in [\n            (\"\", Vec::new()),\n            (\",\", Vec::new()),\n            (\n                \"field1\",\n                vec![SortField {\n                    field_name: \"field1\".to_string(),\n                    sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: None,\n                }],\n            ),\n            (\n                \"+field1\",\n                vec![SortField {\n                    field_name: \"field1\".to_string(),\n                    sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: None,\n                }],\n            ),\n            (\n                \"-field1\",\n                vec![SortField {\n                    field_name: \"field1\".to_string(),\n                    sort_order: SortOrder::Asc as i32,\n                    sort_datetime_format: None,\n                }],\n            ),\n            (\n                \"_score\",\n                vec![SortField {\n                    field_name: \"_score\".to_string(),\n                    sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: None,\n                }],\n            ),\n            (\n                \"-_score\",\n                vec![SortField {\n                    field_name: \"_score\".to_string(),\n                    sort_order: SortOrder::Asc as i32,\n                    sort_datetime_format: None,\n                }],\n            ),\n            (\n                \"+_score\",\n                vec![SortField {\n                    field_name: \"_score\".to_string(),\n                    sort_order: SortOrder::Desc as i32,\n                    sort_datetime_format: None,\n                }],\n            ),\n            (\n                \"field1,field2\",\n                vec![\n                    SortField {\n                        field_name: 
\"field1\".to_string(),\n                        sort_order: SortOrder::Desc as i32,\n                        sort_datetime_format: None,\n                    },\n                    SortField {\n                        field_name: \"field2\".to_string(),\n                        sort_order: SortOrder::Desc as i32,\n                        sort_datetime_format: None,\n                    },\n                ],\n            ),\n            (\n                \"+field1,-field2\",\n                vec![\n                    SortField {\n                        field_name: \"field1\".to_string(),\n                        sort_order: SortOrder::Desc as i32,\n                        sort_datetime_format: None,\n                    },\n                    SortField {\n                        field_name: \"field2\".to_string(),\n                        sort_order: SortOrder::Asc as i32,\n                        sort_datetime_format: None,\n                    },\n                ],\n            ),\n            (\n                \"-field1,+field2\",\n                vec![\n                    SortField {\n                        field_name: \"field1\".to_string(),\n                        sort_order: SortOrder::Asc as i32,\n                        sort_datetime_format: None,\n                    },\n                    SortField {\n                        field_name: \"field2\".to_string(),\n                        sort_order: SortOrder::Desc as i32,\n                        sort_datetime_format: None,\n                    },\n                ],\n            ),\n        ] {\n            let path = format!(\n                \"/quickwit-demo-index/search?query=*&format=json&sort_by={sort_by_query_param}\"\n            );\n            let rest_search_api_filter = search_get_filter();\n            let (_, req) = warp::test::request()\n                .path(&path)\n                .filter(&rest_search_api_filter)\n                .await\n                .unwrap();\n\n           
 assert_eq!(\n                &req.sort_by.sort_fields, &expected_sort_fields,\n                \"Expected sort fields `{:?}` for query param `{sort_by_query_param}`, got: {:?}\",\n                expected_sort_fields, req.sort_by.sort_fields\n            );\n        }\n\n        let rest_search_api_filter = search_get_filter();\n        let (_, req) = warp::test::request()\n            .path(\"/quickwit-demo-index/search?query=*&format=json&sort_by_field=fiel1\")\n            .filter(&rest_search_api_filter)\n            .await\n            .unwrap();\n\n        assert_eq!(\n            &req.sort_by.sort_fields,\n            &[SortField {\n                field_name: \"fiel1\".to_string(),\n                sort_order: SortOrder::Desc as i32,\n                sort_datetime_format: None,\n            }],\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_invalid_key() {\n        let resp = warp::test::request()\n            .path(\"/quickwit-demo-index/search?query=*&end_unix_timestamp=1450720000\")\n            .reply(&search_handler(MockSearchService::new()))\n            .await;\n        assert_eq!(resp.status(), 400);\n        let resp_json: JsonValue = serde_json::from_slice(resp.body()).unwrap();\n        assert!(\n            resp_json\n                .get(\"message\")\n                .unwrap()\n                .as_str()\n                .unwrap()\n                .contains(\"Invalid query string\")\n        );\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_post_with_invalid_payload() -> anyhow::Result<()> {\n        let resp = warp::test::request()\n            .method(\"POST\")\n            .path(\"/quickwit-demo-index/search\")\n            .json(&true)\n            .body(r#\"{\"query\": \"*\", \"bad_param\":10, \"aggs\": {\"range\":[]} }\"#)\n            .reply(&search_handler(MockSearchService::new()))\n            .await;\n        assert_eq!(resp.status(), 400);\n        let content = 
String::from_utf8_lossy(resp.body());\n        assert!(content.contains(\"Request body deserialize error: unknown field `bad_param`\"));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_serialize_with_results() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_root_search().returning(|_| {\n            Ok(quickwit_proto::search::SearchResponse {\n                hits: Vec::new(),\n                num_hits: 10,\n                elapsed_time_micros: 16,\n                errors: Vec::new(),\n                ..Default::default()\n            })\n        });\n        let rest_search_api_handler = search_handler(mock_search_service);\n        let resp = warp::test::request()\n            .path(\"/quickwit-demo-index/search?query=*\")\n            .reply(&rest_search_api_handler)\n            .await;\n        assert_eq!(resp.status(), 200);\n        let resp_json: JsonValue = serde_json::from_slice(resp.body())?;\n        let expected_response_json = serde_json::json!({\n            \"num_hits\": 10,\n            \"hits\": [],\n            \"elapsed_time_micros\": 16,\n        });\n        assert_json_include!(actual: resp_json, expected: expected_response_json);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_start_offset_and_num_hits_parameter() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .with(predicate::function(\n                |search_request: &quickwit_proto::search::SearchRequest| {\n                    search_request.start_offset == 5 && search_request.max_hits == 30\n                },\n            ))\n            .returning(|_| Ok(Default::default()));\n        let rest_search_api_handler = search_handler(mock_search_service);\n        assert_eq!(\n            warp::test::request()\n                
.path(\"/quickwit-demo-index/search?query=*&start_offset=5&max_hits=30\")\n                .reply(&rest_search_api_handler)\n                .await\n                .status(),\n            200\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_with_index_does_not_exist() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_root_search().returning(|_| {\n            Err(SearchError::IndexesNotFound {\n                index_ids: vec![\"not-found-index\".to_string()],\n            })\n        });\n        let rest_search_api_handler = search_handler(mock_search_service);\n        assert_eq!(\n            warp::test::request()\n                .path(\"/index-does-not-exist/search?query=myfield:test\")\n                .reply(&rest_search_api_handler)\n                .await\n                .status(),\n            404\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_with_wrong_fieldname() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .returning(|_| Err(SearchError::Internal(\"ty\".to_string())));\n        let rest_search_api_handler = search_handler(mock_search_service);\n        assert_eq!(\n            warp::test::request()\n                .path(\"/index-does-not-exist/search?query=myfield:test\")\n                .reply(&rest_search_api_handler)\n                .await\n                .status(),\n            500\n        );\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_with_invalid_query() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service\n            .expect_root_search()\n            .returning(|_| Err(SearchError::InvalidQuery(\"invalid query\".to_string())));\n        let 
rest_search_api_handler = search_handler(mock_search_service);\n        let response = warp::test::request()\n            .path(\"/my-index/search?query=myfield:test\")\n            .reply(&rest_search_api_handler)\n            .await;\n        assert_eq!(response.status(), 400);\n        let body = String::from_utf8_lossy(response.body());\n        assert!(body.contains(\"invalid query\"));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_route_serialize_results_with_snippet() -> anyhow::Result<()> {\n        let mut mock_search_service = MockSearchService::new();\n        mock_search_service.expect_root_search().returning(|_| {\n            Ok(quickwit_proto::search::SearchResponse {\n                hits: vec![quickwit_proto::search::Hit {\n                    json: r#\"{\"title\": \"foo\", \"body\": \"foo bar baz\"}\"#.to_string(),\n                    partial_hit: None,\n                    snippet: Some(r#\"{\"title\": [], \"body\": [\"foo <em>bar</em> baz\"]}\"#.to_string()),\n                    index_id: \"quickwit-demo-index\".to_string(),\n                }],\n                num_hits: 1,\n                elapsed_time_micros: 16,\n                errors: Vec::new(),\n                ..Default::default()\n            })\n        });\n        let rest_search_api_handler = search_handler(mock_search_service);\n        let resp = warp::test::request()\n            .path(\n                \"/quickwit-demo-index/search?query=bar&search_field=title,body&\\\n                 snippet_fields=title,body\",\n            )\n            .reply(&rest_search_api_handler)\n            .await;\n\n        assert_eq!(resp.status(), 200);\n        let resp_json: JsonValue = serde_json::from_slice(resp.body())?;\n        let expected_response_json = serde_json::json!({\n            \"num_hits\": 1,\n            \"hits\": [{\"title\": \"foo\", \"body\": \"foo bar baz\"}],\n            \"snippets\": [{\"title\": [], \"body\": [\"foo <em>bar</em> 
baz\"]}],\n            \"elapsed_time_micros\": 16,\n            \"errors\": [],\n        });\n        assert_json_eq!(resp_json, expected_response_json);\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_rest_search_api_multi_indexes() {\n        {\n            let mut mock_search_service = MockSearchService::new();\n            mock_search_service\n                .expect_root_search()\n                .with(predicate::function(\n                    |search_request: &quickwit_proto::search::SearchRequest| {\n                        search_request.index_id_patterns\n                            == vec![\"quickwit-demo-*\".to_string(), \"quickwit-demo2\".to_string()]\n                    },\n                ))\n                .returning(|_| Ok(Default::default()));\n            let rest_search_api_handler = search_handler(mock_search_service);\n            assert_eq!(\n                warp::test::request()\n                    .path(\"/quickwit-demo-*,quickwit-demo2/search?query=*\")\n                    .reply(&rest_search_api_handler)\n                    .await\n                    .status(),\n                200\n            );\n            assert_eq!(\n                warp::test::request()\n                    .path(\"/quickwit-demo-*%2Cquickwit-demo2/search?query=*\")\n                    .reply(&rest_search_api_handler)\n                    .await\n                    .status(),\n                200\n            );\n        }\n        {\n            let mut mock_search_service = MockSearchService::new();\n            mock_search_service\n                .expect_root_search()\n                .returning(|_| Ok(Default::default()));\n            let rest_search_api_handler = search_handler(mock_search_service);\n            assert_eq!(\n                warp::test::request()\n                    .path(\"/*/search?query=*\")\n                    .reply(&rest_search_api_handler)\n                    .await\n                    .status(),\n           
     200\n            );\n            let response = warp::test::request()\n                .path(\"/abc!/search?query=*\")\n                .reply(&rest_search_api_handler)\n                .await;\n            assert_eq!(response.status(), 400);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/simple_list.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::str::FromStr;\n\nuse serde::{Deserialize, Deserializer, Serializer};\n\n/// Serializes an `Option<Vec<T>>` holding `Some(values)` into a\n/// comma-separated string of values.\n/// Used to serialize values within the query string.\n///\n/// # Panics\n///\n/// Panics if `value` is `None`.\npub fn to_simple_list<S, T>(\n    value: &Option<Vec<T>>,\n    serializer: S,\n) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>\nwhere\n    S: Serializer,\n    T: ToString,\n{\n    let vec = &value\n        .as_ref()\n        .expect(\"attempt to serialize Option::None value\");\n\n    let serialized_str = vec\n        .iter()\n        .map(|value| value.to_string())\n        .collect::<Vec<_>>() // do not collect here\n        .join(\",\");\n\n    serializer.serialize_str(&serialized_str)\n}\n\n/// Deserializes a comma-separated string of values\n/// into a [`Vec<T>`] wrapped in `Some`.\n/// Used to deserialize a list of values from the query string.\npub fn from_simple_list<'de, D, T>(deserializer: D) -> Result<Option<Vec<T>>, D::Error>\nwhere\n    D: Deserializer<'de>,\n    T: FromStr,\n    <T as FromStr>::Err: ToString,\n{\n    let str_sequence = String::deserialize(deserializer)?;\n    let list = str_sequence\n        .trim_matches(',')\n        .split(',')\n        .map(|item| T::from_str(item))\n        .collect::<Result<Vec<_>, _>>()\n        .map_err(|err| serde::de::Error::custom(err.to_string()))?;\n    
Ok(Some(list))\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/tcp_listener.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::net::SocketAddr;\n\nuse quickwit_proto::tonic;\nuse tokio::net::TcpListener;\nuse tonic::async_trait;\n\n/// Resolve `SocketAddr` into `TcpListener` instances.\n///\n/// This trait can be used to inject existing [`TcpListener`] instances to the\n/// Quickwit REST and gRPC servers when running them in tests.\n#[async_trait]\npub trait TcpListenerResolver: Clone + Send + 'static {\n    async fn resolve(&self, addr: SocketAddr) -> anyhow::Result<TcpListener>;\n}\n\n#[derive(Clone)]\npub struct DefaultTcpListenerResolver;\n\n#[async_trait]\nimpl TcpListenerResolver for DefaultTcpListenerResolver {\n    async fn resolve(&self, addr: SocketAddr) -> anyhow::Result<TcpListener> {\n        TcpListener::bind(addr)\n            .await\n            .map_err(|err| anyhow::anyhow!(err))\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\"))]\npub mod for_tests {\n    use std::collections::HashMap;\n    use std::sync::Arc;\n\n    use anyhow::Context;\n    use tokio::sync::Mutex;\n\n    use super::*;\n\n    #[derive(Clone, Default)]\n    pub struct TestTcpListenerResolver {\n        listeners: Arc<Mutex<HashMap<SocketAddr, TcpListener>>>,\n    }\n\n    #[async_trait]\n    impl TcpListenerResolver for TestTcpListenerResolver {\n        async fn resolve(&self, addr: SocketAddr) -> anyhow::Result<TcpListener> {\n            self.listeners\n               
 .lock()\n                .await\n                .remove(&addr)\n                .context(format!(\"No listener found for address {addr}\"))\n        }\n    }\n\n    impl TestTcpListenerResolver {\n        pub async fn add_listener(&self, listener: TcpListener) {\n            self.listeners\n                .lock()\n                .await\n                .insert(listener.local_addr().unwrap(), listener);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/template_api/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod rest_handler;\n\npub(crate) use rest_handler::{IndexTemplateApi, index_template_api_handlers};\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/template_api/rest_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::any::type_name;\n\nuse bytes::Bytes;\nuse quickwit_config::{ConfigFormat, IndexTemplate, IndexTemplateId, VersionedIndexTemplate};\nuse quickwit_proto::metastore::{\n    CreateIndexTemplateRequest, DeleteIndexTemplatesRequest, GetIndexTemplateRequest,\n    ListIndexTemplatesRequest, MetastoreError, MetastoreResult, MetastoreService,\n    MetastoreServiceClient, serde_utils,\n};\nuse serde_json::Value as JsonValue;\nuse warp::reject::Rejection;\nuse warp::{Filter, Reply};\n\nuse crate::format::{extract_config_format, extract_format_from_qs};\nuse crate::rest::recover_fn;\nuse crate::rest_api_response::into_rest_api_response;\nuse crate::with_arg;\n\n#[derive(utoipa::OpenApi)]\n#[openapi(\n    paths(\n        create_index_template,\n        get_index_template,\n        update_index_template,\n        delete_index_template,\n        list_index_templates,\n    ),\n    components(schemas(VersionedIndexTemplate))\n)]\npub(crate) struct IndexTemplateApi;\n\npub(crate) fn index_template_api_handlers(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl Reply,), Error = Rejection> + Clone {\n    create_index_template_handler(metastore.clone())\n        .or(get_index_template_handler(metastore.clone()))\n        .or(update_index_template_handler(metastore.clone()))\n        
.or(delete_index_template_handler(metastore.clone()))\n        .or(list_index_templates_handler(metastore.clone()))\n        .recover(recover_fn)\n        .boxed()\n}\n\nfn create_index_template_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"templates\")\n        .and(warp::post())\n        .and(warp::filters::body::bytes())\n        .and(extract_config_format())\n        .and(with_arg(metastore))\n        .then(create_index_template)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    post,\n    tag = \"Templates\",\n    path = \"/templates\",\n    request_body = VersionedIndexTemplate,\n    responses(\n        (status = 200, description = \"The index template was successfully created.\", body = VersionedIndexTemplate)\n    ),\n)]\n/// Creates a new index template.\nasync fn create_index_template(\n    body: Bytes,\n    config_format: ConfigFormat,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<IndexTemplate> {\n    let index_template: IndexTemplate =\n        config_format\n            .parse(&body)\n            .map_err(|error| MetastoreError::JsonDeserializeError {\n                struct_name: type_name::<IndexTemplate>().to_string(),\n                message: error.to_string(),\n            })?;\n    index_template.validate().map_err(|error| {\n        let message = format!(\"invalid index template: {error}\");\n        MetastoreError::InvalidArgument { message }\n    })?;\n    let index_template_json = serde_utils::to_json_str(&index_template)?;\n    let create_index_template = CreateIndexTemplateRequest {\n        index_template_json,\n        overwrite: false,\n    };\n    metastore\n        .create_index_template(create_index_template)\n        .await?;\n    Ok(index_template)\n}\n\nfn get_index_template_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl 
warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"templates\" / String)\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(get_index_template)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Templates\",\n    path = \"/templates/{template_id}\",\n    responses(\n        (status = 200, description = \"The index template was successfully retrieved.\", body = VersionedIndexTemplate),\n        (status = 404, description = \"The index template was not found.\")\n    ),\n)]\n/// Retrieves the index template identified by `template_id`.\nasync fn get_index_template(\n    template_id: IndexTemplateId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<IndexTemplate> {\n    let get_index_template_request = GetIndexTemplateRequest { template_id };\n    let get_index_template_response = metastore\n        .get_index_template(get_index_template_request)\n        .await?;\n    let index_template: IndexTemplate =\n        serde_utils::from_json_str(&get_index_template_response.index_template_json)?;\n    Ok(index_template)\n}\n\nfn update_index_template_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"templates\" / String)\n        .and(warp::put())\n        .and(warp::filters::body::bytes())\n        .and(extract_config_format())\n        .and(with_arg(metastore))\n        .then(update_index_template)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    put,\n    tag = \"Templates\",\n    path = \"/templates/{template_id}\",\n    request_body = VersionedIndexTemplate,\n    responses(\n        (status = 200, description = \"The index template was successfully updated.\", body = VersionedIndexTemplate),\n        (status = 404, description = \"The index template was not found.\")\n    ),\n)]\n/// Updates the 
index template identified by `template_id`.\nasync fn update_index_template(\n    template_id: IndexTemplateId,\n    body: Bytes,\n    config_format: ConfigFormat,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<IndexTemplate> {\n    let mut json_value: JsonValue =\n        config_format\n            .parse(&body)\n            .map_err(|error| MetastoreError::JsonDeserializeError {\n                struct_name: type_name::<IndexTemplate>().to_string(),\n                message: error.to_string(),\n            })?;\n    json_value[\"template_id\"] = JsonValue::String(template_id);\n\n    if let Some(JsonValue::Number(number)) = json_value.get(\"version\") {\n        json_value[\"version\"] = JsonValue::String(number.to_string());\n    }\n    let index_template: IndexTemplate = serde_utils::from_json_value(json_value)?;\n    index_template.validate().map_err(|error| {\n        let message = format!(\"invalid index template: {error}\");\n        MetastoreError::InvalidArgument { message }\n    })?;\n    let index_template_json = serde_utils::to_json_str(&index_template)?;\n    let create_index_template = CreateIndexTemplateRequest {\n        index_template_json,\n        overwrite: true,\n    };\n    metastore\n        .create_index_template(create_index_template)\n        .await?;\n    Ok(index_template)\n}\n\nfn delete_index_template_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"templates\" / String)\n        .and(warp::delete())\n        .and(with_arg(metastore))\n        .then(delete_index_template)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    delete,\n    tag = \"Templates\",\n    path = \"/templates/{template_id}\",\n    responses(\n        (status = 200, description = \"The index template was successfully deleted.\"),\n        (status = 404, description = \"The index template was not 
found.\")\n    ),\n)]\n/// Deletes the index template identified by the provided `template_id`.\nasync fn delete_index_template(\n    template_id: IndexTemplateId,\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<()> {\n    let template_ids = vec![template_id];\n    let delete_index_templates_request = DeleteIndexTemplatesRequest { template_ids };\n    metastore\n        .delete_index_templates(delete_index_templates_request)\n        .await?;\n    Ok(())\n}\n\nfn list_index_templates_handler(\n    metastore: MetastoreServiceClient,\n) -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path!(\"templates\")\n        .and(warp::get())\n        .and(with_arg(metastore))\n        .then(list_index_templates)\n        .and(extract_format_from_qs())\n        .map(into_rest_api_response)\n}\n\n#[utoipa::path(\n    get,\n    tag = \"Templates\",\n    path = \"/templates\",\n    responses(\n        (status = 200, description = \"The index templates were successfully retrieved.\", body = [VersionedIndexTemplate]),\n    ),\n)]\n/// Retrieves all the index templates stored in the metastore.\nasync fn list_index_templates(\n    metastore: MetastoreServiceClient,\n) -> MetastoreResult<Vec<IndexTemplate>> {\n    let list_index_templates_request = ListIndexTemplatesRequest {};\n    let list_index_templates_response = metastore\n        .list_index_templates(list_index_templates_request)\n        .await?;\n    let index_templates: Vec<IndexTemplate> = list_index_templates_response\n        .index_templates_json\n        .into_iter()\n        .map(|index_template_json| {\n            serde_utils::from_json_str::<IndexTemplate>(&index_template_json)\n        })\n        .collect::<MetastoreResult<_>>()?;\n    Ok(index_templates)\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_proto::metastore::{\n        EmptyResponse, EntityKind, GetIndexTemplateResponse, ListIndexTemplatesResponse,\n        MockMetastoreService,\n    };\n    use 
serde_json::json;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_create_index_template() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_create_index_template()\n            .return_once(|request| {\n                assert!(!request.overwrite);\n\n                let index_template: IndexTemplate =\n                    serde_json::from_str(&request.index_template_json).unwrap();\n\n                assert_eq!(index_template.template_id, \"test-template-foo\");\n                assert_eq!(index_template.index_id_patterns, [\"test-index-foo*\"]);\n\n                Ok(EmptyResponse {})\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let create_index_template_handler = create_index_template_handler(metastore);\n        let response = warp::test::request()\n            .path(\"/templates\")\n            .method(\"POST\")\n            .json(&json!({\n                \"version\": \"0.7\",\n                \"template_id\": \"test-template-foo\",\n                \"index_id_patterns\": [\"test-index-foo*\"],\n                \"doc_mapping\": {},\n            }))\n            .reply(&create_index_template_handler)\n            .await;\n        assert_eq!(response.status(), 200);\n    }\n\n    #[tokio::test]\n    async fn test_get_index_template() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_get_index_template()\n            .withf(|request| request.template_id == \"test-template-foo\")\n            .return_once(|request| {\n                assert_eq!(request.template_id, \"test-template-foo\");\n\n                let error = MetastoreError::NotFound(EntityKind::IndexTemplate {\n                    template_id: request.template_id,\n                });\n                Err(error)\n            });\n        mock_metastore\n            .expect_get_index_template()\n            
.withf(|request| request.template_id == \"test-template-bar\")\n            .return_once(|request| {\n                assert_eq!(request.template_id, \"test-template-bar\");\n\n                let index_template =\n                    IndexTemplate::for_test(\"test-template-bar\", &[\"test-index-bar*\"], 100);\n                let index_template_json = serde_utils::to_json_str(&index_template).unwrap();\n                let response = GetIndexTemplateResponse {\n                    index_template_json,\n                };\n                Ok(response)\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let get_index_template_handler = get_index_template_handler(metastore);\n\n        let response = warp::test::request()\n            .path(\"/templates/test-template-foo\")\n            .reply(&get_index_template_handler)\n            .await;\n        assert_eq!(response.status(), 404);\n\n        let response = warp::test::request()\n            .path(\"/templates/test-template-bar\")\n            .reply(&get_index_template_handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let index_template: IndexTemplate = serde_json::from_slice(response.body()).unwrap();\n        assert_eq!(index_template.template_id, \"test-template-bar\");\n        assert_eq!(index_template.index_id_patterns, [\"test-index-bar*\"]);\n        assert_eq!(index_template.priority, 100);\n    }\n\n    #[tokio::test]\n    async fn test_update_index_template() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_create_index_template()\n            .return_once(|request| {\n                assert!(request.overwrite);\n\n                let index_template: IndexTemplate =\n                    serde_json::from_str(&request.index_template_json).unwrap();\n\n                assert_eq!(index_template.template_id, \"test-template-foo\");\n                
assert_eq!(index_template.index_id_patterns, [\"test-index-foo*\"]);\n\n                Ok(EmptyResponse {})\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let update_index_template_handler = update_index_template_handler(metastore);\n        let response = warp::test::request()\n            .path(\"/templates/test-template-foo\")\n            .method(\"PUT\")\n            .json(&json!({\n                \"version\": \"0.7\",\n                \"template_id\": \"test-template-bar\", // This `template_id` should be ignored and overridden by the path parameter.\n                \"index_id_patterns\": [\"test-index-foo*\"],\n                \"doc_mapping\": {},\n            }))\n            .reply(&update_index_template_handler)\n            .await;\n        assert_eq!(response.status(), 200);\n    }\n\n    #[tokio::test]\n    async fn test_delete_index_template() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_delete_index_templates()\n            .return_once(|request| {\n                assert_eq!(request.template_ids, [\"test-template-foo\"]);\n                Ok(EmptyResponse {})\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let delete_index_template_handler = delete_index_template_handler(metastore);\n        let response = warp::test::request()\n            .path(\"/templates/test-template-foo\")\n            .method(\"DELETE\")\n            .reply(&delete_index_template_handler)\n            .await;\n        assert_eq!(response.status(), 200);\n    }\n\n    #[tokio::test]\n    async fn test_list_index_templates() {\n        let mut mock_metastore = MockMetastoreService::new();\n        mock_metastore\n            .expect_list_index_templates()\n            .return_once(|_request| {\n                let index_template_foo =\n                    IndexTemplate::for_test(\"test-template-foo\", 
&[\"test-index-foo*\"], 100);\n                let index_template_foo_json = serde_json::to_string(&index_template_foo).unwrap();\n\n                let index_template_bar =\n                    IndexTemplate::for_test(\"test-template-bar\", &[\"test-index-bar*\"], 200);\n                let index_template_bar_json = serde_json::to_string(&index_template_bar).unwrap();\n\n                let response = ListIndexTemplatesResponse {\n                    index_templates_json: vec![index_template_foo_json, index_template_bar_json],\n                };\n                Ok(response)\n            });\n        let metastore = MetastoreServiceClient::from_mock(mock_metastore);\n        let list_index_templates_handler = list_index_templates_handler(metastore);\n        let response = warp::test::request()\n            .path(\"/templates\")\n            .method(\"GET\")\n            .reply(&list_index_templates_handler)\n            .await;\n        assert_eq!(response.status(), 200);\n\n        let mut index_templates: Vec<IndexTemplate> =\n            serde_json::from_slice(response.body()).unwrap();\n        index_templates.sort_unstable_by(|left, right| left.template_id.cmp(&right.template_id));\n\n        assert_eq!(index_templates.len(), 2);\n        assert_eq!(index_templates[0].template_id, \"test-template-bar\");\n        assert_eq!(index_templates[1].template_id, \"test-template-foo\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-serve/src/ui_handler.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse once_cell::sync::Lazy;\nuse quickwit_telemetry::payload::TelemetryEvent;\nuse regex::Regex;\nuse rust_embed::RustEmbed;\nuse warp::hyper::header::HeaderValue;\nuse warp::path::Tail;\nuse warp::reply::Response;\nuse warp::{Filter, Rejection};\n\nuse crate::rest::recover_fn;\n\n/// Regular expression to identify which path should serve an asset file.\n/// If not matched, the server serves the `index.html` file.\nconst PATH_PATTERN: &str = r\"(^static|\\.(png|json|txt|ico|js|map|css|woff2|ttf)$)\";\n\nconst UI_INDEX_FILE_NAME: &str = \"index.html\";\n\n#[derive(RustEmbed)]\n#[folder = \"../quickwit-ui/build/\"]\nstruct Asset;\n\npub fn ui_handler() -> impl Filter<Extract = (impl warp::Reply,), Error = Rejection> + Clone {\n    warp::path(\"ui\")\n        .and(warp::path::tail())\n        .and_then(serve_file)\n        .recover(recover_fn)\n        .boxed()\n}\n\nasync fn serve_file(path: Tail) -> Result<impl warp::Reply, Rejection> {\n    serve_impl(path.as_str()).await\n}\n\nasync fn serve_impl(path: &str) -> Result<impl warp::Reply + use<>, Rejection> {\n    static PATH_PTN: Lazy<Regex> = Lazy::new(|| Regex::new(PATH_PATTERN).unwrap());\n    let path_to_file = if PATH_PTN.is_match(path) {\n        path\n    } else {\n        // Quickwit UI is a single page application.\n        // Any path request that is not an asset should serve the 
`index.html` file.\n        // The client (browser) usually requests `index.html` once unless the user refreshes the\n        // page.\n        quickwit_telemetry::send_telemetry_event(TelemetryEvent::UiIndexPageLoad).await;\n        UI_INDEX_FILE_NAME\n    };\n    let asset = Asset::get(path_to_file).ok_or_else(warp::reject::not_found)?;\n    let mime = mime_guess::from_path(path_to_file).first_or_octet_stream();\n\n    let mut res = Response::new(asset.data.into_owned().into());\n    res.headers_mut().insert(\n        \"content-type\",\n        HeaderValue::from_str(mime.as_ref()).unwrap(),\n    );\n    Ok(res)\n}\n\n#[cfg(test)]\nmod tests {\n    use super::*;\n\n    #[test]\n    fn test_path_regex() {\n        let path_ptn = Regex::new(PATH_PATTERN).unwrap();\n\n        assert!(path_ptn.is_match(\"manifest.json\"));\n        assert!(path_ptn.is_match(\"favicon.ico\"));\n        assert!(path_ptn.is_match(\"static/js/main.df380554.js.map\"));\n        assert!(path_ptn.is_match(\"android-chrome-192x192.png\"));\n        assert!(!path_ptn.is_match(\"search\"));\n        assert!(!path_ptn.is_match(\"\"));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/Cargo.toml",
    "content": "[package]\nname = \"quickwit-storage\"\ndescription = \"Storage layer\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nanyhow = { workspace = true }\nasync-trait = { workspace = true }\nbase64 = { workspace = true }\nbytes = { workspace = true }\nbytesize = { workspace = true }\nfnv = { workspace = true }\nfutures = { workspace = true }\nhttp-body-util = { workspace = true}\nhyper = { workspace = true }\nlru = { workspace = true }\nmd5 = { workspace = true }\nmini-moka = { workspace = true }\nmockall = { workspace = true, optional = true }\nonce_cell = { workspace = true }\npin-project = { workspace = true }\nquick_cache = { workspace = true }\nregex = { workspace = true }\nserde = { workspace = true }\nserde_json = { workspace = true }\ntantivy = { workspace = true }\ntempfile = { workspace = true }\nthiserror = { workspace = true }\ntokio = { workspace = true, features = [\"test-util\"] }\ntokio-stream = { workspace = true }\ntokio-util = { workspace = true }\ntracing = { workspace = true }\nulid = { workspace = true }\n\naws-config = { workspace = true }\naws-credential-types = { workspace = true }\naws-sdk-s3 = { workspace = true }\naws-smithy-types = { workspace = true }\n\nazure_core = { workspace = true, optional = true }\nazure_identity = { workspace = true, optional = true }\nazure_storage = { workspace = true, optional = true }\nazure_storage_blobs = { workspace = true, optional = true }\n\nquickwit-aws = { workspace = true }\nquickwit-common = { workspace = true }\nquickwit-config = { workspace = true }\nquickwit-proto = { workspace = true }\n\nopendal = { workspace = true, optional = true }\nreqwest = { workspace = true, optional = true }\n\n[dev-dependencies]\nhttp = { workspace = true }\nmockall = { workspace = true }\nproptest = { workspace = true }\ntokio = { workspace = 
true }\ntracing-subscriber = { workspace = true }\n\naws-sdk-s3 = { workspace = true }\naws-smithy-runtime = { workspace = true, features = [\"test-util\"] }\n\nquickwit-common = { workspace = true, features = [\"testsuite\"] }\n\n[features]\nazure = [\n  \"azure_core\",\n  \"azure_identity\",\n  \"azure_storage\",\n  \"azure_storage_blobs\",\n  \"azure_core/hmac_rust\",\n  \"azure_core/enable_reqwest_rustls\",\n  \"azure_storage/enable_reqwest_rustls\",\n  \"azure_storage_blobs/enable_reqwest_rustls\",\n]\ngcs = [\"dep:opendal\", \"opendal/services-gcs\"]\nci-test = []\nintegration-testsuite = [\n  \"azure\",\n  \"azure_core/azurite_workaround\",\n  \"azure_storage_blobs/azurite_workaround\",\n  \"gcs\",                                    # Stands for Google cloud storage.\n  \"dep:reqwest\",\n]\ntestsuite = [\"mockall\"]\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/bundle_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::convert::TryInto;\nuse std::fmt::Debug;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse anyhow::Context;\nuse async_trait::async_trait;\nuse quickwit_common::chunk_range;\nuse quickwit_common::uri::Uri;\nuse serde::{Deserialize, Serialize};\nuse tantivy::HasLen;\nuse tantivy::directory::FileSlice;\nuse tokio::io::{AsyncRead, AsyncWriteExt};\nuse tracing::error;\n\nuse crate::storage::SendableAsync;\nuse crate::{\n    BulkDeleteError, OwnedBytes, Storage, StorageError, StorageResult, VersionedComponent,\n};\n\n/// BundleStorage bundles together multiple files into a single file.\n/// with some metadata\npub struct BundleStorage {\n    storage: Arc<dyn Storage>,\n    /// The file path of the bundle in the storage.\n    bundle_filepath: PathBuf,\n    metadata: BundleStorageFileOffsets,\n}\n\nimpl BundleStorage {\n    /// Opens a BundleStorage.\n    ///\n    /// The provided data must include the footer_bytes at the end of the slice, but it can have\n    /// more up front.\n    ///\n    /// Returns (Hotcache, Self)\n    pub fn open_from_split_data_with_owned_bytes(\n        storage: Arc<dyn Storage>,\n        bundle_filepath: PathBuf,\n        split_data: OwnedBytes,\n    ) -> anyhow::Result<(FileSlice, Self)> {\n        Self::open_from_split_data(\n    
        storage,\n            bundle_filepath,\n            FileSlice::new(Arc::new(split_data)),\n        )\n    }\n    /// Opens a BundleStorage.\n    ///\n    /// The provided data must include the footer_bytes at the end of the slice, but it can have\n    /// more up front.\n    ///\n    /// Returns (Hotcache, Self)\n    pub fn open_from_split_data(\n        storage: Arc<dyn Storage>,\n        bundle_filepath: PathBuf,\n        split_data: FileSlice,\n    ) -> anyhow::Result<(FileSlice, Self)> {\n        let (hotcache, metadata) = BundleStorageFileOffsets::open_from_split_data(split_data)?;\n        Ok((\n            hotcache,\n            BundleStorage {\n                storage,\n                bundle_filepath,\n                metadata,\n            },\n        ))\n    }\n\n    /// Returns Iterator over files contained in the bundle.\n    pub fn iter_files(&self) -> impl Iterator<Item = &PathBuf> {\n        self.metadata.files.keys()\n    }\n}\n\nconst SPLIT_HOTBYTES_FOOTER_LENGTH_NUM_BYTES: usize = std::mem::size_of::<u32>();\nconst BUNDLE_METADATA_LENGTH_NUM_BYTES: usize = std::mem::size_of::<u32>();\n\n#[derive(Copy, Clone, Default)]\n#[repr(u32)]\npub enum BundleStorageFileOffsetsVersions {\n    #[default]\n    V1 = 1,\n}\n\nimpl VersionedComponent for BundleStorageFileOffsetsVersions {\n    const MAGIC_NUMBER: u32 = 403_881_646u32;\n\n    type Component = BundleStorageFileOffsets;\n\n    fn to_version_code(self) -> u32 {\n        self as u32\n    }\n\n    fn try_from_version_code_impl(version_code: u32) -> Option<Self> {\n        match version_code {\n            1 => Some(Self::V1),\n            _ => None,\n        }\n    }\n\n    fn serialize_impl(component: &BundleStorageFileOffsets, output: &mut Vec<u8>) {\n        let metadata_json = serde_json::to_string(component).unwrap();\n        output.extend_from_slice(metadata_json.as_bytes());\n    }\n\n    fn deserialize_impl(&self, bytes: &mut OwnedBytes) -> anyhow::Result<Self::Component> {\n        
serde_json::from_reader(bytes).context(\"deserializing bundle storage file offsets failed\")\n    }\n}\n\n/// Returns the file offsets in the file bundle.\n#[derive(Debug, Default, Serialize, Deserialize, Clone)]\npub struct BundleStorageFileOffsets {\n    /// The files and their offsets in the body\n    pub files: HashMap<PathBuf, Range<u64>>,\n}\n\nimpl BundleStorageFileOffsets {\n    /// The file needs to include the split data (with the hotcache at the end).\n    /// See docs/internals/split-format.md\n    /// [Files, FileMetadata, FileMetadata Len, HotCache, HotCache Len]\n    /// Returns (Hotcache, Self)\n    fn open_from_split_data(file: FileSlice) -> anyhow::Result<(FileSlice, Self)> {\n        let (bundle_and_hotcache_bytes, hotcache_num_bytes_data) =\n            file.split_from_end(SPLIT_HOTBYTES_FOOTER_LENGTH_NUM_BYTES);\n        let hotcache_num_bytes: u32 = u32::from_le_bytes(\n            hotcache_num_bytes_data\n                .read_bytes()?\n                .as_ref()\n                .try_into()\n                .unwrap(),\n        );\n        let (bundle, hotcache) =\n            bundle_and_hotcache_bytes.split_from_end(hotcache_num_bytes as usize);\n        Ok((hotcache, Self::open(bundle)?))\n    }\n\n    /// FileSlice needs to end with the bundle (without hotcache from the split at the end).\n    /// See docs/internals/split-format.md\n    /// [Files, FileMetadata, FileMetadata Len]\n    pub fn open(file: FileSlice) -> anyhow::Result<Self> {\n        let (tantivy_files_data, num_bytes_file_metadata) =\n            file.split_from_end(BUNDLE_METADATA_LENGTH_NUM_BYTES);\n        let footer_num_bytes: u32 = u32::from_le_bytes(\n            num_bytes_file_metadata\n                .read_bytes()?\n                .as_slice()\n                .try_into()\n                .unwrap(),\n        );\n\n        let mut bundle_storage_file_offsets_data = tantivy_files_data\n            .slice_from_end(footer_num_bytes as usize)\n            .read_bytes()?;\n        
BundleStorageFileOffsetsVersions::try_read_component(&mut bundle_storage_file_offsets_data)\n    }\n\n    /// Returns file offsets for given path.\n    pub fn get(&self, path: &Path) -> Option<Range<u64>> {\n        self.files.get(path).cloned()\n    }\n\n    /// Returns whether file exists in metadata.\n    pub fn exists(&self, path: &Path) -> bool {\n        self.files.contains_key(path)\n    }\n}\n\n#[async_trait]\nimpl Storage for BundleStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if !self\n            .storage\n            .exists(&self.bundle_filepath)\n            .await\n            .unwrap_or(false)\n        {\n            anyhow::bail!(\"`{}` not found in storage\", self.bundle_filepath.display())\n        }\n        Ok(())\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        _payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        Err(unsupported_operation(&[path]))\n    }\n\n    async fn copy_to(\n        &self,\n        path: &Path,\n        output: &mut dyn SendableAsync,\n    ) -> crate::StorageResult<()> {\n        let file_num_bytes = self.file_num_bytes(path).await? 
as usize;\n        let block_size = 100_000_000;\n        for block in chunk_range(0..file_num_bytes, block_size) {\n            let file_content = self.get_slice(path, block).await?;\n            output.write_all(&file_content).await?;\n        }\n        output.flush().await?;\n        Ok(())\n    }\n\n    async fn get_slice(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> crate::StorageResult<OwnedBytes> {\n        let file_offsets = self.metadata.get(path).ok_or_else(|| {\n            crate::StorageErrorKind::NotFound\n                .with_error(anyhow::anyhow!(\"missing file `{}`\", path.display()))\n        })?;\n        let new_range =\n            file_offsets.start as usize + range.start..file_offsets.start as usize + range.end;\n        self.storage\n            .get_slice(&self.bundle_filepath, new_range)\n            .await\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        _range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        Err(unsupported_operation(&[path]))\n    }\n\n    async fn get_all(&self, path: &Path) -> crate::StorageResult<OwnedBytes> {\n        let file_offsets = self.metadata.get(path).ok_or_else(|| {\n            crate::StorageErrorKind::NotFound\n                .with_error(anyhow::anyhow!(\"missing file `{}`\", path.display()))\n        })?;\n        self.storage\n            .get_slice(\n                &self.bundle_filepath,\n                file_offsets.start as usize..file_offsets.end as usize,\n            )\n            .await\n    }\n\n    async fn delete(&self, path: &Path) -> crate::StorageResult<()> {\n        Err(unsupported_operation(&[path]))\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        Err(BulkDeleteError {\n            error: Some(unsupported_operation(paths)),\n            ..Default::default()\n        })\n    }\n\n    async fn exists(&self, path: &Path) 
-> crate::StorageResult<bool> {\n        // also check if self.bundle_file_name exists ?\n        Ok(self.metadata.exists(path))\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        let file_range = self.metadata.get(path).ok_or_else(|| {\n            crate::StorageErrorKind::NotFound\n                .with_error(anyhow::anyhow!(\"missing file `{}`\", path.display()))\n        })?;\n        Ok(file_range.end - file_range.start)\n    }\n\n    fn uri(&self) -> &Uri {\n        self.storage.uri()\n    }\n}\n\nimpl HasLen for BundleStorage {\n    fn len(&self) -> usize {\n        unimplemented!()\n    }\n}\n\nimpl fmt::Debug for BundleStorage {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        write!(\n            f,\n            \"BundleStorage({:?}, files={:?})\",\n            &self.bundle_filepath, self.metadata\n        )\n    }\n}\n\nfn unsupported_operation(paths: &[&Path]) -> StorageError {\n    let msg = \"Unsupported operation. 
BundleStorage only supports async reads\";\n    error!(paths=?paths, msg);\n    io::Error::other(format!(\"{msg}: {paths:?}\")).into()\n}\n\n#[cfg(test)]\nmod tests {\n    use std::fs::{self, File};\n    use std::io::Write;\n\n    use super::*;\n    use crate::{PutPayload, RamStorageBuilder, SplitPayloadBuilder};\n\n    #[tokio::test]\n    async fn bundle_storage_file_offsets() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(&[123, 76])?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(&[99, 55, 44])?;\n\n        let buffer = SplitPayloadBuilder::get_split_payload(\n            &[test_filepath1.clone(), test_filepath2.clone()],\n            &[],\n            &[5, 5, 5],\n        )?\n        .read_all()\n        .await?;\n\n        let bundle_filepath = Path::new(\"bundle\");\n        let bundle_file_slice = FileSlice::new(Arc::new(buffer.clone()));\n        let (hotcache, metadata) =\n            BundleStorageFileOffsets::open_from_split_data(bundle_file_slice)?;\n        assert_eq!(hotcache.read_bytes().unwrap().as_ref(), &[5, 5, 5]);\n        let ram_storage = RamStorageBuilder::default()\n            .put(&bundle_filepath.to_string_lossy(), &buffer)\n            .build();\n\n        let bundle_storage = BundleStorage {\n            metadata,\n            bundle_filepath: bundle_filepath.to_path_buf(),\n            storage: Arc::new(ram_storage),\n        };\n        let f1_data = bundle_storage.get_all(Path::new(\"f1\")).await?;\n        assert_eq!(&*f1_data, &[123u8, 76u8]);\n\n        let f2_data = bundle_storage.get_all(Path::new(\"f2\")).await?;\n        assert_eq!(&f2_data[..], &[99, 55, 44]);\n\n        Ok(())\n    }\n    #[tokio::test]\n    async fn bundle_storage_test() -> anyhow::Result<()> {\n   
     let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(&[123, 76])?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(&[99, 55, 44])?;\n\n        let buffer = SplitPayloadBuilder::get_split_payload(\n            &[test_filepath1.clone(), test_filepath2.clone()],\n            &[],\n            &[1, 3, 3, 7],\n        )?\n        .read_all()\n        .await?;\n\n        let (hotcache, metadata) =\n            BundleStorageFileOffsets::open_from_split_data(FileSlice::from(buffer.to_vec()))?;\n        assert_eq!(hotcache.read_bytes().unwrap().as_ref(), &[1, 3, 3, 7]);\n\n        let bundle_filepath = Path::new(\"bundle\");\n        let ram_storage = RamStorageBuilder::default()\n            .put(&bundle_filepath.to_string_lossy(), &buffer)\n            .build();\n\n        let bundle_storage = BundleStorage {\n            metadata,\n            bundle_filepath: bundle_filepath.to_path_buf(),\n            storage: Arc::new(ram_storage),\n        };\n        let f1_data = bundle_storage.get_all(Path::new(\"f1\")).await?;\n        assert_eq!(&*f1_data, &[123u8, 76u8]);\n\n        let f2_data = bundle_storage.get_all(Path::new(\"f2\")).await?;\n        assert_eq!(&f2_data[..], &[99, 55, 44]);\n\n        let copy_to_file = temp_dir.path().join(\"copy_file\");\n        bundle_storage\n            .copy_to_file(Path::new(\"f2\"), &copy_to_file)\n            .await?;\n        let file_content = fs::read(copy_to_file).unwrap();\n        assert_eq!(&f2_data[..], file_content);\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn bundlestorage_test_empty() -> anyhow::Result<()> {\n        let buffer = SplitPayloadBuilder::get_split_payload(&[], &[], &[])?\n            .read_all()\n            .await?;\n\n        let (_hotcache, metadata) =\n          
  BundleStorageFileOffsets::open_from_split_data(FileSlice::from(buffer.to_vec()))?;\n\n        let bundle_filepath = PathBuf::from(\"bundle\");\n        let ram_storage = RamStorageBuilder::default()\n            .put(&bundle_filepath.to_string_lossy(), &buffer)\n            .build();\n        let bundle_storage = BundleStorage {\n            metadata,\n            bundle_filepath,\n            storage: Arc::new(ram_storage),\n        };\n\n        assert_eq!(bundle_storage.exists(Path::new(\"blub\")).await?, false);\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/base_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Borrow;\nuse std::hash::Hash;\nuse std::sync::{Arc, Weak};\nuse std::time::Duration;\n\nuse bytesize::ByteSize;\nuse lru::LruCache;\nuse mini_moka::sync::Cache as MokaCache;\nuse quick_cache::unsync::Cache as QuickCache;\nuse quickwit_config::CachePolicy;\nuse tokio::time::Instant;\nuse tracing::{error, warn};\n\nuse crate::OwnedBytes;\nuse crate::cache::stored_item::{StoredItem, ValueLen};\nuse crate::metrics::SingleCacheMetrics;\n\n/// We do not evict anything that has been accessed in the last 60s.\n///\n/// The goal is to behave better on scan access patterns, without being as aggressive as\n/// using a MRU strategy.\n///\n/// TLDR is:\n///\n/// If two items have been access in the last 60s it is not really worth considering the\n/// latter too be more recent than the previous and do an eviction.\n/// The difference is not significant enough to raise the probability of its future access.\n///\n/// On the other hand, for very large queries involving enough data to saturate the cache,\n/// we are facing a scanning pattern. 
If variations of this query are repeated over and over,\n/// a regular LRU eviction policy would yield a hit rate of 0.\npub(crate) const LRU_MIN_TIME_SINCE_LAST_ACCESS: Duration = Duration::from_secs(60);\n\n// A fake entry inside a cache, which the cache believes to be of the given size\n#[derive(Clone)]\npub(crate) struct FakeCacheEntry(pub usize);\n\nimpl ValueLen for FakeCacheEntry {\n    fn len(&self) -> usize {\n        self.0\n    }\n}\n\n#[derive(Clone, Copy, Debug, PartialEq)]\npub(crate) enum Capacity {\n    Unlimited,\n    InBytes(usize),\n}\n\nimpl Capacity {\n    fn exceeds_capacity(&self, num_bytes: usize) -> bool {\n        match *self {\n            Capacity::Unlimited => false,\n            Capacity::InBytes(capacity_in_bytes) => num_bytes > capacity_in_bytes,\n        }\n    }\n}\n\npub(crate) enum AnyCache<K: Hash + Eq, V: ValueLen = OwnedBytes> {\n    Lru(Lru<K, V>),\n    S3Fifo(S3Fifo<K, V>),\n    TinyLfu(TinyLfu<K, V>),\n}\n\nimpl<K: Hash + Eq + Send + Sync + 'static, V: ValueLen + Clone + Send + Sync + 'static>\n    AnyCache<K, V>\n{\n    pub fn from_policy_and_capacity(\n        policy: CachePolicy,\n        capacity: ByteSize,\n        cache_metrics: SingleCacheMetrics,\n    ) -> Self {\n        match policy {\n            CachePolicy::Lru => AnyCache::Lru(Lru::with_capacity(\n                Capacity::InBytes(capacity.as_u64().try_into().unwrap_or(usize::MAX)),\n                cache_metrics,\n            )),\n            CachePolicy::S3Fifo => {\n                AnyCache::S3Fifo(S3Fifo::with_capacity(capacity.as_u64(), cache_metrics))\n            }\n            CachePolicy::TinyLfu => {\n                AnyCache::TinyLfu(TinyLfu::with_capacity(capacity.as_u64(), cache_metrics))\n            }\n        }\n    }\n    pub fn unbounded(cache_metrics: SingleCacheMetrics) -> Self {\n        AnyCache::Lru(Lru::with_capacity(Capacity::Unlimited, cache_metrics))\n    }\n\n    pub fn get<Q>(&mut self, cache_key: &Q) -> Option<V>\n    where\n        
K: Borrow<Q>,\n        Arc<K>: Borrow<Q>,\n        Q: Hash + Eq + ?Sized,\n    {\n        match self {\n            AnyCache::Lru(lru) => lru.get(cache_key),\n            AnyCache::S3Fifo(s3fifo) => s3fifo.get(cache_key),\n            AnyCache::TinyLfu(tiny_lfu) => tiny_lfu.get(cache_key),\n        }\n    }\n    pub fn put(&mut self, key: K, value: V) {\n        match self {\n            AnyCache::Lru(lru) => lru.put(key, value),\n            AnyCache::S3Fifo(s3fifo) => s3fifo.put(key, value),\n            AnyCache::TinyLfu(tiny_lfu) => tiny_lfu.put(key, value),\n        }\n    }\n}\n\npub struct Lru<K: Hash + Eq, V> {\n    lru_cache: LruCache<K, StoredItem<V>>,\n    num_items: usize,\n    num_bytes: u64,\n    capacity: Capacity,\n    cache_metrics: SingleCacheMetrics,\n}\n\nimpl<K: Hash + Eq, V> Drop for Lru<K, V> {\n    fn drop(&mut self) {\n        // we don't count this toward evicted entries, as we are clearing the whole cache\n        self.cache_metrics.in_cache_count.sub(self.num_items as i64);\n        self.cache_metrics\n            .in_cache_num_bytes\n            .sub(self.num_bytes as i64);\n    }\n}\n\nimpl<K: Hash + Eq, V: ValueLen + Clone> Lru<K, V> {\n    /// Creates a new NeedMutSliceCache with the given capacity.\n    fn with_capacity(capacity: Capacity, cache_metrics: SingleCacheMetrics) -> Self {\n        Lru {\n            // The limit will be decided by the amount of memory in the cache,\n            // not the number of items in the cache.\n            // Enforcing this limit is done in the `NeedMutCache` impl.\n            lru_cache: LruCache::unbounded(),\n            num_items: 0,\n            num_bytes: 0,\n            capacity,\n            cache_metrics,\n        }\n    }\n\n    fn record_item(&mut self, num_bytes: u64) {\n        self.num_items += 1;\n        self.num_bytes += num_bytes;\n        self.cache_metrics.in_cache_count.inc();\n        self.cache_metrics.in_cache_num_bytes.add(num_bytes as i64);\n    }\n\n    fn 
drop_item(&mut self, num_bytes: u64) {\n        self.num_items -= 1;\n        self.num_bytes -= num_bytes;\n        self.cache_metrics.in_cache_count.dec();\n        self.cache_metrics.in_cache_num_bytes.sub(num_bytes as i64);\n        self.cache_metrics.evict_num_items.inc();\n        self.cache_metrics.evict_num_bytes.inc_by(num_bytes);\n    }\n\n    pub fn get<Q>(&mut self, cache_key: &Q) -> Option<V>\n    where\n        K: Borrow<Q>,\n        Q: Hash + Eq + ?Sized,\n    {\n        let item_opt = self.lru_cache.get_mut(cache_key);\n        if let Some(item) = item_opt {\n            self.cache_metrics.hits_num_items.inc();\n            self.cache_metrics.hits_num_bytes.inc_by(item.len() as u64);\n            Some(item.payload())\n        } else {\n            self.cache_metrics.misses_num_items.inc();\n            None\n        }\n    }\n\n    /// Attempt to put the given amount of data in the cache.\n    /// This may fail silently if the owned_bytes slice is larger than the cache\n    /// capacity.\n    fn put(&mut self, key: K, bytes: V) {\n        if self.capacity.exceeds_capacity(bytes.len()) {\n            // The value does not fit in the cache. 
We simply don't store it.\n            if self.capacity != Capacity::InBytes(0) {\n                warn!(\n                    capacity_in_bytes = ?self.capacity,\n                    len = bytes.len(),\n                    \"Downloaded a byte slice larger than the cache capacity.\"\n                );\n            }\n            return;\n        }\n        if let Some(previous_data) = self.lru_cache.pop(&key) {\n            self.drop_item(previous_data.len() as u64);\n        }\n\n        let now = Instant::now();\n        while self\n            .capacity\n            .exceeds_capacity(self.num_bytes as usize + bytes.len())\n        {\n            if let Some((_, candidate_for_eviction)) = self.lru_cache.peek_lru() {\n                let time_since_last_access =\n                    now.duration_since(candidate_for_eviction.last_access_time());\n                if time_since_last_access < LRU_MIN_TIME_SINCE_LAST_ACCESS {\n                    // It is not worth doing an eviction.\n                    // TODO: It is sub-optimal that we might have needlessly evicted items in this\n                    // loop before just returning.\n                    return;\n                }\n            }\n            if let Some((_, bytes)) = self.lru_cache.pop_lru() {\n                self.drop_item(bytes.len() as u64);\n            } else {\n                error!(\n                    \"Logical error. Even after removing all of the items in the cache the \\\n                     capacity is insufficient. This case is guarded against and should never \\\n                     happen.\"\n                );\n                return;\n            }\n        }\n        self.record_item(bytes.len() as u64);\n        self.lru_cache.put(key, StoredItem::new(bytes, now));\n    }\n}\n\n// actually, quick_cache is a Clock-PRO, not a S3-fifo contrary to what quick-cache and Moka's\n// readme says. 
While both are clearly distinct (one being clock-based, the other being fifo\n// based), they are not too dissimilar in terms of strengths/weaknesses.\npub struct S3Fifo<K: Hash + Eq, V: ValueLen> {\n    cache:\n        QuickCache<K, V, QuickCacheWeighter, quick_cache::DefaultHashBuilder, QuickCacheLifecycle>,\n    capacity: u64,\n    cache_metrics: SingleCacheMetrics,\n}\n\nimpl<K: Hash + Eq, V: ValueLen> Drop for S3Fifo<K, V> {\n    fn drop(&mut self) {\n        // we don't count this toward evicted entries, as we are clearing the whole cache\n        self.cache_metrics\n            .in_cache_count\n            .sub(self.cache.len() as i64);\n        self.cache_metrics\n            .in_cache_num_bytes\n            .sub(self.cache.weight() as i64);\n    }\n}\n\nstruct QuickCacheWeighter;\nimpl<K, V: ValueLen> quick_cache::Weighter<K, V> for QuickCacheWeighter {\n    fn weight(&self, _key: &K, value: &V) -> u64 {\n        value.len() as u64\n    }\n}\n\nstruct QuickCacheLifecycle;\n#[derive(Default)]\nstruct QuickCacheQueryEffect {\n    count: u64,\n    bytes: u64,\n}\nimpl<K, V: ValueLen> quick_cache::Lifecycle<K, V> for QuickCacheLifecycle {\n    type RequestState = QuickCacheQueryEffect;\n\n    fn begin_request(&self) -> Self::RequestState {\n        QuickCacheQueryEffect::default()\n    }\n    fn on_evict(&self, state: &mut Self::RequestState, _key: K, val: V) {\n        state.count += 1;\n        state.bytes += val.len() as u64;\n    }\n}\n\nimpl<K: Hash + Eq, V: ValueLen + Clone> S3Fifo<K, V> {\n    /// Creates a new `S3Fifo` cache with the given capacity.\n    fn with_capacity(capacity: u64, cache_metrics: SingleCacheMetrics) -> Self {\n        S3Fifo {\n            cache: QuickCache::with(\n                (capacity / (128 * 1024)) as usize,\n                capacity,\n                QuickCacheWeighter,\n                quick_cache::DefaultHashBuilder::new(),\n                QuickCacheLifecycle,\n            ),\n            capacity,\n            
cache_metrics,\n        }\n    }\n\n    pub fn get<Q>(&mut self, cache_key: &Q) -> Option<V>\n    where\n        K: Borrow<Q>,\n        Q: Hash + Eq + ?Sized,\n    {\n        let item_opt = self.cache.get(cache_key);\n        if let Some(item) = item_opt {\n            self.cache_metrics.hits_num_items.inc();\n            self.cache_metrics.hits_num_bytes.inc_by(item.len() as u64);\n            Some(item.clone())\n        } else {\n            self.cache_metrics.misses_num_items.inc();\n            None\n        }\n    }\n\n    /// Attempt to put the given amount of data in the cache.\n    /// This may fail silently if the owned_bytes slice is larger than the cache\n    /// capacity.\n    fn put(&mut self, key: K, value: V) {\n        if self.capacity < value.len() as u64 {\n            // The value does not fit in the cache. We simply don't store it.\n            if self.capacity != 0 {\n                warn!(\n                    capacity_in_bytes = ?self.capacity,\n                    len = value.len(),\n                    \"Downloaded a byte slice larger than the cache capacity.\"\n                );\n            }\n            return;\n        }\n\n        self.cache_metrics.in_cache_count.inc();\n        self.cache_metrics\n            .in_cache_num_bytes\n            .add(value.len() as i64);\n        let evicted = self.cache.insert_with_lifecycle(key, value);\n        self.cache_metrics.in_cache_count.sub(evicted.count as i64);\n        self.cache_metrics\n            .in_cache_num_bytes\n            .sub(evicted.bytes as i64);\n        self.cache_metrics.evict_num_items.inc_by(evicted.count);\n        self.cache_metrics.evict_num_bytes.inc_by(evicted.bytes);\n    }\n}\n\n// We don't make this value Clone to ensure each item is dropped only once\nstruct CapacityTracker<V: ValueLen> {\n    item: V,\n    cache_metrics: Weak<SingleCacheMetrics>,\n}\n\nimpl<V: ValueLen> Drop for CapacityTracker<V> {\n    fn drop(&mut self) {\n        if let Some(cache_metrics) 
= self.cache_metrics.upgrade() {\n            cache_metrics.in_cache_count.dec();\n            cache_metrics.in_cache_num_bytes.sub(self.item.len() as i64);\n            cache_metrics.evict_num_items.inc();\n            cache_metrics.evict_num_bytes.inc_by(self.item.len() as u64);\n        }\n    }\n}\n\npub struct TinyLfu<K: Hash + Eq, V: ValueLen> {\n    // this field is put first so it's dropped before the cache\n    // we use that to not count removed entries as \"evicted\", by\n    // calling CapacityTracker's Drop only after its Weak has expired.\n    cache_metrics: Arc<SingleCacheMetrics>,\n\n    // we store an Arc because moka does a lot of internal cloning, and it's hard to not double\n    // evict/forget to count eviction otherwise\n    cache: MokaCache<K, Arc<CapacityTracker<V>>>,\n    capacity: u64,\n}\n\nimpl<K: Hash + Eq, V: ValueLen> Drop for TinyLfu<K, V> {\n    fn drop(&mut self) {\n        // we don't count this toward evicted entries, as we are clearing the whole cache\n        self.cache_metrics\n            .in_cache_count\n            .sub(self.cache.entry_count() as i64);\n        self.cache_metrics\n            .in_cache_num_bytes\n            .sub(self.cache.weighted_size() as i64);\n    }\n}\n\nimpl<K: Hash + Eq + Send + Sync + 'static, V: ValueLen + Clone + Send + Sync + 'static>\n    TinyLfu<K, V>\n{\n    /// Creates a new NeedMutSliceCache with the given capacity.\n    fn with_capacity(capacity: u64, cache_metrics: SingleCacheMetrics) -> Self {\n        TinyLfu {\n            cache: MokaCache::builder()\n                .max_capacity(capacity)\n                .weigher(|_k, v: &Arc<CapacityTracker<V>>| {\n                    v.item.len().try_into().unwrap_or(u32::MAX)\n                })\n                .build(),\n            capacity,\n            cache_metrics: cache_metrics.into(),\n        }\n    }\n\n    pub fn get<Q>(&mut self, cache_key: &Q) -> Option<V>\n    where\n        Arc<K>: Borrow<Q>,\n        Q: Hash + Eq + ?Sized,\n    
{\n        let item_opt = self.cache.get(cache_key);\n        if let Some(item) = item_opt {\n            self.cache_metrics.hits_num_items.inc();\n            self.cache_metrics\n                .hits_num_bytes\n                .inc_by(item.item.len() as u64);\n            Some(item.item.clone())\n        } else {\n            self.cache_metrics.misses_num_items.inc();\n            None\n        }\n    }\n\n    /// Attempt to put the given amount of data in the cache.\n    /// This may fail silently if the owned_bytes slice is larger than the cache\n    /// capacity.\n    fn put(&mut self, key: K, value: V) {\n        if self.capacity < value.len() as u64 {\n            // The value does not fit in the cache. We simply don't store it.\n            if self.capacity != 0 {\n                warn!(\n                    capacity_in_bytes = ?self.capacity,\n                    len = value.len(),\n                    \"Downloaded a byte slice larger than the cache capacity.\"\n                );\n            }\n            return;\n        }\n\n        self.cache_metrics.in_cache_count.inc();\n        self.cache_metrics\n            .in_cache_num_bytes\n            .add(value.len() as i64);\n        self.cache.insert(\n            key,\n            CapacityTracker {\n                item: value,\n                cache_metrics: Arc::downgrade(&self.cache_metrics),\n            }\n            .into(),\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/byte_range_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::{Borrow, Cow};\nuse std::collections::BTreeMap;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::atomic::{AtomicU64, Ordering};\nuse std::sync::{Arc, Mutex};\n\nuse tantivy::directory::OwnedBytes;\n\nuse crate::metrics::{CacheMetrics, SingleCacheMetrics};\n\n#[derive(Clone, PartialOrd, Ord, PartialEq, Eq)]\nstruct CacheKey<'a, T: ToOwned + ?Sized> {\n    tag: Cow<'a, T>,\n    range_start: usize,\n}\n\nimpl<T: ToOwned + ?Sized> CacheKey<'static, T> {\n    fn from_owned(tag: T::Owned, range_start: usize) -> Self {\n        CacheKey {\n            tag: Cow::Owned(tag),\n            range_start,\n        }\n    }\n}\n\nimpl<'a, T: ToOwned + ?Sized> CacheKey<'a, T> {\n    fn from_borrowed(tag: &'a T, range_start: usize) -> Self {\n        CacheKey {\n            tag: Cow::Borrowed(tag),\n            range_start,\n        }\n    }\n}\n\nstruct CacheValue {\n    range_end: usize,\n    bytes: OwnedBytes,\n}\n\n/// T is a tag, usually a file path.\nstruct NeedMutByteRangeCache<T: 'static + ToOwned + ?Sized> {\n    cache: BTreeMap<CacheKey<'static, T>, CacheValue>,\n    // this is hardly significant as items can get merged if they overlap\n    num_items: u64,\n    num_bytes: u64,\n    cache_counters: &'static SingleCacheMetrics,\n}\n\nimpl<T: 'static + ToOwned + ?Sized + Ord> NeedMutByteRangeCache<T> {\n    fn 
with_infinite_capacity(cache_counters: &'static CacheMetrics) -> Self {\n        NeedMutByteRangeCache {\n            cache: BTreeMap::new(),\n            num_items: 0,\n            num_bytes: 0,\n            cache_counters: &cache_counters.cache_metrics,\n        }\n    }\n\n    fn get_slice(&mut self, tag: &T, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        if byte_range.start == byte_range.end {\n            return Some(OwnedBytes::empty());\n        }\n\n        let key = CacheKey::from_borrowed(tag, byte_range.start);\n        let (k, v) = if let Some((k, v)) = self.get_block(&key, byte_range.end) {\n            (k, v)\n        } else if let Some((k, v)) = self.merge_ranges(&key, byte_range.end) {\n            (k, v)\n        } else {\n            self.cache_counters.misses_num_items.inc();\n            return None;\n        };\n\n        let start = byte_range.start - k.range_start;\n        let end = byte_range.end - k.range_start;\n        let result = v.bytes.slice(start..end);\n\n        self.cache_counters.hits_num_items.inc();\n        self.cache_counters\n            .hits_num_bytes\n            .inc_by((end - start) as u64);\n\n        Some(result)\n    }\n\n    fn put_slice(&mut self, tag: T::Owned, byte_range: Range<usize>, bytes: OwnedBytes) {\n        let len = byte_range.end - byte_range.start;\n        assert_eq!(len, bytes.len());\n        if len == 0 {\n            return;\n        }\n\n        // try to find a block with which we overlap (and not just touch)\n        let start_key = CacheKey::from_borrowed(tag.borrow(), byte_range.start);\n        let first_matching_block = self\n            .get_block(&start_key, byte_range.start + 1)\n            .map(|(k, _v)| k);\n\n        let end_key = CacheKey::from_borrowed(tag.borrow(), byte_range.end - 1);\n        let last_matching_block = self.get_block(&end_key, byte_range.end).map(|(k, _v)| k);\n\n        if first_matching_block.is_some() && first_matching_block == last_matching_block 
{\n            // same start and end: all the range is already covered\n            return;\n        }\n\n        let first_matching_block = first_matching_block.unwrap_or(&start_key);\n        let last_matching_block = last_matching_block.unwrap_or(&end_key);\n\n        let overlapping: Vec<Range<usize>> = self\n            .cache\n            .range(first_matching_block..=last_matching_block)\n            .map(|(k, v)| k.range_start..v.range_end)\n            .collect();\n\n        let can_drop_first = overlapping\n            .first()\n            .map(|r| byte_range.start <= r.start)\n            .unwrap_or(true);\n\n        let can_drop_last = overlapping\n            .last()\n            .map(|r| byte_range.end >= r.end)\n            .unwrap_or(true);\n\n        let (final_range, final_bytes) = if can_drop_first && can_drop_last {\n            // if we are here, either there was no overlapping block, or there was, but this buffer\n            // covers entirely every block it overlapped with. 
There is no merging to do.\n            (byte_range, bytes)\n        } else {\n            // if we are here, we have to do some merging\n\n            // first find the final buffer start and end position.\n            let start = if can_drop_first {\n                byte_range.start\n            } else {\n                // if no first, can_drop_first is true\n                overlapping.first().unwrap().start\n            };\n            let end = if can_drop_last {\n                byte_range.end\n            } else {\n                // if no last, can_drop_last is true\n                overlapping.last().unwrap().end\n            };\n\n            let mut buffer = Vec::with_capacity(end - start);\n\n            // if this buffer overlaps, but does not contain the 1st buffer, copy the\n            // non-overlapping part at the start of the final buffer.\n            if !can_drop_first {\n                let first_range = overlapping.first().unwrap();\n                let key = CacheKey::from_borrowed(tag.borrow(), first_range.start);\n                let block = self.cache.get(&key).unwrap();\n\n                let len = first_range.end.min(byte_range.start) - first_range.start;\n                buffer.extend_from_slice(&block.bytes[..len]);\n            }\n\n            // copy the entire current buffer\n            buffer.extend_from_slice(&bytes);\n\n            // if this buffer overlaps, but does not contain the last buffer, copy the\n            // non-overlapping part at the end of the final buffer.\n            if !can_drop_last {\n                let last_range = overlapping.last().unwrap();\n                let key = CacheKey::from_borrowed(tag.borrow(), last_range.start);\n                let block = self.cache.get(&key).unwrap();\n\n                let start = last_range.start.max(byte_range.end) - last_range.start;\n                buffer.extend_from_slice(&block.bytes[start..]);\n            }\n\n            // sanity check, we copied as much as 
expected\n            debug_assert_eq!(end - start, buffer.len());\n\n            (start..end, OwnedBytes::new(buffer))\n        };\n\n        // not sure why, but the borrow check gets unhappy if I create a borrowed\n        // in the loop. It works with .get() instead of .remove() (?).\n        let mut key = CacheKey::from_owned(tag, 0);\n        for range in overlapping.into_iter() {\n            // remove every block with which we overlapped, including the 1st and last, as they\n            // were included as prefix/suffix to the final block.\n            key.range_start = range.start;\n            self.cache.remove(&key);\n            self.update_counter_drop_item(range.end - range.start);\n        }\n\n        // and finally insert the newly added buffer\n        key.range_start = final_range.start;\n        let value = CacheValue {\n            range_end: final_range.end,\n            bytes: final_bytes,\n        };\n        self.cache.insert(key, value);\n        self.update_counter_record_item(final_range.end - final_range.start);\n    }\n\n    // Return a block that contain everything between query.range_start and range_end\n    fn get_block<'a>(\n        &self,\n        query: &CacheKey<'a, T>,\n        range_end: usize,\n    ) -> Option<(&CacheKey<'a, T>, &CacheValue)> {\n        self.cache\n            .range(..=query)\n            .next_back()\n            .filter(|(k, v)| k.tag == query.tag && range_end <= v.range_end)\n    }\n\n    /// Try to merge all blocks in the given range. 
Fails if some bytes were not already stored.\n    fn merge_ranges<'a>(\n        &mut self,\n        start: &CacheKey<'a, T>,\n        range_end: usize,\n    ) -> Option<(&CacheKey<'a, T>, &CacheValue)> {\n        let own_key = |key: &CacheKey<T>| {\n            CacheKey::from_owned(T::borrow(&key.tag).to_owned(), key.range_start)\n        };\n\n        let first_block = self.get_block(start, start.range_start)?;\n\n        // query cache for all blocks which overlap with our query\n        let overlapping_blocks = self\n            .cache\n            .range(first_block.0..)\n            .take_while(|(k, _)| k.tag == start.tag && k.range_start <= range_end);\n\n        // verify there are no holes, and each range touches the next one. There can't be overlap\n        // due to how we fill our data-structure.\n        let mut last_block = first_block;\n        for (k, v) in overlapping_blocks.clone().skip(1) {\n            if k.range_start != last_block.1.range_end {\n                return None;\n            }\n\n            last_block = (k, v);\n        }\n        if last_block.1.range_end < range_end {\n            // we got a gap at the end\n            return None;\n        }\n\n        // we have everything we need. 
Merge every sub-buffer into a single large buffer.\n        let mut buffer = Vec::with_capacity(last_block.1.range_end - first_block.0.range_start);\n        let mut part_count = 0i64;\n        for (_, v) in overlapping_blocks {\n            part_count += 1;\n            buffer.extend_from_slice(&v.bytes);\n        }\n        assert_eq!(\n            buffer.len(),\n            (last_block.1.range_end - first_block.0.range_start)\n        );\n\n        let new_key = own_key(first_block.0);\n        let new_value = CacheValue {\n            range_end: last_block.1.range_end,\n            bytes: OwnedBytes::new(buffer),\n        };\n\n        // cleanup is sub-optimal, we'd need a BTreeMap::drain_range or something like that\n        let last_key = own_key(last_block.0);\n\n        // remove previous buffers from the cache\n        let blocks_to_remove: Vec<_> = self\n            .cache\n            .range(&new_key..=&last_key)\n            .map(|(k, _)| own_key(k))\n            .collect();\n        for block in blocks_to_remove {\n            self.cache.remove(&block);\n        }\n\n        // and insert the new merged buffer\n        self.cache.insert(new_key, new_value);\n\n        self.num_items -= (part_count - 1) as u64;\n        self.cache_counters.in_cache_count.sub(part_count - 1);\n\n        self.get_block(start, range_end)\n    }\n\n    fn update_counter_record_item(&mut self, num_bytes: usize) {\n        self.num_items += 1;\n        self.num_bytes += num_bytes as u64;\n        self.cache_counters.in_cache_count.inc();\n        self.cache_counters.in_cache_num_bytes.add(num_bytes as i64);\n    }\n\n    fn update_counter_drop_item(&mut self, num_bytes: usize) {\n        self.num_items -= 1;\n        self.num_bytes -= num_bytes as u64;\n        self.cache_counters.in_cache_count.dec();\n        self.cache_counters.in_cache_num_bytes.sub(num_bytes as i64);\n        self.cache_counters.evict_num_items.inc();\n        
self.cache_counters.evict_num_bytes.inc_by(num_bytes as u64);\n    }\n}\n\nimpl<T: 'static + ToOwned + ?Sized> Drop for NeedMutByteRangeCache<T> {\n    fn drop(&mut self) {\n        self.cache_counters\n            .in_cache_count\n            .sub(self.num_items as i64);\n        self.cache_counters\n            .in_cache_num_bytes\n            .sub(self.num_bytes as i64);\n    }\n}\n\n/// Cache for ranges of bytes in files.\n///\n/// This cache is used in the contraption that makes it possible for Quickwit\n/// to use tantivy while doing asynchronous IO.\n/// Quickwit manually populates this cache in an asynchronous \"warmup\" phase.\n/// tantivy then gets its data from this cache without performing any IO.\n///\n/// Contrary to `MemorySizedCache`, it's able to answer subsets of known ranges,\n/// does not have any eviction, and assumes an infinite capacity.\n///\n/// This cache assumes immutable data: if you put a new slice and it overlaps with\n/// cached data, the changes may or may not get recorded.\n///\n/// At the moment this is hardly a cache as it features no eviction policy.\n#[derive(Clone)]\npub struct ByteRangeCache {\n    inner_arc: Arc<Inner>,\n}\n\nstruct Inner {\n    num_stored_bytes: AtomicU64,\n    need_mut_byte_range_cache: Mutex<NeedMutByteRangeCache<Path>>,\n}\n\nimpl ByteRangeCache {\n    /// Creates a slice cache that never removes any entry.\n    pub fn with_infinite_capacity(cache_counters: &'static CacheMetrics) -> Self {\n        let need_mut_byte_range_cache =\n            NeedMutByteRangeCache::with_infinite_capacity(cache_counters);\n        let inner = Inner {\n            num_stored_bytes: AtomicU64::default(),\n            need_mut_byte_range_cache: Mutex::new(need_mut_byte_range_cache),\n        };\n        ByteRangeCache {\n            inner_arc: Arc::new(inner),\n        }\n    }\n\n    /// Overall amount of bytes stored in the cache.\n    pub fn get_num_bytes(&self) -> u64 {\n        
self.inner_arc.num_stored_bytes.load(Ordering::Relaxed)\n    }\n\n    /// If available, returns the cached view of the slice.\n    pub fn get_slice(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        self.inner_arc\n            .need_mut_byte_range_cache\n            .lock()\n            .unwrap()\n            .get_slice(path, byte_range)\n    }\n\n    /// Put the given amount of data in the cache.\n    pub fn put_slice(&self, path: PathBuf, byte_range: Range<usize>, bytes: OwnedBytes) {\n        let mut need_mut_byte_range_cache_locked =\n            self.inner_arc.need_mut_byte_range_cache.lock().unwrap();\n        need_mut_byte_range_cache_locked.put_slice(path, byte_range, bytes);\n        let num_bytes = need_mut_byte_range_cache_locked.num_bytes;\n        drop(need_mut_byte_range_cache_locked);\n        self.inner_arc\n            .num_stored_bytes\n            .store(num_bytes, Ordering::Relaxed);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashMap;\n    use std::ops::Range;\n    use std::path::Path;\n\n    use once_cell::sync::Lazy;\n    use proptest::prelude::*;\n\n    use super::ByteRangeCache;\n    use crate::OwnedBytes;\n    use crate::metrics::{CACHE_METRICS_FOR_TESTS, CacheMetrics};\n\n    #[derive(Debug)]\n    enum Operation {\n        Insert {\n            range: Range<usize>,\n            tag: &'static str,\n        },\n        Get {\n            range: Range<usize>,\n            tag: &'static str,\n        },\n    }\n\n    fn tag_strategy() -> impl Strategy<Value = &'static str> {\n        prop_oneof![Just(\"path1\"), Just(\"path2\"),]\n    }\n\n    #[allow(deprecated)]\n    fn range_strategy() -> impl Strategy<Value = Range<usize>> {\n        (0usize..11usize).prop_perturb(|start, mut rng| start..rng.gen_range(start..12usize))\n    }\n\n    fn op_strategy() -> impl Strategy<Value = Operation> {\n        prop_oneof![\n            (tag_strategy(), range_strategy())\n                .prop_map(|(tag, 
range)| Operation::Insert { range, tag }),\n            (tag_strategy(), range_strategy())\n                .prop_map(|(tag, range)| Operation::Get { range, tag }),\n        ]\n    }\n\n    fn ops_strategy() -> impl Strategy<Value = Vec<Operation>> {\n        prop::collection::vec(op_strategy(), 1..100)\n    }\n\n    proptest::proptest! {\n        #[test]\n        fn test_proptest_byte_range_cache(ops in ops_strategy()) {\n            let mut state: HashMap<&'static str, Vec<bool>> = HashMap::new();\n            state.insert(\"path1\", vec![false; 12]);\n            state.insert(\"path2\", vec![false; 12]);\n\n            let cache = ByteRangeCache::with_infinite_capacity(&CACHE_METRICS_FOR_TESTS);\n\n            for op in ops {\n                match op {\n                    Operation::Insert {\n                        range,\n                        tag,\n                    } => {\n                        state.get_mut(tag).unwrap()\n                            [range.clone()].fill(true);\n                        let bytes = range.clone().map(|i| (i%256) as u8).collect::<Vec<_>>();\n                        cache.put_slice(tag.into(), range, OwnedBytes::new(bytes));\n\n\n                        let expected_item_count: usize = state.values()\n                            .map(|tagged_state| {\n                                count_items(tagged_state)\n                            })\n                            .sum();\n                        // in some case we have ranges touching each other, count_items count them\n                        // as only one, but cache count them as 2.\n                        assert!(cache.inner_arc.need_mut_byte_range_cache.lock().unwrap().num_items >= expected_item_count as u64);\n\n                        let expected_byte_count = state.values()\n                            .flatten()\n                            .filter(|stored| **stored)\n                            .count();\n                        
assert_eq!(cache.inner_arc.need_mut_byte_range_cache.lock().unwrap().num_bytes, expected_byte_count as u64);\n                    }\n                    Operation::Get {\n                        range,\n                        tag,\n                    } => {\n                        let slice = cache.get_slice(Path::new(tag), range.clone());\n                        if state[tag][range.clone()].iter().all(|t| *t) {\n                            let slice = slice.unwrap();\n                            let bytes = range.clone().map(|i| (i%256) as u8).collect::<Vec<_>>();\n                            assert_eq!(slice[..], bytes[..]);\n\n                        } else {\n                            assert!(slice.is_none());\n                        }\n                    },\n                }\n            }\n        }\n    }\n\n    fn count_items(state: &[bool]) -> usize {\n        state\n            .iter()\n            .fold((false, 0), |(last_val, count), next| {\n                if *next && !last_val {\n                    (*next, count + 1)\n                } else {\n                    (*next, count)\n                }\n            })\n            .1\n    }\n\n    #[test]\n    fn test_byte_range_cache_doesnt_merge_unnecessarily() {\n        // we need to get a 'static ref to metrics, and want a dedicated metrics because we assert\n        // on it\n        static METRICS: Lazy<CacheMetrics> =\n            Lazy::new(|| CacheMetrics::for_component(\"byterange_cache_test\"));\n\n        let cache = ByteRangeCache::with_infinite_capacity(&METRICS);\n\n        let key: std::path::PathBuf = \"key\".into();\n\n        cache.put_slice(\n            key.clone(),\n            0..5,\n            OwnedBytes::new((0..5).collect::<Vec<_>>()),\n        );\n        cache.put_slice(\n            key.clone(),\n            5..10,\n            OwnedBytes::new((5..10).collect::<Vec<_>>()),\n        );\n        cache.put_slice(\n            key.clone(),\n            10..15,\n          
  OwnedBytes::new((10..15).collect::<Vec<_>>()),\n        );\n        cache.put_slice(\n            key.clone(),\n            15..20,\n            OwnedBytes::new((15..20).collect::<Vec<_>>()),\n        );\n\n        {\n            let mutable_cache = cache.inner_arc.need_mut_byte_range_cache.lock().unwrap();\n            assert_eq!(mutable_cache.cache.len(), 4);\n            assert_eq!(mutable_cache.num_items, 4);\n            assert_eq!(mutable_cache.cache_counters.in_cache_count.get(), 4);\n            assert_eq!(mutable_cache.num_bytes, 20);\n            assert_eq!(mutable_cache.cache_counters.in_cache_num_bytes.get(), 20);\n        }\n\n        cache.get_slice(&key, 3..12).unwrap();\n\n        {\n            // now they should've been merged, except the last one\n            let mutable_cache = cache.inner_arc.need_mut_byte_range_cache.lock().unwrap();\n            assert_eq!(mutable_cache.cache.len(), 2);\n            assert_eq!(mutable_cache.num_items, 2);\n            assert_eq!(mutable_cache.cache_counters.in_cache_count.get(), 2);\n            assert_eq!(mutable_cache.num_bytes, 20);\n            assert_eq!(mutable_cache.cache_counters.in_cache_num_bytes.get(), 20);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/memory_sized_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Borrow;\nuse std::hash::Hash;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::{Arc, Mutex};\n\nuse quickwit_config::CacheConfig;\n\nuse crate::OwnedBytes;\nuse crate::cache::base_cache::{AnyCache, FakeCacheEntry};\nuse crate::cache::slice_address::{SliceAddress, SliceAddressKey, SliceAddressRef};\nuse crate::metrics::CacheMetrics;\n\nstruct CacheState<K: Hash + Eq> {\n    cache: AnyCache<K>,\n    virtual_caches: Vec<AnyCache<K, FakeCacheEntry>>,\n}\n\nimpl<K: Hash + Eq + Clone + Send + Sync + 'static> CacheState<K> {\n    fn from_config(cache_config: &CacheConfig, cache_counters: &'static CacheMetrics) -> Self {\n        let cache = AnyCache::from_policy_and_capacity(\n            cache_config.policy(),\n            cache_config.capacity(),\n            cache_counters.cache_metrics.clone(),\n        );\n        let virtual_caches = cache_config\n            .virtual_caches\n            .iter()\n            .cloned()\n            .map(|mut virtual_cache_config| {\n                AnyCache::from_policy_and_capacity(\n                    virtual_cache_config.policy_for_virtual_cache(cache_config.policy()),\n                    virtual_cache_config.capacity_for_virtual_cache(cache_config.capacity()),\n                    cache_counters.virtual_cache(&virtual_cache_config),\n                )\n            })\n 
           .collect();\n        CacheState {\n            cache,\n            virtual_caches,\n        }\n    }\n\n    fn infinite(cache_counters: &'static CacheMetrics) -> Self {\n        CacheState {\n            cache: AnyCache::unbounded(cache_counters.cache_metrics.clone()),\n            // there is no point in having virtual caches for an unbounded cache\n            virtual_caches: Vec::new(),\n        }\n    }\n\n    pub fn get<Q>(&mut self, cache_key: &Q) -> Option<OwnedBytes>\n    where\n        K: Borrow<Q>,\n        Arc<K>: Borrow<Q>,\n        Q: Hash + Eq + ?Sized,\n    {\n        for virtual_cache in &mut self.virtual_caches {\n            // we simulate an access on all virtual caches\n            virtual_cache.get(cache_key);\n        }\n        self.cache.get(cache_key)\n    }\n\n    fn put(&mut self, key: K, bytes: OwnedBytes) {\n        for virtual_cache in &mut self.virtual_caches {\n            // we simulate an access on all virtual caches\n            virtual_cache.put(key.clone(), FakeCacheEntry(bytes.len()));\n        }\n\n        self.cache.put(key, bytes)\n    }\n}\n\n/// A simple in-memory slice cache.\npub struct MemorySizedCache<K: Hash + Eq = SliceAddress> {\n    inner: Mutex<CacheState<K>>,\n}\n\nimpl<K: Hash + Eq + Clone + Send + Sync + 'static> MemorySizedCache<K> {\n    /// Creates a slice cache with the given capacity.\n    pub fn from_config(cache_config: &CacheConfig, cache_counters: &'static CacheMetrics) -> Self {\n        MemorySizedCache {\n            inner: Mutex::new(CacheState::from_config(cache_config, cache_counters)),\n        }\n    }\n\n    /// Creates a slice cache that never removes any entry.\n    pub fn with_infinite_capacity(cache_counters: &'static CacheMetrics) -> Self {\n        MemorySizedCache {\n            inner: Mutex::new(CacheState::infinite(cache_counters)),\n        }\n    }\n\n    /// If available, returns the cached view of the slice.\n    pub fn get<Q>(&self, cache_key: &Q) -> 
Option<OwnedBytes>\n    where\n        K: Borrow<Q>,\n        Arc<K>: Borrow<Q>,\n        Q: Hash + Eq + ?Sized,\n    {\n        self.inner.lock().unwrap().get(cache_key)\n    }\n\n    /// Attempt to put the given amount of data in the cache.\n    /// This may fail silently if the owned_bytes slice is larger than the cache\n    /// capacity.\n    pub fn put(&self, val: K, bytes: OwnedBytes) {\n        self.inner.lock().unwrap().put(val, bytes);\n    }\n}\n\nimpl MemorySizedCache<SliceAddress> {\n    /// If available, returns the cached view of the slice.\n    pub fn get_slice(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        let slice_address_ref = SliceAddressRef { path, byte_range };\n        self.get(&slice_address_ref as &dyn SliceAddressKey)\n    }\n\n    /// Attempt to put the given amount of data in the cache.\n    /// This may fail silently if the owned_bytes slice is larger than the cache\n    /// capacity.\n    pub fn put_slice(&self, path: PathBuf, byte_range: Range<usize>, bytes: OwnedBytes) {\n        let slice_address = SliceAddress { path, byte_range };\n        self.put(slice_address, bytes);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use bytesize::ByteSize;\n\n    use super::*;\n    use crate::cache::base_cache::LRU_MIN_TIME_SINCE_LAST_ACCESS;\n    use crate::metrics::CACHE_METRICS_FOR_TESTS;\n\n    #[tokio::test]\n    async fn test_cache_edge_condition() {\n        tokio::time::pause();\n        let cache = MemorySizedCache::<String>::from_config(\n            &ByteSize::b(5).into(),\n            &CACHE_METRICS_FOR_TESTS,\n        );\n        {\n            let data = OwnedBytes::new(&b\"abc\"[..]);\n            cache.put(\"3\".to_string(), data);\n            assert_eq!(cache.get(&\"3\".to_string()).unwrap(), &b\"abc\"[..]);\n        }\n        {\n            let data = OwnedBytes::new(&b\"de\"[..]);\n            cache.put(\"2\".to_string(), data);\n            // our first entry should still be here.\n            
assert_eq!(cache.get(&\"3\".to_string()).unwrap(), &b\"abc\"[..]);\n            assert_eq!(cache.get(&\"2\".to_string()).unwrap(), &b\"de\"[..]);\n        }\n        {\n            let data = OwnedBytes::new(&b\"fghij\"[..]);\n            cache.put(\"5\".to_string(), data);\n            // Eviction should not happen, because all items in cache are too young.\n            assert!(cache.get(&\"5\".to_string()).is_none());\n        }\n        tokio::time::advance(LRU_MIN_TIME_SINCE_LAST_ACCESS.mul_f32(1.1f32)).await;\n        {\n            let data = OwnedBytes::new(&b\"fghij\"[..]);\n            cache.put(\"5\".to_string(), data);\n            assert_eq!(cache.get(&\"5\".to_string()).unwrap(), &b\"fghij\"[..]);\n            // our first two entries should have been removed from the cache\n            assert!(cache.get(&\"2\".to_string()).is_none());\n            assert!(cache.get(&\"3\".to_string()).is_none());\n        }\n        tokio::time::advance(LRU_MIN_TIME_SINCE_LAST_ACCESS.mul_f32(1.1f32)).await;\n        {\n            let data = OwnedBytes::new(&b\"klmnop\"[..]);\n            cache.put(\"6\".to_string(), data);\n            // The entry put should have been dismissed as it is too large for the cache\n            assert!(cache.get(&\"6\".to_string()).is_none());\n            // The previous entry should, however, remain.\n            assert_eq!(cache.get(&\"5\".to_string()).unwrap(), &b\"fghij\"[..]);\n        }\n    }\n\n    #[test]\n    fn test_cache_edge_unlimited_capacity() {\n        let cache = MemorySizedCache::with_infinite_capacity(&CACHE_METRICS_FOR_TESTS);\n        {\n            let data = OwnedBytes::new(&b\"abc\"[..]);\n            cache.put(\"3\".to_string(), data);\n            assert_eq!(cache.get(&\"3\".to_string()).unwrap(), &b\"abc\"[..]);\n        }\n        {\n            let data = OwnedBytes::new(&b\"de\"[..]);\n            cache.put(\"2\".to_string(), data);\n            assert_eq!(cache.get(&\"3\".to_string()).unwrap(), 
&b\"abc\"[..]);\n            assert_eq!(cache.get(&\"2\".to_string()).unwrap(), &b\"de\"[..]);\n        }\n    }\n\n    #[test]\n    fn test_cache() {\n        let cache =\n            MemorySizedCache::from_config(&ByteSize::kb(10).into(), &CACHE_METRICS_FOR_TESTS);\n        assert!(cache.get(&\"hello.seg\").is_none());\n        let data = OwnedBytes::new(&b\"werwer\"[..]);\n        cache.put(\"hello.seg\", data);\n        assert_eq!(cache.get(&\"hello.seg\").unwrap(), &b\"werwer\"[..]);\n    }\n\n    #[test]\n    fn test_cache_no_cache() {\n        let cache =\n            MemorySizedCache::from_config(&CacheConfig::no_cache(), &CACHE_METRICS_FOR_TESTS);\n        assert!(cache.get(&\"hello.seg\").is_none());\n        let data = OwnedBytes::new(&b\"werwer\"[..]);\n        cache.put(\"hello.seg\", data);\n        assert!(cache.get(&\"hello.seg\").is_none());\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod base_cache;\nmod byte_range_cache;\nmod memory_sized_cache;\nmod quickwit_cache;\nmod slice_address;\nmod storage_with_cache;\nmod stored_item;\n\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\npub use quickwit_cache::QuickwitCache;\npub use storage_with_cache::StorageWithCache;\n\npub use self::byte_range_cache::ByteRangeCache;\npub use self::memory_sized_cache::MemorySizedCache;\nuse crate::{OwnedBytes, Storage};\n\n/// Wraps the given directory with a slice cache that is actually global\n/// to quickwit.\n///\n/// FIXME The current approach is quite horrible in that:\n/// - it uses a global\n/// - it relies on the idea that all of the files we attempt to cache have universally unique names.\n///   It happens to be true today, but this might be very error prone in the future.\npub fn wrap_storage_with_cache(\n    long_term_cache: Arc<dyn StorageCache>,\n    storage: Arc<dyn Storage>,\n) -> Arc<dyn Storage> {\n    Arc::new(StorageWithCache {\n        storage,\n        cache: long_term_cache,\n    })\n}\n\n/// The `StorageCache` trait is the abstraction used to describe the caching logic\n/// used in front of a storage. 
See `StorageWithCache`.\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait]\npub trait StorageCache: Send + Sync + 'static {\n    /// Try to get a slice from the cache.\n    async fn get(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes>;\n    /// Try to get the entire file.\n    async fn get_all(&self, path: &Path) -> Option<OwnedBytes>;\n    /// Put a slice of data into the cache.\n    async fn put(&self, path: PathBuf, byte_range: Range<usize>, bytes: OwnedBytes);\n    /// Put an entire file into the cache.\n    async fn put_all(&self, path: PathBuf, bytes: OwnedBytes);\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/quickwit_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_config::CacheConfig;\n\nuse crate::OwnedBytes;\nuse crate::cache::{MemorySizedCache, StorageCache};\nuse crate::metrics::CacheMetrics;\n\nconst FULL_SLICE: Range<usize> = 0..usize::MAX;\n\n/// Quickwit storage cache with a size limit.\n/// It is currently used to cache only fast field data.\npub struct QuickwitCache {\n    router: Vec<(&'static str, Arc<dyn StorageCache>)>,\n}\n\nimpl From<Vec<(&'static str, Arc<dyn StorageCache>)>> for QuickwitCache {\n    fn from(router: Vec<(&'static str, Arc<dyn StorageCache>)>) -> Self {\n        QuickwitCache { router }\n    }\n}\n\nimpl QuickwitCache {\n    /// Creates a [`QuickwitCache`] with a cache on fast fields.\n    pub fn new(cache_config: &CacheConfig) -> Self {\n        let mut quickwit_cache = QuickwitCache::empty();\n        let fast_field_cache_counters: &'static CacheMetrics =\n            &crate::STORAGE_METRICS.fast_field_cache;\n        quickwit_cache.add_route(\n            \".fast\",\n            Arc::new(SimpleCache::from_config(\n                cache_config,\n                fast_field_cache_counters,\n            )),\n        );\n        quickwit_cache\n    }\n\n    /// Creates an empty cache with no routes.\n    pub fn empty() -> QuickwitCache {\n        
QuickwitCache::from(Vec::new())\n    }\n\n    /// Adds a caching route defined by a path suffix. All elements with a path matching\n    /// this suffix will be cached.\n    pub fn add_route(&mut self, path_suffix: &'static str, route_cache: Arc<dyn StorageCache>) {\n        self.router.push((path_suffix, route_cache));\n    }\n\n    fn get_relevant_cache(&self, path: &Path) -> Option<&dyn StorageCache> {\n        for (suffix, cache) in &self.router {\n            if path.to_string_lossy().ends_with(suffix) {\n                return Some(cache.as_ref());\n            }\n        }\n        None\n    }\n}\n\n#[async_trait]\nimpl StorageCache for QuickwitCache {\n    async fn get(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        // We don't check for the presence of the entire file in the\n        // cache.\n        // That's voluntary to avoid messing with the cache miss counts.\n        if let Some(cache) = self.get_relevant_cache(path) {\n            return cache.get(path, byte_range).await;\n        }\n        None\n    }\n\n    async fn get_all(&self, path: &Path) -> Option<OwnedBytes> {\n        if let Some(cache) = self.get_relevant_cache(path) {\n            return cache.get_all(path).await;\n        }\n        None\n    }\n\n    async fn put(&self, path: PathBuf, byte_range: Range<usize>, bytes: OwnedBytes) {\n        if let Some(cache) = self.get_relevant_cache(&path) {\n            cache.put(path, byte_range, bytes).await;\n        }\n    }\n\n    async fn put_all(&self, path: PathBuf, bytes: OwnedBytes) {\n        if let Some(cache) = self.get_relevant_cache(&path) {\n            cache.put(path, FULL_SLICE, bytes).await;\n        }\n    }\n}\n\n/// The Quickwit cache logic is very simple for the moment.\n///\n/// It stores hotcache files using an LRU cache.\n///\n/// HACK! 
We use `0..usize::MAX` to signify the \"entire file\".\n/// TODO fixme\nstruct SimpleCache {\n    slice_cache: MemorySizedCache,\n}\n\nimpl SimpleCache {\n    fn from_config(cache_config: &CacheConfig, cache_counters: &'static CacheMetrics) -> Self {\n        SimpleCache {\n            slice_cache: MemorySizedCache::from_config(cache_config, cache_counters),\n        }\n    }\n}\n\n#[async_trait]\nimpl StorageCache for SimpleCache {\n    async fn get(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        if let Some(bytes) = self.slice_cache.get_slice(path, byte_range) {\n            return Some(bytes);\n        }\n        None\n    }\n\n    async fn put(&self, path: PathBuf, byte_range: Range<usize>, bytes: OwnedBytes) {\n        self.slice_cache.put_slice(path, byte_range, bytes);\n    }\n\n    async fn get_all(&self, path: &Path) -> Option<OwnedBytes> {\n        self.slice_cache.get_slice(path, FULL_SLICE)\n    }\n\n    async fn put_all(&self, path: PathBuf, bytes: OwnedBytes) {\n        self.slice_cache.put_slice(path, FULL_SLICE.clone(), bytes);\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::path::Path;\n    use std::sync::Arc;\n\n    use super::QuickwitCache;\n    use crate::cache::StorageCache;\n    use crate::{MockStorageCache, OwnedBytes};\n\n    #[tokio::test]\n    async fn test_quickwit_cache_get_all() {\n        let mock_cache_hotcache = MockStorageCache::default();\n        let mut mock_cache_fast = MockStorageCache::default();\n        mock_cache_fast\n            .expect_get_all()\n            .times(1)\n            .withf(|path| path == Path::new(\"bubu/toto.fast\"))\n            .returning(|_| Some(OwnedBytes::new(&b\"aaaa\"[..])));\n        let mut quickwit_cache = QuickwitCache::empty();\n        quickwit_cache.add_route(\"hotcache\", Arc::new(mock_cache_hotcache));\n        quickwit_cache.add_route(\"fast\", Arc::new(mock_cache_fast));\n        quickwit_cache.get_all(Path::new(\"bubu/toto.fast\")).await;\n    }\n\n  
  #[tokio::test]\n    async fn test_quickwit_cache_get() {\n        let mock_cache_hotcache = MockStorageCache::default();\n        let mut mock_cache = MockStorageCache::default();\n        mock_cache\n            .expect_get()\n            .times(1)\n            .withf(|path, _| path == Path::new(\"bubu/toto.fast\"))\n            .returning(|_, _| Some(OwnedBytes::new(&b\"aaaaa\"[..])));\n        let mut quickwit_cache = QuickwitCache::empty();\n        quickwit_cache.add_route(\"hotcache\", Arc::new(mock_cache_hotcache));\n        quickwit_cache.add_route(\"fast\", Arc::new(mock_cache));\n        quickwit_cache.get(Path::new(\"bubu/toto.fast\"), 5..10).await;\n    }\n\n    #[tokio::test]\n    async fn test_quickwit_cache_priority() {\n        let mut mock_cache_ast = MockStorageCache::default();\n        mock_cache_ast\n            .expect_get()\n            .times(1)\n            .withf(|path, _| path == Path::new(\"bubu/toto.fast\"))\n            .returning(|_, _| Some(OwnedBytes::new(&b\"aaaaa\"[..])));\n        let mock_cache_fast = MockStorageCache::default();\n        let mut quickwit_cache = QuickwitCache::empty();\n        quickwit_cache.add_route(\"ast\", Arc::new(mock_cache_ast));\n        quickwit_cache.add_route(\"fast\", Arc::new(mock_cache_fast));\n        assert_eq!(\n            quickwit_cache\n                .get(Path::new(\"bubu/toto.fast\"), 5..10)\n                .await\n                .unwrap(),\n            &b\"aaaaa\"[..]\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/slice_address.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::borrow::Borrow;\nuse std::hash::{Hash, Hasher};\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\n\n#[derive(Hash, Clone, Debug, Eq, PartialEq)]\npub struct SliceAddress {\n    pub path: PathBuf,\n    pub byte_range: Range<usize>,\n}\n\n// ------------------------------------------------------------\n// The following struct exists to make it possible to\n// fetch a slice from a cache without cloning PathBuf.\n\n// The trick is described in https://github.com/sunshowers-code/borrow-complex-key-example/blob/main/src/lib.rs\n\n#[derive(Clone, Debug, Eq, Hash, PartialEq)]\npub(crate) struct SliceAddressRef<'a> {\n    pub path: &'a Path,\n    pub byte_range: Range<usize>,\n}\n\npub(crate) trait SliceAddressKey {\n    fn key(&self) -> SliceAddressRef<'_>;\n}\n\nimpl SliceAddressKey for SliceAddress {\n    fn key(&self) -> SliceAddressRef<'_> {\n        SliceAddressRef {\n            path: self.path.as_path(),\n            byte_range: self.byte_range.clone(),\n        }\n    }\n}\n\nimpl SliceAddressKey for SliceAddressRef<'_> {\n    fn key(&self) -> SliceAddressRef<'_> {\n        self.clone()\n    }\n}\n\nimpl<'a> Borrow<dyn SliceAddressKey + 'a> for std::sync::Arc<SliceAddress> {\n    fn borrow(&self) -> &(dyn SliceAddressKey + 'a) {\n        &**self\n    }\n}\n\nimpl<'a> Borrow<dyn SliceAddressKey + 'a> for SliceAddress {\n    fn 
borrow(&self) -> &(dyn SliceAddressKey + 'a) {\n        self\n    }\n}\nimpl PartialEq for dyn SliceAddressKey + '_ {\n    fn eq(&self, other: &Self) -> bool {\n        self.key().eq(&other.key())\n    }\n}\n\nimpl Eq for dyn SliceAddressKey + '_ {}\n\nimpl Hash for dyn SliceAddressKey + '_ {\n    fn hash<H: Hasher>(&self, state: &mut H) {\n        self.key().hash(state)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/storage_with_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::ops::Range;\nuse std::path::Path;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse tokio::io::AsyncRead;\n\nuse crate::cache::StorageCache;\nuse crate::storage::SendableAsync;\nuse crate::{BulkDeleteError, OwnedBytes, Storage, StorageResult};\n\n/// Use with care, StorageWithCache is read-only.\npub struct StorageWithCache {\n    pub storage: Arc<dyn Storage>,\n    pub cache: Arc<dyn StorageCache>,\n}\n\nimpl fmt::Debug for StorageWithCache {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"StorageWithCache\").finish()\n    }\n}\n\n#[async_trait]\nimpl Storage for StorageWithCache {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.storage.check_connectivity().await\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        _payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        unimplemented!(\"StorageWithCache is readonly. 
Failed to put {:?}\", path)\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        self.storage.copy_to(path, output).await\n    }\n\n    async fn get_slice(&self, path: &Path, byte_range: Range<usize>) -> StorageResult<OwnedBytes> {\n        if let Some(bytes) = self.cache.get(path, byte_range.clone()).await {\n            Ok(bytes)\n        } else {\n            let bytes = self.storage.get_slice(path, byte_range.clone()).await?;\n            self.cache\n                .put(path.to_owned(), byte_range, bytes.clone())\n                .await;\n            Ok(bytes)\n        }\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        _range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        unimplemented!(\n            \"StorageWithCache does not support streamed read yet. Failed to get {:?}\",\n            path\n        )\n    }\n\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        if let Some(bytes) = self.cache.get_all(path).await {\n            Ok(bytes)\n        } else {\n            let bytes = self.storage.get_all(path).await?;\n            self.cache.put_all(path.to_owned(), bytes.clone()).await;\n            Ok(bytes)\n        }\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        unimplemented!(\"Failed to delete file `{path:?}`. `StorageWithCache` is read-only.\")\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        unimplemented!(\"Failed to delete files `{paths:?}`. 
`StorageWithCache` is read-only.\")\n    }\n\n    async fn exists(&self, path: &Path) -> StorageResult<bool> {\n        self.storage.exists(path).await\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        self.storage.file_num_bytes(path).await\n    }\n\n    fn uri(&self) -> &Uri {\n        self.storage.uri()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::collections::HashMap;\n    use std::path::PathBuf;\n    use std::sync::Mutex;\n\n    use super::*;\n    use crate::{MockStorage, MockStorageCache, OwnedBytes};\n\n    #[tokio::test]\n    async fn put_in_cache_test() {\n        let mut mock_storage = MockStorage::default();\n        let mut mock_cache = MockStorageCache::default();\n        let actual_cache: Arc<Mutex<HashMap<PathBuf, OwnedBytes>>> =\n            Arc::new(Mutex::new(HashMap::new()));\n\n        let cache1 = actual_cache.clone();\n        mock_cache\n            .expect_get_all()\n            .times(2)\n            .returning(move |path| cache1.lock().unwrap().get(path).cloned());\n        mock_cache\n            .expect_put_all()\n            .times(1)\n            .returning(move |path, data| {\n                let actual_cache = actual_cache.clone();\n                actual_cache.lock().unwrap().insert(path, data);\n            });\n\n        mock_storage\n            .expect_get_all()\n            .times(1)\n            .returning(|_path| Ok(OwnedBytes::new(vec![1, 2, 3])));\n\n        let storage_with_cache = StorageWithCache {\n            storage: Arc::new(mock_storage),\n            cache: Arc::new(mock_cache),\n        };\n\n        let data1 = storage_with_cache\n            .get_all(Path::new(\"cool_file\"))\n            .await\n            .unwrap();\n        // hitting the cache\n        let data2 = storage_with_cache\n            .get_all(Path::new(\"cool_file\"))\n            .await\n            .unwrap();\n        assert_eq!(data1, data2);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/cache/stored_item.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse tantivy::directory::OwnedBytes;\nuse tokio::time::Instant;\n\n/// It is a bit overkill to put this in its own module, but I\n/// wanted to ensure that no one would access payload without updating `last_access_time`.\npub(super) struct StoredItem<V = OwnedBytes> {\n    last_access_time: Instant,\n    payload: V,\n}\n\nimpl<V> StoredItem<V> {\n    pub fn new(payload: V, now: Instant) -> Self {\n        StoredItem {\n            last_access_time: now,\n            payload,\n        }\n    }\n}\n\nimpl<V: ValueLen + Clone> StoredItem<V> {\n    pub fn payload(&mut self) -> V {\n        self.last_access_time = Instant::now();\n        self.payload.clone()\n    }\n\n    pub fn len(&self) -> usize {\n        self.payload.len()\n    }\n\n    pub fn last_access_time(&self) -> Instant {\n        self.last_access_time\n    }\n}\n\npub(crate) trait ValueLen {\n    fn len(&self) -> usize;\n}\n\nimpl ValueLen for OwnedBytes {\n    fn len(&self) -> usize {\n        OwnedBytes::len(self)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/debouncer.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::hash::Hash;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::{Arc, Mutex};\n\nuse async_trait::async_trait;\nuse fnv::FnvHashMap;\nuse futures::future::{BoxFuture, WeakShared};\nuse futures::{Future, FutureExt};\nuse quickwit_common::uri::Uri;\nuse tantivy::directory::OwnedBytes;\nuse tokio::io::AsyncRead;\n\nuse crate::storage::SendableAsync;\nuse crate::{BulkDeleteError, Storage, StorageResult};\n\n/// The AsyncDebouncer debounces inflight Futures, so that concurrent async request to the same data\n/// source can be deduplicated.\n///\n/// Since we pass the Future potentially to multiple consumer, everything needs to be cloneable. The\n/// data and the future. 
This is reflected on the generic type bounds for the value V: Clone.\n///\n/// Since most Futures return a Result<V, Error>, this also encompasses the error.\npub struct AsyncDebouncer<K, V: Clone> {\n    cache: Mutex<FnvHashMap<K, WeakShared<BoxFuture<'static, V>>>>,\n}\n\nimpl<K, V: Clone> Default for AsyncDebouncer<K, V> {\n    fn default() -> Self {\n        Self {\n            cache: Default::default(),\n        }\n    }\n}\n\nimpl<K: Hash + Eq + Clone, V: Clone> AsyncDebouncer<K, V> {\n    /// Returns the number of inflight futures.\n    pub fn len(&self) -> usize {\n        self.cache.lock().unwrap().len()\n    }\n\n    /// Removes the entries whose shared future has already been dropped, i.e. whose weak handle\n    /// can no longer be upgraded.\n    fn cleanup(&self) {\n        let mut guard = self.cache.lock().unwrap();\n        guard.retain(|_, v| v.upgrade().is_some());\n    }\n\n    /// Instead of the future directly, a constructor to build the future is passed.\n    /// In case there is already an existing Future for the passed key, the constructor is not\n    /// used.\n    pub async fn get_or_create<T, F>(&self, key: K, build_a_future: T) -> V\n    where\n        T: FnOnce() -> F,\n        F: Future<Output = V> + Send + 'static,\n    {\n        self.cleanup();\n\n        // explicit scope to drop the lock\n        let weak_fut_opt = { self.cache.lock().unwrap().get(&key).cloned() };\n        if let Some(weak_future) = weak_fut_opt\n            && let Some(future) = weak_future.upgrade()\n        {\n            return future.await;\n        }\n\n        let fut = Box::pin(build_a_future()) as BoxFuture<'static, V>;\n        let fut = fut.shared();\n        self.cache.lock().unwrap().insert(\n            key.clone(),\n            fut.clone().downgrade().expect(\n                \"future has been dropped, but that shouldn't happen since it's still in scope\",\n            ),\n        );\n        let res = fut.await;\n\n        
self.cache.lock().unwrap().remove(&key);\n\n        res\n    }\n}\n\ntype DebouncerKey = (PathBuf, Range<usize>);\n\n/// Just to keep in mind there is a race condition on debouncing, when combined with delete\n///\n/// All on the same key\n/// start get R1\n/// start delete R2\n/// end delete R2\n/// start get R3\n/// end get R1\n/// end get R3\n///\n/// ==> R3 would return the cached result, although the resource has been deleted.\npub(crate) struct DebouncedStorage<T> {\n    // wrap both in Arc, because the Future is stored in the cache, which has 'static lifetime\n    // associated\n    underlying: Arc<T>,\n    slice_debouncer: Arc<AsyncDebouncer<DebouncerKey, StorageResult<OwnedBytes>>>,\n}\n\nimpl<T> fmt::Debug for DebouncedStorage<T> {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"DebouncedStorage\").finish()\n    }\n}\n\nimpl<T: Storage> DebouncedStorage<T> {\n    pub(crate) fn new(underlying: T) -> Self {\n        Self {\n            underlying: Arc::new(underlying),\n            slice_debouncer: Arc::new(AsyncDebouncer::default()),\n        }\n    }\n}\n\n#[async_trait]\nimpl<T: Storage> Storage for DebouncedStorage<T> {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.underlying.check_connectivity().await\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        self.underlying.put(path, payload).await\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        self.underlying.copy_to(path, output).await\n    }\n\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        let (debouncer, underlying) = (self.slice_debouncer.clone(), self.underlying.clone());\n        let key = (path.to_owned(), range);\n        debouncer\n            .get_or_create(key.clone(), || async move {\n           
     underlying.get_slice(&key.0, key.1).await\n            })\n            .await\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        // Getting a stream bypasses the debouncer\n        self.underlying.get_slice_stream(path, range).await\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        self.underlying.delete(path).await\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        self.underlying.bulk_delete(paths).await\n    }\n\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        let (debouncer, underlying) = (self.slice_debouncer.clone(), self.underlying.clone());\n        let key = (path.to_owned(), 0..usize::MAX);\n        debouncer\n            .get_or_create(\n                key.clone(),\n                || async move { underlying.get_all(&key.0).await },\n            )\n            .await\n    }\n\n    fn uri(&self) -> &Uri {\n        self.underlying.uri()\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        self.underlying.file_num_bytes(path).await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::ops::Range;\n    use std::path::PathBuf;\n    use std::sync::Arc;\n    use std::sync::atomic::{AtomicU32, Ordering};\n    use std::time::Duration;\n\n    use once_cell::sync::OnceCell;\n    use tempfile::TempDir;\n    use tokio::fs::{self, File};\n    use tokio::io::AsyncWriteExt;\n    use tokio::task;\n\n    use super::*;\n\n    #[test]\n    fn test_sync_and_send() {\n        fn is_sync<T: Sync>() {}\n        fn is_send<T: Send>() {}\n        is_sync::<AsyncDebouncer<String, Result<String, String>>>();\n        is_send::<AsyncDebouncer<String, Result<String, String>>>();\n    }\n\n    #[derive(Hash, Clone, Debug, Eq, PartialEq)]\n    pub struct SliceAddress {\n        pub path: 
PathBuf,\n        pub byte_range: Range<usize>,\n    }\n\n    async fn get_test_file(temp_dir: &TempDir) -> Arc<PathBuf> {\n        let test_filepath1 = Arc::new(temp_dir.path().join(\"f1\"));\n\n        let mut file1 = File::create(test_filepath1.as_ref()).await.unwrap();\n        file1.write_all(\"nice cache dude\".as_bytes()).await.unwrap();\n        test_filepath1\n    }\n\n    #[tokio::test]\n    async fn test_async_slice_cache() {\n        // test data\n\n        let temp_dir = tempfile::tempdir().unwrap();\n        let test_filepath1 = get_test_file(&temp_dir).await;\n\n        let cache: AsyncDebouncer<SliceAddress, Result<String, String>> = AsyncDebouncer::default();\n\n        let addr1 = SliceAddress {\n            path: test_filepath1.as_ref().clone(),\n            byte_range: 10..20,\n        };\n\n        static COUNT: AtomicU32 = AtomicU32::new(0);\n\n        // Load via closure\n        let _val = cache\n            .get_or_create(addr1.clone(), || {\n                let test_filepath1 = test_filepath1.clone();\n                async move {\n                    COUNT.fetch_add(1, Ordering::SeqCst);\n                    let contents = Box::pin(fs::read_to_string(test_filepath1.as_ref().clone()))\n                        .await\n                        // to string, so that the error is cloneable\n                        .map_err(|err| err.to_string())?;\n\n                    Ok(contents)\n                }\n            })\n            .await\n            .unwrap();\n\n        // Load via function\n        let _val = cache\n            .get_or_create(addr1, || {\n                load_via_fn(test_filepath1.as_ref().clone(), &COUNT)\n            })\n            .await\n            .unwrap();\n\n        assert_eq!(COUNT.load(Ordering::SeqCst), 2);\n\n        // Load via function, new entry\n        let addr2 = SliceAddress {\n            path: test_filepath1.as_ref().clone(),\n            byte_range: 10..30,\n        };\n\n        let _val = cache\n     
       .get_or_create(addr2.to_owned(), || {\n                load_via_fn(test_filepath1.as_ref().clone(), &COUNT)\n            })\n            .await\n            .unwrap();\n\n        assert_eq!(COUNT.load(Ordering::SeqCst), 3);\n\n        let load = || load_via_fn(test_filepath1.as_ref().clone(), &COUNT);\n\n        let handles = vec![\n            cache.get_or_create(addr2.to_owned(), load),\n            cache.get_or_create(addr2.to_owned(), load),\n        ];\n\n        futures::future::join_all(handles).await;\n\n        // Count is only increased by one, because of debouncing\n        assert_eq!(COUNT.load(Ordering::SeqCst), 4);\n\n        // Quadruple debouncing\n        let handles = vec![\n            cache.get_or_create(addr2.to_owned(), load),\n            cache.get_or_create(addr2.to_owned(), load),\n            cache.get_or_create(addr2.to_owned(), load),\n            cache.get_or_create(addr2.to_owned(), load),\n        ];\n        futures::future::join_all(handles).await;\n\n        // Count is only increased by one, because of debouncing\n        assert_eq!(COUNT.load(Ordering::SeqCst), 5);\n    }\n\n    #[tokio::test]\n    async fn test_debounce() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let test_filepath1 = get_test_file(&temp_dir).await;\n\n        let cache: AsyncDebouncer<SliceAddress, Result<String, String>> = AsyncDebouncer::default();\n\n        let addr2 = SliceAddress {\n            path: test_filepath1.as_ref().clone(),\n            byte_range: 10..20,\n        };\n        static COUNT: AtomicU32 = AtomicU32::new(0);\n\n        let load = || load_via_fn(test_filepath1.as_ref().clone(), &COUNT);\n\n        let handles = vec![\n            cache.get_or_create(addr2.to_owned(), load),\n            cache.get_or_create(addr2.to_owned(), load),\n        ];\n\n        futures::future::join_all(handles).await;\n\n        // Count is only increased by one, because of debouncing\n        
assert_eq!(COUNT.load(Ordering::SeqCst), 1);\n    }\n\n    #[tokio::test]\n    async fn test_cancellation_future() {\n        use tokio::time::timeout;\n        let cache: AsyncDebouncer<String, Result<String, String>> = AsyncDebouncer::default();\n\n        let load = || async {\n            timeout(Duration::from_millis(10), load_via_fn2())\n                .await\n                .map_err(|err| err.to_string())\n        };\n\n        cache\n            .get_or_create(\"key1\".to_owned(), load)\n            .await\n            .unwrap_err();\n        tokio::time::sleep(Duration::from_secs(1)).await;\n        let val = cache.get_or_create(\"key1\".to_owned(), load).await;\n        assert!(val.is_err());\n    }\n\n    async fn load_via_fn2() -> String {\n        tokio::time::sleep(Duration::from_millis(500)).await;\n        \"blub\".to_string()\n    }\n\n    pub static GLOBAL_DEBOUNCER: once_cell::sync::OnceCell<AsyncDebouncer<String, String>> =\n        OnceCell::new();\n    pub fn get_global_debouncer() -> &'static AsyncDebouncer<String, String> {\n        GLOBAL_DEBOUNCER.get_or_init(AsyncDebouncer::default)\n    }\n\n    #[tokio::test]\n    async fn test_cancellation_task() {\n        let load = || async { load_via_fn2().await };\n\n        let handle = task::spawn(async move {\n            get_global_debouncer()\n                .get_or_create(\"key1\".to_owned(), load)\n                .await\n        });\n        tokio::time::sleep(Duration::from_millis(10)).await;\n        // This will cause the Future to be cancelled, so it will not be polled anymore.\n        // That also means the `remove` on the cache, which only runs after awaiting the future, is\n        // never called.\n        handle.abort();\n\n        tokio::time::sleep(Duration::from_secs(1)).await;\n        // The task still hangs unfinished\n        assert_eq!(get_global_debouncer().len(), 1);\n\n        // The next get clears the stale entry\n        get_global_debouncer()\n            .get_or_create(\"key1\".to_owned(), load)\n          
  .await;\n\n        tokio::time::sleep(Duration::from_secs(1)).await;\n        assert_eq!(get_global_debouncer().len(), 0);\n    }\n\n    async fn load_via_fn(path: PathBuf, cnt: &AtomicU32) -> Result<String, String> {\n        cnt.fetch_add(1, Ordering::SeqCst);\n        let contents = Box::pin(fs::read_to_string(path))\n            .await\n            .map_err(|err| err.to_string())?;\n        // sleep so that the requests are reproducibly debounced\n        tokio::time::sleep(Duration::from_millis(10)).await;\n        Ok(contents)\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::path::PathBuf;\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse serde::{Deserialize, Serialize};\nuse tantivy::directory::error::{OpenDirectoryError, OpenReadError};\nuse thiserror::Error;\n\n/// Storage error kind.\n#[derive(Clone, Copy, Debug, Eq, PartialEq, Serialize, Deserialize)]\npub enum StorageErrorKind {\n    /// The target file does not exist.\n    NotFound,\n    /// The request credentials do not allow for this operation.\n    Unauthorized,\n    /// A third-party service forbids this operation, or is misconfigured.\n    Service,\n    /// Any generic internal error.\n    Internal,\n    /// A timeout occurred during the operation.\n    Timeout,\n    /// Io error.\n    Io,\n}\n\n/// Generic Storage Resolver Error.\n#[allow(missing_docs)]\n#[derive(Debug, Clone, thiserror::Error, Serialize, Deserialize)]\npub enum StorageResolverError {\n    /// The storage config is invalid.\n    #[error(\"invalid storage config: `{0}`\")]\n    InvalidConfig(String),\n\n    /// The URI is malformed or does not contain sufficient information to connect to the storage.\n    #[error(\"invalid storage URI: `{0}`\")]\n    InvalidUri(String),\n\n    /// The requested backend is unsupported or unavailable.\n    #[error(\"unsupported storage backend: `{0}`\")]\n    UnsupportedBackend(String),\n\n    /// The URI is valid, 
and is meant to be handled by this resolver,\n    /// but the resolver failed to actually connect to the storage.\n    /// e.g. connection error, credentials error, incompatible version,\n    /// internal error in third party, etc.\n    #[error(\"failed to open storage {kind:?}: {message}\")]\n    FailedToOpenStorage {\n        kind: crate::StorageErrorKind,\n        message: String,\n    },\n}\n\nimpl StorageErrorKind {\n    /// Creates a StorageError.\n    pub fn with_error(self, source: impl Into<anyhow::Error>) -> StorageError {\n        StorageError {\n            kind: self,\n            source: Arc::new(source.into()),\n        }\n    }\n}\n\nimpl From<StorageError> for io::Error {\n    fn from(storage_err: StorageError) -> Self {\n        let io_error_kind = match storage_err.kind() {\n            StorageErrorKind::NotFound => io::ErrorKind::NotFound,\n            _ => io::ErrorKind::Other,\n        };\n        // TODO: This is swallowing the context of the source error.\n        io::Error::new(io_error_kind, storage_err.source.to_string())\n    }\n}\n\n/// Generic StorageError.\n#[derive(Debug, Clone, Error)]\n#[error(\"storage error(kind={kind:?}, source={source})\")]\n#[allow(missing_docs)]\npub struct StorageError {\n    pub kind: StorageErrorKind,\n    #[source]\n    source: Arc<anyhow::Error>,\n}\n\n/// Generic Result type for storage operations.\npub type StorageResult<T> = Result<T, StorageError>;\n\nimpl StorageError {\n    /// Add some context to the wrapper error.\n    pub fn add_context<C>(self, ctx: C) -> Self\n    where C: fmt::Display + Send + Sync + 'static {\n        StorageError {\n            kind: self.kind,\n            source: Arc::new(anyhow::anyhow!(\"{ctx}\").context(self.source)),\n        }\n    }\n\n    /// Returns the corresponding `StorageErrorKind` for this error.\n    pub fn kind(&self) -> StorageErrorKind {\n        self.kind\n    }\n}\n\nimpl From<io::Error> for StorageError {\n    fn from(err: io::Error) -> StorageError 
{\n        match err.kind() {\n            io::ErrorKind::NotFound => StorageErrorKind::NotFound.with_error(err),\n            _ => StorageErrorKind::Io.with_error(err),\n        }\n    }\n}\n\nimpl From<OpenDirectoryError> for StorageError {\n    fn from(err: OpenDirectoryError) -> StorageError {\n        match err {\n            OpenDirectoryError::DoesNotExist(_) => StorageErrorKind::NotFound.with_error(err),\n            _ => StorageErrorKind::Io.with_error(err),\n        }\n    }\n}\n\nimpl From<OpenReadError> for StorageError {\n    fn from(err: OpenReadError) -> StorageError {\n        match err {\n            OpenReadError::FileDoesNotExist(_) => StorageErrorKind::NotFound.with_error(err),\n            _ => StorageErrorKind::Io.with_error(err),\n        }\n    }\n}\n\n/// Error returned by `bulk_delete`. Under the hood, `bulk_delete` groups the files to\n/// delete into multiple batches of fixed size and issues one delete objects request per batch. The\n/// whole operation can fail in multiple ways, which is reflected by the quirkiness of the API of\n/// [`BulkDeleteError`]. First, a batch can fail partially, i.e. some objects are deleted while\n/// others are not. The `successes` and `failures` attributes of the error will be populated\n/// accordingly. Second, a batch can fail completely, in which case the `error` field will be set.\n/// Because a batch failing entirely usually indicates a systemic error, for instance, a connection\n/// or credentials issue, `bulk_delete` does not attempt to delete the remaining batches and\n/// populates the `unattempted` attribute. Consequently, the attributes of this error are not\n/// \"mutually exclusive\": there exists a path where all those fields are not empty. 
The caller is\n/// expected to handle this error carefully and inspect the instance thoroughly before any retry\n/// attempt.\n#[must_use]\n#[derive(Debug, Default, thiserror::Error)]\npub struct BulkDeleteError {\n    /// Error that occurred for a whole batch and caused the entire deletion operation to be\n    /// aborted.\n    pub error: Option<StorageError>,\n    /// List of files that were successfully deleted, including non-existing files.\n    pub successes: Vec<PathBuf>,\n    /// List of files that failed to be deleted along with the corresponding failure descriptions.\n    pub failures: HashMap<PathBuf, DeleteFailure>,\n    /// List of remaining files to delete before the operation was aborted.\n    pub unattempted: Vec<PathBuf>,\n}\n\n/// Describes the failure for an individual file in a batch delete operation.\n#[derive(Debug, Default)]\npub struct DeleteFailure {\n    /// The error that occurred for this file.\n    pub error: Option<StorageError>,\n    /// The failure code is a string that uniquely identifies an error condition. It is meant to be\n    /// read and understood by programs that detect and handle errors by type.\n    pub code: Option<String>,\n    /// The error message contains a generic description of the error condition in English. It is\n    /// intended for a human audience. 
Simple programs display the message directly to the end user\n    /// if they encounter an error condition they don't know how or don't care to handle.\n    /// Sophisticated programs with more exhaustive error handling and proper internationalization\n    /// are more likely to ignore the error message.\n    pub message: Option<String>,\n}\n\nimpl fmt::Display for BulkDeleteError {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        write!(\n            f,\n            \"bulk delete error ({} success(es),  {} failure(s), {} unattempted)\",\n            self.successes.len(),\n            self.failures.len(),\n            self.unattempted.len()\n        )?;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/file_descriptor_cache.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fs::File;\nuse std::io;\nuse std::num::{NonZeroU32, NonZeroUsize};\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::{Arc, Mutex};\n\nuse tantivy::directory::OwnedBytes;\nuse tokio::sync::{OwnedSemaphorePermit, Semaphore};\nuse ulid::Ulid;\n\nuse crate::metrics::SingleCacheMetrics;\n\npub struct FileDescriptorCache {\n    fd_cache: Mutex<lru::LruCache<Ulid, SplitFile>>,\n    fd_semaphore: Arc<Semaphore>,\n    fd_cache_metrics: SingleCacheMetrics,\n}\n\n#[derive(Clone)]\npub struct SplitFile(Arc<SplitFileInner>);\n\nstruct SplitFileInner {\n    num_bytes: u64,\n    // Order matters here. We want file to be dropped (closed) before the semaphore.\n    file: File,\n    _fd_semaphore_guard: OwnedSemaphorePermit,\n}\n\nfn get_split_file_path(root_path: &Path, split_id: Ulid) -> PathBuf {\n    let split_filename = quickwit_common::split_file(split_id);\n    root_path.join(split_filename)\n}\n\nimpl FileDescriptorCache {\n    /// Creates a new file descriptor cache.\n    /// `max_fd_limit` is the total number of file descriptors that can be open at the same time.\n    /// `fd_cache_capacity` is the number of file descriptors that can be cached. 
It is required to\n    /// be less than `max_fd_limit`.\n    ///\n    /// # Warning\n    ///\n    /// The file descriptor cache can be prone to deadlocks.\n    /// Currently the risk is only avoided due to the split search concurrency limit.\n    ///\n    /// When setting the two limits, ensure that `max_fd_limit` is higher than the split search\n    /// concurrency limit, with some margin between the two, so that it is sufficient to avoid\n    /// deadlocks.\n    ///\n    /// TODO: It would be good to enforce this with a bit of refactoring.\n    /// For instance, clients could be forced to declare upfront the number of file descriptors\n    /// they will need. In Quickwit however, one task is hitting one split at a time, so the risk\n    /// is absent.\n    fn new(\n        max_fd_limit: NonZeroU32,\n        fd_cache_capacity: NonZeroU32,\n        fd_cache_metrics: SingleCacheMetrics,\n    ) -> FileDescriptorCache {\n        assert!(max_fd_limit.get() > fd_cache_capacity.get());\n        let fd_cache = Mutex::new(lru::LruCache::new(\n            NonZeroUsize::new(fd_cache_capacity.get() as usize).unwrap(),\n        ));\n        let fd_semaphore = Arc::new(Semaphore::new(max_fd_limit.get() as usize));\n        FileDescriptorCache {\n            fd_cache,\n            fd_semaphore,\n            fd_cache_metrics,\n        }\n    }\n\n    pub fn with_fd_cache_capacity(fd_cache_capacity: NonZeroU32) -> FileDescriptorCache {\n        let max_fd_limit = (fd_cache_capacity.get() * 2)\n            .clamp(fd_cache_capacity.get() + 100, fd_cache_capacity.get() + 200);\n        Self::new(\n            NonZeroU32::new(max_fd_limit).unwrap(),\n            fd_cache_capacity,\n            crate::STORAGE_METRICS\n                .fd_cache_metrics\n                .cache_metrics\n                .clone(),\n        )\n    }\n\n    fn get_split_file(&self, split_id: Ulid) -> Option<SplitFile> {\n        
self.fd_cache.lock().unwrap().get(&split_id).cloned()\n    }\n\n    fn put_split_file(&self, split_id: Ulid, split_file: SplitFile) {\n        let mut fd_cache_lock = self.fd_cache.lock().unwrap();\n        fd_cache_lock.push(split_id, split_file);\n        self.fd_cache_metrics\n            .in_cache_count\n            .set(fd_cache_lock.len() as i64);\n    }\n\n    /// Evicts the given list of split ids from the file descriptor cache.\n    /// This method does NOT remove the actual files.\n    pub fn evict_split_files(&self, split_ids: &[Ulid]) {\n        let mut fd_cache_lock = self.fd_cache.lock().unwrap();\n        for split_id in split_ids {\n            fd_cache_lock.pop(split_id);\n        }\n        self.fd_cache_metrics\n            .in_cache_count\n            .set(fd_cache_lock.len() as i64);\n        self.fd_cache_metrics\n            .evict_num_items\n            .inc_by(split_ids.len() as u64);\n    }\n\n    pub async fn get_or_open_split_file(\n        &self,\n        root_path: &Path,\n        split_id: Ulid,\n        num_bytes: u64,\n    ) -> std::io::Result<SplitFile> {\n        if let Some(split_file) = self.get_split_file(split_id) {\n            self.fd_cache_metrics.hits_num_items.inc();\n            return Ok(split_file);\n        } else {\n            self.fd_cache_metrics.misses_num_items.inc();\n        }\n        let split_path = get_split_file_path(root_path, split_id);\n        let fd_semaphore_guard = Semaphore::acquire_owned(self.fd_semaphore.clone())\n            .await\n            .expect(\"fd_semaphore acquire failed. 
please report\");\n        let file: File = tokio::task::spawn_blocking(move || std::fs::File::open(split_path))\n            .await\n            .map_err(|join_error| {\n                io::Error::other(format!(\"failed to open file: {join_error:?}\"))\n            })??;\n        let split_file = SplitFile(Arc::new(SplitFileInner {\n            num_bytes,\n            file,\n            _fd_semaphore_guard: fd_semaphore_guard,\n        }));\n        self.put_split_file(split_id, split_file.clone());\n        Ok(split_file)\n    }\n}\n\nimpl SplitFile {\n    pub async fn get_range(&self, range: Range<usize>) -> io::Result<OwnedBytes> {\n        use std::os::unix::fs::FileExt;\n        let file = self.clone();\n        let buf = tokio::task::spawn_blocking(move || {\n            let mut buf = Vec::with_capacity(range.len());\n            #[allow(clippy::uninit_vec)]\n            unsafe {\n                buf.set_len(range.len());\n            }\n            file.0.file.read_exact_at(&mut buf, range.start as u64)?;\n            io::Result::Ok(buf)\n        })\n        .await\n        .unwrap()?;\n        Ok(OwnedBytes::new(buf))\n    }\n\n    pub async fn get_all(&self) -> io::Result<OwnedBytes> {\n        self.get_range(0..self.0.num_bytes as usize).await\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroU32;\n\n    use tokio::fs;\n    use ulid::Ulid;\n\n    use super::FileDescriptorCache;\n    use crate::metrics::CacheMetrics;\n\n    #[tokio::test]\n    async fn test_fd_cache_big_cache() {\n        let cache_metrics = CacheMetrics::for_component(\"fdtest\").cache_metrics;\n        let fd_cache = FileDescriptorCache::new(\n            NonZeroU32::new(20).unwrap(),\n            NonZeroU32::new(10).unwrap(),\n            cache_metrics.clone(),\n        );\n        let tempdir = tempfile::tempdir().unwrap();\n        let split_ids: Vec<Ulid> = std::iter::repeat_with(Ulid::new).take(100).collect();\n        for &split_id in &split_ids {\n            let 
split_filepath = super::get_split_file_path(tempdir.path(), split_id);\n            let content = split_id.to_string();\n            assert_eq!(content.len(), 26);\n            fs::write(split_filepath, content.as_bytes()).await.unwrap();\n        }\n        for &split_id in &split_ids[0..10] {\n            fd_cache\n                .get_or_open_split_file(tempdir.path(), split_id, 26)\n                .await\n                .unwrap();\n        }\n        for &split_id in &split_ids[0..10] {\n            fd_cache\n                .get_or_open_split_file(tempdir.path(), split_id, 26)\n                .await\n                .unwrap();\n        }\n        for &split_id in &split_ids[0..10] {\n            fd_cache\n                .get_or_open_split_file(tempdir.path(), split_id, 26)\n                .await\n                .unwrap();\n        }\n        assert_eq!(cache_metrics.in_cache_count.get(), 10);\n        assert_eq!(cache_metrics.hits_num_items.get(), 20);\n        assert_eq!(cache_metrics.misses_num_items.get(), 10);\n    }\n\n    // This mimics Quickwit's workload where the fd cache is much smaller than the number of\n    // splits. 
Each search will read from the same split file, and the cache will help avoid\n    // opening the file several times.\n    #[tokio::test]\n    async fn test_fd_cache_small_cache() {\n        let cache_metrics = CacheMetrics::for_component(\"fdtest2\").cache_metrics;\n        let fd_cache = FileDescriptorCache::new(\n            NonZeroU32::new(20).unwrap(),\n            NonZeroU32::new(10).unwrap(),\n            cache_metrics.clone(),\n        );\n        let tempdir = tempfile::tempdir().unwrap();\n        let split_ids: Vec<Ulid> = std::iter::repeat_with(Ulid::new).take(100).collect();\n        for &split_id in &split_ids {\n            let split_filepath = super::get_split_file_path(tempdir.path(), split_id);\n            let content = split_id.to_string();\n            assert_eq!(content.len(), 26);\n            fs::write(split_filepath, content.as_bytes()).await.unwrap();\n        }\n        for &split_id in &split_ids[0..100] {\n            for _ in 0..10 {\n                fd_cache\n                    .get_or_open_split_file(tempdir.path(), split_id, 26)\n                    .await\n                    .unwrap();\n            }\n        }\n        assert_eq!(cache_metrics.in_cache_count.get(), 10);\n        assert_eq!(cache_metrics.hits_num_items.get(), 100 * 9);\n        assert_eq!(cache_metrics.misses_num_items.get(), 100);\n    }\n\n    #[tokio::test]\n    async fn test_split_file() {\n        let fd_cache = FileDescriptorCache::with_fd_cache_capacity(NonZeroU32::new(20).unwrap());\n        let tempdir = tempfile::tempdir().unwrap();\n        let split_id: Ulid = Ulid::new();\n        let split_filepath = super::get_split_file_path(tempdir.path(), split_id);\n        let content = split_id.to_string();\n        assert_eq!(content.len(), 26);\n        fs::write(split_filepath, content.as_bytes()).await.unwrap();\n        let split_file = fd_cache\n            .get_or_open_split_file(tempdir.path(), split_id, 26)\n            .await\n            
.unwrap();\n        {\n            let bytes = split_file.get_all().await.unwrap();\n            assert_eq!(bytes.as_slice(), content.as_bytes());\n        }\n        {\n            let bytes = split_file.get_range(1..3).await.unwrap();\n            assert_eq!(bytes.as_slice(), &content.as_bytes()[1..3]);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![warn(missing_docs)]\n#![allow(clippy::bool_assert_comparison)]\n#![allow(clippy::len_without_is_empty)]\n#![deny(clippy::disallowed_methods)]\n\n//! `quickwit-storage` is the abstraction used in quickwit to interface itself\n//! to different storage:\n//! - object storages (S3)\n//! - local filesystem\n//! - distributed filesystems.\n//! - etc.\n//!\n//! The `BundleStorage` bundles together multiple files into a single file.\nmod cache;\nmod debouncer;\nmod file_descriptor_cache;\nmod metrics;\nmod storage;\nmod timeout_and_retry_storage;\npub use debouncer::AsyncDebouncer;\npub(crate) use debouncer::DebouncedStorage;\n\npub use self::metrics::STORAGE_METRICS;\npub use self::payload::PutPayload;\npub use self::storage::Storage;\n\nmod bundle_storage;\nmod error;\n\nmod local_file_storage;\nmod object_storage;\n#[cfg(feature = \"gcs\")]\nmod opendal_storage;\nmod payload;\nmod prefix_storage;\nmod ram_storage;\nmod split;\nmod split_cache;\nmod storage_factory;\nmod storage_resolver;\nmod versioned_component;\n\nuse quickwit_common::uri::Uri;\npub use split_cache::SplitCache;\npub use tantivy::directory::OwnedBytes;\npub use versioned_component::VersionedComponent;\n\npub use self::bundle_storage::{BundleStorage, BundleStorageFileOffsets};\n#[cfg(any(test, feature = \"testsuite\"))]\npub use self::cache::MockStorageCache;\npub use 
self::cache::{\n    ByteRangeCache, MemorySizedCache, QuickwitCache, StorageCache, wrap_storage_with_cache,\n};\npub use self::local_file_storage::{LocalFileStorage, LocalFileStorageFactory};\n#[cfg(feature = \"azure\")]\npub use self::object_storage::{AzureBlobStorage, AzureBlobStorageFactory};\npub use self::object_storage::{\n    MultiPartPolicy, S3CompatibleObjectStorage, S3CompatibleObjectStorageFactory,\n};\n#[cfg(feature = \"gcs\")]\npub use self::opendal_storage::GoogleCloudStorageFactory;\n#[cfg(all(feature = \"gcs\", feature = \"integration-testsuite\"))]\npub use self::opendal_storage::test_config_helpers;\npub use self::ram_storage::{RamStorage, RamStorageBuilder};\npub use self::split::{SplitPayload, SplitPayloadBuilder};\n#[cfg(any(test, feature = \"testsuite\"))]\npub use self::storage::MockStorage;\n#[cfg(any(test, feature = \"testsuite\"))]\npub use self::storage_factory::MockStorageFactory;\npub use self::storage_factory::{StorageFactory, UnsupportedStorage};\npub use self::storage_resolver::StorageResolver;\n#[cfg(feature = \"integration-testsuite\")]\npub use self::test_suite::{\n    storage_test_multi_part_upload, storage_test_single_part_upload, storage_test_suite,\n    test_write_and_bulk_delete,\n};\npub use self::timeout_and_retry_storage::TimeoutAndRetryStorage;\npub use crate::error::{\n    BulkDeleteError, DeleteFailure, StorageError, StorageErrorKind, StorageResolverError,\n    StorageResult,\n};\n\n/// Loads an entire local or remote file into memory.\npub async fn load_file(\n    storage_resolver: &StorageResolver,\n    uri: &Uri,\n) -> anyhow::Result<OwnedBytes> {\n    let parent = uri\n        .parent()\n        .ok_or_else(|| anyhow::anyhow!(\"URI `{uri}` is not a valid file URI\"))?;\n    let storage = storage_resolver.resolve(&parent).await?;\n    let file_name = uri\n        .file_name()\n        .ok_or_else(|| anyhow::anyhow!(\"URI `{uri}` is not a valid file URI\"))?;\n    let bytes = storage.get_all(file_name).await?;\n    
Ok(bytes)\n}\n\n// this function isn't meant to be called, just to break compilation if\n// serde_json::Map is an ordered map and not a btree map\n#[allow(dead_code)]\n#[cfg(not(any(test, feature = \"testsuite\", feature = \"integration-testsuite\")))]\nunsafe fn serde_json_preserve_order_canary(\n    val: serde_json::Map<String, serde_json::Value>,\n) -> std::collections::BTreeMap<String, serde_json::Value> {\n    use std::mem::transmute as assert_serde_json__preserve_order__disabled;\n    unsafe { assert_serde_json__preserve_order__disabled(val) }\n}\n\n#[cfg(any(test, feature = \"testsuite\", feature = \"integration-testsuite\"))]\nmod for_test {\n    use std::sync::Arc;\n\n    use crate::{RamStorage, Storage};\n\n    /// Returns a storage backed by an \"in-memory file\" for testing.\n    pub fn storage_for_test() -> Arc<dyn Storage> {\n        Arc::new(RamStorage::default())\n    }\n}\n\n#[cfg(any(test, feature = \"testsuite\", feature = \"integration-testsuite\"))]\npub use for_test::storage_for_test;\n\n#[cfg(test)]\nmod tests {\n    use std::str::FromStr;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_load_file() {\n        let storage_resolver = StorageResolver::builder()\n            .register(LocalFileStorageFactory)\n            .build()\n            .unwrap();\n        let expected_bytes = tokio::fs::read_to_string(\"Cargo.toml\").await.unwrap();\n        assert_eq!(\n            load_file(&storage_resolver, &Uri::from_str(\"Cargo.toml\").unwrap())\n                .await\n                .unwrap()\n                .as_slice(),\n            expected_bytes.as_bytes()\n        );\n    }\n}\n\n#[cfg(any(test, feature = \"integration-testsuite\"))]\npub(crate) mod test_suite {\n\n    use std::path::Path;\n\n    use anyhow::Context;\n    use tokio::io::AsyncReadExt;\n\n    use crate::{Storage, StorageErrorKind};\n\n    async fn test_get_inexistent_file(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let err = storage\n            
.get_slice(Path::new(\"missingfile\"), 0..3)\n            .await\n            .map_err(|err| err.kind());\n        assert!(matches!(err, Err(StorageErrorKind::NotFound)));\n        Ok(())\n    }\n\n    async fn test_write_and_get_slice(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"write_and_read_slice\");\n        storage\n            .put(\n                test_path,\n                Box::new(b\"abcdefghiklmnopqrstuvxyz\"[..].to_vec()),\n            )\n            .await?;\n        let payload = storage.get_slice(test_path, 3..6).await?;\n        assert_eq!(&payload[..], b\"def\");\n        Ok(())\n    }\n\n    async fn test_write_and_get_slice_stream(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"write_and_read_slice_stream\");\n        storage\n            .put(\n                test_path,\n                Box::new(b\"abcdefghiklmnopqrstuvxyz\"[..].to_vec()),\n            )\n            .await?;\n        let mut reader = storage.get_slice_stream(test_path, 3..6).await?;\n        let mut buf = vec![0; 3];\n        reader.read_exact(&mut buf).await?;\n        assert_eq!(&buf[..], b\"def\");\n        Ok(())\n    }\n\n    async fn test_write_get_all(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"write_and_read_all\");\n        storage\n            .put(test_path, Box::new(b\"abcdef\"[..].to_vec()))\n            .await?;\n        let payload = storage.get_all(test_path).await?;\n        assert_eq!(&payload[..], &b\"abcdef\"[..]);\n        Ok(())\n    }\n\n    async fn test_write_and_cp(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"write_and_cp\");\n        let payload_bytes = b\"abcdefghijklmnopqrstuvwxyz\";\n        storage\n            .put(test_path, Box::new(payload_bytes.to_vec()))\n            .await?;\n        let temp_dir = tempfile::tempdir()?;\n        let dest_path = 
temp_dir.path().to_path_buf();\n        let local_copy = dest_path.join(\"local_copy\");\n        storage.copy_to_file(test_path, &local_copy).await?;\n        let payload = std::fs::read(&local_copy)?;\n        assert_eq!(&payload[..], payload_bytes);\n        Ok(())\n    }\n\n    async fn test_write_and_delete(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"write_and_delete\");\n        let payload_bytes = b\"abcdefghijklmnopqrstuvwxyz\";\n        storage\n            .put(test_path, Box::new(payload_bytes.to_vec()))\n            .await?;\n        assert!(storage.exists(test_path).await?);\n        storage.delete(test_path).await?;\n        assert!(!storage.exists(test_path).await?);\n        storage.delete(test_path).await?;\n        Ok(())\n    }\n\n    /// Tests `Storage::bulk_delete`.\n    pub async fn test_write_and_bulk_delete(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_paths = [\n            Path::new(\"foo\"),\n            Path::new(\"bar\"),\n            Path::new(\"qux\"),\n            Path::new(\"baz\"),\n            Path::new(\"file-does-not-exist\"),\n        ];\n        for test_path in &test_paths[0..4] {\n            storage\n                .put(Path::new(test_path), Box::new(b\"123\".to_vec()))\n                .await?;\n            assert!(storage.exists(test_path).await?);\n        }\n        storage.bulk_delete(&test_paths).await?;\n\n        for test_path in test_paths {\n            assert!(!storage.exists(test_path).await?);\n        }\n        Ok(())\n    }\n\n    async fn test_file_size(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"write_for_filesize\");\n        let payload_bytes = b\"abcdefghijklmnopqrstuvwxyz\";\n        storage\n            .put(test_path, Box::new(payload_bytes.to_vec()))\n            .await?;\n        assert_eq!(storage.file_num_bytes(test_path).await?, 26u64);\n        storage.delete(test_path).await?;\n      
  Ok(())\n    }\n\n    async fn test_exists(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"exists\");\n        assert!(!storage.exists(test_path).await.unwrap());\n        storage\n            .put(test_path, Box::<std::vec::Vec<u8>>::default())\n            .await?;\n        assert!(storage.exists(test_path).await.unwrap());\n        storage.delete(test_path).await.unwrap();\n        Ok(())\n    }\n\n    async fn test_delete_missing_file(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"missing_file\");\n        assert!(!storage.exists(test_path).await.unwrap());\n        assert!(storage.delete(test_path).await.is_ok());\n        Ok(())\n    }\n\n    async fn test_write_and_delete_with_dir_separator(\n        storage: &mut dyn Storage,\n    ) -> anyhow::Result<()> {\n        let test_path = Path::new(\"foo/bar/write_and_delete_with_separator\");\n        let payload_bytes = b\"abcdefghijklmnopqrstuvwxyz\";\n        storage\n            .put(test_path, Box::new(payload_bytes.to_vec()))\n            .await?;\n        assert!(matches!(\n            storage.exists(Path::new(\"foo/bar\")).await,\n            Ok(false)\n        ));\n        storage.delete(test_path).await?;\n\n        assert!(matches!(\n            storage.exists(Path::new(\"foo/bar\")).await,\n            Ok(false)\n        ));\n        assert!(matches!(storage.exists(Path::new(\"foo\")).await, Ok(false)));\n        Ok(())\n    }\n\n    /// Generic test suite for a storage.\n    pub async fn storage_test_suite(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        test_get_inexistent_file(storage)\n            .await\n            .context(\"get_inexistent_file\")?;\n        test_write_and_get_slice(storage)\n            .await\n            .context(\"write_and_get_slice\")?;\n        test_write_and_get_slice_stream(storage)\n            .await\n            .context(\"write_and_get_slice_stream\")?;\n        
test_write_get_all(storage)\n            .await\n            .context(\"write_and_get_all\")?;\n        test_write_and_cp(storage).await.context(\"write_and_cp\")?;\n        test_write_and_delete(storage)\n            .await\n            .context(\"write_and_delete\")?;\n        test_write_and_bulk_delete(storage)\n            .await\n            .context(\"write_and_bulk_delete\")?;\n        test_exists(storage).await.context(\"exists\")?;\n        test_write_and_delete_with_dir_separator(storage)\n            .await\n            .context(\"write_and_delete_with_separator\")?;\n        test_file_size(storage).await.context(\"file_size\")?;\n        test_delete_missing_file(storage)\n            .await\n            .context(\"delete_missing_file\")?;\n        Ok(())\n    }\n\n    /// Generic single-part upload test.\n    #[cfg(feature = \"integration-testsuite\")]\n    pub async fn storage_test_single_part_upload(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        use std::ops::Range;\n\n        let test_path = Path::new(\"hello_small.txt\");\n        let data = b\"hello, happy tax payer!\";\n        let data_size = data.len() as u64;\n        storage.put(test_path, Box::new(data.to_vec())).await?;\n        // file_num_bytes\n        assert_eq!(storage.file_num_bytes(test_path).await?, data_size);\n        // get_all\n        let all_bytes = storage.get_all(test_path).await?;\n        assert_eq!(all_bytes.as_slice(), data);\n        // get_slice\n        let happy_bytes = storage\n            .get_slice(test_path, Range { start: 7, end: 12 })\n            .await?;\n        assert_eq!(happy_bytes.as_slice(), &data[7..12]);\n        // get_slice_stream\n        let mut happy_byte_stream = storage\n            .get_slice_stream(test_path, Range { start: 7, end: 12 })\n            .await?;\n        let mut happy_bytes_read = Vec::new();\n        happy_byte_stream.read_to_end(&mut happy_bytes_read).await?;\n        assert_eq!(happy_bytes_read.as_slice(), 
&data[7..12]);\n        Ok(())\n    }\n\n    /// Generic multi-part upload test.\n    #[cfg(feature = \"integration-testsuite\")]\n    pub async fn storage_test_multi_part_upload(storage: &mut dyn Storage) -> anyhow::Result<()> {\n        let test_path = Path::new(\"hello_large.txt\");\n\n        let mut test_buffer = Vec::with_capacity(15_000_000);\n        for i in 0..15_000_000u32 {\n            test_buffer.push((i % 256) as u8);\n        }\n\n        storage\n            .put(test_path, Box::new(test_buffer.clone()))\n            .await?;\n\n        assert_eq!(storage.file_num_bytes(test_path).await?, 15_000_000);\n\n        let downloaded_data = storage.get_all(test_path).await?;\n\n        assert_eq!(test_buffer.len(), downloaded_data.len(), \"Length mismatch\");\n        // Don't use assert_eq here, since we don't want large buffers to be printed\n        // if the assertion fails.\n        assert!(\n            test_buffer.as_slice() == downloaded_data.as_slice(),\n            \"Content mismatch - data corruption detected!\"\n        );\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/local_file_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::{BTreeSet, HashMap};\nuse std::fmt;\nuse std::io::{ErrorKind, SeekFrom};\nuse std::ops::Range;\nuse std::path::{Component, Path, PathBuf};\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse futures::StreamExt;\nuse futures::future::{BoxFuture, FutureExt};\nuse quickwit_common::ignore_error_kind;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::StorageBackend;\nuse tokio::io::{AsyncRead, AsyncReadExt, AsyncSeekExt, AsyncWriteExt};\nuse tracing::warn;\n\nuse crate::metrics::object_storage_get_slice_in_flight_guards;\nuse crate::storage::SendableAsync;\nuse crate::{\n    BulkDeleteError, DebouncedStorage, DeleteFailure, OwnedBytes, Storage, StorageError,\n    StorageErrorKind, StorageFactory, StorageResolverError, StorageResult,\n};\n\n/// File system compatible storage implementation.\n#[derive(Clone)]\npub struct LocalFileStorage {\n    uri: Uri,\n    root: PathBuf,\n}\n\nimpl fmt::Debug for LocalFileStorage {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"LocalFileStorage\")\n            .field(\"root\", &self.root.display())\n            .finish()\n    }\n}\n\nimpl LocalFileStorage {\n    fn full_path(&self, relative_path: &Path) -> crate::StorageResult<PathBuf> {\n        ensure_valid_relative_path(relative_path)?;\n        
Ok(self.root.join(relative_path))\n    }\n\n    /// Creates a local file storage instance given a URI.\n    pub fn from_uri(uri: &Uri) -> Result<Self, StorageResolverError> {\n        uri.filepath()\n            .map(|root| Self {\n                uri: uri.clone(),\n                root: root.to_path_buf(),\n            })\n            .ok_or_else(|| {\n                let message = format!(\"URI `{uri}` is not a valid file URI\");\n                StorageResolverError::InvalidUri(message)\n            })\n    }\n\n    /// Moves a file from a source to a destination.\n    /// from here is an external path, and to is an internal path.\n    pub async fn move_into(&self, from_external: &Path, to: &Path) -> crate::StorageResult<()> {\n        let to_full_path = self.full_path(to)?;\n        tokio::fs::rename(from_external, to_full_path).await?;\n        Ok(())\n    }\n\n    /// Moves a file from a source to a destination.\n    /// from here is an internal path, and to is an external path.\n    pub async fn move_out(&self, from_internal: &Path, to: &Path) -> crate::StorageResult<()> {\n        let from_full_path = self.full_path(from_internal)?;\n        tokio::fs::rename(from_full_path, to).await?;\n        Ok(())\n    }\n\n    async fn delete_single_file(&self, relative_path: &Path) -> StorageResult<()> {\n        let full_path = self.full_path(relative_path)?;\n        ignore_error_kind!(ErrorKind::NotFound, tokio::fs::remove_file(full_path).await)?;\n        Ok(())\n    }\n}\n\n/// Ensure that the path given does not include any \"..\" for security reasons.\n///\n/// In order to reduce the attack surface, we want to make sure the `FileStorage`\n/// only access/delete files that are children of its root_directory.\nfn ensure_valid_relative_path(path: &Path) -> StorageResult<()> {\n    for component in path.components() {\n        match component {\n            Component::RootDir | Component::ParentDir | Component::Prefix(_) => {\n                // We forbid `Path` 
components that are breaking the assumption that\n                // root.join(path) is a child of root (if we omit fs links).\n                return Err(StorageErrorKind::Unauthorized.with_error(anyhow::anyhow!(\n                    \"path `{}` is forbidden. only simple relative path are allowed\",\n                    path.display()\n                )));\n            }\n            Component::CurDir | Component::Normal(_) => {\n                // we accept `./` and subdir/\n            }\n        }\n    }\n    Ok(())\n}\n\n/// Delete empty directories starting from `{root}/{path}` directory and stopping at `{root}`\n/// directory. Note that the `{root}` directory is not deleted.\nfn delete_all_dirs_if_empty<'a>(\n    root: &'a Path,\n    path: &'a Path,\n) -> BoxFuture<'a, std::io::Result<()>> {\n    async move {\n        let full_path = root.join(path);\n        let path_entries_result = full_path.read_dir();\n        if let Err(err) = &path_entries_result {\n            // Ignore `ErrorKind::NotFound` as this could be deleted by another concurrent task.\n            if err.kind() == ErrorKind::NotFound {\n                return Ok(());\n            }\n        }\n\n        let is_not_empty = path_entries_result?.next().is_some();\n        if is_not_empty {\n            return Ok(());\n        }\n\n        let delete_result = tokio::fs::remove_dir(full_path).await;\n        if let Err(err) = &delete_result {\n            // Ignore `ErrorKind::NotFound` as this could be deleted by another concurrent task.\n            if err.kind() == ErrorKind::NotFound {\n                return Ok(());\n            }\n            delete_result?;\n        }\n\n        match &path.parent() {\n            Some(path) => {\n                if path == &Path::new(\"\") || path == &Path::new(\".\") {\n                    return Ok(());\n                }\n                delete_all_dirs_if_empty(root, path).await?;\n            }\n            _ => return Ok(()),\n        }\n\n        
Ok(())\n    }\n    .boxed()\n}\n\n#[async_trait]\nimpl Storage for LocalFileStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if !self.root.try_exists()? {\n            // By creating directories, we check if we have the right permissions.\n            tokio::fs::create_dir_all(&self.root).await?\n        }\n        Ok(())\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        let full_path = self.full_path(path)?;\n        let parent_dir = full_path.parent().ok_or_else(|| {\n            let err = anyhow::anyhow!(\"no parent directory for {full_path:?}\");\n            StorageErrorKind::Internal.with_error(err)\n        })?;\n\n        tokio::fs::create_dir_all(parent_dir).await?;\n        let mut reader = payload.byte_stream().await?.into_async_read();\n        let named_temp_file = tempfile::NamedTempFile::new_in(parent_dir)?;\n        let (temp_std_file, temp_filepath) = named_temp_file.into_parts();\n        let mut temp_tokio_file = tokio::fs::File::from_std(temp_std_file);\n        tokio::io::copy(&mut reader, &mut temp_tokio_file).await?;\n        temp_tokio_file.flush().await?;\n        temp_tokio_file.sync_data().await?;\n        temp_filepath\n            .persist(&full_path)\n            .map_err(|err| StorageErrorKind::Io.with_error(err))?;\n        // We also need to sync the parent directory to ensure that\n        // the file move has been persisted on all file systems.\n        tokio::fs::File::open(parent_dir).await?.sync_data().await?;\n        Ok(())\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        let full_path = self.full_path(path)?;\n        let mut file = tokio::fs::File::open(&full_path).await?;\n        tokio::io::copy(&mut file, output).await?;\n        Ok(())\n    }\n\n    #[tracing::instrument(skip(self), level = \"debug\")]\n    async fn 
get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        let full_path = self.full_path(path)?;\n        tokio::task::spawn_blocking(move || {\n            use std::io::{Read, Seek};\n            // we run these io in a spawn_blocking so there is no scheduling delay between each\n            // step, as there would be if using tokio async File.\n            let mut file = std::fs::File::open(full_path)?;\n            file.seek(SeekFrom::Start(range.start as u64))?;\n            let _in_flight_guards = object_storage_get_slice_in_flight_guards(range.len());\n            let mut content_bytes: Vec<u8> = Vec::with_capacity(range.len());\n            #[allow(clippy::uninit_vec)]\n            unsafe {\n                content_bytes.set_len(range.len());\n            }\n            file.read_exact(&mut content_bytes)?;\n            Ok(OwnedBytes::new(content_bytes))\n        })\n        .await\n        .map_err(|_| {\n            StorageErrorKind::Internal.with_error(anyhow::anyhow!(\"reading file panicked\"))\n        })?\n    }\n\n    #[tracing::instrument(skip(self), level = \"debug\")]\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        let full_path = self.full_path(path)?;\n        let mut file = tokio::fs::File::open(&full_path).await?;\n        file.seek(SeekFrom::Start(range.start as u64)).await?;\n        Ok(Box::new(file.take(range.len() as u64)))\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        self.delete_single_file(path).await?;\n        if let Some(parent) = path.parent()\n            && let Err(error) = delete_all_dirs_if_empty(&self.root, parent).await\n        {\n            warn!(error=?error, path=%path.display(), \"failed to delete directory\");\n        }\n        Ok(())\n    }\n\n    /// Deletes the files identified by `paths` concurrently, with a maximum of `10` 
syscalls at a\n    /// time. Additionally, deletes the parent directories of `paths` if they are empty after the\n    /// first round of deletions.\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        let mut successes = Vec::with_capacity(paths.len());\n        let mut failures = HashMap::new();\n        let mut parent_paths = BTreeSet::new();\n\n        let remove_file_res_futures: Vec<_> = paths\n            .iter()\n            .map(|path| async move {\n                let remove_file_res = self.delete_single_file(path).await;\n                (path, remove_file_res)\n            })\n            .collect();\n\n        let mut stream = futures::stream::iter(remove_file_res_futures).buffer_unordered(10);\n\n        while let Some((path, remove_file_res)) = stream.next().await {\n            match remove_file_res {\n                Ok(_) => {\n                    successes.push(path.to_path_buf());\n\n                    if let Some(parent) = path.parent() {\n                        parent_paths.insert(parent);\n                    }\n                }\n                Err(error) => {\n                    let failure = DeleteFailure {\n                        error: Some(error),\n                        ..Default::default()\n                    };\n                    failures.insert(path.to_path_buf(), failure);\n                }\n            }\n        }\n        // Delete parent directories of `paths` if they are empty.\n        // Traverse the parent directories in reverse order, so that we delete the deepest ones\n        // first.\n        for parent_path in parent_paths.into_iter().rev() {\n            if let Err(error) = delete_all_dirs_if_empty(&self.root, parent_path).await {\n                warn!(error=?error, path=%parent_path.display(), \"failed to delete directory\");\n            }\n        }\n        if failures.is_empty() {\n            return Ok(());\n        }\n        Err(BulkDeleteError {\n       
     successes,\n            failures,\n            ..Default::default()\n        })\n    }\n\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        let full_path = self.full_path(path)?;\n        let content_bytes = tokio::fs::read(full_path).await.map_err(|err| {\n            StorageError::from(err).add_context(format!(\n                \"failed to read file {}/{}\",\n                self.uri(),\n                path.to_string_lossy()\n            ))\n        })?;\n        Ok(OwnedBytes::new(content_bytes))\n    }\n\n    fn uri(&self) -> &Uri {\n        &self.uri\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        let full_path = self.full_path(path)?;\n        match tokio::fs::metadata(full_path).await {\n            Ok(metadata) => {\n                if metadata.is_file() {\n                    Ok(metadata.len())\n                } else {\n                    Err(StorageErrorKind::NotFound.with_error(anyhow::anyhow!(\n                        \"file `{}` is not a regular file, cannot determine its size\",\n                        path.display()\n                    )))\n                }\n            }\n            Err(err) => {\n                if err.kind() == ErrorKind::NotFound {\n                    Err(StorageErrorKind::NotFound.with_error(err))\n                } else {\n                    Err(err.into())\n                }\n            }\n        }\n    }\n}\n\n/// A File storage resolver\n#[derive(Clone, Debug, Default)]\npub struct LocalFileStorageFactory;\n\n#[async_trait]\nimpl StorageFactory for LocalFileStorageFactory {\n    fn backend(&self) -> StorageBackend {\n        StorageBackend::File\n    }\n\n    async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError> {\n        let storage = LocalFileStorage::from_uri(uri)?;\n        Ok(Arc::new(DebouncedStorage::new(storage)))\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::str::FromStr;\n\n    use 
super::*;\n    use crate::test_suite::storage_test_suite;\n\n    #[tokio::test]\n    async fn test_local_file_storage() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let uri = Uri::from_str(&format!(\"{}\", temp_dir.path().display())).unwrap();\n        let mut local_file_storage = LocalFileStorage::from_uri(&uri)?;\n        storage_test_suite(&mut local_file_storage).await?;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_local_file_storage_forbids_double_dot() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        let uri = Uri::from_str(&format!(\"{}\", temp_dir.path().display())).unwrap();\n        let local_file_storage = LocalFileStorage::from_uri(&uri).unwrap();\n        assert_eq!(\n            local_file_storage\n                .exists(Path::new(\"hello/toto\"))\n                .await\n                .unwrap(),\n            false\n        );\n        let exist_error = local_file_storage\n            .exists(Path::new(\"hello/../toto\"))\n            .await\n            .unwrap_err();\n        assert_eq!(exist_error.kind(), StorageErrorKind::Unauthorized);\n    }\n\n    #[tokio::test]\n    async fn test_local_file_storage_factory() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let index_uri =\n            Uri::from_str(&format!(\"file://{}/foo/bar\", temp_dir.path().display())).unwrap();\n        let local_file_storage_factory = LocalFileStorageFactory;\n        let local_file_storage = local_file_storage_factory.resolve(&index_uri).await?;\n        assert_eq!(local_file_storage.uri(), &index_uri);\n\n        let err = local_file_storage_factory\n            .resolve(&Uri::for_test(\"s3://foo/bar\"))\n            .await\n            .err()\n            .unwrap();\n        assert!(matches!(err, StorageResolverError::InvalidUri { .. 
}));\n\n        let err = local_file_storage_factory\n            .resolve(&Uri::for_test(\"s3://\"))\n            .await\n            .err()\n            .unwrap();\n        assert!(matches!(err, StorageResolverError::InvalidUri { .. }));\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_local_file_storage_bulk_delete() {\n        let temp_dir = tempfile::tempdir().unwrap();\n        tokio::fs::create_dir(temp_dir.path().join(\"foo-dir\"))\n            .await\n            .unwrap();\n        tokio::fs::create_dir(temp_dir.path().join(\"bar-dir\"))\n            .await\n            .unwrap();\n        tokio::fs::File::create(temp_dir.path().join(\"foo-dir/foo\"))\n            .await\n            .unwrap();\n\n        let uri = Uri::from_str(&format!(\"{}\", temp_dir.path().display())).unwrap();\n        let local_file_storage = LocalFileStorage::from_uri(&uri).unwrap();\n        let error = local_file_storage\n            .bulk_delete(&[Path::new(\"foo-dir/foo\"), Path::new(\"bar-dir\")])\n            .await\n            .unwrap_err();\n        assert_eq!(error.successes, [PathBuf::from(\"foo-dir/foo\")]);\n\n        let failure = error.failures.get(Path::new(\"bar-dir\")).unwrap();\n        assert_eq!(failure.error.as_ref().unwrap().kind(), StorageErrorKind::Io);\n\n        assert!(!temp_dir.path().join(\"foo-dir\").try_exists().unwrap());\n    }\n\n    #[tokio::test]\n    async fn test_try_delete_dir_all() -> anyhow::Result<()> {\n        let path_root = tempfile::tempdir()?.keep();\n        let dir_path = path_root.clone().join(\"foo/bar/baz\");\n        tokio::fs::create_dir_all(dir_path.clone()).await?;\n\n        // check all empty directory\n        assert_eq!(dir_path.try_exists().unwrap(), true);\n        delete_all_dirs_if_empty(&path_root, dir_path.as_path()).await?;\n        assert_eq!(dir_path.try_exists().unwrap(), false);\n        assert_eq!(dir_path.parent().unwrap().try_exists().unwrap(), false);\n\n        // check with intermediate 
file\n        tokio::fs::create_dir_all(dir_path.clone()).await?;\n        let intermediate_file = dir_path.parent().unwrap().join(\"fizz.txt\");\n        tokio::fs::File::create(intermediate_file.clone()).await?;\n        assert_eq!(dir_path.try_exists().unwrap(), true);\n        assert_eq!(intermediate_file.try_exists().unwrap(), true);\n        delete_all_dirs_if_empty(&path_root, dir_path.as_path()).await?;\n        assert_eq!(dir_path.try_exists().unwrap(), false);\n        assert_eq!(dir_path.parent().unwrap().try_exists().unwrap(), true);\n\n        // make sure it does not go beyond the path\n        tokio::fs::create_dir_all(path_root.join(\"home/foo/bar\")).await?;\n        delete_all_dirs_if_empty(&path_root.join(\"home/foo\"), Path::new(\"bar\")).await?;\n        assert_eq!(path_root.join(\"home/foo\").try_exists().unwrap(), true);\n\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/metrics.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// See https://prometheus.io/docs/practices/naming/\n\nuse std::collections::HashMap;\nuse std::sync::RwLock;\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::metrics::{\n    GaugeGuard, Histogram, IntCounter, IntCounterVec, IntGauge, new_counter, new_counter_vec,\n    new_gauge, new_histogram_vec,\n};\nuse quickwit_config::CacheConfig;\n\n/// Counters associated to storage operations.\npub struct StorageMetrics {\n    pub shortlived_cache: CacheMetrics,\n    pub partial_request_cache: CacheMetrics,\n    pub predicate_cache: CacheMetrics,\n    pub fd_cache_metrics: CacheMetrics,\n    pub fast_field_cache: CacheMetrics,\n    pub split_footer_cache: CacheMetrics,\n    pub searcher_split_cache: CacheMetrics,\n    pub get_slice_timeout_successes: [IntCounter; 3],\n    pub get_slice_timeout_all_timeouts: IntCounter,\n    pub object_storage_get_total: IntCounter,\n    pub object_storage_get_errors_total: IntCounterVec<1>,\n    pub object_storage_get_slice_in_flight_count: IntGauge,\n    pub object_storage_get_slice_in_flight_num_bytes: IntGauge,\n    pub object_storage_put_total: IntCounter,\n    pub object_storage_put_parts: IntCounter,\n    pub object_storage_download_num_bytes: IntCounter,\n    pub object_storage_upload_num_bytes: IntCounter,\n\n    pub object_storage_delete_requests_total: IntCounter,\n    pub 
object_storage_bulk_delete_requests_total: IntCounter,\n    pub object_storage_delete_request_duration: Histogram,\n    pub object_storage_bulk_delete_request_duration: Histogram,\n}\n\nimpl Default for StorageMetrics {\n    fn default() -> Self {\n        let get_slice_timeout_outcome_total_vec = new_counter_vec(\n            \"get_slice_timeout_outcome\",\n            \"Outcome of get_slice operations. success_after_1_timeout means the operation \\\n             succeeded after a retry caused by a timeout.\",\n            \"storage\",\n            &[],\n            [\"outcome\"],\n        );\n        let get_slice_timeout_successes = [\n            get_slice_timeout_outcome_total_vec.with_label_values([\"success_after_0_timeout\"]),\n            get_slice_timeout_outcome_total_vec.with_label_values([\"success_after_1_timeout\"]),\n            get_slice_timeout_outcome_total_vec.with_label_values([\"success_after_2+_timeout\"]),\n        ];\n        let get_slice_timeout_all_timeouts =\n            get_slice_timeout_outcome_total_vec.with_label_values([\"all_timeouts\"]);\n\n        let object_storage_requests_total = new_counter_vec(\n            \"object_storage_requests_total\",\n            \"Total number of object storage requests performed.\",\n            \"storage\",\n            &[],\n            [\"action\"],\n        );\n        let object_storage_delete_requests_total =\n            object_storage_requests_total.with_label_values([\"delete_object\"]);\n        let object_storage_bulk_delete_requests_total =\n            object_storage_requests_total.with_label_values([\"delete_objects\"]);\n\n        let object_storage_request_duration = new_histogram_vec(\n            \"object_storage_request_duration_seconds\",\n            \"Duration of object storage requests in seconds.\",\n            \"storage\",\n            &[],\n            [\"action\"],\n            vec![0.1, 0.5, 1.0, 5.0, 10.0, 30.0, 60.0],\n        );\n        let 
object_storage_delete_request_duration =\n            object_storage_request_duration.with_label_values([\"delete_object\"]);\n        let object_storage_bulk_delete_request_duration =\n            object_storage_request_duration.with_label_values([\"delete_objects\"]);\n\n        StorageMetrics {\n            fast_field_cache: CacheMetrics::for_component(\"fastfields\"),\n            fd_cache_metrics: CacheMetrics::for_component(\"fd\"),\n            partial_request_cache: CacheMetrics::for_component(\"partial_request\"),\n            predicate_cache: CacheMetrics::for_component(\"predicate\"),\n            searcher_split_cache: CacheMetrics::for_component(\"searcher_split\"),\n            shortlived_cache: CacheMetrics::for_component(\"shortlived\"),\n            split_footer_cache: CacheMetrics::for_component(\"splitfooter\"),\n            get_slice_timeout_successes,\n            get_slice_timeout_all_timeouts,\n            object_storage_get_total: new_counter(\n                \"object_storage_gets_total\",\n                \"Number of objects fetched. 
Might be lower than get_slice_timeout_outcome if \\\n                 queries are debounced.\",\n                \"storage\",\n                &[],\n            ),\n            object_storage_get_errors_total: new_counter_vec::<1>(\n                \"object_storage_get_errors_total\",\n                \"Number of GetObject errors.\",\n                \"storage\",\n                &[],\n                [\"code\"],\n            ),\n            object_storage_get_slice_in_flight_count: new_gauge(\n                \"object_storage_get_slice_in_flight_count\",\n                \"Number of GetObject for which the memory was allocated but the download is still \\\n                 in progress.\",\n                \"storage\",\n                &[],\n            ),\n            object_storage_get_slice_in_flight_num_bytes: new_gauge(\n                \"object_storage_get_slice_in_flight_num_bytes\",\n                \"Memory allocated for GetObject requests that are still in progress.\",\n                \"storage\",\n                &[],\n            ),\n            object_storage_put_total: new_counter(\n                \"object_storage_puts_total\",\n                \"Number of objects uploaded. 
May differ from object_storage_puts_parts due to \\\n                 multipart upload.\",\n                \"storage\",\n                &[],\n            ),\n            object_storage_put_parts: new_counter(\n                \"object_storage_puts_parts\",\n                \"Number of object parts uploaded.\",\n                \"\",\n                &[],\n            ),\n            object_storage_download_num_bytes: new_counter(\n                \"object_storage_download_num_bytes\",\n                \"Amount of data downloaded from an object storage.\",\n                \"storage\",\n                &[],\n            ),\n            object_storage_upload_num_bytes: new_counter(\n                \"object_storage_upload_num_bytes\",\n                \"Amount of data uploaded to an object storage.\",\n                \"storage\",\n                &[],\n            ),\n            object_storage_delete_requests_total,\n            object_storage_bulk_delete_requests_total,\n            object_storage_delete_request_duration,\n            object_storage_bulk_delete_request_duration,\n        }\n    }\n}\n\n/// Counters associated with a cache.\npub struct CacheMetrics {\n    pub component_name: String,\n    pub cache_metrics: SingleCacheMetrics,\n    virtual_caches_metrics: RwLock<HashMap<CacheConfig, SingleCacheMetrics>>,\n}\n\n#[derive(Clone)]\npub struct SingleCacheMetrics {\n    pub in_cache_count: IntGauge,\n    pub in_cache_num_bytes: IntGauge,\n    pub hits_num_items: IntCounter,\n    pub hits_num_bytes: IntCounter,\n    pub misses_num_items: IntCounter,\n    pub evict_num_items: IntCounter,\n    pub evict_num_bytes: IntCounter,\n}\n\nimpl CacheMetrics {\n    pub fn for_component(component_name: &str) -> Self {\n        const CACHE_METRICS_NAMESPACE: &str = \"cache\";\n        let labels = [(\"component_name\", component_name)];\n        CacheMetrics {\n            component_name: component_name.to_string(),\n            cache_metrics: SingleCacheMetrics 
{\n                in_cache_count: new_gauge(\n                    \"in_cache_count\",\n                    \"Count of in cache by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n                in_cache_num_bytes: new_gauge(\n                    \"in_cache_num_bytes\",\n                    \"Number of bytes in cache by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n                hits_num_items: new_counter(\n                    \"cache_hits_total\",\n                    \"Number of cache hits by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n                hits_num_bytes: new_counter(\n                    \"cache_hits_bytes\",\n                    \"Number of cache hits in bytes by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n                misses_num_items: new_counter(\n                    \"cache_misses_total\",\n                    \"Number of cache misses by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n                evict_num_items: new_counter(\n                    \"cache_evict_total\",\n                    \"Number of cache entry evicted by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n                evict_num_bytes: new_counter(\n                    \"cache_evict_bytes\",\n                    \"Number of cache entry evicted in bytes by component\",\n                    CACHE_METRICS_NAMESPACE,\n                    &labels,\n                ),\n            },\n            virtual_caches_metrics: RwLock::default(),\n        }\n    }\n\n    pub fn virtual_cache(&self, config: &CacheConfig) -> SingleCacheMetrics {\n        if let Some(virtual_cache_metrics) = 
self.virtual_caches_metrics.read().unwrap().get(config)\n        {\n            return virtual_cache_metrics.clone();\n        }\n\n        const CACHE_METRICS_NAMESPACE: &str = \"cache\";\n        let capacity = config.capacity().as_u64().to_string();\n        let policy = config.policy().to_string();\n        let labels = [\n            (\"component_name\", self.component_name.as_str()),\n            (\"capacity\", &capacity),\n            (\"policy\", &policy),\n        ];\n        let new_virtual_cache_metrics = SingleCacheMetrics {\n            in_cache_count: new_gauge(\n                \"virtual_in_cache_count\",\n                \"Count of in cache by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n            in_cache_num_bytes: new_gauge(\n                \"virtual_in_cache_num_bytes\",\n                \"Number of bytes in cache by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n            hits_num_items: new_counter(\n                \"virtual_cache_hits_total\",\n                \"Number of cache hits by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n            hits_num_bytes: new_counter(\n                \"virtual_cache_hits_bytes\",\n                \"Number of cache hits in bytes by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n            misses_num_items: new_counter(\n                \"virtual_cache_misses_total\",\n                \"Number of cache misses by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n            evict_num_items: new_counter(\n                \"virtual_cache_evict_total\",\n                \"Number of cache entry evicted by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n            evict_num_bytes: new_counter(\n            
    \"virtual_cache_evict_bytes\",\n                \"Number of cache entry evicted in bytes by component\",\n                CACHE_METRICS_NAMESPACE,\n                &labels,\n            ),\n        };\n\n        self.virtual_caches_metrics\n            .write()\n            .unwrap()\n            .entry(config.clone())\n            .or_insert(new_virtual_cache_metrics)\n            .clone()\n    }\n}\n\n/// Storage counters expose a set of storage/cache related metrics through a Prometheus\n/// endpoint.\npub static STORAGE_METRICS: Lazy<StorageMetrics> = Lazy::new(StorageMetrics::default);\n\n#[cfg(test)]\npub static CACHE_METRICS_FOR_TESTS: Lazy<CacheMetrics> =\n    Lazy::new(|| CacheMetrics::for_component(\"fortest\"));\n\npub fn object_storage_get_slice_in_flight_guards(\n    get_request_size: usize,\n) -> (GaugeGuard<'static>, GaugeGuard<'static>) {\n    let mut bytes_guard = GaugeGuard::from_gauge(\n        &crate::STORAGE_METRICS.object_storage_get_slice_in_flight_num_bytes,\n    );\n    bytes_guard.add(get_request_size as i64);\n    let mut count_guard =\n        GaugeGuard::from_gauge(&crate::STORAGE_METRICS.object_storage_get_slice_in_flight_count);\n    count_guard.add(1);\n    (bytes_guard, count_guard)\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/object_storage/azure_blob_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::num::NonZeroU32;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\nuse std::{fmt, io};\n\nuse async_trait::async_trait;\nuse azure_core::error::ErrorKind;\nuse azure_core::{Pageable, StatusCode};\nuse azure_storage::Error as AzureError;\nuse azure_storage::prelude::*;\nuse azure_storage_blobs::blob::operations::GetBlobResponse;\nuse azure_storage_blobs::prelude::*;\nuse bytes::Bytes;\nuse futures::io::Error as FutureError;\nuse futures::stream::{StreamExt, TryStreamExt};\nuse md5::Digest;\nuse once_cell::sync::OnceCell;\nuse quickwit_common::retry::{RetryParams, Retryable, retry};\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{chunk_range, ignore_error_kind, into_u64_range};\nuse quickwit_config::{AzureStorageConfig, StorageBackend};\nuse regex::Regex;\nuse tantivy::directory::OwnedBytes;\nuse thiserror::Error;\nuse tokio::io::{AsyncRead, AsyncWriteExt, BufReader};\nuse tokio_util::compat::FuturesAsyncReadCompatExt;\nuse tokio_util::io::StreamReader;\nuse tracing::{instrument, warn};\n\nuse crate::debouncer::DebouncedStorage;\nuse crate::metrics::object_storage_get_slice_in_flight_guards;\nuse crate::storage::SendableAsync;\nuse crate::{\n    BulkDeleteError, DeleteFailure, MultiPartPolicy, PutPayload, STORAGE_METRICS, Storage,\n    StorageError, StorageErrorKind, 
StorageFactory, StorageResolverError, StorageResult,\n};\n\n/// Azure object storage resolver.\npub struct AzureBlobStorageFactory {\n    storage_config: AzureStorageConfig,\n}\n\nimpl AzureBlobStorageFactory {\n    /// Creates a new Azure blob storage factory.\n    pub fn new(storage_config: AzureStorageConfig) -> Self {\n        Self { storage_config }\n    }\n}\n\n#[async_trait]\nimpl StorageFactory for AzureBlobStorageFactory {\n    fn backend(&self) -> StorageBackend {\n        StorageBackend::Azure\n    }\n\n    async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError> {\n        let storage = AzureBlobStorage::from_uri(&self.storage_config, uri)?;\n        Ok(Arc::new(DebouncedStorage::new(storage)))\n    }\n}\n\n/// Azure object storage implementation\npub struct AzureBlobStorage {\n    container_client: ContainerClient,\n    uri: Uri,\n    prefix: PathBuf,\n    multipart_policy: MultiPartPolicy,\n    retry_params: RetryParams,\n}\n\nimpl fmt::Debug for AzureBlobStorage {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"AzureBlobStorage\")\n            .field(\"uri\", &self.uri)\n            .field(\"prefix\", &self.prefix)\n            .finish()\n    }\n}\n\nimpl AzureBlobStorage {\n    /// Creates a new [`AzureBlobStorage`] instance.\n    pub fn new(\n        storage_account_name: String,\n        storage_credentials: StorageCredentials,\n        uri: Uri,\n        container_name: String,\n    ) -> Self {\n        let container_client = BlobServiceClient::new(storage_account_name, storage_credentials)\n            .container_client(container_name);\n        Self {\n            container_client,\n            uri,\n            prefix: PathBuf::new(),\n            multipart_policy: MultiPartPolicy {\n                // Azure max part size is 100MB\n                // https://azure.microsoft.com/en-us/blog/general-availability-larger-block-blobs-in-azure-storage/\n                
target_part_num_bytes: 100_000_000,\n                multipart_threshold_num_bytes: 100_000_000,\n                max_num_parts: 50_000, // Azure allows up to 50,000 blocks\n                max_object_num_bytes: 4_770_000_000_000u64, // Azure allows up to 4.77TB objects\n                max_concurrent_uploads: 100,\n            },\n            retry_params: RetryParams::aggressive(),\n        }\n    }\n\n    /// Sets the prefix path.\n    ///\n    /// The existing prefix is overwritten.\n    pub fn with_prefix(self, prefix: PathBuf) -> Self {\n        Self {\n            container_client: self.container_client,\n            uri: self.uri,\n            prefix,\n            multipart_policy: self.multipart_policy,\n            retry_params: self.retry_params,\n        }\n    }\n\n    /// Creates an emulated storage for testing.\n    #[cfg(feature = \"integration-testsuite\")]\n    pub fn new_emulated(container: &str) -> Self {\n        use std::str::FromStr;\n\n        let container_client = ClientBuilder::emulator().container_client(container);\n        let uri = Uri::from_str(&format!(\"azure://tester/{container}\")).unwrap();\n\n        Self {\n            container_client,\n            uri,\n            prefix: PathBuf::new(),\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params: RetryParams::no_retries(),\n        }\n    }\n\n    /// Sets the multipart policy.\n    ///\n    /// See `MultiPartPolicy`.\n    #[cfg(feature = \"integration-testsuite\")]\n    pub fn set_policy(&mut self, multipart_policy: MultiPartPolicy) {\n        self.multipart_policy = multipart_policy;\n    }\n\n    /// Builds instance from URI.\n    pub fn from_uri(\n        azure_storage_config: &AzureStorageConfig,\n        uri: &Uri,\n    ) -> Result<AzureBlobStorage, StorageResolverError> {\n        let storage_account_name =\n            azure_storage_config.resolve_account_name().ok_or_else(|| {\n                let message = format!(\n                    
\"could not find Azure storage account name in environment variable `{}` or \\\n                     storage config\",\n                    AzureStorageConfig::AZURE_STORAGE_ACCOUNT_ENV_VAR\n                );\n                StorageResolverError::InvalidConfig(message)\n            })?;\n        let storage_credentials = if let Some(access_key) =\n            azure_storage_config.resolve_access_key()\n        {\n            StorageCredentials::access_key(storage_account_name.clone(), access_key)\n        } else if let Ok(credential) = azure_identity::create_credential() {\n            StorageCredentials::token_credential(credential)\n        } else {\n            return Err(StorageResolverError::InvalidConfig(\n                \"could not find Azure storage account credentials using the following credential \\\n                 providers: environment, managed identity, and storage account access key\"\n                    .to_string(),\n            ));\n        };\n        let (container_name, prefix) = parse_azure_uri(uri).ok_or_else(|| {\n            let message = format!(\"failed to extract container name from Azure URI `{uri}`\");\n            StorageResolverError::InvalidUri(message)\n        })?;\n        let azure_blob_storage = AzureBlobStorage::new(\n            storage_account_name,\n            storage_credentials,\n            uri.clone(),\n            container_name,\n        );\n        Ok(azure_blob_storage.with_prefix(prefix))\n    }\n\n    /// Returns the blob name (a.k.a blob key).\n    fn blob_name(&self, relative_path: &Path) -> String {\n        let key_path = self.prefix.join(relative_path);\n        key_path.to_string_lossy().to_string()\n    }\n\n    /// Downloads a blob as vector of bytes.\n    async fn get_to_vec(\n        &self,\n        path: &Path,\n        range_opt: Option<Range<usize>>,\n    ) -> StorageResult<Vec<u8>> {\n        let name = self.blob_name(path);\n        let capacity = 
range_opt.as_ref().map(Range::len).unwrap_or(0);\n        retry(&self.retry_params, || async {\n            let (mut response_stream, _in_flight_guards) = if let Some(range) = range_opt.as_ref() {\n                let stream = self\n                    .container_client\n                    .blob_client(&name)\n                    .get()\n                    .range(range.clone())\n                    .into_stream();\n                // only record ranged get request as being in flight\n                let in_flight_guards = object_storage_get_slice_in_flight_guards(capacity);\n                (stream, Some(in_flight_guards))\n            } else {\n                let stream = self.container_client.blob_client(&name).get().into_stream();\n                (stream, None)\n            };\n            let mut buf: Vec<u8> = Vec::with_capacity(capacity);\n            download_all(&mut response_stream, &mut buf).await?;\n\n            Result::<_, AzureErrorWrapper>::Ok(buf)\n        })\n        .await\n        .map_err(StorageError::from)\n    }\n\n    /// Performs a single part upload.\n    async fn put_single_part<'a>(\n        &'a self,\n        name: &'a str,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> StorageResult<()> {\n        crate::STORAGE_METRICS.object_storage_put_parts.inc();\n        crate::STORAGE_METRICS\n            .object_storage_upload_num_bytes\n            .inc_by(payload.len());\n        retry(&self.retry_params, || async {\n            let data = Bytes::from(payload.read_all().await?.to_vec());\n            let hash = azure_storage_blobs::prelude::Hash::from(md5::compute(&data[..]).0);\n            self.container_client\n                .blob_client(name)\n                .put_block_blob(data)\n                .hash(hash)\n                .into_future()\n                .await?;\n            Result::<(), AzureErrorWrapper>::Ok(())\n        })\n        .await?;\n        Ok(())\n    }\n\n    /// Performs a multipart upload.\n    async fn 
put_multi_part<'a>(\n        &'a self,\n        name: &'a str,\n        payload: Box<dyn PutPayload>,\n        part_len: u64,\n        total_len: u64,\n    ) -> StorageResult<()> {\n        assert!(total_len > 0);\n        let multipart_ranges =\n            chunk_range(0..total_len as usize, part_len as usize).map(into_u64_range);\n\n        let blob_client = self.container_client.blob_client(name);\n        let upload_blocks_stream = tokio_stream::iter(multipart_ranges.enumerate())\n            .map(|(num, range)| {\n                let moved_blob_client = blob_client.clone();\n                let moved_payload = payload.clone();\n                crate::STORAGE_METRICS.object_storage_put_parts.inc();\n                crate::STORAGE_METRICS\n                    .object_storage_upload_num_bytes\n                    .inc_by(range.end - range.start);\n                async move {\n                    retry(&self.retry_params, || async {\n                        // zero pad block ids to make them sortable as strings\n                        let block_id = format!(\"block:{:05}\", num);\n                        let (data, hash_digest) =\n                            extract_range_data_and_hash(moved_payload.box_clone(), range.clone())\n                                .await?;\n                        let hash = azure_storage_blobs::prelude::Hash::from(hash_digest.0);\n                        moved_blob_client\n                            .put_block(block_id.clone(), data)\n                            .hash(hash)\n                            .into_future()\n                            .await?;\n                        Result::<_, AzureErrorWrapper>::Ok(block_id)\n                    })\n                    .await\n                }\n            })\n            .buffer_unordered(self.multipart_policy.max_concurrent_uploads());\n\n        // Collect and sort block ids to preserve part order for put_block_list.\n        // Azure docs: \"The put block list operation enforces 
the order in which blocks\n        // are to be combined to create a blob\".\n        // https://docs.microsoft.com/en-us/rest/api/storageservices/put-block-list\n        let mut block_ids: Vec<String> = upload_blocks_stream\n            .try_collect()\n            .await\n            .map_err(StorageError::from)?;\n        block_ids.sort_unstable();\n\n        let block_list = BlockList {\n            blocks: block_ids\n                .into_iter()\n                .map(BlobBlockType::new_uncommitted)\n                .collect(),\n        };\n\n        // Commit all uploaded blocks.\n        blob_client\n            .put_block_list(block_list)\n            .into_future()\n            .await\n            .map_err(AzureErrorWrapper::from)?;\n\n        Ok(())\n    }\n}\n\n#[async_trait]\nimpl Storage for AzureBlobStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        if let Some(first_blob_result) = self\n            .container_client\n            .list_blobs()\n            .max_results(NonZeroU32::new(1u32).expect(\"1 is always non-zero.\"))\n            .into_stream()\n            .next()\n            .await\n        {\n            let _ = first_blob_result?;\n        }\n        Ok(())\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        crate::STORAGE_METRICS.object_storage_put_total.inc();\n        let name = self.blob_name(path);\n        let total_len = payload.len();\n        let part_num_bytes = self.multipart_policy.part_num_bytes(total_len);\n\n        if part_num_bytes >= total_len {\n            self.put_single_part(&name, payload).await?;\n        } else {\n            self.put_multi_part(&name, payload, part_num_bytes, total_len)\n                .await?;\n        }\n        Ok(())\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        let name = 
self.blob_name(path);\n        let mut output_stream = self.container_client.blob_client(name).get().into_stream();\n\n        while let Some(chunk_result) = output_stream.next().await {\n            let chunk_response = chunk_result.map_err(AzureErrorWrapper::from)?;\n            let chunk_response_body_stream = chunk_response\n                .data\n                .map_err(FutureError::other)\n                .into_async_read()\n                .compat();\n            let mut body_stream_reader = BufReader::new(chunk_response_body_stream);\n            let num_bytes_copied = tokio::io::copy_buf(&mut body_stream_reader, output).await?;\n            STORAGE_METRICS\n                .object_storage_download_num_bytes\n                .inc_by(num_bytes_copied);\n        }\n        output.flush().await?;\n        Ok(())\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        let blob_name = self.blob_name(path);\n        let delete_res: Result<_, StorageError> = self\n            .container_client\n            .blob_client(blob_name)\n            .delete()\n            .into_future()\n            .await\n            .map_err(|err| AzureErrorWrapper::from(err).into());\n        ignore_error_kind!(StorageErrorKind::NotFound, delete_res)?;\n        Ok(())\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        // See https://github.com/Azure/azure-sdk-for-rust/issues/1068\n        warn!(\n            num_files = paths.len(),\n            \"`AzureBlobStorage` does not support batch delete. 
Falling back to sequential delete, \\\n             which might be slow and issue many requests.\"\n        );\n        let mut successes = Vec::with_capacity(paths.len());\n        let mut failures = HashMap::new();\n\n        let futures = paths\n            .iter()\n            .map(|path| async move {\n                let delete_res = self.delete(path).await;\n                (path, delete_res)\n            })\n            .collect::<Vec<_>>();\n        let mut stream = futures::stream::iter(futures).buffer_unordered(100);\n\n        while let Some((path, delete_res)) = stream.next().await {\n            match delete_res {\n                Ok(_) => successes.push(path.to_path_buf()),\n                Err(error) => {\n                    let failure = DeleteFailure {\n                        error: Some(error),\n                        ..Default::default()\n                    };\n                    failures.insert(path.to_path_buf(), failure);\n                }\n            };\n        }\n        if failures.is_empty() {\n            Ok(())\n        } else {\n            Err(BulkDeleteError {\n                successes,\n                failures,\n                ..Default::default()\n            })\n        }\n    }\n\n    #[instrument(level = \"debug\", skip(self, range), fields(range.start = range.start, range.end = range.end))]\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        self.get_to_vec(path, Some(range.clone()))\n            .await\n            .map(OwnedBytes::new)\n            .map_err(|err| {\n                err.add_context(format!(\n                    \"failed to fetch slice {:?} for object: {}/{}\",\n                    range,\n                    self.uri,\n                    path.display(),\n                ))\n            })\n    }\n\n    #[instrument(level = \"debug\", skip(self, range), fields(range.start = range.start, range.end = range.end))]\n    async fn get_slice_stream(\n     
   &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        retry(&self.retry_params, || async {\n            let range = range.clone();\n            let name = self.blob_name(path);\n            let page_stream = self\n                .container_client\n                .blob_client(name)\n                .get()\n                .range(range)\n                .into_stream();\n            let mut bytes_stream = page_stream\n                .map(|page_res| page_res.map(|page| page.data).map_err(FutureError::other))\n                .try_flatten()\n                .map(|bytes_res| bytes_res.map_err(FutureError::other));\n            // Peek into the stream so that any early error can be retried\n            let first_chunk = bytes_stream.next().await;\n            let reader: Box<dyn AsyncRead + Send + Unpin> = if let Some(res) = first_chunk {\n                let first_chunk = res.map_err(AzureErrorWrapper::from)?;\n                let reconstructed_stream =\n                    Box::pin(futures::stream::once(async { Ok(first_chunk) }).chain(bytes_stream));\n                Box::new(StreamReader::new(reconstructed_stream))\n            } else {\n                Box::new(tokio::io::empty())\n            };\n            Result::<Box<dyn AsyncRead + Send + Unpin>, AzureErrorWrapper>::Ok(reader)\n        })\n        .await\n        .map_err(|e| e.into())\n    }\n\n    #[instrument(level = \"debug\", skip(self), fields(fetched_bytes_len))]\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        let data = self\n            .get_to_vec(path, None)\n            .await\n            .map(OwnedBytes::new)\n            .map_err(|err| {\n                err.add_context(format!(\n                    \"failed to fetch object: {}/{}\",\n                    self.uri,\n                    path.display()\n                ))\n            })?;\n        
tracing::Span::current().record(\"fetched_bytes_len\", data.len());\n        Ok(data)\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        let name = self.blob_name(path);\n        let properties_result = self\n            .container_client\n            .blob_client(name)\n            .get_properties()\n            .into_future()\n            .await;\n        match properties_result {\n            Ok(response) => Ok(response.blob.properties.content_length),\n            Err(err) => Err(StorageError::from(AzureErrorWrapper::from(err))),\n        }\n    }\n\n    fn uri(&self) -> &Uri {\n        &self.uri\n    }\n}\n\n/// Copy range of payload into `Bytes` and return the computed md5.\nasync fn extract_range_data_and_hash(\n    payload: Box<dyn PutPayload>,\n    range: Range<u64>,\n) -> io::Result<(Bytes, Digest)> {\n    let mut reader = payload\n        .range_byte_stream(range.clone())\n        .await?\n        .into_async_read();\n    let mut buf: Vec<u8> = Vec::with_capacity(range.count());\n    tokio::io::copy(&mut reader, &mut buf).await?;\n    let data = Bytes::from(buf);\n    let hash = md5::compute(&data[..]);\n    Ok((data, hash))\n}\n\npub fn parse_azure_uri(uri: &Uri) -> Option<(String, PathBuf)> {\n    // Ex: azure://container/prefix.\n    static URI_PTN: OnceCell<Regex> = OnceCell::new();\n\n    let captures = URI_PTN\n        .get_or_init(|| {\n            Regex::new(r\"azure(\\+[^:]+)?://(?P<container>[^/]+)(/(?P<prefix>.+))?\")\n                .expect(\"The regular expression should compile.\")\n        })\n        .captures(uri.as_str())?;\n\n    let container = captures.name(\"container\")?.as_str().to_string();\n    let prefix = captures\n        .name(\"prefix\")\n        .map(|prefix_match| PathBuf::from(prefix_match.as_str()))\n        .unwrap_or_default();\n    Some((container, prefix))\n}\n\n/// Collect a download stream into an output buffer.\nasync fn download_all(\n    chunk_stream: &mut 
Pageable<GetBlobResponse, AzureError>,\n    output: &mut Vec<u8>,\n) -> Result<(), AzureErrorWrapper> {\n    output.clear();\n    while let Some(chunk_result) = chunk_stream.next().await {\n        let chunk_response = chunk_result?;\n        let chunk_response_body_stream = chunk_response\n            .data\n            .map_err(FutureError::other)\n            .into_async_read()\n            .compat();\n        let mut body_stream_reader = BufReader::new(chunk_response_body_stream);\n        let num_bytes_copied = tokio::io::copy_buf(&mut body_stream_reader, output).await?;\n        crate::STORAGE_METRICS\n            .object_storage_download_num_bytes\n            .inc_by(num_bytes_copied);\n    }\n    // When calling `get_all`, the Vec capacity is not properly set.\n    output.shrink_to_fit();\n    Ok(())\n}\n\n#[derive(Error, Debug)]\n#[error(\"Azure error wrapper(inner={inner})\")]\nstruct AzureErrorWrapper {\n    inner: AzureError,\n}\n\nimpl Retryable for AzureErrorWrapper {\n    fn is_retryable(&self) -> bool {\n        match self.inner.kind() {\n            ErrorKind::HttpResponse { status, .. } => !matches!(\n                status,\n                StatusCode::NotFound\n                    | StatusCode::Unauthorized\n                    | StatusCode::BadRequest\n                    | StatusCode::Forbidden\n            ),\n            ErrorKind::Io => true,\n            _ => false,\n        }\n    }\n}\n\nimpl From<AzureError> for AzureErrorWrapper {\n    fn from(err: AzureError) -> Self {\n        AzureErrorWrapper { inner: err }\n    }\n}\n\nimpl From<io::Error> for AzureErrorWrapper {\n    fn from(err: io::Error) -> Self {\n        AzureErrorWrapper {\n            inner: AzureError::new(ErrorKind::Io, err),\n        }\n    }\n}\n\nimpl From<AzureErrorWrapper> for StorageError {\n    fn from(err: AzureErrorWrapper) -> Self {\n        match err.inner.kind() {\n            ErrorKind::HttpResponse { status, .. 
} => match status {\n                StatusCode::NotFound => StorageErrorKind::NotFound.with_error(err),\n                _ => StorageErrorKind::Service.with_error(err),\n            },\n            ErrorKind::Io => StorageErrorKind::Io.with_error(err),\n            ErrorKind::Credential => StorageErrorKind::Unauthorized.with_error(err),\n            _ => StorageErrorKind::Internal.with_error(err),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_common::uri::Uri;\n\n    use crate::object_storage::azure_blob_storage::parse_azure_uri;\n\n    #[test]\n    fn test_parse_azure_uri() {\n        assert!(parse_azure_uri(&Uri::for_test(\"azure://\")).is_none());\n\n        let (container, prefix) =\n            parse_azure_uri(&Uri::for_test(\"azure://test-container\")).unwrap();\n        assert_eq!(container, \"test-container\");\n        assert!(prefix.to_str().unwrap().is_empty());\n\n        let (container, prefix) =\n            parse_azure_uri(&Uri::for_test(\"azure://test-container/\")).unwrap();\n        assert_eq!(container, \"test-container\");\n        assert!(prefix.to_str().unwrap().is_empty());\n\n        let (container, prefix) =\n            parse_azure_uri(&Uri::for_test(\"azure://test-container/indexes\")).unwrap();\n        assert_eq!(container, \"test-container\");\n        assert_eq!(prefix.to_str().unwrap(), \"indexes\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/object_storage/error.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse aws_sdk_s3::error::{DisplayErrorContext, ProvideErrorMetadata, SdkError};\nuse aws_sdk_s3::operation::abort_multipart_upload::AbortMultipartUploadError;\nuse aws_sdk_s3::operation::complete_multipart_upload::CompleteMultipartUploadError;\nuse aws_sdk_s3::operation::create_multipart_upload::CreateMultipartUploadError;\nuse aws_sdk_s3::operation::delete_object::DeleteObjectError;\nuse aws_sdk_s3::operation::delete_objects::DeleteObjectsError;\nuse aws_sdk_s3::operation::get_object::GetObjectError;\nuse aws_sdk_s3::operation::head_object::HeadObjectError;\nuse aws_sdk_s3::operation::put_object::PutObjectError;\nuse aws_sdk_s3::operation::upload_part::UploadPartError;\n\nuse crate::{StorageError, StorageErrorKind};\n\nimpl<E> From<SdkError<E>> for StorageError\nwhere E: std::error::Error + ToStorageErrorKind + Send + Sync + 'static\n{\n    fn from(error: SdkError<E>) -> StorageError {\n        let error_kind = match &error {\n            SdkError::ConstructionFailure(_) => StorageErrorKind::Internal,\n            SdkError::DispatchFailure(failure) => {\n                if failure.is_io() {\n                    StorageErrorKind::Io\n                } else if failure.is_timeout() {\n                    StorageErrorKind::Timeout\n                } else {\n                    StorageErrorKind::Internal\n                }\n            }\n            
SdkError::ResponseError(response_error) => {\n                match response_error.raw().status().as_u16() {\n                    404 /* NOT_FOUND */ => StorageErrorKind::NotFound,\n                    403 /* FORBIDDEN */ => StorageErrorKind::Unauthorized,\n                    _ => StorageErrorKind::Internal,\n                }\n            }\n            SdkError::ServiceError(service_error) => service_error.err().to_storage_error_kind(),\n            SdkError::TimeoutError(_) => StorageErrorKind::Timeout,\n            _ => StorageErrorKind::Internal,\n        };\n        let source = anyhow::anyhow!(\"{}\", DisplayErrorContext(error));\n        error_kind.with_error(source)\n    }\n}\n\npub trait ToStorageErrorKind {\n    fn to_storage_error_kind(&self) -> StorageErrorKind;\n}\n\nimpl ToStorageErrorKind for GetObjectError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        let error_code = self.code().unwrap_or(\"unknown\");\n        crate::STORAGE_METRICS\n            .object_storage_get_errors_total\n            .with_label_values([error_code])\n            .inc();\n        match self {\n            GetObjectError::InvalidObjectState(_) => StorageErrorKind::Service,\n            GetObjectError::NoSuchKey(_) => StorageErrorKind::NotFound,\n            _ => StorageErrorKind::Service,\n        }\n    }\n}\n\nimpl ToStorageErrorKind for DeleteObjectError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        StorageErrorKind::Service\n    }\n}\n\nimpl ToStorageErrorKind for DeleteObjectsError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        StorageErrorKind::Service\n    }\n}\n\nimpl ToStorageErrorKind for UploadPartError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        StorageErrorKind::Service\n    }\n}\n\nimpl ToStorageErrorKind for CompleteMultipartUploadError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        StorageErrorKind::Service\n    }\n}\n\nimpl 
ToStorageErrorKind for AbortMultipartUploadError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        match self {\n            AbortMultipartUploadError::NoSuchUpload(_) => StorageErrorKind::Internal,\n            _ => StorageErrorKind::Service,\n        }\n    }\n}\n\nimpl ToStorageErrorKind for CreateMultipartUploadError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        StorageErrorKind::Service\n    }\n}\n\nimpl ToStorageErrorKind for PutObjectError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        StorageErrorKind::Service\n    }\n}\n\nimpl ToStorageErrorKind for HeadObjectError {\n    fn to_storage_error_kind(&self) -> StorageErrorKind {\n        match self {\n            HeadObjectError::NotFound(_) => StorageErrorKind::NotFound,\n            _ => StorageErrorKind::Service,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/object_storage/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod error;\n\nmod s3_compatible_storage;\npub use self::s3_compatible_storage::S3CompatibleObjectStorage;\npub use self::s3_compatible_storage_resolver::S3CompatibleObjectStorageFactory;\n\nmod policy;\npub use crate::object_storage::policy::MultiPartPolicy;\n\nmod s3_compatible_storage_resolver;\n\n#[cfg(feature = \"azure\")]\nmod azure_blob_storage;\n#[cfg(feature = \"azure\")]\npub use self::azure_blob_storage::{AzureBlobStorage, AzureBlobStorageFactory};\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/object_storage/policy.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n/// The multipart policy defines when and how multipart upload / download should happen.\n///\n/// The right settings might be vendor specific, but if not available the default values\n/// should be safe.\npub struct MultiPartPolicy {\n    /// Ideal part size.\n    /// Since S3 has a constraint on the number of parts, it cannot always be\n    /// respected.\n    pub target_part_num_bytes: usize,\n    /// Maximum number of parts allowed.\n    pub max_num_parts: usize,\n    /// Threshold above which multipart is triggered.\n    pub multipart_threshold_num_bytes: u64,\n    /// Maximum size allowed for an object.\n    pub max_object_num_bytes: u64,\n    /// Maximum number of parts to be uploaded concurrently.\n    pub max_concurrent_uploads: usize,\n}\n\nimpl MultiPartPolicy {\n    /// This function returns the size of the part that should\n    /// be used. 
We should have `part_num_bytes(len)` <= `len`.\n    ///\n    /// If this function returns `len`, then multipart upload\n    /// will not be used.\n    pub fn part_num_bytes(&self, len: u64) -> u64 {\n        assert!(\n            len < self.max_object_num_bytes,\n            \"This object storage does not support object of that size {}\",\n            self.max_object_num_bytes\n        );\n        assert!(\n            self.max_num_parts > 0,\n            \"Misconfiguration: max_num_parts == 0 makes no sense.\"\n        );\n        if len < self.multipart_threshold_num_bytes || self.max_num_parts == 1 {\n            return len;\n        }\n        let max_num_parts = self.max_num_parts as u64;\n        // `min_part_len` is the smallest integer such that\n        // <max_num_parts> * <min_part_len> >= len.\n        let min_part_len = 1u64 + (len - 1u64) / max_num_parts;\n        (min_part_len).max(self.target_part_num_bytes as u64)\n    }\n\n    /// Limits the number of parts that can be concurrently uploaded.\n    pub fn max_concurrent_uploads(&self) -> usize {\n        self.max_concurrent_uploads\n    }\n}\n\n// The best default value may differ depending on vendors.\nimpl Default for MultiPartPolicy {\n    fn default() -> Self {\n        MultiPartPolicy {\n            // S3 limits part size from 5M to 5GB, we want to end up with as few parts as possible\n            // since each part is charged as a put request.\n            target_part_num_bytes: 5_000_000_000, // 5GB\n            multipart_threshold_num_bytes: 128 * 1_024 * 1_024, // 128 MiB\n            max_num_parts: 10_000,\n            max_object_num_bytes: 5_000_000_000_000u64, // S3 allows up to 5TB objects\n            max_concurrent_uploads: 100,\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/object_storage/s3_compatible_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::pin::Pin;\nuse std::task::{Context, Poll};\nuse std::{fmt, io};\n\nuse anyhow::{Context as AnyhhowContext, anyhow};\nuse async_trait::async_trait;\nuse aws_credential_types::provider::SharedCredentialsProvider;\nuse aws_sdk_s3::Client as S3Client;\nuse aws_sdk_s3::config::{Credentials, Region};\nuse aws_sdk_s3::error::{ProvideErrorMetadata, SdkError};\nuse aws_sdk_s3::operation::delete_objects::DeleteObjectsOutput;\nuse aws_sdk_s3::operation::get_object::{GetObjectError, GetObjectOutput};\nuse aws_sdk_s3::primitives::ByteStream;\nuse aws_sdk_s3::types::builders::ObjectIdentifierBuilder;\nuse aws_sdk_s3::types::{CompletedMultipartUpload, CompletedPart, Delete, ObjectIdentifier};\nuse base64::prelude::{BASE64_STANDARD, Engine};\nuse futures::{StreamExt, stream};\nuse once_cell::sync::{Lazy, OnceCell};\nuse quickwit_aws::retry::{AwsRetryable, aws_retry};\nuse quickwit_aws::{aws_behavior_version, get_aws_config};\nuse quickwit_common::retry::{Retry, RetryParams};\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{chunk_range, into_u64_range};\nuse quickwit_config::S3StorageConfig;\nuse regex::Regex;\nuse tokio::io::{AsyncRead, AsyncReadExt, AsyncWriteExt, BufReader, ReadBuf};\nuse tokio::sync::Semaphore;\nuse tracing::{info, instrument, warn};\n\nuse 
crate::metrics::object_storage_get_slice_in_flight_guards;\nuse crate::object_storage::MultiPartPolicy;\nuse crate::storage::SendableAsync;\nuse crate::{\n    BulkDeleteError, DeleteFailure, OwnedBytes, STORAGE_METRICS, Storage, StorageError,\n    StorageErrorKind, StorageResolverError, StorageResult,\n};\n\n/// Semaphore to limit the number of concurrent requests to the object store. Some object stores\n/// (R2, SeaweedFs...) return errors when too many concurrent requests are emitted.\nstatic REQUEST_SEMAPHORE: Lazy<Semaphore> = Lazy::new(|| {\n    let num_permits: usize =\n        quickwit_common::get_from_env(\"QW_S3_MAX_CONCURRENCY\", 10_000usize, false);\n    Semaphore::new(num_permits)\n});\n\n/// Wrap the async read handle together with a permit to keep the permit alive\n/// until the handle is dropped\nstruct S3AsyncRead<T: AsyncRead + Send + Unpin> {\n    pub read: T,\n    pub _permit: Result<tokio::sync::SemaphorePermit<'static>, tokio::sync::AcquireError>,\n}\n\nimpl<T: AsyncRead + Send + Unpin> AsyncRead for S3AsyncRead<T> {\n    fn poll_read(\n        self: Pin<&mut Self>,\n        cx: &mut Context<'_>,\n        buf: &mut ReadBuf<'_>,\n    ) -> Poll<io::Result<()>> {\n        let self_unpin = self.get_mut();\n        Pin::new(&mut self_unpin.read).poll_read(cx, buf)\n    }\n}\n\n/// S3-compatible object storage implementation.\npub struct S3CompatibleObjectStorage {\n    s3_client: S3Client,\n    uri: Uri,\n    bucket: String,\n    prefix: PathBuf,\n    multipart_policy: MultiPartPolicy,\n    retry_params: RetryParams,\n    disable_multi_object_delete: bool,\n    disable_multipart_upload: bool,\n}\n\nimpl fmt::Debug for S3CompatibleObjectStorage {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"S3CompatibleObjectStorage\")\n            .field(\"bucket\", &self.bucket)\n            .field(\"prefix\", &self.prefix)\n            .finish()\n    }\n}\n\nfn get_credentials_provider(\n    
s3_storage_config: &S3StorageConfig,\n) -> Option<SharedCredentialsProvider> {\n    match (\n        &s3_storage_config.access_key_id,\n        &s3_storage_config.secret_access_key,\n    ) {\n        (Some(access_key_id), Some(secret_access_key)) => {\n            info!(\"using S3 credentials defined in storage config\");\n            let credentials = Credentials::from_keys(access_key_id, secret_access_key, None);\n            let credentials_provider = SharedCredentialsProvider::new(credentials);\n            Some(credentials_provider)\n        }\n        _ => None,\n    }\n}\n\nfn get_region(s3_storage_config: &S3StorageConfig) -> Option<Region> {\n    s3_storage_config.region.clone().map(|region| {\n        info!(region=%region, \"using S3 region defined in storage config\");\n        Region::new(region)\n    })\n}\n\npub async fn create_s3_client(s3_storage_config: &S3StorageConfig) -> S3Client {\n    let aws_config = get_aws_config().await;\n    let credentials_provider =\n        get_credentials_provider(s3_storage_config).or(aws_config.credentials_provider());\n    let region = get_region(s3_storage_config).or(aws_config.region().cloned());\n    let mut s3_config = aws_sdk_s3::Config::builder()\n        .behavior_version(aws_behavior_version())\n        .region(region);\n\n    if let Some(identity_cache) = aws_config.identity_cache() {\n        s3_config.set_identity_cache(identity_cache);\n    }\n    s3_config.set_credentials_provider(credentials_provider);\n    s3_config.set_force_path_style(s3_storage_config.force_path_style_access());\n    s3_config.set_http_client(aws_config.http_client());\n    s3_config.set_retry_config(aws_config.retry_config().cloned());\n    s3_config.set_sleep_impl(aws_config.sleep_impl());\n    s3_config.set_stalled_stream_protection(aws_config.stalled_stream_protection());\n    s3_config.set_timeout_config(aws_config.timeout_config().cloned());\n\n    if let Some(endpoint) = s3_storage_config.endpoint() {\n        
info!(endpoint=%endpoint, \"using S3 endpoint defined in storage config or environment variable\");\n        s3_config.set_endpoint_url(Some(endpoint));\n    }\n    S3Client::from_conf(s3_config.build())\n}\n\nimpl S3CompatibleObjectStorage {\n    /// Creates an object storage given an S3 storage config and a URI.\n    pub async fn from_uri(\n        s3_storage_config: &S3StorageConfig,\n        uri: &Uri,\n    ) -> Result<Self, StorageResolverError> {\n        let s3_client = create_s3_client(s3_storage_config).await;\n        Self::from_uri_and_client(s3_storage_config, uri, s3_client).await\n    }\n\n    /// Creates an object storage given an S3 storage config, a URI and an S3 client.\n    pub async fn from_uri_and_client(\n        s3_storage_config: &S3StorageConfig,\n        uri: &Uri,\n        s3_client: S3Client,\n    ) -> Result<Self, StorageResolverError> {\n        let (bucket, prefix) = parse_s3_uri(uri).ok_or_else(|| {\n            let message = format!(\"failed to extract bucket name from S3 URI: {uri}\");\n            StorageResolverError::InvalidUri(message)\n        })?;\n        let retry_params = RetryParams::aggressive();\n        let disable_multi_object_delete = s3_storage_config.disable_multi_object_delete;\n        let disable_multipart_upload = s3_storage_config.disable_multipart_upload;\n        Ok(Self {\n            s3_client,\n            uri: uri.clone(),\n            bucket,\n            prefix,\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params,\n            disable_multi_object_delete,\n            disable_multipart_upload,\n        })\n    }\n\n    /// Sets a specific prefix for all buckets.\n    ///\n    /// This method overrides any existing prefix. 
(It does NOT\n    /// append the argument to any existing prefix.)\n    pub fn with_prefix(self, prefix: PathBuf) -> Self {\n        Self {\n            s3_client: self.s3_client,\n            uri: self.uri,\n            bucket: self.bucket,\n            prefix,\n            multipart_policy: self.multipart_policy,\n            retry_params: self.retry_params,\n            disable_multi_object_delete: self.disable_multi_object_delete,\n            disable_multipart_upload: self.disable_multipart_upload,\n        }\n    }\n\n    /// Sets the multipart policy.\n    ///\n    /// See `MultiPartPolicy`.\n    #[cfg(feature = \"integration-testsuite\")]\n    pub fn set_policy(&mut self, multipart_policy: MultiPartPolicy) {\n        self.multipart_policy = multipart_policy;\n    }\n}\n\npub fn parse_s3_uri(uri: &Uri) -> Option<(String, PathBuf)> {\n    static S3_URI_PTN: OnceCell<Regex> = OnceCell::new();\n\n    let captures = S3_URI_PTN\n        .get_or_init(|| {\n            // s3://bucket/path/to/object\n            Regex::new(r\"s3(\\+[^:]+)?://(?P<bucket>[^/]+)(/(?P<prefix>.+))?\")\n                .expect(\"The regular expression should compile.\")\n        })\n        .captures(uri.as_str())?;\n\n    let bucket = captures.name(\"bucket\")?.as_str().to_string();\n    let prefix = captures\n        .name(\"prefix\")\n        .map(|prefix_match| PathBuf::from(prefix_match.as_str()))\n        .unwrap_or_default();\n    Some((bucket, prefix))\n}\n\n#[derive(Clone, Debug)]\nstruct MultipartUploadId(pub String);\n\n#[derive(Clone, Debug)]\nstruct Part {\n    pub part_number: usize,\n    pub range: Range<u64>,\n    pub md5: md5::Digest,\n}\n\nimpl Part {\n    fn len(&self) -> u64 {\n        self.range.end - self.range.start\n    }\n}\n\nconst MD5_CHUNK_SIZE: usize = 1_000_000;\n\nasync fn compute_md5<T: AsyncRead + std::marker::Unpin>(mut read: T) -> io::Result<md5::Digest> {\n    let mut checksum = md5::Context::new();\n    let mut buf = vec![0; MD5_CHUNK_SIZE];\n    loop 
{\n        let read_len = read.read(&mut buf).await?;\n        checksum.consume(&buf[..read_len]);\n        if read_len == 0 {\n            return Ok(checksum.finalize());\n        }\n    }\n}\n\nimpl S3CompatibleObjectStorage {\n    fn key(&self, relative_path: &Path) -> String {\n        // FIXME: This may not work on Windows.\n        let key_path = self.prefix.join(relative_path);\n        key_path.to_string_lossy().to_string()\n    }\n\n    fn relative_path(&self, key: &str) -> PathBuf {\n        // FIXME: This may not work on Windows.\n        Path::new(key)\n            .strip_prefix(&self.prefix)\n            .expect(\"The prefix should have been prepended to the key before this method call.\")\n            .to_path_buf()\n    }\n\n    async fn put_single_part_single_try<'a>(\n        &'a self,\n        bucket: &'a str,\n        key: &'a str,\n        payload: Box<dyn crate::PutPayload>,\n        len: u64,\n    ) -> Result<(), Retry<StorageError>> {\n        let body = payload\n            .byte_stream()\n            .await\n            .map_err(|io_error| Retry::Permanent(StorageError::from(io_error)))?;\n\n        crate::STORAGE_METRICS.object_storage_put_parts.inc();\n        crate::STORAGE_METRICS\n            .object_storage_upload_num_bytes\n            .inc_by(len);\n\n        self.s3_client\n            .put_object()\n            .bucket(bucket)\n            .key(key)\n            .body(body)\n            .content_length(len as i64)\n            .send()\n            .await\n            .map_err(|sdk_error| {\n                if sdk_error.is_retryable() {\n                    Retry::Transient(StorageError::from(sdk_error))\n                } else {\n                    Retry::Permanent(StorageError::from(sdk_error))\n                }\n            })?;\n        Ok(())\n    }\n\n    async fn put_single_part<'a>(\n        &'a self,\n        key: &'a str,\n        payload: Box<dyn crate::PutPayload>,\n        len: u64,\n    ) -> StorageResult<()> {\n    
    let bucket = &self.bucket;\n        aws_retry(&self.retry_params, || async {\n            self.put_single_part_single_try(bucket, key, payload.clone(), len)\n                .await\n        })\n        .await\n        .map_err(|error| error.into_inner())?;\n        Ok(())\n    }\n\n    async fn create_multipart_upload(&self, key: &str) -> StorageResult<MultipartUploadId> {\n        let upload_id = aws_retry(&self.retry_params, || async {\n            self.s3_client\n                .create_multipart_upload()\n                .bucket(self.bucket.clone())\n                .key(key)\n                .send()\n                .await\n        })\n        .await?\n        .upload_id\n        .ok_or_else(|| {\n            StorageErrorKind::Internal\n                .with_error(anyhow!(\"the returned multipart upload id was null\"))\n        })?;\n        Ok(MultipartUploadId(upload_id))\n    }\n\n    async fn create_multipart_requests(\n        &self,\n        payload: Box<dyn crate::PutPayload>,\n        len: u64,\n        part_len: u64,\n    ) -> io::Result<Vec<Part>> {\n        assert!(len > 0);\n        let multipart_ranges = chunk_range(0..len as usize, part_len as usize)\n            .map(into_u64_range)\n            .collect::<Vec<_>>();\n\n        let mut parts = Vec::with_capacity(multipart_ranges.len());\n\n        for (multipart_id, multipart_range) in multipart_ranges.into_iter().enumerate() {\n            let read = payload\n                .range_byte_stream(multipart_range.clone())\n                .await?\n                .into_async_read();\n            let md5 = compute_md5(read).await?;\n\n            let part = Part {\n                part_number: multipart_id + 1, // parts are 1-indexed\n                range: multipart_range,\n                md5,\n            };\n            parts.push(part);\n        }\n        Ok(parts)\n    }\n\n    fn build_delete_batch_requests<'a>(\n        &self,\n        delete_paths: &'a [&'a Path],\n    ) -> 
anyhow::Result<Vec<(&'a [&'a Path], Delete)>> {\n        #[cfg(test)]\n        const MAX_NUM_KEYS: usize = 3;\n\n        #[cfg(not(test))]\n        const MAX_NUM_KEYS: usize = 1_000;\n\n        let path_chunks = delete_paths.chunks(MAX_NUM_KEYS);\n        let num_delete_requests = path_chunks.len();\n        let mut delete_requests: Vec<(&[&Path], Delete)> = Vec::with_capacity(num_delete_requests);\n\n        for path_chunk in path_chunks {\n            let object_ids: Vec<ObjectIdentifier> = path_chunk\n                .iter()\n                .map(|path| {\n                    let key = self.key(path);\n                    ObjectIdentifierBuilder::default()\n                        .key(key)\n                        .build()\n                        .context(\"failed to build object identifier\")\n                })\n                .collect::<anyhow::Result<_>>()?;\n            let delete = Delete::builder()\n                .set_objects(Some(object_ids))\n                .build()\n                .context(\"failed to build delete request\")?;\n            delete_requests.push((path_chunk, delete));\n        }\n        Ok(delete_requests)\n    }\n\n    async fn upload_part<'a>(\n        &'a self,\n        upload_id: MultipartUploadId,\n        key: &'a str,\n        part: Part,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> Result<CompletedPart, Retry<StorageError>> {\n        let byte_stream = payload\n            .range_byte_stream(part.range.clone())\n            .await\n            .map_err(StorageError::from)\n            .map_err(Retry::Permanent)?;\n        let md5 = BASE64_STANDARD.encode(part.md5.0);\n\n        crate::STORAGE_METRICS.object_storage_put_parts.inc();\n        crate::STORAGE_METRICS\n            .object_storage_upload_num_bytes\n            .inc_by(part.len());\n\n        let upload_part_output = self\n            .s3_client\n            .upload_part()\n            .bucket(self.bucket.clone())\n            .key(key)\n            
.body(byte_stream)\n            .content_length(part.len() as i64)\n            .content_md5(md5)\n            .part_number(part.part_number as i32)\n            .upload_id(upload_id.0)\n            .send()\n            .await\n            .map_err(|s3_err| {\n                if s3_err.is_retryable() {\n                    Retry::Transient(StorageError::from(s3_err))\n                } else {\n                    Retry::Permanent(StorageError::from(s3_err))\n                }\n            })?;\n\n        let completed_part = CompletedPart::builder()\n            .set_e_tag(upload_part_output.e_tag)\n            .part_number(part.part_number as i32)\n            .build();\n        Ok(completed_part)\n    }\n\n    async fn put_multipart<'a>(\n        &'a self,\n        key: &'a str,\n        payload: Box<dyn crate::PutPayload>,\n        part_len: u64,\n        total_len: u64,\n    ) -> StorageResult<()> {\n        let upload_id = self.create_multipart_upload(key).await?;\n        let parts = self\n            .create_multipart_requests(payload.clone(), total_len, part_len)\n            .await?;\n        let max_concurrent_upload = self.multipart_policy.max_concurrent_uploads();\n        let completed_parts_res: StorageResult<Vec<CompletedPart>> =\n            stream::iter(parts.into_iter().map(|part| {\n                let payload = payload.clone();\n                let upload_id = upload_id.clone();\n                aws_retry(&self.retry_params, move || {\n                    self.upload_part(upload_id.clone(), key, part.clone(), payload.clone())\n                })\n            }))\n            .buffered(max_concurrent_upload)\n            .collect::<Vec<_>>()\n            .await\n            .into_iter()\n            .map(|res| res.map_err(|e| e.into_inner()))\n            .collect();\n        match completed_parts_res {\n            Ok(completed_parts) => {\n                self.complete_multipart_upload(key, completed_parts, &upload_id.0)\n                    
.await\n            }\n            Err(upload_error) => {\n                let abort_multipart_upload_res: StorageResult<()> =\n                    self.abort_multipart_upload(key, &upload_id.0).await;\n                if let Err(abort_error) = abort_multipart_upload_res {\n                    warn!(\n                        key = %key,\n                        error = ?abort_error,\n                        \"Failed to abort multipart upload.\"\n                    );\n                }\n                Err(upload_error)\n            }\n        }\n    }\n\n    async fn complete_multipart_upload(\n        &self,\n        key: &str,\n        completed_parts: Vec<CompletedPart>,\n        upload_id: &str,\n    ) -> StorageResult<()> {\n        let completed_upload = CompletedMultipartUpload::builder()\n            .set_parts(Some(completed_parts))\n            .build();\n        aws_retry(&self.retry_params, || async {\n            self.s3_client\n                .complete_multipart_upload()\n                .bucket(self.bucket.clone())\n                .key(key)\n                .multipart_upload(completed_upload.clone())\n                .upload_id(upload_id)\n                .send()\n                .await\n        })\n        .await?;\n        Ok(())\n    }\n\n    async fn abort_multipart_upload(&self, key: &str, upload_id: &str) -> StorageResult<()> {\n        aws_retry(&self.retry_params, || async {\n            self.s3_client\n                .abort_multipart_upload()\n                .bucket(self.bucket.clone())\n                .key(key)\n                .upload_id(upload_id)\n                .send()\n                .await\n        })\n        .await?;\n        Ok(())\n    }\n\n    async fn get_object(\n        &self,\n        path: &Path,\n        range_opt: Option<Range<usize>>,\n    ) -> Result<GetObjectOutput, SdkError<GetObjectError>> {\n        let key = self.key(path);\n        let range_str = range_opt.map(|range| format!(\"bytes={}-{}\", range.start, 
range.end - 1));\n\n        crate::STORAGE_METRICS.object_storage_get_total.inc();\n\n        let get_object_output = self\n            .s3_client\n            .get_object()\n            .bucket(self.bucket.clone())\n            .key(key)\n            .set_range(range_str)\n            .send()\n            .await?;\n        Ok(get_object_output)\n    }\n\n    async fn get_to_vec(\n        &self,\n        path: &Path,\n        range_opt: Option<Range<usize>>,\n    ) -> StorageResult<Vec<u8>> {\n        let cap = range_opt.as_ref().map(Range::len).unwrap_or(0);\n        let get_object_output = aws_retry(&self.retry_params, || {\n            self.get_object(path, range_opt.clone())\n        })\n        .await?;\n        // only record ranged get request as being in flight\n        let _in_flight_guards =\n            range_opt.map(|range| object_storage_get_slice_in_flight_guards(range.len()));\n        let mut buf: Vec<u8> = Vec::with_capacity(cap);\n        download_all(get_object_output.body, &mut buf).await?;\n        Ok(buf)\n    }\n\n    /// Bulk delete implementation based on the DeleteObject API:\n    /// <https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObject.html>\n    async fn bulk_delete_single(&self, paths: &[&Path]) -> Result<(), BulkDeleteError> {\n        let mut successes = Vec::with_capacity(paths.len());\n        let mut failures = HashMap::new();\n\n        let futures = paths\n            .iter()\n            .map(|path| async move {\n                let delete_res = self.delete(path).await;\n                (path, delete_res)\n            })\n            .collect::<Vec<_>>();\n        let mut stream = futures::stream::iter(futures).buffer_unordered(100);\n\n        while let Some((path, delete_res)) = stream.next().await {\n            match delete_res {\n                Ok(_) => successes.push(path.to_path_buf()),\n                Err(error) => {\n                    let failure = DeleteFailure {\n                        error: 
Some(error),\n                        ..Default::default()\n                    };\n                    failures.insert(path.to_path_buf(), failure);\n                }\n            };\n        }\n        if failures.is_empty() {\n            Ok(())\n        } else {\n            Err(BulkDeleteError {\n                successes,\n                failures,\n                ..Default::default()\n            })\n        }\n    }\n\n    /// Bulk delete implementation based on the DeleteObjects API, also called Multi-Object Delete\n    /// API: <https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html>\n    async fn bulk_delete_multi(&self, paths: &[&Path]) -> Result<(), BulkDeleteError> {\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n\n        let delete_requests: Vec<(&[&Path], Delete)> = self\n            .build_delete_batch_requests(paths)\n            .map_err(|error: anyhow::Error| {\n                let unattempted = paths.iter().copied().map(Path::to_path_buf).collect();\n                BulkDeleteError {\n                    error: Some(StorageErrorKind::Internal.with_error(error)),\n                    successes: Default::default(),\n                    failures: Default::default(),\n                    unattempted,\n                }\n            })?;\n\n        let mut error = None;\n        let mut successes = Vec::with_capacity(paths.len());\n        let mut failures = HashMap::new();\n        let mut unattempted = Vec::new();\n\n        let mut delete_requests_it = delete_requests.iter();\n\n        for (path_chunk, delete) in &mut delete_requests_it {\n            let delete_objects_res: StorageResult<DeleteObjectsOutput> =\n                aws_retry(&self.retry_params, || async {\n                    crate::STORAGE_METRICS\n                        .object_storage_bulk_delete_requests_total\n                        .inc();\n                    let _timer = crate::STORAGE_METRICS\n                        
.object_storage_bulk_delete_request_duration\n                        .start_timer();\n                    self.s3_client\n                        .delete_objects()\n                        .bucket(self.bucket.clone())\n                        .delete(delete.clone())\n                        .send()\n                        .await\n                })\n                .await\n                .map_err(Into::into);\n\n            match delete_objects_res {\n                Ok(delete_objects_output) => {\n                    if let Some(deleted_objects) = delete_objects_output.deleted {\n                        for deleted_object in deleted_objects {\n                            if let Some(key) = deleted_object.key {\n                                let path = self.relative_path(&key);\n                                successes.push(path);\n                            }\n                        }\n                    }\n                    if let Some(s3_errors) = delete_objects_output.errors {\n                        for s3_error in s3_errors {\n                            if let Some(key) = s3_error.key {\n                                let path = self.relative_path(&key);\n                                match s3_error.code {\n                                    Some(code) if code == \"NoSuchKey\" => {\n                                        successes.push(path);\n                                    }\n                                    _ => {\n                                        let failure = DeleteFailure {\n                                            code: s3_error.code,\n                                            message: s3_error.message,\n                                            ..Default::default()\n                                        };\n                                        failures.insert(path, failure);\n                                    }\n                                }\n                            }\n                        }\n   
                 }\n                }\n                Err(delete_objects_error) => {\n                    error = Some(delete_objects_error);\n                    unattempted.extend(path_chunk.iter().copied().map(PathBuf::from));\n                    break;\n                }\n            }\n        }\n\n        if error.is_none() && failures.is_empty() {\n            return Ok(());\n        }\n\n        // Do we have remaining requests?\n        for (path_chunk, _) in delete_requests_it {\n            unattempted.extend(path_chunk.iter().copied().map(PathBuf::from));\n        }\n\n        Err(BulkDeleteError {\n            error,\n            successes,\n            failures,\n            unattempted,\n        })\n    }\n}\n\nasync fn download_all(byte_stream: ByteStream, output: &mut Vec<u8>) -> io::Result<()> {\n    output.clear();\n    let mut body_stream_reader = BufReader::new(byte_stream.into_async_read());\n    let num_bytes_copied = tokio::io::copy_buf(&mut body_stream_reader, output).await?;\n    STORAGE_METRICS\n        .object_storage_download_num_bytes\n        .inc_by(num_bytes_copied);\n    // When calling `get_all`, the Vec capacity is not properly set.\n    output.shrink_to_fit();\n    Ok(())\n}\n\n#[async_trait]\nimpl Storage for S3CompatibleObjectStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        // we ignore error as we never close the semaphore\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        self.s3_client\n            .list_objects_v2()\n            .bucket(self.bucket.clone())\n            .max_keys(1)\n            .send()\n            .await?;\n        Ok(())\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        crate::STORAGE_METRICS.object_storage_put_total.inc();\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        let key = self.key(path);\n        let total_len = 
payload.len();\n        let part_num_bytes = self.multipart_policy.part_num_bytes(total_len);\n        if self.disable_multipart_upload || part_num_bytes >= total_len {\n            self.put_single_part(&key, payload, total_len).await?;\n        } else {\n            self.put_multipart(&key, payload, part_num_bytes, total_len)\n                .await?;\n        }\n        Ok(())\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        let get_object_output =\n            aws_retry(&self.retry_params, || self.get_object(path, None)).await?;\n        let mut body_read = BufReader::new(get_object_output.body.into_async_read());\n        let num_bytes_copied = tokio::io::copy_buf(&mut body_read, output).await?;\n        STORAGE_METRICS\n            .object_storage_download_num_bytes\n            .inc_by(num_bytes_copied);\n        output.flush().await?;\n        Ok(())\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        let bucket = self.bucket.clone();\n        let key = self.key(path);\n        let delete_res = aws_retry(&self.retry_params, || async {\n            crate::STORAGE_METRICS\n                .object_storage_delete_requests_total\n                .inc();\n            let _timer = crate::STORAGE_METRICS\n                .object_storage_delete_request_duration\n                .start_timer();\n            self.s3_client\n                .delete_object()\n                .bucket(&bucket)\n                .key(&key)\n                .send()\n                .await\n        })\n        .await;\n\n        match delete_res {\n            Ok(_) => Ok(()),\n            Err(error) if error.code() == Some(\"NoSuchKey\") => Ok(()),\n            Err(error) => Err(error.into()),\n        }\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), 
BulkDeleteError> {\n        if self.disable_multi_object_delete {\n            self.bulk_delete_single(paths).await\n        } else {\n            self.bulk_delete_multi(paths).await\n        }\n    }\n\n    #[instrument(level = \"debug\", skip(self, range), fields(range.start = range.start, range.end = range.end))]\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        self.get_to_vec(path, Some(range.clone()))\n            .await\n            .map(OwnedBytes::new)\n            .map_err(|err| {\n                err.add_context(format!(\n                    \"failed to fetch slice {:?} for object: {}/{}\",\n                    range,\n                    self.uri,\n                    path.display(),\n                ))\n            })\n    }\n\n    #[instrument(level = \"debug\", skip(self, range), fields(range.start = range.start, range.end = range.end))]\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> crate::StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        let permit = REQUEST_SEMAPHORE.acquire().await;\n        let get_object_output = aws_retry(&self.retry_params, || {\n            self.get_object(path, Some(range.clone()))\n        })\n        .await?;\n        Ok(Box::new(S3AsyncRead {\n            read: get_object_output.body.into_async_read(),\n            _permit: permit,\n        }))\n    }\n\n    #[instrument(level = \"debug\", skip(self), fields(num_bytes_fetched))]\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        let bytes = self\n            .get_to_vec(path, None)\n            .await\n            .map(OwnedBytes::new)\n            .map_err(|err| {\n                err.add_context(format!(\n                    \"failed to fetch object: {}/{}\",\n                    self.uri,\n         
           path.display()\n                ))\n            })?;\n        tracing::Span::current().record(\"num_bytes_fetched\", bytes.len());\n        Ok(bytes)\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        let _permit = REQUEST_SEMAPHORE.acquire().await;\n        let bucket = self.bucket.clone();\n        let key = self.key(path);\n        let head_object_output = aws_retry(&self.retry_params, || async {\n            self.s3_client\n                .head_object()\n                .bucket(&bucket)\n                .key(&key)\n                .send()\n                .await\n        })\n        .await?;\n\n        Ok(head_object_output.content_length().unwrap_or(0) as u64)\n    }\n\n    fn uri(&self) -> &Uri {\n        &self.uri\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::path::PathBuf;\n\n    use aws_sdk_s3::config::{Credentials, Region};\n    use aws_sdk_s3::primitives::SdkBody;\n    use aws_smithy_runtime::client::http::test_util::{ReplayEvent, StaticReplayClient};\n    use hyper::http;\n    use quickwit_aws::aws_behavior_version;\n    use quickwit_common::chunk_range;\n    use quickwit_common::uri::Uri;\n\n    use super::*;\n    use crate::{MultiPartPolicy, S3CompatibleObjectStorage};\n\n    #[tokio::test]\n    async fn test_md5_calc() -> std::io::Result<()> {\n        let data = (0..1_500_000).map(|el| el as u8).collect::<Vec<_>>();\n        let md5 = compute_md5(data.as_slice()).await?;\n        assert_eq!(md5, md5::compute(data));\n\n        Ok(())\n    }\n\n    #[test]\n    fn test_split_range_into_chunks_inexact() {\n        assert_eq!(\n            chunk_range(0..11, 3).collect::<Vec<_>>(),\n            vec![0..3, 3..6, 6..9, 9..11]\n        );\n    }\n    #[test]\n    fn test_split_range_into_chunks_exact() {\n        assert_eq!(\n            chunk_range(0..9, 3).collect::<Vec<_>>(),\n            vec![0..3, 3..6, 6..9]\n        );\n    }\n\n    #[test]\n    fn test_split_range_empty() {\n        
assert!(chunk_range(0..0, 1).collect::<Vec<_>>().is_empty());\n    }\n\n    #[test]\n    fn test_parse_uri() {\n        assert_eq!(\n            parse_s3_uri(&Uri::for_test(\"s3://bucket/path/to/object\")),\n            Some((\"bucket\".to_string(), PathBuf::from(\"path/to/object\")))\n        );\n        assert_eq!(\n            parse_s3_uri(&Uri::for_test(\"s3://bucket/path\")),\n            Some((\"bucket\".to_string(), PathBuf::from(\"path\")))\n        );\n        assert_eq!(\n            parse_s3_uri(&Uri::for_test(\"s3://bucket/path/to/object\")),\n            Some((\"bucket\".to_string(), PathBuf::from(\"path/to/object\")))\n        );\n        assert_eq!(\n            parse_s3_uri(&Uri::for_test(\"s3://bucket/\")),\n            Some((\"bucket\".to_string(), PathBuf::from(\"\")))\n        );\n        assert_eq!(\n            parse_s3_uri(&Uri::for_test(\"s3://bucket\")),\n            Some((\"bucket\".to_string(), PathBuf::from(\"\")))\n        );\n        assert_eq!(parse_s3_uri(&Uri::for_test(\"ram://path/to/file\")), None);\n    }\n\n    #[tokio::test]\n    async fn test_s3_compatible_storage_relative_path() {\n        let sdk_config = aws_config::defaults(aws_behavior_version()).load().await;\n        let s3_client = S3Client::new(&sdk_config);\n        let uri = Uri::for_test(\"s3://bucket/indexes\");\n        let bucket = \"bucket\".to_string();\n        let prefix = PathBuf::new();\n\n        let mut s3_storage = S3CompatibleObjectStorage {\n            s3_client,\n            uri,\n            bucket,\n            prefix,\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params: RetryParams::for_test(),\n            disable_multi_object_delete: false,\n            disable_multipart_upload: false,\n        };\n        assert_eq!(\n            s3_storage.relative_path(\"indexes/foo\"),\n            PathBuf::from(\"indexes/foo\")\n        );\n\n        s3_storage.prefix = PathBuf::from(\"indexes\");\n\n        assert_eq!(\n   
         s3_storage.relative_path(\"indexes/foo\"),\n            PathBuf::from(\"foo\")\n        );\n    }\n\n    #[tokio::test]\n    async fn test_s3_compatible_storage_bulk_delete_single() {\n        let client = StaticReplayClient::new(vec![\n            ReplayEvent::new(\n                http::Request::builder().body(SdkBody::empty()).unwrap(),\n                http::Response::builder().body(SdkBody::empty()).unwrap(),\n            ),\n            ReplayEvent::new(\n                http::Request::builder().body(SdkBody::empty()).unwrap(),\n                http::Response::builder().body(SdkBody::empty()).unwrap(),\n            ),\n        ]);\n        let credentials = Credentials::new(\"mock_key\", \"mock_secret\", None, None, \"mock_provider\");\n        let config = aws_sdk_s3::Config::builder()\n            .behavior_version(aws_behavior_version())\n            .region(Some(Region::new(\"Foo\")))\n            .http_client(client.clone())\n            .credentials_provider(credentials)\n            .build();\n        let s3_client = S3Client::from_conf(config);\n        let uri = Uri::for_test(\"s3://bucket/indexes\");\n        let bucket = \"bucket\".to_string();\n        let prefix = PathBuf::new();\n\n        let s3_storage = S3CompatibleObjectStorage {\n            s3_client,\n            uri,\n            bucket,\n            prefix,\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params: RetryParams::for_test(),\n            disable_multi_object_delete: true,\n            disable_multipart_upload: false,\n        };\n        let _ = s3_storage\n            .bulk_delete(&[Path::new(\"foo\"), Path::new(\"bar\")])\n            .await;\n\n        let requests = client.actual_requests().collect::<Vec<_>>();\n        assert_eq!(requests.len(), 2);\n        assert!(requests[0].uri().to_string().ends_with(\"DeleteObject\"));\n    }\n\n    #[tokio::test]\n    async fn test_s3_compatible_storage_bulk_delete_multi() {\n        let 
client = StaticReplayClient::new(vec![ReplayEvent::new(\n            http::Request::builder().body(SdkBody::empty()).unwrap(),\n            http::Response::builder().body(SdkBody::empty()).unwrap(),\n        )]);\n        let credentials = Credentials::new(\"mock_key\", \"mock_secret\", None, None, \"mock_provider\");\n        let config = aws_sdk_s3::Config::builder()\n            .behavior_version(aws_behavior_version())\n            .region(Some(Region::new(\"Foo\")))\n            .http_client(client.clone())\n            .credentials_provider(credentials)\n            .build();\n        let s3_client = S3Client::from_conf(config);\n        let uri = Uri::for_test(\"s3://bucket/indexes\");\n        let bucket = \"bucket\".to_string();\n        let prefix = PathBuf::new();\n\n        let s3_storage = S3CompatibleObjectStorage {\n            s3_client,\n            uri,\n            bucket,\n            prefix,\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params: RetryParams::for_test(),\n            disable_multi_object_delete: false,\n            disable_multipart_upload: false,\n        };\n        let _ = s3_storage\n            .bulk_delete(&[Path::new(\"foo\"), Path::new(\"bar\")])\n            .await;\n\n        let requests = client.actual_requests().collect::<Vec<_>>();\n        assert_eq!(requests.len(), 1);\n        assert!(requests[0].uri().to_string().ends_with(\"delete\"));\n    }\n\n    #[tokio::test]\n    async fn test_s3_compatible_storage_bulk_delete_multi_errors() {\n        let client = StaticReplayClient::new(vec![\n            ReplayEvent::new(\n                // This is quite fragile, currently this is *not* validated by the SDK\n                // but may in future, that being said, there is no way to know what the\n                // request should look like until it raises an error in reality as this\n                // is up to how the validation is implemented.\n                
http::Request::builder().body(SdkBody::empty()).unwrap(),\n                http::Response::builder()\n                    .status(200)\n                    .body(SdkBody::from(\n                        r#\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n                        <DeleteResult xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">\n                            <Deleted>\n                                <Key>foo</Key>\n                            </Deleted>\n                            <Error>\n                                <Key>bar</Key>\n                                <Code>NoSuchKey</Code>\n                                <Message>The specified key does not exist</Message>\n                            </Error>\n                            <Error>\n                                <Key>baz</Key>\n                                <Code>AccessDenied</Code>\n                                <Message>Access Denied</Message>\n                            </Error>\n                        </DeleteResult>\"#\n                    ))\n                    .unwrap()\n            ),\n            ReplayEvent::new(\n                // This is quite fragile, currently this is *not* validated by the SDK\n                // but may in future, that being said, there is no way to know what the\n                // request should look like until it raises an error in reality as this\n                // is up to how the validation is implemented.\n                http::Request::builder().body(SdkBody::empty()).unwrap(),\n                http::Response::builder()\n                    .status(400)\n                    .body(SdkBody::from(\n                        r#\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n                        <Error>\n                            <Code>MalformedXML</Code>\n                            <Message>The XML you provided was not well-formed or did not validate against our published schema.</Message>\n                            
<RequestId>264A17BF16E9E80A</RequestId>\n                            <HostId>P3xqrhuhYxlrefdw3rEzmJh8z5KDtGzb+/FB7oiQaScI9Yaxd8olYXc7d1111ab+</HostId>\n                        </Error>\"#\n                    ))\n                    .unwrap()\n            ),\n        ]);\n        let credentials = Credentials::new(\"mock_key\", \"mock_secret\", None, None, \"mock_provider\");\n        let config = aws_sdk_s3::Config::builder()\n            .behavior_version(aws_behavior_version())\n            .region(Some(Region::new(\"Foo\")))\n            .http_client(client)\n            .credentials_provider(credentials)\n            .build();\n        let s3_client = S3Client::from_conf(config);\n        let uri = Uri::for_test(\"s3://bucket/indexes\");\n        let bucket = \"bucket\".to_string();\n        let prefix = PathBuf::new();\n\n        let s3_storage = S3CompatibleObjectStorage {\n            s3_client,\n            uri,\n            bucket,\n            prefix,\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params: RetryParams::for_test(),\n            disable_multi_object_delete: false,\n            disable_multipart_upload: false,\n        };\n        let bulk_delete_error = s3_storage\n            .bulk_delete(&[\n                Path::new(\"foo\"),\n                Path::new(\"bar\"),\n                Path::new(\"baz\"),\n                Path::new(\"foobar\"),\n                Path::new(\"foobaz\"),\n                Path::new(\"barfoo\"),\n                Path::new(\"barbaz\"),\n            ])\n            .await\n            .unwrap_err();\n\n        assert_eq!(\n            bulk_delete_error.successes,\n            [PathBuf::from(\"foo\"), PathBuf::from(\"bar\")]\n        );\n        let failure = bulk_delete_error.failures.get(Path::new(\"baz\")).unwrap();\n        assert_eq!(failure.code.as_ref().unwrap(), \"AccessDenied\");\n        assert_eq!(failure.message.as_ref().unwrap(), \"Access Denied\");\n        
assert!(failure.error.is_none());\n\n        assert_eq!(\n            bulk_delete_error.unattempted,\n            [\n                PathBuf::from(\"foobar\"),\n                PathBuf::from(\"foobaz\"),\n                PathBuf::from(\"barfoo\"),\n                PathBuf::from(\"barbaz\")\n            ]\n        );\n        let delete_objects_error = bulk_delete_error.error.unwrap();\n        assert!(delete_objects_error.to_string().contains(\"MalformedXML\"));\n    }\n\n    #[tokio::test]\n    async fn test_s3_compatible_storage_retry_put() {\n        let client = StaticReplayClient::new(vec![\n            ReplayEvent::new(\n                // This is quite fragile, currently this is *not* validated by the SDK\n                // but may in future, that being said, there is no way to know what the\n                // request should look like until it raises an error in reality as this\n                // is up to how the validation is implemented.\n                http::Request::builder().body(SdkBody::empty()).unwrap(),\n                http::Response::builder()\n                    .status(429)\n                    .body(SdkBody::from(\n                        r#\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n                        <Error>\n                          <Code>SlowDown</Code>\n                          <Message>message</Message>\n                          <Resource>/my-path</Resource>\n                          <RequestId>4442587FB7D0A2F9</RequestId>\n                        </Error>\"#,\n                    ))\n                    .unwrap(),\n            ),\n            ReplayEvent::new(\n                // This is quite fragile, currently this is *not* validated by the SDK\n                // but may in future, that being said, there is no way to know what the\n                // request should look like until it raises an error in reality as this\n                // is up to how the validation is implemented.\n                
http::Request::builder().body(SdkBody::empty()).unwrap(),\n                http::Response::builder()\n                    .status(200)\n                    .body(SdkBody::empty())\n                    .unwrap(),\n            ),\n        ]);\n        let credentials = Credentials::new(\"mock_key\", \"mock_secret\", None, None, \"mock_provider\");\n        let config = aws_sdk_s3::Config::builder()\n            .behavior_version(aws_behavior_version())\n            .region(Some(Region::new(\"Foo\")))\n            .http_client(client)\n            .credentials_provider(credentials)\n            .build();\n        let s3_client = S3Client::from_conf(config);\n        let uri = Uri::for_test(\"s3://bucket/indexes\");\n        let bucket = \"bucket\".to_string();\n        let prefix = PathBuf::new();\n\n        let s3_storage = S3CompatibleObjectStorage {\n            s3_client,\n            uri,\n            bucket,\n            prefix,\n            multipart_policy: MultiPartPolicy::default(),\n            retry_params: RetryParams::for_test(),\n            disable_multi_object_delete: false,\n            disable_multipart_upload: false,\n        };\n        s3_storage\n            .put(Path::new(\"my-path\"), Box::new(vec![1, 2, 3]))\n            .await\n            .unwrap();\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/object_storage/s3_compatible_storage_resolver.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse aws_sdk_s3::Client as S3Client;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{S3StorageConfig, StorageBackend};\nuse tokio::sync::OnceCell;\n\nuse super::s3_compatible_storage::create_s3_client;\nuse crate::{\n    DebouncedStorage, S3CompatibleObjectStorage, Storage, StorageFactory, StorageResolverError,\n};\n\n/// S3 compatible object storage resolver.\npub struct S3CompatibleObjectStorageFactory {\n    storage_config: S3StorageConfig,\n    // we cache the S3Client so we don't rebuild one every time we build a new Storage (for\n    // every search query).\n    // We don't build it in advance because we don't know if this factory is one that will\n    // end up being used, or if something like azure, gcs, or even local files, will be used\n    // instead.\n    s3_client: OnceCell<S3Client>,\n}\n\nimpl S3CompatibleObjectStorageFactory {\n    /// Creates a new S3-compatible storage factory.\n    pub fn new(storage_config: S3StorageConfig) -> Self {\n        Self {\n            storage_config,\n            s3_client: OnceCell::new(),\n        }\n    }\n}\n\n#[async_trait]\nimpl StorageFactory for S3CompatibleObjectStorageFactory {\n    fn backend(&self) -> StorageBackend {\n        StorageBackend::S3\n    }\n\n    async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, 
StorageResolverError> {\n        let s3_client = self\n            .s3_client\n            .get_or_init(|| create_s3_client(&self.storage_config))\n            .await\n            .clone();\n        let storage =\n            S3CompatibleObjectStorage::from_uri_and_client(&self.storage_config, uri, s3_client)\n                .await?;\n        Ok(Arc::new(DebouncedStorage::new(storage)))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/opendal_storage/base.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::ops::Range;\nuse std::path::Path;\n\nuse async_trait::async_trait;\nuse futures::AsyncWriteExt as FuturesAsyncWriteExt;\nuse opendal::{DeleteInput, IntoDeleteInput, Operator};\nuse quickwit_common::uri::Uri;\nuse tokio::io::{AsyncRead, AsyncWriteExt as TokioAsyncWriteExt};\nuse tokio_util::compat::{FuturesAsyncReadCompatExt, FuturesAsyncWriteCompatExt};\n\nuse crate::metrics::object_storage_get_slice_in_flight_guards;\nuse crate::storage::SendableAsync;\nuse crate::{\n    BulkDeleteError, MultiPartPolicy, OwnedBytes, PutPayload, Storage, StorageError,\n    StorageErrorKind, StorageResolverError, StorageResult,\n};\n\n/// OpenDAL based storage implementation.\n/// # TODO\n///\n/// - Implement REQUEST_SEMAPHORE to control the concurrency.\n/// - Implement STORAGE_METRICS for metrics.\npub struct OpendalStorage {\n    uri: Uri,\n    op: Operator,\n    multipart_policy: MultiPartPolicy,\n}\n\nimpl fmt::Debug for OpendalStorage {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"OpendalStorage\")\n            .field(\"operator\", &self.op.info())\n            .finish()\n    }\n}\n\nimpl OpendalStorage {\n    /// Create a new google cloud storage.\n    pub fn new_google_cloud_storage(\n        uri: Uri,\n        cfg: opendal::services::Gcs,\n    ) -> 
Result<Self, StorageResolverError> {\n        let op = Operator::new(cfg)?.finish();\n        Ok(Self {\n            uri,\n            op,\n            // limits are the same as on S3\n            multipart_policy: MultiPartPolicy::default(),\n        })\n    }\n\n    #[cfg(feature = \"integration-testsuite\")]\n    pub fn set_policy(&mut self, multipart_policy: MultiPartPolicy) {\n        self.multipart_policy = multipart_policy;\n    }\n}\n\n#[async_trait]\nimpl Storage for OpendalStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.op.check().await?;\n        Ok(())\n    }\n\n    async fn put(&self, path: &Path, payload: Box<dyn PutPayload>) -> StorageResult<()> {\n        crate::STORAGE_METRICS.object_storage_put_total.inc();\n        let path = path.as_os_str().to_string_lossy();\n        let mut payload_reader = payload.byte_stream().await?.into_async_read();\n\n        let mut storage_writer = self\n            .op\n            .writer_with(&path)\n            .chunk(self.multipart_policy.part_num_bytes(payload.len()) as usize)\n            .await?\n            .into_futures_async_write()\n            .compat_write();\n        tokio::io::copy(&mut payload_reader, &mut storage_writer).await?;\n        storage_writer.get_mut().close().await?;\n        crate::STORAGE_METRICS\n            .object_storage_upload_num_bytes\n            .inc_by(payload.len());\n        Ok(())\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        let path = path.as_os_str().to_string_lossy();\n        let mut storage_reader = self\n            .op\n            .reader(&path)\n            .await?\n            .into_futures_async_read(..)\n            .await?\n            .compat();\n        let num_bytes_copied = tokio::io::copy(&mut storage_reader, output).await?;\n        crate::STORAGE_METRICS\n            .object_storage_download_num_bytes\n            .inc_by(num_bytes_copied);\n      
  output.flush().await?;\n        Ok(())\n    }\n\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        let path = path.as_os_str().to_string_lossy();\n        let size = range.len();\n        let range = range.start as u64..range.end as u64;\n        // Unlike other object store implementations, in flight requests are\n        // recorded before issuing the query to the object store.\n        let _inflight_guards = object_storage_get_slice_in_flight_guards(size);\n        crate::STORAGE_METRICS.object_storage_get_total.inc();\n        let storage_content = self.op.read_with(&path).range(range).await?.to_vec();\n        Ok(OwnedBytes::new(storage_content))\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        let path = path.as_os_str().to_string_lossy();\n        let range = range.start as u64..range.end as u64;\n        let storage_reader = self\n            .op\n            .reader_with(&path)\n            .await?\n            .into_futures_async_read(range)\n            .await?\n            .compat();\n        Ok(Box::new(storage_reader))\n    }\n\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        let path = path.as_os_str().to_string_lossy();\n        let storage_content = self.op.read(&path).await?.to_vec();\n        Ok(OwnedBytes::new(storage_content))\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        let path = path.as_os_str().to_string_lossy();\n        crate::STORAGE_METRICS\n            .object_storage_delete_requests_total\n            .inc();\n        let _timer = crate::STORAGE_METRICS\n            .object_storage_delete_request_duration\n            .start_timer();\n        self.op.delete(&path).await?;\n        Ok(())\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> 
{\n        // The mock service used in the integration test suite doesn't support bulk delete,\n        // so we fall back to deleting objects one by one in this case.\n        #[cfg(feature = \"integration-testsuite\")]\n        {\n            let storage_info = self.op.info();\n            if storage_info.name().starts_with(\"sample-bucket\") && storage_info.scheme() == \"gcs\" {\n                let mut bulk_error = BulkDeleteError::default();\n                for (index, path) in paths.iter().enumerate() {\n                    crate::STORAGE_METRICS\n                        .object_storage_bulk_delete_requests_total\n                        .inc();\n                    let _timer = crate::STORAGE_METRICS\n                        .object_storage_bulk_delete_request_duration\n                        .start_timer();\n                    let result = self.op.delete(&path.as_os_str().to_string_lossy()).await;\n                    if let Err(err) = result {\n                        let storage_error_kind = err.kind();\n                        let storage_error: StorageError = err.into();\n                        bulk_error.failures.insert(\n                            path.to_path_buf(),\n                            crate::DeleteFailure {\n                                code: Some(storage_error_kind.to_string()),\n                                message: Some(storage_error.to_string()),\n                                error: Some(storage_error.clone()),\n                            },\n                        );\n                        bulk_error.error = Some(storage_error);\n                        // The failed path is already recorded in `failures`; only the paths\n                        // after it were never attempted.\n                        for path in paths[index + 1..].iter() {\n                            bulk_error.unattempted.push(path.to_path_buf())\n                        }\n                        break;\n                    } else {\n                        bulk_error.successes.push(path.to_path_buf())\n                    }\n                }\n\n                return if bulk_error.error.is_some() {\n           
         Err(bulk_error)\n                } else {\n                    Ok(())\n                };\n            }\n        }\n        let delete_inputs: Vec<DeleteInput> = paths\n            .iter()\n            .map(|path| path.as_os_str().to_string_lossy().into_delete_input())\n            .collect();\n\n        self.op\n            .delete_iter(delete_inputs)\n            .await\n            .map_err(|error| BulkDeleteError {\n                error: Some(error.into()),\n                ..Default::default()\n            })?;\n        Ok(())\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        let path = path.as_os_str().to_string_lossy();\n        let meta = self.op.stat(&path).await?;\n        Ok(meta.content_length())\n    }\n\n    fn uri(&self) -> &Uri {\n        &self.uri\n    }\n}\n\nimpl From<opendal::Error> for StorageError {\n    fn from(err: opendal::Error) -> Self {\n        match err.kind() {\n            opendal::ErrorKind::NotFound => StorageErrorKind::NotFound.with_error(err),\n            opendal::ErrorKind::PermissionDenied => StorageErrorKind::Unauthorized.with_error(err),\n            opendal::ErrorKind::ConfigInvalid => StorageErrorKind::Service.with_error(err),\n            _ => StorageErrorKind::Io.with_error(err),\n        }\n    }\n}\n\nimpl From<opendal::Error> for StorageResolverError {\n    fn from(err: opendal::Error) -> Self {\n        StorageResolverError::InvalidConfig(err.to_string())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/opendal_storage/google_cloud_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::path::PathBuf;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse once_cell::sync::OnceCell;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::{GoogleCloudStorageConfig, StorageBackend};\nuse regex::Regex;\nuse tracing::info;\n\nuse super::OpendalStorage;\nuse crate::debouncer::DebouncedStorage;\nuse crate::{Storage, StorageFactory, StorageResolverError};\n\n/// Google cloud storage resolver.\npub struct GoogleCloudStorageFactory {\n    storage_config: GoogleCloudStorageConfig,\n}\n\nimpl GoogleCloudStorageFactory {\n    /// Create a new google cloud storage factory via config.\n    pub fn new(storage_config: GoogleCloudStorageConfig) -> Self {\n        Self { storage_config }\n    }\n}\n\n#[async_trait]\nimpl StorageFactory for GoogleCloudStorageFactory {\n    fn backend(&self) -> StorageBackend {\n        StorageBackend::Google\n    }\n\n    async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError> {\n        let storage = from_uri(&self.storage_config, uri)?;\n        Ok(Arc::new(DebouncedStorage::new(storage)))\n    }\n}\n\n/// Helpers to configure the GCP local test setup.\n#[cfg(feature = \"integration-testsuite\")]\npub mod test_config_helpers {\n    use super::*;\n\n    /// URL of the local GCP emulator.\n    pub const LOCAL_GCP_EMULATOR_ENDPOINT: &str = \"http://127.0.0.1:4443\";\n    /// 
Creates a storage connecting to a local emulated google cloud storage.\n    pub fn new_emulated_google_cloud_storage(\n        uri: &Uri,\n    ) -> Result<OpendalStorage, StorageResolverError> {\n        let (bucket, root) = parse_google_uri(uri).expect(\"must be valid google uri\");\n\n        let cfg = opendal::services::Gcs::default()\n            .bucket(&bucket)\n            .root(&root.to_string_lossy())\n            .endpoint(LOCAL_GCP_EMULATOR_ENDPOINT)\n            .allow_anonymous() // Disable authentication for fake GCS server\n            .disable_vm_metadata(); // Disable GCE metadata server requests\n        let store = OpendalStorage::new_google_cloud_storage(uri.clone(), cfg)?;\n        Ok(store)\n    }\n}\n\nfn from_uri(\n    google_cloud_storage_config: &GoogleCloudStorageConfig,\n    uri: &Uri,\n) -> Result<OpendalStorage, StorageResolverError> {\n    let (bucket_name, prefix) = parse_google_uri(uri).ok_or_else(|| {\n        let message = format!(\"failed to extract bucket name from google URI: {uri}\");\n        StorageResolverError::InvalidUri(message)\n    })?;\n\n    let mut cfg = opendal::services::Gcs::default()\n        .bucket(&bucket_name)\n        .root(&prefix.to_string_lossy());\n\n    if let Some(credential_path) = google_cloud_storage_config.resolve_credential_path() {\n        info!(path=%credential_path, \"fetching google cloud storage credentials from path\");\n        cfg = cfg.credential_path(&credential_path);\n    }\n    let store = OpendalStorage::new_google_cloud_storage(uri.clone(), cfg)?;\n    Ok(store)\n}\n\nfn parse_google_uri(uri: &Uri) -> Option<(String, PathBuf)> {\n    // Ex: gs://bucket/prefix.\n    static URI_PTN: OnceCell<Regex> = OnceCell::new();\n\n    let captures = URI_PTN\n        .get_or_init(|| {\n            Regex::new(r\"gs(\\+[^:]+)?://(?P<bucket>[^/]+)(/(?P<prefix>.*))?$\")\n                .expect(\"The regular expression should compile.\")\n        })\n        .captures(uri.as_str())?;\n\n    let 
bucket = captures.name(\"bucket\")?.as_str().to_string();\n    let prefix = captures\n        .name(\"prefix\")\n        .map(|prefix_match| PathBuf::from(prefix_match.as_str()))\n        .unwrap_or_default();\n    Some((bucket, prefix))\n}\n\n#[cfg(test)]\nmod tests {\n    use quickwit_common::uri::Uri;\n\n    use super::parse_google_uri;\n\n    #[test]\n    fn test_parse_google_uri() {\n        assert!(parse_google_uri(&Uri::for_test(\"gs://\")).is_none());\n\n        let (bucket, prefix) = parse_google_uri(&Uri::for_test(\"gs://test-bucket\")).unwrap();\n        assert_eq!(bucket, \"test-bucket\");\n        assert!(prefix.to_str().unwrap().is_empty());\n\n        let (bucket, prefix) = parse_google_uri(&Uri::for_test(\"gs://test-bucket/\")).unwrap();\n        assert_eq!(bucket, \"test-bucket\");\n        assert!(prefix.to_str().unwrap().is_empty());\n\n        let (bucket, prefix) =\n            parse_google_uri(&Uri::for_test(\"gs://test-bucket/indexes\")).unwrap();\n        assert_eq!(bucket, \"test-bucket\");\n        assert_eq!(prefix.to_str().unwrap(), \"indexes\");\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/opendal_storage/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod base;\nuse base::OpendalStorage;\n\nmod google_cloud_storage;\n\npub use google_cloud_storage::GoogleCloudStorageFactory;\n#[cfg(feature = \"integration-testsuite\")]\npub use google_cloud_storage::test_config_helpers;\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/payload.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io;\nuse std::ops::Range;\n\nuse async_trait::async_trait;\nuse aws_sdk_s3::primitives::ByteStream;\nuse tantivy::directory::OwnedBytes;\n\n#[async_trait]\n/// PutPayload is used to upload data and support multipart.\npub trait PutPayload: PutPayloadClone + Send + Sync {\n    /// Return the total length of the payload.\n    fn len(&self) -> u64;\n\n    /// Retrieve bytestream for specified range.\n    async fn range_byte_stream(&self, range: Range<u64>) -> io::Result<ByteStream>;\n\n    /// Retrieve complete bytestream.\n    async fn byte_stream(&self) -> io::Result<ByteStream> {\n        let total_len = self.len();\n        let range = 0..total_len;\n        self.range_byte_stream(range).await\n    }\n\n    /// Load the whole Payload into memory.\n    async fn read_all(&self) -> io::Result<OwnedBytes> {\n        let total_len = self.len();\n        let range = 0..total_len;\n        let mut reader = self.range_byte_stream(range).await?.into_async_read();\n\n        let mut data: Vec<u8> = Vec::with_capacity(total_len as usize);\n        tokio::io::copy(&mut reader, &mut data).await?;\n\n        Ok(OwnedBytes::new(data))\n    }\n}\n\npub trait PutPayloadClone {\n    fn box_clone(&self) -> Box<dyn PutPayload>;\n}\n\nimpl<T> PutPayloadClone for T\nwhere T: 'static + PutPayload + Clone\n{\n    fn box_clone(&self) -> Box<dyn PutPayload> 
{\n        Box::new(self.clone())\n    }\n}\n\nimpl Clone for Box<dyn PutPayload> {\n    fn clone(&self) -> Box<dyn PutPayload> {\n        self.box_clone()\n    }\n}\n\n#[async_trait]\nimpl PutPayload for Vec<u8> {\n    fn len(&self) -> u64 {\n        self.len() as u64\n    }\n\n    async fn range_byte_stream(&self, range: Range<u64>) -> io::Result<ByteStream> {\n        Ok(ByteStream::from(\n            self[range.start as usize..range.end as usize].to_vec(),\n        ))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/prefix_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse tokio::io::AsyncRead;\n\nuse crate::storage::SendableAsync;\nuse crate::{BulkDeleteError, OwnedBytes, Storage};\n\n/// This storage acts as a proxy to another storage that simply modifies each API call\n/// by preceding each path with a given a prefix.\nstruct PrefixStorage {\n    pub storage: Arc<dyn Storage>,\n    pub prefix: PathBuf,\n    uri: Uri,\n}\n\nimpl fmt::Debug for PrefixStorage {\n    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {\n        f.debug_struct(\"PrefixStorage\")\n            .field(\"uri\", &self.uri)\n            .field(\"prefix\", &self.prefix)\n            .finish()\n    }\n}\n\n#[async_trait]\nimpl Storage for PrefixStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.storage.check_connectivity().await\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        self.storage.put(&self.prefix.join(path), payload).await\n    }\n\n    async fn copy_to(\n        &self,\n        path: &Path,\n        output: &mut dyn SendableAsync,\n    ) -> crate::StorageResult<()> {\n        self.storage.copy_to(&self.prefix.join(path), 
output).await\n    }\n\n    async fn get_slice(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> crate::StorageResult<OwnedBytes> {\n        self.storage.get_slice(&self.prefix.join(path), range).await\n    }\n\n    async fn get_all(&self, path: &Path) -> crate::StorageResult<OwnedBytes> {\n        self.storage.get_all(&self.prefix.join(path)).await\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> crate::StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        self.storage\n            .get_slice_stream(&self.prefix.join(path), range)\n            .await\n    }\n\n    async fn delete(&self, path: &Path) -> crate::StorageResult<()> {\n        self.storage.delete(&self.prefix.join(path)).await\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        let prefixed_pathbufs: Vec<PathBuf> =\n            paths.iter().map(|path| self.prefix.join(path)).collect();\n        let prefixed_paths: Vec<&Path> = prefixed_pathbufs\n            .iter()\n            .map(|pathbuf| pathbuf.as_path())\n            .collect();\n        self.storage\n            .bulk_delete(&prefixed_paths)\n            .await\n            .map_err(|error| strip_prefix_from_error(error, &self.prefix))?;\n        Ok(())\n    }\n\n    async fn exists(&self, path: &Path) -> crate::StorageResult<bool> {\n        self.storage.exists(&self.prefix.join(path)).await\n    }\n\n    fn uri(&self) -> &Uri {\n        &self.uri\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> crate::StorageResult<u64> {\n        self.storage.file_num_bytes(&self.prefix.join(path)).await\n    }\n}\n\n/// Creates a [`PrefixStorage`] using an underlying storage and a prefix.\npub(crate) fn add_prefix_to_storage(\n    storage: Arc<dyn Storage>,\n    prefix: PathBuf,\n    uri: Uri,\n) -> Arc<dyn Storage> {\n    Arc::new(PrefixStorage {\n        storage,\n        prefix,\n      
  uri,\n    })\n}\n\nfn strip_prefix_from_error(error: BulkDeleteError, prefix: &Path) -> BulkDeleteError {\n    if prefix == Path::new(\"\") {\n        return error;\n    }\n    let successes = error\n        .successes\n        .into_iter()\n        .map(|path| {\n            path.strip_prefix(prefix)\n                .expect(\n                    \"The prefix should have been prepended to the path before the bulk delete \\\n                     call.\",\n                )\n                .to_path_buf()\n        })\n        .collect();\n    let failures = error\n        .failures\n        .into_iter()\n        .map(|(path, failure)| {\n            (\n                path.strip_prefix(prefix)\n                    .expect(\n                        \"The prefix should have been prepended to the path before the bulk delete \\\n                         call.\",\n                    )\n                    .to_path_buf(),\n                failure,\n            )\n        })\n        .collect();\n    let unattempted = error\n        .unattempted\n        .into_iter()\n        .map(|path| {\n            path.strip_prefix(prefix)\n                .expect(\n                    \"The prefix should have been prepended to the path before the bulk delete \\\n                     call.\",\n                )\n                .to_path_buf()\n        })\n        .collect();\n    BulkDeleteError {\n        error: error.error,\n        successes,\n        failures,\n        unattempted,\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::collections::HashMap;\n\n    use super::*;\n    use crate::DeleteFailure;\n\n    #[test]\n    fn test_strip_prefix_from_error() {\n        {\n            let error = BulkDeleteError {\n                error: None,\n                successes: vec![PathBuf::from(\"ram:///indexes/foo\")],\n                unattempted: vec![PathBuf::from(\"ram:///indexes/bar\")],\n                failures: HashMap::from_iter([(\n                    
PathBuf::from(\"ram:///indexes/baz\"),\n                    DeleteFailure::default(),\n                )]),\n            };\n            let stripped_error = strip_prefix_from_error(error, Path::new(\"\"));\n\n            assert_eq!(\n                stripped_error.successes,\n                vec![PathBuf::from(\"ram:///indexes/foo\")],\n            );\n            assert_eq!(\n                stripped_error.unattempted,\n                vec![PathBuf::from(\"ram:///indexes/bar\")],\n            );\n            assert_eq!(\n                stripped_error.failures.keys().next().unwrap(),\n                &PathBuf::from(\"ram:///indexes/baz\"),\n            );\n        }\n        {\n            let error = BulkDeleteError {\n                error: None,\n                successes: vec![PathBuf::from(\"ram:///indexes/foo\")],\n                unattempted: vec![PathBuf::from(\"ram:///indexes/bar\")],\n                failures: HashMap::from_iter([(\n                    PathBuf::from(\"ram:///indexes/baz\"),\n                    DeleteFailure::default(),\n                )]),\n            };\n            let stripped_error = strip_prefix_from_error(error, Path::new(\"ram:///indexes\"));\n\n            assert_eq!(stripped_error.successes, vec![PathBuf::from(\"foo\")],);\n            assert_eq!(stripped_error.unattempted, vec![PathBuf::from(\"bar\")],);\n            assert_eq!(\n                stripped_error.failures.keys().next().unwrap(),\n                &PathBuf::from(\"baz\"),\n            );\n        }\n        {\n            let error = BulkDeleteError {\n                error: None,\n                successes: vec![PathBuf::from(\"ram:///indexes/foo\")],\n                unattempted: vec![PathBuf::from(\"ram:///indexes/bar\")],\n                failures: HashMap::from_iter([(\n                    PathBuf::from(\"ram:///indexes/baz\"),\n                    DeleteFailure::default(),\n                )]),\n            };\n            let stripped_error = 
strip_prefix_from_error(error, Path::new(\"ram:///indexes/\"));\n\n            assert_eq!(stripped_error.successes, vec![PathBuf::from(\"foo\")],);\n            assert_eq!(stripped_error.unattempted, vec![PathBuf::from(\"bar\")],);\n            assert_eq!(\n                stripped_error.failures.keys().next().unwrap(),\n                &PathBuf::from(\"baz\"),\n            );\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/ram_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::io::Cursor;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::{Protocol, Uri};\nuse quickwit_config::StorageBackend;\nuse tokio::io::{AsyncRead, AsyncWriteExt};\nuse tokio::sync::RwLock;\n\nuse crate::prefix_storage::add_prefix_to_storage;\nuse crate::storage::SendableAsync;\nuse crate::{\n    BulkDeleteError, OwnedBytes, Storage, StorageErrorKind, StorageFactory, StorageResolverError,\n    StorageResult,\n};\n\n/// In Ram implementation of quickwit's storage.\n///\n/// This implementation is mostly useful in unit tests.\n#[derive(Clone)]\npub struct RamStorage {\n    uri: Uri,\n    files: Arc<RwLock<HashMap<PathBuf, OwnedBytes>>>,\n}\n\nimpl fmt::Debug for RamStorage {\n    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {\n        formatter\n            .debug_struct(\"RamStorage\")\n            .field(\"uri\", &self.uri)\n            .finish()\n    }\n}\n\nimpl Default for RamStorage {\n    fn default() -> Self {\n        Self {\n            uri: Uri::for_test(\"ram:///\"),\n            files: Arc::new(RwLock::new(HashMap::new())),\n        }\n    }\n}\n\nimpl RamStorage {\n    /// Creates a [`RamStorageBuilder`]\n    pub fn builder() -> RamStorageBuilder {\n        
RamStorageBuilder::default()\n    }\n\n    async fn put_data(&self, path: &Path, payload: OwnedBytes) {\n        self.files.write().await.insert(path.to_path_buf(), payload);\n    }\n\n    async fn get_data(&self, path: &Path) -> Option<OwnedBytes> {\n        self.files.read().await.get(path).cloned()\n    }\n\n    /// Returns the list of files that are present in the RamStorage.\n    pub async fn list_files(&self) -> Vec<PathBuf> {\n        self.files.read().await.keys().cloned().collect()\n    }\n}\n\n#[async_trait]\nimpl Storage for RamStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        Ok(())\n    }\n\n    async fn put(\n        &self,\n        path: &Path,\n        payload: Box<dyn crate::PutPayload>,\n    ) -> crate::StorageResult<()> {\n        let payload_bytes = payload.read_all().await?;\n        self.put_data(path, payload_bytes).await;\n        Ok(())\n    }\n\n    async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()> {\n        let payload_bytes = self.get_data(path).await.ok_or_else(|| {\n            StorageErrorKind::NotFound\n                .with_error(anyhow::anyhow!(\"failed to find dest_path {:?}\", path))\n        })?;\n        output.write_all(&payload_bytes).await?;\n        output.flush().await?;\n        Ok(())\n    }\n\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        let payload_bytes = self.get_data(path).await.ok_or_else(|| {\n            StorageErrorKind::NotFound\n                .with_error(anyhow::anyhow!(\"failed to find dest_path {:?}\", path))\n        })?;\n        Ok(payload_bytes.slice(range.start..range.end))\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        let bytes = self.get_slice(path, range).await?;\n        Ok(Box::new(Cursor::new(bytes)))\n    }\n\n    async fn 
delete(&self, path: &Path) -> StorageResult<()> {\n        self.files.write().await.remove(path);\n        Ok(())\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        let mut files = self.files.write().await;\n        for &path in paths {\n            files.remove(path);\n        }\n        Ok(())\n    }\n\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        let payload_bytes = self.get_data(path).await.ok_or_else(|| {\n            StorageErrorKind::NotFound\n                .with_error(anyhow::anyhow!(\"failed to find dest_path {:?}\", path))\n        })?;\n        Ok(payload_bytes)\n    }\n\n    fn uri(&self) -> &Uri {\n        &self.uri\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        if let Some(file_bytes) = self.files.read().await.get(path) {\n            Ok(file_bytes.len() as u64)\n        } else {\n            let err = anyhow::anyhow!(\"missing file `{}`\", path.display());\n            Err(StorageErrorKind::NotFound.with_error(err))\n        }\n    }\n}\n\n/// Builder to create a prepopulated [`RamStorage`]. 
This is mostly useful for tests.\n#[derive(Default)]\npub struct RamStorageBuilder {\n    files: HashMap<PathBuf, OwnedBytes>,\n}\n\nimpl RamStorageBuilder {\n    /// Adds a new file into the [`RamStorageBuilder`].\n    pub fn put(mut self, path: &str, payload: &[u8]) -> Self {\n        self.files\n            .insert(PathBuf::from(path), OwnedBytes::new(payload.to_vec()));\n        self\n    }\n\n    /// Finalizes the [`RamStorage`] creation.\n    pub fn build(self) -> RamStorage {\n        RamStorage {\n            uri: Uri::for_test(\"ram:///\"),\n            files: Arc::new(RwLock::new(self.files)),\n        }\n    }\n}\n\n/// Storage resolver for [`RamStorage`].\npub struct RamStorageFactory {\n    ram_storage: Arc<dyn Storage>,\n}\n\nimpl Default for RamStorageFactory {\n    fn default() -> Self {\n        RamStorageFactory {\n            ram_storage: Arc::new(RamStorage::default()),\n        }\n    }\n}\n\n#[async_trait]\nimpl StorageFactory for RamStorageFactory {\n    fn backend(&self) -> StorageBackend {\n        StorageBackend::Ram\n    }\n\n    async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError> {\n        match uri.filepath() {\n            Some(prefix) if uri.protocol() == Protocol::Ram => Ok(add_prefix_to_storage(\n                self.ram_storage.clone(),\n                prefix.to_path_buf(),\n                uri.clone(),\n            )),\n            _ => {\n                let message = format!(\"URI `{uri}` is not a valid RAM URI\");\n                Err(StorageResolverError::InvalidUri(message))\n            }\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use super::*;\n    use crate::test_suite::storage_test_suite;\n\n    #[tokio::test]\n    async fn test_storage() -> anyhow::Result<()> {\n        let mut ram_storage = RamStorage::default();\n        storage_test_suite(&mut ram_storage).await?;\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_ram_storage_factory() {\n        let 
ram_storage_factory = RamStorageFactory::default();\n        let ram_uri = Uri::for_test(\"s3:///foo\");\n        let err = ram_storage_factory.resolve(&ram_uri).await.err().unwrap();\n        assert!(matches!(err, StorageResolverError::InvalidUri { .. }));\n\n        let data_uri = Uri::for_test(\"ram:///data\");\n        let data_storage = ram_storage_factory.resolve(&data_uri).await.ok().unwrap();\n        let home_uri = Uri::for_test(\"ram:///home\");\n        let home_storage = ram_storage_factory.resolve(&home_uri).await.ok().unwrap();\n        assert_ne!(data_storage.uri(), home_storage.uri());\n\n        let data_storage_two = ram_storage_factory.resolve(&data_uri).await.ok().unwrap();\n        assert_eq!(data_storage.uri(), data_storage_two.uri());\n    }\n\n    #[tokio::test]\n    async fn test_ram_storage_builder() -> anyhow::Result<()> {\n        let storage = RamStorage::builder()\n            .put(\"path1\", b\"path1_payload\")\n            .put(\"path2\", b\"path2_payload\")\n            .put(\"path1\", b\"path1_payloadb\")\n            .build();\n        assert_eq!(\n            &storage.get_all(Path::new(\"path1\")).await?,\n            &b\"path1_payloadb\"[..]\n        );\n        assert_eq!(\n            &storage.get_all(Path::new(\"path2\")).await?,\n            &b\"path2_payload\"[..]\n        );\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/split.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::io;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::pin::Pin;\n\nuse async_trait::async_trait;\nuse aws_sdk_s3::primitives::{ByteStream, FsBuilder, Length, SdkBody};\nuse futures::{Stream, StreamExt, stream};\nuse hyper::body::{Bytes, Frame};\nuse pin_project::pin_project;\nuse quickwit_common::shared_consts::SPLIT_FIELDS_FILE_NAME;\n\nuse crate::bundle_storage::BundleStorageFileOffsetsVersions;\nuse crate::{BundleStorageFileOffsets, PutPayload, VersionedComponent};\n\n/// Payload of a split which builds the split bundle and hotcache on the fly and streams it to the\n/// storage.\n#[derive(Clone)]\npub struct SplitPayload {\n    payloads: Vec<Box<dyn PutPayload>>,\n    /// bytes range of the footer (hotcache + bundle metadata)\n    pub footer_range: Range<u64>,\n}\n\nasync fn range_byte_stream_from_payloads(\n    payloads: &[Box<dyn PutPayload>],\n    range: Range<u64>,\n) -> io::Result<ByteStream> {\n    let mut bytestreams: Vec<ByteStream> = Vec::new();\n\n    let payloads_and_ranges =\n        chunk_payload_ranges(payloads, range.start as usize..range.end as usize);\n\n    for (payload, range) in payloads_and_ranges {\n        bytestreams.push(\n            payload\n                .range_byte_stream(range.start as u64..range.end as u64)\n                .await?,\n        );\n    
}\n\n    let body = stream::iter(bytestreams)\n        .map(StreamAdaptor)\n        .flatten()\n        .map(|result| result.map(Frame::data));\n    let stream_body = http_body_util::StreamBody::new(body);\n    let concat_stream = ByteStream::new(SdkBody::from_body_1_x(stream_body));\n    Ok(concat_stream)\n}\n\n// With sdk 1.0, ByteStream no longer implement Stream, despite having analogous functions\n// this adaptor is just meant to make it implement Stream for places where we really need it\n#[pin_project]\nstruct StreamAdaptor(#[pin] ByteStream);\n\nimpl Stream for StreamAdaptor {\n    type Item = Result<Bytes, aws_smithy_types::byte_stream::error::Error>;\n\n    fn poll_next(\n        self: Pin<&mut Self>,\n        ctx: &mut std::task::Context<'_>,\n    ) -> std::task::Poll<Option<Self::Item>> {\n        self.project().0.poll_next(ctx)\n    }\n\n    fn size_hint(&self) -> (usize, Option<usize>) {\n        let (lower_bound_u64, upper_bound_u64) = self.0.size_hint();\n        // if conversion fails, it means lower_bound is too large to fit in an usize on this\n        // platform. When that's the case, we return usize::MAX as best effort. 
Any value is valid,\n        // but MAX is the most informative.\n        let lower_bound = lower_bound_u64.try_into().unwrap_or(usize::MAX);\n        // for the upperbound, if conversion fails, we just say the upper bound is unknown\n        let upper_bound =\n            upper_bound_u64.and_then(|upper_bound_u64| upper_bound_u64.try_into().ok());\n        (lower_bound, upper_bound)\n    }\n}\n\n#[async_trait]\nimpl PutPayload for SplitPayload {\n    fn len(&self) -> u64 {\n        self.payloads.iter().map(|payload| payload.len()).sum()\n    }\n\n    async fn range_byte_stream(&self, range: Range<u64>) -> io::Result<ByteStream> {\n        range_byte_stream_from_payloads(&self.payloads, range).await\n    }\n}\n\n#[derive(Clone)]\nstruct FilePayload {\n    len: u64,\n    path: PathBuf,\n}\n\n#[async_trait]\nimpl PutPayload for FilePayload {\n    fn len(&self) -> u64 {\n        self.len\n    }\n\n    async fn range_byte_stream(&self, range: Range<u64>) -> io::Result<ByteStream> {\n        assert!(!range.is_empty());\n        assert!(range.end <= self.len);\n\n        let len = range.end - range.start;\n        let mut fs_builder = FsBuilder::new().path(&self.path);\n\n        if range.start > 0 {\n            fs_builder = fs_builder.offset(range.start);\n        }\n        fs_builder = fs_builder.length(Length::Exact(len));\n\n        fs_builder\n            .build()\n            .await\n            .map_err(|error| io::Error::other(format!(\"failed to create byte stream: {error}\")))\n    }\n}\n\n/// SplitPayloadBuilder is used to create a `SplitPayload`.\n#[derive(Default)]\npub struct SplitPayloadBuilder {\n    /// File name, payload, and range of the payload in the bundle file\n    /// Range could be computed on the fly, and is just kept here for convenience.\n    payloads: Vec<(String, Box<dyn PutPayload>, Range<u64>)>,\n    current_offset: usize,\n}\n\nimpl SplitPayloadBuilder {\n    /// Creates a new SplitPayloadBuilder for given files and hotcache.\n    pub 
fn get_split_payload(\n        split_files: &[PathBuf],\n        serialized_split_fields: &[u8],\n        hotcache: &[u8],\n    ) -> anyhow::Result<SplitPayload> {\n        let mut split_payload_builder = SplitPayloadBuilder::default();\n        for file in split_files {\n            split_payload_builder.add_file(file)?;\n        }\n        split_payload_builder.add_payload(\n            SPLIT_FIELDS_FILE_NAME.to_string(),\n            Box::new(serialized_split_fields.to_vec()),\n        );\n        let offsets = split_payload_builder.finalize(hotcache)?;\n        Ok(offsets)\n    }\n\n    /// Adds the payload to the bundle file.\n    pub fn add_payload(&mut self, file_name: String, payload: Box<dyn PutPayload>) {\n        let range = self.current_offset as u64..self.current_offset as u64 + payload.len();\n        self.current_offset += payload.len() as usize;\n        self.payloads.push((file_name, payload, range));\n    }\n\n    /// Adds the file to the bundle file.\n    pub fn add_file(&mut self, path: &Path) -> io::Result<()> {\n        let file = std::fs::metadata(path)?;\n        let file_name = path\n            .file_name()\n            .and_then(std::ffi::OsStr::to_str)\n            .map(ToOwned::to_owned)\n            .ok_or_else(|| {\n                io::Error::new(\n                    io::ErrorKind::InvalidData,\n                    format!(\"Invalid file name in path {path:?}\"),\n                )\n            })?;\n\n        let file_payload = FilePayload {\n            path: path.to_owned(),\n            len: file.len(),\n        };\n\n        self.add_payload(file_name, Box::new(file_payload));\n\n        Ok(())\n    }\n\n    /// Writes the bundle file offsets metadata at the end of the bundle file,\n    /// and returns the byte-range of this metadata information.\n    pub fn finalize(self, hotcache: &[u8]) -> anyhow::Result<SplitPayload> {\n        // Add the fields metadata to the bundle metadata.\n        // Build the footer.\n        let 
metadata_with_fixed_paths = self\n            .payloads\n            .iter()\n            .map(|(file_name, _, range)| {\n                let file_name = PathBuf::from(file_name);\n                Ok((file_name, range.start..range.end))\n            })\n            .collect::<Result<HashMap<_, _>, anyhow::Error>>()?;\n\n        let bundle_storage_file_offsets = BundleStorageFileOffsets {\n            files: metadata_with_fixed_paths,\n        };\n        let metadata_json =\n            BundleStorageFileOffsetsVersions::serialize(&bundle_storage_file_offsets);\n\n        // The hotcache needs to be the next to the metadata in order to be able to read both\n        // in one continuous read.\n        let mut footer_bytes = Vec::new();\n        footer_bytes.extend(&metadata_json);\n        footer_bytes.extend((metadata_json.len() as u32).to_le_bytes());\n        footer_bytes.extend(hotcache);\n        footer_bytes.extend((hotcache.len() as u32).to_le_bytes());\n\n        let mut payloads: Vec<Box<dyn PutPayload>> = self\n            .payloads\n            .into_iter()\n            .map(|(_, payload, _)| payload)\n            .collect();\n\n        payloads.push(Box::new(footer_bytes.to_vec()));\n\n        Ok(SplitPayload {\n            payloads,\n            footer_range: self.current_offset as u64\n                ..self.current_offset as u64 + footer_bytes.len() as u64,\n        })\n    }\n}\n\n/// Returns the payloads with their absolute ranges.\nfn get_payloads_with_absolute_range(\n    payloads: &[Box<dyn PutPayload>],\n) -> Vec<(Box<dyn PutPayload>, Range<usize>)> {\n    let mut current = 0;\n    payloads\n        .iter()\n        .map(|payload| {\n            let start = current;\n            current += payload.len();\n            (payload.clone(), start as usize..current as usize)\n        })\n        .collect()\n}\n\nfn get_ranges_overlap(range1: &Range<usize>, range2: &Range<usize>) -> Range<usize> {\n    
range1.start.max(range2.start)..range1.end.min(range2.end)\n}\n\n// Returns payloads and their relative ranges for an absolute range.\nfn chunk_payload_ranges(\n    payloads: &[Box<dyn PutPayload>],\n    range: Range<usize>,\n) -> Vec<(Box<dyn PutPayload>, Range<usize>)> {\n    let mut ranges = Vec::new();\n    for (payload, payload_absolute_range) in get_payloads_with_absolute_range(payloads) {\n        let absolute_range_overlap = get_ranges_overlap(&payload_absolute_range, &range);\n        if !absolute_range_overlap.is_empty() {\n            // Push the range relative to this payload as we will read from it.\n            ranges.push((\n                payload.clone(),\n                (absolute_range_overlap.start - payload_absolute_range.start)\n                    ..(absolute_range_overlap.end - payload_absolute_range.start),\n            ));\n        }\n    }\n    ranges\n}\n\n#[cfg(test)]\nmod tests {\n    use std::fs::File;\n    use std::io::Write;\n\n    use super::*;\n\n    #[tokio::test]\n    async fn test_split_offset_computer() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"f1\");\n        let test_filepath2 = temp_dir.path().join(\"f2\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(b\"hello\")?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(b\"world\")?;\n\n        let split_payload =\n            SplitPayloadBuilder::get_split_payload(&[test_filepath1, test_filepath2], &[], b\"abc\")?;\n\n        assert_eq!(split_payload.len(), 128);\n\n        Ok(())\n    }\n\n    #[cfg(test)]\n    async fn fetch_data(\n        split_streamer: &SplitPayload,\n        range: Range<u64>,\n    ) -> anyhow::Result<Vec<u8>> {\n        use tokio::io::AsyncReadExt as _;\n\n        let mut data = Vec::new();\n        split_streamer\n            .range_byte_stream(range)\n            .await?\n            
.into_async_read()\n            .read_to_end(&mut data)\n            .await?;\n        Ok(data)\n    }\n\n    #[test]\n    fn test_chunk_payloads() -> anyhow::Result<()> {\n        let payloads: Vec<Box<dyn PutPayload>> = vec![\n            Box::new(vec![1, 2, 3]),\n            Box::new(vec![4, 5, 6]),\n            Box::new(vec![7, 8, 9, 10]),\n        ];\n\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 0..1)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![0..1]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 0..2)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![0..2]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 1..2)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![1..2]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 2..3)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![2..3]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 0..6)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![0..3, 0..3]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 0..5)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![0..3, 0..2]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 3..6)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![0..3]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 4..6)\n                .iter()\n                .map(|el| 
el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![1..3]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 5..6)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![2..3]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 2..6)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![2..3, 0..3]\n        );\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 2..5)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![2..3, 0..2]\n        );\n\n        assert_eq!(\n            chunk_payload_ranges(&payloads, 7..8)\n                .iter()\n                .map(|el| el.1.clone())\n                .collect::<Vec<_>>(),\n            vec![1..2]\n        );\n\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_split_streamer() -> anyhow::Result<()> {\n        let temp_dir = tempfile::tempdir()?;\n        let test_filepath1 = temp_dir.path().join(\"a\");\n        let test_filepath2 = temp_dir.path().join(\"b\");\n\n        let mut file1 = File::create(&test_filepath1)?;\n        file1.write_all(&[123, 76])?;\n\n        let mut file2 = File::create(&test_filepath2)?;\n        file2.write_all(&[99, 55, 44])?;\n\n        let split_streamer = SplitPayloadBuilder::get_split_payload(\n            &[test_filepath1.clone(), test_filepath2.clone()],\n            &[],\n            &[1, 2, 3],\n        )?;\n\n        // border case 1 exact start of first block\n        assert_eq!(fetch_data(&split_streamer, 0..1).await?, vec![123]);\n        assert_eq!(fetch_data(&split_streamer, 0..2).await?, vec![123, 76]);\n        assert_eq!(fetch_data(&split_streamer, 0..3).await?, vec![123, 76, 99]);\n\n        // border 2 case skip and take cross adjacent blocks\n        
assert_eq!(fetch_data(&split_streamer, 1..3).await?, vec![76, 99]);\n\n        // border case 3 skip and take in separate blocks with full block between\n        assert_eq!(\n            fetch_data(&split_streamer, 1..6).await?,\n            vec![76, 99, 55, 44, 174]\n        );\n\n        // border case 4 exact middle block\n        assert_eq!(fetch_data(&split_streamer, 2..5).await?, vec![99, 55, 44]);\n\n        // border case 5, no skip but take in middle block\n        assert_eq!(fetch_data(&split_streamer, 2..4).await?, vec![99, 55]);\n\n        // border case 6 skip and take in middle block\n        assert_eq!(fetch_data(&split_streamer, 3..4).await?, vec![55]);\n\n        // border case 7 start exact last block - footer\n        assert_eq!(\n            fetch_data(&split_streamer, 5..10).await?,\n            vec![174, 190, 18, 24, 1]\n        );\n        // border case 8 skip and take in last block - footer\n        assert_eq!(\n            fetch_data(&split_streamer, 6..10).await?,\n            vec![190, 18, 24, 1]\n        );\n\n        let total_len = split_streamer.len();\n        let all_data = fetch_data(&split_streamer, 0..total_len).await?;\n\n        // last 4 bytes are the length of the hotcache bytes (little-endian u32)\n        assert_eq!(all_data[all_data.len() - 4..], 3_u32.to_le_bytes());\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/split_cache/download_task.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroU32;\nuse std::path::Path;\nuse std::sync::Arc;\nuse std::time::Duration;\n\nuse quickwit_common::split_file;\nuse tokio::sync::{OwnedSemaphorePermit, Semaphore};\n\nuse crate::split_cache::split_table::{CandidateSplit, DownloadOpportunity};\nuse crate::{SplitCache, StorageResolver};\n\nasync fn download_split(\n    root_path: &Path,\n    candidate_split: &CandidateSplit,\n    storage_resolver: StorageResolver,\n) -> anyhow::Result<u64> {\n    let CandidateSplit {\n        split_ulid,\n        storage_uri,\n        living_token: _,\n    } = candidate_split;\n    let split_filename = split_file(*split_ulid);\n    let target_filepath = root_path.join(&split_filename);\n    let storage = storage_resolver.resolve(storage_uri).await?;\n    let num_bytes = storage\n        .copy_to_file(Path::new(&split_filename), &target_filepath)\n        .await?;\n    Ok(num_bytes)\n}\n\nasync fn perform_eviction_and_download(\n    download_opportunity: DownloadOpportunity,\n    split_cache: Arc<SplitCache>,\n    storage_resolver: StorageResolver,\n    _download_permit: OwnedSemaphorePermit,\n) -> anyhow::Result<()> {\n    let DownloadOpportunity {\n        splits_to_delete,\n        split_to_download,\n    } = download_opportunity;\n    let split_ulid = split_to_download.split_ulid;\n    // tokio io runs on `spawn_blocking` threads anyway.\n  
  let split_cache_clone = split_cache.clone();\n    let _ = tokio::task::spawn_blocking(move || {\n        split_cache_clone.evict(&splits_to_delete[..]);\n    })\n    .await;\n    let num_bytes =\n        download_split(&split_cache.root_path, &split_to_download, storage_resolver).await?;\n    let mut shared_split_table_lock = split_cache.split_table.lock().unwrap();\n    shared_split_table_lock.register_as_downloaded(split_ulid, num_bytes);\n    Ok(())\n}\n\npub(crate) fn spawn_download_task(\n    split_cache: Arc<SplitCache>,\n    storage_resolver: StorageResolver,\n    num_concurrent_downloads: NonZeroU32,\n) {\n    let semaphore = Arc::new(Semaphore::new(num_concurrent_downloads.get() as usize));\n    tokio::task::spawn(async move {\n        loop {\n            let download_permit = Semaphore::acquire_owned(semaphore.clone()).await.unwrap();\n            let download_opportunity_opt = split_cache\n                .split_table\n                .lock()\n                .unwrap()\n                .find_download_opportunity();\n            if let Some(download_opportunity) = download_opportunity_opt {\n                let split_cache_clone = split_cache.clone();\n                tokio::task::spawn(perform_eviction_and_download(\n                    download_opportunity,\n                    split_cache_clone,\n                    storage_resolver.clone(),\n                    download_permit,\n                ));\n            } else {\n                // We wait 1 sec before retrying, to avoid wasting CPU.\n                tokio::time::sleep(Duration::from_secs(1)).await;\n            }\n        }\n    });\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/split_cache/mod.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nmod download_task;\nmod split_table;\n\nuse std::collections::BTreeMap;\nuse std::ffi::OsStr;\nuse std::io;\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\nuse std::str::FromStr;\nuse std::sync::{Arc, Mutex};\n\nuse async_trait::async_trait;\nuse quickwit_common::split_file;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::SplitCacheLimits;\nuse quickwit_proto::search::ReportSplit;\nuse tantivy::directory::OwnedBytes;\nuse tracing::{error, info, instrument, warn};\nuse ulid::Ulid;\n\nuse crate::file_descriptor_cache::{FileDescriptorCache, SplitFile};\nuse crate::split_cache::download_task::spawn_download_task;\nuse crate::split_cache::split_table::SplitTable;\nuse crate::{Storage, StorageCache, wrap_storage_with_cache};\n\n/// On disk Cache of splits for searchers.\n///\n/// The search acts receives reports of splits.\npub struct SplitCache {\n    // Directory containing the cached split files.\n    // Split ids are universally unique, so we all put them in the same directory.\n    root_path: PathBuf,\n    // In memory structure, listing the splits we know about regardless\n    // of whether they are in cache, being downloaded, or just available for download.\n    split_table: Mutex<SplitTable>,\n    fd_cache: FileDescriptorCache,\n}\n\nimpl SplitCache {\n    /// Creates a new SplitCache and spawns the task that will continuously 
search for\n    /// download opportunities.\n    pub fn with_root_path(\n        root_path: PathBuf,\n        storage_resolver: crate::StorageResolver,\n        limits: SplitCacheLimits,\n    ) -> io::Result<Arc<SplitCache>> {\n        std::fs::create_dir_all(&root_path)?;\n        let mut existing_splits: BTreeMap<Ulid, u64> = Default::default();\n        for dir_entry_res in std::fs::read_dir(&root_path)? {\n            let dir_entry = dir_entry_res?;\n            let path = dir_entry.path();\n            let meta = std::fs::metadata(&path)?;\n            if meta.is_dir() {\n                continue;\n            }\n            let ext = path.extension().and_then(OsStr::to_str).unwrap_or(\"\");\n            match ext {\n                \"temp\" => {\n                    // This file is a temporary file that was being downloaded, when Quickwit was\n                    // stopped (killed for instance) in a way that prevented\n                    // their cleanup. It is important to remove it.\n                    if let Err(io_err) = std::fs::remove_file(&path)\n                        && io_err.kind() != io::ErrorKind::NotFound\n                    {\n                        error!(path=?path, \"failed to remove temporary file\");\n                    }\n                }\n                \"split\" => {\n                    if let Some(split_ulid) = split_id_from_path(&path) {\n                        existing_splits.insert(split_ulid, meta.len());\n                    } else {\n                        warn!(path=%path.display(), \".split file with invalid ulid in split cache directory, ignoring\");\n                    }\n                }\n                _ => {\n                    warn!(path=%path.display(), \"unknown file in split cache directory, ignoring\");\n                }\n            }\n        }\n        let mut split_table = SplitTable::with_limits_and_existing_splits(limits, existing_splits);\n\n        // In case of a setting change, it could be 
useful to evict some splits on startup.\n        let splits_to_remove_res = split_table.make_room_for_split_if_necessary(u64::MAX);\n        if let Ok(splits_to_remove) = splits_to_remove_res {\n            info!(\n                num_splits = splits_to_remove.len(),\n                \"Evicting splits from the searcher cache. Has the node configuration changed?\"\n            );\n            delete_evicted_splits(&root_path, &splits_to_remove[..]);\n        }\n        let fd_cache = FileDescriptorCache::with_fd_cache_capacity(limits.max_file_descriptors);\n        let split_cache = Arc::new(SplitCache {\n            root_path,\n            split_table: Mutex::new(split_table),\n            fd_cache,\n        });\n\n        spawn_download_task(\n            split_cache.clone(),\n            storage_resolver,\n            limits.num_concurrent_downloads,\n        );\n\n        Ok(split_cache)\n    }\n\n    /// Remove splits from both the fd cache and the split cache.\n    /// This method does NOT update the split table.\n    pub(crate) fn evict(&self, splits_to_evict: &[Ulid]) {\n        self.fd_cache.evict_split_files(splits_to_evict);\n        delete_evicted_splits(&self.root_path, splits_to_evict);\n    }\n\n    /// Wraps a storage with our split cache.\n    pub fn wrap_storage(self_arc: Arc<Self>, storage: Arc<dyn Storage>) -> Arc<dyn Storage> {\n        let cache = Arc::new(SplitCacheBackingStorage {\n            split_cache: self_arc,\n            storage_root_uri: storage.uri().clone(),\n        });\n        wrap_storage_with_cache(cache, storage)\n    }\n\n    /// Report the split cache about the existence of new splits.\n    pub fn report_splits(&self, report_splits: Vec<ReportSplit>) {\n        let mut split_table = self.split_table.lock().unwrap();\n        for report_split in report_splits {\n            let Ok(split_ulid) = Ulid::from_str(&report_split.split_id) else {\n                error!(split_id=%report_split.split_id, \"received invalid split 
ulid: ignoring\");\n                continue;\n            };\n            let Ok(storage_uri) = Uri::from_str(&report_split.storage_uri) else {\n                error!(storage_uri=%report_split.storage_uri, \"received invalid storage uri: ignoring\");\n                continue;\n            };\n            split_table.report(split_ulid, storage_uri);\n        }\n    }\n\n    // Returns a split guard object. As long as it is not dropped, the\n    // split won't be evicted from the cache.\n    async fn get_split_file(&self, split_id: Ulid, storage_uri: &Uri) -> Option<SplitFile> {\n        // We touch before even checking the fd cache in order to update the file's last access time\n        // for the file cache.\n        let num_bytes_opt: Option<u64> = self\n            .split_table\n            .lock()\n            .unwrap()\n            .touch(split_id, storage_uri);\n\n        let num_bytes = num_bytes_opt?;\n        self.fd_cache\n            .get_or_open_split_file(&self.root_path, split_id, num_bytes)\n            .await\n            .ok()\n    }\n}\n\n/// Removes the evicted split files from the file system.\n/// This function just logs errors, and swallows them.\n///\n/// At this point, the disk space is already accounted as released,\n/// so the error could result in a \"disk space leak\".\n#[instrument]\nfn delete_evicted_splits(root_path: &Path, splits_to_delete: &[Ulid]) {\n    for &split_to_delete in splits_to_delete {\n        let split_file_path = root_path.join(split_file(split_to_delete));\n        if let Err(_io_err) = std::fs::remove_file(&split_file_path) {\n            // This is a pretty critical error. The split size is not tracked anymore at this\n            // point.\n            error!(path=%split_file_path.display(), \"failed to remove split file from cache directory. 
This is critical as the file is now not taken into account in the cache size limits\");\n        }\n    }\n}\n\nfn split_id_from_path(split_path: &Path) -> Option<Ulid> {\n    let split_filename = split_path.file_name()?.to_str()?;\n    let split_id_str = split_filename.strip_suffix(\".split\")?;\n    Ulid::from_str(split_id_str).ok()\n}\n\nstruct SplitCacheBackingStorage {\n    split_cache: Arc<SplitCache>,\n    storage_root_uri: Uri,\n}\n\nimpl SplitCacheBackingStorage {\n    async fn get_impl(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        let split_id = split_id_from_path(path)?;\n        let split_file: SplitFile = self\n            .split_cache\n            .get_split_file(split_id, &self.storage_root_uri)\n            .await?;\n        split_file.get_range(byte_range).await.ok()\n    }\n\n    async fn get_all_impl(&self, path: &Path) -> Option<OwnedBytes> {\n        let split_id = split_id_from_path(path)?;\n        let split_file = self\n            .split_cache\n            .get_split_file(split_id, &self.storage_root_uri)\n            .await?;\n        split_file.get_all().await.ok()\n    }\n\n    fn record_hit_metrics(&self, result_opt: Option<&OwnedBytes>) {\n        let split_metrics = &crate::STORAGE_METRICS.searcher_split_cache.cache_metrics;\n        if let Some(result) = result_opt {\n            split_metrics.hits_num_items.inc();\n            split_metrics.hits_num_bytes.inc_by(result.len() as u64);\n        } else {\n            split_metrics.misses_num_items.inc();\n        }\n    }\n}\n\n#[async_trait]\nimpl StorageCache for SplitCacheBackingStorage {\n    async fn get(&self, path: &Path, byte_range: Range<usize>) -> Option<OwnedBytes> {\n        let result = self.get_impl(path, byte_range).await;\n        self.record_hit_metrics(result.as_ref());\n        result\n    }\n\n    async fn get_all(&self, path: &Path) -> Option<OwnedBytes> {\n        let result = self.get_all_impl(path).await;\n        
self.record_hit_metrics(result.as_ref());\n        result\n    }\n\n    async fn put(&self, _path: PathBuf, _byte_range: Range<usize>, _bytes: OwnedBytes) {}\n    async fn put_all(&self, _path: PathBuf, _bytes: OwnedBytes) {}\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/split_cache/split_table.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::cmp::Ordering;\nuse std::collections::{BTreeMap, BTreeSet, HashMap};\nuse std::sync::{Arc, Weak};\nuse std::time::{Duration, Instant};\n\nuse quickwit_common::uri::Uri;\nuse quickwit_config::SplitCacheLimits;\nuse ulid::Ulid;\n\ntype LastAccessDate = u64;\n\n/// Maximum number of splits to track.\nconst MAX_NUM_CANDIDATES: usize = 1_000;\n\n/// Splits that are freshly reported get a last access time of `now - NEWLY_REPORT_SPLIT_LAST_TIME`.\nconst NEWLY_REPORTED_SPLIT_LAST_TIME: Duration = Duration::from_secs(60 * 10); // 10mn\n\n#[derive(Clone, Copy)]\npub(crate) struct SplitKey {\n    pub last_accessed: LastAccessDate,\n    pub split_ulid: Ulid,\n}\n\nimpl PartialOrd for SplitKey {\n    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {\n        Some(self.cmp(other))\n    }\n}\n\nimpl Ord for SplitKey {\n    fn cmp(&self, other: &Self) -> Ordering {\n        (self.last_accessed, &self.split_ulid).cmp(&(other.last_accessed, &other.split_ulid))\n    }\n}\n\nimpl PartialEq for SplitKey {\n    fn eq(&self, other: &Self) -> bool {\n        (self.last_accessed, &self.split_ulid) == (other.last_accessed, &other.split_ulid)\n    }\n}\n\nimpl Eq for SplitKey {}\n\n#[derive(Clone, Debug)]\nenum Status {\n    Candidate(CandidateSplit),\n    Downloading { alive_token: Weak<()> },\n    OnDisk { num_bytes: u64 },\n}\n\nimpl PartialEq for 
Status {\n    fn eq(&self, other: &Status) -> bool {\n        match (self, other) {\n            (Status::Candidate(candidate_split), Status::Candidate(other_candidate_split)) => {\n                candidate_split == other_candidate_split\n            }\n            (Status::Downloading { .. }, Status::Downloading { .. }) => true,\n            (\n                Status::OnDisk { num_bytes },\n                Status::OnDisk {\n                    num_bytes: other_num_bytes,\n                },\n            ) => num_bytes == other_num_bytes,\n            _ => false,\n        }\n    }\n}\n\npub struct SplitInfo {\n    pub(crate) split_key: SplitKey,\n    status: Status,\n}\n\n/// The split table keeps track of splits we know about (regardless of whether they have already\n/// been downloaded or not).\n///\n/// Invariant:\n/// Each split appearing into split_to_status, should be listed 1 and exactly once in the\n/// either\n/// - on_disk_splits\n/// - downloading_splits\n/// - candidate_splits.\n///\n/// It is possible for the split table size in bytes to exceed its limits, by at\n/// most one split.\npub struct SplitTable {\n    on_disk_splits: BTreeSet<SplitKey>,\n    downloading_splits: BTreeSet<SplitKey>,\n    candidate_splits: BTreeSet<SplitKey>,\n    split_to_status: HashMap<Ulid, SplitInfo>,\n    origin_time: Instant,\n    limits: SplitCacheLimits,\n    on_disk_bytes: u64,\n}\n\nimpl SplitTable {\n    pub(crate) fn with_limits_and_existing_splits(\n        limits: SplitCacheLimits,\n        existing_filepaths: BTreeMap<Ulid, u64>,\n    ) -> SplitTable {\n        let origin_time = Instant::now() - NEWLY_REPORTED_SPLIT_LAST_TIME;\n        let mut split_table = SplitTable {\n            on_disk_splits: BTreeSet::default(),\n            candidate_splits: BTreeSet::default(),\n            downloading_splits: BTreeSet::default(),\n            split_to_status: HashMap::default(),\n            origin_time,\n            limits,\n            on_disk_bytes: 0u64,\n        
};\n        split_table.acknowledge_on_disk_splits(existing_filepaths);\n        split_table\n    }\n\n    fn acknowledge_on_disk_splits(&mut self, existing_filepaths: BTreeMap<Ulid, u64>) {\n        for (split_ulid, num_bytes) in existing_filepaths {\n            let split_info = SplitInfo {\n                split_key: SplitKey {\n                    last_accessed: 0,\n                    split_ulid,\n                },\n                status: Status::OnDisk { num_bytes },\n            };\n            self.insert(split_info);\n        }\n    }\n}\n\nfn compute_timestamp(start: Instant) -> LastAccessDate {\n    start.elapsed().as_micros() as u64\n}\n\nimpl SplitTable {\n    fn remove(&mut self, split_ulid: Ulid) -> Option<SplitInfo> {\n        let split_info = self.split_to_status.remove(&split_ulid)?;\n        let split_queue: &mut BTreeSet<SplitKey> = match split_info.status {\n            Status::Candidate { .. } => &mut self.candidate_splits,\n            Status::Downloading { .. } => &mut self.downloading_splits,\n            Status::OnDisk { num_bytes } => {\n                self.on_disk_bytes -= num_bytes;\n                crate::metrics::STORAGE_METRICS\n                    .searcher_split_cache\n                    .cache_metrics\n                    .in_cache_count\n                    .dec();\n                crate::metrics::STORAGE_METRICS\n                    .searcher_split_cache\n                    .cache_metrics\n                    .in_cache_num_bytes\n                    .sub(num_bytes as i64);\n                crate::metrics::STORAGE_METRICS\n                    .searcher_split_cache\n                    .cache_metrics\n                    .evict_num_items\n                    .inc();\n                crate::metrics::STORAGE_METRICS\n                    .searcher_split_cache\n                    .cache_metrics\n                    .evict_num_bytes\n                    .inc_by(num_bytes);\n                &mut self.on_disk_splits\n            
}\n        };\n        let is_in_queue = split_queue.remove(&split_info.split_key);\n        assert!(is_in_queue);\n        if let Status::Downloading { alive_token } = &split_info.status\n            && alive_token.strong_count() == 0\n        {\n            return None;\n        }\n        Some(split_info)\n    }\n\n    fn gc_downloading_splits_if_necessary(&mut self) {\n        if self.downloading_splits.len()\n            < (self.limits.num_concurrent_downloads.get() as usize + 10)\n        {\n            return;\n        }\n        let mut splits_to_remove = Vec::new();\n        for split in &self.downloading_splits {\n            if let Some(split_info) = self.split_to_status.get(&split.split_ulid)\n                && let Status::Downloading { alive_token } = &split_info.status\n                && alive_token.strong_count() == 0\n            {\n                splits_to_remove.push(split.split_ulid);\n            }\n        }\n        for split in splits_to_remove {\n            self.remove(split);\n        }\n    }\n\n    /// Insert a `split_info`. This method assumes the split was not present in the split table\n    /// to begin with. It will panic if the split was already present.\n    ///\n    /// Keep this method private.\n    fn insert(&mut self, split_info: SplitInfo) {\n        let was_not_in_queue = match split_info.status {\n            Status::Candidate { .. } => {\n                // we truncate *before* inserting, otherwise we may end up in an inconsistent\n                // state which makes truncate_candidate_list loop indefinitely\n                self.truncate_candidate_list();\n                self.candidate_splits.insert(split_info.split_key)\n            }\n            Status::Downloading { .. 
} => self.downloading_splits.insert(split_info.split_key),\n            Status::OnDisk { num_bytes } => {\n                self.on_disk_bytes += num_bytes;\n                crate::metrics::STORAGE_METRICS\n                    .searcher_split_cache\n                    .cache_metrics\n                    .in_cache_count\n                    .inc();\n                crate::metrics::STORAGE_METRICS\n                    .searcher_split_cache\n                    .cache_metrics\n                    .in_cache_num_bytes\n                    .add(num_bytes as i64);\n                self.on_disk_splits.insert(split_info.split_key)\n            }\n        };\n        // this is fine to do in an inconsistent state, the last entry will just be ignored while\n        // gcing\n        self.gc_downloading_splits_if_necessary();\n        assert!(was_not_in_queue);\n        let split_ulid_was_absent = self\n            .split_to_status\n            .insert(split_info.split_key.split_ulid, split_info)\n            .is_none();\n        assert!(split_ulid_was_absent);\n    }\n\n    /// Touch the file, updating its last access time, possibly extending its life in the\n    /// cache (if in cache).\n    ///\n    /// If the file is already on the disk cache, return `Some(num_bytes)`.\n    /// If the file is not in cache, return `None`, and register the file in the candidate for\n    /// download list.\n    pub fn touch(&mut self, split_ulid: Ulid, storage_uri: &Uri) -> Option<u64> {\n        let timestamp = compute_timestamp(self.origin_time);\n        let status = self.mutate_split(split_ulid, |old_split_info| {\n            if let Some(mut split_info) = old_split_info {\n                split_info.split_key.last_accessed = timestamp;\n                split_info\n            } else {\n                SplitInfo {\n                    split_key: SplitKey {\n                        split_ulid,\n                        last_accessed: timestamp,\n                    },\n                    
status: Status::Candidate(CandidateSplit {\n                        storage_uri: storage_uri.clone(),\n                        split_ulid,\n                        living_token: Arc::new(()),\n                    }),\n                }\n            }\n        });\n        if let Status::OnDisk { num_bytes } = status {\n            Some(num_bytes)\n        } else {\n            None\n        }\n    }\n\n    /// Mutates the split with the given ulid.\n    ///\n    /// By design this function maintains the invariant.\n    /// It removes the split with the given ulid, modifies it, and re-inserts it.\n    fn mutate_split(\n        &mut self,\n        split_ulid: Ulid,\n        mutate_fn: impl FnOnce(Option<SplitInfo>) -> SplitInfo,\n    ) -> Status {\n        let split_info_opt = self.remove(split_ulid);\n        let new_split: SplitInfo = mutate_fn(split_info_opt);\n        let new_status = new_split.status.clone();\n        self.insert(new_split);\n        new_status\n    }\n\n    fn change_split_status(&mut self, split_ulid: Ulid, status: Status) {\n        let start_time = self.origin_time;\n        self.mutate_split(split_ulid, move |split_info_opt| {\n            if let Some(mut split_info) = split_info_opt {\n                split_info.status = status;\n                split_info\n            } else {\n                SplitInfo {\n                    split_key: SplitKey {\n                        last_accessed: compute_timestamp(start_time),\n                        split_ulid,\n                    },\n                    status,\n                }\n            }\n        });\n    }\n\n    pub(crate) fn report(&mut self, split_ulid: Ulid, storage_uri: Uri) {\n        let origin_time = self.origin_time;\n        self.mutate_split(split_ulid, move |split_info_opt| {\n            if let Some(split_info) = split_info_opt {\n                return split_info;\n            }\n            SplitInfo {\n                split_key: SplitKey {\n                    last_accessed: 
compute_timestamp(origin_time)\n                        .saturating_sub(NEWLY_REPORTED_SPLIT_LAST_TIME.as_micros() as u64),\n                    split_ulid,\n                },\n                status: Status::Candidate(CandidateSplit {\n                    storage_uri,\n                    split_ulid,\n                    living_token: Arc::new(()),\n                }),\n            }\n        });\n    }\n\n    /// Make sure we have at most `MAX_CANDIDATES` candidate splits.\n    fn truncate_candidate_list(&mut self) {\n        // we remove one more to make place for one candidate about to be inserted\n        while self.candidate_splits.len() >= MAX_NUM_CANDIDATES {\n            let worst_candidate = self.candidate_splits.first().unwrap().split_ulid;\n            self.remove(worst_candidate);\n        }\n    }\n\n    pub(crate) fn register_as_downloaded(&mut self, split_ulid: Ulid, num_bytes: u64) {\n        self.change_split_status(split_ulid, Status::OnDisk { num_bytes });\n    }\n\n    /// Change the state of the given split from candidate to downloading state,\n    /// and returns its URI.\n    ///\n    /// This function does NOT trigger the download itself. 
It is up to\n    /// the caller to actually initiate the download.\n    pub(crate) fn start_download(&mut self, split_ulid: Ulid) -> Option<CandidateSplit> {\n        let split_info = self.remove(split_ulid)?;\n        let Status::Candidate(candidate_split) = split_info.status else {\n            self.insert(split_info);\n            return None;\n        };\n        let alive_token = Arc::downgrade(&candidate_split.living_token);\n        self.insert(SplitInfo {\n            split_key: split_info.split_key,\n            status: Status::Downloading { alive_token },\n        });\n        Some(candidate_split)\n    }\n\n    fn best_candidate(&self) -> Option<SplitKey> {\n        self.candidate_splits.last().copied()\n    }\n\n    fn is_out_of_limits(&self) -> bool {\n        if self.on_disk_splits.is_empty() {\n            return false;\n        }\n        if self.on_disk_splits.len() + self.downloading_splits.len()\n            >= self.limits.max_num_splits.get() as usize\n        {\n            return true;\n        }\n        if self.on_disk_bytes > self.limits.max_num_bytes.as_u64() {\n            return true;\n        }\n        false\n    }\n\n    /// Evicts splits to reach the target limits.\n    ///\n    /// Returns an error if the first candidate for eviction is\n    /// fresher than the candidate split. 
(Note this is suboptimal.)\n    ///\n    /// Returns `Err(NoRoomAvailable)` if this would mean evicting splits that\n    /// have been accessed more recently than the candidate split.\n    pub(crate) fn make_room_for_split_if_necessary(\n        &mut self,\n        last_access_date: LastAccessDate,\n    ) -> Result<Vec<Ulid>, NoRoomAvailable> {\n        let mut split_infos = Vec::new();\n        while self.is_out_of_limits() {\n            if let Some(first_split) = self.on_disk_splits.first() {\n                if first_split.last_accessed > last_access_date {\n                    // This is not worth doing the eviction.\n                    break;\n                }\n                split_infos.extend(self.remove(first_split.split_ulid));\n            } else {\n                break;\n            }\n        }\n        if self.is_out_of_limits() {\n            // We are still out of limits.\n            // Let's not go through with the eviction, and reinsert the splits.\n            for split_info in split_infos {\n                self.insert(split_info);\n            }\n            Err(NoRoomAvailable)\n        } else {\n            Ok(split_infos\n                .into_iter()\n                .map(|split_info| split_info.split_key.split_ulid)\n                .collect())\n        }\n    }\n\n    pub(crate) fn find_download_opportunity(&mut self) -> Option<DownloadOpportunity> {\n        let best_candidate_split_key = self.best_candidate()?;\n        let splits_to_delete: Vec<Ulid> = self\n            .make_room_for_split_if_necessary(best_candidate_split_key.last_accessed)\n            .ok()?;\n        let split_to_download: CandidateSplit =\n            self.start_download(best_candidate_split_key.split_ulid)?;\n        Some(DownloadOpportunity {\n            splits_to_delete,\n            split_to_download,\n        })\n    }\n\n    #[cfg(test)]\n    pub fn num_bytes(&self) -> u64 {\n        self.on_disk_bytes\n    }\n}\n\n#[derive(Clone, Copy, Debug)]\npub(crate) struct 
NoRoomAvailable;\n\n#[derive(Clone, Debug, Eq, PartialEq)]\npub(crate) struct CandidateSplit {\n    pub storage_uri: Uri,\n    pub split_ulid: Ulid,\n    pub living_token: Arc<()>,\n}\n\npub(crate) struct DownloadOpportunity {\n    // At this point, the split have already been removed from the split table.\n    // The file however need to be deleted.\n    pub splits_to_delete: Vec<Ulid>,\n    pub split_to_download: CandidateSplit,\n}\n\n#[cfg(test)]\nmod tests {\n    use std::num::NonZeroU32;\n    use std::sync::Arc;\n\n    use bytesize::ByteSize;\n    use quickwit_common::uri::Uri;\n    use quickwit_config::SplitCacheLimits;\n    use ulid::Ulid;\n\n    use crate::split_cache::split_table::{\n        CandidateSplit, DownloadOpportunity, SplitInfo, SplitKey, SplitTable, Status,\n    };\n\n    const TEST_STORAGE_URI: &str = \"s3://test\";\n\n    fn sorted_split_ulids(num_splits: usize) -> Vec<Ulid> {\n        let mut split_ulids: Vec<Ulid> =\n            std::iter::repeat_with(Ulid::new).take(num_splits).collect();\n        split_ulids.sort();\n        split_ulids\n    }\n\n    #[test]\n    fn test_split_table() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::kb(1),\n                max_num_splits: NonZeroU32::new(1).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        let ulids = sorted_split_ulids(2);\n        let ulid1 = ulids[0];\n        let ulid2 = ulids[1];\n        split_table.report(ulid1, Uri::for_test(TEST_STORAGE_URI));\n        split_table.report(ulid2, Uri::for_test(TEST_STORAGE_URI));\n        let candidate = split_table.best_candidate().unwrap();\n        assert_eq!(candidate.split_ulid, ulid2);\n    }\n\n    #[test]\n    fn test_split_table_prefer_last_touched() {\n        let mut 
split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::kb(1),\n                max_num_splits: NonZeroU32::new(1).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        let ulids = sorted_split_ulids(2);\n        let ulid1 = ulids[0];\n        let ulid2 = ulids[1];\n        split_table.report(ulid1, Uri::for_test(TEST_STORAGE_URI));\n        split_table.report(ulid2, Uri::for_test(TEST_STORAGE_URI));\n        let num_bytes_opt = split_table.touch(ulid1, &Uri::for_test(\"s3://test1/\"));\n        assert!(num_bytes_opt.is_none());\n        let candidate = split_table.best_candidate().unwrap();\n        assert_eq!(candidate.split_ulid, ulid1);\n    }\n\n    #[test]\n    fn test_split_table_prefer_start_download_prevent_new_report() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::kb(1),\n                max_num_splits: NonZeroU32::new(1).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        let ulid1 = Ulid::new();\n        split_table.report(ulid1, Uri::for_test(TEST_STORAGE_URI));\n        assert_eq!(split_table.num_bytes(), 0);\n        let download = split_table.start_download(ulid1);\n        assert!(download.is_some());\n        assert!(split_table.start_download(ulid1).is_none());\n        split_table.register_as_downloaded(ulid1, 10_000_000);\n        assert_eq!(split_table.num_bytes(), 10_000_000);\n        assert_eq!(\n            split_table.touch(ulid1, &Uri::for_test(TEST_STORAGE_URI)),\n            Some(10_000_000)\n        );\n        let ulid2 = 
Ulid::new();\n        split_table.report(ulid2, Uri::for_test(\"s3://test`/\"));\n        let download = split_table.start_download(ulid2);\n        assert!(download.is_some());\n        assert!(split_table.start_download(ulid2).is_none());\n        assert_eq!(split_table.num_bytes(), 10_000_000);\n        split_table.register_as_downloaded(ulid2, 3_000_000);\n        assert_eq!(split_table.num_bytes(), 13_000_000);\n    }\n\n    #[test]\n    fn test_eviction_due_to_size() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::mb(1),\n                max_num_splits: NonZeroU32::new(30).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        let mut split_ulids: Vec<Ulid> = std::iter::repeat_with(Ulid::new).take(6).collect();\n        split_ulids.sort();\n        let splits = [\n            (split_ulids[0], 10_000),\n            (split_ulids[1], 20_000),\n            (split_ulids[2], 300_000),\n            (split_ulids[3], 400_000),\n            (split_ulids[4], 100_000),\n            (split_ulids[5], 300_000),\n        ];\n        for (split_ulid, num_bytes) in splits {\n            split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n            split_table.register_as_downloaded(split_ulid, num_bytes);\n        }\n        let new_ulid = Ulid::new();\n        split_table.report(new_ulid, Uri::for_test(TEST_STORAGE_URI));\n        let DownloadOpportunity {\n            splits_to_delete,\n            split_to_download,\n        } = split_table.find_download_opportunity().unwrap();\n        assert_eq!(\n            &splits_to_delete[..],\n            &[splits[0].0, splits[1].0, splits[2].0][..]\n        );\n        assert_eq!(split_to_download.split_ulid, new_ulid);\n    }\n\n    #[test]\n    fn 
test_eviction_due_to_num_splits() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::mb(10),\n                max_num_splits: NonZeroU32::new(5).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        let mut split_ulids: Vec<Ulid> = std::iter::repeat_with(Ulid::new).take(6).collect();\n        split_ulids.sort();\n        let splits = [\n            (split_ulids[0], 10_000),\n            (split_ulids[1], 20_000),\n            (split_ulids[2], 300_000),\n            (split_ulids[3], 400_000),\n            (split_ulids[4], 100_000),\n            (split_ulids[5], 300_000),\n        ];\n        for (split_ulid, num_bytes) in splits {\n            split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n            split_table.register_as_downloaded(split_ulid, num_bytes);\n        }\n        let new_ulid = Ulid::new();\n        split_table.report(new_ulid, Uri::for_test(TEST_STORAGE_URI));\n        let DownloadOpportunity {\n            splits_to_delete,\n            split_to_download,\n        } = split_table.find_download_opportunity().unwrap();\n        assert_eq!(&splits_to_delete[..], &[splits[0].0, splits[1].0]);\n        assert_eq!(split_to_download.split_ulid, new_ulid);\n    }\n\n    #[test]\n    fn test_failed_download_can_be_re_reported() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::mb(10),\n                max_num_splits: NonZeroU32::new(5).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        let split_ulid = 
Ulid::new();\n        split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n        let candidate = split_table.start_download(split_ulid).unwrap();\n        // This report should be cancelled as we have a download currently running.\n        split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n\n        assert!(split_table.start_download(split_ulid).is_none());\n        std::mem::drop(candidate);\n\n        // Still not possible to start a download.\n        assert!(split_table.start_download(split_ulid).is_none());\n\n        // This report should be considered as our candidate (and its alive token has been dropped)\n        split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n\n        let candidate2 = split_table.start_download(split_ulid).unwrap();\n        assert_eq!(candidate2.split_ulid, split_ulid);\n    }\n\n    #[test]\n    fn test_split_table_truncate_candidates() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::mb(10),\n                max_num_splits: NonZeroU32::new(5).unwrap(),\n                num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        for i in 1..2_000 {\n            let split_ulid = Ulid::new();\n            split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n            assert_eq!(\n                split_table.candidate_splits.len(),\n                i.min(super::MAX_NUM_CANDIDATES)\n            );\n        }\n    }\n\n    // Unit test for #5334\n    #[test]\n    fn test_split_inserted_is_the_worst_candidate_5334() {\n        let mut split_table = SplitTable::with_limits_and_existing_splits(\n            SplitCacheLimits {\n                max_num_bytes: ByteSize::mb(10),\n                max_num_splits: NonZeroU32::new(2).unwrap(),\n            
    num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n                max_file_descriptors: NonZeroU32::new(100).unwrap(),\n            },\n            Default::default(),\n        );\n        for i in (0u128..=super::MAX_NUM_CANDIDATES as u128).rev() {\n            let split_ulid = Ulid(i);\n            let candidate_split = CandidateSplit {\n                storage_uri: Uri::for_test(TEST_STORAGE_URI),\n                split_ulid,\n                living_token: Arc::new(()),\n            };\n            let split_info = SplitInfo {\n                split_key: SplitKey {\n                    last_accessed: 0u64,\n                    split_ulid,\n                },\n                status: Status::Candidate(candidate_split),\n            };\n            split_table.insert(split_info);\n        }\n        assert_eq!(\n            split_table.candidate_splits.len(),\n            super::MAX_NUM_CANDIDATES\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/split_cache/tests.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::num::NonZeroU32;\n\nuse bytesize::ByteSize;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::SplitCacheLimits;\nuse ulid::Ulid;\n\nuse crate::split_cache::split_table::{DownloadOpportunity, SplitTable};\n\nconst TEST_STORAGE_URI: &'static str = \"s3://test\";\n\n#[test]\nfn test_split_table() {\n    let mut split_table = SplitTable::with_limits(SplitCacheLimits {\n        max_num_bytes: ByteSize::kb(1),\n        max_num_splits: NonZeroU32::new(1).unwrap(),\n        num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n    });\n    let ulid1 = Ulid::new();\n    let ulid2 = Ulid::new();\n    split_table.report(ulid1, Uri::for_test(TEST_STORAGE_URI));\n    split_table.report(ulid2, Uri::for_test(TEST_STORAGE_URI));\n    let candidate = split_table.best_candidate().unwrap();\n    assert_eq!(candidate.split_ulid, ulid2);\n}\n\n#[test]\nfn test_split_table_prefer_last_touched() {\n    let mut split_table = SplitTable::with_limits(SplitCacheLimits {\n        max_num_bytes: ByteSize::kb(1),\n        max_num_splits: NonZeroU32::new(1).unwrap(),\n        num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n    });\n    let ulid1 = Ulid::new();\n    let ulid2 = Ulid::new();\n    split_table.report(ulid1, Uri::for_test(TEST_STORAGE_URI));\n    split_table.report(ulid2, Uri::for_test(TEST_STORAGE_URI));\n    let split_guard_opt = 
split_table.get_split_guard(ulid1, &Uri::for_test(\"s3://test1/\"));\n    assert!(split_guard_opt.is_none());\n    let candidate = split_table.best_candidate().unwrap();\n    assert_eq!(candidate.split_ulid, ulid1);\n}\n\n#[test]\nfn test_split_table_prefer_start_download_prevent_new_report() {\n    let mut split_table = SplitTable::with_limits(SplitCacheLimits {\n        max_num_bytes: ByteSize::kb(1),\n        max_num_splits: NonZeroU32::new(1).unwrap(),\n        num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n    });\n    let ulid1 = Ulid::new();\n    split_table.report(ulid1, Uri::for_test(TEST_STORAGE_URI));\n    assert_eq!(split_table.num_bytes(), 0);\n    let download = split_table.start_download(ulid1);\n    assert!(download.is_some());\n    assert!(split_table.start_download(ulid1).is_none());\n    split_table.register_as_downloaded(ulid1, 10_000_000);\n    assert_eq!(split_table.num_bytes(), 10_000_000);\n    split_table.get_split_guard(ulid1, &Uri::for_test(TEST_STORAGE_URI));\n    let ulid2 = Ulid::new();\n    split_table.report(ulid2, Uri::for_test(\"s3://test`/\"));\n    let download = split_table.start_download(ulid2);\n    assert!(download.is_some());\n    assert!(split_table.start_download(ulid2).is_none());\n    assert_eq!(split_table.num_bytes(), 10_000_000);\n    split_table.register_as_downloaded(ulid2, 3_000_000);\n    assert_eq!(split_table.num_bytes(), 13_000_000);\n}\n\n#[test]\nfn test_eviction_due_to_size() {\n    let mut split_table = SplitTable::with_limits(SplitCacheLimits {\n        max_num_bytes: ByteSize::mb(1),\n        max_num_splits: NonZeroU32::new(30).unwrap(),\n        num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n    });\n    let mut split_ulids: Vec<Ulid> = std::iter::repeat_with(Ulid::new).take(6).collect();\n    split_ulids.sort();\n    let splits = [\n        (split_ulids[0], 10_000),\n        (split_ulids[1], 20_000),\n        (split_ulids[2], 300_000),\n        (split_ulids[3], 400_000),\n        
(split_ulids[4], 100_000),\n        (split_ulids[5], 300_000),\n    ];\n    for (split_ulid, num_bytes) in splits {\n        split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n        split_table.register_as_downloaded(split_ulid, num_bytes);\n    }\n    let new_ulid = Ulid::new();\n    split_table.report(new_ulid, Uri::for_test(TEST_STORAGE_URI));\n    let DownloadOpportunity {\n        splits_to_delete,\n        split_to_download,\n    } = split_table.find_download_opportunity().unwrap();\n    assert_eq!(\n        &splits_to_delete[..],\n        &[splits[0].0, splits[1].0, splits[2].0][..]\n    );\n    assert_eq!(split_to_download.split_ulid, new_ulid);\n}\n\n#[test]\nfn test_eviction_due_to_num_splits() {\n    let mut split_table = SplitTable::with_limits(SplitCacheLimits {\n        max_num_bytes: ByteSize::mb(10),\n        max_num_splits: NonZeroU32::new(5).unwrap(),\n        num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n    });\n    let mut split_ulids: Vec<Ulid> = std::iter::repeat_with(Ulid::new).take(6).collect();\n    split_ulids.sort();\n    let splits = [\n        (split_ulids[0], 10_000),\n        (split_ulids[1], 20_000),\n        (split_ulids[2], 300_000),\n        (split_ulids[3], 400_000),\n        (split_ulids[4], 100_000),\n        (split_ulids[5], 300_000),\n    ];\n    for (split_ulid, num_bytes) in splits {\n        split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n        split_table.register_as_downloaded(split_ulid, num_bytes);\n    }\n    let new_ulid = Ulid::new();\n    split_table.report(new_ulid, Uri::for_test(TEST_STORAGE_URI));\n    let DownloadOpportunity {\n        splits_to_delete,\n        split_to_download,\n    } = split_table.find_download_opportunity().unwrap();\n    assert_eq!(&splits_to_delete[..], &[splits[0].0][..]);\n    assert_eq!(split_to_download.split_ulid, new_ulid);\n}\n\n#[test]\nfn test_failed_download_can_be_re_reported() {\n    let mut split_table = 
SplitTable::with_limits(SplitCacheLimits {\n        max_num_bytes: ByteSize::mb(10),\n        max_num_splits: NonZeroU32::new(5).unwrap(),\n        num_concurrent_downloads: NonZeroU32::new(1).unwrap(),\n    });\n    let split_ulid = Ulid::new();\n    split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n    let candidate = split_table.start_download(split_ulid).unwrap();\n    // This report should be cancelled as we have a download currently running.\n    split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n\n    assert!(split_table.start_download(split_ulid).is_none());\n    std::mem::drop(candidate);\n\n    // Still not possible to start a download.\n    assert!(split_table.start_download(split_ulid).is_none());\n\n    // This report should be considered as our candidate (and its alive token has been dropped)\n    split_table.report(split_ulid, Uri::for_test(TEST_STORAGE_URI));\n\n    let candidate2 = split_table.start_download(split_ulid).unwrap();\n    assert_eq!(candidate2.split_ulid, split_ulid);\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::fmt;\nuse std::io::{self};\nuse std::ops::Range;\nuse std::path::{Path, PathBuf};\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse tempfile::TempPath;\nuse tokio::fs::File;\nuse tokio::io::{AsyncRead, AsyncWrite};\nuse tracing::error;\n\nuse crate::{BulkDeleteError, OwnedBytes, PutPayload, StorageErrorKind, StorageResult};\n\n/// This trait is only used to make it build trait object with `AsyncWrite + Send + Unpin`.\npub trait SendableAsync: AsyncWrite + Send + Unpin {}\nimpl<W: AsyncWrite + Send + Unpin> SendableAsync for W {}\n\n/// Storage meant to receive and serve quickwit's split.\n///\n/// Object storage are the primary target implementation of this trait,\n/// and its interface is meant to allow for multipart download/upload.\n///\n/// Note that Storage does not have the notion of directory separators.\n/// For underlying implementation where directory separator have meaning,\n/// The implementation should treat directory separators as exactly the same way\n/// object storage treat them. 
This means when directory separators are present\n/// in the storage operation path, the storage implementation should create and remove transparently\n/// these intermediate directories.\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait]\npub trait Storage: fmt::Debug + Send + Sync + 'static {\n    /// Check storage connection if applicable\n    async fn check_connectivity(&self) -> anyhow::Result<()>;\n\n    /// Saves a file into the storage.\n    async fn put(&self, path: &Path, payload: Box<dyn PutPayload>) -> StorageResult<()>;\n\n    /// Copies the file associated to `Path` into an `AsyncWrite`.\n    /// This function is required to call `.flush()` before it successfully returns.\n    ///\n    /// See also `copy_to_file`.\n    ///\n    /// async_trait Expansion of\n    /// async fn copy_to(&self, path: &Path, output: &mut dyn SendableAsync) -> StorageResult<()>;\n    ///\n    /// Just putting the async form is breaking mockall.\n    fn copy_to<'life0, 'life1, 'life2, 'async_trait>(\n        &'life0 self,\n        path: &'life1 Path,\n        output: &'life2 mut dyn SendableAsync,\n    ) -> ::core::pin::Pin<\n        Box<\n            dyn ::core::future::Future<Output = StorageResult<()>>\n                + ::core::marker::Send\n                + 'async_trait,\n        >,\n    >\n    where\n        'life0: 'async_trait,\n        'life1: 'async_trait,\n        'life2: 'async_trait,\n        Self: 'async_trait;\n\n    /// Downloads an entire file and writes it into a local file.\n    /// `output_path` is expected to be a file path (not a directory path)\n    /// without any existing file yet.\n    ///\n    /// This function will attempt to download things to a temporary file\n    /// in the same directory as the `output_path`, and then atomically move it\n    /// to the actual `output_path`.\n    ///\n    /// In case of failure, `quickwit` (not the OS) will attempt to delete the file\n    /// using some `Drop` mechanic.\n    /// If 
quickwit is killed for instance, this may result in the temporary file not\n    /// being deleted. It is important, upon startup, to identify these \".temp\"\n    /// files and delete them.\n    ///\n    /// See also `copy_to`.\n    async fn copy_to_file(&self, path: &Path, output_path: &Path) -> StorageResult<u64> {\n        default_copy_to_file(self, path, output_path).await\n    }\n\n    /// Downloads a slice of a file from the storage, and returns an in memory buffer\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes>;\n\n    /// Opens a stream handle on the file from the storage.\n    ///\n    /// Might panic, return an error or an empty stream if the range is empty.\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>>;\n\n    /// Downloads the entire content of a \"small\" file, returns an in memory buffer.\n    /// For large files prefer `copy_to_file`.\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes>;\n\n    /// Deletes a file.\n    ///\n    /// This method should return Ok(()) if the file did not exist.\n    async fn delete(&self, path: &Path) -> StorageResult<()>;\n\n    /// Deletes multiple files at once.\n    ///\n    /// The implementation may call [`Storage::delete`] in a loop if the underlying storage does\n    /// not support deleting objects in bulk. The request can fail partially, i.e. 
some objects are\n    /// successfully deleted while others are not.\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError>;\n\n    /// Returns whether a file exists or not.\n    async fn exists(&self, path: &Path) -> StorageResult<bool> {\n        match self.file_num_bytes(path).await {\n            Ok(_) => Ok(true),\n            Err(storage_err) if storage_err.kind() == StorageErrorKind::NotFound => Ok(false),\n            Err(other_storage_err) => Err(other_storage_err),\n        }\n    }\n\n    /// Returns a file size.\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64>;\n\n    /// Returns an URI identifying the storage\n    fn uri(&self) -> &Uri;\n}\n\nasync fn default_copy_to_file<S: Storage + ?Sized>(\n    storage: &S,\n    path: &Path,\n    output_path: &Path,\n) -> StorageResult<u64> {\n    let mut download_temp_file =\n        DownloadTempFile::with_target_path(output_path.to_path_buf()).await?;\n    storage.copy_to(path, download_temp_file.as_mut()).await?;\n    let num_bytes = download_temp_file.persist().await?;\n    Ok(num_bytes)\n}\n\nstruct DownloadTempFile {\n    target_filepath: PathBuf,\n    temp_filepath: PathBuf,\n    file: File,\n    has_attempted_deletion: bool,\n}\n\nimpl DownloadTempFile {\n    /// Creates or truncate temp file.\n    pub async fn with_target_path(target_filepath: PathBuf) -> io::Result<DownloadTempFile> {\n        let Some(filename) = target_filepath.file_name() else {\n            return Err(io::Error::other(\n                \"Target filepath is not a directory path. 
Expected a filepath.\",\n            ));\n        };\n        let filename: &str = filename\n            .to_str()\n            .ok_or_else(|| io::Error::other(\"target filepath is not a valid UTF-8 string\"))?;\n        let mut temp_filepath = target_filepath.clone();\n        temp_filepath.set_file_name(format!(\"{filename}.temp\"));\n        let file = tokio::fs::File::create(temp_filepath.clone()).await?;\n        Ok(DownloadTempFile {\n            target_filepath,\n            temp_filepath,\n            file,\n            has_attempted_deletion: false,\n        })\n    }\n\n    pub async fn persist(mut self) -> io::Result<u64> {\n        TempPath::from_path(&self.temp_filepath).persist(&self.target_filepath)?;\n        self.has_attempted_deletion = true;\n        let num_bytes = std::fs::metadata(&self.target_filepath)?.len();\n        Ok(num_bytes)\n    }\n}\n\nimpl Drop for DownloadTempFile {\n    fn drop(&mut self) {\n        if self.has_attempted_deletion {\n            return;\n        }\n        let temp_filepath = self.temp_filepath.clone();\n        self.has_attempted_deletion = true;\n        tokio::task::spawn_blocking(move || {\n            if let Err(io_error) = std::fs::remove_file(&temp_filepath) {\n                error!(temp_filepath=%temp_filepath.display(), io_error=?io_error, \"Failed to remove temporary file\");\n            }\n        });\n    }\n}\n\nimpl AsMut<File> for DownloadTempFile {\n    fn as_mut(&mut self) -> &mut File {\n        &mut self.file\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::time::Duration;\n\n    use super::*;\n    use crate::{RamStorage, StorageError};\n\n    const CONTENT: &[u8] = b\"hello world\";\n\n    #[tokio::test]\n    async fn test_copy_to_file() {\n        let ram_storage = RamStorage::default();\n        let temp_dir = tempfile::tempdir().unwrap();\n        let dest_filepath = temp_dir.path().join(\"bar\");\n        let path = Path::new(\"foo/bar\");\n        ram_storage\n            .put(path, 
Box::new(CONTENT.to_owned()))\n            .await\n            .unwrap();\n        let num_bytes = ram_storage\n            .copy_to_file(path, &dest_filepath)\n            .await\n            .unwrap();\n        assert_eq!(num_bytes, 11);\n        let content = std::fs::read(&dest_filepath).unwrap();\n        assert_eq!(&content, CONTENT);\n    }\n\n    #[tokio::test]\n    async fn test_copy_to_file_deletes_tempfile_on_failure() {\n        let mut storage = MockStorage::default();\n        storage.expect_copy_to().return_once(|_, _| {\n            Box::pin(futures::future::err(StorageError::from(io::Error::other(\n                \"fake storage error\",\n            ))))\n        });\n        let path = Path::new(\"foo/bar\");\n        let temp_dir = tempfile::tempdir().unwrap();\n        let dest_filepath = temp_dir.path().join(\"bar\");\n        default_copy_to_file(&storage, path, &dest_filepath)\n            .await\n            .unwrap_err();\n        tokio::time::sleep(Duration::from_millis(100)).await;\n        let mut read_dir = tokio::fs::read_dir(dest_filepath.parent().unwrap())\n            .await\n            .unwrap();\n        let entry_opt = read_dir\n            .next_entry()\n            .await\n            .unwrap()\n            .map(|dir_entry| dir_entry.path());\n        assert_eq!(entry_opt, None);\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/storage_factory.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_config::StorageBackend;\n\nuse crate::{Storage, StorageResolverError};\n\n/// A storage factory builds a [`Storage`] object for a target [`StorageBackend`] from a\n/// [`Uri`].\n#[cfg_attr(any(test, feature = \"testsuite\"), mockall::automock)]\n#[async_trait]\npub trait StorageFactory: Send + Sync + 'static {\n    /// Returns the storage backend targeted by the factory.\n    fn backend(&self) -> StorageBackend;\n\n    /// Returns the appropriate [`Storage`] object for the URI.\n    async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError>;\n}\n\n/// A storage factory for handling unsupported or unavailable storage backends.\n#[derive(Debug, Clone)]\npub struct UnsupportedStorage {\n    backend: StorageBackend,\n    message: &'static str,\n}\n\nimpl UnsupportedStorage {\n    /// Creates a new [`UnsupportedStorage`].\n    pub fn new(backend: StorageBackend, message: &'static str) -> Self {\n        Self { backend, message }\n    }\n}\n\n#[async_trait]\nimpl StorageFactory for UnsupportedStorage {\n    fn backend(&self) -> StorageBackend {\n        self.backend\n    }\n\n    async fn resolve(&self, _uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError> {\n        Err(StorageResolverError::UnsupportedBackend(\n       
     self.message.to_string(),\n        ))\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/storage_resolver.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashMap;\nuse std::fmt;\nuse std::sync::Arc;\n\nuse once_cell::sync::Lazy;\nuse quickwit_common::uri::{Protocol, Uri};\nuse quickwit_config::{StorageBackend, StorageConfigs};\n\n#[cfg(feature = \"azure\")]\nuse crate::AzureBlobStorageFactory;\n#[cfg(feature = \"gcs\")]\nuse crate::GoogleCloudStorageFactory;\nuse crate::local_file_storage::LocalFileStorageFactory;\nuse crate::ram_storage::RamStorageFactory;\nuse crate::{S3CompatibleObjectStorageFactory, Storage, StorageFactory, StorageResolverError};\n\n/// Returns the [`Storage`] instance associated with the protocol of a URI. The actual creation of\n/// storage objects is delegated to pre-registered [`StorageFactory`]. 
The resolver is only\n/// responsible for dispatching to the appropriate factory.\n#[derive(Clone)]\npub struct StorageResolver {\n    per_backend_factories: Arc<HashMap<StorageBackend, Box<dyn StorageFactory>>>,\n}\n\nimpl fmt::Debug for StorageResolver {\n    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {\n        f.debug_struct(\"StorageResolver\").finish()\n    }\n}\n\nimpl StorageResolver {\n    /// Creates an empty [`StorageResolverBuilder`].\n    pub fn builder() -> StorageResolverBuilder {\n        StorageResolverBuilder::default()\n    }\n\n    /// Resolves the given URI.\n    pub async fn resolve(&self, uri: &Uri) -> Result<Arc<dyn Storage>, StorageResolverError> {\n        let backend = match uri.protocol() {\n            Protocol::Azure => StorageBackend::Azure,\n            Protocol::File => StorageBackend::File,\n            Protocol::Ram => StorageBackend::Ram,\n            Protocol::S3 => StorageBackend::S3,\n            Protocol::Google => StorageBackend::Google,\n            _ => {\n                let message = format!(\n                    \"Quickwit does not support {} as a storage backend\",\n                    uri.protocol()\n                );\n                return Err(StorageResolverError::UnsupportedBackend(message));\n            }\n        };\n        let storage_factory = self.per_backend_factories.get(&backend).ok_or_else(|| {\n            let message = format!(\"no storage factory is registered for {}\", uri.protocol());\n            StorageResolverError::UnsupportedBackend(message)\n        })?;\n        let storage = storage_factory.resolve(uri).await?;\n        Ok(storage)\n    }\n\n    /// Creates and returns a default [`StorageResolver`] with the default storage configuration for\n    /// each backend. Note that if the environment (env vars, instance metadata, ...) 
fails to\n    /// provide the necessary credentials, the default Azure or S3 storage returned by this\n    /// resolver will not work.\n    pub fn unconfigured() -> Self {\n        static STORAGE_RESOLVER: Lazy<StorageResolver> = Lazy::new(|| {\n            let storage_configs = StorageConfigs::default();\n            StorageResolver::configured(&storage_configs)\n        });\n        STORAGE_RESOLVER.clone()\n    }\n\n    /// Creates and returns a [`StorageResolver`].\n    pub fn configured(storage_configs: &StorageConfigs) -> Self {\n        let mut builder = StorageResolver::builder()\n            .register(LocalFileStorageFactory)\n            .register(RamStorageFactory::default())\n            .register(S3CompatibleObjectStorageFactory::new(\n                storage_configs.find_s3().cloned().unwrap_or_default(),\n            ));\n        #[cfg(feature = \"azure\")]\n        {\n            builder = builder.register(AzureBlobStorageFactory::new(\n                storage_configs.find_azure().cloned().unwrap_or_default(),\n            ));\n        }\n        #[cfg(not(feature = \"azure\"))]\n        {\n            use crate::storage_factory::UnsupportedStorage;\n\n            builder = builder.register(UnsupportedStorage::new(\n                StorageBackend::Azure,\n                \"Quickwit was compiled without the `azure` feature\",\n            ))\n        }\n        #[cfg(feature = \"gcs\")]\n        {\n            builder = builder.register(GoogleCloudStorageFactory::new(\n                storage_configs.find_google().cloned().unwrap_or_default(),\n            ));\n        }\n        #[cfg(not(feature = \"gcs\"))]\n        {\n            use crate::storage_factory::UnsupportedStorage;\n\n            builder = builder.register(UnsupportedStorage::new(\n                StorageBackend::Google,\n                \"Quickwit was compiled without the `gcs` feature\",\n            ))\n        }\n        builder\n            .build()\n            .expect(\"storage 
factory and config backends should match\")\n    }\n\n    /// Returns a [`StorageResolver`] for testing purposes. Unlike\n    /// [`StorageResolver::unconfigured`], this resolver does not return a singleton.\n    #[cfg(any(test, feature = \"testsuite\"))]\n    pub fn for_test() -> Self {\n        StorageResolver::builder()\n            .register(RamStorageFactory::default())\n            .register(LocalFileStorageFactory)\n            .build()\n            .expect(\"storage factory and config backends should match\")\n    }\n}\n\n#[derive(Default)]\npub struct StorageResolverBuilder {\n    per_backend_factories: HashMap<StorageBackend, Box<dyn StorageFactory>>,\n}\n\nimpl StorageResolverBuilder {\n    /// Registers a [`StorageFactory`].\n    pub fn register<S: StorageFactory>(mut self, storage_factory: S) -> Self {\n        self.per_backend_factories\n            .insert(storage_factory.backend(), Box::new(storage_factory));\n        self\n    }\n\n    /// Builds the [`StorageResolver`].\n    pub fn build(self) -> anyhow::Result<StorageResolver> {\n        let storage_resolver = StorageResolver {\n            per_backend_factories: Arc::new(self.per_backend_factories),\n        };\n        Ok(storage_resolver)\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use std::path::Path;\n\n    use super::*;\n    use crate::{MockStorageFactory, RamStorage};\n\n    #[tokio::test]\n    async fn test_storage_resolver_simple() -> anyhow::Result<()> {\n        let mut file_storage_factory = MockStorageFactory::new();\n        file_storage_factory\n            .expect_backend()\n            .returning(|| StorageBackend::File);\n\n        let mut ram_storage_factory = MockStorageFactory::new();\n        ram_storage_factory\n            .expect_backend()\n            .returning(|| StorageBackend::Ram);\n        ram_storage_factory.expect_resolve().returning(|_uri| {\n            Ok(Arc::new(\n                RamStorage::builder()\n                    .put(\"hello\", 
b\"hello_content_second\")\n                    .build(),\n            ))\n        });\n        let storage_resolver = StorageResolver::builder()\n            .register(file_storage_factory)\n            .register(ram_storage_factory)\n            .build()\n            .unwrap();\n        let storage = storage_resolver.resolve(&Uri::for_test(\"ram:///\")).await?;\n        let data = storage.get_all(Path::new(\"hello\")).await?;\n        assert_eq!(&data[..], b\"hello_content_second\");\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_storage_resolver_override() -> anyhow::Result<()> {\n        let mut first_ram_storage_factory = MockStorageFactory::new();\n        first_ram_storage_factory\n            .expect_backend()\n            .returning(|| StorageBackend::Ram);\n\n        let mut second_ram_storage_factory = MockStorageFactory::new();\n        second_ram_storage_factory\n            .expect_backend()\n            .returning(|| StorageBackend::Ram);\n        second_ram_storage_factory\n            .expect_resolve()\n            .returning(|uri| {\n                assert_eq!(uri.as_str(), \"ram:///home\");\n                Ok(Arc::new(\n                    RamStorage::builder()\n                        .put(\"hello\", b\"hello_content_second\")\n                        .build(),\n                ))\n            });\n        let storage_resolver = StorageResolver::builder()\n            .register(first_ram_storage_factory)\n            .register(second_ram_storage_factory)\n            .build()\n            .unwrap();\n        let storage = storage_resolver\n            .resolve(&Uri::for_test(\"ram:///home\"))\n            .await?;\n        let data = storage.get_all(Path::new(\"hello\")).await?;\n        assert_eq!(&data[..], b\"hello_content_second\");\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn test_storage_resolver_unsupported_protocol() {\n        let storage_resolver = StorageResolver::unconfigured();\n        let 
storage_uri = Uri::for_test(\"postgresql://localhost:5432/metastore\");\n        let resolver_error = storage_resolver.resolve(&storage_uri).await.unwrap_err();\n        assert!(matches!(\n            resolver_error,\n            StorageResolverError::UnsupportedBackend(_)\n        ));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/timeout_and_retry_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::ops::Range;\nuse std::path::Path;\nuse std::sync::Arc;\n\nuse async_trait::async_trait;\nuse quickwit_common::uri::Uri;\nuse quickwit_common::{rate_limited_info, rate_limited_warn};\nuse quickwit_config::StorageTimeoutPolicy;\nuse tantivy::directory::OwnedBytes;\nuse tokio::io::AsyncRead;\n\nuse crate::storage::SendableAsync;\nuse crate::{BulkDeleteError, PutPayload, Storage, StorageErrorKind, StorageResult};\n\n/// Storage proxy that implements a retry operation if the underlying storage\n/// takes too long.\n///\n/// This is useful in order to ensure a low latency on S3.\n/// Retrying agressively is recommended for S3.\n///\n/// <https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/timeouts-and-retries-for-latency-sensitive-applications.html>\n#[derive(Clone, Debug)]\npub struct TimeoutAndRetryStorage {\n    underlying: Arc<dyn Storage>,\n    storage_timeout_policy: StorageTimeoutPolicy,\n}\n\nimpl TimeoutAndRetryStorage {\n    /// Creates a new `TimeoutAndRetryStorage`.\n    ///\n    /// See [StorageTimeoutPolicy] for more information.\n    pub fn new(storage: Arc<dyn Storage>, storage_timeout_policy: StorageTimeoutPolicy) -> Self {\n        TimeoutAndRetryStorage {\n            underlying: storage,\n            storage_timeout_policy,\n        }\n    }\n}\n\n#[async_trait]\nimpl Storage for 
TimeoutAndRetryStorage {\n    async fn check_connectivity(&self) -> anyhow::Result<()> {\n        self.underlying.check_connectivity().await\n    }\n\n    async fn put(&self, path: &Path, payload: Box<dyn PutPayload>) -> StorageResult<()> {\n        self.underlying.put(path, payload).await\n    }\n\n    fn copy_to<'life0, 'life1, 'life2, 'async_trait>(\n        &'life0 self,\n        path: &'life1 Path,\n        output: &'life2 mut dyn SendableAsync,\n    ) -> ::core::pin::Pin<\n        Box<\n            dyn ::core::future::Future<Output = StorageResult<()>>\n                + ::core::marker::Send\n                + 'async_trait,\n        >,\n    >\n    where\n        'life0: 'async_trait,\n        'life1: 'async_trait,\n        'life2: 'async_trait,\n        Self: 'async_trait,\n    {\n        self.underlying.copy_to(path, output)\n    }\n\n    async fn copy_to_file(&self, path: &Path, output_path: &Path) -> StorageResult<u64> {\n        self.underlying.copy_to_file(path, output_path).await\n    }\n\n    /// Downloads a slice of a file from the storage, and returns an in memory buffer\n    async fn get_slice(&self, path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n        let num_bytes = range.len();\n        for (attempt_id, timeout_duration) in self\n            .storage_timeout_policy\n            .compute_timeout(num_bytes)\n            .enumerate()\n        {\n            let get_slice_fut = self.underlying.get_slice(path, range.clone());\n            // TODO test avoid aborting timed out requests. 
#5468\n            match tokio::time::timeout(timeout_duration, get_slice_fut).await {\n                Ok(result) => {\n                    crate::STORAGE_METRICS\n                        .get_slice_timeout_successes\n                        .get(attempt_id)\n                        .or(crate::STORAGE_METRICS.get_slice_timeout_successes.last())\n                        .unwrap()\n                        .inc();\n                    return result;\n                }\n                Err(_elapsed) => {\n                    rate_limited_info!(limit_per_min=60, num_bytes=num_bytes, path=%path.display(), timeout_secs=timeout_duration.as_secs_f32(), \"get timeout elapsed\");\n                    continue;\n                }\n            }\n        }\n        rate_limited_warn!(limit_per_min=60, num_bytes=num_bytes, path=%path.display(), \"all get_slice attempts timed out\");\n        crate::STORAGE_METRICS.get_slice_timeout_all_timeouts.inc();\n        return Err(\n            StorageErrorKind::Timeout.with_error(anyhow::anyhow!(\"internal timeout on get_slice\"))\n        );\n    }\n\n    async fn get_slice_stream(\n        &self,\n        path: &Path,\n        range: Range<usize>,\n    ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n        self.underlying.get_slice_stream(path, range).await\n    }\n\n    async fn get_all(&self, path: &Path) -> StorageResult<OwnedBytes> {\n        self.underlying.get_all(path).await\n    }\n\n    async fn delete(&self, path: &Path) -> StorageResult<()> {\n        self.underlying.delete(path).await\n    }\n\n    async fn bulk_delete<'a>(&self, paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n        self.underlying.bulk_delete(paths).await\n    }\n\n    async fn exists(&self, path: &Path) -> StorageResult<bool> {\n        self.underlying.exists(path).await\n    }\n\n    async fn file_num_bytes(&self, path: &Path) -> StorageResult<u64> {\n        self.underlying.file_num_bytes(path).await\n    }\n\n    fn uri(&self) -> 
&Uri {\n        self.underlying.uri()\n    }\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::sync::Mutex;\n    use std::time::Duration;\n\n    use tokio::time::Instant;\n\n    use super::*;\n\n    #[derive(Debug)]\n    struct StorageWithDelay {\n        delays: Mutex<Vec<Duration>>,\n    }\n\n    impl StorageWithDelay {\n        pub fn new(mut delays: Vec<Duration>) -> StorageWithDelay {\n            delays.reverse();\n            StorageWithDelay {\n                delays: Mutex::new(delays),\n            }\n        }\n    }\n\n    #[async_trait]\n    impl Storage for StorageWithDelay {\n        fn uri(&self) -> &Uri {\n            todo!();\n        }\n\n        async fn check_connectivity(&self) -> anyhow::Result<()> {\n            todo!()\n        }\n        async fn put(&self, _path: &Path, _payload: Box<dyn PutPayload>) -> StorageResult<()> {\n            todo!();\n        }\n        fn copy_to<'life0, 'life1, 'life2, 'async_trait>(\n            &'life0 self,\n            _path: &'life1 Path,\n            _output: &'life2 mut dyn SendableAsync,\n        ) -> ::core::pin::Pin<\n            Box<\n                dyn ::core::future::Future<Output = StorageResult<()>>\n                    + ::core::marker::Send\n                    + 'async_trait,\n            >,\n        >\n        where\n            'life0: 'async_trait,\n            'life1: 'async_trait,\n            'life2: 'async_trait,\n            Self: 'async_trait,\n        {\n            todo!();\n        }\n\n        async fn get_slice(&self, _path: &Path, range: Range<usize>) -> StorageResult<OwnedBytes> {\n            let duration_opt = self.delays.lock().unwrap().pop();\n            let Some(delay) = duration_opt else {\n                return Err(\n                    StorageErrorKind::Internal.with_error(anyhow::anyhow!(\"internal error\"))\n                );\n            };\n            tokio::time::sleep(delay).await;\n            let buf = vec![0u8; range.len()];\n            
Ok(OwnedBytes::new(buf))\n        }\n        async fn get_slice_stream(\n            &self,\n            _path: &Path,\n            _range: Range<usize>,\n        ) -> StorageResult<Box<dyn AsyncRead + Send + Unpin>> {\n            todo!()\n        }\n        async fn get_all(&self, _path: &Path) -> StorageResult<OwnedBytes> {\n            todo!();\n        }\n        async fn delete(&self, _path: &Path) -> StorageResult<()> {\n            todo!();\n        }\n        async fn bulk_delete<'a>(&self, _paths: &[&'a Path]) -> Result<(), BulkDeleteError> {\n            todo!();\n        }\n        async fn exists(&self, _path: &Path) -> StorageResult<bool> {\n            todo!()\n        }\n        async fn file_num_bytes(&self, _path: &Path) -> StorageResult<u64> {\n            todo!();\n        }\n    }\n\n    #[tokio::test]\n    async fn test_timeout_and_retry_storage() {\n        tokio::time::pause();\n\n        let timeout_policy = StorageTimeoutPolicy {\n            min_throughtput_bytes_per_secs: 100_000,\n            timeout_millis: 2_000,\n            max_num_retries: 1,\n        };\n\n        let path = Path::new(\"foo/bar\");\n\n        {\n            let now = Instant::now();\n            let storage_with_delay =\n                StorageWithDelay::new(vec![Duration::from_secs(5), Duration::from_secs(3)]);\n            let storage =\n                TimeoutAndRetryStorage::new(Arc::new(storage_with_delay), timeout_policy.clone());\n            assert_eq!(\n                storage.get_slice(path, 10..100).await.unwrap_err().kind,\n                StorageErrorKind::Timeout\n            );\n            let elapsed = now.elapsed().as_millis();\n            assert!(elapsed.abs_diff(2 * 2_000) < 100);\n        }\n        {\n            let now = Instant::now();\n            let storage_with_delay =\n                StorageWithDelay::new(vec![Duration::from_secs(5), Duration::from_secs(1)]);\n            let storage = 
TimeoutAndRetryStorage::new(Arc::new(storage_with_delay), timeout_policy);\n            assert!(storage.get_slice(path, 10..100).await.is_ok(),);\n            let elapsed = now.elapsed().as_millis();\n            assert!(elapsed.abs_diff(2_000 + 1_000) < 100);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/src/versioned_component.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::io::Read;\n\nuse anyhow::Context;\nuse tantivy::directory::OwnedBytes;\n\n/// Helper trait for versioning.\n///\n/// See the unit test for an example.\npub trait VersionedComponent: Default + Copy + Clone {\n    /// This number is used to identify that the type.\n    const MAGIC_NUMBER: u32;\n    /// Type of the component we are versioning.\n    type Component;\n    /// Component name, used to make explicit error messages.\n    fn component_name() -> &'static str {\n        std::any::type_name::<Self::Component>()\n    }\n    /// Return the version code.\n    fn to_version_code(self) -> u32;\n\n    /// Serialize the header.\n    /// Only the current version is meant to be serialized.\n    fn header() -> [u8; 8] {\n        let mut header = [0u8; 8];\n        header[0..4].copy_from_slice(&Self::MAGIC_NUMBER.to_le_bytes());\n        header[4..8].copy_from_slice(&Self::default().to_version_code().to_le_bytes());\n        header\n    }\n\n    /// Deserialize the header from a `Read` trait.\n    /// This method will check for the magic number and version code.\n    fn try_read_component(bytes: &mut OwnedBytes) -> anyhow::Result<Self::Component> {\n        let version = try_read_version::<Self>(bytes)?;\n        version.deserialize_impl(bytes)\n    }\n\n    /// Parse the version code.\n    /// This version is meant to be implemented but only 
to be called\n    /// from `try_deserialize_from_bytes`.\n    ///\n    /// If the version is unknown, this method should return `None`.\n    fn try_from_version_code_impl(version_code: u32) -> Option<Self>;\n\n    /// Function to serialize a given component with the current codec.\n    fn serialize(component: &Self::Component) -> Vec<u8> {\n        let mut output = Vec::with_capacity(8);\n        output.extend_from_slice(&Self::header());\n        Self::serialize_impl(component, &mut output);\n        output\n    }\n\n    /// Serialize the component using the current format.\n    ///\n    /// This function should NOT serialize the header.\n    /// It should only append content to the `output` buffer.\n    ///\n    /// This function is meant to be implemented but should not be called directly.\n    /// Instead, clients should use `.serialize(..)`.\n    fn serialize_impl(component: &Self::Component, output: &mut Vec<u8>);\n\n    /// This method is meant to be implemented but not called, except by `try_read_component`.\n    ///\n    /// This method should consume the bytes from the `OwnedBytes`.\n    fn deserialize_impl(&self, bytes: &mut OwnedBytes) -> anyhow::Result<Self::Component>;\n}\n\n/// Reads and deserializes the 8-byte header from the given `OwnedBytes`.\n///\n/// (This function is not part of the trait to make it private.)\nfn try_read_version<V: VersionedComponent>(bytes: &mut OwnedBytes) -> anyhow::Result<V> {\n    let mut header_bytes: [u8; 8] = [0u8; 8];\n    bytes\n        .read_exact(&mut header_bytes[..])\n        .with_context(|| format!(\"failed to read header for {}\", V::component_name()))?;\n    try_deserialize_from_bytes::<V>(header_bytes)\n}\n\n/// Deserialize the header from 8 bytes.\n/// An error is returned if the magic number does not match,\n/// or if the version is unsupported.\n///\n/// (This function is not part of the trait to make it private.)\nfn try_deserialize_from_bytes<V: VersionedComponent>(header_bytes: [u8; 8]) -> anyhow::Result<V> {\n    let 
magic_number = u32::from_le_bytes(header_bytes[0..4].try_into().unwrap());\n    if magic_number != V::MAGIC_NUMBER {\n        anyhow::bail!(\"hot directory metadata's magic number does not match\");\n    }\n    let version_code: u32 = u32::from_le_bytes(header_bytes[4..8].try_into().unwrap());\n    V::try_from_version_code_impl(version_code).with_context(|| {\n        format!(\n            \"version code {} is not supported for {}\",\n            version_code,\n            V::component_name()\n        )\n    })\n}\n\n#[cfg(test)]\nmod tests {\n    use tantivy::directory::OwnedBytes;\n\n    use crate::VersionedComponent;\n\n    #[derive(Copy, Clone, Default)]\n    #[repr(u32)]\n    enum FakeComponentCodec {\n        V1,\n        #[default]\n        V2 = 2,\n    }\n\n    #[derive(Debug)]\n    struct FakeComponent {\n        value: u32,\n    }\n\n    impl VersionedComponent for FakeComponentCodec {\n        const MAGIC_NUMBER: u32 = 332_221_734u32;\n\n        type Component = FakeComponent;\n\n        fn to_version_code(self) -> u32 {\n            self as u32\n        }\n\n        fn try_from_version_code_impl(version_code: u32) -> Option<Self> {\n            match version_code {\n                1u32 => Some(Self::V1),\n                2u32 => Some(Self::V2),\n                _ => None,\n            }\n        }\n\n        fn serialize_impl(component: &Self::Component, output: &mut Vec<u8>) {\n            output.extend_from_slice(&component.value.to_le_bytes());\n        }\n\n        fn deserialize_impl(&self, bytes: &mut OwnedBytes) -> anyhow::Result<Self::Component> {\n            match self {\n                FakeComponentCodec::V1 => {\n                    if bytes.len() < 8 {\n                        anyhow::bail!(\"not enough bytes to deserialize\");\n                    }\n                    let value_bytes: [u8; 8] = bytes[0..8].try_into().unwrap();\n                    let value: u32 = u64::from_le_bytes(value_bytes) as u32;\n                    
Ok(FakeComponent { value })\n                }\n                FakeComponentCodec::V2 => {\n                    if bytes.len() < 4 {\n                        anyhow::bail!(\"not enough bytes to deserialize\");\n                    }\n                    let value_bytes: [u8; 4] = bytes[0..4].try_into().unwrap();\n                    bytes.advance(4);\n                    let value: u32 = u32::from_le_bytes(value_bytes);\n                    Ok(FakeComponent { value })\n                }\n            }\n        }\n    }\n\n    #[test]\n    fn test_versioned_component() {\n        let component = FakeComponent { value: 42 };\n        let buf = FakeComponentCodec::serialize(&component);\n        {\n            let mut payload = OwnedBytes::new(buf.clone());\n            let fake_component = FakeComponentCodec::try_read_component(&mut payload).unwrap();\n            assert_eq!(fake_component.value, 42u32);\n        }\n        {\n            let mut buf_clone = buf.clone();\n            buf_clone[0] = 0u8;\n            let mut payload = OwnedBytes::new(buf_clone);\n            let fake_component_err =\n                FakeComponentCodec::try_read_component(&mut payload).unwrap_err();\n            assert!(\n                fake_component_err\n                    .to_string()\n                    .to_lowercase()\n                    .contains(\"magic number\")\n            );\n        }\n        {\n            let mut buf_clone = buf;\n            buf_clone.truncate(4);\n            buf_clone.extend_from_slice(&1u32.to_le_bytes());\n            buf_clone.extend_from_slice(&32u64.to_le_bytes());\n            let mut payload = OwnedBytes::new(buf_clone);\n            let fake_component = FakeComponentCodec::try_read_component(&mut payload).unwrap();\n            assert_eq!(fake_component.value, 32u32);\n        }\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/tests/azure_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// This file is an integration test that assumes that a connection\n// to Azurite (the emulated azure blob storage environment)\n// with default `loose` config is possible.\n\n#[cfg(feature = \"integration-testsuite\")]\n#[tokio::test]\n#[cfg_attr(not(feature = \"ci-test\"), ignore)]\nasync fn azure_storage_test_suite() -> anyhow::Result<()> {\n    use std::path::PathBuf;\n\n    use anyhow::Context;\n    use azure_storage_blobs::prelude::ClientBuilder;\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_storage::{AzureBlobStorage, MultiPartPolicy};\n    let _ = tracing_subscriber::fmt::try_init();\n\n    // Setup container.\n    let container_name = append_random_suffix(\"quickwit\").to_lowercase();\n    let container_client = ClientBuilder::emulator().container_client(&container_name);\n    container_client.create().into_future().await?;\n\n    let mut object_storage = AzureBlobStorage::new_emulated(&container_name);\n    quickwit_storage::storage_test_suite(&mut object_storage).await?;\n\n    let mut object_storage = AzureBlobStorage::new_emulated(&container_name).with_prefix(\n        PathBuf::from(\"/integration-tests/test-azure-compatible-storage\"),\n    );\n    quickwit_storage::storage_test_single_part_upload(&mut object_storage)\n        .await\n        .context(\"test single-part upload failed\")?;\n\n    
object_storage.set_policy(MultiPartPolicy {\n        // On azure, block size is limited between 64KB and 100MB.\n        target_part_num_bytes: 5 * 1_024 * 1_024, // 5MiB\n        max_num_parts: 10_000,\n        multipart_threshold_num_bytes: 10_000_000,\n        max_object_num_bytes: 5_000_000_000_000,\n        max_concurrent_uploads: 100,\n    });\n    quickwit_storage::storage_test_multi_part_upload(&mut object_storage)\n        .await\n        .context(\"test multipart upload failed\")?;\n\n    // Teardown container.\n    container_client.delete().into_future().await?;\n    Ok(())\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/tests/google_cloud_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// This file is an integration test that assumes that a connection\n// to Fake GCS Server (the emulated google cloud storage environment)\n\n#[cfg(all(feature = \"integration-testsuite\", feature = \"gcs\"))]\n#[cfg_attr(not(feature = \"ci-test\"), ignore)]\nmod gcp_storage_test_suite {\n    use std::str::FromStr;\n\n    use anyhow::Context;\n    use quickwit_common::rand::append_random_suffix;\n    use quickwit_common::setup_logging_for_tests;\n    use quickwit_common::uri::Uri;\n    use quickwit_storage::test_config_helpers::{\n        LOCAL_GCP_EMULATOR_ENDPOINT, new_emulated_google_cloud_storage,\n    };\n\n    pub fn sign_gcs_request(req: &mut reqwest::Request) {\n        req.headers_mut().insert(\n            reqwest::header::AUTHORIZATION,\n            reqwest::header::HeaderValue::from_str(\"Bearer dummy\").unwrap(),\n        );\n    }\n\n    async fn create_gcs_bucket(bucket_name: &str) -> anyhow::Result<()> {\n        let client = reqwest::Client::new();\n        let url = format!(\"{LOCAL_GCP_EMULATOR_ENDPOINT}/storage/v1/b\");\n        let mut request = client\n            .post(url)\n            .body(serde_json::to_vec(&serde_json::json!({\n                \"name\": bucket_name,\n            }))?)\n            .header(reqwest::header::CONTENT_TYPE, \"application/json\")\n            .build()?;\n\n        sign_gcs_request(&mut 
request);\n\n        let response = client.execute(request).await?;\n\n        if !response.status().is_success() {\n            let error_text = response.text().await?;\n            anyhow::bail!(\"Failed to create bucket: {}\", error_text);\n        };\n        Ok(())\n    }\n\n    #[tokio::test]\n    async fn google_cloud_storage_test_suite() -> anyhow::Result<()> {\n        setup_logging_for_tests();\n\n        let bucket_name = append_random_suffix(\"sample-bucket\").to_lowercase();\n        create_gcs_bucket(bucket_name.as_str())\n            .await\n            .context(\"Failed to create test GCS bucket\")?;\n\n        let mut object_storage =\n            new_emulated_google_cloud_storage(&Uri::from_str(&format!(\"gs://{bucket_name}\"))?)?;\n\n        quickwit_storage::storage_test_suite(&mut object_storage).await?;\n\n        let mut object_storage = new_emulated_google_cloud_storage(&Uri::from_str(&format!(\n            \"gs://{bucket_name}/integration-tests/test-gcs-storage\"\n        ))?)?;\n\n        quickwit_storage::storage_test_single_part_upload(&mut object_storage)\n            .await\n            .context(\"test single-part upload failed\")?;\n\n        // TODO: Uncomment storage_test_multi_part_upload when the XML API is\n        // supported in the emulated GCS server\n        // (https://github.com/fsouza/fake-gcs-server/pull/1164)\n\n        // object_storage.set_policy(MultiPartPolicy {\n        //     target_part_num_bytes: 5 * 1_024 * 1_024,\n        //     max_num_parts: 10_000,\n        //     multipart_threshold_num_bytes: 10_000_000,\n        //     max_object_num_bytes: 5_000_000_000_000,\n        //     max_concurrent_uploads: 100,\n        // });\n        // quickwit_storage::storage_test_multi_part_upload(&mut object_storage)\n        //     .await\n        //     .context(\"test multipart upload failed\")?;\n        Ok(())\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-storage/tests/s3_storage.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// This file is an integration test that assumes that the environment\n// makes it po\n\n#[cfg(feature = \"integration-testsuite\")]\npub mod s3_storage_test_suite {\n\n    use std::path::PathBuf;\n    use std::str::FromStr;\n\n    use anyhow::Context;\n    use once_cell::sync::OnceCell;\n    use quickwit_common::setup_logging_for_tests;\n    use quickwit_common::uri::Uri;\n    use quickwit_config::S3StorageConfig;\n    use quickwit_storage::{MultiPartPolicy, S3CompatibleObjectStorage};\n    use tokio::runtime::Runtime;\n\n    // Introducing a common runtime for the unit tests in this file.\n    //\n    // By default, tokio creates a new runtime, for each unit test.\n    // Here, we want to use the singleton `AwsSdkConfig` object.\n    // This object packs a smithy connector which itself includes a\n    // hyper client pool. 
A hyper client cannot be used from multiple runtimes.\n    fn test_runtime_singleton() -> &'static Runtime {\n        static RUNTIME_CACHE: OnceCell<tokio::runtime::Runtime> = OnceCell::new();\n        RUNTIME_CACHE.get_or_init(|| {\n            tokio::runtime::Builder::new_multi_thread()\n                .worker_threads(1)\n                .enable_all()\n                .build()\n                .unwrap()\n        })\n    }\n\n    async fn run_s3_storage_test_suite(s3_storage_config: S3StorageConfig, bucket_uri: &str) {\n        setup_logging_for_tests();\n\n        let storage_uri = Uri::from_str(bucket_uri).unwrap();\n        let mut object_storage =\n            S3CompatibleObjectStorage::from_uri(&s3_storage_config, &storage_uri)\n                .await\n                .unwrap();\n\n        quickwit_storage::storage_test_suite(&mut object_storage)\n            .await\n            .context(\"S3 storage test suite failed\")\n            .unwrap();\n\n        let mut object_storage =\n            S3CompatibleObjectStorage::from_uri(&s3_storage_config, &storage_uri)\n                .await\n                .unwrap()\n                .with_prefix(PathBuf::from(\"test-s3-compatible-storage\"));\n\n        quickwit_storage::storage_test_single_part_upload(&mut object_storage)\n            .await\n            .context(\"test single-part upload failed\")\n            .unwrap();\n\n        object_storage.set_policy(MultiPartPolicy {\n            target_part_num_bytes: 5 * 1_024 * 1_024, //< the minimum on S3 is 5MB.\n            max_num_parts: 10_000,\n            multipart_threshold_num_bytes: 10_000_000,\n            max_object_num_bytes: 5_000_000_000_000,\n            max_concurrent_uploads: 100,\n        });\n\n        quickwit_storage::storage_test_multi_part_upload(&mut object_storage)\n            .await\n            .context(\"test multipart upload failed\")\n            .unwrap();\n    }\n\n    #[test]\n    #[cfg_attr(not(feature = \"ci-test\"), ignore)]\n    
fn test_suite_on_s3_storage_path_style_access() {\n        use quickwit_common::rand::append_random_suffix;\n\n        let s3_storage_config = S3StorageConfig {\n            force_path_style_access: true,\n            ..Default::default()\n        };\n        let bucket_uri =\n            append_random_suffix(\"s3://quickwit-integration-tests/test-path-style-access\");\n        let test_runtime = test_runtime_singleton();\n        test_runtime.block_on(run_s3_storage_test_suite(s3_storage_config, &bucket_uri));\n    }\n\n    #[test]\n    #[cfg_attr(not(feature = \"ci-test\"), ignore)]\n    fn test_suite_on_s3_storage_virtual_hosted_style_access() {\n        use quickwit_common::rand::append_random_suffix;\n\n        let s3_storage_config = S3StorageConfig {\n            force_path_style_access: false,\n            ..Default::default()\n        };\n        let bucket_uri = append_random_suffix(\n            \"s3://quickwit-integration-tests/test-virtual-hosted-style-access\",\n        );\n        let test_runtime = test_runtime_singleton();\n        test_runtime.block_on(run_s3_storage_test_suite(s3_storage_config, &bucket_uri));\n    }\n\n    #[test]\n    #[cfg_attr(not(feature = \"ci-test\"), ignore)]\n    fn test_suite_on_s3_storage_bulk_delete_single_object_delete_api() {\n        use std::str::FromStr;\n\n        use anyhow::Context;\n        use quickwit_common::rand::append_random_suffix;\n        use quickwit_common::uri::Uri;\n\n        let s3_storage_config = S3StorageConfig {\n            disable_multi_object_delete: true,\n            ..Default::default()\n        };\n        let bucket_uri = append_random_suffix(\n            \"s3://quickwit-integration-tests/test-bulk-delete-single-object-delete-api\",\n        );\n        let storage_uri = Uri::from_str(&bucket_uri).unwrap();\n        let test_runtime = test_runtime_singleton();\n        test_runtime.block_on(async move {\n            let mut object_storage =\n                
S3CompatibleObjectStorage::from_uri(&s3_storage_config, &storage_uri)\n                    .await\n                    .unwrap();\n            quickwit_storage::test_write_and_bulk_delete(&mut object_storage)\n                .await\n                .context(\"test bulk delete single-object delete API failed\")\n                .unwrap();\n        });\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-telemetry/Cargo.toml",
    "content": "[package]\nname = \"quickwit-telemetry\"\ndescription = \"Open Telemetry services\"\n\nversion.workspace = true\nedition.workspace = true\nhomepage.workspace = true\ndocumentation.workspace = true\nrepository.workspace = true\nauthors.workspace = true\nlicense.workspace = true\n\n[dependencies]\nasync-trait = { workspace = true }\nhostname = { workspace = true }\nmd5 = { workspace = true }\nonce_cell = { workspace = true }\nreqwest = { workspace = true }\nserde = { workspace = true }\ntokio = { workspace = true }\ntracing = { workspace = true }\nusername = { workspace = true }\nuuid = { workspace = true }\n\n# This is actually not used directly the goal is to fix the version\n# used by reqwest. 0.8.30 has an unclear license.\nencoding_rs = { workspace = true }\n\nquickwit-common = { workspace = true }\n\n[dev-dependencies]\nserde_json = { workspace = true }\n\n[package.metadata.cargo-machete]\n# see above\nignored = [\"encoding_rs\"]\n"
  },
  {
    "path": "quickwit/quickwit-telemetry/src/lib.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n#![allow(clippy::bool_assert_comparison)]\n#![deny(clippy::disallowed_methods)]\n\npub mod payload;\n/// This crate contains  the code responsible for sending usage data to Quickwit inc's server.\nmod sender;\npub(crate) mod sink;\n\nuse once_cell::sync::OnceCell;\nuse payload::QuickwitTelemetryInfo;\nuse tracing::info;\n\nuse crate::payload::TelemetryEvent;\npub use crate::sender::is_telemetry_disabled;\nuse crate::sender::{TelemetryLoopHandle, TelemetrySender};\n\nstatic TELEMETRY_SENDER: OnceCell<TelemetrySender> = OnceCell::new();\n\n/// Returns a `TelemetryLoopHandle` if the telemetry loop is not yet started.\npub fn start_telemetry_loop(quickwit_info: QuickwitTelemetryInfo) -> Option<TelemetryLoopHandle> {\n    let telemetry_sender =\n        TELEMETRY_SENDER.get_or_init(|| TelemetrySender::from_quickwit_info(quickwit_info));\n    // This should not happen... unless telemetry is enabled and you are running tests in parallel\n    // in the same process.\n    if telemetry_sender.loop_started() {\n        info!(\"telemetry loop already started. 
please disable telemetry during tests\");\n        return None;\n    }\n    Some(telemetry_sender.start_loop())\n}\n\n/// Sends a telemetry event to Quickwit's server via HTTP.\n///\n/// Telemetry guarantees to send at most 1 request per minute.\n/// Each request can ship at most 10 messages.\n///\n/// If this method is called too often, some events will be dropped.\n///\n/// If the HTTP request fails, the error is silently ignored.\n///\n/// We deliberately use an enum here to make it easier for readers\n/// to audit the type of information that is sent home.\npub async fn send_telemetry_event(event: TelemetryEvent) {\n    if let Some(telemetry_sender) = TELEMETRY_SENDER.get() {\n        telemetry_sender.send(event).await;\n    }\n}\n\n/// This environment variable can be set to disable sending telemetry events.\npub const DISABLE_TELEMETRY_ENV_KEY: &str = \"QW_DISABLE_TELEMETRY\";\n"
  },
  {
    "path": "quickwit/quickwit-telemetry/src/payload.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::collections::HashSet;\nuse std::env;\nuse std::time::UNIX_EPOCH;\n\nuse serde::{Deserialize, Serialize};\nuse uuid::Uuid;\n\n/// Represents the payload of the request sent with telemetry requests.\n#[derive(Debug, Serialize, Deserialize)]\npub struct TelemetryPayload {\n    /// Client information. See details in `[ClientInformation]`.\n    pub client_info: ClientInfo,\n    /// Quickwit information. 
See details in `[QuickwitTelemetryInfo]`.\n    pub quickwit_info: QuickwitTelemetryInfo,\n    pub events: Vec<EventWithTimestamp>,\n    /// Represents the number of events that were dropped due to the\n    /// combination of the `TELEMETRY_PUSH_COOLDOWN` and `MAX_NUM_EVENTS_IN_QUEUE`.\n    pub num_dropped_events: usize,\n}\n\n#[derive(Debug, Serialize, Deserialize)]\npub struct EventWithTimestamp {\n    /// Unix time in seconds.\n    pub unixtime: u64,\n    /// Telemetry event.\n    #[serde(flatten)]\n    pub event: TelemetryEvent,\n}\n\n/// Returns the number of seconds elapsed since UNIX_EPOCH.\n///\n/// If the system clock is set before 1970, returns 0.\nfn unixtime() -> u64 {\n    match UNIX_EPOCH.elapsed() {\n        Ok(duration) => duration.as_secs(),\n        Err(_) => 0u64,\n    }\n}\n\nimpl From<TelemetryEvent> for EventWithTimestamp {\n    fn from(event: TelemetryEvent) -> Self {\n        EventWithTimestamp {\n            unixtime: unixtime(),\n            event,\n        }\n    }\n}\n\n/// Represents a Telemetry Event sent to Quickwit's telemetry server for usage information.\n#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]\n#[serde(tag = \"type\")]\n#[serde(rename_all = \"snake_case\")]\npub enum TelemetryEvent {\n    RunCommand,\n    /// EndCommand (with the return code).\n    EndCommand {\n        return_code: i32,\n    },\n    /// Event sent every 12h to signal the server is running.\n    Running,\n    /// UI index.html was requested.\n    UiIndexPageLoad,\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub struct ClientInfo {\n    session_uuid: uuid::Uuid,\n    os: String,\n    arch: String,\n    hashed_host_username: String,\n    kubernetes: bool,\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub struct QuickwitTelemetryInfo {\n    pub version: String,\n    pub services: HashSet<String>,\n    pub features: HashSet<QuickwitFeature>,\n}\n\nimpl QuickwitTelemetryInfo {\n    pub fn new(services: HashSet<String>, features: 
HashSet<QuickwitFeature>) -> Self {\n        Self {\n            features,\n            version: env!(\"CARGO_PKG_VERSION\").to_string(),\n            services,\n        }\n    }\n}\n\nimpl Default for QuickwitTelemetryInfo {\n    fn default() -> Self {\n        Self {\n            features: HashSet::new(),\n            version: env!(\"CARGO_PKG_VERSION\").to_string(),\n            services: HashSet::new(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq, Hash)]\n#[serde(rename_all = \"snake_case\")]\npub enum QuickwitFeature {\n    FileBackedMetastore,\n    Jaeger,\n    Otlp,\n    PostgresqMetastore,\n}\n\nfn hashed_host_username() -> String {\n    let hostname = hostname::get()\n        .map(|hostname| hostname.to_string_lossy().to_string())\n        .unwrap_or_default();\n    let username = username::get_user_name().unwrap_or_default();\n    let hashed_value = format!(\"{hostname}:{username}\");\n    let digest = md5::compute(hashed_value.as_bytes());\n    format!(\"{digest:x}\")\n}\n\nimpl Default for ClientInfo {\n    fn default() -> ClientInfo {\n        ClientInfo {\n            session_uuid: Uuid::new_v4(),\n            os: env::consts::OS.to_string(),\n            arch: env::consts::ARCH.to_string(),\n            hashed_host_username: hashed_host_username(),\n            kubernetes: std::env::var_os(\"KUBERNETES_SERVICE_HOST\").is_some(),\n        }\n    }\n}\n\n#[cfg(test)]\nmod tests {\n    use serde_json;\n\n    use super::{EventWithTimestamp, TelemetryEvent};\n\n    #[test]\n    fn test_serialize_payload_as_expected() {\n        let event = EventWithTimestamp {\n            unixtime: 0,\n            event: TelemetryEvent::EndCommand { return_code: 0 },\n        };\n        let json = serde_json::to_string(&event).unwrap();\n        assert_eq!(\n            json,\n            r#\"{\"unixtime\":0,\"type\":\"end_command\",\"return_code\":0}\"#\n        );\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-telemetry/src/sender.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::mem;\nuse std::sync::Arc;\nuse std::sync::atomic::{AtomicBool, Ordering};\nuse std::time::Duration;\n\nuse tokio::sync::mpsc::{Receiver, Sender};\nuse tokio::sync::{Mutex, RwLock, oneshot};\nuse tokio::task::JoinHandle;\nuse tokio::time::Interval;\nuse tracing::info;\n\nuse crate::payload::{\n    ClientInfo, EventWithTimestamp, QuickwitTelemetryInfo, TelemetryEvent, TelemetryPayload,\n};\nuse crate::sink::{HttpClient, Sink};\n\n/// At most 1 Request per minutes.\nconst TELEMETRY_PUSH_COOLDOWN: Duration = Duration::from_secs(60);\n\n/// Interval at which to send telemetry `Running` event.\nconst TELEMETRY_RUNNING_EVENT_INTERVAL: Duration =\n    Duration::from_secs(if cfg!(test) { 3 } else { 60 * 60 * 12 }); // 12h\n\n/// Upon termination of the program, we send one last telemetry request with pending events.\n/// This duration is the amount of time we wait for at most to send that last telemetry request.\nconst LAST_REQUEST_TIMEOUT: Duration = Duration::from_secs(1);\n\nconst MAX_NUM_EVENTS_IN_QUEUE: usize = 10;\n\n#[cfg(test)]\nstruct ClockButton(Sender<()>);\n\n#[cfg(test)]\nimpl ClockButton {\n    async fn tick(&self) {\n        let _ = self.0.send(()).await;\n    }\n}\n\nenum Clock {\n    Periodical(Mutex<Interval>),\n    #[cfg(test)]\n    Manual(Mutex<Receiver<()>>),\n}\n\nimpl Clock {\n    pub fn periodical(period: Duration) -> 
Clock {\n        let interval = tokio::time::interval(period);\n        Clock::Periodical(Mutex::new(interval))\n    }\n\n    #[cfg(test)]\n    pub async fn manual() -> (ClockButton, Clock) {\n        let (tx, rx) = tokio::sync::mpsc::channel(1);\n        let _ = tx.send(()).await;\n        let button = ClockButton(tx);\n        (button, Clock::Manual(Mutex::new(rx)))\n    }\n\n    async fn tick(&self) {\n        match self {\n            Clock::Periodical(interval) => {\n                interval.lock().await.tick().await;\n            }\n            #[cfg(test)]\n            Clock::Manual(channel) => {\n                channel.lock().await.recv().await;\n            }\n        }\n    }\n}\n\n#[derive(Default)]\nstruct EventsState {\n    events: Vec<EventWithTimestamp>,\n    num_dropped_events: usize,\n}\n\nimpl EventsState {\n    fn drain_events(&mut self) -> EventsState {\n        mem::replace(\n            self,\n            EventsState {\n                events: Vec::new(),\n                num_dropped_events: 0,\n            },\n        )\n    }\n\n    /// Adds an event.\n    /// If the queue is already saturated, (ie. 
it has reached `MAX_NUM_EVENTS_IN_QUEUE` events),\n    /// the event is dropped and counted in `num_dropped_events`.\n    ///\n    /// Returns true iff it was the first event in the queue.\n    fn push_event(&mut self, event: TelemetryEvent) -> bool {\n        if self.events.len() >= MAX_NUM_EVENTS_IN_QUEUE {\n            self.num_dropped_events += 1;\n            return false;\n        }\n        let events_was_empty = self.events.is_empty();\n        self.events.push(EventWithTimestamp::from(event));\n        events_was_empty\n    }\n}\n\nstruct Events {\n    state: RwLock<EventsState>,\n    items_available_tx: Sender<()>,\n    items_available_rx: RwLock<Receiver<()>>,\n}\n\nimpl Default for Events {\n    fn default() -> Self {\n        let (items_available_tx, items_available_rx) = tokio::sync::mpsc::channel(1);\n        Events {\n            state: RwLock::new(EventsState::default()),\n            items_available_tx,\n            items_available_rx: RwLock::new(items_available_rx),\n        }\n    }\n}\n\nimpl Events {\n    /// Waits for events to be available (if there are pending events, then do not wait)\n    /// and then drains them from the queue.\n    async fn drain_events(&self) -> EventsState {\n        self.items_available_rx.write().await.recv().await;\n        self.state.write().await.drain_events()\n    }\n\n    async fn push_event(&self, event: TelemetryEvent) {\n        let is_first_event = self.state.write().await.push_event(event);\n        if is_first_event {\n            let _ = self.items_available_tx.send(()).await;\n        }\n    }\n}\n\npub(crate) struct Inner {\n    sink: Option<Box<dyn Sink>>,\n    client_info: ClientInfo,\n    quickwit_info: QuickwitTelemetryInfo,\n    /// This channel is just used to signal there are new items available.\n    events: Events,\n    clock: Clock,\n    is_started: AtomicBool,\n}\n\nimpl Inner {\n    pub fn is_disabled(&self) -> bool {\n        self.sink.is_none()\n    }\n\n    async fn create_telemetry_payload(&self) -> TelemetryPayload {\n        let events_state = 
self.events.drain_events().await;\n        TelemetryPayload {\n            client_info: self.client_info.clone(),\n            quickwit_info: self.quickwit_info.clone(),\n            events: events_state.events,\n            num_dropped_events: events_state.num_dropped_events,\n        }\n    }\n\n    /// Wait for events to be available (if there are pending events, then do not wait)\n    /// and then send them to the ingest API server.\n    ///\n    /// If the requests fails, it fails silently.\n    async fn send_pending_events(&self) {\n        if let Some(sink) = self.sink.as_ref() {\n            let payload = self.create_telemetry_payload().await;\n            sink.send_payload(payload).await;\n        }\n    }\n\n    async fn send(&self, event: TelemetryEvent) {\n        if self.is_disabled() {\n            return;\n        }\n        self.events.push_event(event).await;\n    }\n}\n\npub struct TelemetrySender {\n    pub(crate) inner: Arc<Inner>,\n}\n\npub enum TelemetryLoopHandle {\n    NoLoop,\n    WithLoop {\n        join_handle: JoinHandle<()>,\n        terminate_command_tx: oneshot::Sender<()>,\n    },\n}\n\nimpl TelemetryLoopHandle {\n    /// Terminate telemetry will exit the telemetry loop\n    /// and possibly send the last request, possibly ignoring the\n    /// telemetry cooldown.\n    pub async fn terminate_telemetry(self) {\n        if let Self::WithLoop {\n            join_handle,\n            terminate_command_tx,\n        } = self\n        {\n            let _ = terminate_command_tx.send(());\n            let _ = tokio::time::timeout(LAST_REQUEST_TIMEOUT, join_handle).await;\n        }\n    }\n}\n\nimpl TelemetrySender {\n    pub fn from_quickwit_info(quickwit_info: QuickwitTelemetryInfo) -> Self {\n        let http_client = create_http_client();\n        TelemetrySender::new(\n            quickwit_info,\n            http_client,\n            Clock::periodical(TELEMETRY_PUSH_COOLDOWN),\n        )\n    }\n\n    fn new<S: Sink>(\n        
quickwit_info: QuickwitTelemetryInfo,\n        sink_opt: Option<S>,\n        clock: Clock,\n    ) -> Self {\n        let sink_opt: Option<Box<dyn Sink>> = if let Some(sink) = sink_opt {\n            Some(Box::new(sink))\n        } else {\n            None\n        };\n        Self {\n            inner: Arc::new(Inner {\n                sink: sink_opt,\n                client_info: ClientInfo::default(),\n                quickwit_info,\n                events: Events::default(),\n                clock,\n                is_started: AtomicBool::new(false),\n            }),\n        }\n    }\n\n    pub fn loop_started(&self) -> bool {\n        self.inner.is_started.load(Ordering::Relaxed)\n    }\n\n    pub fn start_loop(&self) -> TelemetryLoopHandle {\n        let (terminate_command_tx, mut terminate_command_rx) = oneshot::channel();\n        if self.inner.is_disabled() {\n            return TelemetryLoopHandle::NoLoop;\n        }\n\n        assert!(\n            self.inner\n                .is_started\n                .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)\n                .is_ok(),\n            \"The telemetry loop is already started.\"\n        );\n\n        let inner = self.inner.clone();\n        start_monitor_if_server_running_task(inner.clone());\n        let join_handle = tokio::task::spawn(async move {\n            // This channel is used to send the command to terminate telemetry.\n            loop {\n                let quit_loop = tokio::select! 
{\n                    _ = (&mut terminate_command_rx) => { true }\n                    _ = inner.clock.tick() => { false }\n                };\n                inner.send_pending_events().await;\n                if quit_loop {\n                    break;\n                }\n            }\n        });\n        TelemetryLoopHandle::WithLoop {\n            join_handle,\n            terminate_command_tx,\n        }\n    }\n\n    pub async fn send(&self, event: TelemetryEvent) {\n        self.inner.send(event).await;\n    }\n}\n\n/// telemetry is disabled in tests.\n#[cfg(test)]\npub fn is_telemetry_disabled() -> bool {\n    true\n}\n/// Check to see if telemetry is enabled.\n#[cfg(not(test))]\npub fn is_telemetry_disabled() -> bool {\n    quickwit_common::get_bool_from_env(crate::DISABLE_TELEMETRY_ENV_KEY, false)\n}\n\nfn start_monitor_if_server_running_task(telemetry_sender: Arc<Inner>) {\n    let mut clock = tokio::time::interval(TELEMETRY_RUNNING_EVENT_INTERVAL);\n    tokio::spawn(async move {\n        // Drop the first immediate tick.\n        clock.tick().await;\n        loop {\n            clock.tick().await;\n            telemetry_sender.send(TelemetryEvent::Running).await;\n        }\n    });\n}\n\nfn create_http_client() -> Option<HttpClient> {\n    if is_telemetry_disabled() {\n        info!(\"telemetry to quickwit is disabled\");\n        return None;\n    }\n    let client = HttpClient::try_new()?;\n    info!(\"telemetry to {} is enabled\", client.endpoint());\n    Some(client)\n}\n\n#[cfg(test)]\nmod tests {\n\n    use std::env;\n\n    use super::*;\n\n    #[ignore]\n    #[tokio::test]\n    async fn test_enabling_and_disabling_telemetry() {\n        // SAFETY: this test may not be entirely sound if not run with nextest or --test-threads=1\n        // as this is only a test, and it would be extremly inconvenient to run it in a different\n        // way, we are keeping it that way\n\n        // We group the two in a single test to ensure it happens on the 
same thread.\n        unsafe { env::set_var(crate::DISABLE_TELEMETRY_ENV_KEY, \"\") };\n        assert_eq!(\n            TelemetrySender::from_quickwit_info(QuickwitTelemetryInfo::default())\n                .inner\n                .is_disabled(),\n            true\n        );\n        unsafe { env::remove_var(crate::DISABLE_TELEMETRY_ENV_KEY) };\n        assert_eq!(\n            TelemetrySender::from_quickwit_info(QuickwitTelemetryInfo::default())\n                .inner\n                .is_disabled(),\n            false\n        );\n    }\n\n    #[tokio::test]\n    async fn test_telemetry_no_wait_for_first_event() {\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        let (_clock_btn, clock) = Clock::manual().await;\n        let telemetry_sender =\n            TelemetrySender::new(QuickwitTelemetryInfo::default(), Some(tx), clock);\n        let loop_handler = telemetry_sender.start_loop();\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        let payload_opt = rx.recv().await;\n        assert!(payload_opt.is_some());\n        let payload = payload_opt.unwrap();\n        assert_eq!(payload.events.len(), 1);\n        loop_handler.terminate_telemetry().await;\n    }\n\n    #[tokio::test]\n    async fn test_telemetry_two_events() {\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        let (clock_btn, clock) = Clock::manual().await;\n        let telemetry_sender =\n            TelemetrySender::new(QuickwitTelemetryInfo::default(), Some(tx), clock);\n        let loop_handler = telemetry_sender.start_loop();\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        {\n            let payload = rx.recv().await.unwrap();\n            assert_eq!(payload.events.len(), 1);\n        }\n        clock_btn.tick().await;\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        {\n            let payload = rx.recv().await.unwrap();\n            
assert_eq!(payload.events.len(), 1);\n        }\n        loop_handler.terminate_telemetry().await;\n    }\n\n    #[tokio::test]\n    async fn test_telemetry_uptime_events() {\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        let (clock_btn, clock) = Clock::manual().await;\n        let telemetry_sender =\n            TelemetrySender::new(QuickwitTelemetryInfo::default(), Some(tx), clock);\n        let loop_handler = telemetry_sender.start_loop();\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        {\n            let payload = rx.recv().await.unwrap();\n            assert_eq!(payload.events.len(), 1);\n        }\n        clock_btn.tick().await;\n        tokio::time::sleep(TELEMETRY_RUNNING_EVENT_INTERVAL + Duration::from_secs(1)).await;\n        {\n            let payload = rx.recv().await.unwrap();\n            assert_eq!(payload.events.len(), 1);\n            assert_eq!(payload.events[0].event, TelemetryEvent::Running);\n        }\n        loop_handler.terminate_telemetry().await;\n    }\n\n    #[tokio::test]\n    async fn test_telemetry_cooldown_observed() {\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        let (clock_btn, clock) = Clock::manual().await;\n        let telemetry_sender =\n            TelemetrySender::new(QuickwitTelemetryInfo::default(), Some(tx), clock);\n        let loop_handler = telemetry_sender.start_loop();\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        {\n            let payload = rx.recv().await.unwrap();\n            assert_eq!(payload.events.len(), 1);\n        }\n        tokio::task::yield_now().await;\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n\n        let timeout_res = tokio::time::timeout(Duration::from_millis(1), rx.recv()).await;\n        assert!(timeout_res.is_err());\n\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        clock_btn.tick().await;\n        {\n       
     let payload = rx.recv().await.unwrap();\n            assert_eq!(payload.events.len(), 2);\n        }\n        loop_handler.terminate_telemetry().await;\n    }\n\n    #[tokio::test]\n    async fn test_terminate_telemetry_sends_pending_events() {\n        let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();\n        let (_clock_btn, clock) = Clock::manual().await;\n        let telemetry_sender =\n            TelemetrySender::new(QuickwitTelemetryInfo::default(), Some(tx), clock);\n        let loop_handler = telemetry_sender.start_loop();\n        telemetry_sender.send(TelemetryEvent::UiIndexPageLoad).await;\n        let payload = rx.recv().await.unwrap();\n        assert_eq!(payload.events.len(), 1);\n        telemetry_sender\n            .send(TelemetryEvent::EndCommand { return_code: 2i32 })\n            .await;\n        loop_handler.terminate_telemetry().await;\n        let payload = rx.recv().await.unwrap();\n        assert_eq!(payload.events.len(), 1);\n        assert!(matches!(\n            &payload.events[0].event,\n            &TelemetryEvent::EndCommand { .. }\n        ));\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-telemetry/src/sink.rs",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nuse std::time::Duration;\n\nuse async_trait::async_trait;\nuse reqwest::Client;\nuse reqwest::redirect::Policy;\nuse tokio::sync::mpsc::UnboundedSender;\n\nuse crate::payload::TelemetryPayload;\n\n/// Telemetry ingest API URL\nconst DEFAULT_TELEMETRY_INGEST_API_URL: &str = \"https://telemetry.quickwit.io/\";\n\nfn telemetry_ingest_api_url() -> String {\n    if let Some(ingest_api_url) = std::env::var_os(\"TELEMETRY_INGEST_API\") {\n        ingest_api_url.to_string_lossy().to_string()\n    } else {\n        DEFAULT_TELEMETRY_INGEST_API_URL.to_string()\n    }\n}\n\n#[async_trait]\npub trait Sink: Send + Sync + 'static {\n    async fn send_payload(&self, payload: TelemetryPayload);\n}\npub struct HttpClient {\n    client: Client,\n    endpoint: String,\n}\n\nimpl HttpClient {\n    pub fn try_new() -> Option<Self> {\n        let client = Client::builder()\n            .redirect(Policy::limited(3))\n            .timeout(Duration::from_secs(10))\n            .build()\n            .ok()?;\n        Some(HttpClient {\n            client,\n            endpoint: telemetry_ingest_api_url(),\n        })\n    }\n\n    pub fn endpoint(&self) -> &str {\n        &self.endpoint\n    }\n}\n\n#[async_trait]\nimpl Sink for UnboundedSender<TelemetryPayload> {\n    async fn send_payload(&self, payload: TelemetryPayload) {\n        let _ = self.send(payload);\n    
}\n}\n\n#[async_trait]\nimpl Sink for HttpClient {\n    async fn send_payload(&self, payload: TelemetryPayload) {\n        // Note that we swallow the error if any\n        let _ = self.client.post(&self.endpoint).json(&payload).send().await;\n    }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/.gitignore",
    "content": "# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.\n\n# dependencies\n/node_modules\n/.pnp\n.pnp.js\n\n# testing\n/coverage\n/cypress/videos\n/cypress/screenshots\n\n# misc\n.DS_Store\n.env.local\n.env.development.local\n.env.test.local\n.env.production.local\n\nnpm-debug.log*\nyarn-debug.log*\nyarn-error.log*\n"
  },
  {
    "path": "quickwit/quickwit-ui/.gitignore_for_build_directory",
    "content": "# Ignore all files in this directory\n*\n# except .gitignore\n!.gitignore\n"
  },
  {
    "path": "quickwit/quickwit-ui/Makefile",
    "content": ".PHONY: build install start\n\nbuild:\n\tyarn build\n\ninstall:\n\tyarn install --frozen-lockfile --network-timeout 300000\n\nstart:\n\tyarn start\n"
  },
  {
    "path": "quickwit/quickwit-ui/README.md",
    "content": "# quickwit-ui\n\n\n\n## Prerequisites\n\n`node` and `yarn` need to be installed on your system.\nThe project then relies on various Node.js tools that can be installed locally by\nrunning `yarn`.\n\n## Available Scripts\n\n\nIn the project directory, you can run:\n\n\n### `yarn start`\n\nRuns the app in development mode.\\\nOpen [http://localhost:3000](http://localhost:3000) to view it in the browser.\n\nThe page will reload if you make edits.\\\nYou will also see any lint errors in the console.\n\n### `yarn test`\n\nLaunches the test runner.\n\n### `yarn e2e-test`\n\nLaunches the e2e test runner with [Playwright](https://playwright.dev/). For the tests to work, you need to start a\nsearcher beforehand with `cargo r run --service searcher --config config/quickwit.yaml`.\n\n### `yarn format`\n\nRewrites files with the correct formatting if needed.\\\nYou might want to configure your IDE to do that [automatically](https://biomejs.dev/guides/editors/first-party-extensions/).\n\n### `yarn build`\n\nBuilds the app for production to the `build` folder.\\\nIt correctly bundles React in production mode and optimizes the build for the best performance.\n\nThe build is minified and the filenames include the hashes.\\\nYour app is ready to be deployed!\n"
  },
  {
    "path": "quickwit/quickwit-ui/biome.json",
    "content": "{\n  \"$schema\": \"./node_modules/@biomejs/biome/configuration_schema.json\",\n  \"formatter\": {\n    \"enabled\": true,\n    \"indentStyle\": \"space\",\n    \"includes\": [\"**\", \"!build/**\"]\n  },\n  \"linter\": {\n    \"enabled\": true,\n    \"rules\": {\n      \"recommended\": true,\n      \"style\": \"off\",\n      \"complexity\": \"off\",\n      \"correctness\": {\n        \"useExhaustiveDependencies\": \"off\"\n      },\n      \"suspicious\": {\n        \"noTsIgnore\": \"off\",\n        \"useIterableCallbackReturn\": \"off\",\n        \"noExplicitAny\": \"off\",\n        \"noArrayIndexKey\": \"off\"\n      }\n    },\n    \"includes\": [\"**\", \"!build/**\"]\n  }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/build/.gitignore",
    "content": "# Ignore all files in this directory\n*\n# except .gitignore\n!.gitignore\n"
  },
  {
    "path": "quickwit/quickwit-ui/e2e/homepage.spec.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { expect, test } from \"@playwright/test\";\n\ntest.describe(\"Home navigation\", () => {\n  test(\"Should display sidebar links\", async ({ page }) => {\n    await page.goto(\"/ui\");\n    await expect(page.locator(\"a\")).toContainText([\n      \"Query editor\",\n      \"Indexes\",\n      \"Cluster\",\n    ]);\n  });\n\n  test(\"Should navigate to cluster state\", async ({ page }) => {\n    await page.goto(\"/ui\");\n    await page.getByRole(\"link\", { name: \"Cluster\" }).click();\n    await expect(page.getByLabel(\"breadcrumb\")).toContainText(\"Cluster\");\n    await expect(page.getByText(\"cluster_id\")).toBeVisible();\n  });\n\n  test(\"Should display otel logs index page\", async ({ page }) => {\n    await page.goto(\"/ui/indexes/otel-logs-v0_7\");\n    await expect(\n      page.getByLabel(\"breadcrumb\").getByRole(\"link\", { name: \"Indexes\" }),\n    ).toBeVisible();\n  });\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/index.html",
    "content": "<!DOCTYPE html>\n<html lang=\"en\">\n  <head>\n    <meta charset=\"utf-8\" />\n    <link rel=\"icon\" href=\"favicon.ico\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\" />\n    <meta name=\"theme-color\" content=\"#000000\" />\n    <meta\n      name=\"description\"\n      content=\"Sub-second search & analytics engine on cloud storage\"\n    />\n    <link rel=\"apple-touch-icon\" href=\"logo192.png\" />\n    <!--\n      manifest.json provides metadata used when your web app is installed on a\n      user's mobile device or desktop. See https://developers.google.com/web/fundamentals/web-app-manifest/\n    -->\n    <link rel=\"manifest\" href=\"manifest.json\" />\n    <!--\n      TODO: remove and replace with Quickwit fonts.\n    -->\n    <link\n      rel=\"stylesheet\"\n      href=\"https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap\"\n    />\n    <title>Quickwit UI</title>\n  </head>\n  <body>\n    <noscript>You need to enable JavaScript to run this app.</noscript>\n    <div id=\"root\"></div>\n    <script type=\"module\" src=\"src/index.tsx\"></script>\n  </body>\n</html>\n"
  },
  {
    "path": "quickwit/quickwit-ui/jest/setup.js",
    "content": "global.TextEncoder = require(\"util\").TextEncoder;\n"
  },
  {
    "path": "quickwit/quickwit-ui/jest.config.js",
    "content": "module.exports = {\n  setupFiles: [\n    \"react-app-polyfill/jsdom\", // polyfill jsdom APIs (such as fetch)\n    \"<rootDir>/jest/setup.js\", // polyfill TextEncoder\n  ],\n\n  setupFilesAfterEnv: [\"@testing-library/jest-dom\"],\n\n  testEnvironment: \"jsdom\",\n\n  transform: {\n    // transform JS and TS files (TypeScript and ES6 imports)\n    \"^.+\\\\.(js|jsx|mjs|cjs|ts|tsx)$\": [\n      \"babel-jest\",\n      {\n        presets: [[\"babel-preset-react-app\", { runtime: \"automatic\" }]],\n        plugins: [\n          [\n            \"@dr.pogodin/babel-plugin-transform-assets\",\n            { extensions: [\"svg\", \"woff2\"] },\n          ],\n        ],\n        babelrc: false,\n        configFile: false,\n      },\n    ],\n  },\n\n  moduleNameMapper: {\n    \"@monaco-editor/react\": \"<rootDir>/mocks/monacoMock.js\",\n    \"swagger-ui-react\": \"<rootDir>/mocks/swaggerUIMock.js\",\n    \"@mui/x-charts\": \"<rootDir>/mocks/x-charts.js\",\n  },\n\n  testPathIgnorePatterns: [\"/node_modules/\", \"<rootDir>/e2e/\"],\n\n  resetMocks: true,\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/mocks/monacoMock.js",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// Mock the Monaco editor, as the current Jest setup does not work when Monaco JS files\n// are loaded.\nexport const Editor = (props) => {\n  return <div>{props.value}</div>;\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/mocks/swaggerUIMock.js",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\n// Mock SwaggerUI, as the current Jest setup does not work when the Swagger UI JS files\n// are loaded.\nexport default function SwaggerUI(props) {\n  return <div>{props.url}</div>;\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/mocks/x-charts.js",
    "content": "export const LineChart = ({ children }) => children;\n"
  },
  {
    "path": "quickwit/quickwit-ui/package.json",
    "content": "{\n  \"name\": \"quickwit-ui\",\n  \"version\": \"0.8.0\",\n  \"license\": \"Apache-2.0\",\n  \"private\": true,\n  \"packageManager\": \"yarn@1.22.22\",\n  \"dependencies\": {\n    \"@babel/core\": \"7.29.0\",\n    \"@babel/runtime\": \"7.28.6\",\n    \"@biomejs/biome\": \"2.4.4\",\n    \"@dr.pogodin/babel-plugin-transform-assets\": \"1.2.6\",\n    \"@emotion/react\": \"11.14.0\",\n    \"@emotion/styled\": \"11.14.1\",\n    \"@monaco-editor/react\": \"4.7.0\",\n    \"@mui/icons-material\": \"7.3.8\",\n    \"@mui/lab\": \"7.0.1-beta.22\",\n    \"@mui/material\": \"7.3.8\",\n    \"@mui/system\": \"7.3.8\",\n    \"@mui/x-charts\": \"8.27.0\",\n    \"@mui/x-date-pickers\": \"8.27.2\",\n    \"@testing-library/dom\": \"10.4.1\",\n    \"@testing-library/jest-dom\": \"6.9.1\",\n    \"@testing-library/react\": \"16.3.2\",\n    \"@testing-library/user-event\": \"14.6.1\",\n    \"@types/jest\": \"30.0.0\",\n    \"@types/node\": \"24.10.9\",\n    \"@types/react\": \"19.2.14\",\n    \"@types/react-dom\": \"19.2.3\",\n    \"@types/swagger-ui-react\": \"5.18.0\",\n    \"babel-jest\": \"30.2.0\",\n    \"babel-preset-react-app\": \"10.1.0\",\n    \"dayjs\": \"1.11.19\",\n    \"jest\": \"30.2.0\",\n    \"jest-environment-jsdom\": \"30.2.0\",\n    \"monaco-editor\": \"0.55.1\",\n    \"react\": \"19.2.4\",\n    \"react-app-polyfill\": \"3.0.0\",\n    \"react-dom\": \"19.2.4\",\n    \"react-number-format\": \"5.4.4\",\n    \"react-router\": \"7.13.1\",\n    \"styled-components\": \"6.1.19\",\n    \"styled-icons\": \"10.47.1\",\n    \"swagger-ui-react\": \"5.32.0\",\n    \"typescript\": \"5.9.3\",\n    \"vite\": \"7.3.1\"\n  },\n  \"resolutions\": {\n    \"@types/react\": \"19.2.14\",\n    \"@types/react-dom\": \"19.2.3\",\n    \"dompurify\": \"3.3.1\",\n    \"glob\": \"11.1.0\"\n  },\n  \"scripts\": {\n    \"start\": \"vite\",\n    \"build\": \"vite build --outDir build\",\n    \"test\": \"jest\",\n    \"postbuild\": \"cp .gitignore_for_build_directory 
build/.gitignore\",\n    \"lint\": \"biome check\",\n    \"format\": \"biome check --write\",\n    \"type\": \"tsc\",\n    \"e2e-test\": \"playwright test\"\n  },\n  \"devDependencies\": {\n    \"@playwright/test\": \"^1.58.2\"\n  }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/playwright.config.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { defineConfig } from \"@playwright/test\";\n\nexport default defineConfig({\n  testDir: \"./e2e\",\n  use: {\n    baseURL: \"http://127.0.0.1:7280/ui\",\n    browserName: \"chromium\",\n    video: \"off\",\n    screenshot: \"off\",\n  },\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/public/manifest.json",
    "content": "{\n  \"short_name\": \"Quickwit UI\",\n  \"name\": \"Quickwit UI: Search and manage your indexes.\",\n  \"icons\": [\n    {\n      \"src\": \"favicon.ico\",\n      \"sizes\": \"32x32 16x16\",\n      \"type\": \"image/x-icon\"\n    },\n    {\n      \"src\": \"android-chrome-192x192.png\",\n      \"sizes\": \"192x192\",\n      \"type\": \"image/png\"\n    },\n    {\n      \"src\": \"android-chrome-512x512.png\",\n      \"sizes\": \"512x512\",\n      \"type\": \"image/png\"\n    }\n  ],\n  \"start_url\": \".\",\n  \"display\": \"standalone\",\n  \"theme_color\": \"#000000\",\n  \"background_color\": \"#ffffff\"\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/public/robots.txt",
    "content": "# https://www.robotstxt.org/robotstxt.html\nUser-agent: *\nDisallow:\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/ApiUrlFooter.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport ContentCopyIcon from \"@mui/icons-material/ContentCopy\";\nimport { Box, Button, styled, Typography } from \"@mui/material\";\nimport { QUICKWIT_LIGHT_GREY } from \"../utils/theme\";\n\nconst Footer = styled(Box)`\n  display: flex;\n  height: 25px;\n  padding: 0px 5px;\n  position: absolute;\n  bottom: 0px;\n  font-size: 0.9em;\n  background-color: ${QUICKWIT_LIGHT_GREY};\n  opacity: 0.7;\n`;\n\nexport default function ApiUrlFooter(url: string) {\n  const urlMaxLength = 80;\n  const origin =\n    // @ts-ignore\n    process.env.NODE_ENV === \"development\"\n      ? 
\"http://localhost:7280\"\n      : window.location.origin;\n  const completeUrl = `${origin}/${url}`;\n  const isTooLong = completeUrl.length > urlMaxLength;\n  // TODO show generated aggregation\n  return (\n    <Footer>\n      <Typography sx={{ padding: \"4px 5px\", fontSize: \"0.95em\" }}>\n        API URL:\n      </Typography>\n      <Button\n        sx={{\n          fontSize: \"0.93em\",\n          textTransform: \"inherit\",\n          whiteSpace: \"nowrap\",\n          overflow: \"hidden\",\n          textOverflow: \"clip\",\n        }}\n        onClick={() => {\n          if (window.isSecureContext) {\n            navigator.clipboard.writeText(completeUrl);\n          } else {\n            window.open(completeUrl, \"_blank\");\n          }\n        }}\n        endIcon={<ContentCopyIcon />}\n        size=\"small\"\n      >\n        {completeUrl.substring(0, urlMaxLength)}\n        {isTooLong && \"...\"}\n      </Button>\n    </Footer>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/IndexSideBar.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport styled from \"@emotion/styled\";\nimport { ChevronRight, KeyboardArrowDown } from \"@mui/icons-material\";\nimport {\n  Autocomplete,\n  Box,\n  Chip,\n  CircularProgress,\n  IconButton,\n  List,\n  ListItem,\n  ListItemText,\n  TextField,\n  Typography,\n} from \"@mui/material\";\nimport Tooltip from \"@mui/material/Tooltip\";\nimport React, { useEffect, useMemo, useState } from \"react\";\nimport { Client } from \"../services/client\";\nimport { FieldMapping, getAllFields, IndexMetadata } from \"../utils/models\";\n\nconst IndexBarWrapper = styled(\"div\")({\n  display: \"flex\",\n  height: \"100%\",\n  flex: \"0 0 260px\",\n  maxWidth: \"260px\",\n  flexDirection: \"column\",\n  borderRight: \"1px solid rgba(0, 0, 0, 0.12)\",\n  overflow: \"auto\",\n});\n\nfunction IndexAutocomplete(props: IndexMetadataProps) {\n  const [open, setOpen] = React.useState(false);\n  const [options, setOptions] = React.useState<readonly IndexMetadata[]>([]);\n  const [value, setValue] = React.useState<IndexMetadata | null>(null);\n  const [loading, setLoading] = React.useState(false);\n  // We want to show the circular progress only if we are loading some results and\n  // when there is no option available.\n  const showLoading = loading && options.length === 0;\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  useEffect(() => {\n    if 
(loading) {\n      return;\n    }\n    setLoading(true);\n    quickwitClient.listIndexes().then(\n      (indexesMetadata) => {\n        setOptions([...indexesMetadata]);\n        setLoading(false);\n      },\n      (error) => {\n        console.log(\"Index autocomplete error\", error);\n        setLoading(false);\n      },\n    );\n  }, [quickwitClient, open]);\n\n  useEffect(() => {\n    if (!open) {\n      if (props.indexMetadata !== null && options.length === 0) {\n        setOptions([props.indexMetadata]);\n      }\n    }\n  }, [open, props.indexMetadata, options.length]);\n\n  useEffect(() => {\n    setValue(props.indexMetadata);\n  }, [props.indexMetadata]);\n\n  return (\n    <Autocomplete\n      size=\"small\"\n      sx={{ width: 210 }}\n      open={open}\n      value={value}\n      onChange={(_, updatedValue) => {\n        setValue(updatedValue);\n\n        if (\n          updatedValue == null ||\n          updatedValue.index_config.index_id == null\n        ) {\n          props.onIndexMetadataUpdate(null);\n        } else {\n          props.onIndexMetadataUpdate(updatedValue);\n        }\n      }}\n      onOpen={() => {\n        setOpen(true);\n      }}\n      onClose={() => {\n        setOpen(false);\n        setLoading(false);\n      }}\n      isOptionEqualToValue={(option, value) =>\n        option.index_config.index_id === value.index_config.index_id\n      }\n      getOptionLabel={(option) => option.index_config.index_id}\n      options={options}\n      noOptionsText=\"No indexes.\"\n      loading={loading}\n      renderInput={(params) => (\n        <TextField\n          {...params}\n          placeholder=\"Select an index\"\n          InputProps={{\n            ...params.InputProps,\n            endAdornment: (\n              <React.Fragment>\n                {showLoading ? 
(\n                  <CircularProgress color=\"inherit\" size={20} />\n                ) : null}\n                {params.InputProps.endAdornment}\n              </React.Fragment>\n            ),\n          }}\n        />\n      )}\n    />\n  );\n}\n\nexport interface IndexMetadataProps {\n  indexMetadata: null | IndexMetadata;\n  onIndexMetadataUpdate(indexMetadata: IndexMetadata | null): void;\n}\n\nfunction fieldTypeLabel(fieldMapping: FieldMapping): string {\n  if (fieldMapping.type[0] !== undefined) {\n    return fieldMapping.type[0].toUpperCase();\n  } else {\n    return \"\";\n  }\n}\n\nexport function IndexSideBar(props: IndexMetadataProps) {\n  const [open, setOpen] = useState(true);\n  const fields =\n    props.indexMetadata === null\n      ? []\n      : getAllFields(\n          props.indexMetadata.index_config.doc_mapping.field_mappings,\n        );\n  return (\n    <IndexBarWrapper>\n      <Box sx={{ px: 3, py: 2 }}>\n        <Typography variant=\"body1\" mb={1}>\n          Index ID\n        </Typography>\n        <IndexAutocomplete {...props} />\n      </Box>\n      <Box sx={{ paddingLeft: \"10px\", height: \"100%\" }}>\n        <IconButton\n          aria-label=\"expand row\"\n          size=\"small\"\n          onClick={() => setOpen(!open)}\n        >\n          {open ? 
<KeyboardArrowDown /> : <ChevronRight />}\n        </IconButton>\n        Fields\n        {open && (\n          <List\n            dense={true}\n            sx={{ paddingTop: \"0\", overflowWrap: \"break-word\" }}\n          >\n            {fields.map(function (field) {\n              return (\n                <ListItem\n                  key={field.json_path}\n                  secondaryAction={\n                    <IconButton edge=\"end\" aria-label=\"add\"></IconButton>\n                  }\n                  sx={{ paddingLeft: \"10px\" }}\n                >\n                  <Tooltip\n                    title={field.field_mapping.type}\n                    arrow\n                    placement=\"left\"\n                  >\n                    <Chip\n                      label={fieldTypeLabel(field.field_mapping)}\n                      size=\"small\"\n                      sx={{\n                        marginRight: \"10px\",\n                        borderRadius: \"3px\",\n                        fontSize: \"0.6rem\",\n                      }}\n                    />\n                  </Tooltip>\n                  <ListItemText primary={field.json_path} />\n                </ListItem>\n              );\n            })}\n          </List>\n        )}\n      </Box>\n    </IndexBarWrapper>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/IndexSummary.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport styled from \"@emotion/styled\";\nimport { Alert, Paper } from \"@mui/material\";\nimport dayjs from \"dayjs\";\nimport utc from \"dayjs/plugin/utc\";\nimport { FC, ReactNode } from \"react\";\nimport { NumericFormat } from \"react-number-format\";\nimport { Index } from \"../utils/models\";\n\ndayjs.extend(utc);\n\nconst ItemContainer = styled.div`\n  padding: 10px;\n  display: flex;\n  flex-direction: column;\n`;\nconst Row = styled.div`\n  padding: 5px;\n  display: flex;\n  flex-direction: row;\n  &:nth-of-type(odd) {\n    background: rgba(0, 0, 0, 0.05);\n  }\n`;\nconst RowKey = styled.div`\n  width: 350px;\n`;\nconst IndexRow: FC<{ title: string; children: ReactNode }> = ({\n  title,\n  children,\n}) => (\n  <Row>\n    <RowKey>{title}</RowKey>\n    <div>{children}</div>\n  </Row>\n);\n\nexport function IndexSummary({ index }: { index: Index }) {\n  const all_splits = index.splits;\n  const published_splits = all_splits.filter(\n    (split) => split.split_state === \"Published\",\n  );\n  const num_of_staged_splits = all_splits.filter(\n    (split) => split.split_state === \"Staged\",\n  ).length;\n  const num_of_marked_for_delete_splits = all_splits.filter(\n    (split) => split.split_state === \"MarkedForDeletion\",\n  ).length;\n  const total_num_docs = published_splits\n    .map((split) => split.num_docs)\n    .reduce((sum, 
current) => sum + current, 0);\n  const total_num_bytes = published_splits\n    .map((split) => {\n      return split.footer_offsets.end;\n    })\n    .reduce((sum, current) => sum + current, 0);\n  const total_uncompressed_num_bytes = published_splits\n    .map((split) => {\n      return split.uncompressed_docs_size_in_bytes;\n    })\n    .reduce((sum, current) => sum + current, 0);\n  return (\n    <Paper variant=\"outlined\">\n      <ItemContainer>\n        {index.split_limit_reached && (\n          <Alert severity=\"warning\" sx={{ mb: 2 }}>\n            Split limit reached. Only the first 10,000 splits were retrieved.\n            The actual total may be higher. Statistics shown are incomplete.\n          </Alert>\n        )}\n        <IndexRow title=\"Created at:\">\n          {dayjs\n            .unix(index.metadata.create_timestamp)\n            .utc()\n            .format(\"YYYY/MM/DD HH:mm\")}\n        </IndexRow>\n        <IndexRow title=\"URI:\">\n          {index.metadata.index_config.index_uri}\n        </IndexRow>\n        <IndexRow title=\"Number of published documents:\">\n          <NumericFormat\n            value={total_num_docs}\n            displayType={\"text\"}\n            thousandSeparator={true}\n          />\n        </IndexRow>\n        <IndexRow title=\"Size of published documents (uncompressed):\">\n          <NumericFormat\n            value={total_uncompressed_num_bytes / 1000000}\n            displayType={\"text\"}\n            thousandSeparator={true}\n            suffix=\" MB\"\n            decimalScale={2}\n          />\n        </IndexRow>\n        <IndexRow title=\"Number of published splits:\">\n          {published_splits.length}\n        </IndexRow>\n        <IndexRow title=\"Size of published splits:\">\n          <NumericFormat\n            value={total_num_bytes / 1000000}\n            displayType={\"text\"}\n            thousandSeparator={true}\n            suffix=\" MB\"\n            decimalScale={2}\n          />\n    
    </IndexRow>\n        <IndexRow title=\"Number of staged splits:\">\n          {num_of_staged_splits}\n        </IndexRow>\n        <IndexRow title=\"Number of splits marked for deletion:\">\n          {num_of_marked_for_delete_splits}\n        </IndexRow>\n      </ItemContainer>\n    </Paper>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/IndexesTable.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport {\n  Paper,\n  Table,\n  TableBody,\n  TableCell,\n  TableContainer,\n  TableHead,\n  TableRow,\n} from \"@mui/material\";\nimport dayjs from \"dayjs\";\nimport utc from \"dayjs/plugin/utc\";\nimport { useNavigate } from \"react-router\";\nimport { IndexMetadata } from \"../utils/models\";\n\ndayjs.extend(utc);\n\nconst IndexesTable = ({\n  indexesMetadata,\n}: Readonly<{ indexesMetadata: IndexMetadata[] }>) => {\n  const navigate = useNavigate();\n  const handleClick = (indexId: string) => {\n    navigate(`/indexes/${indexId}`);\n  };\n\n  return (\n    <TableContainer component={Paper}>\n      <Table sx={{ minWidth: 650 }} aria-label=\"Indexes\">\n        <TableHead>\n          <TableRow>\n            <TableCell align=\"left\">ID</TableCell>\n            <TableCell align=\"left\">URI</TableCell>\n            <TableCell align=\"left\">Created on</TableCell>\n            <TableCell align=\"left\">Sources</TableCell>\n          </TableRow>\n        </TableHead>\n        <TableBody>\n          {indexesMetadata.map((indexMetadata) => (\n            <TableRow\n              key={indexMetadata.index_config.index_id}\n              sx={{\n                \"&:last-child td, &:last-child th\": { border: 0 },\n                cursor: \"pointer\",\n              }}\n              hover={true}\n              onClick={() => 
handleClick(indexMetadata.index_config.index_id)}\n            >\n              <TableCell component=\"th\" scope=\"row\">\n                {indexMetadata.index_config.index_id}\n              </TableCell>\n              <TableCell align=\"left\">\n                {indexMetadata.index_config.index_uri}\n              </TableCell>\n              <TableCell align=\"left\">\n                {dayjs\n                  .unix(indexMetadata.create_timestamp)\n                  .utc()\n                  .format(\"YYYY/MM/DD HH:mm\")}\n              </TableCell>\n              <TableCell align=\"left\">\n                {indexMetadata.sources?.length || \"None\"}\n              </TableCell>\n            </TableRow>\n          ))}\n        </TableBody>\n      </Table>\n    </TableContainer>\n  );\n};\n\nexport default IndexesTable;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/JsonEditor.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { BeforeMount, Editor, OnMount } from \"@monaco-editor/react\";\nimport { useCallback } from \"react\";\nimport { EDITOR_THEME } from \"../utils/theme\";\n\nexport function JsonEditor({\n  content,\n  resizeOnMount,\n}: {\n  content: unknown;\n  resizeOnMount: boolean;\n}) {\n  // Setting editor height based on lines height and count to stretch and fit its content.\n  const onMount: OnMount = useCallback(\n    (editor) => {\n      if (!resizeOnMount) {\n        return;\n      }\n      const editorElement = editor.getDomNode();\n\n      if (!editorElement) {\n        return;\n      }\n\n      // Weirdly enough, we have to wait a few ms to get the right height\n      // from `editor.getContentHeight()`. If not, we sometimes end up with\n      // a height > 7000px... 
and I don't know why.\n      setTimeout(() => {\n        const height = Math.min(800, editor.getContentHeight());\n        editorElement.style.height = `${height}px`;\n        editor.layout();\n      }, 10);\n    },\n    [resizeOnMount],\n  );\n\n  const beforeMount: BeforeMount = (monaco) => {\n    monaco.editor.defineTheme(\"quickwit-light\", EDITOR_THEME);\n  };\n\n  return (\n    <Editor\n      language=\"json\"\n      value={JSON.stringify(content, null, 2)}\n      beforeMount={beforeMount}\n      onMount={onMount}\n      options={{\n        readOnly: true,\n        fontFamily: \"monospace\",\n        overviewRulerBorder: false,\n        overviewRulerLanes: 0,\n        minimap: {\n          enabled: false,\n        },\n        scrollbar: {\n          alwaysConsumeMouseWheel: false,\n        },\n        renderLineHighlight: \"gutter\",\n        fontSize: 12,\n        fixedOverflowWidgets: true,\n        scrollBeyondLastLine: false,\n        automaticLayout: true,\n        wordWrap: \"on\",\n        wrappingIndent: \"deepIndent\",\n      }}\n      theme=\"quickwit-light\"\n    />\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/LayoutUtils.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Box, Breadcrumbs, styled } from \"@mui/material\";\n\nexport const APP_BAR_HEIGHT_PX = \"48px\";\nexport const ViewUnderAppBarBox = styled(Box)`\ndisplay: flex;\nflex-direction: column;\nmargin-top: ${APP_BAR_HEIGHT_PX};\nheight: calc(100% - ${APP_BAR_HEIGHT_PX});\nwidth: 100%;\n`;\nexport const FullBoxContainer = styled(Box)`\ndisplay: flex;\nflex-direction: column;\nheight: 100%;\nwidth: 100%;\npadding: 16px 24px;\n`;\nexport const QBreadcrumbs = styled(Breadcrumbs)`\npadding-bottom: 8px;\n`;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/Loader.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Box, keyframes, styled } from \"@mui/material\";\nimport loadingIconUrl from \"../assets/img/quickwit-logo-monochrome.svg\";\n\nconst spin = keyframes`\nfrom {\n  transform: rotate(0deg);\n}\nto {\n  transform: rotate(360deg);\n}\n`;\n\nconst LoadingIcon = (props: React.ComponentProps<\"img\">) => (\n  <img {...props} src={loadingIconUrl} alt=\"loading icon\" />\n);\n\nconst SpinningLoadingIcon = styled(LoadingIcon)`\n  height: 10vmin;\n  pointer-events: none;\n  fill: #cbd1dd;\n  animation: ${spin} infinite 5s linear;\n`;\n\nexport default function Loader() {\n  return (\n    <Box\n      display=\"flex\"\n      justifyContent=\"center\"\n      alignItems=\"center\"\n      minHeight=\"40vh\"\n    >\n      <SpinningLoadingIcon></SpinningLoadingIcon>\n    </Box>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/QueryActionBar.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport PlayArrowIcon from \"@mui/icons-material/PlayArrow\";\nimport { Box, Button, Tab, Tabs } from \"@mui/material\";\nimport { SearchComponentProps } from \"../utils/SearchComponentProps\";\nimport { TimeRangeSelect } from \"./TimeRangeSelect\";\n\nexport function QueryEditorActionBar(props: SearchComponentProps) {\n  const timestamp_field_name =\n    props.index?.metadata.index_config.doc_mapping.timestamp_field;\n  const shouldDisplayTimeRangeSelect = timestamp_field_name ?? 
false;\n\n  const handleChange = (_event: React.SyntheticEvent, newTab: number) => {\n    const updatedSearchRequest = {\n      ...props.searchRequest,\n      aggregation: newTab !== 0,\n    };\n    props.onSearchRequestUpdate(updatedSearchRequest);\n    props.runSearch(updatedSearchRequest);\n  };\n\n  return (\n    <Box sx={{ display: \"flex\" }}>\n      <Box sx={{ flexGrow: 0, padding: \"10px\" }}>\n        <Button\n          onClick={() => props.runSearch(props.searchRequest)}\n          variant=\"contained\"\n          startIcon={<PlayArrowIcon />}\n          disableElevation\n          sx={{ flexGrow: 1 }}\n          disabled={props.queryRunning || !props.searchRequest.indexId}\n        >\n          Run\n        </Button>\n      </Box>\n      <Box sx={{ flexGrow: 0 }}>\n        <Box sx={{ borderBottom: 1, borderColor: \"divider\", flexGrow: 1 }}>\n          <Tabs\n            value={Number(props.searchRequest.aggregation)}\n            onChange={handleChange}\n          >\n            <Tab label=\"Search\" />\n            <Tab label=\"Aggregation\" />\n          </Tabs>\n        </Box>\n      </Box>\n      <Box sx={{ flexGrow: 1 }}></Box>\n      {shouldDisplayTimeRangeSelect && (\n        <TimeRangeSelect\n          timeRange={{\n            startTimestamp: props.searchRequest.startTimestamp,\n            endTimestamp: props.searchRequest.endTimestamp,\n          }}\n          onUpdate={(timeRange) => {\n            props.runSearch({ ...props.searchRequest, ...timeRange });\n          }}\n          disabled={props.queryRunning || !props.searchRequest.indexId}\n        />\n      )}\n    </Box>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/QueryEditor/AggregationEditor.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Box } from \"@mui/material\";\nimport FormControl from \"@mui/material/FormControl\";\nimport MenuItem from \"@mui/material/MenuItem\";\nimport Select, { SelectChangeEvent } from \"@mui/material/Select\";\nimport TextField from \"@mui/material/TextField\";\nimport { useEffect, useRef, useState } from \"react\";\nimport { HistogramAgg, TermAgg } from \"../../utils/models\";\nimport { SearchComponentProps } from \"../../utils/SearchComponentProps\";\n\nexport function AggregationEditor(props: SearchComponentProps) {\n  return (\n    <Box hidden={!props.searchRequest.aggregation}>\n      <MetricKind\n        searchRequest={props.searchRequest}\n        onSearchRequestUpdate={props.onSearchRequestUpdate}\n        runSearch={props.runSearch}\n        index={props.index}\n        queryRunning={props.queryRunning}\n      />\n      <AggregationKind\n        searchRequest={props.searchRequest}\n        onSearchRequestUpdate={props.onSearchRequestUpdate}\n        runSearch={props.runSearch}\n        index={props.index}\n        queryRunning={props.queryRunning}\n      />\n    </Box>\n  );\n}\n\nexport function MetricKind(props: SearchComponentProps) {\n  // TODO add percentiles\n  const metricRef = useRef(props.searchRequest.aggregationConfig.metric);\n\n  const handleTypeChange = (event: SelectChangeEvent) => {\n    const value = 
event.target.value;\n    const updatedMetric =\n      value !== \"count\" ? { ...metricRef.current!, type: value } : null;\n    const updatedAggregation = {\n      ...props.searchRequest.aggregationConfig,\n      metric: updatedMetric,\n    };\n    const updatedSearchRequest = {\n      ...props.searchRequest,\n      aggregationConfig: updatedAggregation,\n    };\n    props.onSearchRequestUpdate(updatedSearchRequest);\n    metricRef.current = updatedMetric;\n  };\n\n  const handleNameChange = (event: React.ChangeEvent<HTMLInputElement>) => {\n    const value = event.target.value;\n    if (metricRef.current == null) {\n      return;\n    }\n    const updatedMetric = { ...metricRef.current!, field: value };\n    const updatedAggregation = {\n      ...props.searchRequest.aggregationConfig,\n      metric: updatedMetric,\n    };\n    const updatedSearchRequest = {\n      ...props.searchRequest,\n      aggregationConfig: updatedAggregation,\n    };\n    props.onSearchRequestUpdate(updatedSearchRequest);\n    metricRef.current = updatedMetric;\n  };\n\n  return (\n    <Box sx={{ m: 1, minWidth: 120, display: \"flex\", flexDirection: \"row\" }}>\n      <FormControl variant=\"standard\">\n        <Select\n          value={metricRef.current ? 
metricRef.current.type : \"count\"}\n          onChange={handleTypeChange}\n          sx={{ minHeight: \"44px\" }}\n        >\n          <MenuItem value=\"count\">Count</MenuItem>\n          <MenuItem value=\"avg\">Average</MenuItem>\n          <MenuItem value=\"sum\">Sum</MenuItem>\n          <MenuItem value=\"max\">Max</MenuItem>\n          <MenuItem value=\"min\">Min</MenuItem>\n        </Select>\n      </FormControl>\n      <FormControl variant=\"standard\">\n        <TextField\n          variant=\"standard\"\n          label=\"Field\"\n          onChange={handleNameChange}\n          sx={{\n            marginLeft: \"10px\",\n            ...(!metricRef.current && { display: \"none\" }),\n          }}\n        />\n      </FormControl>\n    </Box>\n  );\n}\n\nexport function AggregationKind(props: SearchComponentProps) {\n  const defaultAgg = {\n    histogram: {\n      interval: \"1d\",\n    },\n  };\n  const [aggregations, setAggregations] = useState<\n    ({ term: TermAgg } | { histogram: HistogramAgg })[]\n  >([defaultAgg]);\n\n  useEffect(() => {\n    // do the initial filling of parameters\n    const aggregationConfig = props.searchRequest.aggregationConfig;\n    if (\n      aggregationConfig.histogram === null &&\n      aggregationConfig.term === null\n    ) {\n      const initialAggregation = Object.assign({}, ...aggregations);\n      const initialSearchRequest = {\n        ...props.searchRequest,\n        aggregationConfig: initialAggregation,\n      };\n      props.onSearchRequestUpdate(initialSearchRequest);\n    }\n  }, []); // Empty dependency array means this runs once after mount\n\n  useEffect(() => {\n    // Update search request whenever aggregations change\n    const metric = props.searchRequest.aggregationConfig.metric;\n    const updatedAggregation = Object.assign(\n      {},\n      { metric: metric },\n      ...aggregations,\n    );\n    const updatedSearchRequest = {\n      ...props.searchRequest,\n      aggregationConfig: 
updatedAggregation,\n    };\n    props.onSearchRequestUpdate(updatedSearchRequest);\n  }, [aggregations]);\n\n  const handleAggregationChange = (pos: number, event: SelectChangeEvent) => {\n    const value = event.target.value;\n    setAggregations((agg) => {\n      const newAggregations = [...agg];\n      switch (value) {\n        case \"histogram\": {\n          newAggregations[pos] = {\n            histogram: {\n              interval: \"1d\",\n            },\n          };\n          break;\n        }\n        case \"term\": {\n          newAggregations[pos] = {\n            term: {\n              field: \"\",\n              size: 10,\n            },\n          };\n          break;\n        }\n        case \"rm\": {\n          newAggregations.splice(pos, 1);\n        }\n      }\n      return newAggregations;\n    });\n  };\n\n  const handleHistogramChange = (pos: number, event: SelectChangeEvent) => {\n    const value = event.target.value;\n    setAggregations((agg) => {\n      const newAggregations = [...agg];\n      newAggregations[pos] = { histogram: { interval: value } };\n      return newAggregations;\n    });\n  };\n\n  const handleTermFieldChange = (\n    pos: number,\n    event: React.ChangeEvent<HTMLInputElement | HTMLTextAreaElement>,\n  ) => {\n    const value = event.target.value;\n    setAggregations((agg) => {\n      const newAggregations = [...agg];\n      const term = newAggregations[pos];\n      if (isTerm(term)) {\n        term.term.field = value;\n      }\n      return newAggregations;\n    });\n  };\n\n  const handleTermCountChange = (\n    pos: number,\n    event: React.ChangeEvent<HTMLInputElement | HTMLTextAreaElement>,\n  ) => {\n    const value = event.target.value;\n    setAggregations((agg) => {\n      const newAggregations = [...agg];\n      const term = newAggregations[pos];\n      if (isTerm(term)) {\n        term.term.size = Number(value);\n      }\n      return newAggregations;\n    });\n  };\n\n  function isHistogram(\n    agg: { 
term: TermAgg } | { histogram: HistogramAgg } | undefined,\n  ): agg is { histogram: HistogramAgg } {\n    if (!agg) return false;\n    return \"histogram\" in agg;\n  }\n\n  function isTerm(\n    agg: { term: TermAgg } | { histogram: HistogramAgg } | undefined,\n  ): agg is { term: TermAgg } {\n    if (!agg) return false;\n    return \"term\" in agg;\n  }\n\n  const getAggregationKind = (\n    agg: { term: TermAgg } | { histogram: HistogramAgg } | undefined,\n  ) => {\n    if (isHistogram(agg)) {\n      return \"histogram\";\n    }\n    if (isTerm(agg)) {\n      return \"term\";\n    }\n    return \"new\";\n  };\n\n  const makeOptions = (\n    pos: number,\n    agg: ({ term: TermAgg } | { histogram: HistogramAgg })[],\n  ) => {\n    const options = [];\n    if (pos >= agg.length) {\n      options.push(\n        <MenuItem value=\"new\" key=\"new\">\n          Add aggregation\n        </MenuItem>,\n      );\n    }\n    let addHistogram = true;\n    let addTerm = true;\n    for (let i = 0; i < agg.length; i++) {\n      if (i === pos) continue;\n      if (getAggregationKind(agg[i]) === \"histogram\") addHistogram = false;\n      if (getAggregationKind(agg[i]) === \"term\") addTerm = false;\n    }\n    if (addHistogram) {\n      options.push(\n        <MenuItem value=\"histogram\" key=\"histogram\">\n          Histogram aggregation\n        </MenuItem>,\n      );\n    }\n    if (addTerm) {\n      options.push(\n        <MenuItem value=\"term\" key=\"term\">\n          Term aggregation\n        </MenuItem>,\n      );\n    }\n    if (agg.length > 1) {\n      options.push(\n        <MenuItem value=\"rm\" key=\"rm\">\n          Remove aggregation\n        </MenuItem>,\n      );\n    }\n    return options;\n  };\n\n  const drawAdditional = (\n    pos: number,\n    aggs: ({ term: TermAgg } | { histogram: HistogramAgg })[],\n  ) => {\n    const agg = aggs[pos];\n    if (isHistogram(agg)) {\n      return (\n        <FormControl variant=\"standard\">\n          <Select\n        
    value={agg.histogram.interval}\n            onChange={(e) => handleHistogramChange(pos, e)}\n            sx={{ marginLeft: \"10px\", minHeight: \"44px\" }}\n          >\n            <MenuItem value=\"10s\">10 seconds</MenuItem>\n            <MenuItem value=\"1m\">1 minute</MenuItem>\n            <MenuItem value=\"5m\">5 minutes</MenuItem>\n            <MenuItem value=\"10m\">10 minutes</MenuItem>\n            <MenuItem value=\"1h\">1 hour</MenuItem>\n            <MenuItem value=\"1d\">1 day</MenuItem>\n          </Select>\n        </FormControl>\n      );\n    }\n    if (isTerm(agg)) {\n      return (\n        <>\n          <FormControl variant=\"standard\">\n            <TextField\n              variant=\"standard\"\n              label=\"Field\"\n              onChange={(e) => handleTermFieldChange(pos, e)}\n              sx={{ marginLeft: \"10px\" }}\n            />\n          </FormControl>\n          <FormControl variant=\"standard\">\n            <TextField\n              variant=\"standard\"\n              label=\"Return top\"\n              type=\"number\"\n              onChange={(e) => handleTermCountChange(pos, e)}\n              value={agg.term.size}\n              sx={{ marginLeft: \"10px\" }}\n            />\n          </FormControl>\n        </>\n      );\n    }\n    return null;\n  };\n\n  return (\n    <>\n      <Box sx={{ m: 1, minWidth: 120, display: \"flex\", flexDirection: \"row\" }}>\n        <FormControl variant=\"standard\">\n          <Select\n            value={getAggregationKind(aggregations[0])}\n            onChange={(e) => handleAggregationChange(0, e)}\n            sx={{ minHeight: \"44px\", width: \"190px\" }}\n          >\n            {makeOptions(0, aggregations)}\n          </Select>\n        </FormControl>\n        {drawAdditional(0, aggregations)}\n      </Box>\n      <Box sx={{ m: 1, minWidth: 120, display: \"flex\", flexDirection: \"row\" }}>\n        <FormControl\n          variant=\"standard\"\n          sx={{ m: 1, 
minWidth: 120, display: \"flex\", flexDirection: \"row\" }}\n        >\n          <Select\n            value={getAggregationKind(aggregations[1])}\n            onChange={(e) => handleAggregationChange(1, e)}\n            sx={{ minHeight: \"44px\", width: \"190px\" }}\n          >\n            {makeOptions(1, aggregations)}\n          </Select>\n          {drawAdditional(1, aggregations)}\n        </FormControl>\n      </Box>\n    </>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/QueryEditor/QueryEditor.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Editor } from \"@monaco-editor/react\";\nimport { Box } from \"@mui/material\";\nimport * as monacoEditor from \"monaco-editor/esm/vs/editor/editor.api\";\nimport React, { useEffect, useRef, useState } from \"react\";\nimport { SearchComponentProps } from \"../../utils/SearchComponentProps\";\nimport { EDITOR_THEME } from \"../../utils/theme\";\nimport {\n  createIndexCompletionProvider,\n  LANGUAGE_CONFIG,\n  LanguageFeatures,\n} from \"./config\";\n\nconst QUICKWIT_EDITOR_THEME_ID = \"quickwit-light\";\n\nfunction getLanguageId(indexId: string | null): string {\n  if (indexId === null) {\n    return \"\";\n  }\n  return `${indexId}-query-language`;\n}\n\nexport function QueryEditor(props: SearchComponentProps) {\n  const monacoRef = useRef<null | typeof monacoEditor>(null);\n  const [languageId, setLanguageId] = useState<string>(\"\");\n  const runSearchRef = useRef(props.runSearch);\n  const searchRequestRef = useRef(props.searchRequest);\n  const defaultValue =\n    props.searchRequest.query === null\n      ? `// Select an index and type your query. 
Example: field_name:\"phrase query\"`\n      : props.searchRequest.query;\n  // Keep the resize handler in a ref so the unmount cleanup removes the same\n  // function that was registered when the editor mounted.\n  const resizeRef = useRef<(() => void) | null>(null);\n\n  function handleEditorDidMount(editor: any, monaco: any) {\n    monacoRef.current = monaco;\n    editor.addAction({\n      id: \"SEARCH\",\n      label: \"Run search\",\n      keybindings: [\n        monaco.KeyCode.F9,\n        monaco.KeyMod.CtrlCmd | monaco.KeyCode.Enter,\n      ],\n      run: () => {\n        runSearchRef.current(searchRequestRef.current);\n      },\n    });\n    const resize = () => {\n      editor.layout({\n        width: Math.max(window.innerWidth - (260 + 180 + 2 * 24), 200),\n        height: 84,\n      });\n    };\n    resizeRef.current = resize;\n    window.addEventListener(\"resize\", resize);\n  }\n\n  React.useEffect(() => {\n    return () => {\n      if (resizeRef.current !== null) {\n        window.removeEventListener(\"resize\", resizeRef.current);\n      }\n    };\n  }, []);\n\n  useEffect(() => {\n    const updatedLanguageId = getLanguageId(props.searchRequest.indexId);\n    if (\n      monacoRef.current !== null &&\n      updatedLanguageId !== \"\" &&\n      props.index !== null\n    ) {\n      const monaco = monacoRef.current;\n      if (\n        !monaco.languages\n          .getLanguages()\n          .some(({ id }: { id: string }) => id === updatedLanguageId)\n      ) {\n        console.log(\"register language\", updatedLanguageId);\n        monaco.languages.register({ id: updatedLanguageId });\n        monaco.languages.setMonarchTokensProvider(\n          updatedLanguageId,\n          LanguageFeatures(),\n        );\n        if (props.index != null) {\n          monaco.languages.registerCompletionItemProvider(\n            updatedLanguageId,\n            createIndexCompletionProvider(props.index.metadata),\n          );\n          monaco.languages.setLanguageConfiguration(\n            updatedLanguageId,\n            LANGUAGE_CONFIG,\n          );\n        }\n      }\n      setLanguageId(updatedLanguageId);\n    }\n  }, [monacoRef, props.index]);\n\n  useEffect(() => {\n    if (monacoRef.current !== null) {\n      runSearchRef.current = 
props.runSearch;\n    }\n  }, [monacoRef, props.runSearch]);\n\n  function handleEditorChange(value: any) {\n    const updatedSearchRequest = Object.assign({}, props.searchRequest, {\n      query: value,\n    });\n    searchRequestRef.current = updatedSearchRequest;\n    props.onSearchRequestUpdate(updatedSearchRequest);\n  }\n\n  function handleEditorWillMount(monaco: any) {\n    monaco.editor.defineTheme(QUICKWIT_EDITOR_THEME_ID, EDITOR_THEME);\n  }\n\n  return (\n    <Box sx={{ height: \"100px\", py: 1 }}>\n      <Editor\n        beforeMount={handleEditorWillMount}\n        onMount={handleEditorDidMount}\n        onChange={handleEditorChange}\n        language={languageId}\n        value={defaultValue}\n        options={{\n          fontFamily: \"monospace\",\n          minimap: {\n            enabled: false,\n          },\n          renderLineHighlight: \"gutter\",\n          fontSize: 14,\n          fixedOverflowWidgets: true,\n          scrollBeyondLastLine: false,\n        }}\n        theme={QUICKWIT_EDITOR_THEME_ID}\n      />\n    </Box>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/QueryEditor/config.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { getAllFields, IndexMetadata } from \"../../utils/models\";\n\nexport enum CompletionItemKind {\n  Field = 3,\n  Operator = 11,\n}\n\nconst BRACES: [string, string] = [\"{\", \"}\"];\nconst BRACKETS: [string, string] = [\"[\", \"]\"];\nconst PARENTHESES: [string, string] = [\"(\", \")\"];\n\nexport const LANGUAGE_CONFIG = {\n  comments: {\n    lineComment: \"//\",\n  },\n  brackets: [BRACES, BRACKETS, PARENTHESES],\n  autoClosingPairs: [\n    { open: \"{\", close: \"}\" },\n    { open: \"[\", close: \"]\" },\n    { open: \"(\", close: \")\" },\n    { open: '\"', close: '\"' },\n    { open: \"'\", close: \"'\" },\n  ],\n  surroundingPairs: [\n    { open: \"{\", close: \"}\" },\n    { open: \"[\", close: \"]\" },\n    { open: \"(\", close: \")\" },\n    { open: '\"', close: '\"' },\n    { open: \"'\", close: \"'\" },\n  ],\n};\n\n// TODO: clean language features as I (fmassot) did not dig into it yet.\nexport function LanguageFeatures(): any {\n  return {\n    defaultToken: \"invalid\",\n    //wordDefinition: /(-?\\d*\\.\\d\\w*)|([^\\`\\~\\!\\#\\%\\^\\&\\*\\(\\)\\-\\=\\+\\[\\{\\]\\}\\\\\\|\\;\\:\\'\\\"\\,\\.\\<\\>\\/\\?\\s]+)/g,\n    operators: [\"+\", \"-\"],\n    brackets: [{ open: \"(\", close: \")\", token: \"delimiter.parenthesis\" }],\n    keywords: [\"AND\", \"OR\"],\n    symbols: /[=><!~?:&|+\\-*/^%]+/,\n    escapes:\n      
/\\\\(?:[abfnrtv\\\\\"']|x[0-9A-Fa-f]{1,4}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})/,\n    tokenizer: {\n      root: [\n        // identifiers and keywords\n        [\n          /[a-z_$][\\w$]*/,\n          {\n            cases: {\n              \"@keywords\": \"keyword\",\n              \"@default\": \"identifier\",\n            },\n          },\n        ],\n        [/[A-Z][\\w$]*/, \"type.identifier\"], // to show class names nicely\n\n        // whitespace\n        { include: \"@whitespace\" },\n\n        // delimiters and operators\n        [/[{}()[]]/, \"@brackets\"],\n        [/[<>](?!@symbols)/, \"@brackets\"],\n        [/@symbols/, { cases: { \"@operators\": \"operator\", \"@default\": \"\" } }],\n\n        // @ annotations.\n        // As an example, we emit a debugging log message on these tokens.\n        // Note: message are suppressed during the first load -- change some lines to see them.\n        [\n          /@\\s*[a-zA-Z_$][\\w$]*/,\n          { token: \"annotation\", log: \"annotation token: $0\" },\n        ],\n\n        // numbers\n        [/\\d*\\.\\d+([eE][-+]?\\d+)?/, \"number.float\"],\n        [/0[xX][0-9a-fA-F]+/, \"number.hex\"],\n        [/\\d+/, \"number\"],\n\n        // delimiter: after number because of .\\d floats\n        [/[;,.]/, \"delimiter\"],\n\n        // strings\n        [/\"([^\"\\\\]|\\\\.)*$/, \"string.invalid\"], // non-terminated string\n        [/\"/, { token: \"string.quote\", bracket: \"@open\", next: \"@string\" }],\n\n        // characters\n        [/'[^\\\\']'/, \"string\"],\n        [/(')(@escapes)(')/, [\"string\", \"string.escape\", \"string\"]],\n        [/'/, \"string.invalid\"],\n      ],\n      comment: [\n        [/[^/*]+/, \"comment\"],\n        [/\\/\\*/, \"comment\", \"@push\"], // nested comment\n        [\"\\\\*/\", \"comment\", \"@pop\"],\n        [/[/*]/, \"comment\"],\n      ],\n      string: [\n        [/[^\\\\\"]+/, \"string\"],\n        [/@escapes/, \"string.escape\"],\n        [/\\\\./, 
\"string.escape.invalid\"],\n        [/\"/, { token: \"string.quote\", bracket: \"@close\", next: \"@pop\" }],\n      ],\n\n      whitespace: [\n        [/[ \\t\\r\\n]+/, \"white\"],\n        [/\\/\\*/, \"comment\", \"@comment\"],\n        [/\\/\\/.*$/, \"comment\"],\n      ],\n    },\n  };\n}\n\nexport const createIndexCompletionProvider = (indexMetadata: IndexMetadata) => {\n  const fields = getAllFields(\n    indexMetadata.index_config.doc_mapping.field_mappings,\n  );\n  const completionProvider = {\n    provideCompletionItems(model: any, position: any) {\n      const word = model.getWordUntilPosition(position);\n\n      const range = {\n        startLineNumber: position.lineNumber,\n        endLineNumber: position.lineNumber,\n        startColumn: word.startColumn,\n        endColumn: word.endColumn,\n      };\n\n      // We want to auto complete all fields except timestamp that is handled with `TimeRangeSelect` component.\n      const fieldSuggestions = fields\n        .filter(\n          (field) =>\n            field.json_path !==\n            indexMetadata.index_config.doc_mapping.timestamp_field,\n        )\n        .map((field) => {\n          return {\n            label: field.json_path,\n            kind: CompletionItemKind.Field,\n            insertText:\n              field.field_mapping.type === \"json\"\n                ? 
field.json_path + \".\"\n                : field.json_path + \":\",\n            range: range,\n          };\n        });\n\n      return {\n        suggestions: fieldSuggestions.concat([\n          {\n            label: \"OR\",\n            kind: CompletionItemKind.Operator,\n            insertText: \"OR \",\n            range: range,\n          },\n          {\n            label: \"AND\",\n            kind: CompletionItemKind.Operator,\n            insertText: \"AND \",\n            range: range,\n          },\n        ]),\n      };\n    },\n  };\n\n  return completionProvider;\n};\n\nexport const setErrorMarker = (\n  monaco: any,\n  editor: any,\n  startlineNumber: number,\n  startColumnNumber: number,\n  message: string,\n) => {\n  const model = editor.getModel();\n\n  if (model) {\n    monaco.editor.setModelMarkers(model, \"QuestDBLanguageName\", [\n      {\n        message,\n        severity: monaco.MarkerSeverity.Error,\n        startLineNumber: startlineNumber,\n        endLineNumber: startlineNumber,\n        startColumn: startColumnNumber,\n        endColumn: startColumnNumber,\n      },\n    ]);\n  }\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/ResponseErrorDisplay.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport SentimentVeryDissatisfiedIcon from \"@mui/icons-material/SentimentVeryDissatisfied\";\nimport { Box } from \"@mui/material\";\nimport { ResponseError } from \"../utils/models\";\n\nfunction renderMessage(error: ResponseError) {\n  if (\n    error.message !== null &&\n    error.message.includes(\"No search node available.\")\n  ) {\n    return (\n      <Box sx={{ fontSize: 16, pt: 2 }}>\n        Your cluster does not contain any search node. You need at least one\n        search node.\n      </Box>\n    );\n  } else {\n    return (\n      <>\n        <Box sx={{ fontSize: 16, pt: 2 }}>\n          {error.status && <span>Status: {error.status}</span>}\n        </Box>\n        <Box sx={{ fontSize: 14, pt: 1, alignItems: \"center\" }}>\n          Error: {error.message}\n        </Box>\n      </>\n    );\n  }\n}\n\nexport default function ErrorResponseDisplay(error: ResponseError) {\n  return (\n    <Box\n      sx={{\n        pt: 2,\n        display: \"flex\",\n        flexDirection: \"column\",\n        alignItems: \"center\",\n      }}\n    >\n      <SentimentVeryDissatisfiedIcon sx={{ fontSize: 60 }} />\n      {renderMessage(error)}\n    </Box>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/SearchResult/AggregationResult.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { BarChart } from \"@mui/x-charts/BarChart\";\nimport { LineChart } from \"@mui/x-charts/LineChart\";\nimport {\n  extractAggregationResults,\n  HistogramResult,\n  ParsedAggregationResult,\n  SearchResponse,\n  TermResult,\n} from \"../../utils/models\";\n\nfunction isHistogram(agg: ParsedAggregationResult): agg is HistogramResult {\n  return agg != null && \"timestamps\" in agg;\n}\n\nfunction isTerm(agg: ParsedAggregationResult): agg is TermResult {\n  return Array.isArray(agg);\n}\n\nexport function AggregationResult({\n  searchResponse,\n}: {\n  searchResponse: SearchResponse;\n}) {\n  const result = extractAggregationResults(searchResponse.aggregations);\n  if (isHistogram(result)) {\n    const xAxis: React.ComponentProps<typeof LineChart>[\"xAxis\"] = [\n      {\n        data: result.timestamps,\n        valueFormatter: (date: number) => {\n          return new Date(date).toISOString();\n        },\n      },\n    ];\n    const series: React.ComponentProps<typeof LineChart>[\"series\"] =\n      result.data.map((line) => {\n        return {\n          curve: \"monotoneX\",\n          label: line.name,\n          data: line.value,\n        };\n      });\n    // we don't customize colors because we would need a full palette.\n    return <LineChart xAxis={xAxis} series={series} yAxis={[{ min: 0 }]} />;\n  } else if (isTerm(result)) {\n 
   return (\n      <BarChart\n        series={[\n          { data: result.map((entry) => entry.value), color: \"#004BD9A5\" },\n        ]}\n        xAxis={[{ data: result.map((entry) => entry.term), scaleType: \"band\" }]}\n        margin={{ top: 10, bottom: 30, left: 40, right: 10 }}\n      />\n    );\n  } else {\n    return <p>no result to display</p>;\n  }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/SearchResult/ResultTable.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Box, styled, Table, TableBody, TableContainer } from \"@mui/material\";\nimport { Field, getAllFields, Index, SearchResponse } from \"../../utils/models\";\nimport { Row } from \"./Row\";\n\nconst TableBox = styled(Box)`\ndisplay: flex;\nflex-direction: column;\noverflow: auto;\nflex: 1 1 100%;\nheight: 100%;\n`;\n\nexport function ResultTable({\n  searchResponse,\n  index,\n}: {\n  searchResponse: SearchResponse;\n  index: Index;\n}) {\n  const timestampField = getTimestampField(index);\n  return (\n    <TableBox>\n      <TableContainer>\n        <Table size=\"small\">\n          <TableBody>\n            {searchResponse.hits.map((hit, idx) => (\n              <Row key={idx} row={hit} timestampField={timestampField} />\n            ))}\n          </TableBody>\n        </Table>\n      </TableContainer>\n    </TableBox>\n  );\n}\n\nfunction getTimestampField(index: Index): Field | null {\n  const fields = getAllFields(\n    index.metadata.index_config.doc_mapping.field_mappings,\n  );\n  const timestamp_field_name =\n    index.metadata.index_config.doc_mapping.timestamp_field;\n  const timestamp_field = fields.filter(\n    (field) => field.field_mapping.name === timestamp_field_name,\n  )[0];\n  return timestamp_field ?? null;\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/SearchResult/Row.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { KeyboardArrowDown } from \"@mui/icons-material\";\nimport ChevronRight from \"@mui/icons-material/ChevronRight\";\nimport { Box, IconButton, styled, TableCell, TableRow } from \"@mui/material\";\nimport dayjs from \"dayjs\";\nimport relativeTime from \"dayjs/plugin/relativeTime\";\nimport utc from \"dayjs/plugin/utc\";\nimport React, { useState } from \"react\";\nimport {\n  DATE_TIME_WITH_SECONDS_FORMAT as DATE_TIME_WITH_MILLISECONDS_FORMAT,\n  DATE_TIME_WITH_SECONDS_FORMAT,\n  Entry,\n  Field,\n  RawDoc,\n} from \"../../utils/models\";\nimport { QUICKWIT_INTERMEDIATE_GREY } from \"../../utils/theme\";\nimport { JsonEditor } from \"../JsonEditor\";\n\ndayjs.extend(relativeTime);\ndayjs.extend(utc);\n\ninterface RowProps {\n  timestampField: Field | null;\n  row: RawDoc;\n}\n\nconst EntryName = styled(\"dt\")`\n  display: inline;\n  background-color: ${QUICKWIT_INTERMEDIATE_GREY};\n  color: #343741;\n  padding: 2px 1px 2px 4px;\n  margin-right: 4px;\n  word-break: normal;\n  border-radius: 3px;\n`;\n\nconst EntryValue = styled(\"dd\")`\n  display: inline;\n  margin: 0;\n  padding: 0;\n  margin-inline-end: 5px;\n`;\n\nfunction EntryFormatter(entry: Entry) {\n  // Some field can contains objects, stringify them to render them otherwise React will crash.\n  const value =\n    typeof entry.value === \"object\" ? 
JSON.stringify(entry.value) : entry.value;\n  return (\n    <>\n      <EntryName>{entry.key}:</EntryName>\n      <EntryValue>{value}</EntryValue>\n    </>\n  );\n}\n\n// Display the timestamp value in a `TableCell` if it is found.\nfunction DisplayTimestampValue(row: RawDoc, timestampField: Field | null) {\n  if (\n    timestampField === null ||\n    timestampField.field_mapping.output_format === null\n  ) {\n    return <></>;\n  }\n  let field_value = row;\n  for (const path_segment of timestampField.path_segments) {\n    field_value = field_value[path_segment];\n  }\n  if (!field_value) {\n    return <></>;\n  }\n  return (\n    <TableCell sx={{ verticalAlign: \"top\", padding: \"4px\" }}>\n      <Box\n        sx={{\n          maxHeight: \"115px\",\n          width: \"90px\",\n          display: \"inline-block\",\n          wordBreak: \"break-word\",\n        }}\n      >\n        {formatDateTime(\n          field_value,\n          timestampField.field_mapping.output_format,\n        )}\n      </Box>\n    </TableCell>\n  );\n}\n\nfunction formatDateTime(field_value: any, timestampOutputFormat: string): any {\n  // A unix timestamp can be in secs/millis/micros/nanos and needs to be converted to milliseconds for dayjs.\n  if (\n    timestampOutputFormat === \"unix_timestamp_secs\" &&\n    typeof field_value === \"number\"\n  ) {\n    return dayjs(field_value * 1000)\n      .utc()\n      .format(DATE_TIME_WITH_SECONDS_FORMAT);\n  } else if (\n    timestampOutputFormat === \"unix_timestamp_millis\" &&\n    typeof field_value === \"number\"\n  ) {\n    return dayjs(field_value).utc().format(DATE_TIME_WITH_MILLISECONDS_FORMAT);\n  } else if (\n    timestampOutputFormat === \"unix_timestamp_micros\" &&\n    typeof field_value === \"number\"\n  ) {\n    return dayjs(field_value / 1000)\n      .utc()\n      .format(DATE_TIME_WITH_MILLISECONDS_FORMAT);\n  } else if (\n    timestampOutputFormat === \"unix_timestamp_nanos\" &&\n    typeof field_value === \"number\"\n  ) {\n    return 
dayjs(field_value / 1000000)\n      .utc()\n      .format(DATE_TIME_WITH_MILLISECONDS_FORMAT);\n  } else {\n    // Other formats are string values, so we can display them as is.\n    return field_value;\n  }\n}\n\nconst BreakWordBox = styled(\"dl\")({\n  verticalAlign: \"top\",\n  display: \"inline-block\",\n  color: \"#464646\",\n  wordBreak: \"break-all\",\n  wordWrap: \"break-word\",\n  margin: 1,\n  overflow: \"hidden\",\n  lineHeight: \"1.8em\",\n});\n\nexport function Row(props: RowProps) {\n  const [open, setOpen] = useState(false);\n  const entries: Entry[] = [];\n  for (const [key, value] of Object.entries(props.row)) {\n    entries.push({ key: key, value: value });\n  }\n  return (\n    <>\n      <TableRow>\n        <TableCell\n          sx={{ px: 0, py: 0, verticalAlign: \"top\", padding: \"0px\" }}\n        >\n          <IconButton\n            aria-label=\"expand row\"\n            size=\"small\"\n            onClick={() => setOpen(!open)}\n          >\n            {open ? <KeyboardArrowDown /> : <ChevronRight />}\n          </IconButton>\n        </TableCell>\n        {DisplayTimestampValue(props.row, props.timestampField)}\n        <TableCell sx={{ padding: \"4px\" }}>\n          {!open && (\n            <BreakWordBox sx={{ maxHeight: \"100px\" }}>\n              {entries.map((entry) => (\n                <React.Fragment key={entry.key}>\n                  {EntryFormatter(entry)}\n                </React.Fragment>\n              ))}\n            </BreakWordBox>\n          )}\n          {open && <JsonEditor content={props.row} resizeOnMount={true} />}\n        </TableCell>\n      </TableRow>\n    </>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/SearchResult/SearchResult.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Box, Typography } from \"@mui/material\";\nimport { useMemo } from \"react\";\nimport { NumericFormat } from \"react-number-format\";\nimport { Index, ResponseError, SearchResponse } from \"../../utils/models\";\nimport Loader from \"../Loader\";\nimport ErrorResponseDisplay from \"../ResponseErrorDisplay\";\nimport { AggregationResult } from \"./AggregationResult\";\nimport { ResultTable } from \"./ResultTable\";\n\nfunction HitCount({ searchResponse }: { searchResponse: SearchResponse }) {\n  return (\n    <Box>\n      <Typography variant=\"body2\" color=\"textSecondary\">\n        <NumericFormat\n          displayType=\"text\"\n          value={searchResponse.num_hits}\n          thousandSeparator=\",\"\n        />{\" \"}\n        hits found in&nbsp;\n        <NumericFormat\n          decimalScale={2}\n          displayType=\"text\"\n          value={searchResponse.elapsed_time_micros / 1000000}\n          thousandSeparator=\",\"\n        />{\" \"}\n        seconds\n      </Typography>\n    </Box>\n  );\n}\n\ninterface SearchResultProps {\n  queryRunning: boolean;\n  index: null | Index;\n  searchResponse: null | SearchResponse;\n  searchError: null | ResponseError;\n}\n\nexport default function SearchResult(props: SearchResultProps) {\n  const result = useMemo(() => {\n    if (props.searchResponse == null || props.index == null) 
{\n      return null;\n    } else if (props.searchResponse.aggregations === undefined) {\n      return (\n        <ResultTable\n          searchResponse={props.searchResponse}\n          index={props.index}\n        />\n      );\n    } else {\n      return <AggregationResult searchResponse={props.searchResponse} />;\n    }\n  }, [props.searchResponse, props.index]);\n\n  if (props.queryRunning) {\n    return <Loader />;\n  }\n\n  if (props.searchError !== null) {\n    return ErrorResponseDisplay(props.searchError);\n  }\n\n  if (props.searchResponse == null || props.index == null) {\n    return <></>;\n  }\n\n  return (\n    <Box sx={{ pt: 1, flexGrow: \"1\", flexBasis: \"0%\", overflow: \"hidden\" }}>\n      <Box\n        sx={{\n          height: \"100%\",\n          flexDirection: \"column\",\n          flexGrow: 1,\n          display: \"flex\",\n        }}\n      >\n        <Box\n          sx={{\n            flexShrink: 0,\n            display: \"flex\",\n            flexGrow: 0,\n            flexBasis: \"auto\",\n          }}\n        >\n          <HitCount searchResponse={props.searchResponse} />\n        </Box>\n        <Box\n          sx={{\n            pt: 2,\n            flexGrow: 1,\n            flexBasis: \"0%\",\n            minHeight: 0,\n            display: \"flex\",\n            flexDirection: \"column\",\n          }}\n        >\n          {result}\n        </Box>\n      </Box>\n    </Box>\n  );\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/SideBar.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport {\n  ListItemButton,\n  ListSubheader,\n  styled,\n  Typography,\n} from \"@mui/material\";\nimport List from \"@mui/material/List\";\nimport ListItemIcon from \"@mui/material/ListItemIcon\";\nimport ListItemText from \"@mui/material/ListItemText\";\nimport { Database } from \"@styled-icons/feather/Database\";\nimport { Settings } from \"@styled-icons/feather/Settings\";\nimport { GroupWork } from \"@styled-icons/material-outlined/GroupWork\";\nimport { CodeSSlash } from \"@styled-icons/remix-line/CodeSSlash\";\nimport * as React from \"react\";\nimport { Link as RouterLink, LinkProps as RouterLinkProps } from \"react-router\";\nimport { useLocalStorage } from \"../providers/LocalStorageProvider\";\nimport { toUrlSearchRequestParams } from \"../utils/urls\";\nimport { APP_BAR_HEIGHT_PX } from \"./LayoutUtils\";\n\ninterface ListItemLinkProps {\n  icon?: React.ReactElement;\n  primary: React.ReactElement;\n  to: string;\n}\n\nfunction ListItemLink(props: ListItemLinkProps) {\n  const { icon, primary, to } = props;\n\n  const renderLink = React.useMemo(\n    () =>\n      React.forwardRef<HTMLAnchorElement, Omit<RouterLinkProps, \"to\">>(\n        function Link(itemProps, ref) {\n          return (\n            // biome-ignore lint/a11y/useValidAriaRole: remove the role\n            <RouterLink to={to} ref={ref} {...itemProps} 
role={undefined} />\n          );\n        },\n      ),\n    [to],\n  );\n\n  return (\n    <ListItemButton component={renderLink}>\n      {icon ? (\n        <ListItemIcon sx={{ minWidth: \"40px\" }}>{icon}</ListItemIcon>\n      ) : null}\n      <ListItemText primary={primary} />\n    </ListItemButton>\n  );\n}\n\nconst SideBarWrapper = styled(\"div\")({\n  display: \"flex\",\n  marginTop: `${APP_BAR_HEIGHT_PX}`,\n  height: `calc(100% - ${APP_BAR_HEIGHT_PX})`,\n  flex: \"0 0 180px\",\n  flexDirection: \"column\",\n  borderRight: \"1px solid rgba(0, 0, 0, 0.12)\",\n});\n\nconst SideBar = () => {\n  const lastSearchRequest = useLocalStorage().lastSearchRequest;\n  let searchUrl = \"/search\";\n  if (lastSearchRequest.indexId || lastSearchRequest.query) {\n    searchUrl =\n      \"/search?\" + toUrlSearchRequestParams(lastSearchRequest).toString();\n  }\n  return (\n    <SideBarWrapper sx={{ px: 0, py: 2 }}>\n      <List dense={true} sx={{ py: 0 }}>\n        <ListSubheader sx={{ lineHeight: \"25px\" }}>\n          <Typography variant=\"body1\">Discover</Typography>\n        </ListSubheader>\n        <ListItemLink\n          to={searchUrl}\n          primary={<Typography variant=\"body1\">Query editor</Typography>}\n          icon={<CodeSSlash size=\"18px\" />}\n        />\n        <ListSubheader sx={{ lineHeight: \"25px\", paddingTop: \"10px\" }}>\n          <Typography variant=\"body1\">Admin</Typography>\n        </ListSubheader>\n        <ListItemLink\n          to=\"/indexes\"\n          primary={<Typography variant=\"body1\">Indexes</Typography>}\n          icon={<Database size=\"18px\" />}\n        />\n        <ListItemLink\n          to=\"/cluster\"\n          primary={<Typography variant=\"body1\">Cluster</Typography>}\n          icon={<GroupWork size=\"18px\" />}\n        />\n        <ListItemLink\n          to=\"/node-info\"\n          primary={<Typography variant=\"body1\">Node info</Typography>}\n          icon={<Settings size=\"18px\" />}\n        />\n    
    <ListItemLink\n          to=\"/api-playground\"\n          primary={<Typography variant=\"body1\">API </Typography>}\n          icon={<CodeSSlash size=\"18px\" />}\n        />\n      </List>\n    </SideBarWrapper>\n  );\n};\n\nexport default SideBar;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/TimeRangeSelect.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { AccessTime, ChevronRight, DateRange } from \"@mui/icons-material\";\nimport {\n  Box,\n  Button,\n  Divider,\n  List,\n  ListItemButton,\n  ListItemIcon,\n  ListItemText,\n  Popover,\n} from \"@mui/material\";\nimport { DateTimePicker, LocalizationProvider } from \"@mui/x-date-pickers\";\nimport { AdapterDayjs } from \"@mui/x-date-pickers/AdapterDayjs\";\nimport { Dayjs, default as dayjs } from \"dayjs\";\nimport relativeTime from \"dayjs/plugin/relativeTime\";\nimport utc from \"dayjs/plugin/utc\";\nimport React, { JSX, useEffect, useMemo, useState } from \"react\";\nimport { DATE_TIME_WITH_SECONDS_FORMAT } from \"../utils/models\";\n\ndayjs.extend(relativeTime);\ndayjs.extend(utc);\n\nconst TIME_RANGE_CHOICES = [\n  [\"Last 15 min\", 15 * 60],\n  [\"Last 30 min\", 30 * 60],\n  [\"Last 1 hour\", 60 * 60],\n  [\"Last 7 days\", 7 * 24 * 60 * 60],\n  [\"Last 30 days\", 30 * 24 * 60 * 60],\n  [\"Last 3 months\", 90 * 24 * 60 * 60],\n  [\"Last year\", 365 * 24 * 60 * 60],\n];\n\ntype TimeRange = {\n  startTimestamp: number | null;\n  endTimestamp: number | null;\n};\n\nexport interface TimeRangeSelectProps {\n  timeRange: TimeRange;\n  disabled?: boolean;\n  onUpdate(newTimeRange: TimeRange): void;\n}\n\ninterface TimeRangeSelectState {\n  anchor: HTMLElement | null;\n  customDatesPanelOpen: boolean;\n  width: number;\n}\n\nexport function 
TimeRangeSelect(props: TimeRangeSelectProps): JSX.Element {\n  const getInitialState = () => {\n    return { width: 220, anchor: null, customDatesPanelOpen: false };\n  };\n  const initialState = useMemo(() => {\n    return getInitialState();\n  }, []);\n  const [state, setState] = useState<TimeRangeSelectState>(initialState);\n\n  const handleOpenClick = (event: React.MouseEvent<HTMLButtonElement>) => {\n    setState((prevState) => {\n      return { ...prevState, anchor: event.currentTarget };\n    });\n  };\n\n  const handleOpenCustomDatesPanelClick = () => {\n    setState((prevState) => {\n      return { ...prevState, customDatesPanelOpen: true, width: 500 };\n    });\n  };\n\n  useEffect(() => {\n    setState(initialState);\n  }, [props.disabled, initialState]);\n\n  const handleClose = () => {\n    setState(initialState);\n  };\n\n  const handleTimeRangeChoiceClick = (\n    secondsBeforeNow: number | string | undefined,\n  ) => {\n    if (secondsBeforeNow === undefined) {\n      return;\n    }\n    // Ensures that we have a number.\n    secondsBeforeNow = +secondsBeforeNow;\n    setState(initialState);\n    const startTimestamp = Math.trunc(Date.now() / 1000) - secondsBeforeNow;\n    props.onUpdate({ startTimestamp, endTimestamp: null });\n  };\n\n  const handleReset = () => {\n    props.onUpdate({ startTimestamp: null, endTimestamp: null });\n  };\n\n  const open = Boolean(state.anchor);\n  const id = open ? 
\"time-range-select-popover\" : undefined;\n\n  return (\n    <Box sx={{ padding: \"10px\" }}>\n      <Button\n        variant=\"contained\"\n        disableElevation\n        onClick={handleOpenClick}\n        startIcon={<AccessTime />}\n        disabled={props.disabled}\n      >\n        <DateTimeRangeLabel\n          startTimestamp={props.timeRange.startTimestamp}\n          endTimestamp={props.timeRange.endTimestamp}\n        />\n      </Button>\n      <Popover\n        id={id}\n        open={open}\n        anchorEl={state.anchor}\n        onClose={handleClose}\n        anchorOrigin={{\n          vertical: \"bottom\",\n          horizontal: \"center\",\n        }}\n        transformOrigin={{\n          vertical: \"top\",\n          horizontal: \"center\",\n        }}\n        PaperProps={{\n          style: { width: state.width },\n        }}\n      >\n        <Box display=\"flex\" flexDirection=\"column\">\n          <Box p={1.5}>\n            <b>Select a period</b>\n          </Box>\n          <Divider />\n          <Box display=\"flex\" flexDirection=\"row\">\n            <Box flexGrow={1} borderRight={1} borderColor=\"grey.300\">\n              <List disablePadding>\n                {TIME_RANGE_CHOICES.map((value, idx) => {\n                  return (\n                    <ListItemButton\n                      key={idx}\n                      onClick={() => handleTimeRangeChoiceClick(value[1])}\n                    >\n                      <ListItemText primary={value[0]} />\n                    </ListItemButton>\n                  );\n                })}\n                <ListItemButton onClick={handleReset}>\n                  <ListItemText primary=\"Reset\" />\n                </ListItemButton>\n                <ListItemButton onClick={handleOpenCustomDatesPanelClick}>\n                  <ListItemIcon\n                    sx={{\n                      alignItems: \"left\",\n                      minWidth: \"inherit\",\n                      paddingRight: 
\"8px\",\n                    }}\n                  >\n                    <DateRange />\n                  </ListItemIcon>\n                  <ListItemText\n                    primary=\"Custom dates\"\n                    sx={{ paddingRight: \"16px\" }}\n                  />\n                  <ListItemIcon sx={{ minWidth: \"inherit\" }}>\n                    <ChevronRight />\n                  </ListItemIcon>\n                </ListItemButton>\n              </List>\n            </Box>\n            {state.anchor !== null && state.customDatesPanelOpen && (\n              <CustomDatesPanel {...props} />\n            )}\n          </Box>\n        </Box>\n      </Popover>\n    </Box>\n  );\n}\n\nfunction CustomDatesPanel(props: TimeRangeSelectProps): JSX.Element {\n  const [startDate, setStartDate] = useState<Dayjs | null>(null);\n  const [endDate, setEndDate] = useState<Dayjs | null>(null);\n\n  useEffect(() => {\n    setStartDate(\n      props.timeRange.startTimestamp\n        ? convertTimestampSecsIntoDateUtc(props.timeRange.startTimestamp)\n        : null,\n    );\n    setEndDate(\n      props.timeRange.endTimestamp\n        ? convertTimestampSecsIntoDateUtc(props.timeRange.endTimestamp)\n        : null,\n    );\n  }, [props.timeRange.startTimestamp, props.timeRange.endTimestamp]);\n  const handleReset = (event: React.MouseEvent<HTMLButtonElement>) => {\n    event.preventDefault();\n    setStartDate(null);\n    setEndDate(null);\n    props.onUpdate({ startTimestamp: null, endTimestamp: null });\n  };\n  const handleApply = (event: React.MouseEvent<HTMLButtonElement>) => {\n    event.preventDefault();\n    const startTimestamp = startDate ? startDate.valueOf() / 1000 : null;\n    const endTimestamp = endDate ? 
endDate.valueOf() / 1000 : null;\n    props.onUpdate({ startTimestamp, endTimestamp });\n  };\n\n  return (\n    <LocalizationProvider dateAdapter={AdapterDayjs}>\n      <Box\n        display=\"flex\"\n        flexDirection=\"column\"\n        p={2}\n        sx={{ minWidth: \"300px\" }}\n      >\n        <Box flexGrow={1}>\n          <Box pb={1.5}>\n            <DateTimePicker\n              label=\"Start Date\"\n              value={startDate}\n              format={DATE_TIME_WITH_SECONDS_FORMAT}\n              onChange={(newValue: null | Dayjs) => {\n                // By default, newValue is a datetime defined on the local time zone and for now we consider\n                // input/output only in UTC.\n                setStartDate(\n                  newValue\n                    ? dayjs(\n                        newValue.valueOf() + newValue.utcOffset() * 60 * 1000,\n                      ).utc()\n                    : null,\n                );\n              }}\n              slotProps={{ textField: { sx: { width: \"100%\" } } }}\n            />\n          </Box>\n          <Box>\n            <DateTimePicker\n              label=\"End Date\"\n              value={endDate}\n              format={DATE_TIME_WITH_SECONDS_FORMAT}\n              onChange={(newValue: null | Dayjs) => {\n                // By default, newValue is a datetime defined on the local time zone and for now we consider\n                // input/output only in UTC.\n                setEndDate(\n                  newValue\n                    ? 
dayjs(\n                        newValue.valueOf() + newValue.utcOffset() * 60 * 1000,\n                      ).utc()\n                    : null,\n                );\n              }}\n              slotProps={{ textField: { sx: { width: \"100%\" } } }}\n            />\n          </Box>\n        </Box>\n        <Box display=\"flex\">\n          <Button\n            variant=\"outlined\"\n            color=\"primary\"\n            onClick={handleReset}\n            disableElevation\n            style={{ marginRight: 10 }}\n          >\n            Reset\n          </Button>\n          <Button\n            variant=\"contained\"\n            color=\"primary\"\n            onClick={handleApply}\n            disableElevation\n          >\n            Apply\n          </Button>\n        </Box>\n      </Box>\n    </LocalizationProvider>\n  );\n}\n\ninterface DateTimeRangeLabelProps {\n  startTimestamp: number | null;\n  endTimestamp: number | null;\n}\n\nfunction DateTimeRangeLabel(props: DateTimeRangeLabelProps): JSX.Element {\n  function Label() {\n    if (props.startTimestamp !== null && props.endTimestamp !== null) {\n      return (\n        <>\n          {convertTimestampSecsIntoDateUtc(props.startTimestamp).format(\n            DATE_TIME_WITH_SECONDS_FORMAT,\n          )}{\" \"}\n          -{\" \"}\n          {convertTimestampSecsIntoDateUtc(props.endTimestamp).format(\n            DATE_TIME_WITH_SECONDS_FORMAT,\n          )}\n        </>\n      );\n    } else if (props.startTimestamp !== null && props.endTimestamp === null) {\n      return (\n        <>\n          Since{\" \"}\n          {convertTimestampSecsIntoDateUtc(props.startTimestamp).fromNow(true)}\n        </>\n      );\n    } else if (props.startTimestamp == null && props.endTimestamp != null) {\n      return (\n        <>\n          Before{\" \"}\n          {convertTimestampSecsIntoDateUtc(props.endTimestamp).format(\n            DATE_TIME_WITH_SECONDS_FORMAT,\n          )}\n        </>\n      );\n    
}\n    return <>No date range</>;\n  }\n\n  return (\n    <span style={{ textTransform: \"none\" }}>\n      <Label />\n    </span>\n  );\n}\n\nfunction convertTimestampSecsIntoDateUtc(timestamp_secs: number): Dayjs {\n  return dayjs(timestamp_secs * 1000).utc();\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/components/TopBar.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport GitHubIcon from \"@mui/icons-material/GitHub\";\nimport {\n  Box,\n  IconButton,\n  Link,\n  SvgIcon,\n  styled,\n  Tooltip,\n  Typography,\n} from \"@mui/material\";\nimport AppBar from \"@mui/material/AppBar\";\nimport Toolbar from \"@mui/material/Toolbar\";\nimport { Discord } from \"@styled-icons/fa-brands/Discord\";\nimport { useEffect, useMemo, useState } from \"react\";\nimport quickwitLogoUrl from \"../assets/img/quickwit-logo-with-title.svg\";\nimport { Client } from \"../services/client\";\n\nconst Logo = (props: React.ComponentProps<\"img\">) => (\n  <img {...props} src={quickwitLogoUrl} alt=\"quickwit logo\" />\n);\n\nconst StyledAppBar = styled(AppBar)(({ theme }) => ({\n  zIndex: theme.zIndex.drawer + 1,\n}));\n\n// Update the AppBar's color prop options\ndeclare module \"@mui/material/AppBar\" {\n  interface AppBarPropsColorOverrides {\n    neutral: true;\n  }\n}\n\nconst TopBar = () => {\n  const [clusterId, setClusterId] = useState<string>(\"\");\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  useEffect(() => {\n    quickwitClient.cluster().then((cluster) => {\n      setClusterId(cluster.cluster_id);\n    });\n  }, [quickwitClient]);\n\n  return (\n    <StyledAppBar position=\"fixed\" elevation={0} color=\"neutral\">\n      <Toolbar variant=\"dense\">\n        <Box\n          sx={{\n            flexGrow: 
1,\n            p: 0,\n            m: 0,\n            display: \"flex\",\n            alignItems: \"center\",\n          }}\n        >\n          <Logo height=\"25px\"></Logo>\n          <Tooltip title=\"Cluster ID\" placement=\"right\">\n            <Typography mx={2}>{clusterId}</Typography>\n          </Tooltip>\n        </Box>\n        <Link href=\"https://quickwit.io/docs\" target=\"_blank\" sx={{ px: 2 }}>\n          Docs\n        </Link>\n        <Link href=\"https://discord.gg/rpRRTezWhW\" target=\"_blank\">\n          <IconButton size=\"large\">\n            <SvgIcon>\n              <Discord />\n            </SvgIcon>\n          </IconButton>\n        </Link>\n        <Link href=\"https://github.com/quickwit-inc/quickwit\" target=\"_blank\">\n          <IconButton size=\"large\">\n            <GitHubIcon />\n          </IconButton>\n        </Link>\n      </Toolbar>\n    </StyledAppBar>\n  );\n};\n\nexport default TopBar;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/index.css",
    "content": "/*\nCopyright 2021-Present Datadog, Inc.\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n*/\nhtml,\nbody {\n  height: 100%;\n}\n\n#root {\n  height: 100%;\n}\n\ndiv.swagger-ui div.information-container {\n  display: none;\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/index.test.js",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { describe, expect, it } from \"@jest/globals\";\nimport { render, screen } from \"@testing-library/react\";\nimport { BrowserRouter } from \"react-router\";\nimport App from \"./views/App\";\n\ndescribe(\"App\", () => {\n  it(\"Should display side bar links\", () => {\n    render(\n      <BrowserRouter>\n        <App />\n      </BrowserRouter>,\n    );\n    expect(screen.getByText(/Discover/)).toBeInTheDocument();\n    expect(screen.getByText(/Query editor/)).toBeInTheDocument();\n    expect(screen.getByText(/Admin/)).toBeInTheDocument();\n  });\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/index.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport React from \"react\";\nimport { createRoot } from \"react-dom/client\";\nimport \"./index.css\";\nimport { BrowserRouter } from \"react-router\";\nimport App from \"./views/App\";\n\nconst root = createRoot(document.getElementById(\"root\")!);\nroot.render(\n  <React.StrictMode>\n    <BrowserRouter basename={import.meta.env.BASE_URL}>\n      <App />\n    </BrowserRouter>\n  </React.StrictMode>,\n);\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/providers/EditorProvider.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport * as monacoEditor from \"monaco-editor/esm/vs/editor/editor.api\";\nimport {\n  createContext,\n  MutableRefObject,\n  PropsWithChildren,\n  useContext,\n  useRef,\n} from \"react\";\n\ntype ContextProps = {\n  editorRef: MutableRefObject<unknown | null> | null;\n  monacoRef: MutableRefObject<typeof monacoEditor | null> | null;\n};\n\nconst defaultValues = {\n  editorRef: null,\n  monacoRef: null,\n};\n\nconst EditorContext = createContext<ContextProps>(defaultValues);\n\nexport const EditorProvider = ({ children }: PropsWithChildren<unknown>) => {\n  const editorRef = useRef<unknown | null>(null);\n  const monacoRef = useRef<typeof monacoEditor | null>(null);\n\n  return (\n    <EditorContext.Provider\n      value={{\n        editorRef,\n        monacoRef,\n      }}\n    >\n      {children}\n    </EditorContext.Provider>\n  );\n};\n\nexport const useEditor = () => {\n  return useContext(EditorContext);\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/providers/LocalStorageProvider.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport {\n  createContext,\n  PropsWithChildren,\n  useContext,\n  useEffect,\n  useState,\n} from \"react\";\nimport { EMPTY_SEARCH_REQUEST, SearchRequest } from \"../utils/models\";\n\ntype Props = Record<string, unknown>;\n\ntype ContextProps = {\n  lastSearchRequest: SearchRequest;\n  updateLastSearchRequest: (searchRequest: SearchRequest) => void;\n};\n\nconst defaultValues = {\n  lastSearchRequest: EMPTY_SEARCH_REQUEST,\n  updateLastSearchRequest: () => undefined,\n};\n\nfunction parseSearchRequest(value: string | null): SearchRequest {\n  if (value === null) {\n    return EMPTY_SEARCH_REQUEST;\n  }\n  return JSON.parse(value);\n}\n\nexport const LocalStorageContext = createContext<ContextProps>(defaultValues);\n\nexport const LocalStorageProvider = ({\n  children,\n}: PropsWithChildren<Props>) => {\n  const [lastSearchRequest, setLastSearchRequest] =\n    useState<SearchRequest>(EMPTY_SEARCH_REQUEST);\n\n  useEffect(() => {\n    if (localStorage.getItem(\"lastSearchRequest\") !== null) {\n      const lastSearchRequest = parseSearchRequest(\n        localStorage.getItem(\"lastSearchRequest\"),\n      );\n      setLastSearchRequest(lastSearchRequest);\n    }\n  }, []);\n\n  useEffect(() => {\n    localStorage.setItem(\n      \"lastSearchRequest\",\n      JSON.stringify(lastSearchRequest),\n    );\n  }, [lastSearchRequest]);\n\n  function 
updateLastSearchRequest(searchRequest: SearchRequest) {\n    setLastSearchRequest(searchRequest);\n  }\n\n  return (\n    <LocalStorageContext.Provider\n      value={{\n        lastSearchRequest,\n        updateLastSearchRequest,\n      }}\n    >\n      {children}\n    </LocalStorageContext.Provider>\n  );\n};\n\nexport const useLocalStorage = () => {\n  return useContext(LocalStorageContext);\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/services/client.test.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { describe, expect, it, jest } from \"@jest/globals\";\nimport { SearchRequest } from \"../utils/models\";\nimport { Client } from \"./client\";\n\ndescribe(\"Client unit test\", () => {\n  it(\"Should construct correct search URL\", async () => {\n    // Mocking the fetch function to simulate network requests\n    const mockFetch = jest.fn((_url: string, _options?: unknown) =>\n      Promise.resolve({ ok: true, json: () => Promise.resolve({}) }),\n    );\n    (global as any).fetch = mockFetch;\n\n    const searchRequest: SearchRequest = {\n      indexId: \"my-new-fresh-index-id\",\n      query: \"severity_error:ERROR\",\n      startTimestamp: 100,\n      endTimestamp: 200,\n      maxHits: 20,\n      sortByField: {\n        field_name: \"timestamp\",\n        order: \"Desc\",\n      },\n      aggregation: false,\n      aggregationConfig: {\n        metric: null,\n        term: null,\n        histogram: null,\n      },\n    };\n\n    const client = new Client();\n    expect(client.buildSearchBody(searchRequest, null)).toBe(\n      '{\"query\":\"severity_error:ERROR\",\"max_hits\":20,\"start_timestamp\":100,\"end_timestamp\":200,\"sort_by_field\":\"+timestamp\"}',\n    );\n\n    await client.search(searchRequest, null);\n    const expectedUrl = `${client.apiRoot()}my-new-fresh-index-id/search`;\n    
expect(mockFetch).toHaveBeenCalledTimes(1);\n    expect(mockFetch).toHaveBeenCalledWith(expectedUrl, expect.any(Object));\n  });\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/services/client.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport {\n  Cluster,\n  Index,\n  IndexMetadata,\n  QuickwitBuildInfo,\n  SearchRequest,\n  SearchResponse,\n  SplitMetadata,\n} from \"../utils/models\";\nimport { serializeSortByField } from \"../utils/urls\";\n\nexport class Client {\n  private readonly _host: string;\n\n  constructor(host?: string) {\n    if (!host) {\n      this._host = window.location.origin;\n    } else {\n      this._host = host;\n    }\n  }\n\n  apiRoot(): string {\n    return this._host + \"/api/v1/\";\n  }\n\n  async search(\n    request: SearchRequest,\n    timestamp_field: string | null,\n  ): Promise<SearchResponse> {\n    // TODO: improve validation of request.\n    if (request.indexId === null || request.indexId === undefined) {\n      throw Error(\"Search request must have an index id.\");\n    }\n    const url = `${this.apiRoot()}${request.indexId}/search`;\n    const body = this.buildSearchBody(request, timestamp_field);\n    return this.fetch(url, this.defaultGetRequestParams(), body);\n  }\n\n  async cluster(): Promise<Cluster> {\n    return await this.fetch(\n      `${this.apiRoot()}cluster`,\n      this.defaultGetRequestParams(),\n    );\n  }\n\n  async buildInfo(): Promise<QuickwitBuildInfo> {\n    return await this.fetch(\n      `${this.apiRoot()}version`,\n      this.defaultGetRequestParams(),\n    );\n  }\n\n  async config(): Promise<Record<string, 
any>> {\n    return await this.fetch(\n      `${this.apiRoot()}config`,\n      this.defaultGetRequestParams(),\n    );\n  }\n  //\n  // Index management API\n  //\n  async getIndex(indexId: string): Promise<Index> {\n    const [metadata, splits] = await Promise.all([\n      this.getIndexMetadata(indexId),\n      this.getAllSplits(indexId),\n    ]);\n    return {\n      metadata: metadata,\n      splits: splits[0],\n      split_limit_reached: splits[1],\n    };\n  }\n\n  async getIndexMetadata(indexId: string): Promise<IndexMetadata> {\n    return this.fetch(`${this.apiRoot()}indexes/${indexId}`, {});\n  }\n\n  async getAllSplits(\n    indexId: string,\n  ): Promise<[Array<SplitMetadata>, boolean]> {\n    // TODO: retrieve all the splits.\n    const results: { splits: Array<SplitMetadata> } = await this.fetch(\n      `${this.apiRoot()}indexes/${indexId}/splits?limit=10000`,\n      {},\n    );\n\n    return [results[\"splits\"], results[\"splits\"].length === 10000];\n  }\n\n  async listIndexes(): Promise<Array<IndexMetadata>> {\n    return this.fetch(`${this.apiRoot()}indexes`, {});\n  }\n\n  async fetch<T>(\n    url: string,\n    params: RequestInit,\n    body: string | null = null,\n  ): Promise<T> {\n    if (body !== null) {\n      params.method = \"POST\";\n      params.body = body;\n      params.headers = {\n        ...params.headers,\n        \"content-type\": \"application/json\",\n      };\n    }\n    const response = await fetch(url, params);\n    if (response.ok) {\n      return response.json() as Promise<T>;\n    }\n    const message = await response.text();\n    return await Promise.reject({\n      message: message,\n      status: response.status,\n    });\n  }\n\n  private defaultGetRequestParams(): RequestInit {\n    return {\n      method: \"GET\",\n      headers: { Accept: \"application/json\" },\n      mode: \"cors\",\n      cache: \"default\",\n    };\n  }\n\n  buildSearchBody(\n    request: SearchRequest,\n    timestamp_field: string | null,\n  
): string {\n    const body: any = {\n      // TODO: the trim should be done in the backend.\n      query: request.query.trim() || \"*\",\n    };\n\n    if (request.aggregation) {\n      const qw_aggregation = this.buildAggregation(request, timestamp_field);\n      body[\"aggs\"] = qw_aggregation;\n      body[\"max_hits\"] = 0;\n    } else {\n      body[\"max_hits\"] = 20;\n    }\n    if (request.startTimestamp) {\n      body[\"start_timestamp\"] = request.startTimestamp;\n    }\n    if (request.endTimestamp) {\n      body[\"end_timestamp\"] = request.endTimestamp;\n    }\n    if (request.sortByField) {\n      body[\"sort_by_field\"] = serializeSortByField(request.sortByField);\n    }\n    return JSON.stringify(body);\n  }\n\n  buildAggregation(\n    request: SearchRequest,\n    timestamp_field: string | null,\n  ): any {\n    let aggregation: any;\n    if (request.aggregationConfig.metric) {\n      const metric = request.aggregationConfig.metric;\n      aggregation = {\n        metric: {\n          [metric.type]: {\n            field: metric.field,\n          },\n        },\n      };\n    }\n    if (request.aggregationConfig.histogram && timestamp_field) {\n      const histogram = request.aggregationConfig.histogram;\n      const interval = histogram.interval;\n      let extended_bounds: any;\n      if (request.startTimestamp && request.endTimestamp) {\n        extended_bounds = {\n          min: request.startTimestamp,\n          max: request.endTimestamp,\n        };\n      } else {\n        extended_bounds = undefined;\n      }\n      aggregation = {\n        histo_agg: {\n          aggs: aggregation,\n          date_histogram: {\n            field: timestamp_field,\n            fixed_interval: interval,\n            min_doc_count: 0,\n            extended_bounds: extended_bounds,\n          },\n        },\n      };\n    }\n    if (request.aggregationConfig.term) {\n      const term = request.aggregationConfig.term;\n      aggregation = {\n        term_agg: {\n 
         aggs: aggregation,\n          terms: {\n            field: term.field,\n            size: term.size,\n            order: {\n              _count: \"desc\",\n            },\n            min_doc_count: 1,\n          },\n        },\n      };\n    }\n    return aggregation;\n  }\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/utils/SearchComponentProps.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Index, SearchRequest } from \"./models\";\n\nexport interface SearchComponentProps {\n  searchRequest: SearchRequest;\n  queryRunning: boolean;\n  index: null | Index;\n  onSearchRequestUpdate(searchRequest: SearchRequest): void;\n  runSearch(searchRequest: SearchRequest): void;\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/utils/models.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nexport type RawDoc = Record<string, any>;\n\nexport type FieldMapping = {\n  description: string | null;\n  name: string;\n  type: string;\n  stored: boolean | null;\n  fast: boolean | null;\n  indexed: boolean | null;\n  // Specific datetime field attributes.\n  output_format: string | null;\n  field_mappings?: FieldMapping[];\n};\n\nexport type Field = {\n  // Json path (path segments concatenated as a string with dots between segments).\n  json_path: string;\n  // Path segments of the field, from the root object down to the field name.\n  path_segments: string[];\n  field_mapping: FieldMapping;\n};\n\nexport type Entry = {\n  key: string;\n  value: any;\n};\n\nexport const DATE_TIME_WITH_SECONDS_FORMAT = \"YYYY/MM/DD HH:mm:ss\";\nexport const DATE_TIME_WITH_MILLISECONDS_FORMAT = \"YYYY/MM/DD HH:mm:ss.SSS\";\n\n// Returns a flattened array of fields and nested fields found in the given `FieldMapping` array.\nexport function getAllFields(field_mappings: Array<FieldMapping>): Field[] {\n  const fields: Field[] = [];\n  for (const field_mapping of field_mappings) {\n    if (\n      field_mapping.type === \"object\" &&\n      field_mapping.field_mappings !== undefined\n    ) {\n      for (const child_field_mapping of getAllFields(\n        field_mapping.field_mappings,\n      )) {\n        fields.push({\n          json_path: field_mapping.name + \".\" + child_field_mapping.json_path,\n          
path_segments: [field_mapping.name].concat(\n            child_field_mapping.path_segments,\n          ),\n          field_mapping: child_field_mapping.field_mapping,\n        });\n      }\n    } else {\n      fields.push({\n        json_path: field_mapping.name,\n        path_segments: [field_mapping.name],\n        field_mapping: field_mapping,\n      });\n    }\n  }\n\n  return fields;\n}\n\nexport type DocMapping = {\n  field_mappings: FieldMapping[];\n  tag_fields: string[];\n  store: boolean;\n  dynamic_mapping: boolean;\n  timestamp_field: string | null;\n};\n\nexport type SortOrder = \"Asc\" | \"Desc\";\n\nexport type SortByField = {\n  field_name: string;\n  order: SortOrder;\n};\n\nexport type SearchRequest = {\n  indexId: string | null;\n  query: string;\n  startTimestamp: number | null;\n  endTimestamp: number | null;\n  maxHits: number;\n  sortByField: SortByField | null;\n  aggregation: boolean;\n  aggregationConfig: Aggregation;\n};\n\nexport type Aggregation = {\n  metric: Metric | null;\n  term: TermAgg | null;\n  histogram: HistogramAgg | null;\n};\n\nexport type Metric = {\n  type: string;\n  field: string;\n};\n\nexport type TermAgg = {\n  field: string;\n  size: number;\n};\n\nexport type HistogramAgg = {\n  interval: string;\n};\n\nexport type ParsedAggregationResult = TermResult | HistogramResult | null;\n\nexport type TermResult = { term: string; value: number }[];\n\nexport type HistogramResult = {\n  timestamps: Date[];\n  data: { name: string | undefined; value: number[] }[];\n};\n\nexport function extractAggregationResults(\n  aggregation: any,\n): ParsedAggregationResult {\n  const extract_value = (entry: any) => {\n    if (\"metric\" in entry) {\n      return entry.metric.value || 0;\n    } else {\n      return entry.doc_count;\n    }\n  };\n  if (\"histo_agg\" in aggregation) {\n    const buckets = aggregation.histo_agg.buckets;\n    const timestamps = buckets.map((entry: any) => entry.key);\n    const value = 
buckets.map(extract_value);\n    // we are in the \"simple histogram\" case\n    return {\n      timestamps,\n      data: [{ name: undefined, value }],\n    };\n  } else if (\"term_agg\" in aggregation) {\n    // we have a term aggregation, but maybe there is an histogram inside\n    const term_buckets = aggregation.term_agg.buckets;\n    if (term_buckets.length === 0) {\n      return null;\n    }\n    if (term_buckets.length > 0 && \"histo_agg\" in term_buckets[0]) {\n      // we have a term+histo aggregation\n      const timestamps_set: Set<number> = new Set();\n      term_buckets.forEach((bucket: any) =>\n        bucket.histo_agg.buckets.forEach((entry: any) =>\n          timestamps_set.add(entry.key),\n        ),\n      );\n      const timestamps = [...timestamps_set];\n      timestamps.sort((a, b) => a - b);\n\n      const data = term_buckets.map((bucket: any) => {\n        const histo_buckets = bucket.histo_agg.buckets;\n        const first_elem_key = histo_buckets[0].key;\n        const last_elem_key = histo_buckets[histo_buckets.length - 1].key;\n        const prefix_len = timestamps.indexOf(first_elem_key);\n        const suffix_len =\n          timestamps.length - timestamps.indexOf(last_elem_key) - 1;\n        const value = Array(prefix_len)\n          .fill(0)\n          .concat(histo_buckets.map(extract_value), Array(suffix_len).fill(0));\n\n        return { name: bucket.key, value };\n      });\n      return {\n        timestamps: timestamps.map((date) => new Date(date)),\n        data,\n      };\n    } else {\n      return term_buckets.map((bucket: any) => {\n        return {\n          term: bucket.key,\n          value: extract_value(bucket),\n        };\n      });\n    }\n  }\n  // we are in neither case??\n  return null;\n}\n\nexport const EMPTY_SEARCH_REQUEST: SearchRequest = {\n  indexId: \"\",\n  query: \"\",\n  startTimestamp: null,\n  endTimestamp: null,\n  maxHits: 100,\n  sortByField: null,\n  aggregation: false,\n  aggregationConfig: {\n    metric: 
null,\n    term: null,\n    histogram: null,\n  },\n};\n\nexport type ResponseError = {\n  status: number | null;\n  message: string | null;\n};\n\nexport type SearchResponse = {\n  num_hits: number;\n  hits: Array<RawDoc>;\n  elapsed_time_micros: number;\n  errors: Array<any> | undefined;\n  aggregations: any | undefined;\n};\n\nexport type IndexConfig = {\n  version: string;\n  index_id: string;\n  index_uri: string;\n  doc_mapping: DocMapping;\n  indexing_settings: object;\n  search_settings: object;\n  retention: object;\n};\n\nexport type IndexMetadata = {\n  index_config: IndexConfig;\n  checkpoint: object;\n  sources: object[] | undefined;\n  create_timestamp: number;\n};\n\nexport const EMPTY_INDEX_METADATA: IndexMetadata = {\n  index_config: {\n    version: \"\",\n    index_uri: \"\",\n    index_id: \"\",\n    doc_mapping: {\n      field_mappings: [],\n      tag_fields: [],\n      store: false,\n      dynamic_mapping: false,\n      timestamp_field: null,\n    },\n    indexing_settings: {},\n    search_settings: {},\n    retention: {},\n  },\n  checkpoint: {},\n  sources: undefined,\n  create_timestamp: 0,\n};\n\nexport type SplitMetadata = {\n  split_id: string;\n  split_state: string;\n  num_docs: number;\n  uncompressed_docs_size_in_bytes: number;\n  time_range: null | Range;\n  update_timestamp: number;\n  version: number;\n  create_timestamp: number;\n  tags: string[];\n  demux_num_ops: number;\n  footer_offsets: Range;\n};\n\nexport type Range = {\n  start: number;\n  end: number;\n};\n\nexport type Index = {\n  metadata: IndexMetadata;\n  splits: SplitMetadata[];\n  split_limit_reached: boolean;\n};\n\nexport type Cluster = {\n  node_id: string;\n  cluster_id: string;\n  state: ClusterState;\n};\n\nexport type ClusterState = {\n  state: ClusterStateSnapshot;\n  live_nodes: any[];\n  dead_nodes: any[];\n};\n\nexport type ClusterStateSnapshot = {\n  seed_addrs: string[];\n  node_states: Record<string, NodeState>;\n};\n\nexport type NodeState = {\n  
key_values: KeyValues;\n  max_version: number;\n};\n\nexport type KeyValues = {\n  available_services: KeyValue;\n  grpc_address: KeyValue;\n  heartbeat: KeyValue;\n};\n\nexport type KeyValue = {\n  value: any;\n  version: number;\n};\n\nexport type QuickwitBuildInfo = {\n  commit_version_tag: string;\n  cargo_pkg_version: string;\n  cargo_build_target: string;\n  commit_short_hash: string;\n  commit_date: string;\n  version: string;\n};\n\nexport type NodeId = {\n  id: string;\n  grpc_address: string;\n  self: boolean;\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/utils/theme.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { createTheme } from \"@mui/material\";\nimport SoehneMonoDreiviertelfettWoff2 from \"./../assets/fonts/soehne-mono-web-dreiviertelfett.woff2\";\nimport SoehneMonoKraftigWoff2 from \"./../assets/fonts/soehne-mono-web-kraftig.woff2\";\nimport SoehneBuchWoff2 from \"./../assets/fonts/soehne-web-buch.woff2\";\nimport SoehneHalbfettWoff2 from \"./../assets/fonts/soehne-web-halbfett.woff2\";\n\nexport const QUICKWIT_BLUE = \"#004BD9\";\nexport const QUICKWIT_RED = \"#FF0026\";\nexport const QUICKWIT_GREEN = \"#00D588\";\nexport const QUICKWIT_GREY = \"#CBD1DE\";\nexport const QUICKWIT_INTERMEDIATE_GREY = \"rgba(203,209,222,0.5)\";\nexport const QUICKWIT_LIGHT_GREY = \"#F8F9FB\";\nexport const QUICKWIT_BLACK = \"#1F232A\";\n\n// Update the Typography's variant prop options\ndeclare module \"@mui/material/Typography\" {\n  interface TypographyPropsVariantOverrides {\n    fontSize: true;\n    poster: true;\n    h3: false;\n  }\n}\n\ndeclare module \"@mui/material/styles\" {\n  interface Theme {\n    status: {\n      danger: React.CSSProperties[\"color\"];\n    };\n  }\n\n  interface PaletteOptions {\n    neutral: PaletteOptions[\"primary\"];\n  }\n\n  interface Palette {\n    primary: Palette[\"primary\"];\n    secondary: Palette[\"secondary\"];\n    text: Palette[\"text\"];\n    neutral: Palette[\"primary\"];\n  }\n}\n\nexport const theme = 
createTheme({\n  palette: {\n    primary: {\n      main: \"#000000\",\n      contrastText: \"#ffffff\",\n    },\n    secondary: {\n      main: \"#000000\",\n    },\n    text: {\n      primary: \"#000000\",\n    },\n    neutral: {\n      main: \"#F8F9FB\",\n      contrastText: \"#000000\",\n    },\n  },\n  typography: {\n    fontSize: 12,\n    fontFamily: \"SoehneMono, Arial\",\n    body1: {\n      fontSize: \"0.8rem\",\n    },\n  },\n  components: {\n    MuiCssBaseline: {\n      styleOverrides: `\n        @font-face {\n          font-family: 'SoehneMono';\n          font-style: normal;\n          font-display: swap;\n          font-weight: 500;\n          src: local('SoehneMonoKraftig'), local('SoehneMonoKraftig'), url(${SoehneMonoKraftigWoff2}) format('woff2');\n        }\n        @font-face {\n          font-family: 'SoehneMono';\n          font-style: normal;\n          font-display: swap;\n          font-weight: 700;\n          src: local('SoehneMonoDreiviertelfett'), local('SoehneMonoDreiviertelfett'), url(${SoehneMonoDreiviertelfettWoff2}) format('woff2');\n        }\n        @font-face {\n          font-family: 'Soehne';\n          font-style: normal;\n          font-display: swap;\n          font-weight: 600;\n          src: local('SoehneHalbfett'), local('SoehneHalbfett'), url(${SoehneHalbfettWoff2}) format('woff2');\n        }\n        @font-face {\n          font-family: 'Soehne';\n          font-style: normal;\n          font-display: swap;\n          font-weight: 300;\n          src: local('SoehneBuch'), local('SoehneBuch'), url(${SoehneBuchWoff2}) format('woff2');\n        }\n      `,\n    },\n  },\n});\n\nexport const EDITOR_THEME = {\n  base: \"vs\" as const,\n  inherit: true,\n  rules: [\n    { token: \"comment\", foreground: \"#1F232A\", fontStyle: \"italic\" },\n    { token: \"keyword\", foreground: QUICKWIT_BLUE },\n  ],\n  colors: {\n    \"editor.comment.foreground\": \"#CBD1DE\",\n    \"editor.foreground\": \"#000000\",\n    
\"editor.background\": QUICKWIT_LIGHT_GREY,\n    \"editorLineNumber.foreground\": \"black\",\n    \"editor.lineHighlightBackground\": \"#DFE0E1\",\n  },\n};\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/utils/urls.ts",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Aggregation, SearchRequest, SortByField, SortOrder } from \"./models\";\n\nexport function hasSearchParams(historySearch: string): boolean {\n  const searchParams = new URLSearchParams(historySearch);\n\n  return (\n    searchParams.has(\"index_id\") ||\n    searchParams.has(\"query\") ||\n    searchParams.has(\"start_timestamp\") ||\n    searchParams.has(\"end_timestamp\")\n  );\n}\n\nexport function parseSearchUrl(historySearch: string): SearchRequest {\n  const searchParams = new URLSearchParams(historySearch);\n  const startTimestampString = searchParams.get(\"start_timestamp\");\n  let startTimestamp = null;\n  const startTimeStampParsedInt = parseInt(startTimestampString || \"\", 10);\n  if (!Number.isNaN(startTimeStampParsedInt)) {\n    startTimestamp = startTimeStampParsedInt;\n  }\n  let endTimestamp = null;\n  const endTimestampString = searchParams.get(\"end_timestamp\");\n  const endTimestampParsedInt = parseInt(endTimestampString || \"\", 10);\n  if (!Number.isNaN(endTimestampParsedInt)) {\n    endTimestamp = endTimestampParsedInt;\n  }\n  let indexId = null;\n  const indexIdParam = searchParams.get(\"index_id\");\n  if (indexIdParam !== null && indexIdParam.length > 0) {\n    indexId = searchParams.get(\"index_id\");\n  }\n  let sortByField = null;\n  const sortByFieldParam = searchParams.get(\"sort_by_field\");\n  if 
(sortByFieldParam !== null) {\n    if (sortByFieldParam.startsWith(\"+\")) {\n      const order: SortOrder = \"Desc\";\n      sortByField = { field_name: sortByFieldParam.substring(1), order: order };\n    } else if (sortByFieldParam.startsWith(\"-\")) {\n      const order: SortOrder = \"Asc\";\n      sortByField = { field_name: sortByFieldParam.substring(1), order: order };\n    } else {\n      const order: SortOrder = \"Desc\";\n      sortByField = { field_name: sortByFieldParam, order: order };\n    }\n  }\n  const aggregationParam = searchParams.get(\"aggregation\");\n  const aggregation = parseAggregation(aggregationParam);\n  return {\n    indexId: indexId,\n    query: searchParams.get(\"query\") || \"\",\n    maxHits: 10,\n    startTimestamp: startTimestamp,\n    endTimestamp: endTimestamp,\n    sortByField: sortByField,\n    aggregation: aggregationParam != null,\n    aggregationConfig: aggregation,\n  };\n}\n\nfunction parseAggregation(param: string | null): Aggregation {\n  const empty: Aggregation = {\n    metric: null,\n    term: null,\n    histogram: null,\n  };\n  if (param !== null) {\n    try {\n      const aggregation: Aggregation = JSON.parse(param);\n      return aggregation;\n    } catch {\n      // ignore malformed param\n    }\n  }\n  return empty;\n}\n\nexport function toUrlSearchRequestParams(\n  request: SearchRequest,\n): URLSearchParams {\n  const params = new URLSearchParams();\n  params.append(\"query\", request.query || \"*\");\n  // We have to set the index ID in the URL params as it is not present in the UI path params.\n  // This lets the React app read the index ID from the URL params\n  // if the user enters the UI URL directly.\n  params.append(\"index_id\", request.indexId || \"\");\n  if (request.maxHits) {\n    params.append(\"max_hits\", request.maxHits.toString());\n  }\n  if (request.startTimestamp) {\n    params.append(\"start_timestamp\", request.startTimestamp.toString());\n  }\n  if (request.endTimestamp) {\n    
params.append(\"end_timestamp\", request.endTimestamp.toString());\n  }\n  if (request.sortByField) {\n    params.append(\"sort_by_field\", serializeSortByField(request.sortByField));\n  }\n  if (request.aggregation) {\n    params.append(\n      \"aggregation\",\n      JSON.stringify(request.aggregationConfig, (_, val) => {\n        if (val == null) {\n          return undefined;\n        } else {\n          return val;\n        }\n      }),\n    );\n  }\n  return params;\n}\n\nexport function serializeSortByField(sortByField: SortByField): string {\n  const order = sortByField.order === \"Desc\" ? \"+\" : \"-\";\n  return `${order}${sortByField.field_name}`;\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/ApiView.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport \"swagger-ui-react/swagger-ui.css\";\nimport SwaggerUI from \"swagger-ui-react\";\nimport {\n  FullBoxContainer,\n  ViewUnderAppBarBox,\n} from \"../components/LayoutUtils\";\n\nfunction ApiView() {\n  return (\n    <ViewUnderAppBarBox>\n      <FullBoxContainer>\n        <SwaggerUI\n          layout=\"BaseLayout\"\n          defaultModelsExpandDepth={-1}\n          url=\"/openapi.json\"\n        />\n      </FullBoxContainer>\n    </ViewUnderAppBarBox>\n  );\n}\n\nexport default ApiView;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/App.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { CssBaseline, ThemeProvider } from \"@mui/material\";\nimport { Navigate, Route, Routes } from \"react-router\";\nimport { FullBoxContainer } from \"../components/LayoutUtils\";\nimport SideBar from \"../components/SideBar\";\nimport TopBar from \"../components/TopBar\";\nimport { LocalStorageProvider } from \"../providers/LocalStorageProvider\";\nimport { theme } from \"../utils/theme\";\nimport ApiView from \"./ApiView\";\nimport ClusterView from \"./ClusterView\";\nimport IndexesView from \"./IndexesView\";\nimport IndexView from \"./IndexView\";\nimport NodeInfoView from \"./NodeInfoView\";\nimport SearchView from \"./SearchView\";\n\nfunction App() {\n  return (\n    <ThemeProvider theme={theme}>\n      <LocalStorageProvider>\n        <FullBoxContainer sx={{ flexDirection: \"row\", p: 0 }}>\n          <CssBaseline />\n          <TopBar />\n          <SideBar />\n          <Routes>\n            <Route path=\"/\" element={<Navigate to=\"/search\" />} />\n            <Route path=\"search\" element={<SearchView />} />\n            <Route path=\"indexes\" element={<IndexesView />} />\n            <Route path=\"indexes/:indexId\" element={<IndexView />} />\n            <Route path=\"cluster\" element={<ClusterView />} />\n            <Route path=\"node-info\" element={<NodeInfoView />} />\n            <Route path=\"api-playground\" 
element={<ApiView />} />\n          </Routes>\n        </FullBoxContainer>\n      </LocalStorageProvider>\n    </ThemeProvider>\n  );\n}\n\nexport default App;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/ClusterView.test.jsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { render, screen, waitFor } from \"@testing-library/react\";\nimport { act } from \"react\";\nimport { Client } from \"../services/client\";\nimport ClusterView from \"./ClusterView\";\n\njest.mock(\"../services/client\");\n\nlet container = null;\nbeforeEach(() => {\n  // setup a DOM element as a render target\n  container = document.createElement(\"div\");\n  document.body.appendChild(container);\n});\n\nafterEach(() => {\n  // cleanup on exiting\n  container.remove();\n  container = null;\n});\n\ntest(\"renders ClusterStateView\", async () => {\n  const clusterState = {\n    state: {\n      seed_addrs: [],\n      node_states: {\n        \"node-green-uCdq/1656700092\": {\n          key_values: {\n            available_services: {\n              value: \"searcher\",\n              version: 3,\n            },\n            grpc_address: {\n              value: \"127.0.0.1:7281\",\n              version: 2,\n            },\n            heartbeat: {\n              value: \"24\",\n              version: 27,\n            },\n          },\n          max_version: 27,\n        },\n      },\n    },\n    live_nodes: [],\n    dead_nodes: [],\n  };\n  Client.prototype.cluster.mockImplementation(() =>\n    Promise.resolve(clusterState),\n  );\n\n  await act(async () => {\n    render(<ClusterView />, container);\n  });\n\n  await waitFor(() =>\n    
expect(screen.getByText(/node-green-uCdq/)).toBeInTheDocument(),\n  );\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/ClusterView.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Typography } from \"@mui/material\";\nimport { useEffect, useMemo, useState } from \"react\";\nimport ApiUrlFooter from \"../components/ApiUrlFooter\";\nimport { JsonEditor } from \"../components/JsonEditor\";\nimport {\n  FullBoxContainer,\n  QBreadcrumbs,\n  ViewUnderAppBarBox,\n} from \"../components/LayoutUtils\";\nimport Loader from \"../components/Loader\";\nimport ErrorResponseDisplay from \"../components/ResponseErrorDisplay\";\nimport { Client } from \"../services/client\";\nimport { Cluster, ResponseError } from \"../utils/models\";\n\nfunction ClusterView() {\n  const [loading, setLoading] = useState(false);\n  const [cluster, setCluster] = useState<null | Cluster>(null);\n  const [responseError, setResponseError] = useState<ResponseError | null>(\n    null,\n  );\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  useEffect(() => {\n    setLoading(true);\n    quickwitClient.cluster().then(\n      (cluster) => {\n        setResponseError(null);\n        setLoading(false);\n        setCluster(cluster);\n      },\n      (error) => {\n        setLoading(false);\n        setResponseError(error);\n      },\n    );\n  }, [quickwitClient]);\n\n  const renderResult = () => {\n    if (responseError !== null) {\n      return ErrorResponseDisplay(responseError);\n    }\n    if (loading || cluster == null) {\n      return 
<Loader />;\n    }\n    return <JsonEditor content={cluster} resizeOnMount={false} />;\n  };\n\n  return (\n    <ViewUnderAppBarBox>\n      <FullBoxContainer>\n        <QBreadcrumbs aria-label=\"breadcrumb\">\n          <Typography color=\"text.primary\">Cluster</Typography>\n        </QBreadcrumbs>\n        <FullBoxContainer sx={{ px: 0 }}>{renderResult()}</FullBoxContainer>\n      </FullBoxContainer>\n      {ApiUrlFooter(\"api/v1/cluster\")}\n    </ViewUnderAppBarBox>\n  );\n}\n\nexport default ClusterView;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/IndexView.test.jsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { render, screen, waitFor } from \"@testing-library/react\";\nimport { act } from \"react\";\nimport { BrowserRouter } from \"react-router\";\nimport { Client } from \"../services/client\";\nimport IndexView from \"./IndexView\";\n\njest.mock(\"../services/client\");\njest.mock(\"react-router\", () => ({\n  ...jest.requireActual(\"react-router\"),\n  useParams: () => ({\n    indexId: \"my-new-fresh-index-id\",\n  }),\n}));\n\ntest(\"renders IndexView\", async () => {\n  const index = {\n    metadata: {\n      index_config: {\n        index_uri: \"my-new-fresh-index-uri\",\n      },\n    },\n    splits: [],\n  };\n  Client.prototype.getIndex.mockImplementation(() => Promise.resolve(index));\n\n  await act(async () => {\n    render(<IndexView />, { wrapper: BrowserRouter });\n  });\n\n  await waitFor(() =>\n    expect(screen.getByText(/my-new-fresh-index-uri/)).toBeInTheDocument(),\n  );\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/IndexView.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { TabContext, TabList, TabPanel } from \"@mui/lab\";\nimport { Box, styled, Tab, Typography } from \"@mui/material\";\nimport Link, { LinkProps } from \"@mui/material/Link\";\nimport React, { useCallback, useEffect, useMemo, useState } from \"react\";\nimport { Link as RouterLink, useParams } from \"react-router\";\nimport ApiUrlFooter from \"../components/ApiUrlFooter\";\nimport { IndexSummary } from \"../components/IndexSummary\";\nimport { JsonEditor } from \"../components/JsonEditor\";\nimport {\n  FullBoxContainer,\n  QBreadcrumbs,\n  ViewUnderAppBarBox,\n} from \"../components/LayoutUtils\";\nimport Loader from \"../components/Loader\";\nimport { Client } from \"../services/client\";\nimport { Index } from \"../utils/models\";\n\nexport type ErrorResult = {\n  error: string;\n};\n\nconst CustomTabPanel = styled(TabPanel)`\n  padding-left: 0;\n  padding-right: 0;\n  height: 100%;\n`;\n\n// NOTE : https://mui.com/material-ui/react-breadcrumbs/#integration-with-react-router\ninterface LinkRouterProps extends LinkProps {\n  to: string;\n  replace?: boolean;\n}\n\nfunction LinkRouter(props: LinkRouterProps) {\n  return <Link {...props} component={RouterLink} />;\n}\n\nfunction IndexView() {\n  const { indexId } = useParams();\n  const [loading, setLoading] = useState(false);\n  const [, setLoadingError] = useState<ErrorResult | 
null>(null);\n  const [tabIndex, setTabIndex] = useState(\"1\");\n  const [index, setIndex] = useState<Index>();\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  const handleTabIndexChange = (_: React.SyntheticEvent, newValue: string) => {\n    setTabIndex(newValue);\n  };\n\n  const fetchIndex = useCallback(() => {\n    setLoading(true);\n    if (indexId === undefined) {\n      console.warn(\"`indexId` should always be set.\");\n      return;\n    } else {\n      quickwitClient.getIndex(indexId).then(\n        (fetchedIndex) => {\n          setLoadingError(null);\n          setLoading(false);\n          setIndex(fetchedIndex);\n        },\n        (error) => {\n          setLoading(false);\n          setLoadingError({ error: error });\n        },\n      );\n    }\n  }, [indexId, quickwitClient]);\n\n  const renderFetchIndexResult = () => {\n    if (loading || index === undefined) {\n      return <Loader />;\n    } else {\n      // TODO: remove this css with magic number `48px`.\n      return (\n        <Box\n          sx={{\n            display: \"flex\",\n            flexDirection: \"column\",\n            height: \"calc(100% - 48px)\",\n          }}\n        >\n          <TabContext value={tabIndex}>\n            <Box sx={{ borderBottom: 1, borderColor: \"divider\" }}>\n              <TabList onChange={handleTabIndexChange} aria-label=\"Index tabs\">\n                <Tab label=\"Summary\" value=\"1\" />\n                <Tab label=\"Sources\" value=\"2\" />\n                <Tab label=\"Doc Mapping\" value=\"3\" />\n                <Tab label=\"Indexing settings\" value=\"4\" />\n                <Tab label=\"Search settings\" value=\"5\" />\n                <Tab label=\"Retention settings\" value=\"6\" />\n                <Tab label=\"Splits\" value=\"7\" />\n              </TabList>\n            </Box>\n            <CustomTabPanel value=\"1\">\n              <IndexSummary index={index} />\n            </CustomTabPanel>\n            
<CustomTabPanel value=\"2\">\n              <JsonEditor\n                content={index.metadata.sources}\n                resizeOnMount={false}\n              />\n            </CustomTabPanel>\n            <CustomTabPanel value=\"3\">\n              <JsonEditor\n                content={index.metadata.index_config.doc_mapping}\n                resizeOnMount={false}\n              />\n            </CustomTabPanel>\n            <CustomTabPanel value=\"4\">\n              <JsonEditor\n                content={index.metadata.index_config.indexing_settings}\n                resizeOnMount={false}\n              />\n            </CustomTabPanel>\n            <CustomTabPanel value=\"5\">\n              <JsonEditor\n                content={index.metadata.index_config.search_settings}\n                resizeOnMount={false}\n              />\n            </CustomTabPanel>\n            <CustomTabPanel value=\"6\">\n              <JsonEditor\n                content={index.metadata.index_config.retention || {}}\n                resizeOnMount={false}\n              />\n            </CustomTabPanel>\n            <CustomTabPanel value=\"7\">\n              <JsonEditor content={index.splits} resizeOnMount={false} />\n            </CustomTabPanel>\n          </TabContext>\n        </Box>\n      );\n    }\n  };\n\n  useEffect(() => {\n    fetchIndex();\n  }, [fetchIndex]);\n\n  return (\n    <ViewUnderAppBarBox>\n      <FullBoxContainer>\n        <QBreadcrumbs aria-label=\"breadcrumb\">\n          <LinkRouter underline=\"hover\" color=\"inherit\" to=\"/indexes\">\n            <Typography color=\"text.primary\">Indexes</Typography>\n          </LinkRouter>\n          <Typography color=\"text.primary\">{indexId}</Typography>\n        </QBreadcrumbs>\n        {renderFetchIndexResult()}\n      </FullBoxContainer>\n      {ApiUrlFooter(\"api/v1/indexes/\" + indexId)}\n    </ViewUnderAppBarBox>\n  );\n}\n\nexport default IndexView;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/IndexesView.test.jsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { render, screen } from \"@testing-library/react\";\nimport { act } from \"react\";\nimport { Client } from \"../services/client\";\nimport IndexesView from \"./IndexesView\";\n\njest.mock(\"../services/client\");\nconst mockedUsedNavigate = jest.fn();\njest.mock(\"react-router\", () => ({\n  ...jest.requireActual(\"react-router\"),\n  useNavigate: () => mockedUsedNavigate,\n}));\n\nlet container = null;\nbeforeEach(() => {\n  // setup a DOM element as a render target\n  container = document.createElement(\"div\");\n  document.body.appendChild(container);\n});\n\nafterEach(() => {\n  // cleanup on exiting\n  container.remove();\n  container = null;\n});\n\ntest(\"renders IndexesView\", async () => {\n  const indexes = [\n    {\n      index_config: {\n        index_id: \"my-new-fresh-index\",\n        index_uri: \"my-uri\",\n        indexing_settings: {\n          timestamp_field: \"timestamp\",\n        },\n        search_settings: {},\n        doc_mapping: {\n          store: false,\n          field_mappings: [],\n          tag_fields: [],\n          dynamic_mapping: false,\n        },\n      },\n      sources: [],\n      create_timestamp: 1000,\n      update_timestamp: 1000,\n    },\n  ];\n  // resolve with the array itself rather than a function returning it\n  Client.prototype.listIndexes.mockResolvedValueOnce(indexes);\n\n  await act(async () => {\n    render(<IndexesView />, container);\n  
});\n\n  expect(\n    screen.getByText(indexes[0].index_config.index_id),\n  ).toBeInTheDocument();\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/IndexesView.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { Box, Typography } from \"@mui/material\";\nimport { useEffect, useMemo, useState } from \"react\";\nimport ApiUrlFooter from \"../components/ApiUrlFooter\";\nimport IndexesTable from \"../components/IndexesTable\";\nimport {\n  FullBoxContainer,\n  QBreadcrumbs,\n  ViewUnderAppBarBox,\n} from \"../components/LayoutUtils\";\nimport Loader from \"../components/Loader\";\nimport ErrorResponseDisplay from \"../components/ResponseErrorDisplay\";\nimport { Client } from \"../services/client\";\nimport { IndexMetadata, ResponseError } from \"../utils/models\";\n\nfunction IndexesView() {\n  const [loading, setLoading] = useState(false);\n  const [responseError, setResponseError] = useState<ResponseError | null>(\n    null,\n  );\n  const [indexesMetadata, setIndexesMetadata] = useState<IndexMetadata[]>();\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  const renderFetchIndexesResult = () => {\n    if (responseError !== null) {\n      return ErrorResponseDisplay(responseError);\n    }\n    if (loading || indexesMetadata === undefined) {\n      return <Loader />;\n    }\n    if (indexesMetadata.length > 0) {\n      return (\n        <FullBoxContainer sx={{ px: 0 }}>\n          <IndexesTable indexesMetadata={indexesMetadata} />\n        </FullBoxContainer>\n      );\n    }\n    return <Box>You have no index registered in your 
metastore.</Box>;\n  };\n\n  useEffect(() => {\n    setLoading(true);\n    quickwitClient.listIndexes().then(\n      (indexesMetadata) => {\n        setResponseError(null);\n        setLoading(false);\n        setIndexesMetadata(indexesMetadata);\n      },\n      (error) => {\n        setLoading(false);\n        setResponseError(error);\n      },\n    );\n  }, [quickwitClient]);\n\n  return (\n    <ViewUnderAppBarBox>\n      <FullBoxContainer>\n        <QBreadcrumbs aria-label=\"breadcrumb\">\n          <Typography color=\"text.primary\">Indexes</Typography>\n        </QBreadcrumbs>\n        {renderFetchIndexesResult()}\n      </FullBoxContainer>\n      {ApiUrlFooter(\"api/v1/indexes\")}\n    </ViewUnderAppBarBox>\n  );\n}\n\nexport default IndexesView;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/NodeInfoView.test.jsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { render, screen, waitFor } from \"@testing-library/react\";\nimport { act } from \"react\";\nimport { Client } from \"../services/client\";\nimport NodeInfoView from \"./NodeInfoView\";\n\njest.mock(\"../services/client\");\n\nlet container = null;\nbeforeEach(() => {\n  // setup a DOM element as a render target\n  container = document.createElement(\"div\");\n  document.body.appendChild(container);\n});\n\nafterEach(() => {\n  // cleanup on exiting\n  container.remove();\n  container = null;\n});\n\ntest(\"renders NodeInfoView\", async () => {\n  const cluster = {\n    cluster_id: \"my cluster id\",\n  };\n  Client.prototype.cluster.mockImplementation(() => Promise.resolve(cluster));\n\n  const config = {\n    node_id: \"my-node-id\",\n  };\n  Client.prototype.config.mockImplementation(() => Promise.resolve(config));\n\n  const buildInfo = {\n    version: \"0.3.2\",\n  };\n  Client.prototype.buildInfo.mockImplementation(() =>\n    Promise.resolve(buildInfo),\n  );\n  await act(async () => {\n    render(<NodeInfoView />, container);\n  });\n\n  await waitFor(() =>\n    expect(screen.getByText(/my-node-id/)).toBeInTheDocument(),\n  );\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/NodeInfoView.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { TabContext, TabList, TabPanel } from \"@mui/lab\";\nimport { Box, styled, Tab, Typography } from \"@mui/material\";\nimport { useEffect, useMemo, useState } from \"react\";\nimport ApiUrlFooter from \"../components/ApiUrlFooter\";\nimport { JsonEditor } from \"../components/JsonEditor\";\nimport {\n  FullBoxContainer,\n  QBreadcrumbs,\n  ViewUnderAppBarBox,\n} from \"../components/LayoutUtils\";\nimport Loader from \"../components/Loader\";\nimport { Client } from \"../services/client\";\nimport { QuickwitBuildInfo } from \"../utils/models\";\n\nconst CustomTabPanel = styled(TabPanel)`\n  padding-left: 0;\n  padding-right: 0;\n  height: 100%;\n`;\n\nfunction NodeInfoView() {\n  const [loadingCounter, setLoadingCounter] = useState(2);\n  const [nodeId, setNodeId] = useState<string>(\"\");\n  const [nodeConfig, setNodeConfig] = useState<null | Record<string, any>>(\n    null,\n  );\n  const [buildInfo, setBuildInfo] = useState<null | QuickwitBuildInfo>(null);\n  const [tabIndex, setTabIndex] = useState(\"1\");\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  const urlByTab: Record<string, string> = {\n    \"1\": \"api/v1/config\",\n    \"2\": \"api/v1/version\",\n  };\n\n  const handleTabIndexChange = (_: React.SyntheticEvent, newValue: string) => {\n    setTabIndex(newValue);\n  };\n\n  useEffect(() => {\n    
quickwitClient.cluster().then(\n      (cluster) => {\n        setNodeId(cluster.node_id);\n      },\n      (error) => {\n        console.log(\"Error when fetching cluster info:\", error);\n      },\n    );\n  });\n  useEffect(() => {\n    setLoadingCounter(2);\n    quickwitClient.buildInfo().then(\n      (fetchedBuildInfo) => {\n        setLoadingCounter((prevCounter) => prevCounter - 1);\n        setBuildInfo(fetchedBuildInfo);\n      },\n      (error) => {\n        setLoadingCounter((prevCounter) => prevCounter - 1);\n        console.log(\"Error when fetching build info: \", error);\n      },\n    );\n    quickwitClient.config().then(\n      (fetchedConfig) => {\n        setLoadingCounter((prevCounter) => prevCounter - 1);\n        setNodeConfig(fetchedConfig);\n      },\n      (error) => {\n        setLoadingCounter((prevCounter) => prevCounter - 1);\n        console.log(\"Error when fetching node config: \", error);\n      },\n    );\n  }, [quickwitClient]);\n\n  const renderResult = () => {\n    if (loadingCounter !== 0) {\n      return <Loader />;\n    } else {\n      return (\n        <FullBoxContainer sx={{ px: 0 }}>\n          <TabContext value={tabIndex}>\n            <Box sx={{ borderBottom: 1, borderColor: \"divider\" }}>\n              <TabList onChange={handleTabIndexChange} aria-label=\"Index tabs\">\n                <Tab label=\"Node config\" value=\"1\" />\n                <Tab label=\"Build info\" value=\"2\" />\n              </TabList>\n            </Box>\n            <CustomTabPanel value=\"1\">\n              <JsonEditor content={nodeConfig} resizeOnMount={false} />\n            </CustomTabPanel>\n            <CustomTabPanel value=\"2\">\n              <JsonEditor content={buildInfo} resizeOnMount={false} />\n            </CustomTabPanel>\n          </TabContext>\n        </FullBoxContainer>\n      );\n    }\n  };\n\n  return (\n    <ViewUnderAppBarBox>\n      <FullBoxContainer>\n        <QBreadcrumbs aria-label=\"breadcrumb\">\n          
<Typography color=\"text.primary\">Node ID: {nodeId} (self)</Typography>\n        </QBreadcrumbs>\n        {renderResult()}\n      </FullBoxContainer>\n      {ApiUrlFooter(urlByTab[tabIndex] || \"\")}\n    </ViewUnderAppBarBox>\n  );\n}\n\nexport default NodeInfoView;\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/SearchView.test.jsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { render, screen, waitFor } from \"@testing-library/react\";\nimport { act } from \"react\";\nimport { Client } from \"../services/client\";\nimport SearchView from \"./SearchView\";\n\njest.mock(\"../services/client\");\nconst mockedUsedNavigate = jest.fn();\njest.mock(\"react-router\", () => ({\n  ...jest.requireActual(\"react-router\"),\n  useLocation: () => ({\n    pathname: \"/search\",\n    search:\n      \"index_id=my-new-fresh-index-idmax_hits=10&start_timestamp=1460554590&end_timestamp=1460554592&sort_by_field=-timestamp\",\n  }),\n  useNavigate: () => mockedUsedNavigate,\n}));\n\nlet container = null;\nbeforeEach(() => {\n  // setup a DOM element as a render target\n  container = document.createElement(\"div\");\n  document.body.appendChild(container);\n});\n\nafterEach(() => {\n  // cleanup on exiting\n  container.remove();\n  container = null;\n});\n\ntest(\"renders SearchView\", async () => {\n  const index = {\n    metadata: {\n      index_config: {\n        index_id: \"my-new-fresh-index-id\",\n        index_uri: \"my-new-fresh-index-uri\",\n        indexing_settings: {},\n        doc_mapping: {\n          field_mappings: [\n            {\n              name: \"timestamp\",\n              type: \"i64\",\n            },\n          ],\n        },\n      },\n    },\n    splits: [],\n  };\n  
Client.prototype.getIndex.mockImplementation(() => Promise.resolve(index));\n  Client.prototype.listIndexes.mockImplementation(() =>\n    Promise.resolve([index.metadata]),\n  );\n\n  const searchResponse = {\n    num_hits: 2,\n    hits: [\n      { body: \"INFO This is an info log\" },\n      { body: \"WARN This is a warn log\" },\n    ],\n    elapsed_time_micros: 10,\n    errors: [],\n  };\n  Client.prototype.search.mockImplementation(() =>\n    Promise.resolve(searchResponse),\n  );\n\n  await act(async () => {\n    render(<SearchView />, container);\n  });\n\n  await waitFor(() =>\n    expect(screen.getByText(/This is an info log/)).toBeInTheDocument(),\n  );\n});\n"
  },
  {
    "path": "quickwit/quickwit-ui/src/views/SearchView.tsx",
    "content": "// Copyright 2021-Present Datadog, Inc.\n//\n// Licensed under the Apache License, Version 2.0 (the \"License\");\n// you may not use this file except in compliance with the License.\n// You may obtain a copy of the License at\n//\n//     http://www.apache.org/licenses/LICENSE-2.0\n//\n// Unless required by applicable law or agreed to in writing, software\n// distributed under the License is distributed on an \"AS IS\" BASIS,\n// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n// See the License for the specific language governing permissions and\n// limitations under the License.\n\nimport { useEffect, useMemo, useRef, useState } from \"react\";\nimport { useLocation, useNavigate } from \"react-router\";\nimport ApiUrlFooter from \"../components/ApiUrlFooter\";\nimport { IndexSideBar } from \"../components/IndexSideBar\";\nimport {\n  FullBoxContainer,\n  ViewUnderAppBarBox,\n} from \"../components/LayoutUtils\";\nimport { QueryEditorActionBar } from \"../components/QueryActionBar\";\nimport { AggregationEditor } from \"../components/QueryEditor/AggregationEditor\";\nimport { QueryEditor } from \"../components/QueryEditor/QueryEditor\";\nimport SearchResult from \"../components/SearchResult/SearchResult\";\nimport { useLocalStorage } from \"../providers/LocalStorageProvider\";\nimport { Client } from \"../services/client\";\nimport {\n  EMPTY_SEARCH_REQUEST,\n  Index,\n  IndexMetadata,\n  ResponseError,\n  SearchRequest,\n  SearchResponse,\n} from \"../utils/models\";\nimport {\n  hasSearchParams,\n  parseSearchUrl,\n  toUrlSearchRequestParams,\n} from \"../utils/urls\";\n\nfunction updateSearchRequestWithIndex(\n  index: Index | null,\n  searchRequest: SearchRequest,\n) {\n  // If we have a timestamp field, order by desc on the timestamp field.\n  if (index?.metadata.index_config.doc_mapping.timestamp_field) {\n    searchRequest.sortByField = {\n      field_name: index?.metadata.index_config.doc_mapping.timestamp_field,\n 
     order: \"Desc\",\n    };\n  } else {\n    searchRequest.sortByField = null;\n    searchRequest.startTimestamp = null;\n    searchRequest.endTimestamp = null;\n  }\n  if (index?.metadata.index_config.index_id) {\n    searchRequest.indexId = index?.metadata.index_config.index_id;\n  }\n}\n\nfunction SearchView() {\n  const location = useLocation();\n  const navigate = useNavigate();\n  const [index, setIndex] = useState<null | Index>(null);\n  const prevIndexIdRef = useRef<string | null>(null);\n  const [searchResponse, setSearchResponse] = useState<null | SearchResponse>(\n    null,\n  );\n  const [searchError, setSearchError] = useState<null | ResponseError>(null);\n  const [queryRunning, setQueryRunning] = useState(false);\n  const [searchRequest, setSearchRequest] = useState<SearchRequest>(\n    hasSearchParams(location.search)\n      ? parseSearchUrl(location.search)\n      : EMPTY_SEARCH_REQUEST,\n  );\n  const updateLastSearchRequest = useLocalStorage().updateLastSearchRequest;\n  const quickwitClient = useMemo(() => new Client(), []);\n\n  const runSearch = (updatedSearchRequest: SearchRequest) => {\n    if (!updatedSearchRequest || !updatedSearchRequest.indexId) {\n      return;\n    }\n\n    console.log(\"Run search...\", updatedSearchRequest);\n    updateSearchRequestWithIndex(index, updatedSearchRequest);\n    setSearchRequest(updatedSearchRequest);\n    setQueryRunning(true);\n    setSearchError(null);\n    navigate(\n      \"/search?\" + toUrlSearchRequestParams(updatedSearchRequest).toString(),\n    );\n    const timestamp_field =\n      index?.metadata.index_config.doc_mapping.timestamp_field || null;\n    quickwitClient.search(updatedSearchRequest, timestamp_field).then(\n      (response) => {\n        updateLastSearchRequest(updatedSearchRequest);\n        setSearchResponse(response);\n        setQueryRunning(false);\n      },\n      (error) => {\n        setQueryRunning(false);\n        setSearchError(error);\n        console.error(\"Error 
when running search request\", error);\n      },\n    );\n  };\n  const onIndexMetadataUpdate = (indexMetadata: IndexMetadata | null) => {\n    setSearchRequest((previousRequest) => {\n      updateSearchRequestWithIndex(index, previousRequest);\n      return {\n        ...previousRequest,\n        indexId:\n          indexMetadata === null ? null : indexMetadata.index_config.index_id,\n      };\n    });\n  };\n  const onSearchRequestUpdate = (searchRequest: SearchRequest) => {\n    setSearchRequest(searchRequest);\n  };\n  useEffect(() => {\n    if (prevIndexIdRef.current !== index?.metadata.index_config.index_id) {\n      setSearchResponse(null);\n    }\n    // Run search only if this is the first time we set the index.\n    if (prevIndexIdRef.current === null) {\n      runSearch(searchRequest);\n    }\n    prevIndexIdRef.current =\n      index === null ? null : index.metadata.index_config.index_id;\n  }, [index]);\n  useEffect(() => {\n    if (!searchRequest.indexId) {\n      return;\n    }\n\n    if (\n      index !== null &&\n      index.metadata.index_config.index_id === searchRequest.indexId\n    ) {\n      return;\n    }\n    // If index id is changing, it's better to reset timestamps as the time unit may be different\n    // between indexes.\n    if (\n      prevIndexIdRef.current !== null &&\n      prevIndexIdRef.current !== index?.metadata.index_config.index_id\n    ) {\n      searchRequest.startTimestamp = null;\n      searchRequest.endTimestamp = null;\n    }\n    quickwitClient.getIndex(searchRequest.indexId).then((fetchedIndex) => {\n      setIndex(fetchedIndex);\n    });\n  }, [searchRequest, quickwitClient, index]);\n\n  const searchParams = toUrlSearchRequestParams(searchRequest);\n  // `toUrlSearchRequestParams` is used for the UI urls. 
We need to remove the `indexId` request parameter to generate\n  // the correct API url, this is the only difference.\n  searchParams.delete(\"index_id\");\n  return (\n    <ViewUnderAppBarBox sx={{ flexDirection: \"row\" }}>\n      <IndexSideBar\n        indexMetadata={index === null ? null : index.metadata}\n        onIndexMetadataUpdate={onIndexMetadataUpdate}\n      />\n      <FullBoxContainer sx={{ padding: 0 }}>\n        <FullBoxContainer>\n          <QueryEditorActionBar\n            searchRequest={searchRequest}\n            onSearchRequestUpdate={onSearchRequestUpdate}\n            runSearch={runSearch}\n            index={index}\n            queryRunning={queryRunning}\n          />\n          <QueryEditor\n            searchRequest={searchRequest}\n            onSearchRequestUpdate={onSearchRequestUpdate}\n            runSearch={runSearch}\n            index={index}\n            queryRunning={queryRunning}\n          />\n          <AggregationEditor\n            searchRequest={searchRequest}\n            onSearchRequestUpdate={onSearchRequestUpdate}\n            runSearch={runSearch}\n            index={index}\n            queryRunning={queryRunning}\n          />\n          <SearchResult\n            queryRunning={queryRunning}\n            searchError={searchError}\n            searchResponse={searchResponse}\n            index={index}\n          />\n        </FullBoxContainer>\n        {index !== null &&\n          ApiUrlFooter(\n            `api/v1/${index?.metadata.index_config.index_id}/search?${searchParams.toString()}`,\n          )}\n      </FullBoxContainer>\n    </ViewUnderAppBarBox>\n  );\n}\n\nexport default SearchView;\n"
  },
  {
    "path": "quickwit/quickwit-ui/tsconfig.json",
    "content": "{\n  \"compilerOptions\": {\n    \"target\": \"ESNext\",\n    \"lib\": [\"dom\", \"dom.iterable\", \"esnext\"],\n    \"allowJs\": true,\n    \"skipLibCheck\": true,\n    \"esModuleInterop\": true,\n    \"allowSyntheticDefaultImports\": true,\n    \"strict\": true,\n    \"noPropertyAccessFromIndexSignature\": true,\n    \"noUncheckedIndexedAccess\": true,\n    \"noUnusedLocals\": true,\n    \"noUnusedParameters\": true,\n    \"forceConsistentCasingInFileNames\": true,\n    \"noFallthroughCasesInSwitch\": true,\n    \"module\": \"esnext\",\n    \"moduleResolution\": \"node\",\n    \"resolveJsonModule\": true,\n    \"isolatedModules\": true,\n    \"noEmit\": true,\n    \"jsx\": \"react-jsx\",\n    \"types\": [\"vite/client\"]\n  },\n  \"exclude\": [\"build\"]\n}\n"
  },
  {
    "path": "quickwit/quickwit-ui/vite.config.ts",
    "content": "import { UserConfig } from \"vite\";\n\nexport default {\n  base: \"/ui\",\n  server: {\n    proxy: {\n      \"/api\": \"http://127.0.0.1:7280\",\n      \"/openapi.json\": \"http://127.0.0.1:7280\",\n    },\n    port: 3000,\n  },\n  build: {\n    rollupOptions: {\n      onwarn(warning, warn) {\n        // Suppress \"use client\" directive warnings from material-ui\n        if (\n          warning.code === \"MODULE_LEVEL_DIRECTIVE\" &&\n          warning.message.includes('\"use client\"')\n        ) {\n          return;\n        }\n        warn(warning);\n      },\n    },\n  },\n} satisfies UserConfig;\n"
  },
  {
    "path": "quickwit/rest-api-tests/Pipfile",
    "content": "[[source]]\nurl = \"https://pypi.org/simple\"\nverify_ssl = true\nname = \"pypi\"\n\n[packages]\nrequests = \"*\"\npyaml = \"*\"\n\n[dev-packages]\n\n[requires]\npython_version = \"3.11\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/README.md",
    "content": "# Rest API tests\n\nThis directory is meant to test quickwit at the Rest API level.\nIt was initially meant to iterate over the elastic search compatibility API,\nbut it can also be used as a convenient way to create integration tests.\n\n# Setting up the Python environment\n\n## Installing Pipenv\n\n```bash\npip install --user pipenv\n```\n\n[Pipenv installation](https://pipenv.pypa.io/en/latest/installation/)\n\n## Installing the dependencies in a virtual environment\n\n```bash\npipenv shell\npipenv install\n```\n\n# Running the tests\n\nThe test script is meant to target `elasticsearch` and `quickwit`.\n\nWhen targeting quickwit, the script expects a fresh quickwit instance\nrunning on `http://localhost:7280`. The data involved is small, and\nrunning in DEBUG mode is fine.\n\n```bash\n./run_tests.py --engine quickwit\n```\n\nWhen targeting elasticsearch, the script expects elastic to be running on\n`http://localhost:9200` (see [compose script](./docker-compose.yaml)).\n\nIn both cases, the test will take care of setting up, ingesting and tearing down the\nindexes involved.\n\n```./run_tests.py --engine elasticsearch```\n\n# Writing a new test suite\n\nWriting a new test suite only requires to create a new subdirectory somewhere in the scenarii/` tree.\nThe test script recursively browse the directories and executes some setup / teardown operation.\n\n## setup\n\nSetup consists in two things. 
First, a context is built by loading and merging the content of the files `_ctx.yaml` and `_ctx.<engine>.yaml`.\nThis context is used to prepopulate the dictionary of every step.\n\nThe engine-specific context is useful when all steps target a specific endpoint or a specific method.\n\nSecond, once the context is loaded, the steps described in `_setup.yaml` and `_setup.<engine>.yaml` (if present) are executed.\n\n## teardown\n\nSymmetrically, the steps described in `_teardown.yaml` and `_teardown.<engine>.yaml` (if present) are executed once all the tests of the directory have run.\n\nSetup and teardown steps are just like any other steps, except that they are guaranteed to be executed respectively before and after all other steps.\nIn particular, when targeting one specific test using the `--test` flag,\nthe necessary `setup` and `teardown` scripts are automatically executed.\n\nBetween setup and teardown, the script executes the tests described in the other `.yaml` files, in lexicographical order.\nA single file can contain more than one test, separated by `---`.\n\nHere is an example of a test:\n\n```yaml\n# Query string takes priority over query defined in body\nmethod: [GET, POST]\nparams:\n  # this overrides the query sent in body\n  q: type:PushEvent\n  size: 3\njson:\n  query:\n    term:\n      type:\n        value: \"whatever\"\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 3\"\n```\n\nA test just runs a REST HTTP call and checks that the resulting JSON matches\nsome expectation.\n\n- **method**: gives the list of HTTP methods to test. 
If there is more than one, they will all be tested.\n- **params**: describes the parameters that should be sent as query strings.\n- **json**: describes the JSON body sent with the query.\n- **expected**: describes the expectation.\n\n# Expectations\n\nThe expectation is an object that mirrors the structure of the response.\nIt does not need to contain its entire tree.\n\nFor instance, given the following JSON object:\n```json\n{\"name\": \"Droopy\", \"age\": 31}\n```\n\nIt is possible to test for the name part only by using the following expectation:\n```yaml\n# ...\nexpected:\n  name: Droopy\n```\n\nSometimes, it might be cumbersome or even impossible to check a result against a value.\nIn that case, it is possible to express the condition as a Python expression, by using the reserved keyword `$expect`.\n\nFor example, the following checks that the age is greater than 30:\n```yaml\n# ...\nexpected:\n  age:\n    $expect: \"val > 30\"\n```\n\nNote that the value of the node (here `31`) is injected as a variable `val` in the expression.\n"
  },
  {
    "path": "quickwit/rest-api-tests/docker-compose.yaml",
    "content": "# This docker-compose file is useful to start up\n# a single node elasticsearch to test our rest api tests\n# against.\nversion: \"3.7\"\n\nservices:\n  elasticsearch:\n    image: docker.elastic.co/elasticsearch/elasticsearch:8.9.0\n    container_name: elasticsearch\n    environment:\n      - xpack.security.enabled=false\n      - discovery.type=single-node\n      - \"ES_JAVA_OPTS=-Xms512m -Xmx512m\"\n    ports:\n      - 9200:9200\n      - 9300:9300\n    # If you see elasticsearch lacking disk space, you can mount a local directory\n    # as follows like this.\n    #volumes:\n    #  - /Users/fulmicoton/git/quickwit/quickwit/rest-api-tests/esdata:/usr/share/elasticsearch/data\n"
  },
  {
    "path": "quickwit/rest-api-tests/run_tests.py",
    "content": "#!/usr/bin/env python3\n\nimport copy\nimport glob\nimport gzip\nimport http\nimport json\nimport os\nimport requests\nimport random\nimport shutil\nimport subprocess\nimport sys\nimport tempfile\nimport time\nimport yaml\n\nfrom os import mkdir\nfrom os import path as osp\n\n# Simple !include constructor for YAML to allow reusing fragments across files.\n# Usage examples:\n#   - !include path/to/file.yaml                -> includes full file content\n#   - !include path/to/file.yaml::doc_mapping   -> includes the 'doc_mapping' key\n#   - !include path/to/file.yaml::a.b.c         -> includes nested key a -> b -> c\ndef _yaml_include(loader, node):\n    value = loader.construct_scalar(node)\n    if \"::\" in value:\n        filepath, subpath = value.split(\"::\", 1)\n    else:\n        filepath, subpath = value, None\n    with open(filepath, \"r\") as f:\n        included = yaml.load(f, Loader=yaml.Loader)\n    if subpath:\n        cur = included\n        for seg in filter(None, subpath.split(\".\")):\n            if not isinstance(cur, dict) or seg not in cur:\n                raise KeyError(f\"!include path '{subpath}' not found in {filepath}\")\n            cur = cur[seg]\n        return cur\n    return included\n\n# Register the constructor on the default Loader used by this script.\nyaml.Loader.add_constructor(\"!include\", _yaml_include)\n\ndef debug_http():\n    old_send = http.client.HTTPConnection.send\n    def new_send(self, data):\n        print(f'{\"-\"*9} BEGIN REQUEST {\"-\"*9}')\n        if len(data) > 500:\n            print(\"Data too big\")\n            print(data[:500])\n        else:\n            print(data.decode('utf-8').strip())\n        print(f'{\"-\"*10} END REQUEST {\"-\"*10}')\n        return old_send(self, data)\n    http.client.HTTPConnection.send = new_send\n\ndef open_scenario(scenario_filepath):\n    data = open(scenario_filepath).read()\n    steps_data = data.split(\"\\n---\")\n    for step_data in steps_data:\n       
 step_data  = step_data.strip()\n        if step_data == \"\":\n            continue\n        step_dict = yaml.load(step_data, Loader=yaml.Loader)\n        if type(step_dict) == dict:\n            yield step_dict\n\ndef run_step(step, previous_result):\n    result = {}\n    if \"method\" in step:\n        methods = step[\"method\"]\n        if type(methods) != list:\n            methods = [methods]\n        for method in methods:\n            result = run_request_step(method, step, previous_result)\n    if \"sleep_after\" in step:\n        time.sleep(step[\"sleep_after\"])\n    return result\n\ndef run_request_with_retry(run_req, expected_status_code=None, num_retries=10, wait_time=0.5):\n    for try_number in range(num_retries + 1):\n        r = run_req()\n        if expected_status_code is None or r.status_code == expected_status_code:\n            return r\n        print(\"Failed with\", r.text, r.status_code)\n        if try_number < num_retries:\n            print(\"Retrying...\")\n            time.sleep(wait_time)\n    raise Exception(\"Wrong status code. 
Got %s, expected %s, url %s\" % (r.status_code, expected_status_code, r.url))\n\n\ndef resolve_previous_result(c, previous_result):\n    if type(c) == dict:\n        result = {}\n        if len(c) == 1 and \"$previous\" in c:\n            return eval(c[\"$previous\"], None, {\"val\": previous_result})\n        for (k, v) in c.items():\n            result[k] = resolve_previous_result(v, previous_result)\n        return result\n    if type(c) == list:\n        return [\n            resolve_previous_result(v, previous_result)\n            for v in c\n        ]\n    return c\n\ndef run_request_step(method, step, previous_result):\n    assert method in {\"GET\", \"POST\", \"PUT\", \"DELETE\"}\n    if \"headers\" not in step:\n        step[\"headers\"] = {'user-agent': 'my-app/0.0.1'}\n    method_req = getattr(requests, method.lower())\n    endpoint = step.get(\"endpoint\", \"\")\n    url = \"{}/{}\".format(step[\"api_root\"].rstrip('/'), endpoint.lstrip('/'))\n    kvargs = {\n        k: v\n        for k, v in step.items()\n        if k in {\"params\", \"data\", \"json\", \"headers\"}\n    }\n    body_from_file = step.get(\"body_from_file\", None)\n    if body_from_file is not None:\n        body_from_file = osp.join(step[\"cwd\"], body_from_file)\n        # Read the request body and close the file handle right away.\n        with open(body_from_file, 'rb') as body_file:\n            kvargs[\"data\"] = body_file.read()\n\n    kvargs = resolve_previous_result(kvargs, previous_result)\n    shuffle_ndjson = step.get(\"shuffle_ndjson\", None)\n    if shuffle_ndjson is not None:\n        docs_per_split = distribute_items(shuffle_ndjson, step.get(\"min_splits\", 1), step.get(\"max_splits\", 5), step.get(\"seed\", None))\n\n        for bucket in docs_per_split:\n            new_step = copy.deepcopy(step)\n            del new_step[\"shuffle_ndjson\"]\n            new_step[\"ndjson\"] = bucket\n            run_request_step(method, new_step, previous_result)\n        return\n    ndjson = step.get(\"ndjson\", None)\n    if ndjson is not None:\n        # Add a newline at 
the end to please elasticsearch -> \"The bulk request must be terminated by a newline [\\\\n]\".\n        kvargs[\"data\"] = \"\\n\".join([json.dumps(doc) for doc in ndjson]) + \"\\n\"\n        kvargs.setdefault(\"headers\")[\"Content-Type\"] = \"application/json\"\n    expected_status_code = step.get(\"status_code\", 200)\n    debug = step.get(\"debug\", False)\n    num_retries = step.get(\"num_retries\", 0)\n    run_req = lambda : method_req(url, **kvargs)\n    r = run_request_with_retry(run_req, expected_status_code, num_retries)\n    expected_resp = step.get(\"expected\", None)\n    json_resp = r.json()\n    if debug:\n        print(expected_status_code)\n        print(json_resp)\n    if expected_resp is not None:\n        try:\n            check_result(json_resp, expected_resp, context_path=\"\")\n        except Exception as e:\n            print(json.dumps(json_resp, indent=2))\n            raise e\n    return json_resp\n\ndef distribute_items(items, min_buckets, max_buckets, seed=None):\n    if seed is None:\n        seed = random.randint(0, 10000)\n    random.seed(seed)\n    \n    # Determine the number of buckets\n    num_buckets = random.randint(min_buckets, max_buckets)\n    \n    # Initialize empty buckets\n    buckets = [[] for _ in range(num_buckets)]\n    \n    # Distribute items randomly into buckets\n    for item in items:\n        random_bucket = random.randint(0, num_buckets - 1)\n        buckets[random_bucket].append(item)\n    \n    # Print the seed for reproducibility\n    print(f\"Seed: {seed}\")\n    \n    return buckets\n\ndef check_result(result, expected, context_path = \"\"):\n    if type(expected) == dict and \"$expect\" in expected:\n        expectations = expected[\"$expect\"]\n        if type(expectations) == str:\n            expectations = [expectations]\n        for expectation in expectations:\n            if not eval(expectation, None, {\"val\": result}):\n                print(result)\n                raise Exception(\"Failed 
to meet expectation %s at %s\" % (expectation, context_path))\n        return\n    if type(result) != type(expected):\n        raise Exception(\"Wrong type at context %s. Got %s, Expected %s\" % (context_path, type(result), type(expected)))\n    elif type(result) == dict:\n        check_result_dict(result, expected, context_path)\n    elif type(result) == list:\n        check_result_list(result, expected, context_path)\n    elif result != expected:\n        raise Exception(\"Expected %s at context %s, got %s\" % (expected, context_path, result))\n\ndef check_result_list(result, expected, context_path=\"\"):\n    if len(result) != len(expected):\n        if len(expected) != 0:\n            # get keys from the expected dicts and filter result to print only the keys that are in the expected dicts\n            expected_keys = set().union(*expected)\n            filtered_result = [{k: v for k, v in d.items() if k in expected_keys} for d in result]\n            # Check if the length differs by more than five\n            if abs(len(filtered_result) - len(expected)) > 5:\n                # Show only the first 5 elements followed by ellipsis if there are more\n                display_filtered_result = filtered_result[:5] + ['...'] if len(filtered_result) > 5 else filtered_result\n            else:\n                display_filtered_result = filtered_result\n            raise Exception(\"Wrong length at context %s. Expected: %s Received: %s,\\n Expected \\n%s \\n Received \\n%s\" % (context_path, len(expected), len(result), expected, display_filtered_result))\n        raise Exception(\"Wrong length at context %s. 
Expected: %s Received: %s\" % (context_path, len(expected), len(result)))\n    for (i, (left, right)) in enumerate(zip(result, expected)):\n        check_result(left, right, context_path + \"[%s]\" % i)\n\ndef check_result_dict(result, expected, context_path=\"\"):\n    for key, value in expected.items():\n        try:\n            child = result[key]\n        except KeyError:\n            raise Exception(\"Missing key `%s` at context %s\" % (key, context_path))\n        check_result(child, value, context_path + \".\" + key)\n\nclass PathTree:\n    def __init__(self):\n        self.children = {}\n        self.scripts = []\n\n    def add_child(self, seg):\n        child = self.children.get(seg, None)\n        if child is None:\n            self.children[seg] = PathTree()\n        return self.children[seg]\n\n    def add_script(self, script):\n        self.scripts.append(script)\n\n    def add_path(self, path):\n        path_segs = path.split(\"/\")\n        if path_segs[-1].startswith(\"_\"):\n            return\n        path_tree = self\n        for path_seg in path_segs[:-1]:\n            path_tree = path_tree.add_child(path_seg)\n        path_tree.add_script(path_segs[-1])\n\n    def visit_nodes(self, visitor, path=[]):\n        success = True\n        success &= visitor.enter_directory(path)\n        for script in self.scripts:\n            success &= visitor.run_scenario(path, script)\n        for k in sorted(self.children.keys()):\n            child_path = path + [k]\n            success &= self.children[k].visit_nodes(visitor, child_path)\n        success &= visitor.exit_directory(path)\n        return success\n\n# Returns a new dictionary without modifying the arguments.\n# The new dictionary is the result of merging the two dictionaries\n# in that order:\n# The second dictionary may shadow/override the keys of the first dictionar\ndef stack_dicts(context, overriding):\n    context = context.copy()\n    context.update(overriding)\n    return context\n\nclass 
Visitor:\n    def __init__(self, engine):\n        self.engine = engine\n        self.context_stack = []\n        self.context = {}\n    def run_setup_teardown_scripts(self, script_name, path):\n        cwd = \"/\".join(path)\n        success = True\n        for file_name in [script_name + \".yaml\", script_name + \".\" + self.engine + \".yaml\"]:\n            script_fullpath = cwd + \"/\" + file_name\n            if osp.exists(script_fullpath):\n                success &= self.run_scenario(path, file_name)\n        return success\n    def load_context(self, path):\n        context = {\"cwd\": \"/\".join(path)}\n        for file_name in [\"_ctx.yaml\", \"_ctx.\" + self.engine + \".yaml\"]:\n            ctx_filepath = \"/\".join(path + [file_name])\n            if osp.exists(ctx_filepath):\n                ctx = yaml.load(open(ctx_filepath), Loader=yaml.Loader)\n                context.update(ctx)\n        self.context_stack.append(context)\n        self.context.update(context)\n    def enter_directory(self, path):\n        print(\"============\")\n        self.load_context(path)\n        return self.run_setup_teardown_scripts(\"_setup\", path)\n    def exit_directory(self, path):\n        success = self.run_setup_teardown_scripts(\"_teardown\", path)\n        self.context_stack.pop()\n        self.context = {}\n        for ctx in self.context_stack:\n            self.context.update(ctx)\n        return success\n    def run_scenario(self, path, script):\n        scenario_path = \"/\".join(path + [script])\n        steps = list(open_scenario(scenario_path))\n        num_steps_executed = 0\n        num_steps_skipped = 0\n        previous_result = {}\n        for (i, step) in enumerate(steps, 1):\n            step = stack_dicts(self.context, step)\n            applicable_engine = step.get(\"engines\", None)\n            if applicable_engine is not None:\n                if self.engine not in applicable_engine:\n                    num_steps_skipped += 1\n               
     continue\n            try:\n                previous_result = run_step(step, previous_result)\n                num_steps_executed += 1\n            except Exception as e:\n                print(\"🔴 %s\" % scenario_path)\n                print(f\"Failed at step '{step['desc']}'\" if 'desc' in step else f\"Failed at step {i}\")\n                print(step)\n                print(e)\n                print(\"--------------\")\n                return False\n        else:\n            print(\"🟢 %s: %d steps (%d skipped)\" % (scenario_path, num_steps_executed, num_steps_skipped))\n        return True\n\ndef build_path_tree(paths):\n    paths.sort()\n    path_tree = PathTree()\n    for path in paths:\n        path_tree.add_path(path)\n    return path_tree\n\ndef run(scenario_paths, engine):\n    path_tree = build_path_tree(scenario_paths)\n    visitor = Visitor(engine=engine)\n    return path_tree.visit_nodes(visitor)\n\ndef filter_test(prefixes, test_name):\n    for prefix in prefixes:\n        if test_name.startswith(prefix):\n            return True\n    return False\n\ndef filter_tests(prefixes, test_names):\n    print(\"Filtering tests prefixes: %s\" % prefixes)\n    if prefixes is None or len(prefixes) == 0:\n        return test_names\n    return [\n        test_name\n        for test_name in test_names\n        if filter_test(prefixes, test_name)\n    ]\n\nclass QuickwitRunner:\n    def __init__(self, quickwit_bin_path):\n        self.quickwit_dir = tempfile.TemporaryDirectory()\n        print('created temporary directory', self.quickwit_dir, self.quickwit_dir.name)\n        qwdata = osp.join(self.quickwit_dir.name, \"qwdata\")\n        config = osp.join(self.quickwit_dir.name, \"config\")\n        mkdir(qwdata)\n        mkdir(config)\n        shutil.copy(\"../../config/quickwit.yaml\", config)\n        shutil.copy(quickwit_bin_path, self.quickwit_dir.name)\n        self.proc = subprocess.Popen([\"./quickwit\", \"run\"], stdout=subprocess.DEVNULL, 
stderr=subprocess.DEVNULL, cwd=self.quickwit_dir.name)\n        for i in range(100):\n            try:\n                print(\"Checking on quickwit\")\n                res = requests.get(\"http://localhost:7280/health/readyz\")\n                if res.status_code == 200 and res.text.strip() == \"true\":\n                    print(\"Quickwit started\")\n                    time.sleep(6)\n                    break\n            except:\n                pass\n            print(\"Server not ready yet. Sleep and retry...\")\n            time.sleep(1)\n        else:\n            print(\"Quickwit never started. Exiting.\")\n            sys.exit(2)\n    def __del__(self):\n        print(\"Killing Quickwit\")\n        subprocess.Popen.kill(self.proc)\n\ndef main():\n    import argparse\n    arg_parser = argparse.ArgumentParser(\n        prog=\"rest-api-test\",\n        description=\"Runs a set of calls against a REST API and checks for conditions over the results.\"\n    )\n    arg_parser.add_argument(\"--engine\", help=\"Targeted engine (elastic/quickwit).\", default=\"quickwit\")\n    arg_parser.add_argument(\"--test\", help=\"Specific prefix to select the tests to run. 
If not specified, all tests are run.\", nargs=\"*\")\n    arg_parser.add_argument(\"--binary\", help=\"Specify the quickwit binary to run.\", nargs=\"?\")\n    parsed_args = arg_parser.parse_args()\n\n    print(parsed_args)\n\n    quickwit_process = None\n    if parsed_args.binary is not None:\n        if parsed_args.engine != \"quickwit\":\n            print(\"The --binary option is only supported for the quickwit engine.\")\n            sys.exit(3)\n        binary = parsed_args.binary\n        quickwit_process = QuickwitRunner(binary)\n    quickwit_process\n\n    scenario_filepaths = glob.glob(\"scenarii/**/*.yaml\", recursive=True)\n    scenario_filepaths = list(filter_tests(parsed_args.test, scenario_filepaths))\n    return run(scenario_filepaths, engine=parsed_args.engine)\n\nif __name__ == \"__main__\":\n    import sys\n    if main():\n        sys.exit(0)\n    else:\n        sys.exit(1)\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/aggregations/0001-aggregations.yaml",
    "content": "# Test date histogram aggregation\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    date_histo:\n      date_histogram:\n        field: \"date\"\n        fixed_interval: \"30d\"\n        offset: \"-4d\"\nexpected:\n  aggregations:\n    date_histo:\n      buckets:\n        -  { \"doc_count\": 5, \"key\": 1420070400000.0, \"key_as_string\": \"2015-01-01T00:00:00Z\" }\n        -  { \"doc_count\": 2, \"key\": 1422662400000.0, \"key_as_string\": \"2015-01-31T00:00:00Z\" }\n---\n# Test date histogram with extended bounds\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    date_histo:\n      date_histogram:\n        field: \"date\"\n        fixed_interval: \"30d\"\n        offset: \"-4d\"\n        extended_bounds:\n          min: 1420070400000\n          max: 1425254400000\nexpected:\n  aggregations:\n    date_histo:\n      buckets:\n        -  { \"doc_count\": 5, \"key\": 1420070400000.0, \"key_as_string\": \"2015-01-01T00:00:00Z\" }\n        -  { \"doc_count\": 2, \"key\": 1422662400000.0, \"key_as_string\": \"2015-01-31T00:00:00Z\" }\n        -  { \"doc_count\": 0, \"key\": 1425254400000.0, \"key_as_string\": \"2015-03-02T00:00:00Z\" }\n---\n# Test date histogram aggregation and sub-aggregation \nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    date_histo: \n      date_histogram: \n        field: \"date\"\n        fixed_interval: \"30d\"\n        offset: \"-4d\"\n      aggs:\n        response:\n          stats:\n            field: response\nexpected:\n  aggregations:\n    date_histo:\n      buckets:\n        -  { \"doc_count\": 5, \"key\": 1420070400000.0, \"key_as_string\": \"2015-01-01T00:00:00Z\", \"response\": { \"avg\": 85.0, \"count\": 4, \"max\": 120.0, \"min\": 20.0, \"sum\": 340.0 } }\n        -  { \"doc_count\": 
2, \"key\": 1422662400000.0, \"key_as_string\": \"2015-01-31T00:00:00Z\", \"response\": { \"avg\": 80.0, \"count\": 2, \"max\": 130.0, \"min\": 30.0, \"sum\": 160.0 }  }\n--- \n# Test date histogram aggregation + exists and sub-aggregation \nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query:\n    bool:\n      must:\n        - exists:\n            field: response\n  aggs:\n    date_histo: \n      date_histogram: \n        field: \"date\"\n        fixed_interval: \"30d\"\n        offset: \"-4d\"\n      aggs:\n        response:\n          stats:\n            field: response\nexpected:\n  aggregations:\n    date_histo:\n      buckets:\n        -  { \"doc_count\": 4, \"key\": 1420070400000.0, \"key_as_string\": \"2015-01-01T00:00:00Z\", \"response\": { \"avg\": 85.0, \"count\": 4, \"max\": 120.0, \"min\": 20.0, \"sum\": 340.0 } }\n        -  { \"doc_count\": 2, \"key\": 1422662400000.0, \"key_as_string\": \"2015-01-31T00:00:00Z\", \"response\": { \"avg\": 80.0, \"count\": 2, \"max\": 130.0, \"min\": 30.0, \"sum\": 160.0 }  }\n--- \n# Test range aggregation\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    my_range:\n      range: \n        field: response\n        ranges:\n        - { to: 50, key: fast }\n        - { from: 50, to: 80, key: medium }\n        - { from: 80, key: slow }\nexpected:\n  aggregations:\n    my_range:\n      buckets:\n        - { \"doc_count\": 5, \"key\": \"fast\", \"to\": 50.0 }\n        - { \"doc_count\": 0, \"from\": 50.0, \"key\": \"medium\", \"to\": 80.0 }\n        - { \"doc_count\": 4, \"from\": 80.0, \"key\": \"slow\" }\n--- \n# Test term aggs\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    hosts: \n      terms: \n        field: \"host\"\n    tags: \n      terms: \n        field: \"tags\"\nexpected:\n  aggregations:\n    hosts:\n      
buckets:\n      - doc_count: 4\n        key: 192.168.0.10\n      - doc_count: 2\n        key: 192.168.0.1\n      - doc_count: 1\n        key: 192.168.0.11\n      doc_count_error_upper_bound: 0\n      sum_other_doc_count: 0\n    tags:\n      buckets:\n      - doc_count: 5\n        key: nice\n      - doc_count: 2\n        key: cool\n      doc_count_error_upper_bound: 0\n      sum_other_doc_count: 0\n--- \n# Test term aggs with split_size\n# We set split_size to 1, so one document with name \"Fritz\" will be missing from one split.\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    names: \n      terms: \n        field: \"name\"\n        size: 1\n        split_size: 1\nexpected:\n  aggregations:\n    names:\n      buckets:\n        # There are 3 documents with name \"Fritz\" but we only get 2. One does not get passed to the \n        # root node, because it gets cut off due to the split_size parameter set to 1.\n        # We also get doc_count_error_upper_bound: 2, which signals that the result is approximate.\n      - doc_count: 2 \n        key: \"Fritz\"\n      sum_other_doc_count: 8\n      doc_count_error_upper_bound: 2\n--- \n# Test term aggs with shard_size\n# segment_size is an alias to shard_size\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    names: \n      terms: \n        field: \"name\"\n        size: 1\n        segment_size: 1\nexpected:\n  aggregations:\n    names:\n      buckets:\n        # There are 3 documents with name \"Fritz\" but we only get 2. 
One does not get passed to the \n        # root node, because it gets cut off due to the segment_size parameter set to 1.\n        # We also get doc_count_error_upper_bound: 2, which signals that the result is approximate.\n      - doc_count: 2 \n        key: \"Fritz\"\n      sum_other_doc_count: 8\n      doc_count_error_upper_bound: 2\n--- \n# Test term aggs with shard_size\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    names: \n      terms: \n        field: \"name\"\n        size: 1\n        shard_size: 1\nexpected:\n  aggregations:\n    names:\n      buckets:\n        # There are 3 documents with name \"Fritz\" but we only get 2. One does not get passed to the \n        # root node, because it gets cut off due to the shard_size parameter set to 1.\n        # We also get doc_count_error_upper_bound: 2, which signals that the result is approximate.\n      - doc_count: 2 \n        key: \"Fritz\"\n      sum_other_doc_count: 8\n      doc_count_error_upper_bound: 2\n---\n# Test term aggs with split_size\n# Here we increase split_size to 5, so we will get the 3 documents with name \"Fritz\"\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    names: \n      terms: \n        field: \"name\"\n        size: 1\n        split_size: 5\nexpected:\n  aggregations:\n    names:\n      buckets:\n        # We get all 3 documents with name \"Fritz\"\n        # We also get doc_count_error_upper_bound: 0, so the result is exact.\n      - doc_count: 3 \n        key: \"Fritz\"\n      sum_other_doc_count: 7\n      doc_count_error_upper_bound: 0\n--- \n# Test date histogram + percentiles sub-aggregation\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    metrics:\n      date_histogram:\n        field: date\n        fixed_interval: 30d\n        offset: 
\"-4d\"\n      aggs:\n        response:\n          percentiles:\n            field: response\n            percents:\n            - 85\n            keyed: false\nexpected:\n  aggregations:\n    metrics:\n      buckets:\n      - doc_count: 5\n        key: 1420070400000.0\n        key_as_string: '2015-01-01T00:00:00Z'\n        response:\n          values:\n          - key: 85.0\n            value: 100.49456770856702\n      - doc_count: 2\n        key: 1422662400000.0\n        key_as_string: '2015-01-31T00:00:00Z'\n        response:\n          values:\n          - key: 85.0\n            value: 30.26717133872237\n--- \n# Test histogram\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    metrics:\n      histogram:\n        field: response\n        interval: 50\nexpected:\n  aggregations:\n    metrics:\n      buckets:\n      - doc_count: 5\n        key: 0.0\n      - doc_count: 0\n        key: 50.0\n      - doc_count: 4\n        key: 100.0\n\n--- \n# Test histogram empty result on empty index\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/empty_aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    metrics:\n      histogram:\n        field: response\n        interval: 50\nexpected:\n  aggregations:\n    metrics:\n      buckets: []\n---\n# Test cardinality aggregation\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    unique_names:\n      cardinality:\n        field: \"name\"\n    unique_response:\n      cardinality:\n        field: \"response\"\n    unique_dates:\n      cardinality:\n        field: \"date\"\nexpected:\n  aggregations:\n    unique_names:\n      value: 8.0\n    unique_response:\n      value: 5.0 # TODO: Check. 
The correct number is 6\n    unique_dates:\n      value: 6.0 \n---\n# Test extended stats aggregation\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    response_stats:\n      extended_stats:\n        field: \"response\"\nexpected:\n  aggregations:\n    response_stats:\n      sum_of_squares: 55300.0\n---\n# Test term aggs number precision\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  size: 0\n  aggs:\n    names: \n      terms: \n        field: \"high_prec_test\"\nexpected:\n  aggregations:\n    names:\n      buckets:\n      - doc_count: 1 \n        key: 1769070189829214200\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/aggregations/0002-doc-len.yaml",
    "content": "# Test summing doc len\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query: { match_all: {} }\n  aggs:\n    doc_len:\n      sum:\n        field: \"_doc_length\"\nexpected:\n  aggregations:\n    doc_len:\n      value: 952.0\n---\n# Test doc len isn't shown when querying documents\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/aggregations/_search\njson:\n  query:\n    term:\n      id:\n        value: 1\nexpected:\n  hits:\n    hits:\n      - _source:\n          $expect: \"not '_doc_length' in val\"\n---\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/aggregations/_ctx.yaml",
    "content": "method: GET\nengines: [\"quickwit\"]\napi_root: \"http://localhost:7280/api/v1/\"\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/aggregations/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/aggregations\nstatus_code: null\n---\nmethod: DELETE\nendpoint: indexes/empty_aggregations\nstatus_code: null\n---\n# Create index\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.8\"\n  index_id: aggregations\n  doc_mapping:\n    mode: dynamic\n    dynamic_mapping:\n      tokenizer: default\n      fast: true\n    field_mappings:\n      - name: date\n        type: datetime\n        input_formats:\n          - rfc3339\n        fast_precision: seconds\n        fast: true\n      - name: high_prec_test\n        type: u64\n        fast: true\n    store_document_size: true\n---\n# Create empty index\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.8\"\n  index_id: empty_aggregations\n  doc_mapping:\n    mode: dynamic\n    dynamic_mapping:\n      tokenizer: default\n      fast: true\n    field_mappings:\n      - name: date\n        type: datetime\n        input_formats:\n          - rfc3339\n        fast_precision: seconds\n        fast: true\n---\n# Ingest documents\nmethod: POST\nendpoint: aggregations/ingest\nparams:\n  commit: force\nndjson:\n  - {\"name\": \"Albert\", \"response\": 100, \"id\": 1, \"date\": \"2015-01-01T12:10:30Z\", \"host\": \"192.168.0.10\", \"tags\": [\"nice\"]}\n  - {\"name\": \"Fred\", \"response\": 100, \"id\": 3, \"date\": \"2015-01-01T12:10:30Z\", \"host\": \"192.168.0.1\", \"tags\": [\"nice\"]}\n  - {\"name\": \"Manfred\", \"response\": 120, \"id\": 13, \"date\": \"2015-01-11T12:10:30Z\", \"host\": \"192.168.0.11\", \"tags\": [\"nice\"]}\n  - {\"name\": \"Horst\", \"id\": 2, \"date\": \"2015-01-01T11:11:30Z\", \"host\": \"192.168.0.10\", \"tags\": [\"nice\", \"cool\"]}\n  - {\"name\": \"Fritz\", \"response\": 30, \"id\": 5, \"host\": \"192.168.0.1\", \"tags\": [\"nice\", \"cool\"]}\n---\n# Ingest documents split #2\nmethod: POST\nendpoint: aggregations/ingest\nparams:\n  commit: force\nndjson:\n  - {\"name\": \"Fritz\", \"high_prec_test\": 
1769070189829214200, \"response\": 30, \"id\": 0}\n  - {\"name\": \"Fritz\", \"response\": 30, \"id\": 0}\n  - {\"name\": \"Holger\", \"response\": 30, \"id\": 4, \"date\": \"2015-02-06T00:00:00Z\", \"host\": \"192.168.0.10\"}\n  - {\"name\": \"Werner\", \"response\": 20, \"id\": 5, \"date\": \"2015-01-02T00:00:00Z\", \"host\": \"192.168.0.10\"}\n  - {\"name\": \"Bernhard\", \"response\": 130, \"id\": 14, \"date\": \"2015-02-16T00:00:00Z\"}\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/aggregations/_teardown.quickwit.yaml",
    "content": "# Delete index\nmethod: DELETE\nendpoint: indexes/aggregations\n---\nmethod: DELETE\nendpoint: indexes/empty_aggregations\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/concat_fields/0001_concat_field.yaml",
    "content": "# we use the tokenizer from the concat field, not the underlying field\nendpoint: concat/search\nparams:\n  query: \"concat_raw:AB-CD\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:EF-GH\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:'AB CD'\"\nexpected:\n  num_hits: 0\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:'EF GH'\"\nexpected:\n  num_hits: 0\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:AB\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:GH\"\nexpected:\n  num_hits: 1\n---\n# we find bool both in text and in bool fields\nendpoint: concat/search\nparams:\n  query: \"concat_raw:true\"\nexpected:\n  num_hits: 2\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:true\"\nexpected:\n  num_hits: 2\n---\n# we find numbers both in text and int fields\nendpoint: concat/search\nparams:\n  query: \"concat_raw:42\"\nexpected:\n  num_hits: 1 # only 1 hit, 42 doesn't get tokenized on this field\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:42\"\nexpected:\n  num_hits: 2 # 2 hits, the number, and the tokenized text\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:otherfieldvalue\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:9\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:false\"\nexpected:\n  num_hits: 2 # also include the document with a json field\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:otherfieldvalue OR concat_default:9\"\nexpected:\n  num_hits: 0 # this field doesn't include _dynamic\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:false\"\nexpected:\n  num_hits: 1 # only include the document with a json field\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:10\"\nexpected:\n  num_hits: 
1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:nestedstring\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:10\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:nestedstring\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:1.5\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:2.5\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_default:3.5\"\nexpected:\n  num_hits: 0\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:1.5\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:2.5\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:3.5\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:9223372036854775808\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  query: \"concat_raw:-5\"\nexpected:\n  num_hits: 1\n---\nendpoint: concat/search\nparams:\n  # concat date values are stored as strings to enable some level of range\n  # querying even though they don't support fast fields\n  query: \"concat_raw:\\\"2024-01-01\\\"*\"\nexpected:\n  num_hits: 1\n---\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/concat_fields/_ctx.yaml",
    "content": "method: GET\nengines: [\"quickwit\"]\napi_root: \"http://localhost:7280/api/v1/\"\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/concat_fields/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/concat\nstatus_code: null\n---\n# Create index\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: concat\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: text1\n        type: text\n        tokenizer: default\n      - name: text2\n        type: text\n        tokenizer: raw\n      - name: boolean\n        type: bool\n      - name: int\n        type: u64\n      - name: float\n        type: f64\n      - name: json\n        type: json\n      - name: concat_raw\n        type: concatenate\n        concatenate_fields:\n          - text1\n          - text2\n          - boolean\n          - int\n          - json\n          - float\n        tokenizer: raw\n        include_dynamic_fields: true\n      - name: concat_default\n        type: concatenate\n        concatenate_fields:\n          - text1\n          - text2\n          - boolean\n          - int\n          - json\n          - float\n        tokenizer: default\n    dynamic_mapping:\n      tokenizer: default\n      expand_dots: true\nsleep_after: 3\n---\n# Ingest documents\nmethod: POST\nendpoint: concat/ingest\nnum_retries: 10\nparams:\n  commit: force\nndjson:\n  - {\"text1\": \"AB-CD\", \"text2\": \"EF-GH\"}\n  - {\"text1\": \"true\"}\n  - {\"boolean\": true}\n  - {\"text2\": \"i like 42\"}\n  - {\"int\": 42}\n  - {\"other-field\": \"otherfieldvalue\", \"other-field-number\": 9, \"other-field-bool\": false}\n  - {\"json\": {\"some_bool\": false, \"some_int\": 10, \"nested\": {\"some_string\": \"nestedstring\"}}}\n  - {\"float\": 1.5}\n  - {\"json\": {\"val:\": 2.5, \"date\": \"2024-01-01T00:13:00Z\"}}\n  - {\"other\": 3.5}\n  # too big to be a i64, parsed as a u64\n  - {\"big\": 9223372036854775808}\n  - {\"neg\": -5}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/concat_fields/_teardown.quickwit.yaml",
    "content": "# Delete index\nmethod: DELETE\nendpoint: indexes/concat\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/default_search_fields/0001_default_fields.yaml",
    "content": "endpoint: defaultsearchfields/search\nparams:\n  query: hello\nexpected:\n  num_hits: 1\n  hits:\n    - id: 1\n      some_dynamic_field: hello\n---\nendpoint: defaultsearchfields/search\nparams:\n  query: allo\nexpected:\n  num_hits: 1\n  hits:\n    - id: 2\n      inner_json: {'somefieldinjson': 'allo'}\n---\nendpoint: defaultsearchfields/search\nparams:\n  query: bonjour\nexpected:\n  num_hits: 1\n  hits:\n    - id: 3\n      regular_field: bonjour\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/default_search_fields/0002_invalid_default_fields.yaml",
    "content": "# should fail because we are not in dynamic,\n# yet we are targeting a field not in the field mapping.\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: failing1\n  doc_mapping:\n    mode: lenient\n    field_mappings: []\n  search_settings:\n    default_search_fields:\n      - regular_field\nstatus_code: 400\nexpected:\n  message:\n    $expect: \"\\\"unknown default search field `regular_field`\\\" in val\"\n---\n# should fail because default search field targets a sub field of a\n# non-json field\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: failing2\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: text\n        type: text\n  search_settings:\n    default_search_fields:\n      - text.inner\nstatus_code: 400\nexpected:\n  message:\n    $expect: \"\\\"unknown default search field `text.inner`\\\" in val\"\n---\n# should fail because dynamic field is not indexed.\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: failing3\n  doc_mapping:\n    mode: dynamic\n    field_mappings: []\n    dynamic_mapping:\n      indexed: false\n  search_settings:\n    default_search_fields:\n      - some_field\nstatus_code: 400\nexpected:\n  message:\n    $expect: \"\\\"default search field `some_field` is not indexed\\\" in val\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/default_search_fields/_ctx.yaml",
    "content": "method: GET\nengines: [\"quickwit\"]\napi_root: \"http://localhost:7280/api/v1/\"\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/default_search_fields/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/defaultsearchfields\nstatus_code: null\n---\n# Create index\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: defaultsearchfields\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: id\n        type: u64\n      - name: inner_json\n        type: json\n      - name: regular_field\n        type: text\n    dynamic_mapping:\n      expand_dots: true\n      fast: true\n  search_settings:\n    default_search_fields:\n      - regular_field\n      - some_dynamic_field\n      - inner_json.somefieldinjson\n---\nmethod: POST\nendpoint: defaultsearchfields/ingest\nparams:\n  commit: force\nndjson:\n  - {\"id\": 1, \"some_dynamic_field\": \"hello\"}\n  - {\"id\": 2, \"inner_json\": {\"somefieldinjson\": \"allo\"}}\n  - {\"id\": 3, \"regular_field\": \"bonjour\"}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/default_search_fields/_teardown.quickwit.yaml",
    "content": "# Delete index\nmethod: DELETE\nendpoint: indexes/defaultsearchfields\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0001-noquery.yaml",
    "content": "# This tests a simple request with no queries.\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: \"eq\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0002-query_string.yaml",
    "content": "params:\n  q: type:PushEvent\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 10\"\n---\n# Testing size.\nparams:\n  q: type:PushEvent\n  size: 3\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 3\"\n---\n# Query string takes priority over query defined in body\nparams:\n  # this overrides the query sent in body\n  q: type:PushEvent\n  size: 3\njson:\n  query:\n    term:\n      type:\n        value: \"whatever\"\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 3\"\n---\nparams:\n  # no `q` param here, so the query sent in body applies\n  size: 3\njson:\n  query:\n    term:\n      type:\n        value: \"PushEvent\"\n        # By default case_insensitive is false and prevents matching\n        # case_insensitive: false\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 0\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0003-match.yaml",
    "content": "json:\n  query:\n    match:\n      type:\n        query:  PushEvent\nexpected:\n  hits:\n    total:\n      value: 60\n---\njson:\n  query:\n    match:\n      # It is strangely possible to supply the\n      # query directly as a string.\n      type: PushEvent\nexpected:\n  hits:\n    total:\n      value: 60\n---\njson:\n  query:\n    match:\n      type: \",\" # this will result in a zero-term query\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    match:\n      type:\n        query: \", \" # this will result in a zero-term query.\n        zero_terms_query: all\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n    match:\n      payload.commits.message:\n        query: \"intial commit\" # by default this is a disjunction\nexpected:\n  hits:\n    total:\n      value: 6\n---\njson:\n  query:\n    match:\n      payload.commits.message:\n        query: \"intial commit\" # with operator AND this is a conjunction\n        operator: AND\nexpected:\n  hits:\n    total:\n      value: 1\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0004-term_aggregations.yaml",
    "content": "# disabled due to the previous lack of fast field specific tokenizer.\nparams:\n  size: 0\njson:\n  aggs:\n    mytypeagg:\n      terms:\n        field: type\n        size: 5\nexpected:\n  hits:\n    total:\n      value: 100\n    hits:\n      $expect: \"len(val) == 0\"\n  aggregations:\n    mytypeagg:\n      doc_count_error_upper_bound: 0\n      sum_other_doc_count: 9\n      buckets:\n        - { \"key\": \"pushevent\", \"doc_count\": 60 }\n        - { \"key\": \"createevent\", \"doc_count\" : 12 }\n        - { \"key\": \"issuecommentevent\", \"doc_count\" : 8 }\n        - { \"key\": \"watchevent\", \"doc_count\" : 6 }\n        - { \"key\": \"pullrequestevent\", \"doc_count\" : 5 }\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0005-query_string_query.yaml",
    "content": "params:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent AND actor.login:jadonk\"\nexpected:\n  hits:\n    total:\n      value: 2\n    hits:\n      $expect: \"len(val) == 2\"\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"PushEvent\"\n      fields: [\"type\"]\nexpected:\n  hits:\n    total:\n      value: 60\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"PushEvent\"\n      fields: \"type\"\nstatus_code: 400\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent OR\"\n      fields: []\nstatus_code: 400\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent OR\"\n      fields: [\"body\"]\n      lenient: true\n# Lenient is not about the syntax.\nstatus_code: 400\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent\"\n      fields: []\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 60\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent\"\n      fields: []\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 60\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"actor.id:1315639\"\n      fields: []\nexpected:\n  hits:\n    total:\n      value: 1\n---\n# This test does not work on quickwit.\n# Quickwit always act like elasticsearch's lenient mode.\nengines: [elasticsearch]\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent OR actor.id:shouldhavebeenanumber\"\n      fields: []\n      lenient: false\nstatus_code: 400\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent OR actor.id:shouldhavebeenanumber\"\n      fields: []\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 60\n---\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: 
\"type:PushEvent AND actor.id:shouldhavebeenanumber\"\n      fields: []\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 0\n---\n# Default field\njson:\n  query:\n    query_string:\n      default_field: payload.commits.message\n      lenient: true\n      query: \"to AND the\"\nexpected:\n  hits:\n    total:\n      value: 3\n---\n# Default field + fields\njson:\n  query:\n    query_string:\n      default_field: payload.commits.message\n      fields:\n        - payload.comments.body\n      lenient: true\n      query: \"to AND the\"\nstatus_code: 400\n---\n# wildcard\njson:\n  query:\n    query_string:\n      default_field: payload.description\n      lenient: true\n      query: \"Jou*al AND unix\"\nexpected:\n  hits:\n    total:\n      value: 2\n---\n# wildcard\njson:\n  query:\n    query_string:\n      default_field: payload.description\n      lenient: true\n      query: \"Jour?al AND unix\"\nexpected:\n  hits:\n    total:\n      value: 2\n---\n# wildcard\njson:\n  query:\n    query_string:\n      default_field: payload.description\n      lenient: true\n      query: \"jou*al AND unix\"\nexpected:\n  hits:\n    total:\n      value: 2\n---\n# trailing wildcard\njson:\n  query:\n    query_string:\n      default_field: payload.description\n      lenient: true\n      query: \"jour*\"\nexpected:\n  hits:\n    total:\n      value: 3\n---\n# escaped wildcard\njson:\n  query:\n    query_string:\n      default_field: payload.description\n      lenient: true\n      # ? 
char removed by tokenizer\n      query: \"jour\\\\?\"\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    regexp:\n      payload.description:\n          value: \"jour.*\"\nexpected:\n  hits:\n    total:\n      value: 3\n---\njson:\n  query:\n    query_string:\n      default_field: actor.id\n      query: \">=10791466\"\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    query_string:\n      default_field: actor.id\n      query: \">10791466\"\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    query_string:\n      query: \"true\"\n      fields: [\"public\", \"public.notdefined\", \"notdefined\"]\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 100\n---\n# trailing wildcard\njson:\n  query:\n    query_string:\n      query: \"jour*\"\n      fields: [\"payload.description\", \"payload.notdefined\", \"notdefined\"]\n      lenient: true\nexpected:\n  hits:\n    total:\n      value: 3\n---\n# elasticsearch accepts this query\nengines:\n  - quickwit\njson:\n  query:\n    query_string:\n      query: \"true\"\n      fields: [\"public\", \"public.notdefined\"]\nstatus_code: 400\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0006-term_query.yaml",
    "content": "params:\n  # this overrides the query sent in body apparently\n  size: 3\njson:\n  track_total_hits: true\n  query:\n    term:\n      type:\n        value: \"PushEvent\"\n        case_insensitive: true\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 3\"\n---\n# Terms must be pushed in their form post tokenization\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      type:\n        # this does not match because push event has been lowercased by the tokenizer.\n        value: \"PushEvent\"\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      type:\n        value: \"pushevent\"\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n---\nparams:\n  size: 0\n# Testing the format without the \"value\" object\njson:\n  track_total_hits: true\n  query:\n    term:\n      type: \"pushevent\"\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n# Also testing numbers, and numbers as string in the JSON query\n---\nengines: [\"elasticsearch\"]\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      actor.id: 1762355\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      actor.id: \"1762355\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      actor.id:\n        value: 1762355\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      actor.id:\n        value: \"1762355\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\n# id is a text field\nparams:\n  size: 0\njson:\n  
track_total_hits: true\n  query:\n    term:\n      id:\n        value: \"2549961272\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      id:\n        value: 2549961272\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      id: 2549961272\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      id: \"2549961272\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0007-range_queries.yaml",
    "content": "json:\n  query:\n    range:\n      actor.id:\n        gte: 10791466\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        gt: 10791466\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        lt: 10791466\nexpected:\n  hits:\n    total:\n      value: 98\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        lte: 10791466\nexpected:\n  hits:\n    total:\n      value: 99\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        gt: 467872\nexpected:\n  hits:\n    total:\n      value: 84\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        gte: 467872\nexpected:\n  hits:\n    total:\n      value: 85\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        lte: 467872\nexpected:\n  hits:\n    total:\n      value: 16\n      relation: \"eq\"\n---\njson:\n  query:\n    range:\n      actor.id:\n        gt: 467872\n        lt: 10791466\nexpected:\n  hits:\n    total:\n      value: 82\n      relation: \"eq\"\n---\n# Missing in some documents\njson:\n  query:\n    range:\n      payload.size:\n        gte: 2\nexpected:\n  hits:\n    total:\n      value: 13\n      relation: \"eq\"\n---\n# Field not present in all documents\njson:\n  query:\n    range:\n      payload.size:\n        lt: 2\nexpected:\n  hits:\n    total:\n      value: 47\n      relation: \"eq\"\n---\n# Timestamp field\njson:\n  query:\n    range:\n      created_at:\n        lt: \"2015-02-01T00:00:13Z\"\n        gte: \"2015-02-01T00:00:10Z\"\nexpected:\n  hits:\n    total:\n      value: 44\n      relation: \"eq\"\n---\n# Timestamp field using timestamp\njson:\n  query:\n    range:\n      created_at:\n        lt: 1422748813000\nexpected:\n  hits:\n    total:\n      value: 86\n      relation: \"eq\"\n---\n# Timestamp field\njson:\n  
query:\n    range:\n      created_at:\n        gte: \"2015-02-01T00:00:10Z\"\nexpected:\n  hits:\n    total:\n      value: 58\n      relation: \"eq\"\n---\n# Timestamp field\njson:\n  query:\n    range:\n      created_at:\n        lt: \"2015-02-01T00:00:13Z\"\nexpected:\n  hits:\n    total:\n      value: 86\n      relation: \"eq\"\n---\n# Timestamp field with milliseconds precision 2015-02-01T00:00:00.001\njson:\n  query:\n    range:\n      created_at:\n        gte: \"2015-02-01T00:00:00.001Z\"\n        lt: \"2015-02-01T00:00:00.002Z\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\n# Timestamp field with range in microseconds.\n# Datetime will be truncated at milliseconds as\n# defined in the doc mapper.\njson:\n  query:\n    range:\n      created_at:\n        gte: \"2015-02-01T00:00:00.001999Z\"\n        lte: \"2015-02-01T00:00:00.001999Z\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\n# This field is not a JSON field and does not have fast field normalization.\n# That means it is case sensitive\njson:\n  query:\n    range:\n      repo.name:\n        gte: \"h\"\n        lte: \"i\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\n# This field is a JSON field and has fast field normalization.\n# That means it is case insensitive\njson:\n  query:\n    range:\n      actor.login:\n        gte: \"H\" # should automatically be normalized\n        lte: \"Z\"\nexpected:\n  hits:\n    total:\n      value: 68\n      relation: \"eq\"\n---\n# This field is a JSON field and has fast field normalization.\n# That means it is case insensitive\nengines:\n    - quickwit\njson:\n  query:\n    range:\n      actor.login:\n        gte: \"h\" # should automatically be normalized\n        lte: \"z\"\nexpected:\n  hits:\n    total:\n      value: 68\n      relation: \"eq\"\n---\n# This field is a JSON field and has fast field normalization.\n# That means it is case insensitive\njson:\n  query:\n    
range:\n      actor.login:\n        gte: \"H\" # should automatically be normalized\n        lte: \"Z\"\nexpected:\n  hits:\n    total:\n      value: 68\n      relation: \"eq\"\n---\n# Timestamp field with a custom format.\njson:\n  query:\n    range:\n      created_at:\n        gte: \"2015|02|01 T00:00:00.001999Z\"\n        lte: \"2015|02|01 T00:00:00.001999Z\"\n        # Elasticsearch date format requires text to be escaped with single quotes\n        format: yyyy|MM|dd 'T'HH:mm:ss.SSSSSS'Z'\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0008-sort_by.yaml",
    "content": "json:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: desc\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 10791502\n---\n# Checking that passing the sort params as a query string works.\nparams:\n    sort: \"actor.id:desc\"\n    q: \"*\"\n    size: 1\nexpected:\n    hits:\n        total:\n            value: 100\n            relation: eq\n        hits:\n            - _source:\n                actor:\n                    id: 10791502\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: asc\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 5688\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 5688\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    actor.id: {}\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 5688\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    _doc:\n      order: desc\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 9018\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    _doc:\n      order: asc\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 1762355\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    _doc: {}\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor:\n            id: 1762355\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0009-bool_query.yaml",
    "content": "# Motivated by #3249\njson:\n  query:\n      match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n      bool:\n        filter:\n          - match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n      bool: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n      bool:\n        must_not:\n          - match_none: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n      bool:\n        must_not:\n          - {\"query_string\": {\"query\": \"type:PushEvent AND actor.login:jadonk\"}}\nexpected:\n  hits:\n    total:\n      value: 98\n---\n# Silly edge case 1\njson:\n  query:\n      bool:\n        should:\n          - match_none: {}\nexpected:\n  hits:\n    total:\n      value: 0\n---\n# Silly edge case 2\njson:\n  query:\n      bool:\n        should:\n          - match_none: {}\n        must_not:\n          - match_none: {}\nexpected:\n  hits:\n    total:\n      value: 0\n---\n# Silly edge case 3\njson:\n  query:\n      bool:\n        must_not:\n          - match_none: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\n# Silly edge case 4\njson:\n  query:\n      bool:\n        must:\n          - match_all: {}\n        should:\n          - match_none: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\n# Silly edge case 4\njson:\n  query:\n      bool:\n        filter:\n          - match_all: {}\n        should:\n          - match_none: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\n# Support null values\n# This format is not supported by Elasticsearch\nengines: [\"quickwit\"]\njson:\n  query:\n      bool:\n        must: null\n        must_not: null\n        should: null\n        filter: null\n        boost: null\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n    query:\n        bool:\n            should:\n                - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n                - 
{\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n                - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n                - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: 1\nexpected:\n    hits:\n        total:\n            value: 69\n---\njson:\n  query:\n      bool:\n        should:\n          - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n          - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n          - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n          - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n        minimum_should_match: 2\nexpected:\n  hits:\n    total:\n      value: 3\n---\njson:\n    query:\n        bool:\n            should:\n                - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n                - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n                - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n                - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: 3\nexpected:\n    hits:\n        total:\n            value: 0\n---\njson:\n  query:\n      bool:\n        must:\n          - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n        should:\n          - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n          - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n          - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n        minimum_should_match: 1\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n      bool:\n        must:\n          - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n        should:\n          - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n        minimum_should_match: 2 # that's one too many'\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n    query:\n        bool:\n            should:\n            - 
{\"query_string\": {\"query\": \"type:PushEvent\"}}\n            - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n            - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n            - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: 50%\nexpected:\n    hits:\n        total:\n            value: 3\n---\njson:\n    query:\n        bool:\n            should:\n            - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n            - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n            - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n            - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: -2\nexpected:\n    hits:\n        total:\n            value: 3\n---\njson:\n    query:\n        bool:\n            should:\n            - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n            - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n            - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n            - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: -3\nexpected:\n    hits:\n        total:\n            value: 69\n---\n# corner case: a minimum should match that is too negative is just discarded.\njson:\n    query:\n        bool:\n            should:\n            - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n            - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n            - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n            - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: -10\nexpected:\n    hits:\n        total:\n            value: 69\n---\n# corner case: a minimum should match that is too negative is just discarded.\njson:\n    query:\n        bool:\n            must:\n                - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n   
         should:\n                - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n                - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n                - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: -10\nexpected:\n    hits:\n        total:\n            value: 60\n---\njson:\n    query:\n        bool:\n            should:\n            - {\"query_string\": {\"query\": \"type:PushEvent\"}}\n            - {\"query_string\": {\"query\": \"actor.login:jadonk\"}}\n            - {\"query_string\": {\"query\": \"actor.login:teozfrank\"}}\n            - {\"query_string\": {\"query\": \"type:IssueCommentEvent\"}}\n            minimum_should_match: 0\nexpected:\n    hits:\n        total:\n            value: 69\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0010-match_phrase_prefix_query.yaml",
    "content": "method: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.pull_request.body:\n        query: \"p\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.pull_request.body:\n        query: \"to p\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.pull_request.body:\n        query: \"be to p\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.commits.message:\n        query: \"automated comm\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n    hits:\n      - _source:\n          payload:\n            commits:\n              - message: \"automated commit\"\n---\nmethod: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.commits.message:\n        query: \"fix\"\n        max_expansions: 2\nexpected:\n  hits:\n    total:\n      value: 6\n      relation: \"eq\"\n---\n# This is a bit of a sloppy just testing that the tokenizer property is\n# plugged\n#\n# We only apply it to quickwit because the raw tokenizer does not exist in ES.\nmethod: [GET]\nengines:\n  - quickwit\njson:\n  query:\n    match_phrase_prefix:\n      payload.commits.message:\n        query: \"automated comm\"\n        analyzer: raw\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n---\n# This is a bit of a sloppy just testing that the tokenizer property is\n# plugged\n#\n# We only apply it to quickwit because the raw tokenizer does not exist in ES.\nmethod: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.commits.message:\n        query: \"automated comm\"\n        analyzer: inexistent_tokenizer\nstatus_code: 400\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0011-exists-query.yaml",
    "content": "json:\n  query:\n    exists:\n      field: type\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n    exists:\n      field: thisfielddoesnotexists\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    exists:\n      field: payload.size\nexpected:\n  hits:\n    total:\n      value: 60\n---\njson:\n  query:\n    exists:\n      field: payload\nexpected:\n  hits:\n    total:\n      # one of the docs contains `\"payload\":{}`\n      value: 99\n---\n# Fortunately, ES does not accept this quirky syntax in the\n# case of exists query.\njson:\n  query:\n    exists: payload.size\nstatus_code: 400\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0012-scroll-api.yaml",
    "content": "---\nengines: [\"quickwit\"]\nparams:\n    size: 1\n    scroll: 30m\n    allow_partial_search_results: \"false\"\njson:\n    query:\n        match_all: {}\nstatus_code: 400\nexpected:\n    error:\n        reason: \"Invalid argument: Quickwit only supports scroll API with allow_partial_search_results set to true\"\n---\nparams:\n  size: 1\n  scroll: 30m\njson:\n  query:\n    match_all: {}\n  sort:\n    - actor.id:\n        order: desc\n  aggs:\n    mytypeagg:\n      terms:\n        field: type\n        size: 5\nstore:\n  scroll_id: _scroll_id\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  aggregations:\n    mytypeagg: {}\n  hits:\n    hits:\n      - _source: {actor: {login: \"miyuotsuki\"}}\n    total:\n      value: 100\n      relation: \"eq\"\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      - _source: {actor: {login: \"ScottThiessen\"}}\n    total:\n      value: 100\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      - _source: {actor: {login: \"seenajon\"}}\n    total:\n      value: 100\n---\nengines: [\"quickwit\"]\nparams:\n  size: 1\n  scroll: 31m\njson:\n  query:\n    match_all: {}\n  sort:\n    - actor.id:\n        order: desc\nstatus_code: 400\nexpected:\n  status: 400\n  error:\n    reason: \"Invalid argument: Quickwit only supports scroll TTL period up to 1800 secs\"\n---\nparams:\n  size: 40\n  scroll: 30m\njson:\n  query:\n    match_all: {}\n  sort:\n    - actor.id:\n        order: desc\n  aggs:\n    mytypeagg:\n      terms:\n        field: type\n        size: 5\nstore:\n  scroll_id: _scroll_id\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  aggregations:\n    mytypeagg: {}\n  
hits:\n    hits:\n      $expect: \"len(val) == 40\"\n    total:\n      value: 100\n      relation: \"eq\"\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      $expect: len(val) == 40\n    total:\n      value: 100\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      $expect: len(val) == 20\n    total:\n      value: 100\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      $expect: len(val) == 0\n    total:\n      value: 100\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      $expect: len(val) == 0\n    total:\n      value: 100\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0013-phrase-query.yaml",
    "content": "json:\n  query:\n    match_phrase:\n      payload.commits.message: sign decoration\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    match_phrase:\n      payload.commits.message:\n        query: sign decoration\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    match_phrase:\n      # There is a \"zone of explosion\" message.\n      # Without slop no matches!\n      payload.commits.message: zone explosion\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    match_phrase:\n      # There is a \"zone of explosion\" message.\n      # Without slop no matches!\n      payload.commits.message:\n        query: zone explosion\n        slop: 1\nexpected:\n  hits:\n    total:\n      value: 1\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0014-multi-match-query.yaml",
    "content": "json:\n  engines:\n    - quickwit\n  query:\n    multi_match:\n      query: sign decoration\n      fields: []\nstatus_code:\n  400\nexpected:\n---\njson:\n  query:\n    multi_match:\n      query: sign decoration\n      fields: [\"payload.commits.message\"]\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    multi_match:\n      query: sign decoration\n      # Apparently elasticsearch accepts a string here.\n      fields: \"payload.commits.message\"\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    multi_match:\n      query: sign decoration\n      fields:\n        - inexistent_field\n        - payload.commits.message\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    multi_match:\n      type: phrase\n      query: sign decoration\n      fields: [\"payload.commits.message\"]\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    multi_match:\n      type: phrase\n      query: zone explosion\n      fields: [\"payload.commits.message\"]\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    multi_match:\n      type: phrase\n      query: zone explosion\n      slop: 1\n      fields: [\"payload.commits.message\"]\nexpected:\n  hits:\n    total:\n      value: 1\n---\nengines:\n    # TODO check the discrepancy with ES\n    - quickwit\njson:\n  query:\n    multi_match:\n      type: most_fields\n      query: the pomle missingtoken\n      fields: [\"payload.commits.message\", \"actor.login\"]\nexpected:\n  hits:\n    total:\n      value: 4\n---\njson:\n  query:\n    multi_match:\n      type: phrase\n      query: zone of expl\n      fields: [\"payload.commits.message\"]\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    multi_match:\n      type: phrase_prefix\n      query: zone of expl\n      fields: [\"payload.commits.message\"]\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    multi_match:\n      type: 
phrase_prefix\n      query: zone of expl\n      # Yeah it makes no sense at all, but elastic accepts it.\n      lenient: true\n      fields: [\"payload.commits.message\"]\n---\njson:\n  query:\n    multi_match:\n      type: most_fields\n      query: the\n      lenient: false\n      fields: [\"payload.commits.message\", \"hello\"]\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0015-terms-query.yaml",
    "content": "json:\n  query:\n    terms:\n      type:\n        - PushEvent\n        - CommitCommentEvent\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    terms:\n      type:\n        - pushevent\n        - commitcommentevent\nexpected:\n  hits:\n    total:\n      value: 61\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0016-misc-query.yaml",
    "content": "json:\n  query:\n    multi_match:\n      fields:\n        - payload.commits.message\n        - payload.description\n        - payload.comment.body\n      lenient: true\n      query: to be\n      type: phrase\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    multi_match:\n      fields:\n        - payload.commits.message\n        - payload.description\n      lenient: true\n      query: to b\n      type: phrase\nexpected:\n  hits:\n    total:\n      value: 0\n---\njson:\n  query:\n    multi_match:\n      fields:\n        - payload.commits.message\n        - payload.description\n        - payload.comment.body\n      lenient: true\n      query: to be\n      type: phrase_prefix\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    multi_match:\n      fields:\n        - payload.commits.message\n        - payload.description\n        - payload.comment.body\n      lenient: true\n      query: to b\n      type: phrase_prefix\nexpected:\n  hits:\n    total:\n      value: 3\n---\njson:\n  query:\n    query_string:\n      default_field: payload.commits.message\n      lenient: true\n      query: \"to AND the\"\nexpected:\n  hits:\n    total:\n      value: 3\n---\njson:\n  query:\n    query_string:\n      fields:\n        - payload.commits.message\n      lenient: true\n      query: \"to AND the\"\nexpected:\n  hits:\n    total:\n      value: 3\n---\nengines: [\"quickwit\"]\njson:\n  query:\n    exists:\n      field: payload.commits.message\nexpected:\n  hits:\n    total:\n      value: 59  # There are actually 60 documents where this field is not empty, but one of them has a field longer than 255 chars\n---\n# test exists for a non-fast field\njson:\n  query:\n    exists:\n      field: public\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n    match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n---\njson:\n  query:\n    terms:\n      payload.commits.message:\n        - fix\n        - 
bug\n        - problem\n        - closes\nexpected:\n  hits:\n    total:\n      value: 3\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0017-match-bool-prefix-query.yaml",
    "content": "method: [GET]\njson:\n  query:\n    match_bool_prefix:\n      payload.pull_request.body:\n        query: \"file not ch\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_bool_prefix:\n      payload.pull_request.body:\n        query: \"file not chzn\"\n        operator: AND\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_bool_prefix:\n      payload.pull_request.body:\n        query: \"file not ch\"\n        operator: AND\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_bool_prefix:\n      payload.pull_request.body: \"file not ch\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nmethod: [GET]\njson:\n  query:\n    match_phrase_prefix:\n      payload.commits.message:\n        query: \"fix\"\nexpected:\n  hits:\n    total:\n      value: 7\n      relation: \"eq\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0018-search_after.yaml",
    "content": "json:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: desc\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - sort: [10791502]\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: desc\n  search_after: [10791502]\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - sort: [10791466]\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: asc\n  search_after: [5688]\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - sort: [9018]\n---\n# Test with a search after value as string\n# Quickwit should convert it to the correct type\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: asc\n  search_after: [\"5688\"]\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - sort: [9018]\n---\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - actor.id:\n        order: asc\n  search_after: [5688]\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - sort: [9018]\n---\njson:\n  size: 100\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - created_at:\n        order: asc\n  search_after: [1422748815000]\nexpected:\n  hits:\n    hits:\n      $expect: \"len(val) == 4\"\n---\n# Quickwit should accept timestamp as string.\njson:\n  size: 100\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - created_at:\n        order: asc\n  search_after: [\"1422748815000\"]\nexpected:\n  hits:\n    hits:\n      $expect: \"len(val) == 4\"\n---\njson:\n  size: 100\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - created_at:\n        order: desc\n  search_after: [\"1422748800001\"]\nexpected:\n  hits:\n    hits:\n      $expect: \"len(val) == 7\"\n---\n# Only works for 
quickwit engine,\n# `epoch_nanos_int` format is quickwit specific\nengines:\n  - quickwit\njson:\n  size: 100\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - created_at:\n        order: asc\n        format: epoch_nanos_int\n  search_after: [1422748815000000000]\nexpected:\n  hits:\n    hits:\n      - sort: [1422748816000000000]\n      - sort: [1422748816000000000]\n      - sort: [1422748816000000000]\n      - sort: [1422748816000000000]\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0019-count.yaml",
    "content": "endpoint: \"gharchive/_count\"\nparams:\n  q: type:PushEvent\nexpected:\n  count: 60\n---\nendpoint: \"gharchive/_count\"\nexpected:\n  count: 100\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0020-stats.yaml",
    "content": "method: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"gharchive/_stats\"\nexpected:\n  _all:\n    primaries:\n      docs:\n        count: 100\n      store:\n        size_in_bytes:\n          $expect: \"val > 278300\"\n    total:\n      segments:\n        count: 1\n      docs:\n        count: 100\n  indices:\n    gharchive:\n      primaries:\n        docs:\n          count: 100\n        store:\n          size_in_bytes:\n            $expect: \"val > 278300\"\n      total:\n        segments:\n          count: 1\n        docs:\n          count: 100\n---\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"ghar*/_stats\"\nexpected:\n  _all:\n    primaries:\n      docs:\n        count: 100\n    total:\n      segments:\n        count: 1\n      docs:\n        count: 100\n  indices:\n    gharchive:\n      primaries:\n        docs:\n          count: 100\n      total:\n        segments:\n          count: 1\n        docs:\n          count: 100\n---\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: \"_stats\"\nexpected:\n  _all:\n    primaries:\n      docs:\n        count: 104\n    total:\n      segments:\n        count: 3\n      docs:\n        count: 104\n  indices:\n    gharchive:\n      primaries:\n        docs:\n          count: 100\n      total:\n        segments:\n          count: 1\n        docs:\n          count: 100\n    fast_only:\n      primaries:\n        docs:\n          count: 2\n      total:\n        segments:\n          count: 1\n        docs:\n          count: 2\n    empty_index:\n      primaries:\n        docs:\n          count: 0\n      total:\n        segments:\n          count: 0\n        docs:\n          count: 0\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0021-cat-indices.yaml",
    "content": "method: [GET]\nengines:\n  - quickwit\nendpoint: \"_cat/indices?format=json\"\nexpected:\n- index: empty_index\n  docs.count: '0'\n- index: fast_only\n  docs.count: '2'\n- index: gharchive\n  dataset.size: 222.8kb\n  docs.count: '100'\n  docs.deleted: '0'\n  health: green\n  pri: '1'\n  pri.store.size:\n      $expect: 270 < float(val[:-2]) < 280\n  rep: '1'\n  status: open\n  store.size:\n      #272.4kb\n      $expect: 270 < float(val[:-2]) < 280\n  rep: '1'\n  #uuid: gharchive:01HN2SDANHDN6WFAFNH7BBMQ8C\n- index: otel-logs-v0_9\n  docs.count: '0'\n- index: otel-traces-v0_9\n  docs.count: '0'\n- index: simple_es_compat\n  docs.count: '2'\n---\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: \"_cat/indices/gharchive?format=json\"\nexpected:\n- dataset.size:\n    # 222.8kb\n    $expect: 220 < float(val[:-2]) < 230\n  docs.count: '100'\n  docs.deleted: '0'\n  health: green\n  index: gharchive\n  pri: '1'\n  pri.store.size:\n      #272.4kb\n      $expect: 270 < float(val[:-2]) < 280\n  rep: '1'\n  status: open\n  store.size:\n      # 272.4kb\n      $expect: 270 < float(val[:-2]) < 280\n  #uuid: gharchive:01HN2SDANHDN6WFAFNH7BBMQ8C\n---\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"_cat/indices/gharchive?format=json&h=docs.count,index\"\nexpected:\n- docs.count: '100'\n  index: gharchive\n--- # Wildcard test\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"_cat/indices/gharc*?format=json&h=docs.count,index\"\nexpected:\n- docs.count: '100'\n  index: gharchive\n--- # health green test\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"_cat/indices/gharchive?format=json&health=green\"\nexpected:\n- docs.count: '100'\n  index: gharchive\n--- # health red test\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"_cat/indices/gharchive?format=json&health=red\"\nexpected: []\n---\n# Quickwit only supports JSON output. 
(Elastic has a table-like text output.)\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: \"_cat/indices/gharchive\" # missing format=json\nstatus_code: 400\n---\n# Quickwit does not support the `v` parameter.\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: \"_cat/indices/gharchive?format=json&v=true\" # unsupported v parameter\nstatus_code: 400\n---\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: \"_cat/indices/gharchive?format=json&b=b\" # unsupported bytes parameter\nstatus_code: 400\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0022-source.yaml",
    "content": "--- # _source_excludes\nparams:\n  _source_excludes: [\"actor\"]\njson:\n  size: 1\n  query:\n      match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          $expect: \"not 'actor' in val\" \n--- # _source_includes\nparams:\n  _source_includes: [\"actor\"]\njson:\n  size: 1\n  query:\n      match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          $expect: \"len(val) == 1\" # Contains only 'actor'\n          actor:\n            id: 5688\n--- # _source_includes and _source_excludes\nparams:\n  _source_includes: \"actor,id\"\n  _source_excludes: [\"actor\"]\njson:\n  size: 1\n  query:\n      match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          $expect: \"len(val) == 1\" # Contains only 'id'\n          id: 5688\n--- # _source_includes with path\nparams:\n  _source_includes: \"actor.id\"\njson:\n  size: 1\n  query:\n      match_all: {}\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: eq\n    hits:\n      - _source:\n          actor: \n            $expect: \"len(val) == 1\" # 'actor' contains only 'id'\n            id: 5688\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0023-extra_filters.yaml",
    "content": "# Extra filters are additional filters applied on top of the query. They are useful for permissions and other use cases.\nengines:\n  - quickwit\njson:\n  query:\n      match_all: {}\nparams:\n  extra_filters: \"type:PushEvent\"\nexpected:\n  hits:\n    total:\n      value: 60\n--- # 2 extra filters\nengines:\n  - quickwit\njson:\n  query:\n      match_all: {}\nparams:\n  extra_filters: \"type:PushEvent,actor.login:jadonk\"\nexpected:\n  hits:\n    total:\n      value: 2\n--- # Test mixing a query with one extra filter\nengines:\n  - quickwit\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent\"\nparams:\n  extra_filters: \"actor.login:jadonk\"\nexpected:\n  hits:\n    total:\n      value: 2\n--- # Test mixing a query with extra filters that repeat the query clause\nengines:\n  - quickwit\njson:\n  query:\n    query_string:\n      query: \"type:PushEvent\"\nparams:\n  extra_filters: \"type:PushEvent,actor.login:jadonk\"\nexpected:\n  hits:\n    total:\n      value: 2\n\n\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0024-delete_indices.yaml",
    "content": "--- #Create indices quickwit\nengines:\n  - quickwit\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: test_index1\n  doc_mapping:\n    mode: dynamic\nsleep_after: 3\n---\nengines:\n  - quickwit\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: test_index2\n  doc_mapping:\n    mode: dynamic\nsleep_after: 3\n--- # create indices elasticsearch\nengines:\n  - elasticsearch\nmethod: PUT\nendpoint: test_index1\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"created_at\": {\n        \"type\": \"date\",\n        \"store\": true\n      }\n    }\n  }\n}\n--- # create indices elasticsearch\nengines:\n  - elasticsearch\nmethod: PUT\nendpoint: test_index2\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"created_at\": {\n        \"type\": \"date\",\n        \"store\": true\n      }\n    }\n  }\n}\n---\nengines:\n  - quickwit\n  - elasticsearch\nmethod: DELETE\nendpoint: test_index1,does_not_exist\nstatus_code: 404\n--- # delete partially matching with ignore_unavailable\nengines:\n  - quickwit\n  - elasticsearch\nmethod: DELETE\nendpoint: test_index1,does_not_exist\nstatus_code: 200\nparams:\n  ignore_unavailable: \"true\"\n--- # already deleted\nengines:\n  - quickwit\n  - elasticsearch\nmethod: DELETE \nendpoint: test_index1\nstatus_code: 404\n---\nengines:\n  - quickwit\n  - elasticsearch\nmethod: DELETE\nendpoint: test_index2\nstatus_code: 200\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0025-msearch.yaml",
    "content": "endpoint: \"_msearch\"\nmethod: POST\nndjson:\n  - {\"index\":\"gharchive\"}\n  - {\"query\" : {\"match\" : { \"type\": \"PushEvent\"}}, \"size\": 0, \"track_total_hits\": true}\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 60\n---\nendpoint: \"_msearch\"\nengines: [\"quickwit\"]\nmethod: POST\nparams:\n  extra_filters: \"type:PushEvent,actor.login:jadonk\"\nndjson:\n  - {\"index\":\"gharchive\"}\n  - {\"query\" : {\"match\" : { \"type\": \"PushEvent\"}}, \"size\": 0, \"track_total_hits\": true}\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 2\n---\n# `_source_excludes` is not supported in elasticsearch' msearch.\n# This parameter is quickwit specific.\n# To get more info about the quirks of msearch parameters,\n# https://github.com/elastic/elasticsearch/issues/4227\nendpoint: \"_msearch\"\nengines: [\"quickwit\"]\nmethod: POST\nparams:\n  _source_excludes: [\"actor\"]\nndjson:\n  - {\"index\":\"gharchive\"}\n  - {\"query\" : {\"match_all\" : {}}, \"size\": 1}\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 100\n          relation: eq\n        hits:\n          - _source:\n              $expect: \"not 'actor' in val\"\n---\n# `_source_includes` is not supported in elasticsearch' msearch.\n# This parameter is quickwit specific.\nendpoint: \"_msearch\"\nengines: [\"quickwit\"]\nmethod: POST\nparams:\n  _source_includes: [\"actor\"]\nndjson:\n  - {\"index\":\"gharchive\"}\n  - {\"query\" : {\"match_all\" : {}}, \"size\": 1}\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 100\n          relation: eq\n        hits:\n          - _source:\n              $expect: \"len(val) == 1\" # Contains only 'actor'\n              actor:\n                id: 5688\n---\n# `{_sources: {\"excludes\": [..]}}` is currently not supported in Quickwit.\n# To get more info about the quirks of msearch parameters,\n# https://github.com/elastic/elasticsearch/issues/4227\nendpoint: 
\"_msearch\"\nengines: [\"elasticsearch\"]\nmethod: POST\nndjson:\n  - {\"index\":\"gharchive\"}\n  - {\"query\" : {\"match_all\" : {}}, \"size\": 1, \"_source\": {\"excludes\": [\"actor\"]} }\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 100\n          relation: eq\n        hits:\n          - _source:\n              $expect: \"not 'actor' in val\"\n---\n# Same as above\nendpoint: \"_msearch\"\nengines: [\"elasticsearch\"]\nmethod: POST\nndjson:\n  - {\"index\":\"gharchive\"}\n  - {\"query\" : {\"match_all\" : {}}, \"size\": 1, \"_source\": {\"includes\": [\"actor\"]}}\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 100\n          relation: eq\n        hits:\n          - _source:\n              $expect: \"len(val) == 1\" # Contains only 'actor'\n              actor:\n                id: 5688\n---\n# test missing index\nendpoint: \"_msearch\"\nmethod: POST\nndjson:\n  - {\"index\":\"idontexist\"}\n  - {\"query\" : {\"match\" : { \"type\": \"PushEvent\"}}, \"size\": 0, \"track_total_hits\": true}\nexpected:\n  responses:\n    - status: 404\n---\nendpoint: \"_msearch\"\nmethod: POST\nndjson:\n  - {\"index\":\"idontexist\", \"ignore_unavailable\": true}\n  - {\"query\" : {\"match\" : { \"type\": \"PushEvent\"}}, \"size\": 0}\nexpected:\n  responses:\n    - hits:\n        total:\n          value: 0\n      status: 200\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0026-resolve.yaml",
    "content": "method: GET\nendpoint: /_resolve/index/gh*\nexpected:\n  indices:\n    - name: gharchive\n      attributes:\n        - open\n  aliases: []\n  data_streams: []\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0027-cluster-health.yaml",
    "content": "method: GET\nendpoint: /_cluster/health\nstatus_code: 200\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0028-fast_only_field_query.yaml",
    "content": "# Search for a term in a field that is not indexed but is a fast field\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      fast_text: \"abc-123\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n--- # term query with no matches\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      fast_text: \"zzz\"\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n\n--- # term set query with partial match\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    terms:\n      fast_text:\n        - \"abc-123\"\n        - \"zzz\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n\n--- # term set query with multiple matches\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    terms:\n      fast_text:\n        - \"abc-123\"\n        - \"def-456\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n\n--- # term query on nested JSON field\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      obj.nested_text: \"abc-123\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n\n--- # term query with no matches\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    term:\n      obj.nested_text: \"zzz\"\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n\n--- # term set query\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    terms:\n      obj.nested_text:\n        - \"abc-123\"\n        - \"ghi-789\"\nexpected:\n  
hits:\n    total:\n      value: 2\n      relation: \"eq\"\n\n--- # term set query with no matches\nengines:\n  - quickwit\nendpoint: \"fast_only/_search\"\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    terms:\n      obj.nested_text:\n        - \"zzz\"\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0029-wildcard.yaml",
    "content": "json:\n  query:\n    wildcard:\n      actor.login:\n        value: jad?nk\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    wildcard:\n      actor.login:\n        value: j*nk\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    wildcard:\n      actor.login: jad?nk\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    wildcard:\n      repo.name:\n        value: RUS*\n        case_insensitive: true\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    wildcard:\n      repo.name:\n        value: RUS*\n        case_insensitive: false\nexpected:\n  hits:\n    total:\n      value: 0\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0030-prefix.yaml",
    "content": "json:\n  query:\n    prefix:\n      actor.login:\n        value: jado\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    prefix:\n      actor.login:\n        value: j\nexpected:\n  hits:\n    total:\n      value: 10\n---\njson:\n  query:\n    prefix:\n      actor.login: jado\nexpected:\n  hits:\n    total:\n      value: 2\n---\njson:\n  query:\n    prefix:\n      repo.name:\n        value: RUST\n        case_insensitive: true\nexpected:\n  hits:\n    total:\n      value: 1\n---\njson:\n  query:\n    prefix:\n      repo.name:\n        value: RUST\n        case_insensitive: false\nexpected:\n  hits:\n    total:\n      value: 0\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0031-regex.yaml",
    "content": "# Basic regex match\nparams:\n  size: 0\njson:\n  track_total_hits: true\n  query:\n    regexp:\n      type:\n        value: \".*event\"\nexpected:\n  hits:\n    total:\n      value: 100\n      relation: \"eq\"\n---\n# Regex always match from start to end (`(re)` equivalent to `^(re)$`)\nparams:\n  size: 3\njson:\n  track_total_hits: true\n  query:\n    regexp:\n      type:\n        value: \"event\"\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n---\n# Regex with case_insensitive flag\nparams:\n  size: 3\njson:\n  track_total_hits: true\n  query:\n    regexp:\n      repo.name:\n        # lowercased by the tokenizer\n        value: \"RUST.*\"\n        case_insensitive: true\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\nparams:\n  size: 3\njson:\n  track_total_hits: true\n  query:\n    regexp:\n      type:\n        # lowercased by the tokenizer\n        value: \"RUST.*\"\n        case_insensitive: false\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n---\n# In Elasticsearch, ^ and $ are escaped when they are used as anchors, so\n# ^pushevent$ only matches if the original term is \"^pushevent$\". 
In Quickwit\n# this fails (for now) because tantivy-fst returns an error on all zero width\n# assertions.\nengines:\n  - elasticsearch\nendpoint: \"simple_es_compat/_search\"\nparams:\n  size: 3\njson:\n  track_total_hits: true\n  query:\n    regexp:\n      keyword_text:\n        value: \"red$\"\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n---\nengines:\n  - elasticsearch\nendpoint: \"simple_es_compat/_search\"\nparams:\n  size: 3\njson:\n  track_total_hits: true\n  query:\n    regexp:\n      keyword_text:\n        value: \"gold$\"\nexpected:\n  hits:\n    total:\n      value: 1\n      relation: \"eq\"\n---\n# regex in query_string\nparams:\n  size: 10\njson:\n  query:\n    query_string:\n      query: \"type:/pushevent/\"\nexpected:\n  hits:\n    total:\n      value: 60\n      relation: \"eq\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/0032-mappings.yaml",
    "content": "method: [GET]\nengines:\n  - elasticsearch\nendpoint: \"gharchive/_mappings\"\nexpected:\n  gharchive: {}\n  # _all:\n  #   primaries:\n  #     docs:\n  #       count: 100\n  #     store:\n  #       size_in_bytes:\n  #         $expect: \"val > 278300\"\n  #   total:\n  #     segments:\n  #       count: 1\n  #     docs:\n  #       count: 100\n  # indices:\n  #   gharchive:\n  #     primaries:\n  #       docs:\n  #         count: 100\n  #       store:\n  #         size_in_bytes:\n  #           $expect: \"val > 278300\"\n  #     total:\n  #       segments:\n  #         count: 1\n  #       docs:\n  #         count: 100\n---\n# method: [GET]\n# engines:\n#   - quickwit\n#   - elasticsearch\n# endpoint: \"ghar*/_stats\"\n# expected:\n#   _all:\n#     primaries:\n#       docs:\n#         count: 100\n#     total:\n#       segments:\n#         count: 1\n#       docs:\n#         count: 100\n#   indices:\n#     gharchive:\n#       primaries:\n#         docs:\n#           count: 100\n#       total:\n#         segments:\n#           count: 1\n#         docs:\n#           count: 100\n# ---\n# method: [GET]\n# engines:\n#   - quickwit\n# endpoint: \"_stats\"\n# expected:\n#   _all:\n#     primaries:\n#       docs:\n#         count: 102\n#     total:\n#       segments:\n#         count: 2\n#       docs:\n#         count: 102\n#   indices:\n#     gharchive:\n#       primaries:\n#         docs:\n#           count: 100\n#       total:\n#         segments:\n#           count: 1\n#         docs:\n#           count: 100\n#     fast_only:\n#       primaries:\n#         docs:\n#           count: 2\n#       total:\n#         segments:\n#           count: 1\n#         docs:\n#           count: 2\n#     empty_index:\n#       primaries:\n#         docs:\n#           count: 0\n#       total:\n#         segments:\n#           count: 0\n#         docs:\n#           count: 0\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_ctx.elasticsearch.yaml",
    "content": "api_root: http://localhost:9200/\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_ctx.quickwit.yaml",
    "content": "api_root: \"http://localhost:7280/api/v1/_elastic/\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_ctx.yaml",
    "content": "method: [GET, POST]\nendpoint: \"gharchive/_search\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_setup.elasticsearch.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: gharchive\nstatus_code: null\n---\nmethod: DELETE\nendpoint: empty_index\nstatus_code: null\n---\nmethod: DELETE\nendpoint: simple_es_compat\nstatus_code: null\n---\n# empty index\nmethod: PUT\nendpoint: empty_index\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"created_at\": {\n        \"type\": \"date\",\n        \"store\": true\n      }\n    }\n  }\n}\n---\n# Create index\nmethod: PUT\nendpoint: gharchive\njson: {\n  \"settings\": {\n    \"analysis\": {\n    \"normalizer\": {\n      \"keyword_lowercase\": {\n        \"type\": \"custom\",\n        \"filter\": [\"lowercase\"]\n      },\n      \"keyword_keepcase\": { \"type\": \"custom\" }\n    }\n  }\n  },\n  \"mappings\": {\n    \"properties\": {\n      \"id\": {\n        \"type\": \"text\",\n        \"store\": true,\n        \"norms\": false,\n        \"index_options\": \"docs\"\n      },\n      \"type\": {\n        \"type\": \"text\",\n        \"store\": true,\n        \"norms\": false,\n        \"index_options\": \"docs\",\n        \"fielddata\": true\n      },\n      \"actor\": {\n        \"properties\": {\n          \"id\": {\n            \"type\": \"long\",\n            \"store\": true\n          },\n          \"login\": {\n            \"type\": \"keyword\",\n            \"normalizer\": \"keyword_lowercase\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": \"docs\"\n          },\n          \"gravatar_id\": {\n            \"type\": \"text\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": \"docs\"\n          },\n          \"url\": {\n            \"type\": \"text\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": \"docs\"\n          },\n          \"avatar_url\": {\n            \"type\": \"text\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": 
\"docs\"\n          }\n        }\n      },\n      \"repo\": {\n        \"properties\": {\n          \"id\": {\n            \"type\": \"long\",\n            \"store\": true\n          },\n          \"name\": {\n            \"type\": \"keyword\",\n            \"normalizer\": \"keyword_keepcase\",\n            \"store\": true\n          },\n          \"url\": {\n            \"type\": \"text\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": \"docs\"\n          }\n        }\n      },\n      \"payload\": {\n        \"type\": \"object\"\n      },\n      \"created_at\": {\n        \"type\": \"date\",\n        \"store\": true\n      }\n    }\n  }\n}\n---\nmethod: PUT\nendpoint: gharchive/_settings\njson: { \"number_of_replicas\": 0 }\n---\n# Create index\nmethod: PUT\nendpoint: simple_es_compat\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"keyword_text\": {\n        \"type\": \"keyword\",\n      }\n    }\n  }\n}\n---\nmethod: PUT\nendpoint: simple_es_compat/_settings\njson: { \"number_of_replicas\": 0 }\n---\n# Ingest documents\nmethod: POST\nendpoint: _bulk\nparams:\n  refresh: \"true\"\nheaders: {\"Content-Type\": \"application/json\", \"content-encoding\": \"gzip\"}\nbody_from_file: gharchive-bulk.json.gz\n---\nmethod: POST\nendpoint: _bulk\nparams:\n  refresh: \"true\"\nheaders: {\"Content-Type\": \"application/json\"}\nndjson:\n  - {\"index\":{\"_index\":\"simple_es_compat\"}}\n  - {\"keyword_text\": \"red\"}\n  - {\"index\":{\"_index\":\"simple_es_compat\"}}\n  - {\"keyword_text\": \"gold$\"}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/gharchive\nstatus_code: null\n---\n# Delete possibly remaining index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/empty_index\nstatus_code: null\n---\n# Delete possibly remaining index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/simple_es_compat\nstatus_code: null\n---\n# Create index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: empty_index\n  doc_mapping:\n    field_mappings:\n        - name: created_at\n          type: datetime\n          fast: true\nsleep_after: 3\n---\n# Create index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: gharchive\n  doc_mapping:\n    index_field_presence: true\n    timestamp_field: created_at\n    mode: dynamic\n    field_mappings:\n        - name: repo\n          type: object\n          field_mappings:\n              - name: name\n                type: text\n                fast: true\n                indexed: true\n                tokenizer: raw\n        - name: public\n          type: bool\n          fast: false\n          indexed: true\n        - name: created_at\n          type: datetime\n          fast: true\n          fast_precision: milliseconds\n    dynamic_mapping:\n      expand_dots: true\n      tokenizer: default\n      fast:\n        normalizer: lowercase\n      record: position\n---\n# Ingest documents\nmethod: POST\nendpoint: _bulk\nparams:\n  refresh: \"true\"\nheaders: {\"Content-Type\": \"application/json\", \"content-encoding\": \"gzip\"}\nbody_from_file: gharchive-bulk.json.gz\n---\n# Delete possibly remaining index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/fast_only\nstatus_code: null\n---\n# Create a dedicated index with a root fast-only field\nmethod: 
POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: fast_only\n  doc_mapping:\n    field_mappings:\n      - name: fast_text\n        type: text\n        fast: true\n        indexed: false\n      - name: obj\n        type: object\n        field_mappings:\n          - name: nested_text\n            type: text\n            fast: true\n            indexed: false\nsleep_after: 1\n---\n# Ingest a couple documents into fast_only\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: fast_only/ingest\nparams:\n  commit: force\nndjson:\n  - {\"fast_text\": \"abc-123\", \"obj\": {\"nested_text\": \"abc-123\"}}\n  - {\"fast_text\": \"def-456\", \"obj\": {\"nested_text\": \"ghi-789\"}}\n\n---\n# Create simple_es_compat index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: simple_es_compat\n  doc_mapping:\n    field_mappings:\n      - name: keyword_text\n        type: text\n        fast: true\n        indexed: true\n        tokenizer: raw\nsleep_after: 1\n---\n# Ingest documents into simple_es_compat\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: simple_es_compat/ingest\nparams:\n  commit: force\nndjson:\n  - {\"keyword_text\": \"red\"}\n  - {\"keyword_text\": \"gold$\"}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_teardown.elasticsearch.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: gharchive\n---\nmethod: DELETE\nendpoint: empty_index\n---\nmethod: DELETE\nendpoint: test_index1\nstatus_code: null\n---\nmethod: DELETE\nendpoint: test_index2\nstatus_code: null\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/_teardown.quickwit.yaml",
    "content": "# Delete index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/gharchive\n---\n# Delete index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/empty_index\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/fast_only\nstatus_code: null\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test_index1\nstatus_code: null\n--- # Cleanup\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test_index2\nstatus_code: null\n--- # Cleanup\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test_index1\nstatus_code: null\n--- # Cleanup\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/search_after\nstatus_code: null\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0001-happy-path.yaml",
    "content": "ndjson:\n  - index: { \"_index\": \"test-index\", \"_id\": \"1\" }\n  - message: Hello, World!\n  - index: { \"_index\": \"test-index\" }\n  - message: Hola, Mundo!\nstatus_code: 200\nexpected:\n  errors: false\n  items:\n    - index:\n        _index: test-index\n        _id: \"1\"\n        status: 201\n    - index:\n        _index: test-index\n        status: 201\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0002-malformed-action.yaml",
    "content": "ndjson:\n  - del: { \"_index\": \"test-index\", \"_id\": \"1\" }\nstatus_code: 400\nexpected:\n  status: 400\n  error:\n    type: illegal_argument_exception\n    reason:\n      $expect: val.startswith('Malformed action/metadata line [1]')\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0003-validation-failed-index-missing.yaml",
    "content": "ndjson:\n  - index: { \"_id\": \"1\" }\n  - message: Hello, World!\nstatus_code: 400\nexpected:\n  status: 400\n  error:\n    type: action_request_validation_exception\n    reason: \"Validation Failed: 1: index is missing;\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0004-put-request.yaml",
    "content": "method: PUT\nndjson:\n  - index: { \"_index\": \"test-index\" }\n  - message: Hello, World!\nstatus_code: 200\nexpected:\n  errors: false\n  items:\n    - index:\n        _index: test-index\n        status: 201\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0005-document-parsing-exception.yaml",
    "content": "ndjson:\n  - index: { \"_index\": \"test-index\", \"_id\": \"5\" }\n  - message: Hello, World!\n    timestamp: timestamp\nstatus_code: 200\nexpected:\n  errors: true\n  items:\n    - index:\n        _index: test-index\n        _id: \"5\"\n        status: 400\n        error:\n          type: document_parsing_exception\n          reason:\n            $expect: \"'timestamp' in val\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0006-partial-index-not-found.yaml",
    "content": "ndjson:\n  - index: { \"_index\": \"test-index-not-found\" }\n  - message: Hello, World!\n  - index: { \"_index\": \"test-index\" }\n  - message: Hola, Mundo!\n  - index: { \"_index\": \"test-index-pattern-777\" }\n  - message: Hola, Mundo!\nstatus_code: 200\nexpected:\n  errors: true\n  items:\n    - index:\n        _index: test-index-not-found\n        status: 404\n        error:\n          index: test-index-not-found\n          type: index_not_found_exception\n          reason:\n            $expect: val.startswith('no such index [test-index-not-found]')\n    - index:\n        _index: test-index\n        status: 201\n    - index:\n        _index: test-index-pattern-777\n        status: 201\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/0007-illegal-index-name.yaml",
    "content": "# allowed characters are different between ES and Quickwit\nengines:\n  - quickwit\nndjson:\n  - index: { \"_index\": \"test-index\" }\n  - message: Hola, Mundo!\n  - index: { \"_index\": \"test-index-pattern-11\" }\n  - message: Hola, Mundo!\n  - index: { \"_index\": \"test-index-pattern-&1\" }\n  - message: Hola, Mundo!\nstatus_code: 200\nexpected:\n  errors: true\n  items:\n    - index:\n        _index: test-index\n        status: 201\n    - index:\n        _index: test-index-pattern-11\n        status: 201\n    - index:\n        _index: test-index-pattern-&1\n        status: 400\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_ctx.elasticsearch.yaml",
    "content": "api_root: http://localhost:9200\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_ctx.quickwit.yaml",
    "content": "api_root: http://localhost:7280/api/v1/_elastic\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_ctx.yaml",
    "content": "method: [POST]\nendpoint: \"_bulk\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_setup.elasticsearch.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: test-index*\nstatus_code: null\n---\nmethod: PUT\nendpoint: test-index\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"message\": {\n        \"type\": \"text\",\n        \"store\": true\n      },\n      \"timestamp\": {\n        \"type\": \"integer\",\n        \"store\": true\n      }\n    }\n  }\n}\n---\n# Only create indexes automatically for specific pattern\nmethod: PUT\nendpoint: _cluster/settings\njson:\n  transient:\n    action.auto_create_index: \"test-index-pattern-*\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index and template\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test-index\nstatus_code: null\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test-index-pattern-11\nstatus_code: null\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test-index-pattern-777\nstatus_code: null\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: templates/test-index-template\nstatus_code: null\n---\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.8\"\n  index_id: test-index\n  doc_mapping:\n    field_mappings:\n        - name: message\n          type: text\n        - name: timestamp\n          type: datetime\nsleep_after: 3\n---\n# Create index template\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: templates\njson:\n  version: \"0.8\"\n  template_id: test-index-template\n  index_id_patterns:\n    - test-index-pattern-*\n  doc_mapping:\n    mode: dynamic\n  indexing_settings:\n    commit_timeout_secs: 1\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_teardown.elasticsearch.yaml",
    "content": "# Reconfigure with the default settings\nmethod: PUT\nendpoint: _cluster/settings\njson:\n  transient:\n    action.auto_create_index: \"true\"\n---\nmethod: DELETE\nendpoint: test-index*\nstatus_code: null\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/bulk/_teardown.quickwit.yaml",
    "content": "method: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test-index\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test-index-pattern-11\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/test-index-pattern-777\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: templates/test-index-template\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/0001-muti_indices_query.yaml",
    "content": "endpoint: \"gharchive-*/_search\"\nparams:\n  q: \"*\"\nexpected:\n  hits:\n    total:\n      value: 4\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 4\"\n---\nendpoint: \"gharchive-*/_search\"\nparams:\n  q: \"actor.login:fmassot OR actor.login:guilload\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 2\"\n---\nendpoint: \"gharchive-1,gharchive-2/_search\"\nparams:\n  q: \"actor.login:fmassot OR actor.login:guilload\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 2\"\n---\nendpoint: \"gharchive-1%2Cgharchive-2/_search\"\nparams:\n  q: \"actor.login:fmassot OR actor.login:guilload\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 2\"\n---\n# Index information\nendpoint: \"gharchive-1%2Cgharchive-2/_search\"\nparams:\n  size: 2\njson:\n  query:\n    match_all: {}\n  sort:\n    created_at:\n      order: desc\nexpected:\n  hits:\n    total:\n      value: 4\n      relation: \"eq\"\n    hits:\n      - _source:\n          actor:\n            login: trinity\n        _index: \"gharchive-2\"\n      - _source:\n          actor:\n            login: fulmicoton\n        _index: \"gharchive-1\"\n---\nendpoint: \"gharchive-*,-gharchive-2/_search\"\nparams:\n  q: \"*\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 2\"\n---\nendpoint: \"gharchive-*,-*-2/_search\"\nparams:\n  q: \"*\"\nexpected:\n  hits:\n    total:\n      value: 2\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 2\"\n---\n# It is valid to have a pattern that does not match\n# any index.\nendpoint: \"invalidptn-*/_search\"\nparams:\n  size: 2\njson:\n  query:\n    match_all: {}\n  sort:\n    created_at:\n      order: desc\nexpected:\n  hits:\n    total:\n      value: 0\n      relation: \"eq\"\n    hits: 
[]\n---\n# If a specific index (not a pattern) does not exist,\n# this returns an error.\nendpoint: \"invalidptn*-,nonexistingindex/_search\"\nparams:\n  size: 2\njson:\n  query:\n    match_all: {}\n  sort:\n    created_at:\n      order: desc\nstatus_code: 404\n---\n# If one of the pattern matches no index,\n# but another matches some indices, it is valid too.\nendpoint: \"invalidptn*-,gharchive*/_search\"\nparams:\n  size: 2\njson:\n  query:\n    match_all: {}\n  sort:\n    created_at:\n      order: desc\nexpected:\n  hits:\n    total:\n      value: 104\n      relation: \"eq\"\n    hits:\n      $expect: \"len(val) == 2\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/0002-muti_indices_scroll.yaml",
    "content": "endpoint: \"gharchive-*/_search\"\nparams:\n  size: 1\n  scroll: 30m\njson:\n  query:\n    match_all: {}\n  sort:\n    - actor.id:\n        order: desc\nstore:\n  scroll_id: _scroll_id\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      - _source: {actor: {login: \"trinity\"}}\n    total:\n      value: 4\n      relation: \"eq\"\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      - _source: {actor: {login: \"guilload\"}}\n    total:\n      value: 4\n---\nmethod: GET\nendpoint: \"_search/scroll\"\nparams:\n  scroll: 30m\njson:\n  scroll_id:\n    $previous: \"val[\\\"_scroll_id\\\"]\"\nexpected:\n  _scroll_id:\n    $expect: \"len(val) > 4\"\n  hits:\n    hits:\n      - _source: {actor: {login: \"fulmicoton\"}}\n    total:\n      value: 4\n---\nendpoint: \"gharchive-*,non-existing-index/_search\"\nparams:\n  size: 1\n  scroll: 30m\njson:\n  query:\n    match_all: {}\n  sort:\n    - actor.id:\n        order: desc\nstore:\n  scroll_id: _scroll_id\nstatus_code: 404\n---\nendpoint: \"non-existing-index-*/_search\"\nparams:\n  size: 1\n  scroll: 30m\njson:\n  query:\n    match_all: {}\n  sort:\n    - actor.id:\n        order: desc\nexpected:\n  $expect: \"'_scroll_id' in val\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/0003-multi_indices_aggs.yaml",
    "content": "# Test terms aggregation across multiple indices\nmethod: [POST]\nengines:\n  - quickwit\nendpoint: \"gharchive-*/_search\"\njson:\n  query: { match_all: {} }\n  aggs:\n    logins:\n      terms:\n        field: \"actor.login\"\n        order:\n          _key: asc\nexpected:\n  aggregations:\n    logins:\n      buckets:\n      - doc_count: 1\n        key: fmassot\n      - doc_count: 1\n        key: fulmicoton\n      - doc_count: 1\n        key: guilload\n      - doc_count: 1\n        key: trinity\n      sum_other_doc_count: 0\n---\n# Test terms aggregation when no index matches the pattern\nmethod: [POST]\nendpoint: \"noindexmatching-*/_search\"\njson:\n  query: { match_all: {} }\n  aggs:\n    logins:\n      terms:\n        field: \"actor.login\"\n        order:\n          _key: asc\nexpected:\n  $expect: \"not 'aggregations' in val\""
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/0004-missing_index_query.yaml",
    "content": "endpoint: \"idontexist/_search\"\nparams:\n  q: \"*\"\nstatus_code: 404\n---\nendpoint: \"idontexist/_search\"\nparams:\n  q: \"*\"\n  ignore_unavailable: \"true\"\nexpected:\n  hits:\n    total:\n      value: 0\n---\nendpoint: \"gharchive-*,idontexist/_search\"\nparams:\n  q: \"*\"\nstatus_code: 404\n---\nendpoint: \"gharchive-*,idontexist/_search\"\nparams:\n  q: \"*\"\n  ignore_unavailable: \"true\"\nexpected:\n  hits:\n    total:\n      value: 4\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/_ctx.yaml",
    "content": "method: [GET, POST]\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/_setup.elasticsearch.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: gharchive-1\nstatus_code: null\n---\n# Delete possibly remaining index\nmethod: DELETE\nendpoint: gharchive-2\nstatus_code: null\n---\n# Create index 1\nmethod: PUT\nendpoint: gharchive-1\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"id\": {\n        \"type\": \"text\",\n        \"store\": true,\n        \"norms\": false,\n        \"index_options\": \"docs\"\n      },\n      \"type\": {\n        \"type\": \"text\",\n        \"store\": true,\n        \"norms\": false,\n        \"index_options\": \"docs\",\n        \"fielddata\": true\n      },\n      \"actor\": {\n        \"properties\": {\n          \"id\": {\n            \"type\": \"long\",\n            \"store\": true\n          },\n          \"login\": {\n            \"type\": \"text\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": \"docs\"\n          }\n        }\n      },\n      \"created_at\": {\n        \"type\": \"date\",\n        \"store\": true\n      }\n    }\n  }\n}\n---\n# Create index 2\nmethod: PUT\nendpoint: gharchive-2\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"id\": {\n        \"type\": \"text\",\n        \"store\": true,\n        \"norms\": false,\n        \"index_options\": \"docs\"\n      },\n      \"type\": {\n        \"type\": \"text\",\n        \"store\": true,\n        \"norms\": false,\n        \"index_options\": \"docs\",\n        \"fielddata\": true\n      },\n      \"actor\": {\n        \"properties\": {\n          \"id\": {\n            \"type\": \"long\",\n            \"store\": true\n          },\n          \"login\": {\n            \"type\": \"text\",\n            \"store\": true,\n            \"norms\": false,\n            \"index_options\": \"docs\"\n          }\n        }\n      },\n      \"created_at\": {\n        \"type\": \"date\",\n        \"store\": true\n      }\n    }\n  }\n}\n---\n# Ingest documents in index 1 and 
2\nmethod: POST\nendpoint: _bulk\nparams:\n  refresh: \"true\"\nheaders: {\"Content-Type\": \"application/json\"}\nndjson:\n  - \"index\": { \"_index\": \"gharchive-1\" }\n  - {\"id\": 1, \"created_at\":\"2015-02-01T00:00:14Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 1, \"login\": \"fmassot\" } }\n  - \"index\": { \"_index\": \"gharchive-1\" }\n  - {\"id\": 2, \"created_at\":\"2015-02-01T00:00:16Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 2, \"login\": \"fulmicoton\" } }\n  - \"index\": { \"_index\": \"gharchive-2\" }\n  - {\"id\": 3, \"created_at\":\"2015-02-01T00:00:15Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 3, \"login\": \"guilload\" } }\n  - \"index\": { \"_index\": \"gharchive-2\" }\n  - {\"id\": 4, \"created_at\":\"2015-02-01T00:00:17Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 4, \"login\": \"trinity\" } }\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/gharchive-1\nstatus_code: null\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/gharchive-2\nstatus_code: null\n---\n# Create index 1\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: gharchive-1\n  doc_mapping:\n    index_field_presence: true\n    timestamp_field: created_at\n    mode: dynamic\n    field_mappings:\n        - name: created_at\n          type: datetime\n          fast: true\n    dynamic_mapping:\n      expand_dots: true\n      tokenizer: default\n      fast:\n        normalizer: lowercase\n      record: position\n---\n# Create index 2\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: gharchive-2\n  doc_mapping:\n    index_field_presence: true\n    timestamp_field: created_at\n    mode: strict\n    field_mappings:\n      - name: created_at\n        type: datetime\n        fast: true\n      - name: id\n        type: u64\n        fast: true\n      - name: type\n        type: text\n        fast: true\n      - name: actor\n        type: object\n        fast: true\n        field_mappings:\n          - name: id\n            type: u64\n            fast: true\n          - name: login\n            type: text\n            fast: true\n---\n# Ingest documents in index 1\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: gharchive-1/ingest\nparams:\n  commit: \"force\"\nheaders: {\"Content-Type\": \"application/json\"}\nndjson:\n  - {\"id\": 1, \"created_at\":\"2015-02-01T00:00:14Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 1, \"login\": \"fmassot\" } }\n  - {\"id\": 2, \"created_at\":\"2015-02-01T00:00:16Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 2, \"login\": \"fulmicoton\" } }\n---\n# Ingest documents in index 2\nmethod: POST\napi_root: 
http://localhost:7280/api/v1/\nendpoint: gharchive-2/ingest\nparams:\n  commit: \"force\"\nheaders: {\"Content-Type\": \"application/json\"}\nndjson:\n  - {\"id\": 3, \"created_at\":\"2015-02-01T00:00:15Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 3, \"login\": \"guilload\" } }\n  - {\"id\": 4, \"created_at\":\"2015-02-01T00:00:17Z\", \"type\": \"CreateEvent\", \"actor\": { \"id\": 4, \"login\": \"trinity\" } }\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/_teardown.elasticsearch.yaml",
    "content": "method: DELETE\nendpoint: gharchive-1\n---\nmethod: DELETE\nendpoint: gharchive-2\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility/multi-indices/_teardown.quickwit.yaml",
    "content": "method: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/gharchive-1\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/gharchive-2\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility_info/0001-info.yaml",
    "content": "expected:\n  cluster_name:\n    $expect: \"val != ''\"\n  version:\n    build_date:\n      $expect: \"val != ''\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility_info/_ctx.elasticsearch.yaml",
    "content": "api_root: http://localhost:9200\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility_info/_ctx.quickwit.yaml",
    "content": "api_root: \"http://localhost:7280/api/v1/_elastic\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_compatibility_info/_ctx.yaml",
    "content": "method: [GET]\nendpoint: \"/\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/0001-field-capabilities.yaml",
    "content": "# Test _field_caps API\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: fieldcaps/_field_caps\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    nested.response:\n      long:\n        type: long\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    nested.name:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        type: text\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    host:\n      ip:\n        type: ip\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    mixed: # This is a little weird case (values [5, -5.5]), since coercion happens only on the columnar side. That's why `long` is not aggregatable.\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: false\n      double:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    date:\n      date_nanos:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    response:\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    id:\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      double:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    name:\n      keyword:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    tags:\n      keyword:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test _field_caps API with timestamp filter\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: 
fieldcaps/_field_caps?start_timestamp=1684993001\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    $expect: \"not 'id' in val\" # Filtered by start_timestamp\n    mixed: # This is a little weird case (values [5, -5.5]), since coercion happens only on the columnar side. That's why `long` is not aggregatable.\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: false\n      double:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    date:\n      date_nanos:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test fields parameter with `.dynamic` suffix\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: fieldcaps/_field_caps?fields=nested.response,nested.name\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    nested.response:\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    nested.name:\n      keyword:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test fields parameter with wildcard\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: fieldcaps/_field_caps?fields=nest*\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    nested.response:\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    nested.name:\n      keyword:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test fields parameter with wildcard\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: fieldcaps/_field_caps?fields=nest*\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    nested.response:\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    
nested.name:\n      keyword:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test fields parameter with wildcard\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: fieldcaps/_field_caps?fields=nested.*ponse\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    nested.response:\n      long:\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Compare with elastic search\nmethod: [GET]\nendpoint: fieldcaps/_field_caps?fields=nested.*ponse\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    nested.response:\n      long:\n        type: long\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Compare ip field with elastic search\nmethod: [GET]\nendpoint: fieldcaps*/_field_caps?fields=host\nexpected:\n  indices:\n  - fieldcaps\n  - fieldcaps-2\n  fields:\n    host:\n      ip:\n        type: ip\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Compare ip field with elastic search\nmethod: [GET]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: fieldcaps/_field_caps?fields=date\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    date:\n      date_nanos:\n        type: date_nanos\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Wildcard on index name\nmethod: [GET]\nengines:\n  - quickwit\nendpoint: fieldca*/_field_caps?fields=tags*\nexpected:\n  indices:\n  - fieldcaps\n  - fieldcaps-2\n  fields:\n    tags:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    tags-2:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n        indices:\n          - fieldcaps-2\n---\n# _field_caps without index endpoint\nmethod: 
[GET]\nengines:\n  - quickwit\nendpoint: _field_caps?fields=tags*\nexpected:\n  indices:\n  - fieldcaps\n  - fieldcaps-2\n  fields:\n    tags:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    tags-2:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n        indices:\n          - fieldcaps-2\n---\n# Wildcard on index name + Wildcard without match\nmethod: [GET]\nendpoint: fieldca*,blub*/_field_caps?fields=date\nexpected:\n  indices:\n  - fieldcaps\n  - fieldcaps-2\n  fields:\n    date:\n      date_nanos:\n        type: date_nanos\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Exact match index + Non matching exact index\nmethod: [GET]\nendpoint: fieldcaps,blub/_field_caps?fields=date\nstatus_code: 404\n---\n# Compare ip field with elastic search\nmethod: [GET]\nendpoint: doesnotexist/_field_caps?fields=date\nstatus_code: 404\n---\n# Compare ip field with elastic search\nmethod: [GET]\nendpoint: doesno*texist/_field_caps?fields=date\nstatus_code: 200\n---\n# Test _field_caps API with index_filter (term query)\n# Note: term queries require exact token match; 'fritz' is lowercase due to default tokenizer\nmethod: [POST]\nendpoint: fieldcaps/_field_caps?fields=*\njson:\n  index_filter:\n    term:\n      name: \"fritz\"\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    name:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        type: text\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test _field_caps API with index_filter (match_all query)\nmethod: [POST]\nendpoint: fieldcaps/_field_caps?fields=name\njson:\n  index_filter:\n    match_all: {}\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    name:\n      keyword:\n        type: 
keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        type: text\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test _field_caps API with index_filter (bool query)\nmethod: [POST]\nendpoint: fieldcaps/_field_caps?fields=response,name\njson:\n  index_filter:\n    bool:\n      must:\n        - term:\n            name: \"fritz\"\n      filter:\n        - range:\n            response:\n              gte: 30\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    response:\n      long:\n        type: long\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n    name:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        type: text\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n---\n# Test _field_caps API with invalid index_filter\nmethod: [POST]\nendpoint: fieldcaps/_field_caps?fields=*\njson:\n  index_filter:\n    invalid_query_type:\n      field: \"value\"\nstatus_code: 400\n---\n# Test _field_caps API with empty index_filter (should return 400 like ES)\nmethod: [POST]\nengines:\n  - quickwit\n  - elasticsearch\nendpoint: fieldcaps/_field_caps?fields=name\njson:\n  index_filter: {}\nstatus_code: 400\n---\n# Test _field_caps API with index_filter using tag field for split pruning (QW-only)\nmethod: [POST]\nengines:\n  - quickwit\nendpoint: fieldcaps/_field_caps?fields=name\njson:\n  index_filter:\n    term:\n      tags: \"nice\"\nexpected:\n  indices:\n  - fieldcaps\n  fields:\n    name:\n      keyword:\n        type: keyword\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n      text:\n        type: text\n        metadata_field: false\n        searchable: true\n        aggregatable: true\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_ctx.elasticsearch.yaml",
    "content": "api_root: http://localhost:9200/\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_ctx.quickwit.yaml",
    "content": "api_root: \"http://localhost:7280/api/v1/_elastic/\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_ctx.yaml",
    "content": "method: [GET, POST]\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_setup.elasticsearch.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: fieldcaps\nstatus_code: null\n---\nmethod: DELETE\nendpoint: fieldcaps-2\nstatus_code: null\n---\n# Create index 1\nmethod: PUT\nendpoint: fieldcaps\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"host\": {\n        \"type\": \"ip\",\n        \"store\": true\n      },\n      \"date\": {\n        \"type\": \"date_nanos\"\n      },\n    }\n  }\n}\n---\n# Create index 2\nmethod: PUT\nendpoint: fieldcaps-2\njson: {\n  \"mappings\": {\n    \"properties\": {\n      \"host\": {\n        \"type\": \"ip\",\n        \"store\": true\n      },\n      \"date\": {\n        \"type\": \"date_nanos\"\n      },\n    }\n  }\n}\n---\n# Ingest documents in fieldcaps\nmethod: POST\nendpoint: _bulk\nparams:\n  refresh: \"true\"\nheaders: {\"Content-Type\": \"application/json\"}\nndjson:\n  - \"index\": { \"_index\": \"fieldcaps\" }\n  - {\"name\": \"Fritz\", \"response\": 30, \"id\": 5, \"host\": \"192.168.0.1\", \"tags\": [\"nice\", \"cool\"]}\n  - \"index\": { \"_index\": \"fieldcaps\" }\n  - {\"nested\": {\"name\": \"Fritz\", \"response\": 30}, \"date\": \"2015-01-11T12:10:30Z\", \"host\": \"192.168.0.11\", \"tags\": [\"nice\"]}\n  - \"index\": { \"_index\": \"fieldcaps-2\" }\n  - {\"name\": \"Fritz\", \"response\": 30, \"id\": 6, \"host\": \"192.168.0.1\", \"tags\": [\"nice\", \"cool\"], \"tags-2\": [\"awesome\"]}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/fieldcaps\nstatus_code: null\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/fieldcaps-2\nstatus_code: null\n---\n# Create index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: fieldcaps\n  doc_mapping:\n    mode: dynamic\n    dynamic_mapping:\n      tokenizer: default\n      fast: true\n    timestamp_field: date\n    tag_fields: [\"tags\"]\n    field_mappings:\n      - name: date\n        type: datetime\n        input_formats:\n          - rfc3339\n        fast_precision: seconds\n        fast: true\n      - name: host\n        type: ip\n        fast: true\n      - name: tags\n        type: array<text>\n        tokenizer: raw\n        fast: true\n---\n# Create index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: fieldcaps-2\n  doc_mapping:\n    mode: dynamic\n    dynamic_mapping:\n      tokenizer: default\n      fast: true\n    field_mappings:\n      - name: date\n        type: datetime\n        input_formats:\n          - rfc3339\n        fast_precision: seconds\n        fast: true\n      - name: host\n        type: ip\n        fast: true\n---\n# Ingest documents\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: fieldcaps/ingest\nparams:\n  commit: force\nndjson:\n  - {\"name\": \"Fritz\", \"response\": 30, \"id\": 5, \"date\": \"2015-01-10T12:00:00Z\", \"host\": \"192.168.0.1\", \"tags\": [\"nice\", \"cool\"]}\n  - {\"nested\": {\"name\": \"Fritz\", \"response\": 30}, \"date\": \"2015-01-11T12:00:00Z\", \"host\": \"192.168.0.11\", \"tags\": [\"nice\"]}\n---\n# Ingest documents split #1 index fieldcaps\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: fieldcaps/ingest\nparams:\n  commit: force\nndjson:\n  - {\"id\": -5.5, \"date\": 
\"2018-01-10T12:00:00Z\"} # cross split mixed type\n---\n# Ingest documents split #2 index fieldcaps\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: fieldcaps/ingest\nparams:\n  commit: force\nndjson:\n  - {\"mixed\": 5, \"date\": \"2023-01-10T12:00:00Z\"} # inter split mixed type\n  - {\"mixed\": -5.5, \"date\": \"2024-01-10T12:00:00Z\"}\n---\n# Ingest documents in index fieldcaps-2\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: fieldcaps-2/ingest\nparams:\n  commit: force\nndjson:\n  - {\"name\": \"Fritz\", \"response\": 30, \"id\": 6, \"host\": \"192.168.0.1\", \"tags\": [\"nice\", \"cool\"], \"tags-2\": [\"awesome\"]}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_teardown.elasticsearch.yaml",
    "content": "# # Delete index\nmethod: DELETE\nendpoint: fieldcaps\n---\nmethod: DELETE\nendpoint: fieldcaps-2\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/es_field_capabilities/_teardown.quickwit.yaml",
    "content": "# # Delete index\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/fieldcaps\n---\nmethod: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/fieldcaps-2\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/multi_splits/0001-request-optimizations.yaml",
    "content": "json:\n  size: 1\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - timestamp:\n        order: asc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n---\njson:\n  size: 2\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - timestamp:\n        order: asc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n---\njson:\n  size: 3\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - timestamp:\n        order: asc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n---\njson:\n  size: 5\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - timestamp:\n        order: asc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T12:00:00Z\"}\n--- # ASC + TIMESTAMP filter\njson:\n  size: 5\n  track_total_hits: true\n  query:\n    range:\n      timestamp:\n        gte: \"2015-01-10T12:00:00Z\"\n  sort:\n    - timestamp:\n        order: asc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2015-01-10T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T13:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T14:00:00.000000001Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n--- # ASC + TIMESTAMP filter\njson:\n  size: 5\n  track_total_hits: true\n  query:\n    range:\n      timestamp:\n        lt: \"2015-01-10T12:00:00Z\"\n  
sort:\n    - timestamp:\n        order: asc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n--- # DESC\njson:\n  size: 6\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - timestamp:\n        order: desc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2016-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2016-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n---\njson:\n  size: 7\n  track_total_hits: true\n  query:\n      match_all: {}\n  sort:\n    - timestamp:\n        order: desc\nexpected:\n  hits:\n    hits:\n      - _source: {\"timestamp\": \"2016-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2016-01-10T10:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n      - _source: {\"timestamp\": \"2015-01-10T14:00:00.000000001Z\"}\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/multi_splits/_ctx.yaml",
    "content": "method: [GET]\nendpoint: \"multi_splits/_search\"\n# The entire suite is just for Quickwit\nengines: [quickwit]\napi_root: \"http://localhost:7280/api/v1/_elastic/\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/multi_splits/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/multi_splits\nstatus_code: null\n---\n# Create index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: multi_splits\n  doc_mapping:\n    mode: dynamic\n    timestamp_field: timestamp\n    field_mappings:\n        - name: timestamp\n          type: datetime\n          input_formats:\n            - rfc3339\n          fast: true\nsleep_after: 3\n---\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: multi_splits/ingest\nparams:\n  commit: force\nmin_splits: 1\nmax_splits: 10\n#seed: 3694\nshuffle_ndjson:\n  - {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n  - {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n  - {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n  - {\"timestamp\": \"2015-01-10T13:00:00Z\"}\n  - {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n  - {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n  - {\"timestamp\": \"2015-01-10T14:00:00.000000001Z\"} # 1h later than 2.doc\n  - {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n  - {\"timestamp\": \"2015-01-10T10:00:00Z\"}\n  - {\"timestamp\": \"2015-01-10T12:00:00Z\"} # 1h earlier than 2. doc\n  - {\"timestamp\": \"2015-01-11T12:00:00Z\"}\n  - {\"timestamp\": \"2016-01-10T10:00:00Z\"}\n  - {\"timestamp\": \"2016-01-11T12:00:00Z\"}\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/multi_splits/_teardown.quickwit.yaml",
    "content": "method: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/multi_splits\nstatus_code: null\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/0001_ts_range.yaml",
    "content": "# This tests a simple request with no queries.\nendpoint: simple/search\nparams:\n  query: \"*\"\n  start_timestamp: 1684993001\n  end_timestamp: 1684993002\nexpected:\n  num_hits: 1\n---\nendpoint: simple/search\nparams:\n  query: \"*\"\n  start_timestamp: 1684993002\n  end_timestamp: 1684993004\nexpected:\n  num_hits: 2\n---\nendpoint: simple/search\nparams:\n  query: \"*\"\n  start_timestamp: 1684993002\n  end_timestamp: 1684993004\nexpected:\n  num_hits: 2\n---\nendpoint: simple/search\nparams:\n  query: \"ts:>=2023/05/25\"\nexpected:\n  num_hits: 4\n---\nendpoint: simple/search\nparams:\n  query: \"ts:>=1684993002 AND ts:<1684993004\"\nexpected:\n  num_hits: 2\n---\nendpoint: simple/search\nparams:\n  query: \"auto_date:>=2023-05-25T00:00:00Z AND auto_date:<2023-05-26T00:00:00Z\"\nexpected:\n  num_hits: 2\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/0002_negative_search.yaml",
    "content": "# regression test for bizarre handling of - vs NOT when no positive clause is present\nendpoint: simple/search\nparams:\n  query: \"-ts:1234567890 AND -ts:1234567891\"\nexpected:\n  num_hits: 4\n---\nendpoint: simple/search\nparams:\n  query: \"NOT ts:1234567890 AND NOT ts:1234567891\"\nexpected:\n  num_hits: 4\n---\nendpoint: simple/search\nparams:\n  query: \"NOT ts:1234567890 AND -ts:1234567891\"\nexpected:\n  num_hits: 4\n---\nendpoint: simple/search\nparams:\n  query: \"-ts:1234567890 AND NOT ts:1234567891\"\nexpected:\n  num_hits: 4\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/0003_exists_search.yaml",
    "content": "endpoint: nested/search\nparams:\n  query: \"doesnotexist:*\"\nexpected:\n  num_hits: 0\n---\n# json fast fields:\nendpoint: nested/search\nparams:\n  query: \"json_fast:*\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nparams:\n  query: \"json_fast.field_c:*\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nparams:\n  query: \"json_fast.doesnotexist:*\"\nexpected:\n  num_hits: 0\n---\n# json text fields:\nendpoint: nested/search\nparams:\n  query: \"json_text.field_a:*\"\nexpected:\n  num_hits: 2\n---\nendpoint: nested/search\nparams:\n  query: \"json_text.field_b:*\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nparams:\n  query: \"json_text:*\"\nexpected:\n  num_hits: 2\n---\n# object fields:\nendpoint: nested/search\nparams:\n  query: \"object_multi.object_fast_field:*\"\nexpected:\n  num_hits: 2\n---\nendpoint: nested/search\nparams:\n  query: \"object_multi.doesnotexist:*\"\nexpected:\n  num_hits: 0\n---\nendpoint: nested/search\nparams:\n  query: \"object_multi.object_text_field:*\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nparams:\n  query: \"object_multi:*\"\nexpected:\n  num_hits: 3\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/0004_exact_string.yaml",
    "content": "## using an index (with the raw tokenizer)\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_raw:indexed-with-raw-tokenizer-dashes\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_raw:indexed_with_raw_tokenizer_dashes\"\nexpected:\n  num_hits: 0\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_raw:indexed-with-raw\"\nexpected:\n  num_hits: 0\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: 'text_raw:\"indexed with raw tokenizer dashes\"'\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: 'text_raw:\"indexed with raw\"'\nexpected:\n  num_hits: 0\n---\n## using a fast field (use a range query to force using the fast field)\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:fast-text-value-dashes\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:[fast-text-value-dashes TO fast-text-value-dashes]\"\nexpected:\n  num_hits: 1\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:[fast_text_value_dashes TO fast_text_value_dashes]\"\nexpected:\n  num_hits: 0\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:[fast-text-value TO fast-text-value]\"\nexpected:\n  num_hits: 0\n---\n# unfortunately, the query parser does not support escaping whitespaces\n# use the Elasticsearch API instead\nendpoint: nested/search\nmethod: POST\njson:\n  query: 'text_fast:[\"fast text value whitespaces\" TO \"fast text value whitespacesd\"]'\nstatus_code: 400\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:[fast text value whitespaces TO fast text value whitespaces]\"\nstatus_code: 400\n---\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:[fast\\ text\\ value\\ whitespaces TO fast\\ text\\ value\\ whitespaces]\"\nstatus_code: 400\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/0005_fast_field_search.yaml",
    "content": "# Validate searching on a fast-only (non-indexed) field\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:fast-text-value-dashes\"\nexpected:\n  num_hits: 1\n---\n# Non-matching exact value should return no hits\nendpoint: nested/search\nmethod: POST\njson:\n  query: \"text_fast:fast_text_value_dashes\"\nexpected:\n  num_hits: 0\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/_ctx.yaml",
    "content": "method: GET\nengines: [\"quickwit\"]\napi_root: \"http://localhost:7280/api/v1/\"\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/simple\nstatus_code: null\n---\n# Create index\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: simple\n  doc_mapping:\n    timestamp_field: ts\n    mode: dynamic\n    field_mappings:\n      - name: ts\n        type: datetime\n        fast: true\n      - name: not_fast\n        type: datetime\n        fast: true\n    dynamic_mapping:\n      tokenizer: default\n      expand_dots: true\n      fast: true\n---\n# Ingest documents\nmethod: POST\nendpoint: simple/ingest\nparams:\n  commit: force\nndjson:\n  - {\"ts\": 1684993001, \"not_fast\": 1684993001, \"auto_date\": \"2023-05-25T10:00:00Z\"}\n  - {\"ts\": 1684993002, \"not_fast\": 1684993002, \"auto_date\": \"2023-05-25T11:00:00Z\"}\n---\n# Ingest documents split #2\nmethod: POST\nendpoint: simple/ingest\nparams:\n  commit: force\nndjson:\n  - {\"ts\": 1684993003, \"not_fast\": 1684993003}\n  - {\"ts\": 1684993004, \"not_fast\": 1684993004}\n  # a missing timestamp\n  - {\"not_fast\": 1684993003}\n---\nmethod: DELETE\nendpoint: indexes/nested\nstatus_code: null\n---\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: nested\n  doc_mapping:\n    index_field_presence: true\n    # default mode is dynamic\n    field_mappings:\n      - name: json_text\n        type: json\n        indexed: true\n      - name: json_fast\n        type: json\n        stored: true\n        fast: true\n      - name: object_multi\n        type: object\n        field_mappings:\n          - name: object_text_field\n            type: text\n          - name: object_fast_field\n            type: u64\n            fast: true\n      - name: text_fast\n        type: text\n        fast: true\n        indexed: false\n      - name: text_raw\n        type: text\n        fast: false\n        indexed: true\n        tokenizer: raw\n\n---\nmethod: POST\nendpoint: nested/ingest\nparams:\n  commit: force\nndjson:\n  - {\"json_text\": 
{\"field_a\": \"hello\", \"field_b\": \"world\"}}\n  - {\"json_text\": {\"field_a\": \"hi\"}}\n  - {\"json_fast\": {\"field_c\": 1}}\n  - {\"object_multi\": {\"object_text_field\": \"multi hello\"}}\n  - {\"object_multi\": {\"object_fast_field\": 1}}\n  - {\"object_multi\": {\"object_fast_field\": 2}}\n  - {\"text_raw\": \"indexed-with-raw-tokenizer-dashes\"}\n  - {\"text_raw\": \"indexed with raw tokenizer dashes\"}\n  - {\"text_fast\": \"fast-text-value-dashes\"}\n  - {\"text_fast\": \"fast text value whitespaces\"}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/qw_search_api/_teardown.quickwit.yaml",
    "content": "method: DELETE\nendpoint: indexes/simple\n---\nmethod: DELETE\nendpoint: indexes/nested\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/search_after/0001-search_after_edge_case.yaml",
    "content": "json:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - val_u64:\n        order: asc\n  search_after: [-10]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [0]\n--- # f64 to u64\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - val_u64:\n        order: asc\n  search_after: [0.2]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [20]\n--- # u64 to i64\ndesc: \"search after u64 to i64 asc\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [250]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [300]\n      - sort: [9223372036854775807]\n      - sort: [9223372036854775807]\n--- # u64 to i64\ndesc: \"search after u64 to i64 desc\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: desc\n  search_after: [250]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [200]\n      - sort: [-100]\n--- # u64 to i64 corner case. 
We are exceeding i64::MAX, so we don't get any results.\ndesc: \"search after u64 to i64 corner case exceeding i64::MAX asc\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [18_000_000_000_000_000_000]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      $expect: \"len(val) == 0\"\n--- # u64 to i64 corner case.We are exceeding i64::MAX, but with desc we get ALL the results.\ndesc: \"search after u64 to i64 corner case exceeding i64::MAX desc\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: desc\n  search_after: [18_000_000_000_000_000_000]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [9_223_372_036_854_775_807]\n      - sort: [9_223_372_036_854_775_807]\n      - sort: [300]\n      - sort: [200]\n      - sort: [-100]\n--- # u64 to i64 corner case\ndesc: \"search after u64 to i64 corner case one below i64::MAX asc\"\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [9_223_372_036_854_775_806]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [9_223_372_036_854_775_807]\n---\ndesc: \"search after u64 to i64 corner case exactly i64::MAX asc\"\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [9_223_372_036_854_775_807]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      $expect: \"len(val) == 0\"\n---\ndesc: \"search after u64 to i64 corner case one above i64::MAX asc\"\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [9_223_372_036_854_775_808]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      $expect: \"len(val) == 0\"\n---\ndesc: \"search after f64 to i64 corner case\"\njson:\n  size: 1\n 
 query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [9_223_372_036_854_500_000.5] # lower the value we seem to hit some f64 accuracy issue here\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [9_223_372_036_854_775_807]\n---\ndesc: \"search after f64 to i64 out of bounds asc match nothing\"\njson:\n  size: 1\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: asc\n  search_after: [19_223_372_036_854_500_000.5]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      $expect: \"len(val) == 0\"\n---\ndesc: \"search after f64 to i64 out of bounds desc match everything\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - val_i64:\n        order: desc\n  search_after: [19_223_372_036_854_500_000.5]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [9_223_372_036_854_775_807]\n      - sort: [9_223_372_036_854_775_807]\n      - sort: [300]\n      - sort: [200]\n      - sort: [-100]\n---\ndesc: \"search after on mixed column asc\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - mixed_type:\n        order: asc\n  search_after: [-10]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [0]\n      - sort: [True]\n      - sort: [10.5]\n      - sort: [18000000000000000000]\n---\ndesc: \"search after on mixed column desc match nothing\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - mixed_type:\n        order: desc\n  search_after: [-10]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      $expect: \"len(val) == 0\"\n---\ndesc: \"search after on mixed column desc\"\njson:\n  size: 5\n  query:\n      match_all: {}\n  sort:\n    - mixed_type:\n        order: desc\n  search_after: [2]\nexpected:\n  hits:\n    total:\n      value: 5\n      relation: eq\n    hits:\n      - sort: [True]\n  
    - sort: [0]\n      - sort: [-10]\n\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/search_after/_ctx.yaml",
    "content": "method: [GET]\nendpoint: \"search_after/_search\"\n# The entire suite is just for Quickwit\nengines: [quickwit]\napi_root: \"http://localhost:7280/api/v1/_elastic/\"\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/search_after/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/search_after\nstatus_code: null\n---\n# Create index\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: search_after\n  doc_mapping:\n    mode: dynamic\n    dynamic_mapping:\n      tokenizer: default\n      fast: true\n    field_mappings:\n      - name: val_u64\n        type: u64\n        fast: true\n      - name: val_f64\n        type: f64\n        fast: true\n      - name: val_i64\n        type: i64\n        fast: true\nsleep_after: 3\n---\n# Ingest documents split #1\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: search_after/ingest\nparams:\n  commit: force\nndjson:\n  - {\"mixed_type\": 18_000_000_000_000_000_000, \"val_i64\": -100, \"val_f64\": 100.5, \"val_u64\": 0} # mixed_type is a u64\n  - {\"mixed_type\": 0, \"val_i64\": 9_223_372_036_854_775_807, \"val_f64\": 110, \"val_u64\": 18_000_000_000_000_000_000} # to enforce u64 type on val_u64 we need a value > 2^63, or it will take i64 (maybe we should change this)\n---\n# Ingest documents split #2\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: search_after/ingest\nparams:\n  commit: force\nndjson:\n  - {\"mixed_type\": 10.5, \"val_i64\": 200, \"val_f64\": 200.0, \"val_u64\": 20} #mixed_type is a f64\n---\n# Ingest documents split #3\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: search_after/ingest\nparams:\n  commit: force\nndjson:\n  - {\"mixed_type\": -10, \"val_i64\": 300, \"val_f64\": 300.0, \"val_u64\": 0} #mixed_type is a i64\n---\n# Ingest documents split #4\nmethod: POST\napi_root: http://localhost:7280/api/v1/\nendpoint: search_after/ingest\nparams:\n  commit: force\nndjson:\n  - {\"mixed_type\": true, \"val_i64\": 9_223_372_036_854_775_807, \"val_f64\": 300.0, \"val_u64\": 0} # i64::MAX\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/search_after/_teardown.quickwit.yaml",
    "content": "method: DELETE\napi_root: http://localhost:7280/api/v1/\nendpoint: indexes/search_after\nstatus_code: null\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/sort_orders/0001-sort-elasticapi.yaml",
    "content": "method: [GET]\nengines:\n  - quickwit\nendpoint: _elastic/sortorder/_search\njson:\n  query:\n    bool:\n      must_not:\n        match:\n          count: 10\n  sort:\n    - id: {\"order\" : \"desc\"}\nexpected:\n  hits:\n    total:\n      value: 4\n      relation: \"eq\"\n    hits:\n      - _source: { \"id\": 5 }\n      - _source: { \"count\": -2.5, \"id\": 4 }\n      - _source: { \"id\": 3 }\n      - _source: { \"count\": 15, \"id\": 2 }\n---\nendpoint: _elastic/sortorder/_search\njson:\n  query:\n    bool:\n      must_not:\n        match:\n          count: 10\n  sort:\n    - id: {\"order\" : \"asc\"}\nexpected:\n  hits:\n    total:\n      value: 4\n      relation: \"eq\"\n    hits:\n      - _source: {\"count\": 15, \"id\": 2 }\n      - _source: {\"id\": 3}\n      - _source: {\"count\": -2.5, \"id\": 4}\n      - _source: {\"id\": 5}\n---\nendpoint: _elastic/sortorder/_search\njson:\n  query:\n    match_all: {}\n  sort:\n    - id: {\"order\" : \"asc\"}\n    - count: {\"order\" : \"asc\"}\nexpected:\n  hits:\n    total:\n      value: 7\n      relation: \"eq\"\n    hits:\n      - _source: {\"count\": 10, \"id\": 0 }\n      - _source: {\"count\": 10, \"id\": 1 }\n      - _source: {\"count\": 10, \"id\": 2 }\n      - _source: {\"count\": 15, \"id\": 2 }\n      - _source: {\"id\": 3}\n      - _source: {\"count\": -2.5, \"id\": 4}\n      - _source: {\"id\": 5}\n---\nendpoint: _elastic/sortorder/_search\njson:\n  query:\n    match_all: {}\n  sort:\n    - count: {\"order\" : \"desc\"}\n    - id: {\"order\" : \"desc\"}\nexpected:\n  hits:\n    total:\n      value: 7\n      relation: \"eq\"\n    hits:\n      - _source: {\"count\": 15, \"id\": 2 }\n      - _source: {\"count\": 10, \"id\": 2 }\n      - _source: {\"count\": 10, \"id\": 1 }\n      - _source: {\"count\": 10, \"id\": 0 }\n      - _source: {\"count\": -2.5, \"id\": 4}\n      - _source: {\"id\": 5}\n      - _source: {\"id\": 3}\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/sort_orders/_ctx.yaml",
    "content": "method: GET\nengines: [\"quickwit\"]\napi_root: \"http://localhost:7280/api/v1/\"\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/sort_orders/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/sortorder\nstatus_code: null\n---\n# Create index\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: sortorder\n  doc_mapping:\n    mode: dynamic\n    dynamic_mapping:\n      tokenizer: default\n      fast: true\n---\n# Ingest documents\nmethod: POST\nendpoint: sortorder/ingest\nparams:\n  commit: force\nmin_splits: 1\nmax_splits: 10\nshuffle_ndjson:\n  - {\"count\": 10, \"id\": 1}\n  - {\"count\": 10, \"id\": 2}\n  - {\"count\": 15, \"id\": 2}\n  - {\"id\": 3}\n  - {\"count\": 10, \"id\": 0}\n  - {\"count\": -2.5, \"id\": 4}\n  - {\"id\": 5}\n\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/sort_orders/_teardown.quickwit.yaml",
    "content": "# # Delete index\nmethod: DELETE\nendpoint: indexes/sortorder\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/tag_fields/0001_allowed_types.yaml",
    "content": "# allowed types\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: allowedtypes\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: text1\n        type: text\n        tokenizer: raw\n      - name: number1\n        type: u64\n      - name: number2\n        type: i64\n    tag_fields: \n      - text1\n      - number1\n      - number2\n---\n# tokenized not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: tokenizedtype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: text1\n        type: text\n        tokenizer: default\n    tag_fields: [text1]\nstatus_code: 400\n---\n# float not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: floattype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: number3\n        type: f64\n    tag_fields: [number3]\nstatus_code: 400\n---\n# boolean not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: booltype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: boolean\n        type: bool\n    tag_fields: [boolean]\nstatus_code: 400\n---\n# json not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: jsontype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: json1\n        type: json\n    tag_fields: [json1]\nstatus_code: 400\n---\n# ip not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: iptype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: ip1\n        type: ip\n    tag_fields: [ip1]\nstatus_code: 400\n---\n# bytes not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: bytestype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: bytes1\n        type: bytes\n    tag_fields: [bytes1]\nstatus_code: 400\n---\n# bytes not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  
version: \"0.7\"\n  index_id: datetype\n  doc_mapping:\n    mode: dynamic\n    field_mappings:\n      - name: date1\n        type: datetime\n        input_formats:\n          - rfc3339\n    tag_fields: [date1]\nstatus_code: 400\n---\n# dynamic not allowed\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: dynamictype\n  doc_mapping:\n    mode: dynamic\n    tag_fields: [dynamic1]\nstatus_code: 400\n---\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/tag_fields/0002_negative_tags.yaml",
    "content": "# regression test for https://github.com/quickwit-oss/quickwit/issues/4698\nendpoint: tag-simple/search\nparams:\n  query: \"tag:1\"\nexpected:\n  num_hits: 3\n---\nendpoint: tag-simple/search\nparams:\n  query: \"-tag:2\"\nexpected:\n  num_hits: 4\n---\nendpoint: tag-simple/search\nparams:\n  query: \"tag:2\"\nexpected:\n  num_hits: 1\n---\nendpoint: tag-simple/search\nparams:\n  query: \"-tag:1\"\nexpected:\n  num_hits: 2\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/tag_fields/_ctx.yaml",
    "content": "method: GET\nengines: [\"quickwit\"]\napi_root: \"http://localhost:7280/api/v1/\"\nheaders:\n  Content-Type: application/json\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/tag_fields/_setup.quickwit.yaml",
    "content": "# Delete possibly remaining index\nmethod: DELETE\nendpoint: indexes/allowedtypes\nstatus_code: null\n---\nmethod: DELETE\nendpoint: indexes/tag-simple\nstatus_code: null\n---\nmethod: POST\nendpoint: indexes/\njson:\n  version: \"0.7\"\n  index_id: tag-simple\n  doc_mapping:\n    field_mappings:\n      - name: seq\n        type: u64\n      - name: tag\n        type: u64\n    tag_fields: [\"tag\"]\n---\nmethod: POST\nendpoint: tag-simple/ingest\nparams:\n  commit: force\nndjson:\n  - {\"seq\": 1, \"tag\": 1}\n  - {\"seq\": 2, \"tag\": 2}\n---\nmethod: POST\nendpoint: tag-simple/ingest\nparams:\n  commit: force\nndjson:\n  - {\"seq\": 1, \"tag\": 1}\n  - {\"seq\": 3, \"tag\": null}\n---\nmethod: POST\nendpoint: tag-simple/ingest\nparams:\n  commit: force\nndjson:\n  - {\"seq\": 4, \"tag\": 1}\n---\n"
  },
  {
    "path": "quickwit/rest-api-tests/scenarii/tag_fields/_teardown.quickwit.yaml",
    "content": "method: DELETE\nendpoint: indexes/allowedtypes\nstatus_code: null\n---\nmethod: DELETE\nendpoint: indexes/tag-simple\n"
  },
  {
    "path": "quickwit/rust-toolchain.toml",
    "content": "[toolchain]\nchannel = \"1.93\"\ncomponents = [\"cargo\", \"clippy\", \"rustfmt\", \"rust-docs\"]\n\n"
  },
  {
    "path": "quickwit/rustfmt.toml",
    "content": "ignore = [\n   \"**/codegen/**/*.rs\",\n]\n\ncomment_width = 120\nformat_strings = true\ngroup_imports = \"StdExternalCrate\"\nimports_granularity = \"Module\"\nnormalize_comments = false\nwhere_single_line = true\nwrap_comments = true\n"
  },
  {
    "path": "quickwit/scripts/about.hbs",
    "content": "<html>\n\n<head>\n    <style>\n        @media (prefers-color-scheme: dark) {\n            body {\n                background: #333;\n                color: white;\n            }\n            a {\n                color: skyblue;\n            }\n        }\n        .container {\n            font-family: sans-serif;\n            max-width: 800px;\n            margin: 0 auto;\n        }\n        .intro {\n            text-align: center;\n        }\n        .licenses-list {\n            list-style-type: none;\n            margin: 0;\n            padding: 0;\n        }\n        .license-used-by {\n            margin-top: -10px;\n        }\n        .license-text {\n            max-height: 200px;\n            overflow-y: scroll;\n            white-space: pre-wrap;\n        }\n    </style>\n</head>\n\n<body>\n    <main class=\"container\">\n        <div class=\"intro\">\n            <h1>Third Party Licenses</h1>\n            <p>This page lists the licenses of the projects used in cargo-about.</p>\n        </div>\n    \n        <h2>Overview of licenses:</h2>\n        <ul class=\"licenses-overview\">\n            {{#each overview}}\n            <li><a href=\"#{{id}}\">{{name}}</a> ({{count}})</li>\n            {{/each}}\n        </ul>\n\n        <h2>All license text:</h2>\n        <ul class=\"licenses-list\">\n            {{#each licenses}}\n            <li class=\"license\">\n                <h3 id=\"{{id}}\">{{name}}</h3>\n                <h4>Used by:</h4>\n                <ul class=\"license-used-by\">\n                    {{#each used_by}}\n                    <li><a href=\"{{#if crate.repository}} {{crate.repository}} {{else}} https://crates.io/crates/{{crate.name}} {{/if}}\">{{crate.name}} {{crate.version}}</a></li>\n                    {{/each}}\n                </ul>\n                <pre class=\"license-text\">{{text}}</pre>\n            </li>\n            {{/each}}\n        </ul>\n    </main>\n</body>\n\n</html>\n"
  },
  {
    "path": "quickwit/scripts/about.toml",
    "content": "accepted = [\n    \"0BSD\",\n    \"AGPL-3.0\",\n    \"LicenseRef-AGPL-3.0-or-later\",\n    \"Apache-2.0\",\n    \"BSD-2-Clause\",\n    \"BSD-3-Clause\",\n\"CC0-1.0\",\n    \"ISC\",\n    \"MIT\",\n    \"MPL-2.0\",\n    \"OpenSSL\",\n    \"Unicode-DFS-2016\",\n    \"Unlicense\",\n    \"Zlib\",\n    \"zlib-acknowledgement\"\n]\n\nworkarounds = [\n    \"ring\",\n    \"rustls\",\n]\n\n[whichlang.clarify]\nlicense = \"MIT\"\n\n\n[plotters.clarify]\nlicense = \"MIT\"\n\n\n[plotters-svg.clarify]\nlicense = \"MIT\"\n\n\n[advapi32-sys.clarify]\nlicense = \"MIT\"\n"
  },
  {
    "path": "quickwit/scripts/check_license_headers.sh",
    "content": "#!/bin/bash\n\nRESULT=0\n\nfor file in $(git ls-files | \\\n    grep \"build\\|src\\|proto\" | \\\n    grep -e \"\\.proto\\|\\.rs\\|\\.ts\" | \\\n    grep -v \"quickwit-proto/protos/third-party\" | \\\n    grep -v \"quickwit-proto/src\" | \\\n    grep -v \"/codegen/\" \\\n)\ndo\n    diff <(sed 's/{\\\\d+}/2021/' .license_header.txt) <(head -n 14 $file) > /dev/null\n    DIFFRESULT=$?\n    if [ $DIFFRESULT -ne 0 ]; then\n        grep -q -i 'begin quickwit-codegen' $file\n        GREPRESULT=$?\n        if [ $GREPRESULT -ne 0 ]; then\n            echo \"Incomplete or missing license header in $file\"\n            RESULT=1\n        fi\n    fi\ndone\n\nexit $RESULT\n"
  },
  {
    "path": "quickwit/scripts/check_log_format.sh",
    "content": "#!/bin/bash\n\nRESULT=0\n\nfor file in $(git ls-files | grep -E \"src/.*\\.rs$\")\ndo\n    LOG_STARTING_WITH_UPPERCASE=$(grep -E -n \"(warn|info|error|debug)!\\(\\\"[A-Z][a-z]\" $file)\n    DIFFRESULT=$?\n    LOG_ENDING_WITH_PERIOD=$(grep -E -n \"(warn|info|error|debug)!.*\\.\\\"\\);\" $file)\n    DIFFRESULT=$(($DIFFRESULT && $?))\n    if [ $DIFFRESULT -eq 0 ]; then\n      echo \"====================\"\n      echo $file\n      echo $LOG_STARTING_WITH_UPPERCASE\n      echo $LOG_ENDING_WITH_PERIOD\n      echo $FAULTY_LINES\n      RESULT=1\n    fi\ndone\n\nexit $RESULT\n"
  },
  {
    "path": "quickwit/scripts/dep-tree.py",
    "content": "import fileinput\nfrom collections import defaultdict\nimport graphviz\n\nFILTER = {\n\t\"quickwit-backward-compat\",\n\t\"quickwit-actors\",\n\t\"quickwit-cli\",\n    \"quickwit-rest-client\",\n\t\"quickwit-doc-mapper\",\n\t\"quickwit-search\",\n\t\"quickwit-common\",\n\t\"quickwit-indexing\",\n\t\"quickwit-metastore\",\n\t\"quickwit-proto\",\n\t\"quickwit-directories\",\n\t\"quickwit-common\",\n    \"quickwit-rest-client\",\n\t\"quickwit-serve\",\n\t\"quickwit-storage\",\n\t\"quickwit-cluster\",\n\t\"quickwit-index-management\",\n    \"tantivy\"\n}\n\ndef deps():\n    deps = defaultdict(set)\n    last_level = {}\n    old_code = 10\n    for line in fileinput.input():\n        line = line.strip()\n        if len(line) < 2:\n            continue\n        (code, package) = (line[0], line[1:])\n        if package not in FILTER:\n            continue\n        code = int(code)\n        last_level[code] = package\n        print(line)\n        if code > 0:\n            if (code - 1) in last_level:\n                deps[last_level[code - 1]].add(package)\n    return dict(deps)\n\n\ndeps_graph = deps()\n\ndot = graphviz.Digraph(filename='deps', directory='.', format='svg')\n\nfor (from_node, to_nodes) in deps_graph.items():\n    if from_node not in FILTER:\n        continue\n    dot.node(from_node, from_node)\n    for to_node in to_nodes:\n        print((from_node, to_node))\n        if to_node == from_node:\n            continue\n        if to_node not in FILTER:\n            continue\n        dot.edge(from_node, to_node)\n\ndot.render()\n"
  }
]